A Statistics instructor has a theory that females on average have higher grades than males in quantitative courses at UM-St. Louis.

These data points were chosen so that they consisted of pairs carefully matched on a variety of characteristics, such as age, experience, prior education and training, ambition, destructive activities, IQ of parents, and factors subject to free will.

He believes that, while the overall means of these two distributions are not vastly different, the small gender difference may be detected if enough of the other factors affecting the grades could be controlled and adjusted for (the Bill Cosby school of detecting causality).

First, we will just do a

He suspects females may do better, but others think it might go the
other way, so we will **use two-tailed tests in the t-tests**.

Being a lazy sort of fellow, he is willing to use built-in tools in
EXCEL to do analyses--such as t-tests- rather than resort to his Monroe
calculator and a lot of scratchpads. He has hired you as the lab assistant
to do the work. He has provided a Comma-delimited ASCII file of his data
as **sexscore.txt** .

Copy and paste the data into an EXCEL sheet and immediately do a **SaveAs,
naming it sexgrad.xls. (make sure file type is EXCEL. It's not enough
to just change the extension)**

Your mission is to analyze these data, printing to one page wide by 2 pages tall maximum. do this by selecting a range that includes your analyses and only the top of the dataset (showing your student number used in the data). Print a selected range, preview / setup click the button to fit to 1 page by l page tall. use landscape or portrait to maximize print size on the one page. You can also adjust margins to help with this. Hand in a printout as exercise 2 and put a copy of the file in the digital dropbox on mygateway..

- Comment on the results of each analysis. This will work well if you put the interpretations in text boxes by the analyses they refer to.
- Textboxes are found on the drawing toolbar. Install the toolbar by view / toolbars/ drawing. it usually goes down the bottom of the window, click on the one that looks like a box of text with an "A" in it, then click on your sheet where you want it and drag to a starting size. click in it to type, click on the edge to move it or resize it or delete it by pushing delete after clicking on the edge.
- I recommend against using the cell comment tool in EXCEL, as getting them to print out correctly can be a challenge.

- Derive tables of descriptive statistics for each of the two samples, male and female, including 95% confidence intervals. Derive a mean and variance for the combined data--you can do this using the functions, even though the data are spread over two columns. Comment on whether it looks like these two samples came from populations with different means, referring to the sample means, standard errors and confidence intervals. What happened to the variance when the data was split into two groups -versus combined?
- Construct histograms (histograms are frequency GRAPHs) on the two groups (males and females) separately,and the combined data, making sure the X axis has the right values as labels on the axis so you can align them for easy visual comparison. Use about 15-20 bins so you can see the general shapes of the distributions. Do these look like the samples came from populations with different means?
- Conduct an F test (alpha=0.05) on the variances of the two distributions and decide whether to do a t-test assuming equal or unequal variances. Comment on the resulting F value and Pvalue. What is Ho in this test? is it rejected? What does the Pvalue mean?
- Conduct the appropriate (unpaired) t-test with alpha=0.05 to see if the means of the two distributions are significantly different. What is Ho? What is Ha?
- Comment on the results. Is there a significant difference? What is the probability of getting this observed t-statistic if the Ho is true? Do you conclude from this analysis that males and females have different average grades in the parent populations?

- Add a third column containing the differences between the pairs. Derive descriptive statistics and a confidence interval (95% confidence) on the column of differences. Calculate a t value for the mean of this distribution to see if it is different from zero. You will have to calculate the t value manually to do this, and then get a critical t and a Pvalue by using the EXCEL functions TDIST and TINV. Comment on the observed t statistic and the Pvalue in drawing a conclusion whether the mean of this set of differences is significantly different from zero.
- Conduct a Procedure (data analysis tools) t-test for difference of the means of the two distributions as paired variables, with alpha =0.05 comment on the results. Are they "significant"? what does that mean? How does this compare to the t-test on the column of differences?
- Why didn't we see the difference between males and females when we did the analysis without pairing the observations? What did the pairing do for us mathematically so that we could identify a difference with higher confidence?
- In all of these, why is it best to use a t test rather than a Z test even though there is a large number of observations?