Business Statistics with EXCEL

Some extraordinary data sources on the web:

The Data Ferret gives access to Census and CDC  data
www.fedstats.gov  gives links to a large variety of government sites with statistical data
Statistics.com is an extraordinary resource for tools and data
The National Center for Health Statistics at CDC is a great central source on health statistics and microdata
You may have a use for Math Programming and EXCEL addins at http://www.mathtools.net/
Here is a web site I have found helpful.
http://www.graphpad.com/articles/interpret/principles/stat_principles.htm
Here is a database I found which has survey data.
www.netcraft.com/survey/reports/
 

EXCEL tips, tricks and tutorials:

Tutorials on pivottables and other EXCEL
 http://www.mrexcel.com/archive/Pivot/
http://www.ozgrid.com/Excel/PivotTables/ExCreatePiv1.htm
http://office.microsoft.com/en-us/assistance/HA010346331033.aspx
Excellent EXCEL Videotutorials from DataPigTechnologies
convenience copy of the course CD at statisticsvideos.com
http://contextures.com/tiptech.html

http://www.ozgrid.com
 

Some Odd Issues with EXCEL

workaround

EXCEL may remove commas on import of comma-delimited data replace commas in source with another delimiter.
paste and parse in one step with the comma delimiter set in the import wizard
Tdist won't work with negative T scores and doesn't say why use the abs() function to provide the referenced number as a positive.

 

Simple Simulations of Statistical Concepts

ANOVA uses averaging to selectively eliminate variance due to different causes, allowing us to simultaneously measure the effects of different factors.
Anova is used when the dependent variable is continuous, but the factors are described as discrete categories
anovarat.xls has macros in it because it has a dropdown list used to set the level of averaging. set security to medium and allow EXCEL to enable the macros.
Multiple Regression. The Varsorc model visualizes the concept of multiple regression. We are able to  simultaneously detect effects of several variables. by mathematically correcting for all but the one of interest. varsorc.xls has macros in it because it has spin buttons. set security to medium and allow EXCEL to enable the macros.

 

HEY!! In Office 2007, where did the EXCEL menus go?

How to find EXCEL 2003 commands in EXCEL 2007 "Ribbons"
try downloading the Excel Ribbon mapping workbook.
I recommend clicking on the OFFICE icon at the top left of EXCEL 2007, click EXCEL OPTIONS (a button down at the bottom) and fix whatever needs fixing to get access to what you need, such as "Developer tab." Change the SaveAs default to old (1997-2003) EXCEL so you can still work on the product files on older versions of EXCEL. In case you have to, you might be able to find file converters at microsoft.com. They are also promising to someday make their file formats open source. Uncommon fonts may also wreak havoc as you try to work on a file across machines. Change default font theme under "page layout." I know Times New Roman and Arial are uninspiring, but they are more likely to be on your other machine. After I get a feel for what needs to be fixed in the School's setup of EXCEL, I'll produce some guidance and videos on how to fix things, so you can relate my existing videotutorials to EXCEL 2007.
Microsoft has a compatibility pack for installation in older versions of OFFICE so you can read and write the new file formats. You need to install all the updates first. http://www.microsoft.com/downloads/details.aspx?FamilyId=941b3470-3ae9-4aee-8f43-c6bb74cd1466&displaylang=en
if you can't use the link, then you can probably find it at the microsoft office site or google it.

Videotutorials on Statistics in EXCEL 2007 

Exercise 1 Sampling and the Central Limit Theorem

--Exercise 1, An experiment on sampling statistics, the Central Limit Theorem and 95% Confidence Intervals
Random Sampling Procedure (6:05) airsampl
Repetitive Sampling with Macros (10:41) samplmacro
Sample Statistics Means, StDev, SEM and 95% CI (11:40) samplstats
Descriptive Statistics-Comparison of Sample Means to Population (11:11) dscstats
Histograms of population vs. sample means (12:47) CLThistograms
Use a textbox for comments (7:18) textbox
Set up to print to one Page (6:41) cltprt1p

Exercise 2: T-tests

Male & Female Grades--Descriptive Statistics (10:25) dscstatsg
Histograms (8:53) histsg
(unpaired) 2-sample t-test (5:04) 2smpltsg
T-test on differences between pairs (11:22) difftsg
The paired Sample T-test in EXCEL (3:06) pairtsg
Scatter Charts and Correlation (6:51) correlsg
Text box and setup to print to 1 page (5:23) prn1psg
Exercise 3: ANOVA
One -way Anova on sales pricing and volumes (10:53) anova1way
Two -way Anova on sales pricing, shelf space and volumes (6:01) anova2way
Two -way Anova Interaction of sales pricing and shelf space on volumes (4:52) interact

 
Final Project- pivottables on the YRBSS2005 data in EXCEL 2007
You might also want to look at these if you are doing exercise 5 & 6 with EXCEL 2007. pivottables work a little differently in EXCEL2007, and the file formats don't convert backwards, so you can't easily work across versions. More on this as I study it further...
Download and parsing YRBSS2005 dataset y5parse
Making a pivot table in EXCEL 2007 y5pivot
Saving the file with a pivottable (incompatible with earlier versions) y5pivsave

 
 

Table of Video Tutorials in Earlier Versions of EXCEL:

Some of these are EXE's  that include the viewer. In that case, you may have to save to the hard drive and run using window explorer or Start/ Run

These are FLASH format video files, small enough to download from home (if over the phone, eventually!).  You can borrow earphones to listen while playing them in School labs. Double click on the speaker down in the taskbar to remove muting and increase volume. plug the earphone into the green connector on the back of the computer. It uses a mini-stereo plug.
  For your convenience, these are widely available from students who have copied the course CD's.
 You have permission to copy, use and distribute all these videos for personal and nonprofit educational use.

 Use Windows Explorer rather than "open" for file management--how to copy vs. moving files
rtclkcpy
Print EXCEL file to one page prn1page
video on importing text data and parsing. import
Sampling from airplane empty seats data:  samplek
Paste special transpose to transpose data from rows to columns. transpose
sorting rows sortht
adding in addins/tools addins
descriptive statistics procedure dscstats
histograms histgram
2-sample F test for unequal variances ftestvar
t-test 2 sample treating variances as equal ttst2smp
1-sample ttest 1samplet
conversion of survey text to logical values
paired values ttest pairedt
  • Other videos on excel analysis of characteristics of the distribution of sample means:

  • Finished spreadsheet results show Central Limit Theorem on distribution of sample means.

    confidence limits on estimates of the population mean compared to the actual value
    Comparison of the distribution of original values to the distribution of sample means
     
    1-factor ANOVA - with audio  anova1V
     2-factor ANOVA with replication- with audio  anova2v
    Interaction in 2-way ANOVA  interact
    Copying text out of a pdf   pdftxt
    pasting text into EXCEL pastechi
    Modifying (expand/contract) the Chi2 sheet expchi
    pasting in new data for the Chi2 pastenew
    chart with trendline trendline
    simple linear regression simpregr
    multiple regression yr1mregr
    Use of Winzip to compress files winzip

    Data for the YRBSS 2001 & 2003 are on this disk, as well as NCHRBS 1995

    The videos below are with EXCEL XP and the YRBS 2001 and YRBS 2003 data as indicated. New data is released every two years at the CDC website. There is also a dataset there for the National College Health Risk Behavior (NCHRBS) conducted in 1995.
    There may be a new version of the NCHRBS in the near future, so it would be good to check the CDC websites from time to time. Data from adult risk behavior surveys and census data are available at the dataFerret site. For convenience, ASCII data and descriptions available now are linked here and are on this disk so you don't have to download it:
    YRBS01.dat  yrb01codebook (pdf)
    YRBS03.dat  yrb03codebook (pdf) YRB03powerpointslides
    NCHRBS95.dat  NCHRBcodebook (pdf)
    Use of Winzip to compress large files winzip
    more instructions and links to WINZIP & FTP install on this disk
    screen captures of YRBS2001 parsing in EXCEL Screen Captures of YRBS2003 parsing in EXCEL
    multiple regression (description handout) yr1mregr
     

    Using EXCEL Pivot Tables for Crosstabulation

    Simpson's Paradox

    Some other useful links on this topic:

    http://plato.stanford.edu/entries/paradox-simpson/
    http://www.google.com/search?hl=en&q=simpson%27s+paradox
    http://core.ecu.edu/psyc/wuenschk/StatHelp/Reversal-Paradox.txt
    http://repository.upenn.edu/cgi/viewcontent.cgi?article=1014&context=wharton_research_scholars
    http://wolfweb.unr.edu/homepage/jerryj/NNN/Aggregates.pdf
    U. Cal. Berkeley Graduate School Admission Sex-bias Data analysis by pivottable and Chi-square The partial dataset was obtained from Gerstman B.B. (2000) Data Analysis with Epi Info, Binary Outcome, Stratified Analysis on the web at http://www.sjsu.edu/faculty/gerstman/EpiInfo/stratified.htm
    Screencaptures showing formulas Exposition.doc
    Dataset in EXCEL sexbias2.xls
    Setting up a Pivottable Video berkpivt.html
    calculating percents of admissions Video berkpct.html
    flexible Chi2 Calculation and interpretation Video berkchi2.html

    Youth Risk Behavior Surveys

    Completed sheet with YRBS2001 data imported and parsed

    YRBS2001 

    Yrbs Download  yrb03dl dlyrb1
    Import and parse  all of YRBSS 2001or 2003 into EXCEL   yrb3pars importxl
    add labels to tops of columns for field names  y3lbltop lblontop
    Prepare EXCEL file of YRBS '03 data for multiple regression--converting to actual ages,height in inches, weight in pounds and eliminating empty data fields regrprep
    Use Pivot table to select crosstab variables
    (Description handout)
        pivot03 pivottbl 
    Current Marijuana Use and grades, with page variables.
    "Do people smoke pot because they are stupid, or are they stupid because they smoke pot?"
    mjgrd1       setting up the pivottable
    mjgrdpct    calculating percents by rows and columns 
    mjgrdgrf    adding graphs
    mjgrdpag  using the page variables
    More complex examples of pivottables with grouping of categories and more automated data extraction BA2000 website Spreadsheet models in Management Science
     Calculate the percent of males and females who smoke    pctsmk03 pcsmoke
    What would you expect if there were no relationship (Ho true)     expect03 expect
    Chi2 value =Does the Observation agree with your expectation? (Description handout
    Updated versions of these are on the web.
    Printable handout as WORD document Chi2.doc
           03smkch2 chi2calc
    Other Crosstabs modify Pivot tables to use a different field-     pivchvar modfield
    Other Crosstabs -adjust tables for new dimensions-- the "auto Chi-square"     newchi
    Other summary measures- smoking, gender and average weight     pivchvar smkavgwt

    Stock Return time Series and Market Beta

    Stock returns--download IBM and S&P500 from http://finance.yahoo.com/
    dl-sp-ibm
    Stock returns--Calculating a Market Beta for IBM abibmsp
    Example Sheet IBM, AB and S&P SPbudibm.xls

    Use of Access to predigest data for EXCEL analysis

    (an alternative to EXCEL Pivot Tables)
    parsing YRBSS data into Access and Excel  (ACCESS with yrbs1999 example) import wizard in EXCEL is very similar  imptaccs  and import2
    Crosstab Query in ACCESS (yrbs1999 example)  crosstab

    E-mail: ba3300 @ budbanis,com
    Campus office CCB 230
    On-campus phone 314-516-6136
    Off-campus 636-394-4950



    last modified October 13, 2007

    return to home page