Coach Salary

| Eye Color | Salary (in dollars) | Salary Range |
|---|---|---|
| 1 | 22516 | A |
| 1 | 26425 | A |
| 1 | 29294 | A |
| 1 | 37116 | B |
| 1 | 41012 | B |
| 1 | 49806 | B |
| 1 | 51303 | C |
| 1 | 56477 | C |
| 1 | 61563 | C |
| 1 | 70295 | C |
| 1 | 71119 | C |
| 1 | 71662 | C |
| 1 | 73136 | C |
| 1 | 74574 | C |
| 1 | 84744 | D |
| 1 | 86053 | D |
| 1 | 86256 | D |
| 1 | 88901 | D |
| 1 | 92408 | D |
| 1 | 94666 | D |
| 1 | 98750 | D |
| 1 | 105363 | E |
| 1 | 112845 | E |
| 1 | 128443 | E |
| 1 | 159829 | E |
| 1 | 161341 | E |
| 1 | 286274 | F |
| 1 | 291587 | F |
| 2 | 22735 | A |
| 2 | 28525 | A |
| 2 | 28989 | A |
| 2 | 31865 | A |
| 2 | 32880 | A |
| 2 | 32934 | A |
| 2 | 34113 | A |
| 2 | 34764 | A |
| 2 | 38300 | B |
| 2 | 39428 | B |
| 2 | 40945 | B |
| 2 | 42545 | B |
| 2 | 43973 | B |
| 2 | 47278 | B |
| 2 | 49610 | B |
| 2 | 51980 | C |
| 2 | 56425 | C |
| 2 | 62148 | C |
| 2 | 82321 | D |
| 2 | 88219 | D |
| 2 | 88325 | D |
| 2 | 128084 | E |
| 2 | 128570 | E |
| 2 | 220092 | F |
Codes

| Code | Salary Range |
|---|---|
| A | < $35,000 |
| B | $35,000 – $50,000 |
| C | $50,001 – $80,000 |
| D | $80,001 – $100,000 |
| E | $100,001 – $200,000 |
| F | > $200,000 |

| Code | Eye Color |
|---|---|
| 1 | Green |
| 2 | Not Green |
Use this document file or open a new MS Word document to enter the answers to these questions.
1. Coaches’ Salaries
A. Basic Single Variable Descriptive Statistics
Open the CoachesSalaries.xlsx file in Excel. It is a set of coaches’ salaries gathered in Tennessee in 2010.
“Clean up” the file, as described in class, if necessary. If you modify the file, save it and close it.
Open the CoachesSalaries.xlsx file in SPSS.
Use SPSS to find the following descriptive statistics for the coaches’ Salary in Dollars. (Include a frequency table.)
Minimum
Maximum
Range
Median
Q values
Mean
Standard Deviation
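As an optional cross-check on the SPSS output (not a substitute for it), the statistics listed above can be sketched in Python. The salary values are transcribed from the Coach Salary table in this document. The function name `describe` is hypothetical, and the choice of `method="exclusive"` assumes SPSS interpolates percentiles at position (n + 1) × p by default, so quartiles may differ slightly if SPSS is configured otherwise.

```python
import statistics

# Salary in Dollars values transcribed from the Coach Salary table.
salaries = [
    22516, 26425, 29294, 37116, 41012, 49806, 51303, 56477, 61563, 70295,
    71119, 71662, 73136, 74574, 84744, 86053, 86256, 88901, 92408, 94666,
    98750, 105363, 112845, 128443, 159829, 161341, 286274, 291587,
    22735, 28525, 28989, 31865, 32880, 32934, 34113, 34764, 38300, 39428,
    40945, 42545, 43973, 47278, 49610, 51980, 56425, 62148, 82321, 88219,
    88325, 128084, 128570, 220092,
]


def describe(values):
    """Return the descriptive statistics requested in the assignment."""
    # Assumption: the "exclusive" method, which interpolates at (n + 1) * p,
    # corresponds to the SPSS default percentile calculation.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return {
        "n": len(values),
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
        "median": statistics.median(values),
        "q1": q1,
        "q2": q2,
        "q3": q3,
        "mean": statistics.mean(values),
        # Sample standard deviation (divides by n - 1).
        "std_dev": statistics.stdev(values),
    }


summary = describe(salaries)
for name, value in summary.items():
    print(f"{name:>8}: {value:,.2f}")
```

Note that Q2 and the median are the same value by definition, which is a quick sanity check on any statistics box.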
Use SPSS to create a histogram of the coaches’ Salary in Dollars with a projected normal curve included.
Calculate the z-scores for all coaches’ Salary in Dollars. (You do not need to copy and paste the z-scores into your Word document.)
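For readers who want to see what a z-score calculation involves, here is a minimal Python sketch. It assumes z = (value − mean) / sample standard deviation, and it flags outlier candidates with the greater than +2 or less than −2 cutoff given in the general feedback for this assignment. The function names `z_scores` and `outlier_candidates` are hypothetical.

```python
import statistics

# Salary in Dollars values transcribed from the Coach Salary table.
salaries = [
    22516, 26425, 29294, 37116, 41012, 49806, 51303, 56477, 61563, 70295,
    71119, 71662, 73136, 74574, 84744, 86053, 86256, 88901, 92408, 94666,
    98750, 105363, 112845, 128443, 159829, 161341, 286274, 291587,
    22735, 28525, 28989, 31865, 32880, 32934, 34113, 34764, 38300, 39428,
    40945, 42545, 43973, 47278, 49610, 51980, 56425, 62148, 82321, 88219,
    88325, 128084, 128570, 220092,
]


def z_scores(values):
    """Standardize each value using the mean and sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: n - 1 denominator
    return [(v - mean) / sd for v in values]


def outlier_candidates(values, threshold=2.0):
    """Return (value, z) pairs whose z-score is beyond +/- threshold."""
    pairs = zip(values, z_scores(values))
    return sorted((v, z) for v, z in pairs if abs(z) > threshold)


for value, z in outlier_candidates(salaries):
    print(f"{value:>7}  z = {z:+.2f}")
```

Reporting the salary value alongside its z-score, as the loop does, matches the request to give the variable value itself.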
1. Copy the Salary in Dollars “statistics box” from the SPSS output and paste it into your Word document.
2. Copy the Salary in Dollars frequency table from the SPSS output and paste it into your Word document.
3. Copy the Salary in Dollars histogram from the SPSS output and paste it into your Word document.
4. Review the Salary in Dollars “statistics box”. Based on your review, list the values of the following:
a. Q1
b. Q2
c. Q3
d. Q4
e. Mean
f. Median
g. Standard Deviation
5. Review the Salary in Dollars histogram. Based on that review:
a. Do the salaries appear to be a normal distribution?
b. Defend your answer to 5.a.
c. Based on your histogram review alone, are there any Salary in Dollars outlier candidates?
d. If there are outlier candidates, list the Salary in Dollars value(s) of the outlier candidate(s).
6. Review the Salary in Dollars z-scores for this distribution. Based on that review:
a. Are there any Salary in Dollars outlier candidates?
b. If so, list the outlier candidate Salary in Dollars values and their corresponding z-scores.
7. Are your answers to 1.A.5.c and d (above) the same as your answers to 1.A.6.a and b (above)?
(Do both identify the same outliers, or no outliers?)
B. Cross-tabulation – Two-variable statistics
1. Add decode values (value labels) to the following variables:
a. Salary Range
b. Eye Color
Use SPSS to cross-tabulate the Salary Range and Eye Color variables.
2. Copy the cross-tabulation matrix found in SPSS output and paste it into your Word document.
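The cross-tabulation itself is just a count of cases for each combination of the two categorical variables. The sketch below illustrates this in Python using the salaries and codes from the tables in this document. It derives each Salary Range code from the dollar value rather than reading the coded column, and the names `salary_range` and `crosstab_rows` are hypothetical. Salary Range is placed in the rows because it has more values than Eye Color.

```python
from collections import Counter

# Salaries transcribed from the Coach Salary table, split by Eye Color code.
green = [  # Eye Color code 1
    22516, 26425, 29294, 37116, 41012, 49806, 51303, 56477, 61563, 70295,
    71119, 71662, 73136, 74574, 84744, 86053, 86256, 88901, 92408, 94666,
    98750, 105363, 112845, 128443, 159829, 161341, 286274, 291587,
]
not_green = [  # Eye Color code 2
    22735, 28525, 28989, 31865, 32880, 32934, 34113, 34764, 38300, 39428,
    40945, 42545, 43973, 47278, 49610, 51980, 56425, 62148, 82321, 88219,
    88325, 128084, 128570, 220092,
]

# Decode values from the Codes tables.
RANGE_LABELS = {
    "A": "< $35,000", "B": "$35,000 - $50,000", "C": "$50,001 - $80,000",
    "D": "$80,001 - $100,000", "E": "$100,001 - $200,000", "F": "> $200,000",
}


def salary_range(salary):
    """Map a salary in dollars to its Salary Range code."""
    if salary < 35_000:
        return "A"
    if salary <= 50_000:
        return "B"
    if salary <= 80_000:
        return "C"
    if salary <= 100_000:
        return "D"
    if salary <= 200_000:
        return "E"
    return "F"


records = [(1, s) for s in green] + [(2, s) for s in not_green]
counts = Counter((salary_range(s), eye) for eye, s in records)


def crosstab_rows(counts):
    """Return (code, green count, not-green count, row total) per range."""
    rows = []
    for code in "ABCDEF":
        g, ng = counts[(code, 1)], counts[(code, 2)]
        rows.append((code, g, ng, g + ng))
    return rows


print(f"{'Salary Range':<24}{'Green':>6}{'Not Green':>11}{'Total':>7}")
for code, g, ng, total in crosstab_rows(counts):
    print(f"{code + ' ' + RANGE_LABELS[code]:<24}{g:>6}{ng:>11}{total:>7}")
```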
2. Breast Cancer Data
Open the BreastCancerData.xlsx file in Excel. It contains data on breast cancer and fat intake for several countries.
Clean the file up, as described in class, if necessary. If you modify the file, save and close it.
Open the BreastCancerData.xlsx file with SPSS.
(After you have loaded your variables into SPSS, I recommend that you reduce the “Decimal Places” attribute for each variable to 5.)
Use SPSS to analyze this data, including:
A. Basic Single Variable Descriptive Statistics
Use SPSS to find the following descriptive statistics for Age-Adjusted Breast Cancer Mortality.
Minimum
Maximum
Range
Median
Q values
Mean
Standard Deviation
Prepare a histogram with projected normal curve for Age-Adjusted Breast Cancer Mortality.
Calculate the z-scores of all values for Age-Adjusted Breast Cancer Mortality. You do not need to copy and paste the z-scores into your Word document.
1. Copy the Age-Adjusted Breast Cancer Mortality “statistics box” from the SPSS output and paste it into your Word document.
2. Copy the Age-Adjusted Breast Cancer Mortality frequency table from the SPSS output and paste it into your Word document.
3. Copy the Age-Adjusted Breast Cancer Mortality histogram from the SPSS output and paste it into your Word document.
Review the Age-Adjusted Breast Cancer Mortality statistics box.
Review the Age-Adjusted Breast Cancer Mortality frequency table.
Based on your review:
4. List the values of the following:
a. Q1
b. Q2
c. Q3
d. Q4
e. Mean
f. Median
g. Standard Deviation
5. For Age-Adjusted Breast Cancer Mortality, what percent of countries are higher than Japan?
6. For Age-Adjusted Breast Cancer Mortality, what percent of the countries are at or below France?
7. For Age-Adjusted Breast Cancer Mortality, in which quartile is Poland?
Review the Age-Adjusted Breast Cancer Mortality histogram.
8. Based only on your review of the Age-Adjusted Breast Cancer Mortality histogram:
a. Are there any Age-Adjusted Breast Cancer Mortality outlier candidates?
b. If there are outlier candidates, list the Age-Adjusted Breast Cancer Mortality value(s) in your Word document.
Review the z-scores of the Age-Adjusted Breast Cancer Mortality distribution.
9. Based on your review of the z-scores for Age-Adjusted Breast Cancer Mortality:
a. What is the z-score for The Netherlands?
b. Based on the z-score, is The Netherlands an outlier candidate?
c. Based on the Age-Adjusted Breast Cancer Mortality z-scores, are there any outlier candidates?
d. If there are outlier candidates based on z-score analysis, which Age-Adjusted Breast Cancer Mortality values are they, and what are their z-scores?
B. Correlation – Two variable statistics
Use SPSS to calculate the Pearson R values for each of the following pairs of variables:
Breast cancer mortality and animal fat intake
Breast cancer mortality and vegetable fat intake
Breast cancer mortality and total fat intake
Use SPSS to prepare scatter plots (with trend line) for each of the following pairs of variables:
Breast cancer mortality and animal fat intake
Breast cancer mortality and vegetable fat intake
Breast cancer mortality and total fat intake
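To make the Pearson R and the trend line concrete, here is a minimal Python sketch of both calculations. The data values are hypothetical and are NOT taken from BreastCancerData.xlsx, and the function names `pearson_r` and `trend_line` are illustrative only. The trend line is assumed to be an ordinary least-squares fit.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient R for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def trend_line(xs, ys):
    """Least-squares slope and intercept of y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical illustration data, not values from the assignment file.
fat_intake = [20.0, 40.0, 60.0, 80.0, 100.0]
mortality = [5.0, 9.0, 12.0, 18.0, 21.0]

r = pearson_r(fat_intake, mortality)
slope, intercept = trend_line(fat_intake, mortality)
print(f"R = {r:.3f} (report R, not R squared = {r * r:.3f})")
print(f"trend line: mortality = {slope:.3f} * fat_intake + {intercept:.3f}")
```

The sign of R gives the direction of the relationship and its magnitude gives the strength. Neither says anything about whether one variable causes the other.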
1. For Breast cancer mortality and animal fat intake
a. Copy the “R box” from the SPSS output and paste it into your Word document.
b. Copy the scatter plot from the SPSS output and paste it into your Word document.
c. What is the value of the Pearson R?
d. Based on the value of the Pearson R, is there a correlation between breast cancer mortality and animal fat intake?
e. If there is a correlation between breast cancer mortality and animal fat intake, is it positive or negative?
f. If there is a correlation between breast cancer mortality and animal fat intake, is it strong, moderate, or weak?
g. 1. Based on your analysis of the correlation between breast cancer mortality and animal fat intake, can you conclude that animal fat intake causes breast cancer?
2. Why or why not?
2. For Breast cancer mortality and vegetable fat intake
a. Copy the “R box” from the SPSS output and paste it into your Word document.
b. Copy the scatter plot from the SPSS output and paste it into your Word document.
c. What is the value of the Pearson R?
d. Based on the value of the Pearson R, is there a correlation between breast cancer mortality and vegetable fat intake?
e. If there is a correlation between breast cancer mortality and vegetable fat intake, is it positive or negative?
f. If there is a correlation between breast cancer mortality and vegetable fat intake, is it strong, moderate, or weak?
g. 1. Based on your analysis of the correlation between breast cancer mortality and vegetable fat intake, can you conclude that vegetable fat intake causes breast cancer?
2. Why or why not?
3. For Breast cancer mortality and total fat intake
a. Copy the “R box” from the SPSS output and paste it into your Word document.
b. Copy the scatter plot from the SPSS output and paste it into your Word document.
c. What is the value of the Pearson R?
d. Based on the value of the Pearson R, is there a correlation between breast cancer mortality and total fat intake?
e. If there is a correlation between breast cancer mortality and total fat intake, is it positive or negative?
f. If there is a correlation between breast cancer mortality and total fat intake, is it strong, moderate, or weak?
g. 1. Based on your analysis of the correlation between breast cancer mortality and total fat intake, can you conclude that total fat intake causes breast cancer?
2. Why or why not?
3. Statistics In Everyday Use
Find three articles in the popular media (e.g., newspaper, website) that include statistics (presented as numbers, as graphics, or both).
These should be three separate articles/sources, not three examples of statistics in one article.
A. For each article/statistic:
1. Summarize the article and the statistic(s) it contains briefly, or copy the article, including the statistic(s) you are critiquing, into your Word document.
2. Include a citation of the source of the article.
B. For each article/statistic:
1. Do you think that the statistic or the way it is used/presented is deceptive?
2. Why or why not?
3. What is your evidence?
LSP 121 – Individual Assignment #1 – General Feedback
Use SPSS to calculate all statistics and prepare graphs.
Make sure that items copied from SPSS output and pasted into the submission document do not lose the “box” outline and format. I will expect similar items to be copied and pasted successfully (with box and outline) on the exam.
1. A. Basic Descriptive Statistics
Include both the stats box and the frequency table
Include only the Salary variable, not the eye color and not the salary range
Z-scores for outliers are greater than +2 or less than -2. Give the variable value itself.
B. Frequency Distribution Presentation – histogram
This is not a normal distribution
There are outliers – give the values of the outliers, check the frequency table for specific values
Consistency – are the outlier values you found by z-score and by histogram examination the same?
C. Another Frequency Distribution Presentation – Cross-tabulation
Include the Crosstab, not the Case Processing Summary
Crosstab should be between the two categorical variables – Eye Color and Salary Range
Include decode values for the variables
Put the variable with more values in the row
2. A. Basic Statistics and Correlation
Make sure the data in Excel is cleaned up before importing it into SPSS – if there are missing values, check the data in Excel, clean it up, and import again
Include the stats box, frequency table, and histogram
Find the percentile on the frequency table – cumulative percentage column
Compare the value with Q1, Q2 and Q3 to find the quartile
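The two look-ups described above (cumulative percentage, then comparison against Q1, Q2, and Q3) can be sketched in Python as follows. The rates are hypothetical and are NOT taken from BreastCancerData.xlsx. The names `cumulative_percent` and `quartile_of` are illustrative, the quartile method is assumed to match the SPSS default, and a value exactly equal to a Q value is assigned to the lower quartile as an assumed convention.

```python
import statistics


def cumulative_percent(values, target):
    """Percent of cases at or below target (the cumulative percent column)."""
    return 100.0 * sum(v <= target for v in values) / len(values)


def quartile_of(values, target):
    """Return 1, 2, 3, or 4 by comparing target with Q1, Q2, and Q3."""
    # Assumption: interpolation at (n + 1) * p, taken to match SPSS defaults.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="exclusive")
    if target <= q1:
        return 1
    if target <= q2:
        return 2
    if target <= q3:
        return 3
    return 4


# Hypothetical mortality rates for eight unnamed countries.
rates = [4.0, 7.5, 9.0, 12.0, 15.5, 18.0, 21.0, 24.5]
country_rate = 12.0  # hypothetical country of interest

at_or_below = cumulative_percent(rates, country_rate)
higher = 100.0 - at_or_below
print(f"at or below: {at_or_below:.1f}%  higher: {higher:.1f}%")
print(f"quartile: {quartile_of(rates, country_rate)}")
```

The percent of countries higher than a given country is 100 minus that country's cumulative percent, which is why one table column answers both kinds of question.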
B. Correlation
Prepare one R box and one scatter plot for each pair of variables (each fat intake vs. mortality rate)
Include a trend line on the scatter plot
Use R, not R-squared
All are positive. Two are strong, one is weak
Correlation does not mean causation
3. Statistics in Everyday Use
Provide three examples
Is each of the examples deceptive? (Deceptive, not descriptive.)