Need to have this done by 11/24/12. Thanks.
Here is a sample of what I need done, plus a copy of the data needed to do the project.
Curve-fitting Project – Linear Model – Instruction
A) Instructions:
For this assignment, collect data exhibiting a relatively linear trend, find the line of best fit, plot the data and the line, interpret the slope, and use the linear equation to make a prediction. Also, find r² (coefficient of determination) and r (correlation coefficient). Discuss your findings. Your topic may be one that is related to sports, your work, a hobby, or something you find interesting.
B) Tasks for Linear Regression Model (LR):
(LR-1) Describe your topic, provide your data, and cite your source. Collect at least 8 data points. Label appropriately. (Post this information as a main topic here in the Project conference as well as in your completed project. Include a brief informative description in the title of your posting. Each student must use different data.)
(LR-2) Plot the points (x, y) to obtain a scatter plot. Use an appropriate scale on the horizontal and vertical axes and be sure to label carefully. Visually judge whether the data points exhibit a relatively linear trend. (If so, proceed. If not, try a different topic or data set.)
(LR-3) Find the line of best fit (regression line) and graph it on the scatter plot. State the equation of the line.
(LR-4) State the slope of the line of best fit. Carefully interpret the meaning of the slope in a sentence or two.
(LR-5) Find and state the value of r², the coefficient of determination, and r, the correlation coefficient. See information on linear regression attached. Discuss your findings in a few sentences. Is r positive or negative? Why? Is a line a good curve to fit to this data? Why or why not? Is the linear relationship very strong, moderately strong, weak, or nonexistent?
(LR-6) Choose a value of interest and use the line of best fit to make an estimate or prediction. Show calculation work.
(LR-7) Write a brief narrative of a paragraph or two. Summarize your findings and be sure to mention any aspect of the linear model project (topic, data, scatter plot, line, r, or estimate, etc.) that you found particularly important or interesting.
See attachment (a sample of what the project should look like):
C:\Documents and Settings\yvb000\Desktop\sample of Project A
See the attachment of my data, which you will be using for this project (Baltimore Orioles winning games):
Baltimore Orioles winning games- Project A
(Sample) Curve-Fitting Project – Linear Model: Men’s 400 Meter Dash by Suzanne Sands
(LR-1) Purpose: To analyze the winning times for the Olympic Men’s 400 Meter Dash using a linear model
Data: The winning times were retrieved from http://www.databaseolympics.com/sport/sportevent.htm?sp=ATH&enum=130
The winning times were gathered for the most recent 16 Summer Olympics, post-WWII. (More data was available, back to 1896.)
DATA: Summer Olympics, Men’s 400 Meter Dash Winning Times
Year Time (seconds)
1948 46.20
1952 45.90
1956 46.70
1960 44.90
1964 45.10
1968 43.80
1972 44.66
1976 44.26
1980 44.60
1984 44.27
1988 43.87
1992 43.50
1996 43.49
2000 43.84
2004 44.00
2008 43.75
(LR-2) SCATTERPLOT:
As one would expect, the winning times generally show a downward trend, as stronger competition and training
methods result in faster speeds. The trend is somewhat linear.
[Figure: scatter plot titled "Summer Olympics: Men’s 400 Meter Dash Winning Times". Horizontal axis: Year (1944 to 2008); vertical axis: Time (seconds), 43.00 to 47.00.]
(LR-3) Line of Best Fit (Regression Line):
y = −0.0431x + 129.84 where x = Year and y = Winning Time (in seconds)
(LR-4) The slope is −0.0431 and is negative since the winning times are generally decreasing.
The slope indicates that in general, the winning time decreases by 0.0431 second a year, and so the winning time decreases at an
average rate of 4(0.0431) = 0.1724 second each 4-year Olympic interval.
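The slope and intercept stated above can be checked by hand with the standard least-squares formulas. The following is a minimal sketch, not the method the sample author used (the sample appears to rely on a spreadsheet trendline); the names `data` and `fit_line` are illustrative assumptions.

```python
# Winning times from the data table above, as (year, seconds) pairs.
data = [
    (1948, 46.20), (1952, 45.90), (1956, 46.70), (1960, 44.90),
    (1964, 45.10), (1968, 43.80), (1972, 44.66), (1976, 44.26),
    (1980, 44.60), (1984, 44.27), (1988, 43.87), (1992, 43.50),
    (1996, 43.49), (2000, 43.84), (2004, 44.00), (2008, 43.75),
]


def fit_line(points):
    """Return (slope, intercept) of the least-squares line through points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sum of squared x deviations and sum of cross deviations.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(data)
print(f"y = {slope:.4f}x + {intercept:.2f}")
```

Multiplying the slope by 4 gives the average change per 4-year Olympic interval discussed in (LR-4).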
[Figure: scatter plot with regression line, titled "Summer Olympics: Men’s 400 Meter Dash Winning Times". Horizontal axis: Year (1944 to 2008); vertical axis: Time (seconds), 43.00 to 47.00. Trendline: y = −0.0431x + 129.84, R² = 0.6991.]
(LR-5) Values of r² and r:
r² = 0.6991
We know that the slope of the regression line is negative so the correlation coefficient r must be negative.
r = −√0.6991 ≈ −0.84
Recall that r = −1 corresponds to perfect negative correlation, and so r = −0.84 indicates moderately strong negative correlation
(relatively close to -1 but not very strong).
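The values of r and r² can also be computed directly from the data rather than read off a trendline. This is a minimal sketch using the usual Pearson correlation formula; the variable names are illustrative assumptions, and the sign of r comes out of the calculation itself instead of being inferred from the slope.

```python
import math

# Winning times from the data table above, as (year, seconds) pairs.
data = [
    (1948, 46.20), (1952, 45.90), (1956, 46.70), (1960, 44.90),
    (1964, 45.10), (1968, 43.80), (1972, 44.66), (1976, 44.26),
    (1980, 44.60), (1984, 44.27), (1988, 43.87), (1992, 43.50),
    (1996, 43.49), (2000, 43.84), (2004, 44.00), (2008, 43.75),
]

n = len(data)
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n

# Sums of squared deviations and of cross deviations.
sxx = sum((x - mean_x) ** 2 for x, _ in data)
syy = sum((y - mean_y) ** 2 for _, y in data)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)

r = sxy / math.sqrt(sxx * syy)  # correlation coefficient
r_squared = r ** 2              # coefficient of determination

print(f"r = {r:.2f}, r^2 = {r_squared:.4f}")
```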
(LR-6) Prediction: For the 2012 Summer Olympics, substitute x = 2012 to get y = −0.0431(2012) + 129.84 ≈ 43.1 seconds.
The regression line predicts a winning time of 43.1 seconds for the Men’s 400 Meter Dash in the 2012 Summer Olympics in London.
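The substitution in (LR-6) can be written as a small function. This sketch uses the rounded coefficients stated in the report; `predict_time` is a hypothetical helper name, not something from the original.

```python
def predict_time(year, slope=-0.0431, intercept=129.84):
    """Evaluate the regression line y = slope * x + intercept at a given year."""
    return slope * year + intercept


prediction_2012 = predict_time(2012)
print(f"Predicted 2012 winning time: {prediction_2012:.1f} seconds")
```

Because the coefficients are rounded, a prediction made with the unrounded least-squares values could differ slightly in later decimal places.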
(LR-7) Narrative:
The data consisted of the winning times for the men’s 400m event in the Summer Olympics, for 1948 through 2008. The data exhibit
a moderately strong downward linear trend, looking overall at the 60 year period.
The regression line predicts a winning time of 43.1 seconds for the 2012 Summer Olympics, which would be nearly 0.4 second less
than the existing Olympic record of 43.49 seconds, quite a feat!
Will the regression line’s prediction be accurate? In the last two decades, there appears to be more of a cyclical (up and down)
trend. Could winning times continue to drop at the same average rate? Extensive searches for talented potential athletes and
improved full-time training methods can lead to decreased winning times, but ultimately, there will be a physical limit for humans.
Note that there were some unusual data points: 46.70 seconds in 1956 and 43.80 seconds in 1968, which lie far above and far below the
regression line, respectively.
If we restrict ourselves to the most recent winning times, from 1972 onward (10 winning times), we have the following scatterplot
and regression line.
Using the most recent ten winning times, our regression line is y = −0.025x + 93.834.
When x = 2012, the prediction is y = −0.025(2012) + 93.834 ≈ 43.5 seconds. This line predicts a winning time of 43.5 seconds for 2012 and
that would indicate an excellent time close to the existing record of 43.49 seconds, but not dramatically below it.
Note too that r² = 0.5351, and since the line slopes downward, the correlation coefficient is r = −√0.5351 ≈ −0.73, not as strong as when
we considered the time period going back to 1948. The most recent set of 10 winning times does not visually exhibit as strong a linear trend as the
set of 16 winning times dating back to 1948.
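The second model can be reproduced by restricting the data to 1972 and later and repeating the same calculation. This is a minimal sketch; `fit_with_r2` and `recent` are illustrative names, and the formulas are the standard least-squares ones rather than anything stated in the report.

```python
# Winning times from the data table above, as (year, seconds) pairs.
data = [
    (1948, 46.20), (1952, 45.90), (1956, 46.70), (1960, 44.90),
    (1964, 45.10), (1968, 43.80), (1972, 44.66), (1976, 44.26),
    (1980, 44.60), (1984, 44.27), (1988, 43.87), (1992, 43.50),
    (1996, 43.49), (2000, 43.84), (2004, 44.00), (2008, 43.75),
]


def fit_with_r2(points):
    """Return (slope, intercept, r_squared) for the least-squares line."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Keep only the ten most recent Olympics (1972 through 2008).
recent = [(year, time) for year, time in data if year >= 1972]
slope, intercept, r_squared = fit_with_r2(recent)
print(f"y = {slope:.3f}x + {intercept:.3f}, R^2 = {r_squared:.4f}")
```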
CONCLUSION:
I have examined two linear models, using different subsets of the Olympic winning times for the men’s 400 meter dash and both have
moderately strong negative correlation coefficients. One model uses data extending back to 1948 and predicts a winning time of 43.1 seconds
for the 2012 Olympics, and the other model uses data from the most recent 10 Olympic games and predicts 43.5 seconds. My guess is that 43.5
will be closer to the actual winning time. We will see what happens later this summer!
UPDATE: When the race was run in August, 2012, the winning time was 43.94 seconds.
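With the actual result known, the two predictions can be compared against it. This is a small sketch using the rounded equations from the report; the dictionary keys are labels invented here for illustration.

```python
actual_2012 = 43.94  # winning time reported in the update above

# Predictions for x = 2012 from the two regression lines in the report.
predictions = {
    "16-point model (1948-2008)": -0.0431 * 2012 + 129.84,
    "10-point model (1972-2008)": -0.025 * 2012 + 93.834,
}

# Positive error means the actual time was slower than predicted.
errors = {name: actual_2012 - value for name, value in predictions.items()}
closer = min(errors, key=lambda name: abs(errors[name]))

for name, error in errors.items():
    print(f"{name}: off by {error:.2f} seconds")
print(f"Closer prediction: {closer}")
```

Both lines predicted a faster time than the one actually run, and the 10-point model was the closer of the two, as the author guessed.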
[Figure: scatter plot with regression line for the ten winning times from 1972 to 2008, titled "Summer Olympics: Men’s 400 Meter Dash Winning Times". Horizontal axis: Year (1968 to 2008); vertical axis: Time (seconds), 43.40 to 44.80. Trendline: y = −0.025x + 93.834, R² = 0.5351.]
Linear Model – Data
Baltimore Orioles Winning Games
Year Winning Games
2012 93
2011 69
2010 66
2009 64
2008 68
2007 69
2006 70
2005 74
2004 78
2003 71
2002 67
2001 63
2000 74
1999 78
1998 79
1997 98
1996 88
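As a possible starting point for tasks (LR-3) through (LR-5), the same least-squares calculation used in the sample could be applied to the Orioles data above. This is only a sketch under that assumption: the names `orioles` and `summarize` are made up here, and the plot, the interpretation of the slope, and the judgment about whether a line suits this data would still need to be done separately.

```python
import math

# Baltimore Orioles winning games from the table above, as (year, wins) pairs.
orioles = [
    (2012, 93), (2011, 69), (2010, 66), (2009, 64), (2008, 68), (2007, 69),
    (2006, 70), (2005, 74), (2004, 78), (2003, 71), (2002, 67), (2001, 63),
    (2000, 74), (1999, 78), (1998, 79), (1997, 98), (1996, 88),
]


def summarize(points):
    """Return (slope, intercept, r, r_squared) for the least-squares line."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r, r ** 2


slope, intercept, r, r_squared = summarize(orioles)
print(f"y = {slope:.4f}x + {intercept:.2f}")
print(f"r = {r:.2f}, r^2 = {r_squared:.4f}")
```

The printed r and r² can then be used to judge how strong the linear relationship is before going on to the prediction in (LR-6).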