Course Project – Phase 4

APA Format/ 3 Pages (I will provide the assignments from phase 1,2, and 3)


This week you will begin working on Phase 4 of your course project. For Phase 4, review your instructor’s feedback on your Phase 1, Phase 2, and Phase 3 submissions and make any necessary corrections. Remember to ask your instructor for assistance if you have questions about the feedback.

Once you have made your corrections, you will compile your information from Phase 1, Phase 2, Phase 3 and your final conclusion into one submission and submit this as your rough draft for Phase 4 of the course project. Below is a summary of the expectations for Phase 4 of the course project:

  1. Introduce your scenario and data set.

    Provide a brief overview of the scenario you are given above and the data set that you will be analyzing.
    Classify the variables in your data set.

    Which variables are quantitative/qualitative?
    Which variables are discrete/continuous?
    Describe the level of measurement for each variable included in your data set.

  2. Discuss the importance of the Measures of Center and the Measures of Variation.

    What are the measures of center and why are they important?
    What are the measures of variation and why are they important?

  3. Calculate the measures of center and measures of variation. Interpret your results in context of the selected topic.

    Mean
    Median
    Mode
    Midrange
    Range
    Variance
    Standard Deviation

  4. Discuss the importance of constructing confidence intervals for the population mean.

    What are confidence intervals?
    What is a point estimate?
    What is the best point estimate for the population mean? Explain.
    Why do we need confidence intervals?

  5. Based on your selected topic, evaluate the following:

    Find the best point estimate of the population mean.
    Construct a 95% confidence interval for the population mean. Assume that your data is normally distributed and σ is unknown.

    Please show your work for the construction of this confidence interval and be sure to use the Equation Editor to format your equations.

    Write a statement that correctly interprets the confidence interval in context of your selected topic.

  6. Based on your selected topic, evaluate the following:

    Find the best point estimate of the population mean.
    Construct a 99% confidence interval for the population mean. Assume that your data is normally distributed and σ is unknown.

    Please show your work for the construction of this confidence interval and be sure to use the Equation Editor to format your equations.

    Write a statement that correctly interprets the confidence interval in context of your selected topic.

  7. Compare and contrast your findings for the 95% and 99% confidence interval.

    Did you notice any changes in your interval estimate? Explain.
    What conclusion(s) can be drawn about your interval estimates when the confidence level is increased? Explain.

  8. Discuss the process for hypothesis testing.

    Discuss the 8 steps of hypothesis testing.
    When performing the 8 steps of hypothesis testing, which method do you prefer: the P-Value method or the Critical Value method? Why?

  9. Perform the hypothesis test.

    If you selected Option 1:

    Original Claim: The average salary for all jobs in Minnesota is less than $65,000.
    Test the claim using α = 0.05 and assume your data is normally distributed and σ is unknown.

    If you selected Option 2:

    Original Claim: The average age of all patients admitted to the hospital with infectious diseases is less than 65 years of age.
    Test the claim using α = 0.05 and assume your data is normally distributed and σ is unknown.

    Based on your selected topic, answer the following:

    Write the null and alternative hypothesis symbolically and identify which hypothesis is the claim.
    Is the test two-tailed, left-tailed, or right-tailed? Explain.
    Which test statistic will you use for your hypothesis test; z-test or t-test? Explain.
    What is the value of the test-statistic? What is the P-value? What is the critical value?
    What is your decision; reject the null or do not reject the null?

    Explain why you made your decision including the results for your p-value and the critical value.

    State the final conclusion in non-technical terms.

  10. Conclusion

    Recap your ideas by summarizing the information presented in context of your chosen scenario.

Please be sure to show all of your work and use the Equation Editor to format your equations.

Running Head: Analysis of the job salaries

Examine Job salaries for the state of Minnesota.

Jessica Seifert

Rasmussen College

January 6, 2018

Background of the data

The data provided in the Excel sheet covers different job types in the state of Minnesota.

The data gives the salary for each job type, expressed per annum. The objective of this analysis is to understand the salary distribution across the different types of jobs.

The Excel sheet contains 364 job types. The data was supplied by the Bureau of Labor Statistics.

In the given data there are two types of attributes (variables).

1) Job title

2) Salary

The job title is qualitative, and the level of measurement of the job title is nominal.

The salary is recorded as a number, so it is quantitative and is treated here as discrete data, and its level of measurement is ratio.

Importance of the measure of variation and central tendency

Measures of central tendency play a substantial and essential role in our daily life. They identify a single central value that serves as a reference point for the whole data set. The most well-known and commonly used measure of central tendency is the mean. The mode is the most frequent value in the data, and the median is the middle value of the data once it has been arranged in order. The measures of center are significant because they give us a reference point for the whole data set, from which we can analyze different real-life problems. (Pharmacother, 2011)

Measures of variation show how consistent the values in a data set are. They tell us how much the data varies within the data set and make it easy to compare two data sets. The different measures of variation, such as the standard deviation, mean deviation, quartile deviation, and coefficient of variation, are used in different scenarios and help us understand the data efficiently. (Pharmacother, 2011)

The descriptive statistics produced with Excel are presented below.

Salary

Statistic                   Value
Mean                        62306.13
Standard Error              1003.692
Median                      56520
Mode                        46100
Standard Deviation          19149.21
Sample Variance             3.67E+08
Kurtosis                    0.258251
Skewness                    1.028919
Range                       79680
Minimum                     40170
Maximum                     119850
Sum                         22679430
Count                       364
Confidence Level (95.0%)    1973.78

Mid-range = (min + max) / 2 = (40170 + 119850) / 2 = 80010
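The measures listed above can also be reproduced outside Excel. The following is a minimal sketch using Python's standard `statistics` module. The `salaries` list is a small hypothetical sample made up for illustration (only its minimum, maximum, and mode are borrowed from the table); it is not the 364-row data set, so only the range and mid-range match the reported values.

```python
import statistics

# Hypothetical salaries for illustration only; NOT the real 364-row data set.
salaries = [40170, 46100, 46100, 56520, 62300, 75000, 119850]

# Measures of center
mean = statistics.mean(salaries)
median = statistics.median(salaries)
mode = statistics.mode(salaries)
midrange = (min(salaries) + max(salaries)) / 2

# Measures of variation
data_range = max(salaries) - min(salaries)
variance = statistics.variance(salaries)  # sample variance, divides by n - 1
std_dev = statistics.stdev(salaries)      # sample standard deviation

print(f"mean={mean:.2f} median={median} mode={mode} midrange={midrange}")
print(f"range={data_range} variance={variance:.2f} std_dev={std_dev:.2f}")
```

`statistics.variance` and `statistics.stdev` use the sample (n - 1) formulas, which correspond to the "Sample Variance" and "Standard Deviation" rows of the Excel output.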

References
Pharmacother, J. P. (2011). Measures of central tendency.

Running Head: STATISTICS – CONFIDENCE INTERVALS

Course Project – Phase 2

Jessica Seifert

Rasmussen College

January 12, 2018

Question 1

A confidence interval gives a range between two values within which we can expect a population parameter, such as the mean, to fall. A confidence interval comes with a confidence level, a percentage that expresses how sure we are that the population parameter falls within the specified range. The confidence interval is calculated using the formula below: (x̄ − E < μ < x̄ + E)

A point estimate is a single value, that is, one statistic, used to estimate a population parameter.

For example, the best point estimate for a population mean (μ) is the sample mean (x̄). This is arguably the best point estimate because we are not aware of the value of the entire population's mean, so we take a sample of the population, calculate the sample mean, and then use it in our confidence interval formula to find the range in which the population mean is likely to fall.

Confidence intervals are used because we do not know the real value of the population parameter. We therefore use sample data to get a better idea of where that value lies.

Question 2

After reviewing the data in the Excel sheet, I found that the sample mean is 62,306. Although the sample mean and the population mean are not the same, the sample mean is a reasonable point estimate for the population mean. I found that the standard deviation of the data in my spreadsheet is 19,149.21.

Question 3

We already know the sample mean (x̄ = 62,306) and the sample standard deviation (s = 19,149.21), so we have to find our margin of error to construct our confidence interval. The formula for the margin of error when σ is unknown is E = t_(α/2) ∙ s/√n. To solve this equation we need to find the t critical value corresponding to a 95% confidence level.

Step 1

Degrees of freedom (df) = n-1

364-1 = 363

Alpha (α) = 1-(confidence level/100)

= 1-(95/100)

=0.05

Critical probability (p) = 1 − (α/2)

α/2 = 0.05/2 = 0.025

p = 1 − 0.025 = 0.975

Step 2

Use excel formula to find t critical value

=T.INV(.975,363) gives us t_(α/2)=1.967

To calculate margin of error

=1.967*19,149.21/sqrt(364)

E=1,974.261

(x̄ − E < μ < x̄ + E)

62,306 − 1,974.261 = 60,331.739

and 62,306 + 1,974.261 = 64,280.261

The confidence interval in this scenario is (60,331.739 < μ < 64,280.261).
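The calculation above can be checked with a few lines of code. This is a minimal sketch in Python that plugs in the sample mean, sample standard deviation, sample size, and the t critical value of 1.967 quoted above; the function name `t_confidence_interval` is an illustrative choice, not something from the course materials.

```python
import math

def t_confidence_interval(sample_mean, sample_sd, n, t_critical):
    """Return (lower, upper, margin) for a mean when sigma is unknown."""
    margin = t_critical * sample_sd / math.sqrt(n)  # E = t * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin

# Values taken from the text; t_critical is the Excel T.INV result quoted above.
lower, upper, margin = t_confidence_interval(62306, 19149.21, 364, 1.967)
print(f"E = {margin:.3f}")
print(f"{lower:.3f} < mu < {upper:.3f}")
```

Passing the critical value in as an argument keeps the sketch free of any statistics library; the same function works for another confidence level once its critical value has been looked up.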

Question 4

This confidence interval means that we are 95% confident that the population mean of the salaries in Minnesota, which range between about $40,000 and $120,000, lies between $60,331.74 and $64,280.26 a year. The two values give a range in which to expect the mean of the population. The sample mean is the midpoint between the two numbers that bound the confidence interval. For this reason, the sample mean is referred to as the best point estimate of the population mean.

Question 5

The confidence levels are 95% and 99%. Changing the confidence level changes the critical t value, and therefore the margin of error, since the sample standard deviation and the sample size stay constant. As the confidence level increases, the critical value increases, so the margin of error grows and the interval becomes wider, while the risk that the interval misses the population mean decreases.

A confidence interval is meant to give a range of values where the estimated value will fall. The best value to use as the point estimate is the mean of the sample data. It is the midpoint of the two figures that bound the range where the actual value could fall. By reducing the confidence level, one increases the risk of the actual value falling outside the interval, so the likelihood of capturing the actual value is reduced. The reverse is also true: increasing the confidence level reduces the probability of the value falling outside the interval.
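The effect of the confidence level on the interval width can be illustrated numerically. The sketch below is in Python and uses only the standard library. Because the standard library has no Student t quantile function, `approx_t_quantile` is an assumption of this sketch: it uses a well-known series expansion around the normal quantile, which is only an approximation but is accurate for large degrees of freedom such as 363. The sample standard deviation and sample size are the ones reported earlier.

```python
import math
from statistics import NormalDist

def approx_t_quantile(p, df):
    """Approximate Student t quantile (series expansion in 1/df around z)."""
    z = NormalDist().inv_cdf(p)
    return (z
            + (z**3 + z) / (4 * df)
            + (5 * z**5 + 16 * z**3 + 3 * z) / (96 * df**2))

def margin_of_error(confidence, sample_sd, n):
    """Margin of error E = t * s / sqrt(n) for the given confidence level."""
    alpha = 1 - confidence
    t_crit = approx_t_quantile(1 - alpha / 2, n - 1)
    return t_crit * sample_sd / math.sqrt(n)

e95 = margin_of_error(0.95, 19149.21, 364)
e99 = margin_of_error(0.99, 19149.21, 364)
print(f"95% margin of error: {e95:.2f}")
print(f"99% margin of error: {e99:.2f}")
print("99% interval is wider:", e99 > e95)
```

With the same sample, the only thing that changes between the two calls is the critical value, which is why the 99% interval comes out wider than the 95% interval.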

HYPOTHESIS TESTING

Course Project – Phase 3
Jessica Seifert
Rasmussen College
January 21, 2018

1. Discuss the process of hypothesis testing.
In statistics and other subjects of learning, a hypothesis is an opinion or a claim about an issue or item which is then tested to find out if it is accurate. A statistical hypothesis, to be specific, is a conjecture about a population parameter which is then checked to find out whether it is correct or not.
2. The eight steps of hypothesis testing
I. Null hypothesis stating
This process involves creating a statement that takes the opposite side of the researcher's educated guess. A good example is where a biologist thinks that using fertilizer will lead to different heights for the plants. The null hypothesis, in this instance, will be that there will be no difference in plant heights.
II. Alternative Hypothesis
This statement is the opposite of null hypothesis. In our example,
the alternative hypothesis will be that there will be a difference in plant heights.
III. Setting the level of significance
This process sets the probability of committing a Type I error, which is arguably the most grievous error one can commit when conducting the exercise. This probability is denoted by α.
IV. Data collection
This part is where the data set gets collected. That implies that the data collection can either be observational or experimental exercises.
V. Test statistic
In this stage, one states what they want to test. This could be the sample proportion, the sample mean, or a difference of the two.
VI. Decision on type of test
The test can be either one-tailed or two-tailed. In a one-tailed test the rejection region lies on one side of the distribution, while in a two-tailed test the rejection region is split between the two extremes.
VII. Acceptance or rejection regions
This step is where the critical values of the test are used to determine the rejection and acceptance regions of the hypothesis. The chosen level of significance sets the size of these regions.
VIII. Standardization of the test statistic
This part is where the standardized test statistic, for example a z or t score, assists in deciding whether to reject the null hypothesis. The standardization helps in reaching a conclusion about H0: where the p-value is less than α, accept Ha and reject H0.
3. Preferred method – P-Value method or Critical Value method? Why?
The critical value method is preferred because it determines how likely or unlikely the result is by checking whether the observed test statistic is more extreme than would be expected if the null hypothesis were correct. It entails a comparison between the test statistic and a cutoff value known as the critical value. Where the test statistic is more extreme than the critical value, the null hypothesis is rejected and the alternative hypothesis accepted, and vice versa. This method gives a clear rule for when to reject the null hypothesis.
4. Test the hypothesis for the Minnesota case.
Since σ is unknown, we use the t-test, which is given by:
t = (x̄ − μ)/(s/√n)
The hypothesized mean is μ = 65,000
From the case scenario, the sample mean is x̄ = 62,306
The population standard deviation is unknown, so we use the sample standard deviation, s = 19,149
Sample size n = 364.
The hypotheses are defined as:
H0: The average salary for all jobs in Minnesota is equal to $65,000 (μ = 65,000)
H1: The average salary for all jobs in Minnesota is less than $65,000 (μ < 65,000)
Hence
t = (62,306 − 65,000)/(19,149/√364)
t = −2,694/1,003.68
t = −2.68
At α = 0.05 with n − 1 = 364 − 1 = 363 degrees of freedom, the one-tailed critical value is approximately t = −1.649. Since the computed t (−2.68) is less than the critical value (−1.649), the null hypothesis is rejected, and we conclude that the average salary for all job categories is lower than $65,000.
5. Null and alternative hypothesis symbolically
H0: μ = 65,000
H1: μ < 65,000 (this is the claim)
6. Is the test two-tailed, left-tailed, or right-tailed? Explain.
The test is left-tailed because the alternative hypothesis states that the mean is less than $65,000, so the rejection region lies on the left-hand side of the distribution.
7. Which test statistic will you use for your hypothesis test; z-test or t-test? Explain.
A t-test is used because the data is assumed to be normally distributed and σ is unknown, so the sample standard deviation is needed to find the standard error and standardize the sample mean.
8. What is the value of the test statistic? What is the P-value?
t = −2.68, p-value < 0.05.
9. What is the critical value?
From the table, t(0.05, 363) ≈ −1.649 for the left tail.
10. What is your decision?
The decision is to reject the null hypothesis, since the computed t (−2.68) is less than the critical value from the table (−1.649) and p < 0.05.
11. The conclusion in non-technical terms
The null hypothesis was rejected because there was enough evidence to show that the average salary for all job categories is lower than $65,000.
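The left-tailed test above can be sketched in code as a check on the arithmetic. This is a minimal Python example using the sample mean, sample standard deviation, and sample size from the text. The critical value `T_CRITICAL_LEFT` is an assumption of this sketch: an approximate t-table value for 363 degrees of freedom at α = 0.05, one-tailed, entered by hand rather than computed.

```python
import math

def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """Test statistic t = (x_bar - mu0) / (s / sqrt(n))."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

# Sample values from the text; 65000 is the mean stated in the null hypothesis.
t_stat = one_sample_t(62306, 65000, 19149, 364)

# Assumed approximate table value (df = 363, alpha = 0.05, left tail).
T_CRITICAL_LEFT = -1.649

# Left-tailed test: reject H0 when the statistic falls below the critical value.
reject_null = t_stat < T_CRITICAL_LEFT
print(f"t = {t_stat:.2f}, reject H0: {reject_null}")
```

The sign of the statistic matters here: because the sample mean is below the hypothesized mean, t is negative, and it is compared against the negative critical value in the left tail.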
