Students should work individually on the Mid-Term: Students are encouraged to talk and show each other tricks in Excel, if helpful, but files should not be shared or copied, and all students should be able to replicate analysis they used for the Mid-Term.

Instructions: For each problem, first read the background in the problem and then refer to the related tab in “Mid-Term.xlsx” for the data to be used. Each student’s successfully completed Mid-Term will include:

(1) a Word doc with each section of the problem (e.g., Problem 1, part a, sub-part i) answered clearly and straightforwardly, including brief explanations in sentence form, charts pasted into the doc, references to the XLSX file, etc. (Please type answers and paste data/charts/etc. into this DOC and save as with “- [Last Name]” at the end).

(2) an Excel file showing the work, formulas, Pivot tables, charts, etc. used to calculate and analyze the data. (Please also use the original XLSX, manipulating data, adding tabs, etc. and save as with “- [Last Name]” at the end).

PART I – Problems: Answer in the same format (DOC/XLSX) used in the homework assignments.

1. (100 Points) Revenue Forecasting Application – Show work on TAB 1 (or add “1” at the beginning of other tabs used to answer this questions) on XLSX: As we discussed in class and as Silver notes in his book, a prediction of percentage chance of something happening is not the same as saying that thing will actually happen. Suppose, for example, that a local government would expect $100 million in tax revenue for its general fund in a “good” economy and $60 million during a “bad” economy. Suppose that every fiscal year there is a 15% chance (independent probability unrelated to the previous year) of a bad economy and an 85% chance of a “good” economy. Also suppose that the government cannot know whether it was a “good” or “bad” economy until the end of the fiscal year, well after the budget has been set.

a. Possible Outcomes: In this example, how many possible outcomes are there for each fiscal year? What are these outcomes? Are they mutually exclusive? Are they collectively exhaustive?

Answer:

There are two options that government can foresee and 2 outcomes that can really happen. Taking it into consideration we can conclude that total amount of possible outcomes is 4

b. Decision Tree: Create a decision tree to illustrate the possible outcomes of 4 fiscal years. You should show (i) the independent probability of each event (chance node), based on the information in the question, and (ii) the final amount of tax revenue after 4 years given these possible 4-trial outcomes (end nodes), but do not need to calculate the full probability of each 4-trial outcome at this point.

Answer: (Paste Graphic Here or Create it Using the “Insert” Ribbon in Word and Shapes)

c. Calculating the Probability of Outcomes over Multiple Trials: Determine the probability that this local government will face “good” economies 0, 1, 2, 3, 4, 5, 6, and 7 times over 7 fiscal years. Create a reference table like the one on the top of page 174 of the text in your Excel to show your work. Be sure to show how much total tax revenue this government would have raised in each case (with 0, 1, 2, 3, 4, 5, 6, and 7 “good” economy outcomes). (References: Page 174 in the text and Page 79 in Silver – see Footnote, as well).

Answer:

d. Probability Histogram: Create a histogram like the one on page 174 for the distribution above. What is the “area under the curve” (i.e., what is the total of all the probabilities)?

Answer:

e. Interpretation: Based on the work above (a-d), especially (c) and (d), what do you notice about the chances that the local government would face multiple “bad” economies over a 7-year period? Over a 7-year period, what is the most likely amount of revenue that this government would have collected? How much does that average out to per year?

Answer:

f. Budgeting: Given this probabilistic revenue projection and no ability to access funds beyond the tax revenue projected in the problem above (e.g., no municipal bond markets, etc.), how would you set the spending side of the budget each year? Why? Would you collect a reserve or “rainy day” fund and, if so, how much would you try to save each fiscal year for that fund? Why?

Answer:

2. (120 Points) Pothole Repair Performance – Show work on TAB 2 (or add “2” at the beginning of other tabs used to answer this questions) on XLSX: This dataset shows 3 pothole repair units’ raw performance data over a 2-week period. Residents have been promised a 48-hour turn-around time on their Pothole Repair Services Requests (SR’s). Each observation in the XLSX is a pothole repair, with each column representing, respectively:

• the # of the unit that repaired the pothole (Unit 1, 2, or 3),

• whether the SR (service request) was closed on-time (within 48 hours, as promised to the resident),

• the actual time (in hours) taken to close the SR (i.e., time from resident request to SR closure in the 311 system), and

• the time “billed” to that pothole by the unit (i.e., time since the last job closed to close this one, excluding unbilled breaks).

a. Frequency Distribution Table: Using the defined classes in the data set for “SR Duration to Close (Class)”, create a frequency distribution table with columns showing (1) the classes (e.g., “1. 0-9 hours, etc.), (2) frequency (i.e., the count of observations), and (3) the relative frequency (i.e., % of total observations in each class).

Answer: (PASTE TABLE)

Classes Frequency Relative Frequency

1. 0-9 Hours 13 6.22%

2. 10-19 Hours 39 18.66%

3. 20-29 Hours 32 15.31%

4. 30-39 Hours 42 20.10%

5. 40-49 Hours 35 16.75%

6. 50-59 Hours 21 10.05%

7. 60-69 Hours 24 11.48%

8. 70-79 Hours 3 1.44%

Total 209 100.00%

b. Histogram: Using the table above, create and paste a histogram of just the relative frequency. (Hint: may be easier to copy and paste data as values from Pivot Table into other cells, rather than build chart off of the pivot table directly).

Answer: (PASTE HISTOGRAM)

c. Summary Statistics: Compute the mean, median, range, variance, standard deviation, and the first, second, and third quartile dividers of the variable “SR Duration to Close (Hours)”.

Answer:

d. Skewness: Is the variable “SR Duration to Close (Hours)” skewed? How can you tell?

Answer:

e. Pivot Table for Performance: Create a pivot table that shows the (1) count of on-time, late, and total SRs per Unit and the (2) the percentage of on-time, late, and total SRs per Unit. One way would be to create a Pivot Table with Pothole Repair Unit on the y-axis and Outcome as the x-axis, and add Outcome or as the “values” in the table two times, using the “show values as” option to make one of the two values in the table “show values as % of row total”.

Answer: (PASTE PIVOT TABLE AS IMAGE HERE)

i. On-Time Percentage: Explain each unit’s (1) on-time pothole repair count, (2) total pothole repair count, and (3) “on-time” repair percentage.

Answer:

ii. Supervisor Interpretation: If you were the pothole repair supervisor, how would you judge/compare the performance of these three units based solely on the table above? What other information would you want and why? What other kind of benchmarks would you want?

Answer:

f. Hours Billed per Unit Pivot Table: Create a new Pivot Table (or calculate manually) to find the total Worker-Hours billed by each unit and how many Worker-Hours each Unit billed on average per repair for this 2-week period.

Answer: (IF YOU DO A PIVOT TABLE, PASTE PIVOT TABLE AS IMAGE HERE)

i. Hours Billed per Unit: Explain how many (1) total Worker-Hours were billed by each Unit and (2) how many Worker-Hours each Unit billed on average per repair for this 2-week period.

Answer:

ii. Correlation Coefficient: Determine the correlation coefficient between “SR Duration to Close (Hours)” and “Worker-Hours Billed per Repair”. (You can use the “=CORREL()” command). How do you interpret this coefficient in this context?

Answer:

iii. Overtime Usage: If (1) each Unit has 3 employees assigned, (2) each employee has a 70-hour 2-week work period, and (3) all employees worked the full 2-week period, how much overtime would have been needed per Unit for this 2-week period? (i.e., overtime needed = total billed hours per unit – hours in normal 2-week work period for each unit)

Answer:

iv. Supervisor Interpretation: Considering total workload, on-time percentage, time billed per repair, and overtime used, how would you interpret these results? How does the new information change your assessment of these three units over this period? Would you change the Pothole Repair SR assignment protocol as a result of this analysis? Why or why not?

Answer:

3. (20 Points) Normal Distribution – Show work on TAB 3 (or add “3” at the beginning of other tabs used to answer this questions) on XLSX: Suppose that the property values of a given city are normally distributed, with mean $167,452 and standard deviation (SD) $64,938.

a. What is the probability that a given property value selected at random will be more than $200,000?

Answer:

b. What is the probability that a given property value selected at random will be between $100,000 and $150,000?

Answer:

c. What is the probability that a given property value selected at random will be below $170,000?

Answer:

d. Below what property value would you find 80% of the properties?

Answer:

e. Interpretation: Do you think property values in a city would likely be normally distributed? What aspects (skew, kurtosis, single/multiple mode(s), etc.) of the data would you expect or not expect to fit the characteristics of normal distribution and why?

Answer:

PART II – Short Essays: Answer on just the DOC. Answers should be at least 4-5 sentences each, but may require more depending on the question. You may add graphics if helpful.

4. (30 Points) What is a counterfactual? When evaluating a program or any public policy, why is it critical to have a counterfactual? Give examples of types of counterfactual construction and their advantages/disadvantages. What do you compare the counterfactual to?

Answer:

5. (30 Points) Based on the World Bank reading, what characteristics does a “SMART” indicator have? Give examples. Why is it important to use well thought-out indicators in monitoring and evaluation (M&E) of public sector and non-profit programs?

Answer:

6. (30 Points) Researchers decide to study a program that assigns additional police officers to high-crime cities. They then compare the crime rates in cities that received additional police officers to cities that were not eligible for the program due to low crime rates (i.e., below the threshold used) and did not receive additional police officers. They find that cities that were assigned police officers have higher crime rates than those that were not. How would you interpret this research?

Answer:

7. (30 Points) Based on Silver and the Levine et al. textbook, why is it important to communicate prediction uncertainty to decision-makers and the public at large? What are methods that can be used to communicate that uncertainty?

Answer:

8. (30 Points) Based on Silver’s chapter on Tetlock, explain the difference between “hedgehogs” and “foxes” in terms of how they model and think about the world and make predictions? What is particularly notable about the predictions of “hedgehogs” with high levels of education?

Answer:

9. (15 Points) Based on Silver, what roles do human judgment and consensus play in either improving or impairing predictions?

Answer:

10. (15 Points) Explain two of the survey error types mentioned in class and in the book. Come up with a hypothetical or real survey (from work, if applicable) situation and explain how this error could affect the findings, in what direction, and with what expected magnitude (minor/major bias).

Answer: