UOF Processing Datasets Using Pandas, NumPy, and Matplotlib Programming Exercise

Use Pandas, NumPy, and Matplotlib.


https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv

Use the link above for the data.

HW-6
Processing Datasets using Pandas
Due Date: 4/30/2021
Total Points: 125
Purpose: This assignment gives you practice in processing datasets using Pandas, NumPy, and
Matplotlib. You may also need the built-in list and dictionary data structures and the techniques
you learned before the midterm. Please don’t use the csv module.
Background
The New York Times releases a series of data files with cumulative counts of coronavirus cases in the United States,
at the state and county level, over time. It compiles this time series data from state and local governments and health
departments to provide a complete record of the ongoing outbreak. The Times has tracked coronavirus cases in
real time since late January 2020.
The CSV link for counties is: https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv
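As a rough sketch of how this file can be loaded with Pandas alone, the example below reads a few rows into a DataFrame. It assumes the file's header is date, county, state, fips, cases, deaths; the rows and the helper name load_counties are invented for illustration. Passing the URL above to the same function in place of the in-memory sample should work when a network connection is available.

```python
import io

import pandas as pd

# Assumption: column names follow the header of us-counties.csv.
# The rows below are invented for illustration, not real counts.
SAMPLE_CSV = """date,county,state,fips,cases,deaths
2021-01-30,County A,Virginia,00001,100,4
2021-01-31,County A,Virginia,00001,110,5
2021-01-31,County B,Virginia,00002,55,1
2021-01-31,County C,Maryland,00003,70,2
"""


def load_counties(source):
    """Read the county-level data into a DataFrame (hypothetical helper).

    `source` may be a URL, a file path, or an open text buffer.
    Dates are parsed to timestamps; fips stays text so leading zeros survive.
    """
    return pd.read_csv(source, parse_dates=["date"], dtype={"fips": str})


df = load_counties(io.StringIO(SAMPLE_CSV))
print(df.dtypes)
print(df.head())
```

Because the counts are cumulative, each row holds the running total for that county up to that date rather than the new cases reported on that day.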
Homework
You will write a program (several functions and plots) that processes this data set using Python’s Pandas,
NumPy, and Matplotlib modules, and that plots your results as different types of visualizations
using Matplotlib.


Please make sure that you have watched the complete series of videos for Matplotlib and
Pandas so you know how to create various visualizations. Also, make sure that you know how to
use dictionaries and how to read and write information from files.
You can use Spyder and Jupyter Notebooks during development, but use Jupyter Notebooks to
show your code and final submission results. Annotate your code and describe what you are
doing.
Important Notes




Pandas: All of these problems should be solved using Pandas data structures (Series and
DataFrames) and techniques.
Data Structures: Please use Pandas data structures for the solution.
Jupyter Labs: Please use the JupyterLab environment to work through each problem
incrementally before you convert your code into functions and modules.
Csv Module: You don’t need it to read files, because you can use the Pandas built-in functions.
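The notes above can be illustrated with a minimal sketch that filters a DataFrame by state and writes the result using only Pandas (read_csv and DataFrame.to_csv), with no csv module. The helper name extract_state, the sample rows, and the temporary output folder are assumptions made for this example.

```python
import io
import os
import tempfile

import pandas as pd

# Invented rows using the assumed header of us-counties.csv.
SAMPLE_CSV = """date,county,state,fips,cases,deaths
2021-01-31,County A,Virginia,00001,110,5
2021-01-31,County B,Virginia,00002,55,1
2021-01-31,County C,Maryland,00003,70,2
"""


def extract_state(df, state, out_path):
    """Write every row for `state` to `out_path` and return that subset."""
    subset = df[df["state"] == state]      # boolean mask keeps matching rows
    subset.to_csv(out_path, index=False)   # Pandas writes the file itself
    return subset


df = pd.read_csv(io.StringIO(SAMPLE_CSV), dtype={"fips": str})

# A temporary folder keeps the example from cluttering the working directory.
out_path = os.path.join(tempfile.mkdtemp(), "va-counties.csv")
virginia = extract_state(df, "Virginia", out_path)

print(type(df).__name__)           # the whole table is a DataFrame
print(type(df["cases"]).__name__)  # a single column is a Series
print(pd.read_csv(out_path, dtype={"fips": str}))
```

Selecting one column gives a Series and selecting rows with a boolean mask gives a DataFrame, so the same two structures cover both reading and writing.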
1. [10 Points] Write a function that reads in the dataset from the file (using Pandas) and writes all
records for a particular state, say, Virginia, to another output file. Use Pandas, not the csv module,
to read and write files. The output file should have the name: va-counties.csv
2. [15 Points] Use the Pandas built-in functions to describe this data set and show the descriptive
statistics pertaining to it.
3. [25 Points] Write a function that calculates the total number of cases and deaths in a state for a
particular day, say, January 31, 2021 (1/31/2021). Pass the state and date as arguments to your function.
4. [25 Points] Use the previous problem’s function to plot the deaths for January 2021 for each
day of the month for Virginia.
5. [15 Points] Write a function that reads in the original csv file and draws a bar graph of the
top ten states, showing each state and its number of cases and number of deaths for a specified
month.
6. [35 Points] Draw a grid of bar graphs, one for each month of a year (12 months = 4 x 3), showing the
top ten states, starting in March 2020 and ending in February 2021. Assume the 4 rows are the four
seasons, with months as follows: Spring (3-5), Summer (6-8), Fall (9-11), Winter (12-2).
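One possible way to approach problems 3 and 6 is sketched below; it is not a reference solution. The function names (state_totals, top_states, plot_season_grid) and the sample rows are invented. It also assumes that a state's figure for a month is its cumulative total on the last date reported in that month, which is only one reading of "for a specified month".

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Invented rows using the assumed header of us-counties.csv.
SAMPLE_CSV = """date,county,state,fips,cases,deaths
2021-01-30,County A,Virginia,00001,100,4
2021-01-30,County B,Virginia,00002,50,1
2021-01-31,County A,Virginia,00001,110,5
2021-01-31,County B,Virginia,00002,55,1
2021-01-31,County C,Maryland,00003,70,2
"""

# Rows are seasons, columns are months: March 2020 through February 2021.
SEASON_GRID = [
    ("Spring", [(2020, 3), (2020, 4), (2020, 5)]),
    ("Summer", [(2020, 6), (2020, 7), (2020, 8)]),
    ("Fall", [(2020, 9), (2020, 10), (2020, 11)]),
    ("Winter", [(2020, 12), (2021, 1), (2021, 2)]),
]


def state_totals(df, state, date):
    """Sum the cumulative county counts for one state on one date."""
    day = pd.Timestamp(date)  # accepts "1/31/2021" or "2021-01-31"
    rows = df[(df["state"] == state) & (df["date"] == day)]
    return int(rows["cases"].sum()), int(rows["deaths"].sum())


def top_states(df, year, month, n=10):
    """Top n states by cases on the last reported date of the month."""
    in_month = df[(df["date"].dt.year == year) & (df["date"].dt.month == month)]
    if in_month.empty:
        return in_month[["cases", "deaths"]]  # nothing reported that month
    snapshot = in_month[in_month["date"] == in_month["date"].max()]
    per_state = snapshot.groupby("state")[["cases", "deaths"]].sum()
    return per_state.nlargest(n, "cases")


def plot_season_grid(df):
    """Draw a 4 x 3 grid of bar graphs, one month per panel."""
    fig, axes = plt.subplots(4, 3, figsize=(15, 16))
    for row, (season, months) in enumerate(SEASON_GRID):
        for col, (year, month) in enumerate(months):
            ax = axes[row, col]
            top = top_states(df, year, month)
            if not top.empty:  # skip months with no data in the sample
                top.plot.bar(ax=ax)
            ax.set_title(f"{season}: {year}-{month:02d}")
    fig.tight_layout()
    return fig, axes


df = pd.read_csv(io.StringIO(SAMPLE_CSV), parse_dates=["date"], dtype={"fips": str})
print(state_totals(df, "Virginia", "1/31/2021"))
fig, axes = plot_season_grid(df)
```

With the full data set, the same state_totals function could be called once per day of January 2021 to build the series for problem 4. If "for a specified month" is meant as new cases during the month, the snapshot at the end of the previous month would need to be subtracted first.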
