Big Data Analytics using Spark

Part A: Clustering

1. Find a dataset on Kaggle or from any other source. Make sure that the dataset is at least 500 MB.

2. Write a detailed description of the dataset.

3. Preprocess the dataset.

4. Use the K-means algorithm to cluster the dataset.

5. Use the Elbow method and the Silhouette method to find the optimal K.
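
The clustering steps above can be sketched in PySpark roughly as follows. This is only an illustration, assuming PySpark 3.x and a DataFrame whose numeric feature columns are already known; the function names kmeans_scores and pick_elbow, the range of K values, and the seed are hypothetical choices, not part of the assignment. The sketch standardizes the features, fits K-means for each K, and records the training cost (for the Elbow method) and the silhouette score (for the Silhouette method).

```python
def pick_elbow(costs):
    """Return the k where the cost curve bends most sharply.

    `costs` maps k -> within-cluster sum of squared errors. The bend is
    measured as the second difference of the curve, which is one simple
    heuristic for reading an elbow plot (an assumption of this sketch).
    """
    ks = sorted(costs)
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        bend = (costs[prev_k] - costs[k]) - (costs[k] - costs[next_k])
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k


def kmeans_scores(df, feature_cols, ks=range(2, 11), seed=42):
    """Fit K-means for each k and return (costs, silhouettes) keyed by k."""
    # Imported here so the helper above can be used without Spark installed.
    from pyspark.ml.clustering import KMeans
    from pyspark.ml.evaluation import ClusteringEvaluator
    from pyspark.ml.feature import StandardScaler, VectorAssembler

    feature_cols = list(feature_cols)
    clean = df.dropna(subset=feature_cols)  # minimal preprocessing: drop rows with nulls
    assembled = VectorAssembler(
        inputCols=feature_cols, outputCol="raw_features"
    ).transform(clean)
    # K-means is distance based, so put all features on a comparable scale.
    scaler = StandardScaler(
        inputCol="raw_features", outputCol="features", withMean=True, withStd=True
    )
    scaled = scaler.fit(assembled).transform(assembled).cache()

    evaluator = ClusteringEvaluator(
        featuresCol="features", predictionCol="prediction", metricName="silhouette"
    )
    costs, silhouettes = {}, {}
    for k in ks:
        model = KMeans(k=k, seed=seed, featuresCol="features").fit(scaled)
        costs[k] = model.summary.trainingCost  # sum of squared distances to centers
        silhouettes[k] = evaluator.evaluate(model.transform(scaled))
    return costs, silhouettes
```

With a loaded DataFrame df, a call such as costs, silhouettes = kmeans_scores(df, ["col_a", "col_b"]) (column names hypothetical) gives the two curves; pick_elbow(costs) suggests a K from the cost curve, and max(silhouettes, key=silhouettes.get) gives the K with the highest silhouette score. The two methods may disagree, in which case the plots are worth inspecting by eye.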

Part B: Regression

1. Find one or two datasets on Kaggle or from any other source. Make sure that each dataset is at least 500 MB.

2. Write a detailed description of each dataset.

3. Preprocess each dataset.

4. Divide each dataset into training and testing sets.

5. Build two regression models.

6. Test the models on the testing set and compute their accuracy (for regression, an error metric such as RMSE or R²).
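
The regression steps above could look something like the sketch below. It assumes PySpark 3.x, numeric feature columns, and a numeric label column; the choice of linear regression and random forest as the two models, the 80/20 split, and the names evaluate_regressors and rmse_and_r2 are illustrative assumptions. Because regression has no "accuracy" in the classification sense, the sketch reports RMSE and R², and includes a plain-Python version of both metrics to show what Spark's evaluator computes.

```python
import math


def rmse_and_r2(actual, predicted):
    """Plain-Python reference for RMSE and R^2 (undefined if `actual` is constant)."""
    n = len(actual)
    mean = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual error
    ss_tot = sum((a - mean) ** 2 for a in actual)                  # total variance
    return math.sqrt(ss_res / n), 1.0 - ss_res / ss_tot


def evaluate_regressors(df, feature_cols, label_col, test_fraction=0.2, seed=42):
    """Train two regression models and return their RMSE and R^2 on the test split."""
    # Imported here so the helper above can be used without Spark installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.evaluation import RegressionEvaluator
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.regression import LinearRegression, RandomForestRegressor

    feature_cols = list(feature_cols)
    clean = df.dropna(subset=feature_cols + [label_col])  # minimal preprocessing
    train, test = clean.randomSplit([1.0 - test_fraction, test_fraction], seed=seed)

    assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")
    estimators = {
        "linear_regression": LinearRegression(
            featuresCol="features", labelCol=label_col
        ),
        "random_forest": RandomForestRegressor(
            featuresCol="features", labelCol=label_col, seed=seed
        ),
    }

    results = {}
    for name, estimator in estimators.items():
        fitted = Pipeline(stages=[assembler, estimator]).fit(train)
        predictions = fitted.transform(test)
        results[name] = {
            metric: RegressionEvaluator(
                labelCol=label_col, predictionCol="prediction", metricName=metric
            ).evaluate(predictions)
            for metric in ("rmse", "r2")
        }
    return results
```

A call such as evaluate_regressors(df, ["col_a", "col_b"], "target") (column names hypothetical) returns a dictionary with one entry per model, which makes the two models easy to compare side by side.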

Part C: Classification

1. Find one or two datasets on Kaggle or from any other source. Make sure that each dataset is at least 500 MB.

2. Write a detailed description of each dataset.

3. Preprocess each dataset.

4. Divide each dataset into training and testing sets.

5. Build two classification models.

6. Test the models and compute their accuracy.
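
The classification steps above might be sketched as follows. It assumes PySpark 3.x, numeric feature columns, and a categorical label column; the choice of logistic regression and random forest as the two models, the 80/20 split, the "label_index" column name, and the names evaluate_classifiers and accuracy are illustrative assumptions. A plain-Python accuracy function is included to show the quantity that Spark's evaluator reports.

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def evaluate_classifiers(df, feature_cols, label_col, test_fraction=0.2, seed=42):
    """Train two classifiers and return their accuracy on the test split."""
    # Imported here so the helper above can be used without Spark installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression, RandomForestClassifier
    from pyspark.ml.evaluation import MulticlassClassificationEvaluator
    from pyspark.ml.feature import StringIndexer, VectorAssembler

    feature_cols = list(feature_cols)
    clean = df.dropna(subset=feature_cols + [label_col])  # minimal preprocessing
    train, test = clean.randomSplit([1.0 - test_fraction, test_fraction], seed=seed)

    # Map label values to numeric indices; rows with labels unseen in training are skipped.
    indexer = StringIndexer(
        inputCol=label_col, outputCol="label_index", handleInvalid="skip"
    )
    assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")
    estimators = {
        "logistic_regression": LogisticRegression(
            featuresCol="features", labelCol="label_index"
        ),
        "random_forest": RandomForestClassifier(
            featuresCol="features", labelCol="label_index", seed=seed
        ),
    }
    evaluator = MulticlassClassificationEvaluator(
        labelCol="label_index", predictionCol="prediction", metricName="accuracy"
    )

    results = {}
    for name, estimator in estimators.items():
        fitted = Pipeline(stages=[indexer, assembler, estimator]).fit(train)
        results[name] = evaluator.evaluate(fitted.transform(test))
    return results
```

A call such as evaluate_classifiers(df, ["col_a", "col_b"], "category") (column names hypothetical) returns one accuracy value per model. For imbalanced classes, changing metricName to "f1" in the evaluator may give a fairer comparison than accuracy alone.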

Deliverables:
