Please read each question carefully. I don't think these three questions are hard for an expert, so I need working code. For the answers in words, please explain in as much detail as you can.

For question 4 (Minimum spanning trees), there are three links in the question: [haversine formula](https://en.wikipedia.org/wiki/Haversine_formula), [the US was supposed to convert to the metric system in 1975](https://en.wikipedia.org/wiki/Metrication_in_the_United_States), and [Great Circle Mapper](http://www.gcmap.com/). I took screenshots of the relevant lecture, which I will send you later.

For question 5 (Image compression with singular value decomposition), more details are in the picture I uploaded. I have copied the instructor's code, which should help; it appears in the problem statement below. The question has three sub-questions.

For question 6 (Machine learning with irises), there are two links in the question: [iris dataset](https://en.wikipedia.org/wiki/Iris_flower_data_set) and [scikit-learn SVM classifier](http://scikit-learn.org/stable/modules/svm.html#svm-mathematical-formulation).

Problem 5: Image compression with singular value decomposition
Run the following code to get a NumPy array corresponding to a grayscale picture of a raccoon.
In [1]: import numpy as np
from scipy import misc
import matplotlib.pyplot as plt
f = misc.face(gray=True)
plt.imshow(f, cmap=plt.cm.gray)
Out[1]: [grayscale image of the raccoon; axes run 0-700 (rows) and 0-1000 (columns)]
Problem 6: Machine learning with irises
Grading criteria: correctness and relevance of code.
6a. The Wikipedia article on the iris dataset (previously seen on homework 7) asserts:
The use of this data set in cluster analysis however is not common, since the data set only contains two clusters with rather obvious separation.
Demonstrate this by performing a clustering computation and showing that it fails to separate all the species in the dataset. (You can change the kernel to do this.)
In [ ]:
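One possible sketch for 6a, assuming scikit-learn's bundled copy of the iris data and k-means clustering (the choice of k-means is mine, not dictated by the problem). Cross-tabulating the cluster labels against the true species should show setosa in a cluster of its own while versicolor and virginica end up mixed, i.e. the clustering fails to separate all three species.

import numpy as np
from sklearn import datasets
from sklearn.cluster import KMeans

iris = datasets.load_iris()
# k-means with 3 clusters, one per species; random_state fixed for reproducibility
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(iris.data)

# For each true species, count how its samples are spread over the 3 clusters.
for species in range(3):
    counts = np.bincount(kmeans.labels_[iris.target == species], minlength=3)
    print(iris.target_names[species], counts)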
6b. Use the scikit-learn SVM classifier to classify species. Use a random sample of 80% of the initial data for training and the other 20% for testing, and report the accuracy rate of your predictions.
In [ ]:
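For 6b, a minimal sketch using scikit-learn's SVC with an 80/20 train/test split; the random_state value is arbitrary and the default RBF kernel is assumed.

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

iris = datasets.load_iris()
# Random 80% training / 20% testing split of the 150 iris samples
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=0)

clf = SVC()              # default RBF kernel; change the kernel= argument if desired
clf.fit(X_train, y_train)
print("Test accuracy:", clf.score(X_test, y_test))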
Problem 4: Minimum spanning trees
Grading criteria: correctness of code.
William Stein announces the formation of CoCalc Fiber, a new broadband network which will connect the capital cities of the 48 continental United States (excluding Alaska and Hawaii) with a series of underground cables joining certain pairs of cities. Naturally, William wants to minimize the cost of building this network; he assumes that the cost of building a network link between a pair of cities is proportional to the distance between those two cities.
4a. Define a function that implements the haversine formula: given two latitude-longitude pairs, compute the distance between those two points on Earth. Use kilometers instead of miles, because the
US was supposed to convert to the metric system in 1975. Check your formula by computing the distance between Boston and San Diego, as reported by the Great Circle Mapper web site.
In [ ]:
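A minimal sketch for 4a. The haversine formula gives d = 2r·arcsin(sqrt(sin²(Δφ/2) + cos φ1 · cos φ2 · sin²(Δλ/2))), where φ is latitude, λ is longitude, and r is the Earth's radius; I assume a mean radius of 6371 km, and the Boston and San Diego coordinates below are approximate, so the result should only match the Great Circle Mapper value roughly.

import math

def haversine(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

# Approximate coordinates in degrees: Boston and San Diego
print(haversine(42.36, -71.06, 32.72, -117.16))   # roughly 4,100-4,200 km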
4b. Construct a complete graph on 48 vertices in which each vertex is labeled by the name of a state, and each edge is labeled by the distance between the capital cities.
In [ ]:
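A sketch for 4b using networkx (my choice; Sage's built-in Graph class would work similarly). The coordinates shown are approximate and only a few entries are listed; the real dictionary needs all 48 continental state capitals. It reuses the haversine function from the 4a sketch.

import itertools
import networkx as nx

# Approximate (latitude, longitude) of each state capital, keyed by state name.
# Only three entries are shown here; fill in all 48 continental states.
capitals = {
    "Massachusetts": (42.36, -71.06),   # Boston
    "California": (38.58, -121.49),     # Sacramento
    "Texas": (30.27, -97.74),           # Austin
    # ... remaining 45 states ...
}

G = nx.Graph()
for (s1, (lat1, lon1)), (s2, (lat2, lon2)) in itertools.combinations(capitals.items(), 2):
    G.add_edge(s1, s2, weight=haversine(lat1, lon1, lat2, lon2))

# With the full dictionary this is a complete graph: 48 vertices and 48*47/2 = 1128 edges.
print(G.number_of_nodes(), G.number_of_edges())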
4c. Construct the minimal spanning tree corresponding to the optimal CoCalc Fiber network. (Hint: You may want to reread the lecture on 10-26-18.)
In [ ]:
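Continuing the sketch from 4b, networkx computes the minimum spanning tree directly (Kruskal's algorithm by default), which is one way to produce the optimal CoCalc Fiber network.

T = nx.minimum_spanning_tree(G)
print("Cables in the CoCalc Fiber network:")
for u, v, w in sorted(T.edges(data="weight"), key=lambda e: e[2]):
    print("  %s -- %s: %.0f km" % (u, v, w))
print("Total cable length (km):", sum(w for _, _, w in T.edges(data="weight")))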
Problem 5 (continued): Run the following code to get the singular value decomposition of the matrix corresponding to the raccoon picture above.
In [2]: U, Sigma, V = np.linalg.svd(f)
print("Shape of U: " + str(U.shape))
print("Shape of Sigma: " + str(Sigma.shape))
print("Shape of V: " + str(V.shape))
Shape of U: (768, 768)
Shape of Sigma: (768,)
Shape of V: (1024, 1024)
5a. Write a Python function that takes as input a positive integer k and the arrays U, Sigma, and V. It should make a 768 x k matrix U_k from the first k columns of U, a k x k diagonal matrix Sigma_k with the first k entries of Sigma along the diagonal, and a k x 1024 matrix V_k from the first k rows of V. It should then multiply these matrices to get an approximation U_k Sigma_k V_k of the original matrix and plot the corresponding grayscale picture. Note that this product will be a 768 x 1024 matrix, so it will be the correct size.
In [ ]:
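A sketch for 5a, assuming U, Sigma, V are exactly the arrays returned by np.linalg.svd(f) above (so the rows of V are already the right singular vectors).

import numpy as np
import matplotlib.pyplot as plt

def rank_k_approximation(k, U, Sigma, V):
    """Plot the rank-k approximation U_k Sigma_k V_k of the original image."""
    Uk = U[:, :k]               # 768 x k
    Sk = np.diag(Sigma[:k])     # k x k diagonal matrix of the first k singular values
    Vk = V[:k, :]               # k x 1024
    approx = Uk @ Sk @ Vk       # 768 x 1024, same shape as the original
    plt.imshow(approx, cmap=plt.cm.gray)
    plt.show()
    return approx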
5b. Run your function for some different values of k. What (approximately) is the smallest value for which the picture is recognizably a raccoon? What is the smallest value for which the picture is indistinguishable from the original?
In [ ]:
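For 5b, one approach is to plot the approximation for a range of k values and judge by eye; the values below are only a starting point, and the two thresholds the question asks for are a visual judgment call.

# Uses rank_k_approximation from the 5a sketch above
for k in (5, 10, 20, 50, 100, 200):
    print("k =", k)
    rank_k_approximation(k, U, Sigma, V)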
5c. As a function of k, what is the total combined size (number of entries) of the parts of U, Sigma, and V that are used to reconstruct the original picture? What is the compression ratio for the two values you found in 5b? (I.e., what percent of the original data was used in the reconstruction?)
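For 5c, counting only the entries actually used (the first k columns of U, the first k singular values, and the first k rows of V), the total is 768k + k + 1024k = 1793k entries, compared with 768 x 1024 = 786,432 entries in the original image, so the compression ratio is 1793k / 786432. A small helper to evaluate it; the k values passed in below are placeholders for whatever you found in 5b.

def compression_ratio(k, shape=(768, 1024)):
    """Fraction of the original entries used by the rank-k reconstruction."""
    m, n = shape
    used = m * k + k + n * k      # k columns of U, k singular values, k rows of V
    return used / (m * n)

for k in (10, 50):                # placeholders; substitute your answers from 5b
    print("k = %d: %.1f%% of the original data" % (k, 100 * compression_ratio(k)))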