Hello, I would like help with my homework; please see the attached file.
Important: please write the answers in the same doc file I attached, not in a different file or document.
Link for the book and slides:
https://drive.google.com/drive/folders/1POFZt1I4DM…
College of Computing and Informatics
Assignment 1
Deadline: Tuesday 03/10/2023 @ 23:59
[Total Mark for this Assignment is 8]
Student Details:
Name: ###
ID: ###
CRN: ###
Instructions:
• You must submit two separate copies (one Word file and one PDF file) using the Assignment Template on
Blackboard via the allocated folder. These files must not be in compressed format.
• It is your responsibility to check and make sure that you have uploaded both the correct files.
• Zero mark will be given if you try to bypass the SafeAssign (e.g. misspell words, remove spaces between
words, hide characters, use different character sets, convert text into image or languages other than English
or any kind of manipulation).
• Email submission will not be accepted.
• You are advised to make your work clear and well-presented. This includes filling your information on the cover
page.
• You must use this template, failing which will result in zero mark.
• You MUST show all your work, and text must not be converted into an image, unless specified otherwise by
the question.
• Late submission will result in ZERO mark.
• The work should be your own, copying from students or other resources will result in ZERO mark.
• Use Times New Roman font for all your answers.
Question One
3 Marks
Learning Outcome(s): LO1 Define different data mining tasks, problems and the algorithms most appropriate for addressing them
Apply the chi-square for this scenario:
An opinion poll surveyed a simple random sample of 1000 students. Respondents were classified by gender (male or female) and asked if she/he smoked. Results are shown in the contingency table below:

         Smoke   Not smoke   Total
Male     200     200         400
Female   250     350         600
Total    450     550         1000

Is there any relationship between the gender type and smoking? Use a 0.05 level of significance.
Note: calculate the expected value, then calculate the chi-square.
Answer:
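The calculation requested in Question One can be sketched in Python as follows. This is only an illustrative check, not a required method: the expected count of each cell is taken as (row total × column total) / grand total, and the statistic sums (observed − expected)² / expected over all cells. The function name `chi_square` and the dictionary layout are assumptions made for this sketch; 3.841 is the standard chi-square table value for 1 degree of freedom at the 0.05 level.

```python
# Observed counts from the contingency table in Question One.
observed = {
    "Male": {"Smoke": 200, "Not smoke": 200},
    "Female": {"Smoke": 250, "Not smoke": 350},
}

def chi_square(table):
    """Return (expected counts, chi-square statistic, degrees of freedom)."""
    rows = list(table)
    cols = list(next(iter(table.values())))
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    n = sum(row_totals.values())
    # Expected count under independence: row total * column total / grand total.
    expected = {
        r: {c: row_totals[r] * col_totals[c] / n for c in cols} for r in rows
    }
    # Sum of (observed - expected)^2 / expected over every cell.
    stat = sum(
        (table[r][c] - expected[r][c]) ** 2 / expected[r][c]
        for r in rows
        for c in cols
    )
    dof = (len(rows) - 1) * (len(cols) - 1)
    return expected, stat, dof

expected, stat, dof = chi_square(observed)

# Table value for 1 degree of freedom at the 0.05 significance level.
CRITICAL_VALUE = 3.841

print("Expected counts:", expected)
print("Chi-square:", round(stat, 3), "with", dof, "degree(s) of freedom")
print("Reject independence:", stat > CRITICAL_VALUE)
```

Under these assumptions the expected counts come out as 180 and 220 for males and 270 and 330 for females, and the statistic is about 6.73. Since that exceeds 3.841, the sketch points to a relationship between gender and smoking at the 0.05 level.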
Question Two
1.5 Marks
Learning Outcome(s): LO1 Define different data mining tasks, problems, and the algorithms most appropriate for addressing them
What is Data Reduction? Provide a list of strategies associated with it, and explain one of them with an example.
Answer:
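One commonly cited data reduction strategy is numerosity reduction by sampling, where a small random sample stands in for the full data set. The sketch below shows simple random sampling without replacement in Python. The helper name `sample_without_replacement`, the 100 made-up record IDs, and the sample size of 10 are all hypothetical and chosen only for illustration.

```python
import random

def sample_without_replacement(records, size, seed=0):
    """Draw `size` distinct records uniformly at random (SRSWOR)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return rng.sample(records, size)

# Hypothetical data set: 100 record IDs standing in for real tuples.
records = list(range(1, 101))

# Keep 10 records instead of 100, a tenfold reduction in volume.
sample = sample_without_replacement(records, 10)

print("Original size:", len(records))
print("Reduced size:", len(sample))
```

The design choice here is to sample without replacement, so no record appears twice in the reduced set; sampling with replacement, cluster sampling, or stratified sampling would be alternative variants of the same strategy.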
Question Three
1.5 Marks
Learning Outcome(s): LO3 Employ data mining and data warehousing techniques to solve real-world problems
Briefly compare the features of the following concepts: snowflake schema, fact constellation, and star schema. Support your answer with illustrative examples.
Answer:
Question Four
2 Marks
Learning Outcome(s): LO3 Employ data mining and data warehousing techniques to solve real-world problems
a) Apply the Apriori algorithm to the following database to find which itemsets satisfy the minimum support = 3, then explain or highlight the result.

Tid   Items
10    A, C
20    B, C, E
30    A, B, C, E
40    B, E

Answer:
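The Apriori pass asked for in Question Four can be sketched in Python as below. This is a minimal illustration under two assumptions: transaction 20 is taken to contain B, C, E, and the minimum support of 3 is treated as an absolute count of transactions. The function names `support` and `apriori` are choices made for this sketch, not part of the assignment.

```python
from itertools import combinations

# Transactions from Question Four.
transactions = {
    10: {"A", "C"},
    20: {"B", "C", "E"},  # assumption: Tid 20 holds B, C, E
    30: {"A", "B", "C", "E"},
    40: {"B", "E"},
}
MIN_SUPPORT = 3  # assumption: absolute count, not a percentage

def support(itemset, db):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for items in db.values() if itemset <= items)

def apriori(db, min_support):
    """Return {frequent itemset: support count} found level by level."""
    items = sorted(set().union(*db.values()))
    current = [frozenset([item]) for item in items]  # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        # Scan the database and keep candidates that meet the threshold.
        counts = {c: support(c, db) for c in current}
        level = {c: s for c, s in counts.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = [
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        ]
    return frequent

result = apriori(transactions, MIN_SUPPORT)
for itemset, count in sorted(
    result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))
):
    print(sorted(itemset), count)
```

Under these assumptions the frequent itemsets are {B}, {C}, {E} and {B, E}, each with support 3. Item A is dropped in the first pass with support 2, {B, C} and {C, E} fail in the second pass with support 2, and no 3-itemset candidate survives the join step.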