QSO 510 Quantitative Analysis for Decision Making
Homework Assignment 1
Please answer all questions
Question 1
What is the Quantitative Analysis for Decision Making course all about?
Question 2
Earlier on, data analysis relied on small samples to make inferences about the big data (the population)
instead of working with the big data itself. Nowadays, it is possible to work with big data directly.
Explain very briefly why it was not possible to deal with big data before but is possible now.
Question 3
It is very important to understand your data before you decide on the method of analyzing it.
Please list the characteristics of data that you have learned so far.
Question 4
List any three important business decisions that may require evidence from data before they are
made.
Question 5
What is the difference between observed and estimated data? Provide an example.
Question 6
You have just agreed to take the position of Economic Adviser at InoSmart Inc. The manager of the company
has asked you to evaluate the salaries the company paid to its employees last year. The manager gives
you the aggregate monthly salary data as follows:
Month        Aggregate Salaries
January      7434.30
February     7053.20
March        6795.00
April        7802.40
May          5019.70
June         6944.00
July         8196.55
August       6384.38
September    7911.11
October      6553.37
November     7107.64
December     8050.50
Question 1.
Type Yes (to agree) or No (to disagree) for each of the following:
The data provided is
(a) Qualitative only ……………
(b) Time series ……………
(c) Cross-sectional ……………
(d) Continuous ……………
(e) Quantitative only ……………
(f) Discrete ……………
(g) Sample ……………
(h) Population ……………
(i) Estimated ……………
(j) Observed ……………
(k) Both qualitative and quantitative (non-numerical and numerical) ……………
(l) Normally distributed or other distributions ……………
What is statistics?
A pool of techniques/procedures that can be used to produce
meaningful information from raw data, with the objective of providing
evidence for decision making.
Data → techniques/procedures (analysis) → information → decision
Data Literacy
What is data?
Facts collected for analysis or reference
Examples:
– Sales revenue
– Profits
– Population
– Etc.
Data
Sources of Data:
– Primary data
– Secondary data
Primary data (raw data) – data collected or observed directly from the source
(firsthand experience); it has not been processed.
Methods of primary data collection:
– Surveys: interviews, questionnaires, etc.
– Observations
– Experiments
Secondary data – data collected or observed previously by someone other
than the user(s).
Sources:
– Internet
– Websites
– Organizational records
– Published sources
– Etc.
Data Distribution (data set)
What is a data distribution?
– A collection of data (database).
– A collection of information that is organized so that it can easily be
accessed, managed, and updated.
Organizing data
1. Tabular form – data is arranged in a table, with rows and
columns.
2. Sorted – data can be arranged in ascending or descending order.
3. Pivot table – a data summarization tool found in data
visualization programs such as Excel.
– A pivot table can automatically summarize data by organizing,
sorting, counting, totaling, or averaging the data in a
distribution.
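As a rough illustration of ordering data and summarizing it the way a pivot table does, the plain-Python sketch below uses the gender and salary values from the hypothetical data set in Table 1. It only approximates what a spreadsheet pivot table produces, and the variable names (records, groups, pivot) are chosen here for illustration.

```python
from collections import defaultdict
from statistics import mean

# (gender, salary) pairs taken from the hypothetical Table 1
records = [
    ("M", 50000), ("F", 75000), ("M", 51000), ("M", 65000),
    ("M", 68000), ("M", 60000), ("F", 59000), ("F", 64000),
]

# Arrange the salaries in ascending and in descending order
salaries = [salary for _, salary in records]
ascending = sorted(salaries)
descending = sorted(salaries, reverse=True)

# Group salaries by gender, as a pivot table does when "Gender" is the row field
groups = defaultdict(list)
for gender, salary in records:
    groups[gender].append(salary)

# Summarize each group: count, total, and average
pivot = {
    gender: {"count": len(values), "total": sum(values), "average": mean(values)}
    for gender, values in groups.items()
}

for gender, summary in sorted(pivot.items()):
    print(gender, summary)
```

Choosing a different grouping field or a different summary (count, total, average) corresponds to rearranging the rows and values of a pivot table.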
Table 1. Data set (Hypothetical)
Individual  Age  Gender  Years of Education  Years of Experience  Salary
John        29   M       4                   2                    50,000
Mary        36   F       3                   12                   75,000
Adam        24   M       5                   2                    51,000
Shawn       34   M       4                   5                    65,000
Bill        28   M       2                   10                   68,000
Bob         40   M       3                   8                    60,000
Sabina      60   F       3                   6                    59,000
Kelly       49   F       4                   5                    64,000
Variables: the column headings. Observations: the values recorded under each heading.
Variable (attribute)
A characteristic of an item in the distribution.
From the table above – Age, Gender, Years of Education, Years of Experience
and Salary are variables.
Observation
The list of values recorded for a variable.
E.g., the observations for the variable “Age” in Table 1 above are: 29, 36,
24, 34, 28, 40, 60 and 49.
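The distinction between variables and observations can be sketched in Python by storing part of Table 1 as a list of records. This is only one possible representation; the names table1, variables, and age_observations are illustrative, and only three of the table's columns are included to keep the example short.

```python
# Part of the hypothetical Table 1, one record per individual
table1 = [
    {"Individual": "John", "Age": 29, "Gender": "M"},
    {"Individual": "Mary", "Age": 36, "Gender": "F"},
    {"Individual": "Adam", "Age": 24, "Gender": "M"},
    {"Individual": "Shawn", "Age": 34, "Gender": "M"},
    {"Individual": "Bill", "Age": 28, "Gender": "M"},
    {"Individual": "Bob", "Age": 40, "Gender": "M"},
    {"Individual": "Sabina", "Age": 60, "Gender": "F"},
    {"Individual": "Kelly", "Age": 49, "Gender": "F"},
]

# Variables are the characteristics recorded for each individual (the keys)
variables = [name for name in table1[0] if name != "Individual"]

# Observations for one variable are the values listed under it
age_observations = [row["Age"] for row in table1]

print(variables)
print(age_observations)
```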
Characteristics of data (types / forms/etc.)
Data can be either:
– Numerical
– Non-numerical
Numerical (quantitative) – data that is expressed with digits.
Example 1: 0, 1, 2, 3, 4 (integers)
Example 2: 0.23, 0.5, 2.84, 0.0007 (decimals)
Non-numerical (qualitative/categorical) – data that is expressed with
words, letters or categories.
Example 1: gender, states, countries, opinions, A, B, C, D, etc.
Example 2: ranges of values treated as categories (1–10, 10–20, 20–30), people categorized
by gender, etc.
Numerical data (quantitative)
Numerical Data can be:
– Discrete
– Continuous
Discrete: data that can be counted. It can take only certain values
(typically integers).
Example 1: 0, 1, 2, 3, 4, etc.
Example 2: Number of children, number of cars, number of houses, etc.
Continuous: data that can take any of an infinite number of possible values within a range (including decimals).
Example 1: 0.23, 0.5, 2.84, 0.0007
Example 2: Temperature, height, distance, etc.
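The classification above can be loosely mirrored with Python's built-in types. The sketch below is a heuristic only: the function name describe is hypothetical, and the stored type of a value is just a hint (a count entered as 2.0 is still discrete in meaning).

```python
def describe(value):
    """Classify a single value by its Python type (a rough heuristic)."""
    if isinstance(value, int):
        return "numerical, discrete"
    if isinstance(value, float):
        return "numerical, continuous"
    return "non-numerical"


# Values taken from the examples in the text
for value in [3, 2.84, 0.0007, "F", "A"]:
    print(repr(value), "->", describe(value))
```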
Discrete or continuous data can be:
– Observed
– Estimated
Observed data: data obtained from the study (real data).
Example: the performance of students in the last term examination.
Estimated data: data that is predicted (projected).
Example: the projected performance of students in the upcoming class.
Observed or estimated data can be:
– A Population
– A Sample
Population: the entire distribution of data (all entities of interest).
Example: people, overall salaries, or any other values in aggregate.
Sample: a subset of a population, in most cases randomly selected to
represent the characteristics of the population as a whole.
Sample selection methods:
– Simple Random Sampling (SRS) – each item is selected purely by
chance.
– Systematic sampling – one of the first n items is selected at random, and then
every nth item after it is selected.
– Stratified sampling – the distribution is divided into strata, and then random
samples are taken from each stratum.
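The three selection methods can be sketched with Python's standard random module. This is a minimal illustration under stated assumptions: the population is simply the numbers 1 to 100, the strata are the male and female individuals from the hypothetical Table 1, and the function names are chosen here for illustration.

```python
import random


def simple_random_sample(population, size, rng):
    """Pick `size` items purely by chance, without replacement."""
    return rng.sample(population, size)


def systematic_sample(population, n, rng):
    """Pick one of the first n items at random, then every nth item after it."""
    start = rng.randrange(n)
    return population[start::n]


def stratified_sample(strata, per_stratum, rng):
    """Take a simple random sample of `per_stratum` items from each stratum."""
    return {name: rng.sample(members, per_stratum) for name, members in strata.items()}


rng = random.Random(42)  # fixed seed so the run is repeatable
population = list(range(1, 101))

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(
    {"M": ["John", "Adam", "Shawn", "Bill", "Bob"], "F": ["Mary", "Sabina", "Kelly"]},
    2,
    rng,
)

print(srs)
print(systematic)
print(stratified)
```

Stratified sampling guarantees that every stratum appears in the sample, which simple random sampling does not.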
Population or sample data can be:
– Cross sectional
– Time series
Cross sectional data: data collected at a single point in time.
Examples: Someone’s salary, the current population of a town, the
current number of children in a family, someone’s education level, etc.
Time series data: a sequence of measurements of the same variable
collected over time.
Examples: monthly sales, population growth, etc.
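The difference shows up in how the data is stored. In the sketch below, the cross-sectional salaries come from the hypothetical Table 1, while the monthly sales figures are made up purely for illustration.

```python
# Cross-sectional: several individuals observed at one point in time
cross_section = {"John": 50000, "Mary": 75000, "Adam": 51000}

# Time series: one variable observed repeatedly over time
# (sales figures invented for this example)
monthly_sales = [("Jan", 120), ("Feb", 135), ("Mar", 128), ("Apr", 150)]

# Order matters in a time series, so the change between periods is meaningful
changes = [
    (later[0], later[1] - earlier[1])
    for earlier, later in zip(monthly_sales, monthly_sales[1:])
]

print(changes)
```

No comparable "change" exists for the cross-sectional data, because its entries have no time order.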
Cross sectional or time series data can be:
– Normally distributed
– Other distributions
Normally distributed data: data that is symmetrical about the mean.
Values are equally likely to fall above or below the mean (bell-shaped distribution).
Types:
– Standard normal distribution
– Non-standard normal distribution
Standard normal distribution – the distribution that occurs when
a normal random variable has a mean of zero and a standard deviation
of one.
– The normal random variable of a standard normal distribution is
called a standard score or a z score.
Non-standard normal distribution – a distribution that occurs when
a normal random variable has a mean other than zero or
a standard deviation other than one.
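A value x from any distribution with mean μ and standard deviation σ can be converted to a z score with the usual formula z = (x − μ) / σ. The sketch below applies it to the ages from the hypothetical Table 1, treating those eight values as a whole population (an assumption made here). Standardizing changes the mean to zero and the standard deviation to one; it does not by itself make the data normally distributed.

```python
from statistics import mean, pstdev

ages = [29, 36, 24, 34, 28, 40, 60, 49]  # ages from the hypothetical Table 1

mu = mean(ages)       # population mean
sigma = pstdev(ages)  # population standard deviation

# z score: how many standard deviations each value lies from the mean
z_scores = [(x - mu) / sigma for x in ages]

print(round(mean(z_scores), 6), round(pstdev(z_scores), 6))
```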
Other distributions – distributions of data that are skewed either to the left or
to the right.
A distribution is skewed if one of its tails is longer than the other (the
mean is pulled toward the longer tail).
– A positive skew – the distribution has a long tail in the positive direction.
– A negative skew – the distribution has a long tail in the negative direction.
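One common way to check the direction of skew is the moment-based skewness coefficient, which is positive for a long right tail and negative for a long left tail. The sketch below is illustrative only: the data values are invented, and the function name skewness is chosen here.

```python
from statistics import mean, median


def skewness(values):
    """Third central moment divided by the 1.5 power of the second central moment."""
    m = mean(values)
    n = len(values)
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]   # one large value stretches the right tail
left_tailed = [-x for x in right_tailed]   # mirror image: long left tail

for name, data in [("right_tailed", right_tailed), ("left_tailed", left_tailed)]:
    print(name, "mean:", mean(data), "median:", median(data), "skewness:", skewness(data))
```

In the right-tailed data the mean lies above the median, and in the left-tailed data it lies below, matching the idea that the mean is pulled toward the longer tail.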
Non numerical data (qualitative/categorical)
Non-numerical data can either be:
– Nominal or
– Ordinal
Nominal data – non-numerical data is nominal if there is no natural
ordering of its possible values.
Example: gender, states, countries, names of people, etc.
Ordinal data – non-numerical data is ordinal if there is a natural ordering of
its possible values.
Example: Education levels, priorities, ranks, etc.
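The practical difference is that ordinal values can be sorted and compared, while nominal values can only be counted or matched. In the sketch below, the state codes, education levels, and responses are all hypothetical values chosen for illustration.

```python
from collections import Counter

# Nominal: no natural order, so counting is the natural summary
states = ["NH", "MA", "NH", "VT", "MA", "NH"]
state_counts = Counter(states)

# Ordinal: the natural order is stated explicitly, from lowest to highest
education_order = ["High school", "Bachelor's", "Master's", "Doctorate"]
rank = {level: position for position, level in enumerate(education_order)}

responses = ["Master's", "High school", "Doctorate", "Bachelor's", "High school"]
ordered = sorted(responses, key=rank.get)  # sort by position in the natural order
highest = max(responses, key=rank.get)

print(state_counts)
print(ordered)
print(highest)
```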
Nominal or ordinal data can be:
– A population or
– A sample
Data Analysis (techniques/procedures)