Please read the attached handout carefully and follow the guidelines, along with the extra guidelines I uploaded for Parts B and C. Please view them all, follow the guidelines to complete this assignment, and stick to the marking criteria.
Due 16 May.
BA5001X Business Decision-Making
Final coursework: Lamberts Heating
Submission date: Friday 20th April 2013
Scenario
Lamberts Heating has two main divisions: one manufactures radiators and the other installs central heating systems. The manufacturing arm makes radiators for domestic and commercial central heating systems; these are either used by Lamberts' installation division or sold to other central heating installation companies. The machinery presently used to manufacture the radiators is rather old and the company wants to replace it. The company wants the new machinery to be in place as soon as possible and therefore wants to plan the procurement and installation now. However, it has yet to decide which piece of new machinery to buy, and so does not expect to place an order before 3rd June 2013.
Lamberts Heating's installation division supplies and fits domestic central heating systems. It uses the company's own radiators for this work, but buys in boilers and pipes from other companies. The company is keen to increase its profit margin in this division, but the business of supplying domestic heating systems is very competitive at present. It has decided that the best way to increase its margin is to reduce the cost of buying in boilers, and has conducted some research into the prices charged by three major suppliers: Apex Boilers, Brunswich Heating Supplies and Centrale.
Tasks
This coursework is in three parts. You should submit all three parts together as one document.
The document will consist of three reports, one for each part of the coursework, together with three appendices. The appendices will contain all the detailed work you completed in order to write the reports: all computer input and output, and the details of any other calculations you have made.
Part A: Planning and control of capital expenditure
Question / Tasks:
The senior management of Lamberts Heating are considering the replacement of one of the firm's machines used to make radiators in the factory. You are one of the managers involved in this decision. You have had some discussions with other members of the senior management about this proposed investment, and there appear to be three machines that the firm could purchase:
1. Alumier machine, a straight replacement for the present machine from the same supplier
2. Big EZ machine from an American supplier
3. Cial machine from Japan
The senior management team wish to carefully consider the alternatives. As a first step, it was decided to accurately estimate each of the alternative’s cash flows and the following estimated figures are available:
                     Alumier      Big EZ       Cial
                     Machine      Machine      Machine
                     £            £            £
Initial Outlay       500,000      500,000      500,000
Cash inflows
  Year 1              50,000      200,000      150,000
       2             100,000      150,000      150,000
       3             150,000      150,000      150,000
       4             150,000       50,000      150,000
       5             150,000       25,000      100,000
       6             170,000       25,000       50,000
Total inflows        770,000      600,000      750,000
The other members of the management team have asked you to appraise each of the above alternatives using the methods usually employed for capital investment decisions, and to report back to the whole team at their next meeting. Although the other members are aware of the usual investment appraisal methods, your report should assess the alternatives in whatever way you consider appropriate. It should include your recommendation of which machine to purchase, for the team to consider at that meeting, together with an explanation of the strengths and weaknesses of the appraisal methods you have used. A 10% discount rate is recommended, the yield for each machine should be calculated separately, and all calculations may be worked out on a pre-tax basis.
(50 Marks)
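The usual appraisal methods (payback period, net present value at the recommended 10% rate, and the internal rate of return, i.e. the yield) can be cross-checked against your spreadsheet with a short script. A minimal sketch, assuming only the cash-flow figures in the table above; the function names are illustrative:

```python
# Sketch of three standard appraisal methods applied to the cash flows
# above (figures in £; £500,000 outlay each; 10% discount rate as recommended).

def npv(rate, outlay, inflows):
    """Net present value: discounted inflows minus the initial outlay."""
    return sum(c / (1 + rate) ** t for t, c in enumerate(inflows, 1)) - outlay

def payback_years(outlay, inflows):
    """Payback period, interpolating within the year of recovery."""
    cumulative = 0
    for t, c in enumerate(inflows, 1):
        if cumulative + c >= outlay:
            return t - 1 + (outlay - cumulative) / c
        cumulative += c
    return None  # outlay never recovered

def irr(outlay, inflows, lo=0.0, hi=1.0, tol=1e-7):
    """Internal rate of return (the yield): the rate at which NPV = 0,
    found by bisection (valid here since NPV falls as the rate rises)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(mid, outlay, inflows) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

machines = {
    "Alumier": [50_000, 100_000, 150_000, 150_000, 150_000, 170_000],
    "Big EZ":  [200_000, 150_000, 150_000, 50_000, 25_000, 25_000],
    "Cial":    [150_000, 150_000, 150_000, 150_000, 100_000, 50_000],
}

for name, inflows in machines.items():
    print(f"{name}: NPV £{npv(0.10, 500_000, inflows):,.0f}, "
          f"payback {payback_years(500_000, inflows):.2f} yrs, "
          f"yield {irr(500_000, inflows):.1%}")
```

Note that payback ignores the time value of money and any cash flows after recovery, which is one of the strengths-and-weaknesses points the report is expected to cover.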
Note:
1. All references and books, etc., consulted must be included, including any internet sites used.
2. You are also expected to produce this assignment using information technology. For this part of the assignment, the use of Microsoft Word and Excel (or similar word-processing and spreadsheet software) is particularly expected.
A successful assignment (in Part A of this assignment) might include consideration or explanation of:
· consideration of each of the alternative proposals under at least three methods of evaluating capital investment decisions
· comparison of the inflows and outflows for each machine under each method
· the objective(s) of each method
· the advantages and disadvantages (strengths and weaknesses) of each of these methods
· consideration of any non-financial factors
· consideration of any other information you may require
· a recommendation of one of the proposals, including the reasons for your recommendation
· use of the expected word-processing and spreadsheet software
Note: The above is illustrative, covering only the main points of the assignment.
Suggested length: 1,250 words
This part of the assignment (Part A) represents 50% of your marks on this piece of coursework
Part B: Scheduling the installation of the new machinery
Having decided to buy a new piece of machinery to make the radiators, Lamberts Heating now has to consider the timeframe for getting it up and running. This will involve you developing the project plan: determining the critical path and the sequencing of the relevant activities.
The activities involved are shown in the table below with their preceding activities and their normal durations. Some of the activities can be shortened by using extra staff, but at extra cost. The minimum duration for each activity is shown in the final column; for a number of activities the minimum duration is the same as the normal duration, as these activities cannot be shortened.
The earliest that the company expects to start the process is Monday 3rd June 2013.
Activity                                                              Normal duration   Preceding   Minimum duration
                                                                      (weeks)           activity    (weeks)
[Each day saved costs £200]
A: Order new machinery                                                 2                None         2
B: Plan new physical layout of factory                                 3                None         3
C: Determine changes needed in existing machinery                      3                None         3
D: Receive new equipment                                              10                A            8
E: Hire new employee to supervise the operation of the new machinery   7                A            6
F: Make changes needed to accommodate new machinery                   15                B           13
G: Make changes needed in existing machinery                           9                C            7
H: Train existing employees to use new machinery                       7                D, E         6
I: Install new machinery                                               4                F            3
J: Disassemble old machinery                                           5                G            4
K: Conduct employee safety training on new installation                2                H, I         2
Please note that the company works a 5-day week: Monday to Friday.
Task:
Appendix
You may complete this task manually or by using MS Project.
· Assuming that the project starts on Monday 3rd June 2013, determine the shortest duration for the entire project, using the normal duration times. Specify the date at which the project can finish and the number of weeks required.
· Produce a Gantt chart showing the starting and finishing times of each activity
· Identify which activities form the critical path(s) of the project.
· If the project could be speeded up by a maximum of 3 weeks at a cost of £200 a day, where should this money be spent? Produce a new network diagram to show the new critical path(s) and total duration.
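Whether you do it manually or in MS Project, the schedule can be cross-checked with the standard critical path method: a forward pass for earliest times, a backward pass for latest times, and floats as the difference. A sketch in Python over the activity table above (re-running it with the minimum durations shows the effect of crashing):

```python
# Forward/backward pass of the critical path method (CPM) over the
# Part B activity table; durations in weeks.

activities = {  # name: (duration, predecessors)
    "A": (2, []), "B": (3, []), "C": (3, []),
    "D": (10, ["A"]), "E": (7, ["A"]), "F": (15, ["B"]),
    "G": (9, ["C"]), "H": (7, ["D", "E"]), "I": (4, ["F"]),
    "J": (5, ["G"]), "K": (2, ["H", "I"]),
}

def cpm(acts):
    # forward pass: earliest start/finish (dict order already
    # lists every activity after its predecessors here)
    es, ef = {}, {}
    for name, (dur, preds) in acts.items():
        es[name] = max((ef[p] for p in preds), default=0)
        ef[name] = es[name] + dur
    duration = max(ef.values())
    # backward pass: latest finish/start
    lf, ls = {}, {}
    for name in reversed(list(acts)):
        dur, _ = acts[name]
        succs = [s for s, (_, ps) in acts.items() if name in ps]
        lf[name] = min((ls[s] for s in succs), default=duration)
        ls[name] = lf[name] - dur
    floats = {n: ls[n] - es[n] for n in acts}
    critical = [n for n, f in floats.items() if f == 0]
    return duration, floats, critical

duration, floats, critical = cpm(activities)
print(f"duration {duration} weeks; critical activities: {', '.join(critical)}")
```

On the cost side, the arithmetic is direct: with a 5-day week at £200 per day saved, shortening the project by a full 3 weeks would cost 3 × 5 × £200 = £3,000, provided the weeks saved all lie on the (current) critical path.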
Report
Give details of the proposed calendar of works, making it clear which activities are critical and which activities have some slack. This section should be written so that the management can understand the advice being given without reference to the work in the appendix. Give details of the starting date, the finishing date and the duration of the project.
Explain how, by spending more money, the project duration can be shortened. Explain which activities have been shortened and which activities are now critical. Give details of the starting date, the finishing date and the duration of the project. Give details of the extra cost involved in shortening the project.
A successful assignment (Part B of this assignment) should include:
· Determining the shortest time in which the project can be completed using the “normal” durations
· Determining the shortest time in which the project can be completed using the “minimum” durations
· Producing a well-written, well-structured report that enables management to understand how to schedule the project. This report should include details of activities which must be completed on time and those for which there is some slack.
· The costs of shortening the project time should be included in the report as should the relevant start and finishing dates.
Note: The above is illustrative, covering only the main points of the assignment.
Suggested length: 500 words
This part of the assignment (Part B) represents 20% of your marks on this piece of coursework.
Part C: Cost of procuring boilers
This part of the coursework requires you to use linear programming techniques to determine the best way to minimise the cost of supplying boilers to Lamberts Heating.
As well as manufacturing radiators (as mentioned in Part A), Lamberts Heating is also reviewing the way it procures boilers to enable the installation of domestic heating systems.
The purchasing department has identified three possible suppliers of boilers: Apex, Brunswich and Centrale. It has also identified five kinds of domestic boiler that it wants to purchase so it can supply and fit boilers to a variety of sizes of homes.
The five boilers are known and coded by Italian numbers, as most are manufactured in Italy: Uno (1), Duo (2), Tre (3), Quattro (4) and Cinque (5).
The table below gives the cost of each boiler (in £s) from each supplier. The final row of the table specifies the minimum requirement of each type of boiler per year. Please note that not all boilers are available from each supplier.
               Uno     Duo     Tre     Quattro   Cinque
Apex           500     750     300     —         450
Brunswich      —       725     320     875       420
Centrale       480     775     310     900       —
Requirement    2000    1500    3000    2500      2200
(number of boilers)
There are certain limitations that have to be taken into account:
· Apex can supply no more than 1,000 of the Tre boiler each year.
· Brunswich can supply no more than 800 of the Duo boiler each year.
· Centrale can supply no more than 1,800 Uno boilers each year.
Lamberts Heating wants to meet its requirement for the number of boilers needed each year at the minimum cost.
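Before entering the problem into Excel, it helps to see its shape: one decision variable for each available supplier/boiler pair, a "greater than or equal" constraint for each boiler type's requirement, and a "less than or equal" constraint for each supply limit. A Python sketch that builds the objective and constraint rows from the data above (formulation only; the variable and row names are illustrative, and the resulting matrices can be fed to Excel Solver or any LP package):

```python
# Shape of the Part C linear programme: variables, objective, constraints.

costs = {  # (supplier, boiler): unit cost in £; missing pairs are unavailable
    ("Apex", "Uno"): 500, ("Apex", "Duo"): 750, ("Apex", "Tre"): 300,
    ("Apex", "Cinque"): 450,
    ("Brunswich", "Duo"): 725, ("Brunswich", "Tre"): 320,
    ("Brunswich", "Quattro"): 875, ("Brunswich", "Cinque"): 420,
    ("Centrale", "Uno"): 480, ("Centrale", "Duo"): 775,
    ("Centrale", "Tre"): 310, ("Centrale", "Quattro"): 900,
}
demand = {"Uno": 2000, "Duo": 1500, "Tre": 3000, "Quattro": 2500, "Cinque": 2200}
caps = {("Apex", "Tre"): 1000, ("Brunswich", "Duo"): 800, ("Centrale", "Uno"): 1800}

variables = sorted(costs)               # one decision variable per available pair
objective = [costs[v] for v in variables]  # minimise sum of cost * quantity

# demand rows: sum over suppliers of x[supplier, boiler] >= requirement
demand_rows = [([1 if v[1] == b else 0 for v in variables], demand[b])
               for b in demand]
# cap rows: x[supplier, boiler] <= limit for the three restricted pairs
cap_rows = [([1 if v == pair else 0 for v in variables], limit)
            for pair, limit in caps.items()]

print(len(variables), "variables,", len(demand_rows), "demand constraints,",
      len(cap_rows), "supply caps")
```

Writing the model out this way makes the appendix requirements (define the variables, state the objective, state the constraints) almost mechanical to satisfy.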
Task
You are expected to use MS Excel software for this task. However, we will accept printouts from any other Linear Programming software.
Appendix
· Define the meaning of any decision variables you are using.
· Formulate the situation described above as a linear programming problem.
· Include your computer input of the problem (and show any formulae used)
· Include the “Answer Report” and “Sensitivity Report” printouts
Report
Write a brief report to the management of Lamberts Heating detailing the cheapest way of meeting their need for boilers.
This report, written in a formal style of English, should include:
· The nature of the problem being solved
· The suggested purchasing plan
· The cost of the suggested purchasing plan.
· The robustness of the plan. Provide details of the ranges of costs of boilers from each suggested supplier for which your suggested plan remains optimal, and advise the company when they will have to generate a new plan. Do not just provide lists but consider your answers in the context of the question.
· At present, Apex can supply no more than 1,000 of the Tre boiler each year. If this limit were raised so that Apex could supply 1,100 Tre boilers, explain what effect this would have on the total minimum cost.
A successful assignment (for Part C of this assignment) should include:
· Well-defined decision variables and a correct formulation of the problem.
· Appropriate use of Excel (or other software) to produce a solution to the problem.
· Producing a well-written, well-structured report that enables management to understand how to procure boilers at the minimum price. This report should include details of your suggested purchasing plan and the cost of this plan.
· The advice to management should also enable them to determine the range of prices for each boiler for which your purchasing plan is valid.
· The implications of the increased capacity to supply by Apex should be clearly described.
Note: The above is illustrative, covering only the main points of the assignment.
Suggested length: 500 words
This part of the assignment (Part C) represents 30% of your marks on this piece of coursework.
BA5001 Final Coursework
Final coursework: Lamberts Heating
Submission date: Friday 24th May 2013
This coursework is worth 40% of your final mark for this module.
Part A: 50 marks
Part B: 20 marks
Part C: 30 marks
Part B : Installing the new machinery
Activity Normal Duration in weeks Preceding activity Minimum Duration
[Each day saved costs £200]
A: Order new machinery 2 None 2
B: Plan new physical layout of factory 3 None 3
C: Determine changes needed in existing machinery 3 None 3
D Receive new equipment 10 A 8
E Hire new employee to supervise the operation of the new machinery 7 A 6
F Make changes needed to accommodate new machinery 15 B 13
G: Make changes needed in existing machinery 9 C 7
H: Train existing employees to use new machinery 7 D,E 6
I: Install new machinery 4 F 3
J: Disassemble old machinery 5 G 4
K: Conduct employee safety training on new installation 2 H,I 2
Part B : Instructions
You may complete this task manually or by using MS Project.
Assuming that the project starts on Monday 3rd June 2013, determine the shortest duration for the entire project, using the normal duration times. Specify the date at which the project can finish and the number of weeks required.
Produce a Gantt chart showing the starting and finishing times of each activity
Identify which activities form the critical path(s) of the project.
Part B : Example from booklet
Estate Agency : pages 113 to 115 for manual example
Estate Agency: pages 124 to 127 for MS Project
Estate Agents: Network Diagram
[Network diagram rendered as a table: each node shows the activity's earliest start/finish above its duration and latest start/finish.]

Activity   Duration   ES   EF   LS   LF
A          9          0    9    0    9
B          7          0    7    4    11
C          5          9    14   10   15
D          3          9    12   9    12
E          3          12   15   12   15
F          4          7    11   11   15
G          1          15   16   15   16
The critical path is ADEG. The shortest finishing time for the project is 16 weeks.
Estate Agents: Floats
Float = Latest finish time (LFT) – Earliest finish time (EFT)
Activity   EFT   LFT   Float   Critical
A          9     9     0       yes
B          7     11    4       no
C          14    15    1       no
D          12    12    0       yes
E          15    15    0       yes
F          11    15    4       no
G          16    16    0       yes
Estate Agents: Gantt Chart
[Bar chart omitted; the underlying schedule data:]

Activity   Start   Duration   Float
A          0       9          0
B          0       7          4
C          9       5          1
D          9       3          0
E          12      3          0
F          7       4          4
G          15      1          0
Using MS Project: Network Diagram
Using MS Project: Gantt Chart
Part B : continued
If the project could be speeded up by a maximum of 3 weeks at a cost of £200 a day, where should this money be spent? Produce a new network diagram to show the new critical path(s) and total duration.
An example of crashing is on pages 117 to 120 (the coursework is easier than this).
Make sure you take note of the minimum durations.
Calculate the extra cost of speeding up the project.
Part B : Report (about 500 words)
Give details of the proposed calendar of works, making it clear which activities are critical and which activities have some slack. This section should be written so that the management can understand the advice being given without reference to the work in the appendix. Give details of the starting date, the finishing date and the duration of the project.
Explain how, by spending more money, the project duration can be shortened. Explain which activities have been shortened and which activities are now critical. Give details of the starting date, the finishing date and the duration of the project. Give details of the extra cost involved in shortening the project.
Final Coursework : part C
Cost of procuring boilers
This part of the coursework requires you to use linear programming techniques to determine the best way to minimise the cost of supplying boilers to Lamberts Heating
As well as manufacturing radiators (as mentioned in Part A), Lamberts Heating is also reviewing the way it procures boilers to enable the installation of domestic heating systems.
Part C : Identify the decision variables
The purchasing department has identified three possible suppliers of boilers: Apex, Brunswich and Centrale. It has also identified five kinds of domestic boiler that it wants to purchase so it can supply and fit boilers to a variety of sizes of homes.
The five boilers are known and coded by Italian numbers as most are manufactured in Italy ; Uno (1), Duo (2), Tre (3), Quattro (4) and Cinque (5).
Part C: Identify the constraints
The table below gives the cost of each boiler (in £s) from each supplier. The final row of the table specifies the minimum requirement of each type of boiler per year. Please note that not all boilers are available from each supplier.
Uno Duo Tre Quattro Cinque
Apex 500 750 300 — 450
Brunswich — 725 320 875 420
Centrale 480 775 310 900 —
Requirement (number of boilers)   2000   1500   3000   2500   2200
Part C: Identify the constraints
There are certain limitations that have to be taken into account:
Apex can supply no more than 1,000 of the Tre boiler each year.
Brunswich can supply no more than 800 of the Duo boiler each year.
Centrale can supply no more than 1,800 Uno boilers each year.
Lamberts Heating wants to meet its requirement for the number of boilers needed each year at the minimum cost.
Example from Module booklet page 105
Farm Production
A farmer has two farms, Home Farm and Meadow Farm, on which he grows corn and wheat.
Both farms are 40 acres in size.
To satisfy a contract with a local mill, the farmer must produce 7,000 barrels of corn and 11,000 barrels of wheat each year.
The farmer wishes to minimise the cost of meeting the contract.
The data for each farm is given below:

          Home Farm                      Meadow Farm
          Yield per acre   Cost per acre   Yield per acre   Cost per acre
Corn      400 barrels      £100            650 barrels      £120
Wheat     300 barrels      £90             350 barrels      £80
Definition of Decision variables
Variables: We need to distinguish between production at Home and Meadow Farm.
Let CH be the number of acres of corn planted at Home Farm
Let CM be the number of acres of corn planted at Meadow Farm
Let WH be the number of acres of wheat planted at Home Farm
Let WM be the number of acres of wheat planted at Meadow Farm
Formulation of problem
Minimise cost: 100CH + 90WH + 120CM + 80WM
subject to:
  corn yield (1):        400CH + 650CM ≥ 7,000
  wheat yield (2):       300WH + 350WM ≥ 11,000
  Home Farm area (3):    CH + WH ≤ 40
  Meadow Farm area (4):  CM + WM ≤ 40
  CH ≥ 0, CM ≥ 0, WH ≥ 0, WM ≥ 0
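The Solver answer reported for this formulation (CH = 0, CM = 10.77, WH = 2.56, WM = 29.23, cost £3,861.54) can be verified by substituting it back into the objective and the constraints. A quick Python check, using the exact fractions behind those rounded Answer Report values:

```python
# Substitute the Solver's planting plan back into the Farm Production
# formulation to confirm feasibility and the minimum cost of £3,861.54.
CH = 0.0         # acres of corn at Home Farm
CM = 140 / 13    # ≈ 10.77 acres of corn at Meadow Farm
WH = 100 / 39    # ≈ 2.56 acres of wheat at Home Farm
WM = 380 / 13    # ≈ 29.23 acres of wheat at Meadow Farm

cost = 100 * CH + 90 * WH + 120 * CM + 80 * WM

assert 400 * CH + 650 * CM >= 7000 - 1e-9    # corn contract met exactly
assert 300 * WH + 350 * WM >= 11000 - 1e-9   # wheat contract met exactly
assert CH + WH <= 40 and CM + WM <= 40       # both farms within 40 acres

print(round(cost, 2))  # → 3861.54
```

The same substitution check is a cheap way to sanity-test your own Part C solution before writing the report.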
Farm production : Input
Farm’s Answer Report
Farm’s Sensitivity Report
Coursework requirements: Appendix
Define the meaning of any decision variables you are using.
Formulate the situation as a linear programming problem.
Include your computer input of the problem (and show any formulae used)
Include the “Answer Report” and “Sensitivity Report” printouts
Coursework requirements: Report [about 500 words]
Write a brief report to the management of Lamberts Heating detailing the cheapest way of meeting their need for boilers.
This report, written in a formal style of English, should include:
The nature of the problem being solved
The suggested purchasing plan
The cost of the suggested purchasing plan.
The robustness of the plan. Provide details of the ranges of costs of boilers from each suggested supplier for which your suggested plan remains optimal, and advise the company when they will have to generate a new plan
coursework requirements
Do not just provide lists but consider your answers in the context of the question.
At present, Apex can supply no more than 1,000 of the Tre boiler each year. If this limit were raised so that Apex could supply 1,100 Tre boilers, explain what effect this would have on the total minimum cost.
Farm Production : Notes for a report
This is taken from page 106 of the module booklet
1. Planting Plan
In order to minimise the planting costs, the following should be implemented:
Plant Home Farm with 2.56 acres of wheat
Plant Meadow Farm with 10.77 acres of corn and 29.23 acres of wheat.
This will incur costs of £3,861.54, the minimum that can be achieved.
Farm Production : Notes
2. Implications of Suggested Planting Programme
The planting plan given above will result in exactly 7,000 barrels of corn and exactly 11,000 barrels of wheat being produced.
[S1 = 0, S2 = 0]
37.44 acres of Home Farm will not be planted, whilst all 40 acres of Meadow Farm will be used.
Farm Production : Notes
3. Scope of the recommendations
3.1 Change in Costs
The planting plan given above will result in a minimum cost whilst the following conditions hold:
* The cost of planting corn at Home Farm stays above £89.23 per acre.
* The cost of planting corn at Meadow Farm stays below £137.50 per acre.
* The cost of planting wheat at Home Farm stays in the range £68.57 to £105 per acre.
* The cost of planting wheat at Meadow Farm stays in the range £62.50 to £105 per acre.
Farm Production : Notes
If any one of the conditions above fails to hold, a new planting plan will be required. Note also that these ranges apply to one cost changing at a time; if two or more costs change together, the plan may need to be recomputed even if each cost stays within its range.
As these ranges are relatively large, the proposed plan should hold good for some time.
Farm Production : Notes
3.2 Change in Contract
The proposed plan exactly meets the contracts for 7,000 barrels of corn and 11,000 barrels of wheat.
If the requirement for corn was to increase by one barrel, the extra costs incurred in meeting this target would be 22.3 pence. This marginal cost of 22.3 pence per barrel applies for levels of production between 5,571.43 and 26,000 barrels.
If the minimum amount of wheat required was increased, the extra cost would be 30 pence per barrel. This extra cost per barrel will apply whilst the contract for wheat lies in the range 10,230.8 to 22,230.8 barrels.
If the minimum requirement for corn or wheat fell outside these ranges, a new planting plan would be required.
Farm Production : Notes
3.3 Changes in Farm Size
In total only 2.56 acres of Home Farm are being used. Clearly it would not make sense to consider increasing the size of Home Farm.
All 40 acres of Meadow Farm are being used. If the size of Meadow Farm could be increased by 1 acre the minimum cost could be reduced by £25. This marginal reduction in cost applies when the size of Meadow Farm lies in the range 10.77 acres to 42.2 acres.
Similarly, if the size of Meadow Farm is reduced by 1 acre, the minimum cost will rise by £25.
Farm Production : Notes
[If we could increase the size of Meadow Farm by 1 acre, we could produce an extra 350 barrels of wheat at Meadow Farm at a cost of £80. At the same time we would reduce the wheat yield from Home Farm by 350 barrels, i.e. by 350/300 = 1.1667 acres, saving 1.1667 × £90 = £105. Thus the net reduction in cost is £105 − £80 = £25.]
Marking Scheme: Part B (40 marks)

Appendix
Input data for original problem shown: 2
Correct network diagram produced: 4
Reduced-time input data: 2
Reduced-time correct network diagram: 2
Subtotal: 10

Report
Mention start date and finish date (2); mention duration = 24 weeks (2): 4
Mention critical activities, giving a full description of each activity for full marks (3), making it clear what "critical" means (2); list activities that have some slack (3), making it clear what "slack" means (2): 12
Advise the client how to reduce the project time: 2
Mention the new start date and new finish date, duration and cost: 4
Mention the new critical activities, giving a full description of each activity for full marks (2), making it clear what "critical" means (2): 8
Subtotal: 30

Total: 40
Marking Scheme: Part C (60 marks)

Appendix
Defining variables: 6
Stating objective function: 2
Stating constraints: 9
Showing correct Excel input data: 2
Showing all of the output data: 4
Subtotal: 23

Report
Explaining the problem: 5
Explaining the suggested purchasing plan: 8
Stating the minimum cost: 3
Robustness: this plan remains optimal whilst the costs of Uno from Apex are ???, Duo from Apex are ???, etc.: 17
If Apex could supply 100 more Tre boilers a year, how would the minimum cost change: 4
Subtotal: 37

Total: 60
Answer Report 1
Microsoft Excel 14.0 Answer Report
Worksheet: [Book1]Sheet1
Report Created: 05/12/2012 15:33:15
Result: Solver found a solution. All Constraints and optimality conditions are satisfied.
Solver Engine
Engine: Simplex LP
Solution Time: 0.032 Seconds.
Iterations: 4 Subproblems: 0
Solver Options
Max Time Unlimited, Iterations Unlimited, Precision 0.000001, Use Automatic Scaling
Max Subproblems Unlimited, Max Integer Sols Unlimited, Integer Tolerance 1%, Assume NonNegative
Objective Cell (Max)
Cell Name Original Value Final Value
$E$4 0 12750
Variable Cells
Cell Name Original Value Final Value Integer
$B$3 Alpha 0 750 Contin
$C$3 Beta 0 1000 Contin
$D$3 Gamma 0 1500 Contin
Constraints
Cell Name Cell Value Formula Status Slack
$E$5 total investments 100000 $E$5<=$G$5 Binding 0
$E$6 max Alpha 750 $E$6<=$G$6 Not Binding 250
$E$7 max Beta 1000 $E$7<=$G$7 Binding 0
$E$8 max Gamma 1500 $E$8<=$G$8 Binding 0
Sensitivity Report 1
Microsoft Excel 14.0 Sensitivity Report
Worksheet: [Book1]Sheet1
Report Created: 05/12/2012 15:33:15
Variable Cells
Final Reduced Objective Allowable Allowable
Cell Name Value Cost Coefficient Increase Decrease
$B$3 Alpha 750 0 7 0.2 7
$C$3 Beta 1000 0 3 1E+30 0.0833333333
$D$3 Gamma 1500 0 3 1E+30 0.6666666667
Constraints
Final Shadow Constraint Allowable Allowable
Cell Name Value Price R.H. Side Increase Decrease
$E$5 total investments 100000 0.1166666667 100000 15000 45000
$E$6 max Alpha 750 0 1000 1E+30 250
$E$7 max Beta 1000 0.0833333333 1000 1800 600
$E$8 max Gamma 1500 0.6666666667 1500 2250 750
fred
Alpha Beta Gamma
750 1000 1500
7 3 3 12750
total investments 60 25 20 100000 <= 100000
max Alpha 1 750 <= 1000
max Beta 1 1000 <= 1000
max Gamma 1 1500 <= 1500
Answer Report 2
Microsoft Excel 14.0 Answer Report
Worksheet: [LP problems.xlsx]farms
Report Created: 05/12/2012 15:48:05
Result: Solver found a solution. All Constraints and optimality conditions are satisfied.
Solver Engine
Engine: Simplex LP
Solution Time: 0.016 Seconds.
Iterations: 3 Subproblems: 0
Solver Options
Max Time Unlimited, Iterations Unlimited, Precision 0.000001, Use Automatic Scaling
Max Subproblems Unlimited, Max Integer Sols Unlimited, Integer Tolerance 1%, Assume NonNegative
Objective Cell (Min)
Cell Name Original Value Final Value
$F$5 minimise 0 3861.5384615385
Variable Cells
Cell Name Original Value Final Value Integer
$B$4 corn home 0 0 Contin
$C$4 corn meadow 0 10.7692307692 Contin
$D$4 wheat home 0 2.5641025641 Contin
$E$4 wheat meadow 0 29.2307692308 Contin
Constraints
Cell Name Cell Value Formula Status Slack
$F$6 corn 7000 $F$6>=$H$6 Binding 0
$F$7 wheat 11000 $F$7>=$H$7 Binding 0
$F$8 home farm 2.5641025641 $F$8<=$H$8 Not Binding 37.4358974359
$F$9 meadow farm 40 $F$9<=$H$9 Binding 0
Sensitivity Report 2
Microsoft Excel 14.0 Sensitivity Report
Worksheet: [LP problems.xlsx]farms
Report Created: 05/12/2012 15:48:05
Variable Cells
Final Reduced Objective Allowable Allowable
Cell Name Value Cost Coefficient Increase Decrease
$B$4 corn home 0 10.7692307692 100 1E+30 10.7692307692
$C$4 corn meadow 10.7692307692 0 120 17.5 145
$D$4 wheat home 2.5641025641 0 90 15 21.4285714286
$E$4 wheat meadow 29.2307692308 0 80 25 17.5
Constraints
Final Shadow Constraint Allowable Allowable
Cell Name Value Price R.H. Side Increase Decrease
$F$6 corn 7000 0.2230769231 7000 19000 1428.5714285714
$F$7 wheat 11000 0.3 11000 11230.7692307692 769.2307692308
$F$8 home farm 2.5641025641 0 40 1E+30 37.4358974359
$F$9 meadow farm 40 -25 40 2.1978021978 29.2307692308
farms worksheet (input)
corn home corn meadow wheat home wheat meadow
0 0 0 0
minimise 100 120 90 80 0
corn 400 650 0 >= 7000
wheat 300 350 0 >= 11000
home farm 1 1 0 <= 40
meadow farm 1 1 0 <= 40
farms
farms
corn home corn meadow wheat home wheat meadow
0 10.7692307692 2.5641025641 29.2307692308
minimise 100 120 90 80 3861.5384615385
corn 400 650 7000 >= 7000
wheat 300 350 11000 >= 11000
home farm 1 1 2.5641025641 <= 40
meadow farm 1 1 40 <= 40
Sheet3
Variable Cells
FinalReducedObjectiveAllowableAllowable
CellNameValueCostCoefficientIncreaseDecrease
$B$4corn home010.769230771001E+3010.76923077
$C$4corn meadow10.76923077012017.5145
$D$4wheat home2.5641025640901521.42857143
$E$4wheat meadow29.230769230802517.5
Constraints
FinalShadowConstraintAllowableAllowable
CellNameValuePriceR.H. SideIncreaseDecrease
$F$6corn70000.2230769237000190001428.571429
$F$7wheat110000.31100011230.76923769.2307692
$F$8home farm 2.5641025640401E+3037.43589744
$F$9meadow farm40-25402.19780219829.23076923
Answer Report 1
Microsoft Excel 14.0 Answer Report
Worksheet: [Book1]Sheet1
Report Created: 05/12/2012 15:33:15
Result: Solver found a solution. All Constraints and optimality conditions are satisfied.
Solver Engine
Engine: Simplex LP
Solution Time: 0.032 Seconds.
Iterations: 4 Subproblems: 0
Solver Options
Max Time Unlimited, Iterations Unlimited, Precision 0.000001, Use Automatic Scaling
Max Subproblems Unlimited, Max Integer Sols Unlimited, Integer Tolerance 1%, Assume NonNegative
Objective Cell (Max)
Cell Name Original Value Final Value
$E$4 0 12750
Variable Cells
Cell Name Original Value Final Value Integer
$B$3 Alpha 0 750 Contin
$C$3 Beta 0 1000 Contin
$D$3 Gamma 0 1500 Contin
Constraints
Cell Name Cell Value Formula Status Slack
$E$5 total investments 100000 $E$5<=$G$5 Binding 0
$E$6 max Alpha 750 $E$6<=$G$6 Not Binding 250
$E$7 max Beta 1000 $E$7<=$G$7 Binding 0
$E$8 max Gamma 1500 $E$8<=$G$8 Binding 0
Sensitivity Report 1
Microsoft Excel 14.0 Sensitivity Report
Worksheet: [Book1]Sheet1
Report Created: 05/12/2012 15:33:15
Variable Cells
Final Reduced Objective Allowable Allowable
Cell Name Value Cost Coefficient Increase Decrease
$B$3 Alpha 750 0 7 0.2 7
$C$3 Beta 1000 0 3 1E+30 0.0833333333
$D$3 Gamma 1500 0 3 1E+30 0.6666666667
Constraints
Final Shadow Constraint Allowable Allowable
Cell Name Value Price R.H. Side Increase Decrease
$E$5 total investments 100000 0.1166666667 100000 15000 45000
$E$6 max Alpha 750 0 1000 1E+30 250
$E$7 max Beta 1000 0.0833333333 1000 1800 600
$E$8 max Gamma 1500 0.6666666667 1500 2250 750
fred
Alpha Beta Gamma
750 1000 1500
7 3 3 12750
total investments 60 25 20 100000 <= 100000
max Alpha 1 750 <= 1000
max Beta 1 1000 <= 1000
max Gamma 1 1500 <= 1500
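The 'fred' sheet above encodes the LP that Solver reports on: maximise 7·Alpha + 3·Beta + 3·Gamma subject to 60·Alpha + 25·Beta + 20·Gamma ≤ 100000 and caps of 1000, 1000 and 1500 units. Because there is only one shared (budget) constraint plus simple upper bounds, the optimum can be reproduced outside Excel in a few lines of Python. This is a sketch, treating the objective coefficients as per-unit returns (the sheet does not name them); the greedy argument is valid only for this single-constraint structure:

```python
# Products from the 'fred' sheet: (name, cost per unit, return per unit, max units).
products = [("Alpha", 60, 7, 1000.0), ("Beta", 25, 3, 1000.0), ("Gamma", 20, 3, 1500.0)]
budget = 100000.0

# With one budget constraint and simple caps, filling products in decreasing
# order of return per pound spent is optimal (fractional knapsack).
plan, total_return = {}, 0.0
for name, cost, ret, cap in sorted(products, key=lambda p: p[2] / p[1], reverse=True):
    units = min(cap, budget / cost)   # buy as much as the cap and remaining budget allow
    plan[name] = units
    budget -= units * cost
    total_return += units * ret

print(plan)          # {'Gamma': 1500.0, 'Beta': 1000.0, 'Alpha': 750.0}
print(total_return)  # 12750.0
```

This matches the Solver output: Alpha 750, Beta 1000, Gamma 1500, objective 12750.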
Answer Report 2
Microsoft Excel 14.0 Answer Report
Worksheet: [LP problems.xlsx]farms
Report Created: 05/12/2012 15:48:05
Result: Solver found a solution. All Constraints and optimality conditions are satisfied.
Solver Engine
Engine: Simplex LP
Solution Time: 0.016 Seconds.
Iterations: 3 Subproblems: 0
Solver Options
Max Time Unlimited, Iterations Unlimited, Precision 0.000001, Use Automatic Scaling
Max Subproblems Unlimited, Max Integer Sols Unlimited, Integer Tolerance 1%, Assume NonNegative
Objective Cell (Min)
Cell Name Original Value Final Value
$F$5 minimise 0 3861.5384615385
Variable Cells
Cell Name Original Value Final Value Integer
$B$4 corn home 0 0 Contin
$C$4 corn meadow 0 10.7692307692 Contin
$D$4 wheat home 0 2.5641025641 Contin
$E$4 wheat meadow 0 29.2307692308 Contin
Constraints
Cell Name Cell Value Formula Status Slack
$F$6 corn 7000 $F$6>=$H$6 Binding 0
$F$7 wheat 11000 $F$7>=$H$7 Binding 0
$F$8 home farm 2.5641025641 $F$8<=$H$8 Not Binding 37.4358974359
$F$9 meadow farm 40 $F$9<=$H$9 Binding 0
Sensitivity Report 2
Microsoft Excel 14.0 Sensitivity Report
Worksheet: [LP problems.xlsx]farms
Report Created: 05/12/2012 15:48:05
Variable Cells
Final Reduced Objective Allowable Allowable
Cell Name Value Cost Coefficient Increase Decrease
$B$4 corn home 0 10.7692307692 100 1E+30 10.7692307692
$C$4 corn meadow 10.7692307692 0 120 17.5 145
$D$4 wheat home 2.5641025641 0 90 15 21.4285714286
$E$4 wheat meadow 29.2307692308 0 80 25 17.5
Constraints
Final Shadow Constraint Allowable Allowable
Cell Name Value Price R.H. Side Increase Decrease
$F$6 corn 7000 0.2230769231 7000 19000 1428.5714285714
$F$7 wheat 11000 0.3 11000 11230.7692307692 769.2307692308
$F$8 home farm 2.5641025641 0 40 1E+30 37.4358974359
$F$9 meadow farm 40 -25 40 2.1978021978 29.2307692308
farms
corn home corn meadow wheat home wheat meadow
0 10.7692307692 2.5641025641 29.2307692308
minimise 100 120 90 80 3861.5384615385
corn 400 650 7000 >= 7000
wheat 300 350 11000 >= 11000
home farm 1 1 2.5641025641 <= 40
meadow farm 1 1 40 <= 40
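As a quick sanity check, the Solver solution on the 'farms' sheet can be verified outside Excel by substituting the reported areas back into the constraints and the objective. A minimal sketch in Python, with coefficients read directly from the sheet above:

```python
# Reported Solver solution from the 'farms' sheet (area grown of each crop on each farm).
x = {"corn home": 0.0, "corn meadow": 10.7692307692,
     "wheat home": 2.5641025641, "wheat meadow": 29.2307692308}

# Objective: minimise 100*ch + 120*cm + 90*wh + 80*wm.
cost = (100 * x["corn home"] + 120 * x["corn meadow"]
        + 90 * x["wheat home"] + 80 * x["wheat meadow"])

corn = 400 * x["corn home"] + 650 * x["corn meadow"]     # corn produced
wheat = 300 * x["wheat home"] + 350 * x["wheat meadow"]  # wheat produced

assert corn >= 7000 - 1e-6 and wheat >= 11000 - 1e-6               # demands met
assert x["corn home"] + x["wheat home"] <= 40 + 1e-6               # home farm limit
assert x["corn meadow"] + x["wheat meadow"] <= 40 + 1e-6           # meadow farm limit
print(round(cost, 2))  # 3861.54
```

The cost agrees with the Solver objective of 3861.5384615385, and the binding constraints (corn, wheat, meadow farm) hold with zero slack, as in the Answer Report.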
Sheet3
Module Title: Business Decision Making
Module Code: BA 5001
Session: 2012/13
Teaching Period: Year long
Module Booklet Contents
Welcome to Business Decision Making
Details of the Staff teaching team
Name of Module Leader: Janet Geary and Maurice Pratt
Office Location:
Janet Geary: Stapleton House SH317
Maurice Pratt: Calcutta House CM 147
Email:
j.geary@londonmet.ac.uk
m.pratt@londonmet.ac.uk
Telephone:
Janet 0207 133 3839
Maurice 0207 320 3270
Office Hours:
Maurice: Monday 11am to 12, Thursday 12 to 1pm
Janet:
MODULE SPECIFICATION
1. Module title: Business Decision Making
2. Module code: BA 5001
3. Module Level: Level 5
4. Module Leader: Janet Geary & Maurice Pratt
5. Faculty: Business School
6. Teaching site(s) for course: cross-campus
7. Teaching period: year long (30 weeks)
8. Teaching mode: Day
9. Module Type: Year long
10. Credit rating: 30
11. Prerequisites and corequisites: BA 4002 Managing Information and Accounting
12. Module description
The Business Decision Making module is directed at students who are following a number of different business degree programmes and requires that students have previously completed the Level 4 module Managing Information and Accounting or its equivalent. The year-long module is designed to enable students to understand the role of quantitative and statistical techniques in managerial decision making and also to familiarise students with management accounting concepts and techniques with an emphasis on decision making.
Assessment: Data analysis and reporting on findings 30%, In-class tests 30%, and integrated assignment 40%
13. Module aims
The aims of the module are:
•To provide an introduction to business decision making through the application of selected quantitative/statistical techniques and associated specialist software.
•To develop an understanding of how such analyses fit into the wider business and management context.
•To enable students to assess the reliability and usefulness of any information generated by the analysis and hence justify decisions made.
•To develop students’ understanding of the major uses of accounting information by management in problem-solving, decision-making and planning and control.
•To familiarise students with the decision-making framework for internal users of accounting information in short term decision-making.
•To examine alternative techniques in decision-making situations, including capital investment appraisal.
•To enable students to design spreadsheet models and interpret the managerial accounting information outputs of such models.
•To prepare students, where appropriate, for Level 6 project work in this area.
14. Module learning outcomes
On successful completion of this module, students will be able to:
• Understand how uncertainty can be built into business decision making.
• Apply linear programming techniques to optimise constrained decision choices.
• Use multiple regression analysis to model business problems.
• Apply a range of approaches to collect and analyse survey data.
• Apply selected techniques to manage projects.
• Use specialist software (e.g. SPSS, MSProject) to support business decision making.
• Account for short-term decisions, allocating costs utilising traditional methods and activity-based costing systems.
• Calculate and interpret accounting information for relevant costs and benefits in short-term decision-making situations.
• Demonstrate knowledge and understanding of the budgeting process and how to construct budgets, including the importance of behavioural implications.
• Calculate and communicate financial and non-financial measures of attractiveness in investment appraisal decisions and understand the role of cost of capital.
15. Syllabus
Survey methods: sampling, collection and analysis, using SPSS; statistical hypothesis tests.
Managing projects: scheduling for efficient resourcing; using MSProject.
Using SPSS to model relationships between variables – multiple regression, hypothesis tests.
Linear programming models in situations of constrained optimisation; using QSB (or equivalent).
The value of accounting information and theories of decision-making.
Accounting for short-term decisions, the nature and classification of costs, relevant costing.
Application of cost-volume-profit (CVP) analysis in decision-making for organisations and risk and uncertainty in decision-making.
The budgeting process, responsibility accounting and the effects of budgeting on motivation.
Accounting for long-term decisions: objectives of capital budgeting; the notion of time value of money; compounding and discounting.
Determination of a company’s cost of capital and capital rationing, methods of evaluating investment projects: Payback, ARR, NPV and IRR.
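The appraisal methods listed in the syllabus are straightforward to prototype in code as well as in a spreadsheet. A minimal sketch of NPV and IRR in Python (the cashflows are hypothetical, and the bisection assumes a conventional project whose NPV decreases as the rate rises, i.e. one sign change in the cashflows):

```python
def npv(rate, cashflows):
    """Net present value: cashflows[t] is received at the end of year t (t = 0 is now)."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

def irr(cashflows, lo=0.0, hi=1.0, tol=1e-9):
    """Internal rate of return by bisection; assumes NPV crosses zero once in [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if npv(mid, cashflows) > 0 else (lo, mid)
    return (lo + hi) / 2

flows = [-1000, 400, 400, 400]        # hypothetical project: outlay, then three receipts
print(round(npv(0.10, flows), 2))     # -5.26 : reject at a 10% cost of capital
print(round(irr(flows) * 100, 1))     # 9.7  : IRR is just below 10%
```

The decision rule follows directly: accept when NPV at the cost of capital is positive, equivalently when the IRR exceeds the cost of capital.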
16. Assessment Strategy
The assessment will be in three parts. Formal assessment will comprise:
Assessment 1 (30%): Students will be presented with case study material that enables them to model management data in order to facilitate decision making. Students will be required to report on the results of their analysis.
Assessment 2 (30%): In-class tests. Students will be set a number of in-class tests focussing on Management Accounting.
Assessment 3 (40%): Integrated assignment, combining decision-making and project management.
17. Summary description of assessment items
Assessment Type | Description of item | % Weighting | Tariff | Week Due
CWK | Report on the findings of data analysis | 30% | - | 13
- | In-class tests | 30% | - | 19, 23, 27
CWK | Report on a case study involving decision-making and project management | 40% | - | 30
18. Learning and teaching
Learning and Teaching strategy for the module including approach to blended learning, students’ study responsibilities and opportunities for reflective learning/pdp
The teaching will consist of 2½ hour blocks per week, some of which will be in I.T. labs and some in classrooms. This will enable students to develop skills in the software appropriate to the area of the syllabus being covered.
19. Bibliography
Indicative bibliography and key on-line resources
Atrill, P. and McLaney, E. (2010) Management Accounting for Decision Makers, 6th Edition, Financial Times Press.
Bryman, A. and Cramer, D. (2011) Quantitative Data Analysis with IBM SPSS 17, 18 & 19 – A Guide for Social Scientists, Routledge.
Dewhurst, F. (2006) Quantitative Methods for Business and Management, 2nd Edition, McGraw-Hill.
Drury, C. (2009) Management Accounting for Business, 4th Edition, Cengage Learning EMEA.
Horngren, C., Bhimani, A., Datar, S. and Foster, G. (2012) Management and Cost Accounting, 5th Edition, Financial Times Press.
Oakshott, L. (2012) Essential Quantitative Methods, 5th Edition, Palgrave.
Proctor, R. (2009) Managerial Accounting for Business Decisions, 3rd Edition, Financial Times Press.
Render, B., Stair, R. and Hanna, M. (2012) Quantitative Analysis for Management, 11th Edition, Pearson.
Rowntree, D. (2004) Statistics Without Tears: An Introduction for Non-Mathematicians, Penguin Books.
Swift, L. and Piff, S. (2010) Quantitative Methods for Business, Management and Finance, 3rd Edition, Palgrave.
Wisniewski, M. (2009) Quantitative Methods for Decision Makers, 5th Edition, Financial Times/Prentice Hall.
20. Approved to run from: September 2012
21. Module multivalency
22. Subject Standards Board
Weekly Programme Lecture Topics – Seminar/Workshop/Practical details
(Week: lecture topic; seminar/workshop/practical; assessments in brackets)
Week 1: Introduction to the Module; setting up data files in SPSS
Week 2: Normal Distribution; producing graphs and descriptive statistics in SPSS
Week 3: Hypothesis testing: Chi-squared; Chi-squared using SPSS
Week 4: Hypothesis testing: significance tests for means; t-test for population mean, two-sample t-test using SPSS
Week 5: Confidence intervals; using SPSS for confidence intervals
Week 6: Correlation and Regression; using SPSS for bivariate correlation and regression
Week 7: Student Activity week
Week 8: Multiple Regression; using SPSS for multiple regression
Week 9: Interpreting Multiple Regression output
Week 10: Graphical Linear Programming; using Win QSB for LP
Week 11: Linear programming: shadow prices and sensitivity analysis; using Win QSB for LP
Week 12: Linear Programming: extension to more than 2 variables; using Win QSB for LP
Week 13: Project management; using MS Project (Coursework 1, 30%)
Week 14: Project management; using MS Project
Week 15: Project management; using MS Project
Week 16: Intro. to accounting for management decisions
Week 17: Cost behaviour and cost-volume-profit (CVP) analysis
Week 18: Product costing – traditional overhead costing and activity-based costing (ABC) systems
Week 19: Management Accounting Test One (Test 1, 10%)
Week 20: Student Activity week
Week 21: Budgeting I – budgeting process and the preparation of sales and operational budgets
Week 22: Budgeting II – preparation of cash budgets and the master budget
Week 23: Management Accounting Test Two (Test 2, 10%)
Week 24: Relevant costs and benefits for decisions
Week 25: Relevant information for operating decisions: short-term decisions with scarce resources
Week 26: Risk and uncertainty in decision-making
Week 27: Management Accounting Test Three (Test 3, 10%)
Week 28: Long-term decisions II – methods of investment appraisal and capital rationing
Week 29: Revision Week
Week 30: Final coursework (40%)
Essential Books/on line resources including Weblearn/Blackboard
For the first 15 weeks, the module booklet provides most of the required material. Students should supplement their reading of the module booklet with the recommended texts.
Required/Weekly Reading/Practice/on line resources including any Weblearn/Blackboard
Students will need to look at WebLearn on a regular basis (twice a week, say) for:
1. PowerPoint slides
2. Data files
3. Answers to selected exercises
4. Additional readings
Additional/Weekly Reading/Practice/on line resources including any Weblearn/Blackboard
This module is supported by Weblearn – students are advised to access the site on a regular basis, at least once a week
Module Assessment Details, including Assessment Criteria for all elements of the assessment, including any examination
· Well–structured, well-written reports for courseworks
· Appropriate use of software packages
· Accurate interpretations of the output from software packages
· Accurate calculations
Assessment criteria for individual assessments will be provided along with the assessment guidelines.
Assessment completion dates/deadlines
Coursework 1: Friday of week 13
Test 1: In class during week 19
Test 2: In class during week 23
Test 3: in class during week 27
Coursework 2: Friday of week 30
Please ensure that coursework is handed in at the Assessments Unit not later than 5pm on the due date.
Practical session 1
In order to learn how to use SPSS, we will base our exercises around the results of a survey using the following questionnaire. We will not be analysing all the results.
Survey of Seabridge Fitness & Sports Centre
The Centre’s management wants to ensure that its members get full value for money so is undertaking this survey. Included are a number of potential developments to the Centre, and we would appreciate your help in determining the nature and priority of these developments, as well as your opinion on other aspects of the Centre. Please complete this questionnaire (which is both confidential & anonymous) and post it in the box at the reception desk.
1. Which sport/activity have you taken part in during this visit to the Centre?
Tick one box only (main sport/activity)
Swimming 1 Keep Fit/Aerobics 2 Judo 3
Badminton 4 Basketball 5 Gym Training 6
Specialist classes: Pilates 7 Alexander Technique 8
2. What is your main reason for taking part in this Sports/Activity?
Tick one box only
To get fit 1 Social 2
Competition 3 Skill development 4
3. What is your opinion on the range of Sports/Activities offered at the Centre? Tick one box only
Very good 1 Good 2 Average 3 Poor 4 Very poor 5
4. We are thinking of introducing a number of changes at the Centre and need to identify priorities.
Please show your preferences by ticking the two most important developments
Offering hot meals in the café area 1
Building a sauna adjacent to the swimming pool 2
Introducing longer opening hours (7am – 11pm) 3
Introducing family membership scheme 4
Introducing weekday only membership 5
Expanding the range of bookings for team sports 6
Expanding the range of specialist classes 7
Other please state …………………………………………………. 8
5. Are you: Male 1 Female 2
6. How old are you ________________years?
7. When do you normally attend the Centre?
Tick one box only, for each of a) & b) below
a) Time of day: Morning 1 Afternoon 2 Evening 3
b) Day: Weekday 1 Weekend 2
8. How long have you been a member of the Centre?
Please enter the length of your membership to the nearest number of years ______________
9. Please rate your fitness levels when you joined the Centre and now:
Tick one box only, for each of a) & b) below
Scale: Very Good 1, Good 2, Fairly Good 3, Average/Moderate 4, Fairly Poor 5, Poor 6, Very Poor 7
a) Fitness when I joined the Centre 1 2 3 4 5 6 7
b) Fitness now 1 2 3 4 5 6 7
10. What was the main reason for joining the Centre?
Tick one box only
Location of the Centre 1
Range of facilities available 2
Recommended by a friend or relative 3
Membership rates 4
Other please state …………………………………………………. 5
11. Please rate each aspect of service quality:
Tick one box only, for each of a) & b) below
Scale: Very Good 1, Good 2, Average 3, Poor 4, Very Poor 5
a) Helpfulness of the reception staff 1 2 3 4 5
b) Cleanliness of the changing rooms 1 2 3 4 5
12. Please use the space below to add any comments about the Centre.
Thank you for completing this questionnaire. Please post it in the box at the reception desk.
[Thanks to Richard Charlesworth for letting us use this questionnaire and associated data file ]
Janet Geary 2012 Page 2
Questionnaire responses:
Columns: Resp, Sport (Q1), Reason (Q2), Variety (Q3), Changes1 (Q4), Changes2 (Q4), Gender (Q5), Age (Q6), Time (Q7a), Day (Q7b), Howlong (Q8), Fitness1 (Q9a), Fitness2 (Q9b), Join (Q10), Helpfulness (Q11a), Cleanliness (Q11b). Missing-value codes: 999 = no response, 888 = multiple boxes ticked.
1 6 3 2 3 4 2 29 2 1 2 2 2 3 2 3
2 1 1 2 1 6 1 34 1 1 1 5 5 5 3 2
3 7 1 4 2 7 2 44 1 1 2 7 4 1 1 1
4 3 4 2 3 4 2 42 1 1 2 3 3 4 5 4
5 5 2 4 3 5 1 28 1 2 4 2 3 5 4 4
6 1 1 2 5 6 2 36 1 1 2 4 3 5 4 5
7 5 3 1 1 5 1 47 3 1 3 3 2 5 5 5
8 2 1 3 2 8 2 19 1 1 1 6 5 1 4 3
9 5 4 2 1 4 2 27 1 2 2 5 4 4 3 3
10 4 1 3 2 8 2 999 2 1 4 4 2 1 2 3
11 6 2 1 1 3 1 37 2 1 3 4 2 5 4 4
12 3 4 1 6 7 1 24 1 1 1 4 2 5 3 5
13 4 4 4 4 7 1 52 2 2 3 7 3 3 4 3
14 2 2 2 3 4 2 35 3 1 2 3 4 888 1 2
15 1 2 4 5 7 1 31 3 1 2 6 5 2 3 4
16 4 3 2 3 5 1 23 3 2 1 1 1 3 1 2
17 5 1 4 3 8 2 32 2 1 2 4 2 3 4 4
18 3 3 3 2 3 2 21 3 1 1 1 2 1 2 2
19 3 2 2 1 6 2 33 2 1 2 3 2 3 3 4
20 6 3 3 2 8 1 23 1 2 2 4 3 5 2 2
21 4 4 3 5 7 2 38 1 1 3 6 4 4 4 4
22 7 1 4 4 6 2 51 3 2 3 5 5 1 3 4
23 4 3 2 5 6 1 30 1 1 3 6 4 2 1 1
24 1 1 3 6 8 2 46 1 2 3 3 2 2 4 3
25 5 3 3 4 5 2 31 1 1 2 3 3 2 1 2
26 1 3 2 1 3 2 42 2 2 2 2 4 4 1 2
27 1 3 3 4 6 2 46 1 2 4 4 2 2 2 2
28 5 4 3 4 7 2 25 3 1 2 2 2 1 3 2
29 7 2 1 6 8 2 61 2 1 1 2 2 3 5 5
30 5 888 5 5 8 1 27 2 2 2 4 3 4 2 3
31 2 1 2 2 2 1 38 3 2 4 5 4 3 1 1
32 5 2 4 3 5 1 41 1 1 4 4 4 2 2 1
33 2 4 3 6 8 2 37 1 2 2 4 2 5 2 2
34 6 4 2 1 7 1 37 1 2 1 2 2 2 3 1
35 1 2 3 1 5 1 57 1 2 3 5 3 3 2 3
36 7 1 3 2 3 2 39 1 1 1 7 5 2 1 2
37 3 4 3 3 6 1 22 3 2 2 1 1 2 3 4
38 6 4 3 3 5 1 19 2 1 3 5 3 4 1 1
39 2 1 5 2 7 1 43 2 1 1 3 3 5 2 3
40 2 2 3 1 6 2 68 2 1 2 7 4 4 2 3
41 2 1 4 6 7 2 43 2 2 3 5 2 4 3 2
42 4 1 4 2 7 1 45 3 2 2 4 2 2 2 2
43 6 3 3 1 2 1 38 1 2 1 3 3 2 2 1
44 5 2 4 1 6 1 40 1 2 4 3 2 5 3 3
45 5 2 3 4 4 1 17 1 2 2 7 5 5 2 3
46 4 2 1 4 5 2 34 2 1 1 2 1 1 3 3
47 6 3 3 1 8 1 30 2 1 3 1 3 1 3 4
48 7 3 2 2 8 1 65 3 2 4 1 1 4 1 1
49 6 3 2 4 8 2 33 3 1 2 3 3 4 2 2
50 1 4 3 1 2 2 24 3 2 1 3 2 3 5 4
51 2 3 2 5 6 2 48 2 1 2 2 1 5 2 3
52 8 3 3 1 3 1 999 1 2 3 5 4 1 2 2
53 7 2 2 6 8 1 27 3 2 1 5 3 999 2 3
54 6 2 4 2 5 1 29 1 1 2 5 4 2 2 1
55 8 2 2 4 8 2 37 3 1 4 3 1 3 4 3
56 7 1 1 1 7 1 32 2 1 4 6 3 2 5 4
57 1 4 2 4 8 2 55 2 2 3 6 5 4 2 1
58 6 1 5 2 3 2 30 2 2 1 3 3 3 3 2
59 7 4 1 1 5 2 39 3 1 2 5 3 2 4 3
60 6 3 3 2 8 2 18 3 1 4 4 3 3 1 2
Tasks required in the first practical session.
1. Load the Excel data file seabridge.xls into SPSS
2. Set up the Variable View page
3. Save the file as Seabridge.sav
[Thanks to Richard Charlesworth for the notes for this session]
Using SPSS v18/19 to analyse survey data
These notes give an introduction to using SPSS (Statistical Package for the Social Sciences) for survey analysis. They are based around the questionnaire ‘Survey of Seabridge Fitness & Sports Centre’ and the survey data.
1. Entering data into SPSS
Data entry
Data can be entered in three ways:
(i) directly within SPSS (via the data editor);
(ii) copied from a spreadsheet (e.g. Excel) or Word table, and pasted into the SPSS data editor.
(iii) imported to SPSS from a previously created spreadsheet file (e.g. Excel).
Methods (i) and (ii) are the easiest and are described below.
Data entry using the SPSS data editor
If data is to be entered directly within SPSS, click on ‘Type in data’ (see Figure 1), then the ‘Data View’ tab at the bottom of the screen. This takes you into the SPSS data editor, which has the same format as a spreadsheet, with each column representing a different variable (a question) and each row the record of a different respondent (or ‘case’). Note, however, that from time to time it is necessary to use more than one column for a single question. For example, if multiple responses are permitted/required, a separate column is needed to record each response (e.g. see Q4 in the ‘Seabridge Fitness & Sports Centre’ questionnaire, which requires two answers).
Figure 1 – entering data directly into SPSS
Data entry by copying data from a spreadsheet or Word table
If we choose to simply copy data from a previously prepared spreadsheet such as Excel or a Word table to the SPSS data editor, remember to paste the data into the ‘Data View’ page, and not the ‘Variable View’ page. It is important to paste only the numerical data and not any variable names, which might also have been entered on the spreadsheet or table. SPSS will then automatically assign default variable names VAR00001, VAR00002 etc.
2. Defining the variables
Before undertaking any analysis it is important to fully define each variable. This is carried out on the ‘Variable View’ page. In Figure 2 the variable names have already been added – respondent, sport, reason etc, (note that ‘respondent’ is simply the questionnaire (or respondent/user number) whereas the remaining variables relate to specific questions). We are in the process of defining the variable ‘sport’: the ‘(Variable) Label’ enables the user to give a brief description so the variable ‘sport’ represents ‘Sport or activity undertaken’; similarly we can define the meaning of each of the possible numerical responses under ‘Value Labels’, so a response ‘1’ represents ‘Swimming’, ‘2’ represents ‘Keep Fit/Aerobics’ etc; under ‘Measure’ the user has already confirmed that ‘Sport’ is a ‘nominal’ (or categorical) variable.
Figure 2 – defining the variable ‘Sport’
Note with large questionnaires it is often helpful to insert the question number at the beginning of the ‘(Variable) Label’. So here we have entered ‘Q1 Sport or activity undertaken’ etc. This makes it easier to locate specific questions when conducting analyses.
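Outside SPSS the same idea, attaching value labels to numeric codes, is just a lookup table. A sketch in Python for Q1, with the labels taken from the questionnaire above:

```python
# Q1 'Sport or activity undertaken': value labels as defined on the Variable View page.
SPORT_LABELS = {1: "Swimming", 2: "Keep Fit/Aerobics", 3: "Judo", 4: "Badminton",
                5: "Basketball", 6: "Gym Training", 7: "Pilates", 8: "Alexander Technique"}

responses = [6, 1, 7, 3]  # coded answers from four respondents
print([SPORT_LABELS[r] for r in responses])  # ['Gym Training', 'Swimming', 'Pilates', 'Judo']
```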
The ‘Missing’ category enables us to record the code used for any missing values. For example, respondent No. 10 has failed to divulge his/her age, recorded here with a code of 999. Figure 3 shows ‘999’ being defined as the code for a missing response to Question 6 (‘Age’).
Figure 3 – defining 999 as the code for a missing value for Question 6 (‘age’)
Note that SPSS enables us to define up to three distinct missing value codes. The reason for this is that in addition to genuinely missing responses, we may choose to treat some ‘legitimate’ responses as though they were missing.
For instance, respondent number 14 has ticked more than one box for Question 10 (main reason for joining the Centre). Two or more reasons may have persuaded this respondent to join the Centre, so although this response may technically be a correct and honest reply, the ticking of only one box was requested (‘the main reason…’), and it would be inappropriate for us to choose one of these replies to code at the expense of the other, or to count them both. Instead we can introduce a second missing value code (here 888) to record multiple responses.
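The same convention can be applied when cleaning the raw data outside SPSS, e.g. in Python (a sketch; 999 and 888 are the codes used in this data set):

```python
MISSING_CODES = {999, 888}  # 999 = no response, 888 = multiple boxes ticked

def clean(value):
    """Map missing-value codes to None, mirroring the SPSS 'Missing' definitions."""
    return None if value in MISSING_CODES else value

ages = [29, 34, 999, 42]         # respondent 10's age is coded 999 (not divulged)
print([clean(a) for a in ages])  # [29, 34, None, 42]
```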
When you have finished defining all the variables, save the file. We will be using this file next week.
Topic 1 Normal Distribution
Probability Distributions
Consider the case where we have a number of customers and we have recorded their ages.
If we take age groups of 10 years and use probabilities rather than frequencies we get the histogram:
[the probability is the frequency of each group divided by the total frequency; the probabilities add to 1]
If we take age groups of 5 years we get:
With age groups of 2 years we get:
If we continue this process, we can draw a curve rather than a histogram. The curve will join the midpoints of the tops of the columns.
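The step from frequencies to probabilities can be sketched in a few lines of Python (the ages here are hypothetical; each class probability is the class frequency divided by the total frequency):

```python
from collections import Counter

ages = [23, 27, 28, 31, 34, 36, 42, 44, 47, 52]  # hypothetical customer ages
groups = Counter((a // 10) * 10 for a in ages)   # 10-year classes: 20-29, 30-39, ...
total = sum(groups.values())
probs = {g: n / total for g, n in sorted(groups.items())}

print(probs)  # {20: 0.3, 30: 0.3, 40: 0.3, 50: 0.1}
```

Summing the probabilities over all classes gives 1, exactly as the histogram argument requires.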
A continuous random variable, such as age measured to the nearest day, hour, or minute, has an infinite number of possible values, and it is not possible to list every one with its associated probability. Furthermore, the probability of each individual value will be so small that it must be considered approximately equal to zero. As a result we can only consider the probability that a value lies within a particular interval. In the graph above we have used classes such as 22 up to 24 to represent an interval.
For example, when measuring heights, the probability that an individual is exactly 1.63 metres tall is very small, but the probability that someone has a height between 1.625 and 1.635 metres is much easier, and more sensible, to determine.
The Normal Distribution is one of the most important in Statistics, not only because many data distributions conform to this pattern, but also because the Normal Distribution underpins the subject of statistical inference.
The Normal Distribution curve is shown below.
The total area under the curve represents the total probabilities, thus the total area equals 1.
Properties of the Normal Distribution
The Normal Curve has the following properties:
· It is symmetrical about the mean.
· The curve is bell-shaped.
· The total area under the curve equals 1
· 68.26% of the area under the curve lies within 1 standard deviation from the mean.
· 95.44% of the area under the curve lies within 2 standard deviations from the mean.
· 99.73% of the area under the curve lies within 3 standard deviations from the mean.
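These coverage figures can be checked numerically: for any Normal distribution, P(μ − kσ < X < μ + kσ) = erf(k/√2). A sketch using Python's math.erf, which agrees with the percentages above to within rounding:

```python
import math

def within(k):
    """Probability that a Normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(within(k) * 100, 2))  # approx. 68.27, 95.45, 99.73
```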
The exact shape and position of the Normal curve depends upon the values of the mean and standard deviation. As Normal Distributions have an infinite combination of means and standard deviations, there is a problem in compiling tables for the probabilities.
This problem is solved by standardising the variables.
The formula below takes x values (actual measurements) and converts them into z values (for use in tables).
z = (x – μ) / σ, where μ is the mean and σ is the standard deviation.
z represents the number of standard deviations between x and the mean.
If actual measurements, x, are converted into z scores by use of the formula, then a Normal Distribution with mean μ and standard deviation σ is transformed into a Normal Distribution with mean 0 and standard deviation 1.
Example:
The examination marks for a large group of students followed a Normal Distribution. The mean mark was 50 and the standard deviation was 10 marks.
What percentage of students gained marks in the range:
a) Between 40 and 60
b) Between 30 and 70
c) above 70
a) mean = 50, s.d. = 10
x = 40: z = (40 – 50)/10 = –1, i.e. 1 standard deviation below the mean.
x = 60: z = (60 – 50)/10 = +1, i.e. 1 standard deviation above the mean.
Thus a value between 40 and 60 is equivalent to a value between 1 standard deviation above and 1 standard deviation below the mean.
Using the ‘Properties of the Normal Distribution’ above, the area is 68.26%
Thus we would expect 68.26% of students to gain marks in the range 40 to 60.
b) Between 30 and 70
mean = 50, s.d. = 10
x = 30: z = (30 – 50)/10 = –2, i.e. 2 standard deviations below the mean.
x = 70: z = (70 – 50)/10 = +2, i.e. 2 standard deviations above the mean.
Using the ‘Properties of the Normal Distribution’, the area between 2 standard deviations above and below the mean is 95.44%
Thus we would expect 95.44% of students to gain marks in the 30 to 70 range.
c) Above 70
From part b) of this question we found that 95.44% of students will gain marks in the range 30 to 70. This leaves 4.56% either above 70 or below 30. As the Normal curve is symmetrical, 2.28% will be above 70 and 2.28% will be below 30.
Thus we would expect 2.28% of students to gain marks above 70.
In this example, we could use the information from the ‘Properties of the Normal Distribution’ as the x values gave z values of 1 and 2. In most examples the z values will not be integers and we must use tables to find the areas and thus determine the probabilities.
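The worked example can also be reproduced without tables using the error function: for X ~ Normal(μ, σ), the upper-tail probability is P(X > x) = ½·erfc((x − μ)/(σ√2)). A sketch checking parts a) to c):

```python
import math

def p_above(x, mu, sigma):
    """Upper-tail probability P(X > x) for X ~ Normal(mu, sigma)."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2)))

mu, sigma = 50, 10
a = p_above(40, mu, sigma) - p_above(60, mu, sigma)  # P(40 < X < 60)
b = p_above(30, mu, sigma) - p_above(70, mu, sigma)  # P(30 < X < 70)
c = p_above(70, mu, sigma)                           # P(X > 70)
print(round(a * 100, 2), round(b * 100, 2), round(c * 100, 2))  # approx. 68.27 95.45 2.28
```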
Use of tables for the Normal Distribution
There are different ways of expressing tables for the Normal Distribution. These notes refer to the tables that follow on the next page.
The tables give the area in the right-hand tail. This area represents the probability that a random variable is greater than z.
For example, the probability that z > 1.35 is 0.0885.
Tables for the Normal Distribution
(right-hand-tail areas, P(Z > z); rows give z to one decimal place, columns the second decimal place)
z    0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
0.0  0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.4641
0.1  0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247
0.2  0.4207 0.4168 0.4129 0.4090 0.4052 0.4013 0.3974 0.3936 0.3897 0.3859
0.3  0.3821 0.3783 0.3745 0.3707 0.3669 0.3632 0.3594 0.3557 0.3520 0.3483
0.4  0.3446 0.3409 0.3372 0.3336 0.3300 0.3264 0.3228 0.3192 0.3156 0.3121
0.5  0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.2776
0.6  0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.2451
0.7  0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.2148
0.8  0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867
0.9  0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611
1.0  0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
1.1  0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
1.2  0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
1.3  0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
1.4  0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
1.5  0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
1.6  0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455
1.7  0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
1.8  0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
1.9  0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
2.0  0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
2.1  0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
2.2  0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
2.3  0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
2.4  0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
2.5  0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
2.6  0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
2.7  0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
2.8  0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
2.9  0.0019 0.0018 0.0018 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
3.0  0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
3.1  0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
3.2  0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
3.3  0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
3.4  0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
3.5  0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002
3.6  0.0002 0.0002 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
3.7  0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
When dealing with problems involving the use of Normal tables, it is advisable to draw sketches to identify the required areas.
1: Find the probability that z is greater than 1.
From the tables, looking up z = 1.00 gives 0.1587. Thus P( z > 1 ) = 0.1587
2: Find the probability that z is less than 2.01
Looking up z = 2.01 gives 0.0222 as the area above z = 2.01.
The required area is 1 – 0.0222 = 0.9778. Thus P( z < 2.01 ) = 0.9778
3: Find the probability that z is between 0.6 and 2.01
The required area is shown.
From tables, z = 0.6 gives 0.2743 and z = 2.01 gives 0.0222.
The required area is 0.2743 – 0.0222 = 0.2521, the difference between the area above 0.6 and the area above 2.01. Thus P( 0.6 < z < 2.01 ) = 0.2521
4: Find the probability that z is less than -1.5
From tables we cannot look up z = -1.5 directly as only positive values are listed.
However we can look up z = + 1.5 to find the area.
As the Normal curve is symmetrical, the area below z = -1.5 is the same as the area above z = + 1.5.
From tables z = 1.5 gives 0.0668
Thus the area below z = -1.5 is 0.0668. Thus P( z < -1.5 ) = 0.0668.
5. Find the probability that z lies between -1.5 and 0.6
From tables z = 0.6 gives 0.2743 From tables z = 1.5 gives 0.0668
The required area lies between z = -1.5 and z = 0.6
The required area is 1 – 0.2743 – 0.0668 = 0.6589. Thus P(-1.5 < z < 0.6) = 0.6589
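These table look-ups can also be reproduced by direct computation. Below is a minimal Python sketch (illustrative only, not part of the course software) using the standard library's complementary error function for the upper-tail area:

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard Normal variable: 0.5 * erfc(z / sqrt(2))."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Reproducing the worked look-ups above:
print(round(upper_tail(1.35), 4))                        # 0.0885, the table value for z = 1.35
print(round(1 - upper_tail(2.01), 4))                    # 0.9778 = P(z < 2.01)
print(round(upper_tail(1.5), 4))                         # 0.0668 = P(z < -1.5), by symmetry
print(round(1 - upper_tail(1.5) - upper_tail(0.6), 4))   # 0.6589 = P(-1.5 < z < 0.6)
```

Rounded to four decimal places these agree with the tabulated values used in examples 1 to 5.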
Worked Example:
The mean weight of 500 schoolboys is 55 kg and the standard deviation is 4 kg.
Assuming that the weights are Normally distributed, find the percentages of boys that are expected to weigh:
a) more than 58 kg
b) between 58 and 60 kg
c) between 50 and 65 kg
μ = 55 σ = 4
a) more than 58 kg
x = 58, so z = (58 – 55)/4 = 0.75
From tables z = 0.75 gives 0.2266. Thus the required area is 0.2266
Thus we would expect 22.66% of the boys to weigh more than 58 kg
b) between 58 and 60 kg
x = 58, so z = 0.75 and tables give 0.2266. Thus Area 1 (above 0.75) = 0.2266
x = 60, so z = (60 – 55)/4 = 1.25 and tables give 0.1056. Thus Area 2 (above 1.25) = 0.1056
The area between z = 0.75 and z = 1.25 is Area 1 – Area 2
0.2266 – 0.1056 = 0.1210
Thus we would expect 12.1 % of the boys to weigh between 58 and 60 kg.
c) between 50 and 65 kg
x = 50, so z = (50 – 55)/4 = –1.25
From tables z = 1.25 gives 0.1056. Thus Area 1 (below –1.25) = 0.1056
x = 65, so z = (65 – 55)/4 = 2.5
From tables z = 2.5 gives 0.00621. Thus A2 (above 2.5) = 0.00621
The required area is Total Area – A1 – A2 = 1 – 0.1056 – 0.00621 = 0.88819
Thus we would expect 88.82% of the boys to weigh between 50 and 65 kg
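The three parts of this worked example can be checked by direct computation; here is a hedged Python sketch (illustrative only — tiny differences from the 4-decimal table values are rounding, e.g. part c comes out at 88.81% rather than 88.82%):

```python
import math

def upper_tail(z):
    # P(Z > z) for the standard Normal
    return 0.5 * math.erfc(z / math.sqrt(2))

mu, sigma = 55, 4
z = lambda x: (x - mu) / sigma   # standardise a weight x

p_a = upper_tail(z(58))                           # a) more than 58 kg
p_b = upper_tail(z(58)) - upper_tail(z(60))       # b) between 58 and 60 kg
p_c = 1 - upper_tail(-z(50)) - upper_tail(z(65))  # c) between 50 and 65 kg

print(f"a) {p_a:.2%}  b) {p_b:.2%}  c) {p_c:.2%}")
```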
Example: The weekly output of a production line varies according to the Normal distribution with a mean of 1,500 units and a standard deviation of 120 units.
The manager of the factory wants to know what the production output is for 95% of the time.
To answer this question:
We wish to find the limits of production output that encloses 95% of the area.
If we take this 95% symmetrically, we are left with two tails of 2.5% each.
Thus the area in each tail = 2.5% or 0.025
Scanning the tables for an area of 0.025, we find the z value to be 1.96.
Thus 95% of the area lies between z = -1.96 and z = + 1.96.
We can now use this statement to solve equations. Mean = 1500 sd = 120
Upper limit:
z = (x – mean)/s.d. = (x – 1500)/120 = 1.96
x – 1500 = 1.96 × 120 = 235.2
x = 1500 + 235.2 = 1735.2

Lower limit:
z = (x – mean)/s.d. = (x – 1500)/120 = –1.96
x – 1500 = –1.96 × 120 = –235.2
x = 1500 – 235.2 = 1264.8
Thus 95% of the time, the production output will be between 1265 and 1735 units a week.
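The two symmetric equations above reduce to mean ± 1.96 × s.d.; a quick Python check (illustrative only):

```python
mu, sigma = 1500, 120
z = 1.96   # cuts off 2.5% in each tail of the standard Normal

lower, upper = mu - z * sigma, mu + z * sigma
print(f"95% of weekly output lies between {lower:.1f} and {upper:.1f} units")
```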
Seminar Questions:
1. A thousand candidates sat an examination, the results of which were Normally distributed with a mean mark of 50 and a standard deviation of 10 marks.
How many candidates would be expected to score:
a) less than 75 marks
b) less than 25 marks
c) more than 60 marks
d) a grade C ( i.e. 50 to 59 inclusive)
2. A machine in a factory produces components whose lengths are Normally distributed with
mean = 102 mm and standard deviation = 1.5 mm.
a) Find the probability that, if a component is selected at random and measured, its length will be
i) Less than 100 mm
ii) Greater than 104 mm
b) If a component is only acceptable when its length lies in the range 100 mm to 104 mm, find the percentage of acceptable components.
3. As a result of tests on light bulbs, it was found that the life-time of a particular make of bulb was distributed Normally with an average life of 2040 hours and a standard deviation of 60 hours.
What percentage of bulbs are expected to last :
a) for more than 2150 hours
b) for more than 1960 hours
4. As a result of the introduction of Income Self-Assessment, at the end of the financial year, taxpayers may either have overpaid or underpaid their tax. Those taxpayers who have overpaid are entitled to a refund, whilst those who have underpaid owe the Tax Office money.
Past experience leads the Tax Office to believe that the amounts to be refunded or owed are Normally distributed.
This year, for one category of taxpayer, the mean of these amounts was a refund of 607, with a standard deviation of 320.
What proportion of taxpayers is entitled to a tax refund greater than 1,100?
What proportion of taxpayers owe money to the Tax Office?
What proportion of taxpayers is entitled to a refund of between 50 and 250?
For a separate class of taxpayers, the standard deviation is unknown. The mean refund for this class is 550, and 71.9% of these taxpayers have a refund greater than 300.
What is the standard deviation for this class of taxpayer?
5. A box of breakfast cereal states that it weighs 500 grams. This is the nominal weight.
When the person responsible for quality control in the company that produces the cereal measured a sample of 100 boxes of cereal, the mean weight was 504 gms and the standard deviation was 2 gms. These values were exactly correct according to the company's policy.
Why would the company plan to have a mean weight greater than the nominal weight?
What is the probability that a customer, buying one packet of cereal, would buy a packet:
a) weighing more than 503 gms
b) weighing below the nominal weight
Answers
1 a) 993.79 b) 6.21 c) 158.7 d) 315.9
2 a) i) 0.0918 ii) 0.0918 b) 81.64%
3 a) 3.36% b) 90.82%
4 a) 6.18% b) 2.87% c) 8.68% d) 431
5 a) 0.6915 b) 0.02275
Practical session 2 [Using the file seabridge.sav with all variables defined]
1. Producing descriptive statistics in SPSS
2. Drawing graphs in SPSS
Data summary & analysis [Thanks to Richard Charlesworth for some of these notes on SPSS]
Figure 4 shows the basic command sequence for an elementary statistical analysis. Click first on ‘Analyze’, then ‘Descriptive Statistics’ then ‘Frequencies’
Figure 4 (above) – using ‘Analyze’ to provide some elementary statistical analysis
Figure 5 (below) – requesting an analysis of the variable ‘sport’
In Figure 5 the researcher has requested an analysis of the variable ‘Sport’ and samples of output are in Figures 6 & 7. Had we wished to do so, we could have requested the same analysis of several variables, rather than just ‘Sport’. Note the left-hand window records the output requested so far – useful for quickly locating earlier analyses.
Figure 6 (above) – a frequency table of ‘Sport’
Figure 7 (below) – a bar chart of ‘Sport’
To produce descriptive statistics for age
The output produced is
Descriptive Statistics
                        N    Minimum   Maximum   Mean    Std. Deviation
Q6 age of respondent    58     17        68      36.19      11.607
Valid N (listwise)      58
4 Incorporating output in a document
The focus of these notes is on interacting with SPSS, and hence the figures illustrate the screen display. However, SPSS often produces more output than is needed (so you need to be selective); moreover this is not always in an appropriate style for inclusion in a document such as a dissertation or report.
Tables and graphs can easily be copied into a document using copy and paste commands. SPSS tables are copied to a Word document as a Word table. You may wish to reformat aspects of the table (e.g. readjusting column widths, changing lines/borders etc.) to get closer to the preferred style, and there is the added benefit that you can edit out any redundant or irrelevant information. SPSS graphs are copied as a picture, which can be adjusted for size and position to fit in the document.
5 Saving SPSS files
Remember to save both the data and output files before exiting SPSS (you will be reminded of this by SPSS if you fail to do so). In Figure 8 the user has given the name ‘Seabridge Fitness & Sports Centre survey’ to the output file; the full filename is ‘Seabridge Fitness & Sports Centre survey.spv’. Similarly if the data file is also called ‘Seabridge Fitness & Sports Centre survey’, the full filename is ‘Seabridge Fitness & Sports Centre survey.sav’.
Figure 8 – the output file is saved as ‘Seabridge Fitness & Sports Centre .spv’
Topic 2: Hypothesis Testing: Chi-Squared test
Chi-square test of association
This test is one of the most important tests used in survey analysis. It is used to test whether two types of classification (i.e. answers to two questions) are statistically independent.
The test is extremely useful in management research and can be used to explore whether, for instance, there is any relationship between gender and management grade, gender and motivation, stress and salary level etc.
In a survey, airline passengers were asked to answer the question "How did you book your flight?" and were also asked for their gender. The chi-squared test allows us to test whether there is any significant difference in the responses of the male and female passengers.
The resulting contingency table was:
                              gender
method of booking      male (1)   female (2)   Total
internet (1)              14          15         29
telephone (2)             16          24         40
travel agent (3)          22           9         31
Total                     52          48        100
Looking at this table, it would seem that women were more likely than men to use the telephone and that men were more likely to use travel agents. However, we need to perform the chi-squared test to see if these differences are statistically significant.
The chi-square test compares the 'observed' results with those that would be 'expected' if the two classifications were independent. The expected results are calculated by the formula:
Expected = (row total × column total) / grand total
As an example, the number of males who use the internet was observed to be 14.
The expected value for this cell:
Expected = (row total × column total) / grand total = (29 × 52) / 100 = 15.08
The table of observed and expected values is shown below:
                              gender
method of booking      male          female        Total
internet               14 [15.08]    15 [13.92]      29
telephone              16 [20.80]    24 [19.20]      40
travel agent           22 [16.12]     9 [14.88]      31
Total                  52            48             100
Formally we test whether our null hypothesis of independence is supported by the evidence of our survey. If not, we will reject the null hypothesis in favour of the alternative hypothesis which claims that method of booking and gender are not independent.
Formally:
Null hypothesis H0: method of booking is independent of gender
Alternative hypothesis H1: method of booking is not independent of gender
The overall comparative measure is provided by the test statistic:
χ² = Σ (o – e)² / e
This is calculated by:
observed, o   expected, e    o – e     (o – e)²   (o – e)²/e
    14           15.08       –1.08      1.1664      0.0773
    15           13.92        1.08      1.1664      0.0838
    16           20.80       –4.80     23.0400      1.1077
    24           19.20        4.80     23.0400      1.2000
    22           16.12        5.88     34.5744      2.1448
     9           14.88       –5.88     34.5744      2.3235
                                        total       6.9371
In this case χ2 = 6.9371
The degrees of freedom, df, can be found from the formula:
df = (number of rows-1) × (number of columns-1)
Note: Do not count the “totals” in the number of rows and columns.
In this case df = (3 – 1) × (2 – 1) = 2
Testing at a level of significance of 5%, the critical value of the chi-square distribution (with 2 df) is 5.991 (see tables at the end of this section). We compare our calculated test statistic with the tabulated value. The critical value is exceeded by the test statistic as can be clearly seen in the graph below.
The test statistic falls into the rejection (or critical) region (5.991 and above), and consequently we reject the null hypothesis that the method of booking and gender are independent. We therefore take this test result as evidence that there is an association between the method of booking and gender.
As the largest relative difference is for the use of travel agents, we can say that men are more likely to book through travel agents and women less likely to do so.
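The whole calculation, from observed counts to the test statistic, can be sketched in a few lines of Python (illustrative only; the data are those of the booking-method table above). As it happens, for 2 degrees of freedom the tail probability has a closed form, exp(–χ²/2), which reproduces the significance value SPSS reports below:

```python
import math

observed = [[14, 15],   # internet:     male, female
            [16, 24],   # telephone
            [22, 9]]    # travel agent

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        # expected = row total x column total / grand total
        e = row_totals[i] * col_totals[j] / grand
        chi2 += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)
p = math.exp(-chi2 / 2)   # exact tail area, valid for df = 2 only

print(round(chi2, 3), df, round(p, 3))   # 6.937, df = 2, p = 0.031
```

As 6.937 exceeds the critical value 5.991 (equivalently, p = 0.031 is below 0.05), the sketch reaches the same rejection of H0 as the hand calculation.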
The relevant SPSS printout is:
Case Processing Summary
                         Cases
           Valid          Missing        Total
           N    Percent   N    Percent   N    Percent
B6 * C3    100  100.0%    0    .0%       100  100.0%
B6 * C3 Crosstabulation
                                C3
                           1       2       Total
B6    1   Count            14      15       29
          Expected Count   15.1    13.9     29.0
      2   Count            16      24       40
          Expected Count   20.8    19.2     40.0
      3   Count            22      9        31
          Expected Count   16.1    14.9     31.0
Total     Count            52      48       100
          Expected Count   52.0    48.0     100.0
Chi-Square Tests
                               Value    df   Asymp. Sig. (2-sided)
Pearson Chi-Square             6.937    2    .031
Likelihood Ratio               7.109    2    .029
Linear-by-Linear Association   3.204    1    .073
N of Valid Cases               100
a 0 cells (.0%) have expected count less than 5. The minimum expected count is 13.92.
The value for ‘Pearson Chi-Square’ is given as 6.937.
Interpretation of the output:
This gives a calculated value of chi-squared of 6.937, which can be compared with the critical value of 5.9915 (from statistics tables using a 5% level of significance). As the calculated value is greater than the critical value, it lies in the 'reject' region of the curve (i.e. the area to the right of our calculated value is less than 5%). SPSS gives us the area to the right of the calculated value as 0.031. We could have used this directly to see that the calculated value of chi-squared lies in the tail, and thus we do not have to use statistics tables.
If the value in the last column for Pearson Chi-Square was greater than 5% (0.05), we would have concluded that the method of booking and gender were independent at the 5% level of significance.
In this example, we would conclude that there is evidence of an association between gender and method used to book the flight.
Some limitations of the chi-square test
There are some limitations of the chi-square test.
· the test identifies only the presence of relationships, not the focus or direction;
· the statistical properties of the test require that:
· the expected values must be ≥ 5. If this is not the case, appropriate rows or columns must be pooled together; however in some cases we simply do not have enough data (or density of data throughout the table), and combining rows and columns results in collating data into meaningless overarching classes rendering the test of little or no value.
Note: The often stated requirement above, that all expected values must be ≥ 5, is a safety first measure. In practice, we can accommodate up to 20% of the cells having expected values of between 1 and 5 (notice that the SPSS output provides a reminder of this). In the event of too many cells containing small expected frequencies we would need to combine rows or columns as detailed above.
· the χ2 formula needs to be modified slightly for 2 by 2 tables. We use the formula:
χ² = Σ ( |o – e| – 0.5 )² / e
This involves subtracting 0.5 from the absolute difference between o and e, before squaring the result and then proceeding as before. The problem arises because we are using the continuous chi-square distribution to approximate to the discrete cell frequencies; with a small number of cells (that is 2 by 2 tables) this approximation needs a ‘continuity correction’. In fact, if the o and e values are quite large, the original uncorrected result will be close to the adjusted value; however the problem becomes more marked when the frequencies are small, so it is wise to always use the modified formula with 2 by 2 tables. Notice that SPSS provides the continuity correction when a 2 by 2 table is analysed.
The test assumes a null hypothesis that ‘gender’ and ‘method of booking’ are statistically independent, and expected frequencies for each of the categories are calculated. These are compared with what actually was recorded in the survey, and a test statistic is computed. The value of the relevant test statistic above is 6.937, which would occur by chance with a probability of 0.031 (or 3.1%) when ‘gender’ and ‘method of booking’ are statistically independent.
In such testing we typically assume a probability value of less than 0.05 (or 5%) is evidence that the null hypothesis is incorrect. In this case we therefore have sufficient evidence to reject the null hypothesis and conclude that 'gender' and 'method of booking' are not independent. So males and females have used different methods of booking.
The chi-square test is used here because both variables are nominal. If both were ordinal the chi-square test could still be used.
Yates Correction for 2×2 Contingency Tables:
This is used for 2 by 2 contingency tables when the total number of items in the table is relatively small (say, < 50). With a 2×2 table the degrees of freedom = (r–1)×(c–1) = 1.
Example:
42 people were asked about their voting intentions in a forthcoming referendum on the single currency. The results are shown below:
           would vote for   would vote against   total
female           12                  7             19
male              8                 15             23
total            20                 22             42

With the expected values shown in brackets:

           would vote for   would vote against   total
female       12 (9.05)           7 (9.95)          19
male          8 (10.95)         15 (12.05)         23
total        20                 22                 42
We have calculated the expected values as usual, i.e. expected = (row total × column total) / grand total
The test statistic, with Yates Correction, is given by:
χ² = Σ ( |o – e| – 0.5 )² / e
Where |o-e| denotes the absolute value of a number (regardless of sign). So |3| = |-3| = 3 etc.
H0 Rows and columns are independent [voting does not depend on gender]
H1 some form of dependence exists [the genders vote differently]
level of the test: 5%
  o        e       o – e    |o – e| – 0.5   (|o – e| – 0.5)²   (|o – e| – 0.5)²/e
 12       9.05      2.95        2.45            6.0025              0.6633
  7       9.95     –2.95        2.45            6.0025              0.6033
  8      10.95     –2.95        2.45            6.0025              0.5482
 15      12.05      2.95        2.45            6.0025              0.4981
                                                total               2.3129
Since the test statistic, 2.3129, is less than the critical value of χ² with 1 degree of freedom at the 5% level, 3.8415 (from tables), we fail to reject H0 and conclude that, on the basis of the evidence available to us, rows and columns are independent. Thus there is no evidence to support the claim that males and females vote differently.
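The Yates-corrected statistic can be checked with a short Python sketch (illustrative only; working with unrounded expected values gives 2.3173 rather than the 2.3129 obtained above from 2-decimal expected values, and leads to the same conclusion):

```python
observed = [[12, 7],    # female: would vote for, would vote against
            [8, 15]]    # male

row_totals = [sum(r) for r in observed]
col_totals = [sum(c) for c in zip(*observed)]
grand = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand
        # Yates continuity correction: subtract 0.5 from |o - e| before squaring
        chi2 += (abs(o - e) - 0.5) ** 2 / e

print(round(chi2, 4))   # 2.3173, below the critical value 3.8415: do not reject H0
```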
Seminar Exercise
1. Number of defects found in a product:   0     1     2     3     4 or more
   Expected percentage:                    5%   15%   30%   30%   20%
We take a random sample of 35 items and observe the following numbers of units with the stated number of defects:
   Number of defects found in a product:   0     1     2     3     4 or more
   Numbers observed:                       3     8     4    11     9
Test to see if this data is consistent with the claimed proportions of defectives.
(Remember to ensure that all expected values are greater than or equal to 5)
2. A random sample of people reporting sick over a five day period was:
Monday   Tuesday   Wednesday   Thursday   Friday
   8        20         14         18        25
Is there any evidence to suggest that the reporting is not spread evenly through the week?
3. A firm uses three similar machines to produce a large number of components. On a particular day a random sample of 99 from the defective components produced on the early shift were traced back to the machine that produced them. The same was done with a random sample of 65 defectives from the late shift. The table below shows the number of defectives found to be from each machine on each shift:
              machine A   machine B   machine C
Early Shift      37          29          33
Late Shift       13          16          36
Test the hypothesis that the probability of a defective coming from a particular machine is independent of the shift in which it was produced.
4. Stapleton Electrics manufacture televisions at four different factories and quality control is of great interest to the company. The table below shows reliability of the machines in a particular month:
                   Factory A   Factory B   Factory C   Factory D
Needed repair          4          15           9          12
No repair needed       8          10           6           6
a) Test to see if there is any association between factory and reliability of machines.
b) Subsequently, it was discovered that the figures above had been divided by 10 to make the arithmetic easier.
Correct for this and repeat the test. Does it make any difference to the result?
5. Debtovia is a small country in the middle of a recession. As part of a nation-wide survey a particular town was selected at random and the question "Would you support an incomes policy?" was asked of a number of workers selected at random. Their answers (Yes, No or Don't Know) and their employment status (skilled/unskilled and Union/Non-Union) were recorded and the data is presented below:
              Skilled and   Skilled and    Unskilled and   Unskilled and
              in union      not in union   in union        not in union
Yes                7             7              9               12
No                24            21              9               11
Don't know        29            27             17               27
a) Test the hypothesis that there is no association between the responses to the above question and employment status.
b) Form a new 2×2 contingency table from the above data by omitting all the Don't know responses and then pooling the remaining responses to obtain one column for Skilled and one column for Unskilled workers. Test to see if there is any association evident in your new table.
c) Comment on your findings in both cases and compare the two situations.
Answers
1. χ2 test = 6.904 critical = 7.815 do not reject H0
The data is consistent with claimed proportions
2 χ2 test = 9.647 critical = 9.488 reject H0
Reporting sick is not evenly spread. Monday has fewer than expected, Friday has more.
3 χ2 test = 8.7346 critical = 5.991 reject H0
Number of defectives varies according to shift
4 a) χ2 test = 3.5773 critical = 7.815 do not reject H0
No relationship between factory and reliability
b) New χ2 test = 35.77 (i.e. 10 times as big) reject H0
5 a) χ2 test = 8.43 Critical = 12.592 do not reject H0
Conclude no association between response and employment status
b) χ2 test = 6.8727 critical = 3.841 reject H0
Conclude there is an association between response and employment
Chi-Squared Tables
degrees of                        area in one tail
freedom     0.100      0.050      0.025      0.010      0.005      0.001
  1         2.7055     3.8415     5.0239     6.6349     7.8794    10.8276
  2         4.6052     5.9915     7.3778     9.2103    10.5966    13.8155
  3         6.2514     7.8147     9.3484    11.3449    12.8382    16.2662
  4         7.7794     9.4877    11.1433    13.2767    14.8603    18.4668
  5         9.2364    11.0705    12.8325    15.0863    16.7496    20.5150
  6        10.6446    12.5916    14.4494    16.8119    18.5476    22.4577
  7        12.0170    14.0671    16.0128    18.4753    20.2777    24.3219
  8        13.3616    15.5073    17.5345    20.0902    21.9550    26.1245
  9        14.6837    16.9190    19.0228    21.6660    23.5894    27.8772
 10        15.9872    18.3070    20.4832    23.2093    25.1882    29.5883
 11        17.2750    19.6751    21.9200    24.7250    26.7568    31.2641
 12        18.5493    21.0261    23.3367    26.2170    28.2995    32.9095
 13        19.8119    22.3620    24.7356    27.6882    29.8195    34.5282
 14        21.0641    23.6848    26.1189    29.1412    31.3193    36.1233
 15        22.3071    24.9958    27.4884    30.5779    32.8013    37.6973
 16        23.5418    26.2962    28.8454    31.9999    34.2672    39.2524
 17        24.7690    27.5871    30.1910    33.4087    35.7185    40.7902
 18        25.9894    28.8693    31.5264    34.8053    37.1565    42.3124
 19        27.2036    30.1435    32.8523    36.1909    38.5823    43.8202
 20        28.4120    31.4104    34.1696    37.5662    39.9968    45.3147
 21        29.6151    32.6706    35.4789    38.9322    41.4011    46.7970
 22        30.8133    33.9244    36.7807    40.2894    42.7957    48.2679
 23        32.0069    35.1725    38.0756    41.6384    44.1813    49.7282
 24        33.1962    36.4150    39.3641    42.9798    45.5585    51.1786
 25        34.3816    37.6525    40.6465    44.3141    46.9279    52.6197
 26        35.5632    38.8851    41.9232    45.6417    48.2899    54.0520
 27        36.7412    40.1133    43.1945    46.9629    49.6449    55.4760
 28        37.9159    41.3371    44.4608    48.2782    50.9934    56.8923
 29        39.0875    42.5570    45.7223    49.5879    52.3356    58.3012
 30        40.2560    43.7730    46.9792    50.8922    53.6720    59.7031
 31        41.4217    44.9853    48.2319    52.1914    55.0027    61.0983
 32        42.5847    46.1943    49.4804    53.4858    56.3281    62.4872
 33        43.7452    47.3999    50.7251    54.7755    57.6484    63.8701
 34        44.9032    48.6024    51.9660    56.0609    58.9639    65.2472
 35        46.0588    49.8018    53.2033    57.3421    60.2748    66.6188
 36        47.2122    50.9985    54.4373    58.6192    61.5812    67.9852
 37        48.3634    52.1923    55.6680    59.8925    62.8833    69.3465
 38        49.5126    53.3835    56.8955    61.1621    64.1814    70.7029
 39        50.6598    54.5722    58.1201    62.4281    65.4756    72.0547
 40        51.8051    55.7585    59.3417    63.6907    66.7660    73.4020
 41        52.9485    56.9424    60.5606    64.9501    68.0527    74.7449
 42        54.0902    58.1240    61.7768    66.2062    69.3360    76.0838
 43        55.2302    59.3035    62.9904    67.4593    70.6159    77.4186
 44        56.3685    60.4809    64.2015    68.7095    71.8926    78.7495
 45        57.5053    61.6562    65.4102    69.9568    73.1661    80.0767
 46        58.6405    62.8296    66.6165    71.2014    74.4365    81.4003
Practical session 3. Using SPSS for the Chi-squared test and the Seabridge file
As an example, suppose we are interested to see if there is any association between gender (Q5) and the main reason for joining the Centre (Q10).
Using Analyze, Descriptive Statistics, Crosstabs:
Move "Q5 gender" into the rows and "Q10 main reason" into the columns.
Click on the Statistics button and choose Chi-squared.
From the Cells button, choose Observed and Expected.
Now choose Continue and OK.
The resulting output is:
Case Processing Summary
                                         Cases
                             Valid         Missing       Total
                             N    Percent  N    Percent  N    Percent
Q5 gender * Q10 main
reason for joining           58   96.7%    2    3.3%     60   100.0%
Q5 gender * Q10 main reason for joining Crosstabulation
                                   Q10 main reason for joining
                              location   range of    recommended by   membership
                              of centre  facilities  friend/relative  rates       other   Total
Q5 gender  male    Count          2          9             4              3          9      27
                   Expected
                   Count          4.2        6.5           5.6            5.1        5.6    27.0
           female  Count          7          5             8              8          3      31
                   Expected
                   Count          4.8        7.5           6.4            5.9        6.4    31.0
Total              Count          9         14            12             11         12      58
                   Expected
                   Count          9.0       14.0          12.0           11.0       12.0    58.0
Chi-Square Tests
                     Value     df   Asymp. Sig. (2-sided)
Pearson Chi-Square   10.300a   4    .036
Likelihood Ratio     10.682    4    .030
N of Valid Cases     58
a. 2 cells (20.0%) have expected count less than 5. The minimum expected count is 4.19.
As the area in the tail of the chi-squared distribution is 0.036 (see value in last column) and thus less than 5%, we conclude that there is an association between gender and the reason for joining the centre.
2 cells (20%) have expected values of less than 5. This is acceptable.
Topic 3 Significance Testing For Means
Notation: µ is the mean of the population
x̄ is the mean of the sample
σ is the standard deviation of the population
s is the best estimate of the population standard deviation from a sample
[Reminder: We use the formula σ = √( Σ(x – µ)² / N ) for the standard deviation of a population and the formula s = √( Σ(x – x̄)² / (n – 1) ) for the best estimate of the population standard deviation from a sample]
The Central Limit Theorem states that if random samples of the same size are repeatedly drawn from a population of any distribution, the means of those samples will be Normally distributed.
Additionally, the mean of the sample means will be the same as the population mean.
Mean of sample means = µ
If the population size is large relative to the sample size, n, then the standard deviation of the sample means is equal to the population standard deviation divided by the square root of the sample size:
s.d. of sample means = σ/√n
Z-test for a Population Mean
We use the z-test for a population mean when the standard deviation of the population is known.
Procedure:
Two-tailed test
1) Set up hypotheses H0 : µ= some value [null hypothesis]
H1 : µ ≠some value [alternative hypothesis]
2) Choose significance level 5% is commonly used
3) Calculate test statistic z = (x̄ – µ) / (σ/√n)
4) Find critical value(s) from tables. For a 2-tailed test, the area in each tail is half of the significance level.
5) Accept or reject hypothesis
6) Summarise findings
Example:
The lengths of metal bars produced by a particular machine are Normally distributed with a mean length of 420 cm and a standard deviation of 12 cm. After a recent service to the machine, a sample of 100 bars was taken and the mean length of this sample was found to be 423cm.
Janet Geary 2012 Page 38
Is there any evidence, at the 5% level, of a change in the mean length of the bars produced by the machine? Assume that the standard deviation remains the same after the service.
In this example: population mean µ = 420
Population standard deviation σ = 12
Sample mean x̄ = 423
Sample size n = 100
1) H0 : µ = 420 [mean length remains the same]
H1 : µ ≠ 420 [mean length changes, 2-tailed test]
2) significance level , α = 0.05 [5%]
3) Test statistic: z = (x̄ – µ) / (σ/√n) = (423 – 420) / (12/√100) = 3/1.2 = 2.5
4) Critical values:
In this example we are using a two-tailed test as the alternative hypothesis has two parts (less than 420 and greater than 420). We have to split the 5% significance level into two halves of 0.025 each. From the Normal tables (below), an area in a tail of 0.025 gives critical values of z = ±1.96.
5) The diagram below (to be completed in seminar) shows the “accept” and “reject” areas for H0.
As test z = 2.5 lies in the reject region, we reject H0
6) We conclude that there is evidence to suggest, at the 5% level, that the mean length of the metal bars has changed since the machine was serviced.
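Steps 1 to 6 can be scripted; below is a minimal Python sketch of this z-test with the figures from the metal-bars example (illustrative only, not part of the module software):

```python
import math

mu0, sigma = 420, 12    # hypothesised mean and known population s.d.
n, xbar = 100, 423      # sample size and sample mean

z = (xbar - mu0) / (sigma / math.sqrt(n))
print(round(z, 2))      # 2.5

z_crit = 1.96           # two-tailed critical value at the 5% level
if abs(z) > z_crit:
    print("Reject H0: evidence of a change in mean length")
else:
    print("Do not reject H0")
```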
N.B. In this example we assumed that the standard deviation had not changed although we were unsure about whether the mean length had changed. This is not very realistic. In general, if we know the population standard deviation, we are likely to know the population mean as well and thus have no need to test it.
It is much more usual that we do not know the population mean and the population standard deviation. This scenario is covered by the t-test.
If the sample size is large, we can use the t-test.
Critical values of the Normal Distribution
Alpha is the area in one tail.
alpha    z value      alpha    z value
0.005    2.5758       0.05     1.6449
0.010    2.3263       0.10     1.2816
0.015    2.1701       0.15     1.0364
0.020    2.0537       0.20     0.8416
0.025    1.9600       0.25     0.6745
0.030    1.8808       0.30     0.5244
0.035    1.8119       0.35     0.3853
0.040    1.7507       0.40     0.2533
0.045    1.6954       0.45     0.1257
0.050    1.6449       0.50     0.0000
The t-test for a population mean
The t-test is used to test the value of a population mean when the population standard deviation is not known.
Procedure:
1) Calculate the sample mean, x̄, and the estimated standard deviation, s.
2) Set up hypotheses H0 : µ = some value [null hypothesis]
H1 : µ ≠ some value [alternative hypothesis]
3) Choose significance level 5% is commonly used
4) Calculate test statistic t = (x̄ – µ) / (s/√n)
5) Find critical value(s) from tables degrees of freedom = n-1
6) Accept or reject hypothesis
7) Summarise findings
Example:
A random sample of 8 women yielded the following cholesterol levels:
3.1 2.8 1.5 1.7 2.4 1.9 3.3 1.6
Test whether the sample could be drawn from a population whose mean cholesterol level is 3.1.
Test at the 5% level of significance.
Population mean (to be tested): µ = 3.1
Sample size: n = 8
sample mean: x̄ = 2.2875 [from calculator]
estimated standard deviation: s = 0.7120 [ from calculator]
H0 : µ = 3.1
H1 : µ ≠ 3.1 [2-tailed test]
Significance level = 5%, thus the area in each tail is α = 0.05/2 = 0.025 [2-tailed test]
degrees of freedom = n-1 = 7 α = 0.025
From the t-tables (next page), the critical values are ± 2.3646
As the value of the test statistic, t = (x̄ – µ)/(s/√n) = (2.2875 – 3.1)/(0.7120/√8) = –3.2276, is outside the critical values, we reject H0
There is evidence to suggest that this sample is not drawn from a population with a mean cholesterol level of 3.1.
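The same t-test can be reproduced with Python's statistics module, whose stdev uses the n – 1 divisor and so matches the 'best estimate' s above (a sketch for checking the calculator values, not the module's prescribed tool):

```python
import math
import statistics

sample = [3.1, 2.8, 1.5, 1.7, 2.4, 1.9, 3.3, 1.6]
mu0 = 3.1               # hypothesised population mean

n = len(sample)
xbar = statistics.mean(sample)   # sample mean, 2.2875
s = statistics.stdev(sample)     # estimated s.d. (n - 1 divisor), about 0.7120

t = (xbar - mu0) / (s / math.sqrt(n))
print(round(t, 4))               # -3.2276, outside the critical values of +/-2.3646 (7 df)
```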
t-tables
degrees of                       area in one tail
freedom     0.1       0.05      0.025     0.01      0.005     0.0025
  1         3.0777    6.3138   12.7062   31.8205   63.6567  127.3213
  2         1.8856    2.9200    4.3027    6.9646    9.9248   14.0890
  3         1.6377    2.3534    3.1824    4.5407    5.8409    7.4533
  4         1.5332    2.1318    2.7764    3.7469    4.6041    5.5976
  5         1.4759    2.0150    2.5706    3.3649    4.0321    4.7733
  6         1.4398    1.9432    2.4469    3.1427    3.7074    4.3168
  7         1.4149    1.8946    2.3646    2.9980    3.4995    4.0293
  8         1.3968    1.8595    2.3060    2.8965    3.3554    3.8325
  9         1.3830    1.8331    2.2622    2.8214    3.2498    3.6897
 10         1.3722    1.8125    2.2281    2.7638    3.1693    3.5814
 11         1.3634    1.7959    2.2010    2.7181    3.1058    3.4966
 12         1.3562    1.7823    2.1788    2.6810    3.0545    3.4284
 13         1.3502    1.7709    2.1604    2.6503    3.0123    3.3725
 14         1.3450    1.7613    2.1448    2.6245    2.9768    3.3257
 15         1.3406    1.7531    2.1314    2.6025    2.9467    3.2860
 16         1.3368    1.7459    2.1199    2.5835    2.9208    3.2520
 17         1.3334    1.7396    2.1098    2.5669    2.8982    3.2224
 18         1.3304    1.7341    2.1009    2.5524    2.8784    3.1966
 19         1.3277    1.7291    2.0930    2.5395    2.8609    3.1737
 20         1.3253    1.7247    2.0860    2.5280    2.8453    3.1534
 21         1.3232    1.7207    2.0796    2.5176    2.8314    3.1352
 22         1.3212    1.7171    2.0739    2.5083    2.8188    3.1188
 23         1.3195    1.7139    2.0687    2.4999    2.8073    3.1040
 24         1.3178    1.7109    2.0639    2.4922    2.7969    3.0905
 25         1.3163    1.7081    2.0595    2.4851    2.7874    3.0782
 26         1.3150    1.7056    2.0555    2.4786    2.7787    3.0669
 27         1.3137    1.7033    2.0518    2.4727    2.7707    3.0565
 28         1.3125    1.7011    2.0484    2.4671    2.7633    3.0469
 29         1.3114    1.6991    2.0452    2.4620    2.7564    3.0380
 30         1.3104    1.6973    2.0423    2.4573    2.7500    3.0298
 31         1.3095    1.6955    2.0395    2.4528    2.7440    3.0221
 32         1.3086    1.6939    2.0369    2.4487    2.7385    3.0149
 33         1.3077    1.6924    2.0345    2.4448    2.7333    3.0082
 34         1.3070    1.6909    2.0322    2.4411    2.7284    3.0020
 35         1.3062    1.6896    2.0301    2.4377    2.7238    2.9960
 36         1.3055    1.6883    2.0281    2.4345    2.7195    2.9905
 37         1.3049    1.6871    2.0262    2.4314    2.7154    2.9852
 38         1.3042    1.6860    2.0244    2.4286    2.7116    2.9803
 39         1.3036    1.6849    2.0227    2.4258    2.7079    2.9756
 40         1.3031    1.6839    2.0211    2.4233    2.7045    2.9712
 41         1.3025    1.6829    2.0195    2.4208    2.7012    2.9670
 42         1.3020    1.6820    2.0181    2.4185    2.6981    2.9630
 43         1.3016    1.6811    2.0167    2.4163    2.6951    2.9592
 44         1.3011    1.6802    2.0154    2.4141    2.6923    2.9555
 45         1.3006    1.6794    2.0141    2.4121    2.6896    2.9521
 50         1.2987
1.6759
2.0086
2.4033
2.6778
2.9370
55
1.2971
1.6730
2.0040
2.3961
2.6682
2.9247
60
1.2958
1.6706
2.0003
2.3901
2.6603
2.9146
65
1.2947
1.6686
1.9971
2.3851
2.6536
2.9060
70
1.2938
1.6669
1.9944
2.3808
2.6479
2.8987
75
1.2929
1.6654
1.9921
2.3771
2.6430
2.8924
80
1.2922
1.6641
1.9901
2.3739
2.6387
2.8870
85
1.2916
1.6630
1.9883
2.3710
2.6349
2.8822
90
1.2910
1.6620
1.9867
2.3685
2.6316
2.8779
95
1.2905
1.6611
1.9853
2.3662
2.6286
2.8741
100
1.2901
1.6602
1.9840
2.3642
2.6259
2.8707
200
1.2858
1.6525
1.9719
2.3451
2.6006
2.8385
300
1.2844
1.6499
1.9679
2.3388
2.5923
2.8279
400
1.2837
1.6487
1.9659
2.3357
2.5882
2.8227
500
1.2832
1.6479
1.9647
2.3338
2.5857
2.8195
600
1.2830
1.6474
1.9639
2.3326
2.5840
2.8175
700
1.2828
1.6470
1.9634
2.3317
2.5829
2.8160
800
1.2826
1.6468
1.9629
2.3310
2.5820
2.8148
900
1.2825
1.6465
1.9626
2.3305
2.5813
2.8140
1000
1.2824
1.6464
1.9623
2.3301
2.5808
2.8133
Janet Geary 2012 Page 41
Seminar Exercise
1. The commissions earned by an estate agent last year were Normally distributed with a mean of £2,560 and a standard deviation of £310. A random sample of 24 sales from this year was examined and the mean commission found to be £2,690.
Assuming no change in the standard deviation, does this provide significant evidence of a change in the mean commission?
Test at the 5% level of significance.
2. A random sample of 40 sacks of animal feed is taken from a population whose mean is unknown but whose variance is 15.7 kg2. The mean weight of the sample is 86.3 kg.
Test the hypothesis that the mean weight of the sacks is 86 kg.
Test at the 5% significance level.
3. The mean time taken by a standard piece of software to run a particular task is 64 seconds. In order to compare a different piece of software it was given a sample of 15 tasks of the same type, the times being:
61.8 73.0 72.0 68.1 63.1 71.5 63.1 69.4 68.9 67.0
65.0 67.5 69.0 59.8 61.5
Does this data indicate a difference in the mean times between the two pieces of software? Test at the 5% level of significance.
4. You have been told by “experts” that the average weekly salary for a part-time secretary is £275. You decide to check this assertion by taking a random sample of 30 secretaries and obtain a sample mean of £270 and a standard deviation of £40.
Test at the 5% level to determine whether the experts are correct.
5. During 2011 the number of beds required at a hospital was Normally distributed with a mean of 1800 per day and a standard deviation of 190 per day.
During the first 50 days of 2012, the average daily requirement for beds was 1830. This data is considered to be a valid sample for 2012. A senior hospital manager has claimed that this gives evidence that the requirement for beds has changed since 2011. Would you agree?
Is the sampling method valid?
Answers:
1. Test statistic, z = 2.05 reject H0 mean has changed
2. Test statistic z = 0.4788 do not reject H0
3. Test statistic t = 2.54 reject H0 times are not the same
4. Test statistic t = -0.6847 do not reject H0 experts could be right
5. Test statistic z = 1.116 do not reject H0 requirement for beds has not changed
Sampling method is questionable as only covers winter (January and February) and thus will not represent the demand for beds for the whole year. Illnesses are often seasonal.
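If you want to check your working for question 1, the z statistic takes only a couple of lines; a minimal sketch (our own, not part of the handout):

```python
# Sketch checking the answer to question 1: a z-test, since the
# population standard deviation (310) is assumed unchanged.
import math

mu0, sigma = 2560, 310      # last year's mean and s.d.
xbar, n = 2690, 24          # this year's sample

z = (xbar - mu0) / (sigma / math.sqrt(n))
print(round(z, 2))          # 2.05 -> outside +/-1.96, reject H0
```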
One-tailed t-test for a Population Mean
In a one tailed test the alternative hypothesis is that the population mean is either greater than a particular value or it is less than a particular value. (With a two-tailed test it was simply “not equal”)
Example:
A jeweller was sold some silver wire that he was suspicious about. If the wire was pure silver it would give a reading of 1.5 ohms when tested for electrical resistance. If the wire was not pure silver the resistance would be increased.
The jeweller tested 5 pieces of the dubious wire and the following readings for electrical resistance were obtained:
1.51 1.49 1.54 1.52 1.54
Test at the 5% level the hypothesis that the wire is pure silver.
µ = 1.5 [to be tested] n = 5
x̄ = 1.52 [from calculator]
s = 0.0212 [from calculator]
H0 : µ = 1.5 [it is pure silver]
H1 : µ > 1.5 [it is not pure silver, resistance is increased]
Significance level = 5% thus α = 0.05 [1-tailed test]
degrees of freedom = n-1 = 4 α = 0.05
critical value = +2.1318 [we need the positive critical value for H1 > some value]
[if H1 had been < some value, we would have used the negative critical value]
Test statistic: t = (1.52 − 1.5)/(0.0212/√5) = 0.02/0.00948 = 2.110
As t = 2.110 is less than the critical value 2.1318, do not reject H0
There is no significant evidence to suggest that the silver wire was impure.
The jeweller should allay his suspicions.
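As a cross-check of the jeweller example, SciPy can run the same one-tailed test directly (a sketch, not part of the original materials; the `alternative` argument needs SciPy 1.6 or later):

```python
# Sketch of the jeweller's one-tailed test.
from scipy import stats

readings = [1.51, 1.49, 1.54, 1.52, 1.54]

# H0: mu = 1.5 against H1: mu > 1.5 (one-tailed)
t_stat, p_value = stats.ttest_1samp(readings, popmean=1.5,
                                    alternative="greater")
print(round(t_stat, 2))   # about 2.11, below the critical 2.1318
print(p_value > 0.05)     # True: do not reject H0
```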
Exercises One tailed t-test
1. A firm which manufactures panels tests a sample of 13 panels, loading them until they crack. The results of this test are the following loads (in Newtons):
2.7 4.6 3.1 4.2 3.3 6.7 8.9 8.6
7.2 8.6 8.9 7.3 8.9
An important customer of this firm asserts that the quality of the panels is getting worse and that the mean cracking load is now less than the previous value of 7 Newtons.
Does the evidence bear out the customer's claim?
Test at the 5% level.
2. A health clinic claims that people following its diet programme will lose, on average, at least 8 kilograms during the programme. A random sample of 41 people on the programme showed a mean weight loss of 7 kilograms. The sample standard deviation, s, was found to be 3.1 kilograms. Test at the 5% level of significance whether the company is exaggerating, i.e. the mean weight loss, in general, is less than 8 kilograms.
3. A manager in a university department claimed that the average number of hours members of staff work each week is 35. A random sample of 12 lecturers was taken and the number of hours each worked in a particular week recorded. The results are given below:
37, 41, 32, 45, 39, 35, 43, 38, 40, 42, 38, 34
The staff union claims that the average number of hours worked is greater than 35.
Does the evidence bear out the union's claim?
Test at the 5% level of significance.
Answers:
1. mean =6.3846, sd = 2.4535, Test statistic t = -0.9044, critical = -1.782
do not reject H0 Evidence does not support the claim
2. Test statistic t = -2.065, critical = -1.684
reject H0 Mean weight loss is less than 8 kg. Company is exaggerating.
3. Mean = 38.66, sd = 3.821, Test statistic t = +3.3237, critical = + 1.796
reject H0 Mean hours worked is greater than 35.
Evidence does bear out the union's claim.
Topic 4 Two-sample test and test for proportions
Two Sample t-Test for Population Means
Suppose we have two independent random samples:
Sample A: x1, x2, x3, …, xnx from a Normal distribution with mean µx and standard deviation σ, with sample mean x̄, sd = sx and sample size nx
Sample B: y1, y2, y3, …, yny from a Normal distribution with mean µy and standard deviation σ, with sample mean ȳ, sd = sy and sample size ny
Notice that we are assuming that the population standard deviations are the same. Since this standard deviation is unknown we have to find an estimate for it.
Notation
                              sample A    sample B
sample size                      nx          ny
sample mean                      x̄           ȳ
sample standard deviation        sx          sy
The estimate for the common standard deviation is given by s, where
s² = ((nx − 1)sx² + (ny − 1)sy²)/(nx + ny − 2)
The test statistic is
t = ((x̄ − ȳ) − (µx − µy))/(s √(1/nx + 1/ny))
which fits a t-distribution with nx + ny − 2 degrees of freedom
We usually use a two sample t-test to determine if the population means are equal.
i.e. (µx – µy =0)
In this case, the test statistic is t = (x̄ − ȳ)/(s √(1/nx + 1/ny))
Example:
As part of an investigation undertaken by a telephone company, a comparison of the weekly phone bills in two areas was undertaken. The figures are given below :
Area A: 5.7 12.0 10.1 13.7 11.9 11.7 10.4 7.3 5.3 6.8 11.8
Area B: 8.9 3.0 8.2 5.2 2.2 5.7 3.2 9.6 3.1 3.9
Test, at the 5% level to see if there is any difference between the mean phone bills in the two areas.
calculations
                    Area A                 Area B
sample means        x̄ = 9.7               ȳ = 5.3
sample s.d.         sx = 2.91067           sy = 2.7109
                    sx² = 8.47200          sy² = 7.3489
sample size         nx = 11                ny = 10

common variance, s² = (10 × 8.47200 + 9 × 7.3489)/19 = 7.940
common standard deviation, s = √7.940 = 2.8178
H0: µx = µy or µx – µy = 0 [population means are the same]
H1: µx ≠ µy [pop. means are not the same]
Choose α = 5%
Test Statistic is t = (9.7 − 5.3)/(2.8178 × √(1/11 + 1/10)) = 4.4/1.2312 = 3.574
If α = 5% for a 2-tailed test, each tail has 0.025, degrees of freedom = nx + ny -2 = 19 critical values = ± 2.0930
As the test statistic t = 3.574 is outside the “accept region “ (± 2.0930), Reject H0
The mean amount spent on telephone bills is not the same in the two areas
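The same pooled two-sample test can be reproduced with SciPy; `equal_var=True` matches the handout's common-variance assumption (this sketch is our own, not part of the handout):

```python
# Sketch reproducing the phone-bill example with a pooled t-test.
from scipy import stats

area_a = [5.7, 12.0, 10.1, 13.7, 11.9, 11.7, 10.4, 7.3, 5.3, 6.8, 11.8]
area_b = [8.9, 3.0, 8.2, 5.2, 2.2, 5.7, 3.2, 9.6, 3.1, 3.9]

t_stat, p_value = stats.ttest_ind(area_a, area_b, equal_var=True)
print(round(t_stat, 3))   # about 3.574
print(p_value < 0.05)     # True: reject H0 at the 5% level
```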
Exercise: 2 sample t-test
1. Independent random samples of current account balances at two branches of a particular bank yielded the following results:
Branch        number of accounts    sample          sample
              sampled               mean balance    standard deviation
Holloway             12                £1,000              £150
Islington            10                  £920               £120
Test , at the 5% level of significance, whether there is any difference between the mean balances in the two branches of the bank.
2. A firm is studying the delivery times of two raw material suppliers. The firm is basically satisfied with supplier A and is prepared to stay with that supplier if the mean delivery time is roughly the same as that of supplier B.
Independent samples gave the following results:
              sample size    sample mean      sample
                             delivery time    standard deviation
supplier A        50            14 days            3 days
supplier B        30            12.5 days          2 days
Test, at the 5% level of significance, whether there is any difference in the mean times for delivery.
3. In a wage discrimination case involving male and female employees, independent samples of male and female employees with five years' experience or more provided the hourly wage results shown in the table.
                    sample size    sample mean     sample
                                   (£ per hour)    standard deviation
male employees          44            £9.25              £1.00
female employees        32            £8.70              £0.80
Does wage discrimination appear to be present in this case?
Test at the 5% level of significance.
Test at the 1% level of significance.
Do your conclusions depend upon the level of significance?
4. A University careers officer decided to collect data on the starting salaries for graduates in different subjects ten years ago. She then wanted to compare the averages from ten years ago with those from last year's graduates. Among the areas investigated were accounting graduates and general business graduates.
The salaries (in £1,000 per annum) found from two samples are given below:
Accounting    Business
   14.4          13.2
   12.6          11.8
   13.1          12.5
   14.0          11.5
   13.5          14.9
   13.1          12.3
   14.1          14.5
   12.4          13.7
   12.6          12.2
   14.9          13.4
   14.6          13.1
   14.6
Use a 5% level of significance to test the hypothesis that there is no difference between the mean annual starting salary of Accounting graduates and the mean starting salary of Business graduates ten years ago.
What is your conclusion?
Answers:
1. Common s = 137.31 t = 1.36 critical = ± 2.0860 do not reject H0
No significant difference in means
2. Common s = 2.6723 t = 2.4306 critical = ± 2.0 (approx) reject H0
There is a difference in the means
3. Common s = 0.9215 t = 2.569
a) critical = ± 1.9921 (approx) reject H0 There is a difference in the means
b) critical = ± 2.6430 (approx) do not reject H0 There is no difference in the means
Difference in wages is significant at the 5% level but not at the 1% level.
4 Accounting mean = 13.6583 sample s.d. = 0.8867
Business mean = 13.0091 sample s.d. = 1.0784
common s = 0.9827 t = 1.5826 critical = ± 2.0796 do not reject H0
No difference in the mean starting salaries
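Question 1 can be verified from the summary statistics alone, since the pooled t-test needs only the sample sizes, means and standard deviations (a sketch; the variable names are our own):

```python
# Sketch verifying answer 1 (Holloway vs Islington) from summary
# statistics, without the raw data.
import math

n_x, xbar, s_x = 12, 1000, 150   # Holloway
n_y, ybar, s_y = 10, 920, 120    # Islington

# pooled (common) standard deviation
s2 = ((n_x - 1) * s_x**2 + (n_y - 1) * s_y**2) / (n_x + n_y - 2)
s = math.sqrt(s2)

t = (xbar - ybar) / (s * math.sqrt(1 / n_x + 1 / n_y))
print(round(s, 2))   # 137.31
print(round(t, 2))   # 1.36 -> inside +/-2.0860, do not reject H0
```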
Test for a difference in proportions
Consider 2 random samples of sizes nx and ny with proportions of a success equal to px and py respectively. We wish to determine if there is any significant difference between the 2 proportions.
The first step is to calculate the common proportion P = (nx px + ny py)/(nx + ny); then Q = 1 − P.
Note: P can be remembered as P = (total number of successes)/(total number sampled).
The test statistic is
z = (px − py)/√(P Q (1/nx + 1/ny))
which fits a Normal distribution with mean 0 and standard deviation 1, i.e. the standardised Normal distribution as found in tables.
The null hypothesis, H0 , will be that there is no difference between the 2 proportions.
Example:
Two newspapers conducted opinion polls asking voters whether they would vote for Mr Whittington as the next Mayor of London.
In the Evening News poll, 325 voters out of 500 said that they would vote for Mr Whittington.
In the Morning Metro poll, 201 voters out of 300 polled said they would vote for Mr. Whittington.
Is there any difference in the results of the two polls?
Answer:
H0 : there is no difference between the proportions saying they will vote for Mr Whittington.
H1 : there is a significant difference between the proportions saying they will vote for Mr Whittington.
                       Evening News       Morning Metro       total
number of successes        325                 201              526
sample proportions     px = 325/500        py = 201/300
                          = 0.65              = 0.67
sample size              nx = 500            ny = 300           800
P = (nx px + ny py)/(nx + ny)    then Q = 1 − P
P = (500 × 0.65 + 300 × 0.67)/(500 + 300) = (325 + 201)/800 = 526/800 = 0.6575
Note : we could have calculated P directly from the totals column.
P = 0.6575 thus Q = 1 – P = 0.3425
The test statistic is
z = (0.65 − 0.67)/√(0.6575 × 0.3425 × (1/500 + 1/300)) = −0.02/0.03466 = −0.5771
Testing at the 5% level of significance, with a 2-tailed test, the critical values are ± 1.96
Do not reject H0 ,
Conclude: There is no significant difference between the results of the 2 opinion polls.
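The two-proportion test is simple enough to compute directly; a sketch of the poll example (our own, not from the handout):

```python
# Sketch of the two-proportion z-test for the opinion-poll example.
import math

x_succ, n_x = 325, 500   # Evening News
y_succ, n_y = 201, 300   # Morning Metro

p_x, p_y = x_succ / n_x, y_succ / n_y
P = (x_succ + y_succ) / (n_x + n_y)   # common proportion
Q = 1 - P

z = (p_x - p_y) / math.sqrt(P * Q * (1 / n_x + 1 / n_y))
print(round(P, 4))    # 0.6575
print(round(z, 4))    # about -0.5771, inside +/-1.96: do not reject H0
```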
Seminar Exercises Difference in proportions
1. A medical research unit decided to test two drugs, A and B, for reducing blood pressure. The drugs were given to 2 sets of volunteers. One group of 90 volunteers was treated with drug A and 60 of these volunteers reported lower blood pressure. The second group of 80 volunteers was treated with Drug B; of these 50 reported lower blood pressure. Test at the 5% level of significance if there is any difference between the 2 drugs ability to lower blood pressure.
2. A survey firm conducted door-to-door interviews on a new consumer product. Some individuals co-operated with the interviewers and completed the questionnaire whilst other individuals did not co-operate. The sample data is shown in the table below.
Testing at the 5% level of significance, test the hypothesis that the rate of co-operation is the same for both men and women.
sample size
number co-operating
men
200
110
women
300
210
3. Two universities were planning to merge. At one of the 2 universities, NLU, 200 staff were interviewed. Of these 44 said that they thought the new merged University would be “good for the local area”. At the second university, GLU, 48 of the 300 staff interviewed thought the new University would be “good for the local area”.
Test, at the 5% level of significance, whether there is any difference in the two proportions who think that the new university would be “good for the local area”.
Test at the 10% level of significance.
4. A check was conducted on a form completed in 2 offices. The first office yielded a random sample of 250 forms, of which 35 contained errors. The second office had a sample size of 300 and 27 forms that contained errors.
Test at the 10% level of significance whether there is any difference in the proportion of forms containing errors between the 2 offices.
Answers:
1. P = 0.647 z = 0.5679 No significant difference in the proportions.
2. P = 0.64 z = -3.423 There is a significant difference
3. P = 0.184 z = 1.69 No significant difference at 5% level
Significant difference at 10% level
4 P = 0.1127 z = 1.85 There is a significant difference
Practical session: Use of SPSS for a t-test for a population mean and a two-sample t-test.
One sample t-test
Example : We wish to test if the mean age of the users of the Seabridge sports centre could come from a population with mean age = 40.
Choose Analyse, Compare Means , One-Sample T-test
Choose age as the “test variable” and enter 40 as the “test value”
Using the Options button choose 95% as the Confidence Interval Percentage
Now choose Continue and OK
The SPSS output is :
One-Sample Statistics
                          N      Mean    Std. Deviation    Std. Error Mean
Q6 age of respondent      58     36.19       11.607             1.524
One-Sample Test
                                        Test Value = 40
                                                                95% Confidence Interval
                                                 Mean           of the Difference
                          t      df   Sig. (2-tailed)  Difference    Lower    Upper
Q6 age of respondent   -2.500    57        .015          -3.810      -6.86     -.76
In this case we would reject the hypothesis that the mean age of the population is 40 in favour of the alternative hypothesis that it is not 40 (the value in the tail is 0.015, which is less than 0.05). This is not surprising, as the sample mean is 36.19.
If we repeat this test with a test value of 36
We get :
One-Sample Test
                                        Test Value = 36
                                                                95% Confidence Interval
                                                 Mean           of the Difference
                          t      df   Sig. (2-tailed)  Difference    Lower    Upper
Q6 age of respondent    .124     57        .901           .190       -2.86     3.24
In this case we would not reject the null hypothesis that the mean age of the population is 36 as the value in the tail is 0.901 ( i.e. above 0.05)
Using SPSS for a two-sample t-test.
As an example we can assume that the male and female members were surveyed independently and thus can be considered as two independent samples . We can now test to see if there is a difference in the mean age of the male and female members.
Choose age as the Test Variable and gender as the Grouping Variable
Now define groups using the codes on the data file ( 1 and 2)
Choose Continue and then OK
The output takes the form of :
Group Statistics
                        Q5 gender     N     Mean    Std. Deviation    Std. Error Mean
Q6 age of respondent    male         27    34.67        11.622             2.237
                        female       31    37.52        11.619             2.087
Independent Samples Test
                                  Levene's Test for
                                  Equality of Variances            t-test for Equality of Means
                                                                          Sig.        Mean      Std. Error   95% CI of the Difference
                                     F       Sig.      t        df    (2-tailed)  Difference  Difference      Lower      Upper
Q6 age of    Equal variances
respondent   assumed               .037      .849    -.932      56       .356       -2.849       3.059       -8.977      3.278
             Equal variances
             not assumed                             -.932    54.907     .356       -2.849       3.059       -8.980      3.281
The values in the tails are 0.356 if we assume that the populations from which the samples were drawn have equal variances, and 0.356 if we do not make this assumption. In both cases the values are greater than 0.05 and thus we do not reject the hypothesis that they have equal means.
This is effectively saying that a female mean age of 37.52 and a male mean age of 34.67 are not significantly different.
Topic 5 Confidence Intervals
Point and Interval Estimates.
The mean rent paid by students in a provincial town is £102 per week with a standard deviation of £10. If we were ignorant of this fact (which we would be unless a census had been taken) we would have to conduct a survey to estimate the population mean.
If the results of the survey gave a sample mean of £100 we could use this to estimate the value of the population mean. This value of the sample mean, £100, is known as a point estimate because it consists of just one value. Point estimates can be very misleading as they give a false impression of accuracy. By themselves they do not recognise the fact that they are only estimates and thus are subject to a degree of uncertainty.
This question of uncertainty is addressed by using interval estimates such as confidence intervals.
A point estimate is: "The mean weekly rent is £100."
An interval estimate is: "The mean weekly rent is in the range £96 to £104."
Confidence Interval for a Population Mean – Population Standard Deviation Known
If the population standard deviation is known then we can use the Normal distribution in our calculation of the confidence interval.
The formula for a Confidence Interval is
x̄ ± Zα σ/√n
where Zα is found from the Normal tables (α is the area in one tail), n is the size of the sample and σ is the standard deviation of the population from which the sample is drawn.
For a 95% confidence interval, α = (100% − 95%)/2 = 0.025, Zα = 1.9600
For a 90% confidence interval, α = (100% − 90%)/2 = 0.05, Zα = 1.6449
For an 80% confidence interval, α = (100% − 80%)/2 = 0.10, Zα = 1.2816
Interpretation of a Confidence Interval
A 95 % confidence interval gives a range within which there is a 95% probability that the population mean lies.
If we took 100 samples and for each calculated the sample mean, standard deviation and hence the 95% confidence intervals, 95 of these intervals would contain the population mean.
Thus 5% of the time the population mean will NOT lie in the 95% confidence interval.
Example:
An accountant knows, from past experience that the standard deviation of the value of all invoices is £11.60. However he has forgotten the value of the mean.
In order to estimate the value of the mean, he takes a sample of 20 invoices and finds a sample mean of £51.41.
Determine a 95% confidence interval for the mean value of all invoices.
95% confidence interval: 51.41 ± 1.96 × 11.60/√20 = 51.41 ± 5.08
95% confidence interval is [£46.33, £56.49]
Thus there is a 95% confidence that the mean value of all invoices lies in the range £46.33 to £56.49.
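The invoice interval can be checked in a couple of lines (a sketch, not part of the handout):

```python
# Sketch of the invoice confidence interval (sigma known -> Normal).
import math

sigma, n, xbar = 11.60, 20, 51.41
z = 1.96                              # z for a 95% interval

margin = z * sigma / math.sqrt(n)
lower, upper = xbar - margin, xbar + margin
print(round(lower, 2), round(upper, 2))   # 46.33 56.49
```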
Confidence Interval for a Population Mean – Population Standard Deviation Unknown
If the population standard deviation is not known then we use the t distribution in our calculation of the confidence interval. We use the value s as the best estimate of the population standard deviation.
The formula for a Confidence Interval is
x̄ ± tα s/√n
where tα is found from the t-tables, with degrees of freedom = n − 1.
Example:
The heights, in cm, of 6 policemen were : 180 176 179 181 183 179
Calculate
a) a 90% Confidence Interval for the mean heights of all policemen
b) a 95% Confidence Interval for the mean heights of all policemen
c) If we had obtained the same sample statistics (mean and standard deviation) from a sample of 60 policemen what effect would this have on the 95% Confidence Interval?
Mean = 179.667 sample standard deviation s = 2.3381 [results from calculator]
a) 90% Confidence Interval
α = 0.05, degrees of freedom = n − 1 = 5, tα = 2.0150
179.667 ± 2.0150 × 2.3381/√6 = 179.667 ± 1.923
There is a 90% confidence that the mean height of all policemen lies in the range 177.74 cm to 181.59 cm.
b) 95% Confidence Interval
α = 0.025, degrees of freedom = n − 1 = 5, tα = 2.5706
179.667 ± 2.5706 × 2.3381/√6 = 179.667 ± 2.454, giving the interval (177.21 cm, 182.12 cm)
N.B. A higher percentage confidence interval gives a wider interval
c) 95% Confidence Interval with sample size = 60
α = 0.025, degrees of freedom = n − 1 = 59, tα = 2.0003 (using df = 60)
179.667 ± 2.0003 × 2.3381/√60 = 179.667 ± 0.604, giving the interval (179.06 cm, 180.27 cm)
N.B. Larger sample sizes give narrower confidence intervals.
Thus for a "more accurate" estimate of the mean (i.e. a narrower confidence interval), take a larger sample. I hope this was obvious before, but we have now shown it to be the case.
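The policemen example can be reproduced without printed tables by asking SciPy for the t quantile (a sketch; variable names are our own):

```python
# Sketch of the 90% interval for the policemen's heights.
import math
from scipy import stats

heights = [180, 176, 179, 181, 183, 179]
n = len(heights)
xbar = sum(heights) / n
s = math.sqrt(sum((h - xbar) ** 2 for h in heights) / (n - 1))

t_crit = stats.t.ppf(0.95, df=n - 1)      # 90% interval: 0.05 per tail
margin = t_crit * s / math.sqrt(n)
print(round(xbar - margin, 2), round(xbar + margin, 2))   # 177.74 181.59
```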
Confidence Interval for a Population Proportion
A proportion represents the fraction of a sample that has some attribute, and is mainly used when the data is categorical rather than numerical.
Let p = sample proportion (expressed as a decimal) then q = 1- p
If np > 5 and nq > 5 then we can calculate a Confidence Interval for a proportion by:
p ± Z α √( pq/n)
Example:
A random sample of 1,000 electors was polled and 400 of the electors said that they will vote Labour.
How accurate an estimate is this sample proportion with respect to how all electors will vote?
p = 400/1000 = 0.4, thus q = 0.6
For a 95% Confidence Interval, Z α = 1.96
95% Confidence Interval
p ± Zα √(pq/n) = 0.4 ± 1.96 × √((0.4 × 0.6)/1000)
= 0.4 ± 1.96 × √(0.00024)
= 0.4 ± 1.96 × 0.0155
= 0.4 ± 0.0304
95% Confidence Interval: (0.3696, 0.4304)
There is a 95% probability that the proportion of all electors who will vote Labour lies in the range 36.96% to 43.04%. More sensibly: 37% to 43% will vote Labour, with a confidence of 95%.
This is usually reported in the press as “ 40% will vote Labour with an error due to sampling of plus or minus 3%”
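A sketch of the voting-intention interval (our own, not part of the handout):

```python
# Sketch of the 95% confidence interval for a proportion.
import math

n, successes = 1000, 400
p = successes / n
q = 1 - p

margin = 1.96 * math.sqrt(p * q / n)   # 95% interval
print(round(p - margin, 4), round(p + margin, 4))   # 0.3696 0.4304
```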
Sample Size Required for a given level of accuracy
We can re-arrange the formula above to help us calculate the sample size required for a given level of accuracy.
Notation : let M be the size of the margin of error .
Then from above M = Z α √( pq/n)
For a 95% confidence interval Z α = 1.96
The size of the product pq will vary according to value of p. But as q = 1 – p, we can say that the quantity pq = p(1-p)
From the table below of y = p(1 − p) we can see that the largest possible value of pq is 0.25. This occurs when both p and q are 0.5.
  p     q = 1 − p    p(1 − p)
 0.0       1.0         0.00
 0.1       0.9         0.09
 0.2       0.8         0.16
 0.3       0.7         0.21
 0.4       0.6         0.24
 0.5       0.5         0.25
 0.6       0.4         0.24
 0.7       0.3         0.21
 0.8       0.2         0.16
 0.9       0.1         0.09
 1.0       0.0         0.00
For a 95% confidence interval we now have:
largest error: M = Zα √(pq/n)
               M = 1.96 √(0.25/n)
Rearranging this formula to make n the subject gives:
               M/1.96 = √(0.25/n)
               M²/1.96² = 0.25/n
               n = 0.25 × 1.96²/M² = 0.9604/M²
Thus if a 95% Confidence Interval required a maximum margin of error of 2%, then M = 0.02 and
               n = 0.9604/M² = 0.9604/(0.02)² = 2401
Some results: NB Sample size must be an integer.
margin M    0.9604/M²     minimum sample size
   1%          9604               9604
   2%          2401               2401
   3%        1067.1111            1068
   4%         600.25               601
   5%         384.16               385
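The sample sizes above can be regenerated in a short loop (a sketch; the round-then-ceiling step is our own guard against floating-point noise):

```python
# Sketch reproducing the minimum sample sizes for a 95% interval.
import math

for pct in (1, 2, 3, 4, 5):
    M = pct / 100                        # margin of error
    n = math.ceil(round(0.9604 / M**2, 6))
    print(f"{pct}% -> n = {n}")
```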
Seminar Exercise
1. An ambulance station is looking at its response times and takes a sample of 16 call-outs, finding that the mean of the sample is 17 minutes whilst the standard deviation of the sample is 5 minutes. What are the 99% confidence limits for the mean response time of all call-outs?
2. In order to evaluate the success of a television advertising campaign for a new product, a company interviewed 400 residents in the television area. 120 of them knew about the product. How accurately does this estimate the percentage of residents in the area who know about the product?
Calculate a 95% confidence interval.
3. A random sample of 400 rail passengers is taken and 55% are in favour of proposed new timetables. Calculate a 95% confidence interval for the proportion of all passengers that are in favour of the timetables.
4. A manufacturer needs to estimate the mean life of batteries they are producing. To do this they take a sample of 100 batteries and find that the mean life of this sample is 50 hours.
The standard deviation of all battery lifetimes is known to be 6 hours.
Calculate
a) a 95% Confidence Interval for the mean of all battery lifetimes.
b) a 99% Confidence Interval for the mean of all battery lifetimes.
5. The Human Resources department of a company is leading a campaign to reduce absenteeism of staff.
Last year 15% of the 59,202 hours which should have been worked over a 46 week year were lost due to absenteeism.
The campaign that was set up has now been running for 20 weeks. Absenteeism reports for the last 10 weeks of the campaign were examined. The weekly hours lost were:
week no. 11 12 13 14 15 16 17 18 19 20
hours lost 195 190 162 170 177 190 198 177 184 191
a) Considering these 10 weeks as a random sample of all the weeks to be worked, has there been a significant decrease in absenteeism over last year? You should carry out a test for the mean number of hours lost per week. Test at the 10% level of significance, and explain your choice of test statistic.
b. Calculate a 90% confidence interval for the mean number of hours lost per week after the campaign, and explain its meaning.
6. Last Christmas, the average value of purchases per customer at a toyshop was £36.00 with a standard deviation of £10.25.
The toyshop is anxious to know early on whether this year's average Christmas spend is different, so it takes a random sample of 15 customers. Their spend on toys is (in £):
22 48 36 45 35 11 22 69
86 45 57 43 22 24 17
The shop wishes to be 90% confident that the error due to sampling is no larger than £3.00.
Assuming that the population standard deviation remains at £10.25:
a) Is the sample taken sufficiently large to achieve the accuracy objective?
b) Test whether the average spend this Christmas is significantly different from last year’s. Test at the 10% level of significance.
c) Establish an 80% confidence interval for the average spend this year.
d) Are your answers to (b) and ( c) consistent? Explain your reasoning.
Answers
1. (13.32, 20.68)
2. (25.5%, 34.5%)
3. (50.1% , 59.9%)
4 a) (48.824, 51.176) b) (48.455 , 51.545)
5 sample mean = 183.4 sample sd = 11.6065 last year mean = 193.05
t = -2.629 critical = -1.383 reject H0 , Has been a decrease in hours lost
b) (176.67 , 190.13)
6 a) sample size is too small, it should be at least 32
b) z = 1.058 do not reject H0 ,
Mean is not significantly different from last year.
c) (35.41 , 42.19) d) both b) and c) imply that the value 36 would lie in a 90% confidence interval, i.e. are consistent
Practical session : Using SPSS to find Confidence Intervals
To find a 95% confidence of the length of membership:
Choose Analyze, Descriptive Statistics and then Explore
Select “length of membership” as the dependent list.
Now choose Statistics button and use 95% for the Confidence Interval for Mean
Now choose Continue and then OK
The output is :
Case Processing Summary
                                             Cases
                                Valid            Missing          Total
                             N    Percent      N    Percent     N    Percent
Q8 length of membership     60    100.0%       0      .0%      60    100.0%
Descriptives
                                                            Statistic   Std. Error
Q8 length of membership   Mean                                 2.32        .131
                          95% Confidence     Lower Bound       2.05
                          Interval for Mean  Upper Bound       2.58
                          5% Trimmed Mean                      2.30
                          Median                               2.00
                          Variance                            1.034
                          Std. Deviation                      1.017
                          Minimum                                 1
                          Maximum                                 4
                          Range                                   3
                          Interquartile Range                     1
                          Skewness                             .320        .309
                          Kurtosis                            -.956        .608
This gives a confidence interval of [2.05, 2.58]. We could report this as:
There is a 95% confidence that the mean length of membership is between 2 and 2½ years.
Topic 6: Correlation and regression
1. Pearson product moment coefficient of correlation
If we wish to measure the strength of linear relationship between two variables (x and y), and the data is cardinal, we use the Pearson product moment coefficient of correlation (r), which is defined as follows:
r = (NΣxy − ΣxΣy)/√((NΣx² − (Σx)²)(NΣy² − (Σy)²))
The value of r lies in the range −1 ≤ r ≤ +1. An r value of −1 signifies a perfect negative linear correlation, +1 a perfect positive linear correlation, and 0 no linear correlation. These scenarios are most easily demonstrated on a scatter graph. Note that the coefficient only measures the strength of relationship; it is not evidence of cause and effect.
In effect r measures how adequately a scatter of observations can be represented by a straight line. However it is more meaningful to interpret the strength of the relationship by looking at r² (known as the coefficient of determination, where 0 ≤ r² ≤ 1), which measures the proportion (or percentage) of variation in y which is explained by the variation in x. To focus on r tends to overstate the strength of the relationship (e.g. r = 0.7 seems to suggest a fairly strong relationship, but it explains less than 50% of the variation).
We can test the significance of the coefficient using the test statistic:

t = r√(N - 2) / √(1 - r²)

which has a t-distribution with N - 2 degrees of freedom.
Example
In a survey of 30 company employees, the correlation between length of service and age is 0.87207.
A scatter diagram of the data is shown below.
The details of the calculation for the value of the correlation coefficient are shown below:
service (y)   age (x)   y²     x²      xy
2             24        4      576     48
24            51        576    2601    1224
2             25        4      625     50
9             34        81     1156    306
12            40        144    1600    480
6             32        36     1024    192
1             23        1      529     23
5             28        25     784     140
3             21        9      441     63
4             33        16     1089    132
3             26        9      676     78
1             21        1      441     21
14            57        196    3249    798
5             27        25     729     135
4             23        16     529     92
13            45        169    2025    585
1             22        1      484     22
11            43        121    1849    473
8             35        64     1225    280
1             19        1      361     19
8             28        64     784     224
12            44        144    1936    528
13            40        169    1600    520
3             35        9      1225    105
9             48        81     2304    432
2             25        4      625     50
11            43        121    1849    473
4             24        16     576     96
5             30        25     900     150
6             35        36     1225    210
totals: Σy = 202   Σx = 981   Σy² = 2168   Σx² = 35017   Σxy = 7949
r = (30×7949 - 981×202) / √[(30×35017 - 981²) × (30×2168 - 202²)] = 40308 / √(88149 × 24236) = 0.87207
As the value of r = 0.87207, the value of r² is 0.7605. Thus 76% of the variation in the length of service can be explained by the variation in age.
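The hand calculation can be cross-checked in a few lines of Python, using only the column totals from the table above:

```python
import math

# Column totals from the worked table (N = 30 employees)
N = 30
sum_y, sum_x = 202, 981          # Σy (service), Σx (age)
sum_y2, sum_x2 = 2168, 35017     # Σy², Σx²
sum_xy = 7949                    # Σxy

# Pearson correlation from the computational formula
numerator = N * sum_xy - sum_x * sum_y
denominator = math.sqrt((N * sum_x2 - sum_x**2) * (N * sum_y2 - sum_y**2))
r = numerator / denominator       # ≈ 0.87207
r_squared = r**2                  # ≈ 0.7605
```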
The SPSS printout looks like:

Correlations
                                   AGE       SERVICE
AGE        Pearson Correlation     1         0.872**
           Sig. (1-tailed)         .         .000
           N                       30        30
SERVICE    Pearson Correlation     0.872**   1
           Sig. (1-tailed)         .000      .
           N                       30        30
**. Correlation is significant at the 0.01 level (1-tailed).
Thus r = 0.872071 and r² = 0.760508; therefore we have a strong positive correlation in which approximately 76% of the variation in length of service is explained by the variation in age. The remaining 24% will be due to other factors (e.g. not all employees will have worked their way up through the company; some will have joined from other companies, etc.). We can test the significance of this result by computing the test statistic and comparing it with critical values from the t-distribution. Our null hypothesis is that the true population correlation ρ = 0; our alternative hypothesis is that ρ > 0.
The test statistic is:

t = r√(N - 2) / √(1 - r²) = 0.87207 × √28 / √(1 - 0.7605) = 9.4294
Testing at a level of significance of 5%, the critical t-value is 1.701. Since the test statistic exceeds the critical value we can reject the null hypothesis, and conclude that we have evidence of a genuine linear relationship between length of service and age. Moreover, the fact that 9.4294 is much larger than 1.701 shows that we could adopt a much smaller level of significance and still reject the hypothesis of no relationship.
Note the output does not specifically provide the test statistic but does indicate that the correlation is statistically significant at a 1% level of significance.
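The arithmetic of this test can be checked with a short Python sketch (the one-tailed critical value 1.701 is taken from t-tables):

```python
import math

# Test H0: ρ = 0 against H1: ρ > 0 for the service/age data
N = 30
r = 0.87207

# Test statistic t = r√(N−2) / √(1−r²), with N−2 = 28 degrees of freedom
t_stat = r * math.sqrt(N - 2) / math.sqrt(1 - r**2)   # ≈ 9.43

# One-tailed critical value t(0.95, df = 28) ≈ 1.701, from tables
t_crit = 1.701
reject_h0 = t_stat > t_crit   # True: evidence of a genuine linear relationship
```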
Note we are using a one-tailed test of significance with alternative hypothesis ρ > 0; if r had been negative, our alternative hypothesis would be ρ < 0, the test statistic would also be negative, and the critical t-value would be on the negative side of the t-distribution. Thus the null hypothesis would be rejected if the test statistic were less than the critical value (i.e. closer to the tail). The test is therefore a complete mirror image of the example above.

Regression

Whilst correlation measures the extent to which there is a linear relationship between two variables, regression enables us to find the equation that describes that linear relationship. The equation of a regression line has the form:

y = a + bx

where y is the dependent variable (the one we wish to predict/explain) and x is the independent variable. The value a is known as the intercept of the line and b measures the gradient of the line. The relevant formulae are:

gradient:  b = (NΣxy - ΣxΣy) / (NΣx² - (Σx)²)
intercept: a = Σy/N - b × Σx/N

Looking at the calculation above, we can see that the numerator of b is the same as the numerator of r, and that the first bracket in the denominator of r is the same as the denominator of b. Thus:

b = 40308 / 88149 = 0.45727
a = 202/30 - 0.45727 × 981/30 = -8.2194

Thus the equation of the line linking length of service (y) and age (x) is:

y = -8.2194 + 0.45727x

This equation can then be used to make predictions. The SPSS regression output, with interpretation in italics, looks like:

Variables Entered/Removedb
Model   Variables Entered   Variables Removed   Method
1       agea                .                   Enter

Variables Entered/Removed
This simply tells us that age was the independent variable and service the dependent variable.

Model Summaryb
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .872a   .761       .752                2.629

Model Summary
The value of the correlation coefficient, r, was 0.872 and the value of r² was 0.761.

Coefficientsa
                 Unstandardized            Standardized                   95.0% Confidence Interval for B
Model            B        Std. Error       Beta         t        Sig.     Lower Bound   Upper Bound
1  (Constant)    -8.219   1.657                         -4.961   .000     -11.613       -4.826
   age           .457     .048             .872         9.429    .000     .358          .557
a. Dependent Variable: service

Coefficients
The unstandardized coefficients give us the values of a and b in the regression equation. Thus the equation here is y = -8.219 + 0.457x. The final column "Sig." gives values less than 0.01, thus we can say that the coefficients of the regression equation are significantly different from zero at the 1% level (and thus at the 5% level).

Casewise Diagnosticsa
Case Number   Std. Residual   service   Predicted Value   Residual
2             3.385           24        15.10             8.899
a. Dependent Variable: service

Casewise diagnostics
During the input dialogue, we asked for any standardised residuals outside the range -3 to +3. The output shows that one reading, case number 2, had a large standardised residual. This indicates that the point does not fit the general trend of the straight line and can be regarded as an outlier (i.e. an unusual reading). Case number 2 is an employee aged 51 with 24 years of service; on the scatter diagram, we can see that this point is a long way from the regression line.

Residuals Statisticsa
                       Minimum   Maximum   Mean   Std. Deviation   N
Predicted Value        .47       17.85     6.73   4.603            30
Residual               -4.785    8.899     .000   2.583            30
Std. Predicted Value   -1.361    2.414     .000   1.000            30
Std. Residual          -1.820    3.385     .000   .983             30
a. Dependent Variable: service

Residual Statistics
This table can be ignored for simple cases.

Rank Correlation

If we wish to measure the strength of relationship between two variables, and at least one of them is ordinal, we need to use a nonparametric measure of correlation in which the strength of relationship is based on the ranks of the data. There are two main measures of rank correlation, Kendall's τ statistic and Spearman's rank correlation coefficient; we shall focus on the latter. Observed values for x and y are replaced by their ranks, and the difference (d) in ranking between each set of paired observations is calculated. Spearman's coefficient (rs) is found from the following formula:

rs = 1 - 6Σd² / (N(N² - 1))

As with Pearson's coefficient, we can also test whether the result is statistically significant. Providing N ≥ 10, we can use the t statistic as before:

t = rs√(N - 2) / √(1 - rs²)

which has a t-distribution with N - 2 degrees of freedom. For larger samples (say N > 30) we can use the normal approximation z = rs√(N - 1).
Example 1 [thanks to Richard Charlesworth for this example]
A survey of MBA students included a question which asked the respondent to identify the most important factor in their choice of UNL for their MBA.
Factor                                                     Full-time   Part-time
MBA course programme/design                                34          25
Discussion with tutor at recruitment fair/open evening     9           10
Recommended by a colleague/friend                          22          18
Living in London                                           21          28
Credit transfer/flexible study                             0           9
Prior study at UNL                                         4           10
Photos of the staff in the brochure                        10          0
Factor                         FT (%)   PT (%)   FT rank   PT rank   d     d²
MBA programme/design           34       25       1         2         1     1
Discussion at fair/open eve    9        10       5         4.5       0.5   0.25
Recommendation                 22       18       2         3         1     1
Living in London               21       28       3         1         2     4
CATS/flexible study            0        9        7         6         1     1
Prior study at UNL             4        10       6         4.5       1.5   2.25
Photos of the staff            10       0        4         7         3     9
                                                                     Σd² = 18.50
Therefore rs = 1 - (6 × 18.5) / (7 × (49 - 1)) = 1 - 111/336 = 0.6696
Note that tied ranks are averaged (i.e. the scores of 10% for Part-time MBA jointly cover the rankings 4 and 5, so both take on the rank 4.5).
SPSS Output: Spearman's rank correlation coefficient

Nonparametric Correlations

Correlations
                                                    FT MBA   PT MBA
Spearman's rho   FT MBA   Correlation Coefficient   1        0.667
                          Sig. (1-tailed)           .        0.051
                          N                         7        7
                 PT MBA   Correlation Coefficient   0.667    1
                          Sig. (1-tailed)           0.051    .
                          N                         7        7
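The Spearman calculation can be checked in Python directly from the ranks in the worked table (the Part-time tie is averaged to 4.5, as noted above):

```python
# FT and PT ranks from the worked table (tie in PT averaged to 4.5)
ft_ranks = [1, 5, 2, 3, 7, 6, 4]
pt_ranks = [2, 4.5, 3, 1, 6, 4.5, 7]

n = len(ft_ranks)                                    # 7 factors
d2 = [(f - p) ** 2 for f, p in zip(ft_ranks, pt_ranks)]
sum_d2 = sum(d2)                                     # Σd² = 18.5

# Spearman's coefficient: rs = 1 − 6Σd² / (n(n² − 1))
rs = 1 - 6 * sum_d2 / (n * (n**2 - 1))               # ≈ 0.6696
```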
How to use SPSS for Correlation and Regression
Example: load up the SPSS file age&service.sav
To produce a correlation matrix
Choose Analyze, Correlate, Bivariate
Select age and service as the input variables. Check that Pearson is ticked. Choose OK
The resulting output looks like:
                               service   age
service   Pearson Correlation  1         .872**
          Sig. (2-tailed)                .000
          N                    30        30
age       Pearson Correlation  .872**    1
          Sig. (2-tailed)      .000
          N                    30        30
**. Correlation is significant at the 0.01 level (2-tailed).
To produce Regression output
Select Analyze, Regression, Linear
Select age as the independent variable
Select service as the dependent variable
Select enter as the method
Then select the statistics button
Select Casewise diagnostics
Outliers outside 3 standard deviations
tick Confidence Intervals, choose 95% as Level
then Continue
then OK
The output was shown previously
SPSS has the capacity to produce further analyses for regression, for example an analysis of the residuals is possible (and extremely useful).
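As a cross-check on the SPSS output, the regression coefficients can also be reproduced from the column totals of the age/service worked example in a few lines of Python:

```python
# Gradient and intercept from the column totals of the age/service example
N = 30
sum_y, sum_x = 202, 981        # Σy (service), Σx (age)
sum_x2, sum_xy = 35017, 7949   # Σx², Σxy

b = (N * sum_xy - sum_x * sum_y) / (N * sum_x2 - sum_x**2)  # gradient ≈ 0.45727
a = sum_y / N - b * sum_x / N                               # intercept ≈ -8.2194

# Prediction: expected length of service for an employee aged 35
predicted = a + b * 35
```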
Topic 7: Multiple Regression
In many situations we want to be able to use more than one independent variable to predict the value of the dependent variable. We use multiple regression in these cases.
For example, we might suspect that the price of a house depends not only on the number of bedrooms it has but also depends on the number of living rooms, number of bath/shower rooms, size of garden and so on. To predict the price of a house we would need a model that took into account all the significant factors.
We denote the independent variables as x1, x2, …, xk and the associated coefficients as b1, b2, …, bk. The constant of the regression equation is denoted by a.
Thus the regression equation is:
y = a + b1x1 + b2x2 + b3x3 + … + bkxk
The formulae for calculating the values of a , b1 , b2 , b3………. bk are too complicated to include here. We will focus on analysing the SPSS printouts.
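Although the formulae are too complicated to present by hand, the fit itself is just a least-squares calculation. The sketch below illustrates the idea in Python/NumPy on small made-up house-price figures (the data and the coefficient values 50, 30, 20 are purely hypothetical, constructed so the model fits exactly):

```python
import numpy as np

# Hypothetical data: price built to satisfy price = 50 + 30×bedrooms + 20×bathrooms
bedrooms  = np.array([2, 3, 3, 4, 4, 5], dtype=float)
bathrooms = np.array([1, 1, 2, 2, 3, 3], dtype=float)
price = 50 + 30 * bedrooms + 20 * bathrooms

# Design matrix with a column of ones for the constant a
X = np.column_stack([np.ones_like(bedrooms), bedrooms, bathrooms])
coeffs, *_ = np.linalg.lstsq(X, price, rcond=None)
a, b1, b2 = coeffs   # least squares recovers 50, 30, 20
```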
t-test on Regression Coefficients
We can use the t-test to decide if the coefficients of the regression line are significant
H0: βi = 0 [coefficient = 0 :no significant contribution to linear relationship]
H1: βi ≠ 0
The test statistic is given by: t = bi / (Std Error of bi)
We reject H0 if |t| > t(α/2, n-2) [found from statistics tables].
As before, we can reject H0 if the value of p < 0.025 in the SPSS printout.
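As a sketch of the mechanics, using the coefficient b = 0.457 and standard error 0.048 from the simple age/service regression earlier (n = 30; the two-tailed critical value is taken from t-tables):

```python
# t-test on a regression coefficient: t = b / SE(b)
b, se_b = 0.457, 0.048
n = 30

t_stat = b / se_b    # ≈ 9.52 (SPSS reports 9.429, computed from unrounded values)

# Two-tailed 5% critical value t(0.025, n−2 = 28) ≈ 2.048, from tables
t_crit = 2.048
reject_h0 = abs(t_stat) > t_crit   # coefficient significantly different from zero
```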
Multicollinearity
If a regression equation contains two or more “independent” variables that have a strong linear relationship between them, then the model has “multicollinearity”. This means that the independent variables are not really independent in the sense that they are related to each other.
This strong linear relationship between two or more of the independent variables may make the estimates of the coefficients unreliable and may, in fact, make some coefficients negative when they should be positive (or vice versa).
We can check for collinearity by looking at the correlation matrix.
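A quick way to screen for this in Python/NumPy is sketched below on made-up data; the variables and the 0.8 cut-off are illustrative assumptions, not part of the example that follows (x3 is deliberately constructed to be almost a multiple of x1):

```python
import numpy as np

# Hypothetical predictor columns; x3 is close to 2×x1, so the pair should be flagged
x1 = np.array([10.0, 12.0, 15.0, 18.0, 20.0, 25.0])
x2 = np.array([3.0, 7.0, 4.0, 9.0, 2.0, 6.0])
x3 = 2 * x1 + np.array([0.1, -0.2, 0.15, 0.0, -0.1, 0.05])

names = ["x1", "x2", "x3"]
corr = np.corrcoef([x1, x2, x3])   # 3×3 correlation matrix

# Flag any pair of "independent" variables with |r| above 0.8
flagged = [(names[i], names[j])
           for i in range(3) for j in range(i + 1, 3)
           if abs(corr[i, j]) > 0.8]
```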
Multiple Regression Example:
London Theatres
Data was obtained from the Society of London Theatres about various statistics recorded for London Theatres from 1986 to 2010. The data set was obtained from http://www.solt.co.uk/downloads/pdfs/theatreland/2010-Graph2
The full data file is shown overleaf.
year   attendances     gross box office       average no. of   no. of         no. of new
       (thousands)     revenue (£ thousands)  theatres open    performances   productions
1986   10236           112068                 42               16543          213
1987   10881           129589                 42               16603          212
1988   10897           139338                 43               16970          28
1989   10945           153251                 42               16436          237
1990   11321           177904                 40               15887          187
1991   10905           186790                 39               15508          192
1992   10900           194772                 41               15916          193
1993   11503           215619                 41               15922          198
1994   11163           217763                 41               16063          208
1995   11938           238741                 43               17163          208
1996   11179           229017                 41               16084          186
1997   11466           246082                 39               15568          195
1998   11925           257920                 41               16018          207
1999   11931           266565                 44               17089          265
2000   11555           286556                 43               16633          252
2001   11735           298989                 44               17035          264
2002   12064           327972                 44               17090          221
2003   11585           321485                 42               16664          225
2004   12025           343674                 43               17235          225
2005   12319           383942                 45               17406          221
2006   12351           400853                 43               16912          268
2007   13636           469939                 44               17455          243
2008   13892           483349                 45               18275          241
2009   14257           504984                 45               17923          260
2010   14152           512332                 46               18615          264
We are going to use multiple regression to produce a model that will enable the value of gross box office revenue to be explained by the other variables.
There are two basic approaches to multiple regression, top-down and bottom-up.
With top-down regression, we start by using all the independent variables in our model and then successively eliminate those whose values of bi are not significant (Sig. > 0.025) or that are highly correlated with other variables (multicollinearity).
Bottom-up approaches successively add variables to the model until no improvement can be made. The order in which the variables are added is often the order of the values of the correlation coefficients with the dependent variable (variable with the highest r is used first, then variable with second highest r is added and so on) excluding multicollinearity at each stage.
For both of these approaches we need to know what the values of the various correlation coefficients are. This can be obtained by using the correlate command in SPSS .
Note: SPSS highlights any correlations that are significant by the use of * and **.
We get:
Correlations
                                      attendance   revenue   theatresopen   performances   newproductions
attendance      Pearson Correlation   1            .954**    .720**         .803**         .489*
                Sig. (2-tailed)                    .000      .000           .000           .013
                N                     25           25        25             25             25
revenue         Pearson Correlation   .954**       1         .715**         .771**         .559**
                Sig. (2-tailed)       .000                   .000           .000           .004
                N                     25           25        25             25             25
theatresopen    Pearson Correlation   .720**       .715**    1              .955**         .397*
                Sig. (2-tailed)       .000         .000                     .000           .050
                N                     25           25        25             25             25
performances    Pearson Correlation   .803**       .771**    .955**         1              .367
                Sig. (2-tailed)       .000         .000      .000                          .071
                N                     25           25        25             25             25
newproductions  Pearson Correlation   .489*        .559**    .397*          .367           1
                Sig. (2-tailed)       .013         .004      .050           .071
                N                     25           25        25             25             25
**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).
From this correlation matrix we can see that:
· All of the other variables are correlated with revenue
· Attendance is highly correlated with the number of theatres open and the number of performances
As we want to use “revenue” as our dependent variable, we can guess that “attendance” will be in the model and at most one of “theatres open” and “performances”. We would not expect both “theatres open” and “performances” to be included as they are highly correlated and thus not really independent.
“Top-Down” Approach
The “top-down” approach initially involves using all four independent variables in the model.
The relevant printout is shown below with some comments in italics.
The SPSS commands are summarised below
The output is shown below:
Variables Entered/Removedb
Model   Variables Entered                                        Variables Removed   Method
1       newproductions, performances, attendance, theatresopen   .                   Enter
a. All requested variables entered.
b. Dependent Variable: revenue

Variables Entered/Removed
All four possible independent variables have been added using the "Enter" method; revenue is the dependent variable.
Model Summaryb
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .961a   .924       .909                36143.148
a. Predictors: (Constant), newproductions, performances, attendance, theatresopen
b. Dependent Variable: revenue

Model Summary
The value of r = 0.961 and r² = 0.924. However, for multiple regression models it is usual to use the adjusted R Square value of 0.909 for r², as this value takes into account the number of variables being used as well as the strength of the correlation.
ANOVAb
Model          Sum of Squares   df   Mean Square   F        Sig.
1  Regression  3.166E11         4    7.915E10      60.586   .000a
   Residual    2.613E10         20   1.306E9
   Total       3.427E11         24
ANOVA
The F statistic of 60.586 indicates that the model as a whole is significant (Sig. < 0.025).

Coefficientsa
                    Unstandardized                 Standardized
Model               B              Std. Error      Beta           t        Sig.
1  (Constant)       -1043627.767   177852.629                     -5.868   .000
   attendance       101.514        13.178          .913           7.703    .000
   theatresopen     12735.681      14486.506       .200           .879     .390
   performances     -28.361        39.178          -.191          -.724    .478
   newproductions   260.547        186.511         .104           1.397    .178
a. Dependent Variable: revenue

Coefficients
The regression equation is given by:

revenue = -1043627.767 + 101.514 × attendance + 12735.681 × theatresopen - 28.361 × performances + 260.547 × newproductions

Looking at the p values (Sig. column):
· the coefficient of the constant is significantly different from zero (p = 0.000 < 0.025)
· the coefficient of attendance is significantly different from zero (p = 0.000 < 0.025)
· the coefficient of theatres open is NOT significantly different from zero (p = 0.390)
· the coefficient of performances is NOT significantly different from zero (p = 0.478)
· the coefficient of new productions is NOT significantly different from zero (p = 0.178)

This implies that we should get a better model if we went through a process of deleting variables one by one. The first one to delete would be "performances", as it has the highest value of p and is correlated with other variables.

Residuals Statisticsa
                       Minimum      Maximum     Mean        Std. Deviation   N
Predicted Value        116690.98    536192.75   283979.76   114851.420       25
Residual               -50616.152   68157.070   .000        32994.029        25
Std. Predicted Value   -1.457       2.196       .000        1.000            25
Std. Residual          -1.400       1.886       .000        .913             25
a. Dependent Variable: revenue

Residual Statistics
The maximum standardised residual is 1.886 and the minimum is -1.400. Thus all points are fairly close to the regression line and there are no outliers.

Janet Geary 2012 Page 84

Conclusion: try another model that leaves out the variable "performances". However, we will not go through the whole procedure ourselves, as we can utilise a special facility in SPSS that uses a bottom-up procedure.

Bottom-Up Approach (SPSS STEPWISE facility)

From the correlation matrix, we can see that the correlation coefficients with revenue, in descending order, are:
1. attendance (0.954)
2. performances (0.771)
3. theatres open (0.715)
4. new productions (0.559)

Attendance is highly correlated with the number of theatres open and the number of performances, so we have a case of multicollinearity here: we are unlikely to use all three of these in the "best" model.

The rest of this printout is the result of the stepwise multiple regression command. NB: the user did not have to specify the order in which the variables were introduced. The "stepwise" command works by introducing the variables one at a time, based on the value of the correlation coefficients, and stops when it cannot find a "better" model. Thus the last model produced by stepwise is considered the best model to use for predictions. We can also choose to have some useful plots here. The stepwise process is summarised in the final output.

Variables Entered/Removeda
Model   Variables Entered   Variables Removed   Method
1       attendance          .                   Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100).
a. Dependent Variable: revenue
The final (and best) model only uses “attendance” to predict the gross box office revenue.
Model Summaryb
Model
R
R Square
Adjusted R Square
Std. Error of the Estimate
1
.954a
.909
.905
36753.551
a. Predictors: (Constant), attendance
b. Dependent Variable: revenue
The value of adjusted r2 is high at 0.905
ANOVAb
Model          Sum of Squares   df   Mean Square   F         Sig.
1  Regression  3.116E11         1    3.116E11      230.702   .000a
   Residual    3.107E10         23   1.351E9
   Total       3.427E11         24
a. Predictors: (Constant), attendance
b. Dependent Variable: revenue
The regression model as a whole is significant as F has p=0.000
Coefficientsa
                 Unstandardized               Standardized
Model            B             Std. Error     Beta           t         Sig.
1  (Constant)    -974724.442   83195.461                     -11.716   .000
   attendance    106.037       6.981          .954           15.189    .000
a. Dependent Variable: revenue
The regression equation is revenue = -974724 + 106.037×attendance
Both of the coefficients of the constant and attendance are significantly different from zero.
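This stepwise model can be reproduced by an ordinary least-squares fit in Python/NumPy, using the attendance and revenue columns from the data table above (a sketch; SPSS reports b ≈ 106.037 and a ≈ -974724):

```python
import numpy as np

# Attendance (thousands) and gross box office revenue (£ thousands), 1986-2010
attendance = np.array([10236, 10881, 10897, 10945, 11321, 10905, 10900, 11503,
                       11163, 11938, 11179, 11466, 11925, 11931, 11555, 11735,
                       12064, 11585, 12025, 12319, 12351, 13636, 13892, 14257,
                       14152], dtype=float)
revenue = np.array([112068, 129589, 139338, 153251, 177904, 186790, 194772,
                    215619, 217763, 238741, 229017, 246082, 257920, 266565,
                    286556, 298989, 327972, 321485, 343674, 383942, 400853,
                    469939, 483349, 504984, 512332], dtype=float)

# Simple linear regression: revenue = a + b × attendance
b, a = np.polyfit(attendance, revenue, 1)

# Prediction at 2010's attendance level, in £ thousands
predicted_2010 = a + b * 14152
```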
Excluded Variablesb
                         Beta In   t       Sig.   Partial       Collinearity Statistics
Model                                             Correlation   Tolerance
1  theatresopen          .060a     .650    .523   .137          .482
   performances          .013a     .124    .902   .026          .355
   newproductions        .122a     1.770   .091   .353          .761
a. Predictors in the Model: (Constant), attendance
b. Dependent Variable: revenue
Residuals Statisticsa
                       Minimum      Maximum     Mean        Std. Deviation   N
Predicted Value        110668.88    537043.06   283979.76   113951.373       25
Residual               -52402.609   67772.398   .000        35979.706        25
Std. Predicted Value   -1.521       2.221       .000        1.000            25
Std. Residual          -1.426       1.844       .000        .979             25
a. Dependent Variable: revenue
The minimum standardised residual is -1.426 and the maximum is 1.844 so there are no outliers
The histogram of the residuals closely approximates to a normal distribution, thus the model is a good one.
The points on the p-p plot are quite close to the diagonal line and do not really exhibit any particular pattern.
A scatter graph of the standardised residuals and the standardised predicted values is evenly scattered above and below the zero line and there is no particular pattern here.
Topic 7: Interpreting Regression Output
Using the boats data for multiple regression in SPSS
Data was collected on the prices charged for weekly boat hire at Easter and in the Summer from a number of boatyards on the Norfolk Broads. For each boat, the fields shown below were recorded.
Data definitions:
We want to see how the maximum price charged during Easter for a week’s hire is related to the attributes of the boat (length, width, number of fixed berths, maximum number of berths )
The correlation matrix gives:
Correlations
                                              length   width    max      fixed    max Easter
                                                                berths   berths   price
length in metres        Pearson Correlation   1        .688**   .769**   .828**   .857**
                        Sig. (2-tailed)                .000     .000     .000     .000
                        N                     112      112      112      112      112
width in metres         Pearson Correlation   .688**   1        .498**   .492**   .537**
                        Sig. (2-tailed)       .000              .000     .000     .000
                        N                     112      112      112      112      112
maximum number          Pearson Correlation   .769**   .498**   1        .891**   .748**
of berths               Sig. (2-tailed)       .000     .000              .000     .000
                        N                     112      112      112      112      112
number of fixed         Pearson Correlation   .828**   .492**   .891**   1        .863**
berths                  Sig. (2-tailed)       .000     .000     .000              .000
                        N                     112      112      112      112      112
maximum Easter          Pearson Correlation   .857**   .537**   .748**   .863**   1
price                   Sig. (2-tailed)       .000     .000     .000     .000
                        N                     112      112      112      112      112
**. Correlation is significant at the 0.01 level (2-tailed).
There appears to be some multi-collinearity here as there are strong correlations between some of the “independent” variables. In particular, “number of fixed berths” is strongly correlated with “maximum number of berths”. Thus we would not expect both of these variables to be present in a “good” model.
Using the step-wise approach:
Variables Entered/Removeda
Model   Variables Entered        Variables Removed   Method
1       number of fixed berths   .                   Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100).
2       length in metres         .                   Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100).
a. Dependent Variable: maximum Easter price
The first model just used “number of fixed berths” as this has the highest correlation with maximum Easter price. To this model, the variable “length in metres“ was added. No further additions were made as the other variables correlated strongly with these two.
Model Summaryc
                                            Std. Error of    Change Statistics
Model   R       R Square   Adj. R Square    the Estimate     R Square Change   F Change   df1   df2   Sig. F Change
1       .863a   .745       .743             67.792           .745              322.149    1     110   .000
2       .900b   .810       .806             58.845           .064              36.988     1     109   .000
a. Predictors: (Constant), number of fixed berths
b. Predictors: (Constant), number of fixed berths, length in metres
c. Dependent Variable: maximum Easter price
The value of adjusted r² was 0.743 for model 1 (number of fixed berths). This rose to 0.806 when "length in metres" was added to model 1 to form model 2.
Coefficientsa
                             Unstandardized          Standardized                     Collinearity Statistics
Model                        B         Std. Error    Beta           t        Sig.     Tolerance   VIF
1  (Constant)                263.311   15.076                       17.466   .000
   number of fixed berths    62.209    3.466         .863           17.949   .000     1.000       1.000
2  (Constant)                12.597    43.251                       .291     .771
   number of fixed berths    35.206    5.363         .489           6.564    .000     .315        3.178
   length in metres          33.623    5.528         .453           6.082    .000     .315        3.178
a. Dependent Variable: maximum Easter price
In model 1, the coefficient of number of fixed berths is significantly different from zero (p = 0.000 < 0.05). In model 2, both coefficients are significantly different from zero (p = 0.000 in both cases).
Casewise Diagnosticsa
Case Number   Std. Residual   maximum Easter price   Predicted Value   Residual
15            3.397           945                    745.12            199.877
a. Dependent Variable: maximum Easter price
The only real outlier is case number 15 as it is the only point with a standardised residual outside the range -3 to + 3.
Residuals Statisticsa
                       Minimum    Maximum   Mean     Std. Deviation   N
Predicted Value        328.79     806.65    508.26   120.382          112
Residual               -158.095   199.877   .000     58.313           112
Std. Predicted Value   -1.491     2.479     .000     1.000            112
Std. Residual          -2.687     3.397     .000     .991             112
a. Dependent Variable: maximum Easter price
The graph of the residuals does roughly resemble a normal distribution curve. Thus the assumption that the residuals are normally distributed seems reasonable.
The points are close to the straight line and there is no real pattern in the points, thus the assumption of normality seems reasonable.
The residuals are fairly evenly spread about the horizontal line through zero. The difference from the zero line does not change much as the standardised predictions increase.
Conclusion : This is a good model.
Compare this output with one that includes all possible variables.
Variables Entered/Removedb
Model   Variables Entered                                                                     Variables Removed   Method
1       maximum number of berths, width in metres, length in metres, number of fixed berths   .                   Enter
a. All requested variables entered.
b. Dependent Variable: maximum Easter price
Model Summaryb
                                            Std. Error of    Change Statistics
Model   R       R Square   Adj. R Square    the Estimate     R Square Change   F Change   df1   df2   Sig. F Change
1       .903a   .816       .809             58.396           .816              118.850    4     107   .000
a. Predictors: (Constant), maximum number of berths, width in metres, length in metres, number of fixed berths
b. Dependent Variable: maximum Easter price
ANOVAb
Model          Sum of Squares   df    Mean Square   F         Sig.
1  Regression  1621146.826      4     405286.707    118.850   .000a
   Residual    364878.665       107   3410.081
   Total       1986025.491      111
a. Predictors: (Constant), maximum number of berths, width in metres, length in metres, number of fixed berths
b. Dependent Variable: maximum Easter price
The F statistics is significant and thus indicates that there is a linear relationship between maximum Easter price and at least one of the other variables
Coefficientsa
                              Unstandardized          Standardized                     Collinearity Statistics
Model                         B         Std. Error    Beta           t        Sig.     Tolerance   VIF
1  (Constant)                 34.804    73.494                       .474     .637
   length in metres           36.055    6.704         .485           5.378    .000     .211        4.746
   width in metres            -8.502    27.532        -.018          -.309    .758     .503        1.987
   number of fixed berths     44.915    7.726         .623           5.814    .000     .149        6.696
   maximum number of berths   -10.983   5.921         -.172          -1.855   .066     .201        4.985
a. Dependent Variable: maximum Easter price
(If we were building a model ourselves, we would eliminate "width in metres" and "maximum number of berths": their coefficients are not significantly different from zero and their signs are not what we would expect, since we would expect wider boats with more berths to command a higher price.)
Casewise Diagnosticsa
Case Number   Std. Residual   maximum Easter price   Predicted Value   Residual
15            3.191           945                    758.64            186.357
a. Dependent Variable: maximum Easter price
Residuals Statisticsa
                       Minimum    Maximum   Mean     Std. Deviation   N
Predicted Value        336.28     780.69    508.26   120.851          112
Residual               -147.015   186.357   .000     57.334           112
Std. Predicted Value   -1.423     2.254     .000     1.000            112
Std. Residual          -2.518     3.191     .000     .982             112
a. Dependent Variable: maximum Easter price
Features of a Good Linear Regression Model [SPSS printout]

1. Model Summary: a high value of adjusted r². [A low value indicates a poor linear fit.]
2. Table of Correlations: the model chosen should not contain independent variables that are highly correlated with each other. [The model should not exhibit any multicollinearity.]
3. Coefficients table: all coefficients significantly different from zero, i.e. the Sig. column has values below 0.025. [If a coefficient is close to zero it adds nothing useful to the model.]
4. ANOVA: the F statistic is significantly different from zero, indicating that there is a relationship between the dependent variable and at least one of the independent variables.
5. Casewise diagnostics: only 5% of readings are outside the range -2 to +2 in the 'Std. Residual' column, and any outliers are outside the range -3 to +3. [If there are a number of outliers, then the data points could just be 'odd' cases or could indicate the lack of a linear relationship.]
6. Histogram of residuals: the histogram fits the superimposed Normal curve (or is close to it). [If the histogram does not approximate to the Normal curve, the residuals are not normally distributed and the model is not a good fit; consider the possibility of a non-linear relationship.]
7. Normal p-p plot: points are scattered about the diagonal line, close to the diagonal line, and do not exhibit any pattern. [If the points are not as stated, the model is not a good fit; consider the possibility of a non-linear relationship.]
Seminar Exercise
Using the SPSS data file boats.sav, produce the best multiple regression model for predicting the maximum summer prices.
Write a report on your printouts.
Topic 8 : Graphical Linear Programming
Linear Programming methods can be used to solve problems where we wish to maximise (or minimise) a linear function subject to a number of linear constraints.
Graphical linear programming can be used when there are only two variables.
The stages of graphical linear programming are:
1. Formulation of the problem.
This involves translating a description of a problem into a mathematical format. In particular, the linear constraints and objective function have to be generated.
2. Determination of the set of feasible solutions.
This involves drawing a graph to determine the region where all the constraints are met.
3. Finding the optimal solution.
This involves finding the “best” feasible solution.
Example: The Soft Toy Company
A manufacturer of expensive soft toys makes giant teddy bears and fluffy rabbits.
Each teddy bear has a contribution to profit of £30 whilst each rabbit has a contribution of £40. (They are very expensive.)
The manufacturer wants to determine which combination of teddy bears and rabbits should be made in order to maximise the contribution.
Each teddy bear requires 2 hours of machining and 2 hours of hand labour whilst each rabbit requires 1 hour of machining and 3 hours of hand labour.
Each teddy bear and each rabbit requires 1 kilogram of stuffing.
The manufacturer has certain limitations on the possible production. In particular, there are only 50 hours of machining and 90 hours of hand labour available each week.
Stuffing is in short supply, so the manufacturer can only rely on 40 kilogrammes each week.
Determine which combination of teddy bears and rabbits should be produced in order to maximise profit.
Problem Formulation : Summarising the details given above:
machine (hour)
labour (h)
stuffing(kg)
teddy bear
2
2
1
rabbit
1
3
1
total(week)
50
90
40
Defining the Variables:
Let x be the number of teddy bears produced and sold each week
Let y be the number of rabbits produced and sold each week.
Formulating the Objective Function:
Contribution is C = 30x + 40y
so our objective function is Max 30x + 40y
Formulate Constraints:
machine hours: 2x + y ≤ 50
hand labour: 2x + 3y ≤ 90
stuffing: x + y ≤ 40
As we cannot have a negative number of toys we should include: x ≥ 0 y ≥ 0
Formulation Summary:
max 30x + 40y
such that
2x + y ≤ 50
2x + 3y ≤ 90
x + y ≤ 40
x ≥ 0
y ≥ 0
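This formulation can also be checked with a solver. The notes use WinQSB; as an alternative sketch (an assumption, not part of the original materials), the same problem can be passed to SciPy's linprog. Note that linprog minimises, so the contribution is negated.

```python
from scipy.optimize import linprog

# Soft Toy formulation. linprog minimises, so negate the 30x + 40y
# contribution to turn maximisation into minimisation.
c = [-30, -40]
A_ub = [[2, 1],    # machine hours: 2x +  y <= 50
        [2, 3],    # hand labour:   2x + 3y <= 90
        [1, 1]]    # stuffing:       x +  y <= 40
b_ub = [50, 90, 40]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2,
              method="highs")
x, y = res.x
print(x, y, -res.fun)   # 15 teddy bears, 20 rabbits, contribution 1250
```

The solution agrees with the graphical answer found below: 15 teddy bears and 20 rabbits.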
Determining the Feasible Region
To draw a graph of the feasible region we have to consider each constraint in turn.
Machine hours: 2x + y ≤ 50
In order to draw the line 2x + y = 50, we need two points.
Choosing x = 0 and then y = 0 for our two points we get:
x = 0: 2(0) + y = 50, so y = 50
y = 0: 2x + 0 = 50, so x = 25
The graph now looks like:
We now have to decide which side of the line is required, i.e. on which side of the line is
2x + y actually less than 50.
In general, if the constraint in x and y includes ≤ the required area will be to the “bottom left” of the line. If the constraint includes ≥ the required area will be to the “top right”.
Hand Labour 2x + 3y ≤ 90
x = 0: 2(0) + 3y = 90, so y = 30
y = 0: 2x + 3(0) = 90, so x = 45
Again we will want the area under the line.
The graph now looks like:
Stuffing: x + y ≤ 40
x = 0: 0 + y = 40, so y = 40
y = 0: x + 0 = 40, so x = 40
We want the area under this line as well.
The two further constraints x ≥ 0 and y ≥ 0 can be included to give the graph shown:
In this graph the feasible region can be shaded. It is the region where all the constraints are satisfied.
Optimising
In this example we wish to maximise the objective function: 30 x + 40y
Method 1:
We can use the gradient of the line in order to draw an objective function line.
In general, any line with equation ax + by = c has gradient -a/b.
The line 30x + 40y = c therefore has gradient -30/40 = -3/4.
Thus we can draw any line with such a gradient (3 down and 4 along) as our initial objective function line. This line is then moved parallel to find the optimal solution.
Method 2
Linear programming theory tells us that the optimal solution will always lie at a vertex (corner) of the feasible region.
Trying each vertex in turn:
vertex               value of objective function
x = 0,  y = 0        30x + 40y = 0
x = 0,  y = 30       30x + 40y = 1200
x = 15, y = 20       30x + 40y = 1250
x = 25, y = 0        30x + 40y = 750
The highest value of the objective function is found at: x = 15, y = 20.
Production Plan.
In order to maximise the value of the weekly contribution, the manufacturer should produce and sell 15 teddy bears and 20 rabbits each week.
This will generate a weekly contribution of £1250
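Method 2 can also be carried out mechanically: intersect every pair of constraint lines, discard infeasible points, and evaluate the objective at each remaining vertex. A pure-Python sketch of this (not from the notes):

```python
# Vertex enumeration for the Soft Toy problem: each constraint is
# stored as a*x + b*y <= c, including x >= 0 and y >= 0.
from itertools import combinations

cons = [(2, 1, 50), (2, 3, 90), (1, 1, 40), (-1, 0, 0), (0, -1, 0)]

def feasible(x, y, tol=1e-9):
    return all(a*x + b*y <= c + tol for a, b, c in cons)

best = None
for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
    det = a1*b2 - a2*b1
    if det == 0:
        continue                       # parallel lines: no intersection
    x = (c1*b2 - c2*b1) / det          # Cramer's rule for the 2x2 system
    y = (a1*c2 - a2*c1) / det
    if feasible(x, y):
        value = 30*x + 40*y
        if best is None or value > best[0]:
            best = (value, x, y)

print(best)   # (1250.0, 15.0, 20.0)
```

Only feasible intersection points are kept, and the best of these is the optimal vertex, confirming the table above.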
Example: Camping Trip
A youth club is planning a camping trip.
Two sizes of tent are available 4-person and 8-person tents.
There are 64 people who want to go on the trip, but there is only room on the site for 13 tents. Only eight 4-person tents are available.
If each 4-person tent costs £15 a night and each 8-person tent costs £45 per night, how many of each type of tent minimises the nightly cost?
Formulation:
Let x be the number of 4-person tents used and let y be the number of 8-person tents used.
Minimise cost: Cost = 15x + 45y
Total number of tents: x + y ≤ 13
number of 4-person tents: x ≤ 8
people accommodated 4x + 8y ≥ 64
x ≥ 0,
y ≥ 0
Janet Geary 2012 Page 88
Drawing:
x + y = 13:    when x = 0, y = 13; when y = 0, x = 13
4x + 8y = 64:  when x = 0, y = 8;  when y = 0, x = 16
The graph looks like:
[WinQSB graph]
The objective function is 15x + 45y, which has gradient -15/45 = -1/3.
Thus we can draw a line anywhere with this gradient (1 down and 3 along).
As we wish to minimise the objective function, we must move the objective function line towards the origin.
The optimal solution is: x = 8, y = 4
The cost will be 15×8 + 45×4 = 300
Thus the trip organisers should book 8 4-person tents and 4 8-person tents.
This will cost £300 per night and will allow 64 people to be accommodated.
A total of 12 tents are used.
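As a cross-check (a sketch using SciPy rather than the WinQSB package the notes assume), the minimisation can be solved directly. The ≥ constraint for people accommodated is negated to fit linprog's A_ub x ≤ b_ub form.

```python
from scipy.optimize import linprog

# Camping trip: minimise nightly cost 15x + 45y.
c = [15, 45]
A_ub = [[ 1,  1],    # total tents: x + y <= 13
        [ 1,  0],    # 4-person tents: x <= 8
        [-4, -8]]    # people: 4x + 8y >= 64, written as -4x - 8y <= -64
b_ub = [13, 8, -64]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2,
              method="highs")
print(res.x, res.fun)   # 8 four-person tents, 4 eight-person tents, cost 300
```

This reproduces the graphical answer: 8 four-person and 4 eight-person tents at £300 per night.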
Seminar Sheet
1. A company manufactures two types of sweatshirts: hooded and round neck.
Each hooded sweatshirt makes a contribution to profit of £4 and each round neck sweatshirt makes a contribution of £3.
Each hooded sweatshirt requires 1 hour of labour and each round neck requires 2 hours. There are 110 hours of labour available each day. There are limitations in the production capacity so that only 70 hooded sweatshirts can be made in a day.
The company wishes to maximise the contribution from these sweatshirts.
a) Formulate as a linear programming problem.
b) Determine the number of hooded and round neck sweatshirts that should be made each day in order to maximise contribution.
c) What is the value of the maximum daily contribution?
2. A health enthusiast would like to organise his food consumption of two diet supplements Vita and Glow so that his minimum daily requirement of three basic nutrients A, B and C is satisfied.
The minimum daily requirements are 14 units of A, 12 units of B and 18 units of C.
Product Vita has 2 units of A and one unit each of B and C in each packet.
Product Glow has one unit each of A and B and 3 units of C in each packet.
The price of Vita is 20p and the price of Glow is 40p per packet.
The health enthusiast wants to determine the level of consumption of Vita and Glow that will minimise expenditure whilst satisfying the minimum daily requirements.
a) Formulate the description given above as a Linear Programming problem.
b) Advise the user on the best combination of Vita and Glow to use.
c) What is the minimum cost?
3. A furniture manufacturer makes two types of tables: Traditional and Modern.
Each traditional table requires 6 hours of cutting time, 5 hours of sanding time and 2 hours for staining.
Each traditional table sold gives a contribution to profit of £50.
Each modern table requires 2 hours of cutting time, 5 hours of sanding time and 4 hours of staining time.
Each modern table sold makes a contribution of £30.
Each day the manufacturer has available 36 hours of cutting time,40 hours of sanding time and 28 hours of staining time.
All other inputs are available as required. The company can sell all the tables it makes.
a) Find the number of each type of table that should be made in order to maximise the contribution.
Find the maximum contribution possible.
4. A clothes manufacturer offers two versions of a particular t-shirt; one printed and the other plain.
The manufacturing requirements are:
Cutting and printing time: The printed t-shirt takes 9 minutes each whilst each plain t-shirt takes only 3 minutes to cut and print. There are 360 minutes available for cutting and printing each day.
Each printed t-shirt takes 5 minutes for sewing and packing whilst each plain t-shirt takes only 3 minutes. There are 240 minutes available for sewing and packing of these t-shirts each day.
A contract with a local shop requires a minimum of 12 printed t-shirts to be produced each day.
The manufacturer makes a contribution to profit of £6 from the manufacture and sale of each printed t-shirt and £5 for each plain t-shirt.
The manufacturer wishes to maximise this contribution to profit.
Formulate the scenario described above as a linear programming problem. You should clearly indicate the meaning of each constraint and the meaning of any variables.
Using a graphical method, or otherwise, determine which combination of printed and plain t-shirt the manufacturer should produce and sell in order to maximise contribution to profit.
5. A small engineering company makes two types of engine parts, coded as part A and part B.
Part A has a contribution of £30 per unit and part B £40. The company wishes to establish the weekly production plan which maximises contribution.
Production data are as follows:
                           machining (hours)   labour (hours)   materials (kg)
part A                             4                 4                 1
part B                             2                 6                 1
total available per week         100               180                40
Because of a trade agreement, sales of part A are limited to a weekly maximum of 20 units and to honour an agreement with an old established customer at least 10 units of part B must be made each week.
Formulate the scenario described above as a linear programming problem. You should clearly indicate the meaning of each constraint and the meaning of any variables.
Using a graphical method, or otherwise, determine which combination of part A and part B the company should produce and sell in order to maximise contribution to profit.
What is the expected contribution to profit?
Answers:
1. a) max 4x + 3y
such that x + 2y ≤ 110
and x ≤70
b) 70 hooded and 20 round neck c) £340
2. a) min 20 x + 40y
such that 2x + y ≥ 14
and x + y ≥ 12
and x + 3y ≥18
b) 9 packets of Vita and 3 packets of Glow c) £3
3 max 50x + 30y
such that 6x + 2y ≤ 36
5x + 5y ≤ 40
2x + 4y ≤ 28
a) 5 traditional and 3 modern each day.
b) Maximum contribution: £340
4 max 6x + 5y
such that 9x + 3y ≤ 360
5x + 3y ≤ 240
x ≥12
Produce 12 printed and 60 plain t-shirts each day
5 Max 30A + 40B
such that 4A + 2B ≤ 100
4A + 6B ≤ 180
A + B ≤ 40
A ≤20 B ≥10
Produce 15 units of part A and 20 units of part B each week. The expected contribution is £1250 per week from the sale of these parts.
Topic 9 Linear Programming-Shadow Prices and Sensitivity Analysis
Consider the following problem:
A company makes two kinds of armchair; Model A with loose covers and Model B with fitted covers only.
The company estimates that it can sell as many of the armchairs as it can make. Management must now determine the production targets for the next few months in order to maximise profit.
The company knows that it will make a profit of £50 on each type A model and £40 on each type B model.
As Model A chairs require extra packing and loose covers, the company can only manage to make a maximum of 8 Model A chairs a day.
In the machining department, where the wood for the chairs is shaped, each Model A armchair requires 1 hour whilst each Model B requires 1.5 hours. There are 15 hours of shaping time available each day.
In the upholstery department each Model A requires 3 hours and each Model B requires 2 hours. There is a total of 30 hours available for upholstering each day.
Determine the optimal production plan.
Furthermore answer the following questions:
1. If an extra hour each day could be made available for shaping the wood, what would this be worth?
2. If an extra hour of upholstery time could be made available, by how much would profit be increased?
3. By how much could the profit on type A chairs alter before the original optimal plan changes?
4. If the original solution is to remain optimal, by how much could the profit from type B change?
Model:
Let x be the number of model A armchairs produced each day.
Let y be the number of model B armchairs produced each day.
maximise profit: max 50x + 40y
subject to:
max A: x ≤ 8
shaping: x + 1.5y ≤ 15
upholstery: 3x + 2 y ≤ 30
and x ≥ 0, y ≥ 0
The outline graph of the problem is given below:
The objective function 50x + 40y has gradient -50/40 = -5/4.
[Any line ax + by = c has gradient -a/b]
We can thus draw a profit line with this gradient and move it “top-right” to maximise.
The optimal solution is found at the point x = 6 and y = 6.
The optimal production plan is to make 6 model A armchairs and 6 model B armchairs each day.
This will produce a maximum profit of: Model A 6 @ £50 = £300
Model B 6 @ £40 = £240
Total = £540
Thus the maximum profit available is £540 a day.
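The armchair formulation can be verified with a solver. The following is a sketch using SciPy's linprog (an assumption; the notes use WinQSB):

```python
from scipy.optimize import linprog

# Armchair problem: maximise profit 50x + 40y (negated for linprog).
c = [-50, -40]
A_ub = [[1, 0],      # max model A:  x <= 8
        [1, 1.5],    # shaping:      x + 1.5y <= 15
        [3, 2]]      # upholstery:  3x + 2y   <= 30
b_ub = [8, 15, 30]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2,
              method="highs")
print(res.x, -res.fun)   # x = 6, y = 6, profit 540
```

The solver confirms the production plan of 6 Model A and 6 Model B chairs for a daily profit of £540.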
Slack and Binding Constraints.
Since the optimal solution lies on the boundary of two constraints, shaping and upholstery, these constraints are known as binding.
Considering each constraint in turn.
Shaping x + 1.5y ≤ 15
When x = 6 and y = 6, x + 1.5y = 6 + 1.5× 6 = 6 + 9 = 15
All of the available time for shaping has been used. Thus the slack on this constraint is 0.
Upholstery 3x + 2y ≤ 30
When x = 6 and y = 6 3x + 2y = 3 × 6 + 2 × 6 = 18 + 12 = 30
All of the time available for upholstering has been used. The slack on this constraint is 0.
Maximum model A x ≤ 8
when x = 6 and y = 6 Left hand side, LHS: x = 6
Right hand side, RHS 8
Slack = RHS - LHS = 2
Not all of the possible model A armchairs have been made. Thus there is a slack on this constraint of 2.
N.B. Binding constraints have zero slack.
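The slack calculations above are simple arithmetic (slack = RHS - LHS at the optimal point) and can be sketched as:

```python
# Slack check for the armchair plan x = 6, y = 6.
x, y = 6, 6
constraints = {
    "max A":      (x,           8),    # LHS, RHS of x <= 8
    "shaping":    (x + 1.5*y,  15),    # LHS, RHS of x + 1.5y <= 15
    "upholstery": (3*x + 2*y,  30),    # LHS, RHS of 3x + 2y <= 30
}
for name, (lhs, rhs) in constraints.items():
    print(name, "slack =", rhs - lhs)
# max A has slack 2; shaping and upholstery have slack 0 (binding)
```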
Shadow Prices
The shadow price ( or dual price) for a particular constraint shows the amount of improvement in the optimal objective value as the right hand side of that constraint is increased by one unit, with all other data held fixed
(Eppen, Gould and Schmidt, Introductory Management Science)
Thus if we are trying to maximise the objective function, the shadow price gives the increase in the value of the objective function. If we are seeking to minimise, the shadow price gives the decrease in the objective function value.
Shaping x + 1.5y ≤ 15
If the time available for shaping is increased by 1 hour the new constraint will be:
x + 1.5y ≤ 16
The new optimal solution will be where this line meets the upholstery line, (see graph).
New: shaping      x + 1.5y = 16   (× 3)   3x + 4.5y = 48
Old: upholstery  3x +   2y = 30           3x +   2y = 30
Subtracting:                                   2.5y = 18,  so y = 7.2
Substituting y = 7.2 into 3x + 2y = 30 gives 3x + 14.4 = 30, so 3x = 15.6 and x = 5.2
Thus the new optimal solution would be 5.2 of model A and 7.2 of model B per day
In practice this would mean producing 26 model A and 36 model B in a five day week.
New profit = 50x + 40y = 50 × 5.2 + 40 × 7.2 = 548
Old profit = 540
Extra profit = 8
One extra hour of shaping is worth £8. The shadow price for shaping is £8.
Upholstery 3x + 2y ≤ 30
If upholstery time is increased by 1 hour, the constraint becomes:
3x + 2y ≤ 31
and the new optimal solution will be where this constraint line meets the line for shaping.
New: upholstery  3x +   2y = 31           3x +   2y = 31
Old: shaping      x + 1.5y = 15   (× 3)   3x + 4.5y = 45
Subtracting:                                  -2.5y = -14,  so y = 5.6
Substituting y = 5.6 into 3x + 2y = 31 gives 3x + 11.2 = 31, so 3x = 19.8 and x = 6.6
New optimal solution is x = 6.6, y = 5.6
New profit 50x + 40y = 50 × 6.6 + 40 × 5.6 = 554
Old profit = 540
Extra profit = 14
One extra hour of time for upholstery is worth £14. The shadow price of upholstery is £14.
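Both shadow prices can be checked numerically by re-solving the problem with each right-hand side increased by one unit and comparing profits. A sketch using SciPy (an assumption; any LP solver would do):

```python
from scipy.optimize import linprog

c = [-50, -40]                       # maximise 50x + 40y (negated)
A = [[1, 0], [1, 1.5], [3, 2]]       # max A, shaping, upholstery
b = [8, 15, 30]

def max_profit(rhs):
    res = linprog(c, A_ub=A, b_ub=rhs, bounds=[(0, None)] * 2,
                  method="highs")
    return -res.fun

base = max_profit(b)                 # 540
shadow = {}
for i, name in enumerate(["max A", "shaping", "upholstery"]):
    bumped = list(b)
    bumped[i] += 1                   # one extra unit of this resource
    shadow[name] = round(max_profit(bumped) - base, 2)

print(shadow)   # shaping worth 8, upholstery worth 14, max A worth 0
```

The max A constraint has shadow price zero because it is slack: the company is not using all 8 chairs of capacity, so extra capacity earns nothing.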
Sensitivity Analysis on the Objective Coefficient Ranges
The objective coefficient ranges tell us the changes that can be made to the objective function coefficients without changing the optimal solution.
This is particularly useful as profits, costs etc are likely to change over time.
Our objective function line, profit = 50x + 40y, has gradient -50/40 = -5/4 = -1.25
Shaping constraint line x + 1.5y = 15 has gradient -1/1.5 = -0.667
Upholstery constraint 3x + 2y = 30 has gradient -3/2 = -1.5
The profit line will reach an optimal solution at the intersection of the shaping and upholstery lines whilst the gradient of the profit line lies between the gradients of these two binding constraints.
Thus the current solution will remain optimal whilst the gradient of the objective function lies between the gradient of the two binding constraints.
Model A
Let the profit from model A chairs change to a new amount called new.
The objective function will now look like: (new)x + 40y, which has gradient -new/40.
The current optimal solution will remain optimal whilst:
gradient of upholstery < gradient of profit line < gradient of shaping
-1.5 < -new/40 < -0.667
Multiplying through by -40 (which reverses the inequalities): 60 > new > 26.667
i.e. 26.667 < new < 60
Current solution remains optimal whilst the profit from model A lies in the range £26.67 to £60
This means that the profit could rise by an amount less than £60 - £50 = £10 or fall by an amount less than £50 - £26.67 = £23.33 without affecting the optimal solution.
If the profit from each model A armchair remains in the range £26.67 to £60 the current solution will remain optimal.
Model B
Let the profit from each model B armchair change to a new amount new.
The new objective function will be: 50x + (new)y, which has gradient -50/new.
The current solution will remain optimal whilst the gradient of the profit line is between the gradients of the two binding constraints:
-1.5 < -50/new < -0.667
Taking each side in turn (new is positive, so multiplying by -new reverses each inequality):
-1.5 < -50/new gives 1.5 × new > 50, so new > 50/1.5 = 33.333
-50/new < -0.667 gives 0.667 × new < 50, so new < 50/0.667 = 75
33.33 < new < 75
The current solution will remain optimal whilst the profit from each model B armchair lies in the range £33.33 to £75.
This is the same as saying:
The current solution will remain optimal whilst the value of the profit from each model B does not rise by more than £35 [£75 - £40] or fall by more than £6.67 [£40 - £33.33].
Thus continue to produce 6 of each model each day whilst the profit from each model B is between £33.33 and £75.
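These coefficient ranges can be verified empirically by re-solving with a perturbed profit and checking whether the plan changes. A sketch using SciPy (an assumption), testing the Model A range of £26.67 to £60:

```python
from scipy.optimize import linprog

A = [[1, 0], [1, 1.5], [3, 2]]       # max A, shaping, upholstery
b = [8, 15, 30]

def plan(profit_a, profit_b):
    # Return the optimal (x, y) for the given unit profits.
    res = linprog([-profit_a, -profit_b], A_ub=A, b_ub=b,
                  bounds=[(0, None)] * 2, method="highs")
    return tuple(round(v, 4) for v in res.x)

print(plan(50, 40))   # base case: (6, 6)
print(plan(59, 40))   # still (6, 6): 59 lies inside the range 26.67-60
print(plan(61, 40))   # plan changes: 61 exceeds the upper limit of 60
```

At a Model A profit of £61 the profit line becomes steeper than the upholstery line, so the optimum moves to a different vertex (x = 8, y = 3).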
The WinQSB input and output looks like :
Choose Edit and then Variable Names
Choose Edit and then Constraints
The expression can now be entered into the spreadsheet format
To get the output: Choose “Solve and Analyze” then “Solve the problem”
Solution is :
Notes for a report
The optimal production plan is to make 6 model A armchairs and 6 model B armchairs each day.
This will produce a maximum daily profit of £540
If the profit from each model A armchair remains in the range £26.67 to £60 the current solution will remain optimal
The current solution will remain optimal whilst the profit from each model B armchair lies in the range £33.33 to £75.
All of the available time for shaping has been used. Thus the slack on this constraint is 0.
One extra hour of shaping is worth £8. The shadow price for shaping is £8.
All of the time available for upholstering has been used. The slack on this constraint is 0.
One extra hour of time for upholstery is worth £14. The shadow price of upholstery is £14.
Not all of the 8 possible model A armchairs have been made; only 6 have been made, so there is a slack of 2 on this constraint.
N.B. Only one change at a time can be analysed in this way.
Seminar Exercise (the first part of each question was done last week)
1. A company manufactures two types of sweatshirts: hooded and round neck.
Each hooded sweatshirt makes a contribution to profit of £4 and each round neck sweatshirt makes a contribution of £3.
Each hooded sweatshirt requires 1 hour of labour and each round neck requires 2 hours. There are 110 hours of labour available each day. There are limitations in the production capacity so that only 70 hooded sweatshirts can be made in a day.
The company wishes to maximise the contribution from these sweatshirts.
a) Formulate as a linear programming problem.
b) Determine the number of hooded and round neck sweatshirts that should be made each day in order to maximise contribution.
c) What is the value of the maximum daily contribution?
d) If another hour of labour was available each day, what would be the maximum daily contribution?
e) If the limitations in the production process were changed so that 71 hooded sweatshirts could be made each day, what would be the effect on the maximum daily contribution?
f) Within what limits could the contribution from hooded sweatshirts lie, without the optimal production plan changing?
2. A health enthusiast would like to organise his food consumption of two diet supplements Vita and Glow so that his minimum daily requirement of three basic nutrients A, B and C is satisfied.
The minimum daily requirements are 14 units of A, 12 units of B and 18 units of C.
Product Vita has 2 units of A and one unit each of B and C in each packet.
Product Glow has one unit each of A and B and 3 units of C in each packet.
The price of Vita is 20p and the price of Glow is 40p per packet.
The health enthusiast wants to determine the level of consumption of Vita and Glow that will minimise expenditure whilst satisfying the minimum daily requirements.
a) Formulate the description given above as a Linear Programming problem.
b) Advise the user on the best combination of Vita and Glow to use.
c) What is the minimum cost?
d) If the minimum daily requirement for the nutrient A was increased by 1 unit, what would the minimum cost be?
e) If the minimum daily requirement for nutrient A was reduced by 1 unit, what would the minimum cost be?
f) Within what range could the price of Vita lie without changing the optimal combination of Vita and Glow?
3. A furniture manufacturer makes two types of tables: Traditional and Modern.
Each traditional table requires 6 hours of cutting time, 5 hours of sanding time and 2 hours for staining.
Each traditional table sold gives a contribution to profit of £50.
Each modern table requires 2 hours of cutting time, 5 hours of sanding time and 4 hours of staining time.
Each modern table sold makes a contribution of £30.
Each day the manufacturer has available 36 hours of cutting time, 40 hours of sanding time and 28 hours of staining time.
All other inputs are available as required. The company can sell all the tables it makes.
a) Find the number of each type of table that should be made in order to maximise the contribution.
Find the maximum contribution possible.
b) How will the company’s optimal mix of tables change if there were only 30 hours of cutting time available each day?
c) If an extra hour of cutting time became available each day, by how much would the maximum contribution change?
d) By how much could the contribution to profit of a Modern table increase before the optimal production plan changes?
4. A clothes manufacturer offers two versions of a particular t-shirt; one printed and the other plain.
The manufacturing requirements are:
Cutting and printing time: The printed t-shirt takes 9 minutes each whilst each plain t-shirt takes only 3 minutes to cut and print. There are 360 minutes available for cutting and printing each day.
Each printed t-shirt takes 5 minutes for sewing and packing whilst each plain t-shirt takes only 3 minutes. There are 240 minutes available for sewing and packing of these t-shirts each day.
A contract with a local shop requires a minimum of 12 printed t-shirts to be produced each day.
The manufacturer makes a contribution to profit of £6 from the manufacture and sale of each printed t-shirt and £5 for each plain t-shirt.
The manufacturer wishes to maximise this contribution to profit.
Formulate the scenario described above as a linear programming problem. You should clearly indicate the meaning of each constraint and the meaning of any variables.
Using a graphical method, or otherwise, determine which combination of printed and plain t-shirt the manufacturer should produce and sell in order to maximise contribution to profit.
Identify any binding constraints
If the time available for sewing and packing the t-shirts was to increase by 1 minute each day, what effect would this have on the maximum contribution that could be earned from the manufacture and sale of the t-shirts?
At present the contribution from a printed t-shirt is £6. By how much could this contribution increase before the production plan would need to be changed?
5. A small engineering company makes two types of engine parts, coded as part A and part B.
Part A has a contribution of £30 per unit and part B £40. The company wishes to establish the weekly production plan which maximises contribution.
Production data are as follows:
                           machining (hours)   labour (hours)   materials (kg)
part A                             4                 4                 1
part B                             2                 6                 1
total available per week         100               180                40
Because of a trade agreement, sales of part A are limited to a weekly maximum of 20 units and to honour an agreement with an old established customer at least 10 units of part B must be made each week.
Formulate the scenario described above as a linear programming problem. You should clearly indicate the meaning of each constraint and the meaning of any variables.
Using a graphical method, or otherwise, determine which combination of part A and part B the company should produce and sell in order to maximise contribution to profit.
What is the expected contribution to profit?
b) If the number of hours available for machining each week increased by 5, by how much would the maximum possible contribution change?
c) If the amount of material available each week increased by 1 kg, what effect would this have on the maximum contribution?
d) Within what range could the contribution from a part B lie, without altering the optimal solution found previously.
Answers:
1.
d) maximum contribution would increase by £1.50 to £341.50
e) maximum contribution would increase by £2.50
f) contribution from hooded must be above £1.50
2. d) minimum cost will not change i.e. £3
e) minimum cost will not change i.e. £3
f) price of Vita must lie in the range 13.3p to 40p
3
a) 5 Traditional and 3 Modern each day; maximum contribution £340
b) 3.5 Traditional and 4.5 modern each day (7 Traditional and 9 modern every 2 days)
c) maximum contribution would increase by £5
d) could increase by up to £20
4
Produce 12 printed and 60 plain t-shirts each day
binding constraints: sewing and order (printed >=12)
If sewing and packing time increases by 1 min, max contribution increases by £1.67
Contribution from a printed t-shirt could increase by up to £2.33 before the optimal solution changes.
5
Produce 15 units of part A and 20 units of part B each week. The expected contribution is £1250 per week from the sale of these parts.

b) maximum contribution would increase by 5 × £1.25 = £6.25
c) no effect, as the materials constraint is slack
d) £15 to £45
Topic 10: Linear Programming – Extension to more than two variables
There are many types of problems that can be solved using linear programming techniques.
We shall limit our consideration to just two examples: financial planning and farm production.
Example: Lottery Winner
A lottery winner, Fred, has instructed his financial adviser to invest £100,000 of his winnings in the best combination of three stocks, Alpha, Beta and Gamma.
The details on these stocks are given in the table below:
Stock     Price per share   Estimated annual return per share   Maximum possible investment
Alpha          £60                        £7                          £60,000
Beta           £25                        £3                          £25,000
Gamma          £20                        £3                          £30,000
Formulate, and solve, a linear programme to show how many shares of each stock Fred should purchase in order to maximise the estimated total annual return.
Advise Fred on the results of the sensitivity analysis.
Formulation
Let A be the number of Alpha shares purchased.
Let B be the number of Beta shares purchased.
Let C be the number of Gamma shares purchased.
Total estimated annual return is 7A + 3B + 3C
Total amount of money to be invested is £100,000
Thus 60A + 25B + 20C ≤ 100,000 [assuming we do not have to invest all of it]
Only £60,000 can be invested in Alpha, @ £60 per share, therefore A ≤ 1,000
Only £25,000 can be invested in Beta, @ £25 per share, therefore B ≤ 1,000
Only £30,000 can be invested in Gamma, @ £20 per share, therefore C ≤ 1,500
Problem: max 7A + 3B + 3C
subject to
total investment: 60A + 25B + 20C ≤ 100,000
max A: A ≤ 1,000
max B: B ≤ 1,000
max C: C ≤ 1,500
non-negativity A ≥ 0, B ≥ 0, C≥ 0
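With three variables a graphical method no longer applies, so a solver is required. The notes use WinQSB; as a sketch (an assumption), the same model can be solved with SciPy:

```python
from scipy.optimize import linprog

# Fred's portfolio: maximise 7A + 3B + 3C (negated for linprog).
c = [-7, -3, -3]
A_ub = [[60, 25, 20],     # total spend <= £100,000
        [ 1,  0,  0],     # max A: A <= 1,000 shares
        [ 0,  1,  0],     # max B: B <= 1,000 shares
        [ 0,  0,  1]]     # max C: C <= 1,500 shares
b_ub = [100_000, 1_000, 1_000, 1_500]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 3,
              method="highs")
print(res.x, -res.fun)   # 750 Alpha, 1,000 Beta, 1,500 Gamma; return 12,750
```

The result matches the computer output reported below.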
Computer Input
Computer Output
Decision variable   Solution value
Alpha                     750
Beta                    1,000
Gamma                   1,500

Objective Function (max) = 12,750
Constraint          Slack or Surplus   Shadow Price
total investment            0             0.1167
max in A                  250             0
max in B                    0             0.0833
max in C                    0             0.6667
Optimal solution:
Buy 750 shares in Alpha, 1,000 shares in Beta and 1,500 shares in Gamma.
This will give a total estimated return of £12,750.
Slacks:
Constraint total investment has slack = S1 = 0. All of the £100,000 has been invested.
Constraint max in A has slack = S2 = 250.
The number of Alpha shares bought was 250 less than the number available.
[1000 were available but only 750 bought]
Constraint max in B has slack = S3 = 0.
The maximum number of Beta shares has been purchased.
Constraint max in C has slack = S4 = 0.
The maximum number of Gamma shares has been purchased.
Binding Constraints
The binding constraints are total investment, max in B and max in C (they all have zero slack).
Thus the maximum return is limited by the total amount of money that can be spent, the number of Beta shares available and the number of Gamma shares available.
Shadow Prices
Total Investment:
Total investment has a shadow price of 0.1167
If the total investment possible was increased by £1, a further £0.1167 could be earned.
Since any increased investment would be spent on Alpha shares, (Beta and Gamma shares are all used), every extra £1 spent on Alpha shares would generate £7/60 = £0.1167.
Max in A
The maximum Alpha constraint has shadow price equal to zero as we can earn nothing by increasing the availability of Alpha shares as there are already unused shares.
Max in B
If the total number of Beta shares available was increased by 1, a further £0.083 could be earned.
We could not predict this result as more Beta shares will reduce the number of Alpha and Gamma shares in an unknown manner.
Max in C
If the total number of Gamma shares available was increased by 1, a further £0.67 could be earned.
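Each of these shadow prices can be confirmed numerically by re-solving the problem with the relevant right-hand side increased by one unit. A sketch with SciPy (an assumption; the notes use WinQSB's reported shadow prices):

```python
from scipy.optimize import linprog

c = [-7, -3, -3]
A = [[60, 25, 20], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
b = [100_000, 1_000, 1_000, 1_500]
names = ["total investment", "max in A", "max in B", "max in C"]

def max_return(rhs):
    res = linprog(c, A_ub=A, b_ub=rhs, bounds=[(0, None)] * 3,
                  method="highs")
    return -res.fun

base = max_return(b)
shadow = {}
for i, name in enumerate(names):
    bumped = list(b)
    bumped[i] += 1                  # one more pound / one more share
    shadow[name] = round(max_return(bumped) - base, 4)

print(shadow)   # approx 0.1167, 0.0, 0.0833 and 0.6667 respectively
```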
Sensitivity Analysis
Decision Variable   Unit Cost or Profit   Allowable Min   Allowable Max
A                            7                 0               7.2
B                            3                 2.9167          M (infinity)
C                            3                 2.3333          M (infinity)
A shares
The current suggested portfolio of shares will remain optimal whilst the return on an Alpha share remains in the range 0 to £7.20, assuming the other returns remain constant.
B shares
The current suggested portfolio will remain optimal whilst the return on a Beta share remains above £2.92.
C shares
The current portfolio will remain optimal whilst the return on a Gamma share remains above £2.33.
Ranges for Shadow Prices
Constraint          Right Hand Side   Shadow Price   Allowable Min RHS   Allowable Max RHS
total investment        100,000          0.1167           55,000             115,000
max in A                  1,000          0                   750                 M
max in B                  1,000          0.0833              400               2,800
max in C                  1,500          0.6667              750               3,750
These values give the ranges within which the shadow prices apply.
Total investment
If the RHS of constraint total investment increases to 115,000 or less, we can say that for each extra £1 available for investment the return will increase by £0.1167. However if the increase takes the total investment available to above £115,000 we do not know what the extra return will be.
Similar arguments apply for reductions in the maximum total investment, provided the total available does not fall below £55,000.
Max in A
If the RHS of constraint max A, maximum number of Alpha shares, takes any value above 750, then for each additional share, the extra return will be zero. (This confirms that we only needed 750 Alpha shares for the original solution.)
Max in B
If the RHS of constraint max B, maximum number of Beta shares, is in the range 400 to 2,800 shares, the shadow price of £0.0833 will apply.
Max in C
As long as the maximum number of Gamma shares is in the range 750 to 3,750, the shadow price of £0.6667 will apply.
Farm Production
A farmer has two farms, Home Farm and Meadow Farm, in which he grows corn and wheat.
Both farms are 40 acres in size.
To satisfy a contract with a local mill, the farmer must produce 7,000 barrels of corn and 11,000 barrels of wheat each year.
The farmer wishes to minimise the cost of meeting the contract.
The data for each farm is given below:
Home Farm       yield per acre   cost per acre
corn              400 barrels        £100
wheat             300 barrels         £90

Meadow Farm     yield per acre   cost per acre
corn              650 barrels        £120
wheat             350 barrels         £80
Write a report to the farmer analysing your findings.
Formulation:
Variables: We need to distinguish between production at Home and Meadow Farm.
Let CH be the number of acres of corn planted at Home Farm
Let CM be the number of acres of corn planted at Meadow Farm
Let WH be the number of acres of wheat planted at Home Farm
Let WM be the number of acres of wheat planted at Meadow Farm
Minimise Cost:
Minimise 100CH + 90WH + 120CM + 80WM
subject to:
yield corn (1) 400CH + 650CM ≥ 7,000
wheat (2) 300WH + 350WM ≥ 11,000
area Home Farm (3) CH + WH ≤ 40
Meadow Farm (4) CM + WM ≤ 40
CH ≥ 0, CM ≥ 0, WH ≥0 WM ≥ 0
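This four-variable model must be solved by computer. As a sketch (an assumption; the notes use WinQSB), the ≥ yield constraints are negated to fit SciPy's A_ub x ≤ b_ub form:

```python
from scipy.optimize import linprog

# Variables: [CH, WH, CM, WM] = acres of corn/wheat at Home/Meadow Farm.
c = [100, 90, 120, 80]                       # planting cost per acre
A_ub = [[-400,    0, -650,    0],            # corn yield  >= 7,000 (negated)
        [   0, -300,    0, -350],            # wheat yield >= 11,000 (negated)
        [   1,    1,    0,    0],            # Home Farm   <= 40 acres
        [   0,    0,    1,    1]]            # Meadow Farm <= 40 acres
b_ub = [-7_000, -11_000, 40, 40]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 4,
              method="highs")
print([round(v, 2) for v in res.x], round(res.fun, 2))
# [0.0, 2.56, 10.77, 29.23] at a minimum cost of 3861.54
```

The output matches the planting plan in the report below.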
The input for Farms is:
Report
1. Planting Plan
In order to minimise the planting costs, the following should be implemented:
Plant Home Farm with 2.56 acres of wheat
Plant Meadow Farm with 10.77 acres of corn and 29.23 acres of wheat.
This will incur costs of £3,861.54, the minimum that can be achieved.
2. Implications of Suggested Planting Programme
The planting plan given above will result in exactly 7,000 barrels of corn and exactly 11,000 barrels of wheat being produced.
[S1 = 0, S2 = 0]
37.44 acres of Home Farm will not be planted, whilst all 40 acres of Meadow Farm will be used.
3. Scope of the recommendations
3.1 Change in Costs
The planting plan given above will result in a minimum cost whilst the following conditions hold:
* The cost of planting corn at Home Farm stays above £89.23 per acre.
* The cost of planting corn at Meadow Farm stays below £137.50 per acre.
* The cost of planting wheat at Home Farm stays in the range £68.57 to £105 per acre.
* The cost of planting wheat at Meadow Farm stays in the range £62.50 to £105 per acre.
If any one of the conditions above fails to hold then a new planting plan will be required. Furthermore, these ranges apply to one cost changing at a time; if two or more of the present costs change simultaneously, the plan should be re-evaluated.
As these ranges are relatively large, the proposed plan should hold good for some time.
3.2 Change in Contract
The proposed plan exactly meets the contracts for 7,000 barrels of corn and 11,000 barrels of wheat.
If the requirement for corn was to increase by one barrel, the extra costs incurred in meeting this target would be 22.3 pence. This marginal cost of 22.3 pence per barrel applies for levels of production between 5,571.43 and 26,000 barrels.
If the minimum amount of wheat required was increased, the extra cost would be 30 pence per barrel. This extra cost per barrel will apply whilst the contract for wheat lies in the range 10,230.8 to 22,230.8 barrels.
If the minimum requirement for corn or wheat fell outside these ranges, a new planting plan would be required.
3.3 Changes in Farm Size
In total only 2.56 acres of Home Farm are being used. Clearly it would not make sense to consider increasing the size of Home Farm.
All 40 acres of Meadow Farm are being used. If the size of Meadow Farm could be increased by 1 acre, the minimum cost could be reduced by £25. This marginal reduction in cost applies when the size of Meadow Farm lies in the range 10.77 acres to 42.2 acres.
Similarly, if the size of Meadow Farm is reduced by 1 acre, the minimum cost will rise by £25.
[If we could increase the size of Meadow Farm by 1 acre, we could produce an extra 350 barrels of wheat at Meadow Farm at a cost of £80. At the same time we would reduce the wheat grown at Home Farm by 350 barrels, i.e. 350/300 = 1.1667 acres. This reduction would save 1.1667 × £90 = £105. Thus the net reduction would be £105 – £80 = £25.]
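The £25 calculation in the note above can be verified in a couple of lines of Python (variable names are my own):

```python
# One extra acre at Meadow Farm grows 350 barrels of wheat for £80;
# producing the same 350 barrels at Home Farm needs 350/300 acres
# at £90 per acre.
extra_meadow_cost = 80
home_saving = 350 / 300 * 90                  # £105.00
net_reduction = home_saving - extra_meadow_cost
print(f"net saving per extra Meadow acre: £{net_reduction:.2f}")   # £25.00
```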
Seminar Exercise
Question 1.
Formulate the problem below as a linear programming problem.
Solve the problem using QSB for Windows.
Write some notes on the interpretation of the output.
During the seminar discuss the formulation and the output.
A company has two factories A and B. Each factory makes two products, standard and deluxe. A unit of standard gives a profit contribution of £10, while a unit of deluxe gives a profit contribution of £15. The company wishes to maximise this profit contribution.
Each factory uses two processes, grinding and polishing, for producing its products.
Factory A has a grinding capacity of 80 hours per week and a polishing capacity of 60 hours per week. For factory B these capacities are 60 and 75 hours per week respectively.
The grinding and polishing times in hours for a unit of each type of product in each factory are given in the table below:
              Factory A                   Factory B
              standard     deluxe         standard     deluxe
grinding      4            2              5            3
polishing     2            5              5            6
It is possible, for example, that factory B has older machines than factory A, resulting in higher unit processing times.
In addition each unit of each product uses 4 kilograms of a raw material which we shall refer to as “raw”. The company has 120 kilograms of “raw” available per week.
(Example is taken from H.P.Williams Model Building in Mathematical Programming
3rd edition, Wiley 1990 page 45)
[9.166 units of standard in factory A, 8.33 units of deluxe in factory A, 12.5 units of deluxe
in factory B each week. Maximum contribution = £404.17 per week]
Question 2. Hitech Training
A small private college, Hitech Training, provides training in web-page design and development for companies. The companies send their employees to the college premises for short courses. Each course lasts 8 hours. The distribution of the 8 hours is flexible; a course may take place during one day or be spread over a number of days. It is also possible for one course to be spread over a number of weeks, e.g. 2 hours on Monday evening for 4 weeks.
The companies send employees in groups of 10 or less, as the college’s computer rooms can only accommodate 10 trainees at a time. If a company chooses to send less than 10 employees on a particular course they are not charged a reduced fee as the teaching costs are fixed. If the company wishes to send more than 10 employees, they must book more than one course.
The prices charged to the companies for a course are as follows:
Introductory Web Design £500 for an 8-hour course for up to 10 employees
Intermediate Web Design £600 for an 8-hour course for up to 10 employees
Advanced Web Design £700 for an 8-hour course for up to 10 employees
The college employs trainers to deliver these courses. The trainers are only paid for the hours they work. The rates of pay are as follows:
Introductory Web Design £18 an hour
Intermediate Web Design £21 an hour
Advanced Web Design £35 an hour
Demand for these training courses is such that the college is confident of having enough companies wanting courses to fill all the courses it provides. However, the number of courses that can be provided each week is limited by the fact that the college only has 200 trainer-hours available each week. [A trainer-hour is defined as one trainer teaching for one hour.]
The college has enough rooms and trainers to meet any time-tabling problems that could arise.
Of these 200 trainer-hours, only 80 are available for the Advanced Web Design course. A total of 160 trainer-hours are available for teaching the Intermediate or Advanced courses.
The trainers that can teach the Advanced courses can also teach the Intermediate courses. All trainers can teach the Introductory course.
The college’s analysis of the demand for its courses indicates that at least 5 of the courses offered each week should be at the Advanced level and at least 5 at the Intermediate level.
The college wishes to maximise its Net Revenue.
This is defined as: Net Revenue = Revenue from courses – Labour Costs.
Required:
· Formulate the scenario described on the previous page as a Linear Programme.
Keep a copy of any notes made during the formulation, as you may need these in the examination.
· Enter the problem into QSB for Windows (or a similar package)
Produce printouts of
The input data
Solution of the problem
Sensitivity Analysis
You will need to ensure that your printouts include: shadow prices, right-hand side ranges, ranges of objective coefficients.
· Bring these printouts and any notes to the seminar
Answer: Question 1.
Let AS be number of standard products made in factory A each week
Let BS be number of standard products made in factory B each week
Let AD be number of deluxe products made in factory A each week
Let BD be number of deluxe products made in factory B each week
max 10AS + 10BS + 15AD + 15BD
subject to:
grinding in A: 4AS + 2AD ≤ 80
polishing in A: 2AS + 5AD ≤ 60
grinding in B: 5BS + 3BD ≤ 60
polishing in B: 5BS + 6BD ≤ 75
raw: 4AS + 4BS + 4AD + 4BD ≤ 120
AS ≥ 0, BS ≥ 0, AD ≥ 0, BD ≥ 0
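As with the Farms problem, the formulation can be cross-checked with SciPy's linprog (a sketch, assuming SciPy is installed; the maximisation is handled by negating the objective):

```python
from scipy.optimize import linprog

# Decision variables, in order: [AS, AD, BS, BD].
# Maximise 10AS + 15AD + 10BS + 15BD, i.e. minimise the negation.
c = [-10, -15, -10, -15]
A_ub = [
    [4, 2, 0, 0],    # grinding in A  <= 80 hours
    [2, 5, 0, 0],    # polishing in A <= 60 hours
    [0, 0, 5, 3],    # grinding in B  <= 60 hours
    [0, 0, 5, 6],    # polishing in B <= 75 hours
    [4, 4, 4, 4],    # raw material   <= 120 kg
]
b_ub = [80, 60, 60, 75, 120]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 4)
AS, AD, BS, BD = res.x
print(f"A: {AS:.3f} standard, {AD:.3f} deluxe")
print(f"B: {BS:.3f} standard, {BD:.3f} deluxe")
print(f"max contribution: £{-res.fun:.2f}")   # £404.17
```

This reproduces the bracketed answer above: about 9.17 standard and 8.33 deluxe units in factory A, 12.5 deluxe units in factory B.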
Question 2: Output.
Topic 11: Project Management: Critical Path
Introduction
In every industry there are concerns about how to manage large-scale, complicated projects effectively. Millions of pounds have been wasted in cost overruns caused by poor planning and control of projects.
Project planning
Planning the project requires that the objectives are clearly defined so the project team knows exactly what is required of them. All the activities involved in the project must be clearly identified. An activity is the performance of an individual job that requires labour, resources and time. Once the activities have been defined, their sequential relationships (which activity comes first, which activities must precede others) must be determined.
Once the activities and their relationships to each other are known, a network of activities can be drawn up. The time estimates for each activity then enable a total project completion time to be calculated. If this total completion time is longer than that specified in the objectives, means must be found to reduce the total project time (known as crashing). This reduction in time is usually achieved by assigning more resources to certain activities, which will reduce the overall project time.
Project Control
Once the planning process is complete and the work begun, the focus then moves onto controlling actual work involved. Controlling a project involves ensuring that the timetable set in the planning stage is adhered to and that the activities are completed in the appropriate sequence. Should any unforeseen problems arise at this stage, the project may have to be re-scheduled and additional resources allocated to ensure that the original completion time is adhered to.
We are going to cover:
· Understand how to plan, monitor and control projects with the use of CPM.
· Be able to determine the earliest start, earliest finish, latest finish and float times for each activity, along with the total project completion time.
· Produce Gantt Charts to facilitate monitoring of the project.
· Reduce the total project time, at the least total cost, by crashing the network.
Critical Path Method
The Critical Path Method (CPM) is a popular technique that helps managers plan, schedule, monitor and control large and complex projects.
The steps in the procedure are:
1. Define the project and all of its significant activities.
2. Decide which activities must precede others.
3. Draw the network connecting all of the activities.
4. Assign times and/or costs estimates to each activity.
5. Compute the critical path through the network.
6. Use the network to plan, schedule, monitor and control the project.
Example: to produce a text book
Activity code   Activity description                   Duration (months)   Preceding activities
A               Initial consultation with publisher    3                   –
B               Prepare proposal                       2                   A
C               Sign contract                          1                   B
D               Write material                         18                  B
E               First proof-read                       4                   C, D
F               Final proof-read and publish           6                   E
The CPM diagram consists of a network of boxes of the form
+-----------------+
|    Activity     |
|  EST   |  EFT   |
|    duration     |
|  LST   |  LFT   |
+-----------------+
EST = earliest start time LST = latest start time
EFT = earliest finish time LFT = latest finish time
[Network diagram: boxes for A (duration 3), B (2), C (1), D (18), E (4) and F (6), with the EST, EFT, LST and LFT entries still to be filled in.]
Forward Pass
The forward pass involves calculating the earliest time that each activity can start and finish.
Let activity A start at time 0 (its earliest start time)
As activity A takes 3 months its earliest finish time is 0+3
Thus earliest finish time (EF) = earliest start time (ES) + duration
B cannot start until A has completed, thus its earliest start time is the same as A's earliest finish time. Thus B has ES = 3. The duration is 2 and thus the earliest finish time = 5.
C follows on from B and thus its earliest start time = B's earliest finish time = 5. The earliest finish time is 5 + 1 = 6.
D follows on from B and thus its earliest start time is 5. Its earliest finish time is 5 + 18 = 23.
E follows from both C and D. As D has an earliest finish time of 23, E cannot start until 23. Thus ES = 23. Duration is 4, thus earliest finish time = 23 + 4 = 27.
F cannot start until E has finished, thus earliest start time = 27 and earliest finish time = 33.
Thus the project will take 33 months.
After the forward pass the network shows:

Activity   EST   EFT   Duration
A           0     3      3
B           3     5      2
C           5     6      1
D           5    23     18
E          23    27      4
F          27    33      6

(LST and LFT are still to be found by the backward pass.)
Backward Pass
The backward pass is used to determine which activities are “critical”. It works by considering the latest start time: latest start time (LS) = latest finish time (LF) – duration.
The latest finish time of F = 33 (as we do not want to extend the total time taken). The latest start time = 33 – duration (6) = 27.
As F can start at 27 months, this will be E's latest finish time. E has a latest start time of 27 – 4 = 23.
C has a latest finish time of 23 and a latest start time of 23 – 1 = 22.
D has a latest finish time of 23 and a latest start time of 23 – 18 = 5.
B must occur before C and D. Taking the smaller of their two latest start times, B has a latest finish time of 5. The latest start time is 5 – 2 = 3.
A has a latest finish time of 3 and a latest start time of 3 – 3 = 0.
After the backward pass the network shows:

Activity   EST   EFT   Duration   LST   LFT
A           0     3      3         0     3
B           3     5      2         3     5
C           5     6      1        22    23
D           5    23     18         5    23
E          23    27      4        23    27
F          27    33      6        27    33
The critical path is where the earliest finishing time equals the latest finishing time. In this example, the critical path is ABDEF. The minimum time the project can take is 33 months
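The forward and backward passes can be sketched as a small Python function (this is a sketch, not QSB or MS Project output; the dictionary layout is my own):

```python
def cpm(activities):
    """activities: {name: (duration, [predecessor names])}.
    Returns the float of each activity, the critical path and the duration."""
    eft, order = {}, []
    pending = dict(activities)
    while pending:                                 # forward pass
        for name, (dur, preds) in list(pending.items()):
            if all(p in eft for p in preds):
                est = max((eft[p] for p in preds), default=0)
                eft[name] = est + dur
                order.append(name)
                del pending[name]
    finish = max(eft.values())
    lft = {name: finish for name in activities}    # backward pass
    for name in reversed(order):
        dur, preds = activities[name]
        for p in preds:                            # LFT of p <= LST of its successor
            lft[p] = min(lft[p], lft[name] - dur)
    floats = {n: lft[n] - eft[n] for n in activities}
    critical = [n for n in order if floats[n] == 0]
    return floats, critical, finish

# The text-book project from the example above.
book = {"A": (3, []), "B": (2, ["A"]), "C": (1, ["B"]),
        "D": (18, ["B"]), "E": (4, ["C", "D"]), "F": (6, ["E"])}
floats, critical, months = cpm(book)
print("critical path:", "".join(critical), "| duration:", months, "months")
# critical path: ABDEF | duration: 33 months
```

The only non-critical activity is C, with a float of 23 – 6 = 17 months.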
Example: An Estate Agency is planning to open a new office in North London.
The main activities are as follows:
Activity   Description                  Preceding activity   Duration (weeks)
A          Find new office location     None                 9
B          Recruit new staff            None                 7
C          Make alterations to office   A                    5
D          Order equipment              A                    3
E          Install new equipment        D                    3
F          Train staff                  B                    4
G          Test operations              C, E, F              1
After the forward pass the network shows:

Activity   EST   EFT   Duration
A           0     9      9
B           0     7      7
C           9    14      5
D           9    12      3
E          12    15      3
F           7    11      4
G          15    16      1
Forward Pass
A: EST = 0 duration of A = 9 EFT = 0 + 9 = 9
B: EST = 0 duration of B = 7 EFT = 0 + 7 = 7
C: EST = 9 duration of C = 5 EFT = 9+ 5 = 14
D: EST = 9 duration of D = 3 EFT = 9 + 3 = 12
E: EST = 12 duration of E = 3 EFT = 12 + 3 = 15
F: EST = 7 duration of F = 4 EFT = 7 + 4 = 11
G: G cannot start until C, E and F have finished. These have EFT values of 14, 15 and 11.
Thus EST = 15, duration = 1, EFT = 15 + 1 = 16.
Thus the earliest finishing time of the entire project is 16 weeks.
Backward Pass
After the backward pass the network shows:

Activity   EST   EFT   Duration   LST   LFT
A           0     9      9         0     9
B           0     7      7         4    11
C           9    14      5        10    15
D           9    12      3         9    12
E          12    15      3        12    15
F           7    11      4        11    15
G          15    16      1        15    16
G: LFT = 16 [same as final EFT]; LST = LFT – duration = 16 – 1 = 15
E: LFT = 15; LST = LFT – duration = 15 – 3 = 12
C: LFT = 15; LST = LFT – duration = 15 – 5 = 10
D: LFT = 12; LST = LFT – duration = 12 – 3 = 9
F: LFT = 15; LST = LFT – duration = 15 – 4 = 11
A: LFT = 9 (the smaller of the LSTs of C and D); LST = LFT – duration = 9 – 9 = 0
B: LFT = 11; LST = LFT – duration = 11 – 7 = 4
The critical path is the route joining the nodes where EFT = LFT.
In this example, the critical path is ADEG. The shortest finishing time for the project is 16 weeks.
Floats
The floats measure the amount of time that an activity can over-run by without affecting the total completion time.
Float = Latest finish time (LFT) – Earliest finish time (EFT)
Considering the Estate Agency example used earlier:
[The network diagram from the backward pass is repeated here, with the EST, EFT, LST and LFT values for each activity; the floats are tabulated below.]
The critical path is ADEG. The shortest finishing time for the project is 16 weeks.
Activity   EFT   LFT   Float   Critical
A           9     9     0      yes
B           7    11     4
C          14    15     1
D          12    12     0      yes
E          15    15     0      yes
F          11    15     4
G          16    16     0      yes
Gantt Charts
Gantt Charts can be used by managers to chart the progress of a project. In these charts, time is measured along the horizontal axis, each activity is listed on the vertical axis, and a bar is drawn to show the duration of each activity. The variety of Gantt chart we will use shows each bar starting at its earliest start time (EST); the length of the bar represents the duration of the activity, with any ‘float’ shown at the end of the duration.
The Gantt Chart for the Estate Agency is:
Monitoring progress on a Gantt chart
A crucial step in meeting a completion target is to monitor a project’s progress. Gantt charts provide a visual means of tracking the progress of the project’s activities against the scheduled completion dates. The chart also lets the manager know where there is some ‘slack’, in the sense of the ‘float’ times.
For example, at the end of week 9, activity A should be completed and F should have been started. Activities D and C should begin the next week. Ideally activity B will have finished, but as long as it completes by week 11, the project can still complete on time.
Gantt Charts: Advantages and Disadvantages

Advantages:
1. The chart is easy to draw, particularly if appropriate software is used.
2. An earliest possible completion date is clearly visible.
3. For each activity the earliest possible start and finishing times are shown.

Disadvantages:
1. The chart only gives one possible sequence for the activities (the earliest possible completion time).
2. The precedence rules are not clear from the chart, so it is not obvious how a delay in one activity will affect the project.
Seminar Activity:
For the following cases, draw the network, find the critical path and shortest duration.
Identify the “float” for each activity.
a)
Activity   Preceding   Duration
A          None        3
B          None        4
C          A           4
D          A           7
E          B, C        2
b)
Activity   Preceding   Duration
A          None        4
B          None        6
C          A           4
D          A           3
E          C           5
F          B           2
G          D, F        8
H          B           5
c)
Activity   Preceding   Duration
A          None        4
B          A           12
C          A           11
D          None        20
E          D           6
F          B, C, E     7
G          F           10
H          E           5
I          G, H        4
Topic 11: Project Management: Crashing
Some definitions:
Normal Cost: The cost associated with the normal time estimate for the activity.
Crash Cost: The cost associated with the minimum possible time for an activity. Crash Costs, because of the extra wages, overtime payments etc., are higher than the normal costs.
Crash time: The minimum possible time that an activity can take.
Slope: The average cost of shortening an activity by one time unit.
Slope = increase in cost / decrease in time = (crash cost – normal cost) / (normal time – crash time)
Crashing: This process involves finding the least cost method of reducing the overall project duration.
This is done by reducing the time of the activity with the minimum ‘slope’ providing this activity is on the critical path.
The process is repeated until no further savings can be made.
Example:
Activity   Preceding   Normal     Crash      Normal     Crash
           activity    duration   duration   cost (£)   cost (£)
A          None        4          3          360        420
B          None        8          5          300        510
C          A           5          3          170        270
D          A           9          7          220        300
E          B, C        5          3          200        360
Total                                        1,250
Firstly: Draw network, find critical path and shortest duration.
Activity   EST   EFT   Duration   LST   LFT
A           0     4      4         0     4
B           0     8      8         1     9
C           4     9      5         4     9
D           4    13      9         5    14
E           9    14      5         9    14
Critical path = A, C, E; normal duration = 14 days; total cost = £1,250.
Crashing:
First Crash:
Step 1: Calculate the ‘slopes’ using:
Slope = increase in cost / decrease in time = (crash cost – normal cost) / (normal time – crash time)
Activity   Critical   Normal     Crash      Normal     Crash      Slope
                      duration   duration   cost (£)   cost (£)
A          yes        4          3          360        420        (420-360)/(4-3) = 60
B                     8          5          300        510
C          yes        5          3          170        270        (270-170)/(5-3) = 50
D                     9          7          220        300
E          yes        5          3          200        360        (360-200)/(5-3) = 80
Total                                       1,250
Thus activity C has the minimum slope and thus is the cheapest way of reducing the total duration by 1 day. Reducing C by 1 day costs £50.
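The slope calculation and the choice of the cheapest critical activity can be sketched in Python (data taken from the table above; variable names are my own):

```python
# Activity data: (normal time, crash time, normal cost, crash cost).
data = {
    "A": (4, 3, 360, 420), "B": (8, 5, 300, 510), "C": (5, 3, 170, 270),
    "D": (9, 7, 220, 300), "E": (5, 3, 200, 360),
}
# slope = (crash cost - normal cost) / (normal time - crash time)
slopes = {a: (cc - nc) / (nt - ct) for a, (nt, ct, nc, cc) in data.items()}
critical = {"A", "C", "E"}                 # critical path before the first crash
cheapest = min(critical, key=slopes.get)   # cheapest critical activity to crash
print(slopes)                              # A: 60, B: 70, C: 50, D: 40, E: 80
print("crash first:", cheapest, "at £%.0f per day" % slopes[cheapest])
```

Note that D has the smallest slope overall, but it is not on the critical path, so crashing it would not shorten the project at this stage.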
Step 2: New network:
Activity   EST   EFT   Duration   LST   LFT
A           0     4      4         0     4
B           0     8      8         0     8
C           4     8      4         4     8
D           4    13      9         4    13
E           8    13      5         8    13
Step 3: New critical path, total duration and cost
Critical paths = ACE, BE and AD; duration = 13 days.
Total Cost = £1,250 + £50 = £1,300
Second Crash: duration of C is now 4 and all activities are critical
Step 1: Calculate the ‘slopes’ using:
Slope = increase in cost / decrease in time = (crash cost – normal cost) / (normal time – crash time)
Activity   Critical   Normal     Crash      Normal      Crash      Slope
                      duration   duration   cost (£)    cost (£)
A          yes        4          3          360         420        60 [from before]
B          yes        8          5          300         510        (510-300)/(8-5) = 70
C          yes        4          3          170 (220)   270        50 [from before]
D          yes        9          7          220         300        (300-220)/(9-7) = 40
E          yes        5          3          200         360        80 [from before]
Total                                       1,250 (1,300)
Thus activity D has the minimum slope and thus is the cheapest way of reducing the total duration by 1 day. Reducing D by 1 day costs £40.
However, we cannot reduce D on its own as all paths are critical.
We have to consider combinations of reductions:
A and B extra cost = 60 + 70 = 130
D and E extra cost = 40 + 80 = 120
B,C and D extra cost = 70 + 50 + 40 = 160
A and E extra cost = 60 + 80 = 140
All of these combinations will reduce the total duration by 1 day.
The cheapest option is D and E.
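The comparison of combinations can be checked with a short sketch (slope values taken from the table above; the tuple layout is my own):

```python
# After the first crash every path (ACE, BE, AD) is critical, so shortening
# the project by one day needs a reduction on every path.
slopes = {"A": 60, "B": 70, "C": 50, "D": 40, "E": 80}
combos = [("A", "B"), ("D", "E"), ("B", "C", "D"), ("A", "E")]
costs = {combo: sum(slopes[a] for a in combo) for combo in combos}
for combo, cost in costs.items():
    print(" and ".join(combo), "extra cost =", cost)
print("cheapest:", min(costs, key=costs.get))   # ('D', 'E') at 120
```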
Step 2: New network:
Activity   EST   EFT   Duration   LST   LFT
A           0     4      4         0     4
B           0     8      8         0     8
C           4     8      4         4     8
D           4    12      8         4    12
E           8    12      4         8    12
Step 3: New critical path, total duration and cost
Critical paths = ACE, BE and AD; duration = 12 days.
Total Cost = £1,300 + £120 = £1,420
Third Crash: duration of C is now 4, D is 8 and E is 4 and all activities are critical
Step 1:
Activity   Critical   Current    Crash      Slope
                      duration   duration
A          yes        4          3          60
B          yes        8          5          70
C          yes        4          3          50
D          yes        8          7          40
E          yes        4          3          80
We cannot reduce one activity on its own as all paths are critical. We have to consider combinations of reductions:
A and B extra cost = 130
D and E extra cost = 120
B,C and D extra cost = 160
A and E extra cost = 140
All of these combinations will reduce the total duration by 1 day. The cheapest option is D and E.
Step 2: New network:
Activity   EST   EFT   Duration   LST   LFT
A           0     4      4         0     4
B           0     8      8         0     8
C           4     8      4         4     8
D           4    11      7         4    11
E           8    11      3         8    11
Step 3: New critical path, total duration and cost
Critical paths = ACE, BE and AD; duration = 11 days.
Total Cost = £1,420 + £120 = £1,540
Fourth Crash: duration of C is now 4, D is 7 and E is 3 and all activities are critical
Step 1:
Activity   Critical   Current    Crash      Slope
                      duration   duration
A          yes        4          3          60
B          yes        8          5          70
C          yes        4          3          50
D          yes        7          7          40
E          yes        3          3          80
We cannot reduce one activity on its own as all paths are critical and we cannot reduce activities D and E as they are at their ‘crash duration’.
We have to consider combinations of reductions:
A and B: extra cost = 130
D and E: D and E cannot be reduced
B, C and D: D cannot be reduced
A and E: E cannot be reduced
The only possible option left is to reduce A and B. This will reduce the total duration by 1 day.
Step 2: New network:
Activity   EST   EFT   Duration   LST   LFT
A           0     3      3         0     3
B           0     7      7         0     7
C           3     7      4         3     7
D           3    10      7         3    10
E           7    10      3         7    10
Step 3: New critical path, total duration and cost
Critical paths = ACE, BE and AD; duration = 10 days.
Total Cost = £1,540 + £130 = £1,670
Activity   Critical   Current    Crash      Slope
                      duration   duration
A          yes        3          3          60
B          yes        7          5          70
C          yes        4          3          50
D          yes        7          7          40
E          yes        3          3          80
The only activities that are not at their minimum duration are B and C.
As reducing B and C will not reduce the total duration, we cannot ‘crash’ any further.
Example: Klone Computers
Klone Computers is a small manufacturer of personal computers that is about to design, manufacture and market the Klone 2000 palmbook computer. The company faces three major tasks in introducing a new computer:
1. Manufacturing the new computer
2. Training staff and sales teams to operate the new computer
3. Advertising the new computer
When the proposed specifications for the new computer have been reviewed, the manufacturing phase begins with the design of a prototype computer. Once the design is determined, the required materials are purchased and the prototypes are manufactured. Prototype models are then tested and analysed by staff who have completed the staff training course. Based on their input, refinements are made to the prototype and an initial production run of computers is scheduled.
Staff training of company personnel begins once the computer is designed allowing staff to test the prototypes once they have been manufactured. After the computer design has been revised based on staff input, the sales force undergoes full-scale training.
Advertising is a two-phase process. First, a small group works closely with the design team so that once a product design has been chosen, the marketing team can begin an initial pre-production advertising campaign. Following this initial campaign and completion of the final design revisions, a larger advertising campaign team is introduced to the special features of the computer, and a full-scale advertising programme is launched.
The entire project is concluded when the initial production run is completed, the sales staff are trained, and the advertising campaign is underway.
Klone has come up with the following information to assist the planning of the project.
Phase           Activity   Description                       Immediate      Estimated completion
                                                             predecessors   time (days)
Manufacturing   A          Prototype model design            None           90
                B          Purchase of materials             A              15
                C          Manufacture of prototype models   B              5
                D          Revision of design                G              20
                E          Initial production run            D              21
Training        F          Staff training                    A              25
                G          Staff input on prototype models   C, F           14
                H          Sales training                    D              28
Advertising     I          Pre-production advertising        A              30
                J          Post-production advertising       D, I           45
[Based on an example in Lawrence, John & Pasternack, Barry, Applied Management Science, 2nd edition, Wiley 2002]
After the forward pass the network shows:

Activity   EST   EFT   Duration
A           0    90     90
B          90   105     15
C         105   110      5
D         129   149     20
E         149   170     21
F          90   115     25
G         115   129     14
H         149   177     28
I          90   120     30
J         149   194     45
Forward pass:
Node   Computation                                                     EFT
A      EST = 0; EFT = EST + duration of A = 0 + 90 = 90                 90
B      EST = 90; EFT = 90 + duration of B = 90 + 15 = 105              105
C      EST = 105; EFT = 105 + duration of C = 105 + 5 = 110            110
F      EST = 90; EFT = 90 + duration of F = 90 + 25 = 115              115
G      G cannot start until C and F have finished.
       EST = 115; EFT = 115 + duration of G = 115 + 14 = 129           129
D      EST = 129; EFT = 129 + duration of D = 129 + 20 = 149           149
E      EST = 149; EFT = 149 + duration of E = 149 + 21 = 170           170
H      EST = 149; EFT = 149 + duration of H = 149 + 28 = 177           177
I      EST = 90; EFT = 90 + duration of I = 90 + 30 = 120              120
J      J cannot start until both D and I have finished.
       EST = 149; EFT = 149 + duration of J = 149 + 45 = 194           194
After the backward pass the network shows:

Activity   EST   EFT   Duration   LST   LFT
A           0    90     90          0    90
B          90   105     15         95   110
C         105   110      5        110   115
D         129   149     20        129   149
E         149   170     21        173   194
F          90   115     25         90   115
G         115   129     14        115   129
H         149   177     28        166   194
I          90   120     30        119   149
J         149   194     45        149   194
Backward pass:
Node   Computation                                                              LST
J      LFT = 194; LST = 194 – duration of J = 194 – 45 = 149                    149
H      LFT = 194; LST = 194 – duration of H = 194 – 28 = 166                    166
E      LFT = 194; LST = 194 – duration of E = 194 – 21 = 173                    173
D      Via E: LFT = LST of E = 173. Via H: LFT = LST of H = 166.
       Via J: LFT = LST of J = 149.
       Thus LFT of D = 149; LST of D = 149 – duration of D = 149 – 20 = 129     129
G      LFT = LST of D = 129; LST of G = 129 – duration of G = 129 – 14 = 115    115
C      LFT = LST of G = 115; LST of C = 115 – duration of C = 115 – 5 = 110     110
B      LFT = LST of C = 110; LST of B = 110 – duration of B = 110 – 15 = 95     95
F      LFT = LST of G = 115; LST of F = 115 – duration of F = 115 – 25 = 90     90
I      LFT = LST of J = 149; LST of I = 149 – duration of I = 149 – 30 = 119    119
A      Via B: LFT = LST of B = 95. Via F: LFT = LST of F = 90.
       Via I: LFT = LST of I = 119.
       Thus LFT of A = 90; LST of A = 90 – duration of A = 90 – 90 = 0          0
Critical path is AFGDJ; minimum completion time is 194 days.
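As a cross-check of the forward and backward passes, a Python sketch (activity data taken from the Klone table; it assumes the activities are listed with every predecessor before its successors):

```python
# Klone activities as (duration, predecessors), in topological order.
klone = {"A": (90, []), "B": (15, ["A"]), "C": (5, ["B"]), "F": (25, ["A"]),
         "G": (14, ["C", "F"]), "D": (20, ["G"]), "E": (21, ["D"]),
         "H": (28, ["D"]), "I": (30, ["A"]), "J": (45, ["D", "I"])}
eft = {}
for name, (dur, preds) in klone.items():           # forward pass
    eft[name] = max((eft[p] for p in preds), default=0) + dur
finish = max(eft.values())
lft = {n: finish for n in klone}
for name in reversed(list(klone)):                 # backward pass
    dur, preds = klone[name]
    for p in preds:                                # LFT of p <= LST of successor
        lft[p] = min(lft[p], lft[name] - dur)
critical = [n for n in klone if lft[n] == eft[n]]  # zero-float activities
print("critical path:", "".join(critical), "| duration:", finish, "days")
# critical path: AFGDJ | duration: 194 days
```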
Floats
The floats measure the amount of time that an activity can over-run by without affecting the total completion time.
Float = Latest finish time (LFT) – Earliest finish time (EFT)
Activity   EFT   LFT   Float   Critical
A           90    90    0      yes
B          105   110    5
C          110   115    5
D          149   149    0      yes
E          170   194   24
F          115   115    0      yes
G          129   129    0      yes
H          177   194   17
I          120   149   29
J          194   194    0      yes
From the Gantt chart below, we can see the floats
In the Gantt chart the activities are presented at their EST with the duration represented in black and float in grey.
Seminar Question
A bank is planning to install a new computerised accounting system. The bank management has determined the activities required to complete the project, the precedence relationships, and activity time estimates (in weeks) as shown in the table below.
Activity   Description                          Predecessor   Normal   Crash   Normal   Crash
                                                              time     time    cost     cost
A          Recruit staff                        –             9        7       4,800    6,300
B          Systems development                  –             11       9       9,100    15,500
C          Systems training                     A             7        5       3,000    4,000
D          Equipment training                   A             10       8       3,600    5,000
E          Manual system test                   B, C          1        1       0        0
F          Preliminary system changeover        B, C          5        3       1,500    2,000
G          Computer-personnel interface         D, E          6        5       1,800    2,000
H          Equipment modification               D, E          3        3       0        0
I          Equipment testing                    H             1        1       0        0
J          System debugging and installation    F, G          2        2       0        0
K          Equipment changeover                 G, I          8        6       5,000    7,000
a. Determine the project completion time and the critical path.
b. Represent the project as a Gantt chart.
c. Crash the network to 26 weeks. Indicate how much this will cost the bank. Identify the new critical path.
[Example is based on an example from Taylor, Bernard, Introduction to Management Science, 7th edition, Prentice Hall, 2002]
How to use MS project 2010
Using the estate agents example we had before, we can use Microsoft project to schedule the task.
The main difference we will notice at first is that MS Project uses a calendar. To make the comparison with our manual example easy, let us start the project on 1st January 2013. We calculated that the project would take at least 16 weeks. 16 weeks from Tuesday 1st January would take us to Monday 22nd April.
The data entry looks like:
Note that only the first date, 1st January, was entered; the rest were calculated by the software.
Data entered:
As a network diagram this looks like:
As a Gantt chart we get:
Using the example above:
Load Up Microsoft Office Project
Choose File, New, Blank Project
To use the system with “auto scheduled” tasks we need to turn off manually scheduled tasks.
We do this by selecting File, Options, Schedule; set “New tasks created” to “Auto Scheduled”, then click OK.
To start to enter the project details:
The screen is split in two; you will need to use more of the left-hand side to add predecessors.
The first entry is Z: Start project with 0 duration. It is here that we set the start date of 1st January
The rest of the details can now be added:
Notes: a) only the first starting date is added by us
b) The predecessors have to be added by the activity number given in the first column
To get the different displays:
Using the View tab, choose the Gantt chart icon.
You will have to increase the size of the right-hand side of the screen to see the whole chart.
Using the View tab, choose the network icon. This looks like:
Seminar Activity
Use MS project to determine the critical path and the project duration for :
a. Klone Computers
b. Bank Accounting System
Answers
Critical path is A, F, G, D, J; start date = Tuesday 1st January, end date = Friday 27th September (38 weeks and 4 days = 38 × 5 + 4 = 194 working days).
Critical path = ADGK duration = 33 weeks
Estate Agents (Gantt chart data, in weeks)
Activity:   A   B   C   D   E   F   G
EST:        0   0   9   9  12   7  15
Duration:   9   7   5   3   3   4   1
Float:      0   4   1   0   0   4   0

Klone Computers (Gantt chart data, in days)
Activity:   A    B    C    D    E    F    G    H    I    J
EST:        0   90  105  129  149   90  115  149   90  149
Duration:  90   15    5   20   21   25   14   28   30   45
Float:      0    5    5    0   24    0    0   17   29    0
age of customers
20 to 30 30 to 40 40 to 50 50 to 60 60 to 70 70 to 80 0.03 0.12 0.34 0.34 0.12 0.05 age group
probability
ages of customers
20 to 25 25 to 30 30 to 35 35 to 40 40 to 45 45 to 50 50 to 55 55 to 60 60 to 65 65 to 70 70 to 75 75 to 80 0 0.01 0.02 0.05 7.0000000000000007E-2 0.12 0.22 0.22 0.12 0.08 0.04 0.03 0.02 age group
probability
ages of customers
20 to 22 22 to 24 24 to 26 26 to 28 28 to 30 30 to 32 32 to 34 34 to 36 36 to 38 38 to 40 40 to 42 42 to 44 44 to 46 46 to 48 48 to 50 50 to 52 52 to 54 54 to 56 56 to 58 58 to 60 60 to 62 62 to 64 64 to 66 66 to 68 69 to 70 70 to 72 72 to 74 74 to 76 76 to 78 78 to 80 0 0.01 0 0.01 0.01 0.01 0.02 0.02 0.02 0.05 0.05 0.05 7.0000000000000007E-2 0.09 0.08 0.11 0.08 0.08 0.06 0.03 0.04 0.02 0.02 0.02 0.01 0.02 0.01 0.01 0 0 age groups
probability
standard normal distribution
-3 -2.9 -2.8 -2.7 -2.6 -2.5 -2.4 -2.2999999999999998 -2.2000000000000002 -2.1 -2 -1.9 -1.8 -1.7 -1.6 -1.5 -1.4 -1.3 -1.2 -1.1000000000000001 -1 -0.9 -0.8 -0.7 -0.6 -0.5 -0.4 -0.3 -0.2 -0.1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1000000000000001 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2 2.1 2.2000000000000002 2.2999999999999998 2.4 2.5000000000000102 2.6 2.7 2.80000000000001 2.9000000000000101 3.0000000000000102 4.4318484119380075E-3 5.9525324197758538E-3 7.9154515829799686E-3 1.0420934814422592E-2 1.3582969233685613E-2 1.752830049356854E-2 2.2394530294842899E-2 2.8327037741601186E-2 3.5474592846231424E-2 4.3983595980427191E-2 5.3990966513188063E-2 6.5615814774676595E-2 7.8950158300894149E-2 9.4049077376886947E-2 0.11092083467945554 0.12951759566589174 0.14972746563574488 0.17136859204780736 0.19418605498321295 0.21785217703255053 0.24197072451914337 0.26608524989875482 0.28969155276148273 0.31225393336676127 0.33322460289179967 0.35206532676429952 0.36827014030332333 0.38138781546052414 0.39104269397545588 0.39695254747701181 0.3989422804014327 0.39695254747701181 0.39104269397545588 0.38138781546052414 0.36827014030332333 0.35206532676429952 0.33322460289179967 0.31225393336676127 0.28969155276148273 0.26608524989875482 0.24197072451914337 0.21785217703255053 0.19418605498321295 0.17136859204780736 0.14972746563574488 0.12951759566589174 0.11092083467945554 9.4049077376886947E-2 7.8950158300894149E-2 6.5615814774676595E-2 5.3990966513188063E-2 4.3983595980427191E-2 3.5474592846231424E-2 2.8327037741601186E-2 2.2394530294842899E-2 1.7528300493568086E-2 1.3582969233685613E-2 1.0420934814422592E-2 7.915451582979743E-3 5.9525324197756795E-3 4.431848411937874E-3 standard deviations from the mean
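The density values plotted above follow the standard normal probability density function, which can be regenerated directly. A minimal sketch (the step size of 0.1 and range of -3 to 3 are taken from the chart's axis):

```python
import math

def std_normal_pdf(z: float) -> float:
    """Standard normal probability density at z."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

# Regenerate the plotted series for z = -3.0, -2.9, ..., 3.0.
zs = [round(-3 + 0.1 * i, 1) for i in range(61)]
densities = [std_normal_pdf(z) for z in zs]

# The curve peaks at z = 0 with density 1/sqrt(2*pi), roughly 0.3989.
print(max(densities))
```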
[Chart: normal distribution of employee ages, plotted over ages 30 to 72, shown alongside the standard normal curve.]
[Chart: pq = p(1 - p) plotted against the value of p for p from 0 to 1; the curve peaks at 0.25 when p = 0.5.]
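The plotted quantity pq = p(1 - p) is the variance term of a single Bernoulli trial, and its values can be reproduced in a few lines:

```python
# pq = p(1 - p) for p = 0, 0.1, ..., 1.0; this is the variance of a
# single Bernoulli trial, maximised at p = 0.5.
ps = [i / 10 for i in range(11)]
pqs = [p * (1 - p) for p in ps]

print(max(pqs))  # 0.25, reached at p = 0.5
```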
Employee data on age and length of service (30 employees):

Age:               24 51 25 34 40 32 23 28 21 33 26 21 57 27 23 45 22 43 35 19 28 44 40 35 48 25 43 24 30 35
Length of service:  2 24  2  9 12  6  1  5  3  4  3  1 14  5  4 13  1 11  8  1  8 12 13  3  9  2 11  4  5  6

[Chart: scatter plot of length of service against age for the 30 employees.]
[Chart: Soft toy company; the machine hours constraint line, plotted with teddy bears and rabbits on the axes.]
[Chart: Soft toy company; the machine hours and hand labour constraint lines, plotted with teddy bears and rabbits on the axes.]
[Chart: Soft toy company; the machine hours and hand labour constraint lines plus a third line from (0, 40) to (40, 0), plotted with teddy bears and rabbits on the axes.]
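In the graphical method charted above, the optimum of a linear programme lies at a corner of the feasible region, i.e. where two constraint boundaries intersect. A minimal sketch of finding such a corner, assuming each boundary is written from its axis intercepts as x/a + y/b = 1 (the intercepts used here, (45, 0)-(0, 30) for hand labour and (40, 0)-(0, 40) for the third line, are read off the chart, not given in the text):

```python
def line_from_intercepts(a: float, b: float):
    """Coefficients (A, B, C) of A*x + B*y = C for the line through (a, 0) and (0, b)."""
    return (b, a, a * b)

def intersect(l1, l2):
    """Intersection point of two lines given as (A, B, C) triples, via Cramer's rule."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("lines are parallel")
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

# Hand labour boundary through (45, 0) and (0, 30); third boundary
# through (40, 0) and (0, 40), as plotted on the chart.
hand_labour = line_from_intercepts(45, 30)
third = line_from_intercepts(40, 40)
corner = intersect(hand_labour, third)
print(corner)  # (30.0, 10.0): a corner point of the feasible region
```

Evaluating the objective function at each such corner (and at the axis intercepts that satisfy all constraints) gives the optimal production mix.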
[Chart: camping trip; one line from (0, 13) to (13, 0) and another from (0, 8) to (16, 0), plotted with 4-person tents and 8-person tents on the axes.]