4D1-9 – Stakeholders and Creating Buy-In for Implementation. See details below. Please follow all instructions and answer the questions as given.

Discussion Instructions:


Read Chapter 9 in Program Evaluation and Performance Measurement. Using the example found in question 1 on page 366, address the following in your post:

 

1. Who are the stakeholders and how do you engage stakeholder buy-in? 

2. What are the key challenges, in your role as a consultant, to implementation of a successful performance measurement system? 


* Understand and analyze successful performance measurements.

* Articulate the challenges of implementing performance measurement systems with stakeholders.

* Communicate through writing that is concise, balanced, and logically organized.

Continuity in the measures over time makes it possible, for example, to determine whether the neighborhood watch program was the likely cause of the observed changes in the reported burglary rate.

But continuity can also make a system less relevant over time. Suppose, for example, that a
performance measurement system was designed to pull data from several different databases,
and the original information system programming to make this work was expensive. Even if
the data needs change, there may well be a desire not to go back and repeat this work, simply
because of the resources involved. Likewise, if a performance measurement system is based on
a logic model that becomes outdated, then the measures will no longer fully reflect what the
program(s) or the organization is trying to accomplish. But going back to redo the logic model
(which can be a time-consuming, iterative process) may not be feasible in the short term, given
the resources available. The price of such a decision might be a gradual reduction in the
relevance of the system, which may not be readily detected.
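To make the maintenance cost concrete, the sketch below (in Python) shows the kind of hard-coded integration logic such a system can come to depend on. It is purely illustrative: the database names, tables, and fields are invented rather than drawn from the chapter.

# Hypothetical sketch: a reporting job whose queries are hard-coded against
# two separately maintained source databases (names, tables, and fields invented).
import sqlite3

def quarterly_caseload_counts(case_db_path, intake_db_path):
    conn = sqlite3.connect(case_db_path)
    families_served = conn.execute(
        "SELECT COUNT(DISTINCT family_id) FROM case_files WHERE status = 'closed'"
    ).fetchone()[0]
    conn.close()

    conn = sqlite3.connect(intake_db_path)
    referrals = conn.execute(
        "SELECT COUNT(*) FROM referrals WHERE fiscal_quarter = 'Q4'"
    ).fetchone()[0]
    conn.close()

    # If the constructs to be measured change (for example, a revised logic
    # model adds new outcomes), this integration layer has to be reopened and
    # rewritten, which is where the cost of revisiting the system lies.
    return {"families_served": families_served, "referrals": referrals}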

With all the activity to design and implement performance measurement and reporting
systems, there has been surprisingly little effort to date to evaluate their effectiveness
(McDavid & Huse, 2012). In Chapter 10, we will discuss what is known now about the ways in
which performance information is used, but it is appropriate here to suggest some practical
steps to generate feedback that can be used to modify and better sustain performance
measurement systems:

• Develop channels for user feedback. This step is intended to create a process that will
allow the users to provide feedback and suggest ways to revise, review, and update the
performance measures. Furthermore, this step is intended to help identify when
corrections are required and how to address errors and misinterpretations of the data.

• Create an expert review panel of persons who are knowledgeable about performance measurement but do not have a stake in the system being reviewed. Performance measurement is conducted on an ongoing basis, and this expert panel can provide feedback and address issues and problems over a long-term time frame. A review panel can also provide an independent assessment of buy-in and use of performance information by managers and staff, and track the (intended and unintended) effects of the system on the organization.

The credibility of performance information is an enduring concern. Davies and Warman
(1998) point to the importance of auditing in the context of the performance reports of the
(British) National Meteorological Office:

An independent audit, then, is not a luxury, it is a necessity. The credibility of the whole
system of agencies is put at risk if the data from one is found to be unverified and open to
dispute. Where performance-related bonuses are linked with outcomes, it is unreasonable
to expect staff concerned to be responsible for the measurement and reporting of results in
an objective manner when the very same results will determine their own pay. (p. 47)

Legislative auditors, in addition to recommending principles to guide public performance
reporting, have been active in promoting audits of performance reports (CCAF-FCVI, 2002;
Klay et al., 2004). Externally auditing the performance reporting process is suggested as an
important part of ensuring the longer-term credibility of the system. With varying degrees of
regularity and intensity, external audits of performance reports are occurring in some
jurisdictions at the national, state, provincial, and/or local levels (Gill, 2011; Schwartz & Mayne, 2005). In Britain, for example, between 2003 and 2010, the National Audit Office (NAO) conducted assessments of the performance measures that were integral to the Public Service
Agreements between departments and the government. The NAO audits focused on
performance data systems “to assess whether they are robust, and capable of providing reliable,
valid information” (NAO, 2009).

PERFORMANCE MEASUREMENT FOR PUBLIC
ACCOUNTABILITY

Performance results can be used for two general purposes: (1) to meet public accountability
expectations and (2) to improve performance. Together, these purposes are often referred to as
performance management.

Many jurisdictions have embraced results-focused performance measurement systems with
the goal of improving public accountability (Dubnick, 2005). Performance measurement
systems can be developed so that the primary emphasis, as they are implemented, is on setting
public performance targets for each organization, measuring performance, and, in public
reports, comparing actual results with targeted outcomes. Usually, performance reports are
prepared at least annually and delivered to external stakeholders. In most jurisdictions, elected
officials and the public are the primary recipients.

The logic that has tended to underlie these systems assumes that public performance
reporting can also drive performance improvement, in that an approach that makes public
accountability the principal goal gives organizations the incentive to become more efficient
and effective (Auditor General of British Columbia, 1996). Performance improvements are
expected to come about because elected officials and other stakeholders can put pressure, via
public performance reports, on organizations to “deliver” results. Fully realized performance
management systems are expected to include internal organizational performance incentives
that are geared toward improving performance (Moynihan, 2008).

Figure 9.3 is a normative model of key intended relationships between performance
measurement, public reporting, public accountability, and performance improvement. In the
figure, public performance reporting is expected to contribute to both public accountability and
performance improvement. Furthermore, performance improvement and public accountability
are expected to reinforce each other.

The model in Figure 9.3 suggests expected relationships among performance measurement,
public reporting, public accountability, and performance improvement that are implied in
governmental reforms in many jurisdictions. As we have noted earlier, however, public
performance reporting introduces a higher-stakes side of developing and implementing
performance measurement systems. Once performance information is rendered in public
reports, it can be used in ways that have consequences (intended and unintended) for both
managers and elected officials. The literature on using performance information is rich with
findings that suggest that the characteristics of the political culture in which government
organizations are embedded can substantially influence both the quality and the uses of
performance information (de Lancer Julnes, 2006; de Lancer Julnes & Holzer, 2001; Thomas,
2006).

Figure 9.3 A Normative Model of the Intended Relationship Between Public Accountability
and Performance Improvement

We have suggested in this chapter that performance measurement systems, to be
sustainable, need to be designed and implemented so that managerial use of performance
information is the central purpose. In contrast to the relationships suggested in Figure 9.3, we
believe that in many settings, particularly where the political culture is adversarial, public
performance reporting may undermine the use of the performance information for performance
improvement (McDavid & Huse, 2012). We will explore this problem in Chapter 10.

SUMMARY

The 12 criteria for designing and implementing performance measurement systems discussed
in this chapter reflect both a technical/rational and a political/cultural view of organizations.
Both perspectives are important in designing and implementing performance measurement
systems that are sustainable. Collectively, the criteria impose some demanding requirements on
the process. It is quite likely that in any given situation, one or more of these criteria will be
difficult to address. Does that mean that, unless performance measurement systems are
designed and implemented with these 12 criteria in view, the system will fail? No, but it is
reasonable to assert that each criterion is important and does enhance the likelihood of success.

Our view is that for performance measurement systems to be sustainable, managerial
involvement is key. In this chapter, we have developed an approach to performance
measurement that emphasizes utilization of the information obtained for improving
performance. Performance measurement for public accountability is one purpose of such
systems, but making that the main purpose will tend to weaken managerial commitment to the
system over time, and thus undermine the usefulness of the measures for improving efficiency
and effectiveness.

Among the 12 criteria, six are more critical. Each contributes something necessary for
successful design and implementation, and again these reflect a mix of technical/rational and
organizational-political/cultural perspectives.

1. Sustained leadership: Without this, the process will drift and eventually halt.

2. Good communications: They are essential to developing a common understanding of the process and increasing the likelihood of buy-in.

3. Clear expectations from the system: Be open and honest about the purposes behind the process so that key stakeholders (managers and others) are not excluded or blindsided.

4. Resources sufficient to free up the time and expertise needed: When resources are taken away from other programs to measure and report on performance, the process is viewed as a competitor to important organizational work and is often given short shrift.

5. Logic models that identify the key program and organizational constructs: The process of logic modeling is very important to informing the selection of constructs and the development of performance measures (see the sketch after this list).

6. A measurement process that succeeds in producing valid measures in which stakeholders have confidence: Too few performance measurement systems pay attention to the measurement validity and reliability criteria that ultimately determine the perceived usefulness of the system.
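To illustrate criterion 5, here is a minimal, hypothetical sketch (in Python) of a logic model expressed as a simple chain of constructs from inputs to outcomes. The program content and field names are invented, not drawn from the chapter; only the chain itself reflects the logic-modeling idea in the text.

# Hypothetical sketch of a program logic model as a simple data structure.
# The program content is invented; only the inputs -> activities -> outputs
# -> outcomes chain reflects the logic-modeling idea discussed in the text.
from dataclasses import dataclass, field

@dataclass
class LogicModel:
    program: str
    inputs: list = field(default_factory=list)
    activities: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    outcomes: list = field(default_factory=list)  # the constructs to be measured

family_support = LogicModel(
    program="Family support services",
    inputs=["caseworkers (FTEs)", "program budget"],
    activities=["home visits", "parenting workshops"],
    outputs=["families served", "workshops delivered"],
    outcomes=["improved parenting skills", "children remaining in the home"],
)

Writing the chain down in some explicit form, however informally, is what allows an organization to see which constructs, particularly the outcomes, still lack credible measures.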

These six criteria can be thought of as individually necessary, but they will vary in
importance in each situation. Performance measurement is a craft. In that respect, it is similar
to program evaluation. There is considerable room for creativity and professional judgment as
organizations address the challenges of measuring results.

DISCUSSION QUESTIONS

1. Assume that you are a consultant to the head of a government agency (1,000 employees)
that delivers social service programs to families. The families have incomes below the
poverty line, and most of them have one parent (often the mother) who is either working
for relatively low wages or is on social assistance. The agency is under some pressure to
develop performance measures as part of a broad government initiative to make service
organizations more efficient and effective. In your role, you are expected to give advice
to the department head that will guide the organization into the process of developing
and implementing a performance measurement system. What advice would you give
about getting the process started? What things should the department head do to increase
the likelihood of success in implementing performance measures? How should he or she
work with managers and staff to get them onside with this process? Try to be realistic in
your advice—assume that there will not be significant new resources to develop and
implement the performance measurement system.

2. Performance measurement systems are usually intended to improve the efficiency and
effectiveness of programs or organizations (improve performance). But, very few
organizations have taken the time to assess whether their performance measurement
systems are actually making a difference. Suppose that the same organization that was
referred to in Question 1 has implemented its performance measurement system. Assume
it is three years later. The department head now wants to find out whether the system has
actually improved the efficiency and effectiveness of the agency’s programs. Suppose
that you are giving this person advice about how to design an evaluation project to assess
whether the performance measurement system has “delivered.” Think of this as an
opportunity to apply your program evaluation skills to finding out whether this
performance measurement system was successfully implemented. What would be
possible criteria for the success of the system? How would you set up research designs
that would allow you to see whether the system had the intended incremental effects?
What would you measure to see if the system has been effective? What comparisons
would you build into the evaluation design?

APPENDIX A: Organizational Logic Models

Table 9A.1 Logic Model for Ministry of Human Resources, British Columbia, Canada

Note: FTE = Full-Time Equivalent; HRDC = Human Resources Development Canada.

a. Employee Performance Development Plans.

Figure 9A.1 Organizational Logic Model for Human Resources and Skills Development
Canada

Source: Integrated Business Plan 2010–2013, Human Resources and Skills Development Canada, 2010, page 16, http://publications.gc.ca/collections/collection_2010/rhdcc-hrsdc/HS1-11-2010-1-eng. Reproduced with the permission of the Minister of Public Works and Government Services Canada, 2012.

REFERENCES

Auditor General of British Columbia. (1996). 1996 Annual report: A review of the activities of
the office. Victoria, British Columbia, Canada: Queen’s Printer.

Bakvis, H., & Juillet, L. (2004). The horizontal challenge: Line departments, central agencies
and leadership. Ottawa, Ontario, Canada: Canada School of Public Service.

Bevan, G., & Hamblin, R. (2009). Hitting and missing targets by ambulance services for
emergency calls: Effects of different systems of performance measurement within the UK.
Journal of the Royal Statistical Society. Series A (Statistics in Society), 172(1), 161–190.

Bish, R., & McDavid, J. C. (1988). Program evaluation and contracting out government
services. Canadian Journal of Program Evaluation, 3(1), 9–23.

Brimson, J. (1991). Activity accounting: An activity-based costing approach. New York:
Wiley.

Campbell, D. (2002). Outcomes assessment and the paradox of nonprofit accountability.
Nonprofit Management & Leadership, 12(3), 243–259.

CCAF-FCVI. (2002). Reporting principles: Taking public performance reporting to a new
level. Ottawa, Ontario, Canada: Author.

Davies, M., & Warman, A. (1998). Auditing performance indicators: The meteorological
office case study. Journal of Cost Management (January/February), 43–48.

Davies, R., & Dart, J. (2005). The “Most Significant Change” (MSC) technique: A guide to its
use. Retrieved from http://www.mande.co.uk/docs/MSCGuide

de Lancer Julnes, P. (1999). Lessons learned about performance measurement. International
Review of Public Administration, 4(2), 45–55.

de Lancer Julnes, P. (2006). Performance measurement: An effective tool for government
accountability? The debate goes on. Evaluation, 12(2), 219–235.

de Lancer Julnes, P., & Holzer, M. (2001). Promoting the utilization of performance measures
in public organizations: An empirical study of factors affecting adoption and
implementation. Public Administration Review, 61(6), 693–708.

de Waal, A. A. (2003). Behavioral factors important for the successful implementation and use
of performance management systems. Management Decision, 41(8), 688–697.

Dubnick, M. (2005). Accountability and the promise of performance: In search of the
mechanisms. Public Performance & Management Review, 28(3), 376–417.

Gill, D. (Ed.). (2011). The iron cage recreated: The performance management of state
organisations in New Zealand. Wellington, New Zealand: Institute of Policy Studies.

Goodwin, L. D. (1997). Changing conceptions of measurement validity. Journal of Nursing
Education, 36(3), 102–107.

Government of Alberta. (1995). Government Accountability Act. Revised Statutes of Alberta
2000, Chapter G-7 (2009 compilation). Edmonton, Alberta: Alberta Queen’s Printer.

Government of Alberta. (2011). Government of Alberta 2010–11 annual report. Retrieved from http://www.finance.alberta.ca/publications/measuring/ministry-annual-reports.html

Government of British Columbia. (2000). Budget Transparency and Accountability Act: [SBC
2000 Chapter 23]. Victoria, British Columbia: Queen’s Printer.

Government of British Columbia. (2001). Budget Transparency and Accountability Act [SBC
2000 Chapter 23] (amended). Victoria, British Columbia: Queen’s Printer.

Hildebrand, R., & McDavid, J. (2011). Joining public accountability and performance
management: A case study of Lethbridge, Alberta. Canadian Public Administration, 54(1),
41–72.

Hood, C. (1991). A public management for all seasons? Public Administration, 69(1), 3–19.

Hood, C. (2006). Gaming in targetworld: The targets approach to managing British public services. Public Administration Review, 66(4), 515–521.

Human Resources and Skills Development Canada. (2010). Integrated business plan 2010–2013, p. 16. Retrieved from http://publications.gc.ca/collections/collection_2010/rhdcc-hrsdc/HS1-11-2010-1-eng

Kaplan, R. S., & Norton, D. P. (1996). The balanced scorecard: Translating strategy into
action. Boston, MA: Harvard Business School Press.

Kates, J., Marconi, K., & Mannle, T. E., Jr. (2001). Developing a performance management
system for a federal public health program: The Ryan White CARE ACT Titles I and II.
Evaluation and Program Planning, 24(2), 145–155.

Klay, W. E., McCall, S. M., & Baybes, C. E. (2004). Should financial reporting by government
encompass performance reporting? Origins and implications of the GFOA-GASB conflict.
In A. Khan & W. B. Hildreth (Eds.), Financial management theory in the public sector (pp.
115–140). Westport, CT: Praeger.

Kravchuk, R. S., & Schack, R. W. (1996). Designing effective performance-measurement
systems under the Government Performance and Results Act of 1993. Public
Administration Review, 56(4), 348–358.

Levine, C. H., Rubin, I., & Wolohojian, G. G. (1981). The politics of retrenchment: How local
governments manage fiscal stress (Vol. 130). Beverly Hills, CA: Sage.

Martin, L. L., & Kettner, P. M. (1996). Measuring the performance of human service
programs. Thousand Oaks, CA: Sage.

Mayne, J. (2001). Addressing attribution through contribution analysis: Using performance
measures sensibly. Canadian Journal of Program Evaluation, 16(1), 1–24.

Mayne, J. (2008). Building an evaluative culture for effective evaluation and results management. Retrieved from http://www.cgiar-ilac.org/files/publications/briefs/ILAC_Brief20_Evaluative_Culture

Mayne, J., & Rist, R. C. (2006). Studies are not enough: The necessary transformation of
evaluation. Canadian Journal of Program Evaluation, 21(3), 93–120.

McDavid, J. C. (2001a). Program evaluation in British Columbia in a time of transition: 1995–2000. Canadian Journal of Program Evaluation, 16(Special Issue), 3–28.

McDavid, J. C. (2001b). Solid-waste contracting-out, competition, and bidding practices
among Canadian local governments. Canadian Public Administration, 44(1), 1–25.

McDavid, J. C., & Huse, I. (2006). Will evaluation prosper in the future? Canadian Journal of
Program Evaluation, 21(3), 47–72.

McDavid, J. C., & Huse, I. (2012). Legislator uses of public performance reports: Findings
from a five-year study. American Journal of Evaluation, 33(1), 7–25.

Morgan, G. (2006). Images of organization (Updated ed.). Thousand Oaks, CA: Sage.

Moynihan, D. P. (2008). The dynamics of performance management: Constructing information and reform. Washington, DC: Georgetown University Press.

Moynihan, D. P., Pandey, S. K., & Wright, B. E. (2012). Setting the table: How transformational leadership fosters performance information use. Journal of Public Administration Research and Theory, 22(1), 143–164.

National Audit Office. (2009). Performance frameworks and board reporting: A review by the performance measurement practice. Retrieved from http://www.nao.org.uk/guidance__good_practice/performance_measurement1.aspx

Newcomer, K. E. (Ed.). (1997). Using performance measurement to improve public and
nonprofit programs (New Directions for Evaluation, No. 75). San Francisco, CA: Jossey-
Bass.

Norman, R. (2001). Letting and making managers manage: The effect of control systems on
management action in New Zealand’s central government. International Public
Management Journal, 4(1), 65–89.

Norman, R., & Gregory, R. (2003). Paradoxes and pendulum swings: Performance
management in New Zealand’s public sector. Australian Journal of Public Administration,
62(4), 35–49.

Oregon Progress Board. (2003). Is Oregon making progress? The 2003 benchmark
performance report. Salem, OR: Author.

Otley, D. (2003). Management control and performance management: Whence and whither?
British Accounting Review, 35(4), 309–326.

Pollitt, C. (2007). Who are we, what are we doing, where are we going? Retrieved from http://www.koz-gazdasag.hu/images/stories/2per1/8-pollitt

Pollitt, C., Bal, R., Jerak-Zuiderent, S., Dowswell, G., & Harrison, S. (2010). Performance
regimes in health care: Institutions, critical junctures and the logic of escalation in England
and the Netherlands. Evaluation, 16(1), 13–29.

Prebble, R. (2010). With respect: Parliamentarians, officials, and judges too. Wellington, New
Zealand: Victoria University of Wellington, Institute of Policy Studies.

Propper, C., & Wilson, D. (2003). The use and usefulness of performance measures in the
public sector. Oxford Review of Economic Policy, 19(2), 250–267.

Schwartz, R., & Mayne, J. (Eds.). (2005). Quality matters: Seeking confidence in evaluating,
auditing, and performance reporting. New Brunswick, NJ: Transaction Publishers.

Senge, P. M. (1990). The fifth discipline: The art and practice of the learning organization (1st
ed.). New York: Doubleday/Currency.

Sigsgaard, P. (2002). MCS approach: Monitoring without indicators. Evaluation Journal of
Australasia, 2(1), 8–15.

Staudohar, P. D. (1975). An experiment in increasing productivity of police service employees.
Public Administration Review, 35(5), 518.

Texas State Auditor’s Office. (2012). SAO reports: Audits of performance measures. Retrieved from http://www.sao.state.tx.us/Reports/reportpost.cfm/perfmeas/yes

Thomas, P. G. (2004). Performance measurement, reporting and accountability: Recent trends and future directions (SIPP Public Policy Paper Series No. 23). Retrieved from http://www.uregina.ca/sipp/documents/pdf/PPP23_P%20Thomas

Thomas, P. G. (2006). Performance measurement, reporting, obstacles and accountability: Recent trends and future directions. Canberra, ACT, Australia: ANU E Press. Retrieved from http://epress.anu.edu.au/anzsog/performance/pdf/performance-whole

Thor, C. G. (2000). The evolution of performance measurement in government. Journal of
Cost Management, May/June, 18–26.

Treasury Board of Canada Secretariat. (2012a). Policy on evaluation. Retrieved from
http://www.tbs-sct.gc.ca/pol/doc-eng.aspx?id=15024

Treasury Board of Canada Secretariat. (2012b). Policy on management, resources and results structures. Retrieved from http://www.tbs-sct.gc.ca/pol/doc-eng.aspx?evttoo=X&id=18218&section=text

Wildavsky, A. B. (1979). Speaking truth to power: The art and craft of policy analysis.
Boston, MA: Little Brown.

Williams, D. W. (2003). Measuring government in the early twentieth century. Public
Administration Review, 63(6), 643–659.

Wilson, J. Q. (1989). Bureaucracy: What government agencies do and why they do it. New
York: Basic Books.

WorkSafeBC. (2011). 2010 annual report and 2011–2013 service plan. Retrieved from http://www.worksafebc.com/publications/reports/annual_reports/assets/pdf/2010/AnnualReport

U.S. Government Accountability Office. (2011). GPRA Modernization Act implementation
provides important opportunities to address government challenges (GAO-11–617T).
Retrieved from http://www.gao.gov/assets/130/126150

Von Bertalanffy, L. (1968). General system theory: Foundations, development, applications (Rev.
ed.). New York: G. Braziller.

Wandersman, A., & Fetterman, D. (2007). Empowerment evaluation: Yesterday, today, and
tomorrow. American Journal of Evaluation, 28(2), 179–198.

Wiebe, R. H. (1962). Businessmen and reform: A study of the progressive movement. Cambridge, MA: Harvard University Press.

Wholey, J. S. (2001). Managing for results: Roles for evaluators in a new management era.
American Journal of Evaluation, 22(3), 343–347.

Wilson, W. (1887). The study of administration. Political Science Quarterly, 2(2), 197–222.

CHAPTER 9

DESIGN AND IMPLEMENTATION OF
PERFORMANCE MEASUREMENT SYSTEMS

Introduction
Key Steps in Designing and Implementing a Performance Measurement System
Identify the Organizational Champions of This Change
Understand What Performance Measurement Systems Can and Cannot Do
Establish Multichannel Ways of Communicating That Facilitate Top-Down, Bottom-Up, and Horizontal Sharing of Information, Problem Identification, and Problem Solving
Clarify the Expectations for the Intended Uses of the Performance Information That Is Created
Identify the Resources Available for Designing, Implementing, and Maintaining the Performance Measurement System
Take the Time to Understand the Organizational History Around Similar Initiatives
Develop Logic Models for the Programs for Which Performance Measures Are Being Developed, and Identify the Key Constructs to Be Measured
Identify Any Constructs That Apply Beyond Single Programs
Involve Prospective Users in Reviewing Logic Models and Constructs in the Proposed Performance Measurement System
Measure the Constructs That Have Been Identified as Parts of the Performance Measurement System
Record, Analyze, Interpret, and Report the Performance Data
Regularly Review Feedback From the Users and, If Needed, Make Changes to the Performance Measurement System
Performance Measurement for Public Accountability
Summary
Discussion Questions
Appendix A: Organizational Logic Models
References

INTRODUCTION

In this chapter, we begin by introducing two complementary perspectives on public sector
organizations: (1) a technical/rational view that emphasizes systems and structures and (2) a
political/cultural view that emphasizes the dynamics that develop when we take into account
people interacting to get things done. Then, we introduce and elaborate 12 steps that are
important in designing and implementing performance measurement systems. These steps
reflect both the technical/rational and the political/cultural perspectives on organizations. As
we describe each step, we offer advice and also point to possible pitfalls and limitations while
working within complex organizations. The chapter ends with a section that serves as a
transition to Chapter 10, which discusses the uses of performance results.

The process of designing and implementing performance measurement systems uses core
knowledge and skills that are also a part of designing, conducting, and reporting program
evaluations. In Chapter 8, we pointed out that program evaluation and performance
measurement share core knowledge and skills including logic modeling and measurement. In
addition, understanding research designs and the four kinds of validity we described in Chapter
3 is valuable for understanding and working with the strengths and limitations of performance
measurement systems.

In Chapter 1, we outlined the steps that make up a typical program evaluation. In this
chapter, we will do the same for performance measurement systems, understanding that for
each situation, there will be unique circumstances that can result in differences between the
checklist below and the process that is appropriate for that context. Each of the 12 steps of
designing and implementing a performance measurement system is elaborated to clarify issues
and possible problems. We distinguish designing and implementing performance measurement
systems from the uses of such systems. Usage is a critical topic on its own, and we will
elaborate on it in Chapter 10.

Designing and implementing performance measurement systems can be a significant
organizational change, particularly in public sector organizations that have focused on
processes instead of results. Depending on the origins of such an initiative (external to the
organization, internal, top-down, or manager driven), different actors and factors will be more
or less important. When we design and implement performance measurement systems that are
intended to be sustainable, we must go beyond normative frameworks that focus on technical
and rational steps, and consider the “psychological, cultural, and political implications of
organizational change” (de Lancer Julnes, 1999, p. 49). de Lancer Julnes and Holzer (2001)
have distinguished a rational/technical framework and a political/cultural framework as key to
understanding the successful adoption, implementation, and use of performance measures.

The technical/rational perspective is grounded in a view of organizations as complex
rational means–ends systems that are designed to achieve purposive ends. This view
emphasizes the importance of systems and formal structures as keys to understanding how
complex organizations work and how to change them. With respect to performance
measurement systems, as they are designed and implemented there are rational and technical
factors to keep in mind. These factors include having sufficient resources, training people
appropriately, aligning management systems, developing appropriate information systems, and
developing valid and reliable performance measures. It is important to have an overall plan that organizes the process, including who should be involved at different stages, how the stages are sequenced in time, what is expected—and from whom—as each stage is implemented, and how the overall system is expected to function once it has been implemented.

The political/cultural perspective on organizations emphasizes the people dynamics in
organizations rather than the systems and structures in which they are embedded.
Organizations as political systems is one of the metaphors that Gareth Morgan (2006) includes
in his seminal book Images of Organization. This view of organizations involves
understanding how people interact with and in complex organizations. Performance
management systems and structures play a role, but individuals and coalitions can influence
and even negate the results intended from them. Organizational politics is an inevitable and
important feature of organizational dynamics. Politics does not have to be about political
parties or formal political allegiances. Instead, it is essentially about the processes (both formal
and informal) that are used to allocate scarce resources among competing values. Even though
there will be organizational and program objectives, with resources being devoted to their
achievement (the rational purposes of organizations), there will also be interests and incentives,
and coalitions of stakeholders who can either facilitate implementing and using performance
measurement systems, or impede them. Organizations are more than systems and structures.
They are fundamentally about people interacting in patterns that reflect both the intended
outcomes of the organizations as well as their own personal or group objectives (which may or
may not support the stated organizational objectives).

Overlaid on these two views of organizations is the wide range of environments in which
organizations can be embedded. In Chapter 8, we introduced the idea of complex systems to
show how complexity can serve as a useful lens to understand evolving organizations in
evolving environments. What we will see in Chapter 10 is that some environments for
organizations are more conducive to sustaining performance measurement systems than others.
Where performance measurement is focused on public reporting in high-stakes, accountability-
oriented environments, it can be challenging to construct and maintain performance
measurement systems. One “solution” that we will explore in Chapter 10 is to decouple the
performance measurement system that is used for (internal) performance management from the
performance measures that are used for external reporting (McDavid & Huse, 2012).

The 12 steps discussed in this chapter outline a process that is intended to increase the
chances that a performance measurement system will be successfully implemented and
sustained. A key part of sustaining performance measurement as an evaluative function in
organizations is to use the performance information (Moynihan, Pandey, & Wright, 2012). In
other words, there must be a demand for performance information, to sustain the supply.
Supplying performance information (e.g., preparing performance reports) where there is
limited or no demand tends to undermine the credibility of the system—lack of use is an
indication that the system is not aligned with actual substantive organizational priorities. In
many situations, the conditions under which actual organizations undertake the development of
performance measures are less than ideal. In the summary to this chapter, we identify the six
steps among the 12 that are most critical if organizations want performance measurement
systems that can contribute to managerial and organizational efforts to improve efficiency,
effectiveness, and accountability.

KEY STEPS IN DESIGNING AND IMPLEMENTING A
PERFORMANCE MEASUREMENT SYSTEM

Table 9.1 summarizes the key steps in designing and implementing a performance
measurement system. Each of these steps can be viewed as a guideline—no single performance
measurement development and implementation process will conform to all of them. In some cases, the process may diverge from the sequence of steps. Again, this could be due to local
factors. Each of the steps in Table 9.1 is discussed more fully in the following sections. Our
discussion of the steps is intended to do two things: (1) elaborate on what is involved and (2)
point out limitations and pitfalls along the way. As you review the steps, you will see that most
of them acknowledge the importance of both a rational/technical and a political/cultural view
of organizations. Beyond the technical issues, it will be important to consider the interactions
among the people, incentives, history, and who wins and who loses.

One way that we can look at these 12 steps is to divide them between a technical/rational
and a cultural/political perspective on organizations. Among the steps, the majority are more
closely aligned with the political/cultural view of organizations: identifying the champions of
this change, understanding what performance measurement systems can actually do (and not
do), establishing and using communication channels, clarifying intended uses (for all the
stakeholders involved), understanding the organizational history and its impacts on this change
process, involving users in developing models and performance measures, and regularly
reviewing and acting on user feedback. The others—identifying resources, developing logic
models, identifying constructs that span programs or the whole organization, measuring
constructs, and analyzing and reporting performance results—are more closely aligned with a
technical/rational view of organizations. Our approach emphasizes the importance of both
perspectives and their complementarity in building and implementing sustainable performance
measurement systems.

Table 9.1 Key Steps in Designing and Implementing a Performance Measurement System

1. Identify the organizational champions of this change.
2. Understand what a performance measurement system can and cannot do and why it is needed.
3. Establish multichannel ways of communicating that facilitate top-down, bottom-up, and horizontal sharing of information, problem identification, and problem solving.
4. Clarify the expectations for the uses of the performance information that will be created.
5. Identify the resources available for developing, implementing, maintaining, and renewing the performance measurement system.
6. Take the time to understand the organizational history around similar initiatives.
7. Develop logic models for the programs or lines of business for which performance measures are being developed.
8. Identify additional constructs that are intended to represent performance for aggregations of programs or the whole organization.
9. Involve prospective users in reviewing the logic models and constructs in the proposed performance measurement system.
10. Measure the key constructs in the performance measurement system.
11. Record, analyze, interpret, and report the performance data.
12. Regularly review feedback from users and, if needed, make changes to the performance measurement system.

Identify the Organizational Champions of This Change

The introduction of performance measurement, particularly measuring outcomes, is an
important change in both an organization’s way of doing business and its culture (de Lancer
Julnes & Holzer, 2001). Unlike program evaluations, performance measurement systems are
ongoing, and it is therefore important that there be organizational leaders who are champions
of this change, to provide continuing support for the process from its inception onward. In
many cases, an emphasis on measuring outcomes is a significant departure from existing
practices of tracking program inputs (money, human resources), program activities, and
program outputs (work done). Most managers have experience measuring/recording inputs,
processes, and outputs, so the challenge in outcome-focused performance measurement is in
specifying the expected outcomes (stating clear objectives for programs, lines of business, or
organizations) and facilitating organizational commitment to the process of measuring and
working with outcome-related results.

By including outcomes, performance measurement commits organizations to comparing
their actual results with the stated objectives. In many jurisdictions, objectives are parsed into
annual targets, and actual outcomes are compared with the targets for that year. Thus, the
performance measurement information commonly is intended to serve multiple purposes,
including enhancing managerial decision making, encouraging organizational alignment, and
promoting transparency and accountability.
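As a minimal illustration of that comparison (a hypothetical sketch in Python; the measures and figures are invented, not taken from the chapter), the basic logic is simply a measure-by-measure variance between targeted and actual results:

# Hypothetical annual target-versus-actual comparison; measures and figures invented.
targets = {"families served": 1200, "children remaining in the home (%)": 85}
actuals = {"families served": 1135, "children remaining in the home (%)": 82}

for measure, target in targets.items():
    actual = actuals[measure]
    variance = actual - target
    print(f"{measure}: target {target}, actual {actual}, variance {variance:+}")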

New Public Management emphasizes the (normative) importance of freeing managers from
“red tape,” that is, process-related restrictions, so that they can more efficiently and effectively
use the resources that are available (Moynihan, 2008; Norman & Gregory, 2003). Managerial
flexibility, coupled with measures for intended outcomes, is expected to give managers incentives to improve their operations. In Chapter 10, we will look at the actual uses of performance
information in governments and public sector organizations and explore in some depth the
incentives for managers to become involved in developing and using performance measures.

Because performance measurement systems are ongoing, it is important that the champions
of this change support the process from its inception onward. Moynihan et al. (2012) suggest
that leadership commitment is critical to the process and also affects performance information
uses. The nature of performance measures is that they create new information—a potential
resource in public and nonprofit organizations. Information can reduce uncertainty with respect
to the questions it is intended to answer, but the process of building performance measurement
into the organization’s business can significantly increase uncertainty for managers. The
changes implied by measuring results (outcomes), reporting results, and being held
accountable for results can loom large as the system is being designed and implemented. If a
performance measurement system is implemented as a top-down initiative, managers may see
this as a threat to their existing practices. Typically, some will resist this change, and if
leadership commitment is not sustained, the transition to performance measurement as a part of
managing programs will wane with time (de Waal, 2003).

A results-oriented approach to managing has implications for public sector accountability.
In many jurisdictions, public organizations are still expected to operate in ways that conform to
process-focused notions of accountability. In Canada, for example, the Westminster
parliamentary system makes the minister who heads each government department nominally
accountable for all that happens in his or her domain. The adversarial nature of politics,
combined with the tendency of the media and interest groups to emphasize mistakes that
become public, can bias managerial behavior toward a procedurally focused process, wherein
only “safe” decisions are made (Propper & Wilson, 2003). Navigating such environments
while working to implement performance measurement systems requires leadership that is willing to embrace some risks, not only in developing the system but in encouraging a culture
wherein performance results are used to inform decision making. We explore these issues in
much greater detail in Chapters 10 and 11.

In most governmental settings, leadership at two levels is required. Senior executives in a
ministry or department must actively support the process of constructing and implementing a
performance measurement system. But it is equally important that the political leadership be
supportive of the development, implementation, and use of a performance measurement
system. The key intended users of performance information that is publicly reported are the
elected officials (of all the political parties) (McDavid & Huse, 2012).

In British Columbia, Canada, for example, the Budget Transparency and Accountability
Act (Government of British Columbia, 2001) specifies that annual performance reports are to
be tabled in the legislative assembly. The goal is to have committees of the legislature review
these reports and use them as they scrutinize ministry operations and future budgets. Each year,
the public reports are tabled in June and are based on the actual results for the fiscal year
ending March 31. Strategically, the reports should figure in the budgetary process for the
following year, which begins in the fall. If producing and publishing these performance reports
is not coupled with scrutiny of the reports by legislators, then a key reason for committing
resources to this form of public accountability is undermined. In Chapter 10, we will look at
the ways in which elected officials actually use performance reports.

In summary, an initial organizational commitment to performance measurement, which
typically includes designing the system, can produce “results” that are visible (e.g., a website
with the performance measurement framework), but implementing and working with the
system over three to five years is a much better indicator of its sustainability, and for this to
happen, it is critical to have organizational champions of the process.

Understand What Performance Measurement Systems Can and Cannot Do

There are limitations to what performance measurement systems can do, yet in some
jurisdictions, performance measurement has been treated as a cost-effective substitute for
program evaluation (Martin & Kettner, 1996). Public sector downsizing has diminished the
resources committed to program evaluations, and managers have been expected to initiate
performance measurement instead (McDavid, 2001b). The emphasis on performance reporting
for public accountability, and the assumption that such reporting can drive performance improvements, is
the principal reason for making performance measurement the central evaluative approach in
many organizations. We will look at this assumption in Chapter 10 when we discuss the uses
of performance information when public reporting is mandated.

Performance measurement can be a powerful tool in managing programs or organizations.
If the measures are valid and the information is timely, emerging trends can signal possible problems (a negative-feedback mechanism) as well as possible successes (positive feedback).
But performance measurement results only describe what is going on; they do not explain why
it is happening (McDavid & Huse, 2006; Newcomer, 1997).

Recall the distinction between intended outcomes and actual outcomes (introduced in
Chapter 1). Programs are designed to produce specified outcomes, and one way to judge the
success of a program is to see whether the intended outcomes have actually occurred. If the
actual outcomes match the intended outcomes, we might be prepared to conclude that the
program was effective.

However, we cannot conclude that the outcomes are due to the program unless we have
additional information that supports the assumption that other factors in the environment could
not have caused the observed outcomes. Getting that information is at the core of what program evaluation is about, and it is essential that those using performance measurement
information understand this distinction. As Martin and Kettner (1996) commented when
discussing the cause-and-effect relationship that many people mistakenly understand to be
implied in performance measurement information, “Educating stakeholders about what
outcome performance measures really are, and what they are not, is an important—and little
discussed—problem associated with their use by human service programs” (p. 56).

Establishing the causal link between observed outcomes and the program that was intended
to produce them is the attribution problem. Some analysts have explicitly addressed this
problem for performance measurement. Mayne (2001) offers six strategies intended to reduce
the uncertainty about whether the observed performance measurement outcomes can be
attributed to the program. Briefly, his suggestions are as follows: (1) develop an intended-
results chain; (2) assess the existing research/evidence that supports the results chain; (3)
assess the alternative explanations for the observed results; (4) assemble the performance story;
(5) seek out additional evidence, if necessary; and (6) revise and strengthen the performance
story. Several of his suggestions are common to both program evaluation and performance
measurement, as we have outlined them in this book. His final (seventh) suggestion is to do a program evaluation if the performance story is not sufficient to address the attribution question. This suggestion supports a key theme of this book—that performance measurement
and program evaluation are complementary, and each offers ways to reduce uncertainty for
managers and other stakeholders in public and nonprofit organizations.

There are nuances to the strengths and limitations of performance measurement systems.
Some programs or organizations are easier to work with in developing and implementing
performance measurement systems that can credibly connect programs to actual outcomes. In
Chapter 2, we introduced the concept of program technologies to help explain why some
program logics “work” better than others. Recall that for programs that are constructed around
high-probability program technologies (highway maintenance programs would be an
example), it is relatively straightforward to assume a linkage between program outputs and
outcomes. In other words, if you know how many lane miles of highway (as a proportion of all
the roads in a given jurisdiction) are being kept free of snow and ice in the wintertime (an
output), you have a pretty good idea of the safety of the roads (an outcome). But if you know
how many families were served by a program intended to improve parenting skills so that
parents can keep their children instead of having to give them up to foster care, you probably
do not know (at least not to the same degree as the transportation example above) whether the
program actually succeeded in improving the likelihood that children are not taken out of their
homes to be placed in foster homes. The attribution question is not as easily answered for programs such as this one, which are built around low-probability program technologies.
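A hypothetical calculation (the figures are invented) makes the contrast concrete: the output measure for the highway example can be computed directly from administrative data, whereas the analogous count for the parenting program cannot, by itself, answer the outcome question.

# Hypothetical figures for the highway example: an output measure computed
# directly from administrative data. Because the program technology is
# high-probability, this output is a reasonable proxy for the outcome.
lane_miles_cleared = 4200
total_lane_miles = 4800
proportion_cleared = lane_miles_cleared / total_lane_miles  # 0.875

# For the parenting program, the analogous output (families served) does not,
# on its own, tell us whether children stayed out of foster care; answering
# that attribution question requires evaluation evidence, not the count alone.
families_served = 950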

Establish Multichannel Ways of Communicating That Facilitate Top-Down, Bottom-
Up, and Horizontal Sharing of Information, Problem Identification, and Problem
Solving

It is quite common for public sector or nonprofit organizations to begin developing a
performance measurement system informally. Managers who are keen to obtain information
that they can use formatively will take the lead in developing their own measures and
procedures for gathering and using the data. This bottom-up process is one that encourages a
sense of ownership of the system. In the British Columbia provincial government, this more
manager-driven process spanned the period roughly from 1995 to 2000 (McDavid, 2001a).
Some departments made more progress than others, in part because some department heads were more supportive of this process than others. Because they were driven by internal
performance management needs, the systems that developed were adapted to local needs.

To support this evolutionary bottom-up process in the British Columbia government, the
Treasury Board Staff (a central agency responsible for budget analysis and program approval)
hosted an informal network of government practitioners who had an interest in performance
measurement and performance improvement. The Performance Measurement Resource Team
held monthly meetings that included speakers from ministries and outside agencies who
provided information on their problems and solutions. Attendance and contributions were
voluntary. Information sharing was the principal purpose of the sessions.

When the Budget Transparency and Accountability Act (Government of British Columbia,
2000) was passed, mandating performance measurement and public reporting government-
wide, the stakes changed dramatically. Performance measurement systems that had been
intended for formative uses were now exposed to the requirement that a selection of the
performance results would be made public in an annual report. This top-down directive to
report performance for summative purposes needed to be meshed with the bottom-up
(formative) cultures that had been developed in some ministries.

Some departments that had existing performance measurement systems confronted the
challenge of melding the existing formative and new summative thrusts of the required system
by communicating up and down and across the organization. For example, one department
responsible for the publicly funded college and university system in the province conducted a
series of formal and informal workshops and meetings with executives and senior and middle
managers in attendance. Over a period of a year, using an iterative process, the department was
able to develop a general understanding of how the new, externally focused performance
measurement system would look, what the new system would do, and how it would connect
with the internal performance management system, which the department managers were keen
to sustain.

Generally, public organizations that undertake the design and implementation of
performance measurement systems that are intended to be used internally must include the
intended users (Kravchuk & Schack, 1996), the organizational leaders of this initiative, and the
methodologists (Thor, 2000). Top-down communications can serve to clarify direction, offer a
framework and timelines for the process, clarify what resources will be available, and affirm
the importance of this initiative. Bottom-up communications can question or seek clarification
of definitions, timelines, resources, and direction. Horizontal communications can provide
examples, share problem solutions, and offer informal support.

The communications process outlined here exemplifies a culture that needs to emerge in
the organization if performance management is to take hold and be sustainable. Key to
developing a performance management culture is treating information as a resource, being
willing to “speak truth to power” (Wildavsky, 1979), and not treating performance information
as a political weapon. Kravchuk and Schack (1996) suggest that the most appropriate metaphor
to build a performance culture is the learning organization. This construct was introduced by
Senge (1990) and continues to be a goal for public organizations that have committed to
performance measurement as part of a broader performance management framework (Mayne,
2008; Mayne & Rist, 2006).

Clarify the Expectations for the Intended Uses of the Performance Information That Is
Created

Developing performance measures is intended, in part, to improve performance by
providing managers and other stakeholders with information they can use to monitor and make adjustments to program processes. Having “real-time” information on how programs are
tracking is often viewed by managers as an asset and is an incentive to get involved in
constructing and implementing a performance measurement system. Managerial involvement
in performance measurement is a widespread expectation and is reflected in policies in some
jurisdictions.

To attract the buy-in that is essential for successful design and implementation of
performance measurement systems, we believe that performance measurement needs to be
used first and foremost for internal performance improvement. Public reporting can be a part of
the process of using performance measurement data, but it should not be the primary reason for
developing a performance measurement system (Hildebrand & McDavid, 2011). A robust
performance measurement system should support using information to inform improvements
to programs and/or the organization. It should help identify areas where activities are most
effective in producing intended outcomes and areas where improvement could be made (de
Waal, 2003).

Designing and implementing a performance measurement system primarily for public
accountability usually entails public reporting of performance results, and in jurisdictions
where performance results can be used to criticize elected officials or bureaucrats, there are
incentives to limit reporting of anything that would reflect negatively on the government of the
day. Richard Prebble, a long-time political leader in New Zealand, outlines Andrew Ladley’s
“Iron Rule of the Political Contest”:

• The opposition is intent on replacing the government.
• The government is intent on remaining in power.
• MPs want to get re-elected.
• Party leadership is dependent on retaining the confidence of colleagues (which is shaped by the first three principles). (Prebble, 2010, p. 3)

In terms of performance measures to be reported publicly, this highlights that
organizational performance information will not only be used to review performance but will
likely be mined for details that can be used to embarrass the government.

In Chapter 10, we will look at the issues involved in using performance measurement
systems to contribute to public accountability. Understanding and balancing the incentives for
participants in this process is one of the significant challenges for the leaders of an
organization. As we mentioned earlier, developing and then using a performance measurement
system can create uncertainty for those whose programs are being assessed. They will want to
know how the information that is produced will affect them, both positively and negatively. It
is essential that the leaders of this process be forthcoming about the intended uses of the
measurement system.

If a system is designed for formative program improvement purposes, using it for
summative purposes will change the incentives for those involved. Sustaining the internal uses
of performance information will mean involving those who have contributed to the (earlier)
formative process. Changing the purposes of a performance measurement system affects the
likelihood that gaming will occur as data are collected and reported (Pollitt, 2007; Pollitt, Bal,
Jerak-Zuiderent, Dowswell, & Harrison, 2010; Propper & Wilson, 2003). In Chapter 10, we
will discuss gaming as an unintended response to incentives in performance measurement
systems.

Some organizations begin the design and implementation process by making explicit the
intention that the measurement results will only be used formatively for a 3- to 5-year period of
time, for example. That can generate the kind of buy-in that is required to develop meaningful
measures and convince participants that the process is actually useful to them. Then, as the
uses of the information are broadened to include external reporting, it may be more likely that
managers will see the value of a system that has both formative and summative purposes.

Pollitt et al. (2010) offer us a cautionary example, from the British health services, of the
transformation of the intended uses of performance information. Their example suggests that
performance measurement systems that begin with formative intentions tend, over time, to
migrate to summative uses.

In the early 1980s in Britain, there were broad government concerns with hospital
efficiency that prompted the then Conservative government to initiate a system-wide
performance measurement process. Right from the start, the messages that managers and
executives were given were ambiguous. Pollitt et al. (2010) note that

despite the ostensible connection to government aims to increase central control over the
NHS, the Minister who announced the new package described PIs [performance
indicators] in formative terms. Local managers were to be equipped to make comparisons,
and the stress was on using them to trigger inquiry rather than as answers in themselves, a
message that was subsequently repeated throughout the 1980s. (p. 17)

However, by the early 1990s, the “formative” performance results were being reported
publicly, and comparisons among health districts (health trusts) were a central part of this
transition. “League tables,” wherein districts were compared across a set of performance
measures, marked the transition from formative to summative uses of the performance
information. By the late 1990s, league tables had evolved into a “star rating system,” wherein
districts could earn up to three stars for their performance. The Healthcare Commission, a
government oversight and audit agency, conducted and published the ratings and rankings.
Pollitt et al. (2010) summarize the transition from a formative to a summative performance
measurement system thus:

In more general terms, the move from formative to summative may be thought of as the
result of PIs [performance indicators] constituting a standing temptation to executive
politicians and top managers. Even if the PIs were originally installed on an explicitly
formative basis (as in the UK), they constitute a body of information which, when things
(inevitably) go wrong, can be seized upon as a new means of control and direction. (p. 21)

This change brought with it different incentives for those involved and ushered in an
ongoing dynamic wherein managerial responses to performance-related requirements included
gaming the measures, that is, manipulating activities and/or the information to enhance
performance ratings and reduce poor performance results in ways that were not intended by the
designers of the system. This issue will be explored in greater detail in the next chapter.

Identify the Resources Available for Designing, Implementing, and Maintaining the
Performance Measurement System

Organizations planning performance measurement systems often face substantial resource
constraints. One of the reasons for embracing performance measurement is to do a better job of
managing the (scarce) available resources. If a performance measurement system is mandated
by external stakeholders (e.g., a central agency, an audit office, or a board of directors), there
may be considerable pressure to plunge in without fully planning the design and
implementation phases.

Often, organizations that are implementing performance measurement systems are
expecting to achieve efficiency gains, as well as improved effectiveness. Downsizing may have
already occurred, and performance measurement is expected to occur within existing budgets.
Those involved may have the expectation that this work can be added onto the existing
workload of managers—they are clearly important stakeholders and logically should be in the
best position to suggest or validate the proposed measures. Under such conditions, the
development work may be assigned to an ad hoc committee of managers, analysts, co-op or
intern students, other temporary employees, or consultants.

Identifying possible performance measures is usually iterative, time-consuming work, but
it is only a part of the process. The work of implementing the measures (identifying data that
correspond to the performance constructs and collecting data for the measures), preparing
reports and briefings, and maintaining and renewing the system is the key difference between a
process that offers the appearance of having a performance measurement system in place (a
website, progress reports, testimonials by participants in the process) and a process that
actually results in using performance data on a continuing basis to improve the programs in the
organization. Although a “one-shot” infusion of resources can be very useful as a way to get
the process started, it is not sufficient to sustain the system. Measuring and reporting
performance takes ongoing commitments of resources, including the time of persons in the
organization.

Training for staff who will be involved in the design and implementation of the
performance measures is important. On the face of it, a minimalist approach to measuring
performance is straightforward. “Important” measures are selected, perhaps by an ad hoc
committee; data are marshaled for those measures; and the required reports are produced. But a
commitment to designing and implementing a performance measurement system that is
sustainable requires an understanding of the process of connecting performance measurement
to managing with performance data (Kates, Marconi, & Mannle, 2001).

In some jurisdictions, the creation of legislative mandates for public performance reporting
has resulted in organizational responses that meet the legislative requirements but do not build
the capacity to sustain performance measurement. However, performance measurement is
intended to be a means rather than an end in itself. Unless the organization is committed to
using the information to manage performance, it is unlikely that performance measurement
will be well integrated into the operations of the organization.

In situations where there are financial barriers to validly measuring outcomes, it is common
for performance measures to focus on outputs. In many organizations, outputs are easier to
measure than outcomes, and the data are more readily available. Also, managers are usually more
willing to have output data reported publicly because outputs are typically much easier to
attribute to a program or even program activity. Some performance measurement systems have
focused on outputs from their inception. The best example of that approach has been in New
Zealand, where public departments and agencies negotiate output-focused contracts with the
New Zealand Treasury (Gill, 2011). However, although outputs are important as a way to
report work done, they cannot be entirely substituted for outcomes; the assumption that if
outputs are produced, outcomes must have been produced is usually not defensible (see the
discussion of measurement validity vs. the validity of causes and effects in Chapter 4).

Take the Time to Understand the Organizational History Around Similar Initiatives

Performance measurement is not new. In Chapter 8, we learned that in the United States,
local governments began measuring the performance of services in the first years of the 20th
century (Williams, 2003). Since then, there have been several waves of government reform that
have included measuring results. New Public Management emerged in the early 1990s (Hood,
1991), in part from efforts by Western democratic governments to eliminate fiscal deficits in
the 1970s and 1980s.

In most public organizations, current efforts to develop performance measures come on top
of other, previous attempts to improve the efficiency and effectiveness of their operations.
Managers who have been a part of previous change efforts, particularly unsuccessful ones,
have experience that will affect their willingness to support current efforts to establish a system
to measure performance. It is important to understand the organizational memory of past
efforts to make changes, and to gain some understanding of why previous efforts to make
changes have or have not succeeded. The organizational lore around these changes is as
important as a dispassionate view, in that participants’ beliefs are the reality that the current
change will first need to address.

A significant issue for some public sector organizations can be the retirement of employees
who exercise their option to leave early, facilitating downsizing goals that governments have
put into place (Levine, Rubin, & Wolohojian, 1981). Long-term employees will often have an
in-depth understanding of the organization and its history. In organizations that have a history
of successful change initiatives, losing the people who were involved can be a liability when
designing and implementing a performance measurement system. Their participation in the
past may have been important in successfully implementing change initiatives. On the other
hand, if an organization has a history of questionable success in implementing change
initiatives, organizational turnover may actually be an asset.

Develop Logic Models for the Programs for Which Performance Measures Are Being
Developed, and Identify the Key Constructs to Be Measured

In Chapter 2, we discussed logic models as a way to make explicit the intended cause-and-
effect linkages in a program or even an organization. We discussed several different styles of
logic models and pointed out that selecting a logic modeling approach depends in part on how
explicit one wants to be about intended cause-and-effect linkages. A key requirement of logic
modeling that explicates causes and effects is the presentation of which outputs are connected
to which outcomes.

Key to constructing and validating logic models with stakeholders is identifying and stating
clear objectives for programs (Kravchuk & Schack, 1996). Although this requirement might
seem straightforward, it is one of the more challenging aspects of the logic modeling process.
Often, program or organizational objectives are put together to satisfy the expectations of
stakeholders, who may not agree among themselves about what a program is expected to
accomplish. One way these differences are sometimes resolved is to construct objectives that
are general enough so as to appear to meet competing expectations. Although this solution is
expedient from an organizational-political standpoint, it complicates the process of measuring
performance.

Criteria for sound program objectives were discussed in Chapter 1. Briefly, objectives
should state an expected change or improvement if the program works (e.g., reducing the
number of drug-related crimes), an expected magnitude of change (e.g., reducing the number
of drug-related crimes by 20%), a target audience/population (e.g., reducing the number of
drug-related crimes by 20% in Harrisburg, Pennsylvania), and a time frame for achieving the
intended result (e.g., reducing the number of drug-related crimes by 20% in Harrisburg,
Pennsylvania, in 2 years).
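
To make these four criteria easy to check before measures are drafted, an objective can be captured as a structured record and screened for missing elements. The short Python sketch below is illustrative only; the class and field names are our own and are not drawn from the text or from any standard tool.

from dataclasses import dataclass, fields

@dataclass
class ProgramObjective:
    expected_change: str     # e.g., "reduce the number of drug-related crimes"
    magnitude: str           # e.g., "by 20%"
    target_population: str   # e.g., "Harrisburg, Pennsylvania"
    time_frame: str          # e.g., "within 2 years"

    def missing_elements(self):
        # Return the names of any of the four elements left blank.
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

objective = ProgramObjective(
    expected_change="reduce the number of drug-related crimes",
    magnitude="by 20%",
    target_population="Harrisburg, Pennsylvania",
    time_frame="within 2 years",
)
print(objective.missing_elements() or "All four elements are stated.")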

Although logic models do constrain us in the sense that they assume that programs are
open systems that are stable enough to be depicted as a static model, they are useful as a means
of identifying constructs that are candidates for performance measurement. Martin and Kettner
(1996) have identified three major foci for performance measures: (1) program efficiency
(comparing inputs with outputs), (2) program quality (whether the outputs meet some specified
quality standard), and (3) program effectiveness (whether the intended outcomes have been
achieved). They suggest that a good performance measurement system needs to track all of
these various program attributes, since each will be important to at least some program
stakeholders.
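
As a rough illustration of how these three foci translate into simple ratios, the sketch below computes one measure of each type for a hypothetical job training program. All figures and variable names are invented for the example and are not drawn from Martin and Kettner (1996).

# Hypothetical figures for a job training program (illustration only).
program_cost = 250_000.0            # total expenditure ($) -- the inputs
completions = 500                   # training completions -- the outputs
completions_meeting_standard = 450  # outputs meeting an agreed quality standard
employed_at_one_year = 300          # intended outcome actually observed
outcome_target = 350                # intended outcome level

efficiency = program_cost / completions                # cost per output
quality = completions_meeting_standard / completions   # share of outputs meeting the standard
effectiveness = employed_at_one_year / outcome_target  # share of the intended outcome achieved

print(f"Cost per completion: ${efficiency:,.2f}")
print(f"Quality rate: {quality:.0%}")
print(f"Effectiveness: {effectiveness:.0%}")

Each ratio speaks to a different stakeholder interest, which is why a system that tracks only one of them tends to give a lopsided picture of performance.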

The open-systems metaphor also invites us to identify environmental factors that could
affect the program, including those that affect our outcome constructs. Although some
performance measurement systems do not measure factors that are external to the program or
organization, it is worthwhile including such constructs as candidates for measurement.
Measuring these environmental factors (or at least accounting for their influences qualitatively)
allows us to begin addressing attribution questions.

In an ideal performance measurement system, both costs and results data are available and
can be compared. An important driver behind the movement to develop planning,
programming, and budgeting systems (PPBS) in the 1960s was, in fact, the expectation that
cost-effectiveness ratios could be constructed. However, the lack of both budgetary flexibility
and information management capacities in most public sector organizations resulted in a
significant barrier to being able to fully implement PPBS at that time.

Most public sector organizations now have accounting systems that permit managers to
cost out programs. Information systems are more flexible than in the past, and the budgetary
and expenditure data are more complete. Some organizations have also developed the capacity
to cost out individual activities within each program (Brimson, 1991).

James Q. Wilson (1989) has suggested that the environment of public sector organizations
also influences the likelihood that robust measures of outputs and outcomes can be developed.
Table 9.2 adapts his approach to produce a typology describing the challenges and
opportunities for measuring outputs and outcomes in different types of organizations. Coping
organizations (in which work tasks change a lot, and results are not visible—e.g., central
government policy units), where both program technologies and environments combine to limit
performance measurement, are the least likely to be successful in measuring outputs and
outcomes. Production organizations (with simple, repetitive tasks, the results of which are
visible and countable) are the most likely to be able to build performance measurement
systems that include outputs and outcomes. Craft organizations rely on applying mixes of
professional knowledge and skills to unique tasks to produce visible outcomes—a public audit
office would be an example. Procedural organizations rely on processes to produce outputs
that are visible and countable but produce outcomes that are less visible—military
organizations are an example. Thus, craft and procedural organizations differ in their capacities
to develop output measures (procedural organizations can do this more readily) and outcome
measures (craft organizations can do this more readily).

Table 9.2 Measuring Outputs and Outcomes: Influences of Core Technologies and
Organizational Environments

                            Outcomes observable         Outcomes not observable
Outputs observable          Production organizations    Procedural organizations
Outputs not observable      Craft organizations         Coping organizations

Source: Adapted from Wilson (1989).

Identify Any Constructs That Apply Beyond Single Programs

Organizational logic models can be seen as an extension of program logic models, but
because they typically focus on a higher-level view of programs or business lines, the
constructs will be more general. The balanced scorecard (Kaplan & Norton, 1996) is one type
of organizational performance measurement system that includes a general (normative) model
of key organizational-level constructs that are intended to be linked causally. Typically,
balanced scorecards include clusters of performance measures for four different dimensions:
(1) organizational learning and growth, (2) internal business processes, (3) customers, and (4)
the financial perspective. Performance measures are constructed for each of these dimensions.

In Appendix A, Table 9A.1 illustrates an earlier organizational logic model for the British
Columbia Ministry of Human Resources (now the Ministry of Social Development). The
ministry was primarily focused on providing income assistance and moving income assistance
recipients into job training programs as a transition to employment. Table 9A.1 is complicated,
but if one wants to see how the operations of this entire organization fit together, an
organizational logic model is a parsimonious way to show this visually and to identify
constructs that might be candidates for constructing performance measures.

Some jurisdictions require organizational logic models that depict the high-level intended
links between strategic outcomes and programs. Figure 9A.1 in Appendix A is a high-level
logic model of Human Resources and Skills Development Canada (HRSDC), one of the largest
federal departments in the Canadian government. All federal departments and agencies in
Canada are required to develop and periodically update a Program Alignment Architecture that
summarizes departmental objectives/outcomes and how those are intended to be achieved
through the program structure (Treasury Board of Canada Secretariat, 2012b). The HRSDC
(2010) logic model shows how strategic outcomes are connected with clusters of programs.
Each program has its own cluster of outcomes and is expected to be evaluated on a 5-year
cyclical basis (Treasury Board of Canada Secretariat, 2012a).

Performance measurement systems are sometimes expected to offer measures of
performance that transcend single government departments, and measure sectoral or whole
government performance. The Government of Alberta, for example, publishes an annual report
called Measuring Up: Progress Report on the Government of Alberta Business Plan, which
describes and graphs performance trends over the previous five years (Government of Alberta,
2011). Included in the most recent report are summaries of 59 performance measures related to
10 province-wide strategic goals.

Publishing this report is required under the Government Accountability Act (Government
of Alberta, 1995) and must include “a comparison of the actual performance results to the
targets included in the government business plan under section 7(3), and an explanation of any
significant variances” (p. 6). The provincial auditor assesses a sample of the measures in each
annual report—13 of the 59 measures were audited for “completeness, reliability,
comparability and understandability” (Government of Alberta, 2011, p. 1). Although some of
the measures include comparisons with other jurisdictions (e.g., labor productivity is compared
with that of other Canadian provinces), most are displayed as a time series for Alberta alone.
As part of the annual reporting process, the Alberta government surveys a random sample of
residents of the province and asks them to rate social, health, educational, and criminal
justice–related services. The survey results are featured among the performance measures in the
annual report. The performance measures that are included in the annual report are selected
from among the ones Alberta government departments have included in their performance
reports, so the province-wide report is in part a roll-up of departmental performance results.

Many social problems cannot easily be assigned to one administrative department. An
example is homelessness. A social services department might have a mandate to provide funds
to nonprofit organizations or even developers to build housing for the homeless in a
jurisdiction. Housing is costly, and states or provinces may be reluctant to undertake such
initiatives on their own. The nature of homelessness, with its high incidence of mental health
challenges and drug dependencies, will mean that housing the homeless, even if funding and
land to construct housing can be marshaled, is just part of a more comprehensive suite of
programs needed to address the complex cases that homeless persons typically present.
Homelessness transcends government departments, and even levels of government, involving
local, state/provincial, and federal governments. Effectively addressing this kind of problem
requires collaboration among agencies and governments that crosses existing organizational
and functional boundaries.

Horizontal initiatives like ones to address homelessness present challenges for measuring
performance, particularly where there is an expectation that reporting results will be part of
being accountable (Bakvis & Juillet, 2004). Developing performance measures for this kind of
program would involve a sharing of responsibility and accountability for the overall program
objectives. If permitted to focus simply on the objectives of each government department or
level of government during the design of the system, each contributor would have a tendency
to select conservative objectives, that is, objectives that do not commit the department to
responsibility for the overall outcome. In particular, if legislation has been passed that emphasizes
departments being individually accountable, then broader sectoral objectives may well be
overlooked.

A similar problem arises for many nonprofit organizations. In Canada and the United
States, many funding organizations (e.g., governments, private foundations, the United Way)
are opting for a performance-based approach to their relationship with organizations that
deliver programs and services. Increasingly, funders expect results-focused performance
information as a condition for grant funding and renewals. Governments that have opted for
contractual relationships with nonprofit service providers are developing performance
contracting requirements that specify deliverables and often tie funding to the provision of
evidence that these results have been achieved (Bish & McDavid, 1988).

Nonprofit organizations are often quite small and are dedicated to the amelioration of a
community problem or issue that has attracted the commitment of members and volunteers.
Being required to bid for contracts and account for the performance results of the money they
have received is added onto existing administrative requirements, and many of these
organizations have limited capacity to do these additional tasks. Campbell (2002) has pointed
out that in settings where targeted outcomes span several nonprofit providers, it is beneficial to
have some collaboration among funders and for providers to agree on ways of directly
addressing the desired outcomes. If providers compete and funders continue to address parts of
a problem, the same sectoral disregard that was suggested for government departments will
happen in the nonprofit sector.

One issue that can easily be overlooked as performance measures are being developed is
the “levels of analysis” problem (McDavid, 2001a). Suppose a government department
develops a set of performance measures that is intended to indicate how the organization as a
whole is doing. If the actual performance results suggest that the organization is meeting its
overall objectives, it might be tempting to conclude that the programs that contribute to the
objectives are also effective. That would be a mistake because success at one level does not
warrant a conclusion that performance at other levels is also comparable. It is possible to have
programs that are not meeting their objectives while, overall, the organization is meeting its
objectives. Likewise, we cannot use program success alone to indicate organizational success,
nor can we use individual employee performance measures to tell us whether programs or the
organization are meeting their objectives.

Ideally, individual and group objectives should connect with program objectives, which
should in turn connect with organizational objectives. It is necessary to measure performance
at all of these levels to be able to effectively manage organizational performance.

One additional issue with respect to organization-level and sectoral measures of
performance is who should take responsibility for gathering the data and reporting
interpretations of it. Since reporting responsibilities can be linked to expectations of
accountability in some organizations, ownership of these measures becomes an important
organizational-political issue. We will discuss the political dimensions of performance
measurement in Chapter 10.

Involve Prospective Users in Reviewing Logic Models and Constructs in the Proposed
Performance Measurement System

Developing logic models of programs and/or the organization as a whole is an iterative
process. Although the end product is meant to represent the programmatic and intended causal
reasoning that transforms resources into results, it is essential that logic models be reviewed
and validated with organizational participants and other stakeholders. Involvement at this stage
of the development process will validate key constructs for prospective users and set the
agenda for developing performance measures. Program managers in particular will have an
important stake in the system. Their participation in validating the logic models increases the
likelihood that performance measurement results will be useful for program improvements.

Typically, logic models identify outputs and outcomes that are linked in intended causal
relationships. Depending on the purposes of the performance measurement process, some
constructs will be more important than others. For example, if a logic model for a job training
and placement program operated by a community nonprofit organization has identified the
number of persons who complete the training as an output and the number who are employed
full-time one year after the program as an outcome, the program managers would likely
emphasize the output as a valid measure of program performance—in part because they have
more control over that construct. But the funders might want to focus on the permanent
employment results because that is really what the program is intended to do.

By specifying the intended causal linkages, it is possible to review the relative placement
of constructs in the model and clarify which ones will be a priority for measurement. In our
example, managers might be more interested in training program completions since they are
necessary for any other intended results to occur. Depending on the clients, getting persons to
actually complete the program can be a major challenge in itself. If the performance
measurement system is intended to be summative as well, then measuring the permanent
employment status of program participants would be important—although there would be a
question of whether the program produced the observed employment results.

Figure 2.6 in Chapter 2 described a logic model for a family preservation and strengthening
program. The program was intended to offer parents of families in crisis the opportunity to
acquire and practice the skills needed to be more effective, enhancing the likelihood that they
would be able to avoid having to give up their children to foster care. A key construct in that
program logic is parents acquiring skills related to managing family issues—that construct is
the “hub” of the program logic. Program success is uniquely dependent on that happening, and
“developing parental skills” would be central to developing a suite of performance measures.
One indicator of the importance of constructs in logic models, then, is the number of causal
links connecting to and coming from each construct.
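
If the logic model has been written down as a list of cause-and-effect links, this indicator can be tallied mechanically. The Python sketch below uses hypothetical construct names loosely based on the family preservation example, and the counting rule (in-links plus out-links) is just one plausible way to operationalize a construct's centrality.

from collections import Counter

# Hypothetical cause -> effect links from a program logic model.
links = [
    ("outreach sessions delivered", "parents enroll in the program"),
    ("parents enroll in the program", "parents acquire family-management skills"),
    ("in-home coaching hours", "parents acquire family-management skills"),
    ("parents acquire family-management skills", "family functioning improves"),
    ("parents acquire family-management skills", "children remain out of foster care"),
]

degree = Counter()
for cause, effect in links:
    degree[cause] += 1   # out-link
    degree[effect] += 1  # in-link

for construct, n_links in degree.most_common():
    print(f"{n_links} links: {construct}")
# The "hub" construct (parents acquire family-management skills) has the most links.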

If a performance measurement system is going to be designed and implemented as a public
accountability initiative that is high stakes, that is, has resource-related consequences for those
organizational units being measured, reported, and compared, then the performance measures
chosen should be ones that would be difficult to “game” by those who are being held
accountable. Furthermore, it may be necessary to periodically audit the performance
information to assess its reliability and validity (Bevan & Hamblin, 2009). Some
jurisdictions—New Zealand, for example—regularly audit the public performance reports that
are produced by all departments and agencies (Gill, 2011).

Measure the Constructs That Have Been Identified as Parts of the Performance
Measurement System

We learned in Chapter 4 that the process of translating constructs into observables involves
measurement. For performance measurement, secondary data sources are the principal means
of measuring constructs. Because these data sources already exist, their use is generally seen to
be cost-effective. There are, however, several issues that must be kept in mind when using
secondary data sources:

• Can the existing data (usually kept by the organization) be adapted to fit constructs in the
performance measurement system? In many performance measurement situations, the
challenge is to adapt what exists, particularly data readily available via information
systems, to what is needed to translate performance constructs into reliable and valid
measures. Often, existing data have been collected for purposes that are not related to
measuring and reporting on performance. Using these data raises validity questions. Do
they really measure what the performance measurement designers say that they
measure? Or do they distort or bias the performance construct so that the data are not
credible? For example, measuring changes in employee job satisfaction by counting the
number of sick days taken by workers over time could be misleading. Changes in the
number of sick days could be due to a wide range of factors, making it an invalid
measure of job satisfaction.

• Do existing data sources sufficiently cover the constructs that need to be measured? The
issue here is whether our intended performance measures are matched by what we can
get our hands on in terms of existing data sources. In the language we introduced in
Chapter 4, this is a content validity issue.

• A separate, but related, issue is whether existing data sources permit us to triangulate our
measurements of key constructs. In other words, can we measure a given construct in
two or more independent ways, ideally with different methodologies? Generally,
triangulation increases confidence that the measures are valid.

• Can existing data sources be manipulated by stakeholders if they are included in a
performance measurement system? Managers and other organizational members
generally respond to incentives. If a performance measure becomes the focus of
summative program or service assessments, and if the data for that measure are collected
by organizational participants, it is possible that the data will be manipulated to indicate
“improved” performance (Otley, 2003).

An example of this type of situation from policing was an experiment in Orange County,
California, to link salary increases in the police department to reduced reporting rates for
certain kinds of crimes (Staudohar, 1975). The agreement between the police union and
management specified clear thresholds between percentage reductions in four types of crimes
and the magnitude of salary increases.

The experiment “succeeded.” Rates for the four targeted crimes decreased just enough
to maximize the wage increases. Correspondingly, rates increased for several related
types of crimes. A concern in this case is whether the crime classification system may have
been manipulated by participants in the experiment, given the incentive to “reduce” crimes in
order to maximize salary increases.

If primary data sources (those designed specifically for the performance measurement
system) are being used, several issues should also be kept in mind:

• Are there ongoing resources to enable collecting, coding, and reporting of data? If not,
then situations can develop where the initial infusion of resources to get the system
started may include funding to collect outcomes data (e.g., to conduct a client survey),
but beyond this point, there will be gaps in the performance measurement system where
these data are no longer collected.

• Are there issues of sampling procedures, instrument design, and implementation that
need to be reviewed or even done externally? In other words, are there methodological
requirements that need to be established to ensure the credibility of the data?

• Who will actually collect and report the data? If managers are involved, is there any
concern that their involvement could be seen to be in conflict with the incentives they
perceive?

• When managers review a draft of the proposed performance measures, those whose
programs are not represented may conclude that they are being excluded and are therefore
vulnerable in future budget allocations. It is essential to have a rationale for each measure
and an overall rationale for featuring some measures but not others. Organization
executives may need to be involved in settling any managerial disagreements.

In Chapter 4, we introduced measurement validity and reliability criteria to indicate the
methodological requirements for sound measurement processes. Those criteria are rooted in the
social sciences (Goodwin, 1997), and satisfying them is generally premised on having the
resources to properly establish validity and reliability. In many performance measurement
situations, there are few resources, and limited time, to determine whether each measure is
defensible in methodological terms. Performance measurement is fundamentally about finding
indicators that plausibly connect constructs with data. In terms of the kinds of validity
discussed in Chapter 4, persons or teams that are developing and implementing performance
measures usually pay attention to face validity (On the face of it, does the measure do an
adequate job of representing the construct?), content validity (How well does the measure or
measures represent the range of content implied by the construct?), and response process
validity (Have the participants in the measurement process taken it seriously?).

We are reminded of a quote that has been attributed to Sir Josiah Stamp, a tax collector for
the government in England during the 19th century:

The government is extremely fond of amassing great quantities of statistics. These are
raised to the nth degree, the cube roots are extracted and the results are arranged into
elaborate and impressive displays. What must be kept ever in mind, however, is that in
every case, the figures are first put down by a village watchman and he puts down
anything he damn well pleases. (Source, Sir Josiah Stamp, Her Majesty’s Collector of
Inland Revenues, more than a century ago) (cited in Thomas, 2004, p. xiii)

Assessing other kinds of measurement validity (internal structure, concurrent, predictive,
convergent, and discriminant; see Chapter 4) is generally beyond the methodological resources
in performance measurement situations. The reliability of performance measures is often
assessed with a judgmental estimate of whether the measure and the data are accurate, that is,
are collected and recorded so that there are no important errors in the ways the data represent
the events or processes in question. In some jurisdictions, performance measures are audited
for reliability (see, e.g., Texas State Auditor’s Office, 2012).

An example of judgmentally assessing the reliability and validity of measures of program
results might be a social service agency that has included the number of client visits as a
performance measure for the funders of its counseling program. Suppose that initially, the
agency and the funders agree that the one measure is sufficient since payments to the agency
are linked to the volume of work done and client visits are deemed to be a reasonably accurate
measure for that purpose. To assess the validity and reliability of that measure, one would want
to know how the data are recorded (e.g., by the social worker or by the receptionist) and how
the files are transferred to the agency database (manually or electronically as part of the intake
process for each visit). Are there under- or overcounting biases in the way the data are
recorded? Do telephone consultations count as client visits? What if the same client visits the
agency repeatedly, perhaps even to a point where other prospective client appointments are less
available? Should a second measure of performance be added that tracks the number of clients
served (improving content validity)? Will that create a more balanced picture and create
incentives to move clients through the treatment process? What if clients change their
names—does that get taken into account in recording the number of clients served? Each
performance measure or combination of measures for each construct will have these types of
practical problems that must be addressed if the data in the performance measurement system
are to be credible and usable.
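
The counting rules implied by these questions can be made explicit and tested against a handful of records before the measure goes live. The sketch below uses invented visit records and an assumed rule that telephone consultations are excluded; it shows how the visit count and a complementary clients-served count can diverge.

# Hypothetical visit records for a counseling program (illustration only).
visits = [
    {"client_id": "A101", "mode": "in_person"},
    {"client_id": "A101", "mode": "in_person"},
    {"client_id": "B202", "mode": "telephone"},
    {"client_id": "C303", "mode": "in_person"},
]

# Assumed counting rule: only in-person contacts count as client visits.
countable = [v for v in visits if v["mode"] == "in_person"]

total_visits = len(countable)                              # volume-of-work measure
clients_served = len({v["client_id"] for v in countable})  # complementary measure

print(f"Client visits: {total_visits}")              # 3
print(f"Distinct clients served: {clients_served}")  # 2

Whichever rules the agency and the funder settle on, writing them down in this form makes potential under- or overcounting biases visible and auditable.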

In jurisdictions where public performance reporting is mandated, a significant issue is an
expectation that requiring fewer performance measures for a department will simplify
performance reporting and make the performance report more concise and more readable.
Internationally, guidelines exist that suggest a rule of parsimony when it comes to selecting the
number of performance measures for public reporting. For example, the Canadian
Comprehensive Auditing Foundation (CCAF-FCVI, 2002) has outlined nine principles for
public performance reporting, one of which is to “focus on the few critical aspects of
performance” (p. 4). This same principle is reflected in guidelines developed for performance
reporting by the Queensland State Government in Australia (Thomas, 2006).

The international public accounting community has taken an interest in public performance
reporting generally and, in particular, the role that public auditors can play in assessing the
quality of public performance reports (Klay, McCall, & Baybes, 2004). The assumption is that
if the quality of the supply of public performance reports is improved, that is, performance
reports are independently audited for their credibility, they are more likely to be used, and the
demand for them will increase.

Typically, the number of performance measures in public reports is somewhere between 10
and 20, meaning that in large organizations, some programs will not be represented in the
image of the department that is conveyed publicly. A useful way to address managers wanting
their programs to be represented publicly is to commit to constructing separate internal
performance reports. Internal reports are consistent with the balancing of formative and
summative uses of performance measurement systems. It is our belief that unless a
performance measurement system is used primarily for internal performance management, it is
unlikely to be sustainable. Internal performance measures can more fully reflect each program
and are generally seen to better represent the accomplishments of programs.

One additional measurement issue is whether measures and the data that correspond to the
measures should be quantitative. In Chapter 5, we discussed the important contributions that
qualitative evaluation methods can make to program evaluations. We included an example of
how qualitative methods can be used to build a performance measurement and reporting
system (Davies & Dart, 2005; Sigsgaard, 2002). There is a meaningful distinction between the
information that is conveyed by words and that which is conveyed by numbers. Words can
provide us with texture, emotions, and a more vivid understanding of situations. Words can
qualify numbers, interpret numbers, and balance presentations. Most important, words can
describe experiences—how a program was experienced by particular clients as opposed to the
number of clients served, for example.

In performance measurement systems, it is desirable to have both quantitative and
qualitative measures/data. Stakeholders who take the time to read a mixed presentation can
learn more about program performance. But in many situations, particularly where annual
targets are set and external reporting is mandated, there is a bias toward numerical information,
since targets are nearly always stated quantitatively. If the number of persons on social
assistance is expected to be reduced by 10% in the next fiscal year, for example, the most
relevant data will be numerical. Whether the program meets its target or not, however, the
percentage reduction in the number of persons on social assistance provides no information
about the process whereby that result occurred, or about other contextual factors.

Performance measurement systems that focus primarily on providing information for
formative uses should include deeper and richer measures than those used for public reporting.
Qualitative information can provide managers with feedback that is very helpful in adjusting
program processes to improve results. Also, qualitative information can reveal to managers the
client experiences that accompany the process of measuring quantitative results.

Qualitative information presented as cases or examples that illustrate a pattern that is
reported in the quantitative data can be a powerful way to convey the meaning of the numerical
information. Although single cases can only illustrate, they communicate very effectively. For
political decision makers, case-based narratives can be essential to conveying the meaning of
performance results.

Record, Analyze, Interpret, and Report the Performance Data

One problem with any performance measurement system is the potential for ambiguity in
observed patterns of results. In an Oregon benchmarking report (Oregon Progress
Board, 2003), affordability of housing was offered as an indicator of the well-being of the state
(presumably of the broad social and economic systems in the state). If housing prices are
trending downward, does that mean that things are getting worse or better? From an economic
perspective, declining housing prices could mean that (a) demand is decreasing in the face of a
steady supply; (b) demand is decreasing, while supply is increasing; (c) demand and supply are
both increasing, but supply is increasing more quickly; or (d) demand and supply are both
decreasing, but demand is decreasing more quickly. Each of these scenarios suggests
something different about the well-being of the economy. To complicate matters, each of these
scenarios would have different interpretations if we were to take a social rather than an
economic perspective. The point is that prospective users of performance information should
be challenged to offer their interpretations of simulated patterns of such information (Davies &
Warman, 1998). In other words, prospective users should be offered scenarios in which
different trends and levels of measures are posed. If these trends or levels have ambiguous
interpretations—“it depends”—then it is quite likely that when the performance measurement
system is implemented, the same ambiguities will arise as reports are produced and used.
Fundamentally, ambiguous measures invite conflicting interpretations of results and will tend
to weaken the credibility of the system.

In addition to simulating different patterns of information for prospective users, it is
important to ascertain what kinds of comparisons are envisioned with performance data. A
common comparison is to look for trends over time, and make judgments based on
interpretations of those trends. An example of a publicly reported performance measure that
tracks trends over time is the WorkSafeBC measure of injured workers’ overall satisfaction
with their experience with the organization.

Each year, WorkSafeBC arranges for an independent survey of about 400 injured workers
who are randomly selected from among those who made claims for workplace injuries
(WorkSafeBC, 2011). Workers can rate their overall satisfaction on a 5-point scale from very
poor to very good. This performance measure is one of 11 that are included in the annual report
and has been used since 2003. Figure 9.1 has been excerpted from the 2010 Annual Report
(WorkSafeBC, 2011) and displays the percentages of surveyed workers who rated their overall
satisfaction as good or very good, over time. Also displayed are the targets for this
performance measure for the next three years. This format for a performance measure makes it
possible to see what the overall trend is and how that trend is expected to change in the future.
We can see that approximately three quarters of injured workers have tended to be satisfied
over time. In 2009 and 2010, however, the percentage dropped. The organization is forecasting a
return to the historical percentage with modest improvements in the next 3 years. As program
evaluators, we might want to know more about why injured worker satisfaction levels dropped
in 2009 and 2010. Given the challenging economic environment in British Columbia and in
many other jurisdictions, it is possible that worker satisfaction reflects those pressures.
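
In practice, this measure amounts to the share of respondents choosing the top two points on the 5-point scale, compared against an annual target. The sketch below tabulates that share from invented ratings; the data and the 75 percent target are assumptions for illustration and are not WorkSafeBC figures.

# Hypothetical 5-point ratings (1 = very poor ... 5 = very good) by year.
ratings_by_year = {
    2008: [5, 4, 4, 3, 5, 2, 4, 4],
    2009: [3, 4, 2, 4, 3, 5, 3, 4],
    2010: [4, 3, 3, 4, 2, 4, 3, 5],
}
target = 0.75  # assumed target: share rating good (4) or very good (5)

for year, ratings in ratings_by_year.items():
    satisfied = sum(1 for r in ratings if r >= 4) / len(ratings)
    status = "meets target" if satisfied >= target else "below target"
    print(f"{year}: {satisfied:.0%} satisfied ({status})")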

Another comparison that can be made using performance information is across similar
administrative units. For example, all provincial governments in Canada have ministries or
departments that manage payments to injured workers, and assess and collect insurance
premiums from employers to offset these payment costs. Figure 9.2 compares injury frequency
among all jurisdictions in Canada.

There is considerable variation among Canadian provinces in terms of injury frequency,
and a potential evaluation question would be how to explain this variation. Is it due to random
factors, or are there differences in policies and programs that are linked to this important
outcome measure?

Figure 9.1 Performance Measurement Results Over Time: Injured Workers’ Overall
Satisfaction With WorkSafeBC

Source: WorkSafeBC (2011, p. 34). Reprinted, with permission, from the WorkSafeBC 2010 Annual
Report and 2011–2013 Service Plan.
Copyright © WorkSafeBC. Used with permission.

Figure 9.2 Injury Frequency per 100 Workers for All Canadian Provinces

Source: WorkSafeBC (2011, p. 96). Injury Frequency (per 100 workers of assessable employers) is
reprinted, with permission, from the WorkSafeBC 2010 Annual Report and 2011–2013 Service Plan.
Copyright © WorkSafeBC. Used with permission.

A third type of comparison is with benchmarks, standards, or targets. For example, in some
program or service areas, such as hospital services, it is common to use standards to assess
waiting times for services. When physicians refer patients for testing or for medical
procedures, waiting time can become a critical factor, especially where initial diagnoses
indicate a progressive disease. Performance reporting that is intended for public accountability
purposes will typically include comparisons between performance targets and actual results.
We saw an example of this with Figure 9.1, which incorporated annual targets and actual
results for overall worker satisfaction with their interactions with WorkSafeBC. As another
example of comparisons with targets, a municipal government graffiti management program
might have an objective of reducing the number of public buildings defaced by graffiti. If the
target was a maximum of 5% of buildings with graffiti (measured by a year-end physical
survey of all public buildings), the actual survey results could be compared with the target. If
the survey revealed that 10% of public buildings had graffiti on them, the program manager
(and other stakeholders) might decide to investigate the gap between the target and the actual
result. Following up on this performance result would entail asking why the observed result
occurred—a question typically in the domain of program evaluation.
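
A minimal sketch of this target-versus-actual comparison is shown below. The 5 percent threshold comes from the example; the survey counts are invented.

# Hypothetical year-end survey results for a graffiti management program.
target_share = 0.05              # objective: at most 5% of public buildings defaced
buildings_surveyed = 400         # assumed size of the year-end physical survey
buildings_with_graffiti = 40     # assumed survey finding

actual_share = buildings_with_graffiti / buildings_surveyed  # 0.10
variance = actual_share - target_share

if variance > 0:
    print(f"Target missed by {variance:.0%} of buildings; follow up to explain the gap.")
else:
    print("Target met.")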

Setting targets can become a contentious process. If the salaries of senior managers are
linked to achieving targets, there will be pressure to make sure the targets are achievable. If
reporting targets and achievements is part of an adversarial political culture, there will again be
pressure to make targets conservative (Davies & Warman, 1998). Norman (2001) has
suggested that performance measurement systems can result in underperformance for these
reasons. Hood (2006) points to the ratchet effect (a tendency for performance targets to be
lowered over time as agencies fail to meet them) as a problem for public sector performance
measurement in Britain.

Buy-in is an incremental process. Managers want to see what actually happens with the
performance results and the reports that are produced, before they are willing to fully accept
this change. Acceptance can also be eroded. If there is turnover in the organization’s leadership
and the new executive unilaterally shifts the balance from formative to summative uses of the
performance results, it is quite likely that resistance to the system will develop.

There is an issue of access to the performance data. In some organizations, the performance
measurement function has been separated from line management entirely. Managers do not
have access to data; instead, they receive periodic reports. Excluding managers and other
organizational members from having access to performance data tends to reinforce a cultural
norm that such information is a source of power and control. Related to access is the question
of whether users can prepare their own reports, in addition to reports that are mandated. Are
they given the opportunity to analyze the data included in existing reports, in order to
corroborate or disconfirm interpretations of the data? In New Zealand, for example, many
managers have developed their own information sources and ways of working with
performance data in their organizations (Gill, 2011).

Finally, how are reports prepared? Is there a regular cycle of reporting? Is there a process
whereby reports are reviewed and critiqued internally before they are released to users? Often,
agencies have internal vetting processes wherein the authors of reports are expected to be able
to defend the report in front of their peers before the report is released. This challenge function
is valuable as a way of assessing the defensibility of the report and anticipating the reactions of
stakeholders.

Regularly Review Feedback From the Users and, If Needed, Make Changes to the
Performance Measurement System

Uses of and organizational needs for performance data will change over time.
Implementing a system with a fixed structure (logic models and measures) at one point in time
will not ensure the relevance or continued use of the system in the future. There is a balance
between the need to maintain continuity of performance measures, on the one hand, and the
need to reflect changing organizational objectives, structures, and prospective uses of the
system, on the other (Kravchuk & Schack, 1996). In many performance measurement systems,
there are measures that are replaced periodically and measures that are works in progress. A
certain amount of continuity in the measures increases the capacity of measures to be
compared over time. Data displayed as a time series can, for example, show trends in
environmental factors, as well as changes in outputs and outcomes; by comparing
environmental variable trends with outcome trends, it may be possible to take into account the
influences of plausible rival hypotheses on particular outcome measures. Although this process
depends on the length of the time series and is often judgmental, it does permit analysts to use
some of the same tools that would be used by program evaluators. In Chapter 3, recall that in
the York crime prevention program evaluation, the unemployment rate in the community was
an external variable that was included in the evaluation to assist the evaluators in determining
whether the neighborhood watch program was the likely cause of the observed changes in the
reported burglary rate.
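
One simple, judgmental way to make such a comparison is to place the outcome series beside the environmental series and examine how closely they move together. The sketch below does this with a Pearson correlation on invented figures; it is illustrative only, and a strong association would not, by itself, settle the attribution question.

# Hypothetical time series: an outcome measure (reported burglaries per 1,000
# households) alongside an environmental variable (local unemployment rate, %).
burglary_rate = [14.2, 13.1, 12.5, 13.4, 12.0]   # five consecutive years
unemployment = [5.1, 4.8, 6.9, 8.2, 7.5]         # same five years

def pearson_r(xs, ys):
    # Pearson correlation between two equal-length series.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

print(f"Correlation of outcome with environmental trend: {pearson_r(burglary_rate, unemployment):.2f}")
# A sizable correlation flags the environmental factor as a plausible rival
# explanation to examine before crediting (or blaming) the program.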

But continuity can also make a system less relevant over time. Suppose, for example, that a
performance measurement system was designed to pull data from several different databases,
and the original information system programming to make this work was expensive. Even if
the data needs change, there may well be a desire not to go back and repeat this work, simply
because of the resources involved. Likewise, if a performance measurement system is based on
a logic model that becomes outdated, then the measures will no longer fully reflect what the
program(s) or the organization is trying to accomplish. But going back to redo the logic model
(which can be a time-consuming, iterative process) may not be feasible in the short term, given
the resources available. The price of such a decision might be a gradual reduction in the
relevance of the system, which may not be readily detected.

With all the activity to design and implement performance measurement and reporting
systems, there has been surprisingly little effort to date to evaluate their effectiveness
(McDavid & Huse, 2012). In Chapter 10, we will discuss what is known now about the ways in
which performance information is used, but it is appropriate here to suggest some practical
steps to generate feedback that can be used to modify and better sustain performance
measurement systems:

• Develop channels for user feedback. This step is intended to create a process that will
allow the users to provide feedback and suggest ways to revise, review, and update the
performance measures. Furthermore, this step is intended to help identify when
corrections are required and how to address errors and misinterpretations of the data.

• Create an expert review panel of persons who are both knowledgeable about
performance measurement and do not have a stake in the system that is being reviewed.
Performance measurement should be conducted on an ongoing basis, and this expert
panel review can provide feedback and address issues and problems over a long-term
time frame. A review panel can also provide an independent assessment of buy-in and
use of performance information by managers and staff, and track the (intended and
unintended) effects of the system on the organization.

The credibility of performance information is an enduring concern. Davies and Warman
(1998) point to the importance of auditing in the context of the performance reports of the
(British) National Meteorological Office:

An independent audit, then, is not a luxury, it is a necessity. The credibility of the whole
system of agencies is put at risk if the data from one is found to be unverified and open to
dispute. Where performance-related bonuses are linked with outcomes, it is unreasonable
to expect staff concerned to be responsible for the measurement and reporting of results in
an objective manner when the very same results will determine their own pay. (p. 47)

Legislative auditors, in addition to recommending principles to guide public performance
reporting, have been active in promoting audits of performance reports (CCAF-FCVI,
