
ESRC UK Centre for Evidence Based Policy and Practice: Working Paper 24

Evidence for Accountability

The nature and uses of evidence in the audit, inspection and scrutiny functions of government

in the UK

Ruth Levitt, William Solesbury and Tom Ling*

ESRC UK Centre for Evidence Based Policy and Practice

*School of Law, Languages and Social Science at Anglia Polytechnic University

Email: [email protected] © January 2006: ESRC UK Centre for Evidence Based Policy and Practice

Preface

This Working Paper brings together four papers, a background report and a final report produced in connection with the seminar series on Evidence for Accountability, organised by the ESRC UK Centre for Evidence Based Policy and Practice1 and the School of Law, Languages and Social Science at Anglia Polytechnic University in Cambridge. The series of six seminars ran between October 2004 and September 2005. Our purpose was to study the nature and uses of evidence in the audit, inspection and scrutiny functions of modern government in the UK.

Seminars 1-5 brought together a core group of 14 practitioners and researchers. The first of these was a preliminary, scoping exercise and no formal report was published. For Seminar 6, a further 21 individuals were invited to contribute their expertise and thinking, to take the analysis further and consider what ideas and next steps might be identified for improving the use of evidence for accountability. A background paper on the issues raised in the preceding seminars was produced to inform discussions at this event. The outcomes constitute a summary of the findings of the entire series, which is accordingly placed at the beginning of this Working Paper.

The series was productive in generating the transfer of knowledge and ideas between practitioners and researchers. It also stimulated interest in collaborating on further conceptual inquiry and action research. We seek now to articulate and refine a set of principles about evidence for accountability, and to put those principles to the test in the practical work of audit, inspection and scrutiny.

We are pleased to record our thanks to the following organisations for generously supporting Seminar 6: Anglia Polytechnic University, Economic and Social Research Council, National Audit Office, National Centre for Social Research, Office for Public Management, and Queen Mary, University of London. At the time of writing (December 2005) the organisers have submitted proposals for funding the next phases of work. For further information please contact Dr Ruth Levitt ([email protected]).

The purpose of the Working Paper series of the ESRC UK Centre for Evidence Based Policy and Practice is the early dissemination of outputs from Centre research and other

activities. Some titles may subsequently appear in peer reviewed journals or other publications. In all cases, the views expressed are those of the author(s) and do not

necessarily represent those of the ESRC.

1 Then at Queen Mary, University of London, now at the School of Social Science and Public Policy, King’s College London, Strand, London WC2R 2LS.


Contents

1. Evidence for accountability: final report of seminar series (October 2005)
2. Types of evidence in audit, inspection and scrutiny (December 2004)
3. Methods for collecting and organising evidence in audit, inspection and scrutiny (January 2005)
4. Synthesis of evidence in audit, inspection and scrutiny (March 2005)
5. Outputs of evidence in audit, inspection and scrutiny (June 2005)
6. Background paper for Seminar 6 on 12-13 September 2005 at Madingley Hall, Cambridge


1. Evidence for accountability: final report of seminar series (October 2005)

Introduction

The series of six seminars on Evidence for Accountability opened in October 2004 and was completed in September 2005. Our purpose was to study the nature and uses of evidence in the audit, inspection and scrutiny functions of modern government in the UK. Seminars 1-5 brought together a core group of 14 practitioners and researchers. For Seminar 6, a further 21 individuals were invited to contribute their expertise and thinking, to take the analysis further and consider what ideas and next steps might be identified for improving the use of evidence for accountability.

Seminar 1 scoped the broad territory of audit, inspection and scrutiny. Seminars 2-5 identified and considered the following main topics and questions:

• The nature and types of evidence for audit, inspection and scrutiny
• The methods for organising and collecting evidence in audit, inspection and scrutiny
• Synthesising the evidence
• Outputs of audit, inspection and scrutiny.

For Seminar 6 a background paper was produced, which included questions for further discussion. The main conclusions arrived at in discussions at the final seminar constitute a summary of the findings of the entire series, and fall into five main categories.

The purposes of audit, inspection and scrutiny

Audit, scrutiny and inspection are organisationally distinct as government functions but share some characteristics. They are second order tasks (that is, not executive tasks), empirical (based on observation and experience), and retrospective (so different from regulation, another second order task). Their generic purpose is to secure the accountability of executive bodies, such as a school subject to inspection, a local authority programme subject to scrutiny, or a public service subject to audit. Accountability implies both ‘holding persons to account’ (for their performance of a task – hence the retrospection) and ‘rendering an account’ (on which the assessment of performance rests – hence the empiricism). Both aspects necessitate high quality evidence in audit, scrutiny or inspection.

However, in practice, two other purposes are in play. First, accounts of the performance of public agencies contribute to the gain (or loss) of political trust in government and its agents, so the public as service users and voters is an audience that must be addressed. Secondly, and increasingly in recent years, assessments of performance have become directed to future improvement as much as past achievement, so that the accounts must be diagnostic as well as analytical, and addressed to managers.


The evidence base

In order to optimise the collection, analysis, synthesis and outputs of evidence by audit, inspection and scrutiny, the key characteristics of their evidence base(s) need to be identified accurately. These might include: the scope of the information; the criteria used in deciding what to collect; the standards set for quality and reliability of evidence; and the way in which it is collected (e.g. self-assessment). These characteristics could derive from the development of some principles to guide the role of evidence in different situations and improve its ‘fitness for purpose’.

Sharing evidence, and placing more emphasis on using existing evidence rather than on collecting new evidence, is a sensible aspiration. However, this presents a challenge because of the lack of clarity about the extent to which the different agencies involved in audit, inspection and scrutiny, who may be holding public service bodies to account for different things, may actually require different information. Sharing would be facilitated by making explicit some principles for the use of evidence in different situations and the characteristics of the evidence used for different purposes.

Interests of users, participants and stakeholders

Ultimately, the purpose of audit, inspection and scrutiny is to provide accountability of public services to the public who use services and the taxpayers who pay for them. It follows that the legitimacy, effectiveness, efficiency and development of audit, inspection and scrutiny depend on appropriate alignment with the interests of those who are the intended beneficiaries and ultimate audience – taxpayers and service users. This demands an appropriate balance between national and local targets and standards, and clarity about the objectives against which public services are being assessed.

Ensuring effective accountability to the public requires more than distributing and disseminating information. It requires that the information is discussed and considered. However, the drive to produce information that can be easily summarised and communicated to a mass audience runs the risk of misleading people. The public has begun to suspect that some of the evidence for accountability is indeed misleading, and this reduces rather than enhances trust, which flows from attitudes rather than from evidence.

Methods and skills

Among the many methods identified as being relevant to audit, inspection and scrutiny, three key distinctions are:

• between measurement and observation – for example, analysing service use data or conducting user surveys against making site visits or doing case studies

• between analysis and discourse – that is, between reasoning conclusions from evidence against debating its meaning and importance

• between autonomy and collaboration – the auditor or inspector or scrutineer assessing performance independently of the subject or working together in some way to seek an agreed assessment


In practice, quite eclectic mixes of these approaches and the methods within them are used by audit, inspection and scrutiny bodies, seemingly for reasons more to do with tradition and culture in the organisation than with fitness for purpose. Most bodies have in recent years extended their range of methods, although without a convergence on standard practice. The methods require different skills. Among them are:

• Analytical skills – for handling data, both quantitative and qualitative, to scientifically acceptable standards

• Investigative skills – in reviewing documentation, observing classroom practice, conducting a case study

• Facilitation skills – for example, in chairing a hearing or a focus group
• Negotiation skills – for seeking consensus on either evidence or conclusions
• Consultancy skills – to develop advice on improving performance
• Communication skills – in getting results across to diverse audiences.

There is a need to develop a full repertoire of such skills among professionals in audit, inspection and scrutiny. But there is also a need to find the right degree of awareness, understanding and use of diverse methods among politicians and lay participants.

Challenges

The increasing complexity of the worlds of policy and practice is requiring audit, inspection and scrutiny to evolve and adapt. In the longer term, the collection, analysis, synthesis and outputs of evidence for audit, inspection and scrutiny need to change, to ensure their relevance and legitimacy within the contexts and conditions of UK public service accountability. The purposes of these functions are changing, as are the active interests. One consequence is that the skills and methods for rendering and holding to account cannot stand still, if the functions are to be effective. Furthermore, greater attention to ‘better regulation’ and deregulation is obliging audit, inspection and scrutiny to play their part in finding a lighter touch for achieving better results. With devolved government growing in significance, and service provision through partnerships representing a substantial trend, the terrain of public service accountability relationships seems set to continue to shift.

This all suggests that audit, inspection and scrutiny need actively to keep improving their understanding and uses of ‘evidence for accountability’. The participants in the seminar series hope that the work reported here and in the other papers makes a useful contribution to that learning.


2. Types of evidence in audit, inspection and scrutiny (December 2004)

Introduction

Evidence is used in audit, inspection and scrutiny to inform understanding of specific services, roles, responsibilities, activities and events; to identify possible areas of concern; and to underpin proposed improvements. But what exactly triggers the collection of evidence? Why are those particular types of evidence collected? How are they used? And how do audit, inspection and scrutiny differ in relation to these features of evidence? Why are some data collected and not others?

Framing determines the question that is posed by the audit, inspection or scrutiny. This is not neutral, but institutionally shaped. Some relevant evidence may not be readily available. What is not collected matters, and why institutions are not interested in that material may be significant. What the initiators of an audit, inspection or scrutiny exercise want to know is conditioned by what is available. Requirements are influenced by pre-existing evidence, which has already influenced the questioner.

In the case of a National Audit Office (NAO) value for money study of changes to the wholesale electricity trading system, two very different types of evidence were considered: quantitative data on prices; and qualitative data from a focus group of electricity companies who were participants in the trading system. The first source was already available; the second was specially created by the NAO, and it provided more subjective material and a perspective on the whole system. Similarly with Parliament’s monitoring of Whitehall departments, select committee enquiries rely on a lot of evidence that is subjective and comes from discussion and debate, although the Scrutiny Unit gives a harder edge by assessing the core tasks that departments have to complete in submitting evidence. It can be helpful to combine the two, by having a subjective commentary on some quantitative data. The select committee sessions taking oral evidence are more adversarial than a focus group. Select committees have powers to call for persons and papers. The evidence that committees require depends on their roles: some have constitutional remits.

Organisations have to decide how much and which evidence they can afford to collect, and which they cannot afford not to collect. There are resource implications and trade-offs to be made between an organisation gathering and analysing evidence for monitoring its own activities and performance as a normal part of internal management, as against monitoring in order to be able to answer the questions that may be posed by audit, inspection or scrutiny. The House of Commons Public Administration Select Committee investigated the evidence that was collected in order to satisfy public service targets1. It found that the evidence organisations collected for performance management was not the same as that required by the target setters, and the additional burden of meeting those demands for additional evidence was resented.

Purposes and uses of evidence

One way of summarising the purposes and uses of evidence is to arrange them hierarchically or along spectra reflecting the dimension of control or authority. For example:


(a) Why evidence is used in audit, inspection and scrutiny: improve services; improve governance; improve government; propriety; informal purposes

(b) Types of evidence required: true and fair records that comply with regulations; quantitative and qualitative; benchmarks; stakeholders’ views

(c) How evidence is used: giving an account; holding to account; assuring redress; taking into account; reveal; conceal; distract; obstruct

(d) Criteria used for assessing evidence: conformity to specification; fitness for purpose; meeting stakeholders’ expectations; prove; improve

(e) How evidence is used to influence/change behaviour: compulsion; incentives; information; self improvement

Looking at each of these in a little more detail:

Why is evidence used

Explicit and implicit reasons may govern why evidence is originated, collected, processed and used in particular ways. The stated purposes may or may not coincide with the actual purposes, and the stated uses may not coincide with the actual uses. Linking the two, the uses to which the evidence is actually put may or may not coincide with the stated or unstated purposes of the evidence. Evidence collecting may extend beyond information gathering for its own sake, or for the record, into the realms of control. Information can be used to provide feedback with or without the requirement to change.

Types of evidence

• In what medium is the evidence: physical, electronic…
• What format(s) is the evidence in: book, article, report, slides, a form, transcript of spoken word, visual images, sounds
• What data types are present: qualitative, quantitative… text, numbers, tables, graphs, symbols, images, sounds, physical objects…
• What knowledge types are present: cognitive, experiential, affective; explicit, tacit; internal, external; subjective, objective…
• What is its scope: complete, whole system, locality/ies, a summary, a sample, an extract, a snapshot, (part of) a series, raw data, aggregated data, edited, comparative…
• Who originates it: a named individual, a group…; the subject itself, a reporter…
• How is it originated: actively, automatically, passively…
• Where is it originated: locally, remotely…
• When is it originated: fixed point in time, continuous, random… before the event, at the time, after the event, intermittently, regularly…
• How is it collected: interview, focus group, hearing, inquiry, observation, negotiation, mystery shopping, phone-in, straw poll, electoral process, conference, deliberative event, test, examination, survey, site visit, inspection of operational data, inspection of books, complaint investigation…
• Where is it collected: on site, locally, remotely…
• Where is it kept: on site, locally, remotely…
• When is it collected: once, repeatedly, regularly, randomly, simultaneously, after a certain elapsed time…
• How is the evidence processed: unidirectional, dynamic, transactional… translation (language), translation (format), statistical analysis, summarising, filtering, weighting, sorting, merging, interpreting, storing, deleting…
• Is it used: yes, no, sometimes…
• Who uses it: originator, collector, other stakeholder
• How is it used: requested, received, acted on, stored, ignored…
• Why is it originated: by-product of operations, compliance, special event…
• Why is it collected: for the record, for performance monitoring and evaluation, to comply with accountability processes, to inform accountability processes…
• Why is it used: to inform stakeholders, to inform policy, to comply with regulations, to comply with accountability processes, to inform accountability processes

Criteria

Different users of the evidence will employ selection criteria characteristic of their own conceptual frameworks in judging the fitness of evidence for the purposes of the audit, inspection or scrutiny. This influences their choice of particular type(s) of evidence and evidence processing. They will also employ different methodologies in using the evidence. There may be trust or distrust in the degree of disclosure of evidence and its interpretation. Criteria may be implicit, allowing some sort of score to be given in a range from maximum to average to minimum fitness. Overt as well as covert standards may be applied. For example, expectations are a notoriously subjective and therefore unreliable benchmark, although very widely used and taken as meaningful.

How is evidence used

Different traditions govern the reasons for using oral and/or written evidence: opinion and personalised presentation versus cold print. In the case of oral evidence, the mode or setting in which evidence is presented is a key factor. For example, participants in a focus group offer evidence differently from those giving oral evidence to a committee hearing. A focus group is informal, off the record, and what is said is more discursive; the experience is more akin to a discussion between colleagues, at which information is being exchanged. An evidence session, by contrast, is on the record, and what is said is used externally to make judgements and scrutinise. At an oral session the mode is question and answer, with the questioner in charge.

Statements that claim an evidence base for arguments, rules, policies or decisions may or may not be justified in terms of the actual evidence base used. Or the claim to be so justified may be partial or false. Such a claim may be made knowingly or unknowingly, and the partiality or falsity may be revealed or not, willingly or not. Degrees of justification can be differentiated, to provide an analysis of the level of proof, truth, authenticity or accuracy achieved by the statement and its supporting evidence base.

Changing behaviour

Influencing behaviour implies that there are standards or thresholds against which an activity is being checked, as a basis for stopping a behaviour, starting it, or altering it (for example, in terms of frequency, intensity, delivery or outcome). Persuasion may be used to bring about voluntary conformity or compliance. Subjects may be trusted and/or permitted to comply in their own way and at their own pace, while in other cases sanctions may be applied in an attempt to achieve compliance, alongside incentives and instructions. Self-improvement suggests that evidence is being used to stimulate self-reflection, with an explicit expectation of change for the better.

Customised examples of purposes and uses of evidence for audit, inspection and scrutiny:

• Why evidence is used – audit: ?; inspection: ?; scrutiny: ?
• Types of evidence required – audit: ?; inspection: e.g. in local government Best Value inspections a basket of evidence types is commonly used; scrutiny: e.g. in local government scrutiny a basket of evidence types is commonly used
• Processes for using evidence – audit: ?; inspection: ?; scrutiny: ?
• Criteria used for assessing/policing evidence – audit: ?; inspection: ?; scrutiny: ?
• How evidence is used to influence/change behaviour – audit: ?; inspection: ?; scrutiny: ?

Users of the evidence

A generic listing of users might include policy makers, service commissioners, service managers, service providers, service workforce, direct service users, indirect service users, auditors, inspectors, scrutineers, regulators, evaluators, parliaments (national/devolved, elected members/officials), regional assemblies and agencies (elected and appointed members/officials), local authorities (elected members/officials), NGOs, ‘the media’, ‘the general public’, ‘voters’ and informal networks. A further dimension runs from individuals through groups to the whole population in a category. Differentiation clarifies the stakeholders’ relationships to the service and to evidence about that service:

• Production chain for a service: commission; manage; provide
• Production chain for evidence about a service: commission; originate; collect; analyse
• Routes of evidence from source: direct; via intermediaries who interpret it
• Reach of evidence: restricted to immediate stakeholders; available to wider networks


An example of a production chain is district councils, their parish councils and the private companies who deliver the service in question for the local authorities. Some observers think the realisation that there are multiple and diverse stakeholders must bring a recognition that each one has got to be cautious about the way they use their own knowledge. Also relevant are the routes through which evidence reaches different users. The relative status and positioning of users vary. Perceived and actual status will affect perceived, actual and used influence and authority within and outside the system. The intentions of users in relation to the evidence and outcomes may differ or show degrees of coincidence, and their actual uses of evidence may differ or coincide.

Parliamentary select committees commission their own sources of oral and written evidence for each enquiry, thereby distinguishing their work from that of think tanks, but they also consider other sources of evidence. Indeed, greater use can now be made of think tank evidence, which is found to be of good quality and has already filtered a lot of material from other sources. In the view of parliamentary committees, think tanks may often be better sources than the civil service, insofar as Whitehall departments are perceived as having less of an initiating role in policy making. Think tanks have a greater influence on what evidence goes into the system. Select committees want to influence policy makers and get them to respond, but they recognise a cumulative enhancement of influence and momentum if the media and the public are also concerned about the issue in question.

Informal networks matter in the world of specialists. The plurality of sources and access routes they embody feed a notion of democracy. Informal networks can make a difference because they can hold the key to what sorts of information should be collected in the first place. Network members assemble views from their different sources of information and form judgements that way, which in turn become an important source of evidence for audit, inspection and scrutiny.

Some evidence is created and disseminated to multiple users, beyond the recipients for whom the audit, inspection or scrutiny is formally intended, and may include informal networks. This extends the potential influence of auditors, inspectors and scrutineers, and adds a more ‘political’ purpose to their work, through their chosen messages and means of communication. They are paying more attention to the presentation of their findings, and make greater use of executive summaries and better layout to assist readers. The Audit Commission has assessed its own impact on policy makers and service providers but not on the wider audiences. There may be an unrealistic emphasis on the ‘user focus’. The users’ views do have to be gauged, but reports are not designed for them. However, the evidence used in NAO, Audit Commission and other audit, inspection and scrutiny bodies’ reports can usefully feed other similar functions. For example, departmental select committees could use evidence generated by local government, and local authorities would not need to create their own new evidence where relevant reports of inspectorates on local institutions or previous NAO or Audit Commission studies already exist.


Conceptual frameworks

Powerful concepts are in operation in service design and delivery, but which are the most salient for audit, inspection and scrutiny? Are they shared and generic? For example, compliance and good performance are clearly very important, while issues of market failure, inequalities and resource distribution are fundamental to assessing some services, for example local crime and disorder partnerships. Value for money studies of local economic development agencies also look at whether they interfere inappropriately in local markets.

One typology of conceptual frameworks has three categories:

(a) high level theory, e.g. market efficiency
(b) mid level theory, e.g. choice of provider
(c) methodology, e.g. for theory testing

Most thinking about policy is informed by conceptual frameworks, whether or not they are made explicit. Conceptual frameworks shape how audit, inspection and scrutiny are conducted. They help by clarifying the terrain onto which the uses of evidence can be mapped. Within each sector, a conceptual framework determines what constitutes evidence, what evidence to collect, where to find it, the causal relationships between the elements, and the hierarchies of evidence. For example, in the health sector laboratory evidence and clinical trials data are highly rated, but evidence of clinical practice, which is a strong determinant of service quality and outcomes, has lower status. [The detailed repertoire of concepts that each sector uses will be considered in the next seminar.]

Conceptual frameworks produce predetermined tests and criteria (which could be hypotheses) in the minds of those doing audit, inspection and scrutiny, and these can become explicit in their procedures. Transparency is an important factor. However, discretion and judgement in the collection, analysis and uses of evidence may also be important. This suggests a principle of ‘open-mindedness’ within audit, inspection and scrutiny that encourages the consideration of different frames of reference when appropriate.

Conceptual frameworks employed in using evidence for audit, inspection and scrutiny functions influence which types, properties and uses of evidence will be regarded as salient. All the stakeholders of the service or policy in question also employ conceptual frameworks, which may be explicit or implicit or a mixture of the two, and which inform their understanding and engagement, and their perceptions of the types, purposes and uses of evidence appropriate to the audit, inspection or scrutiny of that service. The characteristics of conceptual frameworks include:


• Employs a concept of argument – to understand how evidence is used to determine whether a statement or claim is proven or not proven; to understand what forms of evidence and modes of collection are required in order to make an ‘irrefutable case’ or give a ‘fair account’
• Employs a concept of theory testing – which regards questions posed in inspection/audit/scrutiny processes as hypotheses, which can be refuted by the evidence
• Employs a concept of boundary – to understand the system, organisation, group or individual, and what constitutes internal and external
• Employs a concept of time frame – to understand definitions of operational (or current) and strategic (or forward looking), and determination of short, medium and longer term options for planning and management
• Employs concepts of markets and resource distribution – to understand the performance of an economic system in rectifying market failure and achieving redistribution
• Employs concepts of authority and responsibility – to understand the actions of stakeholders in controlling or directing, guiding, learning, discovering
• Employs a concept of access to information – to understand where, how, why and when evidence may or may not be shared between the stakeholders
• Employs a concept of compliance – to understand performance in terms of minimum standards and variances from them
• Employs a concept of good performance – to establish what counts as ‘good performance’; to set the terms for selection and use of performance indicators, their evolution, and decisions concerning what can be measured and/or what should be measured
• Employs a concept of service delivery and governance – to understand the standards for delivering services, and the governance arrangements


Outstanding issues

• Transparency: the relationships between investigator and investigatee
• Differences and implications of dynamic vs. unidirectional processes of evidence collection
• Discretion and judgement in the collection of evidence
• Processes for analysing and testing the evidence
• Surfacing the implicit criteria used in the selection, collection and processing of evidence
• Discretion and judgement in the analysis and uses of evidence
• Objectivity, robustness, validity

Emerging ideas for further consideration

Fitness for purpose

Who needs evidence for what purposes? End users of services are becoming increasingly influential in the design and delivery of services. However, evidence has to be tied to a purpose, and to some observers it seems obvious that too much evidence is being collected and disseminated in audit, inspection and scrutiny, and that this evidence is not always the right kind for the right uses. ‘There is overload and too little impact’ says one observer, citing as an example the Comprehensive Performance Assessment scores of local authorities ‘…which are not informing the public about their local council’s performance’. If all stakeholders are producers of evidence, the background noise becomes so great that much of the evidence will not be drawn on. So it is imperative to ensure that a smaller range of evidence is both fit for purpose and used.

Audit, inspection and scrutiny can be distorted by the availability of evidence. For example, one observer says ‘the Food Standards Agency only collects evidence in relation to its stated mission. Consumers’ interests in the constitution of food are excluded, unless they can be made to fit the “adulteration” category. The Agency’s statutory obligations are narrowly defined, but the food system as a whole raises wider questions than the Agency frames.’

‘Playing the user card is a known device, and something of a game’, according to another observer, which ‘…risks poisoning the well of public involvement’. It raises questions about the methodological rigour with which evidence is gathered, filtered, interpreted and used. The increasing ‘user involvement’ dimension of public service accountability also raises questions about how evidence is transmitted, and by whom. Audit, inspection and scrutiny bodies claim to provide information for the public but are rarely consulted by them directly. If evidence is to be effectively transmitted, these bodies need to inform the organisations that the public routinely turns to for information.

Some members of the public do use evidence about public service performance in making personal choices, for example league tables, and in some cases influence is also exercised by the purveyor of that evidence. For example, patients will take into account what their own GPs say if they are offered referral to a choice of hospitals. This is not the case for local public services such as police or transport, which do not deliver personal services in the same way as health and education.

Finally, there is a missing link between policy makers and the public, on whose behalf audit, inspection and scrutiny bodies operate. Who exercises the greatest influence over parliamentarians? Is it lobby groups, who can legitimately use ‘political’ influence to push for proposals, rather than officials for whom such tactics are inappropriate?

Convergence

Is there a convergence between the audit, inspection and scrutiny activities, aligned with the shift in attitudes to admissible evidence? Parliamentary committees are supplementing adversarial hearings with other sources; auditors are supplementing quantitative evidence with qualitative. There is already convergence across some public services, so that children’s services are now grouped together. Ofsted was at the forefront in gathering service users’ perceptions by holding meetings with parents, and now pupils are being asked about their schools. In addition, self assessment is replacing some inspections.

National standards can influence the types of evidence that audit, inspection and scrutiny employ. Benchmarks are used in school inspections, but with a lighter touch nowadays; in the earlier days of the national literacy and numeracy strategy, Ofsted inspections were much more rigid. Raising the influence of subjective impressions in audit, inspection and scrutiny fuels the powerful effect of framing. Qualitative impressions help to inform the quantitative questions, and a corrective is available through having a report of an audit, inspection or scrutiny checked in draft by the key stakeholders.

Reference

1. House of Commons Public Administration Committee (2003) On target? Government by measurement: fifth report, session 2002-03, volume 1. London: Stationery Office, 57pp (HC 62-I). Available, together with the government’s response, via: http://www.publications.parliament.uk/pa/cm200203/cmselect/cmpubadm/cmpubadm.htm


3. Methods for collecting and organising evidence in audit, inspection and scrutiny (January 2005)

Introduction

The subject of this seminar was methods – the different approaches to collecting and organising evidence in audit, inspection and scrutiny. This focus is distinct from that to be addressed in a subsequent seminar, which concentrates on synthesis – how evidence is processed, interpreted and otherwise brought together to construct arguments and draw conclusions. Focusing first on methods provides an opportunity to examine the options, assumptions and decisions adopted in using evidence for audit, inspection and scrutiny. This paper examines some broad themes under the following headings:

• Tools and techniques
• Skills
• Inspection regimes; culture and philosophy of investigating organisations
• Framing
• Availability of evidence
• Closeness to the data; stance of investigator
• Quality assurance
• Depth of collection
• Shifts in what constitutes evidence

Tools and techniques

Any audit, inspection or scrutiny should be clear about what it needs to investigate, and ensure it uses the appropriate indicators for that task. Typical techniques for gathering evidence include:

• case studies
• checklists
• complaint investigations
• conferences
• deliberative events
• electoral processes
• examinations
• focus groups
• hearings
• inquiries
• inspections of books
• inspections of operational performance data
• interviews
• metrics
• mystery shopping
• negotiations
• observations
• self-assessment
• site visits
• straw polls
• surveys and questionnaires
• tests

All tools and techniques have their own limitations. Both qualitative and quantitative tools and techniques can be useful. Who uses the tools and techniques and how they use them are significant factors, as the tools and techniques are not exclusive to audit, inspection and scrutiny.

The historical context shows that in the 1980s and 1990s, new measures were developed to assess and improve public services. The advent of New Public Management, with its emphasis on the routine collection and use of management information, encouraged audit, inspection and scrutiny regimes to ask organisations for this information as a source of data for central benchmarking and planning. Among the different tools and techniques, the use of metrics, such as the Balanced Scorecard, has become common across the public services. For example, the Office of Science and Technology has developed a range of measures to assess organisations, covering (a) products and customers; (b) internal learning and growth; (c) evaluation and performance; and (d) stakeholders’ and wider users’ views. The Medical Research Council (MRC), in reviewing its relationship with the units it funds, has been questioning whether the publication record of unit staff is a useful indicator of value for money for assessing the MRC’s investment, given that different specialist areas get different opportunities to publish papers. The Audit Commission looks at comparisons of costs between local areas and also uses outcome measures. It focuses on systems for performance management rather than purely on analysis of the existing data. In the NHS the Acute Hospital Portfolio comprises data specially collected for comparison purposes, which are used by the Healthcare Commission to evaluate the efficiency of delivery; the numbers are used as the first stage of a risk assessment, alongside secondary analysis of Hospital Episode Statistics, which comprise routinely collected data.

The shift to greater use of existing data is noteworthy; this ‘lighter touch’, with its emphasis on the re-use and sharing of information, is intended to reduce the burden on inspected/audited organisations. Nevertheless, the number of official targets and indicators is increasing in all parts of the public sector. Ultimately the aim could be for information to be shared between different inspection and audit regimes.

Since 1997 the paradigm for audit, inspection and scrutiny has changed further, with a growing interest in the idea of improvement through the co-production of solutions. This is partly driven by advances in social science research methods and evaluation techniques, and partly by political pressures, based on the realisation that many local services were of unacceptably poor quality and that implementable and sustainable improvement plans were urgently needed. More consultative methods were seen not only as a means of diagnosing problems, but also of ensuring commitment to the co-production of solutions. Yet public bodies still tend not to use deliberative and consultative tools and techniques very extensively. These shifts reveal a recognition of the limitations of uni-directional approaches to audit, inspection and scrutiny, and a realisation that these activities are ways of testing hypotheses, which are themselves open to variance. New skills need to evolve as the understanding of co-production broadens. The National Audit Office (NAO)’s phrase ‘Helping the nation spend wisely’ tries to reflect this. As these trends continue, more subtle, complex and localised processes of assessment will develop.

In parallel, there is increasing use of self-assessment not only as an arm of performance management but also as a source of information for external audit or inspection. Over time the Audit Commission had been finding that compliance with its inspections did not necessarily lead authorities to embrace the responsibility for implementing changes. The sense now is that self-assessment builds greater commitment, and allows variance and local ways of driving improvement. Under Comprehensive Performance Assessment (CPA) local authorities are welcoming self-assessment, and making less use of external quality management frameworks such as the EFQM Excellence Model and the Balanced Scorecard. It is also increasingly acknowledged that audit, inspection and scrutiny need to place service users rather than providers at the centre of the process, for example by investigating children’s experiences across the education, social work and health services. In this instance, the point of view of the child is becoming the defining framework for data collection. The origins of this interest in a greater customer focus can be found partly in the private sector’s much older recognition that attempts to improve ‘customer relations management’ serve the multiple interests of increasing market penetration, market share and strengthening competitive position. Some of these approaches have been lifted into public service providers’ thinking about ways to improve customer satisfaction, as represented in the phrase ‘joined up’.

The results of audits, inspections and scrutinies are increasingly being used as support for political arguments, and this creates an incentive for those being investigated to challenge them. Any checklist or hypothesis can be questioned, but the possibility of judicial review is making inspectors more rigorous in evaluating their choices of tools and techniques at two levels: did inspectors follow the prescribed methods and make a sufficient or reasonable judgement based on the evidence? And what is the validity of the underlying framework and the questions asked? Audit, inspection and scrutiny investigations are required not to call for, create or use data in such a way that prompts the investigated organisation to object. To avoid the risk of appeals, such as judicial reviews, the information gathered by the investigators must be warranted. The recent UK Freedom of Information reforms are a factor to be considered in this context. They may influence the decisions of organisations on what data they choose to hold, and thus the data which it is acceptable for audit, inspection and scrutiny bodies to request.

Skills

The skills and training that investigators bring to the task vary. Many investigators may be confident that they are fully competent to conduct an interview or focus group, for example, even though they have not been specially trained. In reality, experts trained in these methods may be required in order to ensure that the evidence is accurately gathered and presented for analysis. Nonetheless, the aims of investigators and researchers can differ significantly, and this will have important implications for the collection and organisation of data. For instance, a trained social researcher undertaking a classroom observation for a research project will construct and approach that task differently from an Ofsted inspector.
The investigators’ knowledge and skills will also influence their choice of tools and techniques. Their degree of familiarity with, and proficiency in, particular qualitative and quantitative data gathering approaches will affect their views on whether they can undertake the work themselves, or need to commission it from external experts. Investigators who lack training and experience in using particular research tools may avoid using them (whether directly or through commissions), even though relevant evidence could be gathered in those ways by skilled investigators. For example, CHAI (the Commission for Healthcare Audit and Inspection) found it useful to create a pool of policy analysts, drawn from recent social science graduates. They work alongside the inspectors, who are professional peers of the investigatees and hence have the gravitas and credibility for the role of evidence collection. Teams comprising mixed disciplines and skills bring opportunities for triangulation in relation to the design of the data gathering and preparation for analysis. This also acts as a check on the quality of the data (see later section). For teams to work effectively, there needs to be skilled project management, and the ability to manage the tensions that commonly arise within multidisciplinary teams.

Whereas professional analysts will tend to favour ‘scientific’ evidence, politicians’ skills and backgrounds tend to make them most receptive to, and confident about using, data that comes in the form of narratives, anecdotes or a telling point, which they can perceive as a reality check. Partly this is because many politicians do not understand complex analysis – they leave that job to officers. Partly it is because the voters’ voice will always trump any other evidence put before politicians. Long documents are unwelcome because politicians have limited time, so the officers tend to mediate the data in order to produce short summaries. This affects consideration of the relative merits of different qualitative and quantitative tools and techniques for different users and uses. Local politicians tend not to understand the ‘traffic lights’ systems, which are now so prevalent in assessments of the performance and progress of the main public services, but which were not designed with them particularly in mind or perhaps explained to them. Politicians’ debating and point scoring skills still count for a lot, especially when a committee or authority’s members have to reach a decision.

Investigation regimes; culture and philosophy of investigating organisations

Each audit, inspection or scrutiny body establishes its own approach or regime, which is the product of its remit, its chosen style for carrying out investigations, and its political context. The regime will influence the behaviour of the organisations it investigates, mostly in the direction of compliance, although this is a dynamic, evolving arrangement. Thus, for example, the influence of the Higher Education Funding Council for England’s Research Assessment Exercise on the behaviour of academic staff and the strategies of universities has been substantial. Audit, inspection and scrutiny regimes change over time, partly in response to the way the organisations and systems they investigate adapt to the regime. New regimes precipitate responses both positive and negative, as adaptations build and in turn prompt further changes to the regime. Adaptation by the investigated bodies is not the only driver for change; new ideas and incentives about public goals emerge and become articulated and incorporated. As regimes change, so does what counts as evidence.
An investigator working for an organisation that claims or believes it is seeking after ‘objective truth’ will approach the task of identifying and collecting evidence differently from one that claims to work ‘deliberatively’ with the subject organisation it is investigating. The different traditions and disciplines of audit, inspection and scrutiny shape the chosen methods of investigation and uses of the data. Furthermore, these habitual methods of collecting evidence also inform opinions about what evidence is considered to be available, relevant and feasible. A shift from a so-called ‘hierarchical’ regime of investigation (which specifies the evidence requirements at the outset and applies the chosen methods uniformly) to a ‘team-’ and project-based approach (where there is more discretion in evidence selection and methods of gathering) is discernible across audit, inspection and scrutiny bodies, and explains their respective approaches to the collection and use of evidence. The NAO, for example, has gradually moved more towards the discretionary end of this spectrum.

The Audit Commission has shifted between quantitative and qualitative methods in the context of the move from Compulsory Competitive Tendering (CCT) to Best Value (BV) to value for money. CIPFA (Chartered Institute of Public Finance and Accountancy) statistics had been a very authoritative data source but following the introduction of CCT many authorities withdrew from collecting statistics and by the mid 1990s CIPFA statistics could no longer be used as a reference for benchmarking. Because of that vacuum, Best Value statistics became very important. For CPA, with its focus on bespoke services and choice, there is less emphasis on costs and units, or on satisfaction, and more on achieving a balance between quality and costs. This again impacts on the preferred tools and techniques. Embedded evaluation is a necessary corollary of the new focus, and different types of evidence have particular salience for different investigating cultures. With the teamworking regime there is more discretion to construct arguments about the data, while in a ‘mass production’ regime the non-admissibility of different data types is more predetermined. The shift in culture has also been described as ‘from mass production to batch production’.

The hierarchy of evidence used in the health sector is not replicated in social care. The quantitative evidence base in the latter is seen as much less significant than what service users say they want – regarding care standards in care homes, for example. An Audit Commission inspection of the supervision of young offenders included evidence concerning national standards, compliance with those standards, analysis of case records of supervisions to see whether they were tailored responsively to particular cases, and conversations with young people and case officers. Auditors collected this information in order to triangulate across and between the different types of data, and they regard this as a different approach from a traditional inspection. In the case of the MRC, the 37 units it funds were previously individually reviewed every five years in a uniform way. Now the MRC wants to understand the wider context of the units and needs a varied framework for evaluating them. It could be that what defines investigation regimes is their primary purpose – whether it is to deliver accountability or improvement – although a rhetoric of improvement may be used to disguise a regime of accountability. Furthermore, the culture of the investigator is partly an outcome of the culture of the investigated organisation.

Framing

Investigations may differ methodologically as a consequence of the underlying theory or hypothesis that they set out to test. These considerations will frame what particular questions are posed, how they are expressed and what sources are considered appropriate to answer them. Investigators can choose inductive or deductive approaches.
Similarly, in relation to identifying the issues that an investigation should address, one method may first decide on what issues evidence is required, for example through a brainstorming exercise, before going out to collect the evidence. Another method may opt to leave more open the exact scope of issues to be considered. In this more exploratory mode, the first task is to gather evidence on a wide front and narrow down the issues later.

Basic assumptions differ as to how services work and what modernisation and improvement strategies would be desirable. Audit, inspection and scrutiny checklists are usually based on excellence, not on improving from a very poor position. They are also based on evidence, that is, on information assumed to be capable of illustrating what has caused different outcomes. The framing is a set of ex ante assumptions and prejudices. The creative act is to interpret what the framework means. Thus the Healthcare Commission wants to know what issues are of importance to the people it is inspecting. In an ideal situation there will be a degree of acceptance among the inspected that the inspector’s prejudices are reasonable. Issues analysis helps to deal with uncertainty, because inevitably there are clashes in cognitive views, and fashions change. Realist evaluation and logic modelling require people to be more explicit about their assumptions.

For political scrutiny, agreement is needed around the focus and the key questions to be posed. Thus Ofsted and schools and local education authorities agree on the inspection framework and specific questions. Even so, different inspectors will have a different take on what they encounter. Standards offer one way to minimise the effects of prejudice but ‘[national] minimum standards’ are still a theoretical construct, not a reliable information base. Sometimes checklists are seen as inflexible because they do not represent the world of the inspected. Assumptions are often not made explicit, or explained away as ‘common sense’. In CPA there is an attempt to identify ‘key lines of enquiry’ and the answers to look for, and these are moderated to see if the right judgement was made based on the given data. The key lines of enquiry have been negotiated with the Local Government Association, but they are nevertheless open to extensive interpretation at the level of the individual authority.

The topics of some local scrutiny reviews are identified by members who have clearly been influenced by recent national events or press reports, for example on flooding risk or the risks from mobile phone masts. In such situations, the weaknesses of poor local services can be overlooked. Innate bias within the overall framework is difficult to recognise and step out from, and both people and organisations become attached to their preferred paradigm or view (for example, in another context, the European Commission’s focus on small and medium enterprises may have arisen from its inability to control multinational companies). While it is reasonable to try and address differences in framing between investigators and investigatees, this should not obscure the prime intent of audit, inspection and scrutiny, which is to investigate for purposes of accountability rather than unpick service providers’ logic models.

Availability of evidence

Practical considerations can determine what evidence is regarded as collectable, including pre-determined time limits within which an investigation must be completed, and/or the budget available for the work. Given those constraints, methodological choices may be made in order to increase access to the people, documents and other information that the investigation requires.
In addition, the responsibility to minimise the burden they impose obliges audit, inspection and scrutiny bodies to ensure that their work generates no more additional work for the services they investigate than is really necessary. What constitutes proportionate effort is of course a huge question. Risk assessment should be done prior to scoping the investigation. This is still a new idea and would facilitate sample audits, but it is not yet a favoured approach.

Closeness to the data; stance of investigator

Some investigations are designed to produce a direct ‘live’ encounter with the subject, such as an interview, focus group, oral hearing, site visit (see below) or workplace observation. In other cases secondary sources, such as analysis of existing records, performance data, compliance reports and the like, may be preferred. These choices carry significant implications for bias and clarity in data collection, and raise questions about the determination of reliability in the evidence base. The investigator may adopt a detached or involved stance to gathering evidence, or a position somewhere in between. For example, one investigator may consider it appropriate to be a silent, impassive observer of a workplace or activity, and attempt to ‘give nothing away’. Another may seek to behave in ways that generate engagement and two-way exchanges, in order to reduce or avoid giving the subject organisation the impression that it is being judged forensically by an authoritarian figure.

Site visits are regarded as an extremely important mode of evidence gathering in several sectors, and some inspectors are deeply attached to them because they believe they can thereby obtain a more immediate and direct feel of the place. Inspectors and auditors can also observe actual interactions between people in the organisation. In Best Value inspections the site visit is an essential preliminary to identifying the key issues; documents come later. NAO auditors say they do not really have a useful mental model of the organisation they are investigating until they have experienced it personally through a site visit. Ofsted inspectors obtain credibility with the staff of the school being inspected through their track record of professional experience. There is a difference between planned site visits and unannounced ones. Mystery shopping techniques may give a more spontaneous (because unrehearsed) experience of an organisation, but the NAO, for example, is making limited use of this technique so far. It knows that highly planned and rehearsed responses to visits are of limited use, but it wishes to feel more confident about what counts as proper standards of evidence collection.

Parliamentary select committees gather evidence through visits too, and find that witnesses are prepared to say more to them if the circumstances are informal. This also enables other voices to be heard, rather than the ‘usual suspects’ from the representative stakeholder bodies in a particular sector, with their respective ‘party lines’. In response to requests for views and comments, Age Concern now provides a panel of older people as a service, instead of always producing the HQ voice. It was overwhelmed with requests before it set this service up. Local councillors go out on site visits too, because they want evidence from ‘the public’. There can, however, be difficulties if the ‘usual suspects’ filter what is said. There can also be a risk of prompting consultation fatigue, but innovative ways of engaging the public are being developed. These are producing interesting evidence, but it is not yet so clear how to collate it and feed it in to policy making. Experiential methods such as deliberative events are ways to generate data and different approaches to doing things.
Data gathering cannot be purely ‘scientific’, and experiential methods, if they result in improvements, are worth having for their own sake because they develop learning.


Both detachment and closeness are valuable, and teamwork can help to achieve both. Local government politicians are close to the problem and to people, so it is appropriate that they act as non-expert investigators alongside the audit, inspection and scrutiny experts; the point is how to join them up well. However, some auditors have never themselves worked in local government, so there can be many limitations on their judgement, as well as considerable frustration in local authorities when auditors do not understand the context in which the data have been gathered.

Quality assurance

Quality assurance has changed users’ and providers’ experience of services very significantly by raising their expectations, while it has also developed as an explicit approach to accountability. It acknowledges that information can be subject to alternative interpretations. Quality matters because the stakes can be high: people can lose their jobs as a result of audit, inspection or scrutiny recommendations, so there is no option but to provide a clear audit trail. In addition, freedom of information and data protection legislation now allow greater access to personal and public records. The Healthcare Commission’s clinical governance review was double-checked, not only for the recommendations but also for the evidence base on which the recommendations rested, so that the work could be shown to be sufficiently rigorous. The Audit Commission also routinely sets, applies and checks standards for the records of evidence on which its CPA and other inspections are based.

If comprehensive national standards were set by government, ministers could appoint inspectors to check against them and local variance would not enter the equation. However, while such standards may be part of political discourse, they do not exist in reality, so audits, inspections and scrutinies look at detailed local practice to check local performance against a more modest set of minimum standards. And even if detailed checking against a set of nationally determined standards were possible, it would not necessarily be desirable: co-production, which is increasingly felt to be necessary for the subsequent implementation of change, is not achieved through micro-management.

The label ‘amateur market research’ was applied pejoratively to the Audit Commission’s reality checks under Best Value, when it was a requirement that all stakeholders agreed with the evidence supplied to the BV team. Audit Commission value for money studies (now discontinued) became popular for a time because they offered strong examples of opportunities for development, not prescriptions. The CPA process provides quality assurance of self-assessment and improvement, with authorities creating their own improvement plans rather than an outside inspector coming in to ‘find out what is wrong’. CPA seeks to identify the leadership and management aspects that will enable an authority to identify and deliver improvements.

There can be quality problems with audit, inspection or scrutiny regimes even if every organisation in a particular sector or service is investigated in the same manner, using a ‘one size fits all’ approach: inconsistencies between the investigating teams can cause different judgements to be made about similar evidence.


Conversely, where teams have discretion to determine their own methods, they may work inefficiently because they reinvent approaches every time instead of applying core tools that have been tried and tested. A combined methodology may be needed: first conducting the audit, inspection or scrutiny with a core, national set of tools and approaches, and then supplementing these with a wider set of complementary tools that can be used as local circumstances require.

Value judgements about the suitability or quality of different methodological approaches need to be made, and many research designs that are assumed to be ‘gold standard’ may, in reality, be intellectually weak. An investigation could be judged to be of poor quality because it has been conducted using methods that are perceived to be intrinsically inferior (irrespective of purpose), or because suitable methods have been poorly applied. A question that has been exercising those who commission research for government and public service bodies is: can and will independent research experts create and collect evidence that proves more useful to policy makers because it is (a) timely and (b) easily intelligible to non-specialists? At issue here is the appropriateness of the tools, techniques and hypotheses to the ultimate purpose of the investigation. Some of the quality assurance practices of pure research (e.g. peer review) can seem to work against timeliness. On the other hand, applied research conducted on a more pragmatic basis (so-called ‘quick and dirty’ studies) can suffer from design flaws or raise other doubts about methodological soundness. In recognition of this, organisations may try to introduce greater rigour into their applied research practices, for example the NAO’s ex post work. ‘Disciplined enquiry’ methods attempt to be more research-like (as regards rigour and training), in order to meet some quality standards.

Depth of collection

The investigator’s dilemma is how to know when he or she has enough, or the minimum amount of, evidence. How deeply into the data does the investigator go? For example, should a document be subjected to a skim read or to a detailed textual analysis? Should evidence be assessed impressionistically for the broad themes it contains, or mined for every single nuance that can be spotted? Should data be collected using a pre-defined checklist so that boxes can be ticked? Should numerical data be subjected to a fixed set of queries in a pre-defined order? These choices reveal assumptions in the investigator’s approach to the structuring of the data, the boundaries being placed around it and, by implication, the determination of relevance.

Of the three criteria used to assess the evidence base of research studies – relevance, reliability and sufficiency – the last is the hardest for researchers to apply. The world moves on, and the reasons for undertaking a study, or the importance of particular components, may change, with implications for the sufficiency of the data collected. Sometimes more evidence can be gathered later, at the quality assurance stage of an investigation, from further interrogation of the data already to hand, or through discussion of the draft findings with the investigated organisation, which prompts further data to be made available.


Shifts in what constitutes evidence: convergence?

Attitudes to the admissibility of evidence types are changing. The NAO would not have accepted oral evidence ten years ago, nor quoted or cited academic research as evidence in its own reports, as it could not guarantee its quality. It has shifted on both counts, so that a wider range of evidence types, including qualitative evidence, can now be included. The Audit Commission also used to be wary of citing academic sources, as its own audit reports were meant to be free-standing. Experience from elsewhere in audit, inspection and scrutiny suggests that this shift is a general trend. Whereas in the past there was a preference for quantitative over qualitative evidence, because of a perception that qualitative evidence was ‘less reliable’ and quantitative evidence ‘more objective’, nowadays a more mixed economy operates. Insofar as qualitative data add to the ability to enable improvements, the shift also supports that aspect of the remits of audit, inspection and scrutiny bodies.


4. Synthesis of evidence in audit, inspection and scrutiny (March 2005)

Introduction

The subject of this seminar was synthesis – how evidence is processed, interpreted and otherwise brought together to construct arguments and draw conclusions. It takes forward ideas discussed in the previous two seminars (on types and methods) and prepares the ground for considering outputs – the purposes, forms and styles of communicating the findings of audit, inspection and scrutiny – to be addressed next. Focusing on synthesis provides an opportunity to examine the options, assumptions and decisions involved in analysing and interpreting evidence that has been collected or assembled for the purposes of audit, inspection and scrutiny.

The starting point is: what is to be done with the evidence that has been collected and/or assembled in the course of carrying out an audit, inspection or scrutiny? Something does have to be done to the evidence, because evidence left alone does not ‘speak out’; it has to be taken beyond its collected state, and that step or steps constitute the synthesis. What is done to the evidence, and how and when it is done, will make a difference to the influence and currency of the audit, inspection or scrutiny. Some of the choices about what to do with evidence are generic, not the sole property of audit, inspection and scrutiny activities. Choices partly reflect prior ideas about ‘appropriate’ ways to extract meanings and turn them into suitably informed conclusions.

Terminology: synthesis, analysis or interpretation?

Terminology needs to be clarified. What are the different meanings and usages of ‘synthesis’, ‘analysis’ and ‘interpretation’, within and between research and practice? How are these terms used by practitioners and by researchers? Can the terms travel safely between the communities? How are the uses and meanings of these terms evolving?

Research synthesis differs from audit, inspection and scrutiny synthesis in the level and depth of analysis of existing evidence. But standards in both academic and practice spheres require that investigations demonstrate clarity of purpose, use appropriate methods, demonstrate objectivity and rigour, and report methods and findings clearly. Academic literature reviews and systematic reviews most commonly present a descriptive narrative of the literature that has been identified. Realist synthesis uses research and other literature to reveal and articulate the underlying assumptions about how a policy or programme is intended to work and how it works empirically. Meta-analysis assembles the results of comparable existing studies into a single larger data set and performs calculations to infer effects at the wider level. Meta-ethnography uses a body of research literature to perform inductive content analysis and construct broader interpretations.

In a scrutiny by a parliamentary select committee, once questions on the particular topic have been articulated and answers to them collected in evidence received by the committee, the clerk writes one page setting out the ‘heads of the report’. This becomes the agenda for a deliberative session of the committee, after which the clerk writes the full report, mapping out the agreed themes and synthesising the evidence into arguments that the members have voiced or can collectively agree.


The committee members are drawn from a spectrum of political positions and would have great difficulty creating that text themselves. Rather, they delegate the task, and consider and comment on the clerk’s text. If the clerk has succeeded in accommodating the members’ various political perspectives on the issue, they are able to sign up to the text comfortably. In the case of particularly controversial reports, on the other hand, considerable debate among members may be required to accommodate divergent views. There may be a series of votes on contested wording, although it is rare for a competing ‘minority report’ to be produced. The objective throughout is to produce a report which authoritatively reflects the consensus arrived at by the committee after a full examination of the evidence.

The National Audit Office (NAO), when embarking on a study, sets up a series of issues in the form of logical questions. It then decides what data to collect and the methods of collection. The questions are revised in the light of the data, and a stop point is arrived at when there are enough data in hand to answer them. This is a more dynamic and iterative approach than a simple linear sequence. The NAO’s work stems from financial audit, which checks discrete information and actions; what count in these wider studies, however, are judgements about processes, based on evidence that has been weighted and differentiated hierarchically by the investigator. The investigator pulls the evidence together to form conclusions – to prove – and goes beyond that to make recommendations – to improve. The evidence therefore has to be processed so that it enables the investigator to form defensible judgements. Investigators have to determine what will constitute sufficient evidence, and what judgements the evidence will support, especially where the analysis or interpretation may be contestable.

Synthesis is a way of making connections between the content and meanings of the findings. Transparency is important because what is at stake behind the investigation is a key determinant, and the synthesis may be mistrusted if it is not done robustly and in view. Reporting on the synthesis process that has been adopted is necessary to allay such doubts. Consultation with stakeholders is not enough; analysis goes beyond just hearing views.

Processing the evidence

Although many things can or could be done to evidence, the choice is usually a response to the need to achieve two refinements: (a) to create some order in the whole of the evidence; and (b) to develop insights about the evidence. Neither of these processes is neutral. As discussed in the previous seminars, the methods chosen for collecting evidence in an audit, inspection or scrutiny reveal the frame in which the policy or activity is being understood by the investigator. This frame is bound to influence the synthesis because it arises from an underlying theory or hypothesis about action, or about cause and effect. Things are done to evidence in order to test that theory or hypothesis. For example, Ofsted inspection methods are informed by a combination of individual inspectors’ theories of what makes a good school and the corporate Ofsted orthodoxy about what makes a good school. Inspection findings are studied with those theories and hypotheses in mind, whether explicitly or implicitly. Similarly, financial audits look at an organisation’s books, treating those records as evidence of the organisation’s financial management skills and of its practices in complying with a prescribed set of official regulations and standards.
The individual auditor may also have a hypothesis in mind, which informs his or her ‘reading’ of the books.


Are there particular distinctions in relation to synthesis between audit, inspection and scrutiny, insofar as each considers different types of evidence using different methods? For example, an audit could use performance indicators, and not much else, to demonstrate accountability according to a specific framework. However, audits and inspections are tending to move beyond relying on a single narrow source of evidence to take in a wider mix of types. Conversely, in scrutinies it is arguable that little synthesis goes on at all, if the hearings are themselves the means of ‘holding to account’; the scrutiny may not do much more than report that account. There is a hierarchy in the judgements too, if for example the conclusions arrived at by a local authority scrutiny committee are compared with those from an Audit Commission inspection of a local authority under the Comprehensive Performance Assessment (CPA) framework: a lot hinges on the outcomes of the CPA for a local council.

Judgements based on a synthesis of evidence will always be context-specific, reflecting organisational culture and history, and the degree of diversity in the evidence. Audits, inspections and scrutinies have to organise their analyses of evidence in an increasingly compelling manner, because their judgements are examined more carefully nowadays. The political context impinges too, where lay interests undertake scrutinies, as compared with the relatively professional context for audits and inspections (although all of them are meant to take the public interest into account). In this climate of challenge and contest, trust cannot be taken for granted. Professional legitimacy is perceived to be eroding, to the extent that anything might be challenged, even the result of a technically sound and rigorous piece of work. Transparency of analysis does not prevent challenge, so the process of analysis needs to include thought and clarity about the choice of standards and protocols. The recent Public Administration Select Committee (PASC) inquiry into inquiries found that judges feel themselves to be under attack, even though they base their judgements on evidence and on stated interpretations of evidence. The source of a judgement – who is saying it – can be as important as what is being said. NAO judgements, for example, can sometimes be seen as pejorative or critical by subject organisations even if that was not the NAO’s intention. This has consequences for the outputs: the methods of presenting results in the public domain.

Creating order in the evidence

Creating order in the evidence can be achieved by grouping like elements together, for instance according to origin (country, institution), source (client, user, expert, provider), type of evidence (written, graphic, etc.), type of content (case notes, financial accounts, etc.), and so on. For example, a study of clinical outcomes of a particular acute service provided in hospitals might assemble evidence from comparable cases reported in different countries; a study of a neighbourhood renewal initiative might look at the transcripts of interviews and focus group discussions with stakeholders in local communities. These ways of creating order are a first step in searching for patterns in the evidence.

Audits, inspections and scrutinies deal with the performance of organisations or systems. Specific policies or standards provide the context for that performance. Evidence is used to construct arguments about performance.
The methods that have been used to collect the evidence inform the content of the argument and the way it is crafted. Different methods carry different assumptions about ‘being faithful to the data’.


The spectrum ranges from one extreme, where the analysis apparently ‘simply holds a mirror’ to the subject of investigation, to the other extreme, where the analysis constructs a free narrative and commentary, drawing selectively on the evidence base. For example, an analytical method such as a cost-benefit study could generate an argument that is crafted to be an ‘objective’ and rigorous assessment of performance. An accounting method, such as that traditionally found in audit work, will identify and track ‘trails’ and craft an argument about performance by explaining the significance of events in that trail. A discursive method, such as that found in parliamentary and local government scrutinies that generate evidence through ‘hearings’, makes judgements about the witnesses’ evidence in order to craft arguments that support or challenge the witnesses’ claims. The scrutinies done by parliamentary select committees are now less likely to be bad-tempered paragraphs from members and more likely to be well-founded reports with a richer evidential base and well-made arguments.

How compelling does the analysis of patterns in evidence have to be to overturn the framing and initial logic of the inquiry? If evidence collected in an investigation does not match the starting hypothesis, it has to be sifted through again, to see what will have impact in a report. Investigators also have to explore the issues that did not emerge. Many answers to the questions posed in an investigation will generate data that prove uninteresting to the investigator, but the investigator always has to bear in mind how important the overall conclusions will be for the different stakeholders. Bayesian research suggests that people are reluctant to drop their original position or put aside their starting point, even if the evidence should shift their view in a different direction. Their initial context is a powerful shaper of their outlook, and a limit on their receptivity to new or contradictory findings.

There may also be a ‘conspiracy of optimism’ between the investigator and the investigated. For example, CPA inspections at first demonstrably improved local authorities’ ability to write strategic plans. This came about because the site inspections themselves were a limited exercise, heavily dependent on papers provided by the local authority, so the better written those papers were, the better the impression the authority was likely to make on the inspectors. If the purpose of top-down inspections is to assess compliance with criteria, they will control behaviour, especially if the results are tied to subsequent resource allocation. However, if inspections are intended to change or improve policies, their approach to the synthesis of evidence should be different.

Some investigations are unlikely to flow in a strictly linear sequence of stages. Sometimes evidence collection is designed to be iterative, with early analysis of initial evidence used to guide what further material is collected and analysed next. Thus, when an investigation has reached its synthesis stage, unanticipated gaps in the evidence may be revealed that necessitate further collection of data; the initial evidence may turn out to be unusable or uninformative.

Case studies are a particularly compelling form of evidence for policy makers; the easily understood appeal of qualitative coalface detail will often outweigh the dry analysis of assembled quantitative fieldwork data.
From a purely evidential viewpoint case studies may lack weight, in the sense that they present only a single case, but their potency is such that they can often drive policy. Qualitative evidence is always selected and weighted by the investigator, even if this is not made explicit.


In some circumstances this kind of evidence may be deliberately cherry-picked, with the investigator making the judgement at the outset and then looking for cases to illustrate the argument.

Developing insights about using the evidence

The task of synthesis is to draw the necessary conclusions from the evidence. One way of developing insights about the evidence is to identify the common properties within clustered elements and the nature of the differences between those clusters. Such pattern recognition may be organised around themes, topics or questions, for example. This is a second step in the search for patterns in the evidence. Some audits, inspections and scrutinies use evidence that has already been gathered and presented in clusters, or structured for another purpose, for example the Hospital Episode Statistics used by the Healthcare Commission.

Parliamentary select committees send their reports to the relevant government department, which then replies. The committees therefore tend to put the emphasis where they see they have the political scope to prompt practical changes. One committee deliberately sets out to create patterns in the evidence it receives by publishing an ‘Issues and Questions’ paper at the start of an inquiry. It finds this produces responses that are more directly useful, although it also intimidates some non-expert people, who then do not send in views. Witnesses’ evidence submitted to the PASC inquiry on the government’s use of targets revealed the huge extent of targets, and this strongly influenced the committee’s report; its arguments on that topic could have been heavily challenged if it had not had that evidence base. However, there is a risk that focusing on issues and questions defined at the start may limit the synthesis to those topics, so other key material present in the evidence may be missed. The NAO’s task is to save the nation money, so it is bound to pick areas for investigation where it believes it can have the biggest impact, for example where its recommendations cannot be avoided.

It has already been said that investigators use tacit assumptions to frame their attitudes to the evidence and its analysis, so surfacing these assumptions can help them and others to see what manipulations they may be making in the process of synthesis. For example, critics say that the work of the NAO and the Audit Commission may not be sufficiently challenged because of those bodies’ status. The NAO’s way of working is to obtain prior clearance from the organisation under investigation before publishing its report. It is in the NAO’s interest to go with the grain of the organisation’s policy, and to try to exert influence by gathering new or original data and citing evidence from independent sources, to add weight and credibility to its arguments. Although the Audit Commission does not seek prior clearance of its inspection reports, there is internal self-censoring by inspectors when writing them, to avoid possibly damaging open confrontations with local politicians. Parliamentary select committees also operate in a highly political environment, and plan their work programmes strategically, avoiding controversial matters shortly before an election. Government whips have a strong say in who chairs the committees, so there is no point in the committees pushing their luck too far. Local government scrutiny still suffers from the reluctance of elected members to challenge the policies and actions controlled by their own party.
The underlying point is that, as well as political imperatives, self-censoring may be a result of investigators’ implicit framing and assumptions that are not fully acknowledged or surfaced.


Selection and weighting of evidence may be going on in investigations for other reasons too, perhaps preventing investigators from ‘being true to the evidence’. The claimed independence of audit, inspection and scrutiny now needs to be seen as heavily influenced by political realities and by the implicit attitudes and assumptions of the investigators. Arguments may also sometimes be more a reflection of the work done at the reporting stage than of the evidence-collecting stage of the audit, inspection or scrutiny.

Shifting traditions of investigation

Audit, inspection and scrutiny bodies originate in different traditions (constitutional, organisational, professional, etc.), which characterise their purposes and styles of work. Some used to be more evidential than others, but all are now becoming more reliant on evidence, partly in order to ensure their reports are defensible. As a result, greater value is now placed on having the appropriate expertise to collect and analyse evidence. Parliamentary select committees are beginning to be able to afford to employ more staff and to commission work from outside experts. Local government scrutiny work is not nearly so well resourced and, as a result, the quality of reports is more variable and they can lack methodological rigour. Scrutiny does not come from a quantitative tradition, whereas audit clearly does. Audits and inspections tended to be deductive, but are sometimes now able to be a little more inductive, allowing the issues to emerge rather than closely defining them in advance. The types of staff being recruited to audit, inspection and scrutiny bodies indicate that there is only a very gradual recognition that skills and backgrounds need to be broader, to enable richer forms of synthesis.

The academic tradition of evaluation studies has not proved particularly helpful in increasing the repertoire of skills and techniques used in audit, inspection and scrutiny. Forming a judgement about the evidence requires a view of history and an embedded awareness of the domain or sector, whereas academic approaches are more technocratic and rely on a narrower view of what works. Increasingly, the ‘co-production’ of analysis by investigator and investigated working together is being attempted. This raises the probability of sustainable action on the agreed findings, but it also risks ‘contamination’ of the analysis. In addition, too strong an assumption of linear cause and effect can risk blindness to cross-cutting effects that the evidence may be demonstrating, or can cause signs to be missed of the long-term adaptive shifts that investigated organisations make because they are locked into dependence on their investigator.


5. Outputs of evidence in audit, inspection and scrutiny (June 2005)

Introduction

The subject of this seminar was outputs – the purposes, forms and styles of communication of the findings of audits, inspections and scrutinies. This topic takes forward the work of the previous seminars (on types, methods and synthesis) and completes the foundations for further consideration of the nature and uses of evidence for accountability in a final seminar in September 2005. Focusing on outputs provides an opportunity to examine the options, assumptions and decisions involved in creating and disseminating evidence that has been defined, assembled, analysed and interpreted for the purposes of audit, inspection and scrutiny work. Understanding more about outputs reveals some of the explicit and implicit ideas concerning who the audiences are and what uses they could make of the information. It should also be possible to see how types of evidence are connected to different audiences. This paper identifies some broad themes.

One way to shed more light on what underlies the choices between the different approaches to outputs that investigators adopt is to ask what impact they intend their work to have and what impact it actually has. Impact is a function of three elements: content, presentation and engagement. The impact outputs actually achieve depends on the reasons for creating and communicating them in those particular ways, but also on the particular environment in which the outputs have to operate: where there may be competing news priorities, where complex messages have to be translated or ‘versioned’ for different audiences, where the track record of the investigator or the investigated influences reception, and where the political history of the subject affects the attention paid to outputs. Impact is probably also shaped by the type of relationship between investigator and investigated; some relationships impose more obligation than others on the investigated to report back on plans and on action taken.

Terminology: the media and routes of outputs

‘Narrow’ and ‘broad’ definitions of outputs can be differentiated. Narrow terms tend to refer to the forms and formats – the media – of the documents created on completion of an audit, inspection or scrutiny, such as a full written report in hard copy, a short summary of the full report, a press statement, and digital versions of these plus images, sounds and other related information, all of which may also be published on web sites. Broad terms refer to processes of communication, whether one-way, two-way or multi-directional, which collectively might be called the ‘dialogues’ that the investigators want to stimulate in relation to their work. These might include face-to-face meetings between the investigators and representatives of the organisation under investigation or the users of that organisation’s services, exchanges of letters with boards, managers or users, presentations to seminars and conferences, training programmes for staff, briefings for parliamentarians, or interviews with journalists.


Formats

The physical look of the outputs, as well as the language used in them, can be carefully designed to appeal to particular recipients and create a certain impression. Publication may be timed to reach targeted readers at particularly key moments for them. The wider dissemination of the messages through dialogues with and between stakeholders may not be left to chance by the investigators, but scheduled and engineered through third parties. Nowadays, investigators think more carefully about the range of outputs they can produce, and are more selective in their choices, rather than assuming that the 100-page report is sufficient. An evidential approach to investigation necessitates a fully documented and referenced report ‘for the record’, while more ‘practical’ outputs in other formats are derived from it. One CHI (Commission for Health Improvement – the Healthcare Commission’s predecessor body) project produced a web tool for information on health services. Although built from scratch and more intellectually demanding to design, it was less time-consuming to produce than the accompanying report, because it did not involve the usual research disciplines of evidence checking and referencing. The supporting report, however, was considered necessary to ensure and demonstrate that the tool was defensible and robust.

Many top-level service managers are interested only in the executive summary and press statement rather than the full text of an investigation report, so more work goes into obtaining internal clearance of these shorter outputs. Press conferences at which reports are presented to journalists can be ‘theatrically’ managed. The Chief Medical Officer, presenting some work on patient safety, used individual patients’ stories illustrated by still images and video clips to reinforce compelling messages about shortcomings in the services. The Audit Commission reported on a youth justice study using an extended case study of one person, which the media picked up. There are increasing moves to adopt other formats which can enliven the written report. One National Audit Office (NAO) study on cancer services included with each copy of the written report a DVD of recorded interviews. The NAO also makes use of alternative means of communicating its messages, such as conferences, seminars and direct communications with the public. One study on the regulation of telecommunications, for example, produced a leaflet for the general public on how to get the best from competition and choice in that market, which was distributed through Citizens’ Advice Bureaux. Although local public service scrutiny still relies on written reports, there is recognition of the need to ‘push’ the findings via other, more popularly accessible media such as the local press, TV and radio, and web sites.

Practitioners and academic researchers use case studies differently. Academics use case studies to generate theoretical understandings, whereas practitioners use them to illustrate outcomes. Case studies may sometimes be misused, if the interpretations made from them go beyond what the whole evidence base can support. On the other hand, an apparently well-evidenced research conclusion that cannot be linked back to a real case history may be of limited utility. Researchers sometimes complain that policy and delivery decisions are based on misuse of evidence or on the absence of evidence.


Interpreting results means that the data are being mediated, which brings the risk of forming wrong judgements. Nevertheless, data have to be summarised and mediated (as the third seminar in this series, on synthesis, discussed). The trend towards greater use of anecdotes is noticeable, and the risk of being swayed by the last voice heard needs to be tempered by a commitment to balanced interpretation of the evidence. Politicians tend to be influenced by a single story, which can in itself reveal a lot about a particular service; they are less likely to study the detailed statistics or other evidence in a full report. Academic researchers often used to be trumped and dismissed by other commentators who could sway the discussion with a powerful example. Nowadays, the analysis tends to build in more stories, so as to hold attention.

Purposes

The purposes of outputs are (i) to provide assurance that minimum standards for a given service are in place, (ii) to describe the strengths and weaknesses of the body, service or policy under investigation, (iii) to point to possible improvements (i.e. to put accountability for the quality and effectiveness of that particular service or policy under a public spotlight), and (iv) to ensure that services as delivered are aligned with government policies and agendas. Overall, what originators require of outputs is that they make a (positive and discernible) difference to the activities and achievements of the investigated organisation, service, activity or policy, which should at least result in some benefits for some stakeholders. Outputs arise in the context of pre-existing relationships and politics. There can be many layers to each of these purposes, so the originators’ control of content, presentation and engagement will make a significant difference to the actual uses of outputs. The spectrum runs from compliance with minimum standards to the attainment of best practice that brings about improvement and learning.

But various more interest-laden purposes may actually attach to the outputs, such as the wish to (a) demonstrate that an investigation has taken place (i.e. reassurance that the audit, inspection or scrutiny regime is working), (b) demonstrate the accountability of the investigator, and (c) strengthen the brand of the investigator itself (e.g. its superior understanding of service or policy issues, or its skill at exposing bad practices and identifying better ones). In addition, (d) the government has an interest in the role of audit, inspection and scrutiny as a means of demonstrating that things are getting better, for example through star ratings; there is pressure to be able to demonstrate that evaluation is reliable and that the system is improving because the policy is correct. These are institutional issues. By implication, outputs from investigations that do not support government policy, or are equivocal about the impact of those policies, may be suppressed if they are likely to upset ministerial or departmental positions. This may happen through self-censorship and careful selection of the topics subjected to investigation. The apparatus of audit, inspection and scrutiny must be seen to be independent if it is to increase confidence in government and attract public credibility. On the other hand, not overtly criticising government is seen as a way to get it to listen to evidence and arguments about improvement.


Audiences

It is important to distinguish between different audiences in studying the fate of the outputs of investigations. Recipients who are in a position to make changes to the policy, service or activity that has been investigated (e.g. managers or politicians) will engage with the outputs from a different perspective than the actual end users of the service (e.g. parents, patients). For example, are Office for Standards in Education (Ofsted) reports of school inspections directed primarily at headteachers, governors or local authority officials, or at local authority elected members, or at parents of current and prospective pupils, or at the wider disinterested public? How do the requirements and appetites for information of these different recipients vary? An output aimed at one particular audience can alter the behaviour of another group of stakeholders. A study of the performance of clinicians in New York State was intended to inform patients’ choices, but it worked by influencing clinicians directly, scaring off those whose performance was judged to be poor. Outputs can be used deliberately to get a professional audience to shift by putting the information in public hands. This could be seen as ‘naming and shaming’ in the guise of ‘empowering the consumer’.

Impact

In some investigations, the outputs are thoroughly cleared with the investigated body in advance of publication as a normal part of the process, so that there are no surprises, no one is caused embarrassment, and the investigated body is given a chance to prepare and time the issuing of its response. This applies to NAO value for money studies, which clear the facts, and the presentation of the facts, with the body or bodies that are the subject of the study. Local government scrutiny reports are agreed in draft with the executive, so that the final report will be seen as realistic and will be taken up by the executive. While this collaborative approach has clear benefits for those concerned, and may enhance the likelihood of action being taken to bring about improvement, it risks unconscious or intentional collusion between the investigator and the investigated, which may not be in the public interest. The Audit Commission, on the other hand, sees itself as an independent outsider free to say what it wants, although it always consults government departments and national bodies on draft reports.

Some audiences are not obliged to respond but may choose to do so; for example, a local authority may institute a scrutiny of a particular local service as a result of critical findings in an Audit Commission inspection. The users of a report may be another investigating body, such as the Audit Commission using as evidence the findings set out in a Healthcare Commission report, or vice versa. Different inspectors using each other’s conclusions is likely to become increasingly prominent in response to the demand for ‘low burden’ inspection. However, some outputs also lead to additional work by the same inspector. For example, national data analysis by the Healthcare Commission on maternity outcomes, which was designed to support an investigation into alleged poor practice by an individual department, was used to identify and work with other organisations with apparently poor outcomes.


Managers, local politicians, board members and journalists can interpret and use the information for self-promotion (‘our hospital has succeeded in being awarded foundation status’, ‘our school is the most improving local school’) or to support arguments for more funding, and so on. Audits, inspections and scrutinies generate information that also feeds (party) political debate about national policies, services and outcomes. Some outputs automatically prompt a statutory response from another body, where the governance arrangements of that service stipulate the reporting and accountability lines. Some outputs prompt a response from parliamentary bodies; for example the Parliamentary and Health Service Ombudsman’s annual reports are regularly considered by the House of Commons Public Administration Select Committee (PASC). There is no logic to explain these different practices.

One question is: does negotiating and clearing findings first make change more likely? In local government the investigated and the investigators also have to work together on other business, through a continuing relationship, so collusion may become a greater risk. The use of clearance can also affect processes anterior to the clearance itself, if the investigated body anticipates the issues arising and exercises self-improvement or self-censorship. The Audit Commission has traditionally been seen as less willing to engage in clearance than the NAO but, recently, the Comprehensive Performance Assessment process has involved agreeing reports line by line with the local authority and its agencies. The Commission was not happy about this but now accepts that it is a more appropriate way to proceed. The results of investigations can always be challenged; the point is whether this is done privately beforehand through negotiation or publicly afterwards. Recent investigation and reporting processes build in recognition of challenge, by including periods for discussion of the facts and their interpretation. Self-assessment will only be challenged if evidence pointing in the other direction arises. Investigations avoid producing a damning report, in order to drive improvement. Even so, there may be high-level challenges to publicly announced CPA scores, and other resistance to these types of ratings.

The dynamics of communicating outputs

If the outputs that originate with the investigator are termed ‘primary’, it is also possible to identify ‘secondary’ and subsequent outputs of an investigation. For example, the NAO’s reports are considered by the Committee of Public Accounts (PAC), which produces reports of its own. The government in turn responds to specific reports from the PAC, usually in the form of published statements. Organisations that are the subject of an NAO investigation may also publish responses to the NAO report. In this way the NAO report can be seen as the beginning, as well as the end, of a process. An audit, inspection or scrutiny itself brings issues forward for the organisation to deal with, and encourages the organisation to tackle them as a priority, especially if the output receives much public attention. This may distort or support the organisation’s own existing or previous priorities, and it may sometimes be the case that recommendations in the report do not make clear who has to do what differently, particularly where services are delivered through partnerships between organisations. On the other hand, collaboration with the investigator may enable the investigated body to make more progress in implementing the findings of the report by tapping into the investigator’s ideas and support for change. Appropriate communication processes are essential if outputs are to have the desired influence on the target audiences.
Insofar as the content of outputs enters the public domain, confidentiality cannot be protected and behind-the-scenes collusion may be exposed.


The power of the popular media to draw everyone’s attention to particular subjects – the general public’s as well as that of those with a special interest – means that journalists and editors may sometimes hold the key to the pace and extent of change and improvement. Popular media interest in a subject, if a paper or programme decides to run with it, can cause a sudden huge demand for spokespeople, interpretations, arguments and rebuttals, making waves that may go well beyond the intentions of the audit, inspection or scrutiny. In some cases the investigator relies on, and facilitates, just this boost to the profile of its report. For the popular media to create a ‘juicy story’ from a relatively dry official report, journalists look for vignettes and punchy statistics that can be concisely and powerfully put into plain words. This in turn can affect perceptions and emphasis in managerial or professional debates about that service in the light of the investigation. The dynamics of communication become powerful factors feeding the spiral of content, presentation and engagement on that subject.

Requirements for follow-up are a further influence on the communication of outputs. For example, local authority scrutiny committees are following up reports after 6 and 12 months, but it remains to be seen whether this is an effective approach: the context and questions change, and inspectors find it more stimulating to start afresh. Ofsted inspections require the Local Education Authority and schools to produce action plans that will be revisited at the next inspection.

A political dynamic is also in play: by closely reading the government’s policy agenda, then choosing topics for investigation and framing reports of the investigations skilfully within that context, audits, inspections and scrutinies may produce outputs which have considerable impact while also complying with the frame itself. As ‘instruments of the coercive state’, if investigators comply with the government’s frame (because they find it politically expedient to do so) they may forgo a degree of independence, but in doing so they may hold the government and the provision of public services more effectively to account on behalf of the wider public interest.

In addition to the learning that service providers are being urged or required to adopt through the processes of audit, inspection and scrutiny, the investigators themselves can use the processes to learn. Audits, inspections and scrutinies are a form of hypothesis-testing about a service, its inputs, outputs, outcomes and the causal or other relationships between these. Follow-up permits this aspect of learning to develop. The inspection system is generally insufficiently challenged by those it inspects, so it tends to learn only rather slowly.

Aggregation, continuity and shelf life

The purposes, target audiences and impacts of audits, inspections and scrutinies are also revealed by an analysis of the use of aggregation and continuing study, and through an understanding of the perceived and actual shelf life of outputs. Clearly, by looking for connections between evidence series and trends over time, more telling analyses and prescriptions can be devised.


If the recommendations made in the report of an investigation, and progress on their implementation, are revisited, there is more chance of understanding ‘what works’ in improving that service or policy, but also of grasping what it is that makes the investigation itself more influential. Where periodic and regular investigations are conducted on a consistent basis, the pool of accumulated evidence creates opportunities for deeper interpretation. This can also set up the expectation that change will have been made by the time of the next inspection of that organisation or service. Where investigations are not periodic or consistent, outputs risk becoming time-limited, ephemeral and possibly undervalued. Longitudinal analysis permits investigators and managers to get to grips with trends and to entrench accountability; snapshot investigations cannot generate that sort of evidence. Some outputs, for example the annual reports of various Chief Inspectors, may be presented as ‘state of the nation’ summaries of the whole of that service, aggregating interpretations based on the inspections carried out since the last report, which may be only a small sample of the total. These reports are principally directed at national policy makers and the public, not at the managers of a particular service.

Aggregation raises the question of the transparency of the values used to weight evidence. In the private sector it is common to use multiple scores to evaluate services and to state the weights assigned to the elements. In the public services, where a single aggregate score is the focus and is not based on stated values for weighting the components, the evidence bases for judgement are obscured. Where a named Chief Inspector writes a report, the values can be located in that individual, but where a technical evaluation is done, the aggregation hides the values being used. Moreover, aggregation does not establish causality from the evidence.

CPA inspections of local authorities revisit the previous scores but not the previous arguments that were made about them. This is because there may be new service managers, and also because some question whether revisiting previous topics will generate effective learning. Local scrutiny activity is a relatively young form of investigation for accountability, and it will take time to build organisational memory. However, what is now more apparent is the pressure on audit, inspection and scrutiny to increase their evidence bases, a factor that is producing diminishing returns. Disparate and incompatible topics are aggregated and clustered to give scores that are no longer meaningful; for example, hospital-acquired infection rates and the coffee shop are included in the same cluster when constructing hospital ‘star ratings’. The 2 and 3 star ratings do drive managerial agendas, and the ‘dashboard’ approach does identify core basics, developmental issues and finance, but the all-important infection rates are hidden inside these.

Academics may assume that investigators are building up long-term insights based on the accumulated evidence, whereas investigators themselves assume that change means the context is new and needs to be looked at afresh. Systematic collection of evidence in time series is less valued than it used to be. This raises again the question of whether audiences know, and are satisfied with, the evidence base on which the arguments in an output rest. The proliferation of policy and policy analysis, and the fact that significant amounts of policy-relevant evidence are already not being considered, mean that the shelf life of outputs may be rather short, notwithstanding the investment of time, money and brainpower in creating them.
Audit, inspection and scrutiny should show what different stakeholders think is taking place – there is no single correct view of what a service should be providing and achieving – and the published outputs of investigations should enable those stakeholders to exercise informed judgement.


Why, if audit, inspection and scrutiny activity is increasing, are the benefits not more apparent? Is it because the values of investigators do not cohere with the values of the stakeholders? That would point to the need to construct a value base for audit, inspection and scrutiny that connects more faithfully with stakeholders’ perspectives. It may be that the apparent unease among investigators is a sign of too weak a fit between the public interest and the investigators’ assumptions about that public interest. These matters are likely to be explored further in the final paper in this series.


6. Background paper for seminar on 12-13 September 2005, at Madingley Hall, Cambridge

Introduction

This paper provides some background to inform discussions at the seminar convened for 12-13 September 2005, which completed the series of six that began in October 2004. The previous seminars brought together a core group of 14 practitioners and researchers to consider the uses of evidence in the audit, inspection and scrutiny functions of modern government in the UK. For the final seminar, participants were invited to contribute their expertise and thinking, under the Chatham House rule, to take the analysis further and consider what ideas and next steps might be identified for improving the use of evidence for accountability.

Rationale for the seminar series

At the outset we articulated some observations to provide a backdrop for consideration of the nature and uses of evidence for accountability in the UK. These are summarised here, followed by a summary of the ideas we produced at the previous seminars.

That government and public agencies should be publicly accountable for their activities is not controversial. However, whether the information and methods used to hold them to account are effective, intelligent and proportionate is contested. Uses of evidence for audit, inspection and scrutiny processes are evolving, as government actions and the public services themselves change, and as ways of demonstrating and delivering accountability become more complex. We wanted to examine these matters in greater depth, to contribute to practice and research through dialogue between those two worlds. We are not aware that such an enquiry, with this specific focus on evidence, has been attempted before. We agreed to concentrate on audit, inspection and scrutiny as means of accountability, and to omit regulation from our work.

Examples:

• Audit – National Audit Office; Audit Commission
• Inspection – Healthcare Commission; Ofsted
• Scrutiny – Parliamentary Select Committees; Local government scrutiny committees

Audit, inspection and scrutiny seek to contribute to future service improvement as well as to report on past performance, in an environment where government and the public services are now characterised by policy networks, power dependencies, and complex relationships between the authority structures within and across states. Features of this context include an altered relationship between the public sector and civil society, with privatisation, new partnerships, and new forms of regulation. Outcomes are frequently determined by specific institutional settings rather than by the general rules of government. The last ten years have seen the creation of more responsive, fragmented agencies held together by market-like incentives and light-touch regulation, together with the pursuit of joined-up government, a focus on delivery, the denial of ideology and the rise of ‘evidence-based’ policy.


Accountability investigators have therefore been examining not only the delivery of services but also the effectiveness of programmes designed to improve that delivery. Yet the rapidly rising direct and indirect costs of external inspection are giving rise to concerns about the proper scope and best methods for ensuring accountability.

Summary of earlier discussions

Initial scoping of the broad territory of audit, inspection and scrutiny enabled us to identify the main topics and questions for consideration:

• The nature and types of evidence for audit, inspection and scrutiny
• The methods for organising and collecting evidence in audit, inspection and scrutiny
• Synthesising the evidence
• Outputs of audit, inspection and scrutiny

The nature of evidence

What different types of evidence are collected and not collected by audit, inspection and scrutiny, and why? The reasons may be explicit, or they may be traditional, implicit or unexplained.

Types of evidence

There is enormous variety in the types of evidence collected and used. It varies, for example, in medium, physical format, qualitative or quantitative character, type of knowledge (e.g. explicit/tacit, subjective/objective), scope (e.g. comprehensive/sample) and origin (e.g. collected routinely for other purposes or specifically for the investigation at a fixed point in time).

Conceptual frameworks

Investigators' conceptual frameworks shape the types, properties and uses of evidence that they consider to be important. Any of these frameworks may be explicit or implicit. Some examples of concepts are:

• Theory testing: regards questions posed in audit, inspection and scrutiny as hypotheses, which can be confirmed or refuted by evidence
• Boundary: to understand a system and what is internal and external to it
• Authority and responsibility: to understand the actions of stakeholders
• Good performance: to establish what counts as good performance and how to measure it

Criteria for judging evidence

Different users of evidence have different criteria, overt or otherwise, for judging the fitness of evidence; these criteria influence both the choice of types of evidence and the approach to processing it.

How evidence is used

Assessments that claim to be evidence-based, or the actions that follow them (e.g. new or changed policies), have different levels of justification for such a claim. Different institutions work to different levels of reliability in their analysis of evidence.


Users of evidence

The ‘stakeholders’ involved in the use of evidence include:

• The producers of a service to which the evidence relates, e.g. commissioners, managers, providers
• The producers of evidence about a service, e.g. those who originate and collect the evidence
• Those involved in the route of evidence from its source – the intermediaries, e.g. researchers and analysts who interpret evidence
• Those beyond the immediate stakeholders who might have an interest, e.g. other organisations of the same type. Evidence and reports from some audit, inspection and scrutiny functions may be used by others with similar functions.

Methods for collecting evidence

What are the different approaches to collecting and organising evidence in audit, inspection and scrutiny, and what options, assumptions and decisions underlie these approaches?

Tools and techniques

The range of tools and techniques has expanded during the past two decades or so, including the greater use of techniques for collecting qualitative evidence. The management information that organisations are now expected to collect for their own use is also used by investigators, as part of a shift to the greater use of existing data in order to reduce the burden of data collection. The need to capture the point of view of service users is also increasingly important in the choice of methods. There is a growing interest in the ‘co-production’ – with those responsible for the service being assessed – of solutions or plans for improvement. This is intended to increase their commitment to the proposed solutions, and it demands more deliberative and consultative techniques. Self-assessment is another means of increasing commitment to implementing proposed changes among those responsible for the service being assessed. The increasing possibility of appeal or judicial review means that investigators pay attention to showing not only that their judgements are based on evidence but also that they have a valid underlying framework for the collection of that evidence.

Skills

Investigators’ skills will influence their choice of tools and techniques, both for their own use and for commissioning from external experts. They may over- or under-estimate their own competence in different methods, and a lack of familiarity may cause them to avoid particular methods that might be appropriate when used by skilled people. Professional investigators will tend to favour methods that they think appropriate for the standard of analysis and reliability to which they aspire. It is their job to mediate the data for politicians, whose skills make them more receptive to, and confident in using, anecdotes and narratives.


Culture and philosophy

Each investigating organisation has its own culture and philosophy, which is a product of its history, remit, chosen style of investigation and political context. These shift in response to the behaviour of its investigated bodies and as new thinking emerges about the goals of audit, inspection and scrutiny. There is a discernible shift from a ‘hierarchical’ regime, which specifies the approach to be used at the outset and applies it uniformly, to a more project-based approach, where more discretion is allowed in the selection of evidence and methods. This shift is driven in part by the growing emphasis on designing services to respond to particular needs. If service design varies, so must the methods for collecting evidence.

Framing

The underlying hypothesis or theory for the investigation frames the questions it poses and thus influences the choice of methods. However, this choice may be based on assumptions (of what causes good and bad performance, for example) rather than on an explicit analysis. Some attempts are made to minimise the effects of inaccurate assumptions or even prejudice. For example, ‘realistic evaluation’ requires people to be more explicit about their assumptions about why and how things happen.

Practical considerations

Practical considerations – time and budget – also determine choice of methods. The choice is a trade-off between the desirable volume and type of data and the resources available, including the burden of work on those who have to collect the data. Of the three criteria used to assess the evidence base of research studies – relevance, reliability and sufficiency – the last is the most difficult for the investigator to apply.


Inspection: the Healthcare Commission and evidence

The Healthcare Commission, like all inspectorates, has to deal with the tension between ensuring consistent and robust judgements and not burdening the inspected organisations with a heavier regime than necessary, which distracts from, rather than enhances, service improvement. Our solution depends upon pre-defining, as far as is possible, the specific evidence we seek to judge any given issue. In short, we are concerned not only with the robustness of evidence (sufficiency and accuracy) but also with whether the evidence we seek can clearly be demonstrated to be valid for judging performance (validity, discrimination and efficiency). The first two of the Healthcare Commission’s new methods of assessment to be implemented in the NHS – assessment of compliance with core standards and improvement reviews – both follow this principle. In core standards assessment, specific inspection guides for each standard specify the evidence sought for judgement and link it to the national guidance and policy to which the standard refers. For improvement reviews, a substantial period of development is included, which leads to the production of a very focused assessment framework that defines the required evidence base. This parsimonious approach to evidence requires two preconditions to work safely and effectively. First, it requires heavy analytic input and substantial time investment before inspection takes place, to research what evidence is required to make a judgement. Second, the tight focusing of evidence in planned inspection activity means that a method is required for dealing with information that emerges opportunistically about subjects that lie outside the scope of planned activity, but which points to inadequate quality of care or risks to patient safety.

Richard Hamblin, Healthcare Commission

Closeness to the data

Some investigations rely principally on secondary data; others use methods that involve direct contact with stakeholders. Where there is direct contact, the investigator may try to be as unobtrusive as possible in order not to influence behaviour. Or s/he may use consultative and deliberative methods that engage stakeholders, particularly if the aim is the co-production of a solution. Site visits and other forms of direct contact are widely valued by investigators for bringing a subject to life and helping them to understand the reality of a situation. The challenge then is to ensure that the visitors’ interpretation of what they see is based on a complete understanding of the situation and that visitors do not allow this experience to carry undue weight in a much larger investigation.

Quality assurance

More is at stake as a result of audit, inspection and scrutiny than used to be the case. Public expectations of public services are higher and people can lose their jobs as a result of reports. However, it is possible for different investigators, responsible for the investigation of different units in a system, using standardised methods, to interpret similar information differently. And where teams have the discretion to choose their own methods, there is more potential for the quality of the investigation to be challenged. Despite this, investigators are now more prepared to use evidence from other organisations’ investigations or research; previously they were more reluctant to use these outside sources because they could not guarantee their quality.

Synthesis of evidence

How is evidence processed in order to construct arguments, draw conclusions and communicate messages to the audiences? If this process of drawing meaning from the evidence is not transparent, the conclusions may be mistrusted and/or challenged. Clarification is needed on the use of the terms ‘synthesis’, ‘interpretation’ and ‘analysis’. Audit, inspection and scrutiny may have different expectations in relation to rigour and transparency, and these may differ from those of academic researchers. However, all investigators have to be confident that their evidence and its synthesis support the conclusions and recommendations that result.

Processing the evidence

The process of synthesising the evidence is not a neutral one; it is influenced by the investigator’s underlying theory or hypothesis. In some cases, scrutiny involves little synthesis, with the output consisting mainly of a report of the process of ‘holding to account’ that was the main purpose of the scrutiny. Audit and inspection have been moving to the use of more types of evidence, making the process of synthesis more complex.

Creating order in the evidence

Different methods of collecting evidence lend themselves to different methods of analysis, which carry different assumptions about being ‘faithful to the data’. For example, Comprehensive Performance Assessment sets out to produce an objective and rigorous assessment of performance, while parliamentary and local government scrutinies base their judgements and arguments on the evidence received from witnesses. Where investigations are intended to improve or change policy, they may use an early analysis of initial evidence to guide decisions about what further material to collect. Analysis may have to turn up some very strong patterns in order to overturn the investigator’s initial framing or hypothesis. This may limit the investigator’s receptivity to unexpected or contradictory findings.

Political considerations

As audit, inspection and scrutiny take place in a political environment, investigators plan their work to have the biggest impact. They may ‘self-censor’ in order to put the emphasis where they think there is political scope to prompt some practical change. Such political imperatives may apply both to their choice of topics for investigation and to the presentation of the insights that emerge from the synthesis of evidence. Thus, conclusions are shaped by the reporting stage as much as by the evidence-collecting mode adopted earlier in the investigation. The claimed independence of audit, inspection and scrutiny is in fact heavily influenced by political realities and the implicit assumptions of the investigators.

Scrutiny: the local government scrutiny function

The local government overview and scrutiny function, introduced in 2000 to modernise the political arrangements in local authorities by ensuring a clear executive-scrutiny split, has put local councillors at the sharp end of public scrutiny. Mirroring the Westminster model of select committees, councillors, supported by professional scrutiny officers, are developing appropriate ways of gathering, assimilating, analysing and reporting on evidence to ensure that the executive of the council (and other executive bodies with responsibility for service delivery locally) are held to account. Much of their work, aligned to the local authority policy development cycle and the forward plan of the cabinet, is carried out through ‘scrutiny reviews’ in which evidence is sought from local people with direct experience of local services, from those who commission and provide services, and from professional officers of the council. A significant consideration is the extent to which the voice and concerns of the public are sought and reflected in the recommendations made. A range of methods for data gathering is used, including public meetings, formal hearings and surveys. A feature of the new system is the move towards more informal meetings and a conversational style, intended to invite and encourage direct public participation. It is nonetheless recognised that councillors need the skills to question witnesses effectively and to interrogate the data. Given their ‘lay’ status, it is a requirement that evidence is presented to councillors in easy-to-assimilate formats related to key questions, to enable robust challenge. Scrutiny committees are required to operate cross-party and to reach consensus on reports and recommendations based on the evidence presented; this is usually achieved. Experience to date suggests that most recommendations coming from scrutiny are taken up by the executive.

Jane Martin, Centre for Public Scrutiny

Trends in reliance on evidence

Audit, inspection and scrutiny are all becoming more reliant on evidence, partly in order to ensure that their reports are defensible. This brings a requirement for more staff, more varied expertise and more commissioned work by outside experts. However, not all this activity is well resourced (local government scrutiny, for example) and it is not clear that investigators are making much change to the skills and backgrounds they seek to recruit. Research skills need to be supplemented by other capabilities where employers require the ‘co-production’ of analysis and a knowledge of practical issues in the sector to be investigated.

Outputs

What are the outputs of audit, inspection and scrutiny, and how are they used to disseminate results? There are physical outputs, in a range of media, and processes of communication and dialogue.

Formats

Investigators produce a greater range of outputs than they used to. An evidential approach still requires a full report, which presents the evidence and analysis, but this may be mainly ‘for the record’ and be read by relatively few people. Other formats are used increasingly, including leaflets and summaries, to ‘enliven’ the messages and to reach a wider audience. Investigators may issue press releases to encourage media coverage. Some investigators use metrics to summarise their findings. For example, the Healthcare Commission awards star ratings to NHS trusts (although this system is to be replaced in 2006) and the Audit Commission categorises local authorities as ‘poor, weak, fair or excellent’ as part of Comprehensive Performance Assessment. There is also a noticeable trend towards the greater use of anecdotes to illustrate messages.

Multiple purposes

Overall, the objective of investigators is to make a discernible difference to the organisation, activity or policy being investigated, which should result in some benefit for stakeholders. Within this, there is a spectrum of objectives from compliance with minimum standards to bringing about best practice and improvement. Investigators may also pursue other interests through their outputs. They may wish, for example, to demonstrate that the system of audit, inspection and scrutiny is working, to demonstrate their skills and to strengthen their reputation.

Audit: Helping consumers benefit from competition in the telecommunications market (National Audit Office)

This 2003 NAO report on Oftel followed a 1998 report and is part of a larger sequence of reports covering regulation. In some respects it exemplifies the growth of social science techniques in performance audit. It was not explicitly designed around a hypothesis. But one can discern an implicit framework of economics, based around notions of consumer choice, with transparent information key to a well-functioning market. The multidisciplinary team – accountant, investment analyst and consumer specialist – generated consumer insights through stated preference techniques. We compared these to market information (telecoms tariffs). This willingness to analyse consumer drivers was something the NAO might not have attempted in the 1990s. And we dealt well with shifting telecoms tariffs – they could have inhibited us from making strong assertions, but instead became part of the consumer confusion story. We did not, however, use focus groups or case studies and placed reliance on file review for some areas. And there was no ‘co-production’: the model was one of an external, independent observer appraising organisational practice. The intended impacts were to challenge regulatory practice and to identify savings for consumers. The report was the beginning, not the end, of a process to achieve these impacts. We followed it by issuing 35,000 consumer advice leaflets through Citizens Advice Bureaux and with a Public Accounts Committee hearing. Looking back, it seems quite conservative. We would now, I hope, be more adventurous on behavioural issues – what drives consumer, company and regulator behaviour – because we now see regulation as a constantly shifting dance of consumers, markets and rules.

Ed Humpherson, National Audit Office

The relationship to government may also influence outputs. Audit, inspection and scrutiny must be seen to be independent if they are to be credible. However, not criticising government overtly may be a way of getting government to listen to recommendations for change.


Audiences

The information requirements of different audiences (service managers and users, for example) vary. However, directing an output at one audience can influence another. For example, ‘naming and shaming’ poor performers to the public can stimulate services to improve.

Impact

The impact of outputs is related to their content and presentation, both of which are controlled by the investigator. It is also influenced by the extent of engagement with the audience, which may reflect, for example, the reputation of the investigator or the history of the topic. Some relationships require the investigated body to respond and then to report back on action taken; others may choose to do so. In some investigations, outputs are checked and/or cleared with the investigated body before publication. This reduces the risk of inaccuracies that would undermine the report’s credibility and impact, and may increase the chance of action being taken to improve. Indeed, investigators may avoid producing a damning report, to give the best chance of driving improvement. This checking process may also bring the risk of collusion, especially if parties have to continue to work together.

Dynamics of communicating outputs

Outputs may encourage the receiving organisation to change its priorities, for better or worse. By drawing attention to a particular investigation, the popular media can influence the pace and extent of change. Investigators may deliberately seek such a boost to the profile of their report. Following up an investigation after a period of time would allow investigators to learn more about the hypotheses they were testing, about ‘what works’ in bringing about improvement and about how their outputs can best have impact. However, this happens infrequently, and as the system of audit, inspection and scrutiny is not often challenged rigorously by those it investigates, it learns rather slowly.

Aggregation, continuity and shelf-life

Trends in evidence over time can produce new insights. Periodic investigations also set up an expectation that change will have been made by the time of the next investigation. However, audit, inspection and scrutiny practitioners tend to assume that the context will have changed since the time of any previous investigation and that therefore a longitudinal investigation will not be appropriate. This may mean that outputs have a rather short shelf life, despite the resources invested in their production. Some outputs are aggregations of summaries of investigations of many parts of a whole system. Such ‘state of the nation’ reports may be useful for the public and national policy makers, though not for service managers. They require the weighting of different components, and the reasons for this weighting are not always clear. Indeed, aggregated scores for a single service or organisation can cluster together different components of evidence in ways that are not meaningful.


Summary of issues raised by Seminars 1-5

Seminars 1-5 (2-5 reported here) were concerned with the broad question: to what extent do the types and quality of evidence and investigation by audit, inspection and scrutiny improve accountability and support improvements in services? The discussions highlighted a number of specific issues. These issues are set out here to provide background to the seminar on 12-13 September. They are far from an exhaustive list of issues or problems related to the use of evidence for audit, inspection and scrutiny. Nor are they a basis for a comprehensive set of recommendations. We hope that they will stimulate discussion, new thinking and the development of a more complete picture and ideas for next steps. Issues raised so far include:

(a) Is there a need to make more explicit the conceptual frameworks, values and assumptions that underlie the approaches that audit, inspection and scrutiny take to their work?

(b) Where these do not fit well with those of stakeholders, would it be possible to develop a consensus, which would enhance the effectiveness of using evidence for accountability and service improvement?

(c) What more could be done to share evidence effectively between investigators and between investigations and other functions?

(d) What are the implications of self-assessment and co-production of plans for improvement for the skills required of investigators?

(e) Do current approaches to quality assurance balance the diversity of approaches to investigation with the need for rigour and credibility?

(f) Audit, inspection and scrutiny take place within a political environment, which influences the choice of topics for investigation and the presentation of findings. Do investigators need to do more to balance the need for independence with the need to have an impact on current policies and activities?

(g) How can investigators learn more effectively about how they can have most impact on both accountability and service improvement?


Appendix

Chair: William Solesbury, Senior Visiting Research Fellow, ESRC UK Centre for Evidence Based Policy and Practice, Queen Mary University of London2.

PROGRAMME

Monday 12 September 2005

6.30 pm   Arrive; drinks
7.30 pm   Dinner
8.30 pm   Welcome (Tom Ling)
8.35 pm   What are the key issues concerning evidence for audit, inspection and scrutiny? Guest speaker: Baroness Onora O’Neill, Principal of Newnham College, Cambridge, and President of the British Academy
9.15 pm   Discussion, opened by Steve Martin
10.00 pm  Close

Tuesday 13 September 2005

9.00 am   Stakeholder perspectives on evidence for accountability through audit, inspection and scrutiny: taxpayers; service users; service delivery professionals; service delivery managers; Parliament; future generations. Six groups: the purpose of this first short session is to identify the full range of issues and concerns that participants may want to raise
9.30 am   Key issues of evidence for audit, inspection and scrutiny (see Background Paper para 45). Three short statements: Graham Smith (Audit), Amanda Edwards (Inspection), Anne Campbell (Scrutiny)
9.45 am   Defining the key problems and identifying potential solutions (20 minute break at c.10.30 am). Four groups: 1. Michael Power + Ed Humpherson; 2. Bob Black + Judy Renshaw; 3. Trish Longdon + Jane Martin; 4. Jeremy Lonsdale + Richard Hamblin
11.45 am  Feedback (Tom Ling)
12.30 pm  Lunch
1.30 pm   Main issues from the seminar (Tony Bovaird)
1.45 pm   How could the main issues be taken forward? Four groups: 1. David Prince + Jane Steele; 2. Jim Gallagher + Bill Solesbury; 3. Colin Talbot + Ruth Levitt; 4. Peter van der Knaap + Tom Ling
2.40 pm   Discussion
3.30 pm   Tea and close

2 The ESRC UK Centre for Evidence Based Policy and Practice moved to the School of Social Science and Public Policy, King’s College London, Strand, London WC2R 2LS on 1 November 2005.


Participants

Black, Robert: Auditor General, Audit Scotland
Bovaird, Prof Tony: University of the West of England
Campbell, Anne: Former MP and member of the House of Commons Public Administration Select Committee
Davis, Howard: University of Warwick
Edwards, Amanda: Social Care Institute for Excellence
Flinders, Dr Matthew: University of Sheffield
Gallagher, Prof Jim: Scottish Executive/Glasgow University
Guy, David: Economic and Social Research Council
Hamblin, Richard: Healthcare Commission
Humpherson, Ed: National Audit Office
Knaap, Dr Peter van der: Netherlands Court of Audit
Levitt, Dr Ruth: ESRC UK Centre for Evidence Based Policy and Practice, QMUL
Ling, Prof Tom: Anglia Polytechnic University
Longdon, Trish: Parliamentary and Health Service Ombudsman’s Office
Lonsdale, Dr Jeremy: National Audit Office
Martin, Prof Steve: University of Cardiff
Martin, Dr Jane: Centre for Public Scrutiny
Nutley, Prof Sandra: St Andrew’s University
Park, Dr Alison: National Centre for Social Research
Power, Prof Michael: London School of Economics and Political Science
Prince, David: Standards Board for England
Renshaw, Dr Judy: formerly Audit Commission
Smith, Graham: Audit Commission
Solesbury, William: ESRC UK Centre for Evidence Based Policy and Practice, QMUL
Steele, Jane: Office for Public Management
Sullivan, Prof Helen: University of the West of England
Sutherland, Dr Kim: Judge Institute, University of Cambridge
Talbot, Prof Colin: University of Nottingham
Thompson, Dr Hilary: Office for Public Management
Waller, Dagmar: West Midlands Regional Assembly
Welham, Bryn: HM Treasury