
PNNL-23623

International Best Practices on Energy Data Management
Insights for an Indian Roadmap

2014

Sha Yu, Meredydd Evans, Volha Roshchanka, Bo Liu


Prepared for

the U.S. Department of Energy

under Contract DE-AC05-76RL01830

Pacific Northwest National Laboratory

Richland, Washington 99352


Acknowledgements

Under the leadership of the United States Agency for International Development and the

National Renewable Energy Laboratory, the Pacific Northwest National Laboratory is working with

partners in India to strengthen their energy data management system. Energy data management is

one of the focus areas of the Sustainable Growth Working Group under the U.S.-India Energy Dialogue.

The U.S.-India Energy Dialogue is the primary framework for cooperation on energy between the

Government of India and the Government of the United States.

The authors are grateful for research support provided by the United States Agency for

International Development, the United States Department of Energy and the National Renewable

Energy Laboratory. The authors also acknowledge the insights and support from several partners

from the Government of India (including the Planning Commission and the Central Statistics Office),

Prayas Energy Group and Nikit Abhyankar from the Lawrence Berkeley National Laboratory. The

Pacific Northwest National Laboratory is operated for the United States Department of Energy by

the Battelle Memorial Institute under contract DE-AC05-76RL01830. The views and opinions

expressed in this paper are those of the authors alone.


List of Acronyms

AGEB Arbeitsgemeinschaft Energiebilanzen/The Working Group on Energy Balances

APPA American Public Power Association

BAFA Bundesamt für Wirtschaft und Ausfuhrkontrolle/Federal Office of Economics and Export Control

BMWi Bundesministerium für Wirtschaft und Technologie/Federal Ministry of Economics and Technology

CBECS Commercial Buildings Energy Consumption Survey

CNSTAT Committee on National Statistics of the National Academy of Sciences

COPAFS Council of Professional Associations on Federal Statistics

DECC Department of Energy and Climate Change

Destatis Statistisches Bundesamt/Federal Statistical Office

DOE Department of Energy

DUKES Digest of UK Energy Statistics

EIA Energy Information Administration

EU European Union

FCSM Federal Committee on Statistical Methodology

FERC Federal Energy Regulatory Commission

GSBPM Generic Statistical Business Process Model

GSS Government Statistical Service

HRDD Human Resources Development Division

ICSP Interagency Council on Statistical Policy

NEB National Energy Board

NRC Nuclear Regulatory Commission

OECD Organisation for Economic Co-operation and Development

OMB Office of Management and Budget

ONS Office for National Statistics

SCOPE Statistical Community of Practice and Engagement

UNSC United Nations Statistical Commission


Table of Contents

I. Introduction: the Importance of Energy Data and Need for Analysis

II. Data Management Frameworks

III. Key Principles of Official Statistical Systems and Model Choice

IV. Best Practices on Data Coverage and Requirements

V. Prioritizing Data Reporting

VI. Communicating Data

VII. Take-away Messages

References

Appendix A. Overlap of Statistics Codes and Principles of the United States, European Union and United Nations Statistical Commission

Appendix B. UK’s Code of Practice for Official Statistics and Canada’s Dimensions of Information Quality

Appendix C. Staff Breakdown at the United States’ EIA

Appendix D. Generic Statistical Business Process Model (GSBPM) – a Tool to Manage and/or Evaluate Data Quality and Metadata


List of Tools and Practices Boxes

Box 1. Connecting with Users at Statistics Canada

Box 2. Collecting Feedback at the United States’ EIA

Box 3. EIA’s Mandate in United States Federal Law

Box 4. Data Confidentiality in the UK and Germany

Box 5. Multi-level Coordination at Germany’s Destatis

Box 6. Ensuring Long-term Quality through Continuous Training at Statistics Canada

Box 7. Data on Agricultural Energy Use in the United States and Canada

Box 8. Commercial Buildings Energy Consumption Survey

List of Figures

Figure 1. EIA's staffing plans and budget allocation for the Fiscal Year 2014

Figure 2. Common process for data validation

List of Tables

Table 1. Summary of the four studied data management systems and their characteristics

Table 2. Data coverage and collection methods

Table 3. Frequency of data dissemination by the end-use sector

Table 4. Frequency of data dissemination by energy source


I. Introduction: the Importance of Energy Data and Need for Analysis

Reliable energy statistics are the backbone of sound national energy policies. Energy statistics

are vital to helping governments ensure energy security, reduce price volatility and plan

investments. They are also essential in guiding private investments related to energy.

Countries have adopted various models to manage energy data, depending on the energy sector

characteristics, economic structure, country size, government type and other factors. Although no

one energy data management system fits all countries, several countries can offer best practices

and effective approaches to collecting, processing and sharing national energy data.

In this report, we examine energy data management models from four countries – the United

States, Canada, United Kingdom (UK) and Germany – to illustrate the diversity of models. These

countries are reputed to have reliable energy data, and thus, we are interested in describing

effective tools and practices used to overcome challenges of their chosen models. The second part

of this report examines different types of energy statistics, and how energy data are collected,

processed, analyzed and disseminated in these countries. We study energy statistics from multiple

dimensions, including their geographic coverage, the frequency of data collection and

dissemination and the time series of energy data. In addition, we compile a series of toolboxes that

provide specific examples of best practices on different topics.

II. Data Management Frameworks

Defining good quality data, whether on energy or other topics, is complex and multi-dimensional. Typically, aspects of data quality include accuracy (how well the data represent reality), completeness (the quantity of data and their coverage), timeliness (whether data are updated regularly and promptly), relevance (how well the data serve their purpose) and accessibility (whether users can access and understand the data). The public and data users must also perceive data as credible and reliable, and thus, it is important to maintain objectivity and independence from political and other influences. In addition, most statistical agencies have limited budgets and must balance trade-offs between the quantity and quality of data and financial and human resource constraints. Therefore, assessing the quality of energy data requires thinking in terms of energy data management systems, with components that need to be managed to be effective.
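As a purely illustrative sketch (not drawn from the report), two of the dimensions listed above, completeness and timeliness, can be expressed as simple automated checks; the record format, field names and 60-day lag window below are hypothetical.

```python
from datetime import date

# Hypothetical monthly energy records; not any agency's actual schema.
records = [
    {"region": "North", "month": "2014-01", "value": 120.5, "reported": date(2014, 2, 20)},
    {"region": "South", "month": "2014-01", "value": None,  "reported": date(2014, 5, 2)},
]

def quality_report(records, expected_regions, max_lag_days=60):
    """Score a batch of records on completeness (coverage of reporting units)
    and timeliness (share of records reported within the allowed lag)."""
    complete = [r for r in records if r["value"] is not None]
    regions_seen = {r["region"] for r in complete}
    completeness = len(regions_seen) / len(expected_regions)
    on_time = [r for r in complete
               if (r["reported"] - date.fromisoformat(r["month"] + "-01")).days <= max_lag_days]
    timeliness = len(on_time) / len(records)
    return {"completeness": completeness, "timeliness": timeliness}

print(quality_report(records, expected_regions={"North", "South"}))
# {'completeness': 0.5, 'timeliness': 0.5}
```

Accuracy, relevance and accessibility are harder to reduce to mechanical checks and typically rely on review processes and user feedback of the kind discussed in the sections that follow.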

To maintain the effectiveness of data management systems, more broadly, governments have developed principles, or codes of practice. These principles largely overlap (see Appendix A). The United Nations Statistical Commission (UNSC) spearheaded international efforts to harmonize statistical definitions and data quality standards,1 and several countries have adopted the UNSC principles, formally or informally. In Section III of this report, we group the most common principles of national statistics into five broader categories. In other cases, such as in the European Union (EU), member countries adopted common practice standards. For example, Germany’s set of practices is nearly identical to that of the EU’s Eurostat. Other countries, such as the United States, Canada and the UK, developed their own management frameworks (Appendices A and B).

1 UNSC’s committee on energy, also known as the Oslo Group, has developed a number of standardized definitions, classifications and principles for compiling energy statistics. In 2011, the Oslo Group produced the International Recommendations for Energy Statistics (IRES): https://unstats.un.org/unsd/statcom/doc11/BG-IRES.pdf. Currently, the group is developing a new Energy Statistics Compilers Manual (ESCM): http://oslogroup.org/index.asp?page=escmchapters.html. These documents offer broad, comprehensive guidelines on issues related to energy statistics, but are fairly limited in their illustration and analysis of energy data management systems.

Despite the similarity of criteria for effective data management systems, countries have adopted

diverse models for meeting them. This is best illustrated by contrasting the data management

systems of Canada, the United States, the UK and Germany. These systems can be loosely placed on

a spectrum from the most centralized (Canada) to the least centralized (Germany). The national

statistical systems are not always designed with energy statistics in mind. National energy data

producers often operate in a given model, determined by the political system, economic structure,

country size, government type and other factors. Thus, energy statistics professionals operate

within the constraints of their systems.

Canada’s national data production is very centralized. Energy data collection and production

occur at the Manufacturing and Energy Division of Statistics Canada, the central agency that has

the legal authority to oversee all data collection efforts. The agency employs about 6,000 staff and

had a budget of about $450 million in 2012-2013. Other agencies also collect and produce more

specialized information on energy, but their efforts are focused on analysis and synthesis, and the

data they collect are specifically tailored to assist a government function. For example, Natural

Resources Canada compiles energy efficiency data for transport, energy and buildings, and the National Energy Board (NEB) collects and produces data on infrastructure for, and imports and exports of, oil,

natural gas and electricity to support NEB’s regulation of energy companies. However, these

agencies typically comply with standards developed by Statistics Canada. Statistics Canada also

oversees data collection efforts at the provincial level (The Government of Canada 1985) but is

legally authorized to enter into data sharing agreements with provinces to collect or process data

locally or jointly, when it makes logical sense. For example, provinces produce all statistics on

health, education and justice. Some provinces also collect energy data when doing so is logistically easier or results from a local regulatory function, such as licensing of oil and gas production. Thus, Alberta’s Office of Statistics and Information collects and publishes its own

province-specific energy statistics, in coordination with Statistics Canada (Statistics Canada 2012).

The United States has a distinct agency for energy statistics: the Energy Information

Administration (EIA). EIA and 12 other federal statistical agencies form the Federal Statistical

System. The Office of Management and Budget (OMB) oversees the system and is in charge of

ensuring efficiency and compliance with federal regulations. EIA is responsible for collecting,

analyzing and disseminating energy information. Although EIA is a part of the United States (U.S.)

Department of Energy (DOE), it is an independent agency. EIA employs about 370 federal staff and hires over 250 contractor staff, at a cost of close to $40 million (Figure 1) (see Appendix C). To avoid

duplication and improve practices, EIA coordinates with other government and private

organizations, such as American Public Power Association (APPA), Nuclear Regulatory Commission

(NRC), Federal Energy Regulatory Commission (FERC), Department of Commerce and others. These

agencies typically collect energy information that is necessary for or is a result of their government

function. Depending on confidentiality laws and other factors, such agencies may or may not share data.


For example, DOE’s Office of Natural Gas Regulatory Activities is responsible for authorizing natural

gas imports and exports and, therefore, collects information from all natural gas exporting/

importing companies and then shares it with EIA. On the other hand, FERC separately collects

electricity data that enable the agency to regulate the electricity market. EIA produces the majority

of official energy data and analysis, aiming to support policymaking, business decisions and public

understanding of the sector, but not enforcement and other regulatory activities that might affect a

specific entity. Notably, some states collect other energy data that are important for local policies; however, our scope includes only national statistics, and, therefore, we do not describe state-level efforts that are not part of national statistics.

Figure 1. EIA's staffing plans and budget allocation for the Fiscal Year 2014

Note: FTE stands for full-time equivalent, i.e., work hours equivalent to those of one full-time employee.

The United Kingdom has assigned energy data production to a specialized government

department: the Department of Energy and Climate Change (DECC). DECC statistics staff report to


the DECC Head of Profession for Statistics, who is accountable to a central agency, the UK Statistics

Authority. The UK Statistics Authority oversees government statistics produced in the UK and

establishes codes of practice and standards that agencies must follow. The UK has elements of

regional decentralization due to the country’s political structure. To achieve coherence in national

statistics, the UK Government and the three devolved administrations, i.e. the Scottish Government,

the Welsh Assembly Government and the Northern Ireland Government, entered into an agreement

on producing official statistics known as the Inter Administration Working Agreement on Statistics.

In addition, these entities coordinate their responsibilities for collecting statistics, so that devolved

administrations do not collect data that UK Government departments collect. The executive office of

the UK Statistics Authority – Office for National Statistics (ONS) – is the largest producer of official

statistics, while specific departments produce statistics relevant to their national policy area, and

devolved administrations collect statistics important to local government policy. The various

government departments, ONS and the administrations of Scotland and Wales form the

Government Statistical Service (GSS), a virtual community for cooperation. (Notably, the Northern Ireland administration is not part of the GSS, although it works very closely with the Service.) The GSS

consists of about 7,000 civil servants who collect, produce and disseminate official statistics. Thus,

DECC is the lead body for energy statistics and policy (UK Office for National Statistics n.d.) and

produces these data nationally and for the devolved administrations. DECC’s statistical unit

employs about 30 statisticians, but other staff at DECC and other government agencies also support

energy data production (Government Statistical Service n.d.). About half of DECC’s statistical

division are statisticians, whereas the other half are administrative specialists and other energy

experts. Staff work in teams of 4-5 people for each fuel or subject area (MacLeay 2014). DECC also

cooperates with other agencies that work on regulatory issues and collect data that are relevant to

them. They include the UK Coal Authority, UK Atomic Energy Authority and the Office of Gas and

Electricity Markets. Northern Ireland, specifically, also collects energy data deemed important for

local policy; however, local data are outside of our scope.

Germany’s statistics system is the most decentralized of the four, and energy data are collected

and produced at the regional level. Responsibility for national statistics is divided between the

central government and 14 regional (Länder) governments2 (International Energy Agency 2013).

The Federal Statistical Office (Statistisches Bundesamt), or Destatis for short, is the central authority

responsible for the methodological and technical aspects of surveys, compilation and publication of

national results, and collecting and processing certain data (e.g. on foreign trade) (Lalor 2013). It

has about 2,500 employees, of whom 16 work on energy statistics. Regional offices are

responsible for the core collection and processing of statistical data, including on energy. Regional

energy data can be found at regional statistics sites but are also compiled federally at the Common

New Statistical Information System (GENESIS in German) (Federal Statistical Office and the Statistical Offices of the Länder n.d.). Aside from Destatis and regional governments, national data

are also produced by other agencies. The Working Group on Energy Balances (Arbeitsgemeinschaft

Energiebilanzen, AGEB) works through its regional offices and in close cooperation with Destatis,

industry associations and the Federal Ministry of Economics and Technology (Bundesministerium

für Wirtschaft und Technologie, BMWi) to produce national energy balance reports.

2 There are 16 regional governments, but some have merged their statistical offices.

AGEB consists


of representatives from the energy industry associations and energy research institutes (AGEB

2014). BMWi has delegated the production of Germany’s national energy balances to AGEB, and Destatis

supports AGEB financially. In addition, the Federal Office of Economics and Export Control

(Bundesamt für Wirtschaft und Ausfuhrkontrolle, BAFA) collects and maintains statistics on oil and

natural gas (such as imports and prices), while the Federal Ministry for the Environment, Nature

Conservation and Nuclear Safety produces data on renewable energy. Among the four countries

studied, Germany appears to produce the smallest number of datasets, which may indicate the

challenges of coordination and the importance of developing robust, explicit coordination

approaches when a system is decentralized.

Table 1. Summary of the four studied data management systems and their characteristics

Canada
  System description:
  - Central statistical agency (Statistics Canada) with mandate to produce all statistics
  - Data sharing agreements with other agencies and provinces
  Comparative characteristics:
  - Legal status, independence and mandate for data collection are more uniformly set and applied
  - Easier to maintain and check quality of data and methodologies as compared to data in other subject areas
  - Easier to assess and mobilize resources
  - Requires top management to put effort into reaching out to policymakers and users
  - Easier to coordinate and find synergies across subject areas

United States
  System description:
  - Federal statistical system consists of 13 theme-based agencies
  - Energy Information Administration is the primary agency on energy data
  - Independent, although within the Department of Energy
  Comparative characteristics:
  - Visibility and access to policymakers make it easier to stay relevant
  - Easier to maintain independence
  - Easier to maintain and check quality of data and methodologies for energy sectors

United Kingdom
  System description:
  - A division within a ministry (Department of Energy and Climate Change) collects and produces most statistics
  - Cooperation with other administrative divisions happens through an agreement
  Comparative characteristics:
  - Relevant to energy policymakers
  - Lack of access to wider, national data
  - Requires more coordination between subject-area agencies
  - Requires more effort to maintain independence and credibility

Germany
  System description:
  - Most of the data collection and processing occurs at 14 regional statistical offices
  - A central agency, Destatis, performs coordination and other functions that cannot be done by regional offices
  - A separate working group compiles energy balances
  Comparative characteristics:
  - Local presence and relevance
  - Requires significant coordination and expenses
  - Synchronization of laws on scientific integrity and confidentiality is necessary to facilitate data sharing


As these case studies demonstrate, statistical models can range from centralized to

decentralized, depending on how much of the responsibility for official statistics lies with a

central organization versus specialized government agencies. Decentralization can have many

dimensions, and functions can be distributed by themes (United States, UK), regions (Germany) or

sectors such as non-governmental and private bodies (Germany). The design of a statistical system

is often determined by the policy context but can have an impact on the ease of meeting a certain

principle.

III. Key Principles of Official Statistical Systems and Model Choice

Both centralized and decentralized systems exhibit inherent strengths and weaknesses. Some of

these strengths and weaknesses are in direct trade-off with each other (Edmunds 2005). For

example, centralized systems find it challenging to stay relevant to current policy issues but easier

to coordinate, which enhances the integrity and independence of the data. At the same time,

decentralized systems might be close to policy issues but are also more easily influenced by

political considerations; these systems require more concerted effort to coordinate with other

statistical agencies and need extra checks to ensure data comparability and quality. Data

professionals do not always have much choice in the kind of system they work in, as these are often

determined by country size, socio-political environment, historic conditions and other factors. At the

same time, a statistical model can impact a government’s strategy to adhere to certain principles

and codes of practice. On the ground, many tools, practices and design modifications can help

mitigate challenges of the chosen statistical model.

We examined four reputed statistical systems (above) to understand how their design affects elements of quality and effectiveness and what tools and practices are in place to minimize any disadvantages of the system. We chose to group the most common principles of national statistics (discussed in the previous section and available in Appendix A) into the following categories: (a) relevance to policymakers and users; (b) legal status and independence; (c) coordination; (d) implementation and adequacy of resources; (e) quality of data and methodologies.


Box 1. Connecting with Users at Statistics Canada

When the Chief Statistician of Canada Ivan Fellegi

was determined to overcome the challenge of being

irrelevant, he focused on developing bilateral

communications with interested ministries in the central

government. He set up meetings, during which senior

representatives from ministries had an opportunity to

voice their suggestions on improvements or new products

(Scheuren, Willig et al. 2012). This mechanism helped

Statistics Canada establish cost-recovered surveys, which

are custom surveys and analysis services for clients

(Canadian government and organizations), who pay for

associated costs. Cost-recovered surveys included energy

topics, and at one point 20 percent of Statistics Canada’s

budget was from cost-recovered surveys. Results of such

surveys become publicly available (Statistics Canada

2014). To address provincial needs, Statistics Canada holds

regular meetings with provincial statistical offices.

Statistics Canada also maintains relationships with the

business and academic communities.


Relevance to policymakers and users. Official statistics inform debate and decision-making by governments and communities, and, thus, staying relevant in an

accessible form is an important attribute. Centralized systems find it harder to be relevant to

policymakers because they might be remote from the policy debate or from local realities, whereas

decentralized statistical offices are typically nested within sector ministries or a certain region and

are more likely to be exposed and accessible to policymakers.

Germany’s system is decentralized regionally, and the majority of data collection and

processing is through regional statistics offices. Such regional offices are better aware of regional

peculiarities and specific data requirements. From a broader perspective, the U.S. and UK statistical

systems exhibit traits of systems decentralized by subject, but at the same time EIA and DECC,

respectively, serve the role of a central energy information producer. Being fairly central, these

agencies cannot meet all departments’ and administrative units’ needs. For this reason, other

agencies collect some information relevant to their regulatory functions at the national level and

sub-national governments collect additional energy information relevant locally. Canada has a

similar practice in place, where Statistics Canada devolved some functions and allowed provinces to

collect data jointly or entirely on sub-sectors, e.g. mining or oil production, which might be of

particular importance in certain provinces.

Other common techniques for staying relevant to data users are satisfaction surveys,

conferences and other forms of consultations (see Boxes 1 and 2).

Box 2. Collecting Feedback at the United States’ EIA

EIA attempts to ensure the utility of information through customer surveys, formal

solicitations of comments, interview and focus group studies, user conferences (e.g., EIA

sponsors the annual National Energy Modeling System/Annual Energy Outlook Conference) and

other outreach programs (EIA n.d.). In addition, EIA sponsors evaluation of its programs and

information products by the Committee on Energy Statistics of the American Statistical

Association. The committee consists primarily of mathematical statisticians, survey statisticians

and energy analysts. Regular evaluation occurs through semi-annual meetings that are open to

the public.

EIA consults other, non-federal statistical agencies, such as the Committee on National

Statistics of the National Academy of Sciences (CNSTAT) and the Council of Professional

Associations on Federal Statistics (COPAFS). CNSTAT conducts studies, workshops and training to ensure that statistical data represent reality and adequately inform public policy issues.

COPAFS gathers professional associations, businesses, research institutes and others interested

in issues impacting federal statistics. COPAFS holds regular meetings, attended by EIA, and,

reciprocally, participates in federal statistical meetings. COPAFS regularly reports on budgets of

individual statistical agencies and their plans for improvement (Spar 2011).


Legal status and independence. All statistical systems have a legal basis that defines the roles and responsibilities of players and provides a mandate for data collection. The legislative framework is also a major instrument to safeguard the independence from political and other undue influences, to ensure confidentiality of data providers and to encourage data sharing. One of the key aspects of the legal framework is the statistical law

that mandates data collection and compels response. In Canada, a single law, the Statistics Act, lays

down the main points and establishes a very broad mandate for Statistics Canada. Statistics Canada

also has the right to enter into data sharing agreements with other agencies and provinces. In fact,

the Manufacturing and Energy Division of Statistics Canada has a large number of data sharing

agreements on energy (10 to 15). The United States relies on several pieces of legislation to

establish its statistical system, but the law specifically calls out the establishment of EIA and its

mandate to collect energy data (see Box 3), which ensures that EIA has a strong legal backing and

legally established independence.

To safeguard independence, statistical agencies need to be at arm’s length from executive

offices. Both EIA3 in the United States and Statistics Canada are formally part of policymaking

ministries and departments but are not subordinated to them and do not require approval to

publish data. This is especially important given the role that EIA has in conducting analyses for the

U.S. Congress as well as for the Department of Energy.

Another important aspect of statistical agencies’ legal status is their ability to protect the

confidentiality of respondents. Statistical agencies are more likely to gain trust and receive

responses at a higher rate and with more accuracy if they can guarantee that data from companies

or individuals would not be used against them. For this reason, the U.S. law distinguishes between

data for “statistical purposes”, such as data used for “description, estimation or analysis of

characteristics of groups, without identifying the individuals or organizations that comprise such

groups,” and “nonstatistical purposes”, meaning data used for an “administrative, regulatory, law

enforcement, adjudication, or other purpose that affects the rights, privileges or benefits of a

particular identifiable respondent” (United States Government 2002). Confidentiality laws are very

strong in the United States, and EIA could not share data on a specific natural gas company if a

government agency would use it to enforce a law or develop a company-level regulatory action

(in contrast to legislation affecting an industry). For this reason, other regulatory agencies collect

their own data. Examples of such agencies include the Federal Energy Regulatory Commission (FERC), the North American Electric Reliability Corporation (NERC) and the Nuclear Regulatory Commission (NRC).

3 Note that this is not the case for some other principal statistical agencies; e.g., the Bureau of Transportation Statistics is solely under the Department of Transportation.

Box 3. EIA’s Mandate in United States Federal Law

The U.S. Congress established both DOE and EIA in

1977 through the Department of Energy Organization Act,

building upon systems and organizations first established

in 1974. The Federal Energy Administration Act of 1974

gave EIA the authority to gather data from energy

producers and consumers. Responding to EIA survey forms is mandatory, and respondents face a penalty of up to $5,000 for non-compliance.


Such agencies often act through their regional offices and collect information relevant to their regulatory function locally, but typically do not disseminate such data immediately. For example, regional offices at FERC collect data on transactions of natural gas and electricity companies for the purposes of overseeing the market. FERC centrally aggregates these data and disseminates them after several months, sometimes along with analysis and other reports. This means that EIA might duplicate data collection efforts. In fact, both EIA and FERC collect data such as fuel prices, but for different purposes.

To ensure independence and clarity in the status of various agencies, the German law allocates responsibility clearly between the central statistics office, Destatis, and the regional statistical offices. Destatis takes on tasks that may be reasonably performed only

by a central institution. Regional statistical offices are not subordinate to the federal government,

and thus, the Federation cannot exert any influence on the organization, staff or funds available to

these offices. However, each statistical survey requires authorization under federal and regional laws,

with very detailed methodologies, specifications and parameters that would ensure uniformity

(Kopsch 2002). The neutrality, scientific independence, and rules for statistical confidentiality for

both the federal and regional statistical offices are laid down in the federal statistics law. This

makes regional offices less susceptible to local pressures and is a good precondition for abiding by

statistics principles.

Whether centralized or decentralized, the systems share strong, enforceable statistics laws with clearly defined responsibilities. This helps ensure that agencies have the authority, legitimacy and credibility, and are more likely to receive funding and other resources necessary to perform

statistical tasks.


Box 4. Data Confidentiality in the UK and Germany

Confidentiality laws are still evolving in the UK, but

social data within GSS is protected by the Statistics and

Registration Service Act, which does not allow releasing

personal information that could potentially identify an

entity, such as an individual or a household. All producers

of official statistics must abide by the principles of National

Statistics. For example, DECC would have to ensure that

data released on building consumption by building type

and locality would not identify any household. Similarly, it

is not possible to give data on oil refineries by region

within the UK, because that would divulge business data of

the only refinery in Scotland.

Germany secured standards for data

confidentiality in the Law on Statistics for Federal

Purposes. Both federal and regional statistical offices must

abide by this law, which eases data sharing from a legal

perspective. Because agencies are bound by the same or

similar confidentiality and disclosure agreements,

coordination and sharing of data are easier.
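The confidentiality rule illustrated in Box 4, publishing a cell only when enough respondents contribute to it, is a standard threshold (minimum-frequency) rule in statistical disclosure control. The sketch below is a generic, hypothetical illustration of that rule rather than any agency's actual code; the data, field names and three-contributor minimum are invented.

```python
from collections import defaultdict

def suppress_small_cells(rows, group_keys, value_key, min_contributors=3):
    """Aggregate rows by group_keys and withhold any cell with fewer than
    min_contributors reporting units (a basic threshold rule for
    statistical disclosure control)."""
    cells = defaultdict(list)
    for row in rows:
        key = tuple(row[k] for k in group_keys)
        cells[key].append(row[value_key])
    return {key: (sum(values) if len(values) >= min_contributors else None)  # None = suppressed
            for key, values in cells.items()}

# Hypothetical refinery throughput data: with a single refinery in "Scotland",
# that regional cell is suppressed rather than published.
rows = [
    {"region": "Scotland", "throughput": 210.0},
    {"region": "England",  "throughput": 180.0},
    {"region": "England",  "throughput": 95.0},
    {"region": "England",  "throughput": 150.0},
]
print(suppress_small_cells(rows, ["region"], "throughput"))
# {('Scotland',): None, ('England',): 425.0}
```

In practice, agencies combine threshold rules with dominance rules and secondary suppression so that withheld values cannot be reconstructed from published totals.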


Coordination. Statistical systems require coordination among components to mobilize budgetary and staff resources, to exploit synergies, to take advantage of possible efficiencies, to ensure coherent output and to defend the system against potential interference (Fellegi 1996). Clearly, coordination is more challenging for decentralized agencies, but even in the most centralized systems, external agencies carry out some of the data collection and processing, and thus, coordination is required. Coordination can take place through laws or through committees and

other arrangements. In Canada, Statistics Canada and the Chief Statistician have control over the

budget plans, classification systems, reporting burden and personnel management to ensure

coordination. Thus, the Manufacturing and Energy Division of Statistics Canada can more

seamlessly obtain and share information with other departments and agencies. In the United States,

EIA has central responsibilities over energy statistics. On the one hand, EIA can easily coordinate

within the agency, but coordination with other agencies is more challenging. To assist with

coordination, OMB requires agencies to prepare detailed plans for surveys and oversees reporting

burden. OMB also chairs or sponsors the work of three interagency committees: the Interagency

Council on Statistical Policy (ICSP), the Federal Committee on Statistical Methodology (FCSM) and

the Statistical Community of Practice and Engagement (SCOPE) (United States Government

Accountability Office 2012). The committees work through subcommittees and working groups and

rely on volunteers from member agencies to take on extra responsibilities. In the UK, DECC

annually publishes statements on coordination with other agencies and potential synergies

(Department of Energy and Climate Change n.d.).

Being decentralized, Germany’s statistical system puts significant effort into coordination at all

stages of data production: planning, collection, processing and dissemination. Although the law

does not require that all statistical offices use the same methodology, comparable results in the

regions must be guaranteed. Destatis cannot compel a regional office to follow a certain procedure,

and, therefore, all actors must reach consensus; otherwise, the data will be downgraded to an

insufficient lowest common denominator. This process is very time-consuming and expensive

(Kopsch 2002). Destatis and regional statistical offices closely cooperate on dissemination of data.

Box 5. Multi-level Coordination at Germany’s Destatis

The central statistical office, Destatis, plays a key role in coordinating Germany’s

decentralized statistical system. Destatis’ major functions include obtaining uniform data for all

of Germany, methodological and technical preparation of federal statistics, ensuring that the

regional statistical offices produce data without overlaps, in accordance with uniform concepts

and in a timely manner. Destatis is also responsible for international cooperation with the

European Union, the United Nations, International Energy Agency and other statistical

institutions abroad.

German agencies pursue coordination at various levels to address policy, management and

work-level issues. The heads of Destatis and the regional statistical offices meet at special

conferences three times a year. In 2010, this conference appointed several committees to

address specific issues, creating a link to experts on specific subject areas. Experts from Destatis

and the regional offices also meet frequently to discuss and decide on common methodological

and technical questions on specific subject matters. About 40 formal expert meetings exist, with

the main goal of standardizing workflows and IT infrastructure. In addition, the Inter-Ministerial

Committee for Coordination and Rationalization of Statistics coordinates statistics between

federal ministries, the Federal Audit Office and Destatis (Kopsch 2002).


The offices develop a joint marketing plan with common principles of dissemination, pricing and

licensing policy. Federal data are disseminated through the federal Statistics Portal4, whereas

regional data are provided on individual regional internet sites and also on the Regional Database

Germany portal5. Germany’s heavy emphasis on coordination allows the decentralized statistical

system to take advantage of the regional offices, which are closer to the data source. At the same time, the coordination helps produce nationally coherent statistics. Such a system, however, comes with high costs and lower efficiency, and requires a culture of close coordination between

agencies.

Implementation and adequacy of resources. To ensure that resources are adequate, statistical

systems need to build in procedures for

evaluating and providing funding, staffing and

other resources needed to complete the tasks.

In the United States, DOE annually submits

requests for funding to the President’s office with

fairly detailed explanations of the needs of

specific agencies, such as EIA. Together with

budget information, EIA submits its performance

measures to the government, which include customer satisfaction surveys and the timeliness of

information. Thus, the decisions on EIA’s funding

take place at the Presidential and Congressional

levels (Department of Energy 2014).

Statistics Canada retains staff and advances

their skills through continuous training and

selective rotations. Such trainings include

courses, networking events, conferences,

coaching sessions, communication trainings and

so on (See Box 6). Rotating assignments, offered

on a selective basis and within a reasonable skill domain, ensure greater work security and

creativity, which makes the agency an attractive

place to work. For example, energy statisticians

might get to work with experts in other

manufacturing sectors or agriculture.

Quality of data and methodologies. An essential aspect of maintaining credibility is relying on

sound methodologies and producing quality data. This task can be more challenging for

decentralized systems, where there are many actors in the statistical system. In Germany, all offices

are involved in the coordination of quality assurance measures.

4 http://www.statistik-portal.de/Statistik-Portal/en/.
5 https://www.regionalstatistik.de/.

Box 6. Ensuring Long-term Quality through Continuous Training at Statistics Canada

Continuous training is important to

ensuring long-term quality of the statistical

system. Since 2004, Statistics Canada has had a

policy of continuous learning (Statistics

Canada 2008). To implement it, the Human

Resources Development Division (HRDD) of

Statistics Canada organizes various courses,

networking events, conferences, coaching

sessions and communication trainings

(Statistics Canada 2014). HRDD also

organizes Quality Assurance exercises, in

which selected employees have group

discussion of Quality Assurance concepts

and “best practices”, and identify areas of

potential quality risk in their respective

programs. The scale and single location of

Statistics Canada make it logistically easier

and cost-effective to organize such activities.

Decentralized systems can take advantage of

distance training systems to implement such

practice.


In addition to basic quality assurance measures, statisticians continue developing and following methods of process

documentation and analysis, quality criteria measurement, self-assessments, external assessments

of data quality and user surveys (Destatis 2014). In addition, Destatis introduced a comprehensive

quality management system based on Total Quality Management. Apart from the quality of the

products and services, the focus is also on customer and user satisfaction, active involvement of the

staff, long-term business success and societal benefits. The process includes the development and

definition of strategic goals, the annual planning cycle of the closed conference of the senior

management, agreements on targets (contracts), program and resources planning, cost accounting,

controlling and regular staff surveys.

In the United States, OMB requires all agencies to follow general guidelines on information

quality, but also to develop their own. Thus, EIA has to comply with DOE and internally developed

guidelines. All of these guidelines are largely overlapping and in general terms include:

development of concepts and methods; planning and design of surveys; and establishment of review

procedures. As part of complying with these procedures, EIA selectively audits data obtained from

surveys, for example, by sending staff to an energy producing facility or a well. Statistical agencies

also commit to using modern statistical theory and practice; staff training; implementing ongoing

quality assurance programs to improve data validity and reliability and to improve the process of

compiling, editing and analyzing data; and maintaining relationships with appropriate professional

organizations (Economic Research Service, National Agricultural Statistics Service et al. 2002).

IV. Best Practices on Data Coverage and Requirements

The statistical model and institutional framework affect how countries collect, compile and

disseminate energy data as well as the coverage of energy statistics. Energy data collection and

dissemination are specific to the national situation. Countries need to understand their data needs –

what needs to be collected and what needs to be disseminated – and develop consistent

methodologies to meet these needs. The four countries studied in this report (i.e. the United States,

Canada, UK and Germany) have different mechanisms to collect and maintain energy data. In

particular, energy data in these countries differ in type, frequency, coverage of geographic areas

and time series, as well as the methodologies used to collect data. Both EIA and Statistics Canada

collect and disseminate energy information and data, including energy production, stocks, demand,

imports, exports and prices. DECC publishes statistics relating to energy, climate change, energy

efficiency, fuel poverty, radioactive incidents and coal health. In terms of energy data, Destatis only

provides information on electricity and gas production and distribution, heat generation, energy

consumption in industry and hard coal imports.

However, these countries also have similarities in their energy data systems. A robust energy

data system has multiple dimensions: relevance to policies, credibility and objectivity of data,

accurate methodologies and data products, time-series and data coherence, accessibility and user

friendliness. End users of energy data in these countries are also similar; they generally include

policy makers, industry, non-profits and international organizations, academia, media and the

public. The following sections discuss data collection, coverage, accessibility and interpretability in

detail.


Data collection and coverage. Across all countries, energy data are typically collected through the

following approaches: administrative data, census, statistical surveys and, to some extent,

measurement and modeling (UN 2013). Data collection methods are often specific to energy data

categories and data types (Table 2).

Table 2. Data coverage and collection methods

Energy supply
  Collection methods: administrative data, census, surveys
  Available data: imports/exports, stocks, energy production, transformation output, deliveries to the market
  Characteristics: data sources are a few large producers/industries/agencies; fossil fuels, primary heat and renewables often have different methodologies

Energy transformation
  Collection methods: census, surveys
  Available data: transformation inputs/losses, transformation output
  Characteristics: the number of data owners is relatively small

Energy consumption
  Collection methods: surveys, business data from energy industries
  Available data: final energy consumption by end users, transformation inputs
  Characteristics: consumption surveys have a large population and are expensive and time-consuming

Energy prices
  Collection methods: census, surveys
  Available data: expenditures, costs and taxes
  Characteristics: price data can be obtained from suppliers/traders or consumers; price data for renewables are limited; taxes and incentives data are often not included in the energy data system

Energy efficiency and indicators
  Collection methods: modeling, surveys, metering/measurement
  Available data: energy consumption by service, per capita or per unit of floorspace/output
  Characteristics: energy indicators are often estimated by assembling different datasets, which presents challenges in terms of data compatibility

Energy poverty
  Collection methods: administrative data, surveys, modeling
  Available data: energy expenditure relative to income
  Characteristics: data are very limited, often obtained from consumption surveys and case studies
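As a hedged illustration of the energy poverty row above (not a definition used in the report), one simple expenditure-share indicator flags a household as energy poor when its energy spending exceeds a set share of income; the 10 percent threshold below echoes the UK's historical fuel poverty measure and is used here only as an example, with hypothetical data.

```python
def energy_poverty_rate(households, threshold=0.10):
    """Share of households whose energy expenditure exceeds `threshold`
    (e.g., 10 percent) of their income: a simple expenditure-share indicator."""
    flagged = [h for h in households
               if h["income"] > 0 and h["energy_spend"] / h["income"] > threshold]
    return len(flagged) / len(households)

# Hypothetical survey microdata (annual figures).
households = [
    {"income": 18000, "energy_spend": 2100},   # 11.7% of income -> flagged
    {"income": 42000, "energy_spend": 1900},   # 4.5%
    {"income": 26000, "energy_spend": 2400},   # 9.2%
]
print(f"Energy poverty rate: {energy_poverty_rate(households):.0%}")  # 33%
```

Actual measures vary by country and can incorporate required (rather than actual) spending, equivalized incomes and housing costs.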

Data types, frequency and coherence. Frequency of data collection and dissemination varies across countries (Table 3 and Table 4). EIA has the most frequent collection of energy supply data and the widest data coverage among all four countries, while Canada has the most frequent collection/dissemination in end-use sectors.

Table 3. Frequency of data dissemination by the end-use sector

End-use sector                         United States   Canada   UK   Germany
Residential and commercial buildings   T or Qd         A        -    A**
Manufacturing                          Qd              A        A    A
Transportation                         n/k             A        A    -
Agriculture                            A*              A        A    -

Note: n/k – not known; T – triennially; A – annually; Qd – quadrennially.
*Agricultural energy use is reported as non-manufacturing energy consumption under the industrial sector. Box 7 provides details on agricultural energy data in the United States.
**Only heat balance is available.


EIA publishes several energy data reports on a regular basis, such as the Monthly Energy

Review, the Annual Energy Review and the Annual Energy Outlook, and these reports cover various

fuel types and end-use sectors. EIA also produces reports on specific fuel types, such as the Weekly Petroleum Status Report and the Quarterly Coal Report, to provide more sectoral detail. Statistics

Canada currently publishes two reports, the annual Canada Year Book and the annual Report on

Energy Supply and Demand in Canada, to disseminate energy statistics. There are five major

publications for energy statistics in the UK, including the annual Digest of UK Energy Statistics

(DUKES), the quarterly Energy Trends, the annual UK Energy in Brief, the Quarterly Energy Prices

and the Energy Consumption in the UK. These publications provide official statistics relating to

fossil fuels, renewables, climate change and energy efficiency. Data dissemination by Destatis is

limited due to the Energy Statistics Act in Germany. For instance, Destatis only publishes a small

portion of renewable energy statistics, which are instead managed by the Federal Ministry for the

Environment, Nature Conservation and Nuclear Safety. Compared to the centralized system, the

decentralized system may require more resources and staff time, because users need to gather

information from multiple agencies and institutions that are involved in collecting and

disseminating energy data. It may also be less user-friendly if there are challenges with collating

and integrating the various datasets for data users. As a member country of the European Union,

Germany reports its energy data to the European Commission, which publishes energy statistics

from all member states. This could also be a potential reason why Germany disseminates less data

on its own, compared to the other three countries, in addition to the coordination issues discussed above.

Box 7. Data on Agricultural Energy Use in the United States and Canada

In the United States, EIA does not collect data on agricultural energy use, but it reports

agricultural energy consumption under the industrial sector (i.e. the non-manufacturing

subsector). Unlike the manufacturing, buildings and transportation sectors, the non-

manufacturing sector (i.e. agriculture, construction and mining) does not have a single source of

data for energy consumption, and data are often derived from various sources by different

government agencies. The major data source for agricultural energy use is the Agricultural Resource Management Survey by the U.S. Department of Agriculture (USDA). EIA also uses

manufacturing survey data and industrial energy consumption data to calibrate non-

manufacturing energy use. But if agricultural product or waste is used as fuel, such like ethanol or

traditional biomass, EIA does maintain and collect such data. USDA also conducts the Census of

Agriculture every 5 years (dating back to 1840), but this normally does not relate to energy use.

In summary, USDA collects most agricultural energy consumption data in the United States, and

EIA analyzes those data from USDA and other agencies and includes them in its Annual Energy

Outlook. For energy use in agriculture, Statistics Canada has disseminated consumption by

primary and secondary energy sources at both provincial and national levels since 2002.


Table 4. Frequency of data dissemination by energy source

Category                 United States    Canada       UK           Germany

Coal
  Production             W, Q & A         M            M, Q & A     -
  Consumption            M, Q & A         M & Q        M, Q & A     -
  Disposition            Q & A            M            M, Q & A     M, SA & A
  Prices/price indices   D, W & A         -            M & Q        M
  Stocks                 M, Q & A         M & A        M, Q & A     -
  Reserves               A                A            n/k          -
  Projections            A                A            A            -

Petroleum & other liquids
  Production             W, M & A         M & A        A            -
  Consumption            W, M & A         M            A            -
  Disposition            W, M & A         M, Q & A     A            A
  Prices/price indices   D, W, M & A      M            W & Q        M
  Stocks                 W, M, SA & A     M & A        M, Q & A     -
  Reserves               A                A            n/k          -
  Projections            A                A            A            -

Natural gas
  Production             M & A            M & A        M, Q & A     M & A
  Consumption            M & A            M & A        Q & A        M & A
  Disposition            M, Q & A         M & A        M, Q & A     M & A
  Prices/price indices   D, W, M & A      M & A        M & Q        M & SA
  Stocks                 W, M, Q & A      A            Q & A        M
  Reserves               A                A            n/k          -
  Projections            A                A            A            -

Renewable & alternative fuels
  Production             M & A            -            A            A
  Consumption            A                -            A            A
  Disposition            A                -            A            A
  Prices/price indices   A                -            -            -
  Stocks                 -                -            A            -
  Reserves               -                -            -            -
  Projections            A                A            A            -

Nuclear & uranium
  Production             Q & A            -            -            -
  Consumption            D*, M & A        -            -            -
  Disposition            -                -            -            -
  Prices/price indices   A                -            -            -
  Stocks                 -                -            -            -
  Reserves               A                -            -            -
  Projections            A                A            A            -

Electricity
  Production             M & A            M & A        A            M, Q & A
  Consumption            M & A            -            A            -
  Disposition            biW, M & A       M & A        -            M
  Prices/price indices   M & A            M            A            M, SA & A
  Fuel input             M & A            A            A            M & A

Note: n/k – not known; D – daily; W – weekly; biW – biweekly; M – monthly; Q – quarterly; SA – semi-annually; A – annually.
* EIA reports data on daily nuclear capacity outages.


Countries also provide a wide range of historical energy data. The United States has the most consistent time series: nearly all annual energy data reported by EIA start in 1949, and most monthly data start in the 1970s or 1980s. Projections of future energy supply and demand are also available in the United States, Canada and the UK; the projections extend to 2030 in the UK, 2035 in Canada and 2040 in the United States.

Energy data in the United States and Canada are collected at the individual level, directly from mines, wells, refineries, traders, manufacturers and end users, while in the UK most energy data are collected at the aggregate level (for example, the UK collects data from mining companies instead of individual mines). EIA aggregates and disseminates energy data at various levels, including plant, state, census region and national. Statistics Canada usually reports data at the provincial and national levels. The UK only publishes energy data at the national level. Data dissemination in Germany is limited, but the available data are mostly at the national level. EIA also provides detailed state energy profiles; these data and the interstate comparisons they support are valuable for state and local policymaking.

Statistical surveys. Consumption and production surveys are the most frequently used methods to collect energy data. In general, production surveys, directed to suppliers of specific energy sources, measure the quantities of specific fuels produced for and/or supplied to the market, while consumption surveys collect information on the types of energy used by consumer groups and the consumer characteristics associated with energy use. In other words, production surveys may provide information on sales to different groups of end users that helps in compiling consumption estimates, but consumption surveys provide much more detail on, and confirmation of, consumption patterns.

Surveys are often expensive and complex, and there is a clear trade-off between cost and quality, timeliness and precision. Typically, there are nine steps in developing and implementing energy surveys: specify needs, design, develop, collect, process, analyze, disseminate, archive and evaluate (United Nations 2013). Sample size, sampling methods, response rates and the capacity of interviewers all contribute to the quality of survey results. The frequency of data collection relates to the way in which surveys are carried out. Most countries collect monthly data only from large energy producers/suppliers and extrapolate data for small producers/suppliers. Sample size and sampling methodologies are also affected by the frequency of data collection: for example, if the same data are collected monthly and annually, the monthly survey normally has a smaller sample size.
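To illustrate the extrapolation idea mentioned above, the sketch below shows one simple way a national monthly total might be grossed up from a panel of large suppliers using a coverage ratio derived from an annual census of all suppliers. The cutoff, supplier names and figures are hypothetical; actual agencies use more sophisticated estimators.

```python
# Illustrative sketch of cutoff sampling with ratio extrapolation.
# Hypothetical numbers; not any statistical agency's actual procedure.

# Annual census: production (PJ) reported by every supplier last year.
annual_census = {"large_a": 520.0, "large_b": 310.0, "small_c": 14.0,
                 "small_d": 9.5, "small_e": 6.5}

CUTOFF = 100.0  # suppliers above this annual production report monthly
large = {k: v for k, v in annual_census.items() if v >= CUTOFF}

# Share of total annual production covered by the monthly (large-supplier) panel.
coverage = sum(large.values()) / sum(annual_census.values())

# Monthly survey: only large suppliers respond.
monthly_large = {"large_a": 44.1, "large_b": 25.7}

# Extrapolate the national monthly total by dividing by the coverage ratio,
# i.e. assume small suppliers move in proportion to large ones.
estimated_month_total = sum(monthly_large.values()) / coverage
print(f"coverage {coverage:.1%}, estimated monthly total {estimated_month_total:.1f} PJ")
```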

Surveys on energy supply are easier to carry out and less expensive than consumption surveys. The number of respondents for production surveys is often low and the competence of the respondents is high, whereas the population of potential respondents for consumption surveys is large and their knowledge of the topics is relatively low. Consumption surveys are therefore conducted less frequently, cost more and need more resources to carry out. For example, EIA runs most production surveys on a monthly basis, while consumption surveys, such as the Commercial Buildings Energy Consumption Survey and the Residential Energy Consumption Survey, are conducted only once every few years (Box 8).


The comprehensiveness of energy surveys depends on data needs and on the available budget and resources. In the United States, EIA distributes 64 active surveys to collect energy data. Other agencies such as FERC and the Department of the Interior also distribute energy surveys for their administrative purposes6. Frequency of collection ranges from hourly and daily for Balancing Authority operations to annually in most cases. Reporting from respondents is mandatory for most surveys. EIA also conducts customer surveys to evaluate and improve its data services7. Statistics Canada uses 25 surveys to collect energy data. These surveys are conducted on a monthly, quarterly and/or annual basis, and reporting from respondents is also mandatory. DECC circulates 29 statistical surveys to collect energy data in the UK. These surveys have sample coverage ranging from 25% to 100% and response rates ranging from 80% to 100%. Most data are collected on a monthly, quarterly and/or annual basis. According to Destatis, both monthly and annual surveys are conducted for data collection in Germany. Various methods of data submission are applied in each country; generally, respondents can submit survey data by Internet/electronic data interchange, secure file transfer, mail or facsimile.

The capacity of interviewers is critical to energy surveys, especially consumption surveys. Compared to production surveys, consumption surveys are conducted less frequently and often by contractors. To ensure that surveys are conducted correctly and the results are reliable, it is important for statistical agencies to provide enough training for survey interviewers and to conduct quality checks. EIA is also considering a recommendation by the National Academy of Sciences to conduct smaller, continuous surveys, in part to maintain the capacity of interviewers and reduce staff turnover.

6 The Federal Energy Regulatory Commission (FERC) is an independent regulatory agency within the Department of Energy. FERC collects information necessary to fulfill its duties on regulation and licensing. EIA uses some information collected by FERC for some analyses and publications. EIA also uses information collected by the Department of the Interior's Bureau of Ocean Energy Management and Bureau of Safety and Environmental Enforcement.
7 DOE conducts various customer surveys of users and beneficiaries of DOE products or services to determine how DOE can better improve its services.

Box 8. Commercial Buildings Energy Consumption Survey

The Commercial Buildings Energy Consumption Survey (CBECS) is a survey used by EIA to collect building characteristics and energy use in buildings where at least half of the floorspace is used for purposes that are not residential, industrial or agricultural. CBECS consists of two components: the Buildings Survey (BS) for respondents at the buildings and the Energy Supplier Survey (ESS) for providers of electricity, natural gas, heating oil and district heat. Some key characteristics of CBECS are listed below (EIA n.d.):

Frequency of collection: quadrennially
Number of respondents: 9,700 for the 2012 survey
Reporting requirements: voluntary for BS and mandatory for ESS
Collection methods: 45-minute, in-person interviews relying on Blaise (computer-based interview software); building managers also receive a worksheet in advance to help compile data
Examples of data usage: EIA forecasting, energy end-use intensity models, the Energy Star rating system, statistical and analytical reports



Quality assurance and data validation. Ensuring data quality is a core challenge for agencies that collect and disseminate energy data. Energy data are the product of a complex process, including defining variables, collecting data from multiple sources, processing and analyzing data and formatting and disseminating data. Countries can develop a framework to improve and assure the quality of data. One such framework, developed by UNECE, Eurostat and the Organisation for Economic Co-operation and Development (OECD) to manage data quality and metadata, is the Generic Statistical Business Process Model (GSBPM) (UNECE 2009). This model has been adopted by Statistics Canada and many European countries and is also used by EIA. The GSBPM encourages a standard view of the statistical business process, but it is a matrix with many possible combinations of steps and sub-steps. The main steps are: (1) specify needs; (2) design; (3) build; (4) collect; (5) process; (6) analyze; (7) disseminate; (8) evaluate (see Appendix D). For example, Statistics Canada uses the model to evaluate and compare risks of large- and small-scale survey programs, as well as programs dependent on administrative data.

Statistical agencies can also use specific methods to improve and assure data quality. First, the statistical agency can identify data quality indicators to monitor data quality on an ongoing basis; correct units, historical trends and data consistency are often used as indicators for quality checks. Second, regular quality assurance reviews and evaluations can help improve data quality. One review point is the compilation of energy balances. Compiling an energy balance requires the use of data collected from various sources, and by experts working in a range of domains (and in some cases institutions); this implies an assessment of data accuracy and availability and provides an opportunity to reconcile and improve energy data. Energy data management agencies investigate large statistical differences to check whether data are wrong or incomplete; the threshold for investigation could range from below 1% to 10%, depending on data type and availability. Quality evaluation and improvement can also be based on major surveys. For example, EIA corrects and improves energy data and models for commercial buildings when the CBECS results are available. Third, tools and expertise can help ensure the quality of energy data. Standardized tools, such as automated edits, survey methodology and questionnaire design, can improve data quality and produce consistent results. In addition, agencies such as EIA have staff who work exclusively on quality checks. Fourth, providing training helps establish a culture of quality assurance by raising staff awareness and building their capacity around these issues. Finally, peer review and stakeholders' feedback also help improve data quality and coverage.
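As an illustration of the statistical-difference check described above, the following sketch flags fuels whose supply and consumption totals diverge by more than a chosen threshold. The figures and the 1% threshold are hypothetical; actual tolerances vary by fuel and data availability.

```python
# Minimal sketch of a statistical-difference check on an energy balance.
# All figures are hypothetical; thresholds vary by fuel and data availability.

def statistical_difference(supply_pj: float, consumption_pj: float) -> float:
    """Return the statistical difference as a share of total supply."""
    return (supply_pj - consumption_pj) / supply_pj

balance = {"coal": (1200.0, 1188.0), "natural_gas": (950.0, 905.0)}
THRESHOLD = 0.01  # flag differences above 1% for manual investigation

for fuel, (supply, consumption) in balance.items():
    diff = statistical_difference(supply, consumption)
    flag = "INVESTIGATE" if abs(diff) > THRESHOLD else "ok"
    print(f"{fuel}: statistical difference {diff:+.1%} -> {flag}")
```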

Data assurance needs to be conducted at every stage of developing an energy data system, from survey design to data dissemination and archiving. For example, to ensure data quality, all four countries require data validation after energy data are collected or reported. Each of the four countries follows the same trajectory for data validation (Figure 2). Energy data collected from respondents go through a number of plausibility checks before they are used to produce energy statistics. These checks cover, at a minimum, the following issues:


1) whether the data are complete in terms of number and content; 2) whether the data contradict or are anomalous compared to historical data; 3) whether the data differ significantly from estimates of energy models; and 4) whether the energy flows balance out correctly. These checks are usually based on formal checklists, logical conclusions and experience8. In circumstances where respondents are not able to provide corrected data immediately, an estimate is used until corrections are made.

8 UK: Section 3.2.1 in https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/65735/373-solid-fuels-statistics-methodology.pdf; Canada: http://www5.statcan.gc.ca/subject-sujet/result-resultat?pid=1741&id=1741&lang=eng&type=SDDS&pageNum=1&more=0; Germany: Section 3.4.1 in https://www.destatis.de/EN/Publications/QualityReports/BrochureQualityStandards.pdf?__blob=publicationFile

Figure 2. Common process for data validation: data collection → data validation (plausibility checks, with returns to respondents for correction) → data dissemination.
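The plausibility checks listed above lend themselves to automation. The sketch below is a minimal, hypothetical example of such checks (completeness, comparison with historical reports and comparison with a model estimate); the field names and tolerances are invented for illustration and are not any agency's actual rules.

```python
# Sketch of automated plausibility checks on a respondent's submission,
# mirroring the checks described above; fields and tolerances are hypothetical.

def plausibility_checks(record: dict, history: list, model_estimate: float) -> list:
    issues = []
    # 1) Completeness: all expected fields present and non-empty.
    for field in ("respondent_id", "period", "production_pj"):
        if record.get(field) in (None, ""):
            issues.append(f"missing field: {field}")
    value = record.get("production_pj")
    # 2) Anomaly versus historical data (e.g. more than 30% from the mean).
    if value is not None and history:
        mean = sum(history) / len(history)
        if abs(value - mean) > 0.3 * mean:
            issues.append("anomalous versus historical reports")
    # 3) Large deviation from an independent model estimate (e.g. more than 20%).
    if value is not None and abs(value - model_estimate) > 0.2 * model_estimate:
        issues.append("differs significantly from model estimate")
    return issues

record = {"respondent_id": "mine_017", "period": "2014-06", "production_pj": 41.8}
print(plausibility_checks(record, history=[30.2, 31.0, 29.5], model_estimate=30.5))
```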

Modeling and projection. Energy modeling is widely used by statistical agencies to collect and analyze data. Energy modeling is used to lower survey frequency (especially for consumption surveys) or to reduce the extent and complexity of data collection. As mentioned above, energy models are also used to validate data collected from surveys. Analyzing and interpreting data are also useful functions of energy models. For example, building energy consumption by fuel and service is critical information for energy simulation and dynamic modeling. However, fuel shares for different services are often difficult to obtain from household surveys, as most households do not have sub-metering, and modeling can help obtain such information.

Projecting future energy supply and demand is also a major function of energy modeling. In the United States, EIA produces the Annual Energy Outlook based on the National Energy Modeling System (NEMS).


The Annual Energy Outlook focuses on the factors that shape the U.S. energy system over the long term. EIA produces the publication annually to update assumptions about socioeconomic and technological development and current policies and to discuss selected issues of the year. The Annual Energy Outlook provides the basis to examine and discuss energy production, consumption, technology and market trends, and serves as a starting point for analysis of potential changes in energy policies.
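As an example of how modeling can fill the fuel-share gap noted above, the sketch below uses a simple conditional-demand-style regression: each household's total consumption is regressed on indicators of which end uses it has, so the coefficients approximate average consumption per end use. The data and end-use categories are invented; this is an illustrative technique, not the method of any particular agency.

```python
# Illustrative sketch of inferring end-use shares without sub-metering by
# regressing total household consumption on end-use ownership indicators.
import numpy as np

# Columns: [baseline, has_space_heating, has_water_heating, has_air_conditioning]
ownership = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [1, 0, 0, 0],
])
# Annual household consumption in kWh (hypothetical survey responses).
total_kwh = np.array([9500, 9200, 6800, 12400, 3100])

# Least-squares coefficients approximate the average consumption of each end use.
coeffs, *_ = np.linalg.lstsq(ownership, total_kwh, rcond=None)
for name, kwh in zip(["baseline", "space heating", "water heating", "cooling"], coeffs):
    print(f"{name}: ~{kwh:.0f} kWh per household with that end use")
```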

Common challenges in energy data. First, although data coverage differs across countries, in all countries supply data come from a small number of providers with expert knowledge of the field, while consumption data come from multiple sources and are expensive to collect. Second, high-quality and comprehensive consumption surveys are important. Supply organizations usually provide some data on sales by consumer type, which amount to consumption data at the aggregate level, but data collected from suppliers do not include details on end uses and consumer characteristics. Consumption surveys collect information on energy characteristics, usage patterns and relevant cost and socioeconomic information; in many cases, indicators obtained from consumption surveys also serve other policy goals such as social equity. Third, data on prices and indicators are limited, because there are multiple data sources and collection points, and complex methods such as modeling are often required to develop indicators. Finally, data on renewable energy are less comprehensive than those on fossil fuels. One reason is that renewable energy data are sometimes collected and maintained by a line ministry rather than a statistical agency, and agencies lack adequate coordination.

V. Prioritizing Data Reporting

Given budget and resource constraints, it is critical for statistical agencies to set priorities for data collection and dissemination. This section briefly discusses criteria that could be used to prioritize energy data. It also assesses how countries add or reduce surveys, since surveys are often expensive and resource intensive.

Possible prioritization criteria. It is important to align data products with priorities, particularly given budget constraints. Several factors affect the priorities of energy statistics.

Relevance to stakeholders' needs. Relevance is a key pillar of a statistical agency. This concerns whether the available data are useful and responsive enough for users to address their most important issues.

Importance of data sources. The development of sampling methodologies can weigh the relative importance of respondents; most countries sample only large energy producers/suppliers on a frequent basis.

Major changes/trends in the energy sector. Energy data collection is a dynamic process, and changes in the energy landscape can affect data products. For example, EIA will extend its collection of data on oil and natural gas production in 2015 to cover more states and crude oil production; the primary drivers are changes in natural gas production levels in certain states and the increase in crude oil production.

Major gaps in existing energy data. Identifying major gaps through evaluation within the statistical agency and stakeholder discussions can help prioritize the future direction of data collection.


Coordination and avoiding redundancy. Other than the statistical agency, line ministries often collect data for their own purposes. To collect data efficiently, it is always useful to examine what data are already available from an administrative source or an existing survey, and whether a new survey is needed or questions can be added to existing surveys.

Procedures to add new surveys. Countries often have an overarching institutional framework to manage information collection, through which agencies identify and prioritize their needs and coordinate with stakeholders. In the United States, the Paperwork Reduction Act (PRA) of 1995 requires each federal agency to seek and obtain approval from OMB before undertaking a collection of information directed to 10 or more persons. Each information collection activity approved by OMB can last up to three years. When EIA creates new surveys or revises existing surveys, it needs to follow the PRA requirements. The PRA submission form covers various aspects of a survey, including:

- The type of information collection: new, revision of an existing survey or extension of an existing survey;
- Description of the data collection: purpose, collection frequency, geographic areas, survey respondents and potential usage;
- Affected stakeholders: individuals/households, farms, businesses, non-profit institutions, the federal government or state and local governments;
- Obligation to respond: voluntary, mandatory or required to obtain or retain benefits;
- Reporting burden: the number of respondents, length of the survey and hours required to respond, estimated costs, etc.;
- Technological considerations to reduce the burden;
- Efforts to identify duplication and analysis of similar existing information.

EIA needs to share the above information for public comment and solicit feedback from stakeholders. Before seeking final approval from OMB, EIA needs to respond to all comments, summarize significant comments and revise the survey plan as needed. It is worth noting that the planning for new surveys often starts before the PRA submission, and it can take months for EIA to plan a new survey and get it approved.

VI. Communicating Data

To remain relevant to stakeholders' needs, statistical agencies need to ensure that customers and stakeholders have access to critical information when and how they need it. It is important that statistical agencies disseminate energy data in a consistent and user-friendly way. Effective and timely data dissemination informs the development of environmental, energy and economic policies, facilitates investment planning, supports green growth planning at the national and state levels and provides inputs to modeling and analysis.

Confidentiality and data disclosure. While the statistical agency needs to communicate energy data broadly, it also needs to protect survey respondents and data owners. Energy data disseminated should not allow statistical units to be identified directly or indirectly.


Common methods to protect confidentiality include data aggregation, data suppression, controlled rounding and perturbation.
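A minimal sketch of two of the disclosure-control methods named above (threshold-based cell suppression and rounding) is given below; the rules and figures are illustrative, not any agency's actual policy.

```python
# Sketch of threshold-based cell suppression plus rounding as perturbation.
# The rules (fewer than 3 respondents; round to the nearest 5 units) are illustrative.

def protect_cell(total: float, n_respondents: int, round_base: int = 5):
    """Return a publishable value, or None if the cell must be suppressed."""
    if n_respondents < 3:                          # primary suppression: too few respondents
        return None
    return round_base * round(total / round_base)  # rounding obscures exact reported values

cells = {"state_A_coal": (1243.0, 12), "state_B_coal": (87.0, 2)}
for name, (total, n) in cells.items():
    value = protect_cell(total, n)
    print(f"{name}: {'suppressed (D)' if value is None else value}")
```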

Costs and data availability. The four developed countries studied in this report do not charge for viewing and downloading data. Energy data in the United States are fully available to the public, while availability is limited to some extent in the other countries. Canada does not disseminate detailed renewable energy and nuclear/uranium data in the public domain, but these data are available on request. According to the UK Energy Research Centre, nuclear statistics and information in the UK are managed by the Dalton Nuclear Institute9, but they are not publicly available.

Dissemination formats and data products. Energy data can be disseminated both electronically and in paper publications10. Dissemination formats should meet the needs of stakeholders. For example, press releases of energy statistics need to be organized in a way that is easy for mass media to re-disseminate, while comprehensive energy data can be posted online in multiple electronic formats (PDF, XLS, CSV, interactive graphs/tables, etc.). In addition, the types of data products are also critical to effectively communicating information. Most countries develop annual energy reviews or balances to summarize information on energy supply, demand, transformation and efficiency. Countries such as the United States also provide monthly data on the energy market and analyze changes in market trends, which is critical for business planning and investment. There are also press releases (e.g. EIA's Today in Energy) that analyze specific energy issues based on data and communicate information in a timely manner.

Interpretability and metadata. Information that should accompany statistical releases includes: survey/product name, objectives of the survey, time frame, concepts and definitions, target population, collection method, survey forms, sample size, sampling methodologies and response rates, error detection, missing data, revisions, analytical methods used, provisions regarding confidentiality and disclosure control and response methods.
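One practical way to carry the metadata fields listed above alongside a data release is a machine-readable record, as in the hypothetical sketch below; the schema and example values are illustrative only.

```python
# Sketch of a machine-readable metadata record for a statistical release;
# the schema and values are hypothetical.
from dataclasses import dataclass, asdict
import json

@dataclass
class SurveyMetadata:
    survey_name: str
    objectives: str
    reference_period: str
    target_population: str
    collection_method: str
    sample_size: int
    response_rate: float
    confidentiality_rules: str

record = SurveyMetadata(
    survey_name="Example Residential Energy Survey",
    objectives="Estimate household energy consumption by fuel and end use",
    reference_period="2014",
    target_population="Occupied housing units",
    collection_method="In-person interview with supplier follow-up",
    sample_size=12000,
    response_rate=0.81,
    confidentiality_rules="Cells with fewer than 3 respondents are suppressed",
)
print(json.dumps(asdict(record), indent=2))  # publish alongside the data release
```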

Trends in communicating data. To meet evolving stakeholder needs, countries are improving data dissemination by leveraging technology. Internally, statistical agencies are streamlining reporting systems and improving data storage (e.g. moving towards cloud storage). Countries are also extending data formats, enhancing data accessibility and improving visualization. For example, EIA has an interactive electricity data browser11 and is developing an interactive coal data browser12 to enable users to find, graph and map energy data. The interactive browsers also allow customers to create charts and state-level thematic maps. In addition, countries are adding data products to meet evolving needs. For example, EIA will add a new survey of hourly electric power data from 68 electric systems in early 2015, which will supplement the existing biweekly, monthly and annual surveys of electricity data.

9 http://ukedc.rl.ac.uk/ext_stats_uk.html
10 Common data formats include the internet, CD-ROM, paper publications, files available to authorized users or for public use and press releases.
11 The electricity data browser is available at http://www.eia.gov/electricity/data/browser/
12 The coal data browser is available at http://www.eia.gov/beta/coal/data/browser/


VII. Take-away Messages

This report reviews energy data management frameworks in four developed countries: Canada, the United States, the UK and Germany. Countries have adopted diverse models. Depending on how much of the responsibility for official statistics lies with a central organization versus specialized government agencies, these systems can be loosely placed on a spectrum from the most centralized (Canada) to the least centralized (Germany). Despite differences in statistical models, countries with effective energy data management systems follow some common principles, including (a) relevance to policymakers and users; (b) legal status and independence; (c) coordination; (d) implementation and adequacy of resources; and (e) quality of data and methodologies. In addition, statistical agencies need different types of expertise, including statisticians, modelers and, in some cases, topical experts in different energy fields, which needs to be considered in the staffing plan and maintained with continuous training and capacity building. Energy data management systems usually match the overall government data approach on this spectrum of centralized to decentralized. In countries with decentralized data systems, extra coordination and communication are essential to maintaining data quality.

This report also studies energy data coverage in these countries, from both the data collection and dissemination perspectives. Administrative data, censuses, statistical surveys, modeling and, to some extent, direct measurement are used to collect a wide range of energy data, including energy production, transformation, consumption, energy efficiency and indicators. Key aspects of data coverage to consider in system design include the categories of data, the frequency of data collection and dissemination, the geographic areas covered and the time series of data. Countries show different levels of coverage; the United States has the most extensive and comprehensive energy data, with the longest time series, the most subnational detail and the most frequent data dissemination. Drawing on international best practices, there are four key areas to emphasize in developing a robust energy data management system: (a) improving consumption data; (b) strengthening coordination; (c) prioritizing energy data; and (d) enhancing access to data.

Although consumption data are important for designing energy and social policies, they are less comprehensive than production data in many countries. To improve consumption data, statistical agencies should develop and prioritize consumption surveys based on data needs, while coordinating with existing surveys and information collected by other agencies. Countries also need to improve the frequency with which consumption surveys are carried out in order to produce timely data for decision-making.

Statistical systems require coordination between institutions to achieve efficiency and coherent output. Coordination is more challenging for decentralized energy data management systems, although centralized systems also require coordination. Coordination can take place through institutional arrangements such as laws or committees. It is also required within the statistical agency: the agency should review existing and available resources before creating new data collection and dissemination mechanisms.

Statistical agencies are constrained by budget and resources. It is critical to prioritize data collection and reporting based on stakeholder needs, major trends in the energy sector and gaps in existing data.


Statistical agencies also need to evaluate and review existing surveys and data regularly to prioritize future data collection.

Data dissemination is as important as data collection. Effective and timely data dissemination ensures the relevance of the statistical agency and contributes to policy development and business decision-making. All four countries studied in this report provide energy data free of charge and constantly improve data dissemination to diversify data products and formats and provide maximum access to energy data. In addition, sharing metadata and improving data interpretability are important to an effective and efficient energy data management system.


References

AGEB (2014). "Mitglieder." Retrieved September 9th, 2014, from http://www.ag-energiebilanzen.de/14-0-Mitglieder.html.

Department of Energy (2014). Department of Energy FY 2015 Congressional Budget Request, Vol. 3. Washington, DC, Department of Energy, from http://energy.gov/sites/prod/files/2014/04/f14/Volume%203.pdf.

Department of Energy and Climate Change (n.d.). Statement of Administrative Sources. London, Department of Energy and Climate Change, from https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/240305/statement_administrative_sources_decc.pdf.

Destatis (2014). "Quality." Retrieved September 10th, 2014, from https://www.destatis.de/EN/Methods/Quality/Quality.html.

Economic Research Service, National Agricultural Statistics Service, Census Bureau, Economic Analysis Bureau, Education Department, Energy Information Administration, Centers for Disease Control and Prevention, Justice Department, Labor Statistics Bureau, National Science Foundation, Social Security Administration, Transportation Statistics Bureau and Internal Revenue Service (2002). Federal Statistical Organizations' Guidelines for Ensuring and Maximizing the Quality, Objectivity, Utility, and Integrity of Disseminated Information. Washington, DC, Federal Register, from https://www.federalregister.gov/articles/2002/06/04/02-13892/federal-statistical-organizations-guidelines-for-ensuring-and-maximizing-the-quality-objectivity.

Edmunds, R. (2005) "Models of Statistical Systems." 26, from http://www.paris21.org/sites/default/files/2101.pdf.

EIA (n.d.). "Comercial Building Energy Consumption Survey." Retrieved September 9th, 2014, from http://www.eia.gov/consumption/commercial/about.cfm

EIA (n.d.). "Information Quality Guidelines." Retrieved September 10th, 2014, from http://www.eia.gov/about/information_quality_guidelines.cfm.

Federal Statistical Office and the Statistical Offices of the Länder (n.d.). "GENESIS Database." Retrieved September 11, 2014.

Fellegi, I. P. (1996). "Characteristics of an effective statistical system." Canadian Public Administration 39(1): 5-34.

Government Statistical Service (n.d.). "UK Statistics Producers-Department of Energy and Climate Change." Retrieved September 9th, 2014, from https://gss.civilservice.gov.uk/about/uk-statistics-producers/department-of-energy-and-climate-change-decc/.

International Energy Agency (2013). Energy Policies of IEA Countries - 2013 Germany Review, International Energy Agency: 206, from http://www.cne.es/cgi-bin/BRSCGI.exe?CMD=VEROBJ&MLKOB=734697974949.

Kopsch, G. (2002) "The role and function of regional and local statistical offices; interactions with regional and local authorities." from

Page 34: International Best Practices on Energy Data Management · II. Data Management Frameworks Defining good quality data, whether on energy or other topics, is complex and multi-dimensional

26

http://unstats.un.org/unsd/methods/statorg/Workshops/Yangon/Session1_Local_Kopsch_Paper.pdf.

Lalor, T. (2013). "German Federal Statistical Office." Retrieved September 9th, 2014, from http://www1.unece.org/stat/platform/display/metis/German+Federal+Statistical+Office.

Scheuren, F., B. Willig, L. Katz, M. Rae, B. Maki, D. Lee, W. Hayes, C. Lee, R. Dehaan and A. Mushtaq (2012). Profiles in Statistical Courage. Washington, DC, George Washington University, from https://www.statisticalcourage.info/downloads/Ivan%20Fellegi.pdf.

Spar, E. J. (2011). "Federal Statistics in the FY 2012 Budget." Retrieved September 10th, 2014, from http://www.copafs.org/reports/federal_statistics_in_the_fy_2012_budget.aspx.

Statistics Canada (2008). Learning at Statistics Canada. Ottawa, from https://unstats.un.org/unsd/dnss/docViewer.aspx?docID=2475.

Statistics Canada (2012). Audit of Data Sharing Agreement, from http://www.statcan.gc.ca/about-apercu/pn-np-80590-73-eng.htm.

Statistics Canada (2014). "Statistics Canada surveys and analysis cost-recovery overview." Retrieved September 9th, 2014, from http://www.statcan.gc.ca/eng/cs/overview.

Statistics Canada (2014). "Workshops, training and conferences." Retrieved September 10th, 2014, from http://www.statcan.gc.ca/services/workshop-atelier-eng.htm.

The Government of Canada (1985). Statistics Act. R.S.C., 1985, c. S-19. Canada, from http://laws-lois.justice.gc.ca/eng/acts/S-19/FullText.html.

UK Office for National Statistics (n.d.). "Business and Energy." Retrieved September 9th, 2014, from http://www.statistics.gov.uk/hub/business-energy/index.html.

United Nations (2013). Energy Statistics Compilers Manual, from http://oslogroup.org/index.asp?page=escmmainpage.html.

United States Government Accountability Office (2012). Federal Statistical System: Agencies Can Make Greater Use of Existing Data, but Continued Progress is Needed on Access and Quality Issues. Washington, DC, U.S. GAO, from http://www.gao.gov/products/GAO-12-54.


Appendix A. Overlap of Statistics Codes and Principles of the United States, European Union and United Nations Statistical Commission

a. Proximity and relevance to policy issues and topics

b. Independence from political and other undue influences

c. Trust among data providers and confidentiality

d. Public perception and credibility

e. Timeliness and punctuality

f. Impartiality and objectivity

g. Cost-effectiveness

h. Non-excessive burden on respondents

i. Availability, accessibility and clarity to the public

j. Quality of data, products, methodologies and procedures

k. Coherence and comparability

l. Adequacy of resources

m. Mandate for data collection

n. Coordination among various agencies/branches for consistency and efficiency

o. Use of international concepts, classifications and methods

p. Bilateral and multilateral cooperation

[Diagram labels: UN Statistical Commission; U.S.; EU]


Appendix B. UK's Code of Practice for Official Statistics and Canada's Dimensions of Information Quality

United Kingdom

Principle 1: Meeting user needs.

Principle 2: Impartiality and objectivity

Principle 3: Integrity

Principle 4: Sound methods and assured quality

Principle 5: Confidentiality

Principle 6: Proportionate burden

Principle 7: Resources

Principle 8: Frankness and accessibility

Canada

1. Relevance

2. Accuracy

3. Timeliness

4. Accessibility

5. Interpretability

6. Coherence


Appendix C. Staff Breakdown at the United States’ EIA

EIA has about 370 employees, of whom 165 are administrative and management staff. The remaining 205 are specialty staff who work in EIA's three main departments: the Office of Energy Analysis, the Office of Energy Statistics and the Office of Communication.

Job description                Office of Energy   Office of Energy   Office of        Total
                               Analysis           Statistics         Communication
Economist*                     52                 16                 1                69
Operations Research Analyst    39                 22                 2                63
Survey Statistician            1                  32                 1                34
General Engineer               3                  1                  0                4
Petroleum Engineer             1                  2                  0                3
Chemical Engineer              2                  0                  0                2
Mathematician                  1                  0                  0                1
Mathematical Statistician      7                  21                 0                28

* One economist is also employed at the Office of Resource and Technology Management.


Appendix D. Generic Statistical Business Process Model (GSBPM) – a Tool to Manage and/or Evaluate Data Quality and Metadata
