Asia-Pacific Economic Statistics Week Seminar Component
Bangkok, 2 – 4 May 2016
Name of author: Lien Suharni; Ratih Putri Pertiwi; Tri Listianingrum
Organization: BPS – Statistics of Indonesia
Contact address: BPS – Statistics of Indonesia, Directorate of Census and Survey Methodology Development, Jl. Dr. Sutomo No. 6-8, Jakarta 10710, Indonesia
[email protected]; [email protected]; [email protected]
Contact phone: +62 878 8275 1150 (Lien)
Title of paper: Harmonization and Integration of Internal Data Sources in Constructing Statistical Business Register (SBR) Initial Data in BPS - Statistics Indonesia
Abstract
The economy changes so quickly that it needs to be monitored and anticipated continuously through 'live' statistics. To provide official statistics, Statistics Indonesia, known as BPS, started developing a Statistical Business Register (SBR) in 2013. Because this is the first time BPS has designed, developed, and implemented an SBR, preparing the SBR initial data is one of the important aspects to consider. To anticipate data inconsistency issues that occur during development, until the SBR is fully established, the internal data sources available from business-as-usual activities must be identified as SBR initial data.
BPS has maintained several business directories, each kept by a survey area as a baseline for producing survey frames. The business directories have been maintained in silos, independently by industry. Since the SBR is expected to be a centralized statistical infrastructure, one of the challenges is to harmonize and integrate the business directories into the SBR so that it can be used efficiently as the master frame for business surveys. The main purpose of the harmonization and integration is to achieve coherence and to address the data issues of the migration from silo to centralized maintenance. The harmonization process includes standardization of the definition of the unit and coverage, the classification used, and the characteristics to record. The start of applying the change in the base year of Indonesia's GDP based on the System of National Accounts (SNA) 2008, in 2014, was the right moment for each statistical activity to begin using the classification, new concepts and definitions, and statistical units standardized according to the SNA 2008. The integration process includes data gathering, matching to avoid duplicates, and profiling. The harmonization and integration are supported by a centralized information system. The results of both processes are then validated through confirmation with the business, or ground check. The Economic Census to be held in 2016 is a milestone for the BPS SBR to obtain full unit coverage confirmed in the field. With initial data that are clean and well defined, the BPS SBR is ready to move on to the use of administrative data.
I. Contents
I. Contents
II. Introduction
III. SBR Initial Data Integration and Harmonization
   A. Data Harmonization
      1. Statistical Unit
      2. Statistical Classification
      3. Variables
      4. Geographic location code
   B. Data Integration
      1. Data Gathering
      2. Matching
      3. Profiling
   C. SBR Data Confirming with 2016 Economic Census
IV. Conclusion
V. References
II. Introduction
Statistics Indonesia, known as Badan Pusat Statistik (BPS), is a non-ministerial
government institution in Indonesia that is directly responsible to the president. One of the BPS
missions is to provide quality statistical data through integrated statistics by implementing
national and international standards. BPS provides a range of statistics for governments and
communities.
There are two types of data source used to produce statistics, namely primary data and
secondary data. Primary data are collected directly from the object of observation for a
particular purpose, while secondary data are obtained or compiled indirectly from previous
studies or from records published by other agencies. BPS uses both types. However, primary
data collected by censuses and surveys are currently still the major source for producing
statistics, while the use of secondary data is still limited.
A census in BPS is conducted every 10 years to enumerate all units in the population.
A census is a very powerful way to describe the real situation, but it also has drawbacks:
it requires high cost and a huge number of officers, and it takes a long time to conduct and
analyze. In addition to censuses, BPS conducts surveys that provide data faster (annually,
quarterly) and more cheaply. In a survey, the population is estimated from a sample whose
frame comes from the latest census. On the other hand, secondary data, available in the form
of administrative data from ministries, institutions, agencies, and associations, as well as
big data on the internet, are waiting to be used and analyzed. The use and adoption of
secondary data is beneficial and allows efficiencies in terms of cost, productivity, and
innovation, but it does not come without flaws. The primary value of secondary data comes not
from the data in raw form but from processing and analyzing it and from the insights,
products, and services that emerge from the analysis. The challenge is how to capture the
data we need and use it to provide quality statistics.
Despite their drawbacks, census and survey data are used to produce and maintain the
master frame for various data collections in BPS. Meanwhile, businesses change so fast that
depicting the current situation requires monitoring through rapid data updating. When only
censuses and surveys are used, the time lag between data updating and publishing creates a
discrepancy between the data and the actual current situation. BPS needs to complement them by
utilizing secondary data. To generate up-to-date data that accord with real conditions,
synergy and integration between both sources of data are needed.
For economic statistics production, BPS organizationally consists of 16 sub-directorates,
each of which works on statistics production for a particular industrial sector and is called a
Subject Matter Area (SMA). Every SMA may have business directories maintained separately
based on industry. In addition to census data, the business directories are used as the basis
for generating survey frames. For example, the Sub-directorate of Construction Statistics
maintains a business directory that lists construction enterprises and conducts a construction
survey annually or quarterly using this list as the master frame. The independent maintenance
of business directories and frame production by industry has caused data quality issues,
including decreasing response rates, delays, and less than optimal data collection.
In 2013, BPS started to develop the Statistical Business Register (SBR) to address the data
quality issues caused by the frame. The SBR (formerly named the Integrated Business Register,
IBR) is a unified and integrated list of businesses in the forms of establishment, enterprise,
and enterprise group (the statistical unit model). As a statistical infrastructure, the SBR has
an important and fundamental role in forming and maintaining an integrated economic statistical
information system that serves a variety of purposes. The aim is to provide a complete master
frame for economic units, as well as the national economic indicators needed in the
preparation of national accounts. With the SBR, the economic data collection system is expected
to improve, so that the data produced can describe the real conditions of business over time
and provide an indicator of the economic impact of any changes in the economy
nationally, regionally, and internationally.
BPS is still working on SBR development to date. Because this is the first time BPS has
designed, developed, and implemented an SBR, the preparation of SBR initial data is one of the
important aspects to consider. Initial data are useful for getting a whole picture of the
economic units in Indonesia recorded in the available BPS internal data sources. Initial data
also provide a coverage measurement that serves as a quality benchmark for the use of
administrative data as the primary source for SBR updating.
The SBR has a role to consolidate and integrate the business directories that used to be
maintained in silos. By integrating those directories into a single integrated frame, the task
of maintaining the frame will be much more efficient. The SBR will become the single source of
current and comprehensive information on business establishments and companies for the economic
statistics programs in BPS. It will also serve as a sampling frame for all economic surveys and
censuses.
However, the migration from independent business directories to an integrated
SBR is not easy work. It needs a data integration and harmonization process. Each data set
needs some adaptation before it is ready to be merged into one single register. The data
could not be merged directly due to inconsistencies among them. The process of preparing each
data set before merging is called the harmonization process. Harmonization also covers
maintaining data quality. Harmonization is important to ensure statistical
consistency and coherence before and after the SBR is established.
In 2016, BPS is going to conduct the 2016 Economic Census (2016 EC), which will cover a
complete enumeration of all economic units in Indonesia. The 2016 EC is designated as an SBR
development milestone. The 2016 EC uses SBR initial data for planning and design; afterwards,
the 2016 EC results will complete and update the SBR initial data for continuous maintenance.
III. SBR Initial Data Integration and Harmonization
A. Data Harmonization
The harmonization process includes standardization of the definition of the unit and
coverage, the classification used, and the characteristics to record. In 2014, BPS began
applying the change in the base year of Indonesia's GDP based on the
2008 System of National Accounts (SNA). This was the right moment for each
statistical activity to begin implementing the classification, new concepts and definitions, and
statistical units standardized according to the 2008 SNA. The harmonization
process was implemented based on the 2008 SNA.
1. Statistical Unit
A statistical unit is an entity about which information is sought and for which
statistics are ultimately compiled [2]. It is the unit that provides the basis for statistical
aggregates and to which tabulated data refer. The BPS SBR supports a three-level
hierarchy of statistical units (enterprise group, enterprise, and establishment), which is
described in Figure 1.
Figure 1. Indonesian Economic Statistical Units Model
An institutional unit is an economic entity that is capable, in its own right, of
owning assets, incurring liabilities and engaging in economic activities and in
transactions with other entities [3]. An enterprise is the view of an institutional unit as a
producer of goods and services [3]. Enterprises under the control of the same owner
form a group to achieve economic advantages such as economies of scale, control of a
wider market and an increase in domestic productivity through more effective business
management. An enterprise group is a set of enterprises controlled by a group head.
The group head is a parent legal unit that is not controlled either directly or indirectly by
any other legal unit [5]. For production analysis, enterprises must be partitioned into
smaller and more homogeneous units, which the SNA defines as establishments. An
establishment is an enterprise, or part of an enterprise, that is situated in a single
location and in which only a single productive activity is carried out or in which the
principal productive activity accounts for most of the value added [3]. An ancillary unit is a part
of an enterprise that carries out an ancillary activity. As its name implies, an ancillary activity is
incidental to the main activity of an enterprise. It facilitates the efficient running of the
enterprise but does not normally result in goods and services that can be marketed [3].
The use of institutional units enables identification of the economic actors for business
in the SBR, including corporations (financial and non-financial), governments, NPISHs,
and households. Not all SMAs used this characteristic before the SBR was created. These
institutional units elaborate the SBR coverage, as shown in the following figure.
Figure 2. Institutional Unit vs Production Unit in SBR Coverage
The use of the economic statistical unit model in the SBR, based on the 2008 SNA, aims to
enable international and national comparability, to meet the needs of national accounts as
end users, and to determine the best approach for data collection. At first, statistical
units are implemented in the SBR by defining every unit as an establishment.
A unit that is not actually an establishment is then redefined later through the profiling
process.
2. Statistical Classification
For harmonization of statistical classification, the SBR uses the industrial
classification based on ISIC Revision 4, namely KBLI 2009 in its Indonesian version,
as the standard industrial classification for classifying activities. SMA
data have to use the same version of the classification (KBLI 2009). The main activity,
coded with KBLI 2009, is a mandatory field in the SBR and must be filled in with 5 digits.
In 2014, BPS synchronized all SMA business directories, which are divided by industry,
to be covered in the SBR. The harmonization process includes synchronizing the SMA
business directories with the SMAs responsible for statistics production, using the KBLI
2009 code as the key. This process is important in order to define a clear scope of the
statistics without overlapping activities; to generate better national economic data with
comprehensive coverage (no economic activity missed) and proper
classification of economic activities that can be compared regionally, nationally, and
internationally; and to improve coordination between statistical activities.
The harmonization process addressed several important issues, including:
1. Distribution of all KBLI 2009 codes to the SMAs responsible for the statistics
production.
2. Each 5-digit KBLI 2009 code will be maintained by only one SMA.
3. Scope should be clarified with the subject matter areas for agreement: additional scope
that is not covered by any SMA or does not yet have an SMA to maintain the data, overlap
between two or more SMAs, cross-industry coverage, specific codes to confirm, etc.
4. Nomenclature of business directories is defined by rules specified by the SBR for
standardization:
   - Homogeneous grouping by KBLI code for a clear boundary
   - No differentiation by business scale (large, medium, small, micro)
   - No differentiation by time / year
   - No differentiation by institutional sector (households, government, enterprise /
     corporate / private, foreign) unless there is a distinct group within KBLI
The harmonization process results in a mapping master table of KBLI 2009 codes,
responsible SMAs, and business directories. The mapping master table has been agreed and
issued as a decree that governs the responsibility of each SMA according to KBLI. It is used
as the reference for statistical activities in BPS.
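As an illustration only, the rules above (5-digit codes, one responsible SMA per code) could be checked against a mapping master table with a sketch like the following. The codes, SMA names, and directory names are made up, and `check_mapping` is a hypothetical helper, not part of the SBR system.

```python
from collections import defaultdict

# Hypothetical rows of a mapping master table:
# (KBLI 2009 code, responsible SMA, business directory). All values are invented.
MAPPING_ROWS = [
    ("41011", "Construction Statistics", "Construction Directory"),
    ("41012", "Construction Statistics", "Construction Directory"),
    ("10110", "Manufacturing Statistics", "Manufacturing Directory"),
    ("10110", "Livestock Statistics", "Livestock Directory"),  # deliberate conflict
    ("4711", "Trade Statistics", "Trade Directory"),           # deliberate malformed code
]


def check_mapping(rows):
    """Return (malformed_codes, conflicts) for a list of mapping rows."""
    owners = defaultdict(set)
    malformed = []
    for code, sma, _directory in rows:
        # The KBLI 2009 code is expected to be filled in with 5 digits.
        if not (len(code) == 5 and code.isdigit()):
            malformed.append(code)
            continue
        owners[code].add(sma)
    # Each 5-digit code should be maintained by only one SMA.
    conflicts = {code: sorted(smas) for code, smas in owners.items() if len(smas) > 1}
    return malformed, conflicts


malformed, conflicts = check_mapping(MAPPING_ROWS)
print("Malformed codes:", malformed)
print("Codes claimed by more than one SMA:", conflicts)
```

Codes reported as conflicts would correspond to the overlaps between SMAs that need to be clarified for agreement.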
3. Variables
Based on the SBR purposes, variables need to be defined to cover the information
that must be recorded. The SBR must include all the variables / characteristics necessary to
provide a frame for economic surveys. Therefore, in its role as a means of
coordination and integration that bridges updates between the subject matter areas as a whole,
the SBR groups the variables into identity, address, contact, structure, and indicator
characteristics, as shown in the following table.
Table 1. SBR Variables

| Characteristics | Variable Name | Note |
|---|---|---|
| Identity | ID SBR | |
| | Name of Business | |
| | Commercial Name | |
| | Statistical Unit | Enterprise Group, Enterprise, Establishment, Ancillary Unit |
| | Year established | |
| | Year operational | |
| | Alive status | 1. Active, 2. Temporary closed, 3. Pre-operation, 4. Closed, 6. Not found |
| | Main activity | |
| | KBLI Category | |
| | KBLI Code | |
| | Secondary activity | |
| | KBLI Secondary Category | |
| | KBLI Secondary Code | |
| | Other Activity | |
| | Other KBLI Category | |
| | Other KBLI Code | |
| | Main Product | |
| | Product 2 | |
| | Product 3 | |
| | Directory Code | |
| Address | Address | |
| | RT | |
| | RT | |
| | Sub-Census Block Number | |
| | Census Block Number | |
| | Village Code | |
| | Sub-District Code | |
| | District Code | |
| | Province Code | |
| | Country Code | |
| | Postal Code | |
| | Latitude | |
| | Longitude | |
| Contact | Area Code | |
| | Phone Number | |
| | Extension Number | |
| | Facsimile Number | |
| | Website | |
| | CP Name | |
| | CP Position | |
| | CP Phone Number | |
| Structure | Legal Status | 1. PT / PT Persero / Perum, 2. CV, 3. Firma, 4. Cooperative / Pension Fund (Dana Pensiun), 5. Foundation, 6. Business license, 7. Foreign institution / branch, 8. No legal status |
| | Shareholder | |
| | % Share | |
| | Parent flag | |
| | Owner Name | |
| | Owner Gender | 1. Male, 2. Female |
| | Ownership (Kepemilikan) | 1. Public, 2. Private |
| | Institutional Sector | S11 Financial Corporation, S12 Non Financial Corporation, S13 General Government, S14 Household, S15 NPISH, S20 Rest of the World |
| | Network Status | 1. Sole proprietorship, 2. Head Office, 3. Branch Office, 4. Representative Office, 5. Plantation, 6. Ancillary Unit |
| | Count of branches / plantations | If network status = 2 |
| | Count of representative branches | If network status = 2 |
| | Count of ancillary units | If network status = 2 |
| Indicator | Total Revenue | |
| | Total Payroll for Workers | |
| | Total Workers | |
| | Total Expenditure | |
| Other | Tax identification number | |
| | Note | |
The harmonization process for variables is carried out by mapping the business
directories' variables into SBR variables.
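A minimal sketch of such a variable mapping is shown below. The SBR variable names on the right-hand side are taken from Table 1; the directory column names on the left, the sample row, and the helper `to_sbr_record` are assumptions made for illustration and do not describe any actual SMA directory.

```python
# Hypothetical directory column names mapped to SBR variable names from Table 1.
COLUMN_MAP = {
    "nama_perusahaan": "Name of Business",
    "alamat": "Address",
    "kode_prov": "Province Code",
    "kode_kab": "District Code",
    "kbli": "KBLI Code",
    "jml_tenaga_kerja": "Total Workers",
}


def to_sbr_record(directory_row, column_map=COLUMN_MAP):
    """Rename mapped columns and list the directory columns with no SBR counterpart."""
    record, unmapped = {}, []
    for column, value in directory_row.items():
        if column in column_map:
            record[column_map[column]] = value
        else:
            unmapped.append(column)
    return record, unmapped


# Invented sample row; "catatan_survei" has no SBR variable in the mapping.
row = {
    "nama_perusahaan": "PT Contoh Jaya",
    "kbli": "41011",
    "kode_prov": "31",
    "catatan_survei": "x",
}
record, unmapped = to_sbr_record(row)
print(record)
print("Columns without an SBR variable:", unmapped)
```

Returning the unmapped columns separately keeps them visible for a decision by the subject matter area instead of dropping them silently.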
4. Geographic location code
The harmonization process includes the use of the same reference and period of the
master village file and master census block in the SBR. For the harmonization process, the
mandatory fields are set at the province and district level and will be expanded to the
census block in the future. The harmonization process may need code mapping and
identification of geographic location codes that are not in the master. The mapping is done
with a combined approach: automatically by the system and manually.
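The combined automatic and manual approach might look like the sketch below. The code format, the master extract, and the old-to-new code table are all invented for illustration; the actual master village file and mapping rules are not described in this paper.

```python
# Hypothetical extract of a master village file for one reference period.
MASTER_VILLAGE_CODES = {"3171010001", "3171010002", "3273020005"}

# Hypothetical old-to-new mapping for codes that changed between periods.
CODE_CHANGES = {"3171019999": "3171010002"}


def harmonize_location(code, master=MASTER_VILLAGE_CODES, changes=CODE_CHANGES):
    """Return (code, status), where status is 'ok', 'mapped', or 'manual'."""
    if code in master:
        return code, "ok"
    mapped = changes.get(code)
    if mapped in master:
        # Resolved automatically through the code mapping.
        return mapped, "mapped"
    # Not in the master and not mappable: left for manual review.
    return code, "manual"


results = [harmonize_location(c) for c in ["3171010001", "3171019999", "9999999999"]]
for code, status in results:
    print(code, status)
```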
B. Data Integration
The integration process includes data gathering, matching to avoid duplicates, and
profiling.
1. Data Gathering
The gathering process consists of acquiring data from the available internal data
sources, adjusting the data to the SBR template (Excel format), and uploading the
data into the SBR database using the SBR system. The BPS internal data sources used
for the SBR are 2006 EC data, SMA directories, profiling exploration results, and
regional administrative data from the 2016 EC updating preparation.
The 2006 EC data were obtained from the processing unit in BPS. They were filtered so
that only large and medium enterprises are included in the SBR. They consist of 116,000 units.
SMA directory data were obtained from the subject matter areas by requesting them to
send their latest company directory data, especially for large-scale enterprises / legal
units, to the SBR. The SMAs' business directories date from various years between 2011 and
2013, depending on each SMA's own updating process. Agribusiness company data from the
2013 Agricultural Census were then added to the SMA directory data. The SMA directory
data total about 545,000 units.
Profiling exploration results were obtained from exploration of large-scale
businesses, especially the 100 biggest enterprise groups in Indonesia ranked by revenue
in 2013 (according to Globe Asia Magazine), and state-owned enterprises.
The data were obtained through desk work using internet sources to access companies' annual
reports, financial statements, and official websites, as well as business news
websites and newspapers. The job requires computer equipment with internet
access and relies on company information published on the internet. The work was
done in 2014 by a dedicated profiler team consisting of 30 fresh graduates of the BPS
Statistical Institute, monitored and controlled by IBR supervisor staff. This source
contributes about 37,000 units.
Regional administrative data were obtained from the 2016 EC Updating
Preparation in 2014. Every BPS regional office was asked to gather lists of businesses
from business license registrations held by the government (central and local),
registrations organized by associations, etc. This resulted in about 508,000 units.
Figure 3. SBR Data Gathering
The data are uploaded using the SBR system, which has data validation
and duplicate detection facilities. If the Excel file uploaded to the system contains units
whose SBR mandatory fields are not completely filled in, that have a wrong KBLI code, or
that use the wrong data type (numeric, text, etc.), the SBR system generates an error file
containing the units that need correction. The uploaders then have to
download the error file, fix it, and re-upload it to the SBR system until all units are
successfully loaded. For each unit successfully entered into the SBR database, the system
automatically provides a unique ID called the SBR ID.
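The validate, correct, and re-upload cycle described above can be sketched as follows. This is not the SBR system's code: the list of mandatory fields, the KBLI extract, the numeric field, and the sequential integer used as SBR ID are all assumptions for illustration.

```python
# Assumed subset of mandatory fields, KBLI master extract, and numeric fields.
MANDATORY_FIELDS = ("Name of Business", "Province Code", "District Code", "KBLI Code")
VALID_KBLI_CODES = {"41011", "10110"}
NUMERIC_FIELDS = ("Total Workers",)


def is_blank(value):
    return value is None or str(value).strip() == ""


def validate_unit(unit):
    """Return the list of problems found in one uploaded unit (empty means valid)."""
    problems = [f"missing {f}" for f in MANDATORY_FIELDS if is_blank(unit.get(f))]
    kbli = unit.get("KBLI Code")
    if not is_blank(kbli) and str(kbli).strip() not in VALID_KBLI_CODES:
        problems.append("wrong KBLI code")
    for f in NUMERIC_FIELDS:
        value = unit.get(f)
        if not is_blank(value) and not str(value).strip().isdigit():
            problems.append(f"{f} must be numeric")
    return problems


def upload(units, first_id=1):
    """Split units into loaded units (given a new SBR ID) and error-file rows."""
    loaded, error_file = [], []
    next_id = first_id
    for unit in units:
        problems = validate_unit(unit)
        if problems:
            error_file.append({**unit, "errors": "; ".join(problems)})
        else:
            loaded.append({**unit, "SBR ID": next_id})  # hypothetical ID format
            next_id += 1
    return loaded, error_file


units = [
    {"Name of Business": "PT Contoh Jaya", "Province Code": "31", "District Code": "71",
     "KBLI Code": "41011", "Total Workers": "120"},
    {"Name of Business": "CV Contoh", "Province Code": "", "District Code": "73",
     "KBLI Code": "99999", "Total Workers": "many"},
]
loaded, error_file = upload(units)

# The uploader fixes the rejected rows and re-uploads only those.
fixed = [
    {**{k: v for k, v in u.items() if k != "errors"},
     "Province Code": "32", "KBLI Code": "10110", "Total Workers": "45"}
    for u in error_file
]
reloaded, still_wrong = upload(fixed, first_id=len(loaded) + 1)
print(len(loaded), len(error_file), len(reloaded), len(still_wrong))
```

Collecting all problems per unit, rather than stopping at the first one, lets the uploader fix a row in a single pass.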
2. Matching
If data from multiple subject matter areas are combined, the result contains
duplicate records. To remove the duplication, a matching process is
carried out using the IBR system by the liaison officer / team appointed by each of
the SMAs.
The matching process is done by comparing the SBR data with the other data
sources. All data from the gathering stage are stored in the database in separate tables.
The matching process uses the following 12 variables: name of business, commercial name,
address, province, district, sub-district, village, established year, operating year,
KBLI category, KBLI code, and statistical unit.
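The paper does not state how the 12 variables are compared, so the following is only one possible sketch: free-text fields are compared with a normalized string similarity from Python's `difflib`, and coded fields by exact equality. The split into text and coded fields, the equal weighting, and the sample records are assumptions.

```python
from difflib import SequenceMatcher

# The 12 matching variables, split into free-text and coded fields (assumed split).
TEXT_FIELDS = ("name of business", "commercial name", "address")
CODE_FIELDS = ("province", "district", "sub-district", "village", "established year",
               "operating year", "KBLI category", "KBLI code", "statistical unit")


def normalize(text):
    """Upper-case, drop dots and commas, and collapse whitespace."""
    return " ".join(str(text).upper().replace(".", " ").replace(",", " ").split())


def similarity(record_a, record_b):
    """Average agreement over the fields present in both records (0.0 to 1.0)."""
    scores = []
    for f in TEXT_FIELDS:
        a, b = record_a.get(f), record_b.get(f)
        if a and b:
            scores.append(SequenceMatcher(None, normalize(a), normalize(b)).ratio())
    for f in CODE_FIELDS:
        a, b = record_a.get(f), record_b.get(f)
        if a and b:
            scores.append(1.0 if str(a) == str(b) else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Invented records for illustration.
sbr_unit = {"name of business": "PT. Contoh Jaya", "address": "Jl. Merdeka No. 1",
            "province": "31", "district": "71", "KBLI code": "41011"}
candidate = {"name of business": "pt contoh jaya", "address": "JL MERDEKA NO 1",
             "province": "31", "district": "71", "KBLI code": "41011"}
unrelated = {"name of business": "CV Lain Sekali", "address": "Jl. Sudirman 5",
             "province": "32", "district": "73", "KBLI code": "10110"}

print(similarity(sbr_unit, candidate))
print(similarity(sbr_unit, unrelated))
```

A score near 1.0 would flag a likely duplicate for the liaison officer to review; any cut-off value would have to be chosen and tested on real data.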
The SBR data matching is done in several stages, illustrated in Figure 4:
Figure 4. SBR Matching Process
The matching process includes:
a) loading the data from the 2006 Economic Census Large Medium Enterprise
(2006 EC MLE) into the SBR,
b) matching the SBR data with the SMA directories data,
c) matching the SBR data with the profiling exploration result data,
d) matching the SBR data with the regional administrative data from the 2016 EC
preparation.
The treatment in the matching process is as follows:
a) If the unit is a duplicate (matched), there are three options: save the
matching status without editing the SBR data, save the matching status with the new
data (replacing the SBR data), or save the matching status with editing (not all SBR
data are replaced by new data).
b) If the unit is not matched, the new data can be regarded as a new unit and
need approval by the SMA in a later process.
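The treatment rules above can be sketched as a small function. The option names, the field names, and the choice to keep the existing SBR ID when data are replaced are assumptions for illustration, not a description of the IBR system.

```python
def apply_treatment(sbr_record, new_record, option, fields_to_edit=()):
    """Resolve a matched pair according to one of the three options.

    option: 'keep'    -> save matching status without editing the SBR data
            'replace' -> new data replaces the SBR data
            'edit'    -> only the listed fields are taken from the new data
    """
    if option == "keep":
        result = dict(sbr_record)
    elif option == "replace":
        result = dict(new_record)
        result["SBR ID"] = sbr_record["SBR ID"]  # assumption: the identifier is retained
    elif option == "edit":
        result = dict(sbr_record)
        for f in fields_to_edit:
            result[f] = new_record[f]
    else:
        raise ValueError(f"unknown option: {option}")
    result["matching status"] = "matched"
    return result


def register_unmatched(new_record):
    """An unmatched record becomes a candidate new unit awaiting SMA approval."""
    return {**new_record, "matching status": "new", "approved by SMA": False}


# Invented records for illustration.
sbr = {"SBR ID": 7, "Name of Business": "PT Contoh Jaya", "Total Workers": 100}
new = {"Name of Business": "PT Contoh Jaya Tbk", "Total Workers": 120}

kept = apply_treatment(sbr, new, "keep")
replaced = apply_treatment(sbr, new, "replace")
edited = apply_treatment(sbr, new, "edit", fields_to_edit=("Total Workers",))
candidate = register_unmatched(new)
print(kept, replaced, edited, candidate, sep="\n")
```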
3. Profiling
An important step that BPS has taken in the data integration
process is developing business profiles, called profiling. Profiling is a methodology for
analyzing the legal, operational, and accounting structure of groups of
companies at the national and global levels. Profiling is developed to build the statistical
units within a group, to establish how they relate to other companies within the group, and
to find the most efficient structure for collecting the data [4]. Units that have been
through the gathering and matching processes then need to be profiled. Profiling aims to
improve the quality of business registers and to get a balanced portrait of a company's
overall structure and business units. Profiling is done only for Large Business Units
(LBU) and state-owned enterprises, with their subsidiaries and production units. Profiling
is a way to transform the business structure in the real world into statistical units as an
implementation of the economic statistical unit model.
*The names and structure of the businesses are fictional, for illustration purposes
Figure 5. Implementation of Economic Statistical Units Model in Profiling
The profiling steps consist of company data updating and relating; in some
cases, if the information is not sufficient, confirmation with the business through a
business visit (ground check) is also needed. Updating by profiling is a top-down
approach, starting from the highest level (enterprise group), through the lower level
(enterprise), to the lowest (establishment). The updating steps in profiling are shown in
the following figure:
Figure 6. Profiling Steps for Updating
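As a rough illustration of the top-down order described above, the three-level unit hierarchy can be held in a small tree and visited from the enterprise group down to the establishments. The `Unit` class and all unit names are fictional and are not taken from the SBR system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Unit:
    name: str
    level: str  # "enterprise group", "enterprise", or "establishment"
    children: List["Unit"] = field(default_factory=list)


def top_down(unit, depth=0):
    """Yield (depth, unit), starting at the highest level and descending."""
    yield depth, unit
    for child in unit.children:
        yield from top_down(child, depth + 1)


# Fictional group structure for illustration.
group = Unit("Contoh Group", "enterprise group", [
    Unit("PT Contoh Konstruksi", "enterprise", [
        Unit("Contoh Konstruksi Jakarta Site", "establishment"),
        Unit("Contoh Konstruksi Surabaya Site", "establishment"),
    ]),
    Unit("PT Contoh Pangan", "enterprise", [
        Unit("Contoh Pangan Plant", "establishment"),
    ]),
])

for depth, unit in top_down(group):
    print("  " * depth + f"{unit.level}: {unit.name}")
```

Visiting parents before children means each enterprise is updated before its establishments, which mirrors the updating order in the text.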
C. SBR Data Confirming with 2016 Economic Census
SBR data are used to produce the pre-filled forms and the list frame for the 2016 EC. The
data sources used for the pre-filled forms are the LBU profiling results (take-all) and the
SMAs' business directories data (filtered to LBU only) that have complete geographic location
codes down to the census block. The other SBR units are printed as a list (called the list
frame) as coverage guidance for enumerators.
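The selection rule above can be sketched as follows. The field names `source` and `is_lbu` are hypothetical, and the sketch assumes that the complete-location condition applies to units from both sources, which the text does not state explicitly.

```python
# Geographic variables from Table 1 that must be filled down to the census block.
GEO_FIELDS = ("Province Code", "District Code", "Sub-District Code",
              "Village Code", "Census Block Number")


def has_complete_location(unit):
    return all(unit.get(f) is not None and str(unit.get(f)).strip() != ""
               for f in GEO_FIELDS)


def assign_instrument(unit):
    """Return 'pre-filled form' or 'list frame' for one SBR unit."""
    from_profiling = unit.get("source") == "LBU profiling"  # take-all
    lbu_from_directory = unit.get("source") == "SMA directory" and unit.get("is_lbu", False)
    if (from_profiling or lbu_from_directory) and has_complete_location(unit):
        return "pre-filled form"
    return "list frame"


# Invented units for illustration.
full_geo = {"Province Code": "31", "District Code": "71", "Sub-District Code": "010",
            "Village Code": "001", "Census Block Number": "005B"}
units = [
    {"source": "LBU profiling", **full_geo},
    {"source": "SMA directory", "is_lbu": True, **full_geo},
    {"source": "SMA directory", "is_lbu": False, **full_geo},
    {"source": "LBU profiling", **{**full_geo, "Census Block Number": ""}},
]
instruments = [assign_instrument(u) for u in units]
print(instruments)
```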
The 2016 EC is used as a vehicle to confirm that the SBR units exist in the field and have
complete, accurate, and up-to-date information. The 2016 EC results are expected to become the
new SBR data for long-term maintenance using administrative data.
Figure 7. 2016 EC L1. P (Pre-filled) Form
Figure 8. 2016 EC L2. P (Pre-filled) Form
Figure 9. 2016 EC List Frame for Kabupaten (Regency)
IV. Conclusion
Based on the harmonization and integration process undertaken in BPS, there are
some conclusions and recommendations for developing initial data for a statistical business
register.
1. BPS has been using censuses and surveys as the primary sources of data to generate the
statistical data released to the public, but they have several disadvantages, such as high
costs, a huge number of officers, and a long time to conduct. It is a challenge for BPS (SBR)
to obtain data of the required quality while keeping pace with businesses that change
so fast.
2. The use of secondary data derived from administrative data is a promising alternative
solution that enables efficiency in terms of cost and productivity. But the migration to
secondary data requires a data integration and harmonization process that
refers to the SNA 2008. The harmonization process is the process of preparing all the
data before they are merged into one single list, including
maintaining data quality and ensuring statistical consistency and coherence.
3. The data integration and harmonization process covers several aspects, namely data
harmonization, data integration, and ground check through the 2016 EC. Data
harmonization includes standardizing the statistical unit, statistical classifications,
variables, and geographic codes, while data integration includes integrating all data
available internally and then carrying out the matching and profiling processes. Finally, to
make sure each unit exists, the data need to be confirmed by the 2016 EC
enumeration.
4. Data from the 2016 EC will be the SBR's initial data for the provision of integrated data,
which will later be updated via surveys or secondary administrative data. To integrate the
2016 EC data, it is also necessary to redo the harmonization process, and likewise for data
migrated from secondary data, and so on. Thus, the harmonization process never stops,
in order to maintain data consistency and coherence.
5. A good relationship needs to be maintained with the ministries / agencies as
administrative data providers as the SBR enters the stage of variable identification. With a
good relationship, it is expected that the secondary data needed from them can be provided
continuously.
V. References
[1] BPS – Statistics Indonesia (2014). Laporan Hasil Studi Persiapan Sensus Ekonomi
2016 (SPSE2016) Pengecekan Lapangan Perusahaan Besar Integrated Business
Register 2014. Jakarta: BPS – Statistics Indonesia.
[2] BPS – Statistics Indonesia (2015). Modul Penyusunan Profil Perusahaan (Profiling)
Integrated Business Register (IBR) Pemutakhiran Data Dan Pembentukan Struktur
Jaringan Perusahaan 2015. Jakarta: BPS – Statistics Indonesia.
[3] European Communities, International Monetary Fund, Organisation for Economic
Co-operation and Development, United Nations and World Bank (2009). System of
National Accounts 2008. New York: European Communities, International Monetary
Fund, Organisation for Economic Co-operation and Development, United Nations
and World Bank.
[4] European Union (2010). Business Registers Recommendations Manual, 2010
Edition. Luxembourg: Publications Office of the European Union.
[5] United Nations (2009). International Recommendations for Industrial Statistics 2008.
New York: United Nations.