Evidence-based monitoring of international migration flows in Europe*

Frans Willekens

Professor Emeritus of Population Studies of the University of Groningen
Honorary Fellow, Netherlands Interdisciplinary Demographic Institute (NIDI), The Hague

[email protected]

* Paper presented at the 103rd DGINS Conference (Conference of the Directors General of National Statistical Institutes), Budapest, 21 September 2017 (Keynote scientific speech)

Abstract

In Europe, the monitoring and management of migration flows are high on the political agenda. Evidence-based monitoring calls for adequate data, which do not exist. The sources of data on international migration differ significantly between countries in Europe, and the initiatives to improve data collection and produce comparable data, including new legislation, did not yield the expected outcome. Scientists have developed statistical models that combine quantitative and qualitative data from different sources to arrive at estimates of migration flows that account for differences in definition, under-coverage, undercount and other measurement problems. Official statisticians are reluctant to substitute estimates for measurements. This paper reviews the progress made over the last decades and the challenges that remain. It concludes with several recommendations for better international migration data and estimates. They range from improved cooperation between actors to innovation in data collection and modelling.

Keywords: Europe, international migration statistics, migration flow modelling

Acknowledgement

This is a substantially revised version of a paper presented at the 2016 Conference of European Statistics Stakeholders (CESS), co-organised by the European Statistical Advisory Committee (ESAC), Budapest, 20-21 October 2016, and the Eurostat conference “Towards more agile social statistics”, Luxembourg, 28-30 November 2016, session “Statistics on intra-EU mobility”. Many thanks to James Raymer (Australian National University, School of Demography), Jakub Bijak (University of Southampton, Social Statistics and Demography), Nathan Menton (United Nations Economic Commission for Europe, Geneva) and four referees for their comments on an earlier version.

1. Introduction

The quality of international migration statistics in Europe has been an issue for decades. In the early 1970s, the Conference of European Statisticians (CES), a subsidiary of the United Nations Economic Commission for Europe (UNECE) and the United Nations Statistical Commission, noted serious shortcomings in the statistics of immigration and emigration (Kelly 1987). The UNECE initiated a study comparing immigration and emigration statistics of member countries and found great discrepancies. While preparing demographic scenarios for Europe ahead of the conference on ‘Human resources in Europe at the dawn of the 21st century’, Eurostat concluded that the existing data were inaccurate and not usable for population projections (Willekens 1994). Poulain (1991) had documented the inaccuracies. Since migration flow data could not be used, Eurostat used net migration estimates instead. That practice has not changed to date (EUROPOP2015; Lanzieri 2017a, 2017b). Net migration is obtained as a residual (population change minus natural change) without reference to data on migration. That approach attributes to migration the effect of several statistical adjustments made to balance the demographic accounting equation. Disregarding information on immigration and emigration has far-reaching implications, not only for demographic projections and the EU Economic Policy Committee’s monitoring of the sustainability of public finances in EU Member States (which relies on Eurostat’s population projections), but also for migration governance and the public debate on immigration.¹
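The residual method described above amounts to a single balancing calculation. The following minimal sketch illustrates it; the function name and all figures are hypothetical and are not taken from Eurostat data.

```python
def net_migration_residual(pop_start, pop_end, births, deaths):
    """Net migration as the residual of the demographic balancing equation."""
    natural_change = births - deaths
    total_change = pop_end - pop_start
    # Whatever natural change does not explain is attributed to migration,
    # including any statistical adjustment hidden in the population figures.
    return total_change - natural_change


# Hypothetical figures for one country and one year.
print(net_migration_residual(10_000_000, 10_050_000, 100_000, 90_000))  # 40000

# A statistical adjustment of +5,000 to the end-of-year population is
# absorbed entirely by the residual, although no one migrated.
print(net_migration_residual(10_000_000, 10_055_000, 100_000, 90_000))  # 45000
```

The second call shows why the residual is a weak proxy for migration: any revision of the population figures changes "net migration" one for one.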

The demand for accurate migration flow data has increased ever since migration became a crucial issue for Europe and started to dominate policy and political agendas. The Amsterdam Treaty, adopted in 1997, requested the European Commission to develop uniform procedures for the management of international migration and for the production of community statistics, including migration statistics. The Treaty led to the establishment, in 2002, of the European Migration Network to promote the collection and dissemination of information on migration. In 2003, the European Commission and the European Parliament concluded that further progress towards improving migration statistics requires legislation. That resulted in new legislation in 2007, the regulation on Community statistics on migration and international protection (Regulation No 862/2007 of 11 July 2007; for further details on its history, see Willekens and Raymer 2008). The legislation paved the way for statistical estimation methods by allowing National Statistical Institutes to use estimation methods to produce the migration data to be submitted to Eurostat: “As part of the statistics process, scientifically based and well documented statistical estimation methods may be used” (Article 9). Skaliotis and Thorogood (2007), both from Eurostat, discussed the challenges migration posed to the European Statistical System. Regulation 1260/2013 of 20 November 2013 on the establishment of a common legal framework for the production of European demographic statistics in the Member States encouraged the use of scientifically based and well documented statistical estimation methods. Achieving the objective of the Regulation, including the production of estimates, requires the interactive involvement of all Member States and effective coordination at the European level (Eurostat). The two Regulations and the implementing Regulation 2017/543 of 22 March 2017 on population and housing censuses also stress the need to harmonize concepts used in the production of statistics, in particular the concept of usual residence.

¹ Recently (September 2017), the Committee on Economic and Monetary Affairs of the European Parliament requested Eurostat to consider migration flows in population projections “in order to update the analyses of the social, economic and budgetary implications of population ageing and of economic inequalities” (European Parliament 2017).

These developments and targeted funding by the European Commission, in particular Eurostat and the Directorate General for Research and Innovation, stimulated new research to improve the availability, reliability and comparability of migration data.² The research resulted in an extensive assessment of data sources and the differences in the data produced, data collection practices, and activities undertaken at country and EU levels to overcome problems with migration data (Poulain et al. 2006; Kupiszewska and Nowok 2008; Kraler and Reichel 2010). In addition, improved statistical techniques were developed for estimating migration flows (e.g. Raymer and Willekens 2008; de Beer et al. 2010; Raymer et al. 2013; Abel 2013; Wiśniowski et al. 2016) and for forecasting migration in the presence of data deficiencies (Bijak 2011; Disney 2014). These studies have not yet resolved the inadequacies in migration statistics. At the sixty-second plenary session of the Conference of European Statisticians in 2014, Lanzieri (2014a) of Eurostat reviewed research on European migration statistics and concluded that a wealth of methods is available to official statisticians for improving migration statistics but that the potential remains under-exploited. Official statisticians are insufficiently aware of the methods that have been developed by researchers. Eurostat adds that the multiple methods studied and proposed may have created the impression that the research is not yet conclusive. Eurostat notes that the distinction between statistics and estimates hampers the implementation of research outcomes. Statistics represent the product of a compilation of records from primary data sources. Estimates represent the outcome of statistical models, possibly combining information from various sources. Official statisticians are reluctant to present estimates as official migration statistics, although the 2007 EC Regulation facilitated the use of statistical estimation methods to produce harmonized migration statistics. Eurostat calls for a strong and constant commitment to improve primary data sources and the derived statistics. Note that the compilation of records from primary data sources may involve some estimation too.

In this paper, I review recent research aimed at better data on international migration flows in Europe and argue that the most effective strategy to produce high-quality data on international migration for the monitoring and the management of migration is to create a synthetic database. A synthetic database combines quantitative and qualitative data from different sources. It contains the best possible estimates of the ‘true’ migration flows and indicators of how reliable the estimates are, given the different sources of uncertainty in the reported data. I argue that the development and maintenance of a synthetic database is a learning process, which implies that knowledge is updated in light of new evidence. The Bayesian model of learning combines data from different sources while accounting for the uncertainties involved. These methods may ultimately be incorporated in the database, leading to a smart database, which recognizes data types, suggests estimation methods and signals new trends and discontinuities in migration flows.

² Since 1994, approx. 80 projects on migration have been funded within the Social Sciences and Humanities Research Framework Programme (https://ec.europa.eu/research/social-sciences/index.cfm?pg=policies&policyname=migration-mobility). Many of these projects did not address data issues. For an overview of projects, see King and Lulle (2016) and European Commission (2016a). King and Lulle (2016, p. 32) are concerned that comparative studies of migration are “less reliable” due to data limitations and a lack of proper documentation of data. The Conference “Understanding and tackling the migration challenge: the role of research”, organised by the European Commission in February 2016, concludes that “Systematic cross-national comparative research including data collection and analysis is urgently needed.” (Boswell 2016). In addition, NORFACE had a programme to support migration research (Caarls 2016).

The structure of the paper is as follows. In Section 2, I approach the development of a synthetic database as a learning process. Section 3 is a very brief overview of main data sources of international migration. The subject of Section 4 is the modelling of migration flows. The Poisson model is the dominant model of migration. It is a probability model that predicts count data and associates with each prediction a probability that the prediction coincides with observations. To estimate the parameters, different types of data, including expert opinions, may be used. Bayesian inference provides a formal framework for combining different data types. Sections 5 and 6 focus on different types of observation and the modelling of errors in observation. One observational issue is selected for an in-depth discussion: the duration threshold or duration criterion applied to define usual residence and used in the definition and measurement of migration. Section 7 concludes the paper.
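To make the role of the Poisson model concrete, the following minimal sketch computes the probability that a predicted count coincides with an observed count. The expected count of 12 migrations is a hypothetical value chosen for illustration; it does not come from the paper.

```python
import math


def poisson_pmf(k, mu):
    """Probability of observing exactly k migrations when the expected count is mu (mu > 0)."""
    # Computed on the log scale to avoid overflow for large counts.
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


expected_flow = 12.0  # hypothetical expected number of migrations in a period
for k in (8, 12, 16):
    print(k, round(poisson_pmf(k, expected_flow), 4))
```

The model assigns a probability to every possible count, so an observation far from the expected value is not impossible, only less likely.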

2. Evidence accumulation: a learning process

The reasons for the inadequacies of international migration statistics, identified by the CES in the 1970s (Kelly 1987), still exist today (Poulain et al. 2006; Lanzieri 2014a):

a. No common definition of immigration and emigration. Although the EU Regulation 862/2007 requests member countries, whenever possible, to follow the United Nations recommendations on statistics of international migration (United Nations 1998), only a few countries adopt the UN definition of long-term and short-term migrant.

b. Coverage of migrants is often incomplete. In some countries, international migration statistics do not cover the entire resident population.

c. Undercount of migration continues to exist, in particular for emigration. By implication, return migrations are underreported too.

Data sources vary greatly between countries in Europe, even if some similarities exist. Some countries rely on the population census, others use surveys, and still others use administrative data, e.g. the population register, as the source of migration statistics. Population registers vary in accuracy because registration depends on self-reporting and therefore on the individual’s willingness to report. Some countries introduced administrative adjustments to account for the undercount, while others did not. Countries also collaborate with other countries and share data on arrivals and departures to enhance consistency in international migration statistics. Mirror statistics, i.e. statistics produced on the same subject by other countries, help to explain and reduce asymmetries in reported international migration statistics.

The power of official statistics depends on the trust stakeholders have in the figures. To be trustworthy, statistics should be valid, accurate, precise and reliable. Measurements are valid if they measure what they are supposed to measure. They are accurate if they represent reality. They are precise if different measurements yield results that are close. Measurements are reliable if they produce the same results under varying conditions. To produce international migration statistics that meet these requirements, direct measurements are necessary but not sufficient. Direct measurements (primary data) should be complemented by scientifically based and well documented statistical estimation methods that make optimal use of the observations and quantify distortions and their effects on the derived statistics. An effective strategy is to create a synthetic database combining data from different sources and to view the development and maintenance of the database as a learning process. Learning involves a knowledge structure, the search for new evidence and the integration of evidence in the knowledge structure.
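The asymmetries that mirror statistics reveal can be illustrated with a simple comparison of the same flow as reported by the sending and the receiving country. The sketch below is illustrative only: the function name and the figures are hypothetical.

```python
def mirror_gap(reported_by_origin, reported_by_destination):
    """Compare two countries' reports of the same flow (origin -> destination)."""
    if reported_by_origin <= 0:
        raise ValueError("reported_by_origin must be positive")
    gap = reported_by_destination - reported_by_origin
    ratio = reported_by_destination / reported_by_origin
    return gap, ratio


# Hypothetical figures: country A counts 8,000 emigrants to country B,
# while country B counts 12,000 immigrants from country A.
gap, ratio = mirror_gap(8_000, 12_000)
print(gap, ratio)  # 4000 1.5
```

A ratio well above 1 is consistent with the undercount of emigration noted earlier, but it can equally reflect differences in definition or coverage, which is why the raw comparison needs a model to be interpreted.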

a. Synthetic database

Governments collect data for many non-statistical purposes, such as tax and labour market policies. Other public and private organizations also collect data for purposes of administration and management. Some scientists collect data, but even if they do not, they may have useful knowledge about migration flows. All these data can be used for statistical purposes. The European Commission (2009) supports the use of data from multiple sources, including the private sector, to improve statistics. The integration of different data types into a single synthetic database poses a major challenge. Given the large differences in the definition and measurement of migration, migration statistics cannot justifiably be produced from raw data only. The data need to be harmonised. A useful harmonisation strategy is to use a model of migration that can accommodate different data types, both quantitative and qualitative. The purpose of the model is to produce the best possible estimates of the ‘true’ number of migrations (by migrant category). Quantitative data come mainly from primary data sources (see following section) but may include previous measurements or estimates of migration flows, for instance data from a population census organized several years ago. Qualitative data include knowledge about migration flows elicited from subject matter experts. Estimates of true flows are updated when new data become available. An advantage of a model of true migration flows is that it can be used to simulate different types of data, including new forms of data, and different measurement methods. Models can also be used to assess the impact of data types and measurement methods on the discrepancy between true and reported migration flows. The models can subsequently be integrated in migration forecasting (Disney et al. 2015).
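The idea of simulating reported data from a model of true flows can be sketched as follows. The sketch assumes, purely for illustration, that each true migration is recorded independently with a fixed reporting probability; the function names, the probability and the flow size are hypothetical and do not represent any of the published models cited above.

```python
import random


def simulate_reported_flow(true_flow, reporting_prob, seed=0):
    """Simulate the number of migrations that end up in a data source.

    Assumption: each of the `true_flow` migrations is recorded
    independently with probability `reporting_prob`.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    return sum(1 for _ in range(true_flow) if rng.random() < reporting_prob)


def correct_for_undercount(reported_flow, reporting_prob):
    """Scale a reported count back to an estimate of the true flow."""
    return reported_flow / reporting_prob


true_flow = 10_000  # hypothetical true number of migrations
reported = simulate_reported_flow(true_flow, reporting_prob=0.7)
print(reported, round(correct_for_undercount(reported, 0.7)))
```

Repeating the simulation with different seeds or reporting probabilities gives a rough sense of how much of the gap between true and reported flows a given measurement method could produce.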

The need for a model that integrates data from different sources has been set out in Eurostat’s vision for the production of statistics (European Commission 2009). In that vision, an integrated model is proposed in which needs for statistics are identified and the European Statistical System (ESS) attempts to respond to these needs by drawing upon, and integrating, information from different administrative and survey data sources (Radermacher and Thorogood 2009; Kraszewska and Thorogood 2010). Obtaining migration estimates that meet the expectations of stakeholders calls for a concerted effort. It cannot be achieved only at the national level, but needs to involve Member States in an interactive way, which requires effective communication, collaboration, data sharing, and coordination at the intra-European level.

b. Learning process

The combination of data from different sources and the updating of prior knowledge in light of new evidence are essentially learning processes. Insight produced by one data source changes when data are added from another source. Viewing the development and maintenance of a synthetic database on migration as a learning process implies a cognitive approach to database development. The cognitive approach is currently the dominant approach to machine learning and artificial intelligence (cognitive computing). It could also be a useful approach to database development. A formal method of learning that is particularly useful in this context is the Bayesian model of cognitive development, in short Bayesian learning. A fundamental premise is that processes such as migration involve many uncertainties; the outcome (e.g. whether an individual migrates in a given period or the number of migrations in a population during the same period) is inherently uncertain. To process information effectively and produce reliable statistics despite the uncertainties is a challenge. The uncertainties imply that an outcome can take on a range of possible values. If the outcome is a discrete variable, a probability can be associated with each possible value. If the outcome is a continuous variable, a non-zero probability can be associated with an interval. The distribution of probabilities indicates which outcomes are more likely and which are less likely. The more we know about a process, the better we are able to identify possible outcomes and predict how likely they are. The Bayesian model of learning is a formal approach to updating existing (prior) knowledge or beliefs in light of new evidence. Fundamental features of the Bayesian approach are that (1) knowledge or beliefs on processes and their outcomes are represented as probability distributions and (2) when new evidence becomes available, the prior beliefs are updated. The Bayesian method is a probabilistic method of scientific reasoning (Howson and Urbach 1989). The method has been shown to be effective in a range of areas including cognitive science and statistics.

Bayesian learning involves a formal description of how new information is assimilated in existing cognitive schemes, i.e. of the mechanism of integrating data from different sources into a coherent structure. It facilitates interpretation of data and it can also be used to study the measurement bias in existing cognitive schemes. These insights contribute to the production of valid, accurate and reliable information on a subject or process from empirical observation and prior knowledge. That makes Bayesian learning particularly attractive for the estimation of international migration flows.
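The updating of prior beliefs in light of new evidence can be illustrated with a simple conjugate example. The sketch below assumes a gamma prior on the expected annual flow and Poisson-distributed observed counts; this gamma-Poisson pairing is chosen here for its closed-form update and is not necessarily the specification used in the models cited in this paper. All figures are hypothetical.

```python
def update_gamma_poisson(prior_shape, prior_rate, observed_counts):
    """Posterior gamma parameters after observing Poisson counts.

    Prior: expected flow ~ Gamma(shape, rate); each observation ~ Poisson(flow).
    Posterior: Gamma(shape + sum of counts, rate + number of observations).
    """
    post_shape = prior_shape + sum(observed_counts)
    post_rate = prior_rate + len(observed_counts)
    return post_shape, post_rate


def gamma_mean(shape, rate):
    """Mean of a gamma distribution parameterised by shape and rate."""
    return shape / rate


# Hypothetical prior: an expert expects about 1,000 migrations per year,
# with a weight equivalent to two years of observation.
prior = (2000.0, 2.0)
# Hypothetical evidence: three years of reported counts.
posterior = update_gamma_poisson(*prior, [1200, 1100, 1300])

print(gamma_mean(*prior))      # 1000.0
print(gamma_mean(*posterior))  # 1120.0
```

The posterior mean lies between the prior belief and the average of the observations (1,200), and it moves closer to the data as more years are observed, which is the learning mechanism described above.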

Bayesian learning is remarkably similar to Piaget’s theory of learning, known as constructivism. The theory states that people learn by incorporating newly acquired information or experience in the knowledge they already possess (see e.g. Miller 1983 for a good introduction to Piaget’s theory). Both learning theories insist on the importance of prior beliefs and knowledge for the interpretation of new information and the prediction of unknown outcomes (Tourmen 2016:14). According to Piaget, children and other individuals build (causal) models of the world in order to interpret observations and experiences and to predict what will happen next. Knowledge is structured and stored in mental structures, known as cognitive schemes. Schemes are structured knowledge representations in our mind. They are mental models of reality. They represent the knowledge base an individual relies on to interpret observations and experiences and to make predictions, in short to make sense of the world. They determine an individual’s beliefs about the processes in his or her environment (world view) and how these processes are perceived. New experiences and evidence usually lead to updating the cognitive schemes. Assimilation is the incorporation of new experiences into an existing framework without altering that framework. As long as new observations and experiences are aligned with the internal representations of the world, they can be assimilated and the mental model is adequate for interpretation and prediction. If new evidence contradicts an individual’s internal representation, the individual may (a) disregard the evidence (denial), (b) change his or her perception of the evidence to fit the internal representation or (c) adjust the mental representation. Piaget refers to the adjustment of knowledge structures in the light of new observations or experiences as accommodation. The processes of assimilation and accommodation describe a learning mechanism. Learning is building and updating cognitive schemes, a process known as constructivism. Piaget did not elaborate on how knowledge is stored in mental schemes. In the Bayesian method of learning, knowledge is stored as probabilities and probability distributions. Beliefs are subjective probabilities associated with given outcomes or events. Subjective probabilities are updated in light of new evidence. The similarities between Piaget’s theory of learning and the Bayesian method have recently attracted the interest of cognitive scientists (see e.g. Frank 2016; Tourmen 2016). Learning processes in humans and machines are increasingly being formalized as Bayesian probabilistic inference (e.g. Chater et al. 2006; Gopnik and Tenenbaum 2007; Perfors et al. 2011; Jacobs and Kruschke 2011; Gopnik and Bonawitz 2015).

3. Sources of information on migration

The main data sources for international migration are censuses, administrative records and sample surveys (for a general introduction, see e.g. Bilsborrow et al. 1997; Cantisani et al. 2009; Bilsborrow 2016). At the world level, the population census is the main data source. The census reports, for members of the resident population, the current place of residence, i.e. at the time of the census, and the place of birth. These data make it possible to distinguish between native- and foreign-born. The census may also solicit from respondents the place of residence one or five years prior to the census or the duration of residence and the previous place of residence. Several organizations have invested in making these census data publicly available.³ The quality of data varies because not all countries adhere to the UN Recommendations for Population and Housing Censuses.
Some features of the census limit its usefulness as a source for up-to-date data on migration flows (Willekens et al. 2016). First, the census obtains information from the resident population. Hence immigrants are included but emigrants are not. The number of emigrants from a country may be derived from censuses of destination countries (mirror data), provided the country of birth is reported (Dumont and Lemaitre 2005). Second, the age or year of migration cannot be derived from the date of birth. Hence, unless data are available on place of residence at some recent date prior to the census, the data are ill suited for an analysis of migration trends and of the effects on migration of social, economic or political events and processes, and natural disasters. Third, return migrations and frequent migrations go unnoticed. Fourth, censuses are taken only every ten years in most countries. In Europe, the traditional census is being replaced by a register-based census. In a register-based census, the census is conducted on the basis of information in the registers, rather than through field enumeration. Information in registers may be complemented by data from other sources. Valente (2010) reviews census taking in Europe.

Abel (2013) developed a method to estimate international migration flows from census data on place of current residence and place of birth. The estimates are counts of people that changed residence at least once during a period of fixed length prior to the census (see also Abel and Sander 2014; Abel 2016). Lanzieri (2014b) of Eurostat tested whether Abel’s method can be used to overcome problems of quality and availability of migration data in Europe. The test showed that the method cannot provide a full coverage of migration flows within the EU-EFTA region primarily due to lack of input data, but can estimate the flows of persons born in specific countries. Lanzieri also found that the method can profitably be applied using any breakdown of population stocks, such as by citizenship or educational attainment.

³ www.unmigration.org; http://www.oecd.org/els/mig/oecdmigrationdatabases.htm; www.worldbank.org/en/topic/migrationremittancesdiasporaissues/brief/migration-remittances-data

Administrative data are produced by organizations in connection with administrative procedures. People have to register their residence status and their address when they enter school, apply for a work permit, a driver’s license or social security. They are required to report any change of address. Several countries keep a population register, an individualized data sheet (personal card) that includes a unique identification number, personal characteristics, and a continuous registration of a selection of life events. When newborn children and immigrants are registered, a data sheet is created. Deaths and emigrations result in de-registration, provided people notify the local authorities that maintain the register. The population register is used for a range of administrative purposes and, when kept up-to-date, is a tool to track individuals and retrieve data at the individual level. The population register may be linked to other administrative data, e.g. business register, housing register, register of residence permits and working permits, to individual data collected by censuses and surveys, and to administrative data collected by private organizations. Although administrative data are not collected to monitor population change, a selection of administrative data is provided to statistical institutes to produce statistics. The timeliness of the updating of the population register and the accuracy of the information determine the quality of the derived statistics. For a discussion of the potential of population registers for migration statistics (and other demographic statistics), see Poulain and Herm (2013). In addition to the registration data mentioned, other registration data are useful for migration statistics, e.g. register of visa recipients and asylum seekers.

Sample surveys provide relatively detailed data on a selection of individuals. The information is usually collected at one point in time only (cross-sectional survey). In some surveys, individuals are followed over time and information is recorded at regular intervals (panel surveys, follow-up studies). Although surveys may include information on current and previous places of residence, the sample size is usually too small to determine the level and direction of migration in a population. However, surveys may yield a wealth of information on respondents and that information may be used to determine who is likely to migrate and who is not, and why. Migration data are extracted from household surveys, labour force surveys (Wiśniowski 2017) and surveys on living conditions (see e.g. de Brauw and Carletto 2012). Several of these surveys include questions on place of birth and previous place(s) of residence. Some solicit information on household members living abroad. Recently, Bocquier (2016) assessed whether, in developing countries, demographic surveys and demographic and health surveillance systems can be sources of migration data. In the area of gender statistics, it is common to collect data on gender in general social and economic surveys. Eurostat proposed a similar approach for migration (Knauth 2012)⁴, and in 2010 the European Statistical System Committee (ESSC) adopted a conceptual framework and work programme for migration statistics mainstreaming and the development of migration statistics. Mainstreaming of the migration dimension in data collection has great potential, not only for the production of migration statistics but also for socio-economic policies and development cooperation.

⁴ “It is proposed that instead of creating additional surveys or other data sources on migrants, the need for information on migration and migrants should be taken into account as part of an ongoing development of a wide range of economic and above all social statistics, regardless of whether these statistics are based on administrative data sources or on statistical surveys.” (Knauth 2012).

Designated migration surveys exist too. Designated surveys yield better insight into (a) the who, why and how of migration, and (b) effective policies aimed at the management of flows (Willekens et al. 2016). They differ from migrant surveys, which focus on migrants. Examples of designated migration surveys include the International Passenger Survey (IPS) in the UK, the Migration between Africa and Europe (MAFE) survey, and the Mediterranean Household International Migration Survey (MED-HIMS). The IPS is used to determine the number of immigrants and emigrants of the UK. It is the main source of international migration statistics in the UK. A selection of travellers is asked how long they intend to stay in the UK or away from the UK (ONS, 2015). Intentions may change and the ONS estimates the number of ‘switchers’. To predict the number of people who stay at least 12 months in the UK or abroad (long-term international migrants, LTIM), the ONS computes, for each respondent in the IPS, “a person’s probability to switch their intentions based on their nationality and the average number of people who have switched their migration intentions in the previous three years.” (ONS 2016: Annex 1). The MAFE survey was organized in 2008 in three countries of Africa and six countries of Europe to gain insight into reasons for migration, the methods people use to enter Europe, and the impact of personal contacts on migration (Beauchemin 2010). MAFE survey data have been used to estimate rates and probabilities of emigration from countries of Africa to Europe, using extensions of statistical techniques of event history analysis that account for complex sample design (oversampling of migrant households) (Schoumaker and Beauchemin 2015; Willekens et al. 2017).
In a MEDSTAT5 regional workshop in Wiesbaden in March 2008, participating countries called for the implementation of a household migration survey to overcome the lack of data on international migration for the Mediterranean (MED) region (MEDSTAT Committee for the Coordination of Statistical Activities 2011). The MED-HIMS (Households International Migration Surveys in the MED countries) questionnaire is designed to collect data on out-migration, return migration, forced migration, intention to migrate, circular migration, migration of highly-skilled persons, irregular migration, and other useful data on migration, migrants and the effects of migration on households and communities6. National statistical offices implement the surveys. For a description of the project in the context of other international migration surveys, see Bilsborrow (2016). Designated international migration surveys have common goals, use common methods and face similar challenges of sample design, questionnaire design, implementation, and data processing and analysis. To gain insight in migration flows and their root causes, scientists recently called for a World Migration Survey (Beauchemin 2013, 2014; Bilsborrow 2016; Willekens et al. 2016). The

5 MEDSTAT is the European Commission’s statistical cooperation programme for the countries of North Africa and the Eastern Mediterranean. The countries covered by MEDSTAT are: Algeria, Egypt, Israel, Jordan, Lebanon, Morocco, Syria and Tunisia, as well as the Palestinian Authority. So far (August 2017), the MED-HIMS survey has been implemented only in Jordan and Egypt.
6 See the MED-HIMS website http://ec.europa.eu/eurostat/web/european-neighbourhood-policy/enp-south/med-hims


survey could build on the experiences gathered in the MAFE and MED-HIMS surveys and other multi-country international migration surveys, such as the Mexican Migration Project (MMP)7 of Princeton University and the Push-Pull Project, a joint venture of Eurostat and the Netherlands Interdisciplinary Demographic Institute (NIDI) (Schoorl et al. 2000; Van Dalen et al. 2005). The promises and challenges of survey-based comparative international migration research have been documented and the experiences and lessons learned reviewed (Liu et al 2016). A World Migration Survey would be a significant step toward the understanding of why people leave their home country and what should be done to develop a sustainable system of global migration governance.

New technologies lead to new forms of data. Mobile phones and other internet-connected devices generate data on the geographic location of the object. Geolocation data constitute a new form of data, obtained from a variety of sources such as Global Positioning System (GPS) signals, the physical addresses associated with Internet Protocol (IP) addresses, and RFID (Radio-Frequency Identification) tags attached to objects (e.g. passports or identity cards). IP addresses have been used to map the locations from which users send e-mail or use social media within a given period. Twitter and Facebook data and Yahoo! email accounts have been used to infer migration flows. Google search data have been used to infer migration intentions and preferred destinations. Recently, Fiorio et al. (2017) used Twitter data to estimate the relationship between short-term mobility and long-term migration. Gerland (2015) and Hughes et al. (2016) review estimations of migration flows from geolocation data. Although geo-locators track the locations of online connections and not the addresses of users or owners, and IP addresses can be masked, geolocation data may complement traditional data sources, provided they are available on a regular basis and anonymous, and the selection bias and privacy issues can be resolved. The challenges of using geolocation data as a source of migration data are huge (Laczko and Rango 2014). Hughes et al. (2016) conclude that “New and traditional data sources do not substitute for each other, they complement each other. … Combining data sources is key to produce an infrastructure that is robust to unanticipated changes in the use of technology. Building that infrastructure would be a gradual and incremental process where increasing data production and access, together with the development of methods, would sustain each other. 
We believe that Bayesian statistical models for migration count data hold the promise of addressing the issue of unifying traditional and emerging data sources.” The view that the new forms of data, known as big data, may complement but not replace traditional data sources, is consistent with the vision of the European Statistical System (2015).

4. Modelling migration

The oldest model of migration is the gravity model. It predicts migration flows from characteristics of place of origin and place of destination, and the distance between origin and destination. Characteristics include population size. Distance is usually physical distance, but can also be cultural distance. The gravity model is deterministic and lacks quantification of uncertainties in the measurement of migration. In the early 1980s, researchers reformulated the gravity model as a probability model, more particularly a Poisson regression model (see e.g. Flowerdew and Aitkin 1982; Willekens 1983). The advantages were that (i) the gravity model could easily be extended by including a range of predictors of migration, (ii) the theory of statistical inference could be used to estimate the parameters of the model, and (iii) the data generating process is specified (implicitly or explicitly). That process, which is

7http://mmp.opr.princeton.edu


assumed to generate observations on migration numbers is a stochastic process, more particularly a Poisson process (see further). The Poisson regression model is the most popular model of migration. It is usually written as a log-linear model, with the log of the number of migrants as the dependent variable. The log-linear model is a member of the family of generalized linear models (GLM). For an introduction to the Poisson model and other probability models of migration, see e.g. Willekens (2008, 2016a). For applications of Poisson regression models in estimations of true unknown migration flows in Europe, see Abel (2010), Raymer et al. (2013) and Wiśniowski et al. (2013). Cohen et al. (2009) apply the Poisson regression model (presented as GLM) to estimate migration between selected countries and regions of the world. The assumption that migration flows are outcomes of an underlying Poisson process is restrictive, however. The Poisson distribution is fully determined by a single parameter: the expected number of migrations during a given period, e.g. a year. The variance of the Poisson-generated flows is equal to the expected value of the flows. If migration flows are small, as in international migration, the variance in the data is usually much larger than the variance implied by the Poisson process. To account for larger variance or overdispersion, an additional parameter is needed. The negative binomial distribution is often used (Davies and Guy, 1987; Congdon, 1993). Abel (2010) and Ravlik (2014) use the negative binomial regression model to predict international migration flows. Not all scientists quantify uncertainty (e.g. Poulain, 1993; de Beer et al. 2010). Those who do quantify uncertainty, do not all specify a Poisson model or its extension, the negative binomial model. Bijak (2011:96) explicitly deviates from the Poisson model in favour of a normal distribution. Brierley et al. 
(2008:153) assume that observations on migration flows follow a log-normal distribution, with the log of the true flow as expected value and a given variance reflecting undercounting and other sources of uncertainty (log of data are normally distributed around the true values with a common assumed variance). True flows are predicted by push and pull factors. Azose and Raftery (2015) and Azose et al. (2016) focus on net migration and do not refer to the underlying process generating the migration flows. They predict net migration from past net migrations. Today, the common approach to the estimation of migration is to specify a model of flows and to determine the unknown parameter values that maximize the probability that the model predicts the observed flow data. The number of migrations (by characteristics of persons migrating, by origin and destination, during a given period) is the dependent variable of the model. In the statistical literature, that data type is referred to as count data and the stochastic process generating the data is a counting process. A counting process is a stochastic process that counts the number of events as they occur. A model with parameter values that are not plausible is not likely to yield accurate predictions of migration flows. The most common method to determine the unknown parameter values is to maximize the likelihood function. The model of migration flows relates migration to (a) factors that (are assumed to) influence migration systematically and (b) random factors. The effects of random factors are captured by specifying an appropriate stochastic process. For instance, if N(t) is a random variable denoting the migration count in year t or during the period from 0 to t, then the sequence {N(t)} = {N(0), N(1), N(2), …} is a counting process. Counting processes arise in different

ways, e.g. by counting the number of times a person migrates before a given age x, or by counting the number of persons who migrate in a given period. The migration flow model should be consistent with the postulated underlying stochastic process. The implication is that the mathematical structure of the model of migration is determined by the assumed underlying stochastic process.
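The link between the Poisson process, the equality of mean and variance, and overdispersion can be illustrated with a small simulation. The sketch below is not taken from any of the studies cited. The function names, the rate of 5 events per period and the gamma shape of 2 are illustrative assumptions. It generates counts from exponential waiting times (a Poisson process) and then lets the rate differ between units according to a gamma distribution, which produces negative binomial counts.

```python
import random
import statistics


def poisson_count(rate, duration, rng):
    """Count the events of a Poisson process with constant rate on [0, duration]."""
    count = 0
    elapsed = rng.expovariate(rate)  # waiting time until the first event
    while elapsed <= duration:
        count += 1
        elapsed += rng.expovariate(rate)  # independent exponential waiting times
    return count


def simulate_counts(n_units, mean_rate, duration, rng, shape=None):
    """Simulate one event count per unit (e.g. per origin-destination pair).

    shape=None: all units share the same rate (pure Poisson counts).
    shape=k:    unit-specific rates are gamma distributed with mean mean_rate
                and shape k, so the counts are negative binomial.
    """
    counts = []
    for _ in range(n_units):
        if shape is None:
            rate = mean_rate
        else:
            rate = rng.gammavariate(shape, mean_rate / shape)  # (shape, scale)
        counts.append(poisson_count(rate, duration, rng))
    return counts


def dispersion_index(counts):
    """Variance-to-mean ratio: about 1 for Poisson counts, above 1 if overdispersed."""
    return statistics.pvariance(counts) / statistics.mean(counts)


if __name__ == "__main__":
    rng = random.Random(2017)  # fixed seed, illustrative only
    homogeneous = simulate_counts(5000, 5.0, 1.0, rng)
    heterogeneous = simulate_counts(5000, 5.0, 1.0, rng, shape=2.0)
    print("Poisson dispersion index:", round(dispersion_index(homogeneous), 2))
    print("Gamma-mixed dispersion index:", round(dispersion_index(heterogeneous), 2))
```

Under these assumed values, theory gives a variance-to-mean ratio of 1 for the homogeneous case and 1 + 5/2 = 3.5 for the gamma-mixed case. The simulated ratios should be close to these values but will vary with the seed.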


Many statistical models are based on counting processes. The theory, which was developed by Aalen (1975) in his PhD thesis, is well-established (Andersen et al. 1993; Aalen et al. 2008). It emerged as the main statistical theory for the estimation of models of event occurrences (survival models), event sequences (event history models) and complete life histories (for a brief introduction and for applications see e.g. Willekens 2014). The Poisson process is the simplest and most widely used counting process. It has a single parameter, the expected value of the number of migrations in an observation period. The variance is equal to the expected value. If events occur randomly in continuous time and if the occurrences are independent of each other, then the counting process is a Poisson process. The parameter of the Poisson process may vary by age, sex, income, region of origin, region of destination, and other factors. The parameter may also vary in time. For each of these categories, the parameter may follow a probability distribution to reflect the unobserved heterogeneity in a population. By way of illustration, consider a change of residence and disregard the restriction on duration of stay associated with the concept of usual residence. I refer to a change of residence without duration threshold as relocation. An individual may relocate multiple times during a period of observation8. Hence, relocation is a repeatable event. Let N(t) denote the number of relocations experienced by the individual during t years of observation, from onset at time 0 to time t. Assume that relocation is governed by a Poisson process. That implies that the count variable N(t) is a Poisson random variable and the distribution of possible values of N(t) is a Poisson distribution. Without loss of generality, we assume that people are identical with respect to their relocation behaviour, which implies that all have the same propensity to relocate. 
The likelihood of observing n relocations between 0 and t is given by the Poisson distribution:

$$\Pr\{N(t) = n \mid \lambda\} = \frac{\lambda^{n}}{n!}\,\exp(-\lambda) \tag{1}$$

The parameter of the Poisson distribution (λ) is the expected number of relocations during the observation period (λ = E[N(t)]). The variance is also equal to λ: Var(N(t)) = λ. The value of λ is determined by maximizing the probability that model (1) predicts the observations (maximum likelihood method). The relocation rate is the number of relocations per individual per year. It is the ratio of the observed total number of relocations by the study population during a given observation period (n) to the total duration of exposure (in years) of all individuals exposed to the risk of migration during that period (PY). The relocation rate is μ = n/PY, while λ = μPY = n. Since relocation is a repeatable event, an individual remains at risk after a relocation, hence all people are at risk during the entire period irrespective of the numbers of relocations

8 In order to estimate circular migration, the UNECE Task Force on Measuring Circular Migration presents individual data on number of movements between Italy and the rest of the world between 1st January 2005 and 31st December 2014 and between Sweden and the rest of the world between 1st January 2000 and 31st December 2009 (UNECE 2016). Such count data can be viewed as being generated by an underlying Poisson process. In order to get that information for Italy, ISTAT conducted a data linkage procedure using the population register as a data source. Individuals who left Italy without deregistration are de-registered ex-officio, which means that “the recorded duration of stay in Italy since previous immigration may be unreliable.” (UNECE, 2016, p. 26). The Swedish data came from the population register. In discussing the Swedish data, the report mentions the problem of left truncation, under-coverage of circular migrants who had their first migration before 1 January 2000 (UNECE 2016:27, footnote 13). Right censoring (circular migrants living in Sweden on 31st December 2009) is an issue too. Event history models (life history models) have been developed to address these issues (see e.g. Aalen et al. 2008; Willekens 2014; Beauchemin and Schoumaker 2016).


experienced. If people enter the population after the start of the observation period or leave the population before the end of the observation period, then the duration of exposure needs to be adjusted for late entry (left truncation) and departure (right censoring). The relocation rate μ is an occurrence-exposure rate. Note that λ = μ × PY. The likelihood of n events is proportional to μ^n exp(−μPY) since the exposure level PY is known. In Poisson regression models, PY is known as the offset (it enters the log-linear model as log PY). The estimation of the expected number of relocations during the observation period (λ) from the observed number of relocations illustrates the traditional approach to the prediction of migration flows. Frequently, relevant information about relocations and migrations is available from other sources and hence not contained in the data. For instance, migration flow data may be available for some past year or period, e.g. from a census. Subject matter experts may have relevant information that is not contained in the data, for example information on regulations introduced during the observation period that affect the registration of relocations and migrations or that cause a discontinuity in the relocation rate. Traditional models of migration often incorporate relevant prior information into the model. Algorithms to integrate historical data on migration in estimations of migration flows include the iterative proportional fitting (IPF) method, entropy maximization and the EM (Expectation-Maximisation) algorithm (for an overview of these methods, see Willekens 1999). To incorporate prior information in the prediction of migration, most researchers today adopt the Bayesian approach to statistical inference. The approach postulates that some prior information is available on the unknowns (the true flows or the parameters of the Poisson model) and that the prior information comes as probability distributions of plausible values of the unknowns. 
The prior information can be objective, such as migration data of an earlier period, or subjective, such as expert opinions or beliefs. Fundamental features of the Bayesian approach are that (1) information and knowledge are represented as probability distributions of possible values and (2) prior information on unknowns is updated in light of (new) observations. Prior information is expressed as a probability distribution. It implies an assumption that not only the expected value of a variable of interest is known, but that the distribution of possible values of the variable is known too. In traditional methods that use prior information (e.g. IPF), prior knowledge is represented as point estimates; the distribution is not considered. If the prior information is limited, a uniform distribution is appropriate because it assigns equal probabilities to all possible migration counts. This prior is said to be non-informative. When more evidence (data) becomes available, beliefs about the number of migrations are updated. The updates are captured in a posterior probability distribution. Updating beliefs, opinions, knowledge or predictions in light of new evidence is essentially a learning mechanism. To combine data and prior information on the unknowns, Bayes’ theorem is applied (for an excellent and accessible introduction, see Bijak and Bryant 2016; for a textbook see Congdon 2001):

$$p(\text{unknowns} \mid \text{data}) \propto \frac{p(\text{data} \mid \text{unknowns})\; p(\text{unknowns})}{p(\text{data})} \tag{2}$$

The p’s denote probability distributions, that is, probabilities or probability density functions. The term 𝑝(𝑑𝑎𝑡𝑎|𝑢𝑛𝑘𝑛𝑜𝑤𝑛𝑠) is the probability that a migration model with unknown parameters predicts the data, i.e. the observed migration flows. It is the likelihood function


described above. The term p(data) is the probability of observing the data. If the data are obtained by sampling a population, then it is the probability of obtaining that particular sample. The term p(data) is fixed for any given data set and plays a minor role in most applications. It is often omitted. The term p(unknowns) is the prior probability distribution. It represents empirical evidence (objective) or beliefs (subjective) about the values of the parameters of the model prior to data collection. In case of a non-informative prior, the posterior distribution p(unknowns|data) is determined by the likelihood and the Bayesian method produces results that are similar to the traditional method. ‘Unknowns’ can be replaced by ‘model’ or ‘hypothesis’ in which case the prior is the probability that we select a model or formulate a hypothesis, given the data and prior information. To illustrate the Bayesian approach to the estimation of migration, consider the likelihood function (1). Assume we have subjective prior information on λ that we want to use in the estimation procedure. We believe that λ is nonnegative and that the possible values follow an exponential distribution (from 0 to ∞) with parameter ξ, hence p(λ|ξ) = ξ exp(−ξλ). With ξ = 1, p(λ) = exp(−λ). Given the distribution, the expected value of λ (the expected number of relocations during the period of observation) is 1/ξ = 1, which may be very different from the number of relocations observed in the sample population. In other words, prior to data collection, we expect 1 relocation during the period of observation. The posterior distribution of λ, the expected number of relocations, is

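$$\Pr\{\lambda \mid N(t) = n\} = \frac{\dfrac{\lambda^{n}}{n!}\exp(-\lambda)\,\exp(-\lambda)}{\displaystyle\int_{0}^{\infty}\frac{\lambda^{n}}{n!}\exp(-\lambda)\,\exp(-\lambda)\,d\lambda} = \frac{2^{n+1}\lambda^{n}}{n!}\,\exp(-2\lambda) \tag{3}$$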

which is the probability density function of the gamma distribution with shape parameter n+1 and scale parameter 1/2. The inverse of the scale parameter is known as rate parameter, in particular in the context of the Poisson process. Let b denote the scale parameter and c the shape parameter. A common specification of the gamma distribution is (Evans et al. 2000:98):

$$\Pr\{\lambda \mid b, c\} = \frac{(\lambda/b)^{c-1}}{b\,\Gamma(c)}\,\exp(-\lambda/b) \tag{4}$$

where Γ(c) is the gamma function. Since c is a positive integer, Γ(c) = (c − 1)!, with ! denoting the factorial. The expected value of λ is E[λ] = bc, hence the expected posterior value of λ is (n+1)/2, which is the mean of (a) the prior guess of the number of relocations during the observation interval and (b) the observed number. If we believe or assume that one relocation occurs during a given period, but we observe 150 relocations, then the expected posterior number of relocations is 75.5. The exponential distribution is a special case of the gamma distribution. It is a gamma distribution with c = 1 and b the inverse of the rate, the parameter of the exponential distribution. If the prior is a gamma distribution, the posterior is a gamma distribution too. The posterior and prior distributions are conjugate distributions and the posterior has a closed-form expression. For instance, if we assume a gamma prior for µ, then the posterior density for µ will be a gamma too. If the prior is G(a,b), with shape a and rate b, then the posterior is G(a+n,b+PY) (Congdon 2001:35).
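The conjugate update described above amounts to two additions, which the following minimal sketch makes explicit. The function names are illustrative, and the exposure of 1000 person-years in the second call is a hypothetical value not taken from the paper. The gamma distribution is parameterised by shape and rate (the inverse of the scale b).

```python
def gamma_poisson_update(prior_shape, prior_rate, events, exposure=1.0):
    """Posterior (shape, rate) of a gamma prior after observing Poisson counts.

    events:   observed number of events n
    exposure: 1.0 when the unknown is lambda, the expected count for the whole
              observation period; person-years PY when the unknown is the rate mu
    """
    return prior_shape + events, prior_rate + exposure


def gamma_mean(shape, rate):
    """Expected value of a gamma distribution (shape / rate = shape * scale)."""
    return shape / rate


if __name__ == "__main__":
    # An exponential prior with xi = 1 is a gamma prior with shape 1 and rate 1.
    shape, rate = gamma_poisson_update(1.0, 1.0, events=150)
    print(shape, rate, gamma_mean(shape, rate))  # 151.0 2.0 75.5

    # Hypothetical exposure of 1000 person-years for the relocation rate mu.
    shape_mu, rate_mu = gamma_poisson_update(1.0, 1.0, events=150, exposure=1000.0)
    print(gamma_mean(shape_mu, rate_mu))
```

The first call reproduces the posterior mean of 75.5 from the text, with shape n+1 = 151 and scale 1/2. The second call applies the G(a+n, b+PY) rule to the rate µ.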


Except for simple cases such as the one presented here, the mathematical form of the posterior probability distribution is not known and the parameter(s) cannot be obtained analytically. The solution is to explore the (joint) posterior distribution of the unknown(s) by walking around on that distribution (surface), taking samples and determining how likely the samples are given the migration model, the prior distribution of the unknowns and the data. The walk is a random walk modified by an acceptance rule. The rule states that a proposed move from the current location to a new location is accepted if that move contributes to finding the target posterior distribution. Once the target distribution is found, samples are taken to determine the unknowns. The samples are not independent. The current location determines the new sample. That is operationalized by considering each possible location as a state in a state space. The sequence of states is a Markov chain. The transition probabilities of the Markov chain are the probabilities of accepting moves, which are determined by the acceptance rule. The Markov chain that results has as its equilibrium distribution the target posterior distribution. This method is the Markov Chain Monte Carlo (MCMC) method (see e.g. Congdon 2001:466; Brooks et al., 2011; Bijak, 2010; Bijak 2011:32ff).
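The random walk with an acceptance rule can be sketched for the simple posterior (3), where the analytic answer is known and can serve as a check. The code below is a minimal random-walk Metropolis sampler, not the procedure of any particular study or software package. The function names, step size, starting value, burn-in length and seed are all illustrative assumptions.

```python
import math
import random


def log_posterior(lam, n):
    """Unnormalised log of posterior (3): Poisson likelihood times exp(-lam) prior."""
    if lam <= 0.0:
        return -math.inf  # lambda must be positive
    return n * math.log(lam) - 2.0 * lam


def metropolis(n, n_draws, step, start, rng, burn_in=1000):
    """Random-walk Metropolis sampler for lambda given n observed relocations."""
    current = start
    current_lp = log_posterior(current, n)
    draws = []
    for i in range(burn_in + n_draws):
        proposal = current + rng.gauss(0.0, step)  # symmetric random-walk proposal
        proposal_lp = log_posterior(proposal, n)
        # Acceptance rule: accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        if i >= burn_in:
            draws.append(current)  # successive draws are correlated
    return draws


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed, illustrative only
    draws = metropolis(n=150, n_draws=20000, step=6.0, start=50.0, rng=rng)
    print("MCMC posterior mean:", sum(draws) / len(draws))
    print("Analytic posterior mean:", (150 + 1) / 2)
```

The normalising constant never has to be computed because only ratios of posterior densities enter the acceptance rule. With n = 150 the sampled mean should lie close to the analytic value (n+1)/2 = 75.5, up to Monte Carlo error.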

The MCMC method was developed in the 1940s by Metropolis (Metropolis et al., 1953) and extended by Hastings (1970). Geman and Geman (1984) introduced Gibbs sampling into the arena of statistics. The idea of Gibbs sampling is to simulate from conditional distributions to produce samples from a joint distribution. Software for MCMC simulations is abundantly available. In a chapter contributed to Bijak (2010), Wiśniowski reviews available software for Bayesian analysis. Popular software includes WinBUGS, OpenBUGS and JAGS. The BUGS platform was developed by the BUGS (Bayesian inference Using Gibbs Sampling) software project (www.mrc-bsu.cam.ac.uk/software/bugs/). Increasingly, the R software environment is used for handling Bayesian computations. A good starting point is the text by Robert and Casella (2010) and the CRAN Task View “Bayesian inference” (https://CRAN.R-project.org/view=Bayesian).

5. Modelling measurement errors with input from subject-matter experts

Four key data problems emerge in the measurement of international migration (Raymer et al. 2013; Disney 2014; Disney et al. 2015): (a) the definition of migration, (b) population coverage (some population groups are omitted), (c) underreporting of migration, and (d) concerns about accuracy of the measurements. The prediction or nowcasting of the true migration flows in a given observation period by country of origin and country of destination is complicated by the mentioned measurement problems. The first is the definition of migration. Two broad data types are distinguished to define migration (Courgeau 1973): event data and status data. Event data measure event occurrences, e.g. migrations. Status data measure personal attributes, e.g. place of residence. By comparing places of residence at two points in time, the occurrence of a migration can be inferred. To distinguish these indirect measurements of migration from event data, they are referred to as transition data (see e.g. Willekens 2016a). 
A major source of event data is the population register. The population census and labour force surveys are major sources of transition data. Some important data problems in the measurement of migration can be reduced to these two data types (Willekens 1999; Poulain 2008). Event data and status data on migration are not really comparable, but they can be made comparable by modelling the ‘true’ migration process underlying both event data and status data. The definition of migration is two-dimensional. It includes a spatial dimension and a temporal dimension. The spatial dimension defines the areal units (places of residence)


considered in the measurement of migration. In international migration, it is a country. The temporal dimension adds a duration criterion to the definition of residence and change of residence. In 1998, the United Nations introduced the concepts of long-term and short-term migrant and adopted the definition of resident (as opposed to visitor) included in the 1994 United Nations recommendations on tourism statistics (United Nations, 1998). A long-term migrant is a person who moves to a country other than that of his or her usual residence for a period of at least a year (12 months), so that the country of destination effectively becomes his or her new country of usual residence. A short-term migrant is a person who moves to a country other than that of his/her usual residence for a period of at least 3 months but less than 12 months. The concept of short-term migrant also depends on the reason for migration. Moves that are for purposes of recreation, holiday, visits to friends and relatives, business, medical treatment or religious pilgrimage are excluded. Many countries do not follow the UN definition, but use different duration thresholds. Some, e.g. Germany, have no threshold and consider all changes in usual residence as migrations irrespective of intended or effective duration of stay. Other countries, e.g. Poland, register a change in usual residence if and only if a person indicates that the change is permanent. For a list of EU and EFTA countries and the duration thresholds they consider in measuring migration, see Cantisani and Poulain (2006); UNECE (2012) and Raymer et al. (2013:803). The UNECE Task Force on Analysis of International Migration Estimates Using Different Length of Stay Definitions, set up in 2008 to explore the different definitions in use, found five different ways countries in Europe measure duration of stay for an immigrant and duration of absence for an emigrant (UNECE 2012). 
The EU Regulation 862/2007 on migration adopted the definition of long-term migrant recommended by the United Nations. The EU Regulation 1260/2013 of 20 November 2013 calls on the Member States to carry out feasibility studies to determine whether the country can comply with the UN (and Eurostat) definition of usual residence (by 31 December 2016). The definition of migration is complicated by differences in definition of residence. Countries that use a de jure enumeration of individuals register the usual place of residence, while countries that use a de facto enumeration record the actual place of residence. The concept of residence is increasingly becoming a fluid concept, one that means different things to different people. Some people, known as transnationals, have multiple residences in different parts of the world and identify with multiple communities. In other words, they have multi-sited individual and social lives (IOM 2010b; Bilgili, 2015). Transnationalism is a key factor in contemporary migration management (IOM 2010a, 2010b). The definition of migration is also complicated by the concept of legal residence. A person who changes usual residence with an intention to stay at least 12 months in another country is not recorded as an immigrant unless the person is allowed to reside within the country of destination (and can show a document as proof of residency, such as a residence permit). Transnationalism and the concept of residency pose challenges for the definition of usual residence and the measurement of international migration. These challenges have their roots in the concept of sovereign nation state, introduced in the Peace Treaty of Westphalia (Germany) of 1648 as part of the new system of political order in Europe and upheld in the UN Charter. That treaty offers the legal basis to control national borders and regulate international migration (Betts 2011).
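Returning to the UN duration criteria introduced earlier in this section, the following sketch shows how the same move is classified differently once the duration thresholds change. It is an illustration only. The function name, the category labels and the purpose strings are assumptions made here, and national practices involve more than a single threshold.

```python
# Purposes that the UN definition excludes from short-term migration
# (labels chosen for this illustration).
EXCLUDED_PURPOSES = {
    "recreation", "holiday", "visit to friends and relatives",
    "business", "medical treatment", "religious pilgrimage",
}


def classify_move(duration_months, purpose=None,
                  long_term_threshold=12, short_term_threshold=3):
    """Classify a move to another country by duration of stay in months.

    The default thresholds follow the UN recommendations quoted in the text;
    other values mimic countries that apply different duration criteria.
    """
    if duration_months >= long_term_threshold:
        return "long-term migrant"
    if duration_months >= short_term_threshold and purpose not in EXCLUDED_PURPOSES:
        return "short-term migrant"
    return "not a migrant"


if __name__ == "__main__":
    print(classify_move(14, "work"))      # long-term migrant
    print(classify_move(6, "work"))       # short-term migrant
    print(classify_move(6, "holiday"))    # not a migrant
    # A hypothetical 6-month threshold turns the same 6-month stay
    # into a long-term migration.
    print(classify_move(6, "work", long_term_threshold=6))  # long-term migrant
```

The last call shows why flows reported under different duration thresholds are not directly comparable: the person and the move are identical, only the rule differs.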

Measurement issues are also related to the main method of data collection. In general, a population register yields better data on immigrations and emigrations during a given calendar year than surveys or other means of data collection. Censuses generally provide

Page 18: DGINS Conference (Conference of the Directors · migration statistics. At the sixty-second plenary session of the Conference of European Statisticians in 2014, Lanzieri (2014a) of

16

accurate data on immigrants but not on emigrants. A population register and a census differ in the residence concept used. A registers considers the administrative place of residence, while the census uses the actual or usual place of residence. Countries with a population register differ in quality of the migration data. The quality is considered better in Nordic countries, which exchange individual data on international migration9. To improve its migration statistics, Romania started to exchange aggregate data with Italy and Spain, two of the main destinations of Romanian emigrants (Pisică, 2016). Two sources of errors complicate the measurement of migration further: under-registration (undercount) and undercoverage. Undercount occurs when not all migrations are recorded. If immigration and emigration depend on self-reporting, the willingness to report varies, and the undercount can be substantial. Major sources of under-registration of immigration are people who overstay their tourist visa or residence permit, and undocumented border crossing. Under-registration of emigration is caused by people leaving the country without notice. A consequence of under-registration of emigration is that return-migrations are under-registered. Some countries, e.g. the Netherlands, correct emigration statistics by including unreported emigration of foreigners if the administration reveals that residents are missing and likely moved abroad (see also footnote 6). Under-coverage occurs when some categories of the population are excluded from the measurement of migration. For instance, asylum seekers are usually excluded because they are not admitted yet to reside legally in the country, although they intend to stay at least 12 months. Countries differ in ways they record cross-border migration of nationals and foreigners (UNECE 2012). 
For instance, Romania’s immigration data include foreigners only, while emigration data include nationals only (Romanian Institute for Research on National Minorities 2014). Because of differences in definition and measurement, a migration between two countries may be recorded in one country but not in the other. As a consequence, sending countries and receiving countries report different migration counts. These measurement problems have been known for a long time and attempts to do something about it have a long history. The Conference of European Statisticians (CES) identified the problem, as early as 1970. In 1971 the CES organized the United Nations European Seminar on Demographic Statistics in Ankara and Istanbul, in cooperation with the United Nations Office for Technical Cooperation and the Government of Turkey (Kelly, 1987). Participants noted that there were serious shortcomings in the statistics of immigration and emigration available for UNECE countries in that they differed considerably in scope, coverage, definitions, classifications and content and that in most instances they did not meet the requirements of population analysis research. They concluded that the improvement and harmonization of statistics on international migration was an urgent task. They also recommended to organize an exchange of statistics on international migration among ECE countries. The CES followed the recommendation and the improvement of migration statistics was included in the 1972 work programme. In its 1974 meeting, the CES pointed out that the quality of immigration statistics is generally much better than that of emigration statistics and proposed that a meeting of interested countries be held to discuss arrangements for bilateral exchanges of data on migration between pairs of ECE countries. That meeting was organized in 1975. 
In preparation for that meeting, the ECE secretariat collected immigration and emigration statistics for 1972 and arranged the data in two origin-destination matrices, one based on immigration data and another based on emigration data. The bilateral flow data revealed serious asymmetries in the migration data compiled by the countries in the ECE region. Reported emigrations from country A to country B did not match the reported immigrations to country B from country A. In 1980, the Council of Europe collected similar data from the 21 Member States of the Council of Europe and found the same anomalies. The matrices were attached to the 1981 annual report on the demographic situation in Europe. Issues such as coverage and duration threshold were already discussed at that time. To this day, these issues have not been resolved satisfactorily, although considerable progress has been made. Actions called for in 1974, e.g. the exchange of statistics on international migration between countries, are still being called for today (Skaliotis and Thorogood 2007; Radermacher and Thorogood 2009; Raymer 2012; Willekens et al. 2016), although an example of good practice exists, namely the exchange system in the Nordic countries. Insight into the types of migration data being collected has increased significantly (Poulain et al., 2006; Kraler and Reichel, 2010) and methods for the reconciliation of national statistics have been developed. These methods are the subject of the remainder of this paper. Until concepts and definitions of residence and migration are refined and innovations in data collection methods and procedures reduce measurement errors and increase the comparability of data, methods are needed to infer trustworthy and comparable migration statistics from data provided by the different countries of Europe. Essentially two methods have been developed to reconcile national statistics on international migration in Europe. Both start from the bilateral migration flow data compiled by countries of origin and countries of destination and adjust the reported migration data to obtain a unique, complete and internally consistent matrix of migration flows between the countries of Europe.

9 Inter-Nordic migrations are recorded on a special form common to the five Nordic countries (Inter-Nordic Migration Certificate). Individual data on new arrivals is passed to the population register of the country of origin.

The first method adjusts the reported migration without explicit reference to the sources of error in the measurement of migration. The method was proposed by Poulain and Wattelar (1983). It was adopted as a point of departure in the Eurostat-funded project MIMOSA (Migration Modelling for Statistical Analyses; 2007-2009), which resulted in several publications listed below. The method considers uncertainties in the data and experts help inform the estimation procedure by their judgments on the magnitude of the uncertainties that result from measurement problems. Migration flows to and from countries with good international migration data are given priority over migration flows between countries with serious data limitations and hence a larger uncertainty in migration flows. The second method pays more attention to the measurement process and specifies a measurement model that relates the quality of migration estimates to the main sources of measurement error: differences in definition, coverage, undercount and accuracy in migration measurement. Experts are interviewed and their judgments on the relative significance of the different reasons in explaining the incomparability of data are incorporated in the model. The measurement model is combined with a migration model that aims at predicting true migration flows (latent, not observed) from knowledge of the determinants of migration. The Bayesian approach is used to combine the different data types. The method was proposed by Raymer et al. (2013) in the context of the NORFACE-funded project IMEM (Integrated Modelling of European Migration; 2009-2012). In the remainder of this section, I review the two methods.

a. The Poulain approach with extensions

Poulain and Wattelar (1983) proposed a method to reconcile immigration and emigration statistics. They distinguish three types of countries with different levels of data availability:

(a) countries with immigration and emigration data, (b) countries with immigration or emigration data, and (c) countries without data on international migration. For each country, two correction factors are defined, one for immigration data and one for emigration data. Let $I_{ij}$ denote the immigrants in j originating from i, reported by receiving country j, and $E_{ij}$ the emigrants from i with destination j, reported by sending country i. The correction factors correct the flow from i to j such that

$$\alpha_j I_{ij} = \beta_i E_{ij} \qquad (5)$$

where $\alpha_j$ is a correction factor associated with the immigration data of country j and $\beta_i$ is a correction factor associated with the emigration data of country i. If C is the number of countries, then there are C(C-1) equations and 2C unknowns. The system of equations is overdetermined, i.e. there are more equations than unknowns. To obtain an approximate solution, the Euclidean distance measure $\sum_{i,j} (\alpha_j I_{ij} - \beta_i E_{ij})^2$ is minimized subject to the constraint that the total sum of estimated migration flows is equal to the total of the observed immigrations:

$$\sum_{i,j} S_{ij} = \sum_{i,j} I_{ij} \qquad (6)$$

where $S_{ij} = 0.5\,(\alpha_j I_{ij} + \beta_i E_{ij})$. A two-step procedure is used to improve the quality of the estimates. In a first step, five countries with complete and relatively good data are selected and the correction factors are determined. The correction factors are fixed up to a constant. To remove the constant, one correction factor is set to a given value. The authors fix the correction factor of immigration data of Denmark to unity. A correction factor equal to one preserves the reported immigration data. In a second step, the correction factors determined in the first step are fixed and those for the other countries are determined. The procedure results in two migration flow matrices, one with corrected immigration data and the other with corrected emigration data. The two matrices are close but not equal. Unique values of migration flows are obtained by averaging the corrected immigration flow and the corrected emigration flow. Poulain (1993) repeats the procedure but considers two groups of countries. The first group consists of the Nordic countries with good data. The second group consists of the other countries. The procedure consists of three steps. First, the correction factors are estimated for the Nordic countries. In a second step, the correction factors for flows between the Nordic countries and the other countries are estimated. These factors are used in a third step to estimate the remaining migration flows. Poulain (1999) divides countries into three groups depending on the reliability of migration data. The procedure is similar to Poulain (1993). The approach implies that migration flows between countries with good data are not influenced by data of lower quality. Van der Erf and van der Gaag (2007) adopt the method developed by Poulain. They start with the Nordic countries and add countries successively based on the perceived reliability of their migration data.
The sequence of countries introduced in the iterative estimation procedure is determined by experts. Expert judgments are also used to adjust correction factors if appropriate. Poulain and Dal (2008) apply the procedure to estimate migration flows between 28 countries of Europe: 13 countries with consistent migration data (called referee countries) and 15 other countries. The correction factor for immigrations registered in Sweden is set equal to one because Sweden uses the UN definition of migration (12-month criterion) and is considered to record immigration accurately. They change the function to be minimized to

$$\sum_{i,j} \frac{(\alpha_j I_{ij} - \beta_i E_{ij})^2}{I_{ij} + E_{ij}}$$

and maintain a single constraint that the total averaged estimated flow is equal to the total immigration. The denominator removes a limitation of the least-squares method, namely that large flows receive considerably more weight than small flows, which means that flows from large countries have a strong influence on the estimates. A limitation of that new distance function is that small flows receive much more weight than large flows. The problem is well-known in migration research and is resolved by considering multiple distance functions (Willekens et al., 1981; Abel, 2010). De Beer et al. (2010) adapted the constrained optimisation procedure to assure that the marginal totals of the corrected I and E matrices are equal:

$$\sum_{j} \alpha_j I_{ij} = \sum_{j} \beta_i E_{ij} \qquad (7)$$

and

$$\sum_{i} \alpha_j I_{ij} = \sum_{i} \beta_i E_{ij} \qquad (8)$$

for all i and j. Equations (7) and (8) can be written as a homogeneous system of 2C linear equations with 2C unknowns. The correction factors are unique up to a constant. The correction factor of reported immigration data for Sweden is set equal to one, for the reason given by Poulain and Dal. Since $\alpha_{j=\text{Sweden}}$ is fixed, the system of equations becomes a system of nonhomogeneous equations of the form $Ax = B$, which has 2C equations and 2C-1 unknowns. The solution is of the form $x = A^{-}B$, where $A^{-}$ is the generalized inverse of A. That solution is identical to the one obtained by minimizing $\sum_{i,j} (\alpha_j I_{ij} - \beta_i E_{ij})^2$ subject to constraints (7) and (8). Missing data constitute a separate problem. Some authors omitted countries with missing data. Poulain used the correction factors obtained from countries with data to estimate migration flows for countries without data. Abel (2010) estimated the missing flows using a regression model, fitted to the harmonized international migration flow data. The predictors are characteristics (covariates) of sending and receiving countries, and characteristics of links between the countries (distance, contiguity, trade, etc.). The model is a spatial interaction model, an established type of model for estimating migration flows. The idea of introducing a migration model that relates the harmonized data to covariates was important and was adopted by others (e.g. Raymer et al., 2013). The parameters of the model are estimated taking into account the incompleteness of the observed (harmonized) data. Abel applies the EM (expectation-maximization) algorithm, which is a maximum-likelihood technique that uses the migration model to predict missing data, initially with preliminary values of the parameters, and uses the predictions to improve the parameter estimates.
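
The least-squares idea behind equation (5) can be sketched in a few lines of Python. The fragment below is a minimal illustration under stated assumptions, not the implementation used by Poulain and Wattelar or by de Beer et al.: it fixes the immigration correction factor of one reference country to one and solves the remaining linear least-squares problem, ignoring constraint (6) and the marginal constraints (7) and (8). The function name `correction_factors`, the four-country flow matrix and the "true" factors are hypothetical and serve only to show that known factors are recovered from consistent data.

```python
import numpy as np

def correction_factors(I, E, ref=0):
    """Least-squares estimate of alpha_j (immigration) and beta_i (emigration).

    I[i, j]: flow i -> j as reported by receiving country j.
    E[i, j]: flow i -> j as reported by sending country i.
    alpha[ref] is fixed to 1 to remove the arbitrary constant.
    """
    I = np.asarray(I, dtype=float)
    E = np.asarray(E, dtype=float)
    C = I.shape[0]
    rows = []
    for i in range(C):
        for j in range(C):
            if i == j:
                continue  # no flow from a country to itself
            row = np.zeros(2 * C)
            row[j] = I[i, j]        # coefficient of alpha_j
            row[C + i] = -E[i, j]   # coefficient of beta_i
            rows.append(row)
    A = np.array(rows)
    # With alpha_ref = 1, its column moves to the right-hand side.
    b = -A[:, ref]
    A_free = np.delete(A, ref, axis=1)
    x, *_ = np.linalg.lstsq(A_free, b, rcond=None)
    x = np.insert(x, ref, 1.0)
    return x[:C], x[C:]

# Hypothetical example: reported data generated from known factors.
true_flows = np.array([[0, 120, 80, 40],
                       [100, 0, 60, 30],
                       [90, 70, 0, 20],
                       [50, 40, 30, 0]], dtype=float)
alpha_true = np.array([1.0, 1.2, 0.8, 1.5])
beta_true = np.array([2.0, 1.1, 3.0, 1.6])
I_rep = true_flows / alpha_true[np.newaxis, :]
E_rep = true_flows / beta_true[:, np.newaxis]

alpha, beta = correction_factors(I_rep, E_rep, ref=0)
# Averaged reconciled flows: S_ij = 0.5 * (alpha_j * I_ij + beta_i * E_ij)
S = 0.5 * (alpha[np.newaxis, :] * I_rep + beta[:, np.newaxis] * E_rep)
```

With real data the two corrected matrices do not coincide, which is why the averaged matrix S is used as the reconciled estimate.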

b. The Raymer et al. (IMEM) approach

Raymer et al. (2013) use a migration model to predict true migration flows and a measurement model to quantify differences between observations and true flows. In their study, true migration flows are long-term migrations, i.e. relocations for at least 12 months. A true flow can be defined in different ways, just as events can be defined differently, as long as the definition is unambiguous and unique. The measurement model captures the effects of the measurement problems mentioned above. The authors assume initially that migrations are generated by a Poisson process, but they assume that the expected values of migration counts are normally distributed. That approach allows for larger variability than the Poisson distribution, i.e. for overdispersion. The dependent variable of the migration model is log(λ), where λ is the true number of migrations (UN definition) in a given year. λ is origin-destination specific and is different for each calendar year. The true number of migrations in a given year is predicted by a set of covariates. A time-invariant normally distributed random factor (random effect) is introduced to smooth flows across time. The factor induces residual correlation between the same flows at different points in time. Variation in the random factor is restricted to induce a residual correlation between flows in opposite directions. If a flow is larger than predicted by the model, the flow in the opposite direction is also expected to be larger. The parameters of the model were estimated using the Bayesian method. Weakly informative prior distributions were used (normal distributions and gamma distributions with parameters fixed by the authors or drawn from probability distributions). That implies that the predictions of migration are driven mainly by the covariates and that the influence of prior information is limited. The prior distributions are selected for computational convenience only.

To convert the reported data to comply with the definition of migration used in the true data (UN definition), a measurement error model is introduced. The covariates are assumed to be measured correctly, but migration is not measured correctly for reasons listed above. Reported migration data, i.e. the observations, are initially assumed to be generated by a Poisson process and to follow a Poisson distribution with parameter λ* (by country of origin, country of destination and calendar year). The parameter λ* differs from the parameters of the model of the true migration flows (λ) because of measurement errors. Immigrations and emigrations are modeled separately to account for the asymmetry in bilateral migration flow matrices, i.e. substantial differences between immigration data from receiving countries and emigration data from sending countries. Let i denote the sending country and j the receiving country. Let $\lambda_{ij}$ denote the true migration flow from i to j, $Z^{R}_{ij}$ the flow from i to j reported in the receiving country j (immigration data), $Z^{S}_{ij}$ the flow from i to j reported in the sending country i (emigration data), $\lambda^{*R}_{ij}$ the expected number of migrations from i to j recorded in j, and $\lambda^{*S}_{ij}$ the expected number of migrations from i to j recorded in i. The expected number of migrations from i to j, observed in j, is proportional to the true flow:

$$\lambda^{*R}_{ij} = \lambda_{ij}\, a^{R}_{ij} \qquad (9)$$

The proportionality factor $a^{R}_{ij}$ depends on the factors that distort the measurement of immigrations in j. Equation (9) may be written differently, as true flow = factor * data, with factor = $1/a^{R}_{ij}$. The expected number of migrations from i to j, observed in i, is also proportional to the true flow:

$$\lambda^{*S}_{ij} = \lambda_{ij}\, a^{S}_{ij} \qquad (10)$$

The proportionality factor $a^{S}_{ij}$ is a function of the factors that distort the measurements of migration in country i: duration threshold, undercount, coverage, and country-specific level of accuracy of the data collection system.

a. Duration threshold. If the duration threshold is identical to the threshold used in measuring the true flow (12 months in the Raymer et al. study), then the threshold effect on the distortion is 1. If the duration threshold is less than the threshold used for the true data, true migration is overestimated and $a_{ij}$ is larger than one. If the duration threshold exceeds the one used in the true data, true migration is underestimated and $a_{ij}$ is less than one. Five duration threshold parameters are considered, one each for duration 0 (no duration threshold), 3 months, 6 months, 12 months and permanent. The duration threshold of 12 months is the reference category (parameter is 0).

b. Undercount. The undercount effect is large if the undercount is large and small if the undercount is low. IMEM considers two categories of undercount: low and high.

c. Coverage. The coverage effect captures country-specific deficiencies in measuring migration not reflected in the undercount. IMEM considers two types of coverage: standard and excellent. The coverage effect for a country is a normally distributed random variable, with mean and variance functions of the coverage assumed for that country (standard or excellent). To ensure that the random effect is between 0 and 1, a logistic transformation is used. Let $k_i$ denote the normally distributed coverage effect (-∞,+∞) for country i and let $p_i$ denote the associated random effect between 0 and 1. Then $k_i = \mathrm{logit}(p_i)$ and $p_i = 1/[1 + \exp(-k_i)]$. For migration to and from the rest of the world, Raymer et al. assume perfect coverage. They justify that assumption by the more rigorous registration requirements for migrants originating from or departing to countries outside of the EU/EFTA region. That assumption and the justification are far from realistic.

d. Accuracy of the data collection system. A country-specific term is added to capture differences in accuracy of the data collection system, irrespective of duration threshold, undercount and coverage. IMEM distinguishes three types of data collection systems for migration: (1) registers in the Nordic countries, (2) other good register-based systems, and (3) less reliable register-based or survey systems.
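
The proportionality in equations (9) and (10) and the logistic transformation of the coverage effect can be illustrated with a short Python sketch. It is a simplified illustration, not the IMEM model itself: the distortion factors are treated as known constants rather than random variables, only one distortion (the share of migrations reported) is considered, and the function names and the true flow of 1000 are hypothetical.

```python
import math

def inv_logit(k):
    """Map an unbounded coverage effect k to a proportion p in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-k))

def expected_reported(true_flow, a):
    """Equations (9) and (10): expected reported count = true flow * factor."""
    return true_flow * a

def implied_true_flow(reported, a):
    """Inverse relation: true flow = (1 / a) * reported count."""
    return reported / a

true_flow = 1000.0    # hypothetical true flow from i to j
a_receiving = 0.88    # assumed share recorded by receiving country j
a_sending = 0.45      # assumed share recorded by sending country i

reported_in = expected_reported(true_flow, a_receiving)
reported_out = expected_reported(true_flow, a_sending)
print(reported_in, reported_out)
print(implied_true_flow(reported_out, a_sending))
print(inv_logit(0.0))
```

The same true flow yields two different expected counts, which is the asymmetry between immigration and emigration data that the measurement model is designed to explain.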

Raymer et al. elicited opinions of experts on migration statistics to quantify the factors that distort the measurement of migration and to derive the prior for the estimation. Information was obtained from 11 experts using the Delphi method. Experts were invited to provide, for each distortion factor, (a) a range of values for the distortion and (b) an indication of how certain they were about that range. The method is described in detail by Wiśniowski et al. (2013). For instance, consider the duration threshold. Experts were asked by what percentage they expect the level of migration with the six-month criterion to be higher than with the 12-month criterion, which is used to measure the true migration flow. They were asked to give not a single percentage but a range of percentages, e.g. between 15 and 30 percent. The lower bound is $P_1^{(6)}$ and the upper bound is $P_2^{(6)}$. Hence the overcount factor ranges from $dur_1^{(6)} = 1 + P_1^{(6)}$ to $dur_2^{(6)} = 1 + P_2^{(6)}$. The beliefs of experts need to be translated into a probability distribution. The authors considered probability density distributions for which the parameters could easily be calculated. To that end, an auxiliary variable d was introduced:

$$\begin{aligned}
d^{(6)} &= \ln dur^{(6)} \\
d^{(3)} &= \ln dur^{(3)} - d^{(6)} \\
d^{(0)} &= \ln dur^{(0)} - d^{(3)} - d^{(6)} \\
d^{(p)} &= -\ln dur^{(p)}
\end{aligned} \qquad (11)$$

where p denotes ‘permanent’. The expert-specific probability density of the auxiliary variable $d^{(s)}$, with s = {0, 3, 6, p}, is assumed to follow a log-normal probability density distribution. The mean and the standard deviation are estimated from the values of d, derived from the ranges of percentages given by the experts, weighted by elicited certainty levels. Wiśniowski et al. (2013:598) show the equations. The individual densities were used to produce a mixed probability density distribution. Raymer et al. (2013:806) state that the median of the true flow (12-month duration threshold) is 81 percent of the median of the flow measured with the 6-month duration criterion. The median of the true flow would be 51 percent of the median of the flow estimated with no time limit (overestimation 96 percent) and the median of the true flow would be 61 percent of the median of the flow measured with the 3-month criterion. The median of the true flow would be 1.64 times the median of the flow estimated with the ‘permanent’ criterion. A similar procedure was followed for the undercount and the coverage. For the undercount, the beta density was selected. The individual densities were used to produce a mixed density. The mean undercount of immigration was 72 percent with a standard deviation of 18 percent. The mean undercount of emigration was 56 percent with a standard deviation of 22 percent (Wiśniowski et al. 2013:595). The large standard deviation and the flat shape of the distribution of the mixture densities reflect the heterogeneity of expert judgements about the undercount. Raymer et al. (2013) give further results. Experts believe that in countries with low undercount, 88 percent of the immigration and 73 percent of the emigration are reported. They believe that in countries with high undercount, 68 percent of the immigration and 45 percent of the emigration are reported. The lack of consensus among experts was an interesting finding.
Wiśniowski et al. (2013) attribute it to different experiences of the experts with migration statistics. The experts’ beliefs may have been based on the data collection systems they know best. A consequence of the lack of consensus among experts is that the probability distribution, if used as a prior in Bayesian estimates of immigration and emigration, is weakly informative, i.e. not much different from a uniform distribution that attaches equal probabilities to all possible values. Wiśniowski et al. (2013:603) conclude that the expert-based prior densities led to very wide posterior distributions of estimated migration flows. Expert-based prior densities do not produce estimates of migration that are substantially different from estimates based on non-informative or weakly informative priors. The authors list four lessons learned from the elicitation of expert opinions:

a. The form of the prior probability density distribution and the distinction between categories of countries matter.

b. The wording of questions posed to experts is important. Different formulations should be tested. Recently, Hanea et al. (2016) proposed the IDEA protocol as a method for removing linguistic uncertainties in elicitation of expert opinions.

c. Certainty levels are easily misinterpreted. If an expert expects that, in a country, the undercount of immigration is between 20 and 35 percent, and the certainty level is 70 percent, then 30 percent of the immigrations are distributed outside of the range indicated by the expert. Several experts misinterpreted that mechanism.

d. One should be careful in selecting experts. Some invited experts were not convinced that subjective probabilities are useful information for the estimation of migration flows.

The authors do not question the usefulness of expert judgements to complement migration flow data but propose a more thorough assessment of the empirical knowledge experts have and of how that knowledge is translated into subjective estimates of migration flows. In some cases, expert opinions may be replaced by models. In the next section, I discuss and illustrate the use of models to tackle problems currently addressed by involving experts and eliciting their judgments.

6. Modelling measurement errors with auxiliary models

In this section, I argue that, although expert opinions should continue to be utilised to improve the measurement and prediction of international migration in Europe, they cannot replace the use of formal models. The question whether experts produce better predictions than models has occupied scientists for a long time. Armstrong (2001:Section 6.4), who for many years studied the use of expert judgments in forecasting, cites “strong empirical evidence” that models (quantitative methods) are generally less biased and make more efficient use of data. To get more reliable and accurate expert information, Burgman (2016) advocates a change in attitudes towards expert estimates and predictions such that they are “treated with the same reverence as data, subjected to the same kind of cross-examination and verification.” (Burgman 2016:vii). An expert’s opinion is based on a model too: a mental model of true migration flows. Since the ultimate aim is to optimally combine quantitative methods (data and models) and qualitative methods (e.g. elicitation of opinions, expectations and predictions from experts, focus groups and stakeholders), formal models and mental models should be considered. Mental models are outcomes of learning. Learning involves the development of mental models (cognitive schemes), which are representations of structured knowledge. Experts use mental models too and their beliefs and opinions are based on these models.
Experts with more and better structured knowledge about a subject (better subject specialists) and with a strong empirical orientation are more likely to produce better estimates and predictions. When the expert’s knowledge representation includes a deep insight in measurement procedures and the models scientists use to produce estimates and predictions, the judgments may not be much different from the figures produced by the models scientists use. An expert’s degree of confidence in his or her estimates and predictions and his/her cognitive bias are influenced by the mental model. Initiatives to develop structured methods for elicitation, using well-defined protocols, are a first step to make explicit the mental models on which expert judgments are based. Consider one of the measurement problems mentioned above, differences in duration threshold. Wiśniowski et al. (2013:603) describe how they elicit from experts their opinions on the sensitivity of migration counts to duration thresholds, and how they translate that information into probability distributions to be used in estimations of migration flows. The expert opinions are translated into probability distributions via auxiliary variables d(s), which are assumed to follow log-normal distributions. It is not clear what substantive reasons exist for the selection of the log-normal. In this section, I show how correction factors can be obtained from a model of true migration flows. As a reference, I do not use the 12-month criterion, but a 0-month criterion (no time limit). I show that, for the same underlying data-
generating process, i.e. process producing the true data, different results can be obtained depending on the measurement of the process. A measurement model that describes the impact of measurement method on the estimates of relocations, was developed in a project to explore the use of micro-simulation for the harmonization of migration statistics (Nowok 2010, Nowok and Willekens 2011). Assume that people may relocate multiple times during an observation interval and that individuals act independently at the same constant relocation rate. This very simple situation is sufficient to illustrate the effects of differences in duration thresholds. Extensions will be considered at the end of this section. Relocations that satisfy these simple conditions are governed by a Poisson process. The distribution of numbers of relocations during an observation period is given by equation (1). In equation (1), λ is the expected number of relocations in a population during an observation period of length t. Since individuals relocate independently and at the same rate, we may consider the relocation of any single individual. The individual relocation rate is µ. It is the expected number of relocations experienced by an individual during a unit time interval, one year, say. The expected number of relocations the individual makes during a period of length t is µt. Define a migration as a relocation (change of usual residence) that is followed by a minimum duration of stay, the duration threshold. Migration statistics differ in the duration threshold used. Let dm denote the duration threshold. An individual who relocates at time t is recorded as a migrant if he or she resides in the destination continuously for at least dm years. Both actual and intended durations of stay may be used. The probability that an individual experiences n migrations between the onset of observation and time t if the duration threshold is dm is:

$$\Pr\{N(t) = n \mid \mu, d_m\} = \frac{(\mu t z)^n}{n!}\,\exp(-\mu t z) \qquad (12)$$

where $z = \exp(-\mu d_m)$ is the probability of no relocation within $d_m$ years. It measures the proportion of relocations that are migrations, given the duration threshold $d_m$. The migration rate is $z\mu$. The expected number of migrations during the interval of length $t$ is

$$E[N_{d_m}(t)] = \mu t z = \mu t \exp(-\mu d_m) \qquad (13)$$
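Equations (12) and (13) can be evaluated directly. The following is a minimal sketch, not code from the cited project: the function names and the illustration values (relocation rate 0.2, observation period of 5 years, 1-year threshold) are assumptions chosen for this example.

```python
import math


def migration_pmf(n: int, mu: float, t: float, d_m: float) -> float:
    """Probability of n migrations in an interval of length t, as in equation (12)."""
    z = math.exp(-mu * d_m)   # share of relocations that count as migrations
    mean = mu * t * z         # expected number of migrations, equation (13)
    return mean ** n / math.factorial(n) * math.exp(-mean)


def expected_migrations(mu: float, t: float, d_m: float) -> float:
    """Expected number of migrations in an interval of length t, equation (13)."""
    return mu * t * math.exp(-mu * d_m)


if __name__ == "__main__":
    # Hypothetical illustration values, not taken from the paper.
    mu, t, d_m = 0.2, 5.0, 1.0
    for n in range(4):
        print(n, round(migration_pmf(n, mu, t, d_m), 4))
    print("expected migrations:", round(expected_migrations(mu, t, d_m), 4))
```

Summing the probabilities over n returns 1, and the mean of the distribution equals the expectation in equation (13), which is a quick check that the two equations are consistent.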

If a duration threshold of 1 year is used as a reference, as recommended by the United Nations, then a duration threshold of $d_m$ results in a number of migrations experienced by an individual that is $o_{d_m}$ times the number of migrations experienced under the 1-year duration criterion, where $o_{d_m}$ is (Nowok and Willekens, 2011:527):

$$o_{d_m} = \frac{E[N_{d_m}(t)]}{E[N_{d_{12}}(t)]} = \exp[-\mu (d_m - d_{12})] \qquad (14)$$

The overestimation is $100\,(o_{d_m} - 1)$ percent, with $d_m$ and the 12-month reference threshold $d_{12}$ measured in years. It is independent of the length of the observation period $t$. The overestimation for an individual equals the overcount at the population level, i.e. the percentage by which the number of migrations counted in a population is overestimated. For instance, if the relocation intensity is 0.2 and a country uses the 6-month criterion, then $o_{d_m} = 1.10$, indicating that the reported figure overestimates by 10 percent the number of migrations measured following the UN guidelines (12-month criterion). Recording all relocations (no time limit) results in an overestimation by 22 percent of migrations (UN definition). Counting permanent migrations only, which are defined as migrations followed by a stay of at least 5 years (Nowok 2008), results in an underestimation of migrations (UN definition) by 55 percent. If ‘permanent’ means a stay of at least 10 years, the underestimation is 84 percent, i.e. only 16 percent of the migrations (UN definition) are recorded. The overestimation is particularly sensitive to the relocation rate. The higher the rate, the higher the overestimation.

Raymer et al. (2013:807) report that experts judge the number of migrations without time limit (i.e. relocations) to be about twice the number of long-term migrations (1-year criterion). If relocation is governed by a Poisson process with constant relocation rate, the relocation rate should be around 0.7 ($\mu = -\ln z = -\ln 0.5 \approx 0.69$) to produce the expert judgment. That figure means that, on average, an individual relocates roughly every year and a half, which is unrealistic.

Another validity test of the Poisson model is to consider actual data on migration published by countries in Europe. Figures differ for a number of reasons listed above, with differences in duration threshold being only one reason. Consider emigration from Poland. In the period 2002-2007, Poland registered an annual average of 22,306 emigrants to 18 EU and EFTA countries considered by de Beer et al. (2010), whereas the destination countries together registered an annual average of 217,977 immigrants from Poland. Assuming that destination countries report immigrations correctly, the emigration rate of Poland would be about 6 per thousand (in the period considered, the population of Poland was about 38 million). Poland records emigration if the person leaves the country permanently.
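The percentages quoted above follow from equation (14) and can be checked with a few lines of code. This is a minimal sketch: the function name `overcount_factor` and its `d_ref` parameter are assumptions made for this example, and the Polish figures are the ones quoted in the text.

```python
import math


def overcount_factor(mu: float, d_m: float, d_ref: float = 1.0) -> float:
    """Equation (14): migrations counted under threshold d_m relative to d_ref (years)."""
    return math.exp(-mu * (d_m - d_ref))


# Relocation rate used in the worked examples of the text.
MU = 0.2
THRESHOLDS = [("no time limit", 0.0), ("6 months", 0.5), ("5 years", 5.0), ("10 years", 10.0)]

# Relocation rate implied by the expert judgment that relocations are
# about twice the migrations counted under the 1-year criterion.
implied_rate = -math.log(0.5)

# Polish emigration 2002-2007, annual averages quoted in the text.
recorded_by_poland = 22_306
recorded_by_destinations = 217_977
share_recorded = recorded_by_poland / recorded_by_destinations
emigration_rate = recorded_by_destinations / 38_000_000

if __name__ == "__main__":
    for label, d_m in THRESHOLDS:
        factor = overcount_factor(MU, d_m)
        print(f"{label:>14}: factor {factor:.3f}, deviation {100 * (factor - 1):+.1f} percent")
    print(f"implied relocation rate: {implied_rate:.2f}")
    print(f"share of emigrants recorded by Poland: {share_recorded:.3f}")
    print(f"emigration rate per thousand: {1000 * emigration_rate:.1f}")
```

A factor above 1 means the reported figure overstates migration under the UN definition; a factor below 1 means it understates it.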
Given the very low emigration rate of Polish residents, the Poisson model is unable to predict the large difference in recorded migrations in Poland and destination countries. The situation is worse if we consider the migration from Poland to one particular country. Consider migration to Sweden. During the period 2002-2007, Poland recorded an average annual emigration to Sweden of 303 persons, while Sweden recorded an annual average of 3,718 immigrants from Poland (de Beer et al. 2010). Sweden follows the UN guidelines in measuring migration. Given the very low rate of migration of Polish residents to Sweden, the Poisson model is unable to predict the large difference in recorded migrations from Poland to Sweden (Polish emigration data report only 8 percent of emigrants to Sweden if the immigration figures of Sweden are considered accurate). The Poisson model could explain the difference if (a) the migration rate from Poland to Sweden is 0.2, (b) ‘permanent’ does not mean 5 or 10 years but a stay of at least 13.5 years, and (c) other measurement errors have no effect. In that case, the measurement method used by Poland would underreport the true migration flow to Sweden by 92 percent.

The assumption that all individuals have the same relocation rate is not realistic. A large proportion of the population does not consider relocation and is therefore not really at risk of migration. Suppose that 2.5 percent of the residents of Poland consider emigration within a year. These have a much higher emigration rate than the average of the population of Poland (6 per thousand). Since $0.006 = 0.975 \times 0 + 0.025 \times m$, the emigration rate of people considering emigration is $m = 0.006 / 0.025 = 0.24$. Destination countries use different duration thresholds to measure immigration. If a duration threshold of 6 months is an acceptable average and ‘permanent’ emigration from Poland means a stay abroad longer than 10 years, then the proportion of emigrants recorded in Poland is $\exp[-m (d_m - d_{12})] = \exp[-0.24\,(10 - 0.5)] = 0.102$, with the reference threshold set to 0.5 years, which is 10 percent. That figure is a very good approximation of the proportion of emigrants recorded by Poland during the period 2002-2007 (22,306 of 217,977). During the observation period, 1.7 percent of the emigrants from Poland emigrated to Sweden. Suppose residents of Poland have a slight preference for Sweden over other countries in the EU and EFTA region, increasing the emigration rate of those considering emigration to Sweden to 0.27 (instead of 0.24). That emigration rate results in a proportion of emigrants to Sweden recorded by Poland of $\exp[-0.27\,(10 - 1)] = 0.088$, which is close to the proportion observed in the period 2002-2007 (8 percent).

The conclusion is that the Poisson model can yield an accurate estimate of the underreporting of emigration due to differences in duration threshold, if the migration rate does not apply to the total population but to the subpopulation that considers emigration, i.e. the potential migrants. The relocation rate should apply to them and not to people who have no intention to emigrate or have an extremely low risk of emigration. To accurately describe under-reporting or over-reporting, the Poisson model should be extended to a mover-stayer model to incorporate unobserved heterogeneity in a population with respect to the desire to emigrate.

The experts, whose judgments were considered by Wiśniowski et al. (2013) and Raymer et al. (2013), indicate a much larger effect of the duration threshold than produced by the simple Poisson model. Table 1 shows the undercounts estimated by experts, the simple Poisson model with emigration rate of 0.24, and a mixture model, which is an extension of the mover-stayer model. The expert judgments indicate that experts believe that onward or return migration soon after a previous migration is considerably higher than predicted by the Poisson model, which is a reasonable assumption. Unobserved heterogeneity may explain the deviation from the Poisson model.
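The Polish example can be retraced numerically. The sketch below is illustrative only: the function name `recorded_share` is an assumption, and the rates and thresholds are the values quoted in the text (0.24 for potential emigrants in general, 0.27 for Sweden, 0.2 with a 13.5-year threshold for the homogeneous case).

```python
import math


def recorded_share(rate: float, d_recorded: float, d_reference: float) -> float:
    """Share of reference-definition migrations that is still counted
    when the stricter threshold d_recorded (years) is applied."""
    return math.exp(-rate * (d_recorded - d_reference))


# Homogeneous Poisson model: rate 0.2, 'permanent' read as 13.5 years.
homogeneous = recorded_share(0.2, 13.5, 1.0)

# Mover-stayer reading: the rate applies to potential emigrants only.
all_destinations = recorded_share(0.24, 10.0, 0.5)   # 6-month average threshold abroad
sweden = recorded_share(0.27, 10.0, 1.0)             # Sweden uses the 12-month criterion

# Observed shares, from the annual averages quoted in the text.
observed_all = 22_306 / 217_977
observed_sweden = 303 / 3_718

if __name__ == "__main__":
    print(f"homogeneous model, 13.5 years: {homogeneous:.3f}")
    print(f"all destinations: model {all_destinations:.3f}, observed {observed_all:.3f}")
    print(f"Sweden: model {sweden:.3f}, observed {observed_sweden:.3f}")
```

The comparison shows why the rate has to refer to potential migrants: with a population-wide rate of a few per thousand, the exponential factor stays close to 1 for any plausible threshold and cannot produce a recorded share near 10 percent.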
Suppose a small proportion of the potential migrants (‘movers’) (6 percent, say) is very mobile and moves almost every six months (relocation rate is 1.8), while most (94 percent) potential migrants are modestly mobile and migrate every 10 years, on average (migration rate is 0.1), then the true migration (UN definition) as a fraction of the recorded flow is given in the third column of Table 1 (mixture model). The figures are close to the correction factors derived from the expert judgments.

Table 1. True migration flow (UN definition) as fraction of recorded flow. Expert judgments, Poisson model and mixture model

Duration threshold    Expert judgment    Poisson model    Mixture model
No time limit         0.51               0.79             0.51
3 months              0.61               0.84             0.64
6 months              0.81               0.89             0.77
12 months             1.00               1.00             1.00
Permanent (p)         1.64               -                -
  5 years             -                  2.61             1.80
  10 years            -                  8.67             2.98
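The Poisson and mixture columns of Table 1 can be reproduced under one reading of the model: each group of potential migrants contributes migrations at rate $\mu \exp(-\mu d)$ under threshold $d$, and the table entry is the ratio of the rate under the 12-month threshold to the rate under the recording threshold. The sketch below assumes that reading; the function names are hypothetical, and the group shares and rates are those given in the text (6 percent at rate 1.8 and 94 percent at rate 0.1 for the mixture, a single rate of 0.24 for the Poisson model).

```python
import math


def migration_rate(groups, d):
    """Migration rate under threshold d (years) for (share, relocation rate) groups."""
    return sum(share * rate * math.exp(-rate * d) for share, rate in groups)


def true_over_recorded(groups, d, d_ref=1.0):
    """True flow (12-month definition) as a fraction of the flow recorded under d."""
    return migration_rate(groups, d_ref) / migration_rate(groups, d)


POISSON = [(1.0, 0.24)]                  # homogeneous potential migrants
MIXTURE = [(0.06, 1.8), (0.94, 0.1)]     # very mobile and modestly mobile groups

THRESHOLDS = [
    ("No time limit", 0.0),
    ("3 months", 0.25),
    ("6 months", 0.5),
    ("12 months", 1.0),
    ("5 years", 5.0),
    ("10 years", 10.0),
]

if __name__ == "__main__":
    print(f"{'Duration threshold':<20}{'Poisson':>10}{'Mixture':>10}")
    for label, d in THRESHOLDS:
        poisson = true_over_recorded(POISSON, d)
        mixture = true_over_recorded(MIXTURE, d)
        print(f"{label:<20}{poisson:>10.2f}{mixture:>10.2f}")
```

In the mixture, the very mobile group dominates the recorded flow when short thresholds are used but contributes almost nothing under long thresholds, which is what pulls the short-threshold fractions down towards the expert judgments.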

A model that distinguishes between people with and without a desire to emigrate produces true migration flows as fractions of the recorded flows that are similar to the expert judgments. If the population is indeed heterogeneous with respect to the desire to migrate, then part of the difference between the recorded migration flow and the true flow (UN definition) can be attributed to the unobserved heterogeneity. That part is not a measurement problem caused by differences in duration thresholds, but a consequence of misspecification of the migration model. In that case, models of true migration flows, such as the one included in IMEM, should be extended to a mixture model to allow for that unobserved heterogeneity. A well-known example of a mixture model in the study of migration is the mover-stayer model.

The prior probability distribution should also be a mixture distribution. Expert judgments on the proportion of movers in the population may be used to construct the mixture distribution. An effect of population heterogeneity on relocation rates and migration rates is confirmed by the UNECE Task Force on analysis of international migration estimates using different length of stay definitions (UNECE 2012:13f). The duration-of-stay dependence of relocation rates varies between males and females and between nationals and foreigners. The rate also varies between first international relocation and subsequent cross-border relocations. The first relocation is followed by a shorter duration of stay than subsequent relocations. The Task Force also presents, for different countries, the proportions of relocations using the 3-month criterion that are recorded if the 12-month criterion is used. The findings differ greatly between countries.

7. Conclusion and recommendations

In countries of Europe, people feel uncomfortable with the level of immigration. Political parties that promise to regain control over immigration are on the rise. Politicians respond by discussing annual ceilings on the number of immigrants or the net number of migrants. How do they know the numbers? How valid and reliable are the numbers they use? This paper, like other papers on how we know the facts of international migration, paints a rather bleak picture of the state of international migration statistics. The problem was diagnosed almost 50 years ago and became acute when migration became the dominant component of demographic change and a major item on the political agenda. Several initiatives were taken at the national and European (and global) levels to improve the availability, quality and comparability of international migration statistics. The initiatives can be classified in two broad categories.
The first is the improvement of the production of migration statistics by the national statistical offices and other producers of official statistics in Member States, frequently in collaboration with Eurostat and in some cases with members of the European Statistical System in other Member States. It involves the documentation of the data collection process and the harmonization of concepts and measurement methods. In some cases, it also involves the use of mirror data supplied by other countries. The second category of initiatives is the development, by the research community, of statistical methods for estimating bilateral migration flows and for harmonizing available migration data. That often involves using different types of data from multiple sources.

That research produced a broad consensus among scientists that a dual strategy is required to improve statistics on international migration flows in Europe. The first component is that producers of statistics should thoroughly document the procedures they use to collect data and produce migration statistics. That documentation may be accompanied by a risk assessment in which the types and sources of uncertainty in the data and the limitations in the production of statistics are made explicit. The second component of the dual strategy is oriented towards the research community. Models serve as a vehicle to effectively combine and integrate data from different sources and produce accurate and comparable migration estimates and migration statistics. The estimates are synthetic because data from different sources are combined and integrated. They yield harmonized statistics if the estimation procedure accounts for differences in the process of data collection and production of migration statistics. All steps of the estimation methods should be thoroughly documented.

Past research on the estimation of international migration flows, reviewed in this paper, revealed significant progress and a clear direction. A common element in all research is the use of migration flows by countries of immigration and flows by countries of emigration. The first such matrices were published in the mid 1970s, by the United Nations Economic Commission for Europe (UNECE). UNECE obtained the data from national statistical offices by a special request. Later, countries that collected the data provided the data annually. The data revealed that the immigration data and the emigration data are not consistent, that immigration is generally reported more accurately than emigration, and that some countries cannot produce such data or are not able to report immigration and emigration flow data on a regular basis. Initially, the focus of research was reconciliation of immigration and emigration flows. To make the data consistent, country-specific correction factors were estimated and applied to the reported immigration data matrix and a different set to the reported emigration data matrix (Poulain and Wattelar, 1983). The correction factors are such that a measure of distance between the adjusted matrices is minimal, while some constraints imposed on the adjusted data are satisfied. The Euclidean distance was used initially, but later other distance functions were introduced. The adjusted immigration and emigration matrices are not equal. Poulain took the average of the two adjusted matrices. Abel (2010) gave priority to the correction factors for countries of immigration because immigration is generally measured more accurately than emigration. Initially, estimated migration flows were constrained to be equal to the total of the reported immigration flows. Later, additional constraints were imposed, but the basic approach remained constrained optimization. De Beer et al. 
(2010) imposed the constraint that the adjusted immigration matrix and the adjusted emigration matrix have the same marginal totals. Countries differ in the quality of their migration data and some countries do not report migration flows at all. Countries also use duration thresholds that may differ from the 1-year duration of stay criterion recommended by the United Nations and Eurostat. To account for the differences and to assure that the correction factors for countries with good data remain small, stepwise procedures were developed. The correction factors for countries with good data were estimated first, and those for countries with data limitations next. That introduced the need to judge the quality of the migration data reported by statistical offices. Expert judgments were solicited to rank countries by the perceived quality of their migration data. It also led to constraints on the correction factors for countries with good data and restrictions on the adjustments of some of the migration flows. Missing data constituted a separate problem. Some authors omitted countries with missing data. Abel (2010) estimated the missing flows using a spatial interaction model. He applied the EM (expectation-maximization) algorithm to obtain the parameters of the model. Raymer et al. (2013) adopted a similar idea but replaced the constrained optimization by a measurement model. A measurement model accounts explicitly for the sources of distortion in data due to differences in concepts used, measurements and data collection systems, in this case differences in duration thresholds, coverage of migrants, undercount of migration, and accuracy of the data collection mechanism. They also provided measures of uncertainty for all flow estimates and parameters in the model. They adopt a Bayesian approach to integrate the different types of data, covariate information, and prior knowledge. 
The migration model is used to both estimate the missing migration flow data and augment the measurement model. True migration flows that are consistent with the United Nations and Eurostat recommendation for the measurement of international migration (long-term migrations, i.e. migrations with duration threshold of 12 months) are treated as unobserved (latent) variables that need to be estimated from flow data reported by countries of immigration and countries of emigration, covariate information, and expert judgments. Wiśniowski et al. (2013) give a detailed account of how expert judgments are converted into prior distributions for subsequent use in the Bayesian inference.

The research community has followed an impressive trajectory in response to the call for migration data that are trustworthy and that can be used in migration governance and the migration debate. A milestone was the EU Regulation 862/2007 of 11 July 2007 allowing Member States to use scientifically based and well documented statistical estimation methods in the production of migration statistics. The research community responded vigorously and produced the know-how and the technology to generate migration statistics that are harmonized and internally consistent, and accompanied by indicators of the accuracy of the statistics. That represents the state-of-the-art today. It is not the end of the trajectory. Further improvements are envisaged. The pace of improvements will critically depend on cooperation: cooperation among members of the European Statistical System and cooperation between the ESS and the research community. A concerted effort is needed to produce the evidence that allows a debate based on opinions and facts and motivates policies that are responsive to the evidence. The success of a concerted effort will depend on having a shared vision and a clear strategy. The vision is embedded in EU Regulation 862/2007: a combination of high-quality data collection and scientifically proven techniques provides the best guarantee for trustworthy international migration statistics. Since the sources of migration data Member States rely on differ, the outcome will be a migration database that is synthetic, i.e. which combines data from different sources. The database will evolve because of stakeholders’ changing expectations and queries, new sources of data and progress in science and technology.
To master that process and find a proper balance between continuity and change, the perspective of a learning process is recommended. The synthetic database is a representation of reality. It represents a knowledge base for public debate, governance and research. New data may be incorporated (‘assimilated’) in the database without altering the structure of the database. When new data cannot be incorporated in an existing structure, the structure needs to be adjusted (‘accommodation’) (and the model generating the database updated). The information it contains should be reliable but will not be perfect. Therefore, indicators of epistemic uncertainty (ignorance) and aleatory uncertainty (due to randomness) should be part of the database. Approaching the development of a synthetic database as a learning process paves the way to an effective use of insights from cognitive sciences and may guide the collection of new data. The future trajectory involves several specific actions. Most have already been proposed and even advocated by others. The actions include:

1. Identify and document sources of data on international migration. The census and administrative records are main sources. Surveys, in particular household surveys, labour force surveys and designated migration surveys have untapped potential. Enhance migration mainstreaming in labour force surveys (e.g. migration questions and migration modules) and other data collection activities. Although the ESSC adopted a conceptual framework and work programme for migration statistics mainstreaming in 2010, it seems that guidelines and practical tools for mainstreaming migration in data collection have not yet been finalized. Gender mainstreaming may serve as a benchmark for migration mainstreaming (see e.g. European Commission, 2017). Geolocation data generate new data sources for migration (Hughes et al. 2016).

2. Statistical institutes that collect primary data or derive the statistics from primary data should publish detailed metadata on migration concepts and measures, and on the data collection process. The metadata should include a description of adjustment procedures introduced to account for non-reporting and other measurement problems. Scientists, who rely on metadata to develop methods for estimating and forecasting migration, should develop a thorough understanding of the migration data before engaging in estimation and/or forecasting (see also Disney et al., 2015).

3. Use mathematical/statistical models to produce the synthetic database. The distinction between migration model and measurement model (Raymer et al., 2013) is very useful. Migration models predict numbers of migrants by origin and destination, and by migrant attributes, such as sex, age, education. Their policy relevance increases if they include the social and economic situation of migrants (Radermacher and Thorogood 2009). Measurement models should consider coverage, undercount, duration thresholds, accuracy, and other factors that cause observations to differ from true migration flows.

4. Include circular migration in models of migration. Duration thresholds considered in migration models should be flexible to cover permanent migration, short-term migration and circular migration. The modeling can benefit from the procedures developed by National Statistical Institutes for the measurement of short-term and circular migration (see e.g. Johansson and Johansson 2016). UNECE and Eurostat support that development (see e.g. UNECE 2012, 2016; TEMPER, 2015).

5. Develop life history models of migration, in addition to the population-level models in use today. Life history models adopt a longitudinal perspective and predict individual sequences of migrations/relocations and expected durations of stay in destination countries. They provide a logical way to incorporate lifetime migration (place of birth by place of residence), long-term migration, short-term migration, repeat migrations and circular migration in a single model in a coherent and consistent way. The UNECE Task Force on Measuring Circular Migration supports a life-history approach: “In the ideal situation, the complete migration history of a person would be available. This would make it easy to determine whether a person qualifies as a circular migrant.” (UNECE 2016:20). Since data on individual migration histories will always be incomplete, truncation and censoring need to be dealt with (see also Beauchemin and Schoumaker 2016:194). The theory of counting processes is the appropriate statistical theory for dealing with truncation and censoring (Aalen et al. 2008). Recently, DeWaard et al. (2017) used a life history model to estimate expected durations of stay in the EU-15 by migrants from new-accession countries.

6. Life history models may be extended to incorporate transnational activities. For instance, a person may obtain education in one country, work and raise children in another country and retire in a third. Activities are intertwined with migration. A temporary or circular migrant engages in different activities than a permanent migrant. Life history models enable the integration of different types of relocations in the human life course and the assessment of how migration interacts with education, income generating activities, partnerships, the social network, and other aspects of life. Such an extension offers an analytical framework for the study of multi-sited individual and social lives (IOM 2010b).

7. Approach the development of the synthetic database as a learning process. A learning process builds representations of real world phenomena and improves the representations in light of new evidence and experiences. If one accepts that building a synthetic database is a learning process, then insights from cognitive science can help produce better data on migration.

8. The synthetic database is a step towards a smart or intelligent database. Databases may be trained to recognize data types, suggest estimation methods and signal new trends and discontinuities. The learning process may also point to individual decision processes and social processes that generate migration. Decision rules may be identified and incorporated in the database by replacing the statistical models by agent-based models (ABMs). Agent-based models simulate how agents process information (signals) from multiple sources in their environment and integrate that knowledge into a knowledge structure that is the basis for purposeful action (see Klabunde and Willekens 2016a for a review of agent-based models of migration and Willekens et al. 2017 and Klabunde et al. 2017 for recent illustrations).

9. Formalize learning. A formal method of learning that is particularly useful is the Bayesian method or Bayesian inference. A critical aspect of the Bayesian approach is to translate information or knowledge into probability distributions. Official statisticians, who ultimately are responsible for developing the synthetic database, should be trained in the Bayesian method. Bayesian statisticians, on the other hand, should reach out to official statisticians and emphasize the logic of the Bayesian method.

10. Stimulate collaboration between National Statistical Institutes of sending and receiving countries to increase the comparability of migration data and enhance the harmonization of data collection procedures and estimation methods. Promote exchange of data and the sharing of good practices. Secure adequate funding and training. UNECE (2010) developed guidelines on using data from destination countries to improve emigration statistics of origin countries (see also the UNECE site on migration statistics http://www.unece.org/stats/migration.html) and Eurostat established a secured web repository for exchanging migration data before their release (Kotowska and Villán Criado 2015). The 2016 New York Declaration for Refugees and Migrants also calls for enhanced international cooperation to improve migration data (United Nations General Assembly 2016).

11. Improve communication of migration data and publicize good practices. The Conference of European Statisticians’ initiative to publish key recommendations and good practices in the communication of population projections shows the right direction (UNECE-CES Task Force on Population Projections 2016).

12. Bridge the gap between producers of statistics and scientists. Kotowska and Villán Criado, members of the European Statistical Advisory Committee, recommend that Eurostat takes the initiative and the lead to bridge that gap. Eurostat is indeed very well positioned and has demonstrated in the past decades that it can bring together scientists and producers of official statistics (Kotowska and Villán Criado, 2015).

13. Methods for estimating emigration are particularly rare and should receive more attention. A very good point of departure is the report of the Suitland Working Group (Jensen 2013). Labour force surveys, household surveys and special migration surveys can be used to estimate rates of emigration. Wiśniowski (2017) uses Labour Force Surveys of Poland and the United Kingdom to estimate migration flows between the two countries. To identify emigrations, household surveys should collect data on the country of residence of household members living abroad, their age and the age at emigration. The sample design should assure enough observations to yield sufficiently precise estimates. Willekens et al. (2017) review the literature on the estimation of emigration. In addition, they use the Survey on Migration between Africa and Europe (MAFE) to estimate emigration rates from the Dakar region, Senegal to Europe, accounting for the complex sample design of the MAFE survey.

14. Produce reliable data on the number of irregular immigrants and integrate the data in the synthetic database. Reliable data on irregular migrants in the European Union do not exist. As border crossings by third country nationals are currently not registered, it is not possible to establish lists of overstayers. It is generally agreed that the majority of the 1.9 to 3.8 million irregular immigrants within the EU overstay their Schengen visa (European Commission, 2013; see footnotes 10 and 11). The European Commission proposed the establishment of an advanced passenger information system for non-EU nationals travelling to the EU (European Commission, 2013, 2016b). The system includes an Entry-Exit System (EES), which registers entries and exits, and a European Travel Information and Authorization System (ETIAS). The system is modelled after the Electronic System for Travel Authorization (ESTA) and National Security Entry/Exit System (NSEERS) in the United States. The EES includes a mechanism to identify persons overstaying their authorized stay. In May 2015, a pilot EES project was started in Portugal. The system is believed to contribute to smart border management, but the experiences of the United States indicate the many challenges that emerge and need to be resolved.

15. Initiate and support a global, concerted effort to collect data on the root causes of international migration aimed at interventions that address emigration decisions and their motivating factors rather than the consequences of the decisions. Several recommendations were made for a World Migration Survey (see Section 3 of the paper), building on the knowledge and experience, gathered across the world in recent migration surveys of limited scale.

References Aalen, O.O. 1975. Statistical inference for a family of counting processes. PhD thesis, Berkeley: University of California.

Aalen, O.O., Ø. Borgan and H.K. Gjessing. 2008. Survival and event history analysis. A process point of view. New York: Springer

Abel, G.J. 2010. “Estimation of international migration flow tables in Europe.” Journal of the Royal Statistical Society A 173(4): 797-825.

Abel, G.J. 2013. “Estimating global migration flow tables using place of birth data.” Demographic Research 28(18): 504-546.

Abel, G.J. 2016. Estimates of global bilateral migration flows by gender between 1960 and 2015. Working Papers 2/2016, Vienna: Vienna Institute of Demography.

Abel, G.J. and N. Sander. 2014. “Quantifying global international migration flows.” Science, 343(6178): 1520-1522.

10 The figure is not repeated in the 2016 version of the proposal.

11 The problem of repeatedly quoting generally accepted figures is illustrated in discussions in the United States about the number of unauthorized migrants. In the United States, it is believed that 40 percent of the estimated 11 million unauthorized residents are persons who overstayed their visa. The widely cited statistic is from a rough estimate in 1997 by Warren of the Immigration and Naturalization Service, which was later subsumed into the Department of Homeland Security. See Pew Research Center (2006) for the background of the 1997 study and Warren’s new estimates in 2013 for the period 1990 to 2010. Information on overstay is not collected because people leaving the U.S. are not required to check with immigration officials. An estimated two thirds of the persons overstaying their visa remain in the United States for more than one year (Government Accountability Office 2013). In January 2016 the U.S. Department of Homeland Security estimated that 0.9 percent of the 45 million people who entered the United States in fiscal year 2015 overstayed their visa (U.S. Department of Homeland Security 2016).

Andersen, P.K., Ø. Borgan, R. Gill and N. Keiding. 1993. Statistical models based on counting processes. New York: Springer

Armstrong, J.S. 2001. “Standards and practices for forecasting.” In: Principles of forecasting: A handbook for researchers and practitioners, edited by J.S. Armstrong, 679-732. Norwell, MA: Kluwer Academic Publishers (Springer).

Azose, J.J. and A.E. Raftery. 2015. “Bayesian probabilistic projection of international migration.” Demography 52:1627-1650.

Azose, J.J., H. Ševčíková and A.E. Raftery. 2016. “Probabilistic population projections with migration uncertainty.” PNAS (Proceedings of the National Academy of Sciences of the United States of America), 113(23): 6460-6465.

Beauchemin, C. 2010. MAFE. Migrations between Africa and Europe. Final report. URL: http://cordis.europa.eu/publication/rcn/16041_en.html

Beauchemin, C. 2013. Statement prepared for the Informal Hearings for High-level Dialogue on International Migration and Development (New York, July 15, 2013). Paris: International Union for the Scientific Study of Population (IUSSP)

Beauchemin, C. 2014. “A manifesto for quantitative multi-sited approaches to international migration.” International Migration Review, 48(4): 921-938

Beauchemin, C. and B. Schoumaker 2016. Micro methods: longitudinal surveys and analysis. In: International Handbook of Migration and Population Distribution. Edited by M.J. White, 175-204. International Handbooks of Population 6. Dordrecht: Springer.

Betts, A. 2011. Global migration governance. Oxford: Oxford University Press

Bijak, J. 2011. Forecasting international migration in Europe. A Bayesian view. New York: Springer

Bijak, J. and J. Bryant 2016. “Bayesian demography 250 years after Bayes.” Population Studies 70(1): 1-19.

Bijak, J. and A. Wiśniowski 2010. “Bayesian forecasting of immigration to selected European countries by using expert knowledge.” Journal of the Royal Statistical Society A., 173(4):775-796. [DOI 10.1111/j.1467-985X.2009.00635.x]

Bilgili, Ö. 2014. “Migrants' multi-sited social lives.” Comparative Migration Studies, 2(3): 283-304.

Bilsborrow, R.E., G. Hugo, A.S. Oberai and H. Zlotnik 1997. International migration statistics: Guidelines for improving data collection systems. Geneva: International Labour Organization

Bilsborrow, R.E. 2016. Concepts, definitions and data collection approaches. In: International Handbook of Migration and Population Distribution. Edited by M. White, International Handbooks of Population 6, 31-40. Dordrecht: Springer.

Boswell, C. 2016. Report of the conference “Understanding and tackling the migration challenge: the role of research”, Brussels, 4-5 February 2016. European Commission, DG Research and Innovation. DOI:10.2777/111442 (PDF). URL: https://ec.europa.eu/research/social-sciences/pdf/other_pubs/migration_conference_report_2016.pdf#view=fit&pagemode=none

Brierley, M.J., J.J. Forster, J.W. McDonald and P.W.F. Smith 2008. Bayesian estimation of migration flows. In: International migration in Europe. Data, models and estimates. Edited by J. Raymer and F. Willekens 149-174. Chichester: Wiley.

Burgman, M.A. 2016. Trusting judgements. How to get the best out of experts. Cambridge: Cambridge University Press

Brooks, S., A. Gelman, G.L. Jones and X-L. Meng eds. 2011. Handbook of Markov Chain Monte Carlo. Boca Raton: Chapman & Hall (Taylor & Francis).

Caarls, K. 2016. NORFACE Research Programme on Migration. Migration in Europe: social, economic, cultural and policy dimensions. 2009-2014. Summary Report. URL: http://www.norface.net/wp-content/uploads/2016/06/Summary-Report-NORFACE.pdf See also http://cordis.europa.eu/result/rcn/160917_en.html

Cantisani, G. and M. Poulain 2006. “Statistics on population with usual residence.” In: THESIM: Towards harmonised European statistics on international migration. Edited by M. Poulain, N. Perrin and A. Singleton, 181-201. Louvain-la-Neuve: Presses Universitaires de Louvain.

Cantisani, G., S. Farid, D. Pearce and N. Perrin 2009. Guide on the compilation of statistics on international migration in the Euro-Mediterranean region, Publication MEDSTAT II, Paris: Ed. ADETEF URL: http://www.unhcr.org/50a4f84d9.pdf

Chater, N., J.B. Tenenbaum and A. Yuille 2006. “Probabilistic models of cognition: conceptual foundations.” Trends in Cognitive Sciences, 10(7): 287-291.

Cohen, J., M. Roig, D.C. Reuman and C. GoGwilt 2009. “International migration beyond gravity: a statistical model for use in population projections.” PNAS (Proceedings of the National Academy of Sciences of the USA), 105(40):15269-15274.

Congdon, P. 1993. “Approaches to modelling overdispersion in the analysis of migration.” Environment and Planning A, 25(10): 1481-1510

Congdon, P. 2001. Bayesian statistical modelling. Chichester: Wiley

Courgeau, D. 1973. “Migrants et migrations.” Population 1:95-129. English version published as Population Selected Papers No. 3, October 1979.

Davies, R.B. and C.M. Guy. 1987. “The statistical modeling of flow data when the Poisson assumption is violated.” Geographical Analysis, 19(4): 300-314.

De Beer, J., J. Raymer, R. van der Erf and L. van Wissen 2010. “Overcoming the problems of inconsistent international migration data: a new method applied to flows in Europe.” European Journal of Population, 26: 459-481.

De Brauw, A. and C. Carletto 2012. Improving the measurement and policy relevance of migration information in multi-topic household surveys. Washington D.C.: Living Standard Measurement Study, World Bank, URL: [http://econ.worldbank.org/WBSITE/EXTERNAL/EXTDEC/EXTRESEARCH/EXTLSMS/0,,contentMDK:21610679~menuPK:4538539~pagePK:64168445~piPK:64168309~theSitePK:3358997,00.html]

DeWaard, J., K. Kim and J. Raymer 2012. “Migration systems in Europe: Evidence from harmonized flow data.” Demography, 49(4): 1307-1333.

DeWaard, J., J.T. Ha, J. Raymer and A. Wiśniowski 2017. “Migration from new-accession countries and duration expectancy in the EU-15: 2002-2008.” European Journal of Population, 33(1):33-53.

Disney, G. 2014. Model-Based Estimates of UK Immigration. PhD Thesis, Southampton: University of Southampton

Disney, G., A. Wiśniowski, J.J. Forster, P.W.F. Smith and J. Bijak 2015. Evaluation of existing migration forecasting methods and models. Report for the Migration Advisory Committee, ESRC Centre for Population Change. Southampton: University of Southampton.

Dumont, J.-C. and G. Lemaitre 2005. “Counting immigrants and expatriates in OECD countries.” OECD Economic Studies, 3(1):49-83. DOI: http://dx.doi.org/10.1787/eco_studies-v2005-art3-en

European Commission 2009. The production method of EU statistics: a vision for the next decade. Communication from the Commission to the European Parliament and the Council of 10 August 2009. COM(2009) 404 final. URL: http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=COM:2009:0404:FIN:En:PDF

European Commission 2013. Proposal for a Regulation of the European Parliament and of the Council establishing an Entry/Exit System (EES) to register entry and exit data of third country nationals crossing the external borders of the Member States of the European Union. COM(2013) 95 final. URL: http://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex%3A52013PC0095

European Commission 2016. Research & innovation projects in support of European policy: migration and mobility. Brussels: European Commission, DG Research and Innovation. EUR 27592 EN

European Commission 2016b. Proposal for a Regulation of the European Parliament and of the Council establishing an Entry/Exit System (EES) to register entry and exit data and refusal of entry data of third country nationals crossing the external borders of the Member States of the European Union and determining the conditions for access to the EES for law enforcement purposes and amending Regulation (EC) No 767/2008 and Regulation (EU) No 1077/2011. COM(2016) 194 final. 2016/0106 (COD). 6th April 2016. URL: http://ec.europa.eu/transparency/regdoc/?fuseaction=list&coteId=1&year=2016&number=194&version=ALL&language=en

European Commission 2017. 2017 Report on equality between women and men in the EU. Brussels: European Commission, Directorate-General for Justice and Consumers. doi:10.2838/52591

European Parliament 2017. Amendment by the European Parliament to the Commission proposal amending Regulation 99/2013 on the European Statistical Programme 2013-17, by extending it to 2020. AM\P8_AMA(2017)0158(002-002)_EN.docx. Available at http://www.europarl.europa.eu/sides/getDoc.do?type=AMD&reference=A8-2017-0158&format=PDF&language=EN&secondRef=002-002

European Statistical System 2015. ESS vision 2020. Building the future of European Statistics. Luxembourg: Publications Office of the European Union. Available at http://ec.europa.eu/eurostat/web/ess/about-us/ess-vision-2020

Evans, M., N. Hastings and B. Peacock 2000. Statistical distributions. Third Edition. New York: Wiley

Fiorio, L., G. Abel, J. Cai, E. Zagheni, I. Weber and G. Vinué 2017. “Using Twitter data to estimate the relationship between short-term mobility and long-term migration.” In: Proceedings of the 2017 ACM on Web Science (WebSci ’17, Troy, New York), DOI: http://dx.doi.org/10.1145/3091478.3091496

Flowerdew, R. and M. Aitkin 1982. “A method of fitting the gravity model based on the Poisson distribution.” Journal of Regional Science 22:191-202.

Frank, M. 2016. “Was Piaget a Bayesian? Analogies between Piaget’s theory of development and formal elements in the Bayesian framework.” Blog. http://babieslearninglanguage.blogspot.be/2016/04/was-piaget-bayesian.html

Gerland, P. 2015. “Migration, mobility and big data. An overview.” Paper presented at the GMG International Conference “Harnessing migration, remittances and diaspora contributions for financing sustainable development”, New York, 26-27 May 2015. URL: http://www.globalmigrationgroup.org/sites/default/files/GMG_2015_UNPD-PG_Migration.pptx

Gopnik, A. and E. Bonawitz 2015. “Bayesian model of child development.” WIREs Cognitive Science, 6:75-86.

Gopnik, A. and J.B. Tenenbaum. 2007. “Bayesian networks, Bayesian learning and cognitive development.” Developmental Science, 10(3):281-287

Government Accountability Office 2013. Overstay enforcement. Additional actions needed to assess DHS’s data and improve planning for a Biometric Air Exit Program. Report to Congressional Requesters, GAO-13-683, Washington D.C.: Government Accountability Office URL: http://www.gao.gov/assets/660/656316.pdf

Hanea, A.M., M.F. McBride, M.A. Burgman and others 2016. “Investigate Discuss Estimate Aggregate for structured expert judgement.” International Journal of Forecasting, 33(1):267-279 URL: http://dx.doi.org/10.1016/j.ijforecast.2016.02.008

Howson, C. and P. Urbach 1989. Scientific reasoning. The Bayesian approach. La Salle, Illinois: Open Court Publishing Company

Hughes, C., E. Zagheni, G. Abel and others 2016. Inferring migrations: Traditional methods and new approaches based on mobile phone, social media, and other big data. Feasibility study on inferring (labour) mobility and migration in the European Union from big data and social media data. Report prepared for the European Commission, Programme for Employment and Social Innovation “EaSI” (2014-2020). Brussels: European Commission. Catalog No.: KE-02-16-632-EN-N URL: http://ec.europa.eu/social/main.jsp?catId=738&langId=en&pubId=7910&type=2&furtherPubs=yes

IOM 2010a. “Migration and transnationalism: opportunities and challenges. Background paper for the workshop ‘Migration and transnationalism: opportunities and challenges’”, Geneva, 9-10 March 2010. Geneva: International Organization for Migration, International Dialogue on Migration URL: http://www.iom.int/idmtransnationalism

IOM 2010b. “Final report of the workshop ‘Migration and transnationalism: opportunities and challenges’”, Geneva, 9-10 March 2010. Geneva: International Organization for Migration, International Dialogue on Migration, URL: http://www.iom.int/idmtransnationalism

Jacobs, R.A. and J.K. Kruschke 2011. “Bayesian learning theory applied to human cognition.” WIREs Cognitive Science, 2:8-21.

Jensen, E.B. 2013. A review of methods for estimating emigration. Report of the Suitland Working Group on migration statistics, Washington D.C.: Population Division, U.S. Census Bureau.

Johansson, L. and T. Johansson 2016. “Register for mapping circular migration.” Statistics Sweden Dokumenttyp 2015-12-14. Paper presented at the European Population Conference, Mainz, August 2016. Stockholm: Statistics Sweden.

Kupiszewska, D. and B. Nowok 2008. “Comparability of statistics on international migration flows in the European Union.” In: International migration in Europe. Data, models and estimates. Edited by J. Raymer and F. Willekens, 41-72. Chichester: Wiley.

Kelly, J. 1987. “Improving the comparability of international migration statistics: The contribution of the Conference of European Statisticians from 1971 to date.” International Migration Review 21(4):1017-1037.

King, R. and A. Lulle 2016. Research on migration: facing realities and maximising opportunities. A policy review. Brussels: European Commission, DG Research and Innovation DOI: 10.2777/414370 (PDF). URL: https://ec.europa.eu/research/social-sciences/pdf/policy_reviews/ki-04-15-841_en_n.pdf

Klabunde, A. and F.J. Willekens. 2016. “Decision-making in agent-based models of migration: state of the art and challenges.” European Journal of Population, 32(1):73-97 URL: http://link.springer.com/article/10.1007/s10680-015-9362-0

Klabunde, A., S. Zinn, F. Willekens and M. Leuchter 2017. “Multistate modelling extended by behavioural rules: An application to migration.” Population Studies, in print.

Kotowska, I.E. and I. Villán Criado 2015. Migration/migrants/integration. European Statistical Advisory Committee, Doc 2015/1176 (draft).

Kraler, A. and D. Reichel 2010. Statistics on migration, integration and discrimination in Europe. PROMINSTAT final report. URL: http://cordis.europa.eu/docs/publications/1243/124376691-6_en.pdf and http://www.prominstat.eu/drupal/node/64

Kraszewska, K. and D. Thorogood 2010. Migration statistics mainstreaming. Working Paper, Joint UNECE/Eurostat Work Session on Migration Statistics, Geneva, 14-16 April 2010. URL: http://www.unece.org/stats/documents/2010.04.migration.html#/

Laczko, F. and M. Rango 2014. “Can big data help us achieve a ‘migration data revolution’?” Migration Policy Practice (IOM), 4(2):20-29.

Lanzieri, G. 2014a. “Filling the ‘migration gaps’ - can research outcomes help us improve migration statistics?” Prepared by the Statistical Office of the European Union (Eurostat) and presented at the Sixty-second plenary session of the Conference of European Statisticians, Paris, 9-11 April 2014. ECE/CES/2014/31 URL: http://www.unece.org/stats/documents/2014.04.ces.html#/ and https://www.researchgate.net/publication/260096802_Filling_the_%27migration_gaps%27_-_can_research_outcomes_help_us_improve_migration_statistics

Lanzieri, G. 2014b. “Test of an estimation method for annual migration flows between EU-EFTA countries.” Prepared by the Statistical Office of the European Union (Eurostat) and presented at the Sixty-second plenary session of the Conference of European Statisticians, Paris, 9-11 April 2014. Working Paper No. 9. URL: https://www.unece.org/fileadmin/DAM/stats/documents/ece/ces/ge.10/2014/mtg1/WP_9_Eurostat.pdf.

Lanzieri, G. 2017a. “Summary methodology of the 2015-based population projections. Technical note.” Luxembourg: Eurostat ESTAT/F-2/GL, 3 March 2017. Available at http://ec.europa.eu/eurostat/cache/metadata/Annexes/proj_esms_an1.pdf

Lanzieri, G. 2017b. “Methodology for the migration assumptions in the 2015-based population projections. Technical note.” Luxembourg: Eurostat ESTAT/F-2/GL, 5 July 2017. Available at http://ec.europa.eu/eurostat/cache/metadata/Annexes/proj_esms_an3.pdf

Liu, M.M., M.J. Creighton, F. Riosmena and P. Baizán 2016. “Prospects for the comparative study of international migration using quasi-longitudinal micro-data.” Demographic Research 35(26):745-782.

MEDSTAT Committee for the Coordination of Statistical Activities 2011. The HIMS project (Household International Migration Survey). MEDSTAT Committee for the Coordination of Statistical Activities, 18th Session Luxembourg, September 2011. URL: http://unstats.un.org/unsd/accsub/2011docs-18th/SA-2011-24-HIMS-Eurostat.pdf. See also the Eurostat website: http://ec.europa.eu/eurostat/web/european-neighbourhood-policy/enp-south/med-hims

Miller, P.H. 1983. Theories of developmental psychology. San Francisco: W.H. Freeman and Co.

Nowok, B. 2008. “Evolution of migration statistics in selected Central European countries.” International migration in Europe. Data, models and estimates edited by J. Raymer and F. Willekens. 73-87. Chichester: Wiley.

Nowok, B. 2010. Harmonization by simulation: a contribution to comparable international migration statistics in Europe. Amsterdam: Rozenberg Publishers. URL: http://www.rug.nl/research/ursi/prc/research/popspace/migrationeurope?lang=en

Nowok, B. and F. Willekens 2011. “A probabilistic framework for harmonisation of migration statistics.” Population, Space and Place 17:521-533.

ONS 2015. Long-term international migration estimates. Methodology document. London: Office for National Statistics.

ONS 2016. Note on the difference between National Insurance number registrations and the estimate of long-term international migration: 2016. London: Office for National Statistics, 12 May 2016. URL: https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/internationalmigration/articles/noteonthedifferencebetweennationalinsurancenumberregistrationsandtheestimateoflongterminternationalmigration/2016

Perfors, A., J.B. Tenenbaum, T.L. Griffiths and F. Xu 2011. “A tutorial introduction to Bayesian models of cognitive development.” Cognition 120:302-321.

Pew Research Center 2006. Modes of entry for the unauthorized migrant population. Fact sheet. URL: http://www.pewhispanic.org/2006/05/22/modes-of-entry-for-the-unauthorized-migrant-population/

Pisică, S. 2016. “Intra-EU data exchange - a method for improving migration statistics.” Paper presented at the Eurostat conference “Towards more agile social statistics”, Luxembourg, 28-30 November 2016. Available at http://socialstats2016.eu/

Poulain, M. 1991. “Un projet d’harmonisation des statistiques de migration internationales au sein de la Communauté Européenne.” Revue Européenne des Migrations Internationales 7(2):115-138

Poulain, M. 1993. “Confrontation des Statistiques de Migrations Intra-Européennes: Vers plus d'Harmonisation? (Confronting the statistics on inter-European migration: Towards a greater harmonization?)” European Journal of Population 9(4): 353-381.

Poulain, M. 2008. “European migration statistics: definitions, data and challenges.” In: Mapping Linguistic Diversity in Multicultural Contexts. Edited by M. Barni and G. Extra, 43-66. Berlin/New York: Mouton de Gruyter.

Poulain M. and C. Wattelar 1983. “Les migrations intra-européennes: à la recherche d'un fil d'Ariane au travers des 21 pays du Conseil de l'Europe (Migration between the 21 countries members of the Council of Europe: Searching for the best estimation).” Espace, Populations et Sociétés, 1(2): 11-26.

Poulain, M. and L. Dal 2008. Estimation of flows within the intra-EU migration matrix. Report for the MIMOSA project. Available at http://mimosa.cytise.be/Documents/Poulain_2008.pdf

Poulain, M. and A. Herm 2013. “Central population registers as a source of demographic statistics in Europe.” Population 68(2):215-247.

Poulain, M., N. Perrin and A. Singleton eds. 2006. THESIM: Towards harmonised European statistics on international migration. Louvain-la-Neuve: UCL Press.

Radermacher, W. and D. Thorogood 2009. “Meeting the growing needs for better statistics on migrants.” Paper presented during DGINS Conference “Migration – Statistical Mainstreaming”, Malta, 30 September – 1 October. URL: http://ec.europa.eu/eurostat/documents/1001617/4339944/mainstreaming-w-radermacher.pdf/e3edaf52-d5f6-471d-a26e-6f3e4a4516f0

Ravlik, M. 2014. Determinants of international migration: a global analysis. Higher School of Economics Research Paper WP BRP 52/SOC/2014. Moscow: Higher School of Economics (HSE). Available at SSRN: https://ssrn.com/abstract=2504441 or http://dx.doi.org/10.2139/ssrn.2504441

Raymer, J. (on behalf of the IMEM team) 2012. “Information exchange and modelling: solutions to imperfect data on population movements.” Economic Commission for Europe, Conference of European Statisticians, Work Session on Migration Statistics, 17-19 October 2012, Geneva.

Raymer, J. and F. Willekens eds. 2008. International migration in Europe. Data, models and estimates. Chichester: Wiley

Raymer, J., A. Wiśniowski, J.J. Forster, P.W.F. Smith and J. Bijak 2013. “Integrated modeling of European migration.” Journal of the American Statistical Association 108(503):801-819.

Robert, C. and G. Casella 2010. Introducing Monte Carlo methods with R. New York: Springer.

Romanian Institute for Research on National Minorities 2014. National Policy recommendations on the enhancement of migration data for Romania. Report prepared in the project ‘SEEMIG Managing Migration and Its Effects – Transnational Actions Towards Evidence Based Strategies’. URL: http://www.seemig.eu/downloads/outputs/SEEMIGPolicyRecommendationsRomania.pdf

Schoorl, J.J., L. Heering, I. Esveldt, G. Groenewold, R. van der Erf, A. Bosch, H. de Valk and B. de Bruijn 2000. Push and pull factors of international migration. A comparative report. Luxembourg: Eurostat. URL: https://www.nidi.nl/shared/content/output/2000/eurostat-2000-theme1-pushpull.pdf

Schoumaker, B. and C. Beauchemin 2015. “Reconstructing trends in international migration with three questions in household surveys: Lessons from the MAFE project.” Demographic Research 32(35): 983-1030

Skaliotis, M. and D. Thorogood 2007. “Migration statistics and globalisation: challenges for the European Statistical System.” Paper presented at the 93rd DGINS Conference “The ESS response to globalisation. Are we doing enough?”, Budapest, 20-21 September 2007. URL: http://www.ksh.hu/pls/ksh/docs/eng/dgins/programme.html

TEMPER Team, 2015. Report of the International Workshop on Methodological Challenges for the Study of Return and Circular Migration. Event Review Series No. 1. Available at http://www.temperproject.eu/wp-content/uploads/2015/06/Event-Review-1-2015.pdf

Tourmen, C. 2016. “With or beyond Piaget? A dialogue between new probabilistic models of learning and the theories of Piaget.” Human Development, 59:4-25.

UNECE 2010. “Guidelines for exchanging data to improve emigration statistics.” Paper prepared for the Task Force “Measuring emigration using data collected by the receiving country.”, Geneva: United Nations Economic Commission for Europe.

UNECE 2012. Final Report of the UNECE Task Force on analysis of international migration estimates using different length of stay definitions. UNECE, Conference of European Statisticians, 60th plenary session, Paris, 6-8 June 2012. URL: http://www.unece.org/stats/migration/estimates.html

UNECE 2016. Defining and measuring circular migration. Prepared by the UNECE Task Force on Measuring Circular Migration, ECE/CES/STAT/2016/5 Geneva: United Nations Economic Commission for Europe. http://www.unece.org/index.php?id=44717

UNECE-CES Task Force on Population Projections 2016. Key recommendations and good practices in the communication of population projections. Geneva: Conference of European Statisticians, April 2016. URL: https://www.unece.org/fileadmin/DAM/stats/documents/ece/ces/ge.11/2016/WP_28_TF_Report_v5_WithAppendix_.pdf

United Nations 1998. Recommendations on statistics of international migration. Statistical Papers Series M, No. 58, Rev. 1. New York: Statistics Division, United Nations Statistical Office

United Nations General Assembly 2016. New York Declaration for Refugees and Migrants. UN General Assembly, A/71/L.1, 13 September 2016. URL: http://refugeesmigrants.un.org/declaration

U.S. Department of Homeland Security 2016. Entry/Exit overstay report. Fiscal year 2015. Washington D.C.: U.S. Department of Homeland Security. URL: https://www.dhs.gov/publication/entryexit-overstay-report

Valente, P. 2010. “Census taking in Europe: how are populations counted in 2010?” Population and Societies (Paris: Institut National d’Études Démographiques (INED)). No. 467. URL: https://www.ined.fr/fichier/s_rubrique/19135/pesa467.en.pdf

Van Dalen, H.P., G. Groenewold and J.J. Schoorl. 2005. “Out of Africa: what drives the pressure to emigrate?” Journal of Population Economics 18:741-778.

Van der Erf, R. and N. Van der Gaag 2007. An iterative procedure to revise available data in the double entry migration matrix for 2002, 2003 and 2004. Discussion Paper, Netherlands Interdisciplinary Demographic Institute, The Hague. Available at http://mimosa.cytise.be/

Willekens, F. 1983. “Log-linear modeling of spatial interaction.” Papers of the Regional Science Association 52:187-205

Willekens, F. 1994. “Monitoring international migration flows in Europe. Towards a statistical data base combining data from different sources.” European Journal of Population 10(1): 1-42.

Willekens, F. 2008. “Models of migration: observations and judgements.” In: International migration in Europe. Data, models and estimates. Edited by J. Raymer and F. Willekens, 117-147. Chichester: Wiley.

Willekens, F. 2014. Multistate analysis of life histories with R. Cham: Springer

Willekens, F. 2016a. “Migration flows: Measurement, analysis and modeling.” International Handbook of Migration and Population Distribution edited by M. White. International Handbooks of Population 6. Dordrecht: Springer: 225-241.

Willekens, F. 2016b. “The decision to emigrate: A simulation model based on the theory of planned behaviour.” Agent-based modelling in population studies: concepts, methods and applications edited by A. Grow and J. van Bavel. 257-299. Dordrecht: Springer.

Willekens, F. and J. Raymer 2008. “Conclusion.” In: International migration in Europe. Data, models and estimates. Edited by J. Raymer and F. Willekens, 359-369. Chichester: Wiley.

Willekens, F., D. Massey, J. Raymer and C. Beauchemin 2016. “International migration under the microscope.” Science, 352(6288): 897-899.

Willekens, F., A. Pór and R. Raquillet 1981. “Entropy, multiproportional, and quadratic techniques for inferring detailed migration patterns from aggregate data.” IIASA Reports. A Journal of International Applied Systems Analysis, 4: 83-124.

Willekens, F., S. Zinn and M. Leuchter 2017. “Emigration rates from sample surveys. An application to Senegal.” Demography. In print.

Wiśniowski, A. 2017. “Combining labour force survey data to estimate migration flows: the case of migration from Poland to the UK.” Journal of the Royal Statistical Society A., 180, Part 1:185-202. DOI 10.1111/rssa.12189

Wiśniowski, A., J. Bijak, S. Christiansen, J.J. Forster, N. Keilman, J. Raymer and P.W.F. Smith. 2013. “Utilising expert opinion to improve the measurement of international migration in Europe.” Journal of Official Statistics, 29(4):583-607.

Wiśniowski, A., J.J. Forster, P.W.F. Smith, J. Bijak and J. Raymer 2016. “Integrated modelling of age and sex patterns of European migration.” Journal of the Royal Statistical Society A, 179(4):1007-1024. DOI 10.1111/rssa.12177