Inventor migration and knowledge flows: A two-way communication channel?

Abstract This paper documents the influence of networks of highly skilled migrants on the international diffusion of knowledge – particularly those with degrees and occupations in science, technology, engineering and mathematics. It investigates knowledge inflows to host countries brought in by skilled immigrants. It then explores knowledge feedback to home countries generated by these migrants. We test our hypotheses in a country-pair gravity model setting, for the period 1990–2010, using patent citations across countries to measure international knowledge diffusion. Our results confirm our hypotheses on the positive impact of skilled migrants on knowledge flows to host and home countries. However, the former are not robust to instrumental variables and country-pair fixed-effects, and only matter in certain contexts: when the sending countries are developing nations and for knowledge diffusion within the boundaries of multinationals.


Introduction
High-skilled workers are an important asset for a country's growth as they impact directly on knowledge production and diffusion (Nelson and Phelps, 1966;Vandenbussche et al., 2006). This is especially true of individuals with degrees and occupations in science, technology, engineering and mathematics (STEM), whose social and professional networks have been observed to disseminate important knowledge externalities (Moretti, 2004;Winters, 2014). Following the increasing globalization of STEM workers' mobility flows, special attention has been paid to the international dimension of such networks, with a focus on STEM migrants as key contributors to innovation in both their host countries (Chellaraj et al., 2008;Hunt and Gauthier-Loiselle, 2010;Kerr and Lincoln, 2010;Stephan and Levin, 2001) and in their countries of origin (Agrawal et al., 2011;Breschi et al., 2017;Kerr, 2008;Kuznetsov, 2006;Saxenian, 2006;Saxenian et al., 2002).
This paper aims to study the relationship between international knowledge diffusion and the migration of inventors, at a large, global scale. Migrant inventors constitute a representative category of STEM migrants, most of them R&D workers highly involved in producing the knowledge that spurs economic growth and wellbeing. We test the hypothesis of a positive relationship between inventors' migration and knowledge flows in a gravity model framework for a sample of 33 OECD receiving countries and 133 developed and developing economies. We test whether the stock of migrant inventors originally from country i and resident in country j is positively associated with knowledge inflows (KI) into country j, originating from country i. We also test if the stock of migrant inventors originating from country i and resident in country j is positively associated with knowledge outflows (KO) to country i, originating from country j. The empirical analysis is made possible by the use of a novel dataset of inventors with information on both their residence and nationality (Miguelez and Fink, 2013). This information is available for a significant majority of Patent Cooperation Treaty (PCT) applications, from 1990 to 2010, thus making it unnecessary to estimate the probable ethnic origin of inventors.
A number of studies have already addressed similar issues. Most have focused on the US, however (Agrawal et al., 2011;Breschi et al., 2017;Ganguli, 2015;Kerr, 2008;Kerr and Lincoln, 2010;Moser et al., 2014). Systematic empirical evidence on the impact of migration on knowledge diffusion is still scarce. Such evidence is much needed as (1) it is well understood that the international diffusion of knowledge and new technologies is a source of economic growth and income convergence across countries (Eaton and Kortum, 1999;Keller, 2004), and (2) despite migrants' representing a small proportion of total worldwide population (around 3% - UN-DESA and OECD, 2013), the number of high-skilled, educated migrants (particularly STEM) in certain OECD countries has exploded in recent years (Kerr et al., 2016).
Baseline results point to a positive impact of high-skilled migrants on knowledge flows. We find that doubling the number of inventors of a given nationality in a destination country leads to a 5% increase in KI to that host economy. Equally, it produces a 5.4% increase in KO to their homelands.
We use two approaches to account for unobserved effects driving both talent and knowledge flows. First, we use a recently released index of migration policy to instrument our explanatory variable (Rayp et al., 2017). Second, we introduce country-pair fixed-effects (FE) to control for unobservables. The effect of inventor migration on KO holds under both the instrumental variables (IV) regressions and the country-pair FE estimates. Conversely, the effect on KI does not survive either of the two approaches.
We also test the existence of heterogeneous effects across broad technological fields, different groups of countries, and knowledge diffusion within multinational boundaries. We find that migrant inventors are important for KI originating in low- and middle-income countries only, as well as when diffusion occurs within organizational boundaries. Finally, we note that the effects diminish dramatically when the US and the BRICS countries are excluded from the analysis, which points to the importance of these countries as magnets for worldwide talent and as main providers of STEM migrants.
The rest of the paper is organized as follows: the next section summarizes the theoretical literature on highly skilled migration and innovation, and presents previous evidence on the topic. Section 3 focuses on the research methods, including the description of our data and variables. Section 4 presents the results. We draw our conclusions in the last section.

Theory and expectations
The international diffusion of ideas (especially from leading nations to poorer areas) is central to income convergence (Coe and Helpman, 1995;Eaton and Kortum, 1999;Keller, 2004). However, because what matters most from knowledge stocks is tacit in nature, it tends to resist diffusion (Audretsch and Feldman, 1996;Polanyi, 1958;Storper and Venables, 2004) and can only be transmitted by means of frequent face-to-face interactions and meetings. It thus requires "knowledge carriers" to transmit it over geographical distances (Breschi and Lissoni, 2009;Trippl, 2013). The international mobility of human capital has thus gained attention as a channel of international knowledge diffusion.
High-skilled migration can affect host country innovation through different channels. First, skilled immigrants directly contribute to the innovation activities of the receiving societies, simply because they add to the skilled labor force -quantitative contribution (Kerr, 2013). Second, as migrants tend to be positively self-selected, they specialize in jobs for which they have a comparative advantage with respect to native workers (Bosetti et al., 2015) -i.e., qualitative contribution (Kerr, 2017). Third, more migrants contribute to more culturally diverse societies, alongside the increased creativity and complexity that goes with it (Alesina et al., 2016;Bosetti et al., 2015;Ferrucci and Lissoni, 2019;Kemeny and Cooke, 2018). Fourth, they may also favor inward FDI (Hernandez, 2014) as well as cross-border acquisitions (Useche et al., 2019). This may affect the innovation potential of the firms involved. Finally, skilled immigrants are also sources of knowledge transfer by themselves, from their original countries to the host countries, as they bring new skills, abilities and ideas to the receiving society (Lissoni, 2018). They have the ability to transfer knowledge to their host country and to their firm that was previously locked within the cultural context of their homelands (Choudhury and Kim, 2019). These are precisely the ideas tested in the historical approaches we mention in the following section (Ganguli, 2015;Hornung, 2014;Moser et al., 2014), and the main idea we aim to test in the present paper. We therefore hypothesize that larger stocks of immigrants originating from country i who are residents in country j are likely to increase knowledge diffusion from origin country i to receiving country j.
Skilled migrants may not only contribute to innovation in their host country, but also to innovation in their homelands. A burgeoning body of literature has identified positive returns of migration for the countries of origin through diaspora networks. Diasporas have been defined as "part of a people (…) that maintains a feeling of transnational community among a people and its homeland" (Chander, 2001). This feeling can be exploited to the benefit of the home country. While most research has traditionally focused on monetary remittances, more recently, knowledge remittances have gained center stage too (Saxenian et al., 2002). Knowledge remittances may take two non-mutually exclusive forms: (1) skilled migrant workers maintaining personal and professional contacts with their home countries, favoring the diffusion of knowledge on a friendly or contractual basis (Breschi et al., 2017;Meyer, 2001;Meyer and Brown, 1999;Nanda and Khanna, 2010); (2) they may decide to move back to their home countries on a permanent or temporary basis, equipped with new skills and social networks (Baruffaldi and Landoni, 2012;Choudhury, 2016). We therefore expect that larger stocks of immigrants originating from country i who are residents in country j increase knowledge diffusion from host country j to origin country i.
Economic history has extensively documented skilled migration and subsequent knowledge diffusion, typically where the sending country has a technical advantage over the receiving country, at least in some fields, e.g. Germany with respect to the US in industrial chemistry in the 1930s (Moser et al., 2014). This is less the case for skilled migration nowadays, which increasingly comes from developing countries and is centered in English-speaking economies as hosts (Kerr et al., 2016). A large share of this migration is to complete graduate studies abroad, for instance (Breschi et al., 2018). We therefore expect migrant inventors to have a greater effect on KO than on KI.
Our empirical analysis also divides knowledge flow corridors between two groups, i.e., developed-developed countries vs developing-developed pairs. In principle, we expect the latter to affect KO particularly (developing countries benefitting most from having their diasporas abroad), while the former affect KI particularly (as South-North migration generally occurs in pairs where the sending country is not technically superior to the receiving country, with a large share of this migration occurring for study purposes). However, developed-developed country pairs are more technologically similar than developing-developed pairs. In such cases, developing-developed migrants could be more important for introducing novel ideas to their receiving societies.
In our analysis, we also differentiate between intra-company and inter-company knowledge diffusion. Indeed, the literature has long discussed the role of firms and multinational corporations (MNCs) in managing international knowledge transfer across different locations (Hedlund, 1986; Teece, 1977). While knowledge diffuses mainly locally (Audretsch and Feldman, 1996), the ability of MNCs to transfer knowledge more effectively than is possible through market-mediated channels is a critical means of international knowledge diffusion (Hymer, 1976; Singh, 2008). This is nevertheless not easy, even within organizational boundaries, especially with regard to the cross-national transfer of complex or tacit knowledge (Kogut and Zander, 1993; Sorenson et al., 2006; Teece, 1977). The potential gains from accessing diverse knowledge hubs are often offset by difficulties in achieving integration of knowledge across multiple locations (Singh, 2008). In order to overcome the challenges in transferring knowledge across geographic distances, MNCs may rely on the mobility of their skilled employees between their countries of origin and their destination in the MNCs' location (Caligiuri and Bonache, 2016; Minbaeva and Michailova, 2004). As stressed by Kerr et al. (2016), the extent of employment mobility within the MNCs' boundaries is often ignored by the migration literature. Nonetheless, large MNCs may have almost half of their workforce employed outside the headquarters country, and likely to be moved around (possibly on a temporary basis), so the phenomenon is on the rise. Thus, we expect to find differences in the relation between knowledge flows and STEM migration within vs outside the firm's boundaries.
Finally, the diaspora literature has differentiated between direct and indirect effects of diaspora networks (Kapur and McHale, 2005). The former arise from diaspora members deliberately interacting with their home economies. The latter arise from diaspora members serving as intermediaries who ease knowledge transmission between migrants' home countries and third persons in their host economies. We expect direct effects of STEM migrants on KI and KO to be preponderant, but indirect effects are also likely to arise.

Previous evidence
In general, studies on high-skilled migration and innovation have long been confined to the area of economic history (Belfanti, 2006; Cipolla, 1972; Hornung, 2014; Luu, 2005). However, a group of scholars have worked on linking migration to innovation studies, mainly with the help of patent data (Agrawal et al., 2011; Breschi et al., 2017; Kerr, 2008; Kerr and Lincoln, 2010; Miguelez, 2018; Moser et al., 2014; Nathan, 2015). One stream of literature has focused on knowledge diffusion to receiving countries. This has generally been addressed by estimating the impact of the arrival of high-skilled workers on native knowledge creation: some papers have documented positive effects (Ganguli, 2015; Hunt and Gauthier-Loiselle, 2010), though some indicate small or even negative impacts (Doran, 2015, 2012; Kerr and Lincoln, 2010). A critical challenge of this literature is identification, as most high-skilled workers may choose to relocate to highly-innovative, highly-rewarding places. Kerr and Lincoln (2010) apply a shift-share instrument across US States to study the impact of H1B visa admissions on local innovation. They find that skilled immigration leads to more patenting by inventors of Chinese and Indian origin, but not by natives, so immigrants contribute directly to innovation, rather than affecting native productivity through externalities. Using a historical migration exogenous shock, Moser et al. (2014) suggest that patenting by US-based inventors increased considerably in the 1930s in chemistry fields in which German Jewish émigrés were present, after being expelled from Nazi Germany. Interestingly, their results suggest that this effect was especially due to other inventors being attracted into the field, rather than an increase in the productivity of incumbent inventors. Ganguli (2015) is one of the few looking directly at migration and knowledge flows, rather than migration and innovation. The author exploits the fall of the Soviet Union as a natural experiment and the sudden migration of Russian scientists to the US which resulted. She looks at a panel of US cities and scientific fields and shows a disproportionate number of citations to Soviet-era articles after the arrival of Russian migrants. Oettl and Agrawal (2008) identify internationally mobile inventors from the United States Patent and Trademark Office (USPTO) (when reporting different addresses in their patents) and find that the inventors' host countries gain knowledge inflows from their arrival, above and beyond the flows enjoyed by the firms recruiting them. Fassio et al. (2019) is one of the few studies, to our knowledge, that adopts an industry perspective rather than a geographical one. The authors measure the impact of (skilled) immigration on innovation at industry level (citation-weighted patent production), which is critical, as skilled migrants tend to be concentrated in just a few industries. Indeed, in their analysis for France, Germany and the UK, the authors find heterogeneous effects across sectors, depending on their openness to trade and FDI.
From the perspective of migration-sending countries, the literature has usually depicted high-skilled migration as a source of brain drain, and hence political concern (Beine et al., 2001; Bhagwati and Hamada, 1974). More recently, studies claim that international co-ethnic ties may ease knowledge flows among high-skilled workers of the same origin back to the migrants' source country. Saxenian et al. (2002) survey Silicon Valley scientists and engineers and find that around 82% of Chinese and Indians report having exchanged technical information with their peers back home, and 18% invest in their origin countries. Kerr (2008) uses patent data from the USPTO and, by applying an ethnicity identification technique based on inventors' names, shows that ethnic ties increase knowledge diffusion from the US to the migrants' home countries. The author estimates negative binomial models to show positive impacts of ethnic inventors in the US (seven foreign ethnicities identified) on knowledge flows back to their countries of origin, measured by patent citations. The effect is especially strong for high-tech industries and for the case of China, and the result is interpreted as evidence of positive returns for emigrants' sending countries. In a similar vein, Agrawal et al. (2011) build an Indian inventor database in USPTO patents using name identification techniques, and explore patent-level citations to study international knowledge flows from the US back to India. They find that patents by Indian inventors in the US do not seem to attract a higher-than-average rate of citations from the inventors' home country. The only (weak) exceptions are patents in Electronics, and patents owned by multinational firms. Interestingly, these results seem to suggest that the Indian diaspora is not a major source of knowledge feedback for the home country. The results of Agrawal et al. (2011) are reproduced and extended to another eight countries of origin in Breschi et al. (2017), where migrant status is again identified by names. These authors find positive returns for emigrants' countries only in the case of China, Korea and Russia, and also for France, Italy and Japan within company boundaries (effects mediated by companies' self-citations). No results are found for Germany or India. They attribute the former to difficulties in determining who are true Germans residing in the US from their name and surname, a problem we do not share as we work directly with nationality. Finally, Oettl and Agrawal (2008) again provide evidence of a positive effect of internationally mobile inventors moving back to their source country. These authors find that the international movement of an inventor influences knowledge flows from the receiving country and the receiving firm to the source firm/country. These effects double when the mover is hosted in a new geographic site within the same multinational company, in line with Breschi et al. (2017). This indicates that firms manage knowledge flows more effectively within their boundaries than outside them, and that mobility of labor reinforces intra-firm knowledge flows.

Empirical approach
For the present analysis we use a standard gravity model (see Anderson (2011) for gravity models of trade and Beine et al. (2016) for gravity models of migration). Only a few studies have extended it to study knowledge diffusion patterns (Cappelli and Montobbio, 2016; Kerr, 2008; MacGarvie, 2005; Peri, 2005). The gravity model to be estimated for KI takes the following form:
$$KI_{ijt} = e^{\beta_0} \, mig_{ijt-1}^{\beta_1} \, Z_{ijt-1}^{\beta_2} \, e^{\tau_i + \tau_j + \delta_t} \, \varepsilon_{ijt} \qquad (1)$$
and for KO:
$$KO_{ijt} = e^{\beta_0} \, mig_{ijt-1}^{\beta_1} \, Z_{ijt-1}^{\beta_2} \, e^{\tau_i + \tau_j + \delta_t} \, \varepsilon_{ijt} \qquad (2)$$
where $KI_{ijt}$ and $KO_{ijt}$ are, respectively, the total amount of knowledge flows from country i to country j in year t, and the number of knowledge flows from country j to country i, in year t. $\beta_1$ is our parameter of interest in both equations, $mig_{ijt-1}$ is the number of active inventors of nationality i residing in country j during year t-1, $Z_{ijt-1}$ is the set of dyadic and country-specific control variables in year t-1 (with coefficient vector $\beta_2$), and $\tau_i$, $\tau_j$ and $\delta_t$ are country i, country j and time FE, respectively. $\varepsilon_{ijt}$ stands as the error term. Note that all time-variant explanatory variables are lagged one year to minimize reverse causality problems.
When applying the gravity model we face the issue of strong skewness in the data distribution with relatively few high values at the bottom end. A common solution for this has been to transform the gravity equation into its logarithmic form (with a normal disturbance term), then to estimate it with OLS. However, this practice may result in some heteroskedasticity in the error terms, as pointed out by Santos Silva and Tenreyro (2006). Moreover, as our dependent variables contain a large number of zeros, their logarithmic transformation would be impossible without incurring serious bias due to arbitrary transformations (Burger et al., 2009). For these reasons, Santos Silva and Tenreyro (2006) recommend estimating the multiplicative form of the model using Poisson pseudo-maximum likelihood (PPML). Given all the above, we choose to apply the PPML regression to the conditional expectation of equations (1) and (2), as:
$$E\left[KI_{ijt} \mid mig_{ijt-1}, Z_{ijt-1}, \tau_i, \tau_j, \delta_t\right] = \exp\left(\beta_0 + \beta_1 \ln mig_{ijt-1} + \beta_2 \ln Z_{ijt-1} + \tau_i + \tau_j + \delta_t\right) \qquad (3)$$
and
$$E\left[KO_{ijt} \mid mig_{ijt-1}, Z_{ijt-1}, \tau_i, \tau_j, \delta_t\right] = \exp\left(\beta_0 + \beta_1 \ln mig_{ijt-1} + \beta_2 \ln Z_{ijt-1} + \tau_i + \tau_j + \delta_t\right) \qquad (4)$$
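For reference, the estimating condition behind PPML can be sketched in generic notation. The symbols here are illustrative shorthand, not taken from the paper: $y_{ijt}$ denotes the citation flow for a country pair and year, and $x_{ijt}$ stacks the (transformed) regressors and the fixed-effect dummies.

```latex
% PPML maximizes the Poisson pseudo-log-likelihood (terms not involving \beta dropped)
\hat{\beta} = \arg\max_{\beta} \sum_{i,j,t} \left[ y_{ijt}\, x_{ijt}'\beta - \exp\left(x_{ijt}'\beta\right) \right]

% First-order conditions: residuals in levels are orthogonal to the regressors
\sum_{i,j,t} \left[ y_{ijt} - \exp\left(x_{ijt}'\hat{\beta}\right) \right] x_{ijt} = 0
```

Two features matter for the present setting: observations with $y_{ijt} = 0$ enter the sums without any transformation, and consistency only requires the conditional mean to be correctly specified, so the dependent variable need not be an integer count (fractional citation counts are admissible).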

Patent citations and knowledge flows
Most studies reviewed in the previous sections use either trade flows or innovation outcomes after migration shocks as a proxy for knowledge exchange between countries. Finding a good measurement of the actual knowledge flows could be cumbersome to the extent that these flows are not tangible. The use of patent citations emerged as a way of overcoming this limitation -pioneered by Jaffe et al. (1993). Since then, this technique has been widely applied to various other studies, including migration research. Our solution, the use of citations as a proxy for knowledge flows, is not without its criticisms, most notably Jaffe and de Rassenfosse (2017) and Arora et al. (2018). However, these mostly relate to the use of citations as a proxy for inventor-to-inventor (or applicant-to-applicant) knowledge flows, while our analysis, at the aggregate, country-to-country level, aims to account for the outcome of a social, community phenomenon of migrant networking and communication.
Researchers have suggested using applicant-added citations and disregarding examiner-added citations (Thompson, 2006), but this is not necessarily a good solution (applicant citations are actually added by attorneys). At the European Patent Office (EPO), for instance, the large majority of citations are added by examiners. Despite this, Duguet and MacGarvie (2005) find that EPO citations are good proxies for knowledge flows as measured by CIS data for a sample of French firms. The community idea makes it possible for inventor Z to receive a knowledge token from inventor A through a word-of-mouth process passing through inventors B, C, D and so forth. The origin of the flow may escape Z's attention (Z is unaware of A), but this does not mean that the transmission did not take place. Breschi and Lissoni (2005) develop this argument in full.
For their part, Thompson and Fox-Kean (2005) criticize the fact that citations might be a biased proxy of knowledge flows to the extent that they capture knowledge similarity rather than real knowledge flows. Indeed, patents cite other patents within their technology far more frequently than those outside of their field. We address this issue, adding the appropriate controls and running separate regressions per broad technological field (see section 3.2.3).
Our dependent variable is built using cross-country citations to PCT patents, taken from the patent database of the World Intellectual Property Organization (WIPO). More precisely, we retrieve backward citations to PCT patents (as cited patents) from the OECD Citations database, July 2014, and geo-reference both cited and citing patents across all countries. From the initial data, only citing and cited patents with information on inventors and their countries of residence are selected, for the period 1990-2010, and national-level citations are dropped. As some cited or citing patents are produced by teams of inventors scattered across 2 or more countries, citation counts are fractionalized as a function of the number of inventors in each citing and cited patent. Thus our dependent variable is just a dyadic variable returning the fractional count of backward patent citations from one country to another per year, weighted by the total number of inventors per country.
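The fractional counting described above can be illustrated with a minimal sketch. It assumes that each citation is split in proportion to the share of inventors residing in each country on the citing and on the cited patent, and that same-country portions are discarded; the exact weighting used for the dependent variable may differ. The function names and the input layout are hypothetical.

```python
from collections import defaultdict


def country_shares(inventor_countries):
    """Share of a patent's inventors residing in each country."""
    n = len(inventor_countries)
    shares = defaultdict(float)
    for country in inventor_countries:
        shares[country] += 1.0 / n
    return dict(shares)


def fractional_citation_counts(citations):
    """Aggregate citations into fractional country-pair counts per year.

    citations: iterable of (year, citing_countries, cited_countries), where
    each *_countries is the list of residence countries of the inventors.
    Returns {(cited_country, citing_country, year): fractional count}.
    Assumption: weight = citing share * cited share; same-country portions
    are dropped as national-level citations.
    """
    flows = defaultdict(float)
    for year, citing, cited in citations:
        citing_shares = country_shares(citing)
        cited_shares = country_shares(cited)
        for cited_c, w_cited in cited_shares.items():
            for citing_c, w_citing in citing_shares.items():
                if cited_c == citing_c:
                    continue  # national-level citation, dropped
                flows[(cited_c, citing_c, year)] += w_cited * w_citing
    return dict(flows)


# Hypothetical toy data: two citations made in 2005.
example = [
    (2005, ["US", "US", "DE"], ["IN"]),  # US-DE team citing an Indian patent
    (2005, ["US"], ["IN", "CN"]),        # US patent citing an IN-CN team
]
flows = fractional_citation_counts(example)
```

In this toy example each citation contributes a total weight of one, spread over the country pairs involved, so large international teams do not inflate the counts.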
The KI dependent variable is built by counting the number of country j citations (citing patents) to country i patents (cited patents), grouped by country-pairs and year. The KO dependent variable is built by counting the number of country i citations (citing patents) to country j patents (cited patents), grouped by country-pairs and year. We remove self-citations at the inventor level from our analysis. Unfortunately, our sample of inventors is not disambiguated (it includes PCT inventors plus all cited inventors from any office). To exclude self-citations we compare the names of inventors listed in citing and cited patents, and exclude all citations with at least one inventor with the same (or similar) names.
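As an illustration of the name-based exclusion of self-citations, the following is a minimal sketch. The normalization steps, the similarity measure (the standard library's `difflib.SequenceMatcher`) and the 0.9 threshold are assumptions made for the example, not the matching rule actually applied to the data.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, strip accents and punctuation, and sort the name tokens."""
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    )
    cleaned = "".join(ch if ch.isalnum() else " " for ch in ascii_name.lower())
    return " ".join(sorted(cleaned.split()))


def similar(name_a, name_b, threshold=0.9):
    """True if the normalized names are identical or nearly so.

    The threshold is an illustrative assumption.
    """
    ratio = SequenceMatcher(None, normalize(name_a), normalize(name_b)).ratio()
    return ratio >= threshold


def is_self_citation(citing_inventors, cited_inventors, threshold=0.9):
    """Flag a citation if any citing inventor matches any cited inventor."""
    return any(
        similar(a, b, threshold)
        for a in citing_inventors
        for b in cited_inventors
    )
```

Because the inventors are not disambiguated, a rule of this kind errs on the side of exclusion: homonyms are removed along with true self-citations.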
We also divide knowledge flows between inter- and intra-firm citations. To identify intra-firm citations we incorporate information from the HAN OECD 2018 dataset which is a harmonized dataset of applicants (only name-harmonized, not disambiguated). This database incorporates information from ORBIS, which allows a benchmark name to link applicants' names. We also incorporate information from PATSTAT when the applicants of a given citing or a cited patent were not listed in HAN, as well as extensive manual checking. We then compare the names of citing and cited applicants, and identify as intra-firm those citations where citing and cited applicants have the same (or similar) names.
Next, we also separate citations from inventors whose country of origin is the cited one from the rest, in order to differentiate between direct knowledge flows (migrants citing their home country's colleagues themselves, for KI, or inventors in the origin country citing only co-nationals in the receiving country, for KO) and indirect knowledge flows (cross-nationality citations). This is by no means a straightforward task, as nationality is available for a large majority of PCT patents (citing ones), but not all; and it is not available for the majority of cited patents which are not PCT. For a subsample of cited patents we were able to identify the nationality of the listed inventors if patents were either PCT or had a PCT as one of the members of the patent family. Whenever we cannot identify the full list of inventors' nationalities in a given citing-cited pair, we remove it completely from the analysis. For these reasons, results using this information should be treated with care.

STEM migration from inventor data
Most migration studies use education attained to determine skills level, and census data on the stocks of migrants with tertiary education as proxy for high-skilled migration. Yet, when it comes to STEM migration, data retrieved from censuses are less appropriate, as (1) education attained and skills can still differ markedly among tertiary educated workers, as this category collects people with science and engineering PhD together with people with non-STEM degrees or even non-university, tertiary education; (2) differences across countries emerge with respect to the quality of the education level attained, making cross-country comparisons troublesome; and (3) they are generally released every 10 years, which impedes longitudinal analysis in the short and medium run (and released to the public with a significant delay).
Some scholars have found a way to bypass these limitations by working with inventors as a proxy for STEM workers -a specific category of high-skilled migrants, most of them scientists and engineers. One advantage of using inventors' data in STEM migration studies is that migrant inventors stand as a more homogenous category of high-skilled migrants, highly involved in R&D, and behind the production of knowledge and new technologies.
For the present analysis we make use of a recent dataset on inventors from PCT patents (Miguelez and Fink, 2013), from which we are able to identify inventors with a migratory background on the basis of their nationality and place of residence. Compared to other inventor-based datasets, an advantage of using Miguelez and Fink (2013) is that it is the only one where information on inventors' nationality and residence is provided by inventors themselves. It is therefore possible to identify migrant inventors by comparing information on nationality with that of residence. In our view, nationality is a more natural signal of origin than other proxies based on name identification techniques that have recently emerged in the literature (Agrawal et al., 2011; Breschi et al., 2017; Kerr, 2008). Additionally, patents administered by the PCT are international in nature, as applicants from all participant countries have, in principle, the same tendency to apply, contrary to other patent datasets which tend to be more biased towards one or other origin/destination country or region (the "home bias" effect).
Yearly country data on migrant inventors from the WIPO dataset are the starting point for computing our focal explanatory variable at a sending-receiving country pair level. More precisely, this variable stands as the annual number of active inventors (those listed in patents that year) who are nationals of an origin country i and reside in a given host country j (1990-2010).
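The construction of the focal variable can be sketched as a count of distinct active inventors per origin-host pair and year. The record layout and the `inventor_id` field are hypothetical: the sketch assumes some identifier (for instance a cleaned name string) is available to avoid counting the same person twice within a year.

```python
from collections import defaultdict


def migrant_inventor_stocks(records):
    """Count distinct active migrant inventors per (origin, host, year).

    records: iterable of (inventor_id, nationality, residence, year), one
    entry per inventor listed on a patent filed that year. An inventor is
    treated as a migrant when nationality differs from country of residence.
    """
    active = defaultdict(set)
    for inventor_id, nationality, residence, year in records:
        if nationality != residence:
            active[(nationality, residence, year)].add(inventor_id)
    return {key: len(ids) for key, ids in active.items()}


# Hypothetical toy records.
records = [
    ("a1", "IN", "US", 2005),
    ("a1", "IN", "US", 2005),  # same inventor, second patent in 2005
    ("a2", "IN", "US", 2005),
    ("a3", "DE", "DE", 2005),  # non-migrant, ignored
    ("a4", "CN", "US", 2006),
]
stocks = migrant_inventor_stocks(records)
```

Counting distinct inventors rather than inventor-patent records keeps highly prolific individuals from dominating the measure.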

Control variables
Following related studies (Beine et al., 2016; MacGarvie, 2005; Miguelez, 2018; Peri, 2005), we control for geographical distance as well as cultural and historical ties between countries. Two variables are included for the former: (1) a dummy variable for contiguity, taking the value 1 if the two countries share a common border and 0 otherwise, and (2) a variable measuring the distance, in kilometers, between the capital cities of both countries. Cultural ties are proxied with a dummy for common language, taking the value 1 if both countries share at least one language and 0 otherwise. To control for historical ties, we include a dummy taking the value 1 if there has been a colonial link between the two countries and 0 otherwise. We also control for each country's level of technological capacity with its total number of PCT inventors. This variable is informative to the extent that it measures the size of a country's innovation system, which determines both the amount of inflow and outflow of knowledge as well as the migration of talent in and out of the country. We also account for the fact that some countries are, on average, more cited than others, which may affect the direction of citation flows. We therefore introduce the host and home country average citations received per patent as controls.
Additionally, other more economically-based country-pair variables are included to minimize bias in our focal coefficients due to confounding factors. First, we include an index of technological similarity between pairs of countries in order to control for whether they both share common fields of technological specialization. This index is computed using patent data from the EPO (Coffano and Tarasconi, 2014) and applying the following formula:
$$TechSim_{ij} = \frac{\sum_h f_{ih} f_{jh}}{\sqrt{\sum_h f_{ih}^2 \, \sum_h f_{jh}^2}}$$
where $f_{ih}$ stands for the share of patents of technological class h (according to the 30-class reclassification of IPC codes) held by country i, and $f_{jh}$ the share of patents of technological class h held by country j. Values of the index close to one indicate that a given pair of countries are technologically similar, and values close to zero indicate they are technologically remote from each other (Jaffe, 1986). With the inclusion of this control, we aim to tackle Thompson and Fox-Kean's (2005) criticisms on the tendency for there to be more citations within than across technological fields.
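A minimal sketch of such an index follows, assuming it is the uncentered correlation between the two countries' technology-class share vectors, as in Jaffe (1986). The function name, the input layout and the class labels in the tests are hypothetical.

```python
import math


def tech_similarity(patents_i, patents_j):
    """Uncentered correlation of two countries' technology-class shares.

    patents_i, patents_j: dicts mapping a technology class to the number of
    patents held by the country in that class. Returns a value in [0, 1];
    0.0 is returned when either country has no patents (an assumption made
    here to avoid division by zero).
    """
    total_i = sum(patents_i.values())
    total_j = sum(patents_j.values())
    if total_i == 0 or total_j == 0:
        return 0.0
    classes = set(patents_i) | set(patents_j)
    f_i = {h: patents_i.get(h, 0) / total_i for h in classes}
    f_j = {h: patents_j.get(h, 0) / total_j for h in classes}
    numerator = sum(f_i[h] * f_j[h] for h in classes)
    denominator = math.sqrt(
        sum(v * v for v in f_i.values()) * sum(v * v for v in f_j.values())
    )
    return numerator / denominator
```

Identical specialization profiles yield a value of one regardless of country size, since only shares enter the index, while countries patenting in disjoint classes yield zero.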
Second, we use trade flows (exports and imports), collected from the COMTRADE database, to proxy for economic integration between pairs of countries, as well as to account for knowledge diffusion embodied in goods and services (Bahar and Rapoport, 2018).
Finally, we include the stock of college-educated migrants from country i living in country j, taken from the 2000 census (Artuç et al., 2015). Descriptive statistics and the correlation matrix are presented in Appendix A.2.
Many of our explanatory variables contain zeros, and therefore their logarithmic transformation is problematic.
To remedy this, we apply the inverse hyperbolic sine transformation (MacKinnon and Magee, 1990). It behaves like a log-transformation, but is defined at zero (see the recent application by Bahar and Rapoport, 2018). Except for very small values of the variable, the inverse hyperbolic sine is approximately equal to the logarithm of the variable plus a constant (ln 2), and therefore its coefficients can be interpreted similarly. We apply this to all our explanatory variables except dummies, the index of technological similarity (ranging 0-1) and distance between capitals, which is log transformed.
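The behaviour of the transformation can be checked numerically. The sketch below compares the inverse hyperbolic sine with ln(2x) for a few hypothetical values; the helper name `ihs` is an illustrative choice, not notation from the paper.

```python
import math

def ihs(x):
    """Inverse hyperbolic sine: ln(x + sqrt(x^2 + 1)), defined at x = 0."""
    return math.asinh(x)

# Hypothetical values, chosen only to show the behaviour near and away from zero.
for x in [0, 1, 10, 1000]:
    log_version = math.log(2 * x) if x > 0 else float("nan")
    print(x, round(ihs(x), 4), round(log_version, 4))
```

The gap between the two columns shrinks quickly as x grows, which is why coefficients on transformed regressors can be read in the same way as coefficients on logged regressors, except for very small values.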

Stylized facts
Our regressions include pairs of countries formed from 33 OECD destinations and 133 developed and developing sending economies (see the list of countries included in Appendix A.3).
[ Table 1 about here]
The second panel splits citations between inter- and intra-company flows. At first sight, no major differences emerge between the two rankings (aside from the imbalance in the absolute number of citations between inter- and intra-company citations, as expected). Only the China-US corridor enters the top 10 when intra-company citations are considered, though it started from 11th position.
The bottom part of Table 1 looks at inventor migration corridors, which coincide to some extent with citation corridors, although with some notable differences. Unsurprisingly, the US appears as the most common host country for migrant inventors from 14 origins. Migrant inventors from China and India to the US account for 24%, or almost one fourth, of all migrant inventors in our dataset. There are also a large number of migrant inventors from Europe residing in the US, mainly from the UK, Germany and France, all of them technology-leading countries. When we focus on low- and middle-income sending countries, there is more variety in migrant inventors' origins (see Table A.2.2 in the Appendix), but with the US still the main host economy. Migrant inventors coming from China and India to the US account altogether for around 57% of all migrant inventors originating from low- and middle-income countries.

Econometric results
Table 2 shows baseline regressions with the usual gravity variables as controls, plus the number of inventors in countries i and j to account for size and innovativeness (results using the number of patents instead are qualitatively the same), in columns 1 and 2 for KI and KO, respectively. Regressions also include country i FE, country j FE, and time FE. Results for control variables are for the most part significant and with the expected sign, with the exception of contiguity and same colonial past, which are not significantly different from zero.
The focal variable, migrant inventors, is positive and significant for both KI and KO. Doubling the number of inventors of a given nationality in a destination country leads to an 8.4% increase in KI to the host economy, while a similar increase in the number of migrant inventors increases KO by 8.7% (coefficients can be read as elasticities; Santos Silva and Tenreyro, 2006).
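As a reading aid, the baseline specification implied by this description can be written out as follows. This is a sketch under stated assumptions: a Poisson-type exponential mean (in the spirit of Santos Silva and Tenreyro, 2006), a one-year lag of the migrant stock (inferred from the lag structure described for the instrument), and illustrative notation ($MIG$, $X$, $\alpha$, $\tau$) that is not taken from the estimation tables.

```latex
% Assumed exponential-mean gravity equation for knowledge inflows (KI);
% the KO equation is analogous, with citations flowing from j to i.
E\left[ KI_{ijt} \mid \cdot \right]
  = \exp\left( \beta \, \operatorname{asinh}(MIG_{ij,t-1})
  + \gamma' X_{ij,t-1} + \alpha_i + \alpha_j + \tau_t \right)

% Implied elasticity with respect to the migrant stock:
\frac{\partial \ln E[KI_{ijt}]}{\partial \ln MIG_{ij,t-1}}
  = \beta \, \frac{MIG_{ij,t-1}}{\sqrt{MIG_{ij,t-1}^{2} + 1}}
  \;\longrightarrow\; \beta \quad \text{as } MIG_{ij,t-1} \text{ grows}
```

The second expression shows why the coefficient on the transformed focal variable behaves like an elasticity once the migrant stock is not very small.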
Columns 3 and 4 mimic columns 1 and 2 but add important country-proximity controls that are not accounted for in the usual gravity models, such as technological proximity, trade (exports and imports), and the stock of college-educated migrants originating from country i living in country j (2000 census round). These variables are positive and significant, and the point estimate of our focal variable diminishes (to 5% for KI and 5.4% for KO), though it remains strongly significant in both cases, confirming the necessity of adding these three controls.

Confounding factors
Endogeneity issues could affect our baseline regressions and bias the results. Focal coefficients in KI regressions could be upward biased if more innovative (and highly cited) receiving countries were to attract more inventors from abroad, in which case KI and talent inflows would be spuriously correlated. Coefficients in KO regressions could be biased upwards if human capital and technological developments of sending countries increase their knowledge attractiveness (and citations received), and simultaneously increase the number of outward skilled migrants and the brain drain (see Clemens, 2014, for a discussion on the unexpected effects of development on the brain drain). Conversely, if unobserved technological development of sending countries reduces the emigration of STEM workers, our baseline estimates would be downward biased.
We address this using instrumental variables regressions. In particular, we use an index of migration policy as an instrument, taken from Rayp et al. (2017). The authors compute a quantitative indicator of migration policy that accounts for restrictiveness of entry policy, staying requirements and regulations to foster integration. They combine publicly available data sources to provide a measure of "openness" to migrants (the larger the index, the more open the countries are to migration) based on these three concepts, for 38 countries between 1996 and 2014.
Note that, given that our sample ranges from 1990 to 2010, we do not use the years 2011-2014. In addition, the index is introduced with a 5-year time lag with respect to the dependent variable, and therefore a 4-year lag with respect to the variable to be instrumented (migrant inventors). The time needed for migration policy to affect inventor migration and their subsequent inventions (as we only observe migrant inventors when they patent, which could be some years after their arrival) is not immediate, and therefore a time lag is justified. We run IV regressions with different time lags of the instrument and we choose the time lag with the largest F-stat in the first stage. Consequently, our sample is reduced to the years 2001-2010 only.
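The sample reduction and the lag choice described above can be sketched as follows. The first and last years come from the text; the function names and the F-statistics passed to `pick_lag` are hypothetical and serve only to illustrate the selection rule, not to reproduce results from the paper.

```python
POLICY_FIRST_YEAR = 1996  # first year covered by the policy index (from the text)
SAMPLE_LAST_YEAR = 2010   # last year of the citation sample (from the text)

def usable_years(lag, first_index_year=POLICY_FIRST_YEAR,
                 last_sample_year=SAMPLE_LAST_YEAR):
    """Years of the dependent variable for which a lagged instrument exists."""
    return list(range(first_index_year + lag, last_sample_year + 1))

def pick_lag(first_stage_f):
    """Return the lag whose first-stage F-statistic is largest."""
    return max(first_stage_f, key=first_stage_f.get)

years = usable_years(lag=5)

# Hypothetical F-statistics, for illustration only; not results from the paper.
chosen = pick_lag({3: 8.0, 4: 11.5, 5: 19.2})
```

With a 5-year lag and an index starting in 1996, the first usable year of the dependent variable is 2001, which is consistent with the reduced 2001-2010 sample.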
As can be seen in Table 3, the instrument in the first stage (column 1) is positive and significant, as expected. Also, from the bottom of column 1 we learn that the F-stat of the first stage is well above 10, which is also a good sign of the appropriateness of the instrument. In columns 2 and 3 of Table 3 we reproduce our baseline regressions shown in Table 2, but for the indicated sample only, for comparison purposes. As can be seen, positive and significant coefficients remain. For presentation purposes, all the tables from now on do not show the coefficients for control variables, although these are always included (they are listed in the table notes).
Results showing all controls can be requested from the authors.
Columns 4 and 5 show the IV results for KI and KO respectively. From column 4 we learn that, when our focal variable is instrumented, inventor migration no longer influences KI to the host countries, indicating that baseline regressions were upward biased. In column 5, on the contrary, we see that the IV coefficient for KO increases considerably and continues to be strongly significant, indicating that baseline regressions were downward biased.
[ Table 3 about here]
In order to avoid losing too many observations, we take an alternative approach to deal with unobservables, in this case country-pair unobserved factors. 7 Thus, in Table 4 we introduce country-pair FE and repeat our baseline regressions. Because of the inclusion of pair FE, we remove all time-invariant variables. Columns 1 and 2 show the results for KI and KO respectively. Overall, results confirm our previous IV approach: positive and significant effects of inventor migration on KO, and no effect on KI. This leads us to conclude that inventor migration favours the overall transfer of knowledge back to migrants' homelands, but it does not seem to affect knowledge flows into the receiving countries. 8
[ Table 4 about here]
Table 5 goes one step further and shows pair-wise regressions adding interactions between our focal variable, migrant inventors, and 5 of our controls, namely Technological similarity, Contiguity, Colony, Language, and Geographical distance (which we turn into the inverse of distance, proximity, for interpretation purposes). Given migrant inventors' role in facilitating knowledge diffusion across borders, we may expect their impact to be larger for country pairs exhibiting stronger informational frictions, that is, when the cognitive, cultural, or geographical distances between the two are more acute. Negative and significant interaction coefficients could be interpreted as evidence of a causal link between inventor migration and knowledge diffusion, because if confounding effects drive both phenomena, they would have to explain not only the direct migration-diffusion link, but also its different effects across several country-pair dimensions (Kugler et al., 2018; Miguelez, 2018). Results partially go in this direction, as most of the coefficients show negative signs, though they are only significant for technological and geographical proximities.
[ Table 5 about here]
In sum, going back to section 2, it seems that results for KI do not align with Ganguli's (2015) findings on Russian scientists fleeing to the US, though her analysis of this highly specific context (a historical shock, and a focus on scientists) makes comparisons difficult. Similarly, they do not coincide with Moser et al. (2014). Interestingly, however, these authors suggest that increases in innovation come from crowding-in effects, and are not due to native inventors' productivity shifts. Similarly, Kerr and Lincoln (2010) find positive effects of H1B visa holders' inflows on patenting, but all attributable to Chinese and Indian ethnic inventors, and not to natives. With respect to KO, our results coincide with Kerr (2008) (positive and significant effects) and, at least partially, also with Breschi et al. (2017).

Field and origin-country heterogeneity
Both inventor migration and the use of citations to acknowledge the diffusion of ideas are highly heterogeneous across technological fields (WIPR, 2013). We explore this issue by dividing our inventor migration and citation flows across 5 technological fields, using the standard aggregation of IPC codes (Schmoch, 2008). As shown in Table 6, none of the five sectors shows a significant positive migration-diffusion relationship for the KI equation. Meanwhile, the positive effects on KO show up in all domains, but they are especially strong in electrical and mechanical engineering.
[ Table 6 about here]
Next, inspired by differences in STEM migration effects on different measures of globalization across types of countries (Kugler and Rapoport, 2007; Miguelez, 2018), we explore heterogeneous effects in the migration-diffusion relationship across the "North-North" and "South-North" axes. To do so, we multiply our main explanatory variable by two dummies: "High-income", valued 1 if the sending country is classified as a high-income country by the World Bank (before 2010), 0 otherwise, and "Middle/Low-income", valued 1 if the sending country is classified as a middle- or low-income country by the World Bank, 0 otherwise. We re-run the regressions, again with country-pair FE, and present them in Table 7: only inventors coming from middle- and low-income countries significantly affect KI. Conversely, migrant inventors are critical for KO in both cases (inventors from high-income or from middle- and low-income countries), though the coefficient is significantly larger for middle/low-income countries. 9 It seems then that migrant inventors are more important for the "South-North" corridors than for the "North-North" ones. Together with results on interactions (Table 5), we interpret this as evidence of greater transaction costs in "South-North" country-pairs, as compared to "North-North" ones, due to larger cultural and technological differences. Indeed, the average technological proximity among developed-developed pairs is significantly greater (0.54) than between developing-developed countries (0.19). All this makes the role of STEM migrants especially relevant in these contexts. Besides, as already pointed out in Breschi et al. (2017), it could be that for the "North-North" corridors most of the knowledge travels within multinationals' boundaries, jumping between their different facilities located in different places, with migrant inventors playing a more nuanced role in this case. We address this particular point in the next subsection.
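The construction of the two interacted regressors described above can be sketched as follows. This is a minimal illustration, assuming the focal variable is first transformed with the inverse hyperbolic sine as elsewhere in the paper; the function name and the example values are hypothetical.

```python
import math

def split_by_income(migrant_inventors, sending_is_high_income):
    """Return (high-income term, middle/low-income term) for one pair-year."""
    x = math.asinh(migrant_inventors)          # same transformation as the focal variable
    high = 1 if sending_is_high_income else 0  # "High-income" dummy
    low = 1 - high                             # "Middle/Low-income" dummy
    return x * high, x * low

# Hypothetical pair-year: 120 migrant inventors from a middle/low-income sender.
hi_term, lo_term = split_by_income(120, sending_is_high_income=False)
```

Because the two dummies are mutually exclusive and exhaustive, exactly one of the two terms is non-zero for each country pair, so each coefficient is identified from its own group of sending countries.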
[ Table 7 about here]

Multinationals and STEM migration
Table 8 reproduces the main regressions, removing intra-company citations (identified using the OECD HAN database, July 2014). As can be seen, results remain the same: they are not significant for knowledge inflows (column 1), and are positive and significant for knowledge outflows (column 2). Columns 3 and 4 break the sample down into countries of origin, and again the results found in Table 7 are reproduced.
[ Table 8 about here]
Next, we also re-compute our dependent variable using intra-company citations only. This is done in Table 9, where country-pair FE regressions are shown in columns 1 and 2. Interestingly, the coefficient for the relationship with KI now increases and becomes significant, which supports the idea that migrant inventors do bring in knowledge flows, but only within the boundaries of their firms. This is confirmed even more strongly when the focal variable is split between high-income and middle/low-income sending countries. As can be seen, now not only is the coefficient for middle/low-income countries positive and significant, but that of the high-income economies is too, confirming the idea that knowledge flows carried by migrants pass, at least partially, through multinationals. Again, the importance of STEM migration for intra-firm knowledge flows accords with findings by Oettl and Agrawal (2008) and Breschi et al. (2017).

Direct and indirect effects
We also explore differences in the migration-diffusion relationship, depending on whether citations occur among members of a given migrant community and their home colleagues (direct effects) or they include natives and migrants from other origins (indirect effects). We expect the former to show stronger effects, but the latter to influence diffusion, too. As we do not have information on nationality for all the inventors listed in the citing and cited patents, the results should be treated with care, but are still informative. Thus, as shown in Table 10, direct effects (both for KI and for KO) show positive coefficients, as expected. They are particularly large for KI, in fact. Results emerge also for indirect effects, which remain positive and significant. In fact, the coefficient for KI turns out to be significant when citations are split between direct and indirect effects.

Robustness analysis
We present some robustness checks in this section. As discussed in section 2 of the present paper, the large majority of empirical evidence on the relationship between STEM migration and knowledge diffusion concerns mainly the US, as it is, by far, the largest talent-receiving country (especially from China and India), as well as the leading technology nation from which international spillovers emanate. In order to assess whether the results encountered in this paper, and in a large part of the related literature, can be extended beyond the US, we reproduce some of our regressions without the US as destination country. This is done in Table 11. As in previous regressions, STEM migration does not impact KI, on average. Interestingly, the coefficient on KO is considerably reduced too, becoming non-significant, due to the importance of the US as an attractor of foreign talent as well as a source of knowledge and technology to all other nations. Columns 3 and 4 differentiate across countries of origin (high- vs middle/low-income countries), and again find that what matters for non-US countries is STEM migration from developing countries, as it shows a positive and significant relationship with both KI and KO. Thus, when removing the US as receiving country, it seems that STEM migrants are important only when differences across countries are more acute ("South-North" axis).
[ Table 11 about here]
Table 12 removes the BRICS countries from the analysis (Brazil, Russia, India, China and South Africa). Results do not change to a large extent with respect to the baseline. However, the effect of migrants from middle-income countries on KI, formerly positive and significant, does not arise this time. This is as expected, as a large majority of migrants from middle-income economies originate in BRICS countries.
[ Table 12 about here]
Next, in Table 13 we focus on intra-European flows only. Columns 1 and 2 confirm that the migration-diffusion relationship is positive and significant, again, for KO, but not for KI (where it is even slightly negative). Given that barely any flows come from non-high-income countries, we focus instead on inter- and intra-company citations in columns 3 to 6. Results are repeated for inter-company citations (migrant inventors matter for KO). However, migrant inventors do matter for KI when these flows occur within the boundaries of the firm.
[ Table 13 about here]
In further robustness checks, Table 14 shows the main results using only X, I and Y citations. These are particularly relevant documents that may call into question the novelty and/or inventive step of the citing patents (Jaffe and de Rassenfosse, 2017). This type of citation is possibly more important for inventors themselves, and less for examiners and lawyers (Criscuolo and Verspagen, 2008), which makes them more suitable for proxying knowledge flows. As shown in Table 14, our results and conclusions hold. In fact, the majority of coefficients are larger when using only "relevant documents".

Conclusion
In this paper we have used the gravity model to show how STEM migration -as measured by the number of migrant inventors -affects international knowledge diffusion -as measured by patent citations. While this research question is not new, systematic, global empirical evidence (especially beyond the US) is still scarce. Using inventors as a proxy for STEM migrants and their declared nationality to infer their migratory background, we have provided new results on the relationship between STEM migration and knowledge brought into their host countries, as well as knowledge sent back to their homelands. Further, we have also explored certain conditions for which these relationships are not linear: (1) when transaction costs are more acute ("South-North" axis); (2) within the boundaries of multinationals, and (3) beyond the US.
All in all, our results suggest that migrant inventors living in a given country are important for knowledge flows, not only to their homelands but also to their host countries -though to a lesser extent. Contrary to what has been advocated in the migration literature on the detrimental effect of high-skilled migration from low-income countries and an alarming brain drain, we find that low/middle-income countries benefit technologically from their migrant inventors living in high-income economies. At the same time, high-skilled migrants from low/middle-income countries also bring in some knowledge to their high-income host countries. Finally, we also learn that STEM migration and multinationals' strategies interact with each other with respect to knowledge diffusion, as well as the importance of the US in driving our results (with a few, notable exceptions).
Our research intends to convey the message that, instead of focusing the debate on brain drain issues, the attention of home and host countries' policymakers should be more oriented towards finding strategies that will establish and strengthen connections between STEM migrants and their non-mover peers, both at home and abroad, through adequate knowledge networks.
A few limitations are worth discussing. Our approach to STEM migration, based on inventors from the PCT with known nationality, could be an underestimate, as it misses inventors with a migratory background that have become nationals of their host country. If the likelihood of gaining citizenship differs across country-pairs, this could bias our estimates, though we build our explanatory variable using 1-year windows, so as to account for the most recent migrants only. Unfortunately, it is difficult to assess the severity of this potential bias. Related to this, PCT applications comprise only around 15-20% of all inventions worldwide. They are, indeed, only a subsample of all patents (and all inventors). However, the underlying inventions are likely to have a larger economic and technological value than national applications (van Zeebroeck and van Pottelsberghe de la Potterie, 2011). Finally, as our analysis remains at the aggregate, country level (despite efforts to disentangle heterogeneous effects across countries), some particularities may remain hidden. More detailed, case study approaches would be required to uncover specific singularities.

Notes: Country-pair level clustered standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1. The number of observations is lower than the implied 33 receiving countries * (133-1) sending countries * 21 years because inventor migration data do not exist for some country-year pairs (e.g., ex-USSR republics before 1993). All explanatory variables are transformed using the inverse hyperbolic sine transformation (MacKinnon and Magee, 1990), except for dummies, the index of technological similarity and distance between capitals, which is log transformed.

Notes: Country-pair level clustered standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1. Controls in all regressions include Contiguity, Colony, Common official language, ln(Distance b/ capitals), # inventors in i, # inventors in j, Average citations in i, Average citations in j, Exports, Imports, Stocks migrants 2000, and Technological similarity. All explanatory variables are transformed using the inverse hyperbolic sine transformation (MacKinnon and Magee, 1990), except for dummies, the index of technological similarity, the migration policy index, and distance between capitals, which is log transformed.

Observations decrease with respect to Table 2 because non-linear models (e.g., Poisson) remove country-pair observations in the absence of time variation (all zero outcomes) if country-pair FE are included. Controls in all regressions include # inventors in i, # inventors in j, Average citations in i, Average citations in j, Exports, Imports, and Technological similarity. All explanatory variables are transformed using the inverse hyperbolic sine transformation (MacKinnon and Magee, 1990), except for the index of technological similarity.

KO
-0.120** (0.0488); 0.0707* (0.0378); -0.00723 (0.0358); -0.0207 (0.0300); -0.0378** (0.0171)
Notes: Country-pair level clustered standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1.
Notes: All explanatory variables are transformed using the inverse hyperbolic sine transformation, except for dummies, the index of technological similarity, the migration policy index, and distance between capitals, which is log transformed.