Beyond GDP: an analysis of the socio-economic diversity of European regions

ABSTRACT This paper aims to analyze the socioeconomic diversity of the European Union (EU-28) regions from a dynamic perspective. For that purpose, we combine a series of exploratory space-time analysis approaches with Multiple Factor Analysis (MFA) applied to a large range of indicators collected at the NUTS-2 level for the period 2000–2015 for the EU-28. First, we find that the first factor of MFA, interpreted as economic development (ECO-DEV), is spatially clustered and that a moderate convergence process is at work between European regions from 2000 to 2015. Second, when comparing these results with those obtained for Gross Domestic Product (GDP) per capita, we show that the convergence pattern detected with GDP per capita is more pronounced: ECO-DEV adjusts more slowly over time than GDP per capita. Third, the pictures provided by the remaining factors of interest, capturing educational attainment, population dynamics and employment, are very different.


Introduction
With more than one third of the European Union (EU) budget devoted to Cohesion Policy, the EU's regional policy, representing 351.8 billion euros for the 2014-2020 programming period, the effort made by the EU to support job creation, business competitiveness, economic growth, sustainable development and the general improvement of citizens' quality of life is considerable. From the creation of the European Community, the six initial Member States had the vision, set out in the founding Treaty, that "the Community shall aim at reducing the disparities between the levels of development of the various regions". This set the tone for subsequent regional policies. For 2014-2020, Cohesion Policy has set eleven thematic objectives covering various priorities such as strengthening research and R&D, supporting the shift towards a low-carbon economy or promoting sustainable employment, labor mobility and social inclusion. While the EU's regional policy covers all European regions, it is nevertheless mainly concentrated on less developed European countries and regions in order to help them catch up and to reduce the economic, social and territorial disparities that are still widely present in the EU, especially since the successive enlargements.
Given these stakes, it comes as no surprise that the empirical literature devoted to the analysis of regional economic disparities in Europe is substantial and has given rise to numerous studies since the seminal study by Barro et al. (1991). Existing studies can be broadly classified into two categories. On the one hand, confirmatory approaches to formal growth modeling are based on models set in the growth econometrics literature (Durlauf et al., 2005) and focus on unconditional, conditional (the so-called β-convergence) or club convergence. On the other hand, a rather atheoretical exploratory literature departs from the representative economy assumption and examines instead the entire distribution of the variable of interest, typically income, using tools such as Markov chains and distribution dynamics. With the "regional turn" that this literature has taken since the end of the 90s, starting inter alia with Rey & Montouri (1999) and Lopez-Bazo et al. (1999), and because regions typically experience greater openness and heterogeneity than national economies, issues arising from the presence of spatial dependence and spatial heterogeneity in regional growth datasets have been largely explored (see for instance Rey & Le Gallo (2009) for a review). While confirmatory approaches use spatial econometric methods to tackle these issues, exploratory approaches have been developed to analyse the spatial and space-time mobility of income distributions (Rey et al., 2011; Rey, 2014). Our paper, by implementing a large range of exploratory spatial and temporal data analysis (ESTDA) techniques, belongs to this second strand of the literature.
One common feature of these studies is that they overwhelmingly focus on a univariate measure of income, such as income per capita or Gross Domestic Product (GDP) per capita, as it is the main variable in testable predictions of growth models. Moreover, when it comes to analyzing European regional disparities, this choice is further supported by the fact that some European regional policies use thresholds of this measure to allocate specific or additional funding. Applications making use of exploratory spatial data analysis applied to the distribution of GDP or income per capita in European regions include, among others, Lopez-Bazo et al. (1999), Le Gallo & Ertur (2003), Dall'erba (2005) or Ertur & Koch (2006). Yet, other dimensions of disparities might be interesting. For instance, exploratory spatial data analysis methods have been applied to educational attainment and inequality in European regions (Rodríguez-Pose & Tselios, 2011; Chocholatá & Furková, 2017; Kalogirou, 2010), to human capital in Turkish districts (Erdem, 2016) or to social capital (Fazio & Lavecchia, 2013; Botzen, 2016) and demographic ageing (Gregory & Patuelli, 2015) in European regions. More generally, the use of other measures can be rooted in the debate pertaining to income or GDP as a very incomplete and partial measure of well-being and social welfare.
The aim of this article is to depart from the current state of the literature by applying a wide range of ESTDA measures to synthetic measures covering various aspects of economic activity: economic development, education, population and employment dynamics. These synthetic measures are obtained from a multiple factor analysis based on a large range of indicators collected at the NUTS-2 level for the period 2000-2015 for the EU-28. While ESTDA measures have been applied to analyse the space-time dynamics of income distribution in US states and Chinese states (Rey & Ye, 2010), Mexican states (Gutiérreza & Rey, 2013), Canadian cities (Breau et al., 2018) or other measures, such as total factor productivity (Di Liberto & Usai, 2013) in European regions, to the best of our knowledge, this is the first time that ESTDA methods are applied in such a way, i.e. by combining them with multiple factor analysis. We therefore extend the analysis by Del Campo et al. (2008), who also construct synthetic factors from a standard principal component analysis applied to a sample of European regions, but whose analysis remains static and is not concerned with spatial issues. Further, as our first synthetic factor can be interpreted as economic development, we compare the results obtained for this factor to those obtained for GDP per capita and show that there are indeed substantial temporal and spatial differences.
The remainder of the paper is organized as follows. In section 2, we present the database that we use in section 3 to perform the multiple factor analysis. In section 4, we analyze regional inequalities and their dynamics using the first component and contrast the results with those obtained with GDP per capita. Both a global and a local analysis are undertaken. In section 5, we briefly present the results obtained for three other meaningful factors. Section 6 provides some concluding comments and suggests some avenues for future research.

Data description
Table 1 presents the socio-economic variables collected for the empirical analysis. They are grouped into five broad categories: demography, economy, employment, education, and health. Economy variables (by economic sector) come from the Cambridge Econometrics' European Regional Database (ERD) and the remaining ones from the Eurostat Database "REGIO". The list of variables in Table 1 includes 19 out of the 24 main regional indicators published in the third report on economic and social cohesion (European Commission, 2004), marked with (*) in Table 1. 1 As in Del Campo et al. (2008), we exclude the remaining five variables as they do not meet the requirements necessary to undertake the empirical analysis, i.e. availability for all EU-28 countries 2 and expression as ratios or means to avoid scale problems. We then enrich and extend this first list using additional variables, which provide insights on other dimensions of disparities among European regions: demography variables (life expectancy, mean age of woman at childbirth, mean number of children), education variables (participation rate in education and training), employment variables (young people neither in employment nor in education and training 3 ) and health variables (hospital beds and health personnel per 100,000 inhabitants).
Our sample includes 275 regions at the NUTS-2 4 level in 28 European countries over the period 2000-2015 (see Table A2 for the distribution of regions across EU-28 countries). 5 We report in Table B3 in the appendix the descriptive statistics for the considered variables from 2000 to 2015. Most display large asymmetries between EU-28 regions. The largest ones are observed for population density (a 1:3800 ratio between the minimum and maximum densities), and for the variables "young people neither in employment nor in education and training" on female, male and total population (between 1:1020 and 1:1250). Variations in the remaining demography variables are much smaller than for population density: the ratios range from 1:1.2 to 1:7.5. Regarding economy variables, GDP per capita (GDP p.c. hereafter) shows the highest dispersion (1:33) while the lowest concerns the variable "wholesale, retail, transport, etc. employment" (1:5). Overall, their variations are higher compared to demography variables excluding population density. While female, male and total employment variables exhibit quite low dispersion (around 1:3), the dispersion of the unemployment variables is large, specifically for female unemployment (1:50) and youth unemployment (1:40).
Among education variables, the variable "participation rate in education and training" shows very large variations between regions (between 1:100 and 1:150). In comparison, the remaining variables of this group, related to the level of education, display substantial but smaller variations (between 1:10 and 1:30). The dispersion of health variables among regions is around 1:8.

Regional socio-economic indicators
Since we collected data for numerous variables (see Table 1) informing on the regions' socioeconomic conditions, we turn to data reduction techniques. Indeed, instead of analyzing the spatial pattern of each variable separately and then trying to build a global picture of regional inequalities, we extract the important information within our set of variables and express it as a collection of a few new orthogonal variables called "factors". This could be achieved using [...] and derives an integrated picture of the observations and the relations between variables and between groups of variables with a two-step procedure. In the first step, the groups of variables are made comparable in order to prevent the analysis from being dominated by the group with the strongest structure. To this end, each group of variables is normalized by dividing all its elements by its first singular value (the matrix equivalent of the standard deviation). Then, the normalized data tables are concatenated into a unique data table which is submitted in the second step to PCA.

1 We replace however the service employment variable by the following more disaggregated ones: emp trad, emp fin, and emp adm. Moreover, we use an additional sectoral employment variable: emp cons.
2 For variables with a "limited" number of missing values, we made some adjustments presented in Table A1.
3 neet fem (resp. neet mal) represents the share of young female (resp. male) people (population ages 15-24) who are not in employment, education or training, as a percentage of the total number of young female (resp. male) people. neet tot is the indicator computed without sex consideration.
4 NUTS stands for Nomenclature of Territorial Units for Statistics used by Eurostat. NUTS-2 refers to Basic Administrative Units and is the level at which eligibility to support from cohesion policy is determined.
5 We remove the remote French island Mayotte.

As MFA boils down to a PCA on the concatenated normalized data tables, the usual PCA outputs (coordinates, cosines, contributions, etc.) are available. Moreover, some MFA-specific outputs can also be derived to quantify the importance of each group in the common solution.
We apply MFA to extract a few principal components accounting for the major proportion of the total variance present in the dataset. Table 2 reports the eigenvalues (reflecting the importance of a component) of the first ten components derived from the analysis. The inertia of the first component is around 30%. The first four factors explain more than 65% of the total variance. Table 3 contains the correlations between the first four factors and the original variables.
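The two-step procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used for the results in Tables 2 and 3: it assumes standardized variables and uniform region weights, and the function name `mfa` and the random toy tables are hypothetical.

```python
import numpy as np

def mfa(groups, n_components=4):
    """Two-step MFA sketch: weight each group, then run a PCA on the stacked table."""
    weighted = []
    for X in groups:
        # Standardize the columns of the group (assumption: a normed PCA).
        Z = (X - X.mean(axis=0)) / X.std(axis=0)
        # Step 1: divide the whole group by its first singular value.
        s1 = np.linalg.svd(Z, compute_uv=False)[0]
        weighted.append(Z / s1)
    # Step 2: PCA of the concatenated table, computed through an SVD.
    Z_all = np.hstack(weighted)
    U, s, _ = np.linalg.svd(Z_all, full_matrices=False)
    eigenvalues = s ** 2
    explained = eigenvalues / eigenvalues.sum()   # share of total inertia
    scores = U[:, :n_components] * s[:n_components]
    return scores, eigenvalues, explained

# Toy data: 30 hypothetical regions, 3 yearly groups of 5 variables each.
rng = np.random.default_rng(0)
groups = [rng.normal(size=(30, 5)) for _ in range(3)]
scores, eigenvalues, explained = mfa(groups)
print(explained[:4].round(3))
```

Because every group has a first singular value of one after weighting, the first eigenvalue of the global analysis lies between 1 and the number of groups, which is what prevents a single group from dominating the first factor.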
Since each variable appears sixteen times, as many correlation coefficients can be computed with the factors. However, as the correlation coefficients between factors and the yearly versions of each of the variables have a stable sign, we present the average correlation with factors for each variable. The most relevant correlations are shown in bold in Table 3. We also display in Table 3 the squared cosines of each variable to check for the quality of its projection on the four factors.

6 The method has been introduced in Escofier & Pages (1983, 1994). For an extensive and comprehensive review, see Pagès (2014) and Abdi et al. (2013).
7 For each region, the variables are grouped by time, from 2000 to 2015, i.e. there are 16 groups. For the first group "Year-2000", variables are ordered as in Table 1.
Factor 1, named economic development (ECO-DEV) is associated with a high level of the economic indicators presented in Table 1 (GDP and employment), a high level of education and a large number of jobs in financial and business services sectors. It also displays positive correlations with the rates of participation in education and training variables and negative correlations with the unemployment variables, the rate of people neither in employment nor in education and training variables and the agriculture, forestry and fishing sector employment.
Factor 2, named low education (LOW-EDUC), globally expresses a high rate of active population with pre-primary, primary and lower secondary education and a low rate of active population with upper secondary and post-secondary levels. It is also associated with a low number of jobs in the industry sector. Factor 3, named population dynamics (POP-DYN), is associated with a high percentage of children (population aged less than 15 years) and a low percentage of retired people (population aged more than 65). Factor 3 also shows a positive association with fertility rate. Therefore, a region with a high score on this factor is young and dynamic.
Factor 4, named active population (ACT-POP), is associated with regions with high population density and a high percentage of active adults, and is also moderately associated with a large number of jobs in financial and business services sectors and with a high GDP per capita. Therefore, regions with a high value on this factor are those with attractive and competitive employment centers.
Factor 1, from its correlations with our original variables and its squared cosines, stands as a variable that provides indications of the economic situation of regions beyond GDP p.c. It can therefore be seen as an answer to the several limits of GDP pointed out in the literature (e.g. Robert et al., 2014;Fleurbaey, 2009). The next section is dedicated to the analysis of this factor: we analyze the regional disparities at work within EU-28 and their dynamics using Factor 1, while comparing the results with those obtained with GDP p.c. alone. Then, we complement the picture obtained with the analysis of Factors 2 to 4.

Beyond GDP per capita
We analyze regional inequalities and their dynamics from 2000 to 2015 within EU-28 using the first component derived from MFA: ECO-DEV.

Distribution dynamics and spatial pattern: a global assessment
To start, we display in choropleth maps using a quintile classification (see Figure C1 in the appendix) the spatial distribution of ECO-DEV in 2000 and in 2015. The darker the red (green) color, the more (less) developed the corresponding region. The visual inspection of these choropleth maps suggests a spatial clustering of similar values. In 2000, we identify a group of poor regions belonging to Portugal, Spain, Italy, countries on the Eastern border (Greece and countries of the former Eastern bloc (Poland, Romania, Bulgaria, etc.)) and on either side of the France-Belgium border. This is contrasted by rich regions located mainly in the United Kingdom, Sweden, Denmark, the Netherlands, Austria and the south-west of Germany. Fifteen years and one financial crisis later, the spatial pattern of ECO-DEV has not significantly changed.
Compared to the picture provided by GDP p.c. (see Figure D4 in the appendix), we note two main things. First, regions in the UK along with Scandinavian regions (excluding Norwegian regions, not in EU-28) appear relatively richer with ECO-DEV than with GDP p.c. Second, northern Italian regions, along with regions from Austria and Germany, appear relatively poorer with ECO-DEV compared to GDP p.c. The well-documented dualism of the Italian economy is therefore flagrant with GDP p.c. but less so when taking into account other variables.
This pattern of spatial clustering and its dynamics are explored in more detail using two global measures: Moran's I and the GIMA. For period t, Moran's I is written as:

$$ I_t = \frac{n}{s_0} \, \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \, z_{i,t} \, z_{j,t}}{\sum_{i=1}^{n} z_{i,t}^{2}} \tag{1} $$

where $z_{i,t}$ is the deviation from the mean of ECO-DEV observed in region i and period t, n is the number of regions and $w_{ij}$ is the (i, j) element of the spatial weight matrix, which expresses how region i is spatially connected to region j. By convention $w_{ii} = 0$. $s_0$ is a scaling factor equal to the sum of all the elements of the weight matrix ($s_0 = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}$). When the weight matrix is row-standardized, $s_0 = n$ and expression (1) simplifies accordingly. A value above (resp. below) $E(I) = -1/(n-1)$ indicates positive (resp. negative) global spatial autocorrelation.
Inference is based on a permutation approach.
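As a minimal sketch of the statistic and of permutation-based inference, the snippet below computes Moran's I on a toy chain of six regions. The data, the function names and the one-sided pseudo p-value convention are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def morans_i(y, W):
    """Global Moran's I: (n / s0) * z'Wz / z'z, with z the deviations from the mean."""
    z = y - y.mean()
    return (len(y) / W.sum()) * (z @ W @ z) / (z @ z)

def permutation_pvalue(y, W, n_perm=999, seed=0):
    """Pseudo p-value for positive autocorrelation under random relabeling of regions."""
    rng = np.random.default_rng(seed)
    observed = morans_i(y, W)
    sims = np.array([morans_i(rng.permutation(y), W) for _ in range(n_perm)])
    return observed, (np.sum(sims >= observed) + 1) / (n_perm + 1)

# Toy example: six regions on a line, each linked to its adjacent regions.
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
W = W / W.sum(axis=1, keepdims=True)   # row-standardization, so that s0 = n
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

I_obs, p_value = permutation_pvalue(y, W)
print(round(I_obs, 3), p_value)
```

With values increasing smoothly along the chain, the statistic lies well above its expectation of -1/(n-1), which is the signature of positive spatial autocorrelation.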
To present the GIMA, we start with the general rank correlation coefficient (Kendall, 1962):

$$ \tau(f, g) = \frac{c - d}{n(n-1)/2} \tag{2} $$

where c is the number of concordant pairs and d the number of discordant pairs. The numerator thus reflects the net concordance between all pairs of observations. In our application $f = z_{t-1}$ and $g = z_t$. τ ranges from -1 (perfect discordance) to 1 (perfect concordance).
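The rank correlation coefficient can be computed directly from the counts of concordant and discordant pairs. The sketch below does so in plain Python on hypothetical values for two consecutive periods; it assumes no ties.

```python
from itertools import combinations

def kendall_tau(f, g):
    """Kendall's tau from concordant (c) and discordant (d) pairs, assuming no ties."""
    c = d = 0
    for i, j in combinations(range(len(f)), 2):
        s = (f[i] - f[j]) * (g[i] - g[j])
        if s > 0:
            c += 1      # the pair is ordered the same way in both periods
        elif s < 0:
            d += 1      # the pair switches order between the two periods
    n = len(f)
    return (c - d) / (n * (n - 1) / 2)

z_prev = [1.0, 2.0, 3.0, 4.0]   # hypothetical values in period t-1
z_next = [1.0, 3.0, 2.0, 4.0]   # regions 2 and 3 swap ranks in period t
tau = kendall_tau(z_prev, z_next)
print(tau)
```

Here five of the six pairs are concordant and one is discordant, so the net concordance is 4 out of 6 pairs.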
A mobility index M is derived from τ as follows:

$$ M = \frac{1 - \tau}{2} $$

Larger M is an indication of greater distributional mixing. Specifically, M = 0 implies an absence of rank mobility, while M = 1 is an indication of full ranking mobility. Rey (2004) extends this traditional rank correlation measure to incorporate a spatial dimension. Specifically, τ is decomposed as follows:

$$ \tau = \varphi \, \tau_{W} + (1 - \varphi) \, \tau_{\bar{W}} \tag{3} $$

where $\varphi = \iota' W \iota / \iota' (W + \bar{W}) \iota$ represents the share of all pairs that involve geographic neighbors. $W$ and $\bar{W} = \iota \iota' - W - I_{n \times n}$ are matrices capturing respectively neighboring and non-neighboring relationships. ι is the (n × 1) unit vector. The decomposition in Equation (3) makes it possible to determine to what extent the classic general rank correlation coefficient is silent about the correlation patterns between neighboring and non-neighboring regions. This can be inferred based on random spatial permutations of the attributes to develop a distribution for $\tau_W$ under the null hypothesis of spatial homogeneity (Rey, 2016). The mobility index (M) can also be decomposed as follows:

$$ M = \varphi \, M_{W} + (1 - \varphi) \, M_{\bar{W}} $$

We report in Table 4 the values for the global measures of spatial autocorrelation and rank concordance over 2000-2015. The spatial weight matrix used to construct these statistics is based on k-nearest neighbors calculated from the great circle distance between regions' centroids. Since this weighting scheme avoids the problem of isolated regions having no neighbors, it is very useful in our case, with a dataset that includes some islands. The k-nearest neighbors weight matrix is defined as follows:

$$ w_{ij}(k) = \begin{cases} 0 & \text{if } i = j \\ 1 & \text{if } d_{ij} \le d_i(k) \\ 0 & \text{if } d_{ij} > d_i(k) \end{cases} $$

where $d_i(k)$ is the k-th order smallest distance between regions i and j such that region i has exactly k neighbors. We set k = 10 to guarantee spatial connection between regions belonging to different countries 8 and avoid a block-diagonal structure of the weights matrix (Le Gallo & Ertur, 2003). With k = 10, 34.25% of the 10-nearest neighbors belong to a different country.
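To make the k-nearest-neighbors weighting scheme concrete, the sketch below builds a binary weight matrix from great circle (haversine) distances between centroids. The five coordinates are rough, purely illustrative locations rather than actual NUTS-2 centroids, k = 2 is used instead of the paper's k = 10 to keep the example small, and the Earth radius of 6371 km is an assumption of the sketch.

```python
import numpy as np

def great_circle_km(lat, lon):
    """Pairwise haversine distances in km between points given in degrees."""
    lat, lon = np.radians(lat), np.radians(lon)
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def knn_weights(lat, lon, k):
    """Binary weights: w_ij = 1 if j is among the k nearest centroids of i, else 0."""
    d = great_circle_km(lat, lon)
    np.fill_diagonal(d, np.inf)              # enforces w_ii = 0
    nearest = np.argsort(d, axis=1)[:, :k]   # indices of the k closest regions
    W = np.zeros(d.shape)
    rows = np.repeat(np.arange(len(lat)), k)
    W[rows, nearest.ravel()] = 1.0
    return W

# Illustrative points, roughly: Paris, Brussels, Berlin, Rome, Athens.
lat = np.array([48.9, 50.8, 52.5, 41.9, 38.0])
lon = np.array([2.3, 4.4, 13.4, 12.5, 23.7])
W = knn_weights(lat, lon, k=2)
print(W)
```

Every row has exactly k ones, so no region is isolated; in this toy example the easternmost point is connected to the Italian one, which mirrors the motivation given for choosing k large enough to link regions across borders.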
The evolution of Moran's I over the period reveals a positive, significant and quite stable spatial autocorrelation for all years: the distribution of ECO-DEV is spatially clustered within a given time period (see Table 4). This confirms the visual inspection results: rich (resp. poor) regions are localized close to regions with relatively high (resp. low) values of ECO-DEV more often than if their localization were purely random. Interestingly, the estimated standardized Moran's I statistics are much larger than the ones obtained for GDP p.c. (see Table D5 in the appendix) and remain stable over time, while the level of global spatial autocorrelation for GDP p.c. decreases over time. This reveals the existence of strong disparities among European regions when considering a synthetic index. Moreover, unlike for GDP p.c., the 2008 financial crisis did not have any discernible global effect on the spatial agglomeration of countries.
8 For example, to connect Greece to Italy.

Regarding rank concordance (see Table 4), the degree of rank concordance between neighboring pairs is significantly lower than what would be expected under spatial randomness of rank changes. This would mean (using the transformation from τ W to M W ) that the mobility between neighboring pairs is significantly higher than the one expected under spatial randomness of rank changes within the observed periods, and since we observe a persistence of spatial clustering with Moran's I, this would suggest that this mobility between neighboring pairs is in the same direction. Looking at the results obtained for GDP p.c., we note that i) for almost half of the period, there is no significant difference between the mobility rate between neighboring pairs and the mobility rate expected under spatial randomness of rank changes, ii) for the remaining periods, differences exist: the mobility rate between neighboring pairs is significantly higher than the one obtained under spatial randomness. These mobility gaps are however smaller compared to the ones obtained for ECO-DEV. Overall, the mobility between neighboring pairs, higher for ECO-DEV compared to GDP p.c., provides an explanation for the high and persistent Moran's I found with ECO-DEV.
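The comparison between neighboring and non-neighboring pairs can be illustrated as follows. The sketch splits the rank correlation into its two spatial components and derives a pseudo p-value for the neighbor component by random spatial permutations. It treats a pair as neighbors whenever either region lists the other (a symmetrization assumed here for simplicity), assumes no ties, and uses hypothetical toy data.

```python
import numpy as np

def spatial_tau(f, g, W):
    """Return (tau, tau_W, tau_nonW, phi) for values f and g and weights W."""
    n = len(f)
    # S[i, j] is +1 for a concordant pair, -1 for a discordant pair, 0 on the diagonal.
    S = np.sign(np.subtract.outer(f, f) * np.subtract.outer(g, g))
    neigh = ((W + W.T) > 0).astype(float)
    np.fill_diagonal(neigh, 0.0)
    non_neigh = 1.0 - neigh - np.eye(n)
    tau = S.sum() / (n * (n - 1))
    tau_w = (S * neigh).sum() / neigh.sum()
    tau_nw = (S * non_neigh).sum() / non_neigh.sum()
    phi = neigh.sum() / (n * (n - 1))     # share of pairs that are neighbors
    return tau, tau_w, tau_nw, phi

def tau_w_pvalue(f, g, W, n_perm=999, seed=0):
    """Lower-tail pseudo p-value of tau_W under random relocation of the regions."""
    rng = np.random.default_rng(seed)
    observed = spatial_tau(f, g, W)[1]
    sims = []
    for _ in range(n_perm):
        p = rng.permutation(len(f))
        sims.append(spatial_tau(f[p], g[p], W)[1])
    return observed, (np.sum(np.array(sims) <= observed) + 1) / (n_perm + 1)

# Toy chain of six regions in which only adjacent regions swap ranks.
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
f = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
g = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])

tau, tau_w, tau_nw, phi = spatial_tau(f, g, W)
observed, p_value = tau_w_pvalue(f, g, W)
print(tau, tau_w, tau_nw, phi, p_value)
```

In this example all rank changes occur between neighbors, so concordance is perfect among non-neighboring pairs and negative among neighboring ones, while the overall coefficient alone would hide that contrast.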

Going local: a closer look at spatial dependence and its dynamics
We continue the investigation of the spatial dynamics at work in the distribution of ECO-DEV, using Moran scatterplots for a closer view of the spatial dependence and its dynamics, firstly between initial and final periods (Directional LISA) and secondly between a sequence of many periods (Markov LISA).
We start by mapping the significant LISA statistics. The LISA statistic in region i at time t ($L_{i,t}$), which formalizes the relationship between each observation of ECO-DEV and the weighted average of its neighbors (see Anselin, 1995), is defined as:

$$ L_{i,t} = \frac{n \, z_{i,t}}{\sum_{k=1}^{n} z_{k,t}^{2}} \sum_{j=1}^{n} w_{ij} \, z_{j,t} $$

with the same notations as before. Then, the LISA statistic is decomposed, i.e. each region in a given time period t is positioned in a Moran scatterplot using the coordinates $z_{i,t}$ and $\sum_{j=1}^{n} w_{ij} z_{j,t}$. 9 Inference is based on 9,999 random permutations and we use the Bonferroni p-value correction to deal with the multiple comparison problem. Specifically, we set to 0.05 the overall significance associated with multiple comparisons. Then, since each observation (region) has ten neighbors, the individual significance is set to 0.05/10 = 0.005.

9 The four quadrants of the Moran scatterplot report different types of spatial association between a region's ECO-DEV and that of its neighbors. In the first quadrant are located developed regions (regions with a relatively high ECO-DEV), neighbored by similar regions ("High-High" or HH). Quadrant two contains regions with relatively low ECO-DEV with developed neighbors ("Low-High" or LH), while quadrant three contains regions with a relatively low ECO-DEV with similar neighbors ("Low-Low" or LL). Finally, in quadrant four are located developed regions neighbored by regions with a relatively low ECO-DEV ("High-Low" or HL).
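A minimal sketch of the local statistics, of the quadrant classification and of the Bonferroni-corrected threshold is given below. It uses the standard local Moran form of Anselin (1995) with a row-standardized toy chain of eight regions; the conditional permutation scheme and the one-sided pseudo p-values are assumptions of this sketch, and 999 permutations are used instead of 9,999 to keep it fast.

```python
import numpy as np

def local_moran(y, W):
    """Local Moran statistics and Moran scatterplot quadrant of each region."""
    z = y - y.mean()
    lag = W @ z                                   # spatial lag of each region
    L = len(y) * z * lag / (z @ z)
    quadrant = np.where(z >= 0,
                        np.where(lag >= 0, "HH", "HL"),
                        np.where(lag >= 0, "LH", "LL"))
    return L, quadrant

def lisa_pvalues(y, W, n_perm=999, seed=0):
    """Conditional permutations: region i stays in place, the other values are shuffled."""
    rng = np.random.default_rng(seed)
    z = y - y.mean()
    observed = z * (W @ z)
    p = np.empty(len(y))
    for i in range(len(y)):
        others, w_i = np.delete(z, i), np.delete(W[i], i)
        sims = np.array([z[i] * (w_i @ rng.permutation(others)) for _ in range(n_perm)])
        if observed[i] >= 0:
            extreme = np.sum(sims >= observed[i])
        else:
            extreme = np.sum(sims <= observed[i])
        p[i] = (extreme + 1) / (n_perm + 1)
    return p

# Toy chain of eight regions with values increasing from one end to the other.
n = 8
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
W = W / W.sum(axis=1, keepdims=True)
y = np.arange(1.0, n + 1.0)

L, quadrant = local_moran(y, W)
p = lisa_pvalues(y, W)
alpha = 0.05 / 10            # Bonferroni bound with ten neighbors, as in the text
print(list(quadrant), p < alpha)
```

With row-standardized weights the local statistics average to the global Moran's I, which is a convenient consistency check on any implementation.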
10 In total, more than 96.42% of regions in high-high clusters in 2000 remain in the same cluster in 2015. In addition, 7.32% of regions belonging to the non-significant cluster in 2000 move into the significant high-high cluster in 2015, amplifying the spatial association of high-high regions.
11 More than 87% of regions in low-low clusters in 2000 remain in the same cluster in 2015. In addition, 11.38% of regions belonging to the non-significant cluster in 2000 move into the low-low cluster in 2015, amplifying the spatial association of low-low regions.

Movements to the north-east reflect simultaneous positive co-movements (gains) of a region and its neighbors in the ECO-DEV distribution. Conversely, movements to the south-west reflect a simultaneous worsening of the position of the region and that of its neighbors in the distribution of ECO-DEV (see Rey, 2001). We also report in Figure 3 (right panel) movements with points instead of arrows to ease the visualization. Indeed, long arrows may hide the presence of short ones.
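The directional reading of these movements can be sketched as follows: each region is a vector in the Moran scatterplot going from its initial to its final position, and the vector's angle determines the direction of the move. The sketch works with deviations from the yearly mean and hypothetical toy data; the function name and the quadrant coding are illustrative choices.

```python
import numpy as np

def directional_lisa(y0, y1, W):
    """Movement vectors in the Moran scatterplot between two periods."""
    z0, z1 = y0 - y0.mean(), y1 - y1.mean()
    dx = z1 - z0                    # move of the region itself
    dy = W @ z1 - W @ z0            # move of its spatial lag (the neighbors)
    angle = np.degrees(np.arctan2(dy, dx)) % 360
    # 1 = north-east, 2 = north-west, 3 = south-west, 4 = south-east.
    direction = (angle // 90).astype(int) % 4 + 1
    return dx, dy, angle, direction

# Toy chain of four regions, row-standardized weights.
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])
y0 = np.array([1.0, 2.0, 3.0, 4.0])   # hypothetical initial values
y1 = np.array([1.0, 2.0, 5.0, 8.0])   # hypothetical final values

dx, dy, angle, direction = directional_lisa(y0, y1, W)
print(direction)
```

In this example the two leading regions and their neighbors gain together (north-east), while the two lagging regions and their neighbors lose ground together (south-west). Counting regions per direction gives the frequencies that a rose diagram would display.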
To help the interpretation of movements displayed in Figure 3, we also provide in Figure C2 in the appendix their dynamics, depending on the location of regions at the beginning of the period. For example, in the first panel of Figure C2 [...]

This result implies that the assessment of regional economic performance provided by GDP p.c., for French regions for instance, is stricter than the one provided by ECO-DEV.

[...] Table 5. The chain has been estimated using maximum likelihood. The examination of these probabilities reveals several interesting characteristics about the spatial dynamics of ECO-DEV. First, the staying probabilities, i.e. the probability of remaining in one state between two time periods, are highest for quadrants 3 (LL) and 1 (HH) of the Moran scatterplot, followed by quadrants 2 (LH) and 4 (HL). Compared to regions in HH and LL, those in HL and LH are more likely to cross the scatterplot quadrants. Second, considering a region in the initial state LH, the movement to HH, which involves a change for the region but not its spatial lag, occurs more frequently than movement to LL, which involves a change in the position of the spatial lag but not the focal region. Similarly, for regions in HL in the initial state, moves to HH are more frequent than moves to LL. This confirms the moderate convergence pattern detected above. The relative mobility in this Markov LISA transition matrix 12 is relatively small (0.1652), confirming the persistence of spatial dependence also highlighted by GIMA. The last row of Table 5 shows the ergodic probabilities, which give an indication of the long-term probabilities in each class. The higher ergodic probabilities are associated with the HH and LL columns, meaning that, only a few LL regions and a lot of HH ones will exist in the long run.
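The mechanics of a LISA Markov chain can be sketched as follows: the maximum likelihood estimate of the transition matrix is the row-normalized matrix of transition counts, and the ergodic distribution is its stationary distribution. The quadrant sequences below are hypothetical, and the mobility measure shown is Shorrocks' index, used here only as one common choice since the exact definition behind the reported relative mobility is not restated in this section.

```python
import numpy as np

STATES = ["HH", "LH", "LL", "HL"]

def transition_matrix(sequences):
    """ML estimate: transition counts between consecutive periods, normalized by row."""
    index = {s: i for i, s in enumerate(STATES)}
    counts = np.zeros((4, 4))
    for seq in sequences:
        for a, b in zip(seq[:-1], seq[1:]):
            counts[index[a], index[b]] += 1
    return counts / counts.sum(axis=1, keepdims=True), counts

def ergodic(P):
    """Stationary distribution: left eigenvector of P associated with eigenvalue 1."""
    values, vectors = np.linalg.eig(P.T)
    v = np.real(vectors[:, np.argmax(np.real(values))])
    return v / v.sum()

def shorrocks(P):
    """Shorrocks mobility index (k - trace(P)) / (k - 1); assumed measure, see lead-in."""
    k = P.shape[0]
    return (k - np.trace(P)) / (k - 1)

# Hypothetical quadrant sequences of six regions over five periods.
sequences = [
    ["HH", "HH", "HH", "HH", "HH"],
    ["LL", "LL", "LL", "LL", "LL"],
    ["LH", "LH", "HH", "HH", "HH"],
    ["HL", "HL", "LL", "LL", "HL"],
    ["HH", "LH", "LH", "LH", "LL"],
    ["LL", "LL", "LH", "LH", "HH"],
]
P, counts = transition_matrix(sequences)
pi = ergodic(P)
mobility = shorrocks(P)
print(P.round(2), pi.round(3), round(mobility, 3))
```

The diagonal of the estimated matrix holds the staying probabilities discussed in the text, and the stationary vector plays the role of the last row of the transition table.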
When we compare these results to the GDP p.c. ones and focus on quadrants 1 and 3, where the majority of regions are concentrated (see Table D6 in the appendix), we can note that staying probabilities are almost the same for ECO-DEV and GDP p.c. This means that most regions in these quadrants, while improving or worsening their positions (see Figure C2), stay in their starting quadrant over time.

[...] Table 6. The staying probabilities are again the highest for quadrants 3 (LL) and 1 (HH) of the Moran scatterplot and the non-significant case, followed by quadrants 4 (HL) and 2 (LH). As for the four-class LISA Markov, i) compared to regions in HH and LL, those in HL and LH are more likely to cross the scatterplot quadrants; ii) for a region in the initial state LH, the movement to HH, which involves a change for the region but not its spatial lag, occurs more frequently than movement to LL, which involves a change in the position of the spatial lag but not the focal region. For regions in HL in the initial state, moves to HH are less frequent than moves to LL (for the four-class case, we had the opposite observation). The convergence pattern is explained in part by the movements of regions in LH, which mostly either stay in the same quadrant or move into quadrant 1 (improvement).
We set up a formal test for co-movement dependence, decomposing the LISA chain into two marginal discrete chains: one for the focal unit and one for the spatial lag (the neighbors). Each of these marginal chains has two states, H or L, depending on its position relative to the mean value of ECO-DEV in a given time period. The test of the difference between these two transition matrices resulted in χ2(9) = 4854.65, p-value < 0.0001. It indicates that the movement of a region's ECO-DEV in the distribution depends on the movement of the neighboring regions' values. When we compare these results to those obtained with GDP p.c. (see Table D7 in the appendix), we note, focusing on quadrants 1 (HH) and 3 (LL) where the vast majority of regions are concentrated, that the staying probability in LL is relatively stable while the one in HH is higher for ECO-DEV compared to GDP p.c. This again confirms the fact that regions are less integrated when assessed with ECO-DEV compared to GDP p.c. For countries like France, the dynamics of regional GDP p.c. is negative, while with ECO-DEV these regions are doing well. That would mean that variables like "female employment rate" or "young people neither in employment nor in education and training" are limiting the effect of the fall in GDP p.c. and are acting as economic stabilizers. Conversely, some regions from Eastern countries are doing well with GDP p.c. and not with ECO-DEV. As explained above, this would mean that the GDP variable is less rigid than the others in ECO-DEV. It could also mean that there are some intra-NUTS-2 GDP p.c. disparities within these poor regions. At any rate, the results clearly highlight the fact that in some NUTS-2 regions, GDP p.c. is a poor indicator of economic well-being.
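One way to frame the co-movement test mentioned above is to compare the observed joint transition counts with those expected if the focal chain and the spatial lag chain moved independently. The sketch below is a hedged illustration of that idea, not the exact procedure behind the reported statistic: the state ordering, the function name and the count matrices are assumptions.

```python
import numpy as np

# Joint states ordered as (focal, lag): HH, HL, LH, LL, i.e. index 2*focal + lag with H=0, L=1.
def comovement_chi2(counts):
    """Chi-square distance between observed joint transitions and independence."""
    C = np.asarray(counts, dtype=float).reshape(2, 2, 2, 2)   # axes: f0, l0, f1, l1
    focal = C.sum(axis=(1, 3))        # 2x2 transition counts of the focal region
    lag = C.sum(axis=(0, 2))          # 2x2 transition counts of the spatial lag
    P_focal = focal / focal.sum(axis=1, keepdims=True)
    P_lag = lag / lag.sum(axis=1, keepdims=True)
    observed = C.reshape(4, 4)
    # Expected joint transitions if the two marginal chains were independent.
    expected = observed.sum(axis=1, keepdims=True) * np.kron(P_focal, P_lag)
    mask = expected > 0
    return np.sum((observed[mask] - expected[mask]) ** 2 / expected[mask])

# Hypothetical counts with strong persistence of the joint HH and LL states.
dependent = [[80, 5, 5, 10],
             [10, 70, 2, 18],
             [12, 3, 65, 20],
             [8, 6, 6, 80]]

# Counts built to satisfy independence exactly, as a sanity check.
P_f = np.array([[0.9, 0.1], [0.2, 0.8]])
P_l = np.array([[0.7, 0.3], [0.4, 0.6]])
independent = np.array([[100.0], [50.0], [60.0], [90.0]]) * np.kron(P_f, P_l)

chi2_dep = comovement_chi2(dependent)
chi2_ind = comovement_chi2(independent)
print(round(chi2_dep, 2), round(chi2_ind, 6))
```

The statistic is zero when the joint transitions factor into the two marginal chains and grows as the region and its neighbors tend to move together.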

Complementary analysis: what about the other factors?
We briefly present in this section the results obtained for the remaining three factors from the MFA: low education (LOW-EDUC), population dynamics (POP-DYN) and active population (ACT-POP).
Visually, the choropleth maps of these factors suggest spatial association of similar values, more pronounced for LOW-EDUC and POP-DYN compared to ACT-POP (see Figure E12 in the appendix). [...] Recall that a region with a high score on this variable is probably young and dynamic. Therefore these clustered regions are the youngest and most dynamic places in EU-28.
These clusters are also persistent over time. One can note, however, that by 2015 an extra cluster has emerged with regions from the north of France, while most regions from Poland, previously in the high-high cluster, are no longer clustered. Besides these high-high clusters, we identify in 2000 two clusters of low-low regions.
The first one consists of regions from Portugal, Spain, southern regions of France, Italy, Croatia and some regions from the core center of Europe (Austria and Germany). The second cluster is composed of regions from Greece. These regions are characterized by a high proportion of people aged more than 50. Finally, when moving to 2015, one can observe that the southern regions of France are no longer in this low-low cluster and that the first low-low cluster described above expands toward the center of Europe. The last factor, ACT-POP, is much less concentrated compared to the first two. As a complement to Rose diagrams, we plot regions displaying the direction of moves identified for each of our three variables. With respect to LOW-EDUC (see Figure E14 in the appendix), the majority of regions from the UK, core central Europe and the former Eastern bloc are getting more educated over time. Note also that some regions in France, Spain, Portugal, Italy and Greece follow the exact opposite path. Regarding the variable POP-DYN (see Figure E15 in the appendix), one can observe that most regions from France, Portugal, Spain, Italy and Croatia are becoming younger and, conversely, regions from Scandinavia, Greece and the former Eastern bloc are ageing. Finally, from the observation of ACT-POP (see Figure E16 in the appendix), we observe that most regions from Spain, Portugal, the UK and those located at the center of Europe are increasing their attractiveness over time. Conversely, most regions from France, Italy, Scandinavian countries, the north of Germany and the west of Poland are losing ground in the competitiveness race. The results of the inference (see Figure E17 in the appendix) highlight that the movement of regions and their neighbors is different from a random one at 5% for quadrants 1 and 3 (where the majority of movements are concentrated), with one exception however.
Indeed, for LOW-EDUC, the movements corresponding to the situation where both the region and its neighbors increase, but more so for the region itself, are not significantly different from random movements over time. As the majority of movements are concentrated in quadrants 1 and 3, the conclusions drawn above from the observation of Figure E13 remain globally valid.
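To make the quadrant vocabulary concrete, the following minimal sketch shows one way a region's move in the Moran scatterplot could be assigned to a rose-diagram sector. It is an illustration only, not the procedure used for the figures: the function names `move_sector` and `move_quadrant`, the arguments `dz` (change in the region's standardized value) and `dwz` (change in its spatial lag), and the choice of eight sectors are assumptions.

```python
import math

def move_sector(dz, dwz, k=8):
    """Return the sector index (0..k-1) of a move vector (dz, dwz).

    dz  : change in the region's own value between the two dates (assumed input)
    dwz : change in the spatial lag, i.e. the neighbors' average (assumed input)
    Sectors are counted counter-clockwise starting from the positive dz axis.
    """
    angle = math.atan2(dwz, dz) % (2 * math.pi)
    # The final modulo guards against a rounding case where angle equals 2*pi.
    return int(angle // (2 * math.pi / k)) % k

def move_quadrant(dz, dwz):
    """Return the quadrant (1..4) of a move vector.

    1: region and neighbors both increase; 2: region decreases, neighbors increase;
    3: both decrease; 4: region increases, neighbors decrease.
    """
    return move_sector(dz, dwz, k=4) + 1

# Region and neighbors both increase, more so for the region itself:
# quadrant 1, first of eight sectors.
print(move_quadrant(1.0, 0.5), move_sector(1.0, 0.5))
```

With eight sectors, each quadrant splits in two, which is what allows the distinction between "more so for the region" and "more so for the neighbors" within quadrant 1.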
We finally estimate LISA Markov chains over the study period (from 2000 to 2015). The results are reported in Table 7. As for ECO-DEV, the staying probabilities are relatively high. They are the highest for quadrants 1 (HH) and 3 (LL). Regions in quadrants 2 (LH) and 4 (HL) are less stable than those in quadrants 1 and 3 and are thus more likely to cross the scatterplot quadrants. One can also note that the relative mobility of the Markov transition matrices is quite stable across factors 2 to 4 (0.1592, 0.1510 and 0.1350 for LOW-EDUC, POP-DYN and ACT-POP respectively). That is, the lower staying probabilities in quadrants 1 (HH) and 3 (LL) for ACT-POP, compared to LOW-EDUC and POP-DYN, are almost offset by its higher staying probabilities in quadrants 2 (LH) and 4 (HL). The five-class LISA Markov results are reported in Table E9 in the appendix. Even if the estimated probabilities are lower than those from the four-class method, they remain relatively high. Also, as seen in the last section, regions in HL and LH are more likely to cross the scatterplot quadrants than those in HH and LL. Two additional observations can be made. First, for regions in the initial state HL, the probability of moving to HH decreases, even to zero, while the probability of moving to LL increases. Second, for regions in the initial state LH, the probability of moving to HH increases, while the probability of moving to LL decreases, even to zero.
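As an illustration of how staying probabilities and a mobility summary can be read off such a chain, the sketch below estimates a four-state transition matrix from sequences of quadrant labels by simple counting. It runs on made-up toy data, not on the data behind Table 7. The function names are hypothetical, and the Shorrocks index $(k - \mathrm{tr}(P))/(k-1)$ is used only as one common mobility measure; that it matches the relative mobility index reported above is an assumption.

```python
def transition_matrix(sequences, states=("HH", "LH", "LL", "HL")):
    """Estimate transition probabilities by counting year-to-year moves.

    sequences : one list of quadrant labels per region, ordered in time
    Returns a row-stochastic matrix as a list of lists; a state that is
    never observed as an origin gets a row of zeros.
    """
    index = {s: i for i, s in enumerate(states)}
    k = len(states)
    counts = [[0] * k for _ in range(k)]
    for seq in sequences:
        for origin, destination in zip(seq, seq[1:]):
            counts[index[origin]][index[destination]] += 1
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])
    return probs

def shorrocks_mobility(probs):
    """Shorrocks index: 0 if every state is absorbing, larger means more mobile."""
    k = len(probs)
    return (k - sum(probs[i][i] for i in range(k))) / (k - 1)

# Toy data (illustrative only): four regions observed at three dates.
toy = [
    ["HH", "HH", "HH"],
    ["LL", "LL", "LL"],
    ["LH", "HH", "HH"],
    ["HL", "LL", "LL"],
]
P = transition_matrix(toy)
print(P[0][0], P[2][2], shorrocks_mobility(P))
```

The diagonal entries of `P` are the staying probabilities discussed above; off-diagonal entries in the LH and HL rows capture the moves toward HH or LL.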

Conclusion
In this paper, we analyze the socioeconomic disparities at work in a sample of 275 regions of the EU-28 from a dynamic perspective (2000–2015). Starting from a wide set of socioeconomic indicators from Cambridge Econometrics' European Regional and Eurostat "REGIO" databases, we show, first, that the first factor of the MFA, interpreted as economic development (ECO-DEV), is spatially clustered and that a moderate convergence process is at work between European regions from 2000 to 2015.
Second, when we compare these results with those obtained for the usual indicator of economic activity, i.e. GDP per capita, we show that the convergence pattern detected with ECO-DEV is less pronounced than with GDP p.c. This would mean that ECO-DEV adjusts more slowly over time than GDP p.c. In other words, some of the original variables contributing to ECO-DEV (from the MFA) must be relatively rigid.
Third, the pictures provided by the remaining factors of interest, i.e. factors 2 to 4, are completely different from the one provided by ECO-DEV. One can note that people in most European regions are getting more educated over time. Also, most regions from France, Portugal, Spain and Italy are becoming younger. This contrasts, however, with the opposite trend of almost equal strength for regions from the UK and Eastern countries. Finally, most regions from France, Italy, the Scandinavian countries, the north of Germany and the west of Poland are losing ground in the competitiveness race.
All these results point to the limits of GDP p.c. as a single indicator of development. Several research directions could be further investigated. In particular, conditional on the availability of data, a multiscalar analysis could be undertaken. Indeed, Díaz Dapena et al. (2019) show that for per capita GDP, a general process of convergence in the EU co-exists with intranational processes of divergence. It would be interesting to analyze whether such differences due to spatial scale also exist for the MFA factors, notably economic development.