|
Socio-Economic and Spatial Impacts of Trans-European Transport Networks

Data Requirements and Structures

Ian Masser, Max Craglia, Adelheid Holl

Summary of: Masser, D., Craglia, M., Holl, A. (1997): Data Requirements and Structures. SASI Deliverable D7. Report to the European Commission. Department of Town and Regional Planning, University of Sheffield.

Deliverable D7 sets out the information requirements for the simulation modules. The SASI project aims to develop a comprehensive and transferable methodology with moderate data requirements for forecasting the impacts of the Trans-European Transport Networks (TETN) programme. This objective calls for data sources that are generally available in spatially disaggregated form, as time series, and comparable across the European Union. For this reason the Eurostat database REGIO is seen as the primary data input to the project as a whole, as it is the main official source of regional data available for the European Union.

1. Space

The SASI model forecasts socio-economic development in 201 regions of the fifteen Member States of the European Union. These are the 'internal' regions of the model defined for SASI. The regions are based on the NUTS level 2 classification, with an additional subdivision of Ireland and Denmark into three and two regions respectively. The eight NUTS level 2 regions which are geographically not in Europe and are linked neither to the road nor to the rail network have not been included as internal regions, but constitute external regions. These are the French Overseas Departments, Ceuta y Melilla, the Canarias, Madeira and the Acores.

In addition, the rest of Europe has been subdivided into 27 external regions. These regions are necessary as additional destinations for the calculation of accessibility indicators, but they are not modelled.
For the external zones, exogenous forecasts of the destination activities relevant for the calculation of accessibility indicators therefore need to be provided. Population and gross domestic product (GDP) will most likely be used as destination activities for the accessibility indicators in the SASI model (see SASI Deliverable D5, Schürmann et al., 1997). The SASI system of regions and the strategic networks used in SASI are also used in the concurrent DGVII projects STREAMS, EUNET and STEMM.

1.1 The NUTS Classification

The Nomenclature of Statistical Territorial Units (NUTS) subdivides the European territory into interrelated levels. It is essentially a hierarchical classification in which each higher-level region groups a set of lower-level regions. Beyond NUTS level 0, which corresponds to countries, the classification consists of three regional levels (NUTS 1-3) and two local levels (NUTS 4 and 5). NUTS level 3 is, however, generally the lowest level at which Eurostat data are published (European Commission, 1995b). The current NUTS nomenclature subdivides the territory of the European Union into 15 NUTS 0 regions (countries), 77 NUTS 1 regions, 206 NUTS 2 regions and 1031 NUTS 3 regions.

The NUTS classification tries to integrate the different national administrative units for which statistics are collected. However, the classification does not provide a harmonised set of regional units. At each level the number and size of regions vary greatly from country to country, and not all countries and regions have a lower-level subdivision. Generally, the variations in the size and population of the units increase with the level of disaggregation.

In terms of area, the largest regions at all three levels of the NUTS classification are situated in Sweden and Finland. Sweden, for example, constitutes a single NUTS level 1 region with a size of over 400,000 sq. km.
Prior to the accession of Sweden and Finland, the largest regions were in Spain: the NUTS 1 region Centro with a size of about 200,000 sq. km and the NUTS 2 region Castilla-Leon with about 94,000 sq. km. In contrast, the smallest NUTS level 1 and 2 region is Brussels with a size of about 160 sq. km.

In population terms, the largest NUTS level 1 regions are the South East in the UK and Nordrhein-Westfalen in Germany with about 17 million inhabitants each. In contrast, the smallest region at NUTS level 1 and 2 is the island Ahvenanmaa/Aaland in Finland with a population of only 24,700 inhabitants. At NUTS level 2, the largest regions in population terms are Ile de France with 10.8 million inhabitants and Lombardia in Italy with a population of 8.8 million, followed by Andalucia, Greater London and Cataluna, all with a population between 6 and 7 million.

At NUTS level 3 the number of administrative units is much higher in Germany than in the rest of the Union. Germany has 445 NUTS level 3 regions compared with 96 in France, 95 in Italy, 65 in the UK and 52 in Spain. Consequently, the average population and the average size are lower in Germany than in most other Member States.

The NUTS classification is based on national administrative units or groupings of these. This facilitates the collection of statistics, but such units are not always the relevant territorial units for economic comparison. This is particularly the case for certain smaller city-regions such as Hamburg, which are deprived of the rural hinterland that characterises most other regions. The presence of these regions may bias any comparison (measures of inequality such as the minimum-maximum ratio, variance and standard deviation). This is particularly true in the case of GDP per capita, where inward commuters generate a proportion of these regions' output but are not counted in the resident population, which is the denominator for the per capita income calculation.
Thus, for these cities very high values of GDP per capita are observed.

Alternative regional classifications, such as Functional Urban Regions (FURs), address these problems. These regional units are based on travel-to-work areas and combine places of work with the corresponding places of residence, which supposedly makes them economically more coherent. Although this approach allows the establishment of a more consistent framework of areal units, it also has obvious deficiencies. The criteria used to specify a functional region are very diverse and there is no general agreement on them. A very important practical problem relates to the availability of data. Using areas other than the basic statistical units would require a significant amount of approximation. Since no official statistics are published for such areas in the EU, the data would have to be aggregated or disaggregated from available data by using weights. All this makes comparability and transferability difficult.

Given these deficiencies, the aim of the SASI project to develop a transferable and robust model with moderate data requirements, and the fact that much of the framing of Community policy relates to the NUTS regions, we base the spatial dimension of the SASI model on the NUTS classification, while remaining aware of its limitations.

2. Time

The temporal dimension of the model is established by dividing time into discrete intervals or periods of one year's duration. By modelling relatively short time periods, both short- and long-term lagged impacts can be taken into account. The base year of the simulations will be 1981 in order to demonstrate that the model is able to reproduce the main trends of spatial development in Europe over a significant period of the past with satisfactory accuracy. The forecasting horizon of the model has now been agreed to be 2016.
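
The yearly time grid described in this section can be sketched in a few lines, together with one way of expanding sparsely entered inputs onto it. This is only an illustration under stated assumptions, not the SASI implementation: the function names, the linear interpolation rule and the sample figures are hypothetical, and only the base year 1981 and the horizon 2016 are taken from the text. The report notes elsewhere that time series data may be entered for selected years only, with the model interpolating between them; this summary does not say which interpolation rule is applied.

```python
from bisect import bisect_right

BASE_YEAR = 1981      # base year stated in the text
HORIZON_YEAR = 2016   # forecasting horizon stated in the text


def simulation_years(base=BASE_YEAR, horizon=HORIZON_YEAR):
    """Return every year on the one-year simulation grid, both ends included."""
    return list(range(base, horizon + 1))


def interpolate_to_yearly(known, years):
    """Expand sparse {year: value} inputs onto the yearly grid.

    Linear interpolation is an assumption of this sketch. Years outside the
    range of the known values are rejected rather than extrapolated.
    """
    anchors = sorted(known)
    result = {}
    for year in years:
        if year < anchors[0] or year > anchors[-1]:
            raise ValueError(f"{year} lies outside the known range")
        if year in known:
            result[year] = float(known[year])
            continue
        upper = bisect_right(anchors, year)  # index of first anchor after `year`
        y0, y1 = anchors[upper - 1], anchors[upper]
        share = (year - y0) / (y1 - y0)
        result[year] = known[y0] + share * (known[y1] - known[y0])
    return result


# Hypothetical exogenous input entered for three non-equidistant years.
sample_input = {1981: 100.0, 1986: 110.0, 2016: 170.0}
yearly = interpolate_to_yearly(sample_input, simulation_years())
```

With these sample figures the grid holds 36 yearly values; years lying between two entered values fall on the straight line connecting them.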
2.1 The Need for Robust Data Sets Collected Over Time

The analysis of time series data suffers in some cases from discontinuities due to changes in the number of regional units, boundary changes and incomplete coverage. More importantly, time series analysis has to allow for cyclical movements in economic behaviour. For example, growth rates should be calculated for periods which run from peak to peak or from trough to trough. In this context, we are aware that 1981 was a recession year due to an oil crisis, with differential effects between regions. Since the timing of cycles and of troughs and peaks differs within the European Union from country to country, comparisons of regional performance have to be made with care. This may be of particular relevance for the calculation of socio-economic indicators over time.

Changes in the NUTS classification due to the creation of new regions or the reorganisation of existing regions cause breaks in the data series which directly affect the availability and reliability of the data for the regions concerned. A whole range of such changes falls within the study period of the SASI project. Examples are the creation of Flevoland out of Overijssel and Gelderland in the Netherlands in 1986, the separation of the Brussels region from Brabant in 1992 in order to obtain a hierarchical structure for the Belgian regions, and the reunification of Germany in 1990, which led to the addition of the regions of the new German Laender and the inclusion of the former area of East-Berlin in the NUTS level 2 region of Berlin. Data prior to reunification therefore refers to the former West-Berlin only, whereas data after reunification refers to the total area of Berlin. With the addition of the regions of the new German Laender, a complete rearrangement of their administrative boundaries took place.
Similar general boundary changes also took place in Finland in 1988 at the level of provinces and in Greece at the level of the development regions.

The changes which took place during the study period directly affect the availability and reliability of the data. They result either in missing data for the affected regions for the years before the changes, for example for the regions affected by the creation of Flevoland and Corse (only in some cases, such as population data, does Eurostat provide adjusted figures for the years before the changes), or, as in the case of Berlin, in data from before the change that is not directly comparable and has to be adjusted. Similarly, in the later cases of changes in the NUTS classification, adaptations are necessary since the SASI model is based on the latest system of NUTS classification.

Given the total number of regions included in the analysis, the regions affected by changes in the NUTS classification represent only a small share. Where the lower levels of disaggregation (NUTS 3 in our case) remained unchanged, the data can be adjusted by aggregation. This is the case, for example, for the Belgian regions. The remaining cases are either a problem of missing data or similar in their treatment, since estimations will be necessary.

3. Data Requirements

Two major groups of data can be distinguished: data required for running the model, and data needed for the calibration and validation of the model.

Simulation Data

Simulation data are the data required to perform a typical simulation run. They can be grouped into base year data and time series data. Base year data describe the state of the regions and the strategic rail and road networks in the base year 1981. Base year data are either regional or network data. Time series data describe exogenous developments or policies defined to control or constrain the simulation.
They are either collected or estimated from actual events for the time between the base year and the present, or are assumptions or policies for the future. Time series data must be defined for each simulation period, but in practice may be entered only for specific (not necessarily equidistant) years, with the simulation model interpolating between them.

Base year data

  Regional data (201 EU regions)
    Regional GDP per capita by industrial sector in 1981
    Regional labour productivity (GDP per worker) by industrial sector in 1981
    Regional population by five-year age group and sex in 1981
    Regional educational attainment in 1981
    Regional labour force participation rate by sex in 1981

  Network data (pan-Europe)
    Link data of strategic road network in 1981
    Link data of strategic rail network in 1981
    Link data of air network in 1981

Time series data

  European data (EU)
    Total European GDP by industrial sector, 1981-2016
    Total European immigration and outmigration, 1981-2016

  National data (15 EU countries)
    National GDP per worker by industrial sector, 1981-2016
    National fertility rates by five-year age group and sex, 1981-2016
    National mortality rates by five-year age group and sex, 1981-2016
    National immigration limits, 1981-2016
    National educational attainment, 1981-2016
    National labour force participation by sex, 1981-2016

  National data (23 non-EU countries)
    National population, 1981-2016
    National GDP, 1981-2016

  Regional data (201 EU regions)
    Regional endowment factors, 1981-2016
    Regional transfers, 1981-2016

  Network data (pan-Europe)
    Changes of node and link data of strategic road network, 1981-2016
    Changes of node and link data of strategic rail network, 1981-2016
    Changes of node and link data of air network, 1981-2016

Calibration/Validation Data

The regional production function in the GDP submodel is the only model function calibrated using statistical estimation techniques.
All other model functions are validated by comparing the output of the whole model with observed values for the period between the base year and the present. The following data are required for calibration and validation:

Calibration data

  Regional data (201 EU regions)
    Regional GDP per capita by industrial sector in 1981, 1986, 1991
    Regional endowment factors in 1981, 1986, 1991
    Regional labour force in 1981, 1986, 1991
    Regional transfers in 1981, 1986, 1991
    Regional net migration in 1981, 1986, 1991
    Regional unemployment rates in 1981, 1986, 1991

  Network data (pan-Europe)
    Node and link data of strategic road network in 1981, 1986, 1991
    Node and link data of strategic rail network in 1981, 1986, 1991
    Node and link data of air network in 1981, 1986, 1991

Validation data

  Regional data (201 EU regions)
    Regional population (by age and sex) in 1981, 1986, 1991, 1996
    Regional GDP (by industrial sector) in 1981, 1986, 1991, 1996
    Regional labour force (by sex) in 1981, 1986, 1991, 1996
    Regional employment (by industrial sector) in 1981, 1986, 1991, 1996
    Regional unemployment rate in 1981, 1986, 1991, 1996

4. The Key Role of the REGIO Database

REGIO is the regional data bank maintained by Eurostat. Online access is provided by the Resource Centre for Access to Data on Europe (r-cade), established by the Centre for European Studies at the University of Durham and the Data Archive at the University of Essex, which are official hosts of Eurostat data. REGIO is the primary data source for EU-wide coverage of regional data provided on a regular basis and in a harmonised framework. It contains data for the three levels of statistical territorial units, NUTS levels 1, 2 and 3. The REGIO database contains data on the following statistical domains:
Of particular relevance for the SASI analysis are the data sets on:
It should be noted that only part of this data is available for the whole of the time period. Most data starts to become available in the late 1970s or early 1980s. The last year available varies from table to table but in general lies between 1991 and 1995. There are many gaps in the data for particular regions and particular years, and not all information is available at all three NUTS levels. Gaps increase with the level of disaggregation, and data coverage is generally poor for the new Member States and the new German Laender as well as for regions which were either created during the study period or affected by changes. This is the case for Flevoland, Overijssel and Gelderland, for which data is not available prior to 1986 with the exception of area and population data.

There are also differences between Member States in how certain kinds of data are made available. Thus some information may be available for NUTS 3 regions in some countries, but only for NUTS 2 regions or even only for NUTS 1 regions in others. In general, this is because one of the three levels in most Member States is an additional level, either established for the purpose of grouping comparable units together at each NUTS level or based on a less important administrative structure. It is usually this level which has the least data available. NUTS level 2 is such an additional level in the United Kingdom, and several data sets are not available at this level for the UK.

The availability of demographic data at NUTS level 2 is very good. The data set for total population is complete with the only exception of the three former East-German regions Dessau, Halle and Magdeburg. Data for age groups are available in 5-year intervals and are also almost complete.

Data on the labour force shows many more gaps. Furthermore, there is no data at all for 1982, since until 1983 the Labour Force Survey was carried out every second year only.
Furthermore, the definition of the unemployed changed in 1983, and only since then has it conformed to the International Labour Office (ILO) recommendations. Apart from the gaps which are due to the creation of new regions, the newly added Member States and the new German Laender, there is no data on the labour force for Spain until 1985 and for Greece until 1988, with the only exception of the two regions Kriti and Thessalia. Luxembourg is missing in 1981 and for Portugal there is no data until 1986. For the UK, data is only available at NUTS level 1.

As with labour force data, the principal source of comparable statistical data on regional unemployment is the European Union Labour Force Survey. Following the 1992 definition, unemployed persons are persons who are actively looking for an occupation and could start within two weeks. Unemployment rates in the EU Member States show considerable differences (see SASI Deliverable D4, Bökemann et al., 1997). Even harmonised unemployment data are difficult to compare among EU Member States because of factors such as the rate of underemployment and social and institutional differences.

Availability of employment data is somewhat better. The regional economic accounts by branch include employment figures classified into economic branches according to the so-called NACE-CLIO system (General Industrial Classification of Economic Activities in the European Communities). NACE-CLIO R3 divides the data into the three main economic sectors: agriculture, industry and services (Eurostat, 1996). In general, some data is also available at NACE-CLIO R6 and R17, which further subdivide the industry sector and the service sector; however, data coverage becomes poorer with the level of disaggregation. For NACE-CLIO R3, there is no data for the new German Laender, Austria and Sweden. Data for Finland is first available in 1988 and data for Greece in 1987.
The UK only has employment data at NUTS level 1 until 1987. Similarly, employment data in Germany is only available at NUTS level 1 for most years.

REGIO contains data for GDP in millions of ECU and PPS and for GDP per capita in ECU and PPS, but the data is not disaggregated by economic sector. Data on GDP is missing for Sweden; for Austria and Finland the data is first available in 1981. Data for the new German Laender is first available in 1991. The UK has only NUTS level 1 data for several years.

Data on value added disaggregated by industrial sector shows some more gaps in addition to those of the GDP data set. Austria has no data at all, Germany has only NUTS level 1 data for several years, and for the UK data is only available at NUTS level 1 for the whole period.

5. Data Requirements Not Covered by REGIO

Regional Endowment Factors

The following endowment factors are proposed because they are considered highly relevant for regional economic development, can be collected or calculated for the base year and the calibration years 1981, 1986 and 1991 with a reasonable amount of effort, and can be exogenously predicted with sufficient plausibility for each year of the simulation: the availability of skilled labour (approximated by the number of university students enrolled in the region), the degree of urbanisation of a region as a measure of agglomeration effects, the availability of land as a proxy for the potential of locating a plant or business in a region, intraregional transport infrastructure, and proxies for the regional quality of life.

Network Data

The 'strategic' road and rail networks used in SASI are subsets of the pan-European road and rail networks developed by IRPUD. The 'strategic' road and rail networks for 1996 contain all existing TETN links laid down in Decision No. 1692/96/CE of the European Parliament and the Council (European Parliament, 1996) as well as additional links selected for connectivity reasons.
For past years up to the present, the networks represent the historical development of transport infrastructure, whereas for future years they represent the transport network investments and transport system improvements to be investigated. Travel cost is presently represented by travel time only; in future applications, generalised travel cost consisting of a combination of travel time, travel cost and mode-specific inconvenience will also be used.

Fertility Rates and International Migration

Population changes are a function of births, deaths, in-migration and out-migration. Therefore, in order to forecast regional population changes, data on age-specific fertility and mortality rates and on migration are required. Only age-specific mortality rates are provided by REGIO. Age-specific fertility rates at the national level are provided in the Council of Europe's report (Council of Europe, 1997). Data availability on international migration is particularly problematic. There is a lack of agreed definitions, and practices of registering migrants differ from country to country. Eurostat and the Council of Europe provide some limited information.

Transfer Payments

A comprehensive source on transfer payments by the Structural Funds of the European Union is European Commission (1997). The report lists transfer payments received by Member States in the two time periods 1989-1993 and 1994-1999. For most countries the information is broken down by NUTS-1 or NUTS-2 region and by Objective. In other cases it was possible to apportion the national totals to NUTS-2 regions on a per-capita basis. It is far more difficult to find information about transfer payments received by regions from national sources because of the variety of partly incompatible data sources that need to be consulted. Work on this data category is in progress.

European Transport Projects
Future transport infrastructure investments and transport system improvements are represented by additions or modifications to the base networks. As a reference or 'baseline' scenario, the implementation of all new or upgraded TETN links on which decisions have already been taken will be used. Other TETN scenarios will be developed by adding to the baseline scenario different subsets of the remaining TETN links laid down in Decision No. 1692/96/CE of the European Parliament and the Council (European Parliament, 1996).

6. Conclusions

The Eurostat database REGIO has been identified as the primary data input to the project, as it is the main official source of regional data that is provided on a regular basis and in a harmonised framework. The main data problems identified were large differences in the size of regions, changes in region boundaries and the creation of new regions, all resulting in outliers and gaps in the data sets. Data coverage was found to be very poor for the new Member States Austria, Finland and Sweden and for the new German Laender. Missing data, in particular for the base year 1981, has to be estimated or derived from other data sources such as national statistical offices. Although REGIO covers a considerable amount of the data required, other data sources are necessary, for example for regional endowment factors, network data and the European Developments submodel.

References

Bökemann, D., Hackl, R., Kramar, H. (1997): Socio-Economic Indicators: Model and Report. SASI Deliverable D4. Report to the European Commission. Wien: Institut für Stadt- und Regionalforschung, Technische Universität Wien.

Council of Europe (1997): Recent demographic developments in Europe and North America. Strasbourg: Council of Europe Press.

European Commission (1995b): Regions: nomenclature for territorial unit for statistics - NUTS. Luxembourg: Office for Official Publications of the European Communities.
European Commission (1997): The Impact of Structural Policies on Economic and Social Cohesion in the Union 1989-99. Regional Development Studies 26. Luxembourg: Office for Official Publications of the European Communities.

European Parliament (1996): Decision No. 1692/96/CE of the European Parliament and of the Council (23rd July 1996) on the Orientations of the Community for Developing the Trans-European Networks. Luxembourg: Office for Official Publications of the European Communities.

Eurostat (1996): NACE rev.1: statistical classification of economic activities in the European Community (Revised ed.). Luxembourg: Office for Official Publications of the European Communities.

Schürmann, C., Spiekermann, K., Wegener, M. (1997): Accessibility Indicators: Model and Report. SASI Deliverable D5. Report to the European Commission. Dortmund: Institut für Raumplanung, Universität Dortmund.

© 1997 Ian Masser, Max Craglia, Adelheid Holl, TRP |