Datasets and levels of observation

Levels and resolution of observation

Most of the visualizations are available at two levels of observation: Brussels and its surroundings (the Former Province of Brabant), and Belgium. Because the city of Brussels extends, both morphologically and functionally, beyond the limits of the municipality of Brussels and of the Brussels-Capital Region, these delineations were considered unsuitable. Many delineations could have been used to define Brussels in a functional way; see for example reference [1]. The limits of the Former Province of Brabant were selected to capture various kinds of morphology and degrees of interrelationship with the city centre.

Some datasets are nevertheless analysed at a single level only (see the list and details below): Twitter reply messages are observed at the national level only, the phone calls dataset was made available by the provider for Brussels and its surroundings only, etc.

Most of the visualizations are available at various resolutions of observation according to the basic spatial unit chosen.

  • Municipalities. These spatial units are the smallest political and administrative subdivisions of Belgium. Most of the analyses are available at this resolution (population density, built-up density, communities based on migration, commuting, Twitter reply messages, etc.). More information is available on the Statbel website.
  • Statistical sectors. This resolution was originally designed for intra-municipal statistical observation and is built by Statbel according to socio-economic and morphological features. Administrative delineation and population are available at this resolution. More information is available on the Statbel website.
  • 1 km² grids. A standard grid covering the whole of Europe has been made available by Eurostat to allow comparisons free of effects linked to the shape and size of administrative basic spatial units. Population and built-up proportion are available at this resolution for Brussels and its surroundings. More information is available on the Eurostat website.
  • Voronoï cells. These spatial units result from a partition of the area under study (e.g. Brussels or Belgium) around a set of N points: each of the N cells consists of the places that are closer to one given point than to any other point of the set. Voronoï cells are used for the phone calls dataset, because the raw data are attached to cellphone towers. The information attached to each tower is therefore extended to the area theoretically covered by that tower.
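
The Voronoï principle can be illustrated with a minimal Python sketch, given here only as an illustration and not as the procedure actually used for the atlas. The tower identifiers and coordinates are hypothetical, and distances are assumed to be Euclidean in a projected coordinate system expressed in kilometres. Assigning each place to its nearest tower yields a discrete approximation of the Voronoï partition.

```python
import math

# Hypothetical tower positions (x, y) in kilometres; not taken from the dataset.
towers = {"T1": (0.0, 0.0), "T2": (4.0, 0.0), "T3": (0.0, 3.0)}


def nearest_tower(point, towers):
    """Return the id of the tower closest to `point` (Euclidean distance)."""
    return min(towers, key=lambda tower_id: math.dist(point, towers[tower_id]))


def label_grid(towers, width, height):
    """Label the centre of each 1 x 1 grid square with its nearest tower.

    The result is a rasterized (approximate) Voronoi partition of the
    rectangle [0, width] x [0, height].
    """
    return {
        (col, row): nearest_tower((col + 0.5, row + 0.5), towers)
        for col in range(width)
        for row in range(height)
    }


cells = label_grid(towers, width=5, height=4)
```

Any value attached to a tower (for example a number of calls) can then be mapped onto every grid square carrying that tower's label.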

Relational data used in community detection procedures

Commuting and migration data

Daily home-to-work trips (commuting) and permanent moves (residential migrations) between places are recorded by Statbel. These two datasets are well defined and well known, as commuting and residential migrations have long been studied by geographers. They suffer from well-known limitations and uncertainties related to the definition of the workplace (head office used instead of the actual workplace, multiple or non-fixed workplaces, some categories of workers not taken into account, such as international civil servants, etc.) or to that of the place of residence (declared rather than actual, multiple moves in a year, etc.). Both commuting data and residential migration data can be downloaded through Statbel (download original data).

Phone calls data

Phone calls data consist here of 13.4 million calls made between customers of a major phone provider during April and May 2015, geocoded by the emitting and receiving antennas (caller and callee locations) at the time of the call. These data were provided through an agreement with the provider; for privacy reasons, they cannot be shared.

iRail data

The dataset consists of all 869,581 online railway travel requests made on the iRail schedule-finder application between 20 December 2015 and 28 February 2016. This app allows users to easily plan train journeys on the SNCB network. It offers an elegant alternative to the lack of origin-destination (O-D) data about mobility in Belgium, where, as in many transportation networks, users do not tap in and out to enter and exit the service. The nodes represent the train stations, while the weight of a link is the number of travel requests between a pair of stations. The dataset was built using the iRail API.
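
As an illustration, the construction of such a weighted network can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the code used for the atlas: the request log is invented, requests with identical origin and destination are dropped, and links are treated as undirected by default, whereas the text above does not say whether direction is kept.

```python
from collections import Counter

# Hypothetical request log: (origin station, destination station).
requests = [
    ("Brussels-Central", "Ghent-Sint-Pieters"),
    ("Brussels-Central", "Ghent-Sint-Pieters"),
    ("Ghent-Sint-Pieters", "Brussels-Central"),
    ("Leuven", "Brussels-Central"),
    ("Leuven", "Leuven"),  # degenerate request, dropped below (assumption)
]


def build_weighted_links(requests, directed=False):
    """Count travel requests per pair of stations.

    With directed=False (an assumption), A->B and B->A requests are
    accumulated on the same link.
    """
    weights = Counter()
    for origin, destination in requests:
        if origin == destination:
            continue  # a self-loop carries no information on flows between places
        if directed:
            key = (origin, destination)
        else:
            key = tuple(sorted((origin, destination)))
        weights[key] += 1
    return weights


links = build_weighted_links(requests)
directed_links = build_weighted_links(requests, directed=True)
```

The resulting dictionary of link weights can be passed to any community detection tool that accepts a weighted edge list.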

Twitter data

The dataset is composed of reply-to messages between pairs of Twitter users geolocated in Belgium. Nine million tweets were collected through the platform API (link to Twitter API). 300,000 of them were selected according to various criteria: first, being geotagged with GPS coordinates within the boundaries of Belgium; second, being reply-to messages to other users. These tweets are used to create a network of users and of places. Each user's geolocation is taken as the barycenter of the GPS coordinates of the tweets issued by that user.
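
The selection and geolocation steps can be sketched in Python as follows. This is only an illustrative sketch: the tweet records and field names are hypothetical, the boundaries of Belgium are approximated by a rough bounding box (an assumption, the actual selection may rely on the real border), and the barycenter is taken as the arithmetic mean of latitudes and longitudes, which is a reasonable approximation at the scale of Belgium.

```python
from collections import defaultdict

# Rough bounding box around Belgium (assumption, for illustration only).
LAT_MIN, LAT_MAX = 49.5, 51.5
LON_MIN, LON_MAX = 2.5, 6.4

# Hypothetical tweet records; field names are not those of the Twitter API.
tweets = [
    {"user": "a", "lat": 50.0, "lon": 4.0, "reply_to": "b"},
    {"user": "a", "lat": 51.0, "lon": 5.0, "reply_to": "b"},
    {"user": "b", "lat": 50.8, "lon": 4.4, "reply_to": "a"},
    {"user": "c", "lat": 50.9, "lon": 4.3, "reply_to": None},  # not a reply
    {"user": "d", "lat": 48.8, "lon": 2.3, "reply_to": "a"},   # outside the box
]


def is_selected(tweet):
    """Keep geotagged reply-to tweets located inside the bounding box."""
    if tweet["reply_to"] is None or tweet["lat"] is None or tweet["lon"] is None:
        return False
    return LAT_MIN <= tweet["lat"] <= LAT_MAX and LON_MIN <= tweet["lon"] <= LON_MAX


def user_barycenters(tweets):
    """Return {user: (mean latitude, mean longitude)} over the selected tweets."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # [sum of lat, sum of lon, count]
    for tweet in filter(is_selected, tweets):
        entry = sums[tweet["user"]]
        entry[0] += tweet["lat"]
        entry[1] += tweet["lon"]
        entry[2] += 1
    return {user: (s[0] / s[2], s[1] / s[2]) for user, s in sums.items()}


barycenters = user_barycenters(tweets)
```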

Other kinds of data

Population and population density

Population distribution is recorded by the Belgian Census at the place of residence. Population densities (density being computed as the number of inhabitants per unit of surface area of the spatial unit) for municipalities and statistical sectors are computed from population data by spatial unit made available by Statbel (download original data). Population on the 1 km² grid is made available by Eurostat (download original data).
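
The density computation itself amounts to a single division, as in the short Python sketch below. The unit names, populations and areas are invented for illustration, and areas are assumed to be expressed in km².

```python
def population_density(inhabitants, area_km2):
    """Return the number of inhabitants per km² of a spatial unit."""
    if area_km2 <= 0:
        raise ValueError("area must be strictly positive")
    return inhabitants / area_km2


# Hypothetical spatial units: (inhabitants, area in km²).
units = {"unit_A": (12000, 8.0), "unit_B": (450, 1.0)}
densities = {
    name: population_density(pop, area) for name, (pop, area) in units.items()
}
```

For a 1 km² grid cell the density is numerically equal to the population count, which is what makes the grid convenient for comparisons.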

Built-up area density

The proportion of built-up area by spatial unit has been computed from data of the Land Registration service (“Cadastre”), at various levels, as for the population density (download original data).

Truck movements

An exhaustive dataset of the spatio-temporal positions of every truck circulating in Belgium is used as a proxy for goods exchanges between places in Belgium. These data are collected for taxation purposes through mandatory GPS on-board units and have been made available to our team for a sample of one week. The dataset is cleaned and restructured, by splitting each trace into individual journeys, to build various indicators at a fine resolution (1 km² regular grid covering the country). Some of these indicators are illustrated in the atlas, such as the daily and hourly traffic and the daily and hourly level of connections (first origins, last destinations, intermediate stops). These data were provided through an agreement with the provider; for privacy reasons, they cannot be shared.
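
The splitting of a trace into journeys and the mapping of positions onto the grid can be sketched in Python as follows. The splitting rule is not described above, so this sketch relies on a hypothetical one: a new journey starts whenever two consecutive GPS fixes are separated by more than 30 minutes. Coordinates are assumed to be projected and expressed in metres, and the sample trace is invented.

```python
def grid_cell(x_m, y_m, cell_size_m=1000):
    """Return the (column, row) index of the 1 km grid cell containing a point."""
    return (int(x_m // cell_size_m), int(y_m // cell_size_m))


def split_journeys(trace, max_gap_s=1800):
    """Split a time-sorted trace of (timestamp_s, x_m, y_m) fixes into journeys.

    Assumption: a gap longer than `max_gap_s` between two consecutive
    fixes marks the end of one journey and the start of the next.
    """
    journeys, current = [], []
    for fix in trace:
        if current and fix[0] - current[-1][0] > max_gap_s:
            journeys.append(current)
            current = []
        current.append(fix)
    if current:
        journeys.append(current)
    return journeys


def origin_destination_cells(journeys):
    """Return the (origin cell, destination cell) pair of each journey."""
    return [
        (grid_cell(j[0][1], j[0][2]), grid_cell(j[-1][1], j[-1][2]))
        for j in journeys
    ]


# Hypothetical trace of one truck: (timestamp in s, x in m, y in m).
trace = [
    (0, 500, 500),
    (600, 2500, 700),
    (1200, 5200, 900),
    (6000, 5300, 950),  # 80-minute gap: a new journey starts here
    (6600, 8100, 3100),
]
journeys = split_journeys(trace)
od_pairs = origin_destination_cells(journeys)
```

In this toy trace, the destination cell of the first journey and the origin cell of the second coincide, which corresponds to an intermediate stop in the sense used above.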


[1] Jones J., Peeters D., Thomas I. (2015) Is cities delineation a pre requisite for urban modelling? The example of land price determinants in Brussels. Cybergeo: European Journal of Geography, 716.