We use satellite imagery, machine learning, luminosity, and other inputs to calculate GDP
The use of luminosity data (from satellite imagery) as a proxy for economic activity was introduced in 2012 by Henderson et al. in the American Economic Review, the world’s leading economics journal. Since then, rigorous peer-reviewed research has scrutinised luminosity as a proxy for GDP, including work by Nobel Prize winners, the World Bank, and the IMF.
These studies consistently find a strong relationship between the two variables. Our team of Fellows, Postdocs, and PhDs from LSE, the world’s #1-ranked institution in Economic Geography, improves on this literature with proprietary machine learning algorithms, new data-processing methods, additional data inputs, and higher-resolution satellite imagery.
Our model accuracy is ~98%
We calculate accuracy by comparing our models’ predictions of annual GDP in Europe for 2019 against official data for the same year. Our predictions are ~98% accurate across ~1,300 regions.
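As a sketch of how an accuracy figure like this can be computed, here is a minimal mean-absolute-percentage-error (MAPE) calculation. The regional GDP figures below are invented for illustration; they are not our actual estimates or official data.

```python
# Sketch: accuracy as 100% minus the mean absolute percentage error (MAPE)
# between predicted and official regional GDP. All figures are hypothetical.

def mape(predicted, official):
    """Mean absolute percentage error, in percent."""
    errors = [abs(p - o) / o for p, o in zip(predicted, official)]
    return 100 * sum(errors) / len(errors)

# Hypothetical annual GDP for three regions (billions of euros).
official = [52.0, 110.0, 18.5]
predicted = [51.0, 112.0, 18.9]

accuracy = 100 - mape(predicted, official)  # "~98% accurate" means MAPE ~2%
```

A MAPE of roughly 2% across regions is what the "~98% accurate" headline figure corresponds to.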
We use luminosity as a foundation because the variable is rigorously examined
Using luminosity to estimate economic growth has nearly a decade of academic literature for us to build upon.
First appearance of luminosity and economics in academia (2012)
Night-light luminosity data taken from satellite imagery was first used as a proxy for GDP in “Measuring Economic Growth from Outer Space” (2012), published in the American Economic Review (the world’s top economics journal) and authored by Professor Vernon Henderson et al. Henderson is currently a Professor at the LSE Department of Geography and Environment, the #1-ranked department in Economic Geography globally.
Peer-reviewed scrutiny (2012-2020)
Thousands of papers have been published on luminosity and GDP, making luminosity one of the most rigorously scrutinised proxies for GDP. For example:
- Nobel Prize winner William Nordhaus and Xi Chen found that luminosity provides an accurate measure of economic activity at a granular level.
- The World Bank has been using luminosity to measure sub-national economic activity and economic development.
- The IMF is investigating the use of luminosity data to improve GDP measurements in low and middle income countries.
505’s methods (July 2019 – now)
Improvements over the existing literature include higher-resolution satellite imagery, a different way of processing raw imagery, and deep learning algorithms that transform luminosity into GDP.
A luminosity-based foundation has unique advantages compared to other data inputs
Luminosity data has global coverage, high granularity, and is comparable across time.
Our methodology for Europe
Our GDP calculation involves:
- Processing and cleaning luminosity data from satellite imagery
- Forecasting national GDP
- Calculating subnational GDP
Our raw imagery comes from NASA satellites...
We use high-resolution night-time satellite imagery on a global 500 m × 500 m grid from NASA/NOAA satellites (the Suomi National Polar-orbiting Partnership (S-NPP) and Joint Polar Satellite System (JPSS) platforms).
Satellite imagery comes from the Visible Infrared Imaging Radiometer Suite (VIIRS) instrument, which hosts a unique panchromatic Day/Night band (DNB) that is ultra-sensitive in low-light conditions, allowing us to observe night-time lights with high spatial and temporal resolutions.
These satellites provide daily measures of night-time visible and near-infrared light, which we aggregate to a monthly level.
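The daily-to-monthly aggregation step can be sketched as follows. The `monthly_composite` helper and its input format are our own illustration, not the VIIRS product schema; the radiance values are made up.

```python
from collections import defaultdict
from datetime import date

def monthly_composite(daily_readings):
    """Average daily night-light radiance readings into monthly values.

    daily_readings: list of (date, radiance) pairs for one pixel;
    radiance is None on nights with no usable observation.
    """
    buckets = defaultdict(list)
    for day, radiance in daily_readings:
        if radiance is not None:          # skip nights with no usable data
            buckets[(day.year, day.month)].append(radiance)
    return {month: sum(vals) / len(vals) for month, vals in buckets.items()}

# Hypothetical readings for a single 500 m pixel.
readings = [
    (date(2019, 1, 1), 12.0),
    (date(2019, 1, 2), None),   # no usable observation that night
    (date(2019, 1, 3), 14.0),
    (date(2019, 2, 1), 20.0),
]
# monthly_composite(readings) -> {(2019, 1): 13.0, (2019, 2): 20.0}
```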
... we then isolate human-generated light...
Control for satellite angle
To fully capture luminosity.
Crop out water bodies
Water reflects light.
Filter out non-electric light sources
For example, fires and aurora.
Filter out ephemeral light
For example, lightning.
Remove obscuring factors
Like clouds, snowfall, etc.
We process raw satellite imagery to isolate human-generated light. For example, satellite angle affects how much luminosity is captured: a satellite at nadir (directly overhead) can miss some luminosity generated by buildings. Introducing a look angle allows us to capture more human-generated activity.
Our algorithms also adjust for snowfall, remove clouds, account for water bodies, filter out non-electric light sources, remove vegetation (e.g. tree branches can obscure luminosity), and remove ephemeral light. We then aggregate luminosity scores, measured in nanowatts per square centimetre per steradian (nW·cm⁻²·sr⁻¹), into the correct geographical boundaries (e.g. polygons representing regions, like Westminster, Tower Hamlets, etc.), and seasonally adjust the data to remove noise, like higher levels of luminosity during Christmas.
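The cleaning and regional-aggregation steps above can be sketched as a simple filter-then-sum over pixels. The quality-flag names and values here are our own labels for illustration, not fields of the VIIRS product.

```python
# Sketch: drop unusable pixels, then sum radiance (nW/cm^2/sr) per region.
# The flag names and all values are hypothetical.

def clean_and_aggregate(pixels):
    """Sum radiance of usable pixels into regional totals."""
    totals = {}
    for px in pixels:
        if px["is_water"] or px["is_fire"] or px["is_ephemeral"] or px["is_cloudy"]:
            continue  # drop reflections, non-electric and transient light
        totals[px["region"]] = totals.get(px["region"], 0.0) + px["radiance"]
    return totals

pixels = [
    {"region": "Westminster", "radiance": 55.0, "is_water": False,
     "is_fire": False, "is_ephemeral": False, "is_cloudy": False},
    {"region": "Westminster", "radiance": 80.0, "is_water": True,   # river glint
     "is_fire": False, "is_ephemeral": False, "is_cloudy": False},
    {"region": "Tower Hamlets", "radiance": 40.0, "is_water": False,
     "is_fire": False, "is_ephemeral": False, "is_cloudy": False},
]
# -> {"Westminster": 55.0, "Tower Hamlets": 40.0}
```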
The result of this processing is a very strong linear correlation between luminosity and GDP. The graph above shows the relationship between the EU’s official GDP measures and our processed luminosity scores. Each dot represents a NUTS 3 region; each colour represents a country.
The above table shows the relationship between luminosity growth and official NUTS 3 GDP growth (column 1) from 2015-2019 (the full time series in our dataset where we have official GDP growth rates from EuroStat). The relationship is statistically significant at the 5% level.
Columns 2 and 3 show the relationship between log luminosity and log official NUTS 3 GDP for 2014 (the first year in our time series) and 2019 (the last year in our time series where we have official GDP data from EuroStat) respectively. The relationship is statistically significant at the 1% level.
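The log–log relationship in columns 2 and 3 can be sketched with a single-regressor OLS fit, where the slope is the elasticity of GDP with respect to luminosity. The data points below are made up to illustrate the mechanics, not drawn from our dataset.

```python
import math

def ols_slope_intercept(x, y):
    """Ordinary least squares fit y = a + b*x (single regressor)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

# Hypothetical NUTS 3 observations: luminosity scores and official GDP.
lum = [10.0, 40.0, 160.0, 640.0]
gdp = [2.0, 8.0, 32.0, 128.0]

log_lum = [math.log(v) for v in lum]
log_gdp = [math.log(v) for v in gdp]
a, b = ols_slope_intercept(log_lum, log_gdp)
# b is the elasticity of GDP with respect to luminosity
```

In this toy data GDP scales one-for-one with luminosity, so the fitted elasticity is 1; real regressions would also report standard errors to establish significance.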
We predict national-level GDP using macroeconomic indicators...
In order to create monthly sub-national estimates of GDP, we redistribute national-level GDP using NUTS 3-level luminosity data.
This involves training linear machine-learning models on the relationship between national-level GDP and luminosity for each month, and then predicting GDP sub-nationally.
With the exception of the UK, the countries in our sample do not have official monthly national-level GDP data. Therefore, we first predict national-level GDP for each month using tree-based machine learning models, with inputs such as stock market data and luminosity data.
Using this method, we break down official quarterly national GDP into monthly national GDP estimates. We similarly use these methods to forecast monthly national GDP when no official quarterly national GDP has been released.
The estimating equation takes the form y(c, q) = f(luminosity(c, q), stock.market(c, q), …) + ε(c, q), where national GDP (from Eurostat) (y) for each country (c) and each quarter (q) is modelled as a non-linear function f of the level of luminosity (luminosity), a stock market index (stock.market), and other indicators, plus white noise (ε).
Once the model is trained, we predict GDP for each country and each month using monthly luminosity data, stock market data, and other indicators.
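The building block of the tree-based models described above is a regression split on one input feature. The stump below shows only that splitting logic; a production model would grow many deeper trees (e.g. a gradient-boosted ensemble), and all feature values here are invented.

```python
# Sketch: a single regression stump, the minimal unit of a tree-based model.
# All training values are hypothetical.

def fit_stump(X, y):
    """Find the one-feature threshold split minimising squared error."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= threshold]
            right = [yi for row, yi in zip(X, y) if row[j] > threshold]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            err = sum((yi - ml) ** 2 for yi in left) + \
                  sum((yi - mr) ** 2 for yi in right)
            if best is None or err < best[0]:
                best = (err, j, threshold, ml, mr)
    _, j, threshold, ml, mr = best
    return lambda row: ml if row[j] <= threshold else mr

# Features: [monthly luminosity score, stock-market index] (made-up values).
X = [[100.0, 3000.0], [120.0, 3100.0], [300.0, 3500.0], [320.0, 3600.0]]
y = [50.0, 52.0, 90.0, 93.0]          # monthly national GDP, billions
predict = fit_stump(X, y)
```

Ensembling many such trees, each correcting the residuals of the last, is what lets the model capture the non-linear function f described above.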
Estimating subnational GDP
Once we obtain a full time series of monthly national GDP estimates, we use luminosity data to re-distribute GDP to sub-national regions.
We subsequently use ensemble learning to train linear machine learning models on national-level GDP and national-level luminosity data.
Hyperparameters are tuned to minimise the root mean square error (RMSE) against official NUTS 3 estimates in Europe.
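In its simplest form, redistributing national GDP to regions means allocating it in proportion to each region's share of national luminosity. This sketch shows only that proportional case; our actual method fits models rather than using raw shares, and the regional values below are invented.

```python
# Sketch: allocate a national GDP total to regions in proportion to their
# luminosity scores. Region names and all values are hypothetical.

def redistribute(national_gdp, regional_luminosity):
    """Split national GDP across regions by luminosity share."""
    total = sum(regional_luminosity.values())
    return {r: national_gdp * lum / total
            for r, lum in regional_luminosity.items()}

regions = {"Westminster": 55.0, "Tower Hamlets": 40.0, "Hackney": 5.0}
subnational = redistribute(100.0, regions)   # national GDP of 100 (billions)
```

Because the shares sum to one, the regional estimates always add back up to the national total by construction.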
Once these estimates have been derived, we undertake a benchmarking process to ensure that our dataset is aligned with official historical NUTS3 GDP data.
We feed new datasets into our models whenever new official GDP data (either national-level or sub-national) is released from Eurostat or ONS.
This ensures that our estimates are as accurate as possible and reflect both new data, and revisions to existing datasets made by official statistical agencies.
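One simple form of the benchmarking step above is rescaling regional estimates so they sum to a newly released official national total. This is a sketch under that assumption; the estimates and totals are made up, and our actual benchmarking also aligns against sub-national releases.

```python
# Sketch: rescale regional estimates to match an official national total.
# All figures are hypothetical.

def benchmark(estimates, official_total):
    """Rescale regional estimates so they sum to the official total."""
    scale = official_total / sum(estimates.values())
    return {region: value * scale for region, value in estimates.items()}

est = {"A": 48.0, "B": 54.0}           # our estimates, summing to 102
benchmarked = benchmark(est, 100.0)    # official total later published as 100
```

Re-running this whenever Eurostat or the ONS publish new or revised figures is what keeps the dataset aligned with official statistics.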
Even prior to benchmarking, our GDP estimates have a mean absolute percentage error (MAPE) of 2.10%. We determine this by comparing our NUTS3 GDP estimates for 2019 with Eurostat’s official 2019 dataset (see the accuracy section at the beginning of this page for more information).
In the graph above we plot our estimates against official Eurostat estimates from 2014–2019. Each circle represents a NUTS 3 region; each colour represents a country.
Our vision is to get as close to 100% accuracy as possible
It’s impossible to be 100% accurate, but we aim to be as close as possible. Our data is constantly improving. Track all changes to our data here as we continuously improve our models, inputs, and estimation techniques.