We use satellite imagery, machine learning, luminosity, and other inputs to calculate GDP

The use of luminosity data (from satellite imagery) as a proxy for economic activity was introduced in 2012 by Henderson et al. in the American Economic Review, the world’s leading economics journal. Since then, rigorous peer-reviewed research has scrutinised luminosity as a proxy for GDP, including work from Nobel Prize winners, the World Bank, and the IMF.

These studies consistently find a strong relationship between the two variables. Our team of Fellows, postdocs, and PhDs from LSE, the world’s #1 ranked institution in Economic Geography, improves on this literature with proprietary machine learning algorithms, new data processing methods, additional data inputs, and higher resolution satellite imagery.

Our model accuracy is ~98%

We calculate accuracy by comparing our models’ predictions of annual GDP in 2019 in Europe to official data for the same year. Our predictions are ~98% accurate across ~1,300 regions.

We use luminosity as a foundation because the variable is rigorously examined

Using luminosity to triangulate economic growth gives us nearly a decade of academic literature to build upon.

First appearance of luminosity and economics in academia (2012)

Night-light luminosity data taken from satellite imagery was first used as a proxy for GDP in Measuring Economic Growth from Outer Space (2012), published in the American Economic Review (the world’s top economics journal) and authored by Professor Vernon Henderson et al. Henderson is currently a Professor at the LSE Department of Geography and Environment, the #1 ranked department in Economic Geography globally.

Peer-reviewed scrutiny (2012-2020)

Thousands of papers have been published on luminosity and GDP, making luminosity one of the most rigorously scrutinised proxies for GDP.

505’s methods (July 2019 – now)

Our improvements over the existing literature include higher resolution satellite imagery, a different way of processing raw imagery, and deep learning algorithms that transform luminosity into GDP.

In benchmarking tests in the EU, our methods average ~98% accuracy. Our team comprises academic Fellows, PhDs, and postdocs who have published peer-reviewed papers on transforming luminosity into GDP.

A luminosity-based foundation has unique advantages compared to other data inputs

Luminosity data has global coverage, high granularity, and is comparable across time.

Our methodology for Europe

Our GDP calculation involves:

  • Processing and cleaning luminosity data from satellite imagery
  • Forecasting national GDP
  • Calculating subnational GDP

Our raw imagery comes from NASA satellites...

We use high-resolution night-time satellite imagery on 500 m × 500 m grids globally from NASA/NOAA satellites (the Suomi National Polar-orbiting Partnership (S-NPP) and Joint Polar Satellite System (JPSS) platforms).

Satellite imagery comes from the Visible Infrared Imaging Radiometer Suite (VIIRS) instrument, which hosts a unique panchromatic Day/Night band (DNB) that is ultra-sensitive in low-light conditions, allowing us to observe night-time lights with high spatial and temporal resolutions.

These satellites provide daily measures of night-time visible and near-infrared light, which we aggregate to a monthly level.
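That daily-to-monthly aggregation step can be illustrated with a minimal Python sketch. The function name and data layout here are our own assumptions for illustration, not the production pipeline: daily radiance readings for one grid cell are grouped by calendar month and averaged, skipping days with no valid observation.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

def monthly_composite(daily_radiance):
    """Average daily radiance readings (nW cm^-2 sr^-1) for one grid
    cell into monthly means, skipping days with no valid observation."""
    by_month = defaultdict(list)
    for day, value in daily_radiance.items():
        if value is not None:  # e.g. a cloud-obscured day yields no reading
            by_month[(day.year, day.month)].append(value)
    return {month: mean(values) for month, values in by_month.items()}
```

Dropping invalid days (rather than treating them as zero radiance) matters: otherwise a cloudy month would look artificially dark.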

... we then isolate for human-generated light...

  • Control for satellite angle: to fully capture luminosity.
  • Crop out water bodies: water reflects light.
  • Filter out non-electric light sources: for example, fires and aurora.
  • Filter out ephemeral light: for example, lightning.
  • Remove obscuring factors: clouds, snowfall, etc.

We process raw satellite imagery to isolate human-generated light. For example, satellite angle affects how much luminosity is captured: a satellite at nadir (directly overhead) can miss some luminosity generated by buildings, so introducing a look angle allows us to capture more human-generated activity.

Our algorithms also adjust for snowfall, remove clouds, account for water bodies, filter out non-electric light sources, remove vegetation (e.g. tree branches can obscure luminosity), and remove ephemeral light. We then aggregate luminosity scores, measured in nanowatts per square centimetre per steradian (nW cm⁻² sr⁻¹), into the correct geographical boundaries (e.g. polygons representing regions, like Westminster, Tower Hamlets, etc.), and seasonally adjust the data to remove noise, such as higher levels of luminosity during Christmas.
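The seasonal-adjustment step can be sketched with a simple multiplicative model. This toy function is only illustrative (our production method is more sophisticated): each monthly value is divided by the average level of its calendar month relative to the overall mean, which flattens recurring spikes like December lighting.

```python
from collections import defaultdict
from statistics import mean

def seasonally_adjust(series):
    """Remove a multiplicative seasonal pattern from a monthly series
    keyed by (year, month): divide each value by its calendar month's
    average level relative to the overall mean."""
    overall = mean(series.values())
    by_calendar_month = defaultdict(list)
    for (year, month), value in series.items():
        by_calendar_month[month].append(value)
    factor = {m: mean(vs) / overall for m, vs in by_calendar_month.items()}
    return {(y, m): v / factor[m] for (y, m), v in series.items()}
```

A series that doubles every December, for example, comes out flat after adjustment, leaving only genuine changes in underlying activity.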

The result of this processing is a very strong, linear correlation between luminosity and GDP. The graph above shows the relationship between the EU’s official GDP measures and our processed luminosity scores. Each dot represents a NUTS 3 region; each colour represents a country.

The above table shows the relationship between luminosity growth and official NUTS 3 GDP growth (column 1) from 2015 to 2019 (the full time series in our dataset for which we have official GDP growth rates from Eurostat). The relationship is statistically significant at the 5% level.

Columns 2 and 3 show the relationship between log luminosity and log official NUTS 3 GDP for 2014 (the first year in our time series) and 2019 (the last year in our time series for which we have official GDP data from Eurostat), respectively. The relationship is statistically significant at the 1% level.
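The log-log regressions in columns 2 and 3 are ordinary least squares fits. A minimal sketch of such a fit, using made-up numbers rather than our data:

```python
from math import log

def ols(x, y):
    """One-variable least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return slope, my - slope * mx

# Hypothetical regional data: GDP exactly proportional to luminosity,
# so the elasticity (the slope in logs) comes out as 1.
luminosity = [10.0, 100.0, 1000.0]
gdp = [2.0, 20.0, 200.0]
slope, intercept = ols([log(v) for v in luminosity], [log(v) for v in gdp])
```

Working in logs means the slope reads as an elasticity: a slope near 1 says a 1% increase in luminosity is associated with roughly a 1% increase in GDP.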

We predict national-level GDP using macroeconomic indicators...

In order to create monthly sub-national estimates of GDP, we redistribute national-level GDP using NUTS 3-level luminosity data.

This involves training linear machine-learning models on the relationship between national-level GDP and luminosity for each month, and then predicting GDP sub-nationally. 

With the exception of the UK, the countries in our sample do not publish official monthly national-level GDP data. Therefore, we first predict national-level GDP for each month using tree-based machine learning models, with inputs such as stock market data and luminosity data.

Using this method, we break down official quarterly national GDP into monthly national GDP estimates. We use the same methods to forecast monthly national GDP when no official quarterly figure has yet been released.

The estimating equation resembles the above: national GDP from Eurostat (y) for each country (c) and each quarter (q) is modelled as a non-linear function f of the level of luminosity (luminosity), a stock market index (stock.market), and other indicators, plus some white noise.
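In symbols, that description corresponds to the following (our transcription; the exact functional form f is proprietary):

```latex
y_{c,q} = f\big(\mathrm{luminosity}_{c,q},\ \mathrm{stock.market}_{c,q},\ \dots\big) + \varepsilon_{c,q}
```

where ε is the white-noise error term.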

Once the model is trained, we predict GDP for each country and each month using monthly luminosity data, stock market data, and other indicators.

Estimating subnational GDP

Once we obtain a full time series of monthly national GDP estimates, we use luminosity data to re-distribute GDP to sub-national regions.
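At its core, the redistribution step allocates each month's national GDP across regions according to their share of national luminosity. In practice this is mediated by trained models; the proportional version below is a deliberately simplified sketch.

```python
def redistribute(national_gdp, regional_luminosity):
    """Allocate one month's national GDP across regions in proportion
    to each region's share of total national luminosity."""
    total = sum(regional_luminosity.values())
    return {region: national_gdp * value / total
            for region, value in regional_luminosity.items()}
```

By construction the regional estimates sum back to the national figure, which keeps the sub-national series consistent with the national one.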

We subsequently use ensemble learning to train linear machine learning models on national-level GDP and national-level luminosity data.

Hyperparameters are tuned to minimise the root mean square error (RMSE) against official NUTS 3 estimates in Europe.
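The tuning criterion is the standard RMSE; for reference, a minimal implementation:

```python
from math import sqrt

def rmse(predicted, official):
    """Root mean square error between model estimates and official figures."""
    return sqrt(sum((p - o) ** 2 for p, o in zip(predicted, official))
                / len(predicted))
```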

Once these estimates have been derived, we undertake a benchmarking process to ensure that our dataset is aligned with official historical NUTS3 GDP data.
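One simple form of benchmarking, sketched here under the assumption of a single multiplicative adjustment (our actual procedure may differ), is to rescale a region's estimated series so that it matches the official figure in a year where both exist:

```python
def benchmark(estimated_series, official_value, benchmark_year):
    """Rescale a region's estimated GDP series (keyed by year) so that
    it matches the official figure in the benchmark year."""
    factor = official_value / estimated_series[benchmark_year]
    return {year: value * factor for year, value in estimated_series.items()}
```

A multiplicative factor preserves the growth rates implied by the luminosity signal while pinning the level to official data.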

We feed new datasets into our models whenever new official GDP data (national or sub-national) is released by Eurostat or the ONS.

This ensures that our estimates are as accurate as possible and reflect both new data and revisions to existing datasets made by official statistical agencies.

Even prior to benchmarking, our GDP estimates have a mean absolute percentage error (MAPE) of 2.10%. We determine this by comparing our NUTS 3 GDP estimates for 2019 with Eurostat’s official 2019 dataset (see the accuracy section at the beginning of this page for more information).
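MAPE is the average absolute deviation from the official figure, expressed in percent; a minimal implementation for reference:

```python
def mape(estimates, official):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(e - o) / abs(o)
                       for e, o in zip(estimates, official)) / len(estimates)
```

So a MAPE of 2.10% means our regional estimates deviate from official figures by about 2% on average, which is the basis of the ~98% accuracy claim above.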

In the graph above we plot our estimates against official Eurostat estimates from 2014 to 2019. Each circle represents a NUTS 3 region; each colour represents a country.

Our vision is to get as close to 100% accuracy as possible

It’s impossible to be 100% accurate, but we aim to get as close as possible. Our data is constantly improving: track all changes to our data here as we continuously refine our models, inputs, and estimation techniques.