Introducing the Global Inequality Project (GIP) national income dataset
To study global inequality between countries, we use a dataset of Gross Domestic Product (GDP) per person, derived from the World Inequality Database (WID). Our dataset records GDP per person from 1960 to 2022, using three different currency concepts for conversion to US dollars: constant MER, variable MER, and PPP (see our note here for more information about what these currency concepts mean and how they should be interpreted).
The full GIP national income dataset, including metadata, is available for download as an Excel file here. The dataset covers 169 countries in constant MER, 171 countries in variable MER, and 172 countries in PPP. In all cases, this amounts to around 98% of the world population.
In this short note, we explain how the GIP national income dataset was processed from the WID, including interpolations and extrapolations that we made to fill in missing datapoints.
Constant MER and PPP
The default GDP series provided by the World Inequality Database is expressed in Local Currency Units (LCU) at constant 2022 prices. In the WID, this variable is coded as agdproi999. To convert these series to constant MER and PPP, we followed the WID’s ‘Distributional National Accounts’ (DINA) guidelines. For each country, we converted all datapoints to constant MER using the 2022 value of that country’s market exchange rate (code = xlcusxi999). To convert to PPP, we used the 2022 value of the relevant country’s PPP exchange rate (xlcuspi999).
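As a rough illustration of this conversion, the Python sketch below divides a constant-price LCU series by a single 2022 exchange rate. The function name and all figures are hypothetical, and the sketch assumes that both exchange rates are expressed as LCU per US dollar; it is not the code used to build the dataset.

```python
def to_constant_usd(gdp_lcu_2022_prices, rate_2022_lcu_per_usd):
    """Convert a constant-price LCU series to US dollars with one fixed 2022 rate.

    Assumes the rate is quoted as LCU per US dollar, so conversion is a division.
    """
    return {year: value / rate_2022_lcu_per_usd
            for year, value in gdp_lcu_2022_prices.items()}


# Hypothetical GDP per person in constant 2022 LCU (stands in for agdproi999).
gdp_lcu = {1960: 20_000.0, 1990: 50_000.0, 2022: 80_000.0}

# Hypothetical 2022 exchange rates (stand-ins for xlcusxi999 and xlcuspi999).
mer_2022 = 8.0   # market exchange rate, LCU per USD
ppp_2022 = 4.0   # PPP exchange rate, LCU per USD

gdp_constant_mer = to_constant_usd(gdp_lcu, mer_2022)
gdp_ppp = to_constant_usd(gdp_lcu, ppp_2022)
```

Because every year is divided by the same 2022 rate, the constant MER and PPP series differ from each other only by a fixed factor within a given country.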
One challenge to these calculations has to do with the 23 successor states of the former USSR, Yugoslavia, and Czechoslovakia. For these states, GDP/cap is generally not available in the earlier years of data. To overcome this problem, we used the growth rate of the relevant former state to extrapolate the GDP/cap of the successor states back to 1960.
For instance, data on Azerbaijan’s income is only available from 1973 onwards. As such, we extrapolated Azerbaijan’s income from 1972 back to 1960, using the per capita growth rate of the Soviet Union. This assumption – i.e., that all constituent republics of the USSR grew at the same rate – is obviously simplistic, so the figures for the successor states of the USSR, Yugoslavia, and Czechoslovakia should be interpreted with caution.
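The backward extrapolation can be sketched as follows. This is a minimal Python illustration with hypothetical names and made-up figures, assuming that the former state's series covers every year being filled in; it is not the original processing code.

```python
def backcast(successor, former, first_year):
    """Extend `successor` back to `first_year` using the growth rates of `former`.

    Both arguments map year -> GDP per person. `former` must cover every year
    from `first_year` up to the earliest year present in `successor`.
    """
    out = dict(successor)
    earliest = min(successor)
    for year in range(earliest - 1, first_year - 1, -1):
        # Impose the former state's growth between `year` and `year + 1`.
        out[year] = out[year + 1] * former[year] / former[year + 1]
    return out


# Hypothetical figures: the former state grows 10% per year.
former_state = {1960: 100.0, 1961: 110.0, 1962: 121.0, 1963: 133.1}
successor_state = {1962: 60.5, 1963: 66.55}   # data only starts in 1962

extended = backcast(successor_state, former_state, 1960)
```

In this example the successor state inherits the former state's 10% annual growth in the filled-in years, so its ratio to the former state's income stays fixed before 1962.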
Using this approach, we were able to estimate complete time series of GDP per capita from 1960 to 2021 for 169 countries in constant MER and 172 countries in PPP. With the exception of one country – Venezuela – data also extends to 2022.
Variable MER
Converting to variable MER requires a few additional steps. First, we adjusted the GDP data in constant LCU (agdproi999) to current prices using each country’s GDP deflator (inyixxi999). Second, we converted these series to current US dollars using the relevant market exchange rate in each year (xlcusxi999). Finally, we deflated these estimates to 2022 prices using the World Bank’s SDR deflator – a special price index that attempts to measure inflation on the world market as a whole.
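The three steps can be sketched in Python as below. The function name and all figures are hypothetical. The sketch assumes that the GDP deflator and the SDR deflator are both indexed so that 2022 = 1, and that exchange rates are expressed as LCU per US dollar; these are assumptions of the example rather than details confirmed here.

```python
def variable_mer(gdp_const_lcu, gdp_deflator, rate_lcu_per_usd, sdr_deflator):
    """Convert constant-price LCU GDP per person to variable MER at 2022 prices.

    All arguments map year -> value. Years without an exchange rate are
    skipped, which leaves a gap in the output series.
    """
    out = {}
    for year, real_lcu in gdp_const_lcu.items():
        if year not in rate_lcu_per_usd:
            continue  # no market exchange rate for this year
        current_lcu = real_lcu * gdp_deflator[year]           # step 1: current prices
        current_usd = current_lcu / rate_lcu_per_usd[year]    # step 2: current USD
        out[year] = current_usd / sdr_deflator[year]          # step 3: 2022 prices
    return out


# Hypothetical inputs for one country.
gdp_const_lcu = {1975: 35_000.0, 1980: 40_000.0, 2022: 80_000.0}
gdp_deflator = {1975: 0.2, 1980: 0.25, 2022: 1.0}   # 2022 = 1 (assumed)
rate_lcu_per_usd = {1980: 2.0, 2022: 8.0}           # no rate for 1975
sdr_deflator = {1975: 0.4, 1980: 0.5, 2022: 1.0}    # 2022 = 1 (assumed)

gdp_variable_mer = variable_mer(gdp_const_lcu, gdp_deflator,
                                rate_lcu_per_usd, sdr_deflator)
```

Here 1975 drops out of the result because no exchange rate is supplied for that year.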
It should be noted that in several cases where the market exchange rates were clearly unrealistic, we removed those years from the dataset. In one case – Poland from 1970 to 1978 – we substituted the deleted data with more reliable exchange rate figures from the Penn World Table. But in most cases, we simply omitted the affected data altogether. For a full list of datapoints that we removed, you can download the metadata as an Excel file above.
Adjustment for defunct states
Once again, the 23 successor states of the former USSR, Yugoslavia, and Czechoslovakia pose a challenge to the calculations. There is no exchange rate data for these states prior to approximately 1990, as these states did not exist at that time. It is, however, possible to calculate GDP in variable MER for the former states themselves. The WID includes data on the GDP of the former states in local currency units, along with GDP deflators to render these figures in current prices. Although the WID does not provide exchange rates for the former states, this data is available in the Penn World Table, version 5.1, covering the period from 1970 to 1989 or 1990. As such, we estimated variable MER for the three former states by dividing the WID’s data on GDP, LCU (current prices) by the PWT exchange rate data (LCU per USD) in each year.
Using this approach, we were able to derive variable MER estimates for the former USSR, Yugoslavia, and Czechoslovakia. However, this does not tell us about the variable MER of their successor states. To estimate the latter, we assumed that the variable MER income of each successor state was equal to the variable MER income of the relevant former state, multiplied by a country-specific adjustment factor. In each year, the adjustment factor was assumed to equal the ratio of the successor state’s income (in PPP terms) to the income of the aggregate former state (also in PPP terms). For more details on this procedure, please consult the metadata by downloading the Excel file at the top of this page.
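A minimal Python sketch of the adjustment factor is given below. The function name and figures are hypothetical, and the example simply restates the rule described above (former-state variable MER income multiplied by the PPP income ratio); the exact procedure is documented in the metadata.

```python
def successor_variable_mer(former_mer, successor_ppp, former_ppp):
    """Estimate a successor state's variable MER income per person.

    For each year: former-state variable MER income multiplied by the ratio of
    the successor state's PPP income to the former state's PPP income.
    Only years present in all three series are returned.
    """
    return {
        year: former_mer[year] * successor_ppp[year] / former_ppp[year]
        for year in former_mer
        if year in successor_ppp and year in former_ppp
    }


# Hypothetical figures for one former state and one of its successor states.
former_mer = {1980: 4_000.0, 1985: 5_000.0}
former_ppp = {1980: 8_000.0, 1985: 10_000.0}
successor_ppp = {1980: 6_000.0, 1985: 9_000.0}

estimate = successor_variable_mer(former_mer, successor_ppp, former_ppp)
```

In this example the adjustment factor is 0.75 in 1980 and 0.9 in 1985, so the successor state's estimated income moves closer to the former-state average over time.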
Adjustments for low data coverage
Unfortunately, the resulting dataset suffered from significant gaps in coverage. Annual exchange rate data is not available for all countries in all years, especially during the 1960s and 1970s, creating gaps in the estimated GDP/cap dataset. Any inequality figures calculated with this data would therefore shift from one year to the next simply because the set of countries with available data changes. To overcome this problem, we did two things.
First, we interpolated intermediary years in the dataset of GDP/cap (variable MER, 2022 prices) along an exponential curve. Using this approach, we ‘filled in’ 59 datapoints for 9 countries (see metadata above). Needless to say, we regard these interpolated datapoints as indicative of long-term trends, but inappropriate for identifying annual changes.
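Interpolating along an exponential curve amounts to assuming a constant annual growth rate between the two nearest years with data. The Python sketch below illustrates this with hypothetical figures; the function name is our own for the example, and it assumes all values are positive.

```python
def interpolate_exponential(series):
    """Fill gaps between known years assuming constant annual growth.

    `series` maps year -> positive value. Years before the first or after the
    last known datapoint are not filled.
    """
    years = sorted(series)
    out = dict(series)
    for start, end in zip(years, years[1:]):
        gap = end - start
        if gap == 1:
            continue  # consecutive years, nothing to fill
        # Annual growth factor that links the two known endpoints.
        growth = (series[end] / series[start]) ** (1.0 / gap)
        for step in range(1, gap):
            out[start + step] = series[start] * growth ** step
    return out


# Hypothetical series with 1971 missing.
known = {1970: 1_000.0, 1972: 1_210.0, 1973: 1_300.0}
filled = interpolate_exponential(known)
```

With these figures the implied growth between 1970 and 1972 is about 10% per year, so the value filled in for 1971 is roughly 1,100.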
Second, where possible, we extrapolated back to 1960 from the earliest available year of data, using the growth rate of real GDP per capita (adjusted for domestic inflation). One can think of this in terms of extrapolating back with the growth rate of GDP/cap in constant LCU, constant PPP, or constant MER. The growth rates are the same, regardless of the currency used, since they are all indexed to changes in domestic prices. In practice, we performed the calculations with the growth rate of GDP in constant 2022 LCU (WID’s agdproi999 variable).
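The claim that the growth rates do not depend on the currency can be checked with a small Python example. The figures and conversion rates below are hypothetical; the point is only that dividing every year by the same fixed rate leaves year-on-year growth unchanged.

```python
import math


def growth_factors(series):
    """Return year-on-year growth factors for a year -> value mapping."""
    years = sorted(series)
    return [series[b] / series[a] for a, b in zip(years, years[1:])]


# Hypothetical GDP per person in constant 2022 LCU.
gdp_lcu = {1960: 20_000.0, 1961: 21_000.0, 1962: 23_100.0}

# Constant MER and PPP use one fixed 2022 rate each (hypothetical values).
gdp_constant_mer = {year: value / 8.0 for year, value in gdp_lcu.items()}
gdp_constant_ppp = {year: value / 4.0 for year, value in gdp_lcu.items()}

same_growth = all(
    math.isclose(lcu, mer) and math.isclose(lcu, ppp)
    for lcu, mer, ppp in zip(growth_factors(gdp_lcu),
                             growth_factors(gdp_constant_mer),
                             growth_factors(gdp_constant_ppp))
)
```

The fixed conversion rate cancels out of each ratio, which is why the extrapolation can be performed with the constant LCU series alone.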
The main limitation to this approach is that datapoints based on extrapolations won’t reflect changes in a given country’s market exchange rate. Our method implicitly assumes that the ‘exchange rate deviation index’ (i.e., the ratio of PPP to MER) was constant in extrapolated years, so that all changes in a country’s market income can be accounted for by the growth of incomes relative to domestic prices. As such, the extrapolated series do not allow us to capture changes in inequality that arise due to shifts in the international terms of trade or the bargaining power of different countries on the world market (for more information about this problem, see our discussion of the three currency concepts used in the GIP national income dataset). This may imply that we understate the rise in inequality that occurred during the 1980s, when many Global South countries experienced a deterioration in their real exchange rates under IMF and World Bank structural adjustment programmes.
Despite this issue, we believe that our method of extrapolation is preferable to the alternative of simply omitting countries from the dataset. The growth rate of GDP relative to domestic prices is the best available evidence on changes in income in the early years of data for some countries. While it does not capture all elements of a country’s position in the world market, it is close enough to permit analysis.
Using the growth rates of GDP relative to domestic prices, we were able to extrapolate back to 1960 for 84 countries with missing variable MER data. Where necessary, the growth rates of the former USSR, Yugoslavia, and Czechoslovakia were applied to their successor states.
The resulting dataset covers 171 countries, equivalent to 98% of the global population, in all years from 1960 to 2020. In all but three cases (Lebanon, Venezuela and Zimbabwe) data is available through to 2022. This allows us to analyse long-term trends in international inequality without major gaps in country coverage.