In 2018, the Swedish physician Hans Rosling, together with his son Ola Rosling and daughter-in-law Anna Rosling Rönnlund, released a book titled "Factfulness: Ten Reasons We're Wrong About the World – and Why Things Are Better Than You Think." The book changed the way I think about the future by pointing out some faulty assumptions I was making about the modern world. This notebook was inspired by it.
The data came from Gapminder and can be accessed here.
# Imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
Before we get started, it's important to set up the data properly. Don't worry about understanding this section and feel free to skip it if you'd like.
# Loading the data
income_per_person = pd.read_csv('data/income_per_person_gdppercapita_ppp_inflation_adjusted.csv')
income_per_person.head()
5 rows × 242 columns
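The table above shows the data in its original format: one row per country and one column per year. To visualize it the way I'd like, I will write a function that melts the year columns into a single column. I use this function several times throughout the project.
def melt_data(df, value_name):
    """
    Reshape a wide table (one column per year) into long format.

    Parameters:
        df (DataFrame): Wide DataFrame with a 'country' column and one column per year
        value_name (str): Name of the column that will hold the melted values

    Returns:
        melt_df (DataFrame): Long DataFrame with columns 'country', 'Year', and value_name
    """
    melt_df = df.melt(id_vars='country', var_name="Year", value_name=value_name)
    # Year labels arrive as strings (e.g. "1800"); convert them to integer years
    melt_df["Year"] = pd.to_datetime(melt_df["Year"]).dt.year
    return melt_df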
Below is what the data looks like after melting it using our new function.
income_melt = melt_data(income_per_person, "Income")
income_melt.head()
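To make the reshaping concrete, here is a minimal sketch on a made-up two-country table. The country names and income values are invented for illustration, and `astype(int)` is used as a shortcut for the year conversion in `melt_data`.

```python
import pandas as pd

# Made-up wide table in the same shape as the Gapminder files:
# one row per country, one column per year (all values are invented).
wide = pd.DataFrame({
    "country": ["A", "B"],
    "1800": [600, 700],
    "1801": [610, 705],
})

# Same reshaping idea as melt_data: the year columns become rows.
long = wide.melt(id_vars="country", var_name="Year", value_name="Income")
long["Year"] = long["Year"].astype(int)

print(long)
```

Each (country, year) pair ends up on its own row, which is the shape seaborn expects when plotting a value against `Year`.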
Next, I need a function to plot the melted datasets.
def make_lineplot(data, yval, title):
    """
    Plot yval against Year, with a red vertical line marking 2018.

    Parameters:
        data (DataFrame): Melted DataFrame to plot
        yval (str): Name of the y-axis column
        title (str): Title of the plot

    Returns:
        None (the plot is displayed)
    """
    plt.figure(figsize=(12, 6))
    sns.lineplot(x='Year', y=yval, data=data)
    plt.title(title, fontsize=20)
    # axvline's ymin/ymax are axes fractions (0 to 1), so the defaults already span the full height
    plt.axvline(x=2018, color='r')
    plt.show()
Now we're ready to look at the data!
One metric used to measure the world economy is average income per person. This is simply the total income of a country divided by its number of people. A higher income per person gives the individual more buying power, which translates to a higher standard of living.
Below, I look at the average income per person for every country from 1800 to 2018, as well as the forecast average income through the year 2040.
The dark blue line is the average across all countries for each year, and the lighter blue band around it is a 95% confidence interval for that average. The red vertical line marks the year 2018: everything to the left of it is actual data, and everything to the right is forecast. This is true for all the plots in this notebook.
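By default, seaborn's `lineplot` aggregates repeated x values with the mean and draws a bootstrapped 95% confidence interval around it. The sketch below approximates that idea for a single year; it is not seaborn's actual code, and the function name `mean_with_bootstrap_ci` and the income values are invented for illustration.

```python
import numpy as np

def mean_with_bootstrap_ci(values, n_boot=1000, ci=95, seed=0):
    """Return (mean, lower, upper) using a percentile bootstrap of the mean."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Resample the countries with replacement and record each resample's mean.
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    boot_means = values[idx].mean(axis=1)
    # Take the middle `ci` percent of the resampled means as the interval.
    half = (100 - ci) / 2
    lower, upper = np.percentile(boot_means, [half, 100 - half])
    return values.mean(), lower, upper

# Invented incomes for five countries in a single year.
incomes_one_year = [500, 800, 1200, 3000, 9000]
mean, lower, upper = mean_with_bootstrap_ci(incomes_one_year)
print(f"mean={mean:.0f}, 95% CI=({lower:.0f}, {upper:.0f})")
```

Note that this is an unweighted average over countries: a small country counts as much as a large one.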
make_lineplot(income_melt, "Income", "Average World Income Per Person")
The plot above shows that people across the world have risen to a much higher standard of living over the last 50 years and are predicted to continue doing so.
But does this truly represent every country? What if growth in the richest and largest nations is masking what is happening in the poorer ones? Are poor countries seeing similar growth?
Let's take a look.
In 2018, Business Insider published an article listing the 28 poorest countries in the world. Below, I filtered the dataset to contain data from only the 28 poorest countries (listed in descending order of poverty ranking).
poorest_countries = [
    'Sudan', 'Benin', 'Chad', 'Nepal', 'Mali', 'Guinea-Bissau', 'Ethiopia',
    'Comoros', 'Tajikistan', 'Haiti', 'Rwanda', 'Guinea', 'Burkina Faso',
    'Liberia', 'Uganda', 'Togo', 'Afghanistan', 'Niger', 'Sierra Leone',
    'Gambia', 'Madagascar', 'Congo, Dem. Rep.', 'Mozambique', 'Yemen',
    'Central African Republic', 'Malawi', 'Burundi', 'South Sudan',
]
Below is the same lineplot as above, but this time only including those 28 countries.
data = income_melt[income_melt.country.isin(poorest_countries)]
make_lineplot(data, "Income", "Average Income Per Person for 28 Poorest Countries")
As you can see, even the poorest countries have been rising to a higher standard of living and are projected to continue doing so (just on a smaller scale). Their growth is slow, but they're moving in the right direction.
Doubling a 1,000 dollar income in 1950 only gives you a 2,000 dollar income in 2018, which may not seem like a lot. But growth compounds: at the same rate, each further doubling adds twice as many dollars as the one before.
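As a rough worked example of that compounding, the sketch below computes the constant yearly growth rate implied by a doubling between 1950 and 2018 and carries it forward. The helper names `annual_growth_rate` and `project` are made up for this illustration, and the 1,000 and 2,000 dollar figures are the round numbers from the text, not values read from the dataset.

```python
def annual_growth_rate(start, end, years):
    """Constant yearly growth rate that turns `start` into `end` over `years`."""
    return (end / start) ** (1 / years) - 1

def project(value, rate, years):
    """Value after compounding at `rate` for `years` years."""
    return value * (1 + rate) ** years

# Doubling from 1,000 to 2,000 dollars between 1950 and 2018.
rate = annual_growth_rate(1_000, 2_000, 2018 - 1950)
print(f"Implied annual growth: {rate:.2%}")

# The same rate carried forward another 68 years doubles the income again.
print(f"Income after another 68 years: {project(2_000, rate, 68):,.0f}")
```

The implied rate is only about 1% per year, yet it adds 1,000 dollars over the first 68 years and 2,000 dollars over the next 68.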
With a higher standard of living comes better health. Next, I look into the average life expectancy for each country from 1800 to 2018. Take a look at the plot, and see if you can figure out why there are two significant drops.
life_expectancy = pd.read_csv('data/life_expectancy_years.csv')
expectancy_melt = melt_data(life_expectancy, "Life_Expectancy").dropna()
make_lineplot(expectancy_melt, "Life_Expectancy", "Average Life Expectancy")
(If you guessed that the drops come from the World Wars, you are correct. The first drop, around 1918, also coincides with the influenza pandemic.)
After 1950, average life expectancy skyrocketed, but before 1950, the average person could have expected to live to about 40 years old. Crazy, right?
Today, it seems normal to meet elders in their 70s, but less than a century ago, they would have been outliers. The plot below shows two distributions of life expectancy, one for 1918 and one for 2018: the change over a century!
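plt.figure(figsize=(12, 6))
# sns.distplot is deprecated; histplot with kde=True (seaborn 0.11+) gives a comparable histogram and density curve
expectancy_1918 = expectancy_melt[expectancy_melt.Year == 1918].Life_Expectancy
expectancy_2018 = expectancy_melt[expectancy_melt.Year == 2018].Life_Expectancy
sns.histplot(expectancy_1918, kde=True, stat='density', color='r', label='1918')
sns.histplot(expectancy_2018, kde=True, stat='density', color='g', label='2018')
plt.title("Average Life Expectancy Over the Last Century", fontsize=20)
plt.legend()
plt.show()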
Each observation in these distributions is one country's average life expectancy. Looking at the distribution for 1918, only countries in the far right tail had a life expectancy above 60 years. Eyeballing it, this looks like fewer than 5% of countries.
Just one hundred years later, these numbers are almost flipped: it is now much more common for a country's life expectancy to be above 60 years than below it.
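Rather than eyeballing the tails, the share can be computed directly, since each row of `expectancy_melt` is one country in one year. This is a minimal sketch on an invented stand-in table; the helper name `share_above` and all the values are assumptions made for illustration.

```python
import pandas as pd

def share_above(df, year, threshold=60, col="Life_Expectancy"):
    """Fraction of rows (countries) in `year` whose `col` exceeds `threshold`."""
    values = df.loc[df["Year"] == year, col]
    return (values > threshold).mean()

# Invented stand-in for expectancy_melt: four countries, two years.
toy = pd.DataFrame({
    "country": ["A", "B", "C", "D"] * 2,
    "Year": [1918] * 4 + [2018] * 4,
    "Life_Expectancy": [25.0, 31.0, 38.0, 61.0, 58.0, 67.0, 74.0, 82.0],
})

print(share_above(toy, 1918))
print(share_above(toy, 2018))
```

With the real data, `share_above(expectancy_melt, 1918)` and `share_above(expectancy_melt, 2018)` would give the actual fractions of countries above 60 years.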
In the grand scheme of things, 100 years is not a lot of time. In fact, it's only about three generations (depending on how you calculate the length of a generation). In other words, life expectancy has roughly doubled since your great-grandmother was born. That's a massive improvement that's often overlooked.
Another measure of the standard of living is the child mortality rate: the number of children who die before the age of 5 for every 1,000 births. Until the year 1890, around 39% of children were expected to die before the age of 5. Think about that.
Today, that number is less than 5%. This is another major improvement and a sign that we are moving in the right direction.
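The data file stores child mortality as deaths per 1,000 births, while the figures above are percentages. The conversion is a division by 10, as in this tiny sketch. The helper name is made up, and the inputs 390 and 50 are simply the 39% and 5% figures from the text expressed per 1,000, not values read from the dataset.

```python
def per_1000_to_percent(deaths_per_1000):
    """Convert a rate per 1,000 births into a percentage."""
    return deaths_per_1000 / 10

# 39% and 5% from the text, expressed as deaths per 1,000 births.
for rate in (390, 50):
    print(f"{rate} per 1,000 births = {per_1000_to_percent(rate):.1f}%")
```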
child_mortality = pd.read_csv('data/child_mortality_0_5_year_olds_dying_per_1000_born.csv')
mortality_melt = melt_data(child_mortality, "death_per_1000")
make_lineplot(mortality_melt, "death_per_1000", "Child Mortality Rate (Death by 0-5 year-olds per 1000 born)")
Because babies now have a higher chance of survival, families have been having fewer of them. The logic used to be "let's have 6 and hope 3 of them live." Now, there's less uncertainty.
According to Rosling, poor families typically have more children than well-off families because they rely on their children to contribute to the family's combined income. As the average income per person increases, however, poorer families tend to have fewer children since they can generate the same income with fewer people.
child_per_woman = pd.read_csv('data/children_per_woman_total_fertility.csv')
child_per_woman_melt = melt_data(child_per_woman, "child_per_woman")
make_lineplot(child_per_woman_melt, "child_per_woman", "Average Child Per Woman")
Now that we know families are having fewer children, how does that affect the total population growth?
I used to think that the population would keep growing indefinitely, and it kind of makes sense to believe this: in theory, the more people there are, the more babies they have, and the bigger the population gets.
But this is far from the truth. Rosling describes in his book how population growth resembles the growth of a human being. In the early stages of population development, growth is rapid. But as the world matures, population growth begins to plateau, and the population might eventually even begin to decrease.
population = pd.read_csv('data/population_total.csv')
population_melt = melt_data(population, "population")
make_lineplot(population_melt, "population", "Population Size")
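Because `make_lineplot` relies on seaborn's default averaging, the curve in the population plot is the average population per country, not the world total. To look at the total and at how fast it grows, the rows can be summed per year. The sketch below does this on an invented stand-in for `population_melt`, so the numbers are illustrative only.

```python
import pandas as pd

# Invented stand-in for population_melt: two countries, three years.
toy_population = pd.DataFrame({
    "country": ["A", "B", "A", "B", "A", "B"],
    "Year": [2000, 2000, 2001, 2001, 2002, 2002],
    "population": [100, 200, 110, 210, 115, 215],
})

# Sum over countries to get one total per year.
world_total = toy_population.groupby("Year")["population"].sum()

# Year-over-year growth of the total; a plateau shows up as growth falling toward zero.
yearly_growth = world_total.pct_change()

print(world_total)
print(yearly_growth)
```

In this toy table the total keeps rising while the growth rate falls, which is the pattern a plateauing population would show.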
In a world of what seems like constant negativity, it's good to acknowledge the good things once in a while. Are we perfect yet? No, far from it. In fact, there are still millions of people living in extreme poverty and harsh conditions around the world, which seems unacceptable today. But the point is that we're moving in the right direction and making a ton of progress.
I hope this notebook was eye-opening. If you have any comments or feedback on how to improve it, please reach out and let me know. My email is colestriler at gmail . com.