Mankiw Romer Weil (1992)

Introduction & motivation

For a long time, a central question in economics has been whether incomes converge across countries. In addressing this question, Mankiw, Romer, Weil (1992) stands out as one of the most influential papers in the field. The purpose of the paper is to test the validity of the Solow model (Solow (1956)), one of the best-known frameworks for understanding the process of economic growth. The model explains economic growth through capital accumulation, labour and population growth, and technological progress (which captures increases in productivity), with investment as the primary source of growth. One of the striking implications of the Solow Model is that it predicts unconditional economic convergence in the long run. According to the model, two countries with the same parameters but different starting points will end up in exactly the same steady state. Consequently, once a country's main economic and demographic parameters are given, its growth path is just a matter of time.

Given these strong implications of the Solow model, it is critical to test whether the model holds with real-world data. This paper derives and simulates the Solow Model and replicates the empirical analysis in Mankiw, Romer, Weil (1992) using Python. In section 2) we present the Solow Model and a model simulation to help understand the underlying process of convergence. In 3) we define an econometric specification and conduct an empirical analysis, taking the expression for income per capita as a reference. In 4) we present the augmented Solow Model as an alternative to the classical Solow Model and again conduct an empirical analysis to test its validity with real-world data. Finally, in 5) we describe the main findings of this paper and some open points for discussion.

Research Questions:

  1. How does the Solow Model work and what are its dynamics?
  2. Does the Solow Model hold with real world data?
  3. Does the augmented Solow Model hold with real world data?

To address question 1), we present and simulate the Solow Model. For questions 2) and 3) we conduct an empirical analysis taking the output per worker expression as a reference.

The Solow Model: derivation and simulation

We first provide a derivation and simulation of the Solow Model with technological progress and growth, given some parameters.

# necessary imports
from scipy import optimize
from numpy import array,arange
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import pandas as pd
import statsmodels.formula.api as sm
from math import log
from statsmodels.iolib.summary2 import summary_col #To include three regression models in one table.
import warnings

# Suppress all warnings
warnings.filterwarnings('ignore')

Assumptions of the Solow model

The central assumptions of the Solow model concern the properties of the production function and the evolution of the three inputs into production (capital K, labor L, and the effectiveness of labor A) over time. The main assumptions are as follows:

  1. Production function has constant returns to scale in its two arguments, capital and effective labor. That is, doubling the quantities of capital and effective labor (for example, by doubling K and L with A held fixed) doubles the amount produced.

    Mathematically, $F(cK, cAL) = cF(K, AL)$ for all $c \geq 0$.

    Intuition: The assumption of constant returns can be thought of as a combination of two separate assumptions. The first is that the economy is big enough that the gains from specialization have been exhausted. In a very small economy, there are likely to be enough possibilities for further specialization that doubling the amounts of capital and labor more than doubles output. The Solow model assumes, however, that the economy is sufficiently large that, if capital and labor double, the new inputs are used in essentially the same way as the existing inputs, and so output doubles.

  2. Inputs other than capital, labor, and the effectiveness of labor are relatively unimportant. In particular, the model neglects land and other natural resources.

  3. The initial levels of capital, labor, and knowledge are taken as given, and are assumed to be strictly positive.

  4. Labor and knowledge grow at constant rates:

    $\dot{L}(t) = nL(t)$

    $\dot{A}(t) = gA(t)$

  5. Existing capital depreciates at rate $\delta$.

  6. The sum of $n$, $g$ and $\delta$ is strictly positive.

Analytics of the Solow model

The Solow model is built around two equations, a production function and a capital accumulation equation.

Production function equation:

The production function describes how inputs such as bulldozers, semiconductors, engineers, and steel-workers combine to produce output. To simplify the model, we group these inputs into two categories, capital, $K$, and labor, $L$, and denote output as $Y$. We also introduce a technology variable, $A$, into the basic Solow model so that it can generate sustained growth in per capita income. The production function is assumed to have the Cobb-Douglas form and is given by:

\begin{align*} Y = K^{\alpha}(AL)^{1-\alpha} \end{align*}

$\alpha$ : output elasticity of capital

$1-\alpha$ : output elasticity of effective labor

Since $\alpha + (1-\alpha) = 1$, this production function displays constant returns to scale, meaning that doubling the usage of capital $K$ and effective labor $AL$ will also double output $Y$.
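As a quick numerical check of the constant-returns property, the sketch below doubles $K$ and $L$ with $A$ held fixed and compares output. The function name `cobb_douglas` and the input values are illustrative assumptions, not part of the original notebook.

```python
# Illustrative check: doubling K and AL doubles Y under Cobb-Douglas.
# All names and numbers here are assumptions chosen for the example.
def cobb_douglas(K, A, L, alpha):
    """Return Y = K**alpha * (A*L)**(1 - alpha)."""
    return K**alpha * (A * L)**(1 - alpha)

alpha_demo = 1 / 3
Y_base = cobb_douglas(K=8.0, A=1.5, L=2.0, alpha=alpha_demo)
# Doubling K and L (A fixed) doubles both arguments, K and AL
Y_doubled = cobb_douglas(K=16.0, A=1.5, L=4.0, alpha=alpha_demo)
crs_ratio = Y_doubled / Y_base
print(crs_ratio)  # approximately 2, up to floating-point rounding
```

The same ratio is obtained for any positive starting values, because the exponents on $K$ and $AL$ sum to one.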

# defining the production function for simulation
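def production_function(K,L,alpha):
    # A (technology level) is read from the global scope and must be defined before this function is called
    return K**alpha*((A*L)**(1-alpha))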

Let's graph this production function and see its shape.

# range of Capital (K) to plot the graphs
range_K = arange(0.00,1300.0,0.01)
type(range_K)

numpy.ndarray

# some exogenous parameters
alpha = 1/3 #Share of capital
A=1.5 #Technology level
s=0.3 #Savings rate
n=0.02 #Population growth
d=0.1 #Depreciation
g=0.1 #Technological growth
L=1 #Labour
plt.title("Production function",fontsize=15)
plt.xlabel("Capital(K)", fontsize=15)
plt.ylabel("Output (Y)",fontsize=15)
plt.plot(range_K,[production_function(i,L,alpha) for i in range_K],label="Y")
#the above code line takes values from range_K array one by one and supplies to production_function to plot the graph

plt.legend() #legend box
plt.grid() #grid lines
plt.axis([-5, 50, -1, 6]) #this removes the extra part of the graph
plt.show()

[Figure: Production function, output (Y) against capital (K)]

The concavity of the graph shows the existence of diminishing marginal returns to capital.
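Diminishing marginal returns can also be seen directly from the derivative of the production function with respect to capital, $\partial Y/\partial K = \alpha K^{\alpha-1}(AL)^{1-\alpha}$. A minimal sketch, with a hypothetical helper name and illustrative parameter values:

```python
# Marginal product of capital for the Cobb-Douglas form.
# The helper name and the capital levels are assumptions for illustration.
def marginal_product_of_capital(K, A, L, alpha):
    """Derivative of K**alpha * (A*L)**(1-alpha) with respect to K."""
    return alpha * K**(alpha - 1) * (A * L)**(1 - alpha)

capital_levels = (1.0, 10.0, 100.0)
mpk_values = [marginal_product_of_capital(K, A=1.5, L=1.0, alpha=1 / 3)
              for K in capital_levels]
for K, mpk in zip(capital_levels, mpk_values):
    print(K, round(mpk, 4))
```

Each additional unit of capital adds less output than the previous one, which is what the concave shape of the curve expresses.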

Now we transform the production function into production per effective worker.

\begin{align*} Y/AL = y = k^{\alpha}, \hspace{0.2cm}where\hspace{0.2cm}k = \frac{K}{AL} \end{align*}

# defining the production per effective worker function for simulation
def production_function_per_eff_w(K,alpha):
    # K is the capital level; dividing by A*L (global variables) gives capital per effective worker
    return ((K/(A*L))**alpha)

Let's graph this production per effective worker function.

plt.title("Production per effective worker function",fontsize=15)
plt.xlabel("Capital per effective worker (k)", fontsize=15)
plt.ylabel("Output per effective worker (y)",fontsize=15)
plt.plot(range_K/(A*L),[production_function_per_eff_w(i,alpha) for i in range_K],label="y")
#The above code line takes values from range_K array one by one and supplies to production_function_per_eff_w to plot the graph.
#In the above code line we have range_K/(A*L) because we defined range_K as Capital values but here we want Capital per effective worker on the x-axis.
#Therefore range_K is divided by (A*L).

plt.legend()
plt.grid()
plt.axis([-5, 50, -1, 6])#this removes the extra part of the graph
plt.show()

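[Figure: Production per effective worker, output per effective worker (y) against capital per effective worker (k)]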

Capital accumulation equation:

This is the second key equation of the Solow model; it describes how capital accumulates. The capital accumulation equation is given by:

\begin{align*} \dot{k} = \frac{dk}{dt} = sy - (n+\delta + g )k ,\hspace{0.2cm}where\hspace{0.2cm}k = \frac{K}{AL} \end{align*}

$s$: savings rate in the economy

$n$: population growth rate

$\delta$: depreciation rate of capital

$g$: technological growth rate

$(n+\delta+g)k$ : effective depreciation in the economy

Let's graph the capital accumulation equation. It has two components: $sy$ and $(n+\delta+g)k$. To plot it, we first need to define a function for effective depreciation.

# defining the effective depreciation
def effective_depreciation(n,d,g,K):
    # K is the capital level; A and L are read from the global scope
    return (n+d+g)*(K/(A*L))

plt.title("Production per effective worker function",fontsize=15)
plt.xlabel("Capital per effective worker (k)", fontsize=15)
plt.ylabel("Output per effective worker (y)",fontsize=15)
plt.plot(range_K/(A*L),[production_function_per_eff_w(i,alpha)*s for i in range_K],label="s.y")
plt.plot(range_K/(A*L),[effective_depreciation(n,d,g,i) for i in range_K],label="(n+d+g).k")
#the above code line takes values from range_K array one by one and supplies to effective_depreciation to plot the graph
#in the above code line we have range_K/(A*L) because we defined range_K as Capital values but here we want Capital per effective worker on the x-axis. Therefore range_K is divided by (A*L).

plt.legend()
plt.grid()
plt.axis([-0.2, 4, -0.2, 1])#this removes the extra part of the graph
plt.show()

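[Figure: Savings (s.y) and effective depreciation ((n+d+g).k) against capital per effective worker (k)]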

Solve for the steady state

A steady state of the economy is defined as any level $k^{*}$ such that, if the economy starts with $k_0 = k^{*}$, then $k_t = k^{*}$ for all $t \geq 1$ (George-Marios Angeletos).

To calculate the steady state $k^{*}$, we set the capital accumulation equation equal to zero.

$\dot{k} = \frac{dk}{dt} = sy - (n+\delta + g )k = 0 ,\hspace{0.2cm}where\hspace{0.2cm}k = \frac{K}{AL}$

$sy = (n+\delta + g )k$

$k^{*}=\frac{sy}{n+\delta + g}=\frac{s(k^{*})^{\alpha}}{n+\delta + g}$

$k^{*}=\big(\frac{s}{n+\delta + g}\big)^{\frac{1}{1-\alpha}}$ : This is the analytical solution for steady state capital per effective worker.
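The algebra behind the closed-form solution can be spelled out as follows, using $y = k^{\alpha}$ from the production function per effective worker and dividing both sides by $(k^{*})^{\alpha}$ (a sketch of the intermediate steps, in the notation above):

\begin{align*}
s(k^{*})^{\alpha} &= (n+\delta+g)k^{*} \\
(k^{*})^{1-\alpha} &= \frac{s}{n+\delta+g} \\
k^{*} &= \left(\frac{s}{n+\delta+g}\right)^{\frac{1}{1-\alpha}}, \hspace{0.2cm} y^{*} = (k^{*})^{\alpha} = \left(\frac{s}{n+\delta+g}\right)^{\frac{\alpha}{1-\alpha}}
\end{align*}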

#solve for kstar
(s/(n+d+g))**(1/(1-alpha))

1.5923842039667508

The steady state $k^{*}$ can also be solved for numerically with the help of the optimize.fsolve function. We just need to find the intersection point between the savings curve and the effective depreciation line.

initial_guess =1
kstar=optimize.fsolve(lambda w: ((production_function_per_eff_w(w*A*L,alpha)*s) - effective_depreciation(n,d,g,w*A*L)),initial_guess)
#optimize.fsolve returns the value of w at which (s*production_function_per_eff_w - effective_depreciation) is zero
#optimize.fsolve is an iterative root finder (a wrapper around MINPACK's hybrd routine), so it needs an initial guess for the solution
#inside the lambda function we multiply w by A*L because both functions (production_function_per_eff_w and effective_depreciation) take Capital as their argument, while we want Capital per effective worker as the output of optimize.fsolve
kstar

array([1.5923842])

We can see that the numerical and analytical solutions are equal.
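As an independent cross-check that does not rely on SciPy, the same root can be located by plain bisection on $sk^{\alpha} - (n+\delta+g)k$. This is only a sketch: the helper name `net_investment` and the bracketing interval are assumptions, and the parameter values mirror the ones used above.

```python
# Cross-check of the steady state with plain bisection (no SciPy needed).
# Parameter values mirror the ones above; the helper names are hypothetical.
s_demo, n_demo, d_demo, g_demo, alpha_demo = 0.3, 0.02, 0.1, 0.1, 1 / 3

def net_investment(k):
    """s*k**alpha - (n+d+g)*k: positive below the steady state, negative above."""
    return s_demo * k**alpha_demo - (n_demo + d_demo + g_demo) * k

# Bracket chosen so that net_investment(lower) > 0 > net_investment(upper)
lower, upper = 0.1, 10.0
for _ in range(100):
    midpoint = 0.5 * (lower + upper)
    if net_investment(midpoint) > 0:
        lower = midpoint   # the root lies to the right of the midpoint
    else:
        upper = midpoint   # the root lies to the left of the midpoint

k_star_bisect = 0.5 * (lower + upper)
k_star_formula = (s_demo / (n_demo + d_demo + g_demo)) ** (1 / (1 - alpha_demo))
print(k_star_bisect, k_star_formula)
```

Bisection is slower than the method behind optimize.fsolve but cannot diverge once the root is bracketed, which makes it a convenient sanity check.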

#plot kstar in graph
plt.title("Production per effective worker function",fontsize=15)
plt.xlabel("Capital per effective worker (k)", fontsize=15)
plt.ylabel("Output per effective worker (y)",fontsize=15)
plt.plot(range_K/(A*L),[production_function_per_eff_w(i,alpha) for i in range_K],label="y")
plt.plot(range_K/(A*L),[production_function_per_eff_w(i,alpha)*s for i in range_K],label="s.y")
plt.plot(range_K/(A*L),[effective_depreciation(n,d,g,i) for i in range_K],label="(n+d+g).k")
#in the above code line we have range_K/(A*L) because we defined range_K as Capital values but here we want Capital per effective worker on the x-axis. Therefore range_K is divided by (A*L).

plt.plot([kstar for i in range_K],[i for i in range_K],'--',label="kstar") #same xvalue (kstar) for different yvalues

plt.legend()
plt.grid()
plt.axis([-0.5, 10, -0.5, 2.5])
plt.show()

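[Figure: Output (y), savings (s.y) and effective depreciation ((n+d+g).k) against capital per effective worker (k), with the steady state kstar marked by a dashed vertical line]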

At steady state, $A$ and $L$ are growing at the rates $g$ and $n$ respectively, and $\frac{Y}{AL}$ is constant. This implies that $Y$ must be growing at the rate $g+n$. Hence the GDP growth rate is $g+n$.

# GDP growth rate
print(round((g+n)*100,2),"percent")

12.0 percent

According to the Solow model, convergence to the steady state is always ensured (MIT 14.05 Lecture Notes: The Solow Model Proposition 4).

We find that $k^{*}=\big(\frac{s}{n+\delta + g}\big)^{\frac{1}{1-\alpha}}$. This implies that steady state capital per effective worker depends on only five parameters: population growth ($n$), technological growth ($g$), depreciation ($\delta$), the savings rate ($s$) and capital's share in income ($\alpha$). This gives rise to the concept of unconditional convergence, which states that if two countries have different levels of economic development (namely different $k_0$ and $y_0$) but otherwise share the same fundamental characteristics (namely the same technologies, saving rates, depreciation rates, and fertility rates), then the poorer country will grow faster than the richer one and will eventually (asymptotically) catch up with it (MIT 14.05 Lecture Notes: The Solow Model Proposition 4).
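The catching-up argument can be illustrated with two hypothetical economies that share all parameters and differ only in initial capital per effective worker. The sketch below assumes a discrete-time law of motion, $k_{t+1} = \frac{sk_t^{\alpha} + (1-\delta)k_t}{(1+n)(1+g)}$; the function name, starting values and horizon are illustrative choices.

```python
# Two hypothetical economies with identical parameters but different k0.
# Discrete-time law of motion for capital per effective worker (an assumption):
#   k_next = (s*k**alpha + (1 - d)*k) / ((1 + n)*(1 + g))
s_demo, n_demo, d_demo, g_demo, alpha_demo = 0.3, 0.02, 0.1, 0.1, 1 / 3

def simulate_k(k0, periods):
    """Return the path of capital per effective worker starting from k0."""
    path = [k0]
    for _ in range(periods):
        k = path[-1]
        path.append((s_demo * k**alpha_demo + (1 - d_demo) * k)
                    / ((1 + n_demo) * (1 + g_demo)))
    return path

path_poor = simulate_k(0.5, 200)   # starts below the steady state
path_rich = simulate_k(3.0, 200)   # starts above the steady state

# First-period growth rates of k: the poorer economy grows faster
growth_poor = path_poor[1] / path_poor[0] - 1
growth_rich = path_rich[1] / path_rich[0] - 1
print(growth_poor > growth_rich)
# Gap between the two economies after 200 periods
print(abs(path_poor[-1] - path_rich[-1]))
```

Because time is discrete in this sketch, the common limit is close to, but not exactly equal to, the continuous-time $k^{*}$ computed above.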

Dynamics of the Solow model

# set the saving rate in the economy to 30%
s=0.3
# initial values
K0 = 1 #Capital
L0 = 1 #Labor
A0 = 1 #Technology level
Y0=((A0*L0)**(1-alpha))*(K0**alpha) #from the production function
Y_AL0=Y0/(A0*L0) #Production per effective worker function
T=100 #Number of years
# initiating the lists of the main variables for the dynamics
# (note: L and A are rebound from scalars to lists here)
Time=[1901] #Year
L=[L0] #Labor
K=[K0] #Capital
A=[A0] #Technology level
Y=[Y0] #Output
Y_AL=[Y_AL0] #Production per effective worker function
for i in range(T):
    L.append((1+n)*L[i]) #for instance L1=(1+n)*L0
    A.append((1+g)*A[i]) #for instance A1=(1+g)*A0
    K.append((s*Y[i]) - (d*K[i]) + K[i]) #for instance K1=(s*Y0) - (d*K0) + K0
    Y.append(((A[i+1]*L[i+1])**(1-alpha))*(K[i+1]**alpha)) #for instance Y1=((A1*L1)**(1-alpha))*(K1**alpha)
    Y_AL.append(Y[i+1]/(A[i+1]*L[i+1])) #for instance Y_AL1=Y1/(A1*L1)
    Time.append(1+Time[i]) #for instance T1=1+T0
# creating the dataframe from the lists to plot the graphs
data = pd.DataFrame({'Time': Time,'Y': Y, 'K': K,'L':L,'A':A, 'Y/AL':Y_AL})
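print(data.to_markdown())

|     | Time | Y | K | L | A | Y/AL |
|----:|-----:|--------:|--------:|--------:|--------:|--------:|
| 0 | 1901 | 1 | 1 | 1 | 1 | 1 |
| 1 | 1902 | 1.14742 | 1.2 | 1.02 | 1.1 | 1.02266 |
| 2 | 1903 | 1.31175 | 1.42423 | 1.0404 | 1.21 | 1.04199 |
| 3 | 1904 | 1.49515 | 1.67533 | 1.06121 | 1.331 | 1.05854 |
| 4 | 1905 | 1.70005 | 1.95634 | 1.08243 | 1.4641 | 1.07273 |
| 5 | 1906 | 1.92914 | 2.27072 | 1.10408 | 1.61051 | 1.08493 |
| 6 | 1907 | 2.18543 | 2.62239 | 1.12616 | 1.77156 | 1.09542 |
| 7 | 1908 | 2.47229 | 3.01578 | 1.14869 | 1.94872 | 1.10446 |
| 8 | 1909 | 2.7935 | 3.45589 | 1.17166 | 2.14359 | 1.11226 |
| 9 | 1910 | 3.15328 | 3.94835 | 1.19509 | 2.35795 | 1.11899 |
| 10 | 1911 | 3.55637 | 4.4995 | 1.21899 | 2.59374 | 1.12481 |
| 11 | 1912 | 4.00809 | 5.11646 | 1.24337 | 2.85312 | 1.12984 |
| 12 | 1913 | 4.5144 | 5.80724 | 1.26824 | 3.13843 | 1.13419 |
| 13 | 1914 | 5.08197 | 6.58084 | 1.29361 | 3.45227 | 1.13795 |
| 14 | 1915 | 5.7183 | 7.44734 | 1.31948 | 3.7975 | 1.14121 |
| 15 | 1916 | 6.43181 | 8.4181 | 1.34587 | 4.17725 | 1.14404 |
| 16 | 1917 | 7.23192 | 9.50583 | 1.37279 | 4.59497 | 1.14648 |
| 17 | 1918 | 8.12923 | 10.7248 | 1.40024 | 5.05447 | 1.1486 |
| 18 | 1919 | 9.13559 | 12.0911 | 1.42825 | 5.55992 | 1.15044 |
| 19 | 1920 | 10.2643 | 13.6227 | 1.45681 | 6.11591 | 1.15204 |
| 20 | 1921 | 11.5304 | 15.3397 | 1.48595 | 6.7275 | 1.15342 |
| 21 | 1922 | 12.9505 | 17.2649 | 1.51567 | 7.40025 | 1.15462 |
| 22 | 1923 | 14.5436 | 19.4235 | 1.54598 | 8.14027 | 1.15566 |
| 23 | 1924 | 16.3306 | 21.8442 | 1.5769 | 8.9543 | 1.15656 |
| 24 | 1925 | 18.3354 | 24.559 | 1.60844 | 9.84973 | 1.15734 |
| 25 | 1926 | 20.5843 | 27.6037 | 1.64061 | 10.8347 | 1.15802 |
| 26 | 1927 | 23.1074 | 31.0186 | 1.67342 | 11.9182 | 1.15861 |
| 27 | 1928 | 25.9379 | 34.849 | 1.70689 | 13.11 | 1.15912 |
| 28 | 1929 | 29.1135 | 39.1455 | 1.74102 | 14.421 | 1.15956 |
| 29 | 1930 | 32.6761 | 43.965 | 1.77584 | 15.8631 | 1.15995 |
| 30 | 1931 | 36.6732 | 49.3713 | 1.81136 | 17.4494 | 1.16028 |
| 31 | 1932 | 41.1576 | 55.4361 | 1.84759 | 19.1943 | 1.16057 |
| 32 | 1933 | 46.1888 | 62.2398 | 1.88454 | 21.1138 | 1.16082 |
| 33 | 1934 | 51.8336 | 69.8725 | 1.92223 | 23.2252 | 1.16104 |
| 34 | 1935 | 58.1668 | 78.4353 | 1.96068 | 25.5477 | 1.16123 |
| 35 | 1936 | 65.2724 | 88.0418 | 1.99989 | 28.1024 | 1.16139 |
| 36 | 1937 | 73.2447 | 98.8194 | 2.03989 | 30.9127 | 1.16154 |
| 37 | 1938 | 82.1893 | 110.911 | 2.08069 | 34.0039 | 1.16166 |
| 38 | 1939 | 92.2249 | 124.477 | 2.1223 | 37.4043 | 1.16177 |
| 39 | 1940 | 103.485 | 139.696 | 2.16474 | 41.1448 | 1.16186 |
| 40 | 1941 | 116.118 | 156.772 | 2.20804 | 45.2593 | 1.16194 |
| 41 | 1942 | 130.292 | 175.93 | 2.2522 | 49.7852 | 1.16201 |
| 42 | 1943 | 146.195 | 197.425 | 2.29724 | 54.7637 | 1.16207 |
| 43 | 1944 | 164.039 | 221.541 | 2.34319 | 60.2401 | 1.16213 |
| 44 | 1945 | 184.059 | 248.599 | 2.39005 | 66.2641 | 1.16217 |
| 45 | 1946 | 206.521 | 278.956 | 2.43785 | 72.8905 | 1.16221 |
| 46 | 1947 | 231.724 | 313.017 | 2.48661 | 80.1795 | 1.16225 |
| 47 | 1948 | 260.001 | 351.232 | 2.53634 | 88.1975 | 1.16228 |
| 48 | 1949 | 291.727 | 394.109 | 2.58707 | 97.0172 | 1.1623 |
| 49 | 1950 | 327.324 | 442.217 | 2.63881 | 106.719 | 1.16233 |
| 50 | 1951 | 367.264 | 496.192 | 2.69159 | 117.391 | 1.16235 |
| 51 | 1952 | 412.076 | 556.752 | 2.74542 | 129.13 | 1.16236 |
| 52 | 1953 | 462.356 | 624.7 | 2.80033 | 142.043 | 1.16238 |
| 53 | 1954 | 518.769 | 700.937 | 2.85633 | 156.247 | 1.16239 |
| 54 | 1955 | 582.064 | 786.474 | 2.91346 | 171.872 | 1.1624 |
| 55 | 1956 | 653.082 | 882.446 | 2.97173 | 189.059 | 1.16241 |
| 56 | 1957 | 732.763 | 990.125 | 3.03117 | 207.965 | 1.16242 |
| 57 | 1958 | 822.165 | 1110.94 | 3.09179 | 228.762 | 1.16243 |
| 58 | 1959 | 922.474 | 1246.5 | 3.15362 | 251.638 | 1.16243 |
| 59 | 1960 | 1035.02 | 1398.59 | 3.2167 | 276.801 | 1.16244 |
| 60 | 1961 | 1161.3 | 1569.24 | 3.28103 | 304.482 | 1.16244 |
| 61 | 1962 | 1302.98 | 1760.7 | 3.34665 | 334.93 | 1.16245 |
| 62 | 1963 | 1461.95 | 1975.53 | 3.41358 | 368.423 | 1.16245 |
| 63 | 1964 | 1640.31 | 2216.56 | 3.48186 | 405.265 | 1.16246 |
| 64 | 1965 | 1840.43 | 2487 | 3.55149 | 445.792 | 1.16246 |
| 65 | 1966 | 2064.97 | 2790.43 | 3.62252 | 490.371 | 1.16246 |
| 66 | 1967 | 2316.9 | 3130.88 | 3.69497 | 539.408 | 1.16246 |
| 67 | 1968 | 2599.57 | 3512.86 | 3.76887 | 593.349 | 1.16246 |
| 68 | 1969 | 2916.72 | 3941.44 | 3.84425 | 652.683 | 1.16247 |
| 69 | 1970 | 3272.56 | 4422.31 | 3.92114 | 717.952 | 1.16247 |
| 70 | 1971 | 3671.82 | 4961.85 | 3.99956 | 789.747 | 1.16247 |
| 71 | 1972 | 4119.78 | 5567.21 | 4.07955 | 868.722 | 1.16247 |
| 72 | 1973 | 4622.4 | 6246.43 | 4.16114 | 955.594 | 1.16247 |
| 73 | 1974 | 5186.34 | 7008.5 | 4.24436 | 1051.15 | 1.16247 |
| 74 | 1975 | 5819.07 | 7863.56 | 4.32925 | 1156.27 | 1.16247 |
| 75 | 1976 | 6529 | 8822.92 | 4.41584 | 1271.9 | 1.16247 |
| 76 | 1977 | 7325.55 | 9899.33 | 4.50415 | 1399.08 | 1.16247 |
| 77 | 1978 | 8219.27 | 11107.1 | 4.59424 | 1538.99 | 1.16247 |
| 78 | 1979 | 9222.02 | 12462.1 | 4.68612 | 1692.89 | 1.16247 |
| 79 | 1980 | 10347.1 | 13982.5 | 4.77984 | 1862.18 | 1.16247 |
| 80 | 1981 | 11609.5 | 15688.4 | 4.87544 | 2048.4 | 1.16247 |
| 81 | 1982 | 13025.8 | 17602.4 | 4.97295 | 2253.24 | 1.16247 |
| 82 | 1983 | 14615 | 19749.9 | 5.07241 | 2478.56 | 1.16247 |
| 83 | 1984 | 16398 | 22159.4 | 5.17386 | 2726.42 | 1.16248 |
| 84 | 1985 | 18398.6 | 24862.9 | 5.27733 | 2999.06 | 1.16248 |
| 85 | 1986 | 20643.2 | 27896.1 | 5.38288 | 3298.97 | 1.16248 |
| 86 | 1987 | 23161.7 | 31299.5 | 5.49054 | 3628.87 | 1.16248 |
| 87 | 1988 | 25987.4 | 35118 | 5.60035 | 3991.75 | 1.16248 |
| 88 | 1989 | 29157.8 | 39402.4 | 5.71235 | 4390.93 | 1.16248 |
| 89 | 1990 | 32715.1 | 44209.6 | 5.8266 | 4830.02 | 1.16248 |
| 90 | 1991 | 36706.3 | 49603.1 | 5.94313 | 5313.02 | 1.16248 |
| 91 | 1992 | 41184.5 | 55654.7 | 6.062 | 5844.32 | 1.16248 |
| 92 | 1993 | 46209 | 62444.6 | 6.18324 | 6428.76 | 1.16248 |
| 93 | 1994 | 51846.5 | 70062.8 | 6.3069 | 7071.63 | 1.16248 |
| 94 | 1995 | 58171.8 | 78610.5 | 6.43304 | 7778.8 | 1.16248 |
| 95 | 1996 | 65268.8 | 88201 | 6.5617 | 8556.68 | 1.16248 |
| 96 | 1997 | 73231.6 | 98961.5 | 6.69293 | 9412.34 | 1.16248 |
| 97 | 1998 | 82165.8 | 111035 | 6.82679 | 10353.6 | 1.16248 |
| 98 | 1999 | 92190.1 | 124581 | 6.96333 | 11388.9 | 1.16248 |
| 99 | 2000 | 103437 | 139780 | 7.10259 | 12527.8 | 1.16248 |
| 100 | 2001 | 116057 | 156833 | 7.24465 | 13780.6 | 1.16248 |

log_Y=[log(x) for x in Y] #Y reaches a very high value in 100 years therefore to plot it nicely we transform it to log values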
fig, ax = plt.subplots(3,1,figsize=(14,16)) #3 subplots in 1 column.

#subplot 1 for Production per effective worker function
ax[0].plot(Time,Y_AL,'r',label='Y/AL')
ax[0].set_title('Production per effective worker function',fontweight="bold")
ax[0].grid()
ax[0].legend()

#subplot 2 for Production
ax[1].plot(Time,Y,'b',label='Y')
ax[1].ticklabel_format(style='plain')
ax[1].set_title('Production',fontweight="bold")
ax[1].grid()
ax[1].legend()

#subplot 3 for log Production
ax[2].plot(Time,log_Y,'g',label='log_Y')
ax[2].set_title('log Production',fontweight="bold")
ax[2].grid()
ax[2].legend()

#Common x-axis and y-axis labels for all the 3 subplots.
fig.supxlabel('Time')#labelling x-axis
fig.supylabel('Values')#labelling y-axis

plt.show()

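[Figure: three panels against Time: production per effective worker (Y/AL), production (Y), and log production (log_Y)]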

In the first graph we observe how production per effective worker evolves over time. Its concave shape reflects diminishing returns to capital. Once the steady state is approached, around 1934, production per effective worker stays at a stable level. According to our simulation, this value is around 1.162.

The second graph shows how production changes over time. The key distinction from production per effective worker is that, after the steady state is attained, production continues to expand at the rate $g+n$. In the long run, production grows exponentially as a result of this constant growth rate.
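Because the simulation updates $A$ and $L$ once per period, the long-run growth factor of $Y$ compounds to $(1+g)(1+n)$, which is slightly above the continuous-time rate $g+n$. The sketch below compares the two with the last two values of Y printed in the simulation table (years 2000 and 2001); the variable names are illustrative.

```python
# Continuous-time versus discrete-time steady state growth of output.
# g and n as in the simulation; the two Y values are copied from the printed table.
g_demo, n_demo = 0.1, 0.02

continuous_rate = g_demo + n_demo                  # rate in the continuous-time model
discrete_rate = (1 + g_demo) * (1 + n_demo) - 1    # per-period rate with compounding

Y_2000, Y_2001 = 103437, 116057                    # last two rows of column Y
simulated_rate = Y_2001 / Y_2000 - 1

print(round(continuous_rate, 4), round(discrete_rate, 4), round(simulated_rate, 4))
```

The difference between the two rates is the cross term $gn$, which vanishes as the period length shrinks.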

Lastly, the third graph shows how the log of production evolves over time. Since production reaches a very high value in 100 years, we transform it to log values to make the evolution of production easier to visualise.

Sensitivity analysis of the Solow model

Since one of the purposes of our research is to show how savings affect growth and steady state levels, it is interesting to see how our model simulation responds to alterations in the rate of savings of the economy.

Note: A change in the saving rate does not affect the steady state GDP growth rate of the economy, because at steady state the GDP growth rate is equal to $g+n$, that is, the sum of the technological and population growth rates. Hence, a change in the saving rate only affects the growth dynamics, as we now show.

savings_rates=[0.20,0.30,0.35,0.40,0.45] #list of different saving values
data_list=[]

#we run a for loop over savings_rates to get the dynamics for every savings rate.
for s in savings_rates:
    # initial values
    K0 = 1 #Capital
    L0 = 1 #Labor
    A0 = 1 #Technology level
    Y0=((A0*L0)**(1-alpha))*(K0**alpha) #from the production function
    Y_AL0=Y0/(A0*L0) #Production per effective worker function
    T=100 #Number of years

    # initiating the lists of the main variables
    Time=[1901] #Year
    L=[L0] #Labor
    K=[K0] #Capital
    A=[A0] #Technology level
    Y=[Y0] #Output
    Y_AL=[Y_AL0] #Production per effective worker function
    for i in range(T):
        L.append((1+n)*L[i])
        A.append((1+g)*A[i])
        K.append((s*Y[i]) - (d*K[i]) + K[i])
        Y.append(((A[i+1]*L[i+1])**(1-alpha))*(K[i+1]**alpha)) #index i+1 picks the values of A, L and K just appended for the next period
        Y_AL.append(Y[i+1]/(A[i+1]*L[i+1]))
        Time.append(1+Time[i])
    log_Y=[log(x) for x in Y]#Y reaches a very high value in 100 years therefore to plot it nicely we transform it to log values

    # creating the dataframes to plot the graphs
    data = pd.DataFrame({'Time': Time,'Y': Y, 'K': K,'L':L,'A':A, 'Y/AL':Y_AL,'log_Y':log_Y})
    data_list.append(data) #all dataframes of dynamics corresponding to different savings rate is stored in data_list

fig, ax = plt.subplots(3,1,figsize=(14,16)) #3 subplots in 1 column.

#subplot 1 for Production per effective worker function
ax[0].plot(data_list[0]['Time'],data_list[0]['Y/AL'],label='Y/AL at s=20%') #for 20% savings rate
ax[0].plot(data_list[1]['Time'],data_list[1]['Y/AL'],label='Y/AL at s=30%') #for 30% savings rate
ax[0].plot(data_list[2]['Time'],data_list[2]['Y/AL'],label='Y/AL at s=35%') #for 35% savings rate
ax[0].plot(data_list[3]['Time'],data_list[3]['Y/AL'],label='Y/AL at s=40%') #for 40% savings rate
ax[0].plot(data_list[4]['Time'],data_list[4]['Y/AL'],label='Y/AL at s=45%') #for 45% savings rate
ax[0].set_title('Production per effective worker function',fontweight="bold")
ax[0].grid()
ax[0].legend()

#subplot 2 for Production
ax[1].plot(data_list[0]['Time'],data_list[0]['Y'],label='Y at s=20%') #for 20% savings rate
ax[1].plot(data_list[1]['Time'],data_list[1]['Y'],label='Y at s=30%') #for 30% savings rate
ax[1].plot(data_list[2]['Time'],data_list[2]['Y'],label='Y at s=35%') #for 35% savings rate
ax[1].plot(data_list[3]['Time'],data_list[3]['Y'],label='Y at s=40%') #for 40% savings rate
ax[1].plot(data_list[4]['Time'],data_list[4]['Y'],label='Y at s=45%') #for 45% savings rate
ax[1].ticklabel_format(style='plain')
ax[1].set_title('Production',fontweight="bold")
ax[1].grid()
ax[1].legend()

#subplot 3 for log Production
ax[2].plot(data_list[0]['Time'],data_list[0]['log_Y'],label='log_Y at s=20%') #for 20% savings rate
ax[2].plot(data_list[1]['Time'],data_list[1]['log_Y'],label='log_Y at s=30%') #for 30% savings rate
ax[2].plot(data_list[2]['Time'],data_list[2]['log_Y'],label='log_Y at s=35%') #for 35% savings rate
ax[2].plot(data_list[3]['Time'],data_list[3]['log_Y'],label='log_Y at s=40%') #for 40% savings rate
ax[2].plot(data_list[4]['Time'],data_list[4]['log_Y'],label='log_Y at s=45%') #for 45% savings rate
ax[2].set_title('log Production',fontweight="bold")
ax[2].grid()
ax[2].legend()

fig.supxlabel('Time')#labelling x-axis
fig.supylabel('Values')#labelling y-axis

plt.show()

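[Figure: three panels against Time for savings rates of 20%, 30%, 35%, 40% and 45%: production per effective worker (Y/AL), production (Y), and log production (log_Y)]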

As the first graph shows, the savings rate determines the steady state level of production per effective worker that a country achieves. If it is too low, effective depreciation outweighs savings at the initial capital level, and production per effective worker falls until it reaches a lower steady state. In the steady state, the higher the savings rate, the higher the level of production per effective worker.
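The ordering of the steady states can be checked against the closed-form expression $y^{*} = \big(\frac{s}{n+\delta+g}\big)^{\frac{\alpha}{1-\alpha}}$, which follows from $y = k^{\alpha}$ and the analytical $k^{*}$. The sketch below uses the continuous-time formula, so its values are close to but not identical with the simulated ones; the function name is a hypothetical helper.

```python
# Steady state output per effective worker for each savings rate.
# Uses the continuous-time closed form; parameter values as in the simulation.
n_demo, d_demo, g_demo, alpha_demo = 0.02, 0.1, 0.1, 1 / 3
s_values = [0.20, 0.30, 0.35, 0.40, 0.45]

def steady_state_output(rate):
    """y* = (s / (n + d + g)) ** (alpha / (1 - alpha))."""
    return (rate / (n_demo + d_demo + g_demo)) ** (alpha_demo / (1 - alpha_demo))

steady_outputs = [steady_state_output(rate) for rate in s_values]
for rate, y_star in zip(s_values, steady_outputs):
    print(rate, round(y_star, 4))
```

With a 20% savings rate the steady state lies below the initial level of 1 implied by K0 = L0 = A0 = 1, which is why that path declines, while all the higher rates lead to steady states above 1.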

The other two graphs have a similar interpretation: the higher the savings rate, the higher the level of production. Although the savings rate does not affect the growth rate once the steady state is reached, it does lead to different growth paths.

Empirical testing of the Solow Model

In this first part we test the validity of the Solow Model with the real-world data used by Mankiw, Romer, Weil (1992).

Econometric Specification, assumptions & preview of the answer

Having derived and simulated the Solow Model, we need an econometric specification to test the model empirically with real-world data. First, we set the accumulation equation for capital per effective worker equal to 0:

\begin{align*} \dot{k_t} = sk_t^{\alpha} - (n + g + \delta)k_t=0, \hspace{0.2cm}where\hspace{0.2cm}k = \frac{K}{AL} \end{align*}

We get the expression for steady state capital per effective worker:

\begin{align*} k^* = \left(\frac{s}{n + g + \delta}\right)^{1/(1 - \alpha)} \end{align*}

Substituting this last expression into the expression for output per worker, $Y_t/L_t = A_t k_t^{\alpha}$, and taking logs, we get the steady state level of output per worker (which can be interpreted as GDP or income per capita) as a function of the savings rate $s$ and $(n+g+\delta)$.

\begin{align*} ln(Y/L) = lnA(0) + \frac{\alpha}{(1 - \alpha)}ln(s) - \frac{\alpha}{(1 - \alpha)}ln(n + g + \delta) \end{align*}
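The substitution behind this expression can be written out step by step, evaluating output per effective worker $k^{\alpha}$ at $k^{*}$ (a sketch of the intermediate steps, in the notation above):

\begin{align*}
\frac{Y}{L} &= A\,(k^{*})^{\alpha} = A\left(\frac{s}{n+g+\delta}\right)^{\frac{\alpha}{1-\alpha}} \\
ln(Y/L) &= lnA + \frac{\alpha}{1-\alpha}\left[ln(s) - ln(n+g+\delta)\right]
\end{align*}

Since $\dot{A}(t) = gA(t)$ implies $lnA(t) = lnA(0) + gt$, and $gt$ is the same for every country at a given date, that term can be absorbed into the constant of a cross-country regression.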

In order to use this specification, we assume $g$ (advancement of knowledge) and $\delta$ (rate of depreciation) to be constant across countries. However, we do allow for differences in levels of technology $A(0)$, setting

\begin{align*} lnA(0) = a + \epsilon \end{align*}

where $a$ is a constant and $\epsilon$ stands for country-specific shocks. Therefore, our final expression is

\begin{align*} ln(Y/L) = a + \frac{\alpha}{(1 - \alpha)}ln(s) - \frac{\alpha}{(1 - \alpha)}ln(n + g + \delta) + \epsilon \end{align*}

Finally, we also assume the savings rate $s$ and population growth $n$ to be independent of the country-specific shocks $\epsilon$. This last assumption is needed to satisfy the exogeneity condition and to estimate the econometric specification by Ordinary Least Squares (OLS). Although the independence assumption is debatable, for the sake of simplicity we do not pursue that issue here and take it as given for now.

Preview answer

Using the OLS method under the assumptions above, we expect to obtain reliable coefficients for $ln(s)$ and $ln(n + g + \delta)$. We expect a positive coefficient on $ln(s)$, a negative coefficient of the same magnitude on $ln(n + g + \delta)$, and a high $R^2$, meaning that the model explains much of the variation in output per worker. Finally, we expect to obtain an implied value of $\alpha$ close to 1/3, the capital share typically found empirically.
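To make the link between the slope coefficient and the implied $\alpha$ concrete, the sketch below generates synthetic data from the specification with a known $\alpha = 1/3$ and recovers it by least squares. It is an illustration only: the data are simulated rather than taken from the dataset, the two coefficients are restricted to be equal and opposite so that a single slope is estimated, and $g+\delta = 0.05$ is the value assumed in Mankiw, Romer, Weil (1992). All variable names are hypothetical.

```python
import math
import random

# Synthetic illustration only: data are generated from the specification itself.
rng = random.Random(0)                        # fixed seed for reproducibility
true_alpha = 1 / 3
true_slope = true_alpha / (1 - true_alpha)    # alpha/(1-alpha) = 0.5
g_plus_delta = 0.05                           # assumption taken from MRW (1992)

x_values, y_values = [], []
for _ in range(200):
    s_i = rng.uniform(0.05, 0.40)             # hypothetical saving rate
    n_i = rng.uniform(0.00, 0.04)             # hypothetical population growth
    eps_i = rng.gauss(0.0, 0.05)              # country-specific shock
    x_i = math.log(s_i) - math.log(n_i + g_plus_delta)
    x_values.append(x_i)
    y_values.append(8.0 + true_slope * x_i + eps_i)   # 8.0 is an arbitrary constant a

# Restricted OLS: one slope on ln(s) - ln(n+g+delta), estimated as cov(x, y) / var(x)
x_mean = sum(x_values) / len(x_values)
y_mean = sum(y_values) / len(y_values)
slope_hat = (sum((x - x_mean) * (y - y_mean) for x, y in zip(x_values, y_values))
             / sum((x - x_mean) ** 2 for x in x_values))

# Invert slope = alpha / (1 - alpha) to get the implied capital share
alpha_hat = slope_hat / (1 + slope_hat)
print(round(slope_hat, 3), round(alpha_hat, 3))
```

An implied $\alpha$ of 1/3 corresponds to a slope of 0.5, so an estimated slope well above 0.5 would imply a capital share above one third.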

Data

The data used for this empirical analysis are taken from the Real National Accounts (Summers and Heston (1988)). The dataset is made publicly available by Professor Bruce E. Hansen of the University of Wisconsin-Madison, USA (https://www.ssc.wisc.edu/~bhansen/econometrics/MRW1992.xlsx). It is the same dataset used in Mankiw, Romer, Weil (1992).

data_url = 'https://www.ssc.wisc.edu/~bhansen/econometrics/MRW1992.xlsx'
#creating the main dataframe df for the analysis
df = pd.read_excel(data_url)
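print(df.head().to_markdown()) #this shows the first five observations

|    | country | N | I | O | Y60 | Y85 | Y_growth | pop_growth | invest | school |
|---:|:-------------|---:|---:|---:|------:|------:|---------:|-----------:|-------:|-------:|
| 0 | Algeria | 1 | 1 | 0 | 2485 | 4371 | 4.8 | 2.6 | 24.1 | 4.5 |
| 1 | Angola | 1 | 0 | 0 | 1588 | 1171 | 0.8 | 2.1 | 5.8 | 1.8 |
| 2 | Benin | 1 | 0 | 0 | 1116 | 1071 | 2.2 | 2.4 | 10.8 | 1.8 |
| 3 | Botswana | 1 | 1 | 0 | 959 | 3671 | 8.6 | 3.2 | 28.3 | 2.9 |
| 4 | Burkina Faso | 1 | 0 | 0 | 529 | 857 | 2.9 | 0.9 | 12.7 | 0.4 |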

Variable definitions:

'country' : Country Name

'N' : 1 if all data is available and oil production is not the dominant industry, 0 otherwise

'I' : 1 if the population in 1960 was greater than one million, 0 otherwise

'O' : 1 if OECD country with I = 1, 0 otherwise

'Y60' : real GDP per working-age person in 1960, in dollars

'Y85' : real GDP per working-age person in 1985, in dollars

'Y_growth' : the yearly average growth rate (%) of real GDP for 1960-1985

'pop_growth' : the yearly average growth rate (%) of the working-age population for 1960-1985

'invest' : the share (%) of real investment (incl. government investment) in real GDP, averaged for 1960-1985

'school' : the fraction (%) of the eligible population enrolled in secondary school × the fraction (%) of the working age population that is of school age (aged 15 to 19), averaged for 1960-1985
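Since 'invest' and 'pop_growth' are stored in percent, the regressors of the specification have to be built by rescaling before taking logs. The sketch below does this for three of the rows printed above. The dictionary keys are hypothetical names, and $g+\delta = 0.05$ is an assumption taken from Mankiw, Romer, Weil (1992) rather than something estimated from the data.

```python
import math

# Three rows of the dataset as printed above: (country, Y85, pop_growth %, invest %)
sample_rows = [
    ("Algeria", 4371, 2.6, 24.1),
    ("Angola", 1171, 2.1, 5.8),
    ("Benin", 1071, 2.4, 10.8),
]
G_PLUS_DELTA = 0.05  # assumption taken from Mankiw, Romer, Weil (1992)

prepared = []
for country, y85, pop_growth, invest in sample_rows:
    prepared.append({
        "country": country,
        "ln_y85": math.log(y85),                               # ln(Y/L) in 1985
        "ln_s": math.log(invest / 100),                        # ln(s), share not percent
        "ln_ngd": math.log(pop_growth / 100 + G_PLUS_DELTA),   # ln(n + g + delta)
    })

for row in prepared:
    print(row)
```

The same three transformations can be applied column-wise to the full dataframe before running the regression.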

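print(df.describe().to_markdown()) #This gives a very basic understanding of the data

|       | N | I | O | Y60 | Y85 | Y_growth | pop_growth | invest | school |
|:------|---------:|---------:|---------:|--------:|--------:|---------:|-----------:|--------:|--------:|
| count | 121 | 121 | 121 | 116 | 108 | 117 | 107 | 121 | 118 |
| mean | 0.809917 | 0.619835 | 0.181818 | 3681.82 | 5683.26 | 4.09402 | 2.27944 | 18.157 | 5.52627 |
| std | 0.393998 | 0.487446 | 0.387298 | 7492.88 | 5688.67 | 1.89146 | 0.998748 | 7.85331 | 3.53204 |
| min | 0 | 0 | 0 | 383 | 412 | -0.9 | 0.3 | 4.1 | 0.4 |
| 25% | 1 | 0 | 0 | 973.25 | 1209.25 | 2.8 | 1.7 | 12 | 2.4 |
| 50% | 1 | 1 | 0 | 1962 | 3484.5 | 3.9 | 2.4 | 17.7 | 4.95 |
| 75% | 1 | 1 | 0 | 4274.5 | 7718.75 | 5.3 | 2.9 | 24.1 | 8.175 |
| max | 1 | 1 | 1 | 77881 | 25635 | 9.2 | 6.8 | 36.9 | 12.1 |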

Data description:

Data has 121 countries.

MRW 1992 divided the data into three samples as follows:

Sample 1: The first subsample is the largest, consisting of the majority of countries available except those dominated by the oil industry. The exclusion of oil-producing countries is justified by the fact that resource extraction accounts for the majority of their GDP. As a result, there are 98 countries in this subsample.

In the dataframe df these countries have an "N" column value equal to 1 (an indicator for non-oil countries).

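print(df[df['N']==1].describe().to_markdown()) #This gives the data description of non oil countries only

|       | N | I | O | Y60 | Y85 | Y_growth | pop_growth | invest | school |
|:------|---:|---------:|---------:|--------:|--------:|---------:|-----------:|--------:|--------:|
| count | 98 | 98 | 98 | 98 | 98 | 98 | 98 | 98 | 98 |
| mean | 1 | 0.765306 | 0.22449 | 2994.9 | 5309.77 | 3.9949 | 2.20102 | 17.6724 | 5.39694 |
| std | 0 | 0.425986 | 0.419391 | 2862.52 | 5277.18 | 1.85913 | 0.889862 | 7.91833 | 3.46899 |
| min | 1 | 0 | 0 | 383 | 412 | -0.9 | 0.3 | 4.1 | 0.4 |
| 25% | 1 | 1 | 0 | 963.75 | 1174.75 | 2.725 | 1.7 | 11.725 | 2.4 |
| 50% | 1 | 1 | 0 | 1818 | 3150 | 3.8 | 2.4 | 17.1 | 4.75 |
| 75% | 1 | 1 | 0 | 4113.25 | 7015 | 5.1 | 2.875 | 23.4 | 8 |
| max | 1 | 1 | 1 | 12362 | 19723 | 9.2 | 4.3 | 36.9 | 11.9 |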

Sample 2: The second subsample excludes not only oil producers but also countries with “bad quality data”, that is, those graded "D" by Summers and Heston (1988), and countries whose population was less than one million in 1960. On the one hand, this subsample is mainly intended to avoid measurement error. On the other hand, small countries are excluded because their real income may be determined by factors other than value added. As a result, this subsample contains a total of 75 countries.

In the dataframe df these countries have an "I" column value equal to 1 (an indicator for intermediate countries).

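print(df[df['I']==1].describe().to_markdown()) #This gives the data description of intermediate countries only

|       | N | I | O | Y60 | Y85 | Y_growth | pop_growth | invest | school |
|:------|---:|---:|---------:|--------:|--------:|---------:|-----------:|--------:|--------:|
| count | 75 | 75 | 75 | 75 | 75 | 75 | 75 | 75 | 75 |
| mean | 1 | 1 | 0.293333 | 3620.76 | 6589.83 | 4.38133 | 2.16667 | 19.3507 | 6.38133 |
| std | 0 | 0 | 0.458356 | 2999.98 | 5410.91 | 1.73623 | 0.975141 | 7.56595 | 3.23309 |
| min | 1 | 1 | 0 | 383 | 608 | 0.9 | 0.3 | 5.4 | 0.5 |
| 25% | 1 | 1 | 0 | 1347 | 2167 | 3.25 | 1.45 | 13.25 | 3.65 |
| 50% | 1 | 1 | 0 | 2382 | 4492 | 4.1 | 2.4 | 19.5 | 6.6 |
| 75% | 1 | 1 | 1 | 5016 | 11183.5 | 5.45 | 2.9 | 24.7 | 8.9 |
| max | 1 | 1 | 1 | 12362 | 19723 | 9.2 | 4.3 | 36.9 | 11.9 |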

Sample 3: Finally, the last subsample takes only 22 OECD countries with population over one million. The data in this subsample seems to be uniformly accurate and adequate, but the size of the sample is unavoidably small and it discards much of the variation in the variables of interest.

In the dataframe df these countries have an "O" column value equal to 1 (an indicator for OECD countries).

print(df[df['O']==1].describe().to_markdown()) #This gives the data description of OECD countries only
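|       |  N |  I |  O |     Y60 |     Y85 | Y_growth | pop_growth |  invest |  school |
|:------|---:|---:|---:|--------:|--------:|---------:|-----------:|--------:|--------:|
| count | 22 | 22 | 22 |      22 |      22 |       22 |         22 |      22 |      22 |
| mean  |  1 |  1 |  1 | 6731.09 | 13131.5 |  3.86818 |    1.00909 | 25.7909 | 9.08636 |
| std   |  0 |  0 |  0 | 2803.65 | 4012.49 | 0.994454 |   0.605459 | 4.98597 | 2.08036 |
| min   |  1 |  1 |  1 |    2257 |    4444 |      2.5 |        0.3 |    17.7 |     4.8 |
| 25%   |  1 |  1 |  1 |  4536.5 | 11388.5 |    3.225 |        0.6 |    22.7 |   7.925 |
| 50%   |  1 |  1 |  1 |  7424.5 |   13594 |     3.75 |       0.75 |   25.35 |     9.1 |
| 75%   |  1 |  1 |  1 |  8314.5 |   15282 |    4.275 |       1.35 |   28.95 |    10.7 |
| max   |  1 |  1 |  1 |   12362 |   19723 |      6.8 |        2.5 |    36.9 |    11.9 |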

Data visualization: We present a brief data visualization section to get a better understanding of the dataset we are dealing with.

fig, ax = plt.subplots(figsize=(25,9))
ax.set_title('real GDP per working-age person in 1985, in dollars',fontweight="bold")
ax.bar(df.sort_values('Y85')['country'],df.sort_values('Y85')['Y85']) #sort and bar plot in one line
ax.xaxis.set_tick_params(rotation=90) #to rotate the xlabels
plt.axhline(y = df['Y85'].mean(),color='r', label='mean Y85') #for mean horizontal line
ax.set_xlabel('Countries')
ax.set_ylabel('Value in dollars')
plt.legend()
plt.grid()
plt.show()

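[Figure: bar chart of real GDP per working-age person in 1985 by country, with the mean shown as a horizontal line]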

The first graph shows real GDP per working-age person in 1985 by country. Although the mean is around 5,500 USD, the differences between countries are large: some, like the US (approx. 20,000 USD), are well above it, while others, like Uganda (approx. 1,000 USD), are well below.

Note: Data for Gambia, Swaziland, Afghanistan, Bahrain, Taiwan is not available.

fig, ax = plt.subplots(figsize=(25,9))
ax.set_title('The yearly average growth rate (%) of the working-age population for 1960-1985',fontweight="bold")

ax.bar(df.sort_values('pop_growth')['country'],df.sort_values('pop_growth')['pop_growth'])#sort and bar plot in one line
plt.axhline(y = df['pop_growth'].mean(),color='r', label='mean pop_growth')#for mean horizontal line
ax.set_xlabel('Countries')
ax.set_ylabel('Value in %')
plt.legend()
ax.xaxis.set_tick_params(rotation=90)#to rotate the xlabels
plt.grid()
plt.show()

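[Figure: bar chart of the yearly average growth rate (%) of the working-age population for 1960-1985 by country, with the mean shown as a horizontal line]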

This second graph follows the same structure, but for the yearly average growth rate (%) of the working-age population for 1960-1985. Most countries cluster around the mean of roughly 2.3%. However, some of the most advanced countries, like the UK, Sweden or Belgium, exhibit very low population growth rates (below 1%).

Note: Data for Gambia, Swaziland, Afghanistan, Bahrain, Taiwan is not available.

fig, ax = plt.subplots(figsize=(25,9))
ax.set_title('The share (%) of real investment (incl. government investment) in real GDP, averaged for 1960-1985',fontweight="bold")

ax.bar(df.sort_values('invest')['country'],df.sort_values('invest')['invest'])#sort and bar plot in one line
plt.axhline(y = df['invest'].mean(),color='r', label='mean invest')#for mean horizontal line
ax.set_xlabel('Countries')
ax.set_ylabel('Value in %')
plt.legend()
ax.xaxis.set_tick_params(rotation=90)#to rotate the xlabels
plt.grid()
plt.show()

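[Figure: bar chart of the share (%) of real investment in real GDP, averaged for 1960-1985, by country, with the mean shown as a horizontal line]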

This graph shows the share of real investment in real GDP, averaged for 1960-1985. The mean is around 18%, but there are large differences across countries. Although the group of countries with a high investment share is quite heterogeneous, the vast majority of countries with a low investment share are low-income countries.

fig, ax = plt.subplots(figsize=(15,5))
ax.set_title('Real GDP 1985 vs Savings rate',fontweight="bold")

ax.scatter(df['invest']/100,df['Y85'])
ax.set_xlabel('investment rate (proxy for savings rates)')
ax.set_ylabel('real GDP per working-age person in 1985, in dollars')
ax.xaxis.set_tick_params(rotation=90)
plt.grid()
plt.show()

[Figure: scatter plot of real GDP per working-age person in 1985 against the investment rate]

This last graph plots real GDP per working-age person in 1985 against the savings rate (using the investment rate as a proxy). The two variables appear to be positively correlated: the higher the savings rate, the higher real GDP.

Note: The graph of the fraction invested in human capital is in the next section.

print(df.head().to_markdown())
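|    | country      |  N |  I |  O |  Y60 |  Y85 | Y_growth | pop_growth | invest | school |
|---:|:-------------|---:|---:|---:|-----:|-----:|---------:|-----------:|-------:|-------:|
|  0 | Algeria      |  1 |  1 |  0 | 2485 | 4371 |      4.8 |        2.6 |   24.1 |    4.5 |
|  1 | Angola       |  1 |  0 |  0 | 1588 | 1171 |      0.8 |        2.1 |    5.8 |    1.8 |
|  2 | Benin        |  1 |  0 |  0 | 1116 | 1071 |      2.2 |        2.4 |   10.8 |    1.8 |
|  3 | Botswana     |  1 |  1 |  0 |  959 | 3671 |      8.6 |        3.2 |   28.3 |    2.9 |
|  4 | Burkina Faso |  1 |  0 |  0 |  529 |  857 |      2.9 |        0.9 |   12.7 |    0.4 |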

Regression Analysis

In this section we conduct two different sets of regressions. The first set consists of a simple OLS regression for each of the three subsamples. For this set we take $\log(Y/L)$ as the dependent variable and $\log(s)$ and $\log(n + g + \delta)$ as the regressors.

Adding new variables as per the required econometric model

#getting log variables to run the regressions

df['log_Y85']=np.log(df['Y85'])
df['log_s']=np.log(df['invest']/100) # MRW 1992 calculate savings rate with investment as a proxy for savings
df['log_school']=np.log(df['school']/100)
# (n + g + delta): population growth (n) plus 0.05, since MRW 1992 assume that g + delta is 0.05
df['log_ngd']=np.log((df['pop_growth']/100)+0.05)
#Unrestricted regression for non oil countries
reg1 = sm.ols("log_Y85 ~ log_s+ log_ngd",data=df[df['N']==1]).fit()

#Unrestricted regression for Intermediate countries
reg2 = sm.ols("log_Y85 ~ log_s+ log_ngd",data=df[df['I']==1]).fit()

#Unrestricted regression for OECD countries
reg3 = sm.ols("log_Y85 ~ log_s+ log_ngd",data=df[df['O']==1]).fit()

#Below is the syntax for sm.ols (the formula is a string; data is a separate keyword argument)
#sm.ols("dependent_variable ~ independent_variable_1 + independent_variable_2 + ...", data=dataframe).fit()
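As a quick sanity check on the transformations above, the sketch below applies them by hand to the Algeria row printed by df.head() (Y85 = 4371, invest = 24.1, pop_growth = 2.6). The helper name `log_vars` is hypothetical and not part of the original notebook; the value 0.05 for g + delta is the MRW (1992) assumption used above.

```python
import math

G_PLUS_DELTA = 0.05  # MRW (1992) assumption for g + delta

def log_vars(y85, invest_pct, pop_growth_pct):
    """Return (log_Y85, log_s, log_ngd) from income in levels and rates in percent."""
    log_y85 = math.log(y85)
    log_s = math.log(invest_pct / 100)                        # investment share as a fraction
    log_ngd = math.log(pop_growth_pct / 100 + G_PLUS_DELTA)   # n + g + delta
    return log_y85, log_s, log_ngd

# Algeria, values taken from the df.head() output
log_y85, log_s, log_ngd = log_vars(4371, 24.1, 2.6)
print(round(log_y85, 3), round(log_s, 3), round(log_ngd, 3))
```

Both rates are fractions below one, so log_s and log_ngd are negative; this is worth keeping in mind when reading the sign of the intercept in the regressions.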

The second set of regressions consists of an OLS regression for each of the three subsamples, imposing the condition that the coefficients on $\log(s)$ and $\log(n + g + \delta)$ are equal in magnitude and opposite in sign. This condition comes from the derivation of the steady state in the Solow model. For this set we therefore take $\log(Y/L)$ as the dependent variable and the difference between $\log(s)$ and $\log(n + g + \delta)$ as the single regressor.

df['s_minus_ngd']=df['log_s'] - df['log_ngd'] 
#We take differences to impose the condition that both coefficients have to be of same magnitude with opposite sign.
#Restricted regression for non oil countries
reg1_restricted = sm.ols("log_Y85 ~ s_minus_ngd",data=df[df['N']==1]).fit()

#Restricted regression for Intermediate countries
reg2_restricted = sm.ols("log_Y85 ~ s_minus_ngd",data=df[df['I']==1]).fit()

#Restricted regression for OECD countries
reg3_restricted = sm.ols("log_Y85 ~ s_minus_ngd",data=df[df['O']==1]).fit()
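The restriction imposed above can also be tested formally by comparing the residual sums of squares of the unrestricted and restricted fits with an F statistic. The sketch below is a minimal illustration on synthetic data (not the MRW dataset), using only NumPy; the helper `ssr` and all simulated values are assumptions made for the example.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared residuals from an OLS fit of y on X (X already includes a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

# Synthetic data for illustration only (NOT the MRW dataset)
rng = np.random.default_rng(0)
n_obs = 98
log_s = rng.normal(-1.8, 0.5, n_obs)
log_ngd = rng.normal(-2.6, 0.15, n_obs)
# Generated so that the restriction holds: coefficients +1.4 and -1.4
log_y = 7.0 + 1.4 * log_s - 1.4 * log_ngd + rng.normal(0, 0.6, n_obs)

const = np.ones(n_obs)
X_unres = np.column_stack([const, log_s, log_ngd])   # unrestricted model
X_res = np.column_stack([const, log_s - log_ngd])    # restriction imposed by differencing

ssr_u, ssr_r = ssr(X_unres, log_y), ssr(X_res, log_y)
q = 1                       # number of restrictions
k = X_unres.shape[1]        # parameters in the unrestricted model
f_stat = ((ssr_r - ssr_u) / q) / (ssr_u / (n_obs - k))
print(f"F({q}, {n_obs - k}) = {f_stat:.3f}")
```

A small F statistic relative to the critical value of the F(1, n - 3) distribution means the restriction is not rejected. With the fitted models above, a call such as `reg1.f_test("log_s + log_ngd = 0")` should give the same test, assuming the statsmodels formula interface is in use.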

Results

info_dictu = {'N': lambda x: x.nobs,'s.e.e.': lambda x: np.sqrt(x.scale)} 
# above code adds extra info in unrestricted regression table like number of observations and standard error of estimate

info_dictr = {'N': lambda x: x.nobs, 's.e.e.': lambda x: np.sqrt(x.scale),
'Implied α': lambda x: f"{x.params['s_minus_ngd']/(1 + x.params['s_minus_ngd']):.2f}"}
# above code adds extra info in restricted regression table like number of observations, standard error of estimate and Implied alpha.
# implied alpha calculation: alpha/(1-alpha) = b, where b is the coefficient on s_minus_ngd; solving for alpha gives alpha = b/(1+b). Refer MRW 1992
# params is indexed by regressor name (label-based access avoids relying on positional indexing of a labelled Series).



results_unres = summary_col(results = [reg1, reg2, reg3],float_format='%0.3f',stars = True,
model_names = ['Non-Oil','Intermediate','OECD'],
info_dict = info_dictu, regressor_order = ['Intercept','log_s','log_ngd'])
#model_names for Column heading
#info_dict to add extra info
#float_format='%0.3f' : print results upto three decimal places
#in unrestricted table, implied alpha is not required.
#regressor_order tells you what independent variables you want to print first.

results_res = summary_col(results = [reg1_restricted, reg2_restricted, reg3_restricted],float_format='%0.3f',
stars = True, model_names = ['Non-Oil','Intermediate','OECD'],
info_dict = info_dictr,regressor_order = ['Intercept','s_minus_ngd'])


results_unres.add_title('Unrestricted Regressions')
results_res.add_title('Restricted Regressions')


print(results_unres)

print('\n') #add some space between the two tables

print(results_res)

[Output: unrestricted and restricted regression tables for the Non-Oil, Intermediate and OECD samples]

The two sets of results are presented in this section. The top table shows the unrestricted results for the three subsamples, given the econometric specification described in 3). The first notable fact is that the signs of the coefficients are as expected in all three subsamples: positive for the savings rate (in logs) and negative for population growth (in logs). Moreover, the coefficients are statistically significant in the first two subsamples. In the third subsample they are not, which is likely related to its small size and limited variation. The $R^2$ tells a similar story: it is relatively high in the first two subsamples (0.59 and 0.58 respectively), while in the OECD subsample it drops to 0.0118.

In the bottom table we show the results under the restriction for the three subsamples. The results are very similar once the constraint is applied, which suggests that the data do not reject it. The coefficient is similar and significant, and the $R^2$ in the first two subsamples is again high. The third subsample still suffers from its small size.

When it comes to the magnitude of the coefficients, however, these results do not support the Solow Model. According to the output per worker equation presented in the econometric specification, the reported coefficients imply a capital share ($\alpha$) of around 0.59. This is much higher than the empirical capital share, which is around 1/3. Although the implied $\alpha$ in the OECD subsample, 0.36, is very close to 1/3, the low $R^2$ prevents us from trusting this result. Therefore, looking at the first two subsamples, we can say that although the regressions explain a high proportion of income variation, we cannot conclude that the Solow Model holds with real data. The high savings coefficient may be capturing the effect of other components, such as externalities from capital accumulation.
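To make the size of this gap concrete, the short sketch below converts between the restricted coefficient and the implied capital share using the relation coefficient = α/(1 - α). The function names are hypothetical helpers introduced only for this illustration.

```python
def implied_alpha(coef):
    """Invert coef = alpha / (1 - alpha) to recover alpha."""
    return coef / (1 + coef)

def coef_from_alpha(alpha):
    """Coefficient on ln(s) - ln(n+g+delta) implied by a given capital share."""
    return alpha / (1 - alpha)

print(round(coef_from_alpha(1 / 3), 3))   # coefficient expected if alpha = 1/3
print(round(coef_from_alpha(0.59), 3))    # coefficient consistent with alpha = 0.59
print(round(implied_alpha(0.5), 3))       # round trip back to the capital share
```

A capital share of 1/3 corresponds to a coefficient of 0.5, whereas a share of 0.59 corresponds to a coefficient of about 1.44, so the estimated effect of savings is close to three times what the textbook model predicts.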

Empirical testing of the augmented Solow model

In this second part we follow Mankiw, Romer, Weil (1992) to provide the extension of the Solow Model adding human capital. We first present the model theoretically and then provide an empirical analysis similar to the one conducted before.

Augmented Solow Model

Since in this case we include human capital, the Cobb-Douglas production function is

\begin{align*}
Y_t = K_t^{\alpha}H_t^{\beta}(A_tL_t)^{1 - \alpha - \beta}
\end{align*}

Where H is the stock of human capital and the rest of the variables remain the same as before. Now, we will have not only a capital accumulation equation, but also a human capital accumulation equation evolving over time.

\begin{align*}
\dot{k}_t &= s_{k}y_t - (n + g +\delta)k_t\\
\dot{h}_t &= s_{h}y_t - (n + g +\delta)h_t
\end{align*}

Where $k = \frac{K}{AL}$, $y = \frac{Y}{AL}$ and $h = \frac{H}{AL}$. Also, $s_k$ stands for the fraction of income invested in physical capital, while $s_h$ stands for the fraction invested in human capital. We assume that the same production function applies to human capital, physical capital and consumption. It is also assumed that the depreciation rate $\delta$ is the same for both human and physical capital. The last assumption is that $\alpha + \beta < 1$, which implies that there are decreasing returns to all capital and hence that the economy converges to a steady state where

\begin{align*}
k^* &= \left(\frac{s_k^{1-\beta}s_h^{\beta}}{n + g + \delta}\right)^{1/(1 - \alpha - \beta)}\\
h^* &= \left(\frac{s_k^{\alpha}s_h^{1-\alpha}}{n + g + \delta}\right)^{1/(1 - \alpha - \beta)}
\end{align*}

Given $h^*$ and $k^*$ we can derive the following steady state level of output per worker:

\begin{align*}
\ln(Y_t/L_t) = \ln A(0) + gt + \frac{\alpha}{1 - \alpha - \beta}\ln(s_k) + \frac{\beta}{1 - \alpha - \beta}\ln(s_h) - \frac{\alpha + \beta}{1 - \alpha - \beta}\ln(n + g + \delta)
\end{align*}

This equation implies two things. First, the presence of human-capital accumulation increases the impact of physical-capital accumulation on income. Second, high population growth lowers income per capita because the amounts of both physical and human capital must be spread more thinly over the population.
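The steady-state expressions can be checked numerically. The sketch below plugs illustrative parameter values (assumptions chosen for the example, not estimates from the data) into the closed forms for $k^*$ and $h^*$, confirms that both accumulation equations are zero there, and compares the resulting log output per effective worker with the regression-style formula.

```python
import math

# Illustrative parameter values (assumptions, not estimates from the data)
alpha, beta = 1 / 3, 1 / 3
s_k, s_h = 0.20, 0.08
n, g_plus_delta = 0.02, 0.05
ngd = n + g_plus_delta
theta = 1 / (1 - alpha - beta)

# Steady-state stocks per effective worker, from the closed-form expressions
k_star = (s_k ** (1 - beta) * s_h ** beta / ngd) ** theta
h_star = (s_k ** alpha * s_h ** (1 - alpha) / ngd) ** theta
y_star = k_star ** alpha * h_star ** beta

# Both accumulation equations should be (numerically) zero at the steady state
k_dot = s_k * y_star - ngd * k_star
h_dot = s_h * y_star - ngd * h_star

# Log output per effective worker from the regression-style formula
# (the ln A(0) + g*t terms drop out because y is per effective worker)
ln_y_formula = theta * (alpha * math.log(s_k) + beta * math.log(s_h)
                        - (alpha + beta) * math.log(ngd))

print(k_dot, h_dot)
print(math.log(y_star), ln_y_formula)
```

With $\alpha = \beta = 1/3$ the coefficient on $\ln(s_k)$ is $\alpha/(1-\alpha-\beta) = 1$, against $\alpha/(1-\alpha) = 0.5$ in the model without human capital, which illustrates the first implication noted above.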

The output per worker equation can also be expressed as a function of the level of $h^*$ instead of $s_h$:

\begin{align*}
\ln(Y_t/L_t) = \ln A(0) + gt + \frac{\alpha}{1 - \alpha}\ln(s_k) + \frac{\beta}{1 - \alpha}\ln(h^*) - \frac{\alpha}{1 - \alpha}\ln(n + g + \delta)
\end{align*}

We can notice that this expression is similar to the econometric specification used before. The main difference is that it now includes the stock of human capital, which in the previous specification was part of the error term and hence a potential source of omitted variable bias (OVB).
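The expression in terms of $h^*$ follows in a few steps. The sketch below uses only the physical-capital accumulation equation at the steady state and the production function in intensive form, $y = k^{\alpha}h^{\beta}$, which is implied by the Cobb-Douglas function above.

```latex
\begin{align*}
\dot{k}_t = 0 \;&\Rightarrow\; k^* = \frac{s_k\, y^*}{n + g + \delta} \\
y^* = (k^*)^{\alpha}(h^*)^{\beta} \;&\Rightarrow\; (y^*)^{1-\alpha} = \left(\frac{s_k}{n + g + \delta}\right)^{\alpha}(h^*)^{\beta} \\
\ln y^* &= \frac{\alpha}{1-\alpha}\ln(s_k) - \frac{\alpha}{1-\alpha}\ln(n + g + \delta) + \frac{\beta}{1-\alpha}\ln(h^*)
\end{align*}
```

Adding $\ln A(0) + gt$ converts output per effective worker into output per worker, since $Y_t/L_t = A_t y_t$ with $\ln A_t = \ln A(0) + gt$.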

Econometric Specification, assumptions & preview of the answer

Given that the stock of human capital is very hard to estimate, we rely on the expression of $\ln(Y_t/L_t)$ as a function of $\ln(s_k)$, $\ln(s_h)$ and $\ln(n + g + \delta)$ as the final econometric specification.

\begin{align*}
\ln(Y_t/L_t) = \ln A(0) + gt + \frac{\alpha}{1 - \alpha - \beta}\ln(s_k) + \frac{\beta}{1 - \alpha - \beta}\ln(s_h) - \frac{\alpha + \beta}{1 - \alpha - \beta}\ln(n + g + \delta)
\end{align*}

The assumptions in this case are similar to those made for the Solow Model. First, to use this specification, we assume $g$ (advancement of knowledge) and $\delta$ (rate of depreciation) to be constant across countries. Second, in order to satisfy the exogeneity condition and estimate the model with Ordinary Least Squares (OLS), we need to assume that $s_k$, $s_h$ and $n$ are exogenous, that is, uncorrelated with the error term.
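To see why the exogeneity assumption matters, the sketch below simulates what happens when a relevant regressor that is correlated with savings is left in the error term. The data are synthetic and every coefficient value is a hypothetical choice for the illustration; nothing here is estimated from the MRW dataset.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs = 5000

# Synthetic regressors: log_sh is positively correlated with log_sk by construction
log_sk = rng.normal(0.0, 1.0, n_obs)
log_sh = 0.6 * log_sk + rng.normal(0.0, 0.8, n_obs)

true_b_sk, true_b_sh = 0.5, 0.7   # hypothetical "true" coefficients
log_y = 1.0 + true_b_sk * log_sk + true_b_sh * log_sh + rng.normal(0.0, 0.5, n_obs)

def ols(X, y):
    """OLS coefficients of y on X (X must already contain a constant column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

const = np.ones(n_obs)
b_short = ols(np.column_stack([const, log_sk]), log_y)          # human capital omitted
b_long = ols(np.column_stack([const, log_sk, log_sh]), log_y)   # human capital included

print("coef on log_sk, short regression:", round(b_short[1], 3))
print("coef on log_sk, long regression: ", round(b_long[1], 3))
```

In this setup the short regression attributes part of the effect of the omitted variable to savings, so its coefficient is pushed above the true value of 0.5, while the long regression recovers a value close to it.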

Preview answer

Using OLS under the assumptions above, we expect to obtain reliable coefficients on $\ln(s_h)$, $\ln(s_k)$ and $\ln(n + g + \delta)$. We expect positive coefficients on $\ln(s_k)$ and $\ln(s_h)$, and a negative coefficient on $\ln(n + g + \delta)$ whose magnitude equals the sum of the other two. We also expect a high $R^2$ and an implied value of $\alpha$ close to 1/3 (the empirical capital share).

Data and subsamples

The dataset used is the same as before. In this case we include the variable school as a proxy for the fraction invested in human capital. This variable measures approximately the percentage of the working-age population that is in secondary school. Although far from perfect, as long as it is proportional to $s_h$ we can use it to estimate our equations, because in logs a constant factor of proportionality only shifts the intercept.
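The role of proportionality can be illustrated with a toy example: if the observed proxy equals a constant times the true rate, a regression on the log of the proxy yields the same slope as a regression on the log of the true rate. All numbers below are made up for the illustration, and `simple_ols` is a hypothetical helper.

```python
import math

def simple_ols(x, y):
    """Slope and intercept of a one-regressor OLS fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

# Made-up values for illustration
s_h = [0.02, 0.04, 0.05, 0.08, 0.10]     # unobserved "true" investment rate
log_y = [7.1, 7.9, 8.0, 8.6, 8.8]        # outcome
c = 0.5                                   # hypothetical proportionality factor
school = [c * v for v in s_h]             # observed proxy

slope_true, icpt_true = simple_ols([math.log(v) for v in s_h], log_y)
slope_proxy, icpt_proxy = simple_ols([math.log(v) for v in school], log_y)

print(slope_true, slope_proxy)   # slopes agree
print(icpt_true, icpt_proxy)     # icpt_proxy = icpt_true - slope * ln(c)
```

If the proxy is not proportional to the true rate, the slope is no longer guaranteed to be preserved, which is why the quality of the school variable matters for the implied value of $\beta$.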

Data visualisation

fig, ax = plt.subplots(figsize=(25,9))
ax.set_title('Fraction invested in human capital',fontweight="bold")
ax.bar(df.sort_values('school')['country'],df.sort_values('school')['school'])#sort and bar plot in one line
ax.xaxis.set_tick_params(rotation=90)#to rotate the xlabels
plt.axhline(y = df['school'].mean(),color='r', label='mean school')#for mean horizontal line
ax.set_xlabel('Countries')
ax.set_ylabel('Value in %')
plt.legend()
plt.grid()
plt.show()

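[Figure: bar chart of the fraction invested in human capital (school, %) by country, with the mean shown as a horizontal line]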

Since for this model we add the stock of human capital, we present a graph for the fraction invested in human capital by countries. The mean is between 5-6%, but the differences between countries are very significant. Similar to the insights taken from the share of savings, most of the countries that report low fraction invested in human capital are low income countries.

Regression Analysis

Similar to what we did for the Solow Model, we conduct two sets of regressions. The first set is unrestricted: for each of the three subsamples we take $\log(Y/L)$ as the dependent variable and $\log(s_h)$, $\log(s_k)$ and $\log(n + g + \delta)$ as the regressors.

#Unrestricted regression for non oil countries
reg1_hc = sm.ols("log_Y85 ~ log_s+ log_ngd + log_school",data=df[df['N']==1]).fit()

#Unrestricted regression for Intermediate countries
reg2_hc = sm.ols("log_Y85 ~ log_s+ log_ngd+ log_school",data=df[df['I']==1]).fit()

#Unrestricted regression for OECD countries
reg3_hc = sm.ols("log_Y85 ~ log_s+ log_ngd+ log_school",data=df[df['O']==1]).fit()

The second set is restricted. According to the expression for $\ln(Y_t/L_t)$ derived before, we impose that the coefficients on $\ln(s_k)$, $\ln(s_h)$ and $\ln(n + g + \delta)$ sum to zero. For this set we therefore take $\log(Y/L)$ as the dependent variable and use two regressors: the difference between $\log(s_k)$ and $\log(n + g + \delta)$, and the difference between $\log(s_h)$ and $\log(n + g + \delta)$.

df['school_minus_ngd']=df['log_school'] - df['log_ngd'] #We take differences to impose the condition that
#(n+g+\delta), s_h and s_k coefficients sum to 0.
#Restricted regression for non oil countries
reg1_restricted_hc = sm.ols("log_Y85 ~ s_minus_ngd + school_minus_ngd ",data=df[df['N']==1]).fit()

#Restricted regression for Intermediate countries
reg2_restricted_hc = sm.ols("log_Y85 ~ s_minus_ngd + school_minus_ngd",data=df[df['I']==1]).fit()

#Restricted regression for OECD countries
reg3_restricted_hc = sm.ols("log_Y85 ~ s_minus_ngd + school_minus_ngd",data=df[df['O']==1]).fit()
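The differenced regressors used above follow directly from substituting the restriction into the unrestricted equation. In the sketch below, $b_0, \dots, b_3$ are generic labels for the regression coefficients, introduced only for this illustration.

```latex
\begin{align*}
\ln(Y/L) &= b_0 + b_1\ln(s_k) + b_2\ln(s_h) + b_3\ln(n+g+\delta), \qquad b_1 + b_2 + b_3 = 0 \\
         &= b_0 + b_1\ln(s_k) + b_2\ln(s_h) - (b_1 + b_2)\ln(n+g+\delta) \\
         &= b_0 + b_1\left[\ln(s_k) - \ln(n+g+\delta)\right] + b_2\left[\ln(s_h) - \ln(n+g+\delta)\right]
\end{align*}
```

The two bracketed terms correspond to the columns s_minus_ngd and school_minus_ngd, so estimating the last line by OLS imposes the restriction by construction.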

Results

info_dictu = {'N': lambda x: x.nobs,'s.e.e.': lambda x: np.sqrt(x.scale)}
# above code adds extra info in unrestricted regression table like number of observations and standard error of estimate

info_dictr = {'N': lambda x: x.nobs,'s.e.e.': lambda x: np.sqrt(x.scale),
'Implied α': lambda x: f"{x.params['s_minus_ngd']/(1 + x.params['s_minus_ngd'] + x.params['school_minus_ngd']):.3f}",
'Implied β': lambda x: f"{x.params['school_minus_ngd']/(1 + x.params['s_minus_ngd'] + x.params['school_minus_ngd']):.3f}"}
# above code adds extra info in restricted regression table like number of observations, standard error of estimate and implied alpha & beta.
# implied alpha & beta calculation: refer the markdown cell after this code
# params is indexed by regressor name (label-based access avoids relying on positional indexing of a labelled Series).
# params['s_minus_ngd'] and params['school_minus_ngd'] are the coefficients on the two differenced regressors.

results_unres = summary_col(results = [reg1_hc, reg2_hc, reg3_hc],float_format='%0.3f',stars = True,
model_names = ['Non-Oil','Intermediate','OECD'],info_dict = info_dictu,
regressor_order = ['Intercept','log_s','log_ngd','log_school'])

#model_names for Column heading
#info_dict to add extra info
#float_format='%0.3f' : print results upto three decimal places
#in unrestricted table, implied alpha & beta is not required.
#regressor_order tells you what independent variables you want to print first.

results_res = summary_col(results = [reg1_restricted_hc, reg2_restricted_hc, reg3_restricted_hc],float_format='%0.3f',
stars = True,model_names = ['Non-Oil','Intermediate','OECD'],info_dict = info_dictr,
regressor_order = ['Intercept','s_minus_ngd','school_minus_ngd'])


results_res.add_title('Restricted Regressions')
results_unres.add_title('Unrestricted Regressions')
print(results_unres)
print('\n')
print(results_res)

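[Output: unrestricted and restricted regression tables of the augmented model for the Non-Oil, Intermediate and OECD samples]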

Note: Implied $\alpha$ and $\beta$ calculation in the restricted regression:

Under the restriction, the steady state equation can be written in terms of the two differenced regressors, so that

$\frac{\alpha}{1-\alpha-\beta}$ = regression coefficient of s_minus_ngd

$\frac{\beta}{1-\alpha-\beta}$ = regression coefficient of school_minus_ngd

We have two unknowns ($\alpha$ and $\beta$) and two equations, since the restricted regression provides the coefficients of s_minus_ngd and school_minus_ngd. Noting that one plus the sum of the two coefficients equals $\frac{1}{1-\alpha-\beta}$, we can solve for $\alpha$ and $\beta$:

$\alpha= \frac{s\_minus\_ngd}{1+s\_minus\_ngd+school\_minus\_ngd}$

$\beta= \frac{school\_minus\_ngd}{1+s\_minus\_ngd+school\_minus\_ngd}$

Here s_minus_ngd and school_minus_ngd denote the estimated coefficients on those regressors, as used in the info_dictr lambdas in the above code.
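The mapping between the restricted coefficients and the implied shares can be checked with a round trip: start from chosen values of $\alpha$ and $\beta$, compute the coefficients they imply, and recover the shares again. The function names are hypothetical helpers and the value 0.3 is an arbitrary choice for the illustration.

```python
def implied_shares(b_sk, b_sh):
    """Recover (alpha, beta) from the two restricted-regression coefficients."""
    denom = 1 + b_sk + b_sh
    return b_sk / denom, b_sh / denom

def coefs_from_shares(alpha, beta):
    """Coefficients implied by given shares: alpha/(1-alpha-beta) and beta/(1-alpha-beta)."""
    denom = 1 - alpha - beta
    return alpha / denom, beta / denom

b_sk, b_sh = coefs_from_shares(0.3, 0.3)
alpha_hat, beta_hat = implied_shares(b_sk, b_sh)
print(round(b_sk, 3), round(b_sh, 3))
print(round(alpha_hat, 3), round(beta_hat, 3))
```

Shares of 0.3 each imply coefficients of 0.75 on both differenced regressors, and applying the inverse formula returns the original shares.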

Comment on the regression results

This section presents the results of the empirical analysis of the augmented Solow Model. The first thing to notice is that the coefficients do not vary much when the restriction is imposed, which suggests that the data do not reject it. The signs of the coefficients are, again, as expected, and in the first two subsamples all of them are statistically significant.

A first important difference with respect to the results for the simple Solow model is that in this case the $R^2$ rises to 0.78. That is, when adding human capital to the regressions, we explain a greater share of the variation in output per worker. The second relevant difference is the magnitude of the coefficients. With the reported coefficients for the augmented Solow Model, the implied $\alpha$ ranges between 0.29 and 0.31, which is much closer to the 1/3 found empirically.

Again, results from the OECD sample are imprecise because of its very small size.

In general, these results seem to support the augmented Solow Model. That is, adding human capital to the Solow Model helps explain variations in output per worker, even though the proxy for the fraction invested in human capital is not perfect.

Conclusions and discussion:

In this paper, we followed Mankiw, Romer, Weil (1992) to address three questions: 1) How does the Solow Model work and what are its dynamics? 2) Does the Solow Model hold with real world data? 3) Does the augmented Solow Model hold with real world data?

For the first question, we derived and explained the Solow Model and conducted a model simulation showing its dynamics. The Solow Model is characterised by diminishing returns to capital, which leads to a common steady state for all countries that have the same parameters: population growth ($n$), technological growth ($g$), depreciation ($\delta$), savings rate ($s$) and capital's share in income ($\alpha$). The model hence implies that countries sharing these parameter values converge to the same steady state regardless of their starting point.

Given these astonishing implications, we tested the validity of the model with real world data through an empirical analysis. We used the steady state output per worker equation as the reference for the econometric specification. Although the signs of the reported coefficients are correct and the restriction is not rejected, their magnitude seems to overstate the influence of savings. The most problematic result is the implied value of alpha: it is around 0.59, while the empirical capital share is around 1/3. These results do not support the Solow Model as an explanation of income per capita variations.

As an alternative, in section 4 we presented the augmented Solow Model, which includes the stock of human capital as an input in the production function. Once the model was derived, we again used the expression of output per worker in the steady state as the econometric specification to test the model with real world data. The results in this case seem to support the augmented Solow Model, since the reported coefficients are as expected and the implied value of alpha is much closer to 1/3.

However, these results are not exempt from discussion because they heavily rely on strong assumptions made to avoid endogeneity problems. The first potential problem is the existence of Omitted Variable Bias; many variables such as the quality of institutions or the land are not included and may well be correlated with other variables such as the savings rate. The second major issue that could arise is Measurement Error. It is recognized by the authors that not all available data is perfect and homogeneous. Specifically, in the case of the “school” variable, it is stated that it is a very imperfect proxy to capture the fraction invested in human capital. The third potential issue is Reverse Causality; in this paper, we regress output per worker on savings rate. However, it could be that savings rate is also affected by the level of output per worker. In order to avoid this issue, we should find a completely exogenous variation of savings (maybe through an instrument).

All in all, although the results of this paper should not be interpreted causally, they do show that adding human capital to the Solow model allows it to explain much more of the international variation in income per capita (or output per worker), with realistic implied values of alpha and beta.

References

Barro, R., & Sala-i-Martin, X. (2004). Economic growth (2nd ed.). MIT press.

Blanchard, O. J., & Fischer, S. (1989). Lectures on macroeconomics. MIT press.

Data: Professor Bruce E. Hansen of University of Wisconsin Madison, USA.

Mankiw, N. G., Romer, D., & Weil, D. N. (1992). A contribution to the empirics of economic growth. The quarterly journal of economics, 107(2), 407-437.

MIT OCW : Intermediate Macroeconomics lecture slides by Prof. George Marios Angeletos

Solow, R. M. (1956). A contribution to the theory of economic growth. The quarterly journal of economics, 70(1), 65-94.

Summers, R., & Heston, A. (1988). A new set of international comparisons of real product and price levels estimates for 130 countries, 1950–1985. Review of income and wealth, 34(1), 1-25.