Economics homework Assignment

NBER WORKING PAPER SERIES

TRANSPORTATION AND HEALTH IN A DEVELOPING COUNTRY: THE UNITED STATES, 1820–1847

Ariell Zimran

Working Paper 24943 http://www.nber.org/papers/w24943

NATIONAL BUREAU OF ECONOMIC RESEARCH 1050 Massachusetts Avenue

Cambridge, MA 02138 August 2018

I am indebted to Joel Mokyr, Joseph Ferrie, and Matthew Notowidigdo for encouragement and guidance. I also thank Jeremy Atack, Hoyt Bleakley, Natalia Cantet, William Collins, Price Fishback, Bernard Harris, Richard Hornbeck, Robert Margo, Yannay Spitzer, and John Wallis for helpful suggestions and insightful comments. Thanks are also due to Timothy Cuff for sharing his data on Pennsylvania recruits to the Union Army; to Noelle Yetter for assistance at the National Archives; to Ashish Aggarwal and Danielle Williamson for excellent research assistance; to seminar participants at Vanderbilt University, Tel Aviv University, the Hebrew University of Jerusalem, and Ben Gurion University of the Negev; and to participants in the 2016 Social Science History Association Conference, the 2017 NBER DAE Summer Institute, the 2018 H2D2 Research Day at the University of Michigan, and the 2018 Midwest International Economic Development Conference. This project was supported by an Economic History Association Dissertation Fellowship, by the Northwestern University Center for Economic History, and by the Balzan Foundation. This project, by virtue of its use of the Union Army Data, was supported by Award Number P01 AG10120 from the National Institute on Aging. The content is solely the responsibility of the author and does not necessarily represent the official views of the National Institute on Aging or the National Institutes of Health. All errors are my own. The views expressed herein are those of the author and do not necessarily reflect the views of the National Bureau of Economic Research.

NBER working papers are circulated for discussion and comment purposes. They have not been peer-reviewed or been subject to the review by the NBER Board of Directors that accompanies official NBER publications.

© 2018 by Ariell Zimran. All rights reserved. Short sections of text, not to exceed two paragraphs, may be quoted without explicit permission provided that full credit, including © notice, is given to the source.

Transportation and Health in a Developing Country: The United States, 1820–1847 Ariell Zimran NBER Working Paper No. 24943 August 2018 JEL No. I15,N31,N71,O18

ABSTRACT

I study the impact of transportation on health in the rural US, 1820–1847. Measuring health by average stature and using within-county panel analysis and a straight-line instrument, I find that greater transportation linkage, as measured by market access, in a cohort’s county-year of birth had an adverse impact on its health. A one-standard deviation increase in market access reduced average stature by 0.10 to 0.29 inches. These results explain 26 to 65 percent of the decline in average stature in the study period. I find evidence that transportation affected health by increasing population density, leading to a worse epidemiological environment.

Ariell Zimran Department of Economics Vanderbilt University 2301 Vanderbilt Place Nashville, TN 37235 and NBER ariell.zimran@vanderbilt.edu

1 Introduction

In the four decades prior to the Civil War, the United States experienced a “transportation revolution”

(Taylor 1951) that was in part responsible for the prodigious growth of the antebellum American economy

(e.g., Atack et al. 2010). This experience is often cited as evidence that transportation improvements

are crucial to spurring and supporting economic growth in modern developing countries (e.g., Banerjee,

Duflo, and Qian 2012)—a view that has inspired massive investment in transportation infrastructure in the

developing world (World Bank 2007) with largely positive impacts (see Donaldson 2015).

Despite the well known benefits of economic growth, transportation projects that induce it may not be

unambiguously welfare-improving. In the antebellum United States the early phases of modern economic

growth were accompanied by declining health as measured by life expectancy and average stature (Floud et

al. 2011). Similar patterns have been documented in nineteenth-century England and in modern developing

countries such as China and India (Deaton 2007; Floud et al. 2011; Floud, Wachter, and Gregory 1990;

Jayachandran and Pande 2017; Trivedi 2017),1 all of which have experienced transportation revolutions of

their own. If transportation improvements were in any way responsible for these deteriorations in health, then

this impact must be weighed against the benefits of growth in assessing the welfare effects of infrastructure

development. Little empirical evidence exists, however, on whether and how transportation improvements

affect health, and economic theory shows that the impact may be positive or negative.

In this paper, I provide such evidence by studying the effect of transportation improvements on health in

the rural United States in the period 1820–1847. Besides being of interest for its own sake, the antebellum

United States is a particularly good setting in which to study this relationship. The transportation improvements

of this period, consisting mostly of canal construction and improvements in river navigability, were

transformative of the American continent and economy, involving the expansion of transportation infrastructure

into large areas that were previously isolated and undeveloped. Moreover, the time horizon of data

available due to the historical setting permits the observation of permanent and long-term health effects.

My analysis is based on two main data sources. To describe the development of the transportation

network in the antebellum United States, I use GIS shape files that have recently been made available by

Atack (2015, 2016, 2017). This source provides the location and opening date of all canals, railroads, and

navigable waterways in the antebellum United States. I use these data to compute Donaldson and Hornbeck’s

(2016) market access statistic, which is my main measure of transportation linkage, for all counties east of

1This does not refer to the commonly cited counter-cyclicality of health (e.g., Ruhm 2000). Unlike this cyclical relationship, I am referring to a relationship between economic growth and health over a few decades.

the Mississippi River for each year 1820–1847. As is common in studying developing and historical contexts, I

measure health by average stature, which reflects net nutritional status in childhood and adolescence (Floud

et al. 2011; Steckel 1995). I use stature data from the records of enlisters in the Union Army (Records of the

Adjutant General’s Office 1861–1865), providing data on the heights, counties of birth, and birth cohorts of

25,567 native-born white men in the birth cohorts of 1820–1847 in the Northeast and Midwest regions of the

United States. I limit the study period to 1820–1847 because these birth cohorts are the only ones (in the

antebellum period) for which there exists a sample of health data that is reasonably representative of the

population (Zimran 2018). The combination of data from these sources enables the construction of a panel

data set of county average stature and transportation linkage.2

The main empirical challenge of this paper is to determine the impact of transportation improvements

on health while addressing the possibility that any correlation between the two might be driven by omitted

and potentially unobservable variables. For instance, local characteristics might spur economic growth,

attracting transportation, affecting health, and creating a spurious relationship between the two. To address

this possibility, I use two empirical approaches. First, I exploit the panel structure of the data to estimate

specifications that include county fixed effects. Second, I use an instrumental variables approach. I construct

an instrumental variable based on the principle that transportation improvements were intended to connect

major watersheds (i.e., the Mississippi, Great Lakes, and Atlantic) to one another and to major cities.

In particular, I augment the 1820 transportation network with the shortest straight lines creating these

connections, and treat these lines as canals built incrementally over a period of 15 years. I then compute

market access based only on the 1820 network and these straight-line connections, and use this alternative

measure as the instrument for market access. This instrument builds on and shares an interpretation with

the straight-line instruments commonly used in the literature on the effects of transportation (e.g., Atack

et al. 2010; Banerjee, Duflo, and Qian 2012; Chandra and Thompson 2000; Ghani, Goswami, and Kerr 2016).

It also adds to this set of instruments by introducing a temporal component to them (see also Hornung 2015).

Using each of these identification strategies, I find a negative relationship between transportation linkage

as measured by market access and health as measured by average stature. The magnitude of this relationship

is large. According to my estimates, a one-standard deviation increase in market access was associated with

a 0.10 to 0.29 inch decline in average stature, depending on the identification strategy. To put this figure

in perspective, Zimran (2018) estimates that urbanites during this period suffered a height penalty of 0.29

inches relative to ruralists, and Deaton and Arora (2009) estimate that college graduates enjoy a 0.7 inch

2In practice, I do not use the average observed stature. Instead, I use the individual observations of stature linked to individuals’ birth years, clustering standard errors by county of birth.

height premium over high school graduates in the modern United States. This negative relationship is robust

to the inclusion of numerous controls and a variety of time trends.

I also investigate the hypothesis that improved market access reduced health by generating increases

in population density. In combination with insufficient sanitation and public health infrastructure in the

antebellum period, such concentration of population would have made previously undeveloped locations

less healthy (Costa 1993; Floud et al. 2011; Steckel 1995). In support of this mechanism and consistent

with other studies of the effects of transportation construction in the antebellum United States (Atack et al.

2010), I find evidence of rising population density in a county in response to increases in market access. I also

find that the effects of market access on increasing population density were stronger in counties where the

suitability for wheat and corn production was greater (according to the Food and Agriculture Organization

2002), and that the negative impact of market access on stature was stronger in these same counties. That

is, counties where population density increased the most in response to rising market access were those where

the deleterious effect on average stature was the greatest.

This paper contributes to a number of literatures. Narrowly, it adds to the understanding of the deterioration

in health experienced in the United States at the onset of modern economic growth, a phenomenon

known as the “Antebellum Puzzle.” This pattern is a fundamental stylized fact of American economic history

that bears on the evaluation of the welfare effects of economic growth in developing countries; but its cause

has remained poorly understood due to a lack of well identified empirical investigations. In this paper, I

provide perhaps the first piece of direct and plausibly causal evidence as to a potential explanation for this

puzzle by showing that the effect of market access on average stature, combined with the rise in market

access over the antebellum period, can explain up to 65 percent of the decline in stature. Also in the specific

context of the antebellum United States, this paper adds to the literature studying the effects of canal

construction (e.g., Ransom 1970). Despite the recognized importance of these projects, the bulk of recent

scholarly attention has accrued to the later rail construction (e.g., Donaldson and Hornbeck 2016).

More broadly, this paper adds to the literature on the impacts of transportation improvements. Although

there is a large literature describing these impacts on a variety of economic outcomes,3 the effects on health

have received far less empirical attention and are not understood as well. Previous findings of a negative

relationship between transportation presence and average stature in the antebellum United States (Cuff

2005; Haines, Craig, and Weiss 2003; Yoo 2012) have largely been constrained by data availability and

3Specific case studies of the impacts of transportation improvements on a variety of economic outcomes are given by Atack et al. (2010), Baum-Snow et al. (2018), Chandra and Thompson (2000), Donaldson and Hornbeck (2016), Duranton and Turner (2012), Emran and Hou (2013), Ghani, Goswami, and Kerr (2016), Jacoby (2000), Jacoby and Minten (2009), Jaworski and Kitchens (2016), Storeygard (2016), and Tang (2014), among others.

methodological limitations to documenting correlations. To my knowledge, only a few studies (Burgess and

Donaldson 2012; Tang 2017) identify the causal effect of transportation on health in specific cases.

This paper contributes to this literature by providing an analysis of the effect of transportation on health

in the context of a large and historically important infrastructure project, and showing, with attention to

causality, that this project, despite its well known economic benefits, had a negative impact on health.

2 Background

2.1 The Economics of Transportation and Health

Economic theory proposes a number of mechanisms by which transportation might impact health. The most

direct are its potential epidemiological effects. For instance, transportation can carry disease along with

freight and passengers, and might bring infection to places that it had once been unable to reach (e.g., Tang

2017). In the antebellum United States in particular, this mechanism might have acted by linking relatively

healthy rural areas to urban areas, where disease was prevalent, and carrying this disease from the latter to

the former (Floud et al. 2011). Conversely, transportation linkages might provide previously isolated areas

with better health care by reducing pecuniary and non-pecuniary access costs, though this mechanism is

unlikely to apply to the antebellum United States, in which the available health care was primitive at best.

Transportation can also affect health indirectly through its effects on income and development. Transportation

linkages are often found to increase economic activity in newly linked areas (e.g., Duranton and

Turner 2012; Ghani, Goswami, and Kerr 2016), and the antebellum period is no exception (Atack et al.

2010). The resulting rise in income would lead to improved health through consumption of more and better

health-improving goods, such as food and medicine (Emran and Hou 2013; Fogel 2004). This growth can also

generate increases in population density or urbanization in newly linked areas. This effect has the potential

to harm health by increasing exposure to disease, both through increased contact between individuals and

through the sanitation consequences of greater concentrations of population (Costa 1993; Steckel 1995). This

mechanism is particularly relevant in the antebellum American context given the lack of adequate sanitation

infrastructure and technology and absence of public health projects. Besides its impact on the level of income,

transportation integrates newly linked areas to the larger economy, potentially affecting the volatility

of income, with theoretically ambiguous impacts on health (Burgess and Donaldson 2012).

Finally, transportation can affect health through its impact on relative prices. In the antebellum United

States, the areas being linked to the transportation network were largely food-producing. Transportation

linkages in this setting would tend to increase the relative farm-gate price of food: access to larger markets

would increase the price that farmers could command for their output, while linkage to manufacturing centers

in urban areas would reduce the price of manufactures produced there (Komlos 1987; Komlos and Coclanis

1997). Such a shift would encourage substitution away from food and toward manufactures. On the other hand, rising relative food prices would bolster the incomes of food producers. If the

net effect of these changing relative prices was to reduce the consumption of health-improving goods, then

health could have deteriorated in response.

Combining all of these theoretical mechanisms, the bottom-line prediction of the impact of transportation

on health is ambiguous in sign, and is thus an empirical question. Yet there is relatively little empirical

work to enlighten this theoretical puzzle. A considerable fraction of the work that does exist focuses on the

antebellum United States as part of efforts to understand the progress of health during industrialization. Cuff

(2005), Haines, Craig, and Weiss (2003), and Yoo (2012) show that areas linked to the transportation network

or with better market access due to relative proximity to cities had worse health, as measured by average

stature and death rates, than other areas.4 However, limited information on the historical transportation

network available at the time that these studies were conducted (Atack 2013), together with limited methods

available to quantify transportation linkages, constrained these authors to describing correlations, often with

only a single year of observation of transportation presence. Thus, these results are at best suggestive of the

causal impact of transportation on health.

Studies of the transportation-health relationship in other contexts (e.g., Ali et al. 2015; Banerjee and

Sachdeva 2015; Bell and van Dillen 2018; Blimpo, Harding, and Wantchekon 2013; Stifel and Minten 2015)

are also largely suggestive, as they either are correlational, report effects on indices including health but not

on health separately, study very small regions, or focus on inputs to health rather than on health outcomes.5

These studies are also often constrained to study only the short-term effects of transportation.

There are two notable exceptions that provide causal estimates of the effect of transportation infrastructure

on health. Tang (2017) studies the mortality effects of the construction of the railroad network in

late-nineteenth century Japan. His difference-in-differences approach reveals an increase in mortality coming

from new rail linkage that is generated by the spread of communicable diseases. Burgess and Donaldson

(2012) give causal evidence of beneficial impacts of transportation on health by showing that transportation

4Yoo’s (2012) result is more subtle, showing a positive effect of transportation in the Northeast and a negative effect in the Midwest. His analysis, however, does not exclude urban areas and is based on only a single year of observation of the transportation network.

5It is particularly important not to consider an improvement in health inputs, such as improved consumption or access to health care, as necessarily generating improvements in health outcomes. In the antebellum United States in particular, and also in many developing contexts, apparent improvements in health inputs (such as greater income and consumption) are in fact accompanied by declining health. It is thus important to study health outcomes in order to determine the true net effect on health.

improvements in colonial India reduced the increases in mortality in response to negative agricultural yield

shocks. Notwithstanding these papers, economists’ understanding of the health effects of transportation remains

limited, in part because these studies find opposing effects and in part because of the limited number

of case studies. Additional studies are necessary to better understand these impacts and the mechanisms

that generate them, especially given the potential value to policy makers in determining the welfare effects

of transportation improvements.

2.2 Transportation Improvements in the Antebellum United States

The quintessential transportation improvement of the antebellum United States was the railroad. As a

result, the impacts of this mode of transportation have been the subject of considerable and notable scrutiny

(e.g., Atack et al. 2010; Donaldson and Hornbeck 2016; Fishlow 1965; Fogel 1964). Although some rail

construction occurred in the 1830s and 1840s, the bulk of antebellum railroad construction did not occur

until the 1850s (especially in the Midwest)—after the study period for this paper. Instead, the improvement

of water transportation was the key component of the transportation revolution in the period on which

this paper focuses. This included the construction of the canal network in the Northeast and Midwest. It

also included expansions in navigability of the Mississippi River system and its major and minor tributaries

through improvements in steamboat technology and the clearance of hazards to navigation.6

The impacts of these canals and improvements in navigability have received less attention in the recent

economic history literature on transportation in the United States than have those of railroads. Earlier work

attributes considerable economic benefits to canal construction, all of which are likely to have contributed

to the ultimate impacts of these transportation improvements on health. The most notable success

stories were the Erie Canal (Segal 1961) and the Ohio and Erie Canal (Ransom 1970). Despite the notable

financial failure of the latter, these canals contributed considerably to economic growth and development

in the areas through which they passed, to the development of manufacturing and commerce in these same

areas, and to the broader economic development of the Midwest (Niemi 1970; Ransom 1967, 1971).

2.3 The Antebellum Puzzle

Any study of health in the antebellum United States is inextricably linked to the “Antebellum Puzzle.”

Despite an improvement in the standard of living according to conventional economic measures, such as

income per capita and real wages (Costa and Steckel 1997), the antebellum period was characterized by a

6The geographical development of all of these systems is described graphically in Figure A.1.

precipitous decline in health. Between the first and second quarters of the nineteenth century, life expectancy

at age 10 for males declined by about 3 years (Fogel 1986). Moreover, the average height of native-born

white males in the United States—the tallest in the world at the start of the 19th century (Steckel 1995,

p. 1920)—declined by between 0.65 and 1.25 inches (depending on the estimate) between the birth cohorts

of 1830 and 1860 (A’Hearn 1998; Floud et al. 2011; Komlos 1987; Zimran 2018). It was not until nearly the

birth cohort of 1900 that average stature would begin to rise again (Fogel 1986; Steckel 1995; Zehetmayer

2011). This pattern is generally interpreted as indicating that the early stages of modern economic growth

in the United States were not unambiguously welfare-improving.

Despite a large body of research devoted to describing the decline in health during the antebellum period,

a definitive explanation has not been identified. Recent scholarship favors a combination of two mechanisms

(Floud et al. 2011).7 One, the disease explanation, holds that a variety of forces led to an increased exposure

to disease during the antebellum period (Costa 1993; Fogel 1986; Steckel 1995). The second, the food price

explanation, holds that the decline in height was the result of a rise in the relative price of food that led

individuals to substitute away from food consumption towards the consumption of manufactures (Komlos and

Coclanis 1997). Although these explanations hypothesize that forces beyond the expansion of transportation

infrastructure in the period played a role in spreading disease and changing relative prices, both also posit

a strong role for transportation, which, as discussed in section 2.1 above, can have such effects.

The empirical evidence underlying both of these explanations is limited, largely due to limited data

availability in this period. Like the evidence described above on the relationship between transportation and

health in this period, much of the evidence that has been marshaled in support of either or both of these

explanations is suggestive, based either on cross-sectional correlations or on national time series. To my

knowledge, there does not exist any work that shows directly that any particular force caused declines in

height.8 This paper, building on recent improvements in data availability and in methodological approaches

to studying transportation, contributes to addressing this limitation by showing that the antebellum decline

in health may be in part the product of the adverse consequences of transportation improvements. This

result alone cannot distinguish between the food price and disease explanations, but it does make progress

towards understanding the phenomenon, and can shed light on these two canonical explanations through

investigation of the mechanism by which transportation acts on health.

7In fact, there are at least fifteen distinct explanations, some of which are summarized by Bodenhorn, Guinnane, and Mroz (2017, p. 175). Almost all, however, can be grouped into one of these two larger categories.

8Indirect evidence is provided by Costa (1993), Haines, Craig, and Weiss (2003), Hong (2007), Komlos (1987), Sunder (2011), Sunder and Woitek (2005), and Woitek (2003), among others.

3 Empirical Approach

3.1 Empirical Specification

The basic specification that I use to investigate the relationship between transportation and health is

$$h_{ijt} = \gamma_t + \delta_a + \beta T_{jt} + z_j' \tau + \varepsilon_{ijt}, \tag{1}$$

where $h_{ijt}$ is the height of individual $i$ born in county $j$ in year $t$ (my measure of health), $\gamma_t$ are birth

cohort-specific intercepts, $\delta_a$ are indicators for each measurement age below 21 to address cases in which

individuals are observed before reaching terminal height, $T_{jt}$ is a measure of transportation linkage in the

birth year, and $z_j$ is a vector of various county-level control variables to be introduced in section 5 below.9

Because the outcome of interest is observed at the individual level but the regressor of interest is observed

only at the county level, I cluster standard errors throughout the analysis at the county level. My initial

analysis estimates this equation by ordinary least squares. This specification is comparable to that used by

prior studies of the transportation-health relationship in the antebellum United States, especially Haines,

Craig, and Weiss (2003).

This framework assumes that the effect of transportation on height is described fully by the relationship

of terminal height with transportation linkage in the birth year. While previous studies suggest that

transportation in the birth year is likely to be more important than in any other year of life (e.g., Steckel

1995; Woitek 2003), it is possible to determine the consequences of relaxing this assumption. I do this in

Appendix B, where I find that transportation linkage around the year of birth is more strongly associated

with terminal stature than is transportation linkage in other phases of life.

A key concern with specification (1) is that any relationship that it uncovers between transportation

and height may be spurious. For instance, a particular county may have been densely populated or highly

urbanized for some reason besides transportation linkage, such as a favorable geographic location. When

transportation infrastructure was constructed, the fact that this county was already developed would make

it more likely to become linked to the network. Moreover, the sanitation consequences of population concentration

might make this area unhealthy. This hypothetical relationship would produce a negative β in

9I do not include individual-level controls (e.g., occupation) for two reasons. First, the Union Army data, which are my source of all individual-level information, suffer from a large degree of missing data. Limiting the sample to observations with data on all fields of interest would have serious implications for statistical power. This limitation is exacerbated by the fact that successful census linkage is required to observe many variables of interest, and requiring such linkage would further reduce sample size. Second, any individual-specific variables are more properly considered outcomes of the presence of transportation and are therefore “bad controls.”

specification (1) even if the true β were zero.10

One approach that I take to address such concerns is to augment specification (1) with the addition of

county fixed effects $\alpha_j$ (requiring the omission of the county-specific controls $z_j$) so that it becomes

$$h_{ijt} = \alpha_j + \gamma_t + \delta_a + \beta T_{jt} + \varepsilon_{ijt}. \tag{2}$$

This specification captures time-invariant county characteristics and exploits the panel structure of the data.

It also improves on studies of the transportation-health relationship in the antebellum United States, in

which panel data have not previously been available.
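The role of the county fixed effects in equation (2) can be illustrated with a minimal sketch of the within transformation. This is not the estimation code behind the results reported here: it keeps only the county effects and a single regressor, omits the cohort intercepts and measurement-age indicators, and does not compute clustered standard errors. The function names and the toy data are hypothetical.

```python
from collections import defaultdict
from typing import Hashable, List, Sequence


def demean_by_group(values: Sequence[float], groups: Sequence[Hashable]) -> List[float]:
    """Subtract each group's mean from its members (the within transformation)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        totals[group] += value
        counts[group] += 1
    return [value - totals[group] / counts[group] for value, group in zip(values, groups)]


def within_slope(height: Sequence[float],
                 linkage: Sequence[float],
                 county: Sequence[Hashable]) -> float:
    """Slope on the linkage measure after removing county means from both variables.

    Simplifying assumption: only county effects are absorbed; cohort and
    measurement-age effects from equation (2) are left out for brevity.
    """
    h_tilde = demean_by_group(height, county)
    t_tilde = demean_by_group(linkage, county)
    denominator = sum(t * t for t in t_tilde)
    if denominator == 0.0:
        raise ValueError("no within-county variation in the linkage measure")
    return sum(t * h for t, h in zip(t_tilde, h_tilde)) / denominator


# Hypothetical toy data: two counties with different baseline heights (68 and 66
# inches) and a common within-county slope of -0.5 inches per unit of linkage.
county = ["A", "A", "B", "B"]
linkage = [0.0, 1.0, 2.0, 3.0]
height = [68.0, 67.5, 65.0, 64.5]
beta_within = within_slope(height, linkage, county)
```

In this toy example the county with better linkage also has a lower baseline height, so a pooled regression would overstate the negative slope; demeaning by county removes that time-invariant difference and isolates the within-county relationship.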

A concern that remains in equation (2) is that faster economic growth in a county-year driven by a force

other than transportation might both affect health and attract transportation. The concern is similar to

that expressed above, except that it applies to a county over only part of the sample period rather than

the whole, and would thus not be captured by the county fixed effects $\alpha_j$. One approach that I will use to

address this concern is to include county-decade fixed effects rather than simply county fixed effects.

3.2 Measures of Transportation Linkage

I use two measures of transportation linkage in the empirical analysis. The first is a simple measure that

takes a value of one in years in which a county was linked to water or rail transportation, and a zero

otherwise. While it is a straightforward measure, it faces some important drawbacks. First, it does not

capture the impacts of new forms of transportation entering already linked areas. This is exacerbated by

the fact that all coastal counties are defined as having always been linked to the transportation network.

Moreover, this binary measure does not capture changes in the transportation network that affect a county

but take place far away from it in the network. Perhaps the most important such change in the study period

is the construction of the Erie Canal, which had profound effects on the Midwest’s ability to access markets

despite all of the construction being located in the Northeast.

To address these shortcomings, I use Donaldson and Hornbeck’s (2016) market access measure. Following

an algorithm described in Appendix C, I compute approximate iceberg transportation costs, $\tau_{ijt} \geq 1$, between

each county pair $ij$ in each year $t \in \{1820, \ldots, 1847\}$. Market access in county $i$ for year $t$ is then defined as

$$m_{it} = \sum_j p_j \tau_{ijt}^{\theta}, \tag{3}$$

10Not all confounds must be in this direction. For instance, if better agricultural land attracted transport construction and raised incomes and health, a spurious positive β would arise.

where $p_j$ is the population of county $j$ in 1820. The choice to use 1820 population rather than year $t$

population is made because allowing population to change over time would cause market access to

capture both improvements in transportation linkage and population growth, which would have its own

impacts on health.11
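As a minimal sketch of expression (3), assuming the iceberg costs for a single year are already available as a nested mapping, market access can be computed as below. The function name `market_access`, the county labels, and every number except θ = −3.82 are hypothetical; the costs in the paper come from the algorithm of Appendix C, which is not reproduced here. The sketch sums over whatever county pairs are supplied, so the treatment of the own-county term is left to the caller.

```python
import math

def market_access(tau, pop_1820, theta):
    """Return {i: sum_j pop_1820[j] * tau[i][j] ** theta} for one year.

    tau[i][j] is the iceberg cost (>= 1) between counties i and j.
    theta is negative, so costlier destinations receive less weight.
    """
    access = {}
    for i, costs in tau.items():
        access[i] = sum(pop_1820[j] * cost ** theta for j, cost in costs.items())
    return access

# Hypothetical three-county example (illustrative numbers only).
pop_1820 = {"A": 1000.0, "B": 5000.0, "C": 2000.0}
tau_before = {
    "A": {"B": 2.0, "C": 1.5},
    "B": {"A": 2.0, "C": 1.6},
    "C": {"A": 1.5, "B": 1.6},
}
# A hypothetical new link lowers the cost between A and B only.
tau_after = {
    "A": {"B": 1.4, "C": 1.5},
    "B": {"A": 1.4, "C": 1.6},
    "C": {"A": 1.5, "B": 1.6},
}
theta = -3.82  # estimate reported in the text

ma_before = market_access(tau_before, pop_1820, theta)
ma_after = market_access(tau_after, pop_1820, theta)

# Change in log market access; population is held at 1820 values throughout,
# so any change reflects transport costs alone.
log_change = {i: math.log(ma_after[i]) - math.log(ma_before[i]) for i in pop_1820}
```

Because the population weights are fixed at their 1820 values, the only source of change in this example is the fall in the cost between A and B.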

There are two issues that must be addressed before the market access measure can be used. The first

concerns the choice of θ. In choosing the value of this parameter, I follow the example of Donaldson and

Hornbeck (2016) and estimate equation (2) by nonlinear least squares, taking the logarithm of market access

as defined in expression (3) as the variable Tjt. This estimation gives an estimate of θ̂ = −3.82 (s.e. = 0.48),

which I use throughout the analysis.12
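One way to picture the joint estimation of θ and β is a profile (grid-search) version of nonlinear least squares: for each candidate θ, compute log market access, fit the remaining linear coefficients by ordinary least squares, and keep the θ with the smallest sum of squared residuals. The sketch below is an illustration under strong assumptions and is not the paper's procedure: it uses one cross-section of hypothetical counties, omits the fixed effects and controls of equation (2), and generates a noiseless outcome with an arbitrary true θ of −4.0. All function names and numbers are invented for the example.

```python
import math

def log_market_access(tau, pop, theta):
    """log of sum_j pop[j] * tau[i][j] ** theta for each county i (own county excluded)."""
    n = len(pop)
    return [math.log(sum(pop[j] * tau[i][j] ** theta for j in range(n) if j != i))
            for i in range(n)]

def ols_ssr(x, y):
    """Fit y = a + b*x by OLS; return (b, sum of squared residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    ssr = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return b, ssr

def profile_nls(tau, pop, y, theta_grid):
    """Return (theta_hat, beta_hat) minimizing the SSR over the grid."""
    best = None
    for theta in theta_grid:
        beta, ssr = ols_ssr(log_market_access(tau, pop, theta), y)
        if best is None or ssr < best[0]:
            best = (ssr, theta, beta)
    return best[1], best[2]

# Hypothetical data: five counties on a line, cost rising with distance.
pop = [1000.0, 4000.0, 2500.0, 800.0, 6000.0]
pos = [0.0, 1.0, 2.5, 4.0, 7.0]
tau = [[1.0 + 0.3 * abs(pi - pj) for pj in pos] for pi in pos]

true_theta, true_beta = -4.0, -0.5  # arbitrary values for the illustration
y = [68.0 + true_beta * m for m in log_market_access(tau, pop, true_theta)]

theta_grid = [-8.0 + 0.5 * k for k in range(15)]  # -8.0, -7.5, ..., -1.0
theta_hat, beta_hat = profile_nls(tau, pop, y, theta_grid)
```

With real data the outcome is noisy and the objective is typically flat in θ, which is consistent with the imprecision of the estimate and with the remark that changes in θ are largely offset by changes in β.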

The second issue concerns the interpretation of the coefficient β when Tjt is the logarithm of market

access. Interpreting specific changes in this regressor (e.g., a ten percent increase in the logarithm of market

access) is not informative, as the range of market access is affected by the choice of θ, which in turn impacts

the estimate of β. Instead, the parameters β and θ must be interpreted jointly. I focus on the impact of a

one-standard deviation increase in market access (0.30 log points).
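In symbols, assuming the outcome is height and writing Δh for the implied change in average height (notation introduced here for illustration only), the magnitude reported for a one-standard-deviation increase is the coefficient scaled by the 0.30 log-point standard deviation:

```latex
\Delta h \;=\; \hat{\beta} \times \mathrm{sd}\!\left(\log m_{jt}\right) \;=\; \hat{\beta} \times 0.30
```

Reporting this product, rather than β̂ alone, is what allows the estimate to be read jointly with the chosen value of θ.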

3.3 Instrumental Variables

As an alternative identification strategy, I develop an instrument for market access that builds on the

straight-line instruments commonly used in studying the economic impacts of transportation improvements

(e.g., Atack et al. 2010; Banerjee, Duflo, and Qian 2012; Ghani, Goswami, and Kerr 2016; Hornung 2015).13

It is based on the principle that antebellum internal improvements were intended to link major watersheds

(the Atlantic, Great Lakes, and Mississippi) to one another and to major cities (Taylor 1951, p. 37).

Specifically, I draw a series of straight lines, depicted in Figure 1. The first set of lines, depicted in

panel 1(a), are the shortest connections between the major watersheds, based on the steamboat-navigability

of rivers in 1820.14 The next set of lines, depicted in panels 1(b)–1(d), identifies the 25 largest cities over

11I have repeated the analysis with population fixed at 1840 and with year t population. Results in each case are similar to those using 1820 population, though the interpretation is different with year t population.

12Ultimately, the choice of θ is not very important. Any change in the value of θ used will be largely offset by changes in the estimated value of β (Donaldson and Hornbeck 2016, pp. 831–832). Indeed, when θ is set to −1, the estimates of β are qualitatively almost identical: the numerical estimates differ, but their interpretation is nearly identical.

13The precise methods developed in previous studies are not suitable for use in the context of the antebellum United States prior to the construction of railroads. For example, a Euclidean network of the type used by Banerjee, Duflo, and Qian (2012) is based on the existence of major cities that must be linked by transportation. However, in the United States, the major cities were all on the East Coast while construction of transportation was designed to link the East to the West. Similarly, Atack et al.’s (2010) survey cities instrument is better suited to the denser rail construction of the 1850s than to the geographically dispersed canal construction of earlier years.

14I group rivers with the major body of water that they flow into. For instance, the Hudson River is part of the Atlantic watershed and the Ohio River is part of the Mississippi watershed. Panel 1(a) treats Lake Ontario as a separate watershed, as it was not connected to the other Great Lakes by a navigable waterway until 1829.


10,000 population in each census year 1820–1840 (though it was not until 1840 that there were at least 25

such cities) and draws the shortest lines between these cities and the three major watersheds (Atlantic, Great

Lakes, and Mississippi), provided that these lines are not more than 300 miles in length nor originate in the

South (except for Virginia, Maryland or Washington, DC).15 The repetition of lines between panels 1(b),

1(c), and 1(d) is not concerning, as the construction of a second line overlapping a first will have no impact.

I then compute market access as above and in Appendix C, with the following changes: (1) I begin

with the transportation network in its 1820 state; (2) I treat the lines of Figure 1 as canals; (3) I augment

the 1820 network by letting each line develop—beginning in 1820 for the lines in panel (a) of Figure 1 and

from the decadal year for those in other panels—over a period of 15 years in equal increments, beginning

at the originating city or at the easternmost watershed.16 This alternative measure of market access is the

instrumental variable, which I use to estimate equations (1) and (2) by instrumental variables.
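The gradual development of each hypothetical line can be sketched as follows. The timing convention is an assumption made for the example and is not stated in the text: nothing is open in the start year, and one equal increment opens in each of the following 15 years. The function names, the 240-mile line, and the county located 120 miles from the origin are all hypothetical.

```python
def built_fraction(year, start_year, years_to_complete=15):
    """Share of a hypothetical straight line treated as open by `year`.

    Assumes nothing is open in the start year and that one equal
    increment is added in each following year until completion.
    """
    elapsed = year - start_year
    if elapsed <= 0:
        return 0.0
    return min(elapsed, years_to_complete) / years_to_complete

def built_miles(year, start_year, total_miles, years_to_complete=15):
    """Miles open by `year`, measured from the line's origin."""
    return total_miles * built_fraction(year, start_year, years_to_complete)

def county_reached(year, start_year, total_miles, miles_from_origin):
    """True once the growing line has reached a county lying on it."""
    return built_miles(year, start_year, total_miles) >= miles_from_origin

# Hypothetical 240-mile line that starts developing in 1830; the county of
# interest lies 120 miles from the line's origin.
first_year = next(
    y for y in range(1820, 1848) if county_reached(y, 1830, 240.0, 120.0)
)
```

Although each line lengthens by a fixed amount per year, a given county's status changes only in the year the line reaches it, so the resulting measure moves in steps at the county level rather than as a smooth trend.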

As with any candidate instrument, the key concerns are relevance and excludability. Relevance will be

formally established in estimation of the first-stage equations but is already suggested by Figures 1 and 2(a).

Figure 1 (and comparison to Figure A.1) reveals that the location of these lines is a good approximation of

actual construction. For instance, the line linking the Atlantic and Great Lakes watersheds in panel 1(a)

is close to the Erie Canal; the lines in Pennsylvania in panel 1(b) closely approximate the construction of

Pennsylvania’s Main Line; and the lines in Ohio, Indiana, and Illinois in panels 1(a), 1(c), and 1(d) are

also close approximations to actual construction. Because these lines are used to compute an alternative

measure of market access, they also affect counties away from where they are constructed, as the Erie Canal

did. Moreover, as shown in Figure 2(a) for the whole sample and in Figure 2(b) for the specific case of

Montgomery County, Ohio (an arbitrary example), the temporal development of the market access implied

by the instrument tracks well with that of the actual measure.

Excludability of the instrument requires the following assumptions. In the cross-section, the identification

assumption is comparable to that of other straight-line instruments. It is that, after excluding counties from

which the lines in panels 1(b)–1(d) originate, counties on or near the straight lines of Figure 1 are similar

to those further from the lines except in their likelihood to receive beneficial surges in market access. The

identification assumption in the second dimension—the time series—has fewer analogs in the literature.17 It

15These are actually based on the urban population of counties, rather than city populations. Southern cities are excluded to better capture the true lack of internal improvements there.

16An example of the evolution of one such line is shown in Figure A.2. I have also used a 10 year development period, but the variable generated in this way does not satisfy the relevance condition for instrumental variables, whereas the variable generated with a 15 year development period does.

17An exception is Hornung (2015), who creates a dynamic straight-line instrument based on the principle that future construction is likely to link ends of existing lines to target destinations along the shortest possible route. My approach differs from this by not being based on actual construction.


(a) Watersheds (b) 1820

(c) 1830 (d) 1840

Figure 1: Straight lines for instrumentation

Note: All maps include the 1820 transportation network. In panel 1(a) the lines presented are those linking the major watersheds to one another. The lines presented in panels 1(b)–1(d) link the top 25 cities with over 10,000 population (usually there are fewer than 25) to the major watersheds with lines of 300 miles or less outside of the South, except for Virginia, Maryland and Washington, DC.


[Figure 2 plot. Panels: (a) Sample of individuals; (b) Montgomery County, OH. Vertical axis: log(Market Access), 1820 Population. Horizontal axis: Year, 1820 to 1850. Series: Actual, Instrument.]

Figure 2: Actual and hypothetical market access

Note: The line labeled “Actual” plots the average log market access. The line labeled “Instrument” plots the instrument calculated using the straight lines of Figure 1. Panel 2(a) covers the benchmark sample of individuals using market access and the instrument in the year of birth. Panel 2(b) covers the example of Montgomery County, Ohio.

is that counties closer to the origin of a straight line in Figure 1 are not fundamentally different from those

further from the origins, except that they are likely to be linked to the transportation network sooner. A

clear concern is that the origins of the lines represent points of interest, such as cities; but given the high

costs of wagon transportation, excluding the terminus counties should render the remaining counties equally

isolated.18 I provide some empirical support for these assumptions in section 5 below.

One concern with this instrument can be easily dismissed. Although the evolution of the straight lines

is based on a fixed annual expansion, the instrument is not a time trend (indeed, year-specific indicators

are included in all specifications). Instead, the instrument, like the measure of market access, evolves

discontinuously in response to a new transport link. An example of the evolution of the instrument and of

market access in a single county is shown in Figure 2(b), which describes the experience of Montgomery

County, Ohio. The rapid increases in market access in the 1820s come from the construction of the Miami

and Erie Canal, which passed through the county and linked it to the Ohio River. The rapid increase in the

instrument in the 1830s comes from the passage through the county of the straight line linking Hamilton County, Ohio to the

Great Lakes, which links the county to the Ohio River. The smaller increase in the 1840s comes from

the completion of that line, completing the hypothetical linkage to Lake Erie.

18This view is supported by Donaldson and Hornbeck’s (2016) finding that Fogel’s (1964) proposed canals were not good substitutes for railroads because of the value of railroads in reducing wagon haul distances. This implies that the reduction of wagon haul distances necessary to reach transportation infrastructure is particularly important, and supports the notion that areas even a short wagon haul away from a city would be relatively isolated—a view supported by the poor roads of the antebellum period.


4 Data

4.1 Sources

Information on transportation infrastructure is given by GIS shape files produced by Atack (2015, 2016,

2017). These files, which also form the basis for Donaldson and Hornbeck’s (2016) market access calculations,

provide the location of all steamboat-navigable rivers, canals, and railroads in the continental United States

constructed or opened prior to 1914.19 These files do not provide information on the location of turnpikes, but

this omission is unlikely to have a major effect on results because of the high costs of wagon transportation

(Donaldson and Hornbeck 2016; Taylor 1951). Until 1850, these shape files also provide the year in which any

particular form of transportation first became operational (or navigable); after 1850, these are known yearly

for water transportation, but only every two years for railroads until 1860. Together with the categorization

of all coastal counties (either on the Atlantic, the Gulf, or the Great Lakes) as having always had access to

water transport, it is thus possible to determine whether a particular county was linked to the transportation

network in any year in the sample period (1820–1847),20 and to perform the cost calculations necessary for

the market access measure for each year in the sample period.

The information on transportation that this source provides improves on that available in prior studies of

the transportation-health relationship in the antebellum United States. As discussed by Atack (2013), earlier

studies of this period relied on potentially inaccurate information on the location of transport infrastructure

and did not have information on the opening dates of this infrastructure. For this reason, the measure

of transport linkage used by Haines, Craig, and Weiss (2003) and Yoo (2012) was an indicator for having

water transport in 1840. The new shape files of Atack (2015, 2016, 2017) enable me to improve on this

measure, both through the improved accuracy of the locations of infrastructure and by providing a temporal

component to the evolution of the transport network.

I measure health using adult height. This measure, which is commonly used as an indicator of health

in historical and developing contexts (e.g., Deaton 2007; Floud et al. 2011), is unique in the antebellum

United States in that it is perhaps the only measure of health that can provide insights into health for the

bulk of the population for a number of years.21 Average stature is increased by greater calorie consumption

and a better sanitary environment, while strenuous physical labor, malnutrition, and chronic disease tend to

19I have supplemented these files with the canals and rivers of the St. Lawrence and Champlain waterways.

20The “year of transportation arrival” refers to the year in which non-wagon transportation first became possible. The development of the transportation network divided by mode is presented in Figure A.1.

21An alternative measure, the crude death rate, is available in the antebellum period, but only for a single year (1850). It is therefore not possible to exploit changes over time in the transport network, as I do below in studying the impacts on height. Time series of life expectancy are also available, but cover only specific subsets of the population.


decrease average stature (Deaton 2007; Floud et al. 2011; Steckel 1995).22

Data on the heights of men born in the United States in the years 1820–1847 are available from the

records of enlistments in the Union Army during the Civil War (Records of the Adjutant General’s Office

1861–1865). This widely used source is informative of height, place of birth, age, year of enlistment, and place

of enlistment. I combine three random samples of this source. The first comes from the Union Army Project

(Fogel et al. 2000), which provides information on a random sample of approximately 40,000 individual

observations from the original records. The second is provided by Cuff (2005), yielding approximately 12,000

additional observations of men born in the state of Pennsylvania and serving in Pennsylvania regiments.

Finally, I collected and digitized approximately 3,000 additional observations from the original records.

As is standard in uses of these data, I restrict the sample to white men born in the Northeast or the

Midwest. I also exclude individuals measured before age 18, which, due to the timing of the Civil War,

implies that the youngest birth cohort that is systematically observed is that of 1847, as this cohort would

have turned 18 in 1865, the last year of the Civil War. I also exclude birth cohorts older than 1820 because

of the relative lack of representation of these older cohorts in the military. Finally, I limit the sample to those

for whom county of birth could be determined and for whom height, birth year, and age of measurement

are known.23 After imposing these restrictions, 31,403 observations remained for all counties (rural and

otherwise). For a subset of these observations, the county of enlistment could also be determined.

For two reasons I restrict attention to individuals born in counties that had no urban population in

1820, which reduces the sample to 25,567 individuals.24 First, there is little variation over time in the

transportation linkage of the excluded counties, as they are nearly all on major transport routes in 1820.

Second, there are many forces that may have affected health in cities that would be difficult to disentangle.

A key question regarding the enlistment data is whether they are representative of the broader population

of interest: native-born white males in the birth cohorts of 1820–1847. The over-sampling of Pennsylvanians

22Although declining height is generally understood to imply deteriorating health in historical contexts (e.g., Fogel 1986; Steckel 1995), it is also possible that declining height might be an indication of a shift from selection to scarring. That is, declining average height might actually indicate better health if it allowed individuals who would have died in infancy to survive but to reach shorter average terminal height than those who would have survived to adulthood in the absence of improved health (Deaton 2007). Unfortunately, the data necessary to determine whether changing height is the result of selection or scarring in the context of this paper are not available. There exist data on mortality (Haines, Craig, and Weiss 2003), but these are available only for 1850 and thus do not permit the same panel analysis as do the height data. As a result, I rely on the standard interpretation of the historical heights literature, on the negative correlation between terminal height and those mortality rates that are observed in this period (e.g., Floud et al. 2011; Fogel 1986; Haines, Craig, and Weiss 2003; Steckel 1995), and on the results presented in Table A.1 showing that death rates were greater in counties with greater market access, to interpret declines in average stature as deteriorations in health, and vice versa.

23In most cases, a county of birth is directly reported, and the individual is assigned to that county. In some cases, a city or town of birth was reported instead. These were manually assigned to the appropriate county. In cases where a state of birth is reported but no county is reported, and in which the individual was linked to a census in 1850 or 1860 (linkage was only performed for observations collected by Fogel et al. 2000), the individual is assigned to his county of residence in the first census in which he is observed.

24Figure A.3 indicates the counties removed from the sample by this restriction.


is one obvious concern, which I address by re-weighting so that the distribution of states of residence matches

that of the 1860 census. A more nuanced concern is that selection into military service was non-random

(Bodenhorn, Guinnane, and Mroz 2017). While this is theoretically a valid concern, its potential severity is

mitigated by the fact that nearly half of the population at risk for observation and military service enlisted

(Zimran 2018). For this reason, the Union Army data are considered to be representative of the white male

population of the Northern states (Fogel et al. 2000). This view is reinforced by Zimran’s (2018) formal

investigation of bias in historical height data sources, which finds that the height data provided by the

Union Army records suffer from relatively little bias.25
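The re-weighting step can be sketched as simple post-stratification: each observation receives a weight equal to its state's share in the target distribution divided by its state's share in the sample. This is one standard way to implement such a correction and is an assumption about the mechanics, not a description of the exact procedure used here. The state shares below are invented for illustration and are not 1860 census figures; the function name and the heights are hypothetical.

```python
from collections import Counter

def poststratification_weights(sample_states, target_shares):
    """Weight each observation by target share / sample share of its state."""
    n = len(sample_states)
    counts = Counter(sample_states)
    return [target_shares[s] / (counts[s] / n) for s in sample_states]

# Hypothetical sample in which Pennsylvania is heavily over-represented.
sample_states = ["PA"] * 6 + ["OH"] * 2 + ["NY"] * 2
target_shares = {"PA": 0.3, "OH": 0.3, "NY": 0.4}  # illustrative, not census data

weights = poststratification_weights(sample_states, target_shares)

# Weighted share of each state now matches the target distribution.
total = sum(weights)
weighted_share = {
    s: sum(w for w, st in zip(weights, sample_states) if st == s) / total
    for s in target_shares
}

# Weighted mean of hypothetical heights (inches).
heights = [67.5, 68.0, 67.0, 68.5, 67.8, 68.2, 69.0, 68.6, 67.2, 67.9]
weighted_mean = sum(w * h for w, h in zip(weights, heights)) / total
```

Over-sampled states receive weights below one and under-sampled states weights above one, so summary statistics computed with these weights reflect the target distribution of states.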

Another concern is that entrance into the Union Army was subject to a minimum height requirement.

Although this requirement was not stringently enforced, the left tail of the height distribution was

under-represented.26 The common approach in the historical heights literature is to use a reduced-sample maximum

likelihood estimator that omits any observations below the cutoff point and assumes normality of the stature

distribution (A’Hearn 1998). In the present context, however, the omission of data is undesirable because

of the considerable loss of degrees of freedom through the inclusion of county fixed effects in the main

specifications and because of the subsequent introduction of instrumental variables. As a result, the results

reported below do not use such a truncation-corrected regression.

Finally, I gather county-level data from the decennial United States censuses of 1820–1850 (Manson et al.

2017). This source provides county-level population, urban population (which, following the standard census

definition, is the number of people living in places of population 2,500 or greater), and data on agricultural

and manufacturing production and employment. I supplement these data with Craig, Copland, and Weiss’s

(2012) data on the nutritional value of agricultural production for 1840 and 1850 and with data on suitability

for wheat and corn production from the Food and Agriculture Organization (2002).

I standardize all data (including the transportation linkage indicator, market access computations,

assignment of counties of birth, and the county-specific data described above) to 1860 county boundaries. I

focus on 1860 counties because the counties of birth of enlisters are reported in the years 1861–1865, and

enlisters are likely to have reported their place of birth based on the boundaries existing at the time of the

report. Where necessary, I standardize variables to 1860 county boundaries using Hornbeck’s (2010) method.

25The issues of selection bias also inform my choice to focus on the birth cohorts of 1820–1847. While height data are available from military records for cohorts throughout the later antebellum period and nineteenth century, Zimran (2018) shows that combining data from the Civil War and from later periods can lead to strong selection bias, generated in part by the fact that after the end of the Civil War, only a small fraction of the population entered the military and had its height observed.

26This is shown in Figure A.4.


4.2 Summary Statistics

Using the sources described above, I create and merge two data sets. The first is a panel data set with

observations at the county-year level on transportation linkage and market access. The second provides

individual-level data from the Union Army on native-born white males with known height, county of birth,

year of birth, and age of measurement, born between 1820 and 1847. Merging these two data sets links each

individual in Union Army data to the characteristics of his county of birth in his year of birth.
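The merge described above amounts to a lookup keyed on county of birth and year of birth. The sketch below shows one way to do it; the field names, the function name, and all values are hypothetical and do not come from the paper's data files.

```python
def merge_at_birth(individuals, county_year):
    """Attach county-of-birth, year-of-birth characteristics to each individual.

    Individuals with no matching county-year row are returned separately
    rather than silently dropped.
    """
    merged, unmatched = [], []
    for person in individuals:
        key = (person["birth_county"], person["birth_year"])
        row = county_year.get(key)
        if row is None:
            unmatched.append(person)
        else:
            merged.append({**person, **row})
    return merged, unmatched

# Hypothetical county-year panel (illustrative values only).
county_year = {
    ("Montgomery, OH", 1825): {"linked": 0, "log_ma": 5.0},
    ("Montgomery, OH", 1835): {"linked": 1, "log_ma": 5.4},
}

# Hypothetical individual-level records.
individuals = [
    {"id": 1, "birth_county": "Montgomery, OH", "birth_year": 1825, "height_in": 68.5},
    {"id": 2, "birth_county": "Montgomery, OH", "birth_year": 1835, "height_in": 67.0},
    {"id": 3, "birth_county": "Unknown", "birth_year": 1830, "height_in": 69.0},
]

merged, unmatched = merge_at_birth(individuals, county_year)
```

Keeping the unmatched records in a separate list makes it easy to check how many individuals are lost at the merge.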

Table 1 summarizes the county-level measures of transportation linkage, divided by region and decadal

year, and weighted by population. There is a clear pattern of growth over time in the fraction of the

population living in a county linked to the transportation network. In the entire sample region, less than 40

percent of the population lived in a county that was linked to the transportation network in 1820. By 1850,

this fraction had risen to over 80 percent. The Northeast and the Midwest viewed separately exhibit similar

patterns, although the population of the Northeast is consistently more linked than is that of the Midwest.

Table 1: Summary statistics for county-level data

                                             All                               Midwest                            Northeast
                                   (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)     (10)     (11)     (12)
Variable                          1820     1830     1840     1850     1820     1830     1840     1850     1820     1830     1840     1850
Transportation Present           0.396    0.562    0.711    0.805    0.316    0.476    0.599    0.704    0.420    0.600    0.796    0.907
                               (0.489)  (0.496)  (0.454)  (0.397)  (0.465)  (0.500)  (0.491)  (0.457)  (0.495)  (0.491)  (0.404)  (0.291)
log(Market Access), 1820 Pop.    5.005    5.420    5.563    5.650    4.835    5.273    5.429    5.519    5.056    5.484    5.664    5.782
                               (0.495)  (0.296)  (0.261)  (0.252)  (0.370)  (0.240)  (0.225)  (0.220)  (0.517)  (0.296)  (0.240)  (0.211)
Counties                           945      945      945      945      774      774      774      774      171      171      171      171

Notes: The sample in columns (1)–(4) includes all counties with no urban population in 1820. Columns (5)–(12) divide this sample by region. Means presented with standard deviations in parentheses. Observations weighted by population.

Figure 3 provides a graphical summary of the spread of transportation linkage over this period. Panel

3(a) shows that the transportation network gradually spread inland during this period. The sample period

began with only the coasts and the counties bordering the major internal waterways being linked to the

network, and concluded with much of the interior being linked. However, as discussed in section 3.2 above,

this binary measure is problematic. Beyond the conceptual difficulties that it poses, there simply are not

many observations of height data in counties experiencing changes in transport linkage. This is shown in

panel 3(b), which isolates the counties in which there was a change in transportation linkage between the

years 1820 and 1847 and divides them into three groups. The first (shaded in the lightest color), which

includes many of the counties in the South or westernmost Midwest, is not represented by any individual

height observations, or all of its representation comes from either before or after the change in transportation linkage.

The second group (shaded somewhat darker) has individual height observations from both before and after


the change in linkage, but has only a small number of observations of stature in at least one of these groups.

Only the third group (the darkest shade, besides the black background), consisting of 31 counties, mostly

in Pennsylvania, has at least 25 observations of individual heights both before and after the change in

transportation linkage.

(a) Year of arrival (b) Changes in linkage

Figure 3: Counties by transportation change and sample coverage

Note: Panel 3(a) presents the year in which each county received a transport link, treating coastal counties and counties with an always navigable river as being linked in 1787. Panel 3(b) marks counties experiencing a change in transport linkage in 1820–1847. Counties in black experienced no change in transportation linkage between 1820 and 1847. The lightest colored counties experienced a change in transportation linkage in this period, but have no observations either before or after the change. The darker counties have observations both before and after the transportation change, but only the darkest counties have at least 25 observations both before and after the change. Sample region indicated by thick boundary.

Fortunately, the market access measure helps to address this concern. In particular, it generates variation

in the magnitude of transportation linkages and allows new linkages to affect counties other than only those

through which the infrastructure passes. The set of “treated” counties can thus be considered larger and

there is more variation in the treatment. This measure is also summarized in Table 1. As with the linkage

measure, this measure shows patterns of growth over the study period, and of greater market access in the

Northeast than in the Midwest. Directly interpreting the magnitude of the market access measure is not

possible given the discussion of section 3.2 above, but it is still possible to compare the changes over time


and differences over regions in other terms. For instance, the increase between 1820 and 1850 is equal to

about two and a half standard deviations of the measure in 1850.
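Using the population-weighted means and the 1850 standard deviation for the full sample in Table 1, the arithmetic behind this comparison is:

```latex
\frac{5.650 - 5.005}{0.252} \;=\; \frac{0.645}{0.252} \;\approx\; 2.56
```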

The development of the market access measure over time is described graphically in Figure 4. This

figure depicts the change in market access in each decade, shading counties with greater increases darker.27

It shows that market access captures changes that transportation linkage does not. For instance, the counties

in the sample region with greatest increase in market access between 1820 and 1830 are those bordering the

upper Mississippi and the Great Lakes, as well as those in western New York. These changes reflect the

opening of the Erie Canal and of the upper Mississippi. Between 1830 and 1840, large increases are observed

in central Pennsylvania and in Indiana and Ohio, reflecting canal construction. Finally, between 1840 and

1850, large increases are again observed in Indiana and Ohio, also reflecting canal construction.

Table 2 provides summary statistics at the individual level for heights and for other variables for the

complete sample and for various subsamples. Column (1) represents the benchmark sample of analysis—

native-born white males whose counties of birth had no urban population in 1820. Columns (2) and (3)

divide the sample by region, and columns (4) and (5) divide the sample by whether the individual’s county

of birth was linked or unlinked to the transportation network in the individual’s year of birth.

A majority of the sample was born in the Northeast (even after adjusting for the Pennsylvania oversample)—

a mechanical consequence of weighting the data to reflect state population in 1860. Figure 5 delves into the

geographic distribution of data in further detail. It presents the number of individual height observations

by county, separating Pennsylvania from the rest of the country as a result of its over-representation in the

sample. On the whole, the sample tends to draw from the more populous areas of the country. Importantly,

it includes almost all counties in the Northeast and the Midwest.28

Table 2 also shows that the benchmark sample was 68.1 inches tall on average, and columns (2) and (3)

reveal that the Northeast suffered a height disadvantage of about half an inch relative to the Midwest. A

height disadvantage of about 0.4 inches is present for those born in transportation-linked counties.29

There are also differences between regions and between linked and unlinked counties in measures of

population concentration. Consistent with the expected effects of transportation linkage (and with a variety

of endogeneity concerns), there is a considerable advantage in urbanization and population density at birth

27The scale in each panel is different, dividing counties by deciles of the increase in market access. The levels of market access in each year are presented in Figure A.5.

28The number of observations by birth cohort is given in Figure A.6. The number of observations is increasing in the birth cohorts from 1820 to the early 1840s, consistent with the idea that younger individuals would be more likely to join the military. The number of observations then falls sharply among the birth cohorts of the mid 1840s, consistent with the requirement to be at least 18 years of age to enlist.

29Figure A.4 presents a histogram describing the distribution of individual height observations. It shows the tendency to heap on whole inches and to exhibit shortfall below the minimum height requirement of 64 inches, but is otherwise regular.


(a) 1820–1830 (b) 1830–1840

(c) 1840–1850

Figure 4: Changes in market access by decade.

Note: Each panel shows the change in market access over the listed decade. For example, the panel labeled “1820–1830” shows the change in market access from 1820 to 1830. The scales are not comparable across years; instead, they depict deciles of the change in market access for that decade. Darker counties experienced a greater increase in market access. Sample region indicated by thick boundary.

for individuals born in linked counties.30 There is also an advantage in population density at birth for

Northeasterners, though the level of urbanization at birth was similar for the Northeast and the Midwest

(recall that any county with an urban population in 1820 is omitted). While there is a premium in agricultural

suitability for the Midwest, there does not appear to be a meaningful difference in agricultural suitability of

the birth county for individuals born in linked and unlinked counties.

Table 2: Summary statistics for individual-level data

                                               (1)       (2)       (3)       (4)       (5)
Variable                                       All        MW        NE    Linked  Unlinked

Individual-level data
Height (inches)                             68.064    68.326    67.843    67.916    68.343
                                           (2.640)   (2.632)   (2.626)   (2.631)   (2.636)
Birthyear                                 1838.262  1839.100  1837.555  1839.039  1836.787
                                           (6.231)   (5.729)   (6.542)   (5.616)   (7.023)
Age of Enlistment                           24.277    23.484    24.946    23.511    25.731
                                           (6.228)   (5.666)   (6.591)   (5.572)   (7.088)
Enlisted in Different State                  0.280     0.315     0.251     0.266     0.308
                                           (0.449)   (0.464)   (0.434)   (0.442)   (0.462)
Enlisted in Different County                 0.631     0.721     0.563     0.604     0.686
                                           (0.482)   (0.448)   (0.496)   (0.489)   (0.464)

County-year-level data
Urbanization at Birth                        0.017     0.015     0.018     0.025     0.002
                                           (0.060)   (0.062)   (0.058)   (0.072)   (0.013)
log(Population Density) at Birth             3.274     2.802     3.629     3.505     2.809
                                           (1.103)   (1.206)   (0.862)   (0.992)   (1.167)
Transportation Linkage at Birth              0.655     0.567     0.729
                                           (0.475)   (0.495)   (0.445)
log(Market Access) at Birth, 1820 Pop.       5.462     5.378     5.532     5.603     5.192
                                           (0.300)   (0.265)   (0.310)   (0.192)   (0.284)

County-level data
Midwest                                      0.457                         0.396     0.573
                                           (0.498)                       (0.489)   (0.495)
Northeast                                    0.543                         0.604     0.427
                                           (0.498)                       (0.489)   (0.495)
log(Wheat Suitability)                       8.693     8.914     8.508     8.702     8.678
                                           (0.316)   (0.166)   (0.292)   (0.268)   (0.389)
log(Corn Suitability)                        8.548     8.787     8.348     8.563     8.522
                                           (0.404)   (0.214)   (0.417)   (0.354)   (0.483)

Observations                                25,567    10,210    15,357    16,875     8,692

Notes: Sample includes all height observations of native-born white males born in the Northeast or Midwest in counties with no urban population in 1820. Means presented with standard deviations in parentheses. Observations weighted to correct for oversampling. Linked indicates individuals born in linked counties; unlinked denotes the opposite. MW denotes Midwest; NE denotes Northeast. The number of observations refers to the number of individuals in the sample with known height, year of enlistment, age of enlistment, and county of birth.

Finally, about 28 percent of the sample enlisted in a state other than the state of birth (state of enlistment is determined by the state of the regiment in which an individual enlisted), while about 63 percent enlisted in a county other than the county of birth (limiting the sample to those enlisting in the state of their regiment).31

30For intercensal years, the urban and total populations are imputed by assuming constant growth rates between censuses. These imputations are not used in analysis below, but are useful for developing a sense of the divisions of the sample by urbanization and population density.

31In some cases, individuals enlisted while the regiment was in the field. As I do not wish to consider military deployment as a form of migration, I exclude these individuals when considering county-level migration.

The probability of enlisting in a county or state other than that of birth was greater for Midwesterners but smaller for individuals born in counties linked to the transportation network.

Figure 5: Number of observations by county

Note: This Figure includes both rural and non-rural counties and indicates the number of native-born observations of stature listing a birth place in each county with information on height and age of enlistment. Pennsylvania is displayed separately because of the oversample caused by the incorporation of the Cuff (2005) data. Sample region indicated by thick boundary.

5 Results

5.1 OLS Results

I begin the analysis by estimating equation (1) by ordinary least squares using the binary indicator of transportation linkage as the explanatory variable of interest $T_{jt}$. Results of this estimation are presented in columns (1)–(5) of Table 3. The regression of column (1), which includes only birth year and age-of-measurement indicators and no other controls, yields a negative and statistically significant relationship between transportation presence in the birth year and average stature. This relationship is robust to the inclusion of state-specific fixed effects in column (2), though this addition reduces the magnitude of the

estimated coefficient by about half. This latter estimate indicates that individuals whose counties of birth

had some sort of transportation linkage in their birth year were 0.17 inches shorter than those whose counties

of birth were unlinked in their year of birth. This magnitude is large compared to the contemporaneous urban

height penalty of 0.29 inches (Zimran 2018). It is also roughly comparable in magnitude to the estimates

of Haines, Craig, and Weiss (2003), whose benchmark results indicate that transportation linkage in the

county of birth (though not necessarily in the year of birth) was associated with a height penalty of about

0.25 inches.32 This similarity of results is not trivial, as my transportation measure, due to the availability

of Atack’s (2015, 2016, 2017) data, is more refined, as discussed above.

Table 3: OLS regressions

                                (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)        (9)        (10)
Variables

Transport                       −0.336∗∗∗  −0.168∗∗∗  −0.069     −0.071     −0.069
                                (0.073)    (0.057)    (0.067)    (0.067)    (0.070)
log(Market Access), 1820 Pop.                                                          −0.994∗∗∗  −0.564∗∗∗  −0.389∗∗∗  −0.367∗∗   −0.370∗∗
                                                                                       (0.117)    (0.119)    (0.150)    (0.151)    (0.163)

Observations                    25,567     25,567     23,567     23,567     23,567     25,567     25,567     23,567     23,567     23,567
R-squared                       0.055      0.073      0.077      0.079      0.105      0.061      0.074      0.077      0.079      0.106
State FE                        No         Yes        Yes        Yes        Yes        No         Yes        Yes        Yes        Yes
Controls                        No         No         Yes        Yes        Yes        No         No         Yes        Yes        Yes
Birth Year × Region FE          No         No         No         Yes        No         No         No         No         Yes        No
Birth Year × State FE           No         No         No         No         Yes        No         No         No         No         Yes

Significance levels: ∗∗∗ p<0.01, ∗∗ p<0.05, ∗ p<0.1
Notes: Dependent variable is height in inches. Sample includes individuals born in the Northeast or Midwest in counties with no urban population in 1820. All specifications include birth year and measurement age fixed effects. Standard errors clustered at the county level. Observations weighted to correct for oversampling.

In column (3), I repeat the specification of column (2) with the addition of a variety of county-level

controls. Some of these control variables are those that Haines, Craig, and Weiss (2003) include in their

analysis—1840 calorie and protein production, Herfindahl indices for calorie and protein production, and

1850 values of farms and capital in manufacturing. I add several other variables that may have impacted

health. These include area and 1820 population (to capture population concentration in 1820); 1840 cattle and swine stocks; and 1840 employment by sector and values of agricultural and manufacturing output. All of these variables are included in log form, and I also include the log of population in 1840 and 1850 in

order to make the other measures effectively per-capita. I also include third-degree polynomials in the

logarithm of distance from New York and Cincinnati. These controls are intended to capture a variety of

county characteristics, such as agricultural productivity, density, and geography, that might generate health

differences even in the absence of a transport link. The post-1820 values are included with full recognition

that their 1820 values would be preferable (later values may be “bad controls”). However, because of the limited data availability of the antebellum period, the inclusion of data on, for example, agricultural production is not possible prior to 1840, and I err on the side of controlling for the features that these measures capture rather than not doing so.

32Haines, Craig, and Weiss (2003) also do not limit the sample to only rural areas, as I have done.

The inclusion of these controls in column (3) reduces the magnitude of the estimated coefficient on

transportation linkage and renders the estimated coefficient statistically insignificant. While the magnitude

of the resultant coefficient is non-negligible, it is considerably smaller than the estimates of columns (1) and

(2). This indicates that the relationship in columns (1) and (2) may be the product of omitted variables

bias. The addition of interactions of birth year and region fixed effects in column (4), or of state and birth year fixed effects in column (5), has little impact on the estimates.

To determine whether the lack of a meaningful relationship between transportation linkage and height

in the presence of controls is the product of deficiencies in the binary measure or a true absence of a

relationship, columns (6)–(10) of Table 3 repeat the analysis of columns (1)–(5), but replace the binary

measure of linkage with the logarithm of market access as the explanatory variable of interest $T_{jt}$. Columns

(6) and (7) estimate equation (1) without the additional county-level controls, without and with the inclusion

of state fixed effects, respectively. As was the case with the binary measure of transportation linkage, a large,

negative, and statistically significant coefficient is present on the measure of transportation linkage, and is

nearly halved (but is otherwise robust) when state fixed effects are included. In particular, the estimates of

column (7), which include the state fixed effects, indicate that a one-standard deviation increase in market

access (0.30 log points, as shown in Table 2) is associated with a reduction in average height of 0.17 inches.

Columns (8)–(10) repeat this estimation, including the various county-level control variables, and the

region-by-birth year or state-by-birth year indicators. Unlike their analogs in columns (3)–(5), the addition

of these controls to regressions of height on market access does not eliminate the statistical significance of

the negative relationship between market access and height. Moreover, the impact of the inclusion of the

controls on the magnitude of the coefficient is smaller than it was for the transport indicator. In particular,

the estimates of column (10), which includes the state-by-birth year indicators, imply that a one-standard

deviation increase in market access is associated with a decline in average stature of 0.11 inches, or about

1.6 times the implied impact of a transportation linkage in its analog, column (5).
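The one-standard-deviation magnitudes quoted in this subsection follow from multiplying a market access coefficient in Table 3 by the standard deviation of log market access in Table 2. The short Python sketch below reproduces that arithmetic. The function and variable names are illustrative assumptions and are not part of the original analysis; only the coefficients and the standard deviation come from the tables.

```python
# Illustrative arithmetic only: the coefficients and the standard deviation
# are copied from Tables 2 and 3; all names below are hypothetical.

SD_LOG_MARKET_ACCESS = 0.30  # Table 2, column (1)


def one_sd_effect(coefficient: float, sd: float = SD_LOG_MARKET_ACCESS) -> float:
    """Change in height (inches) implied by a one-SD rise in log market access."""
    return coefficient * sd


effect_col7 = one_sd_effect(-0.564)   # Table 3, column (7)
effect_col10 = one_sd_effect(-0.370)  # Table 3, column (10)
transport_col5 = -0.069               # Table 3, column (5), binary indicator

# Ratio of the one-SD market access effect to the transport indicator effect.
ratio = effect_col10 / transport_col5

print(f"column (7):  {effect_col7:.2f} inches")
print(f"column (10): {effect_col10:.2f} inches")
print(f"ratio to column (5): {ratio:.1f}")
```

The comparison with column (5) sets a one-standard-deviation change in a continuous measure against a zero-to-one change in a binary indicator, so the ratio is a rough guide rather than a like-for-like comparison.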

On the whole, these estimates suggest that there is a negative correlation between transportation linkage

and health as implied by average stature, and that the elimination of this relationship in columns (3)–(5) of

Table 3 is the product of deficiencies of the transport indicator rather than of omitted variables bias.

5.2 Fixed Effects Results

Like the conclusions of existing work on health in the antebellum United States, the estimates of Table 3

do not address concerns of endogeneity such as those discussed in section 3 above. Indeed, these are merely

correlations, and may be driven by transportation construction in areas that were unhealthy for reasons

unrelated to transportation. The structure of my data, in particular the ability to describe the evolution

of the transportation network over time, enables the estimation of equation (2) to partially address these

concerns.33 Results of this estimation are presented in columns (1)–(5) of Table 4. I begin in column (1)

by estimating specification (2) with the transport linkage indicator as the regressor of interest. Given the

binary regressor, this coefficient can be interpreted as a generalized difference-in-differences coefficient. The

estimated coefficient is −0.037, which is smaller in magnitude than the estimates including controls in Table 3, and is

statistically insignificant. Given the limitations of the transport linkage indicator, as discussed above, the

absence of a meaningful transport-health relationship using this regressor is not surprising.

Table 4: County fixed-effects regressions

                                (1)        (2)        (3)        (4)        (5)
Variables

Transport                       −0.037
                                (0.114)
log(Market Access), 1820 Pop.              −0.519∗∗∗  −0.657∗∗   −0.441∗∗   −0.323
                                           (0.193)    (0.280)    (0.187)    (0.217)

Observations                    25,567     25,567     25,567     25,567     25,567
R-squared                       0.124      0.124      0.171      0.127      0.154
Birth Year × Region FE          No         No         No         Yes        No
Birth Year × State FE           No         No         No         No         Yes
County × Decade FE              No         No         Yes        No         No

Significance levels: ∗∗∗ p<0.01, ∗∗ p<0.05, ∗ p<0.1
Notes: Dependent variable is height in inches. Sample includes individuals born in the Northeast or Midwest in counties with no urban population in 1820. All specifications include birth year, measurement age, and county fixed effects. Standard errors clustered at the county level. Observations weighted to correct for oversampling.

Column (2) estimates the same specification with the logarithm of market access as the regressor of

interest. Unlike specification (1), this estimation approach relates within-county changes in market access to

within-county changes in average stature, making no cross-county comparisons. This column reveals that the

negative and statistically significant relationship between market access and height is robust to the inclusion

of the fixed effects, and thus to the concerns that they address over endogeneity. Moreover, at −0.519, the

magnitude of the coefficient is comparable to the estimates of Table 3.34
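The logic of relating within-county changes to within-county changes can be illustrated with a small simulation. The sketch below is not the paper's estimation: it uses synthetic data, a single regressor, no weights, and no birth year or age indicators, and every name and parameter value in it is an assumption chosen for illustration. It shows only that removing county means from both variables recovers the true slope when a fixed county characteristic is correlated with the regressor, whereas the pooled comparison across counties does not.

```python
# Sketch of a within-county (fixed effects) slope on synthetic data.
# All numbers are simulated; nothing here uses the paper's data.
import random
from collections import defaultdict


def within_slope(groups, x, y):
    """Slope from regressing y on x after removing group means from both."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for g, xi, yi in zip(groups, x, y):
        s = sums[g]
        s[0] += xi
        s[1] += yi
        s[2] += 1
    num = den = 0.0
    for g, xi, yi in zip(groups, x, y):
        sx, sy, n = sums[g]
        dx, dy = xi - sx / n, yi - sy / n
        num += dx * dy
        den += dx * dx
    return num / den


def pooled_slope(x, y):
    """Least squares slope with an intercept and no fixed effects."""
    return within_slope([0] * len(x), x, y)


random.seed(0)
TRUE_BETA = -0.5
groups, x, y = [], [], []
for county in range(200):
    county_effect = random.gauss(0, 1)  # fixed, unobserved county trait
    for _ in range(20):
        # The regressor is higher in counties with a higher fixed trait,
        # so comparisons across counties are confounded.
        xi = county_effect + random.gauss(0, 1)
        yi = TRUE_BETA * xi + 2.0 * county_effect + random.gauss(0, 1)
        groups.append(county)
        x.append(xi)
        y.append(yi)

beta_pooled = pooled_slope(x, y)
beta_within = within_slope(groups, x, y)
print(f"pooled: {beta_pooled:.3f}, within: {beta_within:.3f}")
```

In this simulated setting the pooled slope is pushed away from the true value by the county trait, while the within slope stays close to it. The direction and size of the pooled bias here are artifacts of the chosen parameters.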

33The county-fixed effects approach has the added benefit of not requiring the inclusion of potentially endogenous controls such as the 1840 and 1850 controls above. Instead, the county-specific characteristics that these are meant to capture will be captured by the fixed effects.

34The specification of column (2) is the one estimated by nonlinear least squares. The estimates are β̂ = −0.519 (s.e. = 0.201) and θ̂ = −3.822 (s.e. = 0.476). The estimated D-statistic is 1.410 (s.e. = 0.727). The standard error for β̂ is larger than the one in Table 4 because of the additional uncertainty coming from the need to jointly estimate θ.

This result and its approximate magnitude are robust to the inclusion, in column (3), of county-decade fixed effects, rather than simply county fixed effects, in order to more flexibly address county-specific characteristics that may be time variant. Columns (4) and (5) supplement the county fixed effects with region- and state-by-birth year indicators, respectively. While the inclusion of these indicators reduces the magnitude of the estimated coefficients, and in the case of column (5) reduces it to the point of statistical insignificance (p = 0.138), the rough magnitude and sign of the coefficient are retained, supporting the conclusion that transportation improvements generated declines in stature-implied health.

5.3 Instrumental Variables Results

As an alternative approach to addressing the endogeneity issues facing the estimates of Table 3, I implement the straight-line instrument strategy introduced in section 3.3 above. Before delving into the estimates, I briefly explore, in Table 5, the evidence in support of excludability of the instrument. In particular, I relate the characteristics of counties that are observed in 1820 to the lines of Figure 1. Given the sparsity of data available in the early censuses, the only measures available are population density and the measures of agricultural suitability.35

35The measures of agricultural suitability are not from 1820, but are innate, and so can be considered representative of the conditions in 1820.

Table 5: Correlates of instrumental variables line placement

                                (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)
Variables                       Wheat     Corn      Dens.     Wheat     Corn      Dens.     Wheat     Corn      Dens.

On IV Line                      0.029     0.022     0.032
                                (0.019)   (0.026)   (0.213)
log(IV Market Access) in 1850                                 0.015     −0.049    −0.415
                                                              (0.022)   (0.030)   (0.638)
IV Line Year                                                                                0.001     0.005     −0.076
                                                                                            (0.005)   (0.006)   (0.077)

Observations                    942       941       87        941       940       87        119       119       35
R-squared                       0.605     0.583     0.464     0.617     0.611     0.627     0.571     0.368     0.312

Significance levels: ∗∗∗ p<0.01, ∗∗ p<0.05, ∗ p<0.1
Notes: Dependent variable in column header. Sample includes counties with no urban population in 1820 that are not origins of straight lines of instrumentation. Sample for regressions of population density restricted to counties that had achieved 1860 boundaries by 1820. All specifications include state fixed effects and cubics in the logarithm of distance from Cincinnati and New York. Specifications with the 1850 market access instrument as a regressor also condition on the 1820 market access instrument. Robust standard errors in parentheses.

In column (1), I regress the logarithm of the wheat suitability measure of a county on an indicator for being on one of the lines presented in Figure 1. This regression includes state fixed effects and the same functions of distance from New York and Cincinnati as included above. The resulting coefficient is

statistically insignificant and small, indicating that it is not possible to reject the null hypothesis that counties on the lines were ex ante no different from others. The regression in column (2) of corn suitability shows similar results. In both of these cases, even if the coefficients were of larger magnitude and statistically significant, the bias induced by the positive coefficients would tend to mute the negative transport-health relationship that I have found. Construction targeting more potentially agriculturally productive

areas would tend to be associated with greater average height if agricultural suitability supported better

health. The regression in column (3) of the logarithm of population density on the same regressor (limiting

the sample to counties that had achieved their 1860 boundaries by 1820) shows similar results.36

Columns (4)–(6) repeat the same estimation with the value of the instrument in 1850 (approximately the

end of the study period) as the regressor. This is the value of the instrument generated by the “construction”

of the hypothetical links. In these regressions I also control for the level of the instrument in 1820 in order to

isolate the effects on the instrument of the addition of lines. These regressions yield similar results. Finally,

in columns (7)–(9), I regress the same outcomes on the year in which the lines of instrumentation reach a

particular county, restricting to counties through which a line passes. Little relationship if any is found.

Thus, these results support the identification assumptions that counties on and off of the lines are ex ante

similar, and that counties closer and farther from the origins of the line are ex ante similar.

Table 6 presents the coefficient from the estimation of equation (1) by instrumental variables with state-

specific indicators and no other controls; it is analogous to column (7) of Table 3. The first feature of note in

this column is that the first-stage estimation—that is, the estimation of specification (1) with the logarithm

of market access as the dependent variable and the logarithm of the instrumental variables-implied market

access as the regressor of interest—shows a positive and strongly statistically significant relationship between

the instrument and the potentially endogenous regressor of interest, indicating that the instrument satisfies

the relevance condition. This satisfaction of the relevance criterion remains robust throughout the various

specifications in this Table.

The relationship between market access and health as estimated by this instrumental variables approach

in column (1) is negative and statistically significant.37 Its magnitude is comparable to the ordinary least

squares estimate of Table 3 and to the fixed effects estimates of columns (2)–(5) of Table 4. Column (2) of

Table 6 adds the county-specific controls discussed above. Unlike the ordinary least squares regressions of

Table 3, the introduction of these controls increases rather than decreases the magnitude of the coefficient,

36This sample limitation is made in order to avoid changes in population density coming from changing boundaries.

37The results of Table 6 include individuals born in counties that have no urban population in 1820 but that are origin points of a line in panels (c) or (d) of Figure 1. Omission of these individuals, who number 303, or 158 in birth years after the decadal year in which the line first appears, yields results that are virtually identical to those of Table 6.

Table 6: Instrumental variables regressions

                                (1)        (2)        (3)        (4)        (5)
Variables

log(Market Access), 1820 Pop.   −0.541∗∗∗  −0.744∗∗   −0.655∗    −0.832∗    −0.331
                                (0.190)    (0.369)    (0.368)    (0.452)    (0.494)

Observations                    25,567     23,567     23,567     23,567     25,567
R-squared                       0.074      0.077      0.079      0.105      0.055
State FE                        Yes        Yes        Yes        Yes        No
Controls                        No         Yes        Yes        Yes        No
Birth Year × Region FE          No         No         Yes        No         No
Birth Year × State FE           No         No         No         Yes        No
County FE                       No         No         No         No         Yes

First Stage                     0.398∗∗∗   0.264∗∗∗   0.269∗∗∗   0.237∗∗∗   0.355∗∗∗
                                (0.033)    (0.027)    (0.027)    (0.027)    (0.033)

Significance levels: ∗∗∗ p<0.01, ∗∗ p<0.05, ∗ p<0.1
Notes: Dependent variable is height in inches. Sample includes individuals born in the Northeast or Midwest in counties with no urban population in 1820. All specifications include birth year and measurement age fixed effects. Standard errors clustered at the county level. Observations weighted to correct for oversampling. The column with the header FE includes both county fixed effects and an instrumentation approach.

which, at −0.744, remains negative and statistically significant, though less precisely estimated. Columns

(3) and (4) add region- and state-by-birth year indicators to the instrumental variables specification with

controls. The negative and statistically significant coefficient is robust to these controls (though the statistical

significance is marginal, with p values of 0.075 and 0.066, respectively), as is its approximate magnitude.

That the instrumental variables estimate is more negative than the analogous ordinary least squares

estimate suggests that, in fact, the direction of the bias addressed by the instrumental variables approach

is the opposite of the bias hypothesized in section 3 above. This pattern is consistent with transportation

being constructed towards areas with agricultural potential, which would have improved health, all else

equal. However, this conclusion must be interpreted with caution given the presence of local average treatment effects

and the possibility of measurement errors.
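The direction-of-bias argument can be illustrated with a simulation. The Python sketch below is a stylized example on synthetic data, not the paper's estimation: it has one instrument, one regressor, no controls, weights, or fixed effects, and all parameter values and names are assumptions. An unobserved trait (labeled agricultural potential) raises both the regressor and the outcome, which pushes the least squares slope toward zero relative to the true negative effect, while the simple instrumental variables ratio stays close to it.

```python
# Stylized simulation of least squares versus instrumental variables.
# All data are synthetic and all parameter values are illustrative.
import random


def cov(a, b):
    """Sample covariance (dividing by n) of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)


random.seed(1)
TRUE_EFFECT = -0.5
z, x, y = [], [], []
for _ in range(20000):
    potential = random.gauss(0, 1)  # unobserved trait raising x and y
    zi = random.gauss(0, 1)         # instrument, independent of the trait
    xi = 0.4 * zi + potential + random.gauss(0, 1)
    yi = TRUE_EFFECT * xi + potential + random.gauss(0, 1)
    z.append(zi)
    x.append(xi)
    y.append(yi)

beta_ols = cov(x, y) / cov(x, x)     # confounded by the unobserved trait
first_stage = cov(z, x) / cov(z, z)  # relevance of the instrument
beta_iv = cov(z, y) / cov(z, x)      # reduced form divided by first stage

print(f"OLS: {beta_ols:.3f}, first stage: {first_stage:.3f}, IV: {beta_iv:.3f}")
```

With a single instrument and a single endogenous regressor, the ratio of covariances used here coincides with two-stage least squares. The simulation abstracts from heterogeneous effects and measurement error, which are the caveats raised in the text.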

Finally, column (5) combines the two empirical approaches by estimating equation (2) by instrumental

variables. The first stage estimate is strong, indicating that prior first-stage estimates are robust to the

inclusion of county fixed effects. The second-stage coefficient of interest remains negative, and the magnitude

is comparable to estimates of Tables 4 and 6.38 However, the standard error of this coefficient is more than

doubled by the demands of this estimation (relative to the non-instrumental variables analog), making it

impossible to reject the null hypothesis of no effect.

Overall, based on the results of Tables 4 and 6, I conclude that the data provide strong and robust

38The difference between the estimates with and without instrumental variables, though small, tends to support transportation targeting less healthy areas.

evidence of a negative relationship between stature and market access in the county-year of birth.39 These

estimates are consistent with previous descriptions of correlations in the antebellum United States, though

unlike those estimates, these can plausibly be interpreted causally.40

5.4 Robustness Checks

Table 7 presents a variety of robustness checks of the main results presented above. Columns (1)–(3) verify

the robustness of the results of the county-fixed effects regressions of column (2) of Table 4. Column (1) adds

the transport indicator into this regression, which already includes market access. This approach, developed

by Donaldson and Hornbeck (2016), has the benefit of identifying the impacts of market access while holding

constant a county’s transportation linkage. Identification is then based on construction elsewhere in the

transportation network. Concerns that transportation construction targeted areas that were more or less

healthy are thus reduced.41 Column (1) reveals that the negative and statistically significant coefficient

on market access is robust to this alternate source of identification. Column (2) generalizes this approach

by controlling separately for railroad, canal, and river linkages, with similar results. Finally, column (3)

includes year-specific quadratic functions in latitude and longitude. Although the coefficient is less precisely

estimated (p = 0.159), it retains its negative sign and approximate magnitude.
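The p-values quoted for imprecisely estimated coefficients can be approximated from the reported coefficient and standard error alone. The sketch below uses a two-sided test with a standard normal reference distribution, which is an assumption: the paper does not state the reference distribution in this section, and a t distribution with degrees of freedom based on the number of clusters would give slightly larger values. The function name is hypothetical.

```python
# Approximate two-sided p-values from a coefficient and its standard error.
# The normal reference distribution is an assumption, not the paper's method.
import math


def two_sided_p(coefficient: float, std_error: float) -> float:
    """Two-sided p-value for H0: coefficient = 0 under a normal approximation."""
    z = abs(coefficient / std_error)
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) for the standard normal CDF Phi.
    return math.erfc(z / math.sqrt(2.0))


p_table7_col3 = two_sided_p(-0.302, 0.214)  # text reports p = 0.159
p_table4_col5 = two_sided_p(-0.323, 0.217)  # text reports p = 0.138

print(f"Table 7, column (3): {p_table7_col3:.3f}")
print(f"Table 4, column (5): {p_table4_col5:.3f}")
```

Both approximations land within a few thousandths of the reported values, which is consistent with the reported p-values being based on a t reference distribution with many clusters.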

Table 7: Robustness checks

                                (1)                  (2)                  (3)                  (4)                  (5)
Variables                       FE                   FE                   FE                   IV                   IV

log(Market Access), 1820 Pop.   −0.623∗∗∗            −0.624∗∗∗            −0.302               −1.611∗∗             −1.306∗
                                (0.209)              (0.203)              (0.214)              (0.634)              (0.757)

Observations                    25,567               25,567               25,567               23,567               23,567
R-squared                       0.124                0.125                0.136                0.074                0.087
Added Control                   Transport Indicator  Transport Mode       Geo. by Yr.          Starting MA          Geo. by Yr.

Significance levels: ∗∗∗ p<0.01, ∗∗ p<0.05, ∗ p<0.1
Notes: Dependent variable is height in inches. Sample includes individuals born in the Northeast or Midwest in counties with no urban population in 1820. Standard errors clustered at the county level. Observations weighted to correct for oversampling. Added controls explained in text. Columns headed FE include county fixed effects. Columns headed IV estimated using the straight-line-based instrument and include all controls described in text.

Columns (4) and (5) of Table 7 test robustness of the instrumental variables regression of column (2)

of Table 6. Column (4) controls for the level of market access in a county in 1820, in order to more

39In Table A.1, I use the single year of data on death rates (1850) to study the relationship between transportation linkage as measured by market access and health as measured by death rates. In general, the results of this Table are supportive of the conclusion that there was a negative relationship between transportation and health.

40One potential concern is that migration out of the county of birth in response to transportation linkage might be responsible for the reduction in stature, rather than any effect within the county of birth. This concern is addressed in Table A.2, where I show that migration patterns in response to market access are of the opposite sign to be consistent with this concern.

41The concerns are not totally alleviated, however, as construction might take place away from a county in order to increase its market access. The Erie Canal is an example of such construction.

effectively isolate changes over time in market access, rather than its level, which may be endogenous even

after instrumentation because the instrument is based on the (potentially endogenous) 1820 network. The

negative and statistically significant coefficient of market access is robust to this control, and its magnitude

is increased. Finally, column (5) includes year-specific quadratics in latitude and longitude, and the result

is again robust.

5.5 The Local Development Channel

Understanding the channel through which transportation operates on health is important for at least two

reasons. First, understanding the mechanism of the effect would help to determine whether these results

apply in other settings, such as in modern developing countries. Second, a negative effect of market access

on stature is consistent with both explanations for the Antebellum Puzzle, as discussed in section 2 above.

Investigation of the channel through which the effect operates can help to better understand this phenomenon

and potentially to distinguish between these two explanations.

Given the limited data availability for this period, it is not possible to evaluate all of the mechanisms

described in section 2. Instead, I focus on determining whether there is empirical support for the disease

explanation. I concentrate specifically on the part of this explanation that claims that transportation improvements generated a worse epidemiological environment by creating growth in newly linked areas, which

generated worse sanitation conditions and thus worse health.42

I begin by testing whether the arrival of transportation infrastructure generated local development in the

form of greater population density by estimating the specification

\[ \log(d_{jt}) = \alpha_j + \gamma_t + \beta \log(MA_{jt}) + \varepsilon_{jt} \]

for census years 1820–1850, where $d_{jt}$ is population density and $MA_{jt}$ is market access. I estimate this

equation by ordinary least squares and by instrumental variables, presenting the results in column (1) of

Table 8. Following Atack et al. (2010), I limit the sample to county-years in which counties had already

achieved their 1860 boundaries so that results are not driven by, for instance, changes in population density

caused by changes in county boundaries. Although the magnitude of the estimated relationship between

market access and population density is impacted by whether or not an instrumental variables method is

used, the general qualitative result is not. In particular, column (1) of Table 8 shows a large, positive,

42Unfortunately, a paucity of data prevents an effective test of the other mechanisms.

and statistically significant impact of market access on population density. The estimated coefficients imply

that a one-standard deviation increase in market access (0.407 in the data set with counties as the unit of

observation, as shown in Table 1) is associated with a 0.181 log point increase in population density according

to the fixed effects estimates, or a 0.431 log point increase according to the instrumental variables estimates.
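As a worked check of these magnitudes, the implied change in log population density is the estimated coefficient on log market access multiplied by the county-level standard deviation of log market access. The notation $\sigma_{\log MA}$ for that standard deviation is introduced here for illustration only.

```latex
\[
\Delta \log(d_{jt}) \approx \hat{\beta} \times \sigma_{\log MA}
\]
\[
\text{Fixed effects: } 0.445 \times 0.407 \approx 0.181
\qquad
\text{Fixed effects and IV: } 1.059 \times 0.407 \approx 0.431
\]
```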

Table 8: The local development mechanism

                                (1)        (2)        (3)        (4)        (5)
Variable                        Dens.      Dens.      Dens.      Height     Height

Panel A: Fixed Effects
log(Market Access), 1820 Pop.   0.445∗∗∗   0.470∗∗∗   0.436∗∗∗   −0.392∗∗   −0.392∗∗
                                (0.129)    (0.123)    (0.126)    (0.180)    (0.183)
log(MA) × log(Wheat Suit.)                 0.649∗∗∗              −0.844∗
                                           (0.210)               (0.475)
log(MA) × log(Corn Suit.)                             0.490∗∗∗              −0.636
                                                      (0.146)               (0.428)
Observations                    1,166      1,166      1,166      25,567     25,567
R-squared                       0.921      0.926      0.926      0.124      0.124

Panel B: Fixed Effects and IV
log(Market Access), 1820 Pop.   1.059∗∗∗   1.038∗∗∗   1.015∗∗∗   −0.215     −0.174
                                (0.188)    (0.182)    (0.186)    (0.535)    (0.555)
log(MA) × log(Wheat Suit.)                 0.677∗∗               −0.400
                                           (0.266)               (0.699)
log(MA) × log(Corn Suit.)                             0.442∗                −0.423
                                                      (0.232)               (0.674)
Observations                    1,122      1,122      1,122      25,567     25,567
R-squared                       0.525      0.542      0.543      0.056      0.056

Significance levels: ∗∗∗ p<0.01, ∗∗ p<0.05, ∗ p<0.1
Notes: Dependent variable listed in the column header. Sample in columns (1)–(3) includes all county-years with borders fixed to 1860, with no urban population in 1820, and in the Midwest or Northeast. Sample in columns (4) and (5) includes individuals born in the Northeast or Midwest in counties with no urban population in 1820. All specifications include year and county fixed effects. Columns (4) and (5) also include measurement age fixed effects. Observations in columns (1)–(3) weighted by the ratio of county population in each year to total population in that year. Observations in columns (4) and (5) weighted to correct for oversampling. Standard errors clustered at the county level.

In columns (2) and (3) of Table 8, I investigate whether the effect of market access on population

density varies by a county’s potential agricultural productivity. I interact market access with the log of a

measure of a county’s average suitability for wheat or corn production. The results of this estimation reveal

that the effects of market access on increasing population density were stronger in counties with greater

agricultural suitability. I demean the measures of suitability so that, for example, the fixed effects estimates

of column (2) can be interpreted as implying that a county with average wheat suitability experienced an

increase in population density of 0.470 percent in response to a one percent increase in market access, and

that this elasticity was 0.649 higher per log point of wheat suitability above the mean (about 0.006 higher for a county with wheat suitability one percent above the mean). Similar results are evident for corn suitability, and comparable results (though with less precision

and larger coefficients) are found when using instrumental variables. Thus, counties with greater agricultural

suitability tended to experience greater increases in population density in response to the same improvements

in transport linkages, likely reflecting immigration from other areas of individuals seeking to establish farms.43
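One way to read the interaction estimates is as an elasticity of population density with respect to market access that shifts with a county's demeaned log suitability. The sketch below evaluates that implied elasticity using the fixed effects estimates for wheat in column (2) of Table 8, Panel A. The function name and the deviations of 0.1 log points are hypothetical choices for illustration, and the calculation assumes the usual reading of a log-log specification with a demeaned interaction term.

```python
# Implied elasticity of population density with respect to market access,
# as a function of demeaned log wheat suitability. Coefficients are from
# Table 8, Panel A, column (2); the evaluation points are hypothetical.


def density_elasticity(suitability_deviation: float,
                       base: float = 0.470,
                       interaction: float = 0.649) -> float:
    """Elasticity at a given deviation of log suitability from its mean."""
    return base + interaction * suitability_deviation


at_mean = density_elasticity(0.0)      # county with average suitability
above_mean = density_elasticity(0.1)   # 0.1 log points above the mean
below_mean = density_elasticity(-0.1)  # 0.1 log points below the mean

print(f"at mean: {at_mean:.3f}")
print(f"0.1 above: {above_mean:.3f}")
print(f"0.1 below: {below_mean:.3f}")
```

The same function can be evaluated with the corn estimates of column (3) by passing `base=0.436` and `interaction=0.490`.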

Given the difference in the unit of observation available for the analysis of population density on the one

hand (one observation per county-decade) and transportation and height on the other (annual county-level

observations), it is not possible to directly relate changing population density to changing stature. Instead,

to determine whether there was a relationship between growth in population density and declines in average

stature, I investigate whether the responsiveness of height to changes in market access also differed by crop

suitability. To this end, I repeat the interaction approach in estimating equation (2) in columns (4) and

(5), with height as the dependent variable. When estimating by ordinary least squares (with county fixed

effects) in Panel A, I find that the negative relationship of market access with stature is stronger in the

more agriculturally suitable counties, though the interaction coefficients are only marginally statistically

significant (p = 0.076 for wheat suitability and p = 0.137 for corn suitability). The magnitudes of the

estimated interaction coefficients are large. When these estimates are repeated with the combination of

instrumental variables and county fixed effects, as in column (5) of Table 6, results of the same sign are

found, but as with column (5) of Table 6, the coefficients are smaller and imprecisely estimated.

Together, these results indicate that, in the counties where agricultural productivity caused population

density to grow more in response to rises in market access, the effects of market access on reducing height

were stronger. This is consistent with the contention that the effect of transportation on health

passed through the channel of increasing local development that worsened the local disease environment.

6 The Antebellum Puzzle

In addition to contributing to the understanding of the health impacts of transportation

improvements in developing countries, this paper also helps to explain the causes of the deterioration

in health during the early phases of modern economic growth in the United States. Having shown that

transportation linkages were responsible for declining health in this period, this paper provides an empirical

basis for a potential explanation of this trend. This is the first evidence of an explanation for the Antebellum

Puzzle with estimates that can plausibly be given a causal interpretation, and the first that relates declines

in height over time to change in local circumstances (by virtue of the county fixed effects estimation). In

this section, I use the results presented in section 5 above to determine how much of the deterioration in

average stature can be attributed to the growing transportation network of the antebellum period.

43 This view is further supported by the results of Table A.3, which shows an increase in the acreage devoted to farming in response to rising market access.

The empirical pattern to be explained is a 0.82 inch decline in average stature that is present in the

benchmark sample.44 To determine the fraction of this decline that is attributable to rising market access,

I determine the estimated impact of the rise in market access over the study period on health as implied

by the estimates above. Table 1 shows that the logarithm of market access increased by 0.645 from 1820 to 1847.

Thus, the largest coefficient in Tables 4 and 6, -0.832, can explain a decline in stature of 0.54 inches,

or about 65 percent of the total decline of 0.82 inches. As a lower bound, the coefficient of -0.323 can explain

a decline in stature of 0.21 inches, or about 26 percent of the total decline.
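
The arithmetic behind these bounds can be checked with a short sketch. It assumes a simple linear calculation (the coefficient multiplied by the change in log market access) and uses only the figures quoted above; the function and constant names are illustrative and do not come from the paper.

```python
# Figures quoted in the text.
TOTAL_DECLINE_INCHES = 0.82      # decline in average stature, benchmark sample
DELTA_LOG_MARKET_ACCESS = 0.645  # rise in log market access over the study period


def implied_decline(coefficient, delta_log_ma=DELTA_LOG_MARKET_ACCESS):
    """Inches of stature decline implied by a height-on-log-market-access coefficient."""
    return -coefficient * delta_log_ma


def share_explained(coefficient, total=TOTAL_DECLINE_INCHES):
    """Fraction of the total decline attributed to rising market access."""
    return implied_decline(coefficient) / total


for label, coef in [("largest coefficient", -0.832), ("lower bound", -0.323)]:
    print(f"{label}: {implied_decline(coef):.2f} inches, "
          f"{100 * share_explained(coef):.1f} percent of the decline")
```

With these rounded inputs, the lower-bound share comes out at about 25 percent, slightly below the 26 percent quoted above; the difference may reflect rounding in the reported coefficients.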

The finding that transportation was responsible for a large fraction of the decline in stature of the antebellum

period is important in documenting a definitive cause for the decline in health, which has heretofore

eluded researchers. It is not, however, helpful on its own in choosing between the disease and the food price

explanations. Discriminating between these explanations is important to understanding the potential impacts

of economic growth beyond that induced by transportation expansions, and in determining whether,

as Komlos (1987) has argued, the decline in stature is not indicative of a negative impact of industrialization

because it was the product of utility-maximizing choice. In demonstrating that the increased

concentration generated by transportation was in part responsible for the negative impact of transportation,

this paper also contributes by providing evidence supporting the disease explanation.

7 Conclusion
