
4.25.2016

Climate Econometrics

I have a new working paper out reviewing various methods, models, and assumptions used in the econometrics literature to quantify the impact of climatic conditions on society.  The process of writing this was much more challenging than I expected, but rereading it makes me feel like we as a community really learned a lot during the last decade of research. Here's the abstract:
Climate Econometrics (forthcoming in the Annual Reviews)
Abstract: Identifying the effect of climate on societies is central to understanding historical economic development, designing modern policies that react to climatic events, and managing future global climate change. Here, I review, synthesize, and interpret recent advances in methods used to measure effects of climate on social and economic outcomes. Because weather variation plays a large role in recent progress, I formalize the relationship between climate and weather from an econometric perspective and discuss their use as identifying variation, highlighting tradeoffs between key assumptions in different research designs and deriving conditions when weather variation exactly identifies the effects of climate. I then describe advances in recent years, such as parameterization of climate variables from a social perspective, nonlinear models with spatial and temporal displacement, characterizing uncertainty, measurement of adaptation, cross-study comparison, and use of empirical estimates to project the impact of future climate change. I conclude by discussing remaining methodological challenges.
I summarize several highlights here.

4.06.2015

Data-driven causal inference

Distinguishing cause from effect using observational data: methods and benchmarks

From the abstract:
The discovery of causal relationships from purely observational data is a fundamental problem in science. The most elementary form of such a causal discovery problem is to decide whether X causes Y or, alternatively, Y causes X, given joint observations of two variables X, Y . This was often considered to be impossible. Nevertheless, several approaches for addressing this bivariate causal discovery problem were proposed recently. In this paper, we present the benchmark data set CauseEffectPairs that consists of 88 different "cause-effect pairs" selected from 31 datasets from various domains. We evaluated the performance of several bivariate causal discovery methods on these real-world benchmark data and on artificially simulated data. Our empirical results provide evidence that additive-noise methods are indeed able to distinguish cause from effect using only purely observational data. In addition, we prove consistency of the additive-noise method proposed by Hoyer et al. (2009).
From the arxiv.org blog (note):
The basis of the new approach is to assume that the relationship between X and Y is not symmetrical. In particular, they say that in any set of measurements there will always be noise from various causes. The key assumption is that the pattern of noise in the cause will be different to the pattern of noise in the effect. That’s because any noise in X can have an influence on Y but not vice versa.
There's been a lot of research in stats on "causal discovery" techniques, and the paper in essence is running a horse race between Additive-Noise Methods and Information Geometric Causal Inference, with ANM winning out. Some nice overview slides providing background are here.
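To make the additive-noise idea concrete, here is a toy sketch (my own illustration, not code from the paper): fit a flexible regression in each direction and ask in which direction the residuals look independent of the input. Serious implementations typically use nonparametric regression and a kernel independence test such as HSIC; the polynomial fit and rank-correlation proxy below are only meant to show the logic.

import numpy as np
from scipy.stats import spearmanr

def dependence_score(u, resid):
    # Crude stand-in for an independence test: rank correlation of the
    # residuals, and of their magnitudes, with the putative cause.
    r1, _ = spearmanr(u, resid)
    r2, _ = spearmanr(u, np.abs(resid))
    return abs(r1) + abs(r2)

def anm_direction(x, y, degree=4):
    # Fit y = f(x) + e and x = g(y) + e with flexible polynomials, then pick
    # the direction whose residuals look more independent of the input.
    resid_xy = y - np.polyval(np.polyfit(x, y, degree), x)
    resid_yx = x - np.polyval(np.polyfit(y, x, degree), y)
    return "x->y" if dependence_score(x, resid_xy) < dependence_score(y, resid_yx) else "y->x"

# Simulated example in which X causes Y through a nonlinear function:
rng = np.random.default_rng(0)
x = rng.uniform(0, 2, 1000)
y = x**3 + rng.normal(0, 0.5, 1000)
print(anm_direction(x, y))  # typically prints 'x->y'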

4.15.2014

On giving a great applied talk

Jesse Shapiro* has some excellent slides on giving a good applied micro talk that are both specific enough to be of use for students prepping job market talks and general enough to provide good fodder for thinking about how one presents one's work to any audience. I highly recommend them. (via Kyle Meng)



*: yet another Stuyvesant High School graduate.

1.17.2014

FAQs for "Reconciling disagreement over climate–conflict results in Africa"

[This is a guest blog post by my coauthor Kyle Meng.]

Sol and I just published an article in PNAS in which we reexamine a controversy in the climate-conflict literature. The debate centers on two previous PNAS articles: the first by Burke et al. (PNAS, 2009), which claims that higher temperatures increase conflict risk in sub-Saharan Africa, and a second PNAS article by Buhaug (PNAS, 2010) refuting the earlier study.

How did we get here?

First, a bit of background. Whether climate change causes societies to be more violent is a critical question for our understanding of climate impacts. If climate change indeed increases violence, the economic and social costs of climate change may be far greater than what was previously considered, and thus further prompt the need to reduce greenhouse gas emissions. To answer this question, researchers in recent years have turned to data from the past asking whether violence has responded historically to changes in the local climate. Despite the increasing volume of research (summarized by Sol, Marshall Burke, and Ted Miguel in their meta-analysis published in Science and the accompanying review article in Climatic Change) this question remained somewhat controversial in the public eye. Much of this controversy was generated by this pair of PNAS papers.

What did we do?

Our new paper takes a fresh look at these two prior studies by statistically examining whether the evidence provided by Buhaug (2010) overturns the results in Burke et al. (2009). Throughout, we examine the two central claims made by Buhaug:
1) that Burke et al.'s results "do not hold up to closer inspection" and
2) that climate change does not cause conflict in sub-Saharan Africa.
Because these are quantitative papers, Buhaug’s two claims can be answered using statistical methods. We found that Buhaug did not run the appropriate statistical procedures needed to support the claims he made. When we applied the correct statistical tests, we found that:
a) the evidence in Buhaug is not statistically different from that of Burke et al. and
b) Buhaug’s results cannot support the claim that climate does not cause conflict. 
A useful analogy

The statistical reasoning in our paper is a bit technical, so an analogy may be helpful here. Burke et al.'s main result is equivalent to saying "smoking increases lung cancer risks roughly 10%". Buhaug's claims above are equivalent to stating that his analysis demonstrates that “smoking does not increase lung cancer risks” and furthermore that “smoking does not affect lung cancer risks at all”.

What we find, after applying the appropriate statistical method, is that the only equivalent claim that can be supported by Buhaug’s analysis is "smoking may increase lung cancer risks by roughly 100% or may decrease them by roughly 100% or may have no effect whatsoever". Notice this is a far different statement than what Buhaug claims he has demonstrated in 1) and 2) above. Basically, the results presented in Buhaug are so uncertain that they do not reject zero effect, but they also do not reject the original work by Burke et al.
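For readers who want to see the arithmetic behind this kind of comparison, here is a toy calculation (made-up numbers, not the actual estimates from either paper, and a simple z-test that ignores the overlap between the two samples; the tests in our article are the ones appropriate for the actual data). The point is just that a sufficiently noisy estimate can fail to reject zero and fail to reject the original estimate at the same time.

import math

def two_sided_p(diff, se):
    # p-value for the null that the true difference is zero
    z = diff / se
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

b_orig, se_orig = 0.10, 0.03   # hypothetical precise estimate: a 10% increase
b_new,  se_new  = 0.02, 0.50   # hypothetical noisy re-estimate

# The noisy estimate does not reject zero...
print(two_sided_p(b_new, se_new))                                      # ~0.97
# ...but it also does not reject the original estimate.
print(two_sided_p(b_new - b_orig, math.sqrt(se_orig**2 + se_new**2)))  # ~0.87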

Isn’t Buhaug just showing Burke et al.’s result is “not robust”?

In statistical analyses, we often seek to understand if a result is “robust” by demonstrating that reasonable alterations to the model do not produce dramatically different results. If successful, this type of analysis sometimes convinces us that we have not failed to account for important omitted variables (or other factors) that would alter our estimates substantively.

Importantly, however, the reverse logic is not true: “non-robustness” is not a conclusive (or logical) result. Obtaining different estimates from model alterations alone does not necessarily imply that the original result is wrong, since it might be the new estimate that is biased. Observing unstable results tells us only that some (or all) of the models are misspecified; it does not tell us which one, merely that the analyst isn’t yet working with the right statistical model.

There must exist only one “true” relationship between climate and conflict: it may be a coefficient of zero or a larger coefficient consistent with Burke et al., but it cannot be all of these coefficients at the same time. If models with very different underlying assumptions produce dramatically different estimates, this suggests that all of the models (except perhaps one) are misspecified and should be thrown out.

A central error in Buhaug is his interpretation of his findings.  He removes critical parts of Burke et al.’s model (e.g. those that account for important differences in geography, history and culture) or re-specifies them in other ways and then advocates that the various inconsistent coefficients produced should all be taken seriously. In reality, the varying estimates produced by Buhaug are either due to added model biases or to sampling uncertainty caused by the techniques that he is using. It is incorrect to interpret this variation as evidence that Burke et al.’s estimate is “non-robust”.

So are you saying Burke et al. was right?

No. And this is a very important point. In our article, we carefully state:
“It is important to note that our findings neither confirm nor reject the results of Burke et al. Our results simply reconcile the apparent contradiction between Burke et al. and Buhaug by demonstrating that Buhaug does not provide evidence that contradicts the results reported in Burke et al. Notably, however, other recent analyses obtain results that largely agree with Burke et al., so we think it is likely that analyses following our approach will reconcile any apparent disagreement between these other studies and Buhaug.”
That is, taking Burke et al.'s result as given, we find that the evidence provided in Buhaug does not refute Burke et al. (the central claim of Buhaug). Whether Burke et al. was right about climate causing conflict in sub-Saharan Africa is a different question. We’ve tried to answer that question in other settings (e.g. our joint work published in Nature), but that’s not the contribution of this analysis.

Parting note

Lastly, we urge those interested to read our article carefully. Simply skimming the paper by hunting for statistically significant results would be missing the paper’s point. Our broader hope besides helping to reconcile this prior controversy is that the statistical reasoning underlying our work becomes more common in data-driven analyses.

1.15.2014

Reconciling disagreement over climate–conflict results in Africa

Kyle and I have a paper out in the Early Edition of PNAS this week:

Reconciling disagreement over climate–conflict results in Africa
Solomon M. Hsiang and Kyle C. Meng
Abstract: A recent study by Burke et al. [Burke M, Miguel E, Satyanath S, Dykema J, Lobell D (2009) Proc Natl Acad Sci USA 106(49):20670– 20674] reports statistical evidence that the likelihood of civil wars in African countries was elevated in hotter years. A following study by Buhaug [Buhaug H (2010) Proc Natl Acad Sci USA 107 (38):16477–16482] reports that a reexamination of the evidence overturns Burke et al.’s findings when alternative statistical models and alternative measures of conflict are used. We show that the conclusion by Buhaug is based on absent or incorrect statistical tests, both in model selection and in the comparison of results with Burke et al. When we implement the correct tests, we find there is no evidence presented in Buhaug that rejects the original results of Burke et al. 
Related reconciliation of different results in Kenya.

A brief refresher and discussion of the controversy that we are examining is here.

1.10.2014

Reconciling temperature-conflict results in Kenya

Marshall, Ted and I have a new short working paper out. When we correct the coding of a single variable in a previous study (that uses a new data set), we obtain highly localized temperature-conflict associations in Kenya that are largely in line with the rest of the literature. I think this is a useful example for why we should be careful with how we specify interaction terms.

Reconciling temperature-conflict results in Kenya
Solomon M. Hsiang, Marshall Burke, and Edward Miguel
Abstract: Theisen (JPR, 2012) recently constructed a novel high-resolution data set of intergroup and political conflict in Kenya (1989-2004) and examined whether the risk of conflict onset and incidence responds to annual pixel-level variations in temperature and precipitation.  Theisen concluded that only extreme precipitation is associated with conflict incidence and that temperature is unrelated to conflict, seemingly at odds with recent studies that found a positive association at the pixel scale (O'Loughlin et al., PNAS 2012), at the country scale (Burke et al., PNAS 2009), and at the continental scale (Hsiang et al., Nature 2011) in Africa.  Here we show these findings can be reconciled when we correct the erroneous coding of temperature-squared in Theisen. In contrast to the original conclusions presented in Theisen, both conflict onset and conflict incidence are significantly and positively associated with local temperature in this new and independently assembled data set.
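As a generic illustration of why the coding of squared and interaction terms matters (simulated data and hypothetical variable names below; this is not Theisen's data, nor the specific coding issue in that paper): the quadratic term has to be constructed from exactly the same temperature variable that enters linearly, and the marginal effect has to be read off the whole polynomial rather than the linear coefficient alone.

import numpy as np

rng = np.random.default_rng(1)
temp = rng.uniform(15, 35, 500)                                # temperature (deg C)
conflict = 0.02 * temp + 0.004 * temp**2 + rng.normal(0, 0.5, 500)

# Correct coding: the squared regressor is the square of the same series
# that enters linearly.
X = np.column_stack([np.ones_like(temp), temp, temp**2])
beta = np.linalg.lstsq(X, conflict, rcond=None)[0]

# The effect of 1 degree of warming depends on the baseline temperature:
for t0 in (20, 30):
    print(t0, beta[1] + 2 * beta[2] * t0)   # d(conflict)/d(temp) at temp = t0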

12.09.2013

What is identification?

There are relatively few non-academic internet resources on identification and causal inference in the social sciences, especially of the sort that can be consumed by a nonspecialist. To remedy that slightly I decided to tidy up and post some slides I've used to give talks on causal inference a few times in the past year. They're aimed at senior undergrad or graduate students with at least some background in statistics or econometrics, and can be found here:

Causal Inference, Identification, and Identification Strategies

Feel free to drop me a line and give me feedback, especially if something seems unclear / incorrect. Thanks!

7.29.2013

Forward vs. reverse causal questions

Andrew Gelman has a thought-provoking post on asking "Why?" in statistics:
Consider two broad classes of inferential questions: 
1. Forward causal inference. What might happen if we do X? What are the effects of smoking on health, the effects of schooling on knowledge, the effect of campaigns on election outcomes, and so forth? 
2. Reverse causal inference. What causes Y? Why do more attractive people earn more money? Why do many poor people vote for Republicans and rich people vote for Democrats? Why did the economy collapse? [...] 
My question here is: How can we incorporate reverse causal questions into a statistical framework that is centered around forward causal inference? (Even methods such as path analysis or structural modeling, which some feel can be used to determine the direction of causality from data, are still ultimately answering forward causal questions of the sort, What happens to y when we change x?) 
My resolution is as follows: Forward causal inference is about estimation; reverse causal inference is about model checking and hypothesis generation.
Among many gems is this:
A key theme in this discussion is the distinction between causal statements and causal questions. When Rubin dismissed reverse causal reasoning as “cocktail party chatter,” I think it was because you can’t clearly formulate a reverse causal statement. That is, a reverse causal question does not in general have a well-defined answer, even in a setting where all possible data are made available. But I think Rubin made a mistake in his dismissal. The key is that reverse questions are valuable in that they focus on an anomaly—an aspect of the data unlikely to be reproducible by the current (possibly implicit) model—and point toward possible directions of model improvement.
 You can read the rest here.

6.03.2013

Weather and Climate Data: a Guide for Economists

Now posted as an NBER working paper (it should be out in REEP this summer):

Using Weather Data and Climate Model Output in Economic Analyses of Climate Change
Maximilian Auffhammer, Solomon M. Hsiang, Wolfram Schlenker, Adam Sobel
Abstract: Economists are increasingly using weather data and climate model output in analyses of the economic impacts of climate change. This article introduces weather data sets and climate models that are frequently used, discusses the most common mistakes economists make in using these products, and identifies ways to avoid these pitfalls. We first provide an introduction to weather data, including a summary of the types of datasets available, and then discuss five common pitfalls that empirical researchers should be aware of when using historical weather data as explanatory variables in econometric applications. We then provide a brief overview of climate models and discuss two common and significant errors often made by economists when climate model output is used to simulate the future impacts of climate change on an economic outcome of interest.

4.25.2013

Toilets


Effects of Rural Sanitation on Infant Mortality and Human Capital: Evidence from India's Total Sanitation Campaign
Dean Spears
Abstract: Open defecation without a toilet or latrine is among the leading global threats to health, especially in India. Although it is well-known that modern sewage infrastructure improves health, it is unknown whether a sanitation program feasible for a low capacity, poor country government could be effective. This paper contributes the first causally identified estimates of effects of rural sanitation on health and human capital accumulation. The Indian government's Total Sanitation Campaign reports building one household pit latrine per ten rural persons from 2001 to 2011. The program offered local governments a large ex post monetary incentive to eliminate open defecation. I use several complementary identification strategies to estimate the program's effect on children's health. First, I exploit variation in program timing, comparing children born in different years. Second, I study a long difference-in-differences in aggregate mortality. Third, I exploit a discontinuity designed into the monetary incentive. Unlike many impact evaluations, this paper studies a full-scale program implemented by a large government bureaucracy with low administrative capacity. At the mean program intensity, infant mortality decreased by 4 per 1,000 and children's height increased by 0.2 standard deviations (similar to the cross-sectional difference associated with doubling household consumption per capita). These results suggest that, even in the context of governance constraints, incentivizing local leaders to promote technology adoption can be an effective strategy.
How much international variation in child height can sanitation explain?
Dean Spears
Abstract: Physical height is an important economic variable reflecting health and human capital. Puzzlingly, however, differences in average height across developing countries are not well explained by differences in wealth. In particular, children in India are shorter, on average, than children in Africa who are poorer, on average, a paradox called “the Asian enigma” which has received much attention from economists. This paper provides the first documentation of a quantitatively important gradient between child height and sanitation that can statistically explain a large fraction of international height differences. This association between sanitation and human capital is robustly stable, even after accounting for other heterogeneity, such as in GDP. The author applies three complementary empirical strategies to identify the association between sanitation and child height: country-level regressions across 140 country-years in 65 developing countries; within-country analysis of differences over time within Indian districts; and econometric decomposition of the India-Africa height differences in child-level data. Open defecation, which is exceptionally widespread in India, can account for much or all of the excess stunting in India.


Perhaps the most disturbing thing of all is the simple summary statistic that there are many regions where >50% of households do not have toilets.

4.12.2013

Bruce Hansen's Econometrics textbook

Dave Giles over at Econometrics Beat points out that the new version of Bruce Hansen's Ph.D.-level econometrics textbook is now available. It's freely available as a pdf in both standard and iPad formats, and flipping through it a bit, it seems to be quite readable. I particularly like the opening quote from Ragnar Frisch, first editor of Econometrica and, apparently, progenitor of the term "econometrics":

"[T]here are several aspects of the quantitative approach to economics, and no single one of these aspects, taken by itself, should be confounded with econometrics. Thus, econometrics is by no means the same as economic statistics. Nor is it identical with what we call general economic theory, although a considerable portion of this theory has a definitely quantitative character. Nor should econometrics be taken as synonymous with the application of mathematics to economics. Experience has shown that each of these three view-points, that of statistics, economic theory, and mathematics, is a necessary, but not by itself a sufficient, condition for a real understanding of the quantitative relations in modern economic life. It is the unification of all three that is powerful. And it is this unification that constitutes econometrics."

Topics covered are below the fold.

3.28.2013

Some like it hot... but not too hot

This paper has been in the works for some time now. It's innovative and important, with very pretty graphs!

Climate Amenities, Climate Change, and American Quality of Life
David Albouy, Walter Graf, Ryan Kellogg, and Hendrik Wolff

The chemistry of the human body makes our health and comfort sensitive to climate. Every day, climate influences human activity, including diet, chores, recreation, and conversation. Geographically, climate impacts the desirability of different locations and the quality of life they offer; few seek to live in the freezing tundra or oppressively hot deserts. This paper estimates the dollar value American households place on climate amenities, including sunshine, precipitation, humidity, and especially temperature. Valuing climate amenities not only helps us to understand how climate affects welfare and where people live, but also helps to inform policy responses to climate changes. 
Using a quality of life measure that is carefully constructed from local wage and housing price differentials, the authors find that Americans favor an average daily temperature of 65 degrees, tend to dislike marginal increases in heat more than marginal increases in cold, and care less about marginal changes in outdoor temperature once the temperature is sufficiently uncomfortable that they are unlikely to go outside. These preferences vary by location, reflecting people's preferences for warmer or colder climates. Changes in climate amenities under business-as-usual climate change predictions imply annual welfare losses of 1 to 3 percent of income by 2100, holding technology and preferences constant.





2.14.2013

Temporal and spatial dynamics of subnational climate shocks and human conflict

There are many groups trying to use gridded subnational data to understand how climatic shocks generate social conflict and to verify larger-scale analyses.  However up to now, it hasn't felt like anyone had done a particularly strong analysis where the local-level data provided new insight beyond validating earlier large-scale studies at a smaller scale.  This new working paper is the best high-resolution analysis that I've seen so far.

2.06.2013

Temperature and Jewish persecution

"Recommendations" on Google scholar is getting very good. Here's a new working paper out that it flagged for me, and it hit remarkably close to home  -- it's the most recent working paper since this one that I emailed to my parents. (We've posted on climate shocks on and human conflict many previous times.)

12.11.2012

Is it true that "Everyone's a winner?" Dams in China and the challenge of balancing equity and efficiency during rapid industrialization

Jesse and I both come from the Sustainable Development PhD Program at Columbia, which has once again turned out a remarkable crop of job market candidates (see outcomes from 2012 and 2011). We both agreed that their job market papers were so innovative, diverse, rigorous and important that we wanted to feature them at FE.  Their results are striking and deserve dissemination (we would probably post them anyway even if the authors weren't on the market), but they also clearly illustrate what the Columbia program is all about. (Apply to it here, hire one of these candidates here.) Here is the third and final post.

Large infrastructure investments are important for large-scale industrialization and economic development. Investments in power plants, roads, bridges and telecommunications, among others, provide important returns to society and are complements to many types of private investment. But during rapid industrialization, as leaders focus on growth, there is often concern that questions of equity are cast aside. In the case of large-scale infrastructure investments, there are frequently populations ("losers") that suffer private costs when certain types of infrastructure are built -- for example, people whose homes are in the path of a new highway or who are affected by pollution from a power plant.

In public policy analysis and economics, we try to think objectively about the overall benefits of large investments to an entire society, keeping in mind that there will usually be some "losers" from the new policy in addition to a (hopefully larger) group of "winners."  In the cost-benefit analysis of large projects, we usually say that a project is worth doing if the gains to the winners outweigh the losses to the losers -- making the implicit assumption that somehow the winners can compensate the losers for their losses and continue to benefit themselves. In cases where the winners compensate the losers enough that their losses are fully offset (i.e. they are no longer net losers), we say that the investment is "Pareto improving" because nobody is made worse off by the project.

A Pareto improving project is probably a good thing to do, since nobody is hurt and probably many people benefit. However, in the case of large infrastructure investments, it is almost guaranteed that some groups will be worse off because of the project's effects, so making sure that everyone benefits from these projects will require that the winners actually compensate the losers. Occasionally this occurs privately, but that tends to be uncommon, so with large-scale projects we often think that a central government authority has a role to play in transferring some of the benefits from the project away from the winners and towards the losers.

But do these transfers actually occur? In a smoothly functioning government, one would hope so.  But the governments of rapidly developing countries don't always have the most experienced regulators, and pathologies like corruption often lead to doubt as to whether large financial transfers will be successful.  Empirically, we have little to no evidence as to whether governments in rapidly industrializing countries (1) accurately monitor the welfare lost by losers in the wake of large projects and (2) have the capacity necessary to compensate these losers for their losses. Thus, establishing whether governments can effectively compensate losers is important for understanding whether large-scale infrastructure investments can be made beneficial (or at least "not harmful") for all members of society.

Xiaojia Bao investigates this question for the famous and controversial example of dams in China. Over the last few decades, a large number of hydroelectric dams have been built throughout China. These dams are an important source of power for China's rapidly growing economy, but they can also lead to inundation upstream, a reduction in water supply downstream, and a slowed flow of water that leads to an accumulation of pollutants both upstream and downstream.

Bao asks whether the individuals who are adversely affected by new dams are compensated for their losses. To do this, she obtains data on dams and municipal-level data on revenue and transfers from the central government.   She uses geospatial analysis to figure out which municipalities are along rivers that are dammed and also which are upstream, downstream or at the dam site.  She then compares how the construction of a new dam alters the distribution of revenues and federal transfers to municipalities along the dammed river, in comparison to adjacent municipalities that are not on the river.
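To fix ideas, here is a sketch of the kind of two-way fixed-effects difference-in-differences comparison described above, using simulated data and hypothetical variable names (this is not Bao's code; her paper works with the actual county panel and richer distance measures):

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
counties = pd.DataFrame({
    "county": range(200),
    # position relative to a dammed river; "control" = adjacent off-river county
    "position": rng.choice(["upstream", "dam_site", "downstream", "control"], 200),
})
panel = counties.merge(pd.DataFrame({"year": list(range(1996, 2011))}), how="cross")
panel["post_dam"] = (panel["year"] >= 2003).astype(int)

# Simulated truth mimicking the abstract: upstream counties lose roughly 16% of
# revenue after the dam, dam-site counties gain a similar amount.
effect = {"upstream": -0.16, "dam_site": 0.16, "downstream": 0.0, "control": 0.0}
panel["log_revenue"] = (
    panel["position"].map(effect) * panel["post_dam"]
    + rng.normal(0, 0.1, len(panel))
)

for p in ["upstream", "dam_site", "downstream"]:
    panel[f"post_x_{p}"] = panel["post_dam"] * (panel["position"] == p).astype(int)

# County and year fixed effects absorb level differences and common shocks;
# the interaction terms estimate post-construction revenue changes for each
# river position relative to the off-river controls.
m = smf.ols(
    "log_revenue ~ post_x_upstream + post_x_dam_site + post_x_downstream"
    " + C(county) + C(year)",
    data=panel,
).fit(cov_type="cluster", cov_kwds={"groups": panel["county"]})
print(m.params[["post_x_upstream", "post_x_dam_site", "post_x_downstream"]])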

Bao finds that the Chinese government has been remarkably good at compensating those communities who suffer when dams are built.  Municipalities upstream of a dam lose the most revenue, both while the dam is being built and after it becomes operational. But at the same time, the central government increases transfers to those municipalities sufficiently that they suffer no net loss in revenue. In contrast, populations just downstream look like they benefit slightly from the dam's operation, increasing their revenue -- and it appears that the central government is also good at reducing transfers to those municipalities so that these gains are effectively "taxed away." The only clear net winners are the municipalities that host the actual dam itself, as their revenue rises and the central government provides them with additional transfers during a dam's construction.

These findings are important because we often worry that large-scale investment projects may exacerbate existing patterns of inequality, as populations that are already marginalized are saddled with new burdens for the sake of the "greater good." However, in cases where governments can effectively distribute the benefits from large projects so that no group is made worse off, then we should not let this fear prevent us from making the socially-beneficial investments in infrastructure that are essential to long run economic development.

The paper:
Dams and Intergovernmental Transfer: Are Dam Projects Pareto Improving in China?
Xiaojia Bao  
Abstract: Large-scale dams are controversial public infrastructure projects due to the unevenly distributed benefits and losses to local regions. The central government can make redistributive fiscal transfers to attenuate the impacts and reduce the inequality among local governments, but whether large-scale dam projects are Pareto improving is still a question. Using the geographic variation of dam impacts based on distances to the river and distances to dams, this paper adopts a difference-in-difference approach to estimate dam impacts at county level in China from 1996 to 2010. I find that a large-scale dam reduces local revenue in upstream counties significantly by 16%, while increasing local revenue by similar magnitude in dam-site counties. The negative revenue impacts in upstream counties are mitigated by intergovernmental transfers from the central government, with an increase rate around 13% during the dam construction and operation periods. No significant revenue and transfer impacts are found in downstream counties, except counties far downstream. These results suggest that dam-site counties benefit from dam projects the most, and intergovernmental transfers help to balance the negative impacts of dams in upstream counties correspondingly, making large-scale dam projects close to Pareto improving outcomes in China.
In figures...

In China, Bao obtains the location, height, and construction start/stop dates for all dams built before 2010.


For every dam, Bao follows the corresponding river and calculates which municipalities are "upstream" and which are "downstream." She then finds comparison "control" municipalities that are adjacent to these "treatment" municipalities (to account for regional trends). Here is an example for a single dam:


Bao estimates the average effect of dam construction (top) and operation (bottom) on municipal revenues as a function of distance upstream (left) or downstream (right).  Locations just upstream lose revenue, perhaps from losing land (inundation) or pollution. Locations at the dam gain revenue, perhaps because of spillovers from dam-related activity (e.g. consumer spending). During operation, downstream locations benefit slightly, perhaps from flood control.


Government transfers during construction/operation upstream/downstream. Upstream locations receive large positive transfers. Municipalities at the dam receive transfers during construction. Downstream locations lose some transfers (taxed away).


Transfers (y-axis) vs. revenue (x-axis) for locations upstream/downstream and at the dam site, during dam construction. Locations are net "winners" if they are northeast of the grey triangle. Upstream municipalities are more than compensated for their lost revenue through transfers.   Municipalities at the dam site benefit through revenue increases and transfers.


Same, but for dam operation (after construction is completed). Upstream locations are compensated for losses. Benefits to downstream locations are taxed away. Dam-site locations are net "winners".


12.05.2012

Urban bus pollution and infant health: how New York City's smog reduction program generates millions of dollars in benefits

Jesse and I both come from the Sustainable Development PhD Program at Columbia, which has once again turned out a remarkable crop of job market candidates (see outcomes from 2012 and 2011). We both agreed that their job market papers were so innovative, diverse, rigorous and important that we wanted to feature them at FE.  Their results are striking and deserve dissemination (we would probably post them anyway even if the authors weren't on the market), but they also clearly illustrate what the Columbia program is all about. (Apply to it here, hire one of these candidates here.) Here is the second post.


Around the world, diesel-powered vehicles play a major role in moving people and goods. In particular, buses are heavily utilized in densely populated cities where large numbers of people are exposed to their exhaust. If bus exhaust has an impact on human health, then urban policy-makers would want to know this since it will affect whether or not it's worth it to invest in cleaner bus technologies. Upgrading the quality of public transport systems is usually expensive, but upgrading could have potentially large benefits since so many people live in dense urban centers and are exposed to their pollution. Deciding whether or not to invest in cleaner bus technologies is an important policy decision made by city officials, since buses aren't replaced very often and poor choices can affect city infrastructure for decades -- so it's important that policy-makers know what the trade-offs are when they make these decisions.

Unfortunately, to date, it has been extremely difficult to know if there are any effects of bus pollution on human health because cities are complex and bustling environments where people are constantly exposed to all sorts of rapidly changing environmental conditions. As one might imagine, looking at a city of ten-million people, each of whom is engaged daily in dozens of interacting activities, and trying to disentangle the web of factors that affect human health to isolate the effect of bus pollution is a daunting task. To tackle this problem, we would need to assemble a lot of data and conduct a careful analysis. This is exactly what Nicole Ngo has done.

Between 1990 and 2010,  New York City made major investments that transformed the city's bus fleet, reducing its emissions dramatically. To study the impact of this policy on human health, Ngo assembled a new massive data set that details exactly which bus drove on which route at what time every single day. Because the city's transition from dirty buses to clean buses occurred gradually over time, and because the dispatcher at the bus depot randomly assigns buses to different routes at different times, the people who live along bus routes were sometimes exposed to exhaust from dirtier buses and sometimes exposed to exhaust from clean buses.  By comparing health outcomes in households that are randomly exposed to the dirtier bus pollution with comparable households randomly exposed to cleaner bus pollution, Ngo can isolate the effect of the bus pollution on health.

In this paper, Ngo focuses on infant health (although I expect she will use this unique data set to study many more outcomes in the future) and measures the effect of a mother's exposure to bus pollution during pregnancy on a child's health at birth.  This is a hard problem, since it's impossible to know exactly all the different things that a mother does while she's pregnant and because Ngo has to use pollution data collected from air-quality monitors to model how pollution spreads from bus routes to nearby residences.  Despite these challenges, Ngo is able to detect the effect of in utero exposure to bus pollution on an infant's health at birth.  Fetuses that are exposed to higher levels of bus-generated nitrogen oxides (NOx) during their second and third trimester have a lower birthweight on average, and fetuses exposed to more bus-generated particulate matter (PM) during those trimesters have a lower Apgar 5 score (a doctor's subjective evaluation of newborn health).

The size of the effects that Ngo measures is relatively small for any individual child (so if you are pregnant and living near a bus route, you shouldn't panic).  But the aggregate effect of New York City's investment in clean buses is large, since there are many pregnant mothers who live near bus routes and who were exposed to less dangerous emissions because of these policies. Since it's easiest to think about city-wide impacts using monetized measures, and because previous studies have demonstrated that higher birth weight causes an infant's future income to be higher, Ngo aggregates these small impacts across many babies and estimates that the city's effort to upgrade buses increased the total future earnings of these children by $66 million. Considering that the city upgraded roughly 4,500 buses, this implies that each bus that was upgraded generated about $14,600 in value just through its influence on infant health and future earnings. Importantly however, Ngo notes:
This [benefit] is likely a lower bound since I do not consider increased hospitalizations costs from lower birth weights as discussed in Almond et al. (2005), nor could I find short-run or long-run costs associated with lower Apgar 5 scores.
and I expect that Ngo will uncover additional health benefits of New York City's bus program, which will likely increase estimates for the program's total benefits. Furthermore, I suspect that these estimates for the value of pollution control can be extrapolated to diesel trucks, although Ngo is appropriately cautious about doing so in her formal analysis.

These results are important for urban planners and policy-makers in cities around the world who must decide whether or not it is worth it to invest in cleaner public transit systems.  In addition, they are an excellent example of how great data and careful analysis can help us understand important human-environment relationships in complex urban systems.

The paper:
Transit buses and fetal health: An evaluation of bus pollution policies in New York City 
Nicole Ngo
Abstract: The U.S. Environmental Protection Agency (EPA) reduced emission standards for transit buses by 98% between 1988 and 2010. I exploit the variation caused by these policy changes to evaluate the impacts of transit bus pollution policies on fetal health in New York City (NYC) by using bus vintage as a proxy for street-level bus emissions. I construct a novel panel data set for the NYC Transit bus fleet to assign maternal exposure to bus pollution at the census block level. Results show a 10% reduction in emission standards for particulate matter (PM) and nitrogen oxides (NOx) during pregnancy increased infant Apgar 5 scores by 0.003 points and birth weight by 6.6 grams. While the impacts on fetal health are modest, the sensitivity of later-life outcomes to prenatal conditions suggests improved emission standards between 1990 and 2009 have increased total earnings for the 2009 birth cohort who live near bus routes in NYC by at least $65.7 million.
In figures...

Bus routes in New York City, which Ngo links to residential exposure through geospatial analysis:


Buses are upgraded throughout the two decades, with several large and abrupt changes in the fleet's composition:


When dirtier buses are randomly assigned to travel a route, Ngo can detect this using air-monitoring stations near that route:


Using her mathematical model of bus pollution (and its spatial diffusion), Ngo computes how New York City's investment in buses led to a dramatic reduction in exposure to bus-generated pollutants:



Exposure to bus-generated NOx during the second and third trimesters lowers birthweight, and exposure to bus-generated PM lowers Apgar5 scores:



11.12.2012

Were the cost estimates for Waxman-Markey overstated by 200-300%?


Jesse and I both come from the Sustainable Development PhD Program at Columbia, which has once again turned out a remarkable crop of job market candidates (see outcomes from 2012 and 2011). We both agreed that their job market papers were so innovative, diverse, rigorous and important that we wanted to feature them at FE.  Their results are striking and deserve dissemination (we would probably post them anyway even if the authors weren't on the market), but they also clearly illustrate what the Columbia program is all about. (Apply to it here, hire one of these candidates here.) This is the first post.

Good policy requires good cost-benefit analysis. But when we are developing innovative policies, like those used to curb greenhouse gas emissions, it's notoriously difficult to estimate both costs and benefits since no analogous policies have ever been implemented before.  The uncertainty associated with costs and benefits tends to make many forms of environmental policy difficult to implement in part because the imagined costs (when policy-makers are considering a policy) tend to exceed actual costs (what we observe after policies are actually implemented). Kyle Meng develops an innovative approach, linking Intrade predictions about the success of Waxman-Markey with stock-market returns and abrupt political events, to measure the cost of the bill to firms as predicted by the market. This is very different from standard technocratic approaches used by the government to assess the cost of future policies, which rely on parameterized models of technology and econometric models of behavior ("structural models").

By relying on the market, Meng infers what players in affected industries actually expect to happen in their own industry. The result is a bit surprising: Meng estimates that standard cost estimates for WM (produced before it failed to pass) are 200-300% larger than what players in the industry actually expected it to cost them.  But this still didn't stop industry players from fighting the bill -- one of the ways that Meng validates his approach is to use lobbying records to show that firms which expect to suffer more from the bill (as recovered using his approach) spend more money to fight it.

It's tough to tell whether Meng's approach or the structural models are more accurate predictors of firm-level costs since WM was never brought into law, so the outcomes will remain forever unobserved. But he does show that for several similar laws (e.g. the Montreal Protocol), the structural predictions tended to overestimate the actual costs of implementation (which were observed after the law was implemented and outcomes observed) by roughly a factor of two. This doesn't prove that Meng's approach is more accurate, but it shows that his estimate for the bias of the structural approach (with regard to WM) is consistent with the historical biases of these models.
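For intuition, here is a stylized version of the event-study logic with simulated data (hypothetical variable names; Meng's actual procedure is far more careful about abnormal returns, event windows, and the scaling needed to turn coefficients into dollar costs): if a firm's stock falls when the prediction market's probability of passage rises, the slope of that relationship estimates the market's expected proportional cost of the bill to that firm.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
days = 1000
df = pd.DataFrame({
    "d_prob": rng.normal(0, 0.03, days),         # daily change in P(bill passes)
    "market_return": rng.normal(0, 0.01, days),  # broad market return
})
true_cost = -0.02  # simulated truth: passage would cut firm value by 2%
df["firm_return"] = (
    true_cost * df["d_prob"] + 1.0 * df["market_return"] + rng.normal(0, 0.003, days)
)

# Controlling for the market, the coefficient on d_prob is the expected
# proportional change in firm value per unit change in the probability of
# passage, i.e. the market's expected cost if passage becomes certain.
m = smf.ols("firm_return ~ d_prob + market_return", data=df).fit()
print(m.params["d_prob"])  # should recover a value near the simulated -0.02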

The paper:

The Cost of Potential Cap-and-Trade Policy: An Event Study using Prediction Markets and Lobbying Records
Kyle Meng
Abstract: Efforts to understand the cost of climate policy have been constrained by the limited number of policies available for evaluation. This paper develops an empirical method for forecasting the expected cost to firms of a proposed climate policy that was never realized. I combine prediction market prices, which reflect market beliefs over regulatory prospects, with stock returns in order to estimate the expected cost to firms of the Waxman-Markey cap-and-trade bill, had it been implemented. I find that Waxman-Markey would have reduced the market value of a listed firm by an average of 2.0%, resulting in a total cost of $165 billion for all listed firms. The strongest effects are found in sectors with greater carbon and energy intensity, import penetration, and exposure to U.S. product markets, and in sectors granted free allowances. Because the values of unlisted firms are not observed, I use firm-level lobbying expenditures within a partial identification framework to obtain bounds for the costs borne by unlisted firms. This procedure recovers a total cost to all firms between $110 and $260 billion. I conclude by comparing estimates from this method with Waxman-Markey forecasts by prevailing computable general equilibrium models of climate policy.
In figures...

Abrupt political events that affect the expected success of WM are quantified by looking at expectations in Intrade markets:


When WM appears more likely, the stock prices of CO2 intensive firms fall on average:


Firms that are more CO2 intensive are affected more strongly:


Firms whose stock prices are more responsive to WM lobby harder against it:


How these cost estimates compare with structural cost estimates, and similar statistics for historical regulations that actually passed into law.


Take home summary: Cap and trade in the USA probably would have been cheaper to implement than we thought, according to the firms it was going to regulate. 

11.07.2012

An American, a Canadian and a physicist walk into a bar with a regression... why not to use log(temperature)

Many of us applied statisticians like to transform our data (prior to analysis) by taking the natural logarithm of variable values.  This transformation is clever because it transforms regression coefficients into elasticities, which are especially nice because they are unitless. In the regression

log(y) = b * log(x)

b represents the percentage change in y that is associated with a 1% change in x. But this transformation is not always a good idea.  

I frequently see papers that examine the effect of temperature (or control for it because they care about some other factor) and use log(temperature) as an independent variable.  This is a bad idea because a 1% change in temperature is an ambiguous value. 

Imagine an author estimates

log(Y) = b*log(temperature)

and obtains the estimate b = 1. The author reports that a 1% change in temperature leads to a 1% change in Y. I have seen this done many times.

Now an American reader wants to apply this estimate to some hypothetical scenario where the temperature changes from 75 Fahrenheit (F) to 80 F. She computes the change in the independent variable  D:

D_American = log(80) - log(75) = 0.065

and concludes that because temperature is changing 6.5%, then Y also changes 6.5% (since 0.065*b = 0.065*1 = 0.065).

But now imagine that a Canadian reader wants to do the same thing.  Canadians use the metric system, so they measure temperature in Celsius (C) rather than Fahrenheit. Because 80F = 26.67C and 75F = 23.89C, the Canadian computes

D_Canadian = log(26.67) - log(23.89) = 0.110

and concludes that Y increases 11%.

Finally, a physicist tries to compute the same change in Y, but physicists use Kelvin (K) and 80F = 299.82K and 75F = 297.04K, so she uses

D_physicist = log(299.82) - log(297.04) = 0.009

and concludes that Y increases by a measly 0.9%.

What happened? Usually we like the log transformation because it makes units irrelevant. But here changes in units dramatically changed the prediction of this model, causing it to range from 0.9% to 11%! 

The answer is that the log transformation is a bad idea when the value x = 0 is not anchored to a unique [physical] interpretation. When we change from Fahrenheit to Celsius to Kelvin, we change the meaning of "zero temperature" since 0 F does not equal 0 C which does not equal 0 K.  This causes a 1% change in F to not have the same meaning as a 1% change in C or K.   The log transformation is robust to a rescaling of units but not to a recentering of units.

For comparison, log(rainfall) is an okay measure to use as an independent variable, since zero rainfall is always the same, regardless of whether one uses inches, millimeters or Smoots to measure rainfall.
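For anyone who wants to check the arithmetic, here is the same exercise in a few lines of Python (my own illustration): rescaling units leaves a log-difference unchanged, while recentering units does not.

import numpy as np

def log_change(x0, x1):
    return np.log(x1) - np.log(x0)

# The same physical warming, 75F -> 80F, expressed in three unit systems:
print(log_change(75.0, 80.0))      # Fahrenheit: ~0.065
print(log_change(23.89, 26.67))    # Celsius:    ~0.110
print(log_change(297.04, 299.82))  # Kelvin:     ~0.009

# The same physical change in rainfall, 100 mm -> 150 mm, in two unit systems:
print(log_change(100.0, 150.0))            # millimeters: ~0.405
print(log_change(100 / 25.4, 150 / 25.4))  # inches:      ~0.405 (identical)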

10.31.2012

Hurricanes and the social safety net in US counties


The social safety net catches people after a hurricane, but this cost to society is generally not accounted for in standard estimates of a hurricane's economic impact.

The Role of Transfer Payments in Mitigating Shocks: Evidence from the Impact of Hurricanes
Tatyana Deryugina
Abstract: Little is known empirically about how aggregate economic shocks are mitigated by social safety nets. I examine the effect of hurricanes on US counties. While I find no significant changes in population, earnings, and the employment rate 0-10 years after landfall, there is a substantial increase in non-disaster government transfers. An affected county receives additional non-disaster government transfers totaling $654 per capita, which suggests that the lack of changes in basic economic indicators may be in part due to existing social safety nets. The fiscal costs of natural disasters are also much larger than the cost of disaster aid alone.



Deryugina writes:
The number of construction firm locations (establishments) declines by 1.6% each year with no change in the mean. Construction employment is on average 7.6% lower in the ten years following the hurricane, and declines by 2.0% per year. The overall decline in employment suggests a drop in construction demand. This is confirmed by estimates of per capita single family housing starts, which are 8% lower on average. Wages increase by an average of 6.8%, but then fall by 0.9% each year, suggesting there may be a temporary change in the composition of construction labor demand (e.g., more demand for specialized workers) or lower labor supply… 
One possible interpretation of the decline in the local construction sector is spatial: the construction industry may have simply moved to nearby counties without any net effect on the sector. The implications of spatial changes, while non-trivial for the local economy, are different than if there’s a widespread capital shock. However, the fall in per capita housing starts provides evidence of a significant decrease in construction demand. Thus, the downturn in the local construction sector is not solely driven by spatial shifts in construction activity. 
There is no change in the employment rate or per capita net earnings. Using 95% confidence bounds, I can rule out a decrease in mean earnings greater than 1.8% and a decrease in the mean employment rate greater than 0.5%. The mean shift test for transfers indicates a 2.1% average increase in per capita government to individual transfers, equivalent to about $69 per person per year. Per capita business to individual transfers in the eleven years following the hurricane are estimated to be 4.8% higher than the pre-hurricane transfers, or about $3.9 per year. There are no significant changes in the trends of any of these variables. Assuming a 3% discount rate, the present discounted value (PDV) of all government transfers is about $654 per capita, and the PDV of transfers from businesses is $37 per capita. Thus, post-hurricane transfers from general social programs are larger than transfers from disaster-specific programs and much larger than insurance payments. Because the non-disaster transfers are still significantly larger 10 years after the hurricane, the estimate of $654 per capita should be viewed as a lower bound.

The subcomponents of total government transfers to individuals are: retirement and disability insurance benefits (which includes workers’ compensation), medical benefits (which includes Medicare and Medicaid), income maintenance (which includes Supplemental Security Income, family assistance, and food stamps), unemployment benefits, veterans’ benefits, and federal education assistance. A separate analysis of each of these components (following the same procedure as for total transfers) reveals that increases in medical and unemployment benefits explain the overwhelming majority of the net increase in total non-disaster transfers. Specifically, public medical benefits increase significantly by $435 per capita in PDV, of which $106 is Medicare spending. The estimated change in Medicare spending is not significant. Because there is no significant increase in Medicare spending, the increase in public medical spending is likely due to changes in the number of people eligible for public medical benefits rather than increased medical spending on existing recipients. 
Unemployment benefits increase by about $280 per capita in PDV. There is no significant change in aggregate income maintenance (although some subcomponents, such as family assistance, do increase slightly) and no significant change in retirement and disability insurance benefits, per capita federal education assistance, or per capita veteran benefits. Thus, the majority of the increase in transfers is accounted for by unemployment insurance and public medical benefits.
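A quick back-of-the-envelope check connects the per-year and present-value figures quoted above: an extra $69 per person per year in government transfers (and about $3.9 per year from businesses), sustained over the eleven post-hurricane years and discounted at 3%, gives approximately the reported PDVs.

def pdv(per_year, rate=0.03, years=11):
    # present discounted value of a constant annual flow, first payment undiscounted
    return sum(per_year / (1 + rate) ** t for t in range(years))

print(round(pdv(69.0)))  # ~658, in line with the reported $654 per capita
print(round(pdv(3.9)))   # ~37, matching the reported $37 per capita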