Visually-Weighted Regression

[This is the overdue earth-shattering sequel to this earlier post.]

I recently posted this working paper online. It's very short, so you should probably just read it (I was actually originally going to write it as a blog post), but I'll run through the basic idea here.  

Since I'm proposing a method, I've written functions in Matlab (vwregress.m) and Stata (vwlowess.ado) to accompany the paper. You can download them here, but I expect that other folks can do a much better job implementing this idea.

Solomon M. Hsiang
Abstract: Uncertainty in regression can be efficiently and effectively communicated using the visual properties of regression lines. Altering the "visual weight" of lines to depict the quality of information represented clearly communicates statistical confidence even when readers are unfamiliar or reckless with the formal and abstract definitions of statistical uncertainty. Here, we present an example by decreasing the color-saturation of nonparametric regression lines when the variance of estimates increases. The result is a simple, visually intuitive and graphically compact display of statistical uncertainty. This approach is generalizable to almost all forms of regression.
Here's the issue. Statistical uncertainty seems to be important for two different reasons. (1) If you have to make a decision based on data, you want to have a strong understanding of the possible outcomes that might result from your decision, which itself rests on how we interpret the data. This is the "standard" logic, I think, and it requires a precise, quantitative estimate of uncertainty. (2) Because there is noise in data, and because sampling is uneven across independent variables, a lot of data analysis techniques generate artifacts that we should mostly just ignore. We are often unnecessarily focused on, or intrigued by, the wackier results that show up in analyses, but thinking carefully about statistical uncertainty reminds us not to focus too much on these features. Except when it doesn't.

"Visually-weighted regression" is a method for presenting regression results that tries to address issue (2), taking a normal person's psychological response to graphical displays into account. I had grown a little tired of talks and referee reports where people speculate about the cause of some strange non-linearity at the edge of a regression sample, where there was no reason to believe the non-linear structure was real. I think this and related behaviors emerge because (i) there seems to be an intellectual predisposition to thinking that "nonlinearity" is inherently more interesting than "linearity" and (ii) the traditional method for presenting uncertainty subconsciously focuses viewers' attention on features of the data that are less reliable. I can't solve issue (i) with data visualization, but we can try to fix (ii).

The goal of visually-weighted regression is to take advantage of viewers' psychological response to images in order to focus their attention on the results that are the most informative. "Visual weight" is a concept from art and graphic design that is used to direct a viewer's focus within an image. Large, dark, high-contrast, and complex structures tend to "grab" a viewer's attention. Our brains are constantly looking for visual information and, somewhere along the evolutionary line, detailed/high-contrast structures in our field of view were probably more informative and more useful for survival, so we are programmed to give them more of our attention. Unfortunately, the traditional approaches to displaying statistical uncertainty give more visual weight to the uncertain portions of the analysis, which is exactly backwards. Ideally, a viewer will focus more of their attention on the portions of analysis that have some statistical confidence and will mostly ignore the portions of analysis that are so uncertain that they contain little or no information.
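
To make the idea concrete, here is a rough Python sketch of one way to turn estimate uncertainty into color saturation. It is not a port of vwregress.m or vwlowess.ado. The Gaussian kernel smoother, the bootstrap standard errors, the weighting rule (smallest standard error divided by local standard error) and every function name below are my own assumptions, chosen only for illustration.

```python
import colorsys
import math
import random


def kernel_smooth(xs, ys, grid, bandwidth):
    """Gaussian-kernel (Nadaraya-Watson) estimate of E[y|x] at each grid point."""
    fitted = []
    for g in grid:
        weights = [math.exp(-0.5 * ((x - g) / bandwidth) ** 2) for x in xs]
        total = sum(weights)
        fitted.append(sum(w * y for w, y in zip(weights, ys)) / total)
    return fitted


def bootstrap_se(xs, ys, grid, bandwidth, n_boot=100, seed=0):
    """Standard error of the smoothed line at each grid point, via resampling."""
    rng = random.Random(seed)
    n = len(xs)
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(kernel_smooth([xs[i] for i in idx],
                                   [ys[i] for i in idx], grid, bandwidth))
    ses = []
    for j in range(len(grid)):
        column = [d[j] for d in draws]
        mean = sum(column) / n_boot
        ses.append(math.sqrt(sum((v - mean) ** 2 for v in column) / (n_boot - 1)))
    return ses


def visual_weights(ses):
    """Assumed rule: weight 1 where the estimate is most precise, less elsewhere."""
    smallest = min(ses)
    return [smallest / se if se > 0 else 1.0 for se in ses]


def saturation_color(weight, hue=0.6):
    """RGB color whose saturation equals the visual weight (0 gives white)."""
    return colorsys.hsv_to_rgb(hue, weight, 1.0)
```

Each segment of the regression line would then be drawn in `saturation_color(w)` for its local weight `w`, so the sparsely sampled edges of the sample fade toward white while the well-estimated middle stays vivid. The same weights could just as easily drive line width or opacity.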

[continued below the fold]


G-FEED blog is live!

David Lobell, Michael Roberts, Wolfram Schlenker, Jarrod Welch and I have started a blog on Global Food, Environment and Economic Dynamics (G-FEED). It's a complement to Fight Entropy (not a substitute) and will focus primarily on food production around the world and its relationship to the environment and economics.

check it out: www.G-FEED.com


Temperature and infrastructure

Once while presenting this paper on temperature's influence on economic performance, someone in the audience asked whether any of the observed declines in output could be due to stress on infrastructure. I honestly replied that I didn't know, but that it seemed like a possibility.  If high temperatures began to interfere with the structure or integrity of steel, concrete or other materials used in infrastructure, existing systems might begin to slow down or fail.

Apparently, this mechanism is starting to become an issue. One of today's cover stories in the New York Times described various infrastructure failures that are emerging around the country as effects of the persistent and extreme heat. Some highlights:
On a single day this month here, a US Airways regional jet became stuck in asphalt that had softened in 100-degree temperatures, and a subway train derailed after the heat stretched the track so far that it kinked — inserting a sharp angle into a stretch that was supposed to be straight. In East Texas, heat and drought have had a startling effect on the clay-rich soils under highways, which “just shrink like crazy,” leading to “horrendous cracking....” 
Excessive warmth and dryness are threatening other parts of the grid as well. In the Chicago area, a twin-unit nuclear plant had to get special permission to keep operating this month because the pond it uses for cooling water rose to 102 degrees; its license to operate allows it to go only to 100....
When railroads install tracks in cold weather, they heat the metal to a “neutral” temperature so it reaches a moderate length, and will withstand the shrinkage and growth typical for that climate. But if the heat historically seen in the South becomes normal farther north, the rails will be too long for that weather, and will have an increased tendency to kink. 

I don't know of any work on the economic or social impact of these types of failures. And I similarly don't know of any theory explaining how we ought to alter our patterns of infrastructure investment, based on the realization that this will continue into the future. The NYT article describes a few ad hoc adaptive measures that companies are starting to adopt, but since the lifetime of new infrastructure will extend into 2040 (or longer), we would do well to plan. This seems like an area ripe for research.


Less is more

There are many things we can do to make our research clearer to readers: make our text well organized and accessible, make clear graphs, consider the psychology of our readers, and use as little math as is necessary to explain our point. Clarity and elegance trump formalism and detail.

Intimidated by Equations?
Barbara R. Jasny
Although there is general agreement on the value of a strong tie between theory and data, forging links between theoretical and empirical approaches (and practitioners) is not as straightforward as it should be. New evidence of this disconnect comes from the work of Fawcett and Higginson, who examined the use of mathematical equations in 649 papers dealing with ecology and evolution that were published in 1998. They gathered citation data, excluding instances of self-citation. An increase in the number of equations per page of main text corresponded to a lower rate of citations. Overall, each additional equation in the main text of a paper was associated with a 28% decrease in the citation rate. Burying the equations in an appendix had a salutary effect on citation rate. When the citing papers were divided into theoretical and nontheoretical on the basis of their use of the word "model" in the abstract or title, the authors observed that the negative effect was due to the nontheoretical papers not citing papers with equations. There are caveats to the conclusions—examinations over longer periods of time, analysis of the relative content of the papers, and examination of the effect for online rather than print publication are all warranted. Although the authors conclude that better math education for biologists is the best long-term solution, they suggest that more immediate strategies could include the addition of explanatory text between equations.
The full PNAS article is here.

h/t Marshall Burke


Does climate affect conflict? Evidence from Shakespeare

Mark Cane sends us this:
ROMEO and JULIET         ACT 3, SCENE 1a 
[A street. MERCUTIO, BENVOLIO & Servants] 
I pray thee, good Mercutio, let's retire.
The day is hot, the Capulets abroad,
And if we meet we shall not 'scape a brawl,
For now these hot days is the mad blood stirring.
[And later they do meet the Capulets and Tybalt kills Mercutio, then Romeo kills Tybalt and the rest is tragedy.]
For more fun evidence on the psychological effect of heat on aggression, see evidence from road rage and MLB.  Also this.


Using cell phones to track post-disaster population movements in Haiti

Predictability of population displacement after the 2010 Haiti earthquake

Xin Lu, Linus Bengtsson, and Petter Holme

Abstract: Most severe disasters cause large population movements. These movements make it difficult for relief organizations to efficiently reach people in need. Understanding and predicting the locations of affected people during disasters is key to effective humanitarian relief operations and to long-term societal reconstruction. We collaborated with the largest mobile phone operator in Haiti (Digicel) and analyzed the movements of 1.9 million mobile phone users during the period from 42 d before, to 341 d after the devastating Haiti earthquake of January 12, 2010. Nineteen days after the earthquake, population movements had caused the population of the capital Port-au-Prince to decrease by an estimated 23%. Both the travel distances and size of people’s movement trajectories grew after the earthquake. These findings, in combination with the disorder that was present after the disaster, suggest that people’s movements would have become less predictable. Instead, the predictability of people’s trajectories remained high and even increased slightly during the three-month period after the earthquake. Moreover, the destinations of people who left the capital during the first three weeks after the earthquake was highly correlated with their mobility patterns during normal times, and specifically with the locations in which people had significant social bonds. For the people who left Port-au-Prince, the duration of their stay outside the city, as well as the time for their return, all followed a skewed, fat-tailed distribution. The findings suggest that population movements during disasters may be significantly more predictable than previously thought.

h/t Kyle


Early images of earth from space

Everyone knows the famous images of earthrise taken from the moon, but I was surprised to run across this earlier amazing 1955 image from space. They pieced it together from a bunch of snapshots automatically shot through a pinhole in the side of a rocket as it rotated at its apex and fell back to earth.


Also cool is this first TV image from space.

For a sense of our progress in extraterrestrial photography, compare these with the incredibly high-res images from the Suomi satellite released earlier this year.


The “Soft Side” Approach to Countering Violent Extremism

This is a guest post by Daniel P. Aldrich, associate professor of public policy at Purdue University. Prof. Aldrich was an American Association for the Advancement of Science (AAAS) fellow at USAID during the 2011-2012 academic year, and a Fulbright research fellow at the University of Tokyo during the 2012-2013 academic year.  He can be reached at daniel.aldrich@gmail.com, and followed at @DanielPAldrich on Twitter.

Violent extremism and terrorism - involving suicide bombings, improvised explosive device and small arms attacks, narco-trafficking, and kidnapping - have taken center stage for many decision makers in the United States and abroad.  The Worldwide Incidents Tracking System (WITS) established by the National Counterterrorism Center has illuminated a rising trend in the number of armed attacks by terror groups over the past decade.  Scholars (using synthetic case control analysis from Spain) have estimated the high economic costs of terrorism, with a loss of 10% in per capita GDP for individuals in areas with high numbers of terrorist attacks (Abadie and Gardeazabal 2003). Policy makers around the world have prioritized their attempts to end, manage, or handle threats from violent extremist organizations (VEOs) such as al-Qaeda in the Islamic Maghreb (AQIM) in northwest Africa, Lashkar-e-Tayyiba in South Asia, and Abu Sayyaf in the Philippines.

Given the broad agreement that violent extremism is a serious issue, what are the best policy responses to violent extremist organizations? U.S. policymakers have long favored the use of military force, drone strikes, and covert operations as tried-and-true approaches for dealing with extremist groups because they produce clear and immediate results. These tactics bring with them unintended side effects. Even the most advanced unmanned drones using the latest in surveillance and tracking technologies have generated civilian casualties and turned host-nation partners against the United States. Such “collateral damage” further drives many local residents to support anti-American groups and bolsters their claims of encirclement and anti-Muslim bias. National governments and local civilian populations in Pakistan and Yemen provide two unfortunate examples of this phenomenon.
[continued after the break]


Neurological basis for altruism

I don't usually read Nature Neuroscience, but this is an interesting neuro-economics piece.

Dorsolateral and ventromedial prefrontal cortex orchestrate normative choice

Thomas Baumgartner, Daria Knoch, Philine Hotz, Christoph Eisenegger & Ernst Fehr

Abstract: Humans are noted for their capacity to over-ride self-interest in favor of normatively valued goals. We examined the neural circuitry that is causally involved in normative, fairness-related decisions by generating a temporarily diminished capacity for costly normative behavior, a 'deviant' case, through non-invasive brain stimulation (repetitive transcranial magnetic stimulation) and compared normal subjects' functional magnetic resonance imaging signals with those of the deviant subjects. When fairness and economic self-interest were in conflict, normal subjects (who make costly normative decisions at a much higher frequency) displayed significantly higher activity in, and connectivity between, the right dorsolateral prefrontal cortex (DLPFC) and the posterior ventromedial prefrontal cortex (pVMPFC). In contrast, when there was no conflict between fairness and economic self-interest, both types of subjects displayed identical neural patterns and behaved identically. These findings suggest that a parsimonious prefrontal network, the activation of right DLPFC and pVMPFC, and the connectivity between them, facilitates subjects' willingness to incur the cost of normative decisions.

(a) Overlay of the pVMPFC cluster that showed a larger change in connectivity after unfair offers (compared with fair offers) with the right DLPFC in the left compared with the right TMS group (yellow, at P < 0.005, cluster extent = 18 voxels) and the pVMPFC cluster that showed differential activation in the contrast unfair > fair offers in the left compared with the right TMS group (red). Overlapping voxels are displayed in orange. (b) Bar plots based on the functional ROI (red) from a indicate that the differential context-dependent change in connectivity between the left and right TMS group was qualified by a differential change in connectivity during unfair offers (unfair connectivity), but not during fair offers (fair connectivity). The left TMS group therefore only showed an increased connectivity between the right DLPFC and pVMPFC at P < 0.01 during unfair offers, whereas the connectivity between these two brain regions did not change (relative to baseline connectivity) after fair offers. Moreover, after right TMS, the connectivity between right DLPFC and pVMPFC never deviated from the baseline (indicated by the two black bars); that is, these brain regions no longer communicated more after unfair offers. Bar plots depict mean ± s.e.m. [From Nature Neuroscience]