FE has been a little sluggish to recover from break. To kick us back into gear, I'm making good on one resolution by making this FE Week-of-Code. Each day, I'll try to post something useful that I wrote in the past year.
It always bugged me that I could easily plot a linear or quadratic fit in Stata, but if I used a third-order polynomial I could no longer plot the results easily. Stata has built-in commands like lowess, fpfitci and lpolyci that will plot very flexible functions, but those tend to be too flexible for many purposes. Sometimes I just want a function that is flexibly non-linear but still smooth (so not lowess), something I can easily write down analytically (so not fpfit or lpoly), and perhaps not symmetric (so not qfit). We use high-degree polynomials all the time, but we just don't plot them very often (I think this is because there is no built-in command to do it for us).
Here's my function plot_margins.ado that does it. It takes a polynomial that you use in a regression and plots the response function. Added bonuses: it plots the confidence interval you specify and can handle control variables (which are not plotted).
Reed Walker actually came up with this idea in an email exchange we had last year. So he deserves credit for this. I just wrote it up into an .ado file that you can call easily.
Basically, the idea is that you run any regression using Stata's factor variable notation, where you tell Stata that a variable X is continuous and should be interacted with itself, e.g.
reg y c.x##c.x
is the notation to regress Y on X and X^2. (Check out Stata's documentation on factor variables if this isn't familiar to you.)
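To see what the factor variable notation buys you, here is a minimal sketch on made-up data (the seed, sample size and coefficients are all assumptions for illustration). It checks that c.x##c.x gives the same quadratic coefficient as building x^2 by hand. The difference is that with the factor variable version Stata knows the squared term is a function of x, which is what lets margins evaluate the whole polynomial when x changes.

```stata
* Illustrative data only: values are arbitrary
clear
set seed 12345
set obs 200
gen x = 5*runiform() - 2
gen y = 4*x - x^2 + rnormal()

* Factor variable version: Stata builds x and x^2 itself
reg y c.x##c.x
scalar b_fv = _b[c.x#c.x]

* Hand-built version: same fit, but Stata treats x2 as unrelated to x
gen x2 = x^2
reg y x x2
scalar b_hand = _b[x2]

* The two quadratic coefficients should agree up to rounding
display b_fv - b_hand
```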
Reed's idea was to then use Stata 11's new margins command to evaluate the response of Y to X and X^2 at several points along the support of X, and then to use parmest to plot the result. (To download parmest, type "net install st0043.pkg" at the command line in Stata. plot_margins calls parmest, so you need to have it installed to run this function.)
The idea works really well, so long as you have Stata 11 or later (margins was introduced in Stata 11).
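For intuition, here is a rough sketch of the margins step done by hand. This is not the code inside plot_margins.ado; the data, the grid of X values and the robust standard errors are assumptions chosen to keep the example short. It evaluates the fitted polynomial at a grid of points and builds a 95% interval at one of them from the results margins leaves behind in r(b) and r(V).

```stata
* Illustrative data only: values are arbitrary
clear
set seed 2024
set obs 500
gen x = 5*runiform() - 2
gen w = 2*runiform()
gen y = w + 4*x - x^2 + x^3 + 5*rnormal()

reg y w c.x##c.x##c.x, vce(robust)

* Predicted y on a grid of x values, averaging over the control w
margins, at(x=(-2(0.5)3))

* margins returns the point estimates and their variance matrix
matrix b = r(b)
matrix V = r(V)
display "number of grid points: " colsof(b)

* 95% interval at the first grid point (x = -2), normal approximation
display b[1,1] - invnormal(0.975)*sqrt(V[1,1])
display b[1,1] + invnormal(0.975)*sqrt(V[1,1])
```

Repeating the last two lines for every column of b gives the band that gets drawn around the response function.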
Here's an example. First generate some fake data. Y is the outcome of interest. X is the independent variable of interest. W and Z are covariates that also enter Y.
clear
set obs 1000
gen year = ceil(_n/100)
gen x=5*uniform()-2
gen z=2*uniform()
gen w=2*uniform()
gen e = 40*rnormal()
gen y= w + 3*z + 4*x - x^2 + x^3 + e
Then run a regression of Y on a polynomial of X (here, it's third degree) along with controls. The standard errors can be computed any fancy way you like. Here, I've done a block-bootstrap by the variable year.
reg y w z c.x##c.x##c.x , vce(bootstrap , reps(500) cluster(year))
Then, right after running the regression (or whenever the estimates from the regression are the active estimates in memory), call plot_margins to plot the marginal contribution of X at different values.
plot_margins x
Easy enough? I've added a few features for programming ease. Use the plotcommand() option to add labels, etc. to the graph:
plot_margins x, plotcommand("xtit(X) ytit(Y) tit(Third order polynomial) subtit(plotted with PLOT_MARGINS) note(SE are block-bootstrapped by year and model controls for W and Z)")
The result:
or specify the option "line" to have the CI plotted as lines instead of a shaded area:
There is also an option to save the function generated by parmest. Help file is below the fold. Have fun and don't forget to email Reed and tell him "thank you!"
Update: Reed points us to marginsplot in Stata 12, which basically does the same thing. Funny that the function names are all so unique...
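For anyone on Stata 12 or later, a minimal sketch of the marginsplot route looks something like this. The data and grid are made up for illustration, and the recast() and recastci() options are just one way to get a line with a shaded confidence band.

```stata
* Requires Stata 12 or later; illustrative data only
clear
set seed 2024
set obs 500
gen x = 5*runiform() - 2
gen y = 4*x - x^2 + x^3 + 5*rnormal()

reg y c.x##c.x##c.x

* Evaluate the fitted cubic on a grid of x values
margins, at(x=(-2(0.25)3))

* Plot the margins as a line with a shaded confidence band
marginsplot, recast(line) recastci(rarea)
```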