William Gould on Stata's blog (previously mentioned here) has two great posts (here and here) on the intuition behind matrices and regression coefficients. The section on near-singular matrices is characteristically nice:
Singular matrices are an extreme case of nearly singular matrices, which are the bane of my existence here at StataCorp. Here is what it means for a matrix to be nearly singular: [see figure]
Nearly singular matrices result in spaces that are heavily but not fully compressed. In nearly singular matrices, the mapping from x to y is still one-to-one, but x's that are far away from each other can end up having nearly equal y values. Nearly singular matrices cause finite-precision computers difficulty. Calculating y = Ax is easy enough, but to calculate the reverse transform x = A⁻¹y means taking small differences and blowing them back up, which can be a numeric disaster in the making.
Both posts are great and I recommend them for anyone struggling with the intuition behind what exactly you're doing when you type in reg y x.
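To make the quoted point concrete, here is a minimal sketch in plain Python (not from Gould's posts; the 2x2 matrix, the value of `eps`, and the helper names `matvec` and `solve2` are all made up for illustration). It shows two x's that are far apart mapping to nearly the same y, and then a tiny nudge to y sending the recovered x far away.

```python
import math

def matvec(A, x):
    """Compute y = A x for a 2x2 matrix A and a length-2 vector x."""
    return [A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1]]

def solve2(A, y):
    """Recover x from y = A x for a 2x2 matrix, using Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(A[1][1] * y[0] - A[0][1] * y[1]) / det,
            (A[0][0] * y[1] - A[1][0] * y[0]) / det]

def dist(u, v):
    """Euclidean distance between two length-2 vectors."""
    return math.hypot(u[0] - v[0], u[1] - v[1])

# Hypothetical nearly singular matrix: the two rows are almost identical.
eps = 1e-8
A = [[1.0, 1.0],
     [1.0, 1.0 + eps]]

# Two inputs that are far apart...
x_a = [1.0, 0.0]
x_b = [0.0, 1.0]
y_a = matvec(A, x_a)
y_b = matvec(A, x_b)

gap_x = dist(x_a, x_b)  # about 1.41
gap_y = dist(y_a, y_b)  # about eps: the space has been heavily compressed

# ...and the reverse transform: nudge y_a slightly, then solve for x again.
y_pert = [y_a[0], y_a[1] + 1e-6]
x_pert = solve2(A, y_pert)
blow_up = dist(x_pert, x_a)  # on the order of 1e-6 / eps, i.e. around 100

print("distance between x's:", gap_x)
print("distance between y's:", gap_y)
print("shift in recovered x after a 1e-6 nudge to y:", blow_up)
```

Going forward (y = Ax) loses nothing visible, but going backward divides by a determinant of roughly `eps`, which is where the small differences get blown back up.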
As an added bonus, earlier this week I stumbled across Kenneth Simon's excellent PDF cheat sheet of Stata commands for intermediate / advanced econometrics, here. I was trying to figure out a way to do something cute with distributed lag models and post-estimation tests, but the sheet covers everything from the simple but important (e.g., the difference between gen old = age >= 18 and gen old = age >= 18 if age < ., which matters whenever age has missing values) to the arcane but potentially important (e.g., nonlinear hypothesis testing). If you're in applied work and use Stata, I highly recommend flipping through it. I've already found several useful techniques I wasn't even aware existed.
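On the gen old = age >= 18 example: Stata treats a numeric missing value (.) as larger than any number, so the unguarded version quietly codes people with missing age as old, while adding if age < . leaves them missing. The sketch below is a rough Python emulation of that behavior, not Stata code and not taken from the cheat sheet; `MISSING`, `gen_old_unguarded`, and `gen_old_guarded` are hypothetical names, and using infinity as the stand-in for "." is an assumption made only to mimic the ordering.

```python
# Assumption for illustration: model Stata's numeric missing "." as +infinity,
# since Stata orders missing values above every nonmissing number.
MISSING = float("inf")

ages = [12.0, 18.0, 45.0, MISSING]

def gen_old_unguarded(age):
    """Mimics: gen old = age >= 18 (missing age compares as >= 18)."""
    return 1 if age >= 18 else 0

def gen_old_guarded(age):
    """Mimics: gen old = age >= 18 if age < . (missing age stays missing)."""
    if age < MISSING:
        return 1 if age >= 18 else 0
    return MISSING

old_unguarded = [gen_old_unguarded(a) for a in ages]
old_guarded = [gen_old_guarded(a) for a in ages]

print(old_unguarded)  # [0, 1, 1, 1]   <- missing age counted as old
print(old_guarded)    # [0, 1, 1, inf] <- missing age left missing
```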