One afternoon a week I went to a little room with a bunch of old equipment, including an oven and a spectrometer. I found some glass slides and a lamp and a thermometer and started to make measurements, heating up the glass, watching diffraction fringes change, etc. etc.
At the end I plotted my results, and they were all over the place. I mean, simply scattershot. I took out my trusty HP calculator (remember Reverse Polish Notation?) and plugged away and came up with a best-fit line. And when I took it to my professor (Bryon Dieterle), he pretty much laughed right in my face and told me you can't just draw a stupid line through a seemingly random set of results, and anyway, I hadn't considered the uncertainties of my data (or my result), and that was actually most of experimental science. I felt pretty humiliated -- I could go back and show you the exact spot in the hallway where he told me all this, almost like it was the Kennedy assassination. Then he explained uncertainties in terms of partial derivatives and it started to make sense. I think he gave me an A-.
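(For the curious: what he was showing me is standard first-order error propagation. If you compute some quantity f(x, y) from measurements x and y with independent uncertainties sigma_x and sigma_y, then sigma_f^2 is approximately (df/dx)^2 sigma_x^2 + (df/dy)^2 sigma_y^2. Here's a toy sketch in Python -- made-up numbers, not anything from that lab:)

```python
import math

# First-order error propagation for f(x, y) = x / y,
# assuming the errors in x and y are independent.
x, sigma_x = 10.0, 0.2   # made-up measurement and its uncertainty
y, sigma_y = 2.0, 0.1

f = x / y
dfdx = 1.0 / y           # partial derivative of f with respect to x
dfdy = -x / y**2         # partial derivative of f with respect to y

sigma_f = math.sqrt((dfdx * sigma_x)**2 + (dfdy * sigma_y)**2)
print(f"f = {f:.2f} +/- {sigma_f:.2f}")   # f = 5.00 +/- 0.27
```

The punchline is that the uncertainty in the answer isn't an afterthought; it falls straight out of the partial derivatives, and it's often the most informative number you report.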
I was always a lousy experimentalist -- I still panic at having to change a flat tire -- but in fact that humiliation taught me a great deal about what data analysis was all about and insisting on good data and precise data and not being stupid about it and all that. Anyway.
So when I see something like this
from the blog Stochastic Democracy, more or less endorsed by Matthew Yglesias here, I have to laugh. You can't just take a scattershot of points and draw a line through it because that's what your calculator (now Excel) tells you. Sure, you can, but it's meaningless -- it's a blob! -- and it's more important to understand that it's meaningless than to go through the nitty-gritty details of calculating slopes and intercepts and Pearson coefficients.
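(You can see this for yourself in a few lines. Here's a sketch in Python with NumPy -- not anything from the original post -- fitting a line through pure noise. The fit happily returns a slope and intercept, but the Pearson coefficient tells you the line means nothing:)

```python
import numpy as np

rng = np.random.default_rng(0)

# A "blob": y has no actual dependence on x.
x = rng.uniform(0, 10, 50)
y = rng.normal(5, 2, 50)

# Excel (and my old HP) will hand you a line anyway.
slope, intercept = np.polyfit(x, y, 1)

# The Pearson coefficient is what tells you whether the line means anything.
r = np.corrcoef(x, y)[0, 1]

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, r = {r:.3f}")
# With pure noise, |r| comes out near zero: the fit is meaningless.
```

The calculator never refuses; it's on you to look at r (and the uncertainties) before believing the line.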
I don't know the moral of this story, except that you can do a lot of stupid things with the linear correlation function on your spreadsheet. Be sure to think first.