Nerds on Wall Street: The uselessness of "data mining" and "back testing" in the stock market

Topics: Finance · Aug 08, 2009

A new book, "Nerds on Wall Street", takes a satirical look at why data mining and back testing (using past information to understand the future) don't make any sense. An excerpt from the WSJ article (http://www.wsj.com/article/SB124967937642715417.html):

--------

 

"Mr. Leinweber got so frustrated by "irresponsible" data mining that he decided to satirize it. After casting about to find a statistic so absurd that no sensible person could possibly believe it could forecast U.S. stock prices, Mr. Leinweber settled on annual butter production in Bangladesh. Over a 13-year period, he found, this statistic "explained" 75% of the variation in the annual returns of the Standard & Poor's 500-stock index. By tossing in U.S. cheese production and the total population of sheep in both Bangladesh and the U.S., Mr. Leinweber was able to "predict" past U.S. stock returns with 99% accuracy. But the entire exercise, he says, is a total crock.

There is no conceivable reason why U.S. stock returns would be determined by Bangladeshi livestock returns. Mr. Leinweber's exercise isn't much more absurd than some actual examples of data mining. One recent scholarly paper purported to show that you can predict stock returns by tracking the number of nine-year-olds in the U.S. Another academic study asserts that stocks are more likely to go up on days when smog goes down.

That points to the first rule for keeping yourself from falling into a data mine: The results have to make sense. Correlation isn't causation, so there needs to be a logical reason why a particular factor should predict market returns. No matter how appealing the numbers may look, if the cause isn't plausible, the returns probably won't last."

--------
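
To see why a 13-year fit proves so little, here is a rough sketch in Python. It is not Mr. Leinweber's actual exercise: instead of butter, cheese and sheep, it assumes purely random made-up series, and it fits one predictor at a time rather than several. The function names (r_squared, mine) and the choice of 1,000 candidates are mine, picked only for illustration. The idea is simply to try enough meaningless series against 13 "annual returns" and keep whichever one fits best, then check that winner on 13 further years it was not selected on.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length series.

    For a one-variable linear fit this is the share of the variation
    in y that x "explains".
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def mine(n_years=13, n_candidates=1000, seed=0):
    """Search random series for the best in-sample fit to random "returns".

    Everything here is noise by construction (an assumption of this
    sketch), so any fit that is found is spurious.
    Returns (best in-sample R^2, the same series' R^2 on held-out years).
    """
    rng = random.Random(seed)
    # First n_years are used for the search, the next n_years are held out.
    returns = [rng.gauss(0, 1) for _ in range(2 * n_years)]
    best_in, best_out = 0.0, 0.0
    for _ in range(n_candidates):
        series = [rng.gauss(0, 1) for _ in range(2 * n_years)]
        r2_in = r_squared(series[:n_years], returns[:n_years])
        if r2_in > best_in:
            best_in = r2_in
            best_out = r_squared(series[n_years:], returns[n_years:])
    return best_in, best_out


if __name__ == "__main__":
    best_in, best_out = mine()
    print(f"best in-sample R^2:    {best_in:.2f}")
    print(f"same series, held out: {best_out:.2f}")
```

With so few observations and so many candidates, the winning series will typically look impressive in-sample even though nothing here has any connection to anything else. The held-out number is just one more random draw, so there is no reason to expect the fit to carry over.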
