Mining the Web for the "Voice of the Herd" to Spot Stock Market Bubbles
Aaron Gerow, Mark T. Keane
We show that power-law analyses of financial commentaries can predict stock market bubbles, supplementing traditional volatility analyses. Using a four-year corpus of 17,713 online, finance-related articles (10M+ words) from the Financial Times, the New York Times, and the BBC, we show that week-to-week changes in power-law distributions can accurately predict market movements of the Dow Jones Industrial Average (DJI), the FTSE-100, and the NIKKEI-225. Notably, these statistical regularities in language accurately predict the 2007 stock market bubble: structure emerges in the language of commentators as they agree progressively more in their positive perceptions of the market. Furthermore, a Kullback-Leibler analysis reveals a marked divergence in positive language during the bubble period.
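The two measurements named above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say how the power laws were fitted or how the Kullback-Leibler divergence was smoothed, so the least-squares fit on a log-log rank-frequency curve, the add-one smoothing, the function names, and the toy token lists are all assumptions made for illustration.

```python
import math
from collections import Counter


def rank_frequency_slope(tokens):
    """Slope of log(frequency) against log(rank), fitted by least squares.

    Assumption: a simple log-log regression stands in for whatever
    power-law fitting procedure the paper actually uses.
    """
    counts = sorted(Counter(tokens).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct tokens to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def kl_divergence(tokens_p, tokens_q, smoothing=1.0):
    """KL(P || Q) in bits between two word distributions.

    Assumption: additive smoothing over the joint vocabulary keeps
    every probability non-zero; the paper may handle this differently.
    """
    counts_p, counts_q = Counter(tokens_p), Counter(tokens_q)
    vocab = set(counts_p) | set(counts_q)
    total_p = sum(counts_p.values()) + smoothing * len(vocab)
    total_q = sum(counts_q.values()) + smoothing * len(vocab)
    divergence = 0.0
    for word in vocab:
        p = (counts_p[word] + smoothing) / total_p
        q = (counts_q[word] + smoothing) / total_q
        divergence += p * math.log2(p / q)
    return divergence


# Hypothetical toy data: one token list per week of commentary.
week_1 = ["rise"] * 12 + ["gain"] * 6 + ["rally"] * 4 + ["soar"] * 3
week_2 = ["rise"] * 5 + ["fall"] * 9 + ["slump"] * 7 + ["gain"] * 2

slope_change = rank_frequency_slope(week_2) - rank_frequency_slope(week_1)
divergence = kl_divergence(week_1, week_2)
```

In this sketch, a week-to-week series of `slope_change` values would play the role of the changing power-law distributions, and `divergence` would track how far one week's vocabulary has moved from another's. Restricting the tokens to positive words before calling either function is one way to approximate the positive-language analysis described above.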