An interesting Business Week article outlines a scary, though on reflection probably inevitable, concept: software that analyses millions of online comments about companies to help traders decide when to buy, and sell, stocks in the companies mentioned.

Welcome to the weird and wonderful world of ‘sentiment analysis’: software that sifts unstructured data such as online financial news, social media and corporate interviews.

And it does so across a surprising breadth, with one such system apparently quantifying no fewer than 400 types of sentiment and topic, "from optimism and anger and management changes or product releases", for, in that supplier’s case, no fewer than 6,000 U.S. stocks and exchange-traded funds.

Traders and investment firms have used algorithmic approaches for years, all of which rely on quantitative, number-crunching data. But over time these systems have become the default (everyone has them now), so the clever types now want heuristic, unstructured help too, it seems, to give them an edge.

How do such systems typically work? The magazine story’s case study is of how investors deal with bad news, such as the announced outbreak of swine flu in 2009. Why would that matter? Americans immediately stopped hopping on planes so casually. The sentiment software picked that up, and the depressed stock was bought cheap. Sentiment, of course, bounced back, so a few days later it was all sold, netting a nice profit.
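To make the buy-the-dip, sell-the-rebound idea concrete, here is a minimal Python sketch. It is not the method of any vendor mentioned in the story: the function name `contrarian_trades`, the thresholds, the sentiment scale (roughly -1 to 1) and the sample numbers are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    buy_day: int
    sell_day: int
    buy_price: float
    sell_price: float

    @property
    def profit(self) -> float:
        # Profit per share, ignoring fees and slippage.
        return self.sell_price - self.buy_price


def contrarian_trades(sentiment, prices, buy_below=-0.5, sell_above=0.0):
    """Buy when sentiment drops below `buy_below`; sell once it recovers.

    Hypothetical rule for illustration only: both thresholds are assumptions.
    """
    if len(sentiment) != len(prices):
        raise ValueError("sentiment and prices must have the same length")

    trades = []
    entry = None  # (day, price) of the open position, if any
    for day, (score, price) in enumerate(zip(sentiment, prices)):
        if entry is None and score < buy_below:
            entry = (day, price)  # sentiment has slumped: buy cheap
        elif entry is not None and score >= sell_above:
            trades.append(Trade(entry[0], day, entry[1], price))
            entry = None  # sentiment has bounced back: sell
    return trades


# Invented daily sentiment scores and prices for one stock.
sentiment = [0.2, 0.1, -0.8, -0.6, -0.2, 0.1, 0.3]
prices = [50.0, 49.5, 42.0, 41.0, 44.0, 47.5, 48.0]

trades = contrarian_trades(sentiment, prices)
for t in trades:
    print(f"bought day {t.buy_day} at {t.buy_price}, "
          f"sold day {t.sell_day} at {t.sell_price}, profit {t.profit}")
```

With this toy data the rule buys on the day sentiment slumps and sells once it climbs back to neutral. A real system would also have to handle transaction costs, position sizing and the case where sentiment never recovers.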

Feeding this into trading models often depends on, well, sources such as CBR Online: online news and comment on publicly traded companies is increasingly becoming another vital datum in the trading game.

Transcripts of CEO interviews on financial TV sites like CNBC or what’s said in online regulatory filings or what people are saying on Twitter about a company – they all matter, now, and, to quote the story, "open up a whole new world of data that investors have never used".

Some of this stuff is very marketing and PR friendly, note. The company with the 400-parameter system, an outfit called MarketPsych, said it examined the daily performance of some big tech stocks over 13 years, looking at factors such as how innovative a company is thought to be. Those that had the highest innovation perceptions — the top 14% — outperformed the average group by 10 basis points a day, which translated to a 25% annual excess return over the S&P 500.
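As a quick sanity check on how 10 basis points a day becomes roughly 25% a year, the arithmetic can be sketched in Python. The story does not say how the figure was annualised; the 252 trading days used here is a common convention and an assumption on our part.

```python
# Assumption: about 252 trading days in a year (a common convention,
# not a figure taken from the article).
TRADING_DAYS = 252

# One basis point is one hundredth of a percent, so 10 bps = 0.001.
daily_excess = 10 / 10_000

# Simple annualisation: just add up the daily excess returns.
simple = daily_excess * TRADING_DAYS

# Compounded annualisation: reinvest the excess return every day.
compounded = (1 + daily_excess) ** TRADING_DAYS - 1

print(f"simple:     {simple:.1%}")      # 25.2%
print(f"compounded: {compounded:.1%}")  # about 28.6%
```

The simple sum lands close to the quoted 25%, which suggests, though does not prove, that the figure was reached by adding daily returns rather than compounding them.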

Are they doing well because they are innovative, or because they are seen as innovative and hence liked? Bit chicken and egg. What’s important is that soft data is now getting to be just as useful as hard data out here on the Web, and that firms who fail to take notice may suffer unjustly.

Interesting piece – we recommend it, as well as the more consumer-oriented companion piece here.