Friday, September 12, 2008

A technical insider's view of the UAL Google Alerts fiasco

Guest post by Marty Betz, VP of Technology

This week saw the stock of United Airlines (UAL) drop by 75% in a single day because of information posted on the web. In this case, it was because 6-year-old content seemed ‘new’ to a user of the web – and that user pushed the ‘news’ into a major tool (Bloomberg) used by professional investors, and the markets reacted.

I thought I’d share my thoughts from a technical perspective on what happened, and how it can be avoided.

Quick (simplified) review of the chain of events:
1. Something led to higher than usual clicking on an old news story about United going bankrupt
2. That spike in clicking made the story appear on a newspaper’s “Most Viewed Articles” list on their web site
3. Google News crawled that list and article, and sent it to Google Alerts users who track UAL
4. An investment advisor posted it to a public forum on Bloomberg
5. A Bloomberg editor pushed it up into the closely watched flow of Bloomberg ‘headlines’

So, how could this have happened? From a technology/product perspective, it’s because one very specific type of user (a professional investment researcher), doing a specific task (investment research) used a tool designed for a different user with a different task (plain old consumers like you and me keeping up on the news).

Re-purposing is a good practice. It’s often how investment professionals as well as tech startups move the ball forward, but it comes with a requirement that you get disciplined about making trade-offs between the added value and the costs and risks. In this case, there are a few aspects of a) the web and b) investment research that boost the risk of that mismatch. Let’s start with the web.

Finding what’s “new” on the web is hard. Old content makes up 99.99999…% (you get the picture) of the web. Because of this, you need to make a number of significant technical specializations to use it as a source of news. Add to that the fact that the web wasn’t designed for news delivery, so there is no reliable time stamp on a web page, much less the kind of “I’m new” flag that we would like.

Second, consumer search engines were not built to find news. As search engines have done the work to build this capability, they’ve had to make hundreds of technical choices. In the case of Google Alerts, to give their consumer user a great experience by sending them news to “keep up with what’s going on,” Google very reasonably chose to crawl a given web site’s “Most Viewed Articles” list, on the assumption that what readers are clicking on today is current. For a consumer, even when that assumption is wrong, the worst possible consequence is old (but popular) content showing up in a Google Alert. That is, at worst, a little annoying.

Investment research is a very different animal. The investment firms and hedge funds that manage the majority of money in the market earn their keep by being careful and thorough about deciding when to buy or sell stock, but along with their deliberative research process, they are also required to pay attention to the news. It would be reckless, and potentially illegal, for them to ignore news, because news moves a stock’s price. Many of the professionals who do the research at these firms have re-purposed Google Alerts to track news, and they’ve simply gotten used to some level of old stories coming in.

The impact of the re-purposing shows up in the lopsided consequences when Google’s assumption is wrong:
- For the consumer user, annoyance
- For the investment professional, a $1.16 billion intraday loss
- For United Airlines, the kind of attention that hurts when you’re already under stress
- For Bloomberg and the researcher who posted the link, a big hit to their credibility.

What does it take to avoid it?

In contrast, at FirstRain we make our hundreds of technical decisions with the investment research user in mind. These include:

- Carefully understanding each site’s structure and publishing patterns. For example, we generally don’t look at the “Most Viewed Articles” pieces of a web site, because widely known information is the least valuable to an investment researcher

- Analyzing the words and phrases in the article to place it in time

- Sampling small pieces of every article and comparing them against our database to detect stories we’ve crawled before
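
To make the last two items concrete, here is a minimal sketch in Python. It is not our production code. The function names, the eight-word sample size, the one-in-four sampling rate and the 50% overlap threshold are all assumptions picked for illustration, and counting the years mentioned in an article is only a crude stand-in for real analysis of its words and phrases.

```python
import hashlib
import re
from collections import Counter


def sample_fingerprints(text, shingle_words=8, keep_one_in=4):
    """Hash overlapping runs of words and keep a deterministic sample.

    All parameter values are illustrative assumptions, not tuned settings.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return set()
    digests = []
    for i in range(max(len(words) - shingle_words + 1, 1)):
        shingle = " ".join(words[i:i + shingle_words])
        hexdigest = hashlib.sha1(shingle.encode("utf-8")).hexdigest()
        digests.append(int(hexdigest[:12], 16))
    # Keep roughly one hash in `keep_one_in`; the same text always
    # produces the same sample because the choice depends only on the hash.
    sampled = {d for d in digests if d % keep_one_in == 0}
    # Very short texts may sample nothing, so fall back to the smallest hash.
    return sampled or {min(digests)}


def seen_before(text, known_prints, threshold=0.5):
    """True if enough of the article's sampled hashes are already stored."""
    prints = sample_fingerprints(text)
    if not prints:
        return False
    overlap = len(prints & known_prints) / len(prints)
    return overlap >= threshold


def likely_year(text):
    """Return the four-digit year mentioned most often, or None.

    A rough hint about when the events took place, nothing more.
    """
    years = Counter(re.findall(r"\b(?:19|20)\d{2}\b", text))
    return int(years.most_common(1)[0][0]) if years else None


if __name__ == "__main__":
    known = set()  # stands in for the database of crawled stories
    story = ("United Airlines filed for bankruptcy protection in December "
             "2002 after failing to win federal loan guarantees. The 2002 "
             "filing followed months of losses.")
    known |= sample_fingerprints(story)
    print(seen_before(story, known), likely_year(story))
```

The point of sampling is that only a fraction of the hashes has to be stored and compared, yet a story that resurfaces under a new URL or on a “Most Viewed Articles” list still matches. A year hint that disagrees with the crawl date is a cheap signal that the article deserves a closer look before it is treated as news.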

A lot of the hand-wringing in the press and on blogs has been about “machines gone wrong,” but the right machine/technology actually works well for the portfolio managers, analysts, CFOs and CMOs it was built for.

The Bigger Picture

While this news tracking piece of what we do is valuable for our clients and technically challenging for my team, the more compelling impact of the web for financial research comes from recognizing that the web is a comprehensive reflection of reality – the biggest investment research database that has ever existed.

Every other data source is relatively limited – and very picked over. We began releasing new capabilities in FirstRain earlier this year that mine this ‘database’ for unique insights, trends and patterns over time. The reaction from our users tells us we’re on to something big.
