Invasion of the Robo-Editors

I'm going on strike. That was my first thought when I heard that the guys at Google had developed a computerized news editor that could do for free what I do for a living: track news and pull the most important stories together into a vibrant, continuously updated Web page. My website is TIME.com. Theirs is Google News. But I get paid for what I do, while Google's news editor gets no compensation: no salary, no medical, no free T-shirts from failing dotcoms. If West Coast dockworkers can trigger a labor dispute because automation threatens to thin their ranks, why can't I?

But before taking to the streets with a sandwich board and protest chants ("We don't split infinitives, and we don't cross picket lines!"), I figured I'd better check out the competition. The good news (for Web surfers) is that Google delivers a surprisingly high-quality product that's just as relevant and up-to-date as the human-edited news outlets with which it competes. The better news (for human editors) is that it can't do what it does without us.



I should have known that Google would get it right. The company is among the few shining stars in an industry littered with flameouts. Before Google came along in 1998 with its breakthrough search engine, trying to search for stuff on the Web was an exercise in information overload. Google's innovation was an algorithm (a fancy comp-sci term for a set of rules) that matched search terms with the most relevant sites, ranking highest those that get the most links from the highest-quality Web pages. Think of it as searching by dint of popular demand.

Google's news service takes a similar approach. Its algorithms scan more than 4,000 news publications to build a rolling list of current headlines linked to the articles' text. Google sorts those stories into categories (such as U.S., World, Business and Entertainment) and groups them by subject matter. Headlines are refreshed at least every 15 minutes. One feature news junkies are sure to love is the time stamp that indicates when each story was posted to the Net. This has the effect of letting you watch the news age before your eyes. Minutes after Reuters published a story last week about Northern Irish police invading a Sinn Fein office in Belfast, the news appeared on Google, time-stamped "5 minutes ago."

Automated news services aren't entirely new. Sites such as Daypop and Columbia Newsblaster use similar technologies to fetch headlines from the Net. Newsblaster, created by researchers at Columbia University, even generates its own news summaries using something called natural language processing. (Guild-covered writers: take note.)

But none of these sites is really out to eliminate the traditional editorial function. What they have done is find electronic methods of repackaging the work of human reporters and editors. Google is trying to keep the inner workings of its robo-editor secret, but I did find out it uses more than 150 different criteria to cull its story list from those 4,000 news sources. Example: a boldface headline centered at the top of a Web page is considered more relevant than a smaller one farther down the page.

Google isn't perfect by any stretch. It has a particularly tough time with nuance. Take the World's Funniest Joke, a headline that made the rounds last week after an outfit based in London claimed to have identified the best current wisecrack (a not particularly funny one about two guys in the woods and a call to 911). Many sites played this in the news-of-the-strange category, but Google displayed it earnestly as one of the top stories in its World section alongside more serious headlines. Google News also has a somewhat deficient disinformation detector, a weakness that got it into trouble a couple of weeks ago when its lead story was a piece of propaganda lifted directly from the Iranian News Service.

In its defense, Google says the current site is running a beta (i.e., unfinished, buggy) version of the software. The company is still tweaking its algorithms and evaluating news sources. I'm willing to wait and see. Meanwhile, I think I'll do something robots can't: take a lunch break and let the computers carry the load.

Want to know more? E-mail Josh at jmacht@aol.com