News-gathering organizations can vastly expand the scope of their coverage by leaving it to computers. That’s because machines are more consistent and comprehensive, and thus better at spotting significant trends and patterns. What’s more, machines don’t take vacations or sick days, and you don’t have to pay them overtime.
In fact, Narrative Science cofounder Kristian Hammond is so sure that the virtual reporter service his company is developing will revolutionize journalism, he predicts a computerized reporter will win a Pulitzer Prize within five years.
My friend, Stephen Murray of Kazoobie, passed along a link to this article at Wired.com, which describes Narrative Science’s effort to create algorithms that will convert a baseball box score or corporate earnings statement into a readable news article. Eventually, publications will use the articles as a sort of native wire service, and the service will expand into other fields, such as politics. The algorithms can even be tweaked to impart a preferred writing style.
Stephen and I traded a few jokes about the reliability of reporters (and about me being out of work), but Hammond tells the Wired.com author, Steven Levy, that the aim is not to replace human journalists but to expand the scope and reduce the expense of journalistic enterprise.
I’m both skeptical and intrigued.
Consider this passage from sportswriter Bob Considine’s account of Joe Louis’ knockout of Max Schmeling, written for the International News Service in 1938 and one of my all-time favorite spot-news sports stories:
Listen to this, buddy, for it comes from a guy whose palms are still wet, whose throat is still dry, and whose jaw is still agape from the utter shock of watching Joe Louis knock out Max Schmeling. It was a shocking thing, that knockout – short, sharp, merciless, complete. Louis was like this: He was a big lean copper spring, tightened and retightened through weeks of training until he was one pregnant package of coiled venom.
You think your iPad could write that? It’s a little purple by today’s standards, but I doubt an algorithm could ever be so adroit as to make seemingly irrelevant observations relevant. In fact, if it did so, it would not be journalism but a lie. After all, iPads don’t have palms, throats or jaws.
Nonetheless, Hammond’s creation does seem to produce prose that is no worse than what you might read in a lot of small-town dailies and weeklies. Consider this account, provided in the Wired.com story, of a youth baseball game:
Friona fell 10-8 to Boys Ranch in five innings on Monday at Friona despite racking up seven hits and eight runs. Friona was led by a flawless day at the dish by Hunter Sundre, who went 2-2 against Boys Ranch pitching. Sundre singled in the third inning and tripled in the fourth inning … Friona piled up the steals, swiping eight bags in all …
The program pulls from a predictable, jargon-ridden bag of verbs, but this stuff is probably good enough to make Grandma and Grandpa proud to read in print. And it would free up human sportswriters to pursue enterprise pieces or hone their skills as columnists, all while saving their newspapers or websites money.
At least in theory.
I’ll not offer here a full analysis of the Wired.com article or Hammond’s claims. The article is lengthy but well worth the read. I’ll offer instead some random thoughts:
• It seems unlikely computers will ever entirely free us of bias, but that’s not necessarily a bad thing. To wit: The Big Ten is using the service in a pilot test but complained that stories focused too much on winners, which caused problems when one of its teams lost to a non-conference opponent. The algorithm was tweaked to focus on the Big Ten team in such instances, which implies the formula can and will be tweaked to avoid prose that editors or readers find objectionable. Such bias quite often benefits readers by emphasizing their interests, but it leaves the craft just as vulnerable to manipulation and political agendas as before.
• Early tests focus on sports and business, which are data-intensive fields. Will the program fare as well in coverage of the arts or politics? Levy makes this point, too.
• The algorithm converts data, but can it interview someone? And even if it is given transcripts from a news conference or court proceedings (good luck getting the latter quickly), how will it know which quotes to include and which to discard? Quotes aren’t simply a matter of relevance; they can convey humor or irony, and inflection and cadence can be as important and telling as the words uttered. My guess is that this will keep the tool from doing anywhere close to the 90 percent of journalism Hammond predicts, but it does hint at a possible use as a reporting tool to complement human effort: A reporter could use the program to turn out copy very quickly, then tweak and edit it.
• This also hints at a second weakness — someone still has to input information. Much of that can be automated, but if this were to be used to cover a Little League game, who would provide the box score? A reporter? A parent? A coach? Anyone who has manned the phones at a small publication such as The Beaufort Gazette or The Island Packet to take call-ins knows some coaches are more reliable and organized than others. Put another way, an algorithm doesn’t seem likely to solve the garbage-in/garbage-out problem.
• The prospect of perfect grammar, spelling and punctuation and stories turned perfectly on deadline is enough to make an editor salivate, though, particularly in this lean age of thin-stretched staffs.