Article at Wired magazine
Brin’s tolerance for “noisy data” is especially telling, since medical science tends to consider it poisonous. Biomedical researchers often limit their experiments to narrow questions that can be rigorously measured. But the emphasis on purity can mean fewer patients to study, which results in small data sets. That limits the research’s “power,” a statistical term for the probability that a study will detect an effect that really exists. And by design it means the data almost never turn up insights beyond what the study set out to examine.
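To make the link between sample size and power concrete, here is a minimal sketch. It is not from the Wired article: it assumes a two-sided, two-sample z-test with equal group sizes and a known standard deviation, and the function name `z_test_power`, the effect size of 0.3 and the group sizes are all chosen purely for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the true difference in means divided by the (assumed
    known) standard deviation; n_per_group is the number of patients in
    each of the two groups. Both are illustrative assumptions.
    """
    std_normal = NormalDist()
    # Critical value for rejecting the null hypothesis at level alpha.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # How far the test statistic is shifted when the effect is real.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Same modest effect, increasingly large studies.
    for n in (10, 50, 200, 1000):
        print(f"n per group = {n:5d}  power = {z_test_power(0.3, n):.3f}")
```

Under these assumptions, power rises steadily with the number of patients per group, which is the trade-off the paragraph describes: a study kept small for the sake of clean data is more likely to miss an effect that is really there.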
Increasingly, though, scientists—especially those with a background in computing and information theory—are starting to wonder if that model could be inverted. Why not start with tons of data, a deluge of information, and then wade in, searching for patterns and correlations?
This is what Jim Gray, the late Microsoft researcher and computer scientist, called the fourth paradigm of science, the inevitable evolution away from hypothesis and toward patterns. Gray predicted that an “exaflood” of data would overwhelm scientists in all disciplines, unless they reconceived their notion of the scientific process and applied massive computing tools to engage with the data. “The world of science has changed,” Gray said in a 2007 speech—from now on, the data would come first.
Thursday, June 24, 2010