The End of Theory: The Data Deluge Makes the Scientific Metho...
Popularity Report
Bookmark History
Saved by 107 people (-20 private), first by an anonymous user on 2008-06-24
- Walgreen on 2009-09-07 - Tags information, daten, statistik, wissenschaft, methoden, computer, korrellation, kausalität
- Xiulizhuang on 2009-08-11 - Tags research
- Calendula28 on 2009-08-11 - Tags Science, Data, google, theory, statistics, information, technology, research
- Calvaryslz on 2009-08-11 - Tags data deluge, scientific mathod
Public Sticky notes
Highlighted by rakerman
This is a world where massive amounts of data and applied mathematics replace every other tool that might be brought to bear. Out with every theory of human behavior, from linguistics to sociology. Forget taxonomy, ontology, and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. With enough data, the numbers speak for themselves.
Highlighted by ivomortani
"All models are wrong, but some are useful."
So proclaimed statistician George Box 30 years ago
Highlighted by doobii
Sensors everywhere. Infinite storage. Clouds of processors. Our ability to capture, warehouse, and understand massive amounts of data is changing science, medicine, business, and technology. As our collection of facts and figures grows, so will the opportunity to find answers to fundamental questions. Because in the era of big data, more isn't just more. More is different.
Highlighted by lspiro
Sixty years ago, digital computers made information readable. Twenty years ago, the Internet made it reachable. Ten years ago, the first search engine crawlers made it a single database. Now Google and like-minded companies are sifting through the most measured age in history, treating this massive corpus as a laboratory of the human condition. They are the children of the Petabyte Age.
The Petabyte Age is different because more is different. Kilobytes were stored on floppy disks. Megabytes were stored on hard disks. Terabytes were stored in disk arrays. Petabytes are stored in the cloud. As we moved along that progression, we went from the folder analogy to the file cabinet analogy to the library analogy to — well, at petabytes we ran out of organizational analogies.
At the petabyte scale, information is not a matter of simple three- and four-dimensional taxonomy and order but of dimensionally agnostic statistics. It calls for an entirely different approach, one that requires us to lose the tether of data as something that can be visualized in its totality. It forces us to view data mathematically first and establish a context for it later. For instance, Google conquered the advertising world with nothing more than applied mathematics. It didn't pretend to know anything about the culture and conventions of advertising — it just assumed that better data, with better analytical tools, would win the day. And Google was right.
Highlighted by zigarth
on 2009-04-07 by zigarth
New grounds... collective consciousness is the only analogy I can think of.
Highlighted by zigarth
on 2009-04-07 by zigarth
It really is a new age in which we don't have to deal with approximations to define relationships between data -- it's all dynamic and in real time -- it really is as close to collective consciousness as we've ever gotten.
on 2009-04-07 by fcarey
I wonder what this technology can do with markets, stock and otherwise?
The big target here isn't advertising, though. It's science. The scientific method is built around testable hypotheses. These models, for the most part, are systems visualized in the minds of scientists. The models are then tested, and experiments confirm or falsify theoretical models of how the world works. This is the way science has worked for hundreds of years.
Scientists are trained to recognize that correlation is not causation, that no conclusions should be drawn simply on the basis of correlation between X and Y (it could just be a coincidence). Instead, you must understand the underlying mechanisms that connect the two. Once you have a model, you can connect the data sets with confidence. Data without a model is just noise.
Highlighted by sheryl_barnes
There is now a better way. Petabytes allow us to say: "Correlation is enough." We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.
Highlighted by paulhoff
on 2008-06-30 by paulhoff
http://www.syntheticgenomics.com/about.htm
Highlighted by jpkalin
Highlighted by driessen
Correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all.
There's no reason to cling to our old ways. It's time to ask: What can science learn from Google?
Highlighted by walgreen
Once you have a model, you can connect the data sets with confidence. Data without a model is just noise.
But faced with massive data, this approach to science — hypothesize, model, test — is becoming obsolete.