Shirky: The Semantic Web, Syllogism, and Worldview

Bookmark History

Saved by 33 people (7 private), first by anonymous user on 2006-03-02


Public Comment

on 2005-12-18 by jgentry

Glad I read this. Really deflates that balloon. Also helps clear up some of my confusion over the relationship of the terms "Semantic Web" and Web 2.0. No relationship. Also lays the foundation for much of his argument about Ontology, or vice versa.

Public Sticky notes

The Semantic Web is a machine for creating syllogisms. A syllogism is a form of logic, first described by Aristotle, where "...certain things being stated, something other than what is stated follows of necessity from their being so."

Highlighted by adukuri

The Semantic Web specifies ways of exposing these kinds of assertions on the Web, so that third parties can combine them to discover things that are true but not specified directly. This is the promise of the Semantic Web -- it will improve all the areas of your life where you currently use syllogisms.

Highlighted by adukuri

Dodgson wrote two books of syllogisms and methods for representing them in graphic form, and his syllogisms often took the form of sorites, where the conclusion from one pair of linked assertions becomes a new assertion to be linked to others.

Highlighted by adukuri

Despite their appealing simplicity, syllogisms don't work well in the real world, because most of the data we use is not amenable to such effortless recombination. As a result, the Semantic Web will not be very useful either.

Highlighted by adukuri

The people working on the Semantic Web greatly overestimate the value of deductive reasoning (a persistent theme in Artificial Intelligence projects generally.) The great popularizer of this error was Arthur Conan Doyle, whose Sherlock Holmes stories have done more damage to people's understanding of human intelligence than anyone other than Rene Descartes. Doyle has convinced generations of readers that what seriously smart people do when they think is to arrive at inevitable conclusions by linking antecedent facts. As Holmes famously put it "when you have eliminated the impossible, whatever remains, however improbable, must be the truth."

Highlighted by adukuri

The Semantic Web runs on meta-data, and much meta-data is untrustworthy, for a variety of reasons that are not amenable to easy solution.

Highlighted by adukuri

In the real world, we are usually operating with partial, inconclusive or context-sensitive information.

Highlighted by adukuri

Actual human expression must take into account the ambiguities of the real world,

Highlighted by adukuri

Dodgson's syllogisms actually demonstrate the limitations of the form, a pattern that could be called "proof of no concept", where the absurdity of an illustrative example undermines the point being made. So it is with the Semantic Web.

Highlighted by adukuri

We can't disallow generalizations because we can't know which statements are generalizations by looking at them.

Highlighted by adukuri

The Semantic Web is a machine for creating syllogisms. A syllogism is a form of logic, first described by Aristotle, where "...certain things being stated, something other than what is stated follows of necessity from their being so."

Highlighted by forestfortrees

The simple answer is this: The Semantic Web is a machine for creating syllogisms.

Highlighted by blehrer

The simple answer is this: The Semantic Web is a machine for creating syllogisms. A syllogism is a form of logic, first described by Aristotle, where "...certain things being stated, something other than what is stated follows of necessity from their being so." [Organon]

The canonical syllogism is:

Humans are mortal
Greeks are human
Therefore, Greeks are mortal

with the third statement derived from the previous two.

Highlighted by jangondol
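
The derivation in the canonical syllogism can be sketched in a few lines of Python. This is only an illustration, not how any Semantic Web tool is implemented: the `derive` function and the pair encoding ("every A is a B") are assumptions made for the example.

```python
# Each pair (A, B) is read as "every A is a B".
# The encoding and the derive() helper are hypothetical, for illustration only.

def derive(assertions):
    """Return every 'every A is a B' statement that follows by chaining."""
    known = set(assertions)
    changed = True
    while changed:
        changed = False
        for a, b in list(known):
            for c, d in list(known):
                # "every A is a B" and "every B is a D" give "every A is a D"
                if b == c and (a, d) not in known:
                    known.add((a, d))
                    changed = True
    return known


stated = {("Greek", "human"), ("human", "mortal")}
derived = derive(stated) - stated
print(derived)  # {('Greek', 'mortal')}
```

The third statement appears in `derived` even though nobody asserted it directly, which is the "true but not specified directly" behavior the highlights describe.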

The Semantic Web specifies ways of exposing these kinds of assertions on the Web, so that third parties can combine them to discover things that are true but not specified directly. This is the promise of the Semantic Web -- it will improve all the areas of your life where you currently use syllogisms.

Highlighted by blehrer

Syllogisms are Not Very Useful

Highlighted by blehrer

Despite their appealing simplicity, syllogisms don't work well in the real world, because most of the data we use is not amenable to such effortless recombination. As a result, the Semantic Web will not be very useful either.

The people working on the Semantic Web greatly overestimate the value of deductive reasoning (a persistent theme in Artificial Intelligence projects generally.) The great popularizer of this error was Arthur Conan Doyle, whose Sherlock Holmes stories have done more damage to people's understanding of human intelligence than anyone other than Rene Descartes. Doyle has convinced generations of readers that what seriously smart people do when they think is to arrive at inevitable conclusions by linking antecedent facts. As Holmes famously put it "when you have eliminated the impossible, whatever remains, however improbable, must be the truth."

Highlighted by jangondol

Critique of Pure Reason [Computational Intelligence, 3:151-237, 1987]

Highlighted by julianjonker

In the real world, we are usually operating with partial, inconclusive or context-sensitive information. When we have to make a decision based on this information, we guess, extrapolate, intuit, we do what we did last time, we do what we think our friends would do or what Jesus or Joan Jett would have done, we do all of those things and more, but we almost never use actual deductive logic.

As a consequence, almost none of the statements we make, even seemingly obvious ones, are true in the way the Semantic Web needs them to be true.

Highlighted by jangondol

Consider the following statements:

- The creator of shirky.com lives in Brooklyn
- People who live in Brooklyn speak with a Brooklyn accent

You could conclude from this pair of assertions that the creator of shirky.com pronounces it "shoiky.com." This, unlike assertions about my physical location, is false.

Highlighted by jangondol
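
A small Python sketch of the failure described above, under the assumption that assertions are stored as (subject, predicate, object) triples. The triple layout and the `apply_accent_rule` helper are invented for this example; the point is only that the rule fires mechanically, with no way to know it is a generalization.

```python
# Hypothetical triple store holding the first assertion from the essay.
facts = {("creator of shirky.com", "lives in", "Brooklyn")}


def apply_accent_rule(facts):
    """Treat 'people who live in Brooklyn speak with a Brooklyn accent'
    as if it were universally true, which is what a deductive engine does."""
    inferred = set()
    for subject, predicate, obj in facts:
        if predicate == "lives in" and obj == "Brooklyn":
            inferred.add((subject, "speaks with", "Brooklyn accent"))
    return inferred


conclusions = apply_accent_rule(facts)
print(conclusions)
```

The inference is logically valid and, per the author, factually false. Nothing in the data marks the second assertion as a mere generalization.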

it illustrates the kind of world we would have to live in for this form of reasoning to work, a world where language is merely math done with words.

Highlighted by julianjonker

Any requirement that a given statement be cross-checked against a library of context-giving statements, which would have still further context, would doom the system to death by scale.

Highlighted by jangondol

"proof of no concept", where the absurdity of an illustrative example undermines the point being made.

Highlighted by julianjonker

The Semantic Web runs on meta-data, and much meta-data is untrustworthy, for a variety of reasons that are not amenable to easy solution. (See for example Doctorow, Pilgrim, Shirky.)

Highlighted by blehrer

game the system

Highlighted by blehrer

publish meta-data that they believe to be correct

Highlighted by blehrer

Is your "Person Name = John Smith" the same person as my "Name = John Q. Smith"? Who knows? Not the Semantic Web. The processor could "think" about this til the silicon smokes without arriving at an answer.

Highlighted by jangondol
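
The identity problem can be made concrete with a short Python sketch. Both matching functions below are hypothetical heuristics written for this example, not part of any Semantic Web specification. They give opposite answers, and neither one actually knows whether the two records describe the same person.

```python
yours = {"Person Name": "John Smith"}
mine = {"Name": "John Q. Smith"}


def exact_match(a, b):
    """Strict string equality."""
    return a == b


def loose_match(a, b):
    """Compare only the first and last tokens, ignoring middle initials."""
    ta, tb = a.split(), b.split()
    return (ta[0], ta[-1]) == (tb[0], tb[-1])


exact = exact_match(yours["Person Name"], mine["Name"])
loose = loose_match(yours["Person Name"], mine["Name"])
print(exact, loose)  # False True
```

Choosing between the two is a judgment about the world, not a computation over the strings.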

You could conclude from this that Nike is a person, and of course you would be right. In the context of First Amendment law, corporations are treated as people. If, however, you linked this conclusion with a medical database, you could go on to reason that Nike's kidneys move poisons from Nike's bloodstream into Nike's urine.

Highlighted by julianjonker

Ontology is Not A Requirement

Highlighted by blehrer

The first goal is simple: get people to use more meta-data. The Semantic Web was one of the earliest efforts to rely on the idea of XML as a common interchange format for data. With such a foundation, making formal agreements about the nature of whatever was being described -- an ontology -- seemed a logical next step.

Instead, it turns out that people can share data without having to share a worldview, so we got the meta-data without needing the ontology.

Highlighted by blehrer

Here Rothenberg follows the script to a tee, labeling RSS autodiscovery 'simplistic' without entertaining the idea that simplicity may be a requirement of rapid and broad diffusion. The real lesson of RSS autodiscovery is that developers can create valuable meta-data without needing any of the trappings of the Semantic Web.

Highlighted by blehrer

If the sole goal of the Semantic Web were pervasive markup, it would be nothing more than a "Got meta-data?" campaign -- a generic exhortation for developers to do what they are doing anyway.

Highlighted by blehrer

The Semantic Web takes for granted that many important aspects of the world can be specified in an unambiguous and universally agreed-on fashion, then spends a great deal of time talking about the ideal XML formats for those descriptions.

Highlighted by julianjonker

Descriptions of the Semantic Web exhibit an inversion of trivial and hard issues because the core goal does as well. The Semantic Web takes for granted that many important aspects of the world can be specified in an unambiguous and universally agreed-on fashion, then spends a great deal of time talking about the ideal XML formats for those descriptions. This puts the stress on the wrong part of the problem -- if the world were easy to describe, you could do it in Sanskrit.

Highlighted by blehrer

Consider the following assertions:

- Count Dracula is a Vampire
- Count Dracula lives in Transylvania
- Transylvania is a region of Romania
- Vampires are not real

You can draw only one non-clashing conclusion from such a set of assertions -- Romania isn't real. That's wrong, of course, but the wrongness is nowhere reflected in these statements. There is simply no way to cleanly separate fact from fiction, and this matters in surprising and subtle ways that relate to matters far more weighty than vampiric identity. Consider these assertions:

- US citizens are people
- The First Amendment covers the rights of US citizens
- Nike is protected by the First Amendment

You could conclude from this that Nike is a person, and of course you would be right. In the context of First Amendment law, corporations are treated as people. If, however, you linked this conclusion with a medical database, you could go on to reason that Nike's kidneys move poisons from Nike's bloodstream into Nike's urine.

Highlighted by jangondol
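
The Nike example is a context-mixing problem, and a rough Python sketch shows how it arises. The two "databases" and the `merge_and_infer` helper are assumptions made up for illustration. The legal set takes the essay's conclusion ("Nike is a person") as given.

```python
# Hypothetical data sources, each reasonable within its own context.
legal = {("Nike", "is_a", "person")}        # true in First Amendment law
medical = {("person", "has", "kidneys")}    # true of biological persons


def merge_and_infer(*sources):
    """Pool all triples, then conclude 'X has Y' whenever
    'X is_a K' and 'K has Y' both appear in the pool."""
    facts = set().union(*sources)
    inferred = set()
    for s, p, o in facts:
        if p == "is_a":
            for s2, p2, o2 in facts:
                if s2 == o and p2 == "has":
                    inferred.add((s, "has", o2))
    return inferred


result = merge_and_infer(legal, medical)
print(result)  # {('Nike', 'has', 'kidneys')}
```

Each source uses "person" correctly for its own purposes. The absurd conclusion only appears once the two are combined, and no individual triple records the context that would block it.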

Any attempt at a global ontology is doomed to fail, because meta-data describes a worldview. The designers of the Soviet library's cataloging system were making an assertion about the world when they made the first category of books "Works of the classical authors of Marxism-Leninism." Melvil Dewey was making an assertion about the world when he lumped all books about non-Christian religions into a single category, listed last among books about religion. It is not possible to neatly map these two systems onto one another, or onto other classification schemes -- they describe different kinds of worlds.

Highlighted by blehrer

Because meta-data describes a worldview, incompatibility is an inevitable by-product of vigorous argument. It would be relatively easy, for example, to encode a description of genes in XML, but it would be impossible to get a universal standard for such a description, because biologists are still arguing about what a gene actually is. There are several competing standards for describing genetic information, and the semantic divergence is an artifact of a real conversation among biologists. You can't get a standard til you have an agreement, and you can't force an agreement to exist where none actually does.

Highlighted by blehrer

Social networking services like Friendster and LinkedIn assume that people will treat links to one another as external signals of deep association, so that the social mesh as represented by the software will be an accurate model of the real world. In fact, the concept of friend, or even the type and depth of connection required to say you know someone, is quite slippery, and as a result, links between people on Friendster have been drained of much of their intended meaning. Trying to express implicit and fuzzy relationships in ways that are explicit and sharp doesn't clarify the meaning, it destroys it.

Highlighted by julianjonker

Furthermore, when we see attempts to enforce semantics on human situations, it ends up debasing the semantics, rather than making the connection more informative.

Highlighted by blehrer

Trying to express implicit and fuzzy relationships in ways that are explicit and sharp doesn't clarify the meaning, it destroys it.

Highlighted by blehrer

The problem is that the more semantic consistency required by a standard, the sharper the tradeoff between complexity and scale. It's easy to get broad agreement in a narrow group of users, or vice-versa, but not both.

Highlighted by blehrer

the Web succeeded in part because it does not try to make any assertions about the meaning of the documents it contained, only about their location.

Highlighted by julianjonker

The most widely adopted digital descriptor in history, the URL, regards semantics as a side conversation between consenting adults, and makes no requirements in this regard whatsoever: sports.yahoo.com/nfl/ is a valid URL, but so is 12.0.0.1/ftrjjk.ppq. The fact that a URL itself doesn't have to mean anything is essential -- the Web succeeded in part because it does not try to make any assertions about the meaning of the documents it contained, only about their location.

Highlighted by blehrer
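
As a quick check of the claim about URLs, the sketch below parses both examples with Python's standard `urllib.parse`. The `http://` scheme is an assumption added so the strings parse as absolute URLs; the essay gives them without one.

```python
from urllib.parse import urlsplit

# The "http://" prefix is assumed; the essay omits the scheme.
urls = ["http://sports.yahoo.com/nfl/", "http://12.0.0.1/ftrjjk.ppq"]

parsed = [urlsplit(u) for u in urls]
for p in parsed:
    # The parser extracts location parts only; it has no opinion on meaning.
    print(p.hostname, p.path)
```

Both strings split into a host and a path equally well. The readable one and the opaque one are indistinguishable as far as the URL syntax is concerned.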

There is a list of technologies that are actually political philosophy masquerading as code, a list that includes Xanadu, Freenet, and now the Semantic Web. The Semantic Web's philosophical argument -- the world should make more sense than it does -- is hard to argue with. The Semantic Web, with its neat ontologies and its syllogistic logic, is a nice vision. However, like many visions that project future benefits but ignore present costs, it requires too much coordination and too much energy to effect in the real world, where deductive logic is less effective and shared worldview is harder to create than we often want to admit.

Highlighted by blehrer

People pushing such technologies often make the "gateway drug" claim that rapid adoption of simple technologies is a precursor to later adoption of much more complex ones. Lotus claimed that simple internet email would eventually leave people clamoring for the more sophisticated features of CC:Mail (RIP), PointCast (also RIP) tried to label email a "push" technology so they would look like a next-generation tool rather than a dead-end, and so on.

Highlighted by jangondol

The real lesson of RSS autodiscovery is that developers can create valuable meta-data without needing any of the trappings of the Semantic Web. Were the whole effort to be shelved tomorrow, successes like RSS autodiscovery would not be affected in the slightest.

Highlighted by jangondol

larger goal, however, is to take up the old Artificial Intelligence project in a new context.

After 50 years of work, the performance of machines designed to think about the world the way humans do has remained, to put it politely, sub-optimal. The Semantic Web sets out to address this by reversing the problem. Since it's hard to make machines think about the world, the new goal is to describe the world in ways that are easy for machines to think about.

Highlighted by jangondol

If the world can't be reduced to unambiguous statements that can be effortlessly recombined, then it will be hard to rescue the Artificial Intelligence project. And that, of course, would be unthinkable.

Highlighted by jangondol

Any attempt at a global ontology is doomed to fail, because meta-data describes a worldview. The designers of the Soviet library's cataloging system were making an assertion about the world when they made the first category of books "Works of the classical authors of Marxism-Leninism." Melvil Dewey was making an assertion about the world when he lumped all books about non-Christian religions into a single category, listed last among books about religion. It is not possible to neatly map these two systems onto one another, or onto other classification schemes -- they describe different kinds of worlds.

Highlighted by jangondol

Furthermore, when we see attempts to enforce semantics on human situations, it ends up debasing the semantics, rather than making the connection more informative. Social networking services like Friendster and LinkedIn assume that people will treat links to one another as external signals of deep association, so that the social mesh as represented by the software will be an accurate model of the real world. In fact, the concept of friend, or even the type and depth of connection required to say you know someone, is quite slippery, and as a result, links between people on Friendster have been drained of much of their intended meaning. Trying to express implicit and fuzzy relationships in ways that are explicit and sharp doesn't clarify the meaning, it destroys it.

Highlighted by jangondol

The problem is that the more semantic consistency required by a standard, the sharper the tradeoff between complexity and scale. It's easy to get broad agreement in a narrow group of users, or vice-versa, but not both.

Highlighted by jangondol

The most widely adopted digital descriptor in history, the URL, regards semantics as a side conversation between consenting adults, and makes no requirements in this regard whatsoever: sports.yahoo.com/nfl/ is a valid URL, but so is 12.0.0.1/ftrjjk.ppq. The fact that a URL itself doesn't have to mean anything is essential -- the Web succeeded in part because it does not try to make any assertions about the meaning of the documents it contained, only about their location.

Highlighted by jangondol

The Semantic Web, with its neat ontologies and its syllogistic logic, is a nice vision. However, like many visions that project future benefits but ignore present costs, it requires too much coordination and too much energy to effect in the real world, where deductive logic is less effective and shared worldview is harder to create than we often want to admit.

Highlighted by jangondol

The amount of meta-data we generate is increasing dramatically, and it is being exposed for consumption by machines as well as, or instead of, people. But it is being designed a bit at a time, out of self-interest and without regard for global ontology.

Highlighted by jangondol