Semantic web

From the Quicksilver Metaweb.

A semantic web is a web in which information retrieval is enhanced by the inclusion of data about data (metadata) and by the ability to obtain data from disparate sources.

A fundamental problem of communication is that the same word can have a variety of meanings. The meaning intended in a specific instance is determined by the context in which the word appears. For example, the word "bass" can mean a type of fish or a type of musical instrument. (Compare "I fished all day, but the bass were uncooperative" with "John Entwistle played bass for the Who.") Given the context, it is easy to determine which meaning is intended.

What is easy for the human brain to discern is problematic for computers. When data retrieval is limited to matching keywords in search terms with words in documents, the context in which a word is used plays no part in the match. The only way to narrow the information retrieved by a search is to refine the keywords used and their combination. So if I were looking for a nearby place for my father to fish, I would have to include words like "bass," "fish," and "New York." This would eliminate search results relating to guitars, but I would still need to scan through the results to find what I needed. I would need some basis to make judgments about the relevance and quality of the information I obtained. Depending on what I found, this could be a tedious process with no real way to know whether the result was accurate and reliable. Most likely, my father would stop at a local bait shop or two, talk to local fishermen, and make his own decision.
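
The limitation described above can be sketched in a few lines of Python. This is only an illustration, not how any real search engine works: the three documents are made up for the example, and the hypothetical keyword_search function uses plain case-insensitive substring matching.

```python
# Hypothetical mini-corpus; the sentences are illustrative only.
documents = [
    "I fished all day, but the bass were uncooperative",
    "John Entwistle played bass for the Who",
    "Bass fishing spots near New York",
]


def keyword_search(docs, keywords):
    """Return the documents that contain every keyword.

    Matching is a case-insensitive substring test, so the context
    in which a word appears is never consulted.
    """
    wanted = [k.lower() for k in keywords]
    return [d for d in docs if all(k in d.lower() for k in wanted)]


# A single keyword cannot separate the fish from the instrument.
one_word = keyword_search(documents, ["bass"])

# Adding keywords narrows the results, but only by word overlap.
refined = keyword_search(documents, ["bass", "fish", "New York"])

print(len(one_word))
print(refined)
```

The search for "bass" alone returns all three documents, including the one about John Entwistle. Adding "fish" and "New York" narrows the list, but the program still has no idea which sense of "bass" any document intends, or whether the page is reliable.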

In a semantic web, data can be retrieved in more sophisticated and specific ways than matching keywords. At least four improvements over the keyword system are obtained. First, the number of irrelevant results is reduced. If we assume that a semantic web distinguishes between "bass" as a fish and "bass" as an instrument, a query about "bass locations in New York" would omit results about local blues clubs. Second, the number of sources of information is increased. A search for "bass" would expand to include sources written in other languages. Third, the reliability of the information retrieved would be more easily determined. The author of each result would be clear, as would the date and place in which it was written. Pages written about bass fishing in the 1940s would be omitted. Fourth, the data can be manipulated in new ways. I could make my own database quickly and then create graphs regarding the number of fish caught, the location, the time of day, and the weather conditions.
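
As a rough sketch of the first three improvements, suppose each page carried a few metadata fields alongside its text. Everything here is an assumption made for illustration: the Page fields (sense, author, year, language), the sample pages and authors, and the semantic_search function are hypothetical, and a real semantic web would describe such data in a shared vocabulary rather than a Python class.

```python
from dataclasses import dataclass


@dataclass
class Page:
    text: str
    sense: str     # which meaning of "bass" the page is about
    author: str
    year: int      # year the page was written
    language: str


# Hypothetical pages; titles, authors, and dates are invented.
pages = [
    Page("Bass locations in New York", "fish", "A. Angler", 2003, "en"),
    Page("Blues clubs with great bass players", "instrument", "B. Player", 2002, "en"),
    Page("Bass fishing on the Hudson", "fish", "C. Oldtimer", 1947, "en"),
    Page("Pesca de lubina en Nueva York", "fish", "D. Pescador", 2004, "es"),
]


def semantic_search(candidates, sense, min_year):
    """Select pages by intended meaning and date, not by matching words."""
    return [p for p in candidates if p.sense == sense and p.year >= min_year]


results = semantic_search(pages, sense="fish", min_year=1990)

for page in results:
    print(page.author, page.year, page.language, page.text)
```

Because the query selects on the sense field rather than on the word "bass," the blues club page is dropped, the 1947 page is filtered out by date, and the Spanish page is kept even though the word "bass" never appears in it. The author and year of each result are available for judging reliability.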

A semantic web is a way of organizing a massive amount of data so that information can be retrieved more efficiently and accurately.

While the example used may seem trivial, the point is that meaningful information can be retrieved only by including the context in which it appears.