Explorer's Guide to the Semantic Web

1.1.1 Indexing and retrieving information

Everyone wrestles with how to find information. Libraries have card catalogs, and now many have electronic indexes. Search engines are vital components of the Web. Yet at some point, everyone has been frustrated and annoyed by how hard it is to locate things, especially when you aren't sure what to ask for. To find information, a Semantic Web approach would expect to go beyond keyword and alphabetical indexes to let users search by concepts and categories.

The “Web” part of the name brings in a persistent theme, in which information is distributed (spread throughout the Web) rather than concentrated in a few repositories. Most systems that use concept identification to retrieve information maintain their own concept hierarchies and attempt to identify those concepts in the documents they index. Sometimes concepts in a document collection are identified automatically, with varying success. To go further requires that documents be able to declare their own vocabularies and sets of concepts and to identify where they’re used.
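To make the contrast between keyword search and concept search concrete, here is a minimal sketch. It assumes each document carries a declared set of concept identifiers; the identifiers, document names, and function names are all invented for illustration and do not come from any real vocabulary or library.

```python
from collections import defaultdict

# Hypothetical concept identifiers; a real vocabulary would be published
# somewhere on the Web and referenced by its own URIs.
BATTING_AVERAGE = "http://example.org/baseball#BattingAverage"
RENTAL_PRICE = "http://example.org/commerce#RentalPrice"

# Invented documents. Each one declares the concepts it uses.
DOCUMENTS = {
    "doc-a": {
        "text": "A hitter's batting average is hits divided by at-bats.",
        "concepts": {BATTING_AVERAGE},
    },
    "doc-b": {
        "text": "Prices for an hour in the batting cage.",
        "concepts": {RENTAL_PRICE},
    },
}


def keyword_search(documents, word):
    """Return ids of documents whose text contains the word."""
    needle = word.lower()
    return {doc_id for doc_id, doc in documents.items()
            if needle in doc["text"].lower()}


def build_concept_index(documents):
    """Map each declared concept to the documents that declare it."""
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        for concept in doc["concepts"]:
            index[concept].add(doc_id)
    return index


def concept_search(index, concept):
    """Return ids of documents that declare the given concept."""
    return set(index.get(concept, ()))


CONCEPT_INDEX = build_concept_index(DOCUMENTS)
print(keyword_search(DOCUMENTS, "batting"))            # matches both documents
print(concept_search(CONCEPT_INDEX, BATTING_AVERAGE))  # matches only doc-a
```

The keyword search cannot tell a statistic from a batting cage, while the concept search relies only on what each document says about itself.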

1.1.2 Meta data

Card catalogs and electronic indexes contain data about the works that are cataloged and indexed. Data about other data is often called meta data. For example, the ISBN and the author’s name are meta data about a novel. The datatypes describing the data in a database also fall into the category of meta data. It’s even possible to have meta meta data (a statement about the origin of a piece of meta data could be considered to be meta data about meta data, or meta meta data).

In one sense, meta data is still data; the distinction lies in the intended use of the data and in the subject of the meta data. It’s meta data that will be used for searches and for discovery of information. Annotation can also be thought of as meta data.
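One way to picture meta data and meta meta data is as simple statements, where the subject of a statement can itself be another statement. The sketch below is only an illustration under that assumption; the `Statement` class, the identifier `novel:example`, and all values are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One piece of data about a subject."""
    subject: object  # an identifier, or another Statement
    prop: str
    value: str


NOVEL = "novel:example"  # hypothetical identifier for a novel

# Meta data: statements about the novel (placeholder values).
author_stmt = Statement(NOVEL, "author", "A. N. Author")
isbn_stmt = Statement(NOVEL, "isbn", "placeholder-isbn")

# Meta meta data: a statement about the origin of a piece of meta data.
origin_stmt = Statement(author_stmt, "source", "hypothetical library catalog")

STATEMENTS = [author_stmt, isbn_stmt, origin_stmt]


def about(statements, subject):
    """Return every statement whose subject is the given thing."""
    return [s for s in statements if s.subject == subject]


print(about(STATEMENTS, NOVEL))        # the meta data
print(about(STATEMENTS, author_stmt))  # the meta meta data
```

The same `about` function answers both questions, which mirrors the point that meta data is still data; only its subject differs.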

1.1.3 Annotation

In the world of physical documents (such as books), people write margin notes and comments, they underline and highlight passages, they staple new items to reports, and they add thoughts and ideas to those of the original authors. Markup languages like XML should, you’d think, be able to add such annotations; but today it’s hard to do this in a simple way that lets other people share your annotations and lets you move your annotations to other applications and computers. Wiki-style web sites attempt to let many people comment on and modify web pages, but this process covers only a little of what people would like to do.

Because annotations should be shareable, and because the meaning of different types of annotations should be widely understood, support for extensive annotation capabilities is often seen as part of the Semantic Web.
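As a rough sketch of what a shareable annotation could look like, the example below keeps each annotation separate from the document it refers to and moves it between applications as JSON. The field names and the example URL are assumptions made for illustration, not an established annotation format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Annotation:
    """A note stored apart from the document it annotates."""
    target: str  # address of the annotated document
    start: int   # character offset where the annotated passage begins
    end: int     # character offset where it ends
    kind: str    # for example "highlight" or "comment"
    body: str    # the annotator's own text, possibly empty
    author: str


def export_annotations(annotations):
    """Serialize annotations so another application can load them."""
    return json.dumps([asdict(a) for a in annotations], indent=2)


def import_annotations(payload):
    """Rebuild annotations from their serialized form."""
    return [Annotation(**item) for item in json.loads(payload)]


NOTES = [
    Annotation("http://example.org/report", 120, 180, "highlight", "", "reader-1"),
    Annotation("http://example.org/report", 400, 410, "comment",
               "Check this figure.", "reader-2"),
]

shared = export_annotations(NOTES)
print(import_annotations(shared) == NOTES)
```

Because the `kind` field is just a string here, two applications would still have to agree on what "highlight" means, which is the part the text says must be widely understood.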

1.1.4 A huge interoperable database

Today it’s common to get data from a database over the Web. These databases are generally separate and not easily used as merged data sources, and a great deal of data exists outside of databases. This part of the Semantic Web vision sees ways to unify the description and retrieval of stored data, allowing much of the Web to be considered part of a large virtual database.

Consider a sports researcher looking for baseball data. There are various online baseball databases; the Major League Baseball web site is but one of many. But if our researcher wants to find performance statistics for Stan Musial, whose career lasted from the 1940s to the 1960s, she can’t get data for the whole period in a mutually compatible format. At least for baseball statistics, there is some common agreement on the definitions of the most important statistics, so that a batting average is always computed the same way. This is more than can be said for most separate collections of data.

If the Web functioned as an interoperable database, the researcher could get the data from all the important sites, and the researcher’s software would be able to either display all the data together or automatically combine data from, say, the Major League Baseball site and the Baseball Almanac.
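The following sketch suggests what "automatically combine data" might involve, assuming two sites publish the same statistics under different field names. The site layouts, field maps, player name, and numbers are all invented; only the batting average formula (hits divided by at-bats) is standard.

```python
# Invented records in two invented layouts; not real statistics.
SITE_A = [{"player": "Example Player", "year": 1950, "H": 150, "AB": 500}]
SITE_B = [{"name": "Example Player", "season": "1951",
           "hits": "160", "at_bats": "520"}]

# Hypothetical mapping from each site's field names to shared names.
FIELD_MAPS = {
    "site_a": {"player": "player", "year": "season",
               "H": "hits", "AB": "at_bats"},
    "site_b": {"name": "player", "season": "season",
               "hits": "hits", "at_bats": "at_bats"},
}


def normalize(record, field_map):
    """Rename fields to the shared vocabulary and coerce numbers to int."""
    out = {field_map[key]: value for key, value in record.items()
           if key in field_map}
    for key in ("season", "hits", "at_bats"):
        out[key] = int(out[key])
    return out


def merge_sources(sources):
    """Combine records from several sites into one ordered list."""
    merged = []
    for site, records in sources.items():
        merged.extend(normalize(r, FIELD_MAPS[site]) for r in records)
    return sorted(merged, key=lambda r: (r["player"], r["season"]))


def batting_average(records):
    """Hits divided by at-bats over all the given records."""
    hits = sum(r["hits"] for r in records)
    at_bats = sum(r["at_bats"] for r in records)
    return hits / at_bats if at_bats else 0.0


CAREER = merge_sources({"site_a": SITE_A, "site_b": SITE_B})
print(CAREER)
print(round(batting_average(CAREER), 3))
```

Here the field maps are written by hand. In the vision described above, each site would publish enough description of its own data for such a mapping to be found without human help.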

1.1.5 Machine retrieval of data

This part of the vision focuses on automatic acquisition of data. This means that a piece of software, in pursuit of its assignment, determines what data it needs and where and how to get it, and then goes out and gets the data. Using the baseball example from the previous section, today our researcher has to find the right web pages, load them, and then figure out a way to get the data and organize it. This is hard to do and often takes a lot of time. Under the Semantic Web, the data format and its manner of access would be described in a way that would allow the researcher’s computer to get and use the data automatically.
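A small sketch can show how a published description of the data format might let software read the data without hand-written, per-site code. The description layout, the column names, and the payload below are assumptions for illustration; the payload is an in-memory string standing in for something fetched over the network.

```python
import csv
import io
import json

# Hypothetical description a source might publish about its own data:
# the format, plus the shared name and type for each source column.
SOURCE_DESCRIPTION = {
    "format": "csv",
    "fields": {
        "yr": ("season", int),
        "h": ("hits", int),
        "ab": ("at_bats", int),
    },
}

# Invented payload; not real statistics.
PAYLOAD = "yr,h,ab\n1950,150,500\n1951,160,520\n"


def parse_rows(fmt, payload):
    """Turn raw text into a list of row dictionaries, based on the format."""
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(payload)))
    if fmt == "json":
        return json.loads(payload)
    raise ValueError(f"unsupported format: {fmt}")


def acquire(description, payload):
    """Read a payload using nothing but the source's own description."""
    rows = parse_rows(description["format"], payload)
    fields = description["fields"]
    return [
        {common: cast(row[source]) for source, (common, cast) in fields.items()}
        for row in rows
    ]


RECORDS = acquire(SOURCE_DESCRIPTION, PAYLOAD)
print(RECORDS)
```

The `acquire` function contains no knowledge of baseball or of any one site; everything specific lives in the description, which is the property the text is asking for.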

1.1.6 Services

A service is a behavior that provides a benefit. Examples include making reservations, arranging schedules, providing prices, placing orders, and so forth. Think of ordering, say, a perishable item like flowers or food. Once you’ve selected a product to buy, you have to make sure that its delivery will fit into your schedule. The price, buying conditions, delivery options, and your schedule can all be thought of as services that must be activated and coordinated. In the “Semantic Web as web services” view, all these services would publish machine-readable data that would allow a computer to do all the activation and coordination for you.
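The coordination described here can be sketched as a function that consults a price service, a delivery service, and your schedule, and then picks the first delivery date that works. The three stand-in services, the product name, the price, and the dates are all hypothetical.

```python
from datetime import date


# Stand-ins for machine-readable services; all values are invented.
def price_of(product):
    return {"bouquet": 30}[product]


def delivery_dates_for(product):
    return [date(2024, 5, 3), date(2024, 5, 1), date(2024, 5, 2)]


BUSY_DAYS = {date(2024, 5, 1)}  # days you cannot accept a delivery


def arrange_order(product, budget, price_service, delivery_service, busy_days):
    """Return an order that fits the budget and the schedule, or None."""
    price = price_service(product)
    if price > budget:
        return None
    # Try the earliest delivery dates first.
    for day in sorted(delivery_service(product)):
        if day not in busy_days:
            return {"product": product, "price": price, "delivery": day}
    return None


ORDER = arrange_order("bouquet", 40, price_of, delivery_dates_for, BUSY_DAYS)
print(ORDER)
```

Passing the services in as arguments keeps `arrange_order` independent of any one vendor, so a different florist could be swapped in without changing the coordination logic.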

1.1.7 Discovery

To use services, you (and especially your software) must be able to find them, discover what they do, and learn how to invoke them. This is the realm of discovery of services. The most obvious approach would be to create directories of services with standard access methods. The services would be described in standard terms, and information about how to access them and the available information would be encoded in standard ways.
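A directory of this kind might be sketched as a list of service descriptions that all use the same fields. The field names, capability labels, and example.org addresses below are assumptions made for illustration, not part of any real directory standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDescription:
    """A directory entry written in standard terms."""
    name: str
    capability: str         # what the service does
    endpoint: str           # where to reach it
    required_inputs: tuple  # what it needs in order to be invoked


# Invented directory entries.
DIRECTORY = [
    ServiceDescription("Example Florist", "order-flowers",
                       "http://example.org/florist/order",
                       ("product", "delivery_date")),
    ServiceDescription("Example Grocer", "order-food",
                       "http://example.org/grocer/order",
                       ("product", "delivery_date")),
    ServiceDescription("Example Airline", "book-flight",
                       "http://example.org/airline/book",
                       ("origin", "destination", "date")),
]


def find_services(directory, capability):
    """Find every service that offers the requested capability."""
    return [s for s in directory if s.capability == capability]


def missing_inputs(service, request):
    """List the inputs the service needs that the request lacks."""
    return [name for name in service.required_inputs if name not in request]


florists = find_services(DIRECTORY, "order-flowers")
print(florists[0].endpoint)
print(missing_inputs(florists[0], {"product": "bouquet"}))
```

Because every entry has the same shape, software can both find a service and learn what it must supply before invoking it.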

Consider an analogy with a physical library. Most libraries in the United States use either the Dewey Decimal System or the Library of Congress method to catalog their books. After using the card catalog or its electronic version, a person becomes familiar with the classifications and learns how to find books on the shelves. Here, the standard access methods are the familiar classification system and the physical arrangement of books in the library.

A more advanced approach would be to send out discovery requests based on the services required, and for candidate services to describe their capabilities in such a way that the would-be user could deduce their capabilities and instigate a conversation to find any missing or uncertain information. Returning to the library example, this would be like getting an experienced research librarian to tell you which reference books to look at and how to understand the information in them.

1.1.8 Intelligent agents

An agent is someone or something that acts on your behalf. A software agent would act in a somewhat autonomous way, communicating with other software agents (which might be specialized) to discover services, products, or information for you. For instance, one of those specialized agents might know how to purchase airline tickets and make reservations. Another agent might perform the required services, passing the results back to your own agent, which would notify you of the outcome. It’s clear that a network of interacting agents would have to be able to describe its goals using established vocabularies, to discover services and information resources, and to use many of the capabilities described in the previous sections.
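The delegation pattern described above can be sketched with two small classes: a personal agent that knows several specialized agents and hands each task to one that can perform it. The class names, task names, and the ticket-booking function are hypothetical, and a real agent system would communicate over a network rather than through direct method calls.

```python
class SpecialistAgent:
    """An agent that knows how to perform a fixed set of tasks."""

    def __init__(self, name, skills):
        self.name = name
        self.skills = skills  # maps a task name to a function

    def can_handle(self, task):
        return task in self.skills

    def perform(self, task, **details):
        return self.skills[task](**details)


class PersonalAgent:
    """Your own agent: finds a specialist and reports the outcome."""

    def __init__(self, known_agents):
        self.known_agents = known_agents

    def request(self, task, **details):
        for agent in self.known_agents:
            if agent.can_handle(task):
                result = agent.perform(task, **details)
                return f"{agent.name} completed {task}: {result}"
        return f"No agent found for {task}"


# Invented specialist skill.
def book_ticket(origin, destination):
    return f"ticket from {origin} to {destination}"


travel_agent = SpecialistAgent("travel-agent", {"buy-airline-ticket": book_ticket})
my_agent = PersonalAgent([travel_agent])

print(my_agent.request("buy-airline-ticket", origin="Chicago", destination="Paris"))
print(my_agent.request("order-flowers", product="bouquet"))
```

The task names play the role of the established vocabulary mentioned in the text: the two agents can cooperate only because they agree on what "buy-airline-ticket" means.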

1.2 Two Semantic Web scenarios

To give you a feel for the way these areas might interact and how the Semantic Web could provide great value, here are two scenarios that were developed during the workshop “Research Challenges and Perspectives of the Semantic Web” (see note 1). Both scenarios illustrate what might be called personal services. Of course, similar scenarios could be constructed for many other areas, such as business-to-business transactions. Note that the language is taken directly from the report without corrections for grammatical and spelling errors.

Scenario 1: A research assistant

1 This workshop was organized by the European Consortium in Informatics and Mathematics (ERCIM) for the European Union Future Emergent Technology program (EU-FET) and the US National Science Foundation (NSF). It was held in Sophia-Antipolis, France, in October 2001.

2 DARPA Agent Markup Language; see chapter 7.

Created: March 27, 2003
Revised: October 4, 2004

URL: http://webreference.com/internet/semantic/1