Abstract: |
In the last decade, the increasing popularity of the World Wide Web has led to exponential growth in the number of pages available on the Web. This huge number of Web pages makes it increasingly difficult for users to find the information they need. When searching the Web for specific information, one gets lost in the vast number of irrelevant search results and may miss relevant material. Current Web applications provide Web pages in HTML format that represent the content in natural language only, so the semantics of the content are not accessible to machines. To enable machines to support the user in solving information problems, the Semantic Web proposes an extension to the existing Web that makes the semantics of Web pages machine-processable. The semantics of the information on a Web page are formalized using RDF meta-data describing the meaning of the content. The existence of semantically annotated Web pages is therefore crucial in bringing the Semantic Web into existence. Semantic annotation addresses this need and aims to turn human-understandable content into a machine-processable form by adding semantic markup. Many tools have been developed that support the user during the annotation process. The annotation process, however, is a separate task and is not integrated into the Web engineering process. Web engineering proposes methodologies to design, implement, and maintain Web applications, but these methodologies lack the generation of meta-data. In this thesis we introduce a technique to extend existing XML-based Web engineering methodologies to develop semantically annotated Web pages. The novelty of this approach is the definition of a mapping from XML Schema to ontologies, called WEESA, that can be used to automatically generate RDF meta-data from XML content documents.
We further demonstrate the integration of the WEESA meta-data generator into the Apache Cocoon Web development framework to easily extend XML-based Web applications to semantically annotated Web applications. Looking at the meta-data of a single Web page gives only a limited view of the information available in a Web application. For querying and reasoning purposes, it is better to have the full meta-data model of the whole Web application at hand as a knowledge base. In this thesis we introduce the WEESA knowledge base, which is generated on the server side by accumulating the meta-data from individual Web pages. The WEESA knowledge base is then offered for download and querying by software agents. Finally, the Vienna International Festival industry case study illustrates the use of WEESA within an Apache Cocoon Web application in real life. We discuss the lessons learned while implementing the case study and give guidelines for developing Semantic Web applications using WEESA.
|