The World Wide Web in ARANEUS


Description

On this page we show how to extract data from the whole Web using ARANEUS. We treat the Web as a huge collection of essentially unstructured documents, in the spirit of WebSQL.

Web pages are described by the page-scheme WebPage as objects with a set of (possibly missing) monovalued attributes (url, title, type, length, modif), plus a list of links (LinkList) to other Web pages; for each link, we report its label (the anchor text) and the URL of the destination document.
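To make the shape of WebPage concrete, here is a minimal Python sketch of the page-scheme as a plain data structure, together with a small parser that fills it from an HTML string. It is only an illustration under stated assumptions, not ARANEUS code: the names Link, parse_web_page and link_list, and the choice to measure length in UTF-8 bytes, are hypothetical and do not come from this page. Only the attribute names url, title, type, length and modif are taken from the description above.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from typing import List, Optional
from urllib.parse import urljoin


@dataclass
class Link:
    label: str  # anchor text of the link
    url: str    # URL of the destination document


@dataclass
class WebPage:
    url: str
    title: Optional[str] = None   # every attribute except url may be missing
    type: Optional[str] = None
    length: Optional[int] = None
    modif: Optional[str] = None
    link_list: List[Link] = field(default_factory=list)  # plays the role of LinkList


class _PageParser(HTMLParser):
    """Collects the <title> text and every <a href=...> with its anchor text."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.title: Optional[str] = None
        self.links: List[Link] = []
        self._in_title = False
        self._href: Optional[str] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
            self._text = []
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._in_title or self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self.title = "".join(self._text).strip() or None
            self._in_title = False
        elif tag == "a" and self._href is not None:
            self.links.append(Link("".join(self._text).strip(), self._href))
            self._href = None


def parse_web_page(url, html, content_type=None, modif=None):
    """Build a WebPage from already-downloaded HTML (hypothetical helper)."""
    parser = _PageParser(url)
    parser.feed(html)
    parser.close()
    return WebPage(url=url, title=parser.title, type=content_type,
                   length=len(html.encode("utf-8")), modif=modif,
                   link_list=parser.links)
```

Attributes that cannot be determined are simply left as None, which mirrors the "possibly missing" wording above.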

Documents can be accessed either by specifying their URL or by querying a search engine. We use the popular AltaVista search engine to locate documents containing a given set of keywords.


Examples of Queries:

Example 1: "Titles and URLs of all documents about ARANEUS".
Example 2: "All links, from the Java Italian Site, to applets on graphics".
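The intent of the two example queries can be sketched in Python over an in-memory collection of pages. This is not the ARANEUS query syntax, which this page does not show, and it does not contact any search engine. The dictionary layout (keys url, title, text, links) and the function names titles_and_urls and links_matching are assumptions made purely for illustration.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical page representation: a dict with "url", an optional "title",
# optional body "text", and optional "links" as (label, destination URL) pairs.
Page = Dict[str, object]


def _contains_all(text: str, keywords: Iterable[str]) -> bool:
    """True if every keyword occurs in text, ignoring case."""
    lowered = text.lower()
    return all(k.lower() in lowered for k in keywords)


def titles_and_urls(pages: Iterable[Page],
                    keywords: List[str]) -> List[Tuple[Optional[str], str]]:
    """Example 1 style: (title, url) of every page mentioning all keywords."""
    result = []
    for page in pages:
        text = f"{page.get('title') or ''} {page.get('text') or ''}"
        if _contains_all(text, keywords):
            result.append((page.get("title"), page["url"]))
    return result


def links_matching(page: Page, keywords: List[str]) -> List[Tuple[str, str]]:
    """Example 2 style: links of one page whose label mentions all keywords."""
    return [(label, url)
            for label, url in page.get("links", [])
            if _contains_all(label, keywords)]
```

For instance, titles_and_urls(pages, ["ARANEUS"]) corresponds to Example 1, and links_matching(site_page, ["applet", "graphics"]) corresponds to Example 2 once site_page holds the starting page of the site in question.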

