The World Wide Web in ARANEUS
[Figure: the ADM scheme]
Description

This page shows how to extract data from the whole Web using ARANEUS. We see the Web as a huge collection of essentially unstructured documents, in the spirit of WebSQL. Web pages are described by the page-scheme WebPage as objects with a set of (possibly missing) monovalued attributes (url, title, type, length, modif), plus a list of links (LinkList) to other Web pages; for each link, we report the label (or anchor) and the URL of the destination document. Documents can be accessed either by specifying their URL or by querying a search engine. We use the popular Altavista engine to locate documents containing a set of keywords.
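To make the WebPage page-scheme more concrete, the following is a minimal Python sketch of how such an object could be built from an HTML document that has already been retrieved. It is not ARANEUS code: the names `Link`, `link_list`, and `wrap_page` are hypothetical, the attribute names simply mirror those listed above, and `length` is assumed here to be the size of the document in bytes. Querying a search engine is left out, since its interface is not described on this page.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from typing import List, Optional
from urllib.parse import urljoin


@dataclass
class Link:
    label: str  # anchor text of the link
    url: str    # URL of the destination document


@dataclass
class WebPage:
    # Monovalued attributes; any of them may be missing (None).
    url: Optional[str] = None
    title: Optional[str] = None
    type: Optional[str] = None
    length: Optional[int] = None  # assumption: document size in bytes
    modif: Optional[str] = None   # last-modification date, if known
    # LinkList: one (label, url) entry per outgoing link.
    link_list: List[Link] = field(default_factory=list)


class _PageParser(HTMLParser):
    """Collects the <title> text and every <a href=...> with its label."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.title = None
        self.links = []
        self._in_title = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
            self._text = []
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._in_title or self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self.title = "".join(self._text).strip() or None
            self._in_title = False
        elif tag == "a" and self._href is not None:
            label = "".join(self._text).strip()
            self.links.append(Link(label, self._href))
            self._href = None


def wrap_page(url, html, content_type=None, modif=None):
    """Build a WebPage object from an already-downloaded HTML string."""
    parser = _PageParser(url)
    parser.feed(html)
    parser.close()
    return WebPage(
        url=url,
        title=parser.title,
        type=content_type,
        length=len(html.encode("utf-8")),
        modif=modif,
        link_list=parser.links,
    )
```

Calling `wrap_page(url, html)` on a page without a `<title>` element, or without the optional `content_type` and `modif` arguments, leaves the corresponding attributes as `None`, which is one way to model the "possibly missing" attributes of the page-scheme.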
Examples of Queries: