_____________________________________________________________________

                       ARANEUS WRAPPER TOOLKIT
                       Distribution Version 1.0
                            July 30, 1999
_____________________________________________________________________


--- Copyright and Disclaimer ---

**********************************************************************
*                                                                    *
* Copyright (c) 1998, 1999                                           *
* Araneus Group and Department of "Informatica ed Automazione",      *
* University "Roma Tre", Rome, ITALY.                                *
*                                                                    *
* All Rights Reserved.                                               *
*                                                                    *
* Permission to use, copy, and distribute this software and its      *
* documentation for NON-COMMERCIAL purposes and without fee is       *
* hereby granted provided that this copyright notice appears in      *
* all copies.                                                        *
*                                                                    *
* THE AUTHORS MAKE NO REPRESENTATIONS OR WARRANTIES ABOUT THE        *
* SUITABILITY OF THE SOFTWARE, EITHER EXPRESS OR IMPLIED, INCLUDING  *
* BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY,      *
* FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT. THE AUTHOR  *
* SHALL NOT BE LIABLE FOR ANY DAMAGES SUFFERED BY LICENSEE AS A      *
* RESULT OF USING, MODIFYING OR DISTRIBUTING THIS SOFTWARE OR ITS    *
* DERIVATIVES.                                                       *
*                                                                    *
**********************************************************************


--- Contents ---

This distribution contains the Araneus Wrapper Toolkit, a set of tools
for wrapper generation over Web text data sources, namely Editor and
Minerva.

Minerva and Editor are in essence a regular grammar parser with
explicit exception handling features. Exceptions raised during parsing
can be captured and handled procedurally (using Editor code) in order
to recover from the exception and resume parsing.

All material (especially documentation) is still in the beta-testing
phase. If you find bugs or mistakes, please be forgiving and report
them to the following address:

    crescenz@dia.uniroma3.it

Minerva is developed as part of the ARANEUS Web-Base Management
System, a system for creating and managing Web sites.
Using the system it is possible to develop large sites based on a
relational database. For further details about the ARANEUS project,
and a list of on-line publications, please refer to our Web sites, at
URLs:

    http://www.dia.uniroma3.it/Araneus
    http://www.difa.unibas.it/Araneus


--- Installation ---

NOTE: If you have installed a previous version of this distribution
package, please remove all files before installing this new version.

To run the system you need a Java 1.2 Development Kit (JDK 1.2).
Because it is written in Java, the software runs in both Windows and
Unix environments.

To install the software:

- under Unix, type

      tar xzvf araneusWTKv1.0.tar.gz

- under Windows, use WinZip, or any other compression utility, to
  unzip the file.

Unpacking the file creates several directories on your target drive,
as follows:

    araneusWTK
        README.TXT
        mwg.class
        mwg.java
        editor
        api
        docs
            tutorial.ps
            shells.txt
        minerva
            minerva.properties
        shell
            MinervaShell.java
            MinervaShell.class
        examples
            NFfiles
            html

- README.TXT
      this file;

- mwg.java and mwg.class (minerva wrapper generator)
      a very simple Java interface to use Minerva;
- MinervaShell.java
      a slightly more sophisticated interface to use Minerva; to see
      how to use these shells, see file "shells.txt" in directory
      "docs";

- minerva, editor
      Java packages containing the system code, as follows:
      - minerva : handles the wrapper generation;
      - editor  : implements the Editor language for text file
                  manipulation;
      the JavaDoc API documentation is in directory "api";

- docs
      contains a preliminary manual, "The Araneus Wrapper Toolkit: A
      Tutorial", adapted from one of our papers on the subject, plus
      a brief explanation of how to use the shells;

- examples
      contains several example wrappers:
      - NFfiles : Minerva source code for the wrappers (.NF files),
                  plus ConferencePage.cfg, a sample configuration
                  file for MinervaShell;
      - html    : some HTML files on which the wrappers can be
                  tested.


--- Help ---

Feel free to contact us at the following address for bugs, comments,
or feedback about the system:

    crescenz@dia.uniroma3.it