Penelope Reference Manual
ver. 0.99
G. Sindoni, M. Magnante, P. Merialdo, A. Masci, G. Mecca
Penelope is a declarative language that allows the generation of Web sites from relational databases. The structure of the target site pages can be defined by suitable statements that describe page-schemes. A set of page-schemes is a site-scheme. The reference data model for describing page-schemes is the Araneus Data Model (ADM).
The Araneus Data Model is a logical model for Web hypertexts. In ADM, each page is a URL-identified nested object. Pages are in fact described by attributes, which may be atomic, that is, text, images ot links to other pages, or complex, that is nested lists of tuples. The features of the model will be illustrated in more details in the following. For further details on the data model, consult the Araneus publication page at URL http://poincare.dia.uniroma3.it:8080/Araneus.
The following Penelope statement defines a site-scheme:
SCHEME
addressSTYLE style
ON
databaseAddress is the target site address, that is, a URL of the form:
Style is a style file to be used by Telemaco (see the Telemaco user manual) to generate and customize the graphical layout of pages in the site.
Database specifies the name of the data source, usually an ODBC data source, containing the (relational) database in which data to be published are stored.
In the following, Penelope statements will be presented with the help of examples that refer to the following Department relational database scheme:
Professor (Name, Position, Email, Phone, Photo, Publications)
Student (Name, Email, Phone)
Course (CourseName, Description, Instructor, Type)
PersonInGroup (Name, GroupName)
ResearchGroup (GroupName, Topic)
TextBook (CourseName, Title, Author, Publisher)
A site-scheme is defined by a bunch of Penelope statements, one for each page-scheme in the site. Each definition specifies, for a given page-scheme, in which way page attributes are to be mapped to database attributes. Consider the following Penelope statement:
Example 1:
DEFINE PAGE GraduateCoursePage : <CourseName>
STYLE ..\styles\test.sty
AS URL(<CourseName>);
Name : TEXT = <CourseName>;
Description : TEXT = <Description>;
Type : TEXT = <Type>;
USING Course
WHERE Type LIKE "Graduate"
END
It specifies how to generate a bunch of pages that are instances of the ADM page-scheme GraduateCorsePage, whose structure is reported in Figure 2. Data are taken from table Course in Figure 1.
CourseName |
Description |
Instructor |
Type |
Compilers |
.......... |
Riccardo Torlone |
undergraduate |
DatabaseSystems |
.......... |
Paolo Atzeni |
graduate |
OperatingSystems |
Principles of operating... |
Alessandro Micarelli |
graduate |
InformationSystems |
........... |
Paolo Atzeni |
undergraduate |
Figure 2 The ADM scheme for GraduateCoursePage
From Example 1, we can see how a Penelope statement has several clauses:
These clauses will be discussed in the following paragraphs.
The DEFINE PAGE clause primarily specifies the name of the page-scheme the statement refers to. It is also used to specify the HTML title to be given to instances of the page-scheme, as it will be discussed in the following. The database tables from which data should be extracted are listed in the USING clause. The statement in Example 1 defines how to generate instances of a page-scheme called GraduateCoursePage based on tuples in the Course table in the database. Features of the USING clause will be discussed in more detail in the following.
The style clause is an optional clause used to associate a presentation, i.e. a graphical layout, with instances of a given page scheme. This clause is ignored by Penelope. The process of specifying the HTML layout for pages is in fact taken care of by another software module, called Telemaco; (see the Telemaco User Manual). When generating pages, Penelope relies on "Attribute Style Files" (.ATS) generated by Telemaco. If .ATS files are not available, a default presentation style will be used for pages in the site.
The AS clause specifies how page-scheme attributes are to be mapped to attributes of tables listed in the USING clause. In particular, this clause specifies:
Let us discuss the second point first. As you can tell from Example 1 and Figure 2, for each attribute of the page-scheme there is an item in the AS clause. In fact, the CoursePage page-scheme has three TEXT attributes: Name, Description and Type. Attribute values for each page instance are values of the Course table (see Figure 1), as specified in the USING clause. Each page-scheme attribute is associated with a relation attribute, whose name is enclosed by a pair of acute brackets ("<" and ">"). For example, the clause specifies that values for the Name attribute in instances of CoursePage come from values of the CourseName attribute in the Course relational table.
URLs are assigned to pages using the URL() function. The URL() function creates distinct addresses based on a number of parameters. These parameters come from values of table attributes. The URL() function for the page-scheme of Example 1 is the following:
URL(<CourseName>)
The formal parameter is a key attribute of the Course table. The Penelope interpreter builds, for each tuple in the table, a page instance having a URL that is obtained from the CourseName attribute value. For example, the Database Systems course page URL will be:
site-path\CoursePage\DatabaseSystems.html
Note that a page address is generated by juxtaposing the site-path, the page-scheme name and the value obtained from the function parameter.
For unique pages, the URL() function can be given a constant value:
URL("index.html")
Here, an URL is generated directly from the parameter value. The URL() function may take more than one argument:
URL("index", <CourseName>)
Here, the URL() function builds an URL for each value of the <CourseName> attribute. A string is created from the combination of parameter values and is used for URL generation. In our example, the pairs of values are:
"index" "Compilers"
"index" "DatabaseSystems"
......................
"index" "Operating Systems"
A similar mechanism is used also to define in the DEFINE PAGE clause the special attribute title, corresponding to the HTML title of a page. Values for the title can either be constant strings or come from database attributes. For instance, in Example 1, instances of page-scheme CoursePage will have titles generated from the <CourseName> attribute of the database table. In other cases, pages might have constant titles, like the following:
DEFINE PAGE CoursePage: "Sample Site: Courses"
This option is particularly useful for titles of unique page-schemes, i.e., page-schemes having a single instance (like, for example, the Home Page).
The USING clause specifies the relational tables from which data are extracted. These tables can be either base database tables, or views. In Example 1, the Course base table is specified. The statement in the following example generates the same set of pages, but uses a view called GradCourse as a data source for the page-scheme:
Example 2:
DEFINE PAGE GraduateCoursePage : <CourseName>
AS URL(<CourseName>);
Name : TEXT = <CourseName>;
Description : TEXT = <Description>;
Type : TEXT = <Type>;
USING GradCourse : ( Select Course.*
From Course
Where Type = "Graduate")
DISTINCT
END
Relational views may be defined by any SQL statement. The query in the USING clause of Example 2 generates a table, GradCourse, by selecting those tuples in Course referring to graduate courses. The view will be temporarily materialised in the database, and then removed after all instances of the page-scheme have been generated.
GroupName |
Topic |
Database and Information Systems |
Databases and the Web |
Database and Information Systems |
Heterogeneous databases |
Database and Information Systems |
CASE technologies |
Database and Information Systems |
Active databases |
Database and Information Systems |
Database languages |
Database and Information Systems |
Medical computer science |
Database and Information Systems |
Requirement engineering |
Artificial Intelligence |
Logic for knowledge representation |
Artificial Intelligence |
Automated deduction |
Artificial Intelligence |
Modal and temporal logic |
Artificial Intelligence |
Abductive reasoning |
Graph Drawings |
Computer aided design |
Graph Drawings |
Geometric modelling & programming |
Graph Drawings |
Computer graphics |
Symbolic Computation |
Experimentation with algebraic methods |
Figure 3 The "ResearchGroup" relational table
In the page-scheme definition of Example 2, the DISTINCT directive is used. This directive can be associated to both base tables and views. It is used to specify that duplicates are to be removed when data are extracted from the database. The directive should be used with care. In fact, for some ODBC drivers (for example Microsoft Access), it is not possible to have duplicate elimination over tables with BLOB or MEMO attributes.
The ORDER BY clause is used to specify which ordering strategy should be used when inserting data inside pages. For example:
Example 3:
DEFINE PAGE ResearchGroupListPage : "ResearchGroupListPage"
AS URL("researchGroupList.html");
GroupList : LIST-OF (
ToGroup : LINK-TO ResearchGroupPage(
URL (<GroupName>);
GName : TEXT = <GroupName>;
);
);
USING ResearchGroup DISTINCT
ORDER BY GroupName
END
According to the above statement, data are extracted from tuples in the ResearchGroup table and inserted inside a single page (it has a constant URL) containing a list of research-groups, with a link for each group to the corresponding page. Essentially, the statements nests data coming from table ResearchGroup inside the single instance of page-scheme ResearchGroupListPage. Suppose table ResearchGroup is the one in Figure 3. To avoid having duplicates in the list, we use the DISTINCT directive. Moreover, we specify that list items should be ordered based on values of attribute GroupName.
The WHERE clause can be used to specify a conjunction of conditions to be used to filter data to be inserted in a page. It takes as an argument a boolean expression. The WHERE clause of Example 1 was in fact:
WHERE Type LIKE "Graduate"
Here, the Penelope interpreter compares, for each tuple of the Course table, the value of the Type attribute and the constant string after the LIKE keyword and it selects only the tuples for which the condition is satisfied. It essentially allows to perform horizontal partitions on database data and to bring into Web pages only the set of tuples that satisfy a given condition.
The association between a page-scheme attribute and a database attribute allows the Penelope interpreter to create page instances. Nevertheless, it is possible to associate constant values to TEXT attributes. For example, we may be interested in adding a constant attribute to the page-scheme of Example 1, in order to include in our page instances the string "Web Master: Paolo Rossi". We then have the following Penelope statement:
Example 4:
DEFINE PAGE CoursePage : <CourseName>
AS URL(<CourseName>);
Name : TEXT = <CourseName>;
Description : TEXT = <Description>;
Type : TEXT = <Type>;
WebMaster : TEXT = "WebMaster: Paolo Rossi";
USING Course
END
Let us consider the following Penelope statement:
Example 5:
DEFINE PAGE ProfessorPage : <Name>
AS URL(<Name>);
Name : TEXT = <Name>;
Email : TEXT = <email>;
Phone : TEXT = <Phone>;
Photo : IMAGE =<Photo>;
USING Professor
END
This statement specifies how to generate instances of the page-scheme in Figure 4 starting from values in the Professor relation of Figure 5.
Figure 4 ADM scheme for "ProfessorPage"
Name |
Position |
|
Phone |
Photo |
Publications |
Atzeni |
FullProfessor |
atzeni@ dia.uniroma3.it |
3226 |
../icons/atz.jpg |
http://www.... |
Di Battista |
FullProfessor |
gdb@ dia.uniroma3.it |
3212 |
... |
http://www.... |
Micarelli |
Professor |
micar@ dia.uniroma3.it |
3217 |
../icons/mic.jpg |
.... |
... |
… |
… |
... |
... |
.... |
Figure 5 The "Professor" relational table
In the page-scheme of Example 5 an IMAGE attribute is used. The syntax to define this type of attribute is similar to the one for TEXT attributes. The corresponding values are extracted from the Professor table of Figure 5. So, for example, the value of the Photo attribute in Prof. Micarelli’s page is ../icons/mic.jpg. The value of an IMAGE attribute must specify the name of a file storing the corresponding image. The file path may be defined in two different ways:
In the previous example, the file name is mic.jpg while its relative path is ../icons.
Penelope allows to define LINK-TO attributes, in order to link site pages. A LINK-TO attribute is composed by an URL and an anchor. Let us consider the following Penelope statement:
Example 6:
DEFINE PAGE CoursePage : <CourseName>
AS URL(<CourseName>);
Name : TEXT = <CourseName>;
Description : TEXT = <Description>;
Type : TEXT = <Type>;
ToInstructor : LINK-TO ProfessorPage (
URL(<Instructor>);
InstructorName : TEXT = <Instructor>;
);
USING Course
END
It creates the instances of the CoursePage page-scheme, which is described by the ADM scheme of Figure 10. The corresponding values are extracted from the Course relation.
CourseName |
Description |
Instructor |
Type |
Compilers |
.......... |
Torlone |
undergraduate |
DatabaseSystems |
.......... |
Atzeni |
graduate |
OperatingSystems |
Principles of operating systems... |
Micarelli |
graduate |
InformationSystems |
........... |
Atzeni |
undergraduate |
Figure 6 The "Course" relational table
Figure 7 The ADM scheme for "CoursePage"
The definition of the ToInstructor attribute has several parts:
Penelope allows also the generation of links to pages that don’t belong to the site we are developing. Those links already exist and they have a known URL. An example of external link is in the following page-scheme:
Example 7:
DEFINE PAGE ProfessorPage : <Name>
AS URL(<Professor.Name>);
Name : TEXT = <Name>;
Email : TEXT = <email>;
Phone : TEXT = <Phone>;
Photo : IMAGE =<Photo>;
ToPublications : LINK-TO ExternalPage (
URL(<Publications>);
Publications : TEXT = "Publications";
);
USING Professor
END
The essential differences between internal and external links are:
Figure 8 ADM scheme for "ProfessorPage"
LIST-OF
attributes in pages may be arbitrarily nested lists of tuples. Let us consider the following Penelope statement:Example 8:
DEFINE PAGE CourseListPage : "Courses Page"
AS URL("Courses.html");
CourseList : LIST-OF (
Course : TEXT = <CourseName>;
);
USING Course
END
It generates a Web page having the structure described by the ADM scheme of Figure 9. Values are taken from the Course relation (see Figure 1).
Figure 9 ADM scheme for "CourseListPage"
In the statement, we essentially specify that values of attribute Name in tuples coming from relation Course have to be nested inside attribute CourseList. A page definition may have several levels of nesting. For example, we might have a list attribute containing more list attributes, as follows:
DEFINE PAGE ResearchGroupListPage : "List of Research Groups"
AS URL("index.html");
GroupList : LIST-OF (
Name : TEXT = <GroupName>;
Topics : TEXT = <Topics>;
MemberList : LIST-OF (
ToMember : LINK-TO PersonPage (
URL (<Name>);
Member : TEXT = <Name>;
);
);
);
USING ResearchGroup,
PersonGroup : (Select PersonInGroup.Name AS Name,
PersonInGroup.GroupName AS GroupName,
Person.Position AS Position
from PersonInGroup,Person
where PersonInGroup.Name = Person.Name)
ORDER BY PersonGroup.Name
END
In this case, we are defining a unique page containing the list of all research groups. For each item in the list, we also report the list of members, with a link to the corresponding pages (PersonPage). As it can be seen, attribute GroupList is nested. Penelope allows in fact to define arbitrarily nested attributes, with some subtleties:
In the following paragraphs we briefly discuss some advanced features of Penelope. Another feature already available in the prototype -- namely forms -- will be described in a forthcoming version of this manual.
In order to describe how to define links with offset in Penelope, let us consider the following statements:
Example 9
DEFINE PAGE EducationPage : "EducationPage"
AS URL("education.html");
CourseList : LIST-OF (
Name : TEXT = <CourseName>, OFFSET(<CourseName>);
Description : TEXT = <Description>;
ToInstructor : LINK-TO ProfessorPage(
URL (<Instructor>);
InstructorName : TEXT = <Instructor>;
);
);
USING Course
END
Example 10:
DEFINE PAGE ProfessorPage : <Professor.Name>
AS URL(<Professor.Name>);
Name : TEXT = <Professor.Name>;
Email : TEXT = <email>;
Phone : TEXT = <Phone>;
Photo : IMAGE =<Photo>;
CourseList: LIST-OF(
ToCourse : LINK-TO EducationPage (
URL("Education.html" # <CourseName>);
Course : TEXT = <CourseName>;
);
);
USING Professor,
MyCourse: (SELECT CourseName, Instructor AS Name
FROM Course )
END
The ToCourse link has ProfessorPage as the target page-scheme, with an offset that is defined by the <CourseName> parameter, which is preceded by the terminal symbol ‘#’. In order for that addressing method to work, it is necessary to create a reference into each instance of EducationPage. The Name attribute declaration is then followed by the OFFSET function, which must be given as a parameter the same offset parameter that is passed to the URL() function of the ToCourse attribute.
Consider the following Penelope statement:
Example 11:
DEFINE PAGE CourseListPage : "CourseListPage"
AS URL("CourseList.html");
CourseList : LIST-OF (
ToCourse : LINK-TO
UnderGraduateCoursePage IF Type LIKE 'undergraduate'
UNION
GraduateCoursePage IF Type LIKE 'graduate'
(
URL(<CourseName>);
Course : TEXT =<CourseName>;
);
);
USING Course
END
It generates all instances of the page-scheme described in Figure 12, extracting data from the Course relational table of Figure 13. The CourseListPage page-scheme is unique: it has just one page instance. That page gives access to each course page by means of a list of links. In this example we use the UNION clause in the ToCourse link definition. The target page-scheme of that link may be UnderGraduateCoursePage or GraduateCoursePage. In order to use the UNION clause, the base relational table must have an attribute that allows to distinguish undergraduate course tuples from the graduate course tuples. The IF condition allows the Penelope interpreter to link an instance of the proper page-scheme. The first branch of the union is considered as a default, hence the corresponding condition may be omitted to modify the statement as follows:
ToCourse : LINK-TO
UnderGraduateCoursePage
UNION
GraduateCoursePage IF Type LIKE 'graduate'
(
URL(<CourseName>);
Course : TEXT =<CourseName>;
);
...
Figure 10 ADM scheme for "CourseListPage"
CourseName |
Description |
Instructor |
Type |
Compilers |
.......... |
Torlone |
undergraduate |
DatabaseSystems |
.......... |
Atzeni |
graduate |
OperatingSystems |
Principles of operating... |
Micarelli |
graduate |
InformationSystems |
........... |
Atzeni |
undergraduate |
Figure 11 The "Course" relational table
In this section the Penelope language semantics will be presented. In particular, we will show how each statement in the language maps to an expression in a nested relational algebra with URL invention. Operators of the algebra are briefly described in the following. Basic notions of the classic relational algebra are then requested in order to fully understand the Penelope algebra.
The Penelope algebra is essentially a nested relational algebra extended with an URL invention operator, which allows an identifier manipulation mechanism to create complex hypertext structures. The operators of the algebra work on relations and return nested objects that represent Web pages, as follows:
Let us take as an example the following ADM scheme describing undergraduate course pages:
Figure 12 ADM scheme for "CoursePage"
Suppose also that in our example database we have the following relation, describing lessons:
Lessons (CourseName, InstructorName, Date)
The Penelope statement describing page instance generation is the following:
DEFINE PAGE CoursePage : <CourseName>
AS URL : URL (<CourseName>);
CName : TEXT <CourseName>;
Description : TEXT <Type>
InstructorList : LIST OF (
InstructorName: TEXT <Instructor>;
ToInstrPage: LINK TO ProfessorPage(URL(<Instructor>));
LessonList: LIST OF
(Date: TEXT
<Date>;);
);
USING Course, Lessons
WHERE Type = ‘UG’
END
The first operation performed by the Penelope interpreter is the natural join of the tables specified in the USING clause:
R1=(Course > < Lessons)
Then, given the following table instances:
Course
CourseName |
Description |
InstructorName |
Type |
Database Systems |
The Database ... |
Paolo Atzeni |
UG |
Database Systems |
The Database ... |
Riccardo Torlone |
UG |
Compilers |
The Compilers ... |
Pino di Battista |
UG |
Compilers |
The Compilers ... |
Titto Patrignani |
UG |
DataWebs |
The DataWebs... |
Giansalvatore Mecca |
G |
Lessons
CourseName |
InstructorName |
Date |
Database Systems |
Paolo Atzeni |
1-4-98 |
Database Systems |
Riccardo Torlone |
1-5-98 |
Database Systems |
Riccardo Torlone |
1-6-98 |
Compilers |
Titto Patrignani |
1-3-98 |
DataWebs |
Giansalvatore Mecca |
1-8-99 |
The relation resulting from the join operation is:
CourseName |
Description |
InstructorName |
Type |
Date |
Database Systems |
The Database ... |
Paolo Atzeni |
UG |
1-4-98 |
Database Systems |
The Database ... |
Riccardo Torlone |
UG |
1-5-98 |
Database Systems |
The Database ... |
Riccardo Torlone |
UG |
1-6-98 |
Compilers |
The Compilers ... |
Titto Patrignani |
UG |
1-3-98 |
DataWebs |
The DataWebs... |
Giansalvatore Mecca |
G |
1-8-99 |
According to the Penelope statement, all the needed URL attributes are added to the relation. The URL values are generated by the URL function from the values of the CourseName and InstructorName attributes.
R2 = URLPageURL<-CourseName, InstrURL<-InstructorName(R1)
The resulting relation is the following:
PageURL |
InstrURL |
CourseName |
Description |
InstructorName |
Type |
Date |
http://.../DatabaseSystems |
http://.../PaoloAtzeni |
Database Systems |
The Database... |
Paolo Atzeni |
UG |
1-4-98 |
http://.../DatabaseSystems |
http://.../RiccardoTorl |
Database Systems |
The Database... |
Riccardo Torlone |
UG |
1-5-98 |
http://.../DatabaseSystems |
http://.../RiccardoTorl |
Database Systems |
The Database... |
Riccardo Torlone |
UG |
1-6-98 |
http://.../Compilers |
http://.../TittoPatrigna |
Compilers |
The Compilers... |
Titto Patrignani |
UG |
1-3-98 |
http://.../DataWeb |
http://.../GiansalvMecca |
DataWebs |
The DataWebs |
Giansalvatore Mecca |
G |
1-8-99 |
After URLs generation, the proper selection operations are performed, in order to satisfy the conditions in the WHERE clause of the Penelope statement.
R3 =
sType = ‘UG’(R2)
The resulting relation now contains only undergraduate course tuples.
PageURL |
InstrURL |
CourseName |
Description |
InstructorName |
Type |
Date |
http://.../DatabaseSystems |
http://.../PaoloAtzeni |
Database Systems |
The Database... |
Paolo Atzeni |
UG |
1-4-98 |
http://.../DatabaseSystems |
http://.../RiccardoTorl |
Database Systems |
The Database... |
Riccardo Torlone |
UG |
1-5-98 |
http://.../DatabaseSystems |
http://.../RiccardoTorl |
Database Systems |
The Database... |
Riccardo Torlone |
UG |
1-6-98 |
http://.../Compilers |
http://.../TittoPatrigna |
Compilers |
The Compilers... |
Titto Patrignani |
UG |
1-3-98 |
Now we need to get rid of all the attributes that are not useful for page instance generation, i.e. those database attributes that are not associated to any page attribute in the page-scheme (the Type attribute in this case). This is done by means of an ordinary projection operation.
R4 = pPageURL, InstrURL, CourseName, Description, InstructorName, Date(R3)
The resulting relation is:
PageURL |
InstrURL |
CourseName |
Description |
InstructorName |
Date |
http://.../DatabaseSystems |
http://.../PaoloAtzeni |
Database Systems |
The Database... |
Paolo Atzeni |
1-4-98 |
http://.../DatabaseSystems |
http://.../RiccardoTorl |
Database Systems |
The Database... |
Riccardo Torlone |
1-5-98 |
http://.../DatabaseSystems |
http://.../RiccardoTorl |
Database Systems |
The Database... |
Riccardo Torlone |
1-6-98 |
http://.../Compilers |
http://.../TittoPatrigna |
Compilers |
The Compilers... |
Titto Patrignani |
1-3-98 |
Now, the relation is ready to be transformed in a set of nested tuples, each of which corresponds essentially to a site page. Nesting operations are performed starting from inner level lists, so, in this case, the first nesting is performed on the Date attribute.
R5 =
nLessonList<-Date(R4)
A new attribute (LessonsList) is created, whose values are lists of dates. Each nested list correspond to the lessons given by an instructor.
URL |
ToInstructor |
CourseName |
Description |
InstructorName |
LessonsList |
Date |
|||||
http://.../DatabaseSystems |
http://.../PaoloAtzeni |
Database Systems |
The Database... |
Paolo Atzeni |
1-4-98 |
http://.../DatabaseSystems |
http://.../RiccardoTorl |
Database Systems |
The Database... |
Riccardo Torlone |
1-5-98 |
1-6-98 |
|||||
http://.../Compilers |
http://.../TittoPatrigna |
Compilers |
The Compilers... |
Titto Patrignani |
1-3-98 |
The final nesting is performed in order to create the InstructorList attribute:
R6 =
nInstructorList<-(InstructorName, ToInstructor, LessonList)(R5)
and each nested tuple of the resulting relation corresponds to a page instance:
URL |
CourseName |
Descr. |
InstructorList |
||
InstructorName |
ToInstructor |
Lesson |
|||
Date |
|||||
http://.../DatabaseSystems |
Database Systems |
The Database... |
Paolo Atzeni |
http://.../PaoloAtzeni |
1-4-98 |
Riccardo Torlone |
http://.../RiccardoTorl |
1-5-98 |
|||
1-6-98 |
|||||
http://.../Compilers |
Compilers |
The Compilers... |
Titto Patrignani |
http://.../TittoPatrigna |
1-3-98 |