newtFire {dh}
Creative Commons License Last modified: Tuesday, 08-Nov-2022 20:05:01 UTC. Maintained by: Elisa E. Beshero-Bondar (eeb4 at psu.edu). Powered by firebellies.

Overview of the Assignment

For this assignment, we will be working with a file from the Digital Mitford project that stores lists of names and information on people, places, organizations, and texts, among other kinds of named entities (really, anything that is named). We will work with a slightly modified version of the Digital Mitford Site Index, and we have prepared a starter XSLT file for you.

  1. There are a few different ways you can access the source TEI XML file (do what is easiest from your computer):
  2. In the textEncoding-Hub, also in Class-Examples >> XSLT >> DigitalMitford_SI, we have prepared this XSLT file with the special output line and namespace adjustments we need to process input TEI XML and output HTML 5. Do a git pull to pull this file in locally to work with it, or open it in the Raw view to download it to your computer.

Before you start writing XSLT, please read this assignment thoroughly, so you understand what you're doing and what you need to set up in the XSLT stylesheet. Our goal is to create a structured outline in HTML of the organizations in the site index. We want to output that in HTML as a list with nested lists inside: the outer list gives the categories of organization, and inside each category a nested sublist gives the organization names. One possible use of a webpage like this is as a list of links, so that each organization name might link to a page of information on that organization. We don’t have to generate those links now. For this assignment, we just want to learn how to transform XML to HTML with XSLT and to generate the lists themselves by pulling the exact content we want out of our XML.

For the organization types or categories, we need to pull from the <head> element that sits at the top of each <listOrg> element in our TEI file. For the organization names, we reach in to find the individual <org> entries inside each <listOrg> element and their child <orgName> elements. Each <org> element contains one <orgName> that holds the best-known name of a particular organization. You may first want to experiment with XPath on the Site Index file to locate the <listOrg> elements and study the XML hierarchy of the lists. Let’s make the outer list an ordered (numbered) list in HTML, using the HTML <ol> element, and make the inner list an unordered (bulleted) list, using the HTML <ul> element.

Your lists in HTML should come out looking something like this, only yours will contain many more entries in each category, because your XML document contains some new material.

  1. Archives Holding Mitford's Papers
    • Baylor University, Armstrong Browning Library
    • Berkshire Record Office
    • British Library
    • Boston Public Library
    • Cambridge University: Fitzwilliam Museum
    • Duke University Rubenstein Library
    • Eton College
    • Florida State University Special Collections
    • The Women's Library, Glasgow
  2. Organizations Relevant to Mitford's World
    • Billiard Club
    • House of Bourbon
    • Cavaliers
    • Court of Chancery
    • Church of England
    • the Cockney School
    • Mr.and Mrs.Mitford
    • the Moncks, family of John Berkeley Monck
    • New Model Army
    • Palmerite
    • Parliament
  3. Fictional Organizations Referenced by Mitford
    • Attendants &c.
    • Citizens
    • Guards
    • Guards
    • Ladies
    • Nobles (in Julian)
    • Nobles (in Rienzi)
    • officers in Charles I
    • Prelates

The underlying HTML, which we generated by running XSLT, should look like this:

   <ol>
         <li>Archives Holding Mitford's Papers<ul>
               <li>Baylor University, Armstrong Browning Library</li>
               <li>Berkshire Record Office</li>
               <li>British Library</li>
               <li>Boston Public Library</li>
               <li>Cambridge University: Fitzwilliam Museum</li>
               <li>Duke University Rubenstein Library</li>
               <li>Eton College</li>
               <li>Florida State University Special Collections</li>
               <li>The Women's Library, Glasgow</li>
            </ul>
         </li>
         <li>Organizations Relevant to Mitford's World<ul>
               <li>Billiard Club</li>
               <li>House of Bourbon</li>
               <li>Cavaliers</li>
               <li>Court of Chancery</li>
               <li>Church of England</li>
               <li>the Cockney School</li>
               <li>Mr.and Mrs.Mitford</li>
               <li>the Moncks, family of John Berkeley
                        Monck
                  </li>
               <li>New Model Army</li>
               <li>Palmerite</li>
               <li>Parliament</li>
            </ul>
         </li>
         <li>Fictional Organizations Referenced by Mitford<ul>
               <li>Attendants &amp;c.</li>
               <li>Citizens</li>
               <li>Guards</li>
               <li>Guards</li>
               <li>Ladies</li>
               <li>Nobles (in Julian)</li>
               <li>Nobles (in Rienzi)</li>
               <li>officers in Charles I
                  </li>
               <li>Prelates</li>
            </ul>
         </li>
      </ol>

In HTML ordered and unordered lists, the only elements permitted as direct children are list items, or <li> elements. We’ve nested them so that each list item in the outer numbered list contains a category type (designating what kind of organization), followed by an embedded <ul> that holds a separate bulleted list of the names of the organizations in that category.

If you’re feeling adventurous, once you obtain the output we're seeking, you may go on to build other HTML lists, working with other portions of the XML document, such as the <listBibl> or <listPerson> sections, which are formatted a little differently. The only required content of your homework, though, is the HTML outline of organization types and organization names.

Before You Begin: Set up the XSLT Stylesheet to Read TEI

Since the Digital Mitford’s Site Index is coded in the TEI namespace, we will need to make some edits to our XSLT 3.0 stylesheet so that it reads from a TEI document and outputs HTML 5 in XHTML syntax. If we don't make these changes, our XPath expressions will not match the TEI elements in the input file, and the output will not be in the correct HTML 5 format.

So, our modified xsl:stylesheet and xsl:output elements look like this, and you should copy this into your stylesheet:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0"
    xmlns="http://www.w3.org/1999/xhtml"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">

    <xsl:output method="xhtml" html-version="5" omit-xml-declaration="yes"
        include-content-type="no" indent="yes"/>

</xsl:stylesheet>
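
To see what the xpath-default-namespace attribute is doing for us, it may help to compare it with the alternative. The sketch below is not part of the starter file: it binds the TEI namespace to a prefix instead (the prefix name tei is our own choice for illustration), which means every TEI element name in a match or select attribute has to carry that prefix. Without either the prefix or xpath-default-namespace, an unprefixed name like listOrg would only match an element in no namespace, and so would match nothing in the Site Index.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustration only: a prefix-based alternative to xpath-default-namespace.
     The prefix "tei" is an arbitrary name chosen for this sketch. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0"
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="tei">

    <xsl:output method="xhtml" html-version="5" omit-xml-declaration="yes"
        include-content-type="no" indent="yes"/>

    <!-- Each TEI element name must now be written with the tei: prefix. -->
    <xsl:template match="tei:listOrg">
        <li>
            <xsl:apply-templates select="tei:head"/>
        </li>
    </xsl:template>

</xsl:stylesheet>
```

With xpath-default-namespace set as in the assignment's version, the same rule can simply say match="listOrg" and select="head", which is why we recommend that setup here.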
      

Guide to Approaching the Problem

Our XSLT transformation (after all this housekeeping) has three template rules:

  1. We have a template rule for the document node (<xsl:template match="/">), in which we create the basic HTML file structure: the <html> element, <head> and its contents, and <body>, that is, anything that appears just once in the HTML document (a one-to-one relationship with the document node). Inside the <body> element that we’re creating, we create wrapper <ol> tags to set up the ordered list of organization types, and inside those tags we use <xsl:apply-templates> to select the <listOrg> elements (using an XPath expression as the value of the @select attribute).
  2. We have a separate template rule that matches the <listOrg> elements (holding the lists of organizations), so it will be invoked as a result of the preceding <xsl:apply-templates> instruction, and will fire once for each <listOrg> element in our Site Index. Inside that template rule we create a new list item (<li>) for the particular <listOrg> being processed, and inside the tags for that new list item we do two things. First, we apply templates to the <head> of the <listOrg>, which will cause its category description to be output when we run the transformation. Second, we create wrapper <ul> tags for the nested list that will contain the names of the organizations within that category. Inside that new <ul> element, we use an <xsl:apply-templates> instruction to apply templates to (that is, to process) the <org> elements of that <listOrg>.
  3. We have a separate template rule that matches the <org> elements, which make up the items in the list of organizations, and that just applies templates to the <orgName> element within each <org>. This rule will fire once for each <org> element inside the <listOrg>, and it will be called separately for the <org> elements within each <listOrg>, so that the orgs will be rendered properly in their respective lists.
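
The three rules described above might be sketched roughly as follows. Treat this as one possible shape rather than the required answer: the XPath //listOrg assumes that every <listOrg> in the file should be listed (check this against the Site Index yourself), the <title> text is a placeholder we made up, and the <li> wrapper in the third rule is inferred from the expected HTML output rather than stated in the description.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0"
    xmlns="http://www.w3.org/1999/xhtml"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">

    <xsl:output method="xhtml" html-version="5" omit-xml-declaration="yes"
        include-content-type="no" indent="yes"/>

    <!-- Rule 1: the document node. Builds everything that appears once. -->
    <xsl:template match="/">
        <html>
            <head>
                <!-- Placeholder title, invented for this sketch. -->
                <title>Organizations in the Site Index</title>
            </head>
            <body>
                <ol>
                    <!-- Assumption: // reaches every listOrg in the file. -->
                    <xsl:apply-templates select="//listOrg"/>
                </ol>
            </body>
        </html>
    </xsl:template>

    <!-- Rule 2: fires once per listOrg. Outputs the category name,
         then a nested bulleted list for the organizations inside it. -->
    <xsl:template match="listOrg">
        <li>
            <xsl:apply-templates select="head"/>
            <ul>
                <xsl:apply-templates select="org"/>
            </ul>
        </li>
    </xsl:template>

    <!-- Rule 3: fires once per org. The li wrapper is inferred from
         the expected output; the name comes from the child orgName. -->
    <xsl:template match="org">
        <li>
            <xsl:apply-templates select="orgName"/>
        </li>
    </xsl:template>

</xsl:stylesheet>
```

Notice that the second and third rules use relative XPath expressions (head, org, orgName). Each is evaluated from the element the rule is currently processing, which is what keeps each organization inside the list for its own category.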

We don’t need a template rule for the <head> elements themselves because the built-in (default) template rule in XSLT for an element that doesn’t have an explicit, specified rule is just to apply templates to its children. The only child of the <head> elements is a text node, and the built-in rule for text nodes is to output them literally. In other words, if you apply templates to <head> and you don’t have a template rule that matches that element, ultimately the transformation will just output the textual content of the head, that is, the title that you want.
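
If it helps to see that default behavior spelled out, the built-in rules behave roughly as if the two templates below were present in every stylesheet. This is an illustration of the defaults only; you do not need to (and normally should not) type these into your homework.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustration only: an explicit approximation of the built-in rules. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">

    <!-- Elements (and the document node) with no rule of their own:
         output nothing for the node itself, just process its children. -->
    <xsl:template match="*|/">
        <xsl:apply-templates/>
    </xsl:template>

    <!-- Text nodes (and attributes, when selected): output their string value. -->
    <xsl:template match="text()|@*">
        <xsl:value-of select="."/>
    </xsl:template>

</xsl:stylesheet>
```

So when we apply templates to a <head> that has no matching rule, the first default passes control down to its text node, and the second default writes that text into the output.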

Important