newtFire {dh}
Creative Commons License Last modified: Friday, 08-Apr-2022 06:48:04 UTC. Maintained by: Elisa E. Beshero-Bondar (eeb4 at psu.edu). Powered by firebellies.

The Star Wars project team has prepared an XML corpus of five distinct drafts of the first Star Wars movie, A New Hope. For this test, we are interested in speeches (coded as <sp> elements) and speakers (coded as @spk attributes on those elements). You should review the code by viewing it in the database or in oXygen to be sure you understand how it is structured. The Star Wars collection is stored on the newtFire eXist-dB server, and you may access it with a variable pointing to the collection at:

collection('/db/starwars/fixed/')

Write an XQuery script that does the following. (Remember to save your work as you go using the .xql extension for XQuery. You may use your personal folder on the newtfire eXist-dB. As needed, add XQuery comments in your file.)

Part 1: Write a FLWOR and return a concatenated string

  1. Begin by creating a variable called $drafts that points to the collection at collection('/db/starwars/fixed/'). Make sure your variable returns the five Star Wars drafts before continuing. (1 point)
  2. Copy in this variable and answer a question about it in your file by writing an XQuery comment:
             let $sortOrder := 
                for $d in $drafts
                let $date := $d//date/@date ! data()
                order by $date
                return $d
    Return this variable and inspect your results. Notice what information is stored in the date attribute near the top of each file. In an XQuery comment, explain in your own words what is happening inside the $sortOrder variable and what it does to the sequence of files in the collection. (3 points)
  3. Write a for loop that looks in each member of the sequence stored in the $sortOrder variable, and also stores the position number of each turn of the for loop. (3 points)
  4. Write two variables to retrieve the date and the title of each draft in this loop. (3 points)
  5. Write a variable that counts all speeches in each draft in this loop. (2 points)
  6. Now, we want to find out the distinct speakers in each draft: Write a variable that finds the speakers, removes any extra spaces from the nodes, and eliminates repeated values. (4 points)
  7. Retrieve a count of those distinct speakers. (2 points)
  8. We want our return to help us compare the count of distinct speakers with the count of all speeches in each draft. Make your first version of this return statement a concatenated line of text, outputting the position in the for loop, the draft title, the draft date, the count of speeches, and the count of distinct speakers, using text separators of your choice. (4 points)
  9. To continue to the next part of the test, comment out the previous return (do not delete it, so I can give you credit for it on this exam). (1 point)

    Part 1 of this exam is worth a total possible 23 points. If you have completed this much, you have finished 77% of the exam.
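
The techniques named in the steps above (sorting a sequence, tracking loop position, trimming and de-duplicating attribute values, counting, and returning a joined string) can be sketched on a small stand-in data set. This is only an illustration, not a model answer: the two inline <draft> elements, the <title> element name, and all variable names are assumptions invented for the sketch, and the real files in the collection may be structured differently.

```xquery
xquery version "3.1";

(: Hypothetical stand-in data: two tiny "drafts" built inline so the sketch
   runs without access to the course database. The <draft> and <title>
   names are assumptions for illustration only. :)
let $samples := (
    <draft>
        <title>Sample B</title>
        <date date="1976-01-01"/>
        <sp spk=" LUKE ">Hello.</sp>
        <sp spk="BEN">Hi.</sp>
        <sp spk="LUKE">Bye.</sp>
    </draft>,
    <draft>
        <title>Sample A</title>
        <date date="1974-05-01"/>
        <sp spk="KANE">One.</sp>
        <sp spk="KANE">Two.</sp>
    </draft>
)
(: Re-sequence the samples by their date values before looping over them. :)
let $sorted :=
    for $s in $samples
    order by $s/date/@date ! data()
    return $s
(: "at $pos" records the position of each turn of the for loop. :)
for $s at $pos in $sorted
let $title := $s/title ! string()
let $date := $s/date/@date ! string()
let $speechCount := count($s//sp)
(: normalize-space() trims stray whitespace; distinct-values() drops repeats. :)
let $speakers := distinct-values($s//sp/@spk ! normalize-space())
let $speakerCount := count($speakers)
(: Convert every item to a string before joining with a text separator. :)
return string-join(($pos, $title, $date, $speechCount, $speakerCount) ! string(), ' | ')
```

With this stand-in data the script should return two strings, "1 | Sample A | 1974-05-01 | 2 | 1" followed by "2 | Sample B | 1976-01-01 | 3 | 2". The earlier-dated sample comes first, and " LUKE " and "LUKE" are counted as one speaker once the extra spaces are removed.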

Part 2: HTML or SVG: Your Choice

To complete the remaining 23% of the exam, build an HTML or an SVG file around your FLWOR, and make your FLWOR output the variables above in well-formed and tastefully repeating elements. Create your choice of either an HTML table or an SVG bar graph, showing the two counts side by side at each position mark from the for loop. You may save your output to your individual folder on the newtfire eXist if you wish. (7 points)
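
As a rough sketch of what "building a file around a FLWOR" can look like, the example below wraps a for loop in curly braces inside an HTML table so that each turn of the loop produces one row. The <row> elements and their attribute names are hypothetical placeholders standing in for the counts computed in Part 1; they are not part of the Star Wars corpus.

```xquery
xquery version "3.1";

(: Hypothetical placeholder values standing in for the per-draft counts. :)
let $rows := (
    <row title="Sample A" speeches="2" speakers="1"/>,
    <row title="Sample B" speeches="3" speakers="2"/>
)
return
<html>
    <head><title>Speeches and speakers per draft</title></head>
    <body>
        <table>
            <tr>
                <th>Position</th>
                <th>Draft</th>
                <th>Speeches</th>
                <th>Distinct speakers</th>
            </tr>
            {
                (: Curly braces switch from literal markup back to XQuery,
                   so this FLWOR emits one <tr> per item in $rows. :)
                for $r at $pos in $rows
                return
                    <tr>
                        <td>{$pos}</td>
                        {(: string() is used because a bare attribute node
                            would be attached to <td> as an attribute. :)}
                        <td>{$r/@title ! string()}</td>
                        <td>{$r/@speeches ! string()}</td>
                        <td>{$r/@speakers ! string()}</td>
                    </tr>
            }
        </table>
    </body>
</html>
```

The same pattern applies to SVG: the outer <svg> element is written once, and the FLWOR inside curly braces returns the repeating shapes, with the loop position used to space them out.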

Submitting your test

Please download your XQuery file from the newtfire eXist-dB, saved with the .xql file extension, and submit it on Canvas at the submission point for this exam. You may also submit your output SVG or HTML file, either as a link to its REST address on the newtfire eXist-dB server or as a downloaded file uploaded to Canvas. At minimum, you must submit your XQuery .xql file to complete this exam. Bonus: One extra credit point is available if you send a functioning REST address to your output from the eXist-dB database.