newtFire {dh}
Maintained by: Elisa E. Beshero-Bondar (eeb4 at psu.edu) Creative Commons License Last modified: Thursday, 05-Jan-2023 19:35:22 UTC. Powered by firebellies.

Spring 2022 Syllabus (Schedule) Classes meet M W F 1:25 - 2:15pm in Witkowski 109. Attend class wearing a face mask that covers your mouth and nose. (Until further notice, face masks are required inside all university buildings, regardless of vaccination status.) Remember: Your mask protects me, my mask protects you.

Read the Course Description

This contains a detailed explanation of course policies and the basis for grades.

Jump Down to the Schedule

This link jumps to the closest day to today's date. Review the schedule as we get started to get a sense of how this course will work on a daily basis.

All the Tools You Need As We Begin:

Download and install the following software on your own personal computer(s) on or before the first day of class. These software tools are available in our campus computing labs, too.

  1. <oXygen/>. The DIGIT program has purchased a site license for this software, which is installed in Kochel 77, the Lilley Library computers, and Witkowski 109. The license also permits students enrolled in the course to install the software on their home computers (for course-related use only). When installing this on your own computers, you will need the license key, which we have posted on our course Announcements section of Canvas.
  2. Zoom: Make sure your Zoom installation is up-to-date, and you are ready to connect. Sometimes we will record portions of class meetings and tutorial sessions for future reference to share over Zoom. Look for these in Canvas Announcements and use the Zoom menu option in Canvas to access these meetings.
  3. We will use GitHub for sharing code and for project management. Create an account (choose the free options) at https://github.com and install the GitHub client software for your operating system on your own computer. (We will explain how to use git and GitHub in our course.)
  4. We will use the Slack chat platform for discussion and for asking questions (see https://slack.com/help/articles/218080037-Getting-started-for-new-members). Download and install the Slack client, configuring your account to use your Penn State email address (the official address, which looks like xyz123@psu.edu, and not an alias based on your name that you may have set up), so you can join our Slack workspace: DIGIT-coders. When you receive an invitation to join this workspace, you should accept.
  5. Later in the semester we will ask you to install local copies of the eXist-db XML database, which you can download from https://exist-db.org/. We will go through the installation process with you when the time comes, since it can be confusing the first time, and we recommend that you not install this application now in any case because it is updated frequently, and there is likely to be a more advanced version available by the time we need it. If you want to install it early in order to begin to experiment with it, we recommend that instead of the Latest Stable Release (version 5.3.1 as we write this) you install the most recent Nightly Build; after you click on the link from the main eXist-db page, scroll down past the FusionDB (which is a different product than eXist-db) nightly builds to reach the eXist-db nightly builds.
  6. Later in the semester, we will ask you to install Python version 3.7 or higher on your computer, and install PyCharm Edu to assist in learning and writing Python code with syntax checking. Follow instructions and links from PyCharm ( https://www.jetbrains.com/help/pycharm/quick-start-guide.html#meet ), paying attention to what you need for your own computer systems. Feel free to download and explore PyCharm Edu on your own before we start working with it together: https://www.jetbrains.com/pycharm-edu/. Also, configure Anaconda so it is available to work within PyCharm following this guide: https://www.jetbrains.com/help/pycharm/conda-support-creating-conda-virtual-environment.html. As with eXist-db, we will go over installing this and setting up a Python environment on your computer.
  7. No coding experience? Don’t worry! Past students in this course who never saw anything like markup or XML code have designed projects (like these) and even spoken about them at academic conferences! You will learn to develop your own digital tools and how to manage digital projects as teamwork.

Class Web Resources:

Week 1 Class topics Do before class

M 1-10

Welcome! Intro to the course. Intro to document formats. Working in oXygen XML Editor:
  • Open a sample Microsoft Word document in the oXygen XML Editor. (Click the "View Raw" button in GitHub to download my sample file, or open one of your own Word documents.)
  • Slime recipe
  • Join the DIGIT Coders Slack (if you are not already a member), configure Slack, enable Notifications. (See Canvas Announcements)
Install oXygen XML Editor on your own computers and add our license key (in Canvas), ideally before the first day of class but by Friday this week at the latest. Instructions and license key posted on Canvas.

W 1-12

Hands-on work with XML, encoding a recipe. XML Well-formedness with elements, attributes, comments. How to work with <oXygen/>.
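
As a rough illustration of what well-formedness means, the sketch below uses Python's standard-library XML parser rather than <oXygen/>, which flags the same kind of error as you type. The recipe markup, including every element and attribute name, is invented for this example and is not the course's sample file.

```python
import xml.etree.ElementTree as ET

# Hypothetical recipe markup: element and attribute names are invented.
WELL_FORMED = """<recipe name="slime">
  <!-- A comment may sit between elements -->
  <ingredient quantity="1" unit="cup">glue</ingredient>
  <step n="1">Mix the ingredients.</step>
</recipe>"""

# Same content, but the <ingredient> element is never closed.
NOT_WELL_FORMED = """<recipe name="slime">
  <ingredient quantity="1" unit="cup">glue
  <step n="1">Mix the ingredients.</step>
</recipe>"""


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(WELL_FORMED))
print(is_well_formed(NOT_WELL_FORMED))
```

Well-formedness only checks the grammar of the markup (one root element, properly nested and closed tags, quoted attribute values). Whether the elements are the right ones for a project is a separate question of validity against a schema.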

F 1-14

  • Discussion of the XML recipe homework: XML Comments and Well-formedness, and how to work with <oXygen/>.
  • Introduce XML Exercise 2
Complete XML Exercise 1
Week 2 Class topics Do before class

M 1-17

Martin Luther King, Jr. Day: No classes.

W 1-19

XML Exercise 2 due

F 1-21

Week 3 Class topics Do before class

M 1-24

  • GitHub: Continue with branching experiment in class. Handling pull requests, merging branches.

W 1-26

Intensive discussion of git branching XML Exercise 3 + GitHub: push this assignment to your branch of the textAnalysis-Hub repository and issue a pull request assigned to Dr. B (ebeshero).

F 1-28

  • Introducing up-conversion with Regular Expressions: the dot, the backslash, numbers (\d), repetition indicators, matching on lines, and autotagging.
  • Regular Expressions: thinking algorithmically. Greedy and non-greedy matching.
  • Choosing a license for your GitHub repo.
Review git branching, catch up on git homeworks if necessary.
Week 4 Class topics Do before class

M 1-31

  • (By the end of the day): Five Days of Git: Part 1: Record completion on Canvas as part of GitHub Test

W 2-02

  • Regex: Greedy and non-greedy matching: When to use dot matches all, and the don't be greedy question mark.
  • Looking ahead: Project ideas for XML and text at scale. Initiate Project Ideas discussion on DIGIT-Coders Slack
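
To make the greedy and non-greedy distinction concrete, here is a minimal sketch using Python's `re` module. Python's regex engine is not the one in <oXygen/>, so small details may differ, and the sample strings are invented for illustration.

```python
import re

# Hypothetical line of text containing two quoted passages.
line = 'She said "hello" and then "goodbye" to everyone.'

# Greedy: .+ takes as much as it can, so the single match runs from the
# first quotation mark to the last one.
greedy = re.findall(r'".+"', line)

# Non-greedy: the ? after + stops at the earliest closing quotation mark.
lazy = re.findall(r'".+?"', line)

# By default the dot does not match a newline. The DOTALL flag
# ("dot matches all") lets a pattern run across line breaks.
two_lines = '<p>first line\nsecond line</p>'
without_dotall = re.findall(r'<p>.+?</p>', two_lines)
with_dotall = re.findall(r'<p>.+?</p>', two_lines, flags=re.DOTALL)

print(greedy)
print(lazy)
print(without_dotall)
print(with_dotall)
```

Turning on "dot matches all" without the non-greedy question mark is a common way to accidentally select far more text than intended, which is why the two settings are usually considered together.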

F 2-04

  • Regex problem solving / debugging
  • Project possibilities and what's next! Initiate project proposal process
  • Regex Issues: Debugging and problem solving. Greedy and non-greedy matching. Selecting for what's not there.
  • Get as far as you can before class on Regex Exercise 2: Sonnets
  • Five Days of Git: Part 3: Record completion on Canvas as part of GitHub Test
Week 5 Class topics Do before class

M 2-07

  • Regex Lookahead. Debugging: Simplifying overcomplicated expressions.
  • Discussion of project ideas
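
A small sketch of lookahead, again in Python's `re` module as a stand-in for the regex tools used in class. The screenplay-style lines are invented, and the pattern is one possible approach rather than a solution to any exercise.

```python
import re

# Hypothetical lines: a speaker name in capitals followed by a colon,
# plus one line that starts with capitals but is not a speech.
text = "MULAN: I will go.\nNOTE - not a speech\nMUSHU: Wait for me!"

# Positive lookahead (?=:) matches the capitals only when a colon
# follows, without including the colon in the match.
speakers = re.findall(r'^[A-Z]+(?=:)', text, flags=re.MULTILINE)

# Negative lookahead (?!:) selects for what is not there. The \b matters:
# without it the engine backtracks and matches "MULA", because "MULA"
# is followed by "N" rather than a colon.
non_speakers = re.findall(r'^[A-Z]+\b(?!:)', text, flags=re.MULTILINE)

print(speakers)
print(non_speakers)
```

The backtracking pitfall in the second pattern is typical of lookahead debugging: the expression is technically satisfied, just not where you expected.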

W 2-09

  • Regex Debugging, complexities of the Mulan script.
  • Projects discussion, look ahead.
  • Regex Exercise 4: Mulan Screenplay
  • Discussion of professional and student DH projects
  • Five Days of Git: Part 5: Record completion on Canvas to finish the GitHub Test.
  • Canvas Discussion of professional DH projects
  • Begin posting proposal ideas
  • Revisions of problematic regex assignments due if we asked for revisions

F 2-11

  • Launch Take-home Regex Test (due W 2/16).
  • Semester project ideas
  • Validity for a project: what is a schema? What is schema validation?
  • Validation for Google Sheets
  • How to write a Relax NG schema (review for some)

Read Intro to Relax NG

  • Post Project ideas
Week 6 Class topics Do before class

M 2-14

  • Good projects: ideas, sources, teamwork expectations: discussion
  • Relax NG: data types and mixed content
  • Troubleshooting and debugging Relax NG

W 2-16

  • Relax NG schemas for project management
  • Project ideas

F 2-18

  • Relax NG: Common problems (mixed content, repetition indicators). Simplifying your code. Documenting your schemas
  • Form Project Teams (today or M 2-21): launch GitHub repos: Project Milestone 1
Relax NG Exercise 3
Week 7 Class topics Do before class

M 2-21

Project Teams

HTML review. Relationship to / difference from XML. Setting up docs/ directory for GitHub Pages.

Project Milestone 1: Launch the project GitHub repo and invite your teammates and me to join (using Settings > Manage settings). Launch Slack channel for project and invite teammates and Dr. B. Post in your Slack project thread your available meeting times to help determine a regular meeting time for your group.

W 2-23

HTML and CSS review
  • HTML 5 semantic elements
  • CSS box model
  • Web browsers and display variations
  • Positioning and controlling layouts with HTML: flexboxes
Complete HTML Exercise 1. The files go on your webspace: Provide the published web link to your files on Canvas.

F 2-25

  • Debugging HTML / CSS issues; resources for Looking Stuff Up.
  • Start XPath and XQuery in oXygen, and in eXist-dB: simple functions and sequences. Exploring XML through child and descendant axes. Predicate filters.
  • Complete HTML/CSS Exercise 2
    • Read about HTML Accessibility and apply what you learn about accessible code to your HTML code for headings, images (providing alt attributes), links, and declaring the language. Try applying title attributes.
    • Read about Responsive HTML and try applying what you learn to scaling some elements on your site.
    The files go on your webspace: Provide the published web link to your files on Canvas.
  • Access newtfire eXist-dB and find the eXide window.
Week 8 Class topics Do before class

M 2-28

XPath predicates [ ] as filters. Awareness of sequences: An XPath sequence can be zero, one, or more results. XPath functions and their cardinality: can they handle only one node at a time? Or many at once? Introducing the FLWOR
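
The idea that a predicate filters a sequence down to zero, one, or many results can be tried out in Python's `xml.etree.ElementTree`, which supports a limited subset of XPath. This is only an analogy for the XPath we write in <oXygen/> and eXist-db, and the song data is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: titles and years are placeholders.
DOC = ET.fromstring("""<songs>
  <song year="1991"><title>Song A</title></song>
  <song year="1991"><title>Song B</title></song>
  <song year="1998"><title>Song C</title></song>
</songs>""")

# The predicate in square brackets filters the <song> elements.
many = DOC.findall("./song[@year='1991']")   # more than one result
one = DOC.findall("./song[@year='1998']")    # exactly one result
none = DOC.findall("./song[@year='2020']")   # an empty sequence, not an error

# len() accepts the whole sequence at once, comparable to XPath count().
sizes = [len(many), len(one), len(none)]

# str.upper() works on a single string, so it is applied to each item
# in turn, comparable to XPath functions that accept only one item.
titles_upper = [song.findtext("title").upper() for song in many]

print(sizes)
print(titles_upper)
```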

W 3-02

  • XQuery: Writing FLWOR statements and outputting HTML lists and tables
  • Outputting files and saving them to the eXist-db database for previewing
  • XQuery online and offline: in eXist and in <oXygen/>
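
For readers who think more easily in a general-purpose language, the shape of a FLWOR statement (for, let, where, order by, return) can be mimicked in Python. This is an analogy only, not XQuery, and the data and the function name `songs_since` are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical data: titles and years are placeholders.
SONGS = ET.fromstring("""<songs>
  <song year="1995"><title>Song C</title></song>
  <song year="1989"><title>Song A</title></song>
  <song year="1998"><title>Song B</title></song>
</songs>""")


def songs_since(root, first_year):
    """Build an HTML list of songs from first_year onward, oldest first."""
    rows = []
    for song in root.iter("song"):            # for
        title = song.findtext("title")        # let
        year = int(song.get("year"))          # let
        if year >= first_year:                # where
            rows.append((year, title))
    rows.sort()                               # order by year, then title
    items = "".join(                          # return
        f"<li>{escape(title)} ({year})</li>" for year, title in rows
    )
    return f"<ul>{items}</ul>"


html_list = songs_since(SONGS, 1990)
print(html_list)
```

In XQuery the same steps are written as clauses of a single expression, and the returned HTML elements are constructed directly in the query rather than assembled from strings.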

F 3-04

  • What you can count and measure with XPath in XQuery
  • Saving and Accessing files in the Newtfire eXist-db: set up individual and team project directories.
  • Test logging in to newtfire eXist-dB
  • XQuery Exercise 2: Writing a FLWOR
  • Project Checkpoint 2 (due by the end of the day):
    • Create a file directory structure for the project GitHub repo(s): Initiate the project website within the docs directory with an index.html page and some CSS. Consult with your team and Dr. B to decide on a place to work on the text files (in its own directory, or in a separate private repo?) and create that space. Create a directory for XML files. Begin populating those file directories (even with placeholder Readme.md files to describe what belongs where).
    • Assemble the text files you want to work with on the project. As a team, work on document analysis to plan for how you want these to be marked for structure. What XML structure do you want to use to contain meaningful units of text data? Aim for a clear, simple structure that distinguishes the kind of info you want to be able to track.

M 3-07 — F 3-11

Spring Break Enjoy this week!
Week 9 Class topics Do before class

M 3-14

  • What can we do with XQuery for loops: over XML nodes, and off the tree over text values (like distinct-values).
  • XQuery from eXist to Web: Writing HTML output from eXist-dB
XQuery Ex 3: Querying the Disney Songs

W 3-16

XQuery to HTML. Working with eXist-dB outputs. XQuery Exercise 4: Disney Songs data to HTML

F 3-18

Project Work / Catch-up day Project Milestone: preparing file(s) for your project collection to explore with XQuery
Week 10 Class topics Do before class

M 3-21

XQuery to HTML. Other output formats to save: preparing for network analysis: The CSV / TSV file. XQuery Exercise 5

W 3-23

Reviewing XQuery so far. Issues with for loops, building files, saving outputs. Prep for Project Milestone(s).

F 3-25

XML that makes graphics: SVG (Scalable Vector Graphics). Drawing elements, and screen grid coordinates.

Introductory Slideshow and W3Schools SVG Tutorial.

  • Projects: Midterm Checkpoint
    • All or most of your texts are prepared in XML and (nearly) ready for XQuery and analysis. Or the work remaining to prepare your texts is easily defined on your GitHub Issues for the project.
    • Relax NG schema is prepared and associated with your files. The team has been error-correcting and proofing the text base.
    • The website is progressing: there is a site menu and more than one page. Some content appears to announce what this project is about and what questions the team is exploring.
    • Some XQuery over all or some of the team XML is present in the team GitHub repo, and some results of that XQuery are shared, if not on the website, at least in the repo.
Week 11 Class topics Do before class

M 3-28

XQuery to SVG: Pulling data for visualizing, and plotting graphs using FLWOR statements. Namespace issues.

W 3-30

XQuery to SVG: plotting, labelling, scaling, colors. SVG Exercise 2 (from XQuery): Plot a clear, simple, legible, labelled graph.

F 4-01

XQuery to SVG development. Introducing Network Analysis via XQuery and TSV files.
  • SVG Exercise 3: Catch up on previous XQuery exercises if you need to repair them. Continue with plotting SVG from Assassin's Creed, this time converting your plot to a bar graph: Prepare evenly-spaced side-by-side bars, so the count of actions is next to the count of distinct speakers. Prepare X and Y axes for your graph, and label your counts.
  • Prepare for Network Analysis: Install Cytoscape on your computer. Begin familiarizing yourself with the Cytoscape interface, working with Cytoscape session files (with .cys extension) in the textAnalysis-Hub in Class Examples >> XQuery-NetworkAnalysis. Try opening one of the Cytoscape session files (.cys) found in one of the project directories there following our tutorial instructions.
  • Read An Introduction to Network Analysis and Cytoscape for XML Coders
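
The side-by-side bar graph described in SVG Exercise 3 comes down to coordinate arithmetic. The sketch below works through that arithmetic in Python rather than XQuery. The counts, labels, sizes, and colors are all invented, and a real solution would pull its numbers from the XML with a FLWOR statement.

```python
from xml.sax.saxutils import escape

# Hypothetical counts: (label, count of actions, count of distinct speakers).
COUNTS = [("Scene 1", 12, 4), ("Scene 2", 7, 3)]

BAR_WIDTH = 30   # width of each bar in pixels
GROUP_GAP = 40   # space between one pair of bars and the next
SCALE = 10       # pixels of bar height per counted item
BASELINE = 200   # y position of the X axis


def build_svg(data):
    """Return an SVG bar graph with two evenly spaced bars per label."""
    parts = ['<svg xmlns="http://www.w3.org/2000/svg" width="400" height="260">']
    # X and Y axes.
    parts.append(f'<line x1="40" y1="{BASELINE}" x2="380" y2="{BASELINE}" stroke="black"/>')
    parts.append(f'<line x1="40" y1="20" x2="40" y2="{BASELINE}" stroke="black"/>')
    for i, (label, actions, speakers) in enumerate(data):
        x = 60 + i * (2 * BAR_WIDTH + GROUP_GAP)
        bars = ((0, actions, "steelblue"), (BAR_WIDTH, speakers, "orange"))
        for offset, value, color in bars:
            height = value * SCALE
            # SVG's y axis points down, so a bar starts at baseline minus height.
            top = BASELINE - height
            parts.append(
                f'<rect x="{x + offset}" y="{top}" width="{BAR_WIDTH}" '
                f'height="{height}" fill="{color}"/>'
            )
            # Label each bar with its count, just above the bar.
            parts.append(
                f'<text x="{x + offset + BAR_WIDTH // 2}" y="{top - 5}" '
                f'text-anchor="middle">{value}</text>'
            )
        # Label the pair of bars under the X axis.
        parts.append(
            f'<text x="{x + BAR_WIDTH}" y="{BASELINE + 20}" '
            f'text-anchor="middle">{escape(label)}</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


svg = build_svg(COUNTS)
print(svg)
```

The step that most often goes wrong is the vertical direction: because y grows downward on the screen grid, each bar's `y` attribute is the baseline minus the bar's height, not the height itself.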
Week 12 Class topics Do before class

M 4-04

Network Analysis: working with Cytoscape: Importing Data and working with the Network Analyzer and network stats. XQuery to Network Analysis: Exercise 1 (prepare a TSV for class today)
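
A TSV for network analysis is a plain-text table with one edge per row and tab characters between columns. The sketch below writes one with Python's `csv` module. In this course the file would normally be generated with XQuery, and the node names, the interaction label, and the column headings here are invented for illustration.

```python
import csv
import io

# Hypothetical edge list: (source node, interaction type, target node).
EDGES = [
    ("Alice", "speaks_to", "Bob"),
    ("Alice", "speaks_to", "Carol"),
    ("Bob", "speaks_to", "Carol"),
]


def edges_to_tsv(rows):
    """Return the edge list as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["source", "interaction", "target"])
    writer.writerows(rows)
    return buffer.getvalue()


tsv_text = edges_to_tsv(EDGES)
print(tsv_text)
```

When a table like this is imported into Cytoscape, the import dialog is where you indicate which column holds the source nodes and which holds the targets, so the header names themselves are a convenience rather than a requirement.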

W 4-06

Network Analysis, continued: Reorganizing and styling network visualizations. Working with output files on your website.

F 4-08

  • Network Analysis in project development
  • Issue XQuery Test
Week 13 Class topics Do before class

M 4-11

Introducing Python for Digital Humanities work: Orientation to Pycharm Edu IDE and tutorial work together
  • Project Visualization Milestone:
    • Each team prepares visualizations for the project with SVG and/or Network plots. Store these in the project GitHub repo and the newtfire eXist-dB database and post progress and links to these materials on Canvas
    • Progress on Project Websites:
      • Work on incorporating places for analysis and visualizations on the site.
      • Draft background information about your resources and their origins.
      • Write up and post the research questions the team is exploring.

W 4-13

Python tutorial review together
  • Complete Pycharm Edu Community tutorials: Introduction through Strings and submit evidence of completion (via screen capture) on Canvas.
  • XQuery Test due by 11:59pm

F 4-15

Python: Working with libraries (modules and packages), saving and executing files. Introduce first project-data Python exercise. Pycharm Edu: FINISH the Intro to Python tutorials: Data structures (lists and dictionaries), Functions, classes, and objects, modules and packages, file input and output.
Week 14 Class topics Do before class

M 4-18

Python NLP work on projects Python / project data exercise 1

W 4-20

Python NLP work on projects

F 4-22

Python NLP work on projects Python Exercise 3 (and Catch-Up): Pull and save precisely-named and filed text files from your project and process with NLP, continuing to experiment with selections of NLP data (parts of speech, named entities). File in Canvas and on GitHub repos
Week 15 Class topics Do before class

M 4-25

Putting it all together: Discussion, analysis, documentation, web work. Ethics in public-facing digital data representation.
  • AI/Text analytics and ethics reading/discussion (Timnit Gebru, etc.)
  • Project Milestone due: Visualization and Documentation Development

W 4-27

Thinking about user experience, range of audiences. Data curation, analytics, and ethics issues Prep for presentations

F 4-29

Last Day! Teams deliver DIGIT Works presentations Prep for presentations
Finals Week: May 2 - 6 To Complete

H 5-05

Semester projects due by 11:59pm

Finish developing projects, and send a post to me on GitHub and Canvas to indicate your team is finished.