Overview

Students will learn about the Digital Humanities by learning how to use data and text mining tools (Voyant) to explore and analyze textual materials. This lesson supports teaching with and about primary historical sources, and teaching about historical research methods, including new and developing methods for massive archives of digital materials and for thinking about information in the age of Big Data.

Time Required: Half day (4 hours, all computer time)

Target Audiences: Advanced high school, undergraduate college, or graduate students

Materials Required: Computer classroom with web access to go to:
  o Pioneer Days in Florida: http://ufdc.ufl.edu/pioneerdays
  o Voyant Tools (web-based reading and analysis environment for digital texts): http://voyeurtools.org/

Assessments: Because this is an introductory lesson, it is designed to introduce new terms and topics, including helping students become familiar with primary sources, digital research methods, and concepts (Digital Humanities, Big Data, text mining, stop words, distant reading, etc.). Future assessments will build on the material presented in this lesson. Assessments on this lesson alone could include methods to evaluate learning of the new terms and concepts.
Introduction

About Pioneer Days in Florida

The Pioneer Days in Florida: Diaries and Letters from Settling the Sunshine State, 1800-1900 Digital Collection (http://ufdc.ufl.edu/pioneerdays) includes 36,530 pages of diaries and letters describing frontier life in Florida from the end of the colonial period to the beginnings of the modern state. These firsthand accounts document the experiences of settlers, soldiers, and travelers who trailblazed Florida during the wars of Indian Removal, the Civil War, and the Gilded Age. These 19th-century manuscript materials are from the Florida Miscellaneous Manuscripts Collection within the P.K. Yonge Library of Florida History, George A. Smathers Libraries, University of Florida.

Pioneer Days in Florida is a Digital Humanities scholarly work, a curated archive that builds on existing scholar-created intellectual access: finding guides, metadata records, and database information built by scholars to support access to the resources in the collection (first as physical materials, and then as digital). As a digital collection, it opens new opportunities, including text and data mining.

About the Digital Humanities

As defined by Wikipedia (with many competing and complementary definitions in use by scholars):

  "The Digital Humanities are an area of research, teaching, and creation concerned with the intersection of computing and the disciplines of the humanities. Developing from the field of humanities computing, digital humanities embrace a variety of topics, from curating online collections to data mining large cultural data sets." (http://en.wikipedia.org/wiki/Digital_humanities)

Digital Humanities research builds and uses tools for exploring digital texts, and much more.
Teaching Activities

1. Introduction & Orientation to Pioneer Days
  • Introduction to the Digital Humanities and Pioneer Days in Florida (text above)
  • Orientation to Pioneer Days in Florida: http://ufdc.ufl.edu/pioneerdays
    o Browsing
    o Searching
    o Description, and permanent URLs for referencing

2. Student Activities with Pioneer Days
  • Student exploratory time on the site, in pairs or groups, to read about materials
  • Student time to select at least two items of interest, identifying:
    o Title
    o Permanent URL
    o What they think this item will be about or contain (main themes, topics, etc.)
    o How its physical condition informs what they think
  • Optionally: each pair/group of students may take turns sharing with the class their notes on their selected items of interest

3. Overview of the Voyant Tools
  • Showing Voyant Tools, a web-based reading and analysis environment for digital texts: http://voyeurtools.org/
  • Review of the interface: see http://hermeneuti.ca/voyeur/users for the interface overview from the Voyant Tools website, or review the interface with the example
    o Entering sample URL for full PDF [1]: http://ufdcimages.uflib.ufl.edu/AA/00/01/35/01/00001/LogBook.pdf
    o Review of the Voyant Tools interface with the sample

4. Discussion and Activity with the Voyant Tools (detailed below, after the overview and introduction to the Voyant Tools)

[1] Permanent URL for the full item: http://ufdc.ufl.edu/AA00013501/00001
Voyant Tools: initial screen with two main columns
  • Left column, top: Cirrus word cloud [2]
  • Left column, bottom: Summary
  • Right column: Corpus Reader

Notes on the initial screen: Click on the small gear icons to adjust settings. Click on the small expand arrows next to Words in the Entire Corpus and the double arrows to the right of the Corpus Reader to expand other tools.

[2] As explained at http://www.tapor.ca/?id=8: "Cirrus is a visualization tool that displays a word cloud relating to the frequency of words appearing in one or more documents. One can click on any word appearing in the cloud to obtain detailed information about its relativity."
Voyant Tools: main screen with expanded tools showing

Clicking on the small expand arrows displays additional tools, as well as more expand arrows for even more ways to interact with the text.
4. Discussion and Activity with the Voyant Tools

Changing Stop Words

Stop words are words that are filtered from results, and are often based on lists of very common words. For instance, in the screenshot before this section, the largest word in the word cloud is "the". Click on the gear above the Cirrus word cloud to select different options, and select the Taporware (English) stop words. Doing so changes the word cloud dramatically by removing common words like "the".
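For instructors who want to show what stop word filtering does outside the Voyant interface, the following is a minimal Python sketch. It is an illustration under stated assumptions, not a description of how Voyant Tools works internally: the stop word list is a short hypothetical one (not the Taporware list), and the sample sentence is invented rather than quoted from the logbook.

```python
import re
from collections import Counter

# Hypothetical, very short stop word list for illustration only;
# it is NOT the Taporware (English) list offered in Voyant Tools.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "on", "was", "as"}


def tokenize(text):
    """Lowercase the text and split it into words (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts(text, stop_words=frozenset()):
    """Count each word, skipping any word found in stop_words."""
    return Counter(w for w in tokenize(text) if w not in stop_words)


# Invented sample sentence, not a quotation from the logbook.
sample = ("The boat left the landing as the professor read to the passengers, "
          "and the boat moved on.")

# Most frequent words before and after filtering.
print(word_counts(sample).most_common(3))
print(word_counts(sample, STOP_WORDS).most_common(3))
```

Comparing the two printed lists gives students a small-scale version of the change they see in the word cloud: the very common word dominates the first list and disappears from the second.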
Next, click on the gear above the Words in the Entire Corpus to remove the stop words. Notice that the most common words are now "boat" and "professor". Clicking on "professor" in the Words in the Entire Corpus in the left column then generates the Word Trends graph on the far right. Clicking on any of the terms in any of the panels similarly changes other panels.
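A trend graph of this kind can be approximated by hand. The lesson does not describe how Voyant Tools computes Word Trends, so the sketch below rests on an assumption: the text is cut into equal-sized segments and the relative frequency of one term is reported for each. The function name `word_trend`, the segment count, and the sample text are all hypothetical and chosen for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def word_trend(text, term, segments=4):
    """Return the relative frequency of `term` in each of `segments` slices.

    Assumption: slices are (nearly) equal in word count; an empty slice,
    which can occur when the text is very short, is reported as 0.0.
    """
    words = tokenize(text)
    term = term.lower()
    n = len(words)
    trend = []
    for i in range(segments):
        chunk = words[i * n // segments:(i + 1) * n // segments]
        trend.append(chunk.count(term) / len(chunk) if chunk else 0.0)
    return trend


# Invented sample text, not a quotation from the logbook.
sample = ("The professor spoke. The boat moved. The professor read aloud. "
          "The boat stopped. The boat turned.")

# Print a crude text "graph": one row of # marks per segment.
for i, freq in enumerate(word_trend(sample, "boat"), start=1):
    print(f"segment {i}: {freq:.2f} " + "#" * round(freq * 20))
```

Running the same function for "professor" and comparing the two rows side by side mirrors the question the graph invites: where in the text does each term cluster?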
Discussion

With these panels open, and expanding others as interested, how does this relate to other reading experiences?

Discussion Concept: Distant Reading

In the Digital Humanities, one approach or method for reading masses of digitized texts is called Distant Reading. Distant reading employs quantitative methods to analyze texts in new ways and can offer new ways of reading. Distant reading differs from close reading, which is a focused, deliberate, and sustained reading of a text, often concentrating on brief passages to conduct the analysis. Close reading and distant reading are research methods or approaches that inform the gathering of data or evidence and its analysis. Other types of reading include casual reading, such as skimming websites and often moving from one site to another in an exploratory manner.

Activity

In considering different types of reading, students (in pairs or groups) should explore the text using the Voyant Tools with the concept of distant reading (and reading methods overall) in mind. For this activity, students should be asked to verbalize what they think the interface options will do before clicking or making any changes. The other student(s) should also offer their thoughts and discuss any actions and changes.

This pair/group activity format is common in interface and system usability testing, where people in pairs share how they think the system works, which in turn informs system design. It is a useful practice both for usability testing and for users to employ in collaborative learning of new technologies and new methods. This activity format supports learning interfaces together, and it supports awareness and development of new methods for approaching new tools and readings.
Discussion

After time exploring and discussing the text as read in the Voyant Tools, it is useful to review the description for the logbook (which contains all of this text) to see what different information and sense of the material is gained in a quick review.

Logbook description [3]:

  A woman traveler's account of an 1874 voyage on the steamboat Okahumkee of the Hart Line. The memoir appears to be the work of Martha D. (House) Allen, from Pennsylvania, although the name Martha H. Holmes is also associated with the text. Martha House Allen (1827-1900) and her husband Charles J. Allen (1822-1887), of Baring St., Philadelphia, were passengers on the Okahumkee as it traveled up the St. Johns River to the Ocklawaha and on to Silver Springs. She describes daily activities, sights, and adventures on the trip, paying close attention to wildlife and local plants. Referring to herself as "the Scribe", she never uses her or any of the passengers' actual names, instead creating nicknames according to the occupation or personality of the passenger. One of the passengers reads aloud from Harriet Beecher Stowe's Palmetto Leaves, published the previous year, as the steamboat progresses. The Okahumkee, a rear paddle steamer, was one of numerous steamboats operated by the Hart Line in the 1870s, others being the Osceola and the Hiawatha. It was a popular tourist attraction during the last quarter of the nineteenth century and was still afloat, although derelict, in the 1930s.

Optional: Student Activities with Voyant Tools for Text Visualization and Data Mining
  • Student exploration of text visualization and data mining:
    o Students, in pairs or groups, go to Voyant Tools (http://voyeurtools.org/)
    o Students enter text or a URL for one or more of their selected items of interest

[3] http://ufdc.ufl.edu/AA00013501/00001/citation