Full Text
PAGE 1
hope and horror: real-life TEI
the CMS/TEI/XSL/HTML stack
using TEI with Sobek

PAGE 2
executive summary
TEI is the Right Way to transcribe books/texts
TEI is a data model only; it does not provide logic or view
TEI alone is not legible; it needs processing (XSL)
SobekCM can host TEI today, but not well:
default XSL stylesheets would help
it should allow styled TEI to be a principal view for an item, not just metadata

PAGE 3
real-life project
sixteenth-century Latin histories of Florida

PAGE 4
What do we have? Books: sixteenth-century histories of Florida written in Latin
Whom can we help? Neo-Latin scholars; historians of Florida; Latin teachers and students
What's the issue? They're image-based PDFs; only we have them; no translations or commentaries
Solutions? Transcribe to a text-based format; collect and publish in a CMS; add translations and commentary
What technology can help?

PAGE 5
exuberant hope

PAGE 6
What is TEI? (in bullet points)
Standard conventions for transcribing books/texts
in a text-based, searchable format
that computers understand* and manipulate*
that any* web browser can display*
with more* descriptive detail than plain HTML
designed for multiple text streams* like notes
* herein lies previously undisclosed horror

PAGE 7
[Diagram: markup formats across the 80s, 90s, 00s and 10s, ranged from "more texty" to "more databasy": Standard Generalized Markup Language (SGML), old HTML, old TEI, XML, XHTML, HTML 5, TEI, RSS, iTunes Library]
for our simple purposes, TEI is a kind of XML

PAGE 8
TEI looks like HTML but with different/more tags
easy to learn
Sample transcription:
Exemplar Caesarei Privilegii
Rudolphus II. Divina favente clementia electus Romanorum Imperator, semper Augustus, Germaniae, Hungariae, Bohemiae, Dalmatiae, Croatiae, Sclavoniae, etc. Rex, Archidux Austriae, blah blah blah

PAGE 9
    <pb facs="./images/DeBry1591_Page_006.jpg"/>
    <persName ref="http://thesaurus.cerl.org/record/cnp01467224">
    <forename>
possibly many more tags
discuss editorial standards before you begin
this isn't even OCD

PAGE 10
What can we do with TEI?
1. Preservation: transcribe the text for future use; (practically) illegible in the present
2. Presentation: publish a text for real people to read
3. Digital Tools: cross-reference multiple text streams
(1 is more texty, 3 is more databasy)

PAGE 11
TEI for preservation
or, first steps with Sobek

PAGE 12
Sobek's strengths are PDFs and graphics
Sobek normally deals with text as PDF, not as text
non-PDF formats get second-class citizenship

PAGE 13
[Screenshot of an item's files: PDF (actual size); JPG (thumbnail of PDF); TXT (?): 273 bytes, content length: zero]

PAGE 14
Confusing to researcher?

PAGE 15
Why?
Author/editor didn't supply a text-enabled PDF
Author/editor contaminated the pairtree object
No system can eliminate user error
Promoting PDF reduces visibility of user error
but this makes TEI hard to find

PAGE 16
Primary item view for Tommy Tiptop
Where's the text?

PAGE 17
Page Turner View (presentation via graphics, not text; no TEI involved)

PAGE 18
Where's the text? Under metadata
if not used for presentation, TEI should be a second-class citizen

PAGE 19
Tommy Tiptop: TEI for preservation only, if even that
presentation was handled by graphics: TEI adds no value here
but it is a start: this is what a TEI project needs to do first

PAGE 20
What would help?
Sobek excels at PDFs and graphics: make it treat text-based files equally well?
TEI needs to be more than a plain data (XML) view: there needs to be presentation quality if researchers are actually going to read it
Is Sobek maybe not the best fit for a TEI project?

PAGE 21
TEI for presentation and scholarly tools
or, the next levels are not bleeding edge

PAGE 22
TEI for presentation (CHLT, mid 2000s)
main text stream
notes stream
at least it's not raw XML

PAGE 23
TEI in a complex scholarly tool (Perseus, c. 1999)
main text stream
apparatus criticus
translation (+notes)
keyed lexica
legible presentation and useful tool: YAY

PAGE 24
TEI as data model for a scholarly tool (how things work today)
main text stream
textual variants
notes and commentary

PAGE 25
creeping horror
how do we go from illegible TEI to a legible page?

PAGE 26
Document type: HTML | TEI
data model: test | test
without more info, the browser thinks: "you'd like that test displayed in italics" | "your moonspeak means nothing to me"
view: test (in italics) | test (unstyled)
solution: use XSL to translate TEI into HTML (TEI tag into HTML tag, or CSS)

PAGE 27
[Diagram, model to view inside the web browser layout engine: an HTML document goes to the HTML document processor (understanding specific tag meaning) and on to the display. A TEI document goes to the XML document processor (understanding document structure), then, with an XSL stylesheet, through the XSL processor, which turns TEI into HTML for the HTML document processor]
TEI documents on the web need to be translated to HTML

PAGE 28
What is XSL?
translates XML (to HTML, PDF, Word, LaTeX, XML)
XSL 1.0 is supported by every browser
TEI is trivial
XSL/XPath/namespaces is harder

PAGE 29
    <xsl:template match="/" name="htmlShell" priority="99">
      <xsl:call-template name="htmlHead"/>
      <xsl:if test="$includeToolbox = true()">
        <xsl:call-template name="teibpToolbox"/>
      </xsl:if>
      <xsl:apply-templates/>
      <xsl:copy-of select="$htmlFooter"/>
    </xsl:template>
XSL is a programming language written in XML
you mix HTML/output (black) with XSL commands (colors)

PAGE 30
So, can I use someone else's XSL stylesheets?
TEI Consortium publishes some on GitHub
every web browser understands XSL 1.0
no browser understands XSL 2.0
ergo, TEI Consortium's XSL stylesheets are 2.0
Indiana's Boilerplate is XSL 1.0 but simplistic: improves Tommy Tiptop, doesn't handle translations, notes, etc.

PAGE 31
[Two screenshots: no XSL; calling Boilerplate XSL]

PAGE 32
Tommy Tiptop on Boilerplate (at least it's legible)

PAGE 33
copyright page from de Bry's 1591 history of Florida
click on pic for full page image
legible text

PAGE 34
Lesson: going from model to view takes XSL
preservation/transcription? yes we can
presentation of basic text? we're close
addition of translation/other streams? next section
what our project can do now:
need to create a simple XSL stylesheet
best bet: hack down Boilerplate
this could be simpler with a default stylesheet

PAGE 35
what would help now
a default XSL stylesheet on SobekCM
would free authors/editors from having to write XSL
would provide a basic, consistent look
would keep processing client-side (in browser)
but does that violate pairtree object encapsulation?

PAGE 36
going from presentation to tools
that Latinists actually use

PAGE 37
classicists' tools use lots of text streams
Loeb (Harvard) editions: text, translation, textual notes, translator's notes
Teubner editions: text, apparatus fontium, apparatus criticus
asynchronous streams: TOC, introduction, sigla, commentary, index, index nominum, index locorum

PAGE 38
[Page opening, labelled: parallel Greek text; parallel English translation; translator's notes; textual notes]
standard accessibility for classicists: one opening, four synchronous data streams
notes keyed to words, parallel texts to each other

PAGE 39
[Page image, labelled: Latin poem; prose summary; apparatus criticus; editor's notes; plus endnotes much later]
old style: in usum Delphini
four synchronous streams, one asynchronous
ancillary streams keyed to certain words or lines

PAGE 40
syncing text and translation
text: Rudolphus II. Divina favente clementia electus Romanorum Imperator, semper Augustus, Germaniae, Hungariae, Bohemiae, Dalmatiae, Croatiae, Sclavoniae, etc. Rex, Archidux Austriae, blah blah blah blah blah
translation: Rudolph II, elected Emperor of the Romans with Divine Clemency assenting, forever Augustus, king of Germany, Hungary, Bohemia, Dalmatia, Croatia, Slavonia, etc., Archduke of Austria, blah blah blah blah blah
    <linkGrp type="translation">
standoff link table (TEI convention) (= poor man's relational database)

PAGE 41
can it be done?
use XSL conditionals: iterate over linkGrp when IDs are present
use JavaScript: put the main text in the HTML body, hide the rest in invisible divs or iframes or script elements, and have onLoad() format things
doesn't matter! conventions are for the data model (TEI); view logic can be kludged as hard as needed

PAGE 42
should it be done?
SobekCM organizes and catalogues: better to keep text and translation separate?
presentable texts should go in SobekCM
should complex tools go there too? maybe not

PAGE 43
recap

PAGE 44
TEI is one part of a stack
                      Our Intentions    Reuse by Others
content management    SobekCM           SobekCM
raw data model        TEI               TEI
presentation logic    XSL, CSS          someone else's XML parser
manipulation logic    JavaScript        someone else's program
end use               HTML (browser)    someone else's project

PAGE 45
staffing
as things stand with TEI and SobekCM:
TEI authors/editors alone can't produce presentable materials or scholarly tools, just data
need a skilled XSL (and CSS/JS) coder on project staff to make anything legible, let alone shiny/useful
but: default XSL (+CSS/JS?) stylesheets on SobekCM would change the game for simple projects

PAGE 46
Default XSL?
Develop a documented default TEI XSL stylesheet (or core of stylesheets) to cover common use cases?
Respect calls to bespoke XSL within the same pairtree object so that authors can develop complex TEI.
Inject (server-side) a default stylesheet call if none exists, so that Tommy Tiptop never happens?
Consider doing this in presentation logic to preserve pairtree object encapsulation?
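The "respect bespoke XSL, otherwise inject a default stylesheet call" idea can be sketched in a few lines. This is only an illustration in Python using the standard library, not SobekCM code: the function name `ensure_stylesheet` and the path `/default/tei-default.xsl` are hypothetical, and the sketch assumes the stylesheet call takes the usual form of an `xml-stylesheet` processing instruction placed before the root element.

```python
from xml.dom import minidom

# Hypothetical location of a host-wide default stylesheet (an assumption).
DEFAULT_XSL_HREF = "/default/tei-default.xsl"


def ensure_stylesheet(tei_xml: str, href: str = DEFAULT_XSL_HREF) -> str:
    """Return the TEI document with a stylesheet call guaranteed to be present.

    If the author already calls a bespoke stylesheet, the document is
    returned untouched. Otherwise a default call is added to the served copy.
    """
    doc = minidom.parseString(tei_xml)

    # Look only at top-level nodes: the stylesheet call sits before the root.
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            return tei_xml  # bespoke XSL wins

    pi = doc.createProcessingInstruction(
        "xml-stylesheet", f'type="text/xsl" href="{href}"'
    )
    doc.insertBefore(pi, doc.documentElement)
    return doc.toxml()
```

Because the function returns a new string and never writes to the stored file, the injection stays in presentation logic and the pairtree object itself is left as the author deposited it.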
PAGE 47
TEI: what class citizen?
Allow TEI to be the principal view for an item only if it adds value (presentation/tools).
Default TEI to be ancillary metadata if it adds no value (preservation only: Tommy Tiptop).
Do not index XSL, CSS, or JS called from TEI files: they shouldn't be treated as catalogued items.

PAGE 48
Preservation only
Metadata and structural markup
Not necessarily legible
Tommy Tiptop
TEI suffices

Presentation for Skilled Latinists
Legible text (for those who read Latin)
needs basic XSL/CSS
Boilerplate or some vanilla XSL would suffice

Public Accessibility and Scholarly Tools
Ancillary material: apparatus, translation, commentary, indices, popover glosses, etc.
needs moar TEI and advanced XSL, CSS and JS
needs new host
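To make the third tier concrete, here is a rough sketch of the lookup that any view logic has to perform on the standoff link table used for syncing text and translation. The slides propose XSL or JavaScript for this; Python is swapped in here purely for illustration. The sample document is hypothetical: the `s` segments, the `xml:id` values, and the `link` elements with space-separated `target` pointers are assumptions about how such a file might be encoded, with phrases borrowed from the Rudolphus example.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical miniature TEI file: two Latin segments, two English
# segments, and a standoff link table tying them together.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div type="text" xml:lang="la">
        <s xml:id="la1">Rudolphus II.</s>
        <s xml:id="la2">Archidux Austriae</s>
      </div>
      <div type="translation" xml:lang="en">
        <s xml:id="en1">Rudolph II,</s>
        <s xml:id="en2">Archduke of Austria</s>
      </div>
      <linkGrp type="translation">
        <link target="#la1 #en1"/>
        <link target="#la2 #en2"/>
      </linkGrp>
    </body>
  </text>
</TEI>"""


def paired_segments(tei_xml):
    """Return (text, translation) pairs by following the link table."""
    root = ET.fromstring(tei_xml)

    # Index every element that carries an xml:id by that id.
    by_id = {
        el.get(XML_ID): "".join(el.itertext()).strip()
        for el in root.iter()
        if el.get(XML_ID)
    }

    pairs = []
    links = root.iterfind(
        ".//tei:linkGrp[@type='translation']/tei:link", {"tei": TEI_NS}
    )
    for link in links:
        # Each target holds two pointers of the form "#id".
        ids = [t.lstrip("#") for t in link.get("target", "").split()]
        if len(ids) == 2 and all(i in by_id for i in ids):
            pairs.append((by_id[ids[0]], by_id[ids[1]]))
    return pairs


if __name__ == "__main__":
    for latin, english in paired_segments(SAMPLE):
        print(f"{latin}  |  {english}")
```

The point of the sketch is the division of labour: the TEI file only records which segment answers to which, and whatever renders the page (XSL, JavaScript, or something else) is free to lay the pairs out side by side, as popovers, or not at all.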