Improving Digital Collection Access with Simple Search Engine Optimisation Strategies

Material Information

Improving Digital Collection Access with Simple Search Engine Optimisation Strategies
Series Title:
African Studies in the Digital Age. DisConnects?
Physical Description:
Book Chapter
Reboussin, Daniel A.
Taylor, Laurie N.
Place of Publication:
Leiden, The Netherlands
Publication Date:


Collected for University of Florida's Institutional Repository by the UFIR Self-Submittal tool. Submitted by Daniel Reboussin.
Publication Status:
In Press
General Note:
Volume Editors Terry Barringer and Marion Wallace.

Record Information

Source Institution:
University of Florida Institutional Repository
Holding Location:
Rights Management:
Rights reserved by authors and publisher.
System ID:



Improving Digital Collection Access with Simple Search Engine Optimization Strategies

Daniel A. Reboussin, Ph.D.
Head, African Studies Collection
University of Florida George A. Smathers Libraries

Laurie N. Taylor, Ph.D.
Digital Humanities Librarian
University of Florida George A. Smathers Libraries

Challenges to supporting academic library research

Supporting research access to scholarly information is an increasing challenge in academic libraries, requiring the improvement and adaptation of established practices along with the application of new approaches. In the African context a lack of adequate infrastructure, technical communications, and social support poses additional challenges1, 2, 3 (including those for western scholars seeking African initiated and produced research).4 However, these issues are broadly shared: the corpus of data appropriate for scholarly attention has been redefined and expanded;5, 6 growth in the volume of published research continues at unprecedented rates;7, 8, 9 and the avenues for accessing information resources

1 W. Wresch, Disconnected: Haves and Have-nots in the Information Age (New Brunswick NJ, Rutgers University Press, 1996).
2 For an optimistic view, see: C. Angello, 'The Awareness and Use of Electronic Information Sources Among Livestock Researchers in Tanzania', Journal of Information Literacy 4, 2 (2010), pp. 6-22, available at V4 I2 20101
3 A. Kiyindou, 'Accessibilité de l'information en Afrique', Africa Media Review 16, 1 (2008), pp. 73-90, available at http://
4 G. Walsh, 'Can we get there from here? Negotiating the Washouts, Cave-ins, Dead ends and Other Hazards on the Road to Research on Africa', The Reference Librarian, 42, 87 (2004), pp. 5-96, available at
5 C. Borgman, Scholarship in the Digital Age: Information, Infrastructure, and the Internet (Cambridge, MA, MIT Press, 2007), p. 214.
6 S. Wilberley, Jr. and W. Jones, 'Patterns of Information Seeking in the Humanities', College & Research Libraries, 50, 6 (1989), pp. 638-645.
7 K. Schmidt and N. Newsome, 'The Changing Landscape of Serials', The Serials Librarian, 52, 1-2 (2007), pp. 119-133, available at
8 J. Navin and J. Vandever, 'The Market for Scholarly Communication', Journal of Library Administration, 51, 5/6 (2011), pp. 455-463.
9 M. Ware and M. Mabe, The STM Report: An Overview of Scientific and Scholarly Journal Publishing (Oxford, International Association of Scientific, Technical, and Medical Publishers/Prama House, 2009).


have diversified greatly into data silos.10, 11, 12 Together, these and related factors make today's scholarly information landscape larger and more complex to navigate than it was even just a few years ago.13, 14

For centuries the challenges of research access centred on identifying, locating, and delivering physical materials to scholars (and scholars to collections). While these issues remain relevant, the centre has shifted. The focus now is on providing scholars with effective bibliographic tools, and on assisting them to choose the most appropriate tools for the question at hand and to apply search techniques that generate the best results. In doing so, we allow scholars to assess and evaluate only the most relevant sources in order to determine their value for further attention. Filtering many results within one's time constraints is a more common current problem than identifying a sufficient number of sources. Faced with an increasingly complex and challenging information environment, it is no wonder that many library users prefer fewer, simple, unified search tools15 that resemble resources like Wikipedia16 and Google which, for better or worse in the context of scholarly research, they are familiar with and trust.17, 18

In this chapter the authors propose that librarians, archivists, and collection managers19 who are responsible for providing access to research collections consider an additional approach: Search Engine Optimization (SEO).20 SEO does not rely on knowing or contacting potential users, but it nevertheless improves the effectiveness of their general online searches by elevating, when relevant, library and archival collection materials to a more prominent place in search engine results.

Research users' information seeking behavior21 is changing in ways that make it difficult for librarians to fully support access to scholarly information with established practices alone: library research is increasingly undertaken initially or even exclusively online,22, 23, 24 removed from library social spaces,25, 26

10 M. Somerville, 'SAGE's White Paper on Discoverability in the Twenty First Century: Collaboration Opportunities for Publishers, Vendors, and Librarians', Against the Grain, 24, 3 (2012), pp. 18-22, available at http://www.against sages white paper on discoverability in the twenty first century collaborationopportunities for publishers vendors andlibrarians/
11 A. Hey and A. Trefethen, 'The Data Deluge: an eScience Perspective', in F. Berman, G. Fox and A.J.G. Hey (eds), Grid Computing: Making the Global Infrastructure a Reality (Chichester, Wiley, 2003), pp. 809-824.
12 D. Fulkerson, Remote Access Technologies for Library Collections (Hershey PA, Information Science Reference (IGI Global), 2012). Available at /9781466602342
13 P. Lyman and H. R. Varian, How Much Information (2003), available at muchinfo 2003
14 C. Lynch, 'The Institutional Challenges of Cyberinfrastructure and e-Research', EDUCAUSE Review 43, 6 (2008), pp. 74-88. Available at challenges cyberinfrastructureande research
15 See Borgman, Scholarship in the Digital Age, p. 255.
16 See A. Head and M. Eisenberg, 'Finding Context: What Today's College Students say about Conducting Research in the Digital Age', in Project Information Literacy Progress Report (Seattle, WA, University of Washington Information School, 2009), pp. 11-13, available at ons/
17 E. Hargittai, L. Fullerton, E. Menchen-Trevino and K. Yates Thomas, 'Trust Online: Young Adults' Evaluation of Web Content', International Journal of Communication, 4 (2010), pp. 468-494.
18 K. Purcell, J. Brenner and L. Rainie, Search Engine Use 2012 (Washington, DC, Pew Research Center's Internet & American Life Project, March 3, 2012), available at Use 2012.aspx
19 Hereafter, with the term 'librarian' or 'curator', we intend to include archivists and other professional collection managers, within or outside a library institutional context.
20 N. Carroll, [Entry on Search Engine Optimization], Encyclopedia of Library and Information Sciences 3rd ed. (Boca Raton FL, Taylor & Francis, 2011). Available at
21 D. Case, Looking for Information: a Survey of Research on Information Seeking, Needs and Behavior 3rd ed. (New York, Emerald Press, 2012).
22 L. Duke and A. Asher (eds), College Libraries and Student Culture: What We Now Know (Chicago, IL, American Library Association, 2012).


resulting in fewer opportunities27 for librarians to constructively intervene and offer mediated search services28 (serendipitously assisting and training users at their moment of greatest need and receptivity).29 In the networked, online scholarly research environment researchers may not recognize that they are employing library mediated resources, which have been selected, subscribed to and managed based on local needs (a transparent service30 that may represent a political failure with negative budgetary implications). Many may, in fact, employ external discovery tools preferentially over the catalogue and other library sponsored systems.31 Even when employing library systems, researchers may fail to authenticate their online activities with an institutional password, preventing direct, legitimate access to subscription and proprietary Deep Web resources unavailable to general browsers32 (perhaps relying instead on peer based resource sharing).33 Many users also consider librarians to be book experts or guides to library physical spaces,34 dissuading requests35 for mediated research assistance from librarians.36 A variety of disciplinary, demographic, racial and economic differences among library users (and non-users) may further limit the reach and benefit of library resources developed with the intent of benefiting all.37

How do we reach the broad population of academic library users and expand our services to potential users? We are interested here primarily in facilitating effective scholarly research and improving intellectual access to scholarly materials as a public service, rather than marketing library services.

23 N. Foster and S. Gibbons (eds), Studying Students: the Undergraduate Research Project at the University of Rochester (Chicago, IL, Association of College and Research Libraries, 2007). Available at
24 Purcell, et al., Search Engine Use 2012.
25 R. Vondracek, 'Comfort and Convenience? Why Students Choose Alternatives to the Library', Portal 7, 3 (2007), pp. 277-293. Available at /portal_libraries_and_the_academy/v007/7.3vondracek.html
26 S. Bennett, Library as Place: Rethinking Roles, Rethinking Space. CLIR Pub. 129. (Washington, DC, Council on Library and Information Resources, 2005). Available at
27 S. Miller and N. Murillo, 'Why don't Students ask Librarians for Help?: Undergraduate Help-seeking Behavior in Three Academic Libraries', in Duke and Asher, College Libraries and Student Culture, pp. 49-70.
28 B. Nardi and V. O'Day, Information Ecologies: using Technology with Heart (Cambridge, MA, MIT Press, 1999).
29 M. Block, 'Teach Them While They're Asking for Information', ExLibris (5 July 2002).
30 See: Google, 'Northwestern University Library increases Access to its Electronic Holdings using Google Scholar Library Links', Google Scholar's Library Links Program Case Study 2007. Available at
31 S. Harris, Moving Towards an Open Access Future: the Role of Academic Libraries (SAGE Publications and British Library, 2012). Available at OAReport.pdf
32 B. He, M. Patel, Z. Zhang and K. Chang, 'Accessing the Deep Web: a Survey', Communications of the ACM (CACM) 50, 2 (2007), pp. 94-101, available at doi: 10.1145/1230819.1241670
33 V. O'Day and R. Jeffries, 'Information Artisans: Patterns of Result Sharing by Information Searchers', COCS '93 Proceedings of the conference on Organizational computing systems (New York, ACM, 1993), pp. 98-107. Available at 10.1145/168555.168566
34 B. Fister, 'Fear of Reference', Chronicle of Higher Education (14 June 2002), p. B20, available at of Reference/2928
35 Miller and Murillo, 'Why don't Students ask Librarians for Help?', in Duke and Asher, College Libraries and Student Culture, p. 53.
36 L. Suchman, Human-machine Reconfigurations: Plans and Situated Actions 2nd ed. (New York, Cambridge University Press, 2007).
37 See E. Whitmire, 'Cultural Diversity and Undergraduates' Academic Library Use', Journal of Academic Librarianship, 29, 3 (2003), pp. 148-161.


Two established approaches to support collection access

Academic libraries rely on two established, general approaches for supporting scholarly access to library resources. The first consists of maintaining and improving bibliographic search and discovery tools. Examples include the library's catalogue,38, 39 its Web sites and associated databases, as well as indexes and finding aids.40 Alongside the Online Public Access Catalogue (OPAC) are navigation platforms (Blacklight41 is an example), electronic journal management applications and federated search or Web-scale discovery42, 43 products (e.g. Summon, WorldCat Local, EBSCO Discovery Services, and Primo Central). These tools support integrated access to print and digital collections, ideally providing a single access point to resources, independent of location. An important consideration, particularly in the African context, is that such highly integrated, well designed and easy to use (though difficult to set up) commercial products are too expensive for many libraries to implement on behalf of their users.

Beyond the question of affordability is usage: are library patrons using the resources for which their institutions have paid so much in support of scholarly work? Without consulting the library Web site or OPAC, patrons cannot access all of the resources available to authenticated users. This may lead researchers to purchase individual access through one resource aggregator (which shows access is not provided by the library license with that site), although access paid for by their institution may be available elsewhere, as indicated in the OPAC or library Web site.

In today's Web-centred research environment, alternatives to library developed or sponsored search and discovery tools are available elsewhere online,44 so the local library Web site (which itself may appear outdated, dauntingly complex, or poorly designed to many users)45, 46, 47 may have limited impact, despite the high quality, specialized and appropriate tools it offers. Educating library users to authenticate their access to licensed resources and consult the library's online tools at the appropriate time is a challenge that highlights the socio-technical nature of many library research and access issues, as well as the benefits of expert human mediation in facilitating the highest quality library research.48 A static approach to reaching users (addressing only those who visit the library, either physically or via its online services) results in lost opportunities to support and assist researchers (using other resources) to conduct their scholarly work more effectively.

38 D. Tyckoson, 'The catalog as index to the collection', Technicalities, 17, 1 (1997), pp. 10-12.
39 K. Antelman, E. Lynema, and A.K. Pace, 'Toward a twenty first century library catalog', Information Technology & Libraries 25, 3 (2006), pp. 128-139.
40 See Borgman, Scholarship in the Digital Age, p. 82.
41 Project Blacklight, Project Blacklight, available at
42 J. Vaughan, 'Web Scale Discovery Services', Library Technology Reports 47, 1 (2011), pp. 1-61.
43 S. Garrison, G. Boston, and S. Bair, 'Taming Lightning in More Than One Bottle: Implementing a Local Next Generation Catalog Versus a Hosted Web-Scale Discovery Service', ACRL Conference Proceedings (Philadelphia, 30 March-2 April 2011), available at
44 See G. Herrera, 'Google Scholar Users and User Behaviors: an Exploratory Study', College & Research Libraries 72, 4 (2011), pp. 316-330, available at
45 Y. M. Kim, 'The Adoption of University Library Website Resources: a Multi-Group Analysis', Journal of the American Society for Information Science and Technology 61, 5 (2010), pp. 978-993.
46 P. Harpel Burke, 'Library Homepage Design at Medium Sized Universities: a Comparison to Commercial Homepages via Nielsen and Tahir', Library Homepage Design, 21, 3 (2010), pp. 193-208.
47 A. Tombros, I. Ruthven, and J. Jose, 'How Users Assess Web Pages for Information Seeking', Journal of the American Society for Information Science and Technology 56, 4 (2005), pp. 327-344.
48 Nardi and O'Day, Information Ecologies.


The second approach that academic libraries undertake to support and improve access is instruction in information literacy49, 50 for library users (and identified groups of potential users). While historically library instruction extends reference public service support activities to assist users with library services and resources,51 a more currently accepted view is that 'Academic librarians teach students information literacy skills to successfully complete assignments in preparation for a 21st century workplace....'52 Proven best practices to advance the quality of library research include offering formal53 and informal54, 55 information literacy training.56 While effective for students who are engaged by such efforts,57 this approach has limits in scope, depth and acceptance.58 The range of library instruction practices across academic institutions is broad, reflecting a weak institutionalization of instruction programmes among academic libraries, educational institutions and, for North America, a lack of support by means of accreditation criteria.59 While there are international examples of a comprehensive incorporation of information literacy programmes into the curriculum,60, 61 it may be that, under current budgetary circumstances, the best opportunity for most academic libraries to support broad information literacy instruction is with online courses,62 which do not necessarily suffer from the problem of scalability63 and may require few recurring resources once implemented. In the present context, because instruction is

49 CILIP, Information Literacy: Definition, Web site (London, Chartered Institute of Library and Information Professionals, 2011), available at involved/advocacy/informationliteracy/Pages/definition.aspx
50 C. Cox and E. Lindsay (eds), Information Literacy Instruction Handbook (Chicago, IL, Association of College and Research Libraries, 2008).
51 C. Seymour, 'Ethnographic Study of Information Literacy Librarians' Work Experience: a Report from Two States', in C. Wilkinson and C. Bruch (eds), Transforming Information Literacy Programs: Intersecting Frontiers of Self, Library Culture, and Campus Community (Chicago, IL, Association of College and Research Libraries, 2012), p. 67.
52 E. Welty, S. Hofstetter and S. Schulte, 'Time to Re-evaluate How We Teach Information Literacy: Applying PICO in Library Instruction', College & Research Libraries News 73, 8 (2012), p. 476.
53 C. Hollister (ed.), Best Practices for Credit-bearing Information Literacy Courses (Chicago, IL, Association of College and Research Libraries, 2010).
54 M. O'Kelly and C. Lyon, 'Google Like a Librarian: Sharing Skills for Search Success', College & Research Libraries News 72, 6 (2011), pp. 330-332.
55 B. Dewey, 'The Embedded Librarian', Resource Sharing and Information Networks 17, 1 (2005), pp. 5-7.
56 A. Daugherty and M. Russo, 'An Assessment of the Lasting Effects of a Stand-Alone Information Literacy Course: the Students' Perspective', Journal of Academic Librarianship 37, 4 (2011), pp. 319-326.
57 P. Maughan, 'Assessing Information Literacy among Undergraduates: a Discussion of the Literature and the University of California-Berkeley Assessment Experience', College & Research Libraries 62, 1 (2001), pp. 71-85, available at 2/1/71.full.pdf+html
58 K. Fast and D. Campbell, 'I Still Like Google: University Student Perceptions of Searching OPACs and the Web', in Proceedings of the 67th ASIS&T Annual Meeting (Providence, RI, American Society for Information Science and Technology, 2004), p. 144, available at: ons_of_Searching_OPACs_and_the_Web
59 C. Bruch and C. Wilkinson, 'Surveying Terrain, Clearing Pathways', in Wilkinson and Bruch, Transforming Information Literacy Programs, pp. 10-11.
60 A. Coulon, Penser, Classer, Catégoriser: l'Efficacité de l'Enseignement de la Méthodologie dans les Premiers Cycles Universitaires. Le Cas de l'Université de Paris 8 (Saint Denis, Association International de Recherche Ethnométhodologique, 1999).
61 J. Lau (ed), Information Literacy: International Perspectives (Munich, K.G. Saur, 2008).
62 Y. Mery, J. Newby and K. Peng, 'Why One-shot Information Literacy Sessions are Not the Future of Instruction: a Case for Online Credit Courses', College & Research Libraries 73, 4 (2012), pp. 366-377, available at
63 C. Gibson, 'The History of Information Literacy', in Cox and Lindsay (eds), Information Literacy Instruction Handbook, p. 15.


most often comprised of either a general survey or a discipline specific approach, research relating to manuscripts and archives (including digital primary resource collections) is rarely emphasized.

The authors support improving access to library collections in these proven and established ways. However, in this chapter we introduce a less well known, but simple and powerful additional way to assist library users in discovering relevant collections while conducting research away from library buildings, on their own, and without prior knowledge of potentially useful collections. Librarians who develop tools for discoverability64 and access (including creating metadata and finding aids for archival and manuscript collections) should consider employing simple SEO strategies to ensure that researchers without superior online search skills can find relevant resources in their collections.65

Extending our reach with SEO practices

Although less widely known in academia and infrequently employed by collection curators,66, 67 a third approach has great potential for expanding the scope and impact of library research support activities. It provides a public service to unknown users who lack specialized skills, use unauthenticated accounts, and are unaware of the specific resources that meet the search criteria they have entered on a general Web search engine site. Employing SEO techniques can enhance resource discoverability to expand the scope of impact beyond library walls. Further, these techniques can be initiated by individual collection curators at little cost. Creating or editing Wikipedia pages related to one's collection contents, for example, can complement traditional methods in promoting discoverability and access68, 69 and may have an immediate, noticeable and demonstrable impact on online results pages and server log statistics.70, 71

In the remainder of this chapter, we describe the Web environment, outline how search engines work, and provide examples of techniques from our collaboration on the J.M. Derscheid Collection that curators can use or adapt as enhancements to established practices that will make their digital archives and other collections more readily discoverable online. Throughout, we emphasize that SEO work is socio-technical, relying on the interaction of people and technology. By understanding the research context and contents of our collections, communicating the relevant concepts related to their contents and by linking and strengthening the intellectual connections across scholarly communications channels, we work together to leverage technical tools in ways that enhance social connections and facilitate scholarly research.72

64 'Discoverability is used for information seeking and usability. For usability, discoverability is the ability for users to locate something they need to complete a certain task', from S. Ginsberg, 'The Evolution of Discoverability', UX Magazine (March 2, 2012: Article 629), available at of discoverability
65 Carroll, [Search Engine Optimization], 2011.
66 This is changing with initiatives to improve SEO through Wikipedia editing, as with studies/wikipediaat uw/ ; archives is hostinga wikipediaedit a thonon thethemeof womenat princeton/ ; edit thon ;; and
67 See also 'Search Engine Optimization', Writing your Article (Abingdon, Taylor & Francis Author Services), available at
68 J. Beel, B. Gipp and E. Wilde, 'Academic Search Engine Optimization (ASEO): Optimizing Scholarly Literature for Google Scholar and Co.', Journal of Scholarly Publishing, 41, 2 (2010), pp. 176-190.
69 E. Rushton, M. Kelehan and M. Strong, 'Searching for a New Way to Reach Patrons: a Search Engine Optimization Pilot Project at Binghamton University Libraries', Journal of Web Librarianship, 2, 4 (2008), pp. 525-547, available at
70 K. Cahill and R. Chalut, 'Optimal Results: what Libraries Need to Know about Google and Search Engine Optimization', The Reference Librarian, 50, 3 (2009), pp. 234-247, available at
71 E. Rushton and S. Funke, 'The Goodness in the Evil of SEO', Searcher 19, 9 (2011), pp. 30-35.
72 R. Kling, G. McKim and A. King, 'A Bit More to It: Scholarly Communication Forums as Socio-Technical Interaction Networks', Journal of the American Society for Information Science and Technology 54, 1


SEO work in the context of academic libraries and archives serves the aims of research, teaching, and service. Thus, such work constitutes public scholarship.73 Our work is valuable in assisting researchers to discover library and archival materials that are worthy of consideration given their online searches. This chapter uses as an example the award-winning set of techniques that the authors employed to promote the Derscheid Collection in the University of Florida Digital Collections (UFDC).74, 75 The techniques outlined here are effective whether or not researchers read our SEO contributions online or are even aware, prior to searching, that (for example) the Derscheid Collection is potentially relevant to their research.76

How Web Search Engines Work

Web search engines succeed by providing ready, accurate access to the materials for which users are searching; however, many Web sites are not searched because they are not indexed. These are sometimes referred to as the Deep or Invisible Web.77 Many other Web sites could be included in the Visible Web, but are not (or fail to be included at the ideal level of relevance) because they do not conform to search engine requirements. Google explains how Search works:

    Today our algorithms rely on more than 200 unique signals, some of which you'd expect, like how often the search terms occur on the webpage, if they appear in the title or whether synonyms of the search terms occur on the page. Google has invented many innovations in search to improve the answers you find. The first and most well known is PageRank, named for Larry Page (Google's co-founder and CEO). PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites.78

While commercial search engine algorithms are proprietary and the many unique signals and methods are updated continuously by the provider, Google, Bing, and other search engines provide documentation to assist Web site managers to optimize sites for inclusion.79 We focus on Google because it is currently the dominant global search engine and is one of the most visited sites worldwide.80 Google explains the basics of search as falling into three categories: crawling, indexing, and serving.81 Crawling

(72, continued) (2003), 57-67, available at JASIST.pdf Also see: J. Bennett, Vibrant Matter: a Political Ecology of Things (Durham NC, Duke UP, 2010).
73 In the US, public scholarship is undertaken in all fields with initiatives like Imagining America connecting artists and public life ( ) and with the role of public, land-grant universities specifically founded on supporting the public interest as explained by the Association of Public and Land-Grant Universities, available at http://www.aplu.org/page.aspx?pid=203
74 Center for Research Libraries, 'CRL Primary Source Awards', Focus on Global Resources 31, 3 (2012), pp. 3-4, available at
75 D. Reboussin, J. M. Derscheid Digital Collection (Gainesville, FL, George A. Smathers Libraries, University of Florida Digital Collections, 2011), available at
76 Carroll, [Search Engine Optimization], 2011.
77 C. Sherman and G. Price, 'The Invisible Web: Uncovering Sources Search Engines Can't See', Library Trends, 52, 2 (2003), pp. 282-298, available at Deep Web, Wikipedia, available at
78 How Google Search Works, available at
79 Google Webmaster Tools, available at ; see also: Bing Webmaster Tools, available at
80 Top 500, Alexa, available at
81 Google Webmaster Tools, Google Basics, available at


refers to the process by which Google's algorithms analyze Web pages based on prior crawls,82 Web links, and sitemaps. The Sitemap protocol defines how sitemaps are used as:

    an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.83

In addition to exploiting protocols like these for sitemaps, Google publishes best practices to ensure Web sites are indexed properly and to prevent sites from being removed. Common best practice recommendations include design, content, technical, and quality guidelines.84 In addition to the best practices recommended by search engine providers, other Web design best practices also support SEO, including designing for user accessibility; for instance, when curators create alternative text for images, visually impaired users employing audio technology can hear image descriptions on the Web site.85 SEO is thus part of a larger holistic discoverability strategy that supports Web site discoverability, content discovery within Web sites, and navigation through the Web to find information.86

While search engine guidelines may appear to relate to those areas of libraries controlled by technical professionals or dictated by systems outside content creator control, many curators do control the quality of the information presented on library Web sites. More importantly:

    Though relevance ranking algorithms can factor in the location and frequency of word occurrence, there is no way for software to accurately determine aboutness. [...] Metadata tags applied by humans can indicate aboutness thereby improving precision. This is one of Google's secrets for success. Google's PageRank algorithm recognizes inbound links constructed by humans to be an excellent indicator of aboutness.87

The metadata for aboutness includes the rich content and contextual information within a Web site, including the information inherent in the site itself. For example, .edu and .ac sites are restricted to educational, academic and research institutions. They are thus more likely to host quality content. Sites with .org domains tend to be more information rich than .com or .xxx sites, so domain extensions are factors for search engine relevancies and rankings.88 The network itself provides contextual information, as for a Web site that is densely interlinked with well ranked sites, validating the site content through its peers in an online community.

Aboutness is critical to search engine operation and a core concept in understanding how SEO functions using both technical and social standards: 'Technological constraints and social construction always interact in such a way that it is impossible to separate the two.'89 By focusing on aboutness instead of technical attributes, which may or may not be accessible to content creators, we focus here on SEO aspects that curators can control for meaningful impact on search engine ranking of their collections and

82 Google Basics: Crawling. Google Webmaster Tools, available at
83 Home. Sitemaps, available at
84 Google Webmaster Tools, available at
85 A. Hagans, 'High Accessibility is Effective Search Engine Optimization', A List Apart (2005), available at:
86 P. Morville, Ambient Findability (Cambridge, MA, O'Reilly Media, 2005), p. 10.
87 Ibid., p. 53.
88 Indeed, the University of Idaho Library and other information literacy resources recommend human evaluation of domain extensions as factors for evaluating authority for Web sites (see e.g., ebsiteeval.html and ).
89 See J. Bolter, 'Ekphrasis, Virtual Reality, and the Future of Writing', in G. Nunberg (ed), The Future of the Book (Berkeley CA, University of California Press, 1996), p. 254.


information discoverability for scholarly research. SEO is an enhancement, not an alternative to standard best practices. It takes advantage of the quality and extent of metadata, as well as such things as the technical data that allow item level access, to extend the reach of established tools, offering library resources to a broader scholarly community online. SEO strategies leverage best practices from the Web and curatorial practices supporting access, improving discoverability. As Diane Fulkerson argues:

"One of the major problems with digital collections is the lack of an overall index to collections. For researchers to find collections relevant to their research requires them to perform a specific Internet search that includes the phrase digital collection. [...] Landing pages are another option for promoting or marketing digital collections. A landing page is a one-page web advertisement you arrive at by clicking on a provided link."90

Creating a landing page for a specific collection follows SEO best practices by creating a single point of reference with relevant contextual information providing aboutness information for the digital collection site and its contents. Creating a landing page also follows curatorial best practices by providing the metacontext and metadata that orients readers, as well as providing resource links, help on using the system and contact information.91 A digital collection landing page supports the critical framework for contextual information in digital collections.92 Digital library researcher Jenn Riley, of the (US) Carolina Digital Library and Archives, stresses the importance of curated collections even in large scale projects, as usage improves for curated collections and for individual items when presented in context along with their collection.93 Landing pages for collections provide links among related resources, suggesting examples of the scholarly context. Scholarly data and documents are of most value when they are interconnected rather than independent.94

Curators may not have control over important factors for SEO related to technical optimization. Search engines rely on a number of technical indicators in indexing sites, including server response time, sitemaps, server directory structures, HTML, CSS, and the like. However, curators may be able to raise concerns regarding SEO to technical collaborators for improvement and optimization. In some cases technical concerns may not be addressed due to a lack of funding for hardware, software, personnel, or for reasons related to proprietary software and allowable configuration.

Whether or not curators have optimal technical support and whether or not it can be improved, there are ways curators can leverage their existing content expertise for SEO. As with the creation of a landing page, which is more about content than technology, curators can contribute information on digital collections or specific items to blogs, online newsletters, scholarly portals and other trusted sites. When curators contribute information with links, search engines will connect these with aboutness information. These are important factors in deciding what is included in search engine results and in ranking them. Curators can also influence this process by sharing information on sites relevant to their community of known users and by contributing to Wikipedia.

90 Fulkerson, Remote Access Technologies for Library Collections p. 71.
91 Report refers to each collection main page as a landing page and discusses landing pages as important for design considerations: from our dlp intern/ Processing standard for creating a digital collection includes creating a landing page for promotion and marketing for this consortium: groups/digitizationgroup1/project steps
92 C. Lee, A Framework for Contextual Information in Digital Collections, Journal of Documentation, 67, 1 (2011), pp. 95-143, available at
93 J. Riley, Competing Priorities: Sustainability, Growth, and Innovation In Digital Collections, CNI Spring 2012 Project Briefings (2 April 2012, Baltimore, MD), available at libraries/competingpriorities sustainability gr owth andinnovationin digital collections/ and
94 Borgman, Scholarship in the Digital Age, p. 10.
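Sitemaps are among the technical indicators listed above that a curator might raise with technical collaborators. As a minimal illustrative sketch only, and not a description of how the UF Digital Collections generate their sitemaps, the following Python code writes a small Sitemap file containing the per-URL metadata the Sitemap protocol describes (last update, change frequency, relative priority). The example.org addresses, dates, and the build_sitemap helper name are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical pages: a collection landing page and one item page.
# The URLs, dates, and values are placeholders, not real addresses.
EXAMPLE_PAGES = [
    {
        "loc": "https://example.org/collections/derscheid",
        "lastmod": "2012-07-01",
        "changefreq": "monthly",
        "priority": 1.0,
    },
    {
        "loc": "https://example.org/collections/derscheid/item-0001",
        "lastmod": "2011-06-15",
        "changefreq": "yearly",
        "priority": 0.5,
    },
]


def build_sitemap(pages):
    """Return a Sitemap XML string for a list of page dictionaries.

    Each dictionary must contain "loc"; the optional keys "lastmod",
    "changefreq", and "priority" are written only when present.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap(EXAMPLE_PAGES))
```

In this sketch the landing page is given the highest relative priority, on the assumption that it is the preferred entry point for incoming links; whether a given search engine acts on that hint is outside the curator's control.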


Writing and editing Wikipedia articles is in keeping with best practices for SEO, curation, and the creation of scholarly or research context because Wikipedia is one of the most visited Web sites,95 offering a framework to provide contextual information linking to external sources, including library and digital collection Web sites. Therefore, it is an excellent resource for search engines, with a high impact for SEO, because it presents properly structured information in a technically robust manner with full context for aboutness. Thus, contributions to Wikipedia can effectively inform the larger information environment for SEO.

Wikipedia should be a familiar genre for curators and scholars who have contributed to other encyclopedias. Wikipedia editors enforce policies, standards, and best practices for contributors. These include the Five Pillars:

"Wikipedia is an encyclopedia. [...] Wikipedia is written from a neutral point of view. [...] Wikipedia is free content that anyone can edit, use, modify, and distribute. [...] Editors should interact with each other in a respectful and civil manner. [...] Wikipedia does not have firm rules."96

These precepts productively orient writers and editors to Wikipedia's goals and approach to contributions. Wikipedia's documentation outlines general concepts for verifiability and no original research as well as: recommendations on internal linking to ensure new content is organized, discoverable and supports optimal use of the full contents of Wikipedia; and recommendations on external linking to reference sources to show the validity of the information presented and similar best practices.97 Wikipedia also offers many writing style guides, reference guides and community forums to assist writers and editors.98 One important note: curators may write substantively about their collections and materials, but should not simply provide external links or promote their institutions, as this is a conflict of interest.99, 100 Writing for Wikipedia can be an extension of existing scholarly practice, with familiar processes for review and editing, albeit online and transparently within the public sphere.101

The Derscheid Collection example102

The relevance of curatorial expertise for SEO can be seen through the example of the Derscheid Collection, as it was first supported by conventional library practices and later extended into the online environment with digitization, metadata and SEO, all building from core curatorial values and practices.

A general understanding of Jean-Marie Derscheid's career projects indicates his purpose for the collection. He began his professional life as a zoologist at the Royal Museum for Central Africa in Tervuren, Belgium (1924-1926).103 Engaging in early administrative, policy and fundraising roles in developing and promoting the first international wildlife conservation plans for Europe,104 he also published scientific

95 Alexa, available at
96 Wikipedia: Five Pillars, Wikipedia available at http://en.w
97 Wikipedia: Policies and Guidelines, Wikipedia available at ; Community Portal, available at
98 Help: Contents/Editing Wikipedia, Wikipedia available at
99 Wikipedia: Conflict of Interest, Wikipedia available at
100 D. Beetstra, User: Beetstra/Archivists, Wikipedia available at
101 See: A. Lally, Case Studies in Utilizing Web 2.0 to Improve the Archival Experience: Using Wikipedia to Highlight Digital Collections at the University of Washington, The Interactive Archivist (18 May 2009), available at uw/
102 Some information here was included in the award nomination; see L. Taylor and D. Reboussin, Nomination for the Center for Research Libraries Primary Source Award, Access Category (Gainesville, FL, University of Florida, 2011) available at ; see also CRL Primary Source Awards, available at
103 P. Brien, Jean-Marie Derscheid, Biographie nationale, 37, supplément, tome 9, 1er fasc. (1971), pp. 211-235.
104 IOPN, The International Office for the Protection of Nature: Its Origin, its Programme, its Organisation (Brussels, IOPN, 1931).


articles on a variety of species that he studied in Central Africa,105 conducted the first census of Mountain Gorillas in their natural habitat and surveyed the Virunga Mountains (completing the objectives of an American Museum of Natural History expedition following Carl Akeley's death on Mt. Mikeno).106 He recognized the importance of tropical forest conservation there as supporting one of only two limited habitats of endangered Mountain Gorillas107 and criticized Belgian colonial agricultural policies, which promoted cultivation in these sensitive environments.108 In the mid- to late 1920s, he played a central role in the establishment of the Parc National Albert (now the embattled Virunga National Park), the first national park in Africa, becoming its first director in 1930.109 Derscheid left this administrative post in 1933, following allegations of financial irregularities,110 to teach biology at the Université coloniale in Antwerp and continue the research that produced this collection.

His academic career (and this research project) was cut short by the invasion of his native Belgium by Nazi Germany in 1940. After serving in the army medical corps, he was demobilized after the capitulation of King Leopold III at Dunkirk, and became a resistance leader in the Comet Line escape service.111 After his arrest and imprisonment as a spy in 1941, he was taken to Germany in 1942, and executed by the Gestapo on March 13, 1944.112 Incredibly, though Derscheid's home was occupied by German soldiers during the war, his papers survived and were maintained by his family.113 To the best of our knowledge, the original manuscripts remain with his heirs at the family home in Sterrebeek, Belgium.

The Derscheid Collection is a rich set of 20th century, scholar-curated research materials, including official colonial administrative reports (with local population census and tax collection information, for example) by District Commissioners and Governors, officials' responses to a questionnaire Derscheid created, oral histories and genealogies collected in interviews with clan chiefs and others relating to time periods from 1859-1940, and original manuscripts relating to precolonial and colonial era Burundi, Eastern Congo and Rwanda (then Ruanda-Urundi and Belgian Congo).114 Supplemented by the collector's own research notes and working papers, substantial correspondence with colonial administrators and missionaries is included. These unique materials have been used by scholars to develop key historical interpretations of the region.115, 116, 117, 118, 119 Derscheid's research materials remain important for interdisciplinary research in the area, extending scholarly value well beyond their original

105 Citations for his published zoological work are available in the Wikipedia entry for Derscheid, available at
106 P. Barclay-Smith, Obituary (Dr. J.M. Derscheid), Nature, 157, 3977 (1946), p. 70.
107 de Wildeman, A propos des Forêts Congolaises: leur Régression Nécessités de leur Étude Biologique et de la Création de Réserves Forestières, Bulletin de la Société Royale de Botanique de Belgique (1928), p. 59.
108 L. Dorsey, Historical Dictionary of Rwanda (Metuchen NJ, Scarecrow, 1994), p. 221.
109 M. Akeley, Carl Akeley's Africa; the Account of the Akeley-Eastman-Pomeroy African Hall Expedition of the American Museum of Natural History (New York, Dodd, Mead & Co., 1929), p. 119.
110 Brien, Jean-Marie Derscheid p. 225.
111 Liste des Personnes Ayant Aidé des Aviateurs Passés par Comète, Kinship Belgium Web site, available at
112 Brien, Jean-Marie Derscheid p. 231.
113 Ibid. pp. 229, 232.
114 D. Reboussin, Compiled Guide to all 3 Derscheid Collection Microfilm Reels (Gainesville, FL, University of Florida, 2004), available at
115 R. Lemarchand, Rwanda and Burundi (New York, Praeger Publishers, 1970), available at
116 A. Des Forges and D. Newbury, Defeat is the Only Bad News: Rwanda under Musinga, 1896-1931 (Madison, WI, University of Wisconsin Press, 2011), p. xix.
117 I. Linden and J. Linden, Church and Revolution in Rwanda (Manchester, Manchester University Press, 1977), p. xvi.
118 C. Newbury, The Cohesion of Oppression: Clientship and Ethnicity in Rwanda, 1860-1960 (New York, Columbia University Press, 1988), p. 307.
119 D. Wagner, Whose History is History? A History of the Baragane People of Buragane, Southern Burundi, 1850-1932 (Madison, WI, University of Wisconsin, 1991), p. 530.


intended purpose.

A portion of the collection, about 800 items including notes, illustrations and maps, was microfilmed privately by Professor René Lemarchand in 1965 as he conducted research for a scholarly monograph.120 Lemarchand transferred ownership of three 35mm microfilm master negative reels along with a set of positive print reels (totaling 2,021 frames) to the University of Florida George A. Smathers Libraries, by our best guess during the 1970s. The libraries at Stanford University, Yale University, University of North Carolina, and the School of Oriental and African Studies at the University of London also have prints from this microfilm, but its distribution was extremely limited.

In 2000, Lemarchand alerted one of the authors to physical problems with the microfilm, initiating Reboussin's involvement with the collection. Digitization was not a practical option in 2000, so library staff removed the master negatives from general circulation, repaired breaks in the film, and initiated good conservation measures such as adding proper leaders, creating separate print negatives for reproduction, and making a circulating print set. The master negatives provided state of the art preservation at that time. In 2002, Reboussin and Lemarchand secured permission to copy and distribute the reproduced collection for scholarly purposes in a letter from Jean-Marie Derscheid's heir, his now deceased son Jean-Pierre. For the first time, the Libraries could create working copies without endangering the masters, allowing scholars permanent physical access by selling reels at cost to them, either directly or to their institution, or alternatively loaning circulating reels via Interlibrary Loan.

While these actions resolved physical access and addressed legal issues, intellectual access remained limited. Because this is not a collection of original manuscripts, Reboussin did not create a standard finding aid.121 Instead, he expanded a carbon copy typescript guide122 by preparing a frame index, both to understand better this arcane, French language collection and to improve intellectual access to its hidden contents. He compiled, edited, and verified an item by item index. This file was first distributed via email, then on the African Studies Collection Web site, and was uploaded to the UFDC in November 2010.123

Sustained efforts to provide intellectual access to a previously hidden scholarly collection both supported and benefited from related collaborative activities during 2011. While the authors did not strategically develop an SEO plan from the outset, together we combined technical and curatorial activities to set the stage for what became an award winning project to enhance research access to the Derscheid Collection. With the item by item index online, additional efforts provided further support by developing the scholarly context and contributing to improved intellectual access overall. Importantly, for example, Lemarchand generously permitted the Libraries to digitize the full text of his 1970 book, based significantly on materials in the Derscheid Collection.124 Rwanda and Burundi was uploaded to the UFDC in November 2010 as an Open Access resource, providing an extraordinarily rich and extensive scholarly context that supports intellectual access to all of the Derscheid materials while at the same time improving access to an out of print scholarly monograph.

We initiated the Derscheid microfilm digitization project, funded with generous support from the University of Florida Center for African Studies US Department of Education Title VI grant,125 during Spring Semester, 2011. Our work benefited from the availability of item by item metadata in the online index, allowing rapid processing and public access that summer. With a new appreciation for the value of

120 Lemarchand, Rwanda and Burundi.
121 See Finding Aids to Manuscript and Archival Collections, Department of Special & Area Studies Collections Web site (Gainesville, FL, University of Florida George A. Smathers Libraries), available at
122 A. Des Forges, Inventory of the J. M. Derscheid Collection on Rwanda (with Some Material on Burundi, the Congo, Uganda) 1967, available at
123 Reboussin, Compiled Guide to all 3 Derscheid Collection Microfilm Reels, available at
124 Lemarchand, Rwanda and Burundi p. x.
125 See: National Resource Centers Program (Washington, DC, US Department of Education, Office of Postsecondary Education, n.d.), available at


additional contextual information that a finding aid would normally provide, Reboussin uploaded to the UFDC his translation of a French language biography from the Belgian Biographie nationale.126 While it is the most extensive biography published on Jean-Marie Derscheid, Brien's focus is more personal, genealogical, and closer to hagiography than current norms of scholarship dictate. In an effort to gather a variety of professional information and references where they would be conveniently available, Reboussin created a Wikipedia entry for Derscheid in September 2011,127 providing permanent, public access to the documentation he had compiled since his earliest work with the collection. The resulting dramatic change in Google search results relating to Derscheid prompted his new appreciation for the effectiveness and value of SEO techniques as a public service. The authors concluded their substantive online work with the Derscheid Collection by creating a landing page to bring the related contextual elements together with collection materials and as a convenient target for incoming links.

The success of efforts to attract scholarly attention (through online social media or elsewhere) is dependent on a foundation of relevant context and rich interlinked content. Once these goals are met, the attention of online communities communicates the value of resources in both a social and a technical sense as a peer network of incoming trusted source links from, for example, blogs and newsletters. This itself is a service to the larger scholarly community, builds support and strengthens SEO accomplishments. We therefore spoke with our colleagues and engaged our social networks to promote public awareness of the collection.128 Bernard Reilly, President of the Center for Research Libraries (CRL), encouraged us to submit a case study to his organization's Primary Source Award for Access. As a result of our success in that competition,129 the Derscheid Collection was cited (and the landing page linked) in the CRL and other institutional newsletters, communicating a level of peer review and acknowledgement from trusted sources to broad communities of scholars and library colleagues. The recognition also generated social media links to the collection landing page.

Reboussin announced the availability of the collection during the 54th annual US African Studies Association Conference in Washington, DC.130 Scholars of Rwandan history, including a doctoral student who had just begun working with the Derscheid Collection in microfilm format, thus became aware of this newly online resource with effective endorsement from a community of respected peers. Attention to essentially social aspects of promoting the collection leads, presumably, to engagement with the scholarly community and perhaps links to the collection landing page from trusted sources. These further strengthen discoverability for such previously hidden materials, improving intellectual access for those who need it, even if they are not aware of the collection.

We can measure the success of our SEO activities in several ways. The increased prominence of a collection in search engine results is an objective of SEO implementation. Do related, relevant searches (which do not include the collection title) return links to the landing page on the first page of search results? If so, this indicates that researchers who are unaware of its existence can discover the collection. Quantitative measures are available to collection curators from server logs. For our work, this includes the automated statistical reporting through SobekCM.131 For example, as of July 2012 there were 4,797 total views of the Derscheid Collection (collection level server log results), with 31,502 cumulative views in 798

126 See Brien, Jean-Marie Derscheid, pp. 211-235.
127 See Wikipedia entry on Derscheid. Available at
128 This included offering a workshop on SEO at our local institution: D. Reboussin and L. Taylor, Workshop Overview: Promote UF Research, Collections, & Online Materials with SEO and Wikipedia writing/editing UF Digital Collections (February 2, 2012), available at
129 CRL Primary Source Awards, available at http://
130 T. Spear, Roundtable discussion chair, Court Politics and Colonial Power in Rwanda, 1896-1931: A Discussion of Defeat is the Only Bad News (ASA, 18 November 2011). Conference programme available at
131 Derscheid Usage Statistics UF Digital Collections available at and SobekCM: Monthly Usage Statistics, UF Digital Collections available at:


visits to the collection.132 These numbers indicate relatively high use considering this is a one year old, specialized, non-English language scholarly collection. Most importantly, recognition through citation in the scholarly literature is the best indicator of impact on scholarship, undoubtedly the primary goal of our efforts to provide effective intellectual access.133

Conclusion

We began this chapter by describing the challenges of supporting academic access to scholarly research as the information environment becomes larger and more complex. We explored two approaches to supporting collection access: improving tools and training researchers. These observations served as a foundation for our recommendation that librarians employ SEO techniques to extend their reach to unknown users who may benefit from improved access to the scholarly resources we curate. We offered a review of how Web search engines operate and provided examples of SEO techniques employed to promote access to the Derscheid Collection. Finally, we demonstrated that this previously hidden, arcane collection became more readily discoverable and its contents more easily accessible to scholars.

Each manuscript collection has unique characteristics, may pose particular problems, and suggests somewhat different treatments through the course of acquisition, processing, description, and use. While some aspects of the Derscheid Collection are unusual, the authors' approach to enhancing the collection's discoverability by applying a variety of simple SEO techniques can be modified and adapted according to the characteristics of (and circumstances surrounding) any other collection. The fundamentally socio-technical tools we employed are available, and in fact best suited, to the curators who best understand a collection's contents and its intellectual context. Technical support is important and helpful, but little or none is necessary to engage in SEO work. It requires only a modest budget to provide this public service, offering state of the art intellectual access to anyone seeking the information in collection materials, whether or not they are aware of their specific needs, and independent of their library skills, online prowess, prior knowledge of the collection, or their authentication status with regard to library systems. The approach we presented may be applicable to librarians in African institutions, as well as to other collections where only modest resources are available.

As SEO activities support broad public access, they also enhance Open Access principles and are well suited to improving the discoverability of Open Access resources. This essentially curatorial public service provides a more complete scholarly context to both digital and print resources within collections of related materials. While we focused on using strategies to improve access to library digital collections, the value of SEO work also directly relates to the Open Access movement, wherein the value of a work is enhanced by creating access through curatorial work and by providing context for improved discoverability. In this way, SEO serves the larger Open Access movement that is part of library and academic work more generally, which relies on the creation and distribution of knowledge that is vetted through peer review.134

Dan Reboussin
Laurie Taylor

132 See UFDC, History of Collection-Level Usage, Derscheid Collection (Gainesville, FL, University of Florida Libraries, 2012), available at
133 For example, see: D. Kiwuwa, Ethnic Politics and Democratic Transition in Rwanda (London, Routledge, 2012), pp. 173, 192.
134 L. Taylor and B. Riley, Open Source and Academia: how Composition Benefits from the Open Source Model, Computers and Composition Online (Spring 2004), available at http://w