VIVO: Enabling National Networking of Scientists, Year 1 Progress Report September 1, 2009 - June 30, 2010
Title: VIVO: Enabling National Networking of Scientists, Year 1 Progress Report September 1, 2009 - June 30, 2010
Creator: Conlon, Michael
Publisher: University of Florida
Place of Publication: Gainesville, FL
Publication Date: June 30, 2010
Publication Status: Published
Program Director/Principal Investigator (Last, First, Middle): Conlon, Michael
PHS 398/2590 (Rev. 06/09) Page 1 Continuation Format Page

VIVO: Enabling National Networking of Scientists
Progress Report

A. Summary

The VIVO project is organized into four major teams: Software Development, Ontology Development, Implementation, and Outreach. In addition, the project has Evaluation and Governance activities. Each project team and activity has accomplished its year 1 major milestones, and in many cases significant additional work has been accomplished beyond the original work plan. The project is on track for its third major software release this summer, and over 120 people are participating on the project as team members or members of advisory groups. A national conference will be held in New York City August 12 and 13 to bring together implementers, developers, adopters, partners, agencies and others for presentations and discussions regarding the creation of a national network of scientists.

Significant challenges include: 1) recruiting leading experts for the VIVO Executive Advisory Board; 2) harnessing the great interest and energy at the national level into partnered work to develop VIVO and VIVO related tools and resources; 3) meeting the expectations of potential adopters for a resource discovery tool, a faculty profile tool and a national network. The work plan for the second year addresses these challenges and positions the project for sustainability.

B. Software Development

B.1. Introduction

The first nine months of the VIVO grant have transformed VIVO from a single institution project with minimal documentation and support to a robust, open source software development effort already attracting additional community development interest. Two new software releases have been delivered to our implementation sites, and the second has been made broadly available for download from the project website and SourceForge [1] together with the VIVO 1.0 ontology.
The single most important new feature of VIVO is linked data compatibility. In addition to being available through the VIVO website, VIVO data is now directly harvestable as Resource Description Framework (RDF) statements, or triples, for exchange, aggregation, and searching by other parties through standard protocols. VIVO is distributed with an ontology that facilitates integration of data from human resource systems, grants databases, faculty annual reporting systems, and publication databases in a common framework that can be shared as data independently of the software. Developers both inside and outside the project are creating applications to demonstrate the utility and power of a rich store of structured data conforming to a published ontology.

Each release of the VIVO software proceeds through defined planning and testing stages for quality assurance and early feedback on new features. The first release established the application architecture and deployment paradigm at each implementation site, while the second release included scripts to migrate code, ontology, data, images and visualizations forward and has set the project on a firm footing for the further evolution of VIVO through the second year of the project. We anticipate ongoing modifications to the ontology in response to implementation site feedback, new data sources, and opportunities to align with established and newly emerging communities of practice.

B.1.a. Project management, collaboration, and issue tracking

Weekly development conference calls and regular use of collaboration tools, including the Confluence wiki [2], the JIRA [3] issue tracking system, and GoToMeeting [4], have supplemented occasional face-to-face meetings that bring small numbers of developers together at each site. The Confluence wiki site serves as a central repository and organizing location for materials for all teams. It currently has 724 pages with over 471 attached documents. Separate sections of the JIRA issue tracking system support feedback from implementation sites, ontology questions and tasks, and development tasks. Support questions can be fed directly from the project web site into the user feedback section of the JIRA system for review by project implementation, outreach, or development staff as needed.


B.2. Development Accomplishments

B.2.a. Cross Site Development

Primary areas of development focused collaboration to date have included:
- Building a multidisciplinary and multi-site development team across all VIVO sites, with primary development teams at Cornell, Indiana and Florida
- Collection and documentation of requirements and creation of draft functional and technical specifications
- Development of 53 user scenarios to address future functionality
- Development of project milestones, timelines, and sub-tasks
- Creating an open, publicly available version of the VIVO software project's source code, packaged software, virtual appliances and end user documentation as a Sourceforge.net open source project using the Berkeley Software Distribution (BSD) License
- Significant improvements to the software structure, build process, configurability, and modularity to support packaging and deployment
- Implementation of a continuous integration process to monitor incremental changes and assure tests complete successfully
- Creation of first generation installation and upgrade documentation, in collaboration with Outreach and Implementation teams
- Modularizing authentication
- Enhancements to data ingest tools and workflow in VIVO
- Development of specifications for publication data alignment and author disambiguation
- Development of a modular data harvester system to take remote national or local data sources and convert them into RDF and the VIVO ontology
- Implementation of individual level, institution level, and national level visualizations of VIVO data
- Development of the first harvester module that will ingest publication data from the National Library of Medicine publications archive data system
- Design of the aggregator software to support national networking

B.2.b. Cornell University

The Cornell development team has focused on strategic modifications to the core VIVO application to support improved performance, simpler and more flexible configuration in the local server environment, a new navigation and browse structure, and easier local site interface customization. One major goal has been to improve the structure of the VIVO code base for greater clarity and transparency, principally through the separation of presentation from business logic following the Model View Controller (MVC) paradigm. Adoption of the FreeMarker [5] template engine will make it easier for new developers to contribute to the project as well as for institutions to adapt the appearance of VIVO to meet local standards and design preferences. VIVO managers preparing, constructing, and iterating through a data ingest workflow. VIVO [6] to represent relationships more complex than a single RDF subject/predicate/object triple, including the ability to model author order and the time win development specialists, but the results significantly reduce the complexity of creating structured linked data. A new file management system has been implemented for photos and other documents attached to VIVO entries, including a file upload and cropping interface. The development team also spearheaded the design and launch of the VIVO project web site in January 2010.
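The Cornell section above mentions representing relationships more complex than a single RDF subject/predicate/object triple, such as author order. One common way to do this is to introduce an intermediate node for each authorship and attach the rank to that node. The following is a minimal sketch of that idea only: the `http://example.org/` namespace and the property names `linkedPaper`, `linkedAuthor`, and `authorRank` are hypothetical placeholders chosen for illustration, not terms confirmed by this report, and the triples are held in plain Python tuples rather than a real RDF store.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace and property names, used only for illustration.
EX = "http://example.org/"
LINKED_PAPER = EX + "linkedPaper"
LINKED_AUTHOR = EX + "linkedAuthor"
AUTHOR_RANK = EX + "authorRank"


def authorship_triples(paper: str, authors: List[str]) -> List[Triple]:
    """Link a paper to its authors through one intermediate node per author.

    A direct (paper, hasAuthor, person) triple cannot carry the author's
    position, so each authorship gets its own node holding the rank.
    """
    triples = []
    for rank, author in enumerate(authors, start=1):
        node = f"{paper}/authorship{rank}"
        triples.append(Triple(node, LINKED_PAPER, paper))
        triples.append(Triple(node, LINKED_AUTHOR, author))
        triples.append(Triple(node, AUTHOR_RANK, str(rank)))
    return triples


def authors_in_order(triples: List[Triple], paper: str) -> List[str]:
    """Recover the ordered author list for a paper from the triples."""
    nodes = {t.subject for t in triples
             if t.predicate == LINKED_PAPER and t.obj == paper}
    ranked = []
    for node in nodes:
        props = {t.predicate: t.obj for t in triples if t.subject == node}
        ranked.append((int(props[AUTHOR_RANK]), props[LINKED_AUTHOR]))
    return [author for _, author in sorted(ranked)]
```

Each author costs three triples instead of one, but the rank travels with the relationship itself, and further properties (for example a time interval) could be attached to the same intermediate node.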


B.2.c. University of Florida

The University of Florida development team, led by Chris Barnes, has focused on development in security and authentication; remote data harvesting and ingesting; software tools deployment, packaging, and integration into local IT environments; and data acquisition from local and national systems of record.

Shibboleth authentication has been integrated into the VIVO library. Users with GatorLink accounts at the University of Florida can use their single sign-on capabilities for administration of http://vitotest.ctrip.ufl.edu. This extends the authentication capabilities of VIVO, allowing it to be customized based on the institution and its preferred authentication model.

A VirtualBox and VMWare virtual appliance is available to the public, making it easy to download and deploy VIVO for evaluation, testing, training and production use. This eliminates the need for a user to step through the components for installing and configuring VIVO. Once the appliance is deployed, users can immediately log into the VIVO web interface and begin using it. Documentation is provided for those who wish to change passwords and secure the virtual appliance.

A data harvester has been developed as a collection of easy to use tools that can be combined, extended, and modified to fit the target use cases and workflows. The Harvester tool uses SOAP interfaces for retrieving publication citations and associated data as Extensible Markup Language (XML) documents. The Harvester system provides several methods for uniquely identifying authors from the VIVO system that appear in citation collections such as PubMed. The scoring and disambiguation algorithms are tunable and evolving as more research is done in the field.
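To illustrate what a tunable scoring approach to author matching might look like, here is a minimal sketch. It is not the Harvester's actual algorithm, which this report does not describe in detail: the `Person` fields, the integer weights, and the threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Person:
    last_name: str
    first_name: str
    email: str = ""


# Hypothetical weights (points out of 100); tuning these changes how much
# evidence each field contributes to a match.
DEFAULT_WEIGHTS: Dict[str, int] = {"last_name": 50, "first_initial": 20, "email": 30}


def score(candidate: Person, author: Person,
          weights: Dict[str, int] = DEFAULT_WEIGHTS) -> int:
    """Score how well a VIVO person matches an author found in a citation."""
    total = 0
    if candidate.last_name.lower() == author.last_name.lower():
        total += weights["last_name"]
    # Compare first initials only when both records have a first name.
    if (candidate.first_name and author.first_name
            and candidate.first_name[0].lower() == author.first_name[0].lower()):
        total += weights["first_initial"]
    if candidate.email and candidate.email.lower() == author.email.lower():
        total += weights["email"]
    return total


def best_match(author: Person, people: List[Person],
               threshold: int = 70) -> Optional[Person]:
    """Return the highest scoring person, or None if nobody reaches the threshold."""
    if not people:
        return None
    top = max(people, key=lambda p: score(p, author))
    return top if score(top, author) >= threshold else None
```

With these example weights, a citation author "Smith, J." matches a VIVO record for Jane Smith (50 + 20 = 70) but not one for Robert Smith (50). Raising the threshold or the email weight makes matching stricter, which is the kind of adjustment a tunable scorer allows.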
Additional data sources were evaluated, including NIH RePORTER and CiteSeer.

The development team has assisted with support and development of the project web site along with the Outreach and Marketing teams, including attending meetings, documentation, planning, and programming and configuration of the portal. They have also supported the launch of an open development community site at vivo.sourceforge.net.

B.2.d. Indiana University

The VIVO Counter developed at Indiana University is running each day at all seven VIVO implementation sites to record data and relationship counts (e.g., co-authorships). These VIVO counts, in conjunction with other data collected from sources such as emails, Google Analytics and SourceForge download counts, provide the raw data for a map displaying the growth of the VIVO project and its global spread in use (see Figure 1). Tracking this information has allowed us to see the emergence of users, the growing expanse of data, and a view of VIVO's expanding reach.

Data from VIVO project participants also produced the first data analysis and network visualizations to gain a deeper understanding of VIVO data quality, coverage and growth. Results were shared with the Cornell and the Florida teams. In addition, a map of all VIVO members and their team memberships has been created.

The Indiana University team also collected and cleaned the publication data from Scopus and the funding grants from NSF for all IU researchers and loaded them into VIVO as a research dataset for server performance testing, the 1.0 ontology evaluation, and visualization development.

Indiana University has also developed an initial architecture for creating and serving visualizations inside VIVO. The architecture supports a variety of visualization technologies, including Flash and JavaScript based solutions, and allows for the dynamic insertion of visualization content on a VIVO page.
See Figure 2. Early development has been completed for several visualizations showing where researchers in VIVO publish in the world of science, a geomap showing the growth and expansion of VIVO over time, and institution level PDF report generation.

Figure 1. VIVO people profiles, software downloads, email requests and web access activity map

B.3. Development Plans

The Cornell, Florida, and Indiana teams have created an internal VIVO development road map laying out the principal technical issues facing the project and scheduling a series of focused development efforts throughout the remainder of the project period. Development plans are also being reconciled against the outcome of the user scenario process to produce a set of features ranked by importance and priority.

B.3.a. Cornell Plans

The Cornell team will continue to advance the core VIVO application to meet the objectives defined in the development road map as well as address critical technical issues including improvements to the application architecture, editing and display configuration, reasoning and data store scalability, and the integration of a configurable set of external taxonomy vocabularies. Significant ongoing development effort will also focus on putting additional control of the application appearance and interface in the hands of the implementing institutions, with priorities and approaches heavily influenced by user testing at Cornell and other institutions.

Cornell will also take the lead on development of the aggregator service and an exemplar national search application, both as open source tools independent of VIVO and capable of handling RDF data from other sources sharing a common ontology, such as the BIBO Ontology, or provided with a mapping to the VIVO ontology. These tools will serve as an example and a code base for other applications that involve aggregating VIVO data from many sources.

B.3.b. Florida Plans

The UF team will continue to integrate VIVO releases into Virtual Machine versions of VIVO and create specialized VMs that are suited for marketing and production use of VIVO. The packaging effort will also investigate the use of VIVO on Tomcat compliant products like JBOSS and the packaging of VIVO into installable packages.

Development will continue on the VIVO harvester to include new data sources and formats. Modules are planned for NIH RePORTER, NSF publication and grant data, partnerships with Scopus and ISI, CiteSeer, GrantsFire, and local sources such as Active Directory and Institutional Repositories. Regarding the reconciliation of ingested data, plans include further work and exploration in the fields of neural networks, AI, and natural language processing. Currently, basic and aggregate matching are being implemented for proof of concept disambiguation and automation.

The Florida team is also testing the Joseki SPARQL query endpoint [7] in VIVO as an add-on tool to facilitate data access and data sharing through direct query or web services.

B.3.c. Indiana Plans

Indiana University will continue developing additional visualizations for VIVO in the coming year, in keeping with the original goals as stated in the grant proposal. Near term plans include further development on a set of PDF reports for VIVO individuals and sites, which will convey a variety of different statistics and analyses of those entities. Visualizations will continue to leverage publication data, and will also begin incorporating grant data and co-investigator relationships.

Figure 2. Co-authorship network showing key collaborators and the frequency of collaborations
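As a rough illustration of the data behind a co-authorship network such as the one in Figure 2, the sketch below counts how often each pair of authors appears together on a publication and lists a person's most frequent collaborators. It assumes publications are available as plain lists of author names; the function names are hypothetical and this is not the Indiana team's visualization code.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List, Tuple


def coauthor_edges(publications: Iterable[List[str]]) -> Counter:
    """Count co-authored publications for every unordered pair of authors."""
    edges = Counter()
    for authors in publications:
        # Sort so that (a, b) and (b, a) map to the same edge key.
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)] += 1
    return edges


def key_collaborators(edges: Counter, person: str,
                      top_n: int = 3) -> List[Tuple[str, int]]:
    """Return the person's most frequent co-authors with collaboration counts."""
    counts = Counter()
    for (a, b), n in edges.items():
        if a == person:
            counts[b] += n
        elif b == person:
            counts[a] += n
    return counts.most_common(top_n)
```

The edge counts correspond to the frequency of collaborations in the figure caption, and the ranked list corresponds to the key collaborators. The same counting would apply to co-investigator relationships on grants if grant records were supplied instead of publications.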


The scope of the networking and visualization team will expand to include multi-institutional visualizations utilizing the VIVO aggregation service. Under the direction of Dr. Ying Ding, Indiana University developers will also explore development of semantic tools for VIVO, including a SPARQL [8] query builder to guide users through step-by-step generation of SPARQL queries, integration of faceted browsing techniques from the Simile Exhibit platform [9] into VIVO, and investigations of semantic techniques to derive expertise, collaboration patterns, and knowledge diffusion from data entered into VIVO instances. A first publication is in press. [10]

C. Ontology

In December 2009, 3 areas of ontology activity were identified and a framework established to manage all the areas. A VIVO ontology implementation team was established to manage the ongoing development and logy staff member, a development focus was identified and is being managed by the Cornell Semantic Applications group, and a and a redesigned Confluence section were also established to function as additional communication channels for the teams. Another important aspect of ontology development was added at that time, when each of the 7 sites was invited to have a representative join the ontology implementation team. This allowed the sites to provide insight on requirements for the VIVO ontology as well as to discuss implications of the decision processes in VIVO.

In the first 5 or 6 months of the project, when rapid and numerous changes had to be implemented, the ontology implementation team met on a weekly basis. A cycle for the use of JIRA for tracking issues was also put in place, thereby allowing the team to track potential software issues, propose and discuss new functionalities or improvements, track user case studies, disseminate policy questions, documentation, and feature descriptions, and announce new developments within the VIVO ontology group. Now that the ontology is further along in its development, the team will modify its meeting schedule based on need and continue to enhance its processes to best suit this new phase; it will also continue to track issues in JIRA and manage them appropriately to drive changes to the ontology.

Since the start of the project, 2 versions of the ontology have been released (version 0.9 at the end of January 2010, and version 1.0 at the end of March 2010), and a third version is under development and is planned for release with the software in mid July 2010. This new release will include important updates supporting time bounded relationships between people and their activities, including Training. As part of the next release, a significant effort on documentation for ontology definitions and examples was completed with the help of 4 of the VIVO sites. The completed changes will be released in the 1.1 release of the ontology. Also, the University of Florida has begun work on continuing the development of the

Lastly, at the VIVO National Conference, there will be a 3 hour workshop that will cover areas of ontology work and be taught in concert by Paula Markes (UF); Brian Lowe, Stella Mitchell, and Jon Corson-Rikert (Cornell); and Ying Ding (IU). The workshop will cover the VIVO core ontology in the context of related ontologies and introduce SPARQL queries and the use of linked data.
As work continues on the growth and development of the VIVO core ontology, efforts will also be made to align with our sister Eagle I data model and to extend our capabilities to include subject tagging of objects in VIVO.

C.1. Indiana University

C.1.a. Work Accomplished

VIVO related ontology survey: The Indiana team investigated a list of more than 35 different ontologies, including their structure, conceptualization, and management, with a focus on comparisons among SWRC [11], the VIVO Cornell Ontology and the VIVO UF Ontology. This survey facilitated the comparison of ontology representations for common concepts in academia such as events, people, publications and organizations, and has resulted in the emulation and direct adoption of components of other respected and popular ontologies in the design of the VIVO ontology. An IU VIVO ontology repository [12] was created and is updated regularly, sharing the latest information on surveys, comparisons of candidate ontologies, and copies of the VIVO ontology.

VIVO related system survey: The Indiana team also conducted investigations into more than 30 existing web portal systems similar to VIVO. The survey focused upon the main services and features of the web portals, including presentation of grants, visualization of academic social networks, interface to front users, open source status and API. Additionally, the numbers of registered users and profiles for each system were solicited via email. This survey is an ongoing task, as new systems and web portals are constantly being identified by the project partners and reflected on the IU system survey repository website.

VIVO core 0.9 release and test: IU tested VIVO core 0.9 by adding instances from more than five faculty members via the VIVO ontology email list and resolved via the JIRA issue tracking system.

Mapping with Eagle I ontology: In February 2010, the VIVO IU ontology group visited Oregon Health Sciences University for meetings on the potential for overlap and interoperability with the Eagle I team.
Figures 3 and 4 show the mapping of VIVO and other ontologies, from the high level overview to the detailed overlap.

Mapping and data preparation for social network visualization: The IU ontology team converted a large body of Information Visualization Lab data and sample Scopus data to VIVO 1.0 for use by the IU development team in implementing and testing VIVO visualizations.

Semantic Web Portal: Research by the Indiana ontology team explored creation of a Semantic Web Portal leveraging the Exhibit [13] and Longwell [14] tools developed by the Simile project at MIT [15]. A customized Exhibit viewer module provides a mapping mechanism to generate Exhibit views, while a dynamic SPARQL query module will allow users to intuitively query the repository for relationships among data elements and generate appropriate SPARQL queries for different classes and view properties for Exhibit based on the mapping file. The Semantic Search module categorizes federated RDF triples into different groups based on the ontologies; it constructs the hierarchical navigation for search results according to the queries.

C.1.b. Ongoing tasks

VIVO IU ontology localization: The IU ontology team will work with the IU implementation team to implement extensions to the VIVO core ontology for the IU implementation as needed.

Semantic Web Portal: Current work focuses on the improvement of semantic search and integration of Vitro and the Semantic Web Portal. Although we have made some progress in incorporating semantic search into the Semantic Web Portal, we still need to improve the pagination and efficiency of the query process. In addition, we are aiming to integrate the features of the Semantic Web Portal with Vitro, either natively or through a data communication layer.

Figure 3. Ontology Mapping Overview

C.1.c. Future Plans


- Enriching the Publication class in the current VIVO core ontology
- Further alignment with Eagle I efforts by extending the notion of Role in the VIVO core ontology
- Modeling domain expertise through collaboration with Eagle Center for Ontological Research, Buffalo, USA)
- Consolidating and documenting the VIVO ontology and system surveys to publish in academic journals
- Working with the IU Digital Library team to enrich IU VIVO instances and modify the VIVO core ontology for IU localization
- Creating convincing use cases at IU to encourage more IU researchers to join the VIVO portal
- Publishing the VIVO core ontology in one or more well known ontology repositories
- Working together with the Cornell group on ontology reasoning and SPARQL query security testing
- Providing semantic mining algorithms to identify topics, associations and paths for VIVO semantic data
- Integrating the Semantic Web Portal into the VIVO system in order to provide users with a faceted browsing and semantic search experience
- Providing research prototypes based on innovative algorithms for mining VIVO RDF data, including: Dynamic topic research interest to further facilitate scientific collaboration and grant applications; SPARQL query builder to guide users step by step in generating their own SPARQL queries; Semantic association finder to identify associations among topics, researchers, and publication venues; Semantic path mining for VIVO RDF data to recognize collaboration patterns and identify knowledge diffusion; Security for querying semantic VIVO data

C.2. Cornell

C.2.a. Work Accomplishments

Ontology Team participants at Cornell include Brian Lowe, Stella Mitchell, Anup Sawant, and Jon Corson-Rikert. Brian is the lead of the Semantic Applications Team within the VIVO Development Team and Jon is the Development Coordinator for the project.
Stella Mitchell and Anup Sawant were hired early in 2010 for the VIVO ontology to the new VIVO core ontology and between successive versions of the VIVO core ontology as modified for each software release. The work of the Semantic Applications Team at Cornell intersects with the Ontology Team in many areas including data ingest, ontology design and editing, reasoning configuration and performance, and ontology mapping, and Brian, Jon and Stella have been active participants in Ontology Team conference calls.

The VIVO ontology editor has been improved during the first nine months of the project, as has support for semantic approaches to data ingest available through the VIVO application itself. The Semantic Applications Team has improved the ability to manage multiple ontologies independently within VIVO and to export the full current state of the ontology for more advanced editing in Protégé or other ontology editors.

C.2.b. Future Ontology Plans

The development road map for VIVO includes a number of work segments devoted to ontology related improvements:
- Developing tools and documentation for the creation and management of data source ontologies and the use of intermediate ontologies to facilitate data integration from multiple sources, especially to support author and title disambiguation for publications
- Improving support for data integration using reasoning
- Ontology mapping to facilitate data integration, query, and export at the individual VIVO and national level, and evaluation of the use of reasoning to serve queries on the fly
- Adding a new content review capability for researchers, as part of which the semantic applications team will design a strategy to handle accepted and rejected content (e.g., publications not authored by the individual) utilizing semantic graph management techniques


- Designing and building new ontology layers to support application configuration including navigation options, display order customization and hiding, and search specification. This is an active area of collaboration with the Eagle I project.
- Providing additional support to manage multiple independent data graphs within VIVO to improve performance, facilitate editing multiple component ontologies, manage data provenance, control privacy/visibility, and support selective export and archiving
- Managing data and search queries across multiple VIVO instances and within and beyond the VIVO national network, exploring the use of shared references to terminology, geographic locations, organizations, events, and services
- Optimizing the use of RDF and OWL features available through new updated open source APIs
- Providing the ability to use existing taxonomies for tagging of instance data and for search enhancements, through local graph storage of taxonomies and by direct access to taxonomies available in OWL format over the Web
- Continuously improving support for migrating data and aligning local extensions with changes to the VIVO core ontology

C.3. Florida

C.3.a. Work Accomplished

- Ontology representative participates on the Outreach Team, allowing ontology activities to inform education and information materials created and utilized with VIVO.
- Testing of Release 1 of the software at UF included reviewing ontology changes to see how they affected the functionality and usability of the software. Release 1 included a number of features responsive to issues identified at implementation sites, including organizational relationships and the need to provide more structured data on activities and roles.
- A UF VIVO CV data entry training session was held using Release 1 of the software. Feedback about problems with both the ontology and application was captured and entered into JIRA.
- Ontology sub-team of UF librarians met and reviewed the publications ontology and documented missing data properties for key information resources. This will be incorporated into release 1.1 of the ontology.

C.3.b. Current Activities

- with the new Outreach Team.
- Instructions for merging the VIVO on VIVO team exercise ontology and data with the current ontology are being developed.
- Initial strategy meeting for connecting with VIVO the Digital Vita [16] system was completed and next steps outlined.
- UF received organization data that was not based on accounting department numbers and is working through the process of developing a method to ingest that data in place of the organizational data that was hand entered. This includes making sure it maps properly to the ontology as well as working through the necessary development and ingest steps to connect this new data to people data that may have already been hand entered.
- UF will work at incorporating email and address changes into its local ontology extension to meet user needs and facilitate PubMed ingest of publication data.
- As of this writing, there are 60 items being evaluated for inclusion in version 1.1 of the ontology. Many, but not all, of the changes will also require software changes to the VIVO application code.

C.3.c. Future Plans

- Organize and format the UF organizational structure data obtained for VIVO, then pass it back to a UF group for continued updating. That group will then become the source of future organization updates for UF VIVO.


D. Implementation

D.1. National Accomplishments and Status

The VIVO Implementation Team is composed of the National Implementation Coordinator, National Implementation IT Expert, and the Implementation Lead and technical support person from each of the seven sites. Biweekly teleconferences are a venue for sites to make announcements, ask questions, provide feedback, and raise topics important to local implementation. Meeting agendas include site progress reports and team reports from Ontology, Development, Evaluation, and User Support, and the meetings are used to communicate implementation deadlines and expectations. These meetings are open to all VIVO Team members and generally have high attendance. The Implementation Coordinator and IT Expert answer implementation questions posed through email or through the JIRA reporting system.

Within the first quarter, the National Implementation Coordinator and IT Expert provided on-site training to Indiana, Ponce, Scripps, University of Florida, and Washington University. As there were a number of logistical and organizational questions related to the separate Cornell Ithaca and Weill Cornell instances, the Cornell Development Team provided the on-site training to Weill. All training sessions included an introduction to VIVO and the VIVO Core ontology, an overview of the VIVO interface, a hands-on data ingest session, and a virtual meeting with Cornell developers. Evaluations were filled out following each of these training sessions. In addition, all sites have been offered specialized training workshops on publication maintenance and, as needed, have received data ingest training.
National implementation support included user testing for the VIVO application release of version 1.0, with the purpose of improving the application and becoming familiar with the system before release to the implementation sites. A testing plan was created and support provided for a three-day, 10-person user testing Ontology teams. User testing and feedback were provided to the Development Team regarding the six release candidates prior to the general release of VIVO application version 1.0. Differences in the configurations of were noted and shared with the Ontology and Development Teams. Testing expanded the cross-platform compatibility testing and support for the VIVO application. In addition to previously supported platforms (Red Hat Linux 5.4, Windows Server 2008 R2, Solaris 10, CentOS 5.4, and Debian Linux), new platforms tested and supported now include CentOS 5.4 PAE (Physical Address Extension kernel) and Ubuntu 64-bit server.

Since September 2009, all implementation sites have accomplished the following:
- Purchased server(s)
- Installed VIVO version 0.9 and upgraded to version 1.0 software
- Provided feedback to the Development Team using the JIRA issue tracking system
- Exposed local VIVO system to the public
- Utilized appropriate URL and namespace
- Created local ontology extensions as needed
- Identified local data sources for organizational structure, employee, and grants
- Acquired organizational structure and employee data
- Manually inputted or ingested organizational structure
- Manually inputted VIVO on VIVO visualization data
- Decided who to include within the local system (faculty, staff, graduate students, etc.)
- Ingested employee data for all departments to be included (Scripps lacked a data source and manually inputted employee data)
- Made data extractable as RDF, with short and clear URLs

Additionally, all but one institution have begun profile curation. See Table 1.

Table 1. URLs of Public VIVO Instances
School | URL of public VIVO system | Showcase departments
Cornell | http://vivo.cornell.edu | All departments
Indiana | http://vivo.iu.edu | Undecided
Ponce | http://vivo.psm.edu | All departments
Scripps | http://vivo.scripps.edu | Chemical Physiology
UF | http://vivo.ufl.edu | Biology; Entomology/Nematology; Molecular Genetics and Microbiology; Otolaryngology
Washington | http://vivo.wustl.edu | Genetics; WU Intellectual and Developmental Disabilities Center
Weill | http://vivo.med.cornell.edu | Cell Biology

D.2. Future Work for All Sites
In preparation for the August National VIVO department populated with profile data for the following fields: people, web pages, images, overview/short bio statement, preferred title, positions, principal investigator research activity, research keywords, courses taught, selected publications, awards and distinctions, educational background, professional service, and mailing address. The showcase department will be utilized for national and local outreach and will include an interface branded with institutional logo, headers, footers, and text, as needed; primary tabs designated and drawing on appropriate data; proper introductory text on the homepage; and all VIVO team members represented with profiles within the local VIVO system.

After August, sites will continue to upgrade to future releases; approach local data stewards with an outline of required data; ingest available local data (local grant and publication information, courses, etc.); ingest and/or manually input data from supplementary sources; acquire publication data from PubMed; link to an authentication system, if possible; and redesign the VIVO interface, if desired. Table 2 shows institutional data sources for ingest at the seven schools.
Table 2. Data Sources for VIVO Instances
School | HR data | Grants data | Course data | Authentication | Supplementary
Cornell | PeopleSoft | Office of Sponsored Programs | Blackboard | Kerberos | FRS*
Indiana | PeopleSoft | Office of Research Administration | Sakai | CAS; Kerberos | FRS*
Ponce | HR data source | OSRPP | Moodle | Active Directory | CV & Biosketches
Scripps | MySQL database | Sponsored Programs | None | LDAP; Active Directory; Shibboleth | CV & Biosketches
UF | PeopleSoft | Division of Sponsored Research | Sakai | Shibboleth | Disparate FRS*; CV & Biosketches
Washington | PeopleSoft | Not accessible at this time | Telesis | Shibboleth compliant | CV & Biosketches
Weill | SAP | Coeus | Angel | LDAP; Active Directory | CV & Biosketches
*FRS: Faculty Reporting System

D.3. Variance from Plan
The national and local implementation teams are on track and meeting all goals set forth in the proposal.

D.4. Cornell University

D.4.a. Accomplishments and current status
The Cornell University VIVO instance was initially released in 2004 and has a VIVO coordinator and a team of librarian curators who facilitate manual data input for content not being automatically ingested. Since receiving the NIH grant, a new programmer has been hired to work on the ingest process, starting with the faculty reporting system. An existing program aide position has been extended to full time.


VIVO at Cornell currently contains data for all disciplines, across all Colleges, and is currently running on a version of the ontology preceding the national project. VIVO has almost 1.8 million triples and approximately ninety-nine thousand individuals, of which 11,606 are of type person. The primary site receives approximately 1800 unique hits every day (excluding robots). Table 3 lists additional websites at Cornell driven by data from VIVO.

Table 3. VIVO-driven Sites at Cornell
Instance | URL | Description
Graduate Programs in the Life Sciences | http://grad.lifesciences.cornell.edu/ | Separate web application using primary data queried for a specific audience
Entrepreneurship @ Cornell | http://eship.cornell.edu/ | Separate web application using primary data queried for a specific audience
CALS Research | http://research.cals.cornell.edu | Filtered data with a different theme from primary application
Collaborate @ Cornell | https://confluence.cornell.edu/display/collaborate | Wiki page with image map linked to primary data queries for geographic research focus
College of Arts and Sciences: Classics | http://www.arts.cornell.edu/classics/ | Faculty profiles being repurposed via XML for a department web site

The primary instance runs on a Dell PowerEdge 2950 with 32 GB of RAM (4x8). Besides the development servers for the NIH development, Cornell has a test instance with a similar setup for VIVO Cornell. Cornell also has a test system running the most recent version of VIVO (V1.0) and is about to update the primary instance to the latest version. Mapping and migrating the large volume of existing content to Version 1.0 of the ontology has provided valuable experience for future data mapping efforts. The migration path is established, the production server is configured, and testing is under way.
Cornell anticipates completion of this by July 1, 2010.

D.4.b. Future Work
Many of the Cornell colleges have or are in the process of migrating from standalone survey tools to Activity Insight process for ingesting the public data from this service into VIVO Cornell on a regular basis. Another development effort involves creating an exemplar application that repurposes linked data from the VIVO Cornell instance. The University administration has identified the need to have faculty research and extension impact statements exposed from VIVO through a content management system (Drupal) as a modular extension.

D.5. Indiana University (IU)

D.5.a. Accomplishments and current status
The IU team hired Brian Keese as a VIVO Project Programmer and has a 6-person implementation team in place. Plans are in place to complete all VIVO hires that have been funded under this grant.

VIVO at IU has approximately 34,335 triples and currently holds 21 people from 2 organizations and 2 different colleges. Within our development instance we have more data from IU institutional data resources, but this has not been made available publicly due to current IU data policy. The IU VIVO team has created ingest processes for and ingested data related to organizational structure, faculty position, faculty educational background, and awarded grants. IU has also begun obtaining unique and persistent identifiers for IU people to use as part of VIVO URIs.

The IU implementation team will solicit CVs from all faculty in the showcase department. Currently IU is discussing this option with the IU Medical Sciences Department in Bloomington and the Folklore Department, and this data will be manually inputted by the IU implementation team by August 1, 2010. IU continues long-term negotiations with our Provost and Vice Provost for Faculty and Academic Affairs to include more faculty data from the IU Faculty Annual Review system. This process at IU will have to be an opt-in procedure. IU expects to use new tools from the IAM (Identity and Authorization Management) areas of information technology that will allow a one-time policy opt-in that can then free up institutional faculty data to be used within VIVO. This type of opt-in procedure has been approved at many sites that require opt-in use of personal data to meet requirements for FERPA data privacy policies within a social networking environment.

VIVO production and development are hosted on existing server infrastructure of the IU Digital Library Program. The server environment comprises three servers: 1) production web application; 2) production database server; 3) development/test web application and database server. All servers have 2x Dual Core Intel Xeon 5160 processors and 17 GB RAM, and are running 64-bit Red Hat Enterprise Linux 5.5.

D.5.b. Future Work
The IU team has laid the groundwork for ingest processes for additional Faculty Annual Report data (the IU Faculty Annual Report System is implemented using components of the Coeus Research Administration System); future ingests will include faculty service activities, creative activities, teaching/advising activities, and publications. IU intends to test further data loads using enhanced faculty publication data from our IU ScholarWorks Repository as well as test data from both Elsevier and Thomson. Furthermore, the IU VIVO team has been working closely with the HUBzero Consortium at IU and the Life Science Core Technology Team at IU on an NSF Software Infrastructure for Sustained Innovation proposal that would be similar in scope to the exemplar application that repurposes linked data from the VIVO Cornell instance. This would enable another exemplar application utilizing IU VIVO data within the HUBzero science gateway (Joomla).
Indiana University and the University of Wisconsin are also working together on a related proposal using the HUBzero gateway in order to develop a digital humanities collaboratory under the auspices of Project Bamboo (an international partnership for developing digital humanities cyberinfrastructure). Project Bamboo intends to utilize VIVO data within their Workspaces initiative that will be implemented with HUBzero.

D.6. Ponce School of Medicine (PSM)

D.6.a. Accomplishments and current status
PSM hired all personnel supported by the VIVO grant and, throughout the first year as new tasks came up that were not part of the original proposal, added assignments to personnel as fit their backgrounds and official roles. PSM also added to the local team one person from the institution who is not paid by VIVO.

VIVO at PSM includes approximately 39 organizations and 41 people. These consist of ingested profiles for 100% of the basic science faculty as well as a handful of other faculty. Most records contain a modest level of detail, though several have been more thoroughly filled. Some of the input was by ingest; however, ingest is not yet smooth enough to support all PSM needs (e.g., publications and grants), so a fair amount of information has been manually inputted.

The server environment comprises one server: a VMware ESXi 4.0 virtual server running Windows Server 2008 R2 with Apache Tomcat 6.0.26 and MySQL 5.1.40.

D.6.b. Future Work
In the upcoming year, PSM will complete the faculty profiles on VIVO and add several clinical investigators to provide nearly complete representation of PSM faculty. The implementation team will continue to obtain support from the national team to customize VIVO (e.g., learning to change the banner and other physical appearance properties), as well as continue working to help shape the ontology to reflect the needs of small schools.

D.7. The Scripps Research Institute

D.7.a. Accomplishments and Current Status
The Scripps team is supported by IT Services staff members who are not paid by the VIVO grant yet are essential to our success and progress to date: Sam Katkov, IT Support Specialist, and Brant Kelley, IT Services Manager.

The test instance of VIVO has approximately 52,025 triples and currently holds profiles for 295 people, 508 organizations, 16 services, 34 subject areas, and 5 articles. The test server will be copied over production by July 1, 2010. This data was entered manually, as the data harvesting and ingest procedures are not yet developed enough for Scripps. Library team members communicated with the IT, Communications, Sponsored Programs, and Human Resources departments in regard to possible data sources.

Scripps solicited CVs from all faculty in the designated showcase department. Scripps is in the process of manually inputting the Chemical Physiology faculty data and plans to complete the process by August 12, 2010. If time allows, Scripps will develop a second showcase department, the Cancer Biology department at Scripps Florida.

The server environment comprises 2 virtual servers running the Linux operating system (CentOS 5.4, 64-bit), Tomcat 6.0.26, and MySQL 5.0.77.

D.7.b. Future Work
The upcoming year will focus on incorporating more data into VIVO and collaborating with faculty and administration to ensure that VIVO can serve the institutional mission and make a significant contribution in moving it forward. A major goal is for VIVO to be a central component in promoting and supporting collaborations among scientists at our bi-coastal campuses in California and Florida. The VIVO interface will be enhanced and branded with institutional logo, headers, footers, and text within the next month. Scripps will continue to provide feedback to help shape the VIVO open source software and ontology, and to serve as an example that smaller institutions with limited library and IT staff can successfully implement VIVO to support their institutional research and educational missions.

D.8. University of Florida (UF)

D.8.a. Accomplishments and current status
The UF implementation team has approximately six members, two of them newly hired: Paula Markes, UF Ontology Expert, and Alexander Rockwell, UF IT Expert.
VIVO at UF has approximately 562,242 triples and currently holds 17,735 people, 149 grants, 636 organizations, and 22 articles. The VIVO database includes all UF faculty and staff in all 16 colleges from all disciplines, their respective academic departments, administrative departments, and all centers and institutes. Significant effort was put toward modeling the UF organization structure, as that data is not captured in any other campus database. With this organization data stored within VIVO, Alex Rockwell wrote the first external application to interface with the VIVO database and output the UF organization structure in graphical form. This code is written in Ruby and is available at github 17.

In addition to information from the HR system, UF identified and secured agreements from the Division of publication repositories managed by the UF Libraries and the UF Institute for Food and Agricultural Sciences.

UF VIVO has four servers: 1) vivo.ufl.edu: Production, 64-bit Ubuntu, 8 GB RAM; 2) qa.vivo.ufl.edu: QA, 64-bit CentOS, 8 GB RAM; 3) test.vivo.ufl.edu: Test, 32-bit CentOS, 3 GB RAM; 4) dev.vivo.ufl.edu: Dev, 32-bit CentOS, 3 GB RAM.

D.8.b. Future Work
The showcase departments will be manually inputted by the UF implementation and outreach teams by August 1, 2010. Specific tasks include automating ingest procedures from campus data repositories and from external data providers such as PubMed. The UF development team will implement scripts to import bibliographic management files into VIVO and enable campus webmasters to easily access the data for use in UF websites. The UF VIVO interface is also undergoing a redesign that will be implemented in the next quarter.

D.9. Washington University School of Medicine (WUSM)

D.9.a. Accomplishments and Current Status
people. The test VIVO instance at WUSM (behind a firewall) includes 30,667 triples and is populated with all faculty who are at least at 50% effort (1,619), their respective departments and divisions (21), centers and institutions (4), and the medical school library. These data are populated through an automatic ingest from our Human Resource PeopleSoft database.

NIH biosketches or CVs have been collected from all faculty (tenure track and research) and center members for the showcase departments. A newly hired individual, Jasmine Owens, who is working at 20% effort on the VIVO grant, is entering the data. The interface is being improved in partnership with the WUSM Medical Public Affairs Office, which is currently working on the visual presentation of the homepage.

There are four VIVO servers. Each is a Xen virtual machine running within a CentOS 5 based cloud on Dell PowerEdge M610 blades in an M1000e blade enclosure. They all run the same OS and have the same hardware specifications: 64-bit CentOS 5, 4 GB RAM, 1 Xeon E5520 virtual CPU core @ 2.27 GHz.

D.9.b. Future Work
The upcoming year is focused on securing and incorporating NIH Biosketch/CV data, automatically ingesting publication data, and incorporating recent events into the WUSM VIVO instance. Specific tasks include automating ingest procedures from campus data repositories and from external data providers such as NIH RePORTER. We will enter phase II of the website improvement plan in September.

D.10. Weill Cornell Medical College (WCMC)

D.10.a. Accomplishments and current status
VIVO at WCMC contains 137,665 triples and includes 4,864 people and 100 organizations. Demographic data are also ready to be linked for referral by at least one other site in the consortium. Requirements for data exchange have been identified and continuously updated. An automated feed of post-award data for all known grants has also been established and is ready for upload and link referral.
This grant feed and its transform have been shared with other national sites as examples both for implementation and ontology work. The WCMC VIVO team is working to prioritize the next set of data to be loaded.

The WCMC showcase department has been identified and the VIVO team is working with the department to upload their data. WCMC continues to identify automated authoritative data sources and to outline the approach to update this data within VIVO.

Current technical specifications include one production server: Solaris Zone, 8 GB RAM, 4 GB swap, 8 VCPU; operating system Solaris 10; Tomcat 5.5.27, MySQL 5.1.30. A test server will be forthcoming but is not yet in place.

D.10.b. Future Work
The WCMC VIVO team will have the showcase department completed by August 1, 2010. WCMC will hire two new VIVO positions: a developer focused primarily on the VIVO project and a trainer/outreach position to help with the training of department curators and with outreach to our CTSC partner institutions. It is expected that all automated data will be link referenceable from at least one other consortium site.

E. Outreach

E.1. Overview and Current Status
National Outreach activities are coordinated by Kristi Holmes of Washington University School of Medicine. This role was originally filled by Medha Devare of Cornell University, who recently left Cornell and the VIVO project to pursue another professional opportunity overseas. Outreach has been reorganized and now consists of seven sub-teams: Speakers Bureau, Adoption & Collaboration, VIVOweb.org Web site, Marketing, Education, Publishers & Aggregators, and the National VIVO Conference. Biweekly conference calls have been held throughout the period of the grant with the entire Outreach team, located at all 7 VIVO partner sites.
Project members at the sites are actively involved in outreach activities on a national and local level, including regular communications with their faculty about the VIVO project and presentations about VIVO to research groups and administrative units on campus. It is anticipated that outreach efforts at all schools will increase as the platform is further developed and content contained in the local instances becomes more substantial. This increased outreach effort will utilize many of the tools and resources that have been developed by the Marketing and Education teams.

E.1.a. VIVO Speakers Bureau
The Speakers Bureau is responsible for identifying appropriate conference venues for VIVO presentations, locating suitable VIVO team speakers, and assisting team members with developing abstracts and meeting deadlines. While the team has been reshaped within the last month, this work has been going strong since the grant's inception. Since September 2009, VIVO team members have presented on VIVO thirty-four times and displayed seven posters. 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 Four papers have resulted from conference presentations. 59 60 61 10

VIVO has successfully utilized professional conferences that team members were already scheduled to attend and has chosen to use grant funding to attend conferences not generally attended by team members. At several conferences and meetings of the Special Libraries Association and Medical Libraries Association, the team gave multiple presentations and posters at the same conference. In the future, the VIVO team will continue to identify appropriate venues for presenting VIVO to the scientific and library communities. VIVO has also reached out to the Clinical and Translational Science Award (CTSA) institutions via presentations to CTSA Consortium committees and conversations with consortium members.

E.1.b. VIVO Adoption & Collaboration Team
The adoption and collaboration team has provided over two dozen webinars to academic institutions and organizations across the country (see list below).
In addition to a live walk-through of VIVO at Cornell and University of Florida, the presentations include an explanation of what VIVO is, who its intended audience is, how data is ingested, and how the RDF expressed from VIVO fits into the semantic data cloud. Requests for information from the website contact form and other referrals are initially handled by the adoption team lead and routed to the proper person as needed (administrative, development, ontology, etc.). We have a team of project members representing different aspects of VIVO who can be recruited to present at the webinars.

Table 4. Academic and Other Organizations That Have Participated in VIVO Demos
Academic Institutions: Brown University; California Institute of Technology; Duke University; Emory University; Harvard University; Kansas Medical Center; North Carolina State University; Northwestern University; Oregon Health and Science University; SUNY Buffalo; Tufts University; University of Arkansas; University of Rochester; University of Colorado, Boulder; University of Illinois at Chicago; University of Miami
Organizations: Clinical Translational Science Award (CTSA) research networking group; White House Office of Science and Technology Policy; USDA; Thomson Reuters; Collexis; Elsevier; Association of Biomolecular Resource Facilities; CTSA Strategic Goal 3 Committee; CTSA Communication Key Function Committee; American Academy for the Advance of Science; CTSA Biomedical Informatics Key Function Committee; White House Office of Science Technology and Policy; Southeast University Research Association; Florida Lambda Rail Board of Directors


E.1.c. VIVO Web Team
A project web site (see Figure 4) has been developed to disseminate information about the VIVO project and how to participate with VIVO. The content covers events, news releases, blog entries, and educational and marketing materials. The educational materials include tutorials, documentation, FAQs, and a glossary. The marketing materials include a media kit, identity guidelines, and a flyer for the national conference. There are download links to the software and ontology and links to the newsletter subscription form, the annual conference site, and social networking sites about VIVO. The logo and color scheme have been applied per the agreed-upon identity guidelines.

Because the project site is created with a content management system (Drupal), content editing and upkeep can be assigned to different project members, who can then manage their content without the need of a developer. This is an ongoing process, and effort is made to keep the content timely and relevant. Visitors to the site have the option to submit a contact form to request information or to provide feedback about the application, website, or materials provided. User forums were created as a place for adopters and developers to share and discuss their experiences. Future plans include upkeep of the site, content development, and maintenance of the user forums.

E.1.d. VIVO Marketing Team
The Marketing team was formed to create a marketing strategy plan for VIVO on a national level. This committee created and established the VIVO brand, including the VIVO logo and the identity guidelines pertaining to its use.
The marketing team assisted in the development and refinement of the VIVO project site. Lastly, this team has implemented a national marketing outreach approach through Constant Contact (email database and subscription newsletters) and through listservs.

The Marketing team is responsible for a variety of deliverables, not only on a national level but also items that can be customized for each of the VIVO implementation sites. This team created appropriate marketing/publicity materials for a variety of uses, including the national network, PR, the first annual VIVO national conference, workshops, and other VIVO events.

In the second year, the Marketing team will continue to create materials for the growing variety of users as the application develops. Materials will include more interactive elements, tutorial videos, and templates. The marketing templates will be fully editable and customizable so each local institution can tailor the marketing materials to its own needs. As materials are developed, they will be available on http://vivoweb.org/support under the marketing materials category.

E.1.e. VIVO Education Team
The Education Team is a sub-group of the VIVO Outreach team and is tasked with the instructional and educational needs of the project with regard to its installation, adoption, and dissemination. In order to meet the educational requirements for VIVO, this group utilizes an assortment of media while preparing instructional materials in a variety of formats. Early in the project this team completed a needs assessment for the educational requirements of the project. The research included identifying the target audiences for VIVO and the characteristics of the learners within those groups, and exploring instructional design theories to identify a methodology for approaching education within the project.
Technical documentation related to the installation and administration of the VIVO application was identified as an early need. This documentation was created and is versioned according to the current release. It is publicly available alongside the application download and will be updated throughout the project period to reflect version changes.

Figure 4. The VIVO Project Web Site, www.vivoweb.org

Additional documentation has been created to support the library-based model of this project. Librarians who participate in the adoption of VIVO pilot test educational materials related to data input into VIVO and provide evaluative feedback used to improve the instructional materials that will support both library staff and end users in utilizing the application. In coordination with the VIVO Marketing Committee, several educational products have been developed which promote the project and educate individuals who seek information about it. These products include presentations, posters, brochures, and web-based materials available on the project web site.

Completed educational materials include 1) quick reference educational materials for VIVO users with various roles in the system; and 2) a general VIVO overview with talking points for presenters; information on the technical aspects of VIVO; and the implementation plan and strategy. Educational materials currently in development include 1) documentation describing the processes for installing, upgrading, and administering the VIVO application; and 2) a quick reference for VIVO users with the self-editor role.

In the upcoming months, the team aims to continue to refresh the current educational support products to reflect the evolution of the VIVO application. Newly designed materials will focus on VIVO's end users as this segment of our target audience begins to interact with the project. At this point, we are on track with the products and deliverables outlined in the project proposal and plan.

E.1.f. VIVO Publishers & Aggregators Team
The Publishers and Aggregators Team is charged with developing relationships with publishers and aggregators for the purpose of data ingest and availability in VIVO. We have initiated discussions with the Public Library of Science (PLoS), Thomson Reuters, Collexis, Inspec, Springer, and Elsevier to make content available in VIVO. Most recently, VIVO has begun a pilot project to evaluate the process by which Scopus Custom Data (Elsevier) can be used to populate institutional VIVO instances. The pilot project, carried out by the VIVO development team at the University of Florida, utilizes a Scopus Custom Data dataset and will serve as a proof of concept with respect to integrating Scopus Custom Data into individual VIVO instances for discovery across the VIVO network. We plan to initiate a similar pilot project where XML data from the Scopus API can be used by institutions with a Scopus license agreement to integrate Scopus data into individual VIVO instances. 62

We are very interested in continuing our discussions with entities such as the Public Library of Science (PLoS), Thomson Reuters, Collexis, and Elsevier to make data available in VIVO. We also plan to investigate the incorporation of data from resources such as CiteSeer, arXiv, and the NIH RePORTER database, among others. Additional content providers are welcome to participate, and we maintain an open spirit of collaboration.

E.1.g. VIVO National Conference Team
The first annual National VIVO Conference, Enabling National Networking of Scientists, will be held at the New York Hall of Science on August 12-13, 2010. The conference will bring together scientists, developers, publishers, funding agencies, research officers, students, and those supporting the development of team science.
This two-day conference will offer workshops and tutorials for those new to VIVO, those implementing VIVO at their institutions, and those wishing to develop applications using VIVO. Invited speakers will present on the Semantic Web, Linked Open Data, and the role of VIVO in support of team science. Panelists will discuss adoption and implementation findings. Feedback sessions will engage participants in requirements gathering and brainstorming regarding future network services. Presenters will discuss mapping, social networking, crowdsourcing, support for societies, and other national network applications.

There have been 28 submissions for the meeting, and we intend to open a late-breaking submission window to accommodate interested parties who missed the original submission window. We have secured keynote speakers and invited speakers for the conference. Conference preparations are ongoing.

E.1.h. Other Outreach

Since October 2009, the VIVO project has been featured in a number of articles 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80. The VIVO project has also been highlighted in press releases from the seven VIVO installation sites and has been the subject of other local, national, and international media coverage. The VIVO project has been the subject of hundreds of blog posts, FriendFeed posts, Tweets, and other social networking information streams. VIVO: Enabling National Networking of Scientists has a page on Facebook 81 and a group on LinkedIn 82 to help promote the project and direct people to the project website.

E.2. Variation

The outreach work described herein is in line with the work that was proposed in the grant application.

F. Governance

The VIVO governance plan calls for the creation of three advisory groups: Executive, Technical, and Scientific Advisory Boards.

F.1. Executive Advisory Board

Four members have accepted positions on the VIVO Executive Advisory Board. Five positions remained to be filled; of these, two invitations have been extended and three positions remain unfilled.

Two challenges to filling membership on the Executive Advisory Board have been identified. First, potential members are very difficult to contact. VIVO is seeking very senior people, and it is important to have personal connections with such people in order to establish contact. Second, potential members of the Executive Advisory Board are often unaware of the project and require orientation before they can agree to participate. Given their very heavy commitments, providing orientation has proven to be a challenge. We continue to seek EAB members and expect to have the board constituted in the next several months.

F.2. Technical Advisory Board

The Technical Advisory Board has been created and has met by conference call, providing important insight into the future of VIVO technology. Its membership is listed in Table 5.

Table 5 VIVO Technical Advisory Board

Member            Affiliation
John Wilbanks     Creative Commons
York Sure         President of the GESIS Leibniz Institute for Social Sciences and Professor, University of Koblenz, Germany
Neil Smalheiser   University of Illinois, Chicago
Barend Mons       University of Rotterdam, The Netherlands
Kei Cheung        Yale University
Chris Bizer       Free University of Berlin, Linked Open Data
Steffen Staab     University of Koblenz, Germany
Abel L. Packer    BIREME/OPS/OMS, Director, Brazil
Stefan Decker     Director of DERI Galway, Ireland
Carole Goble      University of Manchester, UK, co-director of e-Science NorthWest
Dean Krafft       Cornell University Library
Griffin Weber     Harvard University College of Medicine
James Hendler     Rensselaer Polytechnic Institute
Carl Lagoze       Cornell University, Department of Information Science

F.3. Scientific Advisory Board

We have not created a Scientific Advisory Board. This group is intended to consist of leading scientists who can help identify applications and opportunities for coordination worldwide. Some members would be recruited by members of the Executive Advisory Board. Given the delays in constituting the EAB, we may begin to recruit directly to the Scientific Advisory Board.

G. Evaluation

Evaluation has consisted of multiple components, including preparations for usability and formal evaluations; conducting formal and informal surveys; and monitoring and annotating website analytics. The formal evaluation varies from the original grant; however, we feel it is vital to fulfill the grant's evaluation obligations of understanding how researchers find collaborators, how they may use VIVO, and whether and what is needed so VIVO can be sustainable.


Morae software 83 has been installed and tested at each participating university. Three sites have used the software for usability testing: Cornell, Washington University School of Medicine (WUSM), and Scripps. IRB approval has been attained at Cornell and WUSM to conduct evaluations; the WUSM IRB exempt protocol covers evaluations at all institutions. Usability testing is underway with VIVO v1.0, with data given to the development team. This testing asks participants to complete 7 tasks, which are recorded in audio, video, and screenshots using Morae software.

Formal evaluations through interviews with faculty, staff, and students are underway to understand how they currently find collaborators and how they would use VIVO in this process. There will be evaluations at the four institutions TSRI, WUSM, PSM, and IU; Scripps has been completed for Year 1. WUSM, IU, and PSM Year 1 evaluations will be complete by September 2010. Year 2 evaluations will be completed by August 2010.

A survey of project participants was conducted for the six-month project review. Findings were presented at the review, and a short follow-up survey was conducted regarding the review. This will be repeated at 1 year, 18 months, and 2 years. Each survey will be slightly different as the project evolves to address current issues.

Support for other VIVO efforts have also been co[...] VIVO implementation members which tabs they felt would be important for the next VIVO release, and analyses of the user scenario survey results. Support will continue in the next year as needed by institutions.

Google Analytics accounts have been created to collect information from each of the seven institutions and VIVOweb.org. Analytics collected include the frequency of VIVO downloads and the number of new site visits.

G.1. Evaluation Status

Table 6 lists primary objectives for the VIVO evaluation along with the activity and status of each.

Table 6 Primary Objectives to be evaluated in VIVO assessment

Objective in Grant: Support network will be in place
  Assessment: 1. Identify individuals tasked with outreach and support in the network. 2. Evaluate communication/interaction among persons.
  Status: 1. Completed. 2. Formal evaluation of this has not seemed necessary.

Objective in Grant: VIVO implemented at adopting institutions
  Assessment: 1. Visit VIVO websites at adopting institutions. 2. Web-based surveys to key personnel at implementation sites.
  Status: 1. Completed and ongoing as VIVO upgrades are available. 2. Surveys covering core areas of VIVO (implementation, development, and outreach) have occurred and will continue throughout the grant.

Objective in Grant: VIVO support services and training meets the needs of users at VIVO implementation sites
  Assessment: 1. Evaluate success of training (both in just in based tutorials).
  Status: 1. Not started. This will occur in the next year.

Objective in Grant: VIVO disseminated beyond initial adopters
  Assessment: 1. Collect evidence demonstrating presentations given to promote VIVO. 2. Educational outreach activities.
  Status: 1. Presentations are listed on the VIVO team wiki. This will continue for the next year.

Objective in Grant: VIVO accessed and used by diverse user community
  Assessment: 1. Visit VIVO websites at adopting institutions.
  Status: 1. Monitored using Google Analytics.

Objective in Grant: VIVO community support developed beyond initial implementers
  Assessment: 1. Monitor online VIVO forums.
  Status: 1. In progress. This will continue in Year 2.


H. Year 2 Work Plan

The VIVO Project remains on track for all project deliverables. Significant work elements have been added to the original plan, including an early open source release, a national conference, and significant additional outreach activities.

Table 7 VIVO Project Timeline, Year 2

Tasks (Qtr 1, Qtr 2, Qtr 3, Qtr 4)

Governance
  Technical and Scientific Advisory processes continue: X X X X
  Executive Advisory meetings: X X
  Evaluation activities and reporting: X X X X
  Final report: X X

Development
  Complete release 2: X X
  Develop release 3: X X
  Transition to community development: X X

Outreach
  Presentations at conferences: X X X X
  Support adoption activities: X X X X
  Second national conference: X

Implementation
  Consortium schools implement release 2: X X
  Feedback from release 2: X X

Use
  Support release 1: X X X X
  Support release 2: X X

Sustain
  Develop community of adoption: X X X X
  Develop community of implementation support: X X X X
  Develop community of usage support: X X X X
  Maintain community development: X X X X

H.1. Variance From Original Plan

Significant additions to the plan include an open source release in April 2010. This has significantly spurred interest and adoption activity. A workshop on author disambiguation was added. This workshop, held in Gainesville, Florida on March 18 and 19, 2010 84, brought together international expertise on issues resulting from the historical processes used to identify authors on publications. Finally, a national conference was added to the work plan. The first annual VIVO conference will be held at the New York Hall of Science August 12 and 13, 2010.

Effort will be added to support implementation and development at additional sites. VIVO community development is ahead of schedule.
We will add facilitators and support personnel to ensure that adopters and those considering adoption have appropriate support.

H.2. Challenges, Risks and Highlights

The formation of an Executive Advisory Board remains advisable. The board should assist with sustainability activities. The Technical Advisory Board is strong and active. Supplemented by an Executive Board and a Scientific Advisory Board, VIVO would have significant momentum for sustainability.

Implementation of VIVO is a significant undertaking. Unlike many software systems, VIVO gains value in the institution and in the scientific community only when substantial effort is invested to provide data to VIVO and to create processes for curation of that data. Despite great enthusiasm, substantial challenges await implementers. The VIVO team continues to lower the effort required through various interfaces and data partner arrangements.

VIVO is a work in progress. Some potential adopters have approached VIVO as if it were a commercial turn-key product with 24/7 support, expecting a level of completeness that is not yet available. It is important to national adoption that all potential adopters understand that VIVO is under development and that they can participate in that development, as with other open source projects.

Great interest has been developed in VIVO nationally and internationally. It is critical that the VIVO Team and its partners translate that interest into activity beyond the project team to build the national network of scientists.


