Some VIVO Things Blog

Musings on the community, software, data, use, and whatever else comes to mind.

What does it take to implement VIVO?

I am often asked what resources it takes to implement VIVO. I was asked today, for example. Sometimes the question comes from technical staff members, sometimes from university administrators.

The answer is “it depends.” No one likes this answer, and I don’t either. It depends on what you want to put into your VIVO and how prepared you are to put those things in. If you are very prepared and your needs are simple, a two-person team should be able to set up VIVO in a couple of months, with 0.5 FTE to maintain it afterward. But if you are like the University of Florida, which is large, complex, and without mature institutional data management practices (that may sound harsh, but I work there and have plenty of experience with those practices), then it may take a person-year or more to implement VIVO and up to two FTEs to maintain it.

Let’s break this down a bit. What might you want to put in your VIVO?

People. You’ll put in your faculty, possibly your postdocs, graduate students, and housestaff, and possibly your staff. At UF, we have faculty, postdocs, staff, and housestaff in our VIVO. We are considering adding graduate students. At UF, the count of people in VIVO with a current affiliation is 20,692. The number fluctuates each week based on our automated ingests. You’ll get your people data from HR or a data warehouse. Hopefully you have a unique person identifier that you can use to determine whether a person is already in your VIVO. And hopefully you’ll get contact information, home department, position information, and current status. Back to UF: we have a unique identifier, but we can’t tell who the postdocs are, because they are appointed in many different ways. The university is working to fix this, but in the interim, we have a special ingest process just to assign postdoc status based on a spreadsheet. Such is data management in the 21st century. We have a single ingest process for people that we run each week from HR data. It creates and updates people as necessary and supplies all the information about each person.
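The create-or-update logic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not UF's actual ingest: the `Person` fields, the `ingest_people` function, and the in-memory dictionary standing in for VIVO are all hypothetical, and a real ingest would read and write RDF in a triple store.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Person:
    # Hypothetical fields; a real HR feed will have its own layout.
    uid: str          # unique person identifier from HR
    name: str
    department: str
    position: str


def ingest_people(existing, hr_rows, postdoc_uids):
    """Create or update people keyed by unique identifier.

    existing     -- dict of uid -> Person, standing in for what VIVO holds
    hr_rows      -- iterable of Person records from this week's HR data
    postdoc_uids -- set of uids taken from the postdoc spreadsheet
    Returns (new store, created uids, updated uids).
    """
    store = dict(existing)          # do not mutate the caller's data
    created, updated = [], []
    for row in hr_rows:
        # HR cannot identify postdocs, so the spreadsheet overrides position.
        if row.uid in postdoc_uids:
            row = replace(row, position="Postdoc")
        current = store.get(row.uid)
        if current is None:
            created.append(row.uid)     # not yet in the store: create
        elif current != row:
            updated.append(row.uid)     # already there but changed: update
        store[row.uid] = row
    return store, created, updated


# Small illustrative run with made-up records.
in_vivo = {"1": Person("1", "Ann", "Chemistry", "Faculty")}
this_week = [
    Person("1", "Ann", "Physics", "Faculty"),
    Person("2", "Bo", "Biology", "Research Associate"),
]
store, created, updated = ingest_people(in_vivo, this_week, {"2"})
```

Keying everything on the unique identifier is what makes the weekly run safe to repeat: a person who has not changed is neither created nor updated.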

Publications. It’s not much of a VIVO if you don’t have some way of getting the scholarly work of your people onto their profiles. The library is typically key here. They will know which resources can be used: electronic subscriptions and their license limitations. PubMed data is free and can be reused, but it’s biomedical only, which may not cover the scholarship of your institution. At UF, we use BibTeX to load publications into VIVO. We have 53,659 publications in VIVO, mostly from 2008. We link these to PubMed if possible, adding MeSH terms and links to full text. Most publications are automatically ingested, but some are hand-entered. We don’t have a source for books written by our faculty. The university has no record, and the various indexing services typically do not have this data. The Library of Congress does not typically have affiliation data, so we can’t tell which books were written by people at UF. We don’t have a source for presentations. We don’t have an interface to our library’s institutional repository. We load our publication data each week. We provide an RSS feed of our recent publications.

Grants. Most universities have good data on grants they have received. UF has good data and we have an automated ingest for loading grant data into VIVO. UF VIVO has all grants made to the university since 2008 – a total of 18,641. The ingest updates grants if new information is available – sometimes award amounts or dates change, sometimes the investigators change. Our Division of Sponsored Programs maintains the official data and we reflect it in VIVO.
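As a rough sketch of what “updates grants if new information is available” can mean in practice, the Python below compares a stored grant record with the official one and reports what changed. The field names and both functions are assumptions made for illustration. They are not the actual UF ingest or any VIVO API.

```python
def diff_grant(current, official):
    """Return {field: (old, new)} for every field where the official
    record differs from what is currently stored."""
    changes = {}
    for field, new_value in official.items():
        old_value = current.get(field)
        if old_value != new_value:
            changes[field] = (old_value, new_value)
    return changes


def apply_official(current, official):
    """Treat the official record as authoritative and overlay it on the
    stored one. Returns (updated record, changes made)."""
    changes = diff_grant(current, official)
    updated = {**current, **official}
    return updated, changes


# Made-up example: the award amount and the investigators changed.
stored = {
    "award_id": "A-001",
    "amount": 100000,
    "end_date": "2013-06-30",
    "investigators": ["Ann"],
}
from_sponsored_programs = {
    "award_id": "A-001",
    "amount": 125000,
    "end_date": "2013-06-30",
    "investigators": ["Ann", "Bo"],
}
grant, changes = apply_official(stored, from_sponsored_programs)
```

Computing the difference first, rather than overwriting blindly, leaves a record of what changed in each run, which helps when someone asks why an amount or an investigator list looks different this week.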

Courses. We get data from the Registrar’s office and load it into VIVO once a year. We have all courses taught at UF from 2008 through 2012, a total of 87,269, each linked to its course catalog entry, instructor, and term. We’ll add the 2013 data soon, another 15,000 or so courses. These data form an important part of the profile of the person, indicate scholarly interests, and constitute another indicator of the social fabric of the university.

That’s it. We have four core ingests – people, publications, grants and courses.

To implement VIVO, you’ll need data. You’ll also need a creative IT staff. VIVO technology – triple store, ontology, RDF – is new to most IT people. You’ll need open-minded, creative people ready to learn.

You’ll also need the support of key stakeholders in your institution. Typically the provost, the vice president for research and the library director are involved in governance, championing and possibly leading your implementation.

All these issues get discussed at the annual VIVO Implementation Fest. The Fest brings project managers, planners, technical staff and stakeholders together for a couple of days of sharing and discussing issues related to VIVO implementation. There’s no registration fee. This year’s Fest will be at Duke University on March 18-19; details are on vivoweb.org.

So if you have champions, staff and data, you’re all set for a straightforward implementation. Otherwise, more effort and time will be needed. So “it depends.”