Tracking data citation

Yesterday I was pointed to an interesting post about the problems of tracking data citation on Heather Piwowar's blog researchremix.
Heather was trying to track the reuse of datasets that have a DOI through different channels (Google Scholar, Web of Science, etc.) and was not satisfied with the result.
The DOI names she was tracking, 10.3334/ORNLDAAC/*, were not assigned by DataCite but through CrossRef, and they are actually a good example of why DataCite is needed.

A DOI name for a dataset alone is nothing but an identifier. DataCite’s goal is to build up additional services for all datasets registered by our members. This includes uploading the citations into Web of Science or Google Scholar, and tools to measure use, re-use and citation. These are major efforts that we want to achieve in cooperation with our members and data centers. As written in my last entry, the first services will be available in the middle of June. For a first glance at what is possible with data registration, I would strongly suggest taking a closer look at the Publishing Network for Geoscientific & Environmental Data (PANGAEA). They have registered over half a million datasets through our system and have started to upload their content to Google Scholar and OAIster. If you try to track their data DOI names (all of which start with 10.1594/PANGAEA), the results should be much more satisfying, because they not only have DOI names for their datasets but also an excellent infrastructure behind them that is freely crawlable by third parties.
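To make the prefix distinction concrete, here is a minimal sketch of how someone tracking dataset DOI names might sort them by the prefixes mentioned above. The function names and the labels are assumptions made for illustration; only the two prefixes (10.1594/PANGAEA and 10.3334/ORNLDAAC) come from this post, and the PANGAEA suffix in the usage note is made up.

```python
# Illustrative only: helper names and labels are hypothetical.
# The two prefixes are the ones named in the post.
KNOWN_PREFIXES = {
    "10.1594/PANGAEA": "PANGAEA (registered through DataCite)",
    "10.3334/ORNLDAAC": "ORNLDAAC (assigned through CrossRef)",
}

# Common ways a DOI name is written in text or as a link.
LEADS = ("https://doi.org/", "http://dx.doi.org/", "doi:")


def normalize_doi(raw: str) -> str:
    """Strip whitespace and a leading 'doi:' or resolver URL, if present."""
    doi = raw.strip()
    lowered = doi.lower()
    for lead in LEADS:
        if lowered.startswith(lead):
            doi = doi[len(lead):]
            break
    return doi.strip()


def classify_doi(raw: str) -> str:
    """Return a label for a known prefix, or 'unknown'."""
    # DOI names are case-insensitive, so compare in upper case.
    doi = normalize_doi(raw).upper()
    for prefix, label in KNOWN_PREFIXES.items():
        if doi.startswith(prefix):
            return label
    return "unknown"
```

For example, `classify_doi("doi:10.1594/PANGAEA.000000")` (a made-up suffix) would be labelled as PANGAEA, while an article DOI such as doi:10.1016/j.margeo.2004.03.017 falls through to "unknown". This only groups identifiers; it says nothing about whether a DOI name is actually cited anywhere.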

Nevertheless, tracking the re-use of datasets through DOI names is still a problem, as the actual idea of re-using, referencing or citing existing data has only just started to take hold. A lot of PANGAEA’s datasets are used in scholarly publications, and though these connections are sometimes visible (doi:10.1016/j.margeo.2004.03.017), the DOI names of the datasets do not yet appear in the metadata of the articles. This is one of our goals in the next month: to convince editorial boards to allow and explicitly ask for citations of datasets used in a manuscript or, as a first step, to actively provide publishers with the information that there is data available for their articles.

This is where we are going. Just assigning DOI names or any simple identifier is not enough: we need a central portal, additional services, and direct cooperation with third parties, but most of all a strong voice that is heard by all the other players.




About datacite

Managing Agent DataCite
This entry was posted in DOI.

3 Responses to Tracking data citation

  1. Thanks for expanding on this, Jan!

  2. Hi Jan,
    Are you able to comment on any further progress in data citation tracking through DOI names? We are in the process of assigning DOIs to our research datasets and data collections using DataCite as the DOI RA via the Australian National Data Service. The citation tracking will be a critical measure of success. As you say, otherwise the DOIs will only serve as persistent identifiers (still a very important thing – especially for research data).
    Natasha – Griffith University, Australia

    • datacite says:

      Hi Natasha,

      we are making progress, in small steps, but progress nevertheless. As you can see from the posts on the new search portal, we have started to collect the metadata centrally, including the relevant metadata on relations from our data to articles. These relations can now be harvested by publishers and third parties to count citations.
      We are also engaged with publishers to encourage authors to link to the data from their articles and to actively allow data citation. This is a long road, but we will hopefully present a joint DataCite-STM statement on this at our summer meeting on June 14th in Copenhagen.


