For those of you who have taken the time to browse through my slides from APE 2011, the most interesting bit might have been the outlook on slide 26, so I would like to go into a little more detail on some of DataCite's plans for 2011:
- DataCite has broken the barrier of 1,000,000 registered objects
As of 31 December 2010, DataCite had registered 1,002,631 DOI names.
The 1,000,000th DOI name was registered by the DataCite member ETH Zurich: doi:10.5169/SEALS-130540
Yes, this object is grey literature. It might be new to some of you, but DataCite’s definition of research data is pretty broad. I once heard Liz Lyon say “research data is evidence”. Anything that is the foundation or part of the scientific process, but is not a scholarly article or a book, can be seen as research data: a picture, a video file, a map, a model and, yes, even grey literature.
The majority of our 1,000,000 objects, however, are classical research data in the sense of a bunch of numbers.
- DataCite metadata kernel
One of the DataCite working groups has almost finalized the metadata kernel that we will use at DataCite to describe all our registered data objects. I will let you know all the details when it is published in the coming weeks.
- Central metadata repository
In the last month the DataCite developers at the British Library, TIB and CISTI have been very active in building our new registration infrastructure. At its heart will be our new central metadata store, which should be up and running around June 2011. It will include the metadata descriptions of all objects registered by any DataCite member, and it will be freely browsable and searchable by any third party.
One of our first collaborations is to let Thomson Reuters crawl it in order to load all this content into Web of Science.
Secondly, we have talked to CrossRef about enabling all publishers to search our metadata directly for relations between our datasets and their articles. Are you aware of the existing pilot by PANGAEA and Elsevier? Think of that, but this time with all our data centers and all publishers!
- More cooperation with publishers
Right now the information that a dataset is referenced in an article is supplied manually by the creator of the dataset. The next step, of course, has to be that publishers actively encourage their article authors to upload their data to a data center and then cite it in the article. TIB has started a pilot with Thieme Chemistry to include data publication in the article submission workflow. More collaborations like this have to follow.
- Cooperation with eSciDoc
We are in discussion with the folks from eSciDoc about including a DOI-registration interface in their infrastructure, so that data centers are offered not only a way to register DOIs but also a publishing and repository infrastructure to go with it.
More about it soon.
So you see why we believe that 2011 will be a very interesting year for data and DataCite.