Web search engines and search tools, such as Google Dataset Search, make widespread and growing use of structured metadata. To improve the discoverability of metadata harvested into Research Data Australia, Schema.org metadata has been added to all Collection and Service records. Once a record has been indexed by Google, a search in Google Dataset Search will retrieve a result that points back to the Collection or Service record view page in Research Data Australia. The advantage of this approach is that, because Research Data Australia is a national data catalogue, its content is syndicated to a web data search tool without individual effort from each Contributor, and every Contributor receives the same coverage.
However, Research Data Australia Contributors who would like their datasets to also appear in Google Dataset Search with a direct link to their own repository as the source need to implement Schema.org markup on each dataset landing page in that repository.
The following guide describes how to make your dataset records available to Google Dataset Search from your own repository (in addition to Research Data Australia):
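As a rough illustration of what such markup might look like, the sketch below builds a Schema.org "Dataset" description as JSON-LD and wraps it in a script element that could be placed in a landing page. The function names, the selection of properties, and the example.org repository URL are assumptions made for illustration; Google's guidelines for dataset providers remain the authority on which properties are expected.

```python
import json


def dataset_jsonld(name, description, url, identifier=None,
                   license_url=None, creator=None):
    """Build a Schema.org Dataset description as a plain dictionary.

    The choice of properties here is an assumption for illustration;
    check it against Google's guidelines for dataset providers.
    """
    doc = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
    }
    # Optional properties are only emitted when a value is supplied.
    if identifier:
        doc["identifier"] = identifier
    if license_url:
        doc["license"] = license_url
    if creator:
        doc["creator"] = {"@type": "Organization", "name": creator}
    return doc


def to_script_tag(doc):
    """Serialise the description as a JSON-LD script element."""
    payload = json.dumps(doc, indent=2, ensure_ascii=False)
    # Escape "</" so that metadata text cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


# Hypothetical dataset record; every value below is a placeholder.
example = dataset_jsonld(
    name="Example rainfall observations",
    description="Daily rainfall totals from a hypothetical monitoring site.",
    url="https://repository.example.org/datasets/123",
    creator="Example University",
)
print(to_script_tag(example))
```

The printed element would sit in the HTML of the landing page for that dataset, with one description per page.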
- Use Google's Structured Data Testing Tool to check whether your dataset web pages already contain any markup.
- If there is no structured metadata, add Schema.org metadata, using the "Dataset" class, to every dataset landing page that you want indexed. Use the Structured Data Testing Tool to check the markup for syntax errors.
- Include the landing page URLs in a sitemap file to help Google find your dataset pages. Pages that are crawled (or re-crawled) will go into the Google Dataset Search index and be searchable within a few days.
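The sitemap step can be sketched as follows. This is a minimal example that assumes the landing page URLs are already available as a list; the function name and the example.org URLs are hypothetical, and the output follows the sitemaps.org 0.9 format.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(landing_page_urls, lastmod=None):
    """Return sitemap XML (as a string) listing each landing page URL.

    lastmod is an optional date string such as "2019-01-31" that is
    applied to every entry; per-page dates are left out of this sketch.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url in landing_page_urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" in the URL text.
        ET.SubElement(entry, "loc").text = page_url
        if lastmod:
            ET.SubElement(entry, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical landing pages in a Contributor's repository.
pages = [
    "https://repository.example.org/datasets/123",
    "https://repository.example.org/datasets/124",
]
sitemap_xml = build_sitemap(pages, lastmod="2019-01-31")
print(sitemap_xml)
```

The resulting text would be saved as a UTF-8 file (for example sitemap.xml) on the repository's web server so that crawlers can find it.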
Further information:
- Google AI Blog (26 September 2018), Building Google Dataset Search and Fostering an Open Data Ecosystem
- Google Developers page: guidelines for dataset providers
- DataCite Blog (12 December 2018), Google Dataset Search Webinar - everything you always wanted to know about Google Dataset Search, https://doi.org/10.5438/4sdj-hf49