CASE #3 – COnnecting REpositories (CORE)

‘In many cases we are required to mine the full-text content in order to determine the licence of the content. The licence is typically not provided as part of the metadata. We are currently depending on the recent UK Copyright Exception for text-mining to do so, but a European-wide approach would be helpful.’

Petr Knoth
KNOWLEDGE MEDIA INSTITUTE, Open University

COnnecting REpositories (CORE) is a not-for-profit service run by the Knowledge Media Institute, Open University. Our research into the aggregation and text mining of research papers, supported by a range of funders including Jisc and the European Commission, has produced a platform and a number of applications built on top of it, which benefit a range of stakeholders and the general public.

CORE contains over 20 million open access research papers from worldwide repositories and journals and is used by over 90,000 unique visitors every month. By processing both full-text and metadata, CORE serves three communities:

  1. Developers, text-miners, scientometricians and others who need large-scale machine access to research papers.
  2. Researchers and the general public who need better, free access to research literature.
  3. Funders and government organisations who need to discover scientific trends and evaluate research impact.

As part of its work, CORE uses text and data mining methods on its aggregated papers in order to:

  • Extract information from research papers, including basic and advanced metadata, citations and unique identifiers.
  • Recommend related research papers.
  • Match papers to patents, funding opportunities and open courses to support a range of stakeholders.
  • Mine the licence of research papers to determine if they are compatible with the open access definition.
  • Support scientific knowledge discovery by improving access to research literature.
  • Categorise papers to determine the subject class and allow the monitoring of research trends.
