Organisation: European Commission, Joint Research Centre
Point of contact: ralf.steinberger@ec.europa.eu

Title: JRC-Names RDF: Person and organisation spelling variants as found in multilingual news articles

Description

JRC-Names is a highly multilingual named entity resource for person and organisation names (called 'entities') developed by the European Commission's Joint Research Centre (JRC). JRC-Names consists of large lists of names and their many spelling variants (up to hundreds for a single person), including across scripts (Latin, Greek, Arabic, Cyrillic, Japanese, Chinese, etc.).

The resource is a by-product of the Europe Media Monitor (EMM, see http://emm.newsbrief.eu/overview.html ) family of applications, which has been analysing up to 220,000 news reports per day since 2004. EMM recognises names mentioned in the news in over twenty languages and decides automatically, for each newly found name, whether it belongs to a new entity or is a spelling variant of a previously known entity. This resource allows EMM users to display news about people or organisations even if their names are spelt differently or if the news articles are written in different languages and ...
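To illustrate how a list of spelling variants can link differently spelt names to one entity, here is a minimal Python sketch. It is not EMM's matching method, which this record does not describe: it only performs an exact lookup after Unicode normalisation and case folding. The entity identifiers, the sample variants, and the function names are all hypothetical and are not taken from the JRC-Names files.

```python
import unicodedata

# Hypothetical sample data: the identifiers and variant lists below are
# illustrative only and are NOT copied from the JRC-Names resource.
VARIANTS = {
    "entity-1": ["Angela Merkel", "Ангела Меркель"],
    "entity-2": ["United Nations", "Nations Unies", "Vereinte Nationen"],
}


def normalise(name: str) -> str:
    """Unicode-normalise, case-fold, and collapse runs of whitespace."""
    folded = unicodedata.normalize("NFKC", name).casefold()
    return " ".join(folded.split())


def build_index(variants: dict) -> dict:
    """Map every normalised spelling variant to its entity identifier."""
    return {
        normalise(variant): entity_id
        for entity_id, names in variants.items()
        for variant in names
    }


INDEX = build_index(VARIANTS)


def lookup(name: str):
    """Return the entity identifier for a known variant, else None."""
    return INDEX.get(normalise(name))
```

With this index, lookup("Ангела Меркель") and lookup("ANGELA  MERKEL") resolve to the same hypothetical identifier, which is the property that lets news in different languages and scripts be grouped under one entity.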

Contributors
Ralf Steinberger ralf.steinberger@ec.europa.eu
Guillaume Jacquet guillaume.jacquet@ec.europa.eu 0000-0001-8388-3897
How to cite
Steinberger, Ralf; Jacquet, Guillaume (2015): JRC-Names RDF: Person and organisation spelling variants as found in multilingual news articles. European Commission, Joint Research Centre (JRC) [Dataset] PID: http://data.europa.eu/89h/jrc-emm-jrc-names
Keywords
Related resources

Data access

  • JRC-Names, RDF files api

    The zip archive contains an RDF file listing the JRC-Names entities and their spelling variants.

  • SPARQL endpoint access api

    Access to the JRC-Names dataset via the Open Data Portal SPARQL endpoint. Specific query examples are provided on the SPARQL endpoint web page.
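As a rough sketch of what SPARQL endpoint access could look like from a script, the Python below builds (but does not send) an HTTP GET request carrying a query in the standard `query` parameter. Several things here are assumptions, not facts from this record: the endpoint URL is a placeholder, the use of rdfs:label to hold names is a guess about the data model, and the function names are hypothetical. The query examples on the endpoint page are the authoritative reference for the actual vocabulary.

```python
import urllib.parse
import urllib.request

# Hypothetical placeholder: replace with the real SPARQL endpoint URL.
ENDPOINT = "https://example.org/sparql"


def build_query(name: str) -> str:
    """Build a SPARQL query for resources labelled exactly `name`.

    Assumption: names are exposed through rdfs:label. The real dataset
    may use a different property.
    """
    # Escape backslashes and double quotes for a SPARQL string literal.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?entity ?label WHERE {\n"
        "  ?entity rdfs:label ?label .\n"
        f'  FILTER(STR(?label) = "{escaped}")\n'
        "} LIMIT 20"
    )


def build_request(name: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Prepare a GET request that asks for JSON-formatted results."""
    params = urllib.parse.urlencode({"query": build_query(name)})
    return urllib.request.Request(
        f"{endpoint}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )


request = build_request("Angela Merkel")
# To execute against a real endpoint, pass `request` to
# urllib.request.urlopen() and decode the JSON response body.
```

Building the request separately from sending it keeps the sketch runnable offline and makes the encoded query easy to inspect before pointing it at the real endpoint.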

Publications

  • publication JRC-Names: A freely available, highly multilingual named entity resource

Other resources

Additional information
Last modified 2016-05-19
Issue date 2015-05-19
Landing page https://ec.europa.eu/jrc/en/language-technologies/jrc-names
Geographic area European Union
Temporal coverage From: 2004-01-01

Update frequency irregular
Language English
Data theme(s) Economy and finance; Science and technology
EuroVoc domain(s) 36 SCIENCE; 64 PRODUCTION, TECHNOLOGY AND RESEARCH
Identifier http://data.europa.eu/89h/jrc-emm-jrc-names

Please be aware that the information and links provided in the metadata above are maintained in distributed and heterogeneous information systems. Although we strive to keep links and information up to date, this may not always be possible because some changes are not registered in the relevant information systems. Please help us keep the system up to date by reporting broken links or any other outdated information to the relevant contact point. You can also inform us using the "Contact" link on this page.