Dataset of metadata created from news articles with the Europe Media Monitor (EMM)/Medical Information System (MediSys) processing chain.
MediSys is a media monitoring system that provides event-based surveillance to rapidly identify potential public health threats using information from media reports. The system displays only those articles of interest to public health (e.g. diseases, plant pests, psychoactive substances), analyses news reports, and warns users with automatically generated alerts.
This dataset focuses on COVID-19. It provides a large set of metadata automatically extracted from news articles related to COVID-19 and stored in RSS/XML format. It is publicly available, and anyone can build applications on top of it. The current version contains 4 months of news articles, from December 2019 to April 2020, which corresponds to more than 6 million news articles. There is one zip file per month, containing all the metadata for that month. As an example, the largest month is March 2020: it contains 4.1 million news articles in 76 different languages, 36 million entity occurrences (person names, organization names, location names, …), 15 million dates, and 0.8 million quotations.
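As a starting point for building on the monthly archives, the following is a minimal sketch that counts the entries in each RSS/XML file of one zip file. It assumes that the archive members end in `.xml` and that each article is a standard, non-namespaced RSS `item` element. The dataset description does not confirm the internal layout, so both assumptions should be checked against a real file. The function name `count_items_per_file` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET


def count_items_per_file(zip_source):
    """Return {member name: number of <item> elements} for each XML member.

    zip_source is a path or a binary file-like object holding a zip archive.
    Assumption: articles are plain RSS <item> elements without a namespace.
    """
    counts = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            # Assumption: metadata files carry an .xml extension.
            if not name.lower().endswith(".xml"):
                continue
            with archive.open(name) as handle:
                n = 0
                # Stream the file so that large monthly dumps are not
                # loaded into memory as a single tree.
                for _event, elem in ET.iterparse(handle, events=("end",)):
                    if elem.tag == "item":
                        n += 1
                        elem.clear()  # drop the item's children once counted
                counts[name] = n
    return counts
```

Calling `count_items_per_file` with the path to one downloaded monthly archive returns a dictionary of per-file counts. Streaming with `iterparse` is chosen here because a single month can hold several million articles.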
The information processed by MediSys is derived from the Europe Media Monitor (EMM). EMM is a freely accessible, fully automatic system that analyses online media. It gathers and aggregates about 300,000 news articles per day from news portals worldwide in up to 80 languages.
- Marco Verile
How to cite
Jacquet, Guillaume; Verile, Marco (2020): COVID-19 news monitoring with Medical Information System (Medisys). European Commission, Joint Research Centre (JRC) [Dataset] PID: http://data.europa.eu/89h/bd2f71e7-0551-4f57-8e82-fcfca8c1a462