A set of multilingual lexica containing component parts of Temporal Expressions, i.e. phrases referring to Temporal Entities. They can be integrated in a Temporal Expression parsin...
Description
This collection contains multilingual linguistic data for Computational Linguistics R&D tasks such as Text Mining, Machine Translation (MT), Information Extraction (IE), Document Classification, and Cross-Lingual Information Access and Retrieval (CLIR). Other relevant keywords include Named Entity Recognition and Classification (NERC), Quotation Recognition, clustering, categorisation, terminology extraction, sentiment analysis, disambiguation, Multi-Document Summarisation, Statistical Machine Translation (SMT), and Neural Machine Translation (NMT). The related Europe Media Monitor (EMM) collection contains data derived mainly from monitoring online traditional media and social media; the Language Technology collection, by contrast, is based on other text types and was created by other means.
Datasets (3)
This dataset is the first ever publicly available annotated dataset for sentiment classification and semantic polarity dictionary for Georgian. We consider both three- (positive, n...
Data Set with Metadata from Conflict and Disaster Events
Additional information
- Published by: European Commission, Joint Research Centre
- Created date: 2019-06-24
- Modified date: 2022-04-28