RTE3-derived CLTE dataset

Cross-Lingual Textual Entailment (CLTE) is the task of identifying multi-directional semantic relations between two sentences, T1 and T2, written in different languages.

The RTE3-derived CLTE dataset consists of 1600 English-Spanish sentence pairs, aligned with the original dataset distributed for the Third Recognising Textual Entailment Challenge (RTE-3).

The Spanish sentences were obtained by manually translating the original English sentences.

To obtain the CLTE benchmark, please contact Matteo Negri (negri[at]fbk.eu).


Matteo Negri and Yashar Mehdad: Creating a Bi-lingual Entailment Corpus through Translations with Mechanical Turk: $100 for a 10-Day Rush. Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data With Amazon’s Mechanical Turk.

Contact us: Matteo Negri (negri[at]fbk.eu)