ELRA resources – 15 new corpora (written) & 7 updated corpora

We are happy to announce that a new set of 15 Written Corpora is now available in our catalogue.

Arabic-English, Arabic-French, Chinese-English and Chinese-French Written Parallel Corpora:

This set of 15 written corpora was produced by ELDA within PEA TRAD, a project supported by the French Ministry of Defence (DGA). Available resources are listed below (click on the links for further details).

ELRA-W0098 TRAD Arabic-French Newspaper Parallel corpus – Test set 1 – ISLRN: 922-732-502-473-8

This is a parallel corpus of 10,000 words in Arabic and 4 reference translations in French. The…

[Resource] Corpus ‘Australia 2015/2016’

The corpus ‘Australia 2015/2016’ includes all articles from major Australian newspapers published from August 2015 to July 2016 that have the key term ‘Australia’ or ‘Australian(s)’ in the title. Altogether, the corpus contains over 7 million tokens in almost 13,000 articles from 18 newspapers. It thus reflects one year of print media coverage of topics directly relevant to Australia. The corpus and a list of word frequencies derived from it are available for download.
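As an illustration only, the title criterion and a word frequency count of the kind described above could be sketched as follows. This is not the procedure used to build the corpus: the regular expression, the function names `matches_title_criterion` and `word_frequencies`, the case-sensitive matching, the simple `\w+` tokenisation and the three toy articles are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumed pattern for the key terms 'Australia', 'Australian' and 'Australians'.
# Case-sensitive matching is an assumption; the announcement does not specify it.
TITLE_TERM = re.compile(r"\bAustralia(?:ns?)?\b")


def matches_title_criterion(title: str) -> bool:
    """Return True if the title contains one of the key terms as a whole word."""
    return TITLE_TERM.search(title) is not None


def word_frequencies(texts) -> Counter:
    """Count lower-cased word tokens across texts (naive \\w+ tokenisation)."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"\w+", text.lower()))
    return counts


# Invented toy articles, not taken from the corpus.
articles = [
    {"title": "Australia votes in federal election",
     "body": "Voters went to the polls. Voters waited."},
    {"title": "Local council approves new park",
     "body": "The park opens soon."},
    {"title": "Young Australians face housing squeeze",
     "body": "Housing costs rose."},
]

# Keep only articles whose title meets the criterion, then count their words.
selected = [a for a in articles if matches_title_criterion(a["title"])]
freqs = word_frequencies(a["body"] for a in selected)

for word, count in freqs.most_common(3):
    print(word, count)
```

A real pipeline would also need to settle details the announcement leaves open, such as how tokens are defined and whether matching ignores case.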