A modular open-source focused crawler for mining monolingual and bilingual corpora from the web, Proceedings of the 6th Workshop on Building and Using Comparable Corpora, pp. 43-51, 2013.
PaCo2: a fully automated tool for gathering parallel corpora from the Web, Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), 2012.
Dirt cheap Web-scale parallel text from the Common Crawl, Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, pp. 1374-1383, 2013.