Alina Karakanta

HLT-MT

I received my PhD on the topic of Automatic Subtitling from the University of Trento and the Human Language Technologies-Machine Translation (HLT-MT) group at FBK. I am working on developing novel methods for translating audiovisual content, focusing on Speech Translation for subtitling and dubbing. My recent research aims at devising evaluation methodologies for automatic subtitling. I am the recipient of an EAMT 2020 Student Grant on the topic “Towards a methodology for evaluating automatic subtitling”.

My research interests also cover corpus-based translation and interpreting studies and constrained translation. I have been a co-organiser of the Workshop on Technologies for MT of Low Resource Languages (LoResMT). Previously, I was a research assistant at the Department of Language Science and Technology, Saarland University, for the DFG-funded project “Modelling human translation with a noisy channel”. I am also a professional translator and post-editor. I received my MSc in Computational Linguistics from Saarland University and BAs in Translation Studies and Interpreting Studies from the Ionian University.

Publications

Audiovisual Translation

  • Karakanta, Alina (2022). “Experimental research in automatic subtitling: At the crossroads between Machine Translation and Audiovisual Translation”. In: Translation Spaces. DOI: https://doi.org/10.1075/ts.21021.kar.
  • Karakanta, Alina and David Orrego-Carmona (in press). “Subtitling in Transition: the case of TED talks”. In: Translation in Transition: Bridging Human and Machine Intelligence. American Translators Association series. John Benjamins.
  • Karakanta, Alina, Luisa Bentivogli, Mauro Cettolo, Matteo Negri, and Marco Turchi (2022). “Post-editing in Automatic Subtitling: A Subtitlers’ perspective”. In: Proceedings of the 23rd Annual Conference of the European Association for Machine Translation. Ghent, Belgium: European Association for Machine Translation, pp. 259–268. URL: https://aclanthology.org/2022.eamt-1.29.
  • Karakanta, Alina, Luisa Bentivogli, Mauro Cettolo, Matteo Negri, and Marco Turchi (2022). “Towards a methodology for evaluating automatic subtitling”. In: Proceedings of the 23rd Annual Conference of the European Association for Machine Translation. Ghent, Belgium: European Association for Machine Translation, pp. 333–334. URL: https://aclanthology.org/2022.eamt-1.57.
  • Bentivogli, Luisa, Mauro Cettolo, Marco Gaido, Alina Karakanta, Matteo Negri, and Marco Turchi (2022). “Extending the MuST-C Corpus for a Comparative Evaluation of Speech Translation Technology”. In: Proceedings of the 23rd Annual Conference of the European Association for Machine Translation. Ghent, Belgium: European Association for Machine Translation, pp. 359–360. URL: https://aclanthology.org/2022.eamt-1.70.
  • Karakanta, Alina, François Buet, Mauro Cettolo, and François Yvon (2022). “Evaluating Subtitle Segmentation for End-to-end Generation Systems”. In: Proceedings of LREC. Marseille, France.
  • Alina Karakanta, Sara Papi, Matteo Negri, Marco Turchi. (2021). Simultaneous Speech Translation for Live Subtitling: from Delay to Display. In: Proceedings of the 1st Workshop on Automatic Spoken Language Translation in Real-World Settings (ASLTRW). Virtual. August 2021. Association for Machine Translation in the Americas, pp. 35-48.
  • Alina Karakanta, Marco Gaido, Matteo Negri, Marco Turchi. (2021). Between Flexibility and Consistency: Joint Generation of Captions and Subtitles. In: Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021). Bangkok, Thailand (online): Association for Computational Linguistics, pp. 215–225.
  • Luisa Bentivogli, Mauro Cettolo, Marco Gaido, Alina Karakanta, Alberto Martinelli, Matteo Negri, Marco Turchi. (2021). Cascade versus Direct Speech Translation: Do the Differences Still Make a Difference? In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Online. August 2021. Association for Computational Linguistics, pp. 2873–2887.
  • Erick García Chávez and Alina Karakanta. (2021). A corpus-based comparison of prosodic features in on/off-screen dubbing. In Castagnoli, S., S. Bernardini, A. Ferraresi, M. Miličević Petrović (eds) Using Corpora in Contrastive and Translation Studies Conference (6th Edition). Bertinoro (Italy), 9-11 September 2021.
  • Alina Karakanta, Supratik Bhattacharya, Timo Baumann, Shravan Nayak, Matteo Negri and Marco Turchi. (2020). The Two Shades of Dubbing in Neural Machine Translation. In: Proceedings of the 28th International Conference on Computational Linguistics. Barcelona, Spain (Online): International Committee on Computational Linguistics, pp. 4327–4333.
  • Shravan Nayak, Timo Baumann, Supratik Bhattacharya, Alina Karakanta, Matteo Negri, Marco Turchi. (2020). See me speaking? Differentiating on whether words are spoken on screen or off to optimize machine dubbing. In Companion Publication of the 2020 International Conference on Multimodal Interaction (ICMI ’20 Companion), October 25–29, 2020, Virtual event, Netherlands. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3395035.3425640
  • Karakanta, Alina, Matteo Negri and Marco Turchi (2020a). “Point Break: Surfing Heterogeneous Data for Subtitle Segmentation”. In: Seventh Italian Conference on Computational Linguistics, CLiC-It. Bologna, Italy.
  • Alina Karakanta. (2020). Subtitling in Transition: the case of TED talks. In Book of Abstracts: Translation in Transition (TT5). Kent State University. October 15-17.
  • Alina Karakanta, Matteo Negri, and Marco Turchi. (2020b). Is 42 the Answer to Everything in Subtitling-oriented Speech Translation? In Proceedings of the 17th International Conference on Spoken Language Translation, pages 209–219, Online, July.
  • Alina Karakanta, Matteo Negri, and Marco Turchi. (2020d). Towards Automatic Subtitling: Assessing the Quality of Old and New Resources. In IJCoL – Italian Journal of Computational Linguistics, vol. 6, n. 1, June 2020.
  • Alina Karakanta, Matteo Negri and Marco Turchi. (2020c). MuST-Cinema: a Speech-to-Subtitles Corpus. In Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020), Marseille, France, May 13-15 2020.
  • Alina Karakanta, Matteo Negri, Marco Turchi. (2019). Are Subtitling Corpora Really Subtitle-like? In Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-It), November 2019, Bari, Italy.

Translation studies

  • Elke Teich, Jose Martinez Martinez and Alina Karakanta. (2021). Translation, Information Theory, and Cognition. In: The Routledge Handbook of Translation and Cognition. Alves, F. (Ed.), Jakobsen, A. L. (Ed.). London: Routledge, https://doi.org/10.4324/9781315178127
  • Alina Karakanta, Heike Przybyl and Elke Teich. (2021). Exploring variation in translation with probabilistic language models. In: Corpora in Translation and Contrastive Research in the Digital Age: Recent advances and explorations. Julia Lavid-López, Carmen Maíz-Arévalo and Juan Rafael Zamorano-Mansilla (eds). Benjamins Translation Library 158, https://doi.org/10.1075/btl.158.12kar, pp. 308–323.
  • Alina Karakanta, Katrin Menzel, Heike Przybyl, and Elke Teich (2019). Detecting linguistic variation in translated vs. interpreted texts using relative entropy. In: Empirical Investigations in the Forms of Mediated Discourse at the European Parliament, Thematic Session at the 49th Poznan Linguistic Meeting (PLM2019). Poznan.
  • Alina Karakanta and Elke Teich (2019). “Detecting and analysing translationese with probabilistic language models”. In: Book of Abstracts: Translation in Transition 4 (TT4). Universitat Pompeu Fabra, Barcelona, pp. 38–39.
  • Alina Karakanta, Mihaela Vela, Elke Teich. (2018). EuroParl-UdS: Preserving and Extending Metadata in Parliamentary Debates. ParlaCLARIN: Creating and Using Parliamentary Corpora. In Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018).

Low-resource

  • Alina Karakanta, Atul Kr. Ojha, Chao-Hong Liu, Jade Abbott, John Ortega, Jonathan Washington, Nathaniel Oco, Surafel Melaku Lakew, Tommi A Pirinen, Valentin Malykh, Varvara Logacheva, and Xiaobing Zhao, eds. (2020). Proceedings of the 3rd Workshop on Technologies for MT of Low Resource Languages. Suzhou, China: Association for Computational Linguistics.
  • Ojha, Atul Kr., Valentin Malykh, Alina Karakanta, and Chao-Hong Liu (2020). “Findings of the LoResMT 2020 Shared Task on Zero-Shot for Low-Resource languages”. In: Proceedings of the 3rd Workshop on Technologies for MT of Low Resource Languages. Suzhou, China: Association for Computational Linguistics, pp. 33–37.
  • Surafel M. Lakew, Alina Karakanta, Marcello Federico, Matteo Negri, Marco Turchi. (2019). Adapting Multilingual Neural Machine Translation to Unseen Languages. In Proceedings of the 16th International Workshop on Spoken Language Translation (IWSLT), November, 2019.
  • Karakanta, Alina, Atul Kr. Ojha, Chao-Hong Liu, Jonathan Washington, Nathaniel Oco, Surafel Melaku Lakew, Valentin Malykh, and Xiaobing Zhao, eds. (2019). Proceedings of the 2nd Workshop on Technologies for MT of Low Resource Languages. Dublin, Ireland: European Association for Machine Translation.
  • Alina Karakanta, Jon Dehdari, Josef van Genabith. (2018). Neural machine translation for low-resource languages without parallel corpora. Machine Translation, pp. 1-23.
  • Anna Currey, Alina Karakanta, Jon Dehdari. (2016). Using related languages to enhance statistical language models. In Proceedings of the NAACL Student Research Workshop, pp. 116-123.