A Quantitative Evaluation of Neural Machine Translation Systems for English-Arabic Customs Legal Texts

16 Pages. Posted: 27 May 2026

Karam Damseh

Universiti Sains Malaysia (USM) - School of Languages, Literacies and Translation

Mozhgan Ghassemiazghandi

Universiti Sains Malaysia (USM) - School of Languages, Literacies and Translation

Date Written: May 18, 2026

Abstract

While neural machine translation has transformed cross-lingual communication, its efficacy in specialized, high-stakes domains such as legal translation remains underexplored. This study asks how different neural machine translation paradigms perform under the extreme syntactic and terminological constraints of customs law. Its primary aim is to quantitatively evaluate three prominent systems (Google Translate, Systran, and Tarjama Translator) for translating English customs legal texts into Arabic. A 1,000-segment domain-specific sample drawn from the Revised Kyoto Convention was assessed using the Cross-lingual Optimized Metric for Evaluation of Translation (COMET) in reference-based mode. Because each source segment was evaluated across all three systems, statistical inference was conducted using a within-segment repeated-measures analysis with multiplicity-adjusted post hoc pairwise comparisons. Results reveal a clear performance hierarchy: Google Translate achieved the highest mean score (87.70), followed closely by Tarjama Translator (87.51), while Systran scored significantly lower (84.48). Stratified analysis by sentence length showed that performance degraded consistently across all systems as syntactic complexity increased, with Systran exhibiting the steepest decline on long sentences. These findings establish an operational baseline for tool selection and inform policymakers about the readiness of neural machine translation for cross-border customs communication. Practitioners are advised to incorporate complexity-aware risk controls and human-in-the-loop verification when deploying these systems in legal workflows, to mitigate the risks posed by structurally flawed translations.
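The within-segment design described in the abstract can be illustrated with a minimal sketch. The abstract does not name the specific test or correction used, so the choices here are assumptions made purely for illustration: a paired sign-flip permutation test on per-segment score differences, followed by a Holm step-down adjustment across the three pairwise comparisons. The function names (`paired_sign_flip_pvalue`, `holm_adjust`, `compare_systems`) are hypothetical, and no data from the study is reproduced.

```python
import itertools
import random
import statistics


def paired_sign_flip_pvalue(a, b, n_resamples=2000, seed=0):
    """Two-sided sign-flip permutation test on per-segment differences.

    Assumption: this stands in for whatever paired test the authors used.
    """
    rng = random.Random(seed)
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(statistics.fmean(diffs))
    hits = 0
    for _ in range(n_resamples):
        # Under the null, the sign of each paired difference is arbitrary.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(statistics.fmean(flipped)) >= observed:
            hits += 1
    # Add-one smoothing keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_resamples + 1)


def holm_adjust(pvalues):
    """Holm step-down adjustment; returns adjusted p-values in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity of the adjusted p-values.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def compare_systems(scores):
    """scores: dict of system name -> per-segment scores in the same segment order.

    Returns a dict mapping each (system, system) pair to its Holm-adjusted p-value.
    """
    pairs = list(itertools.combinations(sorted(scores), 2))
    raw = [paired_sign_flip_pvalue(scores[a], scores[b]) for a, b in pairs]
    return dict(zip(pairs, holm_adjust(raw)))
```

Calling `compare_systems` with one list of segment-level scores per system returns an adjusted p-value for each of the three pairs. The same function could be applied separately to short, medium, and long segments to mirror the stratified analysis, although the length thresholds would have to come from the paper itself.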

Keywords: Automated Evaluation, Cross-lingual Optimized Metric for Evaluation of Translation, Legal Translation, Neural Machine Translation

Suggested Citation

Damseh, Karam and Ghassemiazghandi, Mozhgan, A Quantitative Evaluation of Neural Machine Translation Systems for English-Arabic Customs Legal Texts (May 18, 2026). AWEJ for Translation & Literary Studies, Volume 10, Number 2, May 2026. Available at SSRN: https://ssrn.com/abstract=6790418

Karam Damseh (Contact Author)

Universiti Sains Malaysia (USM) - School of Languages, Literacies and Translation

Malaysia

Mozhgan Ghassemiazghandi

Universiti Sains Malaysia (USM) - School of Languages, Literacies and Translation

Malaysia
