Clause-Based Reordering Constraints to Improve Statistical Machine Translation
Abstract
We demonstrate that statistical machine translation (SMT) can be improved substantially by imposing clause-based reordering constraints during decoding. Our analysis of how different clause types behave when translated clause by clause shows that these constraints are beneficial for finite clauses but not for non-finite clauses. In English-Hindi translation experiments with an SMT system (DTM2), on a test corpus of around 850 sentences with manually annotated clause boundaries, BLEU improves from a baseline of 19.4 to 20.4. This statistically significant improvement is confirmed by subjective (human) evaluation. We also report preliminary work on automatically identifying the kinds of clause boundaries at which reordering constraints should be enforced.
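
To make the idea of a clause-based reordering constraint concrete, the following is a minimal sketch, not the DTM2 implementation. It assumes one common reading of such a constraint: once the decoder starts translating a word inside a clause, it must finish that clause before translating anything outside it. Words may still be reordered freely within a clause, and whole clauses may be moved relative to each other. The function name `respects_clause_constraints` and the span format (inclusive source-position pairs) are hypothetical and chosen only for illustration. Following the finding above, only finite clause spans would be passed in.

```python
from typing import List, Tuple


def respects_clause_constraints(
    order: List[int], clauses: List[Tuple[int, int]]
) -> bool:
    """Check a translation order against clause spans (illustrative sketch).

    order:   source word positions in the order the decoder translates them.
    clauses: inclusive (start, end) source spans of the clauses to enforce;
             this input format is an assumption, not taken from the paper.
    """
    for start, end in clauses:
        # Steps of the translation order at which this clause's words appear.
        steps = [i for i, pos in enumerate(order) if start <= pos <= end]
        if not steps:
            continue
        # The clause is translated as one block only if those steps form
        # an unbroken run with no outside word in between.
        if steps[-1] - steps[0] + 1 != len(steps):
            return False
    return True


# Two clauses covering source positions 0-2 and 3-5.
clauses = [(0, 2), (3, 5)]
print(respects_clause_constraints([2, 0, 1, 5, 3, 4], clauses))  # reordering inside clauses
print(respects_clause_constraints([0, 3, 1, 2, 4, 5], clauses))  # leaves clause 0-2 early
```

In a real decoder this check would be applied incrementally while hypotheses are extended, so that violating hypotheses are pruned early rather than filtered after the fact.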