A Translation Evaluation Function based on Neural Network

Ameur Douib, Kamel Smaïli, David Langlois

Abstract
In this paper, we study the feasibility of using a neural network to
learn a fitness function for GAMaT, a machine translation system based on a
genetic algorithm. The neural network is trained on features extracted
from pairs of source sentences and their translations. The fitness function is
trained to estimate the BLEU score of a translation as precisely as possible.
The estimator was trained on a corpus of more than 1.3 million data points. The
performance is very promising: the difference between the real BLEU and the
one given by the estimator is 0.12 in terms of Mean Absolute Error.
Keywords: Statistical Machine Translation, Genetic algorithm, Quality estimation, Neural network
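
The abstract reports the estimator's quality as a Mean Absolute Error (MAE) between the real BLEU score and the estimated one. As a minimal sketch of how that metric is computed, the snippet below averages the absolute differences over a handful of translations. The function name `mean_absolute_error` and the score lists are hypothetical and chosen only for illustration; they are not taken from the paper or its data.

```python
from typing import Sequence


def mean_absolute_error(real_bleu: Sequence[float],
                        estimated_bleu: Sequence[float]) -> float:
    """Average absolute gap between real and estimated BLEU scores."""
    if len(real_bleu) != len(estimated_bleu):
        raise ValueError("both score lists must have the same length")
    if not real_bleu:
        raise ValueError("at least one score pair is required")
    total = sum(abs(r - e) for r, e in zip(real_bleu, estimated_bleu))
    return total / len(real_bleu)


# Hypothetical scores for four translations (illustrative values only).
real = [0.30, 0.45, 0.10, 0.60]
estimated = [0.25, 0.50, 0.20, 0.40]

mae = mean_absolute_error(real, estimated)
print(f"MAE = {mae:.2f}")
```

A lower MAE means the estimator tracks the real BLEU more closely. MAE weights every error linearly, so a few large misses do not dominate the average the way they would under a squared-error metric.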