Knowledge Graphs Effectiveness in Neural Machine Translation Improvement

Benyamin Ahmadnia, Bonnie J. Dorr, Parisa Kordjamshidi


Maintaining semantic relations between words during the translation process yields more accurate target-language output from Neural Machine Translation (NMT). Although difficult to achieve from training data alone, it is possible to leverage Knowledge Graphs (KGs) to retain source-language semantic relations in the corresponding target-language translation. The core idea is to use KG entity relations as embedding constraints to improve the mapping from source to target. This paper describes two embedding constraints, both of which employ Entity Linking (EL)---assigning a unique identity to entities---to associate words in training sentences with those in the KG: (1) a monolingual embedding constraint that supports an enhanced semantic representation of the source words through access to relations between entities in a KG; and (2) a bilingual embedding constraint that forces entity relations in the source language to be carried over to the corresponding entities in the target-language translation. The method is evaluated for English-Spanish translation, exploiting Freebase as a source of knowledge. Our experimental results show that exploiting KG information not only decreases the number of unknown words in the translation but also improves translation quality.
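The embedding-constraint idea described above can be illustrated with a toy example. The snippet below is a minimal, hypothetical sketch of a monolingual constraint in a TransE style (head + relation ≈ tail): entity-linked source words whose entities are related in the KG are nudged toward satisfying the relation during training. All names, shapes, and the specific loss are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

# Toy source-side word embeddings; in a real NMT system these are the
# trainable embeddings of entity-linked words.
emb = {w: rng.normal(size=dim) for w in ["paris", "france", "madrid", "spain"]}
rel = {"capital_of": rng.normal(size=dim)}

# KG triples associated with training words via entity linking.
triples = [("paris", "capital_of", "france"),
           ("madrid", "capital_of", "spain")]

def kg_constraint_loss(emb, rel, triples):
    """Sum of squared TransE residuals ||h + r - t||^2 over linked triples."""
    loss = 0.0
    for h, r, t in triples:
        residual = emb[h] + rel[r] - emb[t]
        loss += float(residual @ residual)
    return loss

# One gradient step on the constraint alone; in practice this term would be
# weighted and added to the NMT cross-entropy objective.
lr = 0.01
for h, r, t in triples:
    grad = 2.0 * (emb[h] + rel[r] - emb[t])  # d loss / d emb[h]
    emb[h] -= lr * grad                       # pull head toward (t - r)
    emb[t] += lr * grad                       # pull tail toward (h + r)

print(kg_constraint_loss(emb, rel, triples))
```

Each step shrinks the residual of every linked triple, so related entities end up with embeddings that respect the KG relation; the bilingual constraint applies the same pressure across the source and target embedding spaces.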


Keywords: natural language processing, knowledge graph, machine translation






