HiPHET: A Hybrid Approach to Translate Code Mixed Language (Hinglish) to Pure Languages (Hindi and English)

Shree Harsh Attri, T.V. Prasad, G. Ramakrishna

Abstract


Bilingual code-mixed (hybrid) languages have become very popular in India as a result of the spread of Western technology in the form of television, the Internet and social media. With the growing use of code-mixed languages in day-to-day communication, a need has arisen to maintain the integrity of Indian languages. To address this need, a tool named Hinglish to Pure Hindi and English Translator (HiPHET) was developed. The tool translates in three ways: Hinglish to pure Hindi and pure English, pure Hindi to pure English, and pure English to pure Hindi. For input sentences in Hinglish, the tool achieved an accuracy of 91% for Hindi output sentences and 84% for English output sentences. The paper also compares the tool with another similar tool.

Keywords


Code Mixed Language, Pure Language, Hinglish, Hybrid Language, Machine Translation


DOI: https://doi.org/10.7494/csci.2020.21.3.3624
