Processing Malaysian Indigenous Languages: A Focus on Phonology and Grammar


Malaysian indigenous languages belong to two entirely different families: Austronesian and Austroasiatic. The former comprises Malay and all the languages of Sabah and Sarawak, while the latter comprises the aboriginal languages found only in Peninsular Malaysia. Except for Malay and a few languages in Sabah and Sarawak, most of these languages have no written form: no writing system has been devised for them, even though quite a number have been described in terms of phonology, morphology and syntax. The available descriptions give a picture of their typologies and systems that can inform processing. In typology, the two families differ little in their phonemic inventories, but they differ in the phonological structure of the syllable and the word. In morphology, the Austronesian languages are agglutinative, while the Austroasiatic ones are isolating. They also differ in the syntactic status of the word: the Austronesian languages distinguish two categories, the full word and the particle, whereas the Austroasiatic languages have only the full word. This last difference leads to a divergence between the two families in the types of phrase, clause, and complex sentence. Natural language processing (NLP) is a methodology now being applied to the analysis of various aspects of language. This paper discusses the constraints that most Malaysian indigenous languages face in the application of this methodology.

Share and Cite:

Omar, A. (2014) Processing Malaysian Indigenous Languages: A Focus on Phonology and Grammar. Open Journal of Modern Linguistics, 4, 728-738. doi: 10.4236/ojml.2014.45063.

Conflicts of Interest

The authors declare no conflicts of interest.



Copyright © 2023 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.