The Role of Music Theory in Stage Performance for Vocal Singing Based on the Yuelun Zhushu
1. Introduction
Chinese folk music originates from life, transcends it, and continually evolves within it. As one of the foundational elements of traditional Chinese music, Chinese folk songs are known for their beautiful melodic lines and sincere emotional expression. As noted by Li Manxia in her study, Chinese folk music is both a reflection of life and an artistic elevation beyond it (Li & Zou, 2021). As a multi-ethnic nation with fifty-six ethnic groups, China’s folk songs are a distinct and vital part of its musical heritage, contributing a unique beauty to the “garden” of vocal music. General Secretary Xi Jinping has emphasized that ethnic culture is a unique mark distinguishing one ethnic group from another, advocating for the exploration and promotion of China’s excellent traditional culture. In the context of ethnic culture, both material and spiritual aspects are integral. Prominent vocal educator Jin Tielin has repeatedly asserted in his writings that only what is ethnic can be truly universal. Similarly, Professor Shi Lin, a renowned vocal educator at the Shanghai Conservatory of Music, has always adhered to the teaching principle that ethnicity is of paramount importance. For vocal learners, the challenge lies in utilizing vocal techniques to express folk songs as fully as possible. This is a question the author frequently contemplates. The Yuelun Zhushu (Commentary and Annotations on the Treatise of Music) is a 16th-century Tibetan work on music theory, where the Ming scholar Awang Gongga Sonam Zhabachen provided detailed annotations and explanations of Yuelun. In addition to clarifying Tibetan music theory, Awang Gongga Sonam Zhabachen also discussed new musical concepts, such as ancient Tibetan instrumental performance techniques. Therefore, the author believes that Yuelun Zhushu has played a guiding role in both the theoretical and practical aspects of Tibetan music.
The author believes that when performing regional folk songs, attention should not only be focused on vocal techniques but also on the distinctive ethnic pronunciations within the folk songs. In the traditional Tibetan music text Yuelun Zhushu, it is recorded that formerly, singers were proficient in the lyrics; today, people focus more on practicing the melody (Gongga Sonam Zhabajianzan & Zhao, 1993). This suggests that while mastery of technical vocal skills is important, the ethnic and musical characteristics inherent in folk songs are even more crucial. If the ethnic musicality of a folk song is lost, its unique qualities—often regarded as the “soul” of the folk song—will be diminished in performance. Despite the significance of Yuelun Zhushu, scholarly attention to it remains limited. This paper aims to draw attention to the value of Yuelun Zhushu, hoping to inspire further research and recognition of its importance.
Expressing Emotion through Regional Vocal Characteristics: Interpreting the Work
Geography and landforms, as material entities, constrain and influence music, which as an ideological form, naturally reflects various regional landscapes and folk customs (Qiao, 2009). This explanation by Qiao Jianzhong highlights the reciprocal relationship between geography, landforms, and folk music. In other words, geographic features directly impact the regional characteristics of musical forms within an ethnic group.
For example, the Ganzi Tibetan Autonomous Prefecture in Sichuan is the birthplace of “Kamba culture”. Located in the “Kang-Tibet Plateau”, the average elevation of Ganzi is above 5000 meters. The local environment, living conditions, and other factors have subtly influenced the aesthetic preferences of the residents regarding their singing. These influences have shaped unique musical concepts, contributing to the distinct musical identity of the region.
In the Yuelun Zhushu, it is described that the people of the Central Tibet region have bright and melodious voices, resembling the sounds of roosters and bees; the people of the Western Tibet region have bright, rasping voices, resembling those of cocks and horses; the people of the Ali region have strained and short voices; the people of the Kamba region have bold and upright voices, which sound majestic and rugged (Gongga Sonam Zhabajianzan & Zhao, 1993). This passage, translated by Zhao (1989), describes the distinctive vocal characteristics of the people from different regions of the Kang-Tibet Plateau. By examining these descriptions and considering the geographical features of the Kamba region, one can better understand the unique style of singing in Kamba Tibetan folk songs. This knowledge can assist singers in quickly mastering the musical style when performing songs from this region. The renowned vocal educator Ma Qiuhua has emphasized the importance of nurturing the individuality of ethnic minority vocal learners and the integration of musical cultures. While focusing on developing vocal techniques, she also strives to preserve the regional musical characteristics of ethnic minority styles. She guides students to grasp the stylistic nuances of the songs they perform. Figure 1 shows a mountain song from the Ganzi region, which depicts a traveler who, forced to leave their homeland, nostalgically recalls their hometown while far away.
Figure 1. Lyrics Summary:
A bird flies from the mountain cliff towards the distant horizon,
Soaring elsewhere, but its heart remains by the cliff.
The reason for its journey is the strong winds from its homeland,
For it has no choice but to head far away.
Through the analysis of the musical score, the author observes that in the first measure, the use of the fermata (a symbol indicating a sustained note) effectively evokes the vastness of the grasslands and the towering cliffs. In the final measure, the use of the grace note vocal technique illustrates the sense of reluctance and longing for one’s homeland. Additionally, the frequent use of appoggiaturas in the song reflects the bird’s freedom as it soars through the blue sky. However, as it heads towards the distant horizon, the bird frequently looks back, symbolizing its sorrow at leaving its homeland. The author suggests that while singing this folk song, there is no need to convey a “majestic” style; instead, singers should seek a “rugged” quality in their voices. This approach will better express the hardships and emotional depth of the traveler, enhancing the portrayal of the young traveler’s longing for home and the difficulties of being far from it.
Lhasa, with its unique geographical location and beautiful landscapes, is a city enriched with profound cultural connotations. Situated in the southwest of China, in the central part of the Qinghai-Tibet Plateau, on the northern side of the Himalayas, and along the middle reaches of the Lhasa River, Lhasa is often referred to as the “capital of Tibet” or “Central Tibet”. Due to its high altitude and long periods of sunlight, Lhasa is also known as the “City of Sunlight”. The musical culture of Lhasa is rich and has a long history, shaped by Tibetan traditional culture, including regional culture, religious practices, dance, painting, sculpture, and folk literature. As a result, a distinctive regional musical style has developed in this area.
The folk songs of the Lhasa region are also referred to as Nangma. Jibunima (Figure 2) is a typical example of a Lhasa-style folk song. Nangma is a traditional Tibetan song and dance style, named after its performances at the Nangma Kang within the Potala Palace in Lhasa, which is the inner palace where the Dalai Lama resides. The term Nang in Tibetan means “inside” or “interior”, reflecting its association with the palace’s private spaces (Chang, 2005). Upon examining the musical score (Figure 2), it is evident that the song begins with a 15-measure prelude, and in the two measures following the prelude, there are tempo markings indicating a slowdown. Additionally, there are numerous vocal filler words such as “ah” and “la” in the lyrics. This aligns with the definition of Nangma as described by the renowned vocal educator Chang Liuzhu from the Shanghai Conservatory of Music. Chang (2005) defines Nangma music as a typical harmonic folk song… the song begins with a fixed 10-measure prelude, followed by a slow section of the musical composition… decorative filler words such as “ha” and “la” are added within the musical phrases… This is a distinctive style of Nangma. From Mr. Chang’s description of Nangma, it is clear that the musical structure of Nangma is generally fixed. Originally, Nangma music served as court music for Tibetan aristocracy, requiring advanced performance techniques and adhering to strict musical conventions.
Figure 2. Lyrics Summary:
The happy sun has risen; We are so warm;
The snow-white moon has risen,
Chasing away the darkness.
As shown in Figure 2, the song describes how happiness shines on people’s faces like the sun. The warm, red sun brings comfort, while the pure white moon rises in the sky, driving away the darkness and cold. The song conveys the joyful mood of the people through bright and powerful vocals.
In the Yuelun Zhushu, translated by Zhao Kang, there is a description of the vocal style of the people from the Central Tibet region: “…the people of Central Tibet have bright and melodious voices, resembling the sounds of roosters and bees.” This passage helps guide our understanding when performing folk songs from the Central Tibet region (including Lhasa). It has a significant influence on how we select and use vocal timbre during performances.
2. Interpreting the Work through Stage Physical Movements
In vocal stage performances, body language, also referred to as gestural language, plays a crucial role. An article published in The Art of Piano offers a detailed definition of gestural language. Gestural language refers to the form of communication in which individuals use bodily movements to convey information, express intentions, and communicate emotions. It is characterized by the non-semantic organizational structure of bodily movements, which directly corresponds to human emotions and volitional activities, and is thus considered a non-verbal form of communication (Zhang, 2003). In terms of the process of information transmission, gestural language has immediate temporal and spatial properties, facilitating rapid communication of information. Regarding its capacity, due to its non-semantic nature, gestural language contains fewer fixed meanings but more uncertainties in its extension, which implies that it carries more “meaning beyond words”. For music, which also possesses non-verbal characteristics, the interpretation through gestural language is both detailed and convincing. As for its scope of use, gestural language is primarily applied to express emotions (Zhang, 2003).
In the Yuelun Zhushu, it is recorded that inner emotions can also be expressed through external forms. How can a singer transmit the inner emotional experience from the “inside” to the “outside”? While the voice conveys emotion, the singer’s body language also plays a crucial role in expressing the emotional depth and intent of the song. By effectively using body language, the performer can enhance the stage presence, making the performance more vivid, realistic, and emotionally impactful. As noted in the Tongdian: Music Preface, A dancer, when singing is insufficient, therefore, uses hand movements, foot movements, and changes in their expression to symbolize the act, and this is called music (Wu et al., 2011). The author believes that body movements serve as an extension of the inner emotions. When voice alone is not sufficient to fully express the emotional outpouring of the heart, body movements play a vital role, which is exactly the point made in Tongdian: Music Preface. Additionally, the Yuelun Zhushu also emphasizes the importance of body movements in performances, stating that the movements should align with the rhythm of the steps... and make everyone feel beautiful. The movements in a performance must be in harmony with the emotional trajectory of the music. These actions should reflect the performer’s true emotions, ensuring that the audience does not sense any dissonance or artificiality in the performance. Only when the emotions are expressed naturally will the inner feelings be conveyed authentically to the audience.
In the practice of vocal learning and stage performance, the author has categorized effective stage movements into two main aspects: facial movements and bodily movements.
1) Facial Movements
The Yuelun records: When singing a hymn, the face should display a joyful expression; when singing a song of criticism or resistance, a commanding demeanor is required; when singing a song of wisdom and cause, an expression of understanding should be shown; when singing a song of admonition, a warning look should be conveyed; when singing a song of repentance, a sorrowful expression should appear; and when singing a song of joy, a cheerful expression should be evident (Zhao, 1989).
In vocal stage performance, facial movements are especially important. When performers wish to express sadness and melancholy, they can convey these emotions through a sorrowful facial expression (Ma, 2024). It is well known that facial expressions are a crucial means by which people communicate their inner emotions and state of mind. Performers use facial expressions, gestures, and body movements to enhance the emotional impact of their stage presence, conveying the feelings intended by the vocal work. Furthermore, during singing, singers can regulate their vocal technique by controlling their facial expressions. Sometimes, when your voice is not in good condition… applying the right facial expression can help mask this. In vocal technique, methods such as “lifting the smile muscles” or “yawning” can assist singers in opening their throats more effectively. Over time, these techniques have been referred to as lifting the eyebrows and brightening the eyes or raising the eyebrows and sharpening the gaze. Additionally, from the perspective of vocal psychology, a “passive” smile can also help singers regulate their mental state during a performance.
Renowned opera singer and music educator Licia Albanese once shared her insights on vocal performance: “An artist should have nine lives like a cat—one for how to use both hands, one for how to move and perform on stage, one for physical movements, one for facial expressions, one for memorizing lines, one for musical literacy, and others.” From this, we can observe that Albanese extensively describes the importance of stage movements in her nine categories of music performance. Thus, we can conclude that bodily movements are vital to stage performance, and facial movements play a significant role in this context.
The saying “the eyes are the window to the soul” is widely known, and on stage, eye contact plays an essential role in vocal performance. Noted vocal educator Jin Tielin once emphasized that eye contact should have both a beginning and an end… the eyes must never wander away from the music. Through eye contact, performers can establish a closer connection with the audience, enhancing the emotional expressiveness of the performance (Zhang, 2024). During a performance, exchanging eye contact with the audience can help bridge the gap between performer and audience, intensifying the emotional impact of the music.
2) Bodily Movements
Music that accompanies movements, such as dancing to the rhythm of the steps, can only be called “music in harmony with movement”, and it allows everyone to feel the beauty. In the Yuelun Zhushu, the author refers to music that accompanies the movements of the feet and hands as music in harmony with movement (Zhao, 1989). The Yuelun Zhushu outlines the connection between bodily rhythms and the audience’s perception, highlighting the crucial role of bodily movements in conveying the musical style during vocal performances. These movements must align with the emotional expression of the work and the natural expression of the performer, while simultaneously supporting the overall logical flow of the music. This concept is also emphasized in Yuelun by Sakyabhandi Gonjang Gendzin, which states, “…they are called music in harmony with movement, as this conforms to the common sense of the world.” In stage performances, bodily movements can better engage the audience’s emotions. During the performance, appropriate physical gestures can be used to express the intense emotions of the song.
In vocal performances and other artistic expressions, ancient Chinese classical poetry has long recorded bodily movements. The Tang dynasty poet Li Bai, in his ancient poem Climbing the Mountain on the Ninth Day, wrote: “Chanting songs to offer clear wine, rising to dance in disorderly steps.” In the Dance Fu by Ping Lie, a censor during the reign of Emperor Xuanzong in the Tang dynasty, it is recorded: “…Gathering at the Eastern Pavilion of the Sage, the banquet continued in the Western Garden. ...Stepping lightly as if on ripples, trailing the thin thread of a cicada’s flight. Concealing long sleeves, I hum slowly, then suddenly spring up to dance.” These poetic verses vividly depict the Tang people’s customs of drinking, dancing, and chanting songs, and further confirm that, during musical performances, bodily movements are used to express the content and emotions of the work.
3. Interpreting the Vocal Work through Aesthetic Musical Imagination
Throughout the vast course of human development, there has been a force that continually drives societal, civilizational, and technological progress. At the core of this powerful force lies the human capacity for thought. Through the power of thought, humanity has advanced from nothing to something, gradually evolving from primitive societies to modern civilizations. Professor Liao Jiahua of the School of Music at Anhui Normal University once stated in his article, “One of the key markers distinguishing humans from other life forms is the ability to think, to possess creative thought, and to conquer nature and transform the world through one’s exceptional creative labor. As we have reached the present era, this creativity has become an intrinsic force of human nature” (Liao, 1988).
Different musical works convey different emotions, and only by deeply understanding the intended message of a song can one interpret the piece more effectively. In the Yuelun, it is recorded, “Singing a love song is like being struck by Cupid’s arrow; singing a sacrificial song is like the joy of blooming flowers...” (Zhao, 1989). Love songs are generally meant to praise the beauty of love and express aspirations for the future, thus requiring the singer to adjust their mindset to one of optimism and positivity before performing. When singing songs related to sacrifices, the singer’s mindset needs to be serious and filled with reverence for the object of the sacrifice. Only then can the performance achieve true solemnity and reverence. In the Music Preface by Du You, it is stated, “Sound originates in the human heart; when the heart is sorrowful, the sound weakens; when the heart is at ease, the sound becomes harmonious” (Wu et al., 2011). This means that the performer’s emotions or character are reflected in the music they play or sing.
When performing vocal works, singers must quickly adjust their emotional and psychological states to match the performance. This requires cultivating the singer’s aesthetic imagination. Aesthetic imagination, also known as aesthetic imagination ability (Liu, 2023), is a crucial capability that enables the smooth unfolding of aesthetic activities. Through aesthetic imagination, the creation of the artistic conception of vocal works is also significantly shaped. The famous saying, “A thousand readers, a thousand Hamlets”, when applied to vocal performance, refers to the “second creation” and “third creation” often mentioned in vocal singing.
Regarding the psychological aspect of “re-creation” in music, the Gu Wu Lu from the late Qing dynasty provides insights into the connection between the content of a song and aesthetic creation in singing. It states, “Each piece of music has its own emotions, which refer to the narrative elements within the piece. Understanding these elements, knowing the underlying meaning of the words, and empathizing with the emotions will naturally lead to the right vocal expression. A sorrowful song will move the soul, and a joyful one will bring delight… Nowadays, many performers simply recite lyrics without understanding the meaning or emotional context of the song, performing mechanically without true connection. This is akin to a child reciting a lesson without understanding its content—this is a soulless performance, devoid of emotion.” To truly experience and understand the essence of the music, performers must immerse themselves in the world of the work, which allows the music to be filled with “spirit”. Such music is capable of moving the audience. A mechanical performance, without understanding the meaning of what is sung and the emotions it conveys, becomes indistinguishable from rote memorization, and the music will lack “spirit” and fail to express the intended emotions. Liu Xie, a literary figure from the Northern and Southern Dynasties, describes this vividly: “The impassioned performer strikes the beat against the rhythm… the keen observer is moved by the beauty of the melody, and those who love the unusual are startled by the surprising sound” (Liu, 1989). This suggests that the performer’s inner character and emotions influence the way the music is interpreted. Musical works themselves are intangible, but through the performer’s interpretation and “second creation”, combined with reasonable musical imagination, the work becomes more vivid.
Within the framework of the piece’s content and logical structure, strengthening the imagination allows the performer to provide a greater creative space for the second creation of the work. A performance filled with vivid imagery can more powerfully express the performer’s emotions toward the piece, deepening their understanding of the work. Through natural, smooth singing and stage performance, a closer connection with the audience can be achieved.
For example, the Mongolian folk song Norinjiya tells the story of a beautiful young woman from the Horchin Grassland who, after marrying and moving far away, longs for her hometown. When performing this piece, the singer must deeply imagine themselves as the protagonist, “Norinjiya”, and envision experiencing the same emotions as the character. By conveying these feelings “through the vocal cords” and “transmitting emotions through sound”, the performer not only expresses the emotional logic of the song but also incorporates their own understanding and individuality into the performance. This allows the common elements of the song to encompass the individuality of the performer, and vice versa, with individuality arising from the commonalities. The Yuelun Zhushu also summarizes various musical styles and refers to The Discerning Wisdom of Musical Styles in its description of these styles. Through this summary, different musical styles are further refined. By observing these various forms, one can better understand the characteristics of singing. Similarly, the application of different sounds can also be appropriately combined and used.
4. Conclusion
In summary, the author finds that the content in Yuelun Zhushu regarding vocal performance, vocal techniques, and related aspects offers valuable insights and critical reflection. It plays a positive role in the inheritance and development of minority ethnic vocal music. The influence and inspiration of minority ethnic music on vocal performance is self-evident. As a seminal work on vocal music with a profound historical context in minority ethnic regions, the musical and cultural significance of this text, along with its academic value, cannot be overlooked. Both Yuelun and Yuelun Zhushu serve as important theoretical works on Tibetan Buddhist music, providing valuable historical materials for the study of the art, integration of music and lyrics, and artistic practices of the Sakya regime in Tibet. Contemporary vocal learners, educators, and enthusiasts should engage in in-depth research from multiple perspectives, so as to gain inspiration and insights for both theory and practice.
Indeed, in the long history of Tibetan artistic development, the influence of Tibetan Buddhism has been continuous and enduring. Even today, Tibetan society and culture are almost inseparable from religion. Therefore, both Yuelun by Sakyabhandi and Yuelun Zhushu by Ami Sha contain theoretical elements related to Tibetan Buddhism, giving the reader an initial sense of the religious mystique. However, the author hopes that this article will encourage readers to engage with the Tibetan musicological texts Yuelun and Yuelun Zhushu from a broader range of musical research perspectives.