Coherent Music Composition with Efficient Deep Learning Architectures and Processes
ABSTRACT
In recent years, significant advances in
music-generating deep learning models and neural networks have revolutionized
the process of composing harmonically coherent music. One notable innovation is
the Music Transformer, a neural network that tracks relationships across
sequential input to generate musical context. By leveraging transformer-based
frameworks designed for sequential tasks and long-range dependencies, the
Music Transformer captures self-reference through attention and learns to continue
musical themes. This attention-based model offers the advantage
of being easily trainable and capable of generating musical performances with long-term
structure, as demonstrated by Google Brain’s implementation. In this study, I
will explore various instances and applications of the Music Transformer,
highlighting its ability to efficiently generate symbolic musical structures
(see the attention sketch following this abstract).
Additionally, I will delve into another state-of-the-art model called TonicNet,
which features a layered architecture combining GRU and self-attention mechanisms
(a sketch of such a stack also appears below). TonicNet is particularly strong at
generating music with long-term structure, as evidenced by its superior performance
on both objective metrics and in subjective evaluations. To further improve TonicNet,
I will evaluate its performance using the same metrics and propose modifications to
its hyperparameters, architecture, and dataset.
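To make the attention mechanism concrete, the following is a minimal sketch, assuming PyTorch, of masked self-attention over a sequence of symbolic-music event tokens. It uses standard (not relative) attention, and the vocabulary size, model width, and head count are illustrative placeholders rather than the Music Transformer's published configuration.

```python
# A minimal sketch (not the paper's implementation) of causal self-attention
# over symbolic-music tokens; all sizes below are illustrative assumptions.
import torch
import torch.nn as nn

class TinyMusicAttention(nn.Module):
    def __init__(self, vocab_size=388, d_model=256, n_heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer event ids (e.g. note-on, note-off, time-shift)
        x = self.embed(tokens)
        seq_len = tokens.size(1)
        # Causal mask: each position attends only to earlier events, so the
        # model learns to continue a theme from its own preceding context.
        mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
        x, _ = self.attn(x, x, x, attn_mask=mask)
        return self.out(x)  # logits over the next event at every position

# Example: score a random batch of 64 events from a hypothetical 388-token vocabulary.
logits = TinyMusicAttention()(torch.randint(0, 388, (1, 64)))
```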
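Similarly, the sketch below illustrates, under my own assumptions, one way a GRU layer can be combined with self-attention in the layered fashion attributed to TonicNet above; the layer ordering, dimensions, and vocabulary are hypothetical and are not taken from the published TonicNet architecture.

```python
# A rough sketch of a GRU + self-attention stack; layer order and sizes are
# illustrative assumptions, not the published TonicNet configuration.
import torch
import torch.nn as nn

class GRUAttentionStack(nn.Module):
    def __init__(self, vocab_size=100, d_model=256, n_heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.gru = nn.GRU(d_model, d_model, batch_first=True)        # order-sensitive recurrent pass
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        x = self.embed(tokens)                          # (batch, seq_len, d_model)
        h, _ = self.gru(x)                              # local context from the GRU
        seq_len = tokens.size(1)
        causal = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
        a, _ = self.attn(h, h, h, attn_mask=causal)     # long-range context from self-attention
        return self.out(self.norm(h + a))               # residual mix of recurrent and attended features

# Example forward pass over a random 32-step token sequence.
logits = GRUAttentionStack()(torch.randint(0, 100, (1, 32)))
```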