Mixed Music Analysis with Extended Specmurt


This paper introduces a method for analyzing mixed music signals based on an extension of specmurt analysis. Conventional specmurt can analyze a multi-pitch music signal only when it comes from a single instrument; it cannot handle a mixed signal in which several different types of instruments are played at the same time. To analyze such signals, we propose extended specmurt. We regard the observed spectrum extracted from the mixed music as the sum of the observed spectra of the individual instruments. Since the observed spectrum of a single instrument can be expressed as a convolution of its common harmonic structure and its fundamental frequency distribution, the mixed music has as many unknown fundamental frequency distributions as there are instruments. To obtain these unknown distributions, the relation among the observed spectrum, the common harmonic structures and the fundamental frequency distributions is rewritten in matrix form. We call the resulting equation extended specmurt; the matrix of unknown components is obtained by using a pseudo-inverse matrix. Experimental results show the effectiveness of the proposed method.
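The matrix formulation described above can be illustrated with a small numerical sketch. It is not the authors' implementation, only a toy instance of the idea under stated assumptions: a log-frequency axis with 12 bins per octave, two hypothetical instruments whose common harmonic structures (h_a, h_b) are known in advance and contain four harmonics each, and fundamental frequency candidates restricted to 12 bins so that the stacked system is overdetermined. All function and variable names are invented for illustration.

```python
import numpy as np

BINS_PER_OCTAVE = 12  # assumed log-frequency resolution (one bin per semitone)


def harmonic_pattern(weights):
    """Common harmonic structure on a log-frequency axis (illustrative).

    weights[n - 1] is the relative power of the n-th harmonic, which sits
    round(12 * log2(n)) bins above the fundamental.
    """
    offsets = [int(round(BINS_PER_OCTAVE * np.log2(n)))
               for n in range(1, len(weights) + 1)]
    h = np.zeros(offsets[-1] + 1)
    for off, w in zip(offsets, weights):
        h[off] += w
    return h


def convolution_matrix(h, n_f0):
    """Return H such that H @ u == np.convolve(h, u) for len(u) == n_f0."""
    H = np.zeros((len(h) + n_f0 - 1, n_f0))
    for j in range(n_f0):
        H[j:j + len(h), j] = h  # column j is h shifted down by j bins
    return H


n_f0 = 12  # number of fundamental frequency candidates (assumed)

# Assumed, known harmonic structures of two hypothetical instruments.
h_a = harmonic_pattern([1.0, 0.5, 0.3, 0.2])
h_b = harmonic_pattern([1.0, 0.9, 0.1, 0.4])

# Ground-truth fundamental frequency distributions (which notes are sounding).
u_a = np.zeros(n_f0)
u_a[[0, 4, 7]] = 1.0
u_b = np.zeros(n_f0)
u_b[[2, 9]] = 0.8

# Observed spectrum of the mixture: sum of the per-instrument convolutions.
v = np.convolve(h_a, u_a) + np.convolve(h_b, u_b)

# Matrix form v = [H_a H_b] [u_a; u_b], solved with a pseudo-inverse.
H = np.hstack([convolution_matrix(h_a, n_f0), convolution_matrix(h_b, n_f0)])
u_est = np.linalg.pinv(H) @ v
u_a_est, u_b_est = u_est[:n_f0], u_est[n_f0:]
```

In this noise-free toy case the stacked matrix has full column rank, so the pseudo-inverse recovers both distributions up to rounding error. With real spectra, noise and mismatched harmonic structures would make the result a least-squares estimate only.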

Share and Cite:

D. Nishimura, T. Nakashika, T. Takiguchi and Y. Ariki, "Mixed Music Analysis with Extended Specmurt," Journal of Software Engineering and Applications, Vol. 6 No. 5, 2013, pp. 274-279. doi: 10.4236/jsea.2013.65034.

Conflicts of Interest

The authors declare no conflicts of interest.



Copyright © 2023 by authors and Scientific Research Publishing Inc.


This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.