TITLE:
Multi-Instrument Detection in Polyphonic Music with Cultural Instruments
AUTHORS:
Sovathanak Meas, Rezza Moieni
KEYWORDS:
Convolutional Neural Network, Multi-Instrument Detection, Cultural Instruments, Deep Learning, Multi-Label Classification
JOURNAL NAME:
Open Journal of Social Sciences, Vol.13 No.9, September 22, 2025
ABSTRACT: This study adapts several machine-learning and deep-learning architectures to recognize 63 traditional instruments in weakly labelled, polyphonic audio synthesized from the proprietary Sound Infusion collection. Ten thousand 5-second clips were algorithmically generated; features including Mel-spectrograms, MFCCs, and VGGish embeddings were extracted; and six models were evaluated. The re-implemented Convolutional Neural Network (CNN) of Han et al. attained the best result (micro F1 = 0.55; macro F1 = 0.50), approaching published performance on the mainstream Instrument Recognition in Musical Audio Signals (IRMAS) dataset. The results highlight data scarcity and class imbalance as key obstacles for culturally diverse music information retrieval (MIR).
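
The abstract reports both micro and macro F1, and the gap between them is one common symptom of class imbalance in multi-label classification. The following is a minimal sketch of how the two averages are conventionally computed from binary label matrices. It is not the authors' evaluation code; the function names and the toy data (three hypothetical instrument labels, four clips) are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); taken as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(
    y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]
) -> Tuple[float, float]:
    """Return (micro F1, macro F1) for binary multi-label matrices.

    Rows are clips, columns are instrument labels (1 = present, 0 = absent).
    """
    n_labels = len(y_true[0])
    tp: List[int] = [0] * n_labels
    fp: List[int] = [0] * n_labels
    fn: List[int] = [0] * n_labels

    # Count true positives, false positives and false negatives per label.
    for t_row, p_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(t_row, p_row)):
            if t and p:
                tp[j] += 1
            elif p and not t:
                fp[j] += 1
            elif t and not p:
                fn[j] += 1

    # Micro: pool the counts over all labels, then compute a single F1.
    micro = f1_from_counts(sum(tp), sum(fp), sum(fn))
    # Macro: compute F1 per label, then take the unweighted mean.
    macro = sum(f1_from_counts(tp[j], fp[j], fn[j]) for j in range(n_labels)) / n_labels
    return micro, macro


# Hypothetical toy data: label 0 is frequent, labels 1 and 2 are rare.
Y_TRUE = [[1, 1, 0], [1, 0, 0], [1, 0, 1], [1, 1, 0]]
Y_PRED = [[1, 1, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0]]

if __name__ == "__main__":
    micro, macro = micro_macro_f1(Y_TRUE, Y_PRED)
    print(f"micro F1 = {micro:.3f}, macro F1 = {macro:.3f}")
```

In this toy case the frequent label is always predicted correctly while the rare labels are partly or wholly missed, so the micro average sits above the macro average. Micro F1 is dominated by frequent labels, whereas macro F1 weights every label equally regardless of how often it occurs.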