CENTRO DE INVESTIGAÇÃO EM CIÊNCIA E TECNOLOGIA DAS ARTES
Music is built from sound, ultimately resulting from an elaborate interaction between the sound-generating properties of physical objects (e.g. musical instruments) and the sound perception abilities of the human auditory system. Humans, even without any formal music training, are typically able to extract, almost unconsciously, a great amount of relevant information from a musical signal. Features such as the beat of a musical piece, the main melody of a complex musical arrangement, the sound sources and events occurring in a complex musical mixture, or the song structure are just some examples of the level of knowledge that a naive listener is commonly able to extract just by listening to a musical piece. In order to do so, the human auditory system uses a variety of cues for perceptual grouping, such as similarity, proximity, harmonicity and common fate, among others.
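The harmonicity cue mentioned above can be illustrated with a minimal sketch (the function name, tolerance and peak values are illustrative assumptions, not part of the project): spectral peaks lying near integer multiples of a candidate fundamental are grouped into one sound event, and the remaining peaks are left to other sources.

```python
def group_by_harmonicity(peak_freqs, f0, tol=0.02):
    """Split spectral peaks (Hz) into those near integer multiples of a
    candidate fundamental f0 (harmonicity cue) and the remaining peaks."""
    grouped, rest = [], []
    for f in peak_freqs:
        n = round(f / f0)  # nearest harmonic number
        if n >= 1 and abs(f - n * f0) <= tol * f0:
            grouped.append(f)
        else:
            rest.append(f)
    return grouped, rest

# Peaks from two overlapping sources: harmonics of 220 Hz plus two others.
peaks = [220.0, 440.0, 450.0, 660.0, 700.0, 880.0]
harmonic, residual = group_by_harmonicity(peaks, f0=220.0)
# harmonic -> [220.0, 440.0, 660.0, 880.0]; residual -> [450.0, 700.0]
```

A real system would of course estimate candidate fundamentals from the signal and combine this cue with proximity and common-fate evidence before committing peaks to an event.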
Typical computational systems for sound analysis and Music Information Retrieval (MIR) represent the entire polyphonic or complex sound mixture statistically (e.g. [2, 3]), without any attempt to first identify the different sound entities or events that may coexist in the signal. There is, however, some evidence that this approach has reached a 'glass ceiling' in terms of analysis and retrieval performance.
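The whole-mixture statistical representation described above is commonly realized as a bag-of-frames model: frame-level features are pooled into global statistics, with no attempt to separate the events in the mixture. A minimal sketch under assumed toy values (the feature choice and the numbers are illustrative):

```python
import numpy as np

def spectral_centroid(frame_mags, freqs):
    """Spectral centroid (Hz) of a single magnitude-spectrum frame."""
    return float(np.sum(freqs * frame_mags) / np.sum(frame_mags))

freqs = np.array([100.0, 200.0, 300.0])       # bin center frequencies
frames = np.array([[1.0, 0.0, 0.0],            # frame dominated by 100 Hz
                   [0.0, 0.0, 1.0]])           # frame dominated by 300 Hz

# Bag-of-frames: the whole mixture collapses to one global statistic,
# discarding the fact that the two frames contain different events.
centroids = [spectral_centroid(f, freqs) for f in frames]
mean_centroid = float(np.mean(centroids))      # -> 200.0
```

The global mean (200 Hz) describes neither frame, which is one intuition behind the 'glass ceiling' argument for purely statistical mixture representations.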
The main problem this project addresses is the identification and segregation of sound events in 'real-world' polyphonic music signals (including monaural audio signals). The goal is to individually characterize the different sound events comprising the polyphonic mixture, and use this structured representation to improve the extraction of perceptually relevant information from complex audio and musical mixtures.
The proposed project will follow a Computational Auditory Scene Analysis (CASA) approach to modeling perceptual grouping in music listening. This approach is inspired by current knowledge of how listeners perceive sound events in music signals, be they music notes, harmonic textures, melodic contours, instruments or other types of events, and requires a multidisciplinary approach to the problem [6, p. 14]. Although the demanding challenges faced by such CASA approaches still leave their performance quite limited when compared to the human auditory system, some recent results already provide alternative and improved approaches to common sound analysis and MIR applications [T1].
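Many CASA systems ultimately carry out segregation through time-frequency masking: each spectrogram cell is assigned to the sound event that dominates it. The following is a minimal sketch of that idea with assumed toy spectrogram values, not an implementation from this project:

```python
import numpy as np

# Toy magnitude spectrograms (rows: frequency bins, cols: time frames)
# for two "sources" that overlap in the mixture. Values are exact in
# binary floating point so the arithmetic below is exact.
source_a = np.array([[4.0, 0.5], [0.25, 3.0]])
source_b = np.array([[0.5, 2.0], [5.0, 0.25]])
mixture = source_a + source_b

# Ideal binary mask: each time-frequency cell goes to the dominant source.
mask_a = (source_a >= source_b).astype(float)  # [[1, 0], [0, 1]]

# Masking the mixture recovers an estimate of source A's cells.
estimate_a = mask_a * mixture                  # [[4.5, 0], [0, 3.25]]
```

In practice the mask is not given: estimating it from the mixture alone, using perceptual grouping cues such as harmonicity and common fate, is precisely the hard problem the CASA approach addresses.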
The overall purpose of this project is to build upon the research results already obtained by the proposed team, which place it in a good position to articulate knowledge from the different disciplines in order to design, implement and validate innovative methodologies and technologies that are useful for computer-based sound and music analysis, namely:
In order to pursue these objectives, 7 tasks have been planned, including research work on:
Project start: April 4th 2011
Project end: April 3rd 2014
Papers in international journals – 0/3
Communications in international meetings – 4/6
Reports – 2/4
Organization of seminars and conferences – 2/0
PhD theses – 0/1
Models – 1/3
Software – 1/1
Pilot Plants – 0/1
Prototypes – 1/1
Evaluation Datasets – 2/1
Partners: INESC Porto (Portugal), University of Victoria (BC, Canada), IRCAM (France), McGill University/CIRMMT (QC, Canada), FEUP (Portugal)
Funding: Fundação para a Ciência e Tecnologia (FCT)
A Computational Auditory Scene Analysis Framework for Sound Segregation in Music Signals (CASA-FCT)