PPGCCM GRADUATE PROGRAM IN COMPUTER SCIENCE FUNDAÇÃO UNIVERSIDADE FEDERAL DO ABC Phone: 11 4996-8337 http://propg.ufabc.edu.br/ppgccm

DEFENSE committee: YUBIRY SINAMAICA GONZALEZ

A DOCTORAL DEFENSE committee has been registered by the program.
STUDENT : YUBIRY SINAMAICA GONZALEZ
DATE: 01/11/2023
TIME: 09:30
LOCATION: meet.google.com/psv-icwj-oub
TITLE:

FFT-based acoustic descriptors for musical timbre characterization using data analysis and Machine Learning


PAGES: 100
BROAD AREA: Exact and Earth Sciences
AREA: Computer Science
SUBAREA: Computing Methodology and Techniques
SUMMARY:

Musical timbre is one of the most complex attributes of sound, and its characterization remains an open problem. The digital acquisition and reproduction of musical sounds through the FFT makes it possible to study musical timbre from the perspective of musical acoustics. The fundamental thesis is therefore that all relevant timbral information is contained in the Fourier transform of the corresponding audio recording. The problem of timbral representation is considered very similar to that of color-space representation: in both cases a perception (sound, color) must be operationally defined in an abstract space so that it can be handled and computed automatically. The central problem of timbre characterization is thus to develop a minimal set of efficient acoustic descriptors that quantitatively evaluate musical timbre from digital audio recordings and give sufficient information to identify timbres and patterns of similarity. The analysis considers only monophonic audio recordings of notes in the equal-temperament scale, which constitute a discrete, finite, and well-defined set of frequencies. The recordings are taken from two well-known libraries, TinySol and Good-sounds, and consist of monophonic sounds of musical instruments typical of a Western symphony orchestra, performed by professional musicians. A set of dimensionless, acoustically motivated descriptors is defined to quantitatively describe the frequency-amplitude distribution of the partials in the FFTs of the audio recordings.
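As a rough illustration of the frequency-amplitude distribution of partials mentioned in the summary, the following minimal sketch computes a magnitude spectrum for a synthetic tone and lists its strongest partials. It is not the thesis code: the function names, the sample rate, and the synthetic signal are all assumptions made for this example, and a naive DFT stands in for an FFT routine so that the block needs only the standard library.

```python
import cmath
import math


def magnitude_spectrum(samples, sample_rate):
    """Return (frequency_hz, amplitude) pairs for the non-negative DFT bins."""
    n = len(samples)
    pairs = []
    for k in range(n // 2 + 1):
        # Naive O(n^2) DFT; a real pipeline would call an FFT routine instead.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        # Scale so that a unit-amplitude sinusoid on a bin gives amplitude 1.
        scale = 1 if k in (0, n / 2) else 2
        pairs.append((k * sample_rate / n, scale * abs(coeff) / n))
    return pairs


def strongest_partials(pairs, count):
    """Pick the `count` largest-amplitude bins and sort them by frequency."""
    top = sorted(pairs, key=lambda p: p[1], reverse=True)[:count]
    return sorted(top)


# Hypothetical test signal: a 440 Hz fundamental plus two weaker harmonics.
SAMPLE_RATE = 8000
N = 400  # 20 Hz bin spacing, so 440, 880 and 1320 Hz fall exactly on bins
tone = [
    1.0 * math.sin(2 * math.pi * 440 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 880 * t / SAMPLE_RATE)
    + 0.25 * math.sin(2 * math.pi * 1320 * t / SAMPLE_RATE)
    for t in range(N)
]

partials = strongest_partials(magnitude_spectrum(tone, SAMPLE_RATE), 3)
for freq, amp in partials:
    print(f"{freq:7.1f} Hz  amplitude {amp:.3f}")
```

The signal length is chosen so that every partial falls exactly on a DFT bin. With real recordings the partials generally fall between bins, so windowing and peak interpolation would be needed before any descriptor could be computed.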
Each FFT contains only two physical quantities, frequency and amplitude, whose distribution can be characterized by quantifying its fundamental component, Affinity (A) and Sharpness (S); the average values of both magnitudes, Mean Affinity (MA) and Mean Contrast (MC); and the envelope description, Harmony (H) and Monotony (M). These descriptors, together with the fundamental frequency, define a 7-dimensional space in which relationships of timbral similarity and proximity can be expressed geometrically through the Euclidean distance. The problem of timbral characterization is thus reduced to a grouping problem in a 7-dimensional abstract space, where each audio recording corresponds to a point. The position of each recording in the timbral space and the Euclidean distances between recordings allowed us to discriminate timbral variations by dynamics, octave, musical instrument, and instrument family, using Machine Learning and data processing techniques. Timbral similarities between audio recordings are studied with an algorithm based on Euclidean distances in this 7-dimensional space, which allowed us to find which FFTs are similar across different musical instruments. Based on the descriptors and the distance relationships, an exploratory clustering analysis was carried out using the K-means algorithm, grouping by musical instrument, instrument family, audio library, and pitch. We observed that the data for each case study fall in specific, delimited regions of the timbral space, which allows us to identify significant relationships in the timbral characterization process. In the analysis of timbral variations we considered crescendo and vibrato, and observed that crescendo modifies the Mean Contrast (MC) while vibrato modifies the Affinity (A).
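The distance-based similarity search and the K-means grouping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm developed in the thesis: the function names, the labels, and every descriptor value are hypothetical, and the vectors are assumed to be already scaled to comparable ranges (the summary does not say how the fundamental frequency is combined with the dimensionless descriptors).

```python
import math


def nearest_recordings(query, recordings, count=1):
    """Rank labelled descriptor vectors by Euclidean distance to `query`."""
    ranked = sorted(recordings, key=lambda item: math.dist(query, item[1]))
    return ranked[:count]


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's algorithm starting from the given initial centroids."""
    centroids = [list(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point goes to its closest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(axis) / len(members) for axis in zip(*members)]
    return assignment, centroids


# Hypothetical 7-dimensional vectors (made-up values, assumed pre-scaled).
recordings = [
    ("flute_A4", (0.44, 0.90, 0.80, 0.20, 0.10, 0.70, 0.30)),
    ("flute_B4", (0.49, 0.88, 0.78, 0.22, 0.12, 0.68, 0.32)),
    ("oboe_A4", (0.44, 0.60, 0.40, 0.50, 0.45, 0.30, 0.60)),
    ("oboe_B4", (0.49, 0.58, 0.42, 0.52, 0.47, 0.32, 0.58)),
]
query = (0.47, 0.89, 0.79, 0.21, 0.11, 0.69, 0.31)

closest_label = nearest_recordings(query, recordings)[0][0]
points = [vector for _, vector in recordings]
labels, centers = kmeans(points, centroids=[points[0], points[2]])
print(closest_label, labels)
```

The initial centroids are passed in explicitly so that the run is deterministic. Library implementations of K-means usually pick them with a randomized seeding strategy instead.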
Finally, we compared the classification capacity of our FFT acoustic descriptors with that of the descriptors from the Librosa library, using the Random Forest classification algorithm. We observed statistically significant results for the FFT acoustic descriptors when classifying musical instruments and dynamics, and a better classification of pitch and instrument family than with the Librosa descriptors.


COMMITTEE MEMBERS:
Chair - Internal to the Program - 1673092 - RONALDO CRISTIANO PRATI
Full Member - Examiner External to the Institution - IVAN EIJI YAMAUCHI SIMURRA - UFAC
Full Member - Examiner External to the Institution - TIAGO TAVARES - INSPER
Full Member - Examiner External to the Institution - PEDRO JAVIER GÓMEZ JAIME - UESB
Full Member - Examiner External to the Institution - NELSON L. FALCON VELOZ
Alternate Member - Examiner External to the Program - 3271359 - THENILLE BRAUN JANZEN
Alternate Member - Examiner External to the Institution - BICKY MARQUEZ
Alternate Member - Examiner External to the Institution - DAFNE CAROLINA ARIAS PERDOMO
Notice registered on: 03/10/2023 09:07