Classificação de gêneros musicais utilizando convolutional neural network e data augmentation

Aguiar, Rafael de Lima

Bibliographic details
Year of defense:	2017
Main author:	Aguiar, Rafael de Lima
Advisor:	Not provided by the institution
Defense committee:	Not provided by the institution
Document type:	Dissertation
Access type:	Open access
Language:	por
Institution of defense:	Universidade Estadual de Maringá Brasil Departamento de Informática Programa de Pós-Graduação em Ciência da Computação UEM Maringá, PR Centro de Tecnologia
Graduate program:	Not provided by the institution
Department:	Not provided by the institution
Country:	Not provided by the institution
Keywords in Portuguese:	Classificação de gêneros musicais; Recuperação de informação musical; Espectrogramas; Deep learning; Data augmentation; Brasil; Ciências Exatas e da Terra; Ciência da Computação
Access link:	http://repositorio.uem.br:8080/jspui/handle/1/2501
Abstract:	In this work, we present a master's dissertation addressing automatic music genre classification as a pattern recognition task. The content of the music pieces was described using features obtained in the visual domain, from spectrograms created from the audio signal. This kind of image has been used successfully in this task since 2011 by exploring its main visual attribute, texture. In this work, the patterns were described using representation learning, for which convolutional neural networks (CNNs) were used. The CNN is a deep learning architecture that has been widely used in the pattern recognition literature. Deep learning is inspired by the human brain, and CNNs by the mammalian visual system. Overfitting is a recurrent problem when a classification problem is addressed with a CNN; it may occur due to the combination of a lack of training samples and a high-dimensional feature space. To address this problem, we propose to explore data augmentation techniques. In this application domain, examples of data augmentation techniques are cropping spectrogram images, changing the pitch of a music piece, and separating the harmonic and percussive components of the sound. These procedures are applied to both the training and testing sets. We present results obtained on the Latin Music Database; the best accuracy we obtained is close to the state of the art and outperforms the best system we know of that is based only on CNNs.
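
The cropping augmentation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the function names `crop_patches` and `majority_vote`, the patch width, and the hop size are hypothetical, and combining per-crop predictions by majority vote at test time is an assumption about how augmentation of the testing set might be used, not something the abstract states.

```python
from collections import Counter


def crop_patches(spectrogram, width, hop):
    """Split a spectrogram into fixed-width crops along the time axis.

    `spectrogram` is a list of rows, one row per frequency bin, each row
    holding one value per time frame. Crops that would run past the end
    of the signal are skipped.
    """
    n_frames = len(spectrogram[0])
    patches = []
    for start in range(0, n_frames - width + 1, hop):
        # Keep every frequency bin, but only `width` consecutive frames.
        patches.append([row[start:start + width] for row in spectrogram])
    return patches


def majority_vote(labels):
    """Return the most frequent label among per-crop predictions (assumed fusion rule)."""
    return Counter(labels).most_common(1)[0][0]
```

Under these assumptions, each training spectrogram yields several crops that share the label of the original piece, and a test piece could be labeled by voting over the predictions made for its crops.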
