Study and development of lossless compression algorithms on uniformly distributed data

Bibliographic details
Year of defense: 2022
Main author: CARVALHO, Caio Magno Aguiar de
Advisor: DUAILIBE FILHO, Allan Kardec Barros
Defense committee: DUAILIBE FILHO, Allan Kardec Barros; SANTANA, Ewaldo Éder Carvalho; SOUZA, Francisco da Chagas de; SIQUEIRA, Hugo Valadares
Document type: Doctoral thesis
Access type: Open access
Language: Portuguese
Defending institution: Universidade Federal do Maranhão
Graduate program: Programa de Pós-Graduação em Engenharia de Eletricidade/CCET
Department: Departamento de Engenharia da Eletricidade/CCET
Country: Brazil
Access link: https://tedebc.ufma.br/jspui/handle/tede/4187
Abstract: The ever-increasing pace of digital information production and consumption is outstripping current data storage and transmission capacity; that is, we produce more digital content than we can store and communicate, and this race will apparently not be balanced easily. Data compression techniques have been developed to optimize storage and communication mechanisms so that information occupies the minimum amount of space in a file management system or the minimum amount of bandwidth in a communication channel. Such techniques are based on the Information Theory proposed by Shannon, in which the statistics of the signal to be compressed play a key role in the efficient representation of information. Repetition and structure are the characteristics fundamentally exploited by compression algorithms. However, sequences of uniformly distributed, independent and identically distributed (i.i.d.) data break these two pillars that underlie statistical compression. It is also known that, ideally, the coded output of a compression algorithm is uniformly distributed, so studying the possibility of compressing uniform distributions opens up the possibility of recursive compression. The present work explores this possibility by approaching the compression problem outside the statistical field: through the inherent redundancy of standard binary coding, exploited by the proposed Concatenation algorithm, and from a geometric perspective, through the SVD-spherical method. The Concatenation algorithm takes advantage of the unused bit fractions in the standard binary representation, reaching its maximum performance when the alphabet size of the data to be compressed is 2^N + 1 (just above a power of two, where a fixed-length code spends N + 1 bits per symbol to carry barely more than N bits of information). The experiments were conducted on RAND Corporation data, uniform data produced by physical processes with alphabet size 10. The results show that it is possible to obtain up to 12.5% compression on this set.
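
For intuition, the following is a minimal Python sketch of the bit-packing idea the abstract attributes to the Concatenation algorithm: grouping base-10 symbols so that two 4-bit digits share a single 7-bit code. The pairwise grouping, the function names, and the odd-digit fallback are illustrative assumptions made for this note, not the thesis's actual implementation.

```python
import random


def pack_pairs(digits):
    """Pack a sequence of decimal digits (alphabet size 10) pairwise.

    Each pair of digits forms a value in 0..99, which fits in
    ceil(log2(100)) = 7 bits, i.e. 3.5 bits per digit instead of the
    4 bits a fixed-length binary code spends on a single digit.
    A trailing odd digit falls back to a plain 4-bit code.
    """
    bits = []
    for i in range(0, len(digits) - 1, 2):
        pair = 10 * digits[i] + digits[i + 1]  # concatenate two digits
        bits.append(format(pair, "07b"))
    if len(digits) % 2:
        bits.append(format(digits[-1], "04b"))
    return "".join(bits)


def unpack_pairs(bitstring, n_digits):
    """Invert pack_pairs, given the original number of digits."""
    digits, pos = [], 0
    while len(digits) + 1 < n_digits:
        pair = int(bitstring[pos:pos + 7], 2)
        digits.extend(divmod(pair, 10))  # recover (tens, units)
        pos += 7
    if len(digits) < n_digits:  # trailing 4-bit digit, if any
        digits.append(int(bitstring[pos:pos + 4], 2))
    return digits


digits = [random.randrange(10) for _ in range(100_000)]
packed = pack_pairs(digits)
assert unpack_pairs(packed, len(digits)) == digits
naive_bits = 4 * len(digits)  # fixed-length baseline: 4 bits per digit
print(f"fixed-length: {naive_bits} bits, packed: {len(packed)} bits, "
      f"saving: {1 - len(packed) / naive_bits:.1%}")  # -> 12.5%
```

On uniform base-10 data this pairing reaches exactly the 4 to 3.5 bits-per-digit reduction (12.5%) quoted in the abstract; note that no statistical structure is exploited, only the slack between log2(10) ≈ 3.32 bits of information per digit and the 4-bit fixed-length code.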