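Comparação do desempenho de ambientes virtuais na computação em nuvem privada usando a análise estatística e o benchmark Hadoop (Performance comparison of virtual environments in private cloud computing using statistical analysis and the Hadoop benchmark)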

Bôaventura, Ricardo Soares

Bibliographic details
Year of defense:	2015
Main author:	Bôaventura, Ricardo Soares
Advisor:	Not provided by the institution
Defense committee:	Not provided by the institution
Document type:	Thesis
Access type:	Open access
Language:	Portuguese (por)
Defending institution:	Universidade Federal de Uberlândia BR Programa de Pós-graduação em Engenharia Elétrica Engenharias UFU
Graduate program:	Not provided by the institution
Department:	Not provided by the institution
Country:	Not provided by the institution
Keywords (Portuguese and English):	Virtualização; Computação em nuvem; Nuvem privada; Planejamento experimental; Experimentos com algoritmos; Dominância de pareto; Análise de variância; Algoritmos de computador; Virtualization; Cloud computing; Private cloud; Experimental planning; Experiments with algorithms; Pareto dominance; Analysis of variance; CNPQ::ENGENHARIAS::ENGENHARIA ELETRICA
Access link:	https://repositorio.ufu.br/handle/123456789/14350 https://doi.org/10.14393/ufu.te.2015.37
Abstract:	Cloud computing has emerged as a dominant paradigm in distributed systems, with a model that gives users on-demand access to a shared pool of configurable computing resources such as networks, servers, storage, applications and services. These resources can be provisioned rapidly with minimal management effort or interaction with the provider. In cloud computing, infrastructure can be offered as a service through virtualization using hypervisors. Virtualization is a mechanism that abstracts the hardware and system resources of a given operating system. The technology is used in cloud environments across large sets of servers, by means of virtual machine monitors that sit between the hardware and the operating system. However, many hypervisors are available, each with its own advantages and disadvantages, and the specific characteristics of each virtual machine lead to different performance. The aim of this work is to propose a methodology for discovering how and when the performance of algorithms in virtual environments is determined by the environment configuration, how the configuration parameters influence each other, and, using statistical methods, which virtual environment settings achieve the best results on average. The tested algorithms (sudoku, pi, wordcount, testDFSIO read and testDFSIO write) belong to the Apache Hadoop benchmark. The experiments were planned and executed according to experimental design theory. An experimental design is a pre-established set of tests, chosen mainly by scientific and statistical criteria, that determines the influence of various factors on the results (metric) of a system or process and identifies the reasons for changes in the expected value. The design used was a 3^4 factorial design, in which each of the four factors (core, memory, operating system and virtual machine) was varied over three levels.
The operating systems tested were Ubuntu 14.04 64bit, CentOS 7.0 64bit and Windows 8.0 64bit; the virtual machines tested were KVM, Xen and VMware. Data were collected and analyzed using analysis of variance. The results show that the main factors analyzed change algorithm performance, but they cannot be analyzed separately because there are also significant interactions among them. At a 5% significance level, analysis of variance showed that the interactions core with memory, memory with OS, memory with VM, and OS with VM all affect the runtime of the analyzed algorithms. A statistical comparison of means was then applied to the mean times of the significant OS with VM interaction, and an adaptation of Pareto dominance theory was applied to the results. Also at a 5% significance level, the Pareto fronts could be identified. In terms of algorithm runtime, Pareto dominance placed Xen with CentOS on the first front, as the virtual environment that on average achieved the best performance for the analyzed algorithms. The second front contained Xen with Ubuntu and VMware with CentOS; that is, on average they performed worse than the first front and were considered equivalent to each other. The third front contained KVM with Ubuntu, VMware with Ubuntu and VMware with Windows. The fourth front contained Xen with Windows and KVM with CentOS, and the environment with the worst average times of all was KVM with Windows. It can be concluded that the Xen virtual machine with the CentOS operating system achieved the best performance on average. Users who want to run Ubuntu are advised to install it on the Xen virtual machine, and users who want to run Windows are advised to install it on the VMware virtual machine.
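
The factorial design described in the abstract (four factors, three levels each) can be illustrated by enumerating every configuration. The following is a minimal Python sketch, not the author's tooling. The operating system and hypervisor levels come from the abstract; the core and memory levels are not stated there, so the values below are hypothetical placeholders, as is the helper name `full_factorial`.

```python
from itertools import product

# Levels named in the abstract.
OPERATING_SYSTEMS = ["Ubuntu 14.04 64bit", "CentOS 7.0 64bit", "Windows 8.0 64bit"]
HYPERVISORS = ["KVM", "Xen", "VMware"]

# ASSUMPTION: the abstract does not give the core or memory levels.
# These values are hypothetical placeholders, not the thesis settings.
CORE_LEVELS = [1, 2, 4]
MEMORY_LEVELS_GB = [1, 2, 4]


def full_factorial(*factors):
    """Return every combination of one level per factor (a full factorial design)."""
    return list(product(*factors))


# Four factors at three levels each give 3**4 = 81 configurations.
design = full_factorial(CORE_LEVELS, MEMORY_LEVELS_GB, OPERATING_SYSTEMS, HYPERVISORS)

if __name__ == "__main__":
    print(f"{len(design)} configurations")
    for cores, memory_gb, os_name, hypervisor in design[:3]:
        print(cores, memory_gb, os_name, hypervisor)
```

Each tuple in `design` is one virtual environment configuration; in a design of this kind, every benchmark would be run on each configuration before the runtimes are passed to an analysis of variance.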
