Machine learning based optimization for database replication system

Bibliographic Details
Main Author: Rocha, Jéssica Costa da
Publication Date: 2020
Format: Master thesis
Language: eng
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full: http://hdl.handle.net/10362/109751
Summary: Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics
id RCAP_8bcd5fbd349187b1f334e137ce622ad5
oai_identifier_str oai:run.unl.pt:10362/109751
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
spelling Machine learning based optimization for database replication system | Databases | Machine Learning | Reinforcement Learning | Python | Auto-tuning | Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics | Vanneschi, Leonardo | RUN | Rocha, Jéssica Costa da | 2021-01-05T18:24:09Z | 2020-11-30 | 2020-11-30T00:00:00Z | info:eu-repo/semantics/publishedVersion | info:eu-repo/semantics/masterThesis | application/pdf | http://hdl.handle.net/10362/109751 | TID:202572684 | eng | metadata only access | info:eu-repo/semantics/openAccess | reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) | instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia | instacron:RCAAP | 2024-05-22T17:49:40Z | oai:run.unl.pt:10362/109751 | Portal Agregador | ONG | https://www.rcaap.pt/oai/openaire | info@rcaap.pt | opendoar:https://opendoar.ac.uk/repository/7160 | 2025-05-28T17:20:59.815410 | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia | false
Abstract: This project is a database optimization problem. It aims to improve the performance of a data replication process between two database systems (OLTP and OLAP). A DBMS exposes hundreds of configuration knobs that engineers typically tune by hand, and the values chosen for these parameters affect the performance of both the replication process and the system as a whole. The goal of the project is to minimize latency, defined as the time it takes for data to be replicated from the source database to the target database. Latency must be kept as low as possible to avoid long replication delays, which eventually lead to outdated analytics for customers. To approach the problem, a simulation environment that captures the state of the replication process between the two databases was first designed to collect data. The incoming workload for this case study then had to be represented numerically. Lastly, two machine learning approaches were implemented to automate the configuration of the parameters. The first is a reinforcement learning agent formulated as a Markov decision process; the second combines a predictive model with Bayesian optimization search. The initial experimental results show improvements in the performance measure compared with the traditional approach.
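The first approach named in the abstract, a reinforcement learning agent formulated as a Markov decision process, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the single knob and its values (`KNOB_VALUES`), the toy latency function (`simulated_latency`), the reward (negative latency), and the choice of tabular Q-learning as the learning rule. The record does not say which algorithm, state representation, or reward the author used.

```python
import random

# Hypothetical settings for a single tunable knob (not from the thesis).
KNOB_VALUES = [1, 2, 4, 8, 16, 32]
# Actions move the knob index: decrease, keep, increase.
ACTIONS = (-1, 0, 1)


def simulated_latency(knob_value):
    """Toy latency model (an assumption): lowest when the knob is 8."""
    return abs(knob_value - 8) + 1.0


def step(state, action):
    """Apply an action to the knob index; reward is the negative latency."""
    next_state = min(max(state + action, 0), len(KNOB_VALUES) - 1)
    reward = -simulated_latency(KNOB_VALUES[next_state])
    return next_state, reward


def train(episodes=500, steps=20, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in KNOB_VALUES]
    for _ in range(episodes):
        state = rng.randrange(len(KNOB_VALUES))
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            next_state, reward = step(state, ACTIONS[a])
            # Standard Q-learning update toward the bootstrapped target.
            target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_rollout(q, state, steps=10):
    """Follow the learned greedy policy and return the final knob value."""
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _ = step(state, ACTIONS[a])
    return KNOB_VALUES[state]


if __name__ == "__main__":
    q_table = train()
    print(greedy_rollout(q_table, 0))
```

In this toy setting the state is just the current knob index, so the agent only has to learn which direction lowers latency. A real replication system would need a much richer state (for example, a numeric description of the incoming workload, as the abstract mentions) and many knobs at once.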
dc.title.none.fl_str_mv Machine learning based optimization for database replication system
title Machine learning based optimization for database replication system
author Rocha, Jéssica Costa da
author_facet Rocha, Jéssica Costa da
author_role author
dc.contributor.none.fl_str_mv Vanneschi, Leonardo
RUN
dc.contributor.author.fl_str_mv Rocha, Jéssica Costa da
dc.subject.por.fl_str_mv Databases
Machine Learning
Reinforcement Learning
Python
Auto-tuning
topic Databases
Machine Learning
Reinforcement Learning
Python
Auto-tuning
description Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics
publishDate 2020
dc.date.none.fl_str_mv 2020-11-30
2020-11-30T00:00:00Z
2021-01-05T18:24:09Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10362/109751
TID:202572684
url http://hdl.handle.net/10362/109751
identifier_str_mv TID:202572684
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv metadata only access
info:eu-repo/semantics/openAccess
rights_invalid_str_mv metadata only access
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833596629571076096