Ensembles de OCRs para aplicações médicas (Ensembles of OCRs for medical applications)

Bibliographic Details
Main Author: João Adriano Portela de Matos Silva
Publication Date: 2021
Format: Master thesis
Language: Portuguese (por)
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full: https://hdl.handle.net/10216/137327
Summary: As research relies increasingly on new technologies, there is considerable value in transferring data and knowledge stored in traditional media, such as books and written records, into databases. This transition facilitates every process that involves handling and processing data on a large scale. One case where it is needed is that of the "Child and Youth Health Bulletins", which are the focus of this dissertation. These documents record information about users from birth to 20 years of age. In each document the information is stored in a table, and the following pages show the same information as graphs. The bulletins contain a great deal of information that is of interest to the scientific community and that, once transferred to digital media, can be used in pediatric studies. The dissertation aims to automate the collection of the data contained in the bulletins through an Optical Character Recognition (OCR) system combined with machine learning, data mining, and ensemble methods, while obtaining the best possible predictive performance from the algorithms used.
id RCAP_da4d4ea6f91619d3ab1c77a43b897892
oai_identifier_str oai:repositorio-aberto.up.pt:10216/137327
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
dc.title.none.fl_str_mv Ensembles de OCRs para aplicações médicas
author João Adriano Portela de Matos Silva
author_facet João Adriano Portela de Matos Silva
author_role author
dc.contributor.author.fl_str_mv João Adriano Portela de Matos Silva
dc.subject.por.fl_str_mv Engenharia electrotécnica, electrónica e informática
Electrical engineering, Electronic engineering, Information engineering
publishDate 2021
dc.date.none.fl_str_mv 2021-10-14
2021-10-14T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://hdl.handle.net/10216/137327
TID:202820912
url https://hdl.handle.net/10216/137327
identifier_str_mv TID:202820912
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833599632961175552