Web scraping and analysis of car data
| Main author: | Silva, João Luís Magalhães da |
|---|---|
| Publication date: | 2024 |
| Document type: | Dissertation |
| Language: | eng |
| Source title: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| Full text: | http://hdl.handle.net/10400.22/26596 |
| Abstract: | The growth of online car marketplaces has created challenges in efficiently gathering and analyzing car data due to price fluctuations and increasing digital reliance. This thesis tackles the problem through web scraping and data analysis to assist in market insights. A review of web scraping tools like BeautifulSoup, Requests, and Selenium, alongside data analysis libraries such as Pandas, was conducted. A system was developed to scrape car data from Standvirtual and analyze key attributes like price and mileage. The data was processed using Python tools, and a Flask-based server application was built for easy access, with offline analysis supported through Excel. Challenges such as incomplete data and anti-scraping measures were resolved with advanced extraction techniques and error handling. Further improvements include optimizing the scraping process and integrating machine learning models for more accurate price predictions. In conclusion, the project demonstrates the potential of web scraping for car market analysis, providing a foundation for future predictive analytics and real-time data applications. |

| id | RCAP_2a163f4d6e9406f98e5f3ae75f118a26 |
|---|---|
| oai_identifier_str | oai:recipp.ipp.pt:10400.22/26596 |
| network_acronym_str | RCAP |
| network_name_str | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository_id_str | https://opendoar.ac.uk/repository/7160 |
| dc.title.none.fl_str_mv | Web scraping and analysis of car data; Agregamento e análise de dados de automóveis |
| title | Web scraping and analysis of car data |
| author | Silva, João Luís Magalhães da |
| author_facet | Silva, João Luís Magalhães da |
| author_role | author |
| dc.contributor.none.fl_str_mv | Araújo, Susana Cláudia Nicola de; REPOSITÓRIO P.PORTO |
| dc.contributor.author.fl_str_mv | Silva, João Luís Magalhães da |
| dc.subject.por.fl_str_mv | Web scraping; Data analysis; Application; Data extraction; Data insertion; Python libraries; Car data; Agregamento de dados; Análise de dados; Aplicação; Extração de dados; Inserção de dados; Bibliotecas Python; Dados de automóveis |
| topic | Web scraping; Data analysis; Application; Data extraction; Data insertion; Python libraries; Car data; Agregamento de dados; Análise de dados; Aplicação; Extração de dados; Inserção de dados; Bibliotecas Python; Dados de automóveis |
| publishDate | 2024 |
| dc.date.none.fl_str_mv | 2024-12-02T16:09:09Z; 2024-10-29; 2024-10-29T00:00:00Z |
| dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv | info:eu-repo/semantics/masterThesis |
| format | masterThesis |
| status_str | publishedVersion |
| dc.identifier.uri.fl_str_mv | http://hdl.handle.net/10400.22/26596; urn:tid:203733142 |
| url | http://hdl.handle.net/10400.22/26596 |
| identifier_str_mv | urn:tid:203733142 |
| dc.language.iso.fl_str_mv | eng |
| language | eng |
| dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
| eu_rights_str_mv | openAccess |
| dc.format.none.fl_str_mv | application/pdf |
| dc.source.none.fl_str_mv | reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP |
| instname_str | FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| instacron_str | RCAAP |
| institution | RCAAP |
| reponame_str | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| collection | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository.name.fl_str_mv | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| repository.mail.fl_str_mv | info@rcaap.pt |
| _version_ | 1833600801485881344 |