Interpretable machine learning models applied to predicting the separation and purification of fructooligosaccharides
Year of defense: | 2023 |
---|---|
Lead author: | |
Advisor: | |
Examination committee: | |
Document type: | Thesis |
Access type: | Open access |
Language: | por |
Defense institution: | Universidade Federal de Santa Maria; Brazil; Chemical Engineering; UFSM; Programa de Pós-Graduação em Engenharia Química; Centro de Tecnologia |
Graduate program: | Not provided by the institution |
Department: | Not provided by the institution |
Country: | Not provided by the institution |
Keywords in Portuguese: | |
Access link: | http://repositorio.ufsm.br/handle/1/28742 |
Abstract: | The separation and purification of fructooligosaccharides are crucial for their industrial application. However, specific characteristics of the process can affect its selectivity. In this work, machine learning models and additive explanation techniques were used to identify and analyze the main variables that affect the separation process. First, activated carbon and zeolite were compared as adsorbents in fixed-bed columns to determine which has the higher selectivity for fructooligosaccharide separation. For this comparison, eXtreme Gradient Boosting and SHapley Additive exPlanations were used to determine the best process conditions for each adsorbent, considering temperature, time, and ethanol concentration as variables. eXtreme Gradient Boosting showed high predictive power for both adsorbents, with performance metrics between 0.84 and 0.91 for activated carbon and between 0.87 and 0.98 for zeolite. Activated carbon was selective for fructooligosaccharides at low ethanol concentrations (7.95% v/v), whereas zeolite required ethanol concentrations about 8 times higher. In the second step, the concentrations of the saccharides separated from a fructooligosaccharide-containing mixture in zeolite fixed-bed columns were predicted using different machine learning models: Decision Trees, Random Forest, Gradient Boosting, AdaBoost, and Artificial Neural Networks. Feature importance was evaluated for the tree-based models. The input variables were time, temperature, ethanol concentration, eluent flow rate, and the percentage ratio of injection volume to bed. Gradient Boosting was the best model for predicting the concentrations, with test-data metrics between 0.600 and 0.840, between 0.0590 and 22.318, and between 0.187 and 3.002.
The most significant variables were ethanol concentration (for glucose concentration), the percentage ratio of injection volume to bed (for fructose concentration), and time (for sucrose and fructooligosaccharide concentrations). Finally, a literature survey was carried out of the experimental conditions that affect fructooligosaccharide purity, such as type of adsorbent, type of eluent, eluent concentration, eluent flow rate, temperature, and column dimensions. To predict purity, the best model was selected among AdaBoost, Decision Tree, Random Forest, eXtreme Gradient Boosting, K-Nearest Neighbors, and Ridge Regression. The selected model was then optimized, and SHapley Additive exPlanations were applied to identify the most important variables. The best predictive model was Random Forest, with test-data metrics of 0.935, 0.868, 36.313, 4.849, 6.026, and 7.885%. According to the SHapley Additive exPlanations analysis, the variables that most affect purity are diameter, eluent concentration, temperature, eluent flow rate, the use of activated carbon as adsorbent, and length. The results show that machine learning is a valuable tool for better understanding the ideal conditions for fructooligosaccharide separation, enabling higher recovery rates and greater efficiency. |
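
The abstract relies on Shapley additive explanations to attribute a model's prediction to its input variables. The sketch below illustrates the underlying idea with an exact Shapley computation on a toy function. Everything in it is an assumption made for illustration: the function `toy_model`, its coefficients, the example point, and the baseline are hypothetical and are not taken from the thesis, its data, or its fitted models. The thesis itself presumably used a dedicated library rather than this brute-force enumeration, which is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2**n in the number of features, so this is a teaching aid.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        # Evaluate the model with only the coalition's features "switched on".
        point = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return predict(point)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_model(p):
    # Hypothetical response surface, NOT fitted to any real separation data.
    return 2.0 * p["ethanol"] + 0.5 * p["temperature"] + 0.1 * p["time"] * p["ethanol"]


# Hypothetical operating point and reference point.
x = {"temperature": 40.0, "time": 30.0, "ethanol": 10.0}
baseline = {"temperature": 30.0, "time": 0.0, "ethanol": 0.0}

phi = shapley_values(toy_model, x, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:.3f}")
```

The attributions add up to the difference between the prediction at `x` and the prediction at `baseline`, which is the additivity property that makes these explanations easy to read. The interaction term between time and ethanol is split equally between the two features.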