PPGCCM GRADUATE PROGRAM IN COMPUTER SCIENCE, FUNDAÇÃO UNIVERSIDADE FEDERAL DO ABC Phone: 11 4996-8337 http://propg.ufabc.edu.br/ppgccm

DEFENSE committee: LUIS CESAR DE AZEVEDO

A MASTER'S DEFENSE committee has been registered by the program.
STUDENT: LUIS CESAR DE AZEVEDO
DATE: 18/10/2021
TIME: 09:30
LOCATION: https://meet.google.com/njq-pxyq-nzd
TITLE:

Quantifying the Bias-Variance decomposition in property predictions on Materials Science


PAGES: 70
MAJOR AREA: Exact and Earth Sciences
AREA: Computer Science
SUBAREA: Computing Methodology and Techniques
ABSTRACT:

Most machine learning (ML) applications on quantum-chemistry (QC) datasets rely on a single statistical error measure, such as the mean squared error (MSE), to evaluate their performance. However, this approach has limitations and can even lead to incorrect interpretations. Here, we report a systematic investigation of the two components of the MSE, i.e., the bias and the variance, using the quantum-chemistry QM9 dataset as an example. To this end, we experiment with three state-of-the-art descriptors, namely (i) Symmetry Functions (SF, with two-body and three-body functions), (ii) Many-body Tensor Representation (MBTR, with two- and three-body terms), and (iii) Smooth Overlap of Atomic Positions (SOAP), to evaluate how prediction performance changes with the number of molecules in the training sample and how bias and variance affect the final MSE. Overall, small sample sizes are associated with higher MSE, and the bias component strongly influences the larger MSEs. Furthermore, there is little agreement across descriptors on which molecules have the highest errors (outliers). However, there is a high prevalence among the outliers intersection set and the convex hull volume of geometric coordinates (VFC). Based on the distribution of the MSE (and its bias and variance components) and on the appearance of outliers, we suggest using ensembles of low-bias models to minimize the MSE, especially when the training set contains few molecules.
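The abstract does not say how the bias and variance components were estimated, so the following is only a minimal sketch of one common approach: refit a model on bootstrap resamples of the training set and split the per-point squared error into squared bias and variance. Everything in it is an assumption made for illustration: the toy one-dimensional regression problem, the polynomial models, and the names `bias_variance_decomposition` and `poly_fit_predict` are hypothetical and are not taken from the dissertation, the QM9 dataset, or the SF, MBTR, and SOAP descriptors.

```python
import numpy as np


def bias_variance_decomposition(fit_predict, X_train, y_train, X_test, y_test,
                                n_rounds=50, seed=0):
    """Estimate per-test-point MSE, squared bias, and variance.

    fit_predict(X_tr, y_tr, X_te) must train a fresh model and return
    predictions for X_te. Hypothetical helper, for illustration only.
    """
    rng = np.random.default_rng(seed)
    n = len(X_train)
    preds = np.empty((n_rounds, len(X_test)))
    for r in range(n_rounds):
        # Bootstrap resample of the training set (with replacement).
        idx = rng.integers(0, n, size=n)
        preds[r] = fit_predict(X_train[idx], y_train[idx], X_test)

    mean_pred = preds.mean(axis=0)
    bias_sq = (mean_pred - y_test) ** 2         # squared bias per test point
    variance = preds.var(axis=0)                # spread of predictions across rounds
    mse = ((preds - y_test) ** 2).mean(axis=0)  # average squared error per test point
    return mse, bias_sq, variance


def poly_fit_predict(degree):
    """Return a fit_predict function for a polynomial of the given degree."""
    def fit_predict(X_tr, y_tr, X_te):
        coeffs = np.polyfit(X_tr, y_tr, degree)
        return np.polyval(coeffs, X_te)
    return fit_predict


# Toy data (assumption): a noisy sine curve, with noise-free test targets.
data_rng = np.random.default_rng(42)
X_train = data_rng.uniform(-1, 1, 40)
y_train = np.sin(3 * X_train) + data_rng.normal(0, 0.1, 40)
X_test = np.linspace(-0.9, 0.9, 20)
y_test = np.sin(3 * X_test)

# A rigid model (degree 1) versus a more flexible one (degree 5).
mse_lin, bias_lin, var_lin = bias_variance_decomposition(
    poly_fit_predict(1), X_train, y_train, X_test, y_test)
mse_flex, bias_flex, var_flex = bias_variance_decomposition(
    poly_fit_predict(5), X_train, y_train, X_test, y_test)

print("degree 1: MSE, bias^2, variance =",
      mse_lin.mean(), bias_lin.mean(), var_lin.mean())
print("degree 5: MSE, bias^2, variance =",
      mse_flex.mean(), bias_flex.mean(), var_flex.mean())
```

With fixed test targets, the per-point MSE returned here equals the squared bias plus the variance, so the two components can be compared directly for each test point or averaged over the test set. Averaging the predictions of several low-bias models, as the abstract suggests, targets the variance term of this sum.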
  


COMMITTEE MEMBERS:
Chair - Internal to the Program - 1673092 - RONALDO CRISTIANO PRATI
Full Member - Examiner Internal to the Program - 1932365 - FABRICIO OLIVETTI DE FRANCA
Full Member - Examiner External to the Institution - GABRIEL RAVANHANI SCHLEDER
Alternate Member - Examiner Internal to the Program - 3008017 - DENIS GUSTAVO FANTINATO
Notice registered on: 20/09/2021 20:03