PPGCCM GRADUATE PROGRAM IN COMPUTER SCIENCE FUNDAÇÃO UNIVERSIDADE FEDERAL DO ABC Phone: 11 4996-8337 http://propg.ufabc.edu.br/ppgccm

QUALIFYING EXAM committee: PEDRO HENRIQUE PARREIRA

A DOCTORAL QUALIFYING EXAM committee has been registered by the program.
STUDENT: PEDRO HENRIQUE PARREIRA
DATE: 30/11/2022
TIME: 14:00
LOCATION: meet.google.com/hzd-nhqj-oxw
TITLE:

Classification in Data Streams with Intermediate Latency


PAGES: 65
MAJOR AREA: Exact and Earth Sciences
AREA: Computer Science
SUBAREA: Computing Methodology and Techniques
ABSTRACT:

Recent technological advances have popularized devices that continuously generate data. Cell phones, for example, collect and transmit data over time and have become accessible to a broad public. One result of these advances is a huge increase in data generation, which in turn has increased the number of data streams. Data streams are characterized by the continuous arrival of data, in massive amounts and at high rates. The distribution of the data may change along a stream; this phenomenon is called concept drift and is common in real data stream applications. For example, the buying habits of a particular consumer may change over time for different reasons. These characteristics are challenging for traditional machine learning algorithms, which usually assume that the data is accessible at any time and comes from a stationary environment. These challenges have spurred increased interest in data stream mining, including the development of classification algorithms for data streams, some of which have built-in mechanisms to deal with concept drift.

These algorithms have been extensively evaluated by the community on several real and artificial datasets. However, most of these evaluations assume that the label of each instance is available immediately after the prediction, that is, without any delay. This scenario is called null latency and is quite optimistic for most real data stream applications, in which there is often a time interval between the availability of an instance and the availability of its label. For example, when predicting whether it will rain next week, the true label, that is, whether it actually rained or not, will only be available at the end of next week. This scenario, in which there is a time interval between the availability of the instance and its respective label, is called intermediate latency.

Most existing classification algorithms for data streams assume the null latency scenario. Under null latency, when concept drift occurs, the algorithms have immediate access to the instances and their respective labels and can adapt, restoring their predictive power. However, the predictive performance of such algorithms is quite low under intermediate latency. Although this scenario is quite common in real data stream applications, few works on data streams consider it. Given these challenges, this thesis proposes the development of classification algorithms for data streams that consider the intermediate latency scenario. The proposed algorithms are expected to achieve better predictive performance than existing classification algorithms in this scenario.
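The contrast between null and intermediate latency can be illustrated with a minimal Python sketch. It is not one of the algorithms proposed in the thesis. It assumes a toy model that always predicts the most recently revealed label, and the function name `delayed_prequential_accuracy` and the `delay` parameter are hypothetical. Each instance is tested first, and its label reaches the model only `delay` steps after the null-latency moment.

```python
from collections import deque


def delayed_prequential_accuracy(labels, delay):
    """Test-then-train accuracy when labels arrive `delay` steps late.

    delay = 0 corresponds to null latency: the label of instance t is
    available before instance t + 1 is predicted.
    """
    pending = deque()   # (arrival_time, label) pairs not yet revealed
    last_seen = None    # toy model (assumption): predict the last revealed label
    correct = 0
    for t, y in enumerate(labels):
        # Reveal every label whose waiting time has elapsed.
        while pending and pending[0][0] <= t:
            _, last_seen = pending.popleft()
        # Test: compare the prediction with the still-hidden true label.
        if last_seen == y:
            correct += 1
        # The true label becomes available only after the delay.
        pending.append((t + 1 + delay, y))
    return correct / len(labels)


# Synthetic stream with an abrupt concept drift at its midpoint.
stream = [0] * 50 + [1] * 50
print(delayed_prequential_accuracy(stream, delay=0))
print(delayed_prequential_accuracy(stream, delay=10))
```

On this synthetic stream the toy model scores 0.98 with `delay=0` and 0.78 with `delay=10`: after the drift it keeps predicting the old concept until the first post-drift label arrives, so recovery takes longer as the latency grows.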


COMMITTEE MEMBERS:
Chair - Internal to the Program - 1673092 - RONALDO CRISTIANO PRATI
Full Member - Examiner Internal to the Program - 3008222 - PAULO HENRIQUE PISANI
Full Member - Examiner External to the Institution - VINICIUS MOURÃO ALVES DE SOUZA - PUCPR
Alternate Member - Examiner Internal to the Program - 1722875 - DAVID CORREA MARTINS JUNIOR
Notice registered on: 20/10/2022 16:39