• Posted on: 23 November 2021
  • By: secretaria


The Coordination of the Graduate Program in Informatics announces the Master's dissertation defense, to be held remotely, of FÁBIO KAZUO HASHIMOTO DE BARROS, on 24 November 2021 at 14:00.


Title: “An Evaluation of Data Resampling Techniques in Local-Model Hierarchical Classification Approaches”

Classification is one of the main areas of interest in machine learning: a supervised learning task whose objective is to induce models that learn patterns from labeled data instances. Classification tasks can be grouped by the type of problem they handle, such as binary, multi-class, and hierarchical. Many real-world problems can be modeled with binary or multi-class classifiers. In some problems, however, the classes have an inherent hierarchical relationship, and it makes sense to model the problem with a hierarchical classification approach. Hierarchical classification is defined as a task in which the class structure of a dataset can be organized into a taxonomy or hierarchy. A problem often observed in real-world data is imbalanced class distribution, and it also appears in hierarchical classification. An imbalanced class distribution occurs when some classes, the majority classes, occur much more often than others, the minority classes. The distribution is therefore skewed towards the majority classes, which have many more samples than the minority ones. This unequal distribution harms classification performance, since the classifier tends to favor the majority classes. For this reason, our objective is to evaluate the effectiveness of data resampling techniques on imbalanced hierarchical datasets from different application domains. To do this, we test existing data resampling approaches and propose new ones, combining them with different kinds of local hierarchical classifiers and with resampling algorithms from the literature.
We chose local hierarchical classifiers because few works in the literature investigate imbalanced distributions in local hierarchical classification problems. After testing both the existing and the proposed data resampling approaches, we evaluate their effectiveness against a baseline in which no resampling is used. Our results show that resampling yields a statistically significant improvement in classification performance, and that one of the proposed resampling approaches was the best-ranked approach in one of the local classification scenarios.
Keywords: Hierarchical Classification, Imbalanced Learning, Class Imbalance, Resampling Algorithms.

The examination committee will be composed of:
Prof. Dr. Carlos Nascimento Silla Jr. (advisor) - PUCPR
Prof. Dr. Julio Cesar Nievola - PUCPR
Prof. Dr. Deborah Ribeiro Carvalho - PUCPR
Prof. Dr. Anne Magaly de Paula Canuto - UFRN

Curitiba, 22 November 2021.