Place: Rouen - France
Date: 31/08 to 07/09/2025
Attendees:
- Laurent Heutte (UR, France)
- Simon Bernard (UR, France)
- Fannia Pacheco (UR, France)
- Alceu de Souza Britto Jr (PUCPR, Brazil)
- Luiz Eduardo Soares de Oliveira (UFPR, Brazil)
- George Darmiton da Cunha Cavalcanti (UFPE, Brazil)
- Jean Paul Barddal (PUCPR, Brazil)
- Brazilian and French PhD students
Activities:
- First Meeting: Laurent Heutte (UR), Prof. Alceu Britto (PUCPR), Jean Paul Barddal (PUCPR), Simon Bernard (UR), Luiz E. S. Oliveira (UFPR), and George D. C. Cavalcanti (UFPE)
- Seminar 1: Prof. Alceu Britto (PUCPR) - Title: Dynamic Modality and View Selection for Emotion Recognition: A Study on the Impact of Missing Modality.
Abstract: Multiple channels, such as speech (voice) and facial expressions (image), are crucial in understanding human emotions. However, AI’s journey in multimodal emotion recognition (MER) is marked by substantial technical challenges. One significant hurdle is how AI models manage the absence of a particular modality, a frequent occurrence in real-world situations. This study’s central focus is assessing the performance of two strategies when confronted with the lack of one modality: a novel multimodal dynamic modality and view selection method, and a cross-attention mechanism. Results on the RECOLA dataset show that dynamic selection-based methods are a promising approach for MER. In the missing-modality scenarios, most dynamic selection-based methods outperformed the baseline. The study concludes by emphasizing the intricate interplay between audio and video modalities in emotion prediction, showcasing the adaptability of dynamic selection methods in handling missing modalities.
- Seminar 2: Prof. Jean Paul Barddal (PUCPR) - Title: OnlineSIRUOS: An Inverse Random Under and Oversampling, Heterogeneous Ensemble, and Meta-Learning Approach for Imbalanced Data Stream Classification.
Abstract: In this talk, we discuss OnlineSIRUOS, an ensemble-based algorithm for data stream classification. In particular, OnlineSIRUOS was tailored to handle class imbalance in data streams. Furthermore, the results obtained using both synthetic and real-world data show that the proposed method is particularly well suited for highly imbalanced data streams.
- Seminar 3: Prof. George D. C. Cavalcanti (UFPE) - Title: Imbalanced Regression Pipeline Recommendation.
Abstract: Imbalanced problems are prevalent in various real-world scenarios and are extensively explored in classification tasks. However, they also present challenges for regression tasks due to the rarity of certain target values. A common alternative is to employ balancing algorithms in preprocessing to address dataset imbalance. However, due to the variety of resampling methods and learning models, determining the optimal solution requires testing many combinations. Furthermore, the learning model, dataset, and evaluation metric affect the best strategies. This work proposes the Meta-learning for Imbalanced Regression (Meta-IR) framework, which diverges from existing literature by training meta-classifiers to recommend the best pipeline composed of the resampling strategy and learning model per task in a zero-shot fashion. The meta-classifiers are trained using a set of meta-features to learn how to map the meta-features to the classes indicating the best pipeline. We propose two formulations: Independent and Chained. Independent trains the meta-classifiers to separately indicate the best learning algorithm and resampling strategy. Chained involves a sequential procedure where the output of one meta-classifier is used as input for another to model intrinsic relationship factors. The Chained scenario showed superior performance, suggesting a relationship between the learning algorithm and the resampling strategy per task. Compared with AutoML frameworks, Meta-IR obtained better results. Moreover, compared with baselines of six learning algorithms and six resampling algorithms plus no resampling, totaling 42 (6 x 7) configurations, Meta-IR outperformed all of them.
- Seminar 4: Prof. Luiz E. S. Oliveira (UFPR) - Title: Predicting Heart Failure Hospitalizations with LLMs from Health Insurance Data.
Abstract: Heart failure (HF) represents a global clinical and economic challenge, with hospitalizations accounting for 65% of disease-related costs. This study proposes an approach to predict HF hospitalizations using Large Language Models (LLMs) trained on chronological data from Brazilian health insurance beneficiaries. By converting administrative records (consultations, medications, diagnoses) into temporal narratives, models like RoBERTa and Open-Cabrita3B were fine-tuned to identify clinical deterioration patterns. The HealthHistoryRoBERTa-pt model, trained with historical health insurance data and specifically adjusted for HF, achieved an AUC-ROC of 0.93-0.95 in prediction windows from 5 to 180 days, significantly outperforming other studies using static clinical data or basic demographics (AUC 0.63-0.76) and those combining clinical and administrative data (AUC 0.82). Notably, the model maintained an F1-score greater than 0.85 and a sensitivity of 0.87 in predictions of up to 150 days, which reveals that administrative variables (e.g., history of hospitalizations, frequency of consultations) function as effective proxies for socioeconomic and behavioral factors that are traditionally neglected. Compared to other works, this study demonstrates that longitudinal health insurance data combined with NLP techniques capture non-linear risk trajectories, enabling precise predictions for strategic health planning.
- Seminar 5: Assmaa Alsamadi (PhD student, UR) - Title: DeepDO: Deep diagnostic for Anomaly Detection in Time Series.
Abstract: State space models (SSMs) have demonstrated remarkable performance for time series (TS) anomaly detection (AD) in dynamic control systems. However, they require mathematical modeling, which increases model complexity. Moreover, deep learning (DL) models excel at learning estimates and representations, making them suitable for AD. Yet, they often require training complex architectures and fine-tuning them per dataset to generalize across different TS types. This presentation introduces DeepDO, the first unsupervised AD model that integrates the dynamics of a specific SSM into a DL framework. Comprehensive experiments on large TS benchmark datasets demonstrate that DeepDO outperforms 40 state-of-the-art models. The results reflect DeepDO's strong generalization across diverse anomaly types in a wide range of univariate and multivariate TS.
- Seminar 6: Eduardo Ferreira (PhD student, PUCPR) - Title: Ridley and Jopling Revisited - A New Approach to the Classical Leprosy Classification System.
Abstract: Leprosy remains a major public health challenge, and its correct classification is essential for guiding treatment. The traditional Ridley and Jopling method, although widely adopted, depends heavily on clinical expertise and often suffers from subjectivity. In this work, we propose an alternative approach based on machine learning, introducing a Random Forest classifier to support leprosy classification. Our method relies on pre-defined prototypes representing the spectrum of clinical presentations. For each new case, the Random Forest computes its similarity to these prototypes, assigning the most probable class. Beyond classification accuracy, the proposed model enhances interpretability by providing explanations of the decision process, indicating which clinical and laboratory variables most influenced the classification outcome.
- Seminar 7: Romain Mussard (PhD student, UR) - Title: Universal Domain Adaptation for Time Series Classification.
Abstract: Universal Domain Adaptation (UniDA) aims to transfer knowledge from a labeled source domain to an unlabeled target domain, even when their classes are not fully shared. Few dedicated UniDA methods exist for Time Series (TS), which remains a challenging case. In general, UniDA approaches align common class samples and detect unknown target samples from emerging classes. Such detection often results from thresholding a discriminability metric. The threshold value is typically either a fine-tuned hyperparameter or a fixed value, which limits the ability of the model to adapt to new data. Furthermore, discriminability metrics exhibit overconfidence for unknown samples, leading to misclassifications. This presentation introduces UniJDOT, an optimal-transport-based method that accounts for the unknown target samples in the transport cost. Experiments on TS benchmarks demonstrate the discriminability, robustness, and state-of-the-art performance of UniJDOT.