Mission 4 • MMPLARP

Place: Santiago, Chile

Date: November 21st to 28th, 2025

Attendees:

Jose Saavedra (UA, Chile)

Laurent Heutte (UR, France)

Simon Bernard (UR, France)

Alceu de Souza Britto Jr (PUCPR, Brazil, remotely)

Luiz Eduardo Soares de Oliveira (UFPR, Brazil, remotely)

George Darmiton da Cunha Cavalcanti (UFPE, Brazil, remotely)

Jean Paul Barddal (PUCPR, Brazil, remotely)

Brazilian and French PhD students (remotely)

Activities:

Reception of Prof. Laurent Heutte and Prof. Simon Bernard for a scientific visit.

Participation of both researchers as invited speakers at the Annual Workshop on Computer Vision, contributing to scientific exchange and dissemination of research results.

Technical meetings focused on current research methods for pattern detection in historical documents, including discussion of experimental results and methodological challenges.

Planning of future collaborative research, particularly the development of a foundational model for object detection in historical documents, leveraging large-scale datasets such as HORAE.

Facilitation of academic networking by introducing the LITIS research team to Prof. Yi-Zhe Song (University of Surrey, UK), a visiting researcher specializing in computer vision models for image representation and generation, fostering potential future collaboration.

Seminar 1: Annual Workshop on Computer Vision by Prof. Laurent Heutte (URN, France). Title: Unsupervised Learning-based Information Retrieval Applied to Spot Patterns in Historical Document Images. Abstract: Historical document analysis faces significant hurdles due to the lack of labeled data and the arbitrary nature of search queries. This thesis introduces the first learning-based approach for sub-image retrieval and pattern spotting, moving beyond traditional learning-free methods. We present OS-DETR, a novel model that adapts the Transformer-based DETR architecture to localize unseen query patterns within complex document images. To bypass the scarcity of manual annotations, we developed a specialized synthetic data generation pipeline for training, paired with generalization techniques to ensure robust performance across diverse historical domains. Experimental results on both synthetic and public benchmarks validate that learning task-specific representations offers a more scalable and future-proof solution for cultural heritage preservation. Preliminary explorations into alternative data generation further open promising avenues for adaptable, domain-independent document analysis.

Seminar 2: Annual Workshop on Computer Vision by Prof. Simon Bernard (URN, France). Title: Semi-supervised Multi-domain Translation with Diffusion Models. Abstract: In this talk, I present work recently published in TMLR (https://openreview.net/forum?id=vYdT26kDYM) that addresses semi-supervised multi-domain translation, aiming to learn mappings between arbitrary domain configurations without requiring a separate model for each one. The key idea behind the proposed Multi-Domain Diffusion (MDD) model is to assign a distinct noise level to each domain: a missing domain is represented by maximum noise (pure noise), naturally handling semi-supervised settings without any architectural changes. Less noisy domains are leveraged to reconstruct noisier ones, enabling flexible source/target domain configurations. The method is evaluated on synthetic (BL3NDT), medical imaging (BraTS2020), and face (CelebAMask-HQ) datasets.
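As an illustration of the per-domain noise level idea described in the Seminar 2 abstract, the following is a minimal sketch and not the authors' implementation. The function names, the domain names ("image", "mask"), the number of steps T, and the cosine-style schedule are all assumptions made for this example. Each domain is noised with its own timestep, and a missing domain is replaced by pure noise at the maximum timestep.

```python
import math
import random

T = 1000  # hypothetical number of diffusion steps (assumption)


def alpha_bar(t, total=T):
    """Assumed cosine-style signal level: 1.0 at t=0 (clean), ~0.0 at t=total (pure noise)."""
    return math.cos(0.5 * math.pi * t / total) ** 2


def noise_domains(domains, timesteps, dim, rng):
    """Noise each domain with its own timestep.

    domains:   dict name -> list of floats, or None when the domain is missing
    timesteps: dict name -> timestep in [0, T] for the domains that are present
    dim:       vector length, used to build pure noise for a missing domain
    """
    noisy, used_t = {}, {}
    for name, x0 in domains.items():
        missing = x0 is None
        # A missing domain is assigned the maximum noise level.
        t = T if missing else timesteps[name]
        clean = [0.0] * dim if missing else x0
        a = alpha_bar(t)
        # Standard forward noising: sqrt(a) * x0 + sqrt(1 - a) * eps
        noisy[name] = [
            math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
            for v in clean
        ]
        used_t[name] = t
    return noisy, used_t


# Hypothetical configuration: "image" is an available, clean source domain
# (t = 0) and "mask" is missing, so it starts as pure noise (t = T).
rng = random.Random(0)
domains = {"image": [0.5, -0.2, 0.1], "mask": None}
noisy, used_t = noise_domains(domains, {"image": 0}, dim=3, rng=rng)
```

In this sketch, a single denoising network would receive all noisy domains together with their timesteps in `used_t`, so the same model could serve any source/target configuration. That network is outside the scope of the example.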
