2026
Tomašević, Darian; Peer, Peter; Štruc, Vitomir; Miočić, Matej: Diff-FIT: Generating Facial Composites with Diffusion Models. Journal Article. In: IEEE Access, 2026, ISSN: 2169-3536.

Tags: diffusion, face, face composite, face editing, face recognition

Abstract: Facial composites, or police sketches, are essential tools in law enforcement for reconstructing the appearance of suspects from eyewitness descriptions. Traditionally, forensic artists produce these composites manually through intensive and time-consuming collaboration with witnesses. To improve the efficiency of this process, automated approaches have leveraged advancements in deep learning and generative modeling. However, the application of recent diffusion models has remained unexplored, despite their unparalleled text-guided synthesis capabilities. To this end, we present Diff-FIT (Diffusion Facial Identification Technique), a novel multi-pipeline framework for generating photorealistic facial composites in only a few steps with pretrained diffusion models. Diff-FIT enables rapid generation of initial composites from textual descriptions, followed by intuitive sequential edits, including global image-to-image translation, local text-based inpainting, and drag-based geometric transformations. Through experiments across multiple latent diffusion models and sampling parameters, we determine the configuration that best balances image quality, diversity, image-text alignment, and identity consistency. In a user study involving biometric experts and non-experts, Diff-FIT achieves real-world utility comparable to state-of-the-art systems in both subjective evaluations and identification rates with generated facial composites, while enabling greater variation and flexibility through description-based generation and diverse editing pipelines for adding distinct facial features.
The source code for the Diff-FIT framework is publicly available at: https://github.com/matemato/Diff-FIT.
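The few-step generation the abstract refers to can be illustrated with a deterministic DDIM-style sampling loop. This is a generic numpy sketch, not the Diff-FIT implementation: `eps_model` is a hypothetical stand-in for a trained denoising network, and the schedule values are illustrative.

```python
import numpy as np

def ddim_sample(eps_model, x_T, alphas_cumprod, steps):
    """Deterministic DDIM sampling: denoise x_T to an image in few steps.

    eps_model(x, t) -> predicted noise at timestep t (a stand-in for a
    trained diffusion U-Net; hypothetical, not the Diff-FIT networks).
    alphas_cumprod: cumulative noise-schedule products, one per timestep.
    """
    # Evenly spaced subset of timesteps, high -> low; skipping timesteps
    # is what makes few-step sampling possible.
    ts = np.linspace(len(alphas_cumprod) - 1, 0, steps, dtype=int)
    x = x_T
    for i in range(len(ts) - 1):
        t, t_prev = ts[i], ts[i + 1]
        a_t, a_prev = alphas_cumprod[t], alphas_cumprod[t_prev]
        eps = eps_model(x, t)
        # Predict the clean sample x_0 from the current noisy x_t.
        x0 = (x - np.sqrt(1.0 - a_t) * eps) / np.sqrt(a_t)
        # Deterministic (eta = 0) step to the less noisy timestep t_prev.
        x = np.sqrt(a_prev) * x0 + np.sqrt(1.0 - a_prev) * eps
    return x
```

In a full pipeline of this kind, the same loop runs in the latent space of an autoencoder and `eps_model` is conditioned on the text description; inpainting variants additionally blend in the known (unmasked) latent region at each step.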