Collaborative deep reinforcement framework for multimodality integration and learning
Format
Thesis
Abstract
Panoramic dental radiographs are essential in modern dental diagnostics, providing a comprehensive two-dimensional view of the oral cavity to identify abnormalities, restorations, and pathologies. However, manual interpretation is prone to variability caused by human fatigue, radiographic artifacts, and overlapping anatomical structures, which can reduce diagnostic precision and consistency. Although deep learning methods have improved automated segmentation and identification, most existing systems operate as static and isolated models, limiting their ability to adapt, collaborate, and improve once deployed in clinical environments. This dissertation develops and evaluates a collaborative deep reinforcement learning framework to address these limitations by enabling adaptive cooperation among multiple learning components. The proposed framework integrates complementary perception and identification models through a reinforcement-driven collaboration mechanism that dynamically optimizes their joint predictions. Rather than relying on fixed ensemble rules, an agent learns to adjust collaboration strategies based on performance-driven rewards, improving robustness under challenging conditions such as missing teeth, restorations, and orthodontic appliances. The framework operates through three key stages: independent model learning, collaborative inference via adaptive fusion, and reinforcement-based refinement guided by policy optimization. A carefully designed reward function promotes accurate localization, boundary consistency, and anatomically valid predictions, allowing the system to reduce uncertainty and iteratively improve its performance. Experimental evaluation on a publicly available panoramic radiograph dataset demonstrates that the proposed approach consistently outperforms conventional deep learning baselines in both tooth segmentation and identification. 
The framework achieves segmentation and identification accuracies that exceed 98%, while significantly reducing false detections and improving F1-scores across varied test scenarios. Qualitative analyses further confirm the improved boundary delineation and numbering consistency. From a clinical perspective, this work advances the development of self-improving and autonomous dental imaging systems capable of supporting diagnostic decision-making with greater accuracy and reliability. By bridging static inference with adaptive, reinforcement-driven collaboration, the proposed framework offers a scalable and interpretable solution with potential for extension to three-dimensional imaging, restoration analysis, and other real-world dental and medical applications.
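The adaptive-fusion idea described in the abstract, where an agent learns how strongly to weight each collaborating model from a performance-driven reward, can be illustrated with a minimal sketch. This is not the dissertation's implementation: the two probability maps, the single-parameter policy, the soft-Dice reward, and the finite-difference policy step are all illustrative assumptions standing in for the full reinforcement-based refinement stage.

```python
import numpy as np

rng = np.random.default_rng(0)

def soft_dice(prob, target):
    """Soft Dice overlap between a probability map and a mask (reward signal)."""
    inter = (prob * target).sum()
    return 2.0 * inter / (prob.sum() + target.sum() + 1e-8)

# Hypothetical probability maps from two independently trained models;
# model_b is deliberately the less noisy of the pair.
target = np.zeros((32, 32))
target[8:24, 8:24] = 1.0
model_a = np.clip(target + rng.normal(0, 0.3, target.shape), 0, 1)
model_b = np.clip(target + rng.normal(0, 0.1, target.shape), 0, 1)

theta, lr = 0.0, 5.0  # policy parameter (logit of model_b's weight), step size

for _ in range(200):
    w = 1.0 / (1.0 + np.exp(-theta))            # current fusion weight
    reward = soft_dice((1 - w) * model_a + w * model_b, target)
    # Probe a perturbed fusion weight (the "action") and score its reward.
    w_p = np.clip(w + rng.normal(0, 0.05), 1e-3, 1 - 1e-3)
    reward_p = soft_dice((1 - w_p) * model_a + w_p * model_b, target)
    # Finite-difference policy step: move toward the better-rewarded weight.
    theta += lr * (reward_p - reward) * (w_p - w)

w = 1.0 / (1.0 + np.exp(-theta))
print(round(w, 3))  # the learned weight drifts toward the cleaner model
```

Because the reward here is a smooth function of the fusion weight, the perturb-and-compare update behaves like stochastic gradient ascent; in the dissertation's setting the reward additionally encodes boundary consistency and anatomical validity, and the policy is correspondingly richer.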
Table of Contents
Introduction -- Foundations and related work -- DeepCollab: collaborative deep learning architecture for multimodal representation learning -- MEDFUSION-RL: collaborative multimodal deep reinforcement learning with LLM-based interpretability -- VisionKG-3D: vision knowledge graph with dynamic deep reinforcement for 3D multimodal semantic understanding -- Conclusion and future work
Degree
Ph.D. (Doctor of Philosophy)
