Intelligent robotics with digital-twin alignment : semantic navigation, manipulation, planning, and human-to-robot action transformation

Alanazi, Ahmed Hamdan

Files

Alanazi_umkc_0134D_12347.pdf (60.84 MB)

Authors

Alanazi, Ahmed Hamdan

Date

2025

Format

Thesis

Abstract

This dissertation advances AI-empowered indoor robotics through four interconnected contributions that unify navigation, manipulation, semantic planning, and human-to-robot action transformation within a digital-twin-aligned framework. GRIP, a grid-aware semantic navigation module, integrates symbolic scene understanding with hybrid search-and-policy execution to achieve robust and context-aware ObjectNav. PathFormer, a transformer-based manipulation model structured around a 3D spatial-semantic grid, generates smooth, interpretable, and physically consistent trajectories that remain tightly aligned with digital-twin simulation. KG-Transformer, a knowledge-guided semantic planner, leverages a lightweight digital twin to calibrate execution, veto unsafe behaviors, and autonomously repair failing plans across diverse indoor environments. ActionFormer, an action-generation transformer, introduces a unified imitation-learning pipeline that integrates human-activity recognition, human-motion generation, and robot-motion generation. ActionFormer supports more than 20 complex human activities, producing robot-ready demonstrations that generalize across platforms and enable end-to-end imitation learning from video and landmark sequences. Collectively, these contributions establish a coherent foundation for AI-empowered robotics grounded in digital-twin intelligence. Across benchmarks and real-world deployments, GRIP yields up to 9.6% higher success rate and more than 2x gains in path efficiency (SPL, SAE). PathFormer produces digitally consistent manipulation trajectories validated through robust sim-to-real transfer. KG-Transformer achieves 99.6% executability, delivers a +4.6-point improvement on unseen-scene tasks, and eliminates safety violations in both simulated and multi-robot execution. ActionFormer attains state-of-the-art performance in human-activity recognition and high execution accuracy across more than 20 activities, generating realistic human-motion traces and corresponding robot-motion trajectories for embodied robotic demonstration. Together, these advances deliver a trustworthy, semantically aligned, and high-performance simulation-to-reality pipeline that significantly enhances the adaptability, reliability, and real-world readiness of autonomous indoor robotic systems.

Table of Contents

Introduction -- GRIP: a unified framework for grid-based relay and co-occurrence-aware planning in dynamic environments -- PathFormer: a transformer with 3D grid constraints for digital twin robot-arm trajectory generation -- KG-Transformer: evidential knowledge-graph planning for safe language-to-action execution -- ActionFormer: a unified framework for human-to-robot action generation -- Conclusion and future work

URI

https://hdl.handle.net/10355/110323

Degree

Ph.D. (Doctor of Philosophy)

Thesis Department

Computer Science (UMKC)

Collections

2025 UMKC Dissertations - Freely Available Online
Computer Science and Electrical Engineering Electronic Theses and Dissertations (UMKC)
