Cost-efficient vision AI: challenges and solutions for real-time and stored video analytics with classical and generative AI

Format

Thesis

Abstract

Artificial Intelligence (AI) has become integral to vision-based applications, automating tasks such as object classification, detection, and segmentation in domains like video surveillance. Vision AI systems typically perform either real-time or offline analysis. Real-time analysis processes video streams as they are captured, which is essential for applications like live surveillance, while offline analysis processes large stored video datasets after capture, supporting use cases such as crime detection, video summarization, and interactive querying. Despite their significance, Vision AI systems face critical challenges in balancing accuracy and cost-efficiency. Key cost factors include latency, model size, redundant computations, API usage, and data privacy. These challenges hinder scalability and performance, particularly in real-time systems, where high latency and large models impede responsiveness. For stored video analytics, the computational demands of complex querying and inefficient data processing increase costs, especially when generative AI models require frequent API calls. This dissertation addresses these challenges by exploring innovative solutions for cost-efficient Vision AI systems. The proposed approaches include optimizing model construction, reducing real-time video processing costs, mitigating API expenses in video document analysis, and developing cost-effective generative AI techniques for video analytics. These advancements aim to strike a balance between accuracy and cost-efficiency, enabling scalable deployment of Vision AI systems across diverse applications.

Table of Contents

Introduction -- Cost-efficient model construction -- Cost-efficient video input processing -- Cost-efficient privacy-preserving video analytics -- Advancing cost-efficient RAG system for videos -- Cost-efficient video-to-text conversion -- Cost-efficient LLM API usage -- Conclusion

Degree

Ph.D. (Doctor of Philosophy)
