Efficient H.264 video coding with a working memory of objects
Efficient spatiotemporal prediction that removes source redundancy is critical in video coding. The newest international video coding standard, H.264, introduces several advanced features, such as multiple-frame motion prediction and spatial intra prediction, which significantly improve overall coding efficiency. In this work, we focus on efficient H.264 video coding for video monitoring and surveillance. The video camera, which is mostly stationary, watches the surveillance scene continuously and compresses the video streams, which are then transmitted to a remote end for information analysis or archived on a storage device. In these video monitoring and surveillance scenarios, the video frame rate is often set relatively low, and the activities of persons in the scene often exhibit strong patterns that may repeat at different spatiotemporal scales. We aim to develop efficient methods that exploit this type of long-term source correlation to improve overall video compression efficiency. We propose a working memory approach for efficient temporal prediction in H.264 video coding. After video frames are encoded, objects are extracted, analyzed, and indexed in a dynamic database that acts as a working memory for the H.264 video encoder. At the same time, silhouettes are evaluated under different compression configurations by comparison with ground truth. During the encoding process, objects with similar spatial characteristics are retrieved from the working memory and used for motion prediction of objects in the current video frame. This approach extends multiple-frame motion estimation and provides a more generic framework for spatiotemporal prediction of video data. Our experimental results on indoor activity monitoring video data demonstrate that the proposed approach reduces the coding bit rate by up to 35% with a small computational overhead.
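The retrieve-then-predict idea described above can be illustrated with a minimal sketch. It is not the authors' encoder: the descriptor (height, width, mean luma), the first-in-first-out eviction, the `capacity` and `k` parameters, and all names (`WorkingMemory`, `describe`, `best_reference`) are hypothetical choices made only for illustration, and the sum of absolute differences (SAD) stands in for whatever matching cost a real H.264 encoder would use.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Patch = List[List[int]]  # rows of 8-bit luma samples for one object


@dataclass
class StoredObject:
    features: Tuple[float, ...]  # hypothetical spatial descriptor
    patch: Patch


def describe(patch: Patch) -> Tuple[float, ...]:
    """Hypothetical descriptor: (height, width, mean luma)."""
    h, w = len(patch), len(patch[0])
    mean = sum(sum(row) for row in patch) / (h * w)
    return (float(h), float(w), mean)


def sad(a: Patch, b: Patch) -> int:
    """Sum of absolute differences between two equally sized patches."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


@dataclass
class WorkingMemory:
    capacity: int = 64  # assumed bound on stored objects
    entries: List[StoredObject] = field(default_factory=list)

    def insert(self, patch: Patch) -> None:
        """Index an object extracted from an already encoded frame."""
        self.entries.append(StoredObject(describe(patch), patch))
        if len(self.entries) > self.capacity:
            self.entries.pop(0)  # assumed policy: evict the oldest entry

    def retrieve(self, patch: Patch, k: int = 4) -> List[StoredObject]:
        """Return the k stored objects whose descriptors are closest."""
        q = describe(patch)

        def dist(e: StoredObject) -> float:
            return sum((u - v) ** 2 for u, v in zip(e.features, q))

        return sorted(self.entries, key=dist)[:k]


def best_reference(memory: WorkingMemory, current: Patch) -> Optional[StoredObject]:
    """Pick the retrieved object that predicts `current` with the lowest SAD."""
    size = (len(current), len(current[0]))
    candidates = [
        c for c in memory.retrieve(current)
        if (len(c.patch), len(c.patch[0])) == size  # SAD needs equal sizes
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda c: sad(c.patch, current))
```

In this sketch, the encoder would call `insert` for each object extracted from an encoded frame and `best_reference` for each object in the current frame, then code only the residual against the returned patch. A real system would also search over displacement and handle objects of differing size, both of which are omitted here.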