A hierarchical time-indexed database for multi-modal derived sensor data using GPU-accelerated PostgreSQL
Abstract
With the vast influx of data captured by ubiquitous sensors, such as those in wearable devices, smart sensors in long-term care facilities, and hydraulic bed sensors, there is enormous potential for exploiting these multi-modal data with advanced analytics. However, the ability to use these growing datasets effectively depends on high-performance data-handling systems that provide high scalability and easy data access, especially in fields that demand real-time signal processing, such as healthcare, artificial intelligence, machine learning, and scientific research. In this context, this thesis addresses the challenges posed by the immense volume of sensor data by proposing an efficient database design. The design uses general-purpose, highly scalable database systems and integrates them with analytics-focused column stores that exploit hierarchical time indexing, compression, and dense raw numeric data storage. The approach also leverages graphics processing units (GPUs) in conjunction with database management systems (DBMS), using the server programming interface (SPI) of PostgreSQL for accelerated signal processing. We present the design, development, and techniques of a hierarchical time-indexed database for decision support systems and propose a database system for multi-modal derived time-series feature data (e.g., respiration, restlessness, and heart rate). We also introduce a data-processing pipeline to enable Big Data analytics for multi-modal time-series feature data and compare the performance of CPU and GPU approaches for feature extraction from sensor data. Our evaluations reveal the performance characteristics and tradeoffs of each component, with special emphasis on data access latencies and storage requirements, which are vital for capacity planning in scalable systems. The proposed database system is shown to be highly scalable and offers straightforward integration with existing analytic tools via SQL interfaces. Furthermore, we discuss the usability, adaptability, timing metrics, and precision differences between CPU and GPU code under different thread and block configurations, thereby offering a comprehensive view of real-time signal processing systems.
Degree
M.S.