Data center resource management with temporal dynamic workload
The proliferation of Internet services drives data center expansion in both size and number. More importantly, energy consumption (as part of the total cost of ownership (TCO)) has become a social concern. When the workload demand is given, data center operators want to minimize their TCO. On the other hand, when the workload demand is unknown but quality of experience (QoE) requirements for the Internet services are given, data center operators need to determine the appropriate amount of resources and design redirection strategies across multiple data centers to guarantee the QoE. For the first problem, we present formulations to minimize server energy consumption and server cost under three data center scenarios (homogeneous, heterogeneous, and hybrid hetero-homogeneous clusters) with dynamic temporal demand. Our studies show that the homogeneous model differs significantly from the heterogeneous model in computational time. To compute optimal configurations in near real-time for large-scale data centers, we propose aggregation-by-maximum and aggregation-by-mean modes for different Internet service requirements. In aggregation-by-maximum mode, the price of reducing computational time is over-provisioning (which causes extra energy consumption); in aggregation-by-mean mode, the price is degraded timeliness of service. Both modes nevertheless yield significant cost savings compared to keeping all servers on for the entire duration. We first introduce an intuitive aggregation method, static aggregation; for each mode, a dynamic aggregation is then introduced to alleviate its individual drawbacks. Dynamic aggregation by maximum yields cost savings of up to approximately 18% over static aggregation by maximum. For three randomly distributed workload cases, dynamic aggregation by mean reduces workload reallocation by up to approximately 50% compared to static aggregation.
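The trade-off between the two aggregation modes can be illustrated with a minimal sketch. The function below is hypothetical (the thesis's actual formulations are optimization models, not this helper): it collapses a per-slot demand series into coarser windows, where aggregating by maximum never under-provisions but may leave servers idle, and aggregating by mean provisions less capacity at the cost of deferring slots whose demand exceeds the window average.

```python
# Hypothetical illustration of the two aggregation modes described above.
# "max" over-provisions (extra energy) but preserves timeliness;
# "mean" provisions tightly but degrades timeliness for above-average slots.

def aggregate(demand, window, mode="max"):
    """Collapse per-slot demand into one provisioning value per window."""
    agg = []
    for i in range(0, len(demand), window):
        chunk = demand[i:i + window]
        if mode == "max":
            agg.append(max(chunk))               # provision for the window's peak
        else:
            agg.append(sum(chunk) / len(chunk))  # provision for the window's average
    return agg

# Example: hourly demand aggregated into 4-hour windows.
hourly = [10, 12, 30, 11, 9, 8, 25, 10]
print(aggregate(hourly, 4, "max"))   # [30, 25]  -- never under-provisions
print(aggregate(hourly, 4, "mean"))  # [15.75, 13.0] -- cheaper, defers peak slots
```

Either aggregated series is much shorter than the original, which is the source of the computational savings: the optimizer solves over far fewer time points.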
Dynamic Voltage/Frequency Scaling (DVFS) capability is further incorporated into our model. Our numerical results show that adopting DVFS significantly reduces energy consumption. For the second problem, the data center provides resources via the cloud computing model. We propose a hierarchical modeling approach that can easily combine all components of the data center provisioning environment; identifying the interactions among the components is the key to constructing such a model. For Internet services delivered through cloud computing hosted in data centers, we first construct four sub-models: an outbound bandwidth model, a cloud computing (hosted by data centers) availability model, a latency model, and a cloud computing response time model. We then use a data center redirection strategy graph to glue them together, and we introduce an all-in-one barometer to ease QoE evaluation. The numerical results show that our model serves as a useful analytical tool for data center operators to provision appropriate resources and to design redirection strategies. In addition, we study redirection strategies (schemes) in a particular Internet service, the agent-based VPN architecture (ABVA), in which a third-party provider runs and administers remote-access virtual private network (VPN) service for organizations that do not want to maintain their own in-house VPN servers. We consider the problem of optimally connecting users of an organization to VPN server locations in an ABVA environment so that request denial probability and latency are balanced. A user request needs a certain bandwidth between the user and the VPN server; the server may deny requests when the available bandwidth is insufficient (a capacity limitation). At the same time, the latency perceived by a user from its current location to a VPN server is an important consideration.
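The energy benefit of DVFS can be sketched with the standard back-of-the-envelope argument (an assumption for illustration, not the thesis's exact model): dynamic power scales roughly cubically with frequency (since voltage tracks frequency, P ~ k·V²·f ~ k·f³), while the execution time of a fixed amount of work scales inversely with frequency, so the dynamic energy E = P·t falls roughly quadratically as frequency is lowered.

```python
# Illustrative sketch, assuming the common cubic dynamic-power model P ~ k * f^3.
# Lowering frequency lengthens runtime linearly but cuts dynamic energy quadratically.

def dynamic_energy(work, freq, k=1.0):
    """Dynamic energy to finish `work` cycles at frequency `freq` (arbitrary units)."""
    power = k * freq ** 3   # dynamic power ~ f^3 (voltage scales with frequency)
    time = work / freq      # lower frequency -> proportionally longer runtime
    return power * time     # E ~ k * work * f^2

full = dynamic_energy(work=1e9, freq=2.0)  # run at full speed
half = dynamic_energy(work=1e9, freq=1.0)  # run at half speed
print(half / full)  # 0.25 -- a quarter of the dynamic energy, at double the runtime
```

This is why a resource manager that can tolerate longer completion times (e.g., under the mean-aggregated demand) gains so much from slowing servers down rather than merely turning them off or on.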
We present a number of strategies for how VPN servers are selected and how many servers are tried so that the request denial probability is minimized without unduly affecting latency. These strategies are studied on a number of different topologies. For our study, we consider Poisson and non-Poisson request arrivals under both finite and infinite population models to understand the impact on the entire system. We found that the arrival process has a significant and consistent impact on the request denial probability, while its impact on latency depends on the traffic load in the infinite model. In the finite model, arrival processes have an inconsistent impact on the request denial probability; as for latency in the finite model, arrivals with a squared coefficient of variation less than one consistently yield the largest latency, followed by the Poisson case, and then the case with a squared coefficient of variation greater than one. Finally, a strength of this work is the comparison of the infinite and finite models: we found that the mismatch between them depends on both the number of users in the system and the load.
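For intuition about a capacity-limited server denying requests, a classical loss model can be sketched (an illustrative assumption, not the thesis's exact formulation): under Poisson arrivals with per-request bandwidth units, the denial probability at a single server with c units of capacity behaves like the Erlang-B blocking probability of an M/M/c/c system.

```python
# Illustrative sketch, assuming an Erlang-B (M/M/c/c) loss model: the probability
# that a VPN server with c bandwidth units denies a request under Poisson arrivals
# with offered load `load` = arrival rate / service rate (in Erlangs).

def erlang_b(load, c):
    """Blocking (denial) probability via the numerically stable recursion."""
    b = 1.0
    for n in range(1, c + 1):
        b = (load * b) / (n + load * b)  # B(n) = a*B(n-1) / (n + a*B(n-1))
    return b

# A server with 10 bandwidth units offered a load of 8 Erlangs:
print(round(erlang_b(8.0, 10), 4))  # ~0.1217, i.e. about 12% of requests denied
```

Trying a second server on denial multiplies this single-server probability down (at the cost of the extra latency of the farther server), which is exactly the denial-versus-latency balance the selection strategies above explore.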
Table of Contents
Introduction -- Energy-aware data center resource management when resource requirements are given -- Data center resource allocation with DVFS -- A hierarchical model to evaluate quality of experience of online services hosted by data centers -- Balancing request denial probability and latency in an agent-based VPN architecture -- Conclusion and future research -- Appendix