Data center resource management with temporal dynamic workload
Abstract
The proliferation of Internet services drives data center expansion in both size and number. More importantly, energy consumption (as part of the total cost of ownership (TCO)) has become a societal concern. When the workload demand is given, data center operators want to minimize their TCO. On the other hand, when the workload demand is unknown but the requirements on the quality of experience (QoE) of the Internet services are given, data center operators need to determine the appropriate amount of resources and design redirection strategies in the presence of multiple data centers to guarantee
the QoE. For the first problem, we present formulations to minimize server energy consumption
and server cost under three different data center scenarios (homogeneous, heterogeneous,
hybrid hetero-homogeneous clusters) with dynamic temporal demand. Our
studies show that the homogeneous model significantly differs the heterogenous model in computational time. To be able to compute optimal configurations in near real-time for
large scale data centers, we propose aggregation by maximum and aggregation by mean
modes for different Internet service requirements. In aggregation by maximum mode,
the price of reducing computational time is over-provisioning (which causes extra energy
consumption). In aggregation by mean mode, the price is the degradation of the timeliness
of services. However, they still result in significant cost savings compared to the scenario
when all servers are on during the entire duration. We first introduce an intuitive aggregation method: static aggregation. For each mode, dynamic aggregation is then introduced to alleviate its individual drawbacks. Dynamic aggregation by maximum results
in cost savings up to approximately 18% over the static aggregation by maximum. For
three randomly distributed workload cases, dynamic aggregation by mean reduces workload reallocation by up to approximately 50% compared to static aggregation. Dynamic Voltage/Frequency Scaling (DVFS) capability is further considered in our model. Our numerical results show that adopting DVFS yields a significant reduction of energy
consumption. For the second problem, the data center provides resources via the cloud computing
model. We propose a hierarchical modeling approach that can easily combine all
components in the data center provisioning environment. Identifying interactions among
the components is the key to constructing such a model. For Internet services provided by cloud computing hosted in data centers, we first construct four sub-models: an outbound bandwidth model, a cloud computing availability model, a latency model, and a cloud computing response time model. Then we use a data center
redirection strategy graph to glue them together. We also introduce an all-in-one barometer
to ease QoE evaluation. The numerical results show that our model serves as a useful analytical tool for data center operators to provision appropriate resources and to design redirection strategies. In addition, we study the redirection strategies (schemes) for a particular Internet
service: an agent-based virtual private network architecture (ABVA). ABVA refers to an environment
where a third-party provider runs and administers remote access virtual private
network (VPN) service for organizations that do not want to maintain their own in-house
VPN servers. We consider the problem of optimally connecting users of an organization
to VPN server locations in an ABVA environment so that request denial probability and
latency are balanced. A user request needs a certain bandwidth between the user and
the VPN server. The VPN server may deny requests when the bandwidth is insufficient
(capacity limitation). At the same time, the latency perceived by a user from its current
location to a VPN server is an important consideration. We present a number of strategies
regarding how VPN servers are to be selected and the number of servers to be tried so that
request denial probability is minimized without unduly affecting latency. These strategies
are studied on a number of different topologies. For our study, we consider Poisson and non-Poisson request arrivals under both finite and infinite population models to understand the impact on the entire system. We found that the arrival processes have a significant and consistent impact on the request denial probability, while the impact on latency depends on the traffic load in the infinite model. In the finite model, arrival processes have an inconsistent impact on the request denial probability. As for latency in the finite model, arrivals with a squared coefficient of variation less than one consistently yield the largest latency, followed by the Poisson case, and then the case where the squared coefficient of variation is greater than one. Finally, a strength of this work is the comparison of infinite and finite models; we found that the mismatch between the infinite and the finite model depends both on the number of users in the system and on the load.
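The trade-off between the two aggregation modes can be illustrated with a minimal Python sketch. The demand values, window size, and function name below are hypothetical and chosen only for illustration; the dissertation's actual formulations are optimization models, not this simplification.

```python
def aggregate(demand, window, mode):
    """Aggregate a per-slot demand series into coarser windows.

    mode="max" provisions for the window peak: no slot is under-served,
    but slots below the peak are over-provisioned (extra energy).
    mode="mean" provisions for the window average: energy is saved,
    but peak slots exceed the provisioned level (degraded timeliness).
    """
    out = []
    for i in range(0, len(demand), window):
        chunk = demand[i:i + window]
        out.append(max(chunk) if mode == "max" else sum(chunk) / len(chunk))
    return out

# Toy hourly demand (e.g. required active servers) over 8 slots.
demand = [10, 40, 35, 12, 8, 50, 45, 20]

by_max = aggregate(demand, window=4, mode="max")    # [40, 50]
by_mean = aggregate(demand, window=4, mode="mean")  # [24.25, 30.75]

# Over-provisioning incurred by the max mode relative to the raw series,
# in server-slots: this is the "price" paid for faster computation.
extra = 0
for i, level in enumerate(by_max):
    for d in demand[i * 4:(i + 1) * 4]:
        extra += level - d
```

In this toy series the max mode over-provisions by 140 server-slots, while the mean mode would leave the two peak slots (50 and 45) under-served; dynamic aggregation, as described above, adapts the windows to reduce each mode's respective penalty.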
Table of Contents
Introduction -- Energy-aware data center resource management when resource requirements are given -- Data center resource allocation with DVFS -- A hierarchical model to evaluate quality of experience of online services hosted by data centers -- Balancing request denial probability and latency in an agent-based VPN architecture -- Conclusion and future research -- Appendix
Degree
Ph.D.