Toward a Reliable Network Management Framework
Abstract
As modern life depends heavily on the Internet, the measurement and
management of network reliability is critical. Understanding the health of a network via
outage and failure analysis is especially essential to assess the reliability of a network,
identify problem areas for network reliability improvement, and characterize the network
behavior accurately. However, little is known about the characteristics of node outages
and link failures in access networks. In this dissertation, we carry out an in-depth outage
and failure analysis of a university campus network using a rich set of node outage and
link failure data and topology information over multiple years. We investigate the diverse
statistical characteristics of both wired and wireless networks using big data analytic tools
for network management. Furthermore, we classify the different types of network failures
and management issues, along with strategies for resolving them.
While the recent adoption of Software-Defined Networking (SDN) and softwarization of network functions and controls ease network reliability, management, and various network-level service deployments, the task of monitoring network reliability is still
very challenging. It is challenging because monitoring not only requires vast measurement and
processing resources but also introduces an additional intermediate network, a so-called
'control-path network', that physically connects the control and data plane networks. We
propose a topology-aware network management framework that utilizes Link Layer Discovery Protocol (LLDP) messages, prudently controlling their frequency
and taking the tier-based network architecture into account. It provides fast and effective reliability
information for faster recovery from failures. The topology-aware analysis also enables
us to explore the economic impact and the cost of various types of network failures with
regard to Capital Expenditure (CapEx) and Operational Expenditure (OpEx).
Wireless LAN (Local Area Network), or Wi-Fi, has become the primary mode of
network access for most users; thus, its performance measurement becomes a critical part
of network management in access networks. Through large-scale, extensive analysis of
a university campus Wi-Fi network, we find that its performance behavior and management
issues are quite distinct from those of a wired network. The study also informs strategic Wi-Fi
Access Point deployment and an enhanced Wi-Fi association scheme for better coverage and
enhanced user experience.
Most current and future networks will involve both wired and wireless
subnets. Our work on understanding the unique issues of each and their interplay
sheds light on managing and improving network reliability in a holistic manner.
Table of Contents
Introduction -- Understanding the reliability of a university campus network with SPLUNK -- Topology-aware reliability management framework for softwarized network systems -- Protocol heterogeneity issues of campus incremental Wi-Fi upgrade -- Agile polymorphic software-defined fog computing platform for mobile wireless controllers and sensors -- Cost of unplanned network outage and SLA verification -- Lesson learned -- Conclusion and future directions
Degree
Ph.D.