Toward a Reliable Network Management Framework
Because modern life depends heavily on the Internet, measuring and managing network reliability is critical. Understanding the health of a network through outage and failure analysis is essential for assessing its reliability, identifying problem areas for improvement, and characterizing network behavior accurately. However, little is known about the characteristics of node outages and link failures in access networks. In this dissertation, we carry out an in-depth outage and failure analysis of a university campus network using a rich set of node outage and link failure data and topology information collected over multiple years. We investigate the diverse statistical characteristics of both wired and wireless networks using big data analytic tools for network management. Furthermore, we classify the different types of network failures and management issues and their strategic resolutions. While the recent adoption of Software-Defined Networking (SDN) and the softwarization of network functions and controls ease network reliability, management, and various network-level service deployments, monitoring network reliability remains very challenging. It is challenging because it not only requires vast measuring and processing resources but also introduces an additional intermediate network, the so-called 'control-path network', that physically connects the control and data plane networks. We propose a topology-aware network management framework that utilizes Link Layer Discovery Protocol (LLDP) messages, prudently controlling their frequency and taking the tier-based network architecture into account. It provides fast and effective reliability information for faster recovery from failures. The topology-aware analysis also enables us to explore the economic impact and cost of various types of network failures with regard to Capital Expenditure (CapEx) and Operational Expenditure (OpEx).
Wireless LAN (Local Area Network), or Wi-Fi, has become the primary mode of network access for most users; thus, measuring its performance is a critical part of network management in access networks. Through a large-scale, extensive analysis of a university campus Wi-Fi network, we find that its performance behavior and management issues are very distinct from those of a wired network. The study also informs a strategic Wi-Fi Access Point deployment and an enhanced Wi-Fi association scheme for better coverage and enhanced user experience. Most current and future networks will involve both wired and wireless subnets. Our work on understanding the unique issues of each and their interplay sheds light on managing and improving network reliability in a holistic manner.
Table of Contents
Introduction -- Understanding the reliability of a university campus network with SPLUNK -- Topology-aware reliability management framework for softwarized network systems -- Protocol heterogeneity issues of campus incremental wi-fi upgrade -- Agile polymorphic software-defined fog computing platform for mobile wireless controllers and sensors -- Cost of unplanned network outage and SLA verification -- Lesson learned -- Conclusion and future directions