A Data Science Approach to Extracting Insights About Cities and Zones Using Open Government Data
Abstract
In this research, we introduce a system that utilizes open government data and
machine learning algorithms to extract meaningful insights about cities and zones in the
United States. The United States is home to only an estimated 4% of the world’s
population, yet it hosts more prominent websites than any other country [16]. An
estimated 43% of the top one million websites in the world are hosted in the United
States (see Figure 1), making it the most influential country in producing data on the
web; Germany follows, hosting only 8% [16]. Although most data content on the web is
unstructured, the US
government has taken the initiative to release structured data in fields such
as health, education, safety, development, and finance. Such datasets are referred to as
Open Government Data (OGD) and are aimed at increasing the transparency and
accountability of the US government. Our aim is to provide a well-defined procedure that
processes raw OGD and produces expressive insights into the different zones within a
city, the differences between cities, and the differences among zones located in
different cities.
Table of Contents
Introduction -- Approach and method -- Evaluation and results -- Conclusion and future work
Degree
M.S.