dc.contributor.advisor: Lee, Yugyung, 1960-
dc.contributor.author: Tripathi, Rashmi
dc.date.issued: 2018
dc.date.submitted: 2018 Spring
dc.description: Title from PDF of title page viewed June 20, 2018
dc.description: Thesis advisor: Yugyung Lee
dc.description: Vita
dc.description: Includes bibliographical references (pages 74-76)
dc.description: Thesis (M.S.)--School of Computing and Engineering. University of Missouri--Kansas City, 2018
dc.description.abstract: Many open source projects are available on the internet. In particular, growing interest in Deep Learning (DL) has increased the number of DL open source projects. This thesis is motivated by the opportunity to reuse existing projects, either to develop a new, innovative project or to create a more refined version of an existing one. These projects can also guide software developers and students toward effective programming in their own DL projects. The challenge is how to analyze the functionalities or features described in the source code of these projects. The semantics of the source code are hard to understand because the dependencies are deeply intertwined, and as the projects grow in complexity and scale, manually analyzing their workflow or semantics does not scale. This thesis proposes a semantic analytics framework, called SAF-DL, that aims (i) to analyze the sequences of operations in a given open source project and build a graph model, known as a call graph, (ii) to cluster similar functional paths in the call graphs using Machine Learning algorithms, (iii) to find the abstractions (clusters) of the function flows, (iv) to identify the semantics of the function flows, and (v) to discover the workflow by analyzing the dependencies or similarities between functional paths and between projects. The SAF-DL pipeline, which transforms source code into the semantics of a workflow model, was designed with Machine Learning and NLP techniques. In this thesis, Python/TensorFlow/Keras-based open source projects hosted on GitHub are analyzed. A comparative analysis of models was used to evaluate how effectively the SAF-DL framework discovers code abstractions and workflows. The SAF-DL framework was implemented in Python with scikit-learn and tested using three open source projects.
This thesis has demonstrated that the SAF-DL framework can be used in various applications, such as search or retrieval of open source projects, source-code-to-source-code plagiarism detection, and automatic code or test case generation. [eng]
dc.description.tableofcontents: Introduction -- Background and related work -- Proposed framework -- Results and evaluation -- Applications -- Conclusion and future work
dc.format.extent: xiii, 77 pages
dc.identifier.uri: https://hdl.handle.net/10355/64184
dc.publisher: University of Missouri--Kansas City [eng]
dc.subject.lcsh: Machine learning
dc.subject.lcsh: Semantic computing
dc.subject.other: Thesis -- University of Missouri--Kansas City -- Computer science
dc.title: SAF-DL: Semantic Analysis Framework for Deep Learning Open Source Projects [eng]
dc.type: Thesis [eng]
thesis.degree.discipline: Computer Science (UMKC)
thesis.degree.grantor: University of Missouri--Kansas City
thesis.degree.level: Masters
thesis.degree.name: M.S.

