Computer Science electronic theses and dissertations (MU)

Permanent URI for this collection

https://hdl.handle.net/10355/5300

The items in this collection are the theses and dissertations written by students of the Department of Computer Science. Some items may be viewed only by members of the University of Missouri System and/or University of Missouri-Columbia.



A framework for high-relevance reciprocal matching in mentorship platforms
(University of Missouri--Columbia, 2025) Panapakam, Vijaya Krishna; Wang, Fang
[EMBARGOED UNTIL 12/01/2026] Mentorship is a critical component of professional and academic development, yet opportunities are often limited by personal networks and chance encounters, disproportionately affecting underrepresented students. Existing digital matching platforms frequently rely on restrictive keyword-based searches, which fail to capture the semantic nuance of a candidate's skills and interests, leading to suboptimal matches and continuing systemic biases. This thesis aims to address these challenges by designing, implementing, and rigorously evaluating a novel, reciprocal matching system to connect mentees and mentors with greater accuracy. The proposed solution is a two-stage hybrid recommendation algorithm. The first stage employs semantic search, utilizing transformer-based vector embeddings to represent the rich, unstructured text of user profiles and identify a comprehensive set of conceptually relevant candidates. The second stage uses Reciprocal Rank Fusion, a score-agnostic method, to merge and re-rank candidates from multiple retrieval signals, prioritizing those with consistent high rankings across different models. The system's efficacy is validated through a comprehensive two-phase evaluation framework. An offline analysis compares the proposed algorithm against traditional keyword-based and content-based filtering models using rank-aware metrics. Subsequently, an online A/B test measures real-world impact on user behavior, with the primary success metric defined as the Reciprocal Acceptance Rate. Offline analysis showed the hybrid model achieved superior precision and recall, and the subsequent online A/B test confirmed its practical superiority. The hybrid algorithm yielded statistically significant improvements over the baseline, including a +121.3% lift in Reciprocal Acceptance Rate and a 100% reduction in Empty State Frequency. 
This work contributes a scalable and more equitable framework for mentor matching, demonstrating that a semantic, multi-faceted approach can successfully bridge the mentorship gap and foster more meaningful professional connections.
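As a rough illustration of the Reciprocal Rank Fusion step named in the abstract above, the following minimal Python sketch merges ranked candidate lists using the standard RRF score, the sum over lists of 1 / (k + rank). It is not the thesis implementation: the function name, the sample mentor identifiers, and the constant k = 60 (a value commonly used for RRF) are assumptions made for this example only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists into one list ordered by RRF score.

    rankings: iterable of lists, each ordered best-first.
    k: smoothing constant; 60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, candidate in enumerate(ranking, start=1):
            # Only the rank position is used, never the raw model score,
            # which is why the method is called score-agnostic.
            scores[candidate] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by name for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))


# Hypothetical retrieval signals for one mentee (illustrative data only).
semantic_hits = ["mentor_a", "mentor_b", "mentor_c"]
keyword_hits = ["mentor_b", "mentor_d", "mentor_a"]

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)
```

Candidates that appear near the top of both lists accumulate the largest fused score, which matches the abstract's description of prioritizing consistently high rankings across models.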
Improving resilience in federated learning against data poisoning and network disruption at the edge
(University of Missouri--Columbia, 2025) Haughton, Trevontae; Calyam, Prasad
[EMBARGOED UNTIL 12/01/2026] The increasing deployment of drone swarms at the network edge benefits from the use of Federated Learning (FL), which enables decentralized data processing for visual situational awareness while preserving data privacy. However, FL-based drone swarm systems are vulnerable to data poisoning (e.g., label flipping, feature noise) and network disruption (e.g., distributed denial of service, location spoofing) attacks. There is a need to develop robust threat modeling and mitigation mechanisms to ensure the operational efficiency and security of such systems. In this paper, we present a novel FL-Defend framework that features attack characterization and defense schemes to improve the resilience of FL-based drone swarm edge systems against data poisoning and network disruption attacks. Specifically, we characterize the impact of data poisoning attacks and propose defense strategies such as differential privacy and adversarial training. In addition, we characterize network disruption attacks and incorporate defense strategies such as rate limiting and anycasting. We validate our proposed schemes in the AERPAW testbed and show that adversarial training achieves up to 91.1% accuracy under data poisoning, with a tradeoff of increased CPU usage up to 233% in active drone scenarios. For network disruptions, our defense maintains > 95% model accuracy, reduces training delays from 426s to 180s, limits packet loss to < 12%, detects anomalies with 45 ms latency, and restores up to 89% throughput using rate limiting and 71% recovery via Anycast, at the cost of a 4.2 Gbps bandwidth overhead. These findings highlight critical trade-offs between performance and security.
A Bayesian network framework for fusing spectrum-based fault localization and forward slicing
(University of Missouri--Columbia, 2025) Buchanan, Evan; Ufuktepe, Ekincan
[EMBARGOED UNTIL 12/01/2026] Fault localization techniques based on spectrum analysis (SBFL), such as Ochiai, Tarantula, D*, and Barinel, have achieved state-of-the-art performance in many benchmarks, yet their effectiveness can degrade in complex, multi-fault programs where coverage information alone is insufficient. Program slicing, in contrast, provides precise structural dependencies but often lacks statistical fault evidence. In this thesis, a generalizable probabilistic fusion framework is presented that integrates SBFL metrics with forward slicing through Bayesian networks. Unlike prior hybrids that combine these techniques via fixed heuristics, this approach models the probabilistic relationships between suspiciousness scores and slicing reachability, enabling principled evidence integration. This method is evaluated on multi-fault scenarios using Defects4J and an extended dataset of combined-bug projects, demonstrating consistent improvements over SBFL baselines. The results show that this framework not only boosts accuracy in traditional settings but also substantially improves localization in multi-fault contexts, suggesting a viable path toward more robust and adaptable fault localization in practice.
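For readers unfamiliar with the SBFL metrics named in the abstract above, the sketch below computes the standard Ochiai and Tarantula suspiciousness scores from per-statement coverage counts. It covers only these two baseline formulas, not the Bayesian network fusion or the forward slicing described in the thesis. The function names, argument names, and sample counts are hypothetical.

```python
import math


def ochiai(ef, ep, total_failed):
    """Ochiai score: ef / sqrt(total_failed * (ef + ep)).

    ef: failing tests that execute the statement.
    ep: passing tests that execute the statement.
    total_failed: all failing tests in the suite.
    """
    denom = math.sqrt(total_failed * (ef + ep))
    return ef / denom if denom else 0.0


def tarantula(ef, ep, total_failed, total_passed):
    """Tarantula score: failing-coverage ratio over the sum of both ratios."""
    fail_ratio = ef / total_failed if total_failed else 0.0
    pass_ratio = ep / total_passed if total_passed else 0.0
    total = fail_ratio + pass_ratio
    return fail_ratio / total if total else 0.0


# Hypothetical coverage: statement -> (ef, ep), with 2 failing and 8 passing tests.
coverage = {"s1": (2, 0), "s2": (2, 8), "s3": (0, 4)}
TOTAL_FAILED, TOTAL_PASSED = 2, 8

for stmt, (ef, ep) in coverage.items():
    print(stmt,
          round(ochiai(ef, ep, TOTAL_FAILED), 3),
          round(tarantula(ef, ep, TOTAL_FAILED, TOTAL_PASSED), 3))
```

A statement executed only by failing tests (s1 here) receives the maximum score under both metrics, while one executed by every test (s2) is ranked lower because coverage alone cannot separate it from correct code.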
Software infrastructure for bioinformatics research : modernization and workflow automation
(University of Missouri--Columbia, 2025) Dahal, Sabin; Joshi, Trupti
[EMBARGOED UNTIL 12/01/2026] Bioinformatics platforms must evolve continuously to support the increasing scale and complexity of biological data. Legacy systems, especially those built on outdated architecture, often face limitations in scalability, security, and interoperability. Modernizing these systems ensures that computational frameworks remain maintainable, efficient, and capable of integrating new analytical workflows. Workflow automation further improves reproducibility and usability, both of which are essential in computational biology. This thesis presents the modernization and upgrade of the legacy bioinformatics system KBCommons, as well as the development of sustainable and scalable infrastructure in CrossMP. KBCommons, originally released as KBCommons v1.1 in 2019, provides a multi-omics knowledge base for integrating data across diverse organisms. It has since undergone a complete redesign of both its user interface and back-end architecture. New features now allow researchers to upload and process SNP, miRNA, methylation, RNA-seq, and proteomics datasets through automated pipelines. A dual-server workflow pipeline was implemented to separate web operations from computational tasks, improving scalability and overall performance. Secure data transfer was achieved through the integration of Google APIs. The second project, CrossMP, is a lightweight application designed to extend the infrastructure to support machine-learning workflows. A job-queue and scheduling environment was implemented to manage data-intensive analyses, track task progress, and notify users when jobs are completed. Integration with the Google Drive API enables secure handling of large datasets used in computation. Together, these systems demonstrate how software modernization and workflow automation can transform legacy bioinformatics tools into sustainable, user-friendly infrastructures for reproducible and high-throughput biological discovery. 
This thesis also compares the architecture and performance of two different application designs.
Visual place recognition in aerial imagery
(University of Missouri--Columbia, 2025) Nouduri, Koundinya; Palaniappan, Kannappan
[EMBARGOED UNTIL 12/01/2026] Visual localization within wide-area urban settings presents unique challenges due to building occlusions, substantial perspective variations, and large operational scales. A fundamental difficulty arises from the drastic scale discrepancy between comprehensive aerial imagery used for database construction and off-center, oblique bird's-eye-view drone imagery. While current Visual Place Recognition methods match whole images using aggregated global descriptors, we address the distinct challenge of localizing specific buildings within large-scale aerial imagery using patch-level feature matching. In this thesis, we introduce the Landmark Matching Network (LMNet) family for city-scale aerial image localization. LMNet employs a Siamese architecture with Multi-Patch matching to handle off-center landmarks and occlusions. LMNet++ incorporates multi-head attention mechanisms for improved computational efficiency. WS-LMNet extends this into a fully convolutional architecture for direct landmark detection in high-resolution Wide-Area Motion Imagery (WAMI). Building on these CNN-based methods, we introduce LMDNet leveraging DINOv3 Vision Transformer features with a patch-level similarity algorithm. LMDNet-C extends this with hierarchical representations merging semantic features with attention-weighted discriminative patches. LMDNet-VR implements a coarse-to-fine retrieval pipeline with SIFT-based geometric verification. Extensive experiments across four cities (Albuquerque, Berkeley, Los Angeles, Syracuse) using 10,000 query images demonstrate robust performance. For detection, WS-LMNet achieves 76.2% Top@1 accuracy in localizing buildings across city orbits. For view retrieval, LMDNet-VR reaches 83% Top@1 in identifying exact matching frames. The proposed methods offer practical benefits including ninefold storage reduction and real-time operation.

