Optimizing the Na¨ıve String-Matching Algorithm using Parallel Processing

Sangeerththanan, S.; Yasotha, R.

dc.contributor.author	Sangeerththanan, S.
dc.contributor.author	Yasotha, R.
dc.date.accessioned	2026-03-07T04:50:06Z
dc.date.available	2026-03-07T04:50:06Z
dc.date.issued	2025
dc.identifier.uri	http://drr.vau.ac.lk/handle/123456789/1941
dc.description.abstract	String matching is a fundamental computational problem with broad applications in bioinformat ics, natural language processing, cybersecurity, and large-scale text analytics. While advanced algorithms such as Knuth–Morris–Pratt (KMP) and Boyer–Moore achieve faster performance through preprocessing and efficient skipping strategies, the Na¨ıve String-Matching Algorithm remains attractive for its concep tual simplicity, implementation ease, and adaptability across diverse datasets. However, its quadratic time complexity (O(mn)) constrains performance for large-scale data processing. To address this limitation, this research investigates parallelization techniques to enhance the Na¨ıve algorithm, introducing a novel bidirectional parallel variant named “Binamatch.” The Binamatch algorithm was implemented using Open Multi-Processing (OpenMP) to leverage thread-level parallelism on multi-core architectures. Unlike the tra ditional unidirectional comparison, Binamatch performs a bidirectional search, scanning from both ends of the pattern simultaneously to reduce redundant comparisons. The algorithm was evaluated against sequen tial and parallel implementations of Na¨ıve, Knuth–Morris–Pratt (KMP), and Boyer–Moore algorithms across various datasets, including Deoxyribonucleic Acid (DNA) sequences, ASCII text, and symbolic patterns. Ex periments conducted on 16-core and 32-core systems with chunk sizes ranging from 64 to 1024 measured scalability, execution time, and efficiency. Results revealed that although the bidirectional strategy theoret ically improved comparison locality, Binamatch underperformed in practice, executing 10–1000 times slower than baseline algorithms due to high synchronization overhead, cache inefficiency, and unbalanced workload distribution. Despite this outcome, the research contributes a rigorous benchmarking framework and empir ical insights into the challenges of parallelizing simple algorithms on shared-memory systems. To enhance future applicability, a targeted optimization plan is proposed: adaptive workload partitioning to minimize idle threads, cache-aware scheduling to improve memory locality, and hybrid execution combining sequential preprocessing with selective parallel matching. These refinements are expected to reduce synchronization costs and enhance scalability. Overall, this study underscores that while the Na¨ıve algorithm can be paral lelized, performance benefits depend critically on architectural optimization, offering valuable direction for developing efficient hybrid models in modern parallel string-matching research.	en_US
dc.language.iso	en	en_US
dc.publisher	Faculty of Applied Science University of Vavuniya Sri Lanka	en_US
dc.subject	Algorithm optimization	en_US
dc.subject	Bidirectional search	en_US
dc.subject	Na¨ıve algorithm	en_US
dc.subject	OpenMP	en_US
dc.subject	Parallel processing	en_US
dc.subject	Performance benchmarking	en_US
dc.subject	String matching	en_US
dc.title	Optimizing the Na¨ıve String-Matching Algorithm using Parallel Processing	en_US
dc.type	Conference abstract	en_US
dc.identifier.proceedings	1st International Conference on Applied Sciences- 2025	en_US