Designing optimal- and fast-on-average pattern matching algorithms

Abstract: Given a pattern w and a text t, the speed of a pattern matching algorithm over t with regard to w is the ratio of the length of t to the number of text accesses performed to search for w in t. We first propose a general method for computing the limit of the expected speed of pattern matching algorithms, with regard to w, over iid texts. Next, we show how to determine the greatest speed that can be achieved among a large class of algorithms, together with an algorithm running at this speed. Since the complexity of this determination makes it impossible to deal with patterns of length greater than 4, we propose a polynomial heuristic. Finally, our approaches are compared with 9 pre-existing pattern matching algorithms from both a theoretical and a practical point of view, i.e., both in terms of limit expected speed on iid texts and in terms of observed average speed on real data. In all cases, the pre-existing algorithms are outperformed.
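
The speed measure defined in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names (`naive_search`, `horspool_search`, `speed`) are hypothetical, and the two searchers are standard textbook algorithms (naive left-to-right search and Horspool's variant of Boyer-Moore), chosen only as examples. Every read of a text character is assumed to count as one text access, including re-reads of the same position.

```python
def naive_search(pattern, text):
    """Naive left-to-right search; returns (match positions, text accesses)."""
    m, n = len(pattern), len(text)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    accesses, matches = 0, []
    for start in range(n - m + 1):
        j = 0
        while j < m:
            accesses += 1  # one read of text[start + j]
            if text[start + j] != pattern[j]:
                break
            j += 1
        if j == m:
            matches.append(start)
    return matches, accesses


def horspool_search(pattern, text):
    """Horspool search; returns (match positions, text accesses)."""
    m, n = len(pattern), len(text)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    # Shift for each character of pattern[:-1]: distance from its
    # last occurrence there to the end of the pattern.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    accesses, matches, start = 0, [], 0
    while start <= n - m:
        j = m - 1
        while j >= 0:  # compare the window right to left
            accesses += 1  # one read of text[start + j]
            if text[start + j] != pattern[j]:
                break
            j -= 1
        if j < 0:
            matches.append(start)
        # The last character of the window was already read by the first
        # comparison above, so reusing it here is not counted again.
        start += shift.get(text[start + m - 1], m)
    return matches, accesses


def speed(text, accesses):
    """Ratio of the text length to the number of text accesses."""
    return len(text) / accesses if accesses else float("inf")


if __name__ == "__main__":
    pattern, text = "ab", "cccccccc"
    for search in (naive_search, horspool_search):
        matches, accesses = search(pattern, text)
        print(search.__name__, matches, accesses, speed(text, accesses))
```

On this toy input the naive search reads 7 text characters and Horspool reads 4, giving speeds of 8/7 and 2 respectively. The quantity studied in the paper is the limit of the expected value of this ratio over iid texts, which this sketch does not compute.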
Document type: Journal articles

Cited literature: 27 references

https://hal.archives-ouvertes.fr/hal-01310165
Contributor: Gilles Didier
Submitted on: Monday, May 2, 2016 - 8:45:13 AM
Last modification on: Tuesday, April 2, 2019 - 12:26:01 PM
Long-term archiving on: Tuesday, May 24, 2016 - 4:48:16 PM

File

Optimal.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01310165, version 1
  • arXiv: 1604.08860

Citation

Gilles Didier, Laurent Tichit. Designing optimal- and fast-on-average pattern matching algorithms. Journal of Discrete Algorithms, Elsevier, 2017, 42, pp.45-60. ⟨http://www.sciencedirect.com/science/article/pii/S157086671630048X⟩. ⟨hal-01310165⟩