KmpMatcher + links

In an earlier post, I talked about extracting attachments from .eml files, which I was doing in pursuit of a half memory of having implemented the Knuth-Morris-Pratt algorithm. That turned out to be a dead end so I ended up consulting a lot of other sources and glomming together something new from those. You can see what I ended up with here: https://github.com/fadend/kmp_matcher.

What I mostly want to do in this post is catalogue a few of the interesting links that I found while working on this. The most interesting for me, I think, but also the ones I have furthest to go in digesting, derive KMP from naive string matching:

Kind of brain melting for me but in a good way.

My starting off point for finding many of these was this post: Reconstructing the Knuth-Morris-Pratt algorithm. That also led me to this wonderful online textbook of string matching algorithms: EXACT STRING MATCHING ALGORITHMS. (Unfortunately, there seems to be an issue with their SSL certificate at the moment.)

One thing I’ve (mostly) enjoyed working on this has been getting into using Bazel, an open sourced version of the Google internal Blaze build system. I guess maybe it’s a matter of having grown used to it over the years, but it just “feels right”. I’m planning to write more about that soon.

One other random fun thing from this project was encountering Fibonacci words via the original KMP paper. They are used in constructing a worst-case for algorithm, but what caught my eye are the pretty patterns they can produce. Looking forward to doing some of my own visualizations of them soon.


Comments

Leave a Reply

Your email address will not be published. Required fields are marked *