Extract attachments from .eml files

I need an implementation of the Knuth-Morris-Pratt algorithm (or similar) that handles streaming input. I’m pretty sure I’ve implemented this before and thought I might find my code in an old .zip file I’d sent someone by e-mail.

But Gmail refuses a direct download of the attachment, I guess due to it containing a .jar file. I was able to get the full contents of the e-mail via the “Download Original” link though. This gave me a .eml file.

Peeking at the contents of it, I could see it had a pretty simple format:

...
Date: Mon, 9 Aug 2004 22:59:46 -0500
...
Subject: Fwd: latest version of applet
...
Mime-Version: 1.0
Content-Type: multipart/mixed;
    boundary="----=_Part_97_19043715.1092110386307"
...

------=_Part_97_19043715.1092110386307
Content-Type: application/zip; name=SearchToHTML.zip; x-unix-mode=0644
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="SearchToHTML.zip"

UEsDBAoAAAAAAE...
...NQA9EAAA50wCAAAA
------=_Part_97_19043715.1092110386307--

I’m pretty sure there are existing tools to extract these attachments. In fact, I would’ve used Apple’s Mail program to extract it, except it annoyed me by demanding that I set up an e-mail provider before opening the file.

I decided writing a parser for .eml files would be a good exercise for getting back into Go a bit.

You can see the results here: https://github.com/fadend/eml.
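
For a rough idea of what reading this format involves, here is a minimal sketch built on Go's standard library (net/mail, mime, mime/multipart, encoding/base64). It is not the code from the repository above: the Attachment type, the extractAttachments function, and the tiny sample message are all made up for illustration, and it only handles a single level of multipart with base64-encoded parts.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
	"io"
	"mime"
	"mime/multipart"
	"net/mail"
	"strings"
)

// Attachment is a hypothetical holder for one decoded attachment.
type Attachment struct {
	Filename string
	Data     []byte
}

// sampleEML is an invented message with one base64 part ("hello").
const sampleEML = `Subject: sample
Mime-Version: 1.0
Content-Type: multipart/mixed;
 boundary="XYZ"

--XYZ
Content-Type: text/plain; name=hello.txt
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="hello.txt"

aGVsbG8=
--XYZ--
`

// extractAttachments parses raw .eml bytes and returns every top-level
// part that has a filename in its Content-Disposition header.
func extractAttachments(raw []byte) ([]Attachment, error) {
	msg, err := mail.ReadMessage(bytes.NewReader(raw))
	if err != nil {
		return nil, err
	}
	mediaType, params, err := mime.ParseMediaType(msg.Header.Get("Content-Type"))
	if err != nil {
		return nil, err
	}
	if !strings.HasPrefix(mediaType, "multipart/") {
		return nil, fmt.Errorf("not a multipart message: %s", mediaType)
	}
	var out []Attachment
	mr := multipart.NewReader(msg.Body, params["boundary"])
	for {
		part, err := mr.NextPart()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		name := part.FileName()
		if name == "" {
			continue // not an attachment
		}
		var body io.Reader = part
		enc := strings.TrimSpace(part.Header.Get("Content-Transfer-Encoding"))
		if strings.EqualFold(enc, "base64") {
			// The decoder skips the line breaks inside the encoded body.
			body = base64.NewDecoder(base64.StdEncoding, part)
		}
		data, err := io.ReadAll(body)
		if err != nil {
			return nil, err
		}
		out = append(out, Attachment{Filename: name, Data: data})
	}
}

func main() {
	atts, err := extractAttachments([]byte(sampleEML))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, a := range atts {
		fmt.Printf("%s: %d bytes\n", a.Filename, len(a.Data))
	}
}
```

To handle a real file, the bytes from os.ReadFile could be passed to extractAttachments and each Data slice written out with os.WriteFile.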

I was a little surprised by how many bumps I hit along the way, given that I wrote a fair amount of Go during my time at Google. On the other hand, I did that only sporadically, so there was a lot of forgetting between rounds.

I’m reading Ricardo Gerardi’s Powerful Command-Line Applications in Go, which was helpful; I happened across it at random while looking for something else in the library. I’m enjoying it, and despite the bumps, I had a lot of fun working with Go this time. I’ve shared one other (much larger) command-line utility written in Go before, https://github.com/fadend/go-photos, but now I’ll try to keep up with the language, using it for command-line utilities where I otherwise probably would’ve used Python. I also have some server use cases in mind.

I did extract that attachment:

% ./eml_dump --input_eml $HOME/Downloads/search_to_html_latest.eml --output_dir $HOME/Downloads/searchtohtml_attachment 

And the KMP code isn’t there 🙂 Doh. The closest thing in there is this comment:

//XXX! If this step were replaced with a look up table,
//SearchSieves would roughly implement the Knuth-Morris-Pratt
//algorithm...
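
For reference, the lookup table in Knuth-Morris-Pratt is usually a "failure" table: for each prefix of the pattern, the length of the longest proper prefix that is also a suffix. Because the matcher never needs to re-read earlier input, it suits streaming well. Below is a minimal sketch of that idea, not the SearchSieves code; the names StreamMatcher, NewStreamMatcher, and Feed are hypothetical.

```go
package main

import "fmt"

// StreamMatcher finds a fixed pattern in a byte stream delivered in
// chunks, keeping only the match state between calls.
type StreamMatcher struct {
	pattern []byte
	fail    []int // fail[i]: longest proper prefix of pattern[:i+1] that is also its suffix
	matched int   // number of pattern bytes currently matched
	offset  int   // total bytes consumed so far
}

// NewStreamMatcher precomputes the failure table for pattern.
func NewStreamMatcher(pattern []byte) *StreamMatcher {
	fail := make([]int, len(pattern))
	k := 0
	for i := 1; i < len(pattern); i++ {
		for k > 0 && pattern[i] != pattern[k] {
			k = fail[k-1]
		}
		if pattern[i] == pattern[k] {
			k++
		}
		fail[i] = k
	}
	return &StreamMatcher{pattern: pattern, fail: fail}
}

// Feed consumes the next chunk and returns the stream offsets at which
// matches that end inside this chunk begin. Overlapping matches count.
func (m *StreamMatcher) Feed(chunk []byte) []int {
	if len(m.pattern) == 0 {
		m.offset += len(chunk)
		return nil // an empty pattern is treated as matching nothing
	}
	var starts []int
	for _, b := range chunk {
		// On a mismatch, fall back via the table instead of rewinding input.
		for m.matched > 0 && b != m.pattern[m.matched] {
			m.matched = m.fail[m.matched-1]
		}
		if b == m.pattern[m.matched] {
			m.matched++
		}
		m.offset++
		if m.matched == len(m.pattern) {
			starts = append(starts, m.offset-len(m.pattern))
			m.matched = m.fail[m.matched-1]
		}
	}
	return starts
}

func main() {
	m := NewStreamMatcher([]byte("abab"))
	// The pattern straddles the chunk boundary; state carries across calls.
	fmt.Println(m.Feed([]byte("xaba")))
	fmt.Println(m.Feed([]byte("babab")))
}
```

The stream here is "xabababab" split into two chunks, so the first call reports nothing and the second reports the overlapping matches starting at offsets 1, 3, and 5.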

Looking at this old stuff, I can see I’ve definitely learned a lot since then. (If you’re curious, see https://revfad.com/SearchToHTML/; it started as a search engine for the high school paper I founded.)

Looking forward to learning more.


Comments

One response to “Extract attachments from .eml files”

  1. A couple fun resources I encountered while working on this: https://quii.gitbook.io/learn-go-with-tests and https://gobyexample.com
