I have always loved LaTeX for generating documents. It’s such an elegant way for a developer to “develop” documents!
Then I wanted to produce posts from the LaTeX source I had. This is text manipulation, so I picked Ruby and started writing a parser. Then I stopped.
How many times have I written a parser? Probably half a dozen, for different reasons, but it always started with text and ended with nodes.
This time I looked around and found Treetop, which lets you write grammars that parse a language into a tree and then associate operations with the nodes (like converting a node to HTML).
Of course I could have looked around for a LaTeX-to-HTML converter or an existing Treetop LaTeX grammar, but then I wouldn’t have learnt Treetop itself.
I just committed the grammar, the Ruby script that generates HTML from the nodes, and the Ruby script I used during development, which reads a .tex file and converts it to HTML.
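To make the "text in, nodes out, operation per node" idea concrete, here is a minimal sketch in plain Ruby. It does not use Treetop and is not the code from my repository: the hash-based node shape and the names `to_html` and `children_html` are made up for illustration, and the tree is written by hand instead of coming out of a parser.

```ruby
# Stand-in for a parse tree: each node is a Hash with a :type key.
# (Hypothetical shape; Treetop builds its own syntax node objects.)
def children_html(node)
  node[:children].map { |child| to_html(child) }.join
end

# The "operation" associated with each kind of node: render it as HTML.
def to_html(node)
  case node[:type]
  when :text      then node[:value]
  when :emph      then "<em>#{children_html(node)}</em>"
  when :paragraph then "<p>#{children_html(node)}</p>"
  when :section   then "<h2>#{node[:title]}</h2>\n#{children_html(node)}"
  else raise ArgumentError, "unknown node type: #{node[:type].inspect}"
  end
end

# A tree a parser might produce for: \section{Intro} Some \emph{text}.
tree = {
  type: :section, title: "Intro",
  children: [
    { type: :paragraph, children: [
      { type: :text, value: "Some " },
      { type: :emph, children: [{ type: :text, value: "text" }] },
      { type: :text, value: "." }
    ] }
  ]
}

puts to_html(tree)
```

With a real grammar the tree comes from the parser, and the rendering method lives on the node classes rather than in one big `case`.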
git://github.com/inverno/Treetop-Latex-Grammar.git
The grammar is very limited, basically just what I need for my own paper, but, who knows, someone might enjoy playing with it.
Just noticed that WordPress “pre” and “code” tags don’t properly encode much beyond < and >. For now I have fixed it by hand.
I guess I’ll update my Treetop grammar and Ruby HTML builder to do the encoding for WordPress. Some errors might still appear, though. As usual ;)
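For the record, the encoding step could look roughly like the sketch below. It relies on `CGI.escapeHTML` from Ruby's standard library; the helper name `code_block_html` is hypothetical and not something that exists in my scripts yet, and I am assuming that escaping the text before wrapping it in the tags is enough for WordPress.

```ruby
require 'cgi'

# Hypothetical helper: escape the source text first, then wrap it,
# so that &, <, >, and quotes survive inside <pre><code>.
def code_block_html(source)
  "<pre><code>#{CGI.escapeHTML(source)}</code></pre>"
end

puts code_block_html('a < b && c > d')
# prints: <pre><code>a &lt; b &amp;&amp; c &gt; d</code></pre>
```

Escaping at the point where text nodes are rendered means it happens in one place and I stop fixing posts by hand.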