Lessons Learned: Natively Compiling Tidy HTML for Heroku
Recently I was working on a project and wanted to use Tidy to clean up some HTML output. I added the tidy_ffi gem to my project and voilà, it worked! Or, to be more specific, it worked locally.
Once I pushed to Heroku I started running into trouble, namely that libtidy.so, the dynamically linkable native library that tidy_ffi depends upon, wasn’t found. Uh oh.
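A quick way to confirm this kind of failure is to ask the dynamic loader directly. The sketch below is not part of tidy_ffi; it uses Ruby's standard Fiddle library, and the helper name loadable_library? is a hypothetical one chosen for illustration.

```ruby
require 'fiddle'

# Hypothetical helper (not part of tidy_ffi): returns true if the dynamic
# loader can open the shared library at the given path or name.
def loadable_library?(name)
  handle = Fiddle.dlopen(name)
  handle.close
  true
rescue Fiddle::DLError
  # Raised when the library is missing or cannot be loaded on this system.
  false
end

puts loadable_library?('libtidy.so')
```

Running this in a shell on the target machine should show whether the library is visible to the loader before the app ever boots.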
Getting My Hands Dirty To Get Tidy
Before yesterday I knew about Heroku buildpacks in theory. I also knew that I really didn’t want to have to use one to solve this problem. My first clue came in the form of a Stack Overflow post in which someone used Tidy on a Bamboo-stack application but was having trouble migrating it. Aha! Surely this will solve my problem.
So I rolled up my sleeves to do the kind of low-level work I usually try to avoid (while still trying to avoid as much of it as possible). I used heroku run bash to shell into a fresh Bamboo app that I created and then used SCP to copy out the libtidy.so file there. I added it to my repo, followed the instructions in the Stack Overflow post, and pushed. And the app came crashing down.
As it turns out, Bamboo and Cedar have diverged in their precise Linux installations since the post was written. The versions are the same, but the git SHAs are different. C’est la vie. Now we turn to more complex solutions.
I knew that I would need to compile Tidy myself, but how? As it turns out, Heroku offers a tool called vulcan that allows you to create a build server in the cloud and compile binaries that are compatible with Heroku (because they’re compiled on Heroku). After a few hiccups, I had my build server up and running, but now I needed to build from source!
Tidy is an old project. I mean, it’s a really old project. It uses CVS as its versioning system. Unable to check out using CVS as per the project’s instructions (I’m still not sure why this didn’t work, but it probably has something to do with the fact that it was CVS), I instead downloaded a tarball from browsing the CVS repo on SourceForge.
Once I had the source, it was time to build it. Tidy doesn’t have a typical structure for a buildable library, but after some experimentation I finally figured out the necessary incantations:
vulcan build -v -s ~/Downloads/tidy -p /tmp/tidy -c "sh build/gnuauto/setup.sh && ./configure --prefix=/tmp/tidy --with-shared && make && make install"
Some notes about what’s going on here:
-s ~/Downloads/tidy was the directory into which I downloaded Tidy’s source.
-p /tmp/tidy sets the prefix on the Heroku filesystem. Since Heroku apps can only write to /tmp, this needed to be inside it.
-c "..." is the build command. I used the same prefix when configuring for make to build to the right directory.
--with-shared gets Tidy to compile the .so files and not just a static library.
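Since the prefix has to match in two places, it can help to generate the command rather than type it. Here is a minimal sketch in Ruby that rebuilds the same vulcan invocation from its parts; the helper name vulcan_build_command is hypothetical, and it uses only the flags shown above.

```ruby
# Hypothetical helper: assembles the vulcan command from the post so the
# install prefix is written once and reused for both -p and --prefix.
def vulcan_build_command(source_dir:, prefix:)
  build_steps = [
    "sh build/gnuauto/setup.sh",
    "./configure --prefix=#{prefix} --with-shared",
    "make",
    "make install"
  ].join(" && ")

  %(vulcan build -v -s #{source_dir} -p #{prefix} -c "#{build_steps}")
end

puts vulcan_build_command(source_dir: "~/Downloads/tidy", prefix: "/tmp/tidy")
```

The printed string can be pasted into a shell as-is. The helper does no quoting of its arguments, so it assumes paths without spaces.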
Once this command had run, vulcan downloaded a tarball containing the files I needed. Woohoo! I added the compiled library to my repo as lib/native/libtidy.so and I was ready to rock and roll!
Getting Up and Running
Further experimentation and frustration ensued trying to get everything just right but here’s the Ruby code that finally got things working:
require 'tidy_ffi'
if ENV['RACK_ENV'] == 'production'
  TidyFFI.library_path = "/app/lib/native/libtidy.so"
  require 'tidy_ffi/interface'
  require 'tidy_ffi/lib_tidy'
end
Here we set the library path manually and throw in some extra requires that didn’t autoload properly for some reason in production. After all that work my app was able to Tidy HTML like a champ!
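If hard-coding the path feels brittle, one possible variation is to look for the vendored file first and only override the library path when it is actually there. This is a sketch rather than what I shipped: bundled_libtidy_path is a hypothetical helper, and the commented usage assumes tidy_ffi is already required and that the app lives in /app as above.

```ruby
# Hypothetical helper: returns the absolute path of a vendored libtidy.so
# under app_root, or nil when the file is not there.
def bundled_libtidy_path(app_root)
  path = File.join(app_root, 'lib', 'native', 'libtidy.so')
  File.exist?(path) ? path : nil
end

# Intended use (assumes tidy_ffi has been required):
#   path = bundled_libtidy_path('/app')
#   TidyFFI.library_path = path if path
p bundled_libtidy_path(Dir.pwd)
```

With this approach the same code can run locally, where the helper returns nil and tidy_ffi falls back to its default lookup.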
So Long, Comfort Zone
While I’m comfortable plumbing the depths of any Ruby application, I’m not actually well versed in solving problems like this. It was a chance to step outside my comfort zone and figure something out through trial, error, patience, and frustration. Knowing a little about how vulcan works is going to have me feeling more confident the next time I need a native library that isn’t available by default on Heroku.
If you want to use Tidy on Heroku, you don’t have to go through quite the same ordeal that I did because you can download the Heroku-compatible libtidy.so file directly! Just add it to your repo, link it using the Ruby above, and have fun tidying up!