Installing TensorFlow

This is another quick post on installation difficulties and how to alleviate them. We're looking at TensorFlow as an ML solution for many of the things we are exploring with vg. It's awesome that it's free and open-source, and the community is growing by the day. However, installation isn't always a breeze.

I first tried to install TensorFlow with pip, following Google's instructions (I already had python-dev and pip on my system):

sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.0-py2-none-linux_x86_64.whl

This fails with an error saying that the wheel isn't supported on my platform. There's a simple workaround for this on StackOverflow, but it still wouldn't work for me. After updating pip, I tried the local install method referenced in the TensorFlow docs:

wget https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.0-py2-none-linux_x86_64.whl
sudo pip install tensorflow-0.7.0-py2-none-linux_x86_64.whl

This seemed to work, but when I cracked open python and tried import tensorflow as tf, I hit yet another error, even though I'm on Ubuntu 14.04 and not Mac OS.

The solution was to update my protobuf to the bleeding edge:

git clone --recursive https://github.com/google/protobuf.git
cd protobuf/
./autogen.sh
./configure --prefix=/usr
make -j 4
make check ## All tests passed here
sudo make install
sudo ldconfig ## Refreshes the dynamic linker cache so the new libs are found

At this point, I had installed the C++ version of protobuf and could compile things with protoc, but I still needed the python bindings.
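Before moving on, a quick sanity check that the freshly built protoc is the one on the PATH (the version reported will depend on what you cloned):

which protoc
protoc --version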

## Still in protobuf dir
cd python/
python setup.py build
python setup.py test ## Fails ~ 2% of all tests
python setup.py install
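To confirm the bindings landed where Python expects them, something like this works (step out of the protobuf source tree first so the in-tree copy doesn't shadow the installed one):

cd ~
python -c "import google.protobuf; print('protobuf bindings OK')"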

Only then could I test TensorFlow and run the examples. If you're installing locally, the instructions should be about the same, but you'll need to use ./configure --prefix=/your/install/dir and make sure the relevant lib and include directories are on your library and include search paths (LD_LIBRARY_PATH and friends).
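For the local-install case, a rough sketch of what that looks like (the paths are placeholders for wherever you pointed --prefix), plus a one-line smoke test that the import error is really gone:

export LD_LIBRARY_PATH=/your/install/dir/lib:$LD_LIBRARY_PATH
export PATH=/your/install/dir/bin:$PATH
python -c "import tensorflow as tf; print('TensorFlow import OK')"

Hopefully the next post will be about doing something neat with TensorFlow now that I've got it installed!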

Parallel Make Tips

I spent a lot of time fixing makefiles these past weeks. It seems there isn't much about debugging makefiles on the internet, so I'll place this here as a way to collate a bunch of StackOverflow posts.

VG has quite a few dependencies and lots of individual code modules, and a serial make build takes about 20 minutes. Travis CI builds are even worse, sometimes taking over 30 minutes (perhaps something to do with virtualization overhead?). Early on we had parallel builds working, but when I introduced vg deconstruct I inadvertently broke them: parallel builds would run for a while, then fail out, forcing us to finish each one with a serial run.

Debugging

All of our issues came down to missing Make dependencies for various targets. To debug this, I went through each source file and made sure that its #include lines matched the dependencies listed in the Makefile. I also found some ghost targets/dependencies, where I had misspelled a dependency and Make had never complained. Once I'd made sure all the includes were listed as dependencies, I would kick off a parallel build and wait to see whether the dreaded *** Error showed up on the command line.
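As a concrete example of the kind of thing I was checking for (file names made up): if module.cpp includes a header that another rule generates, that header has to appear as a prerequisite of module.o, otherwise a -j build can try to compile module.cpp before the header exists.

## Recipe lines must start with a tab.
module.o: module.cpp schema.pb.h
	$(CXX) $(CXXFLAGS) -c -o $@ module.cpp

## The generated header that module.cpp includes.
schema.pb.h: schema.proto
	protoc --cpp_out=. schema.proto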

There has got to be a better way to do this...

But I haven't found it yet. Sometimes running make -n (dry run) would help, as I could see what was happening without all the debug messages from packages being built. I could probably also write a little BASH/Python to find the include/dependency discrepancies, but I've been distracted with other things.
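If I did write that script, it would probably look something like this rough, untested sketch. It assumes sources live in src/, one object file per .cpp, and a Makefile that spells dependencies out as explicit "foo.o: foo.cpp bar.hpp" rules, and it only catches headers that are written the same way in both places:

#!/usr/bin/env bash
## For each source file, list headers it #includes that are missing from the
## corresponding object file's rule in the Makefile.
for src in src/*.cpp; do
    obj=$(basename "${src%.cpp}.o")
    ## Headers pulled in with #include "..." in the source.
    grep -oP '#include\s+"\K[^"]+' "$src" | sort -u > /tmp/includes.txt
    ## Prerequisites declared for the matching object file.
    grep -E "^$obj:" Makefile | sed "s/^$obj://" | tr ' ' '\n' | sort -u > /tmp/deps.txt
    missing=$(comm -23 /tmp/includes.txt /tmp/deps.txt)
    if [ -n "$missing" ]; then
        echo "== $src is missing dependencies:"
        echo "$missing"
    fi
done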

Telling Make what to make of Makefile lines

I kept getting an ambiguous warning that my recursive make lines weren't recognized as make processes, so the sub-builds were being run in serial. Prefixing the relevant recipe lines in vg's Makefile with a + fixed this. Thanks again, StackOverflow!
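A minimal sketch of what that looks like (the target and directory names are made up): the leading + tells the parent make that the recipe line is a recursive make, so the -j jobserver gets passed down instead of the child build silently falling back to one job.

deps/libfoo/libfoo.a:
	+cd deps/libfoo && make libfoo.a

Using $(MAKE) instead of a literal make in the recipe has a similar effect, since GNU make also recognizes that as a recursive invocation.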

Ensure Make target is a file

I had originally used a dummy target for this, which prevented Make from ever deciding that the build was up to date. I think I'll avoid things like make all and stick to real file targets from now on. I even use hidden files for pre-build steps such as setting up folders (e.g. touching a file named .pre_build).
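Something like this sketch (directory names made up): because the target is a real file, make can tell the setup step has already run and skips it on later builds.

.pre_build:
	mkdir -p bin lib obj
	touch .pre_build

## Anything that needs the directories just depends on the marker file.
obj/main.o: src/main.cpp .pre_build
	$(CXX) $(CXXFLAGS) -c -o $@ src/main.cpp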

Build executable off of the library, not a crap ton of object files

I had originally patched up vg to build the executable from a ton of object files that were also bundled up into a library for others to use. This was pretty silly on my part. By making the executable depend on the library and the library depend on the object files, I made the build even quicker and ensured that the binary and library contain identical code. I should have done this in the first place but didn't yet know any better.
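In Makefile terms the change looks roughly like this (names are illustrative, not vg's actual rules):

OBJ = foo.o bar.o baz.o

## Bundle the object files into the library once...
libvg.a: $(OBJ)
	ar rcs $@ $(OBJ)

## ...and link the executable against the library rather than the raw objects.
vg: main.o libvg.a
	$(CXX) $(CXXFLAGS) -o $@ main.o libvg.a $(LDFLAGS)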

Results

vg used to take 20 minutes to build in serial and up to ten minutes to build in parallel. I'm now consistently getting builds under four minutes with make -j 4, both in a virtual machine on a MacBook Pro and on my quad-core desktop. Incremental builds are fixed again, and everyone is much happier.