
Caffe deep learning conquers image classification
Posted on: Jan 05, 2017
Caffe offers a strong brew for image processing, but the project shows signs of stalling
Like superheroes, deep learning packages usually have origin stories. Yangqing Jia created the Caffe project while earning his doctorate at U.C. Berkeley. The project continues as open source under the auspices of the Berkeley Vision and Learning Center (BVLC), with community contributions. The BVLC is now part of the broader Berkeley Artificial Intelligence Research (BAIR) Lab. Similarly, the scope of Caffe has been expanded beyond vision to include nonvisual deep learning problems, although the published models for Caffe are still overwhelmingly related to images and video.
Caffe is a deep learning framework made with expression, speed, and modularity in mind. Among the promised strengths are that Caffe’s models and optimization are defined by configuration rather than hard-coded, and that a single flag switches between CPU and GPU, so you can train on a GPU machine and then deploy to commodity clusters or mobile devices.
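To make the single-flag idea concrete, here is a minimal Python sketch that renders a Caffe-style solver configuration as text. The field names (net, base_lr, lr_policy, max_iter, solver_mode) follow Caffe's solver settings as I understand them, but the helper render_solver, the file name, and the numeric values are illustrative assumptions rather than anything taken from the project's examples.

```python
def render_solver(net_path, use_gpu, base_lr=0.01, max_iter=1000):
    """Return solver settings in Caffe's text configuration style.

    Hypothetical helper for illustration; all values are placeholders.
    """
    mode = "GPU" if use_gpu else "CPU"
    lines = [
        f'net: "{net_path}"',
        f"base_lr: {base_lr}",
        'lr_policy: "fixed"',
        f"max_iter: {max_iter}",
        f"solver_mode: {mode}",
    ]
    return "\n".join(lines) + "\n"


cpu_config = render_solver("train_val.prototxt", use_gpu=False)
gpu_config = render_solver("train_val.prototxt", use_gpu=True)

# Collect the lines that differ between the two configurations.
changed = [
    (a, b)
    for a, b in zip(cpu_config.splitlines(), gpu_config.splitlines())
    if a != b
]
print(changed)
```

In this sketch the two configurations differ only in the solver_mode line, which is the sense in which moving between CPU and GPU is a configuration change and not a code change.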
Meanwhile, as we enter 2017, Caffe has been stuck at version 1.0.0 RC 3 for almost a year. While there have been code check-ins and visible progress, the project is still not stable. My experience was marred by installation problems, inability to run Jupyter notebooks, and unanswered requests for help. An outsider might get the impression that the project stalled while the deep learning community moved on to TensorFlow, CNTK, and MXNet.
Caffe features and use cases
In the slide deck DIY Deep Learning for Vision: A Hands-On Tutorial with Caffe, Jia and the core Caffe maintainers lay out the how and why of Caffe along with a “highlight reel” of Caffe examples and applications. They describe Caffe as an open framework based on fast, well-tested code, with models and worked examples for deep learning; with a core library written in pure C++ with CUDA; and with command-line, Python, and MATLAB interfaces.
Among the commercial users cited is Facebook, which employs Caffe models for objectionable content detection in uploaded images, an important function whose prudish implementation is the subject of considerable scorn from photographers. I can’t really blame Caffe for that. The Facebook engineers chose to train their filters on nipples, for example, without taking artistic context into consideration. Less controversial are the Facebook “on this day” photo feature for surfacing memories and automatic “alt text” generation to describe images for the blind.
One of Caffe’s more novel techniques is “fine-tuning.” This is the process of taking a model trained on lots of data, such as image keyword tagging from ImageNet, editing the neural network parameters for a different purpose, and using the pretrained parameters as a starting point for learning a new skill, such as image style recognition. The fine-tuning technique can sometimes reduce the time for training on the new classes.
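The core of fine-tuning can be sketched in a few lines of Python. The sketch assumes that pretrained parameters are matched to the new network by layer name, so that a renamed final layer starts from fresh values while the other layers reuse what was already learned. The layer names, sizes, and the helper init_from_pretrained are hypothetical, and plain lists stand in for real weight tensors.

```python
import random


def init_from_pretrained(new_layers, pretrained, seed=0):
    """Build starting weights for a new network.

    new_layers: dict of layer name -> number of weights (stand-in for a shape).
    pretrained: dict of layer name -> list of trained weights.
    Layers whose name and size match reuse the pretrained weights; the
    others get small random values and must be learned from scratch.
    """
    rng = random.Random(seed)
    weights, reused = {}, []
    for name, size in new_layers.items():
        old = pretrained.get(name)
        if old is not None and len(old) == size:
            weights[name] = list(old)  # copy so the source stays untouched
            reused.append(name)
        else:
            weights[name] = [rng.gauss(0.0, 0.01) for _ in range(size)]
    return weights, reused


# Hypothetical pretrained model: two feature layers plus a 1000-way classifier.
pretrained = {
    "conv1": [0.5] * 8,
    "fc7": [0.25] * 16,
    "fc8": [0.1] * 1000,
}

# New task with 20 classes: the last layer is renamed and resized.
new_layers = {"conv1": 8, "fc7": 16, "fc8_style": 20}

weights, reused = init_from_pretrained(new_layers, pretrained)
print(reused)                     # layers that start from pretrained values
print(len(weights["fc8_style"]))  # size of the freshly initialized classifier
```

Training then continues from these starting weights on the new data, which is why it can converge faster than training every layer from random values.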
Installing Caffe
When I first tried to review Caffe a couple of months ago, I was unable to build the Caffe executables on macOS Sierra, which I had just installed. I tracked the problem down to a line in the makefile that explicitly referenced the frameworks for an older version of the OS by number, which is always a red flag, but I decided to wait for the maintainers to start building for Sierra before continuing the review. I also hoped, in vain, that Nvidia would soon start supporting Xcode 8 so that I could build Caffe with CUDA GPU support without affecting my other projects.
I picked up Caffe again in December. After updating my repository, I was able to build and test the executables for the CPU, along with configuring the Python libraries well enough to start executing a sample Jupyter notebook. When the notebook got to a cell with a shell script, however, Python crashed.
I tried installing Caffe again on Windows 10, for which there is support in a new branch of the Caffe repository. The new CMake build process claimed it was working, but didn’t seem to create executables any place I could find; the older Visual Studio build process did work once I converted the projects from Visual Studio 2013 to Visual Studio 2015. Again, however, I had trouble with the Python library installations, and this time couldn’t even start the Jupyter notebooks.
Since Caffe’s “home” system is Ubuntu, I fired up an Ubuntu “Trusty” virtual machine and tried to build Caffe there based on the documentation. As before, sadly, I was able to build and test the executables but not run the Python notebooks.
That left me only two more options before going back to troubleshoot the failed installations: building and running in Docker, or running a preconfigured machine image in the cloud. Reading the Dockerfile in the repository made me think that the vague installation documentation might have been at fault for my three failed attempts. The Docker script installs items on Ubuntu in a different and perhaps more sensible order than the one called out by the documentation. On the other hand, I’d spent several days on Caffe installations that I should have been able to complete in less than an hour, and I’d had enough.
Running Caffe
As I mentioned earlier, Caffe has command-line, Python, and MATLAB interfaces. As I currently lack a copy of MATLAB, I did not try to test that interface.
The command-line executables and libraries compile and build in C++ either with or without GPU support. I built them for CPU-only, as the one Nvidia GPU I have that is powerful enough to use with Caffe is on a MacBook Pro that has the latest version of Xcode installed, and the latest CUDA SDK still requires an older version of Xcode. Using xcode-select to switch to the older Xcode version doesn’t help, at least on my machine.
Caffe relies on prototxt files (Protocol Buffers text format) to define its models and solvers. For example, the figures below show the model and solver configurations for the reference “CaffeNet” image classifier.
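As a rough illustration of what such a text definition looks like, the Python sketch below renders one convolution layer in protobuf text style. The field names (name, type, bottom, top, convolution_param) follow the layer format Caffe uses as I understand it; the renderer to_prototxt is a hypothetical helper, and the numeric values are illustrative, not a transcription of the CaffeNet definition.

```python
def to_prototxt(fields, indent=0):
    """Render (key, value) pairs as protobuf-style text lines.

    A list value is treated as a nested message, a string is quoted,
    and anything else (such as a number) is written as-is.
    """
    pad = "  " * indent
    lines = []
    for key, value in fields:
        if isinstance(value, list):
            lines.append(f"{pad}{key} {{")
            lines.extend(to_prototxt(value, indent + 1))
            lines.append(f"{pad}}}")
        elif isinstance(value, str):
            lines.append(f'{pad}{key}: "{value}"')
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


# Illustrative convolution layer: reads the "data" blob, writes "conv1".
conv1 = [
    ("name", "conv1"),
    ("type", "Convolution"),
    ("bottom", "data"),
    ("top", "conv1"),
    ("convolution_param", [
        ("num_output", 96),
        ("kernel_size", 11),
        ("stride", 4),
    ]),
]

text = "\n".join(to_prototxt([("layer", conv1)]))
print(text)
```

A full model is a sequence of such layer blocks, each naming its input and output blobs, which is how the network topology is expressed without writing C++ or Python.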