Today I am going to talk about my day job a bit. Unlike in my previous jobs, a good part (but not all) of what I do now is either public domain or open-source. Two of the projects I joined recently are Theano and PyLearn.
Theano is a mathematical expression compiler that maps expressions described in Python to machine-efficient code, targeting either the CPU or the GPU. PyLearn is a work in progress that aims to provide a comprehensive machine-learning framework built on Theano.
Theano, as I said in the introduction, is in fact a compiler that knows a lot about mathematical expressions, especially the mathematics of machine learning. It analyzes your problem description (a “Theano Graph”), applies a number of optimizations (both for speed and for numerical stability), and produces efficient C code for the CPU or CUDA code for the GPU.
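To give a feel for what "analyze a graph, then optimize it for numerical stability" can mean, here is a small toy sketch in plain Python. It is not Theano's API and not how Theano is implemented: the `Node` class and the `optimize` and `evaluate` helpers are hypothetical names made up for this illustration, and the toy interprets the graph instead of generating C or CUDA code. It builds the expression log(1 + exp(x)) as a graph and rewrites it into a single stable "softplus" node.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    """One vertex of a toy expression graph (hypothetical, not Theano's)."""
    op: str                          # "var", "const", "add", "exp", "log", "softplus"
    args: Tuple["Node", ...] = ()    # child expressions
    value: object = None             # variable name or constant value


def var(name):
    return Node("var", value=name)


def const(c):
    return Node("const", value=float(c))


def add(a, b):
    return Node("add", (a, b))


def exp(a):
    return Node("exp", (a,))


def log(a):
    return Node("log", (a,))


def optimize(node):
    """Rewrite every log(1 + exp(x)) subgraph into softplus(x)."""
    args = tuple(optimize(a) for a in node.args)   # optimize children first
    node = Node(node.op, args, node.value)
    if node.op == "log" and args[0].op == "add":
        left, right = args[0].args
        if left == const(1) and right.op == "exp":
            return Node("softplus", right.args)
    return node


def softplus(x):
    # max(x, 0) + log1p(exp(-|x|)) equals log(1 + exp(x)) but cannot overflow.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def evaluate(node, env):
    """Interpret the graph; env maps variable names to floats."""
    if node.op == "var":
        return env[node.value]
    if node.op == "const":
        return node.value
    vals = [evaluate(a, env) for a in node.args]
    if node.op == "add":
        return vals[0] + vals[1]
    if node.op == "exp":
        return math.exp(vals[0])
    if node.op == "log":
        return math.log(vals[0])
    if node.op == "softplus":
        return softplus(vals[0])
    raise ValueError("unknown op: " + node.op)


x = var("x")
graph = log(add(const(1), exp(x)))   # the naive expression
stable = optimize(graph)             # same meaning, one softplus node
```

With this sketch, `evaluate(graph, {"x": 1000.0})` raises an `OverflowError` because exp(1000) does not fit in a float, while `evaluate(stable, {"x": 1000.0})` returns 1000.0. A real expression compiler applies many rewrites of this kind, for speed as well as stability, before emitting code.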
Sometimes, the speed-ups from using the GPU over the CPU are substantial (about 10×), even when Theano uses a high-performance library like ATLAS as its CPU back-end.
PyLearn (and PyLearn2, on which we are currently working) offers many types of machine-learning helpers, from mathematical functions and formulæ (optimized for speed and numerical stability) to a machine-learning API designed to provide useful building blocks for learning algorithms.
Of course, I write this only as a teaser, because 1) it would take far too long for a single blog entry to describe all that Theano and PyLearn can do; and 2) I want you to discover a bit more about Theano and PyLearn over future entries, at a rate of about one a month, with the side effect of some machine-learning-learning.
An introductory video on Theano, a presentation given by James Bergstra.
The tutorial is here.