Giving In (or Why Ruby Alone Isn't Enough)

Those of you who've spoken with me on the subject know that I'm something of a dyed-in-the-wool Ruby advocate.  Ever since I first used the language, I've found it to be the most beautiful, elegant, and productive language in which I've ever written code.  Ruby also opened my eyes to the possibilities of using very high level languages for writing full applications (not just nifty scripts).  What follows is in no way an abandonment of Ruby.  However, sometimes one encounters tools that are just too interesting to pass up.  This happened to me recently, and it has begun leading me down a path which I have long avoided... learning Python.

My current research involves a lot of linear algebra.  I need to manipulate large (often sparse) matrices, and perform operations such as solving standard and generalized eigenvalue problems.  This particular need for numerics tools left me with few options.  There is Fortran... no, I didn't even consider it.  Luckily, many of the useful numeric Fortran libraries have been wrapped in C or C++.  For a while, I'd been developing my code base in C++, using uBLAS (as I have a strong affinity for Boost whenever I'm working in C++) and SLEPc as my general numerics package.  C++ was fine for all of my ancillary data structures and whatnot, but I find its matrix manipulation abilities and syntax somewhat lacking.  Here, by somewhat lacking, I mean an absolute PITA.  I realize that uBLAS might not have the most intuitive syntax for array manipulation (and I must admit that there was at least one interesting solution I didn't explore: a C++ package named FLENS).  However, when working with matrices, what I really wanted was something like Matlab... but definitely not Matlab.

In academia, Matlab is simultaneously a blessing and a curse.  Since almost every academic institution has a site license for Matlab, it's ubiquitous.  It provides almost all of the functionality you would need when dealing with matrix math and the associated operations (hence the name!).  It provides matrix manipulation syntax that often expresses, very naturally, the manner in which one may wish to alter or operate on a matrix.  It's even a decent performer, as many of its linear algebra operations wrap low-level Fortran libraries to do the real brunt of the work.  However, despite these niceties, there are some major drawbacks.  Matlab is not, at least to me, an acceptable general purpose programming language.  While Matlab is terrific for matrix manipulation, I wouldn't want to have to write a complicated data structure in it.  Further, despite what I hear of some advances in Matlab 2008, it doesn't lend itself well to the OO paradigm that often helps when designing larger scale applications.  Operations like general file I/O tend to be tedious, and many algorithms scale poorly, working acceptably only for "toy" datasets in Matlab.  Often, using Matlab for the numerics parts of my program complicated the pipeline significantly, as I had to compute some non-trivial matrix in a C++ program, dump it to file, read it into Matlab to manipulate and operate on it, write the results to file, and read these results back into another C++ application.  Finally, despite its ubiquity in academia, Matlab is non-free software, and hence any source I choose to release will be useful only to those who have access to a *cough cough* legal copy of Matlab.

Enter NumPy and SciPy.  When I was looking for a solution to this very problem, I stumbled upon what seemed to be almost too perfect a solution.  It seems those in high performance and numerical computing, and related fields of research, had the same types of sentiments many years ago.  They longed for the niceties of Matlab, but embedded in a general purpose language where they could write larger scale applications.  Further, they desired the speed afforded by the highly optimized Fortran libraries.  What resulted from the sum of these desires, and a lot of hard work, is NumPy and SciPy.  Together, they act as (almost) a drop-in replacement for Matlab.  Matrices can be created, stored, manipulated, sliced, and operated on with all the ease of Matlab.  Whenever feasible, the numeric heavy-lifting is offloaded to a Fortran library that has been conveniently wrapped in Python.  Finally, SciPy offers a number of other extra goodies (it has packages for optimization, image manipulation, etc., and it pairs nicely with matplotlib for plotting).  What this all means is that the benefits of an environment like Matlab are now available in a general purpose programming language (Python), often with superior performance.  So, my journey has begun; I'm learning Python.  I searched for a similar solution in Ruby, but there seems to be nothing as well developed or mature as NumPy and SciPy.  While Ruby will undoubtedly remain my favorite language, it looks like I'll become fairly intimate with Python... ehhh, I'd been looking for an excuse to learn it anyway ;-P.
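
For the curious, here is a minimal sketch of the sort of thing I mean: Matlab-style slicing, a standard and a generalized symmetric eigenvalue problem, and a sparse eigenvalue solve.  The matrices are made-up toy examples (not my research data), and the variable names are just placeholders I picked for illustration.

```python
import numpy as np
from scipy import sparse
from scipy.linalg import eigh
from scipy.sparse.linalg import eigsh

# Create a dense matrix and slice it, much as one would in Matlab
# (note that Python indices start at 0 and slice ends are exclusive).
A = np.arange(16, dtype=float).reshape(4, 4)
block = A[1:3, 1:3]          # rows 1-2, columns 1-2

# Build a symmetric matrix so the symmetric eigensolver applies.
S = A + A.T

# Standard eigenvalue problem: S x = lambda x
w, v = eigh(S)

# Generalized eigenvalue problem: S x = lambda B x,
# with B symmetric positive definite (a toy diagonal matrix here).
B = np.diag([1.0, 2.0, 3.0, 4.0])
wg, vg = eigh(S, B)

# Sparse case: a tridiagonal 1-D Laplacian stored in CSR format.
n = 200
L = sparse.diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1],
                 shape=(n, n), format="csr")

# Ask for only the three largest eigenvalues; eigsh works on the
# sparse matrix directly rather than densifying it.
largest = np.sort(eigsh(L, k=3, which="LA", return_eigenvectors=False))
```

The dense solves go through `scipy.linalg.eigh`, while `scipy.sparse.linalg.eigsh` computes just a few eigenvalues of the sparse matrix, which is the part that matters once the matrices get large.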