Posted on August 01, 2014
I missed my post last week due to my research for grad school all going to hell at the worst possible time :(. There wasn't much to report on last week, so I'm not too perturbed. Turns out even with PRs done and tested, it still takes a long time to get them merged and have everyone agree on them. Fortunately I made up for it this week; as of now I am officially working on code generation! This is behind schedule on my original timeline, but should have been expected, as per Hofstadter's law:
Hofstadter's Law: "It always takes longer than you expect, even when you take into account Hofstadter's Law."
Things are moving ahead now, and I have some hope left that I can accomplish (most of) everything I set out to do.
I set out to accomplish a number of things this summer. Here's the status of each project goal:
General Linearization Form:
Done!
Linearization methods for Kane's and Lagrange's Methods:
Done! Lagrange stuff got merged last week.
Documentation for the above:
Done? This PR is waiting for review, but I think it's good.
Pre-linearized solution methods:
Nixed from the plan due to lack of interest.
Code Generation:
In Progress...
I also accomplished a few other things, that I found necessary for my project:
Refactored KanesMethod
and LagrangesMethod
:
This resulted in more readable, pythonic code, and also a speed/memory improvement. PR for this is still awaiting review.
Faster subs
function for mechanics
:
My msubs
implementation is several orders of magnitude faster than subs
for the kinds of expressions seen in mechanics
(even without the benefit
of cacheing that subs
has). This is in a PR awaiting review.
Soooo much work on the nan
and oo
issue:
Still not solved for all cases... :(
There are only 3 weeks left of GSoC! In my last remaining weeks, here's what I plan to get done:
Get the Bicycle Example working:
After my work everything runs faster, results in smaller, more tractable
expressions, and uses less memory. Except for the bicycle example. For some
unknown reason I can not get this thing to result in anything except nan
.
This is a regress in performance (even though everything else runs better),
and needs to be solved.
Code generation:
I've already got some stuff working, and it's really exciting. More on this below.
Get all my current stuff merged:
All that works needs to get into Sympy
. As not everyone else is being paid
to do this, it can take some time and effort to get things through the
review process and into master, but I have hope that my remaining
contributions will eventually make it in.
I think I can do it, but it'll be a stretch.
Sympy currently contains some facilities for code generation, but they lack support for the matrices that are necessary for working with dynamics problems. I hope to remedy that, as well as to make general improvements to the entire codegen module.
Code generation in sympy has three levels:
Code Printers ccode
, fcode
, and the like
These are printers that know how to print simple sympy expressions using
functionality and syntax found in that language. For example, ccode
will
print exponents using pow
, which is found in the math
library in C.
These printers don't have any knowledge of functions, multiple statements,
or header files. They simply print a single expression out on one line.
The codegen
submodule
This submodule contains facilities for representing generalized routines, and generating functions in various languages (currently C and FORTRAN) that can be compiled as a library without any changes. They know about function and variable declarations, header files, library imports, and multi-line statements. However, they have no idea how to make this generated code work with python.
Code wrapping, usually accessed through autowrap
This is where the functionality for wrapping the generated code lives.
Using the functionality provided here, one can compile and wrap generated
code, and then call it from python. The autowrap
function is the main
entry point, allowing for all 3 steps to be done in one call.
The first thing I wanted to fix was getting code generation to work with
matrices and matrix expressions. This turned out to be harder (and more
confusing) than I expected. There is currently support for a "matrix like"
object named sympy.tensor.IndexedBase
, but I really don't understand the
purpose behind it. Reading through the code and examples though it seems to be
for representing indexed loop operations in a concise form. This unfortunately
has nothing to do with the indexed types (matrices) that I plan on
implementing.
I spent a long time reading through the code and playing around with it using
pdb trying to figure out the control flow in the codegen function, and am still
a little lost. Most of what's there seems to be for supporting the Indexed
operations. After some time trying to bend them to work for matrices, I
changed plans and now am supporting Matrix
and MatrixExpr
types for matrix
operations only. Indexed
types can be used elsewhere, but they shouldn't be
used for representing matrices with expressions inside them.
I currently have this "working", but am not happy with it yet. The current layout of the module made for some hacky work adding in matrix support. I plan on doing some refactoring to make this implementation cleaner. Currently, on my codegen branch the following is supported:
Generating C code for a matrix with expressions in each element:
Matrices are set as input-output type arguments, and are modified in place before being returned.
Passing in a MatrixSymbol
as an argument:
Here the plan is to use matrices to pass in a large number of arguments.
You can think of this kind of like a vector. There's another symbolic
vector type as well in Sympy (DeferredVector
). I may end up supporting
it, but I'm not really sure what it's for. In its current implementation,
the following works:
q = [q1, q2, q3, q4, q5]
q_vec = MatrixSymbol('q', 5, 1)
sub_dict = dict(zip(q, q_vec))
# Replaces each q with elements from q_vec
expr = msubs(expr, q_vec)
# Generate a function that takes a numpy array (q) and returns expr
# This works if expr is an expression, or a matrix
func = autowrap(expr)
After I clean this up, I plan to add support for:
Common Subexpression Elimination (cse):
Even though modern compilers do this already, experimentation shows that
the large expressions generated in mechanics
benefit from generated code
having cse performed. This will be implemented as a boolean kwarg (default
False). When True, sympy's cse
function will be run on the expression,
and the code for each subexpression will be generated, followed by the
final expression. I actually don't think this will be too difficult to
implement, and should give some speed improvements on the compiled code (at
the cost of slower generation).
A ctypes
code-wrapper:
Currently the only code wrappers supported are f2py
and cython
, neither
of which is in the standard library. While the wrappers generated with those
functions may be more robust, a ctypes wrapper is also possible, with the
added benefit that ctypes
is in the standard lib.
Support for matrix expressions:
In an ideal world, I'd implement the excellent work done by Matthew
Rocklin, discussed in this video from SciPy
2013.
The idea here is that we have some knowledge about each of the matrices
involved in an expression (for example $A^{-1} B$). We may know that A is
positive definite, or symmetric, or upper triangular, etc... For each case,
there may be a faster inversion routine that we could take advantage of
rather than using a one-size-fits-all inverse function. As I don't have time
to implement support for all possible operations, and the many BLAS/LAPACK
routines that support them, I'll focus just on the inverse, as it's commonly
found in expressions in mechanics
. The thought is, we should be able to
run:
func = autowrap(Inverse(M)*F)
And have code generated that solves the expression in a fast manner, without
having to symbolically find the inverse of M
and combine it with F
into
one matrix beforehand.
Of course this is a wishlist, and it's unlikely all of this will be accomplished
in the next 3 weeks. Still, I plan to keep supporting sympy
after my GSoC
ends, so if it's not done by then it will eventually get there.
I got accepted to the GSoC reunion at the end of October! As this is the 10th annual GSoC, Google is throwing a big reunion shindig for past and present students. As there are lots of us, only a few were chosen based on a lottery, and I made it through! I'm very excited to meet other students that completed the program, listen to some interesting talks, and see the GooglePlex. I also bought my tickets to get there a day early so I have some time to explore the bay area. Last time I was out there I was 14, and I didn't get to see much of the area. If you also got accepted/live out and would be interested in meeting up, let me know! I'll be in San Jose/San Francisco October 23-26.