Jim Crist-Harif

GSoC Week 10 & 11: Bogged down in details

Posted on August 01, 2014

I missed my post last week due to my research for grad school all going to hell at the worst possible time :(. There wasn't much to report on last week, so I'm not too perturbed. Turns out even with PRs done and tested, it still takes a long time to get them merged and have everyone agree on them. Fortunately I made up for it this week; as of now I am officially working on code generation! This is behind schedule on my original timeline, but should have been expected, as per Hofstadter's law:

Hofstadter's Law: "It always takes longer than you expect, even when you take into account Hofstadter's Law."

Things are moving ahead now, and I have some hope left that I can accomplish (most of) everything I set out to do.

Project Status:

I set out to accomplish a number of things this summer. Here's the status of each project goal:

I also accomplished a few other things, that I found necessary for my project:

TODO:

There are only 3 weeks left of GSoC! In my last remaining weeks, here's what I plan to get done:

I think I can do it, but it'll be a stretch.

Code Generation

Sympy currently contains some facilities for code generation, but they lack support for the matrices that are necessary for working with dynamics problems. I hope to remedy that, as well as to make general improvements to the entire codegen module.

Code generation in sympy has three levels:

  1. Code Printers ccode, fcode, and the like

    These are printers that know how to print simple sympy expressions using functionality and syntax found in that language. For example, ccode will print exponents using pow, which is found in the math library in C. These printers don't have any knowledge of functions, multiple statements, or header files. They simply print a single expression out on one line.

  2. The codegen submodule

    This submodule contains facilities for representing generalized routines, and generating functions in various languages (currently C and FORTRAN) that can be compiled as a library without any changes. They know about function and variable declarations, header files, library imports, and multi-line statements. However, they have no idea how to make this generated code work with python.

  3. Code wrapping, usually accessed through autowrap

    This is where the functionality for wrapping the generated code lives. Using the functionality provided here, one can compile and wrap generated code, and then call it from python. The autowrap function is the main entry point, allowing for all 3 steps to be done in one call.

The first thing I wanted to fix was getting code generation to work with matrices and matrix expressions. This turned out to be harder (and more confusing) than I expected. There is currently support for a "matrix like" object named sympy.tensor.IndexedBase, but I really don't understand the purpose behind it. Reading through the code and examples though it seems to be for representing indexed loop operations in a concise form. This unfortunately has nothing to do with the indexed types (matrices) that I plan on implementing.

I spent a long time reading through the code and playing around with it using pdb trying to figure out the control flow in the codegen function, and am still a little lost. Most of what's there seems to be for supporting the Indexed operations. After some time trying to bend them to work for matrices, I changed plans and now am supporting Matrix and MatrixExpr types for matrix operations only. Indexed types can be used elsewhere, but they shouldn't be used for representing matrices with expressions inside them.

I currently have this "working", but am not happy with it yet. The current layout of the module made for some hacky work adding in matrix support. I plan on doing some refactoring to make this implementation cleaner. Currently, on my codegen branch the following is supported:

After I clean this up, I plan to add support for:

Of course this is a wishlist, and it's unlikely all of this will be accomplished in the next 3 weeks. Still, I plan to keep supporting sympy after my GSoC ends, so if it's not done by then it will eventually get there.


Other exciting news of the week:

I got accepted to the GSoC reunion at the end of October! As this is the 10th annual GSoC, Google is throwing a big reunion shindig for past and present students. As there are lots of us, only a few were chosen based on a lottery, and I made it through! I'm very excited to meet other students that completed the program, listen to some interesting talks, and see the GooglePlex. I also bought my tickets to get there a day early so I have some time to explore the bay area. Last time I was out there I was 14, and I didn't get to see much of the area. If you also got accepted/live out and would be interested in meeting up, let me know! I'll be in San Jose/San Francisco October 23-26.