The Day of Reckoning (Midterm Evaluations) is upon us!

As evidenced by my previous blog posts, I’m not particularly happy with my own progress. Hopefully my recent progress will be sufficient to appease the Google gods. (Maybe with some benchmarks too)

I can report that I’ve completed quite a bit since my last blog post. For instance, I implemented a toy micronumpy array using lltype.Array and have since grown it into a “full” implementation. I hope that my approach will work well with the JIT. Simple element lookups are a couple of dynamic dispatches away (which is a little less than ideal in my opinion), but it should JIT well, since the JIT should remove the overhead of those calls.

Unfortunately, this implementation isn’t re-using much of the original code. Since the core storage of the array is being altered, not a whole lot of the original code is proving useful, but I will attempt to re-use as much as possible (if for no other reason than not having to reconsider all of the math). I currently have one of our original tests passing, which is better than nothing, but not particularly satisfying…

I’ve had to deal with a lot of bugs in CPyExt, PyPy’s CPython extension compatibility layer, which is certainly what slowed progress the most. The time spent on that was almost entirely unexpected, and represents the biggest setback (other than family obligations…).

I’m trying to keep this short so that I can get back to work. I want to have some benchmarks before evaluations are over to justify my existence :-)

A Bit Behind Schedule

Life seems intent on limiting the time I can spend on my GSoC project. The first wave of family arrived at my house today, and more still will be coming. I have a lot to do to help my family host. Last week we set the goal that I should have NumPy working on PyPy by the end of that week (which we’ll call Tuesday, since that’s when we set those goals). In addition, I was supposed to have started fixing up micronumpy.

While I haven’t started on micronumpy, I have made significant strides in getting NumPy to work on PyPy. Unfortunately, CPyExt isn’t as mature as we’d hoped, and rather than working on NumPy, I’m spending my time implementing C API functions in PyPy. NumPy still doesn’t import, but I can see from the output of nm -u that the number of missing symbols is slowly being whittled down. I should also note that NumPy’s setup.py has caused me significant grief, since its ‘clean’ target doesn’t clean up after setup.py build_ext -i. At the moment I haven’t identified whether that’s a bug in NumPy’s copy of distutils (gross in and of itself) or a nasty interaction between it and CPyExt’s presetup.py, which creates a stub dll/so for testing purposes.
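For anyone wanting to track the same thing, here is a minimal sketch of how the nm -u output could be boiled down to a list of C API names. The function names and the "Py" prefix filter are my own assumptions for illustration, not part of NumPy or CPyExt, and it only lists what the extension leaves undefined; comparing that list against what CPyExt actually exports is left out.

```python
import subprocess


def undefined_symbols(nm_output, prefix="Py"):
    """Return sorted, de-duplicated undefined symbols found in `nm -u` output."""
    found = set()
    for line in nm_output.splitlines():
        parts = line.split()
        if not parts:
            continue
        # GNU nm prints "U name"; skip weak ("w") and any other symbol types.
        if len(parts) >= 2 and parts[-2] != "U":
            continue
        name = parts[-1]
        # Ignore leading underscores so names like "_PyList_New" also match.
        if name.lstrip("_").startswith(prefix):
            found.add(name)
    return sorted(found)


def undefined_in_extension(extension_path, prefix="Py"):
    """Run `nm -u` on a built extension module and filter its output."""
    result = subprocess.run(
        ["nm", "-u", extension_path],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return undefined_symbols(result.stdout, prefix)
```

Calling undefined_in_extension() on the freshly built .so after each round of CPyExt work gives a list whose length can be watched as it shrinks.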

Now on to the positive side of things: since I began writing this post (about a week ago), it seems I’ve satisfied the last of NumPy’s symbol dependencies. There are still issues with importing the module. For instance, some API functions get passed non-Python objects (probably objects being double freed), but I’m not sure how best to track this all down. I’m not sure whether I just haven’t found the source, or if it has to do with output redirection and buffering, but no matter where I’ve put printfs (in C) and prints (in Python), they always end up after the exception is thrown in the output…

EDIT: It has almost certainly turned out to be a buffering (stderr vs. stdout) issue. How I’m going to get Python buffering to play nice with C is a mystery to me, though… Perhaps if I turn off buffering…
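One common way to keep debug output in order is to flush after every message, sketched below in modern Python. The trace() helper is a hypothetical name for illustration, not something from PyPy or CPyExt.

```python
import sys


def trace(message, stream=None):
    """Write a debug line and flush right away so it cannot sit in a buffer."""
    if stream is None:
        stream = sys.stderr
    stream.write("[trace] %s\n" % message)
    stream.flush()
```

On the C side the usual counterparts are fprintf(stderr, ...), since stderr is unbuffered by default, or an fflush(stdout) after each printf. Starting the interpreter with -u is another option worth trying for the Python half.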

So, sorry for the absence, things are still moving along, though I definitely have a bit of catching up to do.

Cheers,
Dan

Planning

To start, there was a question about where my progress could be followed. There are several separate projects which affect my ultimate goal, and each one is going to be hosted accordingly. There’s nothing interesting there right now but a woefully out-of-sync trunk and some basic array code (hey, it works though), but modifications to PyPy will live at codespeak.net: http://codespeak.net/svn/pypy/branch/micronumpy is the URL, and to browse the source you can point a browser at http://codespeak.net/viewvc/pypy/branch/micronumpy. My modifications to NumPy will live here, though the trivial changes are already in trunk (Changeset #8448 and Changeset #8418). I will have a few changes for Cython by the time the summer is through, but for the moment I may mostly ignore the mtrand.c file, which has a lot of problems for non-CPython implementations.

As far as plans go, it’s been decided that this week I should finish up my NumPy compatibility work as fast as possible. I am quite close, though I noticed some other hackish things I did to get it to build, which I need to fix. After that it will be time to port micronumpy to lltype arrays, which are PyPy’s way of creating raw arrays/buffers. Without this, the lists used in the current implementation can be moved by the garbage collector (a very useful thing in general). However, normal C code does not have any method for dealing with moving objects, and so in order to interact with NumPy’s C code, we need to migrate to lltype arrays.
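To make the “fixed address” point concrete, here is a small sketch that uses ctypes as a stand-in for lltype arrays (it is not RPython, and RawDoubleArray and copy_into are hypothetical names). The storage is a raw C buffer whose address stays put, so C-level code such as memmove can write into it through a plain pointer.

```python
import ctypes


class RawDoubleArray(object):
    """Fixed-size array of C doubles backed by a raw ctypes buffer."""

    def __init__(self, length):
        self.length = length
        # ctypes zero-initialises the buffer; its address stays the same
        # for as long as this object is alive.
        self._storage = (ctypes.c_double * length)()

    @property
    def address(self):
        return ctypes.addressof(self._storage)

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        return self._storage[index]

    def __setitem__(self, index, value):
        self._storage[index] = value


def copy_into(destination, values):
    """Fill `destination` through its raw address, the way C code would."""
    if len(values) > len(destination):
        raise ValueError("too many values for destination array")
    source = (ctypes.c_double * len(values))(*values)
    ctypes.memmove(destination.address, source, ctypes.sizeof(source))
```

A plain Python list offers no such address to hand out, which is the reason for moving the storage to a raw buffer.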

CPyExt Tie-Ins

In order to get NumPy to build against PyPy I’ve had to make numerous changes to the CPyExt module, many of them ugly.

It’s also become apparent that Cython and NumPy both depend directly on CPython, and as such they touch structures that they shouldn’t (as dictated by the C API) and will have to be modified. Unfortunately, not everything NumPy does appears possible through the C API, and therefore extensions to the C API may be necessary. I’ll need further guidance on this, but I think it’s likely that I will add PyPy*() functions which may then be proposed for CPython (at which point they’d be renamed to Py*()). The only problem with that is the possibility that these functions would become relied upon under their PyPy*() names, which would create another problem. Of course, that’s still the right way to go, as it’s not guaranteed that any extensions to the API will be accepted by the CPython folks, especially not in their originally proposed forms.

I have to apologize: this past week was my finals week, so I didn’t really accomplish much during the first week of the GSoC, but starting today I’m getting going.

Cheers,
Dan

Google Summer of Code!

I’m extremely happy to say that my Google Summer of Code project proposal was approved!  That means I get to spend my summer working on my favorite project, PyPy, and combine it with NumPy.  Hopefully the result will be an extra efficient NumPy implementation on PyPy.

This morning I met with Maciej Fijałkowski and Stéfan van der Walt, my unofficial mentor, representing PyPy, and my official mentor, representing NumPy respectively.  Since exposing my proposal to the NumPy mailing list, it became apparent that my project, in order to best serve the existing NumPy users, would need to be approached differently.  It seems that everyone uses their obscure corners of the library, and it would be tough to address only the most used parts and still make anyone happy.

Because of these needs I couldn’t really just hack away on a small subset of NumPy and have my project be useful at all. Originally it was proposed that we try to replace NumPy piecemeal with RPython code; however, there were numerous technical and practical problems with this approach (such as the JIT not being so useful when there’s C code, and the extra overhead of trying to keep my code “plugged into” NumPy’s existing code). Stéfan came up with a sane approach, however.  We will first port NumPy to PyPy by way of CPyExt (a growing PyPy sub-project allowing CPython extensions to be compiled to work with PyPy, described a bit here).  This way, we will be completely compatible with NumPy on CPython.  This allows us to do whatever we please with micronumpy.  The idea as it is today is to provide both side by side, and allow converting between the two via micronumpy.asarray() and numpy.asarray() or something of that sort.  micronumpy arrays will likely be lacking in features, but blazingly fast by comparison, while NumPy arrays will be mostly identical in speed to NumPy on CPython (and as CPyExt matures, I expect that speed to converge to CPython speeds).  In the long term it would be great to make micronumpy arrays completely compatible with NumPy arrays. I plan to re-use as much of NumPy as I can for micronumpy arrays.

To Do

The question remains, then.  How should I move forward from today?  Well, I’m going to stay on top of my school work, and then on the 16th I’ll be able to really start working on this.  Starting on the 16th this is what I need to accomplish.

  1. Clean up the existing micronumpy code.  I worked on it during school and the result is functional, mostly, but it’s quite ugly, and is less than ideal in many ways.
  2. Eliminate trivial errors with building NumPy with CPyExt.  Stéfan created a bug here.
  3. Port existing micronumpy code to low level RPython arrays; this should maybe be done during the cleanup.
  4. Add __array_interface__ to micronumpy arrays to facilitate interoperability.
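As a rough illustration of item 4, the sketch below shows what exposing __array_interface__ from a raw buffer could look like. MicroArray is a hypothetical stand-in class built on ctypes rather than the real micronumpy code; the dictionary keys follow version 3 of the array interface protocol.

```python
import ctypes
import sys


class MicroArray(object):
    """One-dimensional float64 array that exposes the array interface."""

    def __init__(self, values):
        values = list(values)
        self.shape = (len(values),)
        self._buffer = (ctypes.c_double * len(values))(*values)

    @property
    def __array_interface__(self):
        byte_order = "<" if sys.byteorder == "little" else ">"
        return {
            "version": 3,
            "shape": self.shape,
            # 8-byte float in the machine's native byte order.
            "typestr": byte_order + "f8",
            # (address of the first element, read-only flag)
            "data": (ctypes.addressof(self._buffer), False),
        }
```

With NumPy available, numpy.asarray(MicroArray([1.0, 2.5])) should then be able to wrap the same memory without copying it, which is the interoperability the list item is after.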

Droid Update

This is a test post from my new Motorola Droid. Seeing as the PSF requires you to have a blog, I thought being able to post on the go might be beneficial.

media-mouse

That’s right, I’ve done it again!  Well, not really again, whatever…  I want to write about my newest program.

It’s called media-mouse.  I created it because I like to have my music on while I’m up and about my room, but I don’t always like what’s playing.  Since I’m poor and can’t afford a remote, nor do I want to dick around with LIRC, I decided to adapt my wireless mouse to do the job.  What my program does is take over the entire screen (so it always gets your mouse clicks) and allows you to use your (wireless) mouse like a remote control. When you press right mouse, your media player (assuming it’s either Amarok 1.4 or Banshee) skips ahead one track, left mouse goes back a track, the mouse wheel controls the volume, clicking the mouse wheel plays/pauses, and escape quits my God-Forsaken program.
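The core of that idea is just a lookup from button number to player action. Here is a minimal sketch, not the actual media-mouse source: it assumes the usual X11 numbering (1 left, 2 middle, 3 right, 4 and 5 for wheel up and down), assumes wheel-up means louder, and uses a hypothetical RecordingPlayer in place of the real Amarok or Banshee backends.

```python
# Assumed X11 button numbers mapped to the actions described above.
BUTTON_ACTIONS = {
    1: "previous",     # left button: back one track
    2: "play_pause",   # wheel click: play/pause
    3: "next",         # right button: skip ahead one track
    4: "volume_up",    # wheel up (direction is an assumption)
    5: "volume_down",  # wheel down
}


class RecordingPlayer(object):
    """Hypothetical backend that only records which actions were requested."""

    def __init__(self):
        self.calls = []

    def previous(self):
        self.calls.append("previous")

    def play_pause(self):
        self.calls.append("play_pause")

    def next(self):
        self.calls.append("next")

    def volume_up(self):
        self.calls.append("volume_up")

    def volume_down(self):
        self.calls.append("volume_down")


def handle_button(player, button):
    """Dispatch a button press; return False for buttons with no mapping."""
    action = BUTTON_ACTIONS.get(button)
    if action is None:
        return False
    getattr(player, action)()
    return True
```

A real backend would implement the same five methods by talking to the media player, so supporting another player means adding one more class.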

Screenshot of media-mouse

Ugly, isn’t it? It’s functional though…

I plan to add a nice little display that shows the current track, album art, and artist. I’ll probably just rip off what Banshee has done, since it looks presentable. (In fact my original plan was to somehow just use Banshee’s “now playing” screen, but being media player independent is rather desirable.) The other feature I plan to add is media player detection. Right now you’re required to specify the media player to use on the command line. I could fairly easily detect which media player(s?) are open and let the user choose if there are multiple players open.

Update regarding python-xlib

It’s been a while since I’ve written anything, partially because I’m lazy as hell, and partially because of school.  Yesterday and today I’ve gotten some work done on python-xlib; the changes can be found in my git repository here.  The link is to the experimental branch of python-xlib.  For all intents and purposes the master branch *should* represent up to date svn, only in a (superior) git form.  I’ve improved the interface for dealing with Drawable(s) and Window(s) slightly (note: a Window IS A Drawable and therefore also has the same interface improvements as Drawable).

  • Drawable:
    • x: read only # x position of the drawable, pretty straightforward, also pretty useless for Pixmap drawables, but indispensable for Windows
    • y: read only # see above
    • position: read only # same as (x, y), use this if you want both, as it’s quicker
    • width: read only # width of the drawable, useful for all drawables
    • height: read only # see above
    • size: read only # same as (width, height), performance benefits over using width and height individually
  • Window:
    • x, y, position, width, height: read/write # all the same as for Drawable but also write enabled; as before, size and position are faster than using width/height and x/y individually
    • children: read only # list of child Window(s)
    • root: read/write # root Window for this window
    • parent: read/write # parent Window.  Setting this value will reparent the current window and place it at 0, 0 in the parent window… this isn’t always desired, and in those cases you should probably use Window.reparent_window() or whatever…
So I’ve accomplished enough that I feel accomplished. The property system is still in the works; however, I think I’ve thought the implementation through enough…

I’ve also considered making a ChildArray class, which would essentially wrap the children of the window, so you could do things like window.children.append(), window.children.extend() and the like, to allow even more natural syntax for python-xlib.

Proposal to Improve the python-xlib API

I admit that I only concern myself with certain aspects of the API, but I believe that the current xlib API can be vastly improved by making it more ‘pythonic’.

General Improvements:

Don’t make the application programmer worry about the underlying types used in the protocol; convert to Python-friendly types (Card8, Card16, etc. are all ints to a Python programmer).

The Window class

I propose that the following things be added to the Window API; the old API can be maintained for compatibility:

  • Window
    • parent
      • get: Window.query_tree()
      • set: ReparentWindow
    • children
      • get: Window.query_tree()
      • set: unimplemented (undesirable?)
    • root
      • get: Window.query_tree()
      • set: is it needed?
    • properties
      • As a custom class implementing __getitem__ which queries the existing property system
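The properties item could look roughly like the sketch below. WindowProperties and the fakes are hypothetical names for illustration; the wrapper assumes a window object with a get_full_property(atom, type) method returning either None or a reply with a .value attribute, plus an intern_atom(name) callable, which is how I understand the existing python-xlib property calls to behave.

```python
class WindowProperties(object):
    """Dictionary-style, read-only view of a window's properties."""

    def __init__(self, window, intern_atom, any_type=0):
        self._window = window
        self._intern_atom = intern_atom
        # 0 is AnyPropertyType in the X protocol.
        self._any_type = any_type

    def __getitem__(self, name):
        atom = self._intern_atom(name)
        reply = self._window.get_full_property(atom, self._any_type)
        if reply is None:
            raise KeyError(name)
        return reply.value


class FakeReply(object):
    def __init__(self, value):
        self.value = value


class FakePropertyWindow(object):
    """Stand-in window that stores properties in a dict keyed by atom."""

    def __init__(self, values_by_atom):
        self._values = values_by_atom

    def get_full_property(self, atom, property_type):
        if atom not in self._values:
            return None
        return FakeReply(self._values[atom])


# Made-up atom table for the fake; WM_NAME and WM_CLASS use their
# predefined X11 atom numbers.
FAKE_ATOMS = {"WM_NAME": 39, "WM_CLASS": 67}


def fake_intern_atom(name):
    return FAKE_ATOMS.setdefault(name, 1000 + len(FAKE_ATOMS))
```

With a real display, the intern_atom argument would be the display’s own atom lookup, and window.properties["WM_NAME"] would replace the two-step atom and property dance.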

There’s much more which could benefit from some attention, but I’m going to get coding for the moment :-)

python-xlib’s Homepage

My Git Repository
