welcome to gobject introspection
Sat, 29 Oct 2011 15:34 categories: blog
So I was writing a quick python/gtk/webkit application for my own personal pleasure, starting with the usual:
import gtk
import gobject
import webkit
import pango
After about 500 LOC the interface was already working pretty nicely, so I began adding some more application logic, starting with figuring out how to properly do asynchronous http requests with my gobject main loop.
Threading is of course not an option; it had to be a simple event-based solution. Gobject provides gobject.io_add_watch to react to activity on some socket, but there was no library in sight that could parse the http communication over a socket connection.
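That readiness-callback pattern can be sketched with nothing but the standard library: select stands in for gobject.io_add_watch, and a socketpair simulates a server that has already answered. The names watch and handler below are made up for this sketch, not part of any API:

```python
import select
import socket

def watch(sock, handler, timeout=1.0):
    """Toy stand-in for gobject.io_add_watch: wait until the socket
    becomes readable, then fire the handler for it."""
    readable, _, _ = select.select([sock], [], [], timeout)
    for s in readable:
        handler(s)

# A connected socket pair simulates a server that has replied already.
a, b = socket.socketpair()
b.sendall(b"HTTP/1.0 200 OK\r\n")

def handler(s):
    print(s.recv(1024).decode().strip())

watch(a, handler)  # the callback fires because data is waiting
a.close()
b.close()
```

The point is that nothing blocks on the read: the main loop only calls back once the kernel says the socket is ready.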
At this point let me also briefly express my dislike for the synchronous nature of urllib/urllib2. In my eyes this kind of behaviour is unacceptable for network-based I/O, and it is one reason why I recently had a look at node.js.
But back to the topic. After some searching I found out that one could use libcurl in connection with gobject callbacks, so using this pycurl example as a basis I wrote the following snippet, which fetches a couple of http resources in parallel in an asynchronous fashion:
import os, sys, pycurl, gobject
from cStringIO import StringIO
sockets = set()
running = 1
urls = ("http://curl.haxx.se","http://www.python.org","http://pycurl.sourceforge.net")
def socket(event, socket, multi, data):
    if event == pycurl.POLL_REMOVE:
        sockets.remove(socket)
    elif socket not in sockets:
        sockets.add(socket)

m = pycurl.CurlMulti()
m.setopt(pycurl.M_PIPELINING, 1)
m.setopt(pycurl.M_SOCKETFUNCTION, socket)
m.handles = []
for url in urls:
    c = pycurl.Curl()
    c.url = url
    c.body = StringIO()
    c.http_code = -1
    m.handles.append(c)
    c.setopt(c.URL, c.url)
    c.setopt(c.WRITEFUNCTION, c.body.write)
    m.add_handle(c)

while pycurl.E_CALL_MULTI_PERFORM == m.socket_all()[0]:
    pass

def done():
    for c in m.handles:
        c.http_code = c.getinfo(c.HTTP_CODE)
        m.remove_handle(c)
        c.close()
    m.close()
    for c in m.handles:
        data = c.body.getvalue()
        print "%-53s http_code %3d, %6d bytes" % (c.url, c.http_code, len(data))
    exit()

def handler(sock, *args):
    while True:
        ret, running = m.socket_action(sock, 0)
        if ret != pycurl.E_CALL_MULTI_PERFORM:
            break
    if running == 0:
        done()
    return True

for s in sockets:
    gobject.io_add_watch(s, gobject.IO_IN | gobject.IO_OUT | gobject.IO_ERR, handler)
gobject.MainLoop().run()
This works nicely and I would've stuck with it, had larsc not suggested using libsoup in connection with gobject introspection for the python binding.
Of course I could've kept pycurl because curl is cool, but every python binding to a C library adds another possible point of failure or outdatedness when upstream changes.
This issue is now nicely handled by gobject introspection, or pygobject in the case of python. What it does is use so-called "typelibs" to dynamically generate a binding to any gobject-based library. Typelibs are generated from gir files, which are XML representations of the library API.
In Debian the typelibs are stored in /usr/lib/girepository-1.0/, and even if you don't know the mechanism you will probably already have lots of definitions in this directory. Additional files are installed by gir packages like gir1.2-gtk-3.0. Typelibs are already available for all kinds of libraries like clutter, gconf, glade, glib, gstreamer, gtk, pango, gobject and many more.
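To see what is available on your own system you can simply list that directory. A small sketch (the helper name list_typelibs is my own invention):

```python
import os

def list_typelibs(path="/usr/lib/girepository-1.0"):
    """Return the sorted names of the .typelib files found under path,
    or an empty list if the directory does not exist."""
    if not os.path.isdir(path):
        return []
    return sorted(f for f in os.listdir(path) if f.endswith(".typelib"))

# On a Debian system with gir1.2-gtk-3.0 installed, this list would
# contain entries such as Gtk-3.0.typelib.
for name in list_typelibs():
    print(name)
```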
To use them, my import line now looks like this:
from gi.repository import Gtk, GObject, GdkPixbuf, Pango, WebKit
This also solves the problem I laid out above about grabbing data over http from within a gobject event loop:
from gi.repository import Soup
libsoup can do that, but there is no "real" python binding for it. With pygobject one doesn't need a "real" binding anymore: I just import it as shown above and voilà, I can interface the library from my python code!
Converting my application from the normal gtk/gobject/pango/webkit bindings to their pygobject counterparts was also a piece of cake: I learned how to do it and did it in under an hour. A really good writeup about how to do it can be found here. For some initial cleanup this regex-based script comes in surprisingly handy as well.
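The bulk of such a port is mechanical renaming, which is why a regex script can do the initial cleanup. Here is a hypothetical sketch of that kind of rewrite; the rename table is mine and much smaller than what the linked script handles:

```python
import re

# Map old-style pygtk module prefixes to their gi.repository names.
RENAMES = {"gtk": "Gtk", "gobject": "GObject",
           "pango": "Pango", "webkit": "WebKit"}

def port_line(line):
    """Rewrite old module references like gtk.Window to Gtk.Window."""
    for old, new in RENAMES.items():
        line = re.sub(r"\b%s\." % old, new + ".", line)
    return line

print(port_line("win = gtk.Window()"))        # win = Gtk.Window()
print(port_line("gobject.MainLoop().run()"))  # GObject.MainLoop().run()
```

Constants and signal names need more care than this, which is what the real conversion writeup covers.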
first steps with gta04
Sat, 22 Oct 2011 23:58 categories: blog
apt-get install emdebian-archive-keyring
echo deb http://www.emdebian.org/debian/ squeeze main >> /etc/apt/sources.list
apt-get update
apt-get install gcc-4.4-arm-linux-gnueabi
git clone git://neil.brown.name/gta04 gta04-kernel
cd gta04-kernel/
git checkout merge
export ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- INSTALL_MOD_PATH=modules_inst
make distclean && make gta04a3_defconfig && make uImage -j 5 && make modules -j 5 && make modules_install
tar -C modules_inst -czf modules.tgz .
/usr/share/doc/python-serial/examples/miniterm.py --lf -b 115200 /dev/ttyUSB0
git clone --depth 0 git://neil.brown.name/gta04 gta04-kernel-neil
git fetch origin; git reset --hard origin/merge
git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git gta04-kernel-neil
git remote add -t merge gta04 git://neil.brown.name/gta04
git fetch gta04
git checkout -b foo gta04/merge
discovering tcc
Sat, 22 Oct 2011 15:02 categories: blog
Today I just discovered another great piece of software by Fabrice Bellard: tcc.
It is orders of magnitude faster than gcc: lots of (albeit very small) pieces of code of mine were compiling 10-20 times faster.
Despite it being ANSI C compliant but not fully ISOC99 compliant, all the code of mine that I tried it on compiled happily. Checking out the documentation, the missing ISOC99 parts turned out to be those I wouldn't use anyway (complex and imaginary numbers and variable length arrays).
Apart from being small and fast there are two killer features: C scripting support and dynamic code generation through libtcc.
By using the shebang line
#!/usr/local/bin/tcc -run
and setting the executable bit, one can now "run" C source files just like scripts. tcc compiles and executes the code on the fly without even creating any temporary files; the generated code only ever lives in memory.
Since tcc is also extremely fast, there is only a small disadvantage over shell code: a simple helloworld.c "script" was "executed" in just 0.08 seconds, where the equivalent shell script took 0.03 seconds. The example "script" from the manpage reads:
#!/usr/bin/tcc -run
#include <stdio.h>
int main()
{
    printf("Hello World\n");
    return 0;
}
Another feature that is just the logical consequence of the above is tcc's ability to read C source from standard input and compile and run it on the fly:
echo 'main(){puts("hello");}' | tcc -run -
Now to the second amazing thing: dynamic code generation on the fly. Using libtcc, one can dynamically generate and compile C code through library calls and execute it right away from memory.
The following example program shows how to achieve this (inspired by libtcc_test.c from the tcc source):
#include <stdlib.h>
#include <stdio.h>
#include "libtcc.h"
int add(int a, int b) { return a + b; }
char my_program[] =
"int fib(int n) {\n"
" if (n <= 2) return 1;\n"
" else return fib(n-1) + fib(n-2);\n"
"}\n"
"int foobar(int n) {\n"
" printf(\"fib(%d) = %d\\n\", n, fib(n));\n"
" printf(\"add(%d, %d) = %d\\n\", n, 2 * n, add(n, 2 * n));\n"
" return 1337;\n"
"}\n";
int main(int argc, char **argv)
{
    TCCState *s;
    int (*foobar_func)(int);
    void *mem;

    s = tcc_new();
    tcc_set_output_type(s, TCC_OUTPUT_MEMORY);
    tcc_compile_string(s, my_program);
    tcc_add_symbol(s, "add", add);
    mem = malloc(tcc_relocate(s, NULL));
    tcc_relocate(s, mem);
    foobar_func = tcc_get_symbol(s, "foobar");
    tcc_delete(s);
    printf("foobar returned: %d\n", foobar_func(32));
    free(mem);
    return 0;
}
Two usecases for tcc already come to my mind. Firstly, there is an amazing movement going on that creates music from C oneliners. erlehman started a project on github that gathers a number of such oneliners. The usual workflow is to plug the single line of algorithm into a simple for-loop wrapper to generate the C source, compile that source with gcc, and then pipe the output of the resulting executable into aplay or sox. Using tcc one could instead pipe the generated code straight into tcc, which would compile AND execute it in one step: faster than with gcc, with no intermediary source files or executables, just one line that does everything.
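To illustrate, here is a sketch of the code-generating half of such a pipeline. The formula is one of the well-known oneliners; the wrapper scaffolding is my own guess, not taken from the github project:

```python
# Plug a one-line formula into a for-loop wrapper. The printed source
# can then be piped straight into "tcc -run -", e.g. (filename made up):
#   python gen.py | tcc -run - | aplay
formula = "t*(42&t>>10)"

c_source = """\
#include <stdio.h>
int main(void)
{
    int t;
    for (t = 0; ; t++)
        putchar(%s);
    return 0;
}
""" % formula

print(c_source)
```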
Secondly, there is a project I'm working on at Jacobs called Flowy, where I struggle with optimizing the performance of a parser for a processing language for network flow records. Performance is already quite good, and dynamic code generation has always been an option to increase it further, but it would have been quite messy if done with gcc. With libtcc I would be able to dynamically construct the rules and execute them with just a few library calls, without the complexity of gcc and without having to call it as an external executable.
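The idea can be sketched in a few lines: translate parsed filter rules into C source for a predicate function, which libtcc would then compile straight to memory. The record fields and rule format below are invented for illustration; Flowy's actual rules look different:

```python
# Each rule is a (field, operator, value) triple from the parsed filter.
rules = [("srcport", "==", 80), ("octets", ">", 1024)]

# Turn the rules into one C predicate over a flow record struct.
conds = " && ".join("r->%s %s %d" % rule for rule in rules)
c_source = "int match(struct record *r) { return %s; }" % conds
print(c_source)
```

Compiling this string with tcc_compile_string and fetching match with tcc_get_symbol, as in the libtcc example above, would give a native filter function without ever touching the disk.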
use /dev/shm
Tue, 18 Oct 2011 17:31 categories: blog
TODO: I have to use /dev/shm more often.
With even my laptop having a few gigs of RAM, it is such a convenient and, most importantly, fast scratch space, and still I'm not in the habit of reaching for it whenever it would come in handy.
For example, when building a rootfs with multistrap, I can reduce the overall time needed for the build to finish from 23 minutes to under 14 minutes!
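Pointing a program's scratch files there is all it takes. A small sketch with a fallback for systems where the tmpfs is not mounted:

```python
import os
import tempfile

# /dev/shm is a tmpfs, so anything written there lives in RAM.
scratch = "/dev/shm" if os.path.isdir("/dev/shm") else tempfile.gettempdir()

with tempfile.NamedTemporaryFile(dir=scratch, suffix=".scratch") as f:
    f.write(b"fast temporary data")
    f.flush()
    print(f.name)  # e.g. /dev/shm/tmpXXXXXX.scratch
```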
clearing caches
Tue, 18 Oct 2011 17:04 categories: blog
For benchmarking purposes it makes sense to clear the caches that the linux kernel creates for us. An additional sync beforehand makes sure that everything is committed to disk (dirty objects will not be freed).
sync
sudo sysctl vm.drop_caches=3
or
sync
echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null
This will drop the pagecache, dentries and inodes.
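To check that the caches were actually dropped, one can compare the page cache size before and after. A small helper that reads it from /proc/meminfo (the function name is mine):

```python
def cached_kb():
    """Return the size of the page cache in kB as reported by
    /proc/meminfo, or None if that file is unavailable (non-linux)."""
    try:
        with open("/proc/meminfo") as f:
            for line in f:
                if line.startswith("Cached:"):
                    return int(line.split()[1])
    except IOError:
        pass
    return None

print(cached_kb())
```

Running it before and after the drop_caches command above should show the number collapse to a small fraction.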