sisyphus wins ICRA 2012 VMAC

categories: blog

Sisyphus is a piece of software that I wrote as a member of a team from Jacobs University led by Prof. Dr. Andreas Nüchter. It placed our team first in all three rounds of this year's IEEE ICRA 2012 Virtual Manufacturing Automation Competition.

The goal was to stack a given set of boxes of different length, width and height on a pallet in a way that achieved optimal volume utilization, center of mass and interlock of the boxes. Besides the Cartesian placement of a box on the pallet, the only other degree of freedom was a 90° rotation of the box around a vertical axis.

Since arranging boxes into a three dimensional container is NP-hard (three dimensional orthogonal knapsack), I settled on a heuristic that finds an approximate solution.

The premise is that there are many boxes of equal height, which was the case in the test cases available from the 2011 VMAC.

Given this premise, my heuristic arranges the boxes into layers of equal height and then stacks these layers on top of each other. Boxes that are left over, or too few from the start to form a full layer of their own, are stacked on top of the completed layers.
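As a sketch of this idea (a hypothetical simplification: it only checks summed footprint areas, while the real packer must also arrange the footprints geometrically; the function name and box dicts are made up for illustration), grouping by height and cutting each group into layers could look like this:

```python
from collections import defaultdict

def group_into_layers(boxes, pallet_area):
    # Group boxes by height, then greedily cut each height class into
    # "layers" whose footprints sum to at most the pallet area.
    by_height = defaultdict(list)
    for box in boxes:
        by_height[box["height"]].append(box)
    layers, leftover = [], []
    for height in sorted(by_height):
        current, used = [], 0
        for box in by_height[height]:
            area = box["length"] * box["width"]
            if used + area > pallet_area:
                layers.append(current)
                current, used = [], 0
            current.append(box)
            used += area
        # whatever remains does not fill a layer: stack it on top later
        leftover.extend(current)
    return layers, leftover

# three boxes of footprint 50 on a pallet of area 100:
# two form a full layer, the third becomes leftover
boxes = [{"height": 10, "length": 10, "width": 5} for _ in range(3)]
layers, leftover = group_into_layers(boxes, pallet_area=100)
assert len(layers) == 1 and len(layers[0]) == 2 and len(leftover) == 1
```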

There is a video of how this looks.

My code is now online on github and it is even documented for everybody who is not me (or a potential future me, of course).

This blog post is about the "interesting" parts of sisyphus. You can read about the overall workings of it in the project's README.

Python dict to XML and XML to Python dict

The evaluation program for the challenge reads XML files, and the pallet size and the list of articles with their dimensions are also given in XML format. So I needed a way to easily read article information from XML and to easily dump my data into XML format.

Luckily, none of the XML involved made use of XML attributes, so the only children a node had were other nodes. Thus, the whole XML file could be represented as a dictionary with tag names as keys and the values being other dictionaries, lists, strings or integers.

The code doing that uses xml.etree.ElementTree and turns out to be very simple:

from xml.etree import ElementTree

def xmltodict(element):
    def xmltodict_handler(parent_element):
        result = dict()
        for element in parent_element:
            if len(element):
                obj = xmltodict_handler(element)
            else:
                obj = element.text
            if result.get(element.tag):
                if hasattr(result[element.tag], "append"):
                    result[element.tag].append(obj)
                else:
                    result[element.tag] = [result[element.tag], obj]
            else:
                result[element.tag] = obj
        return result
    return {element.tag: xmltodict_handler(element)}

def dicttoxml(element):
    def dicttoxml_handler(result, key, value):
        if isinstance(value, list):
            for e in value:
                dicttoxml_handler(result, key, e)
        elif isinstance(value, basestring):
            elem = ElementTree.Element(key)
            elem.text = value
            result.append(elem)
        elif isinstance(value, int) or isinstance(value, float):
            elem = ElementTree.Element(key)
            elem.text = str(value)
            result.append(elem)
        elif value is None:
            result.append(ElementTree.Element(key))
        else:
            res = ElementTree.Element(key)
            for k, v in value.items():
                dicttoxml_handler(res, k, v)
            result.append(res)
    result = ElementTree.Element(element.keys()[0])
    for key, value in element[element.keys()[0]].items():
        dicttoxml_handler(result, key, value)
    return result

def xmlfiletodict(filename):
    return xmltodict(ElementTree.parse(filename).getroot())

def dicttoxmlfile(element, filename):
    ElementTree.ElementTree(dicttoxml(element)).write(filename)

def xmlstringtodict(xmlstring):
    return xmltodict(ElementTree.fromstring(xmlstring))

def dicttoxmlstring(element):
    return ElementTree.tostring(dicttoxml(element))

Let's try this out:

>>> from util import xmlstringtodict, dicttoxmlstring
>>> xmlstring = "<foo><bar>foobar</bar><baz><a>1</a><a>2</a></baz></foo>"
>>> xmldict = xmlstringtodict(xmlstring)
>>> print xmldict
{'foo': {'baz': {'a': ['1', '2']}, 'bar': 'foobar'}}
>>> dicttoxmlstring(xmldict)

The dict container doesn't preserve order, but as XML doesn't require that either, this is not an issue.

Arranging items in layers

Deciding on the layered approach boiled the 3D knapsack problem down to a 2D knapsack problem. The problem statement now was: how to best fit small rectangles into a big rectangle?

I opted for a simple and fast approach as it is explained in Jake Gordon's blog article. There is a demo of his code, and should the site vanish from the net, the code is on github. This solution seemed to generate results that were "good enough" while being simple to implement and fast to execute. If you look very hard, you can still see some design similarities between my and his packer.js code.

Jake Gordon got his idea from Jim Scott who wrote an article of arranging randomly sized lightmaps into a bigger texture.

There is also an ActiveState Code recipe from 2005 which looks very similar to the code by Jake Gordon.

The posts of Jake Gordon and Jim Scott explain the solution well, so I don't have to repeat it. Should the above resources go offline, I made backups of them here and here. There is also a backup of the ActiveState piece here.
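In case those links rot too, the core of the binary-tree packing idea is small enough to sketch here (my simplified reading of it, not the actual sisyphus or packer.js code; the function names are made up): each placed rectangle splits the remaining space into a node to the right and a node below, and every new rectangle goes into the first free node it fits in.

```python
def pack(blocks, width, height):
    # Each block is a (w, h) tuple; returns (x, y, w, h) placements.
    root = {"x": 0, "y": 0, "w": width, "h": height, "used": False}

    def find(node, w, h):
        # Descend into the split-off subspaces of used nodes,
        # otherwise check whether the block fits this free node.
        if node["used"]:
            return find(node["right"], w, h) or find(node["down"], w, h)
        if w <= node["w"] and h <= node["h"]:
            return node
        return None

    def split(node, w, h):
        # Mark the node used and create the leftover spaces
        # below and to the right of the placed block.
        node["used"] = True
        node["down"] = {"x": node["x"], "y": node["y"] + h,
                        "w": node["w"], "h": node["h"] - h, "used": False}
        node["right"] = {"x": node["x"] + w, "y": node["y"],
                         "w": node["w"] - w, "h": h, "used": False}

    placed = []
    for w, h in blocks:
        node = find(root, w, h)
        if node:
            split(node, w, h)
            placed.append((node["x"], node["y"], w, h))
    return placed

# four 50x50 blocks tile a 100x100 pallet exactly
assert pack([(50, 50)] * 4, 100, 100) == [(0, 0, 50, 50), (50, 0, 50, 50),
                                          (0, 50, 50, 50), (50, 50, 50, 50)]
```

Blocks that fit nowhere are simply skipped here; sisyphus treats such articles as leftovers for the top of the stack.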

Spreading items out

The algorithm above would cram all rectangles into a top-left position. As a result, there would mostly be space at the bottom and left edge of the available pallet surface. This is bad for two reasons:

  1. the mass is distributed unequally
  2. articles on the layer above are prone to overhang too much at the bottom or left edge, so that they tumble down

Instead, all articles should be spread out over the available pallet area, creating small gaps between them instead of big spaces at the pallet borders.

Since articles come in different sizes, it was not clear to me from the start what "equal distribution" would even mean; it was obviously not as simple as making the space between all rectangles equal. The spacing had to differ to accommodate differently sized boxes.

The solution I came up with makes use of the tree structure built by the algorithm that arranged the rectangles in the first place. The idea is to spread articles vertically first, recursively starting with the deepest nodes and spreading them out in their parent rectangle, and then to spread them horizontally, spreading the upper nodes first and recursively resizing and spreading child nodes.

The whole recursion idea created problems of its own. The most beautiful piece of recursion is the following function:

def get_max_horiz_nodes(node):
    if node is None or not node['article']:
        return [], []
    elif node['down'] and node['down']['article']:
        rightbranch, sr = get_max_horiz_nodes(node['right'])
        rightbranch = [node] + rightbranch
        downbranch, sd = get_max_horiz_nodes(node['down'])
        ar = rightbranch[len(rightbranch)-1]['article']
        ad = downbranch[len(downbranch)-1]['article']
        if ar['x']+ar['width'] > ad['x']+ad['width']:
            return rightbranch, sr+[downbranch[0]]
        return downbranch, sd+[rightbranch[0]]
    rightbranch, short = get_max_horiz_nodes(node['right'])
    return [node] + rightbranch, short

get_max_horiz_nodes() traverses all branches of the tree below node and returns a tuple containing the list of nodes forming the branch that stretches out widest, plus the list of nodes in the other branches (which are shorter than the widest).
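To make the return value concrete, here is a tiny hand-built tree run through the function (the function is repeated so the example is self-contained; the three-node tree is made up for illustration, with nodes as dicts holding 'article', 'right' and 'down' keys and each article storing its 'x' position and 'width'):

```python
def get_max_horiz_nodes(node):
    if node is None or not node['article']:
        return [], []
    elif node['down'] and node['down']['article']:
        rightbranch, sr = get_max_horiz_nodes(node['right'])
        rightbranch = [node] + rightbranch
        downbranch, sd = get_max_horiz_nodes(node['down'])
        ar = rightbranch[len(rightbranch)-1]['article']
        ad = downbranch[len(downbranch)-1]['article']
        if ar['x']+ar['width'] > ad['x']+ad['width']:
            return rightbranch, sr+[downbranch[0]]
        return downbranch, sd+[rightbranch[0]]
    rightbranch, short = get_max_horiz_nodes(node['right'])
    return [node] + rightbranch, short

# a root whose right neighbour stretches out to x=7,
# while the competing down branch only reaches x=6
b = {'article': {'x': 4, 'width': 2}, 'right': None, 'down': None}
a = {'article': {'x': 4, 'width': 3}, 'right': None, 'down': b}
root = {'article': {'x': 0, 'width': 4}, 'right': a, 'down': None}

widest, short = get_max_horiz_nodes(root)
assert widest == [root, a]   # the branch reaching out to x=7
assert short == [b]          # the first node of the shorter branch
```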

Another interesting problem was how to decide on the gaps between articles. This was interesting because the difference between the available length (or width) and the sum of the article lengths (or widths) was mostly not divisible by the number of gaps without leaving a remainder. So I needed an algorithm that gives me a list of integers, none differing from any other by more than one, that sums up to the total amount of empty space. In other words: how to divide a number m into n integer pieces such that none of them differs by more than 1 from any other.

Surprisingly, generating this list doesn't require any complex loop constructs:

>>> m = 108 # total amount
>>> n = 7 # number of pieces
>>> d,e = divmod(m, n)
>>> pieces = (e)*[(d+1)]+(n-e)*[d]
>>> print pieces
[16, 16, 16, 15, 15, 15, 15]
>>> sum(pieces) == m
True
>>> len(pieces) == n
True

You can test the algorithms that arrange rectangles and spread them out by cloning the git repository and then running:

PYTHONPATH=. python legacy/

The results will be svg files test1.svg and test2.svg, the latter showing the spread-out result.

Here is an example of how the output looks (without the red border, which is only drawn to mark the pallet border). The main codebase contains an adaptation of this legacy code for the actual problem.

Permutations without known length

When creating a layer out of articles of the same height, there are four strategies that I can choose from. It is four because there are two methods that I can either use or not: I can rotate each article by 90° by default or not, and I can rotate the pallet or not.

So every time I build a new layer, there are those four options. Depending on which strategy I choose, a different number of leftover articles will not fit into any layer, because each strategy is more or less efficient.

To try out all combinations of possible layer arrangements, I have to walk through a tree where at each node I branch four times, once for each strategy. Individual neighboring nodes might be the same, but this outcome is unlikely because the paths leading to those neighboring nodes differ.

To simplify, let's name the four possible strategies for each layer 0, 1, 2 and 3. I now want an algorithm that enumerates all possible permutations of those four numbers for "some" length. This is similar to counting, and the itertools module comes with the product() function that nearly does what I want. For example, should I know that my tree does not become deeper than 8 (read: no more than eight layers will be built), then I can just run:

>>> for i in itertools.product([0,1,2,3], repeat=8):
...     print i

This would work if the number of layers created with each strategy was the same. But as each strategy behaves differently depending on the input, it cannot be known before actually trying out a sequence of strategies how many layers it will yield.

The strategy (0,0,0,0,0,0,0,0) might create only 7 layers, so that (0,0,0,0,0,0,0,1), (0,0,0,0,0,0,0,2) and (0,0,0,0,0,0,0,3) yield the same output, as only the first 7 strategies count. This would create duplicates which I should not waste cpu cycles on later.

It might also be that (0,0,0,0,0,0,1,0) turns out to be a combination of strategies that creates more than 8 layers, in which case the whole thing fails.

So what I need is a generator that gives me a new strategy each time it is asked for one, dynamically extending the tree of possible permutations to accommodate any size.

Since the tree can become humongous (4^11 = 4194304 nodes at depth 11), already traversed nodes should automatically be cleaned up, so that only the nodes making up the current list of strategies stay in memory at any point in time.

This all sounded complicated, which made me even more surprised by how simple the solution was in the end.

Here is a version of the algorithm that could easily be ported to C:

class Tree:
    def __init__(self, branch_factor):
        self.branch_factor = branch_factor
        self.root = {"value": None, "parent": None, "children": []}
        self.current = self.root

    def next(self):
        if not self.current["children"]:
            self.current["children"] = [{"value": val, "parent": self.current, "children": []}
                                        for val in range(self.branch_factor)]
        self.current = self.current["children"][0]
        return self.current["value"]

    def reset(self):
        if self.current["parent"]:
            self.current["parent"]["children"].pop(0)
        else:
            return False
        if self.current["parent"]["children"]:
            self.current = self.root
            return True
        self.current = self.current["parent"]
        return self.reset()

    def __str__(self):
        return str(self.root)

It would be used like this:

>>> tree = Tree(4)
>>> print,,
>>> while tree.reset():
...     print,,

Which would be equivalent to calling itertools.product([0,1,2,3], repeat=3). The special part is that in each iteration of the loop I can call an arbitrary number of times, just as often as needed. Whenever I cannot generate an additional layer anymore, I call tree.reset() to start a new permutation.

For my code I used a Python-specific version, which is a generator:

def product_varlength(branch_factor):
    root = {"value": None, "parent": None, "children": []}
    current = root
    while True:
        if not current["children"]:
            current["children"] = [{"value": val, "parent": current, "children": []}
                                   for val in range(branch_factor)]
        current = current["children"][0]
        if (yield current["value"]):
            while True:
                if current["parent"]:
                    current["parent"]["children"].pop(0)
                else:
                    return
                if current["parent"]["children"]:
                    current = root
                    break
                current = current["parent"]

It is used like this:

it = product_varlength(4)
print, it.send(False), it.send(False)
while True:
    print it.send(True), it.send(False), it.send(False)

Again, the expression in the loop can contain any number of it.send(False) calls. The first it.send(True) tells the generator to do a reset.
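Since the snippets above are Python 2, here is, as a sanity check, my assumed Python 3 spelling of the same generator (next(it) instead of, and the emitted values collected instead of printed), verified against itertools.product for a fixed depth of 3 and branch factor 2:

```python
import itertools

def product_varlength(branch_factor):
    # lazily grown tree of permutations; traversed branches are popped
    # so only the current path stays in memory
    root = {"value": None, "parent": None, "children": []}
    current = root
    while True:
        if not current["children"]:
            current["children"] = [{"value": val, "parent": current, "children": []}
                                   for val in range(branch_factor)]
        current = current["children"][0]
        if (yield current["value"]):
            # reset requested: drop the finished branch, move to a sibling
            while True:
                if current["parent"]:
                    current["parent"]["children"].pop(0)
                else:
                    return  # tree exhausted
                if current["parent"]["children"]:
                    current = root
                    break
                current = current["parent"]

it = product_varlength(2)
perms = [(next(it), it.send(False), it.send(False))]
while True:
    try:
        perms.append((it.send(True), it.send(False), it.send(False)))
    except StopIteration:
        break

assert perms == list(itertools.product(range(2), repeat=3))
```

With a fixed depth the generator reproduces itertools.product exactly; the point is that each permutation may also be cut short or extended on the fly.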


adblocking with a hosts file

categories: blog

Naturally, adblock plus is a must-have extension for firefox, but other programs displaying websites might not offer such a facility.

To block ads in any application accessing the internet, a hosts file that redirects requests for certain hostnames to (which will refuse the incoming connections) provides a universal method to get rid of advertisements.

The question is how to obtain a list of malicious hosts.

Searching around revealed three lists that seemed to be well-maintained:


The corresponding hosts-file entries can be found under these urls, respectively:


I also looked into the adblock plus filter rules, but they mostly contain expressions for the path, query and fragment parts of URIs and not so much hostnames. This makes sense: with its syntax, adblock plus is able to block with much more accuracy than just blocking whole domains.

Now I wanted a combined list without duplicates, so I cleaned the lists up using the following sed expression:

sed 's/\([^#]*\)#.*/\1/;s/[ \t]*$//;s/^[ \t]*//;s/[ \t]\+/ /g'

It removes comments as well as whitespace at the beginning and end of each line, and reduces any additional whitespace (between ip and hostname) to a single space. I would then run the output through sort and uniq and append the result to my /etc/hosts.

What is still problematic about this approach is that if one doesn't have a service bound to, then every application trying to establish a TCP connection to it will meaninglessly wait for localhost to respond until the timeout is reached. To avoid this and immediately send a TCP RST when the browser is redirected to as it tries to retrieve an advertisement, I use the following iptables rule:

iptables -A INPUT -i lo -p tcp -m tcp --dport 80 -j REJECT --reject-with tcp-reset

Some hosts that you might also want to add to your /etc/hosts because they are there to track users are:

They are not included by default in the lists above because it might break some websites if they were.

EDIT (2012-05-21)

I forgot to include port 443 for https in the iptables rule above. For example, google uses https for some of its services, and others might too, so don't forget to also reset connections to port 443 with the rule given above.


gnuplot live update

categories: blog

When you have data that is not generated instantly but trickles in point by point (either because the calculation is difficult or because the data source produces new datapoints over time), but you still want to visualize it with gnuplot, the easiest option is to hit the refresh button in the wx interface every time you want to refresh the data.

A cooler way is to make gnuplot "autoupdate" the data it displays whenever new data is available, automagically.

Let's simulate a data source with this python script, which just outputs coordinates on a circle of radius one around the coordinate center and waits 0.5 seconds after every step.

from math import pi, sin, cos
from time import sleep

for alpha in [2*pi*i/16.0 for i in range(16)]:
    print sin(alpha), cos(alpha)
    sleep(0.5)

The magic is done by this shell script, which does nothing more than read lines from stdin, append them to a buffer, and print the buffer wrapped in gnuplot syntax every time new input arrives.

while read line; do
    lines="$lines$line\n"
    echo "plot \"-\""
    echo -n $lines
    echo "e"
done

These two scripts can now be used like this:

python -u | sh | gnuplot -p

The -u option has to be passed to python to enable unbuffered output. The -p option for gnuplot makes the plot window remain even after the main gnuplot exited.
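For the record, the buffering wrapper could also be written in python (an equivalent sketch with a made-up function name, not what I actually used; fed sys.stdin and writing to sys.stdout, it would replace the sh script in the pipeline):

```python
import io

def replot_stream(lines_in, out):
    # for every new input line, emit a complete 'plot "-"' block
    # containing all data points seen so far
    buf = []
    for line in lines_in:
        buf.append(line.rstrip("\n"))
        out.write('plot "-"\n')
        out.write("\n".join(buf) + "\n")
        out.write("e\n")
        out.flush()

# simulate two incoming data points
out = io.StringIO()
replot_stream(["1 2\n", "3 4\n"], out)
assert out.getvalue() == 'plot "-"\n1 2\ne\nplot "-"\n1 2\n3 4\ne\n'
```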

Since with this setup gnuplot would auto-scale the coordinate axes according to the current input, a more beautiful output with a fixed-size coordinate frame can be produced by:

{ echo "set xrange [-2:2]"; echo "set yrange [-2:2]"; python -u | sh; } | gnuplot -p

This way you can also pass more options like title, style, terminal or others.


persistent NAT setup

categories: blog

I often have some tablet or smartphone running linux connected to my usb port, and I always have to look up how to enable NAT-ing on my laptop so that those devices can access the internet via the usb-ethernet connection. So here is how to do it:

sysctl net.ipv4.ip_forward=1
iptables -t nat -A POSTROUTING -j MASQUERADE

But I will not need this reminder anymore, because this is how to make the setup persistent between reboots:

$ echo net.ipv4.ip_forward = 1 > /etc/sysctl.d/ip_forward.conf
$ cat /etc/iptables/rules.v4
*nat
-A POSTROUTING -j MASQUERADE
COMMIT

The content of /etc/iptables/rules.v4 is normally generated via iptables-save, so some extra work might be needed to combine it with possibly existing content. The initscript that runs iptables-restore with this file as input is provided by the iptables-persistent package.


welcome to gobject introspection

categories: blog

So I was writing a quick python/gtk/webkit application for my own personal pleasure, starting with the usual:

import gtk
import gobject
import webkit
import pango

When the interface was already working pretty nicely after about 500 LOC, I began adding some more application logic, starting with figuring out how to properly do asynchronous http requests with my gobject main loop.

Threading is of course not an option; it had to be a simple event-based solution. Gobject provides gobject.io_add_watch to react to activity on some socket, but there was no library in sight to parse the http communication over a socket connection.

At this point let me also shortly express my dislike for the synchronous nature of urllib/urllib2. This kind of behaviour is unacceptable in my eyes for network based I/O and a reason why I recently had a look at node.js.

But back to the topic. After some searching I found out that one could use libcurl together with gobject callbacks, so using this pycurl example as a basis I wrote the following snippet, which fetches a couple of http resources in parallel in an asynchronous fashion:

import os, sys, pycurl, gobject
from cStringIO import StringIO

sockets = set()
running = 1

urls = ("","","")

def socket(event, socket, multi, data):
    if event == pycurl.POLL_REMOVE:
        sockets.remove(socket)
    elif socket not in sockets:
        sockets.add(socket)

m = pycurl.CurlMulti()
m.setopt(pycurl.M_PIPELINING, 1)
m.setopt(pycurl.M_SOCKETFUNCTION, socket)
m.handles = []
for url in urls:
    c = pycurl.Curl()
    c.url = url
    c.body = StringIO()
    c.http_code = -1
    m.handles.append(c)
    c.setopt(c.URL, c.url)
    c.setopt(c.WRITEFUNCTION, c.body.write)
    m.add_handle(c)

while pycurl.E_CALL_MULTI_PERFORM == m.socket_all()[0]:
    pass

def done():
    for c in m.handles:
        c.http_code = c.getinfo(c.HTTP_CODE)
    for c in m.handles:
        data = c.body.getvalue()
        print "%-53s http_code %3d, %6d bytes" % (c.url, c.http_code, len(data))

def handler(sock, *args):
    while True:
        (ret, running) = m.socket_action(sock, 0)
        if ret != pycurl.E_CALL_MULTI_PERFORM:
            break
    if running == 0:
        done()
    return True

for s in sockets:
    gobject.io_add_watch(s, gobject.IO_IN | gobject.IO_OUT | gobject.IO_ERR, handler)

gobject.MainLoop().run()


This works nicely, and I would have stuck with it had larsc not suggested to use libsoup together with gobject introspection for the python binding.

Of course I could have kept using pycurl, because curl is cool, but every python binding to a C library adds another point of possible failure or outdatedness when upstream changes.

This issue is now nicely handled by using gobject introspection, or pygobject in the case of python. What it does is use so-called "typelibs" to dynamically generate a binding to any gobject code. Typelibs are generated from gir files, which are XML representations of the library API.

In Debian the typelibs are stored in /usr/lib/girepository-1.0/, and even if you don't know the mechanism you will probably already have lots of definitions in this directory. Additional typelibs are installed with gir packages like gir1.2-gtk-3.0. They are already available for all kinds of libraries like clutter, gconf, glade, glib, gstreamer, gtk, pango, gobject and many more.

To use them, my import line now looks like this:

from gi.repository import Gtk, GObject, GdkPixbuf, Pango, WebKit

This also solves the problem I laid out above about grabbing data over http from within a gobject event loop.

from gi.repository import Soup

Soup can do that, but there is no "real" python binding for it. With pygobject one doesn't need a "real" binding anymore; I just import it as shown above and voilà, I can interface the library from my python code!

Converting my application from the normal gtk/gobject/pango/webkit bindings to their pygobject counterparts was also a piece of cake: I learned how to do it and finished in under an hour. A really good writeup about how to do it can be found here. For some initial cleanup, this regex-based script comes in surprisingly handy as well.
