Mister Muffin Blog

dudle

Thu, 19 May 2011 09:33 categories: code

In my quest to minimize the external third party services I rely on in my daily life, I stumbled upon dudle, an online poll system like doodle, only better :)

The obvious problem with giving your availability for a meeting, or your preferences on a subject, to a third party service like doodle is the privacy issue this entails. Since something as simple as a poll can also be handled by software running on my own machine, at first I just wanted something like doodle running on one of my servers. This way, whenever I or anybody else voted, we would no longer depend on trusting someone like doodle; voters would only have to trust me, the owner of the server (and hopefully they trust me more than an unknown money-making entity).

But one can do even better than that! Benjamin Kellermann implemented an online poll platform where the availability or preferences of the participants are not even known to the server itself (nor to the other participants). Only once everybody has voted can a sum of all the votes be calculated to decide on a timeslot or choice of subject. Hence, the user doesn't even have to trust the server that runs dudle. All the server and the other participants get to see are encrypted availability vectors from which nobody can infer the user's original choices. Only when all of them are combined can a sum be calculated that represents the overall availability of all users at a given time.
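To get a feeling for how a sum can be computed without anybody seeing the individual votes, here is a minimal Python sketch of additive blinding in the style of a DC-net (the name of the dudle extension fetched by the script further down). It is not dudle's actual protocol: the modulus, the way the pairwise keys are generated and all function names are assumptions made for illustration only.

```python
import secrets

MODULUS = 2**31 - 1  # assumed value; it only has to exceed the number of voters


def make_pairwise_keys(n_users, n_slots):
    """One random vector per pair of users (i, j) with i < j.

    A real system would derive these from a key exchange between the two
    users; here they are simply drawn at random for the demonstration.
    """
    return {
        (i, j): [secrets.randbelow(MODULUS) for _ in range(n_slots)]
        for i in range(n_users)
        for j in range(i + 1, n_users)
    }


def blind(user, votes, keys, n_users):
    """Blind one user's 0/1 availability vector.

    The user adds the key shared with every higher-numbered user and
    subtracts the key shared with every lower-numbered one.
    """
    blinded = list(votes)
    for other in range(n_users):
        if other == user:
            continue
        key = keys[(min(user, other), max(user, other))]
        sign = 1 if user < other else -1
        blinded = [(b + sign * k) % MODULUS for b, k in zip(blinded, key)]
    return blinded


def tally(blinded_vectors):
    """Sum all blinded vectors; every pairwise key cancels out."""
    return [sum(column) % MODULUS for column in zip(*blinded_vectors)]


votes = [[1, 0, 1], [1, 1, 0], [0, 1, 1]]  # three users, three timeslots
keys = make_pairwise_keys(len(votes), 3)
blinded = [blind(u, v, keys, len(votes)) for u, v in enumerate(votes)]
print(tally(blinded))  # [2, 2, 2]
```

Each blinded vector on its own looks random, but because every key is added once and subtracted once, the keys vanish from the total and only the per-slot sums remain.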

In use, dudle is as simple as doodle. The only thing that might sound tedious is the private key each user has to take care of after registration (which of course works without giving away any private detail like an email address), but this is nicely solved by a bookmarklet that enters the private key into a poll field. I showed the setup and its advantages to some non-CS people and was very pleased by the positive responses I got. Big kudos to Benjamin for his great work on this piece of software!

I am running thttpd on mister-muffin.de, and by design it chroots into the www directory to increase security. As a result, the only CGI scripts that can be run are statically compiled executables. Dudle is written in ruby and uses git as the database backend. Hence I had to set up a minimal chroot environment inside my www directory so that dynamically linked executables like git and ruby would work. I could have bothered with compiling both statically but was not up for the trouble (yet). A requirement for this environment was of course that it be very small and only include what git and ruby need to run: dynamic libraries and ruby modules. Another requirement was that it contain no setuid programs that an attacker could use to break out of the chroot environment by becoming root.

One way to do that would be a normal debootstrap including git and ruby, followed by manually removing everything that is not needed. It turned out that a normal debootstrap retrieves lots of things that are not needed anyway, all of which have to be deleted again afterwards, which is not worth the hassle.

My idea was to retrieve the git and ruby Debian packages together with all their dependencies, resolved recursively, and then just extract those packages into a directory. Since I didn't want to do the dependency resolution manually, I let myself be inspired by multistrap and used apt to do that. Using apt for this task (as multistrap does) is possible because one can specify a custom target directory for apt on the command line.
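What "all the dependencies of those recursively" means can be sketched in a few lines of Python. This is only an illustration of the idea that apt implements for real; the dependency table below is made up and much simpler than the real package metadata (no versions, alternatives or virtual packages), and the function name is hypothetical.

```python
from collections import deque

# Made-up dependency data for illustration; the real data comes from apt.
DEPENDS = {
    "ruby": ["ruby1.8"],
    "ruby1.8": ["libruby1.8", "libc6"],
    "libruby1.8": ["libc6", "zlib1g"],
    "git-core": ["git"],
    "git": ["libc6", "zlib1g", "libcurl3-gnutls"],
    "libcurl3-gnutls": ["libc6"],
    "zlib1g": ["libc6"],
    "libc6": [],
}


def dependency_closure(wanted, depends):
    """Return the wanted packages plus everything they depend on, recursively."""
    seen = set()
    queue = deque(wanted)
    while queue:
        package = queue.popleft()
        if package in seen:
            continue  # already handled, also protects against cycles
        seen.add(package)
        queue.extend(depends.get(package, []))
    return sorted(seen)


print(dependency_closure(["ruby", "git-core"], DEPENDS))
```

Every package in the returned list would then be downloaded and unpacked into the target directory, which is what the apt-get and dpkg -x calls in the script below do.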

Since my final setup did not contain /bin/sh which ruby needed to call git, I had to patch dudle. I also proposed a way to get rid of the htdigest dependency to Benjamin and he included that and my /bin/sh patch into dudle. ☺

#!/bin/sh -ex

# check for fakeroot
if [ "$LOGNAME" = "root" ] \
|| [ "$USER" = "root" ] \
|| [ "$USERNAME" = "root" ] \
|| [ "$SUDO_COMMAND" != "" ] \
|| [ "$SUDO_USER" != "" ] \
|| [ "$SUDO_UID" != "" ] \
|| [ "$SUDO_GID" != "" ]; then
        echo "don't run this script as root - there is no need to"
        exit 1
fi

# modify these
ARCH="amd64"
DIST="squeeze"
MIRROR="http://127.0.0.1:3142/ftp.de.debian.org/debian"
DIRECTORY="`pwd`/debian-$DIST-$ARCH-ministrap"

# re-execute script in fakeroot
if [ "$FAKEROOTKEY" = "" ]; then
        echo "re-executing script inside fakeroot"
        fakeroot "$0"
        rsync -Phaze ssh $DIRECTORY/ mister-muffin.de:/var/www/
        ssh mister-muffin.de "chown -R www-data:www-data /var/www/dudle.mister-muffin.de/"
        exit
fi

# apt options
APT_OPTS="-y"
APT_OPTS=$APT_OPTS" -o Apt::Architecture=$ARCH"
APT_OPTS=$APT_OPTS" -o Dir::Etc::TrustedParts=$DIRECTORY/etc/apt/trusted.gpg.d"
APT_OPTS=$APT_OPTS" -o Dir::Etc::Trusted=$DIRECTORY/etc/apt/trusted.gpg"
APT_OPTS=$APT_OPTS" -o Apt::Get::AllowUnauthenticated=true"
APT_OPTS=$APT_OPTS" -o Apt::Get::Download-Only=true"
APT_OPTS=$APT_OPTS" -o Apt::Install-Recommends=false"
APT_OPTS=$APT_OPTS" -o Dir=$DIRECTORY/"
APT_OPTS=$APT_OPTS" -o Dir::Etc=$DIRECTORY/etc/apt/"
APT_OPTS=$APT_OPTS" -o Dir::Etc::SourceList=$DIRECTORY/etc/apt/sources.list"
APT_OPTS=$APT_OPTS" -o Dir::State=$DIRECTORY/var/lib/apt/"
APT_OPTS=$APT_OPTS" -o Dir::State::Status=$DIRECTORY/var/lib/dpkg/status"
APT_OPTS=$APT_OPTS" -o Dir::Cache=$DIRECTORY/var/cache/apt/"

# clean root directory
rm -rf $DIRECTORY

# initial setup for apt to work properly
mkdir -p $DIRECTORY
mkdir -p $DIRECTORY/etc/apt/
mkdir -p $DIRECTORY/etc/apt/sources.list.d/
mkdir -p $DIRECTORY/etc/apt/preferences.d/
mkdir -p $DIRECTORY/var/lib/apt/
mkdir -p $DIRECTORY/var/lib/apt/lists/partial/
mkdir -p $DIRECTORY/var/lib/dpkg/
mkdir -p $DIRECTORY/var/cache/apt/
# apt somehow needs this file to be present
touch $DIRECTORY/var/lib/dpkg/status

# fill sources.list
echo deb $MIRROR $DIST main > $DIRECTORY/etc/apt/sources.list

# update and install git and ruby
apt-get $APT_OPTS update
apt-get $APT_OPTS install ruby git-core libgettext-ruby1.8 libjson-ruby1.8

# unpack downloaded archives
for deb in $DIRECTORY/var/cache/apt/archives/*.deb; do
  dpkg -x $deb $DIRECTORY
done

# delete obsolete directories
rm -rf $DIRECTORY/usr/share/
rm -rf $DIRECTORY/usr/lib/perl/
rm -rf $DIRECTORY/usr/lib/gconv/
rm -rf $DIRECTORY/usr/lib/git-core/
rm -rf $DIRECTORY/usr/sbin/
rm -rf $DIRECTORY/var/
rm -rf $DIRECTORY/bin/
rm -rf $DIRECTORY/sbin/
rm -rf $DIRECTORY/selinux/
rm -rf $DIRECTORY/etc/*

# delete all setuid programs
find $DIRECTORY -perm -4000 -delete

# delete all binaries except for "git" and "ruby"
find $DIRECTORY/usr/bin/ -type f -o -type l | egrep -v "ruby|git$" | xargs rm -rf

# git needs /etc/passwd otherwise git says: "You don't exist, go away!"
cat > $DIRECTORY/etc/passwd << __END__
www-data:x:33:33:www-data:/var/www:/bin/sh
__END__

# don't forget to create the /tmp directory for dudle
mkdir -m 777 $DIRECTORY/tmp

# get latest dudle
bzr branch https://dudle.inf.tu-dresden.de/unstable/ $DIRECTORY/dudle.mister-muffin.de
( cd $DIRECTORY/dudle.mister-muffin.de; make; )
bzr branch https://dudle.inf.tu-dresden.de/unstable/extensions/dc-net/ $DIRECTORY/dudle.mister-muffin.de/extensions/dc-net/
( cd $DIRECTORY/dudle.mister-muffin.de/extensions/dc-net/; make; )

# fix shebang
find $DIRECTORY/dudle.mister-muffin.de/ -type f -regex ".*\.cgi\|.*\.rb" \
    | xargs sed -i 's/#!\/usr\/bin\/env ruby/#!\/usr\/bin\/ruby/'

The above script builds a minimal chroot environment, deletes everything that is not needed, fetches dudle and deploys it to my server. The comments in the code should explain everything.
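The "no setuid programs" requirement is easy to double-check after the build. The following Python sketch walks a directory tree and reports regular files that still carry the setuid bit. It is an extra check I would consider useful, not part of the script above; the function names are hypothetical, and unlike the find call it only looks at regular files.

```python
import os
import stat


def has_setuid(mode):
    """True if mode describes a regular file with the setuid bit set."""
    return bool(stat.S_ISREG(mode) and mode & stat.S_ISUID)


def find_setuid(root):
    """Yield the paths of all setuid regular files below root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat so that symlinks are not followed out of the tree
            if has_setuid(os.lstat(path).st_mode):
                yield path


print(has_setuid(stat.S_IFREG | 0o4755))  # True
print(has_setuid(stat.S_IFREG | 0o0755))  # False
```

Calling list(find_setuid("debian-squeeze-amd64-ministrap")) in the directory where the script was run should then return an empty list.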


download apple trailer

Wed, 16 Mar 2011 14:51 categories: code

It is easy to find out why one cannot just download the movie trailers at apple.com and how to work around it, but I automated it even further with a small script that prints the direct commands for downloading the 1080p content, which is the reason I use apple trailers instead of youtube in the first place. I also recently discovered that some trailer pages already have the h prepended to the file name in the download URLs (h1080p.mov instead of 1080p.mov), so the script takes care of that as well:

import urllib2, re, sys

if len(sys.argv) != 2:
    print "supply apple trailer url"
    exit()

f = urllib2.urlopen(sys.argv[1]+'/includes/playlists/web.inc')
urls = re.findall('http://trailers.apple.com/movies/[^"\']+1080p.mov', f.read())

for url in urls:
    print "wget -U quicktime "+re.sub("(?<!h)1080p.mov$", "h1080p.mov", url)
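The script above is Python 2 (urllib2, print statement). For readers on Python 3, the URL rewriting step can be sketched on its own as follows. The negative lookbehind (?<!h) is what prevents a second h from being added; the function name and the two example URLs are made up for illustration.

```python
import re


def hd_download_url(url):
    """Insert the 'h' before '1080p.mov' unless it is already there."""
    return re.sub(r"(?<!h)1080p\.mov$", "h1080p.mov", url)


# Made-up example URLs, one without and one with the leading 'h'.
examples = (
    "http://trailers.apple.com/movies/example/example-tlr1_1080p.mov",
    "http://trailers.apple.com/movies/example/example-tlr1_h1080p.mov",
)
for url in examples:
    print("wget -U quicktime " + hd_download_url(url))
```

Both lines of output end in _h1080p.mov, so the same wget command works no matter which variant the trailer page lists.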


keene fm transmitter

Mon, 14 Mar 2011 07:30 categories: code

tl;dr: it works! software to be downloaded here

Half a year ago I purchased a USB FM transmitter from Keene Retail Ltd and, funnily enough, the audio part worked with linux immediately, as it just showed up as a USB audio device that linux already had drivers for.

[433346.713773] usbcore: registered new interface driver snd-usb-audio
[433346.731642] generic-usb 0003:046D:0A0E.0001: hiddev0,hidraw0: USB HID
v1.10i Device [HOLTEK  B-LINK USB Audio  ] on usb-0000:00:1d.0-1.2/input2
[433346.731662] usbcore: registered new interface driver usbhid
[433346.731664] usbhid: USB HID core driver

But of course there was a proprietary USB control protocol to set frequency, TX, preemphasis, volume and so on. So I started using the horrible windows-only control software and captured the USB messages it sent to the stick using usbsnoop. Doing that was a horrible experience, since the program constantly made my computer freeze or disabled USB entirely, so that I had to switch off the whole machine and reboot - no idea what was causing this and also no incentive to find out. In the end I managed to capture enough data to basically understand the protocol. But as with every good proprietary protocol you reverse engineer, there are still things that are ambiguous, don't make sense, are redundant, or show how the protocol developed from a less capable state. I had all of that, and I still don't fully understand the design goals, but in the end (and that is what counts) I was able to assemble a small C program that can control the FM transmitter just as the windows client can.

Something out of the ordinary happened when I contacted the guys from Keene and asked whether they would want to help me write a client for their hardware that works on linux. It took some months but in the end I got this awesome message:

Dear Johannes,

I'm sorry this has taken a while but please find attached the source code for this unit.

If you are successful in producing linux drivers and software I would be happy to add your program as a download from our site, or link to your site should you prefer.

Good luck!

Kind regards,

Alan Quinby Director Keene Electronics Ltd

Attached I found a rar archive with a number of *.asm, *.LST, *.inc, and *.OBJ files and some files named HT82A821R with endings like *.bin, *.CV, *.DBG, *.dsw, *.MAK, *.MAP, *.OPT, *.OTP, *.PRJ and *.TSK. The first bunch of files was just assembler text with some C in between, but I couldn't figure out how they all belong together nor which toolchain they belong to. I also wonder why one would do a project in pure assembler instead of just using C. If anybody has a clue about what those files could mean, don't hesitate to tell me, but since reverse engineering had already given me results I didn't feel brave enough to dig further into those piles of assembler. Nevertheless it was great of them to just send the code around - something you only rarely see! So kudos to Keene here.

Now the results of my reverse engineering can be found here and you can just compile and run keenectl.c with your stick inserted. You also have to run it as root and you mustn't forget to rmmod the usbhid module beforehand, but the program will tell you that as well once an error occurs. To set all values to their defaults just run:

./keenectl - - - - - - -

The arguments correspond to TX (0-23), preemphasis (50us or 75us), channels (mono or stereo), frequency (floats from 76.0-108.0), PA (30 to 120), enable or disable and mute or unmute. If one of the arguments is - the default value is set. A command explicitly specifying the defaults would hence look like:

./keenectl 0 50us stereo 90.0f 120 enable unmute

To send all parameters, two 8 byte usb control messages are dispatched.
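The argument conventions described above (seven positional arguments, - for the default, the stated value ranges) can be summarized in a small Python sketch. This is not a port of keenectl.c and does not talk to the device; the function and key names are hypothetical, and accepting the trailing f on the frequency is an assumption based on the example command.

```python
DEFAULTS = {
    "tx": 0, "preemphasis": "50us", "channels": "stereo",
    "frequency": 90.0, "pa": 120, "enabled": True, "muted": False,
}


def _choice(value, allowed, what):
    """Check that value is one of the allowed keywords."""
    if value not in allowed:
        raise ValueError("%s must be one of %s" % (what, ", ".join(allowed)))
    return value


def _in_range(value, low, high, what):
    """Check that a number lies within the inclusive range low..high."""
    if not low <= value <= high:
        raise ValueError("%s must be within %s-%s" % (what, low, high))
    return value


def parse_args(args):
    """Turn the seven command line arguments into a settings dictionary."""
    if len(args) != 7:
        raise ValueError("expected exactly 7 arguments")
    tx, pre, chan, freq, pa, enable, mute = args
    settings = dict(DEFAULTS)
    if tx != "-":
        settings["tx"] = _in_range(int(tx), 0, 23, "TX")
    if pre != "-":
        settings["preemphasis"] = _choice(pre, ("50us", "75us"), "preemphasis")
    if chan != "-":
        settings["channels"] = _choice(chan, ("mono", "stereo"), "channels")
    if freq != "-":
        # assumption: a trailing 'f' as in '90.0f' is tolerated
        settings["frequency"] = _in_range(
            float(freq.rstrip("f")), 76.0, 108.0, "frequency")
    if pa != "-":
        settings["pa"] = _in_range(int(pa), 30, 120, "PA")
    if enable != "-":
        settings["enabled"] = _choice(
            enable, ("enable", "disable"), "enable") == "enable"
    if mute != "-":
        settings["muted"] = _choice(mute, ("mute", "unmute"), "mute") == "mute"
    return settings


print(parse_args(["-"] * 7) == DEFAULTS)  # True
```

With this, the two example commands above (all dashes, and the explicit defaults) produce the same settings.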

When you are done, you can play the music for example using mplayer:

mplayer -ao alsa:device=hw=1.0 YourMusic.wav

Have fun!


disposable email

Tue, 08 Mar 2011 09:57 categories: code

Signing up for a new random internet service (maybe just to try it out or to use it only once, e.g. to post one message to a forum) is usually a hassle because they all want a valid email address to "verify" your identity. Since you wouldn't want to hand out your private email address to everybody, you could have a second email account at a big provider where you don't care about the spam they may subsequently send you. But if you use it for a lot of services the spam easily gets out of hand (also it's not really geeky since everybody does it like that).

You could also set up your own email server and create a new account for every website you want to register at. This has the advantage that you can easily disable an account once it receives spam or once you don't need it anymore, leaving the others intact. You could even find out if someone gave your email address to someone else. Suppose you want to register at foobar.com, so you create the account foobar_com@myserver.org. If you later receive emails from someone totally different on that address, you can be sure that foobar.com gave your address away, as they are the only ones (besides yourself) who know about it. That approach has the disadvantage, though, that setting up a new account requires some extra work that you have to go through again and again - even if it only means calling one script that does it for you. You of course want something totally automated without any manual labour.

A solution for that would be disposable email addresses. You just pick a website that offers this service, pick an address nobody else has used yet, give it to the internet service you want to register at, and afterwards pick up the mail that was sent to it at the disposable email provider. This has the advantage that the address doesn't have to be set up beforehand - those services simply catch any mail they get and make it public under the respective address. Another advantage is that, again, you easily see whether someone gives your email address away (as with running your own email server). A disadvantage is that the address you want to use might already have been picked by someone else: if you want to use foobar_com@disposableprovider.org, someone else might already have given that address to foobar.com, so you can't register with it and have to pick a new one that is harder to remember. Also, disposable email providers mostly don't require you to sign up with them, so you have to keep track of all the addresses you use yourself. The biggest disadvantage, though, is that most internet services (naturally) don't want you to give them a disposable email address, so they block every disposable email provider you can find with an internet search.

Now I needed a disposable email solution with all the advantages of the above and none of the disadvantages, so I programmed my own little email server that handles disposable email and is of course not blacklisted by the big providers. It has the additional advantage that only I use it, so there are no clashes with addresses that somebody else used before. And I can always easily get a list of all addresses I ever used and manage them.

It turned out to be really easy to do in python, and here is the server script that I have been using for half a year on mister-muffin.de:

#!/usr/bin/env python

from smtpd import SMTPServer
import asyncore
from email import message_from_string
import os
from datetime import datetime

#ripped out of mimetypes.MimeTypes().types_map_inv
exts = {
    [...]
}

class DisposableMailServer(SMTPServer):
    def process_message(self, peer, mailfrom, rcpttos, data):
        rcpttos = [x[:-17] for x in rcpttos
                                if x.endswith('@mister-muffin.de')
                                and len(x)>17]
        if not rcpttos:
            return None

        print "new mail for", rcpttos

        msg = message_from_string(data)
        basepath = '/directory/where/to/put/all/your/email/in'
        counter = 1
        now = datetime.now().isoformat()
        header, _ = msg.as_string().split('\n\n', 1)

        for name in rcpttos:
            path = os.path.join(basepath, name, now)
            if not os.path.exists(path):
                os.makedirs(path)

            with open(os.path.join(path, "header.txt"), "w") as f:
                f.write(header)

            for part in msg.walk():
                if part.get_content_maintype() == 'multipart':
                    continue

                filename = part.get_filename()

                if not filename:
                    ext = exts.get(part.get_content_type())
                    if not ext:
                        ext = '.bin'
                    filename = 'part-%03d%s'%(counter,ext)
                else:
                    filename = 'part-%03d-%s'%(counter,filename)

                filename = os.path.join(path, filename)

                if part.get_payload(decode=True):
                    with open(filename, "w") as f:
                        f.write(part.get_payload(decode=True))

                counter += 1

if __name__ == '__main__':
    proxy = DisposableMailServer(("mister-muffin.de", 25), ("localhost", 25))
    try:
        asyncore.loop()
    except KeyboardInterrupt:
        pass

Of course this means that you can send any email to pickanything [at] mister-muffin.de and I will notice it - you have no idea how much email I get at josch [at] mister-muffin.de, but I don't think I will ever use this address for real email. (I obfuscated these addresses here, replacing @ with [at] and adding spaces, so that search engines won't list them.)

I was very surprised how simple it is to write something like that in python. What it does is analyze all incoming email, create a new directory in a target directory named after the local part of the recipient address, and save the email in a subdirectory named with a timestamp (which I thought would be unique enough). Inside that directory the script places the raw email header for possible inspection and the payload, which might be a text message, the same thing as html, and any attachment that was sent.
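The server script above is Python 2, and the smtpd and asyncore modules it relies on were removed from the standard library in Python 3.12. The storage step on its own can be sketched in Python 3 as follows. This is a sketch under assumptions, not a drop-in replacement: the function names are hypothetical, the extension lookup uses the mimetypes module instead of the hand-copied table, and the safe_name policy (an addition not present in the original script) is just one conservative way to keep recipient and attachment names from escaping the target directory.

```python
import mimetypes
import os
import re
from datetime import datetime
from email import message_from_bytes


def safe_name(name):
    """Replace every character outside a conservative set (assumed policy)."""
    cleaned = re.sub(r"[^A-Za-z0-9._+-]", "_", name).lstrip(".")
    return cleaned or "unnamed"


def store_message(raw, recipient, basepath):
    """Save header and decoded parts of one raw message (bytes).

    Files go to basepath/<recipient>/<timestamp>/; the function returns
    that directory and the list of part file names it wrote.
    """
    msg = message_from_bytes(raw)
    path = os.path.join(basepath, safe_name(recipient),
                        datetime.now().isoformat())
    os.makedirs(path, exist_ok=True)

    # everything before the first empty line is the header
    with open(os.path.join(path, "header.txt"), "wb") as f:
        f.write(msg.as_bytes().split(b"\n\n", 1)[0])

    written = []
    counter = 1
    for part in msg.walk():
        if part.get_content_maintype() == "multipart":
            continue  # containers carry no payload of their own
        filename = part.get_filename()
        if filename:
            filename = "part-%03d-%s" % (counter, safe_name(filename))
        else:
            ext = mimetypes.guess_extension(part.get_content_type()) or ".bin"
            filename = "part-%03d%s" % (counter, ext)
        payload = part.get_payload(decode=True)
        if payload:
            with open(os.path.join(path, filename), "wb") as f:
                f.write(payload)
            written.append(filename)
        counter += 1
    return path, written
```

A receiving server would call store_message(data, name, basepath) once per accepted recipient, where name is the part of the address before @mister-muffin.de.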

The directory I chose is also published by my webserver, so I don't even need to ssh into my server to pick up email but just have to go to a certain URL (which I'm not telling you because I want to keep using it for myself) and open the text or html of the mail.

Using it for yourself is only a matter of fixing the basepath in line 25 and running it on a public server of yours. I also didn't include the content of the exts variable in line 10 because it was too long, but it is only a mapping of content type to file extension like:

exts = {
        'application/andrew-inset': '.ez',
        'application/annodex': '.anx',
        'application/atomcat+xml': '.atomcat',
        'application/atomserv+xml': '.atomsrv',
[...]

Have fun!


gcc preprocessor website generation

Mon, 07 Mar 2011 08:23 categories: code

At some point I was helping a friend build a website for his stepfather. Since it was only five pages I wanted something very simple to generate html pages from a header, a body and a footer, where header and footer would of course be the same for every page.

My "brilliant" idea was to combine the gcc preprocessor with make to achieve this and here is the Makefile I wrote:

CC=gcc
CFLAGS=-x c -E -P

SOURCES:=$(wildcard *.tmpl)
INCLUDES:=$(wildcard *.incl)
OBJS:=$(patsubst %.tmpl, %.html, $(SOURCES))

all: $(OBJS)

%.html: %.tmpl $(INCLUDES)
	$(CC) $(CFLAGS) -o $@ $<

.PHONY: clean
clean:
	rm -f *.html

The CFLAGS force gcc to treat the files as C code, to run only the preprocessor, and to not generate line markers in the output.

Files named *.incl are for example header.incl and footer.incl and are included by *.tmpl files. Every *.tmpl file is turned into a *.html file and the *.tmpl files can contain statements like:

#include "header.incl"
<p>some text</p>
#include "footer.incl"

And of course, due to the nature of make, a *.html file will only be regenerated if one of its dependencies (the corresponding *.tmpl or one of the *.incl) changes. No speed improvement for my five html pages setup, but it's the idea that counts here ;)
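For those who wonder what the preprocessor actually contributes here: only the #include expansion is used. The following Python sketch mimics just that one feature on an in-memory set of files. It is an illustration of the idea, not a replacement for gcc -E (no macros, no conditionals, no comment handling), and the function name as well as the file contents are made up.

```python
import re

INCLUDE = re.compile(r'^\s*#include\s+"([^"]+)"\s*$')


def expand(name, files, _stack=()):
    """Return the content of files[name] with all #include lines expanded."""
    if name in _stack:
        raise ValueError("recursive include of %s" % name)
    output = []
    for line in files[name].splitlines():
        match = INCLUDE.match(line)
        if match:
            # replace the directive by the (recursively expanded) file
            output.append(expand(match.group(1), files, _stack + (name,)))
        else:
            output.append(line)
    return "\n".join(output)


# Made-up file contents standing in for the files on disk.
files = {
    "header.incl": "<html><body>",
    "footer.incl": "</body></html>",
    "index.tmpl": '#include "header.incl"\n<p>some text</p>\n#include "footer.incl"',
}
print(expand("index.tmpl", files))
```

The printed result is the header line, the paragraph and the footer line, which is the expansion the template example above asks of the preprocessor.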
