  • youtube video download

    Tags: code

    Intimidated by youtube-dl and its more than 3000 lines of code, I figured there had to be a simpler way and wrote a little shell script that does all I want: download YouTube videos. It takes the video id as its only argument.

    #!/bin/sh -e
    
    if [ "$#" -ne "1" ]; then
      echo specify the youtube id as the first argument
      exit 1
    fi
    
    code="$1"
    
    urldecode() { echo -n $1 | sed 's/%\([0-9A-F]\{2\}\)/\\\\\\\x\1/gI' | xargs printf; }
    
    cookiejar=`mktemp`
    baseurl="http://www.youtube.com/get_video_info?video_id=$code&el=detailpage"
    data=`curl --silent --cookie-jar "$cookiejar" "$baseurl"`
    
    highestfmt=0
    highesturl=""
    title=""
    for part in `echo $data | tr '&' ' '`; do
      key=`echo $part | cut -d"=" -f1`
      value=`echo $part | cut -d"=" -f2`
      if [ "$value" != "" ]; then
        value=`urldecode "$value"`
      fi    
      case "$key" in
      "fmt_url_map") 
        for format in `echo $value | tr ',' ' '`; do
          fmt=`echo $format | cut -d"|" -f1`
          url=`echo $format | cut -d"|" -f2`
          if [ "$fmt" = "18" ] \
          || [ "$fmt" = "22" ] \
          || [ "$fmt" = "37" ] \
          || [ "$fmt" = "43" ] \
          || [ "$fmt" = "45" ] ; then
            if [ "$fmt" -gt "$highestfmt" ]; then
              highestfmt=$fmt
              highesturl=$url
            fi
          fi
        done ;;
      "title") title="$value" ;;
      esac
    done
    
    echo writing output to "${title}_${code}.mp4"
    curl --location --cookie "$cookiejar" "$highesturl" > "${title}_${code}.mp4"
    rm "$cookiejar"
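
The urldecode helper is the least readable line of the script. Its job is percent-decoding: turning each %XX escape in a value back into the character it stands for. As a rough illustration of that step, here is a minimal sketch in Python 3 using the standard urllib.parse module. The sample strings are made up for the example and are not real get_video_info output.

```python
from urllib.parse import parse_qs, unquote

# Hypothetical sample value: a percent-encoded "format|url" pair.
encoded = "18%7Chttp%3A%2F%2Fexample.invalid%2Fv"

# unquote() replaces every %XX escape with the character it encodes,
# which is what urldecode is meant to do in the shell script.
decoded = unquote(encoded)
print(decoded)

# parse_qs() splits a whole response on "&" and "=" and decodes the
# values in one step, so no separate loop over the parts is needed.
fields = parse_qs("title=some+title&fmt_url_map=" + encoded)
print(fields["title"][0])
print(fields["fmt_url_map"][0])
```

One difference worth knowing: parse_qs also turns "+" into a space, while the shell helper only handles %XX escapes, so a title can come out differently depending on which decoder is used.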
    

    And here is the same thing in Python, because that was so much fun:

    #!/usr/bin/env python
    
    import cookielib, urllib2, shutil, urlparse, sys
    
    cookie_processor = urllib2.HTTPCookieProcessor(cookielib.CookieJar())
    urllib2.install_opener(urllib2.build_opener(cookie_processor))
    
    if len(sys.argv) != 2:
        print "specify the youtube id as the first argument"
        sys.exit(1)
    
    code = sys.argv[1]
    baseurl = "http://www.youtube.com/get_video_info?video_id=%s&el=detailpage"%code
    data = urllib2.urlopen(baseurl).read()
    
    data = urlparse.parse_qs(data)
    title = data["title"][0]
    url = dict(part.split('|', 1) for part in data["fmt_url_map"][0].split(','))
    url = url.get("37", url.get("22", url.get("18")))
    
    print "writing output to %s_%s.mp4"%(title,code)
    data = urllib2.urlopen(url)
    with open("%s_%s.mp4"%(title,code), 'wb') as fp:
        shutil.copyfileobj(data, fp, 16*1024)
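
Both scripts treat fmt_url_map as a comma-separated list of "format|url" pairs and then pick one format. The following is a minimal sketch of that selection step, written for Python 3 so it can be run on its own. The sample map, the example.invalid URLs and the helper names parse_fmt_url_map and pick_highest are assumptions made for illustration; only the format codes come from the shell script.

```python
# Hypothetical sample of a decoded fmt_url_map value: comma-separated
# "format|url" pairs. The URLs are placeholders, not real video links.
SAMPLE_FMT_URL_MAP = (
    "18|http://example.invalid/v18,"
    "22|http://example.invalid/v22,"
    "5|http://example.invalid/v5"
)

# Format codes the shell script accepts.
WANTED = ("18", "22", "37", "43", "45")


def parse_fmt_url_map(value):
    """Split "fmt|url,fmt|url,..." into a dict mapping format code to URL."""
    return dict(part.split("|", 1) for part in value.split(","))


def pick_highest(fmt_urls, wanted=WANTED):
    """Return (fmt, url) for the numerically highest wanted format, or None."""
    candidates = [fmt for fmt in fmt_urls if fmt in wanted]
    if not candidates:
        return None
    best = max(candidates, key=int)
    return best, fmt_urls[best]


if __name__ == "__main__":
    print(pick_highest(parse_fmt_url_map(SAMPLE_FMT_URL_MAP)))
```

This mirrors the shell version, which keeps the highest format number among 18, 22, 37, 43 and 45. The Python script is slightly different: it prefers 37, then 22, then 18, and never considers 43 or 45.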
    

    The shell script can also easily be turned into something that delivers the video to you from a remote machine. This is useful if you have a server in the US and are annoyed by all the “This video contains content from ****. It is not available in your country.” messages you get when accessing content from, e.g., Germany.

    Just replace the top part (everything up to and including the code="$1" line) with this:

    #!/bin/sh -e
    
    read request
    
    while true; do
      read header
      [ "$header" = "`printf '\r'`" ] && break
    done
    
    code="${request#GET /}"
    code="${code% HTTP/*}"
    

    and the bottom part (the final echo, curl and rm lines) with this:

    url=`curl --silent --head --output /dev/null --write-out %{redirect_url} --cookie "$cookiejar" "$highesturl"`
    while [ "$url" != "" ]; do
      highesturl=$url
      url=`curl --silent --head --output /dev/null --write-out %{redirect_url} --cookie "$cookiejar" "$highesturl"`
    done
    curl --silent --include --cookie "$cookiejar" "$highesturl"
    rm "$cookiejar"
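
The two replacement parts do two small things: the top one cuts the video id out of the HTTP request line, and the bottom one follows redirects by hand until no further redirect is reported. Here is a minimal Python 3 sketch of both steps. The function names, the next_redirect callback and the max_hops limit are assumptions added for the example (the shell loop has no such limit), and the redirect table uses placeholder URLs so the sketch runs without network access.

```python
def video_id_from_request_line(request_line):
    """Mimic ${request#GET /} followed by ${code% HTTP/*}."""
    code = request_line
    prefix = "GET /"
    if code.startswith(prefix):
        code = code[len(prefix):]
    # The shortest suffix matching " HTTP/*" starts at the last " HTTP/".
    head, sep, _ = code.rpartition(" HTTP/")
    if sep:
        code = head
    return code


def resolve_final_url(url, next_redirect, max_hops=10):
    """Follow redirects until next_redirect(url) reports none.

    next_redirect is a hypothetical callback that returns the redirect
    target for a URL, or None/"" when there is no further redirect.
    """
    for _ in range(max_hops):
        target = next_redirect(url)
        if not target:
            return url
        url = target
    raise RuntimeError("too many redirects")


if __name__ == "__main__":
    print(video_id_from_request_line("GET /SOME_VIDEO_ID HTTP/1.1\r"))
    # Placeholder redirect chain standing in for curl's %{redirect_url}.
    hops = {
        "http://example.invalid/a": "http://example.invalid/b",
        "http://example.invalid/b": "http://example.invalid/c",
    }
    print(resolve_final_url("http://example.invalid/a", hops.get))
```

Because the suffix pattern swallows everything after " HTTP/", the trailing carriage return of the request line is removed along with the protocol version.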
    

    Then you can run the script like this:

    while true; do netcat -l -p 80 -e youtube.sh; done
    

    or by using inetd:

    www stream tcp nowait nobody /usr/local/bin/youtube youtube
    

    And you had better chroot the whole thing.