port bootstrap build-ordering tool report 1
Sun, 03 Jun 2012 16:18 categories: debian
A copy of this post is sent to soc-coordination@lists.alioth.debian.org as well as to debian-bootstrap@lists.mister-muffin.de.
Diary
May 21
- Cloned Dose3 and made it build
- Retrieved bootstrap.ml and bootstrap2.ml from old revisions as they were deleted
- Compiled, tested and investigated the functionality of bootstrap.ml and bootstrap2.ml on a theoretical level as no test data was available
May 22
- Pietro sends me a tarball with his current version of bootstrap.ml and dummy as well as real test data
- Created a gitorious account, project and repository
- Compiled, tested and investigated his code
- Ran into several runtime problems with the supplied dummy examples
- Created Makefile to automatically fill ./examples/real/
- Found that .dot files are too big to be rendered
- Trying to figure out how hints work, how base-system was generated and why execution takes hours
May 23
- Pietro got the examples working, which helped me understand the code much better
- Improvement of .dot output and output formatting
- Refactored code into bootstrapCommon.ml for shared functionality and bootstrap.ml for option parsing and main()
May 24
- Play with xdeb.py
- Generate dot graphs with bootstrap.ml and analyze them with sccmap
- Try to find a way to have a reduced package selection other than main archives of ubuntu/debian
- Initial work on trying to find the list of minimal source packages that have to be cross compiled
- Create debian-bootstrap@lists.mister-muffin.de mailing list
May 25
- Implement a replacement for apt-rdepends and grep-dctrl functionality in OCaml, both working on Packages files
- Retrieve list of packages with priority:required
- Retrieve their runtime dependencies
- Retrieve the packages that are added with build-essential and dependencies
- Retrieve the list of source packages that are needed to build the above
- Retrieve the list of binary packages that are additionally built from those source packages
- Added some more functionality to the Makefile
May 29
- Depsolver.dependency_closure replaces homebrew functionality in a better and faster way
- Only consider those binary packages that can actually be installed, given the limited amount of available packages using Depsolver.edos_install
- Create proper list diff by correctly comparing Cudf.package members
May 30
- Big code restructuring
- Consider arch:all packages to be available by default
- Got helpful source code comments from Pietro
May 31
- Use Depsolver.trim to reduce a universe to the installable packages
- Compile with dose 2.9.17
June 1
- Basebuildsystem now also writes output to min-cross-sources.list and base-system.list
- Begin work on basenocycles.ml to see how much the minimal system can build without cycle breaking
June 2
- Use Depsolver.trim to find source packages that can be built given the restricted universe
- Find the final list of packages that are available without solving staged build dependencies for Natty
- Many code simplifications
Results
I learned a good chunk of OCaml and how to use dose3 and libcudf.
I created a gitorious project and a git repository for all the source code.
git clone git://gitorious.org/debian-bootstrap/botch.git
The repository currently contains 30 commits and 1197 lines of OCaml code.
So far, 62 emails have been exchanged between Pietro, Wookey and me.
I created a mailing list for this project where all email exchange so far is publicly accessible in the archives. You can also download the whole exchange in mbox format. Everybody is welcome to join and/or read the list.
What seems to be finished is the program that finds the minimal number of source packages that have to be cross compiled to end up with a minimal build system. What it does is:
1. get all essential packages
2. get their runtime dependencies
3. get build-essential plus its runtime dependencies
4. get all source packages that are necessary to build 1.-3.; those are the packages that have to be cross compiled
5. get a list of all packages that are built by the source packages from 4.
6. add all packages from 1., 2., 3. and 5. plus all arch:all packages to a universe
7. use Depsolver.trim on that universe to figure out which of those packages are actually installable
The result of 7. is the list of packages that are automatically available on the foreign system thanks to the cross compiled source packages and the arch:all packages.
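To make the steps more concrete, here is a minimal sketch of them on a toy universe. It does not use dose3 at all: the record type, every package name and the helper functions `closure` and `trim` are made up for illustration, dependencies are plain names without versions, alternatives or conflicts, and `trim` is only a naive stand-in for what Depsolver.trim does on a real universe.

```ocaml
(* Toy model of the seven steps. All package data below is invented. *)
module S = Set.Make (String)

type binary = {
  name : string;
  source : string;        (* source package that builds this binary *)
  depends : string list;  (* plain runtime dependencies, no versions/alternatives *)
  essential : bool;
  arch_all : bool;
}

let pkg ?(essential = false) ?(arch_all = false) name source depends =
  { name; source; depends; essential; arch_all }

let universe = [
  pkg ~essential:true "libc" "glibc" [];
  pkg ~essential:true "shell" "shell-src" ["libc"];
  pkg "build-essential" "build-essential" ["cc"];
  pkg "cc" "cc-src" ["libc"];
  pkg "libc-doc" "glibc" [];                        (* extra binary of glibc *)
  pkg "cc-plugin" "cc-src" ["libfoo"];              (* needs something unavailable *)
  pkg "libfoo" "foo-src" [];                        (* not cross compiled *)
  pkg ~arch_all:true "data" "data-src" [];
  pkg ~arch_all:true "tool" "tool-src" ["libfoo"];  (* arch:all but uninstallable *)
]

let find n = List.find_opt (fun p -> p.name = n) universe

(* Transitive runtime dependency closure of a list of package names. *)
let rec closure seen = function
  | [] -> seen
  | n :: rest when S.mem n seen -> closure seen rest
  | n :: rest ->
    (match find n with
     | None -> closure seen rest
     | Some p -> closure (S.add n seen) (p.depends @ rest))

(* Steps 1-3: essential packages, build-essential and their dependencies. *)
let essential =
  List.filter (fun p -> p.essential) universe |> List.map (fun p -> p.name)
let base = closure S.empty (essential @ ["build-essential"])

(* Step 4: the source packages that have to be cross compiled. *)
let cross_sources =
  S.fold
    (fun n acc -> match find n with Some p -> S.add p.source acc | None -> acc)
    base S.empty

(* Steps 5-6: everything built by those sources, plus all arch:all packages. *)
let available =
  List.filter (fun p -> S.mem p.source cross_sources || p.arch_all) universe
  |> List.map (fun p -> p.name)
  |> S.of_list

(* Step 7: drop packages with a missing dependency until nothing changes. *)
let rec trim avail =
  let ok n =
    match find n with
    | Some p -> List.for_all (fun d -> S.mem d avail) p.depends
    | None -> false
  in
  let kept = S.filter ok avail in
  if S.equal kept avail then avail else trim kept

let installable = trim available

let () = S.iter print_endline installable
```

In this toy universe, `cc-plugin` and `tool` are dropped in step 7 because `libfoo` is neither cross compiled nor arch:all, which is the kind of filtering the real program leaves to Depsolver.trim.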
For Debian Sid, the output of my program is:
# (1) number of packages with priority:required: 62
# (2) plus, number of dependencies of priority:required packages: 20
# (3) plus, build-essential and dependencies: 31
# number of source packages to build the above: 71
# number of additional packages built from the above source packages: 292
# (4) number of packages of those plus arch:all packages that are installable: 6421
# total number of installable packages (1)+(2)+(3)+(4): 6534
For Ubuntu Natty it is:
# (1) number of packages with priority:required: 96
# (2) plus, number of dependencies of priority:required packages: 7
# (3) plus, build-essential and dependencies: 31
# number of source packages to build the above: 87
# number of additional packages built from the above source packages: 217
# (4) number of packages of those plus arch:all packages that are installable: 2102
# total number of installable packages (1)+(2)+(3)+(4): 2236
So for Debian, 71 source packages definitely have to be made cross compilable while for Natty, the number is 87.
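The totals in both listings can be double-checked by summing the four marked figures. The snippet below only redoes that arithmetic with the numbers quoted above; the names `total`, `sid` and `natty` are chosen just for this example.

```ocaml
(* Sum of (1)+(2)+(3)+(4) for each distribution, figures taken from the output above. *)
let total parts = List.fold_left ( + ) 0 parts

let sid = total [62; 20; 31; 6421]
let natty = total [96; 7; 31; 2102]

let () = Printf.printf "Sid: %d, Natty: %d\n" sid natty
```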
Over the last two days I experimented with these minimal systems to see how many source packages can be built on top of them without running into dependency cycles. After each round I installed the binary packages that had been built and checked again, repeating until no new packages could be built.
For Natty, I was only able to find 28 additional packages that can be built on top of the 2236 existing ones. This means that a number of dependency cycles prevent building anything else.
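The repeated check described above is a fixpoint iteration. A rough sketch of the idea, assuming a made-up list of source packages and treating a build dependency as satisfied as soon as the binary package is present (the real check has to ask the solver whether the build dependencies are installable), could look like this:

```ocaml
(* Toy fixpoint: build whatever can be built, add the results, repeat.
   Source and binary package names are invented for illustration. *)
module S = Set.Make (String)

type source = { name : string; build_depends : string list; builds : string list }

let sources = [
  { name = "a-src"; build_depends = ["cc"]; builds = ["a"] };
  { name = "b-src"; build_depends = ["cc"; "a"]; builds = ["b"] };
  { name = "x-src"; build_depends = ["y"]; builds = ["x"] };  (* x and y form a cycle *)
  { name = "y-src"; build_depends = ["x"]; builds = ["y"] };
]

(* Simplification: a source is buildable if every build dependency is available. *)
let buildable avail s = List.for_all (fun d -> S.mem d avail) s.build_depends

let rec build_rounds avail built =
  let fresh =
    List.filter (fun s -> not (S.mem s.name built) && buildable avail s) sources
  in
  match fresh with
  | [] -> (avail, built)  (* nothing new could be built: fixpoint reached *)
  | _ ->
    let avail' =
      List.fold_left (fun acc s -> S.union acc (S.of_list s.builds)) avail fresh
    in
    let built' = List.fold_left (fun acc s -> S.add s.name acc) built fresh in
    build_rounds avail' built'

let avail, built = build_rounds (S.of_list ["cc"]) S.empty

let () = Printf.printf "built: %s\n" (String.concat ", " (S.elements built))
```

Here `a-src` is built in the first round and `b-src` in the second, while `x-src` and `y-src` never become buildable because each needs the other, which mirrors the situation where dependency cycles stop the process.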
In the coming two weeks I will focus on a tool that helps the user identify packages that would be useful to have for building more packages (probably determined by how many packages depend on each of them; debhelper is an obvious candidate). The tool would then show why such a crucial package is not available (in the case of debhelper, because some of its runtime dependencies are not available and require debhelper to be built) and how the situation can best be resolved. The possible ways to do so are to identify a package that is part of a cycle and either cross compile it or give it staged build dependencies.
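One possible way to pick such candidates is to count, for every package that is not yet available, how many unbuilt source packages build-depend on it. The sketch below shows only that ranking idea; it is not the planned tool, and apart from debhelper all names and all dependency lists are invented.

```ocaml
(* Rank missing packages by how many unbuilt source packages need them.
   The dependency data is invented; only the ranking idea matters. *)
let unbuilt = [
  ("a-src", ["debhelper"; "cc"]);
  ("b-src", ["debhelper"; "libx"]);
  ("c-src", ["debhelper"]);
  ("d-src", ["libx"]);
]

let available = ["cc"]

let ranking =
  let counts = Hashtbl.create 16 in
  List.iter
    (fun (_, deps) ->
      List.iter
        (fun d ->
          if not (List.mem d available) then begin
            let n =
              match Hashtbl.find_opt counts d with Some n -> n | None -> 0
            in
            Hashtbl.replace counts d (n + 1)
          end)
        (List.sort_uniq compare deps))  (* count each dependency once per source *)
    unbuilt;
  Hashtbl.fold (fun d n acc -> (d, n) :: acc) counts []
  (* highest count first, ties broken by name *)
  |> List.sort (fun (d1, n1) (d2, n2) -> compare (n2, d1) (n1, d2))

let () = List.iter (fun (d, n) -> Printf.printf "%s: %d\n" d n) ranking
```

A count like this only says which missing package blocks the most builds; deciding whether to cross compile it or to give it staged build dependencies still requires looking at the cycle it is part of.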