In my previous post I mentioned my involvement with the OpenRISC or1k port. It
was the technical activity in which I spent most time during 2014 (Debian and
otherwise, day job aside).
I thought that it would be nice to talk a bit about the port for people who
don't know about it, and give an update for those who do know and care. So this
post explains a bit how it came to be, details about its development, and
finally the current status. It is going to be written as a rather personal
account, for that matter, since I did not get involved enough in the OpenRISC
community at large to learn much about its internal workings and aspects that I
was not directly involved with.
There is not much information about all of this elsewhere, only bits and pieces
scattered here and there, and especially little public information about the
development of the Debian port. There is an OpenRISC entry in the Debian wiki,
but it does not contain much information yet. Hopefully, this piece will help a
bit to preserve history and give some insight to future porters.
First Things First
I imagine that most people reading this post will be familiar with the
terminology, but just in case, to create a new Debian port means to get a Debian
system (GNU/Linux variant, in this case) to run in the OpenRISC or1k computer
architecture.
Setting aside all differences between hardware and software, and as
described on their site:
The aim of the OpenRISC project is to create free and open source computing
platforms
It is therefore a good match for the purposes of Debian and the Free Software
world in general.
The processor has not been produced in silicon, or at least is not available to
the masses. People with the necessary know-how can download the hardware
description (Verilog) and synthesise it in an FPGA, or otherwise use simulators.
It is not a piece of hardware that people can purchase yet, and there are no
plans to mass-produce it in the near future either.
The two people (including me) involved in this Debian port did not have the
hardware, so we created the port entirely through cross-compiling from other
architectures, and then compiling inside Qemu. In a sense, we were creating a
Debian port for hardware that "does not [physically] exist". The software that
we built was tested by people who had hardware available in FPGA, though, so it
was at least usable. I understand that people working on the arm64 port had
to work similarly in the initial phases, working in the dark without access to
real hardware to compile or test.
The Spark
The first time that I heard about the initiative to create the port was in late
February of 2014, in a post which appeared in Linux Weekly News (sent by Paul
Wise) and Slashdot. The original post announcing it was actually from late
January, from Christian Svensson (blueCmd):
Some people know that I've been working on porting Glibc and doing some
toolchain work. My evil master plan was to make a Debian port, and today I'm a
happy hacker indeed!
Below is a link to a screencast of me installing Debian for OpenRISC,
installing python2.7 via apt-get (which you shouldn't do in or1ksim, it takes
ages! (but it works!)) and running a small Python
script. http://asciinema.org/a/7362
So, now, what can a Debian Hacker do when reading this? (Even if one's Hackery
Level is not that high, as is my case). And well, How Hard Can It Be? I mean,
Really?
Well, in my own defence, I knew that the answer to the last two questions would
be a resounding Very. But for some reason the idea grabbed me and I
couldn't help but think that it would be a Really Exciting Project, and that
somehow I would like to get involved. So I wrote to Christian offering my help
after considering it for a few days, around mid March, and he welcomed me
aboard.
The Ball Was Already Rolling
Christian had already been in contact with the people behind DebianBootstrap,
and he had already created the repository http://openrisc.debian.net/ with many
packages of the base system and beyond (read: packages name_version_or1k.deb
available to download and install). Even now the packages are not signed with
proper keys, though, so use your judgement if you want to try them.
After a few weeks, I got up to speed with the status of the project and got my
system working with the necessary tools. This basically meant sbuild/schroot
to compile new packages, with the base system that Christian had already got
working installed in a chroot (probably with the help of debootstrap), and
qemu-system-or1k to simulate the system.
Only a few of the packages were different from the version in Debian, like gcc,
binutils or glibc -- they had not been upstreamed yet. sbuild ran through
qemu-system-or1k, so the compilation of new packages could happen "natively"
(running inside Qemu) rather than cross-compiling the packages, pulling
_or1k.deb packages for dependencies from the repository that he had prepared,
and _all.deb packages from snapshots.debian.org.
I started by trying to get the packages that I [co-]maintain in Debian compiled
for this architecture, creating the corresponding _or1k.deb. For most of them,
though, I needed many dependencies compiled before I could even compile my
packages.
The GNU autotools / autoreconf Problem
From very early on, many of the packages failed to build with messages such as:
Invalid configuration 'or1k-linux-gnu': machine 'or1k' not recognized
configure: error: /bin/bash ../config.sub or1k-linux-gnu failed
This means that software packages based on GNU autotools and using configure
scripts need recent versions of the files config.sub and config.guess that they
ship in their root directory, to be able to detect the architecture and
generate the code accordingly.
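A rough way to spot which unpacked source trees would hit this error is to look for the machine name inside each shipped config.sub. The sketch below is only a heuristic under that assumption: it does a plain substring search rather than running the script, and the function name and the default machine string are illustrative, not something we used in the port.

```python
import os


def find_stale_config_subs(root, machine="or1k"):
    """Return paths of config.sub files under root that never mention machine.

    Heuristic only: config.sub names the machines it accepts, so a copy with
    no occurrence of the name is very likely too old to recognise it.
    """
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "config.sub" not in filenames:
            continue
        path = os.path.join(dirpath, "config.sub")
        # Old upstream files may not be valid UTF-8; never fail on decoding.
        with open(path, encoding="utf-8", errors="replace") as handle:
            if machine not in handle.read():
                stale.append(path)
    return sorted(stale)
```

Calling find_stale_config_subs on a directory of unpacked sources returns the files that would need refreshing before configure could accept or1k-linux-gnu.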
This is counter-intuitive, taking into account that GNU autotools were designed
to help with portability. Yet, in the case of creating new Debian ports, it
meant that unless upstream had very recent versions of config.{guess,sub}, the
package would not compile straight away on the new architectures -- even if
invoking gcc without further ado would have worked without problems in most
cases for native compilation.
Of course this did not only affect or1k, and there was already the autoreconf
effort underway as a way to update these files automatically when building
Debian packages, pushed by people porting Debian to the new architectures added
in 2013/2014 (mips64el, arm64, ppc64el), which encountered the same roadblock.
This affected around a thousand source packages in unstable. A Royal Pain.
Also, all of their reverse dependencies (packages that depended on those to be
built) could not be compiled straight away.
The bugs were not Release Critical, though (none of these architectures were
officially accepted at the time), so for people not concerned with the new ports
there was no big incentive to get them fixed. This problem, which
conceptually is easily solvable, prevented new ports from even attempting to
compile vast portions of the archive straight away (cleanly, without
modifications to the package or to the host system), for weeks or months.
The GNU autotools / autoreconf Solution
We tackled this problem mainly in two ways.
First, more useful for Debian in general, was to do as other porters were doing
and submit bug reports and patches to Debian packages requesting them to use
autoreconf, and NMUing packages (uploading changes to the archive without
the official maintainers' intervention). A few NMUs were made for packages
which had bug reports with patches available for a while, that were in the
critical path to get many other packages compiled, and that were orphaned or had
almost no maintainer activity.
The people working on the other new ports, and mainly Ubuntu people who helped
with some of those ports and wanted to support them, had submitted a large
number of requests since late 2013, so there was no shortage of NMUs to be made.
Some porters, not being Debian Developers, could not easily get the changes
applied; so I also helped the porters of other architectures a bit, especially
later on before the freeze of Jessie, to get as many packages compiled in those
architectures as possible.
The second way was to create dpkg-buildpackage hooks that unconditionally
updated config.{guess,sub} before attempting to build the package in the local
build system. This local and temporary solution allowed us in the or1k port to
get many _or1k.deb packages in the experimental repository, which in turn would
allow many more packages to compile. After a few weeks, I set up many sbuilds
in a multi-core machine, continuously attempting to build packages that were
not previously built and which had their dependencies available. Every now and
then (typically several times per day in peak times) I pushed the resulting
_or1k.deb files to the repository, so more packages would have the necessary
dependencies ready to attempt to build.
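To illustrate what such a hook has to do, here is a minimal sketch of the file refresh itself. It is not the hook we actually used. It assumes that current copies of both files live in /usr/share/misc (where Debian's autotools-dev package installs them), the function name is made up for the example, and the wiring into the build tools is left out.

```python
import os
import shutil

# Assumption: up-to-date copies live here, as installed by Debian's
# autotools-dev package; adjust the path for other systems.
REFERENCE_DIR = "/usr/share/misc"
HELPER_FILES = ("config.guess", "config.sub")


def refresh_config_files(source_root, reference_dir=REFERENCE_DIR):
    """Overwrite every config.guess/config.sub found under source_root.

    Returns the sorted list of files that were replaced.
    """
    replaced = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in HELPER_FILES:
            if name in filenames:
                target = os.path.join(dirpath, name)
                # copyfile rewrites the contents only, so an existing
                # target keeps its permissions (e.g. the executable bit).
                shutil.copyfile(os.path.join(reference_dir, name), target)
                replaced.append(target)
    return sorted(replaced)
```

The update is unconditional on purpose, matching the approach described above: it does not try to decide whether the shipped copy is recent enough, it simply replaces it.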
Christian was doing something similar, and by April at peak times, between the
two of us, we were compiling more than a hundred packages on some days -- a huge
number of packages did not need any change other than up-to-date
config.{guess,sub} files. At some point, late April, Christian set up
wanna-build in a few hosts to do this more elegantly and smartly than my
method, and more effectively as well.
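The idea behind both approaches (build whatever has its dependencies available, publish the results, repeat) can be sketched as a toy model. This is not how sbuild or wanna-build work internally. The function names and package names below are invented for the example, and every build is assumed to succeed, which was far from true in practice.

```python
def buildable_packages(build_depends, available, already_built):
    """Pick source packages whose build dependencies are all available.

    build_depends: dict mapping source package -> set of packages it needs.
    available:     set of binary packages currently in the repository.
    already_built: set of source packages that need no further attempt.
    """
    return sorted(
        source
        for source, needs in build_depends.items()
        if source not in already_built and needs <= available
    )


def build_rounds(build_depends, available, produces):
    """Simulate repeated passes until no new package becomes buildable.

    produces maps each source package to the binary packages it yields.
    Assumption: every attempted build succeeds.
    """
    available = set(available)
    built = set()
    rounds = []
    while True:
        batch = buildable_packages(build_depends, available, built)
        if not batch:
            return rounds
        rounds.append(batch)
        for source in batch:
            built.add(source)
            # Publishing the results unlocks further packages next round.
            available |= set(produces.get(source, ()))


# Toy data: "app" stays unbuilt because "libmissing-dev" never appears.
deps = {"base": set(), "lib": {"base-dev"}, "app": {"lib-dev", "libmissing-dev"}}
produces = {"base": {"base-dev"}, "lib": {"lib-dev"}}
print(build_rounds(deps, set(), produces))  # [['base'], ['lib']]
```

Each pushed batch of packages corresponds to one round here, which is why pushing results several times per day kept unlocking more of the archive.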
Ugly Hacks, Bugs and Shortcomings in the Toolchain and Qemu
Some packages are extremely important because many other packages need them to
compile (like cmake, Qt or GTK+), and they are themselves very complex and
have dependency loops. They had deeper problems than the autoreconf issue and
needed some seriously dirty hacking to get them built.
To try to get as many packages compiled as possible, we sometimes compiled these
important packages with some functionality disabled, disabling some binary
packages (e.g. Java bindings) or especially disabling documentation (using
DEB_BUILD_OPTIONS=nodoc when possible, and more aggressively when needed by
removing chunks of debian/rules). I tried to use the more aggressive methods
in as few packages as possible, though, about a dozen in total. We also used
DEB_BUILD_OPTIONS=nocheck for speeding up compilation and avoiding build
failures -- many packages' tests failed due to qemu-system-or1k not supporting
multi-threading, which we could do nothing about at the time, but otherwise the
packages mostly passed tests fine.
Due to bugs and shortcomings in Qemu and the toolchain -- like the compiler
lacking atomics, missing functionality in glibc, Qemu entering endless loops,
or programs segfaulting (especially gettext, used by many packages and causing
them to fail to build) -- we had to resort to some very creative ways or
time-consuming dull work to edit debian/rules, or to create wrappers of the
real programs avoiding or forcing certain options (like gcc -O0, since -O2
made buggy binaries too often).
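As an illustration of the wrapper idea, the core of such a script is just a rewrite of the argument list before calling the real compiler. The sketch below is a simplified reconstruction and not the wrappers we actually used. The function name is invented, and it does not handle corner cases such as an -O flag passed as the argument of another option.

```python
import re

# Matches -O, -O2, -Os, -Ofast and similar optimisation flags.
_OPT_FLAG = re.compile(r"^-O\w*$")


def force_optimisation(args, level="-O0"):
    """Return compiler arguments with every -O flag replaced by one level."""
    kept = [arg for arg in args if not _OPT_FLAG.match(arg)]
    return [level] + kept


print(force_optimisation(["-O2", "-g", "-c", "foo.c"]))
# ['-O0', '-g', '-c', 'foo.c']
```

A wrapper placed ahead of the real compiler in PATH would pass its own arguments through this function and then exec the real binary (for instance with os.execv) on the rewritten list.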
To avoid having a mix of cleanly compiled and hacked packages in the same
repository, Christian set up a two-tiered repository system -- the clean one
and the dirty one. In the dirty one we dumped all of the packages that we got
built, no matter how. The packages in the clean one could use packages from
the dirty one to build, but they themselves were compiled without any hackery.
Of course this was not a completely airtight solution, since they could contain
code injected at build time from the "dirty repository" (e.g. by static
linking), and perhaps other quirks. We hoped to get rid of these problems
later by rebuilding all packages against clean builds of all their
dependencies.
In addition, Christian also spent significant amounts of time working inside the
OpenRISC community, debugging problems, testing and recompiling special versions
of the toolchain that we could use to advance in our compilation of the whole
archive. There were other people in the OpenRISC community implementing the
necessary bits in the toolchain, but I don't know the details.
Good Progress
Olof Kindgren wrote the OpenRISC health report April 2014 (actually posted in
May), explaining the status at the time of projects in the broad OpenRISC
community, and talking about the software side, Debian port included. Sadly, I
think that there have been no more "health reports" since then. There was also
a new post in Slashdot entitled OpenRISC Gains Atomic Operations and Multicore
Support shortly thereafter.
On the side of the Debian port, from time to time new versions of packages
entered unstable and we started to use those newer versions. Some of them had
nice fixes, like the autoreconf updates, so they did not require local
modifications. In other cases, the new versions failed to build when old ones
had worked (e.g. because the newer versions added support for, and dependencies
on, new versions of gnutls, systemd or other packages not yet available for
or1k), and we had to repeat or create more nasty hacks to get the packages
built again.
But in general, progress was very good. There were about 10k arch-dependent
packages in Debian at the time, and we got about half of them compiled by the
beginning of May, counting clean and dirty. And, if I recall correctly, there
were around the same number of arch=all packages (which can be installed in any
architecture, after the package is built in one of them). Counting both, it
meant that systems using or1k had about 15k packages available, or 75% of the
whole Debian archive (at least "main", we excluded "contrib" and "non-free").
Not bad.
By the middle to end of May, we had about 6k arch-dependent packages compiled,
and 4k to go. The count of packages eventually reached ~6.6k at its peak (I
think that in June/July). Many had been built with hacks and not rebuilt
cleanly yet, but everything was fine until the number of packages built
plateaued.
Plateauing
There were multiple reasons for that. One of them is that after having fixed
the autoreconf issue locally in some packages, new versions were uploaded to
Debian without fixing that problem (in many cases there was no bug report or
patch yet, so it was understandable; in other cases the requests were ignored).
The wanna-build for the clean repository set up by Christian rightly
considered the package out-of-date and prepared to build the more recent
version, which failed. Then, other packages entering the unstable archive and
build-depending on newer versions of those could not be built
("BD-Uninstallable"), until we built the newer versions of the dependencies in
the dirty repository with local hacks. Consequently, the count of cleanly
built packages went back and forth, when not backwards.
More challenging was the fact that when creating a new port, language compilers
which are written in that same language need to be built for that architecture
first. Sometimes it is not the compiler itself, but the compile-time or
run-time support for modules of a language that is not ported yet. Obviously,
as with other dependencies, large numbers of packages written in those
languages are bound to remain uncompiled for a long time. As Colin Watson
explained in porting Haskell's GHC to arm64 and ppc64el, untangling some of the
chicken-and-egg problems of language compilers for new ports is extremely
challenging.
Perl and Python are pretty much a pre-requisite of the base Debian system, and
Christian got them working early on. But for example in May, 247 packages
depended on r-base-dev (GNU R) for building, and 736 on ghc, and we did not
have these dependencies compiled. Just counting those two, 1k source packages
of the remaining 4k to 5k to be compiled for the new architecture would have to
wait for a long time. Then there was Java, Mono, etc...
Even more worrying problems were the pending issues with the toolchain, like
atomics in glibc, or make check failing for some packages in the clean
repository built with wanna-build. Christian continued to work on the
toolchain and liaising with the rest of the OpenRISC community, while I
continued to request more changes to the Debian archive through a few requests
to use autoreconf, and by pushing a few more NMUs. Though many requests were
attended to, I soon got negative replies/reactions and backed off a bit. In the
meantime, the porters of other new architectures at the time were mostly
submitting requests to support them and not NMUing much either.
Upstreaming
Things continued more or less in the same state until the end of the summer.
The new ports needed as many packages built as possible before the evaluation of
which official ports to accept (in early September, I think, with the final
decision around the time of the freeze). Porters of the other new architectures
(and maintainers, and other helpful Debian Developers) were by then more active
in pushing for changes, especially remaining autoreconf issues, many of which
benefited or1k. As I said before, I also kept pushing NMUs now and then,
especially during summer, for packages which were not of immediate benefit for
our port but helped the others (e.g. ppc64el needed updates to ltmain.sh for
libtool which were not necessary for or1k, in addition to config.{guess,sub}).
In parallel in the or1k camp, there were patches that needed to be sent
upstream, like for example the one for Python's NumPy, which I submitted in May
to the Debian package and upstream, and which was uploaded to Debian in
September with a new upstream release. Similar paths were followed between May
and September for packages such as jemalloc, ocaml, gstreamer0.10, libgc, mesa,
X.org's cf module and cmake (patch created by Christian).
In April, Christian had reached the amazing milestone of tracking down and
getting all of the contributors of the port of GNU binutils to assign copyright
to the Free Software Foundation (FSF); all of the work was refreshed and
upstreamed. In July or August, he started to gather information about the
contributors of the GCC port, which had started more than a decade ago.
After that, nothing much happened (from the outside) until the end of the year,
when Christian sent a message about the status of upstreaming GCC to the
OpenRISC community. There was only one remaining person to assign the
copyright to the FSF, but it was a blocker. In addition, there was the need to
find one or more maintainers to liaise with upstream, review the patches, fix
the remaining failures in the test suite and keep the port in good shape. A
few months after that and from what I could gather, the status remains the same.
Current Status, and The Future?
In terms of the Debian port, there have not been huge visible changes since the
end of the summer, and not only because of the Jessie freeze.
It seems that for this effort to keep going forward and be sustainable,
sorting out the issues with GCC and Glibc is essential. That means having
the toolchain completely pushed upstream and in good shape, and particularly
completing the copyright assignment. Debian will not accept private forks
of those essential packages without a very good reason, even in unofficially
supported ports; and from the point of view of porters, working on the remaining
not-yet-built packages with continuing problems in the toolchain is very
frustrating and time-consuming.
Other than that, there is already a significant amount of software available
that could run in an or1k system, so I think that overall the project has
achieved a significant amount of success. Granted, KDE and LibreOffice are
not available yet, and neither are the tools depending on Haskell or Java. But
a lot of software is available (including things high in the stack, like XFCE),
and in many aspects it should provide a much more functional system than the
one available in Linux (or other free software) systems in the late 1990s. If
the blocking issues are sorted out in the near future, the effort needed to get
a very functional port, on par with the unofficial Debian ports, should not be
that big.
In my opinion, and looking at the big picture, not bad at all for an
architecture whose hardware implementation is not easy to come by, and in which
the port was created almost solely with simulators. That it has been possible
to get this far with such meagre resources is an amazing feat of Free
Software and Debian in particular.
As for the future, time will tell, as usual. I will try to keep you posted if
there is any significant change in the future.