Search Results: "ghe"

17 November 2017

Jonathan Carter: I am now a Debian Developer

It finally happened. On the 6th of April 2017, I finally took the plunge and applied for Debian Developer status. On 1 August, during DebConf in Montréal, my application was approved. If you're paying attention to the dates you might notice that that was nearly 4 months ago already. I was trying to write a story about how it came to be, but it ended up long. Really long (the current draft is around 20 times longer than this entire post). So I decided I'd rather do a proper bio page one day and just do a super short version for now, so that someone might end up actually reading it. How it started In 1999... no wait, I can't start there; as much as I want to, this is a short post, so... In 2003, I started doing some contract work for the Shuttleworth Foundation. I was interested in collaborating with them on tuXlabs, a project to get Linux computers into schools. For the few months before that, I was mostly using SuSE Linux. The open source team at the Shuttleworth Foundation all used Debian though, which seemed like a bizarre choice to me since everything in Debian was really old and its boot-floppies installer program kept crashing on my very vanilla computers.

SLUG (Schools Linux Users Group) group photo. SLUG was founded to support the tuXlab schools that ran Linux.

My contract work later turned into a full-time job there. This was a big deal for me, because I didn't want to support Windows ever again, and I didn't ever think that it would even be possible for me to get a job where I could work on free software full time. Since everyone in my team used Debian, I thought that I should probably give it another try. I did, and I hated it. One morning I went to talk to my manager, Thomas Black, and told him that I just don't get it and I need some help. Thomas was a big mentor to me during this phase. He told me that I should try upgrading to testing, which I did, and somehow I ended up on unstable, and I loved it. Before that I used to subscribe to a website called freshmeat that listed new releases of upstream software, and then I would download and compile it myself so that I always had the newest versions of everything. Debian unstable made that whole process obsolete, and I became a huge fan of it. Early on I also hit a problem where two packages tried to install the same file, and I was delighted to find how easily I could find package state and maintainer scripts and fix them to get my system going again. Thomas told me that anyone could become a Debian Developer and maintain packages in Debian, that I should check it out, and joked that maybe I could eventually snap up highvoltage@debian.org. I just laughed, because back then you might as well have told me that I could run for president of the United States; it really felt like something rather far-fetched and unobtainable at that point, but the seed was planted :) Ubuntu and beyond

Ubuntu 4.10 default desktop Image from distrowatch

One day, Thomas told me that Mark was planning to provide official support for Debian unstable. The details were sparse, but this was still exciting news. A few months later Thomas gave me a CD with just "warty" written on it and said that I should install it on a server so that we could try it out. It was great: it used the new debian-installer and installed fine everywhere I tried it, and the software was nice and fresh. Later Thomas told me that this system was going to be called Ubuntu and that the desktop edition had naked people on it. I wasn't sure what he meant and was kind of dumbfounded, so I just laughed and said something like "Uh, ok". At least it made a lot more sense when I finally saw the desktop pre-release version and when it got the byline "Linux for Human Beings". Fun fact: one of my first jobs at the foundation was to register the ubuntu.com domain name. Unfortunately I found it was already owned by a domain squatter, and it was eventually handled by legal. Closer to Ubuntu's first release, Mark brought a whole bunch of Debian developers who were working on Ubuntu over to the foundation, and they were around for a few days getting some sun. Thomas kept saying "Go talk to them! Go talk to them!", but I felt so intimidated by them that I couldn't even bring myself to walk up and say hello. In the interest of keeping this short, I'm leaving out a lot of history, but later on I read through the Debian packaging policy and really started getting into packaging, and also discovered Daniel Holbach's packaging tutorials on YouTube. These helped me tremendously. Some day (hopefully soon), I'd like to do a similar video series that might help a new generation of packagers. I've also been following DebConf online since DebConf 7, which was incredibly educational for me. Little did I know that just 5 years later I would attend one, and another 5 years after that I'd end up being on the DebConf Committee, having also already been on a local team for one.

DebConf16 Organisers, Photo by Jurie Senekal.

It's been a long journey for me, and I would like to help anyone who is also interested in becoming a Debian maintainer or developer. If you ever need help with your package, upload it to https://mentors.debian.net and if I have some spare time I'll certainly help you out and sponsor an upload. Thanks to everyone who has helped me along the way, I really appreciate it!
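For the curious, uploads to mentors typically go through dput; here is a minimal sketch, assuming the [mentors] configuration stanza documented on mentors.debian.net (the package name and paths below are made up for illustration):
# ~/.dput.cf entry for mentors.debian.net (pattern from the site's docs)
[mentors]
fqdn = mentors.debian.net
incoming = /upload
method = https
allow_unsigned_uploads = 0
# Build a signed source package, then upload the .changes file:
debuild -S
dput mentors ../mypackage_1.0-1_source.changes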

14 November 2017

Jonathan Dowland: WadC 2.2

Bird Cage, a WadC-generated map for Heretic
Bird Cage map
I have recently released version 2.2 of Wad Compiler, a lazy functional programming language and IDE for the construction of Doom maps. The biggest changes in this version are a reworking of the preferences system (to use the Java Preferences API), the wadcli command-line interface respecting preferences, and a new preferences UI dialog (adapted from Quake Injector). There are two new example maps: a Labyrinth demonstration contributed by "Yoruk", and a Heretic map, Bird Cage, by yours truly. These are both now amongst the largest examples in the collection, although laby.wl was generated by a higher-level program. For more information see the release notes and the reference, or check out the new gallery of examples, or skip straight to downloads. I have no plans to work on WadC further (but never say never, I suppose).

03 November 2017

Rogério Brito: Comparison of JDK installation of various Linux distributions

Today I spent some time in the morning seeing how one would install the JDK on Linux distributions. This is to create a little comparative tutorial to teach introductory Java. Installing the JDK is, thanks to the OpenJDK developers in Debian and Ubuntu (Matthias Klose and helpers), a very easy task. You simply type something like:
apt-get install openjdk-8-jdk
Since for a student it is better to have everything available for experiments, I install the full version, not only the -headless version. Given my familiarity with Debian/Ubuntu, I didn't have to think about the way of installing it, of course. But as this tutorial is meant to be as general as I can make it, I also tried to include instructions on how to install Java on other distributions. The first two that came to my mind were openSUSE and Fedora. Both use the RPM package format for their "native" packages (in the same sense that Debian uses DEB packages for "native" packages). But they use different higher-level tools to install such packages: Fedora uses a tool called dnf, while openSUSE uses zypper. To try these distributions, I got their netinstall ISOs and used qemu/kvm to install on a virtual machine. I used the following to install/run the virtual machines (the example below is, of course, for openSUSE):
qemu-system-x86_64 -enable-kvm -m 4096 -smp 2 -net nic,model=e1000 -net user -drive index=0,media=disk,cache=unsafe,file=suse.qcow2 -cdrom openSUSE-Leap-42.3-NET-x86_64.iso
The names of the packages also change from one distribution to another. On Fedora, I had to use:
dnf install java-1.8.0-openjdk-devel
On openSUSE, I had to use:
zypper install java-1_8_0-openjdk-devel
Note that one distribution uses dots in the names of the packages while the other uses underscores. One interesting thing that I noticed with dnf was that, when I used it, it automatically refreshed the package lists from the network, something which I had forgotten to do, and it was a pleasant surprise. I don't know about zypper, but I guess that it probably had fresh indices when the installation finished. Both installations were effortless once I knew the names of the packages to install. Oh, BTW, in my 5-minute exploration of these distributions, I noticed that if you don't want the JDK, but only the JRE, then you omit the -devel suffix (see the examples below). It makes sense when you think about it, for consistency with other packages, but Debian's conventions also make sense (JRE with -jre suffix, JDK with -jdk suffix). I failed miserably to use Fedora's prebaked, vanilla cloud image, as I couldn't log in to it, so I decided to just install the whole OS on a fresh virtual machine. I don't have instructions on how to install on Gentoo or Arch, though. I now see how hard it is to cover instructions/provide software for as many distributions as you wish, given the multitude of package managers, conventions etc.
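For reference, the JRE-only installs would then look like this (package names derived from the conventions noted above; I have not verified every release):
apt-get install openjdk-8-jre
dnf install java-1.8.0-openjdk
zypper install java-1_8_0-openjdk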

01 November 2017

Petter Reinholdtsen: Some notes on fault tolerant storage systems

If you care about how fault tolerant your storage is, you might find these articles and papers interesting. They have shaped how I think when designing a storage system. Several of these research papers are based on data collected from hundreds of thousands or millions of disks, and their findings are eye-opening. The short story is: simply do not implicitly trust RAID or redundant storage systems. Details matter. And unfortunately there are few options on Linux addressing all the identified issues. Both ZFS and Btrfs are doing a fairly good job, but have legal and practical issues of their own. I wonder how cluster file systems like Ceph do in this regard. After all, there is an old saying: you know you have a distributed system when the crash of a computer you have never heard of stops you from getting any work done. The same holds true if fault tolerance does not work. Just remember: in the end, it does not matter how redundant or how fault tolerant your storage is if you do not continuously monitor its status to detect and replace failed disks.
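To make that last point concrete, here is a minimal monitoring sketch (my own illustration, not from the papers; it assumes smartmontools and Linux software RAID):
# Ask each disk for its SMART health verdict (requires smartmontools).
for d in /dev/sd[a-z]; do
    smartctl -H "$d"
done
# Linux software RAID: a degraded array shows an underscore in its [UU] brackets.
grep '\[.*_.*\]' /proc/mdstat && echo "WARNING: degraded md array"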

12 October 2017

Joachim Breitner: Isabelle functions: Always total, sometimes undefined

Often, when I mention how things work in the interactive theorem prover Isabelle/HOL (in the following just Isabelle¹) to people with a strong background in functional programming (whether that means Haskell or Coq or something else), I cause confusion, especially around the issue of what a function is, whether functions are total, and what the business with undefined is. In this blog post, I want to explain some of these issues, aimed at functional programmers or type theoreticians. Note that this is not meant to be a tutorial; I will not explain how to do these things, and will focus on what they mean.

HOL is a logic of total functions If I have an Isabelle function f :: a ⇒ b between two types a and b (the function arrow in Isabelle is ⇒, not →), then, by definition of what it means to be a function in HOL, whenever I have a value x :: a, the expression f x (i.e. f applied to x) is a value of type b. Therefore, and without exception, every Isabelle function is total. In particular, it cannot be that f x does not exist for some x :: a. This is a first difference from Haskell, which does have partial functions like
spin :: Maybe Integer -> Bool
spin (Just n) = spin (Just (n+1))
Here, neither the expression spin Nothing nor the expression spin (Just 42) produce a value of type Bool: The former raises an exception ("incomplete pattern match"), the latter does not terminate. Confusingly, though, both expressions have type Bool. Because every function is total, this confusion cannot arise in Isabelle: If an expression e has type t, then it is a value of type t. This trait is shared with other total systems, including Coq. Did you notice the emphasis I put on the word "is" here, and how I deliberately did not write "evaluates to" or "returns"? This is because of another big source for confusion:

Isabelle functions do not compute We (i.e., functional programmers) stole the word "function" from mathematics and repurposed it². But the word "function", in the context of Isabelle, refers to the mathematical concept of a function, and it helps to keep that in mind. What is the difference?
  • A function a ⇒ b in functional programming is an algorithm that, given a value of type a, calculates (returns, evaluates to) a value of type b.
  • A function a ⇒ b in math (or Isabelle) associates with each value of type a a value of type b.
For example, the following is a perfectly valid function definition in math (and HOL), but could not be a function in the programming sense:
definition foo :: "(nat ⇒ real) ⇒ real" where
  "foo seq = (if convergent seq then lim seq else 0)"
This assigns a real number to every sequence, but it does not compute it in any useful sense. From this it follows that

Isabelle functions are specified, not defined Consider this function definition:
fun plus :: "nat ⇒ nat ⇒ nat" where
   "plus 0       m = m"
 | "plus (Suc n) m = Suc (plus n m)"
To a functional programmer, this reads
plus is a function that analyses its first argument. If that is 0, then it returns the second argument. Otherwise, it calls itself with the predecessor of the first argument and increases the result by one.
which is clearly a description of a computation. But to Isabelle, the above reads
plus is a binary function on natural numbers, and it satisfies the following two equations:
And in fact, it is not so much Isabelle that reads it this way, but rather the fun command, which is external to the Isabelle logic. The fun command analyses the given equations, constructs a non-recursive definition of plus under the hood, passes that to Isabelle and then proves that the given equations hold for plus. One interesting consequence of this is that different specifications can lead to the same function. In fact, if we were to define plus' by recursing on the second argument, we'd obtain the same function (i.e. plus = plus' is a theorem, and there would be no way of telling the two apart).

Termination is a property of specifications, not functions Because a function does not evaluate, it does not make sense to ask if it terminates. The question of termination arises before the function is defined: The fun command can only construct plus in a way that the equations hold if it passes a termination check, very much like Fixpoint in Coq. But while the termination check of Fixpoint in Coq is a deep part of the basic logic, in Isabelle it is simply something that this particular command requires for its internal machinery to go through. At no point does a termination proof of the function exist as a theorem inside the logic. And other commands may have other means of defining a function that do not even require such a termination argument! For example, a function specification that is tail-recursive can be turned into a function even without a termination proof: The following definition describes a higher-order function that iterates its first argument f on the second argument x until it finds a fixpoint. It is completely polymorphic (the single quote in 'a indicates that this is a type variable):
partial_function (tailrec)
  fixpoint :: "('a ⇒ 'a) ⇒ 'a ⇒ 'a"
where
  "fixpoint f x = (if f x = x then x else fixpoint f (f x))"
We can work with this definition just fine. For example, if we instantiate f with (λx. x - 1), we can prove that it will always return 0:
lemma "fixpoint (λn. n - 1) (n::nat) = 0"
  by (induction n) (auto simp add: fixpoint.simps)
Similarly, if we have a function that works within the option monad (i.e. Maybe in Haskell), its specification can always be turned into a function without an explicit termination proof; here, one that calculates the Collatz sequence:
partial_function (option) collatz :: "nat ⇒ nat list option"
 where "collatz n =
        (if n = 1 then Some [n]
         else if even n
           then do { ns <- collatz (n div 2);    Some (n # ns) }
           else do { ns <- collatz (3 * n + 1);  Some (n # ns) })"
Note that lists in Isabelle are finite (like in Coq, unlike in Haskell), so this function returns a list only if the Collatz sequence eventually reaches 1. I expect these definitions to make a Coq user very uneasy. How can fixpoint be a total function? What is fixpoint (λn. n+1)? What if we run collatz n for an n where the Collatz sequence does not reach 1?³ We will come back to that question after a little detour...

HOL is a logic of non-empty types Another big difference between Isabelle and Coq is that in Isabelle, every type is inhabited. Just like the totality of functions, this is a very fundamental fact about what HOL defines to be a type. Isabelle gets away with that design because in Isabelle, we do not use types for propositions (like we do in Coq), so we do not need empty types to denote false propositions. This design has an important consequence: It allows the existence of a polymorphic expression that inhabits any type, namely
undefined :: 'a
The naming of this term alone has caused a great deal of confusion for Isabelle beginners, or in communication with users of different systems, so I implore you to not read too much into the name. In fact, you will have a better time if you think of it as "arbitrary" or, even better, "unknown". Since undefined can be instantiated at any type, we can instantiate it for example at bool, and we can observe an important fact: undefined is not an "extra" value besides the "usual ones". It is simply some value of that type, which is demonstrated in the following lemma:
lemma "undefined = True ∨ undefined = False" by auto
In fact, if the type has only one value (such as the unit type), then we know the value of undefined for sure:
lemma "undefined = ()" by auto
It is very handy to be able to produce an expression of any type, as we will see next.

Partial functions are just underspecified functions For example, it allows us to translate incomplete function specifications. Consider this definition, Isabelle's equivalent of Haskell's partial fromJust function:
fun fromSome :: "'a option ⇒ 'a" where
  "fromSome (Some x) = x"
This definition is accepted by fun (albeit with a warning), and the generated function fromSome behaves exactly as specified: when applied to Some x, it is x. The term fromSome None is also a value of type 'a; we just do not know which one it is, as the specification does not address that. So fromSome None behaves just like undefined above, i.e. we can prove
lemma "fromSome None = False ∨ fromSome None = True" by auto
Here is a small exercise for you: Can you come up with an explanation for the following lemma:
fun constOrId :: "bool ⇒ bool" where
  "constOrId True = True"
lemma "constOrId = (λ_. True) ∨ constOrId = (λx. x)"
  by (metis (full_types) constOrId.simps)
Overall, this behavior makes sense if we remember that function definitions in Isabelle are not really definitions, but rather specifications. And a partial function definition is simply an underspecification. The resulting function is simply any function that fulfills the specification, and the two lemmas above underline that observation.

Nonterminating functions are also just underspecified Let us return to the puzzle posed by fixpoint above. Clearly, the function, seen as a functional program, is not total: When passed the argument (λn. n + 1) or (λb. ¬b) it will loop forever trying to find a fixed point. But Isabelle functions are not functional programs, and the definitions are just specifications. What does the specification say about the case when f has no fixed point? It states that the equation fixpoint f x = fixpoint f (f x) holds. And this equation has a solution, for example fixpoint f _ = undefined. Or more concretely: The specification of the fixpoint function states that fixpoint (λb. ¬b) True = fixpoint (λb. ¬b) False has to hold, but it does not specify which particular value (True or False) it should denote; any is fine.

Not all function specifications are ok At this point you might wonder: Can I just specify any equations for a function f and get a function out of that? But rest assured: that is not the case. For example, no Isabelle command allows you to define a function bogus :: () ⇒ nat with the equation bogus () = Suc (bogus ()), because this equation does not have a solution. We can actually prove that such a function cannot exist:
lemma no_bogus: "∄ bogus. bogus () = Suc (bogus ())" by simp
(Of course, not_bogus () = not_bogus () is just fine...)

You cannot reason about partiality in Isabelle We have seen that there are many ways to define functions that one might consider "partial". Given a function, can we prove that it is not "partial" in that sense? Unfortunately, but unavoidably, no: Since undefined is not a separate, recognizable value, but rather simply an unknown one, there is no way of stating that "a function result is not specified". Here is an example that demonstrates this: Two partial functions (one with not all cases specified, the other one with a self-referential specification) are indistinguishable from the total variant:
fun partial1 :: "bool ⇒ unit" where
  "partial1 True = ()"
partial_function (tailrec) partial2 :: "bool ⇒ unit" where
  "partial2 b = partial2 b"
fun total :: "bool ⇒ unit" where
  "total True = ()"
| "total False = ()"
lemma "partial1 = total ∧ partial2 = total" by auto
If you really do want to reason about partiality of functional programs in Isabelle, you should consider implementing them not as plain HOL functions, but rather using HOLCF, where you can give equational specifications of functional programs and obtain continuous functions between domains. In that setting, ⊥ ≠ () and partial2 ≠ total. We have done that to verify some of HLint's equations.

You can still compute with Isabelle functions I hope by this point I have not scared away anyone who wants to use Isabelle for functional programming, and in fact, you can use it for that. If the equations that you pass to fun are a reasonable definition for a function (in the programming sense), then these equations, used as rewriting rules, will allow you to compute that function quite like you would in Coq or Haskell. Moreover, Isabelle supports code extraction: You can take the equations of your Isabelle functions and have them exported into OCaml, Haskell, Scala or Standard ML. See Concon for a conference management system with confidentiality verified in Isabelle. While these usually are the equations you defined the function with, they don't have to be: You can declare other proved equations to be used for code extraction, e.g. to refine your elegant definitions into performant ones. Like with code extraction from Coq to, say, Haskell, the adequacy of the translation rests on a moral reasoning foundation. Unlike extraction from Coq, where you have an (unformalized) guarantee that the resulting Haskell code is terminating, you do not get that guarantee from Isabelle. Conversely, this allows you to reason about and extract non-terminating programs, like fixpoint, which is not possible in Coq. There is currently ongoing work on verified code generation, where the code equations are reflected into a deep embedding of HOL in Isabelle that would allow explicit termination proofs.

Conclusion We have seen how in Isabelle, every function is total. Function declarations have equations, but these do not define the function in a computational sense, but rather specify it. Because in HOL there are no empty types, many specifications that appear partial (incomplete patterns, non-terminating recursion) have solutions in the space of total functions. Partiality in the specification is no longer visible in the final product.

PS: Axiom undefined in Coq This section is speculative, and an invitation for discussion. Coq already distinguishes between types used in programs (Set) and types used in proofs (Prop). Could Coq ensure that every t : Set is non-empty? I imagine this would require additional checks in the Inductive command, similar to the checks that the Isabelle command datatype has to perform⁴, and it would disallow Empty_set. If so, then it would be sound to add the following axiom
Axiom undefined : forall (a : Set), a.
wouldn't it? This axiom does not have any computational meaning, but that seems to be ok for optional Coq axioms, like classical reasoning or function extensionality. With this in place, how much of what I describe above about function definitions in Isabelle could now be done soundly in Coq? Certainly pattern matches would not have to be complete and could sport an implicit case _ ⇒ undefined. Would it help with non-obviously terminating functions? Would it allow a Coq command Tailrecursive that accepts any tail-recursive function without a termination check?

  1. Isabelle is a metalogical framework, and other logics, e.g. Isabelle/ZF, behave differently. For the purpose of this blog post, I always mean Isabelle/HOL.
  3. Let me know if you find such an n. Besides n = 0.
  4. Like fun, the constructions by datatype are not part of the logic, but create a type definition from more primitive notions that is isomorphic to the specified data type.

03 October 2017

Dimitri John Ledkov: An interesting bug - network-manager, glibc, dpkg-shlibdeps, systemd, and finally binutils

Not so long ago I went to effectively recompile NetworkManager and fix up a minor bug in it. It built fine across all architectures, was considered installable, etc., and I was expecting it to just migrate across. At the time, glibc was at 2.26 in artful-proposed and NetworkManager was built against it. However, the release pocket was at glibc 2.24. In Ubuntu we have a ProposedMigration process in place which ensures that newly built packages do not regress in the number of architectures they are built for or installable on, and do not regress themselves or any reverse dependencies at runtime.

Thus before my build of NetworkManager was considered for migration, it was tested in the release pocket against packages in the release pocket. Specifically, since the package metadata only requires glibc 2.17, NetworkManager was tested against the glibc currently in the release pocket, which should just work fine....
autopkgtest [21:47:38]: test nm: [-----------------------
test_auto_ip4 (__main__.ColdplugEthernet)
ethernet: auto-connection, IPv4 ... FAIL ----- NetworkManager.log -----
NetworkManager: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.25' not found (required by NetworkManager)
At first I only saw failing tests, which I thought were transient failures, so they were retried a few times. Then I looked at the autopkgtest log and saw the above error messages. Perplexed, I started an lxd container with ubuntu artful, enabled proposed and installed just network-manager from artful-proposed, and indeed a simple NetworkManager --help failed with the above error from the linker.

I am too young to know what dependency-hell means, since ever since I started using Linux (Ubuntu 7.04) all glibc symbols have been versioned, and dpkg-shlibdeps would generate correct minimum dependencies for a package. Alas, in this case readelf confirmed that /usr/sbin/NetworkManager indeed requires 2.25, while the dpkg dependency says >= 2.17.

Reading further through the readelf output, I checked that all of the glibc symbols used are 2.17 or lower, and only the "Version needs section '.gnu.version_r'" referenced a GLIBC_2.25 symbol. Inspecting the dpkg-shlibdeps code, I noticed that it does not parse that section, and only searches through the dynamic symbols used to establish the minimum required version.
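For illustration, the mismatch between those two views can be seen with something like this (a sketch using standard binutils readelf flags; the binary path as above):
# The version-needs section (.gnu.version_r): this is where GLIBC_2.25 shows up.
readelf -V /usr/sbin/NetworkManager | grep -B3 GLIBC_2.25
# The versions attached to the dynamic symbols actually referenced: 2.17 at most.
readelf --dyn-syms /usr/sbin/NetworkManager | grep -o 'GLIBC_[0-9.]*' | sort -u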

Things started to smell fishy. On one hand, I trust dpkg-shlibdeps to generate the right dependencies. On the other hand, I also trust the linker not to tell lies. Hence I opened a Debian BTS bug report about this issue.

At this point, I really wanted to figure out where the reference to 2.25 came from. Clearly it was not from any private symbols, as then the reference would be on 2.26. Checking the glibc ABI lists I found there were only a handful of symbols marked as 2.25:
$ grep 2.25 ./sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
GLIBC_2.25 GLIBC_2.25 A
GLIBC_2.25 __explicit_bzero_chk F
GLIBC_2.25 explicit_bzero F
GLIBC_2.25 getentropy F
GLIBC_2.25 getrandom F
GLIBC_2.25 strfromd F
GLIBC_2.25 strfromf F
GLIBC_2.25 strfroml F
Blindly grepping for these in the network-manager source tree I found the following:
$ grep explicit_bzero -r configure.ac src/
configure.ac: explicit_bzero],
src/systemd/src/basic/string-util.h:void explicit_bzero(void *p, size_t l);
src/systemd/src/basic/string-util.c:void explicit_bzero(void *p, size_t l)
src/systemd/src/basic/string-util.c: explicit_bzero(x, strlen(x));
First of all, it seems that network-manager includes a partial embedded copy of systemd. Secondly, that code is compiled into a temporary library and has autoconf detection logic to use explicit_bzero. It also has an embedded implementation of explicit_bzero for when it is not available in libc; however, it does not have the FORTIFY_SOURCE implementation of said function (__explicit_bzero_chk), as was later pointed out to me. And whilst this function is compiled into an intermediary noinst library, no functions that use explicit_bzero end up being used by the NetworkManager binary. To prove this, I dropped all code that uses explicit_bzero, rebuilt the package against glibc 2.26, and voila: it only had a version reference on glibc 2.17, as expected from the end-result usage of shared symbols.

At this point a toolchain bug was the suspect. It seems that whilst the explicit_bzero shared symbol got optimised out, the version reference on 2.25 persisted into the linked binaries. At this point in the archive a snapshot version of binutils was in use, and in fact forcefully downgrading binutils resulted in correct compilation, with the versions table referencing only glibc 2.17.

Matthias then took a tarball of object files and filed an upstream bug report against binutils: "[2.29 Regression] ld.bfd keeps a version reference in .gnu.version_r for symbols which are optimized out". The discussion in that bug report is a bit beyond me, as to me binutils is black magic. All I understood there was "we moved sweep and pass to another place due to some bugs"; doing that introduced this bug, thus do multiple sweeps and passes to make sure we fix old bugs and don't regress this either. Or something like that. Comments and a better description of the binutils fix are welcome.

Binutils got fixed by upstream developers, cherry-picked into Debian and Ubuntu, network-manager got rebuilt, and everything is wonderful now. However, it does look like unused, dead-end code paths tripped up optimisations in the toolchain, which managed to slip by distribution package dependency generation and needlessly require a newer version of glibc. I guess the lesson here is: do not embed/compile unused code. Also, I'm not sure why network-manager uses networkd internals like this; maybe systemd should expose more APIs or serialise more state into /run, as most other things query things over dbus, a private socket, or by establishing watches on /run/systemd/netif. I'll look into that another day.

Thanks a lot to Guillem Jover, Matthias Klose, Alan Modra, H.J. Lu, and others for getting involved. I would not have been able to raise, debug, or fix this issue all by myself.

28 September 2017

Matthias Klumpp: Adding fonts to software centers

Last year, the AppStream specification gained proper support for adding metadata for fonts, after Richard Hughes did some work on it years ago. We weren't happy with how fonts were handled at that time, so we searched for better solutions, which is why this took a bit longer to be done. Last year, I was implementing the final support for fonts in both appstream-generator (the metadata extractor used by Debian and a few others) as well as the AppStream specification. This blogpost was sitting on my todo list as a draft for a long time now, and I only just now managed to finish it, so sorry for announcing this so late. Fonts have already been available via AppStream for a year, and this post just sums up the status quo and some neat tricks if you want to write metainfo files for fonts. If you are following AppStream (or the Debian fonts list), you know everything already. Both Richard and I first tried to extract all the metadata needed to display fonts in a proper way to the users from the font files directly. This turned out to be very difficult, since font metadata is often wrong or incomplete, and certain desirable bits of metadata (like a longer description) are missing entirely. After messing around with different ways to solve this for days (after all, by extracting the data from font files directly we would have hundreds of fonts directly available in software centers), I also came to the same conclusion as Richard: The best and easiest solution here is to mandate the availability of metainfo files per font. Which brings me to the second issue: What is a font? Any person knowing about fonts will understand one font as one font face, e.g. "Lato Regular Italic" or "Lato Bold". A user, however, will see the font family as a font, e.g. just "Lato" instead of all the font faces separated out. Since AppStream data is used primarily by software centers, we want something that is easy for users to understand. Hence, an AppStream font component really describes a font family or collection of fonts, instead of individual font faces. We do also want AppStream data to be useful for system components looking for a specific font, which is why font components will advertise the individual font face names they contain via a <provides/> tag. Naming fonts and making them identifiable is a whole other issue; I used a document from Adobe on font naming issues as a rough guideline while working on this. How to write a good metainfo file for a font is best shown with an example. Lato is a good-looking font family that we want displayed in a software center. So, we write a metainfo file for it and place it in
/usr/share/metainfo/com.latofonts.Lato.metainfo.xml
for the AppStream metadata generator to pick up:
<?xml version="1.0" encoding="UTF-8"?>
<component type="font">
  <id>com.latofonts.Lato</id>
  <metadata_license>FSFAP</metadata_license>
  <project_license>OFL-1.1</project_license>
  <name>Lato</name>
  <summary>A sans-serif typeface family</summary>
  <description>
    <p>
      Lato is a sans-serif typeface family designed in the Summer 2010 by Warsaw-based designer
      Łukasz Dziedzic ("Lato" means "Summer" in Polish). In December 2010 the Lato family
      was published under the open-source Open Font License by his foundry tyPoland, with
      support from Google.
    </p>
  </description>
  <url type="homepage">http://www.latofonts.com/</url>
  <provides>
    <font>Lato Regular</font>
    <font>Lato Black Italic</font>
    <font>Lato Black</font>
    <font>Lato Bold Italic</font>
    <font>Lato Bold</font>
    <font>Lato Hairline Italic</font>
    ...
  </provides>
</component>
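Incidentally, a quick way to check a metainfo file like this for obvious mistakes is the AppStream command-line validator (assuming the appstreamcli tool is installed):
appstreamcli validate /usr/share/metainfo/com.latofonts.Lato.metainfo.xml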
When the file is processed, we know that we need to look for fonts in the package it is contained in. So, the appstream-generator will load all the fonts in the package and render example texts for them as an image, so we can show users a preview of the font. It will also use heuristics to render an icon for the respective font component using its regular typeface. Of course that is not ideal: what if there are multiple font faces in a package? What if the heuristics fail to detect the right font face to display? This behavior can be influenced by adding <font/> tags to a <provides/> tag in the metainfo file. The font-provides tags should contain the fullnames of the font faces you want to associate with this font component. If the font file does not define a fullname, the family and style are used instead. That way, someone writing the metainfo file can control which fonts belong to the described component. The metadata generator will also pick the first mentioned font name in the <provides/> list as the one to render the example icon for. It will also sort the example text images in the same order as the fonts are listed in the provides-tag. The example lines of text are written in a language matching the font using Pango. But what about symbolic fonts? Or fonts where any heuristic fails? At the moment, we see ugly tofu characters or boxes instead of an actual, useful representation of the font. This brings me to an unofficial extension to font metainfo files that, as far as I know, only appstream-generator supports at the moment. I am not happy enough with this solution to add it to the real specification, but it serves as a good method to fix up the edge cases where we can not render good example images for fonts. appstream-generator supports the FontIconText and FontSampleText custom AppStream properties to allow metainfo file authors to override the default texts and autodetected values. FontIconText will override the characters used to render the icon, while FontSampleText can be a line of text used to render the example images. This is especially useful for symbolic fonts, where the heuristics usually fail and we do not know which glyphs would be representative for a font. For example, a font with mathematical symbols might want to add the following to its metainfo file:
<custom>
  <value key="FontIconText"> </value>
  <value key="FontSampleText">       ...         </value>
</custom>
Any Unicode glyphs are allowed, but asgen will put some length restrictions on the texts. So, in summary: write a metainfo file per font family, list the font faces it contains via <provides/>, and where the automatic preview rendering fails, use the FontIconText and FontSampleText custom keys to control it.

Dirk Eddelbuettel: RcppZiggurat 0.1.4

A maintenance release of RcppZiggurat is now on the CRAN network for R. It switched the vignette to our new pinp package and its two-column pdf default. The RcppZiggurat package updates the code for the Ziggurat generator which provides very fast draws from a Normal distribution. The package provides a simple C++ wrapper class for the generator improving on the very basic macros, and permits comparison among several existing Ziggurat implementations. This can be seen in the figure where Ziggurat from this package dominates accessing the implementations from the GSL, QuantLib and Gretl---all of which are still way faster than the default Normal generator in R (which is of course of higher code complexity). The NEWS file entry below lists all changes.

Changes in version 0.1.4 (2017-07-27)
  • The vignette now uses the pinp package in two-column mode.
  • Dynamic symbol registration is now enabled.

Courtesy of CRANberries, there is also a diffstat report for the most recent release. More information is on the RcppZiggurat page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

26 September 2017

Colin Watson: A mysterious bug with Twisted plugins

I fixed a bug in Launchpad recently that led me deeper than I expected. Launchpad uses Buildout as its build system for Python packages, and it's served us well for many years. However, we're using 1.7.1, which doesn't support ensuring that packages required using setuptools' setup_requires keyword only ever come from the local index URL when one is specified; that's an essential constraint we need to be able to impose so that our build system isn't immediately sensitive to downtime or changes in PyPI. There are various issues/PRs about this in Buildout (e.g. #238), but even if those are fixed it'll almost certainly only be in Buildout v2, and upgrading to that is its own kettle of fish for other reasons. All this is a serious problem for us because newer versions of many of our vital dependencies (Twisted and testtools, to name but two) use setup_requires to pull in pbr, and so we've been stuck on old versions for some time; this is part of why Launchpad doesn't yet support newer SSH key types, for instance. This situation obviously isn't sustainable. To deal with this, I've been working for some time on switching to virtualenv and pip. This is harder than you might think: Launchpad is a long-lived and complicated project, and it had quite a number of explicit and implicit dependencies on Buildout's configuration and behaviour. Upgrading our infrastructure from Ubuntu 12.04 to 16.04 has helped a lot (12.04's baseline virtualenv and pip have some deficiencies that would have required a more complicated bootstrapping procedure). I've dealt with most of these: for example, I had to reorganise a lot of our helper scripts (1, 2, 3), but there are still a few more things to go. One remaining problem was that our Buildout configuration relied on building several different environments with different Python paths for various things. While this would technically be possible by way of building multiple virtualenvs, this would inflate our build time even further (we're already going to have to cope with some slowdown as a result of using virtualenv, because the build system now has to do a lot more than constructing a glorified link farm to a bunch of cached eggs), and it seems like unnecessary complexity. The obvious thing to do seemed to be to collapse these into a single environment, since there was no obvious reason why it should actually matter if txpkgupload and txlongpoll were carefully kept off the path when running most of Launchpad: so I did that. Then our build system got very sad. Hmm, I thought. To keep our test times somewhat manageable, we run them in parallel across 20 containers, and we randomise the order in which they run to try to shake out test isolation bugs. It's not completely unknown for there to be some oddities resulting from that. So I ran it again. Nope, but slightly differently sad this time. Furthermore, I couldn't reproduce these failures locally no matter how hard I tried. Oh dear. This was obviously not going to be a good day. In fact I spent a while on various different guesswork-based approaches. I found bug 571334 in Ampoule, an AMP-based process pool implementation that we use for some job runners, and proposed a fix for that, but cherry-picking that fix into Launchpad didn't help matters. I tried backing out subsets of my changes and determined that if both txlongpoll and txpkgupload were absent from the Python module path in the context of the tests in question then everything was fine.
I tried running strace locally and staring at the output for some time in the hope of enlightenment: that reminded me that the two packages in question install modules under twisted.plugins, which did at least establish a reason they might affect the environment that was more plausible than magic, but nothing much more specific than that. On Friday I was fiddling about with this again and trying to insert some more debugging when I noticed some interesting behaviour around plugin caching. If I caused the txpkgupload plugin to raise an exception when loaded, the Twisted plugin system would remove its dropin.cache (because it was stale) and not create a new one (because there was now no content to put in it). After that, running the relevant tests would fail as I'd seen in our buildbot. Aha! This meant that I could also reproduce it by doing an even cleaner build than I'd previously tried, by removing the cached txpkgupload and txlongpoll eggs and allowing the build system to recreate them. When they were recreated, they didn't contain dropin.cache, instead allowing that to be created on first use. Based on this clue I was able to get to the answer relatively quickly. Ampoule has a specialised bootstrapping sequence for its worker processes that starts by doing this:
from twisted.application import reactors
reactors.installReactor(reactor)
Now, twisted.application.reactors.installReactor calls twisted.plugin.getPlugins, so the very start of this bootstrapping sequence is going to involve loading all plugins found on the module path (I assume it's possible to write a plugin that adds an alternative reactor implementation). If dropin.cache is up to date, then it will just get the information it needs from that; but if it isn't, it will go ahead and import the plugin. If the plugin happens (as Twisted code often does) to run "from twisted.internet import reactor" at some point while being imported, then that will install the platform's default reactor, and then twisted.application.reactors.installReactor will raise ReactorAlreadyInstalledError. Since Ampoule turns this into an info-level log message for some reason, and the tests in question only passed through error-level messages or higher, all we could see was that a worker process had exited non-zero, but not why. The Twisted documentation recommends generating the plugin cache at build time for other reasons, but we weren't doing that. Fixing that makes everything work again. There are still a few more things needed to get us onto pip, but we're now pretty close. After that we can finally start bringing our dependencies up to date.
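As a footnote, generating the plugin cache at build time boils down to forcing one full plugin scan so that dropin.cache is written before anything else imports a reactor; roughly something like this, run in the build environment (a sketch, not our exact build step):
python -c 'from twisted.plugin import IPlugin, getPlugins; list(getPlugins(IPlugin))'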

15 September 2017

Chris Lamb: Which packages on my system are reproducible?

Whilst anyone can inspect the source code of free software for malicious flaws, most software is distributed pre-compiled to end users. The motivation behind the Reproducible Builds effort is to allow verification that no flaws have been introduced either maliciously or accidentally during this compilation process. As part of this project I wrote a script to determine which packages installed on your system are "reproducible" or not:
$ apt install devscripts
[ ]
$ reproducible-check
[ ]
W: subversion (1.9.7-2) is unreproducible (libsvn-perl, libsvn1, subversion) <https://tests.reproducible-builds.org/debian/subversion>
W: taglib (1.11.1+dfsg.1-0.1) is unreproducible (libtag1v5, libtag1v5-vanilla) <https://tests.reproducible-builds.org/debian/taglib>
W: tcltk-defaults (8.6.0+9) is unreproducible (tcl, tk) <https://tests.reproducible-builds.org/debian/tcltk-defaults>
W: tk8.6 (8.6.7-1) is unreproducible (libtk8.6, tk8.6) <https://tests.reproducible-builds.org/debian/tk8.6>
W: valgrind (1:3.13.0-1) is unreproducible <https://tests.reproducible-builds.org/debian/valgrind>
W: wavpack (5.1.0-2) is unreproducible (libwavpack1) <https://tests.reproducible-builds.org/debian/wavpack>
W: x265 (2.5-2) is unreproducible (libx265-130) <https://tests.reproducible-builds.org/debian/x265>
W: xen (4.8.1-1+deb9u1) is unreproducible (libxen-4.8, libxenstore3.0) <https://tests.reproducible-builds.org/debian/xen>
W: xmlstarlet (1.6.1-2) is unreproducible <https://tests.reproducible-builds.org/debian/xmlstarlet>
W: xorg-server (2:1.19.3-2) is unreproducible (xserver-xephyr, xserver-xorg-core) <https://tests.reproducible-builds.org/debian/xorg-server>
282/4494 (6.28%) of installed binary packages are unreproducible.
Whether a package is "reproducible" or not is determined by querying the Debian Reproducible Builds testing framework.
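Under the hood this amounts to comparing your installed package versions against the JSON data that the testing framework exports; you can poke at the same data yourself along these lines (a sketch, assuming the reproducible.json export and jq; the exact field names are from memory):
curl -s https://tests.reproducible-builds.org/debian/reproducible.json \
  | jq -r '.[] | select(.status != "reproducible") | "\(.package) \(.version)"' | head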


The --raw command-line argument lets you play with the data in more detail. For example, you can see who maintains your unreproducible packages:
$ reproducible-check --raw | dd-list --stdin
Alec Leamas <leamas.alec@gmail.com>
   lirc (U)
Alessandro Ghedini <ghedo@debian.org>
   valgrind
Alessio Treglia <alessio@debian.org>
   fluidsynth (U)
   libsoxr (U)
[ ]


reproducible-check is available in devscripts since version 2.17.10, which landed in Debian unstable on 14th September 2017.

13 September 2017

Shirish Agarwal: Android, Android marketplace and gaming addiction.

This will be a longish piece, so please bear with me and enjoy tea, coffee, beer or anything stronger you desire while reading. I had bought an Android phone, a Samsung J5, just before going to DebConf 2016. It was more about being in trend than really using it. The one I shared is the recentish upgraded version; the one I have is the 2 GB model, for which I had paid around double the list price. The only reason I bought that model is that it had a removable battery at a price point I was willing to pay. I did see that Samsung has the same ham-handed issues with audio as Nokia devices used to: the speakers and microphone are probably the cheapest you can get on the market. Nokia was the same too, at least at the lower end of the market, while Oppo has loud ringtones and loud music, perfect for those who are a bit hard of hearing (as yours truly is). I had been pleasantly surprised by the quality of photos the Samsung J5 was churning out, even though I'm a less-than-average shooter and have never really been into it, so it was a sort of wake-up call for where camera sensor technology is advancing. And of course with newer phones the kind of detail they can capture is mesmerizing, to say the least, although wide-angle shots would still take some time to get right, I guess. If memory serves me right, some time back Laura Arjona Reina (who handles part of debian-publicity and part of debian-women, among other responsibilities) shared a blog post on p.d.o. about the troubles she had while exporting data from her phone. I lack the time and the energy to try and find it (that specific blog post is really bookmarkable). What was interesting, though: a few years ago I had gone to Bangalore, where there is an organization which I like and admire, CIS, great for researchers. They had done a project getting between 10 and 20 phones from the market, of Chinese origin (for almost all mobiles sold in India, the fabrication of the CPU, APU etc. is done in China/Taiwan and they are then assembled here; what is done here at most is assembly, which for all political purposes is called "manufacturing"). All the mobiles kept quite a bit of info on the device even after you wiped them clean or put some other ROM on them. The CIS site is more than a bit cluttered, otherwise I would have shared the direct link. I do hope to send an e-mail to CIS; hopefully they will respond with the report, and I will share it here as and when. It would be interesting to know whether, after people flash a custom ROM, the status quo is the same as before. I do suspect it would be, as flashing ROMs on phones is still a bit of a specialized subject, at least here in India, with even an average phone costing a month or two's salary or more, and the idea of bricking the phone scares most people (including yours truly). Anyways, for a long time I was in bed and had the phone. I used two games from the Android marketplace which both mum and I enjoy and enjoyed. Those are "Real Jigsaw" and "Jigsaw Puzzle HD". The permissions dialog which "Real Jigsaw", among other games, has is horrible, and part of me freaks out that all such apps have unrestricted access to my storage area. Ideally, what Android should have done is either partition the storage or give the user the functionality to have a private space for their photos and whatever media they have, with the rest of the area being like a public park. If anybody has any thoughts on partitioning on Android phones, I would like to hear them.
One game, though, which really hooked mumma and me is "The Island Experiment". It reminded me of my younger days, when gaming addiction was not treated as a disease but thankfully now is. I would call myself somewhat of a functional addict, as in, I do my everyday things, work etc., but do dream about the game and what it will show me next. Part of it is that the game is web-based (which means it needs a constant internet connection) and web access is somewhat pricey, although with Reliance Jio, an upcoming data network operator with bundles of money and promising the moon, network issues, at least for the low-bandwidth game mum and I are playing, hopefully will not arise. I haven't used tshark or any such tool to analyze the traffic, but I guess it probably just sends short messages with the number of clicks in a time period and things like that; all the rest (I guess) happens on the mobile itself. I know at some point I probably will try to put a custom ROM on it, but which one is the question, as there are so many, and also which is most compatible with my device. It seems I would have to do a lot of homework before I can make any choices. A couple of months back, a friend of mine, Akshat, who has been using Android for a few years, enabled Developer Options, which I didn't know about till he shared that info with me. I do hope people check out Akshat's repo, as he has made quite a few useful scripts, especially if you are into digital photography. I shared some GIMP scripting with him a few days back, so along with ImageMagick you might see him doing some rough scripts in it. Of course, if people use them and give feedback he might clean the scripts up a bit so they give useful error messages and statements like "gimp is not installed on your system, please install it", or ask for a specific version, but as it works in free software, that is somewhat directly proportional to the number of users and bugs behind it. A good example of what I mean is youtube-dl. I filed 873853, where I shared the upstream ticket. Apparently YouTube changed again a few days back, and while upstream has fixed it, the youtube-dl maintainer probably needs to find time to get the new version up. Apparently the issue lies in:
$ dpkg -L youtube-dl | grep youtube.py
/usr/lib/python3/dist-packages/youtube_dl/extractor/youtube.py
Hopefully somebody does the needful. Btw, I find f-droid extremely useful, and especially osmand, but sadly both of them are not shared or talked about by people. The reason I brought up Developer Options in Android is that a few days back I noticed that the phone wonks out and has unpredictable behaviour, such as not letting me browse the web or do additions and deletions via the Google Play Store and the like. Things came to a head a few days back when I saw a fibre-optic splicing operation being carried out near my home by some workers of the state operator, which elated me; I wanted to shoot a video of it, but the battery died/there was no power, even though I hadn't used the phone much. I have deliberately shared the Hindi version, which tells how that knowledge is now coming to the masses. I had seen fibre-optic splicing more than a decade and a half back at some net conference, where it was "going to be in your neighbourhood soonish"; hopefully it will happen soon. I had my suspicions for quite some time that all the issues with the phone were due to improper charging. During the course of my investigation, I found out that in Developer Options there is an option called USB Configuration, and changing that from the default MTP (Media Transfer Protocol, which is basically used to move movies, music or any file between the phone and the computer) resulted in much better behaviour on my Android phone. But this caused an unexpected side-effect: I got pretty aggressive polling of the phone by the computer, even after saying I do not want to share the phone details with the computer. This I filed as 874216. The phone, and I am guessing most Samsung phones today, comes with an adaptor with a USB male plug which goes into the phone's USB port. There is the classical port for electricity, but like most people I rely heavily on USB charging, even for fully charging a deeply powered-down phone. One interesting project which I saw come into Debian some days back is dummydroid. I did file a bug about it. I do hope the maintainer provides some more documentation; I am sure many people would use it and add to the resource if the documentation were there. I did take a look at the package, and the profile seems to be an XML key-value kind of database. Having more profiles shouldn't be hard if we knew what info needs to be added and how to find that info. Lastly, I am slowly transferring all the above knowledge to my mum as well, although in small doses. She, just like me, has had problems coming from resistive touchscreens to capacitive touchscreens. You can call me wrong, but resistive touchscreens seemed superior and not as error-prone or liable to mistakes as capacitive touchscreens. There may be a setting to raise/lower the touch threshold, which I have not been able to find as of yet. I hope somebody finds something useful in there. I do hope that Debian becomes a replacement to be used on such mobiles, but then it would also have to have some sort of mainstream content with editors to help people find stuff, something that Debian is not so good at currently. Also, I'm not sure Synaptic is a good fit as a mobile store.
Filed under: Miscellaneous Tagged: #Android, #capacitive touchscreen, #custom ROMs, #digital photography, #dummydroid, #f-droid, #fabrication, #flashing, #game addiction, #Google Play Store, #Mainstreaming Debian, #mobile connectivity, #Oppo, #osmand, #planet-debian, #resistive touchscreen, #Samsung Galaxy J5, #scripting, #USB charging, #USB configuration, #youtube-dl, gaming

10 September 2017

intrigeri: Can you reproduce this Tails ISO image?

Thanks to a Mozilla Open Source Software award, we have been working on making the Tails ISO images build reproducibly. We have made huge progress: for a few months now, ISO images built by Tails core developers and our CI system have always been identical. But we're not done yet and we need your help! Our first call for testing build reproducibility in August uncovered a number of remaining issues. We think we have since fixed them all, and we now want to find out what other problems may prevent you from building our ISO image reproducibly. Please try to build an ISO image today, and tell us whether it matches ours! Build an ISO These instructions have been tested on Debian Stretch and testing/sid. If you're using another distribution, you may need to adjust them. If you get stuck at some point in the process, see our more detailed build documentation and don't hesitate to contact us: Set up the build environment You need a system that supports KVM, 1 GiB of free memory, and about 20 GiB of disk space.
  1. Install the build dependencies:
    sudo apt install \
        git \
        rake \
        libvirt-daemon-system \
        dnsmasq-base \
        ebtables \
        qemu-system-x86 \
        qemu-utils \
        vagrant \
        vagrant-libvirt \
        vmdebootstrap && \
    sudo systemctl restart libvirtd
    
  2. Ensure your user is in the relevant groups:
    for group in kvm libvirt libvirt-qemu ; do
       sudo adduser "$(whoami)" "$group"
    done
    
  3. Logout and log back in to apply the new group memberships.
Build Tails 3.2~alpha2 This should produce a Tails ISO image:
git clone https://git-tails.immerda.ch/tails && \
cd tails && \
git checkout 3.2-alpha2 && \
git submodule update --init && \
rake build
Send us feedback! No matter how your build attempt turned out we are interested in your feedback. Gather system information To gather the information we need about your system, run the following commands in the terminal where you've run rake build:
sudo apt install apt-show-versions && \
(
  for f in /etc/issue /proc/cpuinfo
  do
    echo "--- File: $ f  ---"
    cat "$ f "
    echo
  done
  for c in free locale env 'uname -a' '/usr/sbin/libvirtd --version' \
            'qemu-system-x86_64 --version' 'vagrant --version'
  do
    echo "--- Command: $ c  ---"
    eval "$ c "
    echo
  done
  echo '--- APT package versions ---'
  apt-show-versions qemu:amd64 linux-image-amd64:amd64 vagrant \
                    libvirt0:amd64
) | bzip2 > system-info.txt.bz2
Then check that the generated file doesn't contain any sensitive information you do not want to leak:
bzless system-info.txt.bz2
Next, please follow the instructions below that match your situation! If the build failed Sorry about that. Please help us fix it by opening a ticket: If the build succeeded Compute the SHA-512 checksum of the resulting ISO image:
sha512sum tails-amd64-3.2~alpha2.iso
Compare your checksum with ours:
9b4e9e7ee7b2ab6a3fb959d4e4a2db346ae322f9db5409be4d5460156fa1101c23d834a1886c0ce6bef2ed6fe378a7e76f03394c7f651cc4c9a44ba608dda0bc
If the checksums match: success, congrats for reproducing Tails 3.2~alpha2! Please send an email to tails-dev@boum.org (public) or tails@boum.org (private) with the subject "Reproduction of Tails 3.2~alpha2 successful" and system-info.txt.bz2 attached. Thanks in advance! Then you can stop reading here. Else, if the checksums differ: too bad, but really it's good news as the whole point of the exercise is precisely to identify such problems :) Now you are in a great position to help improve the reproducibility of Tails ISO images by following these instructions:
  1. Install diffoscope version 83 or higher and all the packages it recommends. For example, if you're using Debian Stretch:
    sudo apt remove diffoscope && \
    echo 'deb http://ftp.debian.org/debian stretch-backports main' | \
        sudo tee /etc/apt/sources.list.d/stretch-backports.list && \
    sudo apt update && \
    sudo apt -o APT::Install-Recommends="true" \
             install diffoscope/stretch-backports
    
  2. Download the official Tails 3.2~alpha2 ISO image.
  3. Compare the official Tails 3.2~alpha2 ISO image with yours:
    diffoscope \
           --text diffoscope.txt \
           --html diffoscope.html \
           --max-report-size 262144000 \
           --max-diff-block-lines 10000 \
           --max-diff-input-lines 10000000 \
           path/to/official/tails-amd64-3.2~alpha2.iso \
           path/to/your/own/tails-amd64-3.2~alpha2.iso
    bzip2 diffoscope.{txt,html}
    
  4. Send an email to tails-dev@boum.org (public) or tails@boum.org (private) with the subject "Reproduction of Tails 3.2~alpha2 failed", attaching:
    • system-info.txt.bz2;
    • the smallest file among diffoscope.txt.bz2 and diffoscope.html.bz2, except if they are larger than 100 KiB, in which case it is better to upload the file somewhere (e.g. share.riseup.net) and share the link in your email.
Thanks a lot! Credits Thanks to Ulrike & anonym who authored a draft on which this blog post is based.

05 September 2017

Kees Cook: security things in Linux v4.13

Previously: v4.12. Here's a short summary of some of the interesting security things in Sunday's v4.13 release of the Linux kernel:

security documentation ReSTification
The kernel has been switching to formatting documentation with ReST, and I noticed that none of the Documentation/security/ tree had been converted yet. I took the opportunity to take a few passes at formatting the existing documentation and, at Jon Corbet's recommendation, split it up between end-user documentation (which is mainly how to use LSMs) and developer documentation (which is mainly how to use various internal APIs). A bunch of these docs need some updating, so maybe with the improved visibility, they'll get some extra attention.

CONFIG_REFCOUNT_FULL
Since Peter Zijlstra implemented the refcount_t API in v4.11, Elena Reshetova (with Hans Liljestrand and David Windsor) has been systematically replacing atomic_t reference counters with refcount_t. As of v4.13, there are now close to 125 conversions with many more to come. However, there were concerns over the performance characteristics of the refcount_t implementation from the maintainers of the net, mm, and block subsystems. In order to assuage these concerns and help the conversion progress continue, I added an unchecked refcount_t implementation (identical to the earlier atomic_t implementation) as the default, with the fully checked implementation now available under CONFIG_REFCOUNT_FULL. The plan is that for v4.14 and beyond, the kernel can grow per-architecture implementations of refcount_t that have performance characteristics on par with atomic_t (as done in grsecurity's PAX_REFCOUNT).
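As a sketch of what a typical conversion looks like (the struct and helpers here are hypothetical; refcount_set, refcount_inc and refcount_dec_and_test are the actual <linux/refcount.h> API being discussed):

#include <linux/refcount.h>
#include <linux/slab.h>

struct foo {
	refcount_t refs;	/* was: atomic_t refs; */
	/* ... payload ... */
};

static struct foo *foo_alloc(void)
{
	struct foo *f = kzalloc(sizeof(*f), GFP_KERNEL);
	if (f)
		refcount_set(&f->refs, 1);	/* was: atomic_set() */
	return f;
}

static void foo_get(struct foo *f)
{
	refcount_inc(&f->refs);	/* saturates instead of wrapping */
}

static void foo_put(struct foo *f)
{
	if (refcount_dec_and_test(&f->refs))	/* was: atomic_dec_and_test() */
		kfree(f);
}

With the fully checked implementation, an increment on a saturated or already-zero counter warns and refuses to wrap, which is exactly the use-after-free primitive the conversion work is meant to remove.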
CONFIG_FORTIFY_SOURCE
Daniel Micay created a version of glibc's FORTIFY_SOURCE compile-time and run-time protection for finding overflows in the common string (e.g. strcpy, strcmp) and memory (e.g. memcpy, memcmp) functions. The idea is that since the compiler already knows the size of many of the buffer arguments used by these functions, it can already build in checks for buffer overflows. When all the sizes are known at compile time, this can actually allow the compiler to fail the build instead of continuing with a proven overflow. When only some of the sizes are known (e.g. destination size is known at compile-time, but source size is only known at run-time) run-time checks are added to catch any cases where an overflow might happen. Adding this found several places where minor leaks were happening, and Daniel and I chased down fixes for them. One interesting note about this protection is that it only examines the size of the whole object (via __builtin_object_size(..., 0)). If you have a string within a structure, CONFIG_FORTIFY_SOURCE as currently implemented will make sure only that you can't copy beyond the structure (but therefore, you can still overflow the string within the structure). The next step in enhancing this protection is to switch from 0 (above) to 1, which will use the closest surrounding subobject (e.g. the string). However, there are a lot of cases where the kernel intentionally copies across multiple structure fields, which means more fixes before this higher level can be enabled.
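As a userspace illustration of that mode 0 vs. mode 1 distinction (my own toy example, compiled with plain gcc -O2; this is not kernel code):

#include <stdio.h>

struct pkt {
	char name[8];
	char rest[24];
};

static struct pkt p;

int main(void)
{
	/* Mode 0, what CONFIG_FORTIFY_SOURCE currently uses: bytes from
	 * p.name to the end of the whole object, i.e. 32. A 12-byte
	 * strcpy() into p.name would pass this check even though it
	 * tramples p.rest. */
	printf("mode 0: %zu\n", __builtin_object_size(p.name, 0));
	/* Mode 1: bytes in the closest subobject, i.e. 8, which would
	 * catch that same intra-structure overflow. */
	printf("mode 1: %zu\n", __builtin_object_size(p.name, 1));
	return 0;
}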
NULL-prefixed stack canary
Rik van Riel and Daniel Micay changed how the stack canary is defined on 64-bit systems to always make sure that the leading byte is zero. This provides a deterministic defense against overflowing string functions (e.g. strcpy), since they will either stop an overflowing read at the NULL byte, or be unable to write a NULL byte, thereby always triggering the canary check. This does reduce the entropy from 64 bits to 56 bits for overflow cases where NULL bytes can be written (e.g. memcpy), but the trade-off is worth it. (Besides, x86_64's canary was 32-bits until recently.)
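The core trick can be sketched in a few lines (an illustrative stand-alone version, not the kernel's exact code):

#include <stdint.h>

/* On 64-bit little-endian, clearing the low byte zeroes the first
 * byte of the canary as laid out in memory: strcpy()-style overflows
 * either stop copying at that NUL or are unable to write one, so the
 * canary comparison always fails on overflow. */
#define CANARY_MASK 0xffffffffffffff00UL

static uint64_t make_canary(uint64_t random_val)
{
	return random_val & CANARY_MASK;	/* 56 bits of entropy remain */
}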
IPC refactoring
Partially in support of allowing IPC structure layouts to be randomized by the randstruct plugin, Manfred Spraul and I reorganized the internal layout of how IPC is tracked in the kernel. The resulting allocations are smaller and much easier to deal with, even if I initially missed a few needed container_of() uses.
randstruct gcc plugin
I ported grsecurity's clever randstruct gcc plugin to upstream. This plugin allows structure layouts to be randomized on a per-build basis, providing a probabilistic defense against attacks that need to know the location of sensitive structure fields in kernel memory (which is most attacks). By moving things around in this fashion, attackers need to perform much more work to determine the resulting layout before they can mount a reliable attack. Unfortunately, due to the timing of the development cycle, only the manual mode of randstruct landed in upstream (i.e. marking structures with __randomize_layout). v4.14 will also have the automatic mode enabled, which randomizes all structures that contain only function pointers. A large number of fixes to support randstruct have been landing from v4.10 through v4.13, most of which were already identified and fixed by grsecurity, but many were novel, either in newly added drivers, as whitelisted cross-structure casts, refactorings (like IPC noted above), or in a corner case on ARM found during upstream testing.
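In the manual mode, the annotation looks roughly like this (a made-up ops table for illustration; __randomize_layout is the real marker that landed in v4.13):

#include <linux/compiler.h>

/* A hypothetical ops table. Structures containing only function
 * pointers, like this one, are also what v4.14's automatic mode will
 * randomize without any annotation. */
struct sec_ops {
	int  (*open)(void *ctx);
	int  (*ioctl)(void *ctx, unsigned long arg);
	void (*close)(void *ctx);
} __randomize_layout;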
lower ELF_ET_DYN_BASE
One of the issues identified from the Stack Clash set of vulnerabilities was that it was possible to collide stack memory with the highest portion of a PIE program's text memory since the default ELF_ET_DYN_BASE (the lowest possible random position of a PIE executable in memory) was already so high in the memory layout (specifically, 2/3rds of the way through the address space). Fixing this required teaching the ELF loader how to load interpreters as shared objects in the mmap region instead of as a PIE executable (to avoid potentially colliding with the binary it was loading). As a result, the PIE default could be moved down to ET_EXEC (0x400000) on 32-bit, entirely avoiding the subset of Stack Clash attacks. 64-bit could be moved to just above the 32-bit address space (0x100000000), leaving the entire 32-bit region open for VMs to do 32-bit addressing, but late in the cycle it was discovered that Address Sanitizer couldn't handle it moving. With most of the Stack Clash risk only applicable to 32-bit, fixing 64-bit has been deferred until there is a way to teach Address Sanitizer how to load itself as a shared object instead of as a PIE binary.
early device randomness
I noticed that early device randomness wasn't actually getting added to the kernel entropy pools, so I fixed that to improve the effectiveness of the latent_entropy gcc plugin. That's it for now; please let me know if I missed anything. As a side note, I was rather alarmed to discover that due to all my trivial ReSTification formatting, and tiny FORTIFY_SOURCE and randstruct fixes, I made it into the most active 4.13 developers list (by patch count) at LWN with 76 patches: a whopping 0.6% of the cycle's patches. ;) Anyway, the v4.14 merge window is open!

2017, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

14 August 2017

Jamie McClelland: Diversity doesn't help the bottom line

A Google software engineer's sexist screed against diversity has been making the rounds lately. Most notable are the offensive and misguided statements about gender essentialism, which honestly make the thing hard to read at all. What seems lost in the hype, however, is that his primary point seems quite accurate. In short: if Google successfully diversified its workforce, racial and gender tensions would increase not decrease, divisiveness would spread and, in all likelihood, Google could be damaged. Imagine what would happen if the thousands of existing, mostly male, white and Asian engineers, the majority of whom are convinced that they play no part in racism and sexism, were confronted with thousands of smart and ambitious women, African Americans and Latinos who were becoming their bosses, telling them to work in different ways, and taking "their" promotions. It would be a revolution! I'd love to see it. Google's bosses definitely do not. That's why none of the diversity programs at Google or any other major tech company are having any impact - because they are not designed to have an impact. They are designed to boost morale and make their existing engineers feel good about what they do. Google has one goal: to make money. And one strategy: to design software that people want to use. One of their tactics that is highly effective is building tight-knit groups of programmers who work well together. If the creation of hostile, racist and sexist environments is a by-product - well, it's not one that affects their bottom line. Would Google make better software with a more diverse group of engineers? Definitely! For one, if African American engineers were working on their facial recognition software, it's doubtful it would have mistaken people with black faces for gorillas. However, if the perceived improvement in software outweighed the risks of diversification, then Google would not waste any time on feel-good programs and trainings - they would simply build a jobs pipeline and change their job outreach programs to recruit substantially more female, African American and Latino candidates. In the end, this risk avoidance and failure to perceive the limitations of homogeneity is the Achilles' heel of corporate software design. Our challenge is to see what we can build outside the confines of corporate culture that prioritizes profits, production efficiency, and stability. What can we do with teams that are willing to embrace racial and gender tension, risk divisiveness, and see benefits beyond releasing version 1.0?

13 August 2017

Enrico Zini: Consensually doing things together?

On 2017-08-06 I gave a talk at DebConf17 in Montreal titled "Consensually doing things together?" (video). Here are the talk notes. Abstract At DebConf Heidelberg I talked about how Free Software has a lot to do with consensually doing things together. Is that always true, at least in Debian? I'd like to explore what motivates one to start a project and what motivates one to keep maintaining it. What are the energy levels required to manage bits of Debian as the project keeps growing. How easy it is to say no. Whether we have roles in Debian that require irreplaceable heroes to keep them going. What could be done to make life easier for heroes, easy enough that mere mortals can help, or take their place. Unhappy is the community that needs heroes, and unhappy is the community that needs martyrs. I'd like to try and make sure that now, or in the very near future, Debian is not such an unhappy community. Consensually doing things together I gave a talk in Heidelberg. Valhalla made stickers; Debian France distributed many of them. There's one on my laptop. Which reminds me of what we ought to be doing. Of what we have a chance to do, if we play our cards right. I'm going to talk about relationships. Consensual relationships. Relationships in short. Nonconsensual relationships are usually called abuse. I like to see Debian as a relationship between multiple people. And I'd like it to be a consensual one. I'd like it not to be abuse. Consent From Wikipedia:
In Canada "consent means the voluntary agreement of the complainant to engage in sexual activity" without abuse or exploitation of "trust, power or authority", coercion or threats.[7] Consent can also be revoked at any moment.[8] There are 3 pillars often included in the description of sexual consent, or "the way we let others know what we're up for, be it a good-night kiss or the moments leading up to sex." They are:
  • Knowing exactly what and how much I'm agreeing to
  • Expressing my intent to participate
  • Deciding freely and voluntarily to participate[20]
Saying "I've decided I won't do laundry anymore" when the other partner is tired, or busy doing things. Is different than saying "I've decided I won't do laundry anymore" when the other partner has a chance to say "why? tell me more" and take part in negotiation. Resources: Relationships Debian is the Universal Operating System. Debian is made and maintained by people. The long term health of debian is a consequence of the long term health of the relationship between Debian contributors. Debian doesn't need to be technically perfect, it needs to be socially healthy. Technical problems can be fixed by a healty community. graph showing relationship between avoidance, accomodation, compromise, competition, collaboration The Thomas-Kilmann Conflict Mode Instrument: source png. Motivations Quick poll: What are your motivations to be in a relationship? Which of those motivations are healthy/unhealthy? "Galadriel" (noun, by Francesca Ciceri): a task you have to do otherwise Sauron takes over Middle Earth See: http://blog.zouish.org/nonupdd/#/22/1 What motivates me to start a project or pick one up? What motivates me to keep maintaning a project? What motivates you? What's an example of a sustainable motivation? Is it really all consensual in Debian? Energy Energy that thing which is measured in spoons. The metaphore comes from people suffering with chronic health issues:
"Spoons" are a visual representation used as a unit of measure used to quantify how much energy a person has throughout a given day. Each activity requires a given number of spoons, which will only be replaced as the person "recharges" through rest. A person who runs out of spoons has no choice but to rest until their spoons are replenished.
For example, in Debian, I could spend: What is one person capable of doing? Have reasonable expectations, on others: Have reasonable expectations, on yourself: Debian is a shared responsibility When spoons are limited, what takes more energy tends not to get done As the project grows, project-wide tasks become harder Are they still humanly achievable? I don't want Debian to have positions that require hero-types to fill them Dictatorship of who has more spoons: Perfectionism You are in a relationship that is just perfect. All your friends look up to you. You give people relationship advice. You are safe in knowing that You Are Doing It Right. Then one day you have an argument in public. You don't just have to deal with the argument, but also with your reputation and self-perception shattering. One thing I hate about Debian: consistent technical excellence. I don't want to be required to always be right. One of my favourite moments in the history of Debian is the openssl bug. Debian doesn't need to be technically perfect, it needs to be socially healthy; technical problems can be fixed. I want to remove perfectionism from Debian: if we discover we've been wrong all the time in something important, it's not the end of Debian, it's the beginning of an improved Debian. Too good to be true There comes a point in most people's dating experience where one learns that when some things feel too good to be true, they might indeed be. There are people who cannot say no: There are people who cannot take a no: Note the diversity statement: it's not a problem to have one of those (and many other) tendencies, as long as one manages to keep interacting constructively with the rest of the community Also, it is important to be aware of these patterns, to be able to compensate for one's own tendencies. What happens when an avoidant person meets a narcissistic person, and they are both unaware of the risks? Resources: Note: there are problems with the way these resources are framed: Red flag / green flag http://pervocracy.blogspot.ca/2012/07/green-flags.html Ask for examples of red/green flags in Debian. Green flags: Red flags: Apologies / Dealing with issues I don't see the usefulness of apologies that are about accepting blame, or making a person stop complaining. I see apologies as opportunities to understand the problem I caused, help fix it, and possibly find ways of avoiding causing that problem again in the future. A Better Way to Say Sorry lists a 4-step process, which is basically what we do in bug reports already: 1. Try to understand and reproduce the exact problem the person had. 2. Try to find the cause of the issue. 3. Try to find a solution for the issue. 4. Verify with the reporter that the solution does indeed fix the issue. This is just to say
My software ate
the files
that were in
your home directory and which
you were probably
needing
for work
Forgive me
it was so quick to write
without tests
and it worked so well for me
(inspired by a 1934 poem by William Carlos Williams) Don't be afraid to fail Don't be afraid to fail or drop the ball. I think that anything that has a label attached of "if you don't do it, nobody will", shouldn't fall on anybody's shoulders and should be shared no matter what. Shared or dropped. Share the responsibility for a healthy relationship Don't expect that the more experienced mates will take care of everything. In a project with active people counted by the thousand, it's unlikely that harassment isn't happening. Is anyone writing anti-harassment? Do we have stats? Is having an email address and a CoC giving us a false sense of security?
When you get involved in a new community, such as Debian, find out early where, if that happens, you can find support, understanding, and help to make it stop. If you cannot find any, or if the only thing you can find is people who say "it never happens here", consider whether you really want to be in that community.
(from http://www.enricozini.org/blog/2016/debian/you-ll-thank-me-later/)
There are some nice people in the world. I mean nice people, the sort I couldn't describe myself as. People who are friends with everyone, who are somehow never involved in any argument, who seem content to spend their time drawing pictures of bumblebees on flowers that make everyone happy. Those people are great to have around. You want to hold onto them as much as you can. But people only have so much tolerance for jerkiness, and really nice people often have less tolerance than the rest of us. The trouble with not ejecting a jerk, whether their shenanigans are deliberate or incidental, is that you allow the average jerkiness of the community to rise slightly. The higher it goes, the more likely it is that those really nice people will come around less often, or stop coming around at all. That, in turn, makes the average jerkiness rise even more, which teaches the original jerk that their behavior is acceptable and makes your community more appealing to other jerks. Meanwhile, more people at the nice end of the scale are drifting away.
(from https://eev.ee/blog/2016/07/22/on-a-technicality/) Give people freedom If someone tries something in Debian, try to acknowledge and accept their work. You can give feedback on what they are doing, and try not to stand in their way, unless what they are doing is actually hurting you. In that case, try to collaborate, so that you all can get what you need. It's ok if you don't like everything that they are doing. I personally don't care if people tell me I'm good when I do something, I perceive it a bit like "good boy" or "good dog". I rather prefer if people show an interest, say "that looks useful" or "how does it work?" or "what do you need to deploy this?" Acknowledge that I've done something. I don't care if it's especially liked, give me the freedom to keep doing it. Don't give me rewards, give me space and dignity. Rather than feeding my ego, feed by freedom, and feed my possibility to create.

08 August 2017

Jonathan Dowland: libraries

Cover for The Rise Of The Meritocracy
At some point during my undergraduate years I lost the habit of using libraries. On reflection this is probably Amazon's fault. In recent years I've tried to get back into the habit of using them. Using libraries is a great idea if you are trying to lead a more minimalist life. I am registered to use libraries in two counties: North Tyneside, where I live, and Newcastle, where I work. The union of the two counties' catalogues is pretty extensive. Perhaps surprisingly I have found North Tyneside to offer both better customer service and a more interesting selection of books. Sometimes there are still things that are hard to get hold of. After listening to BBC Radio 4's documentary The Rise and Fall of Meritocracy, presented by Toby Young, I became interested in reading The Rise of the Meritocracy: an alarmist, speculative essay that coined the term meritocracy, written by Toby's father, Michael Young. The book was not on either catalogue. It is out of print, with the price of second-hand copies fluctuating but generally higher than I am prepared to pay. I finally managed to find a copy in Newcastle University's Library. As an associate of the School of Computing I have access to the Library services. It's an interesting read, and I think if it were framed more as a novel than as an essay it might be remembered in the same bracket as Brave New World or 1984.

04 August 2017

Daniel Silverstone: USB Device Stacks, on RTFM

I have been spending time with Jorge Aparicio's RTFM for Cortex M3 framework for writing Rust to target Cortex-M3 devices from Arm (and particularly the STM32F103 from ST Microelectronics). Jorge's work in this area has been of interest to me ever since I discovered him working on this stuff a while ago. I am very tempted by the idea of being able to implement code for the STM32 with the guarantees of Rust and the language features which I have come to love such as the trait system. I have been thinking to myself that, while I admire and appreciate the work done on the GNUK, I would like to, personally, have a go at implementing some kind of security token on an STM32 as a USB device. And with the advent of the RTFM for M3 work, and Jorge's magical tooling to make it easier to access and control the registers on an M3 microcontroller, I figured it'd be super-nice to do this in Rust, with all the advantages that entails in terms of isolating unsafe behaviour and generally having the potential to be more easily verified as not misbehaving. To do this though, means that I need a USB device stack which will work in the RTFM framework. Sadly it seems that, thus far, only Jorge has been working on drivers for any of the M3 devices his framework supports. And one person can only do so much. So, in my infinite madness, I decided I should investigate the complexity of writing a USB device stack in Rust for the RTFM/M3 framework. (Why I thought this was a good idea is lost to the mists of late night Googling, but hey, it might make a good talk at the next conference I go to). As such, this blog post, and further ones along these lines, will serve as a partial tour of what I'm up to, and a partial aide-memoire for me about learning USB. If I get something horribly wrong, please DO contact me to correct me, otherwise I'll just continue to be wrong. If I've simplified something but it's still strictly correct, just let me know if it's an oversimplification since in a lot of cases there's no point in me putting the full details into a blog posting. I will mostly be considering USB2.0 protocol details but only really for low and full speed devices. (The hardware I'm targeting does low-speed and full-speed, but not high-speed. Though some similar HW does high-speed too, I don't have any to hand right now)

A brief introduction to USB In order to go much further, I needed a grounding in USB. It's a multi-layer protocol as you might expect, though we can probably ignore the actual electrical layer since any device we might hope to support will have to have a hardware block to deal with that. We will however need to consider the packet layer (since that will inform how the hardware block is implemented and thus its interface) and then the higher level protocols on top. USB is a deliberately asymmetric protocol. Devices are meant to be significantly easier to implement, both in terms of hardware and software, as compared with hosts. As such, despite some STM32s having OTG ports, I have no intention of supporting host mode at this time. USB is arranged into a set of busses which are, at least in the USB1.1 case, broadcast domains. As such, each device has an address assigned to it by the host during an early phase called 'configuration'. Once the address is assigned, the device is expected to only ever respond to messages addressed to it. Note that since everything is asymmetric in USB, the device can't send messages on its own, but has to be asked for them by the host, and as such the addressing is always from host toward device. USB devices then expose a number of endpoints through which communication can flow IN to the host or OUT to the device. Endpoints are not bidirectional, but the in and out endpoints do overlap in numbering. There is a special pair of endpoints, IN0 and OUT0 which, between them, form what I will call the device control endpoints. The device control endpoints are important since every USB device MUST implement them, and there are a number of well defined messages which pass over them to control the USB device. In theory a bare minimum USB device would implement only the device control endpoints.

Configurations, and Classes, and Interfaces, Oh My! In order for the host to understand what the USB device is, and what it is capable of, part of the device control endpoints' responsibility is to provide a set of descriptors which describe the device. These descriptors form a heirarchy and are then glommed together into a big lump of data which the host can download from the device in order to decide what it is and how to use it. Because of various historical reasons, where a multi-byte value is used, they are defined to be little-endian, though there are some BCD fields. Descriptors always start with a length byte and a type byte because that way the host can parse/skip as necessary, with ease. The first descriptor is the device descriptor, is a big one, and looks like this:
Device Descriptor
Field Name Byte start Byte length Encoding Meaning
bLength 0 1 Number Size of the descriptor in bytes (18)
bDescriptorType 1 1 Constant Device Descriptor (0x01)
bcdUSB 2 2 BCD USB spec version compiled with
bDeviceClass 4 1 Class Code, assigned by USB org (0 means "Look at interface descriptors", common value is 2 for CDC)
bDeviceSubClass 5 1 SubClass Code, assigned by USB org (usually 0)
bDeviceProtocol 6 1 Protocol Code, assigned by USB org (usually 0)
bMaxPacketSize 7 1 Number Max packet size for IN0/OUT0 (Valid are 8, 16, 32, 64)
idVendor 8 2 ID 16bit Vendor ID (Assigned by USB org)
idProduct 10 2 ID 16bit Product ID (Assigned by manufacturer)
bcdDevice 12 2 BCD Device version number (same encoding as bcdUSB)
iManufacturer 14 1 Index String index of manufacturer name (0 if unavailable)
iProduct 15 1 Index String index of product name (0 if unavailable)
iSerialNumber 16 1 Index String index of device serial number (0 if unavailable)
bNumConfigurations 17 1 Number Count of configurations the device has.
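Before picking the fields apart, here is the same layout sketched as a packed C struct for concreteness (field names straight from the table; this series targets Rust, so treat this as the conventional C view of the wire format rather than project code):

#include <stdint.h>

struct usb_device_descriptor {
	uint8_t  bLength;		/* 18 */
	uint8_t  bDescriptorType;	/* 0x01 */
	uint16_t bcdUSB;		/* little-endian BCD, e.g. 0x0200 */
	uint8_t  bDeviceClass;
	uint8_t  bDeviceSubClass;
	uint8_t  bDeviceProtocol;
	uint8_t  bMaxPacketSize;	/* 8, 16, 32 or 64 */
	uint16_t idVendor;
	uint16_t idProduct;
	uint16_t bcdDevice;
	uint8_t  iManufacturer;
	uint8_t  iProduct;
	uint8_t  iSerialNumber;
	uint8_t  bNumConfigurations;
} __attribute__((packed));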
This looks quite complex, but breaks down into a relatively simple two halves. The first eight bytes carries everything necessary for the host to be able to configure itself and the device control endpoints properly in order to communicate effectively. Since eight bytes is the bare minimum a device must be able to transmit in one go, the host can guarantee to get those, and they tell it what kind of device it is, what USB protocol it supports, and what the maximum transfer size is for its device control endpoints. The encoding of the bcdUSB and bcdDevice fields is interesting too. It is of the form 0xMMmm where MM is the major number, mm the minor. So USB2.0 is encoded as 0x0200, USB1.1 as 0x0110 etc. If the device version is 17.36 then that'd be 0x1736. Other fields of note are bDeviceClass which can be 0 meaning that interfaces will specify their classes, and idVendor/idProduct which between them form the primary way for the specific USB device to be identified. The Index fields are indices into a string table which we'll look at later. For now it's enough to know that wherever a string index is needed, 0 can be provided to mean "no string here". The last field is bNumConfigurations and this indicates the number of ways in which this device might function. A USB device can provide any number of these configurations, though typically only one is provided. If the host wishes to switch between configurations then it will have to effectively entirely quiesce and reset the device. The next kind of descriptor is the configuration descriptor. This one is much shorter, but starts with the same two fields:
Configuration Descriptor
Field Name Byte start Byte length Encoding Meaning
bLength 0 1 Number Size of the descriptor in bytes (9)
bDescriptorType 1 1 Constant Configuration Descriptor (0x02)
wTotalLength 2 2 Number Size of the configuration in bytes, in total
bNumInterfaces 4 1 Number The number of interfaces in this configuration
bConfigurationValue 5 1 Number The value to use to select this configuration
iConfiguration 6 1 Index The name of this configuration (0 for unavailable)
bmAttributes 7 1 Bitmap Attributes field (see below)
bMaxPower 8 1 Number Maximum bus power this configuration will draw (in 2mA increments)
An important field to consider here is the bmAttributes field which tells the host some useful information. Bit 7 must be set, bit 6 is set if the device would be self-powered in this configuration, bit 5 indicates that the device would like to be able to wake the host from sleep mode, and bits 4 to 0 must be unset. The bMaxPower field is interesting because it encodes the power draw of the device (when set to this configuration). USB allows for up to 100mA of draw per device when it isn't yet configured, and up to 500mA when configured. The value may be used to decide if it's sensible to configure a device if the host is in a low power situation. Typically this field will be set to 50 to indicate the nominal 100mA is fine, or 250 to request the full 500mA. Finally, the wTotalLength field is interesting because it tells the host the total length of this configuration, including all the interface and endpoint descriptors which make it up. With this field, the host can allocate enough RAM to fetch the entire configuration descriptor block at once, simplifying matters dramatically for it. Each configuration has one or more interfaces. The interfaces group some endpoints together into a logical function. For example a configuration for a multifunction scanner/fax/printer might have an interface for the scanner function, one for the fax, and one for the printer. Endpoints are not shared among interfaces, so when building this table, be careful. Next, logically, come the interface descriptors:
Interface Descriptor
Field Name Byte start Byte length Encoding Meaning
bLength 0 1 Number Size of the descriptor in bytes (9)
bDescriptorType 1 1 Constant Interface Descriptor (0x04)
bInterfaceNumber 2 1 Number The number of the interface
bAlternateSetting 3 1 Number The interface alternate index
bNumEndpoints 4 1 Number The number of endpoints in this interface
bInterfaceClass 5 1 Class The interface class (USB Org defined)
bInterfaceSubClass 6 1 SubClass The interface subclass (USB Org defined)
bInterfaceProtocol 7 1 Protocol The interface protocol (USB Org defined)
iInterface 8 1 Index The name of the interface (or 0 if not provided)
The important values here are the class/subclass/protocol fields which provide a lot of information to the host about what the interface is. If the class is a USB Org defined one (e.g. 0x02 for Communications Device Class) then the host may already have drivers designed to work with the interface meaning that the device manufacturer doesn't have to provide host drivers. The bInterfaceNumber is used by the host to indicate this interface when sending messages, and the bAlternateSetting is a way to vary interfaces. Two interfaces with the same bInterfaceNumber but different bAlternateSettings can be switched between (like configurations, but) without resetting the device. Hopefully the rest of this descriptor is self-evident by now. The next descriptor kind is endpoint descriptors:
Endpoint Descriptor
Field Name Byte start Byte length Encoding Meaning
bLength 0 1 Number Size of the descriptor in bytes (7)
bDescriptorType 1 1 Constant Endpoint Descriptor (0x05)
bEndpointAddress 2 1 Endpoint Endpoint address (see below)
bmAttributes 3 1 Bitmap Endpoint attributes (see below)
wMaxPacketSize 4 2 Number Maximum packet size this endpoint can send/receive
bInterval 6 1 Number Interval for polling endpoint (in frames)
The bEndpointAddress is a 4 bit endpoint number (so there're 16 endpoint indices) and a bit to indicate IN vs. OUT. Bit 7 is the direction marker and bits 3 to 0 are the endpoint number. This means there are 32 endpoints in total, 16 in each direction, 2 of which are reserved (IN0 and OUT0) giving 30 endpoints available for interfaces to use in any given configuration. The bmAttributes bitmap covers the transfer type of the endpoint (more below), and the bInterval is an interval measured in frames (1ms for low or full speed, 125 µs in high speed). bInterval is only valid for some endpoint types. The final descriptor kind is for the strings which we've seen indices for throughout the above. String descriptors have two forms:
String Descriptor (index zero)
Field Name Byte start Byte length Encoding Meaning
bLength 0 1 Number Size of the descriptor in bytes (variable)
bDescriptorType 1 1 Constant String Descriptor (0x03)
wLangID[0] 2 2 Number Language code zero (e.g. 0x0409 for en_US)
wLangID[n] 4.. 2 Number Language code n ...
This form (for descriptor 0) is that of a series of language IDs supported by the device. The device may support any number of languages. When the host requests a string descriptor, it will supply both the index of the string and also the language id it desires (from the list available in string descriptor zero). The host can tell how many language IDs are available simply by dividing bLength by 2 and subtracting 1 for the two header bytes. And for string descriptors of an index greater than zero:
String Descriptor (index greater than zero)
Field Name Byte start Byte length Encoding Meaning
bLength 0 1 Number Size of the descriptor in bytes (variable)
bDescriptorType 1 1 Constant String Descriptor (0x03)
bString 2.. .. Unicode The string, in "unicode" format
This second form of the string descriptor is simply the string in what the USB spec calls 'Unicode' format which is, as of 2005, defined to be UTF16-LE without a BOM or terminator. Since string descriptors are of a variable length, the host must request strings in two transactions. First a request for 2 bytes is sent, retrieving the bLength and bDescriptorType fields which can be checked and memory allocated. Then a request for bLength bytes can be sent to retrieve the entire string descriptor.
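To tie a couple of these encodings together, here are small helper sketches for the 0xMMmm BCD fields and the bEndpointAddress split described earlier (the function names are mine, not from any particular stack):

#include <stdbool.h>
#include <stdint.h>

static uint8_t bcd_major(uint16_t bcd) { return bcd >> 8; }	/* 0x0200 -> 0x02 */
static uint8_t bcd_minor(uint16_t bcd) { return bcd & 0xff; }	/* 0x0110 -> 0x10 */

/* Bit 7: direction (1 = IN, toward the host); bits 3..0: number. */
static bool    ep_is_in(uint8_t bEndpointAddress)  { return bEndpointAddress & 0x80; }
static uint8_t ep_number(uint8_t bEndpointAddress) { return bEndpointAddress & 0x0f; }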

Putting that all together Phew, this is getting to be quite a long posting, so I'm going to leave this here and in my next post I'll talk about how the host and device pass packets to get all that information to the host, and how it gets used.

31 July 2017

Chris Lamb: Free software activities in July 2017

Here is my monthly update covering what I have been doing in the free software world during July 2017 (previous month): I also blogged about my recent lintian hacking and installation-birthday package.
Reproducible builds

Whilst anyone can inspect the source code of free software for malicious flaws, most software is distributed pre-compiled to end users. The motivation behind the Reproducible Builds effort is to permit verification that no flaws have been introduced either maliciously or accidentally during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised. (I have generously been awarded a grant from the Core Infrastructure Initiative to fund my work in this area.) This month I:
  • Assisted Mattia with a draft of an extensive status update to the debian-devel-announce mailing list. There were interesting follow-up discussions on Hacker News and Reddit.
  • Submitted the following patches to fix reproducibility-related toolchain issues within Debian:
  • I also submitted 5 patches to fix specific reproducibility issues in autopep8, castle-game-engine, grep, libcdio & tinymux.
  • Categorised a large number of packages and issues in the Reproducible Builds "notes" repository.
  • Worked on publishing our weekly reports. (#114, #115, #116 & #117)

I also made the following changes to our tooling:
diffoscope

diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues.

  • comparators.xml:
    • Fix EPUB "missing file" tests; they ship a META-INF/container.xml file. [ ]
    • Misc style fixups. [ ]
  • APK files can also be identified as "DOS/MBR boot sector". (#868486)
  • comparators.sqlite: Simplify file detection by rewriting manual recognizes call with a Sqlite3Database.RE_FILE_TYPE definition. [ ]
  • comparators.directory:
    • Revert the removal of a try-except. (#868534)
    • Tidy module. [ ]

strip-nondeterminism

strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build.

  • Add missing File::Temp imports in the JAR and PNG handlers. This appears to have been exposed by lazily-loading handlers in #867982. (#868077)

buildinfo.debian.net

buildinfo.debian.net is my experiment into how to process, store and distribute .buildinfo files after the Debian archive software has processed them.

  • Avoid a race condition between check-and-creation of Buildinfo instances. [ ]


Debian My activities as the current Debian Project Leader are covered in my "Bits from the DPL" emails to the debian-devel-announce mailing list.
Patches contributed
  • obs-studio: Remove annoying "click wrapper" on first startup. (#867756)
  • vim: Syntax highlighting for debian/copyright files. (#869965)
  • moin: Incorrect timezone offset applied due to "84600" typo. (#868463)
  • ssss: Add a simple autopkgtest. (#869645)
  • dch: Please bump $latest_bpo_dist to current stable release. (#867662)
  • python-kaitaistruct: Remove Markdown and homepage references from package long descriptions. (#869265)
  • album-data: Correct invalid Vcs-Git URI. (#869822)
  • pytest-sourceorder: Update Homepage field. (#869125)
I also made a very large number of contributions to the Lintian static analysis tool. To avoid duplication here, I have outlined them in a separate post.

Debian LTS

This month I have been paid to work 18 hours on Debian Long Term Support (LTS). In that time I did the following:
  • "Frontdesk" duties, triaging CVEs, etc.
  • Issued DLA 1014-1 for libclamunrar, a library to add unrar support to the Clam anti-virus software to fix an arbitrary code execution vulnerability.
  • Issued DLA 1015-1 for the libgcrypt11 crypto library to fix a "sliding windows" information leak.
  • Issued DLA 1016-1 for radare2 (a reverse-engineering framework) to prevent a remote denial-of-service attack.
  • Issued DLA 1017-1 to fix a heap-based buffer over-read in the mpg123 audio library.
  • Issued DLA 1018-1 for the sqlite3 database engine to prevent a vulnerability that could be exploited via a specially-crafted database file.
  • Issued DLA 1019-1 to patch a cross-site scripting (XSS) exploit in phpldapadmin, a web-based interface for administering LDAP servers.
  • Issued DLA 1024-1 to prevent an information leak in nginx via a specially-crafted HTTP range.
  • Issued DLA 1028-1 for apache2 to prevent the leakage of potentially confidential information via providing Authorization Digest headers.
  • Issued DLA 1033-1 for the memcached in-memory object caching server to prevent a remote denial-of-service attack.

Uploads
  • redis:
    • 4:4.0.0-1 Upload new major upstream release to unstable.
    • 4:4.0.0-2 Make /usr/bin/redis-server in the primary package a symlink to /usr/bin/redis-check-rdb in the redis-tools package to prevent duplicate debug symbols that result in a package file collision. (#868551)
    • 4:4.0.0-3 Add -latomic to LDFLAGS to avoid a FTBFS on the mips & mipsel architectures.
    • 4:4.0.1-1 New upstream version. Install 00-RELEASENOTES as the upstream changelog.
    • 4:4.0.1-2 Skip non-deterministic tests that rely on timing. (#857855)
  • python-django:
    • 1:1.11.3-1 New upstream bugfix release. Check DEB_BUILD_PROFILES consistently, not DEB_BUILD_OPTIONS.
  • bfs:
    • 1.0.2-2 & 1.0.2-3 Use help2man to generate a manpage.
    • 1.0.2-4 Set hardening=+all for bindnow, etc.
    • 1.0.2-5 & 1.0.2-6 Don't use upstream's release target as it overrides our CFLAGS & install RELEASES.md as the upstream changelog.
    • 1.1-1 New upstream release.
  • libfiu:
    • 0.95-4 Apply patch from Steve Langasek to fix autopkgtests. (#869709)
  • python-daiquiri:
    • 1.0.1-1 Initial upload. (ITP)
    • 1.1.0-1 New upstream release.
    • 1.1.0-2 Tidy package long description.
    • 1.2.1-1 New upstream release.

I also reviewed and sponsored the uploads of gtts-token 1.1.1-1 and nlopt 2.4.2+dfsg-3.

Debian bugs filed
  • ITP: python-daiquiri Python library to easily setup basic logging functionality. (#867322)
  • twittering-mode: Correct incorrect time formatting due to "84600" typo. (#868479)

29 July 2017

Chris Lamb: More Lintian hacking

Lintian is a static analysis tool for Debian packages, reporting on various errors, omissions and quality-assurance issues to the maintainer. I seem to have found myself hacking on it a bit more recently (see my previous installment). In particular, here's my code that closed a total of 20 bugs and made it into the recent 2.5.52 release:
New tags
  • Check for the presence of an .asc signature in a .changes file if an upstream signing key is present. (#833585, tag)
  • Warn when dpkg-statoverride --add is called without a corresponding --list. (#652963, tag)
  • Check for years in debian/copyright that are later than the top entry in debian/changelog. (#807461, tag)
  • Trigger a warning when DEB_BUILD_OPTIONS is used instead of DEB_BUILD_MAINT_OPTIONS. (#833691, tag)
  • Look for "FIXME" and similar placeholders in various files in the debian directory. (#846009, tag)
  • Check for useless build-dependencies on dh-autoreconf or autotools-dev under Debhelper compatibility levels 10 or higher. (#844191, tag)
  • Emit a warning if GObject Introspection packages are missing dependencies on ${gir:Depends}. (#860801, tag)
  • Check packages do not contain upstart configuration under /etc/init. (#825348, tag)
  • Emit a classification tag if maintainer scripts such as debian/postinst is an ELF binary. (tag)
  • Check for overly-generic manual pages such as README.3pm.gz. (#792846, tag)
  • Ensure that (non-ELF) maintainer scripts begin with #!. (#843428, tag)

Regression fixes
  • Ensure r-data-without-readme-source checks the source package, not the binary; README.source files are not installed in the latter. (#866322, tag)
  • Don't emit source-contains-prebuilt-ms-help-file for files generated by Halibut. (#867673, tag)
  • Add .yml to the list of file extensions to avoid false positives when emitting extra-license-file. (#856137, tag)
  • Append a regression test for enumerated lists in the "a) b) c)" style, which would previously trigger a "duplicate word" warning if the following paragraph began with an "a." (#844166, tag)

Documentation updates
  • Rename copyright-contains-dh-make-perl-boilerplate to copyright-contains-automatically-extracted-boilerplate as it can be generated by other tools such as dh-make-elpa. (#841832, tag)
  • Changes to new-package-should-not-package-python2-module (tag):
    • Upgrade from I: to W:. (#829744)
    • Clarify wording in description to make the justification clearer.
  • Clarify justification in debian-rules-parses-dpkg-parsechangelog. (#865882, tag)
  • Expand the rationale for the latest-debian-changelog-entry-without-new-date tag to mention possible implications for reproducible builds. (tag)
  • Update the source-contains-prebuilt-ms-help-file description; there exists free software to generate .chm files. (tag)
  • Append an example shell snippet to explain how to prevent init.d-script-sourcing-without-test. (tag)
  • Add a missing "contains" verb to the description of the debhelper-autoscript-in-maintainer-scripts tag. (tag)
  • Consistently use the same "Debian style" RFC822 date format for both "Mirror timestamp" and "Last updated" on the Lintian index page. (#828720)

Misc
  • Allow the use of suppress-tags=<tag>[,<tag>[,<tag>]] in ~/.lintianrc. (#764486)
  • Improve the support for "3.0 (git)" packages. However, they remain marked as unsupported-source-format as they are not accepted by the Debian archive. (#605999)
  • Apply patch from Dylan Aïssi to also check for .RData files (not just .Rdata) when checking for the copyright status of R Project data files. (#868178, tag)
  • Match more Lena Söderberg images. (#827941, tag)
  • Refactor a hard-coded list of possible upstream key locations to the common/signing-key-filenames Lintian::Data resource.

25 July 2017

Satyam Zode: Maya - the OpenEBS Go Kit Project

The Kit project in Go is the common project containing all the standard libraries or packages used across all the Go projects in the organization. Motivation I attended GopherCon India 2017, where there was a talk on Package Oriented Design In Go by William Kennedy. In that talk, William explained some really important and thoughtful design principles which we can apply in our day-to-day work while writing Go. Hence, I wanted to apply these design philosophies to the Go projects which I have been working on as part of the OpenEBS project. I learnt the good practice of having a Go Kit project at the organization level from William's talk. What is the Kit Project? The Kit project in Go is the common project containing all the standard libraries or packages used across all the Go projects in the organization. Packages in the Kit project should follow these design philosophies. Need for a Kit project Sometimes, we write the same Go packages again and again to do the same task at different levels in the different Go projects under the same organization. For example, we write a custom logger package in different Go projects. If the custom logger package is the same across the organization and can be reused by simply importing it, then this custom logger package is a perfect fit for the Kit project. You can sense how much time and cost a Kit project will save us. How to convert existing projects to have a kit Maya is the kit project in progress. I will walk through our journey of creating a Kit project called maya for the OpenEBS organization from existing Go projects. At OpenEBS, as an open source and growing Go project, we value Go principles and we try hard to leverage Go's offerings. Maya is the Kit project for application projects like maya-apiserver, maya-storage-bot etc. Maya contains all the Kubernetes & Nomad APIs, common utilities etc. needed for development of maya-apiserver and maya-storage-bot. In the near future, we will be trying to push our custom libraries to maya, so that it becomes a promising Go kit project for the OpenEBS community. We have specifically followed the package oriented design principles in Go to create maya as a kit project. Project structure for the Maya as Kit project Example usage of maya kit project in maya-apiserver Maya-apiserver uses maya as a Kit project. Maya-apiserver exposes OpenEBS operations in the form of REST APIs. This allows multiple clients, e.g. volume-related plugins, to consume OpenEBS storage operations exposed by Maya API server. Conclusion A Go Kit project should contain packages which are usable, purposeful and portable. Go Kit projects will improve the efficiency of the organization at both the human and code level.
