Search Results: "dato"

7 November 2016

Reproducible builds folks: Reproducible Builds: week 80 in Stretch cycle

What happened in the Reproducible Builds effort between Sunday October 30 and Saturday November 5 2016: Upcoming events Reproducible work in other projects Bugs filed Reviews of unreproducible packages 81 package reviews have been added, 14 have been updated and 43 have been removed this week, adding to our knowledge about identified issues. 3 issue types have been updated: 1 issue type has been removed: 1 issue type has been updated: Weekly QA work During reproducibility testing, some FTBFS bugs have been detected and reported by: diffoscope development buildinfo.debian.net development tests.reproducible-builds.org Reproducible Debian: Misc. Also with thanks to Profitbricks sponsoring the "hardware" resources, Holger created a 13 core machine with 24GB RAM and 100GB SSD based storage so that Ximin can do further tests and development on GCC and other software on a fast machine. This week's edition was written by Chris Lamb, Ximin Luo, Vagrant Cascadian, Holger Levsen and reviewed by a bunch of Reproducible Builds folks on IRC.

31 October 2016

Antoine Beaupré: My free software activities, October 2016

Debian Long Term Support (LTS) This is my 7th month working on Debian LTS, started by Raphael Hertzog at Freexian, after a long pause during the summer. I have worked on the following packages and CVEs: I have also helped review work on the following packages:
  • imagemagick: reviewed BenH's work to figure out what was done. Unfortunately, I forgot to officially take on the package, and Roberto started working on it in the meantime. I nevertheless took the time to review Roberto's work and outline possible issues with the originally suggested patchset
  • tiff: reviewed Raphael's work on the hairy TIFFTAG_* issues, all the gory details in this email
The work on ImageMagick and GraphicsMagick was particularly intriguing. Looking at the source of those programs makes me wonder why we are still using them at all: it's a tangled mess of C code that is bound to bring up more and more vulnerabilities, time after time. It seems there's always a "Magick" vulnerability waiting to be fixed out there... I somehow hoped that the fork would bring more stability and reliability, but it seems they are suffering from similar issues because, fundamentally, they haven't rewritten ImageMagick... It looks like this is something that affects all image programs. The review I have done on the tiff suite gives me the same shivering sensation as reviewing the "Magick" code. It feels like all image libraries are poorly implemented and then bound to be exploited somehow... Nevertheless, if I had to use a library of the sort in my software, I would stay away from the "Magick" forks and try something like imlib2 first... Finally, I also did some minor work on the user and developer LTS documentation and some triage work on samba, xen and libass. I also looked at the dreaded CVE-2016-7117 vulnerability in the Linux kernel to verify its impact on wheezy users. I also looked at implementing a --lts flag for dch (see bug #762715). It was difficult to get back to work after such a long pause, but I am happy I was able to contribute a significant number of hours. It's a bit difficult to find work sometimes in LTS-land, even if there's actually always a lot of work to be done. For example, I used to be one of the people doing frontdesk work, but those duties are now assigned until the end of the year, so it's unlikely I will be doing any of that for the foreseeable future. Similarly, a lot of packages were already assigned when I started looking at the available packages. There was an interesting discussion on the internal mailing list regarding unlocking package ownership, because some people had packages locked for weeks, sometimes months, without significant activity. Hopefully that situation will improve after that discussion. Another interesting discussion I participated in is the question of whether the LTS team should be waiting for unstable to be fixed before publishing fixes in oldstable. It seems the consensus right now is that it shouldn't be mandatory to fix issues in unstable before we fix security issues in oldstable and stable. After all, security support for testing and unstable is limited. But I was happy to learn that working on brand new patches is part of our mandate as part of the LTS work. I did work on such a patch for tar which ended up being adopted by the original reporter, although upstream ended up implementing our recommendation in a better way. It's coincidentally the first time since I started working on LTS that I didn't get the number of requested hours, which means that there are more people working on LTS. That is a good thing, but I am worried it may also mean people are more spread out and less capable of focusing for longer periods of time on more difficult problems. It also means that the team is growing faster than the funding, which is unfortunate: now is as good a time as any to remind you to see if you can make your company fund the LTS project if you are still running Debian wheezy.

Other free software work It seems like forever since I did such a report; between the last one and my vacation, a lot has happened.

Monkeysign I have done extensive work on Monkeysign, trying to bring it kicking and screaming into the new world of GnuPG 2.1. This was the objective of the 2.1 release, which collected about two years of work and patches, including arbitrary MUA support (e.g. Thunderbird), config file support, and a release on PyPI. I have had to make about 4 more releases to try and fix the build chain, ship the test suite with the program and add a primitive preferences panel. The 2.2 release also finally features Tor support! I am also happy to have moved more documentation to Read the Docs, part of which I mentioned in a previous article. The git repositories and issues were also moved to a GitLab instance, which will hopefully improve the collaboration workflow, although we still have issues in streamlining the merge request workflow. All in all, I am happy to be working on Monkeysign, but it has been a frustrating experience. In the last years, I have been maintaining the project largely on my own: although there are about 20 contributors in Monkeysign, I have authored over 90% of the commits in the code. New contributors recently showed up, and I hope this will relieve some of the pressure of being the sole maintainer, but I am not sure how viable the project is.

Funding free software work More and more, I wonder how to sustain my contributions to free software. As a previous article has shown, I work a lot on the computer, even when I am not on a full-time job. Monkeysign has been a significant time drain in the last months, and I have done this work on a completely volunteer basis. I wouldn't mind so much except that there is a lot of work I do on a volunteer basis. This means that I sometimes must prioritize paid consulting work, at the expense of those volunteer projects. While most of my paid work usually revolves around free software, the benefits of paid work are not always immediately obvious, as the primary objective is to deliver to the customer, and the community as a whole is somewhat of a side-effect. I have watched with interest joeyh's adventures into crowdfunding, which seem to be working pretty well for him. Unfortunately, I cannot claim the incredible (and well-deserved) reputation Joey has, and even if I could, I can't live on $500 a month. I would love to hear if people would be interested in funding my work in such a way. I am hesitant to launch a crowdfunding campaign because it is difficult to identify what exactly I am working on from one month to the next. Looking back at earlier reports shows that I am all over the place: one month I'll work on a Perl Wiki (Ikiwiki), the next one I'll be hacking at a multimedia home cinema (Kodi). I can hardly think of how to fund those things short of "just give me money to work on anything I feel like", which I can hardly ask of anyone. Even worse, it feels like the audience here is either friends or colleagues. It would make little sense for me to seek funding from those people: colleagues have the same funding problems I do, and I don't want to impoverish my friends... So far I have taken the approach of trying to get funding for work I am doing, bit by bit. For example, I have recently been told that LWN actually pays for contributed articles and have started running articles by them before publishing them here. This is looking good: they will publish an article I wrote about the Omnia router I have recently received. I give them exclusive rights on the article for two weeks, but I otherwise retain full ownership over the article and will publish it here after the exclusive period. Hopefully, I will be able to find more such projects that pay for the work I do on a day-to-day basis.

Open Street Map editing I have ramped up my OpenStreetMap contributions, having (temporarily) moved to a different location. There are lots of things to map here: trails, gas stations and lots of other things are missing from the map. Sometimes the effort looks a bit ridiculous, reminding me of my early days of editing OSM. I have registered for OSM Live, a project to fund OSM editors that, I must admit, doesn't help much with funding my work: with the hundreds of edits I did in October, I received the equivalent of $1.80 CAD in Bitcoin. This may be the lowest hourly salary I have ever received, probably going at a rate of 10¢ per hour! Still, it's interesting to be able to point people to the project if someone wants to contribute to OSM mappers. But mappers should have no illusions about getting a decent salary from this effort, I am sorry to say.

Bounties I feel this is similar to the "bounty" model used by the Borg project: I claimed around $80 USD in that project for what probably amounts to tens of hours of work, yet another salary that would qualify as "poor". Another example is a feature I would like to implement in Borg: support for protocols other than SSH. There is currently no bounty on this, but a similar feature, S3 support, has one of the largest bounties Borg has ever seen: $225 USD. And the claimant for the bounty hasn't actually implemented the feature: instead of backing up to S3, the patch (to a third-party tool) actually enables support for Amazon Cloud Drive, a completely different API. Even at $225, I wouldn't be able to complete any of those features and get a decent salary. As well explained by the Snowdrift reviews, bounties just don't work at all... The ludicrous 10% fee charged by Bountysource made sure I would never do business with them ever again anyway.

Other work There are probably more things I did recently, but I am having difficulty keeping track of the last 5 months of on-and-off work, so you will forgive me for not being as exhaustive as I usually am.

1 October 2016

Kees Cook: security things in Linux v4.6

Previously: v4.5. The v4.6 Linux kernel release included a bunch of stuff, with much more of it under the KSPP umbrella. seccomp support for parisc Helge Deller added seccomp support for parisc, which included plumbing support for PTRACE_GETREGSET to get the self-tests working. x86 32-bit mmap ASLR vs unlimited stack fixed Hector Marco-Gisbert removed a long-standing limitation to mmap ASLR on 32-bit x86, where setting an unlimited stack (e.g. ulimit -s unlimited) would turn off mmap ASLR (which provided a way to bypass ASLR when executing setuid processes). Given that ASLR entropy can now be controlled directly (see the v4.5 post), and that the cases where this created an actual problem are very rare, a system that sees collisions between unlimited stack and mmap ASLR can just adjust the 32-bit ASLR entropy instead. x86 execute-only memory Dave Hansen added Protection Key support for future x86 CPUs and, as part of this, implemented support for execute-only memory in user-space. On pkeys-supporting CPUs, using mmap(..., PROT_EXEC) (i.e. without PROT_READ) will mean that the memory can be executed but cannot be read (or written). This provides some mitigation against automated ROP gadget finding where an executable is read out of memory to find places that can be used to build a malicious execution path. Using this will require changing some linker behavior (to avoid putting data in executable areas), but seems to otherwise Just Work. I'm looking forward to either emulated QEMU support or access to one of these fancy CPUs. CONFIG_DEBUG_RODATA enabled by default on arm and arm64, and mandatory on x86 Ard Biesheuvel (arm64) and I (arm) made the poorly-named CONFIG_DEBUG_RODATA enabled by default. This feature controls whether the kernel enforces proper memory protections on its own memory regions (code memory is executable and read-only, read-only data is actually read-only and non-executable, and writable data is non-executable). This protection is a fundamental security primitive for kernel self-protection, so making it on-by-default is required to start any kind of attack surface reduction within the kernel. On x86 CONFIG_DEBUG_RODATA was already enabled by default, but, at Ingo Molnar's suggestion, I made it mandatory: CONFIG_DEBUG_RODATA cannot be turned off on x86. I expect we'll get there with arm and arm64 too, but the protection is still somewhat new on these architectures, so it's reasonable to continue to leave an out for developers that find themselves tripping over it. arm64 KASLR text base offset Ard Biesheuvel reworked a ton of arm64 infrastructure to support kernel relocation and, building on that, Kernel Address Space Layout Randomization of the kernel text base offset (and module base offset). As with x86 text base KASLR, this is a probabilistic defense that raises the bar for kernel attacks where finding the KASLR offset must be added to the chain of exploits used for a successful attack. One big difference from x86 is that the entropy for the KASLR must come either from Device Tree (in the /chosen/kaslr-seed property) or from UEFI (via EFI_RNG_PROTOCOL), so if you're building arm64 devices, make sure you have a strong source of early-boot entropy that you can expose through your boot-firmware or boot-loader. zero-poison after free Laura Abbott reworked a bunch of the kernel memory management debugging code to add zeroing of freed memory, similar to PaX/Grsecurity's PAX_MEMORY_SANITIZE feature. 
This feature means that memory is cleared at free, wiping any sensitive data so it doesn't have an opportunity to leak in various ways (e.g. accidentally uninitialized structures or padding), and that certain types of use-after-free flaws cannot be exploited since the memory has been wiped. To take things even a step further, the poisoning can be verified at allocation time to make sure that nothing wrote to it between free and allocation (called "sanity checking"), which can catch another small subset of flaws. To understand the pieces of this, it's worth describing that the kernel's underlying page allocator (e.g. __get_free_pages()) is used by the finer-grained slab allocator (e.g. kmem_cache_alloc(), kmalloc()). Poisoning is handled separately in both allocators. The zero-poisoning happens at the page allocator level. Since the slab allocators tend to do their own allocation/freeing, their poisoning happens separately (since on slab free nothing has been freed up to the page allocator). Only limited performance tuning has been done, so the penalty is rather high at the moment, at about 9% when doing a kernel build workload. Future work will include some exclusion of frequently-freed caches (similar to PAX_MEMORY_SANITIZE), and making the options entirely CONFIG controlled (right now both CONFIGs are needed to build in the code, and a kernel command line is needed to activate it). Performing the sanity checking (mentioned above) adds another roughly 3% penalty. In the general case (and once the performance of the poisoning is improved), the security value of the sanity checking isn't worth the performance trade-off. Tests for the features can be found in lkdtm as READ_AFTER_FREE and READ_BUDDY_AFTER_FREE. If you're feeling especially paranoid and have enabled sanity-checking, WRITE_AFTER_FREE and WRITE_BUDDY_AFTER_FREE can test these as well. To perform zero-poisoning of page allocations and (currently non-zero) poisoning of slab allocations, build with:
CONFIG_DEBUG_PAGEALLOC=n
CONFIG_PAGE_POISONING=y
CONFIG_PAGE_POISONING_NO_SANITY=y
CONFIG_PAGE_POISONING_ZERO=y
CONFIG_SLUB_DEBUG=y
and enable the page allocator poisoning and slab allocator poisoning at boot with this on the kernel command line:
page_poison=on slub_debug=P
To add sanity-checking, change CONFIG_PAGE_POISONING_NO_SANITY to =n, and add "F" to slub_debug, as in slub_debug=PF. read-only after init I added the infrastructure to support making certain kernel memory read-only after kernel initialization (inspired by a small part of PaX/Grsecurity's KERNEXEC functionality). The goal is to continue to reduce the attack surface within the kernel by making even more of the memory, especially function pointer tables, read-only (which depends on CONFIG_DEBUG_RODATA above). Function pointer tables (and similar structures) are frequently targeted by attackers when redirecting execution. While many are already declared const in the kernel source code, making them read-only (and therefore unavailable to attackers) for their entire lifetime, there is a class of variables that get initialized during kernel (and module) start-up (i.e. written to during functions that are marked __init) and then never (intentionally) written to again. Some examples are things like the VDSO, vector tables, arch-specific callbacks, etc. As it turns out, most architectures with kernel memory protection already delay making their data read-only until after __init (see mark_rodata_ro()), so it's trivial to declare a new data section (.data..ro_after_init) and add it to the existing read-only data section (.rodata). Kernel structures can be annotated with the new section (via the __ro_after_init macro), and they'll become read-only once boot has finished. The next step for attack surface reduction infrastructure will be to create a kernel memory region that is passively read-only, but can be made temporarily writable (by a single un-preemptable CPU), for storing sensitive structures that are written to only very rarely. Once this is done, much more of the kernel's attack surface can be made read-only for the majority of its lifetime. As people identify places where __ro_after_init can be used, we can grow the protection. A good place to start is to look through the PaX/Grsecurity patch to find uses of __read_only on variables that are only written to during __init functions. The rest are places that will need the temporarily-writable infrastructure (PaX/Grsecurity uses pax_open_kernel()/pax_close_kernel() for these). That's it for v4.6, next up will be v4.7!

© 2016, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

16 August 2016

Lars Wirzenius: 20 years ago I became a Debian developer

Today it is 23 years since Ian Murdock published his intention to develop a new Linux distribution, Debian. It is also about 20 years since I became a Debian developer and made my first package upload. In the time since, a lot has happened. It's been a good twenty years. And the fun ain't over yet.

8 June 2016

Lucas Nussbaum: Re: Sysadmin Skills and University Degrees

Russell Coker wrote about Sysadmin Skills and University Degrees. I couldn't agree more that a major deficiency in Computer Science degrees is the lack of sysadmin training. It seems like most sysadmins learned most of what they know from experience. It's very hard to recruit young engineers (freshly out of university) for sysadmin jobs, and the job interviews are often a bit depressing. Sysadmin jobs are also not very popular with this public, probably because university curriculums fail to emphasize what's exciting about those jobs. However, I think I disagree rather deeply with Russell's detailed analysis. First, Version Control. Well, I think that it's pretty well covered in university curriculums nowadays. From my point of view, teaching CS at Université de Lorraine (France), mostly in Licence Professionnelle Administration de Systèmes, Réseaux et Applications à base de Logiciels Libres (warning: French), a BSc degree focusing on Linux systems administration, it's not unusual to see student projects with a mandatory use of Git. And it doesn't seem to be a major problem for students (which always surprises me). However, I wouldn't rate Version Control as the most important thing that is required for a sysadmin. Similarly, Dependencies and Backups are things that should be covered, but probably not as first-class citizens. I think that there are several pillars in the typical sysadmin knowledge. First and foremost, sysadmins need a good understanding of the inner workings of an operating system. I sometimes feel that many Operating Systems Design courses are a bit too much focused on the Design side of things. Yes, it's useful to understand the low-level mechanisms, and be able to (mentally) recreate an OS from scratch. But it's also interesting to know how real systems are actually built, and what trade-offs are involved. I very much enjoyed reading Brendan Gregg's Systems Performance: Enterprise and the Cloud because each chapter starts with a great overview of how things are in the real world, with a very good level of detail. Also, addressing OS design from the point of view of performance could be a way to turn those courses into something more attractive for students: many people like to measure, benchmark and optimize things, and it's quite easy to demonstrate how different designs, or different configurations, make a big difference in terms of performance in the context of OS design. It's possible to be a sysadmin and ignore, say, the existence of the VFS, but there's a large class of problems that you will never be able to solve. It can be a good trade-off for a curriculum (e.g. at the BSc level) to decide to ignore most of the low-level stuff, but it's important to be aware of it. Students also need to learn how to design a proper infrastructure (that meets requirements in terms of scalability, availability, security, and maybe elasticity). Yes, backups are important. But monitoring is, too. As well as high availability. In order to scale, it's important to be able to automate stuff. Russell writes that sysadmins need some programming skills, but that it's mostly scripting and basic debugging. Well, when you design an infrastructure, or when you use configuration management tools such as Puppet, in some sense, you are programming, and in terms of the need to abstract things, it's actually similar to doing object-oriented programming, with similar choices (should I use that off-the-shelf Puppet module, or re-develop my own? How should everything fit together?). 
Also, when debugging, it's often useful to be able to dig into code, understand what the developer was trying to do, and check if the expected behavior actually matches what you are seeing. It often results in spending a lot of time to create a one-line fix, and it requires very advanced programming skills. Again, it's possible to be a sysadmin with only limited software development knowledge, but there's a large class of things that you are unlikely to address properly. I think that what makes sysadmin jobs both very interesting and very challenging is that they require a very wide range of knowledge. There's often the ability to learn about new stuff (much more than in software development jobs). Of course, the difficult question is where to draw the line. What is the sysadmin knowledge that every CS graduate should have, even in curriculums not targeting sysadmin jobs? What is the sysadmin knowledge for a sysadmin BSc degree? For a sysadmin MSc degree?

6 June 2016

Reproducible builds folks: Reprotest has a preliminary CLI and configuration file handling

Author: ceridwen This is the first draft of reprotest's interface, and I welcome comments on how to improve it. At the moment, reprotest's CLI takes two mandatory arguments, the build command to run and the build artifact file to test after running the build. If the build command or build artifact have spaces, they have to be passed as strings, e.g. "debuild -b -uc -us". For optional arguments, it has --variations, which accepts a list of possible build variations to test, one or more of 'captures_environment', 'domain_host', 'filesystem', 'home', 'kernel', 'locales', 'path', 'shell', 'time', 'timezone', 'umask', and 'user_group' (see variations for more information); --dont_vary, which makes reprotest not test any variations in the given list (the default is to run all variations); --source_root, which accepts a directory to run the build command in and defaults to the current working directory; and --verbose, which will eventually enable more detailed logging. To get help for the CLI, run reprotest -h or reprotest --help. The config file has one section, basics, and the same options as the CLI, except there's no dont_vary option, and there are build_command and artifact options. If build_command and/or artifact are set in the config file, reprotest can be run without passing those as command-line arguments. Command-line arguments always override config file options. Reprotest currently searches the working directory for the config file, but it will also eventually search the user's home directory. A sample config file is below.
[basics]
build_command = setup.py sdist
artifact = dist/reprotest-0.1.tar.gz
source_root = reprotest/
variations =
  captures_environment
  domain_host
  filesystem
  home
  host
  kernel
  locales
  path
  shell
  time
  timezone
  umask
  user_group
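For comparison, a command-line invocation using the same build command and artifact might look roughly like this. This is only a sketch based on the options described above; the exact syntax for passing a list of variations to --dont_vary may differ:
reprotest "setup.py sdist" dist/reprotest-0.1.tar.gz --source_root reprotest/ --dont_vary timezone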
At the moment, the only build variations that reprotest actually tests are the environment variable variations: captures_environment, home, locales, and timezone. Over the next week, I plan to add the rest of the basic variations and accompanying tests. I also need to write tests for the CLI and the configuration file. After that, I intend to work on getting (s)chroot communication working, which will involve integrating autopkgtest code. Some of the variations require specific other packages to be installed: for instance, the locales variation currently requires the fr_CH.UTF-8 locale. Locales are a particular problem because I don't know of a way in Debian to specify that a given locale must be installed. For other packages, it's unclear to me whether I should specify them as depends or recommends: they aren't dependencies in a strict sense, but marking them as dependencies will make it easier to install a fully-functional reprotest. When reprotest runs with variations enabled that it can't test because it doesn't have the correct packages installed, I intend to have it print a warning but continue to run. tests.reproducible-builds.org also has different settings, such as different locales, for different architectures. I'm not clear on why this is. I'd prefer to avoid having to generate a giant list of variations based on architecture, but if necessary, I can do that. The prebuilder script contains variations specific to Linux, to Debian, and to pbuilder/cowbuilder. I'm not including Debian-specific variations until I get much more of the basic functionality implemented, and I'm not sure I'm going to include pbuilder-specific variations ever, because it's probably better for extensibility to other OSes, e.g. BSD, to add support for plugins or more complicated configurations. I implemented the variations by creating a function for each variation. Each function takes as input two build commands, two source trees, and two sets of environment variables and returns the same. At the moment, I'm using dictionaries for the environment variables, mutating them in-place and passing the references forward. I'm probably going to replace those at some point with an immutable mapping. While at the moment, reprotest only builds on the existing system, when I start extending it to other build environments, this will require double-dispatch, because the code that needs to be executed will depend on both the variation to be tested and the environment being built on. At the moment, I'm probably going to implement this with a dictionary with tuple keys of (build_environment, variation) or nested dictionaries. If it's necessary for code to depend on OS or architecture, too, this could end up becoming a triple or quadruple dispatch.

24 May 2016

Alberto García: I/O bursts with QEMU 2.6

QEMU 2.6 was released a few days ago. One new feature that I have been working on is the new way to configure I/O limits in disk drives to allow bursts and increase the responsiveness of the virtual machine. In this post I'll try to explain how it works. The basic settings First I will summarize the basic settings that were already available in earlier versions of QEMU. Two aspects of the disk I/O can be limited: the number of bytes per second and the number of operations per second (IOPS). For each one of them the user can set a global limit or separate limits for read and write operations. This gives us a total of six different parameters. I/O limits can be set using the throttling.* parameters of -drive, or using the QMP block_set_io_throttle command. These are the names of the parameters for both cases:
-drive block_set_io_throttle
throttling.iops-total iops
throttling.iops-read iops_rd
throttling.iops-write iops_wr
throttling.bps-total bps
throttling.bps-read bps_rd
throttling.bps-write bps_wr
It is possible to set limits for both IOPS and bps at the same time, and for each case we can decide whether to have separate read and write limits or not, but if iops-total is set then neither iops-read nor iops-write can be set. The same applies to bps-total and bps-read/write. The default value of these parameters is 0, and it means unlimited. In its most basic usage, the user can add a drive to QEMU with a limit of, say, 100 IOPS with the following -drive line:
-drive file=hd0.qcow2,throttling.iops-total=100
We can do the same using QMP. In this case all these parameters are mandatory, so we must set to 0 the ones that we don't want to limit:
     "execute": "block_set_io_throttle",
     "arguments":  
        "device": "virtio0",
        "iops": 100,
        "iops_rd": 0,
        "iops_wr": 0,
        "bps": 0,
        "bps_rd": 0,
        "bps_wr": 0
      
    
I/O bursts While the settings that we have just seen are enough to prevent the virtual machine from performing too much I/O, it can be useful to allow the user to exceed those limits occasionally. This way we can have a more responsive VM that is able to cope better with peaks of activity while keeping the average limits lower the rest of the time. Starting from QEMU 2.6, it is possible to allow the user to do bursts of I/O for a configurable amount of time. A burst is an amount of I/O that can exceed the basic limit, and there are two parameters that control them: their length and the maximum amount of I/O they allow. These two can be configured separately for each one of the six basic parameters described in the previous section, but here we'll use iops-total as an example. The I/O limit during bursts is set using iops-total-max, and the maximum length (in seconds) is set with iops-total-max-length. So if we want to configure a drive with a basic limit of 100 IOPS and allow bursts of 2000 IOPS for 60 seconds, we would do it like this (the line is split for clarity):
   -drive file=hd0.qcow2,
          throttling.iops-total=100,
          throttling.iops-total-max=2000,
          throttling.iops-total-max-length=60
Or with QMP:
     "execute": "block_set_io_throttle",
     "arguments":  
        "device": "virtio0",
        "iops": 100,
        "iops_rd": 0,
        "iops_wr": 0,
        "bps": 0,
        "bps_rd": 0,
        "bps_wr": 0,
        "iops_max": 2000,
        "iops_max_length": 60,
      
    
With this, the user can perform I/O on hd0.qcow2 at a rate of 2000 IOPS for 1 minute before it's throttled down to 100 IOPS. The user will be able to do bursts again if there's a sufficiently long period of time with unused I/O (see below for details). The default value for iops-total-max is 0 and it means that bursts are not allowed. iops-total-max-length can only be set if iops-total-max is set as well, and its default value is 1 second. Controlling the size of I/O operations When applying IOPS limits all I/O operations are treated equally regardless of their size. This means that the user can take advantage of this in order to circumvent the limits and submit one huge I/O request instead of several smaller ones. QEMU provides a setting called throttling.iops-size to prevent this from happening. This setting specifies the size (in bytes) of an I/O request for accounting purposes. Larger requests will be counted proportionally to this size. For example, if iops-size is set to 4096 then an 8KB request will be counted as two, and a 6KB request will be counted as one and a half. This only applies to requests larger than iops-size: smaller requests will always be counted as one, no matter their size. The default value of iops-size is 0 and it means that the size of the requests is never taken into account when applying IOPS limits. Applying I/O limits to groups of disks In all the examples so far we have seen how to apply limits to the I/O performed on individual drives, but QEMU allows grouping drives so they all share the same limits. This feature is available since QEMU 2.4. Please refer to the post I wrote when it was published for more details. The Leaky Bucket algorithm I/O limits in QEMU are implemented using the leaky bucket algorithm (specifically the "Leaky bucket as a meter" variant). This algorithm uses the analogy of a bucket that leaks water constantly. The water that gets into the bucket represents the I/O that has been performed, and no more I/O is allowed once the bucket is full. To see the way this corresponds to the throttling parameters in QEMU, consider the following values:
  iops-total=100
  iops-total-max=2000
  iops-total-max-length=60
The bucket is initially empty, therefore water can be added until it's full at a rate of 2000 IOPS (the burst rate). Once the bucket is full we can only add as much water as it leaks, therefore the I/O rate is reduced to 100 IOPS. If we add less water than it leaks then the bucket will start to empty, allowing for bursts again. Note that since water is leaking from the bucket even during bursts, it will take a bit more than 60 seconds at 2000 IOPS to fill it up. After those 60 seconds the bucket will have leaked 60 x 100 = 6000, allowing for 3 more seconds of I/O at 2000 IOPS. Also, due to the way the algorithm works, longer bursts can be done at a lower I/O rate, e.g. 1000 IOPS during 120 seconds. Acknowledgments As usual, my work in QEMU is sponsored by Outscale and has been made possible by Igalia and the help of the QEMU development team. Enjoy QEMU 2.6!

2 May 2016

Vincent Bernat: Pragmatic Debian packaging

While the creation of Debian packages is abundantly documented, most tutorials are targeted to packages implementing the Debian policy. Moreover, Debian packaging has a reputation of being unnecessarily difficult1 and many people prefer to use less constrained tools2 like fpm or CheckInstall. However, I would like to show how building Debian packages with the official tools can become straightforward if you bend some rules:
  1. No source package will be generated. Packages will be built directly from a checkout of a VCS repository.
  2. Additional dependencies can be downloaded during the build. Packaging each dependency individually is painstaking work, notably when you have to deal with fast-paced ecosystems like Java, Javascript and Go.
  3. The produced packages may bundle dependencies. This is likely to raise some concerns about security and long-term maintenance, but this is a common trade-off in many ecosystems, notably Java, Javascript and Go.

Pragmatic packages 101 In the Debian archive, you have two kinds of packages: the source packages and the binary packages. Each binary package is built from a source package. You need a name for each package. As stated in the introduction, we won't generate a source package but we will work with its unpacked form which is any source tree containing a debian/ directory. In our examples, we will start with a source tree containing only a debian/ directory but you are free to include this debian/ directory into an existing project. As an example, we will package memcached, a distributed memory cache. There are four files to create:
  • debian/compat,
  • debian/changelog,
  • debian/control, and
  • debian/rules.
The first one is easy. Just put 9 in it:
echo 9 > debian/compat
The second one has the following content:
memcached (0-0) UNRELEASED; urgency=medium
  * Fake entry
 -- Happy Packager <happy@example.com>  Tue, 19 Apr 2016 22:27:05 +0200
The only important information is the name of the source package, memcached, on the first line. Everything else can be left as is as it won't influence the generated binary packages.

The control file debian/control describes the metadata of both the source package and the generated binary packages. We have to write a block for each of them.
Source: memcached
Maintainer: Vincent Bernat <bernat@debian.org>

Package: memcached
Architecture: any
Description: high-performance memory object caching system
The source package is called memcached. We have to use the same name as in debian/changelog. We generate only one binary package: memcached. In the remainder of the example, when you see memcached, this is the name of a binary package. The Architecture field should be set to either any or all. Use all exclusively if the package contains only arch-independent files. If in doubt, just stick to any. The Description field contains a short description of the binary package.

The build recipe The last mandatory file is debian/rules. It's the recipe of the package. We need to retrieve memcached, build it and install its file tree in debian/memcached/. It looks like this:
#!/usr/bin/make -f
DISTRIBUTION = $(shell lsb_release -sr)
VERSION = 1.4.25
PACKAGEVERSION = $(VERSION)-0~$(DISTRIBUTION)0
TARBALL = memcached-$(VERSION).tar.gz
URL = http://www.memcached.org/files/$(TARBALL)
%:
    dh $@
override_dh_auto_clean:
override_dh_auto_test:
override_dh_auto_build:
override_dh_auto_install:
    wget -N --progress=dot:mega $(URL)
    tar --strip-components=1 -xf $(TARBALL)
    ./configure --prefix=/usr
    make
    make install DESTDIR=debian/memcached
override_dh_gencontrol:
    dh_gencontrol -- -v$(PACKAGEVERSION)
The empty targets override_dh_auto_clean, override_dh_auto_test and override_dh_auto_build keep debhelper from being too smart. The override_dh_gencontrol target sets the package version3 without updating debian/changelog. If you ignore the slight boilerplate, the recipe is quite similar to what you would have done with fpm:
DISTRIBUTION=$(lsb_release -sr)
VERSION=1.4.25
PACKAGEVERSION=${VERSION}-0~${DISTRIBUTION}0
TARBALL=memcached-${VERSION}.tar.gz
URL=http://www.memcached.org/files/${TARBALL}
wget -N --progress=dot:mega ${URL}
tar --strip-components=1 -xf ${TARBALL}
./configure --prefix=/usr
make
make install DESTDIR=/tmp/installdir
# Build the final package
fpm -s dir -t deb \
    -n memcached \
    -v ${PACKAGEVERSION} \
    -C /tmp/installdir \
    --description "high-performance memory object caching system"
You can review the whole package tree on GitHub and build it with dpkg-buildpackage -us -uc -b.
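If you want to check what actually ended up in the binary package before installing it, you can list its contents; the exact filename below is only an example and depends on the version and distribution suffix:
dpkg-buildpackage -us -uc -b
dpkg -c ../memcached_*.deb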

Pragmatic packages 102 At this point, we can iterate and add several improvements to our memcached package. None of those are mandatory but they are usually worth the additional effort.

Build dependencies Our initial build recipe only works when several packages are installed, like wget and libevent-dev. They are not present on all Debian systems. You can easily express that you need them by adding a Build-Depends section for the source package in debian/control:
Source: memcached
Build-Depends: debhelper (>= 9),
               wget, ca-certificates, lsb-release,
               libevent-dev
Always specify the debhelper (>= 9) dependency as we heavily rely on it. We don't require make or a C compiler because it is assumed that the build-essential meta-package is installed and pulls those in. dpkg-buildpackage will complain if the dependencies are not met. If you want to install those packages from your CI system, you can use the following command4:
mk-build-deps \
    -t 'apt-get -o Debug::pkgProblemResolver=yes --no-install-recommends -qqy' \
    -i -r debian/control
You may also want to investigate pbuilder or sbuild, two tools to build Debian packages in a clean isolated environment.

Runtime dependencies If the resulting package is installed on a freshly installed machine, it won't work because it will be missing libevent, a required library for memcached. You can express the dependencies needed by each binary package by adding a Depends field. Moreover, for dynamic libraries, you can automatically get the right dependencies by using some substitution variables:
Package: memcached
Depends: ${misc:Depends}, ${shlibs:Depends}
The resulting package will contain the following information:
$ dpkg -I ../memcached_1.4.25-0\~unstable0_amd64.deb | grep Depends
 Depends: libc6 (>= 2.17), libevent-2.0-5 (>= 2.0.10-stable)

Integration with init system Most packaged daemons come with some integration with the init system. This integration ensures the daemon will be started on boot and restarted on upgrade. For Debian-based distributions, there are several init systems available. The most prominent ones are:
  • System-V init is the historical init system. More modern inits are able to reuse scripts written for this init, so this is a safe common denominator for packaged daemons.
  • Upstart is the less-historical init system for Ubuntu (used in Ubuntu 14.10 and previous releases).
  • systemd is the default init system for Debian since Jessie and for Ubuntu since 15.04.
Writing a correct script for the System-V init is error-prone. Therefore, I usually prefer to provide a native configuration file for the default init system of the targeted distribution (Upstart and systemd).

System-V If you want to provide a System-V init script, have a look at /etc/init.d/skeleton on the most ancient distribution you want to target and adapt it5. Put the result in debian/memcached.init. It will be installed at the right place, invoked on install, upgrade and removal. On Debian-based systems, many init scripts allow user customizations by providing a /etc/default/memcached file. You can ship one by putting its content in debian/memcached.default.
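For example, a hypothetical debian/memcached.default could look like the sketch below. The variable names and values are only illustrative and must match whatever your init script, Upstart job or systemd unit (shown in the next sections) actually expects:
# Defaults for memcached, sourced by the init integration.
# Values are examples only; adjust them to your needs.
USER=_memcached
PORT=11211
CACHESIZE=64
MAXCONN=1024
OPTIONS=""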

Upstart Providing an Upstart job is similar: put it in debian/memcached.upstart. For example:
description "memcached daemon"
start on runlevel [2345]
stop on runlevel [!2345]
respawn
respawn limit 5 60
expect daemon
script
  . /etc/default/memcached
  exec memcached -d -u $USER -p $PORT -m $CACHESIZE -c $MAXCONN $OPTIONS
end script
When writing an Upstart job, the most important directive is expect. Be sure to get it right. Here, we use expect daemon and memcached is started with the -d flag.

systemd Providing a systemd unit is a bit more complex. The content of the file should go in debian/memcached.service. For example:
[Unit]
Description=memcached daemon
After=network.target
[Service]
Type=forking
EnvironmentFile=/etc/default/memcached
ExecStart=/usr/bin/memcached -d -u $USER -p $PORT -m $CACHESIZE -c $MAXCONN $OPTIONS
Restart=on-failure
[Install]
WantedBy=multi-user.target
We reuse /etc/default/memcached even if it is not considered a good practice with systemd6. Like for Upstart, the directive Type is quite important. We used forking as memcached is started with the -d flag. You also need to add a build-dependency to dh-systemd in debian/control:
Source: memcached
Build-Depends: debhelper (>= 9),
               wget, ca-certificates, lsb-release,
               libevent-dev,
               dh-systemd
And you need to modify the default rule in debian/rules:
%:
    dh $@ --with systemd
The extra complexity is a bit unfortunate but systemd integration is not part of debhelper7. Without those additional modifications, the unit will get installed but you won't get a proper integration and the service won't be enabled on install or boot.
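Once a package built with this integration is installed on a systemd machine, you can verify that the unit was enabled and started, assuming it keeps the memcached name:
systemctl is-enabled memcached
systemctl status memcached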

Dedicated user Many daemons don't need to run as root and it is a good practice to ship a dedicated user. In the case of memcached, we can provide a _memcached user8. Add a debian/memcached.postinst file with the following content:
#!/bin/sh
set -e
case "$1" in
    configure)
        adduser --system --disabled-password --disabled-login --home /var/empty \
                --no-create-home --quiet --force-badname --group _memcached
        ;;
esac
#DEBHELPER#
exit 0
There is no cleanup of the user when the package is removed for two reasons:
  1. Less stuff to write.
  2. The user could still own some files.
The utility adduser will do the right thing whether the requested user already exists or not. You need to add it as a dependency in debian/control:
Package: memcached
Depends: ${misc:Depends}, ${shlibs:Depends}, adduser
The #DEBHELPER# marker is important as it will be replaced by some code to handle the service configuration files (or some other stuff). You can review the whole package tree on GitHub and build it with dpkg-buildpackage -us -uc -b.

Pragmatic packages 103 It is possible to leverage debhelper to reduce the recipe size and to make it more declarative. This section is quite optional and it requires understanding a bit more how a Debian package is built. Feel free to skip it.

The big picture There are four steps to build a regular Debian package:
  1. debian/rules clean should clean the source tree to make it pristine.
  2. debian/rules build should trigger the build. For an autoconf-based software, like memcached, this step should execute something like ./configure && make.
  3. debian/rules install should install the file tree of each binary package. For an autoconf-based software, this step should execute make install DESTDIR=debian/memcached.
  4. debian/rules binary will pack the different file trees into binary packages.
You don't directly write each of those targets. Instead, you let dh, a component of debhelper, do most of the work. The following debian/rules file should do almost everything correctly with many source packages:
#!/usr/bin/make -f
%:
    dh $@
For each of the four targets described above, you can run dh with --no-act to see what it would do. For example:
$ dh build --no-act
   dh_testdir
   dh_update_autotools_config
   dh_auto_configure
   dh_auto_build
   dh_auto_test
Each of those helpers has a manual page. Helpers starting with dh_auto_ are a bit "magic". For example, dh_auto_configure will try to automatically configure a package prior to building: it will detect the build system and invoke ./configure, cmake or Makefile.PL. If one of the helpers does not do the right thing, you can replace it by using an override target:
override_dh_auto_configure:
    ./configure --with-some-grog
Those helpers are also configurable, so you can just alter their behaviour a bit by invoking them with additional options:
override_dh_auto_configure:
    dh_auto_configure -- --with-some-grog
This way, ./configure will be called with your custom flag but also with a lot of default flags like --prefix=/usr for better integration. In the initial memcached example, we overrode all those magic targets. dh_auto_clean, dh_auto_test and dh_auto_build are converted to no-ops to avoid any unexpected behaviour. dh_auto_install is hijacked to do the whole build process. Additionally, we modified the behavior of the dh_gencontrol helper by forcing the version number instead of using the one from debian/changelog.

Automatic builds As memcached is an autoconf-enabled package, dh knows how to build it: ./configure && make && make install. Therefore, we can let it handle most of the work with this debian/rules file:
#!/usr/bin/make -f
DISTRIBUTION = $(shell lsb_release -sr)
VERSION = 1.4.25
PACKAGEVERSION = $(VERSION)-0~$(DISTRIBUTION)0
TARBALL = memcached-$(VERSION).tar.gz
URL = http://www.memcached.org/files/$(TARBALL)
%:
    dh $@ --with systemd
override_dh_auto_clean:
    wget -N --progress=dot:mega $(URL)
    tar --strip-components=1 -xf $(TARBALL)
override_dh_auto_test:
    # Don't run the whitespace test
    rm t/whitespace.t
    dh_auto_test
override_dh_gencontrol:
    dh_gencontrol -- -v$(PACKAGEVERSION)
The dh_auto_clean target is hijacked to download and set up the source tree9. We don't override the dh_auto_configure step, so dh will execute the ./configure script with the appropriate options. We don't override the dh_auto_build step either: dh will execute make. dh_auto_test is invoked after the build and it will run the memcached test suite. We need to override it because one of the tests complains about odd whitespace in the debian/ directory. We suppress this rogue test and let dh_auto_test execute the test suite. dh_auto_install is not overridden either, so dh will execute some variant of make install. To get a better sense of the difference, here is a diff:
--- memcached-intermediate/debian/rules 2016-04-30 14:02:37.425593362 +0200
+++ memcached/debian/rules  2016-05-01 14:55:15.815063835 +0200
@@ -12,10 +12,9 @@
 override_dh_auto_clean:
-override_dh_auto_test:
-override_dh_auto_build:
-override_dh_auto_install:
    wget -N --progress=dot:mega $(URL)
    tar --strip-components=1 -xf $(TARBALL)
-   ./configure --prefix=/usr
-   make
-   make install DESTDIR=debian/memcached
+
+override_dh_auto_test:
+   # Don't run the whitespace test
+   rm t/whitespace.t
+   dh_auto_test
It is up to you to decide if dh can do some work for you, but you could try to start from a minimal debian/rules and only override some targets.

Install additional files While make install installed the essential files for memcached, you may want to put additional files in the binary package. You could use cp in your build recipe, but you can also declare them:
  • files listed in debian/memcached.docs will be copied to /usr/share/doc/memcached by dh_installdocs,
  • files listed in debian/memcached.examples will be copied to /usr/share/doc/memcached/examples by dh_installexamples,
  • files listed in debian/memcached.manpages will be copied to the appropriate subdirectory of /usr/share/man by dh_installman.
Here is an example using wildcards for debian/memcached.docs:
doc/*.txt
If you need to copy some files to an arbitrary location, you can list them along with their destination directories in debian/memcached.install and dh_install will take care of the copy. Here is an example:
scripts/memcached-tool usr/bin
Using those files makes the build process more declarative. It is a matter of taste and you are free to use cp in debian/rules instead. You can review the whole package tree on GitHub.

Other examples The GitHub repository contains some additional examples. They all follow the same scheme:
  • dh_auto_clean is hijacked to download and setup the source tree
  • dh_gencontrol is modified to use a computed version
Notably, you'll find daemons in Java, Go, Python and Node.js. The goal of those examples is to demonstrate that using Debian tools to build Debian packages can be straightforward. Hope this helps.

  1. People may remember the time before debhelper 7.0.50 (circa 2009) where debian/rules was a daunting beast. However, nowadays, the boilerplate is quite reduced.
  2. The complexity is not the only reason. Those alternative tools enable the creation of RPM packages, something that Debian tools obviously don't.
  3. There are many ways to version a package. Again, if you want to be pragmatic, the proposed solution should be good enough for Ubuntu. On Debian, it doesn't cover upgrades from one distribution version to another, but we assume that nowadays, systems get reinstalled instead of being upgraded.
  4. You also need to install the devscripts and equivs packages.
  5. It's also possible to use a script provided by upstream. However, there is no such thing as an init script that works on all distributions. Compare the proposed script with the skeleton, check if it is using start-stop-daemon and if it sources /lib/lsb/init-functions before considering it. If it seems to fit, you can install it yourself in debian/memcached/etc/init.d/. debhelper will ensure its proper integration.
  6. Instead, a user wanting to customize the options is expected to edit the unit with systemctl edit.
  7. See #822670
  8. The Debian Policy doesn't provide any hint for the naming convention of those system users. A common usage is to prefix the daemon name with an underscore (like _memcached). Another common usage is to use Debian- as a prefix. The main drawback of the latter solution is that the name is likely to be replaced by the UID in ps and top because of its length.
  9. We could call dh_auto_clean at the end of the target to let it invoke make clean. However, it is assumed that a fresh checkout is used before each build.

26 April 2016

Matthias Klumpp: Why are AppStream metainfo files XML data?

This is a question raised quite often, the last time in a blogpost by Thomas, so I thought it would be a good idea to give a slightly longer explanation (and also create an article to link to). There are basically three reasons for using XML as the default format for metainfo files: 1. XML is easily forward/backward compatible, while YAML is not This is a matter of extending the AppStream metainfo files with new entries, or adapting existing entries to new needs. Take this example XML line for defining an icon for an application:
<icon type="cached">foobar.png</icon>
and now the equivalent YAML:
Icons:
  cached: foobar.png
Now consider we want to add a width and height property to the icons, because we started to allow more than one icon size. Easy for the XML:
<icon type="cached" width="128" height="128">foobar.png</icon>
This line of XML can be read correctly by both old parsers, which will just see the icon as before without reading the size information, and new parsers, which can make use of the additional information if they want. The change is both forward and backward compatible. This looks different with the YAML file. The foobar.png is a string type, and parsers will expect a string as the value for the cached key, while we would need a dictionary there to include the additional width/height information:
Icons:
  cached: name: foobar.png
          width: 128
          height: 128
The change shown above will break existing parsers though. Of course, we could add a cached2 key, but that would require people to write two entries, to keep compatibility with older parsers:
Icons:
  cached: foobar.png
  cached2: name: foobar.png
          width: 128
          height: 128
Less than ideal. While there are ways to break compatibility in XML documents too, as well as ways to design YAML documents in a way which minimizes the risk of breaking compatibility later, keeping the format future-proof is far easier with XML compared to YAML (and sometimes simply not possible with YAML documents). This makes XML a good choice for this use case, since we cannot easily do transitions with thousands of independent upstream projects, and need to care about backwards compatibility. 2. Translating YAML is not much fun A property of AppStream metainfo files is that they can be easily translated into multiple languages. For that, tools like intltool and itstool exist to aid with translating XML using Gettext files. This can be done at project build-time, keeping a clean, minimal XML file, or before, storing the translated strings directly in the XML document. Generally, YAML files can be translated too. Take the following example (shamelessly copied from Dolphin):
<summary>File Manager</summary>
<summary xml:lang="bs">Upravitelj datoteka</summary>
<summary xml:lang="cs">Spr vce soubor </summary>
<summary xml:lang="da">Filh ndtering</summary>
This would become something like this in YAML:
Summary:
  C: File Manager
  bs: Upravitelj datoteka
  cs: Správce souborů
  da: Filhåndtering
Looks manageable, right? Now, AppStream also covers long descriptions, where individual paragraphs can be translated by the translators. That looks like this in XML:
<description>
  <p>Dolphin is a lightweight file manager. It has been designed with ease of use and simplicity in mind, while still allowing flexibility and customisation. This means that you can do your file management exactly the way you want to do it.</p>
  <p xml:lang="de">Dolphin ist ein schlankes Programm zur Dateiverwaltung. Es wurde mit dem Ziel entwickelt, einfach in der Anwendung, dabei aber auch flexibel und anpassungsf hig zu sein. Sie k nnen daher Ihre Dateiverwaltungsaufgaben genau nach Ihren Bed rfnissen ausf hren.</p>
  <p>Features:</p>
  <p xml:lang="de">Funktionen:</p>
  <p xml:lang="es">Caracter sticas:</p>
  <ul>
  <li>Navigation (or breadcrumb) bar for URLs, allowing you to quickly navigate through the hierarchy of files and folders.</li>
  <li xml:lang="de">Navigationsleiste f r Adressen (auch editierbar), mit der Sie schnell durch die Hierarchie der Dateien und Ordner navigieren k nnen.</li>
  <li xml:lang="es">barra de navegaci n (o de ruta completa) para URL que permite navegar r pidamente a trav s de la jerarqu a de archivos y carpetas.</li>
  <li>Supports several different kinds of view styles and properties and allows you to configure the view exactly how you want it.</li>
  ....
  </ul>
</description>
Now, how would you represent this in YAML? Since we need to preserve the paragraph and enumeration markup somehow, and creating a large chain of YAML dictionaries is not really a sane option, the only choices would be: In both cases, we would lose the ability to translate individual paragraphs, which also means that as soon as the developer changes the original text in YAML, translators would need to translate the whole bunch again, which is inconvenient. On top of that, there are no tools to translate YAML properly that I am aware of, so we would need to write those too. 3. Allowing XML and YAML makes a confusing story and adds complexity. While adding YAML as a format would not be too hard, given that we already support it for DEP-11 distro metadata (Debian uses this), it would make the business of creating metainfo files more confusing. At the moment, we have a clear story: write the XML, store it in /usr/share/metainfo, use standard tools to translate the translatable entries. Adding YAML to the mix adds an additional choice that needs to be supported for eternity and also has the problems mentioned above. I wanted to add YAML as a format for AppStream, and we discussed this at the hackfest as well, but in the end I think it isn't worth the pain of supporting it for upstream projects (remember, someone needs to maintain the parsers and specification too and keep XML and YAML in sync and updated). Don't get me wrong, I love YAML, but for translated metadata which needs a guarantee on format stability it is not the ideal choice. So yeah, XML isn't fun to write by hand. But for this case, XML is a good choice.

29 February 2016

Daniel Silverstone: Kicad hacking - Intra-sheet links and ERC

This is a bit of an odd posting since it's about something I've done but is also here to help me explain why I did it and thus perhaps encourage some discussion around the topic within the Kicad community...
Recently (as you will know if you follow this blog anywhere it is syndicated) I have started playing with Kicad for the development of some hardware projects I've had a desire for. In addition, some of you may be aware that I used to work for a hardware/software consultancy called Simtec, and there I got to play for a while with an EDA tool called Mentor Designview. Mentor was an expensive, slow, clunky, old-school EDA tool, but I grew to understand and like the workflow. I spent time looking at gEDA and Eagle when I wanted to get back into hardware hacking for my own ends; but neither did I really click with. On the other hand, a mere 10 minutes with Kicad and I knew I had found the tool I wanted to work with long-term. I designed the beer'o'meter project (a flow meter for the pub we are somehow intimately involved with) and then started on my first personal surface-mount project -- SamDAC which is a DAC designed to work with our HiFi in our study at home. As I worked on the SamDAC project, I realised that I was missing a very particular thing from Mentor, something which I had low-level been annoyed by while looking at other EDA tools -- Kicad lacks a mechanism to mark a wire as being linked to somewhere else on the same sheet. Almost all of the EDA tools I've looked at seem to lack this nicety, and honestly I miss it greatly, so I figured it was time to see if I could successfully hack on Kicad. Kicad is written in C++, and it has been mumble mumble years since I last did any C++, either for personal hacking or professionally, so it took a little while for that part of my brain to kick back in enough for me to grok the codebase. Kicad is not a small project, taking around ten minutes to build on my not-inconsiderable computer. And while it beavered away building, I spent time looking around the source code, particularly the schematic editor eeschema. To skip ahead a bit, after a couple of days of hacking around, I had a proof-of-concept for the intra-sheet links which I had been missing from my days with Mentor, and some ERC (electrical rules checking) to go alongside that to help produce schematics without unwanted "sharp corners". In total, I added: I forked the Kicad mirror on Github and pushed my own branch with this work to my Kicad fork. All of this is meant to allow schematic capture engineers to more clearly state their intentions regarding what they are drawing. The intra-sheet link could be thought of like a no-connect element, except instead of saying "this explicitly goes nowhere" we're saying "this explicitly goes somewhere else on this sheet, you can go look for it". Obviously, people who dislike (or simply don't want to use) such intra-sheet link elements can just disable that ERC tickybox and not be bothered by them in the least (well except for the toolbar button and menu item I suppose). Whether this work gets accepted into Kicad, or festers and dies on the vine, it was good fun developing it and I'd like to illustrate how it could help you, and why I wrote it in the first place:

A contrived story
Note: while this story is meant to be taken seriously, it is somewhat contrived; the examples are likely electrical madness, but please just think about the purpose of the checks etc.
To help to illustrate the feature and why it exists, I'd like to tell you a somewhat contrived story about Fred. Fred is a schematic capture engineer and his main job is to review schematics generated by his colleagues. Fred and his colleagues work with Kicad (hurrah) but of late they've been having a few issues with being able to cleanly review schematics. Fred's colleagues are not the neatest of engineers. In particular they tend to be quite lazy when it comes to running busses, which are not (for example) address and data busses, around their designs and they tend to simply have wires which end in mid-space and pick up somewhere else on the sheet. All this is perfectly reasonable of course, and Kicad handles it with aplomb. Sadly it seems quite error prone for Fred's workplace. As an example, Fred's colleague Ben has been designing the power supply for a particular board. As with most power supplies, plenty of capacitors are needed to stabilise the regulators and smooth the output. In the example below, the intent is that all of the capacitors are on the FOO net. Contrived problem example 1 Sadly there's a missing junction and/or slightly misplaced label in the upper section which means that C2 and C3 simply don't join to the FOO net. This could easily be missed, but the ERC can't spot it at all since there's more than one thing on each net, so the pins of the capacitors are connected to something. Fred is very sad that this kind of problem can sometimes escape notice by the schematic designer Ben, Fred himself, and the layout engineer, resulting in boards which simply do not work. Fred takes it upon himself to request that the strict wiring checks ERC is made mandatory for all designs, and that the design engineers be required to use intra-sheet link symbols when they have signals which wander off to other parts of the sheet like FOO does in the example. Without any further schematic changes, strict wiring checks enabled gives the following points of ERC concern for Ben to think about: Contrived problem example 2 As you can see, the ERC is pointing at the wire ends and the warnings are simply that the wires are dangling and that this is not acceptable. This warning is very like the pin-not-connected warnings which can be silenced with an explicit no-connect schematic element. Ben, being a well behaved and gentle soul, obeys the design edicts from Fred and seeks out the intra-sheet link symbols, clearing off the ERC markers and then adding intra-sheet links to his design: Contrived problem example 3 This silences the dangling end ERC check, which is good, however it results in another ERC warning: Contrived problem example 4 This time, the warning for Ben to deal with is that the intra-sheet links are pointless. Each exists without a companion to link to because of the net name hiccough in the top section. It takes Ben a moment to realise that the mistake which has been made is that a junction is missing in the top section. He adds the junction and bingo the ERC is clean once more: Contrived problem example 5 Now, this might not seem like much gain for so much effort, but Ben can now be more confident that the FOO net is properly linked across his design and Fred can know, when he looks at the top part of the design, that Ben intended for the FOO net to go somewhere else on the sheet and he can look for it.

Why do this at all? Okay, dropping out of our story now, let's discuss why these ERC checks are worthwhile and why the intra-sheet link schematic element is needed.
Note: This bit is here to remind me of why I did the work, and to hopefully explain a little more about why I think it's worth adding to Kicad...
Designers are (one assumes) human beings. As humans we (and I count myself here too) are prone to mistakes. Sadly mistakes are often subtle and could easily be thought of as deliberate if the right thought processes are not followed carefully when reviewing. Anyone who has ever done code review, proofread a document, or performed any such activity, will be quite familiar with the problems which can be introduced by a syntactically and semantically valid construct which simply turns out to be wrong in the greater context. When drawing designs, I often end up with bits of wire sticking out of schematic sections which are not yet complete. Sadly if I sleep between design sessions, I often lose track of whether such a dangling wire is meant to be attached to more stuff, or is simply left because the net is picked up elsewhere on the sheet. With intra-sheet link elements available, if I had intended the latter, I'd have just dropped such an element on the end of the wire before I stopped for the day. Also, when drawing designs, I sometimes forget to label a wire, especially if it has just passed through a filter or current-limiting resistor or similar. As such, even with intra-sheet link elements to show me when I mean for a net to go bimbling off across the sheet, I can sometimes end up with unnamed nets whose capacitors end up not used for anything useful. This is where the ERC comes in. By having the ERC complain if a wire dangles -- the design engineer won't forget to add links (or check more explicitly if the wire is meant to be attached to something else). By having junctions which don't actually link anything warned about, the engineer can't just slap a junction blob down on the end of a wire to silence that warning, since that doesn't mean anything to a reviewer later down the line. By having the ERC warn if a net has exactly one intra-sheet link attached to it, missing net names and errors such as that shown in my contrived example above can be spotted and corrected. Ultimately this entire piece of work is about ensuring that the intent of the design engineer is captured clearly in the schematic. If the design engineer meant to leave that wire dangling because it's joining to another bit of wire elsewhere on the sheet, they can put the intra-sheet links in to show this. The associated ERC checks are there purely to ensure that the validation of this intent is not bypassed accidentally, or deliberately, in order to make the use of this more worthwhile and to increase the usefulness of the ERC on designs where signals jump around on sheets where wiring them up directly would just create a mess.

17 February 2016

Riku Voipio: Ancient Linux swag

Since I've now been using Linux for 20 years, I've dug up some artifacts from the early journey.
  1. First the book, from late 1995. This is from before Tux, so the penguin on the cover is just a coincidence. The book came with a slackware 3.0 CD, which was my entrance to Linux. Today, almost all of the book is outdated - slackware and lilo install? printing with lpr? mtools and dosemu? ftp, telnet with SLIP dialup? Manually configuring XFree86 and fvwm? How I miss those times!* The only parts of the book that are still valid are the shell and vi guides. I didn't read the latter, and instead imported my favorite editor from DOS, FTE.
  2. Fast forward some years, into my first programming job. Ready to advertise the Linux revolution, I bought the mug on the right. Nobody else would have a Tux mug, so nobody would accidentally take my mug from the office dishwasher. That only worked at my first workplace (a huge and nationally hated IT consultant house). The next workplace was a mobile gaming startup (in 2001, I was there before it was trendy!) - and there were already plenty of Linux mugs when I joined...
  3. While today it may be hard to imagine, in those days using Microsoft Office tools was mandatory. That leads to the third piece of memorabilia in the picture. WordPerfect for Linux existed for a brief while, and in the box (can you imagine, software came in physical boxes?) came a Tux plush.
* Wait no, I don't miss those times at all

2 January 2016

Iain R. Learmonth: Fun in Hamburg Airport

Today I'm travelling home from 32c3, back to Aberdeen via London from Hamburg Airport. My experience passing through security to get to the gates this morning has angered me to the point that I had to vent. They have full-body scanners installed, which in my opinion are a massive invasion of my personal privacy. As I approached the security checkpoint, I noticed a sign where it stated that use of the scanner was not mandatory and that a manual screening process was available. Once I'd loaded my bags into the trays, I asked the security officer how to opt for the manual screening at which point I was told it was not available due to a lack of staff. I was forced to allow for an intimate scan to take place or miss my flight. I'm not really in a position to be making alternative arrangements for travel and so I had to submit but it has left me outraged. Privacy is about the ability to choose what I share and who I share it with. This was meant to be a right protected by the ECHR, but I guess human rights are outdated now we're entering 2016. Bring on the dystopian future!

22 December 2015

Dmitry Shachnev: ReText 5.3 released

On Sunday I released ReText 5.3, and here finally comes the official announcement. Highlights in this release are: Also, a week before ReText 5.3 a new version of PyMarkups was released, bringing enhanced support for the Textile markup. You can now edit Textile files in ReText too, provided that the python-textile module for Python 3 is installed. As usual, you can get the latest release from PyPI or from the Debian/Ubuntu repositories. Please report any bugs you find to our issue tracker.

15 December 2015

Antoine Beaupr : Using a Yubikey NEO for SSH and OpenPGP on Debian jessie

I recently ordered two Yubikey devices from Yubico, partly because of a special offer from Github. I ordered both a Yubikey NEO and a Yubikey 4, although I am not sure I remember why I ordered two - you can see their Yubikey product comparison if you want to figure that out, but basically, the main difference is that the NEO has support for NFC while the "4" has support for larger RSA key sizes (4096). This article details my experiment on the matter. It is partly based on first hand experience, but also links to various other tutorials that helped me along the way. Especially thanks to folks on various IRC channels that really helped me out in understanding this. My objective in getting a hardware security token like this was three-fold:
  1. use 2FA on important websites like Github, to improve the security of critical infrastructure (like my Borg backup software)
  2. login to remote SSH servers without exposing my password or my private key material on third party computers
  3. store OpenPGP key material on the key securely, so that the private key material can never be compromised
To make a long story short: this article documents step 2 and implicitly step 3 (because I use OpenPGP to log in to SSH servers). However it is not possible to use the key on arbitrary third party computers, given how much setup was necessary to make the thing work at all. 2FA on the Github site completely failed, but could be used on other sites, although this is not covered by this article. I have also not experimented in detail with the other ways the Yubikey can be used (sorry for the acronym flood) as: Update: OATH works! It is easy to configure and I added a section below.

Not recommended After experimenting with the device and doing a little more research, I am not sure it was the right decision to buy a Yubikey. I would not recommend buying Yubikey devices because they don't allow changing the firmware, making the device basically proprietary, even in the face of an embarrassing security vulnerability on the Yubikey NEO that came out in 2015. A security device, obviously, should be as open as the protocols it uses, otherwise it's basically impossible to trust that the crypto hasn't been backdoored or compromised, or, in this case, is vulnerable to the simplest drive-by attacks. Furthermore, it turns out that the primary use case that Github was promoting is actually not working as advertised: to use the Yubikey on Github, you actually first need to configure 2FA with another tool, either with your phone's text messages (SMS) or with something like Google Authenticator. After contacting Github support, they explained that the Yubikey is seen as a "backup device", which seems really odd to me, especially considering the promotion and the fact that I don't have a "smart" (aka "Google", it seems these days) phone or the desire to share my personal phone number with Github. Finally, as I mentioned before, the fact that those devices are fairly new and the configuration necessary to make them work at all is completely obtuse, non-standardized or at least not available by default on arbitrary computers makes them basically impossible to use on other computers than your own specially crafted gems.

Plugging it in The Yubikey, when inserted into a USB port, seems to be detected properly. It shows up both as a USB keyboard and a generic device.
déc 14 17:23:26 angela kernel: input: Yubico Yubikey NEO OTP+U2F as /devices/pci0000:00/0000:00:12.0/usb3/3-2/3-2:1.0/0003:1050:0114.0016/input/input127
déc 14 17:23:26 angela kernel: hid-generic 0003:1050:0114.0016: input,hidraw3: USB HID v1.10 Keyboard [Yubico Yubikey NEO OTP+U2F] on usb-0000:00:12.0-2/input0
déc 14 17:23:26 angela kernel: hid-generic 0003:1050:0114.0017: hiddev0,hidraw4: USB HID v1.10 Device [Yubico Yubikey NEO OTP+U2F] on usb-0000:00:12.0-2/input1
We'll be changing this now - we want to support OTP, U2F and CCID. Don't worry about those acronyms now, but U2F is for the web, CCID is for GPG/SSH, and OTP is for the One Time Passwords stuff mentioned earlier. I am using the Yubikey Personalization tool from stretch because the jessie one is too old, according to Gorzen. Indeed, I found out that the jessie version doesn't ship with the proper udev rules. Also, note that we need to run with sudo, otherwise we get a permission denied error:
$ sudo apt install yubikey-personalization/stretch
$ sudo ykpersonalize -m86
Firmware version 3.4.3 Touch level 1541 Program sequence 1
The USB mode will be set to: 0x86
Commit? (y/n) [n]: y
To understand better what the above does, see the NEO composite device documentation. The next step is to reconnect the key, for the udev rules to kick in. If you were like me, you enthusiastically plugged in the device before installing the yubikey-personalization package, and the udev rules were not present then.
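To double-check that the new USB mode took effect after replugging, the device list and the kernel log are enough; a quick sketch:
lsusb | grep -i yubico    # the product string should now reflect the enabled modes
dmesg | tail -n 20        # look for the re-enumerated interfaces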

Configuring a PIN Various operations will require you to enter a PIN when talking to the key. The default PIN is 123456 and the default admin PIN is 12345678. You will want to change that, otherwise someone that gets a hold of your key could do any operation without your consent. For this, you need to use:
$ gpg --card-edit
> passwd
> admin
> passwd
Be sure to remember those passwords! Of course, the key material on the Yubikey can be revoked when you lose the key, but only if you still have control of the master key, or if you have an OpenPGP revocation certificate (which you should have).

Configuring GPG To do OpenPGP operations (like decryption, signatures and so on), or SSH operations (like authentication on a remote server), you need to talk with GPG. Yes, OpenPGP keys are RSA keys that can be used to authenticate with SSH servers, that's not new and I have already been doing this with Monkeysphere for a while. Now the challenge is how to make GPG talk with the Yubikey. So the next step is to see if gpg can see the key alright, as described in the Yubikey importing keys howto - you will need first to install scdaemon and pcscd (according to this howto) for gpg-agent to be able to talk with the key:
$ sudo apt install scdaemon gnupg-agent pcscd
$ gpg-connect-agent --hex "scd apdu 00 f1 00 00" /bye
ERR 100663404 Card error <SCD>
Well, that failed. At this point, touching the key types a bunch of seemingly random characters wherever my cursor is sitting - fun but still totally useless. That was because I had failed to reconnect the key: make sure the udev rules are in place, reconnect the key, and then the above should work:
$ gpg-connect-agent --hex "scd apdu 00 f1 00 00" /bye
D[0000]  01 00 10 90 00                                     .....
OK
This shows it is running the firmware 1.10, which is not vulnerable to the infamous security issue. (Note: I also happened to install opensc and gpgsm because of this suggestion but I am not sure they are required at all.)

Using SSH To make GPG work with SSH, you need somehow to start gpg-agent with ssh support, for example with:
gpg-agent --daemon --enable-ssh-support bash
Of course, this will work better if it's started with your Xsession. Such an agent should already be started, so you just need to add the ssh emulation to its configuration file and restart your X session.
echo 'enable-ssh-support' >> .gnupg/gpg-agent.conf
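If you want to test this right away without restarting the session, a minimal sketch (assuming the GnuPG 2.0 agent shipped in jessie) is to start an agent by hand and import its environment into the current shell:
# start an agent with ssh support and pull its variables into this shell
eval "$(gpg-agent --daemon --enable-ssh-support)"
echo "$SSH_AUTH_SOCK"    # should now point at the agent's ssh socket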
In Debian Jessie, the ssh-agent wrapper will not start if it detects that you already have one running (for example from gpg-agent), but if that fails, you can try commenting out use-ssh-agent in /etc/X11/Xsession.options to keep it from starting up in your session. (Thanks to stackexchange for that reference.) Here I assume you have already created an authentication subkey on your PGP key. If you haven't, I suggest trying out simply monkeysphere gen-subkey, which will generate an authentication subkey for you. You can also do it by hand by following one of the OpenPGP/SSH tutorials from Yubikey, especially the more complete one. If you are going to generate a completely new OpenPGP key, you may want to follow this simpler tutorial here. Then you need to move your authentication subkey to the Yubikey. For this, you need to edit the key and use the keytocard command:
$ gpg2 --edit-key anarcat@debian.org
> toggle
> key 2
> keytocard
> save
Here, we use toggle to show the OpenPGP private key material. You should see a key marked with A for Authentication. Mine was the second one so I selected it with key 2, which put a star next to it. The keytocard command moved it to the card, and save ensured the key was removed from the local keyring. Obviously, backups are essential before doing this, because it's perfectly possible to lose that key in the process, for example if you destroy or lose the Yubikey or forget the password. It's probably better to create a completely different authentication subkey for just this purpose, but that may require reconfiguring all remote SSH hosts, and you may not want to do that. Then SSH should magically talk with the GPG agent and ask you for the PIN! That's pretty much all there is to it - if it doesn't, it means that gpg-agent is not your SSH agent, and obviously things will fail... Also, you should be able to see the key being loaded in the agent when it is:
$ ssh-add -l
2048 23:f3:be:bf:1e:da:e8:ad:4b:c7:f6:60:5e:03:c2:a6 cardno:000603647189 (RSA)
... that's about it! I have yet to cover 2FA and OpenPGP support, but that got me going for a while and I'll stop geeking around with that thing for now. It was fun, for sure, but I'm not sure it's worth it for now.

Using OATH This is pretty neat: it allows you to add two factor authentication to a lot of things. For example, PAM has such a module, which I will configure here to allow myself to login to my server from untrusted machines. While I will expose my main password to keyloggers, the OTP password will prevent that from being reused. This is a simplified version of this OATH tutorial. We install the PAM module with:
sudo apt install libpam-oath
Then, we can hook it into any PAM consumer, for example with sshd:
--- a/pam.d/sshd
+++ b/pam.d/sshd
@@ -1,5 +1,8 @@
 # PAM configuration for the Secure Shell service
+# for the yubikey
+auth required pam_oath.so usersfile=/etc/users.oath window=5 digits=8
+
 # Standard Un*x authentication.
 @include common-auth
We also needed to allow OTP passwords in sshd explicitly, with:
ChallengeResponseAuthentication yes
This will force the user to enter a valid OATH token on the server. Unfortunately, this will affect all users, regardless of whether they are present in the users.oath file. I filed bug #807990 regarding this, with a patch. Also, it means the main password is still exposed on the client machine - you can use the sufficient keyword instead of required to work around that, but then it means anyone with your key can log in to your machine, which is something to keep in mind. The /etc/users.oath file needs to be created with something like:
#type   username        pin     start seed
HOTP    anarcat -   00
00 is obviously a fake and insecure string. Generate a proper one with:
dd if=/dev/random bs=1k count=1 | sha1sum # create a random secret
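Putting the two together, a sketch for appending the entry to /etc/users.oath (the username and column layout follow the example above; treat the file as secret):
SEED=$(dd if=/dev/random bs=1k count=1 2>/dev/null | sha1sum | awk '{print $1}')
printf 'HOTP\t%s\t-\t%s\n' anarcat "$SEED" | sudo tee -a /etc/users.oath
sudo chmod 600 /etc/users.oath    # the seed is a shared secret, keep the file private
echo "$SEED"                      # paste this into ykpersonalize in the next step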
Then the shared secret needs to be added to the Yubikey:
ykpersonalize -1 -o oath-hotp -o oath-hotp8 -o append-cr -a
You simply paste the random secret you created above when prompted, and that shared secret will be saved in a Yubikey slot for future use. Next time you log in to the SSH server, you will be prompted for an OATH password; you just touch the button on the key and it will be pasted there:
$ ssh -o PubkeyAuthentication=no anarc.at
One-time password (OATH) for 'anarcat':
Password:
[... logged in!]
Final note: the centralized file approach makes it hard, if not impossible, for users to update their own secret token. It would be nice if there were a user-accessible token file, maybe ~/.oath? Filed feature request #807992 about this as well.

13 December 2015

Vasudev Kamath: Mount ordering and automount options in systemd

Just to give a brief recap of why I'm writing this post, I have to describe a brief encounter I had with systemd which rendered my system unbootable. I first felt it was systemd's problem but later figured out it was my mistake. So below is a brief story divided into two sections, Problem and Solution. Problem describes the issue I faced with systemd and Solution discusses the .mount and .automount suffix files used by systemd.
Problem I have several local bin folders and I don't want to modify PATH environment variable adding every other folder path. I first started using aufs as alternative to symlinks or modifying PATH variable. Recently I learnt about much more light weight union fs which is in kernel, overlayfs. And Google led me to Arch wiki article which showed me a fstab entry like below
overlay        /usr/local/bin        overlay noauto,x-systemd.automount,lowerdir=/home/vasudev/Documents/utilities/bin,upperdir=/home/vasudev/Documents/utilities/jonas-bin,workdir=/home/vasudev/Documents/utilities/bin-overlay    0       0
After adding this entry, on the next reboot systemd was not able to mount my LVM home and swap partitions. It did, however, mount the root partition. It gave me the emergency target, but the login prompt never returned, so without any alternative I had to reinstall the system. Funnily enough, I started encountering the same issue again (yes, after I had added the above entry into fstab once more); at that time I never suspected it was the culprit. My friend Ritesh finally got my system booting again after dealing with the weird x-systemd.automount option. I never investigated further at that time why this problem occurred.
Solution After re-encounter with similar problem and some other project I read the manual on systemd.mount, systemd-fstab-generator and systemd.automount and I had some understanding of what really went wrong in my case above. So now let us see what really is happening. All the above happened because, systemd translates the /etc/fstab into .mount units at run time using systemd-fstab-generator. Every entry in fstab translates into a file named after the mount point. The / in the path of the mount point is replaced by a -. So / mount point is named as -.mount and /home is named as home.mount and /boot becomes boot.mount. All these files can be seen in directory /run/systemd/generator. And all these mount points are needed by local-fs.target , if any of these mount points fail, local-fs.target fails. And if local-fs.target fails it will invoke emergency.target. systemd-fstab-generator manual suggests the ordering information in /etc/fstab is discarded. That means if you have union mounts, bind mounts or fuse mounts in the fstab which is normally at end of fstab and uses path from /home or some other device mount points it will fail to get mounted. This happens because systemd-fstab-generator didn't consider the ordering in fstab. This is what happened in my case my overlay mount point which is usr-local-bin.mount was some how happened to be tried before home.mount because there is no explicit dependency like Requires= or After= was declared. When systemd tried to mount it the path required under /home were not really present hence failed which in return invoked emergency.target as usr-local-bin.mount is Requires= dependency for local-fs.target. Now what I didn't understand is why emregency.target never gave me the root shell after entering login information. I feel that this part is unrelated to the problem I described above and is some other bug. To over come this what we can do is provided systemd-fstab-generator some information on dependency of each mount point. systemd.mount manual page suggests several such options. One which I used in my case is x-systemd.requires which should be placed in options column of fstab and specify the mount point which is needed before it has to be mounted. So my overlay fs entry translates to something like below
overlay        /usr/local/bin        overlay noauto,x-systemd.requires=/home,x-systemd.automount,lowerdir=/home/vasudev/Documents/utilities/bin,upperdir=/home/vasudev/Documents/utilities/jonas-bin,workdir=/home/vasudev/Documents/utilities/bin-overlay   0       0
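To check what systemd-fstab-generator derives from such an entry, you can ask for the escaped unit name and then inspect the generated unit and its dependencies; a small sketch (systemd-escape needs a reasonably recent systemd):
systemd-escape -p --suffix=mount /usr/local/bin    # prints usr-local-bin.mount
sudo systemctl daemon-reload                       # regenerate units from fstab
systemctl cat usr-local-bin.mount
systemctl show -p After -p Requires usr-local-bin.mount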
There is another special option called x-systemd.automount; this makes systemd-fstab-generator create a .automount file for this mount point. What does systemd.automount do? It achieves on-demand and parallelized file system mounting: similar to systemd.socket, the file system will be mounted when you access the mount point for the first time. So now if you look at the dependencies of usr-local-bin.mount you will see the following.
systemctl show -p After usr-local-bin.mount
After=-.mount systemd-journald.socket local-fs-pre.target system.slice usr-local-bin.automount
This means that usr-local-bin.mount now depends on usr-local-bin.automount. Let us see what usr-local-bin.automount needs.
systemctl show -p Requires usr-local-bin.automount
Requires=-.mount home.mount
systemctl show -p After usr-local-bin.automount
After=-.mount home.mount
So clearly usr-local-bin.automount is activated after -.mount and home.mount are activated. The same can be done for any bind mounts or fuse mounts which require other mount points before they can be mounted. Also note that x-systemd.automount is not a mandatory option for declaring dependencies; I used it just to make sure /usr/local/bin is mounted only when it's really needed.
Conclusion A lot of the traditional ways have been changed by systemd. I never understood at first why my system failed to boot; this happened because I was not really aware of how systemd works and was trying to debug the problem with the traditional approach. So there is really a learning curve involved for every sysadmin out there who is going to use systemd. Most of them will read the documentation beforehand, but others like me will learn after having a situation like the above. :-) So is systemd interfering with /etc/fstab good? I don't know, but since systemd parallelizes the boot procedure, something like this is really needed. Is there a way to make systemd not touch /etc/fstab? Yes, there is: you need to pass the fstab=0 option on the kernel command line, and then systemd-fstab-generator doesn't create any .mount or .swap files from your /etc/fstab. !NB Also, it looks like the x-systemd.requires option was introduced recently and is not available in systemd <= 215, which is the default in Jessie. So how do you declare dependencies on a Jessie system? I don't have any answer! I did read that x-systemd.automount, which is available in those versions of systemd, can be used, but I'm yet to experiment with this. If it succeeds I will write a post on it.

26 November 2015

Olivier Berger: Handling video files produced for a MOOC on Windows with git and git-annex

This post is intended to document some elements of the workflow that I've set up to manage videos produced for a MOOC, where different colleagues work collaboratively on a set of video sequences, in a remote way. We are a team of several schools working on the same course, and we have an incremental process, so we need some collaboration over quite a long period between many remote authors, over a set of video sequences. We're probably going to review some of the videos and make changes, so we need to monitor changes, and submit versions to colleagues on remote sites so they can criticize them and get later edits. We may have more than one site doing video production. Thus we need to share videos along the flow of production, editing and revision of the course contents, in a way that is manageable by power users (we're all computer scientists, used to SVN or Git). I've decided to start an experiment with Git and Git-Annex to try and manage the videos like we used to do for slide sources in LaTeX. Obviously the main issue is that videos are big files, demanding in storage space and bandwidth for transfers. We want to keep track of everything which is done during the production of the videos, so that we can later re-do some of the video editing, for instance if we change the graphic design elements (logos, subtitles, frame dimensions, additional effects, etc.), for instance in the case where we would improve the classes over the seasons. On the other hand, not all colleagues want to have to download a full copy of all rushes on their laptop if they just want to review one particular sequence of the course. They will only need to download the final edit MP4, even if they're interested in being able to fetch all the rushes, should they want to try and improve the videos. Git-Annex brings us the ability to decouple the presence of files in directories, managed by regular Git commands, from the presence of the file contents (the big stuff), which is managed by Git-Annex. Here's a quick description of our setup: Why didn't we use git-annex on Windows directly, on the Windows host which is the source of the files? We tried, but that didn't work out. The Git-Annex assistant somehow crashed on us, causing the Git history to become strange, so that became unmanageable, and, more importantly, we need robust backups, so we can't afford to handle something we don't fully trust: shooting a video again is really costly (setting up a shooting set again, with lighting, cameras, and a professor who has to repeat the speech!). The rsync (with delete on destination) from Windows to Linux is robust. Git-Annex on Linux seems robust so far. That's enough for now :-) The drawback is that we need manual intervention for starting the rsync, and also that we must make sure that the rsync target is ready to receive a backup. The target of the rsync on Linux is a git-annex clone using the default indirect mode, which handles the files as symlinks to the actual copies managed by git-annex inside the .git/ directory. But that isn't suitable for comparison with the origin of the rsync mirror, which are plain files on the Windows computer. We must then do a git-annex edit on the whole target of the rsync mirror before the rsync, so that the files are there as regular video files. This is costly, in terms of storage, and also copying time (our repo contains around 50 GB, and the Linux host is a rather tiny laptop).
After the rsync, all the files need to be compared to the SHA256 hashes known to git-annex so that only modified files are taken into account in the commit. We perform a git-annex add on all the files (for new files having appeared at rsync time), and then a git-annex sync. That takes a lot of time, since the SHA256 computations are quite long for such a set of big files (the video rushes and edited videos are in HD). So the process needs to be the following, on the target Linux host (a small wrapper sketch follows the list):
  1. git annex add .
  2. git annex sync
  3. git annex copy . --to server1
  4. git annex copy . --to server2
  5. git annex edit .
  6. only then: rsync
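As a small wrapper sketch for those steps (the repository path is a placeholder, and server1/server2 are the remote names from the list above):
#!/bin/sh
set -e
cd /path/to/video-annex        # assumption: the git-annex clone fed by the rsync
git annex add .                # register files that appeared at rsync time
git annex sync                 # commit and exchange git metadata
git annex copy . --to server1  # push the contents to both backup servers
git annex copy . --to server2
git annex edit .               # unlock files so the next rsync sees plain files
echo "ready for the next rsync from the Windows host"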
Iterate ad lib. I would have preferred to have a working git-annex on Windows, but this is a much more manageable process for me for now, and until we have more videos to handle in our repo than our Linux laptop can hold, we're quite safe. Next steps will probably involve gardening the contents of the repo on the Linux host so we're only keeping copies of current files, and older copies are only kept on the two servers, in case of later need. I hope this can be useful to others, and I'd welcome suggestions on how to improve our process.

23 November 2015

Riku Voipio: Using ser2net for serial access.

Is your table a mess of wires? Do you have multiple devices connected via serial and can't remember which /dev/ttyUSBX is connected to what board? Unless you are an embedded developer, you are unlikely to deal with serial much anymore - in that case you can just jump to the next post in your news feed. Introducing ser2net Usually people start with minicom for serial access. There are better tools - picocom, screen, etc. But to easily map multiple serial ports, use ser2net. Ser2net makes serial ports available over telnet. Persistent USB device names and ser2net To remember which USB-serial adapter is connected to what, we use the /dev/serial tree created by udev, in /etc/ser2net.conf:

# arndale
7004:telnet:0:'/dev/serial/by-path/pci-0000:00:1d.0-usb-0:1.8.1:1.0-port0':115200 8DATABITS NONE 1STOPBIT
# cubox
7005:telnet:0:/dev/serial/by-id/usb-Prolific_Technology_Inc._USB-Serial_Controller_D-if00-port0:115200 8DATABITS NONE 1STOPBIT
# sonic-screwdriver
7006:telnet:0:/dev/serial/by-id/usb-FTDI_FT230X_96Boards_Console_DAZ0KA02-if00-port0:115200 8DATABITS NONE 1STOPBIT
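To find the stable names udev created for your adapters (and to decide between by-id and by-path), listing both symlink trees is enough:
# each adapter appears in both trees; pick whichever name suits your setup
ls -l /dev/serial/by-id/
ls -l /dev/serial/by-path/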
The by-path syntax is needed if you have many identical USB-to-serial adapters. In that case a patch from the BTS is needed to support quoting in the serial path. Ser2net doesn't seem very actively maintained upstream - a sure sign that a project is stagnant is a homepage still at sourceforge.net... This patch, among other interesting features, can also be found in various ser2net forks on GitHub. Setting easy-to-remember names Finally, unless you want to memorize the port numbers, set TCP-port-to-name mappings in /etc/services:

# Local services
arndale 7004/tcp
cubox 7005/tcp
sonic-screwdriver 7006/tcp
Now finally:
telnet localhost sonic-screwdriver
Mandatory picture of a serial port connection in action

11 November 2015

Kees Cook: evolution of seccomp

I'm excited to see other people thinking about userspace-to-kernel attack surface reduction ideas. Theo de Raadt recently published slides describing Pledge. This uses the same ideas that seccomp implements, but with less granularity. Seccomp works at the individual syscall level and, in addition to killing processes, allows for signaling, tracing, and errno spoofing. As de Raadt mentions, Pledge could be implemented with seccomp very easily: libseccomp would just categorize syscalls. I don't really understand the presentation's mention of "Optional Security", though. Pledge, like seccomp, is an opt-in feature. Nothing in the kernel refuses to run unpledged programs. I assume his point was that when it gets ubiquitously built into programs (like stack protector), it's effectively not optional (which is alluded to later as "comprehensive applicability" ~= "mandatory mitigation"). Regardless, this sensible (though optional) design gets me back to his slide on seccomp, which seems to have a number of misunderstandings: OpenBSD has some interesting advantages in the syscall filtering department, especially around sockets. Right now, it's hard for Linux syscall filtering to understand why a given socket is being used. Something like SOCK_DNS seems like it could be quite handy. Another nice feature of Pledge is the path whitelist feature. As it's still under development, I hope they expand this to include more things than just paths. Argument inspection is a weak point for seccomp, but under Linux, most of the arguments are ultimately exposed to the LSM layer. Last year I experimented with creating a seccomp LSM for path matching where programs could declare whitelists, similar to standard LSMs. So, yes, Linux could match this API "on seccomp". It'd just take some extensions to libseccomp to implement pledge(), as I described at the top. With OpenBSD doing a bunch of analysis work on common programs, it'd be excellent to see this usable on Linux too. So far on Linux, only a few programs (e.g. Chrome, vsftpd) have bothered to do this using seccomp, and it could be argued that this is ultimately due to how fine grained it is.

© 2015, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

9 November 2015

Vasudev Kamath: Taming systemd-nspawn for running containers

I've been trying to run containers using systemd-nspawn for quite some time, but I was always bumping into one or another dead end. This is not systemd-nspawn's fault, rather my impatience stopping me from reading the manual pages properly, and the lack of good tutorial-like articles available online. Compared to this, LXC has quite a lot of good tutorials and howtos available online. This article is my effort to create notes putting all the required information in one place.
Creating a Debian Base Install The first step is to have a minimal Debian system somewhere on your hard disk. This can be easily done using debootstrap. I wrote a custom script to avoid reading the manual every time I want to run debootstrap. Parts of this script (mostly the package list and the root password generation) are stolen from the lxc-debian template provided by the lxc package.
#!/bin/sh
set -e
set -x
usage () {
    echo "${0##*/} [options] <suite> <target> [<mirror>]"
    echo "Bootstrap rootfs for Debian"
    echo
    cat <<EOF
    --arch         set the architecture to install
    --root-passwd  set the root password for bootstrapped rootfs
EOF
}
# copied from the lxc-debian template
packages=ifupdown,\
locales,\
libui-dialog-perl,\
dialog,\
isc-dhcp-client,\
netbase,\
net-tools,\
iproute,\
openssh-server,\
dbus
if [ $(id -u) -ne 0 ]; then
    echo "You must be root to execute this command"
    exit 2
fi
if [ $# -lt 2 ]; then
   usage $0
fi
while true; do
    case "$1" in
        --root-passwd|--root-passwd=?*)
            if [ "$1" = "--root-passwd" -a -n "$2" ]; then
                ROOT_PASSWD="$2"
                shift 2
            elif [ "$1" != "$ 1#--root-passwd= " ]; then
                ROOT_PASSWD="$ 1#--root-passwd= "
                shift 1
            else
                # copied from lxc-debian template
                ROOT_PASSWD="$(dd if=/dev/urandom bs=6 count=1 2>/dev/null base64)"
                ECHO_PASSWD="yes"
            fi
            ;;
        --arch|--arch=?*)
            if [ "$1" = "--arch" -a -n "$2" ]; then
                ARCHITECTURE="$2"
                shift 2
            elif [ "$1" != "$ 1#--arch= " ]; then
                ARCHITECTURE="$ 1#--arch= "
                shift 1
            else
                ARCHITECTURE="$(dpkg-architecture -q DEB_HOST_ARCH)"
            fi
            ;;
        *)
            break
            ;;
    esac
done
release="$1"
target="$2"
if [ -z "$1" ]   [ -z "$2" ]; then
    echo "You must specify suite and target"
    exit 1
fi
if [ -n "$3" ]; then
    MIRROR="$3"
fi
MIRROR=${MIRROR:-http://httpredir.debian.org/debian}
echo "Downloading Debian $release ..."
debootstrap --verbose --variant=minbase --arch=$ARCHITECTURE \
             --include=$packages \
             "$release" "$target" "$MIRROR"
if [ -n "$ROOT_PASSWD" ]; then
    echo "root:$ROOT_PASSWD"   chroot "$target" chpasswd
    echo "Root password is '$ROOT_PASSWRD', please change!"
fi
It just gets my needs done; if you don't like it, feel free to modify it or use debootstrap directly. !NB Please install the dbus package in the minimal base install, otherwise you will not be able to control the container using machinectl
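For reference, an invocation of the script might look like this (the script name, target path and password are placeholders):
sudo ./bootstrap-debian.sh --arch amd64 --root-passwd changeme \
     jessie /var/lib/machines/debian-jessie http://httpredir.debian.org/debian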
Manually Running the Container and then persisting it Next we need to run the container manually. This is done using the following command.
systemd-nspawn -bD   /path/to/container --network-veth \
     --network-bridge=natbr0 --machine=Machinename
The --machine option is not mandatory; if not specified, systemd-nspawn will take the directory name as the machine name, and if you have characters like - in the directory name they get translated to the hex code \x2d and controlling the container by name becomes difficult. --network-veth tells systemd-nspawn to enable virtual-ethernet-based networking and --network-bridge gives the bridge interface on the host system to be used by systemd-nspawn. Together these options constitute private networking for the container. If they are not specified, the container can use the host system's interface, thereby removing the network isolation of the container. Once you run this command the container comes up. You can now run machinectl to control the container. The container can be persisted using the following command
machinectl enable container-name
This will create a symbolic link from /lib/systemd/system/systemd-nspawn@.service into /etc/systemd/system/machines.target.wants/. This allows you to start or stop the container using the machinectl or systemctl commands. The only catch here is that your base install should be in /var/lib/machines/. What I do in my case is create a symbolic link from my base container to /var/lib/machines/container-name. !NB Note that the symbolic link name under /var/lib/machines should be the same as the container name you gave using the --machine switch, or the directory name if you didn't specify --machine
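As a sketch of that symlink setup (the source path is a placeholder for wherever your base install actually lives):
sudo ln -s /path/to/base-install /var/lib/machines/container-name
sudo machinectl enable container-name   # creates the systemd-nspawn@container-name.service link
sudo machinectl start container-name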
Persisting Container Networking We did persist the container in the above step, but this doesn't persist the networking options we provided on the command line. systemd-nspawn@.service uses the following command to invoke the container.
ExecStart=/usr/bin/systemd-nspawn --quiet --keep-unit --boot --link-journal=try-guest --network-veth --settings=override --machine=%I
To persist the bridge networking configuration we did on the command line, we need the help of systemd-networkd. So first we need to enable systemd-networkd.service on both the container and the host system.
systemctl enable systemd-networkd.service
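The service has to be enabled in both places; one way to do the container side without booting it first is to run systemctl offline through systemd-nspawn (a sketch, using the same machine name as above):
# on the host
sudo systemctl enable systemd-networkd.service
sudo systemctl start systemd-networkd.service
# inside the container image: an offline enable just creates the symlinks
sudo systemd-nspawn -D /var/lib/machines/container-name \
     systemctl enable systemd-networkd.service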
Now inside the container, interfaces will be named hostN, where N increments depending on how many interfaces we have. In our example case we had a single interface, hence it will be named host0. By default network interfaces will be down inside the container, hence systemd-networkd is needed to bring them up. We put the following in the /etc/systemd/network/host0.network file inside the container.
[Match]
Name=host0
[Network]
Description=Container wired interface host0
DHCP=yes
And on the host system we just configure the bridge interface using systemd-networkd. I put the following in natbr0.netdev in /etc/systemd/network/
[NetDev]
Description=Bridge natbr0
Name=natbr0
Kind=bridge
In my case I had already configured the bridge using the /etc/network/interfaces file for lxc. I think it's not really needed to use systemd-networkd in this case. Since systemd-networkd doesn't do anything if the network/virtual device is already present, I safely put the above configuration in place and enabled systemd-networkd. Just for the notes, here is my natbr0 configuration in the interfaces file.
auto natbr0
iface natbr0 inet static
   address 172.16.10.1
   netmask 255.255.255.0
   pre-up brctl addbr natbr0
   post-down brctl delbr natbr0
   post-down sysctl net.ipv4.ip_forward=0
   post-down sysctl net.ipv6.conf.all.forwarding=0
   post-up sysctl net.ipv4.ip_forward=1
   post-up sysctl net.ipv6.conf.all.forwarding=1
   post-up iptables -A POSTROUTING -t mangle -p udp --dport bootpc -s 172.16.0.0/16 -j CHECKSUM --checksum-fill
   pre-down iptables -D POSTROUTING -t mangle -p udp --dport bootpc -s 172.16.0.0/16 -j CHECKSUM --checksum-fill
Once this is done, just reload systemd-networkd and make sure you have dnsmasq or some other DHCP server running on your system. Now the last part is to tell systemd-nspawn to use the bridge networking interface we have defined. This is done using a container-name.nspawn file. Put this file under the /etc/systemd/nspawn folder.
[Exec]
Boot=on
[Files]
Bind=/home/vasudev/Documents/Python/upstream/apt-offline/
[Network]
VirtualEthernet=yes
Bridge=natbr0
Here you can specify the networking and file mounting sections of the container. For the full list please refer to the systemd.nspawn manual page. Now that all this is done you can happily do
machinectl start container-name
#or
systemctl start systemd-nspawn@container-name
Resource Control Now, all things said and done, one last part remains. What is the point if we can't control how many resources the container uses? At least it is important for me, because I use an old and somewhat low-powered laptop. Systemd provides a way to control resources using the control group interface. To see all the interfaces exposed by systemd please refer to the systemd.resource-control manual page. The way to control resources is using systemctl; once the container starts running we can run the following command.
systemctl set-property container-name CPUShares=200 CPUQuota=30% MemoryLimit=500M
The manual page does say that these things can be put under the [Slice] section of unit files. Now, I don't have a clear idea whether this can be put under .nspawn files or not. For the sake of persisting the container, I manually wrote the service file for the container by copying systemd-nspawn@.service and adding a [Slice] section. But I don't know how to find out whether this had any effect or not. If someone knows about this, please share your suggestions with me and I will update this section with the information you provide.
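One way to at least observe whether the limits were applied is to read the properties back from the unit and watch the control groups; a sketch (property names as reported by newer systemd versions):
systemctl show -p CPUShares -p CPUQuotaPerSecUSec -p MemoryLimit \
    systemd-nspawn@container-name.service
systemd-cgtop    # live per-cgroup CPU and memory consumption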
Conclusion All in all I like systemd-nspawn a lot. I use it to run a container for the development of apt-offline. I previously used lxc, where everything can be controlled using a single config file, but I feel systemd-nspawn is more tightly integrated with the system than lxc. There is definitely more in systemd-nspawn than I've currently figured out. The only thing is it's not as popular as the other alternatives and definitely lacks good howto documentation. For now the only way out is to dig through the manual pages, scratch your head, pull your hair out and figure out new possibilities in systemd-nspawn. ;-)

8 November 2015

Daniel Pocock: Problems observed during Cambridge mini-DebConf RTC demo

A few problems were observed during the demo of RTC services at the Cambridge mini-DebConf yesterday. As it turns out, many of them are already documented and solutions are available for some of them. Multiple concurrent SIP registrations I had made some test calls on Friday using rtc.debian.org and I still had the site open in another tab in another browser window. When people tried to call me during the demo, both tabs were actually ringing but only one was visible. When a SIP client registers, the SIP registration server sends it a list of all other concurrent registrations in the response message. We simply need to extend JSCommunicator to inspect the response message and give some visual feedback about other concurrent registrations. Issue #69. SIP also provides a mechanism to clear concurrent registrations and that could be made available with a button or configuration option too (Issue #9). Callee hears ringing before connectivity checks completed The second issue during the WebRTC demo was that the callee (myself) was alerted about the call before the ICE checks had been performed. The optimal procedure to provide a slick user experience is to run the connectivity checks before alerting the callee. If the connectivity checks fail, the callee should never be alerted with a ringing sound and would never know somebody had tried to call. The caller would be told that the call was unable to be attempted and encouraged to consider trying again on another wifi network. RFC 5245 recommends that connectivity checks should be done first but it is not mandatory. One reason this is problematic with WebRTC is the need to display the pop-up asking the user for permission to share their microphone and webcam: the popup must appear before connectivity checks can commence. This has been discussed in the JsSIP issue tracker. Non-WebRTC softphones, such as Lumicall, do the connectivity checks before alerting the callee. Dealing with UDP blocking It appears the corporate wifi network in the venue was blocking the UDP packets so the connectivity checks could never complete, not even using a TURN server to relay the packets. People trying to use the service on home wifi networks, in small offices and mobile tethering should not have this problem as these services generally permit UDP by default. Some corporate networks, student accommodation and wifi networks in some larger hotels have blocked UDP and in these cases, additional effort must be made to get through the firewall. The TURN server we are running for rtc.debian.org also supports a TLS transport but it simply isn't configured yet. At the time we originally launched the WebRTC service in 2013, the browsers didn't support TURN over TLS at all but now they do. This is probably the biggest problem encountered during the demo but it does not require any code change to resolve this, just configuration, so a solution is well within reach. During the demo, we worked around the issue by turning off the wifi on my laptop and using tethering with a 4G mobile network. All the calls made successfully during the demo used the tethering solution. Add a connectivity check timeout The ICE connectivity checks appeared to keep running for a long time. Usually, if UDP is not blocked, the ICE checks would complete in less than two seconds. Therefore, the JavaScript needs to set a timeout between two and five seconds when it starts the checks and give the user a helpful warning about their network problems if the timeout is exceeded. Issue #73 in JSCommunicator. 
While these lengthy connectivity checks appear disappointing, it is worth remembering that this is an improvement over the first generation of softphones: none of them made these checks at all, they would simply tell the user the call had been answered but audio and video would only be working in one direction or not at all. Microphone issues One of the users calling into the demo, Juliana, was visible on the screen but we couldn't hear her. This was a local audio hardware issue with her laptop or headset. It would be useful if the JavaScript could provide visual feedback when it detects a voice (issue #74) and even better, integrating with the sound settings so that the user can see if the microphone is muted or the gain is very low (issue #75). Thanks to participants in the demo I'd like to thank all the participants in the demo, including Juliana Louback who called us from New York, Laura Arjona who called us from Madrid, Daniel Silverstone who called from about three meters away in the front row and Iain Learmonth who helped co-ordinate the test calls over IRC. Thanks are also due to Steve McIntyre, the local Debian community, ARM and the other sponsors for making another mini-DebConf in the UK this year.

Previous.