Search Results: "rover"

6 August 2020

Chris Lamb: The Bringers of Beethoven

This is a curiously poignant work to me that I doubt I would ever be able to communicate in writing. I found it first about fifteen years ago with a friend who I am quite regrettably no longer in regular contact with, so there was some complicated nostalgia entangled with rediscovering it today. What might I say about it instead? One tell-tale sign of 'good' art is that you can find something new in it, or yourself, each time. In this sense, despite The Bringers of Beethoven being more than a little ridiculous, it is somehow 'good' music to me. For example, it only really dawned on me now that the whole poem is an allegory for a GDR-like totalitarianism. But I also realised that it is not an accident that it is Beethoven himself (quite literally the soundtrack for Enlightenment humanism) that is being weaponised here, rather than some fourth-rate composer of military marches or one with a problematic past. That is to say, not only is the poem arguing that something universally recognised as an unalloyed good can be subverted for propagandistic ends, but that is precisely the point being made by the regime. An inverted Clockwork Orange, if you like. Yet when I listen to it again I can't help but laugh. I think of the 18th-century poet Alexander Pope, who first used the word bathos to refer to those abrupt and often absurd transitions from the elevated to the ordinary, contrasting it with the concept of pathos, the sincere feeling of sadness and tragedy. I can't think of two better words.

21 June 2020

Enrico Zini: Forced italianisation of Südtirol

Italianization (Italian: Italianizzazione; Croatian: talijanizacija; Slovene: poitaljančevanje; German: Italianisierung) is the spread of Italian culture and language, either by integration or assimilation.[1][2]
In 1919, at the time of its annexation, the middle part of the County of Tyrol which is today called South Tyrol (in Italian Alto Adige) was inhabited by almost 90% German speakers.[1] Under the 1939 South Tyrol Option Agreement, Adolf Hitler and Benito Mussolini determined the status of the German and Ladin (Rhaeto-Romanic) ethnic groups living in the region. They could emigrate to Germany, or stay in Italy and accept their complete Italianization. As a consequence of this, the society of South Tyrol was deeply riven. Those who wanted to stay, the so-called Dableiber, were condemned as traitors while those who left (Optanten) were defamed as Nazis. Because of the outbreak of World War II, this agreement was never fully implemented. Illegal Katakombenschulen ("Catacomb schools") were set up to teach children the German language.
The Prontuario dei nomi locali dell'Alto Adige (Italian for Reference Work of Place Names of Alto Adige) is a list of Italianized toponyms for mostly German place names in South Tyrol (Alto Adige in Italian) which was published in 1916 by the Royal Italian Geographic Society (Reale Società Geografica Italiana). The list was called the Prontuario for short and later formed an important part of the Italianization campaign initiated by the fascist regime, as it became the basis for the official place and district names in the Italian-annexed southern part of the County of Tyrol.
Ettore Tolomei (16 August 1865, in Rovereto – 25 May 1952, in Rome) was an Italian nationalist and fascist. He was designated a Member of the Italian Senate in 1923, and ennobled as Conte della Vetta in 1937.
The South Tyrol Option Agreement (German: Option in Südtirol; Italian: Opzioni in Alto Adige) was an agreement in effect between 1939 and 1943, when the native German-speaking people in South Tyrol and three communes in the province of Belluno were given the option of either emigrating to neighboring Nazi Germany (of which Austria was a part after the 1938 Anschluss) or remaining in Fascist Italy and being forcibly integrated into the mainstream Italian culture, losing their language and cultural heritage. Over 80% opted to move to Germany.

31 May 2020

Enrico Zini: Controversial inventors

Paul-Félix Armand-Delille (3 July 1874, in Fourchambault, Nièvre – 4 September 1963) was a physician, bacteriologist, professor, and member of the French Academy of Medicine who accidentally brought about the collapse of rabbit populations throughout much of Europe and beyond in the 1950s by infecting them with myxomatosis.
Charles Franklin Kettering (August 29, 1876 – November 25, 1958), sometimes known as Charles "Boss" Kettering,[1] was an American inventor, engineer, businessman, and the holder of 186 patents.[2] He was a founder of Delco, and was head of research at General Motors from 1920 to 1947. Among his most widely used automotive developments were the electrical starting motor[3] and leaded gasoline.[4][5] In association with the DuPont Chemical Company, he was also responsible for the invention of Freon refrigerant for refrigeration and air conditioning systems. At DuPont he also was responsible for the development of Duco lacquers and enamels, the first practical colored paints for mass-produced automobiles. While working with the Dayton-Wright Company he developed the "Bug" aerial torpedo, considered the world's first aerial missile.[6] He led the advancement of practical, lightweight two-stroke diesel engines, revolutionizing the locomotive and heavy equipment industries. In 1927, he founded the Kettering Foundation, a non-partisan research foundation. He was featured on the cover of Time magazine on January 9, 1933.
John Charles Cutler (June 29, 1915 – February 8, 2003) was a senior surgeon, and the acting chief of the venereal disease program in the United States Public Health Service. After his death, his involvement in several controversial and unethical medical studies of syphilis was revealed, including the Guatemala and the Tuskegee syphilis experiments.
Ivy Ledbetter Lee (July 16, 1877 – November 9, 1934) was an American publicity expert and a founder of modern public relations. Lee is best known for his public relations work with the Rockefeller family. His first major client was the Pennsylvania Railroad, followed by numerous major railroads such as the New York Central, the Baltimore and Ohio, and the Harriman lines such as the Union Pacific. He established the Association of Railroad Executives, which included providing public relations services to the industry. Lee advised major industrial corporations, including steel, automobile, tobacco, meat packing, and rubber, as well as public utilities, banks, and even foreign governments. Lee pioneered the use of internal magazines to maintain employee morale, as well as management newsletters, stockholder reports, and news releases to the media. He did a great deal of pro bono work, which he knew was important to his own public image, and during World War I, he became the publicity director for the American Red Cross.[1]

9 May 2020

Michael Stapelberg: Hermetic packages (in distri)

In distri, packages (e.g. emacs) are hermetic. By hermetic, I mean that the dependencies a package uses (e.g. libusb) don't change, even when newer versions are installed. For example, if package libusb-amd64-1.0.22-7 is available at build time, the package will always use that same version, even after the newer libusb-amd64-1.0.23-8 is installed into the package store. Another way of saying the same thing is: packages in distri are always co-installable. This makes the package store more robust: additions to it will not break the system. On a technical level, the package store is implemented as a directory containing distri SquashFS images and metadata files, into which packages are installed in an atomic way.
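As a purely hypothetical illustration of co-installability (package names taken from the example above; the exact contents of the store will differ per system), both libusb versions would simply sit next to each other in the package store:
distri0# ls /ro | grep '^libusb-amd64'
libusb-amd64-1.0.22-7
libusb-amd64-1.0.23-8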

Out of scope: plugins are not hermetic by design One exception where hermeticity is not desired is plugin mechanisms: optionally loading out-of-tree code at runtime obviously is not hermetic. As an example, consider glibc's Name Service Switch (NSS) mechanism. Section 29.4.1 "Adding another Service to NSS" of the glibc manual describes how glibc searches $prefix/lib for shared libraries at runtime. Debian ships about a dozen NSS libraries for a variety of purposes, and enterprise setups might add their own into the mix. systemd (as of v245) accounts for 4 NSS libraries, e.g. nss-systemd for user/group name resolution for users allocated through systemd's DynamicUser= option. Having packages be as hermetic as possible remains a worthwhile goal despite any exceptions: I will gladly use a 99% hermetic system over a 0% hermetic system any day. Side note: Xorg's driver model (which can be characterized as a plugin mechanism) does not fall under this category because of its tight API/ABI coupling! For this case, where drivers are only guaranteed to work with precisely the Xorg version for which they were compiled, distri uses per-package exchange directories.

Implementation of hermetic packages in distri On a technical level, the requirement is: all paths used by the program must always result in the same contents. This is implemented in distri via the read-only package store mounted at /ro, e.g. files underneath /ro/emacs-amd64-26.3-15 never change. To change all paths used by a program, in practice, three strategies cover most paths:

ELF interpreter and dynamic libraries Programs on Linux use the ELF file format, which contains two kinds of references: First, the ELF interpreter (PT_INTERP segment), which is used to start the program. For dynamically linked programs on 64-bit systems, this is typically ld.so(8). Many distributions use system-global paths such as /lib64/ld-linux-x86-64.so.2, but distri compiles programs with -Wl,--dynamic-linker=/ro/glibc-amd64-2.31-4/out/lib/ld-linux-x86-64.so.2 so that the full path ends up in the binary. The ELF interpreter is shown by file(1), but you can also use readelf -a $BINARY | grep 'program interpreter' to display it. And secondly, the rpath, a run-time search path for dynamic libraries. Instead of storing full references to all dynamic libraries, we set the rpath so that ld.so(8) will find the correct dynamic libraries. Originally, we used to just set a long rpath, containing one entry for each dynamic library dependency. However, we have since switched to using a single lib subdirectory per package as its rpath, and placing symlinks with full path references into that lib directory, e.g. using -Wl,-rpath=/ro/grep-amd64-3.4-4/lib. This is better for performance, as ld.so uses a per-directory cache. Note that program load times are significantly influenced by how quickly you can locate the dynamic libraries. distri uses a FUSE file system to load programs from, so getting proper -ENOENT caching into place drastically sped up program load times. Instead of compiling software with the -Wl,--dynamic-linker and -Wl,-rpath flags, one can also modify these fields after the fact using patchelf(1). For closed-source programs, this is the only possibility. The rpath can be inspected by using e.g. readelf -a $BINARY | grep RPATH.
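For example, retrofitting these two fields onto an already-built binary with patchelf(1) might look roughly like this (a sketch, not distri's actual build tooling; the glibc and grep package paths are the ones used as examples above, and $BINARY stands for the file to patch):
patchelf --set-interpreter /ro/glibc-amd64-2.31-4/out/lib/ld-linux-x86-64.so.2 "$BINARY"
patchelf --set-rpath /ro/grep-amd64-3.4-4/lib "$BINARY"
readelf -a "$BINARY" | grep -E 'program interpreter|RPATH'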

Environment variable setup wrapper programs Many programs are influenced by environment variables: to start another program, said program is often found by checking each directory in the PATH environment variable. Such search paths are prevalent in scripting languages, too, to find modules. Python has PYTHONPATH, Perl has PERL5LIB, and so on. To set up these search path environment variables at run time, distri employs an indirection. Instead of e.g. teensy-loader-cli, you run a small wrapper program that calls precisely one execve system call with the desired environment variables. Initially, I used shell scripts as wrapper programs because they are easily inspectable. This turned out to be too slow, so I switched to compiled programs. I'm linking them statically for fast startup, and I'm linking them against musl libc for significantly smaller file sizes than glibc (per-executable overhead adds up quickly in a distribution!). Note that the wrapper programs prepend to the PATH environment variable, they don't replace it in its entirety. This is important so that users have a way to extend the PATH (and other variables) if they so choose. This doesn't hurt hermeticity because it is only relevant for programs that were not present at build time, i.e. plugin mechanisms which, by design, cannot be hermetic.
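For illustration only, here is roughly what such a wrapper does, written as a shell script (the early wrappers were in fact shell scripts before being replaced by small compiled programs; the paths are taken from the teensy-loader-cli demo in the appendix and stand in for whatever the build process records):
#!/bin/sh
# Sketch of a wrapper: prepend (never replace) the search-path variables that
# were known at build time, then replace this process with the real binary,
# i.e. a single execve.
export PATH=/ro/bash-amd64-5.0-4/bin:$PATH
exec /ro/teensy-loader-cli-amd64-2.1+g20180927-7/out/bin/teensy_loader_cli "$@"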

Shebang interpreter patching The shebang line of scripts contains a path, too, and hence needs to be changed. We don't do this in distri yet (the number of packaged scripts is small), but we should.
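Purely as a hypothetical sketch (distri does not do this yet, the python3 package path below is invented for the example, and $SCRIPT stands for the packaged script), such patching could be a simple rewrite of the first line at build time:
sed -i '1s|^#!/usr/bin/env python3$|#!/ro/python3-amd64-3.8.2-4/out/bin/python3|' "$SCRIPT"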

Performance requirements The performance improvements in the previous sections are not just good to have, but practically required when many processes are involved: without them, you'll encounter second-long delays in magit, which spawns many git processes under the covers, or in dracut, which spawns one cp(1) process per file.

Downside: rebuild of packages required to pick up changes Linux distributions such as Debian consider it an advantage to roll out security fixes to the entire system by updating a single shared library package (e.g. openssl). The flip side of that coin is that changes to a single critical package can break the entire system. With hermetic packages, all reverse dependencies must be rebuilt when a library's changes should be picked up by the whole system. E.g., when openssl changes, curl must be rebuilt to pick up the new version of openssl. This approach trades (temporarily) higher bandwidth and disk usage for a smaller blast radius of any individual package update.

Downside: long env variables are cumbersome to deal with This can be partially mitigated by removing empty directories at build time, which will result in shorter variables. In general, there is no getting around this. One little trick is to use tr : '\n', e.g.:
distri0# echo $PATH
/usr/bin:/bin:/usr/sbin:/sbin:/ro/openssh-amd64-8.2p1-11/out/bin
distri0# echo $PATH | tr : '\n'
/usr/bin
/bin
/usr/sbin
/sbin
/ro/openssh-amd64-8.2p1-11/out/bin

Edge cases The implementation outlined above works well in hundreds of packages, and only a small handful exhibited problems of any kind. Here are some issues I encountered:

Issue: accidental ABI breakage in plugin mechanisms NSS libraries built against glibc 2.28 and newer cannot be loaded by glibc 2.27. In all likelihood, such changes do not happen too often, but it does illustrate that glibc's published interface spec is not sufficient for forwards and backwards compatibility. In distri, we could likely use a per-package exchange directory for glibc's NSS mechanism to prevent the above problem from happening in the future.

Issue: wrapper bypass when a program re-executes itself Some programs try to arrange for themselves to be re-executed outside of their current process tree. For example, consider building a program with the meson build system:
  1. When meson first configures the build, it generates ninja files (think Makefiles) which contain command lines that run the meson --internal helper.
  2. Once meson returns, ninja is called as a separate process, so it will not have the environment which the meson wrapper sets up. ninja then runs the previously persisted meson command line. Since the command line uses the full path to meson (not to its wrapper), it bypasses the wrapper.
Luckily, not many programs try to arrange for other process trees to run them. Here is a table summarizing how affected programs might try to arrange for re-execution, whether the technique results in a wrapper bypass, and what we do about it in distri:
technique to execute itself            uses wrapper   mitigation
run-time: find own basename in PATH    yes            wrapper program
compile-time: embed expected path      no; bypass!    configure or patch
run-time: argv[0] or /proc/self/exe    no; bypass!    patch
Setting argv[0] to the wrapper location might seem like a way to side-step this problem. We tried doing this in distri, but had to revert and go the other way.

Misc smaller issues

Appendix: Could other distributions adopt hermetic packages? At a very high level, adopting hermetic packages will require two steps:
  1. Using fully qualified paths whose contents don't change (e.g. /ro/emacs-amd64-26.3-15) generally requires rebuilding programs, e.g. with --prefix set.
  2. Once you use fully qualified paths you need to make the packages able to exchange data. distri solves this with exchange directories, implemented in the /ro file system which is backed by a FUSE daemon.
The first step is pretty simple, whereas the second step is where I expect controversy around any suggested mechanism.
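As a rough illustration of step 1, using the emacs package path from earlier in this post (build systems and flags will of course vary):
./configure --prefix=/ro/emacs-amd64-26.3-15/out
make && make install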

Appendix: demo (in distri) This appendix contains commands and their outputs, run on the upcoming distri version supersilverhaze, but verified to work on older versions, too. Large outputs have been collapsed and can be expanded by clicking on the output. The /bin directory contains symlinks for the union of all packages' bin subdirectories:
distri0# readlink -f /bin/teensy_loader_cli
/ro/teensy-loader-cli-amd64-2.1+g20180927-7/bin/teensy_loader_cli
The wrapper program in the bin subdirectory is small:
distri0# ls -lh $(readlink -f /bin/teensy_loader_cli)
-rwxr-xr-x 1 root root 46K Apr 21 21:56 /ro/teensy-loader-cli-amd64-2.1+g20180927-7/bin/teensy_loader_cli
Wrapper programs execute quickly:
distri0# strace -fvy /bin/teensy_loader_cli |& head | cat -n
     1  execve("/bin/teensy_loader_cli", ["/bin/teensy_loader_cli"], ["USER=root", "LOGNAME=root", "HOME=/root", "PATH=/ro/bash-amd64-5.0-4/bin:/r"..., "SHELL=/bin/zsh", "TERM=screen.xterm-256color", "XDG_SESSION_ID=c1", "XDG_RUNTIME_DIR=/run/user/0", "DBUS_SESSION_BUS_ADDRESS=unix:pa"..., "XDG_SESSION_TYPE=tty", "XDG_SESSION_CLASS=user", "SSH_CLIENT=10.0.2.2 42556 22", "SSH_CONNECTION=10.0.2.2 42556 10"..., "SSHTTY=/dev/pts/0", "SHLVL=1", "PWD=/root", "OLDPWD=/root", "=/usr/bin/strace", "LD_LIBRARY_PATH=/ro/bash-amd64-5"..., "PERL5LIB=/ro/bash-amd64-5.0-4/ou"..., "PYTHONPATH=/ro/bash-amd64-5.b0-4/"...]) = 0
     2  arch_prctl(ARCH_SET_FS, 0x40c878)       = 0
     3  set_tid_address(0x40ca9c)               = 715
     4  brk(NULL)                               = 0x15b9000
     5  brk(0x15ba000)                          = 0x15ba000
     6  brk(0x15bb000)                          = 0x15bb000
     7  brk(0x15bd000)                          = 0x15bd000
     8  brk(0x15bf000)                          = 0x15bf000
     9  brk(0x15c1000)                          = 0x15c1000
    10  execve("/ro/teensy-loader-cli-amd64-2.1+g20180927-7/out/bin/teensy_loader_cli", ["/ro/teensy-loader-cli-amd64-2.1+"...], ["USER=root", "LOGNAME=root", "HOME=/root", "PATH=/ro/bash-amd64-5.0-4/bin:/r"..., "SHELL=/bin/zsh", "TERM=screen.xterm-256color", "XDG_SESSION_ID=c1", "XDG_RUNTIME_DIR=/run/user/0", "DBUS_SESSION_BUS_ADDRESS=unix:pa"..., "XDG_SESSION_TYPE=tty", "XDG_SESSION_CLASS=user", "SSH_CLIENT=10.0.2.2 42556 22", "SSH_CONNECTION=10.0.2.2 42556 10"..., "SSHTTY=/dev/pts/0", "SHLVL=1", "PWD=/root", "OLDPWD=/root", "=/usr/bin/strace", "LD_LIBRARY_PATH=/ro/bash-amd64-5"..., "PERL5LIB=/ro/bash-amd64-5.0-4/ou"..., "PYTHONPATH=/ro/bash-amd64-5.0-4/"...]) = 0
Confirm which ELF interpreter is set for a binary using readelf(1):
distri0# readelf -a /ro/teensy-loader-cli-amd64-2.1+g20180927-7/out/bin/teensy_loader_cli | grep 'program interpreter'
[Requesting program interpreter: /ro/glibc-amd64-2.31-4/out/lib/ld-linux-x86-64.so.2]
Confirm the rpath is set to the package's lib subdirectory using readelf(1):
distri0# readelf -a /ro/teensy-loader-cli-amd64-2.1+g20180927-7/out/bin/teensy_loader_cli | grep RPATH
 0x000000000000000f (RPATH)              Library rpath: [/ro/teensy-loader-cli-amd64-2.1+g20180927-7/lib]
and verify the lib subdirectory has the expected symlinks and target versions:
distri0# find /ro/teensy-loader-cli-amd64-*/lib -type f -printf '%P -> %l\n'
libc.so.6 -> /ro/glibc-amd64-2.31-4/out/lib/libc-2.31.so
libpthread.so.0 -> /ro/glibc-amd64-2.31-4/out/lib/libpthread-2.31.so
librt.so.1 -> /ro/glibc-amd64-2.31-4/out/lib/librt-2.31.so
libudev.so.1 -> /ro/libudev-amd64-245-11/out/lib/libudev.so.1.6.17
libusb-0.1.so.4 -> /ro/libusb-compat-amd64-0.1.5-7/out/lib/libusb-0.1.so.4.4.4
libusb-1.0.so.0 -> /ro/libusb-amd64-1.0.23-8/out/lib/libusb-1.0.so.0.2.0
To verify the correct libraries are actually loaded, you can set the LD_DEBUG environment variable for ld.so(8):
distri0# LD_DEBUG=libs teensy_loader_cli
[ ]
       678:     find library=libc.so.6 [0]; searching
       678:      search path=/ro/teensy-loader-cli-amd64-2.1+g20180927-7/lib            (RPATH from file /ro/teensy-loader-cli-amd64-2.1+g20180927-7/out/bin/teensy_loader_cli)
       678:       trying file=/ro/teensy-loader-cli-amd64-2.1+g20180927-7/lib/libc.so.6
       678:
[ ]
NSS libraries that distri ships:
find /lib/ -name "libnss_*.so.2" -type f -printf '%P -> %l\n'
libnss_myhostname.so.2 -> ../systemd-amd64-245-11/out/lib/libnss_myhostname.so.2
libnss_mymachines.so.2 -> ../systemd-amd64-245-11/out/lib/libnss_mymachines.so.2
libnss_resolve.so.2 -> ../systemd-amd64-245-11/out/lib/libnss_resolve.so.2
libnss_systemd.so.2 -> ../systemd-amd64-245-11/out/lib/libnss_systemd.so.2
libnss_compat.so.2 -> ../glibc-amd64-2.31-4/out/lib/libnss_compat.so.2
libnss_db.so.2 -> ../glibc-amd64-2.31-4/out/lib/libnss_db.so.2
libnss_dns.so.2 -> ../glibc-amd64-2.31-4/out/lib/libnss_dns.so.2
libnss_files.so.2 -> ../glibc-amd64-2.31-4/out/lib/libnss_files.so.2
libnss_hesiod.so.2 -> ../glibc-amd64-2.31-4/out/lib/libnss_hesiod.so.2

24 April 2020

Mark Brown: Book club: Our Software Dependency Problem

A short while ago Daniel, Lars and I met to discuss Russ Cox's excellent essay Our Software Dependency Problem. This essay looks at software reuse in general, especially in the context of modern distribution methods like PyPI and NPM, which make the whole process much more frictionless than traditional distribution methods used with languages like C. Possibly our biggest conclusion was that the essay is so eminently sensible that we mostly just talked about how much we agreed with it and how comprehensive it was; we particularly admired the clarity with which it explores how to evaluate the quality of free software projects. Next time we'll have to pick something more controversial to discuss!

14 April 2020

Markus Koschany: My Free Software Activities in March 2020

Welcome to gambaru.de. Here is my monthly report (+ the first week in April) that covers what I have been doing for Debian. If you're interested in Java, Games and LTS topics, this might be interesting for you. I am sure I am not the only one who will remember March 2020 in the future as a month nobody was really fond of. I was mostly occupied with non-Debian work and managed to get ill in the same week I wanted to celebrate my birthday, but it didn't matter anyway because of, ehm, quarantine and social distancing. Maybe next year March will be great again.
Debian Games Debian Java Misc Debian LTS This was my 49th month as a paid contributor and I have been paid to work 10 hours on Debian LTS, a project started by Raphaël Hertzog. In that time I did the following: ELTS Extended Long Term Support (ELTS) is a project led by Freexian to further extend the lifetime of Debian releases. It is not an official Debian project but all Debian users benefit from it without cost. The current ELTS release is Debian 7 "Wheezy". This was my 22nd month and I have been paid to work 9 hours on ELTS. Thanks for reading and see you next time.

30 March 2020

Shirish Agarwal: Covid 19 and the Indian response.

There have been a lot of stories about Coronavirus, and with them a lot of political blame-game has been happening. The first step that India took, a lockdown, is and was a good step, but it came without a plan for how the poor and the needy, and especially the huge migrant population that India has (internal migration), would be affected by it. A 2019 World Economic Forum report puts the figure at 139 million people. That is a huge number of people, and there are a variety of both push and pull factors which have displaced them. While there have been attempts in the past, and probably will continue to be in future, they will be hampered unless we have trustworthy data, which is where there is a lot that needs to be done. In recent years, both the primary and secondary data have generated a lot of controversy within India as well as abroad, so there is no point in rehashing all of that. Even the definition of who is a "migrant" needs to be well established, just as who is a "farmer" does. The simplest lacuna in the latter is that those who own land are counted as farmers, but tenant farmers and their wives are not, hence the true numbers are never known. Whether this is an India-specific problem or similar definitional issues exist in the rest of the world, I don't know.

How our Policies fail to reach the poor and the vulnerable The sad part is most policies in India are made in "castles in the air". An interview by The Wire shares the conundrum of those who are affected and the policies which are enacted for them (it's a YouTube video, sorry).
If one watches the interview with an open and fresh mind, it becomes clear why there was a huge reverse migration from Indian cities to villages. The poor and marginalized have always seen the Indian state as an extortive force, so it doesn't make sense for them to stay in the cities. The Prime Minister's announcement of food for 3 months was a clear indication for the migrant population that for 3 months they would have no work. Faced with such a scenario, the best option for them was to return to their native places. While videos of huge numbers of migrants were shown of Delhi, this was the scenario in most states and cities, including Pune, my own city. Another interesting point which was made is that most of the policies will need the migrants to be back in the villages. Most of these are tied to accounts which are opened in villages, so even if they want the benefits they will have to migrate to villages in order to use them. Of course, everybody in India knows how leaky the administration is. The late Shri Rajiv Gandhi once famously and infamously remarked how leaky the Public Distribution System and such systems are: it's only 10 paise out of a rupee which reaches the poor. And he said this about 30 years ago. There have been numerous reports on both IPS (Indian Police Service) reforms and IAS (Indian Administrative Service) reforms over the years; many of the committee reports have been in the public domain and were in fact part of the election manifesto of the ruling party in 2014, but no movement has happened on that front. The only thing which has happened is that people from the ruling party have been appointed to various posts, which is the same as under earlier governments. I was discussing with a friend, who is a contractor and builder, the construction labour issues which were pointed out in the report and whether it is true that many a time the migrant labour is not counted. While he shared a number of cases he knew of, a more recent case in public memory was when some labourers died while building Amanora mall, which is perhaps one of the largest malls in India. There were a few accidents while constructing the mall. Apparently, the insurance money which should have gone to the migrant labourers was taken by somebody close to the developers who were building the mall. I have a friend who lives in Jharkhand and is a labour officer. She has shared with me so many stories of how the labourers are exploited. Keep in mind she is a labour officer appointed by the state and her salary is paid by the state. So she always has to maintain a balance between ensuring workers' rights and the interests of the state, private entities etc., which are usually in cahoots with the state, and it is possible that a lot of the time the State wins over workers' rights. Again, as a labour officer, she doesn't have that much power, and when she was new to the work she was often frustrated, but as she remarked a few months back, she has started taking it easy (routinized) as it wasn't helping her in any good way anyway. Also, there have been plenty of cases of labour officers being murdered, so it's easier to understand why one tries to retain some sanity while doing their job.

The Indian response and the World Response The Indian response has been the lockdown and very limited testing. We seem to be following the pattern of the UK and the U.S., which had been slow to respond and slow to test. In the past Kerala showed the way, but this time even that is not enough. At the end of the day we need to test, test and test, just as urged by the WHO chief. India is trying to create its own cheap test kits with ICMR approval; for example, a firm from my own city Pune, MyLab, has been given approval. We will know how good or bad they are only after they have been field-tested. For ventilators we have asked Mahindra and Mahindra, even though there are companies like Allied Medical and others who have exported to the EU and elsewhere, which the Govt. is still taking time to think through. This is similar to how, in the UK, some companies who are close to the Govt. but have no experience in making ventilators have been given orders, while those who have experience and were exporting to Germany and other countries have not been given orders. The playbook is eerily similar. In India, we don't have the infrastructure for any new patients, period. Heck, only a couple of states have done something proper for the anganwadi workers. In fact, last year there were massive strikes by anganwadi workers all over India, but only NDTV showed a bit of it along with some of the news channels from South India. Most mainstream channels chose to ignore it. On the world stage, some of the other countries and how they have responded perhaps need sharing. For example, I didn't know that Cuba had so many doctors, or about the politics between it and Brazil. Or the interesting stats shared by Andreas Backhaus, which seem to show how distributed the issue is (age-wise) rather than confined to just a few groups as has been told in Indian media. What was surprising for me is the 20-29 age group, which has not been discussed so much in the Indian media even though it is the bulk of our population. The HBR article also makes a few key points which I hope both the general public and policymakers, in India as well as elsewhere, take note of. What is worrying, though, is that people can be infected twice or more, as seems to be the case in Singapore or China and elsewhere. I have read enough Robin Cook and Michael Crichton books to be aware that viruses can do whatever: they will mutate over time, and how things will play out then is anybody's guess. What I found interesting is the World Economic Forum article which hypothesizes that it may be two viruses which got together, as well as a research paper recently published in a proteome research journal. The biggest myth flying around is that summer will halt or kill the spread, which even some of my friends have fallen victim to. While a part of me wants to believe them, a simple scientific fact is that viruses have been around us and have evolved over time, just like we have. In fact, there have been cases of people dying due to the common cold and other things. Viruses are so prevalent it's unbelievable. What is and was interesting to note is that bat-borne viruses as well as pangolin viruses had been theorized and written about by Chinese researchers going all the way back to the '90s. The problem is that even if we killed all the bats in the world, some other virus would take their place for sure. One of the ideas I had, dunno if it's feasible or not, is that at least in places like airports we should have some sort of screening and labs working on virology.
Of course, this will mean more expense for flying passengers, but for public health and safety maybe it would be worth doing. In any case, virologists would have a field day cataloguing various viruses, and it would make it harder for viruses to spread as fast as this one has. The virus's spread also showed a lack of leadership in most of our leaders, who didn't react fast enough. While one hopes people do learn from this, I am afraid the whole thing is far from over. These are unprecedented times and I hope that all are maintaining social distancing and going out only when needed.

24 March 2020

Russ Allbery: Review: Lost in Math

Review: Lost in Math, by Sabine Hossenfelder
Publisher: Basic
Copyright: June 2018
ISBN: 0-465-09426-0
Format: Kindle
Pages: 248
Listening to experts argue can be one of the better ways to learn about a new field. It does require some basic orientation and grounding or can be confusing or, worse, wildly misleading, so some advance research or Internet searches are warranted. But it provides some interesting advantages over reading multiple popular introductions to a field. First, experts arguing with each other are more precise about their points of agreement and disagreement because they're trying to persuade someone who is well-informed. The points of agreement are often more informative than the points of disagreement, since they can provide a feel for what is uncontroversial among experts in the field. Second, internal arguments tend to be less starry-eyed. One of the purposes of popularizations of a field is to get the reader excited about it, and that can be fun to read. But to generate that excitement, the author has a tendency to smooth over disagreements and play up exciting but unproven ideas. Expert disagreements pull the cover off of the uncertainty and highlight the boundaries of what we know and how we know it. Lost in Math (subtitled How Beauty Leads Physics Astray) is not quite an argument between experts. That's hard to find in book form; most of the arguments in the scientific world happen in academic papers, and I rarely have the energy or attention span to read those. But it comes close. Hossenfelder is questioning the foundations of modern particle physics for the general public, but also for her fellow scientists. High-energy particle physics is facing a tricky challenge. We have a solid theory (the standard model) which explains nearly everything that we have currently observed. The remaining gaps are primarily at very large scales (dark matter and dark energy) or near phenomena that are extremely difficult to study (black holes). For everything else, the standard model predicts our subatomic world to an exceptionally high degree of accuracy. But physicists don't like the theory. The details of why are much of the topic of this book, but the short version is that the theory does not seem either elegant or beautiful. It relies on a large number of measured constants that seem to have no underlying explanation, which is contrary to a core aesthetic principle that physicists use to judge new theories. Accompanying this problem is another: New experiments in particle physics that may be able to confirm or disprove alternate theories that go beyond the standard model are exceptionally expensive. All of the easy experiments have been done. Building equipment that can probe beyond the standard model is incredibly expensive, and thus only a few of those experiments have been done. This leads to two issues: Particle physics has an overgrowth of theories (such as string theory) that are largely untethered from experiments and are not being tested and validated or disproved, and spending on new experiments is guided primarily by a sense of scientific aesthetics that may simply be incorrect. Enter Lost in Math. Hossenfelder's book picks up a thread of skepticism about string theory (and, in Hossenfelder's case, supersymmetry as well) that I previously read in Lee Smolin's The Trouble with Physics. But while Smolin's critique was primarily within the standard aesthetic and epistemological framework of particle physics, Hossenfelder is questioning that framework directly. Why should nature be beautiful? Why should constants be small? What if the universe does have a large number of free constants? 
And is the dislike of an extremely reliable theory on aesthetic grounds a good basis for guiding which experiments we fund?
Do you recall the temple of science, in which the foundations of physics are the bottommost level, and we try to break through to deeper understanding? As I've come to the end of my travels, I worry that the cracks we're seeing in the floor aren't really cracks at all but merely intricate patterns. We're digging in the wrong places.
Lost in Math will teach you a bit less about physics than Smolin's book, although there is some of that here. Smolin's book was about two-thirds physics and one-third sociology of science. Lost in Math is about two-thirds sociology and one-third physics. But that sociology is engrossing. It's obvious in retrospect, but I hadn't thought before about the practical effects of running out of unexplained data on a theoretical field, or about the transition from more data than we can explain to having to spend billions of dollars to acquire new data. And Hossenfelder takes direct aim at the human tendency to find aesthetically appealing patterns and unified explanations, and scores some palpable hits.
I went into physics because I don't understand human behavior. I went into physics because math tells it how it is. I liked the cleanliness, the unambiguous machinery, the command math has over nature. Two decades later, what prevents me from understanding physics is that I still don't understand human behavior. "We cannot give exact mathematical rules that define if a theory is attractive or not," says Gian Francesco Giudice. "However, it is surprising how the beauty and elegance of a theory are universally recognized by people from different cultures. When I tell you, 'Look, I have a new paper and my theory is beautiful,' I don't have to tell you the details of my theory; you will get why I'm excited. Right?" I don't get it. That's why I am talking to him. Why should the laws of nature care what I find beautiful? Such a connection between me and the universe seems very mystical, very romantic, very not me. But then Gian doesn't think that nature cares what I find beautiful, but what he finds beautiful.
The structure of this book is half tour of how physics judges which theories are worthy of investigation and half personal quest to decide whether physics has lost contact with reality. Hossenfelder approaches this second thread with multiple interviews of famous scientists in the field. She probes at their bases for preferring one theory over another, at how objective those preferences can or should be, and what it means for physics if they're wrong (as increasingly appears to be the case for supersymmetry). In so doing, she humanizes theory development in a way that I found fascinating. The drawback to reading about ongoing arguments is the lack of a conclusion. Lost in Math, unsurprisingly, does not provide an epiphany about the future direction of high-energy particle physics. Its conclusion, to the extent that it has one, is a plea to find a way to put particle physics back on firmer experimental footing and to avoid cognitive biases in theory development. Given the cost of experiments and the nature of humans, this is challenging. But I enjoyed reading this questioning, contrarian take, and I think it's valuable for understanding the limits, biases, and distortions at the edge of new theory development. Rating: 7 out of 10

2 March 2020

Russell Coker: Amazon Prime and Netflix

I've been trying both Amazon Prime and Netflix. I signed up for the month free of Amazon Prime to watch "Good Omens" and "Picard". Good Omens is definitely worth the effort of setting up the month free of Amazon Prime, and is worth the month's subscription if you have used your free month in the past; Picard is ok. Content Amazon Prime has a medium amount of other content; I'm now paying for a month of Amazon Prime mainly because there are enough documentaries to take a month. For reference there are plenty of good ones about war and about space exploration. There are also some really rubbish documentaries, for example a 2-part documentary about the Magna Carta where the second part starts with Grover Norquist claiming that the Magna Carta is justification for not having any taxes (the first part seemed ok). Netflix has a lot of great content. A big problem with Netflix is that there aren't good ways of searching and organising the content you want to watch. It would be really nice if Netflix could use some machine learning for recommendations and recommend shows based on what I've liked and also what I've disliked. On both Netflix and Amazon, when you view the details of a show it gives a short list of similar shows, which is nice. With Amazon I have no complaints about that. But with Netflix the content library is so great that you get lost in a maze of links. On the Android tablet interface for Netflix it shows 12 similar shows in a grid, and on the web interface it's a row of 20 shows with looped scrolling. Then as you click a different show you get another list of 12/20 shows which will usually have some overlap with the previous one. It would be nice if you could easily swipe left on shows you don't like to avoid having them repeatedly presented to you. On Netflix I've really enjoyed the "Altered Carbon" series (which is significantly more violent than I anticipated), "Black Mirror" (the episode written by Trent Reznor and starring Miley Cyrus is particularly good), and "Love Death and Robots". Overall I currently rate Love Death and Robots as in many ways the best series I've ever watched, because the episodes are all short and get straight to the point. One advantage of online video is that they don't need to pad episodes out or cut them short to fit a TV time slot; they can use as much time as necessary to tell the story. Watch List Having a single row of shows to watch is fine for the amount of content that Amazon has, but with the Netflix content you can easily get 100 shows on your watch list, and it would be good to be able to search my watch list by genre (it's a drag to flick through dozens of icons of war documentaries when I'm in the mood for an action movie, as the icons are somewhat similar). As well as a list of shows you selected to watch, Netflix has a list of shows that have been recently watched, with no way to edit it, which is separate from the list of shows selected to watch. So if you watch 5 minutes of a show and decide that it sucks, it stays on the list until you have partially watched 10 other shows recently. For my usage the recently watched list is the most important thing, as I'm watching some serial shows and wouldn't want to go through the 100 shows on my watch list to find them. If I've decided that a movie sucked after watching a bit of it, I don't want to be reminded of it by seeing the icon every time I use Netflix for the next month. Amazon has only a single watch next list for shows that you have watched recently and shows that you selected as worth watching.
It allows editing the list, which is nice, but then Amazon also often keeps shows on the list when you have finished watching them and removed them from the "to watch" setting. Amazon's watch list is also generally buggy; at one time it decided that a movie was no longer available in my region but didn't let me remove it from the list. Quality Apparently the Netflix web interface on Linux only allows 720p video, while the Amazon web interface on all platforms is limited to 720p. In any case my Internet connection is probably only good enough for 1080p at most. I haven't noticed any quality differences between Netflix and Amazon Prime. Multiple Users Netflix allows you to create profiles for multiple users with separate watch lists, which is very handy. They also don't have IP address restrictions, so it's a common practice for people to share a Netflix account with relatives. If you try to use Netflix when the maximum number of sessions for your account is in use, it will show a list of what the other people on your account are watching (so if you share with your parents be careful about that). Amazon doesn't allow creating multiple profiles, but the content isn't that great. The trend in video streaming is for proprietary content to force users to subscribe to a service, so sharing an Amazon Prime account with a few people so you can watch the proprietary content would make sense. Watching Patterns Sometimes when I'm particularly distracted I can't focus on one show for any length of time. Both Amazon and Netflix (and probably all other online streaming services) allow me to skip between shows easily. That's always been a feature of YouTube, but with YouTube you get recommended increasingly viral content until you find yourself watching utter rubbish. At least with Amazon and Netflix there is a minimum quality level, even if that is reality TV. Conclusion Amazon Prime has a smaller range of content and some really rubbish documentaries. I don't mind the documentaries about UFOs and other fringe stuff as it's obvious what it is and you can avoid it. A documentary that has me watching for an hour before it's revealed to be a promo for Grover Norquist is really bad; did the hour of it that I watched have good content or just rubbish too? Netflix has a huge range of content and the quality level is generally very high. If you are going to watch TV then subscribing to Netflix is probably a good idea. It's reasonably cheap, has a good (not great) interface, and has a lot of content including some great original content. For Amazon, maybe subscribe for 1 month every second year to binge-watch the Amazon proprietary content that interests you.

17 October 2017

Antoine Beaupré: A comparison of cryptographic keycards

An earlier article showed that private key storage is an important problem to solve in any cryptographic system and established keycards as a good way to store private key material offline. But which keycard should we use? This article examines the form factor, openness, and performance of four keycards to try to help readers choose the one that will fit their needs. I have personally been using a YubiKey NEO, since a 2015 announcement on GitHub promoting two-factor authentication. I was also able to hook up my SSH authentication key into the YubiKey's 2048-bit RSA slot. It seemed natural to move the other subkeys onto the keycard, provided that performance was sufficient. The mail client that I use (Notmuch) blocks when decrypting messages, which could be a serious problem on large email threads from encrypted mailing lists. So I built a test harness and got access to some more keycards: I bought a FST-01 from its creator, Yutaka Niibe, at the last DebConf and Nitrokey donated a Nitrokey Pro. I also bought a YubiKey 4 when I got the NEO. There are of course other keycards out there, but those are the ones I could get my hands on. You'll notice none of those keycards have a physical keypad to enter passwords, so they are all vulnerable to keyloggers that could extract the key's PIN. Keep in mind, however, that even with the PIN, an attacker could only ask the keycard to decrypt or sign material but not extract the key that is protected by the card's firmware.

Form factor
The Nitrokey Pro, YubiKey NEO (worn out), YubiKey 4, and FST-01
The four keycards have similar form factors: they all connect to a standard USB port, although both YubiKey keycards have a capacitive button by which the user triggers two-factor authentication and the YubiKey 4 can also require a button press to confirm private key use. The YubiKeys feel sturdier than the other two. The NEO has withstood two years of punishment in my pockets along with the rest of my "real" keyring and there is only minimal wear on the keycard in the picture. It's also thinner so it fits well on the keyring. The FST-01 stands out from the other two with its minimal design. Out of the box, the FST-01 comes without a case, so the circuitry is exposed. This is deliberate: one of its goals is to be as transparent as possible, both in terms of software and hardware design and you definitely get that feeling at the physical level. Unfortunately, that does mean it feels more brittle than other models: I wouldn't carry it in my pocket all the time, although there is a case that may protect the key a little better, but it does not provide an easy way to hook it into a keyring. In the group picture above, the FST-01 is the pink plastic thing, which is a rubbery casing I received along with the device when I got it. Notice how the USB connectors of the YubiKeys differ from the other two: while the FST-01 and the Nitrokey have standard USB connectors, the YubiKey has only a "half-connector", which is what makes it thinner than the other two. The "Nano" form factor takes this even further and almost disappears in the USB port. Unfortunately, this arrangement means the YubiKey NEO often comes loose and falls out of the USB port, especially when connected to a laptop. On my workstation, however, it usually stays put even with my whole keyring hanging off of it. I suspect this adds more strain to the host's USB port but that's a tradeoff I've lived with without any noticeable wear so far. Finally, the NEO has this peculiar feature of supporting NFC for certain operations, as LWN previously covered, but I haven't used that feature yet. The Nitrokey Pro looks like a normal USB key, in contrast with the other two devices. It does feel a little brittle when compared with the YubiKey, although only time will tell how much of a beating it can take. It has a small ring in the case so it is possible to carry it directly on your keyring, but I would be worried the cap would come off eventually. Nitrokey devices are also two times thicker than the Yubico models, which makes them less convenient to carry around on keyrings.

Open and closed designs The FST-01 is as open as hardware comes, down to the PCB design available as KiCad files in this Git repository. The software running on the card is the Gnuk firmware that implements the OpenPGP card protocol, but you can also get it with firmware implementing a true random number generator (TRNG) called NeuG (pronounced "noisy"); the device is programmable through a standard Serial Wire Debug (SWD) port. The Nitrokey Start model also runs the Gnuk firmware. However, the Nitrokey website announces only ECC and RSA 2048-bit support for the Start, while the FST-01 also supports RSA-4096. Nitrokey's founder Jan Suhr, in a private email, explained that this is because "Gnuk doesn't support RSA-3072 or larger at a reasonable speed". Its devices (the Pro, Start, and HSM models) use a similar chip to the FST-01: the STM32F103 microcontroller.
Nitrokey Pro with STM32F103TBU6 MCU
Nitrokey also publishes its hardware designs, on GitHub, which shows the Pro is basically a fork of the FST-01, according to the ChangeLog. I opened the case to confirm it was using the STM MCU, something I should warn you against; I broke one of the pins holding it together when opening it so now it's even more fragile. But at least, I was able to confirm it was built using the STM32F103TBU6 MCU, like the FST-01.
Nitrokey back side
But this is where the comparison ends: on the back side, we find a SIM card reader that holds the OpenPGP card that, in turn, holds the private key material and does the cryptographic operations. So, in effect, the Nitrokey Pro is really an evolution of the original OpenPGP card readers. Nitrokey confirmed the OpenPGP card featured in the Pro is the same as the one shipped by the Free Software Foundation Europe (FSFE): the BasicCard built by ZeitControl. Those cards, however, are covered by NDAs and the firmware is only partially open source. This makes the Nitrokey Pro less open than the FST-01, but that's an inevitable tradeoff when choosing a design based on the OpenPGP cards, which Suhr described to me as "pretty proprietary". There are other keycards out there, however, for example the SLJ52GDL150-150k smartcard suggested by Debian developer Yves-Alexis Perez, which he prefers as it is certified by French and German authorities. In that blog post, he also said he was experimenting with the GPL-licensed OpenPGP applet implemented by the French ANSSI. But the YubiKey devices are even further away in the closed-design direction. Both the hardware designs and firmware are proprietary. The YubiKey NEO, for example, cannot be upgraded at all, even though it is based on an open firmware. According to Yubico's FAQ, this is due to "best security practices": "There is a 'no upgrade' policy for our devices since nothing, including malware, can write to the firmware." I find this decision questionable in a context where security updates are often more important than trying to design a bulletproof design, which may simply be impossible. And the YubiKey NEO did suffer from a critical security issue that allowed attackers to bypass the PIN protection on the card, which raises the question of the actual protection of the private key material on those cards. According to Niibe, "some OpenPGP cards store the private key unencrypted. It is a common attitude for many smartcard implementations", which was confirmed by Suhr: "the private key is protected by hardware mechanisms which prevent its extraction and misuse". He is referring to the use of tamper resistance.
After that security issue, there was no other option for YubiKey NEO users than to get a new keycard (for free, thankfully) from Yubico, which also meant discarding the private key material on the key. For OpenPGP keys, this may mean having to bootstrap the web of trust from scratch if the keycard was responsible for the main certification key. But at least the NEO is running free software based on the OpenPGP card applet and the source is still available on GitHub. The YubiKey 4, on the other hand, is now closed source, which was controversial when the new model was announced last year. It led the main Linux Foundation system administrator, Konstantin Ryabitsev, to withdraw his endorsement of Yubico products. In response, Yubico argued that this approach was essential to the security of its devices, which are now based on "a secure chip, which has built-in countermeasures to mitigate a long list of attacks". In particular, it claims that:
A commercial-grade AVR or ARM controller is unfit to be used in a security product. In most cases, these controllers are easy to attack, from breaking in via a debug/JTAG/TAP port to probing memory contents. Various forms of fault injection and side-channel analysis are possible, sometimes allowing for a complete key recovery in a shockingly short period of time.
While I understand those concerns, they eventually come down to the trust you have in an organization. Not only do we have to trust Yubico, but also hardware manufacturers and designs they have chosen. Every step in the hidden supply chain is then trusted to make correct technical decisions and not introduce any backdoors. History, unfortunately, is not on Yubico's side: Snowden revealed the example of RSA Security accepting what renowned cryptographer Bruce Schneier described as a "bribe" from the NSA to weaken its ECC implementation, by using the presumably backdoored Dual_EC_DRBG algorithm. What makes Yubico or its suppliers so different from RSA Security? Remember that RSA Security used to be an adamant opponent of the degradation of encryption standards, campaigning against the Clipper chip in the first crypto wars. Even if we trust the Yubico supply chain, how can we trust a closed design using what basically amounts to security through obscurity? Publicly auditable designs are an important tradition in cryptography, and that principle shouldn't stop when software is frozen into silicon. In fact, a critical vulnerability called ROCA disclosed recently affects closed "smartcards" like the YubiKey 4 and allows full private key recovery from the public key if the key was generated on a vulnerable keycard. When speaking with Ars Technica, the researchers outlined the importance of open designs and questioned the reliability of certification:
Our work highlights the dangers of keeping the design secret and the implementation closed-source, even if both are thoroughly analyzed and certified by experts. The lack of public information causes a delay in the discovery of flaws (and hinders the process of checking for them), thereby increasing the number of already deployed and affected devices at the time of detection.
This issue with open hardware designs seems to be a recurring topic of conversation on the Gnuk mailing list. For example, there was a discussion in September 2017 regarding possible hardware vulnerabilities in the STM MCU that would allow extraction of encrypted key material from the key. Niibe referred to a talk presented at the WOOT 17 workshop, where Johannes Obermaier and Stefan Tatschner, from the Fraunhofer Institute, demonstrated attacks against the STMF0 family MCUs. It is still unclear if those attacks also apply to the older STMF1 design used in the FST-01, however. Furthermore, extracted private key material is still protected by a user passphrase, but Gnuk uses a weak key derivation function, so brute-forcing attacks may be possible. Fortunately, there is work in progress to make GnuPG hash the passphrase before sending it to the keycard, which should make such attacks harder if not completely pointless. When asked about the Yubico claims in a private email, Niibe did recognize that "it is true that there are more weak points in general purpose implementations than special implementations". During the last DebConf in Montreal, Niibe explained:
If you don't trust me, you should not buy from me. Source code availability is only a single factor: someone can maliciously replace the firmware to enable advanced attacks.
Niibe recommends that you "build the firmware yourself", also saying the design of the FST-01 uses normal hardware that "everyone can replicate". Those advantages are hard to deny for a cryptographic system: using more generic components makes it harder for hostile parties to mount targeted attacks. A counter-argument here is that it can be difficult for a regular user to audit such designs, let alone physically build the device from scratch, but, in a mailing list discussion, Debian developer Ian Jackson explained that:
You don't need to be able to validate it personally. The thing spooks most hate is discovery. Backdooring supposedly-free hardware is harder (more costly) because it comes with greater risk of discovery. To put it concretely: if they backdoor all of them, someone (not necessarily you) might notice. (Backdooring only yours involves messing with the shipping arrangements and so on, and supposes that you specifically are of interest.)
Given that, as far as we know, the STM microcontrollers are not backdoored, I would tend to favor those devices over proprietary ones, as such a backdoor would be more easily detectable than in a closed design. Even though physical attacks may be possible against those microcontrollers, in the end, if an attacker has physical access to a keycard, I consider the key compromised, even if it has the best chip on the market. In our email exchange, Niibe argued that "when a token is lost, it is better to revoke keys, even if the token is considered secure enough". So, like any other device, physical compromise of tokens may mean compromise of the key and should trigger key-revocation procedures.

Algorithms and performance To establish reliable performance results, I wrote a benchmark program naively called crypto-bench that could produce comparable results between the different keys. The program takes each algorithm/keycard combination and runs 1000 decryptions of a 16-byte file (one AES-128 block) using GnuPG, after priming it to get the password cached. I assume the overhead of GnuPG calls to be negligible, as it should be the same across all tokens, so comparisons are possible. AES encryption is constant across all tests as it is always performed on the host and fast enough to be irrelevant in the tests. I used the following:
  • Intel(R) Core(TM) i3-6100U CPU @ 2.30GHz running Debian 9 ("stretch"/stable amd64), using GnuPG 2.1.18-6 (from the stable Debian package)
  • Nitrokey Pro 0.8 (latest firmware)
  • FST-01, running Gnuk version 1.2.5 (latest firmware)
  • YubiKey NEO OpenPGP applet 1.0.10 (not upgradable)
  • YubiKey 4 4.2.6 (not upgradable)
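For illustration, the measurement loop sketched below, in Python, follows the method just described: one untimed decryption so that the passphrase (or card PIN) gets cached, then 1000 timed gpg --decrypt calls. This is only a rough sketch, not the actual crypto-bench program; the ciphertext file name and constants are placeholders.

#!/usr/bin/env python3
"""Minimal sketch of a GnuPG decryption benchmark (not the actual crypto-bench)."""
import subprocess
import time

RUNS = 1000                 # decryptions per measurement, as described in the text
CIPHERTEXT = "block.gpg"    # hypothetical: a 16-byte file (one AES-128 block), encrypted beforehand

def mean_decrypt_time():
    # One untimed run first, so that gpg-agent caches the passphrase / card PIN.
    subprocess.run(["gpg", "--quiet", "--decrypt", CIPHERTEXT],
                   capture_output=True, check=True)
    start = time.monotonic()
    for _ in range(RUNS):
        subprocess.run(["gpg", "--quiet", "--decrypt", CIPHERTEXT],
                       capture_output=True, check=True)
    return (time.monotonic() - start) / RUNS

if __name__ == "__main__":
    print(f"mean decryption time: {mean_decrypt_time():.3f}s")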
I ran crypto-bench for each keycard, which resulted in the following:
Algorithm        Device        Mean time (s)
ECDH-Curve25519  CPU           0.036
                 FST-01        0.135
RSA-2048         CPU           0.016
                 YubiKey-4     0.162
                 Nitrokey-Pro  0.610
                 YubiKey-NEO   0.736
                 FST-01        1.265
RSA-4096         CPU           0.043
                 YubiKey-4     0.875
                 Nitrokey-Pro  3.150
                 FST-01        8.218
Decryption graph There we see the performance of the four keycards I tested, compared with the same operations done without a keycard: the "CPU" device. That provides the baseline time of GnuPG decrypting the file. The first obvious observation is that using a keycard is slower: in the best scenario (FST-01 + ECC) we see a four-fold slowdown, but in the worst case (also FST-01, but RSA-4096), we see a catastrophic 200-fold slowdown. When I presented the results on the Gnuk mailing list, GnuPG developer Werner Koch confirmed those "numbers are as expected":
With a crypto chip RSA is much faster. By design the Gnuk can't be as fast - it is just a simple MCU. However, using Curve25519 Gnuk is really fast.
And yes, the FST-01 is really fast at doing ECC, but it's also the only keycard that handles ECC in my tests; the Nitrokey Start and Nitrokey HSM should support it as well, but I haven't been able to test those devices. Also note that the YubiKey NEO doesn't support RSA-4096 at all, so we can only compare RSA-2048 across keycards. We should note, however, that ECC is slower than RSA on the CPU, which suggests the Gnuk ECC implementation used by the FST-01 is exceptionally fast. In discussions about improving the performance of the FST-01, Niibe estimated the user tolerance threshold to be "2 seconds decryption time". In a new design using the STM32L432 microcontroller, Aurelien Jarno was able to bring the numbers for RSA-2048 decryption from 1.27s down to 0.65s, and for RSA-4096, from 8.22s down to 3.87s. RSA-4096 is still beyond the two-second threshold, but at least it brings the FST-01 close to the YubiKey NEO and Nitrokey Pro performance levels. We should also underline the superior performance of the YubiKey 4: whatever that thing is doing, it's doing it faster than anyone else. It does RSA-4096 faster than the FST-01 does RSA-2048, and almost as fast as the Nitrokey Pro does RSA-2048. We should also note that the Nitrokey Pro also fails to cross the two-second threshold for RSA-4096 decryption. For me, the FST-01's stellar performance with ECC outshines the other devices. Maybe it says more about the efficiency of the algorithm than the FST-01 or Gnuk's design, but it's definitely an interesting avenue for people who want to deploy those modern algorithms. So, in terms of performance, it is clear that both the YubiKey 4 and the FST-01 take the prize in their own areas (RSA and ECC, respectively).

Conclusion In the above presentation, I have evaluated four cryptographic keycards for use with various OpenPGP operations. What the results show is that the only efficient way of storing a 4096-bit encryption key on a keycard would be to use the YubiKey 4. Unfortunately, I do not feel we should put our trust in such closed designs, so I would argue you should either stick with 2048-bit encryption subkeys or keep the keys on disk. Considering that losing such a key would be catastrophic, this might be a good approach anyway. You should also consider switching to ECC encryption: even though it may not be supported everywhere, GnuPG supports having multiple encryption subkeys on a keyring, so if one algorithm is unsupported (e.g. GnuPG 1.4 doesn't support ECC), it will fall back to a supported algorithm (e.g. RSA). Do not forget that your previously encrypted material doesn't magically re-encrypt itself using your new encryption subkey, however. For authentication and signing keys, speed is not such an issue, so I would warmly recommend either the Nitrokey Pro or Start, or the FST-01, depending on whether you want to start experimenting with ECC algorithms. Availability also seems to be an issue for the FST-01. While you can generally get the device when you meet Niibe in person for a few bucks (I bought mine for around $30 Canadian), the Seeed online shop says the device is out of stock at the time of this writing, even though Jonathan McDowell said that may be inaccurate in a debian-project discussion. Nevertheless, this issue may make the Nitrokey devices more attractive. When deciding between the Pro and the Start, Suhr offered the following advice:
In practice smart card security has been proven to work well (at least if you use a decent smart card). Therefore the Nitrokey Pro should be used for high security cases. If you don't trust the smart card or if Nitrokey Start is just sufficient for you, you can choose that one. This is why we offer both models.
So far, I have created a signing subkey and moved that and my authentication key to the YubiKey NEO, because it's a device I physically trust to keep itself together in my pockets and I was already using it. It has served me well so far, especially with its extra features like U2F and HOTP support, which I use frequently. Those features are also available on the Nitrokey Pro, so that may be an alternative if I lose the YubiKey. I will probably move my main certification key to the FST-01 and a LUKS-encrypted USB disk, to keep that certification key offline but backed up on two different devices. As for the encryption key, I'll wait for keycard performance to improve, or simply switch my whole keyring to ECC and use the FST-01 or Nitrokey Start for that purpose.
[The author would like to thank Nitrokey for providing hardware for testing.] This article first appeared in the Linux Weekly News.

12 October 2017

Joachim Breitner: Isabelle functions: Always total, sometimes undefined

Often, when I mention how things work in the interactive theorem prover [Isabelle/HOL] (in the following just Isabelle 1) to people with a strong background in functional programming (whether that means Haskell or Coq or something else), I cause confusion, especially around the issues of what a function is, whether functions are total, and what the business with undefined is. In this blog post, I want to explain some of these issues, aimed at functional programmers or type theoreticians. Note that this is not meant to be a tutorial; I will not explain how to do these things, and will focus on what they mean.

HOL is a logic of total functions If I have an Isabelle function f :: a ⇒ b between two types a and b (the function arrow in Isabelle is ⇒, not →), then, by definition of what it means to be a function in HOL, whenever I have a value x :: a, the expression f x (i.e. f applied to x) is a value of type b. Therefore, and without exception, every Isabelle function is total. In particular, it cannot be that f x does not exist for some x :: a. This is a first difference from Haskell, which does have partial functions like
spin :: Maybe Integer -> Bool
spin (Just n) = spin (Just (n+1))
Here, neither the expression spin Nothing nor the expression spin (Just 42) produce a value of type Bool: the former raises an exception ("incomplete pattern match"), the latter does not terminate. Confusingly, though, both expressions have type Bool. Because every function is total, this confusion cannot arise in Isabelle: if an expression e has type t, then it is a value of type t. This trait is shared with other total systems, including Coq. Did you notice the emphasis I put on the word is here, and how I deliberately did not write "evaluates to" or "returns"? This is because of another big source of confusion:

Isabelle functions do not compute We (i.e., functional programmers) stole the word "function" from mathematics and repurposed it2. But the word "function", in the context of Isabelle, refers to the mathematical concept of a function, and it helps to keep that in mind. What is the difference?
  • A function a → b in functional programming is an algorithm that, given a value of type a, calculates (returns, evaluates to) a value of type b.
  • A function a ⇒ b in math (or Isabelle) associates with each value of type a a value of type b.
For example, the following is a perfectly valid function definition in math (and HOL), but could not be a function in the programming sense:
definition foo :: "(nat ⇒ real) ⇒ real" where
  "foo seq = (if convergent seq then lim seq else 0)"
This assigns a real number to every sequence, but it does not compute it in any useful sense. From this it follows that

Isabelle functions are specified, not defined Consider this function definition:
fun plus :: "nat ⇒ nat ⇒ nat"  where
   "plus 0       m = m"
 | "plus (Suc n) m = Suc (plus n m)"
To a functional programmer, this reads
plus is a function that analyses its first argument. If that is 0, then it returns the second argument. Otherwise, it calls itself with the predecessor of the first argument and increases the result by one.
which is clearly a description of a computation. But to Isabelle, the above reads
plus is a binary function on natural numbers, and it satisfies the following two equations:
And in fact, it is not so much Isabelle that reads it this way, but rather the fun command, which is external to the Isabelle logic. The fun command analyses the given equations, constructs a non-recursive definition of plus under the hood, passes that to Isabelle and then proves that the given equations hold for plus. One interesting consequence of this is that different specifications can lead to the same functions. In fact, if we were to define plus' by recursing on the second argument, we'd obtain the same function (i.e. plus = plus' is a theorem, and there would be no way of telling the two apart).

Termination is a property of specifications, not functions Because a function does not evaluate, it does not make sense to ask if it terminates. The question of termination arises before the function is defined: the fun command can only construct plus in a way that the equations hold if it passes a termination check, very much like Fixpoint in Coq. But while the termination check of Fixpoint in Coq is a deep part of the basic logic, in Isabelle it is simply something that this particular command requires for its internal machinery to go through. At no point does a termination proof of the function exist as a theorem inside the logic. And other commands may have other means of defining a function that do not even require such a termination argument! For example, a function specification that is tail-recursive can be turned into a function, even without a termination proof: the following definition describes a higher-order function that iterates its first argument f on the second argument x until it finds a fixpoint. It is completely polymorphic (the single quote in 'a indicates that this is a type variable):
partial_function (tailrec)
  fixpoint :: "('a ⇒ 'a) ⇒ 'a ⇒ 'a"
where
  "fixpoint f x = (if f x = x then x else fixpoint f (f x))"
We can work with this definition just fine. For example, if we instantiate f with (λx. x - 1), we can prove that it will always return 0:
lemma "fixpoint (  n . n - 1) (n::nat) = 0"
  by (induction n) (auto simp add: fixpoint.simps)
Similarly, if we have a function that works within the option monad (i.e. Maybe in Haskell), its specification can always be turned into a function without an explicit termination proof; here is one that calculates the Collatz sequence:
partial_function (option) collatz :: "nat ⇒ nat list option"
 where "collatz n =
        (if n = 1 then Some [n]
         else if even n
           then do { ns <- collatz (n div 2);    Some (n # ns) }
           else do { ns <- collatz (3 * n + 1);  Some (n # ns) })"
Note that lists in Isabelle are finite (like in Coq, unlike in Haskell), so this function returns a list only if the Collatz sequence eventually reaches 1. I expect these definitions to make a Coq user very uneasy. How can fixpoint be a total function? What is fixpoint (λn. n + 1)? What if we run collatz n for an n where the Collatz sequence does not reach 1?3 We will come back to that question after a little detour.

HOL is a logic of non-empty types Another big difference between Isabelle and Coq is that in Isabelle, every type is inhabited. Just like the totality of functions, this is a very fundamental fact about what HOL defines to be a type. Isabelle gets away with that design because in Isabelle, we do not use types for propositions (like we do in Coq), so we do not need empty types to denote false propositions. This design has an important consequence: It allows the existence of a polymorphic expression that inhabits any type, namely
undefined :: 'a
The naming of this term alone has caused a great deal of confusion for Isabelle beginners, or in communication with users of different systems, so I implore you to not read too much into the name. In fact, you will have a better time if you think of it as "arbitrary" or, even better, "unknown". Since undefined can be instantiated at any type, we can instantiate it for example at bool, and we can observe an important fact: undefined is not "an extra value besides the usual ones". It is simply some value of that type, which is demonstrated in the following lemma:
lemma "undefined = True   undefined = False" by auto
In fact, if the type has only one value (such as the unit type), then we know the value of undefined for sure:
lemma "undefined = ()" by auto
It is very handy to be able to produce an expression of any type, as we will see as follows

Partial functions are just underspecified functions For example, it allows us to translate incomplete function specifications. Consider this definition, Isabelle's equivalent of Haskell's partial fromJust function:
fun fromSome :: "'a option ⇒ 'a" where
  "fromSome (Some x) = x"
This definition is accepted by fun (albeit with a warning), and the generated function fromSome behaves exactly as specified: when applied to Some x, it is x. The term fromSome None is also a value of type 'a, we just do not know which one it is, as the specification does not address that. So fromSome None behaves just like undefined above, i.e. we can prove
lemma "fromSome None = False   fromSome None = True" by auto
Here is a small exercise for you: Can you come up with an explanation for the following lemma:
fun constOrId :: "bool ⇒ bool" where
  "constOrId True = True"
lemma "constOrId = (λ_. True) ∨ constOrId = (λx. x)"
  by (metis (full_types) constOrId.simps)
Overall, this behavior makes sense if we remember that function definitions in Isabelle are not really definitions, but rather specifications. And a partial function definition is simply an underspecification. The resulting function is simply any function that fulfills the specification, and the two lemmas above underline that observation.

Nonterminating functions are also just underspecified Let us return to the puzzle posed by fixpoint above. Clearly, the function, seen as a functional program, is not total: when passed the argument (λn. n + 1) or (λb. b) it will loop forever trying to find a fixed point. But Isabelle functions are not functional programs, and the definitions are just specifications. What does the specification say about the case when f has no fixed point? It states that the equation fixpoint f x = fixpoint f (f x) holds. And this equation has a solution, for example fixpoint f _ = undefined. Or more concretely: the specification of the fixpoint function states that fixpoint (λb. b) True = fixpoint (λb. b) False has to hold, but it does not specify which particular value (True or False) it should denote; any is fine.

Not all function specifications are ok At this point you might wonder: can I just specify any equations for a function f and get a function out of that? But rest assured: that is not the case. For example, no Isabelle command allows you to define a function bogus :: () ⇒ nat with the equation bogus () = Suc (bogus ()), because this equation does not have a solution. We can actually prove that such a function cannot exist:
lemma no_bogus: "∄ bogus. bogus () = Suc (bogus ())" by simp
(Of course, not_bogus () = not_bogus () is just fine.)

You cannot reason about partiality in Isabelle We have seen that there are many ways to define functions that one might consider "partial". Given a function, can we prove that it is not partial in that sense? Unfortunately, but unavoidably, no: since undefined is not a separate, recognizable value, but rather simply an unknown one, there is no way of stating that "a function result is not specified". Here is an example that demonstrates this: two partial functions (one with not all cases specified, the other one with a self-referential specification) are indistinguishable from the total variant:
fun partial1 :: "bool ⇒ unit" where
  "partial1 True = ()"
partial_function (tailrec) partial2 :: "bool ⇒ unit" where
  "partial2 b = partial2 b"
fun total :: "bool ⇒ unit" where
  "total True = ()"
| "total False = ()"
lemma "partial1 = total ∧ partial2 = total" by auto
If you really do want to reason about partiality of functional programs in Isabelle, you should consider implementing them not as plain HOL functions, but rather using HOLCF, where you can give equational specifications of functional programs and obtain continuous functions between domains. In that setting, partiality becomes observable, and partial2 can be distinguished from total. We have done that to verify some of HLint's equations.

You can still compute with Isabelle functions I hope by this point I have not scared away anyone who wants to use Isabelle for functional programming, and in fact, you can use it for that. If the equations that you pass to fun are a reasonable definition for a function (in the programming sense), then these equations, used as rewriting rules, will allow you to compute that function quite like you would in Coq or Haskell. Moreover, Isabelle supports code extraction: you can take the equations of your Isabelle functions and have them exported into OCaml, Haskell, Scala or Standard ML. See Concon for a conference management system with confidentiality verified in Isabelle. While these usually are the equations you defined the function with, they don't have to be: you can declare other proved equations to be used for code extraction, e.g. to refine your elegant definitions to performant ones. Like with code extraction from Coq to, say, Haskell, the adequacy of the translations rests on a moral-reasoning foundation. Unlike extraction from Coq, where you have an (unformalized) guarantee that the resulting Haskell code is terminating, you do not get that guarantee from Isabelle. Conversely, this allows you to reason about and extract non-terminating programs, like fixpoint, which is not possible in Coq. There is currently ongoing work on verified code generation, where the code equations are reflected into a deep embedding of HOL in Isabelle that would allow explicit termination proofs.

Conclusion We have seen how in Isabelle, every function is total. Function declarations have equations, but these do not define the function in a computational sense, but rather specify it. Because in HOL there are no empty types, many specifications that appear partial (incomplete patterns, non-terminating recursion) have solutions in the space of total functions. Partiality in the specification is no longer visible in the final product.

PS: Axiom undefined in Coq This section is speculative, and an invitation for discussion. Coq already distinguishes between types used in programs (Set) and types used in proofs (Prop). Could Coq ensure that every t : Set is non-empty? I imagine this would require additional checks in the Inductive command, similar to the checks that the Isabelle command datatype has to perform4, and it would disallow Empty_set. If so, then it would be sound to add the following axiom
Axiom undefined : forall (a : Set), a.
wouldn't it? This axiom does not have any computational meaning, but that seems to be ok for optional Coq axioms, like classical reasoning or function extensionality. With this in place, how much of what I describe above about function definitions in Isabelle could now be done soundly in Coq? Certainly pattern matches would not have to be complete and could sport an implicit case _ => undefined. Would it help with non-obviously terminating functions? Would it allow a Coq command Tailrecursive that accepts any tailrecursive function without a termination check?

  1. Isabelle is a metalogical framework, and other logics, e.g. Isabelle/ZF, behave differently. For the purpose of this blog post, I always mean Isabelle/HOL.
  2. Isabelle is a metalogical framework, and other logics, e.g. Isabelle/ZF, behave differently. For the purpose of this blog post, I always mean Isabelle/HOL.
  3. Let me know if you find such an n. Besides n = 0.
  4. Like fun, the constructions by datatype are not part of the logic, but create a type definition from more primitive notions that is isomorphic to the specified data type.

28 September 2017

Russell Coker: Process Monitoring

Since forking the Mon project to etbemon [1] I've been spending a lot of time working on the monitor scripts. Actually monitoring something is usually quite easy; deciding what to monitor tends to be the hard part. The process monitoring script ps.monitor is the one I'm about to redesign. Here are some of my ideas for monitoring processes. Please comment if you have any suggestions for how to do things better. For people who don't use mon, the monitor scripts return 0 if everything is OK and 1 if there's a problem, along with using stdout to display an error message. While I'm not aware of anyone hooking mon scripts into a different monitoring system, that's going to be easy to do. One thing I plan to work on in the future is interoperability between mon and other systems such as Nagios. Basic Monitoring
ps.monitor tor:1-1 master:1-2 auditd:1-1 cron:1-5 rsyslogd:1-1 dbus-daemon:1- sshd:1- watchdog:1-2
I'm currently planning some sort of rewrite of the process monitoring script. The current functionality is to have a list of process names on the command line with minimum and maximum numbers for the instances of the process in question. The above is a sample of the configuration of the monitor. There are some limitations to this: the master process in this instance refers to the main process of Postfix, but other daemons use the same process name (it's one of those names that's wrong because it's so obvious). One obvious solution to this is to give the option of specifying the full path so that /usr/lib/postfix/sbin/master can be differentiated from all the other programs named master. The next issue is processes that may run on behalf of multiple users. With sshd there is a single process to accept new connections running as root and a process running under the UID of each logged-in user. So the number of sshd processes running as root will be one greater than the number of root login sessions. This means that if a sysadmin logs in directly as root via ssh (which is controversial and not the topic of this post, merely something that people do which I have to support) and the master process then crashes (or the sysadmin stops it either accidentally or deliberately) there won't be an alert about the missing process. Of course the correct thing to do is to have a monitor talk to port 22 and look for the string "SSH-2.0-OpenSSH_". Sometimes there are multiple instances of a daemon running under different UIDs that need to be monitored separately. So obviously we need the ability to monitor processes by UID. In many cases process monitoring can be replaced by monitoring of service ports. So if something is listening on port 25 then it probably means that the Postfix master process is running, regardless of what other master processes there are. But for my use I find it handy to have multiple monitors: if I get a Jabber message about being unable to send mail to a server immediately followed by a Jabber message from that server saying that master isn't running, I don't need to fully wake up to know where the problem is.
SE Linux One feature that I want is monitoring SE Linux contexts of processes in the same way as monitoring UIDs. While I'm not interested in writing tests for other security systems I would be happy to include code that other people write. So whatever I do I want to make it flexible enough to work with multiple security systems.
Transient Processes Most daemons have a second process of the same name running during the startup process. This means if you monitor for exactly 1 instance of a process you may get an alert about 2 processes running when logrotate or something similar restarts the daemon. Also you may get an alert about 0 instances if the check happens to run at exactly the wrong time during the restart. My current way of dealing with this on my servers is to not alert until the second failure event, with the alertafter 2 directive. The failure_interval directive allows specifying the time between checks when the monitor is in a failed state; setting that to a low value means that waiting for a second failure result doesn't delay the notification much. To deal with this I've been thinking of making the ps.monitor script automatically check again after a specified delay. I think that solving the problem with a single parameter to the monitor script is better than using 2 configuration directives to mon to work around it.
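Returning to the basic min/max instance counts shown in the sample above, the following Python sketch illustrates the kind of check being described. It is a hypothetical illustration following the mon convention (exit status 0 or 1 plus a message on stdout), not the actual ps.monitor script.

#!/usr/bin/env python3
"""Hypothetical sketch of a mon-style process-count check (not the real ps.monitor)."""
import os
import sys
from collections import Counter

def running_processes():
    """Count running processes by command name, read from /proc/<pid>/comm."""
    counts = Counter()
    for pid in os.listdir("/proc"):
        if not pid.isdigit():
            continue
        try:
            with open(f"/proc/{pid}/comm") as f:
                counts[f.read().strip()] += 1
        except OSError:
            continue  # the process exited while we were looking
    return counts

def main(specs):
    """Each spec looks like "name:min-max", e.g. "cron:1-5" or "sshd:1-" (no maximum)."""
    counts = running_processes()
    problems = []
    for spec in specs:
        name, limits = spec.split(":")
        low, _, high = limits.partition("-")
        n = counts.get(name, 0)
        if n < int(low) or (high and n > int(high)):
            problems.append(f"{name}: {n} instances (want {limits})")
    if problems:
        print("; ".join(problems))   # mon convention: error message on stdout...
        return 1                     # ...and a non-zero exit status
    return 0

if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))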
CPU Use Mon currently has a loadavg.monitor script that checks the load average. But that won't catch the case of a single process using too much CPU time but not enough to raise the system load average. Also it won't catch the case of a CPU-hungry process going quiet (e.g. when the SETI@Home server goes down) while another process goes into an infinite loop. One way of addressing this would be to have the ps.monitor script have yet another configuration option to monitor CPU use, but this might get confusing. Another option would be to have a separate script that alerts on any process that uses more than a specified percentage of CPU time over its lifetime or over the last few seconds, unless it's in a whitelist of processes and users who are exempt from such checks. Probably every regular user would be exempt from such checks because you never know when they will run a file compression program. Also there is a short list of daemons that are excluded (like BOINC) and system processes (like gzip which is run from several cron jobs).
Monitoring for Exclusion A common programming mistake is to call setuid() before setgid(), which means that the program doesn't have permission to call setgid(). If return codes aren't checked (and people who make such rookie mistakes tend not to check return codes) then the process keeps elevated permissions. Checking for processes running as GID 0 but not UID 0 would be handy. As an aside, a quick examination of a Debian/Testing workstation didn't show any obvious way that a process with GID 0 could gain elevated privileges, but that could change with one chmod 770 command. On a SE Linux system there should be only one process running with the domain init_t. Currently that doesn't happen in Stretch systems running daemons such as mysqld and tor, due to policy not matching the recent functionality of systemd as requested by daemon service files. Such issues will keep occurring so we need automated tests for them. Automated tests for configuration errors that might impact system security are a bigger issue; I'll probably write a separate blog post about it.
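The "GID 0 but not UID 0" check mentioned above could be sketched roughly as follows, by reading the real UID and GID from /proc/<pid>/status. This is a hypothetical illustration, not code from etbemon.

#!/usr/bin/env python3
"""Hypothetical sketch: flag processes running with GID 0 but not UID 0 (not code from etbemon)."""
import os
import sys

def gid0_not_uid0():
    hits = []
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/status") as f:
                fields = dict(line.split(":\t", 1) for line in f if ":\t" in line)
            uid = int(fields["Uid"].split()[0])   # real UID
            gid = int(fields["Gid"].split()[0])   # real GID
        except (OSError, KeyError, ValueError):
            continue  # the process vanished, or the entry was unparsable
        if gid == 0 and uid != 0:
            hits.append(pid)
    return hits

if __name__ == "__main__":
    suspicious = gid0_not_uid0()
    if suspicious:
        print("processes with GID 0 but not UID 0: " + ", ".join(suspicious))
        sys.exit(1)
    sys.exit(0)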

17 August 2017

Shirish Agarwal: Composers are not given due recognition

Update: Some youtube links are not viewable or even seen on planet.debian.org. It seems p.d.o. tries its best to remove external links, sorry for the breakage. Beware, some youtube links will be shared in this entry; sorry, I couldn't find a better/easier media platform to work with. If anyone knows any other platform or wants to suggest one, feel free to either mail me or let me know in the comments. I want to start today's sharing with a picture of Ganesha I saw today. It is and was public art, hence I am sharing it without an issue. Sketch of Ganesha/Ganapati This is the start of the festivities season in India, and Ganesha or Ganpati is looked upon as a good omen in India. The festival of Ganesh Chaturthi starts on the 25th of August and is a sight to behold. Just like Rio has its carnival, Ganesh Chaturthi is also a carnival. We also have parades where people come with Pandals (or temporary structures). The mythology says he has a sweet tooth (hence a lot of distribution of sweets, especially modak) and that for anything which might be troubling people, he creates solutions. Here is one video of how people celebrate his immersion in India. This is from my home town a few years ago; every year the madness and the celebrations become more and more. People from far off come to see how we celebrate and see how different people make their Pandals. While some are with music, others are with social messages. Usually people start going to see these structures after dusk and return home way after midnight or in the early morning. I hope to do this endeavour after many years. One is drunk from hearing all sorts of different kinds of music, decoration, messages; a feast and a strain to all the senses. Ganesha immersion celebrations https://www.youtube.com/watch?v=hjWfpGUryho If one is interested one can find more info at https://en.wikipedia.org/wiki/Ganesh_Chaturthi After quite a bit of time, I wrote an article about various FOSS internships which I knew of besides GSOC over the years. I finally penned them down at https://itsfoss.com/best-open-source-internships/ Interestingly, I was amazed to see that all FOSS U.S. projects (outside of GSOC) are for students who are either living or studying in the U.S. and have a student work visa (which, from private discussions, I came to know is a lot harder to get nowadays than before). Except for the National Science Foundation (NSF), which probably has U.S. defence relations and hence might be sensitive, I fail to understand other institutes' preferences for only getting people from the U.S. and hence having a smaller talent pool of people. This also affects the growth of the projects themselves. Just think how limited Debian would have been if it had decided to only have people from any one community develop it. I don't know if this is due to the present President Trump or whether these policies had been there before. It would be nice and interesting if people in the know can share. What has also been interesting to watch is Mr. Trump blaming low-cost manufacturing centres like India and China when, as far as I recall, a lot of manufacturing, specifically automobile manufacturing, was shifted out of the U.S. to Ireland and other places years before, which are relatively high-cost places (at least compared to India). I *believe* the change was as early as the 1980s itself, when India was insulated and had a limited market for everything (similar to Russian communism as shown in popular media, but not so bad). Interestingly, it took almost a month for perl 5.26 to make the transition smoothly.
It took quite a bit of time for all the components to work together and be installable. Also saw this a few days back http://fortune.com/2017/04/12/auto-industry-decline/ While Tesla is expensive even by American standards, the idea of fewer parts, less complexity and hence lower costs to use and maintain is good. I do hope that he and his team, or any of the competitors, do overcome the significant challenges. Any significant improvement in battery technology is bound to have a huge impact on almost everything that is used in the 21st century. Two recent articles tell me the future may become the present very quickly. https://www.purdue.edu/newsroom/releases/2017/Q2/instantly-rechargeable-battery-could-change-the-future-of-electric-and-hybrid-automobiles.html Toyota could finally start mass producing electric cars thanks to China I do hope to see EVs being prevalent before the next decade is over, otherwise we don't have any hope due to climate change. As for my health, I am much better than before. Just to share some stats: before my illness, for lack of a better word, I was 120 kgs.; when I was kept in the hospital for about 2-2.5 weeks I came down to 95 kgs., and now I am back up to 108 kgs. I do go exercising every other day and am trying to get back my strength and stamina, and to increase a bit of both. Doctors have given me another 4-5 months, after which a brain scan will reveal whether there are any remaining blood clots in the brain or not. Lastly, while it has become somewhat of a sensitive issue to love Muslims or to talk about their work in any field in the current political climate, there are 4-5 music pieces I listen to whenever I can, especially before going to bed. While almost all the pieces have been sung and written by Muslims, sadly I don't know who the composers of these beautiful songs are. While it is much easier to get the names of the singer and the lyricist, one of the more important roles in my view is the composer and/or music arranger. Without them, the songs would not have the same haunting quality that the songs have. While I have been lucky to find the names of the composer/music arranger for the pieces below, this is not the case if and when the songs come on television. I do remember that in old times, at least on the radio, they used to mention who had given the music as well; I don't know about modern times. I am sharing the songs, and hopefully will also share the translations if I find them on the web; please see the lyrics. The numbering is for convenience only and I am torn about which of these 4-5 songs is the best. Just to share, these are all Sufi love songs except the last one which I am sharing. 1. Lyrical song https://www.youtube.com/watch?v=ehqN6oTpmb8 Translation with video of song http://www.bollynook.com/en/lyrics/6443/aaj-din-chadheya/ While there probably are stories with each song, I was lucky to find the story about this one. The lyrics of the song are actually by a lovelorn Punjabi poet who writes in memory of his beloved, whom he could not marry, and he pens them while standing in line for his liquor. The story goes on that he marries a girl later in life who bears a resemblance to his beloved, whom he couldn't forget till his dying day. 2. Lyrical song https://www.youtube.com/watch?v=uTC_2c83qn0 The same song has been sung by different people and I love them all the more for it. Another video https://www.youtube.com/watch?v=3G7Qg4LJ7WE Another video https://www.youtube.com/watch?v=kOsvNuR3m5Y Translation http://www.bollynook.com/en/lyrics/10703/o-re-piya/ 3.
Lyrical song https://www.youtube.com/watch?v=qG7Kms_YA5Q The translation http://www.filmyquotes.com/songs/885 The translation of the song is a bit crude, but then translations are supposed to be crude. Anyways, the above song is what would be called a perfect Sufi song. I hope people enjoy the longing and the silence which follows this piece. Another classic one 4. Lyrical song https://www.youtube.com/watch?v=Ube5XhN_lpM English translation http://www.ardhamy.com/song/aye-dil-e-naadan The song is from the movie Razia Sultana, which was a flop, as the movie was about race and controversial then, as it probably would be now. As seen in the other songs of the same genre, it has the strands of longing and loneliness seen in the ones above. 5. Lyrical song https://www.youtube.com/watch?v=tv242qOnHJA This one is not a Sufi song but I love all the women and the girls and the way they enhanced the song. I don't know how much they must have practised, as it's a very fast and peppy song and doesn't give the singer time to breathe, except for that one section which has a bit of Carnatic music. At the very end I would like to share http://www.globalrhythm.net/ I have found some interesting sounds on the site. Hope the site enriches you as well. FWIW I have no links with the site except as somebody who likes to diversify his music listening. Lastly, for a long period of time, I had been hearing the criticism, especially for FOSS games, that they don't have AAA-quality assets. Recently I came across a game called Starship Theory (sadly it's only for MS-Windows). Game video https://www.youtube.com/watch?v=imaL2pjNURg You look at the game and see the number of videos the guy has made. What FOSS game developers can learn from this is that you don't need high-end 2.5/3D models; clipart will do, but you need depth in gameplay, which can make FOSS games popular and also earn a pretty bundle. I do hope some FOSS game upstream developers take note and use that game's inspiration to bring more depth. That doesn't mean games like 0ad are not liked by people, but it takes a huge amount of time and resources. 0ad video https://www.youtube.com/watch?v=DHx5XBtypcQ Hope you have a good time with all the ideas, anecdotes and videos I shared above.
Filed under: Miscellenous Tagged: #FOSS Internships, #Ganesh Chaturthi, #Ganeshji, #planet-debian, #Sufi Bollywood Music, FOSS, FOSS games, politics

13 August 2017

Enrico Zini: Consensually doing things together?

On 2017-08-06 I gave a talk at DebConf17 in Montreal titled "Consensually doing things together?" (video). Here are the talk notes. Abstract At DebConf Heidelberg I talked about how Free Software has a lot to do with consensually doing things together. Is that always true, at least in Debian? I'd like to explore what motivates one to start a project and what motivates one to keep maintaining it. What are the energy levels required to manage bits of Debian as the project keeps growing? How easy is it to say no? Whether we have roles in Debian that require irreplaceable heroes to keep them going. What could be done to make life easier for heroes, easy enough that mere mortals can help, or take their place. Unhappy is the community that needs heroes, and unhappy is the community that needs martyrs. I'd like to try and make sure that now, or in the very near future, Debian is not such an unhappy community. Consensually doing things together I gave a talk in Heidelberg. Valhalla made stickers. Debian France distributed many of them. There's one on my laptop. Which reminds me of what we ought to be doing. Of what we have a chance to do, if we play our cards right. I'm going to talk about relationships. Consensual relationships. Relationships in short. Nonconsensual relationships are usually called abuse. I like to see Debian as a relationship between multiple people. And I'd like it to be a consensual one. I'd like it not to be abuse. Consent From Wikipedia:
In Canada "consent means the voluntary agreement of the complainant to engage in sexual activity" without abuse or exploitation of "trust, power or authority", coercion or threats.[7] Consent can also be revoked at any moment.[8] There are 3 pillars often included in the description of sexual consent, or "the way we let others know what we're up for, be it a good-night kiss or the moments leading up to sex." They are:
  • Knowing exactly what and how much I'm agreeing to
  • Expressing my intent to participate
  • Deciding freely and voluntarily to participate[20]
Saying "I've decided I won't do laundry anymore" when the other partner is tired, or busy doing things. Is different than saying "I've decided I won't do laundry anymore" when the other partner has a chance to say "why? tell me more" and take part in negotiation. Resources: Relationships Debian is the Universal Operating System. Debian is made and maintained by people. The long term health of debian is a consequence of the long term health of the relationship between Debian contributors. Debian doesn't need to be technically perfect, it needs to be socially healthy. Technical problems can be fixed by a healty community. graph showing relationship between avoidance, accomodation, compromise, competition, collaboration The Thomas-Kilmann Conflict Mode Instrument: source png. Motivations Quick poll: What are your motivations to be in a relationship? Which of those motivations are healthy/unhealthy? "Galadriel" (noun, by Francesca Ciceri): a task you have to do otherwise Sauron takes over Middle Earth See: http://blog.zouish.org/nonupdd/#/22/1 What motivates me to start a project or pick one up? What motivates me to keep maintaning a project? What motivates you? What's an example of a sustainable motivation? Is it really all consensual in Debian? Energy Energy that thing which is measured in spoons. The metaphore comes from people suffering with chronic health issues:
"Spoons" are a visual representation used as a unit of measure used to quantify how much energy a person has throughout a given day. Each activity requires a given number of spoons, which will only be replaced as the person "recharges" through rest. A person who runs out of spoons has no choice but to rest until their spoons are replenished.
For example, in Debian, I could spend: What is one person capable of doing? Have reasonable expectations, on others: Have reasonable expectations, on yourself: Debian is a shared responsibility When spoons are limited, what takes more energy tends not to get done As the project grows, project-wide tasks become harder Are they still humanly achievable? I don't want Debian to have positions that require hero-types to fill them Dictatorship of who has more spoons: Perfectionism You are in a relationship that is just perfect. All your friends look up to you. You give people relationship advice. You are safe in knowing that You Are Doing It Right. Then one day you have an argument in public. You don't just have to deal with the argument, but also with your reputation and self-perception shattering. One thing I hate about Debian: consistent technical excellence. I don't want to be required to always be right. One of my favourite moments in the history of Debian is the openssl bug. Debian doesn't need to be technically perfect, it needs to be socially healthy, technical problems can be fixed. I want to remove perfectionism from Debian: if we discover we've been wrong all the time in something important, it's not the end of Debian, it's the beginning of an improved Debian. Too good to be true There comes a point in most people's dating experience where one learns that when some things feel too good to be true, they might indeed be. There are people who cannot say no: There are people who cannot take a no: Note the diversity statement: it's not a problem to have one of those (and many other) tendencies, as long as one manages to keep interacting constructively with the rest of the community. Also, it is important to be aware of these patterns, to be able to compensate for one's own tendencies. What happens when an avoidant person meets a narcissistic person, and they are both unaware of the risks? Resources: Note: there are problems with the way these resources are framed: Red flag / green flag http://pervocracy.blogspot.ca/2012/07/green-flags.html Ask for examples of red/green flags in Debian. Green flags: Red flags: Apologies / Dealing with issues I don't see the usefulness of apologies that are about accepting blame, or making a person stop complaining. I see apologies as opportunities to understand the problem I caused, help fix it, and possibly find ways of avoiding causing that problem again in the future. A Better Way to Say Sorry lists a 4-step process, which is basically what we already do in bug reports: 1. Try to understand and reproduce the exact problem the person had. 2. Try to find the cause of the issue. 3. Try to find a solution for the issue. 4. Verify with the reporter that the solution does indeed fix the issue. This is just to say
My software ate
the files
that were in
your home directory and which
you were probably
needing
for work
Forgive me
it was so quick to write
without tests
and it worked so well for me
(inspired by a 1934 poem by William Carlos Williams) Don't be afraid to fail Don't be afraid to fail or drop the ball. I think that anything that has a label attached of "if you don't do it, nobody will", shouldn't fall on anybody's shoulders and should be shared no matter what. Shared or dropped. Share the responsibility for a healthy relationship Don't expect that the more experienced mates will take care of everything. In a project with active people counted by the thousand, it's unlikely that harassment isn't happening. Is anyone writing anti-harassment? Do we have stats? Is having an email address and a CoC giving us a false sense of security?
When you get involved in a new community, such as Debian, find out early where, if that happens, you can find support, understanding, and help to make it stop. If you cannot find any, or if the only thing you can find is people who say "it never happens here", consider whether you really want to be in that community.
(from http://www.enricozini.org/blog/2016/debian/you-ll-thank-me-later/)
There are some nice people in the world. I mean nice people, the sort I couldn't describe myself as. People who are friends with everyone, who are somehow never involved in any argument, who seem content to spend their time drawing pictures of bumblebees on flowers that make everyone happy. Those people are great to have around. You want to hold onto them as much as you can. But people only have so much tolerance for jerkiness, and really nice people often have less tolerance than the rest of us. The trouble with not ejecting a jerk, whether their shenanigans are deliberate or incidental, is that you allow the average jerkiness of the community to rise slightly. The higher it goes, the more likely it is that those really nice people will come around less often, or stop coming around at all. That, in turn, makes the average jerkiness rise even more, which teaches the original jerk that their behavior is acceptable and makes your community more appealing to other jerks. Meanwhile, more people at the nice end of the scale are drifting away.
(from https://eev.ee/blog/2016/07/22/on-a-technicality/) Give people freedom If someone tries something in Debian, try to acknowledge and accept their work. You can give feedback on what they are doing, and try not to stand in their way, unless what they are doing is actually hurting you. In that case, try to collaborate, so that you all can get what you need. It's ok if you don't like everything that they are doing. I personally don't care if people tell me I'm good when I do something, I perceive it a bit like "good boy" or "good dog". I rather prefer it if people show an interest, say "that looks useful" or "how does it work?" or "what do you need to deploy this?" Acknowledge that I've done something. I don't care if it's especially liked, give me the freedom to keep doing it. Don't give me rewards, give me space and dignity. Rather than feeding my ego, feed my freedom, and feed my possibility to create.

28 July 2017

Joachim Breitner: How is coinduction the dual of induction?

Earlier today, I demonstrated how to work with coinduction in the theorem provers Isabelle, Coq and Agda, with a very simple example. This reminded me of a discussion I had in Karlsruhe with my then colleague Denis Lohner: If coinduction is the dual of induction, why do the induction principles look so different? I like what we observed there, so I'd like to share this. The following is mostly based on my naive understanding of coinduction based on what I observe in the implementation in Isabelle. I am sure that a different, more categorical presentation of datatypes (as initial resp. terminal objects in some category of algebras) makes the duality more obvious, but that does not necessarily help the working Isabelle user who wants to make sense of coinduction.

Inductive lists I will use the usual polymorphic list data type as an example. So on the one hand, we have normal, finite inductive lists:
datatype 'a list = nil | cons (hd : 'a) (tl : "'a list")
with the well-known induction principle that many of my readers know by heart (syntax slightly un-isabellized):
P nil ⟹ (∀x xs. P xs ⟶ P (cons x xs)) ⟹ (∀xs. P xs)

Coinductive lists In contrast, if we define our lists coinductively to get possibly infinite, Haskell-style lists, by writing
codatatype 'a llist = lnil | lcons (hd : 'a)  (tl : "'a llist")
we get the following coinduction principle:
(∀xs ys.
    R xs ys ⟶ (xs = lnil) = (ys = lnil) ∧
               (xs ≠ lnil ⟶ ys ≠ lnil ⟶
	         hd xs = hd ys ∧ R (tl xs) (tl ys))) ⟹
  (∀xs ys. R xs ys ⟶ xs = ys)
This is less scary than it looks at first. It tells you: if you give me a relation R between lists which implies that either both lists are empty or both lists are nonempty, and furthermore, if both are non-empty, that they have the same head and tails related by R, then any two lists related by R are actually equal. If you think of the infinite list as a series of states of a computer program, then this is nothing else than a bisimulation. So we have two proof principles, both of which make intuitive sense. But how are they related? They look very different! In one, we have a predicate P, in the other a relation R, to point out just one difference.

Relation induction To see how they are dual to each other, we have to recognize that both these theorems are actually specializations of a more general (co)induction principle. The datatype declaration automatically creates a relator:
rel_list :: ('a ⇒ 'b ⇒ bool) ⇒ 'a list ⇒ 'b list ⇒ bool
The definition of rel_list R xs ys is that xs and ys have the same shape (i.e. length), and that the corresponding elements are pairwise related by R. You might have defined this relation yourself at some time, and if so, you probably introduced it as an inductive predicate. So it is not surprising that the following induction principle characterizes this relation:
Q nil nil ⟹
(∀x xs y ys. R x y ⟶ Q xs ys ⟶ Q (cons x xs) (cons y ys)) ⟹
(∀xs ys. rel_list R xs ys ⟶ Q xs ys)
Note how similar this lemma is in shape to the normal induction for lists above! And indeed, if we choose Q xs ys ⟷ (P xs ∧ xs = ys) and R x y ⟷ (x = y), then we obtain exactly that. In that sense, the relation induction is a generalization of the normal induction.

Relation coinduction The same observation can be made in the coinductive world. Here, as well, the codatatype declaration introduces a function
rel_llist :: ('a ⇒ 'b ⇒ bool) ⇒ 'a llist ⇒ 'b llist ⇒ bool
which relates lists of the same shape with related elements, except that this one also relates infinite lists, and therefore is a coinductive relation. The corresponding rule for proof by coinduction is not surprising and should remind you of bisimulation, too:
(∀xs ys.
    R xs ys ⟶ (xs = lnil) = (ys = lnil) ∧
              (xs ≠ lnil ⟶ ys ≠ lnil ⟶
	        Q (hd xs) (hd ys) ∧ R (tl xs) (tl ys))) ⟹
(∀xs ys. R xs ys ⟶ rel_llist Q xs ys)
It is even more obvious that this is a generalization of the standard coinduction principle shown above: Just instantiate Q with equality, which turns rel_llist Q into equality on the lists, and you have the theorem above.

The duality With our induction and coinduction principle generalized to relations, suddenly a duality emerges: if you turn around the implication in the conclusion of one you get the conclusion of the other one. This is an example of "cosomething is something with arrows reversed". But what about the premise(s) of the rules? What happens if we turn around the arrow here? Although slightly less immediate, it turns out that they are the same as well. To see that, we start with the premise of the coinduction rule, reverse the implication and then show that to be equivalent to the two premises of the induction rule:
(∀xs ys.
    R xs ys ⟵ (xs = lnil) = (ys = lnil) ∧
               (xs ≠ lnil ⟶ ys ≠ lnil ⟶
	         Q (hd xs) (hd ys) ∧ R (tl xs) (tl ys)))
= ⟨ case analysis (the other two cases are vacuously true) ⟩
  (∀xs ys.
    xs = lnil ⟶ ys = lnil ⟶
    R xs ys ⟵ (xs = lnil) = (ys = lnil) ∧
              (xs ≠ lnil ⟶ ys ≠ lnil ⟶
	        Q (hd xs) (hd ys) ∧ R (tl xs) (tl ys)))
∧ (∀xs ys.
    xs ≠ lnil ⟶ ys ≠ lnil ⟶
    R xs ys ⟵ (xs = lnil) = (ys = lnil) ∧
              (xs ≠ lnil ⟶ ys ≠ lnil ⟶
	        Q (hd xs) (hd ys) ∧ R (tl xs) (tl ys)))
= ⟨ simplification ⟩
  (∀xs ys. xs = lnil ⟶ ys = lnil ⟶ R xs ys)
∧ (∀x xs y ys. R (cons x xs) (cons y ys) ⟵ (Q x y ∧ R xs ys))
= ⟨ more rewriting ⟩
  R nil nil
∧ (∀x xs y ys. Q x y ⟶ R xs ys ⟶ R (cons x xs) (cons y ys))

Conclusion The coinduction rule is not the direct dual of the induction rule, but both are specializations of more general, relational proof methods, where the duality is clearly present. More generally, this little excursion shows that it is often beneficial to think of types less as sets, and more as relations; this way of thinking is surprisingly fruitful, and led to proofs of parametricity and free theorems and other nice things.

27 July 2017

Joachim Breitner: Coinduction in Coq and Isabelle

The DeepSpec Summer School is almost over, and I have had a few good discussions. One revolved around coinduction: What is it, how does it differ from induction, and how do you actually prove something. In the course of the discussion, I came up with a very simple coinductive exercise, and solved it both in Coq and Isabelle

The task Define the extended natural numbers coinductively. Define the min function and the ≤ relation. Show that min(n, m) ≤ n holds.

Coq The definitions are straight forward. Note that in Coq, we use the same command to define a coinductive data type and a coinductively defined relation:
CoInductive ENat :=
  | N : ENat
  | S : ENat -> ENat.
CoFixpoint min (n : ENat) (m : ENat)
  := match n, m with | S n', S m' => S (min n' m')
                     | _, _       => N end.
CoInductive le : ENat -> ENat -> Prop :=
  | leN : forall m, le N m
  | leS : forall n m, le n m -> le (S n) (S m).
The lemma is specified as
Lemma min_le: forall n m, le (min n m) n.
and the proof method of choice to show that some coinductive relation holds, is cofix. One would wish that the following proof would work:
Lemma min_le: forall n m, le (min n m) n.
Proof.
  cofix.
  destruct n, m.
  * apply leN.
  * apply leN.
  * apply leN.
  * apply leS.
    apply min_le.
Qed.
but we get the error message
Error:
In environment
min_le : forall n m : ENat, le (min n m) n
Unable to unify "le N ?M170" with "le (min N N) N".
Effectively, as Coq is trying to figure out whether our proof is correct, i.e. type-checks, it stumbled on the equation min N N = N, and like a kid scared of coinduction, it did not dare to run the min function. The reason it does not just run a CoFixpoint is that doing so too daringly might simply not terminate. So, as Adam explains in a chapter of his book, Coq reduces a cofixpoint only when it is the scrutinee of a match statement. So we need to get a match statement in place. We can do so with a helper function:
Definition evalN (n : ENat) :=
  match n with
  | N => N
  | S n => S n
  end.
Lemma evalN_eq : forall n, evalN n = n.
Proof. intros. destruct n; reflexivity. Qed.
This function does not really do anything besides nudging Coq to actually evaluate its argument to a constructor (N or S _). We can use it in the proof to guide Coq, and the following goes through:
Lemma min_le: forall n m, le (min n m) n.
Proof.
  cofix.
  destruct n, m; rewrite <- evalN_eq with (n := min _ _).
  * apply leN.
  * apply leN.
  * apply leN.
  * apply leS.
    apply min_le.
Qed.

Isabelle In Isabelle, definitions and types are very different things, so we use different commands to define ENat and le:
theory ENat imports Main begin
codatatype ENat = N | S ENat
primcorec min where
   "min n m = (case n of
       N ⇒ N
     | S n' ⇒ (case m of
         N ⇒ N
       | S m' ⇒ S (min n' m')))"
coinductive le where
  leN: "le N m"
| leS: "le n m ⟹ le (S n) (S m)"
There are actually many ways of defining min; I chose the one most similar to the one above. For more details, see the corec tutorial. Now to the proof:
lemma min_le: "le (min n m) n"
proof (coinduction arbitrary: n m)
  case le
  show ?case
  proof(cases n)
    case N then show ?thesis by simp
  next
    case (S n') then show ?thesis
    proof(cases m)
      case N then show ?thesis by simp
    next
      case (S m') with ‹n = _› show ?thesis
        unfolding min.code[where n = n and m = m]
        by auto
    qed
  qed
qed
The coinduction proof method produces this goal:
proof (state)
goal (1 subgoal):
 1. ⋀n m. (∃m'. min n m = N ∧ n = m') ∨
          (∃n' m'.
               min n m = S n' ∧
               n = S m' ∧
               ((∃n m. n' = min n m ∧ m' = n) ∨ le n' m'))
I chose to spell the proof out in the Isar proof language, where the outermost proof structure is done relatively explicitly, and I proceed by case analysis mimicking the min function definition. In the cases where one argument of min is N, Isabelle's simplifier (a term rewriting tactic, so to say) can solve the goal automatically. This is because the primcorec command produces a bunch of lemmas, one of which states n = N ∨ m = N ⟹ min n m = N. In the other case, we need to help Isabelle a bit to reduce the call to min (S n) (S m) using the unfolding method, where min.code contains exactly the equation that we used to specify min. Using just unfolding min.code would send this method into a loop, so we restrict it to the concrete arguments n and m. Then auto can solve the remaining goal (despite all the existential quantifiers).

Summary Both theorem provers are able to prove the desired result. To me it seems that it is slightly more convenient in Isabelle, because a lot of Coq infrastructure relies on the type checker being able to effectively evaluate expressions, which is tricky with cofixpoints. Evaluation plays a much less central role in Isabelle, where rewriting is the crucial technique, and while one still cannot simply throw min.code into the simpset, working with objects that do not evaluate easily or completely is less strange.

Agda I was challenged to do it in Agda. Here it is:
module ENat where
open import Coinduction
data ENat : Set where
  N : ENat
  S : ∞ ENat → ENat
min : ENat → ENat → ENat
min (S n') (S m') = S (♯ (min (♭ n') (♭ m')))
min _ _ = N
data le : ENat → ENat → Set where
  leN : ∀ {m} → le N m
  leS : ∀ {n m} → ∞ (le (♭ n) (♭ m)) → le (S n) (S m)
min_le : ∀ {n m} → le (min n m) n
min_le {S n'} {S m'} = leS (♯ min_le)
min_le {N}    {S m'} = leN
min_le {S n'} {N}    = leN
min_le {N}    {N}    = leN
I will refrain from commenting on it, because I do not really know what I have been doing here, but it typechecks, and I refer you to the official documentation on coinduction in Agda. But let me note that I wrote this using plain inductive types and recursion, and added ∞, ♯ and ♭ until it worked.

14 June 2017

Antoine Beaupré: Alioth moving toward pagure

Since 2003, the Debian project has been running a server called Alioth to host source code version control systems. The server will hit the end of life of the Debian LTS release (Wheezy) next year; that deadline raised some questions regarding the plans for the server over the coming years. Naturally, that led to a discussion regarding possible replacements. In response, the current Alioth maintainer, Alexander Wirt, announced a sprint to migrate to pagure, a free-software "Git-centered forge" written in Python for the Fedora project, which LWN covered last year. Alioth currently runs FusionForge, previously known as GForge, which is the free-software fork of the SourceForge code base, created when that service closed its source in 2001. Alioth hosts source code repositories, mainly Git and Subversion (SVN) and, like other "forge" sites, also offers forums, issue trackers, and mailing list services. While other alternatives are still being evaluated, a consensus has emerged on a migration plan from FusionForge to a more modern and minimal platform based on pagure.

Why not GitLab? While this may come as a surprise to some who would expect Debian to use the more popular GitLab project, the discussion and decision actually took place a while back. During a lengthy debate last year, Debian contributors discussed the relative merits of different code-hosting platforms, following the initiative of Debian Developer "Pirate" Praveen Arimbrathodiyil to package GitLab for Debian. At that time, Praveen also got a public GitLab instance running for Debian (gitlab.debian.net), which was sponsored by GitLab B.V. the commercial entity behind the GitLab project. The sponsorship was originally offered in 2015 by the GitLab CEO, presumably to counter a possible move to GitHub, as there was a discussion about creating a GitHub Organization for Debian at the time. The deployment of a Debian-specific GitLab instance then raised the question of the overlap with the already existing git.debian.org service, which is backed by Alioth's FusionForge deployment. It then seemed natural that the new GitLab instance would replace Alioth. But when Praveen directly proposed to move to GitLab, Wirt stepped in and explained that a migration plan was already in progress. The plan then was to migrate to a simpler gitolite-based setup, a decision that was apparently made in corridor discussions surrounding the Alioth Git replacement BoF held during Debconf 2015. The first objection raised by Wirt against GitLab was its "huge number of dependencies". Another issue Wirt identified was the "open core / enterprise model", preferring a "real open source system", an opinion which seems shared by other participants on the mailing list. Wirt backed his concerns with an hypothetical example:
Debian needs feature X but it is already in the enterprise version. We make a patch and, for commercial reasons, it never gets merged (they already sell it in the enterprise version). Which means we will have to fork the software and keep those patches forever. Been there done that. For me, that isn't acceptable.
This concern was further deepened when GitLab's Director of Strategic Partnerships, Eliran Mesika, pointed to the company's stewardship policy, which explains how GitLab decides which features end up in the proprietary version. Praveen pointed out that:
[...] basically it boils down to features that they consider important for organizations with less than 100 developers may get accepted. I see that as a red flag for a big community like debian.
Since there are over 600 Debian Developers, the community seems to fall within the needs of "enterprise" users. The features the Debian community may need are, by definition, appropriate only to the "Enterprise Edition" (GitLab EE), the non-free version, and are therefore unlikely to end up in the "Community Edition" (GitLab CE), the free-software version. Interestingly, Mesika asked for clarification on which features were missing, explaining that GitLab is actually open to adding features to GitLab CE. The response from Debian Developer Holger Levsen was categorical: "It's not about a specific patch. Free GitLab and we can talk again." But beyond the practical and ethical concerns, some specific features Debian needs are currently only in GitLab EE. For example, debian.org systems use LDAP for authentication, which would obviously be useful in a GitLab deployment; GitLab CE supports basic LDAP authentication, but advanced features, like group or SSH-key synchronization, are only available in GitLab EE. Wirt also expressed concern about the Contributor License Agreement that GitLab B.V. requires contributors to sign when they send patches, which forces users to allow the release of their code under a non-free license. The debate then went on, going through an exhaustive inventory of different free-software alternatives:
  • GitLab, a Ruby-based GitHub replacement, dual-licensed MIT/Commercial
  • Gogs, Go, MIT
  • Gitblit, Java, Apache-licensed
  • Kallithea, in Python, also supports Mercurial, GPLv3
  • and finally, pagure, also written in Python, GPLv2
A feature comparison between each project was created in the Debian wiki as well. In the end, however, Praveen gave up on replacing Alioth with GitLab because of the controversy and moved on to support the pagure migration, which resolved the discussion in July 2016. More recently, Wirt admitted in an IRC conversation that "on the technical side I like GitLab a lot more than pagure" and that "as a user, GitLab is much nicer than pagure and it has those nice CI [continuous integration] features". However, as he explained in his blog "GitLab is Opencore, [and] that it is not entirely opensource. I don't think we should use software licensed under such a model for one of our core services" which leaves pagure as the only stable candidate. Other candidates were excluded on technical grounds, according to Wirt: Gogs "doesn't scale well" and a quick security check didn't yield satisfactory results; "Gitblit is Java" and Kallithea doesn't have support for accessing repositories over SSH (although there is a pending pull request to add the feature). In an email interview, Sid Sijbrandij, CEO of GitLab, did say that "we want to make sure that our open source edition can be used by open source projects". He gave examples of features liberated following requests by the community, such as branded login pages for the VLC project and GitLab Pages after popular demand. He stressed that "There are no artificial limits in our open source edition and some organizations use it with more than 20.000 users." So if the concern of the Debian community is that features may be missing from GitLab CE, there is definitely an opening from GitLab to add those features. If, however, the concern is purely ethical, it's hard to see how an agreement could be reached. As Sijbrandij put it:
On the mailinglist it seemed that some Debian maintainers do not agree with our open core business model and demand that there is no proprietary version. We respect that position but we don't think we can compete with the purely proprietary software like GitHub with this model.

Working toward a pagure migration The issue of Alioth maintenance came up again last month when Boyuan Yang asked what would happen to Alioth when support for Debian LTS (Wheezy) ends next year. Wirt brought up the pagure migration proposal and the community tried to make a plan for the migration. One of the issues raised was the question of the non-Git repositories hosted on Alioth, as pagure, like GitLab, only supports Git. Indeed, Ben Hutchings calculated that while 90% (~19,000) of the repositories currently on Alioth are Git, there are 2,400 SVN repositories and a handful of Mercurial, Bazaar (bzr), Darcs, Arch, and even CVS repositories. As part of an informal survey, however, most packaging teams explained they either had already migrated away from SVN to Git or were in the process of doing so. The largest CVS user, the web site team, also explained it was progressively migrating to Git. Mattia Rizzolo then proposed that older repository services like SVN could continue running even if FusionForge goes down, as FusionForge is, after all, just a web interface to manage those back-end services. Repository creation would be disabled, but older repositories would stay operational until they migrate to Git. This would, effectively, mean the end of non-Git repository support for new projects in the Debian community, at least officially. Another issue is the creation of a Debian package for pagure. Ironically, while Praveen and other Debian maintainers have been working for 5 years to package GitLab for Debian, pagure isn't packaged yet. Antonio Terceiro, another Debian Developer, explained this isn't actually a large problem for debian.org services: "note that DSA [Debian System Administrator team] does not need/want the service software itself packaged, only its dependencies". Indeed, for Debian-specific code bases like ci.debian.net or tracker.debian.org, it may not make sense to have the overhead of maintaining Debian packages since those tools have limited use outside of the Debian project directly. While Debian derivatives and other distributions could reuse them, what usually happens is that other distributions roll their own software, like Ubuntu did with the Launchpad project. Still, Paul Wise, a member of the DSA team, reasoned that it was better, in the long term, to have Debian packages for debian.org services:
Personally I'm leaning towards the feeling that all configuration, code and dependencies for Debian services should be packaged and subjected to the usual Debian QA activities but I acknowledge that the current archive setup (testing migration plus backporting etc) doesn't necessarily make this easy.
Wise did say that "DSA doesn't have any hard rules/policy written down, just evaluation on a case-by-case basis" which probably means that pagure packaging will not be a blocker for deployment. The last pending issue is the question of the mailing lists hosted on Alioth, as pagure doesn't offer mailing list management (nor does GitLab). In fact, there are three different mailing list services for the Debian project. Wirt, with his "list-master hat" on, explained that the main mailing list service is "not really suited as a self-service" and expressed concern at the idea of migrating the large number of mailing lists hosted on Alioth. Indeed, there are around 1,400 lists on Alioth while the main service has a set of 300 lists selected by the list masters. No solution for those mailing lists was found at the time of this writing. In the end, it seems like the Debian project has chosen pagure, the simpler, less featureful, but also less controversial, solution and will use the same hosting software as their fellow Linux distribution, Fedora. Wirt is also considering using FreeIPA for account management on top of pagure. The plan is to migrate away from FusionForge one bit at a time, and pagure is the solution for the first step: the Git repositories. Lists, other repositories, and additional features of FusionForge will be dealt with later on, but Wirt expects a plan to come out of the upcoming sprint. It will also be interesting to see how the interoperability promises of pagure will play out in the Debian world. Even though the federation features of pagure are still at the early stages, one can already clone issues and pull requests as Git repositories, which allows for a crude federation mechanism. In any case, given the long history and the wide variety of workflows in the Debian project, it is unlikely that a single tool will solve all problems. Alioth itself has significant overlap with other Debian services; not only does it handle mailing lists and forums, but it also has its own issue tracker that overlaps with the Debian bug tracking system (BTS). This is just the way things are in Debian: it is an old project with lots of moving parts. As Jonathan Dowland put it: "The nature of the project is loosely-coupled, some redundancy, lots of legacy cruft, and sadly more than one way to do it." Hopefully, pagure will not become part of that "legacy redundant cruft". But at this point, the focus is on keeping the services running in a simpler, more maintainable way. The discussions between Debian and GitLab are still going on as we speak, but given how controversial the "open core" model used by GitLab is for the Debian community, pagure does seem like a more logical alternative.
Note: this article first appeared in the Linux Weekly News.

30 April 2017

Russ Allbery: Review: Periodic Tales

Review: Periodic Tales, by Hugh Aldersey-Williams
Publisher: HarperCollins
Copyright: February 2011
ISBN: 0-06-207881-X
Format: Kindle
Pages: 451
Perhaps my favorite chapter in Randall Munroe's What If? is his examination of what would happen if you assembled a periodic table from square blocks of each element. As with most What If? questions, the answer is "everyone in the vicinity dies," but it's all about the journey. The periodic table is full of so many elements that we rarely hear about but which have fascinating properties. It was partly in the memory of that chapter that I bought Periodic Tales on impulse after seeing a mention of it somewhere on the Internet (I now forget where). Hugh Aldersey-Williams is a journalist and author, but with a background in natural sciences. He also has a life-long hobby of collecting samples of the elements and attempting to complete his own private copy of the periodic table, albeit with considerably more precautions and sample containment than Munroe's thought experiment. Periodic Tales is inspired by that collection. It's a tour and cultural history of many of the elements, discussing their discovery, their role in commerce and industry, their appearance, and often some personal anecdotes. This is not exactly a chemistry book, although there's certainly some chemistry here, nor is it a history, although Aldersey-Williams usually includes some historical notes about each element he discusses. The best term might be an anthropology of the elements: a discussion of how they've influenced culture and an examination of the cultural assumptions and connections we've constructed around them. But primarily it's an idiosyncratic and personal tour of the things Aldersey-Williams found interesting about each one. Periodic Tales is not comprehensive. The completionist in me found that a bit disappointing, and there are a few elements that I think would have fit the overall thrust of the book but are missing. (Lithium and its connection to mental health and now computer batteries comes to mind.) It's also not organized in the obvious way, either horizontally or vertically along the periodic table. Instead, Aldersey-Williams has divided the elements he talks about into five major but fairly artificial divisions: power (primarily in the economic sense), fire (focused on burning and light), craft (the materials from which we make things), beauty, and earth. Obviously, these are fuzzy; silver appears in craft, but could easily be in power with gold. I'm not sure how defensible this division was. But it does, for good or for ill, break the reader's mind away from a purely chemical and analytical treatment and towards broader cultural associations. This cultural focus, along with Aldersey-Williams's clear and conversational style, is what pulls this book firmly away from being a beautified recitation of facts that could be gleaned from Wikipedia. It also leads to some unexpected choices of focus. For example, the cultural touchstone he chooses for sodium is not salt (which is a broad enough topic for an entire book) but sodium street lights, the ubiquitous and color-distorting light of modern city nights, thus placing sodium in the "fire" category of the book. Discussion of cobalt is focused on pigments: the brilliant colors of paint made possible by its many brightly-colored compounds. Arsenic is, of course, a poison, but it's also a source of green, widely used in wallpaper (and Aldersey-Williams discusses the connection with the controversial death of Napoleon).
And the discussion of aluminum starts with a sculpture, and includes a fascinating discussion of "banalization" as we become used to the use of a new metal, which the author continues when looking at titanium and its currently-occurring cultural transition between the simply new and modern and a well-established metal with its own unique cultural associations. One drawback of the somewhat scattered organization is that, while Periodic Tales provides fascinating glimmers of the history of chemistry and the search to isolate elements, those glimmers are disjointed and presented in no particular order. Recently-discovered metals are discussed alongside ancient ones, and the huge surge in elemental isolation in the 1800s is all jumbled together. Wikipedia has a very useful timeline that helps sort out one's sense of history, but there was a part of me left wanting a more structured presentation. I read books like this primarily for the fascinating trivia. Mercury: known in ancient times, but nearly useless, so used primarily for ritual and decoration (making the modern reader cringe). Relative abundances of different elements, which often aren't at all what one might think. Rare earths (not actually that rare): isolated through careful, tedious work by Swedish mining chemists whom most people have never heard of, unlike the discoverers of many other elements. And the discovery of the noble gases, which is a fascinating bit of disruptive science made possible by new technology (the spectroscope), forcing a rethinking of the periodic table (which had no column for noble gases). I read a lot of this while on vacation and told interesting tidbits to my parents over breakfast or dinner. It's that sort of book. This is definitely in the popular science and popular writing category, for all the pluses and minuses that brings. It's not a detailed look at either chemistry or history. But it's very fun to read, it provides a lot of conversational material, and it takes a cultural approach that would not have previously occurred to me. Recommended if you like this sort of thing. Rating: 7 out of 10

21 April 2017

Rhonda D'Vine: Home

A fair amount of things happened since I last blogged something other than music. First of all we did actually hold a Debian Diversity meeting. It was quite nice, fewer people around than hoped for, and I account that to some extent to the trolls and haters that defaced the titanpad page for the agenda and destroyed the doodle entry for settling on a date for the meeting. They even tried to troll my blog with comments, and while I did approve controversial responses in the past, those went over the line of being acceptable and didn't carry any relevant content. One response that I didn't approve but kept in my mailbox is even giving me strength to carry on. There is one sentence in it that speaks to me: Think you can stop us? You can't you stupid b*tch. You have ruined the Debian community for us. The rest of the message is of no further relevance, but even though I can't take credit for being responsible for that, I'm glad to be a perceived part of ruining the Debian community for intolerant and hateful people. A lot of other things happened since too. Mostly locally here in Vienna, several queer empowering groups were founded around me, some of them existed already, some formed with the help of myself. We now have several great regular meetings for non-binary people, for queer polyamory people about which we gave an interview, a queer playfight (I might explain that concept another time), a polyamory discussion group, two bi-/pansexual groups, a queer-feminist choir, and there will be a European Lesbian* Conference in October where I help with the organization and on June 21st I'll finally receive the keys to my flat in Que[e]rbau Seestadt. I'm sooo looking forward to it. It will be part of the Let me come Home experience that I'm currently in. Another part of that experience is that I started changing my name (and gender marker) officially. I had my first appointment in the corresponding bureau, and I hope that it won't last too long because I have to get my papers in time for booking my flight to Montreal, and somewhen along the process my current passport won't contain correct data anymore. So for the people who have it in their signing policy to see government IDs this might be your chance to finally sign my key then. I plan to do a diversity BoF at debconf where we can speak more directly on where we want to head with the project. I hope I'll find the time to do an IRC meeting beforehand. I'm just uncertain how to coordinate that one to make it accessible for interested parties while keeping the destructive trolls out. I'm open for ideas here.


13 April 2017

Antoine Beaupré: New approaches to network fast paths

With the speed of network hardware now reaching 100 Gbps and distributed denial-of-service (DDoS) attacks going in the Tbps range, Linux kernel developers are scrambling to optimize key network paths in the kernel to keep up. Many efforts are actually geared toward getting traffic out of the costly Linux TCP stack. We have already covered the XDP (eXpress Data Path) patch set, but two new ideas surfaced during the Netconf and Netdev conferences held in Toronto and Montreal in early April 2017. One is a patch set called af_packet, which aims at extracting raw packets from the kernel as fast as possible; the other is the idea of implementing in-kernel layer-7 proxying. There are also user-space network stacks like Netmap, DPDK, or Snabb (which we previously covered). This article aims at clarifying what all those components do and at providing a short status update for the tools we have already covered. We will focus on in-kernel solutions here: user-space tools have a fundamental limitation in that, if they need to re-inject packets onto the network, they must again pay the expensive cost of crossing the kernel barrier, so user-space performance is effectively bounded by that design. We will start from the lowest part of the stack, the af_packet patch set, and work our way up, all the way to layer-7 in-kernel proxying.

af_packet v4 John Fastabend presented a new version of a patch set that was first published in January regarding the af_packet protocol family, which is currently used by tcpdump to extract packets from network interfaces. The goal of this change is to allow zero-copy transfers between user-space applications and the NIC (network interface card) transmit and receive ring buffers. Such optimizations are useful for telecommunications companies, which may use it for deep packet inspection or running exotic protocols in user space. Another use case is running a high-performance intrusion detection system that needs to watch large traffic streams in realtime to catch certain types of attacks. Fastabend presented his work during the Netdev network-performance workshop, but also brought the patch set up for discussion during Netconf. There, he said he could achieve line-rate extraction (and injection) of packets, with packet rates as high as 30Mpps. This performance gain is possible because user-space pages are directly DMA-mapped to the NIC, which is also a security concern. The other downside of this approach is that a complete pair of ring buffers needs to be dedicated for this purpose; whereas before packets were copied to user space, now they are memory-mapped, so the user-space side needs to process those packets quickly otherwise they are simply dropped. Furthermore, it's an "all or nothing" approach; while NIC-level classifiers could be used to steer part of the traffic to a specific queue, once traffic hits that queue, it is only accessible through the af_packet interface and not the rest of the regular stack. If done correctly, however, this could actually improve the way user-space stacks access those packets, providing projects like DPDK a safer way to share pages with the NIC, because it is well defined and kernel-controlled. According to Jesper Dangaard Brouer (during review of this article):
This proposal will be a safer way to share raw packet data between user space and kernel space than what DPDK is doing, [by providing] a cleaner separation as we keep driver code in the kernel where it belongs.
During the Netdev network-performance workshop, Fastabend asked if there was a better data structure to use for such a purpose. The goal here is to provide a consistent interface to user space regardless of the driver or hardware used to extract packets from the wire. af_packet currently defines its own packet format that abstracts away the NIC-specific details, but there are other possible formats. For example, someone in the audience proposed the virtio packet format. Alexei Starovoitov rejected this idea because af_packet is a kernel-specific facility while virtio has its own separate specification with its own requirements. The next step for af_packet is the posting of the new "v4" patch set, although David Miller warned that this wouldn't get merged until proper XDP support lands in the Intel drivers. The concern, of course, is that the kernel would have multiple incomplete bypass solutions available at once. Hopefully, Fastabend will present the (by then) merged patch set at the next Netdev conference in November.
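For readers who have never looked below tcpdump, here is a deliberately naive sketch (my own illustration, not code from the patch set) of the classic copy-based way of pulling raw frames out of an AF_PACKET socket; the mmap'd ring buffers used today and the zero-copy "v4" interface discussed above exist precisely to avoid this per-packet copy and system call:
/* Naive AF_PACKET reader: one recv(), and one kernel-to-user copy, per frame.
 * Requires CAP_NET_RAW (or root). Illustrative sketch only. */
#include <stdio.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <linux/if_ether.h>

int main(void)
{
    /* ETH_P_ALL: receive frames of every protocol on every interface. */
    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0) { perror("socket"); return 1; }

    unsigned char frame[2048];
    for (int i = 0; i < 10; i++) {
        ssize_t len = recv(fd, frame, sizeof(frame), 0);  /* copies the frame */
        if (len < 0) { perror("recv"); break; }
        printf("got %zd-byte frame\n", len);
    }
    close(fd);
    return 0;
}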

XDP updates Higher up in the networking stack sits XDP. The af_packet feature differs from XDP in that it does not perform any sort of analysis or mangling of packets; its objective is purely to get the data into and out of the kernel as fast as possible, completely bypassing the regular kernel networking stack. XDP also sits before the networking stack except that, according to Brouer, it is "focused on cooperating with the existing network stack infrastructure, and on use-cases where the packet doesn't necessarily need to leave kernel space (like routing and bridging, or skipping complex code-paths)." XDP has evolved quite a bit since we last covered it in LWN. It seems that most of the controversy surrounding the introduction of XDP in the Linux kernel has died down in public discussions, under the leadership of David Miller, who heralded XDP as the right solution for a long-term architecture in the kernel. He presented XDP as a fast, flexible, and safe solution. Indeed, one of the controversies surrounding XDP was the question of the inherent security challenges with introducing user-provided programs directly into the Linux kernel to mangle packets at such a low level. Miller argued that whatever protections are expected for user-space programs also apply to XDP programs, comparing the virtual memory protections to the eBPF (extended BPF) verifier applied to XDP programs. Those programs are actually eBPF programs that have an interesting set of restrictions:
  • they have a limited size
  • they cannot jump backward (and thus cannot loop), so they execute in predictable time
  • they do only static allocation, so they are also limited in memory
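To make these restrictions concrete, here is a minimal, hypothetical XDP program in the restricted C that is compiled to eBPF (for example with clang -O2 -target bpf) and attached to a device; the names and the policy (drop IPv4 UDP, pass everything else) are purely illustrative, but the shape (a bounds check before every access, no loops, no dynamic allocation) is what the verifier enforces:
/* Illustrative sketch only: drop IPv4 UDP at the driver, pass everything else. */
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/in.h>

#define SEC(name) __attribute__((section(name), used))
#define bpf_htons(x) __builtin_bswap16(x)   /* assumes a little-endian host */

SEC("xdp")
int xdp_drop_udp(struct xdp_md *ctx)
{
    void *data     = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;

    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)        /* verifier requires the bounds check */
        return XDP_PASS;
    if (eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_PASS;

    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end)
        return XDP_PASS;
    if (ip->protocol == IPPROTO_UDP)
        return XDP_DROP;                     /* dropped before any skb is allocated */

    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";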
XDP is not a one-size-fits-all solution: netfilter, the TC traffic shaper, and other normal Linux utilities still have their place. There is, however, a clear use case for a solution like XDP in the kernel. For example, Facebook and Cloudflare have both started testing XDP and, in Facebook's case, deploying XDP in production. Martin Kafai Lau, from Facebook, presented the tool set the company is using to construct a DDoS-resilience solution and a level-4 load balancer (L4LB), which got a ten-times performance improvement over the previous IPVS-based solution. Facebook rolled out its own user-space solution called "Droplet" to detect hostile traffic and deploy blocking rules in the form of eBPF programs loaded in XDP. Lau demonstrated the way Facebook deploys a three-part chained eBPF program: the first part allows debugging and dumping of packets, the second is Droplet itself, which drops undesirable traffic, and the last segment is the load balancer, which mangles the packets to tweak their destination according to internal rules. Droplet can drop DDoS attacks at line rate while keeping the architecture flexible, which were two key design requirements. Gilberto Bertin, from Cloudflare, presented a similar approach: Cloudflare has a tool that processes sFlow data generated from iptables in order to generate cBPF (classic BPF) mitigation rules that are then deployed on edge routers. Those rules are created with a tool called bpfgen, part of Cloudflare's BSD-licensed bpftools suite. For example, it could create a cBPF bytecode blob that would match DNS queries to any example.com domain with something like:
    bpfgen dns *.example.com
Originally, Cloudflare would deploy those rules to plain iptables firewalls with the xt_bpf module, but this led to performance issues. It then deployed a proprietary user-space solution based on Solarflare hardware, but this has the performance limitations of user-space applications: getting packets back onto the wire involves the cost of re-injecting packets back into the kernel. This is why Cloudflare is experimenting with XDP, which was partly developed in response to the company's problems, to deploy those BPF programs. A concern that Bertin identified was the lack of visibility into dropped packets. Cloudflare currently samples some of the dropped traffic to analyze attacks; this is not currently possible with XDP unless you pass the packets down the stack, which is expensive. Miller agreed that the lack of monitoring for XDP programs is a large issue that needs to be resolved, and suggested creating a way to mark packets for extraction to allow analysis. Cloudflare is currently in a testing phase with XDP and it is unclear if its whole XDP tool chain will be publicly available. While those two companies are starting to use XDP as-is, there is more work needed to complete the XDP project. As mentioned above and in our previous coverage, massive statistics extraction is still limited in the Linux kernel and introspection is difficult. Furthermore, while the existing actions (XDP_DROP and XDP_TX, see the documentation for more information) are well implemented and used, another action may be introduced, called XDP_REDIRECT, which would allow redirecting packets to different network interfaces. Such an action could also be used to accelerate bridges as packets could be "switched" based on the MAC address table. XDP also requires network driver support, which is currently limited. For example, the Intel drivers still do not support XDP, although that should come pretty soon. Miller, in his Netdev keynote, focused on XDP and presented it as the standard solution that is safe, fast, and usable. He identified the next steps of XDP development to be the addition of debugging mechanisms, better sampling tools for statistics and analysis, and user-space consistency. Miller foresees a future for XDP similar to the popularization of the Arduino chips: a simple set of tools that anyone, not just developers, can use. He gave the example of an Arduino tutorial that he followed where he could just look up a part number and get easy-to-use instructions on how to program it. Similar components should be available for XDP. For this purpose, the conference saw the creation of a new mailing list called xdp-newbies where people can learn how to create XDP build environments and how to write XDP programs.

In-kernel layer-7 proxying The third approach that struck me as innovative is the idea of doing layer-7 (application) proxying directly in the kernel. This comes from the idea that, traditionally, we build firewalls to segregate traffic and apply controls, but as most services move to HTTP, those policies become ineffective. Thomas Graf presented this idea during Netconf using a Star Wars allegory: what if the Death Star were a server with an API? You would have endpoints like /dock or /comms that would allow you to dock a ship or communicate with the Death Star. Those API endpoints should obviously be public, but then there is this /exhaust-port endpoint that should never be publicly available. In order for a firewall to protect such a system, it must be able to inspect traffic at a higher level than the traditional address-port pairs. Graf presented a design where the kernel would create an in-kernel socket that would negotiate TCP connections on behalf of user space and then be able to apply arbitrary eBPF rules in the kernel. [Graf's design of in-kernel proxying] In this scenario, instead of doing the traditional transfer from Netfilter's TPROXY to user space, the kernel directly decapsulates the HTTP traffic and passes it to BPF rules that can make decisions without doing expensive context switches or memory copies in the case of simply wanting to refuse traffic (e.g. issue an HTTP 403 error). This, of course, requires the inclusion of kTLS to process HTTPS connections. HTTP/2 support may also prove problematic, as it multiplexes connections and is harder to decapsulate. This design was described as a "pure pre-accept() hook". Starovoitov also compared the design to the kernel connection multiplexer (KCM). Tom Herbert, KCM's author, agreed that it could be extended to support this, but would require some extensions in user space to provide an interface between regular socket-based applications and the KCM layer. In any case, if the application does TLS (and lots of them do), kTLS gets tricky because it breaks the end-to-end nature of TLS, in effect becoming a man in the middle between the client and the application. Eric Dumazet argued that HA-Proxy already does things like this: it uses splice() to avoid copying too much data around (a pattern sketched at the end of this section), but it still does a context switch to hand over processing to user space, something that could be fixed in the general case. Another similar project that was presented at Netdev is the Tempesta firewall and reverse-proxy. The speaker, Alex Krizhanovsky, explained that the Tempesta developers took one person-month to port the mbed TLS stack to the Linux kernel to allow an in-kernel TLS handshake. Tempesta also implements rate limiting, cookies, and JavaScript challenges to mitigate DDoS attacks. The argument behind the project is that "it's easier to move TLS to the kernel than it is to move the TCP/IP stack to user space". Graf explained that he is familiar with Krizhanovsky's work and he is hoping to collaborate. In effect, the design Graf is working on would serve as a foundation for Krizhanovsky's in-kernel HTTP server (kHTTP). In a private email, Graf explained that:
The main differences in the implementation are currently that we foresee to use BPF for protocol parsing to avoid having to implement every single application protocol natively in the kernel. Tempesta likely sees this less of an issue as they are probably only targeting HTTP/1.1 and HTTP/2 and to some [extent] JavaScript.
Neither project is really ready for production yet. There didn't seem to be any significant pushback from key network developers against the idea, which surprised some people, so it is likely we will see more and more layer-7 intelligence move into the kernel sooner rather than later.
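For the curious, here is roughly what the splice()-based forwarding that Dumazet mentions looks like in a user-space proxy. This is a simplified sketch of my own (the helper name and buffer size are arbitrary), not HA-Proxy code: the payload stays in kernel buffers, but the process still wakes up and issues system calls for every chunk, which is exactly the context-switch cost an in-kernel hook would remove.
/* Simplified sketch: forward up to 64 KiB between two connected sockets
 * without copying the payload through user-space memory. A real proxy
 * would create the pipe once and reuse it. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>

ssize_t relay_once(int from_sock, int to_sock)
{
    int pipefd[2];
    if (pipe(pipefd) < 0)
        return -1;

    /* socket -> pipe: the data stays in kernel buffers */
    ssize_t n = splice(from_sock, NULL, pipefd[1], NULL, 65536, SPLICE_F_MOVE);
    if (n > 0)
        /* pipe -> socket: still no user-space copy */
        n = splice(pipefd[0], NULL, to_sock, NULL, (size_t)n, SPLICE_F_MOVE);

    close(pipefd[0]);
    close(pipefd[1]);
    return n;
}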

Conclusion All of this work aims at replacing a rag-tag bunch of proprietary solutions that recently came up to bypass the Linux kernel TCP/IP stack and improve performance for firewalls, proxies, and other key edge network elements. The idea is that, unless the kernel improves its performance, or at least provides a way to bypass its more complex code paths, people will work around it. With this set of solutions in place, engineers will now be able to use standard APIs to hook high-performance systems into the Linux kernel.
The author would like to thank the Netdev and Netconf organizers for travel assistance, Thomas Graf for a review of the in-kernel proxying section of this article, and Jesper Dangaard Brouer for review of the af_packet and XDP sections. Note: this article first appeared in the Linux Weekly News.
