Search Results: "xam"

6 June 2020

Petter Reinholdtsen: Secure Socket API - a simple and powerful approach for TLS support in software

As a member of the Norwegian Unix User Group, I have the pleasure of receiving the USENIX magazine ;login: several times a year. I rarely have time to read all the articles, but try to at least skim through them all as there is a lot of nice knowledge passed on there. I even carry the latest issue with me most of the time to try to get through all the articles when I have a few spare minutes. The other day I came across a nice article titled "The Secure Socket API: TLS as an Operating System Service" with a marvellous idea I hope can make it all the way into the POSIX standard. The idea is as simple as it is powerful. By introducing a new socket() option IPPROTO_TLS to use TLS, and a system-wide service to handle setting up TLS connections, one both makes it trivial to add TLS support to any program currently using the POSIX socket API, and gains system-wide control over certificates, TLS versions and encryption systems used. Instead of doing this:
int socket = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
the program code would be doing this:
int socket = socket(PF_INET, SOCK_STREAM, IPPROTO_TLS);
According to the ;login: article, converting a C program to use TLS would normally modify only 5-10 lines in the code, which is amazing when compared to using for example the OpenSSL API. The project has set up the https://securesocketapi.org/ web site to spread the idea, and the code for a kernel module and the associated system daemon is available from two GitHub repositories: ssa and ssa-daemon. Unfortunately there is no explicit license information with the code, so its copyright status is unclear. A request to resolve this has gone unanswered since 2018-08-17. I love the idea of extending socket() to gain TLS support, and understand why it is an advantage to implement this as a kernel module and system-wide service daemon, but cannot help thinking that it would be a lot easier to get projects to move to this way of setting up TLS if it was done with a user space approach where programs wanting to use this API approach could just link with a wrapper library. I recommend you check out this simple and powerful approach to more secure network connections. :) As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.
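To make the idea concrete, here is a minimal sketch of my own (not code from the article or the SSA project) of what a small client could look like under this model. Only IPPROTO_TLS comes from the article; the rest is ordinary POSIX socket code, and the real SSA additionally offers TLS-specific setsockopt() options (for the remote hostname, certificate handling and so on) that I leave out here.
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/* IPPROTO_TLS is not in the standard headers; in practice it would come
 * from the SSA's own headers. */
int tls_connect(const char *ip, unsigned short port)
{
    /* The only change from a plain TCP client: IPPROTO_TLS, not IPPROTO_TCP. */
    int fd = socket(PF_INET, SOCK_STREAM, IPPROTO_TLS);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        close(fd);
        return -1;
    }

    /* With the SSA, the TLS handshake, certificate validation and cipher
     * policy are handled by the system-wide service during connect(),
     * not by this program. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }

    return fd;  /* read() and write() now carry TLS-protected data */
}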

4 June 2020

Steve McIntyre: What can you preseed when installing Debian?

Preseeding is a very useful way of installing and pre-configuring a Debian system in one go. You simply supply lots of the settings that your new system will need up front, in a preseed file. The installer will use those settings instead of asking questions, and it will also pass on any extra settings via the debconf database so that any further package setup will use them. There is documentation about how to do this in the Debian wiki at https://wiki.debian.org/DebianInstaller/Preseed, and an example preseed file for our current stable release (Debian 10, "buster") in the release notes. One complaint I've heard is that it can be difficult to work out exactly the right data to use in a preseed file, as the format is not the easiest to work with by hand. It's also difficult to find exactly what settings can be changed in a preseed. So, I've written a script to parse all the debconf templates in each release in the Debian archive and dump all the possible settings in each. I've put the results up online at my debian-preseed site in case it's useful. The data will be updated daily as needed to make sure it's current. Updated June 2020 - changed the URL for the preseed site now I have a domain set up at https://preseed.debian.net/.
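For a flavour of the format, here are a few illustrative lines of the kind that might go in such a file; the example preseed file and the debian-preseed site above are the authoritative references for the exact keys and values:
# Each line is: <owner> <question name> <question type> <value>
d-i debian-installer/locale string en_GB.UTF-8
d-i keyboard-configuration/xkb-keymap select gb
d-i mirror/http/hostname string deb.debian.org
d-i mirror/http/directory string /debian
d-i passwd/root-login boolean false
tasksel tasksel/first multiselect standard, ssh-server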

Antoine Beaupré: Replacing Smokeping with Prometheus

I've been struggling with replacing parts of my old sysadmin monitoring toolkit (previously built with Nagios, Munin and Smokeping) with more modern tools (specifically Prometheus, its "exporters" and Grafana) for a while now. Replacing Munin with Prometheus and Grafana is fairly straightforward: the network architecture ("server pulls metrics from all nodes") is similar and there are lots of exporters. They are a little harder to write than Munin modules, but that makes them more flexible and efficient, which was a huge problem in Munin. I wrote a Migrating from Munin guide that summarizes those differences. Replacing Nagios is much harder, and I still haven't quite figured out if it's worth it.

How does Smokeping work Leaving those two aside for now, I'm left with Smokeping, which I used in my previous job as a decentralized looking glass to diagnose routing issues, something that was handy for debugging long-term problems. Smokeping is a strange animal: it's fundamentally similar to Munin, except it's harder to write plugins for it, so most people just use it for Ping, something at which it excels. Its trick is this: instead of doing a single ping and returning a single metric, it does multiple pings and returns multiple metrics. Specifically, smokeping will send multiple ICMP packets (20 by default), with a low interval (500ms by default) and a single retry. It also pings multiple hosts at once, which means it can quickly scan many targets simultaneously. You therefore see network conditions affecting one host reflected in further hosts down (or up) the chain. The multiple metrics also mean you can draw graphs with "error bars" which Smokeping shows as "smoke" (hence the name). You also get per-metric packet loss. Basically, smokeping runs this command and collects the output in an RRD database:
fping -c $count -q -B $backoff -r $retry -4 -b $packetsize -t $timeout -i $mininterval -p $hostinterval $host [ $host ...]
... where those parameters are, by default:
  • $count is 20 (packets)
  • $backoff is 1 (avoid exponential backoff)
  • $timeout is 1.5s
  • $mininterval is 0.01s (minimum wait interval between any target)
  • $hostinterval is 1.5s (minimum wait between probes on a single target)
It can also override stuff like the source address and TOS fields. This probe will complete in between 30 and 60 seconds, if my math is right: 20 probes spaced 1.5 s apart take about 30 s when every reply comes back (0% packet loss), and at 100% loss the single retry roughly doubles the number of probes sent, pushing the run towards 60 s.

How to draw Smokeping graphs in Grafana A naive implementation of Smokeping in Prometheus/Grafana would be to use the blackbox exporter and create a dashboard displaying those metrics. I've done this at home, and then I realized that I was missing something. Here's what I did.
  1. install the blackbox exporter:
    apt install prometheus-blackbox-exporter
    
  2. make sure to allow capabilities so it can ping:
    dpkg-reconfigure prometheus-blackbox-exporter
    
  3. hook monitoring targets into prometheus.yml (the default blackbox exporter configuration is fine):
    scrape_configs:
        - job_name: blackbox
          metrics_path: /probe
          params:
            module: [icmp]
          scrape_interval: 5s
          static_configs:
            - targets:
              - octavia.anarc.at
              # hardcoded in DNS
              - nexthop.anarc.at
              - koumbit.net
              - dns.google
          relabel_configs:
            - source_labels: [__address__]
              target_label: __param_target
            - source_labels: [__param_target]
              target_label: instance
            - target_label: __address__
              replacement: 127.0.0.1:9115  # The blackbox exporter's real hostname:port.
    
    Notice how we lower the scrape_interval to 5 seconds to get more samples. nexthop.anarc.at was added into DNS to avoid hardcoding my upstream ISP's IP in my configuration.
  4. create a Grafana panel to graph the results. First, add this query:
    sum(probe_icmp_duration_seconds{phase="rtt"}) by (instance)
    
    • Set the Legend field to {{instance}} RTT
    • Set Draw modes to lines and Mode options to staircase
    • Set the Left Y axis Unit to duration(s)
    • Show the Legend As table, with Min, Avg, Max and Current enabled
    Then add this query, for packet loss:
    1-avg_over_time(probe_success[$__interval])!=0 or null
    
    • Set the Legend field to {{instance}} packet loss
    • Add a series override with Lines: false, Null point mode: null, Points: true, Point Radius: 1, Color: deep red, and, most importantly, Y-axis: 2
    • Set the Right Y axis Unit to percent (0.0-1.0) and set Y-max to 1
    Then set the entire thing to Repeat, on target, vertically. And you need to add a target variable like label_values(probe_success, instance).
The result looks something like this:
A plot of RTT and packet loss over time of three nodes. Not bad, but not Smokeping.
This actually looks pretty good! I've uploaded the resulting dashboard in the Grafana dashboard repository.
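As an aside, a quick way to sanity-check the blackbox exporter itself, independently of Prometheus and Grafana, is to query its probe endpoint directly (assuming the default port from the configuration above):
curl 'http://127.0.0.1:9115/probe?module=icmp&target=dns.google'
This should return probe_success along with the probe_icmp_duration_seconds phases as plain-text metrics.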

What is missing? Now, that doesn't exactly look like Smokeping, does it? It's pretty good, but it's not quite what we want. What is missing is variance, the "smoke" in Smokeping. There's a good article about replacing Smokeping with Grafana. They wrote a custom script to write samples into InfluxDB, so unfortunately we can't use it in this case, since we don't have InfluxDB's query language. I couldn't quite figure out how to do the same in PromQL. I tried:
stddev(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"})
stddev_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[$__interval])
stddev_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[1m])
The first two give zero for all samples. The last one works, but doesn't look as good as Smokeping. So there might be something I'm missing. SuperQ wrote a special exporter for this called smokeping_prober that came out of this discussion in the blackbox exporter. Instead of delegating scheduling and target definition to Prometheus, the targets are set in the exporter. They also take a different approach than Smokeping: instead of recording the individual variations, they delegate that to Prometheus, through the use of "buckets". Then they use a query like this:
histogram_quantile(0.9, rate(smokeping_response_duration_seconds_bucket[$__interval]))
This is the rationale for SuperQ's implementation:
Yes, I know about smokeping's bursts of pings. IMO, smokeping's data model is flawed that way. This is where I intentionally deviated from the smokeping exact way of doing things. This prober sends a smooth, regular series of packets in order to be measuring at regular controlled intervals. Instead of 20 packets, over 10 seconds, every minute. You send one packet per second and scrape every 15. This has the same overall effect, but the measurement is, IMO, more accurate, as it's a continuous stream. There's no 50 second gap of no metrics about the ICMP stream. Also, you don't get back one metric for those 20 packets, you get several. Min, Max, Avg, StdDev. With the histogram data, you can calculate much more than just that using the raw data. For example, IMO, avg and max are not all that useful for continuous stream monitoring. What I really want to know is the 90th percentile or 99th percentile. This smokeping prober is not intended to be a one-to-one replacement for exactly smokeping's real implementation. But simply provide similar functionality, using the power of Prometheus and PromQL to make it better. [...] one of the reason I prefer the histogram datatype, is you can use the heatmap panel type in Grafana, which is superior to the individual min/max/avg/stddev metrics that come from smokeping. Say you had two routes, one slow and one fast. And some pings are sent over one and not the other. Rather than see a wide min/max equaling a wide stddev, the heatmap would show a "line" for both routes.
That's an interesting point. I have also ended up adding a heatmap graph to my dashboard, independently. And it is true it shows those "lines" much better... So maybe, if we ignore legacy, we're actually happy with what we get, even with the plain blackbox exporter. So yes, we're missing pretty "fuzz" lines around the main lines, but maybe that's alright. It would be possible to do the equivalent of the InfluxDB hack, with queries like:
min_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[30s])
avg_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[5m])
max_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[30s])
The output looks something like this:
A plot of RTT and packet loss over time of three nodes, with min/max. Looks more like Smokeping!
But there's a problem there: see how the middle graph "dips" sometimes below 20ms? That's the min_over_time function (incorrectly, IMHO) returning zero. I haven't quite figured out how to fix that, and I'm not sure it is better. But it does look more like Smokeping than the previous graph. Update: I forgot to mention one big thing that this setup is missing. Smokeping has this nice feature that you can order and group probe targets in a "folder"-like hierarchy. It is often used to group probes by location, which makes it easier to scan a lot of targets. This is harder to do in this setup. It might be possible to set up location-specific "jobs" and select based on that, but it's not exactly the same.
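For what it's worth, here is a rough sketch of that idea (untested, with made-up job names): one blackbox job per location, so that panels can be repeated or filtered on the job label instead of a folder hierarchy.
scrape_configs:
  - job_name: blackbox_home
    metrics_path: /probe
    params:
      module: [icmp]
    static_configs:
      - targets: [octavia.anarc.at, nexthop.anarc.at]
    # plus the same relabel_configs as in the example above
  - job_name: blackbox_remote
    metrics_path: /probe
    params:
      module: [icmp]
    static_configs:
      - targets: [koumbit.net, dns.google]
    # plus the same relabel_configs as in the example above
Grafana queries can then select on job="blackbox_home" (or a $job dashboard variable) to approximate Smokeping's per-location groups.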

Credits Credits to Chris Siebenmann for his article about Prometheus and pings which gave me the avg_over_time query idea.

Reproducible Builds: Reproducible Builds in May 2020

Welcome to the May 2020 report from the Reproducible Builds project. One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. Nonetheless, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into seemingly secure software during the various compilation and distribution processes. In these reports we outline the most important things that we and the rest of the community have been up to over the past month.

News The Corona-Warn app that helps trace infection chains of SARS-CoV-2/COVID-19 in Germany had a feature request filed against it that it build reproducibly. A number of academics from Cornell University have published a paper titled Backstabber's Knife Collection which reviews various open source software supply chain attacks:
Recent years saw a number of supply chain attacks that leverage the increasing use of open source during software development, which is facilitated by dependency managers that automatically resolve, download and install hundreds of open source packages throughout the software life cycle.
In related news, the LineageOS Android distribution announced that a hacker had access to the infrastructure of their servers after exploiting an unpatched vulnerability. Marcin Jachymiak of the Sia decentralised cloud storage platform posted on their blog that their siac and siad utilities can now be built reproducibly:
This means that anyone can recreate the same binaries produced from our official release process. Now anyone can verify that the release binaries were created using the source code we say they were created from. No single person or computer needs to be trusted when producing the binaries now, which greatly reduces the attack surface for Sia users.
Synchronicity is a distributed build system for Rust build artifacts which have been published to crates.io. The goal of Synchronicity is to provide a distributed binary transparency system which is independent of any central operator. The Comparison of Linux distributions article on Wikipedia now features a Reproducible Builds column indicating each distribution's approach to, and progress towards, achieving reproducible builds.

Distribution work In Debian this month: In Alpine Linux, an issue was filed and closed regarding the reproducibility of .apk packages. Allan McRae of the ArchLinux project posted their third Reproducible builds progress report to the arch-dev-public mailing list which includes the following call for help:
We also need help to investigate and fix the packages that fail to reproduce that we have not investigated as of yet.
In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update.

Software development

diffoscope Chris Lamb made the changes listed below to diffoscope, our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. He also prepared and uploaded versions 142, 143, 144, 145 and 146 to Debian, PyPI, etc.
  • Comparison improvements:
    • Improve fuzzy matching of JSON files as file now supports recognising JSON data. (#106)
    • Refactor .changes and .buildinfo handling to show all details (including the GnuPG header and footer components) even when referenced files are not present. (#122)
    • Use our BuildinfoFile comparator (etc.) regardless of whether the associated files (such as the orig.tar.gz and the .deb) are present. [ ]
    • Include GnuPG signature data when comparing .buildinfo, .changes, etc. [ ]
    • Add support for printing Android APK signatures via apksigner(1). (#121)
    • Identify iOS App Zip archive data as .zip files. (#116)
    • Add support for Apple Xcode .mobileprovision files. (#113)
  • Bug fixes:
    • Don't print a traceback if we pass a single, missing argument to diffoscope (e.g. a JSON diff to re-load). [ ]
    • Correct "differences" typo in the ApkFile handler. (#127)
  • Output improvements:
    • Never emit the same id="foo" anchor reference twice in the HTML output, otherwise identically-named parts will not be able to be linked to via a #foo anchor. (#120)
    • Never emit an empty id anchor either; it is not possible to link to #. [ ]
    • Don't pretty-print the output when using the --json presenter; it will usually be too complicated to be readable by a human anyway. [ ]
    • Use the SHA256 over MD5 hash when generating page names for the HTML directory-style presenter. (#124)
  • Reporting improvements:
    • Clarify the message when we truncate the number of lines written to standard error [ ] and reduce the maximum number of lines printed to 25, as the error is usually obvious by then [ ].
    • Print the amount of free space that we have available in our temporary directory as a debugging message. [ ]
    • Clarify "Command [ ] failed with exit code" messages to remove the duplicated "exited with exit" wording, but also to note that diffoscope is interpreting this as an error. [ ]
    • Don't leak the full path of the temporary directory in "Command [ ] exited with 1" messages. (#126)
    • Clarify the warning message when we cannot import the debian Python module. [ ]
    • Don't repeat stderr output if both commands emit the same output. [ ]
    • Clarify that an external command emits output for both files, otherwise it can look like we are repeating ourselves when, in reality, the command is being run twice. [ ]
  • Testsuite improvements:
    • Prevent apksigner test failures due to lack of binfmt_misc, eg. on Salsa CI and elsewhere. [ ]
    • Drop .travis.yml as we use Salsa instead. [ ]
  • Dockerfile improvements:
    • Add a .dockerignore file to whitelist files we actually need in our container. (#105)
    • Use ARG instead of ENV when setting up the DEBIAN_FRONTEND environment variable at runtime. (#103)
    • Run as a non-root user in container. (#102)
    • Install/remove build-essential during the build so we can install the recommended packages from Git. [ ]
  • Codebase improvements:
    • Bump the officially required version of Python from 3.5 to 3.6. (#117)
    • Drop the (default) shell=False keyword argument to subprocess.Popen so that the potentially-unsafe shell=True is more obvious. [ ]
    • Perform string normalisation in Black [ ] and include the Black output in the assertion failure too [ ].
    • Inline MissingFile's special handling of deb822 to prevent leaking through abstract layers. [ ][ ]
    • Allow a bare try/except block when cleaning up temporary files with respect to the flake8 quality assurance tool. [ ]
    • Rename in_dsc_path to dsc_in_same_dir to clarify the use of this variable. [ ]
    • Abstract out the duplicated parts of the debian_fallback class [ ] and add descriptions for the file types. [ ]
    • Various commenting and internal documentation improvements. [ ][ ]
    • Rename the Openssl command class to OpenSSLPKCS7 to accommodate other command names with this prefix. [ ]
  • Misc:
    • Rename the --debugger command-line argument to --pdb. [ ]
    • Normalise filesystem stat(2) birth times (ie. st_birthtime) in the same way we do with the stat(1) command's Access: and Change: times to fix a nondeterministic build failure in GNU Guix. (#74)
    • Ignore case when ordering our file format descriptions. [ ]
    • Drop, add and tidy various module imports. [ ][ ][ ][ ]
In addition:
  • Jean-Romain Garnier fixed a general issue where, for example, LibarchiveMember's has_same_content method was called regardless of the underlying type of file. [ ]
  • Daniel Fullmer fixed an issue where some filesystems could only be mounted read-only. (!49)
  • Emanuel Bronshtein provided a patch to prevent a build of the Docker image containing parts of the build s. (#123)
  • Mattia Rizzolo added an entry to debian/py3dist-overrides to ensure the rpm-python module is used in package dependencies (#89) and moved to using the new execute_after_* and execute_before_* Debhelper rules [ ].

Chris Lamb also performed a huge overhaul of diffoscope's website:
  • Add a completely new design. [ ][ ]
  • Dynamically generate our contributor list [ ] and supported file formats [ ] from the main Git repository.
  • Add a separate, canonical page for every new release. [ ][ ][ ]
  • Generate a latest release section and display that with the corresponding date on the homepage. [ ]
  • Add an RSS feed of our releases [ ][ ][ ][ ][ ] and add to Planet Debian [ ].
  • Use Jekyll's absolute_url and relative_url where possible [ ][ ] and move a number of configuration variables to _config.yml [ ][ ].

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Other tools Elsewhere in our tooling: strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. In May, Chris Lamb uploaded version 1.8.1-1 to Debian unstable and Bernhard M. Wiedemann fixed an off-by-one error when parsing PNG image modification times. (#16) In disorderfs, our FUSE-based filesystem that deliberately introduces non-determinism into directory system calls in order to flush out reproducibility issues, Chris Lamb replaced the term "dirents" with "directory entries" in human-readable output/log messages [ ] and applied the astyle source code formatter with its default settings to the main disorderfs.cpp source file [ ]. Holger Levsen bumped the debhelper-compat level to 13 in disorderfs [ ] and reprotest [ ], and for the GNU Guix distribution Vagrant Cascadian updated the versions of disorderfs to version 0.5.10 [ ] and diffoscope to version 145 [ ].

Project documentation & website
  • Carl Dong:
  • Chris Lamb:
    • Rename the "Who" page to "Projects". [ ]
    • Ensure that Jekyll enters the _docs subdirectory to find the _docs/index.md file after an internal move. (#27)
    • Wrap ltmain.sh etc. in preformatted quotes. [ ]
    • Wrap the SOURCE_DATE_EPOCH Python examples onto more lines to prevent visual overflow on the page. [ ]
    • Correct a preferred spelling error. [ ]
  • Holger Levsen:
    • Sort our Academic publications page by publication year [ ] and add Trusting Trust and Fully Countering Trusting Trust through Diverse Double-Compiling [ ].
  • Juri Dispan:

Testing framework We operate a large and many-featured Jenkins-based testing framework that powers tests.reproducible-builds.org that, amongst many other tasks, tracks the status of our reproducibility efforts as well as identifies any regressions that have been introduced. Holger Levsen made the following changes:
  • System health status:
    • Improve page description. [ ]
    • Add more weight to proxy failures. [ ]
    • More verbose debug/failure messages. [ ][ ][ ]
    • Work around strangeness in the Bash shell, where let VARIABLE=0 exits with an error. [ ]
  • Debian:
    • Fail loudly if there are more than three .buildinfo files with the same name. [ ]
    • Fix a typo which prevented /usr merge variation on Debian unstable. [ ]
    • Temporarily ignore PHP's horde packages in Debian bullseye. [ ]
    • Document how to reboot all nodes in parallel, working around molly-guard. [ ]
  • Further work on a Debian package rebuilder:
    • Work around and document various issues in the debrebuild script. [ ][ ][ ][ ]
    • Improve output in the case of errors. [ ][ ][ ][ ]
    • Improve documentation and future goals [ ][ ][ ][ ], in particular documenting two real-world test cases for an impossible-to-recreate build environment [ ].
    • Find the right source package to rebuild. [ ]
    • Increase the frequency we run the script. [ ][ ][ ][ ]
    • Improve downloading and selection of the sources to build. [ ][ ][ ]
    • Improve version string handling. [ ]
    • Handle build failures better. [ ][ ][ ]
    • Also consider "architecture: all" .buildinfo files. [ ][ ]
In addition:
  • kpcyrd, for Alpine Linux, updated the alpine_schroot.sh script now that a patch for abuild had been released upstream. [ ]
  • Alexander Couzens of the OpenWrt project renamed the brcm47xx target to bcm47xx. [ ]
  • Mattia Rizzolo fixed the printing of the build environment during the second build [ ][ ][ ] and made a number of improvements to the script that deploys Jenkins across our infrastructure [ ][ ][ ].
Lastly, Vagrant Cascadian clarified in the documentation that you need to be user jenkins to run the blacklist command [ ] and the usual build node maintenance was performed by Holger Levsen [ ][ ][ ], Mattia Rizzolo [ ][ ] and Vagrant Cascadian [ ][ ][ ].

Mailing list: There were a number of discussions on our mailing list this month: Paul Spooren started a thread titled Reproducible Builds Verification Format which reopens the discussion around a schema for sharing the results from distributed rebuilders:
To make the results accessible, storable and create tools around them, they should all follow the same schema, a reproducible builds verification format. The format tries to be as generic as possible to cover all open source projects offering precompiled source code. It stores the rebuilder results of what is reproducible and what not.
Hans-Christoph Steiner of the Guardian Project also continued his previous discussion regarding making our website translatable. Lastly, Leo Wandersleb posted a detailed request for feedback on a question of supply chain security and other issues of software review; Leo is the founder of the Wallet Scrutiny project which aims to prove the security of Android Bitcoin Wallets:
Do you own your Bitcoins or do you trust that your app allows you to use your coins while they are actually controlled by them? Do you have a backup? Do they have a copy they didn't tell you about? Did anybody check the wallet for deliberate backdoors or vulnerabilities? Could anybody check the wallet for those?
Elsewhere, Leo had posted instructions on his attempts to reproduce the binaries for the BlueWallet Bitcoin wallet for iOS and Android platforms.


If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

This month's report was written by Bernhard M. Wiedemann, Chris Lamb, Holger Levsen, Jelle van der Waa and Vagrant Cascadian. It was subsequently reviewed by a bunch of Reproducible Builds folks on IRC and the mailing list.

3 June 2020

Jonathan Dowland: using Template Haskell to generate boilerplate

Here's a practical example of applying Template Haskell to reduce the amount of boilerplate code that is otherwise required. I wrote the below after following this excellent blog post by Matt Parsons. This post will be much higher-level; read Matt's blog for the gorier details. Liquorice Liquorice is a toy project of mine from a few years ago that lets you draw 2D geometric structures similar to LOGO. Liquorice offers two interfaces: pure functions that operate on an explicit Context (the pen location, existing lines, etc.), and a second "stateful" interface where the input and output are handled in the background. I prefix the pure ones P. and the stateful ones S. in this blog post for clarity. The stateful interface can be much nicer to use for larger drawings. Compare example8b.hs, written in terms of the pure functions, and the stateful equivalent example8.hs. The majority of the stateful functions are "wrapped" versions of the pure functions. For example, the pure function P.step takes two numbers and moves the pen forward and sideways. Its type signature is
P.step :: Int -> Int -> Context -> Context
Here's the signature and implementation of the stateful equivalent:
S.step :: Int -> Int -> State Context ()
S.step x y = modify (P.step x y)
Writing these wrapped functions for the 29 pure functions is boilerplate that can be generated automatically with Template Haskell. Generating the wrapper functions Given the Name of a function to wrap, we construct an instance of FunD, the TH data-type representing a function definition. We use the base name of the incoming function as the name for the new one.
mkWrap fn = do
    -- (args and rhs are constructed as described below)
    let name = mkName (nameBase fn)
    return $ FunD name [ Clause (map VarP args) (NormalB rhs) [] ]
To determine how many arguments the wrapper function needs to accept, we need to determine the input function's arity. We use Template Haskell's reify function to get type information about the function, and derive the arity from that. Matt Parsons covers this exactly in his blog.
info    <- reify fn
let ty   = (\(VarI _ t _ ) -> t) info
let n    = arity ty - 1
args    <- replicateM n (newName "arg")
We can use the list "args" directly in the clause part of the function definition, as the data-type expects a list. For the right-hand side, we need to convert from a list of arguments to function application. That's a simple left-fold:
-- mkFnApp f [a,b,c] => ((f a) b) c => f a b c
mkFnApp = foldl (\e -> appE e . varE)
rhs     <- [| modify $(mkFnApp (varE fn) args) :: State Context () |]
We use TH's oxford brackets for the definition of rhs. This permits us to write real Haskell inside the brackets, and get an expression data-type outside them. Within them we have a splice (the $( )), which does the opposite: the code is evaluated at compile time and generates an Exp that is then converted into the equivalent Haskell code and spliced into place. Finally, we need to apply the above to a list of Names. Sadly, we can't get at the list of exported names from a Module automatically. There is an open request for a TH extension for this. In the meantime, we export a list of the functions to wrap from the Pure module and operate on that:
import Liquorice.Pure
wrapPureFunctions = mapM mkWrap pureFns
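For reference, here is how the pieces above could fit together in a single module. This is a sketch rather than the project's exact code: the module name, the imports and the arity helper are my own assumptions, while the body of mkWrap is assembled from the fragments shown earlier.
{-# LANGUAGE TemplateHaskell #-}
module Liquorice.Wrap (wrapPureFunctions) where

import Control.Monad (replicateM)
import Control.Monad.State (State, modify)
import Language.Haskell.TH

import Liquorice      (Context)   -- assumed home of the Context type
import Liquorice.Pure (pureFns)   -- the exported list of Names to wrap

-- Count the arrows in a (possibly forall-quantified) type.
arity :: Type -> Int
arity (ForallT _ _ t)          = arity t
arity (AppT (AppT ArrowT _) t) = 1 + arity t
arity _                        = 0

-- mkFnApp f [a,b,c] => ((f a) b) c => f a b c
mkFnApp :: Q Exp -> [Name] -> Q Exp
mkFnApp = foldl (\e -> appE e . varE)

-- Build a stateful wrapper with the same base name as the pure function.
mkWrap :: Name -> Q Dec
mkWrap fn = do
    info <- reify fn
    let ty   = (\(VarI _ t _) -> t) info
        n    = arity ty - 1              -- drop the final Context argument
        name = mkName (nameBase fn)
    args <- replicateM n (newName "arg")
    rhs  <- [| modify $(mkFnApp (varE fn) args) :: State Context () |]
    return $ FunD name [Clause (map VarP args) (NormalB rhs) []]

wrapPureFunctions :: Q [Dec]
wrapPureFunctions = mapM mkWrap pureFns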
Finally, we 'call' wrapPureFunctions at the top level in our state module and Template Haskell splices all the function definitions into place. The final code ended up only around 30 lines of code, and saved about the same number of lines of boilerplate. But in doing this I noticed some missing functions, and it will pay dividends if more pure functions are added. Limitations The current implementation has one significant limitation: it cannot handle higher-order functions. An example of a pure higher-order function is place, which moves the pen, performs an operation, and then moves it back:
P.place :: Int -> Int -> (Context -> Context) -> Context -> Context
Wrapping this is not sufficient because the higher-order parameter has the pure function signature Context -> Context. If we wrapped it, the stateful version of the function would accept a pure function as the parameter, but you would expect it to accept another stateful function. To handle these, at a minimum we would need to detect the function arguments that have type Context -> Context and replace them with State Context (). The right-hand side of the wrapped function would also need to do more work to handle wrapping and unwrapping the parameter. I haven't spent much time thinking about it but I'm not sure that a general purpose wrapper would work for all higher-order functions. For the time being I've just re-implemented the half-dozen of them.

Ben Hutchings: Introducing debplate, a template system for Debian packages

For about two months I've been working on a new project, debplate, which currently lives at benh/debplate on Salsa. This is a template system for Debian packages, primarily intended to ease building multiple similar binary packages from a single source. With some changes, it could also be useful for making multiple source packages consistent (issue #9). I want debplate to be capable of replacing the kernel team's existing template system and a lot of its custom scripting, but it is also meant to be a general tool. I believe it's already capable of supporting source packages with relatively simple needs, and there are some examples of these in the debplate source. My long-term goal is that debplate will replace most team-specific and package-specific template systems, making those source packages using it less unusual and easier to contribute to. I gave a short talk about debplate at MiniDebConf Online on Sunday.

Dirk Eddelbuettel: littler 0.3.10: Some more updates

The eleventh release of littler as a CRAN package is now available, following in the fourteen-ish year history as a package started by Jeff in 2006, and joined by me a few weeks later. littler is the first command-line interface for R as it predates Rscript. It allows for piping as well as for shebang scripting via #!, uses command-line arguments more consistently and still starts faster. It also always loaded the methods package, which Rscript only started to do in recent years. littler lives on Linux and Unix, has its difficulties on macOS due to yet-another-braindeadedness there (who ever thought case-insensitive filesystems as a default were a good idea?) and simply does not exist on Windows (yet the build system could be extended; see RInside for an existence proof, and volunteers are welcome!). See the FAQ vignette on how to add it to your PATH. A few examples are highlighted at the Github repo, as well as in the examples vignette. This release adds a new helper / example script installBioc.r for BioConductor package installation, generalizes the roxygenize() wrapper roxy.r a little, and polished a couple of other corners. The NEWS file entry is below.

Changes in littler version 0.3.10 (2020-06-02)
  • Changes in examples
    • The update.r script only considers writeable directories.
    • The rcc.r script tries to report full logs by setting _R_CHECK_TESTS_NLINES_=0.
    • The tt.r script has an improved ncpu fallback.
    • Several installation and updating scripts set _R_SHLIB_STRIP_ to TRUE.
    • A new script installBioc.r was added.
    • The --error option to install2.r was generalized (Sergio Oller in #78).
    • The roxy.r script was extended a little.
  • Changes in package
    • Travis CI now uses R 4.0.0 and the bionic distro

CRANberries provides a comparison to the previous release. Full details for the littler release are provided as usual at the ChangeLog page. The code is available via the GitHub repo, from tarballs and now of course also from its CRAN page and via install.packages("littler"). Binary packages are available directly in Debian as well as soon via Ubuntu binaries at CRAN thanks to the tireless Michael Rutter. Comments and suggestions are welcome at the GitHub repo. If you like this or other open-source work I do, you can now sponsor me at GitHub. For the first year, GitHub will match your contributions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

2 June 2020

Sylvestre Ledru: Debian rebuild with clang 10 + some patches

Because of the lock-down in France and thanks to Lucas, I have been able to make some progress rebuilding Debian with clang instead of gcc.

TLDR Instead of patching clang itself, I used a different approach this time: patching Debian tools or implementing some workaround to mitigate an issue.
The percentage of packages failing dropped from 4.5% to 3.6% (from 1400 packages to 1110, out of a total of 31014). I focused on two classes of issues:

Qmake As I have no intention of merging the patch upstream, I used a very dirty workaround. I overwrote the g++ qmake file with clang's:
https://salsa.debian.org/lucas/collab-qa-tools/-/blob/master/modes/clang10#L44-47 I dropped the number of these failures to 0, making some packages build flawlessly (examples: qtcreator, chessx, fwbuilder, etc). However, some packages still fail later and therefore increase the number of failures in other categories, like link errors. For example, qtads fails because of an ordered comparison between a pointer and zero, and oscar fails on a -Werror,-Wdeprecated-copy error. Breaking the build later also highlighted some new classes of issues which didn't occur with clang < 10.
For example, warnings related to C++ range loops or implicit int-to-float conversions (I fixed a bunch of them in Firefox).
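To illustrate the sort of difference involved, here is the pointer-versus-zero comparison mentioned above, as a minimal example of my own (not code from qtads): clang rejects it outright, where g++ has traditionally been more lenient.
// clang: error: ordered comparison between pointer and zero
int *p = nullptr;
bool past_start = (p > 0);   // portable fix: (p != nullptr)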

Symbol differences Historically, symbol management for C++ in Debian has been a pain. Russ Allbery wrote a blog post in 2012 explaining the situation. AFAIK, it hasn't changed much.
Once more, I took the dirty approach: if there are new or missing symbols, don't fail the build.
The rationale is the following: packages in the Debian archive are supposed to build without any issue. If there are new or missing symbols, it is probably clang generating a different library, but this library is very likely working as expected (and usable by a program compiled with g++ or clang). It is purely a different approach taken by the compiler developers. In order to mitigate this issue, before the build starts, I am modifying dpkg-gensymbols to transform the error into a warning.
So, the typical Debian errors "some new symbols appeared in the symbols file" and "some symbols or patterns disappeared in the symbols file" will NOT fail the build. Unsurprisingly, all but one package (libktorrent) builds. Even if I am pessimistic, I reported a bug against dpkg-dev to evaluate whether we could improve dpkg-gensymbols not to fail on these cases.

Next steps The next offender is Imake.tmpl:2243:10: fatal error: ' X11 .rules' file not found, with more than a hundred occurrences, reported upstream quite some time ago. Then, the big issues are going to be much harder to fix as they are real issues/warnings (with -Werror) in the code of the packages. Examples: -Wc++11-narrowing & -Wreserved-user-defined-literal... The list is long.
I will probably work on that when llvm/clang 11 are in RC phase.

For maintainers & upstream Maintainer of Debian/Ubuntu packages? I am providing a list of failing packages per maintainer: https://clang.debian.net/maintainers.php
For upstream, it is also easy to test with clang. Usually, apt install clang && CC=clang CXX=clang++ <build step> is good enough.

Conclusion With these two changes, I have been able to fix about 290 packages. I think I will be able to get that down a bit more, but we will soon reach a plateau as many warnings/issues will have to be fixed in the C/C++ code itself.

31 May 2020

Chris Lamb: Free software activities in May 2020

Here is my monthly update covering what I have been doing in the free software world during May 2020 (previous month): In Lintian, the static analysis tool for Debian packages:
Reproducible builds One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. However, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into ostensibly secure software during the various compilation and distribution processes. The motivation behind the Reproducible Builds effort is to ensure no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised. The initiative is proud to be a member project of the Software Freedom Conservancy, a not-for-profit 501(c)(3) charity focused on ethical technology and user freedom. Conservancy acts as a corporate umbrella allowing projects to operate as non-profit initiatives without managing their own corporate structure. If you like the work of the Conservancy or the Reproducible Builds project, please consider becoming an official supporter.
Elsewhere in our tooling, I made the following changes to diffoscope, our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues, including preparing and uploading versions 142, 143, 144, 145 and 146 to Debian: I also performed a huge overhaul of diffoscope's website:
Lastly, I made a large number of changes to the main Reproducible Builds website and documentation:
Debian LTS This month I contributed 17 hours to Debian Long Term Support (LTS) and 9 hours on its sister Extended LTS project. You can find out more about the two projects via the following video:
Debian I filed the following bug reports in Debian this month: I also filed a number of bugs against packages that are not compatible with Django 3.x (organised around a single master bug), including django-taggit, sorl-thumbnail, django-simple-captcha, django-cas-server, django-cors-headers, python-django-csp, django-pipeline, python-django-jsonfield, python-django-contact-form, django-model-utils, django-fsm, django-modeltranslation, django-oauth-toolkit, libthumbor, python-django-extensions, python-django-imagekit, python-django-navtag, python-django-tagging, djangorestframework, django-haystack, django-taggit, django-simple-captcha, python-django-registration, python-django-pyscss, python-django-compressor, python-django-crispy-forms & python-django-mptt.
Lastly, I made the following uploads to Debian: I also sponsored an upload for adminer (4.7.7-1), also uploading it to buster-backports.

30 May 2020

Sean Whitton: GNU Emacs' Transient Mark mode

Something I've found myself doing as the pandemic rolls on is picking out and (re-)reading through sections of the GNU Emacs manual and the GNU Emacs Lisp reference manual. This has got me (too) interested in some of the recent history of Emacs development, and I did some digging into archives of emacs-devel from 2008 (15M mbox) regarding the change to turn Transient Mark mode on by default and set mark-even-if-inactive to true by default in Emacs 23.1. It's not always clear which objections to turning on Transient Mark mode by default take into account the mark-even-if-inactive change. I think that turning on Transient Mark mode along with mark-even-if-inactive is a good default. The question that remains is whether the disadvantages of Transient Mark mode are significant enough that experienced Emacs users should consider altering Emacs' default behaviour to mitigate them. Here's one popular blog arguing for some mitigations. How might Transient Mark mode be disadvantageous? The suggestion is that it makes using the mark for navigation rather than for acting on regions less convenient:
  1. setting a mark just so you can jump back to it (i) is a distinct operation you have to think of separately; and (ii) requires two keypresses, C-SPC C-SPC, rather than just one keypress
  2. using exchange-point-and-mark activates the region, so to use it for navigation you need to use either C-u C-x C-x or C-x C-x C-g, neither of which are convenient to type, or else it will be difficult to set regions at the place you've just jumped to because you'll already have one active.
There are two other disadvantages that people bring up which I am disregarding. The first is that it makes it harder for new users to learn useful ways in which to use the mark when it's deactivated. This happened to me, but it can be mitigated without making any behavioural changes to Emacs. The second is that the visual highlighting of the region can be distracting. So far as I can tell, this is only a problem with exchange-point-and-mark, and it's subsumed by the problem of that command actually activating the region. The rest of the time Emacs' automatic deactivation of the region seems sufficient. How might disabling Transient Mark mode be disadvantageous? When Transient Mark mode is on, many commands will do something usefully different when the mark is active. The number of commands in Emacs which work this way is only going to increase now that Transient Mark mode is the default. If you disable Transient Mark mode, then to use those features you need to temporarily activate Transient Mark mode. This can be fiddly and/or require a lot of keypresses, depending on exactly where you want to put the region. Without being able to see the region, it might be harder to know where it is. Indeed, this is one of the main reasons for wanting Transient Mark mode to be the default, to avoid confusing new users. I don't think this is likely to affect experienced Emacs users often, however, and on occasions when more precision is really needed, C-u C-x C-x will make the region visible. So I'm not counting this as a disadvantage. How might we mitigate these two sets of disadvantages? Here are the two middle grounds I'm considering. Mitigation #1: Transient Mark mode, but hack C-x C-x behaviour
(defun spw/exchange-point-and-mark (arg)
  "Exchange point and mark, but reactivate mark a bit less often.

Specifically, invert the meaning of ARG in the case where
Transient Mark mode is on but the region is inactive."
  (interactive "P")
  (exchange-point-and-mark
   (if (and transient-mark-mode (not mark-active))
       (not arg)
     arg)))
(global-set-key [remap exchange-point-and-mark] 'spw/exchange-point-and-mark)
We avoid turning Transient Mark mode off, but mitigate the second of the two disadvantages given above. I can't figure out why it was thought to be a good idea to make C-x C-x reactivate the mark and require C-u C-x C-x to use the action of exchanging point and mark as a means of navigation. There needs to be a binding to reactivate the mark, but in roughly ten years of having Transient Mark mode turned on, I've found that the need to reactivate the mark doesn't come up often, so the shorter and longer bindings seem the wrong way around. Not sure what I'm missing here. Mitigation #2: disable Transient Mark mode, but enable it temporarily more often
(setq transient-mark-mode nil)
(defun spw/remap-mark-command (command &optional map)
  "Remap a mark-* command to temporarily activate Transient Mark mode."
  (let* ((cmd (symbol-name command))
         (fun (intern (concat "spw/" cmd)))
         (doc (concat "Call `"
                      cmd
                      "' and temporarily activate Transient Mark mode.")))
    (fset fun `(lambda ()
                 ,doc
                 (interactive)
                 (call-interactively #',command)
                 (activate-mark)))
    (if map
        (define-key map (vector 'remap command) fun)
      (global-set-key (vector 'remap command) fun))))
(dolist (command '(mark-word
                   mark-sexp
                   mark-paragraph
                   mark-defun
                   mark-page
                   mark-whole-buffer))
  (spw/remap-mark-command command))
(with-eval-after-load 'org
  (spw/remap-mark-command 'org-mark-subtree org-mode-map))
;; optional
(global-set-key "\M-=" (lambda () (interactive) (activate-mark)))
;; resettle the previous occupant
(global-set-key "\C-cw" 'count-words-region)
Here we remove both of the disadvantages of Transient Mark mode given above, and mitigate the main disadvantage of not activating Transient Mark mode by making it more convenient to activate it temporarily. For example, this enables using C-M-SPC C-M-SPC M-( to wrap the following two function arguments in parentheses. And you can hit M-h a few times to mark some blocks of text or code, then operate on them with commands like M-% and C-/ which behave differently when the region is active.1 Comparing these mitigations Both of these mitigations handle the second of the two disadvantages of Transient Mark mode given above. What remains, then, is
  1. under the effects of mitigation #1, how much of a barrier to using marks for navigational purposes is it to have to press C-SPC C-SPC instead of having a single binding, C-SPC, for all manual mark setting2
  2. under the effects of mitigation #2, how much of a barrier to taking advantage of commands which act differently when the region is active is it to have to temporarily enable Transient Mark mode with C-SPC C-SPC, M-= or one of the mark-* commands?
These are unknowns.3 So I'm going to have to experiment, I think, to determine which mitigation to use, if either. In particular, I don't know whether it's really significant that setting a mark for navigational purposes and for region marking purposes are distinct operations under mitigation #1. My plan is to start with mitigation #2 because that has the additional advantage of allowing me to confirm or disconfirm my belief that not being able to see where the region is will only rarely get in my way.

  1. The idea of making the mark-* commands activate the mark comes from an emacs-devel post by Stefan Monnier in the archives linked above.
  2. One remaining possibility I'm not considering is mitigation #1 plus binding something else to do the same as C-SPC C-SPC. I don't believe there are any easily rebindable keys which are easier to type than typing C-SPC twice. And this does not deal with the two distinct mark-setting operations problem.
  3. Another way to look at this is the question of which of setting a mark for navigational purposes and activating a mark should get C-SPC and which should get C-SPC C-SPC.

Reproducible Builds (diffoscope): diffoscope 146 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 146. This version includes the following changes:
[ Chris Lamb ]
* Refactor .changes and .buildinfo handling to show all details (including
  the GPG header and footer components), even when referenced files are not
  present. (Closes: reproducible-builds/diffoscope#122)
* Normalise filesystem stat(2) "birth times" (ie. st_birthtime) in the same
  way we do with stat(1)'s "Access:" and "Change:" times to fix a
  nondeterministic build failure on GNU Guix.
  (Closes: reproducible-builds/diffoscope#74)
* Drop the (default) subprocess.Popen(shell=False) keyword argument so that
  the more unsafe shell=True is more obvious.
* Ignore lower vs. upper-case when ordering our file format descriptions.
* Don't skip string normalisation in Black.
[ Mattia Rizzolo ]
* Add a "py3dist" override for the rpm-python module (Closes: #949598)
* Bump the debhelper compat level to 13 and use the new
  execute_after_*/execute_before_* style rules.
* Fix a spelling error in changelog.
[ Daniel Fullmer ]
* Mount GuestFS filesystem images readonly.
[ Jean-Romain Garnier ]
* Prevent an issue where (for example) LibarchiveMember's has_same_content
  method is called regardless of the actual type of file.
You can find out more by visiting the project homepage.

27 May 2020

Russ Allbery: Review: Middlegame

Review: Middlegame, by Seanan McGuire
Publisher: Tor
Copyright: May 2019
ISBN: 1-250-19551-9
Format: Kindle
Pages: 528
Roger and Dodger are cuckoo children, alchemical constructs created by other alchemical constructs masquerading as humans. They are halves of the primal force of the universe, the Doctrine of Ethos (which is not what the Doctrine of Ethos is, but that is one of my lesser problems with this book), divided into language and math and kept separate to properly mature. In this case, separate means being adopted by families on opposite coasts of the United States, ignorant of each other's existence and closely monitored by agents Reed controls. None of that prevents Roger and Dodger from becoming each other's invisible friends at the age of seven, effortlessly communicating psychically even though they've never met. That could have been the start of an enjoyable story that hearkened back to an earlier age of science fiction: the secret science experiments discover that they have more power than their creators expected, form a clandestine alliance, and fight back against the people who are trying to control them. I have fond memories of Escape to Witch Mountain and would have happily read that book. Unfortunately, that isn't the story McGuire wanted to tell. The story she told involves ripping Roger and Dodger apart, breaking Dodger, and turning Roger into an abusive asshole. Whooboy, where to start. This book made me very angry, in a way that I would not have been if it didn't contain the bones of a much better novel. Four of them, to be precise: four other books that would have felt less gratuitously cruel and less apparently oblivious to just how bad Roger's behavior is. There are some things to like. One of them is that the structure of this book is clever. I can't tell you how it's clever because the structure doesn't become clear until more than halfway through and it completely changes the story in a way that would be a massive spoiler. But it's an interesting spin on an old idea, one that gave Roger and Dodger a type of agency in the story that has far-ranging implications. I enjoyed thinking about it. That leads me to another element I liked: Erin. She makes only fleeting appearances until well into the story, but I thought she competed with Dodger for being the best character of the book. The second of the better novels I saw in the bones of Middlegame was the same story told from Erin's perspective. I found myself guessing at her motives and paying close attention to hints that led to a story with a much different emotional tone. Viewing the ending of the book through her eyes instead of Roger and Dodger's puts it in a different, more complicated, and more thought-provoking light. Unfortunately, she's not McGuire's protagonist. She instead is one of the monsters of this book, which leads to my first, although not my strongest, complaint. It felt like McGuire was trying too hard to write horror, packing Middlegame with the visuals of horror movies without the underlying structure required to make them effective. I'm not a fan of horror personally, so to some extent I'm grateful that the horrific elements were ineffective, but it makes for some frustratingly bad writing. For example, one of the longest horror scenes in the book features Erin, and should be a defining moment for the character. Unfortunately, it's so heavy on visuals and so focused on what McGuire wants the reader to be thinking that it doesn't show any of the psychology underlying Erin's decisions. 
The camera is pointed the wrong way; all the interesting storytelling work, moral complexity, and world-building darkness is happening in the character we don't get to see. And, on top of that, McGuire overuses foreshadowing so much that it robs the scene of suspense and terror. Again, I'm partly grateful, since I don't read books for suspense and terror, but it means the scene does only a fraction of the work it could. This problem of trying too hard extends to the writing. McGuire has a bit of a tendency in all of her books to overdo the descriptions, but is usually saved by narrative momentum. Unfortunately, that's not true here, and her prose often seems overwrought. She also resorts to this style of description, which never fails to irritate me:
The thought has barely formed when a different shape looms over him, grinning widely enough to show every tooth in its head. They are even, white, and perfect, and yet he somehow can't stop himself from thinking there's something wrong with them, that they're mismatched, that this assortment of teeth was never meant to share a single jaw, a single terrible smile.
This isn't effective. This is telling the reader how they're supposed to feel about the thing you're describing, without doing the work of writing a description that makes them feel that way. (Also, you may see what I mean by overwrought.) That leads me to my next complaint: the villains. My problem is not so much with Leigh, who I thought was an adequate monster, if a bit single-note. There's some thought and depth behind her arguments with Reed, a few hints of her own motives that were more convincing for not being fully shown. The descriptions of how dangerous she is were reasonably effective. She's a good villain for this type of dark fantasy story where the world is dangerous and full of terrors (and reminded me of some of the villains from McGuire's October Daye series). Reed, though, is a storytelling train wreck. The Big Bad of the novel is the least interesting character in it. He is a stuffed tailcoat full of malicious incompetence who is only dangerous because the author proclaims him to be. It only adds insult to injury that he kills off a far more nuanced and creative villain before the novel starts, replacing her ambiguous goals with Snidely Whiplash mustache-twirling. The reader has to suffer through extended scenes focused on him as he brags, monologues, and obsesses over his eventual victory without an ounce of nuance or subtlety. Worse is the dynamic between him and Leigh, which is only one symptom of the problem with Middlegame that made me the most angry: the degree to which this book oozes patriarchy. Every man in this book, including the supposed hero, orders around the women, who are forced in various ways to obey. This is the most obvious between Leigh and Reed, but it's the most toxic, if generally more subtle, between Roger and Dodger. Dodger is great. I had absolutely no trouble identifying with and rooting for her as a character. The nasty things that McGuire does to her over the course of the book (and wow does that never let up) made me like her more when she tenaciously refuses to give up. Dodger is the math component of the Doctrine of Ethos, and early in the book I thought McGuire handled that well, particularly given how difficult it is to write a preternatural genius. Towards the end of this book, her math sadly turns into a very non-mathematical magic (more on this in a moment), but her character holds all the way through. It felt like she carved her personality out of this story through sheer force of will and clung to it despite the plot. I wanted to rescue her from this novel and put her into a better book, such as the one in which her college friends (who are great; McGuire is very good at female friendships when she writes them) stage an intervention, kick a few people out of her life, and convince her to trust them. Unfortunately, Dodger is, by authorial fiat, half of a bound pair, and the other half of that pair is Roger, who is the sort of nice guy everyone likes and thinks is sweet and charming until he turns into an emotional trap door right when you need him the most and dumps you into the ocean to drown. And then somehow makes you do all the work of helping him feel better about his betrayal. The most egregious (and most patriarchal) thing Roger does in this book is late in the book and a fairly substantial spoiler, so I can't rant about that properly. But even before that, Roger keeps doing the same damn emotional abandonment trick, and the book is heavily invested in justifying it and making excuses for him. 
Excuses that, I should note, are not made for Dodger; her failings are due to her mistakes and weaknesses, whereas Roger's are natural reactions to outside forces. I got very, very tired of this, and I'm upset by how little awareness the narrative voice showed for how dysfunctional and abusive this relationship is. The solution is always for Dodger to reunite with Roger; it's built into the structure of the story. I have a weakness for the soul-bound pair, in part from reading a lot of Mercedes Lackey at an impressionable age, but one of the dangerous pitfalls of the concept is that the characters then have to have an almost flawless relationship. If not, it can turn abusive very quickly, since the characters by definition cannot leave each other. It's essentially coercive, so as soon as the relationship shows a dark side, the author needs to be extremely careful. McGuire was not. There is an attempted partial patch, late in the book, for the patriarchal structure. One of the characters complains about it, and another says that the gender of the language and math pairs is random and went either way in other pairs. Given that both of the pairs that we meet in this story have the same male-dominant gender dynamic, what I took from this is that McGuire realized there was a problem but wasn't able to fix it. (I'm also reminded of David R. Henry's old line that it's never a good sign when the characters start complaining about the plot.) The structural problems are all the more frustrating because I think there were ways out of them. Roger is supposedly the embodiment of language, not that you'd be able to tell from most scenes in this novel. For reasons that I do not understand, McGuire expressed that as a love of words: lexicography, translation, and synonyms. This makes no sense to me. Those are some of the more structured and rules-based (and hence mathematical) parts of language. If Roger had instead been focused on stories (collecting them, telling them, and understanding why and how they're told), he would have had a clearer contrast with Dodger. More importantly, it would have solved the plot problem that McGuire solved with a nasty bit of patriarchy. So much could have been done with Dodger building a structure of math around Roger's story-based expansion of the possible, and it would have grounded Dodger's mathematics in something more interesting than symbolic magic. To me, it's such an obvious lost opportunity. I'm still upset about this book. McGuire does a lovely bit of world-building with Asphodel Baker, what little we see of her. I found the hidden alchemical war against her work by L. Frank Baum delightful, and enjoyed every excerpt from the fictional Over the Woodward Wall scattered throughout Middlegame. But a problem with inventing a fictional book to excerpt in a real novel is that the reader may decide that the fictional book sounds a lot better than the book they're reading, and start wishing they could just read that book instead. That was certainly the case for me. I'm sad that Over the Woodward Wall doesn't exist, and am mostly infuriated by Middlegame. Dodger and Erin deserved to live in a better book. Should you want to read this anyway (and I do know people who liked it), serious content warning for self-harm. Rating: 4 out of 10

26 May 2020

Russell Coker: Cruises and Covid19

Problems With Cruises GQ has an insightful and detailed article about Covid19 and the Diamond Princess [1]; I recommend reading it. FastCompany has a brief article about bookings for cruises in August [2]. There have been many negative comments about this online. The first thing to note is that the cancellation policies on those cruises are more lenient than usual and the prices are lower. So it's not unreasonable for someone to put down a deposit on a half price holiday in the hope that Covid19 goes away (as so many prominent people have been saying it will) in the knowledge that they will get it refunded if things don't work out. Of course if the cruise line goes bankrupt then no-one will get a refund, but I think people are expecting that won't happen. The GQ article highlights some serious problems with the way cruise ships operate. They have staff crammed into small cabins and the working areas allow transmission of disease. These problems can be alleviated: they could allocate more space to staff quarters and have more capable air conditioning systems to put in more fresh air. During the life of a cruise ship significant changes are often made: replacing engines with newer more efficient models, changing the size of various rooms for entertainment, installing new waterslides, and many other changes are routinely made. Changing the staff-only areas to have better ventilation and more separate space (maybe capsule-hotel style cabins with fresh air piped in) would not be a difficult change. It would take some money and some dry-dock time which would be a significant expense for cruise companies. Cruises Are Great People like social environments; they want to have situations where there are as many people as possible without it becoming impossible to move. Cruise ships are carefully designed for the flow of passengers. Both the layout of the ship and the schedule of events are carefully planned to avoid excessive crowds. In terms of meeting the requirement of having as many people as possible in a small area without being unable to move, cruise ships are probably ideal. Because there is a large number of people in a restricted space there are economies of scale on a cruise ship that aren't available anywhere else. For example the main items on the menu are made in a production line process; this can only be done when you have hundreds of people sitting down to order at the same time. The same applies to all forms of entertainment on board, they plan the events based on statistical knowledge of what people want to attend. This makes it more economical to run than land based entertainment where people can decide to go elsewhere. On a ship a certain portion of the passengers will see whatever show is presented each night, regardless of whether it's singing, dancing, or magic. One major advantage of cruises is that they are all inclusive. If you are on a regular holiday would you pay to see a singing or dancing show? Probably not, but if it's included then you might as well do it and it will be pretty good. This benefit is really appreciated by people taking kids on holidays: if kids do things like refuse to attend a performance that you were going to see or reject food once it's served then it won't cost any extra. People Who Criticise Cruises For the people who sneer at cruises, do you like going to bars? Do you like going to restaurants? Live music shows? Visiting foreign beaches? A cruise gets you all that and more for a discount price. 
If Groupon had a deal that gave you a cheap hotel stay with all meals included, free non-alcoholic drinks at bars, day long entertainment for kids at the kids clubs, and two live performances every evening, how many of the people who reject cruises would buy it? A typical cruise is just like a Groupon deal for non-stop entertainment from 8AM to 11PM. Will Cruises Restart? The entertainment options that cruises offer are greatly desired by many people. Most cruises are aimed at budget travellers; the price is cheaper than a hotel in a major city. Such cruises greatly depend on economies of scale: if they can't get the ships filled then they would need to raise prices (thus decreasing demand) to try to make a profit. I think that some older cruise ships will be scrapped in the near future and some of the newer ships will be sold to cruise lines that cater to cheap travel (i.e. P&O may scrap some ships and some of the older Princess ships may be transferred to them). Overall I predict a decrease in the number of middle-class cruise ships. For the expensive cruises (where the cheapest cabins cost over $1000US per person per night) I don't expect any real changes, maybe they will have fewer passengers and higher prices to allow more social distancing or something. I am certain that cruises will start again, but it's too early to predict when. Going on a cruise is about as safe as going to a concert or a major sporting event. No-one is predicting that sporting stadiums will be closed forever or live concerts will be cancelled forever, so really no-one should expect that cruises will be cancelled forever. Whether companies that own ships or stadiums go bankrupt in the meantime is yet to be determined. One thing that's been happening for years is themed cruises. A group can book out an entire ship or part of a ship for a themed cruise. I expect this to become much more popular when cruises start again as it will make it easier to fill ships. In the past it seems that cruise lines let companies book their ships for events but didn't take much of an active role in the process. I think that the management of cruise lines will look to aggressively market themed cruises to anyone who might help; for starters they could reach out to every 80s and 90s pop group, since those fans are all old enough to be interested in themed cruises and the musicians won't be asking for too much money. Conclusion Humans are social creatures. People want to attend events with many other people. Covid19 won't be the last pandemic, and it may not even be eradicated in the near future. The possibility of having a society where no-one leaves home unless they are in a hazmat suit has been explored in science fiction, but I don't think that's a plausible scenario for the near future and I don't think that it's something that will be caused by Covid19.

25 May 2020

Bits from Debian: DebConf20 registration is open!

DebConf20 banner We are happy to announce that registration for DebConf20 is now open. The event will take place from August 23rd to 29th, 2020 at the University of Haifa, in Israel, and will be preceded by DebCamp, from August 16th to 22nd. Although the Covid-19 situation is still rather fluid, as of now, Israel seems to be on top of the situation. Days with fewer than 10 new diagnosed infections are becoming common and businesses and schools are slowly reopening. There is more (and up-to-date) information at the conference's FAQ. This means, barring a second wave, that there is reason to hope that the conference can go forward, and as such we are hoping that, at least as far as regulations go, we will be able to hold an in-person conference. For that, we need your help. We need to know, assuming health regulations permit it, how many people intend to attend. This year, probably more than ever before, prompt registration is very important to us. If after months of staying at home you feel that rubbing elbows with fellow Debian Developers is precisely the remedy that will salvage 2020, then we ask that you do register as soon as possible. Sadly, things are still not clear enough for us to make a final commitment to holding an in-person conference, but knowing how many people intend to attend will be a great help in making that decision. The deadline for deciding on postponing, cancelling or changing the format of the conference is June 8th. To register for DebConf20, please visit our website, log into the registration system and fill out the form. You can always edit or cancel your registration, but please note that the last day to confirm or cancel is July 26th, 2020 23:59:59 UTC. We cannot guarantee availability of accommodation, food and swag for unconfirmed registrations. We do suggest that attendees begin making travel arrangements as soon as possible, of course. Please bear in mind that most air carriers allow free cancellations and changes. Any questions about registrations should be addressed to registration@debconf.org. Bursary for travel, accommodation and meals In an effort to widen the diversity of DebConf attendees, the Debian Project allocates a part of the financial resources obtained through sponsorships to pay for bursaries (travel, accommodation, and/or meals) for participants who request this support when they register. As resources are limited, we will examine the requests and decide who will receive the bursaries. Giving a talk, organizing an event or helping during DebConf20 is taken into account when deciding upon your bursary, so please mention it in your bursary application. For more information about bursaries, please visit Applying for a Bursary to DebConf. Attention: the deadline to apply for bursaries using the registration form is May 31st, 2020 23:59:59 UTC. This deadline is necessary in order for the organisers to have some time to analyze the requests. To register for the Conference, either with or without a bursary request, please visit: https://debconf20.debconf.org/register Participation in DebConf20 is conditional on your respect of our Code of Conduct. We require you to read, understand and abide by this code. DebConf would not be possible without the generous support of all our sponsors, especially our Platinum Sponsor Lenovo and Gold Sponsors deepin and Matanel Foundation. DebConf20 is still accepting sponsors; if you are interested, or think you know of others who would be willing to help, please get in touch!

24 May 2020

François Marier: Printing hard-to-print PDFs on Linux

I recently found a few PDFs which I was unable to print due to those files causing insufficient printer memory errors. I found a detailed explanation of what might be causing this which pointed the finger at transparent images, a PDF 1.4 feature which apparently requires a more recent version of PostScript than what my printer supports. Using Okular's Force rasterization option (accessible via the print dialog) does work by essentially rendering everything ahead of time and outputting a big image to be sent to the printer. The quality is not very good, however.

Converting a PDF to DjVu The best solution I found makes use of a different file format: .djvu Such files are not PDFs, but can still be opened in Evince and Okular, as well as in the dedicated DjVuLibre application. As an example, I was unable to print page 11 of this paper. Using pdfinfo, I found that it is in PDF 1.5 format and so the transparency effects could be the cause of the out-of-memory printer error. Here's how I converted it to a high-quality DjVu file I could print without problems using Evince:
pdf2djvu -d 1200 2002.04049.pdf > 2002.04049-1200dpi.djvu

Converting a PDF to PDF 1.3 I also tried the DjVu trick on a different unprintable PDF, but it failed to print, even after lowering the resolution to 600dpi:
pdf2djvu -d 600 dow-faq_v1.1.pdf > dow-faq_v1.1-600dpi.djvu
In this case, I used a different technique and simply converted the PDF to version 1.3 (from version 1.6 according to pdfinfo):
ps2pdf13 -r1200x1200 dow-faq_v1.1.pdf dow-faq_v1.1-1200dpi.pdf
This eliminates the problematic transparency and rasterizes the elements that version 1.3 doesn't support.
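If you want to check which PDF version a file uses before deciding which conversion to apply, pdfinfo (packaged as poppler-utils on Debian) reports it. A minimal sketch using the FAQ file from the example above; the version shown matches what the post reports for that file:
$ pdfinfo dow-faq_v1.1.pdf | grep -i 'pdf version'
PDF version:    1.6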

23 May 2020

Dirk Eddelbuettel: RcppSimdJson 0.0.5: Updated Upstream

A new RcppSimdJson release with updated upstream simdjson code just arrived on CRAN. RcppSimdJson wraps the fantastic and genuinely impressive simdjson library by Daniel Lemire and collaborators. Via some very clever algorithmic engineering to obtain largely branch-free code, coupled with modern C++ and newer compiler instructions, it parses gigabytes of JSON per second, which is quite mind-boggling. The best-case performance is faster than CPU speed as use of parallel SIMD instructions and careful branch avoidance can lead to less than one CPU cycle used per byte parsed; see the video of the recent talk by Daniel Lemire at QCon (which was also voted best talk). This release brings updated upstream code (thanks to Brendan Knapp) plus a new example and minimal tweaks. The full NEWS entry follows.

Changes in version 0.0.5 (2020-05-23)
  • Add parseExample from earlier upstream announcement (Dirk).
  • Synced with upstream (Brendan in #12, closing #11).
  • Updated example parseExample to API changes (Brendan).

Courtesy of CRANberries, there is also a diffstat report for this release. For questions, suggestions, or issues please use the issue tracker at the GitHub repo. If you like this or other open-source work I do, you can now sponsor me at GitHub. For the first year, GitHub will match your contributions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

18 May 2020

François Marier: Displaying client IP address using Apache Server-Side Includes

If you use a Dynamic DNS setup to reach machines which are not behind a stable IP address, you will likely have a need to probe these machines' public IP addresses. One option is to use an insecure service like Oracle's http://checkip.dyndns.com/ which echoes back your client IP, but you can also do this on your own server if you have one. There are multiple options to do this, like writing a CGI or PHP script, but those are fairly heavyweight if that's all you need mod_cgi or PHP for. Instead, I decided to use Apache's built-in Server-Side Includes.

Apache configuration Start by turning on the include filter by adding the following in /etc/apache2/conf-available/ssi.conf:
AddType text/html .shtml
AddOutputFilter INCLUDES .shtml
and making that configuration file active:
a2enconf ssi
Then, find the vhost file where you want to enable SSI and add the following options to a Location or Directory section:
<Location /ssi_files>
    Options +IncludesNOEXEC
    SSLRequireSSL
    Header set Content-Security-Policy: "default-src 'none'"
    Header set X-Content-Type-Options: "nosniff"
</Location>
before adding the necessary modules:
a2enmod headers
a2enmod include
and restarting Apache:
apache2ctl configtest && systemctl restart apache2.service
Create an shtml page With the web server ready to process SSI instructions, the following HTML blurb can be used to display the client IP address:
<!--#echo var="REMOTE_ADDR" -->
or any other built-in variable. Note that you don't need to write valid HTML for the variable to be substituted and so the above one-liner is all I use on my server.
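To confirm that the substitution works, you can simply request the page from a client and you should get that client's address back. A hypothetical check, assuming the Location block above is served as /ssi_files on your vhost (the hostname and returned address here are placeholders):
$ curl -s https://www.example.com/ssi_files/whatsmyip.shtml
203.0.113.27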

Security concerns The first thing to note is that the configuration section uses the IncludesNOEXEC option in order to disable arbitrary command execution via SSI. In addition, you can also make sure that the cgi module is disabled since that's a dependency of the more dangerous side of SSI:
a2dismod cgi
Of course, if you rely on this IP address to be accurate, for example because you'll be putting it in your DNS, then you should make sure that you only serve this page over HTTPS, which can be enforced via the SSLRequireSSL directive. I included two other headers in the above vhost config (Content-Security-Policy and X-Content-Type-Options) in order to limit the damage that could be done in case a malicious file was accidentally dropped in that directory. Finally, I suggest making sure that only the root user has writable access to the directory which has server-side includes enabled:
$ ls -la /var/www/ssi_includes/
total 12
drwxr-xr-x  2 root     root     4096 May 18 15:58 .
drwxr-xr-x 16 root     root     4096 May 18 15:40 ..
-rw-r--r--  1 root     root        0 May 18 15:46 index.html
-rw-r--r--  1 root     root       32 May 18 15:58 whatsmyip.shtml

Arturo Borrero González: A better Toolforge: upgrading the Kubernetes cluster

Logos This post was originally published in the Wikimedia Tech blog, and is authored by Arturo Borrero Gonzalez and Brooke Storm. One of the most successful and important products provided by the Wikimedia Cloud Services team at the Wikimedia Foundation is Toolforge. Toolforge is a platform that allows users and developers to run and use a variety of applications that help the Wikimedia movement and mission from the technical point of view in general. Toolforge is a hosting service commonly known in the industry as a Platform as a Service (PaaS). Toolforge is powered by two different backend engines, Kubernetes and GridEngine. This article focuses on how we made a better Toolforge by integrating a newer version of Kubernetes and, along with it, some more modern workflows. The starting point in this story is 2018. Yes, two years ago! We identified that we could do better with our Kubernetes deployment in Toolforge. We were using a very old version, v1.4. Using an old version of any software has more or less the same consequences everywhere: you lack security improvements and some modern key features. Once it was clear that we wanted to upgrade our Kubernetes cluster, both the engineering work and the endless chain of challenges started. It turns out that Kubernetes is a complex and modern technology, which adds some extra abstraction layers to add flexibility and some intelligence to a very old systems engineering need: hosting and running a variety of applications. Our first challenge was to understand what our use case for a modern Kubernetes was. We were particularly interested in some key features: Soon enough we faced another Kubernetes native challenge: the documentation. For a newcomer, learning and understanding how to adapt Kubernetes to a given use case can be really challenging. We identified some baffling patterns in the docs. For example, different documentation pages would assume you were using different Kubernetes deployments (Minikube vs kubeadm vs a hosted service). We are running Kubernetes like you would on bare-metal (well, in CloudVPS virtual machines), and some documents directly referred to ours as a corner case. During late 2018 and early 2019, we started brainstorming and prototyping. We wanted our cluster to be reproducible and easily rebuildable, and in the Technology Department at the Wikimedia Foundation, we rely on Puppet for that. One of the first things to decide was how to deploy and build the cluster while integrating with Puppet. This is not as simple as it seems because Kubernetes itself is a collection of reconciliation loops, just like Puppet is. So we had to decide what to put directly in Kubernetes and what to control and make visible through Puppet. We decided to stick with kubeadm as the deployment method, as it seems to be the more upstream-standardized tool for the task. We had to make some interesting decisions by trial and error, like where to run the required etcd servers, what the kubeadm init file would look like, how to proxy and load-balance the API on our bare-metal deployment, what network overlay to choose, etc. If you take a look at our public notes, you can get a glimpse of the number of decisions we had to make. Our Kubernetes wasn't going to be a generic cluster; we needed a Toolforge Kubernetes service. This means we don't use some of the components, and also, we add some additional pieces and configurations to it. By the second half of 2019, we were working full-speed on the new Kubernetes cluster. 
We already had an idea of what we wanted and how to do it. There were a couple of important topics for discussion, for example: We will describe in detail the final state of those pieces in another blog post, but each of the topics required several hours of engineering time, research, tests, and meetings before reaching a point in which we were comfortable with moving forward. By the end of 2019 and early 2020, we felt like all the pieces were in place, and we started thinking about how to migrate the users, the workloads, from the old cluster to the new one. This migration plan mostly materialized in a Wikitech page which contains concrete information for our users and the community. The interaction with the community was a key success element. Thanks to our vibrant and involved users, we had several early adopters and beta testers that helped us identify early flaws in our designs. The feedback they provided was very valuable for us. Some folks helped solve technical problems, helped with the migration plan or even helped make some design decisions. Worth noting that some of the changes that were presented to our users were not easy for them to handle, like new quotas and usage limits. Introducing new workflows and deprecating old ones is always a risky operation. Even though the migration procedure from the old cluster to the new one was fairly simple, there were some rough edges. We helped our users navigate them. A common issue was a webservice not being able to run in the new cluster due to stricter quotas limiting the resources for the tool. Another example is the new Ingress layer failing to properly work with some webservices' particular options. By March 2020, we no longer had anything running in the old Kubernetes cluster, and the migration was completed. We then started thinking about another step towards making a better Toolforge, which is introducing the toolforge.org domain. There is plenty of information about the change to this new domain in Wikitech News. The community wanted a better Toolforge, and so do we, and after almost 2 years of work, we have it! All the work that was done represents the commitment of the Wikimedia Foundation to support the technical community and how we really want to pursue technical engagement in general in the Wikimedia movement. In a follow-up post we will present and discuss in more depth some technical details of the new Kubernetes cluster, stay tuned! This post was originally published in the Wikimedia Tech blog, and is authored by Arturo Borrero Gonzalez and Brooke Storm.

17 May 2020

Steve Kemp: Some brief sysbox highlights

I started work on sysbox again recently, adding a couple of simple utilities. (The whole project is a small collection of utilities, distributed as a single binary to ease installation.) Imagine you want to run a command for every line of STDIN, here's a good example:
 $ cat input | sysbox exec-stdin "youtube-dl {}"
Here you see for every (non-empty) line of input read from STDIN the command "youtube-dl" has been executed. "{}" gets expanded to the complete line read. You can also access individual fields, kinda like awk. (Yes youtube-dl can read a list of URLs from a file, this is an example!) Another example, run groups for every local user:
$ cat /etc/passwd | sysbox exec-stdin --split=: groups {1}
Here you see we have split the input-lines read from STDIN by the : character, instead of by whitespace, and we've accessed the first field via "{1}". This is certainly easier for scripting than using a bash loop (a rough equivalent loop is sketched after the completion snippet below). On the topic of bash, command-completion for each subcommand, and their arguments, is now present:
$ source <(sysbox bash-completion)
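For comparison, here is a rough bash-loop equivalent of the /etc/passwd example above. This is only a sketch of what exec-stdin saves you from writing; it is not part of sysbox itself:
# A rough bash equivalent: split each line on ":" and run groups on the first field
while IFS=: read -r user _rest; do
    groups "$user"
done < /etc/passwd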
And I've added a text-based UI for selecting files. You can also execute a command against the selected file:
$ sysbox choose-file -exec "xine {}" /srv/tv
This is what that looks like: /2020/05/17-choose-file.png As well as choosing files, you can also select from lines read via STDIN, and you can filter the command in the same way as before. (i.e. "{}" is the selected item.) Other commands received updates, so the calculator now allows storing results in variables:
$ sysbox calc
calc> let a = 3
3
calc> a / 9 * 3
1
calc> 1 + 2 * a
7
calc> 1.2 + 3.4
4.600000

Erich Schubert: Contact Tracing Apps are Useless

Some people believe that automatic contact tracing apps will help contain the Coronavirus epidemic. They won't. Sorry to bring the bad news, but IT and mobile phones and artificial intelligence will not solve every problem. In my opinion, those that promise to solve these things with artificial intelligence / mobile phones / apps / your-favorite-buzzword are at least overly optimistic and blinder Aktionismus (*), if not naive, detached from reality, or fraudsters that just want to get some funding. (*) there does not seem to be an English word for this: doing something just for the sake of doing something, without thinking about whether it makes sense to do so. Here are the reasons why it will not work:
  1. Signal quality. Forget detecting proximity with Bluetooth Low Energy. Yes, there are attempts to use BLE beacons for indoor positioning. But these rely on learning fingerprints of which beacons are visible at which points, combined with additional information such as movement sensors and history (you do not teleport around in a building). BLE signals and antennas apparently tend to be very prone to orientation differences, signal reflections, and of course you will not have the idealized controlled environment used in such prototypes. The contacts have a single device, and they move; this is not comparable to indoor positioning. I strongly doubt you can tell whether you are close to someone, or not.
  2. Close vs. protection. The app cannot detect protection in place. Being close to someone behind a plexiglass window or even a solid wall is very different from being close otherwise. You will get a lot of false contacts this way. That neighbor that you have never seen, living in the apartment above, will likely be considered a close contact of yours, as you sleep next to each other every day.
  3. Low adoption rates. Apparently even in technology-affine Singapore, fewer than 20% of people installed the app. That does not even mean they use it regularly. In Austria, the number is apparently below 5%, and people complain that it does not detect contacts. But in order for this approach to work, you will need Chinese-style mass surveillance that literally puts you in prison if you do not install the app.
  4. False alerts. Because of these issues, you will get false alerts, until you just do not care anymore.
  5. False sense of security. Honestly: the app does not protect you at all. All it tries to do is to make the tracing of contacts easier. It will not tell you reliably if you have been infected (as mentioned above, too many false positives, too few users) nor that you are relatively safe (too few contacts included, too slow testing and reporting). It will all be on the quality of "about 10 days ago you may or may not have had contact with someone that tested positive, please contact someone to expose more data to tell you that it is actually another false alert".
  6. Trust. In Germany, the app will be operated by T-Systems and SAP. Not exactly two companies that have a lot of fans; SAP seems to make some of the most hated software around. Neither company is known for caring about privacy much, but they are prototypical for "business first". It's trusting the cat to keep the cream. Yes, I know they want to make it open-source. But likely only the client, and you will still have to trust that the binary in the app stores is actually built from this source code, and not from a modified copy. As long as the names T-Systems and SAP are associated with the app, people will not trust it. Plus, we all know that the app will be bad, given the reputation of these companies at making horrible software systems.
  7. Too late. SAP and T-Systems want to have the app ready in mid June. Seriously, this must be a joke? It will be very buggy in the beginning (because it is SAP!) and it will not be working reliably before end of July. There will not be a substantial user base before fall. But given the low infection rates in Germany, nobody will bother to install it anymore, because the perceived benefit is 0 once the infection rates are low.
  8. Infighting. You may remember that there was the discussion before that there should be a pan-European effort. Except that in the end, everybody fought everybody else, countries went into different directions and they all broke up. France wanted a centralized system, while in Germany people pointed out that the users will not accept this and only a distributed system will have a chance. That failed effort was known as "Pan-European Privacy-Preserving Proximity Tracing (PEPP-PT)" vs. "Decentralized Privacy-Preserving Proximity Tracing (DP-3T)", and it turned out to have become a big clusterfuck. And that is just the tip of the iceberg.
Iceland, probably the country that handled the Corona crisis best (they issued a travel advisory against Austria, when they were still happily spreading the virus at apres-ski; they massively tested, and got the infections down to almost zero within 6 weeks), has been experimenting with such an app. Iceland as a fairly close community managed to have almost 40% of people install their app. So did it help? No: "The technology is more or less ... I wouldn't say useless [...] it wasn't a game changer for us." The contact tracing app is just a huge waste of effort and public money. And pretty much the same applies to any other attempts to solve this with IT. There is a lot of buzz about solving the Corona crisis with artificial intelligence: bullshit! That is just naive. Do not speculate about the magic power of AI. Get the data, understand the data, and you will see it does not help. Because it's real data. It's dirty. It's late. It's contradictory. It's incomplete. It is everything that AI currently cannot handle well. This is not image recognition. You have no labels. Many of the attempts in this direction already fail at the trivial 7-day seasonality you observe in the data. For example, the widely known Johns Hopkins "Has the curve flattened" trend has a stupid, useless indicator based on 5-day averages. And hence you get the weekly ups and downs due to weekends. They show pretty up and down indicators. But these are affected mostly by the day of the week. And nobody cares. Notice that they currently even have big negative infections in their plots? There is no data on when someone was infected. Because such data simply does not exist. What you have is data on when someone tested positive (mostly), when someone reported symptoms (sometimes, but some never have symptoms!), and when someone dies (but then you do not know if it was because of Corona, because of other issues that just became worse because of Corona, or being hit by a car without any relation to Corona). The data that we work with is incredibly delayed, yet we pretend it is "live". Stop reading tea leaves. Stop pretending AI can save the world from Corona.
