Search Results: "spa"

6 June 2020

Petter Reinholdtsen: Secure Socket API - a simple and powerful approach for TLS support in software

As a member of the Norwegian Unix User Group, I have the pleasure of receiving the USENIX magazine ;login: several times a year. I rarely have time to read all the articles, but try to at least skim through them all as there is a lot of nice knowledge passed on there. I even carry the latest issue with me most of the time to try to get through all the articles when I have a few spare minutes. The other day I came across a nice article titled "The Secure Socket API: TLS as an Operating System Service" with a marvellous idea I hope can make it all the way into the POSIX standard. The idea is as simple as it is powerful. By introducing a new socket() protocol value, IPPROTO_TLS, and a system-wide service to handle setting up TLS connections, one both makes it trivial to add TLS support to any program currently using the POSIX socket API, and gains system-wide control over certificates, TLS versions and the encryption algorithms used. Instead of doing this:
int socket = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
the program code would be doing this:
int socket = socket(PF_INET, SOCK_STREAM, IPPROTO_TLS);
According to the ;login: article, converting a C program to use TLS would normally require modifying only 5-10 lines of code, which is amazing compared to using, for example, the OpenSSL API. The project has set up the https://securesocketapi.org/ web site to spread the idea, and the code for a kernel module and the associated system daemon is available from two github repositories: ssa and ssa-daemon. Unfortunately there is no explicit license information with the code, so its copyright status is unclear. A request to clarify this has been open since 2018-08-17. I love the idea of extending socket() to gain TLS support, and understand why it is an advantage to implement this as a kernel module and system-wide service daemon, but cannot help thinking that it would be a lot easier to get projects to move to this way of setting up TLS if it were done with a user-space approach, where programs wanting to use this API could simply link with a wrapper library. I recommend you check out this simple and powerful approach to more secure network connections. :) As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

Russell Coker: Comparing Compression

I just did a quick test of different compression options in Debian. The source file is a 1.1G MySQL dump file. The time is user CPU time on an i7-930 running under KVM; the compression programs may have different levels of optimisation for other CPU families. Facebook people designed the zstd compression system (here's a page giving an overview of it [1]). It has some interesting new features that can provide real differences at scale (like unusually large windows and pre-defined dictionaries), but I just tested the default mode and the -9 option for more compression. For the SQL file zstd -9 provides significantly better compression than gzip while taking only slightly less CPU time than gzip -9, while zstd with the default option (equivalent to zstd -3) gives much faster compression than gzip -9 while also being slightly smaller. For this use case bzip2 is too slow for inline compression of a MySQL dump as the dump process locks tables and can hang clients. The lzma and xz compression algorithms provide significant benefits in size but the time taken is grossly disproportionate. In a quick check of my collection of files compressed with gzip I was only able to find 1 file that got less compression with zstd with default options, and that file got better compression with zstd -9. So zstd seems to beat gzip everywhere by every measure. The bzip2 compression seems to be obsolete; zstd -9 is much faster and has slightly smaller output. Both xz and lzma seem to offer a combination of compression and time taken that zstd can't beat (for this file type at least). The ultra compression mode 22 gives 2% smaller output files, but almost 28 minutes of CPU time for compression is a bit ridiculous. There is a threaded mode for zstd that could potentially allow a shorter wall clock time for zstd --ultra -22 than lzma/xz while also giving better compression.
Compression Time Size
zstd 5.2s 130m
zstd -9 28.4s 114m
gzip -9 33.4s 141m
bzip2 -9 3m51 119m
lzma 6m20 97m
xz 6m36 97m
zstd -19 9m57 99m
zstd --ultra -22 27m46 95m
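For anyone who wants to repeat the comparison, here is a rough sketch of the measurement loop (my own illustration, not from the original post): dump.sql stands in for the 1.1G MySQL dump, and GNU time (/usr/bin/time, from the time package) is assumed for the user-CPU figure.
for c in "gzip -9" "bzip2 -9" "lzma" "xz" "zstd" "zstd -9" "zstd -19" "zstd --ultra -22"; do
    /usr/bin/time -f "$c: %U s user" sh -c "$c -c dump.sql > dump.out"   # user CPU time
    echo "$c: $(stat -c %s dump.out) bytes"                              # output size
done
rm -f dump.out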
Conclusion
For distributions like Debian, which have large archives of files that are compressed once and transferred a lot, the zstd --ultra -22 compression might be useful with multi-threaded compression. But given that Debian already has xz in use it might not be worth changing until faster CPUs with lots of cores become more commonly available. One could argue that for Debian it doesn't make sense to change from xz as hard drives seem to be getting larger capacity (and also smaller physical size) faster than the Debian archive is growing. One possible reason for adopting zstd in a distribution like Debian is that there are more tuning options for things like memory use. It would be possible to have packages for an architecture like ARM, which tends to have less RAM, compressed in a way that decreases memory use on decompression.
For general compression such as compressing log files and making backups it seems that zstd is the clear winner. Even bzip2 is far too slow and in my tests zstd clearly beats gzip for every combination of compression and time taken. There may be some corner cases where gzip can compete on compression time due to CPU features, optimisation for CPUs, etc, but I expect that in almost all cases zstd will win for compression size and time. As an aside, I once noticed the 32bit version of gzip compressing faster than the 64bit version on an Opteron system; the 32bit version had assembly optimisation and the 64bit version didn't at that time.
To create a tar archive you can run tar czf or tar cJf to create an archive with gzip or xz compression. To create an archive with zstd compression you have to use tar --zstd -cf; that's 7 extra characters to type. It's likely that for most casual archive creation (EG for copying files around on a LAN or USB stick) saving 7 characters of typing is more of a benefit than saving a small amount of CPU time and storage space. It would be really good if tar got a single character option for zstd compression. The external dictionary support in zstd would work really well with rsync for backups. Currently rsync only supports zlib; adding zstd support would be a good project for someone (unfortunately I don't have enough spare time). Now I will change my database backup scripts to use zstd. Update: The command tar acvf a.zst filenames will create a zstd compressed tar archive, the a option to GNU tar makes it autodetect the compression type from the file name. Thanks Enrico!

5 June 2020

Sean Whitton: Space cadet rebindings

I've been less good at taking adequate typing breaks during the lockdown and I've become concerned about how much chording my left hand does on its own during typical Emacs usage, with caps lock rebound to control, as I've had it for years. I thought that now was as good a time as any to do something drastic about this. Here are my rebindings: Optional extras: This has the following advantages: The main disadvantage, aside from an adjustment period when I feel that someone has inserted a massive marshmallow between me and my computer, is that Ctrl-Alt combinations are a bit difficult; in Emacs, C-M-SPC is hard to do without. However I think I've found a decent way to do it (thumb on control, curled ring finger on alt, possibly little finger on shift for Emacs' infamous C-M-S-v standard binding).

4 June 2020

Antoine Beaupré: Replacing Smokeping with Prometheus

I've been struggling with replacing parts of my old sysadmin monitoring toolkit (previously built with Nagios, Munin and Smokeping) with more modern tools (specifically Prometheus, its "exporters" and Grafana) for a while now. Replacing Munin with Prometheus and Grafana is fairly straightforward: the network architecture ("server pulls metrics from all nodes") is similar and there are lots of exporters. They are a little harder to write than Munin modules, but that makes them more flexible and efficient, which was a huge problem in Munin. I wrote a Migrating from Munin guide that summarizes those differences. Replacing Nagios is much harder, and I still haven't quite figured out if it's worth it.

How does Smokeping work?
Leaving those two aside for now, I'm left with Smokeping, which I used in my previous job to diagnose routing issues, using Smokeping as a decentralized looking glass, which was handy to debug long term issues. Smokeping is a strange animal: it's fundamentally similar to Munin, except it's harder to write plugins for it, so most people just use it for ping, something at which it excels. Its trick is this: instead of doing a single ping and returning that one metric, it does multiple pings and returns multiple metrics. Specifically, smokeping will send multiple ICMP packets (20 by default), with a low interval (500ms by default) and a single retry. It also pings multiple hosts at once, which means it can quickly scan multiple hosts simultaneously. You therefore see network conditions affecting one host reflected in further hosts down (or up) the chain. The multiple metrics also mean you can draw graphs with "error bars", which Smokeping shows as "smoke" (hence the name). You also get per-metric packet loss. Basically, smokeping runs this command and collects the output in an RRD database:
fping -c $count -q -B $backoff -r $retry -4 -b $packetsize -t $timeout -i $mininterval -p $hostinterval $host [ $host ...]
... where those parameters are, by default:
  • $count is 20 (packets)
  • $backoff is 1 (avoid exponential backoff)
  • $timeout is 1.5s
  • $mininterval is 0.01s (minimum wait interval between any target)
  • $hostinterval is 1.5s (minimum wait between probes on a single target)
It can also override stuff like the source address and TOS fields. This probe will complete between 30 and 60 seconds, if my math is right (0% and 100% packet loss).
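For illustration (my own expansion, not from the original article), plugging those defaults into the command above for a single target looks like this; fping takes -t, -i and -p in milliseconds, $packetsize is left symbolic because its default is not listed here, and example.net is a placeholder host:
fping -c 20 -q -B 1 -r 1 -4 -b $packetsize -t 1500 -i 10 -p 1500 example.net
Twenty packets spaced 1.5 seconds apart is roughly 30 seconds when every packet answers, which is where the lower bound of the estimate above comes from.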

How to draw Smokeping graphs in Grafana
A naive implementation of Smokeping in Prometheus/Grafana would be to use the blackbox exporter and create a dashboard displaying those metrics. I've done this at home, and then I realized that I was missing something. Here's what I did.
  1. install the blackbox exporter:
    apt install prometheus-blackbox-exporter
    
  2. make sure to allow capabilities so it can ping:
    dpkg-reconfigure prometheus-blackbox-exporter
    
  3. hook monitoring targets into prometheus.yml (the default blackbox exporter configuration is fine):
    scrape_configs:
      - job_name: blackbox
        metrics_path: /probe
        params:
          module: [icmp]
        scrape_interval: 5s
        static_configs:
          - targets:
              - octavia.anarc.at
              # hardcoded in DNS
              - nexthop.anarc.at
              - koumbit.net
              - dns.google
        relabel_configs:
          - source_labels: [__address__]
            target_label: __param_target
          - source_labels: [__param_target]
            target_label: instance
          - target_label: __address__
            replacement: 127.0.0.1:9115  # The blackbox exporter's real hostname:port.
    
    Notice how we lower the scrape_interval to 5 seconds to get more samples. nexthop.anarc.at was added into DNS to avoid hardcoding my upstream ISP's IP in my configuration. (A manual curl check at the end of this section shows a quick way to verify that the exporter answers.)
  4. create a Grafana panel to graph the results. first, add this query:
    sum(probe_icmp_duration_seconds{phase="rtt"}) by (instance)
    
    • Set the Legend field to {{instance}} RTT
    • Set Draw modes to lines and Mode options to staircase
    • Set the Left Y axis Unit to duration(s)
    • Show the Legend As table, with Min, Avg, Max and Current enabled
    Then add this query, for packet loss:
    1-avg_over_time(probe_success[$__interval])!=0 or null
    
    • Set the Legend field to {{instance}} packet loss
    • Add a series override with Lines: false, Null point mode: null, Points: true, Point radius: 1, Color: deep red, and, most importantly, Y-axis: 2
    • Set the Right Y axis Unit to percent (0.0-1.0) and set Y-max to 1
    Then set the entire thing to Repeat, on target, vertically. And you need to add a target variable like label_values(probe_success, instance).
The result looks something like this:
A plot of RTT and packet loss over time of three nodes. Not bad, but not Smokeping.
This actually looks pretty good! I've uploaded the resulting dashboard in the Grafana dashboard repository.
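As an aside, the exporter can also be scraped by hand as a quick sanity check before blaming Grafana; the port, path and parameters below come straight from the prometheus.yml snippet above (this check is my own addition to the walkthrough):
curl -s 'http://127.0.0.1:9115/probe?module=icmp&target=dns.google' | grep -E 'probe_success|probe_icmp_duration_seconds'
A probe_success of 1 and non-zero probe_icmp_duration_seconds phases confirm that the ICMP module and its capabilities are working.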

What is missing?
Now, that doesn't exactly look like Smokeping, does it? It's pretty good, but it's not quite what we want. What is missing is variance, the "smoke" in Smokeping. There's a good article about replacing Smokeping with Grafana. They wrote a custom script to write samples into InfluxDB so unfortunately we can't use it in this case, since we don't have InfluxDB's query language. I couldn't quite figure out how to do the same in PromQL. I tried:
stddev(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"})
stddev_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[$__interval])
stddev_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[1m])
The first two give zero for all samples. The latter works, but doesn't look as good as Smokeping. So there might be something I'm missing. SuperQ wrote a special exporter for this called smokeping_prober that came out of this discussion in the blackbox exporter. Instead of delegating scheduling and target definition to Prometheus, the targets are set in the exporter. They also take a different approach than Smokeping: instead of recording the individual variations, they delegate that to Prometheus, through the use of "buckets". Then they use a query like this:
histogram_quantile(0.9, rate(smokeping_response_duration_seconds_bucket[$__interval]))
This is the rationale for SuperQ's implementation:
Yes, I know about smokeping's bursts of pings. IMO, smokeping's data model is flawed that way. This is where I intentionally deviated from the smokeping exact way of doing things. This prober sends a smooth, regular series of packets in order to be measuring at regular controlled intervals. Instead of 20 packets, over 10 seconds, every minute. You send one packet per second and scrape every 15. This has the same overall effect, but the measurement is, IMO, more accurate, as it's a continuous stream. There's no 50 second gap of no metrics about the ICMP stream. Also, you don't get back one metric for those 20 packets, you get several. Min, Max, Avg, StdDev. With the histogram data, you can calculate much more than just that using the raw data. For example, IMO, avg and max are not all that useful for continuous stream monitoring. What I really want to know is the 90th percentile or 99th percentile. This smokeping prober is not intended to be a one-to-one replacement for exactly smokeping's real implementation. But simply provide similar functionality, using the power of Prometheus and PromQL to make it better. [...] one of the reason I prefer the histogram datatype, is you can use the heatmap panel type in Grafana, which is superior to the individual min/max/avg/stddev metrics that come from smokeping. Say you had two routes, one slow and one fast. And some pings are sent over one and not the other. Rather than see a wide min/max equaling a wide stddev, the heatmap would show a "line" for both routes.
That's an interesting point. I have also ended up adding a heatmap graph to my dashboard, independently. And it is true it shows those "lines" much better... So maybe, if we ignore legacy, we're actually happy with what we get, even with the plain blackbox exporter. So yes, we're missing pretty "fuzz" lines around the main lines, but maybe that's alright. It would be possible to do the equivalent of the InfluxDB hack, with queries like:
min_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[30s])
avg_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[5m])
max_over_time(probe_icmp_duration_seconds{phase="rtt",instance=~"$instance"}[30s])
The output looks something like this:
A plot of RTT and packet loss over time of three nodes, with min/max bands. Looks more like Smokeping!
But there's a problem there: see how the middle graph "dips" sometimes below 20ms? That's the min_over_time function (incorrectly, IMHO) returning zero. I haven't quite figured out how to fix that, and I'm not sure it is better. But it does look more like Smokeping than the previous graph. Update: I forgot to mention one big thing that this setup is missing. Smokeping has this nice feature that you can order and group probe targets in a "folder"-like hierarchy. It is often used to group probes by location, which makes it easier to scan a lot of targets. This is harder to do in this setup. It might be possible to set up location-specific "jobs" and select based on that, but it's not exactly the same.

Credits
Credits to Chris Siebenmann for his article about Prometheus and pings, which gave me the avg_over_time query idea.

Reproducible Builds: Reproducible Builds in May 2020

Welcome to the May 2020 report from the Reproducible Builds project. One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. Nonetheless, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into seemingly secure software during the various compilation and distribution processes. In these reports we outline the most important things that we and the rest of the community have been up to over the past month.

News
The Corona-Warn app that helps trace infection chains of SARS-CoV-2/COVID-19 in Germany had a feature request filed against it asking that it be built reproducibly. A number of academics from Cornell University have published a paper titled Backstabber's Knife Collection which reviews various open source software supply chain attacks:
Recent years saw a number of supply chain attacks that leverage the increasing use of open source during software development, which is facilitated by dependency managers that automatically resolve, download and install hundreds of open source packages throughout the software life cycle.
In related news, the LineageOS Android distribution announced that a hacker had access to the infrastructure of their servers after exploiting an unpatched vulnerability. Marcin Jachymiak of the Sia decentralised cloud storage platform posted on their blog that their siac and siad utilities can now be built reproducibly:
This means that anyone can recreate the same binaries produced from our official release process. Now anyone can verify that the release binaries were created using the source code we say they were created from. No single person or computer needs to be trusted when producing the binaries now, which greatly reduces the attack surface for Sia users.
Synchronicity is a distributed build system for Rust build artifacts which have been published to crates.io. The goal of Synchronicity is to provide a distributed binary transparency system which is independent of any central operator. The Comparison of Linux distributions article on Wikipedia now features a Reproducible Builds column indicating whether distributions approach and progress towards achieving reproducible builds.

Distribution work
In Debian this month: In Alpine Linux, an issue was filed and closed regarding the reproducibility of .apk packages. Allan McRae of the ArchLinux project posted their third Reproducible builds progress report to the arch-dev-public mailing list which includes the following call for help:
We also need help to investigate and fix the packages that fail to reproduce that we have not investigated as of yet.
In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update.

Software development

diffoscope
Chris Lamb made the changes listed below to diffoscope, our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. He also prepared and uploaded versions 142, 143, 144, 145 and 146 to Debian, PyPI, etc.
  • Comparison improvements:
    • Improve fuzzy matching of JSON files as file now supports recognising JSON data. (#106)
    • Refactor .changes and .buildinfo handling to show all details (including the GnuPG header and footer components) even when referenced files are not present. (#122)
    • Use our BuildinfoFile comparator (etc.) regardless of whether the associated files (such as the orig.tar.gz and the .deb) are present. [ ]
    • Include GnuPG signature data when comparing .buildinfo, .changes, etc. [ ]
    • Add support for printing Android APK signatures via apksigner(1). (#121)
    • Identify iOS App Zip archive data as .zip files. (#116)
    • Add support for Apple Xcode .mobileprovision files. (#113)
  • Bug fixes:
    • Don't print a traceback if we pass a single, missing argument to diffoscope (eg. a JSON diff to re-load). [ ]
    • Correct differences typo in the ApkFile handler. (#127)
  • Output improvements:
    • Never emit the same id="foo" anchor reference twice in the HTML output, otherwise identically-named parts will not be able to be linked to via a #foo anchor. (#120)
    • Never emit an empty id anchor either; it is not possible to link to #. [ ]
    • Don't pretty-print the output when using the --json presenter; it will usually be too complicated to be readable by a human anyway. [ ]
    • Use the SHA256 hash rather than MD5 when generating page names for the HTML directory-style presenter. (#124)
  • Reporting improvements:
    • Clarify the message when we truncate the number of lines to standard error [ ] and reduce the number of maximum lines printed to 25 as usually the error is obvious by then [ ].
    • Print the amount of free space that we have available in our temporary directory as a debugging message. [ ]
    • Clarify "Command [ ] failed with exit code" messages to remove the duplicated "exited with exit" wording, and also to note that diffoscope is interpreting this as an error. [ ]
    • Don't leak the full path of the temporary directory in "Command [ ] exited with 1" messages. (#126)
    • Clarify the warning message when we cannot import the debian Python module. [ ]
    • Don't repeat stderr output if both commands emit the same output. [ ]
    • Clarify when an external command emits the same output for both files, otherwise it can look like we are repeating ourselves when, in reality, it is simply being run twice. [ ]
  • Testsuite improvements:
    • Prevent apksigner test failures due to lack of binfmt_misc, eg. on Salsa CI and elsewhere. [ ]
    • Drop .travis.yml as we use Salsa instead. [ ]
  • Dockerfile improvements:
    • Add a .dockerignore file to whitelist files we actually need in our container. (#105)
    • Use ARG instead of ENV when setting up the DEBIAN_FRONTEND environment variable at runtime. (#103)
    • Run as a non-root user in container. (#102)
    • Install and then remove the build-essential package during the build so we can install the recommended packages from Git. [ ]
  • Codebase improvements:
    • Bump the officially required version of Python from 3.5 to 3.6. (#117)
    • Drop the (default) shell=False keyword argument to subprocess.Popen so that the potentially-unsafe shell=True is more obvious. [ ]
    • Perform string normalisation in Black [ ] and include the Black output in the assertion failure too [ ].
    • Inline MissingFile's special handling of deb822 to prevent leaking through abstract layers. [ ][ ]
    • Allow a bare try/except block when cleaning up temporary files with respect to the flake8 quality assurance tool. [ ]
    • Rename in_dsc_path to dsc_in_same_dir to clarify the use of this variable. [ ]
    • Abstract out the duplicated parts of the debian_fallback class [ ] and add descriptions for the file types. [ ]
    • Various commenting and internal documentation improvements. [ ][ ]
    • Rename the Openssl command class to OpenSSLPKCS7 to accommodate other command names with this prefix. [ ]
  • Misc:
    • Rename the --debugger command-line argument to --pdb. [ ]
    • Normalise filesystem stat(2) birth times (ie. st_birthtime) in the same way we do with the stat(1) command's Access: and Change: times to fix a nondeterministic build failure in GNU Guix. (#74)
    • Ignore case when ordering our file format descriptions. [ ]
    • Drop, add and tidy various module imports. [ ][ ][ ][ ]
In addition:
  • Jean-Romain Garnier fixed a general issue where, for example, LibarchiveMember's has_same_content method was called regardless of the underlying type of file. [ ]
  • Daniel Fullmer fixed an issue where some filesystems could only be mounted read-only. (!49)
  • Emanuel Bronshtein provided a patch to prevent a build of the Docker image containing parts of the build's. (#123)
  • Mattia Rizzolo added an entry to debian/py3dist-overrides to ensure the rpm-python module is used in package dependencies (#89) and moved to using the new execute_after_* and execute_before_* Debhelper rules [ ].

Chris Lamb also performed a huge overhaul of diffoscope's website:
  • Add a completely new design. [ ][ ]
  • Dynamically generate our contributor list [ ] and supported file formats [ ] from the main Git repository.
  • Add a separate, canonical page for every new release. [ ][ ][ ]
  • Generate a latest release section and display that with the corresponding date on the homepage. [ ]
  • Add an RSS feed of our releases [ ][ ][ ][ ][ ] and add to Planet Debian [ ].
  • Use Jekyll's absolute_url and relative_url where possible [ ][ ] and move a number of configuration variables to _config.yml [ ][ ].

Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Other tools
Elsewhere in our tooling: strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. In May, Chris Lamb uploaded version 1.8.1-1 to Debian unstable and Bernhard M. Wiedemann fixed an off-by-one error when parsing PNG image modification times. (#16) In disorderfs, our FUSE-based filesystem that deliberately introduces non-determinism into directory system calls in order to flush out reproducibility issues, Chris Lamb replaced the term dirents with directory entries in human-readable output/log messages [ ] and applied the astyle source code formatter, with the default settings, to the main disorderfs.cpp source file [ ]. Holger Levsen bumped the debhelper-compat level to 13 in disorderfs [ ] and reprotest [ ], and for the GNU Guix distribution Vagrant Cascadian updated the versions of disorderfs to version 0.5.10 [ ] and diffoscope to version 145 [ ].

Project documentation & website
  • Carl Dong:
  • Chris Lamb:
    • Rename the "Who" page to "Projects". [ ]
    • Ensure that Jekyll enters the _docs subdirectory to find the _docs/index.md file after an internal move. (#27)
    • Wrap ltmain.sh etc. in preformatted quotes. [ ]
    • Wrap the SOURCE_DATE_EPOCH Python examples onto more lines to prevent visual overflow on the page. [ ]
    • Correct a preferred spelling error. [ ]
  • Holger Levsen:
    • Sort our Academic publications page by publication year [ ] and add Trusting Trust and Fully Countering Trusting Trust through Diverse Double-Compiling [ ].
  • Juri Dispan:

Testing framework
We operate a large and many-featured Jenkins-based testing framework that powers tests.reproducible-builds.org and that, amongst many other tasks, tracks the status of our reproducibility efforts as well as identifies any regressions that have been introduced. Holger Levsen made the following changes:
  • System health status:
    • Improve page description. [ ]
    • Add more weight to proxy failures. [ ]
    • More verbose debug/failure messages. [ ][ ][ ]
    • Work around strangeness in the Bash shell where let VARIABLE=0 exits with an error. [ ]
  • Debian:
    • Fail loudly if there are more than three .buildinfo files with the same name. [ ]
    • Fix a typo which prevented /usr merge variation on Debian unstable. [ ]
    • Temporarily ignore PHP's horde (https://www.horde.org/) packages in Debian bullseye. [ ]
    • Document how to reboot all nodes in parallel, working around molly-guard. [ ]
  • Further work on a Debian package rebuilder:
    • Workaround and document various issues in the debrebuild script. [ ][ ][ ][ ]
    • Improve output in the case of errors. [ ][ ][ ][ ]
    • Improve documentation and future goals [ ][ ][ ][ ], in particular documenting two real-world test cases for an impossible-to-recreate build environment [ ].
    • Find the right source package to rebuild. [ ]
    • Increase the frequency we run the script. [ ][ ][ ][ ]
    • Improve downloading and selection of the sources to build. [ ][ ][ ]
    • Improve version string handling. [ ]
    • Handle build failures better. [ ][ ][ ]
    • Also consider "architecture: all" .buildinfo files. [ ][ ]
In addition:
  • kpcyrd, for Alpine Linux, updated the alpine_schroot.sh script now that a patch for abuild had been released upstream. [ ]
  • Alexander Couzens of the OpenWrt project renamed the brcm47xx target to bcm47xx. [ ]
  • Mattia Rizzolo fixed the printing of the build environment during the second build [ ][ ][ ] and made a number of improvements to the script that deploys Jenkins across our infrastructure [ ][ ][ ].
Lastly, Vagrant Cascadian clarified in the documentation that you need to be user jenkins to run the blacklist command [ ], and the usual build node maintenance was performed by Holger Levsen [ ][ ][ ], Mattia Rizzolo [ ][ ] and Vagrant Cascadian [ ][ ][ ].

Mailing list: There were a number of discussions on our mailing list this month: Paul Spooren started a thread titled Reproducible Builds Verification Format which reopens the discussion around a schema for sharing the results from distributed rebuilders:
To make the results accessible, storable and create tools around them, they should all follow the same schema, a reproducible builds verification format. The format tries to be as generic as possible to cover all open source projects offering precompiled source code. It stores the rebuilder results of what is reproducible and what not.
Hans-Christoph Steiner of the Guardian Project also continued his previous discussion regarding making our website translatable. Lastly, Leo Wandersleb posted a detailed request for feedback on a question of supply chain security and other issues of software review; Leo is the founder of the Wallet Scrutiny project which aims to prove the security of Android Bitcoin Wallets:
Do you own your Bitcoins or do you trust that your app allows you to use your coins while they are actually controlled by them? Do you have a backup? Do they have a copy they didn't tell you about? Did anybody check the wallet for deliberate backdoors or vulnerabilities? Could anybody check the wallet for those?
Elsewhere, Leo had posted instructions on his attempts to reproduce the binaries for the BlueWallet Bitcoin wallet for iOS and Android platforms.


If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

This month's report was written by Bernhard M. Wiedemann, Chris Lamb, Holger Levsen, Jelle van der Waa and Vagrant Cascadian. It was subsequently reviewed by a bunch of Reproducible Builds folks on IRC and the mailing list.

Gunnar Wolf: Tor from Telmex. When I say "achievement unlocked", I mean it!

The blockade has ended!
For some introduction: back in 2016, Telmex (Mexico's foremost communications provider and, through the brands grouped under the América Móvil umbrella, one of Latin America's most important ISPs) set up rules to block connections to (at least) seven of Tor's directory authorities (DirAuths). We believe they might have blocked all of them, in an attempt to block connections to Tor from anywhere in their networks, but Tor is much more resourceful than that, so the measure was not too effective. Some blocking did hurt Telmex's users, though: the ability to play an active role in Tor, and the ability to host Tor relays at home. Why? Because the consensus protocol requires relays to be reachable in order to be measured from the network's DirAuths.

Technical work to prove the blocking
We dug into the issue as part of the work we carried out in the project I was happy to lead between 2018 and 2019, UNAM/DGAPA/PAPIME PE102718. In March 2019, I presented a paper titled Distributed Detection of Tor Directory Authorities Censorship in Mexico (alternative download) in the Topic on Internet Censorship and Surveillance (TICS) track of the XVIII International Conference on Networks. Then... We had many talks inside our group, but nothing seemed to move for several months. We did successfully push for increasing the number of Tor relays in Mexico (we managed to go from two to eleven stable relays; not much in absolute terms, but quite good relatively, even more so considering most users were not technically able to run one!). Jacobo Nájera, a journalist participating in our project, didn't leave things there, just lying around waiting magically for justice to happen. Together with Vasilis Ververis, from the Magma Project, they presented some weeks ago a Case study: Tor Directory Authorities Censorship in Mexico.

Pushing to action
But a good part of being a journalist is knowing how and when to spread the word. Having already two technical studies showing the blocking in place, Jacobo presented his findings in an article in GlobalVoices: The largest telecommunications operator in Mexico blocks the secure network. Surprisingly (to me, at least), this story was picked up by a major Mexican newspaper: the same evening the story hit GlobalVoices, Rodrigo Riquelme posted an article in the Technology section of El Economista titled Telmex blocks seven out of ten accesses to the Tor network in Mexico. And that very same day, Telmex sent a reply, which I am translating in full (it is now included at the end of Riquelme's article):
Mexico City, May 28, 2020 In relation to Tor navigation from TELMEX's network, the company informs: In TELMEX, we are committed to the full respect of navigation freedom for all of our users. TELMEX practices no application-level blocking policies; the Tor application, as well as the rest of Internet applications, can be freely accessed from our network. In order to protect the Internauts' information, the seven referred nodes were in their time reported because they were associated with the distribution of the WannaCry ransomware, which is the reason they were filtered, but this does not hamper the use of the Tor application.

So we got an answer?
Jacobo knew we had to take advantage of this answer, and act fast! He entered rush-writing mode and, with the help of our good friend and lawyer Salvador Alcántar, we wrote a short letter to Renato Flores Cartas, Corporate Communication of América Móvil, and sent it on June 1st. Next thing I know, this evening Jacobo was asking me if I could confirm the blocking was lifted. What? I could not believe it! But, yes. Today Jacobo published the confirmation that the seven blocked IP routes were finally reachable again from ASN 8151 (UNINET / Telmex / América Móvil)! Of course, this story was picked up again by El Economista: Telmex unblocks IP addresses for the Tor network's directory authority server IPs in Mexico.

Wrapping up
How can I put this in words? I am very, very, very, very, very, very, very, very, very happy we managed to see this through! We have long been pushing for increasing the usage of Tor among users at risk in Mexico; being a journalist or defending human rights is still a high-risk profession in my country. We strongly believe in this, and will continue trying to raise awareness of its usage. But, just as with free software, using network anonymization tools is not all. We need more people to become active, to become engaged, to become active participants in anonymization. As the adage says, anonymity loves company. In order to build strong, sufficient anonymization capability for everybody that needs it, we need more people to provide relay services. And this is a huge step towards improving Mexico's participation in the Tor network!
Image credits: Seeing My World Through a Keyhole, by Kate Ter Haar (CC-BY); Tor logo (Wikimedia Commons)

3 June 2020

Keith Packard: picolibc-ryu

Float/String Conversion in Picolibc: Enter Ryū
I recently wrote about this topic, having concluded that the best route for now was to use the malloc-free, but imprecise, conversion routines in the tinystdio alternative. A few days later, Sreepathi Pai pointed me at some very recent work in this area: This is amazing! Thirty years after the papers referenced in the previous post, Ulf Adams came up with some really cool ideas and managed to reduce the math required for 64-bit conversion to 128 bit integers. This is a huge leap forward; we were doing long multi-precision computations before, and now it's all short enough to fit in registers (ok, a lot of registers, but still).
Getting the Ryū Code
The code is available on github: https://github.com/ulfjack/ryu. Reading through it, it's very clear that the author focuses on performance with lots of tuning for common cases. Still, it's quite readable, especially compared with the newlib multi-precision based code.
Picolibc String/Float conversion interface
Picolibc has some pretty basic needs for the float/string conversion code; it wants four functions:
  1. __dtoa_engine
    int
    __dtoa_engine(double x, struct dtoa *dtoa, uint8_t max_digits, uint8_t max_decimals);
    
    This converts the double x to a string of decimal digits and a decimal exponent stored inside the 'dtoa' struct. It limits the total number of digits to max_digits and, optionally (when max_decimals is non-zero), limits the number of fractional digits to max_decimals - 1. The latter supports 'f' formats. Returns the number of digits stored, which is <= max_digits; less if the number can be accurately represented in fewer digits.
  2. __ftoa_engine
    int
    __ftoa_engine(float x, struct ftoa *ftoa, uint8_t max_digits, uint8_t max_decimals);
    
    The same as __dtoa_engine, except for floats.
  3. __atod_engine
    double
    __atod_engine(uint64_t m10, int e10);
    
    To avoid needing to handle stdio inside the conversion function, __atod_engine receives fully parsed values, the base-10 significand (m10) and exponent (e10). The value to convert is m10 * pow(10, e10).
  4. __atof_engine
    float
    __atof_engine(uint32_t m10, int e10);
    
    The same as __atod_engine, except for floats.
With these, it can do printf, scanf, ecvt, fcvt, gcvt, strtod, strtof and atof.
Porting Ryū to Picolibc
The existing Ryū float-to-string code always generates the number of digits necessary for accurate output. I had to hack it up to generate correctly rounded shorter output when max_digits or max_decimals were smaller. I'm not sure I managed to do that correctly, but at least it appears to be passing all of the test cases I have. In normal operation, Ryū iteratively removes digits from the answer that aren't necessary to disambiguate with neighboring values. What I changed was to keep removing digits using that method until the answer had few enough digits to fit in the desired length. There's some tricky rounding code that adjusts the final result and I had to bypass that if I'd removed extra digits. That was about the only change necessary to the core algorithm. I also trimmed the code to only include the general case and not the performance improvements, then wrapped it with code to provide the _engine interface. On the string-to-float side, most of what I needed to do was remove the string parsing bits at the start of the function and switch from performance-optimized to space-optimized versions of a couple of internal routines.
Correctness Results
Because these new functions are now 'exact', I was able to adjust the picolibc tests to compare all of the bits for string/float conversion instead of having to permit a bit of slop in the answers. With those changes, the picolibc test suite passes, which offers some assurance that things aren't completely broken.
Size Results
Snek uses the 32-bit float versions of the conversion routines, and for that, the size difference is:
   text    data     bss     dec     hex filename
  59068      44   37968   97080   17b38 snek-qemu-riscv-orig.elf
  59430      44   37968   97442   17ca2 snek-qemu-riscv-ryu.elf
    362
362 bytes added to gain accurate printf/strtof results seems like a good trade-off in this case.
Performance
I haven't measured performance at all, but I suspect that it won't be nearly as problematic on most platforms as the source code makes it appear. And that's because Ryū is entirely integer arithmetic with no floating point at all. This avoids using the soft fp code for platforms without hardware float support.
Pointers to the Code
I haven't merged this to picolibc master yet; it's on the ryu branch: Review, especially of the hack above to return short results, would be greatly appreciated! Thanks again to Ulf Adams for creating this code and to Sreepathi Pai for sending me a note about it!

2 June 2020

Olivier Berger: Mixing NRELab's Antidote and Eclipse Che on the same k8s cluster

You may have heard of my search for Cloud solutions to run labs in an academic context, with a focus on free and open source solutions. You may read previous installments of this blog or, for a shorter version, check the presentation I recorded last week. I've become quite interested, in the last few months, in 2 projects: NRELab's Antidote and Eclipse Che. Antidote is the software that powers NRELabs, a labs platform for learning network automation, which runs on top of Kubernetes (k8s). The interesting thing is that for each learner, there can be a dedicated k8s namespace with multiple virtual nodes running on a separate network. This can be used in the context of virtual classes/labs where our students will perform network labs in parallel on the same cluster. Eclipse Che powers "Eclipse on the Cloud", making software development environments available, for developers, on a Kubernetes Cloud. Developers typically work from a Web page instead of installing local development tools. Both projects seem quite complementary. For one, we teach both networks and software development. So that would naturally appeal to many professors. Furthermore, Eclipse Che provides a few features that Antidote is lacking: authenticating users (with keycloak), and persisting their work in workspaces between work sessions. Typically what we need in our academic context where students will work on the same labs during scheduled classes, week after week, during or off-hours. Thus it would be great to have more integration between the 2 environments. I intend to work on that front, but that takes time, as running stuff on Kubernetes isn't exactly trivial, at least when you're like me and want to use a vanilla Kubernetes. I've mainly relied on running k8s inside VMs using Vagrant and/or minikube so far. A first milestone I've achieved is making sure that Antidote and Eclipse Che aren't incompatible. Antidote's selfmedicate script was actually running inside a Vagrant VM, where I had difficulties installing Eclipse Che (probably because of old software, or particular networking setup details). I've overcome this hurdle, as I'm now able to install both environments on a single Kubernetes VM (using my own Vagrant setup). Running Eclipse Che (alongside Antidote) on a k8s Vagrant VM. This proves only that there's no show stopper there, but a lot of work remains. Stay tuned.

Sylvestre Ledru: Debian rebuild with clang 10 + some patches

Because of the lock-down in France and thanks to Lucas, I have been able to make some progress rebuilding Debian with clang instead of gcc.

TL;DR
Instead of patching clang itself, I used a different approach this time: patching Debian tools or implementing some workaround to mitigate an issue.
The percentage of packages failing dropped from 4.5% to 3.6% (from 1400 packages to 1110, out of a total of 31014). I focused on two classes of issues:

Qmake
As I have no intention to merge the patch upstream, I used a very dirty workaround: I overwrote the g++ qmake file with clang's:
https://salsa.debian.org/lucas/collab-qa-tools/-/blob/master/modes/clang10#L44-47 This dropped the number of such failures to 0, making some packages build flawlessly (examples: qtcreator, chessx, fwbuilder, etc). However, some packages are still failing later and therefore increase the number of failures in some other categories like link error. For example, qtads fails because of an ordered comparison between pointer and zero, and oscar fails on a -Werror,-Wdeprecated-copy error. Breaking the build later also highlighted some new classes of issues which didn't occur with clang < 10.
For example, warnings related to C++ range loops or implicit int-to-float conversions (I fixed a bunch of them in Firefox).

Symbol differences
Historically, symbol management for C++ in Debian has been a pain. Russ Allbery wrote a blog post in 2012 explaining the situation. AFAIK, it hasn't changed much.
Once more, I took the dirty approach: if there are new or missing symbols, don't fail the build.
The rationale is the following: packages in the Debian archive are supposed to build without any issue. If there are new or missing symbols, it is probably clang generating a different library, but this library is very likely working as expected (and usable by a program compiled with g++ or clang). It is purely a different approach taken by the compiler developers. In order to mitigate this issue, before the build starts, I am modifying dpkg-gensymbols to transform the error into a warning.
So, the typical Debian errors "some new symbols appeared in the symbols file" or "some symbols or patterns disappeared in the symbols file" will NOT fail the build. Unsurprisingly, all but one package (libktorrent) build. Even if I am pessimistic, I reported a bug on dpkg-dev to evaluate if we could improve dpkg-gensymbols not to fail in these cases.
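As a related note (not the approach used here, which modifies the tool before the build), dpkg-gensymbols itself exposes a check level that can give a similar "warn instead of fail" behaviour for a local rebuild; a minimal sketch, assuming a normal source package build:
# Check level 0 tells dpkg-gensymbols never to fail on symbols-file differences.
export DPKG_GENSYMBOLS_CHECK_LEVEL=0
dpkg-buildpackage -us -uc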

Next steps
The next offender is Imake.tmpl:2243:10: fatal error: ' X11 .rules' file not found, with more than a hundred occurrences, reported upstream quite some time ago. Then, the big issues are going to be much harder to fix as they are real issues/warnings (with -Werror) in the code of the packages. Examples: -Wc++11-narrowing & -Wreserved-user-defined-literal... The list is long.
I will probably work on that when llvm/clang 11 are in RC phase.

For maintainers & upstream
Maintainer of Debian/Ubuntu packages? I am providing a list of failing packages per maintainer: https://clang.debian.net/maintainers.php
For upstream, it is also easy to test with clang. Usually, apt install clang && CC=clang CXX=clang++ <build step> is good enough.
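Spelled out as commands (the configure/make line is just a placeholder for whatever build steps the project actually uses):
apt install clang
CC=clang CXX=clang++ ./configure && make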

Conclusion
With these two changes, I have been able to fix about 290 packages. I think I will be able to get that down a bit more, but we will soon reach a plateau as many warnings/issues will have to be fixed in the C/C++ code itself.

Lisandro Damián Nicanor Pérez Meyer: Simplified Monitoring of Patients in Situations of Mass Hospitalization (MoSimPa) - Fighting COVID-19

I have been quite absent from Debian stuff lately, and this has only increased since COVID-19 hit us. In this blog post I'll try to sketch what I have been doing to help fight COVID-19 these last few months.

In the beginning
When the pandemic reached Argentina the government started a quarantine. We engineers (like engineers around the world) started to think about how to put our abilities to use in order to help with the situation. Some worked toward providing more protection elements to medical staff, some towards increasing the number of ventilation machines at disposal. Another group of people started thinking about other ways of helping. In Bahía Blanca arose the idea of monitoring some variables remotely and en masse.

Simplified Monitoring of Patients in Situations of Mass Hospitalization (MoSimPa)

This is where the idea of remotely monitored devices came in, and MoSimPa (from the Spanish "monitoreo simplificado de pacientes en situación de internación masiva") started to take form. The idea is simple: oximetry (SpO2), heart rate and body temperature will be recorded and, instead of being shown on a display on the device itself, they will be transmitted and monitored in one or more places. In this way medical staff doesn't have to reach a patient constantly, and monitoring can be done by medical staff for more patients at the same time. In-place monitoring can also happen using a cellphone or tablet.

The devices do not have a screen of their own and almost no buttons, making them cheaper to build and thus more in line with the current economic reality of Argentina.


This is where the project Para Ayudar was created. The project aims to produce the aforementioned non-invasive device to be used in health institutions, hospitals, intra-hospital transports and homes.

It is worth noting that the system is designed as a complementary measure for continuous monitoring of a patient. Care should be taken to check that symptoms and overall patient status don't indicate an immediate threat to life. In other words, it is NOT designed for ICUs.

All the above done with Free/Libre/Open Source software and hardware designs. Any manufacturing company can then use them for mass production.

The importance of early pneumonia detection


We were already working on MoSimPa when an NYTimes article caught our attention: "The Infection That's Silently Killing Coronavirus Patients". From the article:

A vast majority of Covid pneumonia patients I met had remarkably low oxygen saturations at triage, seemingly incompatible with life, but they were using their cellphones as we put them on monitors. Although breathing fast, they had relatively minimal apparent distress, despite dangerously low oxygen levels and terrible pneumonia on chest X-rays.

This greatly reinforced the idea we were on the right track.

The project from a technical standpoint


As the project is primarily designed for and by Argentinians, the current system design and software documentation is written in Spanish, but the source code (or at least most of it) is written in English. Should anyone need it in English, please do not hesitate to ask me.

General system description

System schema

The system is comprised of the devices, a main machine acting as a server (in our case, for small setups, a Raspberry Pi) and the possibility of accessing data through cell phones, tablets or other PCs in the network.

The hardware


As of today this is the only part for which I still can't provide schematics, but I'll update this blog post and the technical doc with them as soon as I get my hands on them.

Again, the design is due to be built in Argentina, where getting our hands on hardware is not easy. Moreover, it needs to be as cheap as possible, especially now that the Argentinian currency, the peso, is more devalued every day. So we decided on using an ESP32 as the main microprocessor and a set of Maxim sensor devices. Again, more info when I have them at hand.

The software


Here we have many more components to describe. Firstly the ESP32 code is done with the Arduino SDK. This part of the stack will receive many updates soon, as soon as the first hardware prototypes are out.

For the rest of the stack I decided to go ahead with whatever is available in Debian stable. Why? Well, Raspbian provides a Debian stable-based image and I'm a Debian Developer, so things should go quite naturally for me on that front. Of course each component has its own packaging. I'm one of Debian's Qt maintainers, so using Qt will also be quite natural for me. Plots? Qwt, of course. And with that I have most of my necessities fulfilled. I chose PostgreSQL as the database server and Mosquitto as the MQTT broker.

Between the database and MQTT is mosimpa-datakeeper. The piece of software from which medical staff monitor patients is unsurprisingly called mosimpa-monitor.
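When debugging a setup like this, the broker traffic can be watched directly with the Mosquitto client tools; the hostname and the catch-all topic filter below are placeholders, not MoSimPa specifics, since the actual topic layout is not described here:
# Print every message the broker relays, prefixed with its topic.
mosquitto_sub -h raspberrypi.local -t '#' -v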

mosimpa-monitor
MoSimPa's monitor main screen

mosimpa-monitor plots
Plots of a patient's data


mosimpa-monitor-alarms-setup
Alarm thresholds setup


And for managing patients, devices, locations and internments (CRUD anyone?) there is currently a Qt-based application called mosimpa-abm.

mosimpa-abm
ABM main screen


mosimpa-abm-internments
ABM internments view

The idea is to replace it with a web service so it doesn't need to be confined to the RPi or require installation on other machines. I considered using WebAssembly, but I would have to also build PostgreSQL in order to compile Qt's plugin.

Translations? Of course! As I have already mentioned, the code is written in English. Qt makes it easy to translate applications, so I keep a Spanish translation updated as the code changes (and we are primarily targeting Spanish-speaking people). But of course this also means it can be easily translated to whichever language is necessary.

Even though I am a packager, I still have some stuff to fix in the packaging itself, like letting datakeeper run as its own user. I just haven't gotten to it yet.



Certifications


We are working towards getting the system certified by ANMAT, which is the Argentinian equivalent of the US FDA.

Funding


While all the people involved are working ad honorem, funding is still required in order to buy materials, create the prototypes, etc. The project created payment links with Mercado Pago (in Spanish and in Argentinian pesos) and other bank methods (PDF, also in Spanish).

I repeat the links here with an approximation in US$.

- 500 AR$ (less than 8 US$)
- 1000 AR$ (less than 15 US$)
- 2000 AR$ (less than 30 US$)
- 3000 AR$ (less than 45 US$)
- 5000 AR$ (less than 75 US$)

You can check the actual conversion rate at https://www.google.com/search?q=argentine+peso+to+us+dollars

The project was also presented at a funding call of the Argentinian Agencia de Promoción de la Investigación, el Desarrollo Tecnológico y la Innovación (Agencia I+D+i). More than 900 projects were presented and 64 were funded, MoSimPa among them.

Sylvain Beucler: Debian LTS and ELTS - May 2020

Here is my transparent report for my work on the Debian Long Term Support (LTS) and Debian Extended Long Term Support (ELTS), which extend the security support for past Debian releases, as a paid contributor. In May, the monthly sponsored hours were split evenly among contributors depending on their max availability; I was assigned 17.25h for LTS (out of 30 max; all done) and 9.25h for ELTS (out of 20 max; all done). A survey will be published very shortly to gather feedback from all parties involved in LTS (users, other Debian teams...) -- let us know what you think, so we start the forthcoming new (Stretch) LTS cycle in the best conditions :) Discussion is progressing on funding & governance of larger LTS-related projects. Who should decide: contributors, Freexian, sponsors? Do we fund with a percentage or by capping resources allocated on security updates? I voiced concerns over funding these at the expense of smaller, more organic, more recurrent tasks that are less easy to specify but greatly contribute to the overall quality nevertheless.
ELTS - Wheezy
LTS - Jessie
Documentation/Scripts

1 June 2020

Jonathan Carter: Free Software Activities for 2020-05

I would say that this was a crazy month, but with everything ever escalating, does that even mean anything anymore? I lost track of tracking my activities in the second half of the month, and I'm still not very good at logging the "soft stuff", that is, things like non-technical work that also takes up a lot of time, but I will continue to work on it.
Towards the end of the month I spent a huge amount of time on MiniDebConf Online. I'm glad it all worked out, and will write a separate blog entry on that. Thank you again to everyone for making it a success! I'm also moving DPL activities to the DPL blog, so even though it's been a busy month in the free software world, my activity log here will look somewhat deceptively short this month.
MiniDebConf Online
  • 2020-05-06: Help prepare initial CfP mail.
  • 2020-05-06: Process some feedback regarding accessibility on Jitsi.

Debian Packaging
  • 2020-05-02: Upload package gnome-shell-extension-workspaces-to-dock (53-1) to Debian unstable.
  • 2020-05-02: Upload package tetzle (2.1.6-1) to Debian unstable.
  • 2020-05-06: Upload package bundlewrap (3.9.0-1) to Debian unstable.
  • 2020-05-06: Accept MR#1 for connectagram.
  • 2020-05-06: Upload package connectagram (1.2.11-1) to Debian unstable.
  • 2020-05-07: Upload package gnome-shell-extension-multi-monitors (20-1) to Debian unstable (Closes: #956169).
  • 2020-05-07: Upload package tanglet (1.5.6-1) to Debian unstable.
  • 2020-05-16: Upload package calamares (3.2.24-1) to Debian unstable.
  • 2020-05-16: Accept MR#1 for tuxpaint-config.
  • 2020-05-16: Accept MR#7 for debian-live.
  • 2020-05-18: Upload package bundlewrap (3.10.0) to Debian unstable.

Debian Mentoring
  • 2020-05-02: Sponsor package gamemode (1.5.1-3) (Games team request).
  • 2020-05-16: Sponsor package gamemode (1.5.1-4) (Games team request).

Paul Wise: FLOSS Activities May 2020

Focus This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review

Administration
  • nsntrace: talk to upstream about collaborative maintenance
  • Debian: deploy changes, debug issue with GPS markers file generation, migrate bls/DUCK from alioth-archive to salsa
  • Debian website: ran map cron job, synced mirrors
  • Debian wiki: approve accounts, ping folks with bouncing email

Communication

Sponsors The apt-offline work and the libfile-libmagic-perl backports were sponsored. All other work was done on a volunteer basis.

31 May 2020

Chris Lamb: Free software activities in May 2020

Here is my monthly update covering what I have been doing in the free software world during May 2020 (previous month): In Lintian, the static analysis tool for Debian packages:
Reproducible builds One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. However, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into ostensibly secure software during the various compilation and distribution processes. The motivation behind the Reproducible Builds effort is to ensure no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised. The initiative is proud to be a member project of the Software Freedom Conservancy, a not-for-profit 501(c)(3) charity focused on ethical technology and user freedom. Conservancy acts as a corporate umbrella allowing projects to operate as non-profit initiatives without managing their own corporate structure. If you like the work of the Conservancy or the Reproducible Builds project, please consider becoming an official supporter.
Elsewhere in our tooling, I made the following changes to diffoscope, our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues, including preparing and uploading versions 142, 143, 144, 145 and 146 to Debian: I also performed a huge overhaul of diffoscope's website:
Lastly, I made a large number of changes to the main Reproducible Builds website and documentation:
Debian LTS This month I contributed 17 hours to Debian Long Term Support (LTS) and 9 hours on its sister Extended LTS project. You can find out more about the two projects via the following video:
Debian I filed the following bug reports in Debian this month: I also filed a number of bugs against packages that are not compatible with Django 3.x (organised around a single master bug), including django-taggit, sorl-thumbnail, django-simple-captcha, django-cas-server, django-cors-headers, python-django-csp, django-pipeline, python-django-jsonfield, python-django-contact-form, django-model-utils, django-fsm, django-modeltranslation, django-oauth-toolkit, libthumbor, python-django-extensions, python-django-imagekit, python-django-navtag, python-django-tagging, djangorestframework, django-haystack, django-taggit, django-simple-captcha, python-django-registration, python-django-pyscss, python-django-compressor, python-django-crispy-forms & python-django-mptt.
Lastly, I made the following uploads to Debian: I also sponsored an upload for adminer (4.7.7-1), also uploading it to buster-backports.

30 May 2020

Sean Whitton: GNU Emacs' Transient Mark mode

Something I've found myself doing as the pandemic rolls on is picking out and (re-)reading through sections of the GNU Emacs manual and the GNU Emacs Lisp reference manual. This has got me (too) interested in some of the recent history of Emacs development, and I did some digging into archives of emacs-devel from 2008 (15M mbox) regarding the change to turn Transient Mark mode on by default and set mark-even-if-inactive to true by default in Emacs 23.1. It's not always clear which objections to turning on Transient Mark mode by default take into account the mark-even-if-inactive change. I think that turning on Transient Mark mode along with mark-even-if-inactive is a good default. The question that remains is whether the disadvantages of Transient Mark mode are significant enough that experienced Emacs users should consider altering Emacs' default behaviour to mitigate them. Here's one popular blog arguing for some mitigations.
How might Transient Mark mode be disadvantageous? The suggestion is that it makes using the mark for navigation rather than for acting on regions less convenient:
  1. setting a mark just so you can jump back to it (i) is a distinct operation you have to think of separately; and (ii) requires two keypresses, C-SPC C-SPC, rather than just one keypress
  2. using exchange-point-and-mark activates the region, so to use it for navigation you need to use either C-u C-x C-x or C-x C-x C-g, neither of which are convenient to type, or else it will be difficult to set regions at the place you've just jumped to because you'll already have one active.
There are two other disadvantages that people bring up which I am disregarding. The first is that it makes it harder for new users to learn useful ways in which to use the mark when it's deactivated. This happened to me, but it can be mitigated without making any behavioural changes to Emacs. The second is that the visual highlighting of the region can be distracting. So far as I can tell, this is only a problem with exchange-point-and-mark, and it's subsumed by the problem of that command actually activating the region. The rest of the time Emacs' automatic deactivation of the region seems sufficient.
How might disabling Transient Mark mode be disadvantageous? When Transient Mark mode is on, many commands will do something usefully different when the mark is active. The number of commands in Emacs which work this way is only going to increase now that Transient Mark mode is the default. If you disable Transient Mark mode, then to use those features you need to temporarily activate Transient Mark mode. This can be fiddly and/or require a lot of keypresses, depending on exactly where you want to put the region. Without being able to see the region, it might be harder to know where it is. Indeed, this is one of the main reasons for wanting Transient Mark mode to be the default, to avoid confusing new users. I don't think this is likely to affect experienced Emacs users often, however, and on occasions when more precision is really needed, C-u C-x C-x will make the region visible. So I'm not counting this as a disadvantage.
How might we mitigate these two sets of disadvantages? Here are the two middle grounds I'm considering.
Mitigation #1: Transient Mark mode, but hack C-x C-x behaviour
(defun spw/exchange-point-and-mark (arg)
  "Exchange point and mark, but reactivate mark a bit less often.

Specifically, invert the meaning of ARG in the case where
Transient Mark mode is on but the region is inactive."
  (interactive "P")
  (exchange-point-and-mark
   (if (and transient-mark-mode (not mark-active))
       (not arg)
     arg)))
(global-set-key [remap exchange-point-and-mark] 'spw/exchange-point-and-mark)
We avoid turning Transient Mark mode off, but mitigate the second of the two disadvantages given above. I can't figure out why it was thought to be a good idea to make C-x C-x reactivate the mark and require C-u C-x C-x to use the action of exchanging point and mark as a means of navigation. There needs to be a binding to reactivate the mark, but in roughly ten years of having Transient Mark mode turned on, I've found that the need to reactivate the mark doesn't come up often, so the shorter and longer bindings seem the wrong way around. Not sure what I'm missing here.
Mitigation #2: disable Transient Mark mode, but enable it temporarily more often
(setq transient-mark-mode nil)
(defun spw/remap-mark-command (command &optional map)
  "Remap a mark-* command to temporarily activate Transient Mark mode."
  (let* ((cmd (symbol-name command))
         (fun (intern (concat "spw/" cmd)))
         (doc (concat "Call  "
                      cmd
                      "&apos and temporarily activate Transient Mark mode.")))
    (fset fun  (lambda ()
                 ,doc
                 (interactive)
                 (call-interactively #&apos,command)
                 (activate-mark)))
    (if map
        (define-key map (vector &aposremap command) fun)
      (global-set-key (vector &aposremap command) fun))))
(dolist (command &apos(mark-word
                   mark-sexp
                   mark-paragraph
                   mark-defun
                   mark-page
                   mark-whole-buffer))
  (spw/remap-mark-command command))
(with-eval-after-load 'org
  (spw/remap-mark-command 'org-mark-subtree org-mode-map))
;; optional
(global-set-key "\M-=" (lambda () (interactive) (activate-mark)))
;; resettle the previous occupant
(global-set-key "\C-cw" 'count-words-region)
Here we remove both of the disadvantages of Transient Mark mode given above, and mitigate the main disadvantage of not activating Transient Mark mode by making it more convenient to activate it temporarily. For example, this enables using C-M-SPC C-M-SPC M-( to wrap the following two function arguments in parentheses. And you can hit M-h a few times to mark some blocks of text or code, then operate on them with commands like M-% and C-/ which behave differently when the region is active.1
Comparing these mitigations Both of these mitigations handle the second of the two disadvantages of Transient Mark mode given above. What remains, then, is
  1. under the effects of mitigation #1, how much of a barrier to using marks for navigational purposes is it to have to press C-SPC C-SPC instead of having a single binding, C-SPC, for all manual mark setting2
  2. under the effects of mitigation #2, how much of a barrier to taking advantage of commands which act differently when the region is active is it to have to temporarily enable Transient Mark mode with C-SPC C-SPC, M-= or one of the mark-* commands?
These are unknowns.3 So I'm going to have to experiment, I think, to determine which mitigation to use, if either. In particular, I don't know whether it's really significant that setting a mark for navigational purposes and for region marking purposes are distinct operations under mitigation #1. My plan is to start with mitigation #2 because that has the additional advantage of allowing me to confirm or disconfirm my belief that not being able to see where the region is will only rarely get in my way.

  1. The idea of making the mark-* commands activate the mark comes from an emacs-devel post by Stefan Monnier in the archives linked above.
  2. One remaining possibility I'm not considering is mitigation #1 plus binding something else to do the same as C-SPC C-SPC. I don't believe there are any easily rebindable keys which are easier to type than typing C-SPC twice. And this does not deal with the two distinct mark-setting operations problem.
  3. Another way to look at this is the question of which of setting a mark for navigational purposes and activating a mark should get C-SPC and which should get C-SPC C-SPC.

29 May 2020

Keith Packard: picolibc-string-float

Float/String Conversion in Picolibc Exact conversion between strings and floats seems like a fairly straightforward problem. There are two related problems:
  1. String to Float conversion. In this case, the goal is to construct the floating point number which most closely approximates the number represented by the string.
  2. Float to String conversion. Here, the goal is to generate the shortest string which, when fed back into the String to Float conversion code, exactly reproduces the original value (a minimal check of this round-trip property is sketched just after this list).
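Both directions can be checked together: print a value, parse the string back, and confirm the original bits come back. Here is a minimal C sketch of such a check; it uses DBL_DECIMAL_DIG digits (enough for IEEE double) rather than the shortest-string algorithms discussed below, and round_trips() is a hypothetical helper, not picolibc code:
/* Print a double, parse it back, and compare the bit patterns.
 * DBL_DECIMAL_DIG (17, from C11 <float.h>) guarantees an exact round
 * trip for IEEE-754 double, though not the shortest such string. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <float.h>

static int round_trips(double value)
{
    char buf[64];
    double parsed;

    snprintf(buf, sizeof(buf), "%.*g", DBL_DECIMAL_DIG, value);
    parsed = strtod(buf, NULL);

    /* Compare representations rather than using ==, so that e.g.
     * -0.0 versus 0.0 is detected. */
    return memcmp(&value, &parsed, sizeof(double)) == 0;
}

int main(void)
{
    printf("%d\n", round_trips(0.1));       /* expect 1 */
    printf("%d\n", round_trips(1.0 / 3.0)); /* expect 1 */
    return 0;
}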
When linked together, getting from float to string and back to float is a "round trip", and an exact pair of algorithms does this for every floating point value. Solutions for both directions were published in the proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation, with the string-to-float version written by William Clinger and the float-to-string version written by Guy Steele and Jon White. These solutions rely on very high precision integer arithmetic to get every case correct, with float-to-string requiring up to 1050 bits for the 64-bit IEEE floating point format. That's a lot of bits.
Newlib Float/String Conversion The original newlib code, written in 1998 by David M. Gay, has arbitrary-precision numeric code for these functions to get exact results. However, it has the disadvantages of performing numerous memory allocations, consuming considerable space for the code, and taking a long time for conversions. The first disadvantage, using malloc during conversion, ended up causing a number of CVEs because the results of malloc were not being checked. That's bad on all platforms, but especially bad for embedded systems where reading and writing through NULL pointers may have unknown effects. Upstream newlib applied a quick fix to check the allocations and call abort. Again, on platforms with an OS, that at least provides a way to shut down the program and let the operating environment figure out what to do next. On tiny embedded systems, there may not be any way to log an error message or even restart the system. Ok, so we want to get rid of the calls to abort and have the error reported back through the API call which caused the problem. That's got two issues: one is mere technical work, and the other is a mere re-interpretation of specifications. Let's review the specification issue. The libc APIs involved here are: Input: Output: Scanf and printf are both documented to set errno to ENOMEM when they run out of memory, but none of the other functions takes that possibility into account. So we'll make some stuff up and hope it works out: Now, looking back at the technical challenge. That's a simple matter of inserting checks at each allocation, or call which may result in an allocation, and reporting failure back up the call stack, unwinding any intermediate state to avoid leaking memory.
Testing Every Possible Allocation Failure There are a lot of allocation calls in the newlib code. And the call stack can get pretty deep. A simple visual inspection of the code didn't seem sufficient to me to validate the allocation checking code. So I instrumented malloc, making it count the number of allocations and fail at a specific one. Now I can count the total number of allocations done over the entire test suite run for each API involved and then run the test suite that many times, failing each allocation in turn and checking to make sure we recover correctly. By that, I mean: There were about 60000 allocations to track, so I ran the test suite that many times, which (with the added malloc tracing enabled) took about 12 hours.
Bits Pushed to the Repository With the testing complete, I'm reasonably confident that the code is now working, and that these CVEs are more completely squashed. If someone is interested in back-porting these fixes upstream to newlib, that would be awesome. It's not completely trivial as this part of picolibc has diverged a bit due to the elimination of the reent structure.
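As a rough sketch of the fault-injection approach described above (not the actual picolibc test harness), the idea can be boiled down to an allocation wrapper with a counter and a target index; test_malloc(), do_conversion() and the two-pass driver below are illustrative stand-ins:
/* Sketch: count allocations, then re-run the work failing each one in
 * turn and check that the code under test reports an error cleanly. */
#include <stdio.h>
#include <stdlib.h>

static long alloc_count = 0; /* allocations seen so far                   */
static long fail_at = -1;    /* which allocation should fail; -1 = never  */

void *test_malloc(size_t size)
{
    if (++alloc_count == fail_at)
        return NULL;         /* simulate an out-of-memory condition */
    return malloc(size);
}

/* Stand-in for the code under test, which calls test_malloc() internally. */
static int do_conversion(void)
{
    void *scratch = test_malloc(128);
    if (!scratch)
        return -1;           /* error reported back, no abort() */
    free(scratch);
    return 0;
}

int main(void)
{
    /* Pass 1: count allocations with no injected failure. */
    fail_at = -1;
    alloc_count = 0;
    do_conversion();
    long total = alloc_count;

    /* Pass 2..N: fail each allocation in turn and expect a clean error. */
    for (long i = 1; i <= total; i++) {
        fail_at = i;
        alloc_count = 0;
        if (do_conversion() != -1)
            fprintf(stderr, "allocation %ld not handled\n", i);
    }
    return 0;
}
In the real harness the test suite plays the role of do_conversion() and malloc itself is the instrumented function, which is where the roughly 60000 runs mentioned above come from.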
Picolibc's Tinystdio Float/String Conversion Picolibc contains a complete replacement for stdio which was originally adopted from avr-libc. That's a stdio implementation designed to run on 8-bit Atmel processors and focuses on very limited memory use and small code size. It does this while maintaining surprisingly complete support for C99 printf and scanf. However, it also does this without any arbitrary precision arithmetic, which means it doesn't get the right answer all of the time. For most embedded systems, this is usually a good trade off -- floating point input and output are likely to be largely used for diagnostics and debugging, so mostly correct answers are probably sufficient. The original avr-libc code only supports 32-bit floats, as that's all the ABI on those processors has. I extended that to 64-, 80- and 128-bit floats to cover double and long double on x86 and RISC-V processors. Then I spent a bunch of time adjusting the code to get it to more accurately support C99 standards. Tinystdio also had strtod support, but it was missing ecvt, fcvt and gcvt. For those, picolibc was just falling back to the old newlib code, which introduced all of the memory allocation issues we've just read about. Fixing that so that tinystdio was self-contained and did ecvt, fcvt and gcvt internally required writing those functions in terms of the float-to-string primitives already provided in tinystdio to support printf. gcvt is most easily supported by just calling sprintf. Once complete, the default picolibc build, using tinystdio, no longer does any memory allocation for float/string conversions.
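For a sense of what "gcvt in terms of sprintf" can look like, here is a hedged sketch; my_gcvt() is a made-up name and this is the general shape only, not picolibc's actual tinystdio implementation:
/* Illustrative only: a gcvt()-style conversion built on the printf
 * code path.  %g chooses between %e and %f form and drops trailing
 * zeros, which is close to gcvt()'s documented behaviour. */
#include <stdio.h>

char *my_gcvt(double value, int ndigit, char *buf)
{
    sprintf(buf, "%.*g", ndigit, value);
    return buf;
}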

28 May 2020

Antoine Beaupré: Upgrading my home server uplink

For more than a few decades now (!), I've been running my own server. First it was just my old Pentium 1 squatting on university networks, but eventually grew into a real server somewhere at the dawn of the millennium. Apart from the university days, the server was mostly hosted over ADSL links, first a handful of megabits, up to the current 25 Mbps down, 6 Mbps up that the Bell Canada network seems to allow to its resellers (currently Teksavvy Internet, or TSI).

Why change? Obviously, this speed is showing its age, and especially in this age of Pandemia where everyone is on videoconferencing all the time. But it's also inconvenient when I need to upload large files on the network. I also host a variety of services on this network, and I always worry that any idiot can (rather trivially) DoS my server, so I often feel I should pack a little more punch at home (although I have no illusions about my capacity of resisting any sort of DoS attack at home of course). Also, the idea of having gigabit links at home brings back the idea of the original internet, that everyone on the internet is a "peer". "Client" and "servers" are just a technical distinction and everyone should be able to run a server.

Requirements So I'm shopping for a replacement. The requirements are:
  1. higher speed than 25/6, preferably 100 Mbps down, 30 Mbps up, or more. ideally 1 Gbps symmetric.
  2. static or near-static IP address: I run a DNS server with its IP in the glue records (although the latter could possibly be relaxed). ideally a /29 or more.
  3. all ports open: I run an SMTP server (incoming and outgoing) along with a webserver and other experiments. ideally, no firewall or policy should be blocking me from hosting stuff, unless there's an attack or security issue, obviously.
  4. clean IP address: the SMTP server needs to have a good reputation, so the IP address should not be in a "residential space" pool.
  5. IPv6 support: TSI offers IPv6 support, but it is buggy (I frequently have to restart the IPv6 interface on the router because the delegated block stops routing, and they haven't been able to figure out the problem). ideally, a /56.
  6. less than 100$/mth, ideally close to the current 60$/mth I pay.
(All amounts in $CAD.)

Contestants I wrote a similar message asking major ISPs in my city for those services, including business service if necessary: I have not contacted those providers:
  • Bell Canada: i have sworn, two decades ago, never to do business with that company ever again. They have a near-monopoly on almost all telcos in Canada and I want to give them as little money as possible.
  • Videotron: I know for a fact they do not allow servers on their network, and their IPv6 has been in beta for so long it has become somewhat of a joke now
I might have forgotten some, let me know if you're in the area and have a good recommendation. I'll update this post with findings as they come in. Keep in mind that I am in a major Canadian city, less than a kilometer from a major telco exchange site, so it's not like I'm in a rural community. This should just work.

TSI First answer from TSI was "we do not provide 30mbps upload on residential services", even though they seem to have that package on their website. They confirmed that they "don't have a option more than 10 mbps upload." TSI were the first to respond, within 24h.

Oricom They offer a 100/30 link for 65$ plus 25$ for a static IP. No IPv6 yet, unlikely to come soon. No services blocked, they have their own PoP within Videotron's datacenters so clients come out from their IP address space. I can confirm that the IP is fairly static from the office. Oricom were the second to respond, within 24h, but required a phone call instead of an email exchange. Responded within 6 hours after leaving a voicemail.

Ebox Ebox claims my neighborhood supports 400mbps down, but offered me a 100/30 package with 350Go bandwidth per month for 54.95$/mth or unlimited for 65$/mth. Many ports are blocked, which makes it impossible for me to use their service:
  • port 25 blocked incoming
  • port 25 filtered outgoing (only allowed to their servers)
  • port 53 blocked incoming (!)
No static IP addressing, shared dynamic space so no guarantee on reputation. IPv6 only on DSL, so no high speed IPv6. Ebox took the longest to respond, about 48 hours.

Beanfield / Openface Even though they have a really interesting service (50$/mth for unlimited 1gbps), they are not in my building. I did try to contact them over chat, they told me to call, and I left a message. They responded saying they mostly offer business services for now, no residential in Montreal.

Elana Hashman: Presenter mode in LibreOffice Impress without an external display

I typically use LibreOffice Impress for my talks, much to some folks' surprise. Yes, you can make slides look okay with free software! But there's one annoying caveat that has bothered me for ages. Impress makes it nearly impossible to enter presenter mode with a single display, while also displaying slides. I have never understood this limitation, but it's existed for a minimum of seven years. I've tried all sorts of workarounds over the years, including a macro that forces LibreOffice into presenter mode, which I never was able to figure out how to reverse once I ran it... This has previously been an annoyance but never posed a big problem, because when push came to shove I could leave my house and use an external monitor or screen when presenting at meetups. But now, everything's virtual, I'm in lockdown indefinitely, and I don't have another display available at home. And about 8 hours before speaking at a meetup today, I realized I didn't have a way to share my slides while seeing my speaker notes. Uh oh. So I got this stupid idea. ...why don't I just placate LibreOffice with a FAKE display? Virtual displays with xrandr My GPU had this capability innately, it turns out, if I could just whisper the right incantations to unlock its secrets:
ehashman@red-dot:~$ cat /usr/share/X11/xorg.conf.d/20-intel.conf 
Section "Device"
    Identifier "intelgpu0"
    Driver "intel"
    Option "VirtualHeads" "1"
EndSection
After restarting X to allow this newly created config to take effect, I now could see two new virtual displays available for use:
ehashman@red-dot:~$ xrandr
Screen 0: minimum 8 x 8, current 3840 x 1080, maximum 32767 x 32767
eDP1 connected primary 1920x1080+0+0 (normal left inverted right x axis y axis) 310mm x 170mm
   1920x1080     60.01*+  59.93  
   ...
   640x360       59.84    59.32    60.00  
DP1 disconnected (normal left inverted right x axis y axis)
DP2 disconnected (normal left inverted right x axis y axis)
HDMI1 disconnected (normal left inverted right x axis y axis)
HDMI2 disconnected (normal left inverted right x axis y axis)
VIRTUAL1 disconnected (normal left inverted right x axis y axis)
VIRTUAL2 disconnected (normal left inverted right x axis y axis)
Nice. Now, to actually use it:
ehashman@red-dot:~$ xrandr --addmode VIRTUAL1 1920x1080
ehashman@red-dot:~$ xrandr --output VIRTUAL1 --mode 1920x1080 --right-of eDP1
And indeed, after running these commands, I found myself with a virtual display, very happy to black hole all my windows, available to the imaginary right of my laptop screen. This allowed me to mash that "Present" button in LibreOffice and get my presenter notes on my laptop display, while putting my actual slides in a virtual time-out that I could still screenshare! Wouldn't it be nice if LibreOffice just fixed the actual bug? Well, actually... I must forgive myself for my stage panic. The talk ended up going great, and the immediate problem was solved. But it turns out this bug has been addressed upstream! It's just... not well-documented. A couple years ago, there was a forum post on ask.libreoffice.org that featured this exact question, and a solution was provided!
Yes, use Open Expert Configuration via Tools > Options > LibreOffice > Advanced. Search for StartAlways. You should get a node org.openoffice.Office.PresenterScreen with line Presenter. Double-click that line to toggle the boolean value to true.
I tested this out locally and... yeah that works. But it turns out the bug from 2013 had not been updated with this solution until just a few months ago. There are very limited search results for this configuration property. I wish this was much better documented. But so it goes with free software; here's a hack and a real solution as we all try to improve things :)

26 May 2020

Russell Coker: Cruises and Covid19

Problems With Cruises GQ has an insightful and detailed article about Covid19 and the Diamond Princess [1], I recommend reading it. FastCompany has a brief article about bookings for cruises in August [2]. There have been many negative comments about this online. The first thing to note is that the cancellation policies on those cruises are more lenient than usual and the prices are lower. So it's not unreasonable for someone to put down a deposit on a half price holiday in the hope that Covid19 goes away (as so many prominent people have been saying it will) in the knowledge that they will get it refunded if things don't work out. Of course if the cruise line goes bankrupt then no-one will get a refund, but I think people are expecting that won't happen. The GQ article highlights some serious problems with the way cruise ships operate. They have staff crammed into small cabins and the working areas allow transmission of disease. These problems can be alleviated, they could allocate more space to staff quarters and have more capable air conditioning systems to put in more fresh air. During the life of a cruise ship significant changes are often made, replacing engines with newer more efficient models, changing the size of various rooms for entertainment, installing new waterslides, and many other changes are routinely made. Changing the staff-only areas to have better ventilation and more separate space (maybe capsule-hotel style cabins with fresh air piped in) would not be a difficult change. It would take some money and some dry-dock time which would be a significant expense for cruise companies.
Cruises Are Great People like social environments, they want to have situations where there are as many people as possible without it becoming impossible to move. Cruise ships are carefully designed for the flow of passengers. Both the layout of the ship and the schedule of events are carefully planned to avoid excessive crowds. In terms of meeting the requirement of having as many people as possible in a small area without being unable to move, cruise ships are probably ideal. Because there is a large number of people in a restricted space there are economies of scale on a cruise ship that aren't available anywhere else. For example the main items on the menu are made in a production line process, this can only be done when you have hundreds of people sitting down to order at the same time. The same applies to all forms of entertainment on board, they plan the events based on statistical knowledge of what people want to attend. This makes it more economical to run than land based entertainment where people can decide to go elsewhere. On a ship a certain portion of the passengers will see whatever show is presented each night, regardless of whether it's singing, dancing, or magic. One major advantage of cruises is that they are all inclusive. If you are on a regular holiday would you pay to see a singing or dancing show? Probably not, but if it's included then you might as well do it and it will be pretty good. This benefit is really appreciated by people taking kids on holidays, if kids do things like refuse to attend a performance that you were going to see or reject food once it's served then it won't cost any extra.
People Who Criticise Cruises For the people who sneer at cruises, do you like going to bars? Do you like going to restaurants? Live music shows? Visiting foreign beaches? A cruise gets you all that and more for a discount price.
If Groupon had a deal that gave you a cheap hotel stay with all meals included, free non-alcoholic drinks at bars, day-long entertainment for kids at the kids clubs, and two live performances every evening, how many of the people who reject cruises would buy it? A typical cruise is just like a Groupon deal for non-stop entertainment from 8AM to 11PM.
Will Cruises Restart? The entertainment options that cruises offer are greatly desired by many people. Most cruises are aimed at budget travellers, the price is cheaper than a hotel in a major city. Such cruises greatly depend on economies of scale, if they can't get the ships filled then they would need to raise prices (thus decreasing demand) to try to make a profit. I think that some older cruise ships will be scrapped in the near future and some of the newer ships will be sold to cruise lines that cater to cheap travel (i.e. P&O may scrap some ships and some of the older Princess ships may be transferred to them). Overall I predict a decrease in the number of middle-class cruise ships. For the expensive cruises (where the cheapest cabins cost over $1000US per person per night) I don't expect any real changes, maybe they will have fewer passengers and higher prices to allow more social distancing or something. I am certain that cruises will start again, but it's too early to predict when. Going on a cruise is about as safe as going to a concert or a major sporting event. No-one is predicting that sporting stadiums will be closed forever or live concerts will be cancelled forever, so really no-one should expect that cruises will be cancelled forever. Whether companies that own ships or stadiums go bankrupt in the meantime is yet to be determined. One thing that's been happening for years is themed cruises. A group can book out an entire ship or part of a ship for a themed cruise. I expect this to become much more popular when cruises start again as it will make it easier to fill ships. In the past it seems that cruise lines let companies book their ships for events but didn't take much of an active role in the process. I think that the management of cruise lines will look to aggressively market themed cruises to anyone who might help, for starters they could reach out to every 80s and 90s pop group; those fans are all old enough to be interested in themed cruises and the musicians won't be asking for too much money.
Conclusion Humans are social creatures. People want to attend events with many other people. Covid 19 won't be the last pandemic, and it may not even be eradicated in the near future. The possibility of having a society where no-one leaves home unless they are in a hazmat suit has been explored in science fiction, but I don't think that's a plausible scenario for the near future and I don't think that it's something that will be caused by Covid 19.

Russ Allbery: Review: The Ten Thousand Doors of January

Review: The Ten Thousand Doors of January, by Alix E. Harrow
Publisher: Redhook
Copyright: September 2019
ISBN: 0-316-42198-7
Format: Kindle
Pages: 373
In 1901, at the age of seven, January found a Door. It was barely more than a frame in a ruined house in a field in Kentucky, but she wrote a story about opening it, and then did.
Once there was a brave and temeraryous (sp?) girl who found a Door. It was a magic Door that's why it has a capital D. She opened the Door.
The Door led to a bluff over the sea and above a city, a place very far from Kentucky, and she almost stayed, but she came back through the Door when her guardian, Mr. Locke, called. The adventure cost her a diary, several lectures, days of being locked in her room, and the remnants of her strained relationship with her father. When she went back, the frame of the Door was burned to the ground. That was the end of Doors for January for some time, and the continuation of a difficult childhood. She was cared for by her father's employer as a sort of exotic pet, dutifully attempting to obey, grateful for Mr. Locke's protection, and convinced that he was occasionally sneaking her presents through a box in the Pharaoh Room out of some hidden kindness. Her father appeared rarely, said little, and refused to take her with him. Three things helped: the grocery boy who smuggled her stories, an intimidating black woman sent by her father to replace her nurse, and her dog.
Once upon a time there was a good girl who met a bad dog, and they became the very best of friends. She and her dog were inseparable from that day forward.
I will give you a minor spoiler that I would have preferred to have had, since it would have saved me some unwarranted worry and some mental yelling at the author: The above story strains but holds. January's adventure truly starts the day before her seventeenth birthday, when she finds a book titled The Ten Thousand Doors in the box in the Pharaoh Room. As you may have guessed from the title, The Ten Thousand Doors of January is a portal fantasy, but it's the sort of portal fantasy that is more concerned with the portal itself than the world on the other side of it. (Hello to all of you out there who, like me, have vivid memories of the Wood between the Worlds.) It's a book about traveling and restlessness and the possibility of escape, about the ability to return home again, and about the sort of people who want to close those doors because the possibility of change created by people moving around freely threatens the world they have carefully constructed. Structurally, the central part of the book is told by interleaving chapters of January's tale with chapters from The Ten Thousand Doors. That book within a book starts with the framing of a scholarly treatment but quickly becomes a biography of a woman: Adelaide Lee Larson, a half-wild farm girl who met her true love at the threshold of a Door and then spent much of her life looking for him. I am not a very observant reader for plot details, particularly for books that I'm enjoying. I read books primarily for the emotional beats and the story structure, and often miss rather obvious story clues. (I'm hopeless at guessing the outcomes of mysteries.) Therefore, when I say that there are many things January is unaware of that are obvious to the reader, that's saying a lot. Even more clues were apparent when I skimmed the first chapter again, and a more observant reader would probably have seen them on the first read. Right down to Mr. Locke's name, Harrow is not very subtle about the moral shape of this world. That can make the early chapters of the book frustrating. January is being emotionally (and later physically) abused by the people who have power in her life, but she's very deeply trapped by false loyalty and lack of external context. Winning free of that is much of the story of the book, and at times it has the unpleasantness of watching someone make excuses for her abuser. At other times it has the unpleasantness of watching someone be abused. But this is the place where I thought the nested story structure worked marvelously. January escapes into the story of The Ten Thousand Doors at the worst moments of her life, and the reader escapes with her. Harrow uses the desire to switch scenes back to the more adventurous and positive story to construct and reinforce the emotional structure of the book. For me, it worked extremely well. It helps that the ending is glorious. The payoff is worth all the discomfort and tension-building in the first half of the book. Both The Ten Thousand Doors and the surrounding narrative reach deeply satisfying conclusions, ones that are entangled but separate in just the ways that they need to be. January's abilities, actions, and decisions at the end of the book were just the outcome that I needed but didn't entirely guess in advance. I could barely put down the last quarter of this story and loved every moment of the conclusion. This is the sort of book that can be hard to describe in a review because its merits don't rest on an original twist or easily-summarized idea. 
The elements here are all elements found in other books: portal fantasy, the importance of story-telling, coming of age, found family, irrepressible and indomitable characters, and the battle of the primal freedom of travel and discovery and belief against the structural forces that keep rulers in place. The merits of this book are in the small details: the way that January's stories are sparse and rare and sometimes breathtaking, the handling of tattoos, the construction of other worlds with a few deft strokes, and the way Harrow embraces the emotional divergence between January's life and Adelaide's to help the reader synchronize the emotional structure of their reading experience with January's.
She writes a door of blood and silver. The door opens just for her.
The Ten Thousand Doors of January is up against a very strong slate for both the Nebula and the Hugo this year, and I suspect it may be edged out by other books, although I wouldn't be unhappy if it won. (It probably has a better shot at the Nebula than the Hugo.) But I will be stunned if Harrow doesn't walk away with the Mythopoeic Award. This seems like exactly the type of book that award was created for. This is an excellent book, one of the best I've read so far this year. Highly recommended. Rating: 9 out of 10
