
14 August 2022

Russ Allbery: Review: Still Not Safe

Review: Still Not Safe, by Robert L. Wears & Kathleen M. Sutcliffe
Publisher: Oxford University Press
Copyright: November 2019
ISBN: 0-19-027128-0
Format: Kindle
Pages: 232
Still Not Safe is an examination of the recent politics and history of patient safety in medicine. Its conclusions are summarized by the opening paragraph of the preface:
The American moral and social philosopher Eric Hoffer reportedly said that every great cause begins as a movement, becomes a business, and eventually degenerates into a racket. The reform movement to make healthcare safer is clearly a great cause, but patient safety efforts are increasingly following Hoffer's path.
Robert Wears was Professor of Emergency Medicine at the University of Florida specializing in patient safety. Kathleen Sutcliffe is Professor of Medicine and Business at Johns Hopkins. This book is based on research funded by a grant from the Robert Wood Johnson Foundation, for which both Wears and Sutcliffe were primary investigators. (Wears died in 2017, but the acknowledgments imply that at least early drafts of the book existed by that point and it was indeed co-written.) The anchor of the story of patient safety in Still Not Safe is the 1999 report from the Institute of Medicine entitled To Err is Human, to which the authors attribute an explosion of public scrutiny of medical safety. The headline conclusion of that report, which led nightly news programs after its release, was that 44,000 to 120,000 people died each year in the United States due to medical error. This report prompted government legislation, funding for new safety initiatives, a flurry of follow-on reports, and significant public awareness of medical harm. What it did not produce, in the authors' view, is significant improvements in patient safety. The central topic of this book is an analysis of why patient safety efforts have had so little measurable effect. The authors attribute this to three primary causes: an unwillingness to involve safety experts from outside medicine or absorb safety lessons from other disciplines, an obsession with human error that led to profound misunderstandings of the nature of safety, and the misuse of safety concerns as a means to centralize control of medical practice in the hands of physician-administrators. (The term used by the authors is "managerial, scientific-bureaucratic medicine," which is technically accurate but rather awkward.) Biggest complaint first: This book desperately needed examples, case studies, or something to make these ideas concrete. There are essentially none in 230 pages apart from passing mentions of famous cases of medical error that added to public pressure, and a tantalizing but maddeningly nonspecific discussion of the atypically successful effort to radically improve the safety of anesthesia. Apparently anesthesiologists involved safety experts from outside medicine, avoided a focus on human error, turned safety into an engineering problem, and made concrete improvements that had a hugely positive impact on the number of adverse events for patients. Sounds fascinating! Alas, I'm just as much in the dark about what those improvements were as I was when I started reading this book. Apart from a vague mention of some unspecified improvements to anesthesia machines, there are no concrete descriptions whatsoever. I understand that the authors were probably leery of giving too many specific examples of successful safety initiatives since one of their core points is that safety is a mindset and philosophy rather than a replicable set of actions, and copying the actions of another field without understanding their underlying motivations or context within a larger system is doomed to failure. But you have to give the reader something, or the book starts feeling like a flurry of abstract assertions. Much is made here of the drawbacks of a focus on human error, and the superiority of the safety analysis done in other fields that have moved beyond error-centric analysis (and in some cases have largely discarded the word "error" as inherently unhelpful and ambiguous). 
That leads naturally to showing an analysis of an adverse incident through an error lens and then through a more nuanced safety lens, making the differences concrete for the reader. It was maddening to me that the authors never did this. This book was recommended to me as part of a discussion about safety and reliability in tech and the need to learn from safety practices in other fields. In that context, I didn't find it useful, although surprisingly that's because the thinking in medicine (at least as presented by these authors) seems behind the current thinking in distributed systems. The idea that human error is not a useful model for approaching reliability is standard in large tech companies, nearly all of which use blameless postmortems for exactly that reason. Tech, similar to medicine, does have a tendency to be insular and not look outside the field for good ideas, but the approach to large-scale reliability in tech seems to have avoided the other traps discussed here. (Security is another matter, but security is also adversarial, which creates different problems that I suspect require different tools.) What I did find fascinating in this book, although not directly applicable to my own work, is the way in which a focus on human error becomes a justification for bureaucratic control and therefore a concentration of power in a managerial layer. If the assumption is that medical harm is primarily caused by humans making avoidable mistakes, and therefore the solution is to prevent humans from making mistakes through better training, discipline, or process, this creates organizations that are divided into those who make the rules and those who follow the rules. The long-term result is a practice of medicine in which a small number of experts decide the correct treatment for a given problem, and then all other practitioners are expected to precisely follow that treatment plan to avoid "errors." (The best distributed systems approaches may avoid this problem, but this failure mode seems nearly universal in technical support organizations.) I was startled by how accurate that portrayal of medicine felt. My assumption prior to reading this book was that the modern experience of medicine as an assembly line with patients as widgets was caused by the pressure for higher "productivity" and thus shorter visit times, combined with (in the US) the distorting effects of our broken medical insurance system. After reading this book, I've added a misguided way of thinking about medical error and risk avoidance to that analysis. One of the authors' points (which, as usual, I wish they'd made more concrete with a case study) is that the same thought process that lets a doctor make a correct diagnosis and find a working treatment is the thought process that may lead to an incorrect diagnosis or treatment. There is not a separable state of "mental error" that can be eliminated. Decision-making processes are more complicated and more integrated than that. If you try to prevent "errors" by eliminating flexibility, you also eliminate vital tools for successfully treating patients. The authors are careful to point out that the prior state of medicine in which each doctor was a force to themselves and there was no role for patient safety as a discipline was also bad for safety. Reverting to the state of medicine before the advent of the scientific-bureaucratic error-avoiding culture is also not a solution. 
But, rather at odds with other popular books about medicine, the authors are highly critical of safety changes focused on human error prevention, such as mandatory checklists. In their view, this is exactly the sort of attempt to blindly copy the machinery of safety in another field (in this case, air travel) without understanding the underlying purpose and system of which it's a part. I am not qualified to judge the sharp dispute over whether there is solid clinical evidence that checklists are helpful (these authors claim there is not; I know other books make different claims, and I suspect it may depend heavily on how the checklist is used). But I found the authors' argument that one has to design systems holistically for safety, not try to patch in safety later by turning certain tasks into rote processes and humans into machines, to be persuasive. I'm not willing to recommend this book given how devoid it is of concrete examples. I was able to fill in some of that because of prior experience with the literature on site reliability engineering, but a reader who wasn't previously familiar with discussions of safety or reliability may find much of this book too abstract to be comprehensible. But I'm not sorry I read it. I hadn't previously thought about the power dynamics of a focus on error, and I think that will be a valuable observation to keep in mind. Rating: 6 out of 10

8 August 2022

Ian Jackson: dkim-rotate - rotation and revocation of DKIM signing keys

Background

Internet email is becoming more reliant on DKIM, a scheme for having mail servers cryptographically sign emails. The Big Email providers have started silently spambinning messages that lack either DKIM signatures or SPF. DKIM is arguably less broken than SPF, so I wanted to deploy it. But it has a problem: if done in a naive way, it makes all your emails non-repudiable, forever. This is not really a desirable property - at least, not desirable for you, although it can be nice for someone who (for example) gets hold of leaked messages obtained by hacking mailboxes. This problem was described at some length in Matthew Green's article Ok Google: please publish your DKIM secret keys. Following links from that article does get you to a short script to achieve key rotation, but it had a number of problems and wasn't usable in my context.

dkim-rotate

So I have written my own software for rotating and revoking DKIM keys: dkim-rotate. I think it is a good solution to this problem, and it ought to be deployable in many contexts (and readily adaptable to those it doesn't already support). Here's the feature list taken from the README:

Complications

It seems like it should be a simple problem. Keep N keys, and every day (or whatever), generate and start using a new key, and deliberately leak the oldest private key. But, things are more complicated than that. Considerably more complicated, as it turns out. I didn't want the DKIM key rotation software to have to edit the actual DNS zones for each relevant mail domain. That would tightly entangle the mail server administration with the DNS administration, and there are many contexts (including many of mine) where these roles are separated. The solution is to use DNS aliases (CNAME). But now we need a fixed, relatively small set of CNAME records for each mail domain. That means a fixed, relatively small set of key identifiers ("selectors" in DKIM terminology), which must be used in rotation. We don't want the private keys to be published via the DNS, because that makes an ever-growing DNS zone, which isn't great for performance; and because we want to place barriers in the way of processes which might enumerate the set of keys we use (and the set of keys we have leaked) and keep records of what status each key had when. So we need a separate publication channel - for which a webserver was the obvious answer. We want the private keys to be readily noticeable and findable by someone who is verifying an alleged leaked email dump, but to be hard to enumerate. (One part of the strategy for this is to leave a note about it, with the prospective private key URL, in the email headers.) The key rotation operations are more complicated than they first appear, too. The short summary, above, neglects to consider the fact that DNS updates have a nonzero propagation time: if you change the DNS, not everyone on the Internet will experience the change immediately. So as well as a timeout for how long it might take an email to be delivered (ie, how long the DKIM signature remains valid), there is also a timeout for how long to wait after updating the DNS before relying on everyone having got the memo. (This same timeout applies both before starting to sign emails with a new key, and before deliberately compromising a key which has been withdrawn and deadvertised.) Updating the DNS, and the MTA configuration, are fallible operations. So we need to cope with out-of-course situations, where a previous DNS or MTA update failed.
In that case, we need to retry the failed update, and not proceed with key rotation. We mustn't start the timer for the key rotation until the update has been implemented. The rotation script will usually be run by cron, but it might be run by hand, and when it is run by hand it ought not to jump the gun and do anything too early (ie, before the relevant timeout has expired). cron jobs don't always run, and don't always run at precisely the right time. (And there's daylight saving time to consider, too.) So overall, it's not sufficient to drive the system via cron and have it proceed by one unit of rotation on each run. And, hardest of all, I wanted to support post-deployment configuration changes, while continuing to keep the whole system operational. Otherwise, you have to bake in all the timing parameters right at the beginning and can't change anything ever. So for example, I wanted to be able to change the email and DNS propagation delays, and even the number of selectors to use, without adversely affecting the delivery of already-sent emails, and without having to shut anything down. I think I have solved these problems. The resulting system is one which keeps track of the timing constraints, and the next event which might occur, on a per-key basis. It calculates, on each run, which key(s) can be advanced to the next stage of their lifecycle, and performs the appropriate operations. The regular key update schedule is then an emergent property of the config parameters and cron job schedule. (I provide some example config.)

Exim

Integrating dkim-rotate itself with Exim was fairly easy. The lsearch lookup function can be used to fish information out of a suitable data file maintained by dkim-rotate. But a final awkwardness was getting Exim to make the right DKIM signatures, at the right time. When making a DKIM signature, one must choose a signing authority domain name: who should we claim to be? (This is the SDID in DKIM terms.) A mailserver that handles many different mail domains will be able to make good signatures on behalf of many of them. It seems to me that this ought to be the mail domain in the From: header of the email. (The RFC doesn't seem to be clear on what is expected.) Exim doesn't seem to have anything built in to do that. And, you only want to DKIM-sign emails that are originated locally or from trustworthy sources. You don't want to DKIM-sign messages that you received from the global Internet and are sending out again (eg because of an email alias or mailing list). In theory, if you verify DKIM on all incoming emails, you could avoid being fooled into signing bad emails, but rejecting all non-DKIM-verified email would be a very strong policy decision. Again, Exim doesn't seem to have ready-made machinery for this. The resulting Exim configuration parameters run to 22 lines, and because they're parameters to an existing config item (the smtp transport) they can't even easily be deployed as a drop-in file via Debian's split config Exim configuration scheme. (I don't know if the file written for Exim's use by dkim-rotate would be suitable for other MTAs, but this part of dkim-rotate could easily be extended.)

Conclusion

I have today released dkim-rotate 0.4, which is the first public release for general use. I have it deployed and working, but it's new so there may well be bugs to work out. If you would like to try it out, you can get it via git from Debian Salsa. (Debian folks can also find it freshly in Debian unstable.)
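To make the CNAME indirection described above concrete, here is a rough sketch of what the records might look like. All domain names, selector labels and key data below are invented for illustration and are not taken from the dkim-rotate documentation: the records in each mail domain's zone are static CNAMEs pointing into a zone the mail-server side controls, and only the TXT records in that second zone change as keys rotate.

; static records in the mail domain's zone, set up once
d1._domainkey.example.org.  IN CNAME  d1.example-org.dkim.mailhost.example.net.
d2._domainkey.example.org.  IN CNAME  d2.example-org.dkim.mailhost.example.net.

; records in the zone managed alongside dkim-rotate, rewritten as keys rotate
d1.example-org.dkim.mailhost.example.net.  IN TXT  "v=DKIM1; k=rsa; p=MIIBIjANBg..."
d2.example-org.dkim.mailhost.example.net.  IN TXT  "v=DKIM1; k=rsa; p=MIIBIjANBg..."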


1 August 2022

Bastian Venthur: Keychron keyboards fixed on Linux

Last year, I wrote about how to get my buggy Keychron C1 keyboard working properly on Linux by setting a kernel module parameter. Afterwards, I contacted Hans de Goede since he was the last one that contributed a major patch to the relevant kernel module. After some debugging, it turned out that the Keychron keyboards are indeed misbehaving when set to Windows mode. Almost a year later, Bryan Cain provided a patch fixing the behavior, which has now been merged to the Linux kernel in 5.19. Thank you, Hans and Bryan!

26 July 2022

Raphaël Hertzog: Freexian's report about Debian Long Term Support, June 2022

Like each month, have a look at the work funded by Freexian's Debian LTS offering.

Debian project funding

No major updates on running projects.
Two projects [1, 2] are in the pipeline now.
The Tryton project is in a review phase. The Gradle project is still in progress. In June, we put aside 2254 EUR to fund Debian projects. We're looking forward to receiving more projects from various Debian teams! Learn more about the rationale behind this initiative in this article.

Debian LTS contributors

In June, 15 contributors were paid to work on Debian LTS; their reports are available:

Evolution of the situation

In June we released 27 DLAs.

This is a special month, where we have two releases (stretch and jessie) as ELTS and NO release as LTS. Buster is still handled by the security team and will probably be handed over to the LTS team at the beginning of August. During this month we are updating the infrastructure and documentation and improving our internal processes to switch to the new release.
Many developers have just returned from DebConf22, held in Prizren, Kosovo! Many (E)LTS members could meet face-to-face and discuss technical and social topics. An LTS BoF also took place, where the project was introduced (link to video).
Thanks to our sponsors

Sponsors that joined recently are in bold. We are pleased to welcome Alter Way, whose support of Debian is publicly acknowledged at the highest level; see this French quote from Alterway's CEO.

20 July 2022

Enrico Zini: Deconstruction of the DAM hat

Further reading

Talk notes
  • Intro
  • Debian Account Managers
  • Responsibility for official membership
  • What DAM is not
  • Unexpected responsibilities
  • DAM warnings
  • DAM warnings?
  • House rules
  • Interpreting house rules
  • Governance by bullying
  • How about the Community Team?
  • How about DAM?
  • How about the DPL?
  • Concentrating responsibility
  • Empowering developers
  • What needs to happen

16 July 2022

Petter Reinholdtsen: Automatic LinuxCNC servo PID tuning?

While working on a CNC with servo motors controlled by the LinuxCNC PID controller, I recently had to learn how to tune the collection of values that control the mathematical machinery of a PID controller. It proved to be a lot harder than I hoped, and I still have not succeeded in getting the Z PID controller to successfully defy gravity, nor X and Y to move accurately and reliably. But while climbing up this rather steep learning curve, I discovered that some motor control systems are able to tune their PID controllers. I got the impression from the documentation that LinuxCNC was not. This proved not to be true. The LinuxCNC pid component is the recommended PID controller to use. It uses eight constants (Pgain, Igain, Dgain, bias, FF0, FF1, FF2 and FF3) to calculate the output value based on current and wanted state, and all of these need to have a sensible value for the controller to behave properly. Note, there are even more values involved, these are just the most important ones. In my case I need the X, Y and Z axes to follow the requested path with little error. This has proved quite a challenge for someone who has never tuned a PID controller before, but there is at least some help to be found. I discovered that included in LinuxCNC was an old PID component, at_pid, claiming to have auto tuning capabilities. Sadly it had been neglected since 2011, and could not be used as a plug-in replacement for the default pid component. One would have to rewrite the LinuxCNC HAL setup to test at_pid. This was rather sad, as I wanted to quickly test auto tuning to see if it did a better job than me at figuring out good P, I and D values to use. I decided to have a look at whether the situation could be improved. This involved trying to understand the code and history of the pid and at_pid components. Apparently they had a common ancestor, as code structure, comments and variable names were quite close to each other. Sadly this was not reflected in the git history, making it hard to figure out what really happened. My guess is that the author of at_pid.c took a version of pid.c, rewrote it to follow the structure he wished pid.c to have, then added support for auto tuning and finally got it included into the LinuxCNC repository. The restructuring and lack of early history made it harder to figure out which parts of the code were relevant to the auto tuning, and which parts of the code needed to be updated to work the same way as the current pid.c implementation. I started by trying to isolate relevant changes in pid.c, and applying them to at_pid.c. My aim was to make sure the at_pid component could replace the pid component with a simple change in the HAL setup loadrt line, without having to "rewire" the rest of the HAL configuration. After a few hours following this approach, I had learned quite a lot about the code structure of both components, while concluding I was heading down the wrong rabbit hole, and should get back to the surface and find a different path. For the second attempt, I decided to throw away all the PID control related parts of the original at_pid.c, and instead isolate and lift the auto tuning part of the code and inject it into a copy of pid.c. This ensured compatibility with the current pid component, while adding auto tuning as a run time option. To make it easier to identify the relevant parts in the future, I wrapped all the auto tuning code with '#ifdef AUTO_TUNER'.
The end result behaves just like the current pid component by default, as that part of the code is identical. The end result entered the LinuxCNC master branch a few days ago. To enable auto tuning, one needs to set a few HAL pins in the PID component. The most important ones are tune-effort, tune-mode and tune-start. But let's take a step back, and see what the auto tuning code will do. I do not know the mathematical foundation of the at_pid algorithm, but from observation I can tell that the algorithm will, when enabled, produce a square wave pattern centered around the bias value on the output pin of the PID controller. This can be seen using the HAL Scope provided by LinuxCNC. In my case, this is translated into voltage (+-10V) sent to the motor controller, which in turn is translated into motor speed. So at_pid will ask the motor to move the axis back and forth. The number of cycles in the pattern is controlled by the tune-cycles pin, and the extremes of the wave pattern are controlled by the tune-effort pin. Of course, trying to change the direction of a physical object instantly (as in going directly from a positive voltage to the equivalent negative voltage) does not change velocity instantly, and it takes some time for the object to slow down and move in the opposite direction. This results in a smoother movement waveform, as the axis in question vibrates back and forth. When the axis reaches the target speed in the opposing direction, the auto tuner changes direction again. After several of these changes, the average time delay between the 'peaks' and 'valleys' of this movement graph is then used to calculate proposed values for Pgain, Igain and Dgain, and to insert them into the HAL model for use by the pid controller. The auto tuned settings are not great, but they work a lot better than the values I had been able to cook up on my own, at least for the horizontal X and Y axes. But I had to use very small tune-effort values, as my motor controllers error out if the voltage changes too quickly. I've been less lucky with the Z axis, which is moving a heavy object up and down, and seems to confuse the algorithm. The Z axis movement became a lot better when I introduced a bias value to counter the gravitational drag, but I will have to work a lot more on the Z axis PID values. Armed with this knowledge, it is time to look at how to do the tuning. Let's say the HAL configuration in question loads the PID component for X, Y and Z like this:
loadrt pid names=pid.x,pid.y,pid.z
Armed with the new and improved at_pid component, the new line will look like this:
loadrt at_pid names=pid.x,pid.y,pid.z
The rest of the HAL setup can stay the same. This works because the components are referenced by name. If the component had used count=3 instead, all uses of pid.# would have to be changed to at_pid.#. To start tuning the X axis, move the axis to the middle of its range, to make sure it does not hit anything when it starts moving back and forth. Next, set the tune-effort to a low number in the output range. I used 0.1 as my initial value. Next, assign 1 to the tune-mode value. Note, this will disable the pid controlling part and feed 0 to the output pin, which in my case initially caused a lot of drift. In my case it proved to be a good idea with X and Y to tune the motor driver to make sure 0 voltage stopped the motor rotation. On the other hand, for the Z axis this proved to be a bad idea, so it will depend on your setup. It might help to set the bias value to an output value that reduces or eliminates the axis drift. Finally, after setting tune-mode, set tune-start to 1 to activate the auto tuning. If all goes well, your axis will vibrate for a few seconds and when it is done, new values for Pgain, Igain and Dgain will be active. To test them, change tune-mode back to 0. Note that this might cause the machine to suddenly jerk as it brings the axis back to its commanded position, which it might have drifted away from during tuning. To summarize with some halcmd lines:
setp pid.x.tune-effort 0.1
setp pid.x.tune-mode 1
setp pid.x.tune-start 1
# wait for the tuning to complete
setp pid.x.tune-mode 0
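Those same steps can be scripted. The sketch below is only an illustration of the idea (it is not the run-auto-pid-tuner script mentioned below); it assumes the at_pid-capable component is loaded as pid.x as in the example above, and the fixed sleep is a crude stand-in for the pin polling a real script would do:
#!/bin/sh
# Rough sketch: auto-tune one axis using the pins described above.
halcmd setp pid.x.tune-effort 0.1   # keep the test signal small
halcmd setp pid.x.tune-mode 1       # hand the axis over to the auto tuner
halcmd setp pid.x.tune-start 1      # start the tuning cycle
sleep 30                            # crude wait; a real script would poll the HAL pins
halcmd setp pid.x.tune-mode 0       # back to normal PID control with the new gains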
After doing this task quite a few times while trying to figure out how to properly tune the PID controllers on the machine in question, I decided to figure out if this process could be automated, and wrote a script to do the entire tuning process from power on. The end result will ensure the machine is powered on and ready to run, home all axes if that is not already done, check that the extra tuning pins are available, move each axis to its mid point, run the auto tuning and re-enable the pid controller when it is done. It can be run several times. Check out the run-auto-pid-tuner script on github if you want to learn how it is done. My hope is that this little adventure can inspire someone who knows more about motor PID controller tuning to implement even better algorithms for automatic PID tuning in LinuxCNC, making life easier for both me and all the others who want to use LinuxCNC but lack the in depth knowledge needed to tune PID controllers well. As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

1 July 2022

Steve Kemp: An update on my simple golang TCL interpreter

So my previous post introduced a trivial interpreter for a TCL-like language. In the past week or two I've cleaned it up, fixed a bunch of bugs, and added 100% test-coverage. I'm actually pretty happy with it now. One of the reasons for starting this toy project was to experiment with how easy it is to extend the language using itself. Some things are simple; for example, replacing this:
puts "3 x 4 = [expr 3 * 4]"
With this:
puts "3 x 4 = [* 3 4]"
just means defining a function (proc) named *, which we can do like so:
proc * {a b} {
    expr $a * $b
}
(Of course we don't have lists, or variadic arguments, so this is still a bit of a toy example.) Doing more than that is hard, though, without support for more primitives written in the parent language than I've implemented. The obvious thing I'm missing is a native implementation of upvalue, which is a TCL primitive allowing you to affect/update variables in higher scopes. Without that you can't write things as nicely as you would like, and have to fall back to horrid hacks or be unable to do things.
# define a procedure to run a body N times
proc repeat {n body} {
    set res ""
    while {> $n 0} {
        decr n
        set res [$body]
    }
    $res
}
# test it out
set foo 12
repeat 5 { incr foo }
#  foo is now 17 (i.e. 12 + 5)
A similar story implementing the loop word, which should allow you to set the contents of a variable and run a body a number of times:
proc loop {var min max bdy} {
    // result
    set res ""
    // set the variable.  Horrid.
    // We miss upvalue here.
    eval "set $var [set min]"
    // Run the test
    while {<= [set "$$var"] $max} {
        set res [$bdy]
        // This is a bit horrid
        // We miss upvalue here, and not for the first time.
        eval {incr "$var"}
    }
    // return the last result
    $res
}
loop cur 0 10 { puts "current iteration $cur ($min->$max)" }
# output is:
# => current iteration 0 (0-10)
# => current iteration 1 (0-10)
# ...
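For comparison, here is a rough sketch of how loop might read if the interpreter grew an upvalue primitive along the lines of real TCL's upvar; this is hypothetical and will not run in the current interpreter:
proc loop {var min max bdy} {
    # hypothetical primitive: alias the caller's variable into this scope
    upvalue $var v
    set res ""
    set v $min
    while {<= $v $max} {
        set res [$bdy]
        incr v
    }
    # return the last result
    $res
}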
That said I did have fun writing some simple test-cases, and implementing assert, assert_equal, etc. In conclusion I think the number of required primitives needed to implement your own control-flow, and run-time behaviour, is a bit higher than I'd like. Writing switch, repeat, while, and similar primitives inside TCL is harder than creating those same things in FORTH, for example.

29 June 2022

Aigars Mahinovs: Long travel in an electric car

Since the first week of April 2022 I have (finally!) changed my company car from a plug-in hybrid to a fully electric car. My new ride, for the next two years, is a BMW i4 M50 in Aventurine Red metallic. An elegant car with a very deep and memorable color, insanely powerful (544 hp/795 Nm), sub-4 second 0-100 km/h, large 84 kWh battery (80 kWh usable), charging up to 210 kW, top speed of 225 km/h and also very efficient (which came out best in this trip) with WLTP range of 510 km and EVDB real range of 435 km. The car also has performance tyres (Hankook Ventus S1 evo3 245/45R18 100Y XL in front and 255/45R18 103Y XL in rear all at recommended 2.5 bar) that have reduced efficiency. So I wanted to document and describe what it was like for me to travel ~2000 km (one way) with this, electric, car from south of Germany to north of Latvia. I have done this trip many times before since I live in Germany now and travel back to my relatives in Latvia 1-2 times per year. This was the first time I made this trip in an electric car. And as this trip includes both travelling in Germany (where BEV infrastructure is best in the world) and across Eastern/Northern Europe, I believe that this can be interesting to a few people out there. When I travelled this trip with a gasoline/diesel car I would normally drive for two days with an intermediate stop somewhere around Warsaw with about 12 hours of travel time in each day. This would normally include a couple bathroom stops in each day, at least one longer lunch stop and 3-4 refueling stops on top of that. Normally this would use at least 6 liters of fuel per 100 km on average with total usage of about 270 liters for the whole trip (or about 540€ just in fuel costs, nowadays). My (personal) quirk is that both fuel and recharging of my (business) car inside Germany is actually paid by my employer, so it is useful for me to charge up (or fill up) at the last station in Germany before driving on. The plan for this trip was made in a similar way as when travelling with a gasoline car: travelling as fast as possible on the German Autobahn network to the last charging stop on the A4 near Görlitz, there charging up as much as reasonable and then travelling to a hotel in Warsaw, charging there overnight and travelling north towards Ionity chargers in Lithuania from where reaching the final target in north of Latvia should be possible. How did this plan meet the reality? Travelling inside Germany with an electric car was basically perfect. The most efficient way would involve driving fast and hard with top speed of even 180 km/h (where possible due to speed limits and traffic). BMW i4 is very efficient at high speeds with consumption maxing out at 28 kWh/100km when you actually drive at this speed all the time. In the real situation on this trip we saw consumption of 20.8-22.2 kWh/100km in the first legs of the trip. The more traffic there is, the more speed limits and roadworks, the lower is the average speed and also the lower the consumption. With this kind of consumption we could comfortably drive 2 hours as fast as we could and then pick any fast charger along the route and in 26 minutes at a charger (50 kWh charged total) we'd be ready to drive for another 2 hours. This lines up very well with recommended rest stops for biological reasons (bathroom, water or coffee, a bit of movement to get blood circulating) and very close to what I had to do anyway with a gasoline car. With a gasoline car I had to refuel first, then park, then go to bathroom and so on.
With an electric car I can do all of that while the car is charging and in the end the total time for a stop is very similar. Also note that there was a crazy heat wave going on and temperature outside was at about 34C minimum the whole day and hitting 40C at one point of the trip, so a lot of power was used for cooling. The car has a heat pump standard, but it still was working hard to keep us cool in the sun. The car was able to plan a charging route with all the charging stops required and had all the good options (like multiple intermediate stops) that many other cars (hi Tesla) and mobile apps (hi Google and Apple) do not have yet. There are a couple bugs with charging route and display of current route guidance; those are already fixed and will be delivered with an over-the-air update in July 2022. Another good alternative is the ABRP (A Better Route Planner) that was specifically designed for electric car routing along the best route for charging. Most phone apps (like Google Maps) have no idea about your specific electric car - it has no idea about the battery capacity, charging curve and is missing key live data as well - what is the current consumption and remaining energy in the battery. ABRP is different - it has data and profiles for almost all electric cars and can also be linked to live vehicle data, either via an OBD dongle or via a new Tronity cloud service. Tronity reads data from a vehicle-specific cloud service, such as the MyBMW service, saves it, tracks history and also re-transmits it to ABRP for live navigation planning. ABRP allows for options and settings that no car or app offers, for example, saying that you want to stop at a particular place for an hour or until battery is charged to 90%, or saying that you have specific charging cards and would only want to stop at chargers that support those. Both the car and the ABRP also support alternate routes even with multiple intermediate stops. In comparison, route planning by Google Maps or Apple Maps or Waze or even Tesla does not really come close. After charging up in the last German fast charger, a more interesting part of the trip started. In Poland the density of high performance chargers (HPC) is much lower than in Germany. There are many chargers (west of Warsaw), but the vast majority of them are (relatively) slow 50kW chargers. And that is a difference between putting 50kWh into the car in 23-26 minutes or in 60 minutes. It does not seem too much, but the key bit here is that for 20 minutes there is easy-to-find stuff that should be done anyway, but after that you are done and you are just waiting for the car, and whether that takes 4 more minutes or 40 more minutes is a big, perceptual, difference. So using HPC is much, much preferable. So we put in the Ionity charger near Lodz as our intermediate target and the car suggested an intermediate stop at a Greenway charger by Katy Wroclawskie. The location is a bit weird - it has 4 charging stations with 150 kW each. The weird bits are that each station has two CCS connectors, but only one parking place (and the connectors share power, so if two cars were to connect, each would get half power). Also from the front of the location one can only see two stations, the other two are semi-hidden around a corner. We actually missed them on the way to Latvia and one person actually waited for the charger behind us for about 10 minutes. We only discovered the other two stations on the way back.
With slower speeds in Poland the consumption goes down to 18 kWh/100km which translates to now up to 3 hours driving between stops. At the end of the first day we drove, starting from Ulm at 9:30 in the morning until about 23:00 in the evening, with total distance of about 1100 km, 5 charging stops, starting with 92% battery, charging for 26 min (50 kWh), 33 min (57 kWh + lunch), 17 min (23 kWh), 12 min (17 kWh) and 13 min (37 kWh). In the last two chargers you can see the difference between a good and fast 150 kW charger at high battery charge level and a really fast Ionity charger at low battery charge level, which makes charging faster still. We arrived at the hotel with 23% of battery. Overnight the car charged from a Porsche Destination Charger to 87% (57 kWh). That was a bit less than I would expect from a full power 11kW charger, but good enough. Hotels should really install 11kW Type2 chargers for their guests, it is a really significant bonus that drives more clients to you. The road between Warsaw and Kaunas is the most difficult part of the trip for both driving itself and also for charging. For driving the problem is that there will be a new highway going from Warsaw to the Lithuanian border, but it is actually not fully ready yet. So parts of the way one drives on the new, great and wide highway and parts of the way one drives on temporary roads or on old single lane undivided roads. And the most annoying part is navigating between parts as signs are not always clear and the maps are either too old or too new. Some maps do not have the new roads and others have roads that have not actually been built or opened to traffic yet. It's really easy to lose one's way and take a significant detour. As far as charging goes, there are basically only slow 50 kW chargers between Warsaw and Kaunas (for now). We chose to charge on the last charger in Poland, by Suwalki Kaufland. That was not a good idea - there is only one 50 kW CCS and many people decide the same, so there can be a wait. We had to wait 17 minutes before we could charge for 30 more minutes just to get 18 kWh into the battery. Not the best use of time. On the way back we chose a different charger in Lomza where we would have a relaxed dinner while the car was charging. That was far more relaxing and a better use of time. We also tried charging at an Orlen charger that was not recommended by our car and we found out why. Unlike all other chargers during our entire trip, this charger did not accept our universal BMW Charging RFID card. Instead it demanded that we download their own Orlen app and register there. The app is only available in some countries (and not in others) and on iPhone it is only available in Polish. That is a bad exception to the rule and a bad example. This is also how most charging works in the USA. Here in Europe that is not normal. The normal thing is to use a charging card - either provided from the car maker or from another supplier (like PlugSurfing or Maingau Energy). The providers then make roaming arrangements with all the charging networks, so the cards just work everywhere. In the end the user gets the prices and the bills from their card provider as a single monthly bill. This also saves the user any credit card charges. Having a clear, separate RFID card also means that one can easily choose how to pay for each charging session.
For example, I have a corporate RFID card that my company pays for (for charging in Germany) and a private BMW Charging card that I am paying myself for (for charging abroad). Having the car itself authenticate directly with the charger (like Tesla does) removes the option to choose how to pay. Having each charge network have to use their own app or token brings too much chaos and takes too much setup. The optimum is having one card that works everywhere and having the option to have an additional card or cards for specific purposes. Reaching Ionity chargers in Lithuania is again a breath of fresh air - 20-24 minutes to charge 50 kWh is as expected. One can charge on the first Ionity just enough to reach the next one and then on the second charger one can charge up enough to either reach the Ionity charger in Adazi or the final target in Latvia. There is a huge number of CSDD (Road Traffic and Safety Directorate) managed chargers all over Latvia, but they are 50 kW chargers. Good enough for local travel, but not great for long distance trips. BMW i4 charges at over 50 kW on a HPC even at over 90% battery state of charge (SoC). This means that it is always faster to charge up in a HPC than in a 50 kW charger, if that is at all possible. We also tested the CSDD chargers - they worked without any issues. One could pay with the BMW Charging RFID card, one could use the CSDD e-mobi app or token and one could also use Mobilly - an app that you can use in Latvia for everything from parking to public transport tickets or museums or car washes. We managed to reach our final destination near Aluksne with 17% range remaining after just 3 charging stops: 17+30 min (18 kWh), 24 min (48 kWh), 28 min (36 kWh). At the last stop we charged to 90%, which took a few extra minutes more than would have been optimal. For travel around in Latvia we were charging at our target farmhouse from a normal 3 kW Schuko EU socket. That is very slow. We charged for 33 hours and went from 17% to 94%, so not really full. That was perfectly fine for our purposes. We easily reached Riga, drove to the sea and then back to Aluksne with 8% still in reserve and started charging again for the next trip. If it were required to drive around more and charge faster, we could have used the normal 3-phase 440V connection in the farmhouse to have a red CEE 16A plug installed (same as people use for welders). BMW i4 comes standard with a new BMW Flexible Fast Charger that has changeable socket adapters. It comes by default with a Schuko connector in Europe, but for 90€ one can buy an adapter for the blue CEE plug (3.7 kW) or red CEE 16A or 32A plugs (11 kW). Some public charging stations in France actually use the blue CEE plugs instead of the more common Type2 electric car charging stations. The CEE plugs are also common in camping parking places. On the way back the long distance BEV travel was already well understood and did not cause us any problem. From our destination we could easily reach the first Ionity in Lithuania, on the Panevezhis bypass road, where in just 8 minutes we got 19 kWh and were ready to drive on to Kaunas; there, a longer 32 minute stop before the charging desert of the Suwalki Gap gave us 52 kWh, up to 90%. That brought us to a shopping mall in Lomzha where we had some food and charged up 39 kWh in a lazy 50 minutes.
That was enough to bring us to our return hotel for the night - Hotel 500W in Strykow by Lodz, which has a 50kW charger on site. While we were having a late dinner and preparing for sleep, the car easily recharged to full (71 kWh in 95 minutes), so I just moved it from the charger to a parking spot just before going to sleep. Really easy and well flowing day. The second day back went even better as we just needed an 18 minute stop at the same Katy Wroclawskie charger as before to get 22 kWh and that was enough to get back to Germany. After that we were again flying on the Autobahn and charging as needed, 15 min (31 kWh), 23 min (48 kWh) and 31 min (54 kWh + food). We started the day at about 9:40 and were home at 21:40 after driving just over 1000 km on that day. So less than 12 hours for 1000 km travelled, including all charging, bio stops, food and some traffic jams as well. Not bad. Now let's take a look at all the apps and data connections that a technically minded customer can have for their car. Architecturally the car is a network of computers by itself, but it is very secured and normally people do not have any direct access. However, once you log into the car with your BMW account the car gets your profile info and preferences (seat settings, navigation favorites, ...) and the car then also can start sending information to the BMW backend about its status. This information is then available to the user over multiple different channels. There is no separate channel for each of those data flows. The data only goes once to the backend and then all other communication of apps happens with the backend. First of all the MyBMW app. This is the go-to for everything about the car - seeing its current status and location (when not driving), sending commands to the car (lock, unlock, flash lights, pre-condition, ...) and also monitoring and controlling charging processes. You can also plan a route or destination in the app in advance and then just send it over to the car so it already knows where to drive to when you get to the car. This can also integrate with calendar entries, if you have locations for appointments, for example. This also shows full charging history and allows a very easy export of that data; here I exported all charging sessions from June and then trimmed it back to only sessions relevant to the trip and cut off some design elements to have the data more visible. So one can very easily see when and where we were charging, how much power we got at each spot and (if you set prices for locations) can even show costs. I've already mentioned the Tronity service and its ABRP integration, but it also saves the information that it gets from the car and gathers that data over time. It has nice aspects, like showing the driven routes on a map, having ways to do business trip accounting and having a good calendar view. Sadly it does not correctly capture the data for charging sessions (the amounts are incorrect). Update: after talking to Tronity support, it looks like the bug was in the incorrect value for the usable battery capacity for my car. They will look into getting the right values there by default, but as a workaround one can edit their car in their system (after at least one charging session) and directly set the expected battery capacity (usable) in the car properties on the Tronity web portal settings. One other fun way to see data from your BMW is using the BMW integration in Home Assistant. This brings the car in as a device in your own smart home.
You can read all the variables from the car's current status (and Home Assistant makes cute historical charts) and you can even see interesting trends, for example, the remaining range shows a much higher value in Latvia as its prediction is adapted to Latvian road speeds, and during the trip it adapts to Polish and then to German road speeds and thus to higher consumption and thus lower maximum predicted remaining range. Having the car attached to Home Assistant also allows you to attach the car to automations, both as data and event source (like detecting when the car enters the "Home" zone) and also as target, so you could flash car lights or even unlock or lock it when certain conditions are met. So, what in the end was the most important thing - cost of the trip? In total we charged up 863 kWh, so that would normally cost one about 290€, which is close to half what this trip would have cost with a gasoline car. Out of that, 279 kWh was charged in Germany (paid by my employer) and 154 kWh in the farmhouse (paid by our wonderful relatives :D), so in the end the charging that I actually needed to pay for adds up to 430 kWh or about 150€. Typically, it took about 400€ in fuel that I had to pay to get to Latvia and back. The difference is really nice! In the end I believe that there are three different ways of charging:
  • incidental charging - this is the vast majority of charging in the normal day-to-day life. The car gets charged when and where it is convenient to do so along the way. If we go to a movie or a shop and there is a chance to leave the car at a charger, then it can charge up. Works really well, does not take extra time for charging from us.
  • fast charging - charging up at a HPC during optimal charging conditions - from relatively low level to no more than 70-80% while you are still doing all the normal things one would do in a quick stop in a long travel process: bio things, cleaning the windscreen, getting a coffee or a snack.
  • necessary charging - charging from a whatever charger is available just enough to be able to reach the next destination or the next fast charger.
The last category is the only one that is really annoying and should be avoided at all costs. Even by shifting your plans so that you find something else useful to do while necessary charging is happening and thus, at least partially, shifting it over to the incidental charging category. Then you are no longer just waiting for the car, you are doing something else and the car magically is charged up again. And when one does that, then travelling with an electric car becomes no more annoying than travelling with a gasoline car. Having more breaks in a trip is a good thing and makes the trips actually easier and less stressful - I was more relaxed during and after this trip than during previous trips. Having the car air conditioning always be on, even when stopped, was a godsend in the insane heat wave of 30C-38C that we were driving through. Final stats: 4425 km driven in the trip. Average consumption: 18.7 kWh/100km. Time driving: 2 days and 3 hours. Car regened 152 kWh. Charging stations recharged 863 kWh. Questions? You can use this i4talk forum thread or this Twitter thread to ask me.

Russell Coker: Philips 438P1 43" 4K Monitor

I have just returned a Philips 438P1 43" 4K Monitor [1] and gone back to my Samsung 28" 4K monitor model LU28E590DS/XY AKA UE590. The main listed differences are the size and the fact that the Samsung is TN but the Philips is IPS. Here's a comparison of TN and IPS technologies [2]. Generally I think that TN is probably best for a monitor but in theory IPS shouldn't be far behind. The Philips monitor has a screen with a shiny surface which may be good for a TV but isn't good for a monitor. Also it seemed to blur the pixels a bit which again is probably OK for a TV that is trying to emulate curved images but not good for a monitor where it's all artificial straight lines. The most important thing for me in a monitor is how well it displays text in small fonts; for that I don't really want the round parts of the letters to look genuinely round, as a clear octagon or rectangle is better than a fuzzy circle. There is some controversy about the ideal size for monitors. Some people think that nothing larger than 28" is needed and some people think that a 43" is totally usable. After testing I determined that 43" is really too big, I had to move to see it all. Also for my use it's convenient to be able to turn a monitor slightly to allow someone else to get a good view and a 43" monitor is too large to move much (maybe future technology for lighter monitors will change this). Previously I had been unable to get my Samsung monitor to work at 4K resolution with 60Hz and had believed it was due to cheap video cards. I got the Philips monitor to work with HDMI so it's apparent that the Samsung monitor doesn't do 4K@60Hz on HDMI. This isn't a real problem as the Samsung monitor doesn't have built in speakers. The Philips monitor has built in speakers for HDMI sound which means one less cable to my PC and no desk space taken by speakers. I bought the Philips monitor on eBay in opened unused condition. Inside the box was a sheet with a printout stating that the monitor blanks the screen periodically, so the seller knew that it wasn't in unused condition, it was tested and failed the test. If the Philips monitor had been as minimally broken as described then I might have kept it. However it seems that certain patterns of input caused it to reboot. For example I could be watching Netflix and have it drop out, I would press the left arrow to watch that bit again and have it drop out again. On one occasion I did a test and found that a 5 second section of Netflix content caused the monitor to reboot 6 out of the 8 times I viewed it. The workaround I discovered was to switch between maximised window and full-screen mode when it had a dropout. So I just press left-arrow and then F and I can keep watching. That's not what I expect from a $700 monitor! I considered checking for Philips firmware updates but decided against it because I didn't want to risk voiding the warranty if it didn't work correctly and I decided I just didn't like the monitor that much. Ideally for my next monitor I'll get a 4K screen of about 35", TN, and a screen that's not shiny. At the moment there don't seem to be many monitors between 32" and 43" in size, so 32" may do. I am quite happy with the Samsung monitor so getting the same but slightly larger is fine. It's a pity they stopped making 5K displays.

26 June 2022

Russ Allbery: Review: Feet of Clay

Review: Feet of Clay, by Terry Pratchett
Series: Discworld #19
Publisher: Harper
Copyright: October 1996
Printing: February 2014
ISBN: 0-06-227551-8
Format: Mass market
Pages: 392
Feet of Clay is the 19th Discworld novel, the third Watch novel, and probably not the best place to start. You could read only Guards! Guards! and Men at Arms before this one, though, if you wanted. This story opens with a golem selling another golem to a factory owner, obviously not caring about the price. This is followed by two murders: an elderly priest, and the curator of a dwarven bread museum. (Dwarf bread is a much-feared weapon of war.) Meanwhile, assassins are still trying to kill Watch Commander Vimes, who has an appointment to get a coat of arms. A dwarf named Cheery Littlebottom is joining the Watch. And Lord Vetinari, the ruler of Ankh-Morpork, has been poisoned. There's a lot going on in this book, and while it's all in some sense related, it's more interwoven than part of a single story. The result felt to me like a day-in-the-life episode of a cop show: a lot of character development, a few largely separate plot lines so that the characters have something to do, and the development of a few long-running themes that are neither started nor concluded in this book. We check in on all the individual Watch members we've met to date, add new ones, and at the end of the book everyone is roughly back to where they were when the book started. This is, to be clear, not a bad thing for a book to do. It relies on the reader already caring about the characters and being invested in the long arc of the series, but both of those are true of me, so it worked. Cheery is a good addition, giving Pratchett an opportunity to explore gender nonconformity with a twist (all dwarfs are expected to act the same way regardless of gender, which doesn't work for Cheery) and, even better, giving Angua more scenes. Angua is among my favorite Watch characters, although I wish she'd gotten more of a resolution for her relationship anxiety in this book. The primary plot is about golems, which on Discworld are used in factories because they work nonstop, have no other needs, and do whatever they're told. Nearly everyone in Ankh-Morpork considers them machinery. If you've read any Discworld books before, you will find it unsurprising that Pratchett calls that belief into question, but the ways he gets there, and the links between the golem plot and the other plot threads, have a few good twists and turns. Reading this, I was reminded vividly of Orwell's discussion of Charles Dickens:
It seems that in every attack Dickens makes upon society he is always pointing to a change of spirit rather than a change of structure. It is hopeless to try and pin him down to any definite remedy, still more to any political doctrine. His approach is always along the moral plane, and his attitude is sufficiently summed up in that remark about Strong's school being as different from Creakle's "as good is from evil." Two things can be very much alike and yet abysmally different. Heaven and Hell are in the same place. Useless to change institutions without a "change of heart" that, essentially, is what he is always saying. If that were all, he might be no more than a cheer-up writer, a reactionary humbug. A "change of heart" is in fact the alibi of people who do not wish to endanger the status quo. But Dickens is not a humbug, except in minor matters, and the strongest single impression one carries away from his books is that of a hatred of tyranny.
and later:
His radicalism is of the vaguest kind, and yet one always knows that it is there. That is the difference between being a moralist and a politician. He has no constructive suggestions, not even a clear grasp of the nature of the society he is attacking, only an emotional perception that something is wrong, all he can finally say is, "Behave decently," which, as I suggested earlier, is not necessarily so shallow as it sounds. Most revolutionaries are potential Tories, because they imagine that everything can be put right by altering the shape of society; once that change is effected, as it sometimes is, they see no need for any other. Dickens has not this kind of mental coarseness. The vagueness of his discontent is the mark of its permanence. What he is out against is not this or that institution, but, as Chesterton put it, "an expression on the human face."
I think Pratchett is, in that sense, a Dickensian writer, and it shows all through Discworld. He does write political crises (there is one in this book), but the crises are moral or personal, not ideological or structural. The Watch novels are often concerned with systems of government, but focus primarily on the popular appeal of kings, the skill of the Patrician, and the greed of those who would maneuver for power. Pratchett does not write (at least so far) about the proper role of government, the impact of Vetinari's policies (or even what those policies may be), or political theory in any deep sense. What he does write about, at great length, is morality, fairness, and a deeply generous humanism, all of which are central to the golem plot. Vimes is a great protagonist for this type of story. He's grumpy, cynical, stubborn, and prejudiced, and we learn in this book that he's a descendant of the Discworld version of Oliver Cromwell. He can be reflexively self-centered, and he has no clear idea how to use his newfound resources. But he behaves decently towards people, in both big and small things, for reasons that the reader feels he could never adequately explain, but which are rooted in empathy and an instinctual sense of fairness. It's fun to watch him grumble his way through the plot while making snide comments about mysteries and detectives. I do have to complain a bit about one of those mysteries, though. I would have enjoyed the plot around Vetinari's poisoning more if Pratchett hadn't mercilessly teased readers who know a bit about French history. An allusion or two would have been fun, but he kept dropping references while having Vimes ignore them, and I found the overall effect both frustrating and irritating. That and a few other bits, like Angua's uncommunicative angst, fell flat for me. Thankfully, several other excellent scenes made up for them, such as Nobby's high society party and everything about the College of Heralds. Also, Vimes's impish PDA (smartphone without the phone, for those younger than I am) remains absurdly good commentary on the annoyances of portable digital devices despite an original publication date of 1996. Feet of Clay is less focused than the previous Watch novels and more of a series book than most Discworld novels. You're reading about characters introduced in previous books with problems that will continue into subsequent books. The plot and the mysteries are there to drive the story but seem relatively incidental to the characterization. This isn't a complaint; at this point in the series, I'm in it for the long haul, and I liked the variation. As usual, Pratchett is stronger for me when he's not overly focused on parody. His own characters are as good as the material he's been parodying, and I'm happy to see them get a book that's not overshadowed by someone else's material. If you've read this far in the series, or even in just the Watch novels, recommended. Followed by Hogfather in publication order and, thematically, by Jingo. Rating: 8 out of 10

23 June 2022

Raphaël Hertzog: Freexian's report about Debian Long Term Support, May 2022

A Debian LTS logo
Like each month, have a look at the work funded by Freexian's Debian LTS offering. Debian project funding Two [1, 2] projects are in the pipeline now. The Tryton project is in its final phase. The Gradle project is fighting with technical difficulties. In May, we put aside 2233 EUR to fund Debian projects. We're looking forward to receiving more projects from various Debian teams! Learn more about the rationale behind this initiative in this article. Debian LTS contributors In May, 14 contributors were paid to work on Debian LTS; their reports are available: Evolution of the situation In May we released 49 DLAs. The security tracker currently lists 71 packages with a known CVE and the dla-needed.txt file has 65 packages needing an update. The number of paid contributors increased significantly; we are pleased to welcome our latest team members: Andreas Rönnquist, Dominik George, Enrico Zini and Stefano Rivera. It is worth pointing out that we are getting close to the end of the LTS period for Debian 9. After June 30th, no new security updates will be made available on security.debian.org. We are preparing to take over Debian 10 Buster for the next two years and to make this process as smooth as possible. But Freexian and its team of paid Debian contributors will continue to maintain Debian 9 going forward for the customers of the Extended LTS offer. If you have Debian 9 servers to keep secure, it's time to subscribe! You might not have noticed, but Freexian formalized a mission statement where we explain that our purpose is to help improve Debian. For this, we want to fund work time for the Debian developers that recently joined Freexian as collaborators. The Extended LTS and the PHP LTS offers are built following a model that will help us to achieve this if we manage to have enough customers for those offers. So consider subscribing: you help your organization but you also help Debian! Thanks to our sponsors Sponsors that joined recently are in bold.

20 June 2022

Iustin Pop: Experiment: A week of running

My sports friends know that I wasn't able to really run for many, many years, due to a recurring injury that was not fully diagnosed and which, after many sessions with the doctor, ended up with an OK-ish state for day-to-day life but also with these words: "Maybe running is just not for you?" The year 2012 was my running year. I went to a number of races, wrote blog posts, then slowly started running only rarely, then a few years later I was really only running once in a while, and coupled with a number of bad ideas of the type "let's run today after a long break, but a lot", I started injuring my foot. Add a few more years, some more kilograms on my body, one event of jumping with a kid on my shoulders and landing on my bad foot, and the setup was complete. Doctor visits, therapy, slow improvements, but not really solving the problem. 6-month breaks, small attempts at running, pain again, repeat, pain again, etc. It ended up with me acknowledging that yes, maybe running is not for me, and I should really give it up. Incidentally, in 2021, as part of me trying to improve my health/diet, I tried something that is not important for this post, and for the first time in a long time, I was fully, 100%, pain free in my leg during day-to-day activities. Huh, maybe this is not purely related to running? From that point on, my foot became, very slowly, better. I started doing short runs (2-3km), especially on holidays where I can't bike, and if I was careful, it didn't go too bad. But I knew I can't run, so these were rare events. In April this year, on vacation, I ran a couple of times - 20km distance. In May, 12km. Then, there was a Garmin Badge I really wanted, so against my good judgement, I did a run/walk (2:1 ratio) the previous weekend, and to my surprise, no unwanted side-effect. And I got an idea: what if I do short run/walks an entire week? When does my foot "break"? I mean, by now I knew that a short (3-4, maybe 5km) run that has pauses doesn't negatively impact my foot. What about the 2nd one? Or the 3rd one? When does it break? Is it distance, or something else? The other problem was: when to run? I mean, on top of the hybrid work model. When working from home, all good, but when working from the office? So the other, somewhat more impossible task for me, was to wake up early and run before 8 AM. Clearly destined to fail! But, the following day (Monday), I did wake up and ran 3km. Then Tuesday again, 3.3km (and later, one hour of biking). Wed - 3.3km. Thu - 4.40km, at 4:1 ratio (2m:30s). Friday, 3.7km (4:1), plus a very long for me (112km) bike ride. By this time, I was physically dead. Not my foot, just my entire body. On Saturday morning, Training Peaks said my form is -52, and it starts warning below -15. I woke up late and groggy, and I had to extra motivate myself to go for the last, 5.3km run, to round up the week. On Friday and Saturday, my problem leg did start to, how to say, remind me it is problematic? But not like previously, no waking in the morning with a stiff tendon. No, just not fully happy. And, to my surprise, correlated again with my consumption of problematic food (I was getting hungrier and hungrier, and eating too much of things I should keep an eye on). At this point, with the week behind me: Did my experiment make me wiser? Not really. Happier? Yes, 100%. I plan to buy some new running clothes, my current ones are really old. But did I really understand how my body functions? A loud no. Sigh.
The next challenge will be how to manage my time across multiple sports (and work, and family, and other hobbies). Still, knowing that I can anytime go for 25-35 minutes of running, without preparation, is very reassuring. Freedom, health and injury-free sports to everyone!

19 June 2022

John Goerzen: Pipes, deadlocks, and strace annoyingly fixing them

This is a complex tale I will attempt to make simple(ish). I've (re)learned more than I cared to about the details of pipes, signals, and certain system calls, and the solution is still elusive. For some time now, I have been using NNCP to back up my files. These backups are sent to my backup system, which effectively does this to process them (each ZFS send is piped to a shell script that winds up running this):
gpg -q -d | zstdcat -T0 | zfs receive -u -o readonly=on "$STORE/$DEST"
This processes tens of thousands of zfs sends per week. Recently, having written Filespooler, I switched to sending the backups using Filespooler over NNCP. Now fspl (the Filespooler executable) opens the file for each stream and then connects it to what amounts to this pipeline:
bash -c 'gpg -q -d 2>/dev/null | zstdcat -T0' | zfs receive -u -o readonly=on "$STORE/$DEST"
Actually, to be more precise, it spins up the bash part of it, reads a few bytes from it, and then connects it to the zfs receive. And this works well almost always. In something like 1/1000 of the cases, it deadlocks, and I still don't know why. But I can talk about the journey of trying to figure it out (and maybe some of you will have some ideas). Filespooler is written in Rust, and uses Rust's Command system. Effectively what happens is this (a rough sketch of the wiring follows the list below):
  1. The fspl process has a File handle which, after forking but before invoking bash, it dup2()s to stdin.
  2. The connection between bash and zfs receive is a standard Unix pipe.
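For readers who want to poke at this themselves, here is a rough sketch of the same wiring using Python's subprocess module (the real tool is in Rust and also reads a few bytes from the bash side before handing it to zfs receive, which this sketch skips; the input path and dataset name are made up for illustration):

import subprocess

# Hypothetical input file standing in for the Filespooler stream.
stream = open("/tmp/example-stream", "rb")

# Child 1: the decryption/decompression pipeline. Its stdin is the file
# (dup'd onto fd 0 in the child); its stdout is a pipe we hand off below.
decoder = subprocess.Popen(
    ["bash", "-c", "gpg -q -d 2>/dev/null | zstdcat -T0"],
    stdin=stream,
    stdout=subprocess.PIPE,
)

# Child 2: zfs receive, reading from the decoder's stdout over a standard
# Unix pipe. "tank/backup/example" is a made-up dataset name.
receiver = subprocess.Popen(
    ["zfs", "receive", "-u", "-o", "readonly=on", "tank/backup/example"],
    stdin=decoder.stdout,
)

# Close our copy of the pipe's read end so zfs sees EOF when bash exits;
# a leaked copy of this fd in the parent is exactly the kind of thing
# that keeps a pipe from ever reporting EOF.
decoder.stdout.close()

decoder.wait()
receiver.wait()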
I cannot get the problem to duplicate when I run the entire thing under strace -f. So I am left trying to peek at it from the outside. What happens if I try to attach to each component with strace -p? So the plot thickens! Why would connecting to zstdcat and zfs receive cause them to actually change behavior? strace works by using the ptrace system call, and ptrace in a number of cases requires sending SIGSTOP to a process. In a complicated set of circumstances, a system call may return EINTR when a SIGSTOP is received, with the idea that the system call should be retried. I can't see, from either zstdcat or zfs, if this is happening, though. So I thought: how about having Filespooler manually copy data from bash to zfs receive in a read/write loop instead of having them connected directly via a pipe? That is, there would be two pipes going there: one where Filespooler reads from the bash command, and one where it writes to zfs. If nothing else, I could instrument it with debugging. And so I did, and I found that when it deadlocked, it was deadlocking on write, but with no discernible pattern as to where or when. So I went back to the directly connected setup. In analyzing straces, I found a Rust bug, which I reported, in which it fails to close the read end of a pipe in the parent post-fork. However, having implemented a workaround for this, it doesn't prevent the deadlock, so this is orthogonal to the issue at hand. The two strange things here are things returning to normal when I attach strace to zstdcat, and things crashing when I attach strace to zfs. I decided to investigate the latter. It turns out that the ZFS code that is reading from stdin during zfs receive is in the kernel module, not userland. Here is the part that is triggering the incomplete stream error:
                int err = zfs_file_read(fp, (char *)buf + done,
                    len - done, &resid);
                if (resid == len - done) {
                        /*
                         * Note: ECKSUM or ZFS_ERR_STREAM_TRUNCATED indicates
                         * that the receive was interrupted and can
                         * potentially be resumed.
                         */
                        err = SET_ERROR(ZFS_ERR_STREAM_TRUNCATED);
                }
resid is an output parameter with the number of bytes remaining from a short read, so in this case, if the read produced zero bytes, then it sets that error. What's zfs_file_read then? It boils down to a thin wrapper around kernel_read(). This winds up calling __kernel_read(), which calls read_iter on the pipe, which is pipe_read(). That's where I don't have the knowledge to get into the weeds right now. So it seems likely to me that the problem has something to do with zfs receive. But what, and why does it fail only in this one very specific situation, and only so rarely? And why does attaching strace to zstdcat make it all work again? I'm indeed puzzled! Update 2022-06-20: See the followup post, which identifies this as likely a kernel bug and explains why this particular use of Filespooler made it easier to trigger.

17 June 2022

Antoine Beaupré: Matrix notes

I have some concerns about Matrix (the protocol, not the movie that came out recently, although I do have concerns about that as well). I've been watching the project for a long time, and it seems like a promising alternative to many protocols like IRC, XMPP, and Signal. This review may sound a bit negative, because it focuses on those concerns. I am the operator of an IRC network and people keep asking me to bridge it with Matrix. I have myself considered just giving up on IRC and converting to Matrix. This article is a living document exploring my research of that problem space. The TL;DR is that no, I'm not setting up a bridge just yet, and I'm still on IRC. This article was written over the course of the last three months, but I have been watching the Matrix project for years (my logs seem to say 2016 at least). The article is rather long. It will likely take you half an hour to read, so copy this over to your ebook reader, your tablet, or dead trees, and lean back and relax as I show you around the Matrix. Or, alternatively, just jump to a section that interests you, most likely the conclusion.

Introduction to Matrix Matrix is an "open standard for interoperable, decentralised, real-time communication over IP. It can be used to power Instant Messaging, VoIP/WebRTC signalling, Internet of Things communication - or anywhere you need a standard HTTP API for publishing and subscribing to data whilst tracking the conversation history". It's also (when compared with XMPP) "an eventually consistent global JSON database with an HTTP API and pubsub semantics - whilst XMPP can be thought of as a message passing protocol." According to their FAQ, the project started in 2014, has about 20,000 servers, and millions of users. Matrix works over HTTPS but over a special port: 8448.

Security and privacy I have some concerns about the security promises of Matrix. It's advertised as "secure" with "E2E [end-to-end] encryption", but how does it actually work?

Data retention defaults One of my main concerns with Matrix is data retention, which is a key part of security in a threat model where (for example) a hostile state actor wants to surveil your communications and can seize your devices. On IRC, servers don't actually keep messages all that long: they pass them along to other servers and clients as fast as they can, only keep them in memory, and move on to the next message. There are no concerns about data retention on messages (and their metadata) other than the network layer. (I'm ignoring the issues with user registration, which is a separate, if valid, concern.) Obviously, a hostile server could log everything passing through it, but IRC federations are normally tightly controlled. So, if you trust your IRC operators, you should be fairly safe. Obviously, clients can (and often do, even if OTR is configured!) log all messages, but this is generally not the default. Irssi, for example, does not log by default. IRC bouncers are more likely to log to disk, of course, to be able to do what they do. Compare this to Matrix: when you send a message to a Matrix homeserver, that server first stores it in its internal SQL database. Then it will transmit that message to all clients connected to that server and room, and to all other servers that have clients connected to that room. Those remote servers, in turn, will keep a copy of that message and all its metadata in their own database, by default forever. In encrypted rooms the messages themselves are encrypted, but not their metadata. There is a mechanism to expire entries in Synapse, but it is not enabled by default. So one should generally assume that a message sent on Matrix is never expired.
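To make the expiry mechanism a bit more concrete, here is a small sketch of purging old events from a single room by hand through Synapse's admin API (the purge_history endpoint is documented for Synapse; the homeserver URL, room ID, and token below are placeholders, and the exact parameters may vary between Synapse versions):

import time
import requests

HOMESERVER = "https://matrix.example.com"   # placeholder
ROOM_ID = "!someroomid:example.com"         # placeholder room ID
ADMIN_TOKEN = "syt_..."                     # an admin user's access token

# Ask Synapse to purge this room's history older than 30 days.
cutoff_ms = int((time.time() - 30 * 24 * 3600) * 1000)
resp = requests.post(
    f"{HOMESERVER}/_synapse/admin/v1/purge_history/{ROOM_ID}",
    headers={"Authorization": f"Bearer {ADMIN_TOKEN}"},
    json={"purge_up_to_ts": cutoff_ms, "delete_local_events": False},
)
resp.raise_for_status()
print(resp.json())  # returns a purge_id that can be polled for status

Note that this only cleans up one server's copy; every other homeserver in the room keeps its own.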

GDPR in the federation But even if that setting was enabled by default, how do you control it? This is a fundamental problem of the federation: if any user is allowed to join a room (which is the default), those users' servers will log all content and metadata from that room. That includes private, one-on-one conversations, since those are essentially rooms as well. In the context of the GDPR, this is really tricky: who is the responsible party (known as the "data controller") here? It's basically any yahoo who fires up a home server and joins a room. In a federated network, one has to wonder whether GDPR enforcement is even possible at all. But in Matrix in particular, if you want to enforce your right to be forgotten in a given room, you would have to:
  1. enumerate all the users that ever joined the room while you were there
  2. discover all their home servers
  3. start a GDPR procedure against all those servers
I recognize this is a hard problem to solve while still keeping an open ecosystem. But I believe that Matrix should have much stricter defaults towards data retention than it does right now. Message expiry should be enforced by default, for example. (Note that there are also redaction policies that could be used to implement part of the GDPR automatically, see the privacy policy discussion below on that.) Also keep in mind that, in the brave new peer-to-peer world that Matrix is heading towards, the boundary between server and client is likely to be fuzzier, which would make applying the GDPR even more difficult. Update: this comment links to this post (in German) which apparently studied the question and concluded that Matrix is not GDPR-compliant. In fact, maybe Synapse should be designed so that there's no configurable flag to turn off data retention. A bit like how most system loggers in UNIX (e.g. syslog) come with a log retention system that typically rotates logs after a few weeks or months. Historically, this was designed to keep hard drives from filling up, but it also has the added benefit of limiting the amount of personal information kept on disk in this modern day. (Arguably, syslog doesn't rotate logs on its own, but, say, Debian GNU/Linux, as an installed system, does have log retention policies well defined for installed packages, and those can be discussed.) And "no expiry" is definitely a bug.

Matrix.org privacy policy When I first looked at Matrix, five years ago, Element.io was called Riot.im and had a rather dubious privacy policy:
We currently use cookies to support our use of Google Analytics on the Website and Service. Google Analytics collects information about how you use the Website and Service. [...] This helps us to provide you with a good experience when you browse our Website and use our Service and also allows us to improve our Website and our Service.
When I asked Matrix people about why they were using Google Analytics, they explained this was for development purposes and they were aiming for velocity at the time, not privacy (paraphrasing here). They also included a "free to snitch" clause:
If we are or believe that we are under a duty to disclose or share your personal data, we will do so in order to comply with any legal obligation, the instructions or requests of a governmental authority or regulator, including those outside of the UK.
Those are really broad terms, above and beyond what is typically expected legally. Like the current retention policies, such user tracking and ... "liberal" collaboration practices with the state set a bad precedent for other home servers. Thankfully, since the above policy was published (2017), the GDPR was "implemented" (2018) and it seems like both the Element.io privacy policy and the Matrix.org privacy policy have been somewhat improved since. Notable points of the new privacy policies:
  • 2.3.1.1: the "federation" section actually outlines that "Federated homeservers and Matrix clients which respect the Matrix protocol are expected to honour these controls and redaction/erasure requests, but other federated homeservers are outside of the span of control of Element, and we cannot guarantee how this data will be processed"
  • 2.6: users under the age of 16 should not use the matrix.org service
  • 2.10: Upcloud, Mythic Beast, Amazon, and CloudFlare possibly have access to your data (it's nice to at least mention this in the privacy policy: many providers don't even bother admitting to this kind of delegation)
  • Element 2.2.1: mentions many more third parties used when you are paying Matrix.org for hosting: Twilio, Stripe, Quaderno, LinkedIn, Twitter, Google, Outplay, PipeDrive, HubSpot, Posthog, Sentry, and Matomo (phew!)
I'm not super happy with all the trackers they have on the Element platform, but then again you don't have to use that service. Your favorite homeserver (assuming you are not on Matrix.org) probably has their own Element deployment, hopefully without all that garbage. Overall, this is all a huge improvement over the previous privacy policy, so hats off to the Matrix people for figuring out a reasonable policy in such a tricky context. I particularly like this bit:
We will forget your copy of your data upon your request. We will also forward your request to be forgotten onto federated homeservers. However - these homeservers are outside our span of control, so we cannot guarantee they will forget your data.
It's great they implemented those mechanisms and, after all, if there's a hostile party in there, nothing can prevent them from using screenshots to just exfiltrate your data away from the client side anyways, even with services typically seen as more secure, like Signal. As an aside, I also appreciate that Matrix.org has a fairly decent code of conduct, based on the TODO CoC which checks all the boxes in the geekfeminism wiki.

Metadata handling Overall, privacy protections in Matrix mostly concern message contents, not metadata. In other words, who's talking with whom, when, and from where is not well protected. Compared to a tool like Signal, which goes to great lengths to anonymize that data with features like private contact discovery, disappearing messages, sealed senders, and private groups, Matrix is definitely behind. (Note: there is an issue open about message lifetimes in Element since 2020, but it's not even at the MSC stage yet.) This is a known issue (opened in 2019) in Synapse, but this is not just an implementation issue, it's a flaw in the protocol itself. Home servers keep join/leave events for all rooms, which gives cleartext information about who is talking to whom. Synapse logs may also contain personally identifiable information that home server admins might not be aware of in the first place. Those log rotation policies are separate from the server-level retention policy, which may be confusing for a novice sysadmin. Combine this with the federation: even if you trust your home server to do the right thing, the second you join a public room with third-party home servers, those ideas kind of get thrown out because those servers can do whatever they want with that information. Again, a problem that is hard to solve in any federation. To be fair, IRC doesn't have a great story here either: any client knows not only who's talking to whom in a room, but also typically their client IP address. Servers can (and often do) obfuscate this, but often that obfuscation is trivial to reverse. Some servers do provide "cloaks" (sometimes automatically), but that's kind of a "slap-on" solution that actually moves the problem elsewhere: now the server knows a little more about the user. Overall, I would worry much more about a Matrix home server seizure than an IRC or Signal server seizure. Signal does get subpoenas, and they can only give out a tiny bit of information about their users: their phone number, their registration date, and their last connection date. Matrix carries a lot more information in its database.

Amplification attacks on URL previews I (still!) run an Icecast server and sometimes share links to it on IRC; those links, obviously, also end up on (more than one!) Matrix home servers because some people connect to IRC using Matrix. This, in turn, means that Matrix will connect to that URL to generate a link preview. I feel this outlines a security issue, especially because those sockets would be kept open seemingly forever. I tried to warn the Matrix security team but somehow, I don't think this issue was taken very seriously. Here's the disclosure timeline:
  • January 18: contacted Matrix security
  • January 19: response: already reported as a bug
  • January 20: response: can't reproduce
  • January 31: timeout added, considered solved
  • January 31: I respond that I believe the security issue is underestimated, ask for clearance to disclose
  • February 1: response: asking for two weeks delay after the next release (1.53.0) including another patch, presumably in two weeks' time
  • February 22: Matrix 1.53.0 released
  • April 14: I notice the release, ask for clearance again
  • April 14: response: referred to the public disclosure
There are a couple of problems here:
  1. the bug was publicly disclosed in September 2020, and not considered a security issue until I notified them, and even then, I had to insist
  2. no clear disclosure policy timeline was proposed or seems established in the project (there is a security disclosure policy but it doesn't include any predefined timeline)
  3. I wasn't informed of the disclosure
  4. the actual solution is a size limit (10MB, already implemented), a time limit (30 seconds, implemented in PR 11784), and a content type allow list (HTML, "media" or JSON, implemented in PR 11936), and I'm not sure it's adequate
  5. (pure vanity:) I did not make it to their Hall of fame
I'm not sure those solutions are adequate because they all seem to assume a single home server will pull that one URL for a little while then stop. But in a federated network, many (possibly thousands) home servers may be connected in a single room at once. If an attacker drops a link into such a room, all those servers would connect to that link all at once. This is an amplification attack: a small amount of traffic will generate a lot more traffic to a single target. It doesn't matter there are size or time limits: the amplification is what matters here. It should also be noted that clients that generate link previews have more amplification because they are more numerous than servers. And of course, the default Matrix client (Element) does generate link previews as well. That said, this is possibly not a problem specific to Matrix: any federated service that generates link previews may suffer from this. I'm honestly not sure what the solution is here. Maybe moderation? Maybe link previews are just evil? All I know is there was this weird bug in my Icecast server and I tried to ring the bell about it, and it feels it was swept under the rug. Somehow I feel this is bound to blow up again in the future, even with the current mitigation.

Moderation In Matrix, like elsewhere, moderation is a hard problem. There is a detailed moderation guide and much of this problem space is actively being worked on in Matrix right now. A fundamental problem with moderating a federated space is that a user banned from a room can rejoin the room from another server. This is why spam is such a problem in email, and why IRC networks stopped federating ages ago (see the IRC history for that fascinating story).

The mjolnir bot The mjolnir moderation bot is designed to help with some of those things. It can kick and ban users, and redact all of a user's messages (as opposed to doing it one by one), all of this across multiple rooms. It can also subscribe to a federated block list published by matrix.org to block known abusers (users or servers). Bans are pretty flexible and can operate at the user, room, or server level. Matrix people suggest making the bot admin of your channels, because you can't take admin back from a user once it has been given.

The command-line tool There's also a new command line tool designed to do things like:
  • System notify users (all users/users from a list, specific user)
  • delete sessions/devices not seen for X days
  • purge the remote media cache
  • select rooms with various criteria (external/local/empty/created by/encrypted/cleartext)
  • purge history of these rooms
  • shutdown rooms
This tool and Mjolnir are based on the admin API built into Synapse.
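As an example of the kind of call these tools wrap, here is a hedged sketch of purging old entries from the remote media cache through that admin API (the URL and token are placeholders; double-check the endpoint against the documentation for your Synapse version):

import time
import requests

HOMESERVER = "https://matrix.example.com"   # placeholder
ADMIN_TOKEN = "syt_..."                     # an admin user's access token

# Drop cached copies of remote media not accessed in the last 90 days.
before_ts = int((time.time() - 90 * 24 * 3600) * 1000)
resp = requests.post(
    f"{HOMESERVER}/_synapse/admin/v1/purge_media_cache",
    params={"before_ts": before_ts},
    headers={"Authorization": f"Bearer {ADMIN_TOKEN}"},
)
resp.raise_for_status()
print(resp.json())  # typically reports how many media items were deleted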

Rate limiting Synapse has pretty good built-in rate-limiting which blocks repeated login, registration, joining, or messaging attempts. It may also end up throttling servers on the federation based on those settings.

Fundamental federation problems Because users joining a room may come from another server, room moderators are at the mercy of the registration and moderation policies of those servers. Matrix is like IRC's +R mode ("only registered users can join") by default, except that anyone can register their own homeserver, which makes this limited. Server admins can block IP addresses and home servers, but those tools are not easily available to room admins. There is an API (m.room.server_acl in /devtools) but it is not reliable (thanks Austin Huang for the clarification). Matrix has the concept of guest accounts, but it is not used very much, and virtually no client or homeserver supports it. This contrasts with the way IRC works: by default, anyone can join an IRC network even without authentication. Some channels require registration, but in general you are free to join and look around (until you get blocked, of course). I have seen anecdotal evidence (CW: Twitter, nitter link) that "moderating bridges is hell", and I can imagine why. Moderation is already hard enough on one federation; when you bridge a room with another network, you inherit all the problems from that network but without all of the abuse-control tools from the original network's API...

Room admins Matrix, in particular, has the problem that room administrators (who have the power to redact messages, ban users, and promote other users) are bound to their Matrix ID which is, in turn, bound to their home servers. This implies that a home server administrator could (1) impersonate a given user and (2) use that to hijack the room. So in practice, the home server is the trust anchor for rooms, not the user themselves. That said, if server B's administrator hijacks user joe on server B, they will hijack that room on that specific server. This will not (necessarily) affect users on the other servers, as servers could refuse parts of the updates or ban the compromised account (or server). It does seem like a major flaw that room credentials are bound to Matrix identifiers, as opposed to the E2E encryption credentials. In an encrypted room even with fully verified members, a compromised or hostile home server can still take over the room by impersonating an admin. That admin (or even a newly minted user) can then send events or listen in on the conversations. This is even more frustrating when you consider that Matrix events are actually signed and therefore have some authentication attached to them, acting like some sort of Merkle tree (as it contains a link to previous events). That signature, however, is made from the homeserver PKI keys, not the client's E2E keys, which makes E2E feel like it has been "bolted on" later.

Availability While Matrix has a strong advantage over Signal in that it's decentralized (so anyone can run their own homeserver), I couldn't find an easy way to run a "multi-primary" setup, or even a "redundant" setup (even with a single primary backend), short of going full-on "replicate PostgreSQL and Redis data", which is typically not for the faint of heart.

How this works in IRC On IRC, it's quite easy to set up redundant nodes. All you need is:
  1. a new machine (with its own public address and an open port)
  2. a shared secret (or certificate) between that machine and an existing one on the network
  3. a connect block on both servers
That's it: the node will join the network and people can connect to it as usual and share the same user/namespace as the rest of the network. The servers take care of synchronizing state: you do not need to worry about replicating a database server. (Now, experienced IRC people will know there's a catch here: IRC doesn't have authentication built in, and relies on "services" which are basically bots that authenticate users (I'm simplifying, don't nitpick). If that service goes down, the network still works, but then people can't authenticate, and they can start doing nasty things like steal people's identity if they get knocked offline. But still: basic functionality still works: you can talk in rooms and with users that are on the reachable network.)

User identities Matrix is more complicated. Each "home server" has its own identity namespace: a specific user (say @anarcat:matrix.org) is bound to that specific home server. If that server goes down, that user is completely disconnected. They could register a new account elsewhere and reconnect, but then they basically lose all their configuration: contacts, joined channels are all lost. (Also notice how the Matrix IDs don't look like a typical user address like an email in XMPP. They at least did their homework and got the allocation for the scheme.)

Rooms Users talk to each other in "rooms", even in one-to-one communications. (Rooms are also used for other things like "spaces"; they're basically used for everything, think "everything is a file" kind of tool.) For rooms, home servers act more like IRC nodes in that they keep a local state of the chat room and synchronize it with other servers. Users can keep talking inside a room if the server that originally hosted the room goes down. Rooms can have a local, server-specific "alias" so that, say, #room:matrix.org is also visible as #room:example.com on the example.com home server. Both addresses refer to the same underlying room. (Finding this in the Element settings is not obvious though, because that "alias" is actually called a "local address" there. So to create such an alias (in Element), you need to go into the room settings' "General" section, "Show more" in "Local address", then add the alias name (e.g. foo), and then that room will be available on your example.com homeserver as #foo:example.com.) So a room doesn't belong to a server, it belongs to the federation, and anyone can join the room from any server (if the room is public, or if invited otherwise). You can create a room on server A and when a user from server B joins, the room will be replicated on server B as well. If server A fails, server B will keep relaying traffic to connected users and servers. A room is therefore not fundamentally addressed with the above alias; instead, it has an internal Matrix ID, which is basically a random string. It has a server name attached to it, but that was made just to avoid collisions. That can get a little confusing. For example, the #fractal:gnome.org room is an alias on the gnome.org server, but the room ID is !hwiGbsdSTZIwSRfybq:matrix.org. That's because the room was created on matrix.org, but the preferred branding is gnome.org now. As an aside, rooms, by default, live forever, even after the last user quits. There's an admin API to delete rooms and a tombstone event to redirect to another one, but neither has a GUI yet. The latter is part of MSC1501 ("Room version upgrades") which allows a room admin to close a room, with a message and a pointer to another room.
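To see the alias/ID distinction in action, here is a small sketch that resolves an alias to its internal room ID using the client-server directory endpoint (the client API base URL is a placeholder; depending on the server, the lookup may or may not require an access token):

import requests
from urllib.parse import quote

CLIENT_API = "https://matrix-client.example.com"   # placeholder client API base URL
alias = "#fractal:gnome.org"                       # the alias discussed above

# Resolve the human-readable alias to the room's internal, immutable ID.
resp = requests.get(
    f"{CLIENT_API}/_matrix/client/v3/directory/room/{quote(alias)}",
    timeout=10,
)
resp.raise_for_status()
print(resp.json())  # e.g. {"room_id": "!...:matrix.org", "servers": [...]}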

Spaces Discovering rooms can be tricky: there is a per-server room directory, but Matrix.org people are trying to deprecate it in favor of "Spaces". Room directories were ripe for abuse: anyone can create a room, so anyone can show up in there. It's possible to restrict who can add aliases, but anyways directories were seen as too limited. In contrast, a "Space" is basically a room that's an index of other rooms (including other spaces), so existing moderation and administration mechanisms that work in rooms can (somewhat) work in spaces as well. This enables a room directory that works across the federation, regardless of which server the rooms were originally created on. New users can be added to a space or room automatically in Synapse. (Existing users can be told about the space with a server notice.) This gives admins a way to pre-populate a list of rooms on a server, which is useful to build clusters of related home servers, providing some sort of redundancy, at the room -- not user -- level.

Home servers So while you can work around a home server going down at the room level, there's no such thing at the home server level, for user identities. So if you want those identities to be stable in the long term, you need to think about high availability. One limitation is that the domain name (e.g. matrix.example.com) must never change in the future, as renaming home servers is not supported. The documentation used to say you could "run a hot spare" but that has been removed. Last I heard, it was not possible to run a high-availability setup where multiple, separate locations could replace each other automatically. You can have high performance setups where the load gets distributed among workers, but those are based on a shared database (Redis and PostgreSQL) backend. So my guess is it would be possible to create a "warm" spare server of a Matrix home server with regular PostgreSQL replication, but that is not documented in the Synapse manual. This sort of setup would also not be useful to deal with networking issues or denial of service attacks, as you will not be able to spread the load over multiple network locations easily. Redis and PostgreSQL heroes are welcome to provide their multi-primary solution in the comments. In the meantime, I'll just point out this is a solution that's handled somewhat more gracefully in IRC, by having the possibility of delegating the authentication layer.

Delegations If you do not want to run a Matrix server yourself, it's possible to delegate the entire thing to another server. There's a server discovery API which uses the .well-known pattern (or SRV records, but that's "not recommended" and a bit confusing) to delegate that service to another server. Be warned that the server still needs to be explicitly configured for your domain. You can't just put:
  "m.server": "matrix.org:443"  
... on https://example.com/.well-known/matrix/server and start using @you:example.com as a Matrix ID. That's because Matrix doesn't support "virtual hosting" and you'd still be connecting to rooms and people with your matrix.org identity, not example.com as you would normally expect. This is also why you cannot rename your home server. The server discovery API is what allows servers to find each other. Clients, on the other hand, use the client-server discovery API: this is what allows a given client to find your home server when you type your Matrix ID on login.
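To make the two discovery mechanisms concrete, here is a small sketch that performs both lookups by hand (example.com is a placeholder; the paths and keys follow the published Matrix specification):

import requests

domain = "example.com"  # placeholder domain from a Matrix ID like @you:example.com

# Server-server discovery: other homeservers fetch this to learn where to
# send federation traffic for the domain.
resp = requests.get(f"https://{domain}/.well-known/matrix/server", timeout=10)
resp.raise_for_status()
print(resp.json().get("m.server"))  # e.g. "matrix.org:443"

# Client-server discovery: clients fetch this to find the client API base
# URL when you type @you:example.com at login.
resp = requests.get(f"https://{domain}/.well-known/matrix/client", timeout=10)
if resp.ok:
    print(resp.json().get("m.homeserver", {}).get("base_url"))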

Performance The high availability discussion brushed over the performance of Matrix itself, but let's now dig into that.

Horizontal scalability There were serious scalability issues with the main Matrix server, Synapse, in the past. So the Matrix team has been working hard to improve its design. Since Synapse 1.22, the home server can scale horizontally to multiple workers (see this blog post for details), which can make it easier to scale large servers.

Other implementations There are other promising home server implementations from a performance standpoint (dendrite, Golang, entered beta in late 2020; conduit, Rust, beta; others), but none of those are feature-complete, so there's a trade-off to be made there. Synapse is also adding a lot of features fast, so it's an open question whether the others will ever catch up. (I have heard that Dendrite might actually surpass Synapse in features within a few years, which would put Synapse in a more "LTS" situation.)

Latency Matrix can feel slow sometimes. For example, joining the "Matrix HQ" room in Element (from matrix.debian.social) takes a few minutes and then fails. That is because the home server has to sync the entire room state when you join the room. There was promising work on this announced in the lengthy 2021 retrospective, and some of that work landed (partial sync) in the 1.53 release already. Other improvements coming include sliding sync, lazy loading over federation, and fast room joins. So that's actually something that could be fixed in the fairly short term. But in general, communication in Matrix doesn't feel as "snappy" as on IRC or even Signal. It's hard to quantify this without instrumenting a full latency test bed (for example the tools I used in the terminal emulators latency tests), but even just typing in a web browser feels slower than typing in a xterm or Emacs for me. Even in conversations, I "feel" people don't immediately respond as fast. In fact, this could be an interesting double-blind experiment to make: have people guess whether they are talking to a person on Matrix, XMPP, or IRC, for example. My theory would be that people could notice that Matrix users are slower, if only because of the TCP round-trip time each message has to take.

Transport Some courageous person actually made some tests of various messaging platforms on a congested network. His evaluation was basically:
  • Briar: uses Tor, so unusable except locally
  • Matrix: "struggled to send and receive messages", joining a room takes forever as it has to sync all history, "took 20-30 seconds for my messages to be sent and another 20 seconds for further responses"
  • XMPP: "worked in real-time, full encryption, with nearly zero lag"
So that was interesting. I suspect IRC would have also fared better, but that's just a feeling. Other improvements to the transport layer include support for websocket and the CoAP proxy work from 2019 (targeting 100bps links), but both seem stalled at the time of writing. The Matrix people have also announced the pinecone p2p overlay network which aims at solving large, internet-scale routing problems. See also this talk at FOSDEM 2022.

Usability

Onboarding and workflow The workflow for joining a room, when you use Element web, is not great:
  1. click on a link in a web browser
  2. land on (say) https://matrix.to/#/#matrix-dev:matrix.org
  3. offers "Element", yeah that sounds great, let's click "Continue"
  4. land on https://app.element.io/#/room%2F%23matrix-dev%3Amatrix.org and then you need to register, aaargh
As you might have guessed by now, there is a specification to solve this, but web browsers need to adopt it as well, so that's far from actually being solved. At least browsers generally know about the matrix: scheme, it's just not exactly clear what they should do with it, especially when the handler is just another web page (e.g. Element web). In general, when compared with tools like Signal or WhatsApp, Matrix doesn't fare so well in terms of user discovery. I probably have some of my normal contacts that have a Matrix account as well, but there's really no way to know. It's kind of creepy when Signal tells you "this person is on Signal!" but it's also pretty cool that it works, and they actually implemented it pretty well. Registration is also less obvious: in Signal, the app confirms your phone number automatically. It's friction-less and quick. In Matrix, you need to learn about home servers, pick one, register (with a password! aargh!), and then setup encryption keys (not default), etc. It's a lot more friction. And look, I understand: giving away your phone number is a huge trade-off. I don't like it either. But it solves a real problem and makes encryption accessible to a ton more people. Matrix does have "identity servers" that can serve that purpose, but I don't feel confident sharing my phone number there. It doesn't help that the identity servers don't have private contact discovery: giving them your phone number is a more serious security compromise than with Signal. There's a catch-22 here too: because no one feels like giving away their phone numbers, no one does, and everyone assumes that stuff doesn't work anyways. Like it or not, Signal forcing people to divulge their phone number actually gives them critical mass that means actually a lot of my relatives are on Signal and I don't have to install crap like WhatsApp to talk with them.

5 minute clients evaluation Throughout all my tests I evaluated a handful of Matrix clients, mostly from Flathub because almost none of them are packaged in Debian. Right now I'm using Element, the flagship client from Matrix.org, in a web browser window, with the PopUp Window extension. This makes it look almost like a native app, and opens links in my main browser window (instead of a new tab in that separate window), which is nice. But I'm tired of buying memory to feed my web browser, so this indirection has to stop. Furthermore, I'm often getting completely logged off from Element, which means re-logging in, recovering my security keys, and reconfiguring my settings. That is extremely annoying. Coming from Irssi, Element is really "GUI-y" (pronounced "gooey"). Lots of clickety happening. To mark conversations as read, in particular, I need to click-click-click on all the tabs that have some activity. There's no "jump to latest message" or "mark all as read" functionality as far as I could tell. In Irssi the former is built-in (alt-a) and I made a custom /READ command for the latter:
/ALIAS READ script exec \$_->activity(0) for Irssi::windows
And yes, that's a Perl script in my IRC client. I am not aware of any Matrix client that does stuff like that, except maybe Weechat, if we can call it a Matrix client, or Irssi itself, now that it has a Matrix plugin (!). As for other clients, I have looked through the Matrix Client Matrix (confusing right?) to try to figure out which one to try, and, even after selecting Linux as a filter, the chart is just too wide to figure out anything. So I tried those, kind of randomly:
  • Fractal
  • Mirage
  • Nheko
  • Quaternion
Unfortunately, I lost my notes on those, I don't actually remember which one did what. I still have a session open with Mirage, so I guess that means it's the one I preferred, but I remember they were also all very GUI-y. Maybe I need to look at weechat-matrix or gomuks. At least Weechat is scriptable so I could continue playing the power-user. Right now my strategy with messaging (and that includes microblogging like Twitter or Mastodon) is that everything goes through my IRC client, so Weechat could actually fit well in there. Going with gomuks, on the other hand, would mean running it in parallel with Irssi or ... ditching IRC, which is a leap I'm not quite ready to take just yet. Oh, and basically none of those clients (except Nheko and Element) support VoIP, which is still kind of a second-class citizen in Matrix. It does not support large multimedia rooms, for example: Jitsi was used for FOSDEM instead of the native videoconferencing system.

Bots This falls a little outside the "usability" section, but I didn't know where else to put it... There's a few Matrix bots out there, and you are likely going to be able to replace your existing bots with Matrix bots. It's true that IRC has a long and impressive history with lots of various bots doing various things, but given how young Matrix is, there's still a good variety:
  • maubot: generic bot with tons of usual plugins like sed, dice, karma, xkcd, echo, rss, reminder, translate, react, exec, gitlab/github webhook receivers, weather, etc
  • opsdroid: framework to implement "chat ops" in Matrix, connects with Matrix, GitHub, GitLab, Shell commands, Slack, etc
  • matrix-nio: another framework, used to build lots more bots like:
    • hemppa: generic bot with various functionality like weather, RSS feeds, calendars, cron jobs, OpenStreetmaps lookups, URL title snarfing, wolfram alpha, astronomy pic of the day, Mastodon bridge, room bridging, oh dear
    • devops: ping, curl, etc
    • podbot: play podcast episodes from AntennaPod
    • cody: Python, Ruby, Javascript REPL
    • eno: generic bot, "personal assistant"
  • mjolnir: moderation bot
  • hookshot: bridge with GitLab/GitHub
  • matrix-monitor-bot: latency monitor
One thing I haven't found an equivalent for is Debian's MeetBot. There's an archive bot but it doesn't have topics or a meeting chair, or HTML logs.

Working on Matrix As a developer, I find Matrix kind of intimidating. The specification is huge. The official specification itself looks somewhat digestible: it's only 6 APIs so that looks, at first, kind of reasonable. But whenever you start asking complicated questions about Matrix, you quickly fall into the Matrix Spec Change specification (which, yes, is a separate specification). And there are literally hundreds of MSCs flying around. It's hard to tell what's been adopted and what hasn't, and even harder to figure out if your specific client has implemented it. (One trendy answer to this problem is to "rewrite it in rust": the Matrix people are working on implementing a lot of those specifications in a matrix-rust-sdk that's designed to take the implementation details away from users.) Just taking the latest weekly Matrix report, you find that three new MSCs were proposed just last week! There's even a graph that shows the number of MSCs is progressing steadily, at 600+ proposals total, with the majority (300+) "new". I would guess the "merged" ones are at about 150. That's a lot of text which includes stuff like 3D worlds which, frankly, I don't think you should be working on when you have such important security and usability problems. (The internet as a whole, arguably, doesn't fare much better. RFC600 is a really obscure discussion about "INTERFACING AN ILLINOIS PLASMA TERMINAL TO THE ARPANET". Maybe that's how many MSCs will end up as well, left forgotten in the pits of history.) And that's the thing: maybe the Matrix people have a different objective than I have. They want to connect everything to everything, and make Matrix a generic transport for all sorts of applications, including virtual reality, collaborative editors, and so on. I just want secure, simple messaging. Possibly with good file transfers, and video calls. That it works with existing stuff is good, and it should be federated to remove the "Signal point of failure". So I'm a bit worried about the direction all those MSCs are taking, especially when you consider that clients other than Element are still struggling to keep up with basic features like end-to-end encryption or room discovery, never mind voice or spaces...

Conclusion Overall, Matrix is somewhere in the space where XMPP was a few years ago. It has a ton of features, pretty good clients, and a large community. It seems to have gained some of the momentum that XMPP has lost. It may have the most potential to replace Signal if something bad were to happen to it (like, I don't know, getting banned or going nuts with cryptocurrency)... But it's really not there yet, and I don't see Matrix trying to get there either, which is a bit worrisome.

Looking back at history I'm also worried that we are repeating the errors of the past. The history of federated services is really fascinating: IRC, FTP, HTTP, and SMTP were all created in the early days of the internet, and are all still around (except, arguably, FTP, which was removed from major browsers recently). All of them had to face serious challenges in growing their federation. IRC had numerous conflicts and forks, both at the technical level and at the political level. The history of IRC is really something that anyone working on a federated system should study in detail, because they are bound to make the same mistakes if they are not familiar with it. The "short" version is:
  • 1988: Finnish researcher publishes first IRC source code
  • 1989: 40 servers worldwide, mostly universities
  • 1990: EFnet ("eris-free network") fork which blocks the "open relay", named Eris - followers of Eris form the A-net, which promptly dissolves itself, with only EFnet remaining
  • 1992: Undernet fork, which offered authentication ("services"), routing improvements and timestamp-based channel synchronisation
  • 1994: DALnet fork, from Undernet, again on a technical disagreement
  • 1995: Freenode founded
  • 1996: IRCnet forks from EFnet, following a flame war of historical proportion, splitting the network between Europe and the Americas
  • 1997: Quakenet founded
  • 1999: (XMPP founded)
  • 2001: 6 million users, OFTC founded
  • 2002: DALnet peaks at 136,000 users
  • 2003: IRC as a whole peaks at 10 million users, EFnet peaks at 141,000 users
  • 2004: (Facebook founded), Undernet peaks at 159,000 users
  • 2005: Quakenet peaks at 242,000 users, IRCnet peaks at 136,000 (Youtube founded)
  • 2006: (Twitter founded)
  • 2009: (WhatsApp, Pinterest founded)
  • 2010: (TextSecure AKA Signal, Instagram founded)
  • 2011: (Snapchat founded)
  • ~2013: Freenode peaks at ~100,000 users
  • 2016: IRCv3 standardisation effort started (TikTok founded)
  • 2021: Freenode self-destructs, Libera chat founded
  • 2022: Libera peaks at 50,000 users, OFTC peaks at 30,000 users
(The numbers were taken from the Wikipedia page and Netsplit.de. Note that I also include other networks' launches in parentheses for context.) Pretty dramatic, don't you think? Eventually, somehow, IRC became irrelevant for most people: few people are even aware of it now. With less than a million active users, it's smaller than Mastodon, XMPP, or Matrix at this point.1 If I were to venture a guess, I'd say that infighting, lack of a standardization body, and a somewhat annoying protocol meant the network could not grow. It's also possible that the decentralised yet centralised structure of IRC networks limited their reliability and growth. But large social media companies have also taken over the space: observe how IRC numbers peak around the time the wave of large social media companies emerges, especially Facebook (2.9B users!!) and Twitter (400M users).

Where the federated services are in history Right now, Matrix and Mastodon (and email!) are at the "pre-EFnet" stage: anyone can join the federation. Mastodon has started working on a global block list of fascist servers which is interesting, but it's still an open federation. Right now, Matrix is totally open, but matrix.org publishes a (federated) block list of hostile servers (#matrix-org-coc-bl:matrix.org, yes, of course it's a room). Interestingly, Email is also in that stage, where there are block lists of spammers, and it's a race between those blockers and spammers. Large email providers, obviously, are getting closer to the EFnet stage: you could consider they only accept email from themselves or between themselves. It's getting increasingly hard to deliver mail to Outlook and Gmail for example, partly because of bias against small providers, but also because they are including more and more machine-learning tools to sort through email and those systems are, fundamentally, unknowable. It's not quite the same as splitting the federation the way EFnet did, but the effect is similar. HTTP has somehow managed to live in a parallel universe, as it's technically still completely federated: anyone can start a web server if they have a public IP address and anyone can connect to it. The catch, of course, is how you find the darn thing. Which is how Google became one of the most powerful corporations on earth, and how they became the gatekeepers of human knowledge online. I have only briefly mentioned XMPP here, and my XMPP fans will undoubtedly comment on that, but I think it's somewhere in the middle of all of this. It was co-opted by Facebook and Google, and both corporations have abandoned it to its fate. I remember fondly the days where I could do instant messaging with my contacts who had a Gmail account. Those days are gone, and I don't talk to anyone over Jabber anymore, unfortunately. And this is a threat that Matrix still has to face. It's also the threat Email is currently facing. On the one hand corporations like Facebook want to completely destroy it and have mostly succeeded: many people just have an email account to register on things and talk to their friends over Instagram or (lately) TikTok (which, I know, is not Facebook, but they started that fire). On the other hand, you have corporations like Microsoft and Google who are still using and providing email services because, frankly, you still do need email for stuff, just like fax is still around, but they are more and more isolated in their own silos. At this point, it's only a matter of time before they reach critical mass and just decide that the risk of allowing external mail coming in is not worth the cost. They'll simply flip the switch and work on an allow-list principle. Then we'll have closed the loop and email will be dead, just like IRC is "dead" now. I wonder which path Matrix will take. Could it liberate us from these vicious cycles? Update: this generated some discussions on lobste.rs.

  1. According to Wikipedia, there are currently about 500 distinct IRC networks operating, on about 1,000 servers, serving over 250,000 users. In contrast, Mastodon seems to be around 5 million users, Matrix.org claimed at FOSDEM 2021 to have about 28 million globally visible accounts, and Signal lays claim to over 40 million souls. XMPP claims to have "millions" of users on the xmpp.org homepage but the FAQ says they don't actually know. On the proprietary silo side of the fence, this page says
    • Facebook: 2.9 billion users
    • WhatsApp: 2B
    • Instagram: 1.4B
    • TikTok: 1B
    • Snapchat: 500M
    • Pinterest: 480M
    • Twitter: 397M
    Notable omission from that list: YouTube, with its mind-boggling 2.6 billion users... Those are not the kind of numbers you just "need to convince a brother or sister" to grow the network...

15 June 2022

Edward Betts: Find link needs a rewrite, the visual editor broke it

Find link is a tool that I wrote for adding links between articles in Wikipedia. Given an article title, find link finds other articles that mention that title but contain no link to it. There is an option to edit the found articles and add the missing link. For example, you might want to find missing links to the gig economy article.
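As a rough illustration of the idea (not find link's actual code), the core query can be expressed against the public MediaWiki action API: search for the phrase, then filter out articles that already link to the target.

# Minimal illustrative sketch: find articles that mention a title but do not
# link to it, using the public MediaWiki action API. The filtering strategy
# here is an assumption for illustration, not find link's real implementation.
import requests

API = "https://en.wikipedia.org/w/api.php"

def search_mentions(title, limit=50):
    """Full-text search for articles containing the exact phrase."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": f'"{title}"',
        "srlimit": limit,
        "format": "json",
    }
    reply = requests.get(API, params=params, timeout=30)
    reply.raise_for_status()
    return [hit["title"] for hit in reply.json()["query"]["search"]]

def links_to(source, target):
    """Check whether the source article already links to the target article."""
    params = {
        "action": "query",
        "titles": source,
        "prop": "links",
        "pltitles": target,  # only report links pointing at the target
        "format": "json",
    }
    reply = requests.get(API, params=params, timeout=30)
    reply.raise_for_status()
    pages = reply.json()["query"]["pages"]
    return any("links" in page for page in pages.values())

def find_link_candidates(title):
    """Articles that mention the title but have no link to it yet."""
    return [t for t in search_mentions(title)
            if t != title and not links_to(t, title)]

for candidate in find_link_candidates("Gig economy"):
    print(candidate)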
I originally wrote the tool in 2008, when the MediaWiki software didn't have a rich-text editor. Wikipedia articles were edited by writing wiki markup in MediaWiki syntax. Since then MediaWiki has evolved and now has rich-text editing via the visual editor, so users don't need to know how to write wiki markup to modify an article. Within MediaWiki there is a user preference to disable the visual editor and stick with editing via the original wiki markup. Find link edits articles by taking the article text, adding the missing link, and sending the user to the changes view of the modified article on Wikipedia; if they're happy with the change, they hit save. This only works with the original editor; it doesn't work with the visual editor.
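Setting the editor question aside, the "adding the missing link" step itself is a small wikitext transformation. A simplified illustration follows; find link's real matching is presumably more careful about case, redirects, templates, and text that is already inside a link.

# Simplified illustration of the wikitext transformation: link the first
# occurrence of the phrase. This deliberately ignores the harder cases
# (existing links, templates, redirects) that a real tool must handle.
import re

def add_first_link(wikitext, phrase):
    """Wrap the first occurrence of `phrase` in a wikilink."""
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    match = pattern.search(wikitext)
    if match is None:
        return wikitext  # phrase not found, nothing to do
    start, end = match.span()
    return wikitext[:start] + "[[" + phrase + "|" + match.group(0) + "]]" + wikitext[end:]

print(add_first_link("Delivery riders are part of the gig economy.", "Gig economy"))
# -> Delivery riders are part of the [[Gig economy|gig economy]].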
English Wikipedia has had the visual editor enabled by default since 2016. For somebody to use find link, they first need to disable the visual editor in their Wikipedia preferences. Fixing this bug means quite a significant change to how the tool works. My plan is to rewrite find link to save edits directly, without needing to send the user to the Wikipedia changes view to make the edits. Users will authenticate with their Wikipedia account via OAuth and give permission for find link to edit articles on their behalf. Some of my other tools use OAuth for editing OpenStreetMap and Wikidata, so I'm confident about using it to edit Wikipedia. The source code for find link is on GitHub. I'll post updates here as I make progress on the rewrite.
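For the curious, here is a minimal sketch of what such an OAuth-authenticated save could look like, using requests-oauthlib to sign calls to the MediaWiki action API. The tokens and the article are placeholders, and this is an illustration rather than find link's actual implementation.

# Minimal sketch of saving an edit through the MediaWiki action API with
# OAuth 1.0a. The consumer/access tokens are placeholders obtained via the
# usual MediaWiki OAuth handshake; this is not the actual find link code.
import requests
from requests_oauthlib import OAuth1

API = "https://en.wikipedia.org/w/api.php"

auth = OAuth1(
    client_key="CONSUMER_KEY",              # placeholder
    client_secret="CONSUMER_SECRET",        # placeholder
    resource_owner_key="ACCESS_TOKEN",      # placeholder
    resource_owner_secret="ACCESS_SECRET",  # placeholder
)

def save_edit(title, new_wikitext, summary):
    # Fetch a CSRF token for the authenticated user.
    reply = requests.get(API, params={
        "action": "query",
        "meta": "tokens",
        "type": "csrf",
        "format": "json",
    }, auth=auth, timeout=30)
    reply.raise_for_status()
    token = reply.json()["query"]["tokens"]["csrftoken"]

    # Save the modified wikitext on the user's behalf.
    reply = requests.post(API, data={
        "action": "edit",
        "title": title,
        "text": new_wikitext,
        "summary": summary,
        "token": token,
        "format": "json",
    }, auth=auth, timeout=30)
    reply.raise_for_status()
    return reply.json()

# Hypothetical usage:
# save_edit("Some article", modified_wikitext, "Link [[gig economy]] (via find link)")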

11 June 2022

Louis-Philippe Véronneau: Updating a rooted Pixel 3a

A short while after getting a Pixel 3a, I decided to root it, mostly to have more control over the charging procedure. In order to preserve battery life, I like my phone to stop charging at around 75% of full battery capacity and to shut down automatically at around 12%. Some Android ROMs have extra settings to manage this, but LineageOS unfortunately does not. Android already comes with a fairly complex mechanism to handle the charge cycle, but it is mostly controlled by the kernel and cannot be easily configured by end users. acc is a higher-level "systemless" interface to the Android kernel's battery management, but one needs root to do anything interesting with it. Once rooted, you can use the AccA app instead of fiddling on the command line to fine-tune your battery settings. Sadly, having a rooted phone also means I need to re-root it each time there is an OS update (typically each week). Somehow, I keep forgetting the exact procedure to do this! Hopefully, I will be able to use this post as a reference in the future :) Note that these instructions might not apply to your exact phone model; proceed with caution!

Extract the boot.img file

This procedure mostly comes from the LineageOS documentation on extracting proprietary blobs from the payload.
  1. Download the latest LineageOS image for your phone.
  2. Unzip the image to get the payload.bin file inside it.
  3. Clone the LineageOS scripts git repository: $ git clone https://github.com/LineageOS/scripts
  4. Extract the boot image (requires python3-protobuf):
     $ mkdir extracted-payload
     $ python3 scripts/update-payload-extractor/extract.py payload.bin --output_dir extracted-payload
You should now have a boot.img file.

Patch the boot image file using Magisk
  1. Upload the boot.img file you previously extracted to your device.
  2. Open Magisk and patch the boot.img file.
  3. Download the patched file back on your computer.
Flash the patched boot image
  1. Enable ADB debug mode on your phone.
  2. Reboot into fastboot mode. $ adb reboot fastboot
  3. Flash the patched boot image file: $ fastboot flash boot magisk_patched-foo.img
  4. Disable ADB debug mode on your phone.
Troubleshooting

In an ideal world, you would do this entire process each time you upgrade to a new LineageOS version. Sadly, this creates friction and makes updating much more troublesome. To simplify things, you can try to flash an old patched boot.img file after upgrading, instead of generating it each time. In my experience, it usually works. When it does not, the device behaves weirdly after a reboot and things that require proprietary blobs (like WiFi) stop working. If that happens:
  1. Download the latest LineageOS version for your phone.
  2. Reboot into recovery (Power + Volume Down).
  3. Click on "Apply Updates"
  4. Sideload the ROM: $ adb sideload lineageos-foo.zip

9 June 2022

Enrico Zini: Updating cbqt for bullseye

Back in 2017 I did some work to set up a cross-building toolchain for Qt Creator that takes advantage of Debian's packaging for the whole dependency ecosystem. It ended with cbqt, a little script that sets up a chroot to hold cross-build dependencies, to avoid conflicting with packages on the host system, and sets up a qmake alternative to make use of them. Today I'm dusting off that work to ensure it works on Debian bullseye.

Resetting Qt Creator

To make things reproducible, I wanted to reset Qt Creator's configuration. Besides purging and reinstalling the package, one needs to manually remove:

Updating cbqt

Easy start: change the distribution for the chroot:
-DIST_CODENAME = "stretch"
+DIST_CODENAME = "bullseye"
Adding LIBDIR

Something else does not work:
Test$ qmake-armhf -makefile
Info: creating stash file  /Test/.qmake.stash
Test$ make
[...]
/usr/bin/arm-linux-gnueabihf-g++ -Wl,-O1 -Wl,-rpath-link, /armhf/lib/arm-linux-gnueabihf -Wl,-rpath-link, /armhf/usr/lib/arm-linux-gnueabihf -Wl,-rpath-link, /armhf/usr/lib/ -o Test main.o mainwindow.o moc_mainwindow.o    /armhf/usr/lib/arm-linux-gnueabihf/libQt5Widgets.so  /armhf/usr/lib/arm-linux-gnueabihf/libQt5Gui.so  /armhf/usr/lib/arm-linux-gnueabihf/libQt5Core.so -lGLESv2 -lpthread
/usr/lib/gcc-cross/arm-linux-gnueabihf/10/../../../../arm-linux-gnueabihf/bin/ld: cannot find -lGLESv2
collect2: error: ld returned 1 exit status
make: *** [Makefile:146: Test] Error 1
I figured that now I also need to set QMAKE_LIBDIR and not just QMAKE_RPATHLINKDIR:
--- a/cbqt
+++ b/cbqt
@@ -241,18 +241,21 @@ include(../common/linux.conf)
 include(../common/gcc-base-unix.conf)
 include(../common/g++-unix.conf)
+QMAKE_LIBDIR += {chroot.abspath}/lib/arm-linux-gnueabihf
+QMAKE_LIBDIR += {chroot.abspath}/usr/lib/arm-linux-gnueabihf
+QMAKE_LIBDIR += {chroot.abspath}/usr/lib/
 QMAKE_RPATHLINKDIR += {chroot.abspath}/lib/arm-linux-gnueabihf
 QMAKE_RPATHLINKDIR += {chroot.abspath}/usr/lib/arm-linux-gnueabihf
 QMAKE_RPATHLINKDIR += {chroot.abspath}/usr/lib/
Now it links again:
Test$ qmake-armhf -makefile
Test$ make
/usr/bin/arm-linux-gnueabihf-g++ -Wl,-O1 -Wl,-rpath-link, /armhf/lib/arm-linux-gnueabihf -Wl,-rpath-link, /armhf/usr/lib/arm-linux-gnueabihf -Wl,-rpath-link, /armhf/usr/lib/ -o Test main.o mainwindow.o moc_mainwindow.o   -L /armhf/lib/arm-linux-gnueabihf -L /armhf/usr/lib/arm-linux-gnueabihf -L /armhf/usr/lib/  /armhf/usr/lib/arm-linux-gnueabihf/libQt5Widgets.so  /armhf/usr/lib/arm-linux-gnueabihf/libQt5Gui.so  /armhf/usr/lib/arm-linux-gnueabihf/libQt5Core.so -lGLESv2 -lpthread
Making it work in Qt Creator

Time to try it in Qt Creator, and sadly it fails:
 /armhf/usr/lib/arm-linux-gnueabihf/qt5/mkspecs/features/toolchain.prf:76: Variable QMAKE_CXX.COMPILER_MACROS is not defined.
QMAKE_CXX.COMPILER_MACROS is not defined

I traced it to this bit in armhf/usr/lib/arm-linux-gnueabihf/qt5/mkspecs/features/toolchain.prf (non-relevant bits deleted):
isEmpty($${target_prefix}.COMPILER_MACROS) {
    msvc {
        # ...
    } else: gcc|ghs {
        vars = $$qtVariablesFromGCC($$QMAKE_CXX)
    }
    for (v, vars) {
        # ...
        $${target_prefix}.COMPILER_MACROS += $$v
    }
    cache($${target_prefix}.COMPILER_MACROS, set stash)
} else {
    # ...
}
It turns out that qmake is not able to realise that the compiler is gcc, so vars does not get set, nothing is set in COMPILER_MACROS, and qmake fails.

Reproducing it on the command line

When run manually, however, qmake-armhf worked, so it would be good to know how Qt Creator is actually running qmake. Since it frustratingly does not show what commands it runs, I'll have to strace it:
strace -e trace=execve --string-limit=123456 -o qtcreator.trace -f qtcreator
And there it is:
$ grep qmake- qtcreator.trace
1015841 execve("/usr/local/bin/qmake-armhf", ["/usr/local/bin/qmake-armhf", "-query"], 0x56096e923040 /* 54 vars */) = 0
1015865 execve("/usr/local/bin/qmake-armhf", ["/usr/local/bin/qmake-armhf", " /Test/Test.pro", "-spec", "arm-linux-gnueabihf", "CONFIG+=debug", "CONFIG+=qml_debug"], 0x7f5cb4023e20 /* 55 vars */) = 0
I run the command manually and indeed I reproduce the problem:
$ /usr/local/bin/qmake-armhf Test.pro -spec arm-linux-gnueabihf CONFIG+=debug CONFIG+=qml_debug
 /armhf/usr/lib/arm-linux-gnueabihf/qt5/mkspecs/features/toolchain.prf:76: Variable QMAKE_CXX.COMPILER_MACROS is not defined.
I try removing options until I find the one that breaks it and... now it's always broken! Even manually running qmake-armhf, like I did earlier, stopped working:
$ rm .qmake.stash
$ qmake-armhf -makefile
 /armhf/usr/lib/arm-linux-gnueabihf/qt5/mkspecs/features/toolchain.prf:76: Variable QMAKE_CXX.COMPILER_MACROS is not defined.
Debugging toolchain.prf

I tried purging and reinstalling qtcreator, and recreating the chroot, but qmake-armhf stayed broken. I'll let that be, and try to debug toolchain.prf. By grepping for gcc in the mkspecs directory, I managed to figure out that:

Sadly, I failed to find reference documentation for QMAKE_COMPILER's syntax and behaviour. I also failed to find why qmake-armhf worked earlier, and I am also failing to restore the system to a situation where it works again. Maybe I dreamt that it worked? Did I have some manual change lying around from some previous fiddling with things? Anyway, at least now I have the fix:
--- a/cbqt
+++ b/cbqt
@@ -248,7 +248,7 @@ QMAKE_RPATHLINKDIR += {chroot.abspath}/lib/arm-linux-gnueabihf
 QMAKE_RPATHLINKDIR += {chroot.abspath}/usr/lib/arm-linux-gnueabihf
 QMAKE_RPATHLINKDIR += {chroot.abspath}/usr/lib/
-QMAKE_COMPILER          = {chroot.arch_triplet}-gcc
+QMAKE_COMPILER          = gcc {chroot.arch_triplet}-gcc
 QMAKE_CC                = /usr/bin/{chroot.arch_triplet}-gcc
Fixing a compiler mismatch warning

In setting up the kit, Qt Creator also complained that the compiler from qmake did not match the one configured in the kit. That was easy to fix by pointing at the host system's cross-compiler in qmake.conf:
 QMAKE_COMPILER          = {chroot.arch_triplet}-gcc
-QMAKE_CC                = {chroot.arch_triplet}-gcc
+QMAKE_CC                = /usr/bin/{chroot.arch_triplet}-gcc
 QMAKE_LINK_C            = $$QMAKE_CC
 QMAKE_LINK_C_SHLIB      = $$QMAKE_CC
-QMAKE_CXX               = {chroot.arch_triplet}-g++
+QMAKE_CXX               = /usr/bin/{chroot.arch_triplet}-g++
 QMAKE_LINK              = $$QMAKE_CXX
 QMAKE_LINK_SHLIB        = $$QMAKE_CXX
Updated setup instructions

Create an armhf environment:
sudo cbqt ./armhf --create --verbose
Create a qmake wrapper that builds with this environment:
sudo ./cbqt ./armhf --qmake -o /usr/local/bin/qmake-armhf
Install the build-dependencies that you need:
# Note: :arch is added automatically to package names if no arch is explicitly specified
sudo ./cbqt ./armhf --install libqt5svg5-dev libmosquittopp-dev qtwebengine5-dev
Build with qmake

Use qmake-armhf instead of qmake, and it works perfectly:
qmake-armhf -makefile
make
Set up Qt Creator

Configure a new Kit in Qt Creator:
  1. Tools/Options, then Kits, then Add
  2. Name: armhf (or anything you like)
  3. In the Qt Versions tab, click Add then set the path of the new Qt to /usr/local/bin/qmake-armhf. Click Apply.
  4. Back in the Kits, select the Qt version you just created in the Qt version field
  5. In Compilers, select the ARM versions of GCC. If they do not appear, install crossbuild-essential-armhf, then in the Compilers tab click Re-detect and then Apply to make them available for selection
  6. Dismiss the dialog with "OK": the new kit is ready
Now you can choose the default kit to build and run locally, and the armhf kit for remote cross-development. I tried looking at sdktool to automate this step, and it requires a nontrivial amount of work to do it reliably, so these manual instructions will have to do.

Credits

This has been done as part of my work with Truelite.

3 June 2022

Raphaël Hertzog: Freexian's report about Debian Long Term Support, April 2022

Like each month, have a look at the work funded by Freexian's Debian LTS offering.

Debian project funding

Two projects are currently in the pipeline: Gradle enterprise and the Tryton update. Progress is quite slow on the Gradle one; there are technical difficulties. The Tryton one was stalled because the developer did not have enough time, but it seems to have been progressing smoothly in the last few weeks. In April, we put aside 2635 EUR to fund Debian projects. We're looking forward to receiving more projects from various Debian teams! Learn more about the rationale behind this initiative in this article.

Debian LTS contributors

In April, 11 contributors were paid to work on Debian LTS; their reports are available:

Evolution of the situation

In April we released 30 DLAs, and we were glad to welcome a new customer with Alter Way. The security tracker currently lists 72 packages with a known CVE, and the dla-needed.txt file has 71 packages needing an update. It is worth pointing out that we are getting close to the end of the LTS period for Debian 9. After June 30th, no new security updates will be made available on security.debian.org. But Freexian and its team of paid Debian contributors will continue to maintain Debian 9 going forward for the customers of the Extended LTS offer. If you have Debian 9 servers to keep secure, it's time to subscribe! You might not have noticed, but Freexian formalized a mission statement where we explain that our purpose is to help improve Debian. For this, we want to fund work time for the Debian developers that recently joined Freexian as collaborators. The Extended LTS and PHP LTS offers are built following a model that will help us achieve this if we manage to have enough customers for those offers. So consider subscribing: you help your organization, but you also help Debian!

Thanks to our sponsors

Sponsors that joined recently are in bold.

1 June 2022

Russ Allbery: Review: The Seeress of Kell

Review: The Seeress of Kell, by David Eddings
Series: The Malloreon #5
Publisher: Del Rey
Copyright: May 1991
Printing: May 1992
ISBN: 0-345-37759-1
Format: Mass market
Pages: 374
The Seeress of Kell is the conclusion of the five-book Malloreon series and a direct sequel to Sorceress of Darshiva. You do not want to begin the series here (or, to be honest, at all). We have finally finished the relaxed tour of Mallorea, the second continent of Eddings's remarkably small two-continent world. The heroes have gathered all of their required companions and are headed for Kell, where the seeress Cyradis awaits. From there, they and the new Child of Dark must find their way to the Place Which Is No More for the final confrontation. By "find," I mean please remain seated with your hands, arms, feet, and legs inside the vehicle. The protagonists have about as much to do with the conclusion of this series as the passengers of a roller coaster have control over its steering. I am laughing at my younger self, who quite enjoyed this series (although as I recall found it a bit repetitive) and compared it favorably to the earlier Belgariad series. My memory kept telling me that the conclusion of the series was lots of fun. Reader, it was not. It was hilariously bad. Both of Eddings's first two series, but particularly this one, take place in a fantasy world full of true prophecy. The conceit of the Malloreon in particular (this is a minor spoiler for the early books, but not one that I think interferes with enjoyment) is that there are two competing prophecies that agree on most events but are in conflict over a critical outcome. True prophecy creates an agency problem: why have protagonists if everything they do is fixed in prophecy? The normal way to avoid that is to make the prophecy sufficiently confusing and the mechanism by which it comes true sufficiently subtle that everyone has to act as if there is no prophecy, thus reducing the role of the prophecy to foreshadowing and a game the author plays with the reader. What makes the Malloreon interesting (and I mean this sincerely) is that Eddings instead leans into the idea of a prophecy as an active agent leading the protagonists around by the nose. As a meta-story commentary on fantasy stories, this can be quite entertaining, and it helps that the prophecy appears as a likable character of sorts in the book. The trap that Eddings had mostly avoided before now is that this structure can make the choices of the protagonists entirely pointless. In The Seeress of Kell, he dives head-first into the trap and then pulls it shut behind him. The worst part is Ce'Nedra, who once again spends an entire book either carping at Garion in ways that are supposed to be endearing (but aren't) or being actively useless. The low point is when she is manipulated into betraying the heroes, costing them a significant advantage. We're then told that, rather than being a horrific disaster, this is her important and vital role in the story, and indeed the whole reason why she was in the story at all. The heroes were too far ahead of the villains and were in danger of causing the prophecy to fail. At that point, one might reasonably ask why one is bothering reading a novel instead of a summary of the invented history that Eddings is going to tell whether his characters cooperate or not. The whole middle section of the book is like this: nothing any of the characters do matters because everything is explicitly destined. That includes an extended series of interludes following the other main characters from the Belgariad, who are racing to catch up with the main party but who will turn out to have no role of significance whatsoever. 
I wouldn't mind this as much if the prophecy were more active in the story, given that it's the actual protagonist. But it mostly disappears. Instead, the characters blunder around doing whatever seems like a good idea at the time, while Cyradis acts like a bizarre sort of referee with a Calvinball rule set, and every random action turns out to be the fulfillment of prophecy in the most ham-handed possible way. Zandramas, meanwhile, is trying to break the prophecy, which would have been a moderately interesting story hook if anyone (Eddings included) thought she were potentially capable of doing so. Since no one truly believes there's any peril, this turns into a series of pointless battles the reader has no reason to care about. All of this sets up what has been advertised since the start of the series as a decision between good and evil. Now, at the last minute, Eddings (through various character mouthpieces) tries to claim that the decision is not actually between good and evil, but is somehow beyond morality. No one believes this, including the narrator and the reader, making all of the philosophizing a tedious exercise in page-turning. To pull off a contention like that, the author has to lay some sort of foundation to allow the reader to see the supposed villain in multiple lights. Eddings does none of that, instead emphasizing how evil she is at every opportunity. On top of that, this supposed free choice on which the entire universe rests and toward which all of history has pointed depends on someone with astonishing conflicts of interest. While the book is going on about how carefully the prophecy is ensuring that everyone is in the right place at the right time so that no side has an advantage, one side is accruing an absurdly powerful advantage. And the characters don't even seem to realize it! The less said about the climax, the better. Unsurprisingly, it was completely predictable. Also, while I am complaining, I could never get past how this entire series starts off with and revolves around an incredibly traumatic and ongoing event that has no impact whatsoever on the person to whom the trauma happens. Other people are intermittently upset or sad, but not only is that person not harmed, they act, at the end of this book, as if the entire series had never happened. There is one bright spot in this book, and ironically it's the one plot element that Eddings didn't make blatantly obvious in advance, and that I therefore don't want to spoil. All I'll say is that one of the companions the heroes pick up along the way turns out to be my favorite character of the series, plays a significant role in the interpersonal dynamics between the heroes, and steals every scene that she's in by being more sensible than any of the other characters in the story. Her story, and backstory, is emotional and moving and is the best part of this book. Otherwise, not only is the plot a mess and the story structure a failure, but this is also Eddings at his most sexist and socially conservative. There is an extended epilogue after the plot resolution that serves primarily as a showcase of stereotypes: baffled men having their habits and preferences rewritten by their wives, cast-iron gender roles inside marriage, cringeworthy jokes, and of course loads and loads of children because that obviously should be everyone's happily ever after.
All of this happens to the characters rather than being planned or actively desired, continuing the theme of prophecy and lack of agency, although of course they're all happy about it (shown mostly via grumbling). One could write an entire academic paper on the tension between this series and the concept of consent. There were bits of the Malloreon that I enjoyed, but they were generally in spite of the plot rather than because of it. I do like several of Eddings's characters, and in places I liked the lack of urgency and the sense of safety. But I think endings still have to deliver some twist or punch or, at the very least, some clear need for the protagonists to take an action other than stand in the right room at the right time. Eddings probably tried to supply that (I can make a few guesses about where), but it failed miserably for me, making this the worst book of the series. Unless like me you're revisiting this out of curiosity for your teenage reading habits (and even then, consider not), avoid. Rating: 3 out of 10

31 May 2022

Russ Allbery: Review: Maskerade

Review: Maskerade, by Terry Pratchett
Series: Discworld #18
Publisher: Harper
Copyright: 1995
Printing: February 2014
ISBN: 0-06-227552-6
Format: Mass market
Pages: 360
Maskerade is the 18th book of the Discworld series, but you probably could start here. You'd miss the introduction of Granny Weatherwax and Nanny Ogg, which might be a bit confusing, but I suspect you could pick it up as you went if you wanted. This is a sequel of sorts to Lords and Ladies, but not in a very immediate sense. Granny is getting distracted and less interested in day-to-day witching in Lancre. This is not good; Granny is incredibly powerful, and bored and distracted witches can go to dark places. Nanny is concerned. Granny needs something to do, and their coven needs a third. It's not been the same since they lost their maiden member. Nanny's solution to this problem is two-pronged. First, they'd had their eye on a local girl named Agnes, who had magic but who wasn't interested in being a witch. Perhaps it was time to recruit her anyway, even though she'd left Lancre for Ankh-Morpork. And second, Granny needs something to light a fire under her, something that will get her outraged and ready to engage with the world. Something like a cookbook of aphrodisiac recipes attributed to the Witch of Lancre. Agnes, meanwhile, is auditioning for the opera. She's a sensible person, cursed her whole life by having a wonderful personality, but a part of her deep inside wants to be called Perdita X. Dream and have a dramatic life. Having a wonderful personality can be very frustrating, but no one in Lancre took either that desire or her name seriously. Perhaps the opera is somewhere where she can find the life she's looking for, along with another opportunity to try on the Perdita name. One thing she can do is sing; that's where all of her magic went. The Ankh-Morpork opera is indeed dramatic. It's also losing an astounding amount of money for its new owner, who foolishly thought owning an opera would be a good retirement project after running a cheese business. And it's haunted by a ghost, a very tangible ghost who has started killing people. I think this is my favorite Discworld novel to date (although with a caveat about the ending that I'll get to in a moment). It's certainly the one that had me laughing out loud the most. Agnes (including her Perdita personality aspect) shot to the top of my list of favorite Discworld characters, in part because I found her sensible personality so utterly relatable. She is fascinated by drama, she wants to be in the middle of it and let her inner Perdita goth character revel in it, and yet she cannot help being practical and unflappable even when surrounded by people who use far too many exclamation points. It's one thing to want drama in the abstract; it's quite another to be heedlessly dramatic in the moment, when there's an obviously reasonable thing to do instead. Pratchett writes this wonderfully. The other half of the story follows Granny and Nanny, who are unstoppable forces of nature and a wonderful team. They have the sort of long-standing, unshakable adult friendship between very unlike people that's full of banter and minor irritations layered on top of a deep mutual understanding and respect. Once they decide to start investigating this supposed opera ghost, they divvy up the investigative work with hardly a word exchanged. Planning isn't necessary; they both know each other's strengths. We've gotten a lot of Granny's skills in previous books. Maskerade gives Nanny a chance to show off her skills, and it's a delight. 
She effortlessly becomes the sort of friendly grandmother who blends in so well that no one questions why she's there, and thus manages to be in the middle of every important event. Granny watches and thinks and theorizes; Nanny simply gets into the middle of everything and talks to everyone until people tell her what she wants to know. There's no real doubt that the two of them are going to get to the bottom of anything they want to get to the bottom of, but watching how they get there is a delight. I love how Pratchett handles that sort of magical power from a world-building perspective. Ankh-Morpork is the Big City, the center of political power in most of the Discworld books, and Granny and Nanny are from the boondocks. By convention, that means they should either be awed or confused by the city, or gain power in the city by transforming it in some way to match their area of power. This isn't how Pratchett writes witches at all. Their magic is in understanding people, and the people in Ankh-Morpork are just as much people as the people in Lancre. The differences of the city may warrant an occasional grumpy aside, but the witches are fully as capable of navigating the city as they are their home town. Maskerade is, of course, a parody of opera and musicals, with Phantom of the Opera playing the central role in much the same way that Macbeth did in Wyrd Sisters. Agnes ends up doing the singing for a beautiful, thin actress named Christine, who can't sing at all despite being an opera star, uses a truly astonishing excess of exclamation points, and strategically faints at the first sign of danger. (And, despite all of this, is still likable in that way that it's impossible to be really upset at a puppy.) She is the special chosen focus of the ghost, whose murderous taunting is a direct parody of the Phantom. That was a sufficiently obvious reference that even I picked up on it, despite being familiar with Phantom of the Opera only via the soundtrack. Apart from that, though, the references were lost on me, since I'm neither a musical nor an opera fan. That didn't hurt my enjoyment of the book in the slightest; in fact, I suspect it's part of why it's in my top tier of Discworld books. One of my complaints about Discworld to date is that Pratchett often overdoes the parody to the extent that it gets in the way of his own (excellent) characters and story. Maybe it's better to read Discworld novels where one doesn't recognize the material being parodied and thus doesn't keep getting distracted by references. It's probably worth mentioning that Agnes is a large woman and there are several jokes about her weight in Maskerade. I think they're the good sort of jokes, about how absurd human bodies can be, not the mean sort? Pratchett never implies her weight is any sort of moral failing or something she should change; quite the contrary, Nanny considers it a sign of solid Lancre genes. But there is some fat discrimination in the opera itself, since one of the things Pratchett is commenting on is the switch from full-bodied female opera singers to thin actresses matching an idealized beauty standard. Christine is the latter, but she can't sing, and the solution is for Agnes to sing for her from behind, something that was also done in real opera. I'm not a good judge of how well this plot line was handled; be aware, going in, if this may bother you. 
What did bother me was the ending, and more generally the degree to which Granny and Nanny felt comfortable making decisions about Agnes's life without consulting her or appearing to care what she thought of their conclusions. Pratchett seemed to be on their side, emphasizing how well they know people. But Agnes left Lancre and avoided the witches for a reason, and that reason is not honored in much the same way that Lancre refused to honor her desire to go by Perdita. This doesn't seem to be malicious, and Agnes herself is a little uncertain about her choice of identity, but it still rubbed me the wrong way. I felt like Agnes got steamrolled by both the other characters and by Pratchett, and it's the one thing about this book that I didn't like. Hopefully future Discworld books about these characters revisit Agnes's agency. Overall, though, this was great, and a huge improvement over Interesting Times. I'm excited for the next witches book. Followed in publication order by Feet of Clay, and later by Carpe Jugulum in the thematic sense. Rating: 8 out of 10
