Search Results: "radu"

12 July 2025

Reproducible Builds: Reproducible Builds in June 2025

Welcome to the 6th report from the Reproducible Builds project in 2025. Our monthly reports outline what we've been up to over the past month, and highlight items of news from elsewhere in the increasingly-important area of software supply-chain security. If you are interested in contributing to the Reproducible Builds project, please see the Contribute page on our website. In this report:
  1. Reproducible Builds at FOSSY 2025
  2. Distribution work
  3. diffoscope
  4. OSS Rebuild updates
  5. Website updates
  6. Upstream patches
  7. Reproducibility testing framework

Reproducible Builds at FOSSY 2025

On Saturday 2nd August, Vagrant Cascadian and Chris Lamb will be presenting at this year's FOSSY 2025. Their talk, titled Never Mind the Checkboxes, Here's Reproducible Builds!, is being introduced as follows:
There are numerous policy compliance and regulatory processes being developed that target software development, but do they solve actual problems? Does it improve the quality of software? Do Software Bill of Materials (SBOMs) actually give you the information necessary to verify how a given software artifact was built? What is the goal of all these compliance checklists anyways, or more importantly, what should the goals be? If a software object is signed, who should be trusted to sign it, and can they be trusted forever?
The talk will introduce the audience to Reproducible Builds as a set of best practices which allow users and developers to verify that software artifacts were built from the source code, but also allows auditing for license compliance, providing security benefits, and removes the need to trust arbitrary software vendors. Hosted by the Software Freedom Conservancy and taking place in Portland, Oregon, USA, FOSSY aims to be a community-focused event: "Whether you are a long time contributing member of a free software project, a recent graduate of a coding bootcamp or university, or just have an interest in the possibilities that free and open source software bring, FOSSY will have something for you". More information on the event is available on the FOSSY 2025 website, including the full programme schedule. Vagrant and Chris will also be staffing a table this year, where they will be available to answer any questions about Reproducible Builds and discuss collaborations with other projects.

Distribution work

In Debian this month:
  • Holger Levsen has discovered that it is now possible to bootstrap a minimal Debian trixie using 100% reproducible packages. This result can itself be reproduced, using the debian-repro-status tool and mmdebstrap's support for hooks:
      $ mmdebstrap --variant=apt --include=debian-repro-status \
           --chrooted-customize-hook=debian-repro-status \
           trixie /dev/null 2>&1 | grep "Your system has"
       INFO  debian-repro-status > Your system has 100.00% been reproduced.
    
  • On our mailing list this month, Helmut Grohne wrote an extensive message raising an issue related to Uploads with conflicting buildinfo filenames:
    Having several .buildinfo files for the same architecture is something that we plausibly want to have eventually. Imagine running two sets of buildds and assembling a single upload containing buildinfo files from both buildds in the same upload. In a similar vein, as a developer I may want to supply several .buildinfo files with my source upload (e.g. for multiple architectures). Doing any of this is incompatible with current incoming processing and with reprepro.
  • 5 reviews of Debian packages were added, 4 were updated and 8 were removed this month, adding to our ever-growing knowledge about identified issues.

In GNU Guix, Timothee Mathieu reported that a long-standing issue with reproducibility of shell containers across different host operating systems has been solved. In their message, Timothee mentions:
I discovered that pytorch (and maybe other dependencies) has a reproducibility problem of order 1e-5 when on AVX512 compared to AVX2. I first tried to solve the problem by disabling AVX512 at the level of pytorch, but it did not work. The dev of pytorch said that it may be because some components dispatch computation to MKL-DNN, I tried to disable AVX512 on MKL, and still the results were not reproducible, I also tried to deactivate in openmpi without success. I finally concluded that there was a problem with AVX512 somewhere in the dependencies graph but I gave up identifying where, as this seems very complicated.

The IzzyOnDroid Android APK repository made more progress in June. Not only have they just passed 48% reproducibility coverage, but Ben has also started making their reproducible builds more visible by offering rbtlog shields, a kind of badge that has been quickly picked up by many developers who are proud to present their applications' reproducibility status.
Lastly, in openSUSE news, Bernhard M. Wiedemann posted another monthly update for their work there.

diffoscope

diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading versions 298, 299 and 300 to Debian:
  • Add python3-defusedxml to the Build-Depends in order to include it in the Docker image.
  • Handle the RPM format's HEADERSIGNATURES and HEADERIMMUTABLE as a special-case to avoid unnecessarily large diffs. Thanks to Daniel Duan for the report and suggestion.
  • Update copyright years.
In addition, @puer-robustus fixed a regression introduced in an earlier commit which resulted in some differences being lost. Lastly, Vagrant Cascadian updated diffoscope in GNU Guix to versions 299 and 300.

OSS Rebuild updates

OSS Rebuild has added a new network analyzer that provides transparent HTTP(S) interception during builds, capturing all network traffic to monitor external dependencies and identify suspicious behavior, even in unmodified maintainer-controlled build processes. The text-based user interface now features automated failure clustering that can group similar rebuild failures and provides natural language failure summaries, making it easier to identify and understand patterns across large numbers of build failures. OSS Rebuild has also improved the local development experience with a unified interface for build execution strategies, allowing for more extensible environment setup for build execution. The team also designed a new website and logo.

Website updates

Once again, there were a number of improvements made to our website this month, including:
  • Arnaud Brousseau added Stage , a new Linux distribution, to our Tools page.
  • Chris Lamb improved the Docker instructions on the diffoscope website.


Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches.

Reproducibility testing framework

The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In June, however, a number of changes were made by Holger Levsen, including:
  • reproduce.debian.net-related:
    • Installed and deployed rebuilderd version 0.24 from Debian unstable in order to make use of the new compression feature added by Jarl Gullberg for the database. This resulted in a massive decrease in the size of the SQLite databases:
      • 79G → 2.8G (all)
      • 84G → 3.2G (amd64)
      • 75G → 2.9G (arm64)
      • 45G → 2.1G (armel)
      • 48G → 2.2G (armhf)
      • 73G → 2.8G (i386)
      • 72G → 2.7G (ppc64el)
      • 45G → 2.1G (riscv64)
      for a combined saving from 521G → 20.8G. This naturally reduces the requirements to run an independent rebuilderd instance and will permit us to add more Debian suites as well.
    • During migration to the latest version of rebuilderd, make sure several services are not started.
    • Actually run rebuilderd from /usr/bin.
    • Raise temperatures for NVME devices on some riscv64 nodes that should be ignored.
    • Use a 64KB kernel page size on the ppc64el architecture (see #1106757).
    • Improve ordering of some failed to reproduce statistics.
    • Detect a number of potential causes of build failures within the statistics.
    • Add support for manually scheduling for the any architecture.
  • Misc:
    • Update the Codethink nodes as there are now many kernels installed.
    • Install linux-sysctl-defaults on Debian trixie systems as we need ping functionality.
    • Limit the fs.nr_open kernel tunable.
    • Stop submitting results to deprecated buildinfo.debian.net service.
In addition, Jochen Sprickerhof greatly improved the statistics and the logging functionality, including adapting to the new database format of rebuilderd version 0.24.0 and temporarily increasing the maximum log size in order to debug a nettlesome build. Jochen also dropped the CPUSchedulingPolicy=idle systemd flag on the workers.
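The combined figure quoted for the database shrinkage can be checked with a few lines of Python. This is only a sketch of the arithmetic: the sizes are copied from the per-architecture list in this section, while the `sizes` dictionary and the `summarize` helper are names made up for the illustration.

```python
# Database sizes (in "G", as reported) before and after compression,
# copied from the per-architecture list in this section.
sizes = {
    "all": (79, 2.8),
    "amd64": (84, 3.2),
    "arm64": (75, 2.9),
    "armel": (45, 2.1),
    "armhf": (48, 2.2),
    "i386": (73, 2.8),
    "ppc64el": (72, 2.7),
    "riscv64": (45, 2.1),
}


def summarize(sizes):
    """Return (total_before, total_after, percent_saved)."""
    before = sum(b for b, _ in sizes.values())
    # Round to one decimal to hide floating-point noise in the sum.
    after = round(sum(a for _, a in sizes.values()), 1)
    saved = round(100 * (1 - after / before), 1)
    return before, after, saved


before, after, saved = summarize(sizes)
print(f"{before}G -> {after}G ({saved}% smaller)")
```

The totals agree with the 521G and 20.8G quoted in the report, which works out to roughly a 25-fold reduction.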

Finally, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

11 June 2025

Gunnar Wolf: Understanding Misunderstandings - Evaluating LLMs on Networking Questions

This post is a review for Computing Reviews for Understanding Misunderstandings - Evaluating LLMs on Networking Questions, an article published in Association for Computing Machinery (ACM), SIGCOMM Computer Communication Review.
Large language models (LLMs) have awed the world, emerging as the fastest-growing application of all time: ChatGPT reached 100 million active users in January 2023, just two months after its launch. After an initial cycle, they have gradually been mostly accepted and incorporated into various workflows, and their basic mechanics are no longer beyond the understanding of people with moderate computer literacy. Now, given that the technology is better understood, we face the question of how convenient LLM chatbots are for different occupations. This paper embarks on the question of whether LLMs can be useful for networking applications. This paper systematizes querying three popular LLMs (GPT-3.5, GPT-4, and Claude 3) with questions taken from several network management online courses and certifications, and presents a taxonomy of six axes along which the incorrect responses were classified. The authors also measure four strategies toward improving answers. The authors observe that, while some of those strategies were marginally useful, they sometimes resulted in degraded performance. The authors queried the commercially available instances of Gemini and GPT, which achieved scores over 90 percent for basic subjects but fared notably worse in topics that require understanding and converting between different numeric notations, such as working with Internet protocol (IP) addresses, even if they are trivial (that is, presenting the subnet mask for a given network address expressed as the typical IPv4 dotted-quad representation). As a last item in the paper, the authors compare performance with three popular open-source models: Llama3.1, Gemma2, and Mistral with their default settings. Although those models are almost 20 times smaller than the GPT-3.5 commercial model used, they reached comparable performance levels. Sadly, the paper does not delve deeper into these models, which can be deployed locally and adapted to specific scenarios.
The paper is easy to read and does not require deep mathematical or AI-related knowledge. It presents a clear comparison along the described axes for the 503 multiple-choice questions presented. This paper can be used as a guide for structuring similar studies over different fields.

31 May 2025

Antoine Beaupré: Traffic meter per ASN without logs

Have you ever found yourself in the situation where you had no or anonymized logs and still wanted to figure out where your traffic was coming from? Or you have multiple upstreams and are looking to see if you can save fees by getting into peering agreements with some other party? Or your site is getting heavy load but you can't pinpoint it on a single IP and you suspect some amoral corporation is training their degenerate AI on your content with a bot army? (You might be getting onto something there.) If that rings a bell, read on.

TL;DR: ... or just skip the cruft and install asncounter:
pip install asncounter
Also available in Debian 14 or later, or possibly in Debian 13 backports (soon to be released) if people are interested:
apt install asncounter
Then count whoever is hitting your network with:
awk '{print $2}' /var/log/apache2/*access*.log | asncounter
or:
tail -F /var/log/apache2/*access*.log | awk '{print $2}' | asncounter
or:
tcpdump -q -n | asncounter --input-format=tcpdump --repl
or:
tcpdump -q -i eth0 -n -Q in "tcp and tcp[tcpflags] & tcp-syn != 0 and (port 80 or port 443)" | asncounter --input-format=tcpdump --repl
Read on for why this matters, and why I wrote yet another weird tool (almost) from scratch.

Background and manual work

This is a tool I've been dreaming of for a long, long time. Back in 2006, at Koumbit a colleague had set up TAS ("Traffic Accounting System", originally a Russian name, apparently), a collection of Perl scripts that would do per-IP accounting. It was pretty cool: it would count bytes per IP address and, from that, you could do analysis. But the project died, and it was kind of bespoke. Fast forward twenty years, and I find myself fighting off bots at the Tor Project (the irony...), with our GitLab suffering pretty bad slowdowns (see issue tpo/tpa/team#41677 for the latest public issue, the juicier one is confidential, unfortunately). (We did have some issues caused by overloads in CI, as we host, after all, a fork of Firefox, which is a massive repository, but the applications team did sustained, awesome work to fix issues on that side, again and again (see tpo/applications/tor-browser#43121 for the latest, and tpo/applications/tor-browser#43121 for some pretty impressive correlation work, I work with really skilled people). But those issues, I believe, were fixed.) So I had the feeling it was our turn to get hammered by the AI bots. But how do we tell? I could tell something was hammering at the costly /commit/ and (especially costly) /blame/ endpoint. So at first, I pulled out the trusted awk | sort | uniq -c | sort -n | tail pipeline I am sure others have worked out before:
awk '{print $1}' /var/log/nginx/*.log | sort | uniq -c | sort -n | tail -10
For people new to this, that pulls the first field out of web server log files, sorts the list, counts how many times each unique entry occurs, sorts that by count so that the most common entries (or IPs) end up last, then shows those top 10. That, in other words, answers the question of "which IP address visits this web server the most?" Based on this, I found a couple of IP addresses that looked like Alibaba. I had already addressed an abuse complaint to them (tpo/tpa/team#42152) but never got a response, so I just blocked their entire network blocks, rather violently:
for cidr in 47.240.0.0/14 47.246.0.0/16 47.244.0.0/15 47.235.0.0/16 47.236.0.0/14; do 
  iptables-legacy -I INPUT -s $cidr -j REJECT
done
That made Ali Baba and his forty thieves (specifically their AL-3 network) go away, but our load was still high, and I was still seeing various IPs crawling the costly endpoints. And this time, it was hard to tell who they were: you'll notice all the Alibaba IPs are inside the same 47.0.0.0/8 prefix. Although it's not a /8 itself, it's all inside the same prefix, so it's visually easy to pick it apart, especially for a brain like mine who's stared too long at logs flowing by too fast for their own mental health. What I had then was different, and I was tired of doing the stupid thing I had been doing for decades at this point. I had stumbled upon pyasn recently (in January, according to my notes) and somehow found it again, and thought "I bet I could write a quick script that loops over IPs and counts IPs per ASN". (Obviously, there are lots of other tools out there for that kind of monitoring. Argos, for example, presumably does this, but it's a kind of a huge stack. You can also get into netflows, but there's serious privacy implications with those. There are also lots of per-IP counters like promacct, but that doesn't scale. Or maybe someone already had solved this problem and I just wasted a week of my life, who knows. Someone will let me know, I hope, either way.)
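For readers who think more easily in Python than in shell, the per-IP counting pipeline from this section can be sketched roughly as follows. The `top_clients` function and the sample log lines are made up for the illustration (the addresses come from the documentation ranges), and the code assumes the client address is the first whitespace-separated field, as in the nginx command above.

```python
from collections import Counter


def top_clients(lines, n=10):
    """Count the first whitespace-separated field of each log line.

    Roughly what awk/sort/uniq -c/sort -n/tail does in the shell,
    except that the most frequent address comes first here.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts.most_common(n)


# Made-up log lines using documentation addresses (RFC 5737).
sample = [
    '192.0.2.10 - - "GET /blame/x HTTP/1.1" 200',
    '198.51.100.7 - - "GET / HTTP/1.1" 200',
    '192.0.2.10 - - "GET /commit/y HTTP/1.1" 200',
]
for address, hits in top_clients(sample):
    print(hits, address)
```

Like the shell version, this only answers "which single address is busiest", which is exactly the limitation the rest of the post works around.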

ASNs and networks

A quick aside, for people not familiar with how the internet works. People that know about ASNs, BGP announcements and so on can skip. The internet is the network of networks. It's made of multiple networks that talk to each other. The way this works is there is a Border Gateway Protocol (BGP), a relatively simple TCP-based protocol, that the edge routers of those networks use to announce to each other which networks they manage. Each of those networks is called an Autonomous System (AS) and has an AS number (ASN) to uniquely identify it. Just like IP addresses, ASNs are allocated by IANA and local registries; they're pretty cheap and useful if you like running your own routers, get one. When you have an ASN, you'll use it to, say, announce to your BGP neighbors "I have 198.51.100.0/24 over here" and the others might say "okay, and I have 216.90.108.31/19 over here, and I know of this other ASN over there that has 192.0.2.1/24 too!" And gradually, those announcements flood the entire network, and you end up with each BGP router having a routing table of the global internet, with a map of which network block, or "prefix", is announced by which ASN. It's how the internet works, and it's a useful thing to know, because it's what, ultimately, makes an organisation responsible for an IP address. There are "looking glass" tools like the one provided by routeviews.org which allow you to effectively run "trace routes" (but not the same as traceroute, which actively sends probes from your location), type an IP address in that form to fiddle with it. You will end up with an "AS path", the way to get from the looking glass to the announced network. But I digress, and that's kind of out of scope. Point is, the internet is made of networks, networks are autonomous systems (AS) and they have numbers (ASNs), and they announce IP prefixes (or "network blocks") that ultimately tell you who is responsible for traffic on the internet.

Introducing asncounter

So my goal was to get from "lots of IP addresses" to "list of ASNs", possibly also the list of prefixes (because why not). Turns out pyasn makes that really easy. I managed to build a prototype in probably less than an hour, just look at the first version, it's 44 lines (sloccount) of Python, and it works, provided you have already downloaded the required datafiles from routeviews.org. (Obviously, the latest version is longer at close to 1000 lines, but it downloads the data files automatically, and has many more features.) The way the first prototype (and later versions too, mostly) worked is that you feed it a list of IP addresses on standard input, it looks up the ASN and prefix associated with each IP, increments a counter for those, then prints the result. That showed me something like this:
root@gitlab-02:~/anarcat-scripts# tcpdump -q -i eth0 -n -Q in "(udp or tcp)" | ./asncounter.py --tcpdump
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode                                                                
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes                                                             
INFO: collecting IPs from stdin, using datfile ipasn_20250523.1600.dat.gz                                                                
INFO: loading datfile /root/.cache/pyasn/ipasn_20250523.1600.dat.gz...                                                                   
INFO: loading /root/.cache/pyasn/asnames.json                       
ASN     count   AS               
136907  7811    HWCLOUDS-AS-AP HUAWEI CLOUDS, HK                                                                                         
[----]  359     [REDACTED]
[----]  313     [REDACTED]
8075    254     MICROSOFT-CORP-MSN-AS-BLOCK, US
[---]   164     [REDACTED]
[----]  136     [REDACTED]
24940   114     HETZNER-AS, DE  
[----]  98      [REDACTED]
14618   82      AMAZON-AES, US                                                                                                           
[----]  79      [REDACTED]
prefix  count                                         
166.108.192.0/20        1294                                                                                                             
188.239.32.0/20 1056                                          
166.108.224.0/20        970                    
111.119.192.0/20        951              
124.243.128.0/18        667                                         
94.74.80.0/20   651                                                 
111.119.224.0/20        622                                         
111.119.240.0/20        566           
111.119.208.0/20        538                                         
[REDACTED]  313           
Even without ratios and a total count (which will come later), it was quite clear that Huawei was doing something big on the server. At that point, it was responsible for a quarter to half of the traffic on our GitLab server or about 5-10 queries per second. But just looking at the logs, or per IP hit counts, it was really hard to tell. That traffic is really well distributed. If you look more closely at the output above, you'll notice I redacted a couple of entries except major providers, for privacy reasons. But you'll also notice almost nothing is redacted in the prefix list, why? Because all of those networks are Huawei! Their announcements are kind of bonkers: they have hundreds of such prefixes. Now, clever people in the know will say "of course they do, it's a hyperscaler; just ASN14618 (AMAZON-AES) there has way more announcements, they have 1416 prefixes!" Yes, of course, but they are not generating half of my traffic (at least, not yet). But even then: this also applies to Amazon! This way of counting traffic is way more useful for large scale operations like this, because you group by organisation instead of by server or individual endpoint. And, ultimately, this is why asncounter matters: it allows you to group your traffic by organisation, the place you can actually negotiate with. Now, of course, that assumes those are entities you can talk with. I have written to both Alibaba and Huawei, and have yet to receive a response. I assume I never will. In their defence, I wrote in English, perhaps I should have made the effort of translating my message into Chinese, but then again English is the lingua franca of the Internet, and I doubt that's actually the issue.
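The prototype's loop described earlier in this section (read addresses, find the announcing prefix and ASN, bump two counters) can be sketched with nothing but the standard library. This is not asncounter's code: the post says the real tool relies on pyasn and routeviews.org data, whereas this sketch swaps in a tiny hypothetical `TABLE` and a linear longest-prefix scan. The prefixes and ASNs below are documentation values, not real announcements, and `lookup` and `count` are illustrative names.

```python
import ipaddress
from collections import Counter

# Hypothetical prefix-to-ASN table; real data would come from a BGP dump.
# Prefixes and ASNs are taken from ranges reserved for documentation.
TABLE = {
    ipaddress.ip_network("192.0.2.0/24"): 64496,
    ipaddress.ip_network("198.51.100.0/24"): 64497,
    ipaddress.ip_network("198.51.100.128/25"): 64498,
    ipaddress.ip_network("2001:db8::/32"): 64497,
}


def lookup(address):
    """Return (asn, prefix) for the longest matching prefix, or (None, None)."""
    ip = ipaddress.ip_address(address)
    matches = [net for net in TABLE if net.version == ip.version and ip in net]
    if not matches:
        return None, None
    best = max(matches, key=lambda net: net.prefixlen)  # most specific wins
    return TABLE[best], best


def count(addresses):
    """Count hits per ASN and per prefix for an iterable of IP strings."""
    by_asn, by_prefix = Counter(), Counter()
    for address in addresses:
        asn, prefix = lookup(address.strip())
        by_asn[asn] += 1
        by_prefix[prefix] += 1
    return by_asn, by_prefix


by_asn, by_prefix = count(
    ["192.0.2.1", "198.51.100.200", "198.51.100.5", "203.0.113.9", "2001:db8::1"]
)
total = sum(by_asn.values())
for asn, hits in by_asn.most_common():
    print(hits, round(100 * hits / total, 2), asn)
```

Addresses with no matching prefix are counted under `None`, which mirrors the `None` rows in the sample output later in the post. A linear scan is fine for four prefixes but would be far too slow for a full routing table, which is presumably part of why a dedicated lookup library is used in practice.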

The Huawei and Facebook blocks

Another aside, because this is my blog and I am not looking for a Pulitzer here. So I blocked Huawei from our GitLab server (and before you tear your shirt open: only our GitLab server, everything else is still accessible to them, including our email server to respond to my complaint). I did so 24h after emailing them, and after examining their user agent (UA) headers. Boy that was fun. In a sample of 268 requests I analyzed, they churned out 246 different UAs. At first glance, they looked legit, like:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36
Safari on a Mac, so far so good. But when you start digging, you notice some strange things, like here's Safari running on Linux:
Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.3 (KHTML, like Gecko) Chrome/6.0.457.0 Safari/534.3
Was Safari ported to Linux? I guess that's.. possible? But here is Safari running on a 15 year old Ubuntu release (10.10):
Mozilla/5.0 (X11; Linux i686) AppleWebKit/534.24 (KHTML, like Gecko) Ubuntu/10.10 Chromium/12.0.702.0 Chrome/12.0.702.0 Safari/534.24
Speaking of old, here's Safari again, but this time running on Windows NT 5.1, AKA Windows XP, released 2001, EOL since 2019:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-CA) AppleWebKit/534.13 (KHTML like Gecko) Chrome/9.0.597.98 Safari/534.13
Really? Here's Firefox 3.6, released 14 years ago, there were quite a lot of those:
Mozilla/5.0 (Windows; U; Windows NT 6.1; lt; rv:1.9.2) Gecko/20100115 Firefox/3.6
I remember running those old Firefox releases, those were the days. But to me, those look like entirely fake UAs, deliberately rotated to make it look like legitimate traffic. In comparison, Facebook seemed a bit more legit, in the sense that they don't fake it. Most hits are from:
meta-externalagent/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler)
which, according to their documentation:
crawls the web for use cases such as training AI models or improving products by indexing content directly
From what I could tell, it was even respecting our rather liberal robots.txt rules, in that it wasn't crawling the sprawling /blame/ or /commit/ endpoints, explicitly forbidden by robots.txt. So I've blocked the Facebook bot in robots.txt and, amazingly, it just went away. Good job Facebook, as much as I think you've given the empire to neo-nazis, cause depression and genocide, you know how to run a crawler, thanks. Huawei was blocked at the web server level, with a friendly 429 status code telling people to contact us (over email) if they need help. And they don't care: they're still hammering the server, from what I can tell, but then again, I didn't block the entire ASN just yet, just the blocks I found crawling the server over a couple hours.
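The "246 different UAs in 268 requests" observation boils down to a ratio of distinct user agents to requests. A minimal sketch of that measurement follows; `ua_diversity` is a made-up helper and the sample strings are shortened placeholders, not the real headers.

```python
from collections import Counter


def ua_diversity(user_agents):
    """Return (requests, distinct_uas, ratio) for a list of UA strings."""
    counts = Counter(user_agents)
    requests = sum(counts.values())
    distinct = len(counts)
    ratio = distinct / requests if requests else 0.0
    return requests, distinct, ratio


# Placeholder sample: four requests, three distinct user agents.
sample = ["UA-one", "UA-two", "UA-one", "UA-three"]
print(ua_diversity(sample))
```

For the figures quoted in this section the ratio is 246/268, or about 0.92. A ratio that close to 1 from a single network is the pattern the author reads as deliberate UA rotation.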

A full asncounter run

So what does a day in asncounter look like? Well, you start with a problem, say you're getting too much traffic and want to see where it's from. First you need to sample it. Typically, you'd do that with tcpdump or tailing a log file:
tail -F /var/log/apache2/*access*.log | awk '{print $2}' | asncounter
If you have lots of traffic or care about your users' privacy, you're not going to log IP addresses, so tcpdump is likely a good option instead:
tcpdump -q -n | asncounter --input-format=tcpdump --repl
If you really get a lot of traffic, you might want to get a subset of that to avoid overwhelming asncounter, it's not fast enough to do multiple gigabit/second, I bet, so here's only incoming SYN IPv4 packets:
tcpdump -q -n -Q in "tcp and tcp[tcpflags] & tcp-syn != 0 and (port 80 or port 443)" | asncounter --input-format=tcpdump --repl
In any case, at this point you're staring at a process, just sitting there. If you passed the --repl or --manhole arguments, you're lucky: you have a Python shell inside the program. Otherwise, send SIGHUP to the thing to have it dump the nice tables out:
pkill -HUP asncounter
Here's an example run:
> awk '{print $2}' /var/log/apache2/*access*.log | asncounter
INFO: using datfile ipasn_20250527.1600.dat.gz
INFO: collecting addresses from <stdin>
INFO: loading datfile /home/anarcat/.cache/pyasn/ipasn_20250527.1600.dat.gz...
INFO: finished reading data
INFO: loading /home/anarcat/.cache/pyasn/asnames.json
count   percent ASN     AS
12779   69.33   66496   SAMPLE, CA
3361    18.23   None    None
366     1.99    66497   EXAMPLE, FR
337     1.83    16276   OVH, FR
321     1.74    8075    MICROSOFT-CORP-MSN-AS-BLOCK, US
309     1.68    14061   DIGITALOCEAN-ASN, US
128     0.69    16509   AMAZON-02, US
77      0.42    48090   DMZHOST, GB
56      0.3     136907  HWCLOUDS-AS-AP HUAWEI CLOUDS, HK
53      0.29    17621   CNCGROUP-SH China Unicom Shanghai network, CN
total: 18433
count   percent prefix              ASN     AS
12779   69.33   192.0.2.0/24        66496   SAMPLE, CA
3361    18.23   None
298     1.62    178.128.208.0/20    14061   DIGITALOCEAN-ASN, US
289     1.57    51.222.0.0/16       16276   OVH, FR
272     1.48    2001:DB8::/48       66497   EXAMPLE, FR
235     1.27    172.160.0.0/11      8075    MICROSOFT-CORP-MSN-AS-BLOCK, US
94      0.51    2001:DB8:1::/48     66497   EXAMPLE, FR
72      0.39    47.128.0.0/14       16509   AMAZON-02, US
69      0.37    93.123.109.0/24     48090   DMZHOST, GB
53      0.29    27.115.124.0/24     17621   CNCGROUP-SH China Unicom Shanghai network, CN
Those numbers are actually from my home network, not GitLab. Over there, the battle still rages on, but at least the vampire bots are banging their heads against the solid Nginx wall instead of eating the fragile heart of GitLab. We had a significant improvement in latency thanks to the Facebook and Huawei blocks... Here are the "workhorse request duration stats" for various time ranges, 20h after the block:
range   mean    max     stdev
20h     449ms   958ms   39ms
7d      1.78s   5m      14.9s
30d     2.08s   3.86m   8.86s
6m      901ms   27.3s   2.43s
We went from two seconds mean to 500ms! And look at that standard deviation! 39ms! It was ten seconds before! I doubt we'll keep it that way very long but for now, it feels like I won a battle, and I didn't even have to set up anubis or go-away, although I suspect that will unfortunately come. Note that asncounter also supports exporting Prometheus metrics, but you should be careful with this, as it can lead to cardinality explosion, especially if you track by prefix (which can be disabled with --no-prefixes). Folks interested in more details should read the fine manual for more examples, usage, and discussion. It shows, among other things, how to effectively block lots of networks from Nginx, aggregate multiple prefixes, block entire ASNs, and more! So there you have it: I now have the tool I wish I had 20 years ago. Hopefully it will stay useful for another 20 years, although I'm not sure we'll still have internet in 20 years. I welcome constructive feedback, "oh no you rewrote X", Grafana dashboards, bug reports, pull requests, and "hell yeah" comments. Hacker News, let it rip, I know you can give me another juicy quote for my blog. This work was done as part of my paid work for the Tor Project, currently in a fundraising drive, give us money if you like what you read.

21 May 2025

Simon Quigley: Fences and Values

"Don't knock the fence down before you know why it's up." I repeat this phrase over and over again, yet the (metaphorical) Homeowner's Association still decides my fence is the wrong color.
Well, now you get to know why the fence is up. If anyone's actually willing to challenge me on this level, I'd welcome it.
The four ideas I'd like to discuss are these: quantum physics, Lutheranism, mental resilience, and psychology. I've been studying these topics intensely for the past decade as a passion project. I'm just going to let my thoughts flow, but I'd like to hear other opinions on this.
Can the mysteries of the mind, the subatomic world, and faith converge to reveal deeper truths?
When it comes to self-taught knowledge on analysis, I'm mostly learned on Freud, with some hints of Jung and Peterson. I've read much of the original source material, and watched countless presentations on it. This all being said, I'm learned on both Rothbard and Marx, so if there is a major flaw for which Freud is frowned upon, I'd genuinely like to know so I can update my research and juxtapose the two schools of thought.
Alongside this, although probably not directly relevant, I'm learned on John Locke and transcendentalism. What I'd like to focus on here is this: the Id.
The Id is the pleasure-seeking, instinctual part of the psyche. Jung further extends this into the idea of the shadow self, and Peterson maps the meanings of these texts into a combined work (at least in my rudimentary understanding).
In my research, the Id represents the part of your psyche that deals with religious values. As an example, if you're an impulsive person, turning to a spiritual or religious outlet can be highly beneficial. I've been using references from the foundational text of the Judaeo-Christian value system this entire time, feel free to re-read my other blog posts (instead of claiming they don't exist).
Let's tie this into quantum physics. This is the part where I'll struggle most. I've watched several movies about this, read several books, and even learned about it academically, but quantum physics is likely to be my weak spot here.
I did some research, and here are the elements I'm looking for: uncertainty principle, wave-particle duality, quantum entanglement, and the observer effect.
I already know about the cat in the box. And the Cat in the Hat, for that matter. I know about wave-particle duality from an incredibly intelligent high school physics teacher of mine. I know about the uncertainty principle purely in a colloquial sense. The remaining element I need to wrap my head around is quantum entanglement, but it feels like I'm almost there.
These concepts do actually challenge the idea of pure free will. It's almost like we're coming full circle. Some theologians (including myself, if you can call me a self-taught one) do believe the idea of quantum indeterminacy can be a space where divine action may take place. You could also liken the unpredictable nature of the Id to quantum indeterminacy as well. These are ones to think about, because in all reality, they're subjective opinions. I do believe they're interconnected.
In terms of Lutheranism, I'll be short on this one. Please do go read the full history behind Martin Luther and his turbulent relationship with Catholicism. I'm not a Bible thumper, and I actually think this is the first time I've mentioned religion publicly at all. This being said, now I'm actually ready to defend the points on an academic level.
The Id represents hidden psychological forces, quantum physics reveals subatomic mysteries, and Lutheranism emphasizes faith in the unseen God. Okay, so we have the baseline. Now, time for some mental resilience. When I think of mental resilience, the first people I think of are David Goggins and Jocko Willink. I've also enjoyed Dr. Andrew Huberman's podcast.
The idea there is simple: if you understand exactly how to learn, you know your fundamentals well enough to draw them and explain them vividly on a whiteboard, and you can make it a habit, at that point you're ready to work on your mental resilience. Little by little, gradually, how far can you push the bar towards the ceiling?
There's obviously limits. People sometimes get scared when I mention mental resilience, but obviously that's a bit of a catch-22. There are plenty of satirical videos out there, and of course, I don't believe in Goggins or Jocko wholeheartedly. They're just tools in the toolbox when times get tough.
I wish you all well, and I hope this gets you thinking about those people who just insist there is no God or higher being, and think you're stupid for believing there is one. Those people obviously haven't read analysis, in my own opinion.
Have a great night!

22 April 2025

Joey Hess: offgrid electric car

Eight months ago I came up my rocky driveway in an electric car, with the back full of solar panel mounting rails. I didn't know how I'd manage to keep it charged. I got the car earlier than planned, with my offgrid solar upgrade only beginning. There's no nearby EV charger, and winter was coming, less solar power every day. Still, it was the right time to take a leap to offgrid EV life. My existing 1 kilowatt solar array could charge the car only 5 miles on a good day. Here's my first try at charging the car offgrid:
first feeble charging offgrid
It was not worth charging the car that way, the house battery tended to get drained while doing that, and adding cycles to that battery is not desirable. So that was only a proof of concept, I knew I'd need to upgrade. My goal with the upgrade was to charge the car directly from the sun, even when it was cloudy, using the house battery only to skate over brief darker periods (like a thunderstorm). By mid October, I had enough solar installed to do that (5 kilowatts).
me standing in front of solar fence
first charging from solar fence
Using this, in 2 days I charged the car up from 57% to 82%, and took off on a celebratory road trip to Niagara Falls, where I charged the car from hydro power from a dam my grandfather had engineered. When I got home, it was November. Days were getting ever shorter. My solar upgrade was only 1/3rd complete and could charge the car 30-some miles per day, but only on a good day, and weather was getting worse. I came back with a low state of charge (both car and me), and needed to get back to full in time for my Thanksgiving trip at the end of the month. I decided to limit my trips to town.
charging up gradually through the month of November
This kind of medium term planning about car travel was new to me. But not too unusual for offgrid living. You look at the weather forecast and make some rough plans, and get to feel connected to the natural world a bit more. December is the real test for offgrid solar, and honestly this was a bit rough, with a road trip planned for the end of the month. I did the usual holiday stuff but otherwise holed up at home a bit more than I usually would. Charging was limited and the cold made it charge less efficiently.
bleak December charging
Still, I was busy installing more solar panels, and by winter solstice, was back to charging 30 miles on a good day. Of course, from there out things improved. In January and February I was able to charge up easily enough for my usual trips despite the cold. By March the car was often getting full before I needed to go anywhere, and I was doing long round trips without bothering to fast charge along the way, coming home low, knowing even cloudy days would let it charge up enough. That brings me up to today. The car is 80% full and heading up toward 100% for a long trip on Friday. Despite the sky being milky white today with no visible sun, there's plenty of power to absorb, and the car charger turned on at 11 am with the house battery already full. My solar upgrade is only 2/3rds complete, and also I have not yet installed my inverter upgrade, so the car can only currently charge at 9 amps despite much more solar power often being available. So I'm looking forward to how next December goes with my full planned solar array and faster charging. But first, a summer where I expect the car will mostly be charged up and ready to go at all times, and the only car expense will be fast charging on road trips!
By the way, the code I've written to automate offgrid charging that runs only when there's enough solar power is here. And here are the charging graphs for the other months. All told, it's charged 475 kWh offgrid, enough to drive more than 1500 miles.
January
February
March
April
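As a rough check on the totals quoted above, this small sketch works out the driving efficiency they imply. The 475 kWh and 1500 mile figures come from the post; the function and constant names are only illustrative. Since the post says "more than 1500 miles", the result is a lower bound.

```python
def miles_per_kwh(miles: float, kwh: float) -> float:
    """Return distance driven per kilowatt-hour of charge."""
    if kwh <= 0:
        raise ValueError("kwh must be positive")
    return miles / kwh


# Figures quoted in the post: 475 kWh charged offgrid, more than 1500 miles.
OFFGRID_KWH = 475
OFFGRID_MILES = 1500  # "more than", so treat the result as a lower bound

efficiency = miles_per_kwh(OFFGRID_MILES, OFFGRID_KWH)
print(f"at least {efficiency:.2f} miles per kWh")
```

That comes to a little over 3 miles per kWh. It does not separate charging losses from driving consumption, which the post does not break down.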

20 April 2025

Russ Allbery: Review: Up the Down Staircase

Review: Up the Down Staircase, by Bel Kaufman
Publisher: Vintage Books
Copyright: 1964, 1991, 2019
Printing: 2019
ISBN: 0-525-56566-3
Format: Kindle
Pages: 360
Up the Down Staircase is a novel (in an unconventional format, which I'll describe in a moment) about the experiences of a new teacher in a fictional New York City high school. It was a massive best-seller in the 1960s, including a 1967 movie, but seems to have dropped out of the public discussion. I read it from the library sometime in the late 1980s or early 1990s and have thought about it periodically ever since. It was Bel Kaufman's first novel. Sylvia Barrett is a new graduate with a master's degree in English, where she specialized in Chaucer. As Up the Down Staircase opens, it is her first day as an English teacher in Calvin Coolidge High School. As she says in a letter to a college friend:
What I really had in mind was to do a little teaching. "And gladly wolde he lerne, and gladly teche" like Chaucer's Clerke of Oxenford. I had come eager to share all I know and feel; to imbue the young with a love for their language and literature; to instruct and to inspire. What happened in real life (when I had asked why they were taking English, a boy said: "To help us in real life") was something else again, and even if I could describe it, you would think I am exaggerating.
She instead encounters chaos and bureaucracy, broken windows and mindless regulations, a librarian who is so protective of her books that she doesn't let any students touch them, a school guidance counselor who thinks she's Freud, and a principal whose sole interaction with the school is to occasionally float through on a cushion of cliches, dispensing utterly useless wisdom only to vanish again.
I want to take this opportunity to extend a warm welcome to all faculty and staff, and the sincere hope that you have returned from a healthful and fruitful summer vacation with renewed vim and vigor, ready to gird your loins and tackle the many important and vital tasks that lie ahead undaunted. Thank you for your help and cooperation in the past and future. Maxwell E. Clarke
Principal
In practice, the school is run by James J. McHare, Clarke's administrative assistant, who signs his messages JJ McH, Adm. Asst. and who Sylvia immediately starts calling Admiral Ass. McHare is a micro-managing control freak who spends the book desperately attempting to impose order over school procedures, the teachers, and the students, with very little success. The title of the book comes from one of his detention slips:
Please admit bearer to class Detained by me for going Up the Down staircase and subsequent insolence. JJ McH
The conceit of this book is that, except for the first and last chapters, it consists only of memos, letters, notes, circulars, and other paper detritus, often said to come from Sylvia's wastepaper basket. Sylvia serves as the first-person narrator through her long letters to her college friend, and through shorter but more frequent exchanges via intraschool memo with Beatrice Schachter, another English teacher at the same school, but much of the book lies outside her narration. The reader has to piece together what's happening from the discarded paper of a dysfunctional institution. Amid the bureaucratic and personal communications, there are frequent chapters with notes from the students, usually from the suggestion box that Sylvia establishes early in the book. These start as chaotic glimpses of often-misspelled wariness or open hostility, but over the course of Up the Down Staircase, some of the students become characters with fragmentary but still visible story arcs. This remains confusing throughout the novel (there are too many students to keep them entirely straight, and several of them use pseudonyms for the suggestion box), but it's the sort of confusion that feels like an intentional authorial choice. It mirrors the difficulty a teacher has in piecing together and remembering the stories of individual students in overstuffed classrooms, even if (like Sylvia and unlike several of her colleagues) the teacher is trying to pay attention. At the start, Up the Down Staircase reads as mostly-disconnected humor. There is a strong "kids say the darnedest things" vibe, which didn't entirely work for me, but the send-up of chaotic bureaucracy is both more sophisticated and more entertaining. It has the "laugh so that you don't cry" absurdity of a system with insufficient resources, entirely absent management, and colleagues who have let their quirks take over their personalities. 
Sylvia alternates between incredulity and stubbornness, and I think this book is at its best when it shows the small acts of practical defiance that one uses to carve out space and coherence from mismanaged bureaucracy. But this book is not just a collection of humorous anecdotes about teaching high school. Sylvia is sincere in her desire to teach, which crystallizes around, but is not limited to, a quixotic attempt to reach one delinquent that everyone else in the school has written off. She slowly finds her footing, she has a few breakthroughs in reaching her students, and the book slowly turns into an earnest portrayal of an attempt to make the system work despite its obvious unfitness for purpose. This part of the book is hard to review. Parts of it worked brilliantly; I could feel myself both adjusting my expectations alongside Sylvia to something less idealistic and also celebrating the rare breakthrough with her. Parts of it were weirdly uncomfortable in ways that I'm not sure I enjoyed. That includes Sylvia's climactic conversation with the boy she's been trying to reach, which was weirdly charged and ambiguous in a way that felt like the author's reach exceeding their grasp. One thing that didn't help my enjoyment is Sylvia's relationship with Paul Barringer, another of the English teachers and a frustrated novelist and poet. Everyone who works at the school has found their own way to cope with the stress and chaos, and many of the ways that seem humorous turn out to have a deeper logic and even heroism. Paul's, however, is to retreat into indifference and alcohol. He is a believable character who works with Kaufman's themes, but he's also entirely unlikable. I never understood why Sylvia tolerated that creepy asshole, let alone kept having lunch with him. It is clear from the plot of the book that Kaufman at least partially understands Paul's deficiencies, but that did not help me enjoy reading about him. 
This is a great example of a book that tried to do something unusual and risky and didn't entirely pull it off. I like books that take a risk, and sometimes Up the Down Staircase is very funny or suddenly insightful in a way that I'm not sure Kaufman could have reached with a more traditional novel. It takes a hard look at what it means to try to make a system work when it's clearly broken and you can't change it, and the way all of the characters arrive at different answers that are much deeper than their initial impressions was subtle and effective. It's the sort of book that sticks in your head, as shown by the fact I bought it on a whim to re-read some 35 years after I first read it. But it's not consistently great. Some parts of it drag, the characters are frustratingly hard to keep track of, and the emotional climax points are odd and unsatisfying, at least to me. I'm not sure whether to recommend it or not, but it's certainly unusual. I'm glad I read it again, but I probably won't re-read it for another 35 years, at least. If you are considering getting this book, be aware that it has a lot of drawings and several hand-written letters. The publisher of the edition I read did a reasonably good job formatting this for an ebook, but some of the pages, particularly the hand-written letters, were extremely hard to read on a Kindle. Consider paper, or at least reading on a tablet or computer screen, if you don't want to have to puzzle over low-resolution images. The 1991 trade paperback had a new introduction by the author, reproduced in the edition I read as an afterword (which is a better choice than an introduction). It is a long and fascinating essay from Kaufman about her experience with the reaction to this book, culminating in a passionate plea for supporting public schools and public school teachers. Kaufman's personal account adds a lot of depth to the story; I highly recommend it. 
Content note: Self-harm, plus several scenes that are closely adjacent to student-teacher relationships. Kaufman deals frankly with the problems of mostly-poor high school kids, including sexuality, so be warned that this is not the humorous romp that it might appear on first glance. A couple of the scenes made me uncomfortable; there isn't anything explicit, but the emotional overtones can be pretty disturbing. Rating: 7 out of 10

4 April 2025

Gunnar Wolf: Naming things revisited

How long has it been since you last saw a conversation over different blogs syndicated at the same planet? Well, it's one of the good memories of the early 2010s. And there is an opportunity to re-engage! I came across Evgeni's post "naming things is hard" in Planet Debian. So, what names have I given my computers? I have had many since the mid-1990s. I also had several during the decade before that, but before Linux, my computers didn't have a formal name. Naming my computers is something nice Linux gave me. I have forgotten many. Some of the names I have used:

1 March 2025

Debian Brasil: Debian Day 30 years online in Brazil


by Paulo Henrique de Lima Santana (phls), originally created 2023-08-25
In 2023 the traditional Debian Day is being celebrated in a special way; after all, on August 16th Debian turned 30 years old! To celebrate this special milestone in Debian's life, the Debian Brasil community organized a week of online talks from August 14th to 18th. The event was named Debian 30 years. Two talks were held per night, from 7:00 pm to 10:00 pm, streamed on the Debian Brasil channel on YouTube, totaling 10 talks. The recordings are also available on the Debian Brazil channel on Peertube. We had the participation of 9 DDs, 1 DM, and 3 contributors in 10 activities. The live audience varied a lot, and the peak was on the preseed talk with Eriberto Mota, when we had 47 people watching. Thank you to all participants for the contribution you made to the success of our event. See below the photos from each activity:
Nova geração: uma entrevista com iniciantes no projeto Debian (New generation: an interview with newcomers to the Debian project)
Instalação personalizada e automatizada do Debian com preseed (Customized and automated Debian installation with preseed)
Manipulando patches com git-buildpackage (Handling patches with git-buildpackage)
debian.social: Socializando Debian do jeito Debian (debian.social: Socializing Debian the Debian way)
Proxy reverso com WireGuard (Reverse proxy with WireGuard)
Celebração dos 30 anos do Debian! (Celebrating 30 years of Debian!)
Instalando o Debian em disco criptografado com LUKS (Installing Debian on a LUKS-encrypted disk)
O que a equipe de localização já conquistou nesses 30 anos (What the localization team has achieved in these 30 years)
Debian - Projeto e Comunidade! (Debian - Project and Community!)
Design Gráfico e Software livre, o que fazer e por onde começar (Graphic design and free software: what to do and where to start)

23 February 2025

Kentaro Hayashi: Short journey to Mozc 2.29.5160.102+dfsg-1

Introduction This is just a note about how I upgraded the Mozc package last year to get it ready (with many restrictions) for the upcoming trixie release. Mozc 2.29.5160.102+dfsg-1.3 will probably be shipped in Debian 13 (trixie).

FTBFS with Mozc 2.28.4715.102+dfsg-2.2 In May 2024, I found that Mozc had been removed from testing and still failed to build from source (FTBFS). #1068186 - mozc: FTBFS with abseil 20230802: ../../base/init_mozc.cc:90:29: error: absl::debian5::flags_internal::ArgvListAction has not been declared - Debian Bug report logs That FTBFS was fixed in Mozc upstream, but the fix was not applied for a while. Not only the upstream patch but also an additional linkage patch was required to fix it. Mozc is the de facto standard input method editor for Japanese. Most Japanese users use it by default on the Linux desktop. (Even though the frontend input method framework differs, the backend engine is Mozc in most cases: uim-mozc for task-japanese-desktop, ibus-mozc for task-japanese-gnome-desktop in Debian.) Some users rebuild Mozc locally with an integrated external dictionary to enlarge its vocabulary. If the FTBFS persisted, it would block such usage. So I sent patches to fix it, and they were merged.

Motivation to update Mozc While fixing #1068186, I also found that the Mozc version had not been synced with upstream for a long time. At that time, Mozc in unstable was version 2.28.4715.102+dfsg, but upstream had already released 2.30.5544.102. It seemed that Mozc's maintainer was too busy to update it, so I tried to do it.

The blockers for updating Mozc But it was not an easy task. There were many blockers to packaging the latest Mozc.
  • Newer Mozc requires Bazel to build, but there is no suitable Bazel package (there is bazel-bootstrap 4.x, but it is too old; v6.x or newer is required)
  • Newer abseil and protobuf were required
  • The renderer was changed to Qt, and the GTK renderer was removed
  • The existing patchsets (e.g. for UIM and Fcitx) needed to be revised
And that was not all.

Road to latest Mozc First, I knew that debian-bazel existed, so I asked about the progress of Bazel packaging: "Any updates about bazel packaging effort?" Sadly, there was no response. Thus, it was not realistic to adopt Bazel as the build toolchain. In other words, we need to keep the GYP patch and maintain it. As another topic, upstream changed the renderer from GTK+ to Qt. Here are the major topics for each Mozc release.
  • 2.30.5544.102 Require abseil 20240116.1 or later
  • 2.29.5544.102 GYP was deprecated
  • 2.29.5374.102
  • 2.29.5268.102 No gtk renderer anymore, need Qt.
  • 2.29.5160.102
    • The last version that gtk renderer is available.
    • --use_gyp_for_ibus_build option was removed.
  • 2.28.5029.102
  • 2.28.4880.102
  • 2.28.4715.102+dfsg Debian sid
The internal renderer changes are too big, and GYP support had already been removed gradually even before the GYP deprecation in 2.29.5544.102. As a result, targeting 2.29.5160.102 was the practical way to move forward.
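The choice of target can be sketched as "the newest release that is not newer than the last one with the GTK renderer". The release list and the cut-off are taken from the list above; the helper names are hypothetical. The comparison is a plain numeric tuple comparison on the upstream part of the version, not Debian's full version-ordering algorithm.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '2.29.5160.102+dfsg' into (2, 29, 5160, 102) for comparison."""
    upstream = version.split("+", 1)[0]  # drop a suffix such as +dfsg
    return tuple(int(part) for part in upstream.split("."))


# Releases named in the list above.
RELEASES = [
    "2.30.5544.102",
    "2.29.5544.102",
    "2.29.5374.102",
    "2.29.5268.102",
    "2.29.5160.102",
    "2.28.5029.102",
    "2.28.4880.102",
    "2.28.4715.102+dfsg",
]

# Per the list, the last release in which the GTK renderer is available.
LAST_GTK_RELEASE = "2.29.5160.102"


def newest_with_gtk(releases: list[str], last_gtk: str) -> str:
    """Pick the newest release that is not newer than the GTK cut-off."""
    limit = parse_version(last_gtk)
    candidates = [r for r in releases if parse_version(r) <= limit]
    return max(candidates, key=parse_version)


target = newest_with_gtk(RELEASES, LAST_GTK_RELEASE)
print(target)
```

With the releases listed here this selects 2.29.5160.102, the version the post settles on. A real decision also has to weigh the GYP and patchset constraints described in the text, which this sketch does not model.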

Revisit existing patchsets for 2.28.4715.102+dfsg Second, I needed to revisit the existing patchset and triage each patch.
  • 0001-Update-uim-mozc-to-c979f127acaeb7b35d3344e8b1e40848e.patch
    • Required
  • 0002-Support-fcitx.patch
    • Required
  • 0003-Change-compiler-from-clang-to-gcc.patch
  • 0004-Add-usage_dict.txt.patch
    • Required. (maybe)
  • 0005-Enable-verbose-build.patch
    • Required.
  • 0006-Update-gyp-using-absl.patch
    • Required and need massive refactoring.
  • 0007-common.gypi-Use-command-v-instead-of-which.patch
    • (maybe) Not needed anymore
  • 0009-protobuf.gyp-Add-latomic-to-link_settings.patch
    • Required.
  • 0010-Fix-the-compile-error-of-ParseCommandLineFlags-with.patch
    • Required. Should be merged into 0006 patch.
  • 0011-Fix-missing-abseil-gyp-link-settings.patch
    • Required. Should be merged into 0006 patch.
The UIM patch was maintained in a third-party repository, and its directory structure was quite different from Mozc's. Maintenance activity there seemed too low, so picking changes from macuim was not enough; additional fixes for the FTBFS were required. The Fcitx patch was also maintained separately, in fcitx/mozc. But it tracks only the master branch, so it was hard to pick a patchset for a specific version of Mozc. Finally, I managed to refresh the patchset for 2.29.5160.102.
  • support-uim.patch
  • support-fcitx.patch
  • change-compiler-from-clang-to-gcc.patch
  • add-japanese-usage-dictionary.patch
  • enable-verbose-build.patch
  • update-gyp-using-system-abseil.patch
  • gyp-using-command-instead-of-which.patch
  • gyp-protobuf-link-with-atomic.patch
  • enable-deprecated-gtk-renderer.patch
  • fix-compile-error-of-ParseCommandLineFlags.patch
  • enable-use_gyp_for_ibus_build-again.patch
  • ibus-drop-needless-client_mock.patch
  • protobuf-revert-internal-cleanup.patch
  • uim-mozc-fix-ftbfs.patch

Improve packaging task Mozc needs to be repacked, but the package didn't use Files-Excluded yet. So I introduced d/watch to repack the upstream source. This makes the source package more reproducible.

OT: Hardware breakage There was another blocker to this task. I hit a situation where g++ randomly crashed with SEGV while building Mozc. At first I wondered why it failed, but after digging further I finally found that a memory module was corrupted. Thus I lost 32GB of memory modules. :-<

Unexpected behaviour in uim-mozc After uploading Mozc 2.29.5160.102+dfsg-1 to experimental, I found that uim-mozc behaves weirdly in some cases: the candidate words were shown with flickering. But it was not a regression in this upload; uim-mozc under Wayland causes that problem. Thus GNOME and its derivatives might not be affected, because ibus-mozc will be used there.

Mozc 2.29.5160.102+dfsg-1 As the patchset had matured, I uploaded 2.29.5160.102+dfsg-1 with the --delayed 15 option.
$ dput --delayed 15 mozc_2.29.5160.102+dfsg-1_source.changes
Uploading mozc using ftp to ftp-master (host: ftp.upload.debian.org; directory: /pub/UploadQueue/DELAYED/15-day)
running allowed-distribution: check whether a local profile permits uploads to the target distribution
running protected-distribution: warn before uploading to distributions where a special policy applies
running checksum: verify checksums before uploading
running suite-mismatch: check the target distribution for common errors
running gpg: check GnuPG signatures before the upload
 signfile dsc mozc_2.29.5160.102+dfsg-1.dsc 719EB2D93DBE9C4D21FBA064F7FB75C566ED20E3
 fixup_buildinfo mozc_2.29.5160.102+dfsg-1.dsc mozc_2.29.5160.102+dfsg-1_amd64.buildinfo
 signfile buildinfo mozc_2.29.5160.102+dfsg-1_amd64.buildinfo 719EB2D93DBE9C4D21FBA064F7FB75C566ED20E3
 fixup_changes dsc mozc_2.29.5160.102+dfsg-1.dsc mozc_2.29.5160.102+dfsg-1_source.changes
 fixup_changes buildinfo mozc_2.29.5160.102+dfsg-1_amd64.buildinfo mozc_2.29.5160.102+dfsg-1_source.changes
 signfile changes mozc_2.29.5160.102+dfsg-1_source.changes 719EB2D93DBE9C4D21FBA064F7FB75C566ED20E3
Successfully signed dsc, buildinfo, changes files
Uploading mozc_2.29.5160.102+dfsg-1.dsc
Uploading mozc_2.29.5160.102+dfsg-1.debian.tar.xz
Uploading mozc_2.29.5160.102+dfsg-1_amd64.buildinfo
Uploading mozc_2.29.5160.102+dfsg-1_source.changes
Mozc 2.29.5160.102+dfsg-1 landed in unstable on 2024-12-20.

Additional bug fixes Additionally, the following bugs were also fixed. These bugs were fixed in 2.29.5160.102+dfsg-1.1. Moreover, I found that Salsa CI succeeds even when a commit is missing from the pristine-tar branch. I sent an MR for this issue, and it has already been merged.

Mozc and future in Debian In this short journey, I gave up updating to a newer Mozc because the dependency libraries had not been updated. Note that protobuf 3.25.4 in experimental depends on the older absl 20230802, so it must be rebuilt against absl 20240722.0. Moreover, we need to consider how to migrate from the GTK renderer to the Qt renderer in the future.

21 February 2025

Russell Coker: Links February 2025

Oliver Lindburg wrote an interesting article about Designing for Crisis [1]. Bruce Schneier blogged about how to cryptographically identify other humans in advance of AI technology allowing faking people you know [2]. Anarcat has an interesting review of qalc which is a really good calculator, I'll install it on all my workstations [3]. It even does furlongs per fortnight! This would be good to be called from an LLM system when someone asks about mathematical things. Krebs has an informative article about a criminal employed by Elon's DOGE [4]. Conservatives tend to be criminals. Krebs wrote an interesting article about the security of the iOS (and presumably Android) apps for DeepSeek [5]. Seems that the DeepSeek people did everything wrong. Bruce Schneier and Davi Ottenheimer wrote an insightful article DOGE as a National Cyberattack [6]. Bruce Schneier and Barath Raghavan wrote an insightful article about why and how computer generated voices should sound robotic [7]. Cory Doctorow has an interesting approach to the trade war between the US and Canada, instead of putting tariffs on imports from the US the Canadian government should make it legal for Canadians to unlock their own property [8]. This youtube video about designing a compressed air engine for a model plane is interesting [9]. Krebs has an interesting article on phishing and mobile phone wallets, Google and Apple need to restrict the number of wallets per phone [10]. The Daily WTF has a good summary of why Elon's DOGE organisation is badly designed and run and a brief mention of how it damages the US [11]. ArsTechnica has an informative article about device code phishing [12]. The increased use of single-sign-on is going to make this more of a problem. Shrivu wrote an insightful and informative article on how to backdoor LLMs [13]. Cory Doctorow wrote an informative post about MLMs and how they are the mirror world version of community organising [14].

28 January 2025

Russ Allbery: Review: Moose Madness

Review: Moose Madness, by Mar Delaney
Publisher: Kalikoi
Copyright: May 2021
ASIN: B094HGT1ZB
Format: Kindle
Pages: 68
Moose Madness is a sapphic shifter romance novella (on the short side for a novella) by the same author as Wolf Country. It was originally published in the anthology Her Wild Soulmate, which appears to be very out of print. Maggie (she hates the nickname Moose) grew up in Moose Point, a tiny fictional highway town in (I think) Alaska. (There is, unsurprisingly, an actual Moose Point in Alaska, but it's a geographic feature and not a small town.) She stayed after graduation and is now a waitress in the Moose Point Pub. She's also a shifter; specifically, she is a moose shifter like her mother, the town mayor. (Her father is a fox shifter.) As the story opens, the annual Moose Madness festival is about to turn the entire town into a blizzard of moose kitsch. Fiona Barton was Maggie's nemesis in high school. She was the cool, popular girl, a red-headed wolf shifter whose friend group teased and bullied awkward and uncoordinated Maggie mercilessly. She was also Maggie's impossible crush, although the very idea seemed laughable. Fi left town after graduation, and Maggie hadn't thought about her for years. Then she walks into Moose Point Pub dressed in biker leathers, with piercings and one side of her head shaved, back in town for a wedding in her pack. Much to the shock of both Maggie and Fi, they realize that they're soulmates as soon as their eyes meet. Now what? If you thought I wasn't going to read the moose and wolf shifter romance once I knew it existed, you do not know me very well. I have been saving it for when I needed something light and fun. It seemed like the right palate cleanser after a very disappointing book. Moose Madness takes place in the same universe as Wolf Country, which means there are secret shifters all over Alaska (and presumably elsewhere) and they have the strong magical version of love at first sight. If one is a shifter, one knows immediately as soon as one locks eyes with one's soulmate and this feeling is never wrong. 
This is not my favorite romance trope, but if I get moose shifter romance out of it, I'll endure. As you can tell from the setup, this is enemies-to-lovers, but the whole soulmate thing shortcuts the enemies to lovers transition rather abruptly. There's a bit of apologizing and air-clearing at the start, but most of the novella covers the period right after enemies have become lovers and are getting to know each other properly. If you like that part of the arc, you will probably enjoy this, but be warned that it's slight and somewhat obvious. There's a bit of tension from protective parents and annoying pack mates, but it's sorted out quickly and easily. If you want the characters to work for the relationship, this is not the novella for you. It's essentially all vibes. I liked the vibes, though! Maggie is easy to like, and Fi does a solid job apologizing. I wish there was quite a bit more moose than we get, but Delaney captures the combination of apparent awkwardness and raw power of a moose and has a good eye for how beautiful large herbivores can be. This is not the sort of book that gives a moment's thought to wolves being predators and moose being, in at least some sense, prey animals, so if you are expecting that to be a plot point, you will be disappointed. As with Wolf Country, Delaney elides most of the messier and more ethically questionable aspects of sometimes being an animal. This is a sweet, short novella about two well-meaning and fundamentally nice people who are figuring out that middle school and high school are shitty and sometimes horrible but don't need to define the rest of one's life. It's very forgettable, but it made me smile, and it was indeed a good palate cleanser. If you are, like me, the sort of person who immediately thought "oh, I have to read that" as soon as you saw the moose shifter romance, keep your expectations low, but I don't think this will disappoint. If you are not that sort of person, you can safely miss this one. 
Rating: 6 out of 10

31 December 2024

Chris Lamb: Favourites of 2024

Here are my favourite books and movies that I read and watched throughout 2024. It wasn't quite the stellar year for books as previous years: few of those books that make you want to recommend and/or buy them for all your friends. In subconscious compensation, perhaps, I reread a few classics (e.g. True Grit, Solaris), and I'm almost finished my second read of War and Peace.

Books

Elif Batuman: Either/Or (2022)
Stella Gibbons: Cold Comfort Farm (1932)
Michel Faber: Under The Skin (2000)
Wallace Stegner: Crossing to Safety (1987)
Gustave Flaubert: Madame Bovary (1857)
Rachel Cusk: Outline (2014)
Sara Gran: The Book of the Most Precious Substance (2022)
Anonymous: The Railway Traveller's Handy Book (1862)
Natalie Hodges: Uncommon Measure: A Journey Through Music, Performance, and the Science of Time (2022)
Gary K. Wolf: Who Censored Roger Rabbit? (1981)

Films

Recent releases

Seen at a 2023 festival. Disappointments this year included Blitz (Steve McQueen), Love Lies Bleeding (Rose Glass), The Room Next Door (Pedro Almodóvar) and Emilia Pérez (Jacques Audiard), whilst the worst new film this year was likely The Substance (Coralie Fargeat), followed by Megalopolis (Francis Ford Coppola), Unfrosted (Jerry Seinfeld) and Joker: Folie à Deux (Todd Phillips).
Older releases

i.e. films released before 2023, and not including rewatches from previous years. Distinctly unenjoyable watches included The Island of Dr. Moreau (John Frankenheimer, 1996), Southland Tales (Richard Kelly, 2006), Any Given Sunday (Oliver Stone, 1999) & The Hairdresser's Husband (Patrice Leconte, 1990). On the other hand, unforgettable cinema experiences this year included big-screen rewatches of Solaris (Andrei Tarkovsky, 1972), Blade Runner (Ridley Scott, 1982), Apocalypse Now (Francis Ford Coppola, 1979) and Die Hard (John McTiernan, 1988).

19 December 2024

Gregory Colpart: MiniDebConf Toulouse 2024

After the MiniDebConf Marseille 2019, COVID-19 made it impossible or difficult to organize new MiniDebConfs for a few years. With the gradual resumption of in-person events (like FOSDEM, DebConf, etc.), the idea emerged to host another MiniDebConf in France, but with a lighter organizational load. In 2023, we decided to reach out to the organizers of Capitole du Libre to repeat the experience of 2017: hosting a MiniDebConf alongside their annual event in Toulouse in November. However, our request came too late for 2023. After discussions with Capitole du Libre in November 2023 in Toulouse and again in February 2024 in Brussels, we confirmed that a MiniDebConf Toulouse would take place in November 2024! We then assembled a small organizing team and got to work: a Call for Papers in May 2024, adding a two-day MiniDebCamp, coordinating with the DebConf video team, securing sponsors, creating a logo, ordering T-shirts and stickers, planning the schedule, and managing registrations. Even with lighter logistics (conference rooms, badges, and catering during the weekend were handled by Capitole du Libre), there was still quite a bit of preparation to do. On Thursday, November 14, and Friday, November 15, 2024, about forty developers arrived from around the world (France, Spain, Italy, Switzerland, Germany, England, Brazil, Uruguay, India, Brest, Marseille...) to spend two days at the MiniDebCamp in the beautiful collaborative spaces of Artilect in Toulouse city center.
Then, on Saturday, November 16, and Sunday, November 17, 2024, the MiniDebConf took place at ENSEEIHT as part of the Capitole du Libre event. The conference kicked off on Saturday morning with an opening session by Jérémy Lecour, which included a tribute to Lunar (Nicolas Dandrimont). This was followed by Reproducible Builds: Rebuilding What is Distributed from ftp.debian.org (Holger Levsen) and Discussion on My Research Work on Sustainability of Debian OS (Eda). After lunch at the Capitole du Libre food trucks, the intense afternoon schedule began: What's New in the Linux Kernel (and What's Missing in Debian) (Ben Hutchings), Linux Live Patching in Debian (Santiago Ruano Rincón), Trixie on Mobile: Are We There Yet? (Arnaud Ferraris), PostgreSQL Container Groups, aka cgroups Down the Road (Cédric Villemain), Upgrading a Thousand Debian Hosts in Less Than an Hour (Jérémy Lecour and myself), and Using Debusine to Automate Your QA (Stefano Rivera & co).

Sunday marked the second day, starting with a presentation on DebConf 25 (Benjamin Somers), which will be held in Brest in July 2025. The morning continued with talks: How LTS Goes Beyond LTS (Santiago Ruano Rincón & Roberto C. Sánchez), Cross-Building (Helmut Grohne), and State of JavaScript (Bastien Roucariès). In the afternoon, there were Lightning Talks, PyPI Security: Past, Present & Future (Salvo "LtWorf" Tomaselli), and the classic Bits from DPL (Andreas Tille), before closing with the final session led by Pierre-Elliott Bécue.

All talks are available on video (a huge thanks to the amazing DebConf video team), and many thanks to our sponsors (Viridien, Freexian, Evolix, Collabora, and Data Bene). A big thank-you as well to the entire Capitole du Libre team for hosting and supporting us. See you in Brest in July 2025! Articles about (or mentioning) MiniDebConf Toulouse:

1 September 2024

Colin Watson: Free software activity in August 2024

All but about four hours of my Debian contributions this month were sponsored by Freexian. (I ended up going a bit over my 20% billing limit this month.) You can also support my work directly via Liberapay.

man-db and friends

I released libpipeline 1.5.8 and man-db 2.13.0. Since autopkgtests are great for making sure we spot regressions caused by changes in dependencies, I added one to man-db that runs the upstream tests against the installed package. This required some preparatory work upstream, but otherwise was surprisingly easy to do.

OpenSSH

I fixed the various 9.8 regressions I mentioned last month: socket activation, libssh2, and Twisted. There were a few other regressions reported too: TCP wrappers support, openssh-server-udeb, and xinetd were all broken by changes related to the listener/per-session binary split, and I fixed all of those. Once all that had made it through to testing, I finally uploaded the first stage of my plan to split out GSS-API support: there are now openssh-client-gssapi and openssh-server-gssapi packages in unstable, and if you use either GSS-API authentication or key exchange then you should install the corresponding package in order for upgrades to trixie+1 to work correctly. I'll write a release note once this has reached testing.

Multiple identical results from getaddrinfo

I expect this is really a bug in a chroot creation script somewhere, but I haven't been able to track down what's causing it yet. My sbuild chroots, and apparently Lucas Nussbaum's as well, have an /etc/hosts that looks like this:
$ cat /var/lib/schroot/chroots/sid-amd64/etc/hosts
127.0.0.1       localhost
127.0.1.1       [...]
127.0.0.1       localhost ip6-localhost ip6-loopback
The last line clearly ought to be ::1 rather than 127.0.0.1; but things mostly work anyway, since most code doesn't really care which protocol it uses to talk to localhost. However, a few things try to set up test listeners by calling getaddrinfo("localhost", ...) and binding a socket for each result. This goes wrong if there are duplicates in the resulting list, and the test output is typically very confusing: it looks just like what you'd see if a test isn't tearing down its resources correctly, which is a much more common thing for a test suite to get wrong, so it took me a while to spot the problem. I ran into this in both python-asyncssh (#1052788, upstream PR) and Ruby (ruby3.1/#1069399, ruby3.2/#1064685, ruby3.3/#1077462, upstream PR). The latter took a while since Ruby isn't one of my languages, but hey, I've tackled much harder side quests. I NMUed ruby3.1 for this since it was showing up as a blocker for openssl testing migration, but haven't done the other active versions (yet, anyway).

OpenSSL vs. cryptography

I tend to care about openssl migrating to testing promptly, since openssh uploads have a habit of getting stuck on it otherwise. Debian's OpenSSL packaging recently split out some legacy code (cryptography that's no longer considered a good idea to use, but that's sometimes needed for compatibility) to an openssl-legacy-provider package, and added a Recommends on it. Most users install Recommends, but package build processes don't; and the Python cryptography package requires this code unless you set the CRYPTOGRAPHY_OPENSSL_NO_LEGACY=1 environment variable, which caused a bunch of packages that build-depend on it to fail to build. After playing whack-a-mole setting that environment variable in a few packages' build processes, I decided I didn't want to be caught in the middle here and filed an upstream issue to see if I could get Debian's OpenSSL team and cryptography's upstream talking to each other directly.
There was some moderately spirited discussion and the issue remains open, but for the time being the OpenSSL team has effectively reverted the change so it's no longer a pressing problem.

GCC 14 regressions

Continuing from last month, I fixed build failures in pccts (NMU) and trn4.

Python team

I upgraded alembic, automat, gunicorn, incremental, referencing, pympler (fixing compatibility with Python >= 3.10), python-aiohttp, python-asyncssh (fixing CVE-2023-46445, CVE-2023-46446, and CVE-2023-48795), python-avro, python-multidict (fixing a build failure with GCC 14), python-tokenize-rt, python-zipp, pyupgrade, twisted (fixing CVE-2024-41671 and CVE-2024-41810), zope.exceptions, zope.interface, zope.proxy, zope.security, and zope.testrunner to new upstream versions. In the process, I added myself to Uploaders for zope.interface; I'm reasonably comfortable with the Zope Toolkit and I seem to be gradually picking up much of its maintenance in Debian. A few of these required their own bits of yak-shaving: I improved some Multi-Arch: foreign tagging (python-importlib-metadata, python-typing-extensions, python-zipp). I fixed build failures in pipenv, python-stdlib-list, psycopg3, and sen, and fixed autopkgtest failures in autoimport (upstream PR), python-semantic-release and rstcheck. Upstream for zope.file (not in Debian) filed an issue about a test failure with Python 3.12, which I tracked down to a Python 3.12 compatibility PR in zope.security. I made python-nacl build reproducibly (upstream PR). I moved aliased files from / to /usr in timekpr-next (#1073722).

Installer team

I applied a patch from Ubuntu to make os-prober support building with the noudeb profile (#983325).
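Going back to the getaddrinfo duplicates described earlier in this post: one defensive pattern for test code is to drop exact duplicates from the result list before binding a listener per entry. The following is only a minimal sketch of that idea, not the fix that went into python-asyncssh or Ruby; the helper names `dedupe_addrinfo` and `bind_listeners` are made up for illustration.

```python
import socket


def dedupe_addrinfo(results):
    """Drop exact duplicates from a getaddrinfo() result list, keeping order.

    Each entry is the usual 5-tuple:
    (family, type, proto, canonname, sockaddr).
    """
    seen = set()
    unique = []
    for family, socktype, proto, canonname, sockaddr in results:
        key = (family, socktype, proto, sockaddr)
        if key not in seen:
            seen.add(key)
            unique.append((family, socktype, proto, canonname, sockaddr))
    return unique


def bind_listeners(host, port):
    """Bind one listening socket per distinct address that host resolves to.

    Hypothetical helper: with a fixed port, binding the same address twice
    would fail with "address already in use", so duplicates are removed first.
    """
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    listeners = []
    try:
        for family, socktype, proto, _canonname, sockaddr in dedupe_addrinfo(infos):
            sock = socket.socket(family, socktype, proto)
            listeners.append(sock)
            sock.bind(sockaddr)
            sock.listen()
    except OSError:
        # Clean up anything already opened before re-raising.
        for sock in listeners:
            sock.close()
        raise
    return listeners
```

This only papers over the symptom in the caller; the /etc/hosts entry that maps ip6-localhost to 127.0.0.1 is still wrong and still worth fixing at the source.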

18 August 2024

Debian Brasil: Debian Day 2024 in Pouso Alegre/MG - Brazil

by Thiago Pezzo and Giovani Ferreira

Local celebrations of Debian Day 2024 also took place in Pouso Alegre, MG, Brazil. This year we managed to organize two days of talks! On the morning of Wednesday, 14 August 2024, we were at the Pouso Alegre campus of the Instituto Federal de Educação, Ciência e Tecnologia do Sul de Minas Gerais (IFSULDEMINAS). We gave an introductory presentation on the Debian Project, its operating system and its community, to all three years of the Technical High School Course in Informatics. The event was restricted to IFSULDEMINAS and about 60 students attended.

Then, on the morning of Saturday, 17 August 2024, we held an event open to the community at the Universidade do Vale do Sapucaí (Univás), with institutional support from the Information Systems course. We talked about the Debian Project with Giovani Ferreira (Debian Developer); about the Debian pt_BR translation team with Thiago Pezzo; about day-to-day experiences of using free software with Virgínia Cardoso; and about how to set up a production-ready development environment using Debian and Docker with Marcos Antônio dos Santos. After the talks, snacks, coffee and cake were served while the participants chatted, asked questions and shared experiences.

We would like to thank everyone who helped us:

Some photos:
Presentation at the Pouso Alegre campus of IFSULDEMINAS 1
Presentation at the Pouso Alegre campus of IFSULDEMINAS 2
Presentation at the Fática campus of UNIVÁS 1
Presentation at the Fática campus of UNIVÁS 2
Presentation at the Fática campus of UNIVÁS 3
Presentation at the Fática campus of UNIVÁS 4

22 June 2024

Russ Allbery: Review: And the Stars Will Sing

Review: And the Stars Will Sing, by Michelle Browne
Series: The Meaning Wars #1
Publisher: Michelle Browne
Copyright: 2012, 2021
Printing: 2021
ASIN: B0075G7GEA
Format: Kindle
Pages: 85
And the Stars Will Sing is a self-published science fiction novella, the first of a (currently) five book series. I believe it may be Browne's first publication, although I don't have a good data source to confirm. Crystal Weiss is a new graduate from Mars, about to leave the solar system to her first job assignment: installation of a permanent wormhole in the vicinity of Messier 14. Her expertise is the placement calculations. The heavy mathematical lifting is of course done by computers, but humans have to do the mapping and some of the guidance. And the Stars Will Sing is an epistolary novel, told in the form of her letters to her friend Sarah. I feel bad when I stumble across a book like this. I want to stick with my habit of writing a review of each book I read, but it's one thing to pan a bad book by a famous author and another thing to pick on a self-published novella that I read due to some recommendation or mention whose details I've forgotten. Worse, I think this wasn't even the recommended book; I looked up the author, saw that the first of a series was on sale, and thought "oh, hey, I like epistolary novels and I'm in the mood for some queer space opera." This book didn't seem that queer (there is a secondary lesbian relationship but the main relationship seemed rather conventional), but I'll get to the romance in a moment. I was not the reader for this book. There's a reason why most of the books I read are from traditional publishers; I'm too critical of a reader for a lot of early self-published work. It's not that I dislike self-publishing as a concept (many self-published books are excellent and the large publishers have numerous problems), but publishers enforce a quality bar. Inconsistently, unfairly, and by rejecting a lot of good work, but still, they do. I'm fairly sure traditional publishers would have passed on this book; the quality of the writing isn't there yet. (It's certainly a better book than I could have written!
But that's why I'm writing my reviews over in my quiet corner of the Internet and not selling fiction to other people.) The early chapters aren't too bad, although they have a choppy, cliched style that more writing experience usually smoothes out. The later chapters have more dialogue, enough that I started wondering how Crystal could remember that much dialogue verbatim to put into a letter, and it's not good. All of the characters talk roughly the same (even the aliens), the dialogue felt even more cliched than the rest of the writing, and I started getting distracted by the speech tags. Crystal comes across as very young, impulsive, and a drama magnet who likes being all up in her coworkers' business. None of these are objective flaws in the book, but I could tell early on that I was going to find her annoying. She has a heavily-foreshadowed enemies-to-lovers thing with one of her male coworkers. Her constant complaining about him at the start of the story was bad enough, but the real problem is that in the very few places where he has more personality than plastic lawn furniture, he's being obnoxious to Crystal. I'm used to being puzzled by a protagonist's choice in love interests, but this one felt less like an odd personality choice and more a lack of writing skill. Even if the relationship is being set up for failure (not true by the end of this book), you've got to help me understand what the protagonist saw in him or was getting out of the relationship. The plot was so predictable that it ironically surprised me. I was sure that some sort of twist or complication was coming, but no. I will give Browne some credit for writing a slightly more realistic character reaction to violence than most SF authors, but there was nothing in the plot to hold my interest. The world-building was generic science fiction with aliens. 
It had a few glimmers of promise, but there was some sort of psychic hand-waving involved in siting wormholes that didn't work for me and the plot climax made no sense to me whatsoever. This is the kind of bad book that I don't want to hold against the writer. Twelve years later and with numerous other novels and novellas under her belt, her writing is probably much better. I do think this book would have benefited from an editor telling her it wasn't good enough for publication yet, but that's not how the Kindle self-publishing world works. Mostly, this is my fault: I half-followed a recommendation into an area of publishing that I know from past experience I should avoid without a solid review from an equally critical reader. Followed by The Stolen, a two-story collection. Rating: 3 out of 10

1 June 2024

Ian Jackson: What your vote is worth - a back of the envelope calculation

tl;dr: Your vote really counts! Each vote in a UK General Election is worth maybe £100,000 - to you and all your fellow citizens taken together. If you really care about the welfare of everyone affected by actions of the UK government, then it's worth that to you too.

Introduction

It seems a common perception that one vote, in amongst all those millions, doesn't really matter. So maybe it's not worth voting. But, voting is (largely) what determines what the government does - and the government is big. It's as big as all the people. If you are the kind of person who cares about what happens to everyone in your polity and indeed everyone its actions affect, then even your one vote is very important indeed.

A method for back of the envelope calculation

It would be nice to give a quantitative estimate. Many things in our society are measured in money, so let's try taking a stab at calculating the money value of your vote. The argument I'm going to make is this: the government (by which I include the legislature), which is selected by our votes, decides how to spend the national budget. So, basically, I'm going to divide the budget, by the electorate.

UK Parliament

UK Parliamentary elections decide not only the House of Commons, but, through that, the government. The upper house, the House of Lords, has very limited influence. So I think it's fair to regard the Parliamentary election as, simply, controlling that budget. Being lazy, I'm going to use Wikipedia data. We have the size of the electorate, for 2019, 47.6 million. But your influence isn't shared with the whole electorate, only with the other people who also vote. Turnout in 2019 was 67.3%. The 2019 budget isn't listed but I'll just average the 2018 and March 2020 figures £842bn and £873bn, so £857 billion. (Strictly speaking I should add up the budgets for the period of the Parliament, but that seems like a lot of effort.) There's a discrepancy in the timescale we need to account for.
Your vote influences the budgets for several years, depending how long it is until the next election. Taking Wikipedia's list of elections this century there've been 7 in 24 years. So that's an average of about 3.4y. So, multiplying it through, we have (£857b * (24 / 7)) / (47.6M * 67.3%), giving a guess at the value of your UK General Election vote: £92,000.

European Parliament

2022 budget for the European Union (Wikipedia again) was €170.6 bn. The last election, in 2019, had a turnout of 198,352,638. Each EU Parliament lasts 5 years. The Parliament, however, shares responsibility for the budget with the European Council, which is controlled, ultimately, by national governments. We have to pick a numerical value for the Parliament's share of the influence. Over the past years the Parliament has gradually been more willing to exercise its powers in this area. I'm going to arbitrarily call its share 50%. The calculation, then, is €170.6 bn * 5 * 50% / 198M, giving a guess at the value of your EU Parliamentary Election vote: €2150. This much smaller figure reflects simply that the EU doesn't spend very much money, for a polity of its size. (Those stories in the British press giving the impression that the EU is massively wasteful are, simply, lies.) The interaction of this calculation with the Council's share of the influence, and with national budgets, is a bit of a question, but given the much smaller amounts involved, it doesn't seem worth thinking about that too hard.

Only if you care about other people as much as yourself!

All of this is only true for you if you value and want to help everyone in your society. That includes immigrants, women, unemployed people, disabled people, people who are much poorer or richer than you, etc. If you think about it in purely personal terms, your vote is hardly worth anything - because while the effect of your vote, overall, is very large, that effect is shared by everyone in your polity.
So if you only care about yourself, voting is a total waste of time. The more selfish and xenophobic and racist and so on you are - caring only about people like yourself - the less your vote is worth. This is why voting is rightly seen as a civic duty. I just spent 30 to courier my EP vote to Den Haag. That only makes sense because I'm very willing to spend that 30 to try to improve the spending of the €2000 or so that's my share of the EU budget.

This is a very rough analysis

These calculations neglect a lot of very important things: politics isn't just about the allocation of resources. It's also about values, and bad politics can seriously harm people. Arguably many of those effects of your vote are much more important than just how the budget is set and spent. It would be interesting to see an attempt at a similar analysis but taking into account life and death questions like hate crime, traffic violence, healthcare, refugees' welfare, and so on. I'm not sure how to approach that. Maybe some real social scientists have done so? References welcome. Also, even on its own terms, this analysis is very rough and ready. We haven't modelled the ability of the government to change its tax rates; perhaps we should be multiplying GDP (or some other better measure) by "90% percentile total tax rate amongst countries like this one". The amount of influence that can be wielded by one vote is probably nonlinear in the size of the political faction, but IDK in which direction. In unfair voting systems like the UK's, some people's votes are worth much more than others. In a very marginal constituency, which is a target seat, your vote might be worth tens of millions. In a safe seat, it might only be worth a few thousand. And in practical terms you don't get to choose precisely the policies you want; you have to pick a party, which is sometimes very much a question of the lesser evil. So, there is much I haven't modelled.
But the key point stands:

Conclusion

Although your vote is diluted by everyone else's votes, together, we control the government, which affects us all. So if you care about the whole of society, the big numbers in the divisor, and the numerator, cancel out. You can think of your vote as controlling one citizen's worth of government activity.
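For anyone who wants to check or vary the arithmetic, the two estimates in this post can be reproduced with a few lines of Python. This is just a sketch of the same back-of-the-envelope division, using the figures quoted in the post; the function name `vote_value` and its parameters are invented for illustration.

```python
def vote_value(annual_budget, years_per_term, votes_cast, influence_share=1.0):
    """Budget controlled over one term, divided by the number of votes cast."""
    return annual_budget * years_per_term * influence_share / votes_cast


# UK General Election: GBP 857bn a year, 7 elections in 24 years,
# electorate of 47.6 million at 67.3% turnout.
uk = vote_value(
    annual_budget=857e9,
    years_per_term=24 / 7,
    votes_cast=47.6e6 * 0.673,
)

# European Parliament: EUR 170.6bn a year, 5-year term, 198,352,638 votes,
# and the arbitrary 50% share of influence over the budget.
eu = vote_value(
    annual_budget=170.6e9,
    years_per_term=5,
    votes_cast=198_352_638,
    influence_share=0.5,
)

print(f"UK vote: roughly GBP {uk:,.0f}")
print(f"EU vote: roughly EUR {eu:,.0f}")
```

Changing `influence_share` or the turnout figure shows quickly how sensitive the result is to those guesses, which is really the point of calling it a rough analysis.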
edited 2024-06-01 09:40 Z to fix a grammar botch



18 April 2024

Jonathan McDowell: Sorting out backup internet #2: 5G modem

Having set up recursive DNS it was time to actually sort out a backup internet connection. I live in a Virgin Media area, but I still haven't forgiven them for my terrible Virgin experiences when moving here. Plus it involves a bigger contractual commitment. There are no altnets locally (though I'm watching youfibre who have already rolled out in a few Belfast exchanges), so I decided to go for a 5G modem. That gives some flexibility, and is a bit easier to get up and running. I started by purchasing a ZTE MC7010. This had the advantage of being reasonably cheap off eBay, not having any wifi functionality I would just have to disable (it's going to plug into the same router the FTTP connection terminates on), being outdoor mountable should I decide to go that way, and, finally, being powered via PoE. For now this device sits on the window sill in my study, which is at the top of the house. I printed a table stand for it which mostly does the job (though not as well with a normal, rather than flat, network cable). The router lives downstairs, so I've extended a dedicated VLAN through the study switch, down to the core switch and out to the router. The PoE study switch can only do GigE, not 2.5Gb/s, but at present that's far from the limiting factor on the speed of the connection. The device is 3 branded, and, as it happens, I've ended up with a 3 SIM in it. Up until recently my personal phone was with them, but they've kicked me off Go Roam, so I've moved. Going with 3 for the backup connection provides some slight extra measure of resiliency; we now have devices on all 4 major UK networks in the house. The SIM is a preloaded data only SIM good for a year; I don't expect to use all of the data allowance, but I didn't want to have to worry about unexpected excess charges.
Performance turns out to be disappointing; I end up locking the device to 4G as the 5G signal is marginal - leaving it enabled results in constantly switching between 4G + 5G and a significant extra latency. The smokeping graph below shows a brief period where I removed the 4G lock and allowed 5G:

[Figure: Smokeping 4G vs 5G graph]

(There's a handy zte.js script to allow doing this from the device web interface.) I get about 10Mb/s sustained downloads out of it. EE/Vodafone did not lead to significantly better results, so for now I'm accepting it is what it is. I tried relocating the device to another part of the house (a little tricky while still providing switch-based PoE, but I have an injector), without much improvement. Equally pinning the 4G to certain bands provided a short term improvement (I got up to 40-50Mb/s sustained), but not reliably so.

[Figure: speedtest.net results]

This is disappointing, but if it turns out to be a problem I can look at mounting it externally. I also assume as 5G is gradually rolled out further things will naturally improve, but that might be wishful thinking on my part. Rather than wait until my main link had a problem I decided to try a day working over the 5G connection. I spend a lot of my time either in browser based apps or accessing remote systems via SSH, so I'm reasonably sensitive to a jittery or otherwise flaky connection. I picked a day that I did not have any meetings planned, but as it happened I ended up with an adhoc video call arranged. I'm pleased to say that it all worked just fine; definitely noticeable as slower than the FTTP connection (to be expected), but all workable and even the video call was fine (at least from my end).
Looking at the traffic graph shows the expected ~ 10Mb/s peak (actually a little higher, and looking at the FTTP stats for previous days not out of keeping with what we see there), and you can just about see the ~ 3Mb/s symmetric use by the video call at 2pm:

[Figure: 4G traffic during the work day]

The test run also helped iron out the fact that the content filter was still enabled on the SIM, but that was easily resolved. Up next, vaguely automatic failover.

26 January 2024

Bastian Venthur: Investigating popularity of Python build backends over time

Inspired by a Mastodon post by Françoise Conil, who investigated the current popularity of build backends used in pyproject.toml files, I wanted to investigate how the popularity of build backends used in pyproject.toml files evolved over the years since the introduction of PEP-0517 in 2015.

Getting the data

Tom Forbes provides a huge dataset that contains information about every file within every release uploaded to PyPI. To get the current dataset, we can use:
curl -L --remote-name-all $(curl -L "https://github.com/pypi-data/data/raw/main/links/dataset.txt")
This will download approximately 30GB of parquet files, providing detailed information about each file included in a PyPI upload, including:
  1. project name, version and release date
  2. file path, size and line count
  3. hash of the file
The dataset does not contain the actual files themselves though, more on that in a moment.

Querying the dataset using duckdb

We can now use duckdb to query the parquet files directly. Let's look into the schema first:
describe select * from '*.parquet';
 
  column_name      | column_type | null
  varchar          | varchar     | varchar
  -----------------+-------------+---------
  project_name     | VARCHAR     | YES
  project_version  | VARCHAR     | YES
  project_release  | VARCHAR     | YES
  uploaded_on      | TIMESTAMP   | YES
  path             | VARCHAR     | YES
  archive_path     | VARCHAR     | YES
  size             | UBIGINT     | YES
  hash             | BLOB        | YES
  skip_reason      | VARCHAR     | YES
  lines            | UBIGINT     | YES
  repository       | UINTEGER    | YES
  -----------------+-------------+---------
  11 rows, 6 columns
 
From all files mentioned in the dataset, we only care about pyproject.toml files that are in the project's root directory. Since we'll still have to download the actual files, we need to get the path and the repository to construct the corresponding URL to the mirror that contains all files in a bunch of huge git repositories. Some files are not available on the mirrors; to skip these, we only take files where the skip_reason is empty. We also care about the timestamp of the upload (uploaded_on) and the hash to avoid processing identical files twice:
select
    path,
    hash,
    uploaded_on,
    repository
from '*.parquet'
where
    skip_reason == '' and
    lower(string_split(path, '/')[-1]) == 'pyproject.toml' and
    len(string_split(path, '/')) == 5
order by uploaded_on desc
This query runs for a few minutes on my laptop and returns ~1.2M rows.

Getting the actual files

Using the repository and path, we can now construct a URL from which we can fetch the actual file for further processing:
url = f"https://raw.githubusercontent.com/pypi-data/pypi-mirror-{repository}/code/{path}"
We can download the individual pyproject.toml files and parse them to read the build-backend into a dictionary mapping the file-hash to the build backend. Downloads on GitHub are rate-limited, so downloading 1.2M files will take a couple of days. By skipping files with a hash we've already processed, we can avoid downloading the same file more than once, cutting the required downloads by circa 50%.

Results

Assuming the data is complete and my analysis is sound, these are the findings: There is a surprising number of build backends in use, but the overall number of uploads per build backend decreases quickly, with a long tail of single uploads:
>>> results.backend.value_counts()
backend
setuptools        701550
poetry            380830
hatchling          56917
flit               36223
pdm                11437
maturin             9796
jupyter             1707
mesonpy              625
scikit               556
                   ...
postry                 1
tree                   1
setuptoos              1
neuron                 1
avalon                 1
maturimaturinn         1
jsonpath               1
ha                     1
pyo3                   1
Name: count, Length: 73, dtype: int64
We pick only the top 4 build backends, and group the remaining ones (including PDM and Maturin) into "other" so they are accounted for as well. The following plot shows the relative distribution of build backends over time. Each bin represents a time span of 28 days. I chose 28 days to reduce visual clutter. Within each bin, the height of the bars corresponds to the relative proportion of uploads during that time interval:

[Figure: Relative distribution of build backends over time]

Looking at the right side of the plot, we see the current distribution. It confirms Françoise's findings about the current popularity of build backends.

Between 2018 and 2020 the graph exhibits significant fluctuations, due to the relatively low number of uploads utilizing pyproject.toml files. During that early period, Flit started as the most popular build backend, but was eventually displaced by Setuptools and Poetry. From 2020 onward, the overall usage of pyproject.toml files increased significantly. By the end of 2022, the share of Setuptools peaked at 70%. After 2020, other build backends experienced a gradual rise in popularity. Among these, Hatch emerged as a notable contender, steadily gaining traction and ultimately stabilizing at 10%.

We can also look into the absolute distribution of build backends over time:

[Figure: Absolute distribution of build backends over time]

The plot shows that Setuptools has the strongest growth trajectory, surpassing all other build backends. Poetry and Hatch are growing at a comparable rate, but since Hatch started roughly 4 years after Poetry, it's lagging behind in popularity. Despite not being among the most widely used backends anymore, Flit maintains a steady and consistent growth pattern, indicating its enduring relevance in the Python packaging landscape. The script for downloading and analyzing the data can be found in my GitHub repository.
It contains the results of the duckdb query (so you don't have to download the full dataset) and the pickled dictionary, mapping the file hashes to the build backends, saving you days for downloading and analyzing the pyproject.toml files yourself.

2 October 2023

Aigars Mahinovs: Debconf 23 photos all

Two weeks have passed since Debconf 23 came to a close in Kochi, Kerala, India this year. In keeping with the more relaxed nature of Debconf in India, the rest of my photos from the event were to be published about two weeks from the end of the event. That will give me a bit more time to process them correctly and also give all of you a chance to see these pictures with fresh eyes and stir up new memories from the event. In the end we are looking at 653 photos and one video. Several different group photos, including a return of the pool group photo that was missing from the event since Mexico in 2006! This year was the first for a new camera (Canon R7) and I am quite happy with the results, even if I still need to learn a lot about this new beast. Also the gradual improvements of panorama stitching software (Hugin) meant that this year I did not need to manually correct any face-melt events on any of the group photos. So that is cool!

[Figure: DebConf 23 pool group photo]

You can find all my photos on: Also, don't forget to explore the rest of the Git LFS share content - there are very many great photos by others this year as well!
