
21 March 2024

Ian Jackson: How to use Rust on Debian (and Ubuntu, etc.)

tl;dr: Don't just apt install rustc cargo. Either do that and make sure to use only Rust libraries from your distro (with the tiresome config runes below); or, just use rustup.

Don't do the obvious thing; it's never what you want. Debian ships a Rust compiler, and a large number of Rust libraries. But if you just do things the obvious default way, with apt install rustc cargo, you will end up using Debian's compiler but upstream libraries, directly and uncurated from crates.io. This is not what you want. There are about two reasonable things to do, depending on your preferences.

Q. Download and run whatever code from the internet? The key question is this: are you comfortable downloading code, directly from hundreds of upstream Rust package maintainers, and running it? That's what cargo does. It's one of the main things it's for. Debian's cargo behaves, in this respect, just like upstream's. Let me say that again: Debian's cargo promiscuously downloads code from crates.io just like upstream cargo. So if you use Debian's cargo in the most obvious way, you are still downloading and running all those random libraries. The only thing you're avoiding downloading is the Rust compiler itself, which is precisely the part that is most carefully maintained, and of least concern. Debian's cargo can even download from crates.io when you're building official Debian source packages written in Rust: if you run dpkg-buildpackage, the downloading is suppressed; but a plain cargo build will try to obtain and use dependencies from the upstream ecosystem. (Happily, if you do this, it's quite likely to bail out early due to version mismatches, before actually downloading anything.)

Option 1: WTF, no, I don't want curl|bash. OK, but then you must limit yourself to libraries available within Debian. Each Debian release provides a curated set. It may or may not be sufficient for your needs. Many capable programs can be written using the packages in Debian. But any upstream Rust project that you encounter is likely to be a pain to get working, unless its maintainers specifically intend to support this. (This is fairly rare, and the Rust tooling doesn't make it easy.) To go with this plan, apt install rustc cargo and put this in your configuration, in $HOME/.cargo/config.toml:
# look up dependencies in Debian's local crate registry...
[source.debian-packages]
directory = "/usr/share/cargo/registry"

# ...instead of fetching them from crates.io
[source.crates-io]
replace-with = "debian-packages"
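For instance, a minimal sketch of the whole Option 1 workflow, assuming a program whose only dependency is serde (an illustrative choice: Debian packages each crate FOO as librust-FOO-dev, so serde becomes librust-serde-dev):

apt install rustc cargo librust-serde-dev
cargo build   # serde now comes from /usr/share/cargo/registry; nothing is downloaded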
This configuration causes cargo to look in /usr/share for dependencies, rather than downloading them from crates.io. You must then install the librust-FOO-dev packages for each of your dependencies, with apt. This will allow you to write your own program in Rust, and build it using cargo build.

Option 2: Biting the curl|bash bullet. If you want to build software that isn't specifically targeted at Debian's Rust, you will probably need to use packages from crates.io, not from Debian. If you're going to do that, there is little point in not using rustup to get the latest compiler. rustup's install rune is alarming, but cargo will be doing exactly the same kind of thing, only worse (because it trusts many more people) and more hidden. So in this case: do run the curl|bash install rune. Hopefully the Rust project you are trying to build has shipped a Cargo.lock; that contains hashes of all the dependencies that they last used and tested. If you run cargo build --locked, cargo will only use those versions, which are hopefully OK. And you can run cargo audit to see if there are any reported vulnerabilities or problems. But you'll have to bootstrap this with cargo install --locked cargo-audit; cargo-audit is from the RUSTSEC folks, who do care about this kind of thing, so hopefully running their code (and their dependencies) is fine. Note the --locked, which is needed because cargo's default behaviour is wrong. (This whole sequence is sketched at the end of this post.)

Privilege separation. This approach is rather alarming. For my personal use, I wrote a privsep tool which allows me to run all this upstream Rust code as a separate user. That tool is nailing-cargo. It's not particularly well productised, or tested, but it does work for at least one person besides me. You may wish to try it out, or consider alternative arrangements. Bug reports and patches welcome.

OMG, what a mess. Indeed. There are a large number of technical and social factors at play. cargo itself is deeply troubling, both in principle and in detail. I often find myself severely disappointed with its maintainers' decisions. In mitigation, much of the wider Rust upstream community does take this kind of thing very seriously, and often makes good choices. RUSTSEC is one of the results. Debian's technical arrangements for Rust packaging are quite dysfunctional, too: IMO the scheme is based on fundamentally wrong design principles. But the Debian Rust packaging team is dynamic, constantly working the update treadmills; and the team is generally welcoming and helpful. Sadly, last time I explored the possibility, the Debian Rust Team didn't have the appetite for more fundamental changes to the workflow (including, for example, changes to dependency version handling). Significant improvements to upstream cargo's approach seem unlikely, too; we can only hope that eventually someone might manage to supplant it.
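As promised, the whole Option 2 sequence end-to-end, as a hedged sketch (the curl rune is the standard installer from rustup.rs; the rest is exactly the advice above):

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh   # the alarming but honest rustup rune
cargo install --locked cargo-audit   # one-off: bootstrap the RUSTSEC auditor
cargo build --locked                 # build only the versions pinned in Cargo.lock
cargo audit                          # check Cargo.lock against the RUSTSEC advisory database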
edited 2024-03-21 21:49 to add a cut tag



12 July 2023

Reproducible Builds: Reproducible Builds in June 2023

Welcome to the June 2023 report from the Reproducible Builds project! In our reports, we outline the most important things that we have been up to over the past month. As always, if you are interested in contributing to the project, please visit our Contribute page on our website.


We are very happy to announce the upcoming Reproducible Builds Summit, which is set to take place from October 31st to November 2nd 2023, in the vibrant city of Hamburg, Germany. Our summits are a unique gathering that brings together attendees from diverse projects, united by a shared vision of advancing the Reproducible Builds effort. During this enriching event, participants will have the opportunity to engage in discussions, establish connections and exchange ideas to drive progress in this vital field. Our aim is to create an inclusive space that fosters collaboration, innovation and problem-solving. We are thrilled to host the seventh edition of this exciting event, following the success of previous summits in various iconic locations around the world, including Venice, Marrakesh, Paris, Berlin and Athens. If you're interested in joining us this year, please make sure to read the event page, which has more details about the event and location. (You may also be interested in attending PackagingCon 2023, held a few days before in Berlin.)
This month, Vagrant Cascadian will present at FOSSY 2023 on the topic of Breaking the Chains of Trusting Trust:
Corrupted build environments can deliver compromised cryptographically signed binaries. Several exploits in critical supply chains have been demonstrated in recent years, proving that this is not just theoretical. The most well secured build environments are still single points of failure when they fail. [ ] This talk will focus on the state of the art from several angles in related Free and Open Source Software projects, what works, current challenges and future plans for building trustworthy toolchains you do not need to trust.
Hosted by the Software Freedom Conservancy and taking place in Portland, Oregon, FOSSY aims to be a community-focused event: "Whether you are a long time contributing member of a free software project, a recent graduate of a coding bootcamp or university, or just have an interest in the possibilities that free and open source software bring, FOSSY will have something for you". More information on the event is available on the FOSSY 2023 website, including the full programme schedule.
Marcel Fourné, Dominik Wermke, William Enck, Sascha Fahl and Yasemin Acar recently published an academic paper in the 44th IEEE Symposium on Security and Privacy titled "It's like flossing your teeth: On the Importance and Challenges of Reproducible Builds for Software Supply Chain Security". The abstract reads as follows:
The 2020 Solarwinds attack was a tipping point that caused a heightened awareness about the security of the software supply chain and in particular the large amount of trust placed in build systems. Reproducible Builds (R-Bs) provide a strong foundation to build defenses for arbitrary attacks against build systems by ensuring that given the same source code, build environment, and build instructions, bitwise-identical artifacts are created.
However, in contrast to other papers that touch on some theoretical aspect of reproducible builds, the authors' paper takes a different approach. Starting with the observation that much of the software industry believes R-Bs are too far out of reach for most projects, and conjoining that with the goal of helping "identify a path for R-Bs to become a commonplace property", the paper has a different methodology:
We conducted a series of 24 semi-structured expert interviews with participants from the Reproducible-Builds.org project, and iterated on our questions with the reproducible builds community. We identified a range of motivations that can encourage open source developers to strive for R-Bs, including indicators of quality, security benefits, and more efficient caching of artifacts. We identify experiences that help and hinder adoption, which heavily include communication with upstream projects. We conclude with recommendations on how to better integrate R-Bs with the efforts of the open source and free software community.
A PDF of the paper is now available, as is an entry on the CISPA Helmholtz Center for Information Security website and an entry under the TeamUSEC Human-Centered Security research group.
On our mailing list this month:
The antagonist is David Schwartz, who correctly says "There are dozens of complex reasons why what seems to be the same sequence of operations might produce different end results", but goes on to say "I totally disagree with your general viewpoint that compilers must provide for reproducability [sic]". Dwight Tovey and I (Larry Doolittle) argue for reproducible builds. I assert that "Any program, especially a mission-critical program like a compiler, that cannot reproduce a result at will is broken." Also, it's commonplace to take a binary from the net and check to see if it was trojaned, by attempting to recreate it from source.

Lastly, there were a few changes to our website this month too, including Bernhard M. Wiedemann adding a simplified Rust example to our documentation about the SOURCE_DATE_EPOCH environment variable [ ], Chris Lamb making it easier to parse our summit announcement at a glance [ ], Mattia Rizzolo adding the summit announcement itself [ ][ ][ ] and Rahul Bajaj adding a taxonomy of variations in build environments [ ].
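For readers who haven't met it, SOURCE_DATE_EPOCH is the reproducible-builds.org convention for pinning embedded timestamps: a Unix timestamp that build tools honour instead of the current time. A minimal, hypothetical sketch of the idea in a Rust build script (not Bernhard's actual example):

use std::env;

fn main() {
    // Honour SOURCE_DATE_EPOCH if set, so that two builds of the same
    // source embed the same timestamp and stay bit-for-bit identical.
    let timestamp: u64 = env::var("SOURCE_DATE_EPOCH")
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(0);
    println!("cargo:rustc-env=BUILD_TIMESTAMP={timestamp}");
}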

Distribution work 27 reviews of Debian packages were added, 40 were updated and 8 were removed this month, adding to our knowledge about identified issues. A new randomness_in_documentation_generated_by_mkdocs toolchain issue was added by Chris Lamb [ ], and the deterministic flag was dropped from the paths_vary_due_to_usrmerge issue, as we are not currently testing usrmerge issues [ ].
Roland Clobus posted his 18th update on the status of reproducible Debian ISO images to our mailing list. Roland reported that "all major desktops build reproducibly with bullseye, bookworm, trixie and sid", but he also mentioned, amongst many changes, that not only are the non-free images being built (and are reproducible) but that the live images are generated officially by Debian itself. [ ]
Jan-Benedict Glaw noticed a problem when building NetBSD for the VAX architecture. Noting that reproducible builds are "probably not as reproducible as we thought", Jan-Benedict goes on to describe how two builds from different source directories won't produce the same result, and adds various notes about sub-optimal handling of the CFLAGS environment variable. [ ]
F-Droid added 21 new reproducible apps in June, resulting in a new record of 145 reproducible apps in total [ ]. (This page now sports missing data for March to May 2023.) F-Droid contributors also reported an issue with broken resources in APKs making some builds unreproducible. [ ]
Bernhard M. Wiedemann published another monthly report about reproducibility within openSUSE.

Upstream patches

Testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In June, a number of changes were made by Holger Levsen, including:
  • Additions to a (relatively) new Documented Jenkins Maintenance (djm) script to automatically shrink a cache and save a backup of old data [ ], automatically split out the previous month's data from logfiles into specially-named files [ ], prevent concurrent remote logfile fetches by using a lock file [ ] and to add/remove various debugging statements [ ].
  • Updates to the automated system health checks to, for example, correctly detect new kernel warnings due to a wording change [ ] and to explicitly observe which old/unused kernels should be removed [ ]. This was related to an improvement so that various kernel issues on Ubuntu-based nodes are automatically fixed. [ ]
Holger and Vagrant Cascadian updated all thirty-five hosts running Debian on the amd64, armhf, and i386 architectures to Debian bookworm, with the exception of the Jenkins host itself, which will be upgraded after the release of Debian 12.1. In addition, Mattia Rizzolo updated the email configuration for the @reproducible-builds.org domain to correctly accept incoming mails from jenkins.debian.net [ ] as well as to set up DomainKeys Identified Mail (DKIM) signing [ ]. And working together with Holger, Mattia also updated the Jenkins configuration to start testing Debian trixie, which meant that testing of Debian buster stopped. And, finally, Jan-Benedict Glaw contributed patches for improved NetBSD testing.

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website, or get in touch with us directly.

17 June 2022

Antoine Beaupré: Matrix notes

I have some concerns about Matrix (the protocol, not the movie that came out recently, although I do have concerns about that as well). I've been watching the project for a long time, and it seems like a promising alternative to many protocols like IRC, XMPP, and Signal. This review may sound a bit negative, because it focuses on those concerns. I am the operator of an IRC network and people keep asking me to bridge it with Matrix. I have myself considered just giving up on IRC and converting to Matrix. This article is a living document exploring my research of that problem space. The TL;DR is that no, I'm not setting up a bridge just yet, and I'm still on IRC. This article was written over the course of the last three months, but I have been watching the Matrix project for years (my logs seem to say 2016 at least). The article is rather long. It will likely take you half an hour to read, so copy this over to your ebook reader, your tablet, or dead trees, and lean back and relax as I show you around the Matrix. Or, alternatively, just jump to a section that interests you, most likely the conclusion.

Introduction to Matrix Matrix is an "open standard for interoperable, decentralised, real-time communication over IP. It can be used to power Instant Messaging, VoIP/WebRTC signalling, Internet of Things communication - or anywhere you need a standard HTTP API for publishing and subscribing to data whilst tracking the conversation history". It's also (when compared with XMPP) "an eventually consistent global JSON database with an HTTP API and pubsub semantics - whilst XMPP can be thought of as a message passing protocol." According to their FAQ, the project started in 2014, has about 20,000 servers, and millions of users. Matrix works over HTTPS, but on a special port: 8448.

Security and privacy I have some concerns about the security promises of Matrix. It's advertised as "secure" with "E2E [end-to-end] encryption", but how does it actually work?

Data retention defaults One of my main concerns with Matrix is data retention, which is a key part of security in a threat model where (for example) a hostile state actor wants to surveil your communications and can seize your devices. On IRC, servers don't actually keep messages all that long: they pass them along to other servers and clients as fast as they can, only keep them in memory, and move on to the next message. There are no concerns about data retention on messages (and their metadata) other than at the network layer. (I'm ignoring the issues with user registration, which is a separate, if valid, concern.) Obviously, a hostile server could log everything passing through it, but IRC federations are normally tightly controlled. So, if you trust your IRC operators, you should be fairly safe. Obviously, clients can (and often do, even if OTR is configured!) log all messages, but this is generally not the default. Irssi, for example, does not log by default. IRC bouncers are more likely to log to disk, of course, to be able to do what they do. Compare this to Matrix: when you send a message to a Matrix homeserver, that server first stores it in its internal SQL database. Then it will transmit that message to all clients connected to that server and room, and to all other servers that have clients connected to that room. Those remote servers, in turn, will keep a copy of that message and all its metadata in their own database, by default forever. In encrypted rooms those messages are encrypted, but their metadata is not. There is a mechanism to expire entries in Synapse, but it is not enabled by default. So one should generally assume that a message sent on Matrix never expires.
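For reference, Synapse's message retention support has to be switched on explicitly in homeserver.yaml; a minimal sketch (the option names follow Synapse's documented retention settings, but the values here are purely illustrative):

retention:
  enabled: true
  default_policy:
    min_lifetime: 1d   # events may not be pruned before one day
    max_lifetime: 1y   # events must be pruned after one year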

GDPR in the federation But even if that setting were enabled by default, how do you control it? This is a fundamental problem of the federation: if any user is allowed to join a room (which is the default), those users' servers will log all content and metadata from that room. That includes private, one-on-one conversations, since those are essentially rooms as well. In the context of the GDPR, this is really tricky: who is the responsible party (known as the "data controller") here? It's basically any yahoo who fires up a home server and joins a room. In a federated network, one has to wonder whether GDPR enforcement is even possible at all. But in Matrix in particular, if you want to enforce your right to be forgotten in a given room, you would have to:
  1. enumerate all the users that ever joined the room while you were there
  2. discover all their home servers
  3. start a GDPR procedure against all those servers
I recognize this is a hard problem to solve while still keeping an open ecosystem. But I believe that Matrix should have much stricter defaults towards data retention than it has right now. Message expiry should be enforced by default, for example. (Note that there are also redaction policies that could be used to implement part of the GDPR automatically; see the privacy policy discussion below on that.) Also keep in mind that, in the brave new peer-to-peer world that Matrix is heading towards, the boundary between server and client is likely to be fuzzier, which would make applying the GDPR even more difficult. Update: this comment links to this post (in German) which apparently studied the question and concluded that Matrix is not GDPR-compliant. In fact, maybe Synapse should be designed so that there's no configurable flag to turn off data retention. A bit like how most system loggers in UNIX (e.g. syslog) come with a log retention system that typically rotates logs after a few weeks or months. Historically, this was designed to keep hard drives from filling up, but it also has the added benefit of limiting the amount of personal information kept on disk in this modern day. (Arguably, syslog doesn't rotate logs on its own, but, say, Debian GNU/Linux, as an installed system, does have log retention policies well defined for installed packages, and those can be discussed.) And "no expiry" is definitely a bug.

Matrix.org privacy policy When I first looked at Matrix, five years ago, Element.io was called Riot.im and had a rather dubious privacy policy:
We currently use cookies to support our use of Google Analytics on the Website and Service. Google Analytics collects information about how you use the Website and Service. [...] This helps us to provide you with a good experience when you browse our Website and use our Service and also allows us to improve our Website and our Service.
When I asked Matrix people about why they were using Google Analytics, they explained this was for development purposes and they were aiming for velocity at the time, not privacy (paraphrasing here). They also included a "free to snitch" clause:
If we are or believe that we are under a duty to disclose or share your personal data, we will do so in order to comply with any legal obligation, the instructions or requests of a governmental authority or regulator, including those outside of the UK.
Those are really broad terms, above and beyond what is typically expected legally. Like the current retention policies, such user tracking and ... "liberal" collaboration practices with the state set a bad precedent for other home servers. Thankfully, since the above policy was published (2017), the GDPR was "implemented" (2018) and it seems like both the Element.io privacy policy and the Matrix.org privacy policy have been somewhat improved since. Notable points of the new privacy policies:
  • 2.3.1.1: the "federation" section actually outlines that "Federated homeservers and Matrix clients which respect the Matrix protocol are expected to honour these controls and redaction/erasure requests, but other federated homeservers are outside of the span of control of Element, and we cannot guarantee how this data will be processed"
  • 2.6: users under the age of 16 should not use the matrix.org service
  • 2.10: Upcloud, Mythic Beast, Amazon, and CloudFlare possibly have access to your data (it's nice to at least mention this in the privacy policy: many providers don't even bother admitting to this kind of delegation)
  • Element 2.2.1: mentions many more third parties (Twilio, Stripe, Quaderno, LinkedIn, Twitter, Google, Outplay, PipeDrive, HubSpot, Posthog, Sentry, and Matomo, phew!) used when you are paying Matrix.org for hosting
I'm not super happy with all the trackers they have on the Element platform, but then again you don't have to use that service. Your favorite homeserver (assuming you are not on Matrix.org) probably has their own Element deployment, hopefully without all that garbage. Overall, this is all a huge improvement over the previous privacy policy, so hats off to the Matrix people for figuring out a reasonable policy in such a tricky context. I particularly like this bit:
We will forget your copy of your data upon your request. We will also forward your request to be forgotten onto federated homeservers. However - these homeservers are outside our span of control, so we cannot guarantee they will forget your data.
It's great they implemented those mechanisms and, after all, if there's an hostile party in there, nothing can prevent them from using screenshots to just exfiltrate your data away from the client side anyways, even with services typically seen as more secure, like Signal. As an aside, I also appreciate that Matrix.org has a fairly decent code of conduct, based on the TODO CoC which checks all the boxes in the geekfeminism wiki.

Metadata handling Overall, privacy protections in Matrix mostly concern message contents, not metadata. In other words, who's talking with whom, when, and from where is not well protected. Compared to a tool like Signal, which goes to great lengths to anonymize that data with features like private contact discovery, disappearing messages, sealed senders, and private groups, Matrix is definitely behind. (Note: there is an issue open about message lifetimes in Element since 2020, but it's not even at the MSC stage yet.) This is a known issue (opened in 2019) in Synapse, but it is not just an implementation issue, it's a flaw in the protocol itself. Home servers keep joins/leaves of all rooms, which gives clear-text information about who is talking to whom. Synapse logs may also contain personally identifiable information that home server admins might not be aware of in the first place. Those log rotation policies are separate from the server-level retention policy, which may be confusing for a novice sysadmin. Combine this with the federation: even if you trust your home server to do the right thing, the second you join a public room with third-party home servers, those ideas kind of get thrown out, because those servers can do whatever they want with that information. Again, a problem that is hard to solve in any federation. To be fair, IRC doesn't have a great story here either: any client knows not only who's talking to whom in a room, but also typically their client IP address. Servers can (and often do) obfuscate this, but often that obfuscation is trivial to reverse. Some servers do provide "cloaks" (sometimes automatically), but that's kind of a "slap-on" solution that actually moves the problem elsewhere: now the server knows a little more about the user. Overall, I would worry much more about a Matrix home server seizure than an IRC or Signal server seizure. Signal does get subpoenas, and they can only give out a tiny bit of information about their users: their phone number, their registration date, and the last connection date. Matrix carries a lot more information in its database.

Amplification attacks on URL previews I (still!) run an Icecast server and sometimes share links to it on IRC; those links, obviously, also end up on (more than one!) Matrix home servers, because some people connect to IRC using Matrix. This, in turn, means that Matrix will connect to that URL to generate a link preview. I feel this outlines a security issue, especially because those sockets would be kept open seemingly forever. I tried to warn the Matrix security team but somehow, I don't think this issue was taken very seriously. Here's the disclosure timeline:
  • January 18: contacted Matrix security
  • January 19: response: already reported as a bug
  • January 20: response: can't reproduce
  • January 31: timeout added, considered solved
  • January 31: I respond that I believe the security issue is underestimated, ask for clearance to disclose
  • February 1: response: asking for a two-week delay after the next release (1.53.0), which would include another patch, presumably in two weeks' time
  • February 22: Matrix 1.53.0 released
  • April 14: I notice the release, ask for clearance again
  • April 14: response: referred to the public disclosure
There are a couple of problems here:
  1. the bug was publicly disclosed in September 2020, and not considered a security issue until I notified them, and even then, I had to insist
  2. no clear disclosure policy timeline was proposed or seems established in the project (there is a security disclosure policy but it doesn't include any predefined timeline)
  3. I wasn't informed of the disclosure
  4. the actual solution is a size limit (10MB, already implemented), a time limit (30 seconds, implemented in PR 11784), and a content type allow list (HTML, "media" or JSON, implemented in PR 11936), and I'm not sure it's adequate
  5. (pure vanity:) I did not make it to their Hall of fame
I'm not sure those solutions are adequate because they all seem to assume a single home server will pull that one URL for a little while and then stop. But in a federated network, many (possibly thousands of) home servers may be connected to a single room at once. If an attacker drops a link into such a room, all those servers will connect to that link all at once. This is an amplification attack: a small amount of traffic generates a lot more traffic towards a single target. It doesn't matter that there are size or time limits: the amplification is what matters here. It should also be noted that clients that generate link previews amplify even more, because clients are more numerous than servers. And of course, the default Matrix client (Element) does generate link previews as well. That said, this is possibly not a problem specific to Matrix: any federated service that generates link previews may suffer from it. I'm honestly not sure what the solution is here. Maybe moderation? Maybe link previews are just evil? All I know is that there was this weird bug in my Icecast server and I tried to ring the bell about it, and it feels like it was swept under the rug. Somehow I feel this is bound to blow up again in the future, even with the current mitigation.

Moderation In Matrix, like elsewhere, moderation is a hard problem. There is a detailed moderation guide and much of this problem space is actively being worked on in Matrix right now. A fundamental problem with moderating a federated space is that a user banned from a room can rejoin the room from another server. This is why spam is such a problem in email, and why IRC networks stopped federating with each other ages ago (see the IRC history below for that fascinating story).

The mjolnir bot The mjolnir moderation bot is designed to help with some of those things. It can kick and ban users, and redact all of a user's messages (as opposed to one by one), all of this across multiple rooms. It can also subscribe to a federated block list published by matrix.org to block known abusers (users or servers). Bans are pretty flexible and can operate at the user, room, or server level. Matrix people suggest making the bot admin of your channels, because you can't take admin back from a user once it has been given.

The command-line tool There's also a new command line tool designed to do things like:
  • send a system notification to users (all users, users from a list, or a specific user)
  • delete sessions/devices not seen for X days
  • purge the remote media cache
  • select rooms with various criteria (external/local/empty/created by/encrypted/cleartext)
  • purge the history of these rooms
  • shutdown rooms
This tool and Mjolnir are based on the admin API built into Synapse.
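For a taste of what that API looks like, a hypothetical sketch of listing rooms (the /_synapse/admin/v1/rooms endpoint and bearer-token authentication come from Synapse's admin API documentation; the host and token are placeholders):

curl -H 'Authorization: Bearer <admin-access-token>' \
  https://homeserver.example.com/_synapse/admin/v1/rooms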

Rate limiting Synapse has pretty good built-in rate-limiting which blocks repeated login, registration, joining, or messaging attempts. It may also end up throttling servers on the federation based on those settings.
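Those limits live in homeserver.yaml; a hedged sketch (rc_message and rc_login are real Synapse settings, and the numbers shown are in the ballpark of the defaults, but treat them as illustrative):

rc_message:
  per_second: 0.2      # sustained message rate allowed per client
  burst_count: 10      # short bursts tolerated before throttling kicks in
rc_login:
  address:
    per_second: 0.003  # login attempts allowed per IP address
    burst_count: 5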

Fundamental federation problems Because users joining a room may come from another server, room moderators are at the mercy of the registration and moderation policies of those servers. Matrix is like IRC's +R mode ("only registered users can join") by default, except that anyone can register their own homeserver, which makes this limited. Server admins can block IP addresses and home servers, but those tools are not easily available to room admins. There is an API (m.room.server_acl in /devtools, sketched below) but it is not reliable (thanks to Austin Huang for the clarification). Matrix has the concept of guest accounts, but it is not used very much, and virtually no client or homeserver supports it. This contrasts with the way IRC works: by default, anyone can join an IRC network even without authentication. Some channels require registration, but in general you are free to join and look around (until you get blocked, of course). I have seen anecdotal evidence (CW: Twitter, nitter link) that "moderating bridges is hell", and I can imagine why. Moderation is already hard enough in one federation; when you bridge a room with another network, you inherit all the problems from that network but without the abuse control tools from the original network's API...
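For what it's worth, that ACL is just a room state event; a sketch of what gets sent through /devtools (the event shape follows the Matrix spec's m.room.server_acl, and the denied server is a placeholder):

{
  "type": "m.room.server_acl",
  "state_key": "",
  "content": {
    "allow": ["*"],
    "deny": ["abusive.example.com"],
    "allow_ip_literals": false
  }
}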

Room admins Matrix, in particular, has the problem that room administrators (who have the power to redact messages, ban users, and promote other users) are bound to their Matrix ID, which is, in turn, bound to their home server. This implies that a home server administrator could (1) impersonate a given user and (2) use that to hijack the room. So in practice, the home server is the trust anchor for rooms, not the user themselves. That said, if server B's administrator hijacks user joe on server B, they will hijack that room on that specific server. This will not (necessarily) affect users on the other servers, as servers could refuse parts of the updates or ban the compromised account (or server). It does seem like a major flaw that room credentials are bound to Matrix identifiers, as opposed to the E2E encryption credentials. In an encrypted room, even with fully verified members, a compromised or hostile home server can still take over the room by impersonating an admin. That admin (or even a newly minted user) can then send events or listen in on the conversations. This is even more frustrating when you consider that Matrix events are actually signed and therefore have some authentication attached to them, acting like some sort of Merkle tree (as each event contains a link to previous events). That signature, however, is made from the homeserver PKI keys, not the client's E2E keys, which makes E2E feel like it has been "bolted on" later.

Availability While Matrix has a strong advantage over Signal in that it's decentralized (so anyone can run their own homeserver), I couldn't find an easy way to run a "multi-primary" setup, or even a "redundant" setup (even with a single primary backend), short of going full-on "replicate PostgreSQL and Redis data", which is not typically for the faint of heart.

How this works in IRC On IRC, it's quite easy to set up redundant nodes. All you need is:
  1. a new machine (with its own public address with an open port)
  2. a shared secret (or certificate) between that machine and an existing one on the network
  3. a connect block on both servers (a sketch follows below)
That's it: the node will join the network and people can connect to it as usual and share the same user/namespace as the rest of the network. The servers take care of synchronizing state: you do not need to worry about replicating a database server. (Now, experienced IRC people will know there's a catch here: IRC doesn't have authentication built in, and relies on "services" which are basically bots that authenticate users (I'm simplifying, don't nitpick). If that service goes down, the network still works, but then people can't authenticate, and they can start doing nasty things like steal people's identity if they get knocked offline. But still: basic functionality still works: you can talk in rooms and with users that are on the reachable network.)
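To make step 3 concrete, a hedged sketch of a connect block in the style of the charybdis/solanum family of ircds (the server name, address and passwords are placeholders, and other ircds use a different syntax):

connect "hub.example.net" {
    host = "203.0.113.1";
    port = 6697;
    send_password = "sharedsecret";
    accept_password = "sharedsecret";
};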

User identities Matrix is more complicated. Each "home server" has its own identity namespace: a specific user (say @anarcat:matrix.org) is bound to that specific home server. If that server goes down, that user is completely disconnected. They could register a new account elsewhere and reconnect, but then they basically lose all their configuration: contacts and joined channels are all lost. (Also notice how Matrix IDs don't look like a typical user address, the way an email-like XMPP address does. They at least did their homework and got the allocation for the scheme.)

Rooms Users talk to each other in "rooms", even in one-to-one communications. (Rooms are also used for other things like "spaces"; they're basically used for everything: think "everything is a file" kind of tool.) For rooms, home servers act more like IRC nodes, in that they keep a local state of the chat room and synchronize it with other servers. Users can keep talking inside a room if the server that originally hosts the room goes down. Rooms can have a local, server-specific "alias" so that, say, #room:matrix.org is also visible as #room:example.com on the example.com home server. Both addresses refer to the same underlying room. (Finding this in the Element settings is not obvious though, because those "aliases" are actually called a "local address" there. So to create such an alias (in Element), you need to go in the room settings' "General" section, "Show more" in "Local address", then add the alias name (e.g. foo), and then that room will be available on your example.com homeserver as #foo:example.com.) So a room doesn't belong to a server, it belongs to the federation, and anyone can join the room from any server (if the room is public, or if invited otherwise). You can create a room on server A and when a user from server B joins, the room will be replicated on server B as well. If server A fails, server B will keep relaying traffic to connected users and servers. A room is therefore not fundamentally addressed with the above alias; instead, it has an internal Matrix ID, which is basically a random string. It has a server name attached to it, but that was done just to avoid collisions. That can get a little confusing. For example, the #fractal:gnome.org room is an alias on the gnome.org server, but the room ID is !hwiGbsdSTZIwSRfybq:matrix.org. That's because the room was created on matrix.org, but the preferred branding is gnome.org now. As an aside, rooms, by default, live forever, even after the last user quits. There's an admin API to delete rooms and a tombstone event to redirect to another one, but neither has a GUI yet. The latter is part of MSC1501 ("Room version upgrades") which allows a room admin to close a room, with a message and a pointer to another room.

Spaces Discovering rooms can be tricky: there is a per-server room directory, but Matrix.org people are trying to deprecate it in favor of "Spaces". Room directories were ripe for abuse: anyone can create a room, so anyone can show up in there. It's possible to restrict who can add aliases, but in any case directories were seen as too limited. In contrast, a "Space" is basically a room that's an index of other rooms (including other spaces), so existing moderation and administration mechanisms that work in rooms can (somewhat) work in spaces as well. This enables a room directory that works across the federation, regardless of which server the rooms were originally created on. New users can be added to a space or room automatically in Synapse. (Existing users can be told about the space with a server notice.) This gives admins a way to pre-populate a list of rooms on a server, which is useful to build clusters of related home servers, providing some sort of redundancy at the room, not user, level.

Home servers So while you can work around a home server going down at the room level, there's no such thing at the home server level, for user identities. So if you want those identities to be stable in the long term, you need to think about high availability. One limitation is that the domain name (e.g. matrix.example.com) must never change in the future, as renaming home servers is not supported. The documentation used to say you could "run a hot spare" but that has been removed. Last I heard, it was not possible to run a high-availability setup where multiple, separate locations could replace each other automatically. You can have high-performance setups where the load gets distributed among workers, but those are based on a shared database (Redis and PostgreSQL) backend. So my guess is that it would be possible to create a "warm" spare server of a Matrix home server with regular PostgreSQL replication, but that is not documented in the Synapse manual. This sort of setup would also not be useful to deal with networking issues or denial-of-service attacks, as you would not be able to spread the load over multiple network locations easily. Redis and PostgreSQL heroes are welcome to provide their multi-primary solution in the comments. In the meantime, I'll just point out that this is handled somewhat more gracefully in IRC, thanks to the possibility of delegating the authentication layer.

Delegations If you do not want to run a Matrix server yourself, it's possible to delegate the entire thing to another server. There's a server discovery API which uses the .well-known pattern (or SRV records, but that's "not recommended" and a bit confusing) to delegate that service to another server. Be warned that the server still needs to be explicitly configured for your domain. You can't just put:
  "m.server": "matrix.org:443"  
... on https://example.com/.well-known/matrix/server and start using @you:example.com as a Matrix ID. That's because Matrix doesn't support "virtual hosting" and you'd still be connecting to rooms and people with your matrix.org identity, not example.com as you would normally expect. This is also why you cannot rename your home server. The server discovery API is what allows servers to find each other. Clients, on the other hand, use the client-server discovery API: this is what allows a given client to find your home server when you type your Matrix ID on login.
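The client-side counterpart is another .well-known document; a sketch of what https://example.com/.well-known/matrix/client conventionally returns (the shape follows the client-server discovery spec, and the base_url is a placeholder):

{
  "m.homeserver": {
    "base_url": "https://matrix.example.com"
  }
}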

Performance The high availability discussion brushed over the performance of Matrix itself, but let's now dig into that.

Horizontal scalability There were serious scalability issues with the main Matrix server, Synapse, in the past. So the Matrix team has been working hard to improve its design. Since Synapse 1.22, the home server can horizontally scale to multiple workers (see this blog post for details), which can make it easier to scale large servers.

Other implementations There are other promising home server implementations from a performance standpoint (dendrite, Golang, entered beta in late 2020; conduit, Rust, beta; others), but none of those are feature-complete, so there's a trade-off to be made there. Synapse is also adding a lot of features fast, so it's an open question whether the others will ever catch up. (I have heard that Dendrite might actually surpass Synapse in features within a few years, which would put Synapse in a more "LTS" situation.)

Latency Matrix can feel slow sometimes. For example, joining the "Matrix HQ" room in Element (from matrix.debian.social) takes a few minutes and then fails. That is because the home server has to sync the entire room state when you join the room. There was promising work on this announced in the lengthy 2021 retrospective, and some of that work landed (partial sync) in the 1.53 release already. Other improvements coming include sliding sync, lazy loading over federation, and fast room joins. So that's actually something that could be fixed in the fairly short term. But in general, communication in Matrix doesn't feel as "snappy" as on IRC or even Signal. It's hard to quantify this without instrumenting a full latency test bed (for example with the tools I used in the terminal emulator latency tests), but even just typing in a web browser feels slower than typing in an xterm or Emacs for me. Even in conversations, I "feel" people don't respond as fast. In fact, this could be an interesting double-blind experiment to run: have people guess whether they are talking to a person on Matrix, XMPP, or IRC, for example. My theory would be that people could notice that Matrix users are slower, if only because of the TCP round-trip time each message has to take.

Transport Some courageous person actually made some tests of various messaging platforms on a congested network. His evaluation was basically:
  • Briar: uses Tor, so unusable except locally
  • Matrix: "struggled to send and receive messages", joining a room takes forever as it has to sync all history, "took 20-30 seconds for my messages to be sent and another 20 seconds for further responses"
  • XMPP: "worked in real-time, full encryption, with nearly zero lag"
So that was interesting. I suspect IRC would have also fared better, but that's just a feeling. Other improvements to the transport layer include support for websockets and the CoAP proxy work from 2019 (targeting 100bps links), but both seem stalled at the time of writing. The Matrix people have also announced the pinecone p2p overlay network, which aims at solving large, internet-scale routing problems. See also this talk at FOSDEM 2022.

Usability

Onboarding and workflow The workflow for joining a room, when you use Element web, is not great:
  1. click on a link in a web browser
  2. land on (say) https://matrix.to/#/#matrix-dev:matrix.org
  3. it offers "Element"; yeah, that sounds great, let's click "Continue"
  4. land on https://app.element.io/#/room%2F%23matrix-dev%3Amatrix.org and then you need to register, aaargh
As you might have guessed by now, there is a specification to solve this, but web browsers need to adopt it as well, so that's far from actually being solved. At least browsers generally know about the matrix: scheme; it's just not exactly clear what they should do with it, especially when the handler is just another web page (e.g. Element web). In general, when compared with tools like Signal or WhatsApp, Matrix doesn't fare so well in terms of user discovery. I probably have some of my normal contacts that have a Matrix account as well, but there's really no way to know. It's kind of creepy when Signal tells you "this person is on Signal!" but it's also pretty cool that it works, and they actually implemented it pretty well. Registration is also less obvious: in Signal, the app confirms your phone number automatically. It's frictionless and quick. In Matrix, you need to learn about home servers, pick one, register (with a password! aargh!), and then set up encryption keys (not the default), etc. It's a lot more friction. And look, I understand: giving away your phone number is a huge trade-off. I don't like it either. But it solves a real problem and makes encryption accessible to a ton more people. Matrix does have "identity servers" that can serve that purpose, but I don't feel confident sharing my phone number there. It doesn't help that the identity servers don't have private contact discovery: giving them your phone number is a more serious security compromise than with Signal. There's a catch-22 here too: because no one feels like giving away their phone numbers, no one does, and everyone assumes that stuff doesn't work anyway. Like it or not, Signal forcing people to divulge their phone number actually gives them the critical mass that means a lot of my relatives are actually on Signal, and I don't have to install crap like WhatsApp to talk with them.

5 minute clients evaluation Throughout all my tests I evaluated a handful of Matrix clients, mostly from Flathub because almost none of them are packaged in Debian. Right now I'm using Element, the flagship client from Matrix.org, in a web browser window, with the PopUp Window extension. This makes it look almost like a native app, and opens links in my main browser window (instead of a new tab in that separate window), which is nice. But I'm tired of buying memory to feed my web browser, so this indirection has to stop. Furthermore, I'm often getting completely logged off from Element, which means re-logging in, recovering my security keys, and reconfiguring my settings. That is extremely annoying. Coming from Irssi, Element is really "GUI-y" (pronounced "gooey"). Lots of clickety happening. To mark conversations as read, in particular, I need to click-click-click on all the tabs that have some activity. There's no "jump to latest message" or "mark all as read" functionality as far as I could tell. In Irssi the former is built-in (alt-a) and I made a custom /READ command for the latter:
/ALIAS READ script exec \$_->activity(0) for Irssi::windows
And yes, that's a Perl script in my IRC client. I am not aware of any Matrix client that does stuff like that, except maybe Weechat, if we can call it a Matrix client, or Irssi itself, now that it has a Matrix plugin (!). As for other clients, I have looked through the Matrix Client Matrix (confusing right?) to try to figure out which one to try, and, even after selecting Linux as a filter, the chart is just too wide to figure out anything. So I tried those, kind of randomly:
  • Fractal
  • Mirage
  • Nheko
  • Quaternion
Unfortunately, I lost my notes on those; I don't actually remember which one did what. I still have a session open with Mirage, so I guess that means it's the one I preferred, but I remember they were also all very GUI-y. Maybe I need to look at weechat-matrix or gomuks. At least Weechat is scriptable, so I could continue playing the power-user. Right now my strategy with messaging (and that includes microblogging like Twitter or Mastodon) is that everything goes through my IRC client, so Weechat could actually fit well in there. Going with gomuks, on the other hand, would mean running it in parallel with Irssi, or... ditching IRC, which is a leap I'm not quite ready to take just yet. Oh, and basically none of those clients (except Nheko and Element) support VoIP, which is still kind of a second-class citizen in Matrix. It does not support large multimedia rooms, for example: Jitsi was used for FOSDEM instead of the native videoconferencing system.

Bots This falls a little outside the "usability" section, but I didn't know where else to put it... There are a few Matrix bots out there, and you are likely going to be able to replace your existing bots with Matrix bots. It's true that IRC has a long and impressive history with lots of various bots doing various things, but given how young Matrix is, there's still a good variety:
  • maubot: generic bot with tons of usual plugins like sed, dice, karma, xkcd, echo, rss, reminder, translate, react, exec, gitlab/github webhook receivers, weather, etc
  • opsdroid: framework to implement "chat ops" in Matrix, connects with Matrix, GitHub, GitLab, Shell commands, Slack, etc
  • matrix-nio: another framework, used to build lots more bots like:
    • hemppa: generic bot with various functionality like weather, RSS feeds, calendars, cron jobs, OpenStreetmaps lookups, URL title snarfing, wolfram alpha, astronomy pic of the day, Mastodon bridge, room bridging, oh dear
    • devops: ping, curl, etc
    • podbot: play podcast episodes from AntennaPod
    • cody: Python, Ruby, Javascript REPL
    • eno: generic bot, "personal assistant"
  • mjolnir: moderation bot
  • hookshot: bridge with GitLab/GitHub
  • matrix-monitor-bot: latency monitor
One thing I haven't found an equivalent for is Debian's MeetBot. There's an archive bot but it doesn't have topics or a meeting chair, or HTML logs.

Working on Matrix As a developer, I find Matrix kind of intimidating. The specification is huge. The official specification itself looks somewhat digestible: it's only 6 APIs, so that looks, at first, kind of reasonable. But whenever you start asking complicated questions about Matrix, you quickly fall into the Matrix Spec Change specification (which, yes, is a separate specification). And there are literally hundreds of MSCs flying around. It's hard to tell what's been adopted and what hasn't, and even harder to figure out if your specific client has implemented it. (One trendy answer to this problem is to "rewrite it in Rust": the Matrix people are working on implementing a lot of those specifications in a matrix-rust-sdk that's designed to take the implementation details away from users.) Just taking the latest weekly Matrix report, you find that three new MSCs were proposed, just last week! There's even a graph that shows the number of MSCs is progressing steadily, at 600+ proposals total, with the majority (300+) "new". I would guess the "merged" ones are at about 150. That's a lot of text, which includes stuff like 3D worlds which, frankly, I don't think you should be working on when you have such important security and usability problems. (The internet as a whole, arguably, doesn't fare much better. RFC600 is a really obscure discussion about "INTERFACING AN ILLINOIS PLASMA TERMINAL TO THE ARPANET". Maybe that's how many MSCs will end up as well, left forgotten in the pits of history.) And that's the thing: maybe the Matrix people have a different objective than I have. They want to connect everything to everything, and make Matrix a generic transport for all sorts of applications, including virtual reality, collaborative editors, and so on. I just want secure, simple messaging. Possibly with good file transfers, and video calls. That it works with existing stuff is good, and it should be federated to remove the "Signal point of failure". So I'm a bit worried about the direction all those MSCs are taking, especially when you consider that clients other than Element are still struggling to keep up with basic features like end-to-end encryption or room discovery, never mind voice or spaces...

Conclusion Overall, Matrix is somehow in the space XMPP was in a few years ago. It has a ton of features, pretty good clients, and a large community. It seems to have gained some of the momentum that XMPP has lost. It may have the most potential to replace Signal if something bad were to happen to it (like, I don't know, getting banned or going nuts with cryptocurrency)... But it's really not there yet, and I don't see Matrix trying to get there either, which is a bit worrisome.

Looking back at history I'm also worried that we are repeating the errors of the past. The history of federated services is really fascinating: IRC, FTP, HTTP, and SMTP were all created in the early days of the internet, and are all still around (except, arguably, FTP, which was removed from major browsers recently). All of them had to face serious challenges in growing their federation. IRC had numerous conflicts and forks, both at the technical level and at the political level. The history of IRC is really something that anyone working on a federated system should study in detail, because they are bound to make the same mistakes if they are not familiar with it. The "short" version is:
  • 1988: Finnish researcher publishes first IRC source code
  • 1989: 40 servers worldwide, mostly universities
  • 1990: EFnet ("eris-free network") fork which blocks the "open relay", named Eris - followers of Eris form the A-net, which promptly dissolves itself, with only EFnet remaining
  • 1992: Undernet fork, which offered authentication ("services"), routing improvements and timestamp-based channel synchronisation
  • 1994: DALnet fork, from Undernet, again on a technical disagreement
  • 1995: Freenode founded
  • 1996: IRCnet forks from EFnet, following a flame war of historical proportion, splitting the network between Europe and the Americas
  • 1997: Quakenet founded
  • 1999: (XMPP founded)
  • 2001: 6 million users, OFTC founded
  • 2002: DALnet peaks at 136,000 users
  • 2003: IRC as a whole peaks at 10 million users, EFnet peaks at 141,000 users
  • 2004: (Facebook founded), Undernet peaks at 159,000 users
  • 2005: Quakenet peaks at 242,000 users, IRCnet peaks at 136,000 (Youtube founded)
  • 2006: (Twitter founded)
  • 2009: (WhatsApp, Pinterest founded)
  • 2010: (TextSecure AKA Signal, Instagram founded)
  • 2011: (Snapchat founded)
  • ~2013: Freenode peaks at ~100,000 users
  • 2016: IRCv3 standardisation effort started (TikTok founded)
  • 2021: Freenode self-destructs, Libera chat founded
  • 2022: Libera peaks at 50,000 users, OFTC peaks at 30,000 users
(The numbers were taken from the Wikipedia page and Netsplit.de. Note that I also include other networks' launches in parentheses for context.) Pretty dramatic, don't you think? Eventually, somehow, IRC became irrelevant for most people: few people are even aware of it now. With fewer than a million active users, it's smaller than Mastodon, XMPP, or Matrix at this point.1 If I were to venture a guess, I'd say that infighting, lack of a standardization body, and a somewhat annoying protocol meant the network could not grow. It's also possible that the decentralised yet centralised structure of IRC networks limited their reliability and growth. But large social media companies have also taken over the space: observe how IRC numbers peak around the time the wave of large social media companies emerges, especially Facebook (2.9B users!!) and Twitter (400M users).

Where the federated services are in history Right now, Matrix and Mastodon (and email!) are at the "pre-EFnet" stage: anyone can join the federation. Mastodon has started working on a global block list of fascist servers, which is interesting, but it's still an open federation. Right now, Matrix is totally open, but matrix.org publishes a (federated) block list of hostile servers (#matrix-org-coc-bl:matrix.org, yes, of course it's a room). Interestingly, email is also at that stage, where there are block lists of spammers, and it's a race between those blockers and the spammers. Large email providers, obviously, are getting closer to the EFnet stage: you could consider that they only accept email from themselves or between themselves. It's getting increasingly hard to deliver mail to Outlook and Gmail, for example, partly because of bias against small providers, but also because they are including more and more machine-learning tools to sort through email, and those systems are, fundamentally, unknowable. It's not quite the same as splitting the federation the way EFnet did, but the effect is similar. HTTP has somehow managed to live in a parallel universe, as it's technically still completely federated: anyone can start a web server if they have a public IP address, and anyone can connect to it. The catch, of course, is how you find the darn thing. Which is how Google became one of the most powerful corporations on earth, and how they became the gatekeepers of human knowledge online. I have only briefly mentioned XMPP here, and my XMPP fans will undoubtedly comment on that, but I think it's somewhere in the middle of all of this. It was co-opted by Facebook and Google, and both corporations have abandoned it to its fate. I remember fondly the days when I could do instant messaging with my contacts who had a Gmail account. Those days are gone, and I don't talk to anyone over Jabber anymore, unfortunately. And this is a threat that Matrix still has to face. It's also the threat email is currently facing. On the one hand, corporations like Facebook want to completely destroy it and have mostly succeeded: many people just have an email account to register on things and talk to their friends over Instagram or (lately) TikTok (which, I know, is not Facebook, but they started that fire). On the other hand, you have corporations like Microsoft and Google who are still using and providing email services because, frankly, you still do need email for stuff, just like fax is still around, but they are more and more isolated in their own silos. At this point, it's only a matter of time before they reach critical mass and just decide that the risk of allowing external mail in is not worth the cost. They'll simply flip the switch and work on an allow-list principle. Then we'll have closed the loop and email will be dead, just like IRC is "dead" now. I wonder which path Matrix will take. Could it liberate us from these vicious cycles? Update: this generated some discussion on lobste.rs.

  1. According to Wikipedia, there are currently about 500 distinct IRC networks operating, on about 1,000 servers, serving over 250,000 users. In contrast, Mastodon seems to be around 5 million users, Matrix.org claimed at FOSDEM 2021 to have about 28 million globally visible accounts, and Signal lays claim to over 40 million souls. XMPP claims to have "millions" of users on the xmpp.org homepage but the FAQ says they don't actually know. On the proprietary silo side of the fence, this page says
    • Facebook: 2.9 billion users
    • WhatsApp: 2B
    • Instagram: 1.4B
    • TikTok: 1B
    • Snapchat: 500M
    • Pinterest: 480M
    • Twitter: 397M
    Notable omission from that list: YouTube, with its mind-boggling 2.6 billion users... Those are not the kind of numbers you just "need to convince a brother or sister" to grow the network...

19 May 2022

Ulrike Uhlig: How do kids conceive the internet? - part 2

I promised a follow-up to my post about interviews on how children conceptualize the internet. Here it is. (Maybe not the last one!)

The internet, it's that thing that acts up all the time, right? As said in my first post, I abandoned the idea of interviewing children younger than 9 years because it seems they are not necessarily aware that they are using the internet. But it turns out that some have heard about the internet. My friend Anna, who has 9 younger siblings, tried to win some of her brothers and sisters over for an interview with me. At the dinner table, this turned into a discussion, and she sent me an incredibly funny video in which two of her brothers and sisters, aged 5 and 6, discuss the internet with her. I won't share the video for privacy reasons; besides, the kids speak in the wondrous dialect of Vorarlberg, a region in western Austria, close to the border with Liechtenstein. Here's a transcription of the dinner table discussion:
  • Anna: what is the internet?
  • both children: (shouting as if it was a game of who gets it first) photo! mobile! device! camera!
  • Anna: But one can have a camera without the internet
  • M.: Internet is the mobile phone charger! Mobile phone full!
  • J.: Internet is internet is
  • M.: I know! Internet is where you can charge something, the mobile phone and
  • Anna: You mean electricity?
  • M.: Yeah, that is the internet, electricity!
  • Anna: (laughs), Yes, the internet works a bit similarly, true.
  • J.: It's the electricity of the house!
  • Anna: The electricity of the house
(everyone is talking at the same time now.)
  • Anna: And what s WiFi?
  • M.: WiFi, it's the TV!
  • Anna (laughs)
  • M.: WiFi is there so it doesn't act up!
  • Anna (laughs harder)
  • J. (repeats what M. just said): WiFi is there so it doesn't act up!
  • Anna: So that what doesn't act up?
  • M.: (moves her finger wildly drawing a small circle in the air) So that it doesn't spin!
  • Anna: Ah?
  • M.: When one wants to watch something on Youtube, well then that the thing doesn't spin like that!
  • Anna: Ahhh! so when you use Youtube, you need the internet, right?
  • J.: Yes, so that one can watch things.
I really like how the kids associate the internet with a thing that works all the time, except for when it doesn't work. Then they notice: "The internet is acting up!" Probably, when that happens, parents or older siblings say "the internet is acting up" or "let me check why the internet acts up again", and maybe they get up from the sofa and switch a home router on and off again, which creates this association with electricity. (Just for the sake of clarity for fellow multilingualist readers, the kids used the German word "spinnen", which I translated to "acting up". In French that would be "déconner".)

WiFi for everyone! I interviewed another of Anna's siblings, a 10-year-old boy. He told me that he does not really use the internet by himself yet, and does not own any internet-capable device. He watches when older family members look up stuff on Google, or put on a video on YouTube, Netflix, or Amazon; he knew all these brand names though. In the living room, there's Alexa, he told me, and he uses the internet by asking Alexa to play music.
Then I say: Alexa, play this song!
Interestingly, he knew that, in order to listen to a CD, the internet was not needed. When asked what a drawing that explains the internet would look like, he drew a scheme of the living room at home, with the TV, Alexa, and some kind of WiFi dongle, maybe a repeater. (Unfortunately I did not manage to get his drawing.) If he could ask a wise and friendly dragon one thing about the internet that he always wanted to know, he would ask: "How much internet can one have, and what are all the things one can do with the internet?" If he could change the internet for the better for everyone, he would build a gigantic building which would provide the entire world with WiFi.

Cut out the stupid stuff from the internet
His slightly older sister does own a laptop and a smartphone. She uses the internet to watch movies or series, to talk with her friends, or to listen to music. When asked how she would explain the internet to an alien, she said that
one can do a lot of things on the internet, but on the internet there can be stupid things, but also good things, one can learn stuff on the internet, for example how to do crochet.
Most importantly, she noticed that
one needs the internet nowadays.
A child's drawing: on the left, a smartphone with WhatsApp, saying "calls with WhatsApp"; in the middle, a TV, saying "watching movies"; on the right, a laptop with lots of open windows.

Her drawing shows how she uses the internet: calls using WhatsApp, watching movies online, and a laptop with open windows on the screen. She would ask the dragon that can explain one thing she always wanted to know about the internet:
What is the internet? How does it work at all? How does it function?
What she would change has to do with her earlier remark about stupid things:
I would make it so that there are less stupid things. It would be good to use the internet for better things, but not for useless things, that one doesn t actually need.
When I asked her what she meant by "stupid things", she replied:
Useless videos where one talks about nonsense. And one can also google stupid things, for example "how long will I be alive?" and stuff like that.

Patterns
From the interviews I have done so far, there seems to be a cut between the age where kids don't own a device and use the internet to watch movies, series, or listen to music, and the age where they start owning a device, talking to their friends, and creating accounts on social media. This seems to happen roughly at ages 9-10. I'm still surprised at the number of ideas that kids have when asked what they would change on the internet if they could. I'm sure there's more if one goes looking for it.

Thanks
Thanks to my friends who made all these interviews possible, either by allowing me to meet their children or their younger siblings: Anna, Christa, Aline, Cindy, and Martina.

18 May 2022

Louis-Philippe V ronneau: Clojure Team 2022 Sprint Report

This is the report for the Debian Clojure Team remote sprint that took place on May 13-14th. Looking at my previous blog entries, this was my first Debian sprint since July 2020! Crazy how fast time flies... Many thanks to those who participated, namely: Sadly, Utkarsh Gupta, although having planned on participating, ended up not being able to and worked on DebConf Bursary paperwork instead.

rlb
Rob mostly worked on creating a dh-clojure tool to help make packaging Clojure libraries easier. At the moment, most of the packaging is done manually, by invoking build tools by hand. Having a tool to automate many of the steps required to build Clojure packages would go a long way in making them more uniform. His work (although still very much a WIP) can be found here: https://salsa.debian.org/rlb/dh-clojure/

ehashman
Elana:

lavamind
It was Jérôme's first time working on Clojure packages, and things went great! During the sprint, he:

allentiak
Leandro joined us on Saturday, since he couldn't get off work on Friday. He mostly continued working on replacing our in-house scripts for /usr/bin/clojure with upstream's, a task he had already started during GSoC 2021. Sadly, none of us were familiar with Debian's mechanism for alternatives. If you (yes you, dear reader) are familiar with it, I'm sure he would warmly welcome feedback on his development branch.

pollo
As for me, I:

Overall, it was quite a productive sprint! Thanks to Debian for sponsoring our food during the sprint. It was nice to be able to concentrate on fixing things instead of making food :) Here's a bonus picture of the nice sushi platter I ended up getting for dinner on Saturday night:

Picture of a sushi platter

1 January 2021

Utkarsh Gupta: FOSS Activites in December 2020

Here's my (fifteenth) monthly update about the activities I've done in the F/L/OSS world.

Debian
This was my 24th month of contributing to Debian. I became a DM in late March last year and a DD last Christmas! \o/ Amongst a lot of things, this month was crazy, hectic, adventurous, and the last of 2020; more on some parts later this month.
I finally finished my 7th semester (FTW!) and moved on to my last one! That said, I had been busy with other things but still did a bunch of Debian stuff. Here are the things I did this month:

Uploads and bug fixes:

Other $things:
  • Attended the Debian Ruby team meeting.
  • Mentoring for newcomers.
  • FTP Trainee reviewing.
  • Moderation of -project mailing list.
  • Sponsored golang-github-gorilla-css for Federico.

Debian (E)LTS
Debian Long Term Support (LTS) is a project to extend the lifetime of all Debian stable releases to (at least) 5 years. Debian LTS is not handled by the Debian security team, but by a separate group of volunteers and companies interested in making it a success. And Debian Extended LTS (ELTS) is its sister project, extending support to the Jessie release (+2 years after LTS support). This was my fifteenth month as a Debian LTS and sixth month as a Debian ELTS paid contributor.
I was assigned 26.00 hours for LTS and 38.25 hours for ELTS and worked on the following things:

LTS CVE Fixes and Announcements:
  • Issued DLA 2474-1, fixing CVE-2020-28928, for musl.
    For Debian 9 Stretch, these problems have been fixed in version 1.1.16-3+deb9u1.
  • Issued DLA 2481-1, fixing CVE-2020-25709 and CVE-2020-25710, for openldap.
    For Debian 9 Stretch, these problems have been fixed in version 2.4.44+dfsg-5+deb9u6.
  • Issued DLA 2484-1, fixing #969126, for python-certbot.
    For Debian 9 Stretch, these problems have been fixed in version 0.28.0-1~deb9u3.
  • Issued DLA 2487-1, fixing CVE-2020-27350, for apt.
    For Debian 9 Stretch, these problems have been fixed in version 1.4.11. The update was prepared by the maintainer, Julian.
  • Issued DLA 2488-1, fixing CVE-2020-27351, for python-apt.
    For Debian 9 Stretch, these problems have been fixed in version 1.4.2. The update was prepared by the maintainer, Julian.
  • Issued DLA 2495-1, fixing CVE-2020-17527, for tomcat8.
    For Debian 9 Stretch, these problems have been fixed in version 8.5.54-0+deb9u5.
  • Issued DLA 2488-2, for python-apt.
    For Debian 9 Stretch, these problems have been fixed in version 1.4.3. The update was prepared by the maintainer, Julian.
  • Issued DLA 2508-1, fixing CVE-2020-35730, for roundcube.
    For Debian 9 Stretch, these problems have been fixed in version 1.2.3+dfsg.1-4+deb9u8. The update was prepared by the maintainer, Guilhem.

ELTS CVE Fixes and Announcements:

Other (E)LTS Work:
  • Front-desk duty from 21-12 until 27-12 and from 28-12 until 03-01 for both LTS and ELTS.
  • Triaged openldap, python-certbot, lemonldap-ng, qemu, gdm3, open-iscsi, gobby, jackson-databind, wavpack, cairo, nsd, tomcat8, and bouncycastle.
  • Marked CVE-2020-17527/tomcat8 as not-affected for jessie.
  • Marked CVE-2020-28052/bouncycastle as not-affected for jessie.
  • Marked CVE-2020-14394/qemu as postponed for jessie.
  • Marked CVE-2020-35738/wavpack as not-affected for jessie.
  • Marked CVE-2020-3550{3-6}/qemu as postponed for jessie.
  • Marked CVE-2020-3550{3-6}/qemu as postponed for stretch.
  • Marked CVE-2020-16093/lemonldap-ng as no-dsa for stretch.
  • Marked CVE-2020-27837/gdm3 as no-dsa for stretch.
  • Marked CVE-2020-{13987, 13988, 17437}/open-iscsi as no-dsa for stretch.
  • Marked CVE-2020-35450/gobby as no-dsa for stretch.
  • Marked CVE-2020-35728/jackson-databind as no-dsa for stretch.
  • Marked CVE-2020-28935/nsd as no-dsa for stretch.
  • Auto EOL'ed libpam-tacplus, open-iscsi, wireshark, gdm3, golang-go.crypto, jackson-databind, spotweb, python-autobahn, asterisk, nsd, ruby-nokogiri, linux, and motion for jessie.
  • General discussion on LTS private and public mailing list.

Other $things! \o/

Bugs and Patches
Well, I did report some bugs and issues and also sent some patches:
  • Issue #44 for github-activity-readme, asking for a feature to set a custom committer's email address.
  • Issue #711 for git2go, reporting build failure for the library.
  • PR #89 for rubocop-rails_config, bumping RuboCop::Packaging to v0.5.
  • Issue #36 for rubocop-packaging, asking to try out mutant :)
  • PR #212 for cucumber-ruby-core, bumping RuboCop::Packaging to v0.5.
  • PR #213 for cucumber-ruby-core, enabling RuboCop::Packaging.
  • Issue #19 for behance, asking to relax constraints on faraday and faraday_middleware.
  • PR #37 for rubocop-packaging, enabling tests against ruby3.0! \o/
  • PR #489 for cucumber-rails, bumping RuboCop::Packaging to v0.5.
  • Issue #362 for nheko, reporting a crash when opening the application.
  • PR #1282 for paper_trail, adding RuboCop::Packaging amongst other used extensions.
  • Bug #978640 for nheko Debian package, reporting a crash, as a result of libfmt7 regression.

Misc and Fun
Besides squashing bugs and submitting patches, I did some other things as well!
  • Participated in my first Advent of Code event! :)
    Whilst it was indeed fun, I didn't really complete it. No reason, really. But I'll definitely come back stronger next year, heh! :)
    All the solutions thus far can be found here.
  • Did a couple of reviews for some PRs and triaged some bugs here and there, meh.
  • Also did some cloud debugging, not so fun if you ask me, but cool enough to make me want to do it again! ^_^
  • Worked along with pollo, zigo, ehashman, rlb, et al for puppet and puppetserver in Debian. OMG, they're so lovely! <3
  • Ordered some interesting books to read January onward. New year resolution? Meh, not really. Or maybe. But nah.
  • Also did some interesting stuff this month but can't really talk about it now. Hopefully sooooon.

Until next time.
:wq for today.

23 April 2017

Andreas Metzler: balance sheet snowboarding season 2016/17

Another year of minimal snow. Again there was early snowfall in the mountains at the start of November, but the snow was soon gone again. There was no snow up to 2000 meters of altitude until about January 3. Christmas week was spent hiking up and taking the lift down. I had my first day on board on January 6 on artificial snow, and the first one on natural snow on January 19. Down where I live (800m), snow was scarce the whole winter, never topping 1m. Measuring station Diedamskopf, at 1800m above sea level, topped out at slightly above 200cm on April 19. Last boarding day was yesterday (April 22) in Warth with hero conditions. I had a pre-opening on the glacier in Pitztal at the start of November with Pure Boarding. However, due to the long waiting period between the pre-opening and the start of the season, it did not pay off: by the time I rode regularly I had forgotten almost everything I had learned at Carving-School. Nevertheless, it was a strong season, due to long periods of stable, sunny weather, with 30 days on piste (counting the day I went up and barely managed a single blind run in super-dense fog). Anyway, here is the balance sheet:
season                    2005/06  2006/07  2007/08  2008/09  2009/10  2010/11  2011/12  2012/13  2013/14  2014/15  2015/16  2016/17
number of (partial) days       25       17       29       37       30       30       25       23       30       24       17       30
Damüls                         10       10        5       10       16       23       10        4       29        9        4        4
Diedamskopf                    15        4       24       23       13        4       14       19        1       13       12       23
Warth/Schröcken                 0        3        0        4        1        3        1        0        0        2        1        3
total meters of altitude   124634    74096   219936   226774   202089   203918   228588   203562   274706   224909   138037   269819
highscore                  10247m    8321m   12108m   11272m   11888m   10976m   13076m   13885m   12848m   13278m   11015m   12245m
# of runs                     309      189      503      551      462      449      516      468      597      530      354      634

29 March 2017

Dirk Eddelbuettel: #1: Easy Package Registration

Welcome to the first actual post in the R4 series, following the short announcement earlier this week.

Context
Last month, Brian Ripley announced on r-devel that registration of routines would now be tested for by R CMD check in r-devel (which by next month will become R 3.4.0). A NOTE will be issued now; this will presumably turn into a WARNING at some point. Writing R Extensions has an updated introduction to the topic. Package registration has long been available, and applies to all native (i.e. "compiled") functions via the .C(), .Call(), .Fortran() or .External() interfaces. If you use any of those -- and .Call() may be the only truly relevant one here -- then this is of interest to you. Brian Ripley and Kurt Hornik also added a new helper function: tools::package_native_routine_registration_skeleton(). It parses the R code of your package and collects all native function entrypoints in order to autogenerate the registration. It is available in R-devel now, will be in R 3.4.0, and makes adding such registration truly trivial. But as of today, it requires that you have R-devel. Once R 3.4.0 is out, you can call the helper directly. As for R-devel, there have always been at least two ways to use it: build it locally (which we may cover in another R4 installment), or use Docker. Here we will focus on the latter, relying on the Rocker project by Carl and myself.

Use the Rocker drd Image
We assume you can run Docker on your system. How to add it on Windows, macOS or Linux is beyond our scope here today, but also covered extensively elsewhere. So we assume you can execute docker and e.g. bring in the 'daily r-devel' image drd from our Rocker project via
~$ docker pull rocker/drd
With that, we can use R-devel to create the registration file very easily in a single call (which is a long command-line we have broken up with one line-break for the display below). The following is a real-life example from when I needed to add registration to the RcppTOML package for this week's update:
~/git/rcpptoml(master)$ docker run --rm -ti -v $(pwd):/mnt rocker/drd \  ## line break
             RD --slave -e 'tools::package_native_routine_registration_skeleton("/mnt")'
#include <R.h>
#include <Rinternals.h>
#include <stdlib.h> // for NULL
#include <R_ext/Rdynload.h>
/* FIXME: 
   Check these declarations against the C/Fortran source code.
*/
/* .Call calls */
extern SEXP RcppTOML_tomlparseImpl(SEXP, SEXP, SEXP);
static const R_CallMethodDef CallEntries[] = {
    {"RcppTOML_tomlparseImpl", (DL_FUNC) &RcppTOML_tomlparseImpl, 3},
    {NULL, NULL, 0}
};
void R_init_RcppTOML(DllInfo *dll)
{
    R_registerRoutines(dll, NULL, CallEntries, NULL, NULL);
    R_useDynamicSymbols(dll, FALSE);
}
edd@max:~/git/rcpptoml(master)$ 

Decompose the Command
We can understand the docker command invocation above through its components:
  • docker run is the basic call to a container
  • --rm -ti does subsequent cleanup (--rm) and gives a terminal (-t) that is interactive (-i)
  • -v $(pwd):/mnt uses the -v a:b options to make local directory a available as b in the container; here $(pwd) calls print working directory to get the local directory which is now mapped to /mnt in the container
  • rocker/drd invokes the 'drd' container of the Rocker project
  • RD is a shorthand for the R-devel binary inside the container, and the main reason we use this container
  • -e 'tools::package_native_routine_registration_skeleton("/mnt")' calls the helper function of R (currently in R-devel only, so we use RD) to compute a possible init.c file based on the current directory -- which is /mnt inside the container
That's it. We get a call to the R function executed inside the Docker container, examining the package in the working directory and creating a registration file for it, which is displayed on the console.

Copy the Output to src/init.c
We simply copy the output to a file src/init.c; I often fold one opening brace one line up.

Update NAMESPACE
We also change one line in NAMESPACE from (for this package) useDynLib("RcppTOML") to useDynLib("RcppTOML", .registration=TRUE). Adjust accordingly for other package names.

That's it! And with that we have a package which no longer provokes the NOTE, as seen by the checks page. Calls to native routines are now safer (less of a chance for name clashing), get called quicker as we skip the symbol search (see the WRE discussion), and best of all this applies to all native routines whether written by hand or written via a generator such as Rcpp Attributes. So give this a try to get your package up-to-date.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

30 December 2016

Andreas Metzler: I am wondering

I am wondering how many years it is going to take me to stop expecting snow below 2000m in December. This has now been the third year in a row with basically zero snow until Christmas up to the top of the mountains around here. (And this year it is not going to change anymore; there might be a tiny bit of snow on January 3, but not enough to ride a board on.) I should have grown accustomed to it, but I have not managed yet; it still feels like a let-down. Some illustrations.

30 November 2016

Chris Lamb: Free software activities in November 2016

Here is my monthly update covering what I have been doing in the free software world (previous month):
Reproducible builds

Whilst anyone can inspect the source code of free software for malicious flaws, most software is distributed pre-compiled to end users. The motivation behind the Reproducible Builds effort is to permit verification that no flaws have been introduced either maliciously or accidentally during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised.

This month:

My work in the Reproducible Builds project was also covered in our weekly reports. (#80, #81, #82 & #83)

Toolchain issues I submitted the following patches to fix reproducibility-related toolchain issues with Debian:

strip-nondeterminism

strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build.


jenkins.debian.net

jenkins.debian.net runs our comprehensive testing framework.

  • buildinfo.debian.net has moved to SSL. (ac3b9e7)
  • Submit signing keys to keyservers after generation. (bdee6ff)
  • Various cosmetic changes, including
    • Prefer if X not in Y over if not X in Y. (bc23884)
    • No need for a dictionary; let's just use a set. (bf3fb6c)
    • Avoid DRY violation by using a for loop. (4125ec5)

I also submitted 9 patches to fix specific reproducibility issues in apktool, cairo-5c, lava-dispatcher, lava-server, node-rimraf, perlbrew, qsynth, tunnelx & zp.

Debian

Debian LTS This month I have been paid to work 11 hours on Debian Long Term Support (LTS). In that time I did the following:
  • "Frontdesk" duties, triaging CVEs, etc.
  • Issued DLA 697-1 for bsdiff fixing an arbitrary write vulnerability.
  • Issued DLA 705-1 for python-imaging correcting a number of memory overflow issues.
  • Issued DLA 713-1 for sniffit where a buffer overflow allowed a specially-crafted configuration file to provide a root shell.
  • Issued DLA 723-1 for libsoap-lite-perl preventing a Billion Laughs XML expansion attack.
  • Issued DLA 724-1 for mcabber fixing a roster push attack.

Uploads
  • redis:
    • 3.2.5-2 Tighten permissions of /var/{lib,log}/redis. (#842987)
    • 3.2.5-3 & 3.2.5-4 Improve autopkgtest tests and install upstream's MANIFESTO and README.md documentation.
  • gunicorn (19.6.0-9) Adding autopkgtest tests.
  • libfiu:
    • 0.94-1 Add autopkgtest tests.
    • 0.95-1, 0.95-2 & 0.95-3 New upstream release and improve autopkgtest coverage.
  • python-django (1.10.3-1) New upstream release.
  • aptfs (0.8-3, 0.8-4 & 0.8-5) Adding and subsequently improving the autopkgtest tests.


I performed the following QA uploads:


Finally, I also made the following non-maintainer uploads:
  • libident (0.22-3.1) Move from obsolete Source-Version substvar to binary:Version. (#833195)
  • libpcl1 (1.6-1.1) Move from obsolete Source-Version substvar to binary:Version. (#833196)
  • pygopherd (2.0.18.4+nmu1) Move from obsolete Source-Version substvar to ${source:Version}. (#833202)


RC bugs


I also filed 59 FTBFS bugs against arc-gui-clients, asyncpg, blhc, civicrm, d-feet, dpdk, fbpanel, freeciv, freeplane, gant, golang-github-googleapis-gax-go, golang-github-googleapis-proto-client-go, haskell-cabal-install, haskell-fail, haskell-monadcatchio-transformers, hg-git, htsjdk, hyperscan, jasperreports, json-simple, keystone, koji, libapache-mod-musicindex, libcoap, libdr-tarantool-perl, libmath-bigint-gmp-perl, libpng1.6, link-grammar, lua-sql, mediatomb, mitmproxy, ncrack, net-tools, node-dateformat, node-fuzzaldrin-plus, node-nopt, open-infrastructure-system-images, open-infrastructure-system-images, photofloat, ppp, ptlib, python-mpop, python-mysqldb, python-passlib, python-protobix, python-ttystatus, redland, ros-message-generation, ruby-ethon, ruby-nokogiri, salt-formula-ceilometer, spykeviewer, sssd, suil, torus-trooper, trash-cli, twisted-web2, uftp & wide-dhcpv6.

FTP Team

As a Debian FTP assistant I ACCEPTed 70 packages: bbqsql, coz-profiler, cross-toolchain-base, cross-toolchain-base-ports, dgit-test-dummy, django-anymail, django-hstore, django-html-sanitizer, django-impersonate, django-wkhtmltopdf, gcc-6-cross, gcc-defaults, gnome-shell-extension-dashtodock, golang-defaults, golang-github-btcsuite-fastsha256, golang-github-dnephin-cobra, golang-github-docker-go-events, golang-github-gogits-cron, golang-github-opencontainers-image-spec, haskell-debian, kpmcore, libdancer-logger-syslog-perl, libmoox-buildargs-perl, libmoox-role-cloneset-perl, libreoffice, linux-firmware-raspi3, linux-latest, node-babel-runtime, node-big.js, node-buffer-shims, node-charm, node-cliui, node-core-js, node-cpr, node-difflet, node-doctrine, node-duplexer2, node-emojis-list, node-eslint-plugin-flowtype, node-everything.js, node-execa, node-grunt-contrib-coffee, node-grunt-contrib-concat, node-jquery-textcomplete, node-js-tokens, node-json5, node-jsonfile, node-marked-man, node-os-locale, node-sparkles, node-tap-parser, node-time-stamp, node-wrap-ansi, ooniprobe, policycoreutils, pybind11, pygresql, pysynphot, python-axolotl, python-drizzle, python-geoip2, python-mockupdb, python-pyforge, python-sentinels, python-waiting, pythonmagick, r-cran-isocodes, ruby-unicode-display-width, suricata & voctomix-outcasts. I additionally filed 4 RC bugs against packages that had incomplete debian/copyright files against node-cliui, node-core-js, node-cpr & node-grunt-contrib-concat.

21 November 2016

Reproducible builds folks: Reproducible Builds: week 82 in Stretch cycle

What happened in the Reproducible Builds effort between Sunday November 13 and Saturday November 19 2016:

Media coverage

Elsewhere in Debian

Documentation update

Packages reviewed and fixed, and bugs filed

Reviews of unreproducible packages
43 package reviews have been added, 4 have been updated and 12 have been removed in this week, adding to our knowledge about identified issues. 2 issue types have been updated: 4 issue types have been added:

Weekly QA work
During our reproducibility testing, some FTBFS bugs have been detected and reported by:

strip-nondeterminism development

disorderfs development

debrebuild development
debrebuild is a new tool proposed by HW42 and josch (see #774415: "From srebuild sbuild-wrapper to debrebuild").

debrepatch development
debrepatch is a set of scripts that we're currently developing to make it easier to track unapplied patches. We have a lot of those and we're not always sure if they still work. The plan is to set up jobs to automatically apply old reproducibility patches to newer versions of packages and notify the right people if they don't apply and/or no longer make the package reproducible. debpatch is a component of debrepatch that applies debdiffs to Debian source packages. In other words, it is to debdiff(1) what patch(1) is to diff(1). It is a general tool that is not specific to Reproducible Builds. This week, Ximin Luo worked on making it more "production-ready" and will soon submit it for inclusion in devscripts.

reprotest development
Ximin Luo significantly improved reprotest, adding presets and auto-detection of which preset to use. One can now run e.g. reprotest auto . or reprotest auto $pkg_$ver.dsc instead of the long command lines that were needed before. He also made it easier to set up build dependencies inside the virtual server and made it possible to specify pre-build dependencies that reprotest itself needs to set up the variations. Previously one had to manually edit the virtual server to do that, which was not very usable to humans without an in-depth knowledge of the building process. These changes will be tested some more and then released in the near future as reprotest 0.4.

tests.reproducible-builds.org

Misc.
This week's edition was written by Chris Lamb, Holger Levsen, Ximin Luo and reviewed by a bunch of Reproducible Builds folks on IRC.

21 August 2016

Gregor Herrmann: RC bugs 2016/30-33

not much to report but I got at least some RC bugs fixed in the last weeks. again mostly perl stuff:

6 May 2016

Matthias Klumpp: Adventures in D programming

I recently wrote a bigger project in the D programming language, the appstream-generator (asgen). Since I rarely leave the C/C++/Python realm, and came to like many aspects of D, I thought blogging about my experience could be useful for people considering using D. Disclaimer: I am not an expert on programming language design, and this is not universally valid criticism of D; it is just my personal opinion from building one project with it.

Why choose D in the first place?
The previous AppStream generator was written in Python, which wasn't ideal for the task for multiple reasons, most notably multiprocessing and LMDB not working well together (and in general, multiprocessing being terrible to work with) and the need to reimplement some already existing C code in Python again. So, I wanted a compiled language which would work well together with the existing C code in libappstream. Using C was an option, but my least favourite one (writing this in C would have been much more cumbersome). I looked at Go and Rust and wrote some small programs performing basic operations that I needed for asgen, to get a feeling for the languages. Interfacing C code with Go was relatively hard: since libappstream is a GObject-based C library, I expected to be able to auto-generate Go bindings from the GIR, but there were only a few outdated projects available which did that. Rust, on the other hand, required the most time in learning it, and since I only briefly looked into it, I still can't write Rust code without having the coding reference open. I started to implement the same examples in D just for fun, as I didn't plan to use D (I was aiming at Go back then), but the language looked interesting. The D language had the huge advantage of being very familiar to me as a C/C++ programmer, while also having a rich standard library, which included great stuff like std.concurrency.Generator, std.parallelism, etc. Translating Python code into D was incredibly easy; additionally, a gir-d-generator which is actively maintained exists (I created a small fork anyway, to be able to directly link against the libappstream library, instead of dynamically loading it).

What is great about D?
This list is just a huge braindump of things I had on my mind at the time of writing.

Interfacing with C
There are multiple things which make D awesome; for example, interfacing with C code (and, to a limited degree, with C++ code) is really easy. Also, working with functions from C in D feels natural. Take these C functions imported into D:
extern(C):
nothrow:

struct _mystruct {}
alias mystruct_p = _mystruct*;

mystruct_p mystruct_create ();
void mystruct_load_file (mystruct_p my, const(char) *filename);
void mystruct_free (mystruct_p my);
You can call them from D code in two ways:
auto test = mystruct_create ();
// treating "test" as function parameter
mystruct_load_file (test, "/tmp/example");
// treating the function as member of "test"
test.mystruct_load_file ("/tmp/example");
test.mystruct_free ();
This allows writing logically sane code, in case the C functions can really be considered member functions of the struct they are acting on. This property of the language is a general concept, so a function which takes a string as first parameter can also be called like a member function of string. Writing D bindings to existing C code is also really simple, and can even be automatized using tools like dstep. Since D can also easily export C functions, calling D code from C is also possible.

Getting rid of C++ cruft
There are many things which are bad in C++, some of which are inherited from C. D kills pretty much all of the stuff I found annoying. Some cool stuff from D is now in C++ as well, which makes this point a bit less strong, but it's still valid. E.g. getting rid of the #include preprocessor dance by using symbolic import statements makes sense, and there have IMHO been huge improvements over C++ when it comes to metaprogramming.

Incredibly powerful metaprogramming
Getting into detail about that would take way too long, but the metaprogramming abilities of D must be mentioned. You can do pretty much anything at compile time, for example compiling regular expressions to make them run faster at runtime, or mixing in additional code from string constants. The template system is also very well thought out, and never caused me headaches as much as C++ sometimes manages to do.

Built-in unit-test support
Unittesting with D is really easy: you just add one or more unittest blocks to your code, in which you write your tests (there is a small combined sketch after the "Scope blocks" point below). When running the tests, the D compiler will collect the unittest blocks and build a test application out of them. The unittest scope is useful because you can keep the actual code and the tests close together, and it encourages writing tests and keeping them up-to-date. Additionally, D has built-in support for contract programming, which helps to further reduce bugs by validating input/output.

Safe D
While D gives you the whole power of a low-level system programming language, it also allows you to write safer code and have the compiler check for that, while still being able to use unsafe functions when needed. Unfortunately, @safe is not the default for functions though.

Separate operators for addition and concatenation
D exclusively uses the + operator for addition, while the ~ operator is used for concatenation. This is likely a personal quirk, but I love it very much that this distinction exists. It's nice for things like addition of two vectors vs. concatenation of vectors, and makes the whole language much more precise in its meaning.

Optional garbage collector
D has an optional garbage collector. Developing in D without GC is currently a bit cumbersome, but these issues are being addressed. If you can live with a GC though, having it active makes programming much easier.

Built-in documentation generator
This is almost granted for most new languages, but still something I want to mention: Ddoc is a standard tool to generate code documentation for D code, with a defined syntax for describing function parameters, classes, etc. It will even take the contents of a unittest scope to generate automatic examples for the usage of a function, which is pretty cool.

Scope blocks
The scope statement allows one to execute a bit of code before the function exits, when it failed, or when it was successful. This is incredibly useful when working with C code, where a free statement needs to be issued when the function is exited, or some arbitrary cleanup needs to be performed on error.
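To make the unittest and scope points concrete, here is a minimal sketch of my own (not from the original post); it assumes the example mystruct_* declarations from the C interfacing snippet above:

import std.string : toStringz;

void processFile (string filename)
{
    auto my = mystruct_create ();
    // scope(exit) runs on every way out of this function, success or failure,
    // so the C-side free sits right next to the allocation it balances
    scope (exit) my.mystruct_free ();
    my.mystruct_load_file (filename.toStringz);
    // ... work with 'my'; no manual cleanup needed on early returns or exceptions
}

unittest
{
    // unittest blocks live next to the code they test and are compiled
    // into a test runner when building with -unittest
    assert ("foo" ~ "bar" == "foobar"); // ~ concatenates; + would not compile here
    assert ([1, 2] ~ [3] == [1, 2, 3]);
}

Building with dmd -unittest (or the LDC/GDC equivalents) runs these blocks before main() starts.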
Yes, we do have smart pointers in C++, and with some GCC/Clang extensions a similar feature in C too. But the scopes concept in D is much more powerful. See Scope Guard Statement for details.

Built-in syntax for parallel programming
Working with threads is so much more fun in D compared to C! I recommend taking a look at the parallelism chapter of the Programming in D book.

Pure functions
D allows marking functions as purely functional, which allows the compiler to do optimizations on them, e.g. cache their return value. See pure-functions.

D is fast!
D matches the speed of C++ on almost all occasions, so you won't lose performance when writing D code; that is, unless you have the GC run often in a threaded environment.

Very active and friendly community
The D community is very active and friendly; so far I have only had good experiences, and I basically came into the community asking some tough questions regarding distro-integration and ABI stability of D. The D community is very enthusiastic about pushing D, and especially the metaprogramming features of D, to its limits, and consists of very knowledgeable people. Most discussion happens at the forums/newsgroups at forum.dlang.org.

What is bad about D?

Half-proprietary reference compiler
This is probably the biggest issue. Not because the proprietary compiler is bad per se, but because of the implications this has for the D ecosystem. For the reference D compiler, Digital Mars D (DMD), only the frontend is distributed under a free license (Boost), while the backend is proprietary. The FLOSS frontend is what the free compilers, LLVM D Compiler (LDC) and GNU D Compiler (GDC), are based on. But since DMD is the reference compiler, most features land there first, and the Phobos standard library and druntime are tuned to work with DMD first. Since major Linux distributions can't ship with DMD, and the free compilers GDC and LDC lag behind DMD in terms of language, runtime and standard-library compatibility, this creates a split world of code that compiles with LDC, GDC or DMD, but never with all D compilers, due to it relying on features not yet in e.g. GDC's Phobos. Especially for Linux distributions, there is no way to say "use this compiler to get the best and latest D compatibility". Additionally, if people can't simply apt install latest-d, they are less likely to try the language. This is probably mainly an issue on Linux, but since Linux is the place where web applications are usually written and people are likely to try out new languages, it's really bad that the proprietary reference compiler is hurting D adoption in that way. That being said, I want to make clear DMD is a great compiler, which is very fast and builds efficient code. I only criticise the fact that it is the language reference compiler. UPDATE: To clarify the half-proprietary nature of the compiler, let me quote the D FAQ:
The front end for the dmd D compiler is open source. The back end for dmd is licensed from Symantec, and is not compatible with open-source licenses such as the GPL. Nonetheless, the complete source comes with the compiler, and all development takes place publically on github. Compilers using the DMD front end and the GCC and LLVM open source backends are also available. The runtime library is completely open source using the Boost License 1.0. The gdc and ldc D compilers are completely open sourced.
Phobos (standard library) is deprecating features too quickly
This basically goes hand in hand with the compiler issue mentioned above. Each D compiler ships its own version of Phobos, which it was tested against. For GDC, which I used to compile my code due to LDC having bugs at that time, this means that it is shipping with a very outdated copy of Phobos. Due to the rapid evolution of Phobos, this meant that the documentation of Phobos and the actual code I was working with were not always in sync, leading to many frustrating experiences. Furthermore, Phobos sometimes removes deprecated bits about a year after they have been deprecated. Together with the older-Phobos situation, you might find yourself in a place where a feature was dropped, but the cool replacement is not yet available. Or you are unable to import some 3rd-party code because it uses some deprecated-and-removed feature internally. Or you are unable to use other code, because it was developed with a D compiler shipping with a newer Phobos. This is really annoying, and probably the biggest source of unhappiness I had while working with D; especially the documentation not matching the actual code is a bad experience for someone new to the language.

Incomplete free compilers with varying degrees of maturity
LDC and GDC have bugs, and for someone new to the language it's not clear which one to choose. Both LDC and GDC have their own issues at times, but they are rapidly getting better, and I only encountered some actual compiler bugs in LDC (GDC worked fine, but with an incredibly out-of-date Phobos). All issues are fixed meanwhile, but this was a frustrating experience. Some clear advice or explanation of which of the free compilers is to be preferred when you are new to D would be neat. For GDC in particular, being developed outside of the main GCC project is likely a problem, because distributors need to manually add it to their GCC packaging, instead of having it readily available. I assume this is due to the DRuntime/Phobos not being subjected to the FSF CLA, but I can't actually say anything substantial about this issue. Debian adds GDC to its GCC packaging, but e.g. Fedora does not do that.

No ABI compatibility
D has a defined ABI; too bad that, in reality, the compilers are not interoperable. A binary compiled with GDC can't call a library compiled with LDC or DMD. GDC actually doesn't even support building shared libraries yet. For distributions, this is quite terrible, because it means that there must be one default D compiler, without any exception, and that users also need to use that specific compiler to link against distribution-provided D libraries. The different runtimes per compiler complicate that problem further.

The D package manager, dub, does not yet play well with distro packaging
This is an issue that is important to me, since I want my software to be easily packageable by Linux distributions. The issues causing packaging to be hard are reported as dub issue #838 and issue #839, with quite positive feedback so far, so this might soon be solved.

The GC is sometimes an issue
The garbage collector in D is quite dated (according to their own docs) and is currently being reworked. While working with asgen, which is a program creating a large amount of interconnected data structures in a threaded environment, I realized that the GC is significantly slowing down the application when threads are used (it also seems to use UNIX signals SIGUSR1 and SIGUSR2 to stop/resume threads, which I still find odd).
Also, the GC performed poorly under memory pressure, which did get asgen killed by the OOM killer on some more memory-constrained machines. Triggering a manual collection run after a large number of these interconnected data structures was no longer needed solved this problem for most systems, but it would of course have been better not to need to give the GC any hints. The stop-the-world behavior isn't a problem for asgen, but it might be for other applications. These issues are currently being worked on, with a GSoC project laying the foundation for further GC improvements.

version is a reserved word
Okay, that is admittedly a very tiny nitpick, but when developing an app which works with packages and versions, it's slightly annoying. The version keyword is used for conditional compilation, and needing to abbreviate it to ver in all parts of the code sucks a little (e.g. the Package interface can't have a property version, but now has ver instead).

The ecosystem is not (yet) mature
In general it can be said that the D ecosystem, while existing for almost 9 years, is not yet that mature. There are various quirks you have to deal with when working with D code on Linux. It's never anything major; usually you can easily solve these issues and go on, but it's annoying to have these papercuts. This is not something which can be resolved by D itself; this point will solve itself as more people start to use D and D support in Linux distributions gets more polished.

Conclusion
I like to work with D, and I consider it to be a great language; the quirks it has in its toolchain are not bad enough to prevent writing great things with it. At the moment, if I am not writing a shared library or something which uses much existing C++ code, I would prefer D for the task. If a garbage collector is a problem (e.g. for some real-time applications, or when the target architecture can't run a GC), I would not recommend using D; Rust seems to be the much better choice then. In any case, D's flat learning curve (for C/C++ people) paired with the smart choices taken in language design, the powerful metaprogramming, the rich standard library and helpful community makes it great to try out and to develop software for scenarios where you would otherwise choose C++ or Java. Quite honestly, I think D could be a great language for tasks where you would usually choose Python, Java or C++, and I am seriously considering replacing quite some Python code with D code. For very low-level stuff, C is IMHO still the better choice. As always, choosing the right programming language is only 50% technical aspects and 50% personal taste.

UPDATE: To get some idea of D, check out the D tour on the new website tour.dlang.org.

30 April 2016

Chris Lamb: Free software activities in April 2016

Here is my monthly update covering a large part of what I have been doing in the free software world (previously):
Debian My work in the Reproducible Builds project was covered in our weekly reports. (#48, #49, #50, #51 & #52)
Uploads
  • redis (2:3.0.7-3) Adding, amongst some other changes, systemd LimitNOFILE support to allow a higher number of open file descriptors.


FTP Team

As a Debian FTP assistant I ACCEPTed 135 packages: aptitude, asm, beagle, blends, btrfs-progs, camitk, cegui-mk2, cmor-tables, containerd, debian-science, debops, debops-playbooks, designate-dashboard, efitools, facedetect, flask-testing, fstl, ganeti-os-noop, gnupg, golang-fsnotify, golang-github-appc-goaci, golang-github-benbjohnson-tmpl, golang-github-dchest-safefile, golang-github-docker-go, golang-github-dylanmei-winrmtest, golang-github-hawkular-hawkular-client-go, golang-github-hlandau-degoutils, golang-github-hpcloud-tail, golang-github-klauspost-pgzip, golang-github-kyokomi-emoji, golang-github-masterminds-semver-dev, golang-github-masterminds-vcs-dev, golang-github-masterzen-xmlpath, golang-github-mitchellh-ioprogress, golang-github-smartystreets-assertions, golang-gopkg-hlandau-configurable.v1, golang-gopkg-hlandau-easyconfig.v1, golang-gopkg-hlandau-service.v2, golang-objx, golang-pty, golang-text, gpaste, gradle-plugin-protobuf, grip, haskell-brick, haskell-hledger-ui, haskell-lambdabot-haskell-plugins, haskell-text-zipper, haskell-werewolf, hkgerman, howdoi, jupyter-client, jupyter-core, letsencrypt.sh, libbpp-phyl, libbpp-raa, libbpp-seq, libbpp-seq-omics, libcbor-xs-perl, libdancer-plugin-email-perl, libdata-page-pageset-perl, libevt, libevtx, libgit-version-compare-perl, libgovirt, libmsiecf, libnet-ldap-server-test-perl, libpgobject-type-datetime-perl, libpgobject-type-json-perl, libpng1.6, librest-client-perl, libsecp256k1, libsmali-java, libtemplates-parser, libtest-requires-git-perl, libtext-xslate-perl, linux, linux-signed, mandelbulber2, netlib-java, nginx, node-rc, node-utml, nvidia-cuda-toolkit, openfst, openjdk-9, openssl, php-cache-integration-tests, pulseaudio, pyfr, pygccxml, pytest-runner, python-adventure, python-arrayfire, python-django-feincms, python-fastimport, python-fitsio, python-imagesize, python-lib389, python-libtrace, python-neovim-gui, python3-proselint, pythonpy, pyzo, r-cran-ca, r-cran-fitbitscraper, r-cran-goftest, r-cran-rnexml, r-cran-rprotobuf, rrdtool, ruby-proxifier, ruby-seamless-database-pool, ruby-syslog-logger, rustc, s5, sahara-dashboard, salt-formula-ceilometer, salt-formula-cinder, salt-formula-glance, salt-formula-heat, salt-formula-horizon, salt-formula-keystone, salt-formula-neutron, salt-formula-nova, seer, simplejson, smrtanalysis, tiles-autotag, tqdm, tran, trove-dashboard, vim, vulkan, xapian-bindings & xapian-core.

17 April 2016

Andreas Metzler: balance sheet snowboarding season 2015/16

A very weak season, mainly due to two reasons: Here is the balance sheet:
season                    2005/06  2006/07  2007/08  2008/09  2009/10  2010/11  2011/12  2012/13  2013/14  2014/15  2015/16
number of (partial) days       25       17       29       37       30       30       25       23       30       24       17
Damüls                         10       10        5       10       16       23       10        4       29        9        4
Diedamskopf                    15        4       24       23       13        4       14       19        1       13       12
Warth/Schröcken                 0        3        0        4        1        3        1        0        0        2        1
total meters of altitude   124634    74096   219936   226774   202089   203918   228588   203562   274706   224909   138037
highscore                  10247m    8321m   12108m   11272m   11888m   10976m   13076m   13885m   12848m   13278m   11015m
# of runs                     309      189      503      551      462      449      516      468      597      530      354

27 December 2015

Vincent Sanders: The only pleasure I get from moving house is stumbling across books I had forgotten I owned

I have to agree with John Burnside on that statement; after having recently moved house again, rediscovering our book collection has been a salve for an otherwise exhausting undertaking. I returned to Cambridge four years ago, initially on my own, and then subsequently the family moved down to be with me.

We rented a house but, with two growing teenagers, the accommodation was becoming a little crowded. Melodie and I decided the relocation was permanent and started looking for our own property, eventually finding something to our liking in Cottenham village.

Melodie took the opportunity to have the house cleaned and decorated while it was empty, because of the overlapping time with our rental property. This meant we had to be a little careful while moving in, as there was still wet paint in places.

Some of our books
Moving weekend was made bearable by Steve, Jonathan and Jo lending a hand especially on the trips to Yorkshire to retrieve, amongst other things, the aforementioned book collection. We were also fortunate to have Andy and Jane doing many other important jobs around the place while the rest of us were messing about in vans.

The desk in the study
The seemingly obligatory trip to IKEA to acquire furniture was made much more fun by trying to park a Luton van, which was only possible because Steve and Jonathan helped me. Though it turns out IKEA ship mattresses rolled up so tightly they can be moved in an estate car, so taking the van was unnecessary.

Alex under his loft bed
Having moved in, it seems like every weekend is filled with a never-ending "todo" list of jobs, from clearing gutters to building a desk in the study. Eight weeks on and the list seems to be slowly shrinking, meaning I can even do some lower-priority things like the server rack, which was actually a fun project.


Joshua in his completed room

The holidays this year afforded me some time to finish the boys' bedrooms. They both got loft beds with a substantial area underneath. This allows them both to have double beds along with a desk and plenty of storage. Completing the rooms required the construction of some flat pack furniture which, rather than simply doing myself, I supervised the boys doing themselves.

Alexander building flat pack furniture
Teaching them by letting them get on with it was surprisingly effective, and both of them got the hang of the construction method pretty quickly. There were only a couple of errors, from which they learned immediately and did not repeat (drawer bottoms having a finished side, and front becomes back when you are constructing upside down).

Joshua assembling flat pack furniture
The house is starting to feel like home and soon all the problems will fade from memory while the good will remain. Certainly our first holiday season has been comfortable here and I look forward to many more re-reading our books.

23 November 2015

Riku Voipio: Using ser2net for serial access.

Is your table a mess of wires? Do you have multiple devices connected via serial and can't remember which /dev/ttyUSBX is connected to what board? Unless you are an embedded developer, you are unlikely to deal with serial much anymore; in that case you can just jump to the next post in your news feed.

Introducing ser2net
Usually people start with minicom for serial access. There are better tools - picocom, screen, etc. But to easily map multiple serial ports, use ser2net. Ser2net makes serial ports available over telnet.

Persistent USB device names and ser2net
To remember which usb-serial adapter is connected to what, we use the /dev/serial tree created by udev, in /etc/ser2net.conf:

# arndale
7004:telnet:0:'/dev/serial/by-path/pci-0000:00:1d.0-usb-0:1.8.1:1.0-port0':115200 8DATABITS NONE 1STOPBIT
# cubox
7005:telnet:0:/dev/serial/by-id/usb-Prolific_Technology_Inc._USB-Serial_Controller_D-if00-port0:115200 8DATABITS NONE 1STOPBIT
# sonic-screwdriver
7006:telnet:0:/dev/serial/by-id/usb-FTDI_FT230X_96Boards_Console_DAZ0KA02-if00-port0:115200 8DATABITS NONE 1STOPBIT
The by-path syntax is needed if you have many identical usb-to-serial adapters. In that case a patch from the BTS is needed to support quoting in the serial path. Ser2net doesn't seem very actively maintained upstream - a sure sign that a project is stagnant is a homepage still at sourceforge.net... This patch, among other interesting features, can also be found in various ser2net forks on github.

Setting easy to remember names
Finally, unless you want to memorize the port numbers, set TCP port to name mappings in /etc/services:

# Local services
arndale 7004/tcp
cubox 7005/tcp
sonic-screwdriver 7006/tcp
Now finally:
telnet localhost sonic-screwdriver
Mandatory picture of serial port connection in action

8 August 2015

Dirk Eddelbuettel: drat 0.1.0: Even more repository support

drat and double drat
A new version 0.1.0 of the drat package arrived on CRAN today. Its name stands for drat R Archive Template, and it helps with easy-to-create and easy-to-use repositories for R packages, and is finding increasing use by other projects. This version 0.1.0 builds on the previous releases and now adds complete support for binaries for both Windows and OS X, extending what had been added in the previous release 0.0.4. This and other new features are listed below: Also of note: The rOpenSci project now uses drat to distribute their code, and Carl Boettiger has written a nice blog post about it. Courtesy of CRANberries, there is a comparison to the previous release. More detailed information is on the drat page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

21 June 2015

Steve Kemp: We're all about storing objects

Recently I've been experimenting with camlistore, which is yet another object storage system. Camlistore gains immediate points because it is written in Go, and is a project initiated by Brad Fitzpatrick, the creator of Perlbal, memcached, and Livejournal of course. Camlistore is designed exactly how I'd like to see an object storage system - each server allows you to:
It should be noted more is possible, there's a pretty web UI for example, but I'm simplifying. Do your own homework :)
With those primitives you can allow a client-library to upload a file once, then in the background a bunch of dumb servers can decide amongst themselves "Hey I have data with ID:33333 - Do you?". If nobody else does, they can upload a second copy. In short, this kind of system allows the replication to be decoupled from the storage. The obvious risk is obvious though: if you upload a file, the chunks might live on a host that dies 20 minutes later, just before the content was replicated. That risk is minimal, but valid. There is also the risk that sudden rashes of uploads leave the system consuming all the internal bandwidth constantly comparing chunk-IDs, trying to see whether data that has been copied numerous times in the past needs replacing, or trying to play "catch-up" if the new content is larger than the replica bandwidth. I guess it should be possible to detect those conditions, but they're things to be concerned about.

Anyway, the biggest downside with camlistore is the lack of documentation about rebalancing, replication, or anything other than simple single-server setups. Some people have blogged about it, and I got it working between two nodes, but I didn't feel confident it was as robust as I wanted it to be. I have a strong belief that Camlistore will become a project of joy and wonder, but it isn't quite there yet. I certainly don't want to stop watching it :)

On to the more personal... I'm all about the object storage these days. Right now most of my objects are packed in a collection of boxes. On the 6th of next month a shipping container will come pick them up and take them to Finland. For pretty much 20 days in a row we've been taking things to the skip, or the local charity-shops. I expect that by the time we've relocated, the amount of possessions we'll maintain will be at least a fifth of our current levels. We're working on the general rule of thumb: "If it is possible to replace an item we will not take it". That means chess-sets, mirrors, etc, will not be carried. DVDs, for example, have been slashed brutally such that we're only transferring 40 out of a starting collection of 500+. Only personal, one-off, unique, or "significant" items will be transported. This includes things like personal photographs, family items, and similar. Clothes? Well, I need to take one jacket, but more can be bought. The only place I put my foot down was books. Yes, I'm a kindle-user these days, but I spent many years tracking down some rare volumes, and though it would be possible to repeat that effort I just don't want to. I've also decided that I'm carrying my complete toolbox. Some of the tools I took with me when I left home at 18 have stayed with me for the past 20+ years. I don't need this specific crowbar, or axe, but I'm damned if I'm going to lose them now. So they stay. Object storage - some objects are more important than they should be!

27 May 2015

Dirk Eddelbuettel: drat 0.0.4: Yet more features and documentation

A new version, now at 0.0.4, of the drat package arrived on CRAN yesterday. Its name stands for drat R Archive Template, and it helps with easy-to-create and easy-to-use repositories for R packages, and is finding increasing use by other projects. Version 0.0.4 brings both new code and more documentation: Courtesy of CRANberries, there is a comparison to the previous release. More detailed information is on the drat page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.
