Search Results: "warp"

26 October 2025

Russ Allbery: Review: Ancestral Night

Review: Ancestral Night, by Elizabeth Bear
Series: White Space #1
Publisher: Saga Press
Copyright: 2019
ISBN: 1-5344-0300-0
Format: Kindle
Pages: 501
Ancestral Night is a far-future space opera novel and the first of a series. It shares a universe with Bear's earier Jacob's Ladder trilogy, and there is a passing reference to the events of Grail that would be a spoiler if you put the pieces together, but it's easy to miss. You do not need to read the earlier series to read this book (although it's a good series and you might enjoy it). Halmey Dz is a member of the vast interstellar federation called the Synarche, which has put an end to war and other large-scale anti-social behavior through a process called rightminding. Every person has a neural implant that can serve as supplemental memory, off-load some thought processes, and, crucially, regulate neurotransmitters and hormones to help people stay on an even keel. It works, mostly. One could argue Halmey is an exception. Raised in a clade that took rightminding to an extreme of suppression of individual personality into a sort of hive mind, she became involved with a terrorist during her legally mandated time outside of her all-consuming family before she could make an adult decision to stay with them (essentially a rumspringa). The result was a tragedy that Halmey doesn't like to think about, one that's left deep emotional scars. But Halmey herself would argue she's not an exception: She's put her history behind her, found partners that she trusts, and is a well-adjusted member of the Synarche.
Eventually, I realized that I was wasting my time, and if I wanted to hide from humanity in a bottle, I was better off making it a titanium one with a warp drive and a couple of carefully selected companions.
Halmey does salvage: finding ships lost in white space and retrieving them. One of her partners is Connla, a pilot originally from a somewhat atavistic world called Spartacus. The other is their salvage tug.
The boat didn't have a name. He wasn't deemed significant enough to need a name by the authorities and registries that govern such things. He had a registration number 657-2929-04, Human/Terra and he had a class, salvage tug, but he didn't have a name. Officially. We called him Singer. If Singer had an opinion on the issue, he'd never registered it but he never complained. Singer was the shipmind as well as the ship or at least, he inhabited the ship's virtual spaces the same way we inhabited the physical ones but my partner Connla and I didn't own him. You can't own a sentience in civilized space.
As Ancestral Night opens, the three of them are investigating a tip of a white space anomoly well off the beaten path. They thought it might be a lost ship that failed a transition. What they find instead is a dead Ativahika and a mysterious ship equipped with artificial gravity. The Ativahikas are a presumed sentient race of living ships that are on the most alien outskirts of the Synarche confederation. They don't communicate, at least so far as Halmey is aware. She also wasn't aware they died, but this one is thoroughly dead, next to an apparently abandoned ship of unknown origin with a piece of technology beyond the capabilities of the Synarche. The three salvagers get very little time to absorb this scene before they are attacked by pirates. I have always liked Bear's science fiction better than her fantasy, and this is no exception. This was great stuff. Halmey is a talkative, opinionated infodumper, which is a great first-person protagonist to have in a fictional universe this rich with delightful corners. There are some Big Dumb Object vibes (one of my favorite parts of salvage stories), solid character work, a mysterious past that has some satisfying heft once it's revealed, and a whole lot more moral philosophy than I was expecting from the setup. All of it is woven together with experienced skill, unsurprising given Bear's long and prolific career. And it's full of delightful world-building bits: Halmey's afthands (a surgical adaptation for zero gravity work) and grumpiness at the sheer amount of gravity she has to deal with over the course of this book, the Culture-style ship names, and a faster-than-light travel system that of course won't pass physics muster but provides a satisfying quantity of hooky bits for plot to attach to. The backbone of this book is an ancient artifact mystery crossed with a murder investigation. Who killed the Ativahika? Where did the gravity generator come from? Those are good questions with interesting answers. But the heart of the book is a philosophical conflict: What are the boundaries between identity and society? How much power should society have to reshape who we are? If you deny parts of yourself to fit in with society, is this necessarily a form of oppression? I wrote a couple of paragraphs of elaboration, and then deleted them; on further thought, I don't want to give any more details about what Bear is doing in this book. I will only say that I was not expecting this level of thoughtfulness about a notoriously complex and tricky philosophical topic in a full-throated adventure science fiction novel. I think some people may find the ending strange and disappointing. I loved it, and weeks after finishing this book I'm still thinking about it. Ancestral Night has some pacing problems. There is a long stretch in the middle of the book that felt repetitive and strained, where Bear holds the reader at a high level of alert and dread for long enough that I found it enervating. There are also a few political cheap shots where Bear picks the weakest form of an opposing argument instead of the strongest. (Some of the cheap shots are rather satisfying, though.) The dramatic arc of the book is... odd, in a way that I think was entirely intentional given how well it works with the thematic message, but which is also unsettling. You may not get the catharsis that you're expecting. But all of this serves a purpose, and I thought that purpose was interesting. Ancestral Night is one of those books that I liked more a week after I finished it than I did when I finished it.
Epiphanies are wonderful. I m really grateful that our brains do so much processing outside the line of sight of our consciousnesses. Can you imagine how downright boring thinking would be if you had to go through all that stuff line by line?
Also, for once, I think Bear hit on exactly the right level of description rather than leaving me trying to piece together clues and hope I understood the plot. It helps that Halmey loves to explain things, so there are a lot of miniature infodumps, but I found them interesting and a satisfying throwback to an earlier style of science fiction that focused more on world-building than on interpersonal drama. There is drama, but most of it is internal, and I thought the balance was about right. This is solid, well-crafted work and a good addition to the genre. I am looking forward to the rest of the series. Followed by Machine, which shifts to a different protagonist. Rating: 8 out of 10

21 September 2025

Joey Hess: cheap DIY solar fence design

A year ago I installed a 4 kilowatt solar fence. I'm revisiting it this Sun Day, to share the design, now that I have prooved it out.
The solar fence and some other ground and pole mount solar panels, seen through leaves.
Solar fencing manufacturers have some good simple designs, but it's hard to buy for a small installation. They are selling to utility scale solar mostly. And those are installed by driving metal beams into the ground, which requires heavy machinery. Since I have experience with Ironridge rails for roof mount solar, I decided to adapt that system for a vertical mount. Which is something it was not designed for. I combined the Ironridge hardware with regular parts from the hardware store. The cost of mounting solar panels nowadays is often higher than the cost of the panels. I hoped to match the cost, and I nearly did. The solar panels cost $100 each, and the fence cost $110 per solar panel. This fence was significantly cheaper than conventional ground mount arrays that I considered as alternatives, and made a better use of a difficult hillside location. I used 7 foot long Ironridge XR-10 rails, which fit 2 solar panels per rail. (Longer rails would need a center post anyway, and the 7 foot long rails have cheaper shipping, since they do not need to be shipped freight.) For the fence posts, I used regular 4x4" treated posts. 12 foot long, set in 3 foot deep post holes, with 3x 50 lb bags of concrete per hole and 6 inches of gravel on the bottom.
detail of how the rails are mounted to the posts, and the panels to the rails
To connect the Ironridge rails to the fence posts, I used the Ironridge LFT-03-M1 slotted L-foot bracket. Screwed into the post with a 5/8 x 3 inch hot-dipped galvanized lag screw. Since a treated post can react badly with an aluminum bracket, there needs to be some flashing between the post and bracket. I used Shurtape PW-100 tape for that. I see no sign of corrosion after 1 year. The rest of the Ironridge system is a T-bolt that connects the rail to the L-foot (part BHW-SQ-02-A1), and Ironridge solar panel fasteners (UFO-CL-01-A1 and UFO-STP-40MM-M1). Also XR-10 end caps and wire clips. Since the Ironridge hardware is not designed to hold a solar panel at a 90 degree angle, I was concerned that the panels might slide downward over time. To help prevent that, I added some additional support brackets under the bottom of the panels. So far, that does not seem to have been a problem though. I installed Aptos 370 watt solar panels on the fence. They are bifacial, and while the posts block the back partially, there is still bifacial gain on cloudy days. I left enough space under the solar panels to be able to run a push mower under them.
Me standing in front of the solar fence at end of construction
I put pairs of posts next to one-another, so each 7 foot segment of fence had its own 2 posts. This is the least elegant part of this design, but fitting 2 brackets next to one-another on a single post isn't feasible. I bolted the pairs of posts together with some spacers. A side benefit of doing it this way is that treated lumber can warp as it dries, and this prevented much twisting of the posts. Using separate posts for each segment also means that the fence can traverse a hill easily. And it does not need to be perfectly straight. In fact, my fence has a 30 degree bend in the middle. This means it has both south facing and south-west facing panels, so can catch the light for longer during the day. After building the fence, I noticed there was a slight bit of sway at the top, since 9 feet of wooden post is not entirely rigid. My worry was that a gusty wind could rattle the solar panels. While I did not actually observe that happening, I added some diagonal back bracing for peace of mind.
view of rear upper corner of solar fence, showing back bracing connection
Inspecting the fence today, I find no problems after the first year. I hope it will last 30 years, with the lifespan of the treated lumber being the likely determining factor. As part of my larger (and still ongoing) ground mount solar install, the solar fence has consistently provided great power. The vertical orientation works well at latitude 36. It also turned out that the back of the fence was useful to hang conduit and wiring and solar equipment, and so it turned into the electrical backbone of my whole solar field. But that's another story.. solar fence parts list
quantity cost per unit description
10 $27.89 7 foot Ironridge XR-10 rail
12 $20.18 12 foot treated 4x4
30 $4.86 Ironridge UFO-CL-01-A1
20 $0.87 Ironridge UFO-STP-40MM-M1
1 $12.62 Ironridge XR-10 end caps (20 pack)
20 $2.63 Ironridge LFT-03-M1
20 $1.69 Ironridge BHW-SQ-02-A1
22 $2.65 5/8 x 3 inch hot-dipped galvanized lag screw
10 $0.50 6 gravel per post
30 $6.91 50 lb bags of quickcrete
1 $15.00 Shurtape PW-100 Corrosion Protection Pipe Wrap Tape
N/A $30 other bolts and hardware (approximate)
$1100 total (Does not include cost of panels, wiring, or electrical hardware.)

11 April 2025

Reproducible Builds: Reproducible Builds in March 2025

Welcome to the third report in 2025 from the Reproducible Builds project. Our monthly reports outline what we ve been up to over the past month, and highlight items of news from elsewhere in the increasingly-important area of software supply-chain security. As usual, however, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. Table of contents:
  1. Debian bookworm live images now fully reproducible from their binary packages
  2. How NixOS and reproducible builds could have detected the xz backdoor
  3. LWN: Fedora change aims for 99% package reproducibility
  4. Python adopts PEP standard for specifying package dependencies
  5. OSS Rebuild real-time validation and tooling improvements
  6. SimpleX Chat server components now reproducible
  7. Three new scholarly papers
  8. Distribution roundup
  9. An overview of Supply Chain Attacks on Linux distributions
  10. diffoscope & strip-nondeterminism
  11. Website updates
  12. Reproducibility testing framework
  13. Upstream patches

Debian bookworm live images now fully reproducible from their binary packages Roland Clobus announced on our mailing list this month that all the major desktop variants (ie. Gnome, KDE, etc.) can be reproducibly created for Debian bullseye, bookworm and trixie from their (pre-compiled) binary packages. Building reproducible Debian live images does not require building from reproducible source code, but this is still a remarkable achievement. Some large proportion of the binary packages that comprise these live images can (and were) built reproducibly, but live image generation works at a higher level. (By contrast, full or end-to-end reproducibility of a bootable OS image will, in time, require both the compile-the-packages the build-the-bootable-image stages to be reproducible.) Nevertheless, in response, Roland s announcement generated significant congratulations as well as some discussion regarding the finer points of the terms employed: a full outline of the replies can be found here. The news was also picked up by Linux Weekly News (LWN) as well as to Hacker News.

How NixOS and reproducible builds could have detected the xz backdoor Julien Malka aka luj published an in-depth blog post this month with the highly-stimulating title How NixOS and reproducible builds could have detected the xz backdoor for the benefit of all . Starting with an dive into the relevant technical details of the XZ Utils backdoor, Julien s article goes on to describe how we might avoid the xz catastrophe in the future by building software from trusted sources and building trust into untrusted release tarballs by way of comparing sources and leveraging bitwise reproducibility, i.e. applying the practices of Reproducible Builds. The article generated significant discussion on Hacker News as well as on Linux Weekly News (LWN).

LWN: Fedora change aims for 99% package reproducibility Linux Weekly News (LWN) contributor Joe Brockmeier has published a detailed round-up on how Fedora change aims for 99% package reproducibility. The article opens by mentioning that although Debian has been working toward reproducible builds for more than a decade , the Fedora project has now:
progressed far enough that the project is now considering a change proposal for the Fedora 43 development cycle, expected to be released in October, with a goal of making 99% of Fedora s package builds reproducible. So far, reaction to the proposal seems favorable and focused primarily on how to achieve the goal with minimal pain for packagers rather than whether to attempt it.
The Change Proposal itself is worth reading:
Over the last few releases, we [Fedora] changed our build infrastructure to make package builds reproducible. This is enough to reach 90%. The remaining issues need to be fixed in individual packages. After this Change, package builds are expected to be reproducible. Bugs will be filed against packages when an irreproducibility is detected. The goal is to have no fewer than 99% of package builds reproducible.
Further discussion can be found on the Fedora mailing list as well as on Fedora s Discourse instance.

Python adopts PEP standard for specifying package dependencies Python developer Brett Cannon reported on Fosstodon that PEP 751 was recently accepted. This design document has the purpose of describing a file format to record Python dependencies for installation reproducibility . As the abstract of the proposal writes:
This PEP proposes a new file format for specifying dependencies to enable reproducible installation in a Python environment. The format is designed to be human-readable and machine-generated. Installers consuming the file should be able to calculate what to install without the need for dependency resolution at install-time.
The PEP, which itself supersedes PEP 665, mentions that there are at least five well-known solutions to this problem in the community .

OSS Rebuild real-time validation and tooling improvements OSS Rebuild aims to automate rebuilding upstream language packages (e.g. from PyPI, crates.io, npm registries) and publish signed attestations and build definitions for public use. OSS Rebuild is now attempting rebuilds as packages are published, shortening the time to validating rebuilds and publishing attestations. Aman Sharma contributed classifiers and fixes for common sources of non-determinism in JAR packages. Improvements were also made to some of the core tools in the project:
  • timewarp for simulating the registry responses from sometime in the past.
  • proxy for transparent interception and logging of network activity.
  • and stabilize, yet another nondeterminism fixer.

SimpleX Chat server components now reproducible SimpleX Chat is a privacy-oriented decentralised messaging platform that eliminates user identifiers and metadata, offers end-to-end encryption and has a unique approach to decentralised identity. Starting from version 6.3, however, Simplex has implemented reproducible builds for its server components. This advancement allows anyone to verify that the binaries distributed by SimpleX match the source code, improving transparency and trustworthiness.

Three new scholarly papers Aman Sharma of the KTH Royal Institute of Technology of Stockholm, Sweden published a paper on Build and Runtime Integrity for Java (PDF). The paper s abstract notes that Software Supply Chain attacks are increasingly threatening the security of software systems and goes on to compare build- and run-time integrity:
Build-time integrity ensures that the software artifact creation process, from source code to compiled binaries, remains untampered. Runtime integrity, on the other hand, guarantees that the executing application loads and runs only trusted code, preventing dynamic injection of malicious components.
Aman s paper explores solutions to safeguard Java applications and proposes some novel techniques to detect malicious code injection. A full PDF of the paper is available.
In addition, Hamed Okhravi and Nathan Burow of Massachusetts Institute of Technology (MIT) Lincoln Laboratory along with Fred B. Schneider of Cornell University published a paper in the most recent edition of IEEE Security & Privacy on Software Bill of Materials as a Proactive Defense:
The recently mandated software bill of materials (SBOM) is intended to help mitigate software supply-chain risk. We discuss extensions that would enable an SBOM to serve as a basis for making trust assessments thus also serving as a proactive defense.
A full PDF of the paper is available.
Lastly, congratulations to Giacomo Benedetti of the University of Genoa for publishing their PhD thesis. Titled Improving Transparency, Trust, and Automation in the Software Supply Chain, Giacomo s thesis:
addresses three critical aspects of the software supply chain to enhance security: transparency, trust, and automation. First, it investigates transparency as a mechanism to empower developers with accurate and complete insights into the software components integrated into their applications. To this end, the thesis introduces SUNSET and PIP-SBOM, leveraging modeling and SBOMs (Software Bill of Materials) as foundational tools for transparency and security. Second, it examines software trust, focusing on the effectiveness of reproducible builds in major ecosystems and proposing solutions to bolster their adoption. Finally, it emphasizes the role of automation in modern software management, particularly in ensuring user safety and application reliability. This includes developing a tool for automated security testing of GitHub Actions and analyzing the permission models of prominent platforms like GitHub, GitLab, and BitBucket.

Distribution roundup In Debian this month:
The IzzyOnDroid Android APK repository reached another milestone in March, crossing the 40% coverage mark specifically, more than 42% of the apps in the repository is now reproducible Thanks to funding by NLnet/Mobifree, the project was also to put more time into their tooling. For instance, developers can now run easily their own verification builder in less than 5 minutes . This currently supports Debian-based systems, but support for RPM-based systems is incoming. Future work in the pipeline, including documentation, guidelines and helpers for debugging.
Fedora developer Zbigniew J drzejewski-Szmek announced a work-in-progress script called fedora-repro-build which attempts to reproduce an existing package within a Koji build environment. Although the project s README file lists a number of fields will always or almost always vary (and there are a non-zero list of other known issues), this is an excellent first step towards full Fedora reproducibility (see above for more information).
Lastly, in openSUSE news, Bernhard M. Wiedemann posted another monthly update for his work there.

An overview of Supply Chain Attacks on Linux distributions Fenrisk, a cybersecurity risk-management company, has published a lengthy overview of Supply Chain Attacks on Linux distributions. Authored by Maxime Rinaudo, the article asks:
[What] would it take to compromise an entire Linux distribution directly through their public infrastructure? Is it possible to perform such a compromise as simple security researchers with no available resources but time?

diffoscope & strip-nondeterminism diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading versions 290, 291, 292 and 293 and 293 to Debian:
  • Bug fixes:
    • file(1) version 5.46 now returns XHTML document for .xhtml files such as those found nested within our .epub tests. [ ]
    • Also consider .aar files as APK files, at least for the sake of diffoscope. [ ]
    • Require the new, upcoming, version of file(1) and update our quine-related testcase. [ ]
  • Codebase improvements:
    • Ensure all calls to our_check_output in the ELF comparator have the potential CalledProcessError exception caught. [ ][ ]
    • Correct an import masking issue. [ ]
    • Add a missing subprocess import. [ ]
    • Reformat openssl.py. [ ]
    • Update copyright years. [ ][ ][ ]
In addition, Ivan Trubach contributed a change to ignore the st_size metadata entry for directories as it is essentially arbitrary and introduces unnecessary or even spurious changes. [ ]

Website updates Once again, there were a number of improvements made to our website this month, including:

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In March, a number of changes were made by Holger Levsen, including:
  • reproduce.debian.net-related:
    • Add links to two related bugs about buildinfos.debian.net. [ ]
    • Add an extra sync to the database backup. [ ]
    • Overhaul description of what the service is about. [ ][ ][ ][ ][ ][ ]
    • Improve the documentation to indicate that need to fix syncronisation pipes. [ ][ ]
    • Improve the statistics page by breaking down output by architecture. [ ]
    • Add a copyright statement. [ ]
    • Add a space after the package name so one can search for specific packages more easily. [ ]
    • Add a script to work around/implement a missing feature of debrebuild. [ ]
  • Misc:
    • Run debian-repro-status at the end of the chroot-install tests. [ ][ ]
    • Document that we have unused diskspace at Ionos. [ ]
In addition:
  • James Addison made a number of changes to the reproduce.debian.net homepage. [ ][ ].
  • Jochen Sprickerhof updated the statistics generation to catch No space left on device issues. [ ]
  • Mattia Rizzolo added a better command to stop the builders [ ] and fixed the reStructuredText syntax in the README.infrastructure file. [ ]
And finally, node maintenance was performed by Holger Levsen [ ][ ][ ] and Mattia Rizzolo [ ][ ].

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:
Finally, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

5 March 2025

Otto Kek l inen: Will decentralized social media soon go mainstream?

Featured image of post Will decentralized social media soon go mainstream?In today s digital landscape, social media is more than just a communication tool it is the primary medium for global discourse. Heads of state, corporate leaders and cultural influencers now broadcast their statements directly to the world, shaping public opinion in real time. However, the dominance of a few centralized platforms X/Twitter, Facebook and YouTube raises critical concerns about control, censorship and the monopolization of information. Those who control these networks effectively wield significant power over public discourse. In response, a new wave of distributed social media platforms has emerged, each built on different decentralized protocols designed to provide greater autonomy, censorship resistance and user control. While Wikipedia maintains a comprehensive list of distributed social networking software and protocols, it does not cover recent blockchain-based systems, nor does it highlight which have the most potential for mainstream adoption. This post explores the leading decentralized social media platforms and the protocols they are based on: Mastodon (ActivityPub), Bluesky (AT Protocol), Warpcast (Farcaster), Hey (Lens) and Primal (Nostr).

Comparison of architecture and mainstream adoption potential
Protocol Identity System Example Storage model Cost for end users Potential
Mastodon Tied to server domain @ottok@mastodon.social Federated instances Free (some instances charge) High
Bluesky Portable (DID) ottoke.bsky.social Federated instances Free Moderate
Farcaster ENS (Ethereum) @ottok Blockchain + off-chain Small gas fees Moderate
Lens NFT-based (Polygon) @ottok Blockchain + off-chain Small gas fees Niche
Nostr Cryptographic Keys npub16lc6uhqpg6dnqajylkhwuh3j7ynhcnje508tt4v6703w9kjlv9vqzz4z7f Federated instances Free (some instances charge) Niche

1. Mastodon (ActivityPub) Screenshot of Mastodon Mastodon was created in 2016 by Eugen Rochko, a German software developer who sought to provide a decentralized and user-controlled alternative to Twitter. It was built on the ActivityPub protocol, now standardized by W3C Social Web Working Group, to allow users to join independent servers while still communicating across the broader Mastodon network. Mastodon operates on a federated model, where multiple independently run servers communicate via ActivityPub. Each server sets its own moderation policies, leading to a decentralized but fragmented experience. The servers can alternatively be called instances, relays or nodes, depending on what vocabulary a protocol has standardized on.
  • Identity: User identity is tied to the instance where they registered, represented as @username@instance.tld.
  • Storage: Data is stored on individual instances, which federate messages to other instances based on their configurations.
  • Cost: Free to use, but relies on instance operators willing to run the servers.
The protocol defines multiple activities such as:
  • Creating a post
  • Liking
  • Sharing
  • Following
  • Commenting

Example Message in ActivityPub (JSON-LD Format)
json
 
 "@context": "https://www.w3.org/ns/activitystreams",
 "type": "Create",
 "actor": "https://mastodon.social/users/ottok",
 "object":  
 "type": "Note",
 "content": "Hello from #Mastodon!",
 "published": "2025-03-03T12:00:00Z",
 "to": ["https://www.w3.org/ns/activitystreams#Public"]
  
 
Servers communicate across different platforms by publishing activities to their followers or forwarding activities between servers. Standard HTTPS is used between servers for communication, and the messages use JSON-LD for data representation. The WebFinger protocol is used for user discovery. There is however no neat way for home server discovery yet. This means that if you are browsing e.g. Fosstodon and want to follow a user and press Follow, a dialog will pop up asking you to enter your own home server (e.g. mastodon.social) to redirect you there for actually executing the Follow action on with your account. Mastodon is open source under the AGPL at github.com/mastodon/mastodon. Anyone can operate their own instance. It just requires to run your own server and some skills to maintain a Ruby on Rails app with a PostgreSQL database backend, and basic understanding of the protocol to configure federation with other ActivityPub instances.

Popularity: Already established, but will it grow more? Mastodon has seen steady growth, especially after Twitter s acquisition in 2022, with some estimates stating it peaked at 10 million users across thousands of instances. However, its fragmented user experience and the complexity of choosing instances have hindered mainstream adoption. Still, it remains the most established decentralized alternative to Twitter. Note that Donald Trump s Truth Social is based on the Mastodon software but does not federate with the ActivityPub network. The ActivityPub protocol is the most widely used of its kind. One of the other most popular services is the Lemmy link sharing service, similar to Reddit. The larger ecosystem of ActivityPub is called Fediverse, and estimates put the total active user count around 6 million.

2. Bluesky (AT Protocol) Screenshot of Bluesky Interestingly, Bluesky was conceived within Twitter in 2019 by Twitter founder Jack Dorsey. After being incubated as a Twitter-funded project, it spun off as an independent Public Benefit LLC in February 2022 and launched its public beta in February 2023. Bluesky runs on top of the Authenticated Transfer (AT) Protocol published at https://github.com/bluesky-social/atproto. The protocol enables portable identities and data ownership, meaning users can migrate between platforms while keeping their identity and content intact. In practice, however, there is only one popular server at the moment, which is Bluesky itself.
  • Identity: Usernames are domain-based (e.g., @user.bsky.social).
  • Storage: Content is theoretically federated among various servers.
  • Cost: Free to use, but relies on instance operators willing to run the servers.

Example Message in AT Protocol (JSON Format)
json
 
 "repo": "did:plc:ottoke.bsky.social",
 "collection": "app.bsky.feed.post",
 "record":  
 "$type": "app.bsky.feed.post",
 "text": "Hello from Bluesky!",
 "createdAt": "2025-03-03T12:00:00Z",
 "langs": ["en"]
  
 

Popularity: Hybrid approach may have business benefits? Bluesky reported over 3 million users by 2024, probably getting traction due to its Twitter-like interface and Jack Dorsey s involvement. Its hybrid approach decentralized identity with centralized components could make it a strong candidate for mainstream adoption, assuming it can scale effectively.

3. Warpcast (Farcaster Network) Farcaster was launched in 2021 by Dan Romero and Varun Srinivasan, both former crypto exchange Coinbase executives, to create a decentralized but user-friendly social network. Built on the Ethereum blockchain, it could potentially offer a very attack-resistant communication medium. However, in my own testing, Farcaster does not seem to fully leverage what Ethereum could offer. First of all, there is no diversity in programs implementing the protocol as at the moment there is only Warpcast. In Warpcast the signup requires an initial 5 USD fee that is not payable in ETH, and users need to create a new wallet address on the Ethereum layer 2 network Base instead of simply reusing their existing Ethereum wallet address or ENS name. Despite this, I can understand why Farcaster may have decided to start out like this. Having a single client program may be the best strategy initially. One of the decentralized chat protocol Matrix founders, Matthew Hodgson, shared in his FOSDEM 2025 talk that he slightly regrets focusing too much on developing the protocol instead of making sure the app to use it is attractive to end users. So it may be sensible to ensure Warpcast gets popular first, before attempting to make the Farcaster protocol widely used. As a protocol Farcaster s hybrid approach makes it more scalable than fully on-chain networks, giving it a higher chance of mainstream adoption if it integrates seamlessly with broader Web3 ecosystems.
  • Identity: ENS (Ethereum Name Service) domains are used as usernames.
  • Storage: Messages are stored in off-chain hubs, while identity is on-chain.
  • Cost: Users must pay gas fees for some operations but reading and posting messages is mostly free.

Example Message in Farcaster (JSON Format)
json
 
 "fid": 766579,
 "username": "ottok",
 "custodyAddress": "0x127853e48be3870172baa4215d63b6d815d18f21",
 "connectedWallet": "0x3ebe43aa3ae5b891ca1577d9c49563c0cee8da88",
 "text": "Hello from Farcaster!",
 "publishedAt": 1709424000,
 "replyTo": null,
 "embeds": []
 

Popularity: Decentralized social media + decentralized payments a winning combo? Ethereum founder Vitalik Buterin (warpcast.com/vbuterin) and many core developers are active on the platform. Warpcast, the main client for Farcaster, has seen increasing adoption, especially among Ethereum developers and Web3 enthusiasts. I too have an profile at warpcast.com/ottok. However, the numbers are still very low and far from reaching network effects to really take off. Blockchain-based social media networks, particularly those built on Ethereum, are compelling because they leverage existing user wallets and persistent identities while enabling native payment functionality. When combined with decentralized content funding through micropayments, these blockchain-backed social networks could offer unique advantages that centralized platforms may find difficult to replicate, being decentralized both as a technical network and in a funding mechanism.

4. Hey.xyz (Lens Network) The Lens Protocol was developed by decentralized finance (DeFi) team Aave and launched in May 2022 to provide a user-owned social media network. While initially built on Polygon, it has since launched its own Layer 2 network called the Lens Network in February 2024. Lens is currently the main competitor to Farcaster. Lens stores profile ownership and references on-chain, while content is stored on IPFS/Arweave, enabling composability with DeFi and NFTs.
  • Identity: Profile ownership is tied to NFTs on the Polygon blockchain.
  • Storage: Content is on-chain and integrates with IPFS/Arweave (like NFTs).
  • Cost: Users must pay gas fees for some operations but reading and posting messages is mostly free.

Example Message in Lens (JSON Format)
json
 
 "profileId": "@ottok",
 "contentURI": "ar://QmExampleHash",
 "collectModule": "0x23b9467334bEb345aAa6fd1545538F3d54436e96",
 "referenceModule": "0x0000000000000000000000000000000000000000",
 "timestamp": 1709558400
 

Popularity: Probably not as social media site, but maybe as protocol? The social media side of Lens is mainly the Hey.xyz website, which seems to have fewer users than Warpcast, and is even further away from reaching critical mass for network effects. The Lens protocol however has a lot of advanced features and it may gain adoption as the building block for many Web3 apps.

5. Primal.net (Nostr Network) Nostr (Notes and Other Stuff Transmitted by Relays) was conceptualized in 2020 by an anonymous developer known as fiatjaf. One of the primary design tenets was to be a censorship-resistant protocol and it is popular among Bitcoin enthusiasts, with Jack Dorsey being one of the public supporters. Unlike the Farcaster and Lens protocols, Nostr is not blockchain-based but just a network of relay servers for message distribution. If does however use public key cryptography for identities, similar to how wallets work in crypto.
  • Identity: Public-private key pairs define identity (with prefix npub...).
  • Storage: Content is federated among multiple servers, which in Nostr vocabulary are called relays.
  • Cost: No gas fees, but relies on relay operators willing to run the servers.

Example Message in Nostr (JSON Format)
json
 
 "id": "note1xyz...",
 "pubkey": "npub1...",
 "kind": 1,
 "content": "Hello from Nostr!",
 "created_at": 1709558400,
 "tags": [],
 "sig": "sig1..."
 

Popularity: If Jack Dorsey and Bitcoiners promote it enough? Primal.net as a web app is pretty solid, but it does not stand out much. While Jack Dorsey has shown support by donating $1.5 million to the protocol development in December 2021, its success likely depends on broader adoption by the Bitcoin community.

Will any of these replace X/Twitter? As usage patterns vary, the statistics are not fully comparable, but this overview of the situation in March 2025 gives a decent overview.
Platform Total Accounts Active Users Growth Trend
Mastodon ~10 million ~1 million Steady
Bluesky ~33 million ~1 million Steady
Nostr ~41 million ~20 thousand Steady
Farcaster ~850 thousand ~50 thousand Flat
Lens ~140 thousand ~20 thousand Flat
Mastodon and Bluesky have already reached millions of users, while Lens and Farcaster are growing within crypto communities. It is however clear that none of these are anywhere close to how popular X/Twitter is. In particular, Mastodon had a huge influx of users in the fall of 2022 when Twitter was acquired, but to challenge the incumbents the growth would need to significantly accelerate. We can all accelerate this development by embracing decentralized social media now alongside existing dominant platforms. Who knows, given the right circumstances maybe X.com leadership decides to change the operating model and start federating contents to break out from a walled garden model. The likelyhood of such development would increase if decentralized networks get popular, and the encumbents feel they need to participate to not lose out.

Past and future The idea of decentralized social media is not new. One early pioneer identi.ca launched in 2008, only two years after Twitter, using the OStatus protocol to promote decentralization. A few years later it evolved into pump.io with the ActivityPump protocol, and also forked into GNU Social that continued with OStatus. I remember when these happened, and that in 2010 also Diaspora launched with fairly large publicity. Surprisingly both of these still operate (I can still post both on identi.ca and diasp.org), but the activity fizzled out years ago. The protocol however survived partially and evolved into ActivityPub, which is now the backbone of the Fediverse. The evolution of decentralized social media over the next decade will likely parallel developments in democracy, freedom of speech and public discourse. While the early 2010s emphasized maximum independence and freedom, the late 2010s saw growing support for content moderation to combat misinformation. The AI era introduces new challenges, potentially requiring proof-of-humanity verification for content authenticity. Key factors that will determine success:
  • User experience and ease of onboarding
  • Network effects and critical mass of users
  • Integration with existing web3 infrastructure
  • Balance between decentralization and usability
  • Sustainable economic models for infrastructure
This is clearly an area of development worth monitoring closely, as the next few years may determine which protocol becomes the de facto standard for decentralized social communication.

16 December 2024

Russ Allbery: Review: Finders

Review: Finders, by Melissa Scott
Series: Firstborn, Lastborn #1
Publisher: Candlemark & Gleam
Copyright: 2018
ISBN: 1-936460-87-4
Format: Kindle
Pages: 409
Finders is a far future science fiction novel with cyberpunk vibes. It is the first of a series, but the second (and, so far, only other) book of the series is a prequel. It stands alone reasonably well (more on that later). Cassilde Sam is a salvor. That means she specializes in exploring ancient wrecks and ruins left behind by the Ancients and salvaging materials that can be reused. The most important of those are what are called Ancestral elements: BLUE, which can hold programming; GOLD, which which reacts to BLUE instructions; RED, which produces actions or output; and GREEN, the rarest and most valuable, which powers everything else. Cassilde and her partner Dai Winter file claims on newly-discovered or incompletely salvaged Ancestor sites and then extract elemental material and anything else of value in their small salvage ship. Cassilde is also dying. She has Lightman's, an incurable degenerative disease that can only be treated with ever-increasing quantities of GREEN. It's hard to sleep, hard to get warm, hard to breathe, and eventually she'll run out of money to pay for the GREEN and she'll die. To push that day off into the future, she and Dai need work. The good news is that the wreckage of a new Ancestor sky palace was discovered in a long orbit and will create enough salvage work for every experienced salvor in the system. The bad news is that they're not qualified to bid on it. They need a scholar with a class-one license to bid on the best sections, and they haven't had a reliable scholar since their former partner and lover Summerland Ashe picked the opposite side in the Troubles and left the Fringe for the Entente, the more densely settled and connected portion of human space. But, unexpectedly and suspiciously, Ashe may be back and offering to work with them again. So, first, I love this setting. This is far from the first SF novel that is set in the aftermath of a general collapse of human civilization and revolving around discovering lost mysteries. Most examples of that genre are post-apocalyptic novels limited to Earth or the local solar system, but Kate Elliott's Unconquerable Sun comes immediately to mind. It's also not the first space archaeology series I've read; Kristine Kathyrn Rusch's story series starting with "Diving into the Wreck" also came to mind. But I don't recall the last time I've seen the author sell the setting so effectively. This is a world with starships and spaceports and clearly advanced technology, but it feels like a post-collapse society that's built on ruins. It's not just that technology runs on half-understood Ancestral elements and states fight over control of debris fields. It's also that the society repurposes Ancestral remnants in ways that both they and the reader know weren't originally intended, and that sometimes are more ingenious or efficient than how the Ancestors probably used them. There's a creative grittiness here that reminds me of good cyberpunk. It's not just good atmospheric writing, though. Scott makes a world-building decision that is going to sound trivial when I say it, but that has brilliant implications for the rest of the setting. There was not just one collapse; there were two. The Ancestor civilization, presumed to be the first human civilization, has passed into myth, quite literally when it comes to the stories around its downfall in the aftermath of a war against AIs. After the Ancestors came the Successors, who followed a similar salvage and rebuild approach and got as far as inventing their own warp drive technology that was based on but different than the Ancestor technology. Then they also collapsed, leaving their adapted technology and salvage operations layered over Ancestor sites. Cassilde's civilization is the third human starfaring civilization, and it is very specifically the third, neither the second nor one of dozens. This has so many small but effective implications that improve this story. A fall happened twice, so it feels like a pattern that makes Cassilde's civilization paranoid, but it happened for two very different reasons, so there is room to argue against it being a pattern. Salvage is harder because of the layering of Ancestor and Successor activity. Successors had their own way of controlling technology that is not accessible to Cassilde and her crew but is also not how the technology was intended to be used, which sends small ripples of interesting complexity through the background. And salvors are competing not only against each other but also against Successor salvage operations for which they have fragmentary records. It's a beautifully effective touch. Melissa Scott has been publishing science fiction for forty years, and it shows in this book. The protagonists are older characters: established professionals with resource problems but also social connections and an earned reputation, people who are trying to do a job and live their lives, not change the world. The writing is competent, deft, and atmospheric, with the confidence of long practice, but it also has the feel of an earlier era of science fiction. I mentioned the cyberpunk influence, which shows in the grittiness of the descriptions, the marginality of the characters in society, and the background theme of repurposing and reusing technology in unintended ways. This is the sort of book that feels solidly in the center of science fiction, without the genre mixing into either fantasy or romance that has become somewhat more common, and also without the dramatics of space opera (although the reader discovers that the stakes of this novel may be higher than anyone realized). And yet, so much of this book is about navigating a complicated romantic relationship, and that's where the story structure felt a bit odd. Cassilde, Dai, and Ashe were a polyamorous triad (polyamory also shows up in Scott's excellent Roads of Heaven series), and much of the first third of the book deals with the fracturing of trust with Ashe and their renegotiation of that relationship given his return. This is refreshingly written as the thoughtful interaction of three adults who take issues of trust seriously, but that also means it's much less dramatic than it sounds, and that means this book starts exceptionally slow. Scott is going somewhere, and the slow build became engrossing around the midpoint of the book, but I had to fight to stick with it at the start. About 80% of the way through this book, I had no idea how Scott was going to wrap things up in the pages remaining and was bracing myself for some sort of series cliffhanger. This is not what happens; the plot is not fully resolved in every detail, but it reaches a conclusion of sorts that does not mandate a sequel. I did think the end was a little bit unsatisfying, though, and I want another book that explores the implications of the ending. I think it would have to be a much different book, and the tonal shift might be stark. I've had this book on my to-read list for a while and kept putting it off because I wasn't sure I was in the mood for something precarious and gritty. This turned out to be an accurate worry: this is literally a book about salvaging the pieces of something full of wonders inextricably connected to dangers. You have to be in a cyberpunk sort of mood. But I've never read a bad Melissa Scott book, and this is no exception. The simplicity and ALL-CAPSNESS of the Ancestral elements grated a bit, but apart from that, the world-building is exceptional and well worth the trip. Recommended, although be warned that, if you're like me, it may not grab you from the first page. Followed by Fallen, but that book is a prequel that does not share any protagonists. Content notes: disability and degenerative illness in a universe where magical cures are possible, so be warned if that specific thematic combination is not what you're looking for. Rating: 7 out of 10

17 November 2023

Jonathan Dowland: HLedger, regex matches and field assignments

I've finally landed a patch/feature for HLedger I've been working on-and-off (mostly off) since around March. HLedger has a powerful CSV importer which you configure with a set of rules. Rules consist of conditional matchers (does field X in this CSV row match this regular expression?) and field assignments (set the resulting transaction's account to Y). motivating problem 1 Here's an example of one of my rules for handling credit card repayments. This rule is applied when I import a CSV for my current account, which pays the credit card:
if AMERICAN EXPRESS
    account2 liabilities:amex
This results in a ledger entry like the following
2023-10-31 AMERICAN EXPRESS
    assets:current           - 6.66
    liabilities:amex           6.66
My current account statements cover calendar months. My credit card period spans mid-month to mid-month. I pay it off by direct debit, which comes out after the credit card period, towards the very end of the calendar month. That transaction falls roughly halfway through the next credit card period. On my credit card statements, that repayment is "warped" to the start of the list of transactions, clearing the outstanding balance from the previous period. When I import my credit card data to HLedger, I want to compare the result against a PDF statement to make sure my ledger matches reality. The repayment "warping" makes this awkward, because it means the balance for roughly half the new transactions (those that fall before the real-date of the repayment) don't match up. motivating problem 2 I start new ledger files each year. I need to import the closing balances from the previous year to the next, which I do by exporting the final balance from the previous year in CSV and importing that into the new ledgers in the usual way. Between 2022 and 2023 I changed the scheme I use for account names so I need to translate between the old and the new in the opening balances. I couldn't think of a way of achieving this in the import rules (besides writing a bespoke rule for every possible old account name) so I abused another HLedger feature instead, HLedger aliases. For example I added this alias in my family ledger file for 2023
alias /^family:(.*)/ = \1
These are ugly and I'd prefer to get rid of them. regex match groups A common feature of regular expressions is defining match groups which can be referenced elsewhere, such as on the far-side of a substitution. I added match group support to HLedger's field assignments. addressing date warping Here's an updated version rule from the first motivating problem:
if AMERICAN EXPRESS
& %date (..)/(..)/(....)
    account2 liabilities:amex
    comment2 date:\3-\2-16
We now match on on extra date field, and surround the day/month/year components with parentheses to define match groups. We add a second field assignment too, setting the second posting's "comment" field to a string which, once the match groups are interpolated, instructs HLedger to do date warping (I wrote about this in date warping in HLedger) The new transaction looks like this:
2023-10-31 AMERICAN EXPRESS
    assets:current           - 6.66
    liabilities:amex           6.66 ; date:2023-10-16
getting rid of aliases In the second problem, I can strip off the unwanted account name prefixes at CSV import time, with rules like this
if %account2 ^family:(.*)$
    account2 \1
When! This stuff landed a week ago in early November, and is not yet in a Hledger release.

2 November 2023

Reproducible Builds: Farewell from the Reproducible Builds Summit 2023!

Farewell from the Reproducible Builds summit, which just took place in Hamburg, Germany: This year, we were thrilled to host the seventh edition of this exciting event. Topics covered this year included:

as well as countless informal discussions and hacking sessions into the night. Projects represented at the venue included:
Debian, openSUSE, QubesOS, GNU Guix, Arch Linux, phosh, Mobian, PureOS, JustBuild, LibreOffice, Warpforge, OpenWrt, F-Droid, NixOS, ElectroBSD, Apache Security, Buildroot, Systemd, Apache Maven, Fedora, Privoxy, CHAINS (KTH Royal Institute of Technology), coreboot, GitHub, Tor Project, Ubuntu, rebuilderd, repro-env, spytrap-adb, arch-repro-status, etc.

A huge thanks to our sponsors and partners for making the event possible:
Aspiration

Event facilitation

Mullvad

Platinum sponsor


If you weren t able to make it this year, don t worry; just look out for an announcement in 2024 for the next event.

29 August 2023

Jonathan Dowland: Gazelle Twin

A couple of releases A couple of releases
I discovered Gazelle Twin last year via Stuart Maconie's Freak Zone (two of her tracks ended up on my 2022 Halloween playlist). Through her website I learned of The Horror Show! exhibition at Somerset House1 in London that I managed to visit earlier this year. I've been intending to write a 5-track blog post (a la Underworld, the Cure, Coil) for a while but I have been spurred on by the excellent news that she's got a new album on the way, and, she's performing at the Sage Gateshead in November. Buy tickets now!! Here's the five tracks I recommend to get started:
  1. Anti-Body, from 2014's UNFLESH. I particularly love the percussion. Perc did a good hard-house-style remix on Fleshed Out, the companion remix album. Anti-Body by Gazelle Twin
  2. Fire Leap, from Gazelle Twin and NYX's collaborative album Deep England. The album is a re-interpretation of material from Gazelle Twin's earlier album, Pastoral, with the exception of this track, which is a cover of Paul Giovanni's song from The Wicker Man. There's a common aesthetic in all three works: eerie-folk, England's self-mythologising as seen through a warped and cracked lens. Anti-Body by Gazelle Twin There's a fantastic performance video of the whole album, directed by Iain Forsyth and Jane Pollard, available on YouTube.
  3. Better In My Day, from the aforementioned Pastoral. This track and this album are, I think, less accessible, more challenging than the re-interpreted material. That's not a bad thing: I'm still working on digesting it! This is one of the more abrasive, confrontational tracks. Better In My Day by Gazelle Twin
  4. I am Shell I am Bone, from way back to her first release, The Entire City in 2011. Composed, recorded, self-produced, self-released. It's remarkable to me how different each phase of GT's work are to one another. This album evokes a strong sense of atmosphere and place to me. There's a hint of possible influence of Joy Division's Unknown Pleasures (or New Order's Movement) in places. The b-side to this song is a cover of Joy Division's The Eternal. I Am Shell I Am Bone by Gazelle Twin
  5. GT re-issued This Entire City in 2022 along with a companion-piece EP of newly-released material from the same era, The Wastelands. This isn't cutting room floor stuff, though, as evidenced by the strength of my final pick, Hole in my Heart. Hole In My Heart by Gazelle Twin
It's hard to pick just five tracks when doing these (that's the point I suppose). I note that I haven't picked anything from her wide-ranging soundtrack work: her last three or four of her releases have been soundtrack works, released on the well-respected UK label Invada, as will be her forthcoming album. You can find all the released stuff on Gazelle Twin's Bandcamp Page.

  1. I recently learned that PJ Harvey once hosted album sessions in Somerset House, and allowed fans to watch her at work during the recording: https://www.theguardian.com/music/2015/jan/16/pj-harvey-somerset-house-recording-in-progress

27 August 2023

Gunnar Wolf: Interested in adopting the RPi images for Debian?

Back in June 2018, Michael Stapelberg put the Raspberry Pi image building up for adoption. He created the first set of unofficial, experimental Raspberry Pi images for Debian. I promptly answered to him, and while it took me some time to actually warp my head around Michael s work, managed to eventually do so. By December, I started pushing some updates. Not only that: I didn t think much about it in the beginning, as the needed non-free pacakge was called raspi3-firmware, but By early 2019, I had it running for all of the then-available Raspberry families (so the package was naturally renamed to raspi-firmware). I got my Raspberry Pi 4 at DebConf19 (thanks to Andy, who brought it from Cambridge), and it soon joined the happy Debian family. The images are built daily, and are available in https://raspi.debian.net. In the process, I also adopted Lars great vmdb2 image building tool, and have kept it decently up to date (yes, I m currently lagging behind, but I ll get to it soonish ). Anyway This year, I have been seriously neglecting the Raspberry builds. I have simply not had time to regularly test built images, nor to debug why the builder has not picked up building for trixie (testing). And my time availability is not going to improve any time soon. We are close to one month away from moving for six months to Paran (Argentina), where I ll be focusing on my PhD. And while I do contemplate taking my Raspberries along, I do not forsee being able to put much energy to them. So This is basically a call for adoption for the Raspberry Debian images building service. I do intend to stick around and try to help. It s not only me (although I m responsible for the build itself) we have a nice and healthy group of Debian people hanging out in the #debian-raspberrypi channel in OFTC IRC. Don t be afraid, and come ask. I hope giving this project in adoption will breathe new life into it!

12 July 2023

Reproducible Builds: Reproducible Builds in June 2023

Welcome to the June 2023 report from the Reproducible Builds project In our reports, we outline the most important things that we have been up to over the past month. As always, if you are interested in contributing to the project, please visit our Contribute page on our website.


We are very happy to announce the upcoming Reproducible Builds Summit which set to take place from October 31st November 2nd 2023, in the vibrant city of Hamburg, Germany. Our summits are a unique gathering that brings together attendees from diverse projects, united by a shared vision of advancing the Reproducible Builds effort. During this enriching event, participants will have the opportunity to engage in discussions, establish connections and exchange ideas to drive progress in this vital field. Our aim is to create an inclusive space that fosters collaboration, innovation and problem-solving. We are thrilled to host the seventh edition of this exciting event, following the success of previous summits in various iconic locations around the world, including Venice, Marrakesh, Paris, Berlin and Athens. If you re interesting in joining us this year, please make sure to read the event page] which has more details about the event and location. (You may also be interested in attending PackagingCon 2023 held a few days before in Berlin.)
This month, Vagrant Cascadian will present at FOSSY 2023 on the topic of Breaking the Chains of Trusting Trust:
Corrupted build environments can deliver compromised cryptographically signed binaries. Several exploits in critical supply chains have been demonstrated in recent years, proving that this is not just theoretical. The most well secured build environments are still single points of failure when they fail. [ ] This talk will focus on the state of the art from several angles in related Free and Open Source Software projects, what works, current challenges and future plans for building trustworthy toolchains you do not need to trust.
Hosted by the Software Freedom Conservancy and taking place in Portland, Oregon, FOSSY aims to be a community-focused event: Whether you are a long time contributing member of a free software project, a recent graduate of a coding bootcamp or university, or just have an interest in the possibilities that free and open source software bring, FOSSY will have something for you . More information on the event is available on the FOSSY 2023 website, including the full programme schedule.
Marcel Fourn , Dominik Wermke, William Enck, Sascha Fahl and Yasemin Acar recently published an academic paper in the 44th IEEE Symposium on Security and Privacy titled It s like flossing your teeth: On the Importance and Challenges of Reproducible Builds for Software Supply Chain Security . The abstract reads as follows:
The 2020 Solarwinds attack was a tipping point that caused a heightened awareness about the security of the software supply chain and in particular the large amount of trust placed in build systems. Reproducible Builds (R-Bs) provide a strong foundation to build defenses for arbitrary attacks against build systems by ensuring that given the same source code, build environment, and build instructions, bitwise-identical artifacts are created.
However, in contrast to other papers that touch on some theoretical aspect of reproducible builds, the authors paper takes a different approach. Starting with the observation that much of the software industry believes R-Bs are too far out of reach for most projects and conjoining that with a goal of to help identify a path for R-Bs to become a commonplace property , the paper has a different methodology:
We conducted a series of 24 semi-structured expert interviews with participants from the Reproducible-Builds.org project, and iterated on our questions with the reproducible builds community. We identified a range of motivations that can encourage open source developers to strive for R-Bs, including indicators of quality, security benefits, and more efficient caching of artifacts. We identify experiences that help and hinder adoption, which heavily include communication with upstream projects. We conclude with recommendations on how to better integrate R-Bs with the efforts of the open source and free software community.
A PDF of the paper is now available, as is an entry on the CISPA Helmholtz Center for Information Security website and an entry under the TeamUSEC Human-Centered Security research group.
On our mailing list this month:
The antagonist is David Schwartz, who correctly says There are dozens of complex reasons why what seems to be the same sequence of operations might produce different end results, but goes on to say I totally disagree with your general viewpoint that compilers must provide for reproducability [sic]. Dwight Tovey and I (Larry Doolittle) argue for reproducible builds. I assert Any program especially a mission-critical program like a compiler that cannot reproduce a result at will is broken. Also it s commonplace to take a binary from the net, and check to see if it was trojaned by attempting to recreate it from source.

Lastly, there were a few changes to our website this month too, including Bernhard M. Wiedemann adding a simplified Rust example to our documentation about the SOURCE_DATE_EPOCH environment variable [ ], Chris Lamb made it easier to parse our summit announcement at a glance [ ], Mattia Rizzolo added the summit announcement at a glance [ ] itself [ ][ ][ ] and Rahul Bajaj added a taxonomy of variations in build environments [ ].

Distribution work 27 reviews of Debian packages were added, 40 were updated and 8 were removed this month adding to our knowledge about identified issues. A new randomness_in_documentation_generated_by_mkdocs toolchain issue was added by Chris Lamb [ ], and the deterministic flag on the paths_vary_due_to_usrmerge issue as we are not currently testing usrmerge issues [ ] issues.
Roland Clobus posted his 18th update of the status of reproducible Debian ISO images on our mailing list. Roland reported that all major desktops build reproducibly with bullseye, bookworm, trixie and sid , but he also mentioned amongst many changes that not only are the non-free images being built (and are reproducible) but that the live images are generated officially by Debian itself. [ ]
Jan-Benedict Glaw noticed a problem when building NetBSD for the VAX architecture. Noting that Reproducible builds [are] probably not as reproducible as we thought , Jan-Benedict goes on to describe that when two builds from different source directories won t produce the same result and adds various notes about sub-optimal handling of the CFLAGS environment variable. [ ]
F-Droid added 21 new reproducible apps in June, resulting in a new record of 145 reproducible apps in total. [ ]. (This page now sports missing data for March May 2023.) F-Droid contributors also reported an issue with broken resources in APKs making some builds unreproducible. [ ]
Bernhard M. Wiedemann published another monthly report about reproducibility within openSUSE

Upstream patches

Testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In June, a number of changes were made by Holger Levsen, including:
  • Additions to a (relatively) new Documented Jenkins Maintenance (djm) script to automatically shrink a cache & save a backup of old data [ ], automatically split out previous months data from logfiles into specially-named files [ ], prevent concurrent remote logfile fetches by using a lock file [ ] and to add/remove various debugging statements [ ].
  • Updates to the automated system health checks to, for example, to correctly detect new kernel warnings due to a wording change [ ] and to explicitly observe which old/unused kernels should be removed [ ]. This was related to an improvement so that various kernel issues on Ubuntu-based nodes are automatically fixed. [ ]
Holger and Vagrant Cascadian updated all thirty-five hosts running Debian on the amd64, armhf, and i386 architectures to Debian bookworm, with the exception of the Jenkins host itself which will be upgraded after the release of Debian 12.1. In addition, Mattia Rizzolo updated the email configuration for the @reproducible-builds.org domain to correctly accept incoming mails from jenkins.debian.net [ ] as well as to set up DomainKeys Identified Mail (DKIM) signing [ ]. And working together with Holger, Mattia also updated the Jenkins configuration to start testing Debian trixie which resulted in stopped testing Debian buster. And, finally, Jan-Benedict Glaw contributed patches for improved NetBSD testing.

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

7 March 2023

Jonathan Dowland: date warping in HLedger

My credit card and bank account rarely agree on the date for when I pay it off1. Since I added balance assertions for bank account transactions, I need the transaction in my ledger to match what the bank thinks, otherwise the balance assertions would start to fail. The skew is not normally more than a couple of days, and could be corrected by changing the date for just one of the two postings. But the skew is not very important, and altering the posting date could be used for something more useful. date warping credit card repayments My credit card bills land halfway through the month, so February's bill covers transactions between January 15th and February 14th. I pay off the bill in full each month using Direct Debit. The credit card company consider the bill paid immediately, but they don't actually draw it until the end of the month (Jan 31 in the running example). This means the payment transaction for a given month lands halfway through the period covered by the next month's bill. The credit card bill itself shows the payment date at the end of the month but presents the transaction "warped" right to the start. This is actually useful, because it means the balance is zero for the first purchase on the bill. The credit card data in CSV form has the repayment transaction at the date it occurred, not warped to the start of the period. When I import this into HLedger, the credit card account balance for each new transaction does not match the statement right up to the point of the repayment, half way through. This makes spot-checking that the imported data matches the statement a bit more awkward. So, I have started "warping" the payment transaction to the start of the billing period, just like the credit card statement does:
2022-12-31  pay credit card
  asset:bank      -500
  liabilities:credit card  ;  date:2022-12-15
I can then spot-check the transactions in HLedger after import, in particular the final one, and the final account balance, and then write a manual balance assertion when I'm finished. I'd quite like to automate the adjusted posting date, too, but I haven't figured out how to do that just yet. date warping for refunds Another thing I've found "date warping" useful is for marrying up refunds with their related purchase. Imagine I spent 200 on some shoes in late January, but returned most of them in early February:
2023-01-25  buy some shoes. hedging on the size
  liabilities:credit card      -200
  expenses:shoes
2023-02-05  return the ones that don't fit
  liabilities:credit card       150
  expenses:shoes
If I look at how much I've spent on shoes per month, it looks odd: 200 in January (although ultimately I only spent 50), and -150 in February.
Balance changes in 2023-01-01..2023-02-28:
$ hledger bal -Mt expenses:shoes
                    Jan     Feb 
================++===============
 expenses:shoes     200    -150 
----------------++---------------
                    200    -150 
By "warping" the refund's posting to the expense account to the purchase date, how much I ultimately spent on shoes is more properly reflected:
2023-02-05  return the ones that don't fit
  liabilities:credit card       150
  expenses:shoes  ; date:2023-01-25
resulting in
$ hledger bal -Mt expenses:shoes
Balance changes in 2023-01-01..2023-02-28:
                   Jan  Feb 
================++===========
 expenses:shoes     50    0 
----------------++-----------
                    50    0 
I suppose whether you'd want to do this is a matter of taste.

  1. Amazon rarely agrees with my bank on when we've paid for things either. For other reasons, Amazon is a beast to tackle in another blog post.

8 December 2022

Reproducible Builds: Reproducible Builds in November 2022

Welcome to yet another report from the Reproducible Builds project, this time for November 2022. In all of these reports (which we have been publishing regularly since May 2015) we attempt to outline the most important things that we have been up to over the past month. As always, if you interested in contributing to the project, please visit our Contribute page on our website.

Reproducible Builds Summit 2022 Following-up from last month s report about our recent summit in Venice, Italy, a comprehensive report from the meeting has not been finalised yet watch this space! As a very small preview, however, we can link to several issues that were filed about the website during the summit (#38, #39, #40, #41, #42, #43, etc.) and collectively learned about Software Bill of Materials (SBOM) s and how .buildinfo files can be seen/used as SBOMs. And, no less importantly, the Reproducible Builds t-shirt design has been updated

Reproducible Builds at European Cyber Week 2022 During the European Cyber Week 2022, a Capture The Flag (CTF) cybersecurity challenge was created by Fr d ric Pierret on the subject of Reproducible Builds. The challenge consisted in a pedagogical sense based on how to make a software release reproducible. To progress through the challenge issues that affect the reproducibility of build (such as build path, timestamps, file ordering, etc.) were to be fixed in steps in order to get the final flag in order to win the challenge. At the end of the competition, five people succeeded in solving the challenge, all of whom were awarded with a shirt. Fr d ric Pierret intends to create similar challenge in the form of a how to in the Reproducible Builds documentation, but two of the 2022 winners are shown here:

On business adoption and use of reproducible builds Simon Butler announced on the rb-general mailing list that the Software Quality Journal published an article called On business adoption and use of reproducible builds for open and closed source software. This article is an interview-based study which focuses on the adoption and uses of Reproducible Builds in industry, with a focus on investigating the reasons why organisations might not have adopted them:
[ ] industry application of R-Bs appears limited, and we seek to understand whether awareness is low or if significant technical and business reasons prevent wider adoption.
This is achieved through interviews with software practitioners and business managers, and touches on both the business and technical reasons supporting the adoption (or not) of Reproducible Builds. The article also begins with an excellent explanation and literature review, and even introduces a new helpful analogy for reproducible builds:
[Users are] able to perform a bitwise comparison of the two binaries to verify that they are identical and that the distributed binary is indeed built from the source code in the way the provider claims. Applied in this manner, R-Bs function as a canary, a mechanism that indicates when something might be wrong, and offer an improvement in security over running unverified binaries on computer systems.
The full paper is available to download on an open access basis. Elsewhere in academia, Beatriz Michelson Reichert and Rafael R. Obelheiro have published a paper proposing a systematic threat model for a generic software development pipeline identifying possible mitigations for each threat (PDF). Under the Tampering rubric of their paper, various attacks against Continuous Integration (CI) processes:
An attacker may insert a backdoor into a CI or build tool and thus introduce vulnerabilities into the software (resulting in an improper build). To avoid this threat, it is the developer s responsibility to take due care when making use of third-party build tools. Tampered compilers can be mitigated using diversity, as in the diverse double compiling (DDC) technique. Reproducible builds, a recent research topic, can also provide mitigation for this problem. (PDF)

Misc news
On our mailing list this month:

Debian & other Linux distributions Over 50 reviews of Debian packages were added this month, another 48 were updated and almost 30 were removed, all of which adds to our knowledge about identified issues. Two new issue types were added as well. [ ][ ]. Vagrant Cascadian announced on our mailing list another online sprint to help clear the huge backlog of reproducible builds patches submitted by performing NMUs (Non-Maintainer Uploads). The first such sprint took place on September 22nd, but others were held on October 6th and October 20th. There were two additional sprints that occurred in November, however, which resulted in the following progress: Lastly, Roland Clobus posted his latest update of the status of reproducible Debian ISO images on our mailing list. This reports that all major desktops build reproducibly with bullseye, bookworm and sid as well as that no custom patches needed to applied to Debian unstable for this result to occur. During November, however, Roland proposed some modifications to live-setup and the rebuild script has been adjusted to fix the failing Jenkins tests for Debian bullseye [ ][ ].
In other news, Miro Hron ok proposed a change to clamp build modification times to the value of SOURCE_DATE_EPOCH. This was initially suggested and discussed on a devel@ mailing list post but was later written up on the Fedora Wiki as well as being officially proposed to Fedora Engineering Steering Committee (FESCo).

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 226 and 227 to Debian:
  • Support both python3-progressbar and python3-progressbar2, two modules providing the progressbar Python module. [ ]
  • Don t run Python decompiling tests on Python bytecode that file(1) cannot detect yet and Python 3.11 cannot unmarshal. (#1024335)
  • Don t attempt to attach text-only differences notice if there are no differences to begin with. (#1024171)
  • Make sure we recommend apksigcopier. [ ]
  • Tidy generation of os_list. [ ]
  • Make the code clearer around generating the Debian substvars . [ ]
  • Use our assert_diff helper in test_lzip.py. [ ]
  • Drop other copyright notices from lzip.py and test_lzip.py. [ ]
In addition to this, Christopher Baines added lzip support [ ], and FC Stegerman added an optimisation whereby we don t run apktool if no differences are detected before the signing block [ ].
A significant number of changes were made to the Reproducible Builds website and documentation this month, including Chris Lamb ensuring the openEuler logo is correctly visible with a white background [ ], FC Stegerman de-duplicated by email address to avoid listing some contributors twice [ ], Herv Boutemy added Apache Maven to the list of affiliated projects [ ] and boyska updated our Contribute page to remark that the Reproducible Builds presence on salsa.debian.org is not just the Git repository but is also for creating issues [ ][ ]. In addition to all this, however, Holger Levsen made the following changes:
  • Add a number of existing publications [ ][ ] and update metadata for some existing publications as well [ ].
  • Hide draft posts on the website homepage. [ ]
  • Add the Warpforge build tool as a participating project of the summit. [ ]
  • Clarify in the footer that we welcome patches to the website repository. [ ]

Testing framework The Reproducible Builds project operates a comprehensive testing framework at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In October, the following changes were made by Holger Levsen:
  • Improve the generation of meta package sets (used in grouping packages for reporting/statistical purposes) to treat Debian bookworm as equivalent to Debian unstable in this specific case [ ] and to parse the list of packages used in the Debian cloud images [ ][ ][ ].
  • Temporarily allow Frederic to ssh(1) into our snapshot server as the jenkins user. [ ]
  • Keep some reproducible jobs Jenkins logs much longer [ ] (later reverted).
  • Improve the node health checks to detect failures to update the Debian cloud image package set [ ][ ] and to improve prioritisation of some kernel warnings [ ].
  • Always echo any IRC output to Jenkins output as well. [ ]
  • Deal gracefully with problems related to processing the cloud image package set. [ ]
Finally, Roland Clobus continued his work on testing Live Debian images, including adding support for specifying the origin of the Debian installer [ ] and to warn when the image has unmet dependencies in the package list (e.g. due to a transition) [ ].
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can get in touch with us via:

29 October 2022

Louis-Philippe V ronneau: Extruded Schiit Stack

I've been a fan of the products manufactured by Schiit Audio for a while now. They are affordable (for high-end audio gear), sound great, are made in the USA1 and I think their industrial design looks great. I first started with one of their classic "Schiit Stack"2, but eventually upgraded to the Modi Multibit (I wanted the TOSLINK input), added a physical EQ (the Loki) and eventually got a Sys when I bought a Vidar speaker amp. The original Schiit Stack being 2 devices high was pretty manageable as-is. With my current 4-high stack though, things became unstable and I had to resort to finding a way to bolt them together. Mooching from a friend with a 3D printer, I printed this clever mount from Thingiverse. It worked well enough, but was somewhat imperfect for multiple reasons:
  1. The plastic tabs had a tendency of breaking in two when the screws where tight enough for the stack to feel solid.
  2. The plastic wasn't really rigid enough to support the 4 devices properly and the stack, being back-heavy from the cables, was unstable and tipped over easily.
  3. Due to the plastic tabs being fragile, it was pretty much impossible to disassemble the stack.
This last issue was what killed this solution for me. When I tried to replace my Modi 2 by the Modi Multibit, the mount pretty much crumbled away. Sadly, my friend warped a bunch of pieces on his 3D printer while trying to print ABS and I couldn't have him print me replacement parts either. After a while, I grew tired of having these four devices laying around my desk and wasting valuable space. I had tasted the 4-stack and knew how better things could be! That's when I realised the solution was to ditch 3D printing altogether, use aluminum framing extrusions and build my own stack out of metal. The 4 different Schiit devices with the hardware needed to build the extruded frame This was my first time working with aluminium frame extrusions and I had tons of fun! I specced the first version using 10mm x 10mm rails from McMaster-Carr, but discovered they do not ship to residential addresses in Canada... After looking at local options, I then decided to use 15mm x 15mm rails from Misumi. I went with this option since the rails are still small enough not to be an eyesore, but also because this system uses M3 screws, which the Schiit mini series also uses, making assembly much easier. I choose to make the assembled stack quite a bit taller than the previous one made with 3D printed plastic, as I found the headphone amp got pretty hot during the summer and I wanted to provide better airflow. If you are interested in replicating this stack, here are the parts I used, all from Misumi: I didn't order any since I had some already, but you'll also need M3 screws, namely: You can also cheap out and use only M3-10 screws (as I did), but you'll have to use the extra nuts you ordered as spacers. The assembled stack, complete with my lucky cat For the curious ones, the cabling is done this way:
                            
                                                              
          Magni (hp amp)       Vidar (sp amp)  
                                                              
                            
                                      
                            
                                             
                              Sys (switch)   
                                                      
                                     
                                                      
                                                      
                            
                                                              
                 Modi (DAC)        Loki (EQ)    
                                                              
                            
The Vidar is not part of the actual stack, as it's a 600W amp that weights 10kg :D. The last thing I think I want to change in this setup is the cables. The ones I have are too long for the stack. Shorter ones would reduce the wasted space in the back and make the whole thing more elegant.

  1. As in, designed, manufactured and assembled in the USA, from parts, transformers and boards made in the USA. I find this pretty impressive.
  2. A USB DAC and a headphone amp you can stack one of top of the other.

2 November 2020

Vincent Bernat: My collection of vintage PC cards

Recently, I have been gathering some old hardware at my parents house, notably PC extension cards, as they don t take much room and can be converted to a nice display item. Unfortunately, I was not very concerned about keeping stuff around. Compared to all the hardware I have acquired over the years, only a few pieces remain.

Tseng Labs ET4000AX (1989) This SVGA graphics card was installed into a PC powered by a 386SX CPU running at 16 MHz. This was a good card at the time as it was pretty fast. It didn t feature 2D acceleration, unlike the later ET4000/W32. This version only features 512 KB of RAM. It can display 1024 768 images with 16 colors or 800 600 with 256 colors. It was also compatible with CGA, EGA, VGA, MDA, and Hercules modes. No contemporary games were using the SVGA modes but the higher resolutions were useful with Windows 3. This card was manufactured directly by Tseng Labs.
Carte Tseng Labs ET4000AX ISA au-dessus de la bo te "Plan te Aventure"
Tseng Labs ET4000 AX ISA card

AdLib clone (1992) My first sound card was an AdLib. My parents bought it in Canada during the summer holidays in 1992. It uses a Yamaha OPL2 chip to produce sound via FM synthesis. The first game I have tried is Indiana Jones and the Last Crusade. I think I gave this AdLib to a friend once I upgraded my PC with a Sound Blaster Pro 2. Recently, I needed one for a side project, but they are rare and expensive on eBay. Someone mentioned a cheap clone on Vogons, so I bought it. It was sold by Sun Moon Star in 1992 and shipped with a CD-ROM of Doom shareware.
AdLib clone on top of "Alone in the Dark" box
AdLib clone ISA card by Sun Moon Star
On this topic, take a look at OPL2LPT: an AdLib sound card for the parallel port and OPL2 Audio Board: an AdLib sound card for Arduino .

Sound Blaster Pro 2 (1992) Later, I switched the AdLib sound card with a Sound Blaster Pro 2. It features an OPL3 chip and was also able to output digital samples. At the time, this was a welcome addition, but not as important as the FM synthesis introduced earlier by the AdLib.
Sound Blaster Pro 2 on top of "Day of the Tentacle" box
Sound Blaster Pro 2 ISA card

Promise EIDE 2300 Plus (1995) I bought this card mostly for the serial port. I was using a 486DX2 running at 66 MHz with a Creatix LC 288 FC external modem. The serial port was driven by an 8250 UART with no buffer. Thanks to Terminate, I was able to connect to BBSes with DOS, but this was not possible with Windows 3 or OS/2. I needed one of these fancy new cards with a 16550 UART, featuring a 16-byte buffer. At the time, this was quite difficult to find in France. During a holiday trip, I convinced my parent to make a short detour from Los Angeles to San Diego to buy this Promise EIDE 2300 Plus controller card at a shop I located through an advertisement in a local magazine! The card also features an EIDE controller with multi-word DMA mode 2 support. In contrast with the older PIO modes, the CPU didn t have to copy data from disk to memory.
Promise EIDE 2300 Plus next to an OS/2 Warp CD
Promise EIDE 2300 Plus VLB card

3dfx Voodoo2 Magic 3D II (1998) The 3dfx Voodoo2 was one of the first add-in graphics cards implementing hardware acceleration of 3D graphics. I bought it from a friend along with his Pentium II box in 1999. It was a big evolutionary step in PC gaming, as games became more beautiful and fluid. A traditional video controller was still required for 2D. A pass-through VGA cable daisy-chained the video controller to the Voodoo, which was itself connected to the monitor.
3dfx Voodoo 2 Magic 3D II on top of "Jedi Knight: Dark Forces II" box
3dfx Voodoo2 Magic 3D II PCI card

3Com 3C905C-TX-M Tornado (1999) In the early 2000s, in college, the Internet connection on the campus was provided by a student association through a 100 Mbps Ethernet cable. If you wanted to reach the maximum speed, the 3Com 3C905C-TX-M PCI network adapter, nicknamed Tornado , was the card you needed. We would buy it second-hand by the dozen and sell them to other students for around 30 .
3COM 3C905C-TX-M on top of "Red Alert" box
3Com 3C905C-TX-M PCI card

30 June 2020

Norbert Preining: TeX Live Debian update 20200629

More than a month has passed since the last update of TeX Live packages in Debian, so here is a new checkout!
All arch all packages have been updated to the tlnet state as of 2020-06-29, see the detailed update list below. Enjoy. New packages akshar, beamertheme-pure-minimalistic, biblatex-unified, biblatex-vancouver, bookshelf, commutative-diagrams, conditext, courierten, ektype-tanka, hvarabic, kpfonts-otf, marathi, menucard, namedef, pgf-pie, pwebmac, qrbill, semantex, shtthesis, tikz-lake-fig, tile-graphic, utf8add. Updated packages abnt, achemso, algolrevived, amiri, amscls, animate, antanilipsum, apa7, babel, bangtex, baskervillef, beamerappendixnote, beamerswitch, beamertheme-focus, bengali, bib2gls, biblatex-apa, biblatex-philosophy, biblatex-phys, biblatex-software, biblatex-swiss-legal, bibleref, bookshelf, bxjscls, caption, ccool, cellprops, changes, chemfig, circuitikz, cloze, cnltx, cochineal, commutative-diagrams, comprehensive, context, context-vim, cquthesis, crop, crossword, ctex, cweb, denisbdoc, dijkstra, doclicense, domitian, dps, draftwatermark, dvipdfmx, ebong, ellipsis, emoji, endofproofwd, eqexam, erewhon, erewhon-math, erw-l3, etbb, euflag, examplep, fancyvrb, fbb, fbox, fei, fira, fontools, fontsetup, fontsize, forest-quickstart, gbt7714, genealogytree, haranoaji, haranoaji-extra, hitszthesis, hvarabic, hyperxmp, icon-appr, kpfonts, kpfonts-otf, l3backend, l3build, l3experimental, l3kernel, latex-amsmath-dev, latexbangla, latex-base-dev, latexdemo, latexdiff, latex-graphics-dev, latexindent, latex-make, latexmp, latex-mr, latex-tools-dev, libertinus-fonts, libertinust1math, lion-msc, listings, logix, lshort-czech, lshort-german, lshort-polish, lshort-portuguese, lshort-russian, lshort-slovenian, lshort-thai, lshort-ukr, lshort-vietnamese, luamesh, lua-uca, luavlna, lwarp, marathi, memoir, mnras, moderntimeline, na-position, newcomputermodern, newpx, nicematrix, nodetree, ocgx2, oldstandard, optex, parskip, pdfcrop, pdfpc, pdftexcmds, pdfxup, pgf, pgfornament, pgf-pie, pgf-umlcd, pgf-umlsd, pict2e, plautopatch, poemscol, pst-circ, pst-eucl, pst-func, pstricks, pwebmac, pxjahyper, quran, rec-thy, reledmac, rest-api, sanskrit, sanskrit-t1, scholax, semantex, showexpl, shtthesis, suftesi, svg, tcolorbox, tex4ht, texinfo, thesis-ekf, thuthesis, tkz-doc, tlshell, toptesi, tuda-ci, tudscr, twemoji-colr, univie-ling, updmap-map, vancouver, velthuis, witharrows, wtref, xecjk, xepersian-hm, xetex-itrans, xfakebold, xindex, xindy, xltabular, yathesis, ydoc, yquant, zref.

31 October 2017

Norbert Preining: Debian/TeX Live 2017.20171031-1

Halloween is here, time to upload a new set of scary packages of TeX Live. About a month has passed, so there is the usual big stream up updates. There was actually an intermediate release to get out some urgent fixes, but I never reported the news here. So here are the accumulated changes and updates. My favorite this time is wallcalendar, a great class to design all kind of calendars, it looks really well done. I immediately will start putting one together. On the font side there is the new addition coelacanth. To quote from the README: Coelacanth is inspired by the classic Centaur type design of Bruce Rogers, described by some as the most beautiful typeface ever designed. It aims to be a professional quality type family for general book typesetting. And indeed it is beautiful! Other noteworthy addition is the Spark font that allows creating sparklines in the running text with LaTeX. Enjoy. New packages algobox, amscls-doc, beilstein, bib2gls, coelacanth, crossreftools, dejavu-otf, dijkstra, ducksay, dynkin-diagrams, eqnnumwarn, fetchcls, fixjfm, glossaries-finnish, hagenberg-thesis, hecthese, ifxptex, isopt, istgame, ku-template, limecv, mensa-tex, musicography, na-position, notestex, outlining, pdfreview, spark-otf, spark-otf-fonts, theatre, unitn-bimrep, upzhkinsoku, wallcalendar, xltabular. Updated packages acmart, amsmath, animate, arabluatex, arara, babel, babel-french, bangorexam, baskervillef, beebe, biblatex-philosophy, biblatex-source-division, bibletext, bidi, bxjaprnind, bxjscls, bxpapersize, bytefield, classicthesis, cochineal, complexity, cooking-units, curves, datetime2-german, dccpaper, doclicense, docsurvey, eledmac, epstopdf, eqparbox, esami, etoc, fbb, fei, fithesis, fmtcount, fnspe, fonts-tlwg, fontspec, genealogytree, glossaries, glossaries-extra, hecthese, hepthesis, hvfloat, ifplatform, ifptex, inconsolata, jfmutil, jsclasses, ketcindy, knowledge, koma-script, l3build, l3experimental, l3kernel, l3packages, langsci, latex2man, latexbug, lato, leadsheets, libertinust1math, listofitems, luatexja, luatexko, luatodonotes, lwarp, markdown, mcf2graph, media9, newtx, novel, numspell, ocgx2, overpic, philokalia, phonenumbers, platex, poemscol, pst-exa, pst-geometrictools, pst-ovl, pst-plot, pst-pulley, pst-tools, pst-vehicle, pst2pdf, pstool, pstricks, pstricks-add, pxchfon, pxjahyper, quran, randomlist, rec-thy, reledmac, robustindex, scratch, skrapport, spectralsequences, tcolorbox, tetex, tex4ht, texcount, texdoc, tikzducks, tikzsymbols, toptesi, translation-biblatex-de, unicode-math, updmap-map, uplatex, widetable, xcharter, xepersian, xetexko, xetexref, xsim, zhlipsum.

28 October 2017

Steinar H. Gunderson: Introducing Narabu, part 4: Decoding

Narabu is a new intraframe video codec. You probably want to read part 1, part 2 and part 3 first. So we're at the stage where the structure is in place. How do we decode? Once we have the structure, it's actually fairly straightforward: First of all, we need to figure out where each slice starts and ends. This is done on the CPU, but it's mostly just setting up pointers, so it's super-cheap. It doesn't see any pixels at all, just lengths and some probability distributions (those are decoded on the CPU, but they're only a few hundred values and no FP math is involved). Then, we set up local groups of size 64 (8x8) with 512 floats of shared memory. Each group will be used for decoding one slice (320 blocks), all coefficients. Each thread starts off doing rANS decoding (this involves a lookup into a small LUT, and of course some arithmetic) and dequantization of 8 blocks (for its coefficient); this means we now have 512 coefficients, so we add a barrier, do 64 horizontal IDCTs (one 8-point IDCT per thread), add a new barrier, do 64 vertical IDCTs, and then finally write out the data. We can then do the next 8 coefficients the same way (we kept the rANS decoding state), and so on. Note how the parallelism changes during the decoding; it's a bit counterintuitive at first, and barriers are not for free, but it avoids saving the coefficients to global memory (which is costly). First, the parallelism is over coefficients, then over horizontal DCT blocks, then over vertical DCT blocks, and then we start all over again. In CPU multithreading, this would be very tricky, and probably not worth it at all, but on the GPU, it gives us tons of parallelism. One problem is that the rANS work is going to be unbalanced within each warp. There's a lot more data (and thus more calculation and loading of compressed data from memory) in the lower-frequency coefficients, which means the other threads in the warp are wasting time doing nothing when they do more work. I tried various schemes to balance it better (like making even larger thread groups to get e.g. more coefficient 0s that could work together, or reordering the threads' coefficients in a zizag), but seemingly it didn't help any. Again, the lack of a profiler here is hampering effective performance investigations. In any case, the performance is reasonably good (my GTX 950 does 1280x720 luma-only in a little under 0.4 ms, which equates to ~1400 fps for full 4:2:2). As a side note, GoPro open-sourced their CineForm HD implementation the other day, and I guess it only goes to show that these kind of things really belong on the GPU; they claim similar performance numbers to what I get on my low-end NVIDIA GPU (923.6 fps for 1080p 4:2:2, which would be roughtly 1800 fps on 720p60), but that's on a 4GHz 8-core Broadwell-E, which is basically taking the most expensive enthusiast desktop CPU you can get your hands on and then overclocking it. Kudos to GoPro for freeing it, though (even under a very useful license), even though FFmpeg already had a working reverse-engineered implementation. :-) Next time, we'll look at the encoder, which is a bit more complicated. After that, we'll do some performance tests and probably wrap up the series. Stay tuned.

22 October 2017

Steinar H. Gunderson: Introducing Narabu, part 3: Parallel structure

Narabu is a new intraframe video codec. You probably want to read part 1 and part 2 first. Now having a rough idea of how the GPU works, it's obvious that we need our codec to split the image into a lot of independent chunks to be parallel; we're aiming for about a thousand chunks. We'll also aim for a format that's as simple as possible; other people can try to push the frontiers on research, I just want something that's fast, reasonably good, and not too hard to implement. First of all, let's look at JPEG. It works by converting the image to Y'CbCr (usually with subsampled chroma), split each channel into 8x8 blocks, DCT them, quantize and then use Huffman coding to encode those coefficients. This is a very standard structure, and JPEG, despite being really old by now, is actually hard to beat, so let's use that as a template. But we probably can't use JPEG wholesale. Why? The answer is that the Huffman coding is just too serial. It's all just one stream, and without knowing where the previous symbol ends, we don't know where to start decoding the next. You can partially solve this by having frequent restart markers, but these reduce coding efficiency, and there's no index of them; you'd have to make a parallel scan for them, which is annoying. So we'll need to at least do something about the entropy coding. Do we need to change anything else to reach our ~1000 parallel target? 720p is the standard video target these days; the ~1M raw pixels would be great if we could do all of them independently, but we obviously can't, since DCT is not independent per pixel (that would sort of defeat the purpose). However, there are 14,400 DCT blocks (8x8), so we can certainly do all the DCTs in parallel. Quantization after that is trivially parallelizable, at least as long as we don't aim for trellis or the likes. So indeed, it's only entropy coding that we need to worry about. Since we're changing entropy coding anyway, I picked rANS, which is confusing but has a number of neat properties; it's roughly as expensive as Huffman coding, but has an efficiency closer to that of arithmetic encoding, and you can switch distributions for each symbol. It requires a bit more calculation, but GPUs have plenty of ALUs (a typical recommendation is 10:1 calculation to memory access), so that should be fine. I picked a pretty common variation with 32-bit state and 8-bit I/O, since 64-bit arithmetic is not universally available on GPUs (I checked a variation with 64-bit state and 32-bit I/O, but it didn't really matter much for speed). Fabian Giesen has described how you can actually do rANS encoding and decoding in parallel over a warp, but I've treated it as a purely serial operation. I don't do anything like adaptation, though; each coefficient is assigned to one out of four rANS static distributions that are signaled at the start of each stream. (Four is a nice tradeoff between coding efficiency, L1 cache use and the cost of transmitting the distributions themselves. I originally used eight, but a bit of clustering showed it was possible to get it down to four at basically no extra cost. JPEG does a similar thing, with separate Huffman codes for AC and DC coefficients. And I've got separate distributions for luma and chroma, which also makes a lot of sense.) The restart cost of rANS is basically writing the state to disk; some of that we would need to write even without a restart, and there are ways to combine many such end states for less waste (I'm not using them), but let's be conservative and assume we waste all of it for each restart. At 150 Mbit/second for 720p60, the luma plane of one frame is about 150 kB. 10% (15 kB) sounds reasonable to sacrifice to restarts, which means we can have about 3750 of them, or one for each 250th pixel or so. (3750 restarts also means 3750 independent entropy coding streams, so we're still well above our 1000 target.) So the structure is more or less clear; we'll DCT the blocks in parallel (since an 8x8 block can be expressed as 8 vertical DCTs and then 8 horizontal DCTs, we can even get 8-way parallelism there), and then encode at least 250 coefficients into an rANS stream. I've chosen to let a stream encompass a single coefficient from each of 320 DCT blocks instead of all coefficients from 5 DCT blocks; it's sort of arbitrary and might be a bit unintuitive, but it feels more natural, especially since you don't need to switch rANS distributions for each coefficient. So that's a lot of rambling for a sort-of abstract format. Next time we'll go into how decoding works. Or encoding. I haven't really made up my mind yet.

19 October 2017

Steinar H. Gunderson: Introducing Narabu, part 2: Meet the GPU

Narabu is a new intraframe video codec. You may or may not want to read part 1 first. The GPU, despite being extremely more flexible than it was fifteen years ago, is still a very different beast from your CPU, and not all problems map well to it performance-wise. Thus, before designing a codec, it's useful to know what our platform looks like. A GPU has lots of special functionality for graphics (well, duh), but we'll be concentrating on the compute shader subset in this context, ie., we won't be drawing any polygons. Roughly, a GPU (as I understand it!) is built up about as follows: A GPU contains 1 20 cores; NVIDIA calls them SMs (shader multiprocessors), Intel calls them subslices. (Trivia: A typical mid-range Intel GPU contains two cores, and thus is designated GT2.) One such core usually runs the same program, although on different data; there are exceptions, but typically, if your program can't fill an entire core with parallelism, you're wasting energy. Each core, in addition to tons (thousands!) of registers, also has some shared memory (also called local memory sometimes, although that term is overloaded), typically 32 64 kB, which you can think of in two ways: Either as a sort-of explicit L1 cache, or as a way to communicate internally on a core. Shared memory is a limited, precious resource in many algorithms. Each core/SM/subslice contains about 8 execution units (Intel calls them EUs, NVIDIA/AMD calls them something else) and some memory access logic. These multiplex a bunch of threads (say, 32) and run in a round-robin-ish fashion. This means that a GPU can handle memory stalls much better than a typical CPU, since it has so many streams to pick from; even though each thread runs in-order, it can just kick off an operation and then go to the next thread while the previous one is working. Each execution unit has a bunch of ALUs (typically 16) and executes code in a SIMD fashion. NVIDIA calls these ALUs CUDA cores , AMD calls them stream processors . Unlike on CPU, this SIMD has full scatter/gather support (although sequential access, especially in certain patterns, is much more efficient than random access), lane enable/disable so it can work with conditional code, etc.. The typically fastest operation is a 32-bit float muladd; usually that's single-cycle. GPUs love 32-bit FP code. (In fact, in some GPU languages, you won't even have 8-, 16-bit or 64-bit types. This is annoying, but not the end of the world.) The vectorization is not exposed to the user in typical code (GLSL has some vector types, but they're usually just broken up into scalars, so that's a red herring), although in some programming languages you can get to swizzle the SIMD stuff internally to gain advantage of that (there's also schemes for broadcasting bits by voting etc.). However, it is crucially important to performance; if you have divergence within a warp, this means the GPU needs to execute both sides of the if. So less divergent code is good. Such a SIMD group is called a warp by NVIDIA (I don't know if the others have names for it). NVIDIA has SIMD/warp width always 32; AMD used to be 64 but is now 16. Intel supports 4 32 (the compiler will autoselect based on a bunch of factors), although 16 is the most common. The upshot of all of this is that you need massive amounts of parallelism to be able to get useful performance out of a CPU. A rule of thumb is that if you could have launched about a thousand threads for your problem on CPU, it's a good fit for a GPU, although this is of course just a guideline. There's a ton of APIs available to write compute shaders. There's CUDA (NVIDIA-only, but the dominant player), D3D compute (Windows-only, but multi-vendor), OpenCL (multi-vendor, but highly variable implementation quality), OpenGL compute shaders (all platforms except macOS, which has too old drivers), Metal (Apple-only) and probably some that I forgot. I've chosen to go for OpenGL compute shaders since I already use OpenGL shaders a lot, and this saves on interop issues. CUDA probably is more mature, but my laptop is Intel. :-) No matter which one you choose, the programming model looks very roughly like this pseudocode:
for (size_t workgroup_idx = 0; workgroup_idx < NUM_WORKGROUPS; ++workgroup_idx)     // in parallel over cores
        char shared_mem[REQUESTED_SHARED_MEM];  // private for each workgroup
        for (size_t local_idx = 0; local_idx < WORKGROUP_SIZE; ++local_idx)    // in parallel on each core
                main(workgroup_idx, local_idx, shared_mem);
         
 
except in reality, the indices will be split in x/y/z for your convenience (you control all six dimensions, of course), and if you haven't asked for too much shared memory, the driver can silently make larger workgroups if it helps increase parallelity (this is totally transparent to you). main() doesn't return anything, but you can do reads and writes as you wish; GPUs have large amounts of memory these days, and staggering amounts of memory bandwidth. Now for the bad part: Generally, you will have no debuggers, no way of logging and no real profilers (if you're lucky, you can get to know how long each compute shader invocation takes, but not what takes time within the shader itself). Especially the latter is maddening; the only real recourse you have is some timers, and then placing timer probes or trying to comment out sections of your code to see if something goes faster. If you don't get the answers you're looking for, forget printf you need to set up a separate buffer, write some numbers into it and pull that buffer down to the GPU. Profilers are an essential part of optimization, and I had really hoped the world would be more mature here by now. Even CUDA doesn't give you all that much insight sometimes I wonder if all of this is because GPU drivers and architectures are meant to be shrouded in mystery for competitiveness reasons, but I'm honestly not sure. So that's it for a crash course in GPU architecture. Next time, we'll start looking at the Narabu codec itself.

26 September 2017

Norbert Preining: Debian/TeX Live 2017.20170926-1

A full month or more has past since the last upload of TeX Live, so it was high time to prepare a new package. Nothing spectacular here I have to say, two small bugs fixed and the usual long list of updates and new packages. From the new packages I found fontloader-luaotfload and interesting project. Loading fonts via lua code in luatex is by now standard, and this package allows for experiments with newer/alternative font loaders. Another very interesting new-comer is pdfreview which lets you set pages of another PDF on a lined background and add notes to it, good for reviewing. Enjoy. New packages abnt, algobox, beilstein, bib2gls, cheatsheet, coelacanth, dijkstra, dynkin-diagrams, endofproofwd, fetchcls, fixjfm, fontloader-luaotfload, forms16be, hithesis, ifxptex, komacv-rg, ku-template, latex-refsheet, limecv, mensa-tex, multilang, na-box, notes-tex, octave, pdfreview, pst-poker, theatre, upzhkinsoku, witharrows. Updated packages 2up, acmart, acro, amsmath, animate, babel, babel-french, babel-hungarian, bangorcsthesis, beamer, beebe, biblatex-gost, biblatex-philosophy, biblatex-source-division, bibletext, bidi, bpchem, bxjaprnind, bxjscls, bytefield, checkcites, chemmacros, chet, chickenize, complexity, curves, cweb, datetime2-german, e-french, epstopdf, eqparbox, esami, etoc, fbb, fithesis, fmtcount, fnspe, fontspec, genealogytree, glossaries, glossaries-extra, hvfloat, ifptex, invoice2, jfmutil, jlreq, jsclasses, koma-script, l3build, l3experimental, l3kernel, l3packages, latexindent, libertinust1math, luatexja, lwarp, markdown, mcf2graph, media9, nddiss, newpx, newtx, novel, numspell, ocgx2, philokalia, phfqit, placeat, platex, poemscol, powerdot, pst-barcode, pst-cie, pst-exa, pst-fit, pst-func, pst-geometrictools, pst-ode, pst-plot, pst-pulley, pst-solarsystem, pst-solides3d, pst-tools, pst-vehicle, pst2pdf, pstricks, pstricks-add, ptex-base, ptex-fonts, pxchfon, quran, randomlist, reledmac, robustindex, scratch, skrapport, spectralsequences, tcolorbox, tetex, tex4ht, texcount, texdef, texinfo, texlive-docindex, texlive-scripts, tikzducks, tikzsymbols, tocloft, translations, updmap-map, uplatex, widetable, xepersian, xetexref, xint, xsim, zhlipsum.

Next.