Search Results: "matthew"

30 November 2021

Russell Coker: Links November 2021

The Guardian has an amusing article by Sophie Elmhirst about Libertarians buying a cruise ship to make a seasteading project off the coast of Panama [1]. It turns out that you need permits etc to do this and maintaining a ship is expensive. Also you wouldn t want to mine cryptocurrency in a ship cabin as most cabins are small and don t have enough airconditioning to remain pleasant if you dump 1kW or more into the air. NPR has an interesting article about the reaction of the NRA to the Columbine shootings [2]. Seems that some NRA person isn t a total asshole and is sharing their private information, maybe they are dying and are worried about going to hell. David Brin wrote an insightful blog post about the singleton hypothesis where he covers some of the evidence of autocratic societies failing [3]. I think he makes a convincing point about a single centralised government for human society not being viable. But something like the EU on a world wide scale could work well. Ken Shirriff wrote an interesting blog post about reverse engineering the Yamaha DX7 synthesiser [4]. The New York Times has an interesting article about a Baboon troop that became less aggressive after the alpha males all died at once from tuberculosis [5]. They established a new more peaceful culture that has outlived the beta males who avoided tuberculosis. The Guardian has an interesting article about how sequencing the genomes of the entire population can save healthcare costs while improving the health of the population [6]. This is somthing wealthy countries should offer for free to the world population. At a bit under $1000 per test that s only about $7 trillion to test everyone, and of course the price should drop significantly if there were billions of tests being done. The Strategy Bridge has an interesting article about SciFi books that have useful portrayals of military strategy [7]. The co-author is Major General Mick Ryan of the Australian Army which is noteworthy as Major General is the second highest rank in use by the Australian Army at this time. Vice has an interesting article about the co-evolution of penises and vaginas and how a lot of that evolution is based on avoiding impregnation from rape [8]. Cory Doctorow wrote an insightful Medium article about the way that governments could force interoperability through purchasing power [9]. Cory Doctorow wrote an insightful article for Locus Magazine about imagining life after capitalism and how capitalism might be replaced [10]. We need a Star Trek future! Arstechnica has an informative article about new developmenet in the rowhammer category of security attacks on DRAM [11]. It seems that DDR4 with ECC is the best current mitigation technique and that DDR3 with ECC is harder to attack than non-ECC RAM. So the thing to do is use ECC on all workstations and avoid doing security critical things on laptops because they can t juse ECC RAM.

6 October 2021

Matthew Palmer: Discovering AWS IAM accounts

Let s say you re someone who happens to discover an AWS account number, and would like to take a stab at guessing what IAM users might be valid in that account. Tricky problem, right? Not with this One Weird Trick! In your own AWS account, create a KMS key and try to reference an ARN representing an IAM user in the other account as the principal. If the policy is accepted by PutKeyPolicy, then that IAM account exists, and if the error says Policy contains a statement with one or more invalid principals then the user doesn t exist. As an example, say you want to guess at IAM users in AWS account 111111111111. Then make sure this statement is in your key policy:
  "Sid": "Test existence of user",
  "Effect": "Allow",
    "AWS": "arn:aws:iam::111111111111:user/bob"
  "Action": "kms:DescribeKey",
  "Resource": "*"
If that policy is accepted, then the account has an IAM user named bob. Otherwise, the user doesn t exist. Scripting this is left as an exercise for the reader. Sadly, wildcards aren t accepted in the username portion of the ARN, otherwise you could do some funky searching with ...:user/a*, ...:user/b*, etc. You can t have everything; where would you put it all? I did mention this to AWS as an account enumeration risk. They re of the opinion that it s a good thing you can know what users exist in random other AWS accounts. I guess that means this is a technique you can put in your toolbox safe in the knowledge it ll work forever. Given this is intended behaviour, I assume you don t need to use a key policy for this, but that s where I stumbled over it. Also, you can probably use it to enumerate roles and anything else that can be a principal, but since I don t see as much use for that, I didn t bother exploring it. There you are, then. If you ever need to guess at IAM users in another AWS account, now you can!

21 September 2021

Russell Coker: Links September 2021

Matthew Garrett wrote an interesting and insightful blog post about the license of software developed or co-developed by machine-learning systems [1]. One of his main points is that people in the FOSS community should aim for less copyright protection. The USENIX ATC 21/OSDI 21 Joint Keynote Address titled It s Time for Operating Systems to Rediscover Hardware has some inssightful points to make [2]. Timothy Roscoe makes some incendiaty points but backs them up with evidence. Is Linux really an OS? I recommend that everyone who s interested in OS design watch this lecture. Cory Doctorow wrote an interesting set of 6 articles about Disneyland, ride pricing, and crowd control [3]. He proposes some interesting ideas for reforming Disneyland. Benjamin Bratton wrote an insightful article about how philosophy failed in the pandemic [4]. He focuses on the Italian philosopher Giorgio Agamben who has a history of writing stupid articles that match Qanon talking points but with better language skills. Arstechnica has an interesting article about penetration testers extracting an encryption key from the bus used by the TPM on a laptop [5]. It s not a likely attack in the real world as most networks can be broken more easily by other methods. But it s still interesting to learn about how the technology works. The Portalist has an article about David Brin s Startide Rising series of novels and his thought s on the concept of Uplift (which he denies inventing) [6]. Jacobin has an insightful article titled You re Not Lazy But Your Boss Wants You to Think You Are [7]. Making people identify as lazy is bad for them and bad for getting them to do work. But this is the first time I ve seen it described as a facet of abusive capitalism. Jacobin has an insightful article about free public transport [8]. Apparently there are already many regions that have free public transport (Tallinn the Capital of Estonia being one example). Fare free public transport allows bus drivers to concentrate on driving not taking fares, removes the need for ticket inspectors, and generally provides a better service. It allows passengers to board buses and trams faster thus reducing traffic congestion and encourages more people to use public transport instead of driving and reduces road maintenance costs. Interesting research from Israel about bypassing facial ID [9]. Apparently they can make a set of 9 images that can pass for over 40% of the population. I didn t expect facial recognition to be an effective form of authentication, but I didn t expect it to be that bad. Edward Snowden wrote an insightful blog post about types of conspiracies [10]. Kevin Rudd wrote an informative article about Sky News in Australia [11]. We need to have a Royal Commission now before we have our own 6th Jan event. Steve from Big Mess O Wires wrote an informative blog post about USB-C and 4K 60Hz video [12]. Basically you can t have a single USB-C hub do 4K 60Hz video and be a USB 3.x hub unless you have compression software running on your PC (slow and only works on Windows), or have DisplayPort 1.4 or Thunderbolt (both not well supported). All of the options are not well documented on online store pages so lots of people will get unpleasant surprises when their deliveries arrive. Computers suck. Steinar H. Gunderson wrote an informative blog post about GaN technology for smaller power supplies [13]. A 65W USB-C PSU that fits the usual wall wart form factor is an interesting development.

13 July 2021

Matthew Garrett: Does free software benefit from ML models being derived works of training data?

Github recently announced Copilot, a machine learning system that makes suggestions for you when you're writing code. It's apparently trained on all public code hosted on Github, which means there's a lot of free software in its training set. Github assert that the output of Copilot belongs to the user, although they admit that it may occasionally produce output that is identical to content from the training set.

Unsurprisingly, this has led to a number of questions along the lines of "If Copilot embeds code that is identical to GPLed training data, is my code now GPLed?". This is extremely understandable, but the underlying issue is actually more general than that. Even code under permissive licenses like BSD requires retention of copyright notices and disclaimers, and failing to include them is just as much a copyright violation as incorporating GPLed code into a work and not abiding by the terms of the GPL is.

But free software licenses only have power to the extent that copyright permits them to. If your code isn't a derived work of GPLed material, you have no obligation to follow the terms of the GPL. Github clearly believe that Copilot's output doesn't count as a derived work as far as US copyright law goes, and as a result the licenses on the training data don't apply to the output. Some people have interpreted this as an attack on free software - Copilot may insert code that's either identical or extremely similar to GPLed code, and claim that there are no license obligations created as a result, effectively allowing the laundering of GPLed code into proprietary software.

I'm completely unqualified to hold a strong opinion on whether Github's legal position is justifiable or not, and right now I'm also not interested in thinking about it too much. What I think is more interesting is what the impact of either position has on free software. Do we benefit more from a future where the output of Copilot (or similar projects) is considered a derived work of the training data, or one where it isn't? Having been involved in a bunch of GPL enforcement activities, it's very easy to think of this as something that weakens the GPL and, as a result, weakens free software. That was my initial reaction, but that's shifted over the past few days.

Let's look at the GNU manifesto, specifically this section:

The fact that the easiest way to copy a program is from one neighbor to another, the fact that a program has both source code and object code which are distinct, and the fact that a program is used rather than read and enjoyed, combine to create a situation in which a person who enforces a copyright is harming society as a whole both materially and spiritually; in which a person should not do so regardless of whether the law enables him to.

The GPL makes use of copyright law to ensure that GPLed work can't be taken from the commons. Anyone who produces a derived work of GPLed code is obliged to provide that work under the same terms. If software weren't copyrightable, the GPL would have no power. But this is the outcome Stallman wanted! The GPL doesn't exist because copyright is good, it exists because software being copyrightable is what enables the concept of proprietary software in the first place.

The powers that the GPL uses to enforce sharing of code are used by the authors of proprietary software to reduce that sharing. They attempt to forbid us from examining their code to determine how it works - they argue that anyone who does so is tainted, unable to contribute similar code to free software projects in case they produce a derived work of the original. Broadly speaking, the further the definition of a derived work reaches, the greater the power of proprietary software authors. If Oracle's argument that APIs are copyrightable had prevailed, it would have been disastrous for free software. If the Apple look and feel suit had established that Microsoft infringed Apple's copyright, we might be living in a future where we had no free software desktop environments.

When we argue for an interpretation of copyright law that enhances the power of the GPL, we're also enhancing the power of giant corporations with a lot of lawyers on hand. So let's look at this another way. If Github's interpretation of copyright law holds, we can train a model on proprietary code and extract concepts without having to worry about being tainted. The proprietary code itself won't enter the commons, but the ideas it embodies will. No more worries about whether you're literally copying the code that implements an algorithm you want to duplicate - simply start typing and let the model remove the risk for you.

There's a reasonable counter argument about equality here. How much GPL-influenced code is going to end up in proprietary projects when compared to the reverse? It's not an easy question to answer, but we should bear in mind that the majority of public repositories on Github aren't under an open source license. Copilot is already claiming to give us access to the concepts embodied in those repositories. Do these provide more value than is given up? I honestly don't know how to measure that. But what I do know is that free software was founded in a belief that software shouldn't be constrained by copyright, and our default stance shouldn't be to argue against the idea that copyright is weaker than we imagined.

(Edit: this post by Julia Reda makes some of the same arguments, but spends some more time focusing on a legal analysis of why having copyright cover the output of Copilot would be a problem)

comment count unavailable comments

4 June 2021

Matthew Garrett: Mike Lindell's Cyber "Evidence"

Mike Lindell, notable for absolutely nothing relevant in this field, today filed a lawsuit against a couple of voting machine manufacturers in response to them suing him for defamation after he claimed that they were covering up hacks that had altered the course of the US election. Paragraph 104 of his suit asserts that he has evidence of at least 20 documented hacks, including the number of votes that were changed. The citation is just a link to a video called Absolute 9-0, which claims to present sufficient evidence that the US supreme court will come to a 9-0 decision that the election was tampered with.

The claim is that Lindell was provided with a set of files on the 9th of January, and gave these to some cyber experts to verify. These experts identified them as packet captures. The video contains scrolling hex, and we are told that this is the raw encrypted data from the files. In reality, the hex values correspond very clearly to printable ASCII, and appear to just be the Pennsylvania voter roll. They're not encrypted, and they're not packet captures (they contain no packet headers).

20 of these packet captures were then selected and analysed, giving us the tables contained within Exhibit 12. The alleged source IPs appear to correspond to the networks the tables claim, and the latitude and longitude presumably just come from a geoip lookup of some sort (although clearly those values are far too precise to be accurate). But if we look at the target IPs, we find something interesting. Most of them resolve to the website for the county that was the nominal target (eg, is So, we're supposed to believe that in many cases, the county voting infrastructure was hosted on the county website.

Unfortunately we're not given the destination port, but isn't listening on anything other than 80 and 443. We're told that the packet data is encrypted, so presumably it's over HTTPS. So, uh, how did they decrypt this to figure out how many votes were switched? If Mike's hackers have broken TLS, they really don't need to be dealing with this.

We're also given some background information on how it's impossible to reconstruct packet captures after the fact (untrue), or that modifying them would change their hashes (true, but in the absence of known good hash values that tells us nothing), but it's pretty clear that nothing we're shown actually demonstrates what we're told it does.

In summary: yes, any supreme court decision on this would be 9-0, just not the way he's hoping for.

Update: It was pointed out that this data appears to be part of a larger dataset. This one is even more dubious - it somehow has MAC addresses for both the source and destination (which is impossible), and almost none of these addresses are in actual issued ranges.

comment count unavailable comments

2 June 2021

Matthew Garrett: Producing a trustworthy x86-based Linux appliance

Let's say you're building some form of appliance on top of general purpose x86 hardware. You want to be able to verify the software it's running hasn't been tampered with. What's the best approach with existing technology?

Let's split this into two separate problems. The first is to do as much as we can to ensure that the software can't be modified without our consent[1]. This requires that each component in the boot chain verify that the next component is legitimate. We call the first component in this chain the root of trust, and in the x86 world this is the system firmware[2]. This firmware is responsible for verifying the bootloader, and the easiest way to do this on x86 is to use UEFI Secure Boot. In this setup the firmware contains a set of trusted signing certificates and will only boot executables with a chain of trust to one of these certificates. Switching the system into setup mode from the firmware menu will allow you to remove the existing keys and install new ones.

(Note: You shouldn't use the trusted certificate directly for signing bootloaders - instead, the trusted certificate should be used to sign another certificate and the key for that certificate used to sign your bootloader. This way, if you ever need to revoke the signing certificate, you can simply sign a new one with the trusted parent and push out a revocation update instead of having to provision new keys)

But what do you want to sign? In the general purpose Linux world, we use an intermediate bootloader called Shim to bridge from the Microsoft signing authority to a distribution one. Shim then verifies the signature on grub, and grub in turn verifies the signature on the kernel. This is a large body of code that exists because of the use cases that general purpose distributions need to support - primarily, booting on arbitrary off the shelf hardware, and allowing arbitrary and complicated boot setups. This is unnecessary in the appliance case, where the hardware target can be well defined, where there's no need for interoperability with the Microsoft signing authority, and where the boot configuration can be extremely static.

We can skip all of this complexity using systemd-boot's unified Linux image support. This has the format described here, but the short version is that it's simply a kernel and initramfs linked into a small EFI executable that will run them. Instructions for generating such an image are here, and if you follow them you'll end up with a single static image that can be directly executed by the firmware. Signing this avoids dealing with a whole host of problems associated with relying on shim and grub, but note that you'll be embedding the initramfs as well. Again, this should be fine for appliance use-cases, but you'll need your build system to support building the initramfs at image creation time rather than relying on it being generated on the host.

At this point we have a single image that can be verified by the firmware and will get us to the point of a running kernel and initramfs. Unless you've got enough RAM that you can put your entire workload in the initramfs, you're going to want a filesystem as well, and you're going to want to verify that that filesystem hasn't been tampered with. The easiest approach to this is to use dm-verity, a device-mapper layer that uses a hash tree to verify that the filesystem contents haven't been modified. The kernel needs to know what the root hash is, so this can either be embedded into your initramfs image or into the kernel command line. Either way, it'll end up in the signed boot image, so nobody will be able to tamper with it.

It's important to note that a dm-verity partition is read-only - the kernel doesn't have the cryptographic secret that would be required to generate a new hash tree if the partition is modified. So if you require the ability to write data or logs anywhere, you'll need to add a new partition for that. If this partition is unencrypted, an attacker with access to the device will be able to put whatever they want on there. You should treat any data you read from there as untrusted, and ensure that it's validated before use (ie, don't just feed it to a random parser written in C and expect that everything's going to be ok). On the other hand, if it's encrypted, remember that you can't just put the encryption key in the boot image - an attacker with access to the device is going to be able to dump that and extract it. You'll probably want to use a TPM-sealed encryption secret, which will be discussed later on.

At this point everything in the boot process is cryptographically verified, and so should be difficult to tamper with. Unfortunately this isn't really sufficient - on x86 systems there's typically no verification of the integrity of the secure boot database. An attacker with physical access to the system could attach a programmer directly to the firmware flash and rewrite the secure boot database to include keys they control. They could then replace the boot image with one that they've signed, and the machine would happily boot code that the attacker controlled. We need to be able to demonstrate that the system booted using the correct secure boot keys, and the only way we can do that is to use the TPM.

I wrote an introduction to TPMs a while back. The important thing to know here is that the TPM contains a set of Platform Configuration Registers that are large enough to contain a cryptographic hash. During boot, each component of the boot process will generate a "measurement" of other security critical components, including the next component to be booted. These measurements are a representation of the data in question - they may simply be a hash of the object being measured, or the hash of a structure containing various pieces of metadata. Each measurement is passed to the TPM, along with the PCR it should be measured into. The TPM takes the new measurement, appends it to the existing value, and then stores the hash of this concatenated data in the PCR. This means that the final PCR value depends not only on the measurement, but also on every previous measurement. Without breaking the hash algorithm, there's no way to set the PCR to an arbitrary value. The hash values and some associated data are stored in a log that's kept in system RAM, which we'll come back to later.

Different PCRs store different pieces of information, but the one that's most interesting to us is PCR 7. Its use is documented in the TCG PC Client Platform Firmware Profile (section, but the short version is that the firmware will measure the secure boot keys that are used to boot the system. If the secure boot keys are altered (such as by an attacker flashing new ones), the PCR 7 value will change.

What can we do with this? There's a couple of choices. For devices that are online, we can perform remote attestation, a process where the device can provide a signed copy of the PCR values to another system. If the system also provides a copy of the TPM event log, the individual events in the log can be replayed in the same way that the TPM would use to calculate the PCR values, and then compared to the actual PCR values. If they match, that implies that the log values are correct, and we can then analyse individual log entries to make assumptions about system state. If a device has been tampered with, the PCR 7 values and associated log entries won't match the expected values, and we can detect the tampering.

If a device is offline, or if there's a need to permit local verification of the device state, we still have options. First, we can perform remote attestation to a local device. I demonstrated doing this over Bluetooth at LCA back in 2020. Alternatively, we can take advantage of other TPM features. TPMs can be configured to store secrets or keys in a way that renders them inaccessible unless a chosen set of PCRs have specific values. This is used in tpm2-totp, which uses a secret stored in the TPM to generate a TOTP value. If the same secret is enrolled in any standard TOTP app, the value generated by the machine can be compared to the value in the app. If they match, the PCR values the secret was sealed to are unmodified. If they don't, or if no numbers are generated at all, that demonstrates that PCR 7 is no longer the same value, and that the system has been tampered with.

Unfortunately, TOTP requires that both sides have possession of the same secret. This is fine when a user is making that association themselves, but works less well if you need some way to ship the secret on a machine and then separately ship the secret to a user. If the user can simply download the secret via some API, so can an attacker. If an attacker has the secret, they can modify the secure boot database and re-seal the secret to the new PCR 7 value. That means having to add some form of authentication, along with a strong binding of machine serial number to a user (in order to avoid someone with valid credentials simply downloading all the secrets).

Instead, we probably want some mechanism that uses asymmetric cryptography. A keypair can be generated on the TPM, which will refuse to release an unencrypted copy of the private key. The public key, however, can be exported and stored. If it's acceptable for a verification app to connect to the internet then the public key can simply be obtained that way - if not, a certificate can be issued to the key, and this exposed to the verifier via a QR code. The app then verifies that the certificate is signed by the vendor, and if so extracts the public key from that. The private key can have an associated policy that only permits its use when PCR 7 has an appropriate value, so the app then generates a nonce and asks the user to type that into the device. The device generates a signature over that nonce and displays that as a QR code. The app verifies the signature matches, and can then assert that PCR 7 has the expected value.

Once we can assert that PCR 7 has the expected value, we can assert that the system booted something signed by us and thus infer that the rest of the boot chain is also secure. But this is still dependent on the TPM obtaining trustworthy information, and unfortunately the bus that the TPM sits on isn't really terribly secure (TPM Genie is an example of an interposer for i2c-connected TPMs, but there's no reason an LPC one can't be constructed to attack the sort usually used on PCs). TPMs do support encrypted communication channels, but bootstrapping those isn't straightforward without firmware support. The easiest way around this is to make use of a firmware-based TPM, where the TPM is implemented in software running on an ancillary controller. Intel's solution is part of their Platform Trust Technology and runs on the Management Engine, AMD run it on the Platform Security Processor. In both cases it's not terribly feasible to intercept the communications, so we avoid this attack. The downside is that we're then placing more trust in components that are running much more code than a TPM would and which have a correspondingly larger attack surface. Which is preferable is going to depend on your threat model.

Most of this should be achievable using Yocto, which now has support for dm-verity built in. It's almost certainly going to be easier using this than trying to base on top of a general purpose distribution. I'd love to see this become a largely push button receive secure image process, so might take a go at that if I have some free time in the near future.

[1] Obviously technologies that can be used to ensure nobody other than me is able to modify the software on devices I own can also be used to ensure that nobody other than the manufacturer is able to modify the software on devices that they sell to third parties. There's no real technological solution to this problem, but we shouldn't allow the fact that a technology can be used in ways that are hostile to user freedom to cause us to reject that technology outright.
[2] This is slightly complicated due to the interactions with the Management Engine (on Intel) or the Platform Security Processor (on AMD). Here's a good writeup on the Intel side of things.

comment count unavailable comments

6 May 2021

Matthew Garrett: More doorbell adventures

Back in my last post on this topic, I'd got shell on my doorbell but hadn't figured out why the HTTP callbacks weren't always firing. I still haven't, but I have learned some more things.

Doorbird sell a chime, a network connected device that is signalled by the doorbell when someone pushes a button. It costs about $150, which seems excessive, but would solve my problem (ie, that if someone pushes the doorbell and I'm not paying attention to my phone, I miss it entirely). But given a shell on the doorbell, how hard could it be to figure out how to mimic the behaviour of one?

Configuration for the doorbell is all stored under /mnt/flash, and there's a bunch of files prefixed 1000eyes that contain config (1000eyes is the German company that seems to be behind Doorbird). One of these was called 1000eyes.peripherals, which seemed like a good starting point. The initial contents were "Peripherals":[] , so it seemed likely that it was intended to be JSON. Unfortunately, since I had no access to any of the peripherals, I had no idea what the format was. I threw the main application into Ghidra and found a function that had debug statements referencing "initPeripherals and read a bunch of JSON keys out of the file, so I could simply look at the keys it referenced and write out a file based on that. I did so, and it didn't work - the app stubbornly refused to believe that there were any defined peripherals. The check that was failing was pcVar4 = strstr(local_50[0],PTR_s_"type":"_0007c980);, which made no sense, since I very definitely had a type key in there. And then I read it more closely. strstr() wasn't being asked to look for "type":, it was being asked to look for "type":". I'd left a space between the : and the opening " in the value, which meant it wasn't matching. The rest of the function seems to call an actual JSON parser, so I have no idea why it doesn't just use that for this part as well, but deleting the space and restarting the service meant it now believed I had a peripheral attached.

The mobile app that's used for configuring the doorbell now showed a device in the peripherals tab, but it had a weird corrupted name. Tapping it resulted in an error telling me that the device was unavailable, and on the doorbell itself generated a log message showing it was trying to reach a device with the hostname bha-04f0212c5cca and (unsurprisingly) failing. The hostname was being generated from the MAC address field in the peripherals file and was presumably supposed to be resolved using mDNS, but for now I just threw a static entry in /etc/hosts pointing at my Home Assistant device. That was enough to show that when I opened the app the doorbell was trying to call a CGI script called peripherals.cgi on my fake chime. When that failed, it called out to the cloud API to ask it to ask the chime[1] instead. Since the cloud was completely unaware of my fake device, this didn't work either. I hacked together a simple server using Python's HTTPServer and was able to return data (another block of JSON). This got me to the point where the app would now let me get to the chime config, but would then immediately exit. adb logcat showed a traceback in the app caused by a failed assertion due to a missing key in the JSON, so I ran the app through jadx, found the assertion and from there figured out what keys I needed. Once that was done, the app opened the config page just fine.

Unfortunately, though, I couldn't edit the config. Whenever I hit "save" the app would tell me that the peripheral wasn't responding. This was strange, since the doorbell wasn't even trying to hit my fake chime. It turned out that the app was making a CGI call to the doorbell, and the thread handling that call was segfaulting just after reading the peripheral config file. This suggested that the format of my JSON was probably wrong and that the doorbell was not handling that gracefully, but trying to figure out what the format should actually be didn't seem easy and none of my attempts improved things.

So, new approach. Rather than writing the config myself, why not let the doorbell do it? I should be able to use the genuine pairing process if I could mimic the chime sufficiently well. Hitting the "add" button in the app asked me for the username and password for the chime, so I typed in something random in the expected format (six characters followed by four zeroes) and a sufficiently long password and hit ok. A few seconds later it told me it couldn't find the device, which wasn't unexpected. What was a little more unexpected was that the log on the doorbell was showing it trying to hit another bha-prefixed hostname (and, obviously, failing). The hostname contains the MAC address, but I hadn't told the doorbell the MAC address of the chime, just its username. Some more digging showed that the doorbell was calling out to the cloud API, giving it the 6 character prefix from the username and getting a MAC address back. Doing the same myself revealed that there was a straightforward mapping from the prefix to the mac address - changing the final character from "a" to "b" incremented the MAC by one. It's actually just a base 26 encoding of the MAC, with aaaaaa translating to 00408C000000.

That explained how the hostname was being generated, and in return I was able to work backwards to figure out which username I should use to generate the hostname I was already using. Attempting to add it now resulted in the doorbell making another CGI call to my fake chime in order to query its feature set, and by mocking that up as well I was able to send back a file containing X-Intercom-Type, X-Intercom-TypeId and X-Intercom-Class fields that made the doorbell happy. I now had a valid JSON file, which cleared up a couple of mysteries. The corrupt name was because the name field isn't supposed to be ASCII - it's base64 encoded UTF16-BE. And the reason I hadn't been able to figure out the JSON format correctly was because it looked something like this:

"Peripherals":[] "prefix": "type":"DoorChime","name":"AEQAbwBvAHIAYwBoAGkAbQBlACAAVABlAHMAdA==","mac":"04f0212c5cca","user":"username","password":"password" ]

Note that there's a total of one [ in this file, but two ]s? Awesome. Anyway, I could now modify the config in the app and hit save, and the doorbell would then call out to my fake chime to push config to it. Weirdly, the association between the chime and a specific button on the doorbell is only stored on the chime, not on the doorbell. Further, hitting the doorbell didn't result in any more HTTP traffic to my fake chime. However, it did result in some broadcast UDP traffic being generated. Searching for the port number led me to the Doorbird LAN API and a complete description of the format and encryption mechanism in use. Argon2I is used to turn the first five characters of the chime's password (which is also stored on the doorbell itself) into a 256-bit key, and this is used with ChaCha20 to decrypt the payload. The payload then contains a six character field describing the device sending the event, and then another field describing the event itself. Some more scrappy Python and I could pick up these packets and decrypt them, which showed that they were being sent whenever any event occurred on the doorbell. This explained why there was no storage of the button/chime association on the doorbell itself - the doorbell sends packets for all events, and the chime is responsible for deciding whether to act on them or not.

On closer examination, it turns out that these packets aren't just sent if there's a configured chime. One is sent for each configured user, avoiding the need for a cloud round trip if your phone is on the same network as the doorbell at the time. There was literally no need for me to mimic the chime at all, suitable events were already being sent.

Still. There's a fair amount of WTFery here, ranging from the strstr() based JSON parsing, the invalid JSON, the symmetric encryption that uses device passwords as the key (requiring the doorbell to be aware of the chime's password) and the use of only the first five characters of the password as input to the KDF. It doesn't give me a great deal of confidence in the rest of the device's security, so I'm going to keep playing.

[1] This seems to be to handle the case where the chime isn't on the same network as the doorbell

comment count unavailable comments

23 April 2021

Matthew Garrett: An accidental bootsplash

Back in 2005 we had Debconf in Helsinki. Earlier in the year I'd ended up invited to Canonical's Ubuntu Down Under event in Sydney, and one of the things we'd tried to design was a reasonable graphical boot environment that could also display status messages. The design constraints were awkward - we wanted it to be entirely in userland (so we didn't need to carry kernel patches), and we didn't want to rely on vesafb[1] (because at the time we needed to reinitialise graphics hardware from userland on suspend/resume[2], and vesa was not super compatible with that). Nothing currently met our requirements, but by the time we'd got to Helsinki there was a general understanding that Paul Sladen was going to implement this.

The Helsinki Debconf ended being an extremely strange event, involving me having to explain to Mark Shuttleworth what the physics of a bomb exploding on a bus were, many people being traumatised by the whole sauna situation, and the whole unfortunate water balloon incident, but it also involved Sladen spending a bunch of time trying to produce an SVG of a London bus as a D-Bus logo and not really writing our hypothetical userland bootsplash program, so on the last night, fueled by Koff that we'd bought by just collecting all the discarded empty bottles and returning them for the deposits, I started writing one.

I knew that Debian was already using graphics mode for installation despite having a textual installer, because they needed to deal with more complex fonts than VGA could manage. Digging into the code, I found that it used BOGL - a graphics library that made use of the VGA framebuffer to draw things. VGA had a pre-allocated memory range for the framebuffer[3], which meant the firmware probably wouldn't map anything else there any hitting those addresses probably wouldn't break anything. This seemed safe.

A few hours later, I had some code that could use BOGL to print status messages to the screen of a machine booted with vga16fb. I woke up some time later, somehow found myself in an airport, and while sitting at the departure gate[4] I spent a while staring at VGA documentation and worked out which magical calls I needed to make to have it behave roughly like a linear framebuffer. Shortly before I got on my flight back to the UK, I had something that could also draw a graphical picture.

Usplash shipped shortly afterwards. We hit various issues - vga16fb produced a 640x480 mode, and some laptops were not inclined to do that without a BIOS call first. 640x400 worked basically everywhere, but meant we had to redraw the art because circles don't work the same way if you change the resolution. My brief "UBUNTU BETA" artwork that was me literally writing "UBUNTU BETA" on an HP TC1100 shortly after I'd got the Wacom screen working did not go down well, and thankfully we had better artwork before release.

But 16 colours is somewhat limiting. SVGALib offered a way to get more colours and better resolution in userland, retaining our prerequisites. Unfortunately it relied on VM86, which doesn't exist in 64-bit mode on Intel systems. I ended up hacking the x86emu into a thunk library that exposed the same API as LRMI, so we could run it without needing VM86. Shockingly, it worked - we had support for 256 colour bootsplashes in any supported resolution on 64 bit systems as well as 32 bit ones.

But by now it was obvious that the future was having the kernel manage graphics support, both in terms of native programming and in supporting suspend/resume. Plymouth is much more fully featured than Usplash ever was, but relies on functionality that simply didn't exist when we started this adventure. There's certainly an argument that we'd have been better off making reasonable kernel modesetting support happen faster, but at this point I had literally no idea how to write decent kernel code and everyone should be happy I kept this to userland.

Anyway. The moral of all of this is that sometimes history works out such that you write some software that a huge number of people run without any idea of who you are, and also that this can happen without you having any fucking idea what you're doing.

Write code. Do crimes.

[1] vesafb relied on either the bootloader or the early stage kernel performing a VBE call to set a mode, and then just drawing directly into that framebuffer. When we were doing GPU reinitialisation in userland we couldn't guarantee that we'd run before the kernel tried to draw stuff into that framebuffer, and there was a risk that that was mapped to something dangerous if the GPU hadn't been reprogrammed into the same state. It turns out that having GPU modesetting in the kernel is a Good Thing.

[2] ACPI didn't guarantee that the firmware would reinitialise the graphics hardware, and as a result most machines didn't. At this point Linux didn't have native support for initialising most graphics hardware, so we fell back to doing it from userland. VBEtool was a terrible hack I wrote to try to re-execute the system's graphics hardware through a range of mechanisms, and it worked in a surprising number of cases.

[3] As long as you were willing to deal with 640x480 in 16 colours

[4] Helsinki-Vantaan had astonishingly comfortable seating for time

comment count unavailable comments

15 March 2021

Matthew Garrett: Exploring my doorbell

I've talked about my doorbell before, but started looking at it again this week because sometimes it simply doesn't send notifications to my Home Assistant setup - the push notifications appear on my phone, but the doorbell simply doesn't trigger the HTTP callback it's meant to[1]. This is obviously suboptimal, but it's also tricky to debug a device when you have no access to it.

Normally I'd just head straight in with a screwdriver, but the doorbell is shared with the other units in this building and it seemed a little anti-social to interfere with a shared resource. So I bought some broken units from ebay and pulled one of them apart. There's several boards inside, but one of them had a conveniently empty connector at the top with "TX", "RX" and "GND" labelled. Sticking a USB-serial converter on this gave me output from U-Boot, and then kernel output. Confirmation that my doorbell runs Linux, but unfortunately it didn't give me a shell prompt. My next approach would often me to just dump the flash and look for vulnerabilities that way, but this device uses TSOP-48 packaged NAND flash rather than the more convenient SPI NOR flash that I already have adapters to access. Dumping this sort of NAND isn't terribly hard, but the easiest way to do it involves desoldering it from the board and plugging it into something like a Flashcat USB adapter, and my soldering's not good enough to put it back on the board afterwards. So I wanted another approach.

U-Boot gave a short countdown to hit a key before continuing with boot, and for once hitting a key actually did something. Unfortunately it then prompted for a password, and giving the wrong one resulted in boot continuing[2]. In the past I've had good luck forcing U-Boot to drop to a prompt by simply connecting one of the data lines on SPI flash to ground while it's trying to read the kernel - the failed read causes U-Boot to error out. It turns out the same works fine on raw NAND, so I just edited the kernel boot arguments to append "init=/bin/sh" and soon I had a shell.

From here on, things were made easier by virtue of the device using the YAFFS filesystem. Unlike many flash filesystems, it's read/write, so I could make changes that would persist through to the running system. There was a convenient copy of telnetd included, but it segfaulted on startup, which reduced its usefulness. Fortunately there was also a copy of Netcat[3]. If you make a fifo somewhere on the filesystem, you can cat the fifo to a shell, pipe the shell to a netcat listener, and then pipe netcat's output back to the fifo. The shell's output all gets passed to whatever connects to netcat, and whatever's sent to netcat gets passed through the fifo back to the shell. This is, obviously, horribly insecure, but it was enough to get a root shell over the network on the running device.

The doorbell runs various bits of software, one of which is Lighttpd to provide a local API and access to the device. Another component ("nxp-client") connects to the vendor's cloud infrastructure and passes cloud commands back to the local webserver. This is where I found something strange. Lighttpd was refusing to start because its modules wanted library symbols that simply weren't present on the device. My best guess is that a firmware update went wrong and left the device in a partially upgraded state - and without a working local webserver, there was no way to perform any further updates. This may explain why this doorbell was sitting on ebay.

Anyway. Now that I had shell, I could simply dump the flash by copying it directly off the /dev/mtdblock devices - since I had netcat, I could just pipe stuff through that back to my actual computer. Now I had access to the filesystem I could extract that locally and start digging into it more deeply. One incredibly useful tool for this is qemu-user. qemu is a general purpose hardware emulation platform, usually used to emulate entire systems. But in qemu-user mode, it instead only emulates the CPU. When a piece of code tries to make a system call to access the kernel, qemu-user translates that to the appropriate calling convention for the host kernel and makes that call instead. Combined with binfmt_misc, you can configure a Linux system to be able to run Linux binaries from other architectures. One of the best things about this is that, because they're still using the host convention for making syscalls, you can run the host strace on them and see what they're doing.

What I found was that nxp-client was calling back to the cloud platform, setting up an encrypted communication channel (using ChaCha20 and a bunch of key setup stuff I couldn't be bothered picking apart) and then waiting for commands from the cloud. It would then proxy those through to the local webserver. Since I couldn't run the local lighttpd, I just wrote a trivial Python app using http.server and waited to see what requests I got. The first was a GET to a CGI script called editcgi.cgi, along with a path name. I mocked up the GET request to respond with what was on the actual filesystem. The cloud then proceeded to POST to editcgi.cgi, with the same pathname and with new file contents. editcgi.cgi is apparently able to read and write to files on the filesystem.

But this is on the interface that's exposed to the cloud client, so this didn't appear immediately useful - and, indeed, trying to hit the same CGI binary over the local network gave me a 401 unauthorized error. There's a local API spec for these doorbells, but they all refer to scripts in the bha-api namespace, and this script was in the plain cgi-bin namespace. But then I noticed that the bha-api namespace didn't actually exist in the filesystem - instead, lighttpd's mod_alias was configured to rewrite requests to bha-api through to files in cgi-bin. And by using the documented API to get a session token, I could call editcgi.cgi to read and write arbitrary files on the doorbell. Which means I can drop an extra script in /etc/rc.d/rc3.d and get a shell on my doorbell.

This all requires the ability to have local authentication credentials, so it's not a big security deal other than it allowing you to retain access to a monitoring device even after you've moved out and had your credentials revoked. I'm sure it's all fine.

[1] I can ping the doorbell from the Home Assistant machine, so it's not that the network is flaky
[2] The password appears to be hy9$gnhw0z6@ if anyone else ends up in this situation

comment count unavailable comments

9 March 2021

Matthew Garrett: Unauthenticated MQTT endpoints on Linksys Velop routers enable local DoS

(Edit: this is CVE-2021-1000002)

Linksys produces a series of wifi mesh routers under the Velop line. These routers use MQTT to send messages to each other for coordination purposes. In the version I tested against, there was zero authentication on this - anyone on the local network is able to connect to the MQTT interface on a router and send commands. As an example:
mosquitto_pub -h -t "network/master/cmd/nodes_temporary_blacklist" -m ' "data": "client": "f8:16:54:43:e2:0c", "duration": "3600", "action": "start" '
will ask the router to block the client with MAC address f8:16:54:43:e2:0c from the network for an hour. Various other MQTT topics pass parameters to shell scripts without quoting them or escaping metacharacters, so more serious outcomes may be possible.

The vendor has released two firmware updates since report - I have not verified whether either fixes this, but the changelog does not indicate any security issues were addressed.


2020-07-30: Submitted through the vendor's security vulnerability report form, indicating that I plan to disclose in either 90 days or after a fix is released. The form turns out to file a Bugcrowd submission.
2020-07-30: I claim the Bugcrowd submission.
2020-08-19: Vendor acknowledges the issue, is able to reproduce and assigns it a P3 priority.
2020-12-15: I ask if there's an update.
2021-02-02: I ask if there's an update.
2021-02-03: Bugcrowd raise a blocker on the issue, asking the vendor to respond.
2021-02-17: I ask for permission to disclose.
2021-03-09: In the absence of any response from the vendor since 2020-08-19, I violate Bugcrowd disclosure policies and unilaterally disclose.

comment count unavailable comments

21 February 2021

Matthew Garrett: Making hibernation work under Linux Lockdown

Linux draws a distinction between code running in kernel (kernel space) and applications running in userland (user space). This is enforced at the hardware level - in x86-speak[1], kernel space code runs in ring 0 and user space code runs in ring 3[2]. If you're running in ring 3 and you attempt to touch memory that's only accessible in ring 0, the hardware will raise a fault. No matter how privileged your ring 3 code, you don't get to touch ring 0.

Kind of. In theory. Traditionally this wasn't well enforced. At the most basic level, since root can load kernel modules, you could just build a kernel module that performed any kernel modifications you wanted and then have root load it. Technically user space code wasn't modifying kernel space code, but the difference was pretty semantic rather than useful. But it got worse - root could also map memory ranges belonging to PCI devices[3], and if the device could perform DMA you could just ask the device to overwrite bits of the kernel[4]. Or root could modify special CPU registers ("Model Specific Registers", or MSRs) that alter CPU behaviour via the /dev/msr interface, and compromise the kernel boundary that way.

It turns out that there were a number of ways root was effectively equivalent to ring 0, and the boundary was more about reliability (ie, a process running as root that ends up misbehaving should still only be able to crash itself rather than taking down the kernel with it) than security. After all, if you were root you could just replace the on-disk kernel with a backdoored one and reboot. Going deeper, you could replace the bootloader with one that automatically injected backdoors into a legitimate kernel image. We didn't have any way to prevent this sort of thing, so attempting to harden the root/kernel boundary wasn't especially interesting.

In 2012 Microsoft started requiring vendors ship systems with UEFI Secure Boot, a firmware feature that allowed[5] systems to refuse to boot anything without an appropriate signature. This not only enabled the creation of a system that drew a strong boundary between root and kernel, it arguably required one - what's the point of restricting what the firmware will stick in ring 0 if root can just throw more code in there afterwards? What ended up as the Lockdown Linux Security Module provides the tooling for this, blocking userspace interfaces that can be used to modify the kernel and enforcing that any modules have a trusted signature.

But that comes at something of a cost. Most of the features that Lockdown blocks are fairly niche, so the direct impact of having it enabled is small. Except that it also blocks hibernation[6], and it turns out some people were using that. The obvious question is "what does hibernation have to do with keeping root out of kernel space", and the answer is a little convoluted and is tied into how Linux implements hibernation. Basically, Linux saves system state into the swap partition and modifies the header to indicate that there's a hibernation image there instead of swap. On the next boot, the kernel sees the header indicating that it's a hibernation image, copies the contents of the swap partition back into RAM, and then jumps back into the old kernel code. What ensures that the hibernation image was actually written out by the kernel? Absolutely nothing, which means a motivated attacker with root access could turn off swap, write a hibernation image to the swap partition themselves, and then reboot. The kernel would happily resume into the attacker's image, giving the attacker control over what gets copied back into kernel space.

This is annoying, because normally when we think about attacks on swap we mitigate it by requiring an encrypted swap partition. But in this case, our attacker is root, and so already has access to the plaintext version of the swap partition. Disk encryption doesn't save us here. We need some way to verify that the hibernation image was written out by the kernel, not by root. And thankfully we have some tools for that.

Trusted Platform Modules (TPMs) are cryptographic coprocessors[7] capable of doing things like generating encryption keys and then encrypting things with them. You can ask a TPM to encrypt something with a key that's tied to that specific TPM - the OS has no access to the decryption key, and nor does any other TPM. So we can have the kernel generate an encryption key, encrypt part of the hibernation image with it, and then have the TPM encrypt it. We store the encrypted copy of the key in the hibernation image as well. On resume, the kernel reads the encrypted copy of the key, passes it to the TPM, gets the decrypted copy back and is able to verify the hibernation image.

That's great! Except root can do exactly the same thing. This tells us the hibernation image was generated on this machine, but doesn't tell us that it was done by the kernel. We need some way to be able to differentiate between keys that were generated in kernel and ones that were generated in userland. TPMs have the concept of "localities" (effectively privilege levels) that would be perfect for this. Userland is only able to access locality 0, so the kernel could simply use locality 1 to encrypt the key. Unfortunately, despite trying pretty hard, I've been unable to get localities to work. The motherboard chipset on my test machines simply doesn't forward any accesses to the TPM unless they're for locality 0. I needed another approach.

TPMs have a set of Platform Configuration Registers (PCRs), intended for keeping a record of system state. The OS isn't able to modify the PCRs directly. Instead, the OS provides a cryptographic hash of some material to the TPM. The TPM takes the existing PCR value, appends the new hash to that, and then stores the hash of the combination in the PCR - a process called "extension". This means that the new value of the TPM depends not only on the value of the new data, it depends on the previous value of the PCR - and, in turn, that previous value depended on its previous value, and so on. The only way to get to a specific PCR value is to either (a) break the hash algorithm, or (b) perform exactly the same sequence of writes. On system reset the PCRs go back to a known value, and the entire process starts again.

Some PCRs are different. PCR 23, for example, can be reset back to its original value without resetting the system. We can make use of that. The first thing we need to do is to prevent userland from being able to reset or extend PCR 23 itself. All TPM accesses go through the kernel, so this is a simple matter of parsing the write before it's sent to the TPM and returning an error if it's a sensitive command that would touch PCR 23. We now know that any change in PCR 23's state will be restricted to the kernel.

When we encrypt material with the TPM, we can ask it to record the PCR state. This is given back to us as metadata accompanying the encrypted secret. Along with the metadata is an additional signature created by the TPM, which can be used to prove that the metadata is both legitimate and associated with this specific encrypted data. In our case, that means we know what the value of PCR 23 was when we encrypted the key. That means that if we simply extend PCR 23 with a known value in-kernel before encrypting our key, we can look at the value of PCR 23 in the metadata. If it matches, the key was encrypted by the kernel - userland can create its own key, but it has no way to extend PCR 23 to the appropriate value first. We now know that the key was generated by the kernel.

But what if the attacker is able to gain access to the encrypted key? Let's say a kernel bug is hit that prevents hibernation from resuming, and you boot back up without wiping the hibernation image. Root can then read the key from the partition, ask the TPM to decrypt it, and then use that to create a new hibernation image. We probably want to prevent that as well. Fortunately, when you ask the TPM to encrypt something, you can ask that the TPM only decrypt it if the PCRs have specific values. "Sealing" material to the TPM in this way allows you to block decryption if the system isn't in the desired state. So, we define a policy that says that PCR 23 must have the same value at resume as it did on hibernation. On resume, the kernel resets PCR 23, extends it to the same value it did during hibernation, and then attempts to decrypt the key. Afterwards, it resets PCR 23 back to the initial value. Even if an attacker gains access to the encrypted copy of the key, the TPM will refuse to decrypt it.

And that's what this patchset implements. There's one fairly significant flaw at the moment, which is simply that an attacker can just reboot into an older kernel that doesn't implement the PCR 23 blocking and set up state by hand. Fortunately, this can be avoided using another aspect of the boot process. When you boot something via UEFI Secure Boot, the signing key used to verify the booted code is measured into PCR 7 by the system firmware. In the Linux world, the Shim bootloader then measures any additional keys that are used. By either using a new key to tag kernels that have support for the PCR 23 restrictions, or by embedding some additional metadata in the kernel that indicates the presence of this feature and measuring that, we can have a PCR 7 value that verifies that the PCR 23 restrictions are present. We then seal the key to PCR 7 as well as PCR 23, and if an attacker boots into a kernel that doesn't have this feature the PCR 7 value will be different and the TPM will refuse to decrypt the secret.

While there's a whole bunch of complexity here, the process should be entirely transparent to the user. The current implementation requires a TPM 2, and I'm not certain whether TPM 1.2 provides all the features necessary to do this properly - if so, extending it shouldn't be hard, but also all systems shipped in the past few years should have a TPM 2, so that's going to depend on whether there's sufficient interest to justify the work. And we're also at the early days of review, so there's always the risk that I've missed something obvious and there are terrible holes in this. And, well, given that it took almost 8 years to get the Lockdown patchset into mainline, let's not assume that I'm good at landing security code.

[1] Other architectures use different terminology here, such as "supervisor" and "user" mode, but it's broadly equivalent
[2] In theory rings 1 and 2 would allow you to run drivers with privileges somewhere between full kernel access and userland applications, but in reality we just don't talk about them in polite company
[3] This is how graphics worked in Linux before kernel modesetting turned up. XFree86 would just map your GPU's registers into userland and poke them directly. This was not a huge win for stability
[4] IOMMUs can help you here, by restricting the memory PCI devices can DMA to or from. The kernel then gets to allocate ranges for device buffers and configure the IOMMU such that the device can't DMA to anything else. Except that region of memory may still contain sensitive material such as function pointers, and attacks like this can still cause you problems as a result.
[5] This describes why I'm using "allowed" rather than "required" here
[6] Saving the system state to disk and powering down the platform entirely - significantly slower than suspending the system while keeping state in RAM, but also resilient against the system losing power.
[7] With some handwaving around "coprocessor". TPMs can't be part of the OS or the system firmware, but they don't technically need to be an independent component. Intel have a TPM implementation that runs on the Management Engine, a separate processor built into the motherboard chipset. AMD have one that runs on the Platform Security Processor, a small ARM core built into their CPU. Various ARM implementations run a TPM in Trustzone, a special CPU mode that (in theory) is able to access resources that are entirely blocked off from anything running in the OS, kernel or otherwise.

comment count unavailable comments

23 November 2020

Shirish Agarwal: White Hat Senior and Education

I had been thinking of doing a blog post on RCEP which China signed with 14 countries a week and a day back but this new story has broken and is being viraled a bit on the interwebs, especially twitter and is pretty much in our domain so thought would be better to do a blog post about it. Also, there is quite a lot packed so quite a bit of unpacking to do.

Whitehat, Greyhat and Blackhat For those of you who may not, there are actually three terms especially in computer science that one comes across. Those are white hats, grey hats and black hats. Now clinically white hats are like the fiery angels or the good guys who basically take permissions to try and find out weakness in an application, program, website, organization and so on and so forth. A somewhat dated reference to hacker could be Sandra Bullock (The Net 1995) , Sneakers (1992), Live Free or Die Hard (2007) . Of the three one could argue that Sandra was actually into viruses which are part of computer security but still she showed some bad-ass skills, but then that is what actors are paid to do  Sneakers was much more interesting for me because in that you got the best key which can unlock any lock, something like quantum computing is supposed to do. One could equate both the first movies in either as a White hat or a Grey hat . A Grey hat is more flexible in his/her moral values, and they are plenty of such people. For e.g. Julius Assange could be described as a Grey hat, but as you can see and understand those are moral issues.

A black hat on the other hand is one who does things for profit even if it harms the others. The easiest fictitious examples are all Die Hard series, all of them except the 4th one, all had bad guys or black hats. The 4th also had but is the odd one out as it had Matthew Farell (Justin Long) as a Grey hat hacker. In real life Kevin Mitnick, Kevin Poulsen, Robert Tappan Morris, George Hotz, Gary McKinnon are some examples of hackers, most of whom were black hats, most of them reformed into white hats and security specialists. There are many other groups and names but that perhaps is best for another day altogether. Now why am I sharing this. Because in all of the above, the people who are using and working with the systems have better than average understanding of systems and they arguably would be better than most people at securing their networks, systems etc. but as we shall see in this case there has been lots of issues in the company.

WhiteHat Jr. and 300 Million Dollars Before I start this, I would like to share that for me this suit in many ways seems to be similar to the suit filed against Krishnaraj Rao . Although the difference is that Krishnaraj Rao s case/suit is that it was in real estate while this one is in education although many things are similar to those cases but also differ in some obvious ways. For e.g. in the suit against Krishnaraj Rao, the plaintiff s first approached the High Court and then the Supreme Court. Of course Krishnaraj Rao won in the High Court and then in the SC plaintiff s agreed to Krishnaraj Rao s demands as they knew they could not win in SC. In that case, a compromise was reached by the plaintiff just before judgement was to be delivered. In this case, the plaintiff have directly approached the Delhi High Court. The charges against Mr. Poonia (the defendant in this case) are very much similar to those which were made in Krishnaraj Rao s suit hence won t be going into those details. They have claimed defamation and filed a 20 crore suit. The idea is basically to silence any whistle-blowers.

Fictional Character Wolf Gupta The first issue in this case or perhaps one of the most famous or infamous character is an unknown. While he has been reportedly hired by Google India, BJYU, Chandigarh. This has been reported by Yahoo News. I did a cursory search on LinkedIn to see if there indeed is a wolf gupta but wasn t able to find any person with such a name. I am not even talking the amount of money/salary the fictitious gentleman is supposed to have got and the various variations on the salary figures at different times and the different ads.

If I wanted to, I could have asked few of the kind souls whom I know are working in Google to see if they can find such a person using their own credentials but it probably would have been a waste of time. When you show a LinkedIn profile in your social media, it should come up in the results, in this case it doesn t. I also tried to find out if somehow BJYU was a partner to Google and came up empty there as well. There is another story done by Kan India but as I m not a subscriber, I don t know what they have written but the beginning of the story itself does not bode well. While I can understand marketing, there is a line between marketing something and being misleading. At least to me, all of the references shared seems misleading at least to me.

Taking down dissent One of the big no-nos at least from what I perceive, you cannot and should not take down dissent or critique. Indians, like most people elsewhere around the world, critique and criticize day and night. Social media like twitter, mastodon and many others would not exist in the place if criticisms are not there. In fact, one could argue that Twitter and most social media is used to drive engagements to a person, brand etc. It is even an official policy in Twitter. Now you can t drive engagements without also being open to critique and this is true of all the web, including of WordPress and me  . What has been happening is that whitehatjr with help of bjyu have been taking out content of people citing copyright violation which seems laughable. When citizens critique anything, we are obviously going to take the name of the product otherwise people would have to start using new names similar to how Tom Riddle was known as Dark Lord , Voldemort and He who shall not be named . There have been quite a few takedowns, I just provide one for reference, the rest of the takedowns would probably come in the ongoing suit/case.
Whitehat Jr. ad showing investors fighting

Now a brief synopsis of what the ad. is about. The ad is about a kid named Chintu who makes an app. The app. Is so good that investors come to his house and right in the lawn and start fighting each other. The parents are enjoying looking at the fight and to add to the whole thing there is also a nosy neighbor who has his own observations. Simply speaking, it is a juvenile ad but it works as most parents in India, as elsewhere are insecure.
Jihan critiquing the whitehatjr ad
Before starting, let me assure that I asked Jihan s parents if it s ok to share his ad on my blog and they agreed. What he has done is broken down the ad and showed how juvenile the ad is and using logic and humor as a template for the same. He does make sure to state that he does not know how the product is as he hasn t used it. His critique was about the ad and not the product as he hasn t used that.

The Website If you look at the website, sadly, most of the site only talks about itself rather than giving examples that people can look in detail. For e.g. they say they have few apps. on Google play-store but no link to confirm the same. The same is true of quite a few other things. In another ad a Paralympic star says don t get into sports and get into coding. Which athlete in their right mind would say that? And it isn t that we (India) are brimming with athletes at the international level. In the last outing which was had in 2016, India sent a stunning 117 athletes but that was an exception as we had the women s hockey squad which was of 16 women, and even then they were overshadowed in numbers by the bureaucratic and support staff. There was criticism about the staff bit but that is probably a story for another date. Most of the site doesn t really give much value and the point seems to be driving sales to their courses. This is pressurizing small kids as well as teenagers and better who are in the second and third year science-engineering whose parents don t get that it is advertising and it is fake and think that their kids are incompetent. So this pressurizes both small kids as well as those who are learning, doing in whatever college or educational institution . The teenagers more often than not are unable to tell/share with them that this is advertising and fake. Also most of us have been on a a good diet of ads. Fair and lovely still sells even though we know it doesn t work. This does remind me of a similar fake academy which used very much similar symptoms and now nobody remembers them today. There used to be an academy called Wings Academy or some similar name. They used to advertise that you come to us and we will make you into a pilot or an airhostess and it was only much later that it was found out that most kids were doing laundry work in hotels and other such work. Many had taken loans, went bankrupt and even committed suicide because they were unable to pay off the loans due to the dreams given by the company and the harsh realities that awaited them. They were sued in court but dunno what happened but soon they were off the radar so we never came to know what happened to those million of kids whose life dreams were shattered.

Security Now comes the security part. They have alleged that Mr. Poonia broke into their systems. While this may be true, what I find funny is that with the name Whitehat, how can they justify it? If you are saying you are white hat you are supposed to be much better than this. And while I have not tried to penetrate their systems, I did find it laughable that the site is using an expired https:// certificate. I could have tried further to figure out the systems but I chose not to. How they could not have an automated script to get the certificate fixed is beyond me, this is known as certificate outage and is very well understood in the industry. There are tools like Let s Encrypt and Certbot (both EFF) and many others. But that is their concern, not mine.

Comparison A similar offering would be unacademy but as can be seen they neither try to push you in any way and nor do they make any ridiculous claims. In fact how genuine unacademy is can be gauged from the fact that many of its learning resources are available to people to see on YT and if they have tools they can also download it. Now, does this mean that every educational website should have their content for free, of course not. But when a channel has 80% 90% of it YT content as ads and testimonials then they surely should give a reason to pause both for parents and students alike. But if parents had done that much research, then things would not be where they are now.

Allegations Just to complete, there are allegations by Mr. Poonia with some screenshots which show the company has been doing a lot of bad things. For e.g. they were harassing an employee at night 2 a.m. who was frustrated and working in the company at the time. Many of the company staff routinely made sexist and offensive, sexual abusive remarks privately between themselves for prospective women who came to interview via webcam (due to the pandemic). There also seems to be a bit of porn on the web/mobile server of the company as well. There also have been allegations that while the company says refund is done next day, many parents who have demanded those refunds have not got it. Now while Mr. Poonia has shared some quotations of the staff while hiding the identities of both the victims and the perpetrators, the language being used in itself tells a lot. I am in two minds whether to share those photos or not hence atm choosing not to. Poonia has also contended that all teachers do not know programming, and they are given scripts to share. There have been some people who did share that experience with him
Suruchi Sethi
From the company s side they are alleging he has hacked the company servers and would probably be using the Fruit of the poisonous tree argument which we have seen have been used in many arguments.

Conclusion Now that lies in the eyes of the Court whether the single bench chooses the literal meaning or use the spirit of the law or the genuine concerns of the people concerned. While in today s hearing while the company asked for a complete sweeping injunction they were unable to get it. Whatever may happen, we may hope to see some fireworks in the second hearing which is slated to be on 6.01.2021 where all of this plays out. Till later.

27 July 2020

Matthew Garrett: Filesystem deduplication is a sidechannel

First off - nothing I'm going to talk about in this post is novel or overly surprising, I just haven't found a clear writeup of it before. I'm not criticising any design decisions or claiming this is an important issue, just raising something that people might otherwise be unaware of.

With that out of the way: Automatic deduplication of data is a feature of modern filesystems like zfs and btrfs. It takes two forms - inline, where the filesystem detects that data being written to disk is identical to data that already exists on disk and simply references the existing copy rather than, and offline, where tooling retroactively identifies duplicated data and removes the duplicate copies (zfs supports inline deduplication, btrfs only currently supports offline). In a world where disks end up with multiple copies of cloud or container images, deduplication can free up significant amounts of disk space.

What's the security implication? The problem is that deduplication doesn't recognise ownership - if two users have copies of the same file, only one copy of the file will be stored[1]. So, if user a stores a file, the amount of free space will decrease. If user b stores another copy of the same file, the amount of free space will remain the same. If user b is able to check how much free space is available, user b can determine whether the file already exists.

This doesn't seem like a huge deal in most cases, but it is a violation of expected behaviour (if user b doesn't have permission to read user a's files, user b shouldn't be able to determine whether user a has a specific file). But we can come up with some convoluted cases where it becomes more relevant, such as law enforcement gaining unprivileged access to a system and then being able to demonstrate that a specific file already exists on that system. Perhaps more interestingly, it's been demonstrated that free space isn't the only sidechannel exposed by deduplication - deduplication has an impact on access timing, and can be used to infer the existence of data across virtual machine boundaries.

As I said, this is almost certainly not something that matters in most real world scenarios. But with so much discussion of CPU sidechannels over the past couple of years, it's interesting to think about what other features also end up leaking information in ways that may not be obvious.

(Edit to add: deduplication isn't enabled on zfs by default and is explicitly triggered on btrfs, so unless it's something you've enabled then this isn't something that affects you)

[1] Deduplication is usually done at the block level rather than the file level, but given zfs's support for variable sized blocks, identical files should be deduplicated even if they're smaller than the maximum record size

comment count unavailable comments

24 June 2020

Matthew Garrett: Making my doorbell work

I recently moved house, and the new building has a Doorbird to act as a doorbell and open the entrance gate for people. There's a documented local control API (no cloud dependency!) and a Home Assistant integration, so this seemed pretty straightforward.

Unfortunately not. The Doorbird is on separate network that's shared across the building, provided by Monkeybrains. We're also a Monkeybrains customer, so our network connection is plugged into the same router and antenna as the Doorbird one. And, as is common, there's port isolation between the networks in order to avoid leakage of information between customers. Rather perversely, we are the only people with an internet connection who are unable to ping my doorbell.

I spent most of the past few weeks digging myself out from under a pile of boxes, but we'd finally reached the point where spending some time figuring out a solution to this seemed reasonable. I spent a while playing with port forwarding, but that wasn't ideal - the only server I run is in the UK, and having packets round trip almost 11,000 miles so I could speak to something a few metres away seemed like a bad plan. Then I tried tethering an old Android device with a data-only SIM, which worked fine but only in one direction (I could see what the doorbell could see, but I couldn't get notifications that someone had pushed a button, which was kind of the point here).

So I went with the obvious solution - I added a wifi access point to the doorbell network, and my home automation machine now exists on two networks simultaneously (nmcli device modify wlan0 ipv4.never-default true is the magic for "ignore the gateway that the DHCP server gives you" if you want to avoid this), and I could now do link local service discovery to find the doorbell if it changed addresses after a power cut or anything. And then, like magic, everything worked - I got notifications from the doorbell when someone hit our button.

But knowing that an event occurred without actually doing something in response seems fairly unhelpful. I have a bunch of Chromecast targets around the house (a mixture of Google Home devices and Chromecast Audios), so just pushing a message to them seemed like the easiest approach. Home Assistant has a text to speech integration that can call out to various services to turn some text into a sample, and then push that to a media player on the local network. You can group multiple Chromecast audio sinks into a group that then presents as a separate device on the network, so I could then write an automation to push audio to the speaker group in response to the button being pressed.

That's nice, but it'd also be nice to do something in response. The Doorbird exposes API control of the gate latch, and Home Assistant exposes that as a switch. I'm using Home Assistant's Google Assistant integration to expose devices Home Assistant knows about to voice control. Which means when I get a house-wide notification that someone's at the door I can just ask Google to open the door for them.

So. Someone pushes the doorbell. That sends a signal to a machine that's bridged onto that network via an access point. That machine then sends a protobuf command to speakers on a separate network, asking them to stream a sample it's providing. Those speakers call back to that machine, grab the sample and play it. At this point, multiple speakers in the house say "Someone is at the door". I then say "Hey Google, activate the front gate" - the device I'm closest to picks this up and sends it to Google, where something turns my speech back into text. It then looks at my home structure data and realises that the "Front Gate" device is associated with my Home Assistant integration. It then calls out to the home automation machine that received the notification in the first place, asking it to trigger the front gate relay. That device calls out to the Doorbird and asks it to open the gate. And now I have functionality equivalent to a doorbell that completes a circuit and rings a bell inside my home, and a button inside my home that completes a circuit and opens the gate, except it involves two networks inside my building, callouts to the cloud, at least 7 devices inside my home that are running Linux and I really don't want to know how many computational cycles.

The future is wonderful.

(I work for Google. I do not work on any of the products described in this post. Please god do not ask me how to integrate your IoT into any of this)

comment count unavailable comments

17 May 2020

Matthew Palmer: Private Key Redaction: UR DOIN IT RONG

Because posting private keys on the Internet is a bad idea, some people like to redact their private keys, so that it looks kinda-sorta like a private key, but it isn t actually giving away anything secret. Unfortunately, due to the way that private keys are represented, it is easy to redact a key in such a way that it doesn t actually redact anything at all. RSA private keys are particularly bad at this, but the problem can (potentially) apply to other keys as well. I ll show you a bit of Inside Baseball with key formats, and then demonstrate the practical implications. Finally, we ll go through a practical worked example from an actual not-really-redacted key I recently stumbled across in my travels.

The Private Lives of Private Keys Here is what a typical private key looks like, when you come across it:
Obviously, there s some hidden meaning in there computers don t encrypt things by shouting BEGIN RSA PRIVATE KEY! , after all. What is between the BEGIN/END lines above is, in fact, a base64-encoded DER format ASN.1 structure representing a PKCS#1 private key. In simple terms, it s a list of numbers very important numbers. The list of numbers is, in order:
  • A version number (0);
  • The public modulus , commonly referred to as n ;
  • The public exponent , or e (which is almost always 65,537, for various unimportant reasons);
  • The private exponent , or d ;
  • The two private primes , or p and q ;
  • Two exponents, which are known as dmp1 and dmq1 ; and
  • A coefficient, known as iqmp .

Why Is This a Problem? The thing is, only three of those numbers are actually required in a private key. The rest, whilst useful to allow the RSA encryption and decryption to be more efficient, aren t necessary. The three absolutely required values are e, p, and q. Of the other numbers, most of them are at least about the same size as each of p and q. So of the total data in an RSA key, less than a quarter of the data is required. Let me show you with the above toy key, by breaking it down piece by piece1:
  • MGI DER for this is a sequence
  • CAQ version (0)
  • CxjdTmecltJEz2PLMpS4BX n
  • AgMBAA e
  • ECEDKtuwD17gpagnASq1zQTY d
  • IJANUYZsIjRsR3 q
  • AgkAkahDUXL0RS dmp1
  • ECCB78r2SnsJC9 dmq1
  • AghaOK3FsKoELg== iqmp
Remember that in order to reconstruct all of these values, all I need are e, p, and q and e is pretty much always 65,537. So I could redact almost all of this key, and still give all the important, private bits of this key. Let me show you:
Now, I doubt that anyone is going to redact a key precisely like this but then again, this isn t a typical RSA key. They usually look a lot more like this:
People typically redact keys by deleting whole lines, and usually replacing them with [...] and the like. But only about 345 of those 1588 characters (excluding the header and footer) are required to construct the entire key. You can redact about 4/5ths of that giant blob of stuff, and your private parts (or at least, those of your key) are still left uncomfortably exposed.

But Wait! There s More! Remember how I said that everything in the key other than e, p, and q could be derived from those three numbers? Let s talk about one of those numbers: n. This is known as the public modulus (because, along with e, it is also present in the public key). It is very easy to calculate: n = p * q. It is also very early in the key (the second number, in fact). Since n = p * q, it follows that q = n / p. Thus, as long as the key is intact up to p, you can derive q by simple division.

Real World Redaction At this point, I d like to introduce an acquaintance of mine: Mr. Johan Finn. He is the proud owner of the GitHub repo johanfinn/scripts. For a while, his repo contained a script that contained a poorly-redacted private key. He since deleted it, by making a new commit, but of course because git never really deletes anything, it s still available. Of course, Mr. Finn may delete the repo, or force-push a new history without that commit, so here is the redacted private key, with a bit of the surrounding shell script, for our illustrative pleasure:
#Add private key to .ssh folder
cd /home/johan/.ssh/
echo  "-----BEGIN RSA PRIVATE KEY-----
-----END RSA PRIVATE KEY-----" >> id_rsa
Now, if you try to reconstruct this key by removing the obvious garbage lines (the ones that are all repeated characters, some of which aren t even valid base64 characters), it still isn t a key at least, openssl pkey doesn t want anything to do with it. The key is very much still in there, though, as we shall soon see. Using a gem I wrote and a quick bit of Ruby, we can extract a complete private key. The irb session looks something like this:
>> require "derparse"
>> b64 = <<EOF
>> b64 += <<EOF
>> der = b64.unpack("m").first
>> c =
>> version = c.value
=> 0
>> c = c.next_node
>> n = c.value
=> 80071596234464993385068908004931... # (etc)
>> c = c.next_node
>> e = c.value
=> 65537
>> c = c.next_node
>> d = c.value
=> 58438813486895877116761996105770... # (etc)
>> c = c.next_node
>> p = c.value
=> 29635449580247160226960937109864... # (etc)
>> c = c.next_node
>> q = c.value
=> 27018856595256414771163410576410... # (etc)
What I ve done, in case you don t speak Ruby, is take the two chunks of plausible-looking base64 data, chuck them together into a variable named b64, unbase64 it into a variable named der, pass that into a new DerParse instance, and then walk the DER value tree until I got all the values I need. Interestingly, the q value actually traverses the split in the two chunks, which means that there s always the possibility that there are lines missing from the key. However, since p and q are supposed to be prime, we can sanity check them to see if corruption is likely to have occurred:
>> require "openssl"
=> true
=> true
Excellent! The chances of a corrupted file producing valid-but-incorrect prime numbers isn t huge, so we can be fairly confident that we ve got the real p and q. Now, with the help of another one of my creations we can use e, p, and q to create a fully-operational battle key:
>> require "openssl/pkey/rsa"
>> k = OpenSSL::PKey::RSA.from_factors(p, q, e)
=> #<OpenSSL::PKey::RSA:0x0000559d5903cd38>
>> k.valid?
=> true
>> k.verify(, k.sign(, "bob"), "bob")
=> true
and there you have it. One fairly redacted-looking private key brought back to life by maths and far too much free time. Sorry Mr. Finn, I hope you re not still using that key on anything Internet-facing.

What About Other Key Types? EC keys are very different beasts, but they have much the same problems as RSA keys. A typical EC key contains both private and public data, and the public portion is twice the size so only about 1/3 of the data in the key is private material. It is quite plausible that you can redact an EC key and leave all the actually private bits exposed.

What Do We Do About It? In short: don t ever try and redact real private keys. For documentation purposes, just put KEY GOES HERE in the appropriate spot, or something like that. Store your secrets somewhere that isn t a public (or even private!) git repo. Generating a dummy private key and sticking it in there isn t a great idea, for different reasons: people have this odd habit of reusing demo keys in real life. There s no need to encourage that sort of thing.
  1. Technically the pieces aren t 100% aligned with the underlying DER, because of how base64 works. I felt it was easier to understand if I stuck to chopping up the base64, rather than decoding into DER and then chopping up the DER.

21 April 2020

Matthew Garrett: Linux kernel lockdown, integrity, and confidentiality

The Linux kernel lockdown patches were merged into the 5.4 kernel last year, which means they're now part of multiple distributions. For me this was a 7-year journey, which means it's easy to forget that others aren't as invested in the code as I am. Here's what these patches are intended to achieve, why they're implemented in the current form and what people should take into account when deploying the feature.

Root is a user - a privileged user, but nevertheless a user. Root is not identical to the kernel. Processes running as root still can't dereference addresses that belong to the kernel, are still subject to the whims of the scheduler and so on. But historically that boundary has been very porous. Various interfaces make it straightforward for root to modify kernel code (such as loading modules or using /dev/mem), while others make it less straightforward (being able to load new ACPI tables that can cause the ACPI interpreter to overwrite the kernel, for instance). In the past that wasn't seen as a significant issue, since there were no widely deployed mechanisms for verifying the integrity of the kernel in the first place. But once UEFI secure boot became widely deployed, this was a problem. If you verify your boot chain but allow root to modify that kernel, the benefits of the verified boot chain are significantly reduced. Even if root can't modify the on-disk kernel, root can just hot-patch the kernel and then make this persistent by dropping a binary that repeats the process on system boot.

Lockdown is intended as a mechanism to avoid that, by providing an optional policy that closes off interfaces that allow root to modify the kernel. This was the sole purpose of the original implementation, which maps to the "integrity" mode that's present in the current implementation. Kernels that boot in lockdown integrity mode prevent even root from using these interfaces, increasing assurances that the running kernel corresponds to the booted kernel. But lockdown's functionality has been extended since then. There are some use cases where preventing root from being able to modify the kernel isn't enough - the kernel may hold secret information that even root shouldn't be permitted to see (such as the EVM signing key that can be used to prevent offline file modification), and the integrity mode doesn't prevent that. This is where lockdown's confidentiality mode comes in. Confidentiality mode is a superset of integrity mode, with additional restrictions on root's ability to use features that would allow the inspection of any kernel memory that could contain secrets.

Unfortunately right now we don't have strong mechanisms for marking which bits of kernel memory contain secrets, so in order to achieve that we end up blocking access to all kernel memory. Unsurprisingly, this compromises people's ability to inspect the kernel for entirely legitimate reasons, such as using the various mechanisms that allow tracing and probing of the kernel.

How can we solve this? There's a few ways:
  1. Introduce a mechanism to tag memory containing secrets, and only restrict accesses to this. I've tried to do something similar for userland and it turns out to be hard, but this is probably the best long-term solution.
  2. Add support for privileged applications with an appropriate signature that implement policy on the userland side. This is actually possible already, though not straightforward. Lockdown is implemented in the LSM layer, which means the policy can be imposed using any other existing LSM. As an example, we could use SELinux to impose the confidentiality restrictions on most processes but permit processes with a specific SELinux context to use them, and then use EVM to ensure that any process running in that context has a legitimate signature. This is quite a few hoops for a general purpose distribution to jump through.
  3. Don't use confidentiality mode in general purpose distributions. The attacks it protects against are mostly against special-purpose use cases, and they can enable it themselves.

My recommendation is for (3), and I'd encourage general purpose distributions that enable lockdown to do so only in integrity mode rather than confidentiality mode. The cost of confidentiality mode is just too high compared to the benefits it provides. People who need confidentiality mode probably already know that they do, and should be in a position to enable it themselves and handle the consequences.

comment count unavailable comments

13 April 2020

Matthew Garrett: Implementing support for advanced DPTF policy in Linux

Intel's Dynamic Platform and Thermal Framework (DPTF) is a feature that's becoming increasingly common on highly portable Intel-based devices. The adaptive policy it implements is based around the idea that thermal management of a system is becoming increasingly complicated - the appropriate set of cooling constraints to place on a system may differ based on a whole bunch of criteria (eg, if a tablet is being held vertically rather than lying on a table, it's probably going to be able to dissipate heat more effectively, so you should impose different constraints). One way of providing these criteria to the OS is to embed them in the system firmware, allowing an OS-level agent to read that and then incorporate OS-level knowledge into a final policy decision.

Unfortunately, while Intel have released some amount of support for DPTF on Linux, they haven't included support for the adaptive policy. And even more annoyingly, many modern laptops run in a heavily conservative thermal state if the OS doesn't support the adaptive policy, meaning that the CPU throttles down extremely quickly and the laptop runs excessively slowly.

It's been a while since I really got stuck into a laptop reverse engineering project, and I don't have much else to do right now, so I've been working on this. It's been a combination of examining what source Intel have released, reverse engineering the Windows code and staring hard at hex dumps until they made some sort of sense. Here's where I am.

There's two main components to the adaptive policy - the adaptive conditions table (APCT) and the adaptive actions table (APAT). The adaptive conditions table contains a set of condition sets, with up to 10 conditions in each condition set. A condition is something like "is the battery above a certain charge", "is this temperature sensor below a certain value", "is the lid open or closed", "is the machine upright or horizontal" and so on. Each condition set is evaluated in turn - if all the conditions evaluate to true, the condition set's target is implemented. If not, we move onto the next condition set. There will typically be a fallback condition set to catch the case where none of the other condition sets evaluate to true.

The action table contains sets of actions associated with a specific target. Once we've picked a target by evaluating the conditions, we execute the actions that have a corresponding target. Actions are things like "Set the CPU power limit to this value" or "Load a passive policy table". Passive policy tables are simply tables associating sensors with devices and an associated temperature limit. If the limit is exceeded, the associated device should be asked to reduce its heat output until the situation is resolved.

There's a couple of twists. The first is the OEM conditions. These are conditions that refer to values that are exposed by the firmware and are otherwise entirely opaque - the firmware knows what these mean, but we don't, so conditions that rely on these values are magical. They could be temperature, they could be power consumption, they could be SKU variations. We just don't know. The other is that older versions of the APCT table didn't include a reference to a device - ie, if you specified a condition based on a temperature, you had no way to express which temperature sensor to use. So, instead, you specified a condition that's greater than 0x10000, which tells the agent to look at the APPC table to extract the device and the appropriate actual condition.

Intel already have a Linux app called Thermal Daemon that implements a subset of this - you're supposed to run the binary-only dptfxtract against your firmware to parse a few bits of the DPTF tables, and it writes out an XML file that Thermal Daemon makes use of. Unfortunately it doesn't handle most of the more interesting bits of the adaptive performance policy, so I've spent the past couple of days extending it to do so and to remove the proprietary dependency.

My current work is here - it requires a couple of kernel patches (that are in the patches directory), and it only supports a very small subset of the possible conditions. It's also entirely possible that it'll do something inappropriate and cause your computer to melt - none of this is publicly documented, I don't have access to the spec and you're relying on my best guesses in a lot of places. But it seems to behave roughly as expected on the one test machine I have here, so time to get some wider testing?

comment count unavailable comments

11 March 2020

Antoine Beaupr : Font changes

I have worked a bit on the fonts I use recently. From the main font I use every day in my text editor and terminals to this very website, I did a major and (hopefully) thoughtful overhaul of my typography, in the hope of making things easier to use and, to be honest, just prettier.

Editor and Terminal: Fira mono This all started when I found out about the Jetbrains Mono font. I found the idea of ligatures fascinating: the result is truly beautiful. So I do what I often do (sometimes to the despair of some fellow Debian members) and filed a RFP to document my research. As it turns out, Jetbrains Mono is not free enough to be packaged in Debian, because it requires proprietary tools to build. I nevertheless figured I could try another font so I looked at other monospace alternatives. I found the following packages in debian: Those are also "programmer fonts" that caught my interest but somehow didn't land in Debian yet: Because Fira code had ligatures, i ended up giving it a shot. I really like the originality of the font. See, for example, how the @ sign looks when compared to my previous font, Liberation Mono:
A dark terminal window showing the prompt '[SIGPIPE]anarcat@angela:~(master)$' in the Liberation Mono font Liberation Mono
A dark terminal window showing the prompt '[SIGPIPE]anarcat@angela:~(master)$' in the Fira mono font Fira Mono
Interestingly, a colleague (thanks ahf!) pointed me to the Practical Typography post "Ligatures in programming fonts: hell no", which makes the very convincing argument that ligatures are downright dangerous to programming environment. In my experiences with the fonts, it was also not always giving the result I would expect. I also remembered that the Emacs Haskell mode would have this tendency of inserting crazy syntactic sugar like this in source code without being asked, which I found extremely frustrating. Besides, Emacs doesn't support ligatures, unless you count such horrendous hacks which hack at the display time. That's because Emacs' display layer is not based on a modern rendering library like Pango but some scary legacy code that very few people understand. On top of the complexity of the codebase, there is also resistance in including a modern font. So I ended up using Fira mono everywhere I use fixed-width fonts, even though it's not packaged in Debian. That involves the following configuration in my .Xresources (no, I haven't switched to Wayland):
! font settings
Emacs*font: Fira mono
rofi*font: Fira mono 12
! Symbola is to get emojis to display correctly, apt install fonts-symbola
!URxvt*font: xft:Monospace,xft:Symbola
URxvt*font: xft:Fira mono,xft:Symbola
I also dropped this script in ~/.local/share/fonts/, for lack of a better way, to download all the fonts I'm interested in:
set -e
set -u
wget --no-clobber --continue
unzip -u
wget --no-clobber --continue
unzip -u
wget --no-clobber --continue
unzip -u
wget --no-clobber --continue
unzip -u
wget --no-clobber --continue -O   true
unzip -u
Update: I forgot to mention one motivation behind this was to work around a change in the freetype interpreter, discussed in bug 866685 and my upgrades documentation.

Website: Charter That "hell no" article got me interested in the Practical Typography web book, which I read over the weekend. It was an eye opener and I realized I had already some of those concepts implanted; in fact it's probably after reading the Typography in ten minutes guide that I ended up trying Fira sans a few years ago. I have removed that font since then, however, after realising it was taking up an order of magnitude more bandwidth space than the actual page content. I really loved the book, so much that I actually bought it. I liked the concept of it, the look, and the fact that it's a living document. There's a lot of typography work I would need to do on this site to catchup with the recommendations from Matthew Butterick. Switching fonts is only one part of this, but it's something I was excited to work on. So I sat down and reviewed the free fonts Butterick recommends and tried out a few. I ended up settling on Charter, a relatively old (in terms of computing) font designed by Matthew Carter (of Verdana fame) in 1987. Charter really looks great and is surprisingly small. While a single version of Fira varies between 96KiB (Fira Sans Condensed) and 308KiB (Fira Sans Medium Italic), Charter is only 28KiB! While it's still about as large as most of my articles, I found it was a better compromise and decided to make the jump. This site is now in Serif, which is a huge change for me. The change was done with the following CSS:
h1, h2, h3, h4, h5, body  
     * Charter: Butterick's favorite, freely available, found on
     * Palatino: Mac OS
     * Palatino Linotype: Windows
     * Noto serif: Android, packaged in Debian
     * Liberation serif: Linux fallback
     * Serif: fallback
    font-family: Charter, Palatino, "Palatino Linotype", "Noto serif", "Liberation serif", serif;
    /* Charter is available from under the liberal Bitstream license */
I have also decided to outline headings by making them slanted, using the beautiful italic version of Charter:
h1, h2, h3, h4, h5  
    font-style: italic;
I also made the point size larger, if the display is large enough:
/* enlarge body point size for charter for larger displays */
@media (min-device-width: 750px)  
        font-size: 20px;
        line-height: 1.3; /* default in FF is ~1.48 */
Modern display (including fonts) have a much higher resolution and the point size on my website was really getting too small to be readable. This, in turn, required a change to the max-width:
    /* about 2.5 alphabets with charter */
    max-width: 35em;
I have kept the footers and "UI" elements in sans-serif though, and kept those aligned on operating system defaults or "system fonts":
/* some hacking at typefaces to get some fresh zest in here
 * fallbacks from:
.navbar, .footer  
     * Avenir: Mac, quite different but should still be pretty
     * Gill sans: Mac, Windows, somewhat similar to Avenir (indulge me)
     * Noto sans: Android, packaged in Debian
     * Open sans: Texlive extras AKA Linux, packaged in Debian
     * Fira sans: Mozilla's Firefox OS
     * Liberation sans: Linux fallback
     * Helvetica: general fallback
     * Sans-serif: fallback
    font-family: Avenir, "Gill sans", "Noto sans", "Open sans", "Fira sans", "Liberation sans", Helvetica, sans-serif;
    /* Fira is available from under the SIL Open Font License */
    /* alternatively, just use system fonts for "controls" instead:
    font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Roboto", "Oxygen", "Ubuntu", "Cantarell", "Fira Sans", "Droid Sans", "Helvetica Neue", sans-serif;
    gitlab uses: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, "Noto Sans", Ubuntu, Cantarell, "Helvetica Neue", sans-serif, "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Noto Color Emoji"
I've only done some preliminary testing of how this will look like. Although I tested on a few devices (my phone, e-book tablet, an iPad, and of course my laptop), I fully expect things to break on your device. Do let me know if things look better or worse. For future comparison, my site is well indexed in the Internet Wayback Machine and can be used to look at the site before the change. For example, compare the previous article here with its earlier style. The changes to the theme are of course available in my custom ikiwiki bootstrap theme (see in particular commits 0bca0fb7 and d1901fb8), as usual. Enjoy, and let me know what you think! PS: I considered just setting the Charter font in CSS and not adding it as a @font-face. I'm still considering that option and might do so if the performance cost is too big. The Fira mono font is actually set like this for the preformatted sections of the site, but because it's more common (and it's too big) I haven't added it as a font-face. You might want to download the font locally to benefit from the full experience as well. PPS: As it turns out, an earlier version of this post featured exactly that: a non-webfont version of Charter, which works fine if you have a good Charter font available. But it looks absolutely terrible if, like many Linux users, you have the nasty bitmap font shipped with xfonts-100dpi and xfonts-75dpi. So I fixed the webfont and it's unlikely this site will be able to load reasonably well in Linux until those packages are removed or bitmap font rendering is disabled.

6 March 2020

Reproducible Builds: Reproducible Builds in February 2020

Welcome to the February 2020 report from the Reproducible Builds project. One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. However, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into ostensibly secure software during the various compilation and distribution processes. The motivation behind the reproducible builds effort is to provide the ability to demonstrate these binaries originated from a particular, trusted, source release: if identical results are generated from a given source in all circumstances, reproducible builds provides the means for multiple third-parties to reach a consensus on whether a build was compromised via distributed checksum validation or some other scheme. In this month s report, we cover:

If you are interested in contributing to the project, please visit our Contribute page on our website.

Media coverage & upstream news Omar Navarro Leija, a PhD student at the University Of Pennsylvania, published a paper entitled Reproducible Containers that describes in detail the workings of a new user-space container tool called DetTrace:
All computation that occurs inside a DetTrace container is a pure function of the initial filesystem state of the container. Reproducible containers can be used for a variety of purposes, including replication for fault-tolerance, reproducible software builds and reproducible data analytics. We use DetTrace to achieve, in an automatic fashion, reproducibility for 12,130 Debian package builds, containing over 800 million lines of code, as well as bioinformatics and machine learning workflows.
There was also considerable discussion on our mailing list regarding this research and a presentation based on the paper will occur at the ASPLOS 2020 conference between March 16th 20th in Lausanne, Switzerland. The many virtues of Reproducible Builds were touted as benefits for software compliance in a talk at FOSDEM 2020, debating whether the Careful Inventory of Licensing Bill of Materials Have Impact of FOSS License Compliance which pitted Jeff McAffer and Carol Smith against Bradley Kuhn and Max Sills. (~47 minutes in). Nobuyoshi Nakada updated the canonical implementation of the Ruby programming language a change such that filesystem globs (ie. calls to list the contents of filesystem directories) will henceforth be sorted in ascending order. Without this change, the underlying nondeterministic ordering of the filesystem is exposed to the language which often results in an unreproducible build. Vagrant Cascadian reported on our mailing list regarding a quick reproducible test for the GNU Guix distribution, which resulted in 81.9% of packages registering as reproducible in his installation:
$ guix challenge --verbose --diff=diffoscope ...
2,463 store items were analyzed:
  - 2,016 (81.9%) were identical
  - 37 (1.5%) differed
  - 410 (16.6%) were inconclusive
Jeremiah Orians announced on our mailing list the release of a number of tools related to cross-compilation such as M2-Planet and mescc-tools-seed. This project attemps a full bootstrap of a cross-platform compiler for the C programming language (written in C itself) from hex, the ultimate goal being able to demonstrate fully-bootstrapped compiler from hex to the GCC GNU Compiler Collection. This has many implications in and around Ken Thompson s Trusting Trust attack outlined in Thompson s 1983 Turing Award Lecture. Twitter user @TheYoctoJester posted an executive summary of reproducible builds in the Yocto Project: Finally, Reddit user tofflos posted to the /r/Java subreddit asking about how to achieve reproducible builds with Maven and Chris Lamb noticed that the Linux kernel documentation about reproducible builds of it is available on the homepages in an attractive HTML format.

Distribution work

Debian Chris Lamb created a merge request for the core debian-installer package to allow all arguments and options from sources.list files (such as [check-valid-until=no] , etc.) in order that we can test the reproducibility of the installer images on the Reproducible Builds own testing infrastructure. (#13) Thorsten Glaser followed-up to a bug filed against the dpkg-source component that was originally filed in late 2015 that claims that the build tool does not respect permissions when unpacking tarballs if the umask is set to 0002. Matthew Garrett posted to the debian-devel mailing list on the topic of Producing verifiable initramfs images as part of a wider conversation on being able to trust the entire software stack on our computers. 59 reviews of Debian packages were added, 30 were updated and 42 were removed this month adding to our knowledge about identified issues. Many issue types were noticed and categorised by Chris Lamb, including:

openSUSE In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update as well as provided the following patches:

Software development

diffoscope diffoscope is our in-depth and content-aware diff-like utility that can locate and diagnose reproducibility issues. It is run countless times a day on our testing infrastructure and is essential for identifying fixes and causes of nondeterministic behaviour. Chris Lamb made the following changes this month, including uploading version 137 to Debian:
  • The sng image utility appears to return with an exit code of 1 if there are even minor errors in the file. (#950806)
  • Also extract classes2.dex, classes3.dex from .apk files extracted by apktool. (#88)
  • No need to use str.format if we are just returning the string. [ ]
  • Add generalised support for ignoring returncodes [ ] and move special-casing of returncodes in zip to use Command.VALID_RETURNCODES. [ ]

Other tools disorderfs is our FUSE-based filesystem that deliberately introduces non-determinism into directory system calls in order to flush out reproducibility issues. This month, Vagrant Cascadian updated the Vcs-Git to specify the debian packaging branch. [ ] reprotest is our end-user tool to build same source code twice in widely differing environments and then checks the binaries produced by each build for any differences. This month, versions 0.7.13 and 0.7.14 were uploaded to Debian unstable by Holger Levsen after Vagrant Cascadian added support for GNU Guix [ ].

Project documentation & website There was more work performed on our documentation and website this month. Bernhard M. Wiedemann added a Java Gradle Build Tool snippet to the SOURCE_DATE_EPOCH documentation [ ] and normalised various terms to unreproducible [ ]. Chris Lamb added a example [ ] and improved the documentation for the CMake [ ] to the SOURCE_DATE_EPOCH documentation, replaced anyone can with anyone may as, well, not everyone has the resources, skills, time or funding to actually do what it refers to [ ] and improved the pre-processing for our report generation [ ][ ][ ][ ] etc. In addition, Holger Levsen updated our news page to improve the list of reports [ ], added an explicit mention of the weekly news time span [ ] and reverted sorting of news entries to have latest on top [ ] and Mattia Rizzolo added Codethink as a non-fiscal sponsor [ ] and lastly Tianon Gravi added a Docker Images link underneath the Debian project on our Projects page [ ].

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including: Vagrant Cascadian submitted patches via the Debian bug tracking system targeting the packages the Civil Infrastructure Platform has identified via the CIP and CIP build depends package sets:

Testing framework We operate a fully-featured and comprehensive Jenkins-based testing framework that powers This month, the following changes were made by Holger Levsen: In addition, Mattia Rizzolo added an Apache web server redirect for [ ] and reverted the reshuffling of arm64 architecture builders [ ]. The usual build node maintenance was performed by Holger Levsen, Mattia Rizzolo [ ][ ] and Vagrant Cascadian.

Getting in touch If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

This month s report was written by Bernhard M. Wiedemann, Chris Lamb and Holger Levsen. It was subsequently reviewed by a bunch of Reproducible Builds folks on IRC and the mailing list.

4 November 2017

Dirk Eddelbuettel: tint 0.0.4: Small enhancements

A maintenance release of the tint package arrived on CRAN earlier today. Its name expands from tint is not tufte as the package offers a fresher take on the Tufte-style for html and pdf presentations. A screenshot of the pdf variant is below.
This release brings some minor enhancements and polish, mostly learned from having done the related pinp (two-column vignette in the PNAS style) and linl (LaTeX letter) RMarkdown-wrapper packages; see below for details from the NEWS.Rd file.

Changes in tint version 0.0.4 (2017-11-02)
  • Skeleton files are also installed as vignettes (#20).
  • A reference to the Tufte source file now points to tint (Ben Marwick in #19, later extended to other Rmd files).
  • Several spelling and grammar errors were corrected too (#13 and #16 by R. Mark Sharp and Matthew Henderson)

Courtesy of CRANberries, there is a comparison to the previous release. More information is on the tint page. For questions or comments use the issue tracker off the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.