if (name[0] == '.') continue;
if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0) continue;
$HOME/lib, which is what we did in Plan 9, which had no dot files. Lessons can be learned.)
Welcome to the October 2021 report from the Reproducible Builds project!
This month Samanta Navarro posted to the
oss-security mailing list on a novel category of exploit in the
.tar archive format, where a single
.tar file contains different contents depending on the tar utility being used. Naturally, this has consequences for reproducible builds as Samanta goes on to explain:
Arch Linux uses libarchive (bsdtar) in its build environment. The default tar program installed is GNU tar. It is possible to create a source distribution which leads to different files seen by the build environment than compared to a careful reviewer and other Linux distributions.

Samanta notes that addressing the tar utilities themselves will not be a sufficient fix:
I have submitted bug reports and patches to some projects but eventually I had to conclude that the problem itself cannot be fixed by these implementations alone. The best choice for these tools would be to only allow archives which are fully compatible to standards but this in turn would render a lot of archives broken.

Reproducible builds, with its twin ideas of reaching consensus on the build outputs as well as precisely recording and describing the build environment, would help address this problem at a higher level.
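To make that consensus idea concrete, here is a minimal, illustrative sketch (the file names are invented for the example) of how two independent rebuilders could compare checksums of the artifact they each produced:

# Toy sketch of "distributed checksum validation": artifacts built
# independently by two parties are compared by their SHA-256 digests.
# File names below are placeholders, not part of any real workflow.
import hashlib

def sha256(path):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

artifacts = ["artifact-from-builder-a.tar", "artifact-from-builder-b.tar"]
digests = {name: sha256(name) for name in artifacts}

if len(set(digests.values())) == 1:
    print("builds agree:", next(iter(digests.values())))
else:
    print("builds differ, investigate:", digests)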
Reproducibility, repeatability and traceability of builds, drawing heavily on best-practices championed by the Reproducible Builds project.
I got a PR in my repository a few days ago leading back to a team trying to make it easier for packages to be reproducible from source.
stage0-posix, part of a broader effort to provide an ultra-minimal bootstrap seed to increase trust in our software stack.
The current goal is to be able to build the Qubes OS Debian templates solely from packages that can be built reproducibly. Templates in Qubes OS are VM images that can be used to start an application qube quickly based on the template. The qube will have read-only access to the root filesystem of the template, so that the same root filesystem can be shared with multiple application qubes. There are official templates for several variants of both Fedora and Debian, as well as community maintained templates for several other distributions.

You can view the whole article on LWN, and Frédéric also published a lengthy summary about their work on reproducible builds in Qubes as well for those wishing to learn more.
.pyc files generated with newer versions of Python. [ ]
ffmpeg [ ][ ][ ] and added Arch Linux as a Continuous Integration (CI) test target. [ ] and Vagrant Cascadian updated the testsuite to skip Python bytecode comparisons when
file(1) is older than 5.39. [ ] as well as added external tool references for the Guix distribution for
ppudump. [ ][ ]. Vagrant Cascadian also updated the diffoscope package in GNU Guix [ ][ ]. Lastly, Guangyuan Yang updated the FreeBSD package name on the website [ ], Mattia Rizzolo made a change to override a new Lintian warning due to the new test files [ ], Roland Clobus added support to detect and log if the
GNU_BUILD_ID field in an ELF binary has been modified [ ], Sandro Jäckel updated a number of helpful links on the website [ ] and Sergei Trofimovich made the uImage test output support
file() version 5.41 [ ].
0.7.18 was uploaded to Debian unstable by Holger Levsen, which also included a change by Holger to clarify that Python 3.9 is used nowadays [ ], but it also included two changes by Vasyl Gello to implement realistic CPU architecture shuffling [ ] and to log the selected variations when the verbosity is configured at a sufficiently high level [ ]. Finally, Vagrant Cascadian updated reprotest to version 0.7.18 in GNU Guix.
sphinx-gallery (re-opened with extensive updates).
apt-get update with the
-q argument to get more decent logs. [ ]
snapshot.reproducible-builds.org. [ ]
libarchive-tools package (instead of
bsdtar) when updating Jenkins nodes. [ ]
NODE_NAME is not a fully-qualified domain name (FQDN). [ ]
postgresql_autodoc as it is not available in bullseye. [ ]
schroot underlays for deletion that are over a month old. [ ][ ]
/proc if it's actually mounted. [ ]
db_backup task to its own Jenkins job. [ ]
reproducible_build.sh script [ ].
In place of colonialism, as the main instrument of imperialism, we have today neo-colonialism ... [which] like colonialism, is an attempt to export the social conflicts of the capitalist countries. ... The result of neo-colonialism is that foreign capital is used for the exploitation rather than for the development of the less developed parts of the world. Investment, under neo-colonialism, increases, rather than decreases, the gap between the rich and the poor countries of the world.

So basically, if colonialism is Europeans bringing genocide, war, and their religion to Africa, Asia, and the Americas, neo-colonialism is the Americans (note the "n") bringing capitalism to the world. Before we see how this applies to the Internet, we must therefore make a detour into US history. This matters, because anyone would be hard-pressed to decouple neo-colonialism from the empire under which it evolves, and here we can only name the United States of America.
it was intended to be an expression of the American mind, and to give to that expression the proper tone and spirit called for by the occasion

In that aging document, we find the following pearl:
We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness.

As a founding document, the Declaration still has an impact in the sense that the above quote has been called an:
"immortal declaration", and "perhaps [the] single phrase" of the American Revolutionary period with the greatest "continuing importance." (Wikipedia)Let's read that "immortal declaration" again: "all men are created equal". "Men", in that context, is limited to a certain number of people, namely "property-owning or tax-paying white males, or about 6% of the population". Back when this was written, women didn't have the right to vote, and slavery was legal. Jefferson himself owned hundreds of slaves. The declaration was aimed at the King and was a list of grievances. A concern of the colonists was that the King:
has excited domestic insurrections amongst us, and has endeavoured to bring on the inhabitants of our frontiers, the merciless Indian Savages whose known rule of warfare, is an undistinguished destruction of all ages, sexes and conditions.

This is a clear mark of the frontier myth which paved the way for the US to exterminate and colonize the territory some now call the United States of America. The declaration of independence is obviously a colonial document, having been written by colonists. None of this is particularly surprising, historically, but I figured it serves as a good reminder of where the Internet is coming from, since it was born in the US.
The declaration has been criticized for internal inconsistencies. The declaration's assertion that 'cyberspace' is a place removed from the physical world has also been challenged by people who point to the fact that the Internet is always linked to its underlying geography.

And indeed, the Internet is definitely a physical object. First controlled and severely restricted by "telcos" like AT&T, it was somewhat "liberated" from that monopoly in 1982 when an anti-trust lawsuit broke up the monopoly, a key historical event that, one could argue, made the Internet possible. (From there on, "backbone" providers could start competing and emerge, and eventually coalesce into new monopolies: Google has a monopoly on search and advertisement, Facebook on communications for a few generations, Amazon on storage and computing, Microsoft on hardware, etc. Even AT&T is now pretty much as consolidated as it was before.) The point is: all those companies have gigantic data centers and intercontinental cables. And those are definitely prioritizing the western world, the heart of the empire. Take for example Google's latest 3,900 mile undersea cable: it does not connect Argentina to South Africa or New Zealand, it connects the US to UK and Spain. Hardly a revolutionary prospect.
Do not think that you can build it, as though it were a public construction project. You cannot. It is an act of nature and it grows itself through our collective actions.

In Barlow's mind, the "public" is bad, and private is good, natural. Or, in other words, a "public construction project" is unnatural. And indeed, the modern "nature" of development is private: most of the Internet is now privately owned and operated. I must admit that, as an anarchist, I loved that sentence when I read it. I was rooting for "us", the underdogs, the revolutionaries. And, in a way, I still do: I am on the board of Koumbit and work for a non-profit that has pivoted towards censorship and surveillance evasion. Yet I cannot help but think that, as a whole, we have failed to establish that independence and put too much trust in private companies. It is obvious in retrospect, but it was not, 30 years ago. Now, the infrastructure of the Internet has zero accountability to traditional political entities supposedly representing the people, or even its users. The situation is actually worse than when the US was founded (e.g. "6% of the population can vote"), because the owners of the tech giants are only a handful of people who can override any decision. There's only one Amazon CEO, he's called Jeff Bezos, and he has total control. (Update: Bezos actually ceded the CEO role to Andy Jassy, AWS and Amazon music founder, while remaining executive chairman. I would argue that, as the founder and the richest man on earth, he still has strong control over Amazon.)
We are forming our own Social Contract.

I remember the early days, back when "netiquette" was a word, when it did feel like we had some sort of a contract. Not written in standards of course -- or barely (see RFC1855) -- but as a tacit agreement. How wrong we were. One just needs to look at Facebook to see how problematic that idea is on a global network. Facebook is the quintessential "hacker" ideology put in practice. Mark Zuckerberg explicitly refused to be "arbiter of truth", which implicitly means he will let lies take over its platforms. He also sees Facebook as a place where everyone is equal, something that echoes the Declaration:
We are creating a world that all may enter without privilege or prejudice accorded by race, economic power, military force, or station of birth.

(We note, in passing, the omission of gender in that list, also mirroring the infamous "All men are created equal" claim of the US declaration.) As the Wall Street Journal's (WSJ) Facebook files later showed, both of those "contracts" have serious limitations inside Facebook. There are VIPs who systematically bypass moderation systems including fascists and rapists. Drug cartels and human traffickers thrive on the platform. Even when Zuckerberg himself tried to tame the platform -- to get people vaccinated or to make it healthier -- he failed: "vaxxer" conspiracies multiplied and Facebook got angrier. This is because the "social contract" behind Facebook and those large companies is a lie: their concern is profit and that means advertising, "engagement" with the platform, which causes increased anxiety and depression in teens, for example. Facebook's response to this is that they are working really hard on moderation. But the truth is that even that system is severely skewed. The WSJ showed that Facebook has translators for only 50 languages. It's surprisingly hard to count human languages, but estimates put the number of distinct languages between 2500 and 7000. So while 50 languages seems big at first, it's actually a tiny fraction of the human population using Facebook. Taking the first 50 of the Wikipedia list of languages by native speakers we omit languages like Dutch (52), Greek (74), and Hungarian (78), and that's just a few random picks from Europe. As an example, Facebook has trouble moderating even a major language like Arabic. It censored content from legitimate Arab news sources when they mentioned the word al-Aqsa because Facebook associates it with the al-Aqsa Martyrs' Brigades when they were talking about the Al-Aqsa Mosque... This bias against Arabs also shows how Facebook reproduces the American colonizer politics. The WSJ also pointed out that Facebook spends only 13% of its moderation efforts outside of the US, even if that represents 90% of its users. Facebook spends three times more moderating on "brand safety", which shows its priority is not the safety of its users, but of the advertisers.
Many African people have gained access to these technologies but not the freedom to develop content such as web pages or social media platforms in their own way. Digital natives have much more power and therefore use this to create their own space with their own norms, shaping their online world according to their own outlook.

But the digital divide is certainly not the worst problem we have to deal with on the Internet today. Going back to the Declaration, we originally believed we were creating an entirely new world:
This governance will arise according to the conditions of our world, not yours. Our world is different.

How I dearly wished that was true. Unfortunately, the Internet is not that different from the offline world. Or, to be more accurate, the values we have embedded in the Internet, particularly of free speech absolutism, sexism, corporatism, and exploitation, are now exploding outside of the Internet, into the "real" world. The Internet was built with free software which, fundamentally, was based on quasi-volunteer labour of an elite force of white men with obviously too much time on their hands (and also: no children). The mythical writing of GCC and Emacs by Richard Stallman is a good example of this, but the entirety of the Internet now seems to be running on random bits and pieces built by hit-and-run programmers working in their copious free time. Whenever any of those fails, it can compromise or bring down entire systems. (Heck, I wrote this article on my day off...) This model of what is fundamentally "cheap labour" is spreading out from the Internet. Delivery workers are being exploited to the bone by apps like Uber -- although it should be noted that workers organise and fight back. Amazon workers are similarly exploited beyond belief, forbidden to take breaks until they pee in bottles, with ambulances nearby to carry out the bodies. During the peak of the pandemic, workers were being dangerously exposed to the virus in warehouses. All this while Amazon is basically taking over the entire economy. The Declaration culminates with this prophecy:
We will spread ourselves across the Planet so that no one can arrest our thoughts.

This prediction, which first felt revolutionary, is now chilling.
We will create a civilization of the Mind in Cyberspace. May it be more humane and fair than the world your governments have made before.

That is still inspiring to me. But if we want to make "cyberspace" more humane, we need to decolonize it. Work on cyberpeace instead of cyberwar. Establish clear codes of conduct, discuss ethics, and question your own privileges, biases, and culture. For me the first step in decolonizing my own mind is writing this article. Breaking up tech monopolies might be an important step, but it won't be enough: we have to do a culture shift as well, and that's the hard part.
I'm an optimist. In order to be libertarian, you have to be an optimist. You have to have a benign view of human nature, to believe that human beings left to their own devices are basically good. But I'm not so sure about human institutions, and I think the real point of argument here is whether or not large corporations are human institutions or some other entity we need to be thinking about curtailing. Most libertarians are worried about government but not worried about business. I think we need to be worrying about business in exactly the same way we are worrying about government.

And, in a sense, it was a little naive to expect Barlow to not be a colonist. Barlow is, among many things, a cattle rancher who grew up on a colonial ranch in Wyoming. The ranch was founded in 1907 by his great uncle, 17 years after the state joined the Union, and only a generation or two after the Powder River War (1866-1868) and Black Hills War (1876-1877) during which the US took over lands occupied by Lakota, Cheyenne, Arapaho, and other native American nations, in some of the last major First Nations Wars.
<title> tag on the article is actually "Facebook the Colonial Empire" which I also find appropriate.) The article is worth reading in full, but I loved this quote so much that I couldn't resist reproducing it here:
Representations of colonialism have long been present in digital landscapes. ("Even Super Mario Brothers," the video game designer Steven Fox told me last year. "You run through the landscape, stomp on everything, and raise your flag at the end.") But web-based colonialism is not an abstraction. The online forces that shape a new kind of imperialism go beyond Facebook.

It goes on:
Consider, for example, digitization projects that focus primarily on English-language literature. If the web is meant to be humanity's new Library of Alexandria, a living repository for all of humanity's knowledge, this is a problem. So is the fact that the vast majority of Wikipedia pages are about a relatively tiny square of the planet. For instance, 14 percent of the world's population lives in Africa, but less than 3 percent of the world's geotagged Wikipedia articles originate there, according to a 2014 Oxford Internet Institute report.

And they introduce another definition of Neo-colonialism, while warning about abusing the word like I am sort of doing here:
"I'm loath to toss around words like colonialism but it's hard to ignore the family resemblances and recognizable DNA, to wit," said Deepika Bahri, an English professor at Emory University who focuses on postcolonial studies. In an email, Bahri summed up those similarities in list form:

1. ride in like the savior
2. bandy about words like equality, democracy, basic rights
3. mask the long-term profit motive (see 2 above)
4. justify the logic of partial dissemination as better than nothing
5. partner with local elites and vested interests
6. accuse the critics of ingratitude

"In the end," she told me, "if it isn't a duck, it shouldn't quack like a duck."

Another good read is the classic Code and other laws of cyberspace (1999, free PDF) which is also critical of Barlow's Declaration. In "Code is law", Lawrence Lessig argues that:

computer code (or "West Coast Code", referring to Silicon Valley) regulates conduct in much the same way that legal code (or "East Coast Code", referring to Washington, D.C.) does (Wikipedia)

And now it feels like the west coast has won over the east coast, or maybe it recolonized it. In any case, the Internet now christens emperors.
FARM_FINGERPRINT digest. This release adds a
#define which was needed on everybody's favourite CRAN platform to not attempt to include a missing header
endian.h. With this added
#define all is well as we can already tell from looking at the CRAN status where the three machines maintained by you-may-know-who have already built the package. The others will follow over the next few days. I also tweeted about the upload with a screenshot demonstrating an eight minute passage from upload to acceptance with the added
#ThankYouCRAN tag to say thanks for very smooth and fully automated processing at their end. The very brief NEWS entry follows:
If you like this or other open-source work I do, you can now sponsor me at GitHub.
Changes in version 0.0.2 (2021-08-02)
- On SunOS, set endianness to not error on
- Add badges and installation notes to README as package is on CRAN
FARM_FINGERPRINT. The package was prepared and uploaded yesterday afternoon, and to my surprise already on CRAN this (early) morning when I got up. So here is another
#ThankYouCRAN for very smooth operations. The very brief NEWS entry follows:
If you like this or other open-source work I do, you can now sponsor me at GitHub.
Changes in version 0.0.1 (2021-07-25)
- Initial version and CRAN upload
media-types package, I wanted to know which packages were using the media type
application/x-xcf, which apparently is not correct (#991158). The https://codesearch.debian.net site gives the answer. (Thanks!) Moreover, one can create a user key, for command-line remote access; here is an example below (the file dcs-apikeyHeader-plessy.txt contains x-dcs-apikey: followed by my access key).

curl -X GET "https://codesearch.debian.net/api/v1/searchperpackage?query=application/x-xcf&match_mode=literal" -H @dcs-apikeyHeader-plessy.txt > result.json

The result is serialised in JSON. Here is how I transformed it to make a list of email addresses that I could easily paste in

cat result.json | jq --raw-output '.[]."package"' | dd-list --stdin | sed -e '/^ /d' -e '/^$/d' -e 's/$/,/' -e 's/^/ /'
Our system has detected that this message is 421-4.7.0 suspicious due to the nature of the content and/or the links within.

The quantity of email that accounts for just three percent of mail to Google is high, and caused all kinds of monitoring alarms to go off, putting us into a bit of panic. Eventually we realized all but one of the email messages had bit.ly links. I'm still not sure whether this issue was caused by a weird and coincidental spike in users sending bit.ly links to Google. Or whether some subtle change in the Google algorithm is responsible. Or some change in our IP address reputation placed greater emphasis on bit.ly links. In the end it doesn't really matter - the real point is that until we disrupt this growing monopoly we will all be at the mercy of Google and their algorithms for email deliverability (and much, much more).
Interestingly as well, while in the vaccine issue, Brazil Anvisa doesn't know what they are doing or the regulator just isn't knowledgeable etc. (statements by various people in GOI, when it comes to testing kits, the same is an approver.)
curl \
  -H "x-dcs-apikey: $(cat dcs-apikey-stapelberg.txt)" \
  -X GET \
  "https://codesearch.debian.net/api/v1/search?query=i3Font&match_mode=regexp"
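For a quick scripted check without the Go client, roughly the same request can be made from Python; this is only a sketch, which assumes the endpoint returns a JSON array of match objects with a path field (as the Go example below suggests) and that the API key lives in dcs-apikey-stapelberg.txt:

# Hypothetical Python counterpart of the curl call above.
import json
import urllib.request

with open("dcs-apikey-stapelberg.txt") as f:
    api_key = f.read().strip()

url = ("https://codesearch.debian.net/api/v1/search"
       "?query=i3Font&match_mode=regexp")
req = urllib.request.Request(url, headers={"x-dcs-apikey": api_key})
with urllib.request.urlopen(req) as resp:
    results = json.load(resp)

for match in results[:10]:          # show the first few matching file paths
    print(match.get("path"))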
func burndown() error {
	cfg := openapiclient.NewConfiguration()
	cfg.AddDefaultHeader("x-dcs-apikey", apiKey)
	client := openapiclient.NewAPIClient(cfg)
	ctx := context.Background()
	// Search through the full Debian Code Search corpus, blocking until all
	// results are available:
	results, _, err := client.SearchApi.Search(ctx, "fmt.Sprint(err)", &openapiclient.SearchApiSearchOpts{
		// Literal searches are faster and do not require escaping special
		// characters, regular expression searches are more powerful.
		MatchMode: optional.NewString("literal"),
	})
	if err != nil {
		return err
	}

	// Print to stdout a CSV file with the path and number of occurrences:
	wr := csv.NewWriter(os.Stdout)
	header := []string{"path", "number of occurrences"}
	if err := wr.Write(header); err != nil {
		return err
	}
	occurrences := make(map[string]int)
	for _, result := range results {
		occurrences[result.Path]++
	}
	for _, result := range results {
		o, ok := occurrences[result.Path]
		if !ok {
			continue
		}
		// Print one CSV record per path:
		delete(occurrences, result.Path)
		record := []string{result.Path, strconv.Itoa(o)}
		if err := wr.Write(record); err != nil {
			return err
		}
	}
	wr.Flush()
	return wr.Error()
}
| Debian Code Search CLI tool | Updated to OpenAPI |
| gnome-codesearch | makes no API queries |
QSoas> cd
QSoas> l Km-CODH-IV.dat
QSoas> R

First fit

Then, to fit the above equation to the data, the simplest way is to take advantage of the time-dependent parameters features of QSoas. Run simply:
QSoas> fit-arb im/(1+km/s) /with=s:1,exp

This simply launches the fit interface to fit the exact equations above. The im/(1+km/s) is simply the translation of the Michaelis-Menten equation above, and the s is the result of the sum of 1 exponential, like for the definition of s above. Then, load the Km-CODH-IV.params parameter file (using the "Parameters.../Load from file" action at the bottom, or the Ctrl+L keyboard shortcut). Your window should now look like this:
Ctrl+F).

Including an offset

The fit is not bad, but not perfect. In particular, it is easy to see why: the current predicted by the fit goes to 0 at large times, but the actual current is below 0. We need therefore to include an offset to take this into consideration. Close the fit window, and re-run a fit, but now with this command:
QSoas> fit-arb im/(1+km/s)+io /with=s:1,exp

Notice the +io bit that corresponds to the addition of an offset current. Load again the base parameters, run the fit again... Your fit window should now look like:
- im is \(i_m\), the maximal current, around 120 nA (which matches the magnitude of the original current).
- io is the offset current, around -3 nA.
- km is the \(K_m\), expressed in the same units as s_1, the first "injected" value of s (we used 50 because the injection is 50 µM CO). That means the value of \(K_m\) determined by this fit is around 9 nM! (but see below).
- s_t_1 is the time of the injection of CO (it was in the parameter files you loaded).
- s_tau_1 is the time constant of departure of CO, noted \(\tau\) in the equations above.
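If you want to double-check the shape of this model outside QSoas, here is a rough Python sketch of the same fit; the data file name, column layout and starting guesses are assumptions for illustration, not part of the original tutorial:

# i(t) = im/(1 + km/s(t)) + io, with s(t) a single exponential decay of CO
# starting at the injection time s_t_1 (the "sum of 1 exponential" above).
import numpy as np
from scipy.optimize import curve_fit

def model(t, im, km, io, s_1, s_t_1, s_tau_1):
    # Substrate concentration: zero before the injection, then an exponential
    # decay from s_1 with time constant s_tau_1.
    s = np.where(t < s_t_1, 0.0, s_1 * np.exp(-(t - s_t_1) / s_tau_1))
    # Written as im*s/(s+km) so that s = 0 gives zero current rather than a
    # division by zero; algebraically the same as im/(1 + km/s).
    return im * s / (s + km) + io

t, i = np.loadtxt("Km-CODH-IV.dat", unpack=True)   # time and current columns assumed
p0 = [1.0, 1.0, 0.0, 50.0, t.min(), 100.0]         # crude starting guesses, adjust to your units
popt, _ = curve_fit(model, t, i, p0=p0)
print(dict(zip(["im", "km", "io", "s_1", "s_t_1", "s_tau_1"], popt)))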
transport.rb Ruby file that contains the definition of the itrpt function, and re-launch the fit window using:

QSoas> ruby-run transport.rb
QSoas> fit-arb itrprt(s,km,nFAm,nFAmu)+io /with=s:1,exp

Load again the parameter file... but this time you'll have to play a bit more with the starting parameters for QSoas to find the right values when you fit. Here are some tips:
io, no parameter should be negative;
gh cli tool, and this is all over HN, it might make sense to write a note about my clumsy aliases and shell functions I cobbled together in the past month. Background story is that my dayjob moved to GitHub coming from Bitbucket. From my point of view the WebUI for Bitbucket is mediocre, but the one at GitHub is just awful and painful to use, especially for PR processing. So I longed for the terminal and ended up with gh and wtfutil as a dashboard. The setup we have is painful on its own, with several orgs and repos which are more like monorepos covering several corners of infrastructure, and some which are very focused on a single component. All workflows are anti GitHub workflows, so you must have permission on the repo, create a branch in that repo as a feature branch, and open a PR for the merge back into master.

gh functions and aliases
# setup a token with perms to everything, dealing with SAML is a PITA
export GITHUB_TOKEN="c0ffee4711"
# I use a light theme on my terminal, so adjust the gh theme
export GLAMOUR_STYLE="light"

#simple aliases to poke at a PR
alias gha="gh pr review --approve"
alias ghv="gh pr view"
alias ghd="gh pr diff"

### github support functions, most invoked with a PR ID as $1

#primary function to review PRs
function ghs {
    gh pr view ${1}
    gh pr checks ${1}
    gh pr diff ${1}
}

# very custom PR create function relying on ORG and TEAM settings hard coded
# main idea is to create the PR with my team directly assigned as reviewer
function ghc {
    if git status | grep -q 'Untracked'; then
        echo "ERROR: untracked files in branch"
        git status
        return 1
    fi
    git push --set-upstream origin HEAD
    gh pr create -f -r "$(git remote -v | grep push | grep -oE 'myorg-[a-z]+')/myteam"
}

# merge a PR and update master if we're not in a different branch
function ghm {
    gh pr merge -d -r ${1}
    if [[ "$(git rev-parse --abbrev-ref HEAD)" == "master" ]]; then
        git pull
    fi
}

# get an overview over the files changed in a PR
function ghf {
    gh pr diff ${1} | diffstat -l
}

# generate a link to a commit in the WebUI to pass on to someone else
# input is a git commit hash
function ghlink {
    local repo="$(git remote -v | grep -E "github.+push" | cut -d':' -f 2 | cut -d'.' -f 1)"
    echo "https://github.com/${repo}/commit/${1}"
}

wtfutil

I have a terminal covering half my screensize with small dashboards listing PRs for the repos I care about. For other repos I reverted back to mail notifications which get sorted and processed from time to time. A sample dashboard config looks like this:
github_admin:
  apiKey: "c0ffee4711"
  baseURL: ""
  customQueries:
    othersPRs:
      title: "Pull Requests"
      filter: "is:open is:pr -author:hoexter -label:dependencies"
  enabled: true
  enableStatus: true
  showOpenReviewRequests: false
  showStats: false
  position:
    top: 0
    left: 0
    height: 3
    width: 1
  refreshInterval: 30
  repositories:
    - "myorg/admin"
  uploadURL: ""
  username: "hoexter"
  type: github
-label:dependencies is used here to filter out dependabot PRs in the dashboard.

Workflow

Look at a PR with
ghv $ID, if it's ok ACK it with
gha $ID. Create a PR from a feature branch with
ghc and later on merge it with
ghm $ID. The
$ID is retrieved from looking at my wtfutil based dashboard.

Security Considerations

The world is full of bad jokes. For the WebUI access I've the full array of pain with SAML auth, which expires too often, and 2nd factor verification for my account backed by a Yubikey. But to work with the CLI you basically need an API token with full access, everything else drives you insane. So I gave in and generated exactly that. End result is that I now have an API token - which is basically a password - which has full power, and is stored in config files and environment variables. So the security features created around the login are all void now. Was that the aim of it after all?
-----BEGIN RSA PRIVATE KEY-----
MGICAQACEQCxjdTmecltJEz2PLMpS4BXAgMBAAECEDKtuwD17gpagnASq1zQTYEC
CQDVTYVsjjF7IQIJANUYZsIjRsR3AgkAkahDUXL0RSECCB78r2SnsJC9AghaOK3F
sKoELg==
-----END RSA PRIVATE KEY-----

Obviously, there's some hidden meaning in there; computers don't encrypt things by shouting "BEGIN RSA PRIVATE KEY!", after all. What is between the
BEGIN and END lines above is, in fact, a base64-encoded DER format ASN.1 structure representing a PKCS#1 private key. In simple terms, it's a list of numbers; very important numbers. The list of numbers is, in order:
e (which is almost always 65,537, for various unimportant reasons);
q. Of the other numbers, most of them are at least about the same size as each of p and
q. So of the total data in an RSA key, less than a quarter of the data is required. Let me show you with the above toy key, by breaking it down piece by piece:
MGI: DER for "this is a sequence"
e is pretty much always 65,537. So I could redact almost all of this key, and still give all the important, private bits of this key. Let me show you:
-----BEGIN RSA PRIVATE KEY-----
..............................................................EC
CQDVTYVsjjF7IQIJANUYZsIjRsR3....................................
........
-----END RSA PRIVATE KEY-----

Now, I doubt that anyone is going to redact a key precisely like this, but then again, this isn't a typical RSA key. They usually look a lot more like this:
-----BEGIN RSA PRIVATE KEY----- MIIEogIBAAKCAQEAu6Inch7+mWtKn+leB9uCG3MaJIxRyvC/5KTz2fR+h+GOhqj4 SZJobiVB4FrE5FgC7AnlH6qeRi9MI0s6dt5UWZ5oNIeWSaOOeNO+EJDUkSVf67wj SNGXlSjGAkPZ0nRJiDjhuPvQmdW53hOaBLk5udxPEQbenpXAzbLJ7wH5ouLQ3nQw HwpwDNQhF6zRO8WoscpDVThOAM+s4PS7EiK8ZR4hu2toon8Ynadlm95V45wR0VlW zywgbkZCKa1IMrDCscB6CglQ10M3Xzya3iTzDtQxYMVqhDrA7uBYRxA0y1sER+Rb yhEh03xz3AWemJVLCQuU06r+FABXJuY/QuAVvQIDAQABAoIBAFqwWVhzWqNUlFEO PoCVvCEAVRZtK+tmyZj9kU87ORz8DCNR8A+/T/JM17ZUqO2lDGSBs9jGYpGRsr8s USm69BIM2ljpX95fyzDjRu5C0jsFUYNi/7rmctmJR4s4uENcKV5J/++k5oI0Jw4L c1ntHNWUgjK8m0UTJIlHbQq0bbAoFEcfdZxd3W+SzRG3jND3gifqKxBG04YDwloy tu+bPV2jEih6p8tykew5OJwtJ3XsSZnqJMwcvDciVbwYNiJ6pUvGq6Z9kumOavm9 XU26m4cWipuK0URWbHWQA7SjbktqEpxsFrn5bYhJ9qXgLUh/I1+WhB2GEf3hQF5A pDTN4oECgYEA7Kp6lE7ugFBDC09sKAhoQWrVSiFpZG4Z1gsL9z5YmZU/vZf0Su0n 9J2/k5B1GghvSwkTqpDZLXgNz8eIX0WCsS1xpzOuORSNvS1DWuzyATIG2cExuRiB jYWIJUeCpa5p2PdlZmBrnD/hJ4oNk4oAVpf+HisfDSN7HBpN+TJfcAUCgYEAyvY7 Y4hQfHIdcfF3A9eeCGazIYbwVyfoGu70S/BZb2NoNEPymqsz7NOfwZQkL4O7R3Wl Rm0vrWT8T5ykEUgT+2ruZVXYSQCKUOl18acbAy0eZ81wGBljZc9VWBrP1rHviVWd OVDRZNjz6nd6ZMrJvxRa24TvxZbJMmO1cgSW1FkCgYAoWBd1WM9HiGclcnCZknVT UYbykCeLO0mkN1Xe2/32kH7BLzox26PIC2wxF5seyPlP7Ugw92hOW/zewsD4nLze v0R0oFa+3EYdTa4BvgqzMXgBfvGfABJ1saG32SzoWYcpuWLLxPwTMsCLIPmXgRr1 qAtl0SwF7Vp7O/C23mNukQKBgB89DOEB7xloWv3Zo27U9f7nB7UmVsGjY8cZdkJl 6O4LB9PbjXCe3ywZWmJqEbO6e83A3sJbNdZjT65VNq9uP50X1T+FmfeKfL99X2jl RnQTsrVZWmJrLfBSnBkmb0zlMDAcHEnhFYmHFuvEnfL7f1fIoz9cU6c+0RLPY/L7 n9dpAoGAXih17mcmtnV+Ce+lBWzGWw9P4kVDSIxzGxd8gprrGKLa3Q9VuOrLdt58 ++UzNUaBN6VYAe4jgxGfZfh+IaSlMouwOjDgE/qzgY8QsjBubzmABR/KWCYiRqkj qpWCgo1FC1Gn94gh/+dW2Q8+NjYtXWNqQcjRP4AKTBnPktEvdMA= -----END RSA PRIVATE KEY-----People typically redact keys by deleting whole lines, and usually replacing them with
[...] and the like. But only about 345 of those 1588 characters (excluding the header and footer) are required to construct the entire key. You can redact about 4/5ths of that giant blob of stuff, and your private parts (or at least, those of your key) are still left uncomfortably exposed.
q could be derived from those three numbers? Let's talk about one of those numbers:
n. This is known as the public modulus (because, along with
e, it is also present in the public key). It is very easy to calculate:
n = p * q. It is also very early in the key (the second number, in fact). Since
n = p * q, it follows that
q = n / p. Thus, as long as the key is intact up to
p, you can derive
q by simple division.
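As a toy illustration of that last step (numbers invented, deliberately tiny), the division really is all there is to it:

# Recovering q when n (the public modulus) and p (one prime) are known.
p = 61
q_secret = 53
n = p * q_secret          # n = p * q is in the public key anyway

q = n // p                # integer division recovers the other prime
assert n % p == 0 and q == q_secret
print(q)                  # 53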
johanfinn/scripts. For a while, his repo contained a script with a poorly-redacted private key. He has since deleted it, by making a new commit, but of course because git never really deletes anything, it's still available. Of course, Mr. Finn may delete the repo, or force-push a new history without that commit, so here is the redacted private key, with a bit of the surrounding shell script, for our illustrative pleasure:
#Add private key to .ssh folder cd /home/johan/.ssh/ echo "-----BEGIN RSA PRIVATE KEY----- MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN KKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKK MIIJKgIBAAKCAgEAxEVih1JGb8gu/Fm4AZh+ZwJw/pjzzliWrg4mICFt1g7SmIE2 TCQMKABdwd11wOFKCPc/UzRH/fHuQcvWrpbOSdqev/zKff9iedKw/YygkMeIRaXB fYELqvUAOJ8PPfDm70st9GJRhjGgo5+L3cJB2gfgeiDNHzaFvapRSU0oMGQX+kI9 ezsjDAn+0Pp+r3h/u1QpLSH4moRFGF4omNydI+3iTGB98/EzuNhRBHRNq4oBV5SG Pq/A1bem2ninnoEaQ+OPESxYzDz3Jy9jV0W/6LvtJ844m+XX69H5fqq5dy55z6DW sGKn78ULPVZPsYH5Y7C+CM6GAn4nYCpau0t52sqsY5epXdeYx4Dc+Wm0CjXrUDEe Egl4loPKDxJkQqQ/MQiz6Le/UK9vEmnWn1TRXK3ekzNV4NgDfJANBQobOpwt8WVB rbsC0ON7n680RQnl7PltK9P1AQW5vHsahkoixk/BhcwhkrkZGyDIl9g8Q/Euyoq3 eivKPLz7/rhDE7C1BzFy7v8AjC3w7i9QeHcWOZFAXo5hiDasIAkljDOsdfD4tP5/ wSO6E6pjL3kJ+RH2FCHd7ciQb+IcuXbku64ln8gab4p8jLa/mcMI+V3eWYnZ82Yu axsa85hAe4wb60cp/rCJo7ihhDTTvGooqtTisOv2nSvCYpcW9qbL6cGjAXECAwEA AQKCAgEAjz6wnWDP5Y9ts2FrqUZ5ooamnzpUXlpLhrbu3m5ncl4ZF5LfH+QDN0Kl KvONmHsUhJynC/vROybSJBU4Fu4bms1DJY3C39h/L7g00qhLG7901pgWMpn3QQtU 4P49qpBii20MGhuTsmQQALtV4kB/vTgYfinoawpo67cdYmk8lqzGzzB/HKxZdNTq s+zOfxRr7PWMo9LyVRuKLjGyYXZJ/coFaobWBi8Y96Rw5NZZRYQQXLIalC/Dhndm AHckpstEtx2i8f6yxEUOgPvV/gD7Akn92RpqOGW0g/kYpXjGqZQy9PVHGy61sInY HSkcOspIkJiS6WyJY9JcvJPM6ns4b84GE9qoUlWVF3RWJk1dqYCw5hz4U8LFyxsF R6WhYiImvjxBLpab55rSqbGkzjI2z+ucDZyl1gqIv9U6qceVsgRyuqdfVN4deU22 LzO5IEDhnGdFqg9KQY7u8zm686Ejs64T1sh0y4GOmGsSg+P6nsqkdlXH8C+Cf03F lqPFg8WQC7ojl/S8dPmkT5tcJh3BPwIWuvbtVjFOGQc8x0lb+NwK8h2Nsn6LNazS 0H90adh/IyYX4sBMokrpxAi+gMAWiyJHIHLeH2itNKtAQd3qQowbrWNswJSgJzsT JuJ7uqRKAFkE6nCeAkuj/6KHHMPsfCAffVdyGaWqhoxmPOrnVgECggEBAOrCCwiC XxwUgjOfOKx68siFJLfHf4vPo42LZOkAQq5aUmcWHbJVXmoxLYSczyAROopY0wd6 Dx8rqnpO7OtZsdJMeBSHbMVKoBZ77hiCQlrljcj12moFaEAButLCdZFsZW4zF/sx kWIAaPH9vc4MvHHyvyNoB3yQRdevu57X7xGf9UxWuPil/jvdbt9toaraUT6rUBWU GYPNKaLFsQzKsFWAzp5RGpASkhuiBJ0Qx3cfLyirjrKqTipe3o3gh/5RSHQ6VAhz gdUG7WszNWk8FDCL6RTWzPOrbUyJo/wz1kblsL3vhV7ldEKFHeEjsDGroW2VUFlS asAHNvM4/uYcOSECggEBANYH0427qZtLVuL97htXW9kCAT75xbMwgRskAH4nJDlZ IggDErmzBhtrHgR+9X09iL47jr7dUcrVNPHzK/WXALFSKzXhkG/yAgmt3r14WgJ6 5y7010LlPFrzaNEyO/S4ISuBLt4cinjJsrFpoo0WI8jXeM5ddG6ncxdurKXMymY7 :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::.:: :::::::::::::::::::::::::::.:::::::::::::::::::::::::::::::::::: LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLlL YYYYYYYYYYYYYYYYYYYYYyYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY gff0GJCOMZ65pMSy3A3cSAtjlKnb4fWzuHD5CFbusN4WhCT/tNxGNSpzvxd8GIDs nY7exs9L230oCCpedVgcbayHCbkChEfoPzL1e1jXjgCwCTgt8GjeEFqc1gXNEaUn O8AJ4VlR8fRszHm6yR0ZUBdY7UJddxQiYOzt0S1RLlECggEAbdcs4mZdqf3OjejJ 06oTPs9NRtAJVZlppSi7pmmAyaNpOuKWMoLPElDAQ3Q7VX26LlExLCZoPOVpdqDH KbdmBEfTR4e11Pn9vYdu9/i6o10U4hpmf4TYKlqk10g1Sj21l8JATj/7Diey8scO sAI1iftSg3aBSj8W7rxCxSezrENzuqw5D95a/he1cMUTB6XuravqZK5O4eR0vrxR AvMzXk5OXrUEALUvt84u6m6XZZ0pq5XZxq74s8p/x1JvTwcpJ3jDKNEixlHfdHEZ ZIu/xpcwD5gRfVGQamdcWvzGHZYLBFO1y5kAtL8kI9tW7WaouWVLmv99AyxdAaCB Y5mBAQKCAQEAzU7AnorPzYndlOzkxRFtp6MGsvRBsvvqPLCyUFEXrHNV872O7tdO GmsMZl+q+TJXw7O54FjJJvqSSS1sk68AGRirHop7VQce8U36BmI2ZX6j2SVAgIkI 9m3btCCt5rfiCatn2+Qg6HECmrCsHw6H0RbwaXS4RZUXD/k4X+sslBitOb7K+Y+N Bacq6QxxjlIqQdKKPs4P2PNHEAey+kEJJGEQ7bTkNxCZ21kgi1Sc5L8U/IGy0BMC PvJxssLdaWILyp3Ws8Q4RAoC5c0ZP0W2j+5NSbi3jsDFi0Y6/2GRdY1HAZX4twem Q0NCedq1JNatP1gsb6bcnVHFDEGsj/35oQKCAQEAgmWMuSrojR/fjJzvke6Wvbox FRnPk+6YRzuYhAP/YPxSRYyB5at++5Q1qr7QWn7NFozFIVFFT8CBU36ktWQ39MGm cJ5SGyN9nAbbuWA6e+/u059R7QL+6f64xHRAGyLT3gOb1G0N6h7VqFT25q5Tq0rc 
Lf/CvLKoudjv+sQ5GKBPT18+zxmwJ8YUWAsXUyrqoFWY/Tvo5yLxaC0W2gh3+Ppi EDqe4RRJ3VKuKfZxHn5VLxgtBFN96Gy0+Htm5tiMKOZMYAkHiL+vrVZAX0hIEuRZ JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM -----END RSA PRIVATE KEY-----" >> id_rsaNow, if you try to reconstruct this key by removing the obvious garbage lines (the ones that are all repeated characters, some of which aren t even valid base64 characters), it still isn t a key at least,
openssl pkey doesn't want anything to do with it. The key is very much still in there, though, as we shall soon see. Using a gem I wrote and a quick bit of Ruby, we can extract a complete private key. The
irb session looks something like this:
>> require "derparse" >> b64 = <<EOF MIIJKgIBAAKCAgEAxEVih1JGb8gu/Fm4AZh+ZwJw/pjzzliWrg4mICFt1g7SmIE2 TCQMKABdwd11wOFKCPc/UzRH/fHuQcvWrpbOSdqev/zKff9iedKw/YygkMeIRaXB fYELqvUAOJ8PPfDm70st9GJRhjGgo5+L3cJB2gfgeiDNHzaFvapRSU0oMGQX+kI9 ezsjDAn+0Pp+r3h/u1QpLSH4moRFGF4omNydI+3iTGB98/EzuNhRBHRNq4oBV5SG Pq/A1bem2ninnoEaQ+OPESxYzDz3Jy9jV0W/6LvtJ844m+XX69H5fqq5dy55z6DW sGKn78ULPVZPsYH5Y7C+CM6GAn4nYCpau0t52sqsY5epXdeYx4Dc+Wm0CjXrUDEe Egl4loPKDxJkQqQ/MQiz6Le/UK9vEmnWn1TRXK3ekzNV4NgDfJANBQobOpwt8WVB rbsC0ON7n680RQnl7PltK9P1AQW5vHsahkoixk/BhcwhkrkZGyDIl9g8Q/Euyoq3 eivKPLz7/rhDE7C1BzFy7v8AjC3w7i9QeHcWOZFAXo5hiDasIAkljDOsdfD4tP5/ wSO6E6pjL3kJ+RH2FCHd7ciQb+IcuXbku64ln8gab4p8jLa/mcMI+V3eWYnZ82Yu axsa85hAe4wb60cp/rCJo7ihhDTTvGooqtTisOv2nSvCYpcW9qbL6cGjAXECAwEA AQKCAgEAjz6wnWDP5Y9ts2FrqUZ5ooamnzpUXlpLhrbu3m5ncl4ZF5LfH+QDN0Kl KvONmHsUhJynC/vROybSJBU4Fu4bms1DJY3C39h/L7g00qhLG7901pgWMpn3QQtU 4P49qpBii20MGhuTsmQQALtV4kB/vTgYfinoawpo67cdYmk8lqzGzzB/HKxZdNTq s+zOfxRr7PWMo9LyVRuKLjGyYXZJ/coFaobWBi8Y96Rw5NZZRYQQXLIalC/Dhndm AHckpstEtx2i8f6yxEUOgPvV/gD7Akn92RpqOGW0g/kYpXjGqZQy9PVHGy61sInY HSkcOspIkJiS6WyJY9JcvJPM6ns4b84GE9qoUlWVF3RWJk1dqYCw5hz4U8LFyxsF R6WhYiImvjxBLpab55rSqbGkzjI2z+ucDZyl1gqIv9U6qceVsgRyuqdfVN4deU22 LzO5IEDhnGdFqg9KQY7u8zm686Ejs64T1sh0y4GOmGsSg+P6nsqkdlXH8C+Cf03F lqPFg8WQC7ojl/S8dPmkT5tcJh3BPwIWuvbtVjFOGQc8x0lb+NwK8h2Nsn6LNazS 0H90adh/IyYX4sBMokrpxAi+gMAWiyJHIHLeH2itNKtAQd3qQowbrWNswJSgJzsT JuJ7uqRKAFkE6nCeAkuj/6KHHMPsfCAffVdyGaWqhoxmPOrnVgECggEBAOrCCwiC XxwUgjOfOKx68siFJLfHf4vPo42LZOkAQq5aUmcWHbJVXmoxLYSczyAROopY0wd6 Dx8rqnpO7OtZsdJMeBSHbMVKoBZ77hiCQlrljcj12moFaEAButLCdZFsZW4zF/sx kWIAaPH9vc4MvHHyvyNoB3yQRdevu57X7xGf9UxWuPil/jvdbt9toaraUT6rUBWU GYPNKaLFsQzKsFWAzp5RGpASkhuiBJ0Qx3cfLyirjrKqTipe3o3gh/5RSHQ6VAhz gdUG7WszNWk8FDCL6RTWzPOrbUyJo/wz1kblsL3vhV7ldEKFHeEjsDGroW2VUFlS asAHNvM4/uYcOSECggEBANYH0427qZtLVuL97htXW9kCAT75xbMwgRskAH4nJDlZ IggDErmzBhtrHgR+9X09iL47jr7dUcrVNPHzK/WXALFSKzXhkG/yAgmt3r14WgJ6 5y7010LlPFrzaNEyO/S4ISuBLt4cinjJsrFpoo0WI8jXeM5ddG6ncxdurKXMymY7 EOF >> b64 += <<EOF gff0GJCOMZ65pMSy3A3cSAtjlKnb4fWzuHD5CFbusN4WhCT/tNxGNSpzvxd8GIDs nY7exs9L230oCCpedVgcbayHCbkChEfoPzL1e1jXjgCwCTgt8GjeEFqc1gXNEaUn O8AJ4VlR8fRszHm6yR0ZUBdY7UJddxQiYOzt0S1RLlECggEAbdcs4mZdqf3OjejJ 06oTPs9NRtAJVZlppSi7pmmAyaNpOuKWMoLPElDAQ3Q7VX26LlExLCZoPOVpdqDH KbdmBEfTR4e11Pn9vYdu9/i6o10U4hpmf4TYKlqk10g1Sj21l8JATj/7Diey8scO sAI1iftSg3aBSj8W7rxCxSezrENzuqw5D95a/he1cMUTB6XuravqZK5O4eR0vrxR AvMzXk5OXrUEALUvt84u6m6XZZ0pq5XZxq74s8p/x1JvTwcpJ3jDKNEixlHfdHEZ ZIu/xpcwD5gRfVGQamdcWvzGHZYLBFO1y5kAtL8kI9tW7WaouWVLmv99AyxdAaCB Y5mBAQKCAQEAzU7AnorPzYndlOzkxRFtp6MGsvRBsvvqPLCyUFEXrHNV872O7tdO GmsMZl+q+TJXw7O54FjJJvqSSS1sk68AGRirHop7VQce8U36BmI2ZX6j2SVAgIkI 9m3btCCt5rfiCatn2+Qg6HECmrCsHw6H0RbwaXS4RZUXD/k4X+sslBitOb7K+Y+N Bacq6QxxjlIqQdKKPs4P2PNHEAey+kEJJGEQ7bTkNxCZ21kgi1Sc5L8U/IGy0BMC PvJxssLdaWILyp3Ws8Q4RAoC5c0ZP0W2j+5NSbi3jsDFi0Y6/2GRdY1HAZX4twem Q0NCedq1JNatP1gsb6bcnVHFDEGsj/35oQKCAQEAgmWMuSrojR/fjJzvke6Wvbox FRnPk+6YRzuYhAP/YPxSRYyB5at++5Q1qr7QWn7NFozFIVFFT8CBU36ktWQ39MGm cJ5SGyN9nAbbuWA6e+/u059R7QL+6f64xHRAGyLT3gOb1G0N6h7VqFT25q5Tq0rc Lf/CvLKoudjv+sQ5GKBPT18+zxmwJ8YUWAsXUyrqoFWY/Tvo5yLxaC0W2gh3+Ppi EDqe4RRJ3VKuKfZxHn5VLxgtBFN96Gy0+Htm5tiMKOZMYAkHiL+vrVZAX0hIEuRZ EOF >> der = b64.unpack("m").first >> c = DerParse.new(der).first_node.first_child >> version = c.value => 0 >> c = c.next_node >> n = c.value => 80071596234464993385068908004931... # (etc) >> c = c.next_node >> e = c.value => 65537 >> c = c.next_node >> d = c.value => 58438813486895877116761996105770... 
# (etc) >> c = c.next_node >> p = c.value => 29635449580247160226960937109864... # (etc) >> c = c.next_node >> q = c.value => 27018856595256414771163410576410... # (etc)What I ve done, in case you don t speak Ruby, is take the two chunks of plausible-looking base64 data, chuck them together into a variable named
b64, unbase64 it into a variable named
der, pass that into a new
DerParse instance, and then walk the DER value tree until I got all the values I need. Interestingly, the
q value actually traverses the split in the two chunks, which means that there's always the possibility that there are lines missing from the key. However, since
p and q are supposed to be prime, we can sanity check them to see if corruption is likely to have occurred:
>> require "openssl" >> OpenSSL::BN.new(p).prime? => true >> OpenSSL::BN.new(q).prime? => trueExcellent! The chances of a corrupted file producing valid-but-incorrect prime numbers isn t huge, so we can be fairly confident that we ve got the real
q. Now, with the help of another one of my creations we can use
qto create a fully-operational battle key:
>> require "openssl/pkey/rsa" >> k = OpenSSL::PKey::RSA.from_factors(p, q, e) => #<OpenSSL::PKey::RSA:0x0000559d5903cd38> >> k.valid? => true >> k.verify(OpenSSL::Digest::SHA256.new, k.sign(OpenSSL::Digest::SHA256.new, "bob"), "bob") => trueand there you have it. One fairly redacted-looking private key brought back to life by maths and far too much free time. Sorry Mr. Finn, I hope you re not still using that key on anything Internet-facing.
traps: brave trap invalid opcode ip:55d4e0cb1895 sp:7ffc4b08ff00 error:0 in brave[55d4dfe01000+78e5000]
Welcome to the February 2020 report from the Reproducible Builds project. One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. However, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into ostensibly secure software during the various compilation and distribution processes. The motivation behind the reproducible builds effort is to provide the ability to demonstrate these binaries originated from a particular, trusted, source release: if identical results are generated from a given source in all circumstances, reproducible builds provides the means for multiple third-parties to reach a consensus on whether a build was compromised via distributed checksum validation or some other scheme. In this month's report, we cover:
If you are interested in contributing to the project, please visit our Contribute page on our website.
All computation that occurs inside a DetTrace container is a pure function of the initial filesystem state of the container. Reproducible containers can be used for a variety of purposes, including replication for fault-tolerance, reproducible software builds and reproducible data analytics. We use DetTrace to achieve, in an automatic fashion, reproducibility for 12,130 Debian package builds, containing over 800 million lines of code, as well as bioinformatics and machine learning workflows.

There was also considerable discussion on our mailing list regarding this research and a presentation based on the paper will occur at the ASPLOS 2020 conference between March 16th and 20th in Lausanne, Switzerland. The many virtues of Reproducible Builds were touted as benefits for software compliance in a talk at FOSDEM 2020, debating whether the Careful Inventory of Licensing Bill of Materials Have Impact of FOSS License Compliance which pitted Jeff McAffer and Carol Smith against Bradley Kuhn and Max Sills. (~47 minutes in). Nobuyoshi Nakada updated the canonical implementation of the Ruby programming language with a change such that filesystem globs (ie. calls to list the contents of filesystem directories) will henceforth be sorted in ascending order. Without this change, the underlying nondeterministic ordering of the filesystem is exposed to the language, which often results in an unreproducible build. Vagrant Cascadian reported on our mailing list regarding a quick reproducible test for the GNU Guix distribution, which resulted in 81.9% of packages registering as reproducible in his installation:
$ guix challenge --verbose --diff=diffoscope ...
2,463 store items were analyzed:
  - 2,016 (81.9%) were identical
  - 37 (1.5%) differed
  - 410 (16.6%) were inconclusive

Jeremiah Orians announced on our mailing list the release of a number of tools related to cross-compilation such as
mescc-tools-seed. This project attempts a full bootstrap of a cross-platform compiler for the C programming language (written in C itself) from hex, the ultimate goal being able to demonstrate a fully-bootstrapped compiler from hex to the GCC GNU Compiler Collection. This has many implications in and around Ken Thompson's Trusting Trust attack outlined in Thompson's 1983 Turing Award Lecture. Twitter user @TheYoctoJester posted an executive summary of reproducible builds in the Yocto Project. Finally, Reddit user
tofflos posted to the /r/Java subreddit asking about how to achieve reproducible builds with Maven and Chris Lamb noticed that the Linux kernel documentation about reproducible builds is available on the kernel.org homepages in an attractive HTML format.
debian-installer package to allow all arguments and options from
sources.list files (such as
[check-valid-until=no], etc.) in order that we can test the reproducibility of the installer images on the Reproducible Builds' own testing infrastructure. (#13) Thorsten Glaser followed up to a bug filed against the
dpkg-source component that was originally filed in late 2015 that claims that the build tool does not respect permissions when unpacking tarballs if the umask is set to
0002. Matthew Garrett posted to the
debian-devel mailing list on the topic of Producing verifiable initramfs images as part of a wider conversation on being able to trust the entire software stack on our computers. 59 reviews of Debian packages were added, 30 were updated and 42 were removed this month adding to our knowledge about identified issues. Many issue types were noticed and categorised by Chris Lamb, including:
python-rpm-macros (do not save time-based
.pyc files for tests)
solfege (filesystem ordering issue sent upstream via email; package is orphaned upstream)
DVDStyler (zip timestamps, submitted upstream)
sng image utility appears to return with an exit code of 1 if there are even minor errors in the file. (#950806)
.apk files extracted by
str.format if we are just returning the string. [ ]
Command.VALID_RETURNCODES. [ ]
Vcs-Git to specify the
debian packaging branch. [ ] reprotest is our end-user tool to build the same source code twice in widely differing environments and then check the binaries produced by each build for any differences. This month, versions
0.7.14 were uploaded to Debian unstable by Holger Levsen after Vagrant Cascadian added support for GNU Guix [ ].
SOURCE_DATE_EPOCH documentation [ ] and normalised various terms to "unreproducible" [ ]. Chris Lamb added a Meson.build example [ ] and improved the documentation for the CMake [ ] to the
SOURCE_DATE_EPOCH documentation, replaced "anyone can" with "anyone may" as, well, not everyone has the resources, skills, time or funding to actually do what it refers to [ ] and improved the pre-processing for our report generation [ ][ ][ ][ ] etc. In addition, Holger Levsen updated our news page to improve the list of reports [ ], added an explicit mention of the weekly news time span [ ] and reverted sorting of news entries to have latest on top [ ] and Mattia Rizzolo added Codethink as a non-fiscal sponsor [ ] and lastly Tianon Gravi added a Docker Images link underneath the Debian project on our Projects page [ ].
python-django (Always build the documentation in English)
.buildinfo files. As a result, we now know that Debian bullseye contains 4,557 source packages for the
amd64 architecture without corresponding
.buildinfo files and 25,668 source packages with
.buildinfo files. [ ][ ][ ][ ][ ]
root user. [ ]
devscripts from buster-backports [ ].
i386 architecture, to allow the latter to catch up a bit. [ ]
arm64 architecture builders [ ]. The usual build node maintenance was performed by Holger Levsen, Mattia Rizzolo [ ][ ] and Vagrant Cascadian.
This month's report was written by Bernhard M. Wiedemann, Chris Lamb and Holger Levsen. It was subsequently reviewed by a bunch of Reproducible Builds folks on IRC and the mailing list.
www.express.co.uk, each on its own line. (Of course you can easily add more if you like.) Voilà: instant tranquility. Incidentally, this also works fine on Android. The fact that it was easy to install a good ad blocker without having to mess about with a rooted device or strange proxy settings was the main reason I switched to Firefox on my phone.
From that, Bacon elaborates on possible reasons for the apparent decline of the GPL. The graphic used in the article was actually generated by Stephen O'Grady in a January article, The State Of Open Source Licensing, which said:
In Black Duck's sample, the most popular variant of the GPL, version 2, is less than half as popular as it was (46% to 19%). Over the same span, the permissive MIT has gone from 8% share to 29%, while its permissive cousin the Apache License 2.0 jumped from 5% to 15%.

Sullivan, however, argued that the methodology used to create both articles was problematic. Neither contains original research: the graphs actually come from the Black Duck Software "KnowledgeBase" data, which was partly created from the old Ohloh web site now known as Open Hub. To show one problem with the data, Sullivan mentioned two free-software projects, GNU Bash and GNU Emacs, that had been showcased on the front page of Ohloh.net in 2012. On the site, Bash was (and still is) listed as GPLv2+, whereas it changed to GPLv3 in 2011. He also claimed that "Emacs was listed as licensed under GPLv3-only, which is a license Emacs has never had in its history", although I wasn't able to verify that information from the Internet archive. Basically, according to Sullivan, "the two projects featured on the front page of a site that was using [the Black Duck] data set were wrong". This, in turn, seriously brings into question the quality of the data:
I reported this problem and we'll continue to do that but when someone is not sharing the data set that they're using for other people to evaluate it and we see glimpses of it which are incorrect, that should give us a lot of hesitation about accepting any conclusion that comes out of it.

Reproducible observations are necessary to the establishment of solid theories in science. Sullivan didn't try to contact Black Duck to get access to the database, because he assumed (rightly, as it turned out) that he would need to "pay for the data under terms that forbid you to share that information with anybody else". So I wrote Black Duck myself to confirm this information. In an email interview, Patrick Carey from Black Duck confirmed its data set is proprietary. He believes, however, that through a "combination of human and automated techniques", Black Duck is "highly confident at the accuracy and completeness of the data in the KnowledgeBase". He did point out, however, that "the way we track the data may not necessarily be optimal for answering the question on license use trend" as "that would entail examination of new open source projects coming into existence each year and the licenses used by them". In other words, even according to Black Duck, its database may not be useful to establish the conclusions drawn by those articles. Carey did agree with those conclusions intuitively, however, saying that "there seems to be a shift toward Apache and MIT licenses in new projects, though I don't have data to back that up". He suggested that "an effective way to answer the trend question would be to analyze the new projects on GitHub over the last 5-10 years." Carey also suggested that "GitHub has become so dominant over the recent years that just looking at projects on GitHub would give you a reasonable sampling from which to draw conclusions".
Indeed, GitHub published a report in 2015 that also seems to confirm MIT's popularity (45%), surpassing copyleft licenses (24%). The data is, however, not without its own limitations. For example, in the above graph going back to the inception of GitHub in 2008, we see a rather abnormal spike in 2013, which seems to correlate with the launch of the choosealicense.com site, described by GitHub as "our first pass at making open source licensing on GitHub easier". In his talk, Sullivan was critical of the initial version of the site which he described as biased toward permissive licenses. Because the GitHub project creation page links to the site, Sullivan explained that the site's bias could have actually influenced GitHub users' license choices. Following a talk from Sullivan at FOSDEM 2016, GitHub addressed the problem later that year by rewording parts of the front page to be more accurate, but that any change in license choice obviously doesn't show in the report produced in 2015 and won't affect choices users have already made. Therefore, there can be reasonable doubts that GitHub's subset of software projects may not actually be that representative of the larger free-software community.
The long history of Debian creates a perfect subject to evaluate how FOSS licenses use has evolved over time, and the popularity of licenses currently in use.

Sullivan argued that the Debsources data set is interesting because of its quality: every package in Debian has been reviewed by multiple humans, including the original packager, but also by the FTP masters to ensure that the distribution can legally redistribute the software. The existence of a package in Debian provides a minimal "proof of use": unmaintained packages get removed from Debian on a regular basis and the mere fact that a piece of software gets packaged in Debian means at least some users found it important enough to work on packaging it. Debian packagers make specific efforts to avoid code duplication between packages in order to ease security maintenance. The data set covers a period longer than Black Duck's or GitHub's, as it goes all the way back to the Hamm 2.0 release in 1998. The data and how to reproduce it are freely available under a CC BY-SA 4.0 license.
Sullivan presented the above graph from the research paper that showed the evolution of software license use in the Debian archive. Whereas previous graphs showed statistics in percentages, this one showed actual absolute numbers, where we can't actually distinguish a decline in copyleft licenses. To quote the paper again:
The top license is, once again, GPL-2.0+, followed by: Artistic-1.0/GPL dual-licensing (the licensing choice of Perl and most Perl libraries), GPL-3.0+, and Apache-2.0.

Indeed, looking at the graph, at most we see a rise of the Apache and MIT licenses and no decline of the GPL per se, although its adoption does seem to slow down in recent years. We should also mention the possibility that Debian's data set has the opposite bias: toward GPL software. The Debian project is culturally quite different from the GitHub community and even the larger free-software ecosystem, naturally, which could explain the disparity in the results. We can only hope a similar analysis can be performed on the much larger Software Heritage data set eventually, which may give more representative results. The paper acknowledges this problem:
Debian is likely representative of enterprise use of FOSS as a base operating system, where stable, long-term and seldomly updated software products are desirable. Conversely Debian is unlikely representative of more dynamic FOSS environments (e.g., modern Web-development with micro libraries) where users, who are usually developers themselves, expect to receive library updates on a daily basis.

The Debsources research also shares methodology limitations with Black Duck: while Debian packages are reviewed before uploading and we can rely on the copyright information provided by Debian maintainers, the research also relies on automated tools (specifically FOSSology) to retrieve license information. Sullivan also warned against "ascribing reason to numbers": people may have different reasons for choosing a particular license. Developers may choose the MIT license because it has fewer words, for compatibility reasons, or simply because "their lawyers told them to". It may not imply an actual deliberate philosophical or ideological choice. Finally, he brought up the theory that the rise of non-copyleft licenses isn't necessarily at the detriment of the GPL. He explained that, even if there is an actual decline, it may not be much of a problem if there is an overall growth of free software to the detriment of proprietary software. He reminded the audience that non-copyleft licenses are still free software, according to the FSF and the Debian Free Software Guidelines, so their rise is still a positive outcome. Even if the GPL is a better tool to accomplish the goal of a free-software world, we can all acknowledge that the conversion of proprietary software to more permissive and certainly simpler licenses is definitely heading in the right direction.
[I would like to thank the DebConf organizers for providing meals for me during the conference.] Note: this article first appeared in the Linux Weekly News.