The 2020 Solarwinds attack was a tipping point that caused a heightened awareness about the security of the software supply chain and in particular the large amount of trust placed in build systems. Reproducible Builds (R-Bs) provide a strong foundation to build defenses for arbitrary attacks against build systems by ensuring that given the same source code, build environment, and build instructions, bitwise-identical artifacts are created. (PDF)
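The bitwise-identical property described above can be checked with a plain hash comparison. The following is a minimal sketch, not any project's official tooling; the helper names `sha256_of` and `builds_match` are hypothetical and chosen only for illustration.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(first_artifact, second_artifact):
    """True when two independently built artifacts are bitwise-identical."""
    return sha256_of(first_artifact) == sha256_of(second_artifact)
```

A mismatch only says that something differs; a tool such as diffoscope is then needed to find out what.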
I have identified 16 root causes for unreproducible builds in my empirical study, which I have linked to the corresponding documentation. The initial MR right now contains information about 10 root causes. For each root cause, I have provided a definition, a notable instance, and a workaround. However, I have only found workarounds for 5 out of the 10 root causes listed in this merge request. In the upcoming commits, I plan to add an additional 6 root causes. I kindly request you review the text for any necessary refinements, modifications, or corrections. Additionally, I would appreciate the help with documentation for the solutions/workarounds for the remaining root causes: Archive Metadata, Build ID, File System Ordering, File Permissions, and Snippet Encoding. Your input on the identified root causes for unreproducible builds would be greatly appreciated. [ ]
while packaging govulncheck for Arch Linux I noticed a checksum mismatch for a tar file I downloaded from go.googlesource.com. I used diffoscope to compare the .tar file I downloaded with the .tar file the build server downloaded, and noticed the timestamps are different.
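To illustrate why differing timestamps alone change a tarball's checksum, here is a small sketch using Python's standard tarfile module. It is not how go.googlesource.com generates archives; the function name `build_tar` and its parameters are assumptions made for this example.

```python
import io
import tarfile


def build_tar(files, mtime=0):
    """Build an in-memory tar archive with normalised metadata.

    `files` maps member names to their byte contents. Members are added
    in sorted order and carry a fixed mtime, owner and mode, so the same
    input always yields the same bytes.
    """
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.USTAR_FORMAT) as archive:
        for name in sorted(files):
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime          # the only field varied in the usage below
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            info.mode = 0o644
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()
```

Calling `build_tar` twice with the same `mtime` gives identical bytes, while changing only `mtime` gives a different archive and therefore a different checksum, even though the file contents are unchanged.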
ffile_prefix_map_passed_to_clang being fixed since Debian bullseye and adding a Debian bug tracker reference for the nondeterminism_added_by_pyqt5_pyrcc5 issue.
In addition, Roland Clobus posted another detailed update of the status of reproducible Debian ISO images on our mailing list. In particular, Roland helpfully summarised that live images are looking good, and the number of (passing) automated tests is growing.
util.inspect.object_description attempts to sort collections, but this can fail. The change handles the failure case by using string-based object descriptions as a fallback deterministic sort ordering, as well as adding recursive object-description calls for list and tuple datatypes. As a result, documentation generated by Sphinx will be more likely to be automatically reproducible.
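The fallback idea described above can be sketched in a few lines. This is an illustration of the technique only, not Sphinx's actual implementation; the function name `describe` is hypothetical.

```python
def describe(obj):
    """Return a deterministic string description of an object.

    Sets are sorted naturally when their members are comparable; when
    sorting raises TypeError (for example with mixed int and str members),
    the members' own string descriptions are used as the sort key instead.
    Lists and tuples are described recursively so nested sets are covered.
    """
    if isinstance(obj, (set, frozenset)):
        try:
            ordered = sorted(obj)
        except TypeError:
            ordered = sorted(obj, key=describe)
        return "{" + ", ".join(describe(item) for item in ordered) + "}"
    if isinstance(obj, list):
        return "[" + ", ".join(describe(item) for item in obj) + "]"
    if isinstance(obj, tuple):
        return "(" + ", ".join(describe(item) for item in obj) + ")"
    return repr(obj)
```

The string-keyed ordering is arbitrary but stable across runs, which is all that reproducible output requires.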
Lastly in news, kpcyrd posted to our mailing list announcing a new repro-env tool:
My initial interest in reproducible builds was "how do I distribute pre-compiled binaries on GitHub without people raising security concerns about them". I've cycled back to this original problem about 5 years later and built a tool that is meant to address this.
django-graphql-jwt (fails to build in 2038)
doxygen (filesystem ordering issue)
git-interactive-rebase-tool (date-related issue)
obs-build
procmeter (parallelism race condition)
promu
python-cx_Freeze (version update for year 2038 fix)
python-zope.deprecation
python310 (ASLR-related issue)
python-control (fails to build -j4)
python-DateTime (fails to build in 2038)
python-pyface (date/time-related issue)
python-quantities (date/time-related issue)
python-scipy (date/time-related issue)
rpmlint
starship (filesystem ordering issue)
Telethon
xindy (fails to build in 2036)
yt (filesystem ordering issue)
python-bpython, python-flup, python-mysqlclient, python-waitress, python-WebOb, python-WebTest, python-zope.event, python-zope.hookable & python-zope.i18nmessageid
dotenv-cli.
unity-java.
ruby-babosa (forwarded upstream).
guidata (forwarded upstream).
SOURCE_DATE_EPOCH, a three-and-a-half year effort started by Bernhard M. Wiedemann in January 2020, taken over by John Neffenger in March 2021, integrated upstream in June 2023, and available starting with JavaFX 21 on September 19, 2023.
244, 245 and 246 were uploaded to Debian unstable by Chris Lamb, who also made the following changes:
libarchive-5.
test_dex::test_javap_14_differences test requires the procyon tool.
assert_diff in the .ico and .jpeg tests.
XFAIL due to Debian bugs #1040941 & #1040916.
create_meta_pkg_sets job into two (for Debian unstable and Debian testing) to halve the job runtime to approximately 90 minutes.
postgresql_autodoc is back in Debian bookworm.
kfreebsd-related tests now that it's officially dead.
dpkg-db-backup and munin-node services.
#reproducible-builds on irc.oftc.net.
rb-general@lists.reproducible-builds.org
Barbie No, seriously! If anyone can make a good film about a doll franchise, it's probably Greta Gerwig. Not only was Little Women (2019) more than admirable, the same could be definitely said for Lady Bird (2017). More importantly, I can't help feel she was the real 'Driver' behind Frances Ha (2012), one of the better modern takes on Claudia Weill's revelatory Girlfriends (1978). Still, whenever I remember that Barbie will be a film about a billion-dollar toy and media franchise with a nettlesome history, I recall I rubbished the "Facebook film" that turned into The Social Network (2010). Anyway, the trailer for Barbie is worth watching, if only because it seems like a parody of itself.
Blitz It's difficult to overstate just how crucial the aerial bombing of London during World War II is to understanding the British psyche, despite it being a constructed phenomenon from the outset. Without wishing to underplay the deaths of over 40,000 civilians, Angus Calder pointed out in the 1990s that the modern mythology surrounding the event "did not evolve spontaneously; it was a propaganda construct directed as much at [then neutral] American opinion as at British." It will therefore be interesting to see how British Grenadian Trinidadian director Steve McQueen addresses a topic so essential to the British self-conception. (Remember the controversy in right-wing circles about the sole Indian soldier in Christopher Nolan's Dunkirk (2017)?) McQueen is perhaps best known for his 12 Years a Slave (2013), but he recently directed a six-part film anthology for the BBC which addressed the realities of post-Empire immigration to Britain, and this leads me to suspect he sees the Blitz and its surrounding mythology with a more critical perspective. But any attempt to complicate the story of World War II will be vigorously opposed in a way that will make the recent hullabaloo surrounding The Crown seem tame. All this is to say that the discourse surrounding this release may be as interesting as the film itself.
Dune, Part II Coming out of the cinema after the first part of Denis Villeneuve's adaptation of Dune (2021), I was struck by the conception that it was less of a fresh adaptation of the 1965 novel by Frank Herbert than an attempt to rehabilitate David Lynch's 1984 version and in a broader sense, it was also an attempt to reestablish the primacy of cinema over streaming TV and the myriad of other distractions in our lives. I must admit I'm not a huge fan of the original novel, finding within it a certain prurience regarding hereditary military regimes and writing about them with a certain sense of glee that belies a secret admiration for them... not to mention an eyebrow-raising allegory for the Middle East. Still, Dune, Part II is going to be a fantastic spectacle.
Ferrari It'll be curious to see how this differs substantially from the recent Ford v Ferrari (2019), but given that Michael Mann's Heat (1995) so effectively re-energised the gangster/heist genre, I'm more than willing to kick the tires of this about the founder of the eponymous car manufacturer. I'm in the minority for preferring Mann's Thief (1981) over Heat, in part because the former deals in more abstract themes, so I'd have perhaps preferred to look forward to a more conceptual film from Mann over a story about one specific guy.
How Do You Live There are a few directors one can look forward to watching almost without qualification, and Hayao Miyazaki (My Neighbor Totoro, Kiki's Delivery Service, Princess Mononoke, Howl's Moving Castle, etc.) is one of them. And this is especially so given that The Wind Rises (2013) was meant to be the last collaboration between Miyazaki and Studio Ghibli. Let's hope he is able to come out of retirement in another ten years.
Indiana Jones and the Dial of Destiny Given I had a strong dislike of Indiana Jones and the Kingdom of the Crystal Skull (2008), I seriously doubt I will enjoy anything this film has to show me, but with 1981's Raiders of the Lost Ark remaining one of my most treasured films (read my brief homage), I still feel a strong sense of obligation towards the Indiana Jones name, despite it feeling like the copper is being pulled out of the walls of this franchise today.
Kafka I only know Polish filmmaker Agnieszka Holland through her Spoor (2017), an adaptation of Olga Tokarczuk's 2009 eco-crime novel Drive Your Plow Over the Bones of the Dead. I wasn't an unqualified fan of Spoor (nor the book on which it is based), but I am interested in Holland's take on the life of Czech author Franz Kafka, an author enmeshed with twentieth-century art and philosophy, especially that of central Europe. Holland has mentioned she intends to tell the story "as a kind of collage," and I can hope that it is an adventurous take on the over-furrowed biopic genre. Or perhaps Gregor Samsa will awake from uneasy dreams to find himself transformed in his bed into a huge verminous biopic.
The Killer It'll be interesting to see what path David Fincher is taking today, especially after his puzzling and strangely cold Mank (2020) portraying the writing process behind Orson Welles' Citizen Kane (1941). The Killer is said to be a straight-to-Netflix thriller based on the graphic novel about a hired assassin, which makes me think of Fincher's Zodiac (2007), and, of course, Se7en (1995). I'm not as entranced by Fincher as I used to be, but any film with Michael Fassbender and Tilda Swinton (with a score by Trent Reznor) is always going to get my attention.
Killers of the Flower Moon In Killers of the Flower Moon, Martin Scorsese directs an adaptation of a book about the FBI's investigation into a conspiracy to murder Osage tribe members in the early years of the twentieth century in order to deprive them of their oil-rich land. (The only thing more quintessentially American than apple pie is a conspiracy combined with a genocide.) Separate from learning more about this disquieting chapter of American history, I'd love to discover what attracted Scorsese to this particular story: he's one of the few top-level directors who have the ability to lucidly articulate their intentions and motivations.
Napoleon It often strikes me that, despite all of his achievements and fame, it's somehow still possible to claim that Ridley Scott is relatively underrated compared to other directors working at the top level today. Besides that, though, I'm especially interested in this film, not least of all because I just read Tolstoy's War and Peace (read my recent review) and am working my way through the mind-boggling 431-minute Soviet TV adaptation, but also because several auteur filmmakers (including Stanley Kubrick) have tried to make a Napoleon epic and failed.
Oppenheimer In a way, a biopic about the scientist responsible for the atomic bomb and the Manhattan Project seems almost perfect material for Christopher Nolan. He can certainly rely on stars to queue up to be in his movies (Robert Downey Jr., Matt Damon, Kenneth Branagh, etc.), but whilst I'm certain it will be entertaining on many fronts, I fear it will fall into the well-established Nolan mould of yet another single man struggling with obsession, deception and guilt who is trying in vain to balance order and chaos in the world.
The Way of the Wind Marked by philosophical and spiritual overtones, all of Terrence Malick's films are perfumed with themes of transcendence, nature and the inevitable conflict between instinct and reason. My particular favourite is his stunning Days of Heaven (1978), but The Thin Red Line (1998) and A Hidden Life (2019) also touched me in ways difficult to relate, and are among the few films about the Second World War that don't touch off my sensitivity about them (see my remarks about Blitz above). It is therefore somewhat Malickian that his next film will be a biblical drama about the life of Jesus. Given Malick's filmography, I suspect this will be far more subdued than William Wyler's 1959 Ben-Hur and significantly more equivocal in its conviction compared to Pier Paolo Pasolini's ardently progressive The Gospel According to St. Matthew (1964). However, little beyond that can be guessed, and the film may not even appear until 2024 or even 2025.
Zone of Interest I was mesmerised by Jonathan Glazer's Under the Skin (2013), and there is much to admire in his borderline 'revisionist gangster' film Sexy Beast (2000), so I will definitely be on the lookout for this one. The only thing making me hesitate is that Zone of Interest is based on a book by Martin Amis about a romance set inside the Auschwitz concentration camp. I haven't read the book, but Amis has something of a history in his grappling with the history of the twentieth century, and he seems to do it in a way that never sits right with me. But if Paul Verhoeven's Starship Troopers (1997) proves anything at all, it's all in the adaptation.
.apk files shipped by a number of free-software instant messenger applications.
These scripts are often necessary in the Android/APK ecosystem due to these files containing embedded signatures so the conventional bit-for-bit comparison cannot be used. After detailing a litany of issues with these tools, they come to the conclusion that:
It's quite possible these messengers actually have reproducible builds, but the verification scripts they use don't actually allow us to verify whether they do.
This reflects the consensus view within the Reproducible Builds project: pursuing a situation in language or package ecosystems where binaries are bit-for-bit identical (over requiring a bespoke ecosystem-specific tool) is not a luxury demanded by purist engineers, but rather the only practical way to demonstrate reproducibility. obfusk also announced the first release of their own set of tools on our mailing list. Related to this, obfusk also posted to an issue filed against Mastodon regarding the difficulties of creating bit-by-bit identical APKs, especially with respect to copying v2/v3 APK signatures created by different tools; they also reported that some APK ordering differences were not caused by building on macOS after all, but by using Android Studio and that F-Droid added 16 more apps published with Reproducible Builds in December.
aespipe (#661079, #1020809), cdbackup (#1011428) & xmlrpc-epi (#865688, #1020651)
apr-util (#1006865), lirc (#979024) & ruby-omniauth-tumblr
amavisd-milter (#975954), apophenia (#940013), cfi (#995647), chessx (#881664), cmocka (#991181), desmume (#890312), golang-gonum-v1-plot (#968045), intel-gpu-tools (#945105), jhbuild (#971420), libjama (#986601), libjs-qunit (#976445), liblip (#1001513, #989583), libstatgrab (#961747), mlpost (#977179 and #977180), netcdf-parallel (#972930), netgen-lvs (#955783), perfect-scrollbar (#1000770), python-tomli (#994979), pytsk (#992060), smplayer (#997689), squeak-plugins-scratch (#876771, #942006), stgit (#942009), strace (#896016), surgescript (#992061), sympow (#973601), wxmaxima (#983148), xavs2 (#952493), xaw3d (#991180, #986704) and yard (#972668).
OpenRGB (filesystem ordering issue)
python-maturin (report an issue regarding random numbers)
rav1e (datetime-related issue)
weblate (report that the build fails in 2038)
osuosl167 machine is no longer an openqa-worker node.
foot-terminfo package on Debian systems.
--timeout flag.
228, 229 and 230 to Debian:
file(1) version 5.43, with thanks to Christoph Biedl.
test_html.py::test_diff test if html2text is not installed. (#1026034)
Standards-Version on all of our packages, including diffoscope, strip-nondeterminism, disorderfs and reprotest.
#reproducible-builds on irc.oftc.net.
rb-general@lists.reproducible-builds.org
230. This version includes the following changes:
[ Chris Lamb ]
* Fix compatibility with file(1) version 5.43; thanks, Christoph Biedl.
[ Jelle van der Waa ]
* Support Berkeley DB version 6.
.buildinfo files can be seen/used as SBOMs. And, no less importantly, the Reproducible Builds t-shirt design has been updated.
[...] industry application of R-Bs appears limited, and we seek to understand whether awareness is low or if significant technical and business reasons prevent wider adoption.
This is achieved through interviews with software practitioners and business managers, and touches on both the business and technical reasons supporting the adoption (or not) of Reproducible Builds. The article also begins with an excellent explanation and literature review, and even introduces a new helpful analogy for reproducible builds:
[Users are] able to perform a bitwise comparison of the two binaries to verify that they are identical and that the distributed binary is indeed built from the source code in the way the provider claims. Applied in this manner, R-Bs function as a canary, a mechanism that indicates when something might be wrong, and offer an improvement in security over running unverified binaries on computer systems.
The full paper is available to download on an open access basis. Elsewhere in academia, Beatriz Michelson Reichert and Rafael R. Obelheiro have published a paper proposing a systematic threat model for a generic software development pipeline identifying possible mitigations for each threat (PDF). Under the Tampering rubric of their paper, they describe various attacks against Continuous Integration (CI) processes:
An attacker may insert a backdoor into a CI or build tool and thus introduce vulnerabilities into the software (resulting in an improper build). To avoid this threat, it is the developer's responsibility to take due care when making use of third-party build tools. Tampered compilers can be mitigated using diversity, as in the diverse double compiling (DDC) technique. Reproducible builds, a recent research topic, can also provide mitigation for this problem. (PDF)
-flto option: the first involved solving an issue related to seeded random numbers; and the second involved the binary embedding the current working directory in compressed sections of the LTO object. Both of these issues made the build unreproducible.
ddd (Fixed #834016)
libpam-ldap (Fixed #834050)
nsnake (Fixed #833612)
quvi (Fixed #835259)
stressapptest (Fixed #831587 & #986653)
tcpreen (Fixed #831585)
boolector (Fixed #1023886)
tsdecrypt (Fixed #829713 & #1022130)
wbxml2 (QA upload fixed build path issues)
tercpp (QA upload fixed build path issues)
SOURCE_DATE_EPOCH. This was initially suggested and discussed on a devel@ mailing list post but was later written up on the Fedora Wiki as well as being officially proposed to Fedora Engineering Steering Committee (FESCo).
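For readers unfamiliar with SOURCE_DATE_EPOCH, a build tool that honours it typically uses the variable's value (seconds since the Unix epoch) instead of the current time whenever it needs to embed a date. The following is a minimal sketch of that convention, not Fedora's or any packaging tool's actual code; the function name `build_timestamp` is hypothetical.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=os.environ):
    """Return the timestamp a build should embed, as an aware UTC datetime.

    If SOURCE_DATE_EPOCH is set, its value is used; otherwise the sketch
    falls back to the current time, which is what makes output vary
    between builds.
    """
    raw = environ.get("SOURCE_DATE_EPOCH")
    seconds = int(raw) if raw is not None else int(time.time())
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

Formatting the result in UTC rather than local time avoids a second source of variation, namely the builder's timezone.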
dwz (Profile-guided optimisation issue)
icmake (filesystem ordering issue)
llmnrd
elixir (report a bug re. stuck build on single-core VMs)
warzone2100 (report a bug re. parallelism-dependent output)
boolector.
fl-cow.
gerstensaft.
libcgicc.
haskell98-report.
ucspi-proxy.
hunt.
tolua++.
twoftpd.
ipsvd.
gentoo.
lcm.
apcupsd.
openfortivpn.
xtb.
gnunet.
swift-im.
brewtarget.
xrprof.
gitlint.
claws-mail.
presage.
jh7100-bootloader-recovery.
226 and 227 to Debian:
python3-progressbar and python3-progressbar2, two modules providing the progressbar Python module.
file(1) cannot detect yet and Python 3.11 cannot unmarshal. (#1024335)
apksigcopier.
os_list.
assert_diff helper in test_lzip.py.
lzip.py and test_lzip.py.
apktool if no differences are detected before the signing block.
ssh(1) into our snapshot server as the jenkins user.
#reproducible-builds on irc.oftc.net.
rb-general@lists.reproducible-builds.org
[...] proposes a general taxonomy for attacks on open-source supply chains, independent of specific programming languages or ecosystems, and covering all supply chain stages from code contributions to package distribution.
Taking the form of an attack tree, the paper covers 107 unique vectors linked to 94 real world supply-chain incidents which are then mapped to 33 mitigating safeguards including, of course, reproducible builds:
Reproducible Builds received a very high utility rating (5) from 10 participants (58.8%), but also a high-cost rating (4 or 5) from 12 (70.6%). One expert commented that "a reproducible build like used by Solarwinds now, is a good measure against tampering with a single build system" and another claimed this "is going to be the single, biggest barrier".
[...] illustrate a concerning new reality for the software industry and illuminates the increasingly sophisticated threats made by outside nation-states to the supply chains and infrastructure on which we all rely.
The 12-month anniversary of the 2020 Solarwinds attack (which SolarWinds Worldwide LLC itself calls the SUNBURST attack) was, of course, the likely impetus for publication.
/build/1st/cyrus-imapd-3.6.0~beta3/
/build/2/cyrus-imapd-3.6.0~beta3/2nd/
git archive command doesn't match the tarball served by GitHub anymore. In his post, kpcyrd narrows the change to a specific commit in Git.
repro-get. According to Akihiro's post, repro-get is "a tool to install a specific snapshot of apt/dnf/apk/pacman packages using SHA256SUMS files". This is needed in order to install specific (or "pinned") dependencies needed to validate a build.
man-db UNIX manual page indexing tool:
One of the people working on [reproducible builds] noticed that man-db's database files were an obstacle to [reproducibility]: in particular, the exact contents of the database seemed to depend on the order in which files were scanned when building it. The reporter proposed solving this by processing files in sorted order, but I wasn't keen on that approach: firstly because it would mean we could no longer process files in an order that makes it more efficient to read them all from disk (still valuable on rotational disks), but mostly because the differences seemed to point to other bugs.
Colin goes on to describe his approach to solving the problem, including fixing various bits of internal caching, and he ends his post with "None of this is particularly glamorous work, but it paid off".
ascii2binary (Fixed #1020812, #998758 & #1007421)
bibclean (Fixed #829754 & #929036)
dradio (Fixed #1020814)
leave (Fixed #777403, #967002 & #999259)
libimage-imlib2-perl (Fixed #1020665)
mailto (Fixed #998978 & #777413)
remote-tty (Fixed #829721 & #977280)
xcolmix (Fixed #1020748, #999219 & #988018)
z80asm (Fixed #939775 & #1020875)
ario (Investigated #828876)
cloop (Fixed #787996)
elvis-tiny (Fixed #829755 & #901345)
hannah (Fixed #845782 & #901260)
mc (Investigated #828683)
mod-dnssd (Submitted alternate fix for #828752)
snake4 (Fixed #829715 & #913734)
the (Fixed #842550)
zephyr (Investigated #828867 & #1021374)
msp430mcu (Fixed #860275)
checkpw (Fixed #777299 & #1020887)
madlib (Fixed #778946)
debhelper, a set of tools used in the packaging of the majority of Debian packages. The patch addressed an issue in the dh_installsysusers utility so that the postinst post-installation script that debhelper generates contains the same data regardless of the underlying filesystem ordering.
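Filesystem ordering issues like the one above usually come from iterating over a directory in whatever order the filesystem returns entries. A common remedy is to sort before use. The sketch below shows that general technique in Python; it is not the debhelper patch (debhelper is written in Perl), and the function name `list_files_deterministically` is hypothetical.

```python
import os


def list_files_deterministically(root):
    """Return all file paths under `root`, relative to it, in a stable order.

    os.walk yields entries in the order the filesystem reports them, which
    can differ between machines, so both the subdirectories and the files
    are sorted explicitly.
    """
    found = []
    for directory, subdirs, files in os.walk(root):
        subdirs.sort()  # sorting in place makes os.walk descend in sorted order
        for name in sorted(files):
            found.append(os.path.relpath(os.path.join(directory, name), root))
    return found
```

Any output generated by looping over this list then no longer depends on the underlying filesystem.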
asymptote (date-related issue)
fastjet-contrib (sort nondeterministic filesystem ordering)
forge (Sphinx doctree issue)
gau2grid (output varies with march=native)
gosec (date-related issue)
helmfile (date-related issue)
libnvme (date-related issue)
moab (CPU)
tcl (fails to build in 2038)
vectorscan (output varies with march=native)
xz2/lzma (Rust-related filesystem ordering)
puppet back in early 2018 was finally merged into Puppet and was released in Puppet 7.20.0.
puppet-agent.
tpm2-pytss (forwarded upstream).
cclive.
librep.
zephyr.
libdv.
dbview.
bwbasic.
olpc-powerd.
o3dgc.
icon.
rdist.
stfl.
pacman.
lam.
xsok.
python-djvulibre.
xzoom.
nitpic.
tcm.
xxkb.
yersinia.
centrifuge.
ssocr.
jakarta-jmeter.
guymager.
crack.
dc3dd.
dlt-viewer.
vart.
pgrouting.
libsx.
device-tree-compiler.
tsdecrypt.
openjdk (Fixed JDK-8292892)
224 and 225 to Debian:
html2text.
ttx(1) from the fonttools suite.
stable-po pipeline to fail in the CI.
order1.diff test fixture to json_expected_ordering_diff.
assert_diff over get_data and a manual assert within the XML tests.
ALLOWED_TEST_FILES test; it was mostly just annoying.
tests/test_source.py file.
logparse tool to analyse results on the Debian Edu build logs.
btop(1) on all nodes running Debian.
debstrap jobs, correctly log the tool usage.
cdebootstrap-static binary for the 2nd runs of the cdebootstrap tests.
rm(1) warning into an info-level message.
osuosl168 node for running Debian bookworm already.
non-free-firmware suite on the o168 node.
/usr.
usrmerge package on Debian bookworm and above.
bc(1) syntax in the computation of the percentage of unreproducible packages in the dashboard.
index_suite_ pages, order the package status to be the same order of the menu.
--distribution parameter to the pbuilder utility.
#reproducible-builds on irc.oftc.net.
rb-general@lists.reproducible-builds.org
226. This version includes the following changes:
[ Christopher Baines ]
* Add an lzip comparator with tests.
[ Chris Lamb ]
* Add support for comparing the "text" content of HTML files using html2text.
(Closes: #1022209, reproducible-builds/diffoscope#318)
* Misc/test improvements:
* Drop the ALLOWED_TEST_FILES test; it's mostly just annoying.
* Drop other copyright notices from lzip.py and test_lzip.py.
* Use assert_diff helper in test_lzip.py.
* Pylint tests/test_source.py.
[ Mattia Rizzolo ]
* Add lzip to debian dependencies.
--remap-path-prefix solves this problem and has been used to great effect in build systems that rely on reproducibility (Bazel, Nix) to work at all and that there are efforts to teach cargo about it here.
The ctx hosted project on PyPI was taken over via user account compromise and replaced with a malicious project which contained runtime code which collected the content of os.environ.items() when instantiating Ctx objects. The captured environment variables were sent as a base64 encoded query parameter to a Heroku application [...]
As their announcement later goes on to state, version-pinning using hash-checking mode can prevent this attack, although this does depend on specific installations using this mode, rather than a prevention that can be applied systematically.
.jar may have been unnecessary given that diffoscope would have identified the, it must be said that there is something to be said for occasionally delving into seemingly low-level details, as well as describing any debugging process. Indeed, as vanitasvitae writes:
Yes, this would have spared me from 3h of debugging. But I probably would also not have gone onto this little dive into the JAR/ZIP format, so in the end I'm not mad.
KBUILD_BUILD_TIMESTAMP) in order to prepare my build with the known to disrupt code layout options disabled.
nondeterministic_checksum_generated_by_coq and nondetermistic_js_output_from_webpack.
After Holger Levsen found hundreds of packages in the bookworm distribution that lack .buildinfo files, he uploaded 404 source packages to the archive (with no meaningful source changes). Currently bookworm shows only 8 packages without .buildinfo files, and those 8 are fixed in unstable and should migrate shortly. By contrast, Debian unstable will always have packages without .buildinfo files, as this is how they come through the NEW queue. However, as these packages were not built on the official build servers (ie. they were uploaded by the maintainer) they will never migrate to Debian testing. In the future, therefore, testing should never have packages without .buildinfo files again.
Roland Clobus posted yet another in-depth status report about his progress making the Debian Live images build reproducibly to our mailing list. In this update, Roland mentions that all major desktops build reproducibly with bullseye, bookworm and sid but also goes on to outline the progress made with automated testing of the generated images using openQA.
FORCE_SOURCE_DATE=1 in the environment of all builds in order to fix numerous timestamp issues in documentation generation tools.
maradns package as it appears to embed a random prime number. (Patch)
This paper focuses on one research question: how can Guix (https://www.gnu.org/software/guix/) and similar systems allow users to securely update their software? [...] Our main contribution is a model and tool to authenticate new Git revisions. We further show how, building on Git semantics, we build protections against downgrade attacks and related threats. We explain implementation choices. This work has been deployed in production two years ago, giving us insight on its actual use at scale every day. The Git checkout authentication at its core is applicable beyond the specific use case of Guix, and we think it could benefit to developer teams that use Git.
A full PDF of the text is available.
215, 216 and 217 to Debian unstable. Chris Lamb also made the following changes:
--profile and we were killed via a TERM signal. This should help in situations where diffoscope is terminated due to some sort of timeout.
IndexError exceptions (in addition to ValueError) when parsing .pyc files. (#1012258)
argcomplete module.
readelf (ie. binutils), as it appears that this patch level version change resulted in a change of output, not the minor version.
@skip_unless_tool_is_at_least decorator (NB. at_least) over @skip_if_tool_version_is (NB. is) to fix tests under Debian stable.
TERM signal.
build-compare caused a regression for a few days.
python-fasttext (CPU-related issue).
node-dommatrix.
rtpengine.
sphinxcontrib-mermaid.
yaru-theme.
mapproxy (forwarded upstream).
libxsmm.
yt-dlp (forwarded upstream).
lz4, lzop and xz-utils packages on all nodes in order to detect running kernels.
SOURCE_DATE_EPOCH environment variable. In addition, Sebastian Crane very helpfully updated the screenshot of salsa.debian.org's "request access" button on the How to join the Salsa group.
#reproducible-builds on irc.oftc.net.
rb-general@lists.reproducible-builds.org
f-droid.org are built from source on our own builders. This is partly because F-Droid is backed by the free software community; that is, people who have engaged in the free software community long before Android was conceived, and, in particular, share many if not all of its values. Using F-Droid will therefore feel very familiar to anyone familiar with a modern Linux distribution.
fdroid verify command.)
Our signature & trust scheme means that F-Droid can verify that an app is 100% free software whilst still using the developer's original .APK file. More details about this may be found in our reproducibility documentation and on the page about our Verification Server.
if/else choice can be hard-coded, instead of being run-time evaluated every time. Such branches can be updated too (the kernel just rewrites the code to switch around the "branch"). All these principles apply to static calls as well, but they're for replacing indirect function calls (i.e. a call through a function pointer) with a direct call (i.e. a hard-coded call address). This eliminates the need for Spectre mitigations (e.g. RETPOLINE) for these indirect calls, and avoids a memory lookup for the pointer. For hot-path code (like the scheduler), this has a measurable performance impact. It also serves as a kind of Control Flow Integrity implementation: an indirect call got removed, and the potential destinations have been explicitly identified at compile-time.
network RNG improvements
CAP_SETGID (instead of to just any group), providing a way to keep the power of granting this capability much more limited. (This isn't complete yet, though, since handling setgroups() is still needed.)
improve kernel's internal checking of file contents
set_fs(), Christoph Hellwig made it possible for set_fs() to be optional for an architecture. He subsequently removed set_fs() entirely for x86, riscv, and powerpc. These architectures will now be free from the entire class of kernel address limit attacks that only needed to corrupt a single value in struct thread_info.
sysfs_emit() replaces sprintf() in /sys
sprintf() and snprintf() in /sys handlers by creating a new helper, sysfs_emit(). This will handle the cases where kernel code was not correctly dealing with the length results from sprintf() calls, which might lead to buffer overflows in the PAGE_SIZE buffer that /sys handlers operate on. With the helper in place, it was possible to start the refactoring of the many sprintf() callers.
nosymfollow mount option
nosymfollow mount option. This entirely disables symlink resolution for the given filesystem, similar to other mount options where noexec disallows execve(), nosuid disallows setid bits, and nodev disallows device files. Quoting the patch, it is "useful as a defensive measure for systems that need to deal with untrusted file systems in privileged contexts." (i.e. for when /proc/sys/fs/protected_symlinks isn't a big enough hammer.) Chrome OS uses this option for its stateful filesystem, as symlink traversal has been a common attack-persistence vector.
ARMv8.5 Memory Tagging Extension support
-Warray-bounds compiler flag and clear the path for saner bounds checking of array indexes and memcpy() usage.
That's it for now! Please let me know if you think anything else needs some attention. Next up is Linux v5.11.
2022, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 License.
# create table foo (id int, t text);
CREATE TABLE
# insert into foo values (1, 'Doc1');
INSERT 0 1
# insert into foo values (2, 'Doc2');
INSERT 0 1
# insert into foo values (3, 'Doc3');
INSERT 0 1
# select * from foo;
id t
1 Doc1
2 Doc2
3 Doc3
(3 rows)
# delete from foo where id < 3;
DELETE 2
# select * from foo;
id t
3 Doc3
(1 row)
Oops! The first two documents have disappeared.
Now let's use pg_dirtyread to look at the table:
# create extension pg_dirtyread;
CREATE EXTENSION
# select * from pg_dirtyread('foo') t(id int, t text);
id t
1 Doc1
2 Doc2
3 Doc3
All three documents are still there, but only one of them is visible.
pg_dirtyread can also show PostgreSQL's system columns with the row location and
visibility information. For the first two documents, xmax is set, which means
the row has been deleted:
# select * from pg_dirtyread('foo') t(ctid tid, xmin xid, xmax xid, id int, t text);
ctid xmin xmax id t
(0,1) 1577 1580 1 Doc1
(0,2) 1578 1580 2 Doc2
(0,3) 1579 0 3 Doc3
(3 rows)
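As a rough mental model of what the xmax column signifies, the following Python sketch treats a row version as visible when its xmax is 0. This is a toy and not PostgreSQL code: the names TupleVersion and is_visible are made up for illustration, and the rule is a deliberate simplification. Real PostgreSQL also checks whether the transactions in xmin and xmax committed and how they relate to the current snapshot, and a non-zero xmax can also come from a row lock or an aborted delete.

```python
from dataclasses import dataclass


@dataclass
class TupleVersion:
    """Hypothetical stand-in for one row version in the heap."""
    ctid: tuple  # (block, offset) row location
    xmin: int    # transaction that inserted the row version
    xmax: int    # transaction that deleted it, 0 if none
    id: int
    t: str


def is_visible(row: TupleVersion) -> bool:
    # Simplified assumption: no deleting transaction means the row is visible.
    return row.xmax == 0


# The heap contents as reported by pg_dirtyread in the example above.
heap = [
    TupleVersion((0, 1), 1577, 1580, 1, "Doc1"),
    TupleVersion((0, 2), 1578, 1580, 2, "Doc2"),
    TupleVersion((0, 3), 1579, 0, 3, "Doc3"),
]

# What a plain SELECT would return under the simplified rule.
visible = [row.t for row in heap if is_visible(row)]
# Row versions that are still stored but hidden from normal queries.
deleted = [row.t for row in heap if not is_visible(row)]
print(visible, deleted)
```

Under this simplified rule only Doc3 counts as visible, which matches the plain select shown earlier.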
Undelete
Caveat: I'm not promising any of the ideas quoted below will actually work in
practice. There are a few caveats and a good portion of intricate knowledge
about the PostgreSQL internals might be required to succeed properly. Consider
consulting your favorite PostgreSQL support channel for advice if you need to
recover data on any production system. Don't try this at work.
I always had plans to extend pg_dirtyread to include some "undelete" command to
make deleted rows reappear, but never got around to trying that. But rows can already be
restored by using the output of pg_dirtyread itself:
# insert into foo select * from pg_dirtyread('foo') t(id int, t text) where id = 1;
This is not a true "undelete", though - it just inserts new rows from the data
read from the table.
pg_surgery
Enter pg_surgery,
which is a new PostgreSQL extension supplied with PostgreSQL 14. It contains
two functions to "perform surgery on a damaged relation". As a side-effect,
they can also make deleted tuples reappear.
As I discovered now, one of the functions, heap_force_freeze(), works nicely
with pg_dirtyread. It takes a list of ctids (row locations) that it marks
"frozen", but at the same time as "not deleted".
Let's apply it to our test table, using the ctids that pg_dirtyread can read:
# create extension pg_surgery;
CREATE EXTENSION
# select heap_force_freeze('foo', array_agg(ctid))
from pg_dirtyread('foo') t(ctid tid, xmin xid, xmax xid, id int, t text) where id = 1;
heap_force_freeze
(1 row)
Et voilà, our deleted document is back:
# select * from foo;
id t
1 Doc1
3 Doc3
(2 rows)
# select * from pg_dirtyread('foo') t(ctid tid, xmin xid, xmax xid, id int, t text);
ctid xmin xmax id t
(0,1) 2 0 1 Doc1
(0,2) 1578 1580 2 Doc2
(0,3) 1579 0 3 Doc3
(3 rows)
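The change visible in the output above (xmin becomes 2, xmax becomes 0 for the row at (0,1)) can be mimicked with a small Python toy. It is only a sketch of the observable effect, not of how pg_surgery is implemented: force_freeze and the dictionary layout are hypothetical, the value 2 is PostgreSQL's FrozenTransactionId, and "visible" is simplified here to "xmax is 0".

```python
FROZEN_XID = 2  # PostgreSQL's FrozenTransactionId


def force_freeze(heap, ctids):
    """Toy model: mark the listed row versions as frozen and not deleted."""
    for row in heap:
        if row["ctid"] in ctids:
            row["xmin"] = FROZEN_XID
            row["xmax"] = 0


# Heap state before the call, as shown by pg_dirtyread earlier.
heap = [
    {"ctid": (0, 1), "xmin": 1577, "xmax": 1580, "id": 1, "t": "Doc1"},
    {"ctid": (0, 2), "xmin": 1578, "xmax": 1580, "id": 2, "t": "Doc2"},
    {"ctid": (0, 3), "xmin": 1579, "xmax": 0, "id": 3, "t": "Doc3"},
]

# Only the row with id = 1 is selected, as in the SQL example.
force_freeze(heap, {(0, 1)})

# Simplified visibility: rows without a deleting transaction.
visible_ids = [row["id"] for row in heap if row["xmax"] == 0]
print(visible_ids)
```

The row at (0,2) is left untouched, so Doc2 stays hidden just as in the real output.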
Disclaimer
Most importantly, none of the above methods will work if the data you just
deleted has already been purged by VACUUM or autovacuum. These actively zero
out reclaimed space. Restore from backup to get your data back.
Since both pg_dirtyread and pg_surgery operate outside the normal PostgreSQL
MVCC machinery, it's easy to create corrupt data using them. This includes
duplicated rows, duplicated primary key values, indexes being out of sync with
tables, broken foreign key constraints, and others. You have been warned.
pg_dirtyread does not work (yet) if the deleted rows contain any toasted
values. Possible other approaches include using pageinspect and pg_filedump
to retrieve the ctids of deleted rows.
Please make sure you have working backups and don't need any of the above.
Next.