Search Results: "rafael"

8 December 2022

Reproducible Builds: Reproducible Builds in November 2022

Welcome to yet another report from the Reproducible Builds project, this time for November 2022. In all of these reports (which we have been publishing regularly since May 2015) we attempt to outline the most important things that we have been up to over the past month. As always, if you interested in contributing to the project, please visit our Contribute page on our website.

Reproducible Builds Summit 2022 Following-up from last month s report about our recent summit in Venice, Italy, a comprehensive report from the meeting has not been finalised yet watch this space! As a very small preview, however, we can link to several issues that were filed about the website during the summit (#38, #39, #40, #41, #42, #43, etc.) and collectively learned about Software Bill of Materials (SBOM) s and how .buildinfo files can be seen/used as SBOMs. And, no less importantly, the Reproducible Builds t-shirt design has been updated

Reproducible Builds at European Cyber Week 2022 During the European Cyber Week 2022, a Capture The Flag (CTF) cybersecurity challenge was created by Fr d ric Pierret on the subject of Reproducible Builds. The challenge consisted in a pedagogical sense based on how to make a software release reproducible. To progress through the challenge issues that affect the reproducibility of build (such as build path, timestamps, file ordering, etc.) were to be fixed in steps in order to get the final flag in order to win the challenge. At the end of the competition, five people succeeded in solving the challenge, all of whom were awarded with a shirt. Fr d ric Pierret intends to create similar challenge in the form of a how to in the Reproducible Builds documentation, but two of the 2022 winners are shown here:

On business adoption and use of reproducible builds Simon Butler announced on the rb-general mailing list that the Software Quality Journal published an article called On business adoption and use of reproducible builds for open and closed source software. This article is an interview-based study which focuses on the adoption and uses of Reproducible Builds in industry, with a focus on investigating the reasons why organisations might not have adopted them:
[ ] industry application of R-Bs appears limited, and we seek to understand whether awareness is low or if significant technical and business reasons prevent wider adoption.
This is achieved through interviews with software practitioners and business managers, and touches on both the business and technical reasons supporting the adoption (or not) of Reproducible Builds. The article also begins with an excellent explanation and literature review, and even introduces a new helpful analogy for reproducible builds:
[Users are] able to perform a bitwise comparison of the two binaries to verify that they are identical and that the distributed binary is indeed built from the source code in the way the provider claims. Applied in this manner, R-Bs function as a canary, a mechanism that indicates when something might be wrong, and offer an improvement in security over running unverified binaries on computer systems.
The full paper is available to download on an open access basis. Elsewhere in academia, Beatriz Michelson Reichert and Rafael R. Obelheiro have published a paper proposing a systematic threat model for a generic software development pipeline identifying possible mitigations for each threat (PDF). Under the Tampering rubric of their paper, various attacks against Continuous Integration (CI) processes:
An attacker may insert a backdoor into a CI or build tool and thus introduce vulnerabilities into the software (resulting in an improper build). To avoid this threat, it is the developer s responsibility to take due care when making use of third-party build tools. Tampered compilers can be mitigated using diversity, as in the diverse double compiling (DDC) technique. Reproducible builds, a recent research topic, can also provide mitigation for this problem. (PDF)

Misc news
On our mailing list this month:

Debian & other Linux distributions Over 50 reviews of Debian packages were added this month, another 48 were updated and almost 30 were removed, all of which adds to our knowledge about identified issues. Two new issue types were added as well. [ ][ ]. Vagrant Cascadian announced on our mailing list another online sprint to help clear the huge backlog of reproducible builds patches submitted by performing NMUs (Non-Maintainer Uploads). The first such sprint took place on September 22nd, but others were held on October 6th and October 20th. There were two additional sprints that occurred in November, however, which resulted in the following progress: Lastly, Roland Clobus posted his latest update of the status of reproducible Debian ISO images on our mailing list. This reports that all major desktops build reproducibly with bullseye, bookworm and sid as well as that no custom patches needed to applied to Debian unstable for this result to occur. During November, however, Roland proposed some modifications to live-setup and the rebuild script has been adjusted to fix the failing Jenkins tests for Debian bullseye [ ][ ].
In other news, Miro Hron ok proposed a change to clamp build modification times to the value of SOURCE_DATE_EPOCH. This was initially suggested and discussed on a devel@ mailing list post but was later written up on the Fedora Wiki as well as being officially proposed to Fedora Engineering Steering Committee (FESCo).

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 226 and 227 to Debian:
  • Support both python3-progressbar and python3-progressbar2, two modules providing the progressbar Python module. [ ]
  • Don t run Python decompiling tests on Python bytecode that file(1) cannot detect yet and Python 3.11 cannot unmarshal. (#1024335)
  • Don t attempt to attach text-only differences notice if there are no differences to begin with. (#1024171)
  • Make sure we recommend apksigcopier. [ ]
  • Tidy generation of os_list. [ ]
  • Make the code clearer around generating the Debian substvars . [ ]
  • Use our assert_diff helper in test_lzip.py. [ ]
  • Drop other copyright notices from lzip.py and test_lzip.py. [ ]
In addition to this, Christopher Baines added lzip support [ ], and FC Stegerman added an optimisation whereby we don t run apktool if no differences are detected before the signing block [ ].
A significant number of changes were made to the Reproducible Builds website and documentation this month, including Chris Lamb ensuring the openEuler logo is correctly visible with a white background [ ], FC Stegerman de-duplicated by email address to avoid listing some contributors twice [ ], Herv Boutemy added Apache Maven to the list of affiliated projects [ ] and boyska updated our Contribute page to remark that the Reproducible Builds presence on salsa.debian.org is not just the Git repository but is also for creating issues [ ][ ]. In addition to all this, however, Holger Levsen made the following changes:
  • Add a number of existing publications [ ][ ] and update metadata for some existing publications as well [ ].
  • Hide draft posts on the website homepage. [ ]
  • Add the Warpforge build tool as a participating project of the summit. [ ]
  • Clarify in the footer that we welcome patches to the website repository. [ ]

Testing framework The Reproducible Builds project operates a comprehensive testing framework at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In October, the following changes were made by Holger Levsen:
  • Improve the generation of meta package sets (used in grouping packages for reporting/statistical purposes) to treat Debian bookworm as equivalent to Debian unstable in this specific case [ ] and to parse the list of packages used in the Debian cloud images [ ][ ][ ].
  • Temporarily allow Frederic to ssh(1) into our snapshot server as the jenkins user. [ ]
  • Keep some reproducible jobs Jenkins logs much longer [ ] (later reverted).
  • Improve the node health checks to detect failures to update the Debian cloud image package set [ ][ ] and to improve prioritisation of some kernel warnings [ ].
  • Always echo any IRC output to Jenkins output as well. [ ]
  • Deal gracefully with problems related to processing the cloud image package set. [ ]
Finally, Roland Clobus continued his work on testing Live Debian images, including adding support for specifying the origin of the Debian installer [ ] and to warn when the image has unmet dependencies in the package list (e.g. due to a transition) [ ].
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can get in touch with us via:

9 November 2022

Debian Brasil: Brasileiros(as) Mantenedores(as) e Desenvolvedores(as) Debian a partir de julho de 2015

Desde de setembro de 2015, o time de publicidade do Projeto Debian passou a publicar a cada dois meses listas com os nomes dos(as) novos(as) Desenvolvedores(as) Debian (DD - do ingl s Debian Developer) e Mantenedores(as) Debian (DM - do ingl s Debian Maintainer). Estamos aproveitando estas listas para publicar abaixo os nomes dos(as) brasileiros(as) que se tornaram Desenvolvedores(as) e Mantenedores(as) Debian a partir de julho de 2015. Desenvolvedores(as) Debian / Debian Developers / DDs: Marcos Talau Fabio Augusto De Muzio Tobich Gabriel F. T. Gomes Thiago Andrade Marques M rcio de Souza Oliveira Paulo Henrique de Lima Santana Samuel Henrique S rgio Durigan J nior Daniel Lenharo de Souza Giovani Augusto Ferreira Adriano Rafael Gomes Breno Leit o Lucas Kanashiro Herbert Parentes Fortes Neto Mantenedores(as) Debian / Debian Maintainers / DMs: Guilherme de Paula Xavier Segundo David da Silva Polverari Paulo Roberto Alves de Oliveira Sergio Almeida Cipriano Junior Francisco Vilmar Cardoso Ruviaro William Grzybowski Tiago Ilieve
Observa es:
  1. Esta lista ser atualizada quando o time de publicidade do Debian publicar novas listas com DMs e DDs e tiver brasileiros.
  2. Para ver a lista completa de Mantenedores(as) e Desenvolvedores(as) Debian, inclusive outros(as) brasileiros(as) antes de julho de 2015 acesse: https://nm.debian.org/public/people

Debian Brasil: Brasileiros(as) Mantenedores(as) e Desenvolvedores(as) Debian a partir de julho de 2015

Desde de setembro de 2015, o time de publicidade do Projeto Debian passou a publicar a cada dois meses listas com os nomes dos(as) novos(as) Desenvolvedores(as) Debian (DD - do ingl s Debian Developer) e Mantenedores(as) Debian (DM - do ingl s Debian Maintainer). Estamos aproveitando estas listas para publicar abaixo os nomes dos(as) brasileiros(as) que se tornaram Desenvolvedores(as) e Mantenedores(as) Debian a partir de julho de 2015. Desenvolvedores(as) Debian / Debian Developers / DDs: Marcos Talau Fabio Augusto De Muzio Tobich Gabriel F. T. Gomes Thiago Andrade Marques M rcio de Souza Oliveira Paulo Henrique de Lima Santana Samuel Henrique S rgio Durigan J nior Daniel Lenharo de Souza Giovani Augusto Ferreira Adriano Rafael Gomes Breno Leit o Lucas Kanashiro Herbert Parentes Fortes Neto Mantenedores(as) Debian / Debian Maintainers / DMs: Guilherme de Paula Xavier Segundo David da Silva Polverari Paulo Roberto Alves de Oliveira Sergio Almeida Cipriano Junior Francisco Vilmar Cardoso Ruviaro William Grzybowski Tiago Ilieve
Observa es:
  1. Esta lista ser atualizada quando o time de publicidade do Debian publicar novas listas com DMs e DDs e tiver brasileiros.
  2. Para ver a lista completa de Mantenedores(as) e Desenvolvedores(as) Debian, inclusive outros(as) brasileiros(as) antes de julho de 2015 acesse: https://nm.debian.org/public/people

15 May 2021

Utkarsh Gupta: Hello, Canonical! o/

Today marks the 90th day of me joining Canonical to work on Ubuntu full-time! So since it s been a while already, this blog post is long due. :)

The News
I joined Canonical, this February, to work on Ubuntu full-time! \o/
Those who know, they know that this is really very exciting for me because Canonical has been a dream company for me, for real (more about this below!). And hey, this is my first job, ever, so all the more reason to be psyched about, isn t it? ^_^ P.S. Keep reading and we ll meet my squad really sooon!

The Story Being an undergrad student (batch 2017-2021), I ve been slightly worried during my last two semesters, naturally, thinking about how s it all gonna pan out and what will I be doing, et al, because I ve been seeing all my friends and batchmates getting placed in companies or going for masters or at least having some sort of plans for their future and I, on the other hand, was hopelessly clueless. :D Well, to be fair, I did Google Summer of Code twice, in 2019 and 2020, became a Debian Developer in 2019, been a part of GCI and Outreachy, contributed to over dozens of open-source projects, et al, et al. So I wasn t all completely hopeless but for sure was completely clueless , heh. And for full disclosure, I was only slightly panicking because firstly, I did get placed in several companies and secondly, I didn t really need a job immediately since I was already getting paid to work on Debian stuff by Freexian, which was good enough. :)
(and honestly, Freexian has my whole heart! - more on that later sometime.) But that s not the point. I was still confused and worried and my mom & dad, more so than anyone. Ugh. We were all figuring out and she asked me places that I was interested to work in. And whilst I wasn t clear about things I wanted to do (and still am!) but I was (very) clear about this and so I told her about Canonical and also did tell her that it s a bit too ambitious for me to think about it now so I ll probably apply after some experience or something. and as they say, the world works in mysterious ways and well, it did for me! So back during the Ruby sprints (Feb 20), Kanashiro, the guy ( ), mentioned that his team was hiring and has a vacant position but I won t be eligible since I was still in my junior year. It was since then I ve been actively praying for Cronus, the god of time, to wave his magic wand and align it in such a way that the next opening should be somewhere near my graduation. And guess what? IT HAPPENED! 9 months later, in November 20, Kanashiro told me his team is hiring yet again and that I could apply this time! Without much (since there was some ) delay, I applied and started asking all sorts of questions to Kanashiro. No words are enough for him, he literally helped me throughout the process; from referring me to answering all sorts of doubts I had! And roughly after 2 months of interviewing, et al, my ambitious dream did come true and I finalyyyy signed my contract! \o/
(the interview process and what went on during those 10 weeks is a story for later ;))

The Server Team! \o This position, which I didn t mention earlier, was for the Server Team which is a team of 15 people, working to make Ubuntu server the best! And as I tweeted sometime back, the team is absolutely lovely, super kind, and consists of the best of teammates one could possibly ask for! Here s a quick sneak peek into our weekly team meeting. Thanks to Rafael for taking such a lovely picture. And yes, the cat Luna is a part of our squad! And oh, did I mention that we re completely remote and distributed?
FUN FACT: Our team covers all the TZs, that is, at any point of time (during weekdays), you ll find someone or the other from the team around! \o/ Anyway, our squad, managed by Rick is divided into two halves: Squeaky Wheels and Table Flip. Cool names, right?
Squeaky Wheels does the distro side of stuff and consists of Christian, Andreas, Rafael, Robie, Bryce, Sergio, Kanashiro, Athos, and now myself as well! And OTOH, Table Flip consists of Dan, Chad, Paride, Lucas, James, and Grant. Even though I interact w/ Squeaky Wheels more (basically daily), each of my teammates is absolutely lovely and equally awesome! Whilst I ll talk more about things here in the upcoming months, this is it for now! If there s anything, in particular, you d like to know more about, let me know! And lastly, here s us vibing our way through, making Ubuntu server better, cause that s how we roll!
Until next time.
:wq for today.

3 November 2016

Bits from Debian: New Debian Developers and Maintainers (September and October 2016)

The following contributors got their Debian Developer accounts in the last two months: The following contributors were added as Debian Maintainers in the last two months: Congratulations!

5 October 2016

Kees Cook: security things in Linux v4.8

Previously: v4.7. Here are a bunch of security things I m excited about in Linux v4.8: SLUB freelist ASLR Thomas Garnier continued his freelist randomization work by adding SLUB support. x86_64 KASLR text base offset physical/virtual decoupling On x86_64, to implement the KASLR text base offset, the physical memory location of the kernel was randomized, which resulted in the virtual address being offset as well. Due to how the kernel s -2GB addressing works (gcc s -mcmodel=kernel ), it wasn t possible to randomize the physical location beyond the 2GB limit, leaving any additional physical memory unused as a randomization target. In order to decouple the physical and virtual location of the kernel (to make physical address exposures less valuable to attackers), the physical location of the kernel needed to be randomized separately from the virtual location. This required a lot of work for handling very large addresses spanning terabytes of address space. Yinghai Lu, Baoquan He, and I landed a series of patches that ultimately did this (and in the process fixed some other bugs too). This expands the physical offset entropy to roughly $physical_memory_size_of_system / 2MB bits. x86_64 KASLR memory base offset Thomas Garnier rolled out KASLR to the kernel s various statically located memory ranges, randomizing their locations with CONFIG_RANDOMIZE_MEMORY. One of the more notable things randomized is the physical memory mapping, which is a known target for attacks. Also randomized is the vmalloc area, which makes attacks against targets vmalloced during boot (which tend to always end up in the same location on a given system) are now harder to locate. (The vmemmap region randomization accidentally missed the v4.8 window and will appear in v4.9.) x86_64 KASLR with hibernation Rafael Wysocki (with Thomas Garnier, Borislav Petkov, Yinghai Lu, Logan Gunthorpe, and myself) worked on a number of fixes to hibernation code that, even without KASLR, were coincidentally exposed by the earlier W^X fix. With that original problem fixed, then memory KASLR exposed more problems. I m very grateful everyone was able to help out fixing these, especially Rafael and Thomas. It s a hard place to debug. The bottom line, now, is that hibernation and KASLR are no longer mutually exclusive. gcc plugin infrastructure Emese Revfy ported the PaX/Grsecurity gcc plugin infrastructure to upstream. If you want to perform compiler-based magic on kernel builds, now it s much easier with CONFIG_GCC_PLUGINS! The plugins live in scripts/gcc-plugins/. Current plugins are a short example called Cyclic Complexity which just emits the complexity of functions as they re compiled, and Sanitizer Coverage which provides the same functionality as gcc s recent -fsanitize-coverage=trace-pc but back through gcc 4.5. Another notable detail about this work is that it was the first Linux kernel security work funded by Linux Foundation s Core Infrastructure Initiative. I m looking forward to more plugins! If you re on Debian or Ubuntu, the required gcc plugin headers are available via the gcc-$N-plugin-dev package (and similarly for all cross-compiler packages). hardened usercopy Along with work from Rik van Riel, Laura Abbott, Casey Schaufler, and many other folks doing testing on the KSPP mailing list, I ported part of PAX_USERCOPY (the basic runtime bounds checking) to upstream as CONFIG_HARDENED_USERCOPY. One of the interface boundaries between the kernel and user-space are the copy_to_user()/copy_from_user() family of functions. Frequently, the size of a copy is known at compile-time ( built-in constant ), so there s not much benefit in checking those sizes (hardened usercopy avoids these cases). In the case of dynamic sizes, hardened usercopy checks for 3 areas of memory: slab allocations, stack allocations, and kernel text. Direct kernel text copying is simply disallowed. Stack copying is allowed as long as it is entirely contained by the current stack memory range (and on x86, only if it does not include the saved stack frame and instruction pointers). For slab allocations (e.g. those allocated through kmem_cache_alloc() and the kmalloc()-family of functions), the copy size is compared against the size of the object being copied. For example, if copy_from_user() is writing to a structure that was allocated as size 64, but the copy gets tricked into trying to write 65 bytes, hardened usercopy will catch it and kill the process. For testing hardened usercopy, lkdtm gained several new tests: USERCOPY_HEAP_SIZE_TO, USERCOPY_HEAP_SIZE_FROM, USERCOPY_STACK_FRAME_TO,
USERCOPY_STACK_FRAME_FROM, USERCOPY_STACK_BEYOND, and USERCOPY_KERNEL. Additionally, USERCOPY_HEAP_FLAG_TO and USERCOPY_HEAP_FLAG_FROM were added to test what will be coming next for hardened usercopy: flagging slab memory as safe for copy to/from user-space , effectively whitelisting certainly slab caches, as done by PAX_USERCOPY. This further reduces the scope of what s allowed to be copied to/from, since most kernel memory is not intended to ever be exposed to user-space. Adding this logic will require some reorganization of usercopy code to add some new APIs, as PAX_USERCOPY s approach to handling special-cases is to add bounce-copies (copy from slab to stack, then copy to userspace) as needed, which is unlikely to be acceptable upstream. seccomp reordered after ptrace By its original design, seccomp filtering happened before ptrace so that seccomp-based ptracers (i.e. SECCOMP_RET_TRACE) could explicitly bypass seccomp filtering and force a desired syscall. Nothing actually used this feature, and as it turns out, it s not compatible with process launchers that install seccomp filters (e.g. systemd, lxc) since as long as the ptrace and fork syscalls are allowed (and fork is needed for any sensible container environment), a process could spawn a tracer to help bypass a filter by injecting syscalls. After Andy Lutomirski convinced me that ordering ptrace first does not change the attack surface of a running process (unless all syscalls are blacklisted, the entire ptrace attack surface will always be exposed), I rearranged things. Now there is no (expected) way to bypass seccomp filters, and containers with seccomp filters can allow ptrace again. That s it for v4.8! The merge window is open for v4.9

2016, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.
Creative Commons License

9 August 2016

Reproducible builds folks: Reproducible builds: week 67 in Stretch cycle

What happened in the Reproducible Builds effort between Sunday July 31 and Saturday August 6 2016: Toolchain development and fixes Packages fixed and bugs filed The following 24 packages have become reproducible - in our current test setup - due to changes in their build-dependencies: alglib aspcud boomaga fcl flute haskell-hopenpgp indigo italc kst ktexteditor libgroove libjson-rpc-cpp libqes luminance-hdr openscenegraph palabos petri-foo pgagent sisl srm-ifce vera++ visp x42-plugins zbackup The following packages have become reproducible after being fixed: The following newly-uploaded packages appear to be reproducible now, for reasons we were not able to figure out. (Relevant changelogs did not mention reproducible builds.) Some uploads have addressed some reproducibility issues, but not all of them: Patches submitted that have not made their way to the archive yet: Package reviews and QA These are reviews of reproduciblity issues of Debian packages. 276 package reviews have been added, 172 have been updated and 44 have been removed in this week. 7 FTBFS bugs have been reported by Chris Lamb. Reproducibility tools Test infrastructure For testing the impact of allowing variations of the buildpath (which up until now we required to be identical for reproducible rebuilds), Reiner Herrmann contribed a patch which enabled build path variations on testing/i386. This is possible now since dpkg 1.18.10 enables the --fixdebugpath build flag feature by default, which should result in reproducible builds (for C code) even with varying paths. So far we haven't had many results due to disturbances in our build network in the last days, but it seems this would mean roughly between 5-15% additional unreproducible packages - compared to what we see now. We'll keep you updated on the numbers (and problems with compilers and common frameworks) as we find them. lynxis continued work to test LEDE and OpenWrt on two different hosts, to include date variation in the tests. Mattia and Holger worked on the (mass) deployment scripts, so that the - for space reasons - only jenkins.debian.net GIT clone resides in ~jenkins-adm/ and not anymore in Holger's homedir, so that soon Mattia (and possibly others!) will be able to fully maintain this setup, while Holger is doing siesta. Miscellaneous Chris, dkg, h01ger and Ximin attended a Core Infrastricture Initiative summit meeting in New York City, to discuss and promote this Reproducible Builds project. The CII was set up in the wake of the Heartbleed SSL vulnerability to support software projects that are critical to the functioning of the internet. This week's edition was written by Ximin Luo and Holger Levsen and reviewed by a bunch of Reproducible Builds folks on IRC.

10 July 2016

Bits from Debian: New Debian Developers and Maintainers (May and June 2016)

The following contributors got their Debian Developer accounts in the last two months: The following contributors were added as Debian Maintainers in the last two months: Congratulations!

12 April 2016

Reproducible builds folks: Reproducible builds: week 48 in Stretch cycle

What happened in the reproducible builds effort between March 20th and March 26th: Toolchain fixes Daniel Kahn Gillmor worked on removing build path from build symbols submitting a patch adding -fdebug-prefix-map to clang to match GCC, another patch against gcc-5 to backport the removal of -fdebug-prefix-map from DW_AT_producer, and finally by proposing the addition of a normalizedebugpath to the reproducible feature set of dpkg-buildflags that would use -fdebug-prefix-map to replace the current directory with . using -fdebug-prefix-map. Sergey Poznyakoff merged the --clamp-mtime option so that it will be featured in the next Tar release. This option is likely to be used by dpkg-deb to implement deterministic mtimes for packaged files. Packages fixed The following packages have become reproducible due to changes in their build dependencies: augeas, gmtkbabel, ktikz, octave-control, octave-general, octave-image, octave-ltfat, octave-miscellaneous, octave-mpi, octave-nurbs, octave-octcdf, octave-sockets, octave-strings, openlayers, python-structlog, signond. The following packages became reproducible after getting fixed: Some uploads fixed some reproducibility issues, but not all of them: Patches submitted which have not made their way to the archive yet: tests.reproducible-builds.org i386 build nodes have been setup by converting 2 of the 4 amd64 nodes to i386. (h01ger) Package reviews 92 reviews have been removed, 66 added and 31 updated in the previous week. New issues: timestamps_generated_by_xbean_spring, timestamps_generated_by_mangosdk_spiprocessor. Chris Lamb filed 7 FTBFS bugs. Misc. On March 20th, Chris Lamb gave a talk at FOSSASIA 2016 in Singapore. The very same day, but a few timezones apart, h01ger did a presentation at LibrePlanet 2016 in Cambridge, Massachusetts. Seven GSoC/Outreachy applications were made by potential interns to work on various aspects of the reproducible builds effort. On top of interacting with several applicants, prospective mentors gathered to review the applications.

27 March 2016

Lunar: Reproducible builds: week 48 in Stretch cycle

What happened in the reproducible builds effort between March 20th and March 26th:

Toolchain fixes
  • Sebastian Ramacher uploaded breathe/4.2.0-1 which makes its output deterministic. Original patch by Chris Lamb, merged uptream.
  • Rafael Laboissiere uploaded octave/4.0.1-1 which allows packages to be built in place and avoid unreproducible builds due to temporary build directories appearing in the .oct files.
Daniel Kahn Gillmor worked on removing build path from build symbols submitting a patch adding -fdebug-prefix-map to clang to match GCC, another patch against gcc-5 to backport the removal of -fdebug-prefix-map from DW_AT_producer, and finally by proposing the addition of a normalizedebugpath to the reproducible feature set of dpkg-buildflags that would use -fdebug-prefix-map to replace the current directory with . using -fdebug-prefix-map. As succesful result of lobbying at LibrePlanet 2016, the --clamp-mtime option will be featured in the next Tar release. This option is likely to be used by dpkg-deb to implement deterministic mtimes for packaged files.

Packages fixed The following packages have become reproducible due to changes in their build dependencies: augeas, gmtkbabel, ktikz, octave-control, octave-general, octave-image, octave-ltfat, octave-miscellaneous, octave-mpi, octave-nurbs, octave-octcdf, octave-sockets, octave-strings, openlayers, python-structlog, signond. The following packages became reproducible after getting fixed: Some uploads fixed some reproducibility issues, but not all of them: Patches submitted which have not made their way to the archive yet:
  • #818742 on milkytracker by Reiner Herrmann: sorts the list of source files.
  • #818752 on tcl8.4 by Reiner Herrmann: sort source files using C locale.
  • #818753 on tk8.6 by Reiner Herrmann: sort source files using C locale.
  • #818754 on tk8.5 by Reiner Herrmann: sort source files using C locale.
  • #818755 on tk8.4 by Reiner Herrmann: sort source files using C locale.
  • #818952 on marionnet by ceridwen: dummy out build date and uname to make build reproducible.
  • #819334 on avahi by Reiner Herrmann: ship upstream changelog instead of the one generated by gettextize (although duplicate of #804141 by Santiago Vila).

tests.reproducible-builds.org i386 build nodes have been setup by converting 2 of the 4 amd64 nodes to i386. (h01ger)

Package reviews 92 reviews have been removed, 66 added and 31 updated in the previous week. New issues: timestamps_generated_by_xbean_spring, timestamps_generated_by_mangosdk_spiprocessor. Chris Lamb filed 7 FTBFS bugs.

Misc. On March 20th, Chris Lamb gave a talk at FOSSASIA 2016 in Singapore. The very same day, but a few timezones apart, h01ger did a presentation at LibrePlanet 2016 in Cambridge, Massachusetts. Seven GSoC/Outreachy applications were made by potential interns to work on various aspects of the reproducible builds effort. On top of interacting with several applicants, prospective mentors gathered to review the applications. Huge thanks to Linda Naeun Lee for the new hackergotchi visible on Planet Debian.

25 January 2016

Antoine Beaupr : Internet in Cuba

A lot has been written about the Internet in Cuba over the years. I have read a few articles, from New York Times' happy support for Google's invasion of Cuba to RSF's dramatic and fairly outdated report about censorship in Cuba. Having written before about Internet censorship in Tunisia, I was curious to see if I could get a feel of what it is like over there, now that a new Castro is in power and the Obama administration has started restoring diplomatic ties with Cuba. With those political changes coming signifying the end of an embargo that has been called genocidal by the Cuban government, it is surprisingly difficult to get fresh information about the current state of affairs. This article aims to fill that gap in clarifying how the internet works in Cuba, what kind of censorship mechanisms are in place and how to work around them. It also digs more technically into the network architecture and performance. It is published in the hope of providing both Cubans and the rest of the world with a better understanding of their network and, if possible, Cubans ways to access the internet more cheaply or without censorship.

"Censorship" and workarounds Unfortunately, I have been connected to the internet only through the the Varadero airport and the WiFi of a "full included" resort near Jibacoa. I have to come to assume that this network is likely to be on a segregated, uncensored internet while the rest of the country suffers the wrath of the Internet censorship in Cuba I have seen documented elsewhere. Through my research, I couldn't find any sort of direct censorship. The Netalyzr tool couldn't find anything significantly wrong with the connection, other than the obvious performance problems related both to the overloaded uplinks of the Cuban internet. I ran an incomplete OONI probe as well, and it seems there was no obvious censorship detected there as well, at least according to folks in the helpful #ooni IRC channel. Tor also works fine, and could be a great way to avoid the global surveillance system described later in this article. Nevertheless, it still remains to be seen how the internet is censored in the "real" Cuban internet, outside of the tourist designated areas - hopefully future visitors or locals can expand on this using the tools mentioned above, using the regular internet. Usual care should be taken when using any workaround tools, mentioned in this post or not, as different regimes over the world have accused, detained, tortured and killed sometimes for the mere fact of using or distributing circumvention tools. For example, a Russian developer was arrested and detained in 2001 by United States' FBI for exposing vulnerabilities in the Adobe e-books copy protection mechanisms. Similarly, people distributing Tor and other tools have been arrested during the period prior to the revolution in Tunisia.

The Cuban captive portal There is, however, a more pernicious and yet very obvious censorship mechanism at work in Cuba: to get access to the internet, you have to go through what seems to be a state-wide captive portal, which I have seen both at the hotel and the airport. It is presumably deployed at all levels of the internet access points. To get credentials through that portal, you need a username and password which you get by buying a Nauta card. Those cards cost 2$CUC and get you an hour of basically unlimited internet access. That may not seem like a lot for a rich northern hotel party-goer, but for Cubans, it's a lot of money, given that the average monthly salary is around 20$CUC. The system is also pretty annoying to use, because it means you do not get continuous network access: every hour, you need to input a new card, which will obviously make streaming movies and other online activities annoying. It also makes hosting servers basically impossible. So while Cuba does not have, like China or Iran, a "great firewall", there is definitely a big restriction to going online in Cuba. Indeed, it seems to be how the government ensures that Cubans do not foment too much dissent online: keep the internet slow and inaccessible, and you won't get too many Arab spring / blogger revolutions.

Bypassing the Cuban captive portal The good news is that it is perfectly possible for Cubans (or at least for a tourist like me with resources outside of the country) to bypass the captive portal. Like many poorly implemented portals, the portal allows DNS traffic to go through, which makes it possible to access the global network for free by using a tool like iodine which tunnels IP traffic over DNS requests. Of course, the bandwidth and reliability of the connection you get through such a portal is pretty bad. I have regularly seen 80% packet loss and over two minutes of latency:
--- 10.0.0.1 ping statistics ---
163 packets transmitted, 31 received, 80% packet loss, time 162391ms
rtt min/avg/max/mdev = 133.700/2669.535/64188.027/11257.336 ms, pipe 65
Still, it allowed me to login to my home server through SSH using Mosh to workaround the reliability issues. Every once in a while, mosh would get stuck and keep on trying to send packets to probe the server, which would clog the connection even more. So I regularly had to restart the whole stack using these commands:
killall iodine # stop DNS tunnel
nmcli n off # turn off wifi to change MAC address
macchanger -A wlan0 # change MAC address
nmcli n on # turn wifi back on
sleep 3 # wait for wifi to settle
iodine-client-start # restart DNS tunnel
The Koumbit Wiki has good instructions on how to setup a DNS tunnel. I am wondering if such a public service could be of use for Cubans, although I am not sure how it could be deployed only for Cubans, and what kind of traffic it could support... The fact is that iodine does require a server to operate, and that server must be run on the outside of the censored perimeter, something that Cubans may not be able to afford in the first place. Another possible way to save money with the captive portal would be to write something that automates connecting and disconnecting from the portal. You would feed that program a list of credentials and it would connect to the portal only on demand, and disconnect as soon as no traffic goes through. There are details on the implementation of the captive portal below that may help future endeavours in that field.

Private information revealed to the captive portal It should be mentioned, however, that the captive portal has a significant amount of information on clients, which is a direct threat to the online privacy of Cuban internet users. Of course the unique identifiers issued with the Nauta cards can be correlated with your identity, right from the start. For example, I had to give my room number to get a Nauta card issued. Then the central portal also knows which access point you are connected to. For example, the central portal I was connected to Wifi_Memories_Jibacoa which, for anyone that cares to research, will give them a location of about 20 square meters where I was located when connected (there is only one access point in the whole hotel). Finally, the central portal also knows my MAC address, a unique identifier for the computer I am using which also reveals which brand of computer I am using (Mac, Lenovo, etc). While this address can be changed, very few people know that, let alone how. This led me to question whether I would be allowed back in Cuba (or even allowed out!) after publishing this blog post, as it is obvious that I can be easily identified based on the time this article was published, my name and other details. Hopefully the Cuban government will either not notice or not care, but this can be a tricky situation, obviously. I have heard that Cuban prisons are not the best hangout place in Cuba, to say the least...

Network configuration assessment This section is more technical and delves more deeply in the Cuban internet to analyze the quality and topology of the network, along with hints as to which hardware and providers are being used to support the Cuban government.

Line quality The internet is actually not so bad in the hotel. Again, this may be because of the very fact that I am in that hotel, and I get a privileged access to the new fiber line to Venezuela, the ALBA-1 link. The line speed I get is around 1mbps, according to speedtest, which selected a server from LIME in George Town, Cayman Islands:
[1034]anarcat@angela:cuba$ speedtest
Retrieving speedtest.net configuration...
Retrieving speedtest.net server list...
Testing from Empresa de Telecomunicaciones de Cuba (152.206.92.146)...
Selecting best server based on latency...
Hosted by LIME (George Town) [391.78 km]: 317.546 ms
Testing download speed........................................
Download: 1.01 Mbits/s
Testing upload speed..................................................
Upload: 1.00 Mbits/s
Latency to the rest of the world is of couse slow:
--- koumbit.org ping statistics ---
122 packets transmitted, 120 received, 1,64% packet loss, time 18731,6ms
rtt min/avg/max/sdev = 127,457/156,097/725,211/94,688 ms
--- google.com ping statistics ---
122 packets transmitted, 121 received, 0,82% packet loss, time 19371,4ms
rtt min/avg/max/sdev = 132,517/160,095/724,971/93,273 ms
--- redcta.org.ar ping statistics ---
122 packets transmitted, 120 received, 1,64% packet loss, time 40748,6ms
rtt min/avg/max/sdev = 303,035/339,572/965,092/97,503 ms
--- ccc.de ping statistics ---
122 packets transmitted, 72 received, 40,98% packet loss, time 19560,2ms
rtt min/avg/max/sdev = 244,266/271,670/594,104/61,933 ms
Interestingly, Koumbit is actually the closest host in the above test. It could be that Canadian hosts are less affected by bandwidth problems compared to US hosts because of the embargo.

Network topology The various traceroutes show a fairly odd network topology, but that is typical of what I would described as "colonized internet users", which have layers and layers of NAT and obscure routing that keep them from the real internet. Just like large corporations are implementing NAT in a large scale, Cuba seems to have layers and layers of private RFC 1918 IPv4 space. A typical traceroute starts with:
traceroute to koumbit.net (199.58.80.33), 30 hops max, 60 byte packets
 1  10.156.41.1 (10.156.41.1)  9.724 ms  9.472 ms  9.405 ms
 2  192.168.134.137 (192.168.134.137)  16.089 ms  15.612 ms  15.509 ms
 3  172.31.252.113 (172.31.252.113)  15.350 ms  15.805 ms  15.358 ms
 4  pos6-0-0-agu-cr-1.mpls.enet.cu (172.31.253.197)  15.286 ms  14.832 ms  14.405 ms
 5  172.31.252.29 (172.31.252.29)  13.734 ms  13.685 ms  14.485 ms
 6  200.0.16.130 (200.0.16.130)  14.428 ms  11.393 ms  10.977 ms
 7  200.0.16.74 (200.0.16.74)  10.738 ms  10.019 ms  10.326 ms
 8  ix-11-3-1-0.tcore1.TNK-Toronto.as6453.net (64.86.33.45)  108.577 ms  108.449 ms
Let's take this apart line by line:
 1  10.156.41.1 (10.156.41.1)  9.724 ms  9.472 ms  9.405 ms
This is my local gateway, probably the hotel's wifi router.
 2  192.168.134.137 (192.168.134.137)  16.089 ms  15.612 ms  15.509 ms
This is likely not very far from the local gateway, probably still in Cuba. It in one bit away from the captive portal IP address (see below) so it is very likely related to the captive portal implementation.
 3  172.31.252.113 (172.31.252.113)  15.350 ms  15.805 ms  15.358 ms
 4  pos6-0-0-agu-cr-1.mpls.enet.cu (172.31.253.197)  15.286 ms  14.832 ms  14.405 ms
 5  172.31.252.29 (172.31.252.29)  13.734 ms  13.685 ms  14.485 ms
All those are withing RFC 1918 space. Interestingly, the Cuban DNS servers resolve one of those private IPs as within Cuban space, on line #4. That line is interesting because it reveals the potential use of MPLS.
 6  200.0.16.130 (200.0.16.130)  14.428 ms  11.393 ms  10.977 ms
 7  200.0.16.74 (200.0.16.74)  10.738 ms  10.019 ms  10.326 ms
Those two lines are the only ones that actually reveal that the route belongs in Cuba at all. Both IPs are in a tiny (/24, or 256 IP addresses) network allocated to ETECSA, the state telco in Cuba:
inetnum:     200.0.16/24
status:      allocated
aut-num:     N/A
owner:       EMPRESA DE TELECOMUNICACIONES DE CUBA S.A. (IXP CUBA)
ownerid:     CU-CUBA-LACNIC
responsible: Rafael L pez Guerra
address:     Ave. Independencia y 19 Mayo, s/n,
address:     10600 - La Habana - CH
country:     CU
phone:       +53 7 574242 []
owner-c:     JOQ
tech-c:      JOQ
abuse-c:     JEM52
inetrev:     200.0.16/24
nserver:     NS1.NAP.ETECSA.NET
nsstat:      20160123 AA
nslastaa:    20160123
nserver:     NS2.NAP.ETECSA.NET
nsstat:      20160123 AA
nslastaa:    20160123
created:     20030512
changed:     20140610
Then the last hop:
 8  ix-11-3-1-0.tcore1.TNK-Toronto.as6453.net (64.86.33.45)  108.577 ms  108.449 ms  108.257 ms
...interestingly, lands directly in Toronto, in this case going later to Koumbit but that is the first hop that varies according to the destination, hops 1-7 being a common trunk to all external communications. It's also interesting that this shoves a good 90 milliseconds extra in latency, showing that a significant distance and number of equipment crossed. Yet a single hop is crossed, not showing the intermediate step of the Venezuelan link or any other links for that matter. Something obscure is going on there... Also interesting to note is the traceroute to the redirection host, which is only one hop away:
traceroute to 192.168.134.138 (192.168.134.138), 30 hops max, 60 byte packets
 1  192.168.134.138 (192.168.134.138)  6.027 ms  5.698 ms  5.596 ms
Even though it is not the gateway:
$ ip route
default via 10.156.41.1 dev wlan0  proto static  metric 1024
10.156.41.0/24 dev wlan0  proto kernel  scope link  src 10.156.41.4
169.254.0.0/16 dev wlan0  scope link  metric 1000
This means a very close coordination between the different access points and the captive portal system. Finally, note that there seems to be only three peers to the Cuban internet: Teleglobe, formerly Canadian, now owned by the Indian [[!wiki Tata group]], and Telef nica, the Spanish Telco that colonized most of Latin America's internet, all the way down to Argentina. This is confirmed by my traceroutes, which show traffic to Koumbit going through Tata and Google's going through Telef nica.

Captive portal implementation The captive portal is https://www.portal-wifi-temas.nauta.cu/ (not accessible outside of Cuba) and uses a self-signed certificate. The domain name resolves to 190.6.81.230 in the hotel. Accessing http://1.1.1.1/ gives you a status page which allows you to disconnect from the portal. It actually redirects you to https://192.168.134.138/logout.user. That is also a self-signed, but different certificate. That certificate actually reveals the implication of Gemtek which is a "world-leading provider of Wireless Broadband solutions, offering a wide range of solutions from residential to business". It is somewhat unclear if the implication of Gemtek here is deliberate or a misconfiguration on the part of Cuban officials, especially since the certificate is self-signed and was issued in 2002. It could be, however, a trace of the supposed involvement of China in the development of Cuba's networking systems, although Gemtek is based in Taiwan, and not in the China mainland. That IP, in turn, redirects you to the same portal but in a page that shows you the statistics:
https://www.portal-wifi-temas.nauta.cu/?mac=0024D1717D18&script=logout.user&remain_time=00%3A55%3A52&session_time=00%3A04%3A08&username=151003576287&clientip=10.156.41.21&nasid=Wifi_Memories_Jibacoa&r=ac%2Fpopup
Notice how you see the MAC address of the machine in the URL (randomized, this is not my MAC address), along with the remaining time, session time, client IP and the Wifi access point ESSID. There may be some potential in defrauding the session time there, I haven't tested it directly. Hitting Actualizar redirects you back to the IP address, which redirects you to the right URL on the portal. The "real" logout is at:
http://192.168.134.138/logout.user?cmd=logout
The login is performed against https://www.portal-wifi-temas.nauta.cu/index.php?r=ac/login with a referer of:
https://www.portal-wifi-temas.nauta.cu/?&nasid=Wifi_Memories_Jibacoa&nasip=192.168.134.138&clientip=10.156.41.21&mac=EC:55:F9:C5:F2:55&ourl=http%3a%2f%2fgoogle.ca%2f&sslport=443&lang=en-US%2cen%3bq%3d0.8&lanip=10.156.41.1
Again, notice the information revealed to the central portal.

Equipment and providers I ran Nmap probes against both the captive portal and the redirection host, in the hope of finding out how they were built and if they could reveal the source of the equipment used. The complete nmap probes are available in nmap, but it seems that the captive portal is running some embedded device. It is confusing because the probe for the captive portal responds as if it was the gateway, which blurs even more the distinction between the hotel's gateway and the captive portal. This raises the distinct possibility that all access points are actually captive portal that authenticate to another central server. The nmap traces do show three distinct hosts however:
  • the captive portal (www.portal-wifi-temas.nauta.cu, 190.6.81.230)
  • some redirection host (192.168.134.138)
  • the hotel's gateway (10.156.41.1)
They do have distinct signatures so the above may be just me misinterpreting traceroute and nmap results. Your comments may help in clarifying the above. Still, the three devices show up as running Linux, in the two last cases versions between 2.4.21 and 2.4.31. Now, to find out which version of Linux it is running is way more challenging, and it is possible it is just some custom Linux distribution. Indeed, the webserver shows up as G4200.GSI.2.22.0155 and the SSH server is running OpenSSH 3.0.2p1, which is basically prehistoric (2002!) which corroborates the idea that this is some Gemtek embedded device. The fact that those devices are running 14 years old software should be a concern to the people responsible for those networks. There is, for example, a remote root vulnerability that affects that specific version of OpenSSH, among many other vulnerabilities.

A note on Nauta card's security Finally, one can note that it is probably trivial to guess card UIDs. All cards i have here start with the prefix 15100, the following digits being 3576 or 4595, presumably depending on the "batch" that was sent to different hotels, which seems to be batches of 1000 cards. You can also correlate the UID with the date at which the card was issued. For example, 15100357XXX cards are all valid until 19/03/2017, and 151004595XXX cards are all valid until 23/03/2017. Here's the list of UIDs I have seen:
151004595313
151004595974
151003576287
151003576105
151003576097
The passwords, on the other hand, do seem fairly random (although my sample size is small). Interestingly, those passwords are also 12 digits long, which is about as strong as a seven-letter password (mixed uppercase and lowercase). If there are no rate-limiting provisions on that captive portal, it could be possible to guess those passwords, since you have free rein on accessing those routers. Depending on the performance of the routers, you could be lucky and find a working password for free...

Conclusion Clearly, Internet access in Cuba needs to be modernized. We can clearly see that Cuba years behind the rest of the Americas, if only through the percentage of the population with internet access, or download speeds. The existence of a centralized captive portal also enables a huge surveillance potential that should be a concern for any Cuban, or for that matter, anyone wishing to live in a free society. The answer, however, lies not in the liberalization of commerce and opening the doors to the US companies and their own systems of surveillance. It should be possible, and even desirable for Cubans to establish their own neutral network, a proposal I have made in the past even for here in Qu bec. This network could be used and improved by Cubans themselves, prioritizing local communities that would establish their own infrastructure according to their own needs. I have been impressed by this article about the El Paquete system - it shows great innovation and initiative from Cubans which are known for engaging in technology in a creative way. This should be leveraged by letting Cubans do what they want with their networks, not telling them what to do. The best the Googles of this world can do to help Cuba is not to colonize Cuba's technological landscape but to cleanup their own and make their own tools more easily accessible and shareable offline. It is something companies can do right now, something I detailed in a previous article.

9 November 2015

Lunar: Reproducible builds: week 28 in Stretch cycle

What happened in the reproducible builds effort this week: Toolchain fixes Chris Lamb filled a bug on python-setuptools with a patch to make the generated requires.txt files reproducible. The patch has been forwarded upstream. Chris also understood why the she-bang in some Python scripts kept being undeterministic: setuptools as called by dh-python could skip re-installing the scripts if the build had been too fast (under one second). #804339 offers a patch fixing the issue by passing --force to setup.py install. #804141 reported on gettext asks for support of SOURCE_DATE_EPOCH in gettextize. Santiago Vila pointed out that it doesn't felt appropriate as gettextize is supposed to be an interactive tool. The problem reported seems to be in avahi build system instead. Packages fixed The following packages became reproducible due to changes in their build dependencies: celestia, dsdo, fonts-taml-tscu, fte, hkgerman, ifrench-gut, ispell-czech, maven-assembly-plugin, maven-project-info-reports-plugin, python-avro, ruby-compass, signond, thepeg, wagon2, xjdic. The following packages became reproducible after getting fixed: Some uploads fixed some reproducibility issues but not all of them: Patches submitted which have not made their way to the archive yet: Chris Lamb closed a wrongly reopened bug against haskell-devscripts that was actually a problem in haddock. reproducible.debian.net FreeBSD tests are now run for three branches: master, stable/10, release/10.2.0. (h01ger) diffoscope development Support has been added for Free Pascal unit files (.ppc). (Paul Gevers) The homepage is now available using HTTPS, thanks to Let's Encrypt!. Work has been done to be able to publish diffoscope on the Python Package Index (also known as PyPI): the tlsh module is now optional, compatibility with python-magic has been added, and the fallback code to handle RPM has been fixed. Documentation update Reiner Herrmann, Paul Gevers, Niko Tyni, opi, and Dhole offered various fixes and wording improvements to the reproducible-builds.org. A mailing-list is now available to receive change notifications. NixOS, Guix, and Baserock are featured as projects working on reproducible builds. Package reviews 70 reviews have been removed, 74 added and 17 updated this week. Chris Lamb opened 22 new fail to build from source bugs. New issues this week: randomness_in_ocaml_provides, randomness_in_qdoc_page_id, randomness_in_python_setuptools_requires_txt, gettext_creates_ChangeLog_files_and_entries_with_current_date. Misc. h01ger and Chris Lamb presented Beyond reproducible builds at the MiniDebConf in Cambridge on November 8th. They gave an overview of where we stand and the changes in user tools, infrastructure, and development practices that we might want to see happening. Feedback on these thoughts are welcome. Slides are already available, and the video should be online soon. At the same event, a meeting happened with some members of the release team to discuss the best strategy regarding releases and reproducibility. Minutes have been posted on the Debian reproducible-builds mailing-list.

2 November 2015

Lunar: Reproducible builds: week 27 in Stretch cycle

What happened in the reproducible builds effort this week: Toolchain fixes Packages fixed The following packages became reproducible due to changes in their build dependencies: maven-plugin-tools, norwegian, ocaml-melt, python-biom-format, rivet. The following packages became reproducible after getting fixed: Some uploads fixed some reproducibility issues but not all of them: The following package is currently failing to build from source but should now be reproducible: Patches submitted which have not made their way to the archive yet: reproducible.debian.net A quick update on current statistics: testing is at 85% of packages tested reproducible with our modified packages, unstable on armhf caught up with amd64 with 80%. The schroot name used for running diffoscope when testing OpenWrt, NetBSD, Coreboot, and Arch Linux has been fixed. (h01ger, Mattia Rizzolo) Documentation update Paul Gevers documented timestamps in unit files created by the Free Pascal Compiler. reproducible-builds.org is now live. It contains a comprehensive documentation on all aspects that have been identified so far of what we call reproducible builds . It makes room for pointers to projects working on reproducible builds, news, dedicated tools, and community events. Package reviews 206 reviews have been removed, 171 added and 196 updated this week. Chris Lamb reported 28 failing to build from source issues. New issues identified this week: timestamps_in_pdf_content, different_encoding_in_html_by_docbook_xsl, timestamps_in_ppu_generated_by_fpc, method_may_never_be_called_in_documentation_generated_by_javadoc. Misc. Andrei Borzenkov has proposed a fix for uninitialized memory in GRUB's mkimage. Uninitialized memory is one source of hard to track down reproducibility errors. Holger Levsen presented the efforts on reproduible builds at Festival de Software Libre in Puerto Vallarta, Mexico.

8 October 2015

Norbert Preining: Looking at the facts: Sarah Sharp s crusade

Much has been written around the internet about this geeky kernel maintainer Sarah Sharp who left kernel development. I have now spent two hours reading through lkml posts, and want to summarize a few mails from the long thread, since most of the usual news sites just rewrap the original blog of hers without adding any background. darth-cookie The whole thread evolved out call for stable kernel review by Greg Kroah-Hartman where he complained about too many patches that are not actually in rc1 before going into stable:
<rant>
  I'm sitting on top of over 170 more patches that have been marked for
  the stable releases right now that are not included in this set of
  releases.  The fact that there are this many patches for stable stuff
  that are waiting to be merged through the main -rc1 merge window cycle
  is worrying to me.
  [...]
from where it developed into a typical Linus rant on people flagging crap for stable, followed by some jokes:
On Fri, Jul 12, 2013 at 8:47 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> I tend to hold things off after -rc4 because you scare me more than Greg
> does ;-)
 
Have you guys *seen* Greg? The guy is a freakish giant. He *should*
scare you. He might squish you without ever even noticing.
 
              Linus
and Ingo Molnar giving advice to Greg KH:
So Greg, if you want it all to change, create some _real_ threat: be frank 
with contributors and sometimes swear a bit. That will cut your mailqueue 
in half, promise!
with Greg KH taking a funny position in answering:
Ok, I'll channel my "inner Linus" and take a cue from my kids and start
swearing more.
Up to now a pretty decent and normal thread with some jokes and poking, nobody minded, and reading through it I had a good time. The thread continues with a discussion on requirements what to submit to stable, and some side threads on particular commits. And then, out of the blue, Social Justice Warrior (SJW) Sarah Sharp pops in with a very important contribution:
Seriously, guys?  Is this what we need in order to get improve -stable?
Linus Torvalds is advocating for physical intimidation and violence.
Ingo Molnar and Linus are advocating for verbal abuse.
 
Not *fucking* cool.  Violence, whether it be physical intimidation,
verbal threats or verbal abuse is not acceptable.  Keep it professional
on the mailing lists.
 
Let's discuss this at Kernel Summit where we can at least yell at each
other in person.  Yeah, just try yelling at me about this.  I'll roar
right back, louder, for all the people who lose their voice when they
get yelled at by top maintainers.  I won't be the nice girl anymore.
Onto which Linus answers in a great way:
That's the spirit.
 
Greg has taught you well. You have controlled your fear. Now, release
your anger. Only your hatred can destroy me.
 
Come to the dark side, Sarah. We have cookies.
On goes Sarah, gearing up in her SJW mode and starting to rant:
However, I am serious about this.  Linus, you're one of the worst
offenders when it comes to verbally abusing people and publicly tearing
their emotions apart.
 
http://marc.info/?l=linux-kernel&m=135628421403144&w=2
http://marc.info/?l=linux-acpi&m=136157944603147&w=2
 
I'm not going to put up with that shit any more.
Linus himself made clear what he thinks of her:
Trust me, there's a really easy way for me to curse at people: if you
are a maintainer, and you make excuses for your bugs rather than
trying to fix them, I will curse at *YOU*.
 
Because then the problem really is you.
[...]
It is easy to verify what Linus said, by reading the above two links and the answers of the maintainers, both agreed that it was their failure and were sorry. (Mauro s answer, Rafael s answer) It is just the geeky SJW that was not even attacked (who would dare to attack a woman nowadays?). The overall reaction to her by the maintainers can be exemplified by Thomas Gleixner s post:
Just for the record. I got grilled by Linus several times over the
last years and I can't remember a single instance where it was
unjustified.
What follows is a nearly endless discussion with Sarah meandering around, changing permanently her opinion what is acceptable. Linus tried to explain to her in simple words, without success, she continues to rant around. Here arguments are so weak I had nothing but good laugh:
> Sarah, that's a pretty potent argument by Linus, that "acting 
> professionally" risks replacing a raw but honest culture with a
> polished but dishonest culture - which is harmful to developing
> good technology.
> 
> That's a valid concern. What's your reply to that argument?
 
I don't feel the need to comment, because I feel it's a straw man
argument.  I feel that way because I disagree with the definition of
professionalism that people have been pushing.
 
To me, being "professional" means treating each other with respect.  I
can show emotion, express displeasure, be direct, and still show respect
for my fellow developers.
 
For example, I find the following statement to be both direct and
respectful, because it's criticizing code, not the person:
 
"This code is SHIT!  It adds new warnings and it's marked for stable
when it's clearly *crap code* that's not a bug fix.  I'm going to revert
this merge, and I expect a fix from you IMMEDIATELY."
 
The following statement is not respectful, because it targets the
person:
 
"Seriously, Maintainer.  Why are you pushing this kind of *crap* code to
me again?  Why the hell did you mark it for stable when it's clearly
not a bug fix?  Did you even try to f*cking compile this?"
Fortunately, she was immediately corrected and Ingo Molnar wrote an excellent refutation (starting another funny thread) of all her emails, statements, accusations (all of the email is a good read):
_That_ is why it might look to you as if the person was
attacked, because indeed the actions of the top level maintainer were
wrong and are criticised.
 
... and now you want to 'shut down' the discussion. With all due respect,
you started it, you have put out various heavy accusations here and elsewhere,
so you might as well take responsibility for it and let the discussion be
brought to a conclusion, wherever that may take us, compared to your initial view?
(He retracted that last statement, though I don t see a reason for it) Last but not least, let us return to her blog post, where she states herself that:
FYI, comments will be moderated by someone other than me. As this is my blog, not a
government entity, I have the right to replace any comment I feel like with 
 fart fart fart fart .
and she made lots of use of it, I counted at least 10 instances. She seems to remove or fart fart fart any comment that is not in line with her opinion. Further evidence is provided by this post on lkml. Everyone is free to have his own opinion (sorry, his/her), and I am free to form my own opinion on Sarah Sharp by just simply reading the facts. I am more than happy that one more SJW has left Linux development, as the proliferation of cleaning of speech from any personality has taken too far a grip. Coming to my home-base in Debian, unfortunately there is no one in the position and the state of mind of Linus, so we are suffering the same stupidities imposed by social justice worriers and some brainless feminists (no, don t get me wrong, these are two independent attributes. I do NOT state that feminism is brainless) that Linus and the maintainer crew was able to fend of this time. I finish with my favorite post from that thread, by Steven Rosted (from whom I also stole the above image!):
On Tue, 2013-07-16 at 18:37 -0700, Linus Torvalds wrote:
> Emotions aren't bad. Quite the reverse. 
 
Spock and Dr. Sheldon Cooper strongly disagree.
Post Scriptum (after a bike ride) The last point by Linus is what I criticize most on Debian nowdays, it has become a sterilized over-governed entity, where most fun is gone. Making fun is practically forbidden, since there is the slight chance that some minority somewhere on this planet might feel hurt, and by this we are breaking the CoC. Emotions are restricted to the Happy happy how nice we are and how good we are level of US and also Japanese self-reenforcement society. Post Post Scriptum I just read Sarah Sharp s post on What makes a good community? , and without giving a full account or review, I am just pi**ed by the usage of the word microaggressions I can only recommend everyone to read this article and this article to get a feeling how bs the idea of microaggressions has taken over academia and obviously not only academia. Post3 Scriptum I am happy to see Lars Wirzenius, Gunnar Wolf, and Mart n Ferrari opposing my view. I agree with them that my comments concerning Debian are not mainstream in Debian something that is not very surprising, though, and I think it is great that they have fun in Debian, like many other contributors. Post4 Scriptum Although nobody will read this, here is a great response from a female developer:
[...] To Linus: You're a hero to many of us. Don't change. Please. You DO
NOT need to take time away from doing code to grow a pair of breasts
and judge people's emotional states: [...]
Nothing to add here!

1 September 2014

Christian Perrier: Bug #760000

Ren Mayorga reported Debian bug #760000 on Saturday August 30th, against the pyfribidi package. Bug #750000 was reported as of May 31th: nearly exactly 3 months for 10,000 bugs. The bug rate increased a little bit during the last weeks, probably because of the freeze approaching. We're therefore getting more clues about the time when bug #800000 for which we have bets. will be reported. At current rate, this should happen in one year. So, the current favorites are Knuth Posern or Kartik Mistry. Still, David Pr vot, Andreas Tille, Elmar Heeb and Rafael Laboissiere have their chances, too, if the bug rate increases (I'll watch you guys: any MBF by one of you will be suspect...:-)).

2 September 2012

Dirk Eddelbuettel: Faster creation of binomial matrices

Scott Chamberlain blogged about faster creation of binomial matrices the other day, and even referred to our RcppArmadillo package as a possible solution (though claiming he didn't get it to work, tst tst -- that is what the rcpp-devel list is here to help with). The post also fell short of a good aggregated timing comparison for which we love the rbenchmark package. So in order to rectify this, and to see what we can do here with Rcpp, a quick post revisiting the issue. As preliminaries, we need to load three packages: inline to create compiled code on the fly (which, I should mention, is also used together with Rcpp by the Stan / RStan MCMC sampler which is creating some buzz this week), the compiler package included with R to create byte-compiled code and lastly the aforementioned rbenchmark package to do the timings. We also set row and column dimension, and set them a little higher than the original example to actually have something measurable:
library(inline)
library(compiler)
library(rbenchmark)
n <- 500
k <- 100
The first suggestion was the one by Scott himself. We will wrap this one, and all the following ones, in a function so that all approaches are comparable as being in a function of two dimension arguments:
scott <- function(N, K)  
    mm <- matrix(0, N, K)
    apply(mm, c(1, 2), function(x) sample(c(0, 1), 1))
 
scottComp <- cmpfun(scott)
We also immediatly compute a byte-compiled version (just because we now can) to see if this helps at all with the code. As there are no (explicit !) loops, we do not expect a big pickup. Scott's function works, but sweeps the sample() function across all rows and columns which is probably going to be (relatively) expensive. Next is the first improvement suggested to Scott which came from Ted Hart.
ted <- function(N, K)  
    matrix(rbinom(N * K, 1, 0.5), ncol = K, nrow = N)
 
This is quite a bit smarter as it vectorises the approach, generating N times K elements at once which are then reshaped into a matrix. Another suggestion came from David Smith as well as Rafael Maia. We rewrite it slightly to make it a function with two arguments for the desired dimensions:
david <- function(m, n)  
    matrix(sample(0:1, m * n, replace = TRUE), m, n)
 
This is very clever as it uses sample() over zero and one rather than making (expensive) draws from random number generator. Next we have a version from Luis Apiolaza:
luis <- function(m, n)  
     round(matrix(runif(m * n), m, n))
 
It draws from a random uniform and rounds to zero and one, rather than deploying the binomial. Then we have the version using RcppArmadillo hinted at by Scott, but with actual arguments and a correction for row/column dimensions. Thanks to inline we can write the C++ code as an R character string; inline takes care of everything and we end up with C++-based solution directly callable from R:
arma <- cxxfunction(signature(ns="integer", ks="integer"), plugin = "RcppArmadillo", body='
   int n = Rcpp::as<int>(ns);
   int k = Rcpp::as<int>(ks);
   return wrap(arma::randu(n, k));
')
This works, and is pretty fast. The only problem is that it answers the wrong question as it returns U(0,1) draws and not binomials. We need to truncate or round. So a corrected version is
armaFloor <- cxxfunction(signature(ns="integer", ks="integer"), plugin = "RcppArmadillo", body='
   int n = Rcpp::as<int>(ns);
   int k = Rcpp::as<int>(ks);
   return wrap(arma::floor(arma::randu(n, k) + 0.5));
')
which uses the the old rounding approximation of adding 1/2 before truncating. With Armadillo in the picture, we do wonder how Rcpp sugar would do. Rcpp sugar, described in one of the eight vignettes of the Rcpp package, is using template meta-programming to provide R-like expressiveness (aka "syntactic sugar") at the C++ level. In particular, it gives access to R's RNG functions using the exact same RNGs as R making the results directly substitutable (whereas Armadillo uses its own RNG).
sugar <- cxxfunction(signature(ns="integer", ks="integer"), plugin = "Rcpp", body='
   int n = Rcpp::as<int>(ns);
   int k = Rcpp::as<int>(ks);
   Rcpp::RNGScope tmp;
   Rcpp::NumericVector draws = Rcpp::runif(n*k);
   return Rcpp::NumericMatrix(n, k, draws.begin());
')
Here Rcpp::RNGScope deals with setting/resetting the R RNG state. This draws a vector of N time K uniforms similar to Luis' function -- and just like Luis' R function does so without looping -- and then shapes a matrix of dimension N by K from it. And it does of course have the same problem as the RcppArmadillo approach earlier and we can use the same solution:
sugarFloor <- cxxfunction(signature(ns="integer", ks="integer"), plugin = "Rcpp", body='
   int n = Rcpp::as<int>(ns);
   int k = Rcpp::as<int>(ks);
   Rcpp::RNGScope tmp;
   Rcpp::NumericVector draws = Rcpp::floor(Rcpp::runif(n*k)+0.5);
   return Rcpp::NumericMatrix(n, k, draws.begin());
')
Now that we have all the pieces in place, we can compare:
res <- benchmark(scott(n, k), scottComp(n,k),
                 ted(n, k), david(n, k), luis(n, k),
                 arma(n, k), sugar(n,k),
                 armaFloor(n, k), sugarFloor(n, k),
                 order="relative", replications=100)
print(res[,1:4])
With all the above code example in a small R script we call via littler, we get
edd@max:~/svn/rcpp/pkg$ r /tmp/scott.r 
Loading required package: methods
              test replications elapsed   relative
7      sugar(n, k)          100   0.072   1.000000
9 sugarFloor(n, k)          100   0.088   1.222222
6       arma(n, k)          100   0.126   1.750000
4      david(n, k)          100   0.136   1.888889
8  armaFloor(n, k)          100   0.138   1.916667
3        ted(n, k)          100   0.384   5.333333
5       luis(n, k)          100   0.410   5.694444
1      scott(n, k)          100  33.045 458.958333
2  scottComp(n, k)          100  33.767 468.986111
We can see several takeaways: Thanks to Scott and everybody for suggesting this interesting problem. Trying the rbinom() Rcpp sugar function, or implementing sample() at the C++ level is, as the saying goes, left as an exercise to the reader.

22 March 2011

Gunnar Wolf: Lets go to Nicaragua, 2012!

Ok, so finally it is official! We just had the DebConf 12 decision meeting. We saw two great proposals, from the cities of Belo Horizonte, Brazil and Managua, Nicaragua. If you are curious on the decision process: We held it over two IRC channels The moderated #debconf-team channel, where only the five members of the decision committee (Marga Manterola, Andrew McMillan, Jeremiah Foster, Holger Levsen, Moray Allan) and two members from each of the bids (Marco T lio Gontijo e Silva and Rafael Cunha de Almeida from Brazil; Leonardo G mez and Eduardo Rosales from Nicaragua) had voices, and the open #dc12-discuss channel where we had an open discussion. Of course, you can get the full conversation logs in those links. I have to thank and congratulate the Brazilian team as they did a great work... The decision was very tight. It was so tight, in fact, that towards the end of the winning all of the committee members were too shy to state the results - so I kidnapped the process by announcing the winner ;-) (I hope that does not cast a shadow of illegitimacy over it) And, very much worth noting, both teams were also very professional: In previous years, we have seen such decisions degenerate into personal attacks and very ugly situations. That has always been painful and unfortunate. And although the Brazilians will not be able to go celebrate tonight, the decision was received with civility, knowing it was a decision among equals, and a decision well carried out. Well, that's it I am very much looking forward for that peculiar two weeks when the whole Debian family meets, this year to be held in Banja Luka, Bosnia-Herzegovina, and I am very eager towards meeting in 2012 in Managua, Nicaragua! Yay!

1 February 2011

Marco Silva: A Super Vaca for DebConf11

In Portuguese we use the word vaquinha, which means little cow, to refer to a group of people contributing money for some common goal. In a meeting of the organization of the DebConf12 BID of Belo Horizonte, I, Rafael, R gis and Samuel decided to create a vaquinha to become a sponsor of DebConf11. With the help of Amazing Val ssio, we created a website and we are collecting donations. The idea is simple: people donate and their name is shown on the website. If they give more than R$50,00, they also receive an exclusive T-shirt. We mixed the idea of vaquinha with the Super Cow Powers from APT. The site is only in Portuguese, since our main focus is to ask for donations from brazilians, but nothing stops foreigns from donating. Our plan is to become Bronze Sponsors, but if we can't all that money, we'll just give to DebConf11 whatever we have. I hope you like the idea, and maybe have a similar initative in your country.

12 July 2010

Matthew Garrett: Power management at Plumbers

I'm running a power management track at the Linux Plumbers Conference again this November. Unlike most conferences which focus on presenting completed work, Plumbers is an opportunity to focus on unsolved problems and throw around as many half-baked solutions as you want in order to try to find one that seems to stick. The suspend/resume problem in Linux is mostly solved[1], which means that it's time for us to focus on runtime power management and quality of service.

This has been an especially interesting year in the field. We've landed the infrastructure for generic runtime power management, glued that into PCI and started implementing that at the driver level. pm_qos is being reworked to improve performance and scalability as we start seeing more drivers that need to express their own constraints. And, of course, we had the wakelock/suspend blockers conversation that didn't end in a terribly satisfactory manner, although Rafael is now working on an implementation that presents equivalent functionality with a different userspace API. Runtime full-system suspend isn't solved yet either - the current cpuidle-based solution doesn't work well on multicore systems. And maybe we could be more aggressive still by looking at reclocking more system components on the fly even if the existing interfaces don't allow that. Do we have all the hooks we need to identify which system resources are being used? Are we doing the best we can in terms of avoiding trading off performance for power savings?

So if you'd like to talk about any of these things, or if there's any other problems that you don't think have been solved yet, head on over to the call for submissions and help make sure that we can make Linux the most power-efficient OS possible.

[1] Yes, some machines are broken, but those tend to be individual weird bugs which we're gradually tracking down rather than fundamental issues in our core code, so they're not really in the scope of Plumbers

12 March 2010

Stefano Zacchiroli: RC bugs of the week - issue 24

RCBW - #24 Some pause, and here we go with another RCBW issue. The pause has involved various Debian-related work, such as preparing OCaml batteries included for Squeeze and of course preparing my DPL platform. Without any further ado, here are this week's squashes: Random points:

Next.