Search Results: "Arturo Borrero Gonzalez"

23 May 2022

Arturo Borrero González: Toolforge Jobs Framework

This post was originally published in the Wikimedia Tech blog, authored by Arturo Borrero Gonzalez. This post continues the discussion of Toolforge updates as described in a previous post. Every non-trivial task performed in Toolforge (like executing a script or running a bot) should be dispatched to a job scheduling backend, which ensures that the job is run in a suitable place with sufficient resources. Jobs can be scheduled synchronously or asynchronously, continuously, or simply executed once. The basic principle of running jobs is fairly straightforward:

You create a job from a submission server (usually login.toolforge.org).
The backend finds a suitable execution node to run the job on, and starts it once resources are available.
As it runs, the job will send output and errors to files until the job completes or is aborted.
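The three steps above can be sketched as a toy model. This is an illustration only, not how Grid Engine or Kubernetes actually place work: the `Node`, `Job` and `Backend` names, the memory-only resource model and the first-fit placement rule are all assumptions made for the example, and writing output to files is left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_mem_mb: int


@dataclass
class Job:
    name: str
    command: str
    mem_mb: int
    node: Optional[str] = None  # set once the job has been placed


@dataclass
class Backend:
    """Hypothetical scheduling backend: first node with enough free memory wins."""
    nodes: List[Node]
    pending: List[Job] = field(default_factory=list)

    def submit(self, job: Job) -> None:
        # Step 1: the submission server hands the job to the backend.
        self.pending.append(job)
        self.schedule()

    def schedule(self) -> None:
        # Step 2: start every pending job for which a node has room.
        still_waiting = []
        for job in self.pending:
            node = next((n for n in self.nodes if n.free_mem_mb >= job.mem_mb), None)
            if node is None:
                still_waiting.append(job)  # keep waiting for resources
                continue
            node.free_mem_mb -= job.mem_mb
            job.node = node.name
        self.pending = still_waiting

    def finish(self, job: Job) -> None:
        # Step 3 ends: the job completed or was aborted, so free its resources.
        for node in self.nodes:
            if node.name == job.node:
                node.free_mem_mb += job.mem_mb
        job.node = None
        self.schedule()


backend = Backend(nodes=[Node("exec-1", 512), Node("exec-2", 1024)])
bot = Job("bot", "./bot.sh", mem_mb=800)
report = Job("report", "./report.sh", mem_mb=800)
backend.submit(bot)      # fits on exec-2 and starts
backend.submit(report)   # no node has 800 MB free, so it waits
print(bot.node, report.node)
```

In this sketch the second job only starts after `backend.finish(bot)` releases memory, which mirrors the "starts it once resources are available" part of the description.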

So far, if a tool developer wanted to work with jobs, the Toolforge Grid Engine backend was the only suitable choice. This is despite the fact that Kubernetes supports this kind of workload natively. The truth is that we never prepared our Kubernetes environment to work with jobs. Luckily that has changed.

We no longer want to run Grid Engine

In a previous blog post we shared information about our desired future for Grid Engine in Toolforge. Our intention is to discontinue our usage of this technology.

Convenient way of running jobs on Toolforge Kubernetes

Some advanced Toolforge users really wanted to use Kubernetes. They were aware of the lack of abstractions or helpers, so they were forced to use the raw Kubernetes API. Eventually, they figured everything out and managed to succeed. The result was a set of docs on Wikitech and a few dozen jobs running on Kubernetes for the first time. We were aware of this, and this initiative was much in sync with our ultimate goal: to promote Kubernetes over Grid Engine. We rolled up our sleeves and started thinking of a way to abstract away the details and make it easy to run jobs without having to deal with lots of YAML and the raw Kubernetes API. There is a precedent: the webservice command does exactly that. It hides all the details behind a simple command line interface to start/stop a web app running on Kubernetes. However, we wanted to go even further, be more flexible and prepare ourselves for more situations in the future: we decided to create a completely new REST API to wrap the jobs functionality in Toolforge Kubernetes. The Toolforge Jobs Framework was born.

Toolforge Jobs Framework components

The new framework is a small collection of components. As of this writing, we have three:

The REST API responsible for creating/deleting/listing jobs on the Kubernetes system.

A command line interface to interact with the REST API above.

An emailer to notify users about their jobs activity in the Kubernetes system.
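To make the create/delete/list semantics of the first component more concrete, here is a minimal in-memory sketch. It is not the real REST API: `JobSpec`, `JobsAPI`, the field names and the rule that a job cannot be both continuous and scheduled are assumptions chosen for illustration, based only on the job types mentioned earlier (run once, scheduled, continuous).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class JobSpec:
    name: str
    command: str
    schedule: Optional[str] = None  # cron-style string for scheduled jobs (assumed format)
    continuous: bool = False        # keep the command running, restarting it if it exits

    @property
    def kind(self) -> str:
        if self.continuous:
            return "continuous"
        return "scheduled" if self.schedule else "one-off"


class JobsAPI:
    """In-memory stand-in for a create/list/delete jobs API (hypothetical)."""

    def __init__(self) -> None:
        self._jobs: Dict[str, JobSpec] = {}

    def create(self, spec: JobSpec) -> None:
        # Assumption: the two long-lived modes are mutually exclusive.
        if spec.continuous and spec.schedule:
            raise ValueError("a job cannot be both continuous and scheduled")
        if spec.name in self._jobs:
            raise ValueError(f"job {spec.name!r} already exists")
        self._jobs[spec.name] = spec

    def list_jobs(self) -> List[JobSpec]:
        return sorted(self._jobs.values(), key=lambda spec: spec.name)

    def delete(self, name: str) -> bool:
        # Returns True if a job was removed, False if the name was unknown.
        return self._jobs.pop(name, None) is not None


api = JobsAPI()
api.create(JobSpec("once", "./run.sh"))
api.create(JobSpec("nightly", "./report.sh", schedule="0 3 * * *"))
api.create(JobSpec("bot", "./bot.sh", continuous=True))
print([(job.name, job.kind) for job in api.list_jobs()])
```

A command line interface like the second component would then be a thin layer that turns its arguments into calls such as `create`, `list_jobs` and `delete`.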

There were a couple of challenges that weren't trivial to solve. The authentication and authorization against the Kubernetes API was one of them. The other was deciding on the semantics of the new REST API itself. If you are curious, we invite you to take a look at the documentation we have on Wikitech.

Open beta phase

Once we gained some confidence with the new framework, in July 2021 we decided to start a beta phase. We suggested some advanced Toolforge users try out the new framework. We tracked this phase in Phabricator, where our collaborators quickly started reporting some early bugs, helping each other, and creating new feature requests. Moreover, when we launched the Grid Engine migration from Debian 9 Stretch to Debian 10 Buster we took a step forward and started promoting the new jobs framework as a viable replacement for the grid. Some official documentation pages were created on Wikitech as well. As of this writing the framework continues in beta phase. We have solved basically all of the most important bugs, and we have already started thinking about how to address the few feature requests that are missing. We haven't yet established the criteria for leaving the beta phase, but it would be good to have:

Critical bugs fixed and most feature requests addressed (or at least somehow planned).

Proper automated test coverage. We can do better on testing the different software components to ensure they are as bug free as possible. This also would make sure that contributing changes is easy.

REST API swagger integration.

Deployment automation. Deploying the REST API and the emailer is tedious. This is tracked in Phabricator.

Documentation, documentation, documentation.

Limitations

One limitation we have kept in mind since early in the development of this framework is the lack of support for mixing different programming languages or runtime environments in the same job. Solving this limitation is currently one of the WMCS team priorities, because this is one of the key workflows that was available on Grid Engine. The moment we address it, the framework adoption will grow, and it will pretty much enable the same workflows as in the grid, if not more advanced and featureful ones. Stay tuned for more upcoming blog posts with additional information about Toolforge.

13 May 2022

Arturo Borrero González: Toolforge GridEngine Debian 10 Buster migration

This post was originally published in the Wikimedia Tech blog, authored by Arturo Borrero Gonzalez. As discussed in the previous post, one of the most important and successful services provided by the Wikimedia Cloud Services team at the Wikimedia Foundation is Toolforge. Toolforge is a platform that allows users and developers to run and use a variety of applications with the ultimate goal of helping the Wikimedia mission from the technical side. As you may know already, all Wikimedia Foundation servers are powered by Debian, and this includes Toolforge and Cloud VPS. The Debian Project mostly follows a two year cadence for releases, and Toolforge has been using Debian Stretch for some years now, which nowadays is considered "old-old-stable". In accordance with our operating system upgrade policy, we should migrate our servers to Debian Buster. Toolforge's two different backend engines, Kubernetes and Grid Engine, are impacted by this upgrade policy. Grid Engine is notably tied to the underlying Debian release, and the execution environment offered to tools running in the grid is limited to what the Debian archive contains for a given release. This is unlike in Kubernetes, where tool developers can leverage container images and decouple the runtime environment selection from the base operating system. Since the Toolforge grid was first conceived, we have been doing the same operation over and over again:

Prepare a parallel grid deployment with the new operating system.
Ask our users (tool developers) to evaluate a newer version of their runtime and programming languages.
Introduce a migration window and coordinate a quick migration.
Finally, drop the old operating system from grid servers.

We've done this type of migration several times before. The most recent ones were Ubuntu Precise to Ubuntu Trusty and Ubuntu Trusty to Debian Stretch. But this time around we had some special angles to consider.

So, you are upgrading the Debian release

You are migrating to Debian 11 Bullseye, no?

No, we're migrating to Debian 10 Buster.

Wait, but Debian 11 Bullseye exists!

Yes, we know! Let me explain.

We're migrating the grid from Debian 9 Stretch to Debian 10 Buster, but perhaps we should be migrating from Debian 9 Stretch to Debian 11 Bullseye directly. This is a legitimate concern, and we discussed it in September 2021. Back then, our reasoning was that skipping to Debian 11 Bullseye would be more difficult for our users, especially because of the greater jump in version numbers for the underlying runtimes. Additionally, all the migration work started before Debian 11 Bullseye was released. Our original intention was for the migration to be completed before the release. For a couple of reasons the project was delayed, and when it was time to restart the project we decided to continue with the original idea. We had some work done to get Debian 10 Buster working correctly with the grid, and supporting Debian 11 Bullseye would require an additional effort. We didn't even check if Grid Engine could be installed in the latest Debian release. For the grid, in general, the engineering effort to do an N+1 upgrade is lower than doing an N+2 upgrade. If we had tried an N+2 upgrade directly, things would have been much slower and more difficult for us, and for our users. In that sense, our conclusion was to not skip Debian 10 Buster.

We no longer want to run Grid Engine

In a previous blog post we shared information about our desired future for Grid Engine in Toolforge. Our intention is to discontinue our usage of this technology.

No grid? What about my tools?

Traditionally there have been two main workflows or use cases that were supported in the grid, but not in our Kubernetes backend:

Running jobs, long-running bots and other scheduled tasks.

Mixing runtime environments (for example, a nodejs app that runs some python code).

The good news is that work to handle the continuity of such use cases has already started. This takes the form of two main efforts:

The Toolforge buildpacks project to support arbitrary runtime environments.

The Toolforge Jobs Framework to support jobs, scheduled tasks, etc.

In particular, the Toolforge Jobs Framework has been available for a while in an open beta phase. We did some initial design and implementation, then deployed it in Toolforge for some users to try it and report bugs, missing features, etc. These are complex, feature-rich projects, and they deserve a dedicated blog post. More information on each will be shared in the future. For now, it is worth noting that both initiatives have some degree of development already.

The conclusion

Knowing all the moving parts, we were faced with a few hard questions when deciding how to approach the Debian 9 Stretch deprecation:

Should we not upgrade the grid, and focus on Kubernetes instead? Let Debian 9 Stretch be the last supported version on the grid?

What is the impact of these decisions on the technical community? What is best for our users?

The choices we made are already known in the community. A couple of weeks ago we announced the Debian 9 Stretch Grid Engine deprecation. In parallel to this migration, we decided to promote the new Toolforge Jobs Framework, even if it's still in beta phase. This new option should help users to future-proof their tools and reduce maintenance effort. An early migration to Kubernetes now will avoid future grid problems. We truly hope that Debian 10 Buster is the last version we have for the grid, but as they say, hope is not a good strategy when it comes to engineering. What we will do is work really hard to bring Toolforge to the service level we want, and that means continuing to develop and enable more Kubernetes-based functionality. Stay tuned for more upcoming blog posts with additional information about Toolforge.

4 April 2022

Arturo Borrero González: Wikimedia Toolforge and Grid Engine

This post was originally published in the Wikimedia Tech blog, authored by Arturo Borrero Gonzalez. One of the most important and successful products provided by the Wikimedia Cloud Services team at the Wikimedia Foundation is Toolforge, a hosting service commonly known in the industry as Platform as a Service (PaaS). In particular, it is a platform that allows users and developers to run and use a variety of applications with the ultimate goal of helping the Wikimedia mission from the technical side. Toolforge is powered by two different backend engines, Kubernetes and Grid Engine. The two backends have traditionally offered different features for tool developers. But as time moves forward we've learnt that Kubernetes is the future. Explaining why is the purpose of this blog post: we want to share more information and reasoning behind this mindset. There are a number of reasons that make Grid Engine poorly suited to remain an execution backend in Toolforge:

There has not been a new Grid Engine release (bug fixes, security patches, or otherwise) since 2016. This doesn't feel like a project being actively developed or maintained.
The grid has poor support and controls for important aspects such as high availability, fault tolerance and self-recovery.
Maintaining a healthy grid requires plenty of manual operations, like manual queue cleanups in case of failures, hand-crafted scripts for pooling/depooling nodes, etc.
There is no good or modern monitoring support for the grid, and we need to craft and maintain several monitoring pieces for proper observability, and to be able to do proper maintenance.
The grid is also strongly tied to the underlying operating system release version. Migrating from one Debian version to the next is painful (a dedicated blog post about this will follow shortly).
The grid imposes a strong dependency on NFS, another old technology. We would like to reduce dependency on NFS overall, and in the future we will explore NFS-free approaches for Toolforge.
In general, Grid Engine is old software, old technology, which can be replaced by more modern approaches for providing an equivalent or better service.

As mentioned above, our desire is to cover all our grid-like needs with Kubernetes, a technology which has several benefits:

Good high availability, fault tolerance and self-recovery semantics, constructs and facilities.
Maintaining a running Kubernetes cluster requires few manual operations.
There are good monitoring and observability options for Kubernetes deployments, including seamless integration with industry standards like Prometheus.
Our current approach to deploying and upgrading Kubernetes is independent of the underlying operating system.
While our current Kubernetes deployment uses NFS as a central component, there is support for using other, more modern, approaches for the kind of shared storage needs we have in Toolforge.
In general, Kubernetes is a modern technology, with a vibrant and healthy community, that enables new use cases and has enough flexibility to adapt legacy ones.

The relationship between Toolforge and Grid Engine has been interesting over the years. The grid has been used for a long time, and we have plenty of documentation and established good practices. On the other hand, the grid is hard to maintain, imposes a heavy burden on the WMCS team and is a technology we must eventually discontinue. How to accommodate the two realities is a refreshing challenge, one that we hope to tackle together in the near future. A tradeoff exists here, but it is clear to us which option is best. So we will work on deprecating and removing Grid Engine and migrating use cases into Kubernetes. This deprecation, however, will be done with care, as we know our technical community relies on the grid for some important Toolforge tools. And some of these workflows will need some adaptation in order to be fully supported on Kubernetes. Stay tuned for more information on present and future work surrounding the Wikimedia Toolforge service. The next blog post will share more concrete details.

20 October 2021

Arturo Borrero González: Iterating on how we do NFS at Wikimedia Cloud Services

This post was originally published in the Wikimedia Tech blog, authored by Arturo Borrero Gonzalez. NFS is a central piece of infrastructure that is essential to services like Toolforge. Recently, the Cloud Services team at Wikimedia had been reviewing how we do NFS.

The current situation

NFS is a central piece of technology for some of the services that the Wikimedia Cloud Services team offers to the community. We have several shares that power different use cases: Toolforge user home directories live on NFS, and Cloud VPS users can also access dumps using this protocol. The current setup involves several physical hardware servers, with about 20TB of storage, offering shares over 10G links to the cloud. For the system to be more fault-tolerant, we duplicate each share for redundancy using DRBD. Running NFS on dedicated hardware servers has traditionally offered us advantages, mostly in performance and capacity. As time has passed, we have been enumerating more and more reasons to review how we do NFS. For one, the current setup is in violation of some of our internal rules regarding realm separation. Additionally, we had been longing for additional flexibility managing our servers: we wanted to use virtual machines managed by OpenStack Nova. The DRBD-based high-availability system required a mostly hand-crafted procedure for failover/failback. There are also some scalability concerns, as NFS is easy to scale up but not to scale horizontally, and of course we have to be able to keep the tenancy setup while doing so, something that NFS does by using LDAP/Unix users and that may get complicated too when growing. In general, the servers have become "too big to fail", clearly technical debt, and it has taken us years to decide on taking on the task of rethinking the architecture. It's worth mentioning that in an ideal world, we wouldn't depend on NFS, but the truth is that it will still be a central piece of infrastructure for years to come in services like Toolforge. Over a series of brainstorming meetings, the WMCS team evaluated the situation and sorted out the many moving parts. The team managed to boil down the potential service future to two competing options:

Adopt and introduce a new OpenStack component into our cloud: Manila. This was the right choice if we were interested in a general NFS as a service offering for our Cloud VPS users.
Put the data on Cinder volumes and serve NFS from a couple of virtual machines created by hand. This was the right choice if we wanted something that required low effort to engineer and adopt.

Then we decided to research both options in parallel. For a number of reasons, the evaluation was timeboxed to three weeks. Both ideas had a couple of points in common: the NFS data would be stored on our Ceph farm via Cinder volumes, and we would rely on Ceph reliability to avoid using DRBD. Another open topic was how to back up data from Ceph, to store our important bits in more than one basket. We will get to the backup topic later.

The manila experiment

The Wikimedia Foundation was an early adopter of some OpenStack components (Nova, Glance, Designate, Horizon), but Manila was never evaluated for usage until now. Our approach for this experiment was to closely follow the upstream guidelines. We read the documentation and tried to understand the different setups you can build with Manila. As we often feel with other OpenStack components, the documentation doesn't perfectly describe how to introduce a given component in your particular local setup. Here we use an admin-controller flat-topology Neutron network. This network is shared by all tenants (or projects) in our OpenStack deployment. Also, Manila can use many different driver backends, for things like NetApps or CephFS that we don't use yet. After some research, the generic driver was the one that seemed to better fit our use case. The generic driver leverages Nova virtual machine instances plus Cinder volumes to create and manage the shares. In general, Manila supports two operational modes, depending on whether it should create/destroy the share servers (i.e., the virtual machine instances) or not. This option is called driver_handles_share_servers (or DHSS) and takes a boolean value. We were interested in trying with DHSS=true, to really benefit from the potential of the setup.

NFS idea 6, original image in Wikitech

So, after sorting all these variables, we moved on with our initial testing. We built a PoC setup as depicted in the diagram above, with the manila-share component running in a virtual machine inside the cloud. The PoC led to us reporting several bugs upstream:

In some cases we tried to address these bugs ourselves:

It's worth mentioning that the upstream community was extra-welcoming to us, and we're thankful for that. However, at the end of our three-week period, our Manila setup still wasn't working as expected. Your experience may differ with other drivers, perhaps the ZFSonLinux or the CephFS ones. In general, we were having trouble making the setup work as expected, so we decided to abandon this approach in favor of the other option we were considering at the beginning.

Simple virtual machine serving NFS

The alternative was to create a Nova virtual machine instance by hand and to configure it using Puppet. We have been investing in an automation framework lately, so the idea is to not actually create the server by hand. Anyway, the data would be decoupled from the instance into Cinder volumes, which led us to the question we left for later: How should we back up those terabytes of important information? Just to be clear, the backup problem was independent of the above options; with Manila we would still have had to solve the same challenge. We would like to see our data be backed up somewhere other than in Ceph. And that's exactly where we are right now. We've been exploring different backup strategies and will finally use the Cinder backup API.

Conclusion

The iteration will end with the dedicated NFS hardware servers being stopped, and the shares being served from within the cloud. The migration will take some time to happen because we will check and double-check that everything works as expected (including from the performance point of view) before making definitive changes. We already have some plans to make sure our users experience as little service impact as possible. The most troublesome shares will be those related to Toolforge. At some point we will need to disallow writes to the NFS share, rsync the data out of the hardware servers into the Cinder volumes, point the NFS clients to the new virtual machines, and then enable writes again. The main Toolforge share has about 8TB of data, so this will take a while. We will have more updates in the future. Who knows, perhaps our next-next iteration, in a couple of years, will see us adopting OpenStack Manila for good.

Featured image credit: File:(from break water) Manila Skyline panoramio.jpg, ewol, CC BY-SA 3.0
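To get a feeling for why copying about 8TB "will take a while", here is a back-of-the-envelope estimate. The 10 Gbit/s figure is the link speed mentioned for the current hardware servers; the lower sustained rates are made-up assumptions, since the post gives no measured rsync throughput, and the function name is hypothetical.

```python
def transfer_hours(size_tb: float, throughput_gbit_s: float) -> float:
    """Hours needed to copy size_tb terabytes (10**12 bytes each) at a sustained rate."""
    bits = size_tb * 1e12 * 8
    seconds = bits / (throughput_gbit_s * 1e9)
    return seconds / 3600


# 10 Gbit/s is the nominal link speed; 2 and 0.5 Gbit/s are assumed,
# more pessimistic sustained rates for a copy of many small files.
for rate in (10.0, 2.0, 0.5):
    print(f"{rate} Gbit/s: {transfer_hours(8, rate):.1f} hours")
```

Even at the full nominal link speed the copy would take a bit under two hours, and at the slower assumed rates it stretches from several hours to more than a day, which is the window during which writes would have to stay disabled.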

15 June 2020

Arturo Borrero González: A better Toolforge: a technical deep dive

This post was originally published in the Wikimedia Tech blog, and is authored by Arturo Borrero Gonzalez and Brooke Storm. In the previous post, we shared the context on the recent Kubernetes upgrade that we introduced in the Toolforge service. Today we would like to dive a bit deeper into the technical details.

Custom admission controllers

One of the key components of the Toolforge Kubernetes setup is our set of custom admission controllers. We use them to validate and enforce that the service is used as we intended. Basically, we have 2 of them:

Ingress admission controller [source code]
Registry admission controller [source code]

The source code is written in Golang, which is pretty convenient for natively working in a Kubernetes environment. Both code repositories include extensive documentation: how to develop, test, use, and deploy them. We decided to go with custom admission controllers because we couldn't find any native (or built-in) Kubernetes mechanism to accomplish the same sort of checks on user activity.

With the Ingress controller, we want to ensure that Ingress objects only handle traffic to our internal domains, which at the time of this writing are toolforge.org (our new domain) and tools.wmflabs.org (legacy). We safe-list the kube-system namespace and the tool-fourohfour namespace because both need special consideration. More on the Ingress setup later.

The registry controller is pretty simple as well. It ensures that only our internal docker registry is used for user-scheduled containers running in Kubernetes. Again, we exclude from the checks containers running in the kube-system namespace (those used by Kubernetes itself). Other than that, the validation itself is pretty easy. For some extra containers we run (like those related to Prometheus metrics) what we do is simply upload those docker images to our internal registry. The controls provided by this admission controller help us validate that only FLOSS software is run in our environment, which is one of the core rules of Toolforge.

RBAC and Pod Security Policy setup

I would like to comment next on our RBAC and Pod Security Policy setup. Using the Pod Security Policies (or PSP) we establish a set of constraints on what containers can and can't do in our cluster. We have several PSPs configured in our setup:

Privileged policy: used by Kubernetes containers themselves; basically a very relaxed set of constraints that are required for the system itself to work.
Default policy: a bit more restricted than the privileged policy, is intended for admins to deploy services, but it isn't currently in use.
Toolforge user policies: this applies to user-scheduled containers, and there are some obvious restrictions here: we only allow unprivileged pods, we control which HostPath is available for pods, use only default Linux capabilities, etc.
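Going back to the two admission controllers described earlier, their checks can be sketched in a few lines. The real controllers are written in Go and plug into the Kubernetes admission API; the Python below only mirrors the decision logic as described in the text. The function names, the `registry.example.org` placeholder hostname, the treatment of subdomains and the rejection of Ingress objects without hosts are assumptions made for this illustration.

```python
from typing import List

ALLOWED_DOMAINS = ("toolforge.org", "tools.wmflabs.org")
INGRESS_SAFE_NAMESPACES = {"kube-system", "tool-fourohfour"}
REGISTRY_SAFE_NAMESPACES = {"kube-system"}
INTERNAL_REGISTRY = "registry.example.org"  # placeholder, not the real registry name


def host_allowed(host: str) -> bool:
    # Accept the allowed domains themselves and any subdomain of them (assumption).
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)


def validate_ingress(namespace: str, hosts: List[str]) -> bool:
    """Ingress check: only traffic for the allowed domains, safe-listed namespaces pass."""
    if namespace in INGRESS_SAFE_NAMESPACES:
        return True
    # Assumption: an Ingress without any host is rejected, since it would match all hosts.
    return bool(hosts) and all(host_allowed(h) for h in hosts)


def validate_pod_images(namespace: str, images: List[str]) -> bool:
    """Registry check: every container image must come from the internal registry."""
    if namespace in REGISTRY_SAFE_NAMESPACES:
        return True
    return all(image.startswith(INTERNAL_REGISTRY + "/") for image in images)


print(validate_ingress("tool-mytool", ["mytool.toolforge.org"]))
print(validate_pod_images("tool-mytool", ["docker.io/library/python:3"]))
```

With these definitions the first call is accepted and the second is rejected, because the image does not come from the internal registry.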

Each user can interact with their own namespace (this is how we achieve multi-tenancy in the cluster). Kubernetes knows about each user by means of TLS certs, and for that we have RBAC. Each user has a rolebinding to a shared cluster-role that defines how Toolforge tools can use the Kubernetes API. The following diagram shows the design of our RBAC and PSP in our cluster:

RBAC and PSP for Toolforge, original image in Wikitech

I mentioned that we know about each user by means of TLS certificates. This is true, and in fact, there is a key component in our setup called maintain-kubeusers. This custom piece of Python software is run as a pod inside the cluster and is responsible for reading our external user database (LDAP) and generating the required credentials, namespaces, and other configuration bits for them. With the TLS cert, we basically create a kubeconfig file that is then written into the homes NFS share, so each Toolforge user has it in their shell home directory.

Networking and Ingress setup

With the basic security controls in place, we can move on to explaining our networking and Ingress setup. Yes, the Ingress word might be a bit overloaded already, but we refer here to Ingress as the path that end-users follow from their web browser on their local machine to a webservice running in the Toolforge cluster. Some additional context here. Toolforge is not only Kubernetes, but we also have a Son of GridEngine deployment, a job scheduler that covers some features not available in Kubernetes. The grid can also run webservices, although we are encouraging users to migrate them to Kubernetes. For compatibility reasons, we needed to adapt our Ingress setup to accommodate the old web grid. Deciding the layout of the network and Ingress was definitely something that took us some time to figure out because there is not a single way to do it right. The following diagram can be used to explain the different steps involved in serving a web service running in the new Toolforge Kubernetes.

Toolforge k8s network topology, original image in Wikitech

The end-user HTTP/HTTPS request first hits our front proxy in (1). Running here is NGINX with a custom piece of Lua code that is able to decide whether to contact the web grid or the new Kubernetes cluster. TLS termination happens here as well, for both domains (toolforge.org and tools.wmflabs.org). Note this proxy is reachable from the internet, as it uses a public IPv4 address, a floating IP from CloudVPS, the infrastructure service we provide based on OpenStack. Remember that our Kubernetes is directly built in virtual machines, a bare-metal type deployment. If the request is directed to a webservice running in Kubernetes, the request now reaches haproxy in (2), which knows the cluster nodes that are available for Ingress. The original 80/TCP packet is now translated to 30000/TCP; this is the TCP port we use internally for the Ingress traffic. This haproxy instance also provides load-balancing for the Kubernetes API, using 6443/TCP. It's worth mentioning that unlike the Ingress, the API is only reachable from within the cluster and not from the internet. We have an NGINX-Ingress NodePort service listening on 30000/TCP in every Kubernetes worker node in (3); this helps the request to eventually reach the actual NGINX-Ingress pod in (4), which is listening on 8080/TCP. You can see in the diagram how in the API server (5) we hook the Ingress admission controller (6) to validate Kubernetes Ingress configuration objects before allowing them in for processing by NGINX-Ingress (7). The NGINX-Ingress process knows which tools' webservices are online and how to contact them by means of an intermediate Service object in (8). This last Service object means the request finally reaches the actual tool pod in (9). At this point, it is worth noting that our Kubernetes cluster internally uses kube-proxy and Calico, both using Netfilter components to handle traffic.
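The request path just described can be written down as data, which makes the numbered hops and port changes easier to follow. The step numbers and ports come from the description above; how the front proxy actually decides between the grid and Kubernetes is not explained in the post, so the lookup set used by the hypothetical `route` function is purely an assumption.

```python
from typing import List, Set, Tuple

# (step number in the diagram, component, what happens there)
KUBERNETES_PATH: List[Tuple[int, str, str]] = [
    (1, "front proxy (NGINX + Lua)", "terminates TLS, picks web grid or Kubernetes"),
    (2, "haproxy", "forwards the 80/TCP request to 30000/TCP on a worker node"),
    (3, "NGINX-Ingress NodePort service", "listens on 30000/TCP on every worker node"),
    (4, "NGINX-Ingress pod", "listens on 8080/TCP"),
    (8, "Service object", "points NGINX-Ingress at the tool"),
    (9, "tool pod", "serves the response"),
]


def route(tool: str, kubernetes_tools: Set[str]) -> List[str]:
    """Return the components a request for `tool` passes through, in order."""
    front_proxy = KUBERNETES_PATH[0][1]
    if tool in kubernetes_tools:  # assumed decision rule for illustration
        return [component for _, component, _ in KUBERNETES_PATH]
    return [front_proxy, "web grid"]


for step, component, action in KUBERNETES_PATH:
    print(f"({step}) {component}: {action}")
print(route("legacy-tool", {"mytool"}))
```

Steps (5) to (7) are absent on purpose: they describe how Ingress objects are validated when they are created, not the path of an individual HTTP request.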
tools-webservice

Most user-facing operations are simplified by means of another custom piece of Python code: tools-webservice. This package provides users with the webservice command line utility in our shell bastion hosts. Typical usage is to just run webservice start|stop|status. This utility creates all the required Kubernetes objects on-demand, like Deployment, ReplicaSet, Ingress and Service, to ease deploying web apps in Toolforge. Of course, advanced users can interact directly with the Kubernetes API and create their custom configuration objects. This utility is just a wrapper, a shortcut.

tool-fourohfour and tool-k8s-status

The last couple of custom components we would like to mention are the tool-fourohfour and tool-k8s-status web services. These two utilities run inside the cluster as if they were any other user-created tool. The fourohfour tool allows for a controlled handling of HTTP 404 errors, and it works as the default NGINX-Ingress backend. The k8s-status tool shows plenty of information about the cluster itself and each tool running in the cluster, including links to the Server Admin Log, an auto-generated Grafana dashboard for metrics, and more.

For metrics, we use an external Prometheus server that contacts the Kubernetes cluster to scrape metrics. We created a custom metrics namespace in which we deploy all the different components we use to observe the behavior of the system:

metrics-server: used by some utilities like kubectl top.
kube-state-metrics: provides advanced metrics about the state of the cluster.
cadvisor: to obtain fine-grained metrics about pods, deployments, nodes, etc.

All the Prometheus data we collect is used in several different Grafana dashboards, some of them directed at users, like the ones linked by the k8s-status tool, and some others for internal use by us, the engineers. These are for internal use but are still public, like the Ingress specific dashboard, or the cluster state dashboard. Working publicly, in a transparent way, is key for the success of CloudVPS in general and Toolforge in particular. As we commented in the previous post, all the engineering work that was done here was shared by community members.

By the community, for the community

We think this post sheds some light on how the Toolforge Kubernetes service works, and we hope it could inspire others when trying to build similar services or, even better, help us improve Toolforge itself. Since this was first put into production some months ago we have already detected some room for improvement in a couple of the components. As in many other engineering products, we will follow an iterative approach for evolving the service. Mind that Toolforge is maintained by the Wikimedia Foundation, but you can think of it as a service by the community for the community. We will keep an eye on it and have a list of feature requests and things to improve in the future. We are looking forward to it!

18 May 2020

Arturo Borrero González: A better Toolforge: upgrading the Kubernetes cluster

This post was originally published in the Wikimedia Tech blog, and is authored by Arturo Borrero Gonzalez and Brooke Storm. One of the most successful and important products provided by the Wikimedia Cloud Services team at the Wikimedia Foundation is Toolforge. Toolforge is a platform that allows users and developers to run and use a variety of applications that help the Wikimedia movement and mission from the technical point of view in general. Toolforge is a hosting service commonly known in the industry as a Platform as a Service (PaaS). Toolforge is powered by two different backend engines, Kubernetes and GridEngine. This article focuses on how we made a better Toolforge by integrating a newer version of Kubernetes and, along with it, some more modern workflows. The starting point in this story is 2018. Yes, two years ago! We identified that we could do better with our Kubernetes deployment in Toolforge. We were using a very old version, v1.4. Using an old version of any software has more or less the same consequences everywhere: you lack security improvements and some modern key features. Once it was clear that we wanted to upgrade our Kubernetes cluster, both the engineering work and the endless chain of challenges started. It turns out that Kubernetes is a complex and modern technology, which adds some extra abstraction layers to add flexibility and some intelligence to a very old systems engineering need: hosting and running a variety of applications. Our first challenge was to understand what our use case for a modern Kubernetes was. We were particularly interested in some key features:

The increased security and controls required for a public user-facing service, using RBAC, PodSecurityPolicies, quotas, etc.
Native multi-tenancy support, using namespaces
Advanced web routing, using the Ingress API

Soon enough we faced another Kubernetes native challenge: the documentation. For a newcomer, learning and understanding how to adapt Kubernetes to a given use case can be really challenging. We identified some baffling patterns in the docs. For example, different documentation pages would assume you were using different Kubernetes deployments (Minikube vs kubeadm vs a hosted service). We are running Kubernetes like you would on bare-metal (well, in CloudVPS virtual machines), and some documents directly referred to ours as a corner case. During late 2018 and early 2019, we started brainstorming and prototyping. We wanted our cluster to be reproducible and easily rebuildable, and in the Technology Department at the Wikimedia Foundation, we rely on Puppet for that. One of the first things to decide was how to deploy and build the cluster while integrating with Puppet. This is not as simple as it seems because Kubernetes itself is a collection of reconciliation loops, just like Puppet is. So we had to decide what to put directly in Kubernetes and what to control and make visible through Puppet. We decided to stick with kubeadm as the deployment method, as it seems to be the more upstream-standardized tool for the task. We had to make some interesting decisions by trial and error, like where to run the required etcd servers, what the kubeadm init file would look like, how to proxy and load-balance the API on our bare-metal deployment, what network overlay to choose, etc. If you take a look at our public notes, you can get a glimpse of the number of decisions we had to make. Our Kubernetes wasn't going to be a generic cluster, we needed a Toolforge Kubernetes service. This means we don't use some of the components, and also, we add some additional pieces and configurations to it. By the second half of 2019, we were working full-speed on the new Kubernetes cluster. We already had an idea of what we wanted and how to do it.
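
To make the reconciliation-loop comparison above more concrete, here is a toy sketch of the idea: compare a desired state with the observed state and compute the actions that close the gap. This is not how Kubernetes or Puppet are implemented; the function names and the resource names are hypothetical, and state is modelled as plain dictionaries.

```python
def reconcile(desired, actual):
    """Return the actions needed to move `actual` towards `desired`."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name, spec))
        elif actual[name] != spec:
            actions.append(("update", name, spec))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name, None))
    return actions


def apply_actions(actions, actual):
    """Apply the computed actions to the (in-memory) actual state."""
    for verb, name, spec in actions:
        if verb == "delete":
            del actual[name]
        else:
            actual[name] = dict(spec)


if __name__ == "__main__":
    # Hypothetical resources, for illustration only.
    desired = {"ingress": {"replicas": 2}, "api-proxy": {"replicas": 1}}
    actual = {"ingress": {"replicas": 1}, "leftover": {"replicas": 1}}

    # Keep looping until there is nothing left to change.
    while True:
        actions = reconcile(desired, actual)
        if not actions:
            break
        print(actions)
        apply_actions(actions, actual)
```

If two such loops manage the same object with different desired states, each pass of one would undo the work of the other, which is one way to read the need to decide what belongs in Kubernetes and what belongs in Puppet.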
There were a couple of important topics for discussion, for example:

Ingress
Validating admission controllers
Security policies and quotas
PKI and user management

We will describe in detail the final state of those pieces in another blog post, but each of the topics required several hours of engineering time, research, tests, and meetings before reaching a point in which we were comfortable with moving forward. By the end of 2019 and early 2020, we felt like all the pieces were in place, and we started thinking about how to migrate the users, the workloads, from the old cluster to the new one. This migration plan mostly materialized in a Wikitech page which contains concrete information for our users and the community. The interaction with the community was a key success element. Thanks to our vibrant and involved users, we had several early adopters and beta testers that helped us identify early flaws in our designs. The feedback they provided was very valuable for us. Some folks helped solve technical problems, helped with the migration plan or even helped make some design decisions. It is worth noting that some of the changes that were presented to our users were not easy for them to handle, like new quotas and usage limits. Introducing new workflows and deprecating old ones is always a risky operation. Even though the migration procedure from the old cluster to the new one was fairly simple, there were some rough edges. We helped our users navigate them. A common issue was a webservice not being able to run in the new cluster due to stricter quota limiting the resources for the tool. Another example is the new Ingress layer failing to properly work with some webservices' particular options. By March 2020, we no longer had anything running in the old Kubernetes cluster, and the migration was completed. We then started thinking about another step towards making a better Toolforge, which is introducing the toolforge.org domain. There is plenty of information about the change to this new domain in Wikitech News. The community wanted a better Toolforge, and so do we, and after almost 2 years of work, we have it!
All the work that was done represents the commitment of the Wikimedia Foundation to support the technical community and how we really want to pursue technical engagement in general in the Wikimedia movement. In a follow-up post we will discuss some technical details of the new Kubernetes cluster in more depth, stay tuned! This post was originally published in the Wikimedia Tech blog, and is authored by Arturo Borrero Gonzalez and Brooke Storm.

19 May 2017

Michael Prokop: Debian stretch: changes in util-linux #newinstretch

We're coming closer to the Debian/stretch stable release and similar to what we had with #newinwheezy and #newinjessie it's time for #newinstretch! Hideki Yamane already started the game by blogging about GitHub's Icon font, fonts-octicons and Arturo Borrero Gonzalez wrote a nice article about nftables in Debian/stretch. One package that isn't new but its tools are used by many of us is util-linux, providing many essential system utilities. We have util-linux v2.25.2 in Debian/jessie and in Debian/stretch there will be util-linux >=v2.29.2. There are many new options available and we also have a few new tools available.

Tools that have been taken over from other packages

last: used to be shipped via sysvinit-utils in Debian/jessie
lastb: used to be shipped via sysvinit-utils in Debian/jessie
mesg: used to be shipped via sysvinit-utils in Debian/jessie
mountpoint: used to be shipped via initscripts in Debian/jessie
sulogin: used to be shipped via sysvinit-utils in Debian/jessie

New tools

lsipc: show information on IPC facilities, e.g.:

root@ff2713f55b36:/# lsipc
RESOURCE DESCRIPTION                                              LIMIT USED  USE%
MSGMNI   Number of message queues                                 32000    0 0.00%
MSGMAX   Max size of message (bytes)                               8192    -     -
MSGMNB   Default max size of queue (bytes)                        16384    -     -
SHMMNI   Shared memory segments                                    4096    0 0.00%
SHMALL   Shared memory pages                       18446744073692774399    0 0.00%
SHMMAX   Max size of shared memory segment (bytes) 18446744073692774399    -     -
SHMMIN   Min size of shared memory segment (bytes)                    1    -     -
SEMMNI   Number of semaphore identifiers                          32000    0 0.00%
SEMMNS   Total number of semaphores                          1024000000    0 0.00%
SEMMSL   Max semaphores per semaphore set.                        32000    -     -
SEMOPM   Max number of operations per semop(2)                      500    -     -
SEMVMX   Semaphore max value                                      32767    -     -

lslogins: display information about known users in the system, e.g.:

root@ff2713f55b36:/# lslogins 
  UID USER     PROC PWD-LOCK PWD-DENY LAST-LOGIN GECOS
    0 root        2        0        1            root
    1 daemon      0        0        1            daemon
    2 bin         0        0        1            bin
    3 sys         0        0        1            sys
    4 sync        0        0        1            sync
    5 games       0        0        1            games
    6 man         0        0        1            man
    7 lp          0        0        1            lp
    8 mail        0        0        1            mail
    9 news        0        0        1            news
   10 uucp        0        0        1            uucp
   13 proxy       0        0        1            proxy
   33 www-data    0        0        1            www-data
   34 backup      0        0        1            backup
   38 list        0        0        1            Mailing List Manager
   39 irc         0        0        1            ircd
   41 gnats       0        0        1            Gnats Bug-Reporting System (admin)
  100 _apt        0        0        1            
65534 nobody      0        0        1            nobody

lsns: list system namespaces, e.g.:

root@ff2713f55b36:/# lsns
        NS TYPE   NPROCS PID USER COMMAND
4026531835 cgroup      2   1 root bash
4026531837 user        2   1 root bash
4026532473 mnt         2   1 root bash
4026532474 uts         2   1 root bash
4026532475 ipc         2   1 root bash
4026532476 pid         2   1 root bash
4026532478 net         2   1 root bash

setpriv: run a program with different privilege settings
zramctl: tool to quickly set up zram device parameters, to reset zram devices, and to query the status of used zram devices

New features/options

agetty (alternative Linux getty):

--reload reload prompts on running agetty instances

blkdiscard (discard the content of sectors on a device):

-p, --step <num>    size of the discard iterations within the offset
-z, --zeroout       zero-fill rather than discard

chrt (show or change the real-time scheduling attributes of a process):

-d, --deadline            set policy to SCHED_DEADLINE
-T, --sched-runtime <ns>  runtime parameter for DEADLINE
-P, --sched-period <ns>   period parameter for DEADLINE
-D, --sched-deadline <ns> deadline parameter for DEADLINE
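
The three DEADLINE parameters are not independent: as documented in sched(7), the kernel expects runtime <= deadline <= period. The sketch below checks that relation before any values would be handed to chrt. The class and method names are hypothetical and the values (in nanoseconds) are only examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeadlineParams:
    """SCHED_DEADLINE parameters, all expressed in nanoseconds."""
    runtime_ns: int
    deadline_ns: int
    period_ns: int

    def is_valid(self):
        # The kernel requires: runtime <= deadline <= period (and positive values).
        return 0 < self.runtime_ns <= self.deadline_ns <= self.period_ns

    def utilization(self):
        # Fraction of one CPU that the task may consume in each period.
        return self.runtime_ns / self.period_ns


if __name__ == "__main__":
    # Example values only: 10 ms of runtime, due within 30 ms, every 100 ms.
    params = DeadlineParams(runtime_ns=10_000_000,
                            deadline_ns=30_000_000,
                            period_ns=100_000_000)
    print(params.is_valid(), params.utilization())
```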

fdformat (do a low-level formatting of a floppy disk):

-f, --from <N>    start at the track N (default 0)
-t, --to <N>      stop at the track N
-r, --repair <N>  try to repair tracks failed during the verification (max N retries)

fdisk (display or manipulate a disk partition table):

-B, --protect-boot            don't erase bootbits when creating a new label
-o, --output <list>           output columns
    --bytes                   print SIZE in bytes rather than in human readable format
-w, --wipe <mode>             wipe signatures (auto, always or never)
-W, --wipe-partitions <mode>  wipe signatures from new partitions (auto, always or never)
New available columns (for -o):
 gpt: Device Start End Sectors Size Type Type-UUID Attrs Name UUID
 dos: Device Start End Sectors Cylinders Size Type Id Attrs Boot End-C/H/S Start-C/H/S
 bsd: Slice Start End Sectors Cylinders Size Type Bsize Cpg Fsize
 sgi: Device Start End Sectors Cylinders Size Type Id Attrs
 sun: Device Start End Sectors Cylinders Size Type Id Flags

findmnt (find a (mounted) filesystem):

-J, --json             use JSON output format
-M, --mountpoint <dir> the mountpoint directory
-x, --verify           verify mount table content (default is fstab)
    --verbose          print more details

flock (manage file locks from shell scripts):

-F, --no-fork            execute command without forking
    --verbose            increase verbosity

getty (open a terminal and set its mode):

--reload               reload prompts on running agetty instances

hwclock (query or set the hardware clock):

--get            read hardware clock and print drift corrected result
--update-drift   update drift factor in /etc/adjtime (requires --set or --systohc)

ldattach (attach a line discipline to a serial line):

-c, --intro-command <string>  intro sent before ldattach
-p, --pause <seconds>         pause between intro and ldattach

logger (enter messages into the system log):

-e, --skip-empty         do not log empty lines when processing files
    --no-act             do everything except the write the log
    --octet-count        use rfc6587 octet counting
-S, --size <size>        maximum size for a single message
    --rfc3164            use the obsolete BSD syslog protocol
    --rfc5424[=<snip>]   use the syslog protocol (the default for remote);
                           <snip> can be notime, or notq, and/or nohost
    --sd-id <id>         rfc5424 structured data ID
    --sd-param <data>    rfc5424 structured data name=value
    --msgid <msgid>      set rfc5424 message id field
    --socket-errors[=<on|off|auto>] print connection errors when using Unix sockets
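
To show what --rfc5424 and --octet-count refer to on the wire, here is a small sketch that builds a message in the RFC 5424 layout and frames it with RFC 6587 octet counting (the message length in bytes, a space, then the message). It does not reproduce logger's exact output; the function names are hypothetical and the timestamp, host name, application name and text are placeholders.

```python
def rfc5424_message(facility, severity, timestamp, hostname, app_name, msg,
                    procid="-", msgid="-", structured_data="-"):
    """Build a syslog message in the RFC 5424 layout ("-" means 'no value')."""
    pri = facility * 8 + severity  # PRI combines facility and severity
    return (f"<{pri}>1 {timestamp} {hostname} {app_name} "
            f"{procid} {msgid} {structured_data} {msg}")


def octet_count_frame(message):
    """Frame a message as 'LENGTH SPACE MESSAGE', as in RFC 6587 octet counting."""
    payload = message.encode("utf-8")
    return str(len(payload)).encode("ascii") + b" " + payload


if __name__ == "__main__":
    # Placeholder values; facility 1 is "user", severity 5 is "notice".
    message = rfc5424_message(1, 5, "2017-05-19T12:00:00Z",
                              "host.example.org", "demo", "hello")
    print(octet_count_frame(message))
```

Octet counting lets a receiver on a stream transport such as TCP know where each message ends without relying on a trailing newline.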

losetup (set up and control loop devices):

-L, --nooverlap               avoid possible conflict between devices
    --direct-io[=<on|off>]    open backing file with O_DIRECT
-J, --json                    use JSON --list output format
New available --list column:
DIO  access backing file with direct-io

lsblk (list information about block devices):

-J, --json           use JSON output format
New available columns (for --output):
HOTPLUG  removable or hotplug device (usb, pcmcia, ...)
SUBSYSTEMS  de-duplicated chain of subsystems
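
The new --json switches (here on lsblk, and likewise on findmnt, losetup, lslocks and sfdisk) make the output easy to consume from scripts. A minimal sketch follows; it uses a hand-written sample in the general shape lsblk emits rather than real output, and the device names, sizes and helper functions are invented for illustration.

```python
import json

# Hand-written sample; not captured from a real system.
SAMPLE = """
{
  "blockdevices": [
    {"name": "sda", "size": "20G", "type": "disk", "mountpoint": null,
     "children": [
       {"name": "sda1", "size": "19G", "type": "part", "mountpoint": "/"},
       {"name": "sda2", "size": "1G", "type": "part", "mountpoint": "[SWAP]"}
     ]},
    {"name": "sr0", "size": "1024M", "type": "rom", "mountpoint": null}
  ]
}
"""


def walk(devices, depth=0):
    """Yield (depth, device) for every device and its nested children."""
    for dev in devices:
        yield depth, dev
        yield from walk(dev.get("children", []), depth + 1)


def mounted(devices):
    """Map device name to mountpoint for every device that has one."""
    return {dev["name"]: dev["mountpoint"]
            for _, dev in walk(devices) if dev.get("mountpoint")}


if __name__ == "__main__":
    tree = json.loads(SAMPLE)["blockdevices"]
    for depth, dev in walk(tree):
        print("  " * depth + dev["name"], dev["type"], dev["size"])
    print(mounted(tree))
```

On a real system the sample string could be replaced by the output of lsblk --json, for instance captured with subprocess.run(["lsblk", "--json"], capture_output=True, text=True).stdout.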

lscpu (display information about the CPU architecture):

-y, --physical          print physical instead of logical IDs
New available column:
DRAWER  logical drawer number

lslocks (list local system locks):

-J, --json             use JSON output format
-i, --noinaccessible   ignore locks without read permissions

nsenter (run a program with namespaces of other processes):

-C, --cgroup[=<file>]      enter cgroup namespace
    --preserve-credentials do not touch uids or gids
-Z, --follow-context       set SELinux context according to --target PID

rtcwake (enter a system sleep state until a specified wakeup time):

--date <timestamp>   date time of timestamp to wake
--list-modes         list available modes

sfdisk (display or manipulate a disk partition table):

New Commands:
-J, --json <dev>                  dump partition table in JSON format
-F, --list-free [<dev> ...]       list unpartitioned free areas of each device
-r, --reorder <dev>               fix partitions order (by start offset)
    --delete <dev> [<part> ...]   delete all or specified partitions
--part-label <dev> <part> [<str>] print or change partition label
--part-type <dev> <part> [<type>] print or change partition type
--part-uuid <dev> <part> [<uuid>] print or change partition uuid
--part-attrs <dev> <part> [<str>] print or change partition attributes
New Options:
-a, --append                   append partitions to existing partition table
-b, --backup                   backup partition table sectors (see -O)
    --bytes                    print SIZE in bytes rather than in human readable format
    --move-data[=<typescript>] move partition data after relocation (requires -N)
    --color[=<when>]           colorize output (auto, always or never)
                               colors are enabled by default
-N, --partno <num>             specify partition number
-n, --no-act                   do everything except write to device
    --no-tell-kernel           do not tell kernel about changes
-O, --backup-file <path>       override default backup file name
-o, --output <list>            output columns
-w, --wipe <mode>              wipe signatures (auto, always or never)
-W, --wipe-partitions <mode>   wipe signatures from new partitions (auto, always or never)
-X, --label <name>             specify label type (dos, gpt, ...)
-Y, --label-nested <name>      specify nested label type (dos, bsd)
Available columns (for -o):
 gpt: Device Start End Sectors Size Type Type-UUID Attrs Name UUID
 dos: Device Start End Sectors Cylinders Size Type Id Attrs Boot End-C/H/S Start-C/H/S
 bsd: Slice Start  End Sectors Cylinders Size Type Bsize Cpg Fsize
 sgi: Device Start End Sectors Cylinders Size Type Id Attrs
 sun: Device Start End Sectors Cylinders Size Type Id Flags

swapon (enable devices and files for paging and swapping):

-o, --options <list>     comma-separated list of swap options
New available columns (for --show):
UUID   swap uuid
LABEL  swap label

unshare (run a program with some namespaces unshared from the parent):

-C, --cgroup[=<file>]                              unshare cgroup namespace
    --propagation slave|shared|private|unchanged   modify mount propagation in mount namespace
-s, --setgroups allow|deny                         control the setgroups syscall in user namespaces

Deprecated / removed options

sfdisk (display or manipulate a disk partition table):

-c, --id                  change or print partition Id
    --change-id           change Id
    --print-id            print Id
-C, --cylinders <number>  set the number of cylinders to use
-H, --heads <number>      set the number of heads to use
-S, --sectors <number>    set the number of sectors to use
-G, --show-pt-geometry    deprecated, alias to --show-geometry
-L, --Linux               deprecated, only for backward compatibility
-u, --unit S              deprecated, only sector unit is supported

17 May 2015

Lunar: Reproducible builds: week 2 in Stretch cycle

What happened about the reproducible builds effort for this week:

Media coverage

Debian's effort on reproducible builds has been covered in the June 2015 issue of Linux Magazin in Germany.

Article about reproducible builds in Linux Magazin June 2015

Toolchain fixes

gregor herrmann uploaded libextutils-depends-perl/0.404-1 which makes its output deterministic.
Christian Hofstaedtler uploaded yard/0.8.7.4-2 which no longer writes timestamps in the generated documentation. Original patch by Chris Lamb.
Emmanuel Bourg uploaded maven-plugin-tools/3.3-2 which removes the date from the plugin descriptor. Patch by Reiner Herrmann.
Emmanuel Bourg uploaded maven-archiver/2.6-1 which now uses the date set in the DEB_CHANGELOG_DATETIME environment variable for the timestamp in the pom.properties file embedded in the jar files. Original patch by Chris West.
Nicolas Boulenguez uploaded dh-ada-library/6.4 which will warn against non deterministic ALI for sources newer than changelog.
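
Several of the fixes above follow the same pattern: a value taken from the build-time clock is either dropped or replaced by a fixed date. The sketch below shows the effect with gzip, whose header embeds a modification time. The function names and the fixed epoch value are illustrative only and are not taken from any of the patches mentioned here.

```python
import gzip
import io
import struct


def deterministic_gzip(data, mtime=0):
    """Compress `data` with a pinned header timestamp and no embedded file name."""
    buf = io.BytesIO()
    # filename="" keeps the name field out of the header; mtime pins the timestamp
    # that would otherwise default to the current time.
    with gzip.GzipFile(filename="", fileobj=buf, mode="wb", mtime=mtime) as gz:
        gz.write(data)
    return buf.getvalue()


def header_mtime(blob):
    """Read the 32-bit little-endian MTIME field from a gzip header."""
    return struct.unpack("<I", blob[4:8])[0]


if __name__ == "__main__":
    fixed_epoch = 1431820800  # arbitrary fixed value, stands in for a changelog date
    first = deterministic_gzip(b"same input", fixed_epoch)
    second = deterministic_gzip(b"same input", fixed_epoch)
    print(first == second, header_mtime(first))
```

With the timestamp pinned, two runs over the same input produce byte-identical archives, which is the property a reproducible build needs.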

josch rebased the experimental version of debhelper on 9.20150507.

Packages fixed

The following 515 packages became reproducible due to changes of their build dependencies: airport-utils, airspy-host, all-in-one-sidebar, ampache, aptfs, arpack, asciio, aspell-kk, asused, balance, batmand, binutils-avr, bioperl, bpm-tools, c2050, cakephp-instaweb, carton, cbp2make, checkbot, checksecurity, chemeq, chronicle, cube2-data, cucumber, darkstat, debci, desktop-file-utils, dh-linktree, django-pagination, dosbox, eekboek, emboss-explorer, encfs, exabgp, fbasics, fife, fonts-lexi-saebom, gdnsd, glances, gnome-clocks, gunicorn, haproxy, haskell-aws, haskell-base-unicode-symbols, haskell-base64-bytestring, haskell-basic-prelude, haskell-binary-shared, haskell-binary, haskell-bitarray, haskell-bool-extras, haskell-boolean, haskell-boomerang, haskell-bytestring-lexing, haskell-bytestring-mmap, haskell-config-value, haskell-mueval, haskell-tasty-kat, itk3, jnr-constants, jshon, kalternatives, kdepim-runtime, kdevplatform, kwalletcli, lemonldap-ng, libalgorithm-combinatorics-perl, libalgorithm-diff-xs-perl, libany-uri-escape-perl, libanyevent-http-scopedclient-perl, libanyevent-perl, libanyevent-processor-perl, libapache-session-wrapper-perl, libapache-sessionx-perl, libapp-options-perl, libarch-perl, libarchive-peek-perl, libaudio-flac-header-perl, libaudio-wav-perl, libaudio-wma-perl, libauth-yubikey-decrypter-perl, libauthen-krb5-simple-perl, libauthen-simple-perl, libautobox-dump-perl, libb-keywords-perl, libbarcode-code128-perl, libbio-das-lite-perl, libbio-mage-perl, libbrowser-open-perl, libbusiness-creditcard-perl, libbusiness-edifact-interchange-perl, libbusiness-isbn-data-perl, libbusiness-tax-vat-validation-perl, libcache-historical-perl, libcache-memcached-perl, libcairo-gobject-perl, libcarp-always-perl, libcarp-fix-1-25-perl, libcatalyst-action-serialize-data-serializer-perl, libcatalyst-controller-formbuilder-perl, libcatalyst-dispatchtype-regex-perl, 
libcatalyst-plugin-authentication-perl, libcatalyst-plugin-authorization-acl-perl, libcatalyst-plugin-session-store-cache-perl, libcatalyst-plugin-session-store-fastmmap-perl, libcatalyst-plugin-static-simple-perl, libcatalyst-view-gd-perl, libcgi-application-dispatch-perl, libcgi-application-plugin-authentication-perl, libcgi-application-plugin-logdispatch-perl, libcgi-application-plugin-session-perl, libcgi-application-server-perl, libcgi-compile-perl, libcgi-xmlform-perl, libclass-accessor-classy-perl, libclass-accessor-lvalue-perl, libclass-accessor-perl, libclass-c3-adopt-next-perl, libclass-dbi-plugin-type-perl, libclass-field-perl, libclass-handle-perl, libclass-load-perl, libclass-ooorno-perl, libclass-prototyped-perl, libclass-returnvalue-perl, libclass-singleton-perl, libclass-std-fast-perl, libclone-perl, libconfig-auto-perl, libconfig-jfdi-perl, libconfig-simple-perl, libconvert-basen-perl, libconvert-ber-perl, libcpan-checksums-perl, libcpanplus-dist-build-perl, libcriticism-perl, libcrypt-cracklib-perl, libcrypt-dh-gmp-perl, libcrypt-mysql-perl, libcrypt-passwdmd5-perl, libcrypt-simple-perl, libcss-packer-perl, libcss-tiny-perl, libcurses-widgets-perl, libdaemon-control-perl, libdancer-plugin-database-perl, libdancer-session-cookie-perl, libdancer2-plugin-database-perl, libdata-format-html-perl, libdata-uuid-libuuid-perl, libdata-validate-domain-perl, libdate-jd-perl, libdate-simple-perl, libdatetime-astro-sunrise-perl, libdatetime-event-cron-perl, libdatetime-format-dbi-perl, libdatetime-format-epoch-perl, libdatetime-format-mail-perl, libdatetime-tiny-perl, libdatrie, libdb-file-lock-perl, libdbd-firebird-perl, libdbix-abstract-perl, libdbix-class-datetime-epoch-perl, libdbix-class-dynamicdefault-perl, libdbix-class-introspectablem2m-perl, libdbix-class-timestamp-perl, libdbix-connector-perl, libdbix-oo-perl, libdbix-searchbuilder-perl, libdbix-xml-rdb-perl, libdevel-stacktrace-ashtml-perl, libdigest-hmac-perl, libdist-zilla-plugin-emailnotify-perl, 
libemail-date-format-perl, libemail-mime-perl, libemail-received-perl, libemail-sender-perl, libemail-simple-perl, libencode-detect-perl, libexporter-tidy-perl, libextutils-cchecker-perl, libextutils-installpaths-perl, libextutils-libbuilder-perl, libextutils-makemaker-cpanfile-perl, libextutils-typemap-perl, libfile-counterfile-perl, libfile-pushd-perl, libfile-read-perl, libfile-touch-perl, libfile-type-perl, libfinance-bank-ie-permanenttsb-perl, libfont-freetype-perl, libfrontier-rpc-perl, libgd-securityimage-perl, libgeo-coordinates-utm-perl, libgit-pureperl-perl, libgnome2-canvas-perl, libgnome2-wnck-perl, libgraph-readwrite-perl, libgraphics-colornames-www-perl, libgssapi-perl, libgtk2-appindicator-perl, libgtk2-gladexml-simple-perl, libgtk2-notify-perl, libhash-asobject-perl, libhash-moreutils-perl, libhtml-calendarmonthsimple-perl, libhtml-display-perl, libhtml-fillinform-perl, libhtml-form-perl, libhtml-formhandler-model-dbic-perl, libhtml-html5-entities-perl, libhtml-linkextractor-perl, libhtml-tableextract-perl, libhtml-widget-perl, libhtml-widgets-selectlayers-perl, libhtml-wikiconverter-mediawiki-perl, libhttp-async-perl, libhttp-body-perl, libhttp-date-perl, libimage-imlib2-perl, libimdb-film-perl, libimport-into-perl, libindirect-perl, libio-bufferedselect-perl, libio-compress-lzma-perl, libio-compress-perl, libio-handle-util-perl, libio-interface-perl, libio-multiplex-perl, libio-socket-inet6-perl, libipc-system-simple-perl, libiptables-chainmgr-perl, libjoda-time-java, libjsr305-java, libkiokudb-perl, liblemonldap-ng-cli-perl, liblexical-var-perl, liblingua-en-fathom-perl, liblinux-dvb-perl, liblocales-perl, liblog-dispatch-configurator-any-perl, liblog-log4perl-perl, liblog-report-lexicon-perl, liblwp-mediatypes-perl, liblwp-protocol-https-perl, liblwpx-paranoidagent-perl, libmail-sendeasy-perl, libmarc-xml-perl, libmason-plugin-routersimple-perl, libmasonx-processdir-perl, libmath-base85-perl, libmath-basecalc-perl, libmath-basecnv-perl, 
libmath-bigint-perl, libmath-convexhull-perl, libmath-gmp-perl, libmath-gradient-perl, libmath-random-isaac-perl, libmath-random-oo-perl, libmath-random-tt800-perl, libmath-tamuanova-perl, libmemoize-expirelru-perl, libmemoize-memcached-perl, libmime-base32-perl, libmime-lite-tt-perl, libmixin-extrafields-param-perl, libmock-quick-perl, libmodule-cpanfile-perl, libmodule-load-conditional-perl, libmodule-starter-pbp-perl, libmodule-util-perl, libmodule-versions-report-perl, libmongodbx-class-perl, libmoo-perl, libmoosex-app-cmd-perl, libmoosex-attributehelpers-perl, libmoosex-blessed-reconstruct-perl, libmoosex-insideout-perl, libmoosex-relatedclassroles-perl, libmoosex-role-timer-perl, libmoosex-role-withoverloading-perl, libmoosex-storage-perl, libmoosex-types-common-perl, libmoosex-types-uri-perl, libmoox-singleton-perl, libmoox-types-mooselike-numeric-perl, libmousex-foreign-perl, libmp3-tag-perl, libmysql-diff-perl, libnamespace-clean-perl, libnet-bonjour-perl, libnet-cli-interact-perl, libnet-daap-dmap-perl, libnet-dbus-glib-perl, libnet-dns-perl, libnet-frame-perl, libnet-google-authsub-perl, libnet-https-any-perl, libnet-https-nb-perl, libnet-idn-encode-perl, libnet-idn-nameprep-perl, libnet-imap-client-perl, libnet-irc-perl, libnet-mac-vendor-perl, libnet-openid-server-perl, libnet-smtp-ssl-perl, libnet-smtp-tls-perl, libnet-smtpauth-perl, libnet-snpp-perl, libnet-sslglue-perl, libnet-telnet-perl, libnhgri-blastall-perl, libnumber-range-perl, libobject-signature-perl, libogg-vorbis-header-pureperl-perl, libopenoffice-oodoc-perl, libparse-cpan-packages-perl, libparse-debian-packages-perl, libparse-fixedlength-perl, libparse-syslog-perl, libparse-win32registry-perl, libpdf-create-perl, libpdf-report-perl, libperl-destruct-level-perl, libperl-metrics-simple-perl, libperl-minimumversion-perl, libperl6-slurp-perl, libpgobject-simple-perl, libplack-middleware-fixmissingbodyinredirect-perl, libplack-test-externalserver-perl, libplucene-perl, libpod-tests-perl, 
libpoe-component-client-ping-perl, libpoe-component-jabber-perl, libpoe-component-resolver-perl, libpoe-component-server-soap-perl, libpoe-component-syndicator-perl, libposix-strftime-compiler-perl, libposix-strptime-perl, libpostscript-simple-perl, libproc-processtable-perl, libprotocol-osc-perl, librcs-perl, libreadonly-xs-perl, libreturn-multilevel-perl, librivescript-perl, librouter-simple-perl, librrd-simple-perl, libsafe-isa-perl, libscope-guard-perl, libsemver-perl, libset-tiny-perl, libsharyanto-file-util-perl, libshell-command-perl, libsnmp-info-perl, libsoap-lite-perl, libstat-lsmode-perl, libstatistics-online-perl, libstring-compare-constanttime-perl, libstring-format-perl, libstring-toidentifier-en-perl, libstring-tt-perl, libsub-recursive-perl, libsvg-tt-graph-perl, libsvn-notify-perl, libswish-api-common-perl, libtap-formatter-junit-perl, libtap-harness-archive-perl, libtemplate-plugin-number-format-perl, libtemplate-plugin-yaml-perl, libtemplate-tiny-perl, libtenjin-perl, libterm-visual-perl, libtest-block-perl, libtest-carp-perl, libtest-classapi-perl, libtest-cmd-perl, libtest-consistentversion-perl, libtest-data-perl, libtest-databaserow-perl, libtest-differences-perl, libtest-file-sharedir-perl, libtest-hasversion-perl, libtest-kwalitee-perl, libtest-lectrotest-perl, libtest-module-used-perl, libtest-object-perl, libtest-perl-critic-perl, libtest-pod-coverage-perl, libtest-script-perl, libtest-script-run-perl, libtest-spelling-perl, libtest-strict-perl, libtest-synopsis-perl, libtest-trap-perl, libtest-unit-perl, libtest-utf8-perl, libtest-without-module-perl, libtest-www-selenium-perl, libtest-xml-simple-perl, libtest-yaml-perl, libtex-encode-perl, libtext-bibtex-perl, libtext-csv-encoded-perl, libtext-csv-perl, libtext-dhcpleases-perl, libtext-diff-perl, libtext-quoted-perl, libtext-trac-perl, libtext-vfile-asdata-perl, libthai, libthread-conveyor-perl, libthread-sigmask-perl, libtie-cphash-perl, libtie-ical-perl, libtime-stopwatch-perl, 
libtk-dirselect-perl, libtk-pod-perl, libtorrent, libturpial, libunicode-japanese-perl, libunicode-maputf8-perl, libunicode-stringprep-perl, libuniversal-isa-perl, libuniversal-moniker-perl, liburi-encode-perl, libvi-quickfix-perl, libvideo-capture-v4l-perl, libvideo-fourcc-info-perl, libwiki-toolkit-plugin-rss-reader-perl, libwww-mechanize-formfiller-perl, libwww-mechanize-gzip-perl, libwww-mechanize-perl, libwww-opensearch-perl, libx11-freedesktop-desktopentry-perl, libxc, libxml-dtdparser-perl, libxml-easy-perl, libxml-handler-trees-perl, libxml-libxml-iterator-perl, libxml-libxslt-perl, libxml-rss-perl, libxml-validator-schema-perl, libxml-xpathengine-perl, libxml-xql-perl, llvm-py, madbomber, makefs, mdpress, media-player-info, meta-kde-telepathy, metamonger, mmm-mode, mupen64plus-audio-sdl, mupen64plus-rsp-hle, mupen64plus-ui-console, mupen64plus-video-z64, mussort, newpid, node-formidable, node-github-url-from-git, node-transformers, nsnake, odin, otcl, parsley, pax, pcsc-perl, pd-purepd, pen, prank, proj, proot, puppet-module-puppetlabs-postgresql, python-async, python-pysnmp4, qrencode, r-bioc-graph, r-bioc-hypergraph, r-bioc-iranges, r-bioc-xvector, r-cran-pscl, rbenv, rlinetd, rs, ruby-ascii85, ruby-cutest, ruby-ejs, ruby-factory-girl, ruby-hdfeos5, ruby-kpeg, ruby-libxml, ruby-password, ruby-zip-zip, sdl-sound1.2, stterm, systemd, taktuk, tcc, tryton-modules-account-invoice, ttf-summersby, tupi, tuxpuck, unknown-horizons, unsafe-mock, vcheck, versiontools, vim-addon-manager, vlfeat, vsearch, xacobeo, xen-tools, yubikey-personalization-gui, yubikey-personalization. The following packages became reproducible after getting fixed:

cwirc/2.0.0-8 uploaded by Colin Tuckley, original patch by Reiner Herrmann.
darkplaces/0~20140513+svn12208-1 by Simon McVittie.
exactimage/0.9.1-4 by Sven Eckelmann.
gnupg/1.4.19-1 by Daniel Kahn Gillmor.
httpunit/1.7+dfsg-11 by Emmanuel Bourg.
hy/0.10.1-2 uploaded by Tianon Gravi, original patch by Reiner Herrmann.
ioquake3/1.36+u20150412+dfsg1-2 by Simon McVittie, original patch by Reiner Herrmann.
kiwi/1.9.22-3 by Jelmer Vernooij.
lava-server/2015.05-1 uploaded by Neil Williams, original patch by Reiner Herrmann.
libelixirfm-perl/1.1.976-4 uploaded by gregor herrmann, original patch by Chris Lamb.
littler/0.2.3-2 by Dirk Eddelbuettel.
mednafen/0.9.38.1-1 by Stephen Kitt.
nftables/0.4-4 by Arturo Borrero Gonzalez.
ntdb/1.0-7 by Jelmer Vernooij.
onioncat/0.2.2+svn566-1 by intrigeri.
openarena/0.8.8-13 by Simon McVittie.
openarena-085-data/0.8.5split-6 by Simon McVittie.
openarena-088-data/0.8.8-3 by Simon McVittie.
openarena-data/0.8.5split-6 by Simon McVittie.
openarena-maps/0.8.5split-6 by Simon McVittie.
openarena-players/0.8.5split-6 by Simon McVittie.
openarena-players-mature/0.8.5split-6 by Simon McVittie.
openarena-textures/0.8.5split-6 by Simon McVittie.
pybik/2.0-1 by B. Clausius.
python-xmp-toolkit/2.0.1+git20140309.5437b0a-1 by Daniel Stender.
quakespasm/0.90.0-3 by Stephen Kitt.
traceroute/1:2.0.21-1 uploaded by Laszlo Boszormenyi, original patch by Lunar.
unar/1.8.1-4 uploaded by Matt Kraai, original patch by Lunar.
websvn/2.3.3-1.3 uploaded by Thijs Kinkhorst, original patch by Chris Lamb.
xd/3.23.01-2 uploaded by Frank B. Brokken, original patch by Chris Lamb.

Some uploads fixed some reproducibility issues but not all of them:

ada-reference-manual/1:2012.2-5 by Nicolas Boulenguez.
apparmor/2.9.2-2 by intrigeri.
argyll/1.7.0+repack-1 by Jörg Frings-Fürst.
lava-dispatcher/2015.05-1 by Neil Williams.
libaunit/3.7.1-2 by Nicolas Boulenguez.
libflorist/2014-2 by Nicolas Boulenguez.
mailcrypt/3.5.9-8 uploaded by Barak A. Pearlmutter, original patch by Chris Lamb.
openchange/1:2.2-7 by Jelmer Vernooij.
sane-backends/1.0.24-11 by Jörg Frings-Fürst; timestamps in .dvi and .ps.
tomcat6/6.0.41-4 by Emmanuel Bourg.
tomcat7/7.0.61-1 by Emmanuel Bourg; currently FTBFS.
tomcat8/8.0.22-2 by Emmanuel Bourg.

Patches submitted which did not make their way to the archive yet:

#784541 on yasm by Lunar: remove build date from version strings.
#784694 on smcroute by Micha Lenk: remove build date from version string.
#784672 on gnumeric by Daniel Kahn Gillmor: remove timestamps in embedded gzip'ed data in shared library.
#774347 on sed by Lunar: fix permissions before creating the package.
#784352 on icebreaker by Reiner Herrmann: use UTC timezone when calculating version date.
#784325 on kde-workspace by Lunar: make the output of kdm confproc.pl stable.
#784602 on monkeysign by Daniel Kahn Gillmor: use time of debian/changelog entry when generating documentation.
#784723 on alot by Juan Picca: pass time of debian/changelog entry to Sphinx.
#784538 on file-rc by Lunar: use sed instead of grep+mv to keep correct file permissions.
#784335 on libapache2-mod-perl2 by Lunar: set PERL_HASH_SEED=0 during configure to make the generated .c and .h files stable.
#784267 on mpv by Lunar: pass --disable-build-date to ./configure.
#784793 on bugs-everywhere by Daniel Kahn Gillmor: use time of debian/changelog entry as build date.
#784318 on gnome-desktop3 by Lunar: use time of debian/changelog entry as build date.
#774504 on debianutils by Lunar: fix file permissions.
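
Several of the patches above replace the current time with the time of the latest debian/changelog entry, and one of them forces UTC so the result does not depend on the builder's timezone. The idea can be sketched in Python as follows. This is an illustration only, not code from any of the listed patches: the function name, the regular expression, and the sample changelog are assumptions made for the example.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Trailer line of a changelog entry:
# " -- Name <email>  RFC 2822 date" (two spaces before the date).
TRAILER = re.compile(r"^ -- .* <[^>]*>  (?P<date>.+)$")


def changelog_build_date(changelog_text: str) -> datetime:
    """Return the date of the newest changelog entry, converted to UTC.

    Hypothetical helper: the newest entry comes first in the file,
    so the first trailer line found is the one we want.
    """
    for line in changelog_text.splitlines():
        match = TRAILER.match(line)
        if match:
            parsed = parsedate_to_datetime(match.group("date"))
            return parsed.astimezone(timezone.utc)
    raise ValueError("no changelog trailer line found")


# Made-up changelog entry used only for this example.
SAMPLE = """\
hello (1.0-2) unstable; urgency=medium

  * Make the build reproducible.

 -- Jane Doe <jane@example.org>  Tue, 12 May 2015 00:30:00 +0200
"""

build_date = changelog_build_date(SAMPLE)
# Same string on every builder, whatever the local clock or timezone.
version_string = build_date.strftime("%Y-%m-%d")
print(version_string)
```

Because the date comes from the source package and is normalised to UTC, two builds of the same source embed the same string, whereas calling the system clock at build time would produce a different value on every run.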

reproducible.debian.net

Alioth now hosts a script that can be used to redo builds and tests for a package. This was previously done manually through requests over the IRC channel, so the script should reduce the number of interruptions for the Jenkins maintainers. The graph of the oldest build per day has been fixed. Maintenance scripts no longer error out when there are no files to remove. Holger Levsen started work on being able to test variations of CPU features and build date (as in build in another month of 1984) by using virtual machines.

debbindiff development

Version 18 has been released. It uses proper comparators for pk3 and info files. Tar member names are now assumed to be UTF-8 encoded. The limit on the maximum number of different lines has been removed; let's see on reproducible.debian.net how it goes for pathological cases. It's now possible to specify both --html and --text output. When neither of them is specified, the default is to print a text report on the standard output (thanks to Paul Wise for the suggestion).

Documentation update

Nicolas Boulenguez investigated Ada libraries.

Package reviews

451 obsolete reviews have been removed and 156 added this week. Newly identified issues: running kernel version getting captured, random filenames in GHC debug symbols, and timestamps in headers generated by qdbusxml2cpp.

Misc.

Holger Levsen went to re:publica and talked about reproducible builds to developers and users there. Holger also had a chance to meet FreeBSD developers and discuss the status of FreeBSD. Investigations have started on how it could be made part of our current test system. Laurent Guerby gave Lunar access to systems in the GCC Compile Farm. Hopefully access to these powerful machines will help to fix packages for GCC, Iceweasel, and similar packages requiring long build times.