Search Results: "cklin"

10 August 2021

Shirish Agarwal: BBI, IP report, State Borders and Civil Aviation I

"If I have seen further, it is by standing on the shoulders of Giants." Isaac Newton, 1675 (although the phrase should really be credited to the 12th-century Bernard of Chartres). You will see why I have shared this once we get to the beginning of Civil Aviation history itself.

Comments on the BBI court case which happened in Kenya, and the subsequent appeal. I am not going to share much about the coverage of the BBI appeal, as Gautam Bhatia has shared his observations quite eloquently, both on the initial case and on the subsequent appeal, which lasted 5 days in Kenya and was shown all around the world thanks to YouTube. One of the interesting points which stuck with me was that in Kenya, sign language is one of the official languages. In fact, I was able to read quite a bit about the various sign languages which are there in Kenya. It just boggles the mind that there are countries that give importance to such things even though they are not as rich or as developed as what we call developed economies. I probably will give it more space and more depth later, as it does carry some important judicial jurisprudence which is being, and will be, felt around the world. How India reacts, or doesn't, is probably another matter altogether. But yes, it needs its own space, maybe after some more time.

Report on Standing Committee on IP Regulation in India and the false promises

Again, I do not want to take much time in sharing details about what the report contains, as the report can be found here. I have uploaded it on WordPress, in case of an issue. An observation on the same subject can be found here. At least to me, and probably to those who have been following the IP space, whether using or working on free software or on IP itself, the issues shared have been known since 1994. And it does benefit the industry rather than the country. This way, the rent-seekers and monopolists win. There is ample literature that shares how rich countries had weak IP regulation for decades and even centuries, till it became advantageous for them to have strong IP. One can look at the history of Europe and the United States for it. We can also look at the history of our neighbor China, which for the last 5 decades has used some provisions of IP and disregarded many others. But these words are of no use, as the policies made and shared are by the rich for the rich.

Fighting between two State Borders

Ironically, or because of it, two BJP-ruled states, Assam and Mizoram, fought between themselves, and six policemen died. While the history of the two states is complicated, it becomes a bit more complicated when one goes back into Assam and ULFA history and comes to know that ULFA could not have become that powerful unless the Marwaris, people of my clan, had given generous donations to them. They thought it was a good investment, which later would turn out to be untrue. Those who think ULFA has declined, or whatever, still don't have answers to this or this. Interestingly, both the Chief Ministers approached the Home Minister (Mr. Amit Shah) of the BJP. Mr. Shah was supposed to be the Chanakya, but in many instances, including this one, he decided to stay away. His statement was along the lines of "you guys figure it out yourself". There is a poem that was shared by the late poet Rahat Indori. I am sharing the same below as an image and will attempt a rough translation.
kisi ke baap ka hindustan todi hain Rahat Indori
Poets, whether in India or elsewhere, are known to speak truth to power and to be a bit of rebels. This poem by Rahat Indori is provocatively titled "Kisi ke baap ka Hindustan todi hai". It challenges the majoritarian idea that Hindustan/India belongs only to the majoritarian religion. He challenges, and at the same time asserts, that every Indian citizen, regardless of whatever his or her religion might be, is an Indian and can claim India as home. While the whole poem is compelling in itself, for me what hits home is the second stanza:

"Lagegi aag to aayenge ghar kayi zad me, yaha pe sirf hamara makan todi hai" (roughly: if a fire breaks out, many houses will come within its reach; it is not just our house that stands here). The meaning is simple yet subtle: he uses aag, or fire, as a symbol of hate, saying that if hate spreads, it won't be his home alone that gets torched. If one wants to understand literally what he meant, I present to you the cult Russian movie No Escapes, or Ogon as it is known in Russian. If one were to decipher why the Russian film doesn't talk about climate change, one has to view it through the prism of what their leader Vladimir Putin has said and done over the years. As can be seen even there, the situation is far more complex than one imagines. It is interesting to note that he denied that climate change is man-made till as late as last year and was on the side of Trump throughout his presidency. This was in 2017 as well as perhaps this. Interestingly, there was a change in tenor and note just a couple of weeks back, but that could be only politicking or much more. Statements that are not backed by legislation and application are usually just a whitewash. We would have to wait to see what concrete steps are taken by Putin, the Kremlin, and their Duma before saying anything either way.

Civil Aviation and the broad structure

Civil Aviation is a large topic and I would not be able to do justice to it all in one article/blog post. For example, I will not be getting into aircraft (Boeing, Airbus, Comac, etc.) or the new electric aircraft, as that would just make the blog post long. I will also not be talking about cargo, or visas, or many such topics, as all of them would and do need their own space. So this would be much more limited to airports and, to some extent, airlines, as one cannot survive without the other. The primary reason for doing this is that there is and has been a lot of myth-making in India about Civil Aviation in general, whether it has to do with Civil Aviation history or with whatever passes for policy in India.

A little early history

Man has always looked at the stars and envisaged himself or herself as a bird, flying with gay abandon. In fact, there have been many paintings and sculptures imagining how we would fly. The steam engine itself is said to have been invented as early as 82 BCE. But an attempt to fly was made by a monk called Eilmer (Elmer) of Malmesbury, who attempted it around 1010, after the birth of that rudimentary steam engine. The most famous of all would be Leonardo da Vinci, for his amazing sketches of flying machines around 1493. Then there is Cyrano de Bergerac, who apparently wrote two books, both sadly published after his death. Interestingly, you can find both the books and the gentleman in the Project Gutenberg archives. How much of M. Cyrano's exploits were his own and how much was embellished by M. Curtis, maybe a friend, maybe a lover, who knows; but it does give off the air of the swashbuckling adventurer of the time which many men aspired to. So, why not an author?

L'Autre Monde: ou les États et Empires de la Lune (Comical History of the States and Empires of the Moon) and Les États et Empires du Soleil (The States and Empires of the Sun): these two French books apparently had a lot of references to flying machines. Both of them were authored by Cyrano de Bergerac, and both were sadly published after his death, one apparently in 1656 and the other a couple of years later. By the 17th century, while it had become easy to know and measure latitude, measuring longitude was a problem. In fact, it can be argued, probably successfully, that India wouldn't have been under British rule, and the UK wouldn't have been a naval superpower, if the longitude problem hadn't been solved. Over the years the British Royal Navy suffered many blows; one of the most famous, or infamous, among them was the Scilly naval disaster of 1707, which led to the death of some 2,000 Royal Navy personnel. It led Queen Anne, who was ruling over England at that time, to pass, via Parliament, the Longitude Act, which was basically an open competition for anybody to fix the problem and carried prize money of £20,000. While nobody could claim the whole prize, many did get smaller amounts depending upon their achievements. The one who came closest was John Harrison, who made the first sea watch; with modifications over the years it was miniaturized into the pocket-sized marine chronometer, although I doubt the ones used today look anything like the ones of those days. But if that had not been invented, we would surely have been freed long ago, the assumption being that the East India Company's ships would have dashed onto the rocks so many times that the whole exercise would have been futile. The downside is that the maritime trade routes being used today, and the commerce they carry, would not have existed either; neither would aircraft or spaceflight, for that matter, or at the very least they would have been delayed by who knows how many years or decades. If one wants to read about the longitude problem, one can get the famous book "Longitude".

Many mythologies, including Indian and Arabian tales, feature the flying carpet, which would carry its passengers from one place to the next. Then there is also the mention of the Pushpak Vimana in ancient texts, but those secrets remain secrets. Think how much foreign exchange India could make by both using it and exporting the same worldwide. And I'm being serious. There are many who believe in it, but sadly, the ones who know the secret don't seem to want India's progress. Just think of the carbon credits that India could have, which by itself would make India a superpower. And I'm being serious.

Western Ideas and Implementation

Even in the 18th and 19th centuries, there were many machines designed for controlled flight, but it was only the Wright Flyer that was able to demonstrate a controlled flight, in 1903. The ones who came pretty close to what the Wrights achieved were Cayley and Langley. The Wrights actually studied what these pioneers had done. They looked at what Otto Lilienthal had done, as he had done a lot of hang-gliding and put a lot of literature into the public domain.

Furthermore, they also consulted Octave Chanute. The whole system and history of it are a bit complicated, but it does give a window onto what happened then. So it won't be wrong to say that what the Wright Brothers accomplished would probably not have been possible, or would have taken years or maybe even decades more, if that literature, those experiments, drawings, etc. had not been available in the commons. So, while they did their own experimentation, they also looked at what other people were doing and had done, which was in the public domain/commons.

They also did a lot of testing, which gave them new insights. Even the propulsion system they used in the 1903 flight was based on a design by Nicolaus Otto. In fact, the aircraft would not have been born if the Chinese had not invented kites in the early sixth century A.D. One also has to credit Isaac Newton for the three laws of motion, without which none of the above could have happened. What is credited to the Wright brothers is not just that they made the Wright Flyer at Kitty Hawk; they also made it commercial, selling it and variations of the design to the American military, and set up a pilot school where pilots were trained for war fighting. Some 119 pilots came out of that school. The Wrights thought that air supremacy would end the war early, but this turned out to be a false hope.

Competition and Those Magnificent Men and their flying machines

One of the first competitions to unlock creativity was the English Channel crossing prize offered by the Daily Mail. The crossing was successfully done by the Frenchman Louis Blériot. You can read his account here. There were quite a few competitions before World War 1 broke out. There is a beautiful, humorous movie that dedicates itself to imagining how things might have gone at that time. In fact, there have been two movies: this one, and an earlier movie called Sky Riders, that made many a youth dream. The other movie sadly is not yet in the public domain, and when it will be nobody knows, but if you see it, or even read about it, it gives you goosebumps.

World War 1 and Improvements to Aircraft

World War 1 is remembered as the Great War, or, in an attempt at irony, the War to end all wars. It did a lot of destruction to both people and property, and in fact laid the foundation of World War 2. At the same time, if World War 1 hadn't happened, then air power and aeroplane technology would have taken decades longer to develop. Even medicine and medical techniques were revolutionized by World War 1. In order to be brief, I am not sharing much about World War 1 itself, as otherwise it would become its own blog post. And while it had its heroes and villains, the who, when, and why could perhaps be tackled another time.

The Guggenheim Family and the birth of Civil Aviation

If one has to credit one family for the birth of Civil Aviation, it has to be the Guggenheim family. Again, I would not like to dwell on it too much, as much of their contribution has already been noted here. There are still quite a few things that need to be said and pointed out. First and foremost is the fact that they put lessons about flying into the syllabus from grade school to college and beyond, whereas in the Indian schooling system there is nothing like that to date. Here in India, even in engineering courses, you don't get much of this unless you go for professional aviation or aeronautical courses, and most of those courses cost a bomb, so only the very rich or the very determined (with loans) go for them, at least that's what my friends have shared. And there is no guarantee you will get a job after that, especially in today's climate. Then there are their funds, grants, and prizes, which were given to various people so that improvements could be made to United States Civil Aviation. This, as shared in the report/blog post linked above, was in response to what the younger brother saw as Europe having a large advantage in both military and civil aviation. They also made several grants to several universities, which would not only do notable work during the family's lifetime but carry on the legacy of researching different aspects of aircraft. One point that should be noted is that Europe was far ahead of the U.S. even then, which is what prompted the younger son. There had already been talk of civil/civilian flights on European routes, although much different from what either of us can imagine today. Even with everything that the U.S. had going for her, and still has, Europe is the one which has better airports, better facilities, better everything than the U.S. has, even today. If you look at lists of airports ranked by value for money or facilities, you would find many airports from Europe, some from Asia, and only a few from the U.S., even though Americans are some of the most frequent users of the service. But that debate and those arguments I would have to leave for perhaps the next blog post, as there is still a lot to be covered between the 1930s, the 1950s, and today. The Guggenheim archives do a fantastic job of sharing part of the story till the 1950s, but there is also quite a bit which they don't. I will probably start from that in the next blog post and then carry on ahead.

Lastly, before I wind up, I have to share why I felt the need to write, capture, and share this part of aviation history. The plain and simple reason being that many of the people I meet, either on the web, on Twitter, or even in real life, are just unaware of how this whole thing came about. The unawareness in my fellow brothers and sisters is just shocking, overwhelming. By sharing these articles, I would at least be able to guide them, or at least let them know how it all came to be and where things are going, and not just be so clueless. Till later.

2 August 2021

Colin Watson: Launchpad now runs on Python 3!

After a very long porting journey, Launchpad is finally running on Python 3 across all of our systems.

I wanted to take a bit of time to reflect on why my emotional responses to this port differ so much from those of some others who've done large ports, such as the Mercurial maintainers. It's hard to deny that we've had to burn a lot of time on this, which I'm sure has had an opportunity cost, and from one point of view it's essentially running to stand still: there is no single compelling feature that we get solely by porting to Python 3, although it's clearly a prerequisite for tidying up old compatibility code and being able to use modern language facilities in the future. And yet, on the whole, I found this a rewarding project and enjoyed doing it.

Some of this may be because by inclination I'm a maintenance programmer and actually enjoy this sort of thing. My default view tends to be that software version upgrades may be a pain but it's much better to get that pain over with as soon as you can rather than trying to hold back the tide; you can certainly get involved and try to shape where things end up, but rightly or wrongly I can't think of many cases when a righteously indignant user base managed to arrange for the old version to be maintained in perpetuity so that they never had to deal with the new thing (OK, maybe Perl 5 counts here).

I think a more compelling difference between Launchpad and Mercurial, though, may be that very few other people really had a vested interest in what Python version Launchpad happened to be running, because it's all server-side code (aside from some client libraries such as launchpadlib, which were ported years ago). As such, we weren't trying to do this with the internet having Strong Opinions at us. We were doing this because it was obviously the only long-term-maintainable path forward, and in more recent times because some of our library dependencies were starting to drop support for Python 2 and so it was obviously going to become a practical problem for us sooner or later; but if we'd just stayed on Python 2 forever then fundamentally hardly anyone else would really have cared directly, only maybe about some indirect consequences of that. I don't follow Mercurial development so I may be entirely off-base, but if other people were yelling at me about how late my project was to finish its port, that in itself would make me feel more negatively about the project even if I thought it was a good idea. Having most of the pressure come from ourselves rather than from outside meant that wasn't an issue for us.

I'm somewhat inclined to think of the process as an extreme version of paying down technical debt. Moving from Python 2.7 to 3.5, as we just did, means skipping over multiple language versions in one go, and if similar changes had been made more gradually it would probably have felt a lot more like the typical dependency update treadmill. I appreciate why not everyone might want to think of it this way: maybe this is just my own rationalization.

Reflections on porting to Python 3

I'm not going to defend the Python 3 migration process; it was pretty rough in a lot of ways. Nor am I going to spend much effort relitigating it here, as it's already been done to death elsewhere, and as I understand it the core Python developers have got the message loud and clear by now.
At a bare minimum, a lot of valuable time was lost early in Python 3's lifetime hanging on to flag-day-type porting strategies that were impractical for large projects, when it should have been providing for bilingual strategies (code that runs in both Python 2 and 3 for a transitional period), which is where most libraries and most large migrations ended up in practice. For instance, the early advice to library maintainers to maintain two parallel versions or perhaps translate dynamically with 2to3 was entirely impractical in most non-trivial cases and wasn't what most people ended up doing, and yet the idea that 2to3 is all you need still floats around Stack Overflow and the like as a result. (These days, I would probably point people towards something more like Eevee's porting FAQ as somewhere to start.)

There are various fairly straightforward things that people often suggest could have been done to smooth the path, and I largely agree: not removing the u'' string prefix only to put it back in 3.3, fewer gratuitous compatibility breaks in the name of tidiness, and so on. But if I had a time machine, the number one thing I would ask to have been done differently would be introducing type annotations in Python 2 before Python 3 branched off. It's true that it's technically possible to do type annotations in Python 2, but the fact that it's a different syntax that would have to be fixed later is offputting, and in practice it wasn't widely used in Python 2 code. To make a significant difference to the ease of porting, annotations would need to have been introduced early enough that lots of Python 2 library code used them so that porting code didn't have to be quite so much of an exercise of manually figuring out the exact nature of string types from context.

Launchpad is a complex piece of software that interacts with multiple domains: for example, it deals with a database, HTTP, web page rendering, Debian-format archive publishing, and multiple revision control systems, and there's often overlap between domains. Each of these tends to imply different kinds of string handling. Web page rendering is normally done mainly in Unicode, converting to bytes as late as possible; revision control systems normally want to spend most of their time working with bytes, although the exact details vary; HTTP is of course bytes on the wire, but Python's WSGI interface has some string type subtleties. In practice I found myself thinking about at least four string-like types (that is, things that in a language with a stricter type system I might well want to define as distinct types and restrict conversion between them): bytes, text, ordinary native strings (str in either language, encoded to UTF-8 in Python 2), and native strings with WSGI's encoding rules. Some of these are emergent properties of writing in the intersection of Python 2 and 3, which is effectively a specialized language of its own without coherent official documentation whose users must intuit its behaviour by comparing multiple sources of information, or by referring to unofficial porting guides: not a very satisfactory situation. Fortunately much of the complexity collapses once it becomes possible to write solely in Python 3.
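To make that taxonomy concrete, here is a minimal sketch of the kind of conversion helpers a bilingual codebase tends to grow. It is only an illustration, not Launchpad code: the six ensure_* functions are the real six API, while the WSGI helper's name and exact behaviour are assumptions for the example:

# Illustrative helpers for the four string-like types described above.
# six.ensure_binary/ensure_text/ensure_str are real six APIs; the WSGI
# helper is a sketch, not Launchpad's actual code.
import six


def to_bytes(value):
    # "bytes": encode text as UTF-8, leave bytes alone.
    return six.ensure_binary(value, encoding="utf-8")


def to_text(value):
    # "text": decode bytes as UTF-8, leave text alone.
    return six.ensure_text(value, encoding="utf-8")


def to_native(value):
    # Ordinary native str: UTF-8-encoded bytes on Python 2, text on Python 3.
    return six.ensure_str(value, encoding="utf-8")


def to_wsgi_native(value):
    # WSGI's "native strings" (PEP 3333) are bytes on Python 2 and, on
    # Python 3, text in which each character represents one byte
    # (ISO-8859-1), so the conversion rule differs from to_native().
    if six.PY2:
        return value if isinstance(value, bytes) else value.encode("ISO-8859-1")
    return value.decode("ISO-8859-1") if isinstance(value, bytes) else value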
Some of the difficulties we ran into are not ones that are typically thought of as Python 2-to-3 porting issues, because they were changed later in Python 3's development process. For instance, the email module was substantially improved in around the 3.2/3.3 timeframe to handle Python 3's bytes/text model more correctly, and since Launchpad sends quite a few different kinds of email messages and has some quite picky tests for exactly what it emits, this entailed a lot of work in our email sending code and in our test suite to account for that. (It took me a while to work out whether we should be treating raw email messages as bytes or as text; bytes turned out to work best.) 3.4 made some tweaks to the implementation of quoted-printable encoding that broke a number of our tests in ways that took some effort to fix, because the tests needed to work on both 2.7 and 3.5. The list goes on. I got quite proficient at digging through Python's git history to figure out when and why some particular bit of behaviour had changed.

One of the thorniest problems was parsing HTTP form data. We mainly rely on zope.publisher for this, which in turn relied on cgi.FieldStorage; but cgi.FieldStorage is badly broken in some situations on Python 3. Even if that bug were fixed in a more recent version of Python, we can't easily use anything newer than 3.5 for the first stage of our port due to the version of the base OS we're currently running, so it wouldn't help much. In the end I fixed some minor issues in the multipart module (and was kindly given co-maintenance of it) and converted zope.publisher to use it. Although this took a while to sort out, it seems to have gone very well.

A couple of other interesting late-arriving issues were around pickle. For most things we normally prefer safer formats such as JSON, but there are a few cases where we use pickle, particularly for our session databases. One of my colleagues pointed out that I needed to remember to tell pickle to stick to protocol 2, so that we'd be able to switch back and forward between Python 2 and 3 for a while; quite right, and we later ran into a similar problem with marshal too. A more surprising problem was that datetime.datetime objects pickled on Python 2 require special care when unpickling on Python 3; rather than the approach that ended up being implemented and documented for Python 3.6, though, I preferred a custom unpickler, both so that things would work on Python 3.5 and so that I wouldn't have to risk affecting the decoding of other pickled strings in the session database.
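For anyone facing the same cross-version pickle dance, the two ingredients look roughly like this. It is a simplified sketch under the assumption that datetime state is the only thing needing special treatment, not the custom unpickler Launchpad actually uses:

import pickle


def dump_session_value(value):
    # Protocol 2 is the highest protocol Python 2 understands, so values
    # written with it stay readable from both versions during the
    # transition.
    return pickle.dumps(value, protocol=2)


def load_session_value(blob):
    # On Python 3, the default encoding ('ASCII') chokes on the packed
    # byte state that Python 2 stored for datetime objects.  Passing
    # encoding='bytes' hands that state to datetime untouched; note that
    # any ordinary Python 2 str in the payload then comes back as bytes
    # too, which is why a real implementation may prefer a custom
    # Unpickler instead.
    return pickle.loads(blob, encoding="bytes")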
General lessons

Writing this over a year after Python 2's end-of-life date, and certainly nowhere near the leading edge of Python 3 porting work, it's perhaps more useful to look at this in terms of the lessons it has for other large technical debt projects.

I mentioned in my previous article that I used the approach of an enormous and frequently-rebased git branch as a working area for the port, committing often and sometimes combining and extracting commits for review once they seemed to be ready. A port of this scale would have been entirely intractable without a tool of similar power to git rebase, so I'm very glad that we finished migrating to git in 2019. I relied on this right up to the end of the port, and it also allowed for quick assessments of how much more there was to land. git worktree was also helpful, in that I could easily maintain working trees built for each of Python 2 and 3 for comparison.

As is usual for most multi-developer projects, all changes to Launchpad need to go through code review, although we sometimes make exceptions for very simple and obvious changes that can be self-reviewed. Since I knew from the outset that this was going to generate a lot of changes for review, I therefore structured my work from the outset to try to make it as easy as possible for my colleagues to review it. This generally involved keeping most changes to a somewhat manageable size of 800 lines or less (although this wasn't always possible), and arranging commits mainly according to the kind of change they made rather than their location. For example, when I needed to fix issues with / in Python 3 being true division rather than floor division, I did so in one commit across the various places where it mattered and took care not to mix it with other unrelated changes. This is good practice for nearly any kind of development, but it was especially important here since it allowed reviewers to consider a clear explanation of what I was doing in the commit message and then skim-read the rest of it much more quickly.

It was vital to keep the codebase in a working state at all times, and deploy to production reasonably often: this way if something went wrong the amount of code we had to debug to figure out what had happened was always tractable. (Although I can't seem to find it now to link to it, I saw an account a while back of a company that had taken a flag-day approach instead with a large codebase. It seemed to work for them, but I'm certain we couldn't have made it work for Launchpad.)

I can't speak too highly of Launchpad's test suite, much of which originated before my time. Without a great deal of extensive coverage of all sorts of interesting edge cases at both the unit and functional level, and a corresponding culture of maintaining that test suite well when making new changes, it would have been impossible to be anything like as confident of the port as we were.

As part of the porting work, we split out a couple of substantial chunks of the Launchpad codebase that could easily be decoupled from the core: its Mailman integration and its code import worker. Both of these had substantial dependencies with complex requirements for porting to Python 3, and arranging to be able to do these separately on their own schedule was absolutely worth it. Like disentangling balls of wool, any opportunity you can take to make things less tightly-coupled is probably going to make it easier to disentangle the rest. (I can see a tractable way forward to porting the code import worker, so we may well get that done soon. Our Mailman integration will need to be rewritten, though, since it currently depends on the Python-2-only Mailman 2, and Mailman 3 has a different architecture.)

Python lessons

Our database layer was already in pretty good shape for a port, since at least the modern bits of its table modelling interface were already strict about using Unicode for text columns. If you have any kind of pervasive low-level framework like this, then making it be pedantic at you in advance of a Python 3 port will probably incur much less swearing in the long run, as you won't be trying to deal with quite so many bytes/text issues at the same time as everything else.
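As an illustration of the kind of pedantry that pays off, a text-column descriptor can simply refuse bytes at assignment time. This is a toy sketch with invented names, not Launchpad's Storm-based table modelling layer:

class TextColumn(object):
    # Toy descriptor: stores text, rejects bytes outright, so mistakes
    # surface at write time rather than during the Python 3 port.
    def __init__(self, name):
        self.attr = "_" + name

    def __get__(self, obj, objtype=None):
        return getattr(obj, self.attr, None)

    def __set__(self, obj, value):
        if isinstance(value, bytes):
            raise TypeError("%s expects text, not bytes" % self.attr)
        setattr(obj, self.attr, value)


class Bug(object):  # hypothetical model class for the example
    title = TextColumn("title")


bug = Bug()
bug.title = u"crash on startup"   # fine on both Python 2 and 3
# bug.title = b"crash on startup" would raise TypeError immediately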
Early in our port, we established a standard set of __future__ imports and started incrementally converting files over to them, mainly because we weren't yet sure what else to do and it seemed likely to be helpful. absolute_import was definitely reasonable (and not often a problem in our code), and print_function was annoying but necessary. In hindsight I'm not sure about unicode_literals, though. For files that only deal with bytes and text it was reasonable enough, but as I mentioned above there were also a number of cases where we needed literals of the language's native str type, i.e. bytes in Python 2 and text in Python 3: this was particularly noticeable in WSGI contexts, but also cropped up in some other surprising places. We generally either omitted unicode_literals or used six.ensure_str in such cases, but it was definitely a bit awkward and maybe I should have listened more to people telling me it might be a bad idea.

A lot of Launchpad's early tests used doctest, mainly in the style where you have text files that interleave narrative commentary with examples. The development team later reached consensus that this was best avoided in most cases, but by then there were far too many doctests to conveniently rewrite in some other form. Porting doctests to Python 3 is really annoying. You run into all the little changes in how objects are represented as text (particularly u'...' versus '...', but plenty of other cases as well); you have next to no tools to do anything useful like skipping individual bits of a doctest that don't apply; using __future__ imports requires the rather obscure approach of adding the relevant names to the doctest's globals in the relevant DocFileSuite or DocTestSuite; dealing with many exception tracebacks requires something like zope.testing.renormalizing; and whatever code refactoring tools you're using probably don't work properly. Basically, don't have done that. It did all turn out to be tractable for us in the end, and I managed to avoid using much in the way of fragile doctest extensions aside from the aforementioned zope.testing.renormalizing, but it was not an enjoyable experience.

Regressions

I know of nine regressions that reached Launchpad's production systems as a result of this porting work; of course there were various other regressions caught by CI or in manual testing. (Considering the size of this project, I count it as a resounding success that there were only nine production issues, and that for the most part we were able to fix them quickly.)

Equality testing of removed database objects

One of the things we had to do while porting to Python 3 was to implement the __eq__, __ne__, and __hash__ special methods for all our database objects. This was quite conceptually fiddly, because doing this requires knowing each object's primary key, and that may not yet be available if we've created an object in Python but not yet flushed the actual INSERT statement to the database (most of our primary keys are auto-incrementing sequences). We thus had to take care to flush pending SQL statements in such cases in order to ensure that we know the primary keys. However, it's possible to have a problem at the other end of the object lifecycle: that is, a Python object might still be reachable in memory even though the underlying row has been DELETEd from the database. In most cases we don't keep removed objects around for obvious reasons, but it can happen in caching code, and buildd-manager crashed as a result (in fact while it was still running on Python 2). We had to take extra care to avoid this problem.
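The shape of those special methods is roughly as follows. This is a simplified sketch with invented attribute names (id for the primary key, _store for the ORM store), not Launchpad's actual Storm-based implementation:

class ModelObject(object):
    # Identity is "same class, same primary key".  The primary key only
    # exists once the pending INSERT has been flushed, so flush first.
    def __init__(self, store=None, id=None):
        self._store = store
        self.id = id

    def _ensure_flushed(self):
        if self.id is None and self._store is not None:
            self._store.flush()

    def __eq__(self, other):
        if type(other) is not type(self):
            return NotImplemented
        self._ensure_flushed()
        other._ensure_flushed()
        return self.id == other.id

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        self._ensure_flushed()
        return hash((type(self), self.id))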
Debian imports crashed on non-UTF-8 filenames

Python 2 has some unfortunate behaviour around passing bytes or Unicode strings (depending on the platform) to shutil.rmtree, and the combination of some porting work and a particular source package in Debian that contained a non-UTF-8 file name caused us to run into this. The fix was to ensure that the argument passed to shutil.rmtree is a str regardless of Python version. We'd actually run into something similar before: it's a subtle porting gotcha, since it's quite easy to end up passing Unicode strings to shutil.rmtree if you're in the process of porting your code to Python 3, and you might easily not notice if the file names in your tests are all encoded using UTF-8.

lazr.restful ETags

We eventually got far enough along that we could switch one of our four appserver machines (we have quite a number of other machines too, but the appservers handle web and API requests) to Python 3 and see what happened. By this point our extensive test suite had shaken out the vast majority of the things that could go wrong, but there was always going to be room for some interesting edge cases. One of the Ubuntu kernel team reported that they were seeing an increase in 412 Precondition Failed errors in some of their scripts that use our webservice API. These can happen when you're trying to modify an existing resource: the underlying protocol involves sending an If-Match header with the ETag that the client thinks the resource has, and if this doesn't match the ETag that the server calculates for the resource then the client has to refresh its copy of the resource and try again. We initially thought that this might be legitimate since it can happen in normal operation if you collide with another client making changes to the same resource, but it soon became clear that something stranger was going on: we were getting inconsistent ETags for the same object even when it was unchanged. Since we'd recently switched a quarter of our appservers to Python 3, that was a natural suspect. Our lazr.restful package provides the framework for our webservice API, and roughly speaking it generates ETags by serializing objects into some kind of canonical form and hashing the result. Unfortunately the serialization was dependent on the Python version in a few ways, and in particular it serialized lists of strings such as lists of bug tags differently: Python 2 used [u'foo', u'bar', u'baz'] where Python 3 used ['foo', 'bar', 'baz']. In lazr.restful 1.0.3 we switched to using JSON for this, removing the Python version dependency and ensuring consistent behaviour between appservers.
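The essence of that fix is easy to demonstrate. The function below is a condensed sketch of the idea rather than lazr.restful's actual code, and the field names are invented:

import hashlib
import json


def make_etag(fields):
    # repr()-based serialization differs between versions (u'foo' versus
    # 'foo'); JSON renders text identically on Python 2 and 3, so the
    # resulting hash is stable across appservers.
    canonical = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


print(make_etag({"tags": ["foo", "bar", "baz"]}))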
Memory leaks

This problem took the longest to solve. We noticed fairly quickly from our graphs that the appserver machine we'd switched to Python 3 had a serious memory leak. Our appservers had always been a bit leaky, but now it wasn't so much "a small hole that we can bail occasionally" as "the boat is sinking rapidly":

[Graph: a serious memory leak]

(Yes, this got in the way of working out what was going on with ETags for a while.) I spent ages messing around with various attempts to fix this. Since only a quarter of our appservers were affected, and we could get by on 75% capacity for a while, it wasn't urgent but it was definitely annoying. After spending some quality time with objgraph, for some time I thought traceback reference cycles might be at fault, and I sent a number of fixes to various upstream projects for those (e.g. zope.pagetemplate). Those didn't help the leaks much though, and after a while it became clear to me that this couldn't be the sole problem: Python has a cyclic garbage collector that will eventually collect reference cycles as long as there are no strong references to any objects in them, although it might not happen very quickly. Something else must be going on.

Debugging reference leaks in any non-trivial and long-running Python program is extremely arduous, especially with ORMs that naturally tend to end up with lots of cycles and caches. After a while I formed a hypothesis that zope.server might be keeping a strong reference to something, although I never managed to nail it down more firmly than that. This was an attractive theory as we were already in the process of migrating to Gunicorn for other reasons anyway, and Gunicorn also has a convenient max_requests setting that's good at mitigating memory leaks. Getting this all in place took some time, but once we did we found that everything was much more stable:

[Graph: a rather flat memory graph]

This isn't completely satisfying as we never quite got to the bottom of the leak itself, and it's entirely possible that we've only papered over it using max_requests: I expect we'll gradually back off on how frequently we restart workers over time to try to track this down. However, pragmatically, it's no longer an operational concern.

Mirror prober HTTPS proxy handling

After we switched our script servers to Python 3, we had several reports of mirror probing failures. (Launchpad keeps lists of Ubuntu archive and image mirrors, and probes them every so often to check that they're reasonably complete and up to date.) This only affected HTTPS mirrors when probed via a proxy server, support for which is a relatively recent feature in Launchpad and involved some code that we never managed to unit-test properly: of course this is exactly the code that went wrong. Sadly I wasn't able to sort out that gap, but at least the fix was simple.

Non-MIME-encoded email headers

As I mentioned above, there were substantial changes in the email package between Python 2 and 3, and indeed between minor versions of Python 3. Our test coverage here is pretty good, but it's an area where it's very easy to have gaps. We noticed that a script that processes incoming email was crashing on messages with headers that were non-ASCII but not MIME-encoded (and indeed then crashing again when it tried to send a notification of the crash!). The only examples of these I looked at were spam, but we still didn't want to crash on them. The fix involved being somewhat more careful about both the handling of headers returned by Python's email parser and the building of outgoing email notifications. This seems to be working well so far, although I wouldn't be surprised to find the odd other incorrect detail in this sort of area.

Failure to handle non-ISO-8859-1 URL-encoded form input

Remember how I said that parsing HTTP form data was thorny? After we finished upgrading all our appservers to Python 3, people started reporting that they couldn't post Unicode comments to bugs, which turned out to be only if the attempt was made using JavaScript, and was because I hadn't quite managed to get URL-encoded form data working properly with zope.publisher and multipart. The current standard describes the URL-encoded format for form data as "in many ways an aberrant monstrosity", so this was no great surprise. Part of the problem was some very strange choices in zope.publisher dating back to 2004 or earlier, which I attempted to clean up and simplify. The rest was that Python 2's urlparse.parse_qs unconditionally decodes percent-encoded sequences as ISO-8859-1 if they're passed in as part of a Unicode string, so multipart needs to work around this on Python 2. I'm still not completely confident that this is correct in all situations, but at least now that we're on Python 3 everywhere the matrix of cases we need to care about is smaller.
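For the record, the parse_qs pitfall and the shape of the workaround look roughly like this. The helper is an illustrative sketch under the stated assumption of UTF-8 form data, not the code that actually went into multipart:

import six
from six.moves.urllib.parse import parse_qsl


def parse_form(encoded, charset="utf-8"):
    # On Python 2, parsing a *unicode* query string decodes %-escapes as
    # ISO-8859-1, mangling UTF-8 input.  Parsing as bytes and decoding
    # afterwards sidesteps that; Python 3 lets us name the charset
    # directly.
    if isinstance(encoded, six.text_type):
        encoded = encoded.encode("ascii")  # URL-encoded data is ASCII-safe
    if six.PY2:
        pairs = parse_qsl(encoded)  # bytes in, bytes out
        return [(k.decode(charset), v.decode(charset)) for k, v in pairs]
    return parse_qsl(encoded.decode("ascii"), encoding=charset)


print(parse_form("comment=%C3%A9%C3%A9"))  # [('comment', 'éé')] on Python 3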
Inconsistent marshalling of Loggerhead's disk cache

We use Loggerhead for providing web browsing of Bazaar branches. When we upgraded one of its two servers to Python 3, we immediately noticed that the one still on Python 2 was failing to read back its revision information cache, which it stores in a database on disk. (We noticed this because it caused a deployment to fail: when we tried to roll out new code to the instance still on Python 2, Nagios checks had already caused an incompatible cache to be written for one branch from the Python 3 instance.) This turned out to be a similar problem to the pickle issue mentioned above, except this one was with marshal, which I didn't think to look for because it's a relatively obscure module mostly used for internal purposes by Python itself; I'm not sure that Loggerhead should really be using it in the first place. The fix was relatively straightforward, complicated mainly by now needing to cope with throwing away unreadable cache data. Ironically, if we'd just gone ahead and taken the nominally riskier path of upgrading both servers at the same time, we might never have had a problem here.

Intermittent bzr failures

Finally, after we upgraded one of our two Bazaar codehosting servers to Python 3, we had a report of intermittent bzr branch hangs. After some digging I found this in our logs:
Traceback (most recent call last):
  ...
  File "/srv/bazaar.launchpad.net/production/codehosting1-rev-20124175fa98fcb4b43973265a1561174418f4bd/env/lib/python3.5/site-packages/twisted/conch/ssh/channel.py", line 136, in addWindowBytes
    self.startWriting()
  File "/srv/bazaar.launchpad.net/production/codehosting1-rev-20124175fa98fcb4b43973265a1561174418f4bd/env/lib/python3.5/site-packages/lazr/sshserver/session.py", line 88, in startWriting
    resumeProducing()
  File "/srv/bazaar.launchpad.net/production/codehosting1-rev-20124175fa98fcb4b43973265a1561174418f4bd/env/lib/python3.5/site-packages/twisted/internet/process.py", line 894, in resumeProducing
    for p in self.pipes.itervalues():
builtins.AttributeError: 'dict' object has no attribute 'itervalues'
I'd seen this before in our git hosting service: it was a bug in Twisted's Python 3 port, fixed after 20.3.0 but unfortunately after the last release that supported Python 2, so we had to backport that patch. Using the same backport dealt with this. Onwards!

20 December 2020

Russ Allbery: Review: Can't Even

Review: Can't Even, by Anne Helen Petersen
Publisher: Houghton Mifflin
Copyright: 2020
ISBN: 0-358-31659-6
Format: Kindle
Pages: 230
Like many other people, I first became aware of Anne Helen Petersen's journalism when her Buzzfeed article "How Millennials Became the Burnout Generation" went viral. Can't Even is the much-awaited (at least by me) book-length expansion of that thesis: The United States is, as a society, burning out, and that burnout is falling on millennials the hardest. We're not recognizing the symptoms because we think burnout looks like something dramatic and flashy. But for most people burnout looks less like a nervous breakdown and more like constant background anxiety and lack of energy.
Laura, who lives in Chicago and works as a special ed teacher, never wants to see her friends, or date, or cook; she's so tired, she just wants to melt into the couch. "But then I can't focus on what I'm watching, and end up unfocused again, and not completely relaxing," she explained. "Here I am telling you I don't even relax right! I feel bad about feeling bad! But by the time I have leisure time, I just want to be alone!"
Petersen explores this idea across childhood, education, work, family, and parenting, but the core of her thesis is the precise opposite of the pervasive myth that millennials are entitled and lazy (a persistent generational critique that Petersen points out was also leveled at their Baby Boomer parents in the 1960s and 1970s). Millennials aren't slackers; they're workaholics from childhood, for whom everything has become a hustle and a second (or third or fourth) job. The struggle with "adulting" is a symptom of the burnout on the other side of exhaustion, the mental failures that happen when you've forced yourself to keep going on empty so many times that it's left lingering damage. Petersen is a synthesizing writer who draws together the threads of other books rather than going deep on a novel concept, so if you've been reading about work, psychology, stress, and productivity, many of the ideas here will be familiar. But she's been reading the same authors that I've been reading (Tressie McMillan Cottom, Emily Guendelsberger, Brigid Schulte, and even Cal Newport), and this was the book that helped me pull those analyses together into a coherent picture. That picture starts with the shift of risk in the 1970s and 1980s from previously stable corporations with long-lasting jobs and retirement pensions onto individual employees. The corresponding rise in precarity and therefore fear led to a concerted effort to re-establish a feeling of control. Baby Boomers doubled down on personal responsibility and personal capability, replacing unstructured childhood for their kids with planned activities and academic achievement. That generation, in turn, internalized the need for constant improvement, constant grading, and constant achievement, accepting an implied bargain that if they worked very hard, got good grades, got into good schools, and got a good degree, it would pay off in a good life and financial security. They were betrayed. The payoff never happened; many millennials graduated into the Great Recession and the worst economy since World War II. In response, millennials doubled down on the only path to success they were taught. They took on more debt, got more education, moved back in with their parents to cut expenses, and tried even harder.
Even after watching our parents get shut out, fall from, or simply struggle anxiously to maintain the American Dream, we didn't reject it. We tried to work harder, and better, more efficiently, with more credentials, to achieve it.
Once one has this framework in mind, it's startling how pervasive the "just try harder" message is and how deeply we've internalized it. It is at the center of the time management literature: Getting Things Done focuses almost entirely on individual efficiency. Later time management work has become more aware of the importance of pruning the to-do list and doing fewer things, but addresses that through techniques for individual prioritization. Cal Newport is more aware than most that constant busyness and multitasking interacts poorly with the human brain, and has taken a few tentative steps towards treating the problem as systemic rather than individual, but his focus is still primarily on individual choices. Even when tackling a problem that is clearly societal, such as the monetization of fear and outrage on social media, the solutions are all individual: recognize that those platforms are bad for you, make an individual determination that your attention is being exploited, and quit social media through your personal force of will. And this isn't just productivity systems. Most public discussion of environmentalism in the United States is about personal energy consumption, your individual carbon footprint, household recycling, and whether you personally should eat meat. Discussions of monopoly and monopsony become debates over whether you personally should buy from Amazon. Concerns about personal privacy turn into advocacy for using an ad blocker or shaming people for using Google products. Articles about the growth of right-wing extremism become exhortations to take responsibility for the right-wing extremist in your life and argue them out of their beliefs over the dinner table. Every major systemic issue facing society becomes yet another personal obligation, another place we are failing as individuals, something else that requires trying harder, learning more, caring more, doing more. This advice is well-meaning (mostly; sometimes it is an intentional and cynical diversion), and can even be effective with specific problems. But it's also a trap. If you're feeling miserable, you just haven't found the right combination of time-block scheduling, kanban, and bullet journaling yet. If you're upset at corporate greed and the destruction of the environment, the change starts with you and your household. The solution is in your personal hands; you just have to try a little harder, work a little harder, make better decisions, and spend money more ethically (generally by buying more expensive products). And therefore, when we're already burned out, every topic becomes another failure, increasing our already excessive guilt and anxiety. Believing that we're in control, even when we're not, does have psychological value. That's part of what makes it such a beguiling trap. While drafting this review, I listened to Ezra Klein's interview with Robert Sapolsky on poverty and stress, and one of the points he made is that, when mildly or moderately bad things happen, believing you have control is empowering. It lets you recast the setback as a larger disaster that you were able to prevent and avoid a sense of futility. But when something major goes wrong, believing you have control is actively harmful to your mental health. The tragedy is now also a personal failure, leading to guilt and internal recrimination on top of the effects of the tragedy itself. This is why often the most comforting thing we can say to someone else after a personal disaster is "there's nothing you could have done."
Believing we can improve our lives if we just try a little harder does work, until it doesn't. And because it does work for smaller things, it's hard to abandon; in the short term, believing we're at the mercy of forces outside our control feels even worse. So we double down on self-improvement, giving ourselves even more things to attempt to do and thus burning out even more. Petersen is having none of this, and her anger is both satisfying and clarifying.
In writing that article, and this book, I haven't cured anyone's burnout, including my own. But one thing did become incredibly clear. This isn't a personal problem. It's a societal one and it will not be cured by productivity apps, or a bullet journal, or face mask skin treatments, or overnight fucking oats. We gravitate toward those personal cures because they seem tenable, and promise that our lives can be recentered, and regrounded, with just a bit more discipline, a new app, a better email organization strategy, or a new approach to meal planning. But these are all merely Band-Aids on an open wound. They might temporarily stop the bleeding, but when they fall off, and we fail at our new-found discipline, we just feel worse.
Structurally, Can't Even is half summaries of other books and essays put into this overall structure and half short profiles and quotes from millennials that illustrate her point. This is Petersen's typical journalistic style if you're familiar with her other work. It gains a lot from the voices of individuals, but it can also feel like argument from anecdote. If there's an epistemic flaw in this book, it's that Petersen defends her arguments more with examples than with scientific study. I've read enough of the other books she cites, many of which do go into the underlying studies and statistics, to know that her argument is well-grounded, but I think Can't Even works better as a roadmap and synthesis than as a primary source of convincing data. The other flaw that I'll mention is that although Petersen tries very hard to incorporate poorer and non-white millennials, I don't think the effort was successful, and I'm not sure it was possible within the structure of this book. She frequently makes a statement that's accurate and insightful for millennials from white, middle-class families, acknowledges that it doesn't entirely apply to, for example, racial minorities, and then moves on without truly reconciling those two perspectives. I think this is a deep structural problem: One's experience of American life is very different depending on race and class, and the phenomenon that Petersen is speaking to is to an extent specific to those social classes who had a more comfortable and relaxing life and are losing it. One way to see the story of the modern economy is that white people are becoming as precarious as everyone else already was, and are reacting by making the lives of non-white people yet more miserable. Petersen is accurately pointing to significant changes in relationships with employers, productivity, family, and the ideology of individualism, but experiencing that as a change is more applicable to white people than non-white people. That means there are, in a way, two books here: one about the slow collapse of the white middle class into constant burnout, and a different book about the much longer-standing burnout of being non-white in the United States and our systemic failure to address the causes of it. Petersen tries to gesture at the second book, but she's not the person to write it and those two books cannot comfortably live between the same covers. The gestures therefore feel awkward and forced, and while the discomfort itself serves some purpose, it lacks the insight that Petersen brings to the rest of the book. Those critiques aside, I found Can't Even immensely clarifying. It's the first book that explained to me in a way I understood what's so demoralizing and harmful about Instagram and its allure of cosplaying as a successful person. It helped me understand how productivity and individual political choices fit into a system that emphasizes individual action as an excuse to not address collective problems. And it also gave me a strange form of hope, because if something can't go on forever, it will, at some point, stop.
Millennials have been denigrated and mischaracterized, blamed for struggling in situations that set us up to fail. But if we have the endurance and aptitude and wherewithal to work ourselves this deeply into the ground, we also have the strength to fight. We have little savings and less stability. Our anger is barely contained. We're a pile of ashes smoldering, a bad memory of our best selves. Underestimate us at your peril: We have so little left to lose.
Nothing will change without individual people making different decisions and taking different actions than they are today. But we have gone much too far down the path of individual, atomized actions that may produce feelings of personal virtue but that are a path to ineffectiveness and burnout when faced with systemic problems. We need to make different choices, yes, but choices towards solidarity and movement politics rather than personal optimization. There is a backlash coming. If we let it ground itself in personal grievance, it could turn ugly and take a racist and nationalist direction. But that's not, by and large, what millennials have done, and that makes me optimistic. If we embrace the energy of that backlash and help shape it to be more inclusive, just, and fair, we can rediscover the effectiveness of collective solutions for collective problems. Rating: 8 out of 10

25 November 2020

Shirish Agarwal: Women state in India and proposal for corporates in Indian banking

Gradle and Kotlin in Debian

A few months back, I was looking at where Gradle and Kotlin were in Debian. They still seem to be a work in progress. I found the Android-tools salsa repo, which tells me the state of things. While there has been movement on both, a bit more on Kotlin, it still seems it would take a while. For Kotlin, the wiki page is most helpful, as is the android-tools salsa Kotlin board page. Ironically, some of the more important info is shared in a blog post, which ideally should also have been reflected on the Kotlin board page. I did see some of the bugs, so I know it's pretty much dependency hell. I can only congratulate and encourage Samyak Jain and Raman Sarda. I also played a bit with the google-android-emulator-installer, which is basically a hook that downloads the binary from Google. I do not know what the plans are, but perhaps in the future it might also be built locally, who knows. Just sharing/stating it here so it's part of my notes whenever I want to see what's happening.

Women in India I am sure some of you might remember my blog post from last year. It is now almost a year later, in 2020, and the question to be asked is: has much changed? After a lot of hue and cry, the Government of India shared the NCRB data on crimes against women and caste crimes. The report shared that crimes against women had risen by 7.3% in a year, and crimes against lower castes also went up by a similar percentage. With the 2020 pandemic, I am sure the numbers have gone up more. And there is a possibility that, just like last year, next year the Government will cite the pandemic and say there is no data. This year they have done it for migrant deaths during lockdown, for job losses due to the pandemic, and so on and so forth. So it will be no surprise if the Govt. says the same about NCRB data next year as well. The media has been reporting some of it, though, in spite of the regular threats to journalists as shared in the last blog post. There is also data that shows that women's participation in the labor force has fallen sharply, especially in the last few years, and the Government seems to have neither any idea about it nor any apparent concern. There aren't any concrete plans to bring back the balance even a little bit.

Few Court judgements But all hope is not lost. There have been a couple of good judgements, one from the CIC (Chief Information Commissioner) wherein, in specific cases, a wife can know the salary details of her husband, especially if there is some kind of maintenance due from the husband. There was so much hue and cry against this order that it was taken down from the livelaw RTI corner. Luckily, I had downloaded it, hence could upload and share it. Another one was a case/suit about a legally adult woman who had decided to marry without parental consent. In this case, the Delhi High Court took the woman's side and stated she can marry whom she wants. Interestingly, about a week back Uttar Pradesh (most notorious for crime against women) made laws called Love Jihad, and 2-3 states have followed them. The idea is to create an atmosphere of hate against Muslims, while women have no autonomy over what they want. This is when, in a separate suit/case against Sudharshan TV (a far-right leaning channel promoting hate against Muslims), the Government of India itself put in an affidavit stating that the Tablighis (a sect of Muslims who came from Malaysia to India for religious discourse and assembly) were not responsible for dissemination of the virus, and some media correctly portrayed the same. Of course, those who are on the side of the Govt. on this topic think a traitor has written it. They also thought that the Govt. had taken a wrong approach but couldn't suggest a better approach to the matter. There are too many matters in the Supreme Court of women asking for justice to tell all here, but two instances show how the SC has been buckling under the stress of late. One is a webinar chaired by Justice Subramaniam where he shared how the executive is using judicial appointments to do what it wants. The gulf between the executive and the SC has existed since Indira Gandhi's days, especially after the judicial orders which declared the Emergency valid; by and large, the Court's standing has fallen much more recently, and the executive has been muscling in, which has resulted in more regressive decisions than progressive ones.

This observation is also in tune with another study which came to the same result, although using data. The raw data from the study could give so much more than what has been shared. For example, of the judgements cited, how many were in civil law, personal law, criminal law, or constitutional law? This would give a better understanding of things. What is also shocking is that none of our court orders have been cited in the West in the recent past, when there used to be a time when the West would sometimes take guidance from Indian jurisprudence and cite our orders to reach a similar conclusion, or at least use them as precedent. I guess those days are over.

Government giving Corporate ownership to Private Sector Banks There was an Internal Working Group report to review extant ownership guidelines and Corporate Structure for Indian Private Sector Banks; this is the actual title of the report. Now there were and are concerns about the move, which were put forth by Dr. Raghuram Rajan and Viral Acharya. Dr. Rajan was the 23rd Governor of the RBI, from 4th September 2013 to 4th September 2016. His most commendable work, which is largely unknown to most people, was the report A hundred small steps, which you can buy from Sage Publications. Viral Acharya was the deputy governor from 23rd January 2017 to 23rd July 2019. Mr. Acharya just recently published his book Quest for Restoring Financial Stability in India, which can be bought from the same publication house as well. They also wrote a three-page article asking whether India needs corporates in banking. More interestingly, it shares two points from history, from both World War 1 and World War 2. In both cases, the Allies had to cut down the businesses which had owned banks; in Germany it was the same, and in Japan there was the dissolution of the zaibatsu, both of which were needed to make the world safe again. Now, if we don't learn lessons from history it is our fault, not history's. What was also shared is that this idea was taken up in 2013 but was put into cold storage. He also commented on the pressure on the RBI, as all co-operative banks have come under its ambit in the last few months. The RBI has had a patchy record, especially in the last couple of years, with big scams like IL&FS, Yes Bank, PMC Bank, and Laxmi Vilas Bank among others, the LVB being the most recent one. If new banking licenses have to be given, they can be given to good NBFCs who have been in the market for a long time and have shown maturity while dealing with public money. What is the hurry to give them to corporate/business houses? There are many other good points in the report with which both Mr. Rajan and Mr. Acharya are in agreement, and I do hope the other points/suggestions/proposals are implemented. There was and is an interesting report by the Reserve Bank of India called the financial sector legislative reforms commission report, volume 1. If and when it gets deleted from the RBI site, I have put up a copy at my WordPress account, so we shall always have one. Interestingly, while looking through the people who were part of the committee, there was a somewhat familiar name: Murmu. This is perhaps the first time you see people from a sort of political background in what should be a cut-and-dried review, which normally has people with careers in finance or accounts. It also turns out that only one person was in favor of banks going to corporates; all the rest were against. It seems that the specific person hadn't heard the terms self-lending, connected lending, and conflict of interest. One of the more interesting comments in the report is that if a corporate has a bank, then why would it go to Switzerland; it would just wash the money in its own bank. And if banks were to become too big to fail, like it happened in the United States, it would again be private gains, public losses. There was also a Washington Post article which shares some of the reasons that Indian banks fail. I think we need to remind ourselves once again how things can become
https://www.youtube.com/watch?v=2gK3s5j7PgA

Positive News at the end I do not want to end on a sour note, hence I am sharing the YouTube channel of Films Division India, where you can see some of the very classic works and interviews of some of the greats of Indian art cinema.

https://www.youtube.com/user/FilmsDivision/videos Also sharing a funny story I came to know about youtube-dl: apparently it was taken off GitHub, but thanks to efforts from the EFF, Hacker News, and others, it is now back in action.

6 June 2017

Russ Allbery: Review: Star Healer

Review: Star Healer, by James White
Series: Sector General #6
Publisher: Orb
Copyright: 1984
Printing: 2002
ISBN: 0-312-87770-6
Format: Trade paperback
Pages: 206
Star Healer is the sixth book of the Sector General series, and I think it may be the first novel in the series that was written as a novel instead of a fix-up of short stories. That makes it not a bad place to start in the series if one would rather not deal with fix-ups or barely-disguised short story collections. There isn't a huge amount of character development over the course of this series (at least to this point), so the main thing you would lose by starting here is some built-up reason for caring about the main character. This is the third book in the Alien Emergencies omnibus (the book referenced in the publication information here). Most of the previous stories have focused on Conway, a Senior Physician in the sprawling and wonderfully well-equipped multi-species hospital called Sector General and, in recent stories, the head physician in the hospital's ambulance ship. Becoming a Senior Physician at Sector General is quite the accomplishment, and a fine point to reach in the career of any doctor specializing in varied life forms, but there is another tier above: the Diagnosticians, who are the elite of Sector General. The difference is education tapes. Deep knowledge of even one specific type of life is a lot to ask of a doctor, as shown by the increasing specialization of human medicine. For Sector General, which deals with wildly varying ailments of thousands of species including entirely unknown ones (if, admittedly, primarily trauma, at least in the stories shown), it would be an impossible task. White realizes this and works around it with education tapes that temporarily embed in a doctor's head the experience of a doctor of another species entirely. This provides the missing native expertise, but it comes with the full personality of the doctor who recorded the tape, including preferences for food and romantic attachment that may be highly disorienting. Senior Physicians use one tape at a time, and then have it erased again when they don't need it. Diagnosticians juggle four or more tapes at the same time, and keep them for long periods or even permanently, allowing them to do ground-breaking original research. The opening of Star Healer is an offer from the intimidating Chief Psychologist of Sector General: Conway has a shot at Diagnostician. But it's a major decision that he should think over first, so the next step is to take a vacation of sorts on a quiet world with a small human scientific station. Oh, and there's a native medical problem, although not one with much urgency. Conway doesn't do a lot of resting, because of course he gets pulled into trying to understand the mystery of an alien species that is solitary to the point of deep and unbreakable social taboos against even standing close to other people. This is a nice cultural puzzle in line with the rest of the series, but it also leaves Conway with a new ally: an alien healer in a society in which being a doctor is difficult to the point of near hopelessness. It's not much of a spoiler to say that of course Conway decides to try for Diagnostician after his "vacation." The rest of the book is him juggling multiple cases with his new and often conflicting modes of thinking, and tackling problems that require a bit less in the way of puzzle-solving and a bit more in the way of hard trade-off decisions and quick surgical action. Senior Physicians may be able to concentrate on just one puzzle at a time; Diagnosticians have to juggle several.
And they're larger, more long-term problems, focusing on how to improve a general problem for a whole species rather than just heal a specific injured alien. One interesting aspect of this series, which is very much on display here, is that Sector General most definitely does not have a Prime Directive. They are cautious about making contact with particularly primitive civilizations for fear that spacefarers would give them an inferiority complex, but sometimes they do anyway. And if they run into some biological system that offends their sensibilities, they try to fix it, not just observe it. White frequently shows species caught in what the characters call "biological traps," unable to develop further because of some biological adaptation that gets in their way, and Sector General tries to fix those. It's an interesting ethical problem that I wish they'd think about a bit more. It's not clear they're wrong, and I think it's correct to take an expansive view of the mission to heal, but there's also a sense in which Sector General is modifying culture and biology to make aliens more like them. (The parallels between this and all the abusive paternalism that human cultures practice around disability are a little too close to home to be comfortable, and now I kind of wish it hadn't occurred to me.) The gender roles, sadly, continue to be dire, although mostly ignorable because the one major female character is generally just shown as another doctor with little attention to sex. But apparently women (of every species!) cannot become Diagnosticians because they have an insurmountable biological aversion to sharing their minds with a learning tape from any doctor who doesn't find them physically attractive, which is just... sigh. It's sad that someone who could write an otherwise remarkably open-minded and pacifist series of stories, in sharp contrast with most of SF history, would still have that large of a blind spot. Apart from the times gender comes up, I liked this book more than the rest of the series, in part because I strongly prefer novels to short stories. There's more room to develop the story, and while characterization continues to not be White's strong point and that space mostly goes to more puzzles instead, he does provide an interesting set of interlocking puzzles. The problem posed by the aliens Conway meets on his "vacation" isn't fully resolved here (presumably that's for a future book), but he does solve several other significant problems and develops his own problem-solving style in more depth than in previous stories. Mildly recommended, particularly if you like this series in general. Followed by Code Blue - Emergency. Rating: 7 out of 10

5 April 2017

Steinar H. Gunderson: Nageru 1.5.0 released

I just released version 1.5.0 of Nageru, my live video mixer. The biggest feature is obviously the HDMI/SDI live output, but there are lots of small nuggets everywhere; it's been four months in the making. I'll simply paste the NEWS entry here:
Nageru 1.5.0, April 5th, 2017
  - Support for low-latency HDMI/SDI output in addition to (or instead of) the
    stream. This currently only works with DeckLink cards, not bmusb. See the
    manual for more information.
  - Support changing the resolution from the command line, instead of locking
    everything to 1280x720.
  - The A/V sync code has been rewritten to be more in line with Fons
    Adriaensen's original paper. It handles several cases much better,
    in particular when trying to match 59.94 and 60 Hz sources to each other.
    However, it might occasionally need a few extra seconds on startup to
    lock properly if startup is slow.
  - Add support for using x264 for the disk recording. This makes it possible,
    among other things, to run Nageru on a machine entirely without VA-API
    support.
  - Support for 10-bit Y'CbCr, both on input and output. (Output requires
    x264 disk recording, as Quick Sync Video does not support 10-bit H.264.)
    This requires compute shader support, and is in general a little bit
    slower on input and output, due to the extra amount of data being shuffled
    around. Intermediate precision is 16-bit floating-point or better,
    as before.
  - Enable input mode autodetection for DeckLink cards that support it.
    (bmusb mode has always been autodetected.)
  - Add functionality to add a time code to the stream; useful for debugging
    latency.
  - The live display is now both more performant and of higher image quality.
  - Fix a long-standing issue where the preview displays would be too bright
    when using an NVIDIA GPU. (This did not affect the finished stream.)
  - Many other bugfixes and small improvements.
1.5.0 is on its way into Debian experimental (it's too late for the stretch release, especially as it also depends on Movit and bmusb from experimental), or you can get it from the home page as always.

1 April 2017

Antoine Beaupré: My free software activities, February and March 2017

Looking into self-financing Before I begin, I should mention that I started tracking my time working on free software more systematically. I spend a lot of time on the computer, as regular readers of this blog might remember, so I wanted to know exactly how much time was paid vs free work. I was already using org-mode's time clock system to keep track of my work hours, so I just extended this to my regular free software contributions, which also helps in writing those reports. It turns out that over 60% of my computer time is spent working on free software. That's huge! I was expecting something more in the range of 20 to 40% of my time. So I started thinking about ways of financing this work. I created a Patreon page but I'm hesitant about launching such a campaign: the only thing worse than "no patreon page" is "a patreon page with failed goals and no one financing it". So before starting such an effort, I'd like to get a feeling of what other people's experiences with it are. I know that joeyh is close to achieving his goals, but I can't compare with the guy that invented git-annex or debhelper, so I'm concerned I wouldn't be able to raise the same level of funding. So if you have any advice, feel free to contact me in private or in the comments. If you would be ready to fund my work, I'd love to know about it, obviously, but I guess I wouldn't get real numbers until I actually open up such a page... Now, onto the regular report.

Wallabako I spent a good chunk of time completing most of the things I had in mind for Wallabako, which I mentioned quickly in the previous report. Wallabako is now much easier to install, with clearer instructions, an easier-to-use configuration file, more reliable synchronization and read status propagation. As usual, the Wallabako README file has all the details. I've also looked at better integration with Koreader, the free software e-reader that forms the basis of the okreader free software distribution, which has been able to port Debian to the Kobo e-readers, a project I am really excited about. This project has the potential of supporting Kobo readers beyond the lifetime that upstream grants them, and removes a lot of proprietary software and spyware that ships with the Kobo readers. So I have made a few contributions to okreader and also to koreader, the ebook reader okreader is based on.

Stressant I rewrote stressant, my simple burn-in and stress-testing tool. After struggling in turn with Debirf, live-build, vmdebootstrap and even FAI, I just figured maybe it wasn't the best idea to try and reinvent that particular wheel: instead of reinventing how to build yet another Debian system build tool, maybe I should just reuse what's already there. It turns out there's a well-known, successful and fairly complete recovery system called Grml. It is a Debian derivative, so all I needed to do was to stop procrastinating and actually write the actual stressant tool instead of just creating a distribution with a bunch of random tools shipped in. This allowed me to focus on which tools were the best to stress test different components. This selection ended up including fio, which can also be used to overwrite disk drives with the proper options (--overwrite and --size=100%), although grml also ships with nwipe for wiping old spinning disks and hdparm to do a secure erase of SSD disks (whatever that's worth). Stressant still needs to be shipped with grml for this transition to be complete. In the meantime, I was able to configure the excellent public Gitlab CI service to provide ISO images with Stressant built-in as a stopgap measure. I also need to figure out a way to automate starting stressant from a boot menu to automate deployments on a larger scale, although because I have little need for the feature at this moment in time, this will likely wait for a sponsor to show up before it gets implemented. Still, stressant has useful features like the capability of sending logs by email using a fresh new implementation of the Python SMTPHandler (BufferedSMTPHandler) which waits for logging to complete before sending a single email. Another interesting piece of code in there is the NegateAction argparse handler that enables the use of "toggle flags" (e.g. --flag / --no-flag). I'm so happy with the code that I figure I could just share it here directly:
class NegateAction(argparse.Action):
    '''add a toggle flag to argparse

    this is similar to 'store_true' or 'store_false', but allows
    arguments prefixed with --no to disable the default. the default
    is set depending on the first argument - if it starts with the
    negative form (defined by default as '--no'), the default is False,
    otherwise True.
    '''
    negative = '--no'
    def __init__(self, option_strings, *args, **kwargs):
        '''set default depending on the first argument'''
        default = not option_strings[0].startswith(self.negative)
        super(NegateAction, self).__init__(option_strings, *args,
                                           default=default, nargs=0, **kwargs)
    def __call__(self, parser, ns, values, option):
        '''set the truth value depending on whether
        it starts with the negative form'''
        setattr(ns, self.dest, not option.startswith(self.negative))
Short and sweet. I wonder why stuff like this is not in the standard library yet - maybe just because no one has bothered yet? It'd be great to get feedback from more experienced Pythonistas on this one. I hope that my work on Stressant is complete. I get zero funding for this work, and have little use for it myself: I manage only a few machines, and such a tool really shines when you regularly put new hardware online, which is (fortunately?) not my case anymore. I'd be happy, of course, to accompany organisations and people that wish to further develop and use such a tool. A short demo of stressant as well as a detailed description of how it works is of course available in its README file.
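For illustration, here is a minimal usage sketch of NegateAction; the --check flag name is made up for the example and is not an actual stressant option:
import argparse

parser = argparse.ArgumentParser()
# register both spellings; the default (True) comes from the first option string
parser.add_argument('--check', '--no-check', dest='check', action=NegateAction,
                    help='enable or disable the (hypothetical) check step')

print(parser.parse_args([]).check)              # True, the default
print(parser.parse_args(['--no-check']).check)  # False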

Standard third party repositories After looking at improvements for the grml repository instructions, I realized there was no real "best practices" document on how to configure an Apt repository. Sure, there are tools like reprepro and others, but those hardly qualify as policy: they are very flexible and there are lots of ways to create insecure repositories or "curl | sh"-style instructions, which we of course generally want to avoid. While the larger problem of untrusted Debian packages remains generally unsolved (e.g. when you install any .deb file, it can get root on your system), it seemed to me one critical part of this problem was how to add a random third-party repository to your machine while limiting, as much as possible, what possible attackers could do with such a repository. In other words, to solve the more general problem of insecure .deb files, we also need to solve the distribution problem, otherwise fixing the .deb files themselves will be useless. This led to the creation of standardized repository instructions that define:
  1. how to distribute the repository's public signing key (i.e. over HTTPS)
  2. how to name suites and components (e.g. use stable and main unless you have a good reason, and explain yourself)
  3. recommend a healthy dose of apt preferences pinning
  4. how to distribute keys (e.g. with a deriv-archive-keyring package)
I've seen so many third party repositories get this wrong. For example, a lot of repositories recommend this type of command to initialize the OpenPGP trust path:
curl http://example.com/key.asc | apt-key add -
This has the following problems:
  • the key is transferred in plaintext and can easily be manipulated by an active attacker (e.g. a router on your path to the server or a neighbor in a Wi-Fi cafe)
  • the key is added to the main trust root, which allows the key to authenticate as the real Debian archive, therefore giving it all rights over all packages
  • since it's part of the global archive, it's difficult for a package to remove/add the key when a key rollover is necessary (and repositories generally don't provide a deriv-archive-keyring to do that process anyways)
An example of this is the Docker install instructions that, at least, manage to do this over HTTPS. Some other repositories don't even bother teaching people about the proper way of adding those keys. We settled for:
wget -O /usr/share/keyrings/deriv-archive-keyring.gpg https://deriv.example.net/debian/deriv-archive-keyring.gpg
That location was explicitly chosen to be out of the main trust directory, so that it needs to be explicitly added to the sources.list as well:
deb [signed-by=/usr/share/keyrings/deriv-archive-keyring.gpg] https://deriv.example.net/debian/ stable main
Similarly, we highly recommend users set up "apt pinning" to restrict what a given repository can do. Since pinning is so confusing, most people don't actually bother even configuring it, and I have yet to see a single repo advise its users to configure those preferences, which are essential to limit what a repository can do. To keep configuration simple, we recommend this:
Package: *
Pin: origin deriv.example.net
Pin-Priority: 100
Obviously, for a single-package repository, the actual package name should be listed, e.g.:
Package: foo
Pin: origin deriv.example.net
Pin-Priority: 100
And the priority should probably be set to 1 unless you want to allow automatic upgrades. It is my hope that this design will get more traction in the years to come and become a de-facto standard that will be a key part in safely adding third party repositories. There is obviously much more work to be done to improve security when installing untrusted .deb files, and I encourage Debian developers to consider contributing to the UntrustedDebs discussions and particularly to the Teams/Dpkg/Spec/DeclarativePackaging work.

Signal R&D I spent a significant amount of time this month struggling with the Signal project on my phone. I'm still ambivalent on Signal: it's a centralized design, too dependent on phone numbers, but I must admit they get a lot of things right and it's the only free-software platform that allows for easy-to-use, multi-platform videoconferencing that my family can use. I've been following Signal for a while: up until now, I had been using the LibreSignal rebuild of the official client, as it is distributed on an F-Droid repository. Because I try to avoid Google (proprietary) software on my phone, it's basically the only way I could even install Signal. Unfortunately, the repository is out of date and introduces another point of trust in the distribution model: now you not only need to trust the Signal authors to do the right thing, you also need to trust that F-Droid repo not to inject nasty code on your phone. I've therefore started a discussion about how Signal could be distributed outside of the Google Play Store. I'd like to think it's one of the things that led the Signal people to distribute an official copy of Signal outside of the Play Store. After much struggling, I was able to upgrade to this official client and will be able to upgrade easily by just downloading the APK. (Do note that I ended up reinstalling and re-registering Signal, which unfortunately changed my secret keys.) I do hope Signal enters F-Droid one day, but it could take a while because it still doesn't work without Google services and barely works with MicroG, the free software alternative to the Google services clients. Moxie also set a list of requirements like crash reporting and statistics that need to be implemented on F-Droid's side before he agrees to the deployment, so this could take a while. I've also participated in the, ahem, discussion on the JWZ blog regarding a supposed vulnerability in Signal where it would leak previously unknown phone numbers to third parties. I reviewed the way the phone number is uploaded and, while it's possible to create a rainbow table of phone numbers (which are hashed with a truncated SHA-1 checksum), I couldn't verify the claims of other participants in the thread. For me, Signal still does the right thing with contacts, although I do question the way "read status" notifications get transmitted, but that belongs in another bug report / blog post.
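To give a rough idea of why such a rainbow table is feasible (this is only a sketch of the general idea, with made-up parameters, not Signal's actual wire format), hashing and truncating a phone number looks roughly like this; the space of valid phone numbers is small enough to precompute:
import hashlib

def truncated_number_hash(phone_number, keep_bytes=10):
    # illustrative only: SHA-1 of the number, truncated; the small space of
    # possible phone numbers is what makes precomputing a lookup table practical
    digest = hashlib.sha1(phone_number.encode("ascii")).digest()
    return digest[:keep_bytes].hex()

print(truncated_number_hash("+15145551234"))  # example number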

Debian Long Term Support (LTS) It's been more than a year that I have been working on Debian LTS, started by Raphael Hertzog at Freexian. I didn't work much in February so I had a lot of hours to catch up on, and was unfortunately unable to do so, partly because I was busy with other projects, and partly because my colleagues are doing a great job at resolving the most important issues. So one of my concerns this month was finding work. It seemed that all the hard packages were either taken (e.g. my usual favorites, tiff and imagemagick, were done by others) or just too challenging (e.g. I don't feel quite comfortable tackling the LTS branch of the Linux kernel yet). I spent quite a bit of time trying to figure out what was wrong with pcre3, only to realise the "32" in the report was not about the architecture, but about the character width. Because of this, I marked 4 CVEs (CVE-2017-7186, CVE-2017-7244, CVE-2017-7245, CVE-2017-7246) as "not-affected", since 32-bit character support wasn't enabled in wheezy (or jessie, for that matter). I still spent some time trying to reproduce the issues, which require a compiler with an AddressSanitizer, something that was introduced in both Clang and GCC after Wheezy was released, which makes reproducing this fairly complicated... This allowed me to experiment more with Vagrant, however, and I have provided the Debian cloud team with a 32-bit Vagrant box that was merged in shortly after, although it doesn't show up yet in the official list of Debian images. Then I looked at the apparmor situation (CVE-2017-6507, Debian bug #858768). That was one tricky bug as well, since it's not a security issue in apparmor per se, but more an issue with things that assume a certain behavior from apparmor. I have concluded that Wheezy was not affected because there are no assumptions of proper isolation there - which are provided only starting from LXC 1.0 - and Docker is not in Wheezy. I also couldn't reproduce the issue on Jessie, but, as it turns out, the issue was sysvinit-specific, which is why I couldn't reproduce it under the default systemd configuration shipped with Jessie. I also looked at the various binutils security issues: as I reported on the mailing list, I didn't see anything serious enough in there to warrant a security release and followed the lead of both the stable and Red Hat security teams by marking this "no-dsa". I similarly reviewed the mp3splt security issues (specifically CVE-2017-5666) and was fairly puzzled by that issue, which seems to be triggered only by the same address sanitization extensions as PCRE, although there was some pretty wild interplay with debugging flags in there. All in all, it seems we can't reproduce that issue in wheezy, but I do not feel confident enough in the results to push that issue aside for now. I finally uploaded the pending graphicsmagick issue (DLA-547-2), a regression update to fix a crash that was introduced in the previous release (DLA-547-1, mistakenly named DLA-574-1). Hopefully that release should clear up some of the confusion and fix the regression. I also released DLA-879-1 for CVE-2017-6369 in firebird2.5, which was an interesting experiment: I couldn't reproduce the issue in a local VM. After following the Ubuntu setup tutorial, as I wasn't too familiar with the Firebird database until now (hint: the default username and password is sysdba/masterkey), I ended up assuming we were vulnerable and just backporting the patch after seeing the jessie folks push out a release just in case.
I also looked at updating the ca-certificates package to deal with the pending WoSign/Startcom removal: I made an explicit list of the CAs that need to be removed after reviewing the Mozilla list. I also sent a patch for an unrelated issue where ca-certificates is writing to /usr/local (!!); see Debian bug #843722. I have also done some "meta" work in starting a discussion about fixing the missing DLA links in the tracker, as you will notice all of the above links lead nowhere. Thanks to pabs, there are now some links, but unfortunately there are about 500 DLAs missing from the website. We also discussed ways to fix Debian bug #859123, something which is currently a manual process. This is now in the hands of the excellent webmaster team. I have also filed a few missing security bugs (Debian bug #859135, Debian bug #859136), partly because I wanted to help the security team. But it turned out that I felt the script needed some improvements, so I submitted a patch to make it easier to run.

Other projects As usual, there's the usual mixed bag of chaos: More stuff on Github...

1 March 2017

Thorsten Glaser: New GitHub Terms of Service r e q u i r e removing many Open Source works from it

Please use the correct (perma)link to bookmark this article, not the page listing all wlog entries of the last decade. Thank you. Some updates inline and at the bottom. The new Terms of Service of GitHub became effective today, which is quite problematic: there was a review phase, but my reviews pointing out the problems were not answered, and, while the language is somewhat changed from the draft, they became effective immediately. Now, the new ToS are not so bad that one must immediately stop using their service in disagreement, but it's important that certain content may no longer legally be pushed to GitHub. I'll try to explain which content is affected, and why. I'm mostly working my way backwards through section D, as that's where the problems I identified lie, and because this goes from easier to harder. Note that using a private repository does not help, as the same terms apply. Anything requiring attribution (e.g. CC-BY, but also BSD, ...) Section D.7 requires the person uploading content to waive any and all attribution rights. Ostensibly this is "to allow basic functions like search to work", which I can even believe, but, for a work the uploader did not create completely by themselves, they can't grant this licence. The CC licences are notably bad because they don't permit sublicencing, but even so, anything requiring attribution can, in almost all cases, not be "written or otherwise, created or uploaded by our Users". This is fact, and the exceptions are few. Anything putting conditions on the right to use, display and perform the work and, worse, reproduce (all copyleft) Section D.5 requires the uploader to grant all other GitHub users these rights. Note that section D.4 is similar, but grants the licence to GitHub (and their successors); while this is worded in a much more friendly way than in the draft, that fact only makes it harder to see whether it affects works in a similar way. But that doesn't matter, since D.5 is clear enough. (This doesn't mean it's not a problem, just that I don't want to go there and analyse D.4, as D.5 points out the same problems but is easier.) This means that any and all content under copyleft licences is also no longer welcome on GitHub. Anything requiring integrity of the author's source (e.g. LPPL) Some licences are famous for requiring people to keep the original intact while permitting patches to be piled on top; this is actually permissible for Open Source, even though annoying, and the most common LaTeX licence is rather close to that. Section D.3 says any (partial) content can be removed, though keeping a PKZIP archive of the original is a likely workaround. Affected licences Anything copyleft (GPL, AGPL, LGPL, CC-*-SA) or requiring attribution (CC-BY-*, but also 4-clause BSD, Apache 2 with NOTICE text file, ...) is affected. BSD-style licences without advertising clause (MIT/Expat, MirOS, etc.) are probably not affected if GitHub doesn't go too far and dissociate excerpts from their context and legal info, but then nobody would be able to distribute it, so that'd be useless. But what if I just fork something under such a licence? Only continuing to use GitHub constitutes accepting the new terms. This means that repositories from people who last used GitHub before March 2017 are excluded. Even then, the new terms likely only apply to content uploaded in March 2017 or later (note that git commit dates are unreliable; you have to actually check whether the contribution dates from March 2017 or later). And then, most people are likely unaware of the new terms.
If they upload content for which they themselves don't have the appropriate rights (waivers to attribution and copyleft/share-alike clauses), it's plain illegal and also makes your upload of them, or a derivative thereof, no more legal. Granted, people who, in full knowledge of the new ToS, share any User-Generated Content with GitHub on or after 1 March 2017, and actually have the appropriate rights to do that, can do that; and if you encounter such a repository, you can fork, modify and upload that iff you also waive attribution and copyleft/share-alike rights for your portion of the upload. But especially in the beginning these will be few and far between (even more so taking into account that GitHub is, legally spoken, a mess, and they don't even care about hosting only OSS / Free works). Conclusion (Fazit) I'll be starting to remove any such content of mine, such as the source code mirrors of jupp, which is under the GNU GPLv1, now, and will be requesting people who forked such repositories on GitHub to also remove them. This is not something I like to do, but something I am required to do in order to comply with the licence granted to me by my upstream. Anything you've found contributed by me in the meantime is up for review; ping me if I forgot something. (mksh is likely safe, even if I hereby remind you that the attribution requirement of the BSD-style licences still applies outside of GitHub.) (Pet peeve: why can't I adopt a licence with British spelling? They seem to require oversea barbarian spelling.) The others Atlassian Bitbucket has similar terms (even worse actually; I looked at them to see whether I could mirror mksh there, and it turns out I can't if I don't want to lose most of what few rights I retain when publishing under a permissive licence). Gitlab seems not to have such terms, but requires you to indemnify them; YMMV. I think I'll self-host the removed content. And now? I'm in contact with someone from GitHub Legal (not explicitly in an official capacity though) and will try to explain the sheer magnitude of the problem and ways to solve this (leaving the technical issues to technical solutions and requiring legal solutions only where strictly necessary), but for now, the ToS are enacted (another point of my criticism of this move) and thus the aforementioned works must go off GitHub right now. That's not to say they may not come back later once this all has been addressed, if it will be addressed to allow that. The new ToS do have some good; for example, the old ToS said you allow every GitHub user to "fork" your repositories without ever specifying what that means. It's just that the people over at GitHub need to understand that, both legally and technically, any and all OSS licences grant enough to run a hosting platform already, and separate explicit grants are only needed if a repository contains content not under an OSI/OKFN/Copyfree/FSF/DFSG-free licence. I have been told that these are important issues and have been thanked for my feedback; we'll see what comes from this. (Footnotes: maybe with a little more effort on the coders' side; all licences on one of those lists, or conformant to the DFSG, OSD or OKD, should do; e.g. when displaying search results, add a note "this is an excerpt, click HERE to get to the original work in its context, with licence and attribution", where HERE is a backlink to the file in the repository; it is understood those organisations never un-approve any licence that rightfully conforms to those definitions, also in cases like a grant saying "just use any OSS licence", which is occasionally used.)
Update: In the meantime, joeyh has written not one but two insightful articles (although I disagree in some details; the new licence is granted only to GitHub users (D.5) and GitHub (D.4) and only within their system, so, while uploaders would violate the ToS (they cannot grant the licence) and (probably) the upstream-granted copyleft licence, this would not mean that everyone else wasn't bound by the copyleft licence in, well, enough cases to count; yes, it's possible to construct situations in which this hurts the copyleft fraction, but no, they're nowhere near 100%).

2 February 2017

Steinar H. Gunderson: Not going to FOSDEM but a year of Nageru

It's that time of the year :-) And FOSDEM is fun. But this year I won't be going; there was a scheduling conflict, and I didn't really have anything new to present (although I probably could have shifted around priorities to get something). But FOSDEM 2017 also means there's a year since FOSDEM 2016, where I presented Nageru, my live video mixer. And that's been a pretty busy year, so I thought I'd do a recap from high up above. First of all, Nageru has actually been used in production; we did Solskogen and Fyrrom, and both gave invaluable input. Then there have been some non-public events, which have also been useful. The Nageru that's in git right now has evolved considerably from the 1.0.0 that was released last year. diffstat shows 19660 insertions and 3543 deletions; that's counting about 2500 lines of vendored headers, though. Even though I like deleting code much more than adding it, the doubling (from ~10k to ~20k lines) represents a significant amount of new features: 1.1.x added support for non-Intel GPUs. 1.2.x added support for DeckLink input cards (through Blackmagic's proprietary drivers), greatly increasing hardware support, and did a bunch of small UI changes. 1.3.x added x264 support that's strong enough that Nageru has really displaced VLC as my go-to tool for just video-signal-to-H.264-conversion (even though it feels overkill), and also added hotplug support. 1.4.x added multichannel audio support including support for MIDI controllers, and also a disk space indicator (because when you run out of disk during production without understanding that's what happens, it really sucks), and brought extensive end-user documentation. And 1.5.x, in development right now, will add HDMI/SDI output, which, like all the previous changes, requires various rearchitecting and fixing. Of course, there are lots of things that haven't changed as well; the basic UI remains the same, including the way the theme (governing the look-and-feel of the finished video stream) works. The basic design has proved sound, and I don't think I would change a lot if I were to design something like 1.0.0 again. As a small free software project, you have to pick your battles, and I'm certainly glad I didn't start out doing something like network support (or a distributed architecture in general, really). So what's in store for the next year of Nageru? It's hard to say, and it will definitely depend on the concrete needs of events. A hot candidate (since I might happen to need it) is chroma keying, although good keying is hard to get right and this needs some research. There's also been some discussion around other concrete features, but I won't name them until a firm commitment has been made; priorities can shift around, and it's important to stay flexible. So, enjoy FOSDEM! Perhaps I'll return with a talk in 2018. In the meantime, I'll be preparing the stream for the 2017 edition of Fyrrom, and I know for sure there will be more events, more features and more experiences to be had. And, inevitably, more bugs. :-)

27 December 2016

Gunnar Wolf: Giving up on the Drupal 8 debianization

I am sad (but feel it my duty) to inform the world that we will not be providing a Drupal 8 package in Debian. I filed an Intent To Package bug a very long time ago, intending to ship it with Jessie; Drupal 8 was so deep a change that it took their community overly long to achieve and stabilize it. Still, Drupal 8 was released over a year ago today. I started working on debianizing the package shortly afterwards. There is also some online evidence, such as my call for help sent through this same blog. I have been too busy this last year. I let the packaging process lay dormant for too long, without touching it for even half a year. Then, around September, I started working with the very nice guys of Indava, David and Enrique, and made very good advances. They clearly understood Debian's needs when it comes to full source inclusion (as D8 ships many minified Javascript libraries), attribution (as, additionally to all those, many third-party PHP projects are bundled in the infamous vendor/ directory), and system-wide dependency management (as Drupal builds on some frameworks and libraries already available within Debian, chiefly Symfony, Doctrine and Twig). Even more, most of them appeared to work at the version levels we would be shipping, so all was dandy, and for some weeks I was quite optimistic about finishing the package on time and with the needed quality and testing. Yay! But... Reality bites. When I started testing my precious package... It broke in horrible ways. Incomprehensible PHP errors (and I have to add here, I am a PHP newbie and am reluctant to learn better a language that strikes me as so inconsistent, so ugly), which we spent some time tackling... Of course, configuration changes are more than expected... But, just as we Debianers learnt some important lessons after the way-too-long Sarge freeze (ten years ago; many among you won't remember those frustrating days), Drupal learnt as well. They changed their release strategy. Instead of describing it, those interested can read it at its source. What it meant for me, sadly, is that this process does not align with the Debian maintenance model. This means: The Drupal API stays mostly stable between 8.0.x, 8.1.x, 8.2.x, etc. However, Drupal will incorporate new versions of their bundled libraries. I understood the new versions would be incorporated at minor-level branches, but if I read some of my errors correctly, some dependencies change even at patch-level updates. And... Well, if you update a PHP library, and the invoking PHP code (that is, Drupal) relies on this new version... Sadly, it makes it unmaintainable for Debian. So, long story short: I have decided to drop Drupal 8 support in Debian. Of course, if somebody wants to pick up the pieces, the Git repository is still there (although I do plan on erasing it in a couple of weeks, as it would mean a useless waste of project resources otherwise), and you could probably even target unstable+backports in a weird way (as it's software that, given our constraints, shouldn't enter testing, at least during a freeze). So... Sigh, a tear is dropped for every lost hour of work, and my deepest regrets to David and Enrique, who put in their work as well to make D8 happen in Debian. I will soon be closing the ITP and... Forgetting about the whole issue?

22 June 2016

Satyam Zode: GSoC 2016 Week 4 and 5: Reproducible Builds in Debian

This is a brief report on my last two weeks of work with Debian Reproducible Builds. In week 4, I mostly worked on designing interfaces and tackling different issues related to the argument completion feature of diffoscope, and in week 5 I worked on hiding .buildinfo files from .changes files. Update for last week's activities. Goal for the upcoming week: I am thankful to Shirish Agarwal for helping me through the visa process. But, unfortunately, I won't get a visa till 5th July. So I don't think I will make it to DebConf this year. I will certainly attend DebConf 2017. Good news for me is that I have passed the mid-term evaluations of Google Summer of Code 2016. I will continue my work to improve Debian. I even have post-GSoC plans ready for the Debian project ;) Have a nice day :)

20 June 2016

Joey Hess: twenty years of free software -- part 1 ikiwiki

I'm forty years old. I've been developing free software for twenty years. A decade ago, I wrote a series of posts about my first ten years of free software, looking back over projects I'd developed. These retrospectives seem even more valuable in retrospect; there are things in the old posts that jog my memory, and other details I've forgotten by now. So, I'm doing it again. Over the next two weeks (with a week off in the middle for summer vacation), I'll be writing one post each day about a free software project I've developed in the past decade.
We begin with Ikiwiki. I started it 10 years ago, and still use it to this day; it's the engine that builds this website, and nearly all my other websites, as well as wikis and websites belonging to tons and tons of other projects, like NetBSD, X.org, Freedesktop.org, FreedomBox and many others. Indeed I'm often reading a website and find myself wondering "hey.. is this using Ikiwiki?", and glance at the html and find that yes, it is. So, Ikiwiki is a reasonably successful and widely used piece of software, at least in its niche. More important to me, it was a static site generator before we knew what those were. It wasn't the first, but it broke plenty of new ground. I'm particularly proud of the way it combines a wiki with blogging support seamlessly, and the incremental updating of the static pages including updating wikilinks and backlinks. Some of these and other features are still pretty unique to Ikiwiki despite the glut of other static site generators available now. Ikiwiki is written in Perl, which was great for getting lots of other contributions (including many of its 113 plugins), but has also held it back some lately. There are fewer Perl programmers these days. And over the past decade, concurrency has become very important, but Ikiwiki's implementation is stubbornly single threaded, and multithreading such a Perl program is a losing proposition. I occasionally threaten to rewrite it in Haskell, but I doubt I will. Ikiwiki has several developers now, and I'm the least active of them. I stepped back because I can't write Perl very well anymore, and am mostly very happy with how Ikiwiki works, so only pop up now and then when something annoys me. Next: twenty years of free software -- part 2 etckeeper

16 April 2016

Matthias Klumpp: Introducing AppStream-Generator

AppStream Generator Since mid-2015 we have been using the dep11-generator in Debian to build AppStream metadata about available software components in the distribution. Getting rid of dep11-generator Unfortunately, the old Python-based dep11-generator hit some hard limits pretty soon. For example, using multiprocessing with Python was a pain, since it resulted in some very hard-to-track bugs. Also, the multiprocessing approach (as opposed to multithreading) made it impossible to use the underlying LMDB database properly (it was basically closed and reopened in each forked-off process, since pickling the Python LMDB object caused some really funny bugs, which usually manifested themselves in the application hanging forever without any information on what was going on). In addition to that, the Python-based generator forced me to maintain two implementations of the AppStream YAML spec, one in C and one in Python, which consumed quite some time. There were also some other issues (e.g. no unit tests) in the implementation, which made me think about rewriting the generator. Adventures in Go / Rust / D Since I didn't want to write this new piece of software in C (basically, writing it in C was my last option), I explored Go and Rust for this purpose and also did a small prototype in the D programming language when I was starting to feel really adventurous. And while I never intended to write the new generator in D (I was pretty fixated on Go), this is what happened. The strong points of D for this particular project were its close relation to C (and ease of using existing C code), its super-flat learning curve for someone who knows and likes C and C++, and its pretty powerful implementations of the concurrent and parallel programming paradigms. That being said, not all is great in D and there are some pretty dark spots too, mainly when it comes to the standard library and compilers. I will dive into my experiences with D in a separate blogpost. What to expect from appstream-generator? So, what can the new appstream-generator do for you? Basically, the same as the old dep11-generator: It will extract metadata from a distribution's package archive, download and resize screenshots, search for icons and size them properly, and generate reports in JSON and HTML of found metadata and issues. LibAppStream-based parsing, generation of YAML or XML, multi-distro support As opposed to the old generator, the new one utilizes the metadata parsers and writers of libappstream. This allows it to return the extracted metadata as AppStream YAML (for Debian) or XML (everyone else). It is also written in a distribution-agnostic way, so if someone wants to use it in a different distribution than Debian, this is possible now. It just requires a very small distribution-specific backend to be written; all of the details of the metadata extraction are abstracted away (just two interfaces need to be implemented). While I do not expect anyone except Debian to use this in the near future (most distros have found a solution to generate metadata already), the frontend-backend split is a much cleaner design than what was available in the previous code. It also allows unit-testing the code properly, without providing a Debian archive in the testsuite. Feature flags, optipng The new generator also allows enabling and disabling certain sets of features in a standardized way. E.g. Ubuntu uses a language-pack system for translations, which Debian doesn't use.
Features like this can be implemented as disableable separate modules in the generator. We use this at the moment to, e.g., allow descriptions from packages to be used as AppStream descriptions, or to run optipng on the generated PNG images and icons. No more Contents file dependency Another issue the old generator had was that it used the Contents file from the Debian archive to find matching icons for an application. We could never be sure whether the contents listed in the Contents file actually matched the contents of the package we were currently dealing with. What made things worse is that at Ubuntu, the archive software is only updating the Contents file daily (while the generator might run multiple times a day), which has led to software being ignored in the metadata, because icons could not yet be found. Even on Debian, with its quickly-updated Contents file, we could immediately see the effects of an out-of-date Contents file when updating it failed once. In the new generator, we read the contents of each package ourselves now and store them in an LMDB database, bypassing the Contents file and removing the whole class of problems resulting from missing or wrong contents data. It can't all be good, right? That is true, there are also some known issues the new generator has: Large amounts of RAM required The better speed of the new generator comes at the cost of holding more stuff in RAM. Much more. When processing data from 5 architectures initially on Debian, the amount of required RAM might lie above 4GB, with the OOM killer sometimes being quicker than the garbage collector. That being said, on subsequent runs the amount of required memory is much lower. Still, this is something I am working to improve. What are symbolic links? To be faster, the appstream-generator will read the md5sums file in .deb packages instead of extracting the payload archive and reading its contents. Since the md5sums file does not list symbolic links, symlinks basically don't exist for the new generator. This is a problem for software symlinking icons or even .desktop files around, like e.g. LibreOffice does. I am still investigating how widespread the use of symlinks for icons and .desktop files is, but it looks like fixing the packages (making them move the files rather than symlink them) might be a better approach than investing additional computing power to find symlinks or even switching back to parsing the Contents file. Input on this is welcome! Deploying asgen I finished the last pieces of the appstream-generator (together with doing lots of other cool things and talking to great people) at the GNOME Software Hackfest in London last week (detailed blogposts about the things that happened there will follow; many thanks once again to the Ubuntu community for sponsoring my attendance!). Since today, the new generator is running on the Debian infrastructure. If bigger issues are found, we can still roll back to the old code. I decided to deploy this faster, so we can get some good testing done before the Stretch release. Please report any issues you may find!
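As a rough illustration of that idea (a Python sketch under my own assumptions, not the actual D implementation; the helper and package names are made up), reading a package's file list from its md5sums control member could look like this:
import subprocess

def deb_file_list(deb_path):
    # list the files shipped by a .deb by reading its md5sums control file,
    # avoiding a full unpack of the payload; note that symbolic links do not
    # appear here, which is exactly the limitation described above
    out = subprocess.run(["dpkg-deb", "--info", deb_path, "md5sums"],
                         capture_output=True, text=True, check=True).stdout
    return [line.split(maxsplit=1)[1] for line in out.splitlines() if line.strip()]

print(deb_file_list("hello_2.10-2_amd64.deb"))  # example package path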

30 March 2016

Julien Danjou: The OpenStack Schizophrenia

When I started contributing to OpenStack, almost five years ago, it was a small ecosystem. There was no foundation, just a handful of projects, and you could understand the code base in a few days. Fast forward to 2016, and it is a totally different beast. The project grew to no less than 54 teams, each team providing one or more deliverables. For example, the Nova and Swift teams each produce one service and its client, whereas the Telemetry team produces 3 services and 3 different clients. In 5 years, OpenStack went from a few IaaS projects to 54 different teams tackling different areas related to cloud computing. Once upon a time, OpenStack was all about starting some virtual machines on a network, backed by images and volumes. Nowadays, it's also about orchestrating your network deployment over containers, while managing your application life-cycle using a database service, everything being metered and billed for. This exponential growth has been made possible by the decision of the OpenStack Technical Committee to open the gates with the project structure reform voted at the end of 2014. This amendment did away with the old OpenStack model of "integrated projects" (i.e. Nova, Glance, Swift...). The big tent, as it's called, allowed OpenStack to land new projects every month, growing from the 20 project teams of December 2014 to the 54 we have today, multiplying the number of projects by 2.7 in a little more than a year. Amazing growth, right? And this was clearly a good change. I sat on the Technical Committee in 2013, when projects were trying to apply to be "integrated", after Ceilometer and Heat were. It was painful to see how the Technical Committee was trying to assess whether new projects should be brought in or not. But what I notice these days is how OpenStack is still stuck between its old and new models. On one side, it accepted a lot of new teams, but on the other side, many are treated as second-class citizens. Efforts are made to continue to build an OpenStack project that does not exist anymore. For example, there is a team named DefCore trying to define what OpenStack core is, that is, which projects are, somehow, actually OpenStack. This leads to weird situations, such as non-DefCore projects seeing their documentation rejected from installation guides. Again, I reiterated my proposal to publish documentation as part of each project's code to resolve that dishonest situation and put everything on a level playing field. Some cross-project specs are also pushed without the involvement of all OpenStack projects. For example, the deprecate-cli spec, which proposes to deprecate the command-line interface tools provided by each project, made a lot of sense in the old OpenStack model, where the goal was to build a unified and ubiquitous cloud platform. But now that you have tens of projects with largely different scopes, this starts making less sense. Still, this spec was merged by the OpenStack Technical Committee this cycle. Keystone is the first project to proudly force users to rely on openstack-client, removing its old keystone command-line tool. I find it odd to push such specs when it's pretty clear that some projects (e.g. Swift, Gnocchi...) have no intention of going down that path. Unfortunately, most specs pushed by the Technical Committee are in the realm of wishful thinking. It somehow makes sense, since only a few of the members are actively contributing to OpenStack projects, and they can't implement all of that by themselves magically.
But OpenStack is no exception in the free software world and remains a do-ocracy. There is good cross-project content in OpenStack, such as the API working group. While the work done there should probably not be OpenStack-specific, there's a lot that teams have learned by building various HTTP REST APIs with different frameworks. Compiling this knowledge and offering it as guidance to the various teams is a great help. My fellow developer Chris Dent wrote a post about what he would do on the Technical Committee. In this article, he points to a lot of the shortcomings I described here, and his confusion about whether OpenStack is a product or a kit is quite understandable. Indeed, the message broadcast by OpenStack is still very confusing after the opening-up of the big tent. There's not enough user-experience improvement being done. The OpenStack Technical Committee election is open for April 2016, and from what I have read so far, many candidates are proposing to now clean up the big tent, kicking out projects that no longer match certain criteria. This is probably a good idea, as there are some inactive projects lying around. But I don't think that will be enough to solve the identity crisis that OpenStack is experiencing. So this is why, once again this cycle, I will throw my hat in the ring and submit my candidacy for the OpenStack Technical Committee.

2 March 2016

Steinar H. Gunderson: Nageru FOSDEM talk video

I got tired of waiting for the video of my FOSDEM talk about Nageru, my live video mixer, to come out, so I made an edit myself. (This wouldn't have been possible without getting access to the raw video from the team, of course.) Of course, in maximally bad timing, the video team published their own (auto-)edit of mine and a lot of other videos on the very same day (and also updated their counts to a whopping 151 fully or partially lost videos out of a bit over 500!), so now there are two competing ones. Of course, since I have only one and not 500 videos to care about, I could afford to give it a bit more love; in particular, I spliced in digital versions of the original slides where appropriate, modified the audio levels a bit where there are audience questions, added manual transcriptions and so on, so I think it ended up quite a bit better. Nageru itself is also coming along pretty nicely, with a 1.1.0 release that supports DeckLink PCI cards (not just the USB ones) and also the NVIDIA proprietary driver, for much increased oomph. It's naturally a bit quieter now than it was, though conferences always tend to generate a flurry of interest, and then you get back to other interests afterwards. You can find the talk on YouTube; I'll also be happy to provide a full-quality master of my edit if FOSDEM or anyone else wants it. Enjoy :-)

7 November 2015

Wouter Verhelst: Minidebconf Cambridge

My main reason for being here was to join the Debconf video team sprint. But hey, since I'm here anyway, why not join all of it, right? Right. Apart from the video team stuff that I've been involved in, I've been spending the last few days improving NBD. There are a few more things that I'd like to see fixed, but after that I'll probably release 3.12 (upstream and in Debian) sometime later this week. With that, I'll be ready to start tackling Debian bug #796633, to provide proper systemd support in nbd-client.

10 September 2015

Antoine Beaupré: Switched to bootstrap theme

I finally gave up and drank the Kool-Aid of the Bootstrap theme. The main reason was that I noticed the site was basically unusable on any mobile device, including tablet computers like the iPad, which are unfortunately very, very common. Phones wouldn't render the site in a legible way either, unfortunately. The change is rather drastic, so I figured it was important to mention it here. I am pretty excited about the result: the theme is very basic, if not absent. I actually like that purified form now, and it now works much better across devices. But I do feel I just made my blog look like everyone else's that uses Bootstrap. We seem to be entering an era where non-graphic designers (like me) are back to building web pages that all look the same, not very different, in a way, from the old days of plain HTML, like the default ikiwiki theme. I did some work to change the look at least minimally, but the top black navbar is really a dead giveaway that this is a Bootstrap theme. Bootstrap does a lot for the typography, at least when checking the presslabs checklist or the practical typography checklist, but I still had to change the main body width, which helped a lot I think. Anyway, the old Night City theme is still available for download and I could still flip it back on here if I need to.

What changed: The new theme keeps the distractions away, but ironically, it's slower than the previous theme even though there are no images and basically no colors at all. In my tests, it now loads in about 3.42 seconds on the "Regular 2G (250Kbps)" simulation of Chromium, with 72KB in 12 requests, whereas my tests with the previous theme took 3.25 seconds with 43.5KB in 17 requests. This is due to the Bootstrap CSS, which takes a whopping 19KB itself, but especially because I now load jQuery, which takes a whopping 32.8KB! And that is probably gzip-compressed, as the original is closer to 90KB. But at least the thing is readable on phones and tablets now. Compare:

Samsung S3, before: screenshot of a fake rendering on the Samsung Galaxy S3 in Night City.

Samsung S3, after: screenshot of a fake rendering on the Samsung Galaxy S3 in Bootstrap. I like this: a much simpler, purer version on smaller devices. And we see what counts: the freaking text. Of course, on tablets, it doesn't fare as well: the top menu gets wrapped around...

Apple iPad, before: screenshot of a fake rendering on the iPad in Night City.

Apple iPad, after: screenshot of a fake rendering on the iPad in Bootstrap. Apparently, the fix would be for me to "customize the @grid-float-breakpoint variable" in LESS (which I thought was a pager, but seems to be a CSS preprocessor, graaah). Still, I think it's better that way: more readable, and less crufty.

How it was done: After doing a thorough evaluation of all the Bootstrap ikiwiki themes I could find (there are at least 4 of them, if not more!), I figured I preferred the Jak Linux one. It seemed to be better implemented than the others, cleaner, and I liked the mean black border on top. Black is cool. Oh, and I liked the subtle footer as well. Subtle is cool. Unfortunately, that theme originally required a custom plugin to have a menu on top, which I thought was silly. So I patched it to make it work with the sidebar plugin, which turned out to be trivially simple: just dump the sidebar content, and make the sidebar page have explicit HTML tags in it that match Bootstrap's required classes. I also added the regular action links to the top navbar, but that does overload it quite a bit. I also did some code cleanups and various other small changes, all of which are available in my personal git repo.

What doesn't work: So of course, there are always problems. The first problem is the extra bandwidth usage. Not sure that can be fixed, other than by switching to the upcoming Bootstrap 4, which is smaller than Bootstrap 3, bizarrely. The more annoying problems are weird alignment issues. If the screen gets too narrow without kicking some collapse rules into place (the iPad bug above), the navbar rows overflow and look ugly. Even more bizarre, in some cases the right navbar can just completely disappear; in fact, that's the only fix I could find: to assign the hidden-xs and hidden-sm classes to the navbar-right <UL> element. But then it just goes away, which is pretty darn stupid. Still, I assigned the classes to a few actions that seemed low priority, both in the theme and in the sidebar page. Similarly, the search form at the bottom of the page doesn't seem to want to fit in the footer properly. No idea why, and I can't be bothered to figure that one out. I ended up merging it with the backlinks, tags and trails navigation items. Oh, and amazingly enough: comments are simply not rendered at all right now. Oops. I seem to have picked the single Bootstrap ikiwiki theme that does *not* render comments... Aaaargh. For now I just copy-pasted stuff from the ramseydsilva theme and it looks pretty ugly, but at least it works. I ended up theming comments the same way I did for Night City, which looks pretty good, but is not in sync with the LESS stuff in Bootstrap. I also had to add back trails, backlinks and the favicon... looks like the Jak theme wasn't made for a blog after all... but that's now fixed! :) Another improvement would be to change the font from the canonical Helvetica to something prettier, as suggested in Practical Typography's font recommendations section, or the presslabs checklist. So far I have settled on Mozilla's Fira font (which means, yes, one set of objects is actually loading from the CDN, sorry). Feedback welcome.

2 January 2015

John Goerzen: Sound players: Adventures with Ampache, mpd, pulseaudio, Raspberry Pi, and Logitech Media Server

I finally decided it was about time to get my whole-house sound project off the ground. As an added bonus, I'd like to be able to stream music from my house to my Android phone. Some Background: It was about 2.5 years ago that I last revisited the music-listening picture on Linux. I used Spotify for a while, but the buggy nature of its support for local music eventually drove me up the wall. I have a large collection of music that will never be on Spotify (local choirs, for instance) and this was an important feature. When Google Play Music added the feature of uploading your local collection, I used that; it let me stream music from my phone in my car (using the Bluetooth link to the car). I could also listen at home on a PC, or plug my phone into various devices to play. But that was a hassle, and didn't let me have music throughout the house. Google Play is reasonable for that, but it has a number of really glaring issues. One is that it often gets album artwork wrong; it doesn't use the ID3 tags embedded in the files, but rather tries to "guess". Another is that the sync is add-only. Move files to another place in your collection, or re-rip them to FLAC and replace your old MP3s, and suddenly they're in Google Play twice. It won't ever see updated metadata, either; quite a hassle for someone who uses MusicBrainz and carefully curates metadata. My Hardware: I already have an oldish PC set up as an entertainment box running MythTV. It is a diskless system (boots PAE over the network and has an NFS root) and very quiet. It has video output to my TV, and audio output via S/PDIF to my receiver. It is one logical audio frontend. My workstation in my office is another obvious place. My kitchen has a radio with a line-in jack, and I also have a small portable speaker with a line-in jack, making up the last two options. I also have a Raspberry Pi model B that I bought a while ago and had been looking for a purpose for, so I thought this should be cheap and easy, right? Well. Cheap, yes. Easy, not so much. First attempt: Ampache. The Ampache project produces quite a nice piece of software. Ampache has matured significantly in the last 2.5 years, and the usability of its web interface, with HTML5 and Flash players as options, is quite impressive. It is easily as usable as Google's, though its learning curve is rather steeper. There are multiple Android apps for Ampache to stream remotely. And, while most are terribly buggy and broken, there is at least one that seems to work well. Ampache can also output m3u/pls files for a standalone player. It does on-the-fly transcoding. There are some chinks in the armor, however. The set of codecs that are transcoded or passed through is a global setting, not a per-device setting. The bitrates are per-account, so you can't easily have it transcode FLAC into 320Kbps MP3 for streaming on your LAN and 128Kbps MP3 for streaming to your phone. (There are some hacks involving IP address ranges and multiple accounts, but they are poorly documented and cumbersome.) Ampache also has a feature called localplay in which it drives local players instead of remote ones. I tried to use this in combination with mpd to drive music to the whole house. Ampache's mpd interface is a bit odd; it actually loads things up into mpd's queue. Sadly it shares the same global configs as the rest.
Even though mpd is perfectly capable of handling FLAC audio, the Ampache web player isn't, so you have to either make it transcode to MP3 for everything or forego the web player (or use a second account that has "transcode everything" set). Frustratingly, not one of the Android clients for Ampache is even remotely compatible with Localplay, and some will fail in surprising ways if you have been using Localplay on the web client. So let's see how this mpd thing worked out. Ampache with mpd for whole-house audio: The primary method here is to use mpd's pulseaudio driver. I configured it like so:
audio_output {
    type        "pulse"
    name        "MPD stream"
#   server      "remote_server"   # optional
    sink        "rtp"             # optional
    mixer_type  "software"
}

Then in /etc/pulse/system.pa:
load-module module-null-sink sink_name=rtp format=s16be channels=2 rate=44100 sink_properties="device.description='RTP Multicast Sink'"
load-module module-rtp-send source=rtp.monitor
This tells Pulse to use multicast streaming to the LAN for the audio packets. Pulse is supposed to have latency synchronization to achieve perfect audio everywhere. In practice, this works somewhat poorly. Plus I have to install Pulse everywhere, which inserts its tentacles way too deeply into the ALSA stack for my taste. (alsamixer suddenly turns useless by default, for instance.) But I gave it a try. After much fiddling (Pulse is rather poorly documented, and the interactions of configuration tools with it even more so), I found a working configuration. On my MythTV box, I added:
load-module module-rtp-recv
load-module module-native-protocol-tcp auth-ip-acl=127.0.0.1;x.x.x.x
The real annoyance was getting it to set the output through the S/PDIF digital port. Finally I figured out the magic potion:
set-card-profile 0 output:iec958-stereo+input:analog-stereo
This worked reasonably well. The Raspberry Pi was a much bigger challenge, however. I put Raspbian on it easily enough, and installed Pulse. But apparently it is well known in the Pulse community that Pulse's RTP does not work well with wifi. Multicast itself works poorly with wifi in general, and Pulse won't do unicast RTP. Furthermore, Pulse assumes very low latency and just won't work well with wifi at all. Moreover, pulseaudio on the rpi is something of a difficult beast to tame. It has crackling audio, etc. I eventually got it working decently with the following settings in daemon.conf:
realtime-scheduling = yes
realtime-priority = 5
resample-method = src-sinc-fastest
default-sample-rate = 44100
default-fragments = 4
default-fragment-size-msec=20
and in system.pa, I commented out the module-udev-detect and module-detect lines, and added:
load-module module-alsa-card device_id=0 tsched=0 fragments=10 fragment_size=640 tsched_buffer_size=4194384 tsched_buffer_watermark=262144
plus the rtp-recv and tcp protocol lines as before. This got it working with decent quality, but it was always out of sync with the other rooms by at least a few tenths of a second, if not a whole second. Brief diversion: mplayer. Many players besides pulseaudio can play the RTP streams that pulseaudio generates; mplayer, for instance, can. I found that this worked on the Raspberry Pi:
mplayer -really-quiet -cache 64 -cache-min 95 -demuxer rawaudio -rawaudio format=0x20776172 rtp://224.0.0.56:46510
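(As an aside, the cryptic rawaudio format value appears to be nothing more than the fourcc "raw " packed into a little-endian integer; a quick, purely illustrative Python check:)

import struct
# 0x20776172 read as a little-endian 32-bit integer spells out the fourcc "raw "
print(struct.pack("<I", 0x20776172))   # b'raw '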
But this produced even worse sync. It became clear that Ampache was not going to be the right solution. Logitech Media Server: I have looked at this a few times over the years, but I've somehow always skipped it. But I looked at it again now. It is an open source (GPL) server, and was originally designed to work with Logitech hardware. There are now all sorts of software clients out there. There is a helpful wiki about it, although Logitech has rebranded the thing so many times, it's not even consistent internally on what the heck it's called (is it LMS? or Squeezebox Server? or Slimserver? The answer is: yes.) LMS does no web-based streaming. At all. But I thought I'd give it a try. Installation is a little weird (its .deb packages up binary Perl modules for things that are already in Debian, for half a dozen architectures and many Perl versions). I had some odd issues but eventually it worked. It scanned my media collection. I installed squeezelite on my workstation as my first player, and things worked reasonably well out of the box. I proceeded to install squeezelite on the Raspberry Pi and the MythTV server (where I had to compile libsoxr for wheezy, due to an obscure error I couldn't find the cause of until I used strace). Log in to the LMS web interface, tell it to enable synchronization, and bam! Perfect synchronization! Incredible. But I ran into two issues: one was that it used a lot of CPU, even on my workstation (50% or more, even when idle). Strangely, it used far less CPU on the pi than pulseaudio did. Secondly, I'd get annoying clicks from it from time to time. Some debugging and investigation revealed that the two were somewhat related; it was getting out of sync with the pi and correcting. Initially I assumed this to be an issue with the pi, but tracked it down to something else. I told it to upsample to 48kHz and this made the problem go away. My command on PC hardware is:
squeezelite -a 200:10 -o dmix -u hLX -z
And on the Raspberry Pi:
squeezelite -a 800:10:16:0 -n kitchen -z
It is working perfectly now. LMS even has an Android remote-control app that can control all devices together, queue up playlists, etc. Very slick. The Raspberry Pi: The Raspberry Pi, even the model B+, apparently has notoriously bad audio output. I noticed this first as very low volume on the output. I found an $11 USB DAC on Amazon that seems to do the trick, so we will see. I am also using the $10 Edimax USB wifi adapter and it seems to work well. Other Options: The big-name other option here is Sonos. They have perfectly synced audio already working. But they're pricey, closed, and proprietary; you have to buy Sonos speakers, transmitters, etc. for the entire house. Their receivers start at $200, and it's at least $350 to integrate an existing stereo into the system. I'd be shelling out over $1000 to use Sonos for this setup. As it is, I've bought less than $100 of equipment (a second Pi and its accessories) and am getting output that, while no doubt not quite as pristine, is still quite nice and acceptable. Quite nice. Update: LMS with Ampache. It turns out that Logitech Media Server can automatically save created playlists to a spot where Ampache can find them. So an Ampache streaming player can be used to access the same collection LMS uses, with full features. Nice!

3 December 2014

Michael Stapelberg: Announcing the new Debian Code Search Instant

For the last few months, I have been working on a new version of Debian Code Search, and today it's going live! I call it Debian Code Search Instant, for multiple reasons; see below. A lot faster: The new Debian Code Search is still hosted by Rackspace (thank you!), but our new architecture finally allows us to use their Performance Cloud Servers hardware generation. The old Code Search architecture was split across 5 different servers, but the search itself was only performed on a single machine with a network-attached block volume backed by SSD. The new Code Search spreads both the trigram index and the source code out onto 6 different servers, and thanks to the new hardware generation, we have a locally connected SSD drive in each of them. So, we now get more than 6 times the IOPS we had before, at much lower latency :). Grouping by Debian source package: The first feature request we ever got for Code Search was to group results by Debian source package. I've tried tackling that before, but it's a pretty hard problem. With the new architecture and capacity, this finally becomes feasible. After your query completes, there is a checkbox called "Group search results by Debian source package". Enable it, and you'll get neatly grouped results. For each package, there is a link at the bottom to refine the search to only consider that package. In case you are more interested in the full list of packages, for example because you are doing large-scale changes across Debian, that's also possible: click on the symbol right next to the "Filter by package" list, and a curl command will be revealed with which you can download the full list of packages. Expensive queries (with lots of results) now possible: Previously, we had a 60-second timeout within which queries had to complete. This timeout has been completely lifted, and long-running queries with tons and tons of results are now possible. We kindly ask you not to abuse this feature: it's not very exciting to play with complexity explosion in regular expression engines by preventing others from doing their work. If you're interested in that sort of analysis, go grab the source code and play with it on your own machine :). Instant results: While your query is running, you will almost immediately see the top 10 results, even though the search may still take a while. This is useful to figure out whether your regular expression matches what you thought it would, before having to wait until your 5-minute query is finished. Also, the new progress bar tells you how far along the system is with your search. Instant indexing: Previously, Code Search was deployed to a new set of machines from scratch every week in order to update the index. This was necessary because performance would severely degrade while building the index, so we were temporarily running with twice the number of machines until the new version was up. In the new architecture, we store an index for each source package and then merge these into one big index shard. This currently takes about 4 minutes with the code I wrote, but I'm sure this can be made even faster if necessary. So, whenever new packages are uploaded to the Debian archive, we can just index the new version and trigger a merge. We get notifications about new package uploads from FedMsg. Packages that are not seen on FedMsg for some reason are backfilled every hour.
The time between uploading a package and being able to find it in Debian Code Search therefore now ranges from a couple of minutes to about an hour, instead of about a week! New, beautiful User Interface: Since we needed to rewrite the user interface anyway thanks to the new architecture, we also spent some effort on making it modern, polished and beautiful. Smooth CSS animations make you aware of what's going on, and the search results look cleaner than ever. Conclusion: At least to me, the new Debian Code Search seems like a significant improvement over the old one. I hope you enjoy the new features, into which I put a lot of work. In case you have any feedback, I'm happy to hear it (see the contact link at the bottom of every Code Search site).
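As a footnote to the post above: the trigram-index idea at the heart of Code Search is easy to sketch, even though the real service is vastly more sophisticated. The toy Python below is purely illustrative (the class, document ids and query are all made up for this example): it indexes each document by its three-character substrings, intersects posting lists to find candidate documents for a literal query, and then verifies the candidates to weed out false positives.

class TrigramIndex:
    def __init__(self):
        self.postings = {}    # trigram -> set of document ids
        self.documents = {}   # document id -> full text

    @staticmethod
    def trigrams(text):
        return {text[i:i + 3] for i in range(len(text) - 2)}

    def add(self, doc_id, text):
        self.documents[doc_id] = text
        for tri in self.trigrams(text):
            self.postings.setdefault(tri, set()).add(doc_id)

    def search_literal(self, literal):
        # A document containing the literal must contain all of its trigrams,
        # so intersecting the posting lists narrows down the candidates.
        tris = self.trigrams(literal)
        if not tris:
            return set(self.documents)
        candidates = set.intersection(*(self.postings.get(t, set()) for t in tris))
        # Verify candidates, since matching all trigrams does not guarantee a match.
        return {d for d in candidates if literal in self.documents[d]}

index = TrigramIndex()
index.add("pkg-a", "int main(void) { return 0; }")
index.add("pkg-b", "def main(): pass")
print(index.search_literal("main("))   # {'pkg-a', 'pkg-b'} (order may vary)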

24 November 2014

John Goerzen: My boys love 1986 computing

Yesterday, Jacob (age 8) asked to help me put together a 30-year-old computer from parts in my basement. Meanwhile, Oliver (age 5) asked Laura to help him learn cursive. Somehow, this doesn't seem odd for a Saturday at our place. Let me tell you how this came about. I've had a project going on for a while now to load data from old floppies. It's been fun, and had a surprise twist the other day: my parents gave me an old TRS-80 Color Computer II (aka "CoCo 2"). It was, in fact, my first computer, one they got for me when I was in Kindergarten. It is nearly 30 years old. I have been musing lately about the great disservice Apple did the world by making computers easy to learn: namely, the fact that few people ever bother to learn about them. Who bothers to learn about them when, on the iPhone for instance, the case is sealed shut, the lifespan is 1 or 2 years for many purchasers, and the platform is closed in lots of ways? I had forgotten how finicky computers used to be. But after some days struggling with IDE incompatibilities, booting issues, etc., when I actually managed to get data off a machine that had last booted in 1999, I had quite the sense of accomplishment, which I rarely have lately. I did something that was hard to do in a world where most of the interfaces don't work with equipment that old (even if nominally they are supposed to). The CoCo is one of those computers normally used with a floppy drive or cassette recorder to store programs. You type DIR, and you feel the clack of the drive heads through the desk. You type CLOAD and you hear the relay click closed to turn on the tape motor. You wiggle cables around until they make contact just right. You power-cycle for the times when the reset button doesn't quite do the job. The details of how it works aren't abstracted away by innumerable layers of controllers, interfaces, operating system modules, etc. It's all right there, literally vibrating your desk. So I thought this could be a great opportunity for Jacob to learn a few more computing concepts, such as the difference between mass storage and RAM, plus a great way to encourage him to practice critical thinking. So we trekked down to the basement and came up with handfuls of parts. We brought up the computer, some joysticks, and all sorts of tangled cables. We needed adapters and an old TV. Jacob helped me hook everything up, and then the moment of truth: success! A green BASIC screen! I added more parts, but struck out when I tried to connect the floppy drive. The thing just wouldn't start up right whenever the floppy controller cartridge was installed. I cleaned the cartridge. I took it apart, scrubbed the contacts, even re-seated the chips. No dice. So I fired up my CoCo emulator (xroar), and virtually saved some programs to cassette (a .wav file). I then burned those .wav files to an audio CD, brought up an old CD player from the basement, connected the cassette-in plug to the CD player's headphone jack, and presto: instant programs. (Well, almost. It takes a couple of minutes to load a program from the audio.) The picture above is Oliver cackling at one of the very simplest BASIC programs there is: number find. The computer picks a random number between 1 and 2000 and asks the user to guess it, giving a "too low" or "too high" clue with each incorrect guess. Oliver delighted in giving invalid input (way too high numbers, or things that weren't numbers at all) and cackled at the sarcastic error messages built into the program.
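For the curious, the logic of that little number-find program is roughly the following; this is a quick Python sketch of the same idea, not the original Color BASIC, and the exact messages are invented for illustration.

import random

def number_find(low=1, high=2000):
    # Pick a secret number and coach the player with "too low"/"too high" hints.
    secret = random.randint(low, high)
    while True:
        answer = input(f"Guess a number between {low} and {high}: ")
        if not answer.isdigit():
            print("That is not even a number. Nice try.")   # invented sarcasm
            continue
        guess = int(answer)
        if guess < low or guess > high:
            print(f"Very funny. I said between {low} and {high}.")  # invented sarcasm
        elif guess < secret:
            print("Too low!")
        elif guess > secret:
            print("Too high!")
        else:
            print("You got it!")
            return

if __name__ == "__main__":
    number_find()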
During Jacob's turn, he got very serious about it, and is probably going to be learning how to calculate halfway points before too long. But imagine my pride this morning when Jacob found the new CD I had made last night (correcting a couple of recordings), found my one-line instruction on just part of how to load a program, and correctly figured out by himself all the steps to do in order (type CLOAD on the CoCo, advance the CD to the proper track, press play on the player, wait for it to load on the CoCo, then type RUN). I ordered a replacement floppy controller off eBay tonight, and paid $5 for a coax adapter that should fix some video quality issues. I rescued some 5.25-inch floppies from my trash can from another project, so they should have plenty of tools for exploration. It is so much easier for them to learn how a disk drive works, and even what the heck a track is, when you can look at a floppy drive with the cover off and see the heads move. There are other things we can do with more modern equipment (Jacob has shown a lot of interest in Arduino projects), but I have so far drawn a blank on ways to really let kids discover how a modern PC (let alone a modern phone or tablet) works. Update Nov. 24: Every so often, the world surprises me by deciding to, well, read one of my random blog posts. For the benefit of those of you that don't already know my boys, you might want to know that among their common play activities are turning trees into pretend trains, typing at a manual typewriter, reading, writing their own books, using a cassette recorder, building a PC and learning to use bash or xmonad, making long paper tapes with an adding machine, playing records on a record player, building electric gizmos, and even making mud balls. I am often asked about the role of the computer in their lives, given that my hobby and profession involve computers. The answer: less than in the lives of most of their peers. I look for opportunities for them to learn by doing, discovering, playing, or imagining. I make no presumption that they will develop the passion for computers that I did. What I want is for them to have the curiosity and drive to learn everything there is to know about whatever they do develop a passion for, so they will be great at it.
