Series: Fractured Fables #1
Publisher: Tordotcom
Copyright: 2021
ISBN: 1-250-76536-6
Format: Kindle
Pages: 121
"You know it wasn't originally a spinning wheel in the story?" I offer, because alcohol transforms me into a chatty Wikipedia page.A Spindle Splintered is told from Zinnia's first-person perspective, and Zinnia is great. My favorite thing about Harrow's writing is the fierce and complex emotions of her characters. The overall tone is lighter than The Once and Future Witches or The Ten Thousand Doors of January, but Harrow doesn't shy away from showing the reader Zinnia's internal thought process about her illness (and her eye-rolling bemusement at some of the earlier emotional stages she went through).
Dying girl rule #3 is no romance, because my entire life is one long trolley problem and I don't want to put any more bodies on the tracks. (I've spent enough time in therapy to know that this isn't "a healthy attitude towards attachment," but I personally feel that accepting my own imminent mortality is enough work without also having a healthy attitude about it.)

There's a content warning for parents here, since Harrow spends some time on the reaction of Zinnia's parents and the complicated dance between hope, despair, smothering, and freedom that she and they had to go through. There were no easy answers and all balances were fragile, but Zinnia always finds her feet. For me, Harrow's character writing is like emotional martial arts: rolling with punches, taking falls, clear-eyed about the setbacks, but always finding a new point of stability to punch back at the world. Zinnia adds just enough teenage irreverence and impatience to blunt the hardest emotional hits. I really enjoyed reading it. The one caution I will make about that part of the story is that the focus is on Zinnia's projected lifespan and not on her illness specifically. Harrow uses it as setup to dig into how she and her parents would react to that knowledge (and I thought those parts were done well), but it's told from the perspective of "what would you do if you knew your date of death," not from the perspective of someone living with a disability. It is to some extent disability as plot device, and like the fairy tale that it's based on, it's deeply invested in the "find a cure" approach to the problem. I'm not disabled and am not the person to ask about how well a story handles disability, but I suspect this one may leave something to be desired. I thought the opening of this story was great. Zinnia is a great first-person protagonist and the opening few chapters are overflowing with snark and acerbic commentary. Dumping Zinnia into another world but having text messaging still work is genius, and I kind of wish Harrow had made that even more central to the book. The rest of the story was good but not as good, and the ending was somewhat predictable and a bit of a deus ex machina. But the characters carried it throughout, and I will happily read more of this. Recommended, with the caveat about disability and the content warning for parents. Followed by A Mirror Mended, which I have already pre-ordered. Rating: 8 out of 10
Series: The Scholomance #2
Publisher: Del Rey
Copyright: 2021
ISBN: 0-593-12887-7
Format: Kindle
Pages: 388
[She] came round to me and put her arm around my waist and said under her breath, "Hey, she can be taught," with a tease in her voice that wobbled a little, and when I looked at her, her eyes were bright and wet, and I put my arm around her shoulders and hugged her.

You'll know it when you get there. The Last Graduate also gives the characters other than El and Orion more room, which is part of how it handles the chosen one trope. It's been obvious since early in the first book that Orion is a sort of chosen one, and it becomes obvious to the reader that El may be as well. But Novik doesn't let the plot focus only on them; instead, she uses that trope to look at how alliances and collective action happen, and how no one can carry the weight by themselves. As El learns more and gains power, she also becomes less central to the plot resolution and has to learn how to be less self-reliant. This is not a book where one character is trained to save the world. It's a book where she manages to enlist the support of a kick-ass project manager and becomes part of a team. Middle books of a trilogy are notoriously challenging. Often they're travel books: the first book sets up a problem, the second book moves the characters both physically and emotionally into a position to solve the problem, and the third book is the payoff. Travel books often sag. They can feel obligatory but somewhat boring, like a chore on the way to the third-book climax. The Last Graduate is not a travel book; it is, instead, a pivot book, which is my favorite form of trilogy. It's a book that rewrites the problem the first book set up, both resolving it and expanding the scope beyond what the reader had expected. This is immensely satisfying when done well, and Novik does it extremely well. This is not a flawless book. There are some pacing hiccups, there is a romance angle that didn't work for me (although it does arrive at some character insights that I thought were spot on), and although I think Novik is doing something interesting with the trope, there is a lot of chosen one power escalation happening here. It's not the sort of book that I can claim is perfectly written. Instead, it's the sort of book that uses some of my favorite plot elements and emotional beats in such an effective way and with such a memorable character that I do not have it in me to care about any of the flaws. Your mileage may therefore vary, but I would be happy to read books like this until the end of time. As mentioned above, The Last Graduate ends on another cliffhanger. This time I was worried that Novik might have ended the series there, since there's enough of an internal climax that I could imagine some literary fiction (which often seems allergic to endings) would have stopped here. Thankfully, Novik's web site says this is not the case. The next year is going to be a difficult wait. The third book of this series is going to be incredibly difficult to write, and I hope Novik is up to the challenge she's made for herself. But she handled the transition between the first and second book so well, and this book is so good that I have a lot of hope. If the third book is half as good as I'm hoping, this is going to be one of my favorite fantasy series of all time. Followed by an as-yet-untitled third book. Rating: 10 out of 10
This post is about Rust's io::ErrorKind, which aims to categorise OS errors in a portable way.
Audiences for this post
The fact that ErrorKinds are part of the Rust standard library means that to get this right, you don't need to delve down and get the actual underlying operating system error number, and write separate code for each platform you want to support. You can check whether the error is ErrorKind::NotFound (or whatever).
Because ErrorKind is so important in many Rust APIs, some code which isn't really doing an OS call can still have to provide an ErrorKind. For this purpose, Rust provides a special category, ErrorKind::Other, which doesn't correspond to any particular OS error.
Rust's stability aims and approach
Another thing Rust tries to do is keep existing code working.
More specifically, Rust tries to:
io::ErrorKind
(Very briefly:)
When you have a value which is an io::ErrorKind, you can compare it with specific values:
    if error.kind() == ErrorKind::NotFound ...

But in Rust it's more usual to write something like this (which you can read like a switch statement):

    match error.kind() {
        ErrorKind::NotFound => use_default_configuration(),
        _ => panic!("could not read config file {}: {}", &file, &error),
    }

Here _ means "anything else".
Rust insists that match
statements are
exhaustive, meaning that each one covers all the possibilities.
So if you left out the line with the _, it wouldn't compile.
Rust enums can also be marked non_exhaustive, which is a declaration by the API designer that they plan to add more kinds. This has been done for ErrorKind, so the _ is mandatory, even if you write out all the possibilities that exist right now: this ensures that if new ErrorKinds appear, they won't stop your code compiling.
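For illustration, here is a minimal sketch (my own, not from the stdlib) of how a non_exhaustive enum behaves for downstream code:

    // In the library crate:
    #[non_exhaustive]
    pub enum Kind {
        NotFound,
        PermissionDenied,
    }

    // In a downstream crate: even with every current variant listed,
    // the compiler still demands a catch-all arm, because the library
    // is allowed to add variants later.
    fn describe(k: &Kind) -> &'static str {
        match k {
            Kind::NotFound => "not found",
            Kind::PermissionDenied => "permission denied",
            _ => "something else",
        }
    }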
Improving the error categorisation
The set of error categories stabilised in Rust 1.0 was too small.
It missed many important kinds of error. This makes writing
error-handling code awkward. In any case, we expect to add new
error categories occasionally. I set about trying to improve this
by proposing
new ErrorKinds. This obviously needed considerable
community review, which is why it took about 9 months.
The trouble with Other and tests
Rust has to assign an ErrorKind
to every OS error,
even ones it doesn't really know about.
Until recently, it mapped all errors it didn't understand to
ErrorKind::Other
- reusing the category for "not an OS
error at all".
Serious people who write serious code like to have serious tests.
In particular, testing error conditions is really important. For
example, you might want to test your program's handling of disk
full, to make sure it didn't crash, or corrupt files. You would set
up some contraption that would simulate a full disk. And then, in
your tests, you might check that the error was correct.
But until very recently (still now, in Stable Rust), there was no ErrorKind::StorageFull. You would get ErrorKind::Other. If you were diligent you would dig out the OS error code (and check for ENOSPC on Unix, corresponding Windows errors, etc.). But that's tiresome. The more obvious thing to do is to check that the kind is Other.
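Concretely, such a test might have looked like this (a sketch; fill_disk_and_write() is a hypothetical stand-in for your disk-full contraption):

    #[test]
    fn reports_disk_full() {
        let err = fill_disk_and_write().unwrap_err();
        assert_eq!(err.kind(), std::io::ErrorKind::Other);
    }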
Obvious but wrong.
ErrorKind is non_exhaustive, implying that more error kinds will appear, and, naturally, these would more finely categorise previously-Other OS errors.
Unfortunately, the documentation note

    Errors that are Other now may move to a different or a new ErrorKind variant in the future.

was only added in May 2020. So the wrongness of the "obvious" approach was, itself, not very obvious. And even with that docs note, there was no compiler warning or anything. The unfortunate result is that there is a body of code out there in the world which might break any time an error that was previously Other becomes properly categorised.
Furthermore, there was nothing stopping new people writing new
obvious-but-wrong code.
Chosen solution: Uncategorized
The Rust developers wanted an engineered safeguard against the bug of assuming that a particular error shows up as Other.
They chose the following solution:
There is now a new ErrorKind::Uncategorized, which is used for all OS errors for which there isn't a more specific categorisation. The fallback translation of unknown errors was changed from Other to Uncategorized.
This is de jure justified by the fact that this enum has always been
marked non_exhaustive. But in practice, because this
bug wasn't previously detected, there is such code in the wild.
That code now breaks (usually, in the form of failing test cases).
Usually when Rust starts to detect a particular programming error,
it is reported as a new warning, which doesn't break anything. But
that's not possible here, because this is a behavioural change.
The new ErrorKind::Uncategorized is marked unstable. This makes it impossible to write code on Stable Rust which insists that an error comes out as Uncategorized. So, one cannot now write code that will break when new ErrorKinds are added. That's the intended effect.
The downside is that this does break old code, and, worse,
it is not as clear as it should be what the fixed code looks like.
Alternatives considered and rejected by the Rust developers
Not adding more ErrorKinds
This was not tenable. The existing set is already too small, and
error categorisation is in any case expected to improve over time.
Just adding ErrorKinds as had been done before
This would mean occasionally breaking test cases (or, possibly,
production code) when an error that was
previously Other
becomes categorised. The broken code
would have been "obvious", but de jure wrong, just as it is now.
So this option amounts to expecting this broken code to continue
to be written and continuing to break it occasionally.
Somehow using Rust's Edition system
The Rust language has a system to allow language evolution,
where code declares its Edition (2015, 2018, 2021).
Code from multiple editions can be combined, so that
the ecosystem can upgrade gradually.
It's not clear how this could be used for ErrorKind, though.
Errors have to be passed between code with different editions.
If those different editions had different categorisations, the
resulting programs would have incoherent and broken error handling.
Also some of the schemes for making this change would mean that new ErrorKinds could only be stabilised about once every 3 years, which is far too slow.
How to fix code broken by this change
Most main-line error handling code already has a fallback case for
unknown errors. Simply replacing any occurrence of Other with _ is right.
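For example (a sketch; fall_back() stands in for whatever your existing handler was):

    match error.kind() {
        ErrorKind::NotFound => use_default_configuration(),
        // was: ErrorKind::Other => fall_back(),
        _ => fall_back(),
    }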
How to fix thorough tests
The tricky problem is tests. Typically, a thorough test case wants
to check that the error is "precisely as expected" (as far as the
test can tell). Now that unknown errors come out as an unstable Uncategorized variant, that's not so easy. If
the test is expecting an error that is currently not categorised,
you want to write code that says "if the error is any of the
recognised kinds, call it a test failure".
What does "any of the recognised kinds" mean here ? It doesn't
meany any of the kinds recognised by the version of the Rust
stdlib that is actually in use. That set might get bigger.
When the test is compiled and run later, perhaps years later, the
error in this test case might indeed be categorised. What you
actually mean is "the error must not be any of the kinds which
existed when the test was written".
IMO therefore the right solution for such a test case is to cut and
paste the current list of stable ErrorKinds into your
code. This will seem wrong at first glance, because the list in
your code and in Rust can get out of step. But when they do get out
of step you want your version, not the stdlib's. So freezing the
list at a point in time is precisely right.
You probably only want to maintain one copy of this list, so put it
somewhere central in your codebase's test support machinery.
Periodically, you can update the list deliberately - and fix any
resulting test failures.
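A sketch of what I mean (the list shown is illustrative, deliberately frozen at whatever kinds were stable when you wrote it):

    use std::io::ErrorKind;

    /// ErrorKinds that were stable when this list was frozen.
    /// Deliberately NOT kept in sync with the stdlib.
    const KNOWN_KINDS: &[ErrorKind] = &[
        ErrorKind::NotFound,
        ErrorKind::PermissionDenied,
        ErrorKind::ConnectionRefused,
        // ...and so on, as of the freeze date...
    ];

    fn assert_not_yet_categorised(e: &std::io::Error) {
        assert!(
            !KNOWN_KINDS.contains(&e.kind()),
            "error has a recognised kind now: {:?}",
            e
        );
    }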
Unfortunately this approach is not suggested by the documentation.
In theory you could work all this out yourself from first
principles, given even the situation prior to May 2020, but it seems
unlikely that many people have done so. In particular, cutting and
pasting the list of recognised errors would seem very unnatural.
Conclusions
This was not an easy problem to solve well.
I think Rust has done a plausible job given the various constraints,
and the result is technically good.
It is a shame that this change to make the error handling stability
more correct caused the most trouble for the most careful people who
write the most thorough tests.
I also think the docs could be improved.
edited shortly after posting, and again 2021-09-22 16:11 UTC, to fix HTML slips

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor. As covered in my post from last week, the Janitor now regularly tries to import new upstream git snapshots or upstream releases into packages in Sid.
Instance, and much code needs (immutable) access to it. I can't pass both &Instance and &mut GameState because one is inside the other.
My workaround involves passing separate references to the other fields of Instance, leading to some functions taking far too many arguments. 14 in one case. (They're all different types so argument ordering mistakes just result in compiler errors talking about arguments 9 and 11 having wrong types, rather than actual bugs.)
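A minimal illustration of the underlying restriction (hypothetical types, not Otter's actual ones):

    struct GameState { score: u32 }
    struct Instance { config: String, game: GameState }

    fn update(inst: &Instance, game: &mut GameState) {
        game.score += inst.config.len() as u32;
    }

    fn main() {
        let mut inst = Instance { config: "cfg".into(), game: GameState { score: 0 } };
        // error[E0502]: cannot borrow `inst.game` as mutable because
        // `inst` is also borrowed as immutable:
        //update(&inst, &mut inst.game);
    }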
I felt this problem was purely a restriction arising from limitations of the borrow checker. I thought it might be possible to improve on it. Weeks passed and the question gradually wormed its way into my consciousness. Eventually, I tried some experiments. Encouraged, I persisted.
What and how
partial-borrow is a Rust library which solves this problem. You sprinkle #[derive(PartialBorrow)] and partial!(...) and then you can pass a reference which grants mutable access to only some of the fields. You can also pass a reference through which some fields are inaccessible. You can even split a single mut reference into multiple compatible references, for example granting mut access to mutually non-overlapping subsets.
The core type is Struct__Partial (for some Struct). It is a zero-sized type, but we prevent anyone from constructing one. Instead we magic up references to it, always ensuring that they have the same address as some Struct. The fields of Struct__Partial are also ZSTs that exist only as references, and they Deref to the actual field (subject to compile-time borrow compatibility checking).
Soundness and testing
partial-borrow is primarily a nontrivial procedural macro which autogenerates reams of unsafe.
Of course I think it's sound, but I thought that the last two times before I added a test which demonstrated otherwise. So it might be fairer to say that I have tried to make it sound and that I don't know of any problems...
Reasoning about the correctness of macro-generated code is not so easy. One problem is that there is nowhere good to put the kind of soundness arguments you would normally add near uses of unsafe.
I decided to solve this by annotating an instance of the macro output. There's a not very complicated script using diff3 to help fold in changes if the macro output changes - merge conflicts there mean a possible re-review of the argument text. Of course I also have test cases that run with miri, and test cases for expected compiler errors for uses that need to be forbidden for soundness.
But this is quite hairy and I'm worried that it might be rather "my first insane unsafe contraption".
Also the pointer/reference trickery is definitely subtle, and depends heavily on knowing what Rust's aliasing and pointer provenance rules really are. Stacked Borrows is not entirely trivial to reason about in fiddly corner cases.
So for now I have only called it 0.1.0 and left a note in the docs. I haven't actually made Otter use it yet, but that's the rather more boring software integration part, not the fun "can I do this mad thing" part, so I will probably leave that for a rainy day. Possibly a rainy day after someone other than me has looked at partial-borrow (preferably someone who understands Stacked Borrows...).
Fun!
This was great fun. I even enjoyed writing the docs.
The proc-macro programming environment is not entirely straightforward and there are a number of things to watch out for. For my first non-adhoc proc-macro this was, perhaps, ambitious. But you don't learn anything without trying...
edited 2021-09-02 16:28 UTC to fix a typo

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor. Linux distributions like Debian fulfill an important function in the FOSS ecosystem - they are system integrators that take existing free and open source software projects and adapt them where necessary to work well together. They also make it possible for users to install more software in an easy and consistent way and with some degree of quality control and review. One of the consequences of this model is that the distribution package often lags behind upstream releases. This is especially true for distributions that have tighter integration and standardization (such as Debian), and often new upstream code is only imported irregularly because it is a manual process - both updating the package, but also making sure that it still works together well with the rest of the system. The process of importing a new upstream used to be (well, back when I started working on Debian packages) fairly manual and something like this:
version=4
http://somesite.com/dir/filenamewithversion.tar.gz
---
Repository: https://www.dulwich.io/code/dulwich/
Repository-Browse: https://www.dulwich.io/code/dulwich/
echo deb "[arch=amd64 signed-by=/usr/share/keyrings/debian-janitor-archive-keyring.gpg]" \
    https://janitor.debian.net/ fresh-snapshots main | sudo tee /etc/apt/sources.list.d/fresh-snapshots.list
echo deb "[arch=amd64 signed-by=/usr/share/keyrings/debian-janitor-archive-keyring.gpg]" \
    https://janitor.debian.net/ fresh-releases main | sudo tee /etc/apt/sources.list.d/fresh-releases.list
sudo curl -o /usr/share/keyrings/debian-janitor-archive-keyring.gpg https://janitor.debian.net/pgp_keys
apt update
apt install -t fresh-snapshots r-cran-roxygen2
[1] I'm not saying that a monoculture is great here, but it does help distributions.
nailing-cargo/1.0.0. nailing-cargo is a wrapper around the Rust build tool cargo. Among other things, nailing-cargo can make cargo work with unpublished local crates (by temporarily editing your Cargo.toml), and it can run cargo with privilege separation: cargo downloads and runs code from crates.io, which is a minimally-curated repository. I didn't want to expose my main account to that.
And, at the time, I was working on a project for which I was
also writing a library as a dependency, and I found that cargo
couldn't cope with this unless I were to commit (to my git repository)
the path (on my local laptop) of my dependency.
I filed some bugs, including about the unpublished crate problem.
But also, I was stubborn enough to try to find
a workaround that didn't involve committing junk to my git history.
The result was a
short but horrific shell script.
I wrote about this
at the time (March 2019).
Over the last few years the difficulties I have with cargo have remained unresolved. I found my interactions with upstream rather discouraging. It didn't seem like I would get anywhere by trying to help improve cargo to better support my needs.
So instead I have gradually improved nailing-cargo. It is now a Perl
script. It is rather less horrific, and has
proper documentation
(sorry, JS needed because GitLab).
Why Perl ?
Rust would have been my language of choice. But I wanted to avoid a
chicken-and-egg situation. When you're doing privsep, nailing-cargo
has to run in your more privileged environment. I wanted something
easy to get going with.
nailing-cargo has to contain a TOML parser; and I found a small one,
TOML-Tiny, which was good enough as a starting point, and small enough
I could bundle it as a git subtree. Perl is nicely fast to start up
(nailing-cargo --- true
runs in about 170ms on my
laptop), and it is easy to write a Perl script that will work on
pretty much any Perl installation.
Still unsolved: embedding cargo in another build system
A number of my projects contain a mixture of Rust code with other
languages. Unfortunately, nailing-cargo doesn't help with the
problems which arise trying to integrate cargo into another build
system.
I generally resort to find
runes for finding Rust source
files that might influence cargo, and stamp files for seeing if I have
run it recently enough; and I simply live with the fact that cargo
sometimes builds more stuff than I needed it to.
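By way of illustration, the kind of rune I mean looks something like this (a sketch, not my actual build system):

    # Re-run cargo only if some Rust source or Cargo.toml is newer than
    # the stamp file left by the last run.
    if [ ! -e cargo.stamp ] ||
       [ -n "$(find src Cargo.toml -newer cargo.stamp -print -quit)" ]; then
        cargo build --release &&
        touch cargo.stamp
    fi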
Future
There are a number of ways nailing-cargo could be improved.
Notably, the need to overwrite your actual Cargo.toml is very annoying, even if nailing-cargo puts it back afterwards.
A big problem with this is that it means that
nailing-cargo has to take a lock, while your cargo rune runs.
This effectively prevents using nailing-cargo with long-running
processes. Notably, editor integrations like rls and racer.
I could perhaps solve this with more linkfarm-juggling, but that
wouldn't help in-tree builds and it's hard to keep things up to date.
I am considering using LD_PRELOAD trickery or maybe bwrap(1) to "implement" the alternative Cargo.toml feature which was rejected by cargo upstream in 2019 (and again in April when someone else asked).
Currently there is no support for using sudo for out-of-tree privsep. This should be easy to add, but it needs someone who uses sudo to want it (and to test it!).
The documentation has some other discussion of limitations, some of which aren't too hard to improve. Patches welcome!

launchpadlib, which were ported years ago). As such, we weren't trying to do this with the internet having
Strong Opinions at us. We were doing this because it was obviously the only
long-term-maintainable path forward, and in more recent times because some
of our library dependencies were starting to drop support for Python 2 and
so it was obviously going to become a practical problem for us sooner or
later; but if we'd just stayed on Python 2 forever then fundamentally hardly
anyone else would really have cared directly, only maybe about some indirect
consequences of that. I don't follow Mercurial development so I may be
entirely off-base, but if other people were yelling at me about how late my
project was to finish its port, that in itself would make me feel more
negatively about the project even if I thought it was a good idea. Having
most of the pressure come from ourselves rather than from outside meant that
wasn't an issue for us.
I'm somewhat inclined to think of the process as an extreme version of
paying down technical debt. Moving from Python 2.7 to 3.5, as we just did,
means skipping over multiple language versions in one go, and if similar
changes had been made more gradually it would probably have felt a lot more
like the typical dependency update treadmill. I appreciate why not everyone
might want to think of it this way: maybe this is just my own rationalization.
Reflections on porting to Python 3
I'm not going to defend the Python 3 migration process; it was pretty rough in a lot of ways. Nor am I going to spend much effort relitigating it here, as it's already been done to death elsewhere, and as I understand it the core Python developers have got the message loud and clear by now. At a bare minimum, a lot of valuable time was lost early in Python 3's lifetime
hanging on to flag-day-type porting strategies that were impractical for
large projects, when it should have been providing for bilingual
strategies (code that runs in both Python 2 and 3 for a transitional period)
which is where most libraries and most large migrations ended up in
practice. For instance, the early advice to library maintainers to maintain two parallel versions or perhaps translate dynamically with 2to3 was entirely impractical in most non-trivial cases and wasn't what most people ended up doing, and yet the idea that 2to3 is all you need still floats around Stack Overflow and the like as a result. (These days, I would probably point people towards something more like Eevee's porting FAQ as somewhere to start.)
There are various fairly straightforward things that people often suggest
could have been done to smooth the path, and I largely agree: not removing
the u''
string prefix only to put it back in 3.3, fewer gratuitous
compatibility breaks in the name of tidiness, and so on. But if I had a
time machine, the number one thing I would ask to have been done differently
would be introducing type annotations in Python 2 before Python 3 branched
off. It's true that it's technically possible to do type annotations in Python 2, but the fact that it's a different syntax that would have to be fixed later is offputting, and in practice it wasn't widely used in Python 2 code. To make a significant difference to the ease of porting, annotations would need to have been introduced early enough that lots of Python 2 library code used them so that porting code didn't have to be quite so much of an exercise of manually figuring out the exact nature of string types from context.
Launchpad is a complex piece of software that interacts with multiple
domains: for example, it deals with a database, HTTP, web page rendering,
Debian-format archive publishing, and multiple revision control systems, and
there's often overlap between domains. Each of these tends to imply
different kinds of string handling. Web page rendering is normally done
mainly in Unicode, converting to bytes as late as possible; revision control
systems normally want to spend most of their time working with bytes,
although the exact details vary; HTTP is of course bytes on the wire, but
Python's WSGI interface has some string type
subtleties.
In practice I found myself thinking about at least four string-like types
(that is, things that in a language with a stricter type system I might well
want to define as distinct types and restrict conversion between them):
bytes, text, ordinary native strings (str
in either language, encoded to
UTF-8 in Python 2), and native strings with WSGI's encoding rules. Some of
these are emergent properties of writing in the intersection of Python 2 and
3, which is effectively a specialized language of its own without coherent
official documentation whose users must intuit its behaviour by comparing
multiple sources of information, or by referring to unofficial porting
guides: not a very satisfactory situation. Fortunately much of the
complexity collapses once it becomes possible to write solely in Python 3.
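In bilingual code those distinctions end up being made explicit, typically with six's helpers (a sketch):

    import six

    value = u"caf\xe9"
    as_bytes = six.ensure_binary(value)  # bytes on both versions (UTF-8)
    as_text = six.ensure_text(value)     # unicode/str text on both versions
    as_native = six.ensure_str(value)    # native str: bytes on 2, text on 3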
Some of the difficulties we ran into are not ones that are typically thought
of as Python 2-to-3 porting issues, because they were changed later in
Python 3 s development process. For instance, the email
module was
substantially improved in around the 3.2/3.3 timeframe to handle Python 3 s
bytes/text model more correctly, and since Launchpad sends quite a few
different kinds of email messages and has some quite picky tests for exactly
what it emits, this entailed a lot of work in our email sending code and in
our test suite to account for that. (It took me a while to work out whether
we should be treating raw email messages as bytes or as text; bytes turned
out to work best.) 3.4 made some tweaks to the implementation of
quoted-printable encoding that broke a number of our tests in ways that took
some effort to fix, because the tests needed to work on both 2.7 and 3.5.
The list goes on. I got quite proficient at digging through Python's git
history to figure out when and why some particular bit of behaviour had changed.
One of the thorniest problems was parsing HTTP form data. We mainly rely on zope.publisher for this, which in turn relied on cgi.FieldStorage; but cgi.FieldStorage is badly broken in some situations on Python 3. Even if that bug were fixed in a more recent version of Python, we can't easily use anything newer than 3.5 for the first stage of our port due to the version of the base OS we're currently running, so it wouldn't help much. In the end I fixed some minor issues in the multipart module (and was kindly given co-maintenance of it) and converted zope.publisher to use it. Although this took a while to sort out, it seems to have gone very well.
A couple of other interesting late-arriving issues were around pickle. For most things we normally prefer safer formats such as JSON, but there are a few cases where we use pickle, particularly for our session databases. One of my colleagues pointed out that I needed to remember to tell pickle to stick to protocol 2, so that we'd be able to switch back and forward between Python 2 and 3 for a while; quite right, and we later ran into a similar problem with marshal too. A more surprising problem was that datetime.datetime objects pickled on Python 2 require special care when unpickling on Python 3; rather than the approach that ended up being implemented and documented for Python 3.6, though, I preferred a custom unpickler, both so that things would work on Python 3.5 and so that I wouldn't have to risk affecting the decoding of other pickled strings in the session database.
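The protocol discipline itself is a one-liner (sketch):

    import pickle

    session = {"user": "alice"}
    # Protocol 2 is the highest protocol Python 2 can read, so writing
    # with it keeps data interchangeable while both versions coexist.
    blob = pickle.dumps(session, protocol=2)
    assert pickle.loads(blob) == session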
General lessons
Writing this over a year after Python 2's end-of-life date, and certainly nowhere near the leading edge of Python 3 porting work, it's perhaps more
useful to look at this in terms of the lessons it has for other large
technical debt projects.
I mentioned in my previous article that
I used the approach of an enormous and frequently-rebased git branch as a
working area for the port, committing often and sometimes combining and
extracting commits for review once they seemed to be ready. A port of this
scale would have been entirely intractable without a tool of similar power
to git rebase, so I'm very glad that we finished migrating to git in 2019. I relied on this right up to the end of the port, and it also allowed for quick assessments of how much more there was to land. git worktree was also helpful, in that I could easily maintain working trees built for each of Python 2 and 3 for comparison.
As is usual for most multi-developer projects, all changes to Launchpad need
to go through code review, although we sometimes make exceptions for very
simple and obvious changes that can be self-reviewed. Since I knew from the
outset that this was going to generate a lot of changes for review, I
therefore structured my work from the outset to try to make it as easy as
possible for my colleagues to review it. This generally involved keeping
most changes to a somewhat manageable size of 800 lines or less (although
this wasn't always possible), and arranging commits mainly according to the
kind of change they made rather than their location. For example, when I
needed to fix issues with /
in Python 3 being true division rather than
floor division, I did so in one
commit
across the various places where it mattered and took care not to mix it with
other unrelated changes. This is good practice for nearly any kind of
development, but it was especially important here since it allowed reviewers
to consider a clear explanation of what I was doing in the commit message
and then skim-read the rest of it much more quickly.
It was vital to keep the codebase in a working state at all times, and
deploy to production reasonably often: this way if something went wrong the
amount of code we had to debug to figure out what had happened was always
tractable. (Although I can't seem to find it now to link to it, I saw an account a while back of a company that had taken a flag-day approach instead with a large codebase. It seemed to work for them, but I'm certain we couldn't have made it work for Launchpad.)
I can't speak too highly of Launchpad's test suite, much of which originated
before my time. Without a great deal of extensive coverage of all sorts of
interesting edge cases at both the unit and functional level, and a
corresponding culture of maintaining that test suite well when making new
changes, it would have been impossible to be anything like as confident of
the port as we were.
As part of the porting work, we split out a couple of substantial chunks of
the Launchpad codebase that could easily be decoupled from the core: its
Mailman integration and its code import
worker. Both of these had substantial
dependencies with complex requirements for porting to Python 3, and
arranging to be able to do these separately on their own schedule was
absolutely worth it. Like disentangling balls of wool, any opportunity you
can take to make things less tightly-coupled is probably going to make it
easier to disentangle the rest. (I can see a tractable way forward to
porting the code import worker, so we may well get that done soon. Our
Mailman integration will need to be rewritten, though, since it currently
depends on the Python-2-only Mailman 2, and Mailman 3 has a different architecture.)
Python lessons
Our database layer was already in pretty good
shape for a port, since at least the modern bits of its table modelling
interface were already strict about using Unicode for text columns. If you
have any kind of pervasive low-level framework like this, then making it be
pedantic at you in advance of a Python 3 port will probably incur much less
swearing in the long run, as you won't be trying to deal with quite so many
bytes/text issues at the same time as everything else.
Early in our port, we established a standard set of __future__ imports and started incrementally converting files over to them, mainly because we weren't yet sure what else to do and it seemed likely to be helpful. absolute_import was definitely reasonable (and not often a problem in our code), and print_function was annoying but necessary. In hindsight I'm not sure about unicode_literals, though. For files that only deal with bytes and text it was reasonable enough, but as I mentioned above there were also a number of cases where we needed literals of the language's native str type, i.e. bytes in Python 2 and text in Python 3: this was particularly noticeable in WSGI contexts, but also cropped up in some other surprising places. We generally either omitted unicode_literals or used six.ensure_str in such cases, but it was definitely a bit awkward and maybe I should have listened more to people telling me it might be a bad idea.
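The set in question is short (sketch; these are the three imports named above):

    from __future__ import absolute_import, print_function, unicode_literals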
A lot of Launchpad's early tests used doctest, mainly in the style where you have text files that interleave narrative commentary with examples. The development team later reached consensus that this was best avoided in most cases, but by then there were far too many doctests to conveniently rewrite in some other form. Porting doctests to Python 3 is really annoying. You run into all the little changes in how objects are represented as text (particularly u'...' versus '...', but plenty of other cases as well); you have next to no tools to do anything useful like skipping individual bits of a doctest that don't apply; using __future__ imports requires the rather obscure approach of adding the relevant names to the doctest's globals in the relevant DocFileSuite or DocTestSuite; dealing with many exception tracebacks requires something like zope.testing.renormalizing; and whatever code refactoring tools you're using probably don't work properly. Basically, don't have done that. It did all turn out to be tractable for us in the end, and I managed to avoid using much in the way of fragile doctest extensions aside from the aforementioned zope.testing.renormalizing, but it was not an enjoyable experience.
Regressions
I know of nine regressions that reached Launchpad's production systems as a
result of this porting work; of course there were various other regressions
caught by CI or in manual testing. (Considering the size of this project, I
count it as a resounding success that there were only nine production
issues, and that for the most part we were able to fix them quickly.)
Equality testing of removed database objects
One of the things we had to do while porting to Python 3 was to implement the __eq__, __ne__, and __hash__ special methods for all our database objects. This was quite conceptually fiddly, because doing this requires knowing each object's primary key, and that may not yet be available if we've created an object in Python but not yet flushed the actual INSERT statement to the database (most of our primary keys are auto-incrementing sequences). We thus had to take care to flush pending SQL statements in such cases in order to ensure that we know the primary keys.
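The shape of the thing, roughly (a sketch with hypothetical names, not Launchpad's actual code):

    def flush_database_updates():
        # Stand-in for the real "flush pending INSERTs" machinery.
        pass

    class DatabaseObject:
        id = None  # set once the row has been flushed

        def _primary_key(self):
            flush_database_updates()  # ensure the auto-incremented id is known
            return (type(self), self.id)

        def __eq__(self, other):
            return (isinstance(other, DatabaseObject)
                    and self._primary_key() == other._primary_key())

        def __ne__(self, other):
            return not self.__eq__(other)

        def __hash__(self):
            return hash(self._primary_key())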
However, it's possible to have a problem at the other end of the object lifecycle: that is, a Python object might still be reachable in memory even though the underlying row has been DELETEd from the database. In most cases we don't keep removed objects around for obvious reasons, but it can happen in caching code, and buildd-manager crashed as a result (in fact while it was still running on Python 2). We had to take extra care to avoid this problem.
Debian imports crashed on non-UTF-8 filenames
Python 2 has some unfortunate behaviour around passing bytes or Unicode strings (depending on the platform) to shutil.rmtree, and the combination of some porting work and a particular source package in Debian that contained a non-UTF-8 file name caused us to run into this. The fix was to ensure that the argument passed to shutil.rmtree is a str regardless of Python version.
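That sort of fix tends to look like this (a sketch of the idea, not the actual commit):

    import shutil
    import six

    def remove_tree(path):
        # six.ensure_str returns the native str type: it encodes text to
        # UTF-8 bytes on Python 2 and leaves text alone on Python 3.
        shutil.rmtree(six.ensure_str(path))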
We'd actually run into something similar before: it's a subtle porting gotcha, since it's quite easy to end up passing Unicode strings to shutil.rmtree if you're in the process of porting your code to Python 3, and you might easily not notice if the file names in your tests are all encoded using UTF-8.
lazr.restful ETags
We eventually got far enough along that we could switch one of our four
appserver machines (we have quite a number of other machines too, but the
appservers handle web and API requests) to Python 3 and see what happened.
By this point our extensive test suite had shaken out the vast majority of
the things that could go wrong, but there was always going to be room for
some interesting edge cases.
One of the Ubuntu kernel team reported that they were seeing an increase in 412 Precondition Failed errors in some of their scripts that use our webservice API. These can happen when you're trying to modify an existing resource: the underlying protocol involves sending an If-Match header with the ETag that the client thinks the resource has, and if this doesn't match the ETag that the server calculates for the resource then the client has to refresh its copy of the resource and try again. We initially thought that this might be legitimate, since it can happen in normal operation if you collide with another client making changes to the same resource, but it soon became clear that something stranger was going on: we were getting inconsistent ETags for the same object even when it was unchanged. Since we'd recently switched a quarter of our appservers to Python 3, that was a natural suspect.
Our lazr.restful package provides the framework for our webservice API, and roughly speaking it generates ETags by serializing objects into some kind of canonical form and hashing the result. Unfortunately the serialization was dependent on the Python version in a few ways, and in particular it serialized lists of strings such as lists of bug tags differently: Python 2 used [u'foo', u'bar', u'baz'] where Python 3 used ['foo', 'bar', 'baz']. In lazr.restful 1.0.3 we switched to using JSON for this, removing the Python version dependency and ensuring consistent behaviour between appservers.
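Hashing a canonical JSON serialization is version-independent, because JSON has no u'' prefixes (a sketch, not lazr.restful's actual code):

    import hashlib
    import json

    def make_etag(state):
        # json.dumps produces the same text on Python 2 and 3 for the
        # same data, unlike repr() of a list of strings.
        canonical = json.dumps(state, sort_keys=True)
        return hashlib.sha1(canonical.encode("UTF-8")).hexdigest()

    make_etag({"tags": ["foo", "bar", "baz"]})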
Memory leaks
This problem took the longest to solve. We noticed fairly quickly from our graphs that the appserver machine we'd switched to Python 3 had a serious memory leak. Our appservers had always been a bit leaky, but now it wasn't so much "a small hole that we can bail occasionally" as "the boat is sinking rapidly". (Yes, this got in the way of working out what was going on with ETags for a while.) I spent ages messing around with various attempts to fix this. Since only a quarter of our appservers were affected, and we could get by on 75% capacity for a while, it wasn't urgent but it was definitely annoying.
After spending some quality time with
objgraph, for
some time I thought traceback reference
cycles
might be at fault, and I sent a number of fixes to various upstream projects
for those (e.g.
zope.pagetemplate).
Those didn t help the leaks much though, and after a while it became clear
to me that this couldn't be the sole problem: Python has a cyclic garbage
collector that will eventually collect reference cycles as long as there are
no strong references to any objects in them, although it might not happen
very quickly. Something else must be going on.
Debugging reference leaks in any non-trivial and long-running Python program
is extremely arduous, especially with ORMs that naturally tend to end up
with lots of cycles and caches. After a while I formed a hypothesis that
zope.server might be keeping a
strong reference to something, although I never managed to nail it down more
firmly than that. This was an attractive theory as we were already in the
process of migrating to Gunicorn for
other reasons anyway, and Gunicorn also has a convenient
max_requests
setting that's good at mitigating memory leaks. Getting this all in place took some time, but once we did we found that everything was much more stable. This isn't completely satisfying, as we never quite got to the bottom of the leak itself, and it's entirely possible that we've only papered over it using max_requests: I expect we'll gradually back off on how frequently we restart workers over time to try to track this down. However, pragmatically, it's no longer an operational concern.
Mirror prober HTTPS proxy handling
After we switched our script servers to Python 3, we had several reports of
mirror probing
failures. (Launchpad
keeps lists of Ubuntu archive and image mirrors, and probes them every so
often to check that they're reasonably complete and up to date.) This only
affected HTTPS mirrors when probed via a proxy server, support for which is
a relatively recent feature in Launchpad and involved some code that we
never managed to unit-test properly: of course this is exactly the code that
went wrong. Sadly I wasn't able to sort out that gap, but at least the fix was simple.
Non-MIME-encoded email headers
As I mentioned above, there were substantial changes in the email package between Python 2 and 3, and indeed between minor versions of Python 3. Our test coverage here is pretty good, but it's an area where it's very easy to have gaps. We noticed that a script that processes incoming email was crashing on messages with headers that were non-ASCII but not MIME-encoded (and indeed then crashing again when it tried to send a notification of the crash!). The only examples of these I looked at were spam, but we still didn't want to crash on them. The fix involved being somewhat more careful about both the handling of headers returned by Python's email parser and the building of outgoing email notifications. This seems to be working well so far, although I wouldn't be surprised to find the odd other incorrect detail in this sort of area.
Failure to handle non-ISO-8859-1 URL-encoded form input
Remember how I said that parsing HTTP form data was thorny? After we
finished upgrading all our appservers to Python 3, people started reporting
that they couldn't post Unicode comments to bugs, which turned out to be only if the attempt was made using JavaScript, and was because I hadn't quite managed to get URL-encoded form data working properly with zope.publisher and multipart. The current standard describes the URL-encoded format for form data as "in many ways an aberrant monstrosity", so this was no great surprise. Part of the problem was some very strange choices in zope.publisher dating back to 2004 or earlier, which I attempted to clean up and simplify.
The rest was that Python 2's urlparse.parse_qs unconditionally decodes percent-encoded sequences as ISO-8859-1 if they're passed in as part of a Unicode string, so multipart needs to work around this on Python 2. I'm still not completely confident that this is correct in all situations, but at least now that we're on Python 3 everywhere the matrix of cases we need to care about is smaller.
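The Python 2 behaviour in question (a sketch of an interpreter session):

    # Python 2.7
    >>> import urlparse
    >>> urlparse.parse_qs(u'name=caf%C3%A9')
    {u'name': [u'caf\xc3\xa9']}
    # The UTF-8 bytes were decoded as ISO-8859-1: mojibake.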
Inconsistent marshalling of Loggerhead's disk cache
We use Loggerhead for providing web
browsing of Bazaar branches. When we upgraded one of its two servers to
Python 3, we immediately noticed that the one still on Python 2 was failing
to read back its revision information cache, which it stores in a database
on disk. (We noticed this because it caused a deployment to fail: when we
tried to roll out new code to the instance still on Python 2, Nagios checks
had already caused an incompatible cache to be written for one branch from
the Python 3 instance.)
This turned out to be a similar problem to the pickle issue mentioned above, except this one was with marshal, which I didn't think to look for because it's a relatively obscure module mostly used for internal purposes by Python itself; I'm not sure that Loggerhead should really be using it in the first place. The fix was relatively straightforward, complicated mainly by now needing to cope with throwing away unreadable cache data.
Ironically, if we'd just gone ahead and taken the nominally riskier path of
upgrading both servers at the same time, we might never have had a problem here.
Intermittent bzr failures
Finally, after we upgraded one of our two Bazaar codehosting servers to
Python 3, we had a
report of intermittent
bzr branch
hangs. After some digging I found this in our logs:
Traceback (most recent call last):
...
File "/srv/bazaar.launchpad.net/production/codehosting1-rev-20124175fa98fcb4b43973265a1561174418f4bd/env/lib/python3.5/site-packages/twisted/conch/ssh/channel.py", line 136, in addWindowBytes
self.startWriting()
File "/srv/bazaar.launchpad.net/production/codehosting1-rev-20124175fa98fcb4b43973265a1561174418f4bd/env/lib/python3.5/site-packages/lazr/sshserver/session.py", line 88, in startWriting
resumeProducing()
File "/srv/bazaar.launchpad.net/production/codehosting1-rev-20124175fa98fcb4b43973265a1561174418f4bd/env/lib/python3.5/site-packages/twisted/internet/process.py", line 894, in resumeProducing
for p in self.pipes.itervalues():
builtins.AttributeError: 'dict' object has no attribute 'itervalues'
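The last frame gives it away: dict.itervalues() was removed in Python 3. The general shape of the fix for this pattern (a sketch, not the actual upstream patch):

    pipes = {1: "stdout", 2: "stderr"}

    # Python 2 only:
    #   for p in pipes.itervalues(): ...

    # Works on both Python 2 and 3:
    for p in pipes.values():
        print(p)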
:wq for today.
The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor. The FOSS world uses a wide variety of different build tools; given a git repository or tarball, it can be hard to figure out how to build and install a piece of software. Humans will generally know what build tool a project is using when they check out a project from git, or they can read the README. And even then, the answer may not always be straightforward to everybody. For automation, there is no obvious place to figure out how to build or install a project.
% git clone https://github.com/dulwich/dulwich
% cd dulwich
% ogni --schroot=unstable-amd64-sbuild dist
Writing dulwich-0.20.21/setup.cfg
creating dist
Creating tar archive
removing 'dulwich-0.20.21' (and everything under it)
Found new tarball dulwich-0.20.21.tar.gz in /var/run/schroot/mount/unstable-amd64-sbuild-974d32d7-6f10-4e77-8622-b6a091857e85/build/tmpucazj7j7/package/dist.
% wget https://download.samba.org/pub/ldb/ldb-2.3.0.tar.gz
% tar xvfz ldb-2.3.0.tar.gz
% cd ldb-2.3.0
% ogni install --prefix=/tmp/ldb
+ install /tmp/ldb/include/ldb.h (from include/ldb.h)
Waf: Leaving directory `/tmp/ldb-2.3.0/bin/default'
'install' finished successfully (11.395s)
% wget https://cpan.metacpan.org/authors/id/T/TO/TORU/XML-LibXML-LazyBuilder-0.08.tar.gz
% tar xvfz XML-LibXML-LazyBuilder-0.08.tar.gz
% cd XML-LibXML-LazyBuilder-0.08
% ogni test
The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor.

The upstream ontologist is a project that extracts metadata about upstream projects in a consistent format. It does this with a combination of heuristics and reading ecosystem-specific metadata files, such as Python's setup.py, Rust's Cargo.toml, as well as e.g. scanning README files.
% guess-upstream-metadata
<string>:2: (INFO/1) Duplicate implicit target name: "contributing".
Name: dulwich
Repository: https://www.dulwich.io/code/
X-Security-MD: https://github.com/dulwich/dulwich/tree/HEAD/SECURITY.md
X-Version: 0.20.21
Bug-Database: https://github.com/dulwich/dulwich/issues
X-Summary: Python Git Library
X-Description:
This is the Dulwich project.
It aims to provide an interface to git repos (both local and remote) that
doesn't call out to git directly but instead uses pure Python.
X-License: Apache License, version 2 or GNU General Public License, version 2 or later.
Bug-Submit: https://github.com/dulwich/dulwich/issues/new
[1] Obviously this won't be able to describe the full licensing situation for many projects. Projects like scancode-toolkit are more appropriate for that.
The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor.

In my last blogpost, I introduced the buildlog consultant - a tool that can identify many reasons why a Debian build failed. For example, here's a fragment of a build log where the Build-Depends lack python3-setuptools:
849  dpkg-buildpackage: info: host architecture amd64
850   fakeroot debian/rules clean
851      dh clean --with python3,sphinxdoc --buildsystem=pybuild
852         dh_auto_clean -O--buildsystem=pybuild
853  I: pybuild base:232: python3.9 setup.py clean
854  Traceback (most recent call last):
855    File "/<<PKGBUILDDIR>>/setup.py", line 2, in <module>
856      from setuptools import setup
857  ModuleNotFoundError: No module named 'setuptools'
858  E: pybuild pybuild:353: clean: plugin distutils failed with: exit code=1: python3.9 setup.py clean
% analyse-sbuild-log --json ~/build.log
{
    "stage": "build",
    "section": "Build",
    "lineno": 857,
    "kind": "missing-python-module",
    "details": {"module": "setuptools", "python_version": 3, "minimum_version": null}
}
% apt-file search /usr/lib/python3/dist-packages/setuptools/__init__.py
python3-setuptools: /usr/lib/python3/dist-packages/setuptools/__init__.py
% deb-fix-build
Using output directory /tmp/tmpyz0nkgqq
Using sbuild chroot unstable-amd64-sbuild
Using fixers:
Building debian packages, running 'sbuild --no-clean-source -A -s -v'.
Attempting to use fixer upstream requirement fixer(apt) to address MissingPythonDistribution('setuptools_scm', python_version=3, minimum_version='4')
Using apt-file to search apt contents
Adding build dependency: python3-setuptools-scm (>= 4)
Building debian packages, running 'sbuild --no-clean-source -A -s -v'.
Attempting to use fixer upstream requirement fixer(apt) to address MissingPythonDistribution('toml', python_version=3, minimum_version=None)
Adding build dependency: python3-toml
Building debian packages, running 'sbuild --no-clean-source -A -s -v'.
Built 0.5.2-1- changes files at [ saneyaml_0.5.2-1_amd64.changes ].
% git log -p
commit 5a1715f4c7273b042818fc75702f2284034c7277 (HEAD -> master)
Author: Jelmer Vernooij <jelmer@jelmer.uk>
Date: Sun Apr 4 02:35:56 2021 +0100
Add missing build dependency on python3-toml.
diff --git a/debian/control b/debian/control
index 5b854dc..3b27b73 100644
--- a/debian/control
+++ b/debian/control
@@ -1,6 +1,6 @@
Rules-Requires-Root: no
Standards-Version: 4.5.1
-Build-Depends: debhelper-compat (= 12), dh-sequence-python3, python3-all, python3-setuptools (>= 50), python3-wheel, python3-setuptools-scm (>= 4)
+Build-Depends: debhelper-compat (= 12), dh-sequence-python3, python3-all, python3-setuptools (>= 50), python3-wheel, python3-setuptools-scm (>= 4), python3-toml
Testsuite: autopkgtest-pkg-python
Source: python-saneyaml
Priority: optional
commit f03047da80fcd8468ee231fbc4cf8488d7a0acd1
Author: Jelmer Vernooij <jelmer@jelmer.uk>
Date: Sun Apr 4 02:35:34 2021 +0100
Add missing build dependency on python3-setuptools-scm (>= 4).
diff --git a/debian/control b/debian/control
index a476cc2..5b854dc 100644
--- a/debian/control
+++ b/debian/control
@@ -1,6 +1,6 @@
Rules-Requires-Root: no
Standards-Version: 4.5.1
-Build-Depends: debhelper-compat (= 12), dh-sequence-python3, python3-all, python3-setuptools (>= 50), python3-wheel
+Build-Depends: debhelper-compat (= 12), dh-sequence-python3, python3-all, python3-setuptools (>= 50), python3-wheel, python3-setuptools-scm (>= 4)
Testsuite: autopkgtest-pkg-python
Source: python-saneyaml
Priority: optional
Next.