Jonathan Dowland: Nine Inch Nails, Cornwall, June

- ninlive.com recordings for Friday
- setlist.fm page for Friday (nerdy song stats etc)
type ecdsaSkKeyMsg struct {
    Type        string `sshtype:"17|25"`
    Curve       string
    PubKeyBytes []byte
    RpId        string
    Flags       uint8
    KeyHandle   []byte
    Reserved    []byte
    Comments    string
    Constraints []byte `ssh:"rest"`
}
Where Type is ssh.KeyAlgoSKECDSA256, Curve is "nistp256", RpId is the identity of the relying party (eg, "webauthn.io"), Flags is 0x1 if you want the user to have to touch the key, and KeyHandle is the hardware token's representation of the key (basically an opaque blob that's sufficient for the token to regenerate the keypair - this is generally stored by the remote site and handed back to you when it wants you to authenticate). The other fields can be ignored, other than PubKeyBytes, which is supposed to be the public half of the keypair.
type ecSig struct {
    R *big.Int
    S *big.Int
}
and then ssh.Unmarshal the Rest member to
type authData struct {
    Flags    uint8
    SigCount uint32
}
The signature needs to be converted back to a DER-encoded ASN.1 structure (eg,
var b cryptobyte.Builder
b.AddASN1(asn1.SEQUENCE, func(b *cryptobyte.Builder) {
    b.AddASN1BigInt(ecSig.R)
    b.AddASN1BigInt(ecSig.S)
})
signatureDER, _ := b.Bytes()
), and then you need to construct the Authenticator Data structure. For this, take the RpId used earlier and generate the sha256. Append the one byte Flags variable, and then convert SigCount to big endian and append those 4 bytes. You should now have a 37 byte structure. This needs to be CBOR encoded (I used github.com/fxamacker/cbor and just called cbor.Marshal(data, cbor.EncOptions{})).
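For illustration only, here is a minimal Go sketch of that 37-byte construction; the package name and the buildAuthData helper are hypothetical (not part of any existing API), and flags and sigCount would come from the authData structure unmarshalled above:
package sketch // hypothetical helper package, only for illustration

import (
    "crypto/sha256"
    "encoding/binary"
)

// buildAuthData assembles the 37-byte blob described above:
// sha256(RpId) || Flags || big-endian SigCount.
func buildAuthData(rpId string, flags uint8, sigCount uint32) []byte {
    rpIdHash := sha256.Sum256([]byte(rpId)) // 32 bytes
    out := make([]byte, 0, 37)
    out = append(out, rpIdHash[:]...)
    out = append(out, flags) // 1 byte
    var cnt [4]byte
    binary.BigEndian.PutUint32(cnt[:], sigCount) // 4 bytes, big endian
    out = append(out, cnt[:]...)
    return out // this is the blob that then gets CBOR encoded
}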
rec-def
library: In Norman Ramsey's very nice talk about his Functional Pearl "Beyond Relooper: Recursive Translation of Unstructured Control Flow to Structured Control Flow", he had the following slide showing the equation for the dominators of a node in a graph:
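(The slide itself is not reproduced here; the equation it shows is, in the notation of the code below, essentially
$$\mathrm{dom}(v) \;=\; \{v\} \,\cup \bigcap_{p \,\in\, \mathrm{preds}(v)} \mathrm{dom}(p)$$
i.e. every node dominates itself, together with everything that dominates all of its predecessors.)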
To express this in Haskell we need a function to intersect a list of sets from Data.Set, and also a function to find the predecessors of a node in a graph:
import qualified Data.Set as S
import qualified Data.Map as M
intersections :: [S.Set Int] -> S.Set Int
intersections [] = S.empty
intersections xs = foldl1 S.intersection xs
preds :: [(Int,Int)] -> M.Map Int [Int]
preds edges = M.fromListWith (<>) $
[ (v1, []) | (v1, _) <- edges ] ++ -- to make the map total
[ (v2, [v1]) | (v1, v2) <- edges ]
domintors1 :: [(Int,Int)] -> M.Map Int [Int]
domintors1 edges = fmap S.toList doms
  where
    doms :: M.Map Int (S.Set Int)
    doms = M.mapWithKey
        (\v vs -> S.insert v (intersections [ doms M.! v' | v' <- vs]))
        (preds edges)
ghci> domintors1 []
fromList []
ghci> domintors1 [(1,2)]
fromList [(1,[1]),(2,[1,2])]
ghci> domintors1 [(1,2),(1,3),(2,4),(3,4)]
fromList [(1,[1]),(2,[1,2]),(3,[1,3]),(4,[1,4])]
Let us try the same with Data.Recursive.Set.
import qualified Data.Recursive.Set as RS
intersections :: [RS.RSet Int] -> RS.RSet Int
intersections [] = RS.empty
intersections xs = foldl1 RS.intersection xs
domintors2 :: [(Int,Int)] -> M.Map Int [Int]
domintors2 edges = fmap (S.toList . RS.get) doms
where
doms :: M.Map Int (RS.RSet Int)
doms = M.mapWithKey
(\v vs -> RS.insert v (intersections [ doms M.! v' | v' <- vs]))
(preds edges)
Unfortunately, this does not quite do what we want, because Data.Recursive.Set calculates, as documented, the least fixed point.
What now? Until the library has code for RDualSet a, we can work around this by using the dual formula to calculate the non-dominators: we swap intersections for unions, insert for delete and, instead of S.empty, use the set of all nodes (which requires some extra plumbing):
unions' :: S.Set Int -> [RS.RSet Int] -> RS.RSet Int
unions' univ [] = mkR univ
unions' _ xs = foldl1 RS.union xs

domintors3 :: [(Int,Int)] -> M.Map Int [Int]
domintors3 edges = fmap (S.toList . S.difference nodes . RS.get) nonDoms
  where
    nodes = S.fromList [v | (v1,v2) <- edges, v <- [v1,v2]]
    nonDoms :: M.Map Int (RS.RSet Int)
    nonDoms = M.mapWithKey
        (\v vs -> RS.delete v (unions' nodes [ nonDoms M.! v' | v' <- vs]))
        (preds edges)
ghci> domintors3 [(1,2),(1,3),(2,4),(3,4),(4,3)]
fromList [(1,[1]),(2,[1,2]),(3,[1,3]),(4,[1,4])]
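(Spelled out, and only restating the code above rather than quoting the talk: the non-dominators satisfy
$$\mathrm{nonDom}(v) \;=\; \Bigl(\bigcup_{p \,\in\, \mathrm{preds}(v)} \mathrm{nonDom}(p)\Bigr) \setminus \{v\}, \qquad \mathrm{dom}(v) \;=\; V \setminus \mathrm{nonDom}(v)$$
where $V$ is the set of all nodes; the least fixed point of this dual system yields, by complementation, the greatest fixed point of the original equation, which is the one we actually want for dominators.)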
We worked a little bit on how to express the beautiful formula in Haskell, but at no point did we have to think about how to solve it. To me, this is the essence of declarative programming.
I recently showed the rec-def Haskell library to a few people. As this crowd appreciates writing compilers, an example from the realm of program analysis is quite compelling.
The analysis carries around an environment env recording which of the variables may throw an exception:
canThrow1 :: Exp -> Bool
canThrow1 = go M.empty
  where
    go :: M.Map Var Bool -> Exp -> Bool
    go env (Var v)       = env M.! v
    go env Throw         = True
    go env (Catch e)     = False
    go env (Lam v e)     = go (M.insert v False env) e
    go env (App e1 e2)   = go env e1 || go env e2
    go env (Let v e1 e2) = go env' e2
      where
        env_bind = M.singleton v (go env e1)
        env'     = M.union env_bind env
The interesting case is Let, where we extend the environment env with the information about the additional variable (env_bind), which is calculated from analyzing the right-hand side e1.
So far so good:
ghci> someVal = Lam "y" (Var "y")
ghci> canThrow1 $ Throw
True
ghci> canThrow1 $ Let "x" Throw someVal
False
ghci> canThrow1 $ Let "x" Throw (App (Var "x") someVal)
True
But how do we handle LetRec in canThrow1? Let us naively follow the pattern used for Let: calculate the analysis information for the variables in env_bind, extend the environment with that, and pass it down:
go env (LetRec binds e) = go env' e
  where
    env_bind = M.fromList [ (v, go env' e) | (v,e) <- binds ]
    env' = M.union env_bind env
Note that we use env', and not just env, when analyzing the right-hand sides. It has to be that way, as all the variables are in scope in all the right-hand sides.
In a strict language, such a mutually recursive definition, where env_bind uses env', which uses env_bind, is basically unthinkable. But in a lazy language like Haskell, it might just work.
Unfortunately, it works only as long as the recursive bindings are not actually recursive, or if they are recursive, they are not used:
ghci> canThrow1 $ LetRec [("x", Throw)] (Var "x")
True
ghci> canThrow1 $ LetRec [("x", App (Var "y") someVal), ("y", Throw)] (Var "x")
True
ghci> canThrow1 $ LetRec [("x", App (Var "x") someVal), ("y", Throw)] (Var "y")
True
But with genuine recursion, it does not work, and simply goes into a recursive cycle:
ghci> canThrow1 $ LetRec [("x", App (Var "x") someVal), ("y", Throw)] (Var "x")
^CInterrupted.
That is disappointing! Do we really have to toss that code and somehow do an explicit fixed-point calculation here? Obscuring our nice declarative code? And possibly having to repeat work (such as traversing the syntax tree) many times that we should only have to do once?
With rec-def, using RBool from Data.Recursive.Bool instead of Bool, we can write the exact same code, as follows:
import qualified Data.Recursive.Bool as RB
canThrow2 :: Exp -> Bool
canThrow2 = RB.get . go M.empty
where
go :: M.Map Var RBool -> Exp -> RBool
go env (Var v) = env M.! v
go env Throw = RB.true
go env (Catch e) = RB.false
go env (Lam v e) = go (M.insert v RB.false env) e
go env (App e1 e2) = go env e1 RB.|| go env e2
go env (Let v e1 e2) = go env' e2
where
env_bind = M.singleton v (go env e1)
env' = M.union env_bind env
go env (LetRec binds e) = go env' e
where
env_bind = M.fromList [ (v, go env' e) | (v,e) <- binds ]
env' = M.union env_bind env
ghci> canThrow2 $ LetRec [("x", App (Var "x") someVal), ("y", Throw)] (Var "x")
False
ghci> canThrow2 $ LetRec [("x", App (Var "x") Throw), ("y", Throw)] (Var "x")
True
I find this much more pleasing than the explicit naive fix-pointing you might do otherwise, where you stabilize the result at each LetRec
independently: Not only is all that extra work hidden from the programmer, but now also a single traversal of the syntax tree creates, thanks to the laziness, a graph of RBool
values, which are then solved under the hood.
There is a caveat, though: canThrow2 fails to produce a result in case we hit x=x:
ghci> canThrow2 $ LetRec [("x", Var "x")] (Var "x")
^CInterrupted.
This is, after all the syntax tree has been processed and all the map lookups have been resolved, equivalent to
ghci> let x = x in RB.get (x :: RBool)
^CInterrupted.
The rec-def machinery can only kick in if at least one of its functions is used on any such cycle, even if it is just a form of identity (which I ~ought to add to the library~ have since added to the library):
ghci> idR x = RB.false RB.|| x
ghci> let x = idR x in RB.get (x :: RBool)
False
And indeed, if I insert a call to idR in the line defining env_bind, then our analyzer will no longer stumble over these nasty recursive equations:
ghci> canThrow2 $ LetRec [("x", Var "x")] (Var "x")
False
It is a bit disappointing to have to do that, but I do not see a better way yet. I guess the rec-def library expects the programmer to have a similar level of sophistication as other tie-the-knot tricks with laziness (where you also have to ensure that your definitions are productive and that the sharing is not accidentally lost).
iked
can start.
Some months ago I switched my Debian laptop's configuration from the traditional ifupdown to systemd-networkd
.
It took me some time to figure out how to have systemd-networkd
create dummy interfaces on which iked
can install addresses, but also not interfere with iked
by trying to manage these interfaces.
Here is my working configuration.
First, I have systemd create the interface dummy1
by creating a systemd.netdev(5)
configuration file at /etc/systemd/network/20-dummy1.netdev
:
[NetDev]
Name=dummy1
Kind=dummy
Then I tell systemd not to manage this interface by creating a systemd.network(5)
configuration file at /etc/systemd/network/20-dummy1.network
:
[Match]
Name=dummy1
Unmanaged=yes
Restarting systemd-networkd causes these interfaces to get created, and we can then check their status using networkctl(8)
:
$ systemctl restart systemd-networkd.service
$ networkctl
IDX LINK TYPE OPERATIONAL SETUP
1 lo loopback carrier unmanaged
2 enp2s0f0 ether off unmanaged
3 enp5s0 ether off unmanaged
4 dummy1 ether degraded configuring
5 dummy3 ether degraded configuring
6 sit0 sit off unmanaged
8 wlp3s0 wlan routable configured
9 he-ipv6 sit routable configured
8 links listed.
Next I configure the connection in /etc/iked.conf, making sure to assign the received address to the interface dummy1.
ikev2 'hades' active esp \
from dynamic to 10.0.1.0/24 \
peer hades.rak.ac \
srcid '/CN=asteria.rak.ac' \
dstid '/CN=hades.rak.ac' \
request address 10.0.1.103 \
iface dummy1
Restarting openiked and checking the status of the interface reveals that it has been assigned an address on the internal network and that it is routable:
$ systemctl restart openiked.service
$ networkctl status dummy1
4: dummy1
Link File: /usr/lib/systemd/network/99-default.link
Network File: /etc/systemd/network/20-dummy1.network
Type: ether
Kind: dummy
State: routable (configured)
Online state: online
Driver: dummy
Hardware Address: 22:50:5f:98:a1:a9
MTU: 1500
QDisc: noqueue
IPv6 Address Generation Mode: eui64
Queue Length (Tx/Rx): 1/1
Address: 10.0.1.103
fe80::2050:5fff:fe98:a1a9
DNS: 10.0.1.1
Route Domains: .
Activation Policy: up
Required For Online: yes
DHCP6 Client DUID: DUID-EN/Vendor:0000ab11aafa4f02d6ac68d40000
gpg -q -d | zstdcat -T0 | zfs receive -u -o readonly=on "$STORE/$DEST"
This processes tens of thousands of zfs sends per week. Recently, having written Filespooler, I switched to sending the backups using Filespooler over NNCP. Now fspl (the Filespooler executable) opens the file for each stream and then connects it to what amounts to this pipeline:
bash -c 'gpg -q -d 2>/dev/null | zstdcat -T0' | zfs receive -u -o readonly=on "$STORE/$DEST"
Actually, to be more precise, it spins up the bash part of it, reads a few bytes from it, and then connects it to the zfs receive. And this works well almost always. In something like 1/1000 of the cases, it deadlocks, and I still don't know why. But I can talk about the journey of trying to figure it out (and maybe some of you will have some ideas). Filespooler is written in Rust, and uses Rust's Command system to spawn that bash command and wire its output into the zfs receive.
On the ZFS side, the receive code contains this check:
int err = zfs_file_read(fp, (char *)buf + done, len - done, &resid);
if (resid == len - done) {
	/*
	 * Note: ECKSUM or ZFS_ERR_STREAM_TRUNCATED indicates
	 * that the receive was interrupted and can
	 * potentially be resumed.
	 */
	err = SET_ERROR(ZFS_ERR_STREAM_TRUNCATED);
}
resid is an output parameter with the number of bytes remaining from a short read, so in this case, if the read produced zero bytes, then it sets that error. What's zfs_file_read then? It boils down to a thin wrapper around kernel_read(). This winds up calling __kernel_read(), which calls read_iter on the pipe, which is pipe_read(). That's where I don't have the knowledge to get into the weeds right now.
So it seems likely to me that the problem has something to do with zfs receive. But, what, and why does it only not work in this one very specific situation, and only so rarely? And why does attaching strace to zstdcat make it all work again? I'm indeed puzzled!
Update 2022-06-20: See the followup post which identifies this as likely a kernel bug and explains why this particular use of Filespooler made it easier to trigger.
We currently use cookies to support our use of Google Analytics on the Website and Service. Google Analytics collects information about how you use the Website and Service. [...] This helps us to provide you with a good experience when you browse our Website and use our Service and also allows us to improve our Website and our Service.
When I asked Matrix people about why they were using Google Analytics, they explained this was for development purposes and they were aiming for velocity at the time, not privacy (paraphrasing here). They also included a "free to snitch" clause:
If we are or believe that we are under a duty to disclose or share your personal data, we will do so in order to comply with any legal obligation, the instructions or requests of a governmental authority or regulator, including those outside of the UK.
Those are really broad terms, above and beyond what is typically expected legally. Like the current retention policies, such user tracking and ... "liberal" collaboration practices with the state set a bad precedent for other home servers. Thankfully, since the above policy was published (2017), the GDPR was "implemented" (2018) and it seems like both the Element.io privacy policy and the Matrix.org privacy policy have been somewhat improved since. Notable points of the new privacy policies:
For the matrix.org service:
We will forget your copy of your data upon your request. We will also forward your request to be forgotten onto federated homeservers. However - these homeservers are outside our span of control, so we cannot guarantee they will forget your data.
It's great they implemented those mechanisms and, after all, if there's a hostile party in there, nothing can prevent them from using screenshots to just exfiltrate your data away from the client side anyways, even with services typically seen as more secure, like Signal. As an aside, I also appreciate that Matrix.org has a fairly decent code of conduct, based on the TODO CoC which checks all the boxes in the geekfeminism wiki.
matrix.org uses a bot to block known abusers (users or servers). Bans are pretty flexible and can operate at the user, room, or server level.
Matrix people suggest making the bot admin of your channels, because
you can't take back admin from a user once given.
This tool and Mjolnir are based on the admin API built into Synapse.
- System notify users (all users/users from a list, specific user)
- delete sessions/devices not seen for X days
- purge the remote media cache
- select rooms with various criteria (external/local/empty/created by/encrypted/cleartext)
- purge history of these rooms
- shutdown rooms
Matrix rooms are effectively in the +R mode ("only registered users can join") by default, except that anyone can register their own homeserver, which makes this limited.
Server admins can block IP addresses and home servers, but those tools
are not easily available to room admins. There is an API
(m.room.server_acl
in /devtools
) but it is not reliable
(thanks Austin Huang for the clarification).
Matrix has the concept of guest accounts, but it is not used very
much, and virtually no client or homeserver supports it. This contrasts with the way
IRC works: by default, anyone can join an IRC network even without
authentication. Some channels require registration, but in general you
are free to join and look around (until you get blocked, of course).
I have seen anecdotal evidence (CW: Twitter, nitter link) that "moderating bridges is hell", and
I can imagine why. Moderation is already hard enough on one
federation; when you bridge a room with another network, you inherit
all the problems from that network but without the entire abuse
control tools from the original network's API...
joe
on server B,
they will hijack that room on that specific server. This will not
(necessarily) affect users on the other servers, as servers could
refuse parts of the updates or ban the compromised account (or
server).
It does seem like a major flaw that room credentials are bound to
Matrix identifiers, as opposed to the E2E encryption credentials. In
an encrypted room even with fully verified members, a compromised or
hostile home server can still take over the room by impersonating an
admin. That admin (or even a newly minted user) can then send events
or listen on the conversations.
This is even more frustrating when you consider that Matrix events are
actually signed and therefore have some authentication attached
to them, acting like some sort of Merkle tree (as it contains a link
to previous events). That signature, however, is made from the
homeserver PKI keys, not the client's E2E keys, which makes E2E feel
like it has been "bolted on" later.
connect
block on both servers.
A Matrix user identifier (e.g. @anarcat:matrix.org) is bound to
that specific home server. If that server goes down, that user is
completely disconnected. They could register a new account elsewhere
and reconnect, but then they basically lose all their configuration:
contacts, joined channels are all lost.
(Also notice how the Matrix IDs don't look like a typical user address
like an email in XMPP. They at least did their homework and got the
allocation for the scheme.)
#room:matrix.org
is also
visible as #room:example.com
on the example.com
home server. Both
addresses refer to the same underlying room.
(Finding this in the Element settings is not obvious though, because
that "alias" is actually called a "local address" there. So to create
such an alias (in Element), you need to go in the room settings'
"General" section, "Show more" in "Local address", then add the alias
name (e.g. foo
), and then that room will be available on your
example.com
homeserver as #foo:example.com
.)
So a room doesn't belong to a server, it belongs to the federation,
and anyone can join the room from any server (if the room is public, or
if invited otherwise). You can create a room on server A and when a
user from server B joins, the room will be replicated on server B as
well. If server A fails, server B will keep relaying traffic to
connected users and servers.
A room is therefore not fundamentally addressed with the above alias,
instead, it has an internal Matrix ID, which is basically a random
string. It has a server name attached to it, but that was made just to
avoid collisions. That can get a little confusing. For example, the
#fractal:gnome.org
room is an alias on the gnome.org
server, but
the room ID is !hwiGbsdSTZIwSRfybq:matrix.org
. That's because the
room was created on matrix.org
, but the preferred branding is
gnome.org
now.
As an aside, rooms, by default, live forever, even after the last user
quits. There's an admin API to delete rooms and a tombstone
event to redirect to another one, but neither have a GUI yet. The
latter is part of MSC1501 ("Room version upgrades") which allows
a room admin to close a room, with a message and a pointer to another
room.
A home server name (e.g. matrix.example.com) must never change in
the future, as renaming home servers is not supported.
The documentation used to say you could "run a hot spare" but that has
been removed. Last I heard, it was not possible to run a
high-availability setup where multiple, separate locations could
replace each other automatically. You can have high performance
setups where the load gets distributed among workers, but those
are based on a shared database (Redis and PostgreSQL) backend.
So my guess is it would be possible to create a "warm" spare server of
a matrix home server with regular PostgreSQL replication, but
that is not documented in the Synapse manual. This sort of setup
would also not be useful to deal with networking issues or denial of
service attacks, as you will not be able to spread the load over
multiple network locations easily. Redis and PostgreSQL heroes are
welcome to provide their multi-primary solution in the comments. In
the meantime, I'll just point out this is a solution that's handled
somewhat more gracefully in IRC, by having the possibility of
delegating the authentication layer.
You can use the .well-known pattern (or SRV
records, but that's "not recommended" and a bit confusing) to
delegate that service to another server. Be warned that the server
still needs to be explicitly configured for your domain. You can't
just put:
"m.server": "matrix.org:443"
... on https://example.com/.well-known/matrix/server
and start using
@you:example.com
as a Matrix ID. That's because Matrix doesn't
support "virtual hosting" and you'd still be connecting to rooms and
people with your matrix.org
identity, not example.com
as you would
normally expect. This is also why you cannot rename your home
server.
The server discovery API is what allows servers to find each
other. Clients, on the other hand, use the client-server discovery
API: this is what allows a given client to find your home server
when you type your Matrix ID on login.
Joining a large room from a small home server (e.g. matrix.debian.social) takes a few minutes
and then fails. That is because the home server has to sync the
entire room state when you join the room. There was promising work on
this announced in the lengthy 2021 retrospective, and some of
that work landed (partial sync) in the 1.53 release already.
Other improvements coming include sliding sync, lazy loading
over federation, and fast room joins. So that's actually
something that could be fixed in the fairly short term.
But in general, communication in Matrix doesn't feel as "snappy" as on
IRC or even Signal. It's hard to quantify this without instrumenting a
full latency test bed (for example the tools I used in the terminal
emulators latency tests), but
even just typing in a web browser feels slower than typing in an xterm
or Emacs for me.
Even in conversations, I "feel" people don't immediately respond as
fast. In fact, this could be an interesting double-blind experiment to
make: have people guess whether they are talking to a person on
Matrix, XMPP, or IRC, for example. My theory would be that people
could notice that Matrix users are slower, if only because of the TCP
round-trip time each message has to take.
matrix:
scheme, it's just not exactly clear what they should do with
it, especially when the handler is just another web page (e.g. Element
web).
In general, when compared with tools like Signal or WhatsApp, Matrix
doesn't fare so well in terms of user discovery. I probably have some
of my normal contacts that have a Matrix account as well, but there's
really no way to know. It's kind of creepy when Signal tells you
"this person is on Signal!" but it's also pretty cool that it works,
and they actually implemented it pretty well.
Registration is also less obvious: in Signal, the app confirms your
phone number automatically. It's friction-less and quick. In Matrix,
you need to learn about home servers, pick one, register (with a
password! aargh!), and then setup encryption keys (not default),
etc. It's a lot more friction.
And look, I understand: giving away your phone number is a huge
trade-off. I don't like it either. But it solves a real problem and
makes encryption accessible to a ton more people. Matrix does have
"identity servers" that can serve that purpose, but I don't feel
confident sharing my phone number there. It doesn't help that the
identity servers don't have private contact discovery: giving them
your phone number is a more serious security compromise than with
Signal.
There's a catch-22 here too: because no one feels like giving away
their phone numbers, no one does, and everyone assumes that stuff
doesn't work anyways. Like it or not, Signal forcing people to
divulge their phone number actually gives them critical mass that
means actually a lot of my relatives are on Signal and I don't have
to install crap like WhatsApp to talk with them.
/READ
command for
the latter:
/ALIAS READ script exec \$_->activity(0) for Irssi::windows
And yes, that's a Perl script in my IRC client. I am not aware of any
Matrix client that does stuff like that, except maybe Weechat, if we
can call it a Matrix client, or Irssi itself, now that it has a
Matrix plugin (!).
As for other clients, I have looked through the Matrix Client
Matrix (confusing right?) to try to figure out which one to try,
and, even after selecting Linux
as a filter, the chart is just too
wide to figure out anything. So I tried those, kind of randomly:
In the end I would probably stick with weechat-matrix or gomuks. At least Weechat
is scriptable so I could continue playing the power-user. Right now my
strategy with messaging (and that includes microblogging like Twitter
or Mastodon) is that everything goes through my IRC client, so Weechat
could actually fit well in there. Going with gomuks
, on the other
hand, would mean running it in parallel with Irssi or ... ditching
IRC, which is a leap I'm not quite ready to take just yet.
Oh, and basically none of those clients (except Nheko and Element)
support VoIP, which is still kind of a second-class citizen in
Matrix. It does not support large multimedia rooms, for example:
Jitsi was used for FOSDEM instead of the native videoconferencing
system.
matrix.org
publishes a (federated) block list
of hostile servers (#matrix-org-coc-bl:matrix.org
, yes, of course
it's a room).
Interestingly, Email is also in that stage, where there are block
lists of spammers, and it's a race between those blockers and
spammers. Large email providers, obviously, are getting closer to the
EFnet stage: you could consider they only accept email from themselves
or between themselves. It's getting increasingly hard to deliver mail
to Outlook and Gmail for example, partly because of bias against small
providers, but also because they are including more and more
machine-learning tools to sort through email and those systems are,
fundamentally, unknowable. It's not quite the same as splitting the
federation the way EFnet did, but the effect is similar.
HTTP has somehow managed to live in a parallel universe, as it's
technically still completely federated: anyone can start a web server
if they have a public IP address and anyone can connect to it. The
catch, of course, is how you find the darn thing. Which is how Google
became one of the most powerful corporations on earth, and how they
became the gatekeepers of human knowledge online.
I have only briefly mentioned XMPP here, and my XMPP fans will
undoubtedly comment on that, but I think it's somewhere in the middle
of all of this. It was co-opted by Facebook and Google, and
both corporations have abandoned it to its fate. I remember fondly the
days where I could do instant messaging with my contacts who had a
Gmail account. Those days are gone, and I don't talk to anyone over
Jabber anymore, unfortunately. And this is a threat that Matrix still
has to face.
It's also the threat Email is currently facing. On the one hand
corporations like Facebook want to completely destroy it and have
mostly succeeded: many people just have an email account to
register on things and talk to their friends over Instagram or
(lately) TikTok (which, I know, is not Facebook, but they started that
fire).
On the other hand, you have corporations like Microsoft and Google who
are still using and providing email services because, frankly, you
still do need email for stuff, just like fax is still around
but they are more and more isolated in their own silo. At this point,
it's only a matter of time before they reach critical mass and just decide
that the risk of allowing external mail coming in is not worth the
cost. They'll simply flip the switch and work on an allow-list
principle. Then we'll have closed the loop and email will be
dead, just like IRC is "dead" now.
I wonder which path Matrix will take. Could it liberate us from these
vicious cycles?
Update: this generated some discussions on lobste.rs.
This post explains how I've configured this blog using hugo, asciidoctor and the papermod theme, how I publish it using nginx, how I've integrated the remark42 comment system and how I've automated its publication using gitea and json2file-go.
It is a long post, but I hope that at least parts of it can be interesting for some; feel free to ignore it if that is not your case.
The site's config.yml file is the one shown below (probably some of the settings are not required nor being used right now, but I'm including the current file, so this post will always have the latest version of it):
baseURL: https://blogops.mixinet.net/
title: Mixinet BlogOps
paginate: 5
theme: PaperMod
destination: public/
enableInlineShortcodes: true
enableRobotsTXT: true
buildDrafts: false
buildFuture: false
buildExpired: false
enableEmoji: true
pygmentsUseClasses: true
minify:
disableXML: true
minifyOutput: true
languages:
en:
languageName: "English"
description: "Mixinet BlogOps - https://blogops.mixinet.net/"
author: "Sergio Talens-Oliag"
weight: 1
title: Mixinet BlogOps
homeInfoParams:
Title: "Sergio Talens-Oliag Technical Blog"
Content: >

taxonomies:
category: categories
tag: tags
series: series
menu:
main:
- name: Archive
url: archives
weight: 5
- name: Categories
url: categories/
weight: 10
- name: Tags
url: tags/
weight: 10
- name: Search
url: search/
weight: 15
outputs:
home:
- HTML
- RSS
- JSON
params:
env: production
defaultTheme: light
disableThemeToggle: false
ShowShareButtons: true
ShowReadingTime: true
disableSpecial1stPost: true
disableHLJS: true
displayFullLangName: true
ShowPostNavLinks: true
ShowBreadCrumbs: true
ShowCodeCopyButtons: true
ShowRssButtonInSectionTermList: true
ShowFullTextinRSS: true
ShowToc: true
TocOpen: false
comments: true
remark42SiteID: "blogops"
remark42Url: "/remark42"
profileMode:
enabled: false
title: Sergio Talens-Oliag Technical Blog
imageUrl: "/images/mixinet-blogops.png"
imageTitle: Mixinet BlogOps
buttons:
- name: Archives
url: archives
- name: Categories
url: categories
- name: Tags
url: tags
socialIcons:
- name: CV
url: "https://www.uv.es/~sto/cv/"
- name: Debian
url: "https://people.debian.org/~sto/"
- name: GitHub
url: "https://github.com/sto/"
- name: GitLab
url: "https://gitlab.com/stalens/"
- name: Linkedin
url: "https://www.linkedin.com/in/sergio-talens-oliag/"
- name: RSS
url: "index.xml"
assets:
disableHLJS: true
favicon: "/favicon.ico"
favicon16x16: "/favicon-16x16.png"
favicon32x32: "/favicon-32x32.png"
apple_touch_icon: "/apple-touch-icon.png"
safari_pinned_tab: "/safari-pinned-tab.svg"
fuseOpts:
isCaseSensitive: false
shouldSort: true
location: 0
distance: 1000
threshold: 0.4
minMatchCharLength: 0
keys: ["title", "permalink", "summary", "content"]
markup:
asciidocExt:
attributes:
backend: html5s
extensions: ['asciidoctor-html5s','asciidoctor-diagram']
failureLevel: fatal
noHeaderOrFooter: true
preserveTOC: false
safeMode: unsafe
sectionNumbers: false
trace: false
verbose: false
workingFolderCurrent: true
privacy:
vimeo:
disabled: false
simple: true
twitter:
disabled: false
enableDNT: true
simple: true
instagram:
disabled: false
simple: true
youtube:
disabled: false
privacyEnhanced: true
services:
instagram:
disableInlineCSS: true
twitter:
disableInlineCSS: true
security:
exec:
allow:
- '^asciidoctor$'
- '^dart-sass-embedded$'
- '^go$'
- '^npx$'
- '^postcss$'
- disableHLJS and assets.disableHLJS are set to true; we plan to use rouge on adoc and the inclusion of the hljs assets adds styles that collide with the ones used by rouge.
- ShowToc is set to true and the TocOpen setting is set to false to make the ToC appear collapsed initially. My plan was to use the asciidoctor ToC, but after trying I believe that the theme one looks nice and I don't need to adjust styles, although it has some issues with the html5s processor (the admonition titles use <h6> and they are shown on the ToC, which is weird); to fix it I've copied the layouts/partial/toc.html to my site repository and replaced the range of headings to end at 5 instead of 6 (in fact 5 still seems a lot, but as I don't think I'll use that heading level on the posts it doesn't really matter).
- The params.profileMode values are adjusted, but for now I've left it disabled setting params.profileMode.enabled to false and I've set the homeInfoParams to show more or less the same content with the latest posts under it (I've added some styles to my custom.css style sheet to center the text and image of the first post to match the look and feel of the profile).
- On the asciidocExt section I've adjusted the backend to use html5s, I've added the asciidoctor-html5s and asciidoctor-diagram extensions to asciidoctor and adjusted the workingFolderCurrent to true to make asciidoctor-diagram work right (haven't tested it yet).
To adjust the output of asciidoctor using the html5s processor I've added some files to the assets/css/extended directory:
- The assets/css/extended/custom.css to make the homeInfoParams look like the profile page; I've also changed a little bit some theme styles to make things look better with the html5s output:
/* Fix first entry alignment to make it look like the profile */
.first-entry { text-align: center; }
.first-entry img { display: inline; }
/**
 * Remove margin for .post-content code and reduce padding to make it look
 * better with the asciidoctor html5s output.
 **/
.post-content code { margin: auto 0; padding: 4px; }
- An assets/css/extended/adoc.css with some styles taken from the asciidoctor-default.css, see this blog post about the original file; mine is the same after formatting it with css-beautify and editing it to use variables for the colors to support light and dark themes:
/* AsciiDoctor*/
table {
  border-collapse: collapse;
  border-spacing: 0
}
.admonitionblock>table {
  border-collapse: separate;
  border: 0;
  background: none;
  width: 100%
}
.admonitionblock>table td.icon {
  text-align: center;
  width: 80px
}
.admonitionblock>table td.icon img {
  max-width: none
}
.admonitionblock>table td.icon .title {
  font-weight: bold;
  font-family: "Open Sans", "DejaVu Sans", sans-serif;
  text-transform: uppercase
}
.admonitionblock>table td.content {
  padding-left: 1.125em;
  padding-right: 1.25em;
  border-left: 1px solid #ddddd8;
  color: var(--primary)
}
.admonitionblock>table td.content>:last-child>:last-child {
  margin-bottom: 0
}
.admonitionblock td.icon [class^="fa icon-"] {
  font-size: 2.5em;
  text-shadow: 1px 1px 2px var(--secondary);
  cursor: default
}
.admonitionblock td.icon .icon-note::before {
  content: "\f05a";
  color: var(--icon-note-color)
}
.admonitionblock td.icon .icon-tip::before {
  content: "\f0eb";
  color: var(--icon-tip-color)
}
.admonitionblock td.icon .icon-warning::before {
  content: "\f071";
  color: var(--icon-warning-color)
}
.admonitionblock td.icon .icon-caution::before {
  content: "\f06d";
  color: var(--icon-caution-color)
}
.admonitionblock td.icon .icon-important::before {
  content: "\f06a";
  color: var(--icon-important-color)
}
.conum[data-value] {
  display: inline-block;
  color: #fff !important;
  background-color: rgba(100, 100, 0, .8);
  -webkit-border-radius: 100px;
  border-radius: 100px;
  text-align: center;
  font-size: .75em;
  width: 1.67em;
  height: 1.67em;
  line-height: 1.67em;
  font-family: "Open Sans", "DejaVu Sans", sans-serif;
  font-style: normal;
  font-weight: bold
}
.conum[data-value] * {
  color: #fff !important
}
.conum[data-value]+b {
  display: none
}
.conum[data-value]::after {
  content: attr(data-value)
}
pre .conum[data-value] {
  position: relative;
  top: -.125em
}
b.conum * {
  color: inherit !important
}
.conum:not([data-value]):empty {
  display: none
}
- A theme-vars.css file that changes the highlighted code background color and adds the color definitions used by the admonitions:
:root {
  /* Solarized base2 */
  /* --hljs-bg: rgb(238, 232, 213); */
  /* Solarized base3 */
  /* --hljs-bg: rgb(253, 246, 227); */
  /* Solarized base02 */
  --hljs-bg: rgb(7, 54, 66);
  /* Solarized base03 */
  /* --hljs-bg: rgb(0, 43, 54); */
  /* Default asciidoctor theme colors */
  --icon-note-color: #19407c;
  --icon-tip-color: var(--primary);
  --icon-warning-color: #bf6900;
  --icon-caution-color: #bf3400;
  --icon-important-color: #bf0000
}
.dark {
  --hljs-bg: rgb(7, 54, 66);
  /* Asciidoctor theme colors with tint for dark background */
  --icon-note-color: #3e7bd7;
  --icon-tip-color: var(--primary);
  --icon-warning-color: #ff8d03;
  --icon-caution-color: #ff7847;
  --icon-important-color: #ff3030
}
The admonition icons use font-awesome, so I've downloaded its resources for version 4.7.0 (the one used by asciidoctor), storing the font-awesome.css into the assets/css/extended dir (that way it is merged with the rest of the .css files) and copying the fonts to the static/assets/fonts/ dir (they will be served directly):
FA_BASE_URL="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0"
curl "$FA_BASE_URL/css/font-awesome.css" \
> assets/css/extended/font-awesome.css
for f in FontAwesome.otf fontawesome-webfont.eot \
fontawesome-webfont.svg fontawesome-webfont.ttf \
fontawesome-webfont.woff fontawesome-webfont.woff2; do
curl "$FA_BASE_URL/fonts/$f" > "static/assets/fonts/$f"
done
The code blocks are highlighted using css classes compatible with rouge, so we need a css file to do the highlight styling; as rouge provides a way to export them, I've created the assets/css/extended/rouge.css file with the thankful_eyes theme:
rougify style thankful_eyes > assets/css/extended/rouge.css
To use the html5s backend with admonitions I've added a variation of the example found on this blog post to assets/js/adoc-admonitions.js:
// replace the default admonitions block with a table that uses a format
// similar to the standard asciidoctor ... as we are using fa-icons here there
// is no need to add the icons: font entry on the document.
window.addEventListener('load', function () {
  const admonitions = document.getElementsByClassName('admonition-block')
  for (let i = admonitions.length - 1; i >= 0; i--) {
    const elm = admonitions[i]
    const type = elm.classList[1]
    const title = elm.getElementsByClassName('block-title')[0];
    const label = title.getElementsByClassName('title-label')[0]
      .innerHTML.slice(0, -1);
    elm.removeChild(elm.getElementsByClassName('block-title')[0]);
    const text = elm.innerHTML
    const parent = elm.parentNode
    const tempDiv = document.createElement('div')
    tempDiv.innerHTML = `<div class="admonitionblock ${type}">
    <table>
      <tbody>
        <tr>
          <td class="icon">
            <i class="fa icon-${type}" title="${label}"></i>
          </td>
          <td class="content">
            ${text}
          </td>
        </tr>
      </tbody>
    </table>
    </div>`
    const input = tempDiv.childNodes[0]
    parent.replaceChild(input, elm)
  }
})
The script is loaded from the layouts/partials/extend_footer.html file, adding the following lines to it:
{{- $admonitions := slice (resources.Get "js/adoc-admonitions.js")
  | resources.Concat "assets/js/adoc-admonitions.js" | minify | fingerprint }}
<script defer crossorigin="anonymous" src="{{ $admonitions.RelPermalink }}"
  integrity="{{ $admonitions.Data.Integrity }}"></script>
To add the comments I've created the layouts/partials/comments.html with the following content based on the remark42 documentation, including extra code to sync the dark/light setting with the one set on the site:
<div id="remark42"></div>
<script>
  var remark_config = {
    host: {{ .Site.Params.remark42Url }},
    site_id: {{ .Site.Params.remark42SiteID }},
    url: {{ .Permalink }},
    locale: {{ .Site.Language.Lang }}
  };
  (function(c) {
    /* Adjust the theme using the local-storage pref-theme if set */
    if (localStorage.getItem("pref-theme") === "dark") {
      remark_config.theme = "dark";
    } else if (localStorage.getItem("pref-theme") === "light") {
      remark_config.theme = "light";
    }
    /* Add remark42 widget */
    for(var i = 0; i < c.length; i++) {
      var d = document, s = d.createElement('script');
      s.src = remark_config.host + '/web/' + c[i] +'.js';
      s.defer = true;
      (d.head || d.body).appendChild(s);
    }
  })(remark_config.components || ['embed']);
</script>
To keep the theme toggle in sync with remark42 I've also added the following inside the layouts/partials/extend_footer.html file:
{{- if (not site.Params.disableThemeToggle) }}
<script>
  /* Function to change theme when the toggle button is pressed */
  document.getElementById("theme-toggle").addEventListener("click", () => {
    if (typeof window.REMARK42 != "undefined") {
      if (document.body.className.includes('dark')) {
        window.REMARK42.changeTheme('light');
      } else {
        window.REMARK42.changeTheme('dark');
      }
    }
  });
</script>
{{- end }}
This way, when the theme-toggle button is pressed we change the remark42 theme before the PaperMod one (that's needed here only; on page loads the remark42 theme is synced with the main one using the code from the layouts/partials/comments.html shown earlier).
For local development I use docker-compose with the following configuration:
version: "2"
services:
hugo:
build:
context: ./docker/hugo-adoc
dockerfile: ./Dockerfile
image: sto/hugo-adoc
container_name: hugo-adoc-blogops
restart: always
volumes:
- .:/documents
command: server --bind 0.0.0.0 -D -F
user: ${APP_UID}:${APP_GID}
nginx:
image: nginx:latest
container_name: nginx-blogops
restart: always
volumes:
- ./nginx/default.conf:/etc/nginx/conf.d/default.conf
ports:
- 1313:1313
remark42:
build:
context: ./docker/remark42
dockerfile: ./Dockerfile
image: sto/remark42
container_name: remark42-blogops
restart: always
env_file:
- ./.env
- ./remark42/env.dev
volumes:
- ./remark42/var.dev:/srv/var
Before running it we have to create a .env file with the current user ID and GID on the variables APP_UID and APP_GID (if we don't do it the files can end up being owned by a user that is not the same as the one running the services):
$ echo "APP_UID=$(id -u)\nAPP_GID=$(id -g)" > .env
The Dockerfile used to generate the sto/hugo-adoc image is:
FROM asciidoctor/docker-asciidoctor:latest
RUN gem install --no-document asciidoctor-html5s &&\
apk update && apk add --no-cache curl libc6-compat &&\
repo_path="gohugoio/hugo" &&\
api_url="https://api.github.com/repos/$repo_path/releases/latest" &&\
download_url="$(\
curl -sL "$api_url" \
| sed -n "s/^.*download_url\": \"\\(.*.extended.*Linux-64bit.tar.gz\)\"/\1/p"\
)" &&\
curl -sL "$download_url" -o /tmp/hugo.tgz &&\
tar xf /tmp/hugo.tgz hugo &&\
install hugo /usr/bin/ &&\
rm -f hugo /tmp/hugo.tgz &&\
/usr/bin/hugo version &&\
apk del curl && rm -rf /var/cache/apk/*
# Expose port for live server
EXPOSE 1313
ENTRYPOINT ["/usr/bin/hugo"]
CMD [""]
The base image already includes asciidoctor, and to use hugo I only need to download the binary from their latest release at github (as we are using an image based on alpine we also need to install the libc6-compat package, but once that is done things are working fine for me so far).
The image does not launch the server by default because I don't want it to; in
fact I use the same docker-compose.yml
file to publish the site in production
simply calling the container without the arguments passed on the
docker-compose.yml
file (see later).
When running the containers with docker-compose up
(or docker compose up
if
you have the docker-compose-plugin
package installed) we also launch a nginx
container and the remark42
service so we can test everything together.
The Dockerfile
for the remark42
image is the original one with an updated
version of the init.sh
script:
FROM umputun/remark42:latest
COPY init.sh /init.sh
init.sh
is similar to the original, but allows us to use an
APP_GID
variable and updates the /etc/group
file of the container so the
files get the right user and group (with the original script the group is
always 1001
):
#!/sbin/dinit /bin/sh
uid="$(id -u)"
if [ "${uid}" -eq "0" ]; then
  echo "init container"
  # set container's time zone
  cp "/usr/share/zoneinfo/${TIME_ZONE}" /etc/localtime
  echo "${TIME_ZONE}" >/etc/timezone
  echo "set timezone ${TIME_ZONE} ($(date))"
  # set UID & GID for the app
  if [ "${APP_UID}" ] || [ "${APP_GID}" ]; then
    [ "${APP_UID}" ] || APP_UID="1001"
    [ "${APP_GID}" ] || APP_GID="${APP_UID}"
    echo "set custom APP_UID=${APP_UID} & APP_GID=${APP_GID}"
    sed -i "s/^app:x:1001:1001:/app:x:${APP_UID}:${APP_GID}:/" /etc/passwd
    sed -i "s/^app:x:1001:/app:x:${APP_GID}:/" /etc/group
  else
    echo "custom APP_UID and/or APP_GID not defined, using 1001:1001"
  fi
  chown -R app:app /srv /home/app
fi
echo "prepare environment"
# replace %REMARK_URL% by content of REMARK_URL variable
find /srv -regex '.*\.\(html\|js\|mjs\)$' -print \
  -exec sed -i "s|%REMARK_URL%|${REMARK_URL}|g" {} \;
if [ -n "${SITE_ID}" ]; then
  #replace "site_id: 'remark'" by SITE_ID
  sed -i "s|'remark'|'${SITE_ID}'|g" /srv/web/*.html
fi
echo "execute \"$*\""
if [ "${uid}" -eq "0" ]; then
  exec su-exec app "$@"
else
  exec "$@"
fi
The env.dev file used to configure remark42 for development is quite minimal:
TIME_ZONE=Europe/Madrid
REMARK_URL=http://localhost:1313/remark42
SITE=blogops
SECRET=123456
ADMIN_SHARED_ID=sto
AUTH_ANON=true
EMOJI=true
The nginx/default.conf
file used to publish the service locally is simple
too:
server {
  listen 1313;
  server_name localhost;
  location / {
    proxy_pass http://hugo:1313;
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
  }
  location /remark42/ {
    rewrite /remark42/(.*) /$1 break;
    proxy_pass http://remark42:8080/;
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
  }
}
Debian repository:
git
to clone & pull the repository,jq
to parse json
files from shell scripts,json2file-go
to save the webhook messages to files,inotify-tools
to detect when new files are stored by json2file-go
and
launch scripts to process them,nginx
to publish the site using HTTPS and work as proxy for
json2file-go
and remark42
(I run it using a container),task-spool
to queue the scripts that update the deployment.docker
and docker compose
from the debian packages on the
docker
repository:
docker-ce
to run the containers,docker-compose-plugin
to run docker compose
(it is a plugin, so no -
in
the name).git
repository I ve created a deploy key, added it to gitea
and cloned the project on the /srv/blogops
PATH (that route is owned by a
regular user that has permissions to run docker
, as I said before).hugo
To compile the site we are using the docker-compose.yml file seen before; to be able to run it, first we build the container images and, once we have them, we launch hugo using docker compose run:
$ cd /srv/blogops
$ git pull
$ docker compose build
$ if [ -d "./public" ]; then rm -rf ./public; fi
$ docker compose run hugo --
That command generates the site on the /srv/blogops/public directory (we remove the directory first because hugo does not clean the destination folder as jekyll does).
The deploy script re-generates the site as described and moves the public directory to its final place for publishing.
remark42 with docker
On the /srv/blogops/remark42 folder I have the following docker-compose.yml:
version: "2"
services:
remark42:
build:
context: ../docker/remark42
dockerfile: ./Dockerfile
image: sto/remark42
env_file:
- ../.env
- ./env.prod
container_name: remark42
restart: always
volumes:
- ./var.prod:/srv/var
ports:
- 127.0.0.1:8042:8080
The ../.env file is loaded to get the APP_UID and APP_GID variables that are used by my version of the init.sh script to adjust file permissions, and the env.prod file contains the rest of the settings for remark42, including the social network tokens (see the remark42 documentation for the available parameters; I don't include my configuration here because some of them are secrets).
The nginx configuration for the blogops.mixinet.net site is as simple as:
server {
  listen 443 ssl http2;
  server_name blogops.mixinet.net;
  ssl_certificate /etc/letsencrypt/live/blogops.mixinet.net/fullchain.pem;
  ssl_certificate_key /etc/letsencrypt/live/blogops.mixinet.net/privkey.pem;
  include /etc/letsencrypt/options-ssl-nginx.conf;
  ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;
  access_log /var/log/nginx/blogops.mixinet.net-443.access.log;
  error_log /var/log/nginx/blogops.mixinet.net-443.error.log;
  root /srv/blogops/nginx/public_html;
  location / {
    try_files $uri $uri/ =404;
  }
  include /srv/blogops/nginx/remark42.conf;
}
server {
  listen 80 ;
  listen [::]:80 ;
  server_name blogops.mixinet.net;
  access_log /var/log/nginx/blogops.mixinet.net-80.access.log;
  error_log /var/log/nginx/blogops.mixinet.net-80.error.log;
  if ($host = blogops.mixinet.net) {
    return 301 https://$host$request_uri;
  }
  return 404;
}
The site root is /srv/blogops/nginx/public_html and not /srv/blogops/public; the reason for that is that I want to be able to compile without affecting the running site: the deployment script generates the site on /srv/blogops/public and, if all works well, we rename folders to do the switch, making the change feel almost atomic.
As there is a VPN between the gitea at my home and the VM where the blog is served, I'm
going to configure the json2file-go
to listen for connections on a high port
using a self signed certificate and listening on IP addresses only reachable
through the VPN.
To do it we create a systemd socket
to run json2file-go
and adjust its
configuration to listen on a private IP (we use the FreeBind
option on its
definition to be able to launch the service even when the IP is not available,
that is, when the VPN is down).
The following script can be used to set up the json2file-go
configuration:
#!/bin/sh
set -e
# ---------
# VARIABLES
# ---------
BASE_DIR="/srv/blogops/webhook"
J2F_DIR="$BASE_DIR/json2file"
TLS_DIR="$BASE_DIR/tls"
J2F_SERVICE_NAME="json2file-go"
J2F_SERVICE_DIR="/etc/systemd/system/json2file-go.service.d"
J2F_SERVICE_OVERRIDE="$J2F_SERVICE_DIR/override.conf"
J2F_SOCKET_DIR="/etc/systemd/system/json2file-go.socket.d"
J2F_SOCKET_OVERRIDE="$J2F_SOCKET_DIR/override.conf"
J2F_BASEDIR_FILE="/etc/json2file-go/basedir"
J2F_DIRLIST_FILE="/etc/json2file-go/dirlist"
J2F_CRT_FILE="/etc/json2file-go/certfile"
J2F_KEY_FILE="/etc/json2file-go/keyfile"
J2F_CRT_PATH="$TLS_DIR/crt.pem"
J2F_KEY_PATH="$TLS_DIR/key.pem"
# ----
# MAIN
# ----
# Install packages used with json2file for the blogops site
sudo apt update
sudo apt install -y json2file-go uuid
if [ -z "$(type mkcert)" ]; then
sudo apt install -y mkcert
fi
sudo apt clean
# Configuration file values
J2F_USER="$(id -u)"
J2F_GROUP="$(id -g)"
J2F_DIRLIST="blogops:$(uuid)"
J2F_LISTEN_STREAM="172.31.31.1:4443"
# Configure json2file
[ -d "$J2F_DIR" ] || mkdir "$J2F_DIR"
sudo sh -c "echo '$J2F_DIR' >'$J2F_BASEDIR_FILE'"
[ -d "$TLS_DIR" ] || mkdir "$TLS_DIR"
if [ ! -f "$J2F_CRT_PATH" ] || [ ! -f "$J2F_KEY_PATH" ]; then
mkcert -cert-file "$J2F_CRT_PATH" -key-file "$J2F_KEY_PATH" "$(hostname -f)"
fi
sudo sh -c "echo '$J2F_CRT_PATH' >'$J2F_CRT_FILE'"
sudo sh -c "echo '$J2F_KEY_PATH' >'$J2F_KEY_FILE'"
sudo sh -c "cat >'$J2F_DIRLIST_FILE'" <<EOF
$(echo "$J2F_DIRLIST" | tr ';' '\n')
EOF
# Service override
[ -d "$J2F_SERVICE_DIR" ] || sudo mkdir "$J2F_SERVICE_DIR"
sudo sh -c "cat >'$J2F_SERVICE_OVERRIDE'" <<EOF
[Service]
User=$J2F_USER
Group=$J2F_GROUP
EOF
# Socket override
[ -d "$J2F_SOCKET_DIR" ] || sudo mkdir "$J2F_SOCKET_DIR"
sudo sh -c "cat >'$J2F_SOCKET_OVERRIDE'" <<EOF
[Socket]
# Set FreeBind to listen on missing addresses (the VPN can be down sometimes)
FreeBind=true
# Set ListenStream to nothing to clear its value and add the new value later
ListenStream=
ListenStream=$J2F_LISTEN_STREAM
EOF
# Restart and enable service
sudo systemctl daemon-reload
sudo systemctl stop "$J2F_SERVICE_NAME"
sudo systemctl start "$J2F_SERVICE_NAME"
sudo systemctl enable "$J2F_SERVICE_NAME"
# ----
# vim: ts=2:sw=2:et:ai:sts=2
The setup script uses mkcert to create the temporary certificates; to install the package on bullseye the backports repository must be available.
To configure the gitea webhook that calls the json2file-go server we go to the project and enter into the hooks/gitea/new page; once there we create a new webhook of type gitea, set the target URL to https://172.31.31.1:4443/blogops and on the secret field we put the token generated with uuid by the setup script:
sed -n -e 's/blogops://p' /etc/json2file-go/dirlist
We also have to make sure that the webhook section of the app.ini of our gitea server allows us to call the IP and skips the TLS verification (you can see the available options on the gitea documentation).
The [webhook] section of my server looks like this:
[webhook]
ALLOWED_HOST_LIST=private
SKIP_TLS_VERIFY=true
Once we have the webhook configured we can try it and, if it works, our json2file server will store the file on the /srv/blogops/webhook/json2file/blogops/ folder.
Now we can receive the messages from gitea
process those files once they are saved in our machine.
An option could be to use a cronjob
to look for new files, but we can do
better on Linux using inotify: we will use the inotifywait
we will use the inotifywait
command from
inotify-tools
to watch the json2file
output directory and execute a script
each time a new file is moved inside it or closed after writing
(IN_CLOSE_WRITE
and IN_MOVED_TO
events).
To avoid concurrency problems we are going to use task-spooler
to launch the
scripts that process the webhooks using a queue of length 1, so they are
executed one by one in a FIFO queue.
The spooler script is this:
#!/bin/sh
set -e
# ---------
# VARIABLES
# ---------
BASE_DIR="/srv/blogops/webhook"
BIN_DIR="$BASE_DIR/bin"
TSP_DIR="$BASE_DIR/tsp"
WEBHOOK_COMMAND="$BIN_DIR/blogops-webhook.sh"
# ---------
# FUNCTIONS
# ---------
queue_job() {
  echo "Queuing job to process file '$1'"
  TMPDIR="$TSP_DIR" TS_SLOTS="1" TS_MAXFINISHED="10" \
    tsp -n "$WEBHOOK_COMMAND" "$1"
}
# ----
# MAIN
# ----
INPUT_DIR="$1"
if [ ! -d "$INPUT_DIR" ]; then
echo "Input directory '$INPUT_DIR' does not exist, aborting!"
exit 1
fi
[ -d "$TSP_DIR" ] || mkdir "$TSP_DIR"
echo "Processing existing files under '$INPUT_DIR'"
find "$INPUT_DIR" -type f | sort | while read -r _filename; do
queue_job "$_filename"
done
# Use inotifywatch to process new files
echo "Watching for new files under '$INPUT_DIR'"
inotifywait -q -m -e close_write,moved_to --format "%w%f" -r "$INPUT_DIR" |
  while read -r _filename; do
queue_job "$_filename"
done
# ----
# vim: ts=2:sw=2:et:ai:sts=2
It is installed and run as a systemd service using the following script:
#!/bin/sh
set -e
# ---------
# VARIABLES
# ---------
BASE_DIR="/srv/blogops/webhook"
BIN_DIR="$BASE_DIR/bin"
J2F_DIR="$BASE_DIR/json2file"
SPOOLER_COMMAND="$BIN_DIR/blogops-spooler.sh '$J2F_DIR'"
SPOOLER_SERVICE_NAME="blogops-j2f-spooler"
SPOOLER_SERVICE_FILE="/etc/systemd/system/$SPOOLER_SERVICE_NAME.service"
# Configuration file values
J2F_USER="$(id -u)"
J2F_GROUP="$(id -g)"
# ----
# MAIN
# ----
# Install packages used with the webhook processor
sudo apt update
sudo apt install -y inotify-tools jq task-spooler
sudo apt clean
# Configure process service
sudo sh -c "cat > $SPOOLER_SERVICE_FILE" <<EOF
[Install]
WantedBy=multi-user.target
[Unit]
Description=json2file processor for $J2F_USER
After=docker.service
[Service]
Type=simple
User=$J2F_USER
Group=$J2F_GROUP
ExecStart=$SPOOLER_COMMAND
EOF
# Restart and enable service
sudo systemctl daemon-reload
sudo systemctl stop "$SPOOLER_SERVICE_NAME" || true
sudo systemctl start "$SPOOLER_SERVICE_NAME"
sudo systemctl enable "$SPOOLER_SERVICE_NAME"
# ----
# vim: ts=2:sw=2:et:ai:sts=2
The webhook processing script (blogops-webhook.sh) parses the messages and re-generates the site using hugo with docker compose:
#!/bin/sh
set -e
# ---------
# VARIABLES
# ---------
# Values
REPO_REF="refs/heads/main"
REPO_CLONE_URL="https://gitea.mixinet.net/mixinet/blogops.git"
MAIL_PREFIX="[BLOGOPS-WEBHOOK] "
# Address that gets all messages, leave it empty if not wanted
MAIL_TO_ADDR="blogops@mixinet.net"
# If the following variable is set to 'true' the pusher gets mail on failures
MAIL_ERRFILE="false"
# If the following variable is set to 'true' the pusher gets mail on success
MAIL_LOGFILE="false"
# gitea's conf/app.ini value of NO_REPLY_ADDRESS, it is used for email domains
# when the KeepEmailPrivate option is enabled for a user
NO_REPLY_ADDRESS="noreply.example.org"
# Directories
BASE_DIR="/srv/blogops"
PUBLIC_DIR="$BASE_DIR/public"
NGINX_BASE_DIR="$BASE_DIR/nginx"
PUBLIC_HTML_DIR="$NGINX_BASE_DIR/public_html"
WEBHOOK_BASE_DIR="$BASE_DIR/webhook"
WEBHOOK_SPOOL_DIR="$WEBHOOK_BASE_DIR/spool"
WEBHOOK_ACCEPTED="$WEBHOOK_SPOOL_DIR/accepted"
WEBHOOK_DEPLOYED="$WEBHOOK_SPOOL_DIR/deployed"
WEBHOOK_REJECTED="$WEBHOOK_SPOOL_DIR/rejected"
WEBHOOK_TROUBLED="$WEBHOOK_SPOOL_DIR/troubled"
WEBHOOK_LOG_DIR="$WEBHOOK_SPOOL_DIR/log"
# Files
TODAY="$(date +%Y%m%d)"
OUTPUT_BASENAME="$(date +%Y%m%d-%H%M%S.%N)"
WEBHOOK_LOGFILE_PATH="$WEBHOOK_LOG_DIR/$OUTPUT_BASENAME.log"
WEBHOOK_ACCEPTED_JSON="$WEBHOOK_ACCEPTED/$OUTPUT_BASENAME.json"
WEBHOOK_ACCEPTED_LOGF="$WEBHOOK_ACCEPTED/$OUTPUT_BASENAME.log"
WEBHOOK_REJECTED_TODAY="$WEBHOOK_REJECTED/$TODAY"
WEBHOOK_REJECTED_JSON="$WEBHOOK_REJECTED_TODAY/$OUTPUT_BASENAME.json"
WEBHOOK_REJECTED_LOGF="$WEBHOOK_REJECTED_TODAY/$OUTPUT_BASENAME.log"
WEBHOOK_DEPLOYED_TODAY="$WEBHOOK_DEPLOYED/$TODAY"
WEBHOOK_DEPLOYED_JSON="$WEBHOOK_DEPLOYED_TODAY/$OUTPUT_BASENAME.json"
WEBHOOK_DEPLOYED_LOGF="$WEBHOOK_DEPLOYED_TODAY/$OUTPUT_BASENAME.log"
WEBHOOK_TROUBLED_TODAY="$WEBHOOK_TROUBLED/$TODAY"
WEBHOOK_TROUBLED_JSON="$WEBHOOK_TROUBLED_TODAY/$OUTPUT_BASENAME.json"
WEBHOOK_TROUBLED_LOGF="$WEBHOOK_TROUBLED_TODAY/$OUTPUT_BASENAME.log"
# Query to get variables from a gitea webhook json
ENV_VARS_QUERY="$(
printf "%s" \
'(. @sh "gt_ref=\(.ref);"),' \
'(. @sh "gt_after=\(.after);"),' \
'(.repository @sh "gt_repo_clone_url=\(.clone_url);"),' \
'(.repository @sh "gt_repo_name=\(.name);"),' \
'(.pusher @sh "gt_pusher_full_name=\(.full_name);"),' \
'(.pusher @sh "gt_pusher_email=\(.email);")'
)"
# ---------
# Functions
# ---------
webhook_log() {
  echo "$(date -R) $*" >>"$WEBHOOK_LOGFILE_PATH"
}
webhook_check_directories() {
  for _d in "$WEBHOOK_SPOOL_DIR" "$WEBHOOK_ACCEPTED" "$WEBHOOK_DEPLOYED" \
    "$WEBHOOK_REJECTED" "$WEBHOOK_TROUBLED" "$WEBHOOK_LOG_DIR"; do
    [ -d "$_d" ] || mkdir "$_d"
  done
}
webhook_clean_directories() {
  # Try to remove empty dirs
  for _d in "$WEBHOOK_ACCEPTED" "$WEBHOOK_DEPLOYED" "$WEBHOOK_REJECTED" \
    "$WEBHOOK_TROUBLED" "$WEBHOOK_LOG_DIR" "$WEBHOOK_SPOOL_DIR"; do
    if [ -d "$_d" ]; then
      rmdir "$_d" 2>/dev/null || true
    fi
  done
}
webhook_accept() {
  webhook_log "Accepted: $*"
  mv "$WEBHOOK_JSON_INPUT_FILE" "$WEBHOOK_ACCEPTED_JSON"
  mv "$WEBHOOK_LOGFILE_PATH" "$WEBHOOK_ACCEPTED_LOGF"
  WEBHOOK_LOGFILE_PATH="$WEBHOOK_ACCEPTED_LOGF"
}
webhook_reject() {
  [ -d "$WEBHOOK_REJECTED_TODAY" ] || mkdir "$WEBHOOK_REJECTED_TODAY"
  webhook_log "Rejected: $*"
  if [ -f "$WEBHOOK_JSON_INPUT_FILE" ]; then
    mv "$WEBHOOK_JSON_INPUT_FILE" "$WEBHOOK_REJECTED_JSON"
  fi
  mv "$WEBHOOK_LOGFILE_PATH" "$WEBHOOK_REJECTED_LOGF"
  exit 0
}
webhook_deployed() {
  [ -d "$WEBHOOK_DEPLOYED_TODAY" ] || mkdir "$WEBHOOK_DEPLOYED_TODAY"
  webhook_log "Deployed: $*"
  mv "$WEBHOOK_ACCEPTED_JSON" "$WEBHOOK_DEPLOYED_JSON"
  mv "$WEBHOOK_ACCEPTED_LOGF" "$WEBHOOK_DEPLOYED_LOGF"
  WEBHOOK_LOGFILE_PATH="$WEBHOOK_DEPLOYED_LOGF"
}
webhook_troubled() {
  [ -d "$WEBHOOK_TROUBLED_TODAY" ] || mkdir "$WEBHOOK_TROUBLED_TODAY"
  webhook_log "Troubled: $*"
  mv "$WEBHOOK_ACCEPTED_JSON" "$WEBHOOK_TROUBLED_JSON"
  mv "$WEBHOOK_ACCEPTED_LOGF" "$WEBHOOK_TROUBLED_LOGF"
  WEBHOOK_LOGFILE_PATH="$WEBHOOK_TROUBLED_LOGF"
}
print_mailto() {
  _addr="$1"
  _user_email=""
  # Add the pusher email address unless it is from the domain NO_REPLY_ADDRESS,
  # which should match the value of that variable on the gitea 'app.ini' (it
  # is the domain used for emails when the user hides it).
  # shellcheck disable=SC2154
  if [ -n "${gt_pusher_email##*@"${NO_REPLY_ADDRESS}"}" ] &&
    [ -z "${gt_pusher_email##*@*}" ]; then
    _user_email="\"$gt_pusher_full_name <$gt_pusher_email>\""
  fi
  if [ "$_addr" ] && [ "$_user_email" ]; then
    echo "$_addr,$_user_email"
  elif [ "$_user_email" ]; then
    echo "$_user_email"
  elif [ "$_addr" ]; then
    echo "$_addr"
  fi
}
mail_success() {
  to_addr="$MAIL_TO_ADDR"
  if [ "$MAIL_LOGFILE" = "true" ]; then
    to_addr="$(print_mailto "$to_addr")"
  fi
  if [ "$to_addr" ]; then
    # shellcheck disable=SC2154
    subject="OK - $gt_repo_name updated to commit '$gt_after'"
    mail -s "${MAIL_PREFIX}${subject}" "$to_addr" \
      <"$WEBHOOK_LOGFILE_PATH"
  fi
}
mail_failure() {
  to_addr="$MAIL_TO_ADDR"
  if [ "$MAIL_ERRFILE" = true ]; then
    to_addr="$(print_mailto "$to_addr")"
  fi
  if [ "$to_addr" ]; then
    # shellcheck disable=SC2154
    subject="KO - $gt_repo_name update FAILED for commit '$gt_after'"
    mail -s "${MAIL_PREFIX}${subject}" "$to_addr" \
      <"$WEBHOOK_LOGFILE_PATH"
  fi
}
# ----
# MAIN
# ----
# Check directories
webhook_check_directories
# Go to the base directory
cd "$BASE_DIR"
# Check if the file exists
WEBHOOK_JSON_INPUT_FILE="$1"
if [ ! -f "$WEBHOOK_JSON_INPUT_FILE" ]; then
webhook_reject "Input arg '$1' is not a file, aborting"
fi
# Parse the file
webhook_log "Processing file '$WEBHOOK_JSON_INPUT_FILE'"
eval "$(jq -r "$ENV_VARS_QUERY" "$WEBHOOK_JSON_INPUT_FILE")"
# Check that the repository clone url is right
# shellcheck disable=SC2154
if [ "$gt_repo_clone_url" != "$REPO_CLONE_URL" ]; then
webhook_reject "Wrong repository: '$gt_clone_url'"
fi
# Check that the branch is the right one
# shellcheck disable=SC2154
if [ "$gt_ref" != "$REPO_REF" ]; then
webhook_reject "Wrong repository ref: '$gt_ref'"
fi
# Accept the file
# shellcheck disable=SC2154
webhook_accept "Processing '$gt_repo_name'"
# Update the checkout
ret="0"
git fetch >>"$WEBHOOK_LOGFILE_PATH" 2>&1 || ret="$?"
if [ "$ret" -ne "0" ]; then
webhook_troubled "Repository fetch failed"
mail_failure
fi
# shellcheck disable=SC2154
git checkout "$gt_after" >>"$WEBHOOK_LOGFILE_PATH" 2>&1 ret="$?"
if [ "$ret" -ne "0" ]; then
webhook_troubled "Repository checkout failed"
mail_failure
fi
# Remove the build dir if present
if [ -d "$PUBLIC_DIR" ]; then
rm -rf "$PUBLIC_DIR"
fi
# Build site
docker compose run hugo -- >>"$WEBHOOK_LOGFILE_PATH" 2>&1 || ret="$?"
# go back to the main branch
git switch main && git pull
# Fail if public dir was missing
if [ "$ret" -ne "0" ] [ ! -d "$PUBLIC_DIR" ]; then
webhook_troubled "Site build failed"
mail_failure
fi
# Remove old public_html copies
webhook_log 'Removing old site versions, if present'
find "$NGINX_BASE_DIR" -mindepth 1 -maxdepth 1 -name 'public_html-*' -type d \
  -exec rm -rf {} \; >>"$WEBHOOK_LOGFILE_PATH" 2>&1 || ret="$?"
if [ "$ret" -ne "0" ]; then
webhook_troubled "Removal of old site versions failed"
mail_failure
fi
# Switch site directory
TS="$(date +%Y%m%d-%H%M%S)"
if [ -d "$PUBLIC_HTML_DIR" ]; then
webhook_log "Moving '$PUBLIC_HTML_DIR' to '$PUBLIC_HTML_DIR-$TS'"
mv "$PUBLIC_HTML_DIR" "$PUBLIC_HTML_DIR-$TS" >>"$WEBHOOK_LOGFILE_PATH" 2>&1
ret="$?"
fi
if [ "$ret" -eq "0" ]; then
webhook_log "Moving '$PUBLIC_DIR' to '$PUBLIC_HTML_DIR'"
mv "$PUBLIC_DIR" "$PUBLIC_HTML_DIR" >>"$WEBHOOK_LOGFILE_PATH" 2>&1
ret="$?"
fi
if [ "$ret" -ne "0" ]; then
webhook_troubled "Site switch failed"
mail_failure
else
webhook_deployed "Site deployed successfully"
mail_success
fi
# ----
# vim: ts=2:sw=2:et:ai:sts=2
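To smoke-test the deployment script by hand, it can be fed a minimal gitea-like JSON payload. This is only a sketch: the script path below is hypothetical (use whatever $WEBHOOK_COMMAND points to in the spooler) and the commit hash is made up, so the run should end up in the troubled spool once the checkout fails, which still exercises most of the plumbing:
cat >/tmp/test-webhook.json <<'EOF'
{
  "ref": "refs/heads/main",
  "after": "0123456789abcdef0123456789abcdef01234567",
  "repository": {
    "clone_url": "https://gitea.mixinet.net/mixinet/blogops.git",
    "name": "blogops"
  },
  "pusher": { "full_name": "Test User", "email": "test@example.com" }
}
EOF
/srv/blogops/webhook/bin/blogops-deploy.sh /tmp/test-webhook.json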
Needs to be able to create two copies always. Can get stuck in irreversible read-only mode if only one copy can be made.

Even as of now, RAID-1 and RAID-10 have this note:

The simple redundancy RAID levels utilize different mirrors in a way that does not achieve the maximum performance. The logic can be improved so the reads will spread over the mirrors evenly or based on device congestion.

Granted, that's not a stability concern anymore, just performance. A reviewer of a draft of this article actually claimed that BTRFS only reads from one of the drives, which hopefully is inaccurate, but goes to show how confusing all this is. There are other warnings in the Debian wiki that are quite scary. Even the legendary Arch wiki has a warning on top of their BTRFS page, still. Even if those issues are now fixed, it can be hard to tell when they were fixed. There is a changelog by feature but it explicitly warns that it doesn't know "which kernel version it is considered mature enough for production use", so it's also useless for this. It would have been much better if BTRFS had been released into the world only once those bugs were completely fixed. Or if, at least, features were announced only when they were stable, not just "we merged to mainline, good luck". Even now, we get mixed messages even in the official BTRFS documentation, which says "The Btrfs code base is stable" (main page) while at the same time clearly stating unstable parts in the status page (currently RAID56). There are much harsher BTRFS critics than me out there so I will stop here, but let's just say that I feel a little uncomfortable trusting server data with full RAID arrays to BTRFS. But surely, for a workstation, things should just work smoothly... Right? Well, let's see the snags I hit.
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 931,5G 0 disk
sda1 8:1 0 200M 0 part /boot/efi
sda2 8:2 0 1G 0 part /boot
sda3 8:3 0 7,8G 0 part
fedora_swap 253:5 0 7.8G 0 crypt [SWAP]
sda4 8:4 0 922,5G 0 part
fedora_crypt 253:4 0 922,5G 0 crypt /
(This might not entirely be accurate: I rebuilt this from the Debian
side of things.)
This is pretty straightforward, except for the swap partition:
normally, I just treat swap like any other logical volume and create
it in a logical volume. This is now just speculation, but I bet it was
setup this way because "swap" support was only added in BTRFS 5.0.
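(For the record, with a recent enough kernel a swapfile on BTRFS looks roughly like this; this is a sketch of the generally documented procedure, not something I have tested on this machine, and the file must live on a non-snapshotted, non-compressed subvolume:)
truncate -s 0 /swapfile
chattr +C /swapfile      # the file must be NOCOW before it gets any data
fallocate -l 8G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile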
I fully expect BTRFS experts to yell at me now because this is an old
setup and BTRFS is so much better now, but that's exactly the point
here. That setup is not that old (2018? old? really?), and migrating
to a new partition scheme isn't exactly practical right now. But let's
move on to more practical considerations.
To compare, this is what a more traditional LVM-based server setup looks like:

- two physical disks, /dev/nvme0n1 and nvme1n1
- partitions on those assembled into a RAID-1 array, /dev/md1
- that array encrypted with LUKS and used as a physical volume for the vg_tbbuild05 volume group (multiple PVs can be added to a single VG which is why there is that abstraction)
- logical volumes carved out of that VG for /, swap and /srv

Here's the relevant lsblk output:

NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme0n1 259:0 0 1.7T 0 disk
nvme0n1p1 259:1 0 8M 0 part
nvme0n1p2 259:2 0 512M 0 part
md0 9:0 0 511M 0 raid1 /boot
nvme0n1p3 259:3 0 1.7T 0 part
md1 9:1 0 1.7T 0 raid1
crypt_dev_md1 253:0 0 1.7T 0 crypt
vg_tbbuild05-root 253:1 0 30G 0 lvm /
vg_tbbuild05-swap 253:2 0 125.7G 0 lvm [SWAP]
vg_tbbuild05-srv 253:3 0 1.5T 0 lvm /srv
nvme0n1p4 259:4 0 1M 0 part
I stripped the other nvme1n1
disk because it's basically the same.
Now, if we look at my BTRFS-enabled workstation, which doesn't even
have RAID, we have the following:
- a single disk, /dev/sda, with, again, /dev/sda4 being where BTRFS lives
- a LUKS device, fedora_crypt, which is, confusingly, kind of like a volume group. it's where everything lives. i think.
- subvolumes like home, root, /, etc. those are actually the things that get mounted. you'd think you'd mount a filesystem, but no, you mount a subvolume. that is backwards.

And this is what it looks like in lsblk:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 931,5G 0 disk
sda1 8:1 0 200M 0 part /boot/efi
sda2 8:2 0 1G 0 part /boot
sda3 8:3 0 7,8G 0 part [SWAP]
sda4 8:4 0 922,5G 0 part
fedora_crypt 253:4 0 922,5G 0 crypt /srv
Notice how we don't see all the BTRFS volumes here? Maybe it's because
I'm mounting this from the Debian side, but lsblk
definitely gets
confused here. I frankly don't quite understand what's going on, even
after repeatedly looking around the rather dismal
documentation. But that's what I gather from the following
commands:
root@curie:/home/anarcat# btrfs filesystem show
Label: 'fedora' uuid: 5abb9def-c725-44ef-a45e-d72657803f37
Total devices 1 FS bytes used 883.29GiB
devid 1 size 922.47GiB used 916.47GiB path /dev/mapper/fedora_crypt
root@curie:/home/anarcat# btrfs subvolume list /srv
ID 257 gen 108092 top level 5 path home
ID 258 gen 108094 top level 5 path root
ID 263 gen 108020 top level 258 path root/var/lib/machines
I only got to that point through trial and error. Notice how I use an
existing mountpoint to list the related subvolumes. If I try to use
the filesystem path, the one that's listed in filesystem show
, I
fail:
root@curie:/home/anarcat# btrfs subvolume list /dev/mapper/fedora_crypt
ERROR: not a btrfs filesystem: /dev/mapper/fedora_crypt
ERROR: can't access '/dev/mapper/fedora_crypt'
Maybe I just need to use the label? Nope:
root@curie:/home/anarcat# btrfs subvolume list fedora
ERROR: cannot access 'fedora': No such file or directory
ERROR: can't access 'fedora'
This is really confusing. I don't even know if I understand this
right, and I've been staring at this all afternoon. Hopefully, the
lazyweb will correct me eventually.
(As an aside, why are they called "subvolumes"? If something is a
"sub" of "something else", that "something else" must exist
right? But no, BTRFS doesn't have "volumes", it only has
"subvolumes". Go figure. Presumably the filesystem still holds "files"
though, at least empirically it doesn't seem like it lost anything so
far.)
In any case, at least I can refer to this section in the future, the
next time I fumble around the btrfs
commandline, as I surely will. I
will possibly even update this section as I get better at it, or based
on my reader's judicious feedback.
Anyways, here's the relevant line in /etc/fstab, on the Debian side of things:
UUID=5abb9def-c725-44ef-a45e-d72657803f37 /srv btrfs defaults 0 2
This thankfully ignores all the subvolume nonsense because it relies
on the UUID. mount
tells me that's actually the "root" (? /
?)
subvolume:
root@curie:/home/anarcat# mount | grep /srv
/dev/mapper/fedora_crypt on /srv type btrfs (rw,relatime,space_cache,subvolid=5,subvol=/)
Let's see if I can mount the other volumes I have on there. Remember
that subvolume list
showed I had home
, root
, and
var/lib/machines
. Let's try root
:
mount -o subvol=root /dev/mapper/fedora_crypt /mnt
Interestingly, root
is not the same as /
, it's a different
subvolume! It seems to be the Fedora root (/
, really) filesystem. No
idea what is happening here. I also have a home
subvolume, let's
mount it too, for good measure:
mount -o subvol=home /dev/mapper/fedora_crypt /mnt/home
Note that lsblk
doesn't notice those two new mountpoints, and that's
normal: it only lists block devices and subvolumes (rather
inconveniently, I'd say) do not show up as devices:
root@curie:/home/anarcat# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 931,5G 0 disk
sda1 8:1 0 200M 0 part
sda2 8:2 0 1G 0 part
sda3 8:3 0 7,8G 0 part
sda4 8:4 0 922,5G 0 part
fedora_crypt 253:4 0 922,5G 0 crypt /srv
This is really, really confusing. Maybe I did something wrong in the
setup. Maybe it's because I'm mounting it from outside Fedora. Either
way, it just doesn't feel right.
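(If I wanted those subvolume mounts to be permanent, the same subvol= option should presumably work from /etc/fstab too; an untested sketch, reusing the UUID from above.)
UUID=5abb9def-c725-44ef-a45e-d72657803f37 /mnt      btrfs defaults,subvol=root 0 2
UUID=5abb9def-c725-44ef-a45e-d72657803f37 /mnt/home btrfs defaults,subvol=home 0 2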
root@curie:/home/anarcat# df -h /srv /mnt /mnt/home
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/fedora_crypt 923G 886G 31G 97% /srv
/dev/mapper/fedora_crypt 923G 886G 31G 97% /mnt
/dev/mapper/fedora_crypt 923G 886G 31G 97% /mnt/home
(Notice, in passing, that it looks like the same filesystem is mounted
in different places. In that sense, you'd expect /srv
and /mnt
(and /mnt/home
?!) to be exactly the same, but no: they are entirely
different directory structures, which I will not call "filesystems"
here because everyone's head will explode in sparks of confusion.)
Yes, disk space is shared (that's the Size
and Avail
columns,
makes sense). But nope, no cookie for you: they all have the same
Used
columns, so you need to actually walk the entire filesystem to
figure out what each disk takes.
(For future reference, that's basically:
root@curie:/home/anarcat# time du -schx /mnt/home /mnt /srv
124M /mnt/home
7.5G /mnt
875G /srv
883G total
real 2m49.080s
user 0m3.664s
sys 0m19.013s
And yes, that was painfully slow.)
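(Another workaround, which I have not benchmarked, is to turn on quota accounting, since qgroups do track per-subvolume usage:
btrfs quota enable /srv
btrfs qgroup show /srv   # referenced/exclusive space, one line per subvolume
That obviously comes with whatever overhead quotas add.)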
ZFS actually has some oddities in that regard, but at least it tells
me how much disk each volume (and snapshot) takes:
root@tubman:~# time df -t zfs -h
Filesystem Size Used Avail Use% Mounted on
rpool/ROOT/debian 3.5T 1.4G 3.5T 1% /
rpool/var/tmp 3.5T 384K 3.5T 1% /var/tmp
rpool/var/spool 3.5T 256K 3.5T 1% /var/spool
rpool/var/log 3.5T 2.0G 3.5T 1% /var/log
rpool/home/root 3.5T 2.2G 3.5T 1% /root
rpool/home 3.5T 256K 3.5T 1% /home
rpool/srv 3.5T 80G 3.5T 3% /srv
rpool/var/cache 3.5T 114M 3.5T 1% /var/cache
bpool/BOOT/debian 571M 90M 481M 16% /boot
real 0m0.003s
user 0m0.002s
sys 0m0.000s
That's 56360 times faster, by the way.
But yes, that's not fair: those in the know will know there's a
different command to do what df
does with BTRFS filesystems, the
btrfs filesystem usage
command:
root@curie:/home/anarcat# time btrfs filesystem usage /srv
Overall:
Device size: 922.47GiB
Device allocated: 916.47GiB
Device unallocated: 6.00GiB
Device missing: 0.00B
Used: 884.97GiB
Free (estimated): 30.84GiB (min: 27.84GiB)
Free (statfs, df): 30.84GiB
Data ratio: 1.00
Metadata ratio: 2.00
Global reserve: 512.00MiB (used: 0.00B)
Multiple profiles: no
Data,single: Size:906.45GiB, Used:881.61GiB (97.26%)
/dev/mapper/fedora_crypt 906.45GiB
Metadata,DUP: Size:5.00GiB, Used:1.68GiB (33.58%)
/dev/mapper/fedora_crypt 10.00GiB
System,DUP: Size:8.00MiB, Used:128.00KiB (1.56%)
/dev/mapper/fedora_crypt 16.00MiB
Unallocated:
/dev/mapper/fedora_crypt 6.00GiB
real 0m0,004s
user 0m0,000s
sys 0m0,004s
Almost as fast as ZFS's df! Good job. But wait. That doesn't actually
tell me usage per subvolume. Notice it's filesystem usage
, not
subvolume usage
, which unhelpfully refuses to exist. That command
only shows that one "filesystem's" internal statistics, which are pretty
opaque. You can also appreciate that it's wasting 6GB of
"unallocated" disk space there: I probably did something Very Wrong
and should be punished by Hacker News. I also wonder why it has 1.68GB
of "metadata" used...
At this point, I just really want to throw that thing out of the
window and restart from scratch. I don't really feel like learning the
BTRFS internals, as they seem oblique and completely bizarre to me. It
feels a little like the state of PHP now: it's actually pretty solid,
but built upon so many layers of cruft that I still feel it corrupts
my brain every time I have to deal with it (needle or haystack first?
anyone?)...
last_job_sync
to synchronize job dependencies of the previous
submission. Although the DRM scheduler guarantees the order of starting to
execute a job in the same queue in the kernel space, the order of completion
isn't predictable. On the other hand, we still needed to use syncobjs to follow
job completion since we have event threads on the CPU side. Therefore, a more
accurate implementation requires last_job syncobjs to track when each engine
(CL, TFU, and CSD) is idle. We also needed to keep the driver working on
previous versions of v3d kernel-driver with single semaphores, so we kept
tracking ANY last_job_sync
to preserve the previous implementation.
This was waiting for multisync support in the v3d kernel, which is already available. Exposing this feature, however, enabled a few more CTS tests that exposed pre-existing bugs in the user-space driver, so we fix those here before exposing the feature.
This should give you emulated timeline semaphores for free and kernel-assisted sharable timeline semaphores for cheap once you have the kernel interface wired in.
# Download a binary device tree file and matching kernel a good soul uploaded to github
wget https://github.com/vfdev-5/qemu-rpi2-vexpress/raw/master/kernel-qemu-4.4.1-vexpress
wget https://github.com/vfdev-5/qemu-rpi2-vexpress/raw/master/vexpress-v2p-ca15-tc1.dtb
# Download the official Raspbian image without X
wget https://downloads.raspberrypi.org/raspios_lite_armhf/images/raspios_lite_armhf-2022-04-07/2022-04-04-raspios-bullseye-armhf-lite.img.xz
unxz 2022-04-04-raspios-bullseye-armhf-lite.img.xz
# Convert it from the raw image to a qcow2 image and add some space
qemu-img convert -f raw -O qcow2 2022-04-04-raspios-bullseye-armhf-lite.img rasbian.qcow2
qemu-img resize rasbian.qcow2 4G
# make sure we get a user account setup
echo "me:$(echo 'test123' openssl passwd -6 -stdin)" > userconf
sudo guestmount -a rasbian.qcow2 -m /dev/sda1 /mnt
sudo mv userconf /mnt
sudo guestunmount /mnt
# start qemu
qemu-system-arm -m 2048M -M vexpress-a15 -cpu cortex-a15 \
-kernel kernel-qemu-4.4.1-vexpress -no-reboot \
-smp 2 -serial stdio \
-dtb vexpress-v2p-ca15-tc1.dtb -sd rasbian.qcow2 \
-append "root=/dev/mmcblk0p2 rw rootfstype=ext4 console=ttyAMA0,15200 loglevel=8" \
-nic user,hostfwd=tcp::5555-:22
# login at the serial console as user me with password test123
sudo -i
# enable ssh
systemctl enable ssh
systemctl start ssh
# resize partition and filesystem
parted /dev/mmcblk0 resizepart 2 100%
resize2fs /dev/mmcblk0p2
Now I can login via ssh and start to play:
ssh me@localhost -p 5555
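The same forwarded port also works for copying files into the guest, for example (file name made up):
scp -P 5555 some-file me@localhost: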
mklabel gpt
mkpart EFI fat32 1 99
mkpart boot ext3 99 300
toggle 1 boot
toggle 1 esp
p
# Model: CT1000P1SSD8 (nvme)
# Disk /dev/nvme1n1: 1000GB
# Sector size (logical/physical): 512B/512B
# Partition Table: gpt
# Disk Flags:
#
# Number  Start   End     Size    File system  Name  Flags
#  1      1049kB  98.6MB  97.5MB  fat32        EFI   boot, esp
#  2      98.6MB  300MB   201MB   ext3         boot
q
Here are the commands needed to create the filesystems and install the necessary files. This is almost to the stage of being scriptable. Some minor changes need to be made to convert from NVMe device names to SATA/SAS but nothing serious.
mkfs.vfat /dev/nvme1n1p1
mkfs.ext3 -N 1000 /dev/nvme1n1p2
file -s /dev/nvme1n1p2 | sed -e s/^.*UUID/UUID/ -e "s/ .*$/ \/boot ext3 noatime 0 1/" >> /etc/fstab
file -s /dev/nvme1n1p1 | tr "[a-f]" "[A-F]" | sed -e s/^.*numBEr.0x/UUID=/ -e "s/, .*$/ \/boot\/efi vfat umask=0077 0 1/" >> /etc/fstab
# edit /etc/fstab to put a hyphen between the 2 groups of 4 chars for the VFAT filesystem UUID
mount /boot
mkdir -p /boot/efi /boot/grub
mount /boot/efi
mkdir -p /boot/efi/EFI/debian
apt install efibootmgr shim-unsigned grub-efi-amd64
cp /usr/lib/shim/* /usr/lib/grub/x86_64-efi/monolithic/grubx64.efi /boot/efi/EFI/debian
file -s /dev/nvme1n1p2 | sed -e "s/^.*UUID=/search.fs_uuid /" -e "s/ .needs.*$/ root hd0,gpt2/" > /boot/efi/EFI/debian/grub.cfg
echo "set prefix=(\$root)'/boot/grub'" >> /boot/efi/EFI/debian/grub.cfg
echo "configfile \$prefix/grub.cfg" >> /boot/efi/EFI/debian/grub.cfg
grub-install
update-grub
If someone would like to make a script that can handle the different partition names of regular SCSI/SATA disks, NVMe, CCISS, etc then that would be great. It would be good to have a script in Debian that creates the partitions and sets up the EFI files. If you want to have a second bootable device then the following commands will copy a GPT partition table and give it new UUIDs; make very certain that $DISKB is the one you want to be wiped and refer to my previous mention of parted -l. Also note that parted has a rescue command which works very well.
sgdisk /dev/$DISKA -R /dev/$DISKB
sgdisk -G /dev/$DISKB
To backup a GPT partition table run a command like this. Note that if sgdisk is told to backup a MBR partitioned disk it will say "Found invalid GPT and valid MBR; converting MBR to GPT format" which is probably a viable way of converting MBR format to GPT.
sgdisk -b sda.bak /dev/sda
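Restoring such a backup later is the mirror operation, assuming the backup file created above:
sgdisk --load-backup=sda.bak /dev/sda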
Various efforts towards build verifiability have been made to C/C++-based systems, yet the techniques for Java-based systems are not systematic and are often specific to a particular build tool (eg. Maven). In this study, we present a systematic approach towards build verifiability on Java-based systems.
We first define the problem, and then provide insight into the challenges of making real-world software build in a reproducible manner, that is, when every build generates bit-for-bit identical results. Through the experience of the Reproducible Builds project making the Debian Linux distribution reproducible, we also describe the affinity between reproducibility and quality assurance (QA).
There was also a discussion on our mailing list regarding the SOURCE_DATE_EPOCH specification, related to formats that cannot help embedding potentially timezone-specific timestamps. (Full thread index.)
Versions 203, 204, 205 and 206 of diffoscope were uploaded to Debian unstable this month, along with the following changes to the code itself:
- Fix a file(1)-related regression where Debian .changes files that contained non-ASCII text were not identified as such, therefore resulting in seemingly arbitrary packages not actually comparing the nested files themselves. The non-ASCII parts were typically in the Maintainer field or in the changelog text. [ ][ ]
- If we do not have binwalk, return False from BinwalkFile.recognizes. [ ]
- If we do not have binwalk, don't report that we are missing the Python rpm module! [ ]
- Ensure the diffoscope and diffoscope-minimal packages have the same version. [ ]
Vagrant started a discussion on the debian-devel mailing list after noticing that the binutils source package contained unreproducible logs in one of its binary packages. Vagrant expanded the discussion to one about all kinds of build metadata in packages, outlining a number of potential solutions that support reproducible builds and arbitrary metadata.
Vagrant also started a discussion on debian-devel
after identifying a large number of packages that embed build paths via RPATH when building with CMake, including a list of packages (grouped by Debian maintainer) affected by this issue. Maintainers were requested to check whether their package still builds correctly when passing the -DCMAKE_BUILD_RPATH_USE_ORIGIN=ON
directive.
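For a quick local check, the directive can also be passed to a plain CMake invocation, something along these lines (directory names are made up):
cmake -DCMAKE_BUILD_RPATH_USE_ORIGIN=ON -B build -S .
cmake --build build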
On our mailing list this month, kpcyrd announced the release of rebuilderd-debian-buildinfo-crawler, a tool that parses the Packages.xz Debian package index file, attempts to discover the right .buildinfo file from buildinfos.debian.net and outputs it in a format that can be understood by rebuilderd. The tool, which is available on GitHub, solves a problem regarding correlating Debian version numbers with their builds.
bauen1 provided two patches for debian-cd, the software used to make Debian installer images. This involved passing --invariant
and -i deb00001
to mkfs.msdos(8)
and avoiding embedding timestamps into the gzipped Packages
and Translations
files. After some discussion, the patches in question were merged and will be included in debian-cd version 3.1.36.
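For reference, the resulting mkfs.msdos invocation looks roughly like this; the image name and size are made up for illustration:
truncate -s 16M efi.img
mkfs.msdos --invariant -i deb00001 efi.img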
Roland Clobus wrote another in-depth status update about the status of live Debian images, summarising the current situation: all major desktops build reproducibly with bullseye, bookworm and sid.
The python3.10 package was uploaded to Debian by doko, fixing an issue where .pyc files were not reproducible because the elements in frozenset data structures were not ordered reproducibly. This meant that creating a bit-for-bit reproducible Debian chroot which included .pyc files was not possible. As of writing, the only remaining unreproducible part of a standard chroot is man-db, but Guillem Jover has a patch for update-alternatives which will likely be part of the next release of dpkg.
Elsewhere in Debian, 139 reviews of Debian packages were added, 29 were updated and 17 were removed this month, adding to our knowledge about identified issues. A large number of issue types have been updated too, including the addition of captures_kernel_variant
, erlang_escript_file
, captures_build_path_in_r_rdb_rds_databases
, captures_build_path_in_vo_files_generated_by_coq
and build_path_in_vo_files_generated_by_coq
.
The contributors.sh Bash/shell script was also converted into a Python script. [ ][ ][ ]

A large number of patches addressing reproducibility issues were written this month as well, including for the following packages:

- btop (sort-related issue)
- complexity (date)
- giac (update the version with upstreamed date patch)
- htcondor (use CMake timestamp)
- libint (readdir system call related)
- libnet (date-related issue)
- librime-lua (sort filesystem ordering)
- linux_logo (sort-related issue)
- micro-editor (date-related issue)
- openvas-smb (date-related issue)
- ovmf (sort-related issue)
- paperjam (date-related issue)
- python-PyQRCode (date-related issue)
- quimb (single-CPU build failure)
- radare2 (Meson date/time-related issue)
- radare2 (rework SOURCE_DATE_EPOCH usage to be portable)
- siproxd (date, with Sebastian Kemper + follow-up)
- xonsh (Address Space Layout Randomisation-related issue)
- xsnow (date & tar(1)-related issue)
- zip (toolchain issue related to filesystem ordering)
- ltsp (forwarded upstream)

as well as for: pcmemtest, hatchling, mpl-sphinx-theme (forwarded upstream), gap-hapcryst, tree-puzzle, jcabi-aspects, paper-icon-theme, wcwidth, xir, xir, ruby-github-markup, ruby-tioga, btop, libadwaita-1, snibbetracker, cctbx, mdnsd, gmerlin, beav, krita, qt6-base, onevpl-intel-gpu, ruby3.0, nix, foma and ruby3.0.

openwrt.git repository the next day.
There were also the usual changes to the project's testing framework this month, including work related to:

- useradd warnings when building packages. [ ]
- armhf architecture nodes, to add a hint to where nodes named virt-* are. [ ]
- the logrotate and man-db services. [ ]

Finally, you can get in touch with the Reproducible Builds project via:

- IRC: #reproducible-builds on irc.oftc.net
- Mailing list: rb-general@lists.reproducible-builds.org