After years of thinking about and learning about how radios work, I figured it
was high-time to start to more aggressively share the things i ve been
learning. I had a ton of fun at DistrictCon
year 0, so it was a pretty natural place to pitch an RF-focused introductory
talk.
I was selected for Year 1, and able to give my first ever RF related talk about
how to set off restaurant pagers (including one on stage!) by reading and
writing IQ directly using a little bit of stdlib only Python.
This talk is based around the work I ve written about previously
(here, here and
here), but the all-in-one form factor was
something I was hoping would help encourage folks out there to take a look
under the hood of some of the gear around them.
(In case the iframe above isn t working, direct link to the YouTube video
recording is here)
I ve posted my slides from the talk at
PARCH.pdf to hopefully give folks some time
to flip through them directly.
All in all, the session was great It was truely humbling to see so many
folks interested in hearing me talk about radios. I had a bit of an own-goal in
picking a 20 minute form-factor, so the talk is paced wrong (it feels like it
went way too fast). Hopefully being able to see the slides and pause the video
is helpful.
We had a short ad-hoc session after where I brought two sets of pagers and my
power switch; but unfortunately we didn t have anyone who was able to trigger
any of the devices on their own (due to a mix of time between sessions and
computer set-up). Hopefully it was enough to get folks interested in trying
this on their own!
I ve written down a new rule (no name, sorry) that I ll be repeating to myself
and those around me. If you can replace DNS with key value store mapping
a name to an ip and it still makes sense, it was not, in fact, DNS. Feel
free to repeat it along with me.
Sure, the It s always DNS meme is funny the first few hundred times you see
it but what s less funny is when critical thinking ends because a DNS query
is involved. DNS failures are often the first observable problem because
it s one of the first things that needs to be done. DNS is fairly complicated,
implementation-dependent, and at times frustrating to debug but it is not
the operational hazard it s made out to be. It s at best a shallow take, and at
worst actively holding teams back from understanding their true operational
risks.
IP connectivity failures between a host and the rest of the network is not a
reason to blame DNS. This would happen no matter how you distribute the updated
name to IP mappings. Wiping out
all the records during the course of operations due to an automation bug
is not a reason to blame DNS. This, too, would happen no matter how you
distribute the name to IP mappings. Something made the choice to delete all the
mappings, and it did what you asked it to do
There s plenty of annoying DNS specific sharp edges to blame when things do
go wrong (like 8.8.8.8 and 1.1.1.1 disagreeing about resolving a domain
because of DNSSEC, or since we re on the topic, a
DNSSEC rollout bricking prod for hours)
for us to be cracking jokes anytime a program makes a DNS request.
We can do better.
Off-and-on over the last several weeks, I've been spending time trying to learn/understand YubiKeys better, especially from the perspective of ECDSA and signing.
I had a good mental model for how "slots" work (canonically referenced by their hexadecimal names such as 9C), but found that it had a gap related to "objects"; while closing that, I was annoyed that the main reference table for this gap lives primarily in either a PDF or insideseveralimplementations, so I figured I should create the reference I want to see in the world, but that it would also be useful to write down some of my understanding for my own (and maybe others') future reference.
So, to that end, I'm going to start with a bit ( ) of background information, with the heavy caveat that this only applies to "PIV" ("FIPS 201") usage of YubiKeys, and that I only actually care about ECDSA, although I've been reassured that it's the same for at least RSA (anything outside this is firmly Here Be Not Tianon; "gl hf dd").
(Incidentally, learning all this helped me actually appreciate the simplicity of cloud-based KMS solutions, which was an unexpected side effect. )
At a really high level, ECDSA is like many other (asymmetric) cryptographic solutions you've got a public key and a private key, the private key can be used to "sign" data (tiny amounts of data, in fact, like P-256 can only reasonably sign 256 bits of data, which is where cryptographic hashes like SHA256 come in as secure analogues for larger data in small bit sizes), and the public key can then be used to verify that the data was indeed signed by the private key, and only someone with the private key could've done so. There's some complex math and RNGs involved, but none of that's actually relevant to this post, so find that information elsewhere.
Unfortunately, this is where things go off the rails: PIV is X.509 ("x509") heavy, and there's no X.509 in the na ve view of my use case.
In a YubiKey (or any other PIV-signing-supporting smart card? do they actually have competitors in this specific niche? ), a given "slot" can hold one single private key. There are ~24 slots which can hold a private key and be used for signing, although "Slot 9c" is officially designated as the "Digital Signature" slot and is encouraged for signing purposes.
One of the biggest gotchas is that with pure-PIV (and older YubiKey firmware ) the public key for a given slot is only available at the time the key is generated, and the whole point of the device in the first place is that the private key is never, ever available from it (all cryptographic operations happen inside the device), so if you don't save that public key when you first ask the device to generate a private key in a particular slot, the public key is lost forever (asterisk).
$# generate a new ECDSA P-256 key in "slot 9c" ("Digital Signature")$# WARNING: THIS WILL GLEEFULLY WIPE SLOT 9C WITHOUT PROMPTING$yubico-piv-tool --slot 9c --algorithm ECCP256 --action generate
-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEtGoWRGyjjUlJFXpu8BL6Rnx8jjKR
5+Mzl2Vepgor+k7N9q7ppOtSMWefjFVR0SEPmXqXINNsCi6LpLtNEigIRg==
-----END PUBLIC KEY-----
Successfully generated a new private key.
$# this is the only time/place we (officially) get this public key
With that background, now let's get to the second aspect of "slots" and how X.509 fits. For every aforementioned slot, there is a corresponding "object" (read: place to store arbitrary data) which is corresponding only by convention. For all these "key" slots the (again, by convention) corresponding "object" is explicitly supposed to be an X.509 certificate (see also the PDF reference linked above).
It turns out this is a useful and topical place to store that public key we need to keep handy! It's also an interesting place to shove additional details about what the key in a given slot is being used for, if that's your thing. Converting the raw public key into a (likely self-signed) X.509 certificate is an exercise for the reader, but if you want to follow the conventions, you need some way to convert a given "slot" to the corresponding "object", and that is the lookup table I wish existed in more forms.
So, without further ado, here is the anti-climax:
Slot
Object
Description
0x9A
0x5FC105
X.509 Certificate for PIV Authentication
0x9E
0x5FC101
X.509 Certificate for Card Authentication
0x9C
0x5FC10A
X.509 Certificate for Digital Signature
0x9D
0x5FC10B
X.509 Certificate for Key Management
0x82
0x5FC10D
Retired X.509 Certificate for Key Management 1
0x83
0x5FC10E
Retired X.509 Certificate for Key Management 2
0x84
0x5FC10F
Retired X.509 Certificate for Key Management 3
0x85
0x5FC110
Retired X.509 Certificate for Key Management 4
0x86
0x5FC111
Retired X.509 Certificate for Key Management 5
0x87
0x5FC112
Retired X.509 Certificate for Key Management 6
0x88
0x5FC113
Retired X.509 Certificate for Key Management 7
0x89
0x5FC114
Retired X.509 Certificate for Key Management 8
0x8A
0x5FC115
Retired X.509 Certificate for Key Management 9
0x8B
0x5FC116
Retired X.509 Certificate for Key Management 10
0x8C
0x5FC117
Retired X.509 Certificate for Key Management 11
0x8D
0x5FC118
Retired X.509 Certificate for Key Management 12
0x8E
0x5FC119
Retired X.509 Certificate for Key Management 13
0x8F
0x5FC11A
Retired X.509 Certificate for Key Management 14
0x90
0x5FC11B
Retired X.509 Certificate for Key Management 15
0x91
0x5FC11C
Retired X.509 Certificate for Key Management 16
0x92
0x5FC11D
Retired X.509 Certificate for Key Management 17
0x93
0x5FC11E
Retired X.509 Certificate for Key Management 18
0x94
0x5FC11F
Retired X.509 Certificate for Key Management 19
0x95
0x5FC120
Retired X.509 Certificate for Key Management 20
See also "piv-objects.json" for a machine-readable copy of this data.
(Major thanks to paultag and jongzipjohnson for helping me learn and generally putting up with me, but especially dealing with my live-stream-of-thoughts while I stumble through the dark. )
The Internet has changed a lot in the last 40+ years. Fads have come and gone.
Network protocols have been designed, deployed, adopted, and abandoned.
Industries have come and gone. The types of people on the internet have changed
a lot. The number of people on the internet has changed a lot, creating an
information medium unlike anything ever seen before in human history. There s a
lot of good things about the Internet as of 2025, but there s also an
inescapable hole in what it used to be, for me.
I miss being able to throw a site up to send around to friends to play with
without worrying about hordes of AI-feeding HTML combine harvesters DoS-ing my
website, costing me thousands in network transfer for the privilege. I miss
being able to put a lightly authenticated game server up and not worry too much
at night wondering if that process is now mining bitcoin. I miss being able
to run a server in my home closet. Decades of cat and mouse games have rendered
running a mail server nearly impossible. Those who are brave enough to try
are met with weekslong stretches of delivery failures and countless hours
yelling ineffectually into a pipe that leads from the cheerful lobby of some
disinterested corporation directly into a void somewhere 4 layers below ground
level.
I miss the spirit of curiosity, exploration, and trying new things. I miss
building things for fun without having to worry about being too successful,
after which security offices start demanding my supplier paperwork in
triplicate as heartfelt thanks from their engineering teams. I miss communities
that are run because it is important to them, not for ad revenue. I miss
community operated spaces and having more than four websites that are all full
of nothing except screenshots of each other.
Every other page I find myself on now has an AI generated click-bait title,
shared for rage-clicks all brought-to-you-by-our-sponsors completely covered
wall-to-wall with popup modals, telling me how much they respect my privacy,
with the real content hidden at the bottom bracketed by deceptive ads served by
companies that definitely know which new coffee shop I went to last month.
This is wrong, and those who have seen what was know it.
I can t keep doing it. I m not doing it any more. I reject the notion that
this is as it needs to be. It is wrong. The hole left in what the Internet used
to be must be filled. I will fill it.
What comes before part b?
Throughout the 2000s, some of my favorite memories were from LAN parties at my
friends places. Dragging your setup somewhere, long nights playing games,
goofing off, even building software all night to get something working being
able to do something fiercely technical in the context of a uniquely social
activity. It wasn t really much about the games or the projects it was an
excuse to spend time together, just hanging out. A huge reason I learned so
much in college was that campus was a non-stop LAN party we could freely
stand up servers, talk between dorms on the LAN, and hit my dorm room computer
from the lab. Things could go from individual to social in the matter of
seconds. The Internet used to work this way my dorm had public IPs handed out
by DHCP, and my workstation could serve traffic from anywhere on the internet.
I haven t been back to campus in a few years, but I d be surprised if this were
still the case.
In December of 2021, three of us got together and connected our houses together
in what we now call The Promised LAN. The idea is simple fill the hole we feel
is gone from our lives. Build our own always-on 24/7 nonstop LAN party. Build a
space that is intrinsically social, even though we re doing technical things.
We can freely host insecure game servers or one-off side projects without
worrying about what someone will do with it.
Over the years, it s evolved very slowly we haven t pulled any all-nighters.
Our mantra has become old growth , building each layer carefully. As of May
2025, the LAN is now 19 friends running around 25 network segments. Those 25
networks are connected to 3 backbone nodes, exchanging routes and IP traffic
for the LAN. We refer to the set of backbone operators as The Bureau of LAN
Management . Combined decades of operating critical infrastructure has
driven The Bureau to make a set of well-understood, boring, predictable,
interoperable and easily debuggable decisions to make this all happen.
Nothing here is exotic or even technically interesting.
Applications of trusting trust
The hardest part, however, is rejecting the idea that anything outside our own
LAN is untrustworthy nearly irreversible damage inflicted on us by the
Internet. We have solved this by not solving it. We strictly control
membership the absolute hard minimum for joining the LAN requires 10 years of
friendship with at least one member of the Bureau, with another 10 years of
friendship planned. Members of the LAN can veto new members even if all other
criteria is met. Even with those strict rules, there s no shortage of friends
that meet the qualifications but we are not equipped to take that many folks
on. It s hard to join -both socially and technically. Doing something malicious
on the LAN requires a lot of highly technical effort upfront, and it would
endanger a decade of friendship. We have relied on those human, social,
interpersonal bonds to bring us all together. It s worked for the last 4 years,
and it should continue working until we think of something better.
We assume roommates, partners, kids, and visitors all have access to The
Promised LAN. If they re let into our friends' network, there is a level of
trust that works transitively for us I trust them to be on mine. This LAN is
not for security , rather, the network border is a social one. Benign
hacking in the original sense of misusing systems to do fun and interesting
things is encouraged. Robust ACLs and firewalls on the LAN are, by definition,
an interpersonal not technical failure. We all trust every other network
operator to run their segment in a way that aligns with our collective values
and norms.
Over the last 4 years, we ve grown our own culture and fads around half of the
people on the LAN have thermal receipt printers with open access, for printing
out quips or jokes on each other s counters. It s incredible how much network
transport and a trusting culture gets you there s a 3-node IRC network, exotic
hardware to gawk at, radios galore, a NAS storage swap, LAN only email, and
even a SIP phone network of redphones .
DIY
We do not wish to, nor will we, rebuild the internet. We do not wish to, nor
will we, scale this. We will never be friends with enough people, as hard as we
may try. Participation hinges on us all having fun. As a result, membership
will never be open, and we will never have enough connected LANs to deal with
the technical and social problems that start to happen with scale. This is a
feature, not a bug.
This is a call for you to do the same. Build your own LAN. Connect it with
friends homes. Remember what is missing from your life, and fill it in. Use
software you know how to operate and get it running. Build slowly. Build your
community. Do it with joy. Remember how we got here. Rebuild a community space
that doesn t need to be mediated by faceless corporations and ad revenue. Build
something sustainable that brings you joy. Rebuild something you use daily.
Bring back what we re missing.
Some of you may remember that I recently felt a bit underwhelmed
by the last pager I reverse engineered the Retekess TD-158,
mostly due to how intuitive their design decions were. It was pretty easy
to jump to conclusions because they had made some pretty good decisions on
how to do things.
I figured I d spin the wheel again and try a new pager system this time I
went for a SU-68G-10 pager, since I recognized the form factor as another
fairly common unit I ve seen around town. Off to Amazon I went, bought a set,
and got to work trying to track down the FCC filings on this model. I
eventually found what seemed to be the right make/model, and it, once again,
indicated that this system should be operating in the 433 MHz ISM band likely
using OOK modulation. So, figured I d start with the center of the band (again)
at 433.92 MHz, take a capture, test my luck, and was greeted with a now very
familiar sight.
Same as the last goarounds, except the premable here is a 0 symbol followed
by 6-ish symbol durations of no data, followed by 25 bits of a packet. Careful
readers will observe 26 symbols above after the preamble I did too! The last
0 in the screenshot above is not actually a part of the packet rather,
it s part of the next packet s preamble. Each packet is packed in pretty tight.
By Hand Demodulation
Going off the same premise as last time, I figured i d give it a manual demod
and see what shakes out (again). This is now the third time i ve run this play,
so check out either of my prior twoposts for a
better written description of what s going on here I ll skip all the details
since i d just be copy-pasting from those posts into here. Long story short, I
demodulated a call for pager 1, call for pager 10, and a power off command.
What
Bits
Call 1
1101111111100100100000000
Call 10
1101111111100100010100000
Off
1101111111100111101101110
A few things jump out at me here the first 14 bits are fixed (in my case,
11011111111001), which means some mix of preamble, system id, or other
system-wide constant. Additionally, The last 9 bits also look like they are our
pager the 1 and 10 pager numbers (LSB bit order) jump right out
(100000000 and 010100000, respectively). That just leaves the two remaining
bits which look to be the action 00 for a Call , and 11 for a Power
off . I don t super love this since command has two bits rather than one, the
base station ID seems really long, and a 9-bit Pager ID is just weird. Also,
what is up with that power-off pager id? Weird. So, let s go and see what we
can do to narrow down and confirm things by hand.
Testing bit flips
Rather than call it a day at that, I figure it s worth a bit of diligence to
make sure it s all correct so I figured we should try sending packets to
my pagers and see how they react to different messages after flipping bits
in parts of the packet.
I implemented a simple base station for the pagers using my Ettus B210mini, and
threw together a simple OOK modulator and transmitter program which allows me
to send specifically crafted test packets on frequency. Implementing the base
station is pretty straightforward, because of the modulation of the signal
(OOK), it s mostly a matter of setting a buffer to 1 and 0 for where the
carrier signal is on or off timed to the sample rate, and sending that off to
the radio. If you re interested in a more detailed writeup on the steps
involved, there s a bit more in my christmas tree post.
First off, I d like to check the base id. I want to know if all the bits in
what I m calling the base id are truly part of the base station ID, or
perhaps they have some other purpose (version, preamble?). I wound up following
a three-step process for every base station id:
Starting with an unmodified call packet for the pager under test:
Flip the Nth bit, and transmit the call. See if the pager reacts.
Hold SET , and pair the pager with the new packet.
Transmit the call. See if the pager reacts.
After re-setting the ID, transmit the call with the physical base station,
see if the pager reacts.
Starting with an unmodified off packet for the pager system
Flip the Nth bit, transmit the off, see if the pager reacts.
What wound up happening is that changing any bit in the first 14 bits meant
that the packet no longer worked with any pager until it was re-paired, at
which point it begun to work again. This likely means the first 14 bits are
part of the base station ID and not static between base stations, or some
constant like a version or something. All bits appear to be used.
I repeated the same process with the command bits, and found that only 11
and 00 caused the pagers to react for the pager ids i ve tried.
I repeated this process one last time with the pager id bits this time, and
found the last bit in the packet isn t part of the pager ID, and can be either
a 1 or a 0 and still cause the pager to react as if it were a 0. This means
that the last bit is unknown but it has no impact on either a power off or
call, and all messages sent by my base station always have a 0 set. It s not
clear if this is used by anything likely not since setting a bit there
doesn t result in any change of behavior I can see yet.
Final Packet Structure
After playing around with flipping bits and testing, the final structure
I was able to come up with based on behavior I was able to observe from
transmitting hand-crafted packets and watching pagers buzz:
base id
command
pager id
???
Commands
The command section bit comes in two flavors either a call or an off
command.
Type
Id (2 bits)
Description
Call
00
Call the pager identified by the id in pager id
Off
11
Request pagers power off, pager id is always 10110111
As for the actual RF PHY characteristics, here s my best guesses at what s
going on with them:
What
Description
Center Frequency
433.92 MHz
Modulation
OOK
Symbol Duration
1300us
Bits
25
Preamble
325us of carrier, followed by 8800us of no carrier
I m not 100% on the timings, but they appear to be close enough to work
reliabily. Same with the center frequency, it s roughly right but there
may be a slight difference i m missing.
Lingering Questions
This was all generally pretty understandable another system that had some
good decisions, and wasn t too bad to reverse engineer. This was a bit more fun
to do, since there was a bit more ambiguity here, but still not crazy. At least
this one was a bit more ambiguous that needed a bit of followup to confirm
things, which made it a bit more fun.
I am left with a few questions, though which I m kinda interested in
understanding, but I ll likely need a lot more data and/or original source:
Why is the command two bits here? This was a bit tough to understand because
of the number of bits they have at their disposal given the one last bit at
the end of the packet that doesn t seem to do anything, there s no reason this
couldn t have been a 16 bit base station id, and an 8 bit pager id along with a
single bit command (call or off).
When sending an off why is power off that bit pattern? Other pager IDs
don t seem to work with off , so it has some meaning, but I m not sure what
that is. You press and hold 9 on the physical base station, but the code winds
up coming out to 0xED, 237 or maybe -19 if it s signed. I can t quite
figure out why it s this value. Are there other codes?
Finally what s up with the last bit? Why is it 25 bits and not 24? It must
take more work to process something that isn t 8 bit aligned and all for
something that s not being used!
I can t remember exactly the joke I was making at the time in my
work s slack instance (I m sure it wasn t particularly
funny, though; and not even worth re-reading the thread to work out), but it
wound up with me writing a UEFI binary for the punchline. Not to spoil the
ending but it worked - no pesky kernel, no messing around with userland . I
guess the only part of this you really need to know for the setup here is that
it was a Severance joke,
which is some fantastic TV. If you haven t seen it, this post will seem perhaps
weirder than it actually is. I promise I haven t joined any new cults. For
those who have seen it, the payoff to my joke is that I wanted my machine to
boot directly to an image of
Kier Eagan.
As for how to do it I figured I d give the uefi
crate a shot, and see how it is to use,
since this is a low stakes way of trying it out. In general, this isn t the
sort of thing I d usually post about except this wound up being easier and
way cleaner than I thought it would be. That alone is worth sharing, in the
hopes someome comes across this in the future and feels like they, too, can
write something fun targeting the UEFI.
First thing s first gotta create a rust project (I ll leave that part to you
depending on your life choices), and to add the uefi crate to your
Cargo.toml. You can either use cargo add or add a line like this by hand:
uefi = version = "0.33", features = ["panic_handler", "alloc", "global_allocator"]
We also need to teach cargo about how to go about building for the UEFI target,
so we need to create a rust-toolchain.toml with one (or both) of the UEFI
targets we re interested in:
Unfortunately, I wasn t able to use the
image crate,
since it won t build against the uefi target. This looks like it s
because rustc had no way to compile the required floating point operations
within the image crate without hardware floating point instructions
specifically. Rust tends to punt a lot of that to libm usually, so this isnt
entirely shocking given we re nostd for a non-hardfloat target.
So-called softening requires a software floating point implementation that
the compiler can use to polyfill (feels weird to use the term polyfill here,
but I guess it s spiritually right?) the lack of hardware floating point
operations, which rust hasn t implemented for this target yet. As a result, I
changed tactics, and figured I d use ImageMagick to pre-compute the pixels
from a jpg, rather than doing it at runtime. A bit of a bummer, since I need
to do more out of band pre-processing and hardcoding, and updating the image
kinda sucks as a result but it s entirely manageable.
This will take our input file (kier.jpg), resize it to get as close to the
desired resolution as possible while maintaining aspect ration, then convert it
from a jpg to a flat array of 4 byte RGBA pixels. Critically, it s also
important to remember that the size of the kier.full.jpg file may not actually
be the requested size it will not change the aspect ratio, so be sure to
make a careful note of the resulting size of the kier.full.jpg file.
Last step with the image is to compile it into our Rust bianary, since we
don t want to struggle with trying to read this off disk, which is thankfully
real easy to do.
Remember to use the width and height from the final kier.full.jpg file as the
values for KIER_WIDTH and KIER_HEIGHT. KIER_PIXEL_SIZE is 4, since we
have 4 byte wide values for each pixel as a result of our conversion step into
RGBA. We ll only use RGB, and if we ever drop the alpha channel, we can drop
that down to 3. I don t entirely know why I kept alpha around, but I figured it
was fine. My kier.full.jpg image winds up shorter than the requested height
(which is also qemu s default resolution for me) which means we ll get a
semi-annoying black band under the image when we go to run it but it ll
work.
Anyway, now that we have our image as bytes, we can get down to work, and
write the rest of the code to handle moving bytes around from in-memory
as a flat block if pixels, and request that they be displayed using the
UEFI GOP. We ll just need to hack up a container
for the image pixels and teach it how to blit to the display.
/// RGB Image to move around. This isn't the same as an
/// image::RgbImage , but we can associate the size of
/// the image along with the flat buffer of pixels.
structRgbImage/// Size of the image as a tuple, as the
/// (width, height)
size: (usize, usize),
/// raw pixels we'll send to the display.
inner: Vec<BltPixel>,
impl RgbImage
/// Create a new RgbImage .
fnnew(width: usize, height: usize) -> Self
RgbImage
size: (width, height),
inner: vec![BltPixel::new(0, 0, 0); width * height],
/// Take our pixels and request that the UEFI GOP
/// display them for us.
fnwrite(&self, gop: &mut GraphicsOutput) -> Result
gop.blt(BltOp::BufferToVideo
buffer: &self.inner,
src: BltRegion::Full,
dest: (0, 0),
dims: self.size,
)
impl Index<(usize, usize)>for RgbImage
typeOutput= BltPixel;
fnindex(&self, idx: (usize, usize)) -> &BltPixellet (x, y) = idx;
&self.inner[y * self.size.0+ x]
impl IndexMut<(usize, usize)>for RgbImage
fnindex_mut(&mut self, idx: (usize, usize)) -> &mut BltPixel
let (x, y) = idx;
&mut self.inner[y * self.size.0+ x]
We also need to do some basic setup to get a handle to the UEFI
GOP via the UEFI crate (using
uefi::boot::get_handle_for_protocol
and
uefi::boot::open_protocol_exclusive
for the GraphicsOutput
protocol), so that we have the object we need to pass to RgbImage in order
for it to write the pixels to the display. The only trick here is that the
display on the booted system can really be any resolution so we need to do
some capping to ensure that we don t write more pixels than the display can
handle. Writing fewer than the display s maximum seems fine, though.
fnpraise() -> Result
let gop_handle = boot::get_handle_for_protocol::<GraphicsOutput>()?;
letmut gop = boot::open_protocol_exclusive::<GraphicsOutput>(gop_handle)?;
// Get the (width, height) that is the minimum of
// our image and the display we're using.
let (width, height) = gop.current_mode_info().resolution();
let (width, height) = (width.min(KIER_WIDTH), height.min(KIER_HEIGHT));
letmut buffer = RgbImage::new(width, height);
for y in0..height
for x in0..width
let idx_r = ((y * KIER_WIDTH) + x) * KIER_PIXEL_SIZE;
let pixel =&mut buffer[(x, y)];
pixel.red = KIER[idx_r];
pixel.green = KIER[idx_r +1];
pixel.blue = KIER[idx_r +2];
buffer.write(&mut gop)?;
Ok(())
Not so bad! A bit tedious we could solve some of this by turning
KIER into an RgbImage at compile-time using some clever Cow and
const tricks and implement blitting a sub-image of the image but this
will do for now. This is a joke, after all, let s not go nuts. All that s
left with our code is for us to write our main function and try and boot
the thing!
#[entry]fnmain() -> Status
uefi::helpers::init().unwrap();
praise().unwrap();
boot::stall(100_000_000);
Status::SUCCESS
If you re following along at home and so interested, the final source is over at
gist.github.com.
We can go ahead and build it using cargo (as is our tradition) by targeting
the UEFI platform.
Testing the UEFI Blob
While I can definitely get my machine to boot these blobs to test, I figured
I d save myself some time by using QEMU to test without a full boot.
If you ve not done this sort of thing before, we ll need two packages,
qemu and ovmf. It s a bit different than most invocations of qemu you
may see out there so I figured it d be worth writing this down, too.
$ doas apt install qemu-system-x86 ovmf
qemu has a nice feature where it ll create us an EFI partition as a drive and
attach it to the VM off a local directory so let s construct an EFI
partition file structure, and drop our binary into the conventional location.
If you haven t done this before, and are only interested in running this in a
VM, don t worry too much about it, a lot of it is convention and this layout
should work for you.
With all this in place, we can kick off qemu, booting it in UEFI mode using
the ovmf firmware, attaching our EFI partition directory as a drive to
our VM to boot off of.
If all goes well, soon you ll be met with the all knowing gaze of
Chosen One, Kier Eagan. The thing that really impressed me about all
this is this program worked first try it all went so boringly
normal. Truly, kudos to the uefi crate maintainers, it s incredibly
well done.
Booting a live system
Sure, we could stop here, but anyone can open up an app window and see a
picture of Kier Eagan, so I knew I needed to finish the job and boot a real
machine up with this. In order to do that, we need to format a USB stick.
BE SURE /dev/sda IS CORRECT IF YOU RE COPY AND PASTING. All my drives
are NVMe, so BE CAREFUL if you use SATA, it may very well be your
hard drive! Please do not destroy your computer over this.
$ doas fdisk /dev/sda
Welcome to fdisk (util-linux 2.40.4).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
Command (m for help): n
Partition type
p primary (0 primary, 0 extended, 4 free)
e extended (container for logical partitions)
Select (default p): p
Partition number (1-4, default 1):
First sector (2048-4014079, default 2048):
Last sector, +/-sectors or +/-size K,M,G,T,P (2048-4014079, default 4014079):
Created a new partition 1 of type 'Linux' and of size 1.9 GiB.
Command (m for help): t
Selected partition 1
Hex code or alias (type L to list all): ef
Changed type of partition 'Linux' to 'EFI (FAT-12/16/32)'.
Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
Once that looks good (depending on your flavor of udev you may or
may not need to unplug and replug your USB stick), we can go ahead
and format our new EFI partition (BE CAREFUL THAT /dev/sda IS YOUR
USB STICK) and write our EFI directory to it.
$ doas mkfs.fat /dev/sda1
$ doas mount /dev/sda1 /mnt
$ cp -r esp/efi /mnt
$ find /mnt
/mnt
/mnt/efi
/mnt/efi/boot
/mnt/efi/boot/bootx64.efi
Of course, naturally, devotion to Kier shouldn t mean backdooring your system.
Disabling Secure Boot runs counter to the Core Principals, such as Probity, and
not doing this would surely run counter to Verve, Wit and Vision. This bit does
require that you ve taken the step to enroll a
MOK and know how
to use it, right about now is when we can use sbsign to sign our UEFI binary
we want to boot from to continue enforcing Secure Boot. The details for how
this command should be run specifically is likely something you ll need to work
out depending on how you ve decided to manage your MOK.
I figured I d leave a signed copy of boot2kier at
/boot/efi/EFI/BOOT/KIER.efi on my Dell XPS 13, with Secure Boot enabled
and enforcing, just took a matter of going into my BIOS to add the right
boot option, which was no sweat. I m sure there is a way to do it using
efibootmgr, but I wasn t smart enough to do that quickly. I let er rip,
and it booted up and worked great!
It was a bit hard to get a video of my laptop, though but lucky for me, I
have a Minisforum Z83-F sitting around (which, until a few weeks ago was running
the annual http server to control my christmas tree
) so I grabbed it out of the christmas bin, wired it up to a video capture
card I have sitting around, and figured I d grab a video of me booting a
physical device off the boot2kier USB stick.
Attentive readers will notice the image of Kier is smaller then the qemu booted
system which just means our real machine has a larger GOP display
resolution than qemu, which makes sense! We could write some fancy resize code
(sounds annoying), center the image (can t be assed but should be the easy way
out here) or resize the original image (pretty hardware specific workaround).
Additionally, you can make out the image being written to the display before us
(the Minisforum logo) behind Kier, which is really cool stuff. If we were real
fancy we could write blank pixels to the display before blitting Kier, but,
again, I don t think I care to do that much work.
But now I must away
If I wanted to keep this joke going, I d likely try and find a copy of the
original
video when Helly 100%s her file
and boot into that or maybe play a terrible midi PC speaker rendition of
Kier, Chosen One, Kier after
rendering the image. I, unfortunately, don t have any friends involved with
production (yet?), so I reckon all that s out for now. I ll likely stop playing
with this the joke was done and I m only writing this post because of how
great everything was along the way.
All in all, this reminds me so much of building a homebrew kernel to boot a
system into but like, good, though, and it s a nice reminder of both how
fun this stuff can be, and how far we ve come. UEFI protocols are light-years
better than how we did it in the dark ages, and the tooling for this is SO
much more mature. Booting a custom UEFI binary is miles ahead of trying to
boot your own kernel, and I can t believe how good the uefi crate is
specifically.
Praise Kier! Kudos, to everyone involved in making this so delightful .
In basically every engineering organization I ve ever regarded as particularly
high functioning, I ve sat through one specific recurring conversation which is
not a conversation about complexity . Things are good or bad because they
are or aren t complex, architectures needs to be redone because it s too
complex some refactor of whatever it is won t work because it s too complex.
You may have even been a part of some of these conversations or even been
the one advocating for simple light-weight solutions. I ve done it. Many times.
Rarely, if ever, do we talk about complexity within its rightful context
complexity for whom. Is a solution complex because it s complex for the
end user? Is it complex if it s complex for an API consumer? Is it complex if
it s complex for the person maintaining the API service? Is it complex if it s
complex for someone outside the team maintaining it to understand?
Complexity within a problem domain I ve come to believe, is fairly zero-sum
there s a fixed amount of complexity in the problem to be solved, and you can
choose to either solve it, or leave it for those downstream of you
to solve that problem on their own.
That being said, while I believe there is a lower bound in complexity to
contend with for a problem, I do not believe there is an upper bound to the
complexity of solutions possible. It is always possible, and in fact, very
likely that teams create problems for themselves while trying to solve a
problem. The rest of this post is talking to the lower bound. When getting
feedback on an early draft of this blog post, I ve been informed that Fred
Brooks coined a term for what I call lower bound complexity Essential
Complexity , in the paper
No Silver Bullet Essence and Accident in Software Engineering ,
which is a better term and can be used interchangeably.
Complexity Culture
In a large enough organization, where the team is high functioning enough to
have and maintain trust amongst peers, members of the team will specialize.
People will begin to engage with subsets of the work to be done, and begin to
have their efficacy measured against that part of the organization s problems.
Incentives shift, and over time it becomes increasingly likely that two
engineers may have two very different priorities when working on the same
system together. Someone accountable for uptime and tasked with responding to
outages will begin to resist changes. Someone accountable for rapidly
delivering features will resist gates between them and their users. Companies
(either wittingly or unwittingly) will deal with this by tasking engineers with
both production (feature development) and operational tasks (maintenance), so
the difference in incentives isn t usually as bad as it could be.
When we get a bunch of folks from far-flung corners of an organization in a
room, fire up a slide deck and throw up some aspirational to-be architecture
diagram in order to get a sign-off to solve some problem (be it someone needs a
credible promotion packet, new feature needs to get delivered, or the system
has begun to fail and needs fixing), the initial reaction will, more often than
I d like, start to devolve into a discussion of how this is going to introduce
a bunch of complexity, going to be hard to maintain, why can t you make it
less complex?
Right around here is when I start to try and contextualize the conversation
happening around me understand what complexity is that being discussed, and
understand who is taking on that burden. Think about who should be owning
that problem, and work through the tradeoffs involved. Is it best solved here,
or left to consumers (be them other systems, developers, or users). Should
something become an API call s optional param, taking on all the edge-cases and
on, or should users have to implement the logic using the data you
return (leaving everyone else to take on all the edge-cases and maintenance)?
Should you process the data, or require the user to preprocess it for you?
Frequently it s right to make an active and explicit decision to simplify and
leave problems to be solved downstream, since they may not actually need to be
solved or perhaps you expect consumers will want to own the specifics of
how the problem is solved, in which case you leave lots of documentation and
examples. Many other times, especially when it s something downstream consumers
are likely to hit, it s best solved internal to the system, since the only
thing that can come of leaving it unsolved are bugs, frustration and
half-correct solutions. This is a grey-space of tradeoffs, not a clear decision
tree. No one wants the software manifestation of a katamari ball or a junk
drawer, nor does anyone want a half-baked service unable to handle the simplest
use-case.
Head-in-sand as a Service
Popoffs about how complex something is, are, to a first approximation, best
understood as meaning complicated for the person making comments . A lot of
the #thoughtleadership believe that an AWS hosted EKS k8s cluster running
images built by CI talking to an AWS hosted PostgreSQL RDS is not complex.
They re right. Mostly right. This is less complex less complex for them.
It s not, however, without complexity and its own tradeoffs it s just
complexity that they do not have to deal with. Now they don t have to
maintain machines that have pesky operating systems or hard drive failures.
They don t have to deal with updating the version of k8s, nor ensuring the
backups work. No one has to push some artifact to prod manually. Deployments
happen unattended. You click a button and get a cluster.
On the other hand, developers outside the ops function need to deal with
troubleshooting CI, debugging access control rules encoded in turing complete
YAML, permissions issues inside the cluster due to whatever the fuck a service
mesh is, everyone needs to learn how to use some k8s tools they only actually
use during a bad day, likely while doing some x.509 troubleshooting to
connect to the cluster (an internal only endpoint; just port forward it) not
to mention all sorts of rules to route packets to their project (a single
repo s binary being run in 3 containers on a single vm host).
Beyond that, there s the invisible complexity complexity on the interior of
a service you depend on. I think about the dozens of teams maintaining the EKS
service (which is either run on EC2 instances, or alternately, EC2 instances in
a trench coat, moustache and even more shell scripts), the RDS service (also
EC2 and shell scripts, but this time accounting for redundancy, backups,
availability zones), scores of hypervisors pulled off the shelf (xen, kvm)
smashed together with the ones built in-house (firecracker, nitro, etc)
running on hardware that has to be refreshed and maintained continuously. Every
request processed by network ACL rules, AWS IAM rules, security group rules,
using IP space announced to the internet wired through IXPs directly into ISPs.
I don t even want to begin to think about the complexity inherent in how those
switches are designed. Shitloads of complexity to solve problems you may or
may not have, or even know you had.
What s more complex? An app running in an in-house 4u server racked in the
office s telco closet in the back running off the office Verizon line, or an
app running four hypervisors deep in an AWS datacenter? Which is more complex
to you? What about to your organization? In total? Which is more prone to
failure? Which is more secure? Is the complexity good or bad? What type of
Complexity can you manage effectively? Which threaten the system? Which
threaten your users?
COMPLEXIVIBES
This extends beyond Engineering. Decisions regarding what tools are we able to
use be them existing contracts with cloud providers, CIO mandated SaaS
products, a list of the only permissible open source projects will incur
costs in terms of expressed complexity . Pinning open source projects to a
fixed set makes SBOM production less complex . Using only one SaaS provider s
product suite (even if its terrible, because it has all the types of tools you
need) makes accreditation less complex . If all you have is a contract with
Pauly T s lowest price technically acceptable artisinal cloudary and
haberdashery, the way you pay for your compute is less complex for the CIO
shop, though you will find yourself building your own hosted database template,
mechanism to spin up a k8s cluster, and all the operational and technical
burden that comes with it. Or you won t and make it everyone else s problem in
the organization. Nothing you can do will solve for the fact that you must
now deal with this problem somewhere because it was less complicated for the
business to put the workloads on the existing contract with a cut-rate vendor.
Suddenly, the decision to reduce complexity because of an existing contract
vehicle has resulted in a huge amount of technical risk and maintenance burden
being onboarded. Complexity you would otherwise externalize has now been taken
on internally. With a large enough organizations (specifically, in this case,
i m talking about you, bureaucracies), this is largely ignored or accepted as
normal since the personnel cost is understood to be free to everyone involved.
Doing it this way is more expensive, more work, less reliable and less
maintainable, and yet, somehow, is, in a lot of ways, less complex to the
organization. It s particularly bad with bureaucracies, since screwing up a
contract will get you into much more trouble than delivering a broken product,
leaving basically no reason for anyone to care to fix this.
I can t shake the feeling that for every story of technical mandates gone
awry, somewhere just
out of sight there s a decisionmaker optimizing for what they believe to be the
least amount of complexity least hassle, fewest unique cases, most
consistency as they can. They freely offload complexity from their
accreditation and risk acceptance functions through mandates. They will never
have to deal with it. That does not change the fact that someone does.
TC;DR (TOO COMPLEX; DIDN T REVIEW)
We wish to rid ourselves of systemic Complexity after all, complexity is
bad, simplicity is good. Removing upper-bound own-goal complexity ( accidental
complexity in Brooks s terms) is important, but once you hit the lower bound
complexity, the tradeoffs become zero-sum. Removing complexity from one part of
the system means that somewhere else maybe outside your organization or in a
non-engineering function must grow it back. Sometimes, the opposite is the
case, such as when a previously manual business processes is automated. Maybe that s a
good idea. Maybe it s not. All I know is that what doesn t help the situation
is conflating complexity with everything we don t like legacy code,
maintenance burden or toil, cost, delivery velocity.
Complexity is not the same as proclivity to failure. The most reliable
systems I ve interacted with are unimaginably complex, with layers of internal
protection to prevent complete failure. This has its own set of costs which
other people have written about extensively.
Complexity is not cost. Sometimes the cost of taking all the complexity
in-house is less, for whatever value of cost you choose to use.
Complexity is not absolute. Something simple from one perspective may
be wildly complex from another. The impulse to burn down complex sections of
code is helpful to have generally, but
sometimes things are complicated for a reason,
even if that reason exists outside your codebase or organization.
Complexity is not something you can remove without introducing complexity
elsewhere. Just as not making a decision is a decision itself; choosing to
require someone else to deal with a problem rather than dealing with it
internally is a choice that needs to be considered in its full context.
Next time you re sitting through a discussion and someone starts to talk about
all the complexity about to be introduced, I want to pop up in the back of your
head, politely asking what does complex mean in this context? Is it lower
bound complexity? Is this complexity desirable? Is what they re saying mean
something along the lines of I don t understand the problems being solved, or
does it mean something along the lines of this problem should be solved
elsewhere? Do they believe this will result in more work for them in a way that
you don t see? Should this not solved at all by changing the bounds of what we
should accept or redefine the understood limits of this system? Is the perceived
complexity a result of a decision elsewhere? Who s taking this complexity on,
or more to the point, is failing to address complexity required by the problem
leaving it to others? Does it impact others? How specifically? What are you
not seeing?
What can change?
What should change?
It s been a while since I played with something new been stuck in a bit of a
rut with radios recently - working on refining and debugging stuff I mostly
understand for the time being. The other day, I was out getting some food and I
idly wondered how the restaurant pager system worked. Idle curiosity gave way
to the realization that I, in fact, likely had the means and ability to answer
this question, so I bought the first set of the most popular looking restaurant
pagers I could find on eBay, figuring it d be a fun multi-week adventure.
Order up!
I wound up buying a Retekess brand TD-158 Restaurant Pager System (they looked
like ones I d seen before and seemed to be low-cost and popular), and quickly
after, had a pack of 10 pagers and a base station in-hand. The manual stated
that the radios operated at 433 MHz (cool! can do! Love a good ISM band
device), and after taking an inital read through the manual for tips on the
PHY, I picked out a few interesting things. First is that the base station ID
was limited to 0-999, which is weird because it means the limiting factor is
likely the base-10 display on the base station, not the protocol we need
enough bits to store 999 at least 10 bits. Nothing else seemed to catch my
eye, so I figured may as well jump right to it.
Not being the type to mess with success, I did exactly the same thing as I did
in my christmas tree post, and took a capture at 433.92MHz
since it was in the middle of the band, and immediately got deja-vu. Not only
was the signal at 433.92MHz, but throwing the packet into inspectrum gave
me the identical plot of the OOK encoding scheme.
Not just similar identical. The only major difference was the baud rate and
bit structure of the packets, and the only minor difference was the existence
of what I think is a wakeup preamble packet (of all zeros), rather than a
preamble symbol that lasted longer than usual PHY symbol (which makes this
pager system a bit easier to work with than my tree, IMHO).
Getting down to work, I took some measurements to determine what the symbol
duration was over the course of a few packets, I was able to determine the
symbol rate was somewhere around 858 microseconds (0.000858 seconds per
symbol), which is a weird number, but maybe I m slightly off or there s some
larger math I m missing that makes this number satisfyingly round (internal
low cost crystal clock or something? I assume this is some hardware constrait
with the pager?)
Anyway, good enough. Moving along, let s try our hand at a demod let s just
assume it s all the same as the chrismas tree post and demod ones and zeros
the same way here. That gives us 26 bits:
00001101110000001010001000
Now, I know we need at least 10 bits for the base station ID, some number
of bits for the pager ID, and some bits for the command. This was a capture
of me hitting call from a base station ID of 55 to a pager with the ID of
10, so let s blindly look for 10 bit chunks with the numbers we re looking for:
0000110111 0000001010 001000
Jeez. First try. 10 bits for the base station ID (55 in binary is
0000110111), 10 bits for the pager ID (10 in binary is 0000001010), which
leaves us with 6 bits for a command (and maybe something else too?) which is
8 here. Great, cool, let s work off that being the case and revisit it if
we hit bugs.
Besides our data packet, there s also a preamble packet that I ll add in,
in case it s used for signal detection or wakeup or something which is
fairly easy to do since it s the same packet structure as the above, just
all zeros. Very kind of them to leave it with the same number of bits and
encoding scheme it s nice that it can live outside the PHY.
Once I got here, I wrote a quick and dirty modulator, and was able to ring up
pagers! Unmitigated success and good news only downside was that it took me
a single night, and not the multi-week adventure I was looking for. Well, let s
finish the job and document what we ve found for the sake of completeness.
Boxing everything up
My best guess on the packet structure is as follows:
base id
argument
command
For a call or F2 operation, the argument is the Pager s ID code,
but for other commands it s a value or an enum, depending. Here s a table
of my by-hand demodulation of all the packet types the base station
produces:
Type
Cmd Id
Description
Call
8
Call the pager identified by the id in argument
Off
60
Request any pagers on the charger power off when power is removed, argument is all zero
F2
40
Program a pager to the specified Pager ID (in argument) and base station
F3
44
Set the reminder duration in seconds specified in argument
F4
48
Set the pager's beep mode to the one in argument (0 is disabled, 1 is slow, 2 is medium, 3 is fast)
F5
52
Set the pager's vibration mode to the one in argument (0 is disabled, 1 is enabled)
Kitchen s closed for the night
I m not going to be publishing this code since I can t think of a good use
anyone would have for this besides folks using a low cost SDR and annoying
local resturants; but there s enough here for folks who find this interesting
to try modulating this protocol on their own hardware if they want to buy their
own pack of pagers and give it a shot, which I do encourage! It s fun! Radios
are great, and this is a good protocol to hack with it s really nice.
All in all, this wasn t the multi-week adventure I was looking for, this was
still a great exercise and a fun reminder that I ve come a far way from when
I ve started. It felt a lot like cheating since I was able to infer a lot about
the PHY because I d seen it before, but it was still a great time. I may grab a
few more restaurant pagers and see if I can find one with a more exotic PHY to
emulate next. I mean why not, I ve already got the
thermalprinter
libraries
working
Years ago, at what I think I remember was DebConf 15, I hacked for a while
on debhelper to
write build-ids to debian binary control files,
so that the build-id (more specifically, the ELF note
.note.gnu.build-id) wound up in the Debian apt archive metadata.
I ve always thought this was super cool, and seeing as how Michael Stapelberg
blogged
some great pointers around the ecosystem, including the fancy new debuginfod
service, and the
find-dbgsym-packages
helper, which uses these same headers, I don t think I m the only one.
At work I ve been using a lot of rust,
specifically, async rust using tokio. To try and work on
my style, and to dig deeper into the how and why of the decisions made in these
frameworks, I ve decided to hack up a project that I ve wanted to do ever
since 2015 write a debug filesystem. Let s get to it.
Back to the Future
Time to admit something. I really love Plan 9. It s
just so good. So many ideas from Plan 9 are just so prescient, and everything
just feels right. Not just right like, feels good like, correct. The
bit that I ve always liked the most is 9p, the network protocol for serving
a filesystem over a network. This leads to all sorts of fun programs, like the
Plan 9 ftp client being a 9p server you mount the ftp server and access
files like any other files. It s kinda like if fuse were more fully a part
of how the operating system worked, but fuse is all running client-side. With
9p there s a single client, and different servers that you can connect to,
which may be backed by a hard drive, remote resources over something like SFTP, FTP, HTTP or even purely synthetic.
The interesting (maybe sad?) part here is that 9p wound up outliving Plan 9
in terms of adoption 9p is in all sorts of places folks don t usually expect.
For instance, the Windows Subsystem for Linux uses the 9p protocol to share
files between Windows and Linux. ChromeOS uses it to share files with Crostini,
and qemu uses 9p (virtio-p9) to share files between guest and host. If you re
noticing a pattern here, you d be right; for some reason 9p is the go-to protocol
to exchange files between hypervisor and guest. Why? I have no idea, except maybe
due to being designed well, simple to implement, and it s a lot easier to validate the data being shared
and validate security boundaries. Simplicity has its value.
As a result, there s a lot of lingering 9p support kicking around. Turns out
Linux can even handle mounting 9p filesystems out of the box. This means that I
can deploy a filesystem to my LAN or my localhost by running a process on top
of a computer that needs nothing special, and mount it over the network on an
unmodified machine unlike fuse, where you d need client-specific software
to run in order to mount the directory. For instance, let s mount a 9p
filesystem running on my localhost machine, serving requests on 127.0.0.1:564
(tcp) that goes by the name mountpointname to /mnt.
Linux will mount away, and attach to the filesystem as the root user, and by default,
attach to that mountpoint again for each local user that attempts to use
it. Nifty, right? I think so. The server is able
to keep track of per-user access and authorization
along with the host OS.
WHEREIN I STYX WITH IT
Since I wanted to push myself a bit more with rust and tokio specifically,
I opted to implement the whole stack myself, without third party libraries on
the critical path where I could avoid it. The 9p protocol (sometimes called
Styx, the original name for it) is incredibly simple. It s a series of client
to server requests, which receive a server to client response. These are,
respectively, T messages, which transmit a request to the server, which
trigger an R message in response (Reply messages). These messages are
TLV payload
with a very straight forward structure so straight forward, in fact, that I
was able to implement a working server off nothing more than a handful of man
pages.
Later on after the basics worked, I found a more complete
spec page
that contains more information about the
unix specific variant
that I opted to use (9P2000.u rather than 9P2000) due to the level
of Linux specific support for the 9P2000.u variant over the 9P2000
protocol.
MR ROBOTO
The backend stack over at zoo is rust and tokio
running i/o for an HTTP and WebRTC server. I figured I d pick something
fairly similar to write my filesystem with, since 9P can be implemented
on basically anything with I/O. That means tokio tcp server bits, which
construct and use a 9p server, which has an idiomatic Rusty API that
partially abstracts the raw R and T messages, but not so much as to
cause issues with hiding implementation possibilities. At each abstraction
level, there s an escape hatch allowing someone to implement any of
the layers if required. I called this framework
arigato which can be found over on
docs.rs and
crates.io.
/// Simplified version of the arigato File trait; this isn't actually
/// the same trait; there's some small cosmetic differences. The
/// actual trait can be found at:
///
/// https://docs.rs/arigato/latest/arigato/server/trait.File.html
trait File
/// OpenFile is the type returned by this File via an Open call.
typeOpenFile: OpenFile;
/// Return the 9p Qid for this file. A file is the same if the Qid is
/// the same. A Qid contains information about the mode of the file,
/// version of the file, and a unique 64 bit identifier.
fnqid(&self) -> Qid;
/// Construct the 9p Stat struct with metadata about a file.
async fnstat(&self) -> FileResult<Stat>;
/// Attempt to update the file metadata.
async fnwstat(&mut self, s: &Stat) -> FileResult<()>;
/// Traverse the filesystem tree.
async fnwalk(&self, path: &[&str]) -> FileResult<(Option<Self>, Vec<Self>)>;
/// Request that a file's reference be removed from the file tree.
async fnunlink(&mut self) -> FileResult<()>;
/// Create a file at a specific location in the file tree.
async fncreate(
&mut self,
name: &str,
perm: u16,
ty: FileType,
mode: OpenMode,
extension: &str,
) -> FileResult<Self>;
/// Open the File, returning a handle to the open file, which handles
/// file i/o. This is split into a second type since it is genuinely
/// unrelated -- and the fact that a file is Open or Closed can be
/// handled by the arigato server for us.
async fnopen(&mut self, mode: OpenMode) -> FileResult<Self::OpenFile>;
/// Simplified version of the arigato OpenFile trait; this isn't actually
/// the same trait; there's some small cosmetic differences. The
/// actual trait can be found at:
///
/// https://docs.rs/arigato/latest/arigato/server/trait.OpenFile.html
trait OpenFile
/// iounit to report for this file. The iounit reported is used for Read
/// or Write operations to signal, if non-zero, the maximum size that is
/// guaranteed to be transferred atomically.
fniounit(&self) -> u32;
/// Read some number of bytes up to buf.len() from the provided
/// offset of the underlying file. The number of bytes read is
/// returned.
async fnread_at(
&mut self,
buf: &mut [u8],
offset: u64,
) -> FileResult<u32>;
/// Write some number of bytes up to buf.len() from the provided
/// offset of the underlying file. The number of bytes written
/// is returned.
fnwrite_at(
&mut self,
buf: &mut [u8],
offset: u64,
) -> FileResult<u32>;
Thanks, decade ago paultag!
Let s do it! Let s use arigato to implement a 9p filesystem we ll call
debugfs that will serve all the debug
files shipped according to the Packages metadata from the apt archive. We ll
fetch the Packages file and construct a filesystem based on the reported
Build-Id entries. For those who don t know much about how an apt repo
works, here s the 2-second crash course on what we re doing. The first is to
fetch the Packages file, which is specific to a binary architecture (such as
amd64, arm64 or riscv64). That architecture is specific to a
component (such as main, contrib or non-free). That component is
specific to a suite, such as stable, unstable or any of its aliases
(bullseye, bookworm, etc). Let s take a look at the Packages.xz file for
the unstable-debugsuite, maincomponent, for all amd64 binaries.
This will return the Debian-style
rfc2822-like headers,
which is an export of the metadata contained inside each .deb file which
apt (or other tools that can use the apt repo format) use to fetch
information about debs. Let s take a look at the debug headers for the
netlabel-tools package in unstable which is a package named
netlabel-tools-dbgsym in unstable-debug.
So here, we can parse the package headers in the Packages.xz file, and store,
for each Build-Id, the Filename where we can fetch the .deb at. Each
.deb contains a number of files but we re only really interested in the
files inside the .deb located at or under /usr/lib/debug/.build-id/,
which you can find in debugfs under
rfc822.rs. It s
crude, and very single-purpose, but I m feeling a bit lazy.
Who needs dpkg?!
For folks who haven t seen it yet, a .deb file is a special type of
.ar file, that contains (usually)
three files inside debian-binary, control.tar.xz and data.tar.xz.
The core of an .ar file is a fixed size (60 byte) entry header,
followed by the specified size number of bytes.
[8 byte .ar file magic]
[60 byte entry header]
[N bytes of data]
[60 byte entry header]
[N bytes of data]
[60 byte entry header]
[N bytes of data]
...
First up was to implement a basic ar parser in
ar.rs. Before we get
into using it to parse a deb, as a quick diversion, let s break apart a .deb
file by hand something that is a bit of a rite of passage (or at least it
used to be? I m getting old) during the Debian nm (new member) process, to take
a look at where exactly the .debug file lives inside the .deb file.
$ ar x netlabel-tools-dbgsym_0.30.0-1+b1_amd64.deb
$ ls
control.tar.xz debian-binary
data.tar.xz netlabel-tools-dbgsym_0.30.0-1+b1_amd64.deb
$ tar --list -f data.tar.xz grep '.debug$'
./usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
Since we know quite a bit about the structure of a .deb file, and I had to
implement support from scratch anyway, I opted to implement a (very!) basic
debfile parser using HTTP Range requests. HTTP Range requests, if supported by
the server (denoted by a accept-ranges: bytes HTTP header in response to an
HTTP HEAD request to that file) means that we can add a header such as
range: bytes=8-68 to specifically request that the returned GET body be the
byte range provided (in the above case, the bytes starting from byte offset 8
until byte offset 68). This means we can fetch just the ar file entry from
the .deb file until we get to the file inside the .deb we are interested in
(in our case, the data.tar.xz file) at which point we can request the body
of that file with a final range request. I wound up writing a struct to
handle a read_at-style API surface in
hrange.rs, which
we can pair with ar.rs above and start to find our data in the .deb remotely
without downloading and unpacking the .deb at all.
After we have the body of the data.tar.xz coming back through the HTTP
response, we get to pipe it through an xz decompressor (this kinda sucked in
Rust, since a tokioAsyncRead is not the same as an http Body response is
not the same as std::io::Read, is not the same as an async (or sync)
Iterator is not the same as what the xz2 crate expects; leading me to read
blocks of data to a buffer and stuff them through the decoder by looping over
the buffer for each lzma2 packet in a loop), and tarfile parser (similarly
troublesome). From there we get to iterate over all entries in the tarfile,
stopping when we reach our file of interest. Since we can t seek, but gdb
needs to, we ll pull it out of the stream into a Cursor<Vec<u8>> in-memory
and pass a handle to it back to the user.
From here on out its a matter of
gluing together a File traited struct
in debugfs, and serving the filesystem over TCP using arigato. Done
deal!
A quick diversion about compression
I was originally hoping to avoid transferring the whole tar file over the
network (and therefore also reading the whole debug file into ram, which
objectively sucks), but quickly hit issues with figuring out a way around
seeking around an xz file. What s interesting is xz has a great primitive
to solve this specific problem (specifically, use a block size that allows you
to seek to the block as close to your desired seek position just before it,
only discarding at most block size - 1 bytes), but data.tar.xz files
generated by dpkg appear to have a single mega-huge block for the whole file.
I don t know why I would have expected any different, in retrospect. That means
that this now devolves into the base case of How do I seek around an lzma2
compressed data stream ; which is a lot more complex of a question.
Thankfully, notoriously brilliant tianon was
nice enough to introduce me to Jon Johnson
who did something super similar adapted a technique to seek inside a
compressed gzip file, which lets his service
oci.dag.dev
seek through Docker container images super fast based on some prior work
such as soci-snapshotter, gztool, and
zran.c.
He also pulled this party trick off for apk based distros
over at apk.dag.dev, which seems apropos.
Jon was nice enough to publish a lot of his work on this specifically in a
central place under the name targz
on his GitHub, which has been a ton of fun to read through.
The gist is that, by dumping the decompressor s state (window of previous
bytes, in-memory data derived from the last N-1 bytes) at specific
checkpoints along with the compressed data stream offset in bytes and
decompressed offset in bytes, one can seek to that checkpoint in the compressed
stream and pick up where you left off creating a similar block mechanism
against the wishes of gzip. It means you d need to do an O(n) run over the
file, but every request after that will be sped up according to the number
of checkpoints you ve taken.
Given the complexity of xz and lzma2, I don t think this is possible
for me at the moment especially given most of the files I ll be requesting
will not be loaded from again especially when I can just cache the debug
header by Build-Id. I want to implement this (because I m generally curious
and Jon has a way of getting someone excited about compression schemes, which
is not a sentence I thought I d ever say out loud), but for now I m going to
move on without this optimization. Such a shame, since it kills a lot of the
work that went into seeking around the .deb file in the first place, given
the debian-binary and control.tar.gz members are so small.
The Good
First, the good news right? It works! That s pretty cool. I m positive
my younger self would be amused and happy to see this working; as is
current day paultag. Let s take debugfs out for a spin! First, we need
to mount the filesystem. It even works on an entirely unmodified, stock
Debian box on my LAN, which is huge. Let s take it for a spin:
And, let s prove to ourselves that this actually mounted before we go
trying to use it:
$ mount grep build-id
192.168.0.2 on /usr/lib/debug/.build-id type 9p (rw,relatime,aname=unstable-debug,access=user,trans=tcp,version=9p2000.u,port=564)
Slick. We ve got an open connection to the server, where our host
will keep a connection alive as root, attached to the filesystem provided
in aname. Let s take a look at it.
$ ls /usr/lib/debug/.build-id/
00 0d 1a 27 34 41 4e 5b 68 75 82 8E 9b a8 b5 c2 CE db e7 f3
01 0e 1b 28 35 42 4f 5c 69 76 83 8f 9c a9 b6 c3 cf dc E7 f4
02 0f 1c 29 36 43 50 5d 6a 77 84 90 9d aa b7 c4 d0 dd e8 f5
03 10 1d 2a 37 44 51 5e 6b 78 85 91 9e ab b8 c5 d1 de e9 f6
04 11 1e 2b 38 45 52 5f 6c 79 86 92 9f ac b9 c6 d2 df ea f7
05 12 1f 2c 39 46 53 60 6d 7a 87 93 a0 ad ba c7 d3 e0 eb f8
06 13 20 2d 3a 47 54 61 6e 7b 88 94 a1 ae bb c8 d4 e1 ec f9
07 14 21 2e 3b 48 55 62 6f 7c 89 95 a2 af bc c9 d5 e2 ed fa
08 15 22 2f 3c 49 56 63 70 7d 8a 96 a3 b0 bd ca d6 e3 ee fb
09 16 23 30 3d 4a 57 64 71 7e 8b 97 a4 b1 be cb d7 e4 ef fc
0a 17 24 31 3e 4b 58 65 72 7f 8c 98 a5 b2 bf cc d8 E4 f0 fd
0b 18 25 32 3f 4c 59 66 73 80 8d 99 a6 b3 c0 cd d9 e5 f1 fe
0c 19 26 33 40 4d 5a 67 74 81 8e 9a a7 b4 c1 ce da e6 f2 ff
Outstanding. Let s try using gdb to debug a binary that was provided by
the Debian archive, and see if it ll load the ELF by build-id from the
right .deb in the unstable-debug suite:
$ gdb -q /usr/sbin/netlabelctl
Reading symbols from /usr/sbin/netlabelctl...
Reading symbols from /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug...
(gdb)
Yes! Yes it will!
$ file /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
/usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter *empty*, BuildID[sha1]=e59f81f6573dadd5d95a6e4474d9388ab2777e2a, for GNU/Linux 3.2.0, with debug_info, not stripped
The Bad
Linux s support for 9p is mainline, which is great, but it s not robust.
Network issues or server restarts will wedge the mountpoint (Linux can t
reconnect when the tcp connection breaks), and things that work fine on local
filesystems get translated in a way that causes a lot of network chatter for
instance, just due to the way the syscalls are translated, doing an ls, will
result in a stat call for each file in the directory, even though linux had
just got a stat entry for every file while it was resolving directory names.
On top of that, Linux will serialize all I/O with the server, so there s no
concurrent requests for file information, writes, or reads pending at the same
time to the server; and read and write throughput will degrade as latency
increases due to increasing round-trip time, even though there are offsets
included in the read and write calls. It works well enough, but is
frustrating to run up against, since there s not a lot you can do server-side
to help with this beyond implementing the 9P2000.L variant (which, maybe is
worth it).
The Ugly
Unfortunately, we don t know the file size(s) until we ve actually opened the
underlying tar file and found the correct member, so for most files, we don t
know the real size to report when getting a stat. We can t parse the tarfiles
for every stat call, since that d make ls even slower (bummer). Only
hiccup is that when I report a filesize of zero, gdb throws a bit of a
fit; let s try with a size of 0 to start:
$ ls -lah /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
-r--r--r-- 1 root root 0 Dec 31 1969 /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
$ gdb -q /usr/sbin/netlabelctl
Reading symbols from /usr/sbin/netlabelctl...
Reading symbols from /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug...
warning: Discarding section .note.gnu.build-id which has a section size (24) larger than the file size [in module /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug]
[...]
This obviously won t work since gdb will throw away all our hard work because
of stat s output, and neither will loading the real size of the underlying
file. That only leaves us with hardcoding a file size and hope nothing else
breaks significantly as a result. Let s try it again:
$ ls -lah /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
-r--r--r-- 1 root root 954M Dec 31 1969 /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug
$ gdb -q /usr/sbin/netlabelctl
Reading symbols from /usr/sbin/netlabelctl...
Reading symbols from /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug...
(gdb)
Much better. I mean, terrible but better. Better for now, anyway.
Kilroy was here
Do I think this is a particularly good idea? I mean; kinda. I m probably going
to make some fun 9parigato-based filesystems for use around my LAN, but I
don t think I ll be moving to use debugfs until I can figure out how to
ensure the connection is more resilient to changing networks, server restarts
and fixes on i/o performance. I think it was a useful exercise and is a pretty
great hack, but I don t think this ll be shipping anywhere anytime soon.
Along with me publishing this post, I ve pushed up all my repos; so you
should be able to play along at home! There s a lot more work to be done
on arigato; but it does handshake and successfully export a working
9P2000.u filesystem. Check it out on on my github at
arigato,
debugfs
and also on crates.io
and docs.rs.
At least I can say I was here and I got it working after all these years.
Interested in future updates? Follow me on mastodon at
@paul@soylent.green. Posts about
hz.tools will be tagged
#hztools.
If you're on the Fediverse, I'd very much appreciate boosts on
my toot!
While working on hz.tools, I started to move my beamforming
code from 2-D (meaning, beamforming to some specific angle on the X-Y plane for
waves on the X-Y plane) to 3-D. I ll have more to say about that once I get
around to publishing the code as soon as I m sure it s not completely wrong,
but in the meantime I decided to write a simple simulator to visually
check the beamformer against the textbooks. The results were pretty rad,
so I figured I d throw together a post since it s interesting all on its
own outside of beamforming as a general topic.
I figured I d write this in Rust, since I ve been using Rust as my primary
language over at zoo, and it s a good chance
to learn the language better.
This post has some large GIFs
It make take a little bit to load depending
on your internet connection. Sorry about that, I'm not clever enough to
do better without doing tons of complex engineering work. They may be
choppy while they load or something. I tried to compress an ensmall them,
so if they're loaded but fuzzy, click on them to load a slightly larger
version.
This post won t cover the basics of how
phased arrays work
or the specifics of calculating the phase offsets for each antenna,
but I ll dig into how I wrote a simple simulator and how I wound up
checking my phase offsets to generate the renders below.
Assumptions
I didn t want to build a general purpose RF simulator, anything particularly
generic, or something that would solve for any more than the things right
in front of me. To do this as simply (and quickly all this code took about
a day to write, including the beamforming math) I had to reduce the
amount of work in front of me.
Given that I was concerend with visualizing what the antenna pattern would look
like in 3-D given some antenna geometry, operating frequency and configured
beam, I made the following assumptions:
All anetnnas are perfectly isotropic they receive a signal that is
exactly the same strength no matter what direction the signal originates
from.
There s a single point-source isotropic emitter in the far-field (I modeled
this as being 1 million meters away 1000 kilometers) of the antenna system.
There is no noise, multipath, loss or distortion in the signal as it travels
through space.
Antennas will never interfere with each other.
2-D Polar Plots
The last time I wrote something like this,
I generated 2-D GIFs which show a radiation pattern, not unlike the
polar plots you d see on a microphone.
These are handy because it lets you visualize what the directionality of
the antenna looks like, as well as in what direction emissions are
captured, and in what directions emissions are nulled out. You can see
these plots
on spec sheets for antennas in both 2-D and 3-D form.
Now, let s port the 2-D approach to 3-D and see how well it works out.
Writing the 3-D simulator
As an EM wave travels through free space, the place at which you sample
the wave controls that phase you observe at each time-step. This means,
assuming perfectly synchronized clocks, a transmitter and receiver exactly
one RF wavelength apart will observe a signal in-phase, but a transmitter
and receiver a half wavelength apart will observe a signal 180 degrees
out of phase.
This means that if we take the distance between our point-source and
antenna element, divide it by the wavelength, we can use the fractional
part of the resulting number to determine the phase observed. If we
multiply that number (in the range of 0 to just under 1) by
tau, we can generate a complex number by taking the
cos and sin of the multiplied phase (in the range of 0 to tau), assuming
the transmitter is emitting a carrier wave at a static amplitude and all
clocks are in perfect sync.
At this point, given some synthetic transmission point and each antenna, we
know what the expected complex sample would be at each antenna. At this point,
we can adjust the phase of each antenna according to the beamforming
phase offset configuration, and add up every sample in order to determine
what the entire system would collectively produce a sample as.
let beamformed_phases: Vec<Complex>= ...;
let magnitude = beamformed_phases
.iter()
.zip(observed_phases.iter())
.map((beamformed, observed) observed * beamformed)
.reduce(acc, el acc + el)
.unwrap()
.abs();
Armed with this information, it s straight forward to generate some number
of (Azimuth, Elevation) points to sample, generate a transmission point
far away in that direction, resolve what the resulting Complex sample would be,
take its magnitude, and use that to create an (x, y, z) point at
(azimuth, elevation, magnitude). The color attached two that point is
based on its distance from (0, 0, 0). I opted to use the
Life Aquatic
table for this one.
After this process is complete, I have a
point cloud of
((x, y, z), (r, g, b)) points. I wrote a small program using
kiss3d to render point cloud using tons of
small spheres, and write out the frames to a set of PNGs, which get compiled
into a GIF.
Now for the fun part, let s take a look at some radiation patterns!
1x4 Phased Array
The first configuration is a phased array where all the elements are
in perfect alignment on the y and z axis, and separated by some
offset in the x axis. This configuration can sweep 180 degrees (not
the full 360), but can t be steared in elevation at all.
Let s take a look at what this looks like for a well constructed
1x4 phased array:
And now let s take a look at the renders as we play with the configuration of
this array and make sure things look right. Our initial quarter-wavelength
spacing is very effective and has some outstanding performance characteristics.
Let s check to see that everything looks right as a first test.
Nice. Looks perfect. When pointing forward at (0, 0), we d expect to see a
torus, which we do. As we sweep between 0 and 360, astute observers will notice
the pattern is mirrored along the axis of the antennas, when the beam is facing
forward to 0 degrees, it ll also receive at 180 degrees just as strong. There s
a small sidelobe that forms when it s configured along the array, but
it also becomes the most directional, and the sidelobes remain fairly small.
Long compared to the wavelength (1 )
Let s try again, but rather than spacing each antenna of a wavelength
apart, let s see about spacing each antenna 1 of a wavelength apart instead.
The main lobe is a lot more narrow (not a bad thing!), but some significant
sidelobes have formed (not ideal). This can cause a lot of confusion when doing
things that require a lot of directional resolution unless they re compensated
for.
Going from ( to 5 )
The last model begs the question - what do things look like when you separate
the antennas from each other but without moving the beam? Let s simulate moving
our antennas but not adjusting the configured beam or operating frequency.
Very cool. As the spacing becomes longer in relation to the operating frequency,
we can see the sidelobes start to form out of the end of the antenna system.
2x2 Phased Array
The second configuration I want to try is a phased array where the elements
are in perfect alignment on the z axis, and separated by a fixed offset
in either the x or y axis by their neighbor, forming a square when
viewed along the x/y axis.
Let s take a look at what this looks like for a well constructed
2x2 phased array:
Let s do the same as above and take a look at the renders as we play with the
configuration of this array and see what things look like. This configuration
should suppress the sidelobes and give us good performance, and even give us
some amount of control in elevation while we re at it.
Sweet. Heck yeah. The array is quite directional in the configured direction,
and can even sweep a little bit in elevation, a definite improvement
from the 1x4 above.
Long compared to the wavelength (1 )
Let s do the same thing as the 1x4 and take a look at what happens when the
distance between elements is long compared to the frequency of operation
say, 1 of a wavelength apart? What happens to the sidelobes given this
spacing when the frequency of operation is much different than the physical
geometry?
Mesmerising. This is my favorate render. The sidelobes are very fun to
watch come in and out of existence. It looks absolutely other-worldly.
Going from ( to 5 )
Finally, for completeness' sake, what do things look like when you separate the
antennas from each other just as we did with the 1x4? Let s simulate moving our
antennas but not adjusting the configured beam or operating frequency.
Very very cool. The sidelobes wind up turning the very blobby cardioid into
an electromagnetic dog toy. I think we ve proven to ourselves that using
a phased array much outside its designed frequency of operation seems like
a real bad idea.
Future Work
Now that I have a system to test things out, I m a bit more confident that
my beamforming code is close to right! I d love to push that code over
the line and blog about it, since it s a really interesting topic on its own.
Once I m sure the code involved isn t full of lies, I ll put it up on the
hztools org, and post about it here and on
mastodon.
I decided to leave my job (Principal Software Engineer) after 4 years. I have no idea what I want to do next, so I ve been having loads of chats to try and work that out.
I like working in mission focused organizations, working to fix problems across the stack, from interpersonal down to the operating system. I enjoy going where I m rare , places that don t always get the most attention. At my last job, I most enjoyed working to drive engineering standards for all products across the company, mentoring engineers across all teams and seniority levels, and serving as an advisor for senior leadership as we grew the engineering team from 3 to 150 people.
If you have a role that you think I d like to hear about, I d love to hear about it at jobs pault.ag (where the is an @ sign).
Interested in future updates? Follow me on mastodon at
@paul@soylent.green. Posts about
hz.tools will be tagged
#hztools.
If you're on the Fediverse, I'd very much appreciate boosts on
my announcement toot!
Ever since 2019, I ve been learning about how radios work, and trying to learn
about using them the hard way by writing as much of the stack as is
practical (for some value of practical) myself. I wrote my first Hello World
in 2018, which was a simple FM radio player, which used librtlsdr to read in
an IQ stream, did some filtering, and played the real valued audio stream via
pulseaudio. Over 4 years this has slowly grown through persistence, lots of
questions to too many friends to thank (although I will try), and the eternal
patience of my wife hearing about radios nonstop for years into a number
of Go repos that can do quite a bit, and support a handful of radios.
I ve resisted making the repos public not out of embarrassment or a desire to
keep secrets, but rather, an attempt to keep myself free of any maintenance
obligations to users so that I could freely break my own API, add and remove
API surface as I saw fit. The worst case was to have this project feel like
work, and I can t imagine that will happen if I feel frustrated by PRs
that are getting ahead of me solving problems I didn t yet know about, or
bugs I didn t understand the fix for.
As my rate of changes to the most central dependencies has slowed, i ve begun
to entertain the idea of publishing them. After a bit of back and forth, I ve
decided
it s time to make a number of them public,
and to start working on them in the open, as I ve built up a bit of knowledge
in the space, and I and feel confident that the repo doesn t contain overt
lies. That s not to say it doesn t contain lies, but those lies are likely
hidden and lurking in the dark. Beware.
That being said, it shouldn t be a surprise to say I ve not published
everything yet for the same reasons as above. I plan to open repos as the
rate of changes slows and I understand the problems the library solves well
enough or if the project dead ends and I ve stopped learning.
Intention behind hz.tools
It s my sincere hope that my repos help to make Software Defined Radio (SDR)
code a bit easier to understand, and serves as an understandable framework to
learn with. It s a large codebase, but one that is possible to sit down and
understand because, well, it was written by a single person. Frankly, I m also
not productive enough in my free time in the middle of the night and on
weekends and holidays to create a codebase that s too large to understand, I
hope!
I remain wary of this project turning into work, so my goal is to be very
upfront about my boundaries, and the limits of what classes of contributions
i m interested in seeing.
Here s some goals of open sourcing these repos:
I do want this library to be used to learn with. Please go through it
all and use it to learn about radios and how software can control them!
I am interested in bugs if there s a problem you discover. Such bugs
are likely a great chance for me to fix something I ve misunderstood or
typoed.
I am interested in PRs fixing bugs you find. I may need a bit of a
back and forth to fully understand the problem if I do not understand
the bug and fix yet. I hope you may have some grace if it s taking a long
time.
Here s a list of some anti-goals of open sourcing these repos.
I do not want this library to become a critical dependency of an
important project, since I do not have the time to deal with the maintenance
burden. Putting me in that position is going to make me very uncomfortable.
I am not interested in feature requests, the features have grown as
I ve hit problems, I m not interested in building or maintaining features
for features sake. The API surface should be exposed enough to allow others
to experiment with such things out-of-tree.
I m not interested in clever code replacing clear code without a very
compelling reason.
I use GNU/Linux (specifically Debian
),
and from time-to-time I ve made sure that my code runs on OpenBSD
too.
Platforms beyond that will likely not be supported at the expense of either
of those two. I ll take fixes for bugs that fix a problem on another
platform, but not damage the code to work around issues / lack of features
on other platforms (like Windows).
I m not saying all this to be a jerk, I do it to
make sure I can continue on my journey to learn about how radios work without
my full time job becoming maintaining a radio framework single-handedly for
other people to use even if it means I need to close PRs or bugs without
merging it or fixing the issue.
With all that out of the way, I m very happy to announce that the repos are now
public under github.com/hztools.
Should you use this?
Probably not. The intent here is not to provide a general purpose Go SDR
framework for everyone to build on, although I am keenly aware it looks and
feels like it, since that what it is to me. This is a learning project, so for
any use beyond joining me in learning should use something like
GNU Radio
or a similar framework that has a community behind it.
In fact, I suspect most contributors ought to be contributing to GNU Radio, and
not this project. If I can encourage people to do so,
contribute to GNU Radio!
Nothing makes me happier than seeing GNU Radio continue to be the go-to, and
well supported. Consider
donating to GNU Radio!
hz.tools/rf - Frequency types
The hz.tools/rf library contains the abstract concept of frequency, and
some very basic helpers to interact with frequency ranges (such as helpers
to deal with frequency ranges, or frequency range math) as well as frequencies
and some very basic conversions (to meters, etc) and parsers (to parse values
like 10MHz). This ensures that all the hz.tools libraries have a shared
understanding of Frequencies, a standard way of representing ranges of
Frequencies, and the ability to handle the IO boundary with things like CLI
arguments, JSON or YAML.
The git repo can be found at
github.com/hztools/go-rf, and is
importable as hz.tools/rf.
// Parse a frequency using hz.tools/rf.ParseHz, and print it to stdout.
freq:=rf.MustParseHz("-10kHz")
fmt.Printf("Frequency: %s\n", freq+rf.MHz)
// Prints: 'Frequency: 990kHz'
// Return the Intersection between two RF ranges, and print
// it to stdout.
r1:=rf.Rangerf.KHz, rf.MHzr2:=rf.Rangerf.Hz(10), rf.KHz*100fmt.Printf("Range: %s\n", r1.Intersection(r2))
// Prints: Range: 1000Hz->100kHz
These can be used to represent tons of things - ranges can be used for things
like the tunable range of an SDR, the bandpass of a filter or the frequencies
that correspond to a bin of an FFT, while frequencies can be used for things
such as frequency offsets or the tuned center frequency.
hz.tools/sdr - SDR I/O and IQ Types
This is the big one. This library represents the majority of the shared
types and bindings, and is likely the most useful place to look at when
learning about the IO boundary between a program and an SDR.
The git repo can be found at
github.com/hztools/go-sdr, and is
importable as hz.tools/sdr.
This library is designed to look (and in some cases, mirror) the Go io idioms
so that this library feels as idiomatic as it can, so that Go builtins interact
with IQ in a way that s possible to reason about, and to avoid reinventing the
wheel by designing new API surface. While some of the API looks (and is even
called) the same thing as a similar function in io, the implementation is
usually a lot more naive, and may have unexpected sharp edges such as
concurrency issues or performance problems.
The following IQ types are implemented using the sdr.Samples interface. The
hz.tools/sdr package contains helpers for conversion between types, and some
basic manipulation of IQ streams.
IQ Format
hz.tools Name
Underlying Go Type
Interleaved uint8 (rtl-sdr)
sdr.SamplesU8
[][2]uint8
Interleaved int8 (hackrf, uhd)
sdr.SamplesI8
[][2]int8
Interleaved int16 (pluto, uhd)
sdr.SamplesI16
[][2]int16
Interleaved float32 (airspy, uhd)
sdr.SamplesC64
[]complex64
The following SDRs have implemented drivers in-tree.
SDR
Format
RX/TX
State
rtl
u8
RX
Good
HackRF
i8
RX/TX
Good
PlutoSDR
i16
RX/TX
Good
rtl kerberos
u8
RX
Old
uhd
i16/c64/i8
RX/TX
Good
airspyhf
c64
RX
Exp
The following major packages and subpackages exist at the time of writing:
Import
What is it?
hz.tools/sdr
Core IQ types, supporting types and implementations that interact with the byte boundary
hz.tools/sdr/rtl
sdr.Receiver implementation using librtlsdr.
hz.tools/sdr/rtl/kerberos
Helpers to enable coherent RX using the Kerberos SDR.
hz.tools/sdr/rtl/e4k
Helpers to interact with the E4000 RTL-SDR dongle.
hz.tools/sdr/fft
Interfaces for performing an FFT, which are implemented by other packages.
sdr.Transceiver implementation for the PlutoSDR using libiio.
hz.tools/sdr/uhd
sdr.Transceiver implementation for UHD radios, specifically the B210 and B200mini
hz.tools/sdr/hackrf
sdr.Transceiver implementation for the HackRF using libhackrf.
hz.tools/sdr/mock
Mock SDR for testing purposes.
hz.tools/sdr/airspyhf
sdr.Receiver implementation for the AirspyHF+ Discovery with libairspyhf.
hz.tools/sdr/internal/simd
SIMD helpers for IQ operations, written in Go ASM. This isn t the best to learn from, and it contains pure go implemtnations alongside.
hz.tools/sdr/stream
Common Reader/Writer helpers that operate on IQ streams.
hz.tools/fftw - hz.tools/sdr/fft implementation
The hz.tools/fftw package contains bindings to libfftw3 to implement
the hz.tools/sdr/fft.Planner type to transform between the time and
frequency domain.
The git repo can be found at
github.com/hztools/go-fftw, and is
importable as hz.tools/fftw.
This is the default throughout most of my codebase, although that default is
only expressed at the leaf package libraries should not be hardcoding the
use of this library in favor of taking an fft.Planner, unless it s used as
part of testing. There are a bunch of ways to do an FFT out there, things like
clFFT or a pure-go FFT implementation could be plugged in depending on what s
being solved for.
hz.tools/ fm,am - analog audio demodulation and modulation
The hz.tools/fm and hz.tools/am packages contain demodulators for
AM analog radio, and FM analog radio. This code is a bit old, so it has
a lot of room for cleanup, but it ll do a very basic demodulation of IQ
to audio.
The git repos can be found at
github.com/hztools/go-fm and
github.com/hztools/go-am,
and are importable as
hz.tools/fm and
hz.tools/am.
As a bonus, the hz.tools/fm package also contains a modulator, which has been
tested on the air and with some of my handheld radios. This code is a bit
old, since the hz.tools/fm code is effectively the first IQ processing code
I d ever written, but it still runs and I run it from time to time.
// Basic sketch for playing FM radio using a reader stream from
// an SDR or other IQ stream.
bandwidth:=150*rf.KHzreader, err = stream.ConvertReader(reader, sdr.SampleFormatC64)
iferr!=nil...demod, err:=fm.Demodulate(reader, fm.DemodulatorConfigDeviation: bandwidth/2,
Downsample: 8, // some value here depending on sample rate
Planner: fftw.Plan,
)
iferr!=nil...speaker, err:=pulseaudio.NewWriter(pulseaudio.ConfigFormat: pulseaudio.SampleFormatFloat32NE,
Rate: demod.SampleRate(),
AppName: "rf",
StreamName: "fm",
Channels: 1,
SinkName: "",
)
iferr!=nil...buf:= make([]float32, 1024*64)
fori, err:=demod.Read(buf)
iferr!=nil...ifi==0
panic("...")
iferr:=speaker.Write(buf[:i]); err!=nil...
hz.tools/rfcap - byte serialization for IQ data
The hz.tools/rfcap package is the reference implementation of the
rfcap spec , and is how I store IQ captures
locally, and how I send them across a byte boundary.
The git repo can be found at
github.com/hztools/go-rfcap, and is
importable as hz.tools/rfcap.
If you re interested in storing IQ in a way others can use, the better approach
is to use SigMFrfcap exists for cases
like using UNIX pipes to move IQ around, through APIs, or when I send
IQ data through an OS socket, to ensure the sample format (and other metadata)
is communicated with it.
rfcap has a number of limitations, for instance, it can not express a change
in frequency or sample rate during the capture, since the header is fixed at
the beginning of the file.
Before you go on: I've been
warned off
implementing this in practice on a few counts; namely, the space tradeoff isn't
worth it, and it's unlikely to correct meaningful errors. I'm going to leave
this post up, but please do take the content with a very large grain of salt!
My initial efforts to build a PHY
and Data Link layer from
scratch using my own code have been progressing nicely since the initial
BPSK based protocol I ve documented under the PACKRAT series. As part of that,
I ve been diving deep into FEC,
and in particular, LDPC.
I won t be able to do an overview of LDPC justice in this post with any
luck that ll come in a later post to come as part of the RATPACK series,
so some knowledge is assumed. As such this post is less useful for those
looking to learn about LDPC, and a bit more targeted to those who enjoy
talking and thinking about FEC.
Hey, heads up! - This post contains extremely unvalidated and back of
the napkin quality work without any effort to prove this out generally.
Hopefully this work can be of help to others, but please double check anything
below if you need it for your own work!
While implementing LDPC, I ve gotten an encoder and checker working, enough
to use LDPC like a checksum. The next big step is to write a Decoder, which
can do error correction. The two popular approaches for the actual correction
that I ve seen while reading about LDPC are
Belief Propagation,
and some class of linear programming that I haven t dug into yet.
I m not thrilled at how expensive this all is in software, so while implementing
the stack I ve been exploring every shady side ally to try and learn more about
how encoders and decoders work, both in theory - and in practice.
Processing an LDPC Message
Checking if a message is correct is fairly straightforward with LDPC (as with
encoding, I ll note). As a quick refresher given the LDPC H (check) matrix
of width N, you can check your message vector (msg) of length N by
multipling H and msg, and checking if the output vector is all zero.
// scheme contains our G (generator) and
// H (check) matrices.
scheme:=G: Matrix... , H: Matrix...// msg contains our LDPC message (data and
// check bits).
msg:=Vector...// N is also the length of the encoded
// msg vector after check bits have been
// added.
N:=scheme.G.Width// Now, let's generate our 'check' vector.
ch:=Multiply(scheme.H, msg)
We can now see if the message is correct or not:
// if the ch vector is all zeros, we know
// that the message is valid, and we don't
// need to do anything.
ifch.IsZero()
// handle the case where the message
// is fine as-is.
return...// Expensive decode here
This is great for getting a thumbs up / thumbs down on the message being
correct, but correcting errors still requires pulling the LDPC matrix values
from the g (generator) matrix out, building a bipartite graph, and
iteratively reprocessing the bit values, until constraints are satisfied and
the message has been corrected.
This got me thinking - what is the output vector when it s not all zeros?
Since 1 values in the output vector indicates consistency problems in the
message bits as they relate to the check bits, I wondered if this could be used
to speed up my LDPC decoder. It appears to work, so this post is half an attempt
to document this technique before I put it in my hot path, and half a plea for
those who do like to talk about FEC to tell me what name this technique
actually is.
k-Bit Brute Forcing
Given that the output Vector s non-zero bit pattern is set due to the position
of errors in the message vector, let s use that fact to build up a table
of k-Bit errors that we can index into.
// for clarity's sake, the Vector
// type is being used as the lookup
// key here, even though it may
// need to be a hash or string in
// some cases.
idx:=map[Vector]intfori:=0; i < N; i++// Create a vector of length N
v:=Vectorv.FlipBit(i)
// Now, let's use the generator matrix to encode
// the data with checksums, and then use the
// check matrix on the message to figure out what
// bit pattern results
ev:=Multiply(scheme.H, Multiply(v, scheme.G))
idx[ev] = i
This can be extended to multiple bits (hence: k-Bits), but I ve only done
one here for illustration. Now that we have our idx mapping, we can now
go back to the hot path on Checking the incoming message data:
// if the ch vector is all zeros, we know
// that the message is valid, and we don't
// need to do anything.
ifch.IsZero()
// handle the case where the message
// is fine as-is.
return...errIdx, ok:=idx[ch]
ifokmsg.FlipBit(errIdx)
// Verify the LDPC message using
// H again here.
return...// Expensive decode here
Since map lookups wind up a heck of a lot faster than message-passing
bit state, the hope here is this will short-circuit easy to solve errors
for k-Bits, for some value of k that the system memory can tolerate.
Does this work?
Frankly I have no idea. I ve written a small program and brute forced
single-bit errors in all bit positions using random data to start with, and
I ve not been able to find any collisions in the 1-bit error set, using the
LDPC matrix from 802.3an-2006. Even if I was to find a collision for a
higher-order k-Bit value, I m tempted to continue with this approach, and treat
each set of bits in the Vector s bin (like a hash-table), checking the LDPC
validity after each bit set in the bin. As long as the collision rate is small
enough, it should be possible to correct k-Bits of error faster than the more
expensive Belief Propagation approach. That being said, I m not entirely
convinced collisions will be very common, but it ll take a bit more time
working through the math to say that with any confidence.
Have you seen this approach called something official in publications? See
an obvious flaw in the system? Send me a tip, please!
AX.25 is a tough protocol to use on UNIX systems. A lot of the support in
Linux, specifically, is pretty hard to use, and tends to be built into the
reptilian brain of the kernel. xan built a userland
AX.25 stack called patty, for which I have
now built some Go bindings on top of.
Code needed to create AX.25 Sockets via Go can be found at
github.com/k3xec/go-patty,
and imported by Go source as
k3xec.com/patty.
Overview
Clint patty programs (including consumers of this Go library) work by
communicating with a userland daemon (pattyd) via a UNIX named socket.
That daemon will communicate with a particular radio using a
KISS TNC serial device.
The Go bindings implement as many standard Go library interfaces as is
practical, allowing for the plug and play use of patty (and AX.25) in
places where you would expect a network socket (such as TCP) to work, such
as Go s http library.
Example
packagemainimport (
"fmt""log""net""os""time""k3xec.com/patty"
)
funcmain()
callsign:="N0CALL-10"client, err:=patty.Open("patty.sock")
iferr!=nil
panic(err)
l, err:=client.Listen("ax25", callsign)
iferr!=nil
panic(err)
forlog.Printf("Listening for requests to %s", l.Addr())
conn, err:=l.Accept()
iferr!=nillog.Printf("Error accepting: %s", err)
continuegohandle(conn)
funchandle(cnet.Conn) errordeferc.Close()
log.Printf("New connection from %s (local: %s)", c.RemoteAddr(), c.LocalAddr())
fmt.Fprintf(c, Hello! This is Paul's experimental %s node. Feel free
to poke around. Let me know if you spot anything funny.
Five pings are to follow!
, c.LocalAddr())
fori:=0; i < 5; i++time.Sleep(time.Second*5)
fmt.Fprintf(c, "Ping!\n")
returnnil
This post is part of a series called "PACKRAT". If this is the first post
you've found, it'd be worth reading the
intro post first and then looking over
all posts in the series.
In the last post, we left off at being able to send and recieve PACKRAT frames
to and from devices. Since we can transport IPv4 packets over the network,
let s go ahead and see if we can read/write
Ethernet frames from a Linux network
interface, and on the backend, read and write PACKRAT frames over the air.
This has the benifit of continuing to allow Linux userspace tools to work (like
cURL, as we ll try!), which means we don t have to do a lot of work to
implement higher level protocols or tactics to get a connection established
over the link.
Given that this post is less RF and more Linuxy, I m going to include more code
snippits than in prior posts, and those snippits are closer to runable Go, but
still not complete examples. There s also a lot of different ways to do this,
I ve just picked the easiest one for me to implement and debug given my
existing tooling for you, you may find another approach easier to implement!
Again, deviation here is very welcome, and since this segment is the least
RF centric post in the series, the pace and tone is going to feel different.
If you feel lost here, that s OK. This isn t the most important part of
the series, and is mostly here to give a concrete ending to the story arc.
Any way you want to finish your own journy is the best way for you to finish
it!
Implement Ethernet conversion code
This assumes an importable package with a Frame struct, which we can use to
convert a Frame to/from Ethernet. Given that the PACKRAT frame has a field that
Ethernet doesn t (namely, Callsign), that will need to be explicitly passed
in when turning an Ethernet frame into a PACKRAT Frame.
...// ToPackrat will create a packrat frame from an Ethernet frame.
funcToPackrat(callsign [8]byte, frame*ethernet.Frame) (*packrat.Frame, error)
varframeTypepackrat.FrameTypeswitchframe.EtherTypecaseethernet.EtherTypeIPv4:
frameType = packrat.FrameTypeIPv4default:
returnnil, fmt.Errorf("ethernet: unsupported ethernet type %x", frame.EtherType)
return&packrat.FrameDestination: frame.Destination,
Source: frame.Source,
Type: frameType,
Callsign: callsign,
Payload: frame.Payload,
, nil// FromPackrat will create an Ethernet frame from a Packrat frame.
funcFromPackrat(frame*packrat.Frame) (*ethernet.Frame, error)
varetherTypeethernet.EtherTypeswitchframe.Typecasepackrat.FrameTypeRaw:
returnnil, fmt.Errorf("ethernet: unsupported packrat type 'raw'")
casepackrat.FrameTypeIPv4:
etherType = ethernet.EtherTypeIPv4default:
returnnil, fmt.Errorf("ethernet: unknown packrat type %x", frame.Type)
// We lose the Callsign here, which is sad.
returnðernet.FrameDestination: frame.Destination,
Source: frame.Source,
EtherType: etherType,
Payload: frame.Payload,
, nil
Our helpers, ToPackrat and FromPackrat can now be used to transmorgify
PACKRAT into Ethernet, or Ethernet into PACKRAT. Let s put them into use!
Implement a TAP interface
On Linux, the networking stack can be exposed to userland using TUN or
TAP interfaces. TUN devices allow a
userspace program to read and write data at the Layer 3 / IP layer. TAP devices
allow a userspace program to read and write data at the Layer 2 Data Link /
Ethernet layer. Writing data at Layer 2 is what we want to do, since we re
looking to transform our Layer 2 into Ethernet s Layer 2 Frames. Our first job
here is to create the actual TAP interface, set the MAC address, and set the IP
range to our pre-coordinated IP range.
...import (
"net""github.com/mdlayher/ethernet""github.com/songgao/water""github.com/vishvananda/netlink"
)
...config:=water.ConfigDeviceType: water.TAPconfig.Name = "rat0"iface, err:=water.New(config)
...netIface, err:=netlink.LinkByName("rat0")
...// Pick a range here that works for you!
//
// For my local network, I'm using some IPs
// that AMPR (ampr.org) was nice enough to
// allocate to me for ham radio use. Thanks,
// AMPR!
//
// Let's just use 10.* here, though.
//
ip, cidr, err:=net.ParseCIDR("10.0.0.1/24")
...cidr.IP = iperr = netlink.AddrAdd(netIface, &netlink.AddrIPNet: cidr,
Peer: cidr,
)
...// Add all our neighbors to the ARP table
for_, neighbor:=rangeneighborsnetlink.NeighAdd(&netlink.NeighLinkIndex: netIface.Attrs().Index,
Type: netlink.FAMILY_V4,
State: netlink.NUD_PERMANENT,
IP: neighbor.IP,
HardwareAddr: neighbor.MAC,
)
// Pick a MAC that is globally unique here, this is
// just used as an example!
addr, err:=net.ParseMAC("FA:DE:DC:AB:LE:01")
...netlink.LinkSetHardwareAddr(netIface, addr)
...err = netlink.LinkSetUp(netIface)
varframe = ðernet.Framevarbuf = make([]byte, 1500)
forn, err:=iface.Read(buf)
...err = frame.UnmarshalBinary(buf[:n])
...// process frame here (to come)
...
Now that our network stack can resolve an IP to a MAC Address (via ip neigh
according to our pre-defined neighbors), and send that IP packet to our daemon,
it s now on us to send IPv4 data over the airwaves. Here, we re going to take
packets coming in from our TAP interface, and marshal the Ethernet frame into a
PACKRAT Frame and transmit it. As with the rest of the RF code, we ll leave
that up to the implementer, of course, using what was built during Part 2:
Transmitting BPSK symbols and Part 4: Framing
data.
...for// continued from above
n, err:=iface.Read(buf)
...err = frame.UnmarshalBinary(buf[:n])
...switchframe.EtherTypecase0x0800:
// ipv4 packet
pack, err:=ToPackrat(
// Add my callsign to all Frames, for now
[8]byte'K', '3', 'X', 'E', 'C' ,
frame,
)
...err = transmitPacket(pack)
......
Now that we have transmitting covered, let s go ahead and handle the recieve
path here. We re going to listen on frequency using the code built in
Part 3: Receiving BPSK symbols and
Part 4: Framing data. The Frames we decode from
the airwaves are expected to come back from the call
packratReader.Next in the code below, and the exact way that works
is up to the implementer.
...for// pull the next packrat frame from
// the symbol stream as we did in the
// last post
packet, err:=packratReader.Next()
...// check for CRC errors and drop invalid
// packets
err = packet.Check()
...ifbytes.Equal(packet.Source, addr)
// if we've heard ourself transmitting
// let's avoid looping back
continue// create an ethernet frame
frame, err:=FromPackrat(packet)
...buf, err:=frame.MarshalBinary()
...// and inject it into the tap
err = iface.Write(buf)
......
Phew. Right. Now we should be able to listen for PACKRAT frames on the
air and inject them into our TAP interface.
Putting it all Together
After all this work weeks of work! we can finally get around to putting
some real packets over the air. For me, this was an incredibly satisfying
milestone, and tied together months of learning!
I was able to start up a UDP server on a remote machine with an RTL-SDR dongle
attached to it, listening on the TAP interface s host IP with my defined MAC
address, and send UDP packets to that server via PACKRAT using my laptop,
/dev/udp and an Ettus B210, sending packets into the TAP interface.
Now that UDP was working, I was able to get TCP to work using two PlutoSDRs,
which allowed me to run the cURL command I pasted in the first post (both
simultaneously listen and transmit on behalf of my TAP interface).
It s my hope that someone out there will be inspired to implement their own
Layer 1 and Layer 2 as a learning exercise, and gets the same sense of
gratification that I did! If you re reading this, and at a point
where you ve been able to send IP traffic over your own Layer 1 / Layer 2,
please get in touch! I d be thrilled to hear all about it. I d love to link
to any posts or examples you publish here!
This post is part of a series called "PACKRAT". If this is the first post
you've found, it'd be worth reading the
intro post first and then looking over
all posts in the series.
In the last post, we we were able to build a functioning
Layer 1 PHY where we can encode symbols to transmit, and receive symbols on the
other end, we re now at the point where we can encode and decode those symbols
as bits and frame blocks of data, marking them with a Sender and a Destination
for routing to the right host(s). This is a Layer 2 scheme in the OSI model,
which is otherwise known as the Data Link
Layer. You re using one to
view this website right now I m willing to bet your data is going
through an Ethernet layer 2 as well as WiFi or maybe a cellular data
protocol like 5G or LTE.
Given that this entire exercise is hard enough without designing a complex
Layer 2 scheme, I opted for simplicity in the hopes this would free me from the
complexity and research that has gone into this field for the last 50 years. I
settled on stealing a few ideas from Ethernet Frames namely, the use of MAC
addresses to identify parties, and
the EtherType field to indicate the Payload type. I also stole the idea of
using a CRC at the end of the Frame to check for corruption, as well as the
specific CRC method (crc32 using 0xedb88320 as the polynomial).
Lastly, I added a callsign field to make life easier on ham radio frequencies
if I was ever to seriously attempt to use a variant of this protocol over the
air with multiple users. However, given this scheme is not a commonly used
scheme, it s best practice to use a nearby radio to identify your transmissions
on the same frequency while testing or use a Faraday box to test without
transmitting over the airwaves. I added the callsign field in an effort to lean
into the spirit of the Part 97 regulations, even if I relied on a phone
emission to identify the Frames.
As an aside, I asked the ARRL for input here, and their stance to me over email
was I d be OK according to the regs if I were to stick to UHF and put my
callsign into the BPSK stream using a widely understood encoding (even with no
knowledge of PACKRAT, the callsign is ASCII over BPSK and should be easily
demodulatable for followup with me). Even with all this, I opted to use FM
phone to transmit my callsign when I was active on the air (specifically, using
an SDR and a small bash script to automate transmission while I watched for
interference or other band users).
Right, back to the Frame:
sync
dest
source
callsign
type
payload
crc
With all that done, I put that layout into a struct, so that we can marshal
and unmarshal bytes to and from our Frame objects, and work with it
in software.
Time to pick some consts
I picked a unique and distinctive sync sequence, which the sender
will transmit before the Frame, while the receiver listens for that
sequence to know when it s in byte alignment with the symbol stream.
My sync sequence is [3]byte 'U', 'f', '~' which works out to be a
very pleasant bit sequence of 01010101 01100110 01111110. It s important
to have soothing preambles for your Frames. We need all the good energy
we can get at this point.
var (
FrameStart = [3]byte'U', 'f', '~'FrameMaxPayloadSize = 1500
)
Next, I defined some FrameType values for the type field,
which I can use to determine what is done with that data next,
something Ethernet was originally missing, but has since grown
to depend on (who needs Length anyway? Not me. See below!)
FrameType
Description
Bytes
Raw
Bytes in the Payload field are opaque and not to be parsed.
[2]byte 0x00, 0x01
IPv4
Bytes in the Payload field are an IPv4 packet.
[2]byte 0x00, 0x02
And finally, I decided on a maximum length of the Payload, and decided on
limiting it to 1500 bytes to align with the MTU of Ethernet.
var (
FrameTypeRaw = FrameType0, 1FrameTypeIPv4 = FrameType0, 2
)
Given we know how we re going to marshal and unmarshal binary data to and from
Frames, we can now move on to looking through the bit stream for our Frames.
Why is there no Length field?
I was initially a bit surprised that Ethernet Frames didn t have a Length field
in use, but the more I thought about it, the more it seemed like a big ole'
failure mode without a good implementation outcome. Either the Length is right
(resulting in no action and used bits on every packet) or the Length is not the
length of the Payload and the driver needs to determine what to do with the
packet does it try and trim the overlong payload and ignore the rest? What
if both the end of the read bytes and the end of the subset of the packet
denoted by Length have a valid CRC? Which is used? Will everyone agree? What
if Length is longer than the Payload but the CRC is good where we detected a
lost carrer?
I decided on simplicity. The end of a Frame is denoted by the loss of the BPSK
carrier when the signal is no longer being transmitted (or more correctly,
when the signal is no longer received), we know we ve hit the end of a packet.
Missing a single symbol will result in the Frame being finalized. This can
cause some degree of corruption, but it s also a lot easier than doing tricks
like bit stuffing to create an end of symbol stream delimiter.
Finding the Frame start in a Symbol Stream
First thing we need to do is find our sync bit pattern in the symbols we re
receiving from our BPSK demodulator. There s some smart ways to do this, but
given that I m not much of a smart man, I again decided to go for simple
instead. Given our incoming vector of symbols (which are still float values)
prepend one at a time to a vector of floats that is the same length as the
sync phrase, and compare against the sync phrase, to determine if we re in
sync with the byte boundary within the symbol stream.
The only trick here is that because we re using BPSK to modulate and demodulate
the data, post phaselock we can be 180 degrees out of alignment (such that a +1
is demodulated as -1, or vice versa). To deal with that, I check against both
the sync phrase as well as the inverse of the sync phrase (both [1, -1, 1] as well as [-1, 1, -1]) where if the inverse sync is matched, all symbols
to follow will be inverted as well. This effectively turns our symbols back
into bits, even if we re flipped out of phase. Other techniques like
NRZI will represent a 0 or
1 by a change in phase state which is great, but can often cascade into long
runs of bit errors, and is generally more complex to implement. That
representation isn t ambiguous, given you look for a phase change, not the
absolute phase value, which is incredibly compelling.
Here s a notional example of how I ve been thinking about the phrase sliding
window and how I ve been thinking of the checks. Each row is a new symbol
taken from the BPSK receiver, and pushed to the head of the sliding window,
moving all symbols back in the vector by one.
Given the pseudocode above, let s step through what the checks would be doing
at each step:
Buffer
Sync
Inverse Sync
[ ]float 0, ,0
[ ]float -1, ,-1
[ ]float 1, ,1
[ ]float 0, ,1
[ ]float -1, ,-1
[ ]float 1, ,1
[more bits in]
[ ]float -1, ,-1
[ ]float 1, ,1
[ ]float 1, ,1
[ ]float -1, ,-1
[ ]float 1, ,1
After this notional set of comparisons, we know that at the last step, we are
now aligned to the frame and byte boundary the next symbol / bit will be the
MSB of the 0th Frame byte. Additionally, we know we re also 180 degrees out of
phase, so we need to flip the symbol s sign to get the bit. From this point on
we can consume 8 bits at a time, and re-assemble the byte stream. I don t know
what this technique is called or even if this is used in real grown-up
implementations, but it s been working for my toy implementation.
Next Steps
Now that we can read/write Frames to and from PACKRAT, the next steps here are
going to be implementing code to encode and decode Ethernet traffic into
PACKRAT, coming next in Part 5!
This post is part of a series called "PACKRAT". If this is the first post
you've found, it'd be worth reading the
intro post first and then looking over
all posts in the series.
In the last post, we worked through how to generate
a BPSK signal, and hopefully transmit it using one of our SDRs.
Let s take that and move on to Receiving BPSK and turning that back
into symbols!
Demodulating BPSK data is a bit more tricky than transmitting BPSK data, mostly
due to tedious facts of life such as space, time, and hardware built with
compromises because not doing that makes the problem impossible. Unfortunately,
it s now our job to work within our imperfect world to recover perfect data. We
need to handle the addition of noise, differences in frequency, clock
synchronization and interference in order to recover our information. This
makes life a lot harder than when we transmit information, and as a result, a
lot more complex.
Coarse Sync
Our starting point for this section will be working from a capture of a number
of generated PACKRAT packets
as heard by a PlutoSDR at (xz compressed interleaved int16, 2,621,440 samples per second)
Every SDR has its own oscillator, which eventually controls a number of
different components of an SDR, such as the IF (if it s a superheterodyne
architecture) and the sampling rate. Drift in oscillators lead to drifts in
frequency such that what one SDR may think is 100MHz may be 100.01MHz for
another radio. Even if the radios were perfectly in sync, other artifacts such
as doppler time dilation due to motion
can cause the frequency to appear higher or lower in frequency than it was
transmitted.
All this is a long way of saying, we need to determine when we see a strong
signal that s close-ish to our tuned frequency, and take steps to roughly
correct it to our center frequency (in the order of 100s of Hz to kHz) in order
to acquire a phase lock on the signal to attempt to decode information contained
within.
The easiest way of detecting the loudest signal of interest is to use an
FFT. Getting into how
FFTs work is out of scope of this post, so if this is the first time you re
seeing mention of an FFT, it may be a good place to take a quick break to learn
a bit about the time domain (which is what the IQ data we ve been working with
so far is), frequency domain, and how the FFT and iFFT operations can convert
between them.
Lastly, because FFTs average power over the window, swapping phases such that
the transmitted wave has the same number of in-phase and inverted-phase symbols
the power would wind up averaging to zero. This is not helpful, so I took a tip
from Dr. Marc Lichtman s PySDR
project
and used complex squaring to drive
our BPSK signal into a single detectable carrier by squaring the IQ data.
Because points are on the unit circle and at tau/2 (specifically, tau/(2^1)
for BPSK, 2^2 for QPSK) angles, and given that squaring has the effect of
doubling the angle, and angles are all mod tau, this will drive our wave
comprised of two opposite phases back into a continuous wave effectively
removing our BPSK modulation, making it much easier to detect in the
frequency domain. Thanks to Tom Bereknyei
for helping me with that!
...variq []complexvarfreq []complexfori:=rangeiqiq[i] = iq[i] *iq[i]
// perform an fft, computing the frequency
// domain vector in freq given the iq data
// contained in iq .
fft(iq, freq)
// get the array index of the max value in the
// freq array given the magnitude value of the
// complex numbers.
varbinIdx = max(abs(freq))
...
Now, most FFT operations will lay the frequency domain data out a bit
differently than you may expect (as a human), which is that the 0th element of
the FFT is 0Hz, not the most negative number (like in a waterfall). Generally
speaking, zero first is the most common frequency domain layout (and
generally speaking the most safe assumption if there s no other documentation
on fft layout). Negative first is usually used when the FFT is being rendered
for human consumption such as a waterfall plot.
Given that we now know which FFT bin (which is to say, which index into the
FFT array) contains the strongest signal, we ll go ahead and figure out what
frequency that bin relates to.
In the time domain, each complex number is the next time instant. In the
frequency domain, each bin is a discrete frequency or more specifically a
frequency range. The bandwidth of the bin is a function of the sampling rate
and number of time domain samples used to do the FFT operation. As you increase
the amount of time used to preform the FFT, the more precise the FFT
measurement of frequency can be, but it will cover the same bandwidth, as
defined by the sampling rate.
...varsampleRate = 2,621,440// bandwidth is the range of frequencies
// contained inside a single FFT bin,
// measured in Hz.
varbandwidth = sampleRate/len(freq)
...
Now that we know we have a zero-first layout and the bin bandwidth,
we can compute what our frequency offset is in Hz.
...// binIdx is the index into the freq slice
// containing the frequency domain data.
varbinIdx = 0// binFreq is the frequency of the bin
// denoted by binIdx
varbinFreq = 0ifbinIdx > len(freq)/2// This branch covers the case where the bin
// is past the middle point - which is to say,
// if this is a negative frequency.
binFreq = bandwidth* (binIdx- len(freq))
else// This branch covers the case where the bin
// is in the first half of the frequency array,
// which is to say - if this frequency is
// a positive frequency.
binFreq = bandwidth*binIdx...
However, sice we squared the IQ data, we re off in frequency by twice the
actual frequency if we are reading 12kHz, the bin is actually 6kHz. We
need to adjust for that before continuing with processing.
...varbinFreq = 0...// [compute the binFreq as above]
...// Adjust for the squaring of our IQ data
binFreq = binFreq/2...
Finally, we need to shift the frequency by the inverse of the binFreq
by generating a carrier wave at a specific frequency and rotating every
sample by our carrier wave so that a wave at the same frequency will
slow down (or stand still!) as it approaches 0Hz relative to the carrier
wave.
vartau = pi*2// ts tracks where in time we are (basically: phase)
vartsfloat// inc is the amount we step forward in time (seconds)
// each sample.
varincfloat = (1/sampleRate)
// amount to shift frequencies, in Hz,
// in this case, shift +12 kHz to 0Hz
varshift = -12,000fori:=rangeiqts+=incifts > tau// not actually needed, but keeps ts within
// 0 to 2*pi (since it is modulus 2*pi anyway)
ts-=tau// Here, we're going to create a carrier wave
// at the provided frequency (in this case,
// -12kHz)
cwIq = complex(cos(tau*shift*ts), sin(tau*shift*ts))
iq[i] = iq[i] *cwIq
Now we ve got the strong signal we ve observed (which may or may not be
our BPSK modulated signal!) close enough to 0Hz that we ought to be able
to Phase Lock the signal in order to begin demodulating the signal.
Filter
After we re roughly in the neighborhood of a few kHz, we can now take some
steps to cut out any high frequency components (both positive high frequencies
and negative high frequencies). The normal way to do this would be to do an
FFT, apply the filter in the frequency domain, and then do an iFFT to turn it
back into time series data. This will work in loads of cases, but I ve
found it to be incredibly tricky to get right when doing PSK. As such, I ve
opted to do this the old fashioned way in the time domain.
I ve again opted to go simple rather than correct, and haven t used
nearly any of the advanced level trickery I ve come across for fear of using
it wrong. As a result, our process here is going to be generating a
sinc filter by computing a
number of taps, and applying that in the time domain directly on the IQ
stream.
...// Apply sinc taps to an IQ stream
variq []complex// taps as created in dst above
vartaps []floatvardelay = make([]complex, len(taps))
fori:=rangeiq// let's shift the next sample into
// the delay buffer
copy(delay[1:], delay)
delay[0] = iq[i]
varphasorcomplexforj:=rangedelay// for each sample in the buffer, let's
// weight them by the tap values, and
// create a new complex number based on
// filtering the real and imag values.
phasor+= complex(
taps[j] * real(delay[j]),
taps[j] * imag(delay[j]),
)
// now that we've run this sample
// through the filter, we can go ahead
// and scale it back (since we multiply
// above) and drop it back into the iq
// buffer.
iq[i] = complex(
real(phasor) / len(taps),
imag(phasor) / len(taps),
)
...
After running IQ samples through the taps and back out, we ll have a signal
that s been filtered to the shape of our designed Sinc filter which will
cut out captured high frequency components (both positive and negative).
Astute observers will note that we re using the real (float) valued taps
on both the real and imaginary values independently. I m sure there s a way
to apply taps using complex numbers, but it was a bit confusing to work
through without being positive of the outcome. I may revisit this in the
future!
Downsample
Now, post-filter, we ve got a lot of extra RF bandwidth being represented in
our IQ stream at our high sample rate All the high frequency values are now
filtered out, which means we can reduce our sampling rate without losing much
information at all. We can either do nothing about it and process at the fairly
high sample rate we re capturing at, or we can drop the sample rate down and
help reduce the volume of numbers coming our way.
There s two big ways of doing this; either you can take every Nth sample (e.g.,
take every other sample to half the sample rate, or take every 10th to decimate
the sample stream to a 10th of what it originally was) which is the easiest to
implement (and easy on the CPU too), or to average a number of samples to
create a new sample.
A nice bonus to averaging samples is that you can trade-off some CPU time for a
higher effective number of bits
(ENOB) in your IQ
stream, which helps reduce noise, among other things. Some hardware does
exactly this (called Oversampling ), and like many things, it has
some pros and some cons.
I ve opted to treat our IQ stream like an oversampled IQ stream
and average samples to get a marginal bump in ENOB.
Taking a group of 4 samples and averaging them results in a bit of added
precision. That means that a stream of IQ data at 8 ENOB can be bumped to 9
ENOB of precision after the process of oversampling and averaging. That
resulting stream will be at 1/4 of the sample rate, and this process can be
repeated 4 samples can again be taken for a bit of added precision; which is
going to be 1/4 of the sample rate (again), or 1/16 of the original sample
rate. If we again take a group of 4 samples, we ll wind up with another bit and
a sample rate that s 1/64 of the original sample rate.
Phase Lock
Our starting point for this section is the same capture as above, but
post-coarse sync, filtering downsampling (xz compressed interleaved float32, 163,840 samples per second)
The PLL in PACKRAT was one
of the parts I spent the most time stuck on. There s no shortage of discussions
of how hardware PLLs work, or even a few software PLLs, but very little by way
of how to apply them and/or troubleshoot them. After getting frustrated trying
to follow the well worn path, I decided to cut my own way through the bush
using what I had learned about the concept, and hope that it works well enough
to continue on.
PLLs, in concept are fairly simple you generate a carrier wave at a
frequency, compare the real-world SDR IQ sample to where your carrier wave is
in phase, and use the difference between the local wave and the observed wave
to adjust the frequency and phase of your carrier wave. Eventually, if all goes
well, that delta is driven as small as possible, and your carrier wave can be
used as a reference clock to determine if the observed signal changes in
frequency or phase.
In reality, tuning PLLs is a total pain, and basically no one outlines how to
apply them to BPSK signals in a descriptive way. I ve had to steal an approach
I ve seen in hardware to implement my software PLL, with any hope it s close
enough that this isn t a hazard to learners. The concept is to generate the
carrier wave (as above) and store some rolling averages to tune the carrier
wave over time. I use two constants, alpha and beta (which appear to be
traditional PLL variable names for this function) which control how quickly the
frequency and phase is changed according to observed mismatches. Alpha is set
fairly high, which means discrepancies between our carrier and observed data
are quickly applied to the phase, and a lower constant for Beta, which will
take long-term errors and attempt to use that to match frequency.
This is all well and good. Getting to this point isn t all that obscure, but
the trouble comes when processing a BPSK signal. Phase changes kick the PLL
out of alignment and it tends to require some time to get back into phase lock,
when we really shouldn t even be loosing it in the first place. My attempt is
to generate two predicted samples, one for each phase of our BPSK signal. The
delta is compared, and the lower error of the two is used to adjust the PLL,
but the carrier wave itself is used to rotate the sample.
varalpha = 0.1varbeta = (alpha*alpha) /2varphase = 0.0varfrequency = 0.0...fori:=rangeiqpredicted = complex(cos(phase), sin(phase))
sample = iq[i] *conj(predicted)
delta = phase(sample)
predicted2 = complex(cos(phase+pi), sin(phase+pi))
sample2 = iq[i] *conj(predicted2)
delta2 = phase(sample2)
ifabs(delta2) < abs(delta)
// note that we do not update 'sample'.
delta = delta2phase+=alpha*deltafrequency+=beta*delta// adjust the iq sample to the PLL rotated
// sample.
iq[i] = sample...
If all goes well, this loop has the effect of driving a BPSK signal s imaginary
values to 0, and the real value between +1 and -1.
Average Idle / Carrier Detect
Our starting point for this section is the same capture as above, but
post-PLL (xz compressed interleaved float32, 163,840 samples per second)
When we start out, we have IQ samples that have been mostly driven to an
imaginary component of 0 and real value range between +1 and -1 for each symbol
period. Our goal now is to determine if we re receiving a signal, and if so,
determine if it s +1 or -1. This is a deceptively hard problem given it spans a
lot of other similarly entertaining hard problems. I ve opted to not solve the
hard problems involved and hope that in practice my very haphazard
implementation works well enough. This turns out to be both good (not solving a
problem is a great way to not spend time on it) and bad (turns out it does
materially impact performance). This segment is the one I plan on revisiting,
first. Expect more here at some point!
Given that I want to be able to encapsulate three states in the output from
this section (our Symbols are no carrier detected ( 0 ), real value 1 ( 1 ) or
real value -1 ("-1")), which means spending cycles to determine what the
baseline noise is to try and identify when a signal breaks through the noise
becomes incredibly important.
varidleThresholdvarthresholdFactor = 10...// sigThreshold is used to determine if the symbol
// is -1, +1 or 0. It's 1.3 times the idle signal
// threshold.
varsigThreshold = (idleThreshold*0.3) +idleThreshold// iq contains a single symbol's worth of IQ samples.
// clock alignment isn't really considered; so we'll
// get a bad packet if we have a symbol transition
// in the middle of this buffer. No attempt is made
// to correct for this yet.
variq []complex// avg is used to average a chunk of samples in the
// symbol buffer.
varavgfloatvarmid = len(iq) /2// midNum is used to determine how many symbols to
// average at the middle of the symbol.
varmidNum = len(iq) /50forj:=mid; j < mid+midNum; j++avg+= real(iq[j])
avg/=midNumvarsymbolfloatswitchcaseavg > sigThreshold:
symbol = 1caseavg < -sigThreshold:
symbol = -1default:
symbol = 0// update the idleThreshold using the thresholdFactor
// to average the idleThreshold over more samples to
// get a better idea of average noise.
idleThreshold = (
(idleThreshold*(thresholdFactor-1) +symbol) \
/thresholdFactor
)
// write symbol to output somewhere
...
Next Steps
Now that we have a stream of values that are either +1, -1 or 0, we
can frame / unframe the data contained in the stream, and decode Packets
contained inside, coming next in Part 4!
This post is part of a series called "PACKRAT". If this is the first post
you've found, it'd be worth reading the
intro post first and then looking over
all posts in the series.
In the last post, we worked through what IQ is,
and different formats that it may be sent or received in. Let s take that
and move on to Transmitting BPSK using IQ data!
When we transmit and receive information through RF using an SDR, data is
traditionally encoded into a stream of
symbols
which are then used by a program to modulate the IQ stream, and sent over
the airwaves.
PACKRAT uses BPSK to
encode Symbols through RF. BPSK is the act of modulating the phase of a
sine wave to carry information. The transmitted wave swaps between two
states in order to convey a 0 or a 1. Our symbols modulate the transmitted
sine wave s phase, so that it moves between in-phase with the SDR s transmitter
and 180 degrees (or radians) out of phase with the SDR s transmitter.
The difference between a Bit and a Symbol in PACKRAT is not incredibly
meaningful, and I ll often find myself slipping up when talking about them.
I ve done my best to try and use the right word at the right stage, but it s
not as obvious where the line between bit and symbol is at least not as
obvious as it would be with QPSK or QAM. The biggest difference is that there
are three meaningful states for PACKRAT over BPSK - a 1 (for In phase ), -1
(for 180 degrees out of phase ) and 0 (for no carrier ). For my
implementation, a stream of all zeros will not transmit data over the airwaves,
a stream of all 1s will transmit all 1 bits over the airwaves, and a stream
of all -1s will transmit all 0 bits over the airwaves.
We re not going to cover turning a byte (or bit) into a symbol yet I m going
to write more about that in a later section. So for now, let s just worry about
symbols in, and symbols out.
Transmitting a Sine wave at 0Hz
If we go back to thinking about IQ data as a precisely timed measurements of
energy over time at some particular specific frequency, we can consider what
a sine wave will look like in IQ. Before we dive into antennas and RF, let s go
to something a bit more visual.
For the first example, you can see an example of a camera who s frame rate (or
Sampling Rate!) matches the exact number of rotations per second (or
Frequency!) of the propeller and it appears to stand exactly
still. Every time the Camera
takes a frame, it s catching the propeller in the exact same place in space,
even though it s made a complete rotation.
The second example is very similar, it s a light strobing (in this case, our
sampling rate, since the darkness is ignored by our brains) at the same rate
(frequency) as water dropping from a
faucet and the video creator
is even nice enough to change the sampling frequency to have the droplets move
both forward and backward (positive and negative frequency) in comparison to the
faucet.
IQ works the same way. If we catch something in perfect frequency alignment
with our radio, we ll wind up with readings that are the same for the entire
stream of data. This means we can transmit a sine wave by setting all of the
IQ samples in our buffer to 1+0i, which will transmit a pure sine wave at
exactly the center frequency of the radio.
Alternatively, we can transmit a Sine wave (but with the opposite phase) by
flipping the real value from 1 to -1. The same Sine wave is transmitted on
the same Frequency, except when the wave goes high in the example above, the
wave will go low in the example below.
In fact, we can make a carrier wave at any phase angle and amplitude by using a
bit of trig.
// angle is in radians - here we have
// 1.5 Pi (3 Tau) or 270 degrees.
varangle = pi*1.5// amplitude controls the transmitted
// strength of the carrier wave.
varamplitude = 1.0// output buffer as above
varsine []complexfori:=rangesinesine[i] = complex(
amplitude*cos(angle),
amplitude*sin(angle),
)
The amplitude of the transmitted wave is the absolute value of the IQ
sample (sometimes called magnitude), and the phase can be computed as the
angle (or argument). The amplitude remains constant (at 1) in both cases.
Remember back to the airplane propeller or water droplets we re controlling
where we re observing the sine wave. It looks like a consistent value
to us, but in reality it s being transmitted as a pure carrier wave at the
provided frequency. Changing the angle of the number we re transmitting will
control where in the sine wave cycle we re observing it at.
Generating BPSK modulated IQ data
Modulating our carrier wave with our symbols is fairly straightforward to
do we can multiply the symbol by 1 to get the real value to be used
in the IQ stream. Or, more simply - we can just use the symbol directly in
the constructed IQ data.
varsampleRate = 2,621,440varbaudRate = 1024// This represents the number of IQ samples
// required to send a single symbol at the
// provided baud and sample rate. I picked
// two numbers in order to avoid half samples.
// We will transmit each symbol in blocks of
// this size.
varsamplesPerSymbol = sampleRate/baudRatevarsamples = make([]complex, samplesPerSymbol)
// symbol is one of 1, -1 or 0.
foreachsymbolinsymbolsfori:=rangesamplessamples[i] = complex(symbol, 0)
// write the samples out to an output file
// or radio.
write(samples)
Next Steps
Now that we can transmit data, we ll start working on a receive path in Part 3,
in order to check our work when transmitting the packets, as well as being
able to hear packets we transmit from afar, coming up next in
Part 3!!
This post is part of a series called "PACKRAT". If this is the first post
you've found, it'd be worth reading the
intro post first and then looking over
all posts in the series.
When working with SDRs, information about the signals your radio is receiving
are communicated by streams of
IQ
data. IQ is short for In-phase and Quadrature , which means 90 degrees
out of phase. Values in the IQ stream are
complex numbers, so
converting them to a native complex type in your language
helps greatly when processing the IQ data for meaning.
I won t get too deep into what IQ is or why complex numbers (mostly since I
don t think I fully understand it well enough to explain it yet), but here s
some basics in case this is your first interaction with IQ data before
going off and reading more.
Before we get started
at any point, if you feel lost in this post, it's OK to take a break to do
a bit of learning elsewhere in the internet. I'm still new to this, so I'm
sure my overview in one paragraph here won't help clarify things too much.
This took me months to sort out on my own. It's not you, really! I particularly
enjoyed reading
visual-dsp.switchb.org
when it came to learning about how IQ represents signals, and
Software-Defined Radio for Engineers
for a more general reference.
Each value in the stream is taken at a precisely spaced sampling interval
(called the sampling rate of the radio). Jitter in that sampling interval, or a
drift in the requested and actual sampling rate (usually represented in PPM, or
parts per million how many samples out of one million are missing) can cause
errors in frequency. In the case of a PPM error, one radio may think it s
100.1MHz and the other may think it s 100.2MHz, and jitter will result in added
noise in the resulting stream.
A single IQ sample is both the real and imaginary values, together. The complex
number (both parts) is the sample. The number of samples per second is the
number of real and imaginary value pairs per second.
Each sample is reading the electrical energy coming off the antenna at that
exact time instant. We re looking to see how that goes up and down over time to
determine what frequencies we re observing around us. If the IQ stream is only
real-valued measures (e.g., float values rather than complex values reading
voltage from a wire), you can still send and receive signals, but those signals
will be mirrored across your 0Hz boundary. That means if you re tuned to
100MHz, and you have a nearby transmitter at 99.9MHz, you d see it at 100.1MHz.
If you want to get an intuitive understanding of this concept before getting
into the heavy math, a good place to start is looking at how Quadrature
encoders
work. Using complex numbers means we can see up in frequency as well as
down in frequency, and understand that those are different signals.
The reason why we need negative frequencies is that our 0Hz is the center of
our SDR s tuned frequency, not actually at 0Hz in nature. Generally speaking,
it s doing loads in hardware (and firmware!) to mix the raw RF signals with a
local oscillator to a frequency that can be sampled at the requested rate
(fundamentally the same concept as a superheterodyne
receiver), so a
frequency of -10MHz means that signal is 10 MHz below the center of our SDR s
tuned frequency.
The sampling rate dictates the amount of frequency representable in the data
stream. You ll sometimes see this called the
Nyquist frequency. The Nyquist
Frequency is one half of the sampling rate. Intuitively, if you think about the
amount of bandwidth observable as being 1:1 with the sampling rate of the stream,
and the middle of your bandwidth is 0 Hz, you would only have enough space to
go up in frequency for half of your bandwidth or half of your sampling rate.
Same for going down in frequency.
Float 32 / Complex 64
IQ samples that are being processed by software are commonly processed as
an interleaved pair of 32 bit floating point numbers, or a 64 bit complex
number. The first float32 is the real value, and the second is the imaginary
value.
I#0
Q#0
I#1
Q#1
I#2
Q#2
The complex number 1+1i is represented as 1.0 1.0 and the complex number
-1-1i is represented as -1.0 -1.0. Unless otherwise specified, all the
IQ samples and pseudocode to follow assumes interleaved float32 IQ data
streams.
Example interleaved float32 file (10Hz Wave at 1024 Samples per Second)
RTL-SDR
IQ samples from the RTL-SDR are encoded as a stream of interleaved unsigned
8 bit integers (uint8 or u8). The first sample is the real (in-phase or I)
value, and the second is the imaginary (quadrature or Q) value. Together each
pair of values makes up a complex number at a specific time instant.
I#0
Q#0
I#1
Q#1
I#2
Q#2
The complex number 1+1i is represented as 0xFF 0xFF and the complex number
-1-1i is represented as 0x00 0x00. The complex number 0+0i is not easily
representable since half of 0xFF is 127.5.
Complex Number
Representation
1+1i
[]uint8 0xFF, 0xFF
-1+1i
[]uint8 0x00, 0xFF
-1-1i
[]uint8 0x00, 0x00
0+0i
[]uint8 0x80, 0x80 or []uint8 0x7F, 0x7F
And finally, here s some pseudocode to convert an rtl-sdr style IQ sample
to a floating point complex number:
HackRF
IQ samples from the HackRF are encoded as a stream of interleaved signed
8 bit integers (int8 or i8). The first sample is the real (in-phase or I)
value, and the second is the imaginary (quadrature or Q) value. Together each
pair of values makes up a complex number at a specific time instant.
I#0
Q#0
I#1
Q#1
I#2
Q#2
Formats that use signed integers do have one quirk due to
two s complement,
which is that the smallest negative number representable s absolute
value is one more than the largest positive number. int8 values can
range between -128 to 127, which means there s bit of ambiguity in
how +1, 0 and -1 are represented. Either you can create perfectly symmetric
ranges of values between +1 and -1, but 0 is not representable, have more
possible values in the negative range, or allow values above (or just below)
the maximum in the range to be allowed.
Within my implementation, my approach has been to scale based on the max
integer value of the type, so the lowest possible signed value is actually
slightly smaller than -1. Generally, if your code is seeing values that low
the difference in step between -1 and slightly less than -1 isn t very
significant, even with only 8 bits. Just a curiosity to be aware of.
Complex Number
Representation
1+1i
[]int8 127, 127
-1+1i
[]int8 -128, 127
-1-1i
[]int8 -128, -128
0+0i
[]int8 0, 0
And finally, here s some pseudocode to convert a hackrf style IQ sample to a
floating point complex number:
PlutoSDR
IQ samples from the PlutoSDR are encoded as a stream of interleaved signed
16 bit integers (int16 or i16). The first sample is the real (in-phase or I)
value, and the second is the imaginary (quadrature or Q) value. Together each
pair of values makes up a complex number at a specific time instant.
Almost no SDRs capture at a 16 bit depth natively, often you ll see 12 bit
integers (as is the case with the PlutoSDR) being sent around as 16 bit
integers. This leads to the next possible question, which is are values LSB
or MSB aligned? The PlutoSDR sends data LSB aligned (which is to say, the
largest real or imaginary value in the stream will not exceed 4095), but
expects data being transmitted to be MSB aligned (which is to say the lowest
set bit possible is the 5th bit in the number, or values can only be set in
increments of 16).
As a result, the quirk observed with the HackRF (that the range of values
between 0 and -1 is different than the range of values between 0 and +1) does
not impact us so long as we do not use the whole 16 bit range.
Complex Number
Representation
1+1i
[]int16 32767, 32767
-1+1i
[]int16 -32768, 32767
-1-1i
[]int16 -32768, -32768
0+0i
[]int16 0, 0
And finally, here s some pseudocode to convert a PlutoSDR style IQ sample to a
floating point complex number, including moving the sample from LSB to MSB
aligned:
...in = []int16-15072, 496// shift left 4 bits (16 bits - 12 bits = 4 bits)
// to move from LSB aligned to MSB aligned.
in[0] = in[0] <<4in[1] = in[1] <<4real = (float(in[0]))/32767imag = (float(in[1]))/32767out = complex(real, imag)
....
Next Steps
Now that we can read (and write!) IQ data, we can get started first on the
transmitter, which we can (in turn) use to test receiving our own BPSK signal,
coming next in Part 2!
Hello! Welcome. I m so thrilled you re here.
Some of you may know this (as I ve written about in the past), but if you re
new to my RF travels, I ve spent nights and weekends over the last two years
doing some self directed learning on how radios work. I ve gone from a very
basic understanding of wireless communications, all the way through the process
of learning about and implementing a set of libraries to modulate and
demodulate data using my now formidable stash of
SDRs. I ve been
implementing all of the RF processing code from first principals and purely
based on other primitives I ve written myself to prove to myself that I
understand each concept before moving on.
I ve just finished a large personal milestone I was able to successfully
send a cURL HTTP request through a network interface into my stack of
libraries, through my own
BPSK
implementation, framed in my own artisanal hand crafted Layer 2 framing scheme,
demodulated by my code on the other end, and sent into a Linux network
interface. The combination of the Layer 1
PHY and Layer 2 Data
Link is something that I ve
been calling PACKRAT .
$ curl http://44.127.0.8:8000/
* Connected to 44.127.0.8 (44.127.0.8) port 8000 (#0)
> GET / HTTP/1.1
> Host: localhost:1313
> User-Agent: curl/7.79.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
* HTTP/1.0, assume close after body
< HTTP/1.0 200 OK
< Content-Length: 236
<
____ _ ____ _ ______ _ _____
_ \ / \ / ___ / / _ \ / \ _ _
_) / _ \ ' / _) / _ \
__/ ___ \ ___ . \ _ < / ___ \
_ /_/ \_\____ _ \_\_ \_\/_/ \_\_
* Closing connection 0
In an effort to pay it forward to thank my friends for their time walking me
through huge chunks of this, and those who publish their work, I m now spending
some time documenting how I was able to implement this protocol. I would never
have gotten as far as I did without the incredible patience and kindness of
friends spending time working with me, and educators publishing their hard work
for the world to learn from. Please accept my deepest thanks and appreciation.
The PACKRAT posts are written from the perspective of a novice radio engineer,
but experienced software engineer. I ll be leaving
out a lot of the technical details on the software end and specific software
implementation, focusing on the general gist of the implementation in the radio
critical components exclusively. The idea here is this is intended to be a
framework a jumping off point for those who are interested in doing this
themselves. I hope that this series of blog posts will come to be useful
to those who embark on this incredibly rewarding journey after me.
This is the first post in the series, and it will contain links to all the
posts to follow. This is going to be the landing page I link others to as I
publish additional posts, I ll be updating the links on this page. The posts
will also grow a tag, which you can check back on, or follow along with
here.
Tau
Tau ( ) is a much more natural expression of the mathematical constant used
for circles which I use rather than Pi ( ).
You may see me use Tau in code or text Tau is the same as 2 , so if you
see a Tau and don t know what to do, feel free to mentally or textually
replace it with 2 . I just hate always writing 2 everywhere and only
using (or worse yet 2 /2) .when I mean 1/2 of a circle (or, /2).
Psuedo-code
Basicaly none of the code contained in this series is valid on its own. It s
very lightly basically Go, and only meant to express concepts in term of
software. The examples in the post shouldn t be taken on their own as working
snippits to process IQ data, but rather, be used to guide implementations
to process the data in question. I d love to invite all readers to try to
play at home with the examples, and try and work through the example data
captures!
Captures
Speaking of captures, I ve included live on-the-air captures of PACKRAT
packets, as transmitted from my implementation, in different parts of these
posts. This means you can go through the process of building code to parse and
receive PACKRAT packets, and then build a transmitter that is validated by your
receiver. It s my hope folks will follow along at home and experiment with
software to process RF data on their own!