Search Results: "pino"

30 May 2021

Russell Coker: HP ML110 Gen9

I've just bought an HP ML110 Gen9 as a personal workstation. Here are my notes about it and documentation on running Debian on it.

Why a Server?

I bought this because the ML350p Gen8 turned out to be too noisy for my taste [1]. I've just been editing my page about Memtest86+ RAM speeds [2]: over the course of 10 years (high end laptop in 2001 to low end server in 2011) RAM speed increased by a factor of 100. RAM speed has been increasing at a lower rate than CPU speed and is becoming an increasing bottleneck on system performance. So while I could get a faster white-box system, the cost of a second-hand server isn't that great and I'm getting a system that's 100* faster than what was adequate for most tasks in 2001.

HP makes some nice workstation class machines with ECC RAM (think server without remote management, hot-swap disks, or redundant PSU but with sound hardware). But they are significantly more expensive on the second hand market than servers.

This server cost me $650 and came with 2*480G DC grade SSDs (Intel but with HPE stickers). I hope that more than half of the purchase price will be recovered from selling the SSDs (I will use NVMe). Also 64G of non-ECC RAM costs $370 from my local store. As I want lots of RAM for testing software on VMs it will probably turn out that the server cost me less than the cost of new RAM once I've sold the SSDs!

Monitoring
wget -O /usr/local/hpePublicKey2048_key1.pub https://downloads.linux.hpe.com/SDR/hpePublicKey2048_key1.pub
echo "# HP monitoring" >> /etc/apt/sources.list
echo "deb [signed-by=/usr/local/hpePublicKey2048_key1.pub] http://downloads.linux.hpe.com/SDR/downloads/MCP/Debian/ stretch/current-gen9 non-free" >> /etc/apt/sources.list
The above commands will make the management utilities installable on Debian/Buster. If using Bullseye (Testing at the moment) then you need to have Buster repositories in APT for dependencies; HP doesn't seem to have packaged all their utilities for Buster.
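For Bullseye, the Buster dependency source could live in its own list file rather than in the main sources.list. This is only a sketch: the mirror URL, the default file name, and the add_buster_repo helper are assumptions for illustration, not something taken from HP's documentation.

```shell
#!/bin/sh
# Hypothetical helper: append a Buster repository line to an APT list file.
# The default path and the mirror URL are assumptions; adjust to your setup.
add_buster_repo() {
    list_file="${1:-/etc/apt/sources.list.d/buster-deps.list}"
    echo "deb http://deb.debian.org/debian buster main" >> "$list_file"
}

# Usage (as root), followed by a package index refresh:
#   add_buster_repo && apt update
```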
wget -r -np -A Contents-amd64.bz2 http://downloads.linux.hpe.com/SDR/repo/mcp/debian/dists
To find out which repositories had the programs I need I ran the above recursive wget and then uncompressed the files for grep -R (as an aside, it would be nice if bzgrep supported -R).

I installed the hp-health package, which has hpasmcli for viewing and setting many configuration options and hplog for viewing event log data and thermal data (among a few other things).

I've added a new monitor to etbemon, hp-temp.monitor, to monitor HP server temperatures. I haven't made a configuration option to change the thresholds for what is considered normal because I don't expect server class systems to be routinely running above the warning temperature. For the linux-temp.monitor script I added a command-line option for the percentage of the high temperature that is an error condition as well as an option for the number of CPU cores that need to be over-temperature; having one core permanently over the high temperature due to a web browser seems standard for white-box workstations nowadays.

The hp-health package depends on libc6-i686 lib32gcc1 even though none of the programs it contains use lib32gcc1. Depending on lib32gcc1 instead of lib32gcc1 | lib32gcc-s1 means that installing hp-health requires removing mesa-opencl-icd, which probably means that BOINC can't use the GPU, among other things. I solved this by editing /var/lib/dpkg/status and changing the package dependencies to what I desired. Note that this is not something for a novice to do: make a backup and make sure you know what you are doing!

Issues

The HPE Dynamic Smart Array B140i is a software RAID device. While it's convenient for some users that software RAID gets supported in the UEFI boot process, generally software RAID is a bad idea. Also my system has hot-swap drive caddies but the controller doesn't support hot-swap. So the first thing to do was to configure the array controller to run in AHCI mode and give up on using hot-swap drive caddies for hot-swap.
I tested all the documented ways of scanning for new devices and nothing other than a reboot made the kernel recognise a new SATA disk.

According to specs provided by Dell and HP the ML110 Gen9 makes less noise than the PowerEdge T320; according to my own observations the reverse is the case. I don't know if this is because of Dell being more conservative in their specs than HP or because of how dBA is measured vs my own personal annoyance thresholds for sounds. As the system makes more noise than I'm comfortable with I plan to build a rubber enclosure for the rear of the system to reduce noise; that will be the subject of another post. For Australian readers, Bunnings has some good deals on rubber floor mats that can be used to reduce server noise.

The server doesn't have sound hardware. While one could argue that servers don't need sound, there are some server uses for sound hardware such as using line input as a source of entropy. Also for a manufacturer it might be a benefit to use the same motherboard for workstations and servers. Fortunately a friend gave me a nice set of Logitech USB speakers a few years ago that I hadn't previously had a cause to use, so that will solve the problem for me (I don't need line-in on a workstation).

UEFI and Memtest

I decided to try UEFI boot for something new (in the past I'd only used UEFI boot for a server that only had large disks). In the past I've booted all my own systems with BIOS boot because I'm familiar with it and they all have SSDs for booting which are less than 2TB in size (until recently 2TB SSDs weren't affordable for my personal use).

The Debian UEFI wiki page is worth reading [3]. The Debian Wiki page about ProLiant servers [4] is worth reading too.

Memtest86+ doesn't support EFI booting (it just goes to a black screen) even though Debian/Buster puts in a GRUB entry for it (Debian bug #695246 was filed for this in 2012). Also on my ML110 Memtest86+ doesn't report the RAM speed (a known issue on Memtest86+).
Comments on the net say that Memtest86+ hasn't been maintained for a long time and Memtest86 (the non-free version) has been updated more recently. So far I haven't seen a system with ECC RAM have a memory problem that could be detected by Memtest86+; the memory problems I've seen on ECC systems have been things that prevent booting (RAM not being recognised correctly), that are detected by the BIOS as ECC errors before booting, or that are reported by the kernel as ECC errors at run time (happened years ago and I can't remember the details).

Overall I'm not a fan of EFI with the way it currently works in Debian. It seems to add some of the GRUB functionality into the BIOS and then use that to load GRUB. It seems that EFI can do everything you need and it would be better to just have a single boot loader, not two of them chained.

Power Supply

There are a range of PSUs for the ML110; the one I have has the smallest available PSU (350W) and doesn't have a PCIe power cable (the one used for video cards). Here is the HP document which shows the cabling for the various ML110 Gen8 PSUs [5]; I have the 350W PSU.

One thing I've considered is whether I could make an adaptor from the drive bay power to the PCIe connector. A quick web search indicates that 4 SAS disks when active can take up to 75W more power than a system with no disks. If that's the case then the 2 spare drive bay connectors, which can each handle 4 disks, should be able to supply 150W. As a 6 pin PCIe power cable (GPU power cable) is rated at 75W that should be fine in theory (here's a page with the pinouts for PCIe power connectors [6]). My video card is a Radeon R7 260X which apparently takes about 113W all up so should be taking less than 75W from the PCIe power cable.

All I really want is YouTube, Netflix, and text editing at 4K resolution. So I don't need much in terms of 3D power. KDE uses some of the advanced features of modern video cards, but it doesn't compare to 3D gaming.
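The power budget reasoning for the adaptor can be checked with a little arithmetic. This sketch assumes the card takes the full 75W that a PCIe x16 slot can supply before drawing from the cable (how a given card really splits its load is an assumption), and the function names are made up for illustration.

```shell
#!/bin/sh
# All figures in watts, taken from the text except SLOT_W.
SLOT_W=75          # assumed maximum a PCIe x16 slot supplies to a video card
CABLE_RATING_W=75  # rating of a 6 pin PCIe power cable
BAY_W=75           # estimated draw of 4 SAS disks on one drive bay connector

# Power an adaptor would have available from N spare drive bay connectors.
adaptor_supply() {
    echo $(( $1 * BAY_W ))
}

# Power a card must take from the cable if the slot supplies SLOT_W.
cable_draw() {
    echo $(( $1 - SLOT_W ))
}

echo "adaptor supply: $(adaptor_supply 2)W, cable rating: ${CABLE_RATING_W}W"
echo "R7 260X cable draw: $(cable_draw 113)W"
```

Under those assumptions the card needs about 38W from the cable, which is within the 75W cable rating and well within the 150W the two spare connectors should provide.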
According to the Wikipedia page for Radeon RX 500 series [7] the RX560 supports DisplayPort 1.4 and HDMI 2.0 (both of which do 4K@60Hz) and has a TDP of 75W. So an RX560 video card seems like a good option that will work in any system that doesn't have a spare PCIe power cable. I've just ordered one of those for $246 so hopefully that will arrive in a week or so.

PCI Fan

The ML110 Gen9 has an optional PCIe fan and baffle to cool PCIe cards (part number 784580-B21). Extra cooling of PCIe cards is a good thing, but $400 list price (and about $50 ebay price) for the fan and baffle is unpleasant. When I boot the system with a PCIe dual-ethernet card and two PCIe NVMe cards it gives a BIOS warning on boot; when I add a video card it refuses to boot without the extra fan. It's nice that the system makes sure it doesn't get into a thermal overload situation, but it would be nicer if they just shipped all necessary fans with it instead of trying to get more money out of customers. I just bought a PCI fan and baffle kit for $60.

Conclusion

In spite of the unexpected expense of a new video card and PCI fan the overall cost of this system is still low, particularly when considering that I'll find another use for the video card which needs an extra power connector. It is disappointing that HP didn't supply a more capable PSU and fit all the fans to all models; the expectation of a server is that you can just do server stuff, not have to buy extra bits before you can do server stuff. If you want to install Tesla GPUs or something then it's expected that you might need to do something unusual with a server, but the basic stuff should just work. A single processor tower server should be designed to function as a deskside workstation and be able to handle an average video card.

Generally it's a nice computer. I look forward to getting the next deliveries of parts so I can make it work properly.

25 April 2021

Antoine Beaupré: Lost article ideas

I wrote for LWN for about two years. During that time, I wrote (what seems to me an impressive) 34 articles, but I always had a pile of ideas in the back of my mind. Those are ideas, notes, and scribbles lying around. Some were just completely abandoned because they didn't seem a good fit for LWN. Concretely, I stored those in branches in a git repository, and used the branch name (and, naively, the last commit log) as indicators of the topic. This was the state of affairs when I left:
remotes/private/attic/novena                    822ca2bb add letter i sent to novena, never published
remotes/private/attic/secureboot                de09d82b quick review, add note and graph
remotes/private/attic/wireguard                 5c5340d1 wireguard review, tutorial and comparison with alternatives
remotes/private/backlog/dat                     914c5edf Merge branch 'master' into backlog/dat
remotes/private/backlog/packet                  9b2c6d1a ham radio packet innovations and primer
remotes/private/backlog/performance-tweaks      dcf02676 config notes for http2
remotes/private/backlog/serverless              9fce6484 postponed until kubecon europe
remotes/private/fin/cost-of-hosting             00d8e499 cost-of-hosting article online
remotes/private/fin/kubecon                     f4fd7df2 remove published or spun off articles
remotes/private/fin/kubecon-overview            21fae984 publish kubecon overview article
remotes/private/fin/kubecon2018                 1edc5ec8 add series
remotes/private/fin/netconf                     3f4b7ece publish the netconf articles
remotes/private/fin/netdev                      6ee66559 publish articles from netdev 2.2
remotes/private/fin/pgp-offline                 f841deed pgp offline branch ready for publication
remotes/private/fin/primes                      c7e5b912 publish the ROCA paper
remotes/private/fin/runtimes                    4bee1d70 prepare publication of runtimes articles
remotes/private/fin/token-benchmarks            5a363992 regenerate timestamp automatically
remotes/private/ideas/astropy                   95d53152 astropy or python in astronomy
remotes/private/ideas/avaneya                   20a6d149 crowdfunded blade-runner-themed GPLv3 simcity-like simulator
remotes/private/ideas/backups-benchmarks        fe2f1f13 review of backup software through performance and features
remotes/private/ideas/cumin                     7bed3945 review of the cumin automation tool from WM foundation
remotes/private/ideas/future-of-distros         d086ca0d modern packaging problems and complex apps
remotes/private/ideas/on-dying                  a92ad23f another dying thing
remotes/private/ideas/openpgp-discovery         8f2782f0 openpgp discovery mechanisms (WKD, etc), thanks to jonas meurer
remotes/private/ideas/password-bench            451602c0 bruteforce estimates for various password patterns compared with RSA key sizes
remotes/private/ideas/prometheus-openmetrics    2568dbd6 openmetrics standardizing prom metrics enpoints
remotes/private/ideas/telling-time              f3c24a53 another way of telling time
remotes/private/ideas/wallabako                 4f44c5da talk about wallabako, read-it-later + kobo hacking
remotes/private/stalled/bench-bench-bench       8cef0504 benchmarking http benchmarking tools
remotes/private/stalled/debian-survey-democracy 909bdc98 free software surveys and debian democracy, volunteer vs paid work
Wow, what a mess! Let's see if I can make sense of this:

Attic

Those are articles that I thought about, then finally rejected, either because it didn't seem worth it, or my editors rejected it, or I just moved on:
  • novena: the project is ooold now, didn't seem to fit a LWN article. it was basically "how can i build my novena now" and "you guys rock!" it seems like the MNT Reform is the brain child of the Novena now, and I dare say it's even cooler!
  • secureboot: my LWN editors were critical of my approach, and probably rightly so - it's a really complex subject and I was probably out of my depth... it's also out of date now, we did manage secureboot in Debian
  • wireguard: LWN ended up writing extensive coverage, and I was biased against Donenfeld because of conflicts in a previous project

Backlog

Those were articles I was planning to write about next.
  • dat: I already had written Sharing and archiving data sets with Dat, but it seems I had more to say... mostly performance issues, beaker, no streaming, limited adoption... to be investigated, I guess?
  • packet: a primer on data communications over ham radio, and the cool new tech that has emerged in the free software world. those are mainly notes about Pat, Direwolf, APRS and so on... just never got around to making sense of it or really using the tech...
  • performance-tweaks: "optimizing websites at the age of http2", the unwritten story of the optimization of this website with HTTP/2 and friends
  • serverless: god. one of the leftover topics at Kubecon, my notes on this were thin, and the actual subject, possibly even thinner... the only lie worse than the cloud is that there's no server at all! concretely, that's a pile of notes about Kubecon which I wanted to sort through. Probably belongs in the attic now.

Fin

Those are finished articles, they were published on my website and LWN, but the branches were kept because previous drafts had private notes that should not be published.

Ideas

A lot of those branches were actually just an empty commit, with the commitlog being the "pitch", more or less. I'd send that list to my editors, sometimes with a few more links (basically the above), and they would nudge me one way or the other. Sometimes they would actively discourage me from writing about something, and I would do it anyway, send them a draft, and they would patiently make me rewrite it until it was a decent article. This was especially hard with the terminal emulator series, which took forever to write and even got my editors upset when they realized I had never installed Fedora (I ended up installing it, and I was proven wrong!)

Stalled

Oh, and then there's those: those are either "ideas" or "backlog" that got so far behind that I just moved them out of the way because I was tired of seeing them in my list.
  • stalled/bench-bench-bench: benchmarking http benchmarking tools, a horrible mess of links, copy-paste from terminals, and ideas about benchmarking... some of this trickled out into this benchmarking guide at Tor, but not much more than the list of tools
  • stalled/debian-survey-democracy: "free software surveys and Debian democracy, volunteer vs paid work"... A long standing concern of mine is that all Debian work is supposed to be volunteer, and paying explicitly for work inside Debian has traditionally been frowned upon, even leading to serious drama and dissent (remember Dunc-Tank?). Back when I was writing for LWN, I was also doing paid work for Debian LTS. I also learned that a lot of (most?) Debian Developers were actually being paid by their job to work on Debian. So I was confused by this apparent contradiction, especially given how the LTS project has been mostly accepted, while Dunc-Tank was not... See also this talk at Debconf 16. I had hopes that this study would confirm the "hunch" people have offered (that most DDs are paid to work on Debian) but it seems to show the reverse (only 36% of DDs, and 18% of all respondents paid). So I am still confused and worried about the sustainability of Debian.

What do you think?

So that's all I got. As people might have noticed here, I have much less time to write these days, but if there's any subject in there I should pick, what is the one that you would find most interesting?

Oh! and I should mention that you can write to LWN! If you think people should know more about some Linux thing, you can get paid to write for it! Pitch it to the editors, they won't bite. The worst that can happen is that they say "yes" and there goes two years of your life learning to write. Because no, you don't know how to write, no one does. You need an editor to write. That's why this article looks like crap and has a smiley. :)

9 April 2021

Michael Prokop: A Ceph war story

It all started with the big bang! We nearly lost 33 of 36 disks on a Proxmox/Ceph Cluster; this is the story of how we recovered them.

At the end of 2020, we eventually had a long outstanding maintenance window for taking care of system upgrades at a customer. During this maintenance window, which involved reboots of server systems, the involved Ceph cluster unexpectedly went into a critical state. What was planned to be a few hours of checklist work in the early evening turned out to be an emergency case; let's call it a nightmare (not only because it included a big part of the night). Since we have learned a few things from our post mortem and RCA, it's worth sharing those with others. But first things first, let's step back and clarify what we had to deal with.

The system and its upgrade

One part of the upgrade included 3 Debian servers (we're calling them server1, server2 and server3 here), running on Proxmox v5 + Debian/stretch with 12 Ceph OSDs each (65.45TB in total), a so-called Proxmox Hyper-Converged Ceph Cluster.

First, we went for upgrading the Proxmox v5/stretch system to Proxmox v6/buster, before updating Ceph Luminous v12.2.13 to the latest v14.2 release, supported by Proxmox v6/buster. The Proxmox upgrade included updating corosync from v2 to v3. As part of this upgrade, we had to apply some configuration changes, like adjusting ring0 + ring1 address settings and adding a mon_host configuration to the Ceph configuration.

During the first two servers' reboots, we noticed configuration glitches. After fixing those, we went for a reboot of the third server as well. Then we noticed that several Ceph OSDs were unexpectedly down. The NTP service wasn't working as expected after the upgrade. The underlying issue is a race condition of ntp with systemd-timesyncd (see #889290). As a result, we had clock skew problems with Ceph, indicating that the Ceph monitors' clocks weren't running in sync (which is essential for proper Ceph operation).
We initially assumed that our Ceph OSD failure derived from this clock skew problem, so we took care of it. After yet another round of reboots, to ensure the systems are running all with identical and sane configurations and services, we noticed lots of failing OSDs. This time all but three OSDs (19, 21 and 22) were down:
% sudo ceph osd tree
ID CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       65.44138 root default
-2       21.81310     host server1
 0   hdd  1.08989         osd.0    down  1.00000 1.00000
 1   hdd  1.08989         osd.1    down  1.00000 1.00000
 2   hdd  1.63539         osd.2    down  1.00000 1.00000
 3   hdd  1.63539         osd.3    down  1.00000 1.00000
 4   hdd  1.63539         osd.4    down  1.00000 1.00000
 5   hdd  1.63539         osd.5    down  1.00000 1.00000
18   hdd  2.18279         osd.18   down  1.00000 1.00000
20   hdd  2.18179         osd.20   down  1.00000 1.00000
28   hdd  2.18179         osd.28   down  1.00000 1.00000
29   hdd  2.18179         osd.29   down  1.00000 1.00000
30   hdd  2.18179         osd.30   down  1.00000 1.00000
31   hdd  2.18179         osd.31   down  1.00000 1.00000
-4       21.81409     host server2
 6   hdd  1.08989         osd.6    down  1.00000 1.00000
 7   hdd  1.08989         osd.7    down  1.00000 1.00000
 8   hdd  1.63539         osd.8    down  1.00000 1.00000
 9   hdd  1.63539         osd.9    down  1.00000 1.00000
10   hdd  1.63539         osd.10   down  1.00000 1.00000
11   hdd  1.63539         osd.11   down  1.00000 1.00000
19   hdd  2.18179         osd.19     up  1.00000 1.00000
21   hdd  2.18279         osd.21     up  1.00000 1.00000
22   hdd  2.18279         osd.22     up  1.00000 1.00000
32   hdd  2.18179         osd.32   down  1.00000 1.00000
33   hdd  2.18179         osd.33   down  1.00000 1.00000
34   hdd  2.18179         osd.34   down  1.00000 1.00000
-3       21.81419     host server3
12   hdd  1.08989         osd.12   down  1.00000 1.00000
13   hdd  1.08989         osd.13   down  1.00000 1.00000
14   hdd  1.63539         osd.14   down  1.00000 1.00000
15   hdd  1.63539         osd.15   down  1.00000 1.00000
16   hdd  1.63539         osd.16   down  1.00000 1.00000
17   hdd  1.63539         osd.17   down  1.00000 1.00000
23   hdd  2.18190         osd.23   down  1.00000 1.00000
24   hdd  2.18279         osd.24   down  1.00000 1.00000
25   hdd  2.18279         osd.25   down  1.00000 1.00000
35   hdd  2.18179         osd.35   down  1.00000 1.00000
36   hdd  2.18179         osd.36   down  1.00000 1.00000
37   hdd  2.18179         osd.37   down  1.00000 1.00000
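With 36 rows, tallying the STATUS column by hand is error-prone. A minimal sketch for summarising such output follows; count_osd_status is a made-up helper, and it assumes the column layout shown above (including the CLASS column), so that the OSD name is field 4 and the status is field 5.

```shell
#!/bin/sh
# Summarise "ceph osd tree" output read from stdin.
# Rows whose NAME column (field 4) starts with "osd." are OSDs;
# field 5 is then the STATUS column. Header, root and host rows are skipped.
count_osd_status() {
    awk '$4 ~ /^osd\./ { n[$5]++ }
         END { printf "up=%d down=%d\n", n["up"], n["down"] }'
}

# Usage: sudo ceph osd tree | count_osd_status
```

For the tree above this reports 3 OSDs up and 33 down.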
Our blood pressure increased slightly! Did we just lose all of our cluster? What happened, and how can we get all the other OSDs back? We stumbled upon this beauty in our logs:
kernel: [   73.697957] XFS (sdl1): SB stripe unit sanity check failed
kernel: [   73.698002] XFS (sdl1): Metadata corruption detected at xfs_sb_read_verify+0x10e/0x180 [xfs], xfs_sb block 0xffffffffffffffff
kernel: [   73.698799] XFS (sdl1): Unmount and run xfs_repair
kernel: [   73.699199] XFS (sdl1): First 128 bytes of corrupted metadata buffer:
kernel: [   73.699677] 00000000: 58 46 53 42 00 00 10 00 00 00 00 00 00 00 62 00  XFSB..........b.
kernel: [   73.700205] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
kernel: [   73.700836] 00000020: 62 44 2b c0 e6 22 40 d7 84 3d e1 cc 65 88 e9 d8  bD+.."@..=..e...
kernel: [   73.701347] 00000030: 00 00 00 00 00 00 40 08 00 00 00 00 00 00 01 00  ......@.........
kernel: [   73.701770] 00000040: 00 00 00 00 00 00 01 01 00 00 00 00 00 00 01 02  ................
ceph-disk[4240]: mount: /var/lib/ceph/tmp/mnt.jw367Y: mount(2) system call failed: Structure needs cleaning.
ceph-disk[4240]: ceph-disk: Mounting filesystem failed: Command '['/bin/mount', '-t', u'xfs', '-o', 'noatime,inode64', '--', '/dev/disk/by-parttypeuuid/4fbd7e29-9d25-41b8-afd0-062c0ceff05d.cdda39ed-5
ceph/tmp/mnt.jw367Y']' returned non-zero exit status 32
kernel: [   73.702162] 00000050: 00 00 00 01 00 00 18 80 00 00 00 04 00 00 00 00  ................
kernel: [   73.702550] 00000060: 00 00 06 48 bd a5 10 00 08 00 00 02 00 00 00 00  ...H............
kernel: [   73.702975] 00000070: 00 00 00 00 00 00 00 00 0c 0c 0b 01 0d 00 00 19  ................
kernel: [   73.703373] XFS (sdl1): SB validate failed with error -117.
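The hex dump is worth a second look, since it suggests the start of the superblock was intact. Below is a minimal sketch for decoding a few big-endian fields by hand. The field offsets (block size at 0x04, data blocks at 0x08, blocks per allocation group at 0x54, allocation group count at 0x58, log blocks at 0x60) are my reading of the XFS on-disk format and should be treated as assumptions; hex_field is a made-up helper.

```shell
#!/bin/sh
# Convert one big-endian field, given as the hex digits copied from the
# dump with the spaces removed, to decimal.
hex_field() {
    printf '%d\n' "0x$1"
}

hex_field 00001000          # offset 0x04, block size
hex_field 0000000000006200  # offset 0x08, data blocks
hex_field 00001880          # offset 0x54, blocks per allocation group
hex_field 00000004          # offset 0x58, allocation group count
hex_field 00000648          # offset 0x60, log blocks
```

The results (4096, 25088, 6272, 4 and 1608) agree with the bsize, blocks, agsize, agcount and log blocks values that mkfs.xfs reports further down for a partition of the same size. As far as I can tell the stripe unit and width fields sit beyond the first 128 bytes, so the dump itself does not show the values that failed the sanity check.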
The same issue was present for the other failing OSDs. We hoped that the data itself was still there, and only the mounting of the XFS partitions failed. The Ceph cluster was initially installed in 2017 with Ceph jewel/10.2 with the OSDs on filestore (nowadays a legacy approach to storing objects in Ceph). However, we have since migrated the disks to bluestore (with ceph-disk and not yet via ceph-volume, which is what's being used nowadays). Using ceph-disk introduces these 100MB XFS partitions containing basic metadata for the OSD.

Given that we had three working OSDs left, we decided to investigate how to rebuild the failing ones. Some folks on #ceph (thanks T1, ormandj + peetaur!) were kind enough to share what working XFS partitions looked like for them. After creating a backup (via dd), we tried to re-create such an XFS partition on server1. We noticed that even mounting a freshly created XFS partition failed:
synpromika@server1 ~ % sudo mkfs.xfs -f -i size=2048 -m uuid="4568c300-ad83-4288-963e-badcd99bf54f" /dev/sdc1
meta-data=/dev/sdc1              isize=2048   agcount=4, agsize=6272 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=25088, imaxpct=25
         =                       sunit=128    swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=1608, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
synpromika@server1 ~ % sudo mount /dev/sdc1 /mnt/ceph-recovery
SB stripe unit sanity check failed
Metadata corruption detected at 0x433840, xfs_sb block 0x0/0x1000
libxfs_writebufr: write verifer failed on xfs_sb bno 0x0/0x1000
cache_node_purge: refcount was 1, not zero (node=0x1d3c400)
SB stripe unit sanity check failed
Metadata corruption detected at 0x433840, xfs_sb block 0x18800/0x1000
libxfs_writebufr: write verifer failed on xfs_sb bno 0x18800/0x1000
SB stripe unit sanity check failed
Metadata corruption detected at 0x433840, xfs_sb block 0x0/0x1000
libxfs_writebufr: write verifer failed on xfs_sb bno 0x0/0x1000
SB stripe unit sanity check failed
Metadata corruption detected at 0x433840, xfs_sb block 0x24c00/0x1000
libxfs_writebufr: write verifer failed on xfs_sb bno 0x24c00/0x1000
SB stripe unit sanity check failed
Metadata corruption detected at 0x433840, xfs_sb block 0xc400/0x1000
libxfs_writebufr: write verifer failed on xfs_sb bno 0xc400/0x1000
releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!found dirty buffer (bulk) on free list!bad magic number
bad magic number
Metadata corruption detected at 0x433840, xfs_sb block 0x0/0x1000
libxfs_writebufr: write verifer failed on xfs_sb bno 0x0/0x1000
releasing dirty buffer (bulk) to free list!mount: /mnt/ceph-recovery: wrong fs type, bad option, bad superblock on /dev/sdc1, missing codepage or helper program, or other error.
Ouch. This very much looked related to the actual issue we're seeing. So we tried to execute mkfs.xfs with a bunch of different sunit/swidth settings. Using -d sunit=512 -d swidth=512 at least worked then, so we decided to force its usage in the creation of our OSD XFS partition. This brought us a working XFS partition. Please note, sunit must not be larger than swidth (more on that later!).

Then we reconstructed how to restore all the metadata for the OSD (activate.monmap, active, block_uuid, bluefs, ceph_fsid, fsid, keyring, kv_backend, magic, mkfs_done, ready, require_osd_release, systemd, type, whoami). To identify the UUID, we can read the data from ceph --format json osd dump, like this for all our OSDs (Zsh syntax ftw!):
synpromika@server1 ~ % for f in {0..37} ; printf "osd-$f: %s\n" "$(sudo ceph --format json osd dump | jq -r ".osds[] | select(.osd==$f) | .uuid")"
osd-0: 4568c300-ad83-4288-963e-badcd99bf54f
osd-1: e573a17a-ccde-4719-bdf8-eef66903ca4f
osd-2: 0e1b2626-f248-4e7d-9950-f1a46644754e
osd-3: 1ac6a0a2-20ee-4ed8-9f76-d24e900c800c
[...]
Identifying the corresponding raw device for each OSD UUID is possible via:
synpromika@server1 ~ % UUID="4568c300-ad83-4288-963e-badcd99bf54f"
synpromika@server1 ~ % readlink -f /dev/disk/by-partuuid/"${UUID}"
/dev/sdc1
The OSD's key ID can be retrieved via:
synpromika@server1 ~ % OSD_ID=0
synpromika@server1 ~ % sudo ceph auth get osd."${OSD_ID}" -f json 2>/dev/null | jq -r '.[] | .key'
AQCKFpZdm0We[...]
Now we also need to identify the underlying block device:
synpromika@server1 ~ % OSD_ID=0
synpromika@server1 ~ % sudo ceph osd metadata osd."${OSD_ID}" -f json | jq -r '.bluestore_bdev_partition_path'
/dev/sdc2
With all of this, we reconstructed the keyring, fsid, whoami, block + block_uuid files. All the other files inside the XFS metadata partition are identical on each OSD. So after placing and adjusting the corresponding metadata on the XFS partition for Ceph usage, we got a working OSD, hurray! Since we had to fix yet another 32 OSDs, we decided to automate this XFS partitioning and metadata recovery procedure.

We had a network share available on /srv/backup for storing backups of existing partition data. On each server, we tested the procedure with one single OSD before iterating over the list of remaining failing OSDs. We started with a shell script on server1, then adjusted the script for server2 and server3. This is the script, as we executed it on the 3rd server.

Thanks to this, we managed to get the Ceph cluster up and running again. We didn't want to continue with the Ceph upgrade itself during the night though, as we wanted to know exactly what was going on and why the system behaved like that. Time for RCA!

Root Cause Analysis

So all OSDs failed except for three on server2, and the problem seems to be related to XFS. Therefore, our starting point for the RCA was to identify what was different on server2, as compared to server1 + server3. My initial assumption was that this was related to some firmware issues with the involved controller (and as it turned out later, I was right!). The disks were attached as JBOD devices to a ServeRAID M5210 controller (with a stripe size of 512). Firmware state:
synpromika@server1 ~ % sudo storcli64 /c0 show all | grep '^Firmware'
Firmware Package Build = 24.16.0-0092
Firmware Version = 4.660.00-8156
synpromika@server2 ~ % sudo storcli64 /c0 show all | grep '^Firmware'
Firmware Package Build = 24.21.0-0112
Firmware Version = 4.680.00-8489
synpromika@server3 ~ % sudo storcli64 /c0 show all | grep '^Firmware'
Firmware Package Build = 24.16.0-0092
Firmware Version = 4.660.00-8156
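Comparing three outputs by eye works here; with more hosts a one-line-per-host summary makes a mismatch easier to spot. A minimal sketch, where firmware_version is a made-up helper that only parses the output format shown above, and the SSH loop in the comment assumes password-less access to the hosts:

```shell
#!/bin/sh
# Print the value of the "Firmware Version" line from
# "storcli64 /c0 show all" output read on stdin.
firmware_version() {
    sed -n 's/^Firmware Version = //p'
}

# Usage, assuming SSH access to the hosts named in this article:
#   for h in server1 server2 server3; do
#       printf '%s: %s\n' "$h" "$(ssh "$h" sudo storcli64 /c0 show all | firmware_version)"
#   done
```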
This looked very promising, as server2 indeed runs with a different firmware version on the controller. But how so? Well, the motherboard of server2 got replaced by a Lenovo/IBM technician in January 2020, as we had a failing memory slot during a memory upgrade. As part of this procedure, the Lenovo/IBM technician installed the latest firmware versions. According to our documentation, some OSDs were rebuilt (due to the filestore->bluestore migration) in March and April 2020. It turned out that precisely those OSDs were the ones that survived the upgrade. So the surviving drives were created with a different firmware version running on the involved controller. All the other OSDs were created with an older controller firmware. But what difference does this make?

Now let's check firmware changelogs. For the 24.21.0-0097 release we found this:
- Cannot create or mount xfs filesystem using xfsprogs 4.19.x kernel 4.20(SCGCQ02027889)
- xfs_info command run on an XFS file system created on a VD of strip size 1M shows sunit and swidth as 0(SCGCQ02056038)
Our XFS problem certainly was related to the controller's firmware. We also recalled that our monitoring system reported different sunit settings for the OSDs that were rebuilt in March and April. For example, OSD 21 was recreated and got different sunit settings:
WARN  server2.example.org  Mount options of /var/lib/ceph/osd/ceph-21      WARN - Missing: sunit=1024, Exceeding: sunit=512
We compared the new OSD 21 with an existing one (OSD 25 on server3):
synpromika@server2 ~ % systemctl show var-lib-ceph-osd-ceph\\x2d21.mount | grep sunit
Options=rw,noatime,attr2,inode64,sunit=512,swidth=512,noquota
synpromika@server3 ~ % systemctl show var-lib-ceph-osd-ceph\\x2d25.mount | grep sunit
Options=rw,noatime,attr2,inode64,sunit=1024,swidth=512,noquota
Thanks to our documentation, we could compare execution logs of their creation:
% diff -u ceph-disk-osd-25.log ceph-disk-osd-21.log
-synpromika@server2 ~ % sudo ceph-disk -v prepare --bluestore /dev/sdj --osd-id 25
+synpromika@server3 ~ % sudo ceph-disk -v prepare --bluestore /dev/sdi --osd-id 21
[...]
-command_check_call: Running command: /sbin/mkfs -t xfs -f -i size=2048 -- /dev/sdj1
-meta-data=/dev/sdj1              isize=2048   agcount=4, agsize=6272 blks
[...]
+command_check_call: Running command: /sbin/mkfs -t xfs -f -i size=2048 -- /dev/sdi1
+meta-data=/dev/sdi1              isize=2048   agcount=4, agsize=6336 blks
          =                       sectsz=4096  attr=2, projid32bit=1
          =                       crc=1        finobt=1, sparse=0, rmapbt=0, reflink=0
-data     =                       bsize=4096   blocks=25088, imaxpct=25
-         =                       sunit=128    swidth=64 blks
+data     =                       bsize=4096   blocks=25344, imaxpct=25
+         =                       sunit=64     swidth=64 blks
 naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
 log      =internal log           bsize=4096   blocks=1608, version=2
          =                       sectsz=4096  sunit=1 blks, lazy-count=1
 realtime =none                   extsz=4096   blocks=0, rtextents=0
[...]
So back then, we even tried to track this down but couldn't make sense of it yet. But now this sounds very much like it is related to the problem we saw with this Ceph/XFS failure. We follow Occam's razor, assuming the simplest explanation is usually the right one, so let's check the disk properties and see what differs:
synpromika@server1 ~ % sudo blockdev --getsz --getsize64 --getss --getpbsz --getiomin --getioopt /dev/sdk
4685545472
2398999281664
512
4096
524288
262144
synpromika@server2 ~ % sudo blockdev --getsz --getsize64 --getss --getpbsz --getiomin --getioopt /dev/sdk
4685545472
2398999281664
512
4096
262144
262144
See the difference between server1 and server2 for identical disks? The getiomin option now reports something different for them:
synpromika@server1 ~ % sudo blockdev --getiomin /dev/sdk            
524288
synpromika@server1 ~ % cat /sys/block/sdk/queue/minimum_io_size
524288
synpromika@server2 ~ % sudo blockdev --getiomin /dev/sdk 
262144
synpromika@server2 ~ % cat /sys/block/sdk/queue/minimum_io_size
262144
It doesn't make sense that the minimum I/O size (iomin, AKA BLKIOMIN) is bigger than the optimal I/O size (ioopt, AKA BLKIOOPT). This leads us to Bug 202127 ("cannot mount or create xfs on a 597T device"), which matches our findings here. But why did this XFS partition work in the past and fail now with the newer kernel version?

The XFS behaviour change

Now given that we have backups of all the XFS partitions, we wanted to track down a) when this XFS behaviour was introduced, and b) whether, and if so how, it would be possible to reuse the XFS partition without having to rebuild it from scratch (e.g. if you had no working Ceph OSD or backups left). Let's look at such a failing XFS partition with the Grml live system:
root@grml ~ # grml-version
grml64-full 2020.06 Release Codename Ausgehfuahangl [2020-06-24]
root@grml ~ # uname -a
Linux grml 5.6.0-2-amd64 #1 SMP Debian 5.6.14-2 (2020-06-09) x86_64 GNU/Linux
root@grml ~ # grml-hostname grml-2020-06
Setting hostname to grml-2020-06: done
root@grml ~ # exec zsh
root@grml-2020-06 ~ # dpkg -l xfsprogs util-linux
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version      Architecture Description
+++-==============-============-============-=========================================
ii  util-linux     2.35.2-4     amd64        miscellaneous system utilities
ii  xfsprogs       5.6.0-1+b2   amd64        Utilities for managing the XFS filesystem
There it's failing, no matter which mount option we try:
root@grml-2020-06 ~ # mount ./sdd1.dd /mnt
mount: /mnt: mount(2) system call failed: Structure needs cleaning.
root@grml-2020-06 ~ # dmesg | tail -30
[...]
[   64.788640] XFS (loop1): SB stripe unit sanity check failed
[   64.788671] XFS (loop1): Metadata corruption detected at xfs_sb_read_verify+0x102/0x170 [xfs], xfs_sb block 0xffffffffffffffff
[   64.788671] XFS (loop1): Unmount and run xfs_repair
[   64.788672] XFS (loop1): First 128 bytes of corrupted metadata buffer:
[   64.788673] 00000000: 58 46 53 42 00 00 10 00 00 00 00 00 00 00 62 00  XFSB..........b.
[   64.788674] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[   64.788675] 00000020: 32 b6 dc 35 53 b7 44 96 9d 63 30 ab b3 2b 68 36  2..5S.D..c0..+h6
[   64.788675] 00000030: 00 00 00 00 00 00 40 08 00 00 00 00 00 00 01 00  ......@.........
[   64.788675] 00000040: 00 00 00 00 00 00 01 01 00 00 00 00 00 00 01 02  ................
[   64.788676] 00000050: 00 00 00 01 00 00 18 80 00 00 00 04 00 00 00 00  ................
[   64.788677] 00000060: 00 00 06 48 bd a5 10 00 08 00 00 02 00 00 00 00  ...H............
[   64.788677] 00000070: 00 00 00 00 00 00 00 00 0c 0c 0b 01 0d 00 00 19  ................
[   64.788679] XFS (loop1): SB validate failed with error -117.
root@grml-2020-06 ~ # mount -t xfs -o rw,relatime,attr2,inode64,sunit=1024,swidth=512,noquota ./sdd1.dd /mnt/
mount: /mnt: wrong fs type, bad option, bad superblock on /dev/loop1, missing codepage or helper program, or other error.
32 root@grml-2020-06 ~ # dmesg | tail -1
[   66.342976] XFS (loop1): stripe width (512) must be a multiple of the stripe unit (1024)
root@grml-2020-06 ~ # mount -t xfs -o rw,relatime,attr2,inode64,sunit=512,swidth=512,noquota ./sdd1.dd /mnt/
mount: /mnt: mount(2) system call failed: Structure needs cleaning.
32 root@grml-2020-06 ~ # dmesg | tail -14
[   66.342976] XFS (loop1): stripe width (512) must be a multiple of the stripe unit (1024)
[   80.751277] XFS (loop1): SB stripe unit sanity check failed
[   80.751323] XFS (loop1): Metadata corruption detected at xfs_sb_read_verify+0x102/0x170 [xfs], xfs_sb block 0xffffffffffffffff 
[   80.751324] XFS (loop1): Unmount and run xfs_repair
[   80.751325] XFS (loop1): First 128 bytes of corrupted metadata buffer:
[   80.751327] 00000000: 58 46 53 42 00 00 10 00 00 00 00 00 00 00 62 00  XFSB..........b.
[   80.751328] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[   80.751330] 00000020: 32 b6 dc 35 53 b7 44 96 9d 63 30 ab b3 2b 68 36  2..5S.D..c0..+h6
[   80.751331] 00000030: 00 00 00 00 00 00 40 08 00 00 00 00 00 00 01 00  ......@.........
[   80.751331] 00000040: 00 00 00 00 00 00 01 01 00 00 00 00 00 00 01 02  ................
[   80.751332] 00000050: 00 00 00 01 00 00 18 80 00 00 00 04 00 00 00 00  ................
[   80.751333] 00000060: 00 00 06 48 bd a5 10 00 08 00 00 02 00 00 00 00  ...H............
[   80.751334] 00000070: 00 00 00 00 00 00 00 00 0c 0c 0b 01 0d 00 00 19  ................
[   80.751338] XFS (loop1): SB validate failed with error -117.
xfs_repair doesn't help either:
root@grml-2020-06 ~ # xfs_info ./sdd1.dd
meta-data=./sdd1.dd              isize=2048   agcount=4, agsize=6272 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=0, rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=25088, imaxpct=25
         =                       sunit=128    swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=1608, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
root@grml-2020-06 ~ # xfs_repair ./sdd1.dd
Phase 1 - find and verify superblock...
bad primary superblock - bad stripe width in superblock !!!
attempting to find secondary superblock...
..............................................................................................Sorry, could not find valid secondary superblock
Exiting now.
With the "SB stripe unit sanity check failed" message, we could easily track this down to the following commit fa4ca9c:
% git show fa4ca9c5574605d1e48b7e617705230a0640b6da | cat
commit fa4ca9c5574605d1e48b7e617705230a0640b6da
Author: Dave Chinner <dchinner@redhat.com>
Date:   Tue Jun 5 10:06:16 2018 -0700
    
    xfs: catch bad stripe alignment configurations
    
    When stripe alignments are invalid, data alignment algorithms in the
    allocator may not work correctly. Ensure we catch superblocks with
    invalid stripe alignment setups at mount time. These data alignment
    mismatches are now detected at mount time like this:
    
    XFS (loop0): SB stripe unit sanity check failed
    XFS (loop0): Metadata corruption detected at xfs_sb_read_verify+0xab/0x110, xfs_sb block 0xffffffffffffffff
    XFS (loop0): Unmount and run xfs_repair
    XFS (loop0): First 128 bytes of corrupted metadata buffer:
    0000000091c2de02: 58 46 53 42 00 00 10 00 00 00 00 00 00 00 10 00  XFSB............
    0000000023bff869: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00000000cdd8c893: 17 32 37 15 ff ca 46 3d 9a 17 d3 33 04 b5 f1 a2  .27...F=...3....
    000000009fd2844f: 00 00 00 00 00 00 00 04 00 00 00 00 00 00 06 d0  ................
    0000000088e9b0bb: 00 00 00 00 00 00 06 d1 00 00 00 00 00 00 06 d2  ................
    00000000ff233a20: 00 00 00 01 00 00 10 00 00 00 00 01 00 00 00 00  ................
    000000009db0ac8b: 00 00 03 60 e1 34 02 00 08 00 00 02 00 00 00 00  ... .4..........
    00000000f7022460: 00 00 00 00 00 00 00 00 0c 09 0b 01 0c 00 00 19  ................
    XFS (loop0): SB validate failed with error -117.
    
    And the mount fails.
    
    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
    Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
    Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
diff --git fs/xfs/libxfs/xfs_sb.c fs/xfs/libxfs/xfs_sb.c
index b5dca3c8c84d..c06b6fc92966 100644
--- fs/xfs/libxfs/xfs_sb.c
+++ fs/xfs/libxfs/xfs_sb.c
@@ -278,6 +278,22 @@ xfs_mount_validate_sb(
                return -EFSCORRUPTED;
         }
        
+       if (sbp->sb_unit) {
+               if (!xfs_sb_version_hasdalign(sbp) ||
+                   sbp->sb_unit > sbp->sb_width ||
+                   (sbp->sb_width % sbp->sb_unit) != 0) {
+                       xfs_notice(mp, "SB stripe unit sanity check failed");
+                       return -EFSCORRUPTED;
+               }
+       } else if (xfs_sb_version_hasdalign(sbp)) {
+               xfs_notice(mp, "SB stripe alignment sanity check failed");
+               return -EFSCORRUPTED;
+       } else if (sbp->sb_width) {
+               xfs_notice(mp, "SB stripe width sanity check failed");
+               return -EFSCORRUPTED;
+       }
+
+       
        if (xfs_sb_version_hascrc(&mp->m_sb) &&
            sbp->sb_blocksize < XFS_MIN_CRC_BLOCKSIZE) {
                xfs_notice(mp, "v5 SB sanity check failed");
This change is included in kernel versions 4.18-rc1 and newer:
% git describe --contains fa4ca9c5574605d1e48
v4.18-rc1~37^2~14
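To make the effect of that check concrete, here is a minimal shell sketch that mirrors its logic outside the kernel. The function name stripe_check and its arguments are made up for illustration and are not part of any XFS tool. The hasdalign feature bit is passed as an optional third argument and, as an assumption, defaults to "set" exactly when the stripe unit is non-zero.

```shell
#!/bin/sh
# Sketch only: mirrors the stripe sanity check of commit fa4ca9c in shell.
# Usage: stripe_check SB_UNIT SB_WIDTH [HASDALIGN]
# SB_UNIT and SB_WIDTH are in filesystem blocks, as shown by xfs_info.
stripe_check() {
    sc_unit=$1
    sc_width=$2
    # Assumption: without an explicit third argument, treat the dalign
    # feature bit as set exactly when a stripe unit is configured.
    if [ -n "${3:-}" ]; then
        sc_dalign=$3
    elif [ "$sc_unit" -ne 0 ]; then
        sc_dalign=1
    else
        sc_dalign=0
    fi

    if [ "$sc_unit" -ne 0 ]; then
        # Unit set: dalign must be set, the unit must not exceed the
        # width, and the width must be a multiple of the unit.
        if [ "$sc_dalign" -eq 0 ] ||
           [ "$sc_unit" -gt "$sc_width" ] ||
           [ $((sc_width % sc_unit)) -ne 0 ]; then
            echo "SB stripe unit sanity check failed"
            return 1
        fi
    elif [ "$sc_dalign" -ne 0 ]; then
        echo "SB stripe alignment sanity check failed"
        return 1
    elif [ "$sc_width" -ne 0 ]; then
        echo "SB stripe width sanity check failed"
        return 1
    fi
    echo "OK"
}
```

With the values xfs_info reported for the failing image (sunit=128, swidth=64), `stripe_check 128 64` reports the stripe unit failure, because the unit is larger than the width. With sunit=64 and swidth=64 it passes.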
Now let's try with an older kernel version (4.9.0), using the old Grml 2017.05 release:
root@grml ~ # grml-version
grml64-small 2017.05 Release Codename Freedatensuppe [2017-05-31]
root@grml ~ # uname -a
Linux grml 4.9.0-1-grml-amd64 #1 SMP Debian 4.9.29-1+grml.1 (2017-05-24) x86_64 GNU/Linux
root@grml ~ # lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 9.0 (stretch)
Release:        9.0
Codename:       stretch
root@grml ~ # grml-hostname grml-2017-05
Setting hostname to grml-2017-05: done
root@grml ~ # exec zsh
root@grml-2017-05 ~ #
root@grml-2017-05 ~ # xfs_info ./sdd1.dd
xfs_info: ./sdd1.dd is not a mounted XFS filesystem
1 root@grml-2017-05 ~ # xfs_repair ./sdd1.dd
Phase 1 - find and verify superblock...
bad primary superblock - bad stripe width in superblock !!!
attempting to find secondary superblock...
..............................................................................................Sorry, could not find valid secondary superblock
Exiting now.
1 root@grml-2017-05 ~ # mount ./sdd1.dd /mnt
root@grml-2017-05 ~ # mount -t xfs
/root/sdd1.dd on /mnt type xfs (rw,relatime,attr2,inode64,sunit=1024,swidth=512,noquota)
root@grml-2017-05 ~ # ls /mnt
activate.monmap  active  block  block_uuid  bluefs  ceph_fsid  fsid  keyring  kv_backend  magic  mkfs_done  ready  require_osd_release  systemd  type  whoami
root@grml-2017-05 ~ # xfs_info /mnt
meta-data=/dev/loop1             isize=2048   agcount=4, agsize=6272 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1 spinodes=0 rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=25088, imaxpct=25
         =                       sunit=128    swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=1608, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Mounting there indeed works! Now, if we mount the filesystem with new and proper sunit/swidth settings using the older kernel, it should rewrite them on disk:
root@grml-2017-05 ~ # mount -t xfs -o sunit=512,swidth=512 ./sdd1.dd /mnt/
root@grml-2017-05 ~ # umount /mnt/
And indeed, mounting this rewritten filesystem then also works with newer kernels:
root@grml-2020-06 ~ # mount ./sdd1.rewritten /mnt/
root@grml-2020-06 ~ # xfs_info /root/sdd1.rewritten
meta-data=/dev/loop1             isize=2048   agcount=4, agsize=6272 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=0, rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=25088, imaxpct=25
         =                       sunit=64    swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=1608, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
root@grml-2020-06 ~ # mount -t xfs                
/root/sdd1.rewritten on /mnt type xfs (rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,sunit=512,swidth=512,noquota)
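The mount output and the xfs_info output above use different units for sunit/swidth: 512-byte sectors in the mount options, filesystem blocks in xfs_info. A small sketch of the conversion (the helper name sectors_to_fsblocks is hypothetical, chosen just for this example):

```shell
#!/bin/sh
# Sketch only: convert a stripe value given in 512-byte sectors (as used
# by the XFS mount options) into filesystem blocks (as printed by xfs_info).
# Usage: sectors_to_fsblocks SECTORS BSIZE
sectors_to_fsblocks() {
    echo $(( $1 * 512 / $2 ))
}
```

For a block size of 4096, `sectors_to_fsblocks 512 4096` yields 64 and `sectors_to_fsblocks 1024 4096` yields 128, which lines up with the sunit values seen in the xfs_info listings.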
FTR: The sunit=512,swidth=512 from the xfs mount option is identical to xfs_info's output sunit=64,swidth=64 (because mount.xfs's sunit value is given in 512-byte block units, see man 5 xfs, and the xfs_info output reported here is in blocks with a block size (bsize) of 4096, so sunit = 512*512 = 64*4096 = 262144 bytes). mkfs uses minimum and optimal sizes for stripe unit and stripe width; you can check this e.g. via (note that server2 with fixed firmware version reports proper values, whereas server3 with broken controller firmware reports nonsense):
synpromika@server2 ~ % for i in /sys/block/sd*/queue/ ; do printf "%s: %s %s\n" "$i" "$(cat "$i"/minimum_io_size)" "$(cat "$i"/optimal_io_size)" ; done
[...]
/sys/block/sdc/queue/: 262144 262144
/sys/block/sdd/queue/: 262144 262144
/sys/block/sde/queue/: 262144 262144
/sys/block/sdf/queue/: 262144 262144
/sys/block/sdg/queue/: 262144 262144
/sys/block/sdh/queue/: 262144 262144
/sys/block/sdi/queue/: 262144 262144
/sys/block/sdj/queue/: 262144 262144
/sys/block/sdk/queue/: 262144 262144
/sys/block/sdl/queue/: 262144 262144
/sys/block/sdm/queue/: 262144 262144
/sys/block/sdn/queue/: 262144 262144
[...]
synpromika@server3 ~ % for i in /sys/block/sd*/queue/ ; do printf "%s: %s %s\n" "$i" "$(cat "$i"/minimum_io_size)" "$(cat "$i"/optimal_io_size)" ; done
[...]
/sys/block/sdc/queue/: 524288 262144
/sys/block/sdd/queue/: 524288 262144
/sys/block/sde/queue/: 524288 262144
/sys/block/sdf/queue/: 524288 262144
/sys/block/sdg/queue/: 524288 262144
/sys/block/sdh/queue/: 524288 262144
/sys/block/sdi/queue/: 524288 262144
/sys/block/sdj/queue/: 524288 262144
/sys/block/sdk/queue/: 524288 262144
/sys/block/sdl/queue/: 524288 262144
/sys/block/sdm/queue/: 524288 262144
/sys/block/sdn/queue/: 524288 262144
[...]
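The listings above can be reduced to a pass/fail check per device. This is only a sketch: the function names are invented for this example, and treating an optimal_io_size of 0 as "unknown" (many devices simply do not report one) is an assumption, not something taken from the incident.

```shell
#!/bin/sh
# Sketch only: flag devices whose minimum I/O size exceeds the optimal one.
# Usage: check_io_sizes MINIMUM_IO_SIZE OPTIMAL_IO_SIZE
check_io_sizes() {
    if [ "$2" -eq 0 ]; then
        # Assumption: 0 means the device reports no optimal I/O size.
        echo "unknown"
    elif [ "$1" -gt "$2" ]; then
        echo "suspicious"
    else
        echo "ok"
    fi
}

# Walk all sd* devices and classify each one via sysfs.
scan_block_devices() {
    for q in /sys/block/sd*/queue; do
        [ -r "$q/minimum_io_size" ] || continue
        printf '%s: %s\n' "$q" \
            "$(check_io_sizes "$(cat "$q/minimum_io_size")" "$(cat "$q/optimal_io_size")")"
    done
}
```

Calling `scan_block_devices` on a host prints one line per disk; with the values shown for server3 (524288 vs. 262144) every disk would be reported as suspicious.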
This is the underlying reason why the initially created XFS partitions were created with incorrect sunit/swidth settings. The broken firmware of server1 and server3 was the cause of the incorrect settings; they were ignored by old(er) xfs/kernel versions, but treated as an error by new ones. Make sure to also read the XFS FAQ regarding "How to calculate the correct sunit,swidth values for optimal performance". We also stumbled upon two interesting reads in Red Hat's knowledge base: 5075561 + 2150101 (requires an active subscription, though) and #1835947.

Am I affected? How to work around it?

To check whether your XFS mount points are affected by this issue, the following command line should be useful:
awk '$3 == "xfs"{print $2}' /proc/self/mounts | while read mount ; do echo -n "$mount " ; xfs_info $mount | awk '$0 ~ "swidth"{gsub(/.*=/,"",$2); gsub(/.*=/,"",$3); print $2,$3}' | awk '{ if ($1 > $2) print "impacted"; else print "OK"}' ; done
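The same sunit/swidth comparison can also be written as a small filter that reads xfs_info output on stdin, which may be easier to adapt. This is a sketch under assumptions: the function name xfs_stripe_status is hypothetical, and it relies on the data-section line having the "sunit=N swidth=M" layout shown in the listings above.

```shell
#!/bin/sh
# Sketch only: read xfs_info output on stdin and report whether the data
# section's stripe unit is larger than its stripe width.
xfs_stripe_status() {
    # Only the data section line carries both sunit= and swidth=;
    # the log section line (sunit=1 blks) does not match.
    sed -n 's/.*sunit=\([0-9][0-9]*\)[[:space:]][[:space:]]*swidth=\([0-9][0-9]*\).*/\1 \2/p' |
    while read -r xs_unit xs_width; do
        if [ "$xs_unit" -gt "$xs_width" ]; then
            echo "impacted"
        else
            echo "OK"
        fi
    done
}
```

Typical use would be `xfs_info /some/mountpoint | xfs_stripe_status`; if no matching line is found, nothing is printed.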
If you run into the above situation, the only known solution to get your original XFS partition working again is to boot into an older kernel version (4.17 or older), mount the XFS partition with correct sunit/swidth settings, and then boot back into your new system (kernel version wise).

Lessons learned

Thanks: Darshaka Pathirana, Chris Hofstaedtler and Michael Hanscho. Looking for help with your IT infrastructure? Let us know!

4 March 2021

Gunnar Wolf: The power of EIDE

I am quite happy with the Raspberry tower I bought for keeping my Raspberries organized. Clustering them? No, not by a long shot. I just want to quickly know where they all are, and at a glance, be able to know which one I will work with. Bottom drawer has a RPi1B, second one has a RPi2, next comes a 3B+, and the top two ones are RPi4 (4 and 8GB). That allows me to quickly test stuff. Yes, I am tempted to get the top one out of the array and use it in production, but as it stands, that's the layout. My only gripe with this? Serial console access. Connecting and releasing the three tiny cables (no, the red one is not required; it provides +5V power, but it's not enough to power over USB more than the earliest RPis) with my big, fat and numb fingers... always takes a minute or three. Until I thought of the obvious: Why not connect the RPi headers to an old EIDE cable? They are of the same dimensions, and much more practical to connect and yank! With that interface expansion in place, I will be able to easily connect my console cables. Or even more, I can put a serious electronic look on my face, take out my soldering iro... ehem, my very small breadboard for those with limited abilities, and look more interesting! In fact, I am almost sure I can get these two little buggers to blink interestingly when bytes come and go to my RPis! I will finally gain a bit of self-respect as an electronic tinkerer! (Yes, yes, I enjoy playing with RPis, but I treat them as... well... computers. I don't do interfacing to the real world, although I'm sure it can be fun.) What stopped me from doing so? Pin 20 of the EIDE specification. As a service to clumsy computer repairers such as myself, the standard specifies pin 20 is not to carry any signals, and the drive headers are to ship with it cut (Key, pin missing), so that, together with the notch on the outer part of the bracket, inserting the cable upside down is physically impossible. So no, I'm not able to finish the project with pieces at hand.
I even went to two nearby electronic shops yesterday when I took my dog out for a walk, and could not find it there either. So I ended up buying what appears to be a sweet, cheap product covering my needs from our corporate capitalist overlords.

22 December 2020

Norbert Preining: Debian KDE Status for Bullseye

A long journey has come to a nice finish. 9 months ago I switched to KDE/Plasma, and started to package newer versions of it than were available in Debian. Since then I have packaged every single version of Plasma, the KDE frameworks, and KDE Apps and made them available for Debian/unstable and Debian/testing via the OBS build server. Today, finally, I have uploaded Frameworks 5.77 and Plasma 5.20.4 to unstable, the end of a long story. Despite some initial disagreements with the Debian Qt/KDE Team, we found a modus vivendi, and for some months now I have been a member of the team, working together with the rest to get an up-to-date KDE/Plasma system into Debian/bullseye. Thanks to everyone involved! Over the last weeks we have also worked on updating many of the KDE/Apps packages to the latest release 20.12.0, which means that as of now, Debian/unstable contains the most recent versions of KDE Frameworks, KDE Plasma, and of most KDE Apps. Thanks go to all the team members, in particular to (in alphabetic order) Aurélien, Patrick, Pino, Sandro, and Scarlett for their work, and to all the testers and bug reporters. The current status is also more or less what we plan to get into Debian/Bullseye. An update to Frameworks 5.78 and Plasma 5.20.5 is still possible, but not decided yet. Concerning my OBS packages: they are mostly superseded by now, and all but the KDE Apps package can be removed from the apt sources. The only remaining archive of interest is
deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/apps2012/Debian_Unstable/ ./
and the same with Testing instead of Unstable for Debian/testing. For those adventurous, there is also the digikam-beta repository. That's it. Have a nice Christmas, if you celebrate it, and a good start into a hopefully better 2021! Enjoy.

12 May 2020

Evgeni Golov: Building a Shelly 2.5 USB to TTL adapter cable

When you want to flash your Shelly 2.5 with anything but the original firmware for the first time, you'll need to attach it to your computer. Later flashes can happen over the air (at least with ESPHome or Tasmota), but the first one cannot. In theory, this is not a problem as the Shelly has a quite exposed and well documented interface: [Image: Shelly 2.5 pinout] However, on closer inspection you'll notice that your normal jumper wires don't fit as the Shelly has a connector with 1.27mm (0.05in) pitch and 1mm diameter holes. Now, there are various tutorials on the Internet how to build a compatible connector using Ethernet cables and hot glue or with female header socket legs, and you can even buy cables on Amazon for 18€! But 18€ sounded like a lot and the female header socket thing, while working, was pretty finicky to use, so I decided to build something different. We'll need 6 female-to-female jumper wires and a 1.27mm pitch male header. Jumper wires I had at home, the header I got is a SL 1X20G 1,27 from reichelt.de for 0.61€. It's a 20 pin one, so we can make 3 adapters out of it if needed. Oh and we'll need some isolation tape. [Image: SL 1X20G 1,27] The first step is to cut the header into 6 pin chunks. Make sure not to cut too close to the 6th pin as the whole thing is rather fragile and you might lose it. [Image: SL 1X20G 1,27 cut into pieces] It now fits very well into the Shelly with the longer side of the pins. [Image: Shelly 2.5 with pin headers attached] Second step is to strip the plastic part of one side of the jumper wires. Those are designed to fit 2.54mm pitch headers and won't work for our use case otherwise. [Image: jumper wire with removed plastic] As the connectors are still too big, even after removing the plastic, the next step is to take some pliers and gently press the connectors until they fit the smaller pins of our header. [Image: Shelly 2.5 with pin headers and a jumper wire attached] Now is the time to put everything together.
To avoid short circuiting the pins/connectors, apply some isolation tape while assembling, but not too much as the space is really limited. [Image: Shelly 2.5 with pin headers and a jumper wire attached and taped] And we're done, a wonderful (lol) and working (yay) Shelly 2.5 cable that can be attached to any USB-TTL adapter, like the pictured FTDI clone you get almost everywhere. [Image: Shelly 2.5 with full cable and FTDI attached] Yes, in an ideal world we would have soldered the header to the cable, but I didn't feel like soldering on that limited space. And yes, shrink-wrap might be a good thing too, but again, limited space and with isolation tape you only need one layer between two pins, not two.

13 April 2020

Shirish Agarwal: Migrant worker woes and many other stories

I was gonna use this blog post to share about the migrant worker woes, as there have been multiple stories doing the rounds. For example, there is a story which caught the attention of a few people, but most of us, i.e. middle-class people, are so much into our own thing that we don't care a fig about what happens to migrants. This should not be a story coming from a humane society, but it seems India is no different than any other country of the world, and not in a good way. Allow me to share
Or for those who don't like YouTube, here's an alternative link: https://www.invidio.us/watch?v=JGEgZq_1jmc Now the above editorial shares two stories, one of Trump's retaliatory threat to India in the Q&A with the journalist. In fact, Trump has upped the ante on visa sanctions as India buckled so easily under pressure. There have been other stories doing the rounds about how people in India who have illnesses and need HCQ are either dying or are close to death because of the unavailability of HCQ in the medicine shops. There have been reports in Pune as well as South Mumbai (one of the poshest localities in Mumbai/Bombay) that medicine shops are running empty or emptier. There have been so many stories on that, with reporters going to medicine shops and asking the owners, and the shop-owners being clueless. I think the best article which vividly describes the Government of India (GOI) response to the pandemic is the free-to-read article shared by Arundhati Roy in the Financial Times. It has reduced so much of my work of sharing that it's unbelievable. And she has shared it with pictures and all, so I can share other aspects of how the pandemic has been affecting India and bringing the worst out in the Government in its hour of need. In fact, not surprisingly though, apparently a similar pro-Israel thing happened in Africa too. As India has too few friends globally now, it decided to give them a free pass.

Government of India, news agencies and paid News

One of the things the state attempted, although very late IMHO, was to reach out to the opposition, i.e. the Congress party and the others. Mrs. Sonia Gandhi, who is the Congress president, asked that the Government should not run any of its ads on private television channels for a period of two years. There have been plenty of articles, both by medianama and others, alleging that for at least the last six-odd years, Government ads have comprised almost 50-60% of a channel's advertising budget. This was also discussed in medianama's roundtable on online content which happened a few months back. While an edited version is out there on YT, this was a full two-day event which happened across two different cities.
or the alternative to YouTube: https://www.invidio.us/watch?v=c1PhWR1-Urs As if the roundtable discussions were not enough, Mrs. Gandhi's clarion call was answered by the News Broadcasters Association (NBA) and this is what they had to say
News Broadcasters Association reply to Mrs. Gandhi
To put it simply, the NBA deplored the suggestion by Mrs. Gandhi and even called the economy one in recession, and all they had to justify their existence was the Government's own advertising budget. The statements in themselves are highly pregnant and reveal the relationship that the media, print or mainstream news channels, have with the Government of India. Now if you see that, doesn't it make sense that the media always slants the story from the Government's perspective rather than remaining neutral? If my bread basket depended on me siding with the Govt., that is what most sane persons would do; otherwise they would resign and leave, which many reporters who had a conscience did. Interestingly enough, the NBA statement didn't just end there but also used the word "recession"; this is the term that the Government of India (GOI) hates, and it has in turn been sticking to the term "slowdown". While from a layman's perspective the two terms may seem to be similar, if India has indeed been in recession then the tools and the decisions that should have been taken by GOI should have been much different than what they took. Interestingly enough, GOI has refrained from saying anything on the matter, which only reveals their own interests in the matter. Also, if an association head is making the statement, it is more than likely that he consulted a lawyer or two and applied his mind while drafting the response. In other words, or put more simply, this was a very carefully drafted letter, because they know that tomorrow the opposition party may come into power, so they don't want to upset the power dynamics too much.

Privacy issues arising due to the Pandemic

In the same Financial Times, two stories which deal with the possible privacy violations due to the Pandemic have been doing the rounds. The first one, by Yuval Noah Harari, is more exploratory by nature and makes some very good points without going too deep into specific instances of recent times; rather it goes into history and past instances where Governments have used pandemics to exert more control over their populace and drive their agenda. I especially liked the last few lines which he shared in his op-ed: "Even if the current administration eventually changes tack and comes up with a global plan of action, few would follow a leader who never takes responsibility, who never admits mistakes, and who routinely takes all the credit for himself while leaving all the blame to others." (Yuval Noah Harari). The whole statement fits the American President, whom he was talking about, while at the same time it fits right onto the current Indian Prime Minister, Boris Johnson of the UK and perhaps Jair Bolsonaro of Brazil. What all these three or four individuals have in common is that they belong to the right wing and hence cater only to the rich industrialists' agenda. While I don't know much about Jair Bolsonaro, at least three out of four had to turn to socialism and had to give some bailout packages to the public at large, even while continuing to undermine their own actions. More on this probably a bit down the line. The second story was shared by Nic Fildes and Javier Espinoza, who broke the story of various surveillance attempts and the privacy concerns that people have.
Even the Indian PMO has asked for this data, and because there was no protest by civil society, a token protest was done by COAI (Cellular Operators Association of India) but beyond that nothing. I am guessing that because civil society didn't make much noise, as everybody is busy with their own concerns of safety and things going on, it's possible that such data may have gone to the Government. There is not much new here for people who have been working on privacy issues; it's just how easy Governments are finding it to do. The part about "informed consent" is really a misnomer. Governments lie all the time; for example, in the UK, did the leave party and its people take informed consent? No, they pushed their own agenda. This is and will be similar in many countries of the world.

False Socialism by RW parties
In at least the three countries I have been observing (only three, simply due to available time), a lot of false promises are being made by our leaders, and more often than not the bailouts will be given to already rich industrialists. There is an op-ed on this by Vivek Kaul, who initially went by a handle which means somebody who is educated but unemployed. While Vivek has been a one-man army in revealing most of the Government's mischief, especially where fudging numbers is concerned, there have been others too. As far as the US is concerned, an e-zine called free press (literally) has been sharing Trump's hollowness and proclamations for the U.S. Far more interestingly, I found that the New York Times investigated and found a cache of e-mails starting from early January, which they are calling "Red Dawn". The cache is undeniable proof that medical personnel in the U.S. were very much concerned since January 2020, but it was only after other countries started lock-downs that the U.S. had to follow suit. I am sure Indian medical professionals may have had similar mail exchanges, but we will never know as the Indian media isn't independent enough.

Domestic violence and Patriarchy
There have been numerous reports of domestic violence against women going up. In fact, two prominent publications have shared pieces about how domestic violence has gone up in India since the lockdown, but the mainstream press is busy with its own tropes, for the reasons already stated above. Interestingly enough, most women can't wear loose-fitting clothes inside the house because their near ones are there 24x7. This matters because India is going through summer, when heat waves are common, and most families do not have access to A/Cs and rely on either a fan or just ventilation to help them out. I can't write more about this simply because I'm not a woman, so I haven't had to face the pressures that they face every day. Interestingly though, there was a piece shared by arre. Also interestingly, arre, whose content I have shared a few times on my blog, has gone from light and funny to a much darker and more serious tone. Whether this is due to the times we live in is something that a social scientist or a social anthropologist may look into in times to come. One of the good things, though, is that there haven't been any grid failures, as no industrial activity is happening (at all). In fact, the SEBs (State Electricity Boards) have shown a de-growth in electricity uptake because of this. They haven't reduced any prices though (which they ideally should have, as everybody is suffering).

Loot and price rise
Again, I don't think this is only an Indian issue; it may be the same globally. Because of broken supply chains, there are both real and artificial shortages, which are leading to reasonable and unreasonable price hikes in the market. Fresh veggies which were normally between INR 10/- and INR 20/- for 250 gm have reached INR 40/- to 50/- and even above. Many of the things that we have come to depend upon are not there anymore. The shortage of plastic bottles is a case in point.
Aryan Plastic bottle
This and many other pictures like it have been shared on social media, but it seems the Government is busy doing something else. The only thing we know for sure is that the lock-down period is only going to increase, with no word about PPE (Personal Protective Equipment), face masks or anything else. While India has ordered some, those orders are being diverted to the US or EU. In fact, many doctors who have asked for the same have been arrested, sacked or suspended for asking such inconvenient questions, whether in BJP-ruled states or otherwise. In fact, the Centre has suspended MPLADS funds, the funds which members of parliament get and can use for relief work or whatever they think the money is best spent upon.

Conditions of Labor in the Pandemic
Another rather depressing story has been how the Supreme Court CJI, Justice SA Bobde, has made statements and refrained from playing any role in directing the Centre to provide relief to daily wage laborers. In fact, Mr. Bobde made statements such as asking why they need salaries if they are getting food. This was shared by barandbench, a site curated by lawyers and reporters alike. Both livelaw and barandbench have worked to enhance people's awareness of the legal happenings in our High Courts and Supreme Court. And while, sadly, they cannot cover everything, they at least attempt to cover a bit of what's hot atm. The Chief Justice, who draws a salary of INR 250,000 per month besides other perks, is perhaps unaware of or doesn't care about the fate of millions of casual workers: 400-460 million workers who will face abject poverty and, by extension, even if there are 4 members in each family, probably 1.2 billion people who will fall below the poverty line. Three or four major sectors are going to be severely impacted, namely Agriculture, Construction and then MSME (Micro, Small and Medium Enterprises), which covers everything from autos, industrial components, FMCG and electronics; you name it, it's done by the MSME sector. We know that the Rabi crop, even though it was going to be a bumper crop this year, will rot away in the fields. Even the Kharif crop, whose window for sowing is at most 2-3 weeks, will not get sown in time. In fact, with the extended lockdown of another 21 days, people will probably return home after 2 months, by which time they will have nothing to do there, just as here in the cities. Another good report was done by the wire; the mainstream media has already left the station.

Ministry of Public Health
There was an article penned by Dr. Edmond Fernandes, which he published last year. The low salaries, along with the complexities that Indian doctors are facing and may face in the near future, are just mind-boggling.

The Loss
Losses have already started pouring in. Just today Air Deccan ceased all its operations. I had loved Mr. Gopinath's airline, which was started in the early 2000s. While I won't bore you with the history, most of it can be seen from simplify Deccan . This, I believe, is just the start, and it is only a few months after the lock-down has been lifted that we would really know the true extent of losses everywhere. And the lengthier the lockdown, the more difficult it will be for businesses to ramp back up. People have already identified at the very least 15-20 sectors of the economy which will be hit, and a similar or greater number of sectors which will have first- and second-order losses and ramp-downs. While some guesses are being made, many are wildly optimistic and many are wildly pessimistic; as shared, we will only know the results when the lockdown is lifted.

Predictions for the future
While things are very much in the air, some predictions can be made or rationally deduced. For instance, investments made in automation and IT would remain and perhaps even accelerate a little. Logistics models would need to be re-worked and maybe, just maybe, there would be talk and action on making local supply chains a bit more robust. Financing is going to be a huge issue for at least 6 months to a year. Infrastructure projects which require huge amounts of cash upfront will either have to be re-worked or delayed; how this will affect projects like Pune Metro and other such projects, only time will tell.

Raghuram Rajan
Raghuram Rajan was recently asked if he would come back and let bygones be bygones. Raghuram, in his own roundabout way, said no. He is right now with Chicago Booth doing the work that he always loved. Why would he leave that and be right in the middle of the messes other people have made? He probably gets more money and more freedom, and probably has a class full of potential future economists.

Immigration Control, Conferences and thought experiment
There are so many clueless people out there who don't know why it takes so long for any visa to be processed. From what little I know, it is to verify that you are who you say you are and that you have a valid reason to enter the country. The people from the home ministry verify credentials, and probably also check against lists of known criminals and their networks world-wide. They probably have programs for such scenarios, which are part and parcel of their everyday work. The same applies to immigration control at airports. There has been a huge gap between the number of immigration counters and the number of passengers flying internationally to and from India. While in India we call it the Ministry of Home Affairs, in the U.S. it's the Department of Homeland Security, with other countries using similar jargon. Even before this pandemic happened, the number of people who are supposed to do border control and check people was way too low, and there have been scenes of air rage, especially in Indian airports, after people arrived on long-distance flights. Now here are a couple of thought experiments: just the day before yesterday scientists discovered six new coronaviruses in bats, and scientists in Iceland found 40-odd mutations of the virus in people. Are countries going to ban people from Iceland, given that in time the Icelandic people will probably have antibodies to all forty-odd mutations? If and when they come in contact with others who do not, what would happen?
And this is not specifically about one place or ethnicity or whatever; microbes and viruses have been on earth longer than we have. In our greed we have made microbes resistant to antibiotics. While Mr. Trump talks as if he discovered it today, this has been known to the medical fraternity since the 1950s. CDC's own chart shows it. We cannot live in fear of a virus; the only way we can beat it is by understanding it and using science. Jon Cohen shared some of the incredible ways science is looking to beat this thing; as an alternative to YouTube, it can be watched at https://www.invidio.us/watch?v=MPVG_n3w_vM
One of the most troubling questions is how the differently-abled communities, which don't have media coverage at the best of times, haven't had any media coverage at all during the pandemic. What are their stories and what are they experiencing? How are they coping? Are there any ways we could help each other? By not having those stories, we have perhaps left them more vulnerable than we intend. And what does that say about us, as people, as a community or as a society?

Silver Linings
While there is not a lot to be positive about, one interesting project I came across is openbreath.tech. This is a venture started by IISER (Indian Institute of Science Education and Research) and IUCAA (Inter-University Centre for Astronomy and Astrophysics). They are collaborating with octogenarian Capt (Retd) Rustom Barucha from Barucha Instrumentation and Control, besides IndoGenius, New Delhi, and King's College, London. The first two institutes are from my home town, Pune. I don't know much about the specifics of this idea, other than that there is an existing Barucha ventilator which they hope to open-source, making it easier for people to produce their own. While I have more questions than answers at this point, this is hopefully something to watch out for in the coming days and weeks. The other jolly bit of good news has come from Punjab where, after several decades, people in Northern Punjab are finally able to see the Himalayan mountain range.
Dhauladhar range Northern Punjab Copyright CNN.Com
There you have it. What I have covered barely scratches the surface. As a large section of the media only focuses on one narrative, other stories and narratives are lost. Be safe, till later.

29 March 2020

Enrico Zini: Politics links

How tech's richest plan to save themselves after the apocalypse
politics privilege archive.org
Silicon Valley's elite are hatching plans to escape disaster, and when it comes, they'll leave the rest of us behind
Heteronomy refers to action that is influenced by a force outside the individual, in other words the state or condition of being ruled, governed, or under the sway of another, as in a military occupation.
Early Warning Signs Of Fascism (poster P590CW, $9.00). Laurence W. Britt wrote about the common signs of fascism in April, 2003, after researching seven fascist regimes: Hitler's Nazi Germany; Mussolini's Italy; Franco's Spain; Salazar's Portugal; Papadopoulos' Greece; Pinochet's Chile; Suharto's Indonesia. Text: Early Warning Signs of Fascism: Powerful and Continuing Nationalism; Disdain For Human Rights; Identification of Enemies As a unifying cause; Supremacy of the military; Rampant Sexism; Controlled Mass Media; Obsession With National Security
Political and social scientist Stefania Milan writes about social movements, mobilization and organized collective action. On the one hand, interactions and networks achieve more visibility and become a proxy for a collective "we". On the other hand: Law enforcement can exercise preemptive monitorin
How new technologies and techniques pioneered by dictators will shape the 2020 election
A regional election offers lessons on combatting the rise of the far right, both across the Continent and in the United States.
The Italian diaspora is the large-scale emigration of Italians from Italy. There are two major Italian diasporas in Italian history. The first diaspora began more or less around 1880, a decade or so after the Unification of Italy (with most leaving after 1880), and ended in the 1920s to early-1940s with the rise of Fascism in Italy. The second diaspora started after the end of World War II and roughly concluded in the 1970s. These together constituted the largest voluntary emigration period in documented history. Between 1880-1980, about 15,000,000 Italians left the country permanently. By 1980, it was estimated that about 25,000,000 Italians were residing outside Italy. A third wave is being reported in present times, due to the socio-economic problems caused by the financial crisis of the early twenty-first century, especially amongst the youth. According to the Public Register of Italian Residents Abroad (AIRE), figures of Italians abroad rose from 3,106,251 in 2006 to 4,636,647 in 2015, growing by 49.3% in just ten years.

19 October 2016

Reproducible builds folks: Reproducible Builds: week 77 in Stretch cycle

What happened in the Reproducible Builds effort between Sunday October 9 and Saturday October 15 2016:

Media coverage

Documentation update
After discussions with HW42, Steven Chamberlain, Vagrant Cascadian, Daniel Shahaf, Christopher Berg, Daniel Kahn Gillmor and others, Ximin Luo has started writing up more concrete and detailed design plans for setting SOURCE_ROOT_DIR for reproducible debugging symbols, buildinfo security semantics and buildinfo security infrastructure.

Toolchain development and fixes
Dmitry Shachnev noted that our patch for #831779 has been temporarily rejected by docutils upstream; we are trying to persuade them again. Tony Mancill uploaded javatools/0.59 to unstable containing original patch by Chris Lamb. This fixed an issue where documentation Recommends: substvars would not be reproducible. Ximin Luo filed bug 77985 to GCC as a pre-requisite for future patches to make debugging symbols reproducible.

Packages reviewed and fixed, and bugs filed
The following updated packages have become reproducible - in our current test setup - after being fixed:
The following updated packages appear to be reproducible now, for reasons we were not able to figure out. (Relevant changelogs did not mention reproducible builds.)
Some uploads have addressed some reproducibility issues, but not all of them:
Some uploads have addressed nearly all reproducibility issues, except for build path issues:
Patches submitted that have not made their way to the archive yet:

Reviews of unreproducible packages
101 package reviews have been added, 49 have been updated and 4 have been removed in this week, adding to our knowledge about identified issues. 3 issue types have been updated:

Weekly QA work
During reproducibility testing, some FTBFS bugs have been detected and reported by:

tests.reproducible-builds.org
Debian:
Openwrt/LEDE/NetBSD/coreboot/Fedora/archlinux:

Misc.
We are running a poll to find a good time for an IRC meeting.
This week's edition was written by Ximin Luo, Holger Levsen & Chris Lamb and reviewed by a bunch of Reproducible Builds folks on IRC.

25 September 2016

Steinar H. Gunderson: Nageru @ Fyrrom

When Samfundet wanted to make their own Boiler Room spinoff (called Fyrrom, more or less a direct translation), it was a great opportunity to try out the new multitrack code in Nageru. After all, what can go wrong with a pretty much untested and unfinished git branch, right? So we cobbled together a bunch of random equipment from here and there:
Video equipment
Hooked it up to Nageru:
Nageru screenshot
and together with some great work from the people actually pulling together the event, this was the result. Lots of fun. And yes, some bugs were discovered, of course; field testing without followup patches is meaningless (that would either mean you're not actually taking your test experience into account, or that your testing gave no actionable feedback and thus was useless), so they will be fixed in due time for the 1.4.0 release. Edit: Fixed a screenshot link.

7 September 2016

Reproducible builds folks: Reproducible Builds: week 71 in Stretch cycle

What happened in the Reproducible Builds effort between Sunday August 28 and Saturday September 3 2016:

Media coverage
Antonio Terceiro blogged about testing build reproducibility with debrepro.

GSoC and Outreachy updates
The next round is being planned now: see their page with a timeline and participating organizations listing. Maybe you want to participate this time? Then please reach out to us as soon as possible!

Packages reviewed and fixed, and bugs filed
The following packages have addressed reproducibility issues in other packages:
The following updated packages have become reproducible in our current test setup after being fixed:
The following updated packages appear to be reproducible now, for reasons we were not able to figure out yet. (Relevant changelogs did not mention reproducible builds.)
The following 4 packages were not changed, but have become reproducible due to changes in their build-dependencies:
Some uploads have addressed some reproducibility issues, but not all of them:
Patches submitted that have not made their way to the archive yet:

Reviews of unreproducible packages
706 package reviews have been added, 22 have been updated and 16 have been removed in this week, adding to our knowledge about identified issues. 5 issue types have been added: 1 issue type has been updated:

Weekly QA work
FTBFS bugs have been reported by:

diffoscope development
diffoscope development on the next version (60) continued in git, taking in contributions from:

strip-nondeterminism development
Mattia Rizzolo uploaded strip-nondeterminism 0.023-2~bpo8+1 to jessie-backports. A new version of strip-nondeterminism 0.024-1 was uploaded to unstable by Chris Lamb. It included contributions from:
Holger added jobs on jenkins.debian.net to run testsuites on every commit. There is one job for the master branch and one for the other branches.

disorderfs development
Holger added jobs on jenkins.debian.net to run testsuites on every commit. There is one job for the master branch and one for the other branches.

tests.reproducible-builds.org
Debian: We now vary the GECOS records of the two build users. Thanks to Paul Wise for providing the patch.

Misc.
This week's edition was written by Ximin Luo, Holger Levsen & Chris Lamb and reviewed by a bunch of Reproducible Builds folks on IRC.

20 June 2016

Simon McVittie: GTK Hackfest 2016

I'm back from the GTK hackfest in Toronto, Canada and mostly recovered from jetlag, so it's time to write up my notes on what we discussed there. Despite the hackfest's title, I was mainly there to talk about non-GUI parts of the stack, and technologies that fit more closely in what could be seen as the freedesktop.org platform than they do in GNOME. In particular, I'm interested in Flatpak as a way to deploy self-contained "apps" in a freedesktop-based, sandboxed runtime environment layered over the Universal Operating System and its many derivatives, with both binary and source compatibility with other GNU/Linux distributions. I'm mainly only writing about discussions I was directly involved in: lots of what sounded like good discussion about the actual graphics toolkit went over my head completely :-) More notes, mostly from Matthias Clasen, are available on the GNOME wiki. In no particular order:

Thinking with portals
We spent some time discussing Flatpak's portals, mostly on Tuesday. These are the components that expose a subset of desktop functionality as D-Bus services that can be used by contained applications: they are part of the security boundary between a contained app and the rest of the desktop session. Android's intents are a similar concept seen elsewhere. While the portals are primarily designed for Flatpak, there's no real reason why they couldn't be used by other app-containment solutions such as Canonical's Snap.
One major topic of discussion was their overall design and layout. Most portals will consist of a UX-independent part in Flatpak itself, together with a UX-specific implementation of any user interaction the portal needs. For example, the portal for file selection has a D-Bus service in Flatpak, which interacts with some UX-specific service that will pop up a standard UX-specific "Open" dialog; for GNOME and probably other GTK environments, that dialog is in (a branch of) GTK. A design principle that was reiterated in this discussion is that the UX-independent part should do as much as possible, with the UX-specific part only carrying out the user interactions that need to comply with a particular UX design (in the GTK case, GNOME's design). This minimizes the amount of work that needs to be redone for other desktop or embedded environments, while still ensuring that the other environments can have their chosen UX design. In particular, it's important that, as much as possible, the security- and performance-sensitive work (such as data transport and authentication) is shared between all environments.
The aim is for portals to get the user's permission to carry out actions, while keeping it as implicit as possible, avoiding an "are you sure?" step where feasible. For example, if an application asks to open a file, the user's permission is implicitly given by them selecting the file in the file-chooser dialog and pressing OK: if they do not want this application to open a file at all, they can deny permission by cancelling. Similarly, if an application asks to stream webcam data, the UX we expect is for GNOME's Cheese app (or a similar non-GNOME app) to appear, open the webcam to provide a preview window so they can see what they are about to send, but not actually start sending the stream to the requesting app until the user has pressed a "Start" button. When defining the API "contracts" to be provided by applications in that situation, we will need to be clear about whether the provider is expected to obtain confirmation like this: in most cases I would anticipate that it is.
One security trade-off here is that we have to have a small amount of trust in the providing app. For example, continuing the example of Cheese as a webcam provider, Cheese could (and perhaps should) be a contained app itself, whether via something like Flatpak, an LSM like AppArmor or both. If Cheese is compromised somehow, then whenever it is running, it would be technically possible for it to open the webcam, stream video and send it to a hostile third-party application. We concluded that this is an acceptable trade-off: each application needs to be trusted with the privileges that it needs to do its job, and we should not put up barriers that are easy to circumvent or otherwise serve no purpose.
The main (only?) portal so far is the file chooser, in which the contained application asks the wider system to show an "Open..." dialog, and if the user selects a file, it is returned to the contained application through a FUSE filesystem, the document portal. The reference implementation of the UX for this is in GTK, and is basically a GtkFileChooserDialog. The intention is that other environments such as KDE will substitute their own equivalent. Other planned portals include:

Environment variables
GNOME on Wayland currently has a problem with environment variables: there are some traditional ways to set environment variables for X11 sessions or login shells using shell script fragments (/etc/X11/Xsession.d, /etc/X11/xinit/xinitrc.d, /etc/profile.d), but these do not apply to Wayland, or to noninteractive login environments like cron and systemd --user. We are also keen to avoid requiring a Turing-complete shell language during session startup, because it's difficult to reason about and potentially rather inefficient. Some uses of environment variables can be dismissed as unnecessary or even unwanted, similar to the statement in Debian Policy 9.9: "A program must not depend on environment variables to get reasonable defaults."
However, there are two common situations where environment variables can be necessary for proper OS integration: search-paths like $PATH, $XDG_DATA_DIRS and $PYTHONPATH (particularly necessary for things like Flatpak), and optionally-loaded modules like $GTK_MODULES and $QT_ACCESSIBILITY where a package influences the configuration of another package. There is a stopgap solution in GNOME's gdm display manager, /usr/share/gdm/env.d, but this is gdm-specific and insufficiently expressive to provide the functionality needed by Flatpak: "set XDG_DATA_DIRS to its specified default value if unset, then add a couple of extra paths". pam_env comes closer (PAM is run at every transition from "no user logged in" to "user can execute arbitrary code as themselves") but it doesn't support .d fragments, which are required if we want distribution packages to be able to extend search paths. pam_env also turns off per-user configuration by default, citing security concerns. I'll write more about this when I have a concrete proposal for how to solve it. I think the best solution is probably a PAM module similar to pam_env but supporting .d directories, either by modifying pam_env directly or out-of-tree, combined with clarifying what the security concerns for per-user configuration are and how they can be avoided.

Relocatable binary packages
On Windows and OS X, various GLib APIs automatically discover where the application binary is located and use search paths relative to that; for example, if C:\myprefix\bin\app.exe is running, GLib might put C:\myprefix\share into the result of g_get_system_data_dirs(), so that the application can ask to load app/data.xml from the data directories and get C:\myprefix\share\app\data.xml.
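As a rough illustration of the two ideas above (a search path that falls back to its specified default when unset, and a data directory derived from where the binary lives), here is a minimal Python sketch using POSIX-style paths. It is not how GLib, gdm or pam_env implement anything: the function name data_dirs_for and the assumption that binaries live in <prefix>/bin are made up for this example. The only outside fact used is the XDG Base Directory default for XDG_DATA_DIRS.

```python
from pathlib import PurePosixPath

# Default from the XDG Base Directory specification, used when
# XDG_DATA_DIRS is unset or empty.
XDG_DATA_DIRS_DEFAULT = "/usr/local/share/:/usr/share/"


def data_dirs_for(executable, environ):
    """Hypothetical helper: return the data directories an app could search.

    Assumes the executable is installed as <prefix>/bin/<name>, so that
    <prefix>/share is its own data directory. That directory is put first,
    followed by XDG_DATA_DIRS (or its default if unset or empty).
    """
    prefix = PurePosixPath(executable).parent.parent
    own_dir = str(prefix / "share")

    raw = environ.get("XDG_DATA_DIRS") or XDG_DATA_DIRS_DEFAULT
    # Drop empty entries and trailing slashes so duplicates compare equal.
    system_dirs = [d.rstrip("/") or "/" for d in raw.split(":") if d]

    result = [own_dir]
    for d in system_dirs:
        if d not in result:
            result.append(d)
    return result


if __name__ == "__main__":
    for d in data_dirs_for("/opt/whatever/bin/foo", {}):
        print(d)
```

An application would then look for app/data.xml under each returned directory in order, which mirrors the C:\myprefix\share\app\data.xml lookup described above.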
We would like to be able to do the same on Linux, for example so that the apps in a Flatpak or Snap package can be constructed from RPM or dpkg packages without needing to be recompiled for a different --prefix, and so that other third-party software packages like the games on Steam and gog.com can easily locate their own resources.
Relatedly, there are currently no well-defined semantics for what happens when a .desktop file or a D-Bus .service file has Exec=./bin/foo. The meaning of Exec=foo is well-defined (it searches $PATH) and the meaning of Exec=/opt/whatever/bin/foo is obvious. When this came up in D-Bus previously, my assertion was that the meaning should be the same as in .desktop files, whatever that is. We agreed to propose that the meaning of a non-absolute path in a .desktop or .service file should be interpreted relative to the directory where the .desktop or .service file was found: for example, if /opt/whatever/share/applications/foo.desktop says Exec=../../bin/foo, then /opt/whatever/bin/foo would be the right thing to execute. While preparing a mail to the freedesktop and D-Bus mailing lists proposing this, I found that I had proposed the same thing almost 2 years ago... this time I hope I can actually make it happen!

Flatpak and OSTree bug fixing
On the way to the hackfest, and while the discussion moved to topics that I didn't have useful input on, I spent some time fixing up the Debian packaging for Flatpak and its dependencies.
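To make the proposed Exec= rule discussed earlier concrete, here is a small Python sketch of how the three cases could be distinguished. It only illustrates the proposal as described in these notes; it is not code from D-Bus or any desktop environment, the name resolve_exec is invented, and taking the first whitespace-separated word as the program is a simplification of the real Exec quoting rules.

```python
import posixpath


def resolve_exec(entry_file, exec_value):
    """Hypothetical helper: resolve the program named by an Exec= value.

    entry_file is the path of the .desktop or .service file that was read.
    - absolute path: used as-is
    - bare name (no slash): returned unchanged, to be searched in $PATH
    - other relative path: interpreted relative to the directory that
      contains entry_file (the proposal described in the text)
    """
    # Simplification: ignore quoting and treat the first word as the program.
    program = exec_value.split(None, 1)[0]

    if posixpath.isabs(program):
        return program
    if "/" not in program:
        return program  # caller would look this up in $PATH
    base = posixpath.dirname(entry_file)
    # normpath collapses ".." textually; it does not follow symlinks.
    return posixpath.normpath(posixpath.join(base, program))


if __name__ == "__main__":
    print(resolve_exec("/opt/whatever/share/applications/foo.desktop",
                       "../../bin/foo"))
```

With the example from the text, Exec=../../bin/foo in /opt/whatever/share/applications/foo.desktop resolves to /opt/whatever/bin/foo. A real implementation would also have to decide how symlinked directories are treated, which this sketch leaves open.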
In particular, I did my first upload as a co-maintainer of bubblewrap, uploaded ostree to unstable (with the known limitation that the grub, dracut and systemd integration is missing for now since I haven't been able to test it yet), got most of the way through packaging Flatpak 0.6.5 (which I'll upload soon), cherry-picked the right patches to make ostree compile on Debian 8 in an effort to make backports trivial, and spent some time disentangling a flatpak test failure which was breaking the Debian package's installed-tests. I'm still looking into ostree test failures on little-endian MIPS, which I was able to reproduce on a Debian porterbox just before the end of the hackfest.

OSTree + Debian
I also had some useful conversations with developers from Endless, who recently opened up a version of their OSTree build scripts for public access. Hopefully that information brings me a bit closer to being able to publish a walkthrough for how to deploy a simple Debian derivative using OSTree (help with that is very welcome of course!).

GTK life-cycle and versioning
The life-cycle of GTK releases has already been mentioned here and elsewhere, and there are some interesting responses in the comments on my earlier blog post. It's important to note that what we discussed at the hackfest is only a proposal: a hackfest discussion between a subset of the GTK maintainers and a small number of other GTK users (I am in the latter category) doesn't, and shouldn't, set policy for all of GTK or for all of GNOME. I believe the intention is that the GTK maintainers will discuss the proposals further at GUADEC, and make a decision after that. As I said before, I hope that being more realistic about API and ABI guarantees can avoid GTK going too far towards either of the possible extremes: either becoming unable to advance because it's too constrained by compatibility, or breaking applications because it isn't constrained enough. The current situation, where it is meant to be compatible within the GTK 3 branch but in practice applications still sometimes break, doesn't seem ideal for anyone, and I hope we can do better in future.

Acknowledgements
Thanks to everyone involved, particularly:

31 January 2016

Daniel Silverstone: The Beer'o'Meter project

As some of you may know, I have been working on a small hardware project called the Beer'o'Meter whose purpose is to allow us to extend Ye Olde Vic's beer board to indicate the approximate fullness of each cask. For some time now, we've been operating an electronic beer board at the Vic which you may see tweeted out from time to time. The pumpotron has become very popular with the visitors to the pub, especially as it can be viewed online in a basic textual form. Of course, as many of you who visit pubs know only too well, that a beer is "on" is no indication of whether or not you need to get there sharpish to have a pint, or if you can take your time and have a curry first. As a result, some of us have noticed a particular beer on, come to the pub after dinner, and then been very sad that if only we'd come 30 minutes previously, we'd have had a chance at the very beer we were excited about. Combine this kind of sadness with a two week break at Christmas, and I started to develop a Beer'o'Meter to extend the pumpotron with an indication of how much of a given beer had already been served. Recently my boards came back from Elecrow along with various bits and bobs, and I have spent some time today building one up for test purposes. As always, it's important to start with some prep work to collect all the necessary components. I like to use cake cases as you may have noticed on the posting yesterday about the oscilloscope I built.
Component prep for the Beer'o'Meter
Naturally, after prep comes the various stages of assembly. You start with the lowest-height components, so here's the board after I fitted the ceramic capacitors:
Step 1, ceramic capacitors
And here's after I fitted the lying-down electrolytic decoupling capacitor for the 3.3 volt line:
Step 2, capacitors which lie down
Next I should have fitted the six transistors from the middle cake case, but I discovered that I'd used the wrong pinout for them. Even after weeks of verification by myself and others, I'd made a mistake. My good friend Vincent Sanders recently posted about how creativity is allowing yourself to make mistakes, and here I had made a doozy I hadn't spotted until I tried to assemble the board. Fortunately TO-92 transistors have nice long legs and I have a pair of tweezers and some electrical tape. As such I soon had six transistors doing the river dance:
Transistors doing the river-dance
With that done, I noticed that the transistors now stood taller than the pins (previously I had been intending to fit the transistors before the pins) so I had to shuffle things around and fit all my 0.1" pins and sockets next:
Step 3, pins and sockets
Then I could fit my dancing transistors:
Step 4, transistors
We're almost finished now, just one more capacitor to provide some input decoupling on the 9v power supply:
Finished -- decoupling fitted
Of course, it wouldn't be complete without the ESP8266 Huzzah I acquired from AdaFruit, though I have to say that I'm unlikely to use these again; rather, I might design in the surface-mount version of the module instead.
Fitted with the module
And since this is the very first Beer'o'Meter to be made, I had to go and put a 1 on the serial-number space on the back of the board. I then tried to sign my name in the box, made a hash of it, so scribbled in the gap :-)
The back of the finished module
Finally I got to fit all six of my flow meters ready for some testing. I may post again about testing the unit, but for now, here's a big spider of a flow meter for beer:
The Beer'o'Meter spider
This has been quite a learning experience for me, and I hope in the future to be able to share more of my hardware projects, perhaps from an earlier stage. I have plans for a DAC board, and perhaps some other things.

27 January 2016

Uwe Kleine-König: Installing Debian Jessie on a Netgear ReadyNAS 104

The Netgear ReadyNAS 104 ships with U-Boot. To access its "shell", remove the small square sticker on the back to reveal the UART pins (3V3, pinout available at Arnaud's NAS page[1]) and connect a matching adapter. Also connect a network cable to the lower jack. Then, on a different machine in the same network, set up a tftp server (e.g. apt-get install tftpd-hpa). As of today the latest beta netboot installer (Beta 2) doesn't work any more because the kernel in jessie was updated since the installer was released. So pick up the armhf netboot installer from the daily snapshots. You need initrd.gz and vmlinuz, and furthermore armada-370-netgear-rn104.dtb. Update: As jessie is released now, download the following images: To make the installer ready to boot do:
# apt-get install u-boot-tools
$ cat vmlinuz armada-370-netgear-rn104.dtb > vmlinuz-rn104
$ mkimage -A arm -O linux -T multi -C none -a 0x04000000 -e 0x04000000 -n "Debian Jessie armhf installer" -d vmlinuz-rn104:initrd.gz uImage-installer-rn104
# cp uImage-installer-rn104 /srv/tftp
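Before copying the image to the tftp directory it can be reassuring to check that the load and entry addresses really ended up in the header. The following is a minimal Python sketch, assuming the legacy 64-byte uImage header layout that mkimage writes; the function name parse_uimage_header is made up for this illustration and is not part of u-boot-tools.

```python
import struct
import zlib

UIMAGE_MAGIC = 0x27051956  # magic number of the legacy U-Boot image format
# Big-endian: 7 x u32 (magic, hcrc, time, size, load, entry, dcrc),
# 4 x u8 (os, arch, type, compression), 32-byte image name = 64 bytes.
HEADER = struct.Struct(">7I4B32s")


def parse_uimage_header(blob):
    """Decode the legacy uImage header found at the start of blob (hypothetical helper)."""
    (magic, hcrc, timestamp, size, load, entry, dcrc,
     os_id, arch, img_type, comp, name) = HEADER.unpack_from(blob)
    if magic != UIMAGE_MAGIC:
        raise ValueError("not a legacy uImage")
    # The header CRC is calculated with its own field set to zero.
    zeroed = blob[:4] + b"\0\0\0\0" + blob[8:HEADER.size]
    return {
        "name": name.rstrip(b"\0").decode("ascii", "replace"),
        "load": load,
        "entry": entry,
        "size": size,
        "header_crc_ok": (zlib.crc32(zeroed) & 0xFFFFFFFF) == hcrc,
    }
```

Reading the first 64 bytes of uImage-installer-rn104 and passing them to this function should report 0x04000000 for both "load" and "entry" if the mkimage call above was used unchanged.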
Then on the U-Boot shell set up networking and start the installer by issuing the following commands:
dhcp
setenv serverip 192.168.1.17
tftp uImage-installer-rn104
bootm $load_addr
Here 192.168.1.17 is the IPv4 address of the machine on which you set up the tftp server above; adapt it to your setup. While in U-Boot the default ethernet device is the lower jack, the installer is only able to use the upper one, so replug the ethernet cable into the upper receptacle. Go through the installation, and before rebooting do the following: Select "Change debconf priority" and set it to "low". Then "Download installer components" and check "mtd-modules-3.16.0-4-armmp-di". After that "Execute a shell" and do:
# depmod -a 
# modprobe pxa3xx_nand
# apt-install flash-kernel
# mount --bind /proc /target/proc
# chroot /target
# cat >> /etc/flash-kernel/db << EOF
Machine: NETGEAR ReadyNAS 104
DTB-Id: armada-370-netgear-rn104.dtb
DTB-Append: yes
Mtd-Kernel: uImage
Mtd-Initrd: minirootfs
U-Boot-Kernel-Address: 0x04000000
U-Boot-Initrd-Address: 0x05000000
Required-Packages: u-boot-tools
EOF
# flash-kernel
You then need to adapt the u-boot environment to pass the right root parameter to Linux. Alternatively add Bootloader-Sets-Incorrect-Root: yes to /etc/flash-kernel/db. [1] Note this page is about a ReadyNAS 102, the pinout is identical though.

20 December 2015

Lunar: Reproducible builds: week 34 in Stretch cycle

What happened in the reproducible builds effort between December 13th and December 19th: Infrastructure Niels Thykier started implementing support for .buildinfo files in dak. A very preliminary commit was made by Ansgar Burchardt to prevent .buildinfo files from being removed from the upload queue. Toolchain fixes Mattia Rizzolo rebased our experimental debhelper with the changes from the latest upload. New fixes have been merged by OCaml upstream. Packages fixed The following 39 packages have become reproducible due to changes in their build dependencies: apache-mime4j, avahi-sharp, blam, bless, cecil-flowanalysis, cecil, coco-cs, cowbell, cppformat, dbus-sharp-glib, dbus-sharp, gdcm, gnome-keyring-sharp, gudev-sharp-1.0, jackson-annotations, jackson-core, jboss-classfilewriter, jboss-jdeparser2, jetty8, json-spirit, lat, leveldb-sharp, libdecentxml-java, libjavaewah-java, libkarma, mono.reflection, monobristol, nuget, pinta, snakeyaml, taglib-sharp, tangerine, themonospot, tomboy-latex, widemargin, wordpress, xsddiagram, xsp, zeitgeist-sharp. The following packages became reproducible after getting fixed: Some uploads fixed some reproducibility issues, but not all of them: Patches submitted which have not made their way to the archive yet: reproducible.debian.net Packages in experimental are now tested on armhf. (h01ger) Arch Linux packages in the multilib and community repositories (4,000 more source packages) are also being tested. All of these test results are better analyzed and nicely displayed together with each package. (h01ger) For Fedora, build jobs can now run in parallel. Two are currently running, now testing reproducibility of 785 source packages from Fedora 23. mock/1.2.3-1.1 has been uploaded to experimental to better build RPMs. (h01ger) Work has started on having automatic build node pools to maximize use of armhf build nodes. (Vagrant Cascadian) diffoscope development Version 43 has been released on December 15th. It has been dubbed as epic! 
as it contains many contributions that were written around the summit in Athens. Baptiste Daroussin found that running diffoscope on some Tar archives could overwrite arbitrary files. This has been fixed by using libarchive instead of Python's internal Tar library and adding a sanity check for destination paths. In any case, until proper sandboxing is implemented, don't run diffoscope on untrusted inputs outside an isolated, throw-away system. Mike Hommey identified that the CBFS comparator would needlessly waste time scanning big files. It will now not consider any files bigger than 24 MiB, 8 MiB more than the largest ROM created by coreboot at this time. An encoding issue related to Zip files has also been fixed. (Lunar) New comparators have been added: Android dex files (Reiner Herrmann), filesystem images using libguestfs (Reiner Herrmann), icons and JPEG images using libcaca (Chris Lamb), and OS X binaries (Clemens Lang). The comparator for Free Pascal Compilation Unit will now only be used when the unit version matches the compiler one. (Levente Polyak) A new multi-file HTML output with on-demand loading of long diffs is available through the --html-dir option. On-demand loading requires jQuery, whose path can be specified through the --jquery option. The diffs can also be simply browsed for non-JavaScript users or when jQuery is not available. (Joachim Breitner) Example of on-demand loading in diffoscope Portability toward other systems has been improved: old versions of GNU diff are now supported (Mike McQuaid), suggestion of the appropriate locale is now the more generic en_US.UTF-8 (Ed Maste), the --list-tools option can now support multiple systems (Mattia Rizzolo, Levente Polyak, Lunar). Many internal changes and code clean-ups have been made, paving the way for parallel processing. (Lunar) Version 44 was released on December 18th fixing an issue affecting .deb lacking a md5sums file introduced in a previous refactoring (Lunar). 
Support has been added for Mozilla optimized Zip files. (Mike Hommey). The HTML output has been optimized in size (Mike Hommey, Esa Peuha, Lunar), speed (Lunar), and will now properly number lines (Mike Hommey). A message will always be displayed when lines are ignored at the end of a diff (Lunar). For portability and consistency, Python's os.walk() function is now used instead of find to perform directory listing. (Lunar) Documentation update Package reviews 143 reviews have been removed, 69 added and 22 updated in the previous week. Chris Lamb reported 12 new FTBFS issues. New issues identified this week: random_order_in_init_py_generated_by_python-genpy, timestamps_in_copyright_added_by_perl_dist_zilla, random_contents_in_dat_files_generated_by_chasen-dictutils_makemat, timestamps_in_documentation_generated_by_pandoc. Chris West did some improvements on the scripts used to manage notes in the misc repository. Misc. Accounts of the reproducible builds summit in Athens were written by Thomas Klausner from NetBSD and Hans-Christoph Steiner from The Guardian Project. Some openSUSE developers are working on a hackweek on reproducible builds which was discussed on the opensuse-packaging mailing-list.
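The destination-path sanity check mentioned for diffoscope 43 can be illustrated with a small Python sketch. This is not diffoscope's actual code (which moved to libarchive); it only shows the general idea with the standard tarfile module, and the function name unsafe_members is an assumption made for this example. Symlink members are deliberately not handled here.

```python
import os
import tarfile


def unsafe_members(archive, dest):
    """Return the names of members that would be written outside dest.

    archive is an open tarfile.TarFile, dest the intended extraction
    directory (hypothetical helper, for illustration only).
    """
    root = os.path.realpath(dest)
    bad = []
    for member in archive.getmembers():
        # Resolve "..", absolute names and the like before comparing.
        target = os.path.realpath(os.path.join(root, member.name))
        if os.path.commonpath([root, target]) != root:
            bad.append(member.name)
    return bad
```

An extractor would refuse to proceed (or skip the offending entries) whenever this list is not empty. Even with such a check in place, the advice above still applies: keep untrusted inputs inside an isolated, throw-away system.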

3 August 2015

Lunar: Reproducible builds: week 14 in Stretch cycle

What happened in the reproducible builds effort this week: Toolchain fixes akira submitted a patch to make cdbs export SOURCE_DATE_EPOCH. She uploaded a package with the enhancement to the experimental reproducible repository. Packages fixed The following 15 packages became reproducible due to changes in their build dependencies: dracut, editorconfig-core, elasticsearch, fish, libftdi1, liblouisxml, mk-configure, nanoc, octave-bim, octave-data-smoothing, octave-financial, octave-ga, octave-missing-functions, octave-secs1d, octave-splines, valgrind. The following packages became reproducible after getting fixed: Some uploads fixed some reproducibility issues but not all of them: In contrib, Dmitry Smirnov improved libdvd-pkg with 1.3.99-1-1. Patches submitted which have not made their way to the archive yet: reproducible.debian.net Four armhf build hosts were provided by Vagrant Cascadian and have been configured to be used by jenkins.debian.net. Work on including armhf builds in the reproducible.debian.net webpages has begun. So far the repository comparison page just shows us which armhf binary packages are currently missing in our repo. (h01ger) The scheduler has been changed to re-schedule more packages from stretch than sid, as the gcc5 transition has started. This mostly affects build log age. (h01ger) A new depwait status has been introduced for packages which can't be built because of missing build dependencies. (Mattia Rizzolo) debbindiff development Finally, on July 31st, Lunar released debbindiff 27 containing a complete overhaul of the code for the comparison stage. The new architecture is more versatile and extensible while minimizing code duplication. libarchive is now used to handle cpio archives and iso9660 images through the newly packaged python-libarchive-c. This should also help support a couple other archive formats in the future. Symlinks and devices are now properly compared. 
Text files are compared as Unicode after being decoded, and encoding differences are reported. Support for Sqlite3 and Mono/.NET executables has been added. Thanks to Valentin Lorentz, the test suite should now run on more systems. A small deficiency in unsquashfs has been identified in the process. A long standing optimization is now performed on Debian packages: based on the content of the md5sums control file, we skip comparing files with matching hashes. This makes debbindiff usable on packages with many files. Fuzzy-matching is now performed for files in the same container (like a tarball) to handle renames. Also, for Debian .changes, listed files are now compared without looking at the embedded version number. This makes debbindiff a lot more useful when comparing different versions of the same package. Based on the rearchitecturing, work has been done to allow parallel processing. The branch now seems to work most of the time. More testing needs to be done before it can be merged. The current fuzzy-matching algorithm, ssdeep, has shown disappointing results. One important use case is being able to properly compare debug symbols. Their path is made using the Build ID. As this identifier is made with a checksum of the binary content, finding things like CPP macros is much easier when a diff of the debug symbols is available. Good news is that TLSH, another fuzzy-matching algorithm, has been tested with much better results. A package is waiting in NEW and the code is ready for it to become available. A follow-up release 28 was made on August 2nd fixing the content label used for gzip, bzip2 and xz files and an error on text files only differing in their encoding. It also contains a small code improvement on how comments on Difference objects are handled. This is the last release named debbindiff. A new name has been chosen to better reflect that it is not a Debian specific tool. Stay tuned! 
Documentation update Valentin Lorentz updated the patch submission template to suggest writing the kind of issue in the bug subject. Some progress has been made on the Reproducible Builds HOWTO while preparing the related CCCamp15 talk. Package reviews 235 obsolete reviews have been removed, 47 added and 113 updated this week. 42 reports for packages failing to build from source have been made by Chris West (Faux). New issue added this week: haskell_devscripts_locale_substvars. Misc. Valentin Lorentz wrote a script to report packages tested as unreproducible installed on a system. We encourage everyone to run it on their systems and give feedback!
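The md5sums-based optimization described for debbindiff 27 boils down to comparing two hash lists before looking at any file content. Here is a minimal Python sketch of that idea, assuming the usual "digest, whitespace, path" layout of a md5sums control file; parse_md5sums and files_to_compare are names invented for this illustration and are not taken from debbindiff.

```python
def parse_md5sums(text):
    """Map path -> MD5 hex digest from the text of a md5sums control file."""
    sums = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each line is "<digest>  <path>"; split on the first whitespace run.
        digest, path = line.split(None, 1)
        sums[path] = digest
    return sums


def files_to_compare(md5sums_a, md5sums_b):
    """Return the sorted paths whose hashes differ or that exist on one side only."""
    a = parse_md5sums(md5sums_a)
    b = parse_md5sums(md5sums_b)
    return sorted(path for path in a.keys() | b.keys()
                  if a.get(path) != b.get(path))
```

Only the paths returned by files_to_compare would need the expensive content comparison; everything else has matching hashes and can be skipped, which is what makes packages with many files tractable.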

23 May 2015

Eddy Petrișor: Linksys NSLU2 adventures into the NetBSD land passed through JTAG highlands - part 2 - RedBoot reverse engineering and APEX hacking

(continuation of Linksys NSLU2 adventures into the NetBSD land passed through JTAG highlands - part 1; meanwhile, my article was mentioned briefly in BSDNow Episode 89 - Exclusive Disjunction around minute 36:25)

Choosing to call RedBoot from a hacked Apex
As I was saying in my previous post, in order to be able to automate the booting of the NetBSD image via TFTP, I opted for using a 2nd stage bootloader (planning to flash it in the NSLU2 instead of a Linux kernel), and since Debian was already using Apex, I chose Apex, too.

The first problem I found was that the networking support in Apex was relying on an old version of the Intel NPE library which I couldn't find on Intel's site. The new version was incompatible/not building with the old build wrapper in Apex, so I was faced with 3 options:
  1. Fight with the available Intel code and try to force it to compile in Apex
  2. Incorporate the NPE driver from NetBSD into a rump kernel to be included in Apex instead of the original Intel code, since the NetBSD driver only needed an easily compilable binary blob
  3. Hack together an Apex version that simulates typing the necessary RedBoot commands to load the netbsd image via TFTP and execute it.
After taking a look at the NPE driver build system, I concluded there were very few options less attractive than option 1, among which was hammering nails through my forehead as an improvement measure against the severe brain damage which I would probably suffer after dealing with the NPE "build system".

Option 2 looked like the best option I could have, given the situation, but my NetBSD foo was too close to 0 to even dream of undertaking such a task. In my opinion, this still remains the technically superior solution to the problem, since it is a very portable and flexible way to ensure networking works in spite of the proprietary NPE code.

But, in practice, the best option I could implement at the time was option 3. I initially planned to pre-fill from Apex my desired commands into the RedBoot buffer that stored the keyboard strokes typed by the user:

load -r -b 0x200000 -h 192.168.0.2 netbsd-nfs.bin
g
Since this was the first time I was going to do less than trivial reverse engineering in order to find the addresses and signatures of interesting functions in the RedBoot code, it wasn't bad at all that I had a version of the RedBoot source code.

When stuck with reverse engineering, apply JTAG
The bad thing was that the code Linksys published as the source of the RedBoot running inside the NSLU2 was, in fact, a different code which had some significant changes around the code pieces I was mostly interested in. That in spite of the GPL terms.

But I thought that I could manage. After all, how hard could it be to identify the 2-3 functions I was interested in and 1 buffer? Even if I only had the disassembled code from the slug, it shouldn't be that hard.

I struggled with this for about 2-3 weeks on the few occasions I had during that time, but the excitement of learning something new kept me going. Until I got stuck somewhere between the misalignment between the published RedBoot code and the disassembled code, the state of the system at the time of dumping the contents from RAM (for purposes of disassembly), the assembly code generated by GCC for some specific C code I didn't have at all, and the particularities of ARM assembly.

What was most likely to unblock me was to actually see the code in action, so I decided that attaching a JTAG dongle to the slug and doing a session of in-circuit debugging was in order.

Luckily, the pinout of the JTAG interface was already identified in the NSLU2 Linux project, so I only had to solder some wires to the specified places and a 2x20 header to be able to connect through JTAG to the board.


JTAG connections on Kinder (the NSLU2 targeting NetBSD)

After this was done I tried immediately to see if when using a JTAG debugger I could break the execution of the code on the system. The answer was sadly, no.

The chip was identified, but breaking the execution was not happening. I tried this in OpenOCD and in another proprietary debugger application I had access to, and the result was the same, breaking was not happening.
$ openocd -f interface/ftdi/olimex-arm-usb-ocd.cfg -f board/linksys_nslu2.cfg
Open On-Chip Debugger 0.8.0 (2015-04-14-09:12)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.sourceforge.net/doc/doxygen/bugs.html
Info : only one transport option; autoselect 'jtag'
adapter speed: 300 kHz
Info : ixp42x.cpu: hardware has 2 breakpoints and 2 watchpoints
0
Info : clock speed 300 kHz
Info : JTAG tap: ixp42x.cpu tap/device found: 0x29277013 (mfg: 0x009,
part: 0x9277, ver: 0x2)
[..]

$ telnet localhost 4444
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Open On-Chip Debugger
> halt
target was in unknown state when halt was requested
in procedure 'halt'
> poll
background polling: on
TAP: ixp42x.cpu (enabled)
target state: unknown
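As a side note, the "tap/device found" line in the log above can be decoded by hand. The sketch below is a small Python illustration, assuming the standard IEEE 1149.1 IDCODE layout (bit 0 fixed to 1, 11 bits of manufacturer, 16 bits of part number, 4 bits of version); decode_idcode is a name invented for this example and is not an OpenOCD function.

```python
def decode_idcode(idcode):
    """Split a 32-bit JTAG IDCODE into its standard fields."""
    return {
        "version": (idcode >> 28) & 0xF,        # bits 31..28
        "part": (idcode >> 12) & 0xFFFF,        # bits 27..12
        "manufacturer": (idcode >> 1) & 0x7FF,  # bits 11..1
    }


fields = decode_idcode(0x29277013)
print({name: hex(value) for name, value in fields.items()})
```

For 0x29277013 this gives the same values OpenOCD reports (mfg 0x009, part 0x9277, ver 0x2), which only confirms that the scan chain itself is alive; it says nothing about whether the core can be halted.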
Looking into the documentation I found a bit of information on the XScale processors[X] which suggested that XScale processors might necessarily need the (otherwise optional) SRST signal on the JTAG interface to be able to single step the chip.

This confused me a lot since I was sure other people had already used JTAG on the NSLU2.

The options I saw at the time were:
  1. my NSLU2 did not have a fully working JTAG interface (either due to the missing SRST signal on the interface or maybe due to a JTAG lock on later generation NSLU-s, as was my second slug)
  2. nobody ever single stepped the slug using OpenOCD or other JTAG debugger, they only reflashed, and I was on totally new ground
I even contacted Rod Whitby, the project leader of the NSLU2 project to try to confirm single stepping was done before. Rod told me he never did that and he only reflashed the device.

This confused me even further because, from what I encountered on other platforms, in order to flash some device, the code responsible for programming the flash is loaded in the RAM of the target microcontroller and that code is executed on the target after a RAM buffer with the to be flashed data is preloaded via JTAG, then the operation is repeated for all flash blocks to be reprogrammed.

I was aware it was possible to program a flash chip situated on the board, outside the chip, by only playing with the chip's pads, strictly via JTAG, but I was still hoping single stepping the execution of the code in RedBoot was possible.

Guided by that hope and the possibility that the newer versions of the device were locked, I decided to add a JTAG interface to my older NSLU2, too. But this time I decided I would also add the TRST and SRST signals to the JTAG interface, just in case single stepping would work.

This mod involved even more extensive changes than the ones done on the other NSLU, but I was so frustrated by the fact I was stuck that I didn't mind poking a few holes through the case and the prospect of a connector always sticking out from the other NSLU2, which was doing some small, yet useful work in my home LAN.

It turns out NOBODY single stepped the NSLU2
After biting the bullet and soldering a JTAG interface with the TRST and SRST signals also connected, as the pinout page from the NSLU2 Linux wiki suggested, I was disappointed to observe that I was not able to single step the older NSLU2 either, in spite of the presence of the extra signals.

I even tinkered with the reset configurations of OpenOCD, but had no success. After obtaining the same result on the proprietary debugger, digging through a presentation made by Rod back in the heyday of the project and the conversations on the NSLU2 Linux Yahoo mailing list, I finally concluded:
Actually nobody single stepped the NSLU2, no matter the version of the NSLU2 or connections available on the JTAG interface!
So I was back to square 1: I had to either struggle with disassembly, reevaluate my initial options, find another option or even drop the idea entirely. At that point I was already committed to the project, so dropping the idea entirely didn't seem like the reasonable thing to do.

Since I felt I was really close to finishing on the route I had chosen a while ago, I was not significantly more knowledgeable in the NetBSD code, and looking at the NPE code made me feel like washing my hands, the only option which seemed reasonable was to go on.

Digging a lot more through the internet, I was finally able to find another version of the RedBoot source which was modified for Intel ixp42x systems. A few checks here and there revealed this newly found code was actually almost identical to the code I had disassembled from the slug I was aiming to run NetBSD on. This was a huge step forward.

Long story short, a couple of days later I had a hacked Apex that could go through the RedBoot data structures, search for available commands in RedBoot and successfully call any of the built-in RedBoot commands!

Loading this modified Apex by hand into RAM via TFTP and then jumping into it, to see if things worked as expected, revealed a few small issues which I corrected right away.

Flashing a modified RedBoot?! But why? Wasn't Apex supposed to avoid exactly that risky operation?
Since the tests when executing from RAM were successful, my custom second stage Apex bootloader for NetBSD net booting was ready to be flashed into the NSLU2.

I added two more targets in the Makefile in the code on the dedicated netbsd branch of my Apex repository to generate the images ready for flashing into the NSLU2 flash (RedBoot needs to find a Sercomm header in flash, otherwise it will crash), and the exact commands to be executed in RedBoot are also printed out after generation. This way, if the command is copy-pasted, there is no risk the NSLU2 is bricked by mistake.

After some flashing and reflashing of the apex_nslu2.flash image into the NSLU2 flash, some manual testing, tweaking and modifying the default built in APEX commands, checking that the sequence of commands 'move', 'go 0x01d00000' would jump into Apex, which, in turn, would call RedBoot to transfer the netbsd-nfs.bin image from a TFTP to RAM and then execute it successfully, it was high time to check NetBSD would boot automatically after the NSLU was powered on.

It didn't. Contrary to my previous tests, no call made from Apex to the RedBoot code would return back to Apex, not even the execution of a basic command such as the 'version' command.

It turns out the default commands hardcoded into RedBoot were 'boot; exec 0x01d00000', but I had tested 'boot; go 0x01d00000', which is not the same thing.

While 'go' does a plain jump to the specified address, the 'exec' command also does some preparations so it allows a jump into the Linux kernel, and those preparations break some environment the RedBoot commands expect. I don't know which those are and didn't have the mood or motivation to find out.

So the easiest solution was to change the RedBoot's built-in command and turn that 'exec' into a 'go'. But that meant this time I was actually risking to brick the NSLU, unless I was able to reflash via JTAG the NSLU2.


(to be continued - next, changing RedBoot and bisecting through the NetBSD history)

[X] Linksys NSLU2 has an XScale IXP420 processor which is compatible at ASM level with the ARMv5TEJ instruction set

6 May 2015

John Goerzen: I Give Up on Google: Free is Too Expensive

I am really tired of things Google has done lately. The most recent example being retiring Classic Maps. That's a problem, because the current Maps mysteriously doesn't show most of my saved ("starred") places. Google has known about this since at least 2013. There are posts all over their forums about it going back to when what is now regular Google Maps was beta. Google employees even knew about it and did nothing. For someone that made heavy use of it, this was quite annoying. But there have been plenty of others: I even used to use Flickr, then moved to Picasa when Yahoo stopped investing in Flickr. Now I'm back to Flickr, because Google stopped investing in Picasa. The takeaway is that you can't really rely on Google for anything. Counting on something being there for an upcoming trip and then having it be suddenly yanked away is a level of frustration that just makes the service not so useful. Never knowing when obvious things (7-day calendar view) will be removed means you just can't depend on it. So, are there good alternatives? Things I'm thinking of include: Anybody else moving off Google?

30 April 2015

Eddy Petrișor: Linksys NSLU2 JTAG help requested

Some time ago I embarked on a journey to install NetBSD on one of my two NSLU2-s. I have run into all sorts of hurdles and problems which I finally managed to overcome, except one:

The NSLU I am using has a standard 20 pin ARM JTAG connector attached to it (as per this page http://www.nslu2-linux.org/wiki/Info/PinoutOfJTAGPort, only TDI, TDO, TMS, TCK, Vref and GND signals), but, although the chip is identified, I am unable to halt the CPU:
    $ openocd -f interface/ftdi/olimex-arm-usb-ocd.cfg -f board/linksys_nslu2.cfg
    Open On-Chip Debugger 0.8.0 (2015-04-14-09:12)
    Licensed under GNU GPL v2
    For bug reports, read
    http://openocd.sourceforge.net/doc/doxygen/bugs.html
    Info : only one transport option; autoselect 'jtag'
    adapter speed: 300 kHz
    Info : ixp42x.cpu: hardware has 2 breakpoints and 2 watchpoints
    0
    Info : clock speed 300 kHz
    Info : JTAG tap: ixp42x.cpu tap/device found: 0x29277013 (mfg: 0x009,
    part: 0x9277, ver: 0x2)
    [..]
    $ telnet localhost 4444
    Trying ::1...
    Trying 127.0.0.1...
    Connected to localhost.
    Escape character is '^]'.
    Open On-Chip Debugger
    > halt
    target was in unknown state when halt was requested
    in procedure 'halt'
    > poll
    background polling: on
    TAP: ixp42x.cpu (enabled)
    target state: unknown
My main goal is to make sure I can flash the device via JTAG, in case I break it, but it would be ideal if I could use the JTAG to single step through the code.

I have found that other people have managed to flash the device via JTAG without the other signals, and some have even changed the bootloader (and had JTAG confirmed as backup solution), so I am stuck.

So if anyone can give some insights into ixp42x / Xscale / NSLU2 specific JTAG issues or hints regarding this issue on OpenOCD or other such tool, I would be really grateful.


Note: I have made a hacked second stage Apex bootloader to load the NetBSD image via TFTP, but the default RedBoot sequence 'boot; exec 0x01d00000' should be 'boot; go 0x01d00000' for NetBSD to work, so I am considering changing the RedBoot partition to alter that command. The gory details can be summed up as follows: my Apex is calling RedBoot functions to be network enabled (because Intel's current NPE code is not working on Apex) and I have tested this to work with go, but not with exec.

22 April 2015

Tollef Fog Heen: Temperature monitoring using a Beaglebone Black and 1-wire

I've had a half-broken temperature monitoring setup at home for quite some time. It started out with an Atom-based NAS, a USB-serial adapter and a passive 1-wire adapter. It sometimes worked, then stopped working, then started when poked with a stick. Later, the NAS was moved under the stairs and I put a Beaglebone Black in its old place. The temperature monitoring thereafter never really worked, but I didn't have the time to fix it. Over the last few days, I've managed to get it working again, of course by replacing nearly all the existing components. I'm using the DS18B20 sensors. They're about USD 1 a piece on Ebay (when buying small quantities) and seem to work quite ok. My first task was to address the reliability problems: Dropouts and really poor performance. I thought the passive adapter was problematic, in particular with the wire lengths I'm using and I therefore wanted to replace it with something else. The BBB has GPIO support, and various blog posts suggested using that. However, I'm running Debian on my BBB which doesn't have support for DTB overrides, so I needed to patch the kernel DTB. (Apparently, DTB overrides are landing upstream, but obviously not in time for Jessie.) I've never even looked at Device Tree before, but the structure was reasonably simple and with a sample override from bonebrews it was easy enough to come up with my patch. This uses pin 11 (yes, 11, not 13, read the bonebrews article for explanation on the numbering) on the P8 block. This needs to be compiled into a .dtb. I found the easiest way was just to drop the patched .dts into an unpacked kernel tree and then run make dtbs. Once this works, you need to compile the w1-gpio kernel module, since Debian hasn't yet enabled that. Run make menuconfig, find it under "Device drivers", "1-wire", "1-wire bus master", build it as a module. I then had to build a full kernel to get the symversions right, then build the modules. 
I think there is or should be an easier way to do that, but as I cross-built it on a fast AMD64 machine, I didn't investigate too much. Insmod-ing w1-gpio then works, but for me, it failed to detect any sensors. Reading the data sheet, it looked like a pull-up resistor on the data line was needed. I had enabled the internal pull-up, but apparently that wasn't enough, so I added a 4.7kOhm resistor between pin 3 (VDD_3V3) on P9 and pin (GPIO_45) on P8. With that in place, my sensors showed up in /sys/bus/w1/devices and you can read the values using cat. In my case, I wanted the data to go into collectd and then to graphite. I first tried using an Exec plugin, but never got it to work properly. Using a [python plugin] worked much better and my graphite installation is now showing me temperatures. Now I just need to add more probes around the house. The most useful references were In addition, various searches for DS18B20 pinout and similar, of course.
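For anyone wanting to do the same, reading a sensor is mostly a matter of parsing the w1_slave file that the w1-therm driver exposes below /sys/bus/w1/devices. The following Python sketch is not the collectd plugin mentioned above; it assumes the usual two-line w1_slave format (CRC status on the first line, "t=" in millidegrees Celsius on the second), and parse_w1_slave is a name made up for this example.

```python
def parse_w1_slave(text):
    """Return the temperature in degrees Celsius, or None for a bad reading."""
    lines = text.strip().splitlines()
    # The first line ends in YES only when the CRC check passed.
    if len(lines) < 2 or not lines[0].endswith("YES"):
        return None
    marker = lines[1].rfind("t=")
    if marker == -1:
        return None
    # The driver reports millidegrees Celsius.
    return int(lines[1][marker + 2:]) / 1000.0


sample = (
    "72 01 4b 46 7f ff 0e 10 57 : crc=57 YES\n"
    "72 01 4b 46 7f ff 0e 10 57 t=23125\n"
)
print(parse_w1_slave(sample))
```

On a real system the text would come from a file such as /sys/bus/w1/devices/28-*/w1_slave (28 being the DS18B20 family code). Returning None on a failed CRC lets the caller skip the sample instead of graphing garbage.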
