Search Results: "etbe"

30 May 2021

Russell Coker: HP ML110 Gen9

I've just bought a HP ML110 Gen9 as a personal workstation, here are my notes about it and documentation on running Debian on it. Why a Server? I bought this because the ML350p Gen8 turned out to be too noisy for my taste [1]. I've just been editing my page about Memtest86+ RAM speeds [2], over the course of 10 years (high end laptop in 2001 to low end server in 2011) RAM speed increased by a factor of 100. RAM speed has been increasing at a lower rate than CPU speed and is becoming an increasing bottleneck on system performance. So while I could get a faster white-box system the cost of a second-hand server isn't that great and I'm getting a system that's 100* faster than what was adequate for most tasks in 2001. HP makes some nice workstation class machines with ECC RAM (think server without remote management, hot-swap disks, or redundant PSU but with sound hardware). But they are significantly more expensive on the second hand market than servers. This server cost me $650 and came with 2*480G DC grade SSDs (Intel but with HPE stickers). I hope that more than half of the purchase price will be recovered from selling the SSDs (I will use NVMe). Also 64G of non-ECC RAM costs $370 from my local store. As I want lots of RAM for testing software on VMs it will probably turn out that the server cost me less than the cost of new RAM once I've sold the SSDs! Monitoring
wget -O /usr/local/hpePublicKey2048_key1.pub https://downloads.linux.hpe.com/SDR/hpePublicKey2048_key1.pub
echo "# HP monitoring" >> /etc/apt/sources.list
echo "deb [signed-by=/usr/local/hpePublicKey2048_key1.pub] http://downloads.linux.hpe.com/SDR/downloads/MCP/Debian/ stretch/current-gen9 non-free" >> /etc/apt/sources.list
The above commands will make the management utilities installable on Debian/Buster. If using Bullseye (Testing at the moment) then you need to have Buster repositories in APT for dependencies, HP doesn't seem to have packaged all their utilities for Buster.
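Once you know which repository has the packages you want (the search below shows how to find that) installation is the usual APT procedure; the hpasmcli invocations here are standard HP syntax rather than something specific to this system:
apt update
apt install hp-health
# query thermal and fan data via the management interface
hpasmcli -s "show temp"
hpasmcli -s "show fans"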
wget -r -np -A Contents-amd64.bz2 http://downloads.linux.hpe.com/SDR/repo/mcp/debian/dists
To find out which repositories had the programs I need I ran the above recursive wget and then uncompressed them for grep -R (as an aside it would be nice if bzgrep supported -R). I installed the hp-health package which has hpasmcli for viewing and setting many configuration options and hplog for viewing event log data and thermal data (among a few other things). I've added a new monitor to etbemon, hp-temp.monitor, to monitor HP server temperatures. I haven't made a configuration option to change the thresholds for what is considered normal because I don't expect server class systems to be routinely running above the warning temperature. For the linux-temp.monitor script I added a command-line option for the percentage of the high temperature that is an error condition as well as an option for the number of CPU cores that need to be over-temperature, having one core permanently over the high temperature due to a web browser seems standard for white-box workstations nowadays. The hp-health package depends on libc6-i686 lib32gcc1 even though none of the programs it contains use lib32gcc1. Depending on lib32gcc1 instead of lib32gcc1 | lib32gcc-s1 means that installing hp-health requires removing mesa-opencl-icd, which probably means that BOINC can't use the GPU among other things. I solved this by editing /var/lib/dpkg/status and changing the package dependencies to what I desired. Note that this is not something for a novice to do, make a backup and make sure you know what you are doing! Issues The HPE Dynamic Smart Array B140i is a software RAID device. While it's convenient for some users that software RAID gets supported in the UEFI boot process, generally software RAID is a bad idea. Also my system has hot-swap drive caddies but the controller doesn't support hot-swap. So the first thing to do was to configure the array controller to run in AHCI mode and give up on using hot-swap drive caddies for hot-swap. I tested all the documented ways of scanning for new devices and nothing other than a reboot made the kernel recognise a new SATA disk. According to specs provided by Dell and HP the ML110 Gen9 makes less noise than the PowerEdge T320, according to my own observations the reverse is the case. I don't know if this is because of Dell being more conservative in their specs than HP or because of how dBA is measured vs my own personal annoyance thresholds for sounds. As the system makes more noise than I'm comfortable with I plan to build a rubber enclosure for the rear of the system to reduce noise, that will be the subject of another post. For Australian readers Bunnings has some good deals on rubber floor mats that can be used to reduce server noise. The server doesn't have sound hardware; while one could argue that servers don't need sound there are some server uses for sound hardware such as using line input as a source of entropy. Also for a manufacturer it might be a benefit to use the same motherboard for workstations and servers. Fortunately a friend gave me a nice set of Logitech USB speakers a few years ago that I hadn't previously had a cause to use, so that will solve the problem for me (I don't need line-in on a workstation). UEFI and Memtest I decided to try UEFI boot for something new (in the past I'd only used UEFI boot for a server that only had large disks). In the past I've booted all my own systems with BIOS boot because I'm familiar with it and they all have SSDs for booting which are less than 2TB in size (until recently 2TB SSDs weren't affordable for my personal use).
The Debian UEFI wiki page is worth reading [3]. The Debian Wiki page about ProLiant servers [4] is worth reading too. Memtest86+ doesn't support EFI booting (just goes to a black screen) even though Debian/Buster puts in a GRUB entry for it (Debian bug #695246 was filed for this in 2012). Also on my ML110 Memtest86+ doesn't report the RAM speed (a known issue on Memtest86+). Comments on the net say that Memtest86+ hasn't been maintained for a long time and Memtest86 (the non-free version) has been updated more recently. So far I haven't seen a system with ECC RAM have a memory problem that could be detected by Memtest86+, the memory problems I've seen on ECC systems have been things that prevent booting (RAM not being recognised correctly), that are detected by the BIOS as ECC errors before booting, or that are reported by the kernel as ECC errors at run time (happened years ago and I can't remember the details). Overall I'm not a fan of EFI with the way it currently works in Debian. It seems to add some of the GRUB functionality into the BIOS and then use that to load GRUB. It seems that EFI can do everything you need and it would be better to just have a single boot loader not two of them chained. Power Supply There are a range of PSUs for the ML110, the one I have has the smallest available PSU (350W) and doesn't have a PCIe power cable (the one used for video cards). Here is the HP document which shows the cabling for the various ML110 Gen8 PSUs [5], I have the 350W PSU. One thing I've considered is whether I could make an adaptor from the drive bay power to the PCIe connector. A quick web search indicates that 4 SAS disks when active can take up to 75W more power than a system with no disks. If that's the case then the 2 spare drive bay connectors which can each handle 4 disks should be able to supply 150W. As a 6 pin PCIe power cable (GPU power cable) is rated at 75W that should be fine in theory (here's a page with the pinouts for PCIe power connectors [6]). My video card is a Radeon R7 260X which apparently takes about 113W all up so should be taking less than 75W from the PCIe power cable. All I really want is YouTube, Netflix, and text editing at 4K resolution. So I don't need much in terms of 3D power. KDE uses some of the advanced features of modern video cards, but it doesn't compare to 3D gaming. According to the Wikipedia page for the Radeon RX 500 series [7] the RX560 supports DisplayPort 1.4 and HDMI 2.0 (both of which do 4K@60Hz) and has a TDP of 75W. So an RX560 video card seems like a good option that will work in any system that doesn't have a spare PCIe power cable. I've just ordered one of those for $246 so hopefully that will arrive in a week or so. PCI Fan The ML110 Gen9 has an optional PCIe fan and baffle to cool PCIe cards (part number 784580-B21). Extra cooling of PCIe cards is a good thing, but $400 list price (and about $50 ebay price) for the fan and baffle is unpleasant. When I boot the system with a PCIe dual-ethernet card and two PCIe NVMe cards it gives a BIOS warning on boot, when I add a video card it refuses to boot without the extra fan. It's nice that the system makes sure it doesn't get into a thermal overload situation, but it would be nicer if they just shipped all necessary fans with it instead of trying to get more money out of customers. I just bought a PCI fan and baffle kit for $60.
Conclusion In spite of the unexpected expense of a new video card and PCI fan the overall cost of this system is still low, particularly when considering that I'll find another use for the video card which needs an extra power connector. It is disappointing that HP didn't supply a more capable PSU and fit all the fans to all models, the expectation of a server is that you can just do server stuff, not have to buy extra bits before you can do server stuff. If you want to install Tesla GPUs or something then it's expected that you might need to do something unusual with a server, but the basic stuff should just work. A single processor tower server should be designed to function as a deskside workstation and be able to handle an average video card. Generally it's a nice computer, I look forward to getting the next deliveries of parts so I can make it work properly.

10 May 2021

Russell Coker: Minikube and Debian

I just started looking at the Kubernetes documentation and interactive tutorial [1], which incidentally is really good. Everyone who is developing a complex system should look at this to get some ideas for online training. Here are some notes on setting it up on Debian. Add Kubernetes Apt Repository
deb https://apt.kubernetes.io/ kubernetes-xenial main
First add the above to your apt sources configuration (/etc/apt/sources.list or some file under /etc/apt/sources.list.d) for the kubectl package. Ubuntu Xenial is near enough to Debian/Buster and Debian/Unstable that it should work well for both of them. Then install the GPG key 6A030B21BA07F4FB for use by apt:
gpg --recv-key 6A030B21BA07F4FB
gpg --list-sigs 6A030B21BA07F4FB
gpg --export 6A030B21BA07F4FB | apt-key add -
The Google key in question is not signed. Install Packages for the Tutorial The online training is based on minikube which uses libvirt to setup a KVM virtual machine to do stuff. To get this running you need to have a system that is capable of running KVM (IE the BIOS is set to allow hardware virtualisation). It MIGHT work on QEMU software emulation without KVM support (technically it's possible but it would be slow and require some code to handle that), I didn't test if it does. Run the following command to install libvirt, kvm, and dnsmasq (which minikube requires) and kubectl on Debian/Buster:
apt install libvirt-clients libvirt-daemon-system qemu-kvm dnsmasq kubectl
For Debian/Unstable run the following command:
apt install libvirt-clients libvirt-daemon-system qemu-system-x86 dnsmasq kubectl
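Before that it's worth checking that the CPU virtualisation extensions are enabled, a generic check that isn't specific to minikube:
# non-zero output means the CPU advertises VT-x/AMD-V
egrep -c '(vmx|svm)' /proc/cpuinfo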
To run libvirt as non-root without needing a password for everything you need to add the user in question to the libvirt group. I recommend running things as non-root whenever possible. In this case entering a password for everything will probably be more pain than you want. The Debian Wiki page about KVM [2] is worth reading. Install Minikube Test Environment Here is the documentation for installing Minikube [3]. Basically just download a single executable from the net, put it in your $PATH, and run it. Best to use non-root for that. Also you need at least 3G of temporary storage space in the home directory of the user that runs it. After installing minikube run minikube start which will download container image data and start it up. Then you can run commands like the following to see what it has done.
# get overview of virsh commands
virsh help
# list domains
virsh --connect qemu:///system list
# list block devices a domain uses
virsh --connect qemu:///system domblklist minikube
# show stats on block device usage
virsh --connect qemu:///system domblkstat minikube hda
# list virtual networks
virsh --connect qemu:///system net-list
# list dhcp leases on a virtual network
virsh --connect qemu:///system net-dhcp-leases minikube-net
# list network filters
virsh --connect qemu:///system nwfilter-list
# list real network interfaces
virsh --connect qemu:///system iface-list
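Once minikube is running kubectl can be used to inspect the cluster, these are standard kubectl commands rather than anything specific to minikube:
# show the API server and other cluster endpoints
kubectl cluster-info
# list the single minikube node
kubectl get nodes
# list pods in all namespaces
kubectl get pods --all-namespaces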

Russell Coker: Echo Chambers vs Epistemic Bubbles

C Thi Nguyen wrote an interesting article about the difficulty of escaping from Echo Chambers and also mentions Epistemic Bubbles [1]. An Echo Chamber is a group of people who reinforce the same ideas and who often preemptively strike against opposing ideas (for example the right wing denigrating mainstream media to prevent their followers from straying from their approved message). An Epistemic Bubble is a group of people who just happen to not have contact with certain different ideas. When reading that article I wondered about what bubbles I and the people I associate with may be in. One obvious issue is that I have little communication with people who don't write in English and also little communication with people who are poor. So people who are poor and who can't write in English (which means significant portions of the population of India and Africa) are out of communication range for me. There are many situations that are claimed to be bubbles such as white people who are claimed to be innocent of racial issues because they only associate with other white people and men in the IT industry who are claimed to be innocent of sexism because they don't associate with women in the IT industry. But I think they are more of an echo chamber issue, if a white American doesn't access any of the variety of English language media produced by Afro Americans and realise that there's a racial problem it's because they don't want to see it and deliberately avoid looking at evidence. If a man in the IT industry doesn't access any of the media produced by women in tech and realise there are problems with sexism then it's because they don't want to see it. When is it OK to Reject a Group? The Ad Hominem Wikipedia page has a good analysis of different types of Ad Hominem arguments [2]. But the criteria for refuting a point in a debate are very different to the criteria used to determine which sources you should trust when learning about a topic. For example it's theoretically possible for someone to be good at computer science while also thinking the world is flat. In a debate about some aspect of computer programming it would be a fallacious Ad Hominem argument to say "you think the Earth is flat therefore you can't program a computer". But if you do a Google search for information on computer programming and one of the results is from earthisflat.com then it would probably save time to skip reading that one. If only one person supports an idea then it's quite likely to be wrong. Good ideas tend to be supported by multiple people and for any good idea you will find a supporter who doesn't appear to have any ideas that are obviously wrong. One of the problems we have as a society now is determining the quality of data (ideas, claims about facts, opinions, communication/spam, etc). When humans have to do that it takes time and energy. Shortcuts can make things easier. Some shortcuts I use are that mainstream media articles are usually more reliable than social media posts (even posts by my friends) and that certain media outlets are untrustworthy (like Breitbart). The next step is that anyone who cites a bad site like Breitbart as factual (rather than just an indication of what some extremists believe) is unreliable. For most questions that you might search for on the Internet there is a virtually endless supply of answers, the challenge is not finding an answer but finding a correct answer. So eliminating answers that are unlikely to be correct is an important part of the search.
If someone is citing references to support their argument and they can only cite fringe or extremist sites then I won't be convinced. Now someone could turn that argument around and claim that a site I reference such as the New York Times is wrong. If I find that my ideas were based on a claim that can only be found on the NYT then I will reconsider the issue. While I think that the NYT is generally accurate they are capable of making mistakes and if they are the sole source for claims that go against other claims then I will be hesitant to accept such claims. Newspapers often have exclusive articles based on their own research, but such articles always inspire investigation from other newspapers so other articles appear either supporting or questioning the claims in the exclusive. Saving Time When Interacting With Members of Echo Chambers Convincing a member of a cult or echo chamber of anything is not likely. When in discussions with them the focus should be on the audience and on avoiding wasting much time while also not giving them the impression that you agree with them. A common thing that members of echo chambers say is "I don't have time to read about that" when you ask if they have read a research paper or a news article. I don't have time to listen to people who can't or won't learn before speaking, there just isn't any value in that. Also if someone has a list of memes that takes more than 15 minutes to recite then they have obviously got time for reading things, just not reading outside their echo chamber. Conversations with members of echo chambers seem to be state free. They make a claim and you reject it, but regardless of the logical flaws you point out or the counter evidence you cite they make the same claim again the next time you speak to them. This seems to be evidence supporting the claim that evangelism is not about converting other people but alienating cult members from the wider society [3] (the original Quora text seems unavailable so I've linked to a Reddit copy). Pointing out that they had made a claim previously and didn't address the issues you had with it seems effective, such discussions seem to be more about performance so you want to perform your part quickly and efficiently. Be aware of false claims about etiquette. It's generally regarded as polite not to disagree much with someone who invites you to your home or who has done some favour for you, but that is no reason for tolerating an unwanted lecture about their echo chamber. Anyone who tries to create a situation where it seems rude of you not to listen to them saying things that they know will offend you is being rude, much ruder than telling them you are sick of it. Look for specific claims that can be disproven easily. The claim that the Roman Salute is different from the Hitler Salute is one example that is easy to disprove. Then they have to deal with the issue of their echo chamber being wrong about something.

Russell Coker: More EVM

This is another post about EVM/IMA whose main purpose is to provide useful web search results for problems. However if reading it on a planet feed inspires someone to play with EVM/IMA then that's good too, it's interesting technology. When using EVM/IMA in the Linux kernel, if dmesg has errors like "op=appraise_data cause=missing-HMAC" the missing-HMAC means that the error code in the kernel source is INTEGRITY_NOLABEL which has the comment "No security.evm xattr". You can check for the xattr on a file with the following command (this example has the security.evm xattr):
# getfattr -d -m - /etc/fstab 
getfattr: Removing leading '/' from absolute path names
# file: etc/fstab
security.evm=0sAwICqGOsfwCAvgE9y9OP74QxJ/I+3eOSF2n2dM51St98z/7LYHFd9rfGTvssvhTSYL9G8cTdRAH8ozggJu7VCzggW1REoTjnLcPeuMJsrMbW3DwVrB6ldDmJzyenLMjnIHmRDDeK309aRbLVn2ueJZ07aMDcSr+sxhOOAQ/GIW4SW8L1AKpKn4g=
security.ima=0sAT+Eivfxl+7FYI+Hr9K4sE6IieZ+
security.selinux="system_u:object_r:etc_t:s0"
If dmesg has errors like "op=appraise_data cause=invalid-HMAC" the invalid-HMAC means that the error code in the kernel source is INTEGRITY_FAIL which has the comment "Invalid HMAC/signature". These errors are from the evm_verifyxattr() function in Linux kernel 5.11.14. The error "evm: HMAC key is not set" means that the EVM key is not initialised, this means the key needs to be loaded into the kernel and EVM initialised by the command "echo 1 > /sys/kernel/security/evm" (or possibly some equivalent from a utility like evmctl). When the key is loaded the kernel gives the message "evm: key initialized" and after that /sys/kernel/security/evm is read-only. If there is something wrong with the key the kernel gives the message "evm: key initialization failed", it seems that the way to determine if your key is good is to try writing 1 to /sys/kernel/security/evm and see what happens. After that the command "cat /sys/kernel/security/evm" should return "3". The Gentoo wiki has good documentation on how to create and load the keys, which has to be done before initialising EVM [1]. I'll write more about that in another post.
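A minimal sketch of the initialisation sequence described above, assuming the key has already been loaded from the initrd or by evmctl:
# initialise EVM, once this succeeds the file becomes read-only
echo 1 > /sys/kernel/security/evm
# should print 3 when the key is loaded and EVM is initialised
cat /sys/kernel/security/evm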

3 May 2021

Russell Coker: DNS, Lots of IPs, and Postal

I decided to start work on repeating the tests for my 2006 OSDC paper on Benchmarking Mail Relays [1] and discover how the last 15 years of hardware developments have changed things. There have been software changes in that time too, but nothing that compares with going from single core 32bit systems with less than 1G of RAM and 60G IDE disks to multi-core 64bit systems with 128G of RAM and SSDs. As an aside the hardware I used in 2006 wasn't cutting edge and the hardware I'm using now isn't either. In both cases it's systems I bought second hand for under $1000. Pedants can think of this as comparing 2004 and 2018 hardware. BIND I decided to make some changes to reflect the increased hardware capacity and use 2560 domains and IP addresses, which gave the following errors as well as a startup time of a minute on a system with two E5-2620 CPUs.
May  2 16:38:37 server named[7372]: listening on IPv4 interface lo, 127.0.0.1#53
May  2 16:38:37 server named[7372]: listening on IPv4 interface eno4, 10.0.2.45#53
May  2 16:38:37 server named[7372]: listening on IPv4 interface eno4, 10.0.40.1#53
May  2 16:38:37 server named[7372]: listening on IPv4 interface eno4, 10.0.40.2#53
May  2 16:38:37 server named[7372]: listening on IPv4 interface eno4, 10.0.40.3#53
[...]
May  2 16:39:33 server named[7372]: listening on IPv4 interface eno4, 10.0.47.0#53
May  2 16:39:33 server named[7372]: listening on IPv4 interface eno4, 10.0.48.0#53
May  2 16:39:33 server named[7372]: listening on IPv4 interface eno4, 10.0.49.0#53
May  2 16:39:33 server named[7372]: listening on IPv6 interface lo, ::1#53
[...]
May  2 16:39:36 server named[7372]: zone localhost/IN: loaded serial 2
May  2 16:39:36 server named[7372]: all zones loaded
May  2 16:39:36 server named[7372]: running
May  2 16:39:36 server named[7372]: socket: file descriptor exceeds limit (123273/21000)
May  2 16:39:36 server named[7372]: managed-keys-zone: Unable to fetch DNSKEY set '.': not enough free resources
May  2 16:39:36 server named[7372]: socket: file descriptor exceeds limit (123273/21000)
The first thing I noticed is that a default configuration of BIND with 2560 local IPs (when just running in the default recursive mode) takes a minute to start and needed to open over 100,000 file handles. BIND also had some errors in that configuration which led to it not accepting shutdown requests. I filed Debian bug report #987927 [2] about this. One way of dealing with the errors in this situation on Debian is to edit /etc/default/named and put in the following line to allow BIND to access that many file handles:
OPTIONS="-u bind -S 150000"
But the best thing to do for BIND when there are many IP addresses that aren't going to be used for DNS service is to put a directive like the following in the BIND configuration to specify the IP address or addresses that are used for the DNS service:
listen-on { 10.0.2.45; };
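For example the relevant part of the configuration might look like the following (a sketch, assuming ::1 is the only IPv6 service address):
listen-on { 10.0.2.45; };
listen-on-v6 { ::1; };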
I have just added the listen-on and listen-on-v6 directives to one of my servers with about a dozen IP addresses. While 2560 IP addresses is an unusual corner case it's not uncommon to have dozens of addresses on one system. dig When doing tests of Postfix for relaying mail I noticed that mail was being deferred with DNS problems (the error was "Host or domain name not found. Name service error for name=a838.example.com type=MX: Host not found, try again"). I tested the DNS lookups with dig which failed with errors like the following:
dig -t mx a704.example.com
socket.c:1740: internal_send: 10.0.2.45#53: Invalid argument
socket.c:1740: internal_send: 10.0.2.45#53: Invalid argument
socket.c:1740: internal_send: 10.0.2.45#53: Invalid argument
; <<>> DiG 9.16.13-Debian <<>> -t mx a704.example.com
;; global options: +cmd
;; connection timed out; no servers could be reached
Here is a sample of the strace output from tracing dig:
bind(20, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
recvmsg(20, {msg_namelen=128}, 0)       = -1 EAGAIN (Resource temporarily unavailable)
write(4, "\24\0\0\0\375\377\377\377", 8) = 8
sendmsg(20, {msg_name={sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.0.2.45")}, msg_namelen=16, msg_iov=[{iov_base="86\1 \0\1\0\0\0\0\0\1\4a704\7example\3com\0\0\17\0\1\0\0)\20\0\0\0\0\0\0\f\0\n\0\10's\367\265\16bx\354", iov_len=57}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = -1 EINVAL (Invalid argument)
write(2, "socket.c:1740: ", 15)         = 15
write(2, "internal_send: 10.0.2.45#53: Invalid argument", 45) = 45
write(2, "\n", 1)                       = 1
futex(0x7f5a80696084, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0x7f5a80696010, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f5a8069809c, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7f5a80698020, FUTEX_WAKE_PRIVATE, 1) = 1
sendmsg(20, {msg_name={sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.0.2.45")}, msg_namelen=16, msg_iov=[{iov_base="86\1 \0\1\0\0\0\0\0\1\4a704\7example\3com\0\0\17\0\1\0\0)\20\0\0\0\0\0\0\f\0\n\0\10's\367\265\16bx\354", iov_len=57}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = -1 EINVAL (Invalid argument)
write(2, "socket.c:1740: ", 15)         = 15
write(2, "internal_send: 10.0.2.45#53: Invalid argument", 45) = 45
write(2, "\n", 1)
Ubuntu bug #1702726 claims that an insufficient ARP cache was the cause of dig problems [3]. At the time I encountered the dig problems I was seeing lots of kernel error messages "neighbour: arp_cache: neighbor table overflow" which I solved by putting the following in /etc/sysctl.d/mine.conf:
net.ipv4.neigh.default.gc_thresh3 = 4096
net.ipv4.neigh.default.gc_thresh2 = 2048
net.ipv4.neigh.default.gc_thresh1 = 1024
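The new settings can be applied without a reboot with a standard sysctl invocation:
# reload all sysctl configuration including /etc/sysctl.d/mine.conf
sysctl --system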
Making that change (and having rebooted because I didn't need to run the server overnight) didn't entirely solve the problems. I have seen some DNS errors from Postfix since then but they are less common than before. When they happened I didn't have that error from dig. At this stage I'm not certain that the ARP change fixed the dig problem although it seems likely (it's always difficult to be certain that you have solved a race condition instead of made it less common or just accidentally changed something else to conceal it). But it is clearly a good thing to have a large enough ARP cache so the above change is probably the right thing for most people (with the possibility of changing the numbers according to the required scale). Also people having that dig error should probably check their kernel message log, if the ARP cache isn't the cause then some other kernel networking issue might be related. Preliminary Results With Postfix I'm seeing around 24,000 messages relayed per minute with more than 60% CPU time idle. I'm not sure exactly how to count idle time when there are 12 CPU cores and 24 hyper-threads as having only 1 process scheduled for each pair of hyperthreads on a core is very different to having half the CPU cores unused. I ran my script to disable hyper-threads by telling the Linux kernel to disable each processor core that has the same core ID as another, it was buggy and disabled the second CPU altogether (better than finding this out on a production server). Going from 24 hyper-threads of 2 CPUs to 6 non-HT cores of a single CPU didn't change the throughput and the idle time went to about 30%, so I have possibly halved the CPU capacity for these tasks by disabling all hyper-threads and one entire CPU which is surprising given that I theoretically reduced the CPU power by 75%. I think my focus now has to be on hyper-threading optimisation. Since 2006 the performance has gone from ~20 messages per minute on relatively commodity hardware to 24,000 messages per minute on server equipment that is uncommon for home use but which is also within range of home desktop PCs. I think that a typical desktop PC with a similar speed CPU, 32G of RAM and SSD storage would give the same performance. Moore's Law (that transistor count doubles approximately every 2 years) is often misquoted as having performance double every 2 years. In this case more than 1024* the performance over 15 years means the performance doubling every 18 months. Probably most of that is due to SATA SSDs massively outperforming IDE hard drives but it's still impressive. Notes I've been using example.com for test purposes for a long time, but RFC2606 specifies .test, .example, and .invalid as reserved top level domains for such things. On the next iteration I'll change my scripts to use .test. My current test setup has a KVM virtual machine running my bhm program to receive mail which is taking between 20% and 50% of a CPU core in my tests so far. While that is happening the kvm process is reported as taking between 60% and 200% of a CPU core, so kvm takes as much as 4* the CPU of the guest due to the virtual networking overhead even though I'm using the virtio-net-pci driver (the most efficient form of KVM networking for emulating a regular ethernet card). I've also seen this in production with a virtual machine running a Tor relay node. I've fixed a bug where Postal would try to send the SMTP quit command after encountering a TCP error which would cause an infinite loop and SEGV.
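For reference, a minimal sketch of the hyper-thread disabling approach mentioned above (not the buggy script from my tests, just one way of doing it) is to take each logical CPU offline unless it is the first thread of its core:
#!/bin/bash
for CPU in /sys/devices/system/cpu/cpu[0-9]*; do
  # logical CPUs sharing this core, eg "0,12" or "0-1"
  SIBLINGS=$(cat "$CPU/topology/thread_siblings_list")
  # keep only the first listed sibling online
  FIRST=${SIBLINGS%%[,-]*}
  if [ "$(basename "$CPU")" != "cpu$FIRST" ]; then
    echo 0 > "$CPU/online"
  fi
done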

28 April 2021

Russell Coker: Links April 2021

Dr Justin Lehmiller's blog post comparing his official (academic style) and real biographies is interesting [1]. Also the rest of his blog is interesting too, he works at the Kinsey Institute so you know he's good. Media Matters has an interesting article on the spread of vaccine misinformation on Instagram [2]. John Goerzen wrote a long post summarising some of the many ways of having a decentralised Internet [3]. One problem he didn't address is how to choose between them, I could spend months of work to setup a fraction of those services. Erasmo Acosta wrote an interesting Medium article "Could Something as Pedestrian as the Mitochondria Unlock the Mystery of the Great Silence?" [4]. I don't know enough about biology to determine how plausible this is. But it is a worry, I hope that humans will meet extra-terrestrial intelligences at some future time. Meredith Haggerty wrote an insightful Medium article about the love vs money aspects of romantic comedies [5]. Changes in viewer demographics would be one factor that makes lead actors in romantic movies significantly less wealthy in recent times. Informative article about ZIP compression and the history of compression in general [6]. Vice has an insightful article about one way of taking over SMS access of phones without affecting voice call or data access [7]. With this method the victim won't notice that their service is being interfered with until it's way too late. They also explain the chain of problems in the US telecommunications industry that led to this. I wonder what's happening in this regard in other parts of the world. The clown code of ethics (8 Commandments) is interesting [8]. Sam Hartman wrote an insightful blog post about the problems with RMS and how to deal with him [9]. Also Sean Whitton has an interesting take on this [10]. Another insightful post is by Selam G about RMS's long history of bad behavior and the way universities are run [11]. Cory Doctorow wrote an insightful article for Locus about free markets with a focus on DRM on audio books [12]. We need legislative changes to fix this!

25 April 2021

Russell Coker: Scanning with a MFC-9120CN on Bullseye

I previously wrote about getting a Brother MFC-9120CN multifunction printer/scanner to print on Linux [1]. I had also got it scanning, which I didn't blog about.
found USB scanner (vendor=0x04f9, product=0x021d) at libusb:003:002
I recently upgraded that Linux system to Debian/Testing (which will soon be released as Debian/Bullseye) and scanning broke. The command sane-find-scanner would find the USB connected scanner (with the above output), but scanimage -L didn't. It turned out that I had to edit /etc/sane.d/dll.d/hplip which had a single uncommented line of "hpaio" and replace that with "brother3" to make SANE load the driver /usr/lib64/sane/libsane-brother3.so from the brscan3 package (which Brother provided from their web site years ago). I have the following script to do the scanning (which can run as non-root):
#!/bin/bash
set -e
if [ "$1" == "" ]; then
  echo "specify output filename"
  exit 1
fi
TMP=$(mktemp)
scanimage > "$TMP"
convert "$TMP" "$1"
rm "$TMP"
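The dll.d change described above can also be made non-interactively, assuming hpaio is the only uncommented line in the file:
sed -i 's/^hpaio$/brother3/' /etc/sane.d/dll.d/hplip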
Final Note This blog post doesn't describe everything that needs to be done to setup a scanner, I already had part of it setup from 10 years ago. But for anyone who finds this after having trouble, /etc/sane.d/dll.d is one place you should look for important configuration (especially if sane-find-scanner works and scanimage -L fails). Also the Brother drivers are handy to have although I apparently had it working in the past with the hpaio driver from HP (the Brother device emulates a HP device).

22 April 2021

Russell Coker: HP ML350P Gen8

I'm playing with a HP ProLiant ML350P Gen8 server (part num 646676-011). For HP servers ML means tower (see the ProLiant Wikipedia page for more details [1]). For HP servers the generation indicates how old the server is, Gen8 was announced in 2012 and Gen10 seems to be the current generation. Debian Packages from HP
wget -O /usr/local/hpePublicKey2048_key1.pub https://downloads.linux.hpe.com/SDR/hpePublicKey2048_key1.pub
echo "# HP RAID" >> /etc/apt/sources.list
echo "deb [signed-by=/usr/local/hpePublicKey2048_key1.pub] http://downloads.linux.hpe.com/SDR/downloads/MCP/Debian/ buster/current non-free" >> /etc/apt/sources.list
The above commands will setup the APT repository for Debian/Buster. See the HP Downloads FAQ [2] for more information about their repositories. hponcfg This package contains the hponcfg program that configures ILO (the HP remote management system) from Linux. One noteworthy command is hponcfg -r to reset the ILO, something you should do before selling an old system. ssacli This package contains the ssacli program to configure storage arrays, here are some examples of how to use it:
# list controllers and show slot numbers
ssacli controller all show
# list arrays on controller identified by slot and give array IDs
ssacli controller slot=0 array all show
# show details of one array
ssacli controller slot=0 array A show
# show all disks on one controller
ssacli controller slot=0 physicaldrive all show
# show config of a controller, this gives RAID level etc
ssacli controller slot=0 show config
# delete array B (you can immediately pull the disks from it)
ssacli controller slot=0 array B delete
# create an array type RAID0 with specified drives, do this with one drive per array for BTRFS/ZFS
ssacli controller slot=0 create type=arrayr0 drives=1I:1:1
When a disk is used in JBOD mode just under 33MB will be used at the end of the disk for the RAID metadata. If you have existing disks with a DOS partition table you can put it in a HP array as a JBOD and it will work with all data intact (a GPT partition table is more complicated). When all disks are removed from the server the cooling fans run at high speed, this would be annoying if you wanted to have a diskless workstation or server using only external storage. ssaducli This package contains the ssaducli diagnostic utility for storage arrays. The SSD wear gauge report doesn't work for the 2 SSDs I tested it on, maybe it only supports SAS SSDs not SATA SSDs. It doesn't seem to do anything that I need. storcli This package contains both 32bit and 64bit versions of the MegaRAID utility and deletes whichever one doesn't match the installation in the package postinst, so it fails debsums checks etc. The MegaRAID utility is for a different type of RAID controller to the Smart Storage Array (AKA SSA) that the other utilities work with. As an aside it seems that there are multiple types of MegaRAID controller, the management program from the storcli package doesn't work on a Dell server with MegaRAID. They should have made separate 32bit and 64bit versions of this package. Recommendations Here is the HP page for downloading firmware updates (including security updates) [3], you have to login first and have a warranty. This is legal but poor service. Dell servers have comparable prices (on the second hand market) and comparable features but give free firmware updates to everyone. Dell have overall lower quality of Debian packages for supporting utilities, but a wider range of support so generally Dell support seems better in every way. Dell and HP hardware seems of equal quality so overall I think it's best to buy Dell. Suggestions for HP Finding which of the signing keys to use is unreasonably difficult. You should get some HP employees to sign the HP keys used for repositories with their personal keys and then go to LUG meetings and get their personal keys well connected to the web of trust. Then upload the HP keys to the public key repositories. You should also use the same keys for signing all versions of the repositories. Having different keys for the different versions of Debian wastes people's time. Please provide firmware for all users, even if they buy systems second hand. It is in your best interests to have systems used long-term and have them run securely. It is not in your best interests to have older HP servers perform badly. Having all the fans run at maximum speed when power is turned on is a standard server feature. Some servers can throttle the fan when the BIOS is running, it would be nice if HP servers did that. Having ridiculously loud fans until just before GRUB starts is annoying.

18 April 2021

Russell Coker: IMA/EVM Certificates

I've been experimenting with IMA/EVM. Here is the Sourceforge page for the upstream project [1]. The aim of that project is to check hashes and maybe public key signatures on files before performing read/exec type operations on them. It can be used as the next logical step from booting a signed kernel with TPM. I am a long way from getting that sort of thing going, just getting the kernel to boot and load keys is my current challenge and isn't helped by the lack of documentation on error messages. This blog post started as a way of documenting the error messages so future people who google errors can get a useful result. I am not trying to document everything, just help people get through some of the first problems. I am using Debian for my work, but some of this will apply to other distributions (particularly the kernel error messages). The Debian distribution has the ima-evm-utils package but no other support for IMA/EVM. To get this going in Debian you need to compile your own kernel with IMA support and then boot it with kernel command-line options to enable IMA, in recent kernels that includes lsm=integrity as a mandatory requirement to prevent a kernel Oops after mounting the initrd (there is already a patch to fix this). If you want to just use IMA (not get involved in development) then a good option would be to use RHEL (here is their documentation) [2] or SUSE (here is their documentation) [3]. Note that both RHEL and SUSE use older kernels so their documentation WILL lead you astray if you try and use the latest kernel.org kernel. The Debian initrd I created a script named /etc/initramfs-tools/hooks/keys with the following contents to copy the key(s) from /etc/keys to the initrd where the kernel will load it/them. The kernel configuration determines whether x509_evm.der or x509_ima.der (or maybe both) is loaded. I haven't yet worked out which key is needed when.
#!/bin/bash
mkdir -p ${DESTDIR}/etc/keys
cp /etc/keys/* ${DESTDIR}/etc/keys
Making the Keys
#!/bin/sh
GENKEY=ima.genkey
cat << __EOF__ >$GENKEY
[ req ]
default_bits = 1024
distinguished_name = req_distinguished_name
prompt = no
string_mask = utf8only
x509_extensions = v3_usr
[ req_distinguished_name ]
O = `hostname`
CN = `whoami` signing key
emailAddress = `whoami`@`hostname`
[ v3_usr ]
basicConstraints=critical,CA:FALSE
#basicConstraints=CA:FALSE
keyUsage=digitalSignature
#keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectKeyIdentifier=hash
authorityKeyIdentifier=keyid
#authorityKeyIdentifier=keyid,issuer
__EOF__
openssl req -new -nodes -utf8 -sha1 -days 365 -batch -config $GENKEY \
                -out csr_ima.pem -keyout privkey_ima.pem
openssl x509 -req -in csr_ima.pem -days 365 -extfile $GENKEY -extensions v3_usr \
                -CA ~/kern/linux-5.11.14/certs/signing_key.pem -CAkey ~/kern/linux-5.11.14/certs/signing_key.pem -CAcreateserial \
                -outform DER -out x509_evm.der
To get the below result I used the above script to generate a key, it is the /usr/share/doc/ima-evm-utils/examples/ima-genkey.sh script from the ima-evm-utils package but changed to use the key generated from kernel compilation to sign it. You can copy the files in the certs directory from one kernel build tree to another to have the same certificate and use the same initrd configuration. After generating the key I copied x509_evm.der to /etc/keys on the target host and built the initrd before rebooting.
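Concretely the copy and rebuild step is something like the following (assuming the standard Debian update-initramfs):
cp x509_evm.der /etc/keys/
update-initramfs -u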
[    1.050321] integrity: Loading X.509 certificate: /etc/keys/x509_evm.der
[    1.092560] integrity: Loaded X.509 cert 'xev: etbe signing key: 99d4fa9051e2c178017180df5fcc6e5dbd8bb606'
Errors Here are some of the kernel error messages I received along with my best interpretation of what they mean.
[    1.062031] integrity: Loading X.509 certificate: /etc/keys/x509_ima.der
[    1.063689] integrity: Problem loading X.509 certificate -74
Error -74 means -EBADMSG, which means there's something wrong with the certificate file. I have got that from /etc/keys/x509_ima.der not being in DER format and I have got it from a DER file that contained a key pair that wasn't signed.
[    1.049170] integrity: Loading X.509 certificate: /etc/keys/x509_ima.der
[    1.093092] integrity: Problem loading X.509 certificate -126
Error -126 means -ENOKEY, so the key wasn't in the file or the key wasn't signed by the kernel signing key.
[    1.074759] integrity: Unable to open file: /etc/keys/x509_evm.der (-2)
Error -2 means -ENOENT, so the file wasn't found on the initrd. Note that it does NOT look at the root filesystem. References

14 April 2021

Russell Coker: Basics of Linux Kernel Debugging

Firstly a disclaimer, I'm not an expert on this and I'm not trying to instruct anyone who is aiming to become an expert. The aim of this blog post is to help someone who has a single kernel issue they want to debug as part of doing something that's mostly not kernel coding. I welcome comments about the second step to kernel debugging for the benefit of people who need more than this (which might include me next week). Also suggestions for people who can't use a kvm/qemu debugger would be good. Below is a command to run qemu with GDB. It should be run from the Linux kernel source directory. You can add other qemu options for a block device and virtual networking if necessary, but the bug I encountered gave an oops from the initrd so I didn't need to go further. The nokaslr is to avoid address space randomisation which deliberately makes debugging tasks harder (from a certain perspective debugging a kernel and compromising a kernel are fairly similar). Loading the bzImage is fine, gdb can map that to the different file it looks at later on.
qemu-system-x86_64 -kernel arch/x86/boot/bzImage -initrd ../initrd-$KERN_VER -curses -m 2000 -append "root=/dev/vda ro nokaslr" -gdb tcp::1200
The command to run GDB is "gdb vmlinux"; at the GDB prompt you can run the command "target remote localhost:1200" to connect to the GDB server on port 1200. Note that there is nothing special about port 1200, it was given in an example I saw and is as good as any other port. It is important that you run GDB against the vmlinux file in the main directory not any of the several stripped and packaged files, GDB can't handle a bzImage file but that's OK, it ends up much the same in RAM. When the target remote command is processed the kernel will be suspended by the debugger, if you are looking for a bug early in the boot you may need to be quick about this. Using qemu-system-x86_64 instead of kvm slows things down and can help in that regard. The bug I was hunting happened 1.6 seconds after kernel load with KVM and 7.8 seconds after kernel load with qemu. I am not aware of all the implications of the kvm vs qemu decision on debugging. If your bug is a race condition then trying both would be a good strategy. After the target remote command you can debug the kernel just like any other program. If you put a breakpoint on print_modules() that will catch the operation of printing an Oops which can be handy.
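Putting that together, a typical session looks like this (print_modules is the example breakpoint mentioned above):
gdb vmlinux
(gdb) target remote localhost:1200
(gdb) break print_modules
(gdb) continue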

12 April 2021

Russell Coker: Yama

I've just set up the Yama LSM module on some of my Linux systems. Yama controls ptrace which is the debugging and tracing API for Unix systems. The aim is to prevent a compromised process from using ptrace to compromise other processes and cause more damage. In most cases a process which can ptrace another process (which usually means having capability SYS_PTRACE (IE being root) or having the same UID as the target process) can interfere with that process in other ways such as modifying its configuration and data files. But even so I think it has the potential for making things more difficult for attackers without making the system more difficult to use. If you put kernel.yama.ptrace_scope = 1 in sysctl.conf (or write 1 to /proc/sys/kernel/yama/ptrace_scope) then a user process can only trace its child processes. This means that strace -p and gdb -p will fail when run as non-root but apart from that everything else will work. Generally strace -p (tracing the system calls of another process) is of most use to the sysadmin who can do it as root. The command gdb -p and variants of it are commonly used by developers so Yama wouldn't be a good thing on a system that is primarily used for software development. Another option is kernel.yama.ptrace_scope = 3 which means no-one can trace and it can't be disabled without a reboot. This could be a good option for production servers that have no need for software development. It wouldn't work well for a small server where the sysadmin needs to debug everything, but when dozens or hundreds of servers have their configuration rolled out via a provisioning tool this would be a good setting to include. See Documentation/admin-guide/LSM/Yama.rst in the kernel source for the details. When running with capability SYS_PTRACE (IE a root shell) you can ptrace anything else and if necessary disable Yama by writing 0 to /proc/sys/kernel/yama/ptrace_scope. I am enabling mode 1 on all my systems because I think it will make things harder for attackers while not making things more difficult for me. Also note that SE Linux restricts SYS_PTRACE and also restricts cross-domain ptrace access, so the combination with Yama makes things extra difficult for an attacker. Yama is enabled in the Debian kernels by default so it's very easy to set up for Debian users, just edit /etc/sysctl.d/whatever.conf and it will be enabled on boot.
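For example (the filename is arbitrary):
# /etc/sysctl.d/yama.conf: only allow a process to trace its children
kernel.yama.ptrace_scope = 1
Then apply it with sysctl --system and check the current value with cat /proc/sys/kernel/yama/ptrace_scope.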

Russell Coker: Riverdale

I've been watching the show Riverdale on Netflix recently. It's an interesting modern take on the Archie comics. Having watched Josie and the Pussycats in Outer Space when I was younger I was anticipating something aimed towards a similar audience. As solving mysteries and crimes was apparently a major theme of the show I anticipated something along similar lines to Scooby Doo, some suspense and some spooky things, but then a happy ending where criminals get arrested and no-one gets hurt or killed while the vast majority of people are nice. Instead the first episode has a teen being murdered and Ms Grundy being obsessed with 15yo boys and sleeping with Archie (who's supposed to be 15 but played by a 20yo actor). Everyone in the show has some dark secret. The filming has a dark theme, the sky is usually overcast and it's generally gloomy. This is a significant contrast to Veronica Mars which has some similarities in having a young cast, a sassy female sleuth, and some similar plot elements. Veronica Mars has a bright theme and a significant comedy element in spite of dealing with some dark issues (murder, rape, child sex abuse, and more). But Riverdale is just dark. Anyone who watches this with their kids expecting something like Scooby Doo is in for a big surprise. There are lots of interesting stylistic elements in the show. Lots of clothing and uniform designs that seem to date from the 1940s. It seems like some alternate universe where kids have smartphones and laptops while dressing in the style of the 1940s. One thing that annoyed me was construction workers using tools like sledge-hammers instead of excavators. A society that has smart phones but no earth-moving equipment isn't plausible. On the upside there is a racial mix in the show that more accurately reflects American society than the original Archie comics and homophobia is much less common than in most parts of our society. For both race issues and gay/lesbian issues the show treats them in an accurate way (portraying some bigotry) while the main characters aren't racist or homophobic. I think it's generally an OK show and recommend it to people who want a dark show. It's a good show to watch while doing something on a laptop so you can check Wikipedia for the references to 1940s stuff (like when Bikinis were invented). I'm half way through season 3 which isn't as good as the first 2, I don't know if it will get better later in the season or whether I should have stopped after season 2. I don't usually review fiction, but the interesting aesthetics of the show made it deserve a review.

Russell Coker: Storage Trends 2021

The Viability of Small Disks Less than a year ago I wrote a blog post about storage trends [1]. My main point in that post was that disks smaller than 2TB weren't viable then and 2TB disks wouldn't be economically viable in the near future. Now MSY has 2TB disks for $72 and 2TB SSDs for $245, saving $173 if you get a hard drive (compared to saving $240 10 months ago). Given the difference in performance and noise 2TB hard drives won't be worth using for most applications nowadays. NVMe vs SSD Last year NVMe prices were very comparable to SSD prices, I was hoping that trend would continue and SATA SSDs would go away. Now for sizes 1TB and smaller NVMe and SSD prices are very similar, but for 2TB the NVMe prices are twice those of SSDs, presumably partly due to poor demand for 2TB NVMe. There are also no NVMe devices larger than 2TB on sale at MSY (a store which caters to home stuff not special server equipment) but SSDs go up to 8TB. It seems that NVMe is only really suitable for workstation storage and for cache etc on a server. So SATA SSDs will be around for a while. Small Servers There are a range of low end servers which support a limited number of disks. Dell has 2 disk servers and 4 disk servers. If one of those had 8TB SSDs you could have 8TB of RAID-1 or 24TB of RAID-Z storage in a low end server. That covers the vast majority of servers (small business or workgroup servers tend to have less than 8TB of storage). Larger Servers Anandtech has an article on Seagate's roadmap to 120TB disks [2]. They currently sell 20TB disks using HAMR technology. Currently the biggest disks that MSY sells are 10TB for $395, which was also the biggest disk they were selling last year. Last year MSY only sold SSDs up to 2TB in size (larger ones were available from other companies at much higher prices), now they sell 8TB SSDs for $949 (a 4* capacity increase in less than a year). Seagate is planning 30TB disks for 2023, if SSDs continue to increase in capacity by 4* per year we could have 128TB SSDs in 2023. If you needed a server with 100TB of storage then having 2 or 3 SSDs in a RAID array would be much easier to manage and faster than 4*30TB disks in an array. When you have a server with many disks you can expect to have more disk failures due to vibration. One time I built a server with 18 disks and took disks from 2 smaller servers that had 4 and 5 disks. The 9 disks which had been working reliably for years started having problems within weeks of running in the bigger server. This is one of the many reasons for paying extra for SSD storage. Seagate is apparently planning 50TB disks for 2026 and 100TB disks for 2030. If that's the best they can do then SSD vendors should be able to sell larger products sooner at prices that are competitive. Matching hard drive prices is not required, getting to less than 4* the price should be enough for most customers. The Anandtech article is worth reading, it mentions some interesting features that Seagate are developing such as having 2 actuators (which they call Mach.2) so the drive can access 2 different tracks at the same time. That can double the performance of a disk, but that doesn't change things much when SSDs are more than 100* faster. Presumably the Mach.2 disks will be SAS and incredibly expensive while providing significantly less performance than affordable SATA SSDs. Computer Cases In my last post I speculated on the appearance of smaller cases designed to not have DVD drives or 3.5" hard drives.
Such cases still haven't appeared apart from special purpose machines like the NUC that were available last year. It would be nice if we could get a new industry standard for smaller power supplies. Currently power supplies are expected to be almost 5 inches wide (due to the expectation of a 5.25" DVD drive mounted horizontally). We need some industry standards for smaller PCs that aren't like the NUC, the NUC is very nice, but most people who build their own PC need more space than that. I still think that planning on USB DVD drives is the right way to go. I've got 4 PCs in my home that are regularly used and CDs and DVDs are used so rarely that sharing a single DVD drive among all 4 wouldn't be a problem. Conclusion I'm tempted to get a couple of 4TB SSDs for my home server which cost $487 each, it currently has 2*500G SSDs and 3*4TB disks. I would have to remove some unused files but that's probably not too hard to do as I have lots of old backups etc on there. Another possibility is to use 2*4TB SSDs for most stuff and 2*4TB disks for backups. I'm recommending that all my clients only use SSDs for their storage. I only have one client with enough storage that disks are the only option (100TB of storage) but they moved all the functions of that server to AWS and use S3 for the storage. Now I don't have any clients doing anything with storage that can't be done in a better way on SSD for a price difference that's easy for them to afford. Affordable SSD also makes RAID-1 in workstations more viable. 2 disks in a PC is noisy if you have an office full of them and produces enough waste heat to be a reliability issue (most people don't cool their offices adequately on weekends). 2 SSDs in a PC is no problem at all. As 500G SSDs are available for $73 it's not a significant cost to install 2 of them in every PC in the office (more cost for my time than hardware). I generally won't recommend that hard drives be replaced with SSDs in systems that are working well. But if a machine runs out of space then replacing it with SSDs in a RAID-1 is a good choice. Moore's law might cover SSDs, but it definitely doesn't cover hard drives. Hard drives have fallen way behind developments of most other parts of computers over the last 30 years, hopefully they will go away soon.

1 April 2021

Russell Coker: Censoring Images

A client asked me to develop a system for censoring images from an automatic camera. The situation is that we have a camera taking regular photos from a fixed location which includes part of someone else's property. So my client made a JPEG with some black rectangles in the sections that need to be covered.

The first thing I needed to do was convert the JPEG to a PNG with transparency for the sections that aren't to be covered. To convert it I loaded the JPEG in the GIMP and went to the Layer->Transparency->Add Alpha Channel menu to enable the Alpha channel. Then I selected the Bucket Fill tool and used Mode Erase and Fill by Composite, and then clicked on the background (the part of the JPEG that was white) to make it transparent. Then I exported it to PNG. If anyone knows of an easy way to convert the file then please let me know. It would be nice if there was a command-line program I could run to convert a specified color (default white) to transparent. I say this because I can imagine my client going through a dozen iterations of an overlay file that doesn't quite fit.

To censor the image I ran the composite command from ImageMagick. The command I used was composite -gravity center overlay.png in.jpg out.jpg . If anyone knows a better way of doing this then please let me know. The platform I'm using is an ARM926EJ-S rev 5 (v5l) which takes 8 minutes of CPU time to convert a single JPEG at full DSLR resolution (4 megapixel). It also required enabling swap on a SD card to avoid running out of RAM and running systemctl disable tmp.mount to stop using tmpfs for /tmp, as the system only has 256M of RAM.
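For the colour-to-transparent conversion, ImageMagick itself may be enough. The following is an untested sketch (assuming the client's black-rectangle JPEG is named mask.jpg, a hypothetical filename) using the -transparent option, with -fuzz to tolerate JPEG compression artifacts:
convert mask.jpg -fuzz 5% -transparent white overlay.png
The -fuzz percentage controls how far from pure white a pixel can be and still become transparent, which matters because JPEG compression rarely leaves large areas as exactly one colour.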

27 February 2021

Russell Coker: Links February 2021

Elasticsearch gets a new license to deal with AWS not paying them [1]. Of course AWS will fork the products in question. We need some anti-trust action against Amazon.

Big Think has an interesting article about what appears to be ritualistic behaviour in chimpanzees [2]. The next issue is that if they are developing a stone-age culture, does that mean we should treat them differently from other less developed animals?

Last Week in AWS has an informative article about Parler's new serverless architecture [3]. They explain why it's not easy to move away from a cloud platform even for a service that's designed to not be dependent on it. The moral of the story is that running a service so horrible that none of the major cloud providers will touch it doesn't scale.

Patheos has an insightful article about people who spread the most easily disproved lies for their religion [4]. A lot of political commentary nowadays is like that.

Indi Samarajiva wrote an insightful article comparing terrorism in Sri Lanka with the right-wing terrorism in the US [5]. The conclusion is that it's only just starting in the US.

Bellingcat has an interesting article about the FSB attempt to murder Russian presidential candidate Alexey Navalny [6].

Russ Allbery wrote an interesting review of Anti-Social, a book about the work of an anti-social behavior officer in the UK [7]. The book (and Russ's review) has some good insights into how crime can be reduced. Of course a large part of that is allowing people who want to use drugs to do so in an affordable way.

Informative post from Electrical Engineering Materials about the difference between KVA and KW [8]. KVA is bigger than KW, sometimes a lot bigger.

Ars Technica has an interesting but not surprising article about a supply chain attack on software development [9], exploiting the way npm and similar tools resolve dependencies to make them download hostile code. There is no possibility of automatic downloads being OK for security unless they are from known good sites that don't allow random people to upload. Any sort of system that allows automatic downloads from sites like the Node or Python repositories, GitHub, etc is ripe for abuse. I think the correct solution is to have dependencies installed manually or automatically from a distribution like Debian, Ubuntu, Fedora, etc where there have been checks on the source of the source.

Devon Price wrote an insightful Medium article Laziness Does Not Exist about the psychological factors which can lead to poor results that many people interpret as laziness [10]. Everyone who supervises other people's work should read this.

21 January 2021

Russell Coker: Links January 2021

Krebs on Security has an informative article about web notifications and how they are being used for spamming and promoting malware [1]. He also includes links for how to permanently disable them. If nothing else, clicking no on each new site that wants to send notifications is annoying.

Michael Stapelberg wrote an insightful post about inefficiencies in the Debian development processes [2]. While I agree with most of his assessment of Debian issues I am not going to decrease my involvement in Debian. Of the issues he mentions, the 2 that seem to have the best effort to reward ratio are improvements to mailing list archives (to ideally make it practical to post to lists without subscribing and read responses in the archives) and the issue of forgetting all the complexities of the development process, which can be alleviated by better Wiki pages. In my Debian work I've contributed more to the Wiki in recent times, but not nearly as much as I should.

Jacobin has an insightful article Ending Poverty in the United States Would Actually Be Pretty Easy [3].

Mark Brown wrote an interesting blog post about the Rust programming language [4]. He links to a couple of longer blog posts about it. Rust has some great features and I've been meaning to learn it.

Scientific American has an informative article about research on the spread of fake news and memes [5]. Something to consider when using social media.

Bruce Schneier wrote an insightful blog post on whether there should be limits on persuasive technology [6].

Jonathan Dowland wrote an interesting blog post about git rebasing and lab books [7]. I think it's an interesting thought experiment to compare the process of developing code worthy of being committed to a master branch of a VCS to the process of developing a Ph.D thesis.

CBS has a disturbing article about the effect of Covid19 on people's lungs [8]. Apparently it usually does more lung damage than long-term smoking, and even 70%+ of people who don't have symptoms of the disease get significant lung damage. People who live in heavily affected countries like the US now have to worry that they might have had the disease and got lung damage without knowing it.

Russ Allbery wrote an interesting review of the book Because Internet about modern linguistics [9]. The topic is interesting and I might read that book at some future time (I have many good books I want to read).

Jonathan Carter wrote an interesting blog post about CentOS Streams and why using a totally free OS like Debian is going to be a better option for most users [10].

Linus has slammed Intel for using ECC support as a way of segmenting the market between server and desktop to maximise profits [11]. It would be nice if a company made a line of Ryzen systems with ECC RAM support, but most manufacturers seem to be in on the market segmentation scam.

Russ Allbery wrote an interesting review of the book Can't Even about millennials as the burnout generation and the blame that the corporate culture deserves for this [12].

12 January 2021

Russell Coker: PSI and Cgroup2

In the comments on my post about Load Average Monitoring [1] an anonymous person recommended that I investigate PSI. As an aside, why do I get so many great comments anonymously? Don't people want to get credit for having good ideas and learning about new technology before others?

PSI is the Pressure Stall Information subsystem for Linux that is included in kernels 4.20 and above; if you want to use it in Debian then you need a kernel from Testing or Unstable (Buster has kernel 4.19). The place to start reading about PSI is the main Facebook page about it; it was originally developed at Facebook [2].

I am a little confused by the actual numbers I get out of PSI. While for the load average I can often see where the numbers come from (e.g. with 2 processes each taking 100% of a core the load average will be about 2), it's difficult to work out where the PSI numbers come from. For my own use I decided to treat them as unscaled numbers that just indicate problems (a higher number is worse) and not worry too much about what the numbers really mean.

With the cgroup2 interface, which is supported by the version of systemd in Testing (and which has been included in Debian backports for Buster), you get PSI files for each cgroup. I've just uploaded version 1.3.5-2 of etbemon (package mon) to Debian/Unstable which displays the cgroups with PSI numbers greater than 0.5% when the load average test fails.
System CPU Pressure: avg10=0.87 avg60=0.99 avg300=1.00 total=20556310510
/system.slice avg10=0.86 avg60=0.92 avg300=0.97 total=18238772699
/system.slice/system-tor.slice avg10=0.85 avg60=0.69 avg300=0.60 total=11996599996
/system.slice/system-tor.slice/tor@default.service avg10=0.83 avg60=0.69 avg300=0.59 total=5358485146
System IO Pressure: avg10=18.30 avg60=35.85 avg300=42.85 total=310383148314
 full avg10=13.95 avg60=27.72 avg300=33.60 total=216001337513
/system.slice avg10=2.78 avg60=3.86 avg300=5.74 total=51574347007
/system.slice full avg10=1.87 avg60=2.87 avg300=4.36 total=35513103577
/system.slice/mariadb.service avg10=1.33 avg60=3.07 avg300=3.68 total=2559016514
/system.slice/mariadb.service full avg10=1.29 avg60=3.01 avg300=3.61 total=2508485595
/system.slice/matrix-synapse.service avg10=2.74 avg60=3.92 avg300=4.95 total=20466738903
/system.slice/matrix-synapse.service full avg10=2.74 avg60=3.92 avg300=4.95 total=20435187166
Above is an extract from the output of the loadaverage check. It shows that Tor is a major user of CPU time (the VM runs a Tor relay node and has close to 100% of one core devoted to that task). It also shows that MariaDB and Matrix are the main users of disk IO. When I installed Matrix the Debian package told me that using SQLite would give lower performance than MySQL, but that didn't seem like a big deal as the server only has a few users. Maybe I should move Matrix to the MariaDB instance to improve overall system performance.

So far I have not written any code to display the memory PSI files. I don't have a lack of RAM on the systems I run at the moment and don't have a good test case for this. I welcome patches from people who have the ability to test this and get some benefit from it. We are probably about 6 months away from a new release of Debian and this is probably the last thing I need to do to make etbemon ready for that.
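For reference, the PSI numbers in the extract above come from files that can be read directly: the system-wide figures from /proc/pressure/ and the per-cgroup figures from the cpu.pressure, io.pressure, and memory.pressure files in the cgroup2 hierarchy. The paths below assume the unified hierarchy mounted at /sys/fs/cgroup on a standard systemd setup:
cat /proc/pressure/io
cat /sys/fs/cgroup/system.slice/mariadb.service/io.pressure
The avg10, avg60, and avg300 fields are the percentage of time over the last 10, 60, and 300 seconds that tasks were stalled on the resource, and total is the cumulative stall time in microseconds.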

Russell Coker: RISC-V and Qemu

RISC-V is the latest RISC architecture to become popular. It is the 5th RISC architecture from the University of California Berkeley. It seems to be a competitor to ARM due to not having license fees or restrictions on alterations to the architecture (something you have to pay extra for when using ARM). RISC-V seems to be the most popular architecture to implement in FPGA.

When I first tried to run RISC-V under QEMU it didn't work, which was probably due to running Debian/Unstable on my QEMU/KVM system and there being QEMU bugs in Unstable at the time. I have just tried it again and got it working. The Debian Wiki page about RISC-V is pretty good [1]; the instructions there got it going for me. One thing I wasted some time on before reading that page was trying to get a netinst CD image, which is what I usually do for setting up a VM. Apparently there isn't RISC-V hardware that boots from a CD/DVD, so there isn't a Debian netinst CD image. But debootstrap can install directly from the Debian web server (something I've never wanted to do in the past) and that gave me a successful installation. Here are the commands I used to set up the base image:
apt-get install debootstrap qemu-user-static binfmt-support debian-ports-archive-keyring
debootstrap --arch=riscv64 --keyring /usr/share/keyrings/debian-ports-archive-keyring.gpg --include=debian-ports-archive-keyring unstable /mnt/tmp http://deb.debian.org/debian-ports
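One sanity check worth doing before chrooting (a suggestion, not part of the original setup): confirm that qemu-user-static has registered a binfmt handler for RISC-V binaries, which is what lets the host kernel transparently run riscv64 programs inside the chroot:
update-binfmts --display qemu-riscv64
If that prints an enabled entry pointing at the qemu-riscv64-static binary then the chroot commands below should work.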
I first tried running RISC-V QEMU on Buster, but even ls didn't work properly and the installation failed.
chroot /mnt/tmp bin/bash
# ls -ld .
/usr/bin/ls: cannot access '.': Function not implemented
When I ran it on Unstable, ls worked but strace didn't work in a chroot; this gave enough functionality to complete the installation.
chroot /mnt/tmp bin/bash
# strace ls -l
/usr/bin/strace: test_ptrace_get_syscall_info: PTRACE_TRACEME: Function not implemented
/usr/bin/strace: ptrace(PTRACE_TRACEME, ...): Function not implemented
/usr/bin/strace: PTRACE_SETOPTIONS: Function not implemented
/usr/bin/strace: detach: waitpid(1602629): No child processes
/usr/bin/strace: Process 1602629 detached
When running the VM the operation was noticeably slower than the emulation of PPC64 and S/390x, which both ran at an apparently normal speed. Compared to a server with an equivalent speed CPU, an ssh login was obviously slower due to the CPU time taken for encryption; an ssh connection from a system on the same LAN took 6 seconds to complete. I presume that because RISC-V is a newer architecture there hasn't been as much effort put into optimising the QEMU emulation, and that a future version of QEMU will be faster. But I don't think that Debian/Bullseye will give good QEMU performance for RISC-V; probably more changes are needed than can happen before the freeze. Maybe a version of QEMU with better RISC-V performance can be uploaded to backports some time after Bullseye is released. Here's the QEMU command I use to run RISC-V emulation:
qemu-system-riscv64 -machine virt -device virtio-blk-device,drive=hd0 -drive file=/vmstore/riscv,format=raw,id=hd0 -device virtio-blk-device,drive=hd1 -drive file=/vmswap/riscv,format=raw,id=hd1 -m 1024 -kernel /boot/riscv/vmlinux-5.10.0-1-riscv64 -initrd /boot/riscv/initrd.img-5.10.0-1-riscv64 -nographic -append "net.ifnames=0 noresume security=selinux root=/dev/vda ro" -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-device,rng=rng0 -device virtio-net-device,netdev=net0,mac=02:02:00:00:01:03 -netdev tap,id=net0,helper=/usr/lib/qemu/qemu-bridge-helper
Currently the program /usr/sbin/sefcontext_compile from the selinux-utils package needs execmem access on RISC-V while it doesn't on any other architecture I have tested (see the note at the end of this post). I don't know why, and support for debugging such things seems to be in the early stages of development; for example the execstack program doesn't work on RISC-V now. RISC-V emulation in Unstable seems adequate for people who are serious about RISC-V development. But if you want to just try a different architecture then PPC64 and S/390 will work better.
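One possible starting point for investigating the execmem issue (assuming auditd is installed to log AVC messages, which the post doesn't state): the denials for that program can be listed with ausearch:
ausearch -m avc -c sefcontext_compile
That at least shows which operation triggers the denial, even while the deeper debugging tools like execstack aren't working on RISC-V.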

7 January 2021

Russell Coker: Monopoly the Game

The Smithsonian Mag has an informative article about the history of the game Monopoly [1]. The main point about Monopoly teaching about the problems of inequality is one I was already aware of, but there are some aspects of the history that I learned from the article. Here's an article about using a modified version of Monopoly to teach Sociology [2]. Maria Paino and Jeffrey Chin wrote an interesting paper about using Monopoly with revised rules to teach Sociology [3]. They publish the rules, which are interesting and seem good for a class. I think it would be good to have some new games which can teach about class differences. Maybe have an Escape From Poverty game where you have choices that include drug dealing to try and improve your situation, or a cooperative game where people try to create a small business. While Monopoly can be instructive, it's based on the economic circumstances of the past; the vast majority of rich people aren't rich from land ownership.

5 January 2021

Russell Coker: Planet Linux Australia

Linux Australia have decided to cease running the Planet installation on planet.linux.org.au. I believe that blogging is still useful and a web page with a feed of Australian Linux blogs is a useful service, so I have started running a new Planet Linux Australia on https://planet.luv.asn.au/. There has been discussion about getting some sort of redirection from the old Linux Australia page, but they don't seem able to do that. If you have a blog that has a reasonable portion of Linux and FOSS content and is based in or connected to Australia then email me on russell at coker.com.au to get it added.

When I started running this I took the old list of feeds from planet.linux.org.au, deleted all blogs that hadn't had posts for 5 years, and deleted all blogs that were broken and had no recent posts. I emailed people who had recently broken blogs so they could fix them. It seems that many people who run personal blogs aren't bothered by a bit of downtime.

As an aside, I would be happy to set up the monitoring system I use to monitor any personal web site of a Linux person and notify them by Jabber or email of an outage. I could set it to not alert for a specified period (10 mins, 1 hour, whatever you like) so it doesn't alert needlessly during routine sysadmin work, and I could have it check SSL certificate validity as well as the basic page header.
