It won't be long before I grow grey hair if udev keeps pulling tricks on
me. Yesterday, an upgrade to 0.087-1 hosed one of my systems. The system is
sarge-based, but it is connected to a cable modem needing the cdc_ether
driver, which 2.6.8 does not have. Since I don't expect backports of 2.6.15 to
sarge (Update: backports.org has a 2.6.15 backport, but it seems it won't
solve my problem), but also don't want to migrate my system to
testing, I simply decided to pin linux-image-2.6.15-1-686 and all its
dependencies to unstable with APT:
Package: linux-image-2.6.15-1-686
Pin: release a=unstable
Pin-Priority: 600

Package: initramfs-tools
Pin: release a=unstable
Pin-Priority: 600

Package: udev
[...]
The install worked, and off I went to reboot... but the machine did not come
back up, and booting 2.6.8 with the new udev also failed. Great. At first,
I thought I knew the problem, but on closer inspection, it was something
else: udev's ide.agent hung and timed out. It turns out it
was looking for hd141 instead of hda, and once I found that out, it
didn't take long to put two and two together: 141 is octal for 97, which is
ASCII 'a'. And if you echo hd\141 just like that, the shell will swallow the
backslash. Marco, the udev maintainer, blamed a broken shell, and I identified
busybox-cvs-static to be the problem; replacing it with busybox from
unstable fixed the issue.
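The arithmetic behind hd141 is easy to check. The following is only a sketch in Python, not anything taken from ide.agent or busybox; the helper name decode_octal_escapes is made up for illustration. It shows what a correctly interpreted \141 yields, and what is left when the backslash is simply dropped:

```python
import re

def decode_octal_escapes(text: str) -> str:
    """Replace each backslash followed by 1-3 octal digits with the character it encodes."""
    # Hypothetical helper for illustration; it mimics the intended escape handling.
    return re.sub(r"\\([0-7]{1,3})", lambda m: chr(int(m.group(1), 8)), text)

raw = r"hd\141"

print(int("141", 8))              # octal 141 as a decimal number: 97
print(chr(97))                    # ASCII 97: a
print(decode_octal_escapes(raw))  # escape interpreted: hda
print(raw.replace("\\", ""))      # backslash swallowed: hd141
```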
Now all that remains is to convince Marco that the bug has nothing to do with
initramfs-tools when it occurs in a script provided by udev.
initramfs-tools depends on busybox-cvs-static | busybox since it
works with either. If udev doesn't work with busybox-cvs-static, it
has to conflict with it, which is not really an option, though, because of a
libc6 upgrade loop. Fortunately, the 2.6.16 kernel will make
ide.agent obsolete, so the problem shall vanish in a puff of smoke.
With one problem solved, I woke up this morning to find another. I use udev's
network interface renaming feature to ensure that my interfaces always have
the names I expect, and that their names give me a hint as to what they're
connected to. Sure, using /etc/modules to ensure a defined load order
would work fine, but I have too many machines under my control to want to
remember that eth2 on this machine is the wireless LAN.
So I use the following udev rules:
wall:~# cat /etc/udev/rules.d/local-interfaces.rules
KERNEL="eth*", SYSFS{address}="00:02:8a:80:21:31", NAME="internet"
KERNEL="eth*", SYSFS{address}="08:00:46:b1:2d:ee", NAME="lan"
KERNEL="eth*", SYSFS{address}="00:50:04:5b:ec:b3", NAME="wlan"
KERNEL="eth*", SYSFS{address}="00:04:23:72:4e:6c", NAME="wifibackup"
Update: Bas Zoetekouw suggested matching against something other than MAC
addresses, for testing. Thus, I tried PCI IDs (using the topmost
SYSFS{device} entry in the udevinfo output):
KERNEL="eth*", SYSFS{device}="0x24c4", NAME="internet"
KERNEL="eth*", SYSFS{device}="0x103d", NAME="lan"
KERNEL="eth*", SYSFS{device}="0x5157", NAME="wlan"
KERNEL="eth*", SYSFS{device}="0x1043", NAME="wifibackup"
The problem remains exactly the same. It also remains the same if I completely
remove wlan and wifibackup.
When I woke up this morning, I found the following mess:
wall:~# ip addr
2: internet: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 08:00:46:b1:2d:ee brd ff:ff:ff:ff:ff:ff
3: eth0_ifrename: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
    link/ether 00:02:8a:80:21:31 brd ff:ff:ff:ff:ff:ff
4: wifibackup: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
    link/ether 00:04:23:72:4e:6c brd ff:ff:ff:ff:ff:ff
5: wlan: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:50:04:5b:ec:b3 brd ff:ff:ff:ff:ff:ff
    inet 192.168.14.129/25 brd 192.168.14.255 scope global wlan
    inet6 fe80::250:4ff:fe5b:ecb3/64 scope link
       valid_lft forever preferred_lft forever
The wlan and wifibackup interfaces are configured correctly (I use
wifibackup to hook into the various open WLANs around, when my provider
goes down, or I need more bandwidth). But internet was assigned to the
LAN interface, and eth0_ifrename, well... that's just whacked.
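Spotting this kind of mix-up by eye gets tedious across many machines. Here is a rough Python sketch, assuming a Linux sysfs where each interface exposes its MAC address in /sys/class/net/<name>/address. The WANTED table is copied from the rules earlier in this post; the function name check_names is invented for this example.

```python
from pathlib import Path

# Desired interface names keyed by MAC address (from the udev rules in this post).
WANTED = {
    "00:02:8a:80:21:31": "internet",
    "08:00:46:b1:2d:ee": "lan",
    "00:50:04:5b:ec:b3": "wlan",
    "00:04:23:72:4e:6c": "wifibackup",
}

def check_names(sysfs_net="/sys/class/net", wanted=WANTED):
    """Return (current_name, mac, expected_name) for each interface with the wrong name."""
    mismatches = []
    for iface in sorted(Path(sysfs_net).iterdir()):
        addr_file = iface / "address"
        if not addr_file.is_file():
            continue  # not a network interface entry
        mac = addr_file.read_text().strip().lower()
        expected = wanted.get(mac)
        if expected is not None and expected != iface.name:
            mismatches.append((iface.name, mac, expected))
    return mismatches
```

Calling check_names() on the machine above should report two entries: eth0_ifrename, which ought to be internet, and internet, which ought to be lan.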
Looking at the udev code, this seems to be due to a patch Marco pulled
from Ubuntu, which is meant to guard against race conditions in the renaming.
For instance, if eth0 needs to become eth1 and vice versa, udev
renames the first to eth0_ifrename and waits until the other has finished
its identity change. The patch, however, is a hack: it tries endlessly to
rename the interface to its final target name, which, in my case, obviously
goes on forever.
10:18 * Md just copied it from the Ubuntu package
10:18 < madduck> why???
10:19 < Md> because it worked in my artificial setup
Unfortunately, this isn't the first time that Ubuntu's "giving back to Debian"
(which requires Debian to go out and fetch) is two steps back rather than one
forward. I would hope that maintainers of critical packages (such as
udev) would exercise more care when pulling from Ubuntu. And that Ubuntu
would please stop adding hacks to packages and instead concentrate on fixing
issues at the root, the Debian way.
So my problem still persists, and even allowing for the problems with Ubuntu's
ifrename patch, I can't figure out what is actually going on. It does not help
that udev also suddenly stopped logging interface name changes. Yes, just like
that.
KERNEL="eth*", SYSFS{address}="00:02:8a:80:21:31", NAME="internet"
KERNEL="eth*", SYSFS{address}="08:00:46:b1:2d:ee", NAME="lan"
How can these two rules actually trigger the rename conflict? The only way
I could imagine is that udev gets confused and falsely renames
08:00:46:b1:2d:ee to internet. Then, when it gets to the other card,
a name collision occurs, udev chooses eth0_ifrename as a temporary
workaround, and then tries forever to rename eth0_ifrename to
internet, which will never succeed.
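To make that guess concrete, here is a toy model in Python. It is not udev's code: the function names are invented, the starting state is my hypothesis rather than something I observed, and the retry loop is bounded (max_tries) so that the example terminates, whereas the patch retries endlessly.

```python
class NameTaken(Exception):
    """Raised when the target interface name already exists."""

def rename(table, old, new):
    """Rename an interface in table (name -> MAC); refuse if the new name is in use."""
    if new in table:
        raise NameTaken(new)
    table[new] = table.pop(old)

def rename_with_parking(table, old, new, max_tries=5):
    """Try a direct rename; on collision, park under '<old>_ifrename' and retry."""
    try:
        rename(table, old, new)
        return new
    except NameTaken:
        parked = f"{old}_ifrename"
        rename(table, old, parked)
    for _ in range(max_tries):  # bounded here; the real patch never gives up
        try:
            rename(table, parked, new)
            return new
        except NameTaken:
            continue  # nobody ever frees the target name in this scenario
    return parked

# Hypothetical state: the LAN card has already been wrongly named "internet".
table = {"internet": "08:00:46:b1:2d:ee", "eth0": "00:02:8a:80:21:31"}
final = rename_with_parking(table, "eth0", "internet")
print(final)  # eth0_ifrename
print(table)  # {'internet': '08:00:46:b1:2d:ee', 'eth0_ifrename': '00:02:8a:80:21:31'}
```

The end state matches the ip addr output above, which is at least consistent with the theory, though it says nothing about why the first rename went wrong.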
So why does udev get confused in the first place? Why would it ever name
the interface 08:00:46:b1:2d:ee internet? Beats me. But I'd better end
here because the world surely doesn't need just another udev rant.
Update: I forgot to mention that the renaming works just fine when
I unload/load modules from the command line. It's only during the boot process
that things go wild.
Update 2: I should note that even then it fails some of the time if
the modules are loaded in quick succession.