I finally decided to publish the few hacks I am using to run Linux on SUN's LDOM.
They are not pretty but they get the job done for me.
I take 0 responsibility for them so use them at your own risk.
Use this HOWTO only if you are willing to mess with your system. Do *NOT* use it on production systems if you do not understand what you are doing.
Step 1: read all the possible LDOM documentation that SUN made available for you.
Step 2: understand that this is all experimental software.
Step 3: repeat 1 and 2 a few times.
The LDOM server in Solaris can export three kinds of block device (file, partition/slice, entire disk) to the guest, where they are seen as virtual disks.
The guest cannot really tell the difference, but it is important to understand how the LDOM software treats each of them.
I did *not* test the export for partition/slice and my hacks do not know how to cope with it. You have been warned.
The problem: LDOM virtual disk daemon mangles partition tables
In an attempt to make sure that the partition tables exported to the guests are sane, the daemon ends up trashing valid ones, because the check it applies to the devices is too strict and it behaves differently depending on whether the device is a file, a partition/slice or an entire disk.
The ldm bind operation is the dangerous one: it can kill your partition table for good.
How the ldm bind operation works:
- for entire disks:
* check device partition table
* fix device partition table if check fails
* lock device
* export device to guest
* store status of device once it is validated and do not check it ever again
- for files:
* check device partition table
* fix device partition table if check fails
* export device to guest
Remember that I did *not* test export of a slice and that my hack does not take it into account. You have been warned twice.
Limitations of the workaround
First of all, I assume that you installed LDOM 1.0 in the standard path (/opt/SUNWldm/); otherwise you might need to change the script manually to match an alternate path.
Each guest uses a single kind of device. The workaround does not yet know how to cope with exports of files and partitions to the same guest.
You know how to use dd and some other basic tools.
If you start playing with bind/unbind operations manually and things break, you get to keep all the pieces.
How to
This list of operations sounds extremely complex, but it is not, and it is required only at first install time and possibly again to update your partition table backups if you decide to change the partition tables. The changes to the LDOM init script will take care of restoring the last known-good partition table on each reboot.
NOTE: if you fail to follow these instructions and do not use valid configuration entries or partition table backups, LDOM might not start anymore or even core dump. TAKE EXTREME CARE (and learn how to reboot into factory-defaults at SC/ALOM).
1 - Get the few files from here. This HTML file is there too for your convenience.
2 - ldmd_start is the newly modified init script for LDOM. It should be placed in /opt/SUNWldm/bin and made executable. Note that if there is no configuration file, or the configuration file is empty, nothing will happen and the script behaves exactly like the original. ldmd_start.orig and .diff are there for your convenience, so you can check that I did not add a rootkit ;).
3 - ldom.linux is an example configuration file that you want to place in /etc.
4 - Run mkdir /.ldom.linux. We will use this directory to store the backup partition tables.
5 - decide what kind of device you want to export.
Follow steps 6 and 7 if you decided to use a full disk as device and skip to 8 if you are using a file
6 - Use the format command from Solaris and make sure to label and format the disk.
7 - Take a copy of the clean partition table.
Example: dd if=/path/to/device of=/.ldom.linux/$guestname-$(basename /path/to/device).format count=1 bs=512
So let's say that your guest is called foo1 and you are exporting /dev/rdsk/c1t1d0s2; you will issue the following command:
dd if=/dev/rdsk/c1t1d0s2 of=/.ldom.linux/foo1-c1t1d0s2.format count=1 bs=512
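The naming scheme used above (guest name, then the basename of the device, then a suffix) is easy to get wrong by hand, so here is a minimal sketch that wraps it. The function names and the LDOM_BACKUP_DIR variable are hypothetical and are not part of the published files; the only assumption about the layout is the /.ldom.linux directory from step 4.

```shell
# Sketch only: the function names and the LDOM_BACKUP_DIR variable are hypothetical.
LDOM_BACKUP_DIR=${LDOM_BACKUP_DIR:-/.ldom.linux}

# Print the backup file name for a guest/device pair.
# $1 = guest name, $2 = device path, $3 = suffix (format or backup)
label_path() {
    printf '%s/%s-%s.%s\n' "$LDOM_BACKUP_DIR" "$1" "$(basename "$2")" "$3"
}

# Copy the first 512-byte sector of the device into the backup directory.
save_label() {
    dd if="$2" of="$(label_path "$1" "$2" "$3")" bs=512 count=1
}

# Succeed only if the saved copy is identical to the device's first sector.
label_matches() {
    dd if="$2" bs=512 count=1 2>/dev/null | cmp -s - "$(label_path "$1" "$2" "$3")"
}
```

With these defined, save_label foo1 /dev/rdsk/c1t1d0s2 format followed by label_matches foo1 /dev/rdsk/c1t1d0s2 format reproduces the command above and confirms the copy. Because cmp also fails when the saved file is longer than one sector, the check catches a dd that was run without count=1.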
Continue from here if you are exporting a file
8 - Configure your LDOM guest following the LDOM administration guide.
9 - Netboot and install Linux in the guest.
10 - Before rebooting into the installed system, halt the guest and unbind it. The unbind is required to unlock the exported device so that it can be read.
11 - Take a copy of the partition table that has been created by Linux.
Example: dd if=/path/to/device of=/.ldom.linux/$guestname-$(basename /path/to/device).backup count=1 bs=512
Following the above example:
dd if=/dev/rdsk/c1t1d0s2 of=/.ldom.linux/foo1-c1t1d0s2.backup count=1 bs=512
or
dd if=/path/to/file of=/.ldom.linux/foo1-file.backup count=1 bs=512
12 - Edit the configuration file in /etc that you installed at step 3.
Each line contains:
guest_name disk1 disk2 ...
NOTE: you do not need to add entries for Solaris guests. Only Linux guests need entries! Empty lines are skipped; no comments are allowed in the file.
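To make the format concrete, here is a rough sketch of how a file in this "guest_name disk1 disk2 ..." layout could be read. It is not taken from ldmd_start; the function name is hypothetical, and the sample entries in the comment only reuse the foo1 example from above plus a made-up second guest.

```shell
# Hypothetical reader for the "guest_name disk1 disk2 ..." format.
# Example file content (foo2 and its path are made up):
#   foo1 /dev/rdsk/c1t1d0s2
#   foo2 /export/foo2.img
# Prints one "guest device" pair per line; empty lines are skipped.
list_guest_disks() {
    while read -r guest disks; do
        [ -n "$guest" ] || continue
        for disk in $disks; do
            printf '%s %s\n' "$guest" "$disk"
        done
    done < "$1"
}
```

Running list_guest_disks /etc/ldom.linux before a reboot is a cheap way to see which guest/device pairs the file actually describes. Since comments are not allowed, a line starting with # would be read as a guest named "#".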
Follow steps 13 to 19 if you are exporting a disk
13 - Restore Solaris partition table:
Example: dd conv=notrunc if=/.ldom.linux/$guest-$(basename /path/to/device).format of=/path/to/device bs=512 count=1
14 - Bind the guest
15 - Unbind the guest again. HINT: this is the most important step, because the LDOM disk server has now validated the partition table and will not check it again.
16 - Restore Linux partition table:
Example: dd conv=notrunc if=/.ldom.linux/$guest-$(basename /path/to/device).backup of=/path/to/device bs=512 count=1
17 - Bind the guest again.
18 - Start the guest.
19 - Enjoy.
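Steps 13 to 18 can be strung together roughly as follows. This is a sketch for one guest with one whole-disk device, not the code from ldmd_start: the function names are hypothetical, LDM is assumed to point at the ldm binary under the standard install path, and the short bind/unbind/start subcommand names are assumed to be accepted by your ldm version.

```shell
# Sketch of steps 13 to 18 for one guest with one whole-disk device.
# LDM is assumed to live under the standard install path; names are hypothetical.
LDM=${LDM:-/opt/SUNWldm/bin/ldm}

# Write a saved 512-byte sector back over the start of the device.
restore_label() {
    dd conv=notrunc if="$1" of="$2" bs=512 count=1
}

# $1 = guest, $2 = device, $3 = directory holding the .format/.backup copies
rebind_disk_guest() {
    base="$3/$1-$(basename "$2")"
    restore_label "$base.format" "$2" &&   # step 13: Solaris label
    "$LDM" bind "$1" &&                    # step 14: daemon validates the label
    "$LDM" unbind "$1" &&                  # step 15: validation is now remembered
    restore_label "$base.backup" "$2" &&   # step 16: Linux partition table
    "$LDM" bind "$1" &&                    # step 17
    "$LDM" start "$1"                      # step 18
}
```

For the example guest this would be called as rebind_disk_guest foo1 /dev/rdsk/c1t1d0s2 /.ldom.linux. The && chain stops at the first failure, so a failed unbind never leads to a write on a device that is still locked.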
Follow steps 20 to 23 if you are using a file
20 - Bind the guest
21 - Restore Linux partition table:
Example: dd conv=notrunc if=/.ldom.linux/$guest-$(basename /path/to/file).backup of=/path/to/file bs=512 count=1
22 - Start the guest.
23 - Enjoy.
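The file case is shorter because a file is checked on every bind, so the Linux partition table is simply written back after the bind. A minimal sketch of steps 20 to 22, with the same caveats as before: the function name is hypothetical, and LDM is assumed to point at the ldm binary under the standard install path.

```shell
# Sketch of steps 20 to 22 for one guest backed by a single file.
# LDM is assumed to live under the standard install path; the name is hypothetical.
LDM=${LDM:-/opt/SUNWldm/bin/ldm}

# $1 = guest, $2 = backing file, $3 = directory holding the .backup copy
rebind_file_guest() {
    "$LDM" bind "$1" &&                                    # step 20
    dd conv=notrunc if="$3/$1-$(basename "$2").backup" \
       of="$2" bs=512 count=1 &&                           # step 21
    "$LDM" start "$1"                                      # step 22
}
```

The conv=notrunc option matters most here: without it dd truncates the output file, which would cut the whole disk image down to a single 512-byte sector.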
That said, the script will allow you to reboot the controller without having to worry about Linux guests, and it should ensure that LDOM always starts properly. LDOM stores the status of each guest in the Machine Description, and the script uses this status to decide whether to restore the partition tables. If the guest is in either bound or active state, the script takes the proper actions to stop/unbind/restore/bind/start it. If the guest is unbound, the script does nothing, because we do not know why it is in that state. The script is also not error-proof: failing to stop or unbind the domain, or to restore the partition table, might cause LDOM to abort and can require manual fixing, but the above information should be enough for any experienced user to recover.
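The decision described above boils down to a small lookup from guest state to a list of actions. The sketch below is only an illustration of that idea, not the logic in ldmd_start: the function name is hypothetical, the state strings "active" and "bound" are assumptions that you should match against what your ldm reports, and returning a bound guest to the bound state without starting it is my guess at the sensible behaviour.

```shell
# Map a guest state to the actions to take, in order.
# State strings are assumptions; adjust them to what your ldm reports.
plan_for_state() {
    case "$1" in
        active) echo "stop unbind restore bind start" ;;
        bound)  echo "unbind restore bind" ;;   # assumed: leave it bound, not started
        *)      : ;;                            # unbound or unknown: do nothing
    esac
}
```

Keeping the plan as plain text makes it easy to print what would happen to each guest before letting anything touch a device.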
Have fun