HOW-TO: Mirror Boot Drives Using SDS

RAID (Redundant Array of Independent Disks) is a sensible choice for data-sensitive systems, especially servers. Beyond protecting data, it also helps avoid downtime when the boot disk (the disk that holds the operating system) fails. That second case is what this post covers: mirroring boot drives using SDS.

RAID 1, or mirroring, can be implemented cheaply in software. The capability is built into the operating system and requires no additional hardware. Sun used to call this software Solstice DiskSuite (SDS); in Solaris 10 it is built in as Solaris Volume Manager (SVM). Whichever name is used, it refers to the same thing. The procedure below applies to both SDS and SVM.

Prepare the Drives. SDS uses metadevice state databases to store information on disk about the state of your DiskSuite configuration. The metadevice state database records and tracks changes made to your configuration. These databases must reside on a dedicated slice (in the case of a boot drive). I typically leave a small amount of unused space on the boot drive when installing Solaris for these databases. That is, I leave at least one unused slice with approximately 5 MB of free space available for SDS when installing Solaris.

If you have no unused space but do have an unused slice, you can borrow space from swap. See the documentation from Sun for this step. I recommend performing this procedure in single-user mode. However, I have successfully performed these steps on idle systems (multi-user but not running applications) where no swapping was taking place. In abbreviated terms:

* List the swap devices configured on the system:
root@host # swap -l
swapfile dev swaplo blocks free
/dev/dsk/c0t0d0s1 29,0 8 1638608 1638608

* Disable the swap:
root@host # swap -d /dev/dsk/c0t0d0s1

* Repartition the swap device to free at least 5 MB, then assign the freed sectors to an unused slice using the format command:
root@host # format

Use the format command to select the boot disk and create the slice that will hold the state database. The output from format for my boot disk looks like the following. I have these filesystems carved out: /, swap, /var, /opt, and /export/home.
Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm       0 - 1392        3.13GB    (1393/0/0)   6563816
  1       swap    wu    1393 - 3131        3.91GB    (1739/0/0)   8194168
  2     backup    wm       0 - 7505       16.86GB    (7506/0/0)  35368272
  3        var    wm    3132 - 4870        3.91GB    (1739/0/0)   8194168
  4 unassigned    wm    4871 - 5740        1.95GB    (870/0/0)    4099440
  5       home    wm    5741 - 7479        3.91GB    (1739/0/0)   8194168
  6 unassigned    wm       0               0         (0/0/0)            0
  7 unassigned    wm       0               0         (0/0/0)            0

Notice that slices 6 and 7 are unassigned and that 26 cylinders (7480 to 7505) are unused.
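
The arithmetic behind those numbers can be checked with a short shell sketch. It assumes the values shown in the partition table above (slice 0 spans 1393 cylinders and 6563816 blocks) and 512-byte blocks. The variable names are illustrative only.

```shell
#!/bin/sh
# Values copied from the partition table above.
root_blocks=6563816      # blocks in slice 0
root_cyls=1393           # cylinders in slice 0
first_free=7480          # first unused cylinder
last_free=7505           # last unused cylinder

# Blocks per cylinder, derived from slice 0.
blocks_per_cyl=$((root_blocks / root_cyls))

# Number of unused cylinders (the range is inclusive).
free_cyls=$((last_free - first_free + 1))

# Free space in blocks and in MB, assuming 512-byte blocks.
free_blocks=$((free_cyls * blocks_per_cyl))
free_mb=$(awk -v b="$free_blocks" 'BEGIN { printf "%.2f", b * 512 / 1048576 }')

echo "blocks per cylinder: $blocks_per_cyl"
echo "unused cylinders:    $free_cyls"
echo "unused space:        $free_blocks blocks (${free_mb} MB)"
```

Either way, the unused tail of the disk is far larger than the roughly 5 MB the state databases need.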

* Create the dedicated slice for the state databases (common convention uses slice 7):
partition> 7
Part      Tag    Flag     Cylinders        Size            Blocks
  7 unassigned    wm    7480 - 7504       57.52MB    (25/0/0)      117800

Enter partition id tag[unassigned]:
Enter partition permission flags[wm]:
Enter new starting cyl[0]: 7480
Enter partition size[117800b, 25c, 57.52mb, 0.06gb]: 57.52mb
partition> p
Current partition table (unnamed):
Total disk cylinders available: 7506 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm       0 - 1392        3.13GB    (1393/0/0)   6563816
  1       swap    wu    1393 - 3131        3.91GB    (1739/0/0)   8194168
  2     backup    wm       0 - 7505       16.86GB    (7506/0/0)  35368272
  3        var    wm    3132 - 4870        3.91GB    (1739/0/0)   8194168
  4 unassigned    wm    4871 - 5740        1.95GB    (870/0/0)    4099440
  5       home    wm    5741 - 7479        3.91GB    (1739/0/0)   8194168
  6 unassigned    wm       0               0         (0/0/0)            0
  7 unassigned    wm    7480 - 7505       57.52MB    (26/0/0)      122512

partition> label
Ready to label disk, continue? y

* After partitioning, re-enable swap:
root@host # swap -a /dev/dsk/c0t0d0s1

* Partition the mirror drive identically to the primary. There is no need to run format again, because copying the partition table is much faster:
root@host # prtvtoc /dev/rdsk/c0t0d0s2 | fmthard -s - /dev/rdsk/c0t8d0s2
fmthard: New volume table of contents now in place

In this case c0t0d0s2 is the boot drive and c0t8d0s2 is the mirror. Notice that it is on the same controller. You should try to mirror drives across different controllers if at all possible. Basically, the fmthard command takes the partition table of the boot disk and replicates it to the mirror drive. Use the format command to verify that the partitions are exactly identical.
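
As an alternative to comparing the tables by eye in format, the check can be scripted. The following is a minimal sketch, assuming you have saved the prtvtoc output of each disk to a file. The function names and the /tmp file names are hypothetical. Lines beginning with `*` are comments in prtvtoc output (they include the device name, which always differs), so they are ignored.

```shell
#!/bin/sh
# vtoc_body FILE: print a saved prtvtoc dump without its comment lines
# (prtvtoc prefixes header and comment lines with "*").
vtoc_body() {
    grep -v '^\*' "$1"
}

# same_vtoc FILE1 FILE2: succeed when both dumps list identical slices.
same_vtoc() {
    [ "$(vtoc_body "$1")" = "$(vtoc_body "$2")" ]
}

# Intended use on the Solaris host (not run here):
#   prtvtoc /dev/rdsk/c0t0d0s2 > /tmp/boot.vtoc
#   prtvtoc /dev/rdsk/c0t8d0s2 > /tmp/mirror.vtoc
#   same_vtoc /tmp/boot.vtoc /tmp/mirror.vtoc && echo "partition tables match"
```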

Configure Solstice DiskSuite. With the drives properly sliced or partitioned, it is time to configure SDS.

[1] Create at least 2 state database replicas on each disk. A state database replica stores DiskSuite configuration and state information. Before you can use DiskSuite, you must create state database replicas.
root@host # metadb -a -f -c2 /dev/dsk/c0t0d0s7 /dev/dsk/c0t8d0s7

Here -a adds replicas, -f forces the operation (needed because these are the first replicas being created), and -c 2 creates two replicas on each slice.

[2] Create the mirror for the / filesystem. Here we create a one-way mirror, which for the time being consists of one drive. Later we will attach the second drive to the mirror. The metainit command defines the metadevices that the mirror will use. The device numbers (d##) are arbitrary.
root@host # metainit -f d10 1 1 c0t0d0s0
root@host # metainit d20 1 1 c0t8d0s0
root@host # metainit d0 -m d10

[3] Update /etc/vfstab and /etc/system for the / filesystem. Do not try to make the root filesystem changes to these files by hand. Use the metaroot command:
root@host # metaroot d0

Afterwards, take a look at your /etc/vfstab and notice that the / filesystem is now mounted from /dev/md/dsk rather than /dev/dsk.

Solstice DiskSuite has the following rules with respect to the use of database replicas:
» The system will not boot unless more than half of the replicas are available
» The system will panic if more than half of the replicas are corrupt

In other words, if one of your drives fails and the system is rebooted for any reason (hardware, software, or human error), the system will not boot automatically in a two-disk mirror configuration, because only half of the replicas survive. The system has to be brought up in single-user mode, where the broken replicas can be removed to satisfy the quorum rule (more than half of the replicas must be available). Fortunately, you can relax this requirement by setting the following system parameter:
root@host # echo "set md:mirrored_root_flag=1" >> /etc/system

It is recommended to set this parameter in a 2 disk mirror configuration to ensure that the system boots with a failed drive.
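
To see why a two-disk mirror trips over the quorum rule, here is a small worked example in shell. It only models the "more than half" rule as stated above and does not query SDS. The function name has_majority is hypothetical, and the replica count assumes the two-replicas-per-disk layout created by the metadb command earlier.

```shell
#!/bin/sh
# has_majority TOTAL AVAILABLE: succeed when strictly more than half
# of the TOTAL replicas are AVAILABLE (the rule quoted above).
has_majority() {
    [ $(( $2 * 2 )) -gt "$1" ]
}

# Two disks with two replicas each, as created by the metadb command above.
total=4

# Both disks healthy: 4 of 4 replicas available.
has_majority "$total" 4 && echo "4/4 replicas: majority"

# One disk lost: 2 of 4 available, which is exactly half, not more than half.
has_majority "$total" 2 || echo "2/4 replicas: no majority"
```

With an even split of replicas across two disks, losing either disk always leaves exactly half, which is why the mirrored_root_flag setting matters in this configuration.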

[4] Create the mirror for all other filesystems:
» for swap:
root@host # metainit -f d11 1 1 c0t0d0s1
root@host # metainit d21 1 1 c0t8d0s1
root@host # metainit d1 -m d11
» for /var filesystem:
root@host # metainit -f d12 1 1 c0t0d0s3
root@host # metainit d22 1 1 c0t8d0s3
root@host # metainit d2 -m d12
» for /opt filesystem:
root@host # metainit -f d13 1 1 c0t0d0s4
root@host # metainit d23 1 1 c0t8d0s4
root@host # metainit d3 -m d13
» for /export/home filesystem:
root@host # metainit -f d14 1 1 c0t0d0s5
root@host # metainit d24 1 1 c0t8d0s5
root@host # metainit d4 -m d14
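
The numbering pattern in the commands above is regular enough to generate. The sketch below only prints the commands for review and does not run metainit. The helper name print_mirror_cmds is hypothetical, and the disk names c0t0d0 and c0t8d0 and the d1N/d2N/dN scheme are simply the ones used in this post.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the three metainit commands for
# one slice. N is the mirror number, SLICE the slice number on both disks.
# Numbering follows this post: d1N = boot-disk submirror,
# d2N = mirror-disk submirror, dN = the mirror itself.
print_mirror_cmds() {
    n=$1
    slice=$2
    echo "metainit -f d1${n} 1 1 c0t0d0s${slice}"
    echo "metainit d2${n} 1 1 c0t8d0s${slice}"
    echo "metainit d${n} -m d1${n}"
}

# mirror-number:slice pairs used in this post
# (/, swap, /var, /opt, /export/home).
for pair in 0:0 1:1 2:3 3:4 4:5; do
    print_mirror_cmds "${pair%%:*}" "${pair##*:}"
done
```

Printing first and pasting afterwards keeps a typo in the slice mapping from reaching the disks.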

[5] Edit /etc/vfstab to mount the new mirrors on boot. As a safeguard, take a backup copy of /etc/vfstab first.
The /etc/vfstab prior to edits (the / entry already points at the metadevice because metaroot updated it in step [3]):
#device             device              mount           FS      fsck    mount    mount
#to mount           to fsck             point           type    pass    at boot  options
#
#/dev/dsk/c1d0s2    /dev/rdsk/c1d0s2    /usr            ufs     1       yes      -
fd                  -                   /dev/fd         fd      -       no       -
/proc               -                   /proc           proc    -       no       -
/dev/dsk/c0t0d0s1   -                   -               swap    -       no       -
/dev/md/dsk/d0      /dev/md/rdsk/d0     /               ufs     1       no       -
/dev/dsk/c0t0d0s3   /dev/rdsk/c0t0d0s3  /var            ufs     1       no       -
/dev/dsk/c0t0d0s5   /dev/rdsk/c0t0d0s5  /export/home    ufs     2       yes      -
/dev/dsk/c0t0d0s4   /dev/rdsk/c0t0d0s4  /opt            ufs     2       yes      -
swap                -                   /tmp            tmpfs   -       yes      -

The /etc/vfstab after edits:
#device             device              mount           FS      fsck    mount    mount
#to mount           to fsck             point           type    pass    at boot  options
#
#/dev/dsk/c1d0s2    /dev/rdsk/c1d0s2    /usr            ufs     1       yes      -
fd                  -                   /dev/fd         fd      -       no       -
/proc               -                   /proc           proc    -       no       -
/dev/md/dsk/d1      -                   -               swap    -       no       -
/dev/md/dsk/d0      /dev/md/rdsk/d0     /               ufs     1       no       -
/dev/md/dsk/d2      /dev/md/rdsk/d2     /var            ufs     1       no       -
/dev/md/dsk/d4      /dev/md/rdsk/d4     /export/home    ufs     2       yes      -
/dev/md/dsk/d3      /dev/md/rdsk/d3     /opt            ufs     2       yes      -
swap                -                   /tmp            tmpfs   -       yes      -
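
If you prefer not to retype the device paths, the substitution can be sketched with sed. This is only an illustration under the slice-to-metadevice mapping used in this post (s1 to d1, s3 to d2, s4 to d3, s5 to d4). The function name remap_vfstab and the /tmp/vfstab.new path are hypothetical. It leaves the / entry alone, since metaroot handles that one, and it writes a candidate file for review instead of editing /etc/vfstab in place.

```shell
#!/bin/sh
# remap_vfstab: read vfstab text on stdin and write it to stdout with the
# boot-disk slices replaced by the metadevices defined in step [4].
# Comment lines are passed through untouched.
remap_vfstab() {
    sed -e '/^#/b' \
        -e 's#/dev/\(r\{0,1\}\)dsk/c0t0d0s1#/dev/md/\1dsk/d1#g' \
        -e 's#/dev/\(r\{0,1\}\)dsk/c0t0d0s3#/dev/md/\1dsk/d2#g' \
        -e 's#/dev/\(r\{0,1\}\)dsk/c0t0d0s4#/dev/md/\1dsk/d3#g' \
        -e 's#/dev/\(r\{0,1\}\)dsk/c0t0d0s5#/dev/md/\1dsk/d4#g'
}

# Intended use on the Solaris host (not run here):
#   cp /etc/vfstab /etc/vfstab.orig
#   remap_vfstab < /etc/vfstab.orig > /tmp/vfstab.new
#   diff /etc/vfstab.orig /tmp/vfstab.new
```

Review the diff before copying the new file over /etc/vfstab, because a wrong entry here can leave the system unable to mount its filesystems at boot.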

[6] (OPTIONAL) Suppress harmless warning messages.
Typically, after an SDS install, you will see this harmless but annoying message on boot-up: "WARNING: forceload of misc/md_hotspares failed". I usually suppress it by creating an empty hot spare pool:
# metainit hsp001

[7] Reboot and allow the system to mount the mirrors.
root@host # init 6


Ignore the following errors on boot. Sun's explanation for these errors: "These warnings are harmless, and may be ignored. They are an artifact of the way drivers are loaded during the boot process when you have a mirrored root or /usr file system."
WARNING: forceload of misc/md_trans failed
WARNING: forceload of misc/md_raid failed
WARNING: forceload of misc/md_hotspares failed

[8] Attach the second submirror to the mirror. This will cause the data from the boot disk to be synchronized with the mirrored drive.
# metattach d0 d20
# metattach d1 d21
# metattach d2 d22
# metattach d3 d23
# metattach d4 d24

Use metastat to track progress:
# metastat d0

d0: Mirror
Submirror 0: d10
State: Okay
Submirror 1: d20
State: Resyncing
Resync in progress: 21 % done
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 6563816 blocks
...
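
Rather than rerunning metastat by hand, the progress line can be extracted with a small filter. This is a minimal sketch that assumes the output format shown above ("Resync in progress: NN % done"). The function name resync_pct and the 60-second polling interval are arbitrary choices, not part of SDS.

```shell
#!/bin/sh
# resync_pct: read metastat output on stdin and print the percentage from
# a "Resync in progress: NN % done" line, if any. Prints nothing when no
# resync is reported.
resync_pct() {
    sed -n 's/.*Resync in progress: *\([0-9][0-9]*\) *% done.*/\1/p'
}

# Intended use on the Solaris host (not run here): poll once a minute
# until metastat no longer reports a resync for d0.
#   while pct=$(metastat d0 | resync_pct); [ -n "$pct" ]; do
#       echo "d0 resync: ${pct}% done"
#       sleep 60
#   done
```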

[9] Notify the system of the change in the dump device.
root@host # dumpadm -d /dev/md/dsk/d1

[10] Enable the mirror disk to be bootable:
# installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c0t8d0s0

Set the NVRAM boot-device parameter, then reboot to verify that the system boots:
# eeprom "boot-device=disk0 disk1"
# eeprom use-nvramrc?=false

NOTE: You can set this at the boot PROM instead if you like, using the devalias and setenv commands. In case of primary boot disk failure, boot from the alternate disk (this example assumes a device alias named mirror has been defined for it):
ok  boot mirror

Stay tuned for the next guide on how to replace a failed mirror drive.
