How do I rescan the SCSI bus to add or remove a SCSI device without rebooting the computer?
Updated 19 Nov 2012, 7:29 PM GMT
Issue
· Is it possible to add or remove a SCSI device without rebooting a running system?
· Can you scan a SCSI bus for new or missing SCSI devices without rebooting?
· What is the Linux equivalent to the Solaris command `devfsadm` to add or remove storage devices?
· How can I make newly connected SCSI devices available without rebooting?
· I am trying to add a LUN to a live system but it is not recognized.
· How can I force a rescan of my SAN?
· What should I do if a newly allocated LUN on my SAN is not available?
Environment
· Red Hat Enterprise Linux 5.0 or above
  o SCSI devices over a Fibre Channel or iSCSI transport
Technical support for online storage
reconfiguration is provided on Red Hat Enterprise Linux 5 and above. Limited
tools for hot adding and removing storage are present in previous releases of
Red Hat Enterprise Linux; however, they cannot be guaranteed to work correctly in
all configurations. Red Hat Enterprise Linux 5 includes many enhancements
to udev, the low-level device drivers, the SCSI midlayer, and device-mapper
multipath, which together enable comprehensive support for online storage
reconfiguration.
This article,
the Online Storage Reconfiguration Guide, and the Storage Administration Guide currently cover
the FC and iSCSI transports. Future versions of this documentation will cover
other SCSI transports, such as SAS and FCoE.
Hewlett-Packard
SmartArray controllers and other hardware that uses the cciss driver provide a
different interface for manipulating SCSI devices. Users of this hardware
can find a similar guide here.
The procedures below also apply to
hypervisors (i.e. "dom0" in Red Hat Enterprise Linux 5
virtualization), but the procedures are different for dynamically altering the
storage of running virtual guests. For more information about adding storage to
virtual guests, see the Virtualization Guide.
Resolution
Yes, as of Red Hat Enterprise Linux 5.0, it
is possible to make changes to the SCSI I/O subsystem without rebooting. A
number of methods can be used to accomplish this. Some perform
changes explicitly, one device at a time or one bus at a time. Others are
potentially more disruptive, causing bus resets or a large
number of configuration changes at the same time. If the less disruptive
methods are used, it is not necessary to pause I/O while the change is being
made. If one of the more disruptive methods is used then, as a precaution, it
is necessary to pause I/O on each of the SCSI buses involved in the
change.
This article is
a brief summary of the information contained in the Red Hat Enterprise Linux
manuals. For Red Hat Enterprise Linux 5, refer to the Online Storage Reconfiguration Guide. For Red Hat
Enterprise Linux 6, refer to the Storage Administration Guide. You must refer to
these documents for complete coverage of this topic.
Removing a Storage Device
Before removing access to the storage
device itself, you may want to copy data from the device. When that is done,
then you must stop and flush all I/O, and remove all operating system
references to the device, as described below. If this is a multipath device
then you must do this for the multipath pseudo device, and each of the
identifiers that represent a path to the device.
Removal of a storage device is not
recommended when the system is under memory pressure, since the I/O flush will
add to the load. To determine the level of memory pressure, run the command:
vmstat 1 100
Device removal is not recommended if
swapping is active (non-zero "si" and "so" columns in the
vmstat output), and free memory is less than 5% of the total memory in more
than 10 samples per 100. (The total memory can be obtained with the
"free" command.)
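The thresholds above can be checked mechanically. The following is a minimal sketch and not part of any Red Hat tooling. The function name check_memory_pressure is hypothetical. It assumes the default vmstat column layout (free memory in column 4, "si" and "so" in columns 7 and 8, values in KiB), and it reads the text above as requiring both conditions together before advising against removal.

```shell
# Hypothetical helper: reads vmstat output on stdin and applies the
# thresholds described above. $1 is the total memory in KiB.
# Assumed usage (takes about 100 seconds to collect the samples):
#   total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
#   vmstat 1 100 | check_memory_pressure "$total_kb"
check_memory_pressure() {
    local total_kb="$1"
    awk -v total="$total_kb" '
        # Data rows start with a number; header rows do not.
        $1 ~ /^[0-9]+$/ {
            samples++
            if ($4 < total * 0.05) low++        # column 4: free memory
            if ($7 > 0 || $8 > 0) swapping++    # columns 7 and 8: si, so
        }
        END {
            if (samples == 0) { print "no samples"; exit 2 }
            # "more than 10 samples per 100", scaled to the sample count
            if (swapping > 0 && low * 100 > samples * 10) {
                print "not recommended"
                exit 1
            }
            print "ok"
        }'
}
```

The function prints "ok" and returns 0 when removal looks safe by these criteria, and prints "not recommended" with a non-zero status otherwise. Treating either condition alone as a reason to wait would be a stricter reading and only needs the `&&` changed to `||`.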
The general procedure for removing all
access to a device is as follows:
1. Close all
users of the device. Copy data from the device, as needed.
2. Use umount to unmount any
file systems mounted from the device.
3. Remove the device from any md and LVM volume that is using it. If the device
is a member of an LVM Volume group, then it may be necessary to move data off
the device using the pvmove command, then
use the vgreduce command to
remove the physical volume, and (optionally) pvremove to remove the
LVM metadata from the disk.
4. If you are removing a multipath device, run multipath -l and take note of
all the paths to the device. When this has been done, remove the multipath
device:
multipath -f multipath-device
where multipath-device is the name of
the multipath device, for example mpath0.
NOTE: This
command may fail with "map in use" if the multipath device is still
in use (for example, a partition is on the device). See https://access.redhat.com/kb/docs/DOC-56916 for further details.
5. Use the
following command to flush any outstanding I/O to all paths to the device:
blockdev --flushbufs device
This is
particularly important for raw devices, where there is no umount or vgreduce
operation to cause an I/O flush.
6. Remove
any reference to the device's path-based name, like /dev/sd or
/dev/disk/by-path or the major:minor number, in applications, scripts, or
utilities on the system. This is important to ensure that a different
device, when added in the future, will not be mistaken for the current device.
7. The final
step is to remove each path to the device from the SCSI subsystem. The
command to remove a path is:
echo 1 > /sys/block/device-name/device/delete
Where device-name may be sde, for example.
Another variation of this operation is:
echo 1 > /sys/class/scsi_device/h:c:t:l/device/delete
Where h is the HBA
number, c is the channel
on the HBA, t is the SCSI
target ID, and l is the LUN.
You can
determine the device-name and the h,c,t,l for a device
from various commands, such as lsscsi, scsi_id, multipath -l, and ls -l /dev/disk/by-*.
If each of the steps above is followed,
then a device can safely be removed from a running system. It is not necessary
to stop I/O to other devices while this is done.
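The command-line portion of the procedure (steps 4, 5, and 7) can be sketched as a small script. This is an illustration under stated assumptions, not a supported tool: the names run, delete_path, remove_multipath_device, and DRY_RUN are hypothetical, and the path device names must be the ones noted from multipath -l before the map is removed. Steps 1 to 3 and 6 remain manual. By default the sketch only prints the commands it would run.

```shell
# Print the command when DRY_RUN is 1 (the default here), otherwise run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 7 for one path: write 1 to the delete node of an sd device.
delete_path() {
    local node="/sys/block/$1/device/delete"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "echo 1 > $node"
    else
        echo 1 > "$node"
    fi
}

# Usage: remove_multipath_device MAP PATH_DEVICE...
remove_multipath_device() {
    local map="$1" dev
    shift
    # Step 4: remove the multipath map (fails with "map in use" if busy).
    run multipath -f "$map" || return 1
    # Step 5: flush outstanding I/O on every path.
    for dev in "$@"; do
        run blockdev --flushbufs "/dev/$dev" || return 1
    done
    # Step 7: remove every path from the SCSI subsystem.
    for dev in "$@"; do
        delete_path "$dev" || return 1
    done
}
```

For example, `remove_multipath_device mpath0 sde sdf` prints the five commands for a two-path device. Setting DRY_RUN=0 executes them as root. Stopping at the first failure is a deliberate choice, so that a path is never deleted after a failed flush.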
Other
procedures, such as physically removing the device and then rescanning
the SCSI bus with rescan-scsi-bus or issue_lip so that the
operating system state reflects the change, are not
recommended. They may cause delays due to I/O timeouts, and devices may be
removed or replaced unexpectedly. If it is necessary to perform a rescan of an
interconnect, it must be done while I/O is paused. Refer to the Online Storage Reconfiguration Guide and the Storage Administration Guide for more
information.
Adding a Storage Device or a Path
When adding a device, be aware that the
path-based device name (the “sd” name, the major:minor number, and
/dev/disk/by-path name, for example) that the system assigns to the new device
may have been previously in use by a device that has since been removed. Ensure
that all old references to the path-based device name have been removed. Otherwise
the new device may be mistaken for the old device.
The first step is to physically enable
access to the new storage device, or a new path to an existing device.
This may involve installing cables and disks, and running vendor-specific commands at the
FC or iSCSI storage server. When you do this, take note of the LUN value for
the new storage that will be presented to your host.
Next, make the operating system aware of
the new storage device, or path to an existing device. The preferred command
is:
echo "c t l" > /sys/class/scsi_host/hostH/scan
where H is the HBA
number, c is the channel
on the HBA, t is the SCSI
target ID, and l is the LUN.
You can
determine the H,c,t by referring to
another device that is already configured on the same path as the new device.
This can be done with commands such as lsscsi, scsi_id, multipath -l, and ls -l
/dev/disk/by-*. This information, plus the LUN number of the new
device, can be used as shown above to probe and configure that path to the new
device.
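As an illustration of borrowing H, c, and t from a device that is already configured on the same path, here is a minimal sketch. It relies on the "device" link of an sd device in sysfs resolving to a directory named h:c:t:l. The function name scan_new_lun and the SYSFS_ROOT override (useful for trying the function against a fake tree) are hypothetical.

```shell
# Usage: scan_new_lun EXISTING_SD_DEVICE NEW_LUN
# Example (as root): scan_new_lun sdb 7
scan_new_lun() {
    local existing="$1" new_lun="$2" sysfs="${SYSFS_ROOT:-/sys}"
    local addr host rest channel target

    # Resolve the device link; its last path component is h:c:t:l.
    addr=$(basename "$(readlink -f "$sysfs/block/$existing/device")")
    case "$addr" in
        *:*:*:*) ;;
        *) echo "cannot determine h:c:t:l for $existing" >&2; return 1 ;;
    esac

    # Split h:c:t:l; the LUN of the existing device is not needed.
    host=${addr%%:*};    rest=${addr#*:}
    channel=${rest%%:*}; rest=${rest#*:}
    target=${rest%%:*}

    # Probe only the new LUN on the same host, channel, and target.
    echo "$channel $target $new_lun" > "$sysfs/class/scsi_host/host$host/scan"
}
```

If sdb sits at 2:0:3:0, then `scan_new_lun sdb 7` writes "0 3 7" to /sys/class/scsi_host/host2/scan, which is the explicit, single-LUN form of the command shown above.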
Note: In some Fibre
Channel hardware configurations, when a new LUN is created on the RAID array it
may not be visible to the operating system until after a LIP (Loop
Initialization Protocol) operation is performed. Refer to the manuals for
instructions on how to do this. If a LIP is required, it will be necessary to
stop I/O while this operation is done.
As of Red Hat
Enterprise Linux 5.6, it is also possible to use the wildcard character
"-" in place of c, t and/or l in the command
shown above. In this case, it is not necessary to stop I/O while this
command executes. In versions prior to 5.6, the use of wildcards in this
command requires that I/O be paused as a precaution.
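A wildcard scan of a single host can be wrapped as follows. This is a sketch only: the function name scan_host and the SYSFS_ROOT override are hypothetical, and any argument that is omitted defaults to the "-" wildcard. On releases prior to 5.6, pause I/O first, as described above.

```shell
# Usage: scan_host HOST [CHANNEL] [TARGET] [LUN]
# Omitted fields are sent as the "-" wildcard.
scan_host() {
    local host="$1" channel="${2:--}" target="${3:--}" lun="${4:--}"
    local node="${SYSFS_ROOT:-/sys}/class/scsi_host/host$host/scan"

    if [ ! -e "$node" ]; then
        echo "host$host has no scan node" >&2
        return 1
    fi
    echo "$channel $target $lun" > "$node"
}
```

For example, `scan_host 1` writes "- - -" to host1, while `scan_host 1 0 2` writes "0 2 -" and so probes every LUN on channel 0, target 2. Narrowing the wildcard to the fields that are actually unknown keeps the number of resulting configuration changes small.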
After adding all
the SCSI paths to the device, execute the multipath command, and
check to see that the device has been properly configured. At this point, the
device is available to be added to md, LVM, mkfs, or mount, for example.
Other commands
that cause a SCSI bus reset, LIP, or a system-wide rescan, and that may result in
multiple add/remove/replace operations, are not recommended. If these commands
are used, then I/O to the affected SCSI buses must be paused and flushed prior
to the operation. Refer to the Online Storage Reconfiguration Guide and the Storage Administration Guide for more information.
As of release 5.4, a script called /usr/bin/rescan-scsi-bus.sh is available as
part of the sg3_utils package. This
can make rescan operations easier. This script is described in the manuals
mentioned above.
Comments
Seems by default WWN are used.
I have three tape drives per WWN, so am not getting all necessary nodes.
How can I modify so that Serial Numbers not WWN numbers are used for tape drives?
For example,
root@> inquire
scsidev@0.0.0:SPECTRA PYTHON 2000|Autochanger (Jukebox), /dev/sg0
S/N: 901F002454
ATNN=SPECTRA PYTHON 901F002454
WWNN=201F0090A5002454
scsidev@0.0.1:IBM ULTRIUM-TD4 97F9|Tape, /dev/nst0
S/N: 1011002454
ATNN=IBM ULTRIUM-TD4 1011002454
WWNN=201F0090A5002454
scsidev@0.0.2:IBM ULTRIUM-TD4 97F9|Tape, /dev/nst1
S/N: 1012002454
ATNN=IBM ULTRIUM-TD4 1012002454
WWNN=201F0090A5002454
scsidev@0.1.0:IBM ULTRIUM-TD4 97F9|Tape, /dev/nst2
S/N: 1014002454
ATNN=IBM ULTRIUM-TD4 1014002454
WWNN=201F0090A5002454
scsidev@0.2.0:IBM ULTRIUM-TD4 97F9|Tape, /dev/nst3
S/N: 1021002454
ATNN=IBM ULTRIUM-TD4 1021002454
WWNN=202F0090A5002454
scsidev@0.2.1:IBM ULTRIUM-TD4 97F9|Tape, /dev/nst4
S/N: 1022002454
ATNN=IBM ULTRIUM-TD4 1022002454
WWNN=202F0090A5002454
<snip - removed remaining similar tape drives>
root# > ls -al /dev/tape/by-id/
total 0
drwxr-xr-x 2 root root 180 Jun 4 05:48 .
drwxr-xr-x 3 root root 60 Jun 4 05:36 ..
lrwxrwxrwx 1 root root 9 Jun 4 05:36 scsi-3201f0090a5002454 -> ../../sg0
lrwxrwxrwx 1 root root 10 Jun 4 05:48 scsi-3201f0090a5002454-nst -> ../../nst1
lrwxrwxrwx 1 root root 10 Jun 4 05:48 scsi-3202f0090a5002454-nst -> ../../nst5
lrwxrwxrwx 1 root root 10 Jun 4 05:48 scsi-3203f0090a5002454-nst -> ../../nst6
lrwxrwxrwx 1 root root 10 Jun 4 05:48 scsi-3204f0090a5002454-nst -> ../../nst9
lrwxrwxrwx 1 root root 11 Jun 4 05:48 scsi-3205f0090a5002454-nst -> ../../nst14
lrwxrwxrwx 1 root root 11 Jun 4 05:48 scsi-3206f0090a5002454-nst -> ../../nst16
I assume that the common WWN number is the cause of the missing drive links. It seems to select a random drive from among those sharing a WWN, ignoring the Serial Number (S/N) and "ATNN" value.