우주곰, 엔지니어 방랑기

190 posts in 'Categories'

2012.12.03 ssh 6.1 for RHEL 5

2012.12.03 How do I rescan the SCSI bus to add or remove a SCSI device without rebooting the computer?

2012.04.10 Anatomy of ext4

2012.04.03 How can persistent names be created for SCSI devices in Red Hat Enterprise Linux 4, 5 and 6?

2012.04.02 yum commands are segfaulting

2012.03.26 How to update offline RHEL server without network connection to Red Hat Network/Proxy/Satellite.

2012.03.23 How do you configure an ILO 3 fence device for RHEL Clustering?

2012.03.15 Why is the /proc/scsi/qla2xxx/ or /proc/scsi/lpfc/ directory missing in Red Hat Enterprise Linux 5 and what has replaced it?

2012.03.13 ServeRAID C100 Driver for Red Hat Enterprise Linux 6

2012.03.12 What is the SysRq facility and how do I use it?

2012.03.08 Offline upgrading to Red Hat Enterprise Linux 4.9

2012.02.27 PXE boot

2012.02.24 Creating a Red Hat installation USB drive (ISO to USB for RHEL)

2012.02.21 How do I create a bootable USB pen drive to start a Red Hat Enterprise Linux installation?

2012.02.13 RHEL 5 bonding automation script

2012.02.13 Cleaning up daemons after a Linux installation

2012.02.06 How to configure bonding on RHEL 4, 5, and 6

2012.01.25 How should swap be sized?

2012.01.25 How to resize /dev/shm

2010.11.22 Summary of concepts

My Advanced Linux/Advanced Linux 2012. 12. 3. 23:20

ssh 6.1 for RHEL 5

According to yum, the latest version of OpenSSH for Red Hat Enterprise Linux Server release 5.7 is openssh-4.3p2-82.

But that doesn't stop us:

cd /usr/local/src
wget http://mirror.esc7.net/pub/OpenBSD/OpenSSH/portable/openssh-6.1p1.tar.gz
tar -xvzf openssh-6.1p1.tar.gz
cp ./openssh-6.1p1/contrib/redhat/openssh.spec /usr/src/redhat/SPECS/
cp openssh-6.1p1.tar.gz /usr/src/redhat/SOURCES/
cd /usr/src/redhat/SPECS
perl -i.bak -pe 's/^(%define no_(gnome|x11)_askpass)\s+0$/$1 1/' openssh.spec
rpmbuild -bb openssh.spec
cd /usr/src/redhat/RPMS/i386
rpm -e openssh-askpass-4.3p2-82.el5.i386
rpm -Uvh openssh-*.rpm
 
[root@server ~]# ssh -V
OpenSSH_6.1p1, OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008

License: Attribution, NonCommercial, NoDerivatives

My Advanced Linux/RedHat Knowledge 2012. 12. 3. 22:30

How do I rescan the SCSI bus to add or remove a SCSI device without rebooting the computer?


Updated 19 Nov 2012, 7:29 PM GMT


Issue

· Is it possible to add or remove a SCSI device without rebooting a running system?

· Can you scan a SCSI bus for new or missing SCSI devices without rebooting?

· What is the Linux equivalent to the Solaris command `devfsadm` to add or remove storage devices?

· How can I make newly connected SCSI devices available without rebooting?

· I am trying to add a LUN to a live system but it is not recognized

· How can I force a rescan of my SAN?

· What to do if a newly allocated LUN on my SAN is not available?

Environment

· Red Hat Enterprise Linux 5.0 or above

o SCSI devices over a Fibre Channel or iSCSI transport

Technical support for online storage reconfiguration is provided on Red Hat Enterprise Linux 5 and above. Limited tools for hot adding and removing storage are present in previous releases of Red Hat Enterprise Linux; however, they cannot be guaranteed to work correctly in all configurations. Red Hat Enterprise Linux 5 includes many enhancements to udev, the low-level device drivers, the SCSI midlayer, and device-mapper multipath, which together enable comprehensive support for online storage reconfiguration.

This article, the Online Storage Reconfiguration Guide, and the Storage Administration Guide currently cover the FC and iSCSI transports. Future versions of this documentation will cover other SCSI transports, such as SAS and FCoE.

Hewlett-Packard SmartArray controllers and other hardware that uses the cciss driver provide a different interface for manipulating SCSI devices. Users of this hardware can find a similar guide here.

The procedures below also apply to hypervisors (i.e. "dom0" in Red Hat Enterprise Linux 5 virtualization), but the procedures are different for dynamically altering the storage of running virtual guests. For more information about adding storage to virtual guests, see the Virtualization Guide.

Resolution

Yes, as of Red Hat Enterprise Linux 5.0, it is possible to make changes to the SCSI I/O subsystem without rebooting. There are a number of methods that can be used to accomplish this. Some perform changes explicitly, one device at a time or one bus at a time. Others are potentially more disruptive, causing bus resets or a large number of configuration changes at the same time. If the less disruptive methods are used, it is not necessary to pause I/O while the change is being made. If one of the more disruptive methods is used, then, as a precaution, it is necessary to pause I/O on each of the SCSI buses involved in the change.

This article is a brief summary of the information contained in the Red Hat Enterprise Linux manuals. For Red Hat Enterprise Linux 5 refer to Online Storage Reconfiguration Guide. For Red Hat Enterprise Linux 6 refer to Storage Administration Guide. You must refer to these documents for complete coverage of this topic.

Removing a Storage Device

Before removing access to the storage device itself, you may want to copy data from the device. When that is done, then you must stop and flush all I/O, and remove all operating system references to the device, as described below. If this is a multipath device then you must do this for the multipath pseudo device, and each of the identifiers that represent a path to the device.

Removal of a storage device is not recommended when the system is under memory pressure, since the I/O flush will add to the load. To determine the level of memory pressure run the command:

vmstat 1 100

Device removal is not recommended if swapping is active (non-zero "si" and "so" columns in the vmstat output), and free memory is less than 5% of the total memory in more than 10 samples per 100. (The total memory can be obtained with the "free" command.)
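The rule of thumb above can be checked mechanically. The following is a minimal sketch, not part of the original article: the function name is hypothetical, it assumes the default vmstat column layout (free memory in column 4, si and so in columns 7 and 8, values in KB), and it assumes that "swapping is active" means any sample with a non-zero si or so value.

```sh
# Sketch only: reads vmstat output on stdin, total memory in KB as $1.
# The function name and the swapping rule are assumptions.
check_memory_pressure() {
    awk -v total="$1" '
        $1 ~ /^[0-9]+$/ {                       # skip the vmstat header lines
            samples++
            if ($4 < total * 0.05) low++        # column 4 = free memory
            if ($7 > 0 || $8 > 0) swapping = 1  # columns 7 and 8 = si and so
        }
        END {
            if (samples == 0)
                print "no samples"
            else if (swapping && low * 100 > samples * 10)
                print "memory pressure: postpone device removal"
            else
                print "ok"
        }'
}

# Example (takes about 100 seconds to collect the samples):
#   vmstat 1 100 | check_memory_pressure "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
```

Reading the total from /proc/meminfo is only one option; the value reported by the "free" command works just as well.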

The general procedure for removing all access to a device is as follows:

1. Close all users of the device. Copy data from the device, as needed.

2. Use umount to unmount any file systems that mounted the device.

3. Remove the device from any md and LVM volume that is using it. If the device is a member of an LVM Volume group, then it may be necessary to move data off the device using the pvmove command, then use the vgreduce command to remove the physical volume, and (optionally) pvremove to remove the LVM metadata from the disk.

4. If you are removing a multipath device, run multipath -l and take note of all the paths to the device. When this has been done, remove the multipath device:

multipath -f multipath-device

Where multipath-device is the name of the multipath device (mpath0, for example).

NOTE: This command may fail with "map in use" if the multipath device is still in use (for example, a partition is on the device). See https://access.redhat.com/kb/docs/DOC-56916 for further details.

5. Use the following command to flush any outstanding I/O to all paths to the device:

blockdev --flushbufs device

This is particularly important for raw devices, where there is no umount or vgreduce operation to cause an I/O flush.

6. Remove any reference to the device's path-based name, like /dev/sd or /dev/disk/by-path or the major:minor number, in applications, scripts, or utilities on the system. This is important to ensure that a different device, when added in the future, will not be mistaken for the current device.

7. The final step is to remove each path to the device from the SCSI subsystem. The command to remove a path is:

echo 1 > /sys/block/device-name/device/delete

Where device-name may be sde, for example.

Another variation of this operation is:

echo 1 > /sys/class/scsi_device/h:c:t:l/device/delete

Where h is the HBA number, c is the channel on the HBA, t is the SCSI target ID, and l is the LUN.

You can determine the device-name and the h,c,t,l for a device from various commands, such as lsscsi, scsi_id, multipath -l, and ls -l /dev/disk/by-*.

If each of the steps above is followed, a device can safely be removed from a running system. It is not necessary to stop I/O to other devices while this is done.
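Steps 5 and 7 are easy to get wrong when a multipath device has several paths. As a rough sketch (the function name is hypothetical), the helper below only prints the flush and delete commands for the path names noted from multipath -l, so they can be reviewed before anything is run. It assumes the file systems are already unmounted and the multipath map has been flushed.

```sh
# Dry-run sketch: prints, but does not execute, the flush and delete
# commands for each path device name given (for example: sde sdf).
# The function name is hypothetical.
print_path_removal_commands() {
    # Step 5: flush outstanding I/O on every path first.
    for dev in "$@"; do
        printf 'blockdev --flushbufs /dev/%s\n' "$dev"
    done
    # Step 7: then remove each path from the SCSI subsystem.
    for dev in "$@"; do
        printf 'echo 1 > /sys/block/%s/device/delete\n' "$dev"
    done
}

print_path_removal_commands sde sdf
```

Printing rather than executing is a deliberate choice here: deleting the wrong sd device is not easily undone.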

Other procedures, such as the physical removal of the device, followed by a rescan of the SCSI bus using rescan-scsi-bus or issue_lip to cause the operating system state to be updated to reflect the change, are not recommended. This may cause delays due to I/O timeouts, and devices may be removed/replaced unexpectedly. If it is necessary to perform a rescan of an interconnect, it must be done while I/O is paused. Refer to Online Storage Reconfiguration Guide and Storage Administration Guide for more information.

Adding a Storage Device or a Path

When adding a device, be aware that the path-based device name (the “sd” name, the major:minor number, and /dev/disk/by-path name, for example) that the system assigns to the new device may have been previously in use by a device that has since been removed. Ensure that all old references to the path-based device name have been removed. Otherwise the new device may be mistaken for the old device.

The first step is to physically enable access to the new storage device, or a new path to an existing device. This may involve installing cables, disks, and vendor-specific commands at the FC or iSCSI storage server. When you do this, take note of the LUN value for the new storage that will be presented to your host.

Next, make the operating system aware of the new storage device, or path to an existing device. The preferred command is:

echo "c t l" > /sys/class/scsi_host/hostH/scan

where H is the HBA number, c is the channel on the HBA, t is the SCSI target ID, and l is the LUN.

You can determine the H,c,t by referring to another device that is already configured on the same path as the new device. This can be done with commands such as lsscsi, scsi_id, multipath -l, and ls -l /dev/disk/by-*. This information, plus the LUN number of the new device, can be used as shown above to probe and configure that path to the new device.
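As an illustration of how the pieces fit together, here is a small sketch that builds the scan command from the H:C:T:L address of an existing device (as printed by lsscsi, for example) and the LUN of the new device. The function name is hypothetical, and the command is printed rather than executed.

```sh
# Sketch: combine the H:C:T:L of a device already on the same path with
# the new LUN, and print the matching scan command.
print_scan_command() {
    hctl=$1      # address of an existing device, e.g. 2:0:1:0
    new_lun=$2   # LUN value noted when the storage was presented
    h=${hctl%%:*}; rest=${hctl#*:}
    c=${rest%%:*}; rest=${rest#*:}
    t=${rest%%:*}
    printf 'echo "%s %s %s" > /sys/class/scsi_host/host%s/scan\n' \
        "$c" "$t" "$new_lun" "$h"
}

print_scan_command 2:0:1:0 5
```

The LUN part of the existing address is ignored on purpose, since only the host, channel, and target are shared with the new device.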

Note: In some Fibre Channel hardware configurations, when a new LUN is created on the RAID array it may not be visible to the operating system until after a LIP (Loop Initialization Protocol) operation is performed. Refer to the manuals for instructions on how to do this. If a LIP is required, it will be necessary to stop I/O while this operation is done.

As of Red Hat Enterprise Linux 5.6, it is also possible to use the wildcard character "-" in place of c, t and/or l in the command shown above. In this case, it is not necessary to stop I/O while this command executes. In versions prior to 5.6, the use of wildcards in this command requires that I/O be paused as a precaution.
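For the wildcard form, a sketch such as the following lists one scan command per SCSI host. The function name and the optional sysfs root argument are assumptions made so that the loop can be tried against a scratch directory; nothing is written to sysfs.

```sh
# Sketch: print a wildcard scan command for every SCSI host found under
# a sysfs root (default /sys). Commands are printed, not executed.
print_wildcard_scans() {
    root=${1:-/sys}
    for host in "$root"/class/scsi_host/host*; do
        [ -e "$host/scan" ] || continue   # skip if the glob matched nothing
        printf 'echo "- - -" > %s/scan\n' "$host"
    done
}

print_wildcard_scans
```

In practice it is usually enough to run the command only for the HBAs that actually gained storage, rather than for every host in the list.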

After adding all the SCSI paths to the device, execute the multipath command, and check to see that the device has been properly configured. At this point, the device is available to be added to md, LVM, mkfs, or mount, for example.

Other commands that cause a SCSI bus reset, LIP, or a system-wide rescan, and that may result in multiple add/remove/replace operations, are not recommended. If these commands are used, then I/O to the affected SCSI buses must be paused and flushed prior to the operation. Refer to the Online Storage Reconfiguration Guide and Storage Administration Guide for more information.

As of release 5.4, a script called /usr/bin/rescan-scsi-bus.sh is available as part of the sg3_utils package. This can make rescan operations easier. This script is described in the manuals mentioned above.


Tips! 2012. 4. 10. 13:53

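Anatomy of ext4

Source: http://www.ibm.com/developerworks/kr/library/l-anatomy-ext4/

As with every new Linux kernel release, the 2.6.28 release published this December includes several notable features. Alongside work such as Btrfs, which is still under active development, this is the first release to include a stable ext4 file system. This next-generation extended file system improves scalability and reliability and adds significant new functionality. Ext4 can scale to a file system spanning up to one million 1TB disks.

A brief history of the extended file system

The Virtual File System (VFS) switch

The VFS is a layer that abstracts the details of the underlying file systems from the users of the upper-level file system. Thanks to the VFS, Linux can support many file systems concurrently on a given Linux system.

The first file system supported by Linux was the Minix file system. Because it had some serious performance problems, another file system, called the extended file system, was created specifically for Linux. The first extended file system (ext) was designed by Remy Card and introduced into Linux in April 1992. The ext file system was the first to use the Virtual File System (VFS) switch implemented in the 0.96c kernel, and it supported file systems of up to 2GB in size.

The second extended file system (ext2), also implemented by Remy Card, was introduced in January 1993. It adopted advanced ideas from other file systems of the day, such as the Berkeley Fast File System (FFS). Ext2 extended the supported file system size to 2TB, and in the 2.6 kernel the maximum ext2 file system size was extended to 32TB.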


The third extended file system (ext3) was a major advance for Linux file systems, even though its performance lagged behind some of its competitors. Ext3 introduced journaling, which improves the reliability of the file system when the system halts unexpectedly. Although competing file systems such as Silicon Graphics' XFS and the IBM® Journaled File System (JFS) performed better, ext3 supported in-place upgrades from systems already using ext2. Ext3 was introduced in November 2001 and was implemented by Stephen Tweedie.

Today we have the fourth extended file system (ext4). Ext4 introduces numerous new features that improve performance, scalability, and reliability. Most notably, ext4 supports file systems of 1 exabyte (EB). Ext4 was implemented by a team of developers led by Theodore Tso, the ext3 maintainer, and was introduced in the 2.6.19 kernel. It is now stable in the 2.6.28 kernel (as of December 2008).

Ext4 borrows useful concepts from a variety of competing file systems. For example, the extent approach to block management had been implemented in JFS. Another block-related feature, delayed allocation, had been implemented in XFS and in Sun Microsystems' ZFS.

The new ext4 file system brings improvements across the board: new functionality, scalability beyond the limits of current file systems, reliability in the face of failures, and better performance.


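Functionality

Ext4 includes a large amount of new functionality, but its most important aspects are forward and backward compatibility with ext3 and improved timestamps, which look ahead to the higher-performing Linux systems of the future.

Forward and backward compatibility

Because ext3 is one of the most widely used file systems on Linux today, migration to ext4 should be simple and painless. For this reason, ext4 was designed to be both forward and backward compatible (see Figure 1). Ext4 is forward compatible in that you can mount an ext3 file system as an ext4 file system. To take full advantage of ext4, a file system migration is necessary to convert to the new ext4 format. You can also mount an ext4 file system as ext3 (backward compatibility), but only if the ext4 file system does not use extents (see the Extents section).

Figure 1. Forward and backward compatibility of ext4

In addition to these compatibility features, an ext3 file system can be migrated to ext4 gradually. Old files that have not been moved can remain in the older ext3 format, while new files (or old files that have been copied) are managed with the new ext4 data structures. In this way, an ext3 file system can be migrated online to ext4.

Improved timestamp resolution and range

Surprisingly, timestamps in the extended file systems before ext4 were seconds based. This worked well in many settings, but its limits are showing as processors become faster and more integrated (multi-core processors) and as Linux is used in other application domains such as high-performance computing. Ext4 future-proofs its timestamps by extending them to a nanosecond LSB. Two additional bits also extend the time range, so that it remains usable some 500 years into the future.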
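The effect of the two extra bits can be estimated with simple arithmetic. The sketch below is only a rough calculation: it assumes a 365-day year and assumes that the two additional bits widen a 32-bit seconds field to 34 bits. The function name is hypothetical.

```sh
# Rough arithmetic, assuming a 365-day year and a seconds field that
# grows from 32 to 34 bits.
seconds_per_year=$((365 * 24 * 3600))

# Print the total span, in whole years, covered by an N-bit seconds counter.
span_years() {
    echo $(( (1 << $1) / seconds_per_year ))
}

span_years 32   # span of a 32-bit seconds counter
span_years 34   # span with two additional bits
```

Under these assumptions the span grows from about 136 years to about 544 years, which is the order of magnitude behind the "500 years" figure.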


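Scalability

One of the most important properties of a file system going forward is its ability to scale with increasing demand. Ext4 achieves this in several ways, overcoming the limits of ext3 and laying new groundwork for file system metadata management.

Extending file system limits

The first visible difference in ext4 is its increased support for file system volumes, file sizes, and subdirectory limits. Ext4 supports file systems of up to 1EB (1000PB). That seems huge by today's standards, but storage consumption keeps growing, so ext4 was clearly developed with the future in mind. Files in ext4 can be up to 16TB in size (assuming 4KB blocks), which is eight times the ext3 limit.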
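The figures above follow from block-number widths. As a back-of-the-envelope sketch, assuming 4KB blocks, 32-bit block numbers within a file, and 48-bit block numbers within a file system (the bit widths are assumptions added here, not stated in the text):

```sh
# Rough check of the quoted limits; block size and bit widths are
# assumptions for illustration.
block_size=4096
max_file_tib=$(( ((1 << 32) * block_size) >> 40 ))   # bytes -> TiB
max_fs_eib=$(( ((1 << 48) * block_size) >> 60 ))     # bytes -> EiB
echo "max file size:        ${max_file_tib} TiB"
echo "max file system size: ${max_fs_eib} EiB"
```

With these inputs the arithmetic gives 16 for the file size and 1 for the file system size, in binary units.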

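Finally, ext4 extends the subdirectory limit, from roughly 32,000 subdirectories per directory in ext3 to virtually unlimited. If that seems excessive, consider the hierarchy of a file system that consumes 1EB of storage. Directory indexing was also optimized into a hashed B-tree-like structure, so ext4 supports very fast lookups even though the limits are much greater.

Extents

One of the main disadvantages of ext3 was its method of allocation. Files were allocated using a bitmap of free space, which was neither fast nor scalable. The ext3 format is very efficient for small files but inefficient for large ones. Ext4 replaces the ext3 mechanism with extents to improve allocation and support a more efficient storage structure. An extent represents a contiguous sequence of blocks. With extents, the file system keeps information about where a long run of contiguous blocks is stored instead of where each individual block is stored, so the total amount of metadata shrinks.

Extents in ext4 take a layered approach, representing small files efficiently and using extent trees to represent large files efficiently. For example, a single ext4 inode has room to reference four extents, where each extent represents a set of contiguous blocks. For large files (including fragmented ones), the inode can reference an index node, and each index node can reference leaf nodes that in turn reference multiple extents. This constant-depth extent tree provides an effective representation for large, sparse files. The nodes also include self-checking mechanisms to protect against file system corruption.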


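Performance

One of the most important attributes by which a new file system is measured is its raw performance. It is also one of the hardest areas, because performance can suffer as a file system grows in capacity and as expectations of reliability rise. Even so, ext4 delivers scalability and reliability while also providing several enhancements for better performance.

File-level preallocation

Certain applications, such as databases or content streaming, rely on files being stored in contiguous blocks, both to exploit the sequential block read optimization of drives and to maximize the ratio of blocks to read commands. Extents can provide segments of contiguous blocks, but a more forceful method is to preallocate a very large section of contiguous blocks of the desired size, as XFS did in the past. Ext4 implements this through a new system call that preallocates and initializes a file of a given size. The application can then write the necessary data and get bounded read performance over it.

Delayed block allocation

Delayed allocation is another optimization based on file size. It waits to allocate physical blocks on the disk until the blocks have to be flushed to disk. The key point is that, because allocation is postponed until the data needs to be written, more blocks are available to be allocated and written contiguously. This is similar to persistent preallocation, except that the file system performs the task automatically. If the file size is known in advance, however, persistent preallocation is the most effective approach.

Multi-block allocation

The last optimization, again related to contiguous blocks, is the ext4 block allocator. In ext3, the block allocator worked by allocating one block at a time. When several blocks were needed, contiguous data could end up in blocks that were not contiguous. Ext4 solves this with a block allocator that allocates multiple blocks at once, so that they are likely to be contiguous on disk. Like the previous optimizations, this one gathers related data together on disk to optimize for sequential reads.

The other aspect of multi-block allocation is the amount of processing needed to allocate blocks. Ext3, which allocated a single block at a time in the simplest possible way, needed one call per block. Allocating several blocks at once requires far fewer calls to the block allocator, which speeds up allocation and reduces the processing required.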


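Reliability

Because ext4 file systems can grow to very large sizes, concerns about reliability naturally grow as well. Ext4 includes several self-protection and self-healing mechanisms to address them.

Checksumming the file system journal

Like ext3, ext4 is a journaling file system. Journaling is the process of logging changes to the file system through a journal, which is a dedicated circular log in a contiguous region of the disk. The actual changes to physical storage are then applied from the log. This makes changes more reliable and preserves consistency even if the system crashes or loses power during an operation. The result is a lower chance of file system corruption.

Even with journaling, corruption is still possible if invalid entries find their way into the journal. To address this, ext4 checksums the journal to ensure that only valid changes reach the underlying file system. The Resources section lists additional references on journaling, an important aspect of ext4.

Ext4 supports several journaling modes, depending on the needs of the user. For example, it supports a mode in which only metadata is journaled (Writeback mode), a mode in which metadata is journaled and data is written as the metadata is written out from the journal (Ordered mode), and a mode in which both metadata and data are journaled (Journal mode, the most reliable). Journal mode is the best way to guarantee a consistent file system, but it is also the slowest, because all data passes through the journal.

Online defragmentation

Ext4 incorporates features that reduce fragmentation within the file system (extents for sequential block allocation), but some fragmentation is unavoidable when a file system is used over a long period. To deal with this and improve performance, an online defragmentation tool can defragment both the file system and individual files. The online defragmenter is a simple tool that copies files into a new ext4 inode that refers to contiguous extents.

The other aspect of online defragmentation is the shorter time needed for a file system check (fsck). Ext4 marks unused groups of blocks in the inode table, so fsck can skip those groups entirely and complete the check faster. As file systems grow, internal corruption becomes inevitable, and the operating system validates the file system to deal with it. The design of ext4 makes that validation faster and improves overall reliability.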


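Looking ahead

The extended file system clearly has a long and rich history within Linux, from the first introduction of ext in 1992 to ext4 in 2008. It was the first file system designed specifically for Linux, and it has proven to be one of the most efficient, stable, and powerful file systems available. Ext4 incorporates ideas from other newer file systems, such as XFS, JFS, Reiser, and the IRON fault-tolerant file system techniques, and shows steady progress in step with file system research. It is too early to predict what ext5 will look like, but it is clear that it will lead the way for enterprise-ready Linux systems.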

Resources

Learn

"Ubuntu 9.04 Receives EXT4 Support" shows the impressive performance gains that ext4 brings to the latest Ubuntu release (compared with JFS, XFS, ReiserFS, and ext3).
More information about ext4 is available on the ext4 kernel wiki (which covers what you need to run ext4 on a system), the Fedora ext4 page, the Kernel Newbies ext4 page, and Wikipedia. Wikipedia's Extended file system page also covers the four extended file systems (1 through 4) and links to a file system comparison and to the history of the second extended file system.
The IBM Linux Technology Center presentation "Ext4: The Next Generation of Ext2/3 Filesystem" is a good overview of ext4.
Tim's "Anatomy of the Linux file system" (developerWorks, October 2007) and "Anatomy of Linux journaling file systems" (developerWorks, June 2008) provide more detail on Linux file systems and journaling file systems.
Softpedia's coverage of 2.6.28 and Heise online's coverage of 2.6.29 describe the new kernel releases in detail. Both releases include a number of significant improvements.
"Linux: ext4 Filesystem" covers the early history of ext4. This 2006 KernelTrap article includes Theodore Tso's original proposal for the ext4 file system.
Read Vijayan Prabhakaran's dissertation on IRON File Systems, written at the University of Wisconsin-Madison. IRON (Internal RObustNess) proposes ways of managing disk drives under the assumption that the drives themselves fail. In particular, IRON proposes techniques for detecting and recovering from errors through journaling.
More articles by Tim Jones are available on developerWorks.
The developerWorks Linux zone offers resources for Linux developers, including newcomers to Linux, along with its most popular articles and tutorials.
See all Linux tips and Linux tutorials on developerWorks.
Stay current with developerWorks technical events and webcasts.

Get products and technologies

Download the latest kernel release from kernel.org.
With IBM trial software, available for download directly from developerWorks, build your next development project on Linux.

Discuss

Get involved in the developerWorks community through blogs, forums, podcasts, and spaces.

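About the author

M. Tim Jones is an embedded firmware architect and the author of Artificial Intelligence: A Systems Approach, GNU/Linux Application Programming (now in its second edition), AI Application Programming (now in its second edition), and BSD Sockets Programming from a Multilanguage Perspective. His engineering background ranges from kernel development for geosynchronous satellites to embedded systems architecture and network protocol development. He is a Consultant Engineer for Emulex Corp. in Longmont, Colorado.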


My Advanced Linux/RedHat Knowledge 2012. 4. 3. 09:35

How can persistent names be created for SCSI devices in Red Hat Enterprise Linux 4, 5 and 6?

last modified by Nitin Yewale on 01/10/12 - 11:54

Issue

How do I create persistent device names for attached SCSI devices that will not be changed on reboot or when new devices are added or existing devices are removed?

Environment

Red Hat Enterprise Linux (RHEL) 6
Red Hat Enterprise Linux 5.3 and later
Red Hat Enterprise Linux 4.7 and later

Resolution

The udev rules supplied with Red Hat Enterprise Linux 6, 5.3 and later, and 4.7 and later can create persistent device names for SCSI devices. These names are actually persistently-named symbolic links that appear in /dev/disk/by-id (for disk devices) and /dev/tape/by-id (for tape and media changer devices). These symbolic links, which use persistent device attributes (like serial numbers), do not change when devices are added or removed from the system (which causes reordering of the /dev/nst* devices, for example).

Use of these persistently-named symbolic links is highly desirable in, for instance, the configuration of backup software (which is commonly a static definition that binds a backup software device name to an operating system-level device file name).

Note that these persistently-named symbolic links are created in addition to the default device file names in /dev (for example, /dev/nst0).
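To see which device node each persistent name currently points to, a small loop over the by-id directory is enough. This is a sketch only: the function name is hypothetical, and it assumes the GNU readlink command with the -f option is available.

```sh
# Sketch: show the device node behind each persistent symlink.
# The directory argument defaults to /dev/disk/by-id; pass
# /dev/tape/by-id for tape and media changer devices.
list_persistent_names() {
    dir=${1:-/dev/disk/by-id}
    for link in "$dir"/*; do
        [ -L "$link" ] || continue   # only symbolic links are of interest
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}

list_persistent_names
```

Running it before and after adding a device shows that the names on the left stay the same even when the device nodes on the right are reordered.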

For Red Hat Enterprise Linux 4 only

By default, a Red Hat Enterprise Linux 4.7 system will not create these persistently-named symbolic links in /dev/[disk|tape]/by-id. For the persistently-named symbolic links to be created, /etc/scsi_id.config must be modified as follows:

options=-g -u

Following this modification, either reboot the system or run the start_udev command to enable creation of the persistently-named symbolic links.

Comments

Reference Links - RHEL6

Persistent Naming section in the Red Hat Enterprise Linux 6 Storage Administration Guide.
Configuring persistent storage in Red Hat Enterprise Linux 6 section in the Red Hat Enterprise Linux 6 Virtualization Guide.

Reference Links - RHEL5

Persistent Naming chapter in the Red Hat Enterprise Linux 5 Online Storage Reconfiguration Guide.
Configuring persistent storage in Red Hat Enterprise Linux 5 section in the Red Hat Enterprise Linux 5 Virtualization Guide.

Component

udev

Comments

Craig Campbell on Fri, 06/04/2010 - 07:46 Permalink

Seems by default WWN are used.

I have three tape drives per WWN, so am not getting all necessary nodes.

How can I modify so that Serial Numbers not WWN numbers are used for tape drives?

For example,

root@> inquire
scsidev@0.0.0:SPECTRA PYTHON 2000|Autochanger (Jukebox), /dev/sg0
S/N: 901F002454
ATNN=SPECTRA PYTHON 901F002454
WWNN=201F0090A5002454
scsidev@0.0.1:IBM ULTRIUM-TD4 97F9|Tape, /dev/nst0
S/N: 1011002454
ATNN=IBM ULTRIUM-TD4 1011002454
WWNN=201F0090A5002454
scsidev@0.0.2:IBM ULTRIUM-TD4 97F9|Tape, /dev/nst1
S/N: 1012002454
ATNN=IBM ULTRIUM-TD4 1012002454
WWNN=201F0090A5002454
scsidev@0.1.0:IBM ULTRIUM-TD4 97F9|Tape, /dev/nst2
S/N: 1014002454
ATNN=IBM ULTRIUM-TD4 1014002454
WWNN=201F0090A5002454
scsidev@0.2.0:IBM ULTRIUM-TD4 97F9|Tape, /dev/nst3
S/N: 1021002454
ATNN=IBM ULTRIUM-TD4 1021002454
WWNN=202F0090A5002454
scsidev@0.2.1:IBM ULTRIUM-TD4 97F9|Tape, /dev/nst4
S/N: 1022002454
ATNN=IBM ULTRIUM-TD4 1022002454
WWNN=202F0090A5002454

root# > ls -al /dev/tape/by-id/
total 0
drwxr-xr-x 2 root root 180 Jun 4 05:48 .
drwxr-xr-x 3 root root 60 Jun 4 05:36 ..
lrwxrwxrwx 1 root root 9 Jun 4 05:36 scsi-3201f0090a5002454 -> ../../sg0
lrwxrwxrwx 1 root root 10 Jun 4 05:48 scsi-3201f0090a5002454-nst -> ../../nst1
lrwxrwxrwx 1 root root 10 Jun 4 05:48 scsi-3202f0090a5002454-nst -> ../../nst5
lrwxrwxrwx 1 root root 10 Jun 4 05:48 scsi-3203f0090a5002454-nst -> ../../nst6
lrwxrwxrwx 1 root root 10 Jun 4 05:48 scsi-3204f0090a5002454-nst -> ../../nst9
lrwxrwxrwx 1 root root 11 Jun 4 05:48 scsi-3205f0090a5002454-nst -> ../../nst14
lrwxrwxrwx 1 root root 11 Jun 4 05:48 scsi-3206f0090a5002454-nst -> ../../nst16

I assume that the common WWN is the cause of the missing drive links. It seems to select a random drive from among those sharing a WWN, ignoring the serial number (S/N) and "ATNN" value.


My Advanced Linux/RedHat Knowledge 2012. 4. 2. 15:45

yum commands are segfaulting

Issue

When I try to run "yum update", the following error occurs:

# yum update
Loaded plugins: downloadonly, rhnplugin
rhel-i386-server | 1.4 kB 00:00
rhel-i386-server5/primary | 3.3 MB 00:01
Segmentation fault

Environment

Red Hat Enterprise Linux 5.5

Resolution

Move the /usr/local/lib/libz* files out of the way so it uses the Red Hat supplied libz* libraries:

mv /usr/local/lib/libz* /tmp

Root Cause

yum is using /usr/local/lib/libz.so.1 instead of the system libraries in /usr/lib and /lib.

Diagnostic Steps

1. Tried cleaning the yum cache:

# yum clean all
# yum clean metadata
# rm -rf /var/cache/yum/*
# rhn-profile-sync
# yum check-update

... but yum still segfaults

2. Customer is using 3rd party libraries:

$ cat etc/ld.so.conf
include ld.so.conf.d/*.conf
/usr/local/lib/
/usr/local/jpeg/lib/
/usr/local/freetype/lib/
/usr/local/gd/lib/
/usr/local/mysql/lib/
/export/sources/php-5.2.13/libs

3. Got an strace of yum:

# strace -ffvto trace_yumupdate.txt yum update
...
$ cat trace_yumupdate.txt.6566 | grep 'open.*/usr/local/lib' | grep -v ENOENT
15:21:40 open("/usr/local/lib/libz.so.1", O_RDONLY) = 6

... so yum is using the libz.so.1 library from /usr/local/lib instead of from /usr/lib or /lib
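A quicker first check than a full strace is to look at how the dynamic linker resolves libraries for a given binary. The filter below is a sketch under stated assumptions: the function name is hypothetical, it expects ldd output on stdin, and it treats /lib, /lib64, /usr/lib, and /usr/lib64 as the system library directories.

```sh
# Sketch: read ldd output on stdin and print every library that resolves
# to a path outside the system library directories.
libs_outside_system_dirs() {
    awk '$2 == "=>" && $3 ~ /^\// && $3 !~ /^\/(usr\/)?lib(64)?\// { print $1, $3 }'
}

# Example, for any binary or shared object of interest:
#   ldd /path/to/binary | libs_outside_system_dirs
```

Any line it prints, such as a libz.so.1 under /usr/local/lib, is a candidate for the kind of conflict described in this article.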


My Advanced Linux/RedHat Knowledge 2012. 3. 26. 15:43

How to update offline RHEL server without network connection to Red Hat Network/Proxy/Satellite.

last modified by Imogen Flood-Murphy on 01/05/12 - 07:45

Issue

A Red Hat Enterprise Linux server that is not connected to the Internet and has no access to an RHN Satellite or Proxy server needs to be updated.

Environment

Red Hat Enterprise Linux 6.x
Red Hat Enterprise Linux 5.x

Resolution

There is a server which is offline and has no connection to the Internet.

We also need a station (a laptop or virtual machine will do) that runs the same major Red Hat Enterprise Linux version as the server and is connected to the Red Hat Network/Proxy/Satellite.

Copy /var/lib/rpm from the offline server to the station connected to the Internet (you can use a USB drive or CD instead of the network):
```
scp -r /var/lib/rpm root@station:/tmp/
```
Install the download only plugin for yum and createrepo on the machine which is connected to the Internet (Red Hat Network):
```
yum install yum-downloadonly createrepo
yum clean all
```
Backup the original rpm directory on the station and replace it with the rpm directory from the "offline" server:
```
mv -v /var/lib/rpm /var/lib/rpm.orig
mv -v /tmp/rpm /var/lib/
```

Download the updates to /tmp/rpm_updates, then restore the station's original /var/lib/rpm:

mkdir -v /tmp/rpm_updates
yum update --downloadonly --downloaddir /tmp/rpm_updates
createrepo /tmp/rpm_updates
rm -rvf /var/lib/rpm
mv -v /var/lib/rpm.orig /var/lib/rpm

Transfer the downloaded rpms to the server and update:

scp -r /tmp/rpm_updates root@server:/tmp/
ssh root@server

cat > /etc/yum.repos.d/rhel-offline-updates.repo << \EOF
[rhel-offline-updates]
name=Red Hat Enterprise Linux $releasever - $basearch - Offline Updates Repository
baseurl=file:///tmp/rpm_updates
enabled=1
EOF

yum upgrade

…and the server is updated.

These updates are the same as if "yum update" had been executed on a station that had a connection to the Internet.


My Advanced Linux/RedHat Knowledge 2012. 3. 23. 17:44

How do you configure an ILO 3 fence device for RHEL Clustering?

last modified by Shane Bradley on 01/19/12 - 11:30

Issue

How do you configure an ILO 3 fence device for RHEL Clustering?

Environment

Red Hat Cluster Suite 4+
Red Hat Enterprise Linux 5 Advanced Platform (Clustering)
Red Hat Enterprise Linux Server 6 (with the High Availability Add on)

Resolution

Support for the iLO3 fence device has been added to the fence_ipmilan fence device in the following errata: http://rhn.redhat.com/errata/RHEA-2010-0876.html.

The iLO3 firmware should be a minimum of 1.15 as provided by HP.

On each cluster node, install the following OpenIPMI packages used for fencing:

$ yum install OpenIPMI OpenIPMI-tools

Stop and disable the 'acpid' daemon:

$ service acpid stop; chkconfig acpid off

Test ipmitool interaction with iLO3:

$ ipmitool -H <iloip> -I lanplus -U <ilousername> -P <ilopassword> chassis power status

The desired output is:

Chassis Power is on
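When the same test is repeated for several nodes, it helps to interpret the reply in a script. The following is a minimal sketch: the function name and the messages it prints are hypothetical, and it only inspects the first line of text it is given.

```sh
# Sketch: interpret the reply of "ipmitool ... chassis power status",
# read from stdin.
check_power_status() {
    read -r line
    case $line in
        "Chassis Power is on")  echo "iLO3 reachable, node powered on" ;;
        "Chassis Power is off") echo "iLO3 reachable, node powered off" ;;
        *)                      echo "unexpected reply: $line" ;;
    esac
}

# Example, using the same placeholders as the command above:
#   ipmitool -H <iloip> -I lanplus -U <ilousername> -P <ilopassword> chassis power status | check_power_status
```

Anything other than an on or off reply usually means the iLO3 address, credentials, or lanplus setting should be rechecked before fencing is configured.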
Edit the /etc/cluster/cluster.conf to add the fence device:

<?xml version="1.0"?>
<cluster alias="rh5nodesThree" config_version="32" name="rh5nodesThree">
    <fence_daemon clean_start="0" post_fail_delay="1" post_join_delay="3"/>
    <clusternodes>
        <clusternode name="rh5node1.examplerh.com" nodeid="1" votes="1">
            <fence>
                <method name="1">
                    <device domain="rh5node1" name="ilo3node1"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="rh5node2.examplerh.com" nodeid="2" votes="1">
            <fence>
                <method name="1">
                    <device domain="rh5node2" name="ilo3node2"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="rh5node3.examplerh.com" nodeid="3" votes="1">
            <fence>
                <method name="1">
                    <device domain="rh5node3" name="ilo3node3"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <cman expected_votes="3">
        <multicast addr="229.5.1.1"/>
    </cman>
    <fencedevices>
        <fencedevice agent="fence_ipmilan" power_wait="10" ipaddr="XX.XX.XX.XX" lanplus="1" login="username" name="ilo3node1" passwd="password"/>
        <fencedevice agent="fence_ipmilan" power_wait="10" ipaddr="XX.XX.XX.XX" lanplus="1" login="username" name="ilo3node2" passwd="password"/>
        <fencedevice agent="fence_ipmilan" power_wait="10" ipaddr="XX.XX.XX.XX" lanplus="1" login="username" name="ilo3node3" passwd="password"/>
    </fencedevices>
    <rm>
        <failoverdomains/>
        <resources/>
    </rm>
</cluster>

Test that fencing is successful. From node1 attempt to fence node2 as follows:

$ fence_node node2

For more information on fencing cluster nodes manually, see the following article: How do you manually call fencing agents from the commandline?

Component

cluster


My Advanced Linux/RedHat Knowledge 2012. 3. 15. 17:39

Why is the /proc/scsi/qla2xxx/ or /proc/scsi/lpfc/ directory missing in Red Hat Enterprise Linux 5 and what has replaced it?

last modified by Takayoshi Kimura on 02/14/12 - 02:36

Issue

The 2.6.11 Linux kernel introduced certain changes to the lpfc (Emulex driver) and qla2xxx (QLogic driver) Fibre Channel Host Bus Adapter (HBA) drivers which removed the following entries from the proc pseudo-filesystem: /proc/scsi/qla2xxx, /proc/scsi/lpfc. These entries had provided a centralized repository of information about the drivers and connected hardware. After the changes, the drivers started storing all this information within the /sys filesystem. Since Red Hat Enterprise Linux 5 uses version 2.6.18 of the Linux kernel, it is affected by this change.

Using the /sys filesystem has the advantage that all the Fibre Channel drivers now use a unified and consistent manner to report data. However it also means that the data previously available in a single file is now scattered across a myriad of files in different parts of the /sys filesystem.

One basic example is the status of a Fibre Channel HBA: checking this can now be accomplished with the following command:

# cat /sys/class/scsi_host/host#/state

where host# is the H-value in the HBTL SCSI addressing format, which references the appropriate Fibre Channel HBA. For Emulex adapters (lpfc driver), for example, this command would yield:

# cat /sys/class/scsi_host/host1/state
Link Up - Ready:
Fabric

For QLogic devices (qla2xxx driver) the output would instead be as follows:

# cat /sys/class/scsi_host/host1/state
Link Up - F_Port
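To check every adapter at once instead of one host# at a time, the same state file can be read in a loop. This is a rough sketch; show_hba_states is a hypothetical helper, and the base directory is a parameter only so that the function can be tried against a test tree.

```shell
# Sketch: print the state file of every SCSI host under a sysfs class directory.
# show_hba_states is a hypothetical helper name.
show_hba_states() {
    local base="${1:-/sys/class/scsi_host}" host
    for host in "$base"/host*; do
        [ -r "$host/state" ] || continue
        # Join multi-line output (as printed by lpfc) into a single line.
        echo "$(basename "$host"): $(tr -s '\n' ' ' < "$host/state" | sed 's/ *$//')"
    done
}
```

Called without arguments it walks /sys/class/scsi_host and prints one line per host.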

Environment

Red Hat Enterprise Linux 5

Resolution

Obviously it becomes quite impractical to search through the /sys filesystem for the relevant files when there is a large variety of Fibre Channel-related information of interest. Instead of manual searching, the systool (1) command provides a simple but powerful means of examining and analyzing this information. Detailed below are several commands which demonstrate samples of information which the systool command can be used to examine.

To examine some simple information about the Fibre Channel HBAs in a machine:

# systool -c fc_host -v

To look at verbose information regarding the SCSI adapters present on a system:

# systool -c scsi_host -v

To see what Fibre Channel devices are connected to the Fibre Channel HBA cards:

# systool -c fc_remote_ports -v -d

For Fibre Channel transport information:

# systool -c fc_transport -v

For information on SCSI disks connected to a system:

# systool -c scsi_disk -v

To examine more disk information including which hosts are connected to which disks:

# systool -b scsi -v

Furthermore, by installing the sg3_utils package it is possible to use the sg_map command to view more information about the SCSI map. After installing the package, run:

# modprobe sg
# sg_map -x

Finally, to obtain driver information, including version numbers and active parameters, the following commands can be used for the lpfc and qla2xxx drivers respectively:

# systool -m lpfc -v
# systool -m qla2xxx -v

ATTENTION: The syntax of the systool (1) command differs across versions of Red Hat Enterprise Linux. Therefore the commands above are only valid for Red Hat Enterprise Linux 5.
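When gathering data for a support case, the class and bus queries can be run in one pass and saved to files. The sketch below assumes the Red Hat Enterprise Linux 5 systool syntax; collect_fc_info and the output directory layout are hypothetical, and only the plain -v form of each query is used.

```shell
# Sketch: run the systool class queries in one pass, one output file per class.
# collect_fc_info and the default output directory are assumptions.
collect_fc_info() {
    local outdir="${1:-/tmp/fc-info}" class
    command -v systool >/dev/null 2>&1 || { echo "systool not found" >&2; return 1; }
    mkdir -p "$outdir" || return 1
    for class in fc_host scsi_host fc_remote_ports fc_transport scsi_disk; do
        systool -c "$class" -v > "$outdir/$class.txt" 2>&1
    done
    # Bus view: which hosts are connected to which disks.
    systool -b scsi -v > "$outdir/bus_scsi.txt" 2>&1
}
```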


My Advanced Linux/RedHat Knowledge 2012. 3. 13. 17:46

ServeRAID C100 Driver for Red Hat Enterprise Linux 6

Adapters Supported: ServeRAID C100 (81Y4475)

Kernels Supported:
------------------
megasr_14.05.0701.2011-1_rhel6.1_32.img
 - kernel-2.6.32-131.0.15.el6.i686

megasr_14.05.0701.2011-1_rhel6.1_64.img
 - kernel-2.6.32-131.0.15.el6.x86_64


(C) Copyright International Business Machines Corporation 1999, 2011. All 
rights reserved.  US Government Users Restricted Rights - Use, duplication, 
or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

Note: Before using this information and the product it supports, read the 
general information in "Notices and trademarks" in this document.


CONTENTS
________

1.0  Overview
2.0  Installation and setup instructions
     2.1 Working with driver image files to create driver installation media
     2.2 Network operating system installation instructions
     2.3 Troubleshooting tips
3.0  Configuration information 
4.0  Unattended mode
5.0  Web site and support phone number
6.0  Notices and trademarks
7.0  Disclaimer


1.0  Overview
_____________

  1.1    This update includes a new device driver for the ServeRAID C100 
         supporting Red Hat Enterprise Linux 6 (RHEL 6).  

  1.2    Limitations:
         - None

  1.3    Problems fixed:
         - See change history

  1.4    Level of Recommendations and Prerequisites for the update:
         - None

  1.5    Dependencies:
         - None

  1.6    Update Contents:
          o  ibm_dd_megasr_14.05.0701.2011_rhel6_32-64.zip
             - Driver update image
          o  ibm_dd_megasr_14.05.0701.2011_rhel6_32-64
             - Change history


2.0  Installation and setup instructions
________________________________________

  Use the following set of instructions to install the supported network 
  operating systems.

  2.1 Working with driver image files to create driver installation media
  -----------------------------------------------------------------------

  These driver images can be used to create a USB key, CD, DVD, or floppy disk 
  containing the driver formatted for use during the installation of the 
  operating system.
  
  1) Copy the .zip file to a temporary directory and extract it.
  
  2) Using the list of supported kernels at the top of this readme, determine 
     which set of .img files you will need for your installation.  Use these 
     files wherever 'the .img file' is referenced in this readme.
  
  3) Using the .img file from your set, create a driver update disk on a USB 
     key, CD, DVD, floppy or other media using the instructions below for your 
     media type.
  
     USB Key:
     --------
     There are two different partitioning methods for USB keys.  One of the 
     methods below will work and the other will not, depending on which way 
     your key is partitioned.  The easiest way to discover which is correct 
     for your key is to try the Quick Copy Method first.  If this method is 
     not correct you will receive a message stating that no driver could be 
     found on your media when you try to load the driver in step 3 below.  If 
     that occurs, you can use the Extraction Method and reinsert the key.  Use 
     the Back button on the installation screens to re-detect the key.  You 
     should not need to reboot or start the installation over.
     
     Quick Copy Method:  Copy the .img file to the root directory of the USB 
     key.  You do not need to remove other files from the key unless there is 
     less space than necessary for the two files.
     
     Extraction Method:  Use an img-to-media application (such as dd, rawrite,
     or emt4win, or ardi4usb) to extract the image to the key.  This method 
     will overwrite all data on the key, so you will need to remove all other 
     files before extracting to the key.  Follow the instructions that came 
     with your img-to-media application to correctly extract to your key.  
  
     All other media:
     ----------------
     Use an img-to-media application (such as dd, rawrite, emt4win, or 
     ardi4usb) to extract the image to the media.  This method will overwrite 
     all data on the media.  If you are using rewritable media, you will need 
     to remove all other files before performing the extraction.  Follow the 
     instructions that came with your img-to-media application to correctly 
     extract to your media.
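On a Linux workstation, the Extraction Method can be sketched with dd. This is an illustration only: write_driver_image is a hypothetical helper, and the target device name is an assumption that must be confirmed before running, because the raw copy overwrites the target.

```shell
# Sketch: raw-copy a driver .img file onto a target device with dd.
# write_driver_image is a hypothetical helper; verify the target first.
write_driver_image() {
    local img="$1" target="$2"
    [ -r "$img" ] || { echo "cannot read image: $img" >&2; return 1; }
    [ -n "$target" ] || { echo "no target given" >&2; return 1; }
    # Everything on the target is overwritten.
    dd if="$img" of="$target" bs=1024k 2>/dev/null && sync
}
```

For example, write_driver_image megasr_14.05.0701.2011-1_rhel6.1_64.img /dev/sdX, where /dev/sdX stands for the USB key as identified on the local system.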

  2.2 Network operating system installation instructions
  ------------------------------------------------------

  Follow these instructions to add the ServeRAID C100 for System x driver 
  during the installation of RHEL 6.

  -----------------------------------------------------------------------------
  For Legacy Installations:

  Install instructions support the following NOS's:
    - Red Hat Enterprise Linux 6.1 Server Edition
         Driver Media:
            megasr_14.05.0701.2011-1_rhel6.1_32.img

    - Red Hat Enterprise Linux 6.1 Server x64 Edition
         Driver Media:
            megasr_14.05.0701.2011-1_rhel6.1_64.img

  Server Preparations:
  - Enable ServeRAID C100 (Software RAID) in F1 Setup and create a RAID volume 
    per the User Guide instructions.
  - For 64-bit versions, configure the "Legacy Only" boot option within F1 
    setup | Boot Manager.

  Installation Procedure:
  1.  Create MEGASR driver diskette or USB Key and attach the device to the 
      server.
  2.  Boot to RHEL 6 installation media to begin install.
  3.  At the "Welcome to RHEL 6" screen, highlight "Install or upgrade an  
      existing system" then press "Tab" to edit the boot options.
  4.  Add the following boot parameters to the end of the existing 
      line using either of the following two sets of parameters:

        linux dd blacklist=ahci

        -or-

        linux dd noprobe=ata1 noprobe=ata2 noprobe=ata3 noprobe=ata4

      Press "Enter" to start the install.

  5.  When prompted, choose "Yes" to having a driver disk.
  6.  Select the device (diskette or USB key) for the MEGASR driver location.
  7.  Install any additional drivers or cancel to continue.
  8.  On the next screen, either verify the media or skip the media test as 
      prompted.
  9.  The graphic portion of the installation will begin.  Continue the 
      installation following the screens through to completion.

  -----------------------------------------------------------------------------
  For native uEFI installations:

  Follow these instructions to add the ServeRAID C100 for System x driver 
  during the installation of RHEL 6.

  Install instructions support the following NOS's:
    - Red Hat Enterprise Linux 6.1 Server x64 Edition
         Driver Media:
            megasr_14.05.0701.2011-1_rhel6.1_64.img

  Server Preparations:
  - Enable ServeRAID C100 (Software RAID) in F1 Setup and create a RAID volume 
    per the User Guide instructions.
  - Ensure the "Legacy Only" boot option within F1 setup | Boot Manager is 
    removed.

  Installation Procedure:
  1.  Create MEGASR driver diskette or USB Key and attach the device to the 
      server.
  2.  Boot to RHEL 6 installation media to begin install.
  3.  When prompted with "Booting Red Hat Enterprise Linux 6.1 in seconds...",
      press any key.
  4.  From the GNU GRUB menu, edit "Red Hat Enterprise Linux 6.1" and add either 
      of the following two sets of parameters to the end of the line:

        linux dd blacklist=ahci  

        -or-

        linux dd noprobe=ata1 noprobe=ata2 noprobe=ata3 noprobe=ata4

      Press "Enter" to save the changes and press "b" to boot with the new 
      options.

  5.  When prompted, choose "Yes" to having a driver disk.
  6.  Select the device (diskette or USB key) for the MEGASR driver location.
  7.  Install any additional drivers or cancel to continue.
  8.  On the next screen, either verify the media or skip the media test as 
      prompted.
  9.  The graphic portion of the installation will begin.  Continue the 
      installation following the screens through to completion.


  2.3 Troubleshooting tips
  ------------------------
    None


3.0  Configuration information
______________________________
		
  For detailed setup instructions for your controller, refer to the 
  ServeRAID C100 User's Guide.


4.0  Unattended Mode
____________________

  Not supported.


5.0 Web Sites and Support Phone Number
______________________________________

  o  You can find support and downloads for IBM products from the IBM Support 
     Web site:

     http://www.ibm.com/support/
     
     You can find support and downloads specific to disk controllers by 
     searching for the "Disk Controller and RAID Software Matrix" from the 
     main support page.

  o  For the latest compatibility information, see the IBM ServerProven Web 
     site:

     http://www-03.ibm.com/servers/eserver/serverproven/compat/us/

  o  With the original purchase of an IBM hardware product, you have access 
     to extensive support coverage.  During the IBM hardware product warranty 
     period, you may call the IBM HelpCenter (1-800-IBM-SERV in the U.S.) 
     for hardware product assistance covered under the terms of the 
     IBM hardware warranty.


6.0 Trademarks and Notices
__________________________

  This product may contain program code or packages ("code") licensed by third 
  parties, as well as code licensed by IBM.   For non-IBM Code, the third 
  parties, not IBM, are the licensors.  Your use of the non-IBM code is 
  governed by the terms of the license accompanying that code, as identified 
  in the attached files.  You acknowledge that you have read and agree to the 
  license agreements contained in these files. If you do not agree to the 
  terms of these third party license agreements, you may not use the 
  accompanying code.

  IBM and ServeRAID are trademarks or registered trademarks of International 
  Business Machines Corporation in the United States and other countries.

  LSI and MegaRAID are trademarks or registered trademarks of LSI Logic, Corp 
  in the United States and other countries.

  Linux is a registered trademark of Linus Torvalds in the United States and 
  other countries.

  Other company, product, and service names may be trademarks or service marks 
  of others.


7.0 Disclaimer
______________

  THIS DOCUMENT IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND.
  IBM DISCLAIMS ALL WARRANTIES, WHETHER EXPRESS OR IMPLIED,
  INCLUDING WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF FITNESS
  FOR A PARTICULAR PURPOSE AND MERCHANTABILITY WITH RESPECT TO THE
  INFORMATION IN THIS DOCUMENT.  BY FURNISHING THIS DOCUMENT, IBM
  GRANTS NO LICENSES TO ANY PATENTS OR COPYRIGHTS.

  Note to U.S. Government Users -- Documentation related to
  restricted rights -- Use, duplication or disclosure is subject
  to restrictions set forth in GSA ADP Schedule Contract with
  IBM Corporation.


My Advanced Linux/RedHat Knowledge 2012. 3. 12. 16:03

What is the SysRq facility and how do I use it?

last modified by Ray Dassen on 08/13/11 - 04:57

Issue

What is the SysRq facility and how do I use it?

Environment

Red Hat Enterprise Linux 3, 4, 5, and 6

Resolution

What is the "Magic" SysRq key?

According to the Linux kernel documentation:

It is a 'magical' key combo you can hit which the kernel will respond to regardless of whatever else it is doing, unless it is completely locked up.

The sysrq key is one of the best (and sometimes the only) ways to determine what a machine is really doing. It is useful when a system appears to be "hung" or for diagnosing elusive, transient, kernel-related problems.

How do I enable and disable the SysRq key?

For security reasons, Red Hat Enterprise Linux disables the SysRq key by default. To enable it, run:

# echo 1 > /proc/sys/kernel/sysrq

To disable it:

# echo 0 > /proc/sys/kernel/sysrq

To enable it permanently, set the kernel.sysrq value in /etc/sysctl.conf to 1. That will cause it to be enabled on reboot.

# grep sysrq /etc/sysctl.conf
kernel.sysrq = 1
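The permanent setting can also be scripted. The sketch below replaces an existing entry or appends a new one; set_sysctl_key is a hypothetical helper, and the key is used as a simple pattern, so the dot in kernel.sysrq is matched loosely.

```shell
# Sketch: set a key in a sysctl.conf-style file, replacing an existing entry
# or appending a new one. set_sysctl_key is a hypothetical helper.
set_sysctl_key() {
    local key="$1" value="$2" file="${3:-/etc/sysctl.conf}"
    if grep -q "^[[:space:]]*$key[[:space:]]*=" "$file" 2>/dev/null; then
        # Rewrite the existing line in place; a .bak copy is kept.
        sed -i.bak "s|^[[:space:]]*$key[[:space:]]*=.*|$key = $value|" "$file"
    else
        echo "$key = $value" >> "$file"
    fi
}
```

After set_sysctl_key kernel.sysrq 1, running sysctl -p applies the file without a reboot.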

Since enabling sysrq gives someone with physical console access extra abilities, it is recommended to disable it when not troubleshooting a problem or to ensure that physical console access is properly secured.

How do I trigger a sysrq event?

There are several ways to trigger a sysrq event. On a normal system, with an AT keyboard, sysrq events can be triggered from the console with the following key combination:

Alt+PrintScreen+[CommandKey]

For instance, to tell the kernel to dump memory info (command key "m"), you would hold down the Alt and Print Screen keys, and then hit the m key.

Note that this will not work from an X Window System screen. You should first change to a text virtual terminal. Hit Ctrl+Alt+F1 to switch to the first virtual console prior to hitting the sysrq key combination.

On a serial console, you can achieve the same effect by sending a Break signal to the console and then hitting the command key within 5 seconds. This also works for virtual serial console access through an out-of-band service processor or remote console like HP iLO, Sun ILOM and IBM RSA. Refer to service processor specific documentation for details on how to send a Break signal; for example, How to trigger SysRq over an HP iLO Virtual Serial Port (VSP).

If you have a root shell on the machine (and the system is responding enough for you to do so), you can also write the command key character to the /proc/sysrq-trigger file. This is useful for triggering this info when you are not on the system console or for triggering it from scripts.

# echo 'm' > /proc/sysrq-trigger

When I trigger a sysrq event that generates output, where does it go?

When a sysrq command is triggered, the kernel will print out the information to the kernel ring buffer and to the system console. This information is normally logged via syslog to /var/log/messages.

Unfortunately, when dealing with machines that are extremely unresponsive, syslogd is often unable to log these events. In these situations, provisioning a serial console is often recommended for collecting the data.

What sort of sysrq events can be triggered?

There are several sysrq events that can be triggered once the sysrq facility is enabled. These vary somewhat between kernel versions, but there are a few that are commonly used:

m - dump information about memory allocation
t - dump thread state information
p - dump current CPU registers and flags
c - intentionally crash the system (useful for forcing a disk or netdump)
s - immediately sync all mounted filesystems
u - immediately remount all filesystems read-only
b - immediately reboot the machine
o - immediately power off the machine (if configured and supported)
f - start the Out Of Memory Killer (OOM)
w - dump tasks that are in uninterruptible (blocked) state
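When triggering events from scripts, a small wrapper can reject keys outside the list above before anything is written, which guards against typing b or c by accident in the wrong place. This is a sketch under that assumption; sysrq_send is a hypothetical helper.

```shell
# Sketch: write a SysRq command key to the trigger file after checking it
# against the commonly used keys. sysrq_send is a hypothetical helper.
sysrq_send() {
    local key="$1" trigger="${2:-/proc/sysrq-trigger}"
    case "$key" in
        m|t|p|c|s|u|b|o|f|w) ;;
        *) echo "unsupported SysRq key: $key" >&2; return 1 ;;
    esac
    echo "$key" > "$trigger"
}
```

For example, sysrq_send t followed by dmesg shows the thread state dump in the kernel ring buffer.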


My Advanced Linux/RedHat Knowledge 2012. 3. 8. 10:06

Offline upgrading to Red Hat Enterprise Linux 4.9

last modified by Andrius Benokraitis on 10/04/11 - 12:21

NOTE: The following information has been provided by Red Hat, but is outside the scope of our posted Service Level Agreements (https://www.redhat.com/support/service/sla/ ) and support procedures. The information is provided as-is and any configuration settings or installed applications made from the information in this article could make your Operating System unsupported by Red Hat Support Services. The intent of this article is to provide you with information to accomplish your system needs. Use the information in this article at your own risk.

Issue

Red Hat Network (RHN) does not contain Red Hat Enterprise Linux 4.9 installation ISOs.[1]

Environment

Red Hat Enterprise Linux 4.8 without access to Red Hat Network

Resolution

Create a Reference System that connects to Red Hat Network and downloads the latest RHEL 4 packages. Those downloaded packages are then used to upgrade the Target System from Red Hat Enterprise Linux 4.8 to Red Hat Enterprise Linux 4.9 without connecting to Red Hat Network.
Reference System: Red Hat Enterprise Linux 4.8 installed and connected to Red Hat Network
Target System: Red Hat Enterprise Linux 4.8 installed but not connected to Red Hat Network
It is assumed that the Reference System is identical or similar to the Target System, including architecture type. If they cannot be similar, it is recommended that the Reference System be an @everything installation to minimize missed package updates.

Reference System Setup

Issue the following commands as root user on the Reference System after installing a base Red Hat Enterprise Linux 4.8 system from Red Hat Network.
Ensure there are no previously downloaded RPMs on the system:

rm -rf /var/spool/up2date/*

Download all available updates (including those on the "skip" list) from RHN and store them in /var/spool/up2date:

up2date -u -v -d -f

Transfer the downloaded packages to an empty mounted device for later use on the Target System:

cp /var/spool/up2date/*.rpm /media/flash_drive

Target System Setup

Perform the following actions as root user on the Target System after completing the previous steps with the Reference System.
Mount the device containing the updated packages.
Edit the /etc/sysconfig/rhn/sources file with the following:

...
#up2date default
dir rhel49 /media/flash_drive
...

By commenting out the default entry and adding the dir line, the default search location is disabled and replaced with the locally mounted device.
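The same edit to the sources file can be sketched as a small function. use_local_up2date_dir is a hypothetical helper; it assumes the active entry reads "up2date default" at the start of a line.

```shell
# Sketch: disable the "up2date default" source and add a local directory source.
# use_local_up2date_dir is a hypothetical helper.
use_local_up2date_dir() {
    local label="$1" dir="$2" file="${3:-/etc/sysconfig/rhn/sources}"
    # Comment out an active "up2date default" line; a .bak copy is kept.
    sed -i.bak 's/^[[:space:]]*up2date[[:space:]][[:space:]]*default/#up2date default/' "$file" || return 1
    # Add the dir entry only once.
    grep -q "^dir $label " "$file" || echo "dir $label $dir" >> "$file"
}
```

For the values used in this article: use_local_up2date_dir rhel49 /media/flash_drive.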

Import the RPM GPG key:

rpm --import /usr/share/rhn/RPM-GPG-KEY

Update all packages (including the kernel) on the Target System:

up2date -uf

Reboot the system.

[1] The Red Hat Enterprise Linux 4 Life Cycle entered Production 3 Phase on 16-Feb-2011 with the release of Red Hat Enterprise Linux 4.9. No new features, hardware support or updated installation images (ISOs) are released during the Production 3 phase. Refer to the Red Hat Enterprise Linux Support Policy for details on the life cycle of Red Hat Enterprise Linux releases.


My Advanced Linux/Advanced Linux 2012. 2. 27. 13:42

PXE boot

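1. What is PXE boot?

The Pre-boot eXecution Environment, or simply PXE, is an environment that lets a computer boot through its network interface.

2. PXE components

Most servers today ship with a PXE-capable network card, so PXE is useful when a server has no DVD-ROM drive or does not recognize a bootable USB drive.

- PXE Server - exchanges configuration information, including the boot image file.

- TFTP Server - transfers the boot image file.

- PXE Client - requires a PXE-capable network card (fitted to most products released since 2000).

3. Configuring TFTP

Download: http://tftpd32.jounin.net/

Default startup screen

GLOBAL settings screen

At a minimum, the TFTP Server and DHCP Server services must be enabled.

TFTP is the protocol that transfers the boot image; keep in mind that it is not actually an FTP server.

The DHCP server assigns the client an IP address so that it can PXE boot.

TFTP settings

The TFTP settings can be left at their defaults.

TFTP uses UDP port 69 by default.

If the server being configured has IP addresses on several subnets, choose the one to use in the "Bind TFTP to this address" field.

→ This makes IP assignment a little faster.

DHCP settings screen

The DHCP settings are the most important part.

They differ little from a DHCP configuration on Linux, but the key item here is the pxelinux.0 file setting.

This file is included in the syslinux package on Linux. Download the matching version in advance.

* CAUTION

The pxelinux.0 and menu.c32 files from RHEL 5 cannot PXE boot RHEL 6! Be sure to have the newer pxelinux.0 and menu.c32 files from RHEL 6 on hand.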

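4. Directory layout

Extracting the TFTP archive yields only a handful of files. It cannot be used as-is; the files and directories below must be created under that folder.

Required directory

pxelinux.cfg : a directory that plays the same role as the syslinux.cfg file; it must contain a file named default. (The format is identical to syslinux.cfg, so a well-built syslinux.cfg file can simply be renamed and used.)

Required files

pxelinux.0 : the boot loader file

menu.c32 : the file that displays the menu at install time

Other files and directories

ks : holds the kickstart files

vesamenu.c32 : the file for a graphical menu

rhel5.X : the extracted contents of the RHEL 5.X ISO

rhel6.X : the extracted contents of the RHEL 6.X ISO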
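The layout above can be sketched as a script that builds the tree and writes a minimal pxelinux.cfg/default. Everything specific here is an assumption for illustration: make_pxe_tree is a hypothetical helper, and the label name, timeout, and the images/pxeboot paths under rhel6.X should be adjusted to the ISO actually extracted.

```shell
# Sketch: create the TFTP root layout with a minimal pxelinux.cfg/default.
# make_pxe_tree, the label name and the kernel/initrd paths are assumptions.
make_pxe_tree() {
    local root="$1" f
    mkdir -p "$root/pxelinux.cfg" "$root/ks" "$root/rhel5.X" "$root/rhel6.X" || return 1
    cat > "$root/pxelinux.cfg/default" <<'EOF'
default menu.c32
prompt 0
timeout 100

label rhel6
  menu label Install RHEL 6.X
  kernel rhel6.X/images/pxeboot/vmlinuz
  append initrd=rhel6.X/images/pxeboot/initrd.img
EOF
    # pxelinux.0 and menu.c32 come from the syslinux package and are not created here.
    for f in pxelinux.0 menu.c32; do
        [ -e "$root/$f" ] || echo "still missing: $f"
    done
}
```

The function only reports the two syslinux files as missing; they still have to be copied into the TFTP root by hand.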

5. Reference URLs

http://tftpd32.jounin.net/tftpd32_download.html

http://www.syslinux.org/wiki/index.php/PXELINUX


My Advanced Linux/Advanced Linux 2012. 2. 24. 09:14

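Creating a Red Hat installation USB drive (ISO to USB for RHEL)

Now and then, while going around installing servers, you run into one with no DVD-ROM drive.
Even when that is not the case, you may go out on an incident call and end up having to reinstall the OS. In such cases you can easily build an installation USB drive from an ISO file.

Previous post: 2012/02/21 - [My Advanced Linux/Advanced Linux] - How do I create a bootable USB pen drive to start a Red Hat Enterprise Linux installation?

However, there is also an open-source tool that makes this easy on Windows, so I am introducing it here.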

http://iso2usb.sourceforge.net/

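There are several tools that turn an ISO into a bootable USB drive, but for RHEL and CentOS I recommend this one.
(It is built on UNetbootin, so the interface is practically identical.)

1. In Diskimage, select the ISO image for the 5.x/6.x release.

2. In Type, select USB and choose the Drive.

3. Click OK.

It finishes sooner than you might expect.
No further work is needed: plug in the USB drive and the installer looks the same as when installing from the DVD.

One more tip: by editing syslinux.cfg you can put several installation images on the same drive.
Editing syslinux.cfg is easy to look up, so I will not cover it here. Combine this with a kickstart file and an OS install takes a single click after inserting the USB drive.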


My Advanced Linux/RedHat Knowledge 2012. 2. 21. 09:39

How do I create a bootable USB pen drive to start a Red Hat Enterprise Linux installation?

last modified by Raghu Udiyar on 12/09/11 - 15:24

Release found: Red Hat Enterprise Linux 5

Problem

You need to install Red Hat Enterprise Linux on a server which does not have a floppy drive or CD-ROM drive, but which does have a USB port.

Assumptions

Your network environment is not set up to allow Red Hat Enterprise Linux to be installed completely from the network (through PXE boot). If it is, please make use of this option, as it is more straightforward than the procedure documented here.
Your network environment is configured to provide the contents of the Red Hat Enterprise Linux DVDs through a protocol supported by the Red Hat Enterprise Linux installer, such as NFS or FTP.
The server's BIOS supports booting from a USB mass storage device like a flash/pen drive.

Solution

The following steps configure a USB pen drive as a boot medium to start the installation of Red Hat Enterprise Linux.

Attach the USB pen drive to a system which is already running Red Hat Enterprise Linux.
Run dmesg:

dmesg

From the dmesg output, identify the device name under which the drive is known to the system.

Sample messages for a 1 GB flash disk being recognized as /dev/sdb:

Initializing USB Mass Storage driver...
scsi2 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 5
usb-storage: waiting for device to settle before scanning
usbcore: registered new driver usb-storage
USB Mass Storage support registered.
Vendor: USB 2.0 Model: Flash Disk Rev: 5.00
Type: Direct-Access ANSI SCSI revision: 02
SCSI device sdb: 2043904 512-byte hdwr sectors (1046 MB)
sdb: Write Protect is off
sdb: Mode Sense: 0b 00 00 08
sdb: assuming drive cache: write through
SCSI device sdb: 2043904 512-byte hdwr sectors (1046 MB)
sdb: Write Protect is off
sdb: Mode Sense: 0b 00 00 08
sdb: assuming drive cache: write through
sdb: sdb1
sd 2:0:0:0: Attached scsi removable disk sdb
sd 2:0:0:0: Attached scsi generic sg1 type 0
usb-storage: device scan complete
Note: For the remainder of this article, we will assume this device name to be /dev/sdb. Make sure you adjust the device references in the following steps as per your local situation.
At this point, the flash drive is likely to have been automatically mounted by the system. Make sure the flash drive is unmounted. E.g. in nautilus, by right-clicking on the icon for the drive and selecting Unmount Volume.
Use fdisk to partition the flash drive as follows:
- There is a single partition.
- This partition is numbered as 1.
- Its partition type is set to 'b' (W95 FAT32).
- It is tagged as bootable.
Format the partition created in the previous step as FAT:

mkdosfs /dev/sdb1
Mount the partition:

mount /dev/sdb1 /mnt
Copy the contents of /RedHat/isolinux/ from the first installation CD/DVD onto the flash drive, i.e. to /mnt.

Note: the files isolinux.bin, boot.cat and TRANS.TBL are not needed and can thus be removed or deleted.
Rename the configuration file:

cd /mnt/; mv isolinux.cfg syslinux.cfg
Copy the installer's initial RAM disk /RedHat/images/pxeboot/initrd.img from the first installation CD/DVD onto the flash drive, i.e. to /mnt.
Optional step: To configure any boot settings, edit the syslinux.cfg on the USB flash drive. For example to configure the installation to use a kickstart file shared over NFS, specify the following:

linux ks=nfs:://ks.cfg
Unmount the flash drive:

umount /dev/sdb1
Make the USB flash drive bootable. The flash drive must be unmounted for this to work properly.

syslinux /dev/sdb1
Mount the flash drive again:

mount /dev/sdb1 /mnt
Install GRUB on the USB flash drive:

grub-install --root-directory=/mnt /dev/sdb
Verify that the USB flash drive has a /boot/grub directory. If it does not, create the directory manually.

cd /mnt

mkdir -p boot/grub
Create the grub.conf file. Below is a sample grub.conf:

default=0
timeout=5
root (hd1,0)
title Red Hat Enterprise Linux installer
kernel /vmlinuz
initrd /initrd.img
Copy or confirm the created grub.conf file is on the /boot/grub/ directory of the USB flash drive.
Unmount the flash drive:

umount /dev/sdb1
At this point, the USB disk should be bootable.
Attach the USB disk to the system you wish to install Red Hat Enterprise Linux on.
Boot from the USB disk. Refer to the hardware vendor's BIOS documentation for details on changing the order in which devices are checked for booting from.
Once you are booted in the Red Hat Enterprise Linux installer, continue with your network installation of choice.
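The file-preparation part of the procedure above can be sketched as one function. It deliberately leaves out partitioning, syslinux and grub-install. prepare_usb_files is a hypothetical helper; src is assumed to be a directory containing the isolinux and images/pxeboot subdirectories of the first installation disc, and mnt the mounted flash drive.

```shell
# Sketch of the file-preparation steps only (no partitioning, syslinux or
# grub-install). prepare_usb_files and its two arguments are assumptions.
prepare_usb_files() {
    local src="$1" mnt="$2"
    cp "$src"/isolinux/* "$mnt"/ || return 1
    # These files are not needed on the flash drive.
    rm -f "$mnt/isolinux.bin" "$mnt/boot.cat" "$mnt/TRANS.TBL"
    mv "$mnt/isolinux.cfg" "$mnt/syslinux.cfg" || return 1
    cp "$src/images/pxeboot/initrd.img" "$mnt"/ || return 1
    mkdir -p "$mnt/boot/grub" || return 1
    cat > "$mnt/boot/grub/grub.conf" <<'EOF'
default=0
timeout=5
root (hd1,0)
title Red Hat Enterprise Linux installer
kernel /vmlinuz
initrd /initrd.img
EOF
}
```

The root (hd1,0) line is taken from the sample grub.conf in this article and may need adjusting to how the target BIOS enumerates the USB disk.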


My Advanced Linux/Bash shell scripts 2012. 2. 13. 13:48

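RHEL 5 bonding automation script

I got tired of configuring bonding by hand every time and was about to write a script, but a quick search turned up a good one, so here it is!
The source is noted at the bottom; I made a few small modifications.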

bond.sh

#!/bin/bash

# This script creates bonding interfaces on RHEL 5.

# The first and second parameters are used to specify the enslaved interfaces.

# The third parameter is used to describe the name of the bonding interface.

# The network configuration is collected from the first device.

# After running the script please verify the /etc/modprobe.conf file as well as

# all /etc/sysconfig/network-scripts/ifcfg* files!

# LICENSE INFORMATION

# This software is released under the BSD license:

# Redistribution and use in source and binary forms, with or without modification,

# are permitted provided that the following conditions are met:

# 1. Redistributions of source code must retain the above copyright notice, this list

# of conditions and the following disclaimer.

# 2. Redistributions in binary form must reproduce the above copyright notice, this

# list of conditions and the following disclaimer in the documentation and/or other

# materials provided with the distribution.

# 3. The name of the author may not be used to endorse or promote products derived

# from this software without specific prior written permission.

# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,

# BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A

# PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY

# DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES

# (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;

# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY

# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT

# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS

# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

# Global variables

SCRIPTNAME=$(basename $0 .sh)

EXIT_SUCCESS=0

EXIT_FAILED=1

EXIT_ERROR=2

EXIT_BUG=10

VERSION="1.0.0"

# Base functions

# This function displays the basic usage

function usage {

echo "Usage: $SCRIPTNAME <first slave interface> <second slave interface> <bonding interface> <ipaddress> <netmask> <gateway>" >&2

echo "This script bonds two network interfaces on RHEL 5 with the static network config from the first slave interface." >&2

echo >&2

echo "e.g. # ./$SCRIPTNAME eth1 eth3 bond1 192.168.5.250 255.255.255.0 192.168.5.1" >&2

echo >&2

[[ $# -eq 1 ]] && exit $1 || exit $EXIT_FAILED

}

# This function checks that the command is run with the right parameters

function preflightcheck {

# This script needs to be run as root.

if [ $(id -u) -ne 0 ]; then

echo "You need to be root to run this script."

exit $EXIT_FAILED

# Check if we have exactly 3 commandline parameters.

if [ $# -ne 6 ]; then

echo "Commandline parameter is missing. (only $# present)."

usage

exit $EXIT_FAILED

# Check if the first input is correct.

if ! echo $1|grep -q "^eth[0-9]$"; then

echo "The first parameter needs to be an ethernet device (e.g. eth1)."

usage

exit $EXIT_FAILED

# Check if the second input is correct.

if ! echo $2|grep -q "^eth[0-9]$"; then

echo "The second parameter needs to be an ethernet device (e.g. eth3)."

usage

exit $EXIT_FAILED

fi

# Check if the third input is correct.

if ! echo $3|grep -q "^bond[0-9]$"; then

echo "The third parameter needs to be a bonding device (e.g. bond3)."

usage

exit $EXIT_FAILED

fi

}

# The main function that creates the bonding devices.

function rh5mkbond {

# Register the bonding kernel module alias for the bond device in /etc/modprobe.conf.
# The mode (mode=0, balance-rr) and mii link monitoring (100 ms) are set via BONDING_OPTS below.

cp /etc/modprobe.conf /tmp/modprobe.conf.bonding

cat >> /tmp/modprobe.conf.bonding <<EOF

alias $3 bonding

EOF

cat /tmp/modprobe.conf.bonding|uniq > /etc/modprobe.conf

# Get interface details

#IP=$(/sbin/ifconfig $1|egrep -o "([0-9]{1,3}\.){3}[0-9]{1,3}"|sed -n "1p")

#NETMASK=$(/sbin/ifconfig $1|egrep -o "([0-9]{1,3}\.){3}[0-9]{1,3}"|sed -n "3p")

MACIF1=$(/sbin/ifconfig $1|egrep -o "([[:xdigit:]]{2}[:]){5}[[:xdigit:]]{2}")

MACIF2=$(/sbin/ifconfig $2|egrep -o "([[:xdigit:]]{2}[:]){5}[[:xdigit:]]{2}")

# Create the bond device file (ifcfg-$3).

mv /etc/sysconfig/network-scripts/ifcfg-$3 /etc/sysconfig/network-scripts/ifcfg-$3.orig 2>/dev/null

cat >> /etc/sysconfig/network-scripts/ifcfg-$3 <<BOND

DEVICE=$3

BOOTPROTO=none

IPADDR=$4

NETMASK=$5

GATEWAY=$6

USERCTL=no

BONDING_OPTS="mode=0 miimon=100"

BOND

# Create the slave device files.

for i in $1 $2; do

mv /etc/sysconfig/network-scripts/ifcfg-$i /etc/sysconfig/network-scripts/ifcfg-$i.orig 2>/dev/null

cat >> /etc/sysconfig/network-scripts/ifcfg-$i <<IFS

DEVICE=$i

BOOTPROTO=none

HWADDR=$(/sbin/ifconfig $i|egrep -o "([[:xdigit:]]{2}[:]){5}[[:xdigit:]]{2}")

ONBOOT=yes

MASTER=$3

SLAVE=yes

USERCTL=no

IFS

done

}

# Call functions

preflightcheck $1 $2 $3 $4 $5 $6

rh5mkbond $1 $2 $3 $4 $5 $6

# End script

exit $EXIT_SUCCESS

Source: http://www.mindtwist.de/main/linux/11-red-hat/23-how-to-automatically-create-a-bonding-configuration-on-rhel-5.html


My Advanced Linux/Bash shell scripts 2012. 2. 13. 11:22

Cleaning up daemons after a Linux install

uzoochk.sh

Cleaning up daemons after every Linux install was a chore, so I wrote this script, which is almost too embarrassing to call a shell script -_-;;

#!/bin/bash

# created by uzoogom at 2012.2.13

export LC_ALL=C

RED='\e[1;31m'

GREEN='\e[1;32m'

YELLOW='\e[1;33m'

BLUE='\e[1;34m'

NC='\e[0m'

chkall=$(chkconfig --list | egrep "(on|off)" | awk '{print $1}')

chkon=$(cat chklist.txt | egrep -v "^#")

# STOP ALL DAEMONS

echo -e "ALL DAEMON STOP ========================================================="

for chkalloff in $chkall; do

chkconfig --level 2345 $chkalloff off

service $chkalloff stop 1>/dev/null

echo -e "$chkalloff is ${RED}OFF${NC}"

done

echo "Done====================================================================="

echo ""

# START SELECTED DAEMONS

echo -e "SELECT DAEMON START ====================================================="

for chkselect in $chkon; do

chkconfig --level 2345 $chkselect on

service $chkselect start 1>/dev/null

echo -e "$chkselect is ${GREEN}ON${NC}"

done

echo "Done====================================================================="

echo ""

# selinux disabled

echo "selinux status==========================================================="

setenforce 0 1>/dev/null

perl -pi -e 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config

grep "SELINUX=" /etc/selinux/config | egrep -v "^#"


My Advanced Linux/Advanced Linux 2012. 2. 6. 11:04

How to configure bonding on RHEL 4, 5, and 6

Configuring bonded devices on Red Hat Enterprise Linux 6

Configuring bonded devices

Single bonded device

For the detailed manual of bonding configuration on RHEL6, please refer to,

Note : Ensure that Network Manager is not running on the system, as NM does not support bonding :

service NetworkManager stop

chkconfig NetworkManager off

To configure the bond0 device with the network interface eth0 and eth1, perform the following steps:

1. As root, create a new file named bonding.conf in the /etc/modprobe.d/ directory. Insert the following line in this new file:

alias bond0 bonding

2. Create the channel bonding interface file ifcfg-bond0 in the /etc/sysconfig/network-scripts/ directory:

# cat /etc/sysconfig/network-scripts/ifcfg-bond0

DEVICE=bond0

IPADDR=192.168.50.111

NETMASK=255.255.255.0

USERCTL=no

BOOTPROTO=none

ONBOOT=yes

BONDING_OPTS="mode=0 miimon=100"

Note:

Configure the bonding parameters in the file /etc/sysconfig/network-scripts/ifcfg-bond0, as above: BONDING_OPTS="mode=0 miimon=100".
The behavior of the bonded interfaces depends on the mode. Mode 0 (balance-rr, round-robin) is the default; it transmits packets over the slaves in sequential order. For more information about the bonding modes, refer to The bonding modes supported in Red Hat Enterprise Linux.

3. Configure the ethernet interface in the file /etc/sysconfig/network-scripts/ifcfg-eth0. Both eth0 and eth1 should look like the following example:

DEVICE=eth<N>

BOOTPROTO=none

HWADDR=54:52:00:26:90:fc

ONBOOT=yes

MASTER=bond0

SLAVE=yes

USERCTL=no

Note:

Replace <N> with the numerical value for the interface, such as 0 and 1 in this example. Replace the HWADDR value with the MAC for the interface.
Red Hat suggests configuring the MAC address of the Ethernet card in the file /etc/sysconfig/network-scripts/ifcfg-eth<N>.

4. Restart the network service:

# service network restart

Note: It may be necessary to disable NetworkManager if you find that bonding is not working properly. To do this, you would run the following:

# service NetworkManager stop

# chkconfig NetworkManager off

# service network restart

5. In order to check the bonding status, check the following file:

# cat /proc/net/bonding/bond0
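Reading the whole status file by eye gets tedious with several slaves. The following is a minimal sketch of a helper that prints one "interface status" pair per slave. The function name bond_slave_status is hypothetical, and it assumes the status file lists each slave as a "Slave Interface:" line followed by its own "MII Status:" line.

```bash
#!/bin/bash
# bond_slave_status [FILE]
# Hypothetical helper: prints "<slave> <MII status>" for every slave listed
# in a bonding status file (default: /proc/net/bonding/bond0).
# Assumption: each "Slave Interface: <name>" line is followed by that
# slave's "MII Status: <state>" line.
bond_slave_status() {
    local status_file=${1:-/proc/net/bonding/bond0}
    awk -F': ' '
        # Remember the slave name until its status line shows up.
        /^Slave Interface:/ { slave = $2; next }
        # The bond-level "MII Status" comes before any slave and is skipped.
        /^MII Status:/ && slave != "" { print slave, $2; slave = "" }
    ' "$status_file"
}
```

Calling bond_slave_status with no argument reads bond0; pass another path (for example /proc/net/bonding/bond1) to check a different bond.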

Multiple bonded device

Configuring multiple bonding channels is similar to configuring a single bonding channel. Setup the ifcfg-bond<N> and ifcfg-eth<X> files as if there were only one bonding channel. You can specify different BONDING_OPTS for different bonding channels so that they can have different modes and other settings. Refer to the section 4.2.2. Channel Bonding Interfaces in the Red Hat Enterprise Linux 6 Deployment Guide for more information.

To configure the bond0 device with the ethernet interface eth0 and eth1, and configure the bond1 device with the Ethernet interface eth2 and eth3, perform the following steps:

1. Create configuration file /etc/modprobe.d/bonding.conf with the following lines:

alias bond0 bonding

alias bond1 bonding

2. Create the channel bonding interface files ifcfg-bond0 and ifcfg-bond1, in the /etc/sysconfig/network-scripts/ directory:

# cat /etc/sysconfig/network-scripts/ifcfg-bond0

DEVICE=bond0

IPADDR=192.168.50.111

NETMASK=255.255.255.0

USERCTL=no

BOOTPROTO=none

ONBOOT=yes

BONDING_OPTS="mode=0 miimon=100"

# cat /etc/sysconfig/network-scripts/ifcfg-bond1

DEVICE=bond1

IPADDR=192.168.30.111

NETMASK=255.255.255.0

USERCTL=no

BOOTPROTO=none

ONBOOT=yes

BONDING_OPTS="mode=1 miimon=50"

Note: bond0 and bond1 use different bonding modes. The bond0 device uses the balance-rr policy (mode=0), and the bond1 device uses the active-backup policy (mode=1). For more information about the bonding modes, refer to The bonding modes supported in Red Hat Enterprise Linux.

3. Configure the ethernet interface in the file /etc/sysconfig/network-scripts/ifcfg-eth0. Both eth0 and eth1 should look like the following example:

DEVICE=eth<N>

BOOTPROTO=none

HWADDR=54:52:00:26:90:fc

ONBOOT=yes

MASTER=bond0

SLAVE=yes

USERCTL=no

Note:

Replace <N> with the numerical value for the interface, such as 0 and 1 in this example. Replace the HWADDR value with the MAC for the interface. For eth2 and eth3, which belong to the second bond, set MASTER=bond1 instead.
Red Hat suggests configuring the MAC address of the Ethernet card in the file /etc/sysconfig/network-scripts/ifcfg-eth<N>.

4. Restart the network service:

# service network restart

5. In order to check the bonding status, check the following file:

# cat /proc/net/bonding/bond0

Configuring bonded devices on Red Hat Enterprise Linux 5

Single bonded device on RHEL5

For the detailed manual of bonding configuration on RHEL5, please refer to,

To configure the bond0 device with the network interface eth0 and eth1, perform the following steps:

1. Add the following line to /etc/modprobe.conf:

alias bond0 bonding

2. Create the channel bonding interface file ifcfg-bond0 in the /etc/sysconfig/network-scripts/ directory:

# cat /etc/sysconfig/network-scripts/ifcfg-bond0

DEVICE=bond0

IPADDR=192.168.50.111

NETMASK=255.255.255.0

USERCTL=no

BOOTPROTO=none

ONBOOT=yes

BONDING_OPTS="mode=0 miimon=100"

Note:

Configure the bonding parameters in the file /etc/sysconfig/network-scripts/ifcfg-bond0, as above: BONDING_OPTS="mode=0 miimon=100".
The behavior of the bonded interfaces depends on the mode. Mode 0 (balance-rr, round-robin) is the default; it transmits packets over the slaves in sequential order. For more information about the bonding modes, refer to The bonding modes supported in Red Hat Enterprise Linux.

3. Configure the ethernet interface in the file /etc/sysconfig/network-scripts/ifcfg-eth0. Both eth0 and eth1 should look like the following example:

DEVICE=eth<N>

BOOTPROTO=none

HWADDR=54:52:00:26:90:fc

ONBOOT=yes

MASTER=bond0

SLAVE=yes

USERCTL=no

Note:

Replace <N> with the numerical value for the interface, such as 0 and 1 in this example. Replace the HWADDR value with the MAC for the interface.
Red Hat suggests configuring the MAC address of the Ethernet card in the file /etc/sysconfig/network-scripts/ifcfg-eth<N>.

4. Restart the network service:

# service network restart

5. In order to check the bonding status, check the following file:

# cat /proc/net/bonding/bond0

Multiple bonded device on RHEL5

In Red Hat Enterprise Linux 5.3 (or update to initscripts-8.45.25-1.el5) and later, configuring multiple bonding channels is similar to configuring a single bonding channel. Setup the ifcfg-bond<N> and ifcfg-eth<X> files as if there were only one bonding channel. You can specify different BONDING_OPTS for different bonding channels so that they can have different modes and other settings. Refer to the section 15.2.3. Channel Bonding Interfaces in the Red Hat Enterprise Linux 5 Deployment Guide for more information.

To configure the bond0 device with the ethernet interface eth0 and eth1, and configure the bond1 device with the Ethernet interface eth2 and eth3, perform the following steps:

1. Add the following lines to /etc/modprobe.conf:

alias bond0 bonding

alias bond1 bonding

2. Create the channel bonding interface files ifcfg-bond0 and ifcfg-bond1, in the /etc/sysconfig/network-scripts/ directory:

# cat /etc/sysconfig/network-scripts/ifcfg-bond0

DEVICE=bond0

IPADDR=192.168.50.111

NETMASK=255.255.255.0

USERCTL=no

BOOTPROTO=none

ONBOOT=yes

BONDING_OPTS="mode=0 miimon=100"

# cat /etc/sysconfig/network-scripts/ifcfg-bond1

DEVICE=bond1

IPADDR=192.168.30.111

NETMASK=255.255.255.0

USERCTL=no

BOOTPROTO=none

ONBOOT=yes

BONDING_OPTS="mode=1 miimon=50"

3. Configure the ethernet interface in the file /etc/sysconfig/network-scripts/ifcfg-eth0. Both eth0 and eth1 should look like the following example:

DEVICE=eth<N>

BOOTPROTO=none

HWADDR=54:52:00:26:90:fc

ONBOOT=yes

MASTER=bond0

SLAVE=yes

USERCTL=no

Note:

Replace <N> with the numerical value for the interface, such as 0 and 1 in this example. Replace the HWADDR value with the MAC for the interface. For eth2 and eth3, which belong to the second bond, set MASTER=bond1 instead.
Red Hat suggests configuring the MAC address of the Ethernet card in the file /etc/sysconfig/network-scripts/ifcfg-eth<N>.

4. Restart the network service:

# service network restart

5. In order to check the bonding status, check the following file:

# cat /proc/net/bonding/bond0

Configuring bonded devices on Red Hat Enterprise Linux 4

Single bonded device on RHEL4

For a detailed manual for bonding configuration on RHEL4 , please refer to,

To configure the bond0 device with the network interface eth0 and eth1, perform the following steps:

1. Add the following lines to /etc/modprobe.conf:

alias bond0 bonding

options bonding mode=1 miimon=100

Note:

Configure the bonding parameters in the file /etc/modprobe.conf. This differs from RHEL5: on RHEL5 you pass all bonding parameters in ifcfg-bond<x> through the BONDING_OPTS= variable, while on RHEL4 you pass them in modprobe.conf, either on an 'options' line as above or using the 'install' syntax.
mode=1 is the active-backup policy. For more information about the bonding modes, refer to The bonding modes supported in Red Hat Enterprise Linux.

2. Create the channel bonding interface file in the /etc/sysconfig/network-scripts/ directory, ifcfg-bond0

# cat /etc/sysconfig/network-scripts/ifcfg-bond0

DEVICE=bond0

IPADDR=192.168.50.111

NETMASK=255.255.255.0

USERCTL=no

BOOTPROTO=none

ONBOOT=yes

3. Configure the ethernet interface in the file /etc/sysconfig/network-scripts/ifcfg-eth<N>. In this example, both eth0 and eth1 should look like this:

DEVICE=eth<N>

BOOTPROTO=none

HWADDR=54:52:00:26:90:fc

ONBOOT=yes

MASTER=bond0

SLAVE=yes

USERCTL=no

Note:

Replace the <N> with the numerical value for the interface, such as 0 and 1 in this example. Replace the HWADDR value with the MAC for the interface.
Red Hat suggests that you configure the MAC address of the Ethernet card in the file /etc/sysconfig/network-scripts/ifcfg-eth<N>.
The "54:52:00:26:90:fc" is the hardware address (MAC) of the Ethernet Card in the system.

Multiple bonded device on RHEL4

To configure multiple bonding channels on RHEL4, first set up the ifcfg-bond<N> and ifcfg-eth<X> files as you would for a single bonding channel, shown in the previous section.

Configuring multiple channels requires a different setup for /etc/modprobe.conf. If the two bonding channels have the same bonding options, such as bonding mode, monitoring frequency and so on, add the max_bonds option. For example:

alias bond0 bonding

alias bond1 bonding

options bonding max_bonds=2 mode=balance-rr miimon=100

If the two bonding channels have different bonding options (for example, one is using round-robin mode and one is using active-backup mode), the bonding modules have to load twice with different options. For example, in /etc/modprobe.conf:

install bond0 /sbin/modprobe --ignore-install bonding -o bonding0 mode=0 miimon=100 primary=eth0

install bond1 /sbin/modprobe --ignore-install bonding -o bonding1 mode=1 miimon=50 primary=eth2

If there are more bonding channels, add one install bond<N> /sbin/modprobe --ignore-install bonding -o bonding<N> options line per bonding channel.
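With more than a couple of bonds, typing these lines by hand invites typos. A small sketch that prints one install line per bonding channel in the format shown above; the function name print_bond_install_line is hypothetical, and the options passed in are only examples:

```bash
#!/bin/bash
# print_bond_install_line N OPTIONS...
# Hypothetical helper: prints the /etc/modprobe.conf "install" line for
# bonding channel N with the given module options.
print_bond_install_line() {
    local n=$1
    shift
    echo "install bond${n} /sbin/modprobe --ignore-install bonding -o bonding${n} $*"
}

# Example: the two channels from the text above.
print_bond_install_line 0 mode=0 miimon=100 primary=eth0
print_bond_install_line 1 mode=1 miimon=50 primary=eth2
```

The output is only printed; review it before appending it to /etc/modprobe.conf.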

Note: The use of -o bondingX to get different options for multiple bonds was not possible in Red Hat Enterprise Linux 4 GA and 4 Update 1.

After the file /etc/modprobe.conf is modified, restart the network service:

# service network restart


My Advanced Linux/Advanced Linux 2012. 1. 25. 15:49

How should swap be sized?

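The amount of physical memory in servers these days is remarkable.
With virtualization being the trend in particular, servers ship with amounts of physical memory that were unimaginable before.
So how much swap should you allocate when physical memory is measured in hundreds of gigabytes?

As a rule of thumb, Linux people have assumed that swap should be twice the physical memory, or about equal to it at 8 GB and above. (Or is that just me?!)
But what about a server with 128 GB of physical memory? Even at 1x, should you really spend 128 GB of disk on swap? Given that memory in recent servers
keeps growing while internal disk capacity keeps shrinking, that is an enormous waste of disk.

It depends on the situation, of course, but the recommended swap sizes for RHEL 5 and 6 are as follows.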

Considering that

- at installation time, when configuring the swap space, there is no easy way to predetermine the memory a workload will require, and
- the more RAM a system has, the less swap space it typically needs,

a better swap space requirements rule for Red Hat Enterprise Linux 5 is:
- Systems with 4 GB of ram or less require a minimum of 2 GB of swap space
- Systems with 4 GB to 16 GB of ram require a minimum of 4 GB of swap space
- Systems with 16 GB to 64 GB of ram require a minimum of 8 GB of swap space
- Systems with 64 GB to 256 GB of ram require a minimum of 16 GB of swap space
And a better swap space requirements rule for Red Hat Enterprise Linux 6 is:
- Systems with 2 GB of ram or less require a minimum of 2*ram of swap space
- Systems with 2 GB to 8 GB of ram require a minimum of ram size of swap space
- Systems with 8 GB to 64 GB of ram require a minimum of ram/2 of swap space
- Systems with 64 GB of ram or more require a minimum of 4 GB of swap space.
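
The two lists can be turned into a quick lookup. This is only a sketch under a few assumptions: RAM is given in whole GB, a boundary value (4, 8, 16, 64 ...) falls into the lower range, ram/2 is rounded up, and the function names are made up for this example.

```bash
#!/bin/bash
# Minimum swap (GB) for RHEL 5, following the list above.
# Returns non-zero above 256 GB because the list gives no value there.
swap_min_gb_rhel5() {
    local ram=$1
    if   (( ram <= 4 ));   then echo 2
    elif (( ram <= 16 ));  then echo 4
    elif (( ram <= 64 ));  then echo 8
    elif (( ram <= 256 )); then echo 16
    else return 1
    fi
}

# Minimum swap (GB) for RHEL 6, following the list above.
swap_min_gb_rhel6() {
    local ram=$1
    if   (( ram <= 2 )); then echo $(( ram * 2 ))        # 2 * ram
    elif (( ram <= 8 )); then echo "$ram"                # same as ram
    elif (( ram < 64 )); then echo $(( (ram + 1) / 2 ))  # ram / 2, rounded up
    else echo 4                                          # 64 GB or more
    fi
}

# The 128 GB server from the text:
echo "RHEL 5: $(swap_min_gb_rhel5 128) GB, RHEL 6: $(swap_min_gb_rhel6 128) GB"
```

For the 128 GB server this gives 16 GB on RHEL 5 and 4 GB on RHEL 6, far less than the old 1x rule.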

Also, the following points influence the decision whether swap should be allocated and how much:

- Do specific application requirements exist? In the past, applications were written, and vendors made specific recommendations, for how much swap should be used when the application runs. If this is the case, swap has to be sized accordingly to run the application in a supported environment.
- Do other requirements exist? Workstations and laptops might be hibernated by storing the RAM contents in the swap area.
- Cost/usage tradeoff: Swap is stored on hard disks, which incur costs for initial purchase as well as for maintenance. If the system uses internal hard disks that contain the RHEL installation, it is likely that there is unused disk space, which can be used for swap without additional cost. The cost/usage tradeoff has changed in recent years with changing prices for RAM and hard disks.
- Assigning swap as a 'last resort': While the block devices hosting swap are mostly orders of magnitude slower than RAM, when an application requests more and more RAM it is useful to have swap as an additional layer before the OOM killer strikes.
- Special system requirements due to large hardware: If many CPU cores (e.g. >140) are available on the system or much memory is installed (e.g. >3TB), this also imposes a requirement for swap. Such systems should be set up with at least 100GB of swap.


My Advanced Linux/Advanced Linux 2012. 1. 25. 15:09

How to resize /dev/shm

What is /dev/shm?
If an application is POSIX compliant or it uses GLIBC (2.2 and above) on a Red Hat Enterprise Linux system, it will usually use the /dev/shm for shared memory (shm_open, shm_unlink). /dev/shm is a temporary filesystem (tmpfs) which is mounted from /etc/fstab. Hence the standard options like "size" supported for tmpfs can be used to increase or decrease the size of tmpfs on /dev/shm (by default it is half of available system RAM).

...or so it says. ( - _-);

For more detail, see http://www.kernel.org/pub/linux/kernel/people/marcelo/linux-2.4/Documentation/filesystems/tmpfs.txt.

By default, however, it is set to half of physical memory at install time, so here is how to resize it.

The default configuration looks like this.
# /etc/fstab
none /dev/shm tmpfs defaults 0 0

Change it as follows (here it is set to 1 GB).
none /dev/shm tmpfs defaults,size=1024M 0 0

After that, just remount and the change is applied!

# mount -o remount /dev/shm

You can verify it with df -h.
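
The fstab edit above can also be scripted. A minimal sketch, assuming the /dev/shm entry has the usual six whitespace-separated fields; the function name set_shm_size is hypothetical, and it rewrites whatever file it is given, so try it on a copy first:

```bash
#!/bin/bash
# set_shm_size FSTAB SIZE
# Hypothetical helper: sets (or replaces) the size= option on the /dev/shm
# tmpfs entry of the given fstab-style file. Other lines are left untouched.
set_shm_size() {
    local fstab=$1 size=$2 tmp
    tmp=$(mktemp) || return 1
    if awk -v size="$size" '
        $1 !~ /^#/ && $2 == "/dev/shm" && $3 == "tmpfs" {
            # Rebuild the option list without any existing size= entry.
            n = split($4, opts, ",")
            new_opts = ""; sep = ""
            for (i = 1; i <= n; i++)
                if (opts[i] !~ /^size=/) { new_opts = new_opts sep opts[i]; sep = "," }
            $4 = new_opts sep "size=" size
        }
        { print }
    ' "$fstab" > "$tmp"; then
        cat "$tmp" > "$fstab"
    fi
    rm -f "$tmp"
}
```

For example, set_shm_size /etc/fstab 1024M followed by mount -o remount /dev/shm would reproduce the manual steps above.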

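One question that comes up here: what happens if you set it larger than physical memory?
To cut to the conclusion, the answer is that there is no problem.

/dev/shm can use not only physical memory but also, when necessary, the swap area. It defaults to half of physical memory
so that swap does not get used, and even when a size is set, the memory is not reserved up front
but grows only as needed, so there is little to worry about.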


My Advanced Linux 2010. 11. 22. 10:56

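Concept notes

SMP (symmetric multiprocessing)

SMP is the execution of programs by multiple processors that share an operating system and memory. In SMP, the processors share memory, the I/O bus, and the data path, and a single operating system manages all of the processors. Such systems usually have 2 to 32 processors, and some have as many as 64.

Compared with MPP systems, SMP systems are generally much easier to program in parallel, and balancing the workload across processors is much easier, but they scale less well than MPP. SMP is also strong in OLTP work, where many users access a database simultaneously.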

NUMA (non-uniform memory access)

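NUMA is a method of configuring a cluster of microprocessors in a multiprocessing system so that they share memory locally, improving performance and the expandability of the system. NUMA is used in SMP systems. An SMP system is a tightly coupled, share-everything system in which multiple processors working under a single operating system access each other's memory over a common bus. Ordinarily, the limitation of SMP is that as microprocessors are added, the shared bus or data path becomes overloaded and turns into a performance bottleneck. NUMA adds an intermediate level of memory shared among a few microprocessors so that not every data access has to travel over the main bus.

NUMA can be thought of as a cluster in a box. The cluster typically consists of four microprocessors interconnected on a local bus to a shared memory (also called an L3 cache) on a single motherboard. This unit can be added to similar units to form an SMP in which a common bus interconnects all of the clusters. Such a system typically has 16 to 256 microprocessors. To an application program running on an SMP system, all of the individual processor memories look like a single memory.

When a processor looks for data at a certain memory address, it first looks in the L1 cache on the microprocessor itself and then in a somewhat larger L2 cache chip nearby. Next it looks in the third level of cache that the NUMA configuration provides, before seeking the data in remote memory located near the other microprocessors. NUMA views each of these clusters as a node in the interconnection network. NUMA maintains a hierarchical view of the data on all of the nodes.

Data moves on the bus between the clusters of a NUMA SMP system using SCI (scalable coherent interface) technology. SCI coordinates what is called cache coherence across the nodes of the multiple clusters.

SMP and NUMA systems are typically used in fields such as data mining and decision support systems, where processing can be parceled out to a large number of processors that collectively work on a common database. Sequent, Data General, and NCR are among the companies that produce NUMA SMP systems.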

Linux 2.6 Completely Fair Scheduler

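The Linux scheduler is an interesting study in many respects. One is the range of usage models to which Linux is applied. Although Linux was originally developed as a desktop operating system, it is now used on servers, tiny embedded devices, mainframes, and supercomputers. Naturally, the scheduling loads of these domains differ. Another is the technological advance in the platform, including architectures (multiprocessing, symmetric multithreading, Non-Uniform Memory Access [NUMA]) and virtualization. Also part of the picture is the balance between interactivity (user responsiveness) and overall fairness. From this perspective, it is easy to see how difficult the scheduling problem in Linux is.

A short history of Linux schedulers

Early Linux schedulers used minimal designs and did not consider large architectures with many processors or hyperthreading. The 1.2 Linux scheduler used a circular queue, operated with a round-robin scheduling policy, to manage runnable tasks. This scheduler was efficient at adding and removing processes and used a lock to protect the structure. In short, the scheduler was not complex; it was simple and fast.

Linux version 2.2 introduced the idea of scheduling classes, permitting scheduling policies for real-time tasks, non-preemptible tasks, and non-real-time tasks. The 2.2 scheduler also included support for symmetric multiprocessing (SMP).

The 2.4 kernel included a relatively simple scheduler that operated in O(N) time, iterating over every task during a scheduling event. The 2.4 scheduler divided time into epochs, and within each epoch every task was allowed to execute up to its time slice. If a task did not use all of its time slice, half of the remaining time was added to its new time slice so that it could run longer in the next epoch. The scheduler simply iterated over all tasks, applying a goodness function (metric) to determine which task to execute next. Although this approach was relatively simple, it was relatively inefficient, lacked scalability, and was weak for real-time systems. It also could not exploit new hardware architectures such as multicore processors.

The early 2.6 scheduler, the O(1) scheduler, was designed to solve many of the problems of the 2.4 scheduler: it did not need to iterate over the entire task list to identify the next task to schedule. As the name O(1) suggests, it was much more efficient and much more scalable. The O(1) scheduler kept track of runnable tasks in a run queue. (Actually, it used two run queues for each priority level: one for active tasks and one for expired tasks.) In other words, to identify the next task to execute, the scheduler simply needed to take the next task from the specific active run queue for each priority. The O(1) scheduler was much more scalable and incorporated interactivity metrics with numerous heuristics to determine whether tasks were I/O-bound or processor-bound. But the O(1) scheduler became unwieldy in the kernel. The large mass of code needed to calculate the heuristics was fundamentally difficult to manage and, for the purist, lacked algorithmic substance.

Processes vs. threads

Linux handles process and thread scheduling together by treating them as one and the same. A process can be a single thread, but it can also contain multiple threads that share some number of resources (code and/or data).

Given the issues facing the O(1) scheduler and other external pressures, something needed to change. That change came in the form of a kernel patch from Con Kolivas, with his Rotating Staircase Deadline Scheduler (RSDL), which built on his earlier work on the staircase scheduler. The result of this work was a simply designed scheduler that incorporated fairness with bounded latency. Kolivas' scheduler impressed many (with calls to incorporate it into the then-current 2.6.21 mainline kernel), and scheduler changes followed in that direction. Ingo Molnar, the author of the O(1) scheduler, then developed CFS based on some of the ideas from Kolivas' work. Let's look at how CFS operates at a high level.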


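CFS overview

The main idea behind CFS is to maintain balance (fairness) in providing processor time to tasks. This means processes should be given a fair amount of the processor. When the time for tasks is out of balance (meaning that one or more tasks are not given a fair amount of time relative to others), the tasks that have received too little time should be given time to execute.

To determine the balance, CFS maintains the amount of time provided to a given task in what is called the virtual runtime. The smaller a task's virtual runtime (that is, the less time the task has been permitted access to the processor), the higher its need for the processor. CFS also includes the concept of sleeper fairness, which ensures that tasks that are not currently runnable (for example, tasks waiting for I/O) receive a comparable share of the processor when they eventually need it.

But rather than maintain the tasks in a run queue, as earlier Linux schedulers did, CFS maintains a time-ordered red-black tree (see Figure 1). A red-black tree has two interesting and useful properties. First, it is self-balancing, which means that no path in the tree will ever be more than twice as long as any other. Second, operations on the tree occur in O(log n) time (where n is the number of nodes in the tree). This means that a task can be inserted or deleted quickly and efficiently.

Figure 1. Example of a red-black tree

With tasks (represented by sched_entity objects) stored in the time-ordered red-black tree, the tasks with the greatest need for the processor (lowest virtual runtime) are stored toward the left side of the tree, and the tasks with the least need for the processor (highest virtual runtime) are stored toward the right side. To be fair, the scheduler then picks the leftmost node of the red-black tree as the next one to run. The task accounts for its CPU time by adding its execution time to its virtual runtime and is then inserted back into the tree if it is still runnable. In this way, tasks on the left side of the tree are given time to execute, and the contents of the tree migrate from right to left to maintain fairness. Each runnable task therefore chases the other tasks to keep execution balanced across the whole set of runnable tasks.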


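CFS internals

All tasks within Linux are represented by a task structure called task_struct. This structure (along with others associated with it) fully describes the task and includes the task's current state, its stack, process flags, priority (both static and dynamic), and more. You can find this and many related structures in ./linux/include/linux/sched.h. But because not all tasks are runnable, task_struct has no CFS-related fields. Instead, a new structure called sched_entity was created to track scheduling information (see Figure 2).

Figure 2. Structure hierarchy for tasks and the red-black tree

Figure 2 shows the relationships among the various structures. The root of the tree is referenced through the rb_root element of the cfs_rq structure (in ./kernel/sched.c). Leaves in a red-black tree contain no information, but internal nodes represent one or more runnable tasks. Each node in the red-black tree is represented by an rb_node, which contains nothing more than the child references and the color of the parent. The rb_node is contained within the sched_entity structure, which includes the rb_node reference, the load weight, and a variety of statistics. Most importantly, sched_entity contains vruntime (a 64-bit field), which indicates the amount of time the task has run and serves as the index for the red-black tree. Finally, task_struct sits at the top; it describes the task and includes the sched_entity structure.

The scheduling function is quite simple where the CFS portion is concerned. In ./kernel/sched.c you will find the generic schedule() function, which preempts the currently running task (unless it preempts itself with yield()). CFS has no real notion of time slices for preemption, because the preemption time is variable. The currently running task (now preempted) is returned to the red-black tree through a call to put_prev_task (via the scheduling class). When the time comes to identify the next task to schedule, the schedule function calls the pick_next_task function. This function is also generic (in ./kernel/sched.c), but it calls the CFS scheduler through the scheduler class. The pick_next_task function of CFS is in ./kernel/sched_fair.c (called pick_next_task_fair()). This function simply picks the leftmost task from the red-black tree and returns the associated sched_entity. With this reference, a simple call to task_of() identifies the task_struct to return. Finally, the generic scheduler provides the processor to this task.

Priorities and CFS

CFS does not use priorities directly; instead it uses them as a decay factor for the time a task is permitted to execute. Lower-priority tasks have higher decay factors, and higher-priority tasks have lower decay factors. This means that the time a task is permitted to execute runs out more quickly for a lower-priority task than for a higher-priority task. It is an elegant way to avoid maintaining run queues per priority.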
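
To make the "leftmost task plus decay factor" idea concrete, here is a toy simulation. It is only a sketch: a linear scan stands in for the red-black tree, every run lasts a fixed slice, and the function name cfs_simulate, the slice length, and the example weights are all made up for this illustration. The scaling of run time by 1024/weight is loosely modeled on how CFS weights virtual runtime and is an assumption, not kernel code.

```bash
#!/bin/bash
# cfs_simulate TICKS NAME:WEIGHT...
# Toy model of CFS task selection. On every tick the task with the smallest
# virtual runtime is picked (ties go to the task listed first), it "runs" for
# one slice, and its virtual runtime grows by slice * 1024 / weight, so a
# lower weight (lower priority) makes virtual runtime grow faster.
# Prints "<name> <number of times scheduled>" per task.
cfs_simulate() {
    local ticks=$1; shift
    local slice=10 nice0_weight=1024
    local -a names weights vruntime runs
    local spec i t next
    for spec in "$@"; do
        names+=("${spec%%:*}")
        weights+=("${spec##*:}")
        vruntime+=(0)
        runs+=(0)
    done
    for (( t = 0; t < ticks; t++ )); do
        # Linear scan for the minimum; the kernel takes the leftmost tree node.
        next=0
        for (( i = 1; i < ${#names[@]}; i++ )); do
            if (( vruntime[i] < vruntime[next] )); then next=$i; fi
        done
        vruntime[next]=$(( vruntime[next] + slice * nice0_weight / weights[next] ))
        runs[next]=$(( runs[next] + 1 ))
    done
    for (( i = 0; i < ${#names[@]}; i++ )); do
        echo "${names[i]} ${runs[i]}"
    done
}

# Task A has twice the weight of task B, so over 30 ticks A is picked 20
# times and B 10 times.
cfs_simulate 30 A:1024 B:512
```

No per-priority queue appears anywhere in the loop; the weight alone decides how often each task returns to the front.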

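CFS group scheduling

Another interesting aspect of CFS is the concept of group scheduling, introduced with the 2.6.24 kernel. Group scheduling is another way to bring fairness to scheduling, particularly for tasks that spawn many other tasks. Consider a server that spawns many tasks to handle incoming connections in parallel (a typical architecture for HTTP servers). Instead of treating all tasks uniformly, CFS uses groups to account for this behavior. The server process that spawns tasks shares its virtual runtime across the whole group (in a hierarchy), while a single task maintains its own independent virtual runtime. In this way, the single task receives roughly the same scheduling time as the group. A /proc interface is used to manage the process hierarchies, giving full control over how groups are formed. With this configuration, fairness can be assigned across users, across processes, or a variation of each.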


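Scheduling classes and domains

CFS also introduced the idea of scheduling classes (see Figure 2). Each task belongs to a scheduling class, which determines how the task will be scheduled. A scheduling class defines a common set of functions (via sched_class) that define the behavior of the scheduler. For example, each scheduler provides a way to add a task to be scheduled, pull the next task to run, yield to the scheduler, and so on. Each scheduler class is linked to another in a singly linked list, so the classes can be iterated by following the links (for example, to enable or disable them on a given processor). Figure 3 shows the general structure. Note that the enqueue_task and dequeue_task functions simply add a task to or remove it from the particular scheduling structures. The pick_next_task function chooses the next task to execute according to the particular policy of the scheduling class.

Figure 3. Graphical view of scheduling classes

But because the scheduling class is also part of the task structure (see Figure 2), operations on tasks are simplified regardless of their scheduling class. For example, the following function from ./kernel/sched.c preempts the currently running task with a new task. (Here curr defines the currently running task, rq represents the red-black tree for CFS, and p is the next task to schedule.)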

static inline void check_preempt( struct rq *rq, struct task_struct *p )

{

rq->curr->sched_class->check_preempt_curr( rq, p );

}

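If this task uses the fair scheduling class, check_preempt_curr() resolves to check_preempt_wakeup(). You can see these relationships in ./kernel/sched_rt.c, ./kernel/sched_fair.c, and ./kernel/sched_idle.c.

Scheduling classes remain an interesting aspect of the scheduling changes, and their functionality grows with the addition of scheduling domains. These domains let you group one or more processors for load balancing and segregation. One or more processors can share scheduling policies (and load balance between them) or implement independent scheduling policies to segregate tasks intentionally.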


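Other schedulers

Work on scheduling continues, and there are schedulers under development that push performance and scalability further. Undeterred by his earlier Linux experience, Con Kolivas developed another scheduler for Linux with a provocative acronym: BFS. This scheduler was reported to perform better on NUMA systems and mobile devices and was introduced into a derivative of the Android operating system.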


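Going further

If there is one constant with Linux, it is that change is inevitable. Today CFS is the 2.6 Linux scheduler; tomorrow it could be another new scheduler, or a suite of schedulers that can be invoked statically or dynamically. Although the process surrounding CFS, RSDL, and kernel inclusion was quite involved, thanks to the work of Kolivas and Molnar there is a new level of fairness in 2.6 task scheduling.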

SpinLock & Ticket SpinLock

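As the name suggests, if another thread holds the lock, the waiter keeps checking until the lock is released. In other words, it is a synchronization technique that keeps multiple processors in a multiprocessor system from entering a critical section at the same time. While one processor holds the lock, the other processors busy-wait until it is unlocked and then all access (write to) the lock variable at once to try to take the lock.

Two problems can arise here. The first is that the order in which processors acquire the lock cannot be guaranteed, so a processor that started waiting for the spin lock earlier may acquire it later. Spin locks are therefore unfair.

The other problem concerns performance: because of cache coherency, when one processor writes to the lock variable, the corresponding cache line in every other processor is invalidated. Under heavy contention, even the processor that obtained the lock can suffer repeated cache misses, and execution performance can degrade badly. (The lock variable and the data usually sit on the same cache line.)

A spin lock has the following characteristics.

1. If the lock cannot be acquired, the thread keeps checking the lock and waits until it can acquire it. This is so-called busy waiting.

2. Busy waiting means spinning in a loop and, as far as possible, not yielding the CPU to other threads.

3. If the lock will become available soon, this avoids context switches and lightens the load on the CPU. If some thread holds the lock for a long time, however, it may instead burn a lot of CPU time.

4. It is not useful when there is only one CPU or one core. If another thread holds the lock, that thread cannot release it on a single-CPU system without a context switch happening anyway.

The ticket spin lock was introduced in version 2.6.25 to improve on this. Each processor waiting for the lock is given its own ticket and attempts the write only when its turn comes, so the lock is acquired in order and the overall number of cache misses is reduced.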