Wednesday, August 4, 2010

Question
Why is my Shared Ethernet Adapter (SEA) failover delayed?

Cause
Spanning tree is turned on at the switch port with portfast disabled.

Answer
During SEA failover, from primary to backup, the delay is sometimes caused by Spanning Tree Protocol being enabled on the switch ports.
To ensure prompt recovery times when Spanning Tree Protocol is enabled on the switch ports connected to the physical adapters of the Shared Ethernet Adapter, also enable the portfast option on those ports. The portfast option allows the switch to forward packets on the port immediately, without first waiting for Spanning Tree Protocol to complete. (Spanning Tree Protocol blocks the port completely until it is finished.)

On SEA failback, from backup to primary, there is an additional issue:

The switches are sometimes not ready to transmit and receive packets even after declaring the link as up, which leads to packet loss. This problem can be avoided, or the failback time reduced, by disabling Spanning Tree Protocol altogether.

Here are the 5 supported methods to test SEA failover:

Scenario 1, Manual SEA Failover
On VIO server:
$ lsdev -type adapter
or
$ oem_setup_env
# lsdev -Cc adapter |grep ent --> Note which ent is the SEA
# entstat -d entX | grep State --> Check for the state (PRIMARY, or BACKUP)

Set ha_mode to standby on primary VIOS with chdev command:
# chdev -l entX -a ha_mode=standby
or
$ chdev -dev entX -attr ha_mode=standby

Reset it back to auto and the SEA should fail back to the primary VIOS:
# chdev -l entX -a ha_mode=auto
or
$ chdev -dev entX -attr ha_mode=auto
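
The state check used in this scenario can be wrapped in a small helper so it is easy to repeat before and after each chdev. This is a minimal sketch, assuming `entstat -d` prints the failover state on a line of the form `State: PRIMARY` or `State: BACKUP`, as the grep above implies. The function name `sea_state` is hypothetical and the sample input is illustrative, not captured output.

```shell
# sea_state: read `entstat -d entX` output on stdin and print the SEA
# failover state (for example PRIMARY or BACKUP).
# Assumption: the state is on a line that starts with "State:" after
# optional leading whitespace.
sea_state() {
    awk '/^[ \t]*State:/ { print $2; exit }'
}

# Illustrative input only (not real captured output):
printf '    State: BACKUP\n' | sea_state
```

On the VIOS, after oem_setup_env, it would be used as `entstat -d entX | sea_state`, with entX being the SEA noted earlier.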


Scenario 2, Primary VIOS Shutdown
Reboot the primary VIOS to trigger failover to the backup SEA adapter.
When the primary VIOS is up again, the SEA should fail back to the primary SEA adapter.

Scenario 3, Primary VIOS Error
Deactivate the primary VIOS from the HMC to trigger failover to the backup SEA adapter.
Activate the primary VIOS again to trigger failback to the primary SEA adapter.

Scenario 4, Physical Link Failure
Unplug the cable of the physical Ethernet adapter on the primary VIOS to trigger failover to the backup VIOS.
Replug the cable of the physical Ethernet adapter on the primary VIOS to trigger failback to the primary VIOS.

Scenario 5, Reverse Boot Sequence
Shut down both the VIO servers.
Activate the VIOS with backup SEA until the adapter becomes active.
Activate the VIOS with primary SEA. The configuration should fail back to the primary SEA.

NOTE: When we force a manual failover in Scenario 1, we bring down the link to the switch connected to VIO1, which prompts the switch to update its MAC tables accordingly. The backup VIOS can take over immediately because it is already up and running and was simply not in use. During failback the same situation occurs, but the delay is shorter because the primary VIOS stayed up and running throughout the forced failover.

In Scenario 2, when the primary VIOS (VIO1) is shut down, the failover is also immediate. However, the failback to VIO1 takes longer because the switch connected to VIO1 takes more time to start requeuing packets.

The fact that the delay is shorter for a manual failover and longer for a VIOS shutdown implies that the delay occurs because some switches do not start transmitting and receiving packets for some time even after declaring the link as up. On the IBM side, if the link is up when TCP/IP is started, the switch is assumed to be ready to send and receive packets, even though it may not actually be ready.


removing end of a variable

export TOTO=fssam0sv
echo ${TOTO%%sv}

This strips the trailing sv from the end of the variable's value.
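
For comparison, here is a short sketch of the related expansions. `%` and `%%` strip a suffix (shortest and longest match) and `#` and `##` strip a prefix. The FILE value is a made-up example, not from the original note.

```shell
TOTO=fssam0sv

echo "${TOTO%sv}"     # shortest matching suffix removed: fssam0
echo "${TOTO%%sv}"    # longest matching suffix removed:  fssam0 (same, no wildcard)
echo "${TOTO#fs}"     # shortest matching prefix removed: sam0sv

# % and %% only differ when the pattern contains a wildcard.
FILE=archive.tar.gz   # hypothetical value for illustration
echo "${FILE%.*}"     # archive.tar
echo "${FILE%%.*}"    # archive
```

With a literal pattern such as sv, the single `%` form is enough. The doubled form matters only for patterns like `.*`.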

which ipc is holding a file open

If you can't unmount a file system because an IPC resource is holding it:

Problem:
A file system on a DB2 server cannot be unmounted.
ipcs still shows a shared memory segment owned by the DB2 instance ID.
ipcrm fails with 0515-020 (shmid was not found) for that ID:
m 2097169 0xffffffff D-rw-rw-rw- udbpd030 db2iadm1

[root@mbhop5pdb2]/> ipcrm -m 2097169
ipcrm: 0515-020 shmid(2097169) was not found.


1) Use the new -S option on ipcs to obtain the shared memory segment ID.

# ipcs -mS

m 131075 0x00001a4c --rw------- root system
SID :
0x2b85

2) Verify that the svmon command is installed on the system. If it is not, install it from the AIX installation CDs.

$ lslpp -l perfagent.tools

3) Use the svmon command to find all processes attached to the shared memory segment.

# svmon -S 0x2b85 -l

Vsid Esid Type Description LPage Inuse Pin Pgsp Virtual
2b85 3 work shared memory segment - 656 0 0 656
pid(s)=10862

This shared memory segment has only one process attached.

To remove this shared memory segment, you must first kill the process that is attached to the segment.
# kill 10862
# ipcrm -m 131075
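
When several processes are attached, picking the PIDs out of the svmon output by eye gets tedious. The following is a minimal sketch that extracts them, assuming the PIDs are listed on a `pid(s)=` line separated by commas or spaces, as in the sample above. The function name `pids_from_svmon` is hypothetical.

```shell
# pids_from_svmon: read `svmon -S <sid> -l` output on stdin and print the
# attached process IDs, one per line.
# Assumption: PIDs follow "pid(s)=" and are separated by commas/spaces.
pids_from_svmon() {
    sed -n 's/^.*pid(s)=//p' | tr -s ', ' '\n\n' | grep -E '^[0-9]+$'
}

# Sample taken from the svmon output shown above:
printf '%s\n' \
    'Vsid Esid Type Description LPage Inuse Pin Pgsp Virtual' \
    '2b85 3 work shared memory segment - 656 0 0 656' \
    'pid(s)=10862' | pids_from_svmon
```

On a live system this would be used as `svmon -S 0x2b85 -l | pids_from_svmon`. Review the list before killing anything.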

unix time

To find out which date 1254407497 corresponds to:

perl -e "print scalar(localtime(1254407497))"



and the reverse:


perl -e "use POSIX ; print (mktime(10,45,11,31,4,107));"

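with the syntax
mktime(sec, min, hour, mday, mon, year, wday = 0, yday = 0, isdst = 0)
where mon is 0-based (0 = January) and year is the number of years since 1900, so the call above means 31 May 2007, 11:45:10 local time.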

ksh tip

A simple way to get the input name ($i) into the output text:

cat $i | ... | awk '{ print var " " $1 " " $2 }' var=$i >> out
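
Here is a runnable sketch of the same idea. The file names and contents are made up for the demonstration. The `var=...` operand placed after the awk program sets the awk variable before the input is read, so it can be used in the main block but not in BEGIN.

```shell
# Create two small sample files (hypothetical names and content).
tmpdir=$(mktemp -d)
printf 'a 1\nb 2\n' > "$tmpdir/file1"
printf 'c 3\n'      > "$tmpdir/file2"

# Prefix every output line with the name of the file it came from.
for i in "$tmpdir"/file1 "$tmpdir"/file2; do
    cat "$i" | awk '{ print var " " $1 " " $2 }' var="$(basename "$i")"
done > "$tmpdir/out"

cat "$tmpdir/out"
```

The same result can be had with `awk -v var=...` before the program, which also makes the variable visible in a BEGIN block.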

How to capture boot debug of a SAN boot PowerVM Virtual I/O Server or AIX/NPIV client partition that is failing to boot.

Problem(Abstract)
How to capture boot debug of a SAN boot PowerVM Virtual I/O Server or AIX/NPIV client partition that is failing to boot.

Symptom
NPIV/AIX Client or VIOS fails to boot from SAN.

Environment REQUIREMENTS
1. POWER6 System
2. A program where console terminal logging can be enabled will be needed. The following procedure uses PuTTY (a Windows ssh client program) as the means to open a console session to capture the boot debug data to a file. It's available for download at http://www.putty.org
Diagnosing the problem
Things to check PRIOR to gathering the debug
For a NEW NPIV/AIX Client Install
1. Ensure NPIV mapping is correct
2. Ensure the SAN switch is zoned correctly to the NPIV client's WWPN
3. Ensure the resource (LUN) is assigned from the storage directly to the client's WWPN
4. Ensure the installation media meets the minimum level required by the storage
For previously running LPAR
1. Check if boot device can be set in SMS
2. Check if rootvg is accessible in Service Mode.
Resolving the problem
1. To capture a boot debug to a file, open an ssh session via PuTTY to the HMC as follows
Under Category
Session
-> click on Logging
-> select "All session output" on the right
-> specify the filename in the "Log file name" box as shown in Figure 1
Terminal
-> click on Keyboard
-> select Control-H for the Backspace key
Click on Session
-> Type the fully qualified domain name of the HMC in the Host Name and Saved Sessions boxes
-> select SSH protocol (HMC must be configured to accept ssh connections)
-> Click on Open (See Figure 2). You will get a PuTTY Security Alert Window
-> Click No to connect just once
-> Login as hscroot and type 'vtmenu' to open a console session to the partition in question
-> Select the Managed System name
-> Select the partition in question => You may or may not see activity at this point, depending on the status of the partition.

2. Boot the partition to Open Firmware (0 >) prompt, run ioinfo utility, and select the FCINFO option as follows:
0 > ioinfo

!!! IOINFO: FOR IBM INTERNAL USE ONLY !!!
This tool gives you information about SCSI,IDE,SATA,SAS,and USB devices attached to the
system

Select a tool from the following
1. SCSIINFO
2. IDEINFO
3. SATAINFO
4. SASINFO
5. USBINFO
6. FCINFO <=====
7. VSCSIINFO

q - quit/exit

==> 6

3. Select the desired path. In this example, we selected the 2nd virtual Fibre Channel path:

FCINFO Main Menu
Select a FC Node from the following list:
# Location Code Pathname
---------------------------------------------------------------
1. U9117.MMA.65EBF8C-V32-C5-T1 /vdevice/vfc-client@30000005
2. U9117.MMA.65EBF8C-V32-C6-T1 /vdevice/vfc-client@30000006

q - Quit/Exit

==> 2

4. Select a FC Device

FC Node Menu
FC Node String: /vdevice/vfc-client@30000006
FC Node WorldWidePortName: c05076001ab6003a
-----------------------------------------------------------------
1. List Attached FC Devices
2. Select a FC Device <=====
3. Enable/Disable FC Adapter Debug flags

q - Quit/Exit

==> 2

5. Select the appropriate LUN. In this example option 1 happens to be the bootable device:

1. 50060e801530f310,0 - 10240 MB Disk drive (bootable)
2. 50060e801530f310,1000000000000 - 35840 MB Disk drive

Select a FC Device : 1

FC Device Menu
FC Target Address ==> 50060e801530f310 FC Lun Address ==> 0
FC Device String: /vdevice/vfc-client@30000006/disk@50060e801530f310,0:0
FC Device: 10240 MB Disk drive (bootable)
----------------------------------------------------------------------

6. Select "Display Inquiry Data"
1. Display Inquiry Data <=====
2. Spin up Drive
3. Spin down Drive
4. Continuous random Reads ( hit any key to stop )
5. Enable/Disable FC Device Debug flags
98. Boot from this Device
q - Quit/Exit

==> 1

INQUIRY DATA FOR : TARGET ==> 50060e801530f310 LUN ==> 0 - 10240 MB Disk drive (bootable)
000002f4cd00: 00 00 03 32 cf 00 00 02 48 49 54 41 43 48 49 20 :...2....HITACHI :
000002f4cd10: 4f 50 45 4e 2d 56 20 20 20 20 20 20 20 20 20 20 :OPEN-V :
000002f4cd20: 36 30 30 34 35 30 20 31 33 30 46 33 33 30 33 33 :600450 130F33033:
000002f4cd30: 20 32 41 20 01 01 01 01 00 00 00 00 00 00 00 00 : 2A ............:
000002f4cd40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 :................:
000002f4cd50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 :................:
000002f4cd60: 05 01 05 70 30 30 ff 00 c0 50 76 00 1a b6 00 3a :...p00...Pv....::
000002f4cd70: c0 50 76 00 1a b6 00 3a 00 00 00 0f 00 00 00 00 :.Pv....:........:
000002f4cd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 03 00 00 :................:
000002f4cd90: 01 01 01 01 00 00 00 00 01 01 01 01 01 01 01 01 :................:
000002f4cda0: 01 01 01 01 01 01 01 01 55 55 55 55 55 55 55 55 :........UUUUUUUU:
000002f4cdb0: 55 55 55 55 00 00 00 00 ff ff ff ff 00 00 00 00 :UUUU............:
000002f4cdc0: 00 00 00 03 00 00 00 01 00 00 00 01 00 01 99 40 :...............@:
000002f4cdd0: 00 00 71 a3 00 00 00 00 00 00 00 00 00 00 00 00 :..q.............:
000002f4cde0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 :................:
000002f4cdf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 :...............:
Hit a key to continue...
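
When the ASCII column of a captured dump is hard to read, the hex bytes can be decoded separately. The sketch below is a hypothetical helper, not part of the ioinfo tool. It turns space-separated hex bytes into text and prints a dot for non-printable bytes. In standard SCSI inquiry data the vendor ID is in bytes 8-15 and the product ID in bytes 16-31, which matches HITACHI and OPEN-V in the dump above.

```shell
# hex_to_ascii: decode space-separated hex bytes into text, printing "."
# for bytes outside the printable ASCII range (as the dump itself does).
hex_to_ascii() {
    for h in $1; do
        d=$((0x$h))
        if [ "$d" -ge 32 ] && [ "$d" -le 126 ]; then
            # %b interprets \0ddd as an octal character code
            printf '%b' "\\0$(printf '%o' "$d")"
        else
            printf '.'
        fi
    done
    printf '\n'
}

# Bytes 8-15 of the first dump row (vendor ID field):
hex_to_ascii '48 49 54 41 43 48 49 20'
# First six bytes of the second row (start of the product ID field):
hex_to_ascii '4f 50 45 4e 2d 56'
```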

FC Device Menu
FC Target Address ==> 50060e801530f310 FC Lun Address ==> 0
FC Device String: /vdevice/vfc-client@30000006/disk@50060e801530f310,0:0
FC Device: 10240 MB Disk drive (bootable)
----------------------------------------------------------------------

7. Select to "Boot from this Device"
1. Display Inquiry Data
2. Spin up Drive
3. Spin down Drive
4. Continuous random Reads ( hit any key to stop )
5. Enable/Disable FC Device Debug flags
98. Boot from this Device

q - Quit/Exit

==> 98
---------------------------------------------------------------------------------------------------------
Welcome to AIX.
boot image timestamp: 06:26 10/01
The current time and date: 09:46:20 10/01/2009
processor count: 1; memory size: 8192MB; kernel size: 23463042
boot device: /vdevice/vfc-client@30000006/disk@50060e801530f310,0
----------------------------------------------------------------------------------------------------------


8. Rename the putty.log file to reflect the PMR#, Branch#, and Country Code (for example, 12345.678.000.SAN_boot_debug.log) and upload it to:
ftp testcase.software.ibm.com
login: anonymous
password:
ftp> cd /toibm/aix
ftp> prompt
ftp> binary
ftp> put .log
ftp> quit
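
The renaming in step 8 can be scripted so the name always follows the same pattern. This is a small sketch with a hypothetical helper name, using the example values from step 8.

```shell
# debug_log_name: build the upload file name from PMR number, branch
# number and country code, following the pattern shown in step 8.
debug_log_name() {
    printf '%s.%s.%s.SAN_boot_debug.log\n' "$1" "$2" "$3"
}

# Example values taken from the naming example in step 8:
new_name=$(debug_log_name 12345 678 000)
echo "$new_name"

# Then, in the directory holding the PuTTY log:
#   mv putty.log "$new_name"
```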