Friday, December 28, 2012

Set up VLAN on top of a bond in SuSE Linux 11

I have 4 NICs on a SuSE server (SLES 11 SP2) and I want to bond them together and enable VLANs on top of the bond device. The following are the files I need to create:

(1) ifcfg-eth<#> (ifcfg-eth0 ~ ifcfg-eth3): basically, set all of them with an empty IP, STARTMODE='off' and BOOTPROTO='none'.

Myhost01:/etc/sysconfig/network # cat ifcfg-eth0
BOOTPROTO='none'
BROADCAST=''
ETHTOOL_OPTIONS=''
IPADDR=''
MTU=''
NAME='NetXtreme II BCM5709 Gigabit Ethernet'
NETMASK=''
NETWORK=''
REMOTE_IPADDR=''
STARTMODE='off'
USERCONTROL='no'
Myhost01:/etc/sysconfig/network # cp ifcfg-eth0 ifcfg-eth1
Myhost01:/etc/sysconfig/network # cp ifcfg-eth0 ifcfg-eth2
Myhost01:/etc/sysconfig/network # cp ifcfg-eth0 ifcfg-eth3



(2) ifcfg-bond0: bond eth0 ~ eth3 together using mode 4 (802.3ad); see Trunking / Bonding Multiple Network Interfaces for more description of the modes.
Please note that 802.3ad needs special configuration on the switch side; see Cisco LACP for details.


Myhost01:/etc/sysconfig/network # cat ifcfg-bond0
BONDING_MASTER='yes'
BONDING_MODULE_OPTS='mode=802.3ad miimon=100'
BOOTPROTO='static'
BROADCAST=''
ETHTOOL_OPTIONS=''
IPADDR='0.0.0.0/32'
MTU=''
NAME=''
NETMASK='255.255.255.0'
NETWORK=''
REMOTE_IPADDR=''
STARTMODE='auto'
USERCONTROL='no'
BONDING_SLAVE0='eth0'
BONDING_SLAVE1='eth1'
BONDING_SLAVE2='eth2'
BONDING_SLAVE3='eth3'
PREFIXLEN='24'

(3) ifcfg-vlan<#>: create the VLANs on top of the bond interface, vlan0 for VLAN ID 10 and vlan1 for VLAN ID 20. Make sure you have MODULES_LOADED_ON_BOOT="8021q" added in /etc/sysconfig/kernel (see the check below); the network admin also needs to enable VLAN tagging on the switch side.
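For example, a quick sanity check (a sketch; on my system 8021q is the only module listed, yours may include others, and lsmod will only show the module after it has been loaded):

Myhost01:/etc/sysconfig/network # grep MODULES_LOADED_ON_BOOT /etc/sysconfig/kernel
MODULES_LOADED_ON_BOOT="8021q"
Myhost01:/etc/sysconfig/network # lsmod | grep 8021q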

Myhost01:/etc/sysconfig/network #
Myhost01:/etc/sysconfig/network # cat ifcfg-vlan0
BOOTPROTO='static'
BROADCAST=''
ETHERDEVICE='bond0'
ETHTOOL_OPTIONS=''
IPADDR='192.168.10.70/24'
MTU=''
NAME=''
NETWORK=''
REMOTE_IPADDR=''
STARTMODE='auto'
USERCONTROL='no'
VLAN_ID='10'

Myhost01:/etc/sysconfig/network # cat ifcfg-vlan1
BOOTPROTO='static'
BROADCAST=''
ETHERDEVICE='bond0'
ETHTOOL_OPTIONS=''
IPADDR='192.168.20.50/24'
MTU=''
NAME=''
NETWORK=''
REMOTE_IPADDR=''
STARTMODE='auto'
USERCONTROL='no'
VLAN_ID='20'


After the above configuration is in place, check the route configuration (such as the default route in /etc/sysconfig/network/routes) and then restart the network:
/sbin/rcnetwork restart
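
After the restart, a few quick checks can confirm the bond and VLANs came up as expected (a sketch; the interface names assume the vlan0/vlan1 examples above):

Myhost01:~ # cat /proc/net/bonding/bond0     (bonding mode, LACP information and slave interfaces)
Myhost01:~ # cat /proc/net/vlan/config       (VLAN name, VLAN ID and parent device, e.g. vlan0 | 10 | bond0)
Myhost01:~ # ip addr show vlan1              (confirms 192.168.20.50/24 is assigned)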


The above is an example from the CLI; to accomplish the same thing in the GUI, use "yast2 lan".
For eth0 ~ eth3, edit them to be slaves in the bond with "Activate Device" set to "Never"; create the bond with the proper bond driver options (in the "Bond Slaves" tab) and "Activate Device" set to "At Boot Time"; then create the VLAN interfaces on top of the bond.

Please refer to the following blogs/docs for more details:
suse-linux-11-how-to-configure-vlans/
Yast2 lan

Monday, June 11, 2012

rlogin and "Install UNIX Client Software"

When installing/upgrading the NetBackup agent for Solaris 10 remotely, one needs to enable rlogin/rsh.

Symantec has a tech note about this:

http://www.symantec.com/business/support/index?page=content&id=HOWTO19935

Basically, it enables rlogin with "inetadm -e svc:/network/login:rlogin" and creates the $HOME/.rhosts file with proper entries.
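
For example, a minimal /.rhosts on the client might look like this (a sketch; it assumes the push is done as root from the master server and that root's home directory is /):

testhost1# cat /.rhosts
master-server root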

However, when a Solaris 10 client has been hardened using JASS/CIS or similar tools, the above tech note will not be enough; the following steps are needed to enable rlogin (without a password prompt) and rsh execution:

(1) Enable  svc:/network/shell:default  if it is not enabled;
(2) If TCP wrappers (/etc/hosts.allow, /etc/hosts.deny) or IP Filter are in use, allow rsh through or temporarily disable these host-based firewall rules;
(3) Check the following two entries in /etc/pam.conf:

 rlogin        auth sufficient         pam_rhosts_auth.so.1

 rsh   auth sufficient         pam_rhosts_auth.so.1

        They are usually commented out by the hardening script and need to be uncommented in order to allow rsh.

         For example, they should look like this if you need to allow rsh:
# rsh service (explicit because of pam_rhost_auth,
rsh     auth sufficient         pam_rhosts_auth.so.1
rsh     auth required           pam_unix_cred.so.1
......
# rlogin service (explicit because of pam_rhost_auth)
#
rlogin  auth sufficient         pam_rhosts_auth.so.1
rlogin  auth requisite          pam_authtok_get.so.1
rlogin  auth required           pam_dhkeys.so.1
rlogin  auth required           pam_unix_cred.so.1
rlogin  auth required           pam_unix_auth.so.1

        Just FYI, if you accidentally comment out those "required" lines, you will get an "Insufficient credentials" error when testing rsh from a remote host:

master-server# rsh testhost1 ls
Insufficient credentials.

Before kicking off the remote installation, do a simple test from the NetBackup master server, such as:

master-server# rsh <client-hostname> ls /usr

         The above command should successfully list all files in the /usr directory on the remote client.

After the installation/upgrade of the NetBackup client, undo all the changes mentioned above to disable rsh.
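
For example, a rollback sketch (the service FMRIs are the ones mentioned above; also restore anything else your hardening standard expects):

testhost1# inetadm -d svc:/network/shell:default
testhost1# inetadm -d svc:/network/login:rlogin
testhost1# rm /.rhosts

Then re-comment the pam_rhosts_auth.so.1 lines in /etc/pam.conf and re-enable any TCP wrapper/IP Filter rules that were relaxed.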

Tuesday, March 13, 2012

Solaris Volume Manager diskset in a local zone

I usually choose ZFS volumes when I need to present raw devices in Solaris 10 local zones; however, in certain scenarios ZFS is not a good fit. One example is the /tempdb file systems in Sybase 12.5 or Sybase 15. There is no directio semantic in ZFS, and Sybase /tempdb does not work well on a ZFS file system (poor performance vs. UFS). In such cases, I need to present at least the /tempdb file systems as UFS. In order to make the disks portable (transferable to another zone/system), I usually put them in a Solaris Volume Manager diskset (metaset) so that they can be moved (export/import) if needed.

Another advantage of using a metaset instead of a plain UFS mount is that the volume management provided by SVM allows me to migrate the volumes online to other storage later on without rebooting/reconfiguring servers. This is usually the case when the SAN disks are hardware RAID and thus appear as single disks under Solaris (no software volume management if not under ZFS). I usually put the disk in a single-plex mirror device and then create soft partitions on top of the mirror device. It gives me more flexibility should I need to migrate data using volume management features.

For example:
# metaset -s Syb01MS -a -A enable -h host1 (create metaset Syb01MS on host1, auto-import at reboot)
# metaset -s Syb01MS -a c5t6xxxd0 (add the disk; it will automatically be relabeled with only s0 and s7)
# metainit -s Syb01MS d1 1 1 c5t6xxxd0s0 (create a single-plex submirror d1)
# metainit -s Syb01MS d0 -m d1 (create a one-way mirror device d0)
# metainit -s Syb01MS d11 -p d0 32g (start creating soft partitions on top of metadevice d0; some are raw volumes, some can be UFS file systems)
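
Continuing the example for a UFS /tempdb file system (a sketch; the d71 device name and 16g size are just illustrative, chosen to match the zonecfg output below):

# metainit -s Syb01MS d71 -p d0 16g (create the soft partition that will hold /tempdb01)
# newfs /dev/md/Syb01MS/rdsk/d71 (build the UFS file system on it)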

The following are some example entries from zonecfg info:

fs:
dir: /tempdb01
special: /dev/md/Syb01MS/dsk/d71
raw: /dev/md/Syb01MS/rdsk/d71
type: ufs
options: [rw]
......
device
match: /dev/md/shared/1/rdsk/d*
device
match: /dev/md/Syb01MS/rdsk/d*
.....

The metaset Syb01MS contains all the disks needed for the Sybase instance in the local zone. The raw volumes are presented by adding two entries for the device directive. The /dev/md/shared/1/... entry is a must, since the metaset name is a symbolic link to the metaset number (1 in this case) under /dev/md/shared/. In this way, the raw devices show up in the local zone as expected. What is more, I manually create a directory with another set of symbolic links pointing to these device entries (raw volumes), such as:

# ls -l /SybDev01
total 44
lrwxrwxrwx 1 sybase sybase 24 May 17 2011 DATA01 -> /dev/md/shared/1/rdsk/d11
lrwxrwxrwx 1 sybase sybase 24 May 17 2011 DATA02 -> /dev/md/shared/1/rdsk/d12
.......
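
A sketch of how such a directory can be built (the names are illustrative; -h makes chown change the link itself rather than its target):

# mkdir /SybDev01
# ln -s /dev/md/shared/1/rdsk/d11 /SybDev01/DATA01
# ln -s /dev/md/shared/1/rdsk/d12 /SybDev01/DATA02
# chown -h sybase:sybase /SybDev01/DATA01 /SybDev01/DATA02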

It gives the sysadmin more flexibility should we need to make any changes to the underlying devices later.

Monday, March 12, 2012

Create an async HUR pair specifying CTGID

We use "paircreate" to create HUR pairs between HDS arrays. Due to distance limitations we only create asynchronous (async) pairs.

For example, after setting up the HUR environment prerequisites (such as creating journal volumes in both arrays, installing the RM software on the local and remote CCI hosts, creating FC/FCIP paths between the arrays, etc.) and defining the proper entries in /etc/horcm.conf on both the local and remote CCI hosts:

From local CCI host:

# paircreate -g TestHur -vl -f async 0 -jp 0 -js 0

The above command creates the async HUR pair group TestHur. The first 0 is the CTGID, the second 0 is the journal ID for the P-VOL (local side), and the third 0 is the journal ID for the S-VOL.
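
To watch the initial copy, something like the following can be used (a sketch; -fxc adds hex LDEV and copy-rate columns, and pairevtwait simply waits until the group reaches PAIR status or the timeout expires):

# pairdisplay -g TestHur -fxc
# pairevtwait -g TestHur -s pair -t 3600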

The exact syntax of the `paircreate` command can be found in the "CCI Command Reference Guide".

You may use "paircreate -h" to try to get some hints:

bash-3.00# paircreate -h
Model : RAID-Manager/Solaris
Ver&Rev: 01-24-03/13
Usage : paircreate [options] for HORC
-h Help/Usage
-I[#] Set to HORCMINST#
-IH[#] or -ITC[#] Set to HORC mode [and HORCMINST#]
-IM[#] or -ISI[#] Set to MRCF mode [and HORCMINST#]
-z Set to the interactive mode
-zx Set to the interactive mode and HORCM monitoring
-q Quit(Return to main())
-g[s] Specify the group_name
-d[s] Specify the pair_volume_name
-d[g][s] [mun#] Specify the raw_device without '-g' option
-d[g][s] [mun#] Specify the LDEV# in the RAID without '-g' option
-nomsg Not display message of paircreate
-pvol or -vl Specify making P-VOL to the local instance
-svol or -vr Specify making S-VOL to the local instance
-pvol Specify making P-VOL to the ldev group
-svol Specify making S-VOL to the ldev group
-f[g] [CTGID] Specify the fence_level(never/status/data/async)
-jp -js Specify the journal group ID for UR with '-f async'
-c Specify the track size for copy
-cto [c-time] [r-time] Specify the timer for controlling CT group
-nocopy Set to the No_copy_mode
-nocsus Set to the No_copy_suspend for UR
-m Specify the create mode<'cyl or trk'> for S-VOL

The above hint may seem enough if you are familiar with HUR; however, more detailed information is in the CCI guide. For example, I found that if I omit the CTGID (the first 0 in my example; it appears it should be automatically assigned, but there are gotchas...) the paircreate may fail. The reason is explained in the CCI guide:

"A CTGID (CT Group ID) is assigned automatically if you do not specify
the “CTGID” option in this command. If “CTGID” is not specified and the maximum number of CT groups already exists, an EX_ENOCTG error is returned. Therefore, the “CTGID” option can forcibly assign a volume group to an existing CTGID (e.g., 0-127 on 9900V)."

As the old saying goes: "The devil is in the details".....


Friday, March 9, 2012

Rebuild tape devices on Solaris for NetBackup

On NetBackup Solaris servers, when I add a new tape drive or change the device paths in any way (the /dev/rmt/ entries change), I need to rebuild the sg devices (in /dev/sg) as well.

I always use the following steps on the Solaris host to accomplish this task (a consolidated command sketch follows the list):

1: run /usr/sbin/rem_drv sg

2: remove everything in /dev/sg/*

3: remove /kernel/drv/sg.conf

4: vi /etc/devlink.tab and remove all type=ddi_pseudo;name=sg entries

5: remove /usr/openv/volmgr/bin/driver/sg.links and sg.conf files

6: in /usr/openv/volmgr/bin/driver, cp sg.links.all to sg.links and sg.conf.all to sg.conf

7: cp sg.build from /usr/openv/volmgr/bin to /usr/openv/volmgr/bin/driver/

8: in /usr/openv/volmgr/bin/driver/ run ./sg.build all -mt 15 -ml 7

9: run sg.install

10: init 6
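
The same steps expressed as commands (a sketch; the -mt 15 -ml 7 values are simply the ones from my environment, adjust them to your own SCSI target/LUN layout):

# /usr/sbin/rem_drv sg
# rm /dev/sg/*
# rm /kernel/drv/sg.conf
(edit /etc/devlink.tab and delete the type=ddi_pseudo;name=sg lines)
# rm /usr/openv/volmgr/bin/driver/sg.links /usr/openv/volmgr/bin/driver/sg.conf
# cd /usr/openv/volmgr/bin/driver
# cp sg.links.all sg.links
# cp sg.conf.all sg.conf
# cp /usr/openv/volmgr/bin/sg.build .
# ./sg.build all -mt 15 -ml 7
# ./sg.install
# init 6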

After the server comes up, I use the following commands to check all devices and see if they match up correctly:

bash-3.00# /usr/openv/volmgr/bin/sgscan all
/dev/sg/c0t0l0: Disk (/dev/rdsk/c0t0d0): "LSILOGICLogical Volume"
/dev/sg/c0tw50xxl0: Tape (/dev/rmt/8): "STK T10000A"
/dev/sg/c0tw50xxl0: Tape (/dev/rmt/15): "STK T10000A"
/dev/sg/c0tw50xxl0: Tape (/dev/rmt/6): "STK T10000A"
/dev/sg/c0tw50xxl0: Tape (/dev/rmt/12): "STK T10000A"
/dev/sg/c0tw50xxl0: Tape (/dev/rmt/0): "STK T10000A"
/dev/sg/c0tw50xxl0: Tape (/dev/rmt/13): "STK T10000A"
/dev/sg/c0tw50xxl0: Tape (/dev/rmt/16): "IBM ULTRIUM-TD3"
/dev/sg/c0tw50xxl0: Tape (/dev/rmt/11): "STK T10000A"
/dev/sg/c0tw50xxl0: Tape (/dev/rmt/14): "STK T10000A"
/dev/sg/c0tw50xxl0: Tape (/dev/rmt/9): "STK T10000A"

bash-3.00# /usr/openv/volmgr/bin/tpautoconf -t
TPAC60 STK T10000A 1.48 531xx -1 -1 -1 -1 /dev/rmt/0cbn - -
TPAC60 STK T10000A 1.46 531xx -1 -1 -1 -1 /dev/rmt/9cbn - -
TPAC60 STK T10000A 1.48 531xx -1 -1 -1 -1 /dev/rmt/6cbn - -
TPAC60 STK T10000A 1.48 531xx -1 -1 -1 -1 /dev/rmt/11cbn - -
TPAC60 IBM ULTRIUM-TD3 93G0 121xx -1 -1 -1 -1 /dev/rmt/16cbn - -
TPAC60 STK T10000A 1.48 531xx -1 -1 -1 -1 /dev/rmt/14cbn - -
TPAC60 STK T10000A 1.48 531xx -1 -1 -1 -1 /dev/rmt/8cbn - -
TPAC60 STK T10000A 1.48 531xx -1 -1 -1 -1 /dev/rmt/13cbn - -
TPAC60 STK T10000A 1.48 531xx -1 -1 -1 -1 /dev/rmt/15cbn - -
TPAC60 STK T10000A 1.48 531xx -1 -1 -1 -1 /dev/rmt/12cbn - -


bash-3.00# ls -l /dev/rmt/*cbn
...

From the above commands you should be able to match each /dev/rmt/<x>cbn entry to the WWN of the drive; you can also find the unique serial number of each tape drive, so you can verify against the tape library that the devices are matched up correctly.

After finishing the above steps on all servers that need such changes, go to the NetBackup console (jnbSA) and use the "Configure Storage Devices -- Define robots and drives" wizard to let NetBackup figure out the device entries (for SSO) in the device database, and define storage units if necessary.
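
As a final check on each media server, the configured drive paths can also be listed from the command line and compared with the wizard's result (tpconfig is part of the Media Manager CLI):

bash-3.00# /usr/openv/volmgr/bin/tpconfig -d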

Thursday, March 8, 2012

"Inconsist of volume size" between local and remote in HDS HUR or TrueCopy

When I use HDS TrueCopy or HUR (Universal Replicator) to create a pair between HDS arrays, one of the common errors I have encountered is that the volume size does not match between the local and remote arrays.

The following is an example of the error messages in this scenario:

COMMAND ERROR : EUserId for HORC : root (0) Wed xxx xx 16:28:05 2011
CMDLINE : paircreate -g Testdc -vl -f never
16:28:05-39db4-11495- [paircreate] Inconsist of volume size between local and remote: devname:tdc01, localsize(0:2819280), remotesize(0:2814f00)
16:28:05-44091-11495- [paircreate][exit(212)]
[EX_ENQSIZ] Unmatched volume size for pairing
[Cause ]:Size of a volume is unsuitable between the remote and local volume.
[Action]:Please confirm volume size or number of LUSE volume using the 'raidscan -f' command,and change the SIZE of volumes identically.

As shown in the above log, the local volume size is 2819280 (hex; 42,046,080 blocks in decimal) and the remote volume size is 2814f00 (42,028,800 blocks). The difference is 17,280 blocks, or about 8.4 MB (17,280 x 512 bytes). I asked myself: how could this happen?

It turns out that before I tried to create the pair, I got the local volume size from the GUI (Storage Navigator or Device Manager); however, the size shown in the GUI depends on the "Capacity Unit" you select in the GUI. By default, the capacity unit is usually GB or MB. However, 1 MB = 2048 blocks (1 block = 512 bytes), so what if the volume is 2050 blocks? You bet it: the GUI will still show the volume as 1 MB if the capacity unit is MB.

The following is an example: LDEV 00:01 is shown as 206.34GB, or 211300MB, under the "LUN Expansion" -- "LUN Expansion" tab. If you use either of these sizes to create the remote volume, it will fail.

Why? Because if you go to the VLL tab and find LDEV 00:01 in group 5-7, you will see that its size is 211300.312MB (even this is a rounded number and cannot be relied upon), or, to be more accurate, 432743040 blocks if you set the "Capacity Unit" to "block".

Actually, 211300.312MB = 432743038.976 blocks; obviously that cannot be the real size, it is merely a rounding of the actual size in blocks.

So if you always set the "Capacity Unit" to block and use that size to create the remote volume, you will not encounter the "Inconsist of volume size" problem. Please note that it is equally important to use block as the capacity unit when creating the remote volume: even if you specify, for example, 10.312MB, it may or may not end at the exact block boundary you expect, and there can be rounding errors whenever you are not using block as the capacity unit.

The above examples are from a Sun 9990 array; this trick (set the capacity unit to block when getting the size of the local volume, and create the remote volume using that size in blocks) applies similarly to other HDS product lines. You will find similar settings (capacity unit) in HDS's latest VSP Device Manager GUI.

Or, as a Unix sysadmin, if you prefer the command line, use raidscan/raidcom to get the size of the local volume and create the remote volume; the size (in hex) you get from the command line won't "cheat" you. Please see the raidscan/raidcom man pages for more details.
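
For example, on a VSP-class array with raidcom available, something like the following shows the capacity in blocks (a sketch; the LDEV ID is illustrative and the exact output fields vary by model and microcode, but the number to use is the block count):

# raidcom get ldev -ldev_id 0x0001 | grep VOL_Capacity
VOL_Capacity(BLK) : 432743040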