Thursday, July 18, 2013

This is taken from IBM Systems Magazine - AIX

ASO Can Quickly Boost Performance for Free


In 2011, IBM announced the Active System Optimizer (ASO) enhancement, available only on recent AIX levels for POWER7. The intent was to automatically apply best-practices performance tweaks to individual LPARs. Some of those tweaks include improvements to memory affinity—specifically associating targeted workloads to a specific core or set of cores and determining if memory can be relocated for higher affinity to cache and cores. An in-LPAR solution, ASO runs in the background of each LPAR it’s tuning. Keep in mind it doesn’t support active memory sharing or LPAR migration.

Getting Started

ASO is installed by default in a standard installation as part of the bos.aso fileset. Running lslpp -l | grep bos.aso shows the installed level. The three ASO prerequisites are:

  1. AIX V6 TL08 or AIX V7
  2. The bos.aso fileset must be installed
  3. A POWER7/7+ server
To check the prerequisites:
# oslevel -s
7100-01-05-1228
This output shows the system is at: AIX V7 TL01 SP5.
# lslpp -l | grep aso
  bos.aso                   7.1.1.15  COMMITTED  Active System Optimizer
  bos.aso                   7.1.1.15  COMMITTED  Active System Optimizer
# lsconf | grep ^Processor
Processor Type: PowerPC_POWER7
Processor Implementation Mode: POWER 7
Processor Version: PV_7_Compat
Processor Clock Speed: 3000 MHz
This indicates the system can run ASO without problems.
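The AIX level check above can also be scripted. The following is a minimal sketch, assuming the oslevel -s format shown above (base-TL-SP-build) and the prerequisite as stated in this article (AIX 6.1 at TL08 or later, or AIX 7). The function name aso_oslevel_ok is hypothetical and is not part of AIX.

```shell
# Return 0 if an "oslevel -s" string meets the article's AIX prerequisite
# (AIX 6.1 at TL08 or later, or any AIX 7 level); return 1 otherwise.
aso_oslevel_ok() {
    base=${1%%-*}          # e.g. 7100 from 7100-01-05-1228
    rest=${1#*-}           # e.g. 01-05-1228
    tl=${rest%%-*}         # e.g. 01
    tl=${tl#0}             # drop one leading zero so 08 is read as 8
    case $base in
        ''|*[!0-9]*) return 1 ;;   # not a number: treat as unsupported
    esac
    case $tl in
        ''|*[!0-9]*) return 1 ;;
    esac
    if [ "$base" -ge 7100 ]; then
        return 0
    fi
    [ "$base" -eq 6100 ] && [ "$tl" -ge 8 ]
}
```

On a live system it could be called as aso_oslevel_ok "$(oslevel -s)" && echo "AIX level OK for ASO". The fileset and processor checks still need to be done separately.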
To use ASO, first start the subsystem using startsrc -s aso. You can then monitor and control it using the asoo command. The following commands can be very useful:
# lssrc -s aso
Subsystem         Group            PID          Status 
 aso                              5373980      active
The ASO subsystem is started and is running as PID 5373980.
# ps -ef | grep 5373980
    root  5373980  3604670   0   Jan 02      -  1:14 /usr/sbin/aso
The program being run is /usr/sbin/aso.
asoo -L lists the current and reboot values:
# asoo -L
NAME                  CUR    DEF    BOOT   MIN    MAX     UNIT       TYPE 
aso_active             0      0      0      0      1      boolean      D
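To spot tunables that have been changed from their defaults, the -L listing can be filtered. This is a rough sketch that assumes the simple column layout shown above (NAME CUR DEF BOOT MIN MAX UNIT TYPE, one tunable per line); changed_tunables is a hypothetical helper name.

```shell
# Read "asoo -L"-style output on stdin and print "NAME CUR DEF" for every
# tunable whose current value differs from its default.
# Assumes the column order NAME CUR DEF BOOT MIN MAX UNIT TYPE.
changed_tunables() {
    awk '$1 != "NAME" && NF >= 3 && $2 != $3 { print $1, $2, $3 }'
}
```

Usage would be along the lines of asoo -L | changed_tunables. It prints nothing when every listed tunable is still at its default.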

# asoo -p -o aso_active=1
This permanently turns ASO on (0 is disabled and is the default).
To see what’s happening, check the ASO logs. Their location is specified in /etc/syslog.conf and defaults to:
aso.notice /var/log/aso/aso.log      rotate size 128k time 7d
aso.info    /var/log/aso/aso_process.log rotate size 1024k
The aso.log records the on/off status and the reasons ASO goes into hibernation. Similarly, aso_process.log records the same information as well as details of any actions ASO takes and the processes it modifies.
My test LPAR is on a Power 720 and has an entitlement of 0.5, VPs set to 2 and 4GB of memory. The lssrad command shows I only have one socket, so there’s little to tune with respect to affinity.
# lssrad -av
REF1   SRAD        MEM      CPU
0
          0    3699.81      0-11
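A quick way to see how many affinity domains an LPAR spans is to count the SRAD rows in the lssrad output. The sketch below assumes the layout shown above, where SRAD rows are indented and REF1 rows start in the first column; count_srads is a hypothetical helper name.

```shell
# Read "lssrad -av"-style output on stdin and print the number of SRAD rows.
# Assumes SRAD rows are indented and begin with the SRAD number, while the
# header and REF1 rows start in column one.
count_srads() {
    awk '/^[ \t]+[0-9]/ { n++ } END { print n + 0 }'
}
```

It could be run as lssrad -av | count_srads. A result of 1, as on this test LPAR, suggests there is little for ASO to tune with respect to affinity.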
Although the ASO subsystem is running, we still need to enable ASO as follows:
# asoo -o aso_active=1
Setting aso_active to 1
The log now shows:
Jun 16 17:32:46 testlpar aso:notice aso[5373980]: ASO enabled by tunable
Jun 16 17:32:46 testlpar aso:notice aso[5373980]: [HIB] Current number of system virtual cpus too low (2 cpus)
Jun 16 17:32:46 testlpar aso:notice aso[5373980]: [HIB] Increase system virtual cpus to at least 3 cpus to run ASO. Hibernating.
When planning where and when to run ASO, keep in mind it needs at least three virtual processors (VPs) to run. At this point, ASO was still enabled, so I disabled it again and used Dynamic LPAR to add another VP to the LPAR.
# lsdev -C | grep proc
proc0      Available 00-00       Processor
proc4      Available 00-04       Processor
proc8      Available 00-08       Processor
# asoo -o aso_active=1
Setting aso_active to 1
The log now shows:
Jun 16 17:39:08 testlpar aso:notice aso[5373980]: ASO enabled by tunable
Unfortunately, the test system wasn’t very busy so ASO decided to hibernate:
Jun 16 17:40:53 testlpar aso:notice aso[5373980]: [HIB] Used entitlement per unfolded vCPU is below threshold (1% of a core).
Jun 16 17:40:53 testlpar aso:notice aso[5373980]: [HIB] ASO will hibernate until used entitlement is at least 30% of a core per unfolded vCPU
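Since ASO tags its hibernation messages with [HIB], the reasons can be pulled out of the log with a small filter. This is only a sketch: it assumes each log entry sits on a single line, and aso_hib_reasons is a hypothetical helper name.

```shell
# Read ASO log lines on stdin and print only the hibernation reasons,
# i.e. the text that follows the [HIB] tag on each matching line.
aso_hib_reasons() {
    grep -F '[HIB]' | sed 's/^.*\[HIB\] *//'
}
```

For example, aso_hib_reasons < /var/log/aso/aso.log would list why ASO went to sleep, using the default log location mentioned earlier.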

Internal Analysis

On a busy system, ASO takes some time to profile and analyze the running workloads before it dynamically tunes the system for them. Once it’s running, it needs no further interaction. ASO examines AIX kernel data about processes and threads, along with the POWER7 hardware performance counters, and bases its tuning decisions on that data. It tries to improve cache and memory affinity by moving workloads to cores that better suit their affinity needs. It takes into account Workload Manager (WLM) and workload partition (WPAR) resource sets as well as constraints. It does this by observing the workload over a period of minutes and setting scheduler resource allocation domain (SRAD) and resource set (RSET) rules to change the CPUs a workload runs on. All changes are logged and tracked, with ASO logging both the expected and actual gain.
By default, ASO enables cache and memory affinity optimization, large page optimization, and memory prefetch. The restricted tunables that control these optimizations should be modified only at the request of IBM Support.
To improve the cache and memory performance of workloads, ASO uses one of three types of optimization: cache affinity, aggressive cache affinity or memory affinity. For cache affinity, it tries to reduce chip-to-chip cache movement. For a workload assigned cores across two sockets, ASO tries to place the workload so it fits within a socket using an RSET. For aggressive cache affinity, it tries to reduce the number of chips used and may compress the workload onto fewer cores to do so. For memory affinity, ASO tries to ensure that local memory is used. This often involves migrating pages for workloads to DIMMs closer to the core being used, thus reducing remote memory traffic. It uses CPU and memory RSETs to do this.

Debugging

In the unlikely event you experience problems with ASO, be aware that the perfpmr.sh scripts don’t collect the ASO logs, so you must put ASO into debug mode yourself:
asoo -o debug_level=3
You’ll then need to recreate the problem and reset ASO to normal logging:
asoo -o debug_level=0
Finally, forward the aso_debug.out file to IBM Support.
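To make sure debug logging is always switched back off, the two asoo calls can be wrapped around whatever reproduces the problem. This is a sketch only: with_aso_debug and the ASOO variable are hypothetical names, and the debug levels 3 and 0 are the ones given above.

```shell
# Command used to set tunables; overridable so the wrapper can be dry-run.
ASOO=${ASOO:-asoo}

# Run the given command with ASO debug logging on, then restore normal
# logging even if the command fails. Returns the command's exit status.
with_aso_debug() {
    "$ASOO" -o debug_level=3 || return 1
    "$@"
    rc=$?
    "$ASOO" -o debug_level=0
    return "$rc"
}
```

A call such as with_aso_debug ./reproduce_problem.sh (a placeholder script name) would then bracket the test run with debug logging.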

A Fine Tuning Tool

Free with AIX and easy to use, ASO offers great benefits for tuning. It monitors workloads on a busy system and dynamically moves them to gain the best affinity within an LPAR. With it, administrators can avoid making WLM and RSET changes to attain affinity.
It can be turned off immediately if performance results are less than expected. It’s particularly useful for multi-threaded, long-running processes on larger systems, especially the Power 770 and above, which have multiple nodes. However, any of the multi-socket systems could benefit. Try ASO on test LPARs to gauge its potential benefits for performance.

Friday, November 9, 2012

Running cldump on a Cluster

http://ibmsystemsmag.blogs.com/aixchange/2012/11/running-cldump-on-a-cluster.html

I am putting this on my blog as a point of reference for my own future use. Anybody who wants to know more should refer to the URL above. If this is an illegal posting, please alert me.
Thanks.

Written by Rob McNelly


I was recently asked why the cldump command wasn't running on a PowerHA 7.1 cluster.
After running /usr/es/sbin/cluster/utilities/cldump, my client received this output:

            cldump: Waiting for the Cluster SMUX peer (clstrmgrES)
            to stabilize.............
            Failed retrieving cluster information.

            There are a number of possible causes:
            clinfoES or snmpd subsystems are not active.
            snmp is unresponsive.
            snmp is not configured correctly.
            Cluster services are not active on any nodes.

            Refer to the HACMP Administration Guide for more information.

I checked and learned that IBM has been scaling back the default SNMP configuration over the years for security reasons. However, this issue is relatively easy to address:
            1) edit /etc/snmpv3.conf (all nodes) and remove the comment hash from this line:

            #COMMUNITY public    public     noAuthNoPriv 0.0.0.0    0.0.0.0         -

            2) add this line (this is the top-level cluster view of the SNMP MIB):

            VACM_VIEW        defaultView     1.3.6.1.4.1.2.3.1.2.1.5 - included -

            3) restart the relevant daemons (this can be done without stopping cluster services):

            stopsrc -s clinfoES
            stopsrc -s snmpd
            stopsrc -s aixmibd
            stopsrc -s hostmibd
            stopsrc -s snmpmibd
            sleep 10
            startsrc -s snmpd
            startsrc -s aixmibd
            startsrc -s hostmibd
            startsrc -s snmpmibd
            sleep 60
            startsrc -s clinfoES

After these changes, cldump was working. 
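Before restarting the daemons on each node, it can be worth confirming that both edits are actually in place. The sketch below checks a file for the two lines described in steps 1 and 2; snmp_conf_ready is a hypothetical helper name, and the patterns assume the lines are written as shown above.

```shell
# Return 0 if the given snmpv3.conf-style file has an uncommented
# "COMMUNITY public" line and a VACM_VIEW line for the cluster MIB subtree.
snmp_conf_ready() {
    conf=$1
    grep -Eq '^[[:space:]]*COMMUNITY[[:space:]]+public[[:space:]]' "$conf" &&
    grep -Eq '^[[:space:]]*VACM_VIEW[[:space:]]+defaultView[[:space:]]+1\.3\.6\.1\.4\.1\.2\.3\.1\.2\.1\.5[[:space:]]' "$conf"
}
```

On a cluster node this could be run as snmp_conf_ready /etc/snmpv3.conf || echo "snmpv3.conf still needs editing".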
We also found warning messages when we started cluster services or tried to synchronize the cluster:
            WARNING: Volume group datavg is an enhanced concurrent mode volume group used as a serial resource, but the LVM level on node nodea1 does not support fast disk takeover
            WARNING: Volume group datavg is an enhanced concurrent mode volume group used as a serial resource, but the LVM level on node nodea2 does not support fast disk takeover
            WARNING: Volume group datavg is an enhanced concurrent mode volume group used as a serial resource, but the LVM level on node nodea1 does not support fast disk takeover
            WARNING: Volume group datavg is an enhanced concurrent mode volume group used as a serial resource, but the LVM level on node nodea2 does not support fast disk takeover
I called support and was told that this was addressed by IV26874. We were also provided with an iFix, which, once loaded, took care of the problem. So if you see the warning, contact IBM and get the iFix (if it isn't yet available via a service pack).
Incidentally, neither of these issues was a show-stopper in my client's environment. I continue to be very impressed by PowerHA 7.1.

Friday, February 18, 2011

HOW TO INSTALL OPENSSH ON AIX 6.1

1. To install OpenSSH you will also need OpenSSL.

OpenSSL can be downloaded from:
https://www14.software.ibm.com/webapp/iwm/web/reg/download.do?source=aixbp&S_PKG=openssl&lang=en_US&cp=UTF-8

The file name is:
openssl.0.9.8.1301.tar.Z (6 MB)

OpenSSH can be downloaded from:
http://sourceforge.net/projects/openssh-aix/files/openssh-aix%20Source%20Patch/Openssh-4.5p1_srcpatch/openssh-4.5_srcpatch.tar/download

The file name is:
openssh-4.5_srcpatch.tar

2. Once downloaded, create a directory (e.g. /tmp/openssh) and transfer the files to it.
3. Uncompress the files.
4. Extract the OpenSSL and OpenSSH archives with tar -xvf:

tar -xvf ./openssl.0.9.8.1301.tar
tar -xvf ./openssh-4.5_srcpatch.tar

5. Install OpenSSL first. Extracting the OpenSSL archive creates an openssl directory. Change to that directory and run inutoc:
/tmp/openssh> cd openssl.0.9.8.1301
/tmp/openssh/openssl.0.9.8.1301> inutoc .

Use smit install and point the installation directory to /tmp/openssh/openssl.0.9.8.1301. Remember to answer yes to accepting new license agreements and to install all the files.

6. Then install OpenSSH.
/tmp/openssh> inutoc .

Use smit install and point the installation directory to /tmp/openssh. Remember to answer yes to accepting new license agreements and to install all the files.

7. To start SSH, start the sshd subsystem as the root user:

/tmp/openssh> startsrc -s sshd


Good luck!

Friday, January 21, 2011

AIX IN THE UNIX WORLD

Twenty-five years ago, on January 21, 1986, IBM Austin launched a new operating system called IBM RT Personal Computer Advanced Interactive eXecutive -- better known as AIX -- with a new system called the IBM RT PC. The system ran on a RISC processor codenamed “ROMP” (for Research Office Products Division Micro Processor) and was originally marketed as an engineering workstation.

This new AIX operating system was based on the UNIX operating system, but it included significant IBM enhancements such as virtualization to allow multiple operating systems to run on a single machine, support for high-resolution displays, and a simple user interface.

Over the ensuing years we have seen significant advances in the evolution of AIX:
- 1990, AIX V3 on the RS/6000 on the first POWER processor
- 1994, AIX V4 with support for symmetric multiprocessing
- 2001, AIX 5L provides logical partitioning virtualization on POWER4
- 2007, AIX 6 contains workload partitions
- 2010, AIX 7 has built in clustering and the ability to host an earlier version of AIX

Of course, AIX was not evolving alone. Since that original release on the RT PC in 1986, the capabilities of Power processors and hardware have grown from a single processor running at 5.9 MHz to today’s Power 795 running up to 256 POWER7 cores at 4.25 GHz.

Throughout this evolution, the AIX and Power Systems market position has grown from a small fraction of the engineering workstation market to the market leader of the $16 billion enterprise UNIX server market. Leadership performance has long been a key part of this success, but AIX and Power Systems also lead the market in reliability and availability.

Monday, March 29, 2010

AIX QUICK SHEET

Filesystems

/dev/hd1         /home
/dev/hd2         /usr
/dev/hd3         /tmp
/dev/hd4         /                        (root)
/dev/hd5         BLV (Boot Logical Volume)
/dev/hd6         Paging space
/dev/hd8         JFS2 log
/dev/hd9var      /var
/dev/hd10opt     /opt
/dev/hd11admin   /admin                   New in 6.1
livedump         /var/adm/ras/livedump    New in 6.1 TL3
/proc            procfs pseudo filesystem

Remove mount point entry and the LV for /mymount
rmfs /mymount (Add -r to remove mount point)

Grow the /var filesystem by 1 Gig
chfs -a size=+1G /var

Grow the /var filesystem to 1 Gig
chfs -a size=1G /var

Find the file usage on a filesystem
du -smx /

List filesystems in a grep-able format
lsfs

Get extended information about the /home filesystem
lsfs -q /home

Create a log device on datavg VG
mklv -t jfs2log -y datalog1 datavg 1

Format the log device just created
logform /dev/datalog1

Kernel Tuning

- no is used in the following examples. vmo, no, nfso, ioo, raso, and schedo all use similar syntax. lvmo uses different syntax.

Reset all networking tunables to the default values
no -D (Changed values will be listed)

List all networking tunables
no -a

Set a tunable temporarily (until reboot)
no -o use_isno=1

Set a tunable at next reboot
no -r -o use_isno=1

Set current value of tunable as well as reboot
no -p -o use_isno=1

List all settings, defaults, min, max, and next boot values
no -L

List all sys0 tunables
lsattr -El sys0

Get information on the minperm% vmo tunable
vmo -h minperm%

Change the maximum number of user processes to 2048
chdev -l sys0 -a maxuproc=2048

Check to see if SMT is enabled
smtctl

Directory containing tunables settings
/etc/tunables/

ODM
Query CuDv for a specific item
odmget -q name=hdisk0 CuDv

Query CuDv using the "like" syntax
odmget -q "name like hdisk?" CuDv

Query CuDv using a complex query
odmget -q "name like hdisk? and parent like vscsi?" CuDv

Devices
List all devices on a system
lsdev

List all disk devices on a system (See next item for a list of classes)
lsdev -Cc disk

List all customized (existing) device classes (-P for complete list)
lsdev -C -r class

Remove hdisk5
rmdev -dl hdisk5

Get device address of hdisk1
getconf DISK_DEVNAME hdisk1 (or) bootinfo -o hdisk1

Get the size (in MB) of hdisk1
getconf DISK_SIZE /dev/hdisk1 (or) bootinfo -s hdisk1

List all disks belonging to scsi0
lsdev -Cc disk -p scsi0

Find the slot of a PCI Ethernet adapter
lsslot -c pci -l ent0

Find the (virtual) location of an Ethernet adapter
lscfg -l ent1

Find the location codes of all devices in the system
lscfg

List all MPIO paths for hdisk0
lspath -l hdisk0

Find the WWN of the fcs0 HBA adapter
lscfg -vl fcs0 | grep Network

Temporarily change console output to /console.out
swcons /console.out -> (Use swcons to change back.)

Get statistics and extended information on fcs0
fcstat fcs0

Tasks
Change port type of HBA (This may vary by HBA vendor)
rmdev -d -l fcnet0
rmdev -d -l fscsi0
chdev -l fcs0 -a link_type=pt2pt
cfgmgr

Mirroring rootvg to hdisk1
extendvg rootvg hdisk1
mirrorvg rootvg
bosboot -ad hdisk0
bosboot -ad hdisk1
bootlist -m normal hdisk0 hdisk1

Mount a CD/DVD ROM to /mnt

mount -rv cdrfs /dev/cd0 /mnt -> (for a CD)
mount -v udfs -o ro /dev/cd0 /mnt -> (for a DVD)
-> Note the two different types of read-only flags. Either is OK.

Create a VG, LV, and FS, mirror, and create mirrored LV

mkvg -s 256 -y datavg hdisk1 (PP size is 1/4 Gig)
mklv -t jfs2log -y dataloglv datavg 1
logform /dev/dataloglv
mklv -t jfs2 -y data01lv datavg 8 -> (2 Gig LV)
crfs -v jfs2 -d data01lv -m /data01 -A yes
extendvg datavg hdisk2
mklvcopy dataloglv 2 -> (Note use of mirrorvg in next example)
mklvcopy data01lv 2
syncvg -v datavg
-> lsvg -l datavg will now list 2 PPs for every LP
mklv -c 2 -t jfs2 -y data02lv datavg 8 -> (2 Gig LV)
crfs -v jfs2 -d data02lv -m /data02 -A yes
mount -a
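The LP counts passed to mklv above follow from the PP size: with 256 MB partitions, 8 LPs give a 2 Gig LV. A small helper can do that arithmetic for other sizes. This is a sketch only; lps_for_size is a hypothetical name, not an AIX command.

```shell
# Print the number of logical partitions needed for an LV of size_mb
# megabytes when the VG's PP size is pp_mb megabytes (rounded up).
lps_for_size() {
    size_mb=$1
    pp_mb=$2
    echo $(( (size_mb + pp_mb - 1) / pp_mb ))
}
```

For example, lps_for_size 2048 256 prints 8, which matches the mklv commands above.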

Move a VG from hdisk1 to hdisk2

extendvg datavg hdisk2
mirrorvg datavg hdisk2
-> Wait for mirrors to synchronize
unmirrorvg datavg hdisk1
reducevg datavg hdisk1

Find the free space on PV hdisk1
lspv hdisk1 -> (Look for "FREE PPs")

Additional Information
http://publib.boulder.ibm.com/infocenter/systems/scope/aix
http://www.redbooks.ibm.com/portals/unix
Display error codes can be found in the "Diagnostic Information for Multiple Bus Systems" manual

About this QuickSheet
Created by: William Favorite (wfavorite@tablespace.net)
Updates at: http://www.tablespace.net/quicksheet/
Disclaimer: This document is a guide and it includes no express warranties to the suitability, relevance, or compatibility of its contents with any specific system. Research any and all commands that you inflict upon your command line.
Distribution: The PDF version is free to redistribute as long as credit to the author and tablespace.net is retained in the printed and viewable versions. LATEX source not distributed at this time.

Saturday, August 15, 2009

New and Updated IBM Redbooks

TS7650G and TS7650 ProtecTIER De-duplication Servers
Revised: August 7, 2009
http://w3.itso.ibm.com/redpieces/abstracts/sg247652.html?Open

Tuning System x Servers for Performance
Published: August 4, 2009 ISBN: 0738433071 848 pages
http://www.redbooks.ibm.com/abstracts/sg245287.html?Open

Implementing an iDataPlex Solution
Published: August 4, 2009 ISBN: 0738433233 272 pages
http://www.redbooks.ibm.com/abstracts/sg247629.html?Open

Managing Unified Storage with N-Series Operation Manager
Published: August 4, 2009 ISBN: 0738433160 576 pages
http://www.redbooks.ibm.com/abstracts/sg247734.html?Open



PowerVM Virtualization on System p: Intro & Configuration
Revised: July 29, 2009 ISBN: 0738485306 398 pages
http://www.redbooks.ibm.com/abstracts/sg247940.html?Open

Power 520 Technical Overview
Revised: July 29, 2009
http://www.redbooks.ibm.com/redpapers/abstracts/redp4403.html?Open

Brocade 8Gb FC Single-port and Dual-port HBA for System x
Published: July 31, 2009
http://www.redbooks.ibm.com/abstracts/tips0719.html?Open

QLogic 8Gb FC Single-port and Dual-port HBAs for System x
Published: July 31, 2009
http://www.redbooks.ibm.com/abstracts/tips0721.html?Open

System hang after update to 6100-03

Systems with Encrypted File System (EFS) support enabled may fail to boot after updating to the 6100-03 Technology Level.

This problem occurs if the clic.rte.kernext and clic.rte.lib filesets are installed at a level below 4.6.0.0. EFS may be enabled if the 'efsenable' command was ever executed on the system, even if there are no encrypted file systems currently in use. The presence of the file /var/efs/efsenabled indicates that EFS is enabled.
To avoid this issue, update the clic.rte.kernext and clic.rte.lib filesets to the 4.6.0.0 level from the clic.rte image available on the AIX Expansion Pack, dated 5/2009 or later. The clic.rte filesets need to be updated before the system is rebooted after updating to the 6100-03 Technology Level.

If the 6100-03 Technology Level is installed without updating the clic.rte filesets to the 4.6.0.0 level, the system may fail to reboot. Should this occur, you can recover using the following procedure.

1. Reboot system into service mode
2. Select 'Task Selection', then choose 'Shell Prompt'
3. Move file /var/efs/efsenabled to /var/efs/efsenabled.SAVE
4. Reboot system into normal mode
5. Update clic.rte.kernext and clic.rte.lib to the 4.6.0.0 level
6. Move back /var/efs/efsenabled from /var/efs/efsenabled.SAVE
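The two conditions described above (EFS enabled, clic.rte filesets below 4.6.0.0) can be checked before applying the Technology Level. The following is a sketch under those stated conditions only; version_lt and efs_boot_risk are hypothetical helper names, and the fileset level has to be supplied by the caller.

```shell
# Return 0 if dotted version $1 is lower than dotted version $2.
version_lt() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            xi = (i <= n) ? x[i] + 0 : 0   # missing fields count as 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi < yi) exit 0
            if (xi > yi) exit 1
        }
        exit 1                              # equal versions are not lower
    }'
}

# Return 0 if the EFS flag file exists and the given clic.rte fileset
# level is below 4.6.0.0, i.e. the combination this advisory warns about.
efs_boot_risk() {
    flag_file=$1
    clic_level=$2
    [ -e "$flag_file" ] && version_lt "$clic_level" 4.6.0.0
}
```

A call might look like efs_boot_risk /var/efs/efsenabled 4.5.0.1 && echo "update clic.rte before rebooting", with the level taken from the lslpp listing for clic.rte.kernext.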

Distribution of the 6100-03 Technology Level on Fix Central has been temporarily suspended until this issue is resolved.

So, be warned!