New Challenge: The Way to Success

We are what we think, and what we get is what we put effort into. With passion and focus we can reach our dreams and make them come true. Every day, in life, career, and social circles, we face challenges; they vary depending on your daily life and your interests… you can take them or lose them.

Today I passed the Oracle Certified Professional, Oracle Solaris Cluster 3.2 System Administrator exam, and this is the new challenge in my career.

The Oracle Certified Professional, Oracle Solaris Cluster 3.2 System Administrator certification is for system administrators who install, support, and administer Oracle Solaris Cluster 3.2 or Sun Cluster 3.2. It is recommended that candidates have taken the Sun Cluster 3.2 Administration course and have a minimum of 6-12 months of Oracle Solaris Cluster or Sun Cluster installation and administration experience, basic network administration skills, and the ability to perform volume management using both Solaris Volume Manager (SVM) and VERITAS Volume Manager (VxVM). ~ Oracle Certification Program

Troubleshooting a lost diskset on Sun Cluster

In the middle of the night I checked my cluster lab, and it showed that my Apache resource group was not running… I re-checked it and found that the cluster node had not mounted the webds filesystem at /global/web. My webds diskset was gone, and I don't know the root cause of this problem… 😀 Maybe while doing another lab on the same node I unknowingly changed its configuration.
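
The symptom is easy to spot; here is a rough sketch of the checks, using the resource and mount-point names from this lab:

clrs status apache-res          # the apache resource is not online
mount | grep /global/web        # returns nothing, the global filesystem is not mounted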

If I check the status of the webds metaset, it shows that no node owns this diskset:

bash-3.00# metaset -s webds

Set name = webds, Set number = 2

Host                Owner
  clnode-01
  clnode-02

Mediator Host(s)    Aliases
  clnode-01
  clnode-02

Drive Dbase

d6   Yes

And if I run metastat for webds, it comes up with an error:

bash-3.00# metastat -s webds
metastat: clnode-01: webds: must be owner of the set for this command
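
Before purging anything, it may be worth simply trying to take ownership of the set on one node first; this is an assumption on my part rather than part of my original notes, and in this state it may well fail too:

metaset -s webds -t -f          # -t takes the set, -f forces the take
metastat -s webds               # if the take worked, this now reports the metadevices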

The resolution is simple; here are my troubleshooting steps (a concrete example using this lab's names follows the list):
1. Boot node-1 and node-2 in non-cluster mode.
2. Comment out the shared device entry in /etc/vfstab.
3. Boot node-1 and node-2 back in cluster mode.
4. On node-2, force-purge the lost diskset:

metaset -s <setname> -P -f

5. On node-1, force-purge the lost diskset:

metaset -s <setname> -P -f

Then re-create your metaset diskset:

metaset -s <setname> -a -h NodeA NodeB
metaset -s <setname> -a <diskpath0> <diskpath1> ... <diskpathN>
metaset -s <setname> -a -m NodeA NodeB
metaset

(should show new set and ownership)
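
Putting steps 4 and 5 together with the names from this lab (set webds, hosts clnode-01 and clnode-02, and DID device d6 from the metaset output above; the exact /dev/did/rdsk/d6 path is my assumption), the sequence would look roughly like this:

metaset -s webds -P -f                        # on clnode-02, then on clnode-01
metaset -s webds -a -h clnode-01 clnode-02    # re-create the set with both hosts
metaset -s webds -a /dev/did/rdsk/d6          # add the shared DID drive back
metaset -s webds -a -m clnode-01 clnode-02    # add the mediator hosts
metaset                                       # should show the new set and its owner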

Note: because my webds diskset is an SVM diskset, I also re-created the soft partition on it:

bash-3.00# metainit -s webds d1 1 1 /dev/did/rdsk/d6s0
webds/d1: Concat/Stripe is setup
bash-3.00# metainit -s webds d200 -p d1 3g
d200: Soft Partition is setup
bash-3.00# metastat -s webds
webds/d200: Soft Partition
    Device: webds/d1
    State: Okay
    Size: 6291456 blocks (3.0 GB)
        Extent              Start Block              Block count
             0                       32                  6291456

webds/d1: Concat/Stripe
    Size: 10457088 blocks (5.0 GB)
    Stripe 0:
        Device   Start Block  Dbase        State Reloc Hot Spare
        d6s0            0     No            Okay   No

Device Relocation Information:
Device   Reloc  Device ID
d6   No         -

Testing the mount and listing the directory (the old data should still be there, since purging the set and re-creating the soft partition at the same starting offset does not touch the data blocks):

bash-3.00# mount /dev/md/webds/dsk/d200 /global/web
bash-3.00# ls -l /global/web
total 24
drwxr-xr-x   2 root     root         512 May 31 21:55 bin
drwxr-xr-x   2 root     bin          512 May 25 06:22 cgi-bin
drwxr-xr-x   2 root     root         512 May 31 21:59 conf
drwxr-xr-x   2 root     bin         1024 May 25 06:22 htdocs
drwx------   2 root     root        8192 May 31 20:56 lost+found
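
After verifying the data, I undo the manual test mount; and if you commented out the /global/web entry in /etc/vfstab back in step 2, remember to un-comment it again on both nodes (this cleanup is my assumption about the full recovery, not part of my original notes):

umount /global/web              # release the manual test mount
vi /etc/vfstab                  # un-comment the /global/web line for /dev/md/webds/dsk/d200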

And you should be happy: your cluster resource groups are running again.

bash-3.00# clrg status

=== Cluster Resource Groups ===

Group Name    Node Name             Suspended   Status
----------    ---------             ---------   ------
nfs-rg        clnode-01             No          Online
              clnode-02             No          Offline

apache-rg     clnode-01:webapp-01   No          Online
              clnode-01:webapp-02   No          Offline

bash-3.00# clrs status

=== Cluster Resources ===

Resource Name      Node Name             State     Status Message
-------------      ---------             -----     --------------
nfs-res            clnode-01             Online    Online - Service is online.
                   clnode-02             Offline   Offline

nfs-stor           clnode-01             Online    Online
                   clnode-02             Offline   Offline

mycluster-nfs      clnode-01             Online    Online - LogicalHostname online.
                   clnode-02             Offline   Offline

apache-res         clnode-01:webapp-01   Online    Online - Service is online.
                   clnode-01:webapp-02   Offline   Offline

apache-stor        clnode-01:webapp-01   Online    Online
                   clnode-01:webapp-02   Offline   Offline

mycluster-webapp   clnode-01:webapp-01   Online    Online - LogicalHostname online.
                   clnode-01:webapp-02   Offline   Offline

Reboot your node if needed. 🙂
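
If the resource groups do not come back online on their own, a rough sketch (using the group name from the status output above) is to bring them online explicitly and check again:

clrg online apache-rg           # bring the apache resource group online
clrg status                     # both groups should report Online on one node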