Fwd: Re: [HECnet] More clustering fun

Mark Wickens mark at wickensonline.co.uk
Fri Sep 16 11:09:04 PDT 2011


Just to add to the picture - if I reduce VOTES on the satellite to 0 I get this happening:

-------- Original Message --------
Subject: Re: [HECnet] More clustering fun
Date: Fri, 16 Sep 2011 09:33:47 +0000
From: hvlems at zonnet.nl
Reply-To: hvlems at zonnet.nl
To: Mark Wickens <mark at wickensonline.co.uk>


All I can think of is this:
1) SLAVE and ALEPH both use the same VMSCLUSTER license
2) The cluster id and cluster password differ between the two nodes.
On an Alpha you can modify this in SYSMAN (use HELP in SYSMAN to find the correct command). On a VAX the command is buried in SYSGEN.
Hans
-----Original Message-----
From: Mark Wickens <mark at wickensonline.co.uk>
Date: Fri, 16 Sep 2011 10:11:41 
Cc: <hvlems at zonnet.nl>
Subject: Re: [HECnet] More clustering fun

Hi Hans,

I didn't want to pre-empt what I thought happened last time as I wasn't 
sure I'd got it right, but it has happened again so it's definitely an 
issue.

I've updated VOTES in MODPARAMS.DAT on ALEPH (the satellite), run 
AUTOGEN and rebooted the cluster.

Now when ALEPH attempts to join the cluster I get these messages repeatedly:

%CNXMAN,  sending VAXcluster membership request to system SLAVE
%CNXMAN,  sending VAXcluster membership request to system SLAVE
%CNXMAN,  sending VAXcluster membership request to system SLAVE
%CNXMAN,  sending VAXcluster membership request to system SLAVE
%CNXMAN,  sending VAXcluster membership request to system SLAVE

and I see this on SLAVE:

$$
%CNXMAN,  Received VMScluster membership request from system ALEPH
%CNXMAN,  Proposing addition of system ALEPH
%CNXMAN,  Completing VMScluster state transition
%%%%%%%%%%%  OPCOM  16-SEP-2011 10:05:35.79  %%%%%%%%%%%
10:05:35.79 Node SLAVE (csid 00010001) received VMScluster membership 
request from node ALEPH

%%%%%%%%%%%  OPCOM  16-SEP-2011 10:05:35.79  %%%%%%%%%%%
10:05:35.79 Node SLAVE (csid 00010001) proposed addition of node ALEPH

%%%%%%%%%%%  OPCOM  16-SEP-2011 10:05:35.79  %%%%%%%%%%%
10:05:35.79 Node SLAVE (csid 00010001) completed VMScluster state transition

$$
%CNXMAN,  Received VMScluster membership request from system ALEPH
%CNXMAN,  Proposing addition of system ALEPH
%CNXMAN,  Completing VMScluster state transition
%%%%%%%%%%%  OPCOM  16-SEP-2011 10:05:39.04  %%%%%%%%%%%
10:05:39.04 Node SLAVE (csid 00010001) received VMScluster membership 
request from node ALEPH

%%%%%%%%%%%  OPCOM  16-SEP-2011 10:05:39.04  %%%%%%%%%%%
10:05:39.04 Node SLAVE (csid 00010001) proposed addition of node ALEPH

%%%%%%%%%%%  OPCOM  16-SEP-2011 10:05:39.04  %%%%%%%%%%%
10:05:39.04 Node SLAVE (csid 00010001) completed VMScluster state transition

$$
%CNXMAN,  Received VMScluster membership request from system ALEPH
%CNXMAN,  Proposing addition of system ALEPH
%CNXMAN,  Completing VMScluster state transition
%%%%%%%%%%%  OPCOM  16-SEP-2011 10:05:43.02  %%%%%%%%%%%
10:05:43.02 Node SLAVE (csid 00010001) received VMScluster membership 
request from node ALEPH

%%%%%%%%%%%  OPCOM  16-SEP-2011 10:05:43.02  %%%%%%%%%%%
10:05:43.02 Node SLAVE (csid 00010001) proposed addition of node ALEPH

%%%%%%%%%%%  OPCOM  16-SEP-2011 10:05:43.02  %%%%%%%%%%%
10:05:43.02 Node SLAVE (csid 00010001) completed VMScluster state transition

This just repeats forever.

Some more information from SLAVE (the ALPHA server):

SHOW CLUSTER:

View of Cluster from system ID 4345  node: SLAVE    16-SEP-2011 10:06:13
+--------------------------------------------------------+---------+
|                         SYSTEMS                        | MEMBERS |
+--------+--------------------------------+--------------+---------+
|  NODE  |             HW_TYPE            |   SOFTWARE   |  STATUS |
+--------+--------------------------------+--------------+---------+
| SLAVE  | AlphaServer 1000A 5/300        | VMS V8.3     | MEMBER  |
| ALEPH  | VAXstation 4000-VLC            | VMS V7.3     | NEW     |
+--------+--------------------------------+--------------+---------+
+------------------------------------------------------------------------------------+
|                                       CLUSTER                                      |
+--------+-----------+----------+------------+-------------------+-------------------+
| CL_EXP | CL_QUORUM | CL_VOTES | CL_MEMBERS |       FORMED      |  LAST_TRANSITION  |
+--------+-----------+----------+------------+-------------------+-------------------+
|      1 |         1 |        1 |          1 | 16-SEP-2011 09:56 | 16-SEP-2011 09:56 |
+--------+-----------+----------+------------+-------------------+-------------------+

SYSGEN>  SHOW EXPECTED_VOTES
%CNXMAN,  Completing VMScluster state transition
Parameter Name            Current    Default     Min.       Max.   Unit  Dynamic
--------------            -------    -------   -------    -------  ----  -------
EXPECTED_VOTES                  1          1         1        127 Votes
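
For reference, the CL_QUORUM of 1 shown above is what the usual OpenVMS rule, quorum = (EXPECTED_VOTES + 2) / 2 with the fraction dropped, gives for EXPECTED_VOTES = 1. The sketch below only restates that arithmetic. The function names are made up for illustration, and it ignores the other inputs the connection manager takes into account (for example a quorum disk, or the quorum value already in force), so treat it as a rough check rather than a model of CNXMAN.

```python
def quorum(expected_votes: int) -> int:
    """Quorum from EXPECTED_VOTES: (EXPECTED_VOTES + 2) / 2, truncated."""
    if expected_votes < 1:
        raise ValueError("EXPECTED_VOTES must be at least 1")
    return (expected_votes + 2) // 2


def has_quorum(member_votes: list[int], expected_votes: int) -> bool:
    """True if the votes held by the current members reach the quorum."""
    return sum(member_votes) >= quorum(expected_votes)


if __name__ == "__main__":
    # SLAVE contributes 1 vote, ALEPH (the satellite) contributes 0.
    print(quorum(1))
    print(has_quorum([1, 0], expected_votes=1))
```

With these numbers SLAVE alone already satisfies quorum, so on this arithmetic a lack of votes would not by itself explain the repeated membership requests.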


Any ideas why this is going wrong?

Thanks for the help, much appreciated,

Mark.

On 16/09/11 09:40, hvlems at zonnet.nl wrote:
> Regarding the AlphaServer: check the value of EXPECTED_VOTES in SYSGEN.
> In a cluster with non-voting satellites only, its value must be less than VOTES+1.
> Hans
> ------Original message------
> From: Mark Wickens
> Sender: owner-hecnet at Update.UU.SE
> To: hecnet at Update.UU.SE
> Reply-To: hecnet at Update.UU.SE
> Subject: [HECnet] More clustering fun
> Sent: 16 September 2011 10:14
>
> I've now refreshed the VAX satellite's system drive and installed it in the
> ALPHA server. The one problem I have remaining is that the VOTES the
> satellite is contributing to the cluster is 1. I believe for a proper
> satellite this should be 0.
>
> Is this a case of updating MODPARAMS.DAT on the satellite, then running
> AUTOGEN and rebooting? Do I need to do anything with the ALPHA server's
> configuration?
>
> Presumably I will need to reboot the ALPHA server as well.
>
> Thanks for the help,
>
> Kind regards, Mark.
>


