[HECnet] SETNOD, Part 2

Thomas DeBellis tommytimesharing at gmail.com
Tue Jun 22 10:15:48 PDT 2021


Now that I am thinking about it, perhaps one reason that a DECnet 
executor will refuse to change its running address is that it would 
involve changing the MAC address of the Ethernet adapter?  Tops-20 
DECnet does not appear to want to do that after boot-up is complete, 
although I don't know why, nor whether the IP stack would care.
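For context: a Phase IV DECnet node derives its Ethernet MAC address 
directly from its node address (AA-00-04-00 followed by the 16-bit 
address, low byte first), which is why changing the running address 
implies changing the adapter's MAC.  A small Python sketch of the 
mapping (the function name is mine):

```python
def decnet_mac(area: int, node: int) -> str:
    """Phase IV MAC: AA-00-04-00 plus the 16-bit node address, low byte first."""
    assert 1 <= area <= 63 and 1 <= node <= 1023
    addr = (area << 10) | node           # 6-bit area, 10-bit node number
    return "AA-00-04-00-%02X-%02X" % (addr & 0xFF, addr >> 8)

print(decnet_mac(2, 400))    # -> AA-00-04-00-90-09
```

So renumbering a node from 2.399 to 2.400 means the interface must 
start answering to a different hardware address.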

Multi-net on Tops-20 also includes CI communications.  As I recall, it 
is not programmatically possible to change your CI address because it 
depends on where you are physically plugged into the star coupler (or 
concentrator or whatever it was).  Maybe things wouldn't croak in that 
case, but you would need to use low-level communications to notify the 
other nodes of the address change.

Tops-20 definitely has code (in the SCS% JSYS) to signal other nodes in 
the cluster that a particular system's name has changed.  However, once 
DECnet is initialized, the NODE% function to do this, .NDSLN, will be 
refused by the Executor with a NODX16 ("DECnet is already initialized").

It would appear that DECnet does _not_ allow aliasing to be done 
network-wide.  From the 4.0 Network Management Functional Specification, 
one finds the following:

    DNA allows one node name  for  each  node.  The network manager
    should make sure that each node name and address in the network is 
    unique.   (An  implementation may also provide the ability to assign
    additional node names to nodes, but these names can be known to the
    local node  only).

The SETND2 NODE% .NDINT simulator is pretty strict about certain things, 
depending on the mode.  In boot-up mode, you are allowed to set any node 
to any address you want.  Once the simulated node table is populated 
(called "Running" mode), you can not change /anything/ about a node 
unless you delete it first.  The simulator will reject all such cases, 
even if you are aliasing in a remote area.
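As an illustration, the two modes can be modeled as a small state 
machine.  This is a toy Python sketch of the behavior described above, 
not the actual MACRO-20 code; all of the names (NodeTable, set_node, 
and so on) are mine:

```python
class NodeTable:
    """Toy model of the SETND2 .NDINT simulator's two modes."""

    def __init__(self):
        self.booting = True                  # boot-up mode: anything goes
        self.by_name, self.by_addr = {}, {}

    def set_node(self, name, addr):
        if not self.booting:
            # "Running" mode: reject any change to an existing node,
            # including taking an address already held by another node
            # (even one in a remote area).
            if (self.by_name.get(name, addr) != addr or
                    self.by_addr.get(addr, name) != name):
                raise ValueError("%s (%s): delete the node first" % (name, addr))
        self.by_name[name] = addr
        self.by_addr[addr] = name

    def delete(self, name):
        self.by_addr.pop(self.by_name.pop(name), None)

    def run(self):
        self.booting = False                 # table populated: freeze it
```

In boot-up mode any set_node succeeds; once run() has been called, the 
only way to change a node's name or address is delete() followed by a 
fresh set_node, which is exactly the restriction being simulated.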

This contrasts with Tops-20's current behavior of apparently only 
detecting such a clash in the local area.  That is done in the SCLINK 
module; I have yet to review the out-of-area code in ROUTER.

All of this allows me to test the SETND2 node management routines very 
extensively without having to worry about leaving the monitor's node 
data base in a very bad state, the only cure for which is a reboot.

Of course, I managed to repeatably hang and crash the system rather 
spectacularly while implementing the simulator (all this while it was 
not even enabled), but that is another story...
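The delete-before-insert sequence described further down in the thread 
(BOINGO moving from 2.399 to 2.400 while APOLLO already holds 2.400) 
can be sketched in a few lines of Python; the two dictionaries stand in 
for the monitor's node data base, and the helper name is mine:

```python
def change_address(by_name, by_addr, name, new_addr):
    # First delete the node being renumbered, so there is no name
    # clash on the insertion...
    old = by_name.pop(name, None)
    if old is not None:
        by_addr.pop(old, None)
    # ...then delete whichever node currently holds the target
    # address, so there is no address conflict...
    holder = by_addr.pop(new_addr, None)
    if holder is not None:
        by_name.pop(holder, None)
    # ...and only then is the insertion safe.
    by_name[name] = new_addr
    by_addr[new_addr] = name

by_name = {"BOINGO": "2.399", "APOLLO": "2.400"}
by_addr = {"2.399": "BOINGO", "2.400": "APOLLO"}
change_address(by_name, by_addr, "BOINGO", "2.400")
print(by_name)            # -> {'BOINGO': '2.400'}
```

Done in any other order, either the name BOINGO or the address 2.400 
would still be in the table at insertion time and the request would be 
rejected.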
> ------------------------------------------------------------------------
> On 6/1/21 7:01 AM, Johnny Billquist wrote:
>
> Changing the name of the executor might be objected to by most systems...
> Updating it with the same information it already has should (I hope) 
> be a no-op.
>
> With RSX, the node name of the executor itself is special, and doesn't 
> sit together with all the name-to-number associations for other nodes. 
> So it doesn't really hit the executor when doing such a clear/purge.  I 
> think it's similar with VMS.
>
> But my comment was mainly because the point brought up from Keith that 
> other node information, such as various parameters for DECservers, are 
> also in the database, and you don't want to delete that.
>
> But the commands are "CLEAR NODE * NAME", which means only the name is 
> deleted, not other type of information/parameters in the database. So 
> DECserver information should not be touched in the first place.
>
>   Johnny
>> ------------------------------------------------------------------------
>> On 2021-06-01 07:02, Thomas DeBellis wrote:
>>
>> True, but I don't believe you could do that on a Tops-20 cluster.  
>> At best, it might not be a good idea.
>>
>> No Tops-20 node on the CI will allow a name (or address) redefinition 
>> for itself once booted.  The command would be rejected by the 
>> Executor.  You'd have the situation where nodes would have different 
>> definitions of neighbors because each neighbor's Executor would no 
>> longer have the neighbor's definition.  Or something like that...
>>
>> At Columbia, if we had to do something like that, we brought the 
>> entire cluster down.  This was *really* frowned on.
>>> ------------------------------------------------------------------------
>>> On 5/31/21 5:17 PM, Johnny Billquist wrote:
>>>
>>> Well, the CLEAR/PURGE is only for the name, not any other 
>>> information...
>>>
>>> Johnny
>>>> ------------------------------------------------------------------------
>>>> On 2021-05-31 23:04, Steve Davidson wrote:
>>>>
>>>> Unfortunately CLEAR/PURGE is not a good idea in clusters or nodes 
>>>> that boot DECservers.  VMS requires additional information in the 
>>>> database for those nodes/servers that would be wiped out. That is 
>>>> why I wrote NETUPDATEV2.COM. It does the update without touching 
>>>> any nodes in the local area.
>>>>
>>>> -Steve Davidson
>>>>
>>>> SF:iP1
>>>>> ------------------------------------------------------------------------
>>>>> On May 31, 2021, at 16:49, Keith Halewood 
>>>>> <Keith.Halewood at pitbulluk.org> wrote:
>>>>>
>>>>> With VMS, it's also permissible to copy 
>>>>> sys$system:netnode_remote.dat to other nodes, mainly because no 
>>>>> executor information is contained within this file.
>>>>>
>>>>> Dune::netupdatev3.com is a modified form of the update script 
>>>>> which does a purge/clear by individual area, except for one's own 
>>>>> area which is handled on a node by node basis, including a 
>>>>> configurable range within that area which is ignored. For example, 
>>>>> it prevents MIM:: from overriding 29.100-199 by default.
>>>>>
>>>>> Keith
>>>>>> ------------------------------------------------------------------------
>>>>>> From: owner-hecnet at Update.UU.SE 
>>>>>> [mailto:owner-hecnet at Update.UU.SE] On Behalf Of Johnny Billquist
>>>>>> Sent: 31 May 2021 20:48
>>>>>> To: hecnet at Update.UU.SE
>>>>>> Subject: Re: [HECnet] SETNOD, Part 2
>>>>>>
>>>>>> If there is some additional commands you'd like for me to put 
>>>>>> into FIX.T20, let me know.
>>>>>>
>>>>>> The VMS command file I create starts like this:
>>>>>>
>>>>>>     $ MCR NCP
>>>>>>     PURGE NODE * NAME
>>>>>>     CLEAR NODE * NAME
>>>>>>     def nod 41.28 name 28NH
>>>>>>     .
>>>>>>     .
>>>>>>     .
>>>>>>
>>>>>> Which means that any previous definitions are first cleared out 
>>>>>> before any definitions go in.
>>>>>> This is because VMS (and RSX) do not handle the case where a 
>>>>>> node name gets a different address.  Clearing things out first 
>>>>>> solves that.
>>>>>>
>>>>>> The alternate thing that can be done in VMS is that you can copy 
>>>>>> nodenames from within NCP from another node, which seems to avoid 
>>>>>> the problem as well (I think).
>>>>>>
>>>>>> In RSX, the permanent node name database can be created by 
>>>>>> a separate tool that does not relate at all to the current 
>>>>>> node name database, so in RSX it's rather easy.  You download a new 
>>>>>> database, and then you switch over to the new db.
>>>>>>
>>>>>> With other systems I don't know at all.
>>>>>>
>>>>>>    Johnny
>>>>>>
>>>>>> -- 
>>>>>> Johnny Billquist                  || "I'm on a bus
>>>>>>                                    ||  on a psychedelic trip
>>>>>> email: bqt at softjar.se             ||  Reading murder books
>>>>>> pdp is alive!                     ||  tryin' to stay hip" - B. Idol
>>>>>> ------------------------------------------------------------------------
>>>>>>> On 2021-05-31 20:50, Thomas DeBellis wrote:
>>>>>>>
>>>>>>> I was wondering if anybody would care to explain how routine 
>>>>>>> node maintenance happens for DECnet on non-Tops-20 systems.  
>>>>>>> Specifically, Johnny's node list on MIM:: changes more or less 
>>>>>>> about once a month, sometimes more, sometimes less.
>>>>>>>
>>>>>>> Is anybody keeping up on this?  How?  I had a (bi-weekly) 
>>>>>>> recurring batch job which NFT'ed the latest node file from 
>>>>>>> MIM:: and simply used SETNOD to shove the whole thing into the 
>>>>>>> running monitor, on the assumption that the monitor would figure 
>>>>>>> out what to do.  While slapping in the whole list (with .NDINT) 
>>>>>>> during timesharing did strike me as somewhat wasteful, I didn't 
>>>>>>> pay much attention to the matter as it did work.
>>>>>>>
>>>>>>> This is mistaken.  Tops-20 will not 'make it' work, nor does it 
>>>>>>> apparently detect certain situations which appear to be 
>>>>>>> problematic.  It does, however, detect and reject two cases.
>>>>>>>
>>>>>>>  1. You may not change either the name or address of the host
>>>>>>>     (I.E., the Executor).  These can only be set once at boot
>>>>>>>     up.  Do other operating systems have this restriction?
>>>>>>>  2. You may not change the address of an existing node in the
>>>>>>>     local area.
>>>>>>>
>>>>>>> A node insertion in the local area which usurps the address of 
>>>>>>> another node deletes that node.  Outside of the local area, you 
>>>>>>> are on your own.  It does whatever you want, which means that 
>>>>>>> you can have multiple nodes with the same address.  Is that a 
>>>>>>> problem?  On IPv4, this would be known as 'aliasing', but I 
>>>>>>> don't think DECnet allows this.
>>>>>>>
>>>>>>> So it would appear that the appropriate behavior is that a new 
>>>>>>> node list implies a system reboot.  Unless I'm actively doing 
>>>>>>> monitor development, I can't stand doing this.
>>>>>>>
>>>>>>> However, fixing the problem turned out to be pernicious.  
>>>>>>> Neither of the two cases above is reported to the user program; 
>>>>>>> there is no way to determine what might have gone wrong.  There 
>>>>>>> is no way for the user program to proactively prevent errors 
>>>>>>> because, while you can ask Tops-20 to translate a DECnet address 
>>>>>>> to a node name and to verify that a DECnet node name exists, 
>>>>>>> there is no way to return the address for a verified DECnet node 
>>>>>>> name.  Is this an oversight?  Can a user program get the address 
>>>>>>> of a DECnet node name on other operating systems?
>>>>>>>
>>>>>>> I remediated the low level error reporting issue and implemented 
>>>>>>> a new function for NODE% to return the address of an existing 
>>>>>>> DECnet node (.NDVFX or Verify Node Extended).  Fixing SETNOD 
>>>>>>> proved impossible.  I discovered that the actions to be 
>>>>>>> performed were complex enough when automated that the dimensions 
>>>>>>> of the solution were wholly beyond its capabilities.  Not that 
>>>>>>> there was anything wrong with SETNOD, it just wasn't designed 
>>>>>>> for this kind of heavy lift.  So I rewrote it from scratch 
>>>>>>> (cleverly naming it SETND2). I'm converging on completion, but I 
>>>>>>> don't work on it actively, so this will probably be a few more 
>>>>>>> weeks.
>>>>>>>
>>>>>>> Here is some sample output; let's suppose that BOINGO needs its 
>>>>>>> address changed from 2.399 to 2.400 and that this conflicts with 
>>>>>>> another node (in this case, APOLLO).  To get this to work right, 
>>>>>>> you need to tell Tops-20 to delete BOINGO first, so that there 
>>>>>>> is no name clash on the insertion.  Then you have to delete 
>>>>>>> APOLLO, so that there is no address conflict.  Once you have 
>>>>>>> performed both these actions, it's safe to do the insertion and 
>>>>>>> Tops-20 doesn't reject it or otherwise get itself confused.
>>>>>>>
>>>>>>> @*setnd2*
>>>>>>> % Insufficient capabilities for INSERT command
>>>>>>>
>>>>>>> SETNODE>vERBOSITY (level is) vERBOSE
>>>>>>> Verbosity level is VERBOSE
>>>>>>> SETNODE>get /sECTION-MAP /nO-ACCESS
>>>>>>> [BIN file: TOMMYT:<SYSTEM>NODE-DATA.BIN.91;RESTRICTED-JFN:13 ]
>>>>>>> Mapped one section (4 pages), 1778 Words, 889 Nodes.
>>>>>>> SETNODE>*recONSTRUCT /sILENT
>>>>>>> [Closed log file: NUL:]
>>>>>>> SETNODE>shoW aREA 2 uNCHANGED
>>>>>>> [Area 2]
>>>>>>> A2RTR   ADAGIO  ADVENT  ADVNT5  AMAPUR APOLLO  AUG11 AUGVAX  BASSET
>>>>>>> BEAGLE  BELLS   BOINGO  BOXER   BULDOG CHARON CODA COLLIE  CONDOR
>>>>>>> CORGI   COYOTE  CYPHER  DALMTN DIVISI DOGPAK ELIDYR ELITE   FOX
>>>>>>> GLDRTR  GLOVER  GRUNT HERMES  HUIA HUNTER  HUSKY JACKAL  JENSEN
>>>>>>> KELPIE LABRDR  LAPDOG LARGO   LEGATO LENTO   MASTIF MENTOR  MEZZO
>>>>>>> MULTIA  MUTT    NO0K    ODST    OINGO OSIRIS  PAVANE POCO    POODLE
>>>>>>> PUG     PUGGLE  PUPPY   R2X899  REACH SPARK TERIER THEARK  TOMMYT
>>>>>>> VENTI   WLFHND  WOLF    ZITI
>>>>>>> Total nodes in area 2: 67
>>>>>>> SETNODE>shoW uNCHANGED boiNGO
>>>>>>> BOINGO:: (2.399)
>>>>>>> SETNODE>shoW uNCHANGED boiNGO
>>>>>>> BOINGO:: (2.399)
>>>>>>> SETNODE>set 2.400 boingo
>>>>>>> Set existing node BOINGO:: (2.400)
>>>>>>> Node BOINGO:: (2.400)
>>>>>>> % Removing node BOINGO:: (2.399) from same list to insert in the 
>>>>>>> delete list
>>>>>>> % Re-using key text for insertion in delete list, BOINGO (2.399)
>>>>>>> % Removing BOINGO::'s previous address (2.399)
>>>>>>> % Removing node APOLLO:: (2.400) from same list to insert in the 
>>>>>>> delete list
>>>>>>> % Re-using key text for insertion in delete list, APOLLO (2.400)
>>>>>>> % Deleting APOLLO:: (2.400) to reassign its address to BOINGO::
>>>>>>> % Allowing update request for node BOINGO:: (2.400) because 
>>>>>>> being deleted as (2.399)
>>>>>>> % Removing node BOINGO:: (2.399) from unchanged list because its 
>>>>>>> address has changed to (2.400)
>>>>>>> % Re-using key text for insertion in update list, BOINGO (2.400) 
>>>>>>> Node change request for BOINGO:: (2.400)
>>>>>>> SETNODE> 