[HECnet] Tops-20 SETNOD Failure

Thomas DeBellis tommytimesharing at gmail.com
Tue May 4 08:15:21 PDT 2021


Has anybody ever seen SETNOD fail to insert the entire node list?  I 
just did.

Shortly after I put my 20's up on HECnet, I wrote a recurring batch 
job that fires once a week on Sundays to pull the latest node list 
(T20.FIX) from MIM::.  I use the highly venerable FILCOM program to 
compare it with the previous week's list.  I don't do anything in 
particular with the output except save it in case I feel like looking at 
it for some reason.

The batch job always inserts the entire list, rewriting whatever might 
be in the monitor's data base.  I have always been unsatisfied with 
doing things that way because it seemed to me to be inefficient as the 
node list grew.   The HECnet node list count was 716 on 9-Jun-19 and 
it's now up to 884 as of the latest version that I've pulled, 
30-Apr-21.  The other problem is the microscopic possibility that a node 
is in Tops-20's monitor database (a hash table) that isn't in the HECnet 
node list.

Nodes can get removed, although I think that is infrequent.  Nodes could 
get inserted outside of the batch job, but I think that most unlikely in 
my situation.  Nodes can get renamed, as evidenced by 2.299 below, which 
went from THEPIT to THEARK.  None of this should break anything, and none 
of it has.

However, it's been in the back of my mind to do two enhancements, one to 
Tops-20 and one to SETNOD. The NODE% JSYS should have an additional 
feature to return the current monitor data base.  The SETNOD program 
should be enhanced to take that to compute the set difference with the 
new list.  This would show additions, renames and deletions. That would 
bring the update operation down from several hundred items to fewer than 
ten, on average.  This would obviously make more of a difference on huge 
DECnets with tens of thousands of nodes. Another NODE% feature should 
probably be to whack the entire monitor database except for the local 
node, which would be useful for troubleshooting.
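
To make the set-difference idea concrete, here is a rough sketch in Python 
rather than in anything that would actually run on the 20.  It assumes both 
the old and new node lists are available as text in the "set nod A.N name 
NAME" format that T20.FIX uses.  The function names are made up for 
illustration, and the "old" list merely stands in for whatever the monitor 
would hand back if NODE% had the proposed function, which it does not have 
today.

```python
import re

# Matches lines such as "set nod 2.299 name THEARK" (the format shown in this post).
NODE_LINE = re.compile(
    r"^\s*set\s+nod\w*\s+(\d+)\.(\d+)\s+name\s+(\S+)\s*$", re.IGNORECASE
)


def parse_node_list(text):
    """Return {(area, node): NAME} for every recognizable line in text."""
    nodes = {}
    for line in text.splitlines():
        m = NODE_LINE.match(line)
        if m:
            nodes[(int(m.group(1)), int(m.group(2)))] = m.group(3).upper()
    return nodes


def diff_node_lists(old, new):
    """Compute additions, deletions and renames between two address->name maps."""
    added = {addr: new[addr] for addr in new.keys() - old.keys()}
    removed = {addr: old[addr] for addr in old.keys() - new.keys()}
    renamed = {
        addr: (old[addr], new[addr])
        for addr in old.keys() & new.keys()
        if old[addr] != new[addr]
    }
    return added, removed, renamed


if __name__ == "__main__":
    # Tiny illustration using the 2.299 rename mentioned above.
    old = parse_node_list("set nod 2.299 name THEPIT\nset nod 13.3 name RED")
    new = parse_node_list("set nod 2.299 name THEARK\nset nod 13.3 name RED")
    added, removed, renamed = diff_node_lists(old, new)
    print("added:", added)
    print("removed:", removed)
    print("renamed:", renamed)
```

Only the entries in the three result maps would need to be sent to the 
monitor, instead of the whole list.  Keying on the address rather than the 
name is a design choice of this sketch; it is what lets a rename show up as 
one change instead of a delete plus an add.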

Last Sunday, the batch job failed with the following error:

18:33:40 USER   SETNOD>*TAKE SYSTEM:NODE-DATA.TXT.0
18:33:40 USER
18:33:40 USER   [Fork SETNOD opening <SYSTEM>NODE-DATA.TXT.1 for reading]
18:33:41 USER   SETNOD>*SAVE
18:33:41 USER
18:33:41 USER   [Fork SETNOD opening <SYSTEM>NODE-DATA.BIN.74 for reading, writing]
18:33:41 USER   SETNOD>*INSERT
18:33:41 USER
18:33:41 USER   ?SETNOD: Failed at node REACH
18:33:41 USER   SETNOD>

I had a look at the SETNOD source and the HECnet node list and have 
concluded a few things.  First, there doesn't seem to be 
anything syntactically wrong with REACH::'s definition: "set nod 2.298 
name REACH". Second, there don't appear to be any semantic issues.  
2.298 wasn't in use and it shouldn't matter if it was.
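
For anyone who wants to run the same kind of sanity check over the whole 
list, here is a hedged Python sketch.  It checks each line against the 
DECnet Phase IV limits (area 1-63, node 1-1023, names of one to six 
alphanumerics with at least one letter) and flags any address or name that 
appears more than once.  Whether SETNOD or the monitor actually objects to 
duplicates is not something I have established; the function name and the 
message wording are invented for the example.

```python
import re
from collections import defaultdict

# Matches lines such as "set nod 2.298 name REACH" (the format shown in this post).
NODE_LINE = re.compile(
    r"^\s*set\s+nod\w*\s+(\d+)\.(\d+)\s+name\s+(\S+)\s*$", re.IGNORECASE
)
# Phase IV node names: 1 to 6 alphanumerics, at least one of them a letter.
NAME_OK = re.compile(r"^(?=.*[A-Z])[A-Z0-9]{1,6}$")


def check_node_list(text):
    """Return a list of human-readable problems found in a node list."""
    problems = []
    by_addr = defaultdict(list)  # (area, node) -> line numbers defining it
    by_name = defaultdict(list)  # NAME -> line numbers defining it
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        m = NODE_LINE.match(line)
        if not m:
            problems.append(f"line {lineno}: unrecognized syntax: {line.strip()!r}")
            continue
        area, node, name = int(m.group(1)), int(m.group(2)), m.group(3).upper()
        if not 1 <= area <= 63:
            problems.append(f"line {lineno}: area {area} out of range 1-63")
        if not 1 <= node <= 1023:
            problems.append(f"line {lineno}: node {node} out of range 1-1023")
        if not NAME_OK.match(name):
            problems.append(f"line {lineno}: bad node name {name!r}")
        by_addr[(area, node)].append(lineno)
        by_name[name].append(lineno)
    for (area, node), lines in by_addr.items():
        if len(lines) > 1:
            problems.append(f"address {area}.{node} defined on lines {lines}")
    for name, lines in by_name.items():
        if len(lines) > 1:
            problems.append(f"name {name} defined on lines {lines}")
    return problems


if __name__ == "__main__":
    for problem in check_node_list("set nod 2.298 name REACH"):
        print(problem)
```

An empty result only says the text of the list looks sane; it says nothing 
about what is already sitting in the monitor's hash table.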

In the case of INSERT, there are two kinds of errors from NODE%: a 
general failure of the JSYS and an incomplete insertion.  The error is 
from the second case.  Unfortunately, SETNOD isn't reporting enough 
information about the error, so I have to make some changes there.  It's 
also possible that SETNOD is building an inconsistent database for the 
monitor to swallow; at least the LIST command is giving me some odd 
results, viz:

    SETNOD>list arEA 2

    [AREA 2]
    A2RTR

    TOTAL NODES FOUND: 1

    SETNOD>

That's clearly wrong, viz:

    !i dec
      Local DECNET node: VENTI2.  Nodes reachable: 7.
      Accessible DECNET nodes are:    A2RTR    BOINGO    LEGATO    TOMMYT    VENTI2    VENTI    ZITI

The Exec output should probably be changed to say, "Nodes reachable in 
local area" and "Online nodes in area are:"

Anybody have any ideas?  Hunches?  Clues?

------------------------------------------------------------------------

File 1) OLDF:[4,120]    created: 1241 15-Apr-21
File 2) NEWF:[1,1]      created: 0102 30-Apr-21

1)1     set nod 44.9 name OSMIUM
****
2)1     set nod 2.292 name OSIRIS
2)      set nod 44.9 name OSMIUM
**************
1)1     set nod 13.3 name RED
****
2)1     set nod 2.298 name REACH
2)      set nod 13.3 name RED
**************
1)1     set nod 2.298 name RSX11M
1)      set nod 1.306 name RSX124
****
2)1     set nod 1.306 name RSX124
**************
1)1     set nod 42.5 name SPARKY
****
2)1     set nod 2.291 name SPARK
2)      set nod 42.5 name SPARKY
**************
1)1     set nod 2.299 name THEPIT
1)      set nod 35.70 name THOMAS
****
2)1     set nod 2.299 name THEARK
2)      set nod 35.70 name THOMAS
**************

