[HECnet] Tops-20 SETNOD Failure
Thomas DeBellis
tommytimesharing at gmail.com
Tue May 4 19:31:36 PDT 2021
Personally, I don't see how it could /possibly/ be anything to do with
the REACH:: node definition, but I have been known to occasionally
overlook the utterly obvious, particularly when it's near night-night.
Maybe not this time.
Right now, the way to figure it out is to get the minor error data and
see where that takes things. So I'm making a change to JNTMAN to have
.NDINT return the lower level code on an incomplete insert. SCLINK
appears to have a problem in that it is mangling return values, which I'm
currently investigating.
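A minimal sketch of the intended behaviour, in Python rather than monitor code: walk the table, stop at the first entry the lower level routine rejects, and hand back both the item index and the untouched low-level code (negative values included). The names `insert_nodes`, `InsertResult` and `demo_insert`, the error value -5, and the zero-means-success convention are all assumptions for illustration; they are not the JNTMAN or SCLINK interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

Node = Tuple[int, int, str]  # (area, node number, name)


@dataclass
class InsertResult:
    inserted: int                       # entries accepted before stopping
    failed_index: Optional[int] = None  # index of the rejected entry, if any
    minor_error: Optional[int] = None   # low-level code, passed up unchanged


def insert_nodes(table: Sequence[Node],
                 insert_one: Callable[[Node], int]) -> InsertResult:
    """Insert entries in order; on the first failure report where and why."""
    for index, node in enumerate(table):
        code = insert_one(node)  # assumed convention: 0 means success
        if code != 0:
            # Do not collapse the code to a generic failure; keep it as-is.
            return InsertResult(inserted=index,
                                failed_index=index,
                                minor_error=code)
    return InsertResult(inserted=len(table))


def demo_insert(node: Node) -> int:
    """Hypothetical low-level routine that rejects one name with code -5."""
    return -5 if node[2] == "REACH" else 0


result = insert_nodes([(2, 299, "THEARK"), (2, 298, "REACH")], demo_insert)
print(result)
```

With the minor code and index in hand, the caller can say which item failed and why, instead of only reporting an incomplete insert.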
You can't just blithely assume somebody got it wrong and 'fix' things;
sometimes it's a certain way for a reason.
On 5/4/21 8:46 PM, Johnny Billquist wrote:
> On 2021-05-05 00:54, Mike Kostersitz wrote:
>> Ouch, that is one of my nodes 😊 @Johnny Billquist
>> <mailto:bqt at softjar.se> anything you can think of, since we just
>> renamed my old RSX11M node to REACH?
>
> Well. It is something slightly broken in Tops-20, so there isn't
> really anything we can do about it.
>
> Except hope that Thomas can figure it out and fix it.
>
> Johnny
>
>>
>> Mike
>>
>> *From: *Thomas DeBellis <mailto:tommytimesharing at gmail.com>
>> *Sent: *Tuesday, May 4, 2021 15:16
>> *To: *HECnet <mailto:hecnet at update.uu.se>
>> *Subject: *[HECnet] Re: Tops-20 SETNOD Failure
>>
>> I fixed a few things in SETNOD to get some more information about the
>> error. In particular,
>>
>> * Allow listing of AREA 1 (this was specifically disallowed, I don't
>> know why)
>> * More consistent error reporting (via ESOUT%)
>> * List more than one node when doing an area list (it would only list
>> a single node)
>> * List nodes with more than three digits in the node number when doing
>> columnar output
>>
>> So now you get the expected results:
>>
>> SETNOD>lis a 1
>> [Area 1]
>> A1RTR  1023   ATHENA  620   ATLE    605   AURORA  606   BANAI   770
>> BANX25  771   BEA      19   BIZET   800   BJARNE    7   BLINKY  266
>> CATWZL  302   CLYDE   269   COOPER  263   CRISPS  201   CYGNUS  259
>> DAVROS  254   DBIT    351   DE1RSX  450   DE1RSY  452   DOCTOR  252
>> ELIN    616   ELMER   617   ERNIE     2   ERSATZ  350   FLETCH  100
>> FNATTE    3   FREJ    608   GAXP    730   GNAT     16   GNOME     6
>> GOBLIN    4   GVAX    731   HAGMAN  262   HARPER  261   HORSE   150
>> HUGIN   602   HYUNA   500   INKY    268   JIMIN   501   JOCKE    21
>> JOSSE    17   KLIO    451   KRILLE    8   LOKE    607   MACARO  303
>> MACRA   258   MAGICA    1   MASTER  251   MIM      13   MUNIN   603
>> NIPPER  202   NOMAD   610   NOXBIT  720   ORACLE  301   PACMAN  265
>> PAI     541   PALLAS  621   PAMINA   18   PIDP11  560   PINKY   267
>> PISTON  520   PLINTH  200   PMAVS2  510   PONDUS   15   PONY     12
>> PUFF     22   QEMUNT  151   REI     540   ROCKY    11   ROJIN   542
>> RSX124  306   RSX145  304   RSX170  305   RSX184  307   RUTAN   255
>> SHARPE  260   SIDRAT  253   SIGGE    10   SPEEDY   24   TARDIS  250
>> TEMPO     9   THOROS  257   TINA     14   TIPSY   604   TONGUE  264
>> TOPSY   601   VALAR   400   VAROS   256   WXP      20   WXP2     23
>> YMER    609   ZEKE      5
>> Total nodes in area 1: 92
>> SETNOD>exit
>>
>> Regarding the error, I have reproduced it with a single entry, viz:
>>
>> !setnod
>> SETNOD>_set nod 2.298 name REACH_
>> SETNOD>_insert_
>> ?SETNOD: Failed at node REACH (2.298), Item 0 of 1
>> SETNOD>
>>
>> The high level code to do the entry is in JNTMAN. It loops through
>> the table passed to it via .NDINT, calling a lower level routine
>> called SCTAND in SCLINK. An error here is passed up to JNTMAN, but
>> it is not passed back to the user. There are some other problems in
>> SCLINK pertaining to negative return values, so some minor work is
>> necessary there, also.
>>
>> I'll make some changes to these two modules, generate a new monitor
>> for VENTI2 and see what happens in a few days.
>>
>> Right now, if any Tops-20 user is using SETNOD to update DECnet
>> tables, this appears to fail. If anybody else is seeing it or can
>> reproduce it, I'd like to hear about it.
>>
>> On 5/4/21 11:15 AM, Thomas DeBellis wrote:
>>
>> Has anybody ever seen SETNOD fail to insert the entire node list? I
>> just did.
>>
>> Shortly after I put my 20's up on HECnet, I wrote a recurring
>> batch job that fires once a week on Sundays to pull the latest node
>> list (T20.FIX) from MIM::. I use the highly venerable FILCOM
>> program to do a difference of it with the previous week's list. I
>> don't do anything in particular with the output except save it in
>> case I feel like looking at it for some reason.
>>
>> The batch job always inserts the entire list, rewriting whatever
>> might be in the monitor's data base. I have always been unsatisfied
>> with doing things that way because it seemed to me to be inefficient
>> as the node list grew. The HECnet node list count was 716 on
>> 9-Jun-19 and it's now up to 884 as of the latest version that I've
>> pulled, 30-Apr-21. The other problem is the microscopic possibility
>> that a node is in Tops-20's monitor database (a hash table) that
>> isn't in the HECnet node list.
>>
>> Nodes can get removed, although I think that infrequent. Nodes
>> could get inserted outside of the batch job, but I think that most
>> unlikely in my situation. Nodes can get renamed, as evidenced by
>> 2.299 below, which went from THEPIT to THEARK. None of this should
>> break, or has broken, anything.
>>
>> However, it's been in the back of my mind to do two enhancements,
>> one to Tops-20 and one to SETNOD. The NODE% JSYS should have an
>> additional feature to return the current monitor data base. The
>> SETNOD program should be enhanced to take that to compute the set
>> difference with the new list. This would show additions, renames
>> and deletions. That would bring the update operation down from several
>> hundred items to fewer than ten, on average. This would obviously
>> make more of a difference on huge DECnets in the tens of thousands
>> of nodes. Another NODE% feature should probably be to whack the
>> entire monitor database except for the local node, which would be
>> useful for troubleshooting.
>>
>> Last Sunday, the batch job failed with the following error:
>>
>> 18:33:40 USER SETNOD>*TAKE SYSTEM:NODE-DATA.TXT.0
>> 18:33:40 USER
>> 18:33:40 USER [Fork SETNOD opening <SYSTEM>NODE-DATA.TXT.1 for
>> reading]
>> 18:33:41 USER SETNOD>*SAVE
>> 18:33:41 USER
>> 18:33:41 USER [Fork SETNOD opening <SYSTEM>NODE-DATA.BIN.74 for
>> reading, writing]
>> 18:33:41 USER SETNOD>*INSERT
>> 18:33:41 USER
>> 18:33:41 USER *?SETNOD: Failed at node REACH*
>> 18:33:41 USER SETNOD>
>>
>> I had a look at the SETNOD source and the HECnet node list and have
>> discovered and concluded a few things. First, there doesn't seem to
>> be anything syntactically wrong with REACH::'s definition: "set nod
>> 2.298 name REACH". Second, there don't appear to be any semantic
>> issues. 2.298 wasn't in use and it shouldn't matter if it was.
>>
>> In the case of INSERT, there are two kinds of errors from NODE%, a
>> general failure of the JSYS and an incomplete insertion. The error
>> is from the second case. Unfortunately, SETNOD isn't reporting
>> enough information about the error, so I have to make some changes
>> there. It's also possible that SETNOD is building an inconsistent
>> database for the monitor to swallow; at least the LIST command is
>> giving me some odd results, viz:
>>
>> SETNOD>list arEA 2
>>
>> [AREA 2]
>> A2RTR
>>
>> TOTAL NODES FOUND: 1
>>
>> SETNOD>
>>
>> That's clearly wrong, viz:
>>
>> !i dec
>> Local DECNET node: VENTI2. Nodes reachable: 7.
>> Accessible DECNET nodes are: A2RTR BOINGO LEGATO
>> TOMMYT VENTI2 VENTI ZITI
>>
>> The Exec output should probably be changed to say, "Nodes reachable
>> in local area" and "Online nodes in area are:"
>>
>> Anybody have any ideas? Hunches? Clues?
>>
>> File 1) OLDF:[4,120] created: 1241 15-Apr-21
>> File 2) NEWF:[1,1] created: 0102 30-Apr-21
>>
>> 1)1 set nod 44.9 name OSMIUM
>> ****
>> 2)1 set nod 2.292 name OSIRIS
>> 2) set nod 44.9 name OSMIUM
>> **************
>> 1)1 set nod 13.3 name RED
>> ****
>> 2)1 *set nod 2.298 name REACH *
>> 2) set nod 13.3 name RED
>> **************
>> 1)1 set nod 2.298 name RSX11M
>> 1) set nod 1.306 name RSX124
>> ****
>> 2)1 set nod 1.306 name RSX124
>> **************
>> 1)1 set nod 42.5 name SPARKY
>> ****
>> 2)1 set nod 2.291 name SPARK
>> 2) set nod 42.5 name SPARKY
>> **************
>> 1)1 set nod 2.299 name THEPIT
>> 1) set nod 35.70 name THOMAS
>> ****
>> 2)1 set nod 2.299 name THEARK
>> 2) set nod 35.70 name THOMAS
>> **************
>>
>
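As a rough illustration of the set-difference update proposed in the quoted message, here is a Python sketch (not SETNOD or NODE% code). It assumes the monitor's current table could somehow be obtained as a list of "set nod AREA.NUMBER name NAME" lines, which NODE% does not provide today, and it assumes entries are keyed by address, so a changed name at the same address counts as a rename. The function names `parse_node_list` and `diff_node_tables` are hypothetical; the sample lines are taken from the FILCOM output quoted above.

```python
from typing import Dict, Iterable, List, Tuple

Address = Tuple[int, int]  # (area, node number)


def parse_node_list(lines: Iterable[str]) -> Dict[Address, str]:
    """Parse lines of the form 'set nod AREA.NUMBER name NAME'."""
    table: Dict[Address, str] = {}
    for line in lines:
        words = line.split()
        if (len(words) != 5 or words[0].lower() != "set"
                or words[3].lower() != "name"):
            continue  # skip anything that is not a node definition
        area, number = words[2].split(".")
        table[(int(area), int(number))] = words[4].upper()
    return table


def diff_node_tables(current: Dict[Address, str], new: Dict[Address, str]):
    """Return (additions, renames, deletions), each sorted by address."""
    additions: List[Tuple[Address, str]] = sorted(
        (addr, name) for addr, name in new.items() if addr not in current)
    deletions: List[Tuple[Address, str]] = sorted(
        (addr, name) for addr, name in current.items() if addr not in new)
    renames: List[Tuple[Address, str, str]] = sorted(
        (addr, current[addr], name) for addr, name in new.items()
        if addr in current and current[addr] != name)
    return additions, renames, deletions


old_lines = [
    "set nod 44.9 name OSMIUM", "set nod 13.3 name RED",
    "set nod 2.298 name RSX11M", "set nod 1.306 name RSX124",
    "set nod 42.5 name SPARKY", "set nod 2.299 name THEPIT",
    "set nod 35.70 name THOMAS",
]
new_lines = [
    "set nod 2.292 name OSIRIS", "set nod 44.9 name OSMIUM",
    "set nod 2.298 name REACH", "set nod 13.3 name RED",
    "set nod 1.306 name RSX124", "set nod 2.291 name SPARK",
    "set nod 42.5 name SPARKY", "set nod 2.299 name THEARK",
    "set nod 35.70 name THOMAS",
]

additions, renames, deletions = diff_node_tables(
    parse_node_list(old_lines), parse_node_list(new_lines))
print("add:", additions)
print("rename:", renames)
print("delete:", deletions)
```

On these sample lines the update shrinks to two additions (2.291, 2.292) and two renames (2.298, 2.299), with no deletions, rather than a full reinsert of every entry.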