[HECnet] Thousands of DECnet errors on Tops-20

Thomas DeBellis tommytimesharing at gmail.com
Fri Jan 15 21:30:29 PST 2021


Yes, it is peculiar, isn't it?  It's for a header.  I chased it down when 
I saw it:

        MP %RTEHS,<2+7-6+21+4> ;Ethernet header size, composed of:
                                 ;+2 Ethernet padding bytes
                                 ;+7 Router Phase-IV pad bytes
                                 ;-6 corrects for assumed Phase III header
                                 ;+21 allows for full P-IV NI header
                                 ;+4 allows for 4 KLNIA CRC bytes (input)
                                 ;   & for byte misalignment after BLT (output)

The 1504 is to handle a rounding condition when converting from bytes to 
words.

It comes out to a maximum data buffer size of 1476 bytes, which is what I am 
re-configuring for when I reboot later this weekend.
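
Spelled out, just to double-check the arithmetic (a throwaway Python 
sketch; the variable names are mine, the numbers come straight from the 
comments above):

    # %RTEHS: Ethernet header allowance, per the D36COM comments above
    RTEHS = 2 + 7 - 6 + 21 + 4     # 2 pad + 7 router pad - 6 Phase III
                                   # + 21 full P-IV NI header + 4 CRC/BLT
    NI_MAX = 1504 - RTEHS          # the <^D1504-%RTEHS> expression in KNMMCS
    print(RTEHS, NI_MAX)           # -> 28 1476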

> ------------------------------------------------------------------------
> On 1/15/21 11:49 PM, Johnny Billquist wrote:
>
> Hum. 1504-%RTEHS seems like a weird expression. And then it would mean 
> RTEHS is 28? Any idea what RTEHS represents?
>
> Anyway, the actual maximum payload on ethernet is 1500 bytes. Then you 
> have src and dst mac (6 bytes each), protocol (2 bytes) and crc (4 
> bytes). So if we're talking full size plus overhead, it should say 
> 1518. I could guess that the 1504 would be payload plus crc, but then 
> why are 28 additional bytes taken away? That's twice what the overhead 
> for the rest is... And that it's exactly twice is also intriguing...
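
The arithmetic above checks out, for what it's worth; a quick sketch 
(Ethernet II framing assumed), though the actual composition of the 28 is 
the %RTEHS breakdown at the top:

    # Checking the frame-size numbers above (Ethernet II framing assumed)
    payload = 1500
    mac_proto = 6 + 6 + 2              # dst MAC + src MAC + protocol
    crc = 4
    print(payload + mac_proto + crc)   # -> 1518, full frame plus overhead
    print(1504 - payload)              # -> 4, consistent with payload + crc
    print(28 // mac_proto)             # -> 2, the "exactly twice" coincidence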
>
> And yeah, I would definitely assume that DTE, CI and NI are not 
> relevant for a 2020. But I do know of one person who at least 
> partially got a DELUA to work on a 2020, so maybe it could have been 
> possible to get ethernet on it.
>
>   Johnny
>> ------------------------------------------------------------------------
>> On 2021-01-15 21:59, Thomas DeBellis wrote:
>>
>> Was this maybe that magical version 5 of Tops-20 that MRC put 
>> together for the 2020?  I sure would love to see the sources for 
>> that!  I'm not sure if this is relevant, but the following macro in 
>> D36COM is of interest:
>>
>>     DEFINE KNMMCS,<
>>     ;              Symbol,Name,Cost, Maximum receive block size
>>              KNMMAC LD.TST,TST,  1,  0 ;TST DEVICE
>>              KNMMAC LD.DTE,DTE,  3, <^D576>                 ;DTE DEVICE
>>              KNMMAC LD.KDP,KDP,  4, <^D576>                 ;KDP DEVICE
>>              KNMMAC LD.DDP,DDP,  5, <^D576>                 ;DDP DEVICE
>>              KNMMAC LD.CIP,CI,   2, <^D576>                 ;CI DEVICE
>>              KNMMAC LD.NI ,NI,   1, <^D1504-%RTEHS>         ;NI DEVICE
>>              KNMMAC LD.DMR,DMR,  2, <^D576>                 ;DMR DEVICE
>>      >;END OF KNMMCS
>>
>> What can be seen is that the maximum block size is 576 in _all_ cases 
>> except the NI, which is 1476 bytes.  I don't know if any of these 
>> devices are relevant to the 2020; one assumes that the DTE, CI and NI 
>> are not.
>>> ------------------------------------------------------------------------
>>> On 1/12/21 3:29 PM, Peter Lothberg wrote:
>>>
>>> The DECnet segment size has to be the same "network wide".
>>>
>>> If I remember right, DECnet looks at the two end nodes and uses the
>>> smallest segment size, so if there is any transit node in the path with
>>> a small segment size, things will not work, as it will drop packets
>>> bigger than its size.
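
That matches my understanding.  As a toy illustration of what Peter is 
describing (hypothetical node sizes in a Python sketch, not real DECnet 
code):

    # The end nodes agree on the smaller segment size, but a transit node
    # with an even smaller buffer still drops anything bigger than its size.
    def negotiated(end_a, end_b):
        return min(end_a, end_b)

    def survives_path(segment, transit_sizes):
        return all(segment <= size for size in transit_sizes)

    seg = negotiated(1476, 1476)           # two nodes configured for 1476
    print(seg, survives_path(seg, [576]))  # -> 1476 False; a 576-byte hop drops it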
>>>
>>> The only SW/HW combination I knew of that uses anything other than 576
>>> is MRC/Stu DECnet for Tops20 4.x on DEC2020.
>>>
>>> -P
>>>
>>> ------------------------------------------------------------------------ 
>>>
>>>
>>>     From: "tommytimesharing" <tommytimesharing at gmail.com>
>>>     To: "hecnet" <hecnet at Update.UU.SE>
>>>     Sent: Monday, January 11, 2021 11:58:56 PM
>>>     Subject: Re: [HECnet] Thousands of DECnet errors on Tops-20
>>>
>>>     Yes, I had seen this and had wondered about it after I had
>>>     reflected on the output of a SHOW EXECUTOR CHARACTERISTICS
>>>     command (clipped):
>>>
>>>         Executor Node = 2.520 (TOMMYT)
>>>
>>>           Identification = Tommy Timesharing
>>>           Management Version = 4.0.0
>>>           CPU = DECSYSTEM1020
>>>           Software Identification = Tops-20 7.1 PANDA
>>>
>>>                 .
>>>                 .
>>>                 .
>>>
>>>         Buffer Size = *576*
>>>           Segment Buffer Size = *576*
>>>
>>>     So it would appear that the 20's implementation of NICE knows of
>>>     this differentiation.  I can parse for both SET EXECUTOR SEGMENT
>>>     BUFFER SIZE and SET EXECUTOR BUFFER SIZE. Both fail, of course;
>>>     again, once DECnet is initialized, they are locked.
>>>
>>>     However, when one looks at the DECnet initialization block
>>>     (IBBLK), it only contains a field for buffer size (IBBSZ), nothing
>>>     about segment size.  Further, the NODE% JSYS' set DECnet
>>>     initialization parameters function (.NDPRM) only contains a
>>>     sub-function for buffer size (.NDBSZ) and SETSPD will only parse
>>>     for DECNET BUFFER-SIZE.  I hope to test that this weekend after
>>>     I've looked further through the error log.
>>>
>>>     The receive code in the low level NI driver (PHYKNI) only checks
>>>     to see whether what was received will fit into the buffer
>>>     specified.  It returns a length error (UNLER%) to DNADLL, but not
>>>     the actual difference.
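
Schematically, the check amounts to something like this (a Python sketch 
of the behavior just described, not the actual PHYKNI code):

    # The driver only knows the frame didn't fit, not by how much.
    def check_receive(frame_len, buffer_len):
        if frame_len > buffer_len:
            return "UNLER%"   # length error handed up to DNADLL; difference lost
        return "OK"

    print(check_receive(591, 576))   # -> UNLER%, a 591-byte frame vs. a 576 buffer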
>>>
>>>     I have yet to puzzle out how the segment size is derived, but it
>>>     is apparently set on a line basis.
>>>
>>> ------------------------------------------------------------------------ 
>>>
>>>         On 1/11/21 8:24 PM, Johnny Billquist wrote:
>>>
>>>         Thomas, I wonder if you might be seeing the effects of the
>>>         ethernet packet size being different from the DECnet segment
>>>         buffer size.
>>>         This is a little hard to explain, as I don't have all the
>>>         proper DECnet naming correct.
>>>
>>>         But, based on RSX, there are two relevant sizes. One is the
>>>         actual buffer size the line is using. The other is the DECnet
>>>         segment buffer size.
>>>
>>>         The DECnet segment buffer size is the maximum size of packets
>>>         you can expect DECnet itself to ever use.
>>>         However, at least with RSX, when it comes to the exchange of
>>>         information at the line level, which includes things like
>>>         hello messages, RSX is actually using a system buffer size
>>>         setting, which might be very different from the DECnet segment
>>>         buffer size.
>>>
>>>         I found out that VMS has a problem here in that if the hello
>>>         packets coming in are much larger than the DECnet segment
>>>         buffer size, you never even get adjacency up, while RSX can
>>>         deal with this just fine.
>>>
>>>         It sounds like you might be seeing something similar in
>>>         Tops-20. In which case you would need to tell the other end to
>>>         reduce the size of these hello and routing information packets
>>>         for Tops-20 to be happy, or else find a way to accept larger
>>>         packets.
>>>
>>>         After all, ethernet packets can carry up to 1500 bytes of payload.
>>>
>>>         And to explain it a bit more from an RSX point of view. RSX
>>>         will use the system buffer size when creating these hello
>>>         messages. So, if that is set to 1500, you will get hello
>>>         packets up to 1500 bytes in size, which contain routing
>>>         vectors and so on.
>>>
>>>         But actual DECnet communication will be limited to what the
>>>         DECnet segment buffer size says, so once you have adjacency up,
>>>         when a connection is established between two programs, those
>>>         packets will never be larger than the DECnet segment buffer
>>>         size, which is commonly 576 bytes.
>>>
>>>           Johnny
>>>
>>> ------------------------------------------------------------------------ 
>>>
>>>             On 2021-01-11 23:43, Thomas DeBellis wrote:
>>>
>>>             Paul,
>>>
>>>             Lots of good information.  For right now, I did an
>>>             experiment and went into MDDT and stubbed out the XWD
>>>             UNLER%,^D5 entry in the NIEVTB: table in the running
>>>             monitor on VENTI2.  Since then (about an hour or so ago),
>>>             TOMMYT's ERROR.SYS file has been increasing as usual (a
>>>             couple of pages an hour) while VENTI2's hasn't changed at
>>>             all.  So that particular fire hose is plugged for the time
>>>             being.
>>>
>>>             I don't believe I have seen this particular error before;
>>>             however, there are probably some great reasons for that.
>>>             In the 1980's, CCnet may not have had Level-2 routers on
>>>             it while Columbia's 20's were online.  We did have a
>>>             problem with the 20's complaining about long Ethernet
>>>             frames from an early version of BSD 4.2 that was being run on
>>>             some VAX 11/750's in the Computer Science department's
>>>             research lab.  They got taught how to not do that and all
>>>             was well.
>>>
>>>             Tops-20's multinet implementation was first done at BBN
>>>             and then later imported.  I am not sure that it will allow
>>>             me to change the frame size.  576 was what was used for
>>>             the Internet, so I don't know where that might be
>>>             hardwired.  I'll check.
>>>
>>>             I think there are two forensics to perform here:
>>>
>>>              1. Investigate when the errors started happening; whether
>>>                 they predate Bob adopting PyDECnet.
>>>              2. Investigate what the size difference is; I don't believe
>>>                 that is going into the error log, but I'll have to look
>>>                 more carefully with SPEAR.
>>>
>>>             A *warning* for anyone also looking to track this down: if
>>>             you do the retrieve in SPEAR on KLH10 and you don't have
>>>             my timeout changes for DTESRV, you will probably
>>>             crash your 20.  This will happen both with a standard DEC
>>>             monitor and PANDA.
>>>
>>> ------------------------------------------------------------------------ 
>>>
>>>
>>>                 On 1/11/21 4:41 PM, Paul Koning wrote:
>>>
>>>                     On Jan 11, 2021, at 4:22 PM, Thomas
>>>                     DeBellis <tommytimesharing at gmail.com> wrote:
>>>
>>>                     OK, I guess that's probably a level 2 router
>>>                     broadcast coming over the bridge.  There is no way
>>>                     Tops-10 or Tops-20 could currently be generating
>>>                     that because there is no code to do so; they're
>>>                     level 1 only.
>>>
>>>                 Yes, unfortunately originally both multicasts used the
>>>                 same address.  That was changed in Phase IV Plus, but
>>>                 that still sends to the old address for backwards
>>>                 compatibility and it isn't universally implemented.
>>>
>>>                     I started looking at the error; it starts out in
>>>                     DNADLL when it is detected on a frame that has
>>>                     come back from NISRV (the Ethernet Interface
>>>                     driver).  The error is then handed off to NTMAN
>>>                     where the actual logging is done.  So, there are
>>>                     two quick hacks to stop all the errors:
>>>
>>>                         • I could stub out the length error entry (XWD
>>>                     UNLER%,^D5) in the NIEVTB: table in DNADLL.MAC.
>>>                         • I could put in a filter ($NOFIL) for event
>>>                     class 5 in the NMXFIL: table in NTMAN.MAC.
>>>
>>>                     That will stop the deluge for the moment.
>>>                     Meanwhile, I have to understand what's actually
>>>                     being detected; even the full SPEAR entry is short
>>>                     on details (like how long the frame was).
>>>
>>>                 The thing to look for is the buffer size (frame size)
>>>                 setting of the stations on the Ethernet.  It should
>>>                 match; if not, someone may send a frame small enough by
>>>                 its own settings but too large for someone else who has a
>>>                 smaller value.  Routing messages tend to cause that
>>>                 problem because they are variable length; the Phase IV
>>>                 rules have the routers send them (the periodic ones)
>>>                 as large as the line buffer size permits.
>>>
>>>                 Note that by convention DECnet doesn't use the full
>>>                 max Ethernet frame size, because DECnet has
>>>                 no fragmentation so the normal settings are chosen to
>>>                 make for consistent NSP packet sizes throughout the
>>>                 network.   The router sending the problematic messages
>>>                 is 2.1023 (not 63.whatever, Rob, remember that
>>>                 addresses are little endian) which has its Ethernet
>>>                 buffer size set to 591.  That matches the VMS
>>>                 conventional default of 576 when accounting for the
>>>                 "long header" used on Ethernet vs. the "short header"
>>>                 on point to point (DDCMP etc.) links.  But VENTI2 has
>>>                 its block size set to 576.  If you change it to 591 it
>>>                 should start working.
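
Plugging in the header sizes from the %RTEHS comments at the top (21 
bytes for the full Phase IV NI header, 6 for the short, Phase III style 
header), Paul's 591 works out; a quick sketch:

    # Conventional 576 plus the long-header/short-header difference
    long_hdr, short_hdr = 21, 6
    print(576 + (long_hdr - short_hdr))   # -> 591, the router's Ethernet buffer size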
>>>
>>>                 Perhaps I should change PyDECnet to have a way to send
>>>                 shorter than max routing messages.
>>>
>>>                     paul
>>>
>>>
>>>
>