[HECnet] Thousands of DECnet errors on Tops-20

Thomas DeBellis tommytimesharing at gmail.com
Fri Jan 15 12:59:25 PST 2021


Was this maybe that magical version 5 of Tops-20 that MRC put together 
for the 2020?  I sure would love to see the sources for that!  I'm not 
sure if this is relevant, but the following macro in D36COM is of interest:

    DEFINE KNMMCS,<
    ;              Symbol,Name,Cost, Maximum receive block size
             KNMMAC LD.TST,TST,  1,  0                      ;TST DEVICE
             KNMMAC LD.DTE,DTE,  3, <^D576>                 ;DTE DEVICE
             KNMMAC LD.KDP,KDP,  4, <^D576>                 ;KDP DEVICE
             KNMMAC LD.DDP,DDP,  5, <^D576>                 ;DDP DEVICE
             KNMMAC LD.CIP,CI,   2, <^D576>                 ;CI DEVICE
             KNMMAC LD.NI ,NI,   1, <^D1504-%RTEHS>         ;NI DEVICE
             KNMMAC LD.DMR,DMR,  2, <^D576>                 ;DMR DEVICE
     >;END OF KNMMCS

What can be seen is that the maximum receive block size is 576 for
_every_ device except the TST entry, which is 0, and the NI, which is
1476 bytes (1504 less %RTEHS).  I don't know whether any of these
devices are relevant to the 2020; I assume that the DTE, CI and NI
are not.
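
To make the sizes concrete, here is a rough Python sketch of my own; it
is not taken from the monitor sources.  It copies the table above into a
dictionary and applies the same kind of "does it fit in the buffer" test
that PHYKNI is described as doing further down the thread.  The function
name fits() and the constant names are made up for illustration.  The
591 and 576 figures are the ones Paul and the SHOW EXECUTOR output give
below, and the value of %RTEHS is only inferred from the 1476 figure,
not read from D36COM.

```python
# Maximum receive block sizes as listed in the KNMMCS macro above.
# 1476 is the figure quoted in the text for <^D1504-%RTEHS>.
MAX_RECEIVE_BLOCK = {
    "TST": 0,
    "DTE": 576,
    "KDP": 576,
    "DDP": 576,
    "CI": 576,
    "NI": 1476,
    "DMR": 576,
}

ROUTER_BUF = 591    # Ethernet buffer size reported for router 2.1023
EXECUTOR_BUF = 576  # buffer size reported for the Tops-20 executor


def fits(frame_len: int, buffer_size: int) -> bool:
    """Hypothetical stand-in for the receive check: accept a frame
    only if it is no longer than the buffer it is received into."""
    return frame_len <= buffer_size


# Inferred only: if 1504 - %RTEHS is 1476, then %RTEHS would be 28.
implied_rtehs = 1504 - MAX_RECEIVE_BLOCK["NI"]

if __name__ == "__main__":
    print("implied %RTEHS:", implied_rtehs)
    print("591 fits a 576 buffer:", fits(ROUTER_BUF, EXECUTOR_BUF))
    print("591 fits the NI maximum:", fits(ROUTER_BUF, MAX_RECEIVE_BLOCK["NI"]))
    print("shortfall in bytes:", ROUTER_BUF - EXECUTOR_BUF)
```

If that reading is right, a routing message filled out to 591 bytes is
15 bytes too long for a 576 byte buffer, even though the NI entry in the
table would allow far more.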

> ------------------------------------------------------------------------
> On 1/12/21 3:29 PM, Peter Lothberg wrote:
>
> The DECnet segment size has to be the same "network wide".
>
> If I remember right, DECnet looks at the two end nodes and uses the
> smallest segment size, so if there is any transit node in the path
> with a smaller segment size, things will not work, as it will drop
> packets bigger than its size.
>
> The only SW/HW combination I knew of that has other than 576 is
> MRC/Stu DECnet for Tops20 4.x on DEC2020.
>
> -P
>
> ------------------------------------------------------------------------
>
>     From: "tommytimesharing" <tommytimesharing at gmail.com>
>     To: "hecnet" <hecnet at Update.UU.SE>
>     Sent: Monday, January 11, 2021 11:58:56 PM
>     Subject: Re: [HECnet] Thousands of DECnet errors on Tops-20
>
>     Yes, I had seen this and had wondered about it after I had
>     reflected on the output of a SHOW EXECUTOR CHARACTERISTICS
>     command (clipped):
>
>         Executor Node = 2.520 (TOMMYT)
>
>           Identification = Tommy Timesharing
>           Management Version = 4.0.0
>           CPU = DECSYSTEM1020
>           Software Identification = Tops-20 7.1 PANDA
>
>                 .
>                 .
>                 .
>
>         Buffer Size = *576*
>           Segment Buffer Size = *576*
>
>     So it would appear that the 20's implementation of NICE knows of
>     this differentiation.  I can parse for both SET EXECUTOR SEGMENT
>     BUFFER SIZE and SET EXECUTOR BUFFER SIZE. Both fail, of course;
>     again, once DECnet is initialized, they are locked.
>
>     However, when one looks at the DECnet initialization block
>     (IBBLK), it only contains a field for buffer size (IBBSZ), nothing
>     about segment size.  Further, the NODE% JSYS' set DECnet
>     initialization parameters function (.NDPRM) only contains a
>     sub-function for buffer size (.NDBSZ) and SETSPD will only parse
>     for DECNET BUFFER-SIZE. I'm hopeful to test that this weekend
>     after I've looked further through the error log.
>
>     The receive code in the low level NI driver (PHYKNI) only checks
>     to see whether what was received will fit into the buffer
>     specified.  It returns a length error (UNLER%) to DNADLL, but not
>     the actual difference.
>
>     I have yet to puzzle out how the segment size is derived, but it
>     is apparently set on a line basis.
>
>         ------------------------------------------------------------------------
>         On 1/11/21 8:24 PM, Johnny Billquist wrote:
>
>         Thomas, I wonder if you might be experiencing the effects of
>         the Ethernet packet size being different from the DECnet
>         segment buffer size.
>         This is a little hard to explain, as I don't have all the
>         proper DECnet naming correct.
>
>         But, based on RSX, there are two relevant sizes. One is the
>         actual buffer size the line is using. The other is the DECnet
>         segment buffer size.
>
>         The DECnet segment buffer size is the maximum size of packets
>         you can expect DECnet itself to ever use.
>         However, at least with RSX, when it comes to the exchange of
>         information at the line level, which includes things like
>         hello messages, RSX is actually using a system buffer size
>         setting, which might be very different from the DECnet segment
>         buffer size.
>
>         I found out that VMS has a problem here in that if the hello
>         packets coming in are much larger than the DECnet segment
>         buffer size, you never even get adjacency up, while RSX can
>         deal with this just fine.
>
>         It sounds like you might be seeing something similar in
>         Tops-20. In which case you would need to tell the other end to
>         reduce the size of these hello and routing information packets
>         for Tops-20 to be happy, or else find a way to accept larger
>         packets.
>
>         After all, ethernet packets can be up to 1500 bytes of payload.
>
>         And to explain it a bit more from an RSX point of view. RSX
>         will use the system buffer size when creating these hello
>         messages. So, if that is set to 1500, you will get hello
>         packets up to 1500 bytes in size, which contain routing
>         vectors and so on.
>
>         But actual DECnet communication will be limited to what the
>         DECnet segment buffer size says, so once you have adjacency up,
>         when a connection is established between two programs, those
>         packets will never be larger than the DECnet segment buffer
>         size, which is commonly 576 bytes.
>
>           Johnny
>
>             ------------------------------------------------------------------------
>             On 2021-01-11 23:43, Thomas DeBellis wrote:
>
>             Paul,
>
>             Lots of good information.  For right now, I did an
>             experiment and went into MDDT and stubbed out the XWD
>             UNLER%,^D5 entry in the NIEVTB: table in the running
>             monitor on VENTI2.  Since then (about an hour or so ago),
>             TOMMYT's ERROR.SYS file has been increasing as usual (a
>             couple of pages an hour) while VENTI2's hasn't changed at
>             all.  So that particular fire hose is plugged for the time
>             being.
>
>             I don't believe I have seen this particular error before,
>             however, there are probably some great reasons for that. 
>             In the 1980's, CCnet may not have had Level-2 routers on
>             it while Columbia's 20's were online.  We did have a
>             problem with the 20's complaining about long Ethernet
>             frames from an early version of BSD 4.2 that was being run on
>             some VAX 11/750's in the Computer Science department's
>             research lab.  They got taught how to not do that and all
>             was well.
>
>             Tops-20's multinet implementation was first done at BBN
>             and then later imported.  I am not sure that it will allow
>             me to change the frame size.  576 was what was used for
>             the Internet, so I don't know where that might be
>             hardwired.  I'll check.
>
>             I think there are two forensics to perform here:
>
>              1. Investigate when the errors started happening; whether
>                 they predate Bob adopting PyDECnet.
>              2. Investigate what the size difference is; I don't
>                 believe that is going into the error log, but I'll
>                 have to look more carefully with SPEAR.
>
>             A *warning* for anyone also looking to track this down: if
>             you do the retrieve in SPEAR on KLH10 and you don't
>             have my time out changes for DTESRV, you will probably
>             crash your 20.  This will happen both with a standard DEC
>             monitor and PANDA.
>
>                 ------------------------------------------------------------------------
>
>                 On 1/11/21 4:41 PM, Paul Koning wrote:
>
>                     On Jan 11, 2021, at 4:22 PM, Thomas DeBellis
>                     <tommytimesharing at gmail.com> wrote:
>
>                     OK, I guess that's probably a level 2 router
>                     broadcast coming over the bridge.  There is no way
>                     Tops-10 or Tops-20 could currently be generating
>                     that because there is no code to do so; they're
>                     level 1 only.
>
>                 Yes, unfortunately originally both multicasts used the
>                 same address.  That was changed in Phase IV Plus, but
>                 that still sends to the old address for backwards
>                 compatibility and it isn't universally implemented.
>
>                     I started looking at the error; it starts out in
>                     DNADLL when it is detected on a frame that has
>                     come back from NISRV (the Ethernet Interface
>                     driver).  The error is then handed off to NTMAN
>                     where the actual logging is done.  So, there are
>                     two quick hacks to stop all the errors:
>
>                         • I could stub out the length error entry (XWD
>                           UNLER%,^D5) in the NIEVTB: table in DNADLL.MAC.
>                         • I could put in a filter ($NOFIL) for event
>                           class 5 in the NMXFIL: table in NTMAN.MAC.
>
>                     That will stop the deluge for the moment.
>                     Meanwhile, I have to understand what's actually
>                     being detected; even the full SPEAR entry is short
>                     on details (like how long the frame was).
>
>                 The thing to look for is the buffer size (frame size)
>                 setting of the stations on the Ethernet.  It should
>                 match; if not someone may send a frame small enough by
>                 its settings but too large for someone else who has a
>                 smaller value.  Routing messages tend to cause that
>                 problem because they are variable length; the Phase IV
>                 rules have the routers send them (the periodic ones)
>                 as large as the line buffer size permits.
>
>                 Note that DECnet by convention doesn't use the full
>                 max Ethernet frame size, because DECnet has no
>                 fragmentation, so the normal settings are chosen to
>                 make for consistent NSP packet sizes throughout the
>                 network.  The router sending the problematic messages
>                 is 2.1023 (not 63.whatever, Rob, remember that
>                 addresses are little endian), which has its Ethernet
>                 buffer size set to 591.  That matches the VMS
>                 conventional default of 576 when accounting for the
>                 "long header" used on Ethernet vs. the "short header"
>                 on point to point (DDCMP etc.) links.  But VENTI2 has
>                 its block size set to 576.  If you change it to 591 it
>                 should start working.
>
>                 Perhaps I should change PyDECnet to have a way to send
>                 shorter than max routing messages.
>
>                     paul
>
>
>


More information about the Hecnet-list mailing list