[HECnet] Thousands of DECnet errors on Tops-20
Thomas DeBellis
tommytimesharing at gmail.com
Fri Jan 15 12:59:25 PST 2021
Was this maybe that magical version 5 of Tops-20 that MRC put together
for the 2020? I sure would love to see the sources for that! I'm not
sure if this is relevant, but the following macro in D36COM is of interest:
DEFINE KNMMCS,<
; Symbol,Name,Cost, Maximum receive block size
KNMMAC LD.TST,TST, 1, 0 ;TST DEVICE
KNMMAC LD.DTE,DTE, 3, <^D576> ;DTE DEVICE
KNMMAC LD.KDP,KDP, 4, <^D576> ;KDP DEVICE
KNMMAC LD.DDP,DDP, 5, <^D576> ;DDP DEVICE
KNMMAC LD.CIP,CI, 2, <^D576> ;CI DEVICE
KNMMAC LD.NI ,NI, 1, <^D1504-%RTEHS> ;NI DEVICE
KNMMAC LD.DMR,DMR, 2, <^D576> ;DMR DEVICE
>;END OF KNMMCS
What can be seen is that the maximum block size is 576 in _all_ cases
except the NI, which is 1476 bytes. I don't know if any of these
devices are relevant to the 2020; one assumes that the DTE, CI and NI
are not.
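
To make the table easier to compare at a glance, here is a minimal Python sketch of the same data. It is only an illustration and not anything from the monitor sources: the value used for %RTEHS (28) is inferred from the 1476 figure above rather than read out of D36COM, and the function name is made up for this example.

```python
# Assumption: %RTEHS is inferred from the 1476 figure quoted in the text,
# i.e. 1504 - %RTEHS = 1476. It was not read from the D36COM sources.
RTEHS = 1504 - 1476

# (symbol, name, cost, maximum receive block size), following KNMMCS.
KNMMCS = [
    ("LD.TST", "TST", 1, 0),
    ("LD.DTE", "DTE", 3, 576),
    ("LD.KDP", "KDP", 4, 576),
    ("LD.DDP", "DDP", 5, 576),
    ("LD.CIP", "CI", 2, 576),
    ("LD.NI", "NI", 1, 1504 - RTEHS),
    ("LD.DMR", "DMR", 2, 576),
]


def devices_with_other_block_size(table, usual=576):
    """Hypothetical helper: names of devices whose maximum receive block
    size is neither the usual value nor 0 (the TST placeholder)."""
    return [name for _sym, name, _cost, size in table if size not in (0, usual)]


if __name__ == "__main__":
    for _sym, name, cost, size in KNMMCS:
        print(f"{name:4} cost={cost} max receive block size={size}")
    print("differs from 576:", devices_with_other_block_size(KNMMCS))
```

Under that assumption, the NI is the only device the helper reports.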
> ------------------------------------------------------------------------
> On 1/12/21 3:29 PM, Peter Lothberg wrote:
>
> The DECnet segment size has to be the same "network wide".
>
> If I remember right DECnet looks at the two end nodes and uses the
> smallest segment size,
> so if there is any transit node in the path with a small segment size
> things will not work as
> it will drop packets bigger than its size.
>
> The only SW/HW combination I knew of that has other than 576 is
> MRC/Stu DECnet for
> Tops20 4.x on DEC2020.
>
> -P
>
> ------------------------------------------------------------------------
>
> *From: *"tommytimesharing" <tommytimesharing at gmail.com>
> *To: *"hecnet" <hecnet at Update.UU.SE>
> *Sent: *Monday, January 11, 2021 11:58:56 PM
> *Subject: *Re: [HECnet] Thousands of DECnet errors on Tops-20
>
> Yes, I had seen this and had wondered about it after I had
> reflected on the output of a SHOW EXECUTOR CHARACTERISTICS
> command (clipped)
>
> Executor Node = 2.520 (TOMMYT)
>
> Identification = Tommy Timesharing
> Management Version = 4.0.0
> CPU = DECSYSTEM1020
> Software Identification = Tops-20 7.1 PANDA
>
> .
> .
> .
>
> Buffer Size = *576*
> Segment Buffer Size = *576*
>
> So it would appear that the 20's implementation of NICE knows of
> this differentiation. I can parse for both SET EXECUTOR SEGMENT
> BUFFER SIZE and SET EXECUTOR BUFFER SIZE. Both fail, of course;
> again, once DECnet is initialized, they are locked.
>
> However, when one looks at the DECnet initialization block
> (IBBLK), it only contains a field for buffer size (IBBSZ), nothing
> about segment size. Further, the NODE% JSYS' set DECnet
> initialization parameters function (.NDPRM) only contains a
> sub-function for buffer size (.NDBSZ) and SETSPD will only parse
> for DECNET BUFFER-SIZE. I'm hopeful to test that this weekend
> after I've looked further through the error log.
>
> The receive code in the low level NI driver (PHYKNI) only checks
> to see whether what was received will fit into the buffer
> specified. It returns a length error (UNLER%) to DNADLL, but not
> the actual difference.
>
> I have yet to puzzle out how the segment size is derived, but it
> is apparently set on a line basis.
>
> ------------------------------------------------------------------------
> On 1/11/21 8:24 PM, Johnny Billquist wrote:
>
> Thomas, I wonder if you might be experiencing the effects of the
> Ethernet packet size being different from the DECnet
> segment buffer size.
> This is a little hard to explain, as I don't have all the
> proper DECnet naming correct.
>
> But, based on RSX, there are two relevant sizes. One is the
> actual buffer size the line is using. The other is the DECnet
> segment buffer size.
>
> The DECnet segment buffer size is the maximum size of packets
> you can ever expect DECnet itself to ever use.
> However, at least with RSX, when it comes to the exchange of
> information at the line level, which includes things like
> hello messages, RSX is actually using a system buffer size
> setting, which might be very different from the DECnet segment
> buffer size.
>
> I found out that VMS has a problem here in that if the hello
> packets coming in are much larger than the DECnet segment
> buffer size, you never even get adjacency up, while RSX can
> deal with this just fine.
>
> It sounds like you might be seeing something similar in
> Tops-20. In which case you would need to tell the other end to
> reduce the size of these hello and routing information packets
> for Tops-20 to be happy, or else find a way to accept larger
> packets.
>
> After all, ethernet packets can be up to 1500 bytes of payload.
>
> And to explain it a bit more from an RSX point of view. RSX
> will use the system buffer size when creating these hello
> messages. So, if that is set to 1500, you will get hello
> packets up to 1500 bytes in size, which contain routing
> vectors and so on.
>
> But actual DECnet communication will be limited to what the
> DECnet segment buffer size says, so once you have adjacency up,
> when a connection is established between two programs, those
> packets will never be larger than the DECnet segment buffer
> size, which is commonly 576 bytes.
>
> Johnny
>
> ------------------------------------------------------------------------
> On 2021-01-11 23:43, Thomas DeBellis wrote:
>
> Paul,
>
> Lots of good information. For right now, I did an
> experiment and went into MDDT and stubbed out the XWD
> UNLER%,^D5 entry in the NIEVTB: table in the running
> monitor on VENTI2. Since then (about an hour or so ago),
> TOMMYT's ERROR.SYS file has been increasing as usual (a
> couple of pages an hour) while VENTI2's hasn't changed at
> all. So that particular fire hose is plugged for the time
> being.
>
> I don't believe I have seen this particular error before,
> however, there are probably some great reasons for that.
> In the 1980's, CCnet may not have had Level-2 routers on
> it while Columbia's 20's were online. We did have a
> problem with the 20's complaining about long Ethernet
> frames from an early version of BSD 4.2 that was being run on
> some VAX 11/750's in the Computer Science department's
> research lab. They got taught how to not do that and all
> was well.
>
> Tops-20's multinet implementation was first done at BBN
> and then later imported. I am not sure that it will allow
> me to change the frame size. 576 was what was used for
> the Internet, so I don't know where that might be
> hardwired. I'll check.
>
> I think there are two forensics to perform here:
>
> 1. Investigate when the errors started happening; whether
> they predate
> Bob adopting PyDECnet
> 2. Investigate what the size difference is; I don't
> believe that is
> going into the error log, but I'll have to look more
> carefully with
> SPEAR.
>
> A *warning* for anyone also looking to track this down: if
> you do the retrieve in SPEAR on KLH10 and you don't have
> my time out changes for DTESRV, you will probably
> crash your 20. This will happen both with a standard DEC
> monitor and PANDA.
>
> ------------------------------------------------------------------------
>
> On 1/11/21 4:41 PM, Paul Koning wrote:
>
> On Jan 11, 2021, at 4:22 PM, Thomas
> DeBellis <tommytimesharing at gmail.com> wrote:
>
> OK, I guess that's probably a level 2 router
> broadcast coming over the bridge. There is no way
> Tops-10 or Tops-20 could currently be generating
> that because there is no code to do so; they're
> level 1 only.
>
> Yes, unfortunately originally both multicasts used the
> same address. That was changed in Phase IV Plus, but
> that still sends to the old address for backwards
> compatibility and it isn't universally implemented.
>
> I started looking at the error; it starts out in
> DNADLL when it is detected on a frame that has
> come back from NISRV (the Ethernet Interface
> driver). The error is then handed off to NTMAN
> where the actual logging is done. So, there are
> two quick hacks to stop all the errors:
>
> • I could stub out the length error entry (XWD
> UNLER%,^D5) in the NIEVTB: table in DNADLL.MAC.
> • I could put in a filter ($NOFIL) for event
> class 5 in the NMXFIL: table in NTMAN.MAC.
>
> That will stop the deluge for the moment.
> Meanwhile, I have to understand what's actually
> being detected; even the full SPEAR entry is short
> on details (like how long the frame was).
>
> The thing to look for is the buffer size (frame size)
> setting of the stations on the Ethernet. It should
> match; if not someone may send a frame small enough by
> its settings but too large for someone else who has a
> smaller value. Routing messages tend to cause that
> problem because they are variable length; the Phase IV
> rules have the routers send them (the periodic ones)
> as large as the line buffer size permits.
>
> Note that DECnet by convention doesn't use the full
> max Ethernet frame size, because DECnet has
> no fragmentation so the normal settings are chosen to
> make for consistent NSP packet sizes throughout the
> network. The router sending the problematic messages
> is 2.1023 (not 63.whatever, Rob, remember that
> addresses are little endian) which has its Ethernet
> buffer size set to 591. That matches the VMS
> conventional default of 576 when accounting for the
> "long header" used on Ethernet vs. the "short header"
> on point to point (DDCMP etc.) links. But VENTI2 has
> its block size set to 576. If you change it to 591 it
> should start working.
>
> Perhaps I should change PyDECnet to have a way to send
> shorter than max routing messages.
>
> paul
>
>
>
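
As a footnote to Paul's explanation quoted above, here is a small Python sketch of the two checks he describes: which stations could send a frame too long for a given receiver, and how a Phase IV address decodes when its two bytes are read little endian. The function names and the station table are made up for illustration. The sizes 591 and 576 are the ones from his message, and the byte pair FF 0B is simply what 2.1023 works out to (area in the top 6 bits, node in the low 10), not something taken from the error log.

```python
def senders_too_large_for(receiver, buffer_sizes):
    """Hypothetical helper: stations whose line buffer size exceeds the
    receiver's, so their full-length routing messages would not fit."""
    limit = buffer_sizes[receiver]
    return sorted(
        name
        for name, size in buffer_sizes.items()
        if name != receiver and size > limit
    )


def decode_phase_iv_address(raw, byteorder="little"):
    """Decode a 16-bit DECnet Phase IV address into 'area.node'.
    The area is the top 6 bits and the node is the low 10 bits."""
    value = int.from_bytes(raw[:2], byteorder)
    return f"{value >> 10}.{value & 0x3FF}"


if __name__ == "__main__":
    # Buffer sizes as quoted in the thread.
    sizes = {"2.1023": 591, "VENTI2": 576}
    print(senders_too_large_for("VENTI2", sizes))

    # After raising VENTI2 to 591, as Paul suggests, nothing is too large.
    sizes["VENTI2"] = 591
    print(senders_too_large_for("VENTI2", sizes))

    # Correct (little endian) reading versus the byte-swapped one.
    raw = bytes([0xFF, 0x0B])
    print(decode_phase_iv_address(raw))
    print(decode_phase_iv_address(raw, "big"))
```

Reading the same two bytes in the wrong order gives an address in area 63, which would explain the "63.whatever" mix-up.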