[HECnet] Thousands of DECnet errors on Tops-20
Thomas DeBellis
tommytimesharing at gmail.com
Fri Jan 15 21:30:29 PST 2021
Yes, it is peculiar, isn't it? It's for a header. I chased it down when
I saw it:
MP %RTEHS,<2+7-6+21+4> ;Ethernet header size, composed of:
;+2 Ethernet padding bytes
;+7 Router Phase-IV pad bytes
;-6 corrects for assumed Phase III header
;+21 allows for full P-IV NI header
;+4 allows for 4 KLNIA CRC bytes (input)
; & for byte misalignment after BLT (output)
The 1504 is to handle a rounding condition when converting from bytes to
words.
It comes out to 1476 bytes maximum data buffer size, which is what I am
re-configuring for when I reboot later this weekend.
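The arithmetic from the comment block above can be written out directly (the constant names are just the source's symbols transcribed, not actual monitor code):

```python
# The %RTEHS header-size arithmetic from D36COM, written out.
# The component values are straight from the comment block above.

RTEHS = 2 + 7 - 6 + 21 + 4    # Ethernet padding, P-IV router pad,
                              # Phase III correction, full P-IV NI
                              # header, and KLNIA CRC bytes

# The <^D1504-%RTEHS> expression in the KNMMCS macro:
MAX_NI_BUFFER = 1504 - RTEHS

print(RTEHS)          # 28
print(MAX_NI_BUFFER)  # 1476
```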
> ------------------------------------------------------------------------
> On 1/15/21 11:49 PM, Johnny Billquist wrote:
>
> Hum. 1504-%RTEHS seems like a weird expression. And then it would mean
> RTEHS is 28? Any idea what RTEHS represents?
>
> Anyway, the actual maximum payload on ethernet is 1500 bytes. Then you
> have src and dst mac (6 bytes each), protocol (2 bytes) and crc (4
> bytes). So if we're talking full size plus overhead, it should say
> 1518. I could guess that the 1504 would be payload plus crc, but then
> why is 28 additional bytes taken away? That's twice what the overhead
> for the rest is... And that it's exactly twice is also intriguing...
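Johnny's arithmetic checks out when written down; the last line also confirms his observation that the 28 missing bytes are exactly twice the 14-byte MAC-level overhead:

```python
# Ethernet frame size arithmetic per Johnny's breakdown.
PAYLOAD = 1500   # maximum Ethernet payload
DST_MAC = 6      # destination MAC address
SRC_MAC = 6      # source MAC address
ETHERTYPE = 2    # protocol/type field
CRC = 4          # frame check sequence

full_frame = PAYLOAD + DST_MAC + SRC_MAC + ETHERTYPE + CRC
print(full_frame)         # 1518: full size plus overhead
print(PAYLOAD + CRC)      # 1504: his guess for where 1504 comes from
print(1504 - 1476)        # 28: twice the 14-byte MAC-level overhead
```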
>
> And yeah, I would definitely assume that DTE, CI and NI are not
> relevant for a 2020. But I do know of one person who at least
> partially got a DELUA to work on a 2020, so maybe it could have been
> possible to get ethernet on it.
>
> Johnny
>> ------------------------------------------------------------------------
>> On 2021-01-15 21:59, Thomas DeBellis wrote:
>>
>> Was this maybe that magical version 5 of Tops-20 that MRC put
>> together for the 2020? I sure would love to see the sources for
>> that! I'm not sure if this is relevant, but the following macro in
>> D36COM is of interest:
>>
>> DEFINE KNMMCS,<
>> ; Symbol,Name,Cost, Maximum receive block size
>> KNMMAC LD.TST,TST, 1, 0 ;TST DEVICE
>> KNMMAC LD.DTE,DTE, 3, <^D576> ;DTE DEVICE
>> KNMMAC LD.KDP,KDP, 4, <^D576> ;KDP DEVICE
>> KNMMAC LD.DDP,DDP, 5, <^D576> ;DDP DEVICE
>> KNMMAC LD.CIP,CI, 2, <^D576> ;CI DEVICE
>> KNMMAC LD.NI ,NI, 1, <^D1504-%RTEHS> ;NI DEVICE
>> KNMMAC LD.DMR,DMR, 2, <^D576> ;DMR DEVICE
>> >;END OF KNMMCS
>>
>> What can be seen is that the maximum block size is 576 in _all_ cases
>> except the NI, which is 1476 bytes. I don't know if any of these
>> devices are relevant to the 2020; one assumes that the DTE, CI and NI
>> are not.
>>> ------------------------------------------------------------------------
>>> On 1/12/21 3:29 PM, Peter Lothberg wrote:
>>>
>>> The DECnet segment size has to be the same "network wide".
>>>
>>> If I remember right, DECnet looks at the two end nodes and uses the
>>> smallest segment size, so if there is any transit node in the path
>>> with a smaller segment size, things will not work, as it will drop
>>> packets bigger than its size.
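Peter's point can be sketched as follows (the function names are illustrative, not taken from any DECnet implementation):

```python
# Sketch of Peter's point: the usable segment size is negotiated down
# to the smaller of the two end nodes, but a transit node with an even
# smaller size silently drops segments bigger than its own limit.

def usable_segment_size(end_a, end_b):
    # DECnet uses the smaller of the two end nodes' segment sizes.
    return min(end_a, end_b)

def path_works(segment, transit_sizes):
    # Every transit node must be able to carry the negotiated segment.
    return all(segment <= size for size in transit_sizes)

seg = usable_segment_size(1476, 1476)
print(seg)                        # 1476
print(path_works(seg, [576]))     # False: the 576-byte transit drops them
print(path_works(576, [576]))     # True: the network-wide 576 default works
```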
>>>
>>> The only SW/HW combination I knew of that has other than 576 is
>>> MRC/Stu's DECnet for Tops-20 4.x on the DEC2020.
>>>
>>> -P
>>>
>>> ------------------------------------------------------------------------
>>>
>>>
>>> *From: *"tommytimesharing" <tommytimesharing at gmail.com>
>>> *To: *"hecnet" <hecnet at Update.UU.SE>
>>> *Sent: *Monday, January 11, 2021 11:58:56 PM
>>> *Subject: *Re: [HECnet] Thousands of DECnet errors on Tops-20
>>>
>>> Yes, I had seen this and had wondered about it after I had
>>> reflected on the output of a SHOW EXECUTOR CHARACTERISTICS
>>> command (clipped):
>>>
>>> Executor Node = 2.520 (TOMMYT)
>>>
>>> Identification = Tommy Timesharing
>>> Management Version = 4.0.0
>>> CPU = DECSYSTEM1020
>>> Software Identification = Tops-20 7.1 PANDA
>>>
>>> .
>>> .
>>> .
>>>
>>> Buffer Size = *576*
>>> Segment Buffer Size = *576*
>>>
>>> So it would appear that the 20's implementation of NICE knows of
>>> this differentiation. I can parse for both SET EXECUTOR SEGMENT
>>> BUFFER SIZE and SET EXECUTOR BUFFER SIZE. Both fail, of course;
>>> again, once DECnet is initialized, they are locked.
>>>
>>> However, when one looks at the DECnet initialization block
>>> (IBBLK), it only contains a field for buffer size (IBBSZ), nothing
>>> about segment size. Further, the NODE% JSYS' set DECnet
>>> initialization parameters function (.NDPRM) only contains a
>>> sub-function for buffer size (.NDBSZ) and SETSPD will only parse
>>> for DECNET BUFFER-SIZE. I'm hopeful to test that this weekend
>>> after I've looked further through the error log.
>>>
>>> The receive code in the low level NI driver (PHYKNI) only checks
>>> to see whether what was received will fit into the buffer
>>> specified. It returns a length error (UNLER%) to DNADLL, but not
>>> the actual difference.
>>>
>>> I have yet to puzzle out how the segment size is derived, but it
>>> is apparently set on a line basis.
>>>
>>> ------------------------------------------------------------------------
>>>
>>> On 1/11/21 8:24 PM, Johnny Billquist wrote:
>>>
>>> Thomas, I wonder if you might be experiencing the effects of the
>>> ethernet packet size being different from the DECnet segment
>>> buffer size.
>>> This is a little hard to explain, as I don't have all the
>>> proper DECnet naming correct.
>>>
>>> But, based on RSX, there are two relevant sizes. One is the
>>> actual buffer size the line is using. The other is the DECnet
>>> segment buffer size.
>>>
>>> The DECnet segment buffer size is the maximum size of packets
>>> you can ever expect DECnet itself to use.
>>> However, at least with RSX, when it comes to the exchange of
>>> information at the line level, which includes things like
>>> hello messages, RSX is actually using a system buffer size
>>> setting, which might be very different from the DECnet segment
>>> buffer size.
>>>
>>> I found out that VMS has a problem here in that if the hello
>>> packets coming in are much larger than the DECnet segment
>>> buffer size, you never even get adjacency up, while RSX can
>>> deal with this just fine.
>>>
>>> It sounds like you might be seeing something similar in
>>> Tops-20. In which case you would need to tell the other end to
>>> reduce the size of these hello and routing information packets
>>> for Tops-20 to be happy, or else find a way to accept larger
>>> packets.
>>>
>>> After all, ethernet packets can be up to 1500 bytes of payload.
>>>
>>> And to explain it a bit more from an RSX point of view. RSX
>>> will use the system buffer size when creating these hello
>>> messages. So, if that is set to 1500, you will get hello
>>> packets up to 1500 bytes in size, which contain routing
>>> vectors and so on.
>>>
>>> But actual DECnet communication will be limited to what the
>>> DECnet segment buffer size says, so once you have adjacency up,
>>> when a connection is established between two programs, those
>>> packets will never be larger than the DECnet segment buffer
>>> size, which is commonly 576 bytes.
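Johnny's distinction between the two sizes can be sketched like this (the names are made up for illustration, not RSX internals):

```python
# Illustrative sketch of the two sizes Johnny describes: routing-layer
# hello messages are built against the system/line buffer size, while
# NSP data segments never exceed the DECnet segment buffer size.

SYSTEM_BUFFER_SIZE = 1500    # line-level buffer (hello/routing messages)
SEGMENT_BUFFER_SIZE = 576    # NSP data segments, the common default

def max_message_size(kind):
    # Hello and routing messages are sized by the system buffer;
    # everything carried for a program connection is capped at the
    # segment buffer size.
    if kind in ("hello", "routing"):
        return SYSTEM_BUFFER_SIZE
    return SEGMENT_BUFFER_SIZE

print(max_message_size("hello"))  # 1500: can overflow a 576-byte receiver
print(max_message_size("data"))   # 576
```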
>>>
>>> Johnny
>>>
>>> ------------------------------------------------------------------------
>>>
>>> On 2021-01-11 23:43, Thomas DeBellis wrote:
>>>
>>> Paul,
>>>
>>> Lots of good information. For right now, I did an
>>> experiment and went into MDDT and stubbed out the XWD
>>> UNLER%,^D5 entry in the NIEVTB: table in the running
>>> monitor on VENTI2. Since then (about an hour or so ago),
>>> TOMMYT's ERROR.SYS file has been increasing as usual (a
>>> couple of pages an hour) while VENTI2's hasn't changed at
>>> all. So that particular fire hose is plugged for the time
>>> being.
>>>
>>> I don't believe I have seen this particular error before,
>>> however, there are probably some great reasons for that.
>>> In the 1980's, CCnet may not have had Level-2 routers on
>>> it while Columbia's 20's were online. We did have a
>>> problem with the 20's complaining about long Ethernet
>>> frames from an early version BSD 4.2 that was being run on
>>> some VAX 11/750's in the Computer Science department's
>>> research lab. They got taught how to not do that and all
>>> was well.
>>>
>>> Tops-20's multinet implementation was first done at BBN
>>> and then later imported. I am not sure that it will allow
>>> me to change the frame size. 576 was what was used for
>>> the Internet, so I don't know where that might be
>>> hardwired. I'll check.
>>>
>>> I think there are two forensics to perform here:
>>>
>>> 1. Investigate when the errors started happening; whether
>>> they predate Bob adopting PyDECnet.
>>> 2. Investigate what the size difference is; I don't believe
>>> that is going into the error log, but I'll have to look
>>> more carefully with SPEAR.
>>>
>>> A *warning* for anyone also looking to track this down: if
>>> you do the retrieve in SPEAR on KLH10 and you don't have
>>> my timeout changes for DTESRV, you will probably
>>> crash your 20. This will happen both with a standard DEC
>>> monitor and PANDA.
>>>
>>> ------------------------------------------------------------------------
>>>
>>>
>>> On 1/11/21 4:41 PM, Paul Koning wrote:
>>>
>>> On Jan 11, 2021, at 4:22 PM, Thomas
>>> DeBellis<tommytimesharing at gmail.com>
>>> <mailto:tommytimesharing at gmail.com> wrote:
>>>
>>> OK, I guess that's probably a level 2 router
>>> broadcast coming over the bridge. There is no way
>>> Tops-10 or Tops-20 could currently be generating
>>> that because there is no code to do so; they're
>>> level 1, only
>>>
>>> Yes, unfortunately originally both multicasts used the
>>> same address. That was changed in Phase IV Plus, but
>>> that still sends to the old address for backwards
>>> compatibility and it isn't universally implemented.
>>>
>>> I started looking at the error; it starts out in
>>> DNADLL when it is detected on a frame that has
>>> come back from NISRV (the Ethernet Interface
>>> driver). The error is then handed off to NTMAN
>>> where the actual logging is done. So, there are
>>> two quick hacks to stop all the errors:
>>>
>>> • I could stub out the length error entry (XWD
>>> UNLER%,^D5) in the NIEVTB: table in DNADLL.MAC.
>>> • I could put in a filter ($NOFIL) for event
>>> class 5 in the NMXFIL: table in NTMAN.MAC.
>>>
>>> That will stop the deluge for the moment.
>>> Meanwhile, I have to understand what's actually
>>> being detected; even the full SPEAR entry is short
>>> on details (like how long the frame was).
>>>
>>> The thing to look for is the buffer size (frame size)
>>> setting of the stations on the Ethernet. It should
>>> match; if not someone may send a frame small enough by
>>> its settings but too large for someone else who has a
>>> smaller value. Routing messages tend to cause that
>>> problem because they are variable length; the Phase IV
>>> rules have the routers send them (the periodic ones)
>>> as large as the line buffer size permits.
>>>
>>> Note that DECnet by convention doesn't use the full
>>> max Ethernet frame size in DECnet, because DECnet has
>>> no fragmentation so the normal settings are chosen to
>>> make for consistent NSP packet sizes throughout the
>>> network. The router sending the problematic messages
>>> is 2.1023 (not 63.whatever, Rob, remember that
>>> addresses are little endian) which has its Ethernet
>>> buffer size set to 591. That matches the VMS
>>> conventional default of 576 when accounting for the
>>> "long header" used on Ethernet vs. the "short header"
>>> on point to point (DDCMP etc.) links. But VENTI2 has
>>> its block size set to 576. If you change it to 591 it
>>> should start working.
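Paul's numbers can be checked directly; the 15-byte difference is the extra cost of the Ethernet "long header" relative to the point-to-point "short header" he mentions:

```python
# Checking Paul's arithmetic: the router's Ethernet buffer size of 591
# is the conventional 576-byte segment size plus the extra bytes of
# the Phase IV "long header" used on Ethernet.
ROUTER_BUFFER = 591   # buffer size configured on node 2.1023
SEGMENT_SIZE = 576    # conventional VMS default

long_header_extra = ROUTER_BUFFER - SEGMENT_SIZE
print(long_header_extra)   # 15 extra bytes for the long header
```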
>>>
>>> Perhaps I should change PyDECnet to have a way to send
>>> shorter than max routing messages.
>>>
>>> paul
>>>
>>>
>>>
>