[HECnet] Emulated XQ polling timer setting and data overrun

Mark Pizzolato - Info Comm Mark at infocomm.com
Wed Jun 4 18:44:36 PDT 2014


On Monday, May 26, 2014 at 7:09 AM, Jean-Yves Bernier wrote:
[ Summary : File transfers between two simh PDP hang, DECnet reports Data
overruns and Response timeouts ]


At 2:14 AM +0200 26/5/14, Johnny Billquist wrote:

This is a problem inside of DECnet on the simulated host. It gets
packets faster than it can process them, so some packets are dropped.
Unfortunately DECnet deals very badly with systematic packet loss like this.
You get retransmissions, and after a while the retransmission
timeout backs off until you have more than a minute between
retransmission attempts.

Anyway, if you can get simh to throttle the ethernet interface, that
might help you.
(I don't remember offhand if it does support such functionality.)


The service polling timer can be adjusted:

SET XQ POLL={DEFAULT|4..2500}

Set to 100 by default.
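
For a rough sense of scale, the sketch below converts a POLL setting into the
interval between polls and into the number of full-size frames a 10 Mbit/s
Ethernet could deliver during that interval. This is illustrative arithmetic,
not a description of how simh schedules the XQ device: the function names are
made up, the 10 Mbit/s line rate is an assumption (a simulated link may well
run faster), and treating everything that arrives between two polls as one
burst is a simplification.

```python
# Illustrative arithmetic only; the burst model is an assumption,
# not documented simh behavior.
ETHERNET_BPS = 10_000_000   # assumed 10 Mbit/s line rate
WIRE_OVERHEAD_BYTES = 20    # preamble + SFD (8) and inter-frame gap (12)
POLL_MIN, POLL_MAX = 4, 2500  # range accepted by SET XQ POLL


def check_poll(polls_per_sec):
    """Reject values outside the range SET XQ POLL accepts."""
    if not POLL_MIN <= polls_per_sec <= POLL_MAX:
        raise ValueError("POLL must be between 4 and 2500")
    return polls_per_sec


def poll_interval_ms(polls_per_sec):
    """Time between two polls, in milliseconds."""
    return 1000.0 / check_poll(polls_per_sec)


def frames_per_poll(polls_per_sec, frame_bytes=1518):
    """Upper bound on frames of the given size arriving between polls."""
    wire_bits = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    frames_per_sec = ETHERNET_BPS / wire_bits
    return frames_per_sec / check_poll(polls_per_sec)


for rate in (10, 100, 500):
    print(rate, poll_interval_ms(rate), round(frames_per_poll(rate), 1))
```

Under these assumptions a lower POLL value means longer gaps between polls
and a larger potential burst each time, so the number is only a starting
point for experiments, not a prediction of where overruns stop.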


Changing the polling timer makes a huge difference. Have a look at:

http://pastebin.com/AZ1U6bh3

Although it still hangs sometimes, reliability has vastly improved over the
erratic behavior at the beginning. Remember, the completion time was
about 3 minutes.

We're almost there :)

This turns into an interesting challenge: tune the XQ service timer to keep
overruns as low as possible. This depends on many factors, among them
the data sink bandwidth.

A copy to TI: may be flawless while the same copy to disk fails. The terminal is
actually throttling the transfer. Disks are faster, and emulated disks are an order
of magnitude faster than the original ones.
Emulation is pushing DECnet to speeds it was never designed for.

I'm running here as low as 10 polls/sec. Maybe 50 would be optimal, and
what about 500? I need metrics. And tools. Here, I am using AT.
to time a 100.-block file transfer. Overruns and timeouts still rise slowly, but
DECnet recovers happily most of the time.
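
One possible metric for comparing POLL settings is plain throughput from the
timed transfer, together with the overruns counted per block. The sketch
below assumes 512-byte disk blocks and uses hypothetical function names; the
counter values would have to be read by hand from the DECnet counters before
and after each run, and the numbers in the example call are illustrative, not
a measurement.

```python
BLOCK_BYTES = 512  # assumed disk block size


def throughput_bytes_per_sec(blocks, seconds):
    """Average payload rate of one timed transfer."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return blocks * BLOCK_BYTES / seconds


def overruns_per_block(overruns_before, overruns_after, blocks):
    """Overruns added during the run, normalized by transfer size."""
    if blocks <= 0:
        raise ValueError("block count must be positive")
    return (overruns_after - overruns_before) / blocks


# Illustrative values only: 100 blocks copied in 180 seconds,
# with the overrun counter going from 40 to 45 during the run.
print(round(throughput_bytes_per_sec(100, 180), 1))
print(overruns_per_block(40, 45, 100))
```

Recording both numbers for each POLL value makes it possible to tell a
setting that is merely slow from one that is actually losing packets.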

If you're going to dive this deeply into testing, it would really be best if you ran with the latest code, since your results may ultimately suggest code changes, AND the latest code may behave significantly differently than the older versions.

Independent of which codebase you're running, and since you're now tweaking the behavior of the simulated hardware, you may want to look at sim> SHOW XQ STATS and try to analyze the relationship between these stats and the ones the OS sees.
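
Relating the two sets of statistics mostly comes down to taking counter
deltas over the same interval on both sides. A minimal sketch, assuming the
counters have already been copied by hand into dictionaries; the counter
names used here are placeholders, not the actual labels printed by
SHOW XQ STATS or by NCP.

```python
def counter_deltas(before, after):
    """Change in each counter between two snapshots of the same source."""
    return {name: after[name] - before.get(name, 0) for name in after}


def unaccounted_frames(sim_delta, guest_delta,
                       sim_key="recv", guest_key="blocks_received"):
    """Frames the simulator saw that the guest did not count as received.

    Both key names are hypothetical. The difference is only a hint:
    the simulator may also count frames the guest never counts
    (for example non-DECnet traffic).
    """
    return sim_delta[sim_key] - guest_delta[guest_key]


# Placeholder snapshots taken before and after one transfer.
sim = counter_deltas({"recv": 1000}, {"recv": 1250})
guest = counter_deltas({"blocks_received": 900}, {"blocks_received": 1130})
print(unaccounted_frames(sim, guest))
```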

Also, you may want to explore what happens if:

NCP> DEFINE LINE QNA-0 RECEIVE BUFFER 32

is done prior to starting your network... The limit of 32 may be different on different operating systems...

Once again the latest code is available from: https://github.com/simh/simh/archive/master.zip

Good Luck,

- Mark


