[HECnet] KLH10 halting at random

Thomas DeBellis tommytimesharing at gmail.com
Mon Aug 31 14:00:04 PDT 2020


If you are running a standard PANDA distribution, then DDT is in the 
monitor and you may fail to it.  Did it come up?  Did you do an examine 
from the KLH10 micro-engine to see what instruction it was failing on?  
Did you see what module it is failing in?

My monitor is modified from the base PANDA distribution to include 
several local enhancements, so when I looked at that address, it showed 
up as in the entry of CHKOPC, which is what checks for deferred 
closes on virtual circuits.  This is in PHYKLP, which is the KLIPA driver 
(a.k.a. the CI).  Since KLH10 (sadly) does not implement the CI, there 
is no way you should be executing in that module, as there is nothing for 
it to talk to.

Moreover, there is no JRST 4 there.  So probably you have something else 
at that address.
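
For anyone checking their own monitor image: JRST 4, is the PDP-10 HALT instruction (opcode 254, AC field 4), so the question is whether the word at the reported address decodes that way. Below is a minimal sketch in Python, not anything that ships with KLH10 or TOPS-20. The function names are made up for illustration, and it assumes you have already obtained the 36-bit word (as left and right octal halves) by examining memory yourself.

```python
# Hypothetical helpers for decoding a 36-bit PDP-10 instruction word.
# Standard instruction layout, high bits to low bits:
#   opcode (9 bits) | AC (4 bits) | I (1 bit) | X (4 bits) | Y (18 bits)

def from_halves(left, right):
    """Build a 36-bit word from two 18-bit halves (as in LH,,RH notation)."""
    if not (0 <= left <= 0o777777 and 0 <= right <= 0o777777):
        raise ValueError("each half must fit in 18 bits")
    return (left << 18) | right

def decode_pdp10_word(word):
    """Split a 36-bit instruction word into its fields."""
    if not 0 <= word < (1 << 36):
        raise ValueError("not a 36-bit word")
    return {
        "opcode": (word >> 27) & 0o777,
        "ac": (word >> 23) & 0o17,
        "indirect": (word >> 22) & 1,
        "index": (word >> 18) & 0o17,
        "address": word & 0o777777,
    }

def is_jrst_4(word):
    """True if the word is JRST with AC field 4, i.e. HALT."""
    fields = decode_pdp10_word(word)
    return fields["opcode"] == 0o254 and fields["ac"] == 0o4

# Illustrative values only; these are not words taken from a real monitor.
halt_word = from_halves(0o254200, 0o22013)   # JRST 4,22013
jump_word = from_halves(0o254000, 0o22013)   # plain JRST 22013
print(is_jrst_4(halt_word), is_jrst_4(jump_word))
```

If the word at the halting address does not decode as JRST 4, that would suggest the monitor in memory differs from the one you think you are running.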

I have been running KLH10 for a /very/ long time, since late December 
2002, and have made modifications there too, to fix an issue with locking 
memory and to better support Linux (recent Ubuntu).  It is remarkably 
robust; despite intensive development, I have stayed up well over a year 
at a time (i.e., hit UP2LNG BUGHLTs).

I have found one problem: if you are running it on an _extremely_ fast 
machine with SSD storage (in other words, you're basically never waiting 
for anything) and you seriously beat on the file system, then the 
keep-alive counter can get out of sync, with the 20 thinking the front 
end has died and the KLH10 DTE simulator apparently not understanding 
what to do.

The 20 typed an initial BUGCHK and then, in the middle of the second one, 
it hung waiting for the front end.

It's on my list of things to investigate.

> ------------------------------------------------------------------------
> On 8/31/20 4:15 PM, Supratim Sanyal wrote:
>
> hi all - my panda distribution instance is halting after a couple of 
> days with the following message. is this a known problem for which 
> there is some workaround?
>
> Monitor RF434E DEC10 Development
> System uptime 52:10:47
> Current date/time Wednesday 29-Jul-120 6:01:04
>
> [HALTED: Program Halt, PC = 22013]
>
> thanks
>
> Supratim
>