[HECnet] DECnet for Linux

Tim Stark fsword007 at gmail.com
Sun Jan 28 10:18:05 PST 2018


Yay! Thanks for getting it working on the latest Linux kernels! I will apply
the changes to my code.

A few weeks ago, I downloaded a copy of the CVS repo from SourceForge and
successfully converted it to a Git repo.  I plan to upload it to my GitHub
account soon.

Then I will apply these patches to the repo.

I have Ubuntu 17.10 and keep getting complaints every few minutes about a
NULL pointer in the decnet module.

I am setting up my Tinker SBC (a Pi clone with a faster CPU) for 24/7
operation.

Tim

-----Original Message-----
From: owner-hecnet at Update.UU.SE [mailto:owner-hecnet at Update.UU.SE] On Behalf
Of Erik Olofsen
Sent: Friday, January 26, 2018 3:52 PM
To: hecnet at Update.UU.SE
Subject: Re: [HECnet] DECnet for Linux

Thank you all for your support!

So here are some more detailed notes to get DECnet for Linux working with a
recent longterm kernel, 4.9.77. How stable it is remains to be seen...

Changes to sources of other kernels are likely to be similar. But because of
many API changes, there are many differences between decnet module sources
with different kernel versions, so patch files are not too useful. However,
just a few small edits are needed.

The patches below were tested with Slackware 14.1 (which doesn't use
systemd), where it is relatively easy to build a custom kernel. Also, decnet
startup was done by hand rather than using a script:

# ifconfig eth0 mtu 576 # see below
# modprobe decnet
# echo -n area.node > /proc/sys/net/decnet/node_address
# dnetd
# phoned

[Kernel 4.14.15 gives BUG: unable to handle kernel NULL pointer dereference
which seems to be related to xfrm_lookup(). A kernel warning about dst.h:256
can be avoided by using dst_use_noref() instead of dst_use() at two places.]


--- MTU ---

It seems the decnet module can generate packets which are one byte larger
than intended. This can be fixed (?) in af_decnet.c:dn_mss_from_pmtu() by
adding one line:

                 */
                mtu -= (21 + DN_MAX_NSP_DATA_HEADER + 16);
        }
+       mtu--; /* probably due to padding */
        if (mtu > mss)
                mss = mtu;
        return mss;
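
As a rough illustration of what the extra line does to the numbers, here is a
userspace model of the branch shown above. It is only a sketch: the function
name mss_from_pmtu() is made up, and both the value 11 for
DN_MAX_NSP_DATA_HEADER and the 230-byte starting point for mss are
assumptions about the kernel source, not something shown in the fragment.

```c
#include <assert.h>

/* Assumed value; the kernel defines DN_MAX_NSP_DATA_HEADER in its DECnet header. */
#define DN_MAX_NSP_DATA_HEADER 11

/*
 * Userspace model of the no-device branch of dn_mss_from_pmtu().
 * extra_byte = 1 applies the "mtu--" line from the patch above,
 * extra_byte = 0 models the unpatched code.
 */
int mss_from_pmtu(int mtu, int extra_byte)
{
    /* Assumed lower bound on the segment size. */
    int mss = 230 - DN_MAX_NSP_DATA_HEADER;

    assert(mtu >= 0);

    /* 21 = long header, 16 = guess at MAC header length */
    mtu -= (21 + DN_MAX_NSP_DATA_HEADER + 16);
    if (extra_byte)
        mtu--;              /* the added line */
    if (mtu > mss)
        mss = mtu;
    return mss;
}
```

Under these assumptions, a path MTU of 576 gives 528 bytes without the extra
line and 527 with it.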


The dneigh utility gives

Node                     HWtype  HWaddress           Flags      MTU    Iface
rullfl                   loop    AA:00:04:00:04:70   ---        65533  lo
rullfs                   ether   AA:00:04:00:29:70   -2-        1498   eth0
gorvax                   ether   AA:00:04:00:90:21   -2-        1498   eth0
pdxvax                   ether   AA:00:04:00:98:F2   -2-        1498   eth0
dimma                    ether   AA:00:04:00:0B:EC   -2-        576    eth0
jocke                    ether   AA:00:04:00:15:04   -2-        576    eth0
mim                      ether   AA:00:04:00:0D:04   -2-        1500   eth0
hub                      ether   AA:00:04:00:FE:AB   -2-        576    eth0
bitxoz                   ether   AA:00:04:00:51:1C   1--        1492   eth0
skhngw                   ether   AA:00:04:00:04:38   -2-        576    eth0
a44rtr                   ether   AA:00:04:00:FF:B3   -2-        1498   eth0

where it is interesting to see values of 576, but also a value of 1500 for
MIM (but communication with MIM works well).
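
The HWaddress column encodes each node's DECnet address: in DECnet Phase IV
the last two bytes of AA:00:04:00:xx:yy hold the 16-bit address, low byte
first, with the area in the top 6 bits and the node number in the low 10.
A small sketch for decoding the table entries (the function name
decnet_from_mac() is made up for illustration):

```c
#include <assert.h>
#include <stdio.h>

/*
 * Decode a DECnet Ethernet address of the form AA:00:04:00:lo:hi into
 * area and node number.
 * Returns 0 on success, -1 if the string is not a DECnet-style address.
 */
int decnet_from_mac(const char *mac, unsigned *area, unsigned *node)
{
    unsigned b[6];
    unsigned addr;

    assert(mac && area && node);

    if (sscanf(mac, "%2x:%2x:%2x:%2x:%2x:%2x",
               &b[0], &b[1], &b[2], &b[3], &b[4], &b[5]) != 6)
        return -1;
    if (b[0] != 0xAA || b[1] != 0x00 || b[2] != 0x04 || b[3] != 0x00)
        return -1;

    addr = (b[5] << 8) | b[4];   /* the address is stored low byte first */
    *area = addr >> 10;          /* top 6 bits */
    *node = addr & 0x3FF;        /* low 10 bits */
    return 0;
}
```

For example, the entry for mim (AA:00:04:00:0D:04) decodes to area 1, node 13.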

MTUs or blksizes are computed in dn_neigh.c, accompanied by the remark:

        /*
         * Make an estimate of the remote block size by assuming that its
         * two less then the device mtu, which it true for ethernet (and
         * other things which support long format headers) since there is
         * an extra length field (of 16 bits) which isn't part of the
         * ethernet headers and which the DECnet specs won't admit is part
         * of the DECnet routing headers either.
         *
         * If we over estimate here its no big deal, the NSP negotiations
         * will prevent us from sending packets which are too large for the
         * remote node to handle. In any case this figure is normally updated
         * by a hello message in most cases.
         */
        dn->blksize = dev->mtu - 2;

The mtu--; change above seems sufficient; subtracting an additional two does
not seem necessary.

For HECnet, it seems best to set the MTU to 576, to be able to reach nodes
in different areas. This can be done with ifconfig.

Optionally in sysctl_net_decnet.c a /proc/sys/net/decnet/mtu entry could be
added with:

                .maxlen = sizeof(int),
                .mode = 0644,
                .proc_handler = proc_dointvec,
+       },
+       {
+               .procname = "mtu",
+               .data = &decnet_mtu,
+               .maxlen = sizeof(int),
+               .mode = 0644,
+               .proc_handler = proc_dointvec,
        },
        { }
 };

where decnet_mtu needs to be defined in this source file, and declared in
linux/include/net/dn.h as:

extern int decnet_mtu;

which could be used at various places instead of dst_mtu(). [But perhaps the
blksize can be requested from the remote host.]


--- "No buffer space available" message by a dnprogs utility  ---

At some point in time, sock_alloc_send_skb() changed, causing a problem in
af_decnet.c:dn_alloc_send_pskb(), which can be fixed by adding the second
"+" line (the first one seems common practice, but may be unnecessary):

        if (skb) {
                skb->protocol = htons(ETH_P_DNA_RT);
                skb->pkt_type = PACKET_OUTGOING;
+               skb->sk = sk;
+               *errcode = 0;
        }
        return skb;
 }
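
To illustrate why clearing *errcode can matter, here is a userspace sketch of
the general pattern. It assumes, without having confirmed it in the kernel
source, that the underlying allocator leaves *errcode untouched on success and
that some caller tests the error code rather than the returned pointer. The
names alloc_buf() and alloc_buf_checked() are made up.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for an allocator that reports failure through
 * *errcode but leaves it untouched on success.
 */
void *alloc_buf(size_t len, int *errcode)
{
    void *p;

    assert(errcode);
    p = malloc(len);
    if (!p)
        *errcode = -ENOMEM;
    return p;
}

/*
 * Wrapper modelled on the patched dn_alloc_send_pskb(): on success the
 * error code is cleared explicitly, so a stale value cannot leak through.
 */
void *alloc_buf_checked(size_t len, int *errcode)
{
    void *p = alloc_buf(len, errcode);

    if (p)
        *errcode = 0;
    return p;
}
```

If the caller's error variable still holds -ENOBUFS from an earlier step, the
first function returns a valid buffer with the stale error in place, while the
wrapper returns the buffer with the error cleared.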


--- "Destroying alive neighbour" kernel message ---

This issue could be related to dneigh entries that are neither local nor
routers.

The following two changes seem to help (based on comparing various sources):

The check of __refcnt in dn_route.c:dn_dst_check_expire():

                spin_lock(&dn_rt_hash_table[i].lock);
                while ((rt = rcu_dereference_protected(*rtp,
                                lockdep_is_held(&dn_rt_hash_table[i].lock))) != NULL) {
-                       if (atomic_read(&rt->dst.__refcnt) ||
+                       if (atomic_read(&rt->dst.__refcnt) > 1 ||
                                        (now - rt->dst.lastuse) < expire) {
                                rtp = &rt->dst.dn_next;
                                continue;

And the setting of initial_ref in dn_route.c:dn_route_output_slow():

        if (dev_out->flags & IFF_LOOPBACK)
                flags |= RTCF_LOCAL;
 
-       rt = dst_alloc(&dn_dst_ops, dev_out, 0, DST_OBSOLETE_NONE, DST_HOST);
+       rt = dst_alloc(&dn_dst_ops, dev_out, 1, DST_OBSOLETE_NONE, DST_HOST);
        if (rt == NULL)
                goto e_nobufs;
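
One way to read these two changes together: if a route starts life with a
reference count of 1, an idle entry sits at 1 rather than 0, so the expiry
check has to treat only counts above 1 as "in use". The sketch below models
that in userspace. The struct and the function may_expire() are made up, and
whether this is the actual reason the changes help is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of a cached route entry. */
struct entry {
    int  refcnt;    /* references held, including the initial one if any */
    long lastuse;   /* time of last use */
};

/*
 * Returns true when the entry may be dropped from the cache.
 * initial_ref is the count an entry starts with: 0 models the
 * original code, 1 the patched code.
 */
bool may_expire(const struct entry *e, long now, long expire, int initial_ref)
{
    assert(e && initial_ref >= 0);

    if (e->refcnt > initial_ref)      /* someone else still holds it */
        return false;
    if (now - e->lastuse < expire)    /* used too recently */
        return false;
    return true;
}
```

With initial_ref set to 1, an old entry whose count is 1 can expire; checked
against 0 instead, the same entry would be kept forever, which is why the two
edits belong together in this reading.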


--- "WARNING: CPU: 0 PID: nnn at include/net/dst.h:188" kernel message ---

This seems to be benign, and can be avoided by the following two changes in
dn_route.c, in dn_dst_update_pmtu() and dn_rt_set_next_hop() respectively:

-       if (dst_metric(dst, RTAX_MTU) > mtu && mtu >= min_mtu) {
+       if (dst_metric_raw(dst, RTAX_MTU) > mtu && mtu >= min_mtu) {

-       if (dst_metric(&rt->dst, RTAX_MTU) > rt->dst.dev->mtu)
+       if (dst_metric_raw(&rt->dst, RTAX_MTU) > rt->dst.dev->mtu)


--- Kernel segfault with ctermd (login from remote node) ---

dnprogs 2.65 was used for testing. The segfault happens in
ctermd.c:cterm_reset(), where the variable line is used even though it is
never assigned when DNETUSE_DEVPTS is defined. Wrapping the code that uses
line in an #ifndef helps:

#ifndef DNETUSE_DEVPTS
        p=line+sizeof("/dev/")-1;

        setutent();
        memcpy(entry.ut_line,p,strlen(p));
        entry.ut_line[strlen(p)]='\0';
        lentry=getutline(&entry);
        lentry->ut_type=DEAD_PROCESS;

        memset(lentry->ut_line,0,UT_LINESIZE);
        memset(lentry->ut_user,0,UT_NAMESIZE);
        lentry->ut_time=0;
        pututline(lentry);

        (void)chmod(line,0666);
        (void)chown(line,0,0);
        *p='p';
        (void)chmod(line,0666);
        (void)chown(line,0,0);
#endif


--- Using local node with dnprogs crashes the machine ---

This has not been solved.


