[HECnet] DECnet for Linux

Erik Olofsen e.olofsen at xs4all.nl
Fri Jan 26 12:52:09 PST 2018


Thank you all for your support!

So here are some more detailed notes to get DECnet for Linux working with a
recent longterm kernel, 4.9.77. How stable it is remains to be seen...

Changes to the sources of other kernels are likely to be similar. But because of
many API changes, the decnet module sources differ considerably between kernel
versions, so patch files are of limited use. However, only a few small edits
are needed.

The patches below were tested with Slackware 14.1 (which doesn't use systemd),
where it is relatively easy to build a custom kernel. Also, decnet startup
was done by hand rather than using a script:

# ifconfig eth0 mtu 576 # see below
# modprobe decnet
# echo -n area.node > /proc/sys/net/decnet/node_address
# dnetd
# phoned

[Kernel 4.14.15 gives "BUG: unable to handle kernel NULL pointer dereference",
which seems to be related to xfrm_lookup(). A kernel warning about dst.h:256
can be avoided by using dst_use_noref() instead of dst_use() in two places.]


--- MTU ---

It seems the decnet module can generate packets which are one byte larger than
intended. This can be fixed (?) in af_decnet.c:dn_mss_from_pmtu() by adding
one line:

                 */
                mtu -= (21 + DN_MAX_NSP_DATA_HEADER + 16);
        }
+       mtu--; /* probably due to padding */
        if (mtu > mss)
                mss = mtu;
        return mss;
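
To make the arithmetic concrete, here is a minimal user-space sketch of the
branch shown above with the extra decrement applied. It is not the kernel
function: sketch_mss_from_pmtu() is a made-up name, nsp_hdr stands in for
DN_MAX_NSP_DATA_HEADER, and min_mss stands in for the initial value of mss.
Both are passed as parameters because their values are not shown in the
excerpt. The constants 21 and 16 are copied from the excerpt.

```c
/* User-space sketch of the branch of dn_mss_from_pmtu() quoted above.
 * nsp_hdr and min_mss are parameters because the excerpt does not show
 * the values the kernel uses for them. */
int sketch_mss_from_pmtu(int mtu, int nsp_hdr, int min_mss)
{
    int mss = min_mss;

    mtu -= (21 + nsp_hdr + 16);  /* header overhead, as in the excerpt */
    mtu--;                       /* the added line: one byte of slack */
    if (mtu > mss)
        mss = mtu;
    return mss;
}
```

With illustrative values nsp_hdr = 11 and min_mss = 219, a path MTU of 576
gives 576 - 48 - 1 = 527, while a very small path MTU falls back to min_mss.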


The dneigh utility gives

Node                     HWtype  HWaddress           Flags      MTU        Iface
rullfl                   loop    AA:00:04:00:04:70   ---        65533      lo
rullfs                   ether   AA:00:04:00:29:70   -2-        1498       eth0
gorvax                   ether   AA:00:04:00:90:21   -2-        1498       eth0
pdxvax                   ether   AA:00:04:00:98:F2   -2-        1498       eth0
dimma                    ether   AA:00:04:00:0B:EC   -2-        576        eth0
jocke                    ether   AA:00:04:00:15:04   -2-        576        eth0
mim                      ether   AA:00:04:00:0D:04   -2-        1500       eth0
hub                      ether   AA:00:04:00:FE:AB   -2-        576        eth0
bitxoz                   ether   AA:00:04:00:51:1C   1--        1492       eth0
skhngw                   ether   AA:00:04:00:04:38   -2-        576        eth0
a44rtr                   ether   AA:00:04:00:FF:B3   -2-        1498       eth0

where it is interesting to see values of 576, but also a value of 1500 for MIM
(but communication with MIM works well).

MTUs or blksizes are computed in dn_neigh.c, accompanied by the remark:

        /*
         * Make an estimate of the remote block size by assuming that its
         * two less then the device mtu, which it true for ethernet (and
         * other things which support long format headers) since there is
         * an extra length field (of 16 bits) which isn't part of the
         * ethernet headers and which the DECnet specs won't admit is part
         * of the DECnet routing headers either.
         *
         * If we over estimate here its no big deal, the NSP negotiations
         * will prevent us from sending packets which are too large for the
         * remote node to handle. In any case this figure is normally updated
         * by a hello message in most cases.
         */
        dn->blksize = dev->mtu - 2;

The above mtu--; seems sufficient; subtracting an additional two does not seem
necessary.

For HECnet, it seems best to set the MTU to 576, to be able to reach nodes in
different areas. This can be done with ifconfig.

Optionally in sysctl_net_decnet.c a /proc/sys/net/decnet/mtu entry could be added with:

                .maxlen = sizeof(int),
                .mode = 0644,
                .proc_handler = proc_dointvec,
+       },
+       {
+               .procname = "mtu",
+               .data = &decnet_mtu,
+               .maxlen = sizeof(int),
+               .mode = 0644,
+               .proc_handler = proc_dointvec,
        },
        { }
 };

where decnet_mtu needs to be defined in this source file, and declared in
linux/include/net/dn.h as:

extern int decnet_mtu;

which could be used at various places instead of dst_mtu(). [But perhaps the
blksize can be requested from the remote host.]


--- "No buffer space available" message by a dnprogs utility ---

At some point in time, sock_alloc_send_skb() changed, causing a problem in
af_decnet.c:dn_alloc_send_pskb(), which can be fixed by adding the second "+"
line (the first one seems common practice, but may be unnecessary):

        if (skb) {
                skb->protocol = htons(ETH_P_DNA_RT);
                skb->pkt_type = PACKET_OUTGOING;
+               skb->sk = sk;
+               *errcode = 0;
        }
        return skb;
 }
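
One plausible reading of why the *errcode = 0 line matters (an assumption,
not verified against the kernel source) is that the allocator only writes the
error code on failure, while a caller checks the error code before it looks
at the returned buffer. The user-space sketch below models that situation;
all mock_* names are hypothetical and nothing here is kernel code.

```c
#include <errno.h>
#include <stddef.h>

struct mock_buf { int protocol; };

/* Hypothetical allocator: hands back a buffer on success but only
 * writes *errcode on failure, leaving any earlier value in place. */
struct mock_buf *mock_alloc(struct mock_buf *storage, int fail, int *errcode)
{
    if (fail) {
        *errcode = -ENOBUFS;
        return NULL;
    }
    return storage;
}

/* Wrapper in the style of dn_alloc_send_pskb(): when clear_err is
 * nonzero it resets *errcode on success, as the second "+" line does. */
struct mock_buf *mock_alloc_send(struct mock_buf *storage, int fail,
                                 int clear_err, int *errcode)
{
    struct mock_buf *buf = mock_alloc(storage, fail, errcode);

    if (buf) {
        buf->protocol = 1;      /* placeholder for the field setup */
        if (clear_err)
            *errcode = 0;
    }
    return buf;
}

/* Caller that trusts the error code before looking at the buffer. */
int mock_send(int clear_err)
{
    struct mock_buf storage;
    int err = -ENOBUFS;         /* stale value from earlier in the caller */
    struct mock_buf *buf = mock_alloc_send(&storage, 0, clear_err, &err);

    if (err)
        return err;
    return buf ? 0 : -ENOMEM;
}
```

Here mock_send(0) reports -ENOBUFS ("No buffer space available") even though
the allocation succeeded, while mock_send(1) returns 0.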


--- "Destroying alive neighbour" kernel message ---

This issue could be related to dneigh entries that are neither local nor routers.

The following two changes seem to help (based on comparing various sources):

The check of __refcnt in dn_route.c:dn_dst_check_expire():

                spin_lock(&dn_rt_hash_table[i].lock);
                while ((rt = rcu_dereference_protected(*rtp,
                                                lockdep_is_held(&dn_rt_hash_table[i].lock))) != NULL) {
-                       if (atomic_read(&rt->dst.__refcnt) ||
+                       if (atomic_read(&rt->dst.__refcnt) > 1 ||
                                        (now - rt->dst.lastuse) < expire) {
                                rtp = &rt->dst.dn_next;
                                continue;

And the setting of initial_ref in dn_route.c:dn_route_output_slow():

        if (dev_out->flags & IFF_LOOPBACK)
                flags |= RTCF_LOCAL;
 
-       rt = dst_alloc(&dn_dst_ops, dev_out, 0, DST_OBSOLETE_NONE, DST_HOST);
+       rt = dst_alloc(&dn_dst_ops, dev_out, 1, DST_OBSOLETE_NONE, DST_HOST);
        if (rt == NULL)
                goto e_nobufs;
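
The two changes presumably belong together: if a route starts life with one
reference held on its behalf, then a count of 1 means nobody else is using
it, and the expiry test has to compare against 1 rather than 0. A minimal
user-space sketch of that check follows; struct mock_route and
mock_keep_route() are hypothetical names used only for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the fields of a cached route that the
 * expiry check looks at. */
struct mock_route {
    int  refcnt;     /* like dst.__refcnt */
    long lastuse;    /* like dst.lastuse, in ticks */
};

/* True if the expiry pass should keep the entry. own_refs is the number
 * of references held by the cache itself: 0 models the original code,
 * 1 models the patched code with an initial reference of 1. */
bool mock_keep_route(const struct mock_route *rt, long now, long expire,
                     int own_refs)
{
    return rt->refcnt > own_refs || (now - rt->lastuse) < expire;
}
```

With an initial reference of 1 and the original "nonzero" test, an idle route
would never be expired; with the "> 1" test it is dropped once it is old
enough.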


--- "WARNING: CPU: 0 PID: nnn at include/net/dst.h:188" kernel message ---

This seems to be benign, and can be avoided by the following two changes
in dn_route.c, in dn_dst_update_pmtu() and dn_rt_set_next_hop() respectively:

-       if (dst_metric(dst, RTAX_MTU) > mtu && mtu >= min_mtu) {
+       if (dst_metric_raw(dst, RTAX_MTU) > mtu && mtu >= min_mtu) {

-       if (dst_metric(&rt->dst, RTAX_MTU) > rt->dst.dev->mtu)
+       if (dst_metric_raw(&rt->dst, RTAX_MTU) > rt->dst.dev->mtu)
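
Assuming the warning comes from a check inside dst_metric() that complains
when it is asked for the MTU metric, and that dst_metric_raw() reads the same
value without that check, the difference can be modelled in user space as
below. This is only a sketch of the pattern; the mock_* names and metric
indices are hypothetical and do not match the kernel's definitions.

```c
#include <stdio.h>

/* Hypothetical metric indices; the kernel's RTAX_* values are not used. */
enum { MOCK_RTAX_MTU = 0, MOCK_RTAX_WINDOW = 1, MOCK_RTAX_MAX = 2 };

struct mock_dst { unsigned int metrics[MOCK_RTAX_MAX]; };

int mock_warnings;  /* counts how often the checked accessor complained */

/* Raw accessor: just reads the stored value. */
unsigned int mock_metric_raw(const struct mock_dst *dst, int metric)
{
    return dst->metrics[metric];
}

/* Checked accessor: warns when asked for the MTU, then reads the same value. */
unsigned int mock_metric(const struct mock_dst *dst, int metric)
{
    if (metric == MOCK_RTAX_MTU) {
        mock_warnings++;
        fprintf(stderr, "warning: MTU requested through checked accessor\n");
    }
    return mock_metric_raw(dst, metric);
}
```

Both accessors return the same number, which would fit the observation that
the warning is benign; only the checked one produces the message.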


--- Kernel segfault with ctermd (login from remote node) ---

dnprogs 2.65 was used for testing. The segfault happens in
ctermd.c:cterm_reset(), where the variable line is used even though
it is never assigned when DNETUSE_DEVPTS is defined. Wrapping the code
that uses line in an #ifndef helps:

#ifndef DNETUSE_DEVPTS
        p=line+sizeof("/dev/")-1;

        setutent();
        memcpy(entry.ut_line,p,strlen(p));
        entry.ut_line[strlen(p)]='\0';
        lentry=getutline(&entry);
        lentry->ut_type=DEAD_PROCESS;

        memset(lentry->ut_line,0,UT_LINESIZE);
        memset(lentry->ut_user,0,UT_NAMESIZE);
        lentry->ut_time=0;
        pututline(lentry);

        (void)chmod(line,0666);
        (void)chown(line,0,0);
        *p='p';
        (void)chmod(line,0666);
        (void)chown(line,0,0);
#endif


--- Using local node with dnprogs crashes the machine ---

This has not been solved.

