[HECnet] HECnet crawler and INFO.TXT - character for separator

Sampsa Laine sampsa at mac.com
Thu Oct 22 17:23:38 PDT 2009


Ah yes, that old chestnut, scando-letters in 7 bit, had the same issues on BBSes back in the 80s/90s myself (I'm Finnish).

Ok, since the pipe symbol is in reality almost a letter, I suggest we go with #.

The colon (:) might be used in the comments field to point to objects on HECnet (e.g. "..mail me at CHIMPY::SAMPSA"), and people use both it and the semicolon (;) quite frequently in writing.

Shall we suggest changing the separator from | to #?

Sampsa


On 22 Oct 2009, at 10:50, goran ahling wrote:

Hi,

just one humble little detail, hence a direct mail and not to the list... (Sorry, I'm some days behind in reading and commenting on this storm of messages.)

Once upon a time, right about when those computers we are now "playing" with were new, there was US-ASCII, a 7-bit character encoding scheme representing, amongst others, the 26 letters used in English writing. But in other countries, like here in Sweden, we use a different number of letters in our alphabet: Swedish uses 29 letters and Russian uses 33, for example.

So, a "local code" was developed, where a few of the rarely used characters of the US-ASCII scheme were instead used for those extra 6 letters (lower and upper case of 3 letters).

This was specified in the ISO 646 standard, set 1975/12/01, locally named SEN 85 02 00 Annex B.

Here those letters are represented as:

       0x7d     }     å
       0x7b     {     ä
       0x7c     |     ö
       0x5d     ]     Å
       0x5b     [     Ä
       0x5c     \     Ö

So, if I'm writing my given name (Göran) in that 7-bit coding, as we often do on those old systems, I'll write it G|ran. On an old system (it might be an old version of RSX, RT-11, or TOPS-10), I'd have to use that format.
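
The mapping in the table above can be sketched in a few lines of Python. This is only an illustration: the helper names are made up, and the letter assignments follow the Swedish national variant of ISO 646 (} { | for å ä ö, and ] [ \ for Å Ä Ö).

```python
# Swedish national variant of ISO 646 (SEN 85 02 00 Annex B): six ASCII
# code points are reinterpreted as the letters å, ä, ö, Å, Ä, Ö.
SWEDISH_7BIT_TO_UNICODE = {
    0x7D: "å",  # right brace
    0x7B: "ä",  # left brace
    0x7C: "ö",  # pipe
    0x5D: "Å",  # right bracket
    0x5B: "Ä",  # left bracket
    0x5C: "Ö",  # backslash
}


def decode_swedish_7bit(text: str) -> str:
    """Read a 7-bit string as Swedish ISO 646 (hypothetical helper)."""
    return text.translate(SWEDISH_7BIT_TO_UNICODE)


def encode_swedish_7bit(text: str) -> str:
    """Write Swedish letters using the six replaced ASCII code points."""
    reverse = {ord(letter): chr(code)
               for code, letter in SWEDISH_7BIT_TO_UNICODE.items()}
    return text.translate(reverse)


print(decode_swedish_7bit("G|ran ]hling"))  # Göran Åhling
print(decode_swedish_7bit("\\rebro"))       # Örebro
print(encode_swedish_7bit("Göteborg"))      # G|teborg
```

Note that the decoder cannot tell a "real" pipe from an ö: in this coding they are the same code point, which is exactly the problem with using | as a field separator.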

These extra letters are quite common in daily writing and in names, as they are vowels. I guess that the system descriptions you are trying to find a suitable format for will most likely contain the names of the owner/manager/..., and possibly also a city name as the location. There are several cities in Sweden containing these characters (like \rebro, \regrund, \rkelljunga, G|teborg, Malm|, V{nersborg, Lax}, ...), not to mention several popular personal and family names.

Besides, there is an extra "tweak" to this.

If I write these letters in "DEC Multinational", which is almost identical (in this respect) to ISO Latin-1, but strip the 8th bit (not uncommon; try a VT-100 terminal or an equivalent 7-bit "telnet" "off the shelf"!), you would end up with åäöÅÄÖ becoming edvEDV.
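
The effect of losing the 8th bit can be reproduced with a short Python sketch. It assumes ISO Latin-1 as a stand-in for DEC Multinational (the six Swedish letters sit at the same code points in both), and the function name is only illustrative.

```python
def strip_eighth_bit(text: str) -> str:
    """Encode as ISO Latin-1, clear the top bit of each byte, read as ASCII."""
    raw = text.encode("latin-1")
    return bytes(b & 0x7F for b in raw).decode("ascii")


print(strip_eighth_bit("åäöÅÄÖ"))    # edvEDV
print(strip_eighth_bit("Göteborg"))  # Gvteborg
```

For example, å is 0xE5 in Latin-1; clearing the top bit leaves 0x65, which is the ASCII letter e.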

In those old days, the department secretary quite easily managed to sort out post (from Sun Microsystems) sent to Mr. Gveran Eheling (my name is Göran Åhling), even though my department used DEC computers, which had working 8-bit support...

So, to end this e-mail:
Would the : or the ; possibly be a better separator than the |, as I really think these characters are more rarely used than the pipe?
An alternative might be the #.
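
To make the collision concrete, here is a small Python sketch. The three-field record (node, owner, location) and the node name NODEA are invented for illustration; the actual INFO.TXT layout is not shown in this thread.

```python
# Hypothetical three-field record: node name, owner, location.
# Owner and city are written in 7-bit Swedish ISO 646, where "|" is ö.
fields = ["NODEA", "G|ran ]hling", "G|teborg"]

pipe_line = "|".join(fields)
hash_line = "#".join(fields)

# Splitting on the pipe cuts the names apart: five fields instead of three.
print(pipe_line.split("|"))  # ['NODEA', 'G', 'ran ]hling', 'G', 'teborg']

# Splitting on # recovers the original three fields.
print(hash_line.split("#"))  # ['NODEA', 'G|ran ]hling', 'G|teborg']
```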

All my best,

Göran Åhling   .EQ.   G|ran ]hling   .EQ.   Goeran AAhling   .NE.   Goran Ahling
(A misconfigured printer might even print it as Gveran Ehling, but it is still NOT Goran Ahling; that would be another given name and another family name.)


Sampsa Laine wrote:

Why? What would that accomplish that the pipe-separated format doesn't, except make things uglier? The reason I chose the pipe symbol is that this way we don't have to quote strings, as people very seldom use pipe symbols but do use commas all the time...

On 21 Oct 2009, at 02:15, Steve Davidson wrote:

Let's try this again...

Any chance this could be in comma separated format?   Put strings in
quotes if necessary.
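
As a rough sketch of what comma-separated fields with quoting would involve, Python's standard csv module can be used. The record contents below are invented for illustration; the real INFO.TXT fields are not given here.

```python
import csv
import io

# Hypothetical record; two of the fields contain commas.
row = ["NODEA", "VAX, running VMS", "Somewhere, Sweden"]

buffer = io.StringIO()
csv.writer(buffer).writerow(row)   # default quoting: only where needed
line = buffer.getvalue().strip()
print(line)                        # NODEA,"VAX, running VMS","Somewhere, Sweden"

# Reading it back needs a quote-aware parser; a plain split(",") would
# break the quoted fields apart.
parsed = next(csv.reader(io.StringIO(line)))
print(parsed == row)               # True
```

The trade-off is visible here: commas work, but every reader and writer of the file has to handle quoting.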

-Steve

-----Original Message-----
From: owner-hecnet at Update.UU.SE [mailto:owner-hecnet at Update.UU.SE] On
Behalf Of Sampsa Laine
Sent: Tuesday, October 20, 2009 19:43
To: hecnet at Update.UU.SE
Subject: Re: [HECnet] HECnet crawler and INFO.TXT

I've updated CHIMPY in the same format.

Sampsa


On 21 Oct 2009, at 00:26, Bob Armstrong wrote:

I took Sampsa's suggestion and put an INFO.TXT on CODA with a
machine-readable section at the end that contains information about
the local nodes. I already wrote a little DCL script that crawls the
HECnet and collects all the INFO.TXT files that it can find (so far
there are eight, counting mine), and if enough people adopt this
format I'll write another little script to parse out the node
information.

The format is pretty straightforward - you can just type out
CODA::INFO.TXT and see for yourself.

Bob


