[HECnet] [Simh] Announcing PDP-11 Multi-Precision Arithmetic Package V0.9 (Preliminary Version) - FORTRAN 77 Callable

Tue Jun 23 19:53:49 PDT 2015

Jerome,
    I find your discussion of multiword precision math very interesting, although I don't have any applications
that need that sort of thing. The idea of extending the instruction set to do high-speed external math
routines reminds me of the folks using GPUs to do high-speed math operations.

   I recently visited the University of Illinois NCSA facility where the new "Blue Waters" system went into operation late last summer. (See https://bluewaters.ncsa.illinois.edu/hardware-summary)

  It was designed by Cray and has 22,640 Cray XE6 nodes and 4,228 Cray XK7 nodes that include NVIDIA graphics processor acceleration. It can achieve 13+ petaflops, but the power bill is a killer: it draws 24 megawatts, although that includes the water chillers and the 9,000 gallons per minute of cooling water flowing through it.

  It is very neat that Ersatz-11 has that kind of extensibility in it. I would love to play with its multiprocessor capabilities sometime with RSX11M+ like Johnny B. has done.

  Good Luck with your project.

Mark

On Jun 23, 2015, at 8:05 AM, Jerome H. Fine wrote:

> About 10 years ago, I was using an algorithm which required more than
> 15 digits of precision.  I wrote some PDP-11 assembler code which
> could handle unsigned values up to 2^512 (about 1.34 x 10^154) plus
> fractional numbers with 1024 bits that had a precision on the right hand
> side of the decimal point equal to the integer portion - 512 bits for each.
> Actually, there were three levels of precision: 128 bits, 256 bits and the
> maximum at 512 bits.  The FORTRAN 77 integer symbols were LU...,
> MU... and NU... while the corresponding integer / fractional symbols
> were LX..., MX... and NX..., all allocated as CHARACTER *n variables.
> 
> These subroutines are designed to be used under FORTRAN 77, so any
> PDP-11 operating system (such as RT-11 and RSX-11) can easily make
> use of them.  While these routines include ADD, SUBTRACT and
> MULTIPLY, DIVISION is not available, although that is easily remedied
> via a FORTRAN 77 subroutine which arrives at the result via the standard
> approximation algorithm.  Also available are ENCODE and DECODE
> routines to convert between internal binary and external decimal values.
> In addition, there are routines to convert back and forth between all six
> sizes of variables and DOUBLE PRECISION floating point or REAL *8
> variables.
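
The "standard approximation algorithm" is not spelled out above. One possible reading, sketched here in Python purely as an assumption, is to bracket the quotient and narrow the bracket by bisection, which needs only multiply, add, subtract and compare plus a one-bit shift. The name `divide_by_bisection` and the 512-bit default are hypothetical, and a Newton-style reciprocal iteration would be another candidate.

```python
def divide_by_bisection(n, d, bits=512):
    """Unsigned integer division built only from multiply, add, subtract,
    compare and a one-bit right shift (illustrative sketch, not the
    package's actual routine)."""
    if d <= 0:
        raise ValueError("divisor must be positive")
    if not 0 <= n < (1 << bits):
        raise ValueError("dividend out of range")
    # Invariant: lo * d <= n < hi * d, so the quotient lies in [lo, hi).
    lo, hi = 0, 1 << bits
    while hi - lo > 1:
        mid = (lo + hi) >> 1      # midpoint of the current bracket
        if mid * d <= n:          # one multi-precision multiply and compare
            lo = mid
        else:
            hi = mid
    return lo, n - lo * d         # quotient and remainder
```

Each pass halves the bracket, so a 512-bit quotient takes 512 multiplies; a faster-converging iteration would trade that for more bookkeeping.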
> 
> Of late, I realized that a signed variable aspect is required, so I have begun
> to consider what is needed.  ALSO, because I so often run the PDP-11
> code under the Ersatz-11 emulator, I will consider supporting the use of
> six additional PDP-11 instructions (for each ONLY one combination of
> registers will be used as operands - Ersatz-11 supports a DLL):
> UMUL16  -  unsigned multiply of two 16-bit variables
> MUL32   -  signed multiply of two 32-bit variables
> UMUL32  -  unsigned multiply of two 32-bit variables
> UDIV16  -  unsigned divide of a 32-bit variable by a 16-bit variable
> DIV32   -  signed divide of a 64-bit variable by a 32-bit variable
> UDIV32  -  unsigned divide of a 64-bit variable by a 32-bit variable
> the UMUL16 and UMUL32 instructions being especially important to
> perform multi-precision MULTIPLY.  I will also consider the possibility
> of a single PDP-11 instruction to perform multi-precision arithmetic of
> values contained in memory using the ability of the Ersatz-11 emulator
> to LOAD a user written DLL, namely to convert many of the PDP-11
> multi-precision assembler subroutines to a single PDP-11 instruction
> which would then be executed using x86 instructions at a much higher
> speed, sort of like a CIS for multi-precision variables.  In that case,
> much larger sized variables could easily be supported due to the much
> higher speed of execution.  In addition, the (approximately) 16KB
> of subroutine instruction / data memory within the emulated PDP-11
> could be substantially reduced.
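
As a rough illustration of why an unsigned 16 x 16 -> 32 bit multiply is the key primitive for multi-precision MULTIPLY, here is a minimal Python sketch of schoolbook multiplication over 16-bit words. It is not the assembler code described above: `umul16`, `to_limbs`, `from_limbs` and `mp_multiply` are hypothetical names, and the least-significant-word-first ordering is an assumption.

```python
MASK16 = 0xFFFF

def umul16(a, b):
    """Hypothetical stand-in for the proposed UMUL16 instruction:
    two unsigned 16-bit operands in, one unsigned 32-bit product out."""
    assert 0 <= a <= MASK16 and 0 <= b <= MASK16
    return a * b

def to_limbs(value, count):
    """Split a non-negative integer into `count` 16-bit words, low word first."""
    return [(value >> (16 * i)) & MASK16 for i in range(count)]

def from_limbs(limbs):
    """Reassemble an integer from 16-bit words, low word first."""
    return sum(limb << (16 * i) for i, limb in enumerate(limbs))

def mp_multiply(x, y):
    """Schoolbook product of two word arrays; result has len(x) + len(y) words."""
    result = [0] * (len(x) + len(y))
    for i, xi in enumerate(x):
        carry = 0
        for j, yj in enumerate(y):
            # Partial product plus existing word plus carry never exceeds
            # 2^32 - 1, so it fits the 32-bit result of one UMUL16.
            t = umul16(xi, yj) + result[i + j] + carry
            result[i + j] = t & MASK16
            carry = t >> 16
        result[i + len(y)] = carry   # this word is still untouched at row i
    return result
```

For any a and b below 2^512, `from_limbs(mp_multiply(to_limbs(a, 32), to_limbs(b, 32)))` equals `a * b`. With a signed-only multiply, each inner step would need extra correction work, which is presumably why the unsigned form matters so much here.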
> 
> If there is sufficient interest and support, complete algorithms might be
> implemented which could directly make use of the x86's huge GB
> memory to solve particular problems - sort of like a SLAR auxiliary
> processor CPU (which for example performed an FFT on a KB
> sized array in virtual memory) implemented in software rather than
> hardware.
> 
> I hope that some interest is expressed.  Commercial inquiries for a
> specific algorithm would obviously receive priority, but hobby users
> are expressly encouraged to express all of their needs as well.
> 
> Jerome Fine