DATA FORMATS IN OLD
48- AND 36-BITS COMPUTERS

by:   J.H.M. Bonten


First publication date of 'Burroughs': 05 october 2006
First publication date of 'Univac+IBM': 19 september 2007
First publication date of 'sixbit char': 19 september 2007
First publication date of 'Digital': 25 september 2007
Update of these texts: 26 september 2007
Last update viz.: 8 march 2009
     Addition of U-418 double
     New introductory text added
     Full revision of B-6700 text

Machines:

   Burroughs 6700, 7700, 7900, and the descendant Unisys A-series.
   Univac-1100, Univac-418 and IBM-7094.
   Digital PDP-10, Decsystem-10 and -20.

CONTENTS

==>>Back to main index of numeric formats


SINGLE-WORD MEMORY
WITH SECONDARY ADDRESS

Double-address computers

As written elsewhere on this internet site, in the early 1960s IBM began to make computers that combined the ordination task with the calculation task. Several other companies did so too. They made machines with long RAMemory words. Each such word was split into a few adjacent short words. Thus the memory word acted as a 'mini-memory' for these few short words.

Besides the ordinary addressing mechanism for finding the words in the memory the computers got a second addressing mechanism for finding the short words in one long word. So to access a short word two addresses together had to be given, the memory address and the 'mini-address'. Often there was not an overall-address system to point at the short word directly, like the IBM-360 has. (In fact the same still holds in the present-day computers for accessing the individual bits in a memory word.)

Generally each company applied its own word formats and access systems in its 'double-address' computers. Also the numeric structures were different. Often the computers got a memory word size of 36 bits. This size enables the integer numbers to store the value of ten digits and sometimes even eleven digits.

Sixbit characters

Also the whole word can contain six characters of six bits without wasting any bit. Sixbit characters were very popular since they were simple to implement. With such a set a fully adequate and fairly readable text in a simple layout can be written with the smallest possible number of bits. Bits were very expensive at that time.

Alas, every computer company designed its own set of sixbit characters. The US military designed its own set, called Fieldata (also written Field-data). But it also made slightly different versions of this Fieldata. All this resulted in the circulation of a lot of sets.

Every set embraces the 26 capital letters, the ten digits, some interpunctuation marks and some instructions for controlling the peripheral devices. But for every set another group of interpunctuation marks has been selected. Also the control characters differ, and so do the bit patterns of the letters and digits. For example, Univac gave the bit pattern 000110 to the letter A, whilst the British ICL assigned it the pattern 100001.  This document lists a few of the sets and patterns.

A computer word of 36 bits has a disadvantage. It cannot store characters from the more versatile 8-bits set without wasting any bit. Therefore Burroughs applied words of 48 bits. Therein six eight-bits characters fit or eight six-bits characters.
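The packing argument can be checked with a few lines of Python. This is only an illustrative sketch; the function name 'chars_per_word' is introduced here and does not come from any manual.

```python
def chars_per_word(word_bits, char_bits):
    """Return (characters that fit, bits left unused) for one memory word."""
    return divmod(word_bits, char_bits)


# Compare the two word sizes discussed above for 6-bit and 8-bit characters.
for word_bits in (36, 48):
    for char_bits in (6, 8):
        count, wasted = chars_per_word(word_bits, char_bits)
        print(f"{word_bits}-bit word, {char_bits}-bit chars: "
              f"{count} chars, {wasted} bits wasted")
```

Only the 48-bit word leaves no bit unused for both character sizes.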

Word formats

This document describes the word formats in some incompatible machine series, the Burroughs-6700, the Univac-1100 and the Digital PDP-10, and their successors. Also it lists the numeric formats of an IBM 'double-address' computer, the 7094, which remarkably are compatible with those on the Univac-1100.  In 1986 the two companies Burroughs and Univac merged into Unisys. The sixbit characters applied by its machines are listed.

The DEC PDP-10 does not operate with predefined mini-words. The user can easily address every row of adjacent bits in a memory word. Therefore only its numeric formats are described. These are compared with those in the Univac-1100 and with those of another Digital machine pedigree, the PDP-11/VAX/Alpha.

Error handling

Remarkably none of the machines stores a special value arising from an arithmetic operation into the word for the resulting number. Such a special value, like Infinity or Undefined, is transferred only to the Processor Status Register. The word for the result gets a legal number, which in fact is a fake. The user's program must take immediate action, otherwise the error will creep invisibly through the results of the following calculations. The simplest action is an immediate stop.


BURROUGHS B6700, B7700, B7900
AND UNISYS A-SERIES

Data and Strings

Burroughs distinguishes between 'Data' and 'Strings'. The Data (what a confusing word!) are the kinds of information for which the RAMemory words are used as whole words only. Generally they serve the arithmetic applications, so the numeric data belong to this group. Strings are the kinds of information for which the short words are addressed too. They serve text processing and data-base management.

The Data items occupy one or two whole '48-bits' memory words. They are accessed by addressing the first whole word only. When present, the second word is fetched automatically via the tags in front of the words. The bits in such a tag are numbered 48, 49 and 50.  The value of the tag of a single word is octal 0.  The value of the tag of each word in a double word is octal 2.

The total RAMemory word size is 52 bits. Bit 51 is the check bit for odd parity. It is not accessible by any user.

A String occupies one word and consists of a series of word parts. These parts can be addressed separately, i.e. extracted as a kind of mini-word out of the whole word. Double words are not applied for the Strings. A memory word can be subdivided into mini-words of one out of three different types: 4-bits digits (nibbles, nybbles), 6-bits characters and 8-bits bytes. The value of the tag is octal 0, irrespective of the selected type of mini-words.

Access to a mini-word can be performed via the double-address mechanism, but also via the so-called POINTERs, which constitute a kind of single-address mechanism.

Full-word binary numbers

Every number is stored as an exponent and a mantissa. Both the mantissa and the exponent are written as 'binary' integers in sign-and-magnitude notation. The exponent is also treated as an integer value. Note that it is not written in excess-bias notation. Both the exponent and the mantissa have a +0 and a -0 value, and the number has two +/- signs. The value of the number is zero when all bits in the mantissa are zero, irrespective of the values of the exponent and the two signs.

The size of the data-part in a memory-word is 48 bits. For the calculations this word is subdivided into sixteen triplets of 3 bits. These octal digits are called 'octades'. Consequently the number parts are written in octal code and the base of the exponent is 8.

           Number in single 48-bits computer word
  47 46 45 44     39 38       <- bit index ->            0
   \  | |  /       | |                                   |
   +-+-+-+----------+-------------------------------------+
   | |+|-| exponent |              mantissa               |
   | |-|+|  6 bits  |              39 bits                |
   +-+-+-+----------+-------------------------------------Q
   /  |  \                                               /
  |   |   exponent sign                     binary period
  |   coefficient sign                        (virtual)
  unused

A single-precision number consists of one word. Its first octade contains the signs of both the mantissa and the exponent. The highest bit in it (nr. 47) is not used at all. (This bit is used only in the second part of a double-precision number and for the storage of text characters.)  The binary period of the mantissa is located at the right of the mantissa, so the mantissa is an integer value.
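The single-word layout drawn above can be turned into a small decoder. The following Python sketch only follows the bit positions given in this document; the function name 'decode_b6700_single' is hypothetical, and the tag and parity bits (48 to 51) are assumed to have been stripped already.

```python
from fractions import Fraction


def decode_b6700_single(word):
    """Decode a 48-bit data word into an exact rational value."""
    mantissa_negative = (word >> 46) & 1     # bit 46: sign of mantissa
    exponent_negative = (word >> 45) & 1     # bit 45: sign of exponent
    exponent = (word >> 39) & 0o77           # bits 44..39: exponent magnitude
    mantissa = word & ((1 << 39) - 1)        # bits 38..0: integer mantissa
    if exponent_negative:
        exponent = -exponent
    # The base of the exponent is 8 because the mantissa is counted in octades.
    value = Fraction(mantissa) * Fraction(8) ** exponent
    return -value if mantissa_negative else value


print(decode_b6700_single(5))                               # plain integer
print(decode_b6700_single((1 << 45) | (1 << 39) | 4))       # 4 * 8^(-1)
```

Exact fractions are used so that negative exponents do not introduce the rounding of the host machine's own floating-point format.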

A double-precision number consists of two memory words. The second word contains the extensions of both the exponent and the mantissa. In the evaluation the exponent extension is placed BEFORE the single-precision exponent, thus making the exponent a big-size integer value. The mantissa extension is placed AFTER the single-precision mantissa. The binary period in the mantissa is located on the border between this extension and the mantissa in the first word. Thus the extension is an added fractional part.

         First word in double 96-bits computer word
  47 46 45 44     39 38       <- bit index ->            0
   \  | |  /       | |                                   |
   +-+-+-+----------+-------------------------------------+
   | |+|-| LSP-expon|   integral part of mantissa (MSP)   |
   | |-|+|  6 bits  |              39 bits                |
   +-+-+-+----------+-------------------------------------Q
   /  |  \                                               /
  |   |   exponent sign                     binary period
  |   coefficient sign                        (virtual)
  unused

         Second word in double 96-bits computer word
    47            39 38       <- bit index ->            0
    |              | |                                   |
   +----------------+-------------------------------------+
   |MSP of exponent |  fractional part of mantissa (LSP)  |
   |    9 bits      |              39 bits                |
   +----------------+-------------------------------------+
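The way the two words combine can be sketched in Python as follows. The sketch assumes exactly the layout of the two drawings above: the 9-bit exponent extension is placed before the 6-bit exponent, and the second mantissa is a fraction of 13 octades behind the binary period. The function name 'decode_b6700_double' is hypothetical.

```python
from fractions import Fraction

MANTISSA_MASK = (1 << 39) - 1   # bits 38..0 of either word


def decode_b6700_double(first, second):
    """Decode a double-precision number from its two 48-bit data words."""
    mantissa_negative = (first >> 46) & 1
    exponent_negative = (first >> 45) & 1
    exponent_low = (first >> 39) & 0o77       # 6 least significant exponent bits
    exponent_high = (second >> 39) & 0o777    # 9 most significant exponent bits
    exponent = (exponent_high << 6) | exponent_low
    if exponent_negative:
        exponent = -exponent
    # Integral part from the first word, fractional part from the second word.
    mantissa = (Fraction(first & MANTISSA_MASK)
                + Fraction(second & MANTISSA_MASK, 8 ** 13))
    value = mantissa * Fraction(8) ** exponent
    return -value if mantissa_negative else value


print(decode_b6700_double(1, 4 << 36))   # integral part 1, fraction octal 0.4
```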

A number can be represented in two ways: with varying exponent value and with fixed exponent value. When the exponent can vary the mantissa is left-normalized, so its first octal digit is never 0, except when the value of the whole number is zero. This is called the floating-point number.

When the exponent of a double-precision float is in the range [-63,+63] this float can be truncated or rounded into a single-precision float. In case of rounding a carry may occur which is added to the mantissa of the single-precision float. When both the mantissa and the exponent are already at their maximum values this carry will lead to an overflow. In this peculiar case the carry is omitted to avoid the overflow. This 'bug' seems to be intended by the manufacturer.

When the exponent is fixed it is obligatorily zero or 13 and the mantissa is unnormalized. This is called the integer number. A number is integerized whenever possible. Thus the distinction between integer and floating-point is not very clear. Since the float is notated as a sign and absolute value the integer is so too, i.e. as a sign and magnitude. It is not notated in the one's- or two's-complement way.

The single-word integer has the exponent set to zero and can contain a value of over 11 digits. To store long numbers of up to 23 digits a 'double integer' is applied. This is a double-precision float with the exponent value fixed at +13.

The specifier INTEGER in Algol or Fortran means: integer notation obliged, float notation forbidden. Then the number is moulded and rounded to fit this notation. The specifier REAL (= old word for Float) does not forbid the integer notation.

Decimal computation is absent

The machine is designed both for scientific and for banking applications. It can easily store decimally written numbers into Strings made up of 4-bits nibbles that are filled with binarily coded decimal (BCD) characters. Perhaps to stress this fact the layout of a computer word is given as a set of units of four bits, both in the drawings in the manuals and on the lamp display on the operator's console of the older machines.

Remarkably, nevertheless, the machine is not able to calculate decimally. It lacks a processor for decimal arithmetic. Every number that is written in decimals must always be converted into a binarily (octally) written number before it is used in a calculation. The result of the calculation must be converted back into the decimal notation before it is stored into the Strings. Thus the calculations are always performed in binary. This strategy is the same as applied by Konrad Zuse in his Z1 (which is described elsewhere on this internet site). Burroughs claims that this clumsy way does not consume more time than would be needed by a separate processor for decimal arithmetic.

Therefore even in Cobol all arithmetic calculations are performed in the binary (octal) way. To compensate this 'shortcoming' the machine has hardware for fast multiplying and dividing by ten and thus easily shift the decimal point to the left or right.

Comments to the tables

The tables below show the structure of the numeric words, the actual meaning of their bits, the maximum and minimum non-zero value they can store and the accuracy of this storage.

Generally in mathematical calculations this accuracy of the computer is very important. It is the number of digits that can be represented reliably by the machine. When the number is used as a float it is defined by the guaranteed relative accuracy which is calculated when the normalized mantissa is at its smallest (= octal 1000...).  When the number is used as an integer it is defined by the maximum value that it can store, which is calculated when the mantissa is at its largest (= octal 7777...).  This value of the mantissa is nearly 8 times bigger than the smallest one.

The guaranteed relative accuracy is 10_log (minimum_normalized_mantissa).
The maximum number of digits is 10_log (maximum_mantissa+1).
Since the ratio between both mantissa values is nearly 8, both values differ 10_log(8) = 0.9 for numbers of the same length.
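The two logarithm formulas can be evaluated directly. The Python sketch below uses hypothetical helper names and applies the 'always rounding down' protocol of this document to one decimal place.

```python
import math


def round_down(value, places=1):
    """Round down to the given number of decimal places."""
    factor = 10 ** places
    return math.floor(value * factor) / factor


def guaranteed_digits(octades):
    """10_log of the smallest normalized mantissa (octal 1000...)."""
    return math.log10(8 ** (octades - 1))


def maximum_digits(octades):
    """10_log of the largest mantissa plus one (octal 7777... + 1)."""
    return math.log10(8 ** octades)


# 13 octades for the single-word mantissa, 26 for the double-word mantissa.
for name, octades in (("single", 13), ("double", 26)):
    print(name,
          round_down(guaranteed_digits(octades)),
          round_down(maximum_digits(octades)))
```

The difference between the two results is always 10_log(8), whatever the length of the mantissa.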

Numeric structures

FLOAT

The exponent varies and the mantissa is normalized.

octade 16:
    bit 47 = unused
    bit 46 = sign of mantissa (0='+', 1='-')
    bit 45 = sign of exponent (0='+', 1='-')
octade 15 + 14:
    bit 44 to 39 = exponent
octade 13 to 1
    bit 38 to 0  = mantissa

Range of absolute value of non-zero mantissa is from 8^12 until 8^13-1.    This is approx. from 6.8E10 until 5.5E11.
Value range of exponent is from -63 up to +63.
Range of absolute value of float is 8^12*8^(-63) until (8^13-1)*8^63.    This is approx. 8.8E-47 until 4.3E+68.
The guaranteed relative accuracy is 8^-(13-1) = 8^(-12).
So the relative accuracy is at least 10.8 decimal digits.

DOUBLE PRECISION FLOAT

The exponent varies and the mantissa is normalized.

First word:
    same as single precision float

Second word:
 octade 16 to 14:
    bit 47 to 39 = exponent extension
 octade 13 to 1:
    bit 38 to 0  = mantissa extension (the added fraction)

Because of the fractional part the mantissa becomes more accurate. But its value range hardly changes. So its absolute value will stay from 6.8E10 until 5.5E11.
The exponent has the range from -32767 until +32767.
So the range of the float is from 8^(-32755) until nearly 8^32780.    This is approximately from 1.9E-29581 until 1.9E+29603
The guaranteed relative accuracy is 8^-(26-1) = 8^(-25).
So the relative accuracy is at least 22.5 decimal digits.

INTEGER

The exponent is fixed and the mantissa is not normalized.

octade 16:
    bit 47 = unused
    bit 46 = sign of mantissa (0='+', 1='-')
    bit 45 = 0
octade 15 + 14:
    bit 44 to 39 = 0
octade 13 to 1:
    bit 38 to 0 = mantissa

Range of absolute value of non-zero mantissa is from 8^13-1 until 1.    This is from approx. 5.5E11 until 1.
Maximum integer value is 5.5E11.    This is 11.7 decimal digits.

DOUBLE INTEGER

The exponent is fixed and the mantissa is not normalized.

First word
 octade 16:
    bit 47 = unused
    bit 46 = sign of mantissa (0='+', 1='-')
    bit 45 = 0
 octade 15 + 14:
    bit 44 to 39 = '001101' (value in decimal = 13)
 octade 13 to 1
    bit 38 to 0  = first part of mantissa

Second word:
 octade 16 to 14:
    bit 47 to 39 = 0
 octade 13 to 1:
    bit 38 to 0  = second part of mantissa

Range of absolute value of non-zero mantissa is from 8^13-1 until 8^(-13).    This is from approx. 5.5E11 until 1.8E-12.
Maximum integer value is 8^13*(8^13-1).    This is nearly 8^26, which is approximately 3.0E+23.    This is 23.4 decimal digits.

Arithmetic data in listing

Machines: Burroughs-6700, 7700, 7900, Unisys-A-series

Binary bits

                 #octades           binary non-zero abs.value
             total expon mantis    minimum     maximum(nearly)
single float    16   2   13      8^(-63+12)      8^(63+13)
double float    32   5   26      8^(-32767+12)   8^(32767+13)
single integer  16   0   13         1            8^13
double integer  32  (0)  26      8^(13-13)=1     8^(13+13)=8^26

Decimal values

                    decimal non-zero abs.value   float-accur.
                     minimum        maximum     or int-#digits
single float       8.76E-47        4.31E+68          10.8
double float       1.94E-29581     1.94E+29603       22.5
single integer          1          5.49E+11          11.7
double integer          1          3.02E+23          23.4

The rounding protocol for the decimal figures is:
-- for minimum value: always rounding up
-- for maximum value: always rounding down
-- for accuracy and #digits: always rounding down
(All rounding protocols are described elsewhere on this internet site. The same holds for the concept of 'guaranteed accuracy'.)

Non-numeric information types

The machine also uses several non-numeric information types. These are the type BOOLEAN (= LOGICAL) and several text types.

Since the logical algebra is seen as a kind of arithmetics Burroughs has decided to let the type Boolean be a Data type too. Each single logical value is handled as data and stored in a whole memory word. Consequently only one bit in the word is actually used to contain the logical information. The other bits are discarded all. This remarkable construction makes this type to be an extremely great waste of costly memory space.

BOOLEAN:
  bits 47 to 1: unused
  bit 0:   Logical value:  1 = true,  0 = false

DOUBLE BOOLEAN (!!):
  first word:
     same as single-length Boolean
  second word:
     bits 47 to 0: all bits are unused.

With the Strings the machine operates very economically. No bit is wasted at all. Six, eight or twelve word parts are stored side by side in one 48-bits memory word. Generally they are used for storing text characters and for storing the 8-bits codes (the 'syllables') that instruct the central processor. Pointer arithmetic is available for easy handling of the strings.

In a double word the tag bits 50, 49 and 48 in each of both words have the value 010.  In a single word, both for Data and for Strings, the tag bits have the value 000.  Bit 51 is the check bit for odd parity. It is not accessible by the user. The total RAMemory word size is 52 bits.

The Data/String-descriptor control-word (with tag 101) addresses the desired word(s) in the core memory and shows the way of use, as data or string, and the size of the information elements. The bits 42, 41 and 40 indicate this way and size.

The machine applies four different character sets, none of which is the 8-bits ASCII.  They are:

CHARACTERS:
  EBCDIC = Extended Binarily Coded Decimal Interchange Code:
           6 chars of 8 bits in one memory word
  BCL = Burroughs Coded Language:
           8 chars of 6 bits in one memory word
           The 'Internal' BCL and 'External' BCL differ much.
  BCD = Binarily Coded Decimal:
          12 chars of 4 bits in one memory word

Besides the subdivision in the small string-words the machine also enables the easy selection of every bit or arbitrary contiguous group of bits in a single RAMemory word.
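Both kinds of access can be imitated in Python with shifts and masks. This is a minimal sketch with hypothetical function names. It assumes that the parts are numbered from the left (most significant) end of the word, which this document does not state explicitly.

```python
def split_word(word, part_bits, word_bits=48):
    """Split a word into equal-width parts, leftmost part first."""
    if word_bits % part_bits:
        raise ValueError("part width must divide the word width")
    count = word_bits // part_bits
    mask = (1 << part_bits) - 1
    return [(word >> (part_bits * (count - 1 - i))) & mask
            for i in range(count)]


def extract_field(word, high_bit, low_bit):
    """Return the contiguous bits high_bit..low_bit (bit 0 is the rightmost)."""
    width = high_bit - low_bit + 1
    return (word >> low_bit) & ((1 << width) - 1)


word = 0o0102030405060710            # sixteen octades = 48 bits
print(split_word(word, 6))           # eight 6-bit characters
print(len(split_word(word, 8)))      # six 8-bit bytes
print(len(split_word(word, 4)))      # twelve 4-bit digits
print(extract_field(word, 47, 42))   # the leftmost six bits
```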

Links

Burroughs reference manuals:
   B6700 Information Processing Systems ref.man., 1972
   B6700/7700 Algol Language ref.man., june 1974
Unisys support reference manual:
   e-@action Clear Path Enterprise servers NX6820 and NX6830, Level Delta Architecture, june 2001


UNIVAC-418+1100 and IBM-7094

Introduction

To a high degree the three computers Univac-1100, Univac-418 and IBM-7094 apply the same numeric structures. But in their manuals the indexing of the bits differs. In this document the bit-numbering of Univac is applied, not that of IBM. Therefore in all machines the rightmost bit, the least significant bit, is indicated here as bit 0.

The Univac-418 has a word size of 18 bits. Two of them make a whole number. The other two computers have a word size of 36 bits. This enables the integer numbers to contain ten digits and sometimes even eleven digits. Also a whole word can contain six characters of six bits without wasting any bit. These Fieldata-characters are described elsewhere in this document.

The Univac-1100 machine can handle ASCII-characters too. Then for each character a space of nine bits is allocated, thus wasting one or two bits. Four of these characters fit in one word. (Burroughs has diminished this waste by applying words of 48 bits. Therein six eight-bits characters fit or eight six-bits characters.)

In all machines the floating-point and the double-precision numbers are notated as a signed coefficient plus a biased exponent. There is no hidden bit and there are no special values. For both numbers the numeric values are calculated by the formula:

   Value = Coefficient * 2 ^ (ExponentInteger - ExponentBias)

The value of the coefficient is always between 0 and 1, so the coefficient is always a fraction. It has no integral part. The value of the whole number is zero when the value of the coefficient equals zero. The non-zero coefficient must be normalized, so its first bit is always 1. Thus its smallest value is 0.5.

A few statements about the negative integer numbers in the NASA documents are doubtful. First, the IBM-7094 does not apply the two-complement notation. It applies sign+magnitude. Secondly, the U-418 might not apply the sign+magnitude notation. Bit inversion is more likely.

The rounding protocol for the decimal figures in the lists is:
- for minimum value: always rounding up
- for maximum value: always rounding down
- for accuracy and #digits: always rounding down
(All rounding protocols are described elsewhere on this internet site. The same holds for the concept of 'guaranteed accuracy'.)

Numeric structures
in U-1100 and IBM-7094

FLOAT

bit 35      = sign bit  (0='+', 1='-')
bit 34 - 27 = exponent
bit 26 - 0  = coefficient

exponent integer = 0 - 255
exponent bias = 128
0 =< coefficient = fraction < 1.0
value range when normalized = 1.47E-39 = 2^-129 <-> 1.70E+38 = 2^+127
guaranteed accuracy = 26 bits = 7.8 digits

The negative floats are in the sign+magnitude notation.
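The single-word float can be decoded with the formula Value = Coefficient * 2 ^ (ExponentInteger - ExponentBias). The Python sketch below follows only the bit layout and the sign+magnitude notation as they are listed in this document; the function name 'decode_float36' is hypothetical.

```python
from fractions import Fraction

EXPONENT_BIAS = 128
COEFFICIENT_BITS = 27


def decode_float36(word):
    """Decode a 36-bit float: sign, 8-bit biased exponent, 27-bit fraction."""
    negative = (word >> 35) & 1                      # bit 35: sign
    exponent = (word >> 27) & 0xFF                   # bits 34..27: exponent
    raw = word & ((1 << COEFFICIENT_BITS) - 1)       # bits 26..0: coefficient
    coefficient = Fraction(raw, 1 << COEFFICIENT_BITS)   # always below 1.0
    value = coefficient * Fraction(2) ** (exponent - EXPONENT_BIAS)
    return -value if negative else value


one = (129 << 27) | (1 << 26)        # coefficient 0.5, exponent 129 - 128 = 1
print(decode_float36(one))
print(decode_float36(1 << 26) == Fraction(1, 2 ** 129))   # smallest normalized
```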

DOUBLE PRECISION FLOAT

first word:
  bit 35      = sign bit  (0='+', 1='-')
  bit 34 - 24 = exponent
  bit 23 - 0  = coefficient (MSP)
second word:
  bit 35 - 0  = coefficient (LSP)

exponent integer = 0 - 2047
exponent bias = 1024
0 =< coefficient = fraction < 1.0
value range when normalized = 2.79E-309 = 2^-1025 <-> 8.98E+307 = 2^+1023
guaranteed accuracy = 59 bits = 17.7 digits

The negative double-floats are in the sign+magnitude notation.

FULL-WORD INTEGER

Integers are in the sign and mantissa notation:
  bit 35      = sign bit  (0='+', 1='-')
  bit 34 - 0  = mantissa: 0 =< abs(value) =< 2^35 -1
                                   = 34,359,738,367
                                   =(approx.)= 3.43E+10

maximum number of digits = 10.5

In the Univac negative integers are in one-complement notation. In the IBM-7094 these integers are in sign+magnitude format.
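The difference between the two notations for negative integers can be shown with two small Python functions. The names are hypothetical. The default width is 36 bits, and the same functions serve the half-word (18 bits) and third-word (12 bits) integers.

```python
def decode_sign_magnitude(word, bits=36):
    """Top bit is the sign; the remaining bits hold the absolute value."""
    magnitude = word & ((1 << (bits - 1)) - 1)
    return -magnitude if (word >> (bits - 1)) & 1 else magnitude


def decode_ones_complement(word, bits=36):
    """A negative number is the bitwise inversion of its positive value."""
    mask = (1 << bits) - 1
    if (word >> (bits - 1)) & 1:
        return -((~word) & mask)
    return word


pattern = (1 << 35) | 5
print(decode_sign_magnitude(pattern))                  # -5
print(decode_ones_complement(((1 << 36) - 1) ^ 5))     # also -5
print(decode_ones_complement(pattern))                 # same bits, other value
```

Both notations have a positive and a negative zero. In Python these two collapse into the single integer 0.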

HALF-WORD INTEGER

  bit 17      = sign bit  (0='+', 1='-')
  bit 16 - 0  = mantissa: 0 =< abs(value) =< 2^17 -1
                                   = 131,071
                                   =(approx.)= 1.31E+5

maximum number of digits = 5.1

In Univac only; negatives are in the one-complement notation.

THIRD-WORD INTEGER

  bit 11      = sign bit  (0='+', 1='-')
  bit 10 - 0  = mantissa: 0 =< abs(value) =< 2^11 -1 = 2,047
 

maximum number of digits = 3.3

In Univac only; negatives are in the one-complement notation.

BOOLEAN

in Univac-1100:
  bit 35 - 1  = unused
  bit 0       = logical value:  0 = false, 1 = true

in IBM-7094 and in Univac-418
  unknown

Numeric structures
in U-418

FLOAT

first word:
  bit 17      = sign bit  (0='+', 1='-')
  bit 16 - 9  = exponent
  bit  8 - 0  = coefficient (MSP)
second word:
  bit 17 - 0  = coefficient (LSP)

The properties and the value ranges of the coefficient and the exponent equal those of the U-1100 and IBM-7094. However the negative numbers are notated differently. They are in the one-complement notation: all bits are inverted, even those of the exponent.
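A decoder for this two-word float could look like the Python sketch below. It assumes, as stated above, that a negative number has all 36 bits inverted, including those of the exponent. The function name 'decode_u418_float' is hypothetical.

```python
from fractions import Fraction

HALF_MASK = 0o777777      # one 18-bit word
FULL_MASK = (1 << 36) - 1


def decode_u418_float(first, second):
    """Decode a float stored in two 18-bit words, one-complement negatives."""
    word = ((first & HALF_MASK) << 18) | (second & HALF_MASK)
    negative = (word >> 35) & 1
    if negative:
        word ^= FULL_MASK             # undo the inversion of all 36 bits
    exponent = (word >> 27) & 0xFF    # 8 bits, bias 128
    coefficient = Fraction(word & ((1 << 27) - 1), 1 << 27)
    value = coefficient * Fraction(2) ** (exponent - 128)
    return -value if negative else value


plus_one = (129 << 27) | (1 << 26)
first, second = plus_one >> 18, plus_one & HALF_MASK
print(decode_u418_float(first, second))
print(decode_u418_float(first ^ HALF_MASK, second ^ HALF_MASK))
```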

DOUBLE PRECISION FLOAT

first word:
  bit 17 - 15 = unused
  bit 14 - 0  = exponent
second word:
  bit 17      = sign bit  (0='+', 1='-')
  bit 16 - 0  = coefficient (MSP)
third word:
  bit 17 - 0  = coefficient (LSP)

exponent integer = 0 - 32767
exponent bias = 16384
0 =< coefficient = fraction < 1.0
value range when normalized = 4.21E-4933 = 2^-16385 <-> 5.94E+4931 = 2^+16383
guaranteed accuracy = 34 bits = 10.2 digits

The double-precision number has less than double precision since it consists of three computer words, not four. When it is negative the second and third word are one-complemented. The first word stays in the positive notation.
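The three-word format can be sketched in Python in the same manner. The sketch relies only on the description above: the exponent word is never inverted, and the sign bit plus the 35 coefficient bits are one-complemented for a negative number. The function name 'decode_u418_double' is hypothetical.

```python
from fractions import Fraction

HALF_MASK = 0o777777      # one 18-bit word


def decode_u418_double(first, second, third):
    """Decode the three-word U-418 double-precision float."""
    exponent = first & 0o77777                   # bits 14..0, bias 16384
    body = ((second & HALF_MASK) << 18) | (third & HALF_MASK)
    negative = (body >> 35) & 1                  # bit 17 of the second word
    if negative:
        body ^= (1 << 36) - 1                    # invert second and third word
    coefficient = Fraction(body & ((1 << 35) - 1), 1 << 35)
    value = coefficient * Fraction(2) ** (exponent - 16384)
    return -value if negative else value


# Coefficient 0.5 with exponent 16385 - 16384 = 1 gives the value 1.
print(decode_u418_double(16385, 1 << 16, 0))
print(decode_u418_double(16385, HALF_MASK ^ (1 << 16), HALF_MASK))
```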

INTEGER

bit 17      = sign bit  (0='+', 1='-')
bit 16 - 0  = mantissa

This number equals exactly the half-word integer in the U-1100.  Negative numbers are in the one-complement notation.

DOUBLE INTEGER

first word:
  bit 17      = sign bit  (0='+', 1='-')
  bit 16 - 0  = mantissa (MSP)
second word:
  bit 17 - 0  = mantissa (LSP)

This number equals exactly the full-word integer in the U-1100.  Negative numbers are in the one-complement notation. The only applicable operations are addition and subtraction.

Arithmetic data in listing

Machines: Univac-1100, IBM-7094, Univac-418
  None of the machines does apply all formats.

Binary bits

                   #bits           binary normalized abs.value
             total expon coeff     minimum     maximum(nearly)
single float   36    8    27      2^(-1-128)        2^+127
double float   72   11    60      2^(-1-1024)       2^+1023
double U-418   54   15    35      2^(-1-16384)      2^+16383
integer        36    -    35    min.non-zero: 1     2^+35
half-integer   18    -    17    min.non-zero: 1     2^+17
third-integer  12    -    11    min.non-zero: 1     2^+11

Decimal values

                  decimal normalized abs.value   float-accur.
                    minimum         maximum     or int-#digits
single float       1.47E-39        1.70E+38           7.8
double float       2.79E-309       8.98E+307         17.7
double U-418       4.21E-4933      5.94E+4931        10.2
integer         min.non-zero: 1    3.43E+10          10.5
half-integer    min.non-zero: 1    1.31E+5            5.1
third-integer   min.non-zero: 1      2047             3.3

The rounding protocol for the decimal figures is:
-- for minimum value: always rounding up
-- for maximum value: always rounding down
-- for accuracy and #digits: always rounding down
(All rounding protocols are described elsewhere on this internet site. The same holds for the concept of 'guaranteed accuracy'.)

Links

Univac reference manual:
   Fortran-V on Univac-1100 series with Exec-8,   dated: before 1979;  more data are unknown
Book:
   R. Clay Sprowls: "Computers, a programming problem approach",
           Harper&Row 1968, LCCCN = 68-12278 (ISBN: none)


SIX-BITS CHARACTERS

Encoding mess

In the present days the text characters in computers are stored in data units of eight bits, the so-called 'bytes', or even of sixteen bits, the so-called 'unicode characters'. Also most of the computer manufacturers use the same bit patterns for the characters. So for example, the bit pattern that represents the character '@' is the same on the computers of many different brands. The popular encoding mechanism used for storing the characters in the eight-bits bytes is named ASCII.

In the early days of computers things were very different. Many computer firms let the text characters be stored in units of six bits. Also the encoding protocol differed much between the brands and even between the different computer models of the same brand. Sometimes even two or more encoding protocols were used simultaneously inside the same model!  Of course this mess might have hampered easy data exchange between the computers.

Computer firms of different brands gave different names to their six-bits encoding protocols. Some names are:
     Univac:      Fieldata
     Burroughs:   Burroughs Coded Language ('BCL'),   2 versions: 'internal' and 'external'
     IBM:    Binary Coded Decimal ('BCD')
It may be confusing that the naming by IBM also holds for the encoding of the ten digits into units of four bits.

Of course, the fewer bits per storage unit, the fewer different characters it can contain, as the next table shows:

       16 bits:  65536 characters
        8 bits:    256 characters
        7 bits:    128 characters
        6 bits:     64 characters

Irrespective of the encoding mess all computer firms agreed in one aspect: they all put the letters of the Latin alphabet and the ten digits in their encoding scheme.

The six-bits units can contain all 10 digits, all 26 capital letters and all 26 lowercase letters, but then only two patterns are left for the interpunctuation marks. These marks are needed for a good readability of a text. To make room for these marks all firms decided to omit the lowercase letters.

The usage of the 10 digits, the 26 Latin capitals and the blank space was the only correspondence between all firms. Every firm filled the remaining 27 six-bits units with its own set of interpunctuation marks. And as already stated, every firm applied its own way of encoding the characters onto the bit patterns. Burroughs even applied two totally different ways of encoding, both named Burroughs Coded Language (BCL).

The next list shows the encoding schemes of both BCL versions, the so-called internal and external version. The list also shows the single set used by Univac, called Fieldata. The BCD codes used in the 7094 model by IBM for the letters, the digits and the blank equal those of the BCL internal. The codes of its punctuation marks are not yet known. Therefore the IBM set is not listed here. The set used by another firm, ICL, is listed. This British firm does not exist anymore.
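For the Fieldata and ICL sets the letters and the digits occupy contiguous runs of bit patterns in the listing, so a partial decoding table can be built from three starting codes. The Python sketch below covers only letters, digits and the blank; all other patterns are shown as '?'. The function names are hypothetical, and it is an assumption that the leftmost six bits of a 36-bit word hold the first character.

```python
import string


def build_table(letter_start, digit_start, blank_code):
    """Map six-bit codes to characters for letters, digits and the blank."""
    table = {blank_code: " "}
    for offset, letter in enumerate(string.ascii_uppercase):
        table[letter_start + offset] = letter
    for offset, digit in enumerate(string.digits):
        table[digit_start + offset] = digit
    return table


# Starting codes taken from the listing in this document (octal).
FIELDATA = build_table(letter_start=0o06, digit_start=0o60, blank_code=0o05)
ICL = build_table(letter_start=0o41, digit_start=0o00, blank_code=0o20)


def decode_word(word, table, chars=6):
    """Decode a word of six-bit characters, leftmost character first."""
    codes = [(word >> (6 * (chars - 1 - i))) & 0o77 for i in range(chars)]
    return "".join(table.get(code, "?") for code in codes)


print(decode_word(0o322316330610, FIELDATA))
print(decode_word(0o514354200111, ICL))
```

The same bit pattern gives a different text in each set, which illustrates why data exchange between the brands was difficult.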

Listing of six-bits character sets

General: All character sets contain:
          26 Latin capital letters, A - Z
          10 decimal digits, 0 - 9
           1 blank space
          27 interpunctuation marks


  bit            Burroughs-6700    Univac-1100
pattern  octal  intern.BCL.extern   Fieldata     ICL

 0 -- 7
000 000    0 0     0         ?         @          0
000 001    0 1     1         1         [          1
000 010    0 2     2         2         ]          2
000 011    0 3     3         3         #          3

000 100    0 4     4         4        /\          4
000 101    0 5     5         5       blank        5
000 110    0 6     6         6         A          6
000 111    0 7     7         7         B          7

 8 - 15
001 000    1 0     8         8         C          8
001 001    1 1     9         9         D          9
001 010    1 2     #         0         E          : (col)
001 011    1 3     @         #         F          ; (sem)

001 100    1 4     ?         @         G          <
001 101    1 5     : (col)   : (col)   H          =
001 110    1 6     >         >         I          >
001 111    1 7     >=        >=        J          ?

16 - 23
010 000    2 0     +       blank       K        blank
010 001    2 1     A         /         L          ! (xcl)
010 010    2 2     B         S         M          "
010 011    2 3     C         T         N          #

010 100    2 4     D         U         O          L (GBP)
010 101    2 5     E         V         P          %
010 110    2 6     F         W         Q          &
010 111    2 7     G         X         R          ' (apo)

24 - 31
011 000    3 0     H         Y         S          (
011 001    3 1     I         Z         T          )
011 010    3 2     . (per)   -=        U          *
011 011    3 3     [         , (com)   V          +

011 100    3 4     &         %         W          , (com)
011 101    3 5     (         =         X          -
011 110    3 6     <         ]         Y          . (per)
011 111    3 7     <-        " (str)   Z          /

32 - 39
100 000    4 0     x (mul)   -         )          @
100 001    4 1     J         J         -          A
100 010    4 2     K         K         +          B
100 011    4 3     L         L         <          C

100 100    4 4     M         M         =          D
100 101    4 5     N         N         >          E
100 110    4 6     O         O         &          F
100 111    4 7     P         P         $          G

40 - 47
101 000    5 0     Q         Q         *          H
101 001    5 1     R         R         (          I
101 010    5 2     $         x (mul)   %          J
101 011    5 3     *         $         : (col)    K

101 100    5 4     -         *         ?          L
101 101    5 5     )         )         ! (xcl)    M
101 110    5 6     ; (sem)   ; (sem)   , (com)    N
101 111    5 7     =<        =<        \          O

48 - 55
110 000    6 0   blank       &         0          P
110 001    6 1     /         A         1          Q
110 010    6 2     S         B         2          R
110 011    6 3     T         C         3          S

110 100    6 4     U         D         4          T
110 101    6 5     V         E         5          U
110 110    6 6     W         F         6          V
110 111    6 7     X         G         7          W

56 - 63
111 000    7 0     Y         H         8          X
111 001    7 1     Z         I         9          Y
111 010    7 2     , (com)   +         ' (apo)    Z
111 011    7 3     %         . (per)   ; (sem)    [

111 100    7 4     -=        [         /          $
111 101    7 5     =         (         . (per)    ]
111 110    7 6     ]         <        []         Up-arrow
111 111    7 7     " (str)   <-    -= or "STOP"   <-

Legend

In reality all symbols occupy only one position on the printed paper, not two or more as some do here in this listing.

 ' (apo)  =  apostrophe
 " (str)  =  string quote
 . (per)  =  period
 , (com)  =  comma
 : (col)  =  colon
 ; (sem)  =  semicolon
 ! (xcl)  =  exclamation mark
 x (mul)  =  multiply symbol
 L (GBP)  =  currency symbol for British pound sterling
    ICL   =  International Computers Limited, a former British manufacturer
/\  =  delta symbol, its bottom side is closed actually
[]  =  lozenge = square or rectangle
<-  =  left arrow
-=  =  not equal
=<  =  less than or equal to
>=  =  greater than or equal to
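
As an illustration of how six such characters fill one 36-bit word without wasting a bit, the following Python sketch packs text with the Fieldata codes for the letters, the digits and the blank as given in the listing above. The function names and the packing order (first character in the leftmost six bits) are assumptions made for this example only. The punctuation marks are left out for brevity.

```python
# Fieldata codes taken from the listing: blank = 05, A-Z = 06..37, 0-9 = 60..71 (octal).
FIELDATA = {" ": 0o05}
FIELDATA.update({chr(ord("A") + i): 0o06 + i for i in range(26)})
FIELDATA.update({chr(ord("0") + i): 0o60 + i for i in range(10)})
DECODE = {code: ch for ch, code in FIELDATA.items()}

def pack_word(text):
    """Pack up to six characters into one 36-bit word, first character leftmost (assumed)."""
    if len(text) > 6:
        raise ValueError("a 36-bit word holds at most six sixbit characters")
    word = 0
    for ch in text.ljust(6):              # pad short texts with blanks
        word = (word << 6) | FIELDATA[ch]
    return word

def unpack_word(word):
    """Return the six characters stored in a 36-bit word."""
    return "".join(DECODE[(word >> shift) & 0o77] for shift in range(30, -1, -6))

word = pack_word("PDP10")
print(f"{word:012o}")      # 251125616005: two octal digits per character
print(unpack_word(word))   # the text back again, padded with one blank
```

Since every character occupies exactly two octal digits, an octal dump of the word can be read character by character.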

Links

   B6700/7700 Algol Language ref.man., June 1974
   Fortran-V on Univac-1100 series with Exec-8,   dated: before 1979;  more data are unknown
   Data formats collected by NASA


DIGITAL PDP-10 and
DECSYSTEM-10, 20

General description

During the 1970s and the early 1980s the Digital Equipment Corporation (= DEC) sold two families of computers with incompatible architectures. These were the PDP-11 with a databus width of 16 bits and the PDP-10 with a bus width of 36 bits.

The Decsystem-10 and Decsystem-20 belong to the PDP-10 family. In 1983 this product line was cancelled in favor of the VAX, the 32-bit successor of the PDP-11. The VAX in its turn was later followed by the Alpha. Amazingly, the profitable VAX ran DEC into ruin, so in 1998 the company was taken over by Compaq.

The numeric formats of the PDP-11 are discussed in the document about the hidden bit in this internet site. Those of the PDP-10 are discussed now. A peculiarity in these formats is the 'double-sign construction'.

The size of a memory word in the PDP-10 computer is 36 bits. Signed numbers consist of one or two words. When there are two words, the first word contains the most significant part of the number and the second word is the tail part (= least significant part).

When the number occupies one word, the leftmost bit is used for the +/- sign. When it occupies two words, the leftmost bits of both words are used for this +/- sign. Thus, in contrast with most other computers, the first bit of the second word is not part of the absolute value of the number. It is a copy of the sign bit of the first word, thus being a kind of 'shadow sign bit'.

The second bit of the second word is the first numerical bit of the tail. This bit must be thought of as immediately following the last bit of the first word. As a result of this construction a double number has one numerical bit less than a similar number in other 36-bit computers, e.g. the Univac-1100 or IBM-7094.
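
For a non-negative double number this layout can be sketched in Python as follows. The helper name is hypothetical, and negative numbers are deliberately left out of this sketch.

```python
MASK35 = (1 << 35) - 1   # the 35 numerical bits of one word
SIGN = 1 << 35           # the leftmost bit of a 36-bit word

def join_double(first, second):
    """Combine the two 36-bit words of a non-negative double number into one value."""
    if first & SIGN or second & SIGN:
        raise ValueError("this sketch handles non-negative numbers only")
    # The shadow sign bit is skipped: the 35 tail bits follow the first word directly.
    return ((first & MASK35) << 35) | (second & MASK35)

print(join_double(1, 0))                         # 34359738368, i.e. 2**35
print(join_double(MASK35, MASK35) == 2**70 - 1)  # True: 70 numerical bits in total
```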

Negative numbers are stored in two's-complement notation. In a single number the one word is complemented. In a double number both words are complemented separately. The shadow sign bit serves to keep the words independent of each other in this complementation.

The double-sign construction is applied both in the double signed-integer and in the double floating-point numbers.

A floating-point number consists of the +/- sign, the biased exponent and the mantissa. In contrast with the PDP-11 all bits of the mantissa are visible, so the hidden bit is not applied. Also there are no special values. When the number has double length, the whole second word from its second bit on is the tail part of the mantissa that starts in the first word.

When the float number is negative its whole bit-pattern is two's-complemented, including the bits of the exponent, as if it were an integer number. So in order to obtain the actual value the bit-pattern must be 'un-complemented' first.
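
A minimal Python sketch of this 'un-complementing' for a single-word float is given here. It assumes the field layout of the structure listing further on (8-bit exponent with bias 128, 27-bit mantissa fraction). The function name is made up for this example.

```python
WORD_MASK = (1 << 36) - 1

def decode_f(word):
    """Decode a 36-bit single precision float word into a Python float."""
    negative = bool((word >> 35) & 1)
    if negative:
        word = (-word) & WORD_MASK        # undo the two's complement of the whole word
    exponent = (word >> 27) & 0xFF        # 8 bits, bias 128
    mantissa = word & ((1 << 27) - 1)     # 27 bits, a fraction below 1.0
    value = mantissa / 2**27 * 2.0 ** (exponent - 128)
    return -value if negative else value

print(decode_f(0o201400000000))   # 1.0   (mantissa 0.5, exponent +1)
print(decode_f(0o576400000000))   # -1.0  (the same pattern, two's-complemented)
```

Note that the exponent field of the negative pattern (octal 576...) looks meaningless until the word has been un-complemented.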

Thus the number is stored in two or three steps. First its absolute value is calculated and stored. Then the sign is entered in the first bit(s). Finally, when the sign is negative, the number is two's-complemented. The words of a double precision number are two's-complemented separately, like two independent one-word integers.
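
For a single-word float these storing steps might be sketched in Python as below. The function name is hypothetical, the mantissa is simply truncated to 27 bits, and the field layout (8-bit exponent with bias 128) is the one of the structure listing further on. In this sketch the sign bit is not set separately: complementing the positive pattern sets it as a side effect.

```python
import math

WORD_MASK = (1 << 36) - 1

def encode_f(value):
    """Encode a Python float as a 36-bit single precision word (truncating sketch)."""
    if value == 0.0:
        return 0                                   # zero: all bits zero
    mantissa, exponent = math.frexp(abs(value))    # 0.5 <= mantissa < 1.0
    if not -128 <= exponent <= 127:
        raise OverflowError("exponent outside the 8-bit range")
    # Step 1: store the absolute value (biased exponent, then 27 mantissa bits).
    word = ((exponent + 128) << 27) | int(mantissa * 2**27)
    # Steps 2 and 3: for a negative number complement the whole word.
    if value < 0:
        word = (-word) & WORD_MASK
    return word

print(f"{encode_f(1.0):012o}")    # 201400000000
print(f"{encode_f(-1.0):012o}")   # 576400000000
```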

Sometimes Digital gave the single precision float the name F-type float. In the earlier versions of the PDP-10 it applied one type of double precision floats, named D-type. In the later versions it added a second type, the G-type.

The older D-type uses the same exponent as the single precision number. Thus it has the same value range, only the accuracy is higher. The newer G-type has a longer exponent, so the value range is much greater at a small cost in accuracy. Some processors of the PDP-10 family were not able to handle a double precision number at all. For them an emulator was written in software. The range and the accuracy of the F-type (= single number) equal those of the Univac-1100 and IBM-7094.

In spite of the differences there are some conformities with the PDP-11 family:
-- First, the exponents and their biases are the same in all three floating point formats. So the value ranges both computer families can handle are equal.
-- Second, the value of the mantissa is always between 0 and 1, so the mantissa is always a fraction. It has no integral part. It is said to be normalized when its first bit is 1. Then its value is said to be between 0.5 and 1.0.

The correspondence with the PDP-11 goes somewhat further. The mantissa of a non-zero value should always be normalized. In the PDP-11 every non-zero mantissa is simply assumed to be normalized. A zero value should be represented by a bit-pattern in which both the mantissa and the exponent are zero. For the PDP-10 the same holds. When a non-zero mantissa is not normalized, or when a zero mantissa is combined with a non-zero exponent bit-pattern, wrong calculation results may occur.

In the PDP-10 computers the memory word is not divided into predefined parts that can be selected separately, like bytes or nibbles. But the user can easily select any bit or any contiguous group of bits inside the word. Thus these computers can handle (text) 'bytes' of every length from 1 up to 36 bits. Often bytes of seven bits are used, filled with ASCII-coded characters. Five of them fit in a word, which leaves one bit unused. The PDP-11 always handles eight-bit ASCII bytes, so the texts of both machines are poorly compatible.
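
The selection of an arbitrary group of bits, and the packing of five seven-bit characters, can be sketched in Python as follows. The function names are made up for this example. It is assumed here that the characters are stored left-justified so that the rightmost bit of the word is the unused one.

```python
def pack_ascii7(text):
    """Pack up to five 7-bit ASCII characters into one 36-bit word.

    Assumption: characters are left-justified, the rightmost bit stays unused.
    """
    if len(text) > 5:
        raise ValueError("only five 7-bit bytes fit in a 36-bit word")
    word = 0
    for ch in text.ljust(5, "\0"):        # pad short texts with NUL characters
        code = ord(ch)
        if code > 0x7F:
            raise ValueError("not a 7-bit ASCII character")
        word = (word << 7) | code
    return word << 1                      # leave the last bit unused

def get_byte(word, position, size):
    """Extract 'size' bits whose rightmost bit lies 'position' bits above the LSB."""
    return (word >> position) & ((1 << size) - 1)

word = pack_ascii7("HELLO")
print("".join(chr(get_byte(word, 29 - 7 * i, 7)) for i in range(5)))   # HELLO
print(get_byte(word, 0, 1))                                            # 0, the unused bit
```

The same `get_byte` helper works for any byte length from 1 up to 36 bits, which is the point of the paragraph above.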

Remarkably, even the manuals of both computer families are not compatible in an important aspect, the numbering of the bits in a computer word. In the PDP-11 manuals the least significant bit (= LSB) is indexed as zero, but in the PDP-10 manuals the most significant bit (= MSB) is indexed as zero. In the numbers this bit is the sign bit.

In the next structure listing the bits are indexed according to the PDP-11 definition, since this way of indexing is applied in every document in this internet site. Therefore the MSB is indexed here as 35, whilst in the PDP-10 manuals it is indexed as 0. Similarly the LSB is indexed here as 0, and in the manuals as 35.
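
The conversion between both ways of indexing is a simple subtraction, as this small Python sketch shows. The function name is chosen for this example only.

```python
WORD_SIZE = 36

def to_manual_index(index):
    """Convert an LSB-is-0 bit index (used here) into the MSB-is-0 index of the PDP-10 manuals.

    The conversion is its own inverse, so it also works in the other direction.
    """
    return WORD_SIZE - 1 - index

print(to_manual_index(35))                       # 0: the sign bit in the manuals
print(to_manual_index(0))                        # 35: the LSB in the manuals
print(to_manual_index(34), to_manual_index(27))  # 1 8: an 8-bit exponent field
```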

After this listing a summary is given of the number of bits, the accuracies and the value ranges. In this summary the rounding protocol applied for the decimal figures is:
-- for minimum value: always rounding up
-- for maximum value: always rounding down
-- for accuracy and #digits: always rounding down
(All rounding protocols are written elsewhere in this internet site. The same holds for the concept of 'guaranteed accuracy'.)

Numeric structures

FLOAT ('F')

bit 35      = sign bit  (0='+', 1='-')
bit 34 - 27 = exponent
bit 26 - 0  = mantissa

exponent integer = 0 - 255
exponent bias = 128
0 =< mantissa = fraction < 1.0
value range when normalized = 1.47E-39 = 2^-129 <-> 1.70E+38 = 2^+127
guaranteed accuracy = 26 bits = 7.8 digits

Range and accuracy equal those of the Univac-1100, the IBM-7094 and the Univac-418.

DOUBLE PRECISION FLOAT ('D')

first word:
  bit 35      = sign bit  (0='+', 1='-')
  bit 34 - 27 = exponent
  bit 26 - 0  = mantissa (MSP)
second word:
  bit 35      = shadow sign bit
  bit 34 - 0  = mantissa (LSP)

exponent integer = 0 - 255
exponent bias = 128
0 =< mantissa = fraction < 1.0
value range when normalized = 1.47E-39 = 2^-129 <-> 1.70E+38 = 2^+127
guaranteed accuracy = 61 bits = 18.3 digits

DOUBLE PRECISION FLOAT ('G')

first word:
  bit 35      = sign bit  (0='+', 1='-')
  bit 34 - 24 = exponent
  bit 23 - 0  = mantissa (MSP)
second word:
  bit 35      = shadow sign bit
  bit 34 - 0  = mantissa (LSP)

exponent integer = 0 - 2047
exponent bias = 1024
0 =< mantissa = fraction < 1.0
value range when normalized = 2.79E-309 = 2^-1025 <-> 8.98E+307 = 2^+1023
guaranteed accuracy = 58 bits = 17.4 digits

INTEGER

  bit 35      = sign bit  (0='+', 1='-')
  bit 34 - 0  = numerical part

integers are in the two's-complement notation
value ranges from -2^35 up to +2^35-1    so from  -34,359,738,368  up to  +34,359,738,367
maximum number of digits = 10.5

DOUBLE INTEGER

first word:
  bit 35      = sign bit  (0='+', 1='-')
  bit 34 - 0  = most significant part (MSP)
second word:
  bit 35      = shadow sign bit
  bit 34 - 0  = least significant part (LSP)

double integers are in the two's-complement notation
value ranges from -2^70 up to +2^70-1    so approximately from  -1.18E+21  up to  +1.18E+21
maximum number of digits = 21.0

BOOLEAN

in the language Fortran:
  bit 34 - 0  = unused
  bit 35      = (This is the sign bit of the numbers) =
                logical value:  0 = '+' = false, 1 = '-' = true

Arithmetic data in listing

Machine: DEC-PDP10

Binary bits

                     #bits          binary normalized abs.value
               total expon mantis    minimum    maximum(nearly)
single float F   36    8    27      2^(-1-128)       2^+127
double float D   72    8    62      2^(-1-128)       2^+127
double float G   72   11    59      2^(-1-1024)      2^+1023
integer          36    -    35    min.non-zero: 1    2^+35
double integer   72    -    70    min.non-zero: 1    2^+70

Decimal values

                  decimal normalized abs.value   float-accur.
                    minimum         maximum     or int-#digits
single float F     1.47E-39        1.70E+38           7.8
double float D     1.47E-39        1.70E+38          18.3
double float G     2.79E-309       8.98E+307         17.4
integer         min.non-zero: 1    3.43E+10          10.5
double integer  min.non-zero: 1    1.18E+21          21.0

Again, note that a negative single-word number is two's-complemented as a whole, like an integer. So the bit-pattern has to be 'un-complemented' to find the actual value it stands for. In a double-word number both words are two's-complemented separately.

The rounding protocol for the decimal figures is:
-- for minimum value: always rounding up
-- for maximum value: always rounding down
-- for accuracy and #digits: always rounding down
(All rounding protocols are written elsewhere in this internet site. The same holds for the concept of 'guaranteed accuracy'.)
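
The accuracy column of the decimal table can be reproduced with the following Python sketch. It assumes that the number of decimal digits is the number of guaranteed bits multiplied by log10(2) and then rounded down to one decimal. The bit counts are the guaranteed accuracies from the structure listing. The function name is made up for this example.

```python
import math

def digits_rounded_down(bits):
    """Decimal digits equivalent to 'bits' binary digits, rounded down to one decimal."""
    return math.floor(bits * math.log10(2) * 10) / 10

for name, bits in [("single float F", 26), ("double float D", 61),
                   ("double float G", 58), ("integer", 35), ("double integer", 70)]:
    print(f"{name:15s} {bits:3d} bits = {digits_rounded_down(bits):4.1f} digits")
```

Under this assumption the sketch gives 7.8, 18.3, 17.4, 10.5 and 21.0, the figures of the table.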

Comparison with other machines

The following list compares the arithmetic formats of the several computers in the PDP-10 family with those in the PDP-11 family and the Univac-1100 family.

              # bits mantissa            guaranteed accuracy
float #bits  Univac  PDP-10  PDP-11     (at integer: # digits)
type  expon   -1100         +hid.bit   U-1100   PDP-10   PDP-11

  F      8     27      27      24        7.8     7.8      6.9
  D      8      -      62      56         -     18.3     16.5
  G     11     60      59      53       17.7    17.4     15.6
integer  0     35      35      31       10.5    10.5      9.3
doub.int 0      -      70      63         -     21.0     18.9

Data in all machines of all families:

Mantissa in floats:

     0.5 =< mantissa < 1.0   when the mantissa is normalized.

Exponent:

   length     excess bias       resulting value range
      8           128           from  -128 up to  +127
     11          1024           from -1024 up to +1023

The numeric float values are calculated by the formula:

     value = mantissa * 2 ^ resulting_exponent
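
As a check on this formula, the Python sketch below derives the normalized value range from the exponent length alone. The function name is hypothetical. Exact fractions are used so that the very small values do not lose precision. The maximum is the upper bound that is approached but never reached, since the mantissa stays below 1.0.

```python
from fractions import Fraction

def normalized_range(exponent_bits):
    """Smallest normalized value and upper bound of mantissa * 2**exponent."""
    bias = 1 << (exponent_bits - 1)               # 128 or 1024
    lowest_exp, highest_exp = -bias, bias - 1     # resulting exponent range
    minimum = Fraction(1, 2) * Fraction(2) ** lowest_exp   # mantissa = 0.5
    maximum = Fraction(2) ** highest_exp                   # mantissa just below 1.0
    return minimum, maximum

for bits in (8, 11):
    smallest, largest = normalized_range(bits)
    print(bits, f"{float(smallest):.3e}", f"{float(largest):.3e}")
```

For an exponent of 8 bits this yields 2^-129 and 2^+127, for 11 bits 2^-1025 and 2^+1023, in agreement with the tables above.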

Links

Digital manuals:
   TOPS-10/20 Fortran language manual, nr. AA-N383B-TK/AD-N383B-T1, February 1987
   Many other PDP-10 manuals
MIT PDP-10 Info file "Introduction to PDP-10 Assembly Language Programming"
