MULTI-WORD MEMORIES

by:   J.H.M. Bonten




THE IBM-1620 DECIMAL COMPUTER

First publication date: 8 march 2009


An ordinator goes scientific

In the late 1950s IBM designed two decimal computers that handle data of variable length: the IBM-1401 and the IBM-1620.  The 1401 was intended for administrative purposes, whilst the 1620 was intended primarily for scientific applications. Therefore the latter was equipped with Fortran and a decimal hardware floating-point processor, neither of which the 1401 had. Nevertheless it was essentially a more or less refurbished ordinator, as the structure and usage of the RAMemory show.

The memory words in both machines look fairly similar, although they are not identical. The 1401 uses 8 bits, 7 for the actual data and 1 for the parity check by the hardware. The 1620 uses 6 bits, 5 for the data and 1 for the parity check. Only the 1620 is described in more detail here.

In this description a contiguous series of consecutive words in the RAMemory is called a 'string', irrespective of the contents of these words. A digit is a word that contains an integral value in the range from 0 up to 9.

Memory structure

In the IBM-1620 the RAMemory word consists of six bits. Bit 5 is the check bit for odd parity. The user cannot access it. The bits 3, 2, 1 and 0 (named 8, 4, 2 and 1 by IBM) contain an unsigned digit in BCD (binary 1001 or less) or a special code (binary 1010 or more).  Bit 4 is the 'flag bit', which is used for several purposes.

The four bits 0 to 3 are interpreted together, so their contents are seen as one whole. That interpretation is not influenced by the contents of the flag bit 4.  Conversely, the contents of bit 4 are interpreted fully independently of the contents of the bits 0 to 3.  The flag can have several meanings, depending on the context in which it is used. Thus in fact every memory word consists of two independent words, one of one bit and the other of four bits.

   Identifying the six bits:
       5  4  3  2  1  0  <--  index in this document
       C  F  8  4  2  1  <--  name by IBM
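
The bit layout above can be sketched in a few lines of Python. This is only an illustration: it assumes that the six bits are packed into an integer with bit 0 as the least significant bit, and the name decode_word and the keys of the returned dictionary are invented for this example. The special codes are those of the bit-pattern table in this document.

```python
# Illustrative sketch: split a 6-bit IBM-1620 memory word into its parts.
# Assumption: the word is held in an int, bit 0 = least significant bit.

SPECIAL = {0b1010: "Record Mark", 0b1100: "Numeric Blank", 0b1111: "Group Mark"}

def decode_word(word):
    """Return check bit, flag bit, 4-bit value and the meaning of that value."""
    if not 0 <= word < 64:
        raise ValueError("a memory word has exactly six bits")
    check = (word >> 5) & 1      # bit 5: parity bit, maintained by the hardware
    flag = (word >> 4) & 1       # bit 4: flag, independent of bits 0..3
    value = word & 0b1111        # bits 3..0: digit or special code
    if value <= 9:
        meaning = "digit"
    else:
        meaning = SPECIAL.get(value, "unused")
    # Odd parity: the total number of 1-bits in the six bits must be odd.
    parity_ok = bin(word).count("1") % 2 == 1
    return {"check": check, "flag": flag, "value": value,
            "meaning": meaning, "parity_ok": parity_ok}

# A flagged digit 7 whose check bit makes the number of 1-bits odd.
print(decode_word(0b110111))
```

The flag and the 4-bit value are extracted separately, which mirrors the remark that every memory word in fact holds two independent items.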

The computer is a variable-length computer, as it can transfer any number of words to and from the CPU in one memory access, with one exception: this number cannot be 1, so a single word cannot be transferred in this way. But 2 or 3 or any other even or odd number can, as long as there is enough RAMemory. A special CPU-command serves for transmitting one single word.

The computer has three different series of consecutive words. These are the data for internal processing, the data for external communication, and the CPU-instructions. Each type of series has its own method of transfer between a memory location and another location or CPU or peripheral device, and vice versa.

The memory is addressed decimally. An address consists of five digits, thus ranging from 00000 up to 99999.  So the address space contains 100,000 words. In practice no machine with a RAM bigger than 60,000 words has ever been sold.

CPU-instructions

A CPU-instruction has a fixed length of 12 words (or 8 for one branch command). It is accessed by pointing at the location of the word at the lowest RAMemory address. The computer accesses the eleven (or seven) following words by increasing the address by one eleven (or seven) times. Every instruction must start at an even word address, so the address of the pointed location is always even.

Nearly all instructions consist of a 2-word op-code followed by a 5-word 'P-address' and a 5-word 'Q-address'.  Some of them use only the P-address, thus wasting five words. So does the ordinary branch instruction 'B'.  Therefore a shorter version without Q-address, named 'B7', was introduced later.  This instruction is padded with an unused word because the next instruction must start at an even address.

When an operation uses and produces data in the RAMemory only, the P-address is the destination address and the Q-address is the source address or an immediate value. When it involves a peripheral device, the P-address is the address in the memory, both for source and destination, and the Q-address controls the I/O device. The second and third words in that Q-address form the address (port number, channel number) of that device, and the fifth word may contain a control character for that device. The first and fourth words stay unused.

The flag bits in the middle of the instruction string can be set to one. They serve to request indirection or to address one of the CPU's index registers:
- In the least significant digit of a 5-digit address the flag is set to 1 for indirect addressing. In this way multi-level indirection can be used. Thus the machine can even be put in an infinite indirect addressing loop!
- In the middle 3 digits of a 5-digit address the flags are set to select one out of the 7 index registers.
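
The instruction layout and the two uses of the address flags can be sketched as follows. This is a rough illustration, not a description of the real hardware decoding: each memory word is modelled as a (digit, flag) tuple, the function names are invented, and the way the three middle flags are combined into a register number (leftmost flag most significant) is an assumption made for the example.

```python
# Illustrative sketch: split a 12-word instruction and read the address flags.
# Each memory word is modelled as a (digit, flag) tuple; this model and all
# names below are assumptions made for the example.

def split_instruction(words):
    """words: list of 12 (digit, flag) tuples, lowest address first."""
    if len(words) != 12:
        raise ValueError("an ordinary instruction occupies 12 words")
    opcode = words[0:2]      # 2-word op-code
    p_addr = words[2:7]      # 5-word P-address
    q_addr = words[7:12]     # 5-word Q-address
    return opcode, p_addr, q_addr

def describe_address(addr):
    """addr: five (digit, flag) tuples, most significant digit first."""
    value = int("".join(str(digit) for digit, _ in addr))
    indirect = addr[4][1] == 1            # flag on the least significant digit
    # Assumption: the three middle flags form a 3-bit number, leftmost = MSB.
    index_reg = addr[1][1] * 4 + addr[2][1] * 2 + addr[3][1]
    return value, indirect, index_reg     # index_reg 0 means "no indexing"

words = [(2, 0), (1, 0),
         (1, 0), (2, 0), (3, 1), (4, 0), (5, 1),
         (0, 0), (0, 0), (0, 0), (6, 0), (7, 0)]
opcode, p_addr, q_addr = split_instruction(words)
print(describe_address(p_addr))   # (12345, True, 2)
print(describe_address(q_addr))   # (67, False, 0)
```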

Series of words as data

The length of a series of words for data is often variable, seldom fixed. Again the first word to be accessed has to be pointed at by the user's program. Again the computer accesses the following words by continuously updating the address by one. But now it stops when a series-termination symbol is met. This terminator is either a flag or an end-of-line symbol.

Now the update is either an INcrement or a DEcrement. In case of increment the initial address must point at the head of the string, which is the word with the lowest memory address. In case of decrement the initial address points at the string's tail, which has the highest memory address. Thus the computer can behave both as a big-endian machine and as a little-endian machine.

The series must consist of at least two words. The flag bit in the first word can be zero or one. The series is terminated by the second or later word that has the flag bit one or the 'end-of-line' symbol as its value. In all words between the first and the last accessed word the flag bit must be zero, otherwise the series would be terminated prematurely.

The flag bit in the last accessed word is called a 'Word Mark'. The end-of-line symbol is called a 'Record Mark'.  A series of data is made up either of single words or of double words. Double words must begin at an even memory address.

Digits and special values

The set of values that can be stored in the four bits 0 up to 3 is divided into two groups. One group contains the digits 0 up to 9, which are written in BCD-code. The other group contains three special values and three unused bit-patterns. All sixteen bit-patterns and their meanings are:

bit-
pattern   meaning
-------   -------
0 0 0 0   digit with value 0    \
0 0 0 1   digit with value 1     |  When the word contains
 :::::     :::   ::   :::  :     |- one of these ten values
1 0 0 0   digit with value 8     |  it is called a 'DIGIT'.
1 0 0 1   digit with value 9    /

1 0 1 0   Record Mark (marks end of record or end of line)
1 0 1 1   [unused]
1 1 0 0   Numeric Blank (blank in output format for punch card)
1 1 0 1   [unused]
1 1 1 0   [unused]
1 1 1 1   Group Mark (marks end of group of records in disk I/O)

The IBM-1620 works very decimally. Whilst the digits are used everywhere, the six non-digit bit-patterns have an extremely limited range of uses. They can NOT be used as ordinary data, e.g. in calculations. Nor can they be used in the CPU-instructions, neither in the op-codes nor in the addresses. The three special values are used mainly in the communication with the peripheral I/O-devices.

So even in a CPU-instruction the op-code and the address(es) must be fully decimal. Thus the memory is addressed decimally. An address that points at a RAMemory location consists of five digits, thus ranging from 00000 up to 99999.

Text processing

A contiguous series of consecutive memory words that are used as data is called a 'string'. Such a string is either a text or one single number. The text is discussed first. The number is discussed later.

A text is built up of characters. Each character is represented by a series of two consecutive words, thus forming a 'double-word'.  Double-words must start at an even address in RAMemory.

Remarkably, only the ten decimal values are allowed in each of the two words that store a character symbol. The six non-decimal values are not allowed in either word. Thus the computer can store at most 100 different characters, not 256 as the binary-oriented machines do in eight bits. Moreover, the machine leaves many of the possible bit-patterns for text unused. Although enough unused patterns are available, it does not handle the lower-case characters.

Letter K:
             +---------+---------+
             |    5    |    2    |
             |         |         |
             |  zone-  | numeric |
             |  digit  |  digit  |
             +---------+---------+
  Address:    even: n    odd: n+1

IBM has given fairly odd names to both digits. The leftmost digit is called the 'zone digit' and its right neighbour is called the 'numeric digit'.  These names still exist in today's Cobol, wherein an 8-bit byte is split into 4 zone bits and 4 numeric bits. The zone digit must stand at the even address. When the zone digit is flagged the whole character is said to be flagged. The numeric digit is never flagged. A character is accessed by mentioning the odd address of its numeric digit, never via the even address of the zone digit.

The numeric values that represent the characters are:

         character     value
         ---------    ---------
          blank        00
          period .     03
          )            04
          +            10
          $            13
          *            14
          hyphen -     20
          /            21
          comma ,      23
          (            24
          =            33
          @            34
          A - I        41 - 49
         -0 - -9       50 - 59
          J - R        51 - 59
          S - Z        62 - 69
          0 - 9        70 - 79
       record mark     0.10
        group mark     0.15

All values not listed here are said to be an invalid character. On paper such a character is printed as an X with a vertical bar or as a rectangle. This symbol is often called 'Smersh'.

The straightforward translation of a numeric digit into a character gives one of two different results. When the digit is positive the result is in the range from 70 up to 79.  When the digit is flagged as negative the result is in the range from 50 up to 59.  The alphabetic characters J to R are located in the same range, and the printer always prints their symbols. The value 50 is printed as Smersh.

In the processing of the text the machine behaves as a little-endian machine. The text is processed from its end to its beginning. All flag bits in all characters must be zero, except the one in the zone digit of the leftmost character, which must be one. It acts as the text terminator and is called the Word Mark.

In the communication with the outside world (printer, keyboard, tape, punch-card, and other I/O-devices) the computer behaves more or less as a big-endian machine. The order that is normal for humans is used. The first character of the transfer is addressed by the location of its numeric digit, not by that of its zone digit. Now the string is terminated by the end-of-line symbol, which is called the Record Mark. This symbol itself is not transferred.

Example:
For internal processing the word KNUCKLE is stored as:

               _
              _K     N     U     C     K     L     E
memory-       5 2   5 5   6 4   4 3   5 2   5 3   4 5
-address:   low                                     high
                                                    A
start-address ____..                          ..____|

For the I/O-transfers this word looks like:

               K     N     U     C     K     L     E     #
memory-       5 2   5 5   6 4   4 3   5 2   5 3   4 5   0 10
-address:   low                                           high
                A
start-address __|

In this graphic '#' means the Record-Mark symbol.
The graphic shows that the start address is always odd in text processing.
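
The translation of a text into double-words can be sketched as below. It is a minimal illustration that follows the character table of this document; the names CODES and encode_text are invented, and the flag bits and the Record Mark are deliberately left out.

```python
# Illustrative sketch: encode upper-case text as IBM-1620 double-words.
# The table follows the character list in this document; names are hypothetical.

CODES = {" ": "00", ".": "03", ")": "04", "+": "10", "$": "13", "*": "14",
         "-": "20", "/": "21", ",": "23", "(": "24", "=": "33", "@": "34"}
for i, ch in enumerate("ABCDEFGHI"):
    CODES[ch] = str(41 + i)          # A - I  ->  41 - 49
for i, ch in enumerate("JKLMNOPQR"):
    CODES[ch] = str(51 + i)          # J - R  ->  51 - 59
for i, ch in enumerate("STUVWXYZ"):
    CODES[ch] = str(62 + i)          # S - Z  ->  62 - 69
for i, ch in enumerate("0123456789"):
    CODES[ch] = str(70 + i)          # 0 - 9  ->  70 - 79

def encode_text(text):
    """Return a list of (zone digit, numeric digit) pairs, one per character."""
    return [(int(CODES[ch][0]), int(CODES[ch][1])) for ch in text]

print(encode_text("KNUCKLE"))
# [(5, 2), (5, 5), (6, 4), (4, 3), (5, 2), (5, 3), (4, 5)]
```

A character that is missing from the table raises a KeyError here, which corresponds to the invalid characters mentioned above.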

Numeric structures

A fixed-point (e.g. integer) number is stored as a series of at least two consecutive memory words filled with BCD-digits. The decimal point is virtual. It is not stored in the series, but it must be remembered elsewhere by the executing program.

The series' maximum length is virtually unlimited. The flag bit of the number's leftmost digit is always set to one, thus marking the most significant digit. The flag bit of the rightmost digit is the sign of the number:  0 = '+' and 1 = '-'.  Thus the number is written in sign+magnitude notation. The flag bits of the other digits must be zero.

In processing the numbers the computer acts as a little-endian machine. The CPU addresses the least significant digit first, and then lets the address creep down until the most significant digit is reached. The flag bit acts as the 'creep stopper'.
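
The way such a number is picked up from memory can be sketched as follows. This is only a model of the addressing rule described above: the memory is represented as a Python list of (digit, flag) tuples indexed by address, and the name read_number is invented for the example.

```python
# Illustrative sketch: read a fixed-point number from a model of the memory.
# Memory model (an assumption): a list of (digit, flag) tuples, index = address.

def read_number(memory, address):
    """address points at the least significant (rightmost) digit."""
    digit, flag = memory[address]
    negative = flag == 1                  # flag of the rightmost digit = sign
    digits = [digit]
    while True:
        address -= 1                      # creep down to the lower addresses
        if address < 0:
            raise ValueError("no flagged digit found")
        digit, flag = memory[address]
        digits.append(digit)
        if flag == 1:                     # flagged digit = most significant one
            break
    value = int("".join(str(d) for d in reversed(digits)))
    return -value if negative else value

#          addr: 0       1       2       3       4       5
memory = [(0, 0), (1, 1), (2, 0), (3, 0), (4, 1), (9, 0)]
print(read_number(memory, 4))   # -1234
```

Because the flag of the first-addressed digit is taken as the sign and is not tested as a stopper, the sketch reads at least two digits, in line with the minimum length of a number.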

The computer has a floating-point processor in its hardware. (The 1401 lacks this facility.)  A floating-point number is stored as a contiguous series of two integer numbers. The first one is the coefficient. Its length can vary from 2 up to 100 digits. The second one is the exponent. It always has two digits. When the language Fortran is applied the length of the coefficient is fixed at 8 digits for single precision. Then the whole number occupies 10 memory words.

The coefficient must be normalized. Its value reaches from 0.1 until 1.  The exponent goes from -99 up to +99.  Thus the absolute value of the whole number, when nonzero, goes from 1E-100 until 1E+99.

Next are some floating-point numbers and their representations in the machine. An overscore means that the flag bit is 1.

                       _      __
          - 480.0  ->  4800000003
                       _       _
            48.0   ->  4800000002
                       _      __
          - 4.8    ->  4800000001
                       _       _
            0.48   ->  4800000000
                       _       __
            0.048  ->  4800000001
                       _      ___
          - 0.0048 ->  4800000002
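
The examples above can be reproduced with a short sketch. It assumes the Fortran layout of 8 coefficient digits followed by 2 exponent digits; the name encode_float is invented, the value zero is left out, and surplus coefficient digits are simply cut off. The function returns the ten digits and the positions (counted from 0 at the left) whose flag bit is 1.

```python
# Illustrative sketch: build the 10-digit floating-point field shown above.
from decimal import Decimal

def encode_float(text, mantissa_digits=8):
    """Return (digits, flagged positions) for a decimal literal such as '-4.8'."""
    number = Decimal(text)
    if number == 0:
        raise ValueError("zero is left out of this sketch")
    sign, digits, exp = number.as_tuple()
    exponent = len(digits) + exp          # value = 0.d1d2d3... * 10**exponent
    if not -99 <= exponent <= 99:
        raise ValueError("exponent out of range")
    mantissa = list(digits)[:mantissa_digits]           # surplus digits are cut
    mantissa += [0] * (mantissa_digits - len(mantissa))
    field = "".join(str(d) for d in mantissa) + "%02d" % abs(exponent)
    flags = {0, mantissa_digits}          # leftmost digit of each of the two numbers
    if sign:
        flags.add(mantissa_digits - 1)    # rightmost coefficient digit: minus sign
    if exponent < 0:
        flags.add(mantissa_digits + 1)    # rightmost exponent digit: minus sign
    return field, sorted(flags)

for text in ("-480.0", "48.0", "-4.8", "0.48", "0.048", "-0.0048"):
    print(text, encode_float(text))
```

For -480.0 the sketch yields the digits 4800000003 with flags on positions 0, 7 and 8, which matches the first line of the examples.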

Summary

A series of words for use as data must be at least two words long. It has either a numeric or an alphanumeric meaning: it serves as one decimal number or as a text. The flag bit in the word that is accessed first can have any value. The flag bit in the word that is accessed last is one and acts as the terminator. The flags in the words in between must be zero. A series of words used for a CPU-instruction has a fixed length of 12 or 8 words and has no terminator.

For processing a string the CPU needs to know the address of the word to be handled first and the direction of updating the address (upwards or downwards).

A fixed-point number is stored as a series of at least two BCD-digits. The decimal point is virtual. A floating-point number consists of two integer-valued fixed-point numbers. Its non-zero value range goes from 10^(-100) until 10^(+99).  A text character is stored as a series of two BCD-digits. A text string is a series of one or more characters. When intended for I/O it is finished by a non-BCD Record Mark. Instructions for the CPU consist of 8 or 12 BCD-digits. An address consists of five digits, so the address space is 100,000 RAMemory words.

The purpose of the flag bit is:
- In the first-addressed digit of a number it is the +/- sign of that number.
- It marks the word to be processed last in a string with variable length (Word Mark)
- In the least significant digit of a 5-digit address it is set to 1 for indirect addressing.
- In the middle 3 digits of a 5-digit address they are set to 1 to select one of 7 index registers.

Links

Wikipedia about the IBM-1620
Book:
R. Clay Sprowls: "Computers, a programming problem approach", Harper&Row 1968, LCCCN = 68-12278 (ISBN: none)



WORDS OF 2^N BITS AND NUMBERS IN
IBM-360   and   IBM-370

Date of first publication: 8 march 2009


Introduction

In the mid-1960s IBM launched a new computer, the IBM-360. Its memory was designed such that it can behave both as having long words and as having short words. Thus the computer became the first one to handle various data formats simultaneously in an easy way. It revolutionized the arrangement of the data in the computer's physical storage. The computer also got two arithmetic processors, a decimal and a binary one. So it was suited to both scientific and banking applications.

Remarkably, the name '360' may suggest that its data bus is 36 bits wide. Actually it is not. The bus width is not divisible by 3 or 6 or 9, in contrast with many other computers of that time. Some of them are described elsewhere on this internet site.

The 360 also improved the concept of compatibility. It was not one type of computer but a series of different models, from cheap, small and slow to expensive, large and fast. Nearly every program developed on one model could run on all other models without any modification. Consequently this series and its successor series, the 370, became ubiquitous in the late sixties, the seventies and the eighties.

For the ability to handle different data formats and for the program compatibility some restrictions were built into the design of the 360.  The size (= length) of each data word is a power of 2 and not too great. And the word boundaries are aligned. Each of them coincides with a boundary of a shorter word. Also a sharp distinction was made between physical and logical data storage.

Full-binary memory organisation

The storage mechanism is now discussed in more detail. For this purpose we must see the memory as a very long and homogeneous row of single bits. The number of bits is divisible by 64.  This row is subdivided into groups of bits, each containing the same number of bits. Stated otherwise: the groups have the same size or length. Neighbouring groups adjoin (= touch) each other, so no rogue bits are left. The groups are called 'computer words'.

The computer applies five different sizes of words: 4, 8, 16, 32 or 64 bits. The subdivision just described implies that the two boundaries of every (except 4-bits) word always coincide with those of two immediately shorter words. Of course the reverse is not true. The boundary where the two shorter words adjoin punches the longer word in its midst. For example, the two boundaries of a byte do not punch a nibble in its midst. But the byte is punched in its midst by the boundary between two consecutive nibbles.

Thus the boundaries are located at places that are away from the left edge of the memory over distances that are integral numbers of words of one type. Therefore they are said to be 'integral' at the length of these words. The drawing elucidates the memory organization. Herein the integral boundaries are shown by vertical lines. The different types of words are named according to IBM.

#BITS                                               NAME by IBM
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
 4 |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  | nibble
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
 8 |     |     |     |     |     |     |     |     | byte
   +-----+-----+-----+-----+-----+-----+-----+-----+
16 |           |           |           |           | half word
   +-----------+-----------+-----------+-----------+
32 |                       |                       | full word
   +-----------------------+-----------------------+
64 |                                               | double wrd
   +-----------------------------------------------+

In those early days there were computers wherein the memory was addressed decimally. Clearly that mechanism does not fit this memory organisation with its binary word splitting. Therefore IBM applied the binary addressing mechanism only.

IBM discerns two sharply distinct categories of words: the physical words and the logical words. Four of the five word sizes are available in both categories. The nibbles are logical only. The manufacturer determines which physical words are used in the computer's RAMemory. The small computers get 8 bits, whilst the large ones get 64 bits. The medium-sized computers get 16 or 32 bits. This word size determines the width of the data bus between the CPU and the RAM.  Therefore a mixture of physical word sizes is never applied in an individual computer.

Mixing different word sizes

However, such a mixture is possible in the logical way. The programmer can apply all word lengths simultaneously in each program. He does not need to be concerned about the physical word length of the machine he works on. Only the logical word length matters to him.

The 360-hardware projects the desired logical words upon the available physical words. In doing so it takes care of the right alignment of the boundaries, in the way described above. For example, in the small machine a 32-bit word is placed upon four consecutive 8-bit memory words, whilst in the large machine it is placed in the left half or the right half of a 64-bit memory word, but never in the middle of that word.

This construction enables all machines to be used both for data ordering and text processing (that generally use short words) and for scientific calculations (generally using long words). It also enables both data and programs to be transferred easily from the one to another machine. Consequently all machines can be programmed in the same way despite their enormous physical differences. They all are (nearly) fully compatible.

Listed, the word formats and their common applications are:

#bits  log phys   IBM-name     general applic. when logical
-----  --- ----   --------     ----------------------------
  4    yes  NO    nibble       decimal calculus for banks
  8    yes  yes   byte         text processing + data bases
 16    yes  yes   half word    binary integer + CPU-instruction
 32    yes  yes   (full) word  binary integer + binary float
 64    yes  yes   double word  binary floating point
128    yes  NO    extend.word  binary float (in 360/85 and 370)

Thus all machines handle the 8-bit bytes for text characters and for decimal digits. The 4-bit nibbles are for decimal digits too. The binary floating-point numbers are stored in words of 32 or 64 bits. The binary integer numbers stay in words of 16 or 32 bits. The instructions for the CPU are organized in rows of 1, 2 or 3 half-words.

Note that a full word is not the same as a row of two half words, although both have the same length of 32 bits. The full word cannot start on every half-word boundary, whilst the row can. The scheme elucidates this:

   +-----------------------------------------------+
64 |                                               | double wrd
   +-----------------------+-----------------------+
32 |                       |                       | full word
   +-----------+-----------+-----------+-----------+
16 |           |           |           |           | half word
   +-----+-----+-----+-----+-----+-----+-----+-----+
 8 |     |     |     |     |     |     |     |     | byte
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
                                                 _
0 0       ***********************                 \_ impossible
0 0                   ***********************    _/  for both
1 1 ***********************
0 1             ***********************
1 1                         ***********************
0 1                                     ***********************

A A
| |
| '--- legal for row of two half words (= 2 x 16 bits)
'----- legal for full word (= 32 bits)
              1 = legal; 0 = illegal
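
The rule behind this scheme can be stated in one line per case. The sketch below is only an illustration with invented names; it counts the starting position in bytes from the left edge of the memory and prints the same six cases as the scheme, in the same order.

```python
# Illustrative sketch: test whether a 32-bit item may start at a byte offset.

def legal_start(byte_offset, kind):
    """kind is 'full word' or 'half-word row'; the offset is counted in bytes."""
    if kind == "full word":
        return byte_offset % 4 == 0      # integral boundary of 32 bits
    if kind == "half-word row":
        return byte_offset % 2 == 0      # integral boundary of 16 bits
    raise ValueError("unknown kind: " + kind)

# Columns: byte offset, legal for full word, legal for row of two half words.
for offset in (1, 3, 0, 2, 4, 6):
    print(offset,
          int(legal_start(offset, "full word")),
          int(legal_start(offset, "half-word row")))
```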

A row of characters or 'zoned' (= stored in bytes) decimal digits can start at any byte and can have any length. The row has no delimiter. The space allocated to it is stated in every instruction that handles it.

There are two exceptions to this 'binary-split oriented' memory management. They violate the concept of the integral boundaries.
- First: A row of decimal digits stored in nibbles must start at a byte boundary, so half of the nibble boundaries are illegal starting points. This row has no delimiter either.
- Second: The later model 85 and the 370 series deviate further from the memory management scheme. Herein every data item of any size can start at every byte boundary, although this may degrade the performance significantly. Therefore IBM advises keeping the words aligned at the integral boundaries of their size. As in the ordinary 360s, the CPU-instructions must always start at the integral boundaries of the half words.

'Binary' numbers

The computer stores the decimal data in bytes (e.g. digits in EBCDIC) or nibbles. It is able to apply five decimal operators directly on these data. These are addition, subtraction, multiplication, division, and comparison. The computer can also calculate on the numbers that are stored in the binary way, the integer and floating-point numbers. The structure of these numbers is discussed now.

Besides storing decimal digits the nibbles also form the basis for the internal structure of a binary floating-point number. The coefficient in the single-precision number occupies 6 nibbles. The machine calculates with the hexadecimal digits in these nibbles, not with the solitary bits. Consequently the exponent base is (decimal) 16, not 2.  Together this exponent and the coefficient-sign occupy two nibbles (= one byte).

A float number represents a zero value only when all hex-digits in the coefficient are zero. When also both the exponent and the sign have the all-zeroes bit-pattern the number is said to have a true zero.

In a non-zero number at least one of the coefficient's hex-digits is non-zero, i.e. 1 or more. This occurs when at least one of the hex-digit's four bits equals 1.  This is not necessarily its leftmost bit. Thus its three leftmost bits can become zero. Consequently the hidden bit can not be applied in the coefficient. All its bits stay visible.

In the 360-machines the coefficient is always normalized, i.e. its first hex-digit is always non-zero. In the 370-machines unnormalized coefficients can be applied.

Since a hex-digit occupies 'many' (4) bits the normalization process generally requires fewer steps to make the first hex-digit non-zero, thus rendering this process faster or cheaper than in a full-binary machine with the same coefficient length. But the cost is high: the guaranteed accuracy decreases by 3 bits = 0.9 decimal digits. The value range of the whole number is greater than in a comparable full-binary machine since the exponent base is much greater.

The double precision format equals the single precision format with an elongated coefficient. The exponent is not changed. So the value of a double precision number does not differ much from that of its 'corresponding' single precision number. Only the accuracy is better.

The extended precision format equals the double precision format elongated with the coefficient of a second double precision format. The sign bit and the exponent part of that second word stay unused. The exponent is not made longer. So again, the numeric value hardly changes. Only the accuracy increases. This giant number is applied only in the late model 360/85 and in the 370 series.

All three formats have the same exponent. This exponent is written in excess-bias notation, with bias 64.  The coefficient is a fraction. When normalized its (decimal) value ranges from 1/16 until 1.0.  All the number's bit patterns have an ordinary finite numerical meaning. There are no special values like NaN or Infinity. The whole numeric value is calculated by the formula:
        Value = Coefficient * 16 ^ (ExponentInteger - ExponentBias)

The range of values is not symmetrical around the value 1.  The multiplication min*max delivers 16^(-2) = 1/256 =~= 0.0039.

The whole number is written in sign + magnitude notation. So its sign is inverted solely by toggling the sign bit. There are no other things, e.g. a complement notation.

Integral numbers are stored in half words or in full words. For them the two's-complement notation is used to make them negative. There are no integers in double words or longer. In the 360s addresses are unsigned 24-bit numbers. Address displacements are unsigned 12-bit numbers.

'Binary' numeric structures

SINGLE PRECISION FLOAT

This number is stored in a 32-bits full word.

bit 31          =  +/- sign
bit 30 upto 24  =  exponent (base = 16, excess-bias = 64)
bit 23 upto 0   =  coefficient (value from 0.0625 until 1.0)
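
As an illustration, this layout and the value formula can be combined into a small decoding routine. It is a sketch only: the name decode_hex_float is invented, the 32 bits are assumed to be given as an integer, and the bit numbering of this document is used, so bit 31 is the leftmost bit.

```python
# Illustrative sketch: decode a single-precision float of the 360/370.
# Bit numbering as in this document: bit 31 is the leftmost (sign) bit.

def decode_hex_float(word):
    """word: the 32 bits of the number, given as a non-negative int."""
    sign = (word >> 31) & 1
    exponent = (word >> 24) & 0x7F                      # 7 bits, excess-64
    coefficient = (word & 0xFFFFFF) / float(1 << 24)    # fraction below 1.0
    value = coefficient * 16.0 ** (exponent - 64)
    return -value if sign else value

print(decode_hex_float(0x41100000))   # 1.0
print(decode_hex_float(0xC2760000))   # -118.0
```

In the first example the exponent byte 41 (hex) gives 65 - 64 = 1 and the coefficient 100000 (hex) is 1/16, so the value is 1/16 * 16 = 1.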

DOUBLE PRECISION FLOAT

This number is stored in a 64-bits double word. The first half of this word equals the single precision float.

First full word:
 bit 63          =  +/- sign
 bit 62 upto 56  =  exponent (base = 16, excess-bias = 64)
 bit 55 upto 32  =  coefficient, most significant part

Second full word:
 bit 31 upto 0   =  coefficient, least significant part (=tail)

EXTENDED PRECISION FLOAT

This number is applied only in the 360/85 and in the 370.  It consists of a series of two double words, thus occupying 128 bits, of which 8 are unused. The first half of this float equals the double precision float.

First double word:
 bit 63          =  +/- sign
 bit 62 upto 56  =  exponent (base = 16, excess-bias = 64)
 bit 55 upto 0   =  coefficient, most significant part

Second double word:
 bit 63 upto 56  =  unused
 bit 55 upto 0   =  coefficient, least significant part (=tail)

LONG INTEGER

This number is stored in a 32-bits full word.

bit 31         =  +/- sign
bit 30 upto 0  =  value in two's complement

SHORT INTEGER

This number is stored in a 16-bits half word.

bit 15         =  +/- sign
bit 14 upto 0  =  value in two's complement
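
Reading such an integer from its bit pattern can be sketched as follows. The name decode_integer is invented for the example; the pattern is assumed to be given as a non-negative Python integer.

```python
# Illustrative sketch: interpret a bit pattern as a two's-complement integer.

def decode_integer(word, bits=16):
    """bits is 16 for a short integer and 32 for a long integer."""
    mask = (1 << bits) - 1
    word &= mask                      # keep only the lowest 'bits' bits
    if word >> (bits - 1):            # sign bit set: the value is negative
        return word - (1 << bits)
    return word

print(decode_integer(0x7FFF))             # 32767
print(decode_integer(0x8000))             # -32768
print(decode_integer(0xFFFFFFFF, 32))     # -1
```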

Arithmetic data in listing

Machines: IBM-360, IBM-370

Binary bits

data type     #  b i t s       binary normalized absolute value
          total  expon  coeff     minimum      maximum(nearly)
float       32     7     24      16^(-64-1)        16^63
double      64     7     56      16^(-64-1)        16^63
extended   128     7    112      16^(-64-1)        16^63
long int    32     0     31         2^30            2^31
short int   16     0     15         2^14            2^15

Decimal values

data type    decimal normal. abs.value   floats: guar.dec.accur
              minimum         maximum    integers: #digits
float        5.398E-79       7.237E+75             6.0
double       5.398E-79       7.237E+75            15.6
extended     5.398E-79       7.237E+75            32.5
long int     1.074E+9        2.147E+9              9.3
short int    16384 (or _5)   32767 (or _8)         4.5

Unnormalized values

data type    coefficient    minimum absolute non-zero value
            # hex-digits        binary          decimal
float             6           16^(-64-6)       5.148E-85
double           14           16^(-64-14)      1.199E-94
extended         28           16^(-64-28)      1.663E-111
long int        7.75              1                1
short int       3.75              1                1
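
The limits of the floating-point formats in these listings can be recomputed with ordinary Python floats, since all of them fit into the range of a 64-bit binary float. The function names are invented for this sketch.

```python
# Illustrative sketch: recompute the floating-point limits of the listings.

def normalized_range():
    minimum = 16.0 ** (-64 - 1)   # smallest exponent, coefficient 1/16
    maximum = 16.0 ** 63          # largest exponent, coefficient nearly 1.0
    return minimum, maximum

def smallest_unnormalized(hex_digits):
    # Smallest exponent, and only the last hex-digit of the coefficient is 1.
    return 16.0 ** (-64 - hex_digits)

low, high = normalized_range()
print("%.3E  %.3E" % (low, high))     # 5.398E-79  7.237E+75
for hex_digits in (6, 14, 28):
    print(hex_digits, "%.3E" % smallest_unnormalized(hex_digits))
```

The normalized limits are the same for all three formats because they share the same 7-bit exponent; only the unnormalized minimum depends on the length of the coefficient.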

Links

Data formats collected by NASA
G.M. Amdahl, G.A. Blaauw & F.P. Brooks: "Architecture of the IBM System/360"
IBM System/370, Principles of operation
"Computer structures: readings and examples", chapter 43



MEMORY STRUCTURE IN THE
D.E.C.  PDP-11 PEDIGREE

First publication date: 8 march 2009


Sizes of the data units

Around 1970 the company Digital Equipment Corporation (= DEC) gave one of its products, the PDP-11, a memory structure fairly similar to that of the IBM-370.  The length of every data item is a power of 2.  However, the company gave them names different from those used by IBM.  The successors of this successful minicomputer, the VAX (= Virtual Address Extension) and the Alpha, got the same geography in their memory.

The smallest addressable unit is the 8-bits byte. The largest unit has a length of 64 (in PDP-11 and Alpha) or 128 (in VAX) bits. The individual bits in such a unit can be accessed by shifting and masking.

In similarity with the IBM-370 a nibble string must start at a byte boundary. All other data items can start at any byte boundary. However, the computers may perform better when the items are aligned naturally, i.e. start at the boundaries that are integral to their size (see the text about IBM-360+370).

The data units and the names given by Digital to them are:

 LENGTH   DEC-NAME      USAGE
bits bytes

  1       bit           smallest unit of information
  4       nibble        decimal digits written in BCD

  8   1   byte          small integer, boolean, text character
 16   2   (DEC-) word   integer, boolean, CPU-instruction
 32   4   longword      long integer, single float (= F-type)
 64   8   quadword      double float (both D- and G-type)
128  16   octaword      in VAX only; quadruple float (= H-type)

Due to a severe communication error inside the research team that designed the PDP-11 the machines got a weird mixture of big-endianness and little-endianness. As a result the bytes that make up a number are written in the wrong sequence. So in the actual bit-pattern the exponent and the coefficient are broken into pieces that are placed in an intermingled way. The arithmetic processor re-orders these bytes before using the number as input for a calculation. And it shuffles the bytes back into the original order before storing the output number into the memory. This will not be discussed further here.

In the floating-point numbers a hidden-bit notation is applied. This means that the coefficient is normalized always and that its first bit is omitted. This notation is explained thoroughly in the document about the hidden bit notation. In that discussion the bytes are assumed to be re-ordered.
