PROPOSALS FOR VARIABLE-LENGTH FLOATING-POINT DECIMALS

by:   J.H.M. Bonten


First date of publication: 22 March 2007
Small modification: 21 May 2007
Renaming 754r into 754-2008: 23 September 2008



EXTENDED PACKED-DECIMAL

General construction

This text proposes a new numeric data type for computers: the decimal number with floating point and variable length. It is stored in a contiguous sequence of four-bit nibbles. Like every floating-point number it contains a sign, an exponent and a coefficient. This new type, the variable-length floating-point nibble-decimal, is named FLONIB.

Its construction is much simpler than that of the IEEE-754-2008 Packed Decimal Encoding, so it is easier to understand. Also, the series is not forced to have one of a few predefined lengths. It can have any length of two or more nibbles.

The number is given much flexibility by the built-in indicator for the length of the exponent. In addition, a special symbol can make the coefficient shorter than the space available for it. Neither the length of the exponent nor that of the coefficient is fixed 'forever' at the initialisation of the number field. Both can be changed during the run of the program.

The disadvantage is that each digit occupies one whole nibble. Since a compression method like the Densely Packed Decimal is not applied, some memory space is wasted. Also, fewer digits can be transported via the computer's data bus in one memory cycle.

Luckily the length indicators for the coefficient and the exponent do not occupy useful space. They occupy space that would be wasted otherwise. Hence the density of the data equals that of the signed series with four-bit digits used in Cobol. The type Flonib can be seen as an extension of that type.

The first (= leftmost) nibble is called the opening nibble. It contains the obligatory +/- sign and some other data. The other nibbles form a sequence of decimal digits, a part of it being the exponent and the remainder being the coefficient.

This sequence is terminated either by the right edge of the memory space allocated to the number or by a nibble containing a special non-digit value. When present, that nibble is the last (= rightmost) nibble of the series. It is called the closing nibble.

Thus the total construction of such a number becomes:

 opening-nibble      digit-nibbles      optional closing-nibble

The contents of the nibbles are described in detail below.

Digit nibbles

Each digit-nibble contains one decimal digit, written in Binary Coded Decimal (= BCD).

  nibble value
hex    (decimal)     meaning
----------------     -------
 0        (0)        digit 0
 1        (1)        digit 1
 .         .            .
 .         .            .
 9        (9)        digit 9

 A       (10)        [forbidden]
 B       (11)        [forbidden]
 C       (12)        [forbidden]
 D       (13)        [forbidden]
 E       (14)        [forbidden]
 F       (15)        [forbidden]

The series of digits is interpreted as an exponent followed by a coefficient. The exponent takes zero to five digits; how this length is determined is explained later. The other digits belong to the coefficient. The length of the coefficient is virtually unlimited, but it must be at least one digit, otherwise the number is in error.

The exponent is written in excess-bias notation. The actual exponent value is obtained by subtracting the bias 5000... from it, that is, a 5 followed by zeroes up to the same number of digits as the exponent. Thus the actual exponent value ranges from -5000... to +4999...
Examples:
    When the exponent sequence equals 6384 then this means +1384.
    When the exponent sequence equals 2384 then this means -2616.
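
The subtraction can be illustrated with a short Python sketch. The function name is hypothetical and chosen only for this illustration; the rule itself is the excess-bias rule stated above.

```python
def actual_exponent(exponent_digits: str) -> int:
    """Convert an excess-bias exponent digit string to its signed value."""
    if not exponent_digits:
        return 0  # no exponent digits: the number is the bare coefficient
    # The bias is a 5 followed by zeroes, as long as the exponent itself.
    bias = 5 * 10 ** (len(exponent_digits) - 1)
    return int(exponent_digits) - bias


print(actual_exponent("6384"))  # 1384
print(actual_exponent("2384"))  # -2616
```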

The first of the coefficient digits is assumed to be before the decimal point and the others are after it. So the value of the coefficient ranges from 0.0000... to 9.9999...   Normalization is not obligatory.

Closing nibble

The closing nibble contains the closing symbol: a bit pattern of '1010' or higher. This symbol also tells whether the digit sequence holds a meaningful number. If it does not, the symbol denotes a special value, and the whole number is taken to have that value.

  nibble value
hex    (decimal)     meaning
----------------     -------
 0        (0)        [forbidden]
 1        (1)        [forbidden]
 .         .             .
 .         .             .
 8        (8)        [forbidden]
 9        (9)        [forbidden]

 A       (10)        Infinity  (+/- sign is in the opening nibble)
 B       (11)        NaN, quiet
 C       (12)        NaN, signaling
 D       (13)        [ free for later definition ]
 E       (14)        [ free for later definition ]
 F       (15)        number is meaningful

When the closing nibble is absent, the last digit nibble can be overwritten with the symbol for Infinity or NaN.

When the closing nibble stands so far to the left that no digit remains for the coefficient, the whole number is always taken to be a signaling NaN, irrespective of the contents of this nibble and of the series of digits remaining for the exponent. The coefficient must have at least one digit for the number to have an ordinary value.

Two special operators handle the closing nibble. In effect they read and set the total length of the series of digits. The reading operator counts the digits up to the closing nibble or the end of the memory allocated to the numeric field. The setting operator puts a closing nibble with bit pattern '1111' to the right of the desired series of digits. When that position lies beyond the right edge of the allocated memory, nothing is put. By a Boolean output value the setting operator tells whether it has actually placed the nibble.
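
A minimal Python sketch of these two operators follows. It assumes, for illustration only, that a field is held as a list of nibble values (0 to 15) with the opening nibble at index 0; the function names are hypothetical.

```python
CLOSING_MEANINGFUL = 0xF  # bit pattern '1111'


def read_digit_count(field):
    """Count the digit nibbles up to a closing nibble or the end of the field."""
    count = 0
    for nibble in field[1:]:      # skip the opening nibble
        if nibble > 9:            # any non-digit value closes the series
            break
        count += 1
    return count


def set_digit_count(field, digits):
    """Put a '1111' closing nibble right after `digits` digit nibbles.

    Returns True if the nibble was placed, False if its position would
    fall beyond the right edge of the field.
    """
    if digits < 0:
        raise ValueError("digit count cannot be negative")
    position = 1 + digits         # index just right of the digit series
    if position >= len(field):
        return False              # nothing is put
    field[position] = CLOSING_MEANINGFUL
    return True
```

For a field of eight nibbles holding seven digits, `set_digit_count(field, 4)` shortens the series to four digits and returns True, while `set_digit_count(field, 7)` leaves the field untouched and returns False.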

Opening nibble

The opening nibble contains the sign of the number and also the length of the exponent. The sign is in its first bit: 0 = '+' and 1 = '-'.  The other three bits tell the number of digits in the digit sequence that are used for the exponent.

octal pattern
  value         meaning
-------------   -------
    0           exponent is absent, so the absolute value of
                   the whole number ranges from 0 to 9.99..
    1           exponent contains 1 digit,
                   so its actual value ranges from -5 to +4.
    2           exponent contains 2 digits  (value -50 to +49)
    3              ..       ..    3   ..    (from -500 to +499)
    4              ..       ..    4   ..    (-5000 .. +4999)
    5           exponent contains 5 digits  (-50000 .. +49999)
    6           exponent is absent; whole number is integral.
                   The coefficient is interpreted as an
                   integer without a decimal point in it. So
                   the number's value ranges from 0 to 9999...
    7           when sign is positive:
                   free for later definition.
                when sign is negative:
                   this nibble is not the beginning of a
                   number.

Remark: Both in the opening nibble and in the closing nibble the bit pattern '1111' (= hex F, decimal 15) is used as a 'Field terminator', similar to the Field separator in the Nibble-Edited type which is described below.

The three bits that determine the exponent length are never changed by any numerical assignment operator. These operators only read the bits in order to know which digits belong to the exponent and which ones to the coefficient. Thus the length of the exponent in the receiving field is preserved.

Special operators enable the user/programmer to read and change the three bits. When these bits are changed the length of the exponent part in the series of digits is changed. But the contents of all other nibbles are not changed, neither the digits nor the closing nibble. So in general the numeric value the field stands for becomes very different. It has virtually no relation to the value it stood for previously.

Examples:
  '0 000' 6 2 3 4 5 6 7 '1111'   means   6.234567
  '0 001' 6 2 3 4 5 6 7 '1111'   means   2.34567 * 10^1
  '0 010' 6 2 3 4 5 6 7 '1111'   means   3.4567 * 10^12
  '0 011' 6 2 3 4 5 6 7 '1111'   means   4.567 * 10^123
  '0 100' 6 2 3 4 5 6 7 '1111'   means   5.67 * 10^1234
  '0 101' 6 2 3 4 5 6 7 '1111'   means   6.7 * 10^12345
  '0 110' 1 2 3 4 5 6 7 '1111'   means   1234567
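
The examples above can be reproduced with a small decoder. The following Python sketch is illustrative only: the function name and the list-of-nibbles representation are assumptions, and special values are mapped onto Python's `decimal.Decimal` infinities and NaNs for convenience. A field without any room for a coefficient digit is treated as a signaling NaN, also when no closing nibble is present, which is an interpretation of the rules above.

```python
from decimal import Decimal


def decode_flonib(nibbles):
    """Decode a Flonib field given as a list of nibble values (0..15)."""
    opening = nibbles[0]
    sign = 1 if opening & 0b1000 else 0       # first bit: 0 = '+', 1 = '-'
    code = opening & 0b0111                   # the other three bits
    if code == 7:
        raise ValueError("opening code 7 does not start a defined number")

    # Collect digit nibbles up to a closing nibble or the end of the field.
    digits, closing = [], None
    for nibble in nibbles[1:]:
        if nibble > 9:
            closing = nibble
            break
        digits.append(nibble)

    exponent_length = code if code <= 5 else 0    # codes 0 and 6: no exponent
    coefficient = digits[exponent_length:]
    if not coefficient:
        return Decimal("sNaN")                    # no coefficient digit left
    if closing == 0xA:
        return Decimal("-Infinity" if sign else "Infinity")
    if closing == 0xB:
        return Decimal("NaN")
    if closing == 0xC:
        return Decimal("sNaN")
    if closing in (0xD, 0xE):
        raise ValueError("closing values 13 and 14 are not defined yet")

    if code == 6:
        scale = 0                                 # integral: no decimal point
    else:
        exponent_text = "".join(str(d) for d in digits[:exponent_length])
        bias = 5 * 10 ** (exponent_length - 1) if exponent_length else 0
        exponent = int(exponent_text or "0") - bias
        # One coefficient digit stands before the decimal point.
        scale = exponent - (len(coefficient) - 1)
    return Decimal((sign, tuple(coefficient), scale))


digits = [6, 2, 3, 4, 5, 6, 7, 0xF]
print(decode_flonib([0b0000] + digits))  # 6.234567
print(decode_flonib([0b0010] + digits))  # 3.4567E+12
```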

Assignment operation

There are two assignment operators: the full-space assigner and the confined assigner. These give different results when the sending numeric field contains an ordinary value and the receiving field contains a non-digit nibble (= closing nibble) in the second nibble or further to the right.

The full-space assigner uses all nibbles in the memory space of the receiver to store the number. The closing nibble is not looked at. It is simply overwritten by an ordinary digit (or a padding zero).

The confined assigner searches for the closing nibble in the receiving field. If this is present the assigner forces the coefficient to fit between the exponent and this nibble. The nibble itself keeps its place, although its contents may be modified. In most cases it will become '1111' (= hex F).

When present, the closing nibble in the sending field is always taken into account by both assigners. When the receiving field offers fewer coefficient digits than the sending field holds, the coefficient is rounded according to the mode set by the most recently executed rounding-mode command. If the receiving coefficient is longer it will be padded with zeroes.

When the sending field represents a special value (either by a closing nibble with that value or by a shortage of digits) the receiving field will get a closing nibble with the same value. If the receiving field has no closing nibble it will get one instead of its last digit.
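
The fitting of a coefficient into the room offered by the receiver can be sketched as below. This is only an illustration under stated assumptions: the function name is hypothetical, the coefficient is a list of decimal digits, and the rounding mode is fixed to round-half-up, whereas the proposal leaves the mode to the rounding-mode command. When rounding carries out of the first digit, the caller must raise the exponent by one.

```python
def fit_coefficient(digits, room):
    """Fit a coefficient (list of decimal digits) into `room` digits.

    Returns (new_digits, exponent_adjustment).
    """
    if room < 1:
        raise ValueError("the coefficient needs at least one digit")
    if room >= len(digits):
        # Receiver is longer or equal: pad with zeroes.
        return digits + [0] * (room - len(digits)), 0
    kept = digits[:room]
    if digits[room] >= 5:                  # round half up on first dropped digit
        i = room - 1
        while i >= 0 and kept[i] == 9:     # propagate the carry leftwards
            kept[i] = 0
            i -= 1
        if i >= 0:
            kept[i] += 1
        else:
            # All kept digits were 9: 9.99.. becomes 1.00.. times ten.
            return [1] + kept[:-1], 1
    return kept, 0
```

For example, fitting 2.34567 into three digits gives 2.35 with no adjustment, while fitting 9.96 into two digits gives 1.0 with the exponent raised by one.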

By analogy with the fixed-length decimals, the variable-length decimals can also be used as fixed-point numbers. This requires a third assigner that only reads the exponent of the receiving number and does not change it.

Definitions

Here the definitions of these floating-point variable-length nibble numbers are proposed for several programming languages.

in Cobol

In Cobol the definition may look like:

     level-nr  field-name   FLONIB   fl, el, cl.

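Herein:

FL is the total length of the field in nibbles, EL is the length of the exponent in digits and CL is the length of the coefficient in digits.

EL and CL are initial values. They can be changed during the run of the program by the special statement
         SETFLONIB  field-name,  EL,  CL.
FL is fixed 'forever' and so cannot be changed at a later time.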

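The actual length of the coefficient (= 'ACL') becomes the minimum of CL and FL-EL-1, where the 1 stands for the opening nibble.   Thus:  ACL = min (CL, FL-EL-1).  When CL is smaller than FL-EL-1 the coefficient is followed by a closing nibble; otherwise the coefficient runs up to the right edge of the field and no closing nibble is placed.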
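
The formula can be tried out with a few lines of Python. The function name and the extra Boolean result are assumptions made for this illustration; the Boolean mirrors the output of the closing-nibble setting operator described earlier.

```python
def actual_coefficient_length(fl, el, cl):
    """Return (ACL, closing_nibble_placed) for the given lengths.

    fl: field length in nibbles, el: exponent digits, cl: coefficient digits.
    """
    room = fl - el - 1       # nibbles left after opening nibble and exponent
    acl = min(cl, room)
    return acl, cl < room    # a closing nibble fits only if room is left over


print(actual_coefficient_length(10, 2, 4))  # (4, True)
print(actual_coefficient_length(10, 2, 9))  # (7, False)
```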

in Fortran

A nibble string should be made in a way similar to a character string, thus with an invisibly built-in string descriptor. An example of applying such a string descriptor is:

    CHARACTER*7 OrdinaryText      ! well known from Fortran-77
    NIBBLE*10 SimpleNibblesText   ! any contents are allowed
    FLONIB*10 FloatNibbleNumber   ! contents must be a number
    LOGICAL CloNib    ! Closing nibble is put in the string?
       :
       :
    CloNib = SETFLONIB (FloatNibbleNumber, EL, CL)
               !  This intrinsic function sets the lengths
               !  of both the exponent and the coefficient.
               !  The appearance and meaning of the number
               !  match exactly those in Cobol.

The function Setflonib can be called as often as one likes. The field length of a flonib-number is the length written in the string descriptor. So in this example the field length of FloatNibbleNumber is 10.

in C/C++

In C/C++ a character string is seen as an ordinary array row of characters. There is no built-in indicator for the row length. So the user/programmer must constantly take care to pass this length to the subprograms via a separate argument whenever the array row is passed.

To avoid such error-prone clumsiness a Flonib number should be a structure consisting of an integral value and the array row of nibbles. The contents and meaning of the array row equal those in Fortran and Cobol. The integral value is the length of the memory space allocated for this row. In C++ this structure can be protected from undesired intrusion by the outside world by defining it as an object class.

Note that the bit pattern '1111' works in a way similar to the ASCII-null byte '00000000' in a character string. It finishes the string before the end of the array row is reached (and perhaps surpassed).

in some other languages

In some programming languages a character string can have any length, and this length can vary during the run of the program. The same should be possible for a nibble string. In this case the closing nibble must always be present, otherwise the coefficient would become nearly infinitely long.



'TEXT'-STRING "NIBBLE-EDITED"

Scientific text notation compressed

The existence of a variable-length numeric type in nibbles suggests the design of a variable-length numeric-text type in nibbles too. Generally a decimal number written as text in scientific notation contains some punctuation marks. Therefore it is written in a series of bytes. But it can be compressed into nibbles, and so can the non-digit symbols. Moreover, two or more numbers can be chained together into one long nibble string, even when they have different lengths. Compared to ASCII bytes this compression saves a lot of space in memory and on disk, and a lot of transfer time over the data buses and lines.

A very simple program can easily translate the string of nibbles into the humanly readable text of bytes and vice versa. It should work according to this table:

----- nibble value -----    --- meaning in the numeric text ---
bit pattern   hexadecimal   numeric meaning     ASCII-byte code

  0000           0                0                     0
  0001           1                1                     1
   :             :                :                     :
   :             :                :                     :
  1001           9                9                     9
  1010           A         Area separator           USA:. Eur:,
  1011           B         Block separator   blank, USA:, Eur:.
  1100           C         Credit sign                  +
  1101           D         Debit sign                   -
  1110           E         Exponent separator           E
  1111           F         Field separator              ;

The Area separator sets the areas for the integral part and the fractional part in the coefficient apart. The Field separator finishes the whole number and thus can separate consecutive numeric fields. Its meaning is similar to that of the closing nibble in the Flonib numbers, and thus somewhat similar to the Null-byte at the end of a text in C/C++.  The other four non-digit symbols are easy to understand.

For example: The two strings of numbers written in byte symbols are matched exactly by the third string written in hexadecimal symbols:

   USA-bytes:  -1,395,153.27E-3;2.76594E+19;59;+35 278 431.2;
  Euro-bytes:  -1.395.153,27E-3;2,76594E+19;59;+35 278 431,2;
 hexadecimal:  D1B395B153A27ED3F2A76594EC19F59FC35B278B431A2F
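
The translation table can be turned into a short Python sketch. The function names and parameters are hypothetical. Since the Block separator stands for both a blank and a grouping mark, the decoder cannot know which one was written originally; here the caller chooses the symbol, with the blank as an assumed default.

```python
DIGITS = "0123456789"


def encode_nibble_text(text, european=False):
    """Translate numeric byte text into a string of hexadecimal nibble symbols."""
    area, block = (",", ".") if european else (".", ",")
    symbols = {area: "A", block: "B", " ": "B",
               "+": "C", "-": "D", "E": "E", ";": "F"}
    out = []
    for ch in text:
        if ch in DIGITS:
            out.append(ch)
        elif ch in symbols:
            out.append(symbols[ch])
        else:
            raise ValueError(f"character {ch!r} has no nibble code")
    return "".join(out)


def decode_nibble_text(nibbles, european=False, block=" "):
    """Translate hexadecimal nibble symbols back into readable numeric text."""
    area = "," if european else "."
    symbols = {"A": area, "B": block, "C": "+", "D": "-", "E": "E", "F": ";"}
    out = []
    for ch in nibbles.upper():
        if ch in DIGITS:
            out.append(ch)
        elif ch in symbols:
            out.append(symbols[ch])
        else:
            raise ValueError(f"{ch!r} is not a hexadecimal nibble symbol")
    return "".join(out)


usa = "-1,395,153.27E-3;2.76594E+19;59;+35 278 431.2;"
packed = encode_nibble_text(usa)
print(packed)                      # D1B395B153A27ED3F2A76594EC19F59FC35B278B431A2F
print(decode_nibble_text(packed))  # -1 395 153.27E-3;2.76594E+19;59;+35 278 431.2;
```

The 46 characters of the example occupy 46 nibbles, so they fit into 23 bytes when two nibbles are stored per byte.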

The storage of numeric text data in nibbles instead of bytes can save up to 50 percent of storage space and transfer time, although in practice it will be less since a database also contains a lot of ordinary text, e.g. names and addresses.

The programming languages should get facilities to translate a series of numeric values into such a numeric nibble-text and vice versa. In Cobol the nibble text should be named 'Nibble Edited' in analogy to 'Display Edited'.  The facilities for the mathematical languages like Fortran and C/C++ to interpret and create these nibble strings should be similar to the format specifiers in the Read and Write/Print statements.
