## PROPOSALS FOR THE COBOL NUMERICS

### by:   J.H.M. Bonten

First date of publication: 29 january 2007
Last date of modification: 25 june 2007
Renaming 754r into 754-2008: 23 september 2008
Correction of Round-to-ceiling example: 29 january 2009
Addition of body-mass-index example: 14 march 2009
Enhancement of Einstein example: 29 june 2009
Correction of Concept-2 constants: 18 september 2010

### Contents:

The four chapters in this document are fairly independent. They all suggest some improvements in the notation of numeric values and expressions in the language Cobol. The first one argues for the use of the PDE numbers and proposes a way of specifying them. The second chapter looks at the ways to store a number as a series of half-bytes ('nibbles'). The third chapter revises the notation of several arithmetic operations. And the fourth chapter revises the commands for the rounding of values.


### Mental compatibility of PDE-decimals

For nearly 50 years Cobol has handled numeric data of a category that many other languages like Fortran and C/C++ do not have. It is the well-known Cobol way of data storage, which can be seen as a distinct numerical category. In reference to its specifier USAGE IS DISPLAY let us call this category DISPLAY. It is designed especially for accounting and administrative applications.

This category has a text-oriented way of storage and it enables the overlay of data. This means that two or more numbers can share the same memory area. This is in contrast to the binary and decimal categories that are computer-word oriented and do not facilitate memory overlays. Also it has its own rules for the arithmetic operators. The main advantage of this category is that the storage of a number corresponds always to the mental image about that number in the mind of most humans. It is said to be compatible with that image.

A number in this DISPLAY category consists of a series of digits written in text bytes, often with a virtual decimal point in between. The numbers can have any length in digits from 1 up to 32 (18 in older eras), and the virtual point can stand at any location in this series. Also imaginary zeroes can be added. Thus over 1000 different structures are available to store the numeric values. Each individual structure is described by a PICTURE specification, e.g. PIC 999V99.

In contrast to this, each of the word-oriented categories offers fewer than ten different structures. All numbers are forced to fit into these few formats. This may seem a rigid bondage to the obdurate Cobol user. He has less choice in the layout of the number. But there is more that supports his aversion to these numeric categories: the supposed incompatibility with the mental image.

Up to now non-display data have always been non-decimal (e.g. binary) data. The way a numeric value is stored favours the (hardware) structure of the computer and thus increases the speed of the calculations enormously. Alas, it is totally incompatible with the image of that value in the human mind. The translation from decimal to non-decimal (binary) messes everything up. The average user who tries to grasp the contents of the number goes totally astray and gets confused. The digits appear to be totally shuffled and intertwined. Therefore he cannot read the contents directly by an overlay with a text field. For him the number is no more than a black box, and hopefully the right digits are inside. He will experience that only the DISPLAY data are easy to grasp for his human mind.

Nevertheless the Cobol programmer should get accustomed to the 'weird' categories of the non-display numbers since these have three advantages: first, the computer needs less memory to store the numeric values in these 'obscure' formats; second, the computer calculates faster with them; and third, these formats enable program segments written in different languages to exchange numerical data very easily and without the need for any conversion. Thus they can make Cobol very versatile. For example, Cobol can communicate easily with Fortran and C/C++.

The DISPLAY numbers are not really friendly to the computer. They are slow, consume much memory space and they are not compatible with the numbers of other languages. Therefore the Cobol programmer must sometimes use the computer-word oriented numbers. Luckily for him today the salvation is at hand. Now there are categories of numbers that are fairly friendly to the computer and still are compatible with the image in the human mind. Like the binary numbers these numbers are word-oriented non-display numbers. They are the decimal numbers of IEEE-754-2008 which are encoded in Packed Decimal Encoding (PDE).

These are no more than ordinary sequences of decimal digits stored in a compacted way. Even the decimal point is present in such a number (by the exponent in it). Although the number is squeezed into a confined space the user can still manage it mentally in the ordinary way. He does not need to curl his mind like a snake into weird thoughts to grasp the contents of this number. His ordinary way of thinking in a series of digits with a point in between (like the DISPLAY data) remains valid.

The user only needs to accept that he cannot look inside the number directly by overlaying a text string with the REDEFINES clause, and also that the length of the series of digits in this number is predefined. He cannot shorten or elongate this series. There are three (or four) such series available: with 34 or 16 or 7 (or 4) digits. And only from this set of three (or four) lengths he can select. There is no other choice. But the user can place the decimal point on every location in this series wherever he likes, and even outside this series. In some cases he can even shift this point to another place and thus elongate or shorten the fractional part of the number.

Thus the IEEE-754-2008 numbers stay mentally compatible like the 'old' DISPLAY data. Yet they are still fairly friendly to the computer. It is the compatibility with the human mind that enables the easy use of these numbers. Therefore all three (or four) lengths should be implemented in Cobol, not only the longest one of 128 bits as some language designers propose.

So today the fear of word-oriented numbers does not make much sense anymore. The decimal numbers may also ease the leap to the binary word-oriented numbers that are sometimes obligatory. Nevertheless the binary numbers stay incompatible with the human mind and are therefore, of course, not really attractive or acceptable to many 'Cobolisti' (or 'Cobolistas'?).

The clear definition of the new 'peculiar' formats makes the USAGE specifier superfluous. When a field format is specified with a PIC(-TURE) then it is automatically a DISPLAY field. When it is specified with a computer-word oriented format then it is automatically a COMPUTATIONAL field. (An INDEX field corresponds to the pointers used for arrays in C/C++.)

### Some examples

To ease the switch for Cobol people to the compact decimal non-display numbers two 'new'-Cobol examples with some data-field specifiers are given here. The first one is:

```
WORKING-STORAGE SECTION.
01 NEW-YORK.
   02 CENTER.
      03 BROOKLYN     PIC A(3)X(12).
      03 HARLEM       PIC 999V99.
      03 MANHATTAN    PIC AA9.
   02 PRECINCT.
      03 NEWARK       PIC 999.
      03 WEST-END     UNSIGNED-SHORT.
01 NETHERLANDS.
   02 THE-HAGUE    BINARY-INTEGER-BYTE.
   02 AMSTERDAM    BININT-Y.
   02 ROTTERDAM    BINFLOT-M.
77 PARIS       DECFLOT-S.
77 ROME        DECFIX-L     FRACTION = 6.
77 PRAGUE      DECFIX-XS    FRACTION = 3.
```

The length of the fraction of Rome is 6.  This means that when this number is unfolded into display text-bytes then six digits will be at the right side of the decimal point.

Similarly Prague will show three digits at the right of the decimal point. The total length of Prague is four digits since every decimal-XS contains four digits (see the proposal about the unsigned PDE-format). The programmer needs to know this fact. So Prague has one digit at the left of the decimal point.

Second example: The following specifications show that the decimal point does not need to be inside the series of digits. It can also be outside.

```
01 EGYPT.
   02 ABU-SIMBEL  DECFIX-S   FRACTION = -2.
   02 GIZA        DECFIX-S   FRACTION = -1.
   02 ALEXANDRIA  DECFIX-S   FRACTION = 0.
   02 PORT-SAID   DECFIX-S   FRACTION = 1.
   02 SUEZ        DECFIX-S   FRACTION = 2.
   ...
   02 CAIRO       DECFIX-S   FRACTION = 6.
   02 ASWAN       DECFIX-S   FRACTION = 7.
   02 EL-AMARNA   DECFIX-S   FRACTION = 8.
   02 KARNAK      DECFIX-S   FRACTION = 9.
```

In the numbers from Port-Said up to Cairo the point is inside the series of digits. In the number Alexandria it is at the right edge of the series. In the number Aswan it is at the left edge. In the other four numbers it is intended to be outside the series.

When translated into a display specification the numbers would become the fields:

```
   02 ABU-SIMBEL    PIC s9999999PPV
   02 GIZA          PIC s9999999PV
   02 ALEXANDRIA    PIC s9999999V  (= PIC s9999999)
   02 PORT-SAID     PIC s999999V9
   02 SUEZ          PIC s99999V99
   ...
   02 CAIRO         PIC s9V999999
   02 ASWAN         PIC sV9999999
   02 EL-AMARNA     PIC sVP9999999
   02 KARNAK        PIC sVPP9999999
```
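The translation from a FRACTION value to a display picture follows a simple rule, which can be sketched in Python. The function name `display_picture` and its parameters are assumptions made for this illustration only; they are not part of the proposal.

```python
def display_picture(digits: int, fraction: int, signed: bool = True) -> str:
    """Return the display picture that matches a decfix field.

    digits   -- number of coefficient digits (7 for -S, 4 for -XS, ...)
    fraction -- value of the FRACTION specifier
    """
    sign = "s" if signed else ""
    if fraction < 0:
        # Point lies to the right of the digits: pad with P before the V.
        return f"{sign}{'9' * digits}{'P' * -fraction}V"
    if fraction > digits:
        # Point lies to the left of the digits: pad with P after the V.
        return f"{sign}V{'P' * (fraction - digits)}{'9' * digits}"
    # Point lies inside the series or exactly on one of its edges.
    return f"{sign}{'9' * (digits - fraction)}V{'9' * fraction}"


for name, fraction in [("ABU-SIMBEL", -2), ("SUEZ", 2), ("ASWAN", 7)]:
    print(f"02 {name:<12} PIC {display_picture(7, fraction)}")
```

A fraction length that exceeds the number of digits produces the leading P symbols, so a seven-digit field with a fraction length of 9 gets the picture sVPP9999999.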

The following, third, example with unsigned PDE is not fully valid:

```
01 ITALY.
   02 PALERMO     DECFIX-XS   FRACTION = -1.
   02 VENICE      DECFIX-XS   FRACTION = 0.
   02 GENUA       DECFIX-XS   FRACTION = 1.
   02 ROME        DECFIX-XS   FRACTION = 2.
   02 SIENA       DECFIX-XS   FRACTION = 3.
   02 MILANO      DECFIX-XS   FRACTION = 4.
   02 BOLOGNA     DECFIX-XS   FRACTION = 5.
   02 FLORENCE    DECFIX-XS   FRACTION = 6.
```

The middle six numbers, Venice to Bologna, are correct. When translated into a display specification they would be:

```
   02 VENICE      PIC 9999V  (= PIC 9999)
   02 GENUA       PIC 999V9
   02 ROME        PIC 99V99
   02 SIENA       PIC 9V999
   02 MILANO      PIC V9999
   02 BOLOGNA     PIC VP9999
```

The numbers Palermo and Florence cause an error since the range of the built-in exponent is too small to contain their fraction lengths.

The programmer must know beforehand that in a decimal number (both decflot and decfix) the number of digits and the range of the fraction lengths are:

```
length   #digits  +/-sign   fraction-length range
------   -------  -------   ---------------------
 -XS        4     signed           +2 , 4    (see proposal)
 -XS        4     NO sign           0 , 5    (see proposal)
 -S         7     signed          -90 , 101
 -M        16     signed         -369 , 398
 -L        34     signed        -6111 , 6176
```
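A compiler could check every FRACTION specifier against this table. The following Python sketch shows such a check; the names `FRACTION_RANGE` and `fraction_is_valid` are hypothetical, and the numbers are copied from the table above.

```python
# (length suffix, signed?) -> (digits, lowest fraction, highest fraction)
FRACTION_RANGE = {
    ("XS", True): (4, 2, 4),
    ("XS", False): (4, 0, 5),
    ("S", True): (7, -90, 101),
    ("M", True): (16, -369, 398),
    ("L", True): (34, -6111, 6176),
}


def fraction_is_valid(length: str, fraction: int, signed: bool = True) -> bool:
    """Tell whether a FRACTION value fits the exponent range of the format."""
    _digits, lowest, highest = FRACTION_RANGE[(length, signed)]
    return lowest <= fraction <= highest


# The unsigned -XS fields of the Italy example, Palermo (-1) to Florence (6).
for fraction in range(-1, 7):
    verdict = "ok" if fraction_is_valid("XS", fraction, signed=False) else "error"
    print(f"DECFIX-XS FRACTION = {fraction:2d}: {verdict}")
```

Only the fraction lengths -1 and 6 are rejected, which are the values given to Palermo and Florence.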

In a decfix number the programmer must give a location for the decimal point, which he does by the fraction specifier. In a decflot number the computer does this by itself at every arithmetic operation.

### Integral decimals

When used in fixed mode and with the right exponent value the PDE-numbers can be applied as integer numbers, the decimal integers. In this case the exponent value should be chosen such that the decimal point is immediately at the right of the last coefficient digit. Then the length of the fraction is exactly zero. For this the value of the exponent should be:

```
length   #digits  +/-sign   exponent value
------   -------  -------   --------------
 -XS        4     signed     not possible   (see proposal)
 -XS        4     NO sign          3        (see proposal)
 -S         7     signed           6
 -M        16     signed          15
 -L        34     signed          33
```


### Packed Decimal

Besides the Packed Decimal Encoding an older system of digit compression exists. It is called Packed Decimal (without the word Encoding). It consists of a sequence of digits coded in BCD (= Binary Coded Decimal) and stored in four-bit units called nibbles. Two such nibbles fit in one byte. The +/- sign, if present, always occupies a separate nibble. The range of values that can be stored in this numeric sequence is much larger than in a Display sequence with the same number of bytes, but it is smaller than in the Packed Decimal Encoding (= PDE).

The fully binary storage of the numeric value can stand for approximately the same number of digits as the PDE, although conversion errors between binary and decimal may occur. The computer calculates faster and the guaranteed accuracy is slightly higher.

The following table shows the number of decimal digits that fit in a series of bytes for all three ways of decimal storage. It also shows the approximate guaranteed accuracy in decimals of the binary numbers. The guaranteed accuracy of the decimal storage is always one less than the number of digits listed here. Example: the guaranteed accuracy of the 16-byte Display standard storage is 15 digits.

```
           length in bytes ->      2       4       8       16
                                  -------------------------------
type of storage:                  ---- number of digits -----

Display (Standard in Cobol)        2       4       8       16
        (with separate sign)       1       3       7       15

Packed Decimal (BCD, unsigned)     4       8      16       32
               (BCD, with sign)    3       7      15       31

Packed Decimal Encoding            4       7      16       34
(IEEE-754-2008, with exponent)

Binary with hidden bit            --- guaranteed accuracy ----
(IEEE-754, with exponent)         3.0     6.9    15.6     33.7
```
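The Display and Packed Decimal rows follow directly from the byte count, as the Python sketch below shows. The binary row appears to match (p - 1) * log10(2) cut off after one decimal, with p the width of the significand including the hidden bit; that reading is an assumption of this sketch and is not stated in the text. All function names are hypothetical.

```python
import math


def display_digits(n_bytes: int, separate_sign: bool = False) -> int:
    """One digit per text byte; a separate sign costs one byte."""
    return n_bytes - 1 if separate_sign else n_bytes


def packed_digits(n_bytes: int, signed: bool = False) -> int:
    """Two nibbles per byte; a sign costs one nibble."""
    return 2 * n_bytes - (1 if signed else 0)


# Significand width in bits, hidden bit included, of the IEEE-754 binary formats.
BINARY_PRECISION = {2: 11, 4: 24, 8: 53, 16: 113}


def binary_guaranteed_digits(n_bytes: int) -> float:
    """Assumed formula: (p - 1) * log10(2), cut off after one decimal."""
    p = BINARY_PRECISION[n_bytes]
    return math.floor((p - 1) * math.log10(2) * 10) / 10


for n in (2, 4, 8, 16):
    print(n, display_digits(n), packed_digits(n), binary_guaranteed_digits(n))
```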

This table is the basis for the proposal to use the guaranteed accuracy as a suffix in the type names for numeric values instead of the general word-length suffix. This is described in the Proposal for Universal names for data-types.

To describe the actual structure of the digits sequence in the series of nibbles a kind of Picture specifier is required similar to that for the Display data. Let us call this the Nibble-Picture, shorthand NibPic. Again the cryptic clause USAGE IS COMPUTATIONAL can be omitted.

In most computers the series of nibbles is stored in a series of bytes. Therefore sometimes one nibble stays unused. If so, it is always the first or the last nibble. Example:

```
01 FRANCE.
   02 LYON          NIBBLE-PICTURE  S999V99.
   02 DIJON         NIBPIC  S999V99.
   02 PARIS         NIBPIC  99V99.
   02 MARSEILLE     NIBPIC  U9(11)V9(6).
   02 TOULOUSE      NIBPIC  S9(8)U.
```

In this table Lyon and Dijon have equal structure. Both are stored in a series of three bytes. Paris is unsigned and can be stored in two bytes. Marseille is unsigned too. Its 17 digits need 9 bytes. The leftmost nibble stays unused. The signed number Toulouse is stored in five bytes with the rightmost nibble unused.
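The byte counts follow from one rule: count the digit nibbles, add one nibble for a sign, and round up to a whole number of bytes. A minimal Python sketch, in which the function name `nibpic_storage` is a hypothetical helper:

```python
def nibpic_storage(digits: int, signed: bool) -> tuple[int, int]:
    """Return (bytes needed, unused nibbles) for a nibble-picture field."""
    nibbles = digits + (1 if signed else 0)  # a sign occupies its own nibble
    n_bytes = (nibbles + 1) // 2             # two nibbles per byte, rounded up
    return n_bytes, 2 * n_bytes - nibbles


print(nibpic_storage(5, signed=True))   # LYON and DIJON, S999V99
print(nibpic_storage(4, signed=False))  # PARIS, 99V99
print(nibpic_storage(8, signed=True))   # TOULOUSE, S9(8)U
```

An odd number of nibbles always leaves exactly one nibble unused, which the symbol U in the picture places at the head or at the tail.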

Like the DISPLAY data the NIBBLE data are not compatible with the numeric data used in many other programming languages.

### Location of the sign

In the numbers of the types Binary (IEEE-754) and Decimal (IEEE-754-2008) the location of the sign is predefined. It is always the first bit of the computer word and its meaning is:
0 = '+'     and     1 = '-'

In the numbers of the data types Display and Nibble the user or programmer can select one out of several ways to store the sign. He can store the sign at the left side of the series of digits (SIGN LEADING) or at the right side (SIGN TRAILING) or leave it absent. In the latter case the number is always non-negative.

When the sign is present in a Nibble number, it always occupies a separate nibble. So it occupies the space of one digit. When the sign is present in a Display number it can either occupy a separate byte (SIGN SEPARATE) or be stored in the otherwise unused so called 'zone bits' of the first or last digit. This storage in the zone bits is the default standard in Cobol.

An English-like expression is used to indicate the storage of the sign. This expression can become fairly cumbersome. Its longest version is:
SIGN TRAILING SEPARATE
Such frightening expressions should be abolished.

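This can be done in a concise and easy way by adapting the picture specification itself. For the type Display the sign storage indicators become part of the picture: a double SS at the head of the picture stands for a leading separate sign and a double SS at the tail for a trailing separate sign. The type Nibble has only one storage indicator.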

The examples make clear how these indicators are applied. Behind the percent symbols the comments are written.

```
01  ENGLAND.                           % comments:
    02    BIRMINGHAM   PIC  SS999V99.   % sign leading separate
    02    LONDON       PIC  999V99SS.   % sign trailing separate
    02    DOVER        PIC  9999V99.    % no sign (=non-negative)
01  LUXEMBURG.
    02    LUX-CITY    NIBPIC  99999V9.  % no sign (=non-negative)
```

All elements of England consist of 6 bytes. All elements of Luxemburg consist of 6 nibbles = 3 bytes. ('Lux-city' means 'Luxemburg-city')  Note that the symbol S can stand at the tail side of the picture specification, which is not allowed in the present-day Cobol.

### Built-in decimal-point

Both in the Display notation and in the Nibble notation the location of the decimal point is not stored in the number. It is memorized by the executing environment in the (sub-)program that uses the number. Hence this location is not transported to another program segment by the argument list in the subprogram header. Only the bare digits are. Thus every segment can assume its own location of the decimal point. Of course this assumption is very prone to errors.

Therefore it is better that the location of the decimal point is stored inside the number. Thus it is bound tightly to the digits it belongs to. Then it is always transferred automatically via the argument lists to the subprograms, and no mistake about its location can be made.

To remember the location of the decimal point the number is often fitted with an additional integral value. The length of this so-called 'exponent' is a few digits or bits. The original number is now called the 'coefficient'. The decimal point itself is not written in the coefficient.
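The split into coefficient and exponent can be tried out with Python's standard `decimal` module. In this sketch the exponent counts the powers of ten applied to an integral coefficient, which is Python's convention and not necessarily the encoding the proposal has in mind; the helper names `split_number` and `join_number` are hypothetical.

```python
from decimal import Decimal


def split_number(text: str) -> tuple[int, int]:
    """Split a finite decimal number into (signed coefficient, exponent)."""
    sign, digits, exponent = Decimal(text).as_tuple()
    coefficient = int("".join(str(d) for d in digits))
    return (-coefficient if sign else coefficient), exponent


def join_number(coefficient: int, exponent: int) -> Decimal:
    """Rebuild the value: coefficient times ten to the power exponent."""
    return Decimal(coefficient).scaleb(exponent)


coefficient, exponent = split_number("123.45")
print(coefficient, exponent)               # the point is not in the coefficient
print(join_number(coefficient, exponent))  # the pair still carries its location
```

Because the exponent travels together with the digits, a receiving routine does not have to assume a point location of its own.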

The tight connection of point location and digits series is one of the advantages of the Packed Decimal Encoding of IEEE-754-2008. Alas, for many people this encoding system might be fairly complicated. In Cobol it cannot be overlaid with the REDEFINES clause. And the user/programmer can select the length of the coefficient only out of a small set of predefined lengths.

Some other systems of numbers with built-in exponent are based on the series of BCD coded four-bit nibbles. One of them is an extension of the Packed Decimal system. Another is a compressed version of the ordinary humanly readable text notation of the numeric value in ASCII bytes, therefore called 'nibble edited'. Both systems enable the user/programmer to choose any length for the coefficient. They should be applied not only in Cobol, but also in other programming languages like C/C++ and Fortran. These numbers with variable length are proposed in a separate document on this Internet site.


### Results differ from those in Fortran/C/C++

The application of the computer words as numbers with fixed length in Cobol gives rise to a new problem. The elementary operators like ADD, SUBTRACT, DIVIDE and MULTIPLY are expected to give the same results as their counterparts in the more mathematically organized languages like Fortran or C/C++. But this may not be true when a sequence of these operators is applied. Then the results may become slightly different. This will be explained now, and the solution will be given.

Suppose the variables A, B, C, D and E are all numbers in the computer-word oriented notation, i.e. they all are binary or decimal floating or fixed point numbers. None of them is in the display notation. The two variables D and E have equal structure.

Then the Cobol statement
MULTIPLY A BY B GIVING D
gives exactly the same resulting value in D as the Fortran or C/C++ statement gives in E
E = A * B

But the set of Cobol statements
MULTIPLY A BY B GIVING D
ADD C TO D
is not exactly equal to the Fortran/C/C++ (from now on solely: Fortran) statement
E = C + A * B
The results in D and E can be slightly different.

The reason is that Fortran uses a temporary variable to store the value of the multiplication. That variable is invisible to the programmer. Cobol does not use such a variable since it does not enable a chain of arithmetic expressions in one statement (except in the COMPUTE statement). Therefore the explicit variable D must be used to store the result of the multiplication. And that fact is the snag.

The temporary variable used by Fortran is the longest word the arithmetic processor in the computer can manage as a single numeric entity. In several computers this entity is longer than the longest entity the programmer can specify explicitly in the program text. For example, in the Intel-PC the longest binary floating-point number the user can specify has 64 bits, whilst the temporary variable has 80 bits. (IEEE-754 T and E format respectively, in my proposed notation BINFLOT-M and BINFLOT-AM)

So the value the adder receives to add to C is more accurate in Fortran than in Cobol. Therefore the resulting value in E may differ slightly from that in D.
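The effect can be imitated in Python with two decimal contexts of different precision. The lengths of 7 and 16 digits and the three input values are assumptions chosen for this illustration only; the real lengths depend on the processor.

```python
from decimal import Context, Decimal

SHORT = Context(prec=7)   # assumed length of the user's variables D and E
LONG = Context(prec=16)   # assumed length of the invisible temporary


def via_stored_product(a: Decimal, b: Decimal, c: Decimal) -> Decimal:
    """The product is rounded into D first, then C is added."""
    d = SHORT.multiply(a, b)
    return SHORT.add(c, d)


def via_long_temporary(a: Decimal, b: Decimal, c: Decimal) -> Decimal:
    """The product stays in the long temporary until the final store."""
    temp = LONG.multiply(a, b)
    total = LONG.add(c, temp)
    return SHORT.plus(total)  # rounded only once, when stored into E


A, B, C = Decimal("1.234567"), Decimal("7.654321"), Decimal("-9.449770")
print(via_stored_product(A, B, C))
print(via_long_temporary(A, B, C))
```

With these inputs the first route yields 0.000002 and the second 0.000002114007: the digits that the short variable cut off from the product are exactly the ones that matter after the subtraction.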

### Equal results by special variables

This difference can be abolished by introducing into Cobol an explicit variable for temporary use. Let us give it a quite predictable name: TEMP.  The programmer can neither define this variable nor specify its structure. This variable exists always and the computer always gives it the structure best fit to the output of the statement. So its structure may change in the course of the program.

Example:
The variables A1 and B1 are decimal and the variables A2 and B2 are binary. In the following program part the variable TEMP is decimal first and later it switches to binary automatically like a chameleon. The variables C, D1 and D2 do not influence the status of TEMP.

```
MULTIPLY A1 BY B1 GIVING TEMP      ! Temp is decimal
ADD C TO TEMP GIVING D1
MULTIPLY A2 BY B2 GIVING TEMP      ! Temp is binary
ADD C TO TEMP GIVING D2
```

In all cases TEMP is the longest computer-word the mathematical processor can manage as a single numeric entity.

Now the set of Cobol statements
MULTIPLY A BY B GIVING TEMP
ADD C TO TEMP GIVING E
gives exactly the same result as the Fortran statement
E = C + A * B

Alas, a new problem arises: The fairly complicated statement
E = A1 * B1 + A2 / B2
still cannot be implemented in Cobol. Now two variables TEMP are needed simultaneously.

Therefore besides the single variable TEMP an array of such variables should be implemented. In theory this array can be infinitely long. Every element may differ in structure from the other elements, since every element selects the structure best fit to the receiving end of the statement it is used in. The index to select such an element must be an integer constant always. It can never be a variable or an expression.

Let us keep this array anonymous. Its elements are called TEMPS (= plural of TEMP). Thus the definition is somehow like:

```
01   anonymous
     02   TEMPS  OCCURS infinite TIMES    PIC chameleon.
```

The following element selection is allowed:
TEMPS(5)
but the next one is not allowed:
TEMPS(K)
even when K is a nice integral variable. And of course the following is forbidden also:
TEMPS(K+1)
For documentational purposes an adding expression with integral constants only should be allowed:
TEMPS(3+5)
which means TEMPS(8)

Now the fairly complicated statement can be written as:

```
MULTIPLY A1 BY B1 GIVING TEMPS(1)
DIVIDE A2 BY B2 GIVING TEMPS(2)
ADD TEMPS(1) TO TEMPS(2) GIVING E
```

For the simple problems the single variable TEMP remains in use, so this set of statements can be replaced by:

```
MULTIPLY A1 BY B1 GIVING TEMP
DIVIDE A2 BY B2 GIVING TEMPS(1)
ADD TEMP TO TEMPS(1) GIVING E
```

Every program segment has its own variable TEMP and anonymous array containing the TEMPS.  So the segments can neither influence nor communicate to each other by these variables.

### Design more general functions

One elementary arithmetic statement can be added to Cobol to make this language more versatile.

```
in Cobol                  similar to     Fortran
POWERUP A WITH K                         A = A ** K   (= A^K)
POWERUP A WITH K GIVING C                C = A ** K   (= A^K)
```

Herein K must be an integral value, i.e. a value without a decimal point. K is allowed to be negative. Then automatically a division is added to the exponentiation. Then A^K means:
1 / ( A ^ |K| )
wherein |K| means the absolute value of K.
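The rule for a negative K can be written out in a few lines of Python. The function `powerup` is a hypothetical stand-in for the proposed statement, and exact fractions are used only to keep the illustration free of rounding.

```python
from fractions import Fraction


def powerup(a, k: int) -> Fraction:
    """Raise a to the integral power k; a negative k adds a division."""
    if not isinstance(k, int):
        raise TypeError("K must be an integral value")
    base = Fraction(a)
    if k >= 0:
        return base ** k
    return 1 / (base ** abs(k))  # 1 / (A ^ |K|)


print(powerup(2, 3))
print(powerup(2, -3))
```

A zero value of A together with a negative K leads to a division by zero, so a real implementation would have to report an error in that case.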

Some elementary arithmetic functions can be transformed into statements, e.g:

```
in Cobol                  similar to     Fortran
SQUAREROOT A                             A = sqrt(A)
SQUAREROOT A GIVING C                    C = sqrt(A)
RECIPROCAL A                             A = 1 / A
RECIPROCAL A GIVING C                    C = 1 / A
```

For Squareroot the number A must be non-negative, otherwise an error will occur.

Similarly to these two elementary functions several other ones can be reshaped into real Cobol statements. Even some simpler operations should become Cobol statements. For example, for the calculation of the area of a piece of land or the volume of a house or lorry the following six statements should be given:

```
Legend:
L = length    W = width     H = height
A = area      V = volume    G = general

in Cobol                  similar to     Fortran
QUADRAT G                                G = G ** 2
QUADRAT L GIVING A                       A = L ** 2
SURFACE L, W GIVING A                    A = L * W
CUBIC G                                  G = G ** 3
CUBIC L GIVING V                         V = L ** 3
VOLUME L, W, H GIVING V                  V = L * W * H
```

Note that Surface equals the ordinary Multiply. So its name makes sense only in some cases for a better documentation.

Together with the storage for the temporary values all these elementary functions and operations sometimes make the COMPUTE statement superfluous, as the examples in the following section show.

### Some calculations without Compute statement

#### Albert Einstein's formulas

The famous mass-formula of Albert Einstein can be calculated fairly easily with the 'new' Cobol statements, without the use of the Compute statement. This formula is:

$$
\text{mass} = \frac{\text{mass}_0}{\sqrt{1 - \dfrac{V^2}{C^2}}}
$$

Herein:
C = speed of light (2998E5 m/s)
V = velocity of the object
mass0 = mass of the object when it stands still
mass = mass of the object in the move

The computation will be:

```
QUADRAT V GIVING TEMPS(1)
QUADRAT C GIVING TEMPS(2)
DIVIDE TEMPS(1) BY TEMPS(2)
SUBTRACT TEMPS(1) FROM 1 GIVING TEMPS(1)
SQUAREROOT TEMPS(1)
DIVIDE MASS0 BY TEMPS(1) GIVING MASS
```

Of course, in Fortran the writing is more concise:
mass = mass0 / sqrt(1-v**2/c**2)
although the result will be exactly the same.
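The claim that both notations deliver the same result can be checked with a Python sketch that follows the two routes. The variables `temps1` and `temps2` play the role of TEMPS(1) and TEMPS(2); the function names are hypothetical, and ordinary 64-bit binary floating-point numbers stand in for whatever format an implementation would choose.

```python
import math

C = 2998e5  # speed of light in m/s, as given in the legend


def mass_stepwise(mass0: float, v: float, c: float = C) -> float:
    """One operation per line, with explicit temporaries."""
    temps1 = v ** 2              # QUADRAT V GIVING TEMPS(1)
    temps2 = c ** 2              # QUADRAT C GIVING TEMPS(2)
    temps1 = temps1 / temps2     # DIVIDE TEMPS(1) BY TEMPS(2)
    temps1 = 1 - temps1          # SUBTRACT TEMPS(1) FROM 1 GIVING TEMPS(1)
    temps1 = math.sqrt(temps1)   # SQUAREROOT TEMPS(1)
    return mass0 / temps1        # DIVIDE MASS0 BY TEMPS(1) GIVING MASS


def mass_one_line(mass0: float, v: float, c: float = C) -> float:
    """The same formula written as one expression."""
    return mass0 / math.sqrt(1 - v ** 2 / c ** 2)


v = 0.6 * C
print(mass_stepwise(1.0, v), mass_one_line(1.0, v))
```

Both functions perform the same operations in the same order, so they return the same value. At 60 percent of the speed of light the mass grows by a factor of about 1.25.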

His famous energy formula which is the basis of the atomic bomb

$$
E = \text{mass} \cdot C^2
$$

becomes with the 'new' Cobol statements:

```
QUADRAT C GIVING TEMPS(2)
MULTIPLY MASS BY TEMPS(2) GIVING E
```

Again, Fortran is more concise:
E = mass * c**2
and produces exactly the same result.

Note that this energy formula is based on the mass calculated by his first formula, not on mass0. When the velocity is very low in comparison with the speed of light, the energy can be approximated very well by the fairly simple formula:

$$
E = \text{mass}_0 \cdot C^2 + \frac{1}{2} \cdot \text{mass}_0 \cdot V^2
$$

The first component of this formula tells the amount of energy that is set free when the object is 'vaporized' into radiation. The second part tells the energy the object has got because it is moving. It is the kinetic energy found by Isaac Newton more than two centuries before Einstein.

#### Concept-2 rowing machine

The Concept-2 rowing ergometer is an air fan that is driven by the rowing movement of a human person. A computer mounted on this machine registers the power the person delivers. It also tells the speed the rower would make in a real race-rowing boat when delivering this power.

On this machine as in the actual rowing world this speed is expressed as its inverse: the number of seconds to go a certain distance, e.g. 100 or 500 meters. Concept-2 applies a reference distance of 500 meters. Based on the delivered power the computer estimates the energy consumption of the average rower during his effort.

The fan has a disc at its right side to choke its air input and thus make the rowing easier. The lower the position is the less air has to be displaced at the same fan-rotation speed and thus the easier the rowing becomes at the same tempo (= number of strokes per minute). But the computer decreases the virtual boat speed. Thus the disc position does not influence the formulas for the power delivery and the energy consumption.

The power delivered by the rower is a cubic formula. The formula for his energy consumption is fairly straightforward. The legend also explains briefly the three machine constants.

Legend for the Concept-2 models C, D and E:

```
     T = time in seconds per 500 meters (546.81 yards)
 power = power delivery in Watts
energy = energy consumption in KiloCalories per hour
 3.5E8 = aerodynamic resistance constant of this ergometer
   300 = basic metabolism of the sporting human body
  3.44 = energy consumption factor in Kcal/Hour per Watt
         with muscle efficiency of 25 percent
```

The two formulas are:

$$
\text{power} = 3.5 \times 10^{8} \cdot T^{-3}
$$

$$
\text{energy} = 300 + 3.44 \cdot \text{power}
$$

With the 'new' Cobol statements these formulas will become:

```
CUBIC T GIVING TEMP
DIVIDE 3.5E8 BY TEMP GIVING POWER
MULTIPLY POWER BY 3.44 GIVING TEMP
ADD 300 TO TEMP GIVING ENERGY
```

In Fortran these formulas become:
power = 3.5E8 / ( T ** 3 )
energy = 300 + 3.44 * power

At 120 seconds (= 2 minutes) per 500 meters the power is 203 Watts and the energy consumption is approximately 1000 KCal/hour when the proper rowing technique is applied.
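These figures can be reproduced with a short Python sketch of the two formulas. The function names are hypothetical; the constants are the ones from the legend.

```python
def power_watts(t: float) -> float:
    """Power delivery for a time t in seconds per 500 meters."""
    return 3.5e8 / t ** 3


def energy_kcal_per_hour(power: float) -> float:
    """Energy consumption of the average rower at the given power."""
    return 300 + 3.44 * power


power = power_watts(120)
print(round(power), round(energy_kcal_per_hour(power)))
```

At 120 seconds the sketch gives 203 Watts and 997 KCal/hour after rounding to whole numbers.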

This example is based on an internal memo of the students rowing union 'Theta' at Eindhoven in the Netherlands.

#### Quetelet's body-mass index improved:

In the present days physicians state that too many people have too much bodyfat. Everywhere one meets the term Body Mass Index (= BMI).  Its well known formula based on the Napoleonic (= metric) measurement system is:
BMI = weight/(length^2).

According to the theoretical physics this formula should not be a square formula, but a cubic formula:
BMI = weight/(length^3).

Actually a long person is not a straight enlargement of a short person. For example, the heads of both persons have equal size. The circumference above the ears and eyes approximates 60 cm (= 24 inches) for all adult persons. Head and neck together always count for a height of approximately 30 cm (= 12 inches). Also there are some other differences in the anatomical structure of the body.

Therefore the BMI formula must be less than cubic. In the nineteenth century the Belgian poly-scientist Adolphe Quetelet who introduced mathematics into the social and human sciences adopted the exactly-square formula which is still in common use today. In some countries the BMI is even named Quetelet index.

At present several investigators suppose that the exponent of the actual formula should be somewhere between 2.3 and 2.7.  Therefore we can assume that the exponent value 2.5 gives a much more accurate BMI formula. Thus the improved formula becomes:
BMI = weight/(length^(2.5)).

Of course, when using this improved formula the boundaries between the areas of Obese, Too-fat, Normal, Thin, and Too-skinny must be updated.

A big advantage of the improved formula is that it better describes the required BMI for very young people, e.g. from ten years old up to the age of puberty. Generally these youngsters are considerably shorter than adults. The square formula does not tell correctly the 'allowed' BMI for very short people. It allows them to be fairly fat. So a correction factor is needed. Consequently for all young people until the age of sixteen an age-related correction factor has been introduced. When the improved formula is applied this correction factor is needed only for the ages less than ten.

In Fortran the improved formula can be written in one line as:
BMI = weight / ( sqrt(length) * length ** 2 )
or as:
BMI = weight * sqrt(length) / length ** 3

Writing it in Cobol takes four lines:

```
     SQUAREROOT LENGTH GIVING TEMPS(1)
     SQUARE LENGTH GIVING TEMPS(2)
     MULTIPLY TEMPS(1) BY TEMPS(2)
     DIVIDE WEIGHT BY TEMPS(1) GIVING BMI
```

The formula can also be programmed as:

```
     SQUAREROOT LENGTH GIVING TEMPS(1)
     CUBIC LENGTH GIVING TEMPS(2)
     RECIPROCAL TEMPS(2)
     VOLUME WEIGHT, TEMPS(1), TEMPS(2) GIVING BMI
```
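To check that both statement sequences compute the same value, they can be traced step by step. The Python sketch below is only a reading aid and rests on assumptions: that MULTIPLY X BY Y stores the product in X, that RECIPROCAL replaces its operand by 1/x, and that VOLUME multiplies its three operands. The function names are hypothetical.

```python
import math

def bmi_four_statements(weight, length):
    temps = [None, 0.0, 0.0]          # TEMPS(1), TEMPS(2); index 0 unused
    temps[1] = math.sqrt(length)      # SQUAREROOT LENGTH GIVING TEMPS(1)
    temps[2] = length * length        # SQUARE LENGTH GIVING TEMPS(2)
    temps[1] = temps[1] * temps[2]    # MULTIPLY TEMPS(1) BY TEMPS(2) (assumed: result in TEMPS(1))
    return weight / temps[1]          # DIVIDE WEIGHT BY TEMPS(1) GIVING BMI

def bmi_volume_variant(weight, length):
    temps = [None, 0.0, 0.0]
    temps[1] = math.sqrt(length)      # SQUAREROOT LENGTH GIVING TEMPS(1)
    temps[2] = length ** 3            # CUBIC LENGTH GIVING TEMPS(2)
    temps[2] = 1.0 / temps[2]         # RECIPROCAL TEMPS(2) (assumed: in place)
    return weight * temps[1] * temps[2]   # VOLUME WEIGHT, TEMPS(1), TEMPS(2) GIVING BMI

print(bmi_four_statements(80.0, 1.80), bmi_volume_variant(80.0, 1.80))
```

Under these assumptions both variants agree with weight / length^(2.5) up to floating-point noise.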

#### Conclusion

Although Fortran is more concise, these examples show that even moderately complex formulas can be written easily with the 'new' Cobol statements. The ugly COMPUTE statement then becomes superfluous in several cases.


### Rounding methods

Very often in a computer numeric values are copied from one storage field (computer word or display text) to another. In many cases leading or trailing digits (bits, octal, decimal or hex digits, etc.) of the coefficient have to be omitted, otherwise the number will not fit into the receiving field. When one or more leading digits are omitted, an error message is given and the receiving field is filled with a special value, NaN or Infinity.

When one or more of the trailing digits are omitted, the number itself is not stored, but a number representing a neighbouring value. No error message is given. The neighbouring value is one of the two numbers that fit into the receiving field and lie closest to the value that should be stored. Finding that value is called 'rounding'.
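Python's `decimal` module happens to behave in a comparable way, so it can serve as a small illustration of the difference between losing leading digits and losing trailing digits. The field layout below (5 digits, 2 behind the point, comparable to PIC 999V99) and the name `store` are assumptions made for this sketch.

```python
from decimal import Decimal, Context, ROUND_DOWN

# Hypothetical receiving field of 5 digits, 2 of them behind the point.
# traps=[] makes an error yield NaN instead of raising a Python exception.
FIELD = Context(prec=5, rounding=ROUND_DOWN, traps=[])

def store(value):
    """Copy `value` into the field; digits beyond 2 decimals are cut off."""
    return Decimal(value).quantize(Decimal("0.01"), context=FIELD)

print(store("123.456"))     # a trailing digit is lost: a neighbouring value is stored
print(store("12345.678"))   # leading digits would be lost: a special value is stored
```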

Two algorithms are used for the rounding. The first one simply cuts off the superfluous digits of the coefficient. The remaining part of the coefficient is clearly recognizable as the first part of the original coefficient. The second rounding method is the same cut-off followed by adding 1 to the last digit of the remaining coefficient. So the absolute value of the whole number is increased slightly. This may change the whole bit pattern of the remaining coefficient part beyond recognition.

Which of these two algorithms is used depends, amongst other things, on the +/- sign of the original value. Example:
Round-to-Ceiling means: when the original value is ...
-  negative: simply cut off.
-  positive: cut off.  When the removed part is ...
   -  zero: do nothing more.
   -  nonzero: add 1 to the coefficient.
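The Round-to-Ceiling rule above can be sketched on a number that is held as a sign plus a decimal coefficient. The representation and the name `round_to_ceiling` are assumptions of this sketch, not part of the proposal.

```python
def round_to_ceiling(negative, coefficient, drop):
    """Drop the last `drop` decimal digits of a sign-and-coefficient number.

    `negative` is the sign, `coefficient` a non-negative integer holding
    the digits. Returns the sign and the shortened coefficient.
    """
    kept, removed = divmod(coefficient, 10 ** drop)   # the cut-off
    if not negative and removed != 0:
        kept += 1                                     # positive and inexact: add 1
    return negative, kept

print(round_to_ceiling(False, 1234, 2))   # +12.34 rounded to a whole number
print(round_to_ceiling(True, 1234, 2))    # -12.34 rounded to a whole number
print(round_to_ceiling(False, 1999, 1))   # the add-1 step changes every kept digit
```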

The other selection criteria are the commands that select which of these two algorithms will be used at each moment. The command names, the resulting rounding methods and their proposed Cobol names [between brackets] are:

```
ROUND DOWN               [DOWN]
     Round towards 0.  (truncation = always cut off)
ROUND UP                 [UP]
     Round away from 0.  (add 1 to the cut-off coefficient
     whenever the removed part is nonzero)
ROUND TO FLOOR           [FLOOR]
     Round towards -Infinity.
ROUND TO CEILING         [CEILING]
     Round towards +Infinity.
ROUND DOWN WHEN HALF     [HALFDOWN]
     Round to nearest; if equidistant, round down.
ROUND UP WHEN HALF       [HALFUP]
     Round to nearest; if equidistant, round up. (often used in
     Europe)
ROUND TO EVEN WHEN HALF  [HALFEVEN]
     Round to nearest; if equidistant, round so that the final
     bit or digit is even.  (banking rounding)
```
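The behaviour of these seven modes can be tried out with Python's `decimal` module, which offers a constant for each of them. The mapping from the proposed Cobol names to the Python constants is my own reading of the descriptions above; the helper name `round_to_integer` is hypothetical.

```python
from decimal import (Decimal, ROUND_DOWN, ROUND_UP, ROUND_FLOOR,
                     ROUND_CEILING, ROUND_HALF_DOWN, ROUND_HALF_UP,
                     ROUND_HALF_EVEN)

# Proposed Cobol name -> Python constant (assumed correspondence).
MODES = {
    "DOWN": ROUND_DOWN,
    "UP": ROUND_UP,
    "FLOOR": ROUND_FLOOR,
    "CEILING": ROUND_CEILING,
    "HALFDOWN": ROUND_HALF_DOWN,
    "HALFUP": ROUND_HALF_UP,
    "HALFEVEN": ROUND_HALF_EVEN,
}

def round_to_integer(text, mode):
    """Round the decimal number written in `text` to a whole number."""
    return int(Decimal(text).quantize(Decimal("1"), rounding=MODES[mode]))

for text in ("2.5", "-2.5", "3.5", "2.1", "-2.1"):
    print(text, {mode: round_to_integer(text, mode) for mode in MODES})
```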

These are the seven rounding modes that are used fairly often to very often. Five rounding modes that are seldom or never used can be added to complete the list:

```
ROUND TO ODD WHEN HALF      [HALFODD]
     Round to nearest; if equidistant, round so that the final
     bit or digit is odd.
ROUND TO EVEN               [EVEN]
     Round such that the last bit or digit becomes even.
ROUND TO ODD                [ODD]
     Round such that the last bit or digit becomes odd.
ROUND TO FLOOR WHEN HALF    [HALFFLOOR]
     Round to nearest; if equidistant, round to floor.
ROUND TO CEILING WHEN HALF  [HALFCEILING]
     Round to nearest; if equidistant, round to ceiling.
```
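Python's `decimal` module has no constants for these five modes, so the sketch below spells them out by hand for rounding to a whole number. It assumes that a value which already fits exactly is left unchanged, also under EVEN and ODD; the descriptions above do not settle that point. The name `round_rare` is hypothetical.

```python
import math
from decimal import Decimal

def round_rare(text, mode):
    """Round the decimal number in `text` to a whole number using one of
    HALFODD, EVEN, ODD, HALFFLOOR, HALFCEILING."""
    x = Decimal(text)
    lo = math.floor(x)                 # neighbour towards -Infinity
    if x == lo:
        return lo                      # fits exactly (assumed: keep as is)
    hi = lo + 1                        # neighbour towards +Infinity
    even, odd = (lo, hi) if lo % 2 == 0 else (hi, lo)
    if mode == "EVEN":
        return even
    if mode == "ODD":
        return odd
    fraction = x - lo                  # distance from the lower neighbour
    half = Decimal("0.5")
    if fraction < half:
        return lo                      # lower neighbour is nearest
    if fraction > half:
        return hi                      # upper neighbour is nearest
    # Equidistant: the mode decides.
    return {"HALFODD": odd, "HALFFLOOR": lo, "HALFCEILING": hi}[mode]

for mode in ("HALFODD", "EVEN", "ODD", "HALFFLOOR", "HALFCEILING"):
    print(mode, [round_rare(t, mode) for t in ("2.5", "-2.5", "2.1", "3.9")])
```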

The complete list is repeated in the Proposals for numeric (object-) classes and rounding command.

### Application commands

Many statements create a value and copy this value into a value-receiving field (= 'variable' in many other languages). During this copying act rounding must be applied. The statements concerned are those with the keywords TO and GIVING and some others, for example:

```
   ADD A TO C
   CALL F USING A,B GIVING C
   MOVE A TO C
   DIVIDE C BY B
```

When the receiving field (C in these examples) is a field for temporary storage, TEMP or TEMPS(i), then rounding will not be applied since it is not necessary.

Now the questions are:
-  where and when in a computer program should the rounding command be given?
-  how far should its influence reach?

To answer the last question: it is up to the language designers whether they want to keep the influence locally (= option L, see below) or to spread it out globally (= option G). The theory and effects of the Global option are more difficult to comprehend than those of the Local option. They are described in more detail in the Proposal for numeric (object-) classes and rounding command.

The first question has three answers for the location. Together with the reach of influence they become:

• --- At the beginning of a program segment ---
(The main program is considered as a program segment also.) In the Special-Names paragraph of the Configuration-Section of the program segment the programmer can write:
ROUNDING METHOD IS name.
L:  The selected rounding method is applied by all copying statements in the whole segment.
G:  The selected rounding method is applied in the whole program segment and in all the segments it calls, except those that have their own ROUNDING METHOD command.
• --- Absent ---
When the Rounding-Method statement is omitted a default method is used for the whole program segment.
L:  This default method has been selected by the designers of the language.
G:  This method equals the one that is active in the statement that calls this segment. When this segment is the main program, the default method of L: is used.
• --- At the tail side of a statement with a copying act ---
The statement is lengthened with the sequence:
ROUNDING = name.
Only for this statement the rounding method selected for the program segment is overruled. This lengthening is not allowed when the receiving field is a TEMP or TEMPS field. Note that the MOVE statement consists solely of the copying act and can therefore also get the rounding tail.
L:  The selected rounding method works solely in the statement.
G:  The selected rounding method works in the statement and in all program segments it calls.

The presently used keyword ROUNDED, which applies the halfup rounding, should be abolished.

Example:
A part of a program segment wherein the statements are numbered looks like:

```
        ROUNDING METHOD IS HALFEVEN.
....
02      MOVE TEMP TO D ROUNDING = CEILING.
03      ADD A, B GIVING C.
04      CALL F USING C,B GIVING E.
05      DIVIDE TEMP BY E ROUNDING = UP.
```

The statements 01, 03 and 04 apply the even-when-half rounding method. The statement 02 applies the rounding to ceiling. The statement 05 is in error since TEMP is the receiving field.

In statement 04 the program segment with name F is invoked. When the reach of influence is global and this segment F has no explicit rounding-mode command, then the even-when-half method is active in this segment too. When the reach is local, then the method selected as default method by the language designers is applied in the segment F.
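The difference between the local and the global reach can be imitated with the context mechanism of Python's `decimal` module, in which a called function inherits the rounding mode of its caller unless it sets its own. This is an analogy only. The names `DEFAULT_METHOD`, `segment_f_global`, `segment_f_local` and `main_segment` are hypothetical, and the choice of halfup as the language default is an assumption.

```python
from decimal import (Decimal, localcontext, ROUND_CEILING,
                     ROUND_HALF_EVEN, ROUND_HALF_UP)

DEFAULT_METHOD = ROUND_HALF_UP   # assumed default chosen by the language designers
ONE = Decimal("1")               # the receiving fields hold whole numbers

def segment_f_global(x):
    # Option G: no own ROUNDING METHOD, so the caller's method stays active.
    return x.quantize(ONE)

def segment_f_local(x):
    # Option L: no own ROUNDING METHOD, so the language default applies.
    with localcontext() as ctx:
        ctx.rounding = DEFAULT_METHOD
        return x.quantize(ONE)

def main_segment():
    with localcontext() as ctx:
        ctx.rounding = ROUND_HALF_EVEN                           # ROUNDING METHOD IS HALFEVEN.
        d = Decimal("2.1").quantize(ONE, rounding=ROUND_CEILING)  # ... ROUNDING = CEILING.
        c = Decimal("2.5").quantize(ONE)                         # the segment's method applies
        e_global = segment_f_global(Decimal("2.5"))
        e_local = segment_f_local(Decimal("2.5"))
        return d, c, e_global, e_local

print(main_segment())
```

With the value 2.5 the two options give different results in the called segment, which shows why the choice between L and G matters to the programmer.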

In concise languages like Fortran or C/C++ the setting line 'ROUNDING METHOD IS name' or a similar text can also be placed at the top of every program segment (= subroutine or function). But adding the local command 'ROUNDING = name' to the tail of an arithmetic expression is grammatically less elegant. At the least the program's visual appearance will suffer.

Also in these concise languages the influence of the rounding command is either always local or always global, depending on the language designers. The theory and effects of the global reach are described in the Proposals for numeric (object-) classes and rounding command.
