Numbers in Computers

Decimal

People write, think, and calculate in decimal (base 10). Each digit position represents a power of 10:

1011 decimal = 1000 + 10 + 1

12345 = 1*10000 + 2*1000 + 3*100 + 4*10 + 5

9876 = 9 * 10³ + 8 * 10² + 7 * 10¹ + 6 * 10⁰

Binary

Computers do not use decimal. Instead they use binary (base 2), because they are built from transistors, capacitors and other circuit elements that have two states: on and off. So computers use powers of 2 instead of powers of 10.

When a number is stored inside a computer, it is stored in binary, like this:

1101 binary = 1 * 2³ + 1 * 2² + 0 * 2¹ + 1 * 2⁰ = 8 + 4 + 0 + 1 = 13 decimal

1111111 binary = 2⁶ + 2⁵ + 2⁴ + 2³ + 2² + 2¹ + 2⁰ = 64+32+16+8+4+2+1 = 127 decimal
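As a rough illustration, these place-value sums can be reproduced in Java. This is only a sketch: the class name BinaryToDecimal and the helper toDecimal are made up for this example, and the input is assumed to contain only the characters 0 and 1. Integer.parseInt with radix 2 is the standard library call that does the same job.

```java
class BinaryToDecimal {
    // Builds the value digit by digit: doubling the running total moves
    // every earlier bit up one power of 2, then the new bit is added.
    static int toDecimal(String bits) {
        int value = 0;
        for (char c : bits.toCharArray()) {
            value = value * 2 + (c - '0');
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(toDecimal("1101"));            // 13
        System.out.println(toDecimal("1111111"));         // 127
        System.out.println(Integer.parseInt("1101", 2));  // 13, using the standard library
    }
}
```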

If you input a number you will type it in decimal. The computer must convert the number to binary before storing it. This takes some work: the computer must find a set of powers of 2 that add up to the number. For example:

Start with 91 decimal
The largest power of 2 that fits into 91 is 64 = 2⁶, so we can use that.
Then we have 91 - 64 = 27 left over.
The largest power of 2 that fits into 27 is 16 = 2⁴ (32 = 2⁵ is too big, so that bit is 0)
27 - 16 = 11
The largest power of 2 that fits into 11 is 8 = 2³
11 - 8 = 3
Being clever, we see that 3 = 2 + 1 (4 = 2² is too big, so that bit is 0)
So the result is: 91 = 64 + 16 + 8 + 2 + 1 = 1011011 binary
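The subtraction steps above could be sketched in Java roughly as follows. The names GreedyBinary and toBinary are hypothetical, and the sketch assumes the input is not negative. Integer.highestOneBit is a standard library method that returns the largest power of 2 not exceeding its argument.

```java
class GreedyBinary {
    // Subtracts the largest power of 2 that still fits, writing 1 when a
    // power is used and 0 when it is too big. Assumes n >= 0.
    static String toBinary(int n) {
        if (n == 0) {
            return "0";
        }
        int power = Integer.highestOneBit(n); // e.g. 64 for n = 91
        StringBuilder bits = new StringBuilder();
        while (power > 0) {
            if (n >= power) {
                bits.append('1');
                n -= power;       // keep only what is left over
            } else {
                bits.append('0');
            }
            power /= 2;           // move on to the next smaller power of 2
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBinary(91)); // 1011011
    }
}
```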

Automatic Conversion

If the computer actually had to think about every number it processes, things would be very slow. Instead, the conversion follows a fixed, mechanical routine (in practice, a short piece of software). One common algorithm is repeated division by 2:

91 / 2 = 45 remainder 1

45 / 2 = 22 remainder 1

22 / 2 = 11 remainder 0

11 / 2 = 5 remainder 1

5 / 2 = 2 remainder 1

2 / 2 = 1 remainder 0

1 / 2 = 0 remainder 1

Now read the remainders backward, from the bottom to the top, to get the binary number:

91 dec = 1011011 bin

This algorithm involves nothing more than dividing by 2 and writing down the remainder, over and over again. No thinking required.
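A minimal Java sketch of this repeated-division algorithm is shown here. The names DivideByTwo and toBinary are invented for the example, and the input is assumed to be non-negative. The standard library method Integer.toBinaryString is printed alongside for comparison.

```java
class DivideByTwo {
    // Divides by 2 until nothing is left, collecting the remainders,
    // then reverses them (bottom to top) to get the binary digits.
    static String toBinary(int n) {
        if (n == 0) {
            return "0";
        }
        StringBuilder remainders = new StringBuilder();
        while (n > 0) {
            remainders.append(n % 2); // remainder is the next bit, lowest first
            n /= 2;                   // integer division drops the remainder
        }
        return remainders.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(toBinary(91));                // 1011011
        System.out.println(Integer.toBinaryString(91));  // 1011011
    }
}
```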

Problems with Binary

People have trouble reading binary, but computers are wired and/or programmed for binary, so it's no problem for them. Whenever numbers are input or output, conversions let the people work in decimal. So no problem there.

Problems are associated with size limits. In a typical 32-bit PC, an integer (whole number) is stored in 32 bits. The biggest number that fits in 32 bits has 32 1-bits:

11111111 11111111 11111111 11111111
= 2³¹ + 2³⁰ + 2²⁹ + ... + 2³ + 2² + 2¹ + 2⁰
= 2³² - 1 = 4.3 billion (approximately)

(Actually, half the ints are negative, so the actual maximum value is 2³¹ - 1 = 2.1 billion. But we will leave that discussion for later.)

Any whole number larger than this cannot be stored in 32 bits. Trying to store one causes an overflow. The following Java command causes a compiler error, because the literal 9876543210 does not fit in an int:

System.out.println(9876543210 + 1);

--->   integer number too large: 9876543210
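For comparison, here is a small sketch of what happens at the int limit while a program is running, and of how the same value can be written as a 64-bit long by adding an L suffix. The class name IntLimits and its two helper methods are hypothetical.

```java
class IntLimits {
    // Adds 1 using 32-bit int arithmetic; the result wraps around on overflow.
    static int addOneAsInt(int n) {
        return n + 1;
    }

    // Adds 1 using 64-bit long arithmetic.
    static long addOneAsLong(long n) {
        return n + 1;
    }

    public static void main(String[] args) {
        System.out.println(Integer.MAX_VALUE);               // 2147483647
        System.out.println(addOneAsInt(Integer.MAX_VALUE));  // -2147483648
        System.out.println(addOneAsLong(9876543210L));       // 9876543211
    }
}
```

Note that the run-time overflow does not produce an error message. The value silently wraps around to a negative number.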

Decimals

Not all numbers are integers - we also have fractions and decimals. Decimals are stored a bit differently than integers - the computer uses scientific notation, like this:

1234.5 = 1.2345 x 10³ ==> written as 1.2345e+03 in computers

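This is called floating point, and consists of two values: the mantissa (the significant digits) and the exponent (the power of 10). Internally the computer uses powers of 2 rather than powers of 10, but the idea is the same.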
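The e notation can be tried out directly in Java. In this sketch the class name SciNotation and the helper scientific are made up. String.format with the %e conversion is the standard way to print a value in scientific notation, and Locale.ROOT is passed so that the decimal separator is always a point.

```java
import java.util.Locale;

class SciNotation {
    // Formats a value with four digits after the decimal point, in e notation.
    static String scientific(double x) {
        return String.format(Locale.ROOT, "%.4e", x);
    }

    public static void main(String[] args) {
        System.out.println(scientific(1234.5));  // 1.2345e+03
        System.out.println(1.2345e+03);          // 1234.5
        System.out.println(1.2345e3 == 1234.5);  // true
    }
}
```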

By using an exponent, the computer can represent very LARGE numbers, much bigger than the 32-bit integers. But decimals also have a limit. In fact, there are several different limits for decimals.

Double SIZE

Java stores decimal numbers in 64 bits (this is called a double). It uses 1 bit for the sign, 11 bits for the exponent and 52 bits for the mantissa. The exponent is a power of 2. Its 11 bits must cover both positive and negative exponents, so the largest exponent is about 2^10, or 1024. Since 2^1024 is roughly 1.8 x 10^308, the largest exponent is 308 in base 10.

==> the maximum size of a double is about 1.8 x 10³⁰⁸
so 10³¹⁰ is too large and cannot be stored as a 64-bit double value (overflow)
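The size limit can be checked with a short experiment. This is only a sketch: DoubleSize and timesTen are hypothetical names, while Double.MAX_VALUE and Double.isInfinite belong to the standard library.

```java
class DoubleSize {
    // Multiplies by 10 using double arithmetic.
    static double timesTen(double x) {
        return x * 10;
    }

    public static void main(String[] args) {
        System.out.println(Double.MAX_VALUE);                    // 1.7976931348623157E308
        System.out.println(Double.isInfinite(timesTen(1e307)));  // false, the result still fits
        System.out.println(timesTen(1e308));                     // Infinity
    }
}
```

Unlike the integer case, a floating-point overflow that happens while the program runs does not wrap around. The result becomes the special value Infinity.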

Double PRECISION

The mantissa is 52 bits long (a separate bit holds the + or - sign). This can store a number up to 2^52, which has 16 decimal digits, so a double holds roughly 16 significant decimal digits:

==> the maximum precision of the double mantissa is about 16 digits.
so the number 1234567890.987654321 (19 digits) is too long and cannot be stored precisely in a 64-bit double
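One way to see the precision limit is to add 1 to a whole number that already uses all the available digits. The sketch below uses made-up names (DoublePrecision, addingOneIsLost) and picks 2^53 = 9007199254740992 as the test value, which is an assumption chosen for illustration rather than something taken from the text above.

```java
class DoublePrecision {
    // True when adding 1 makes no difference, because the exact result
    // needs more digits than a double can hold.
    static boolean addingOneIsLost(double x) {
        return x + 1 == x;
    }

    public static void main(String[] args) {
        double big = 9007199254740992.0; // 2^53, a 16-digit whole number
        System.out.println(addingOneIsLost(big));     // true
        System.out.println(addingOneIsLost(1000.0));  // false
    }
}
```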

Overflow and Underflow

When a number is too large to be stored, we call this an overflow.

It is also possible for a number to be too small - e.g. a very small fraction.

2.5 x 10^-400 ==> the negative exponent means there are lots of 0's after the decimal point: 0.000...00025 (399 zeroes between the decimal point and the 25)

This is too small to be stored in a 64-bit double (the smallest positive double is about 4.9 x 10^-324) and causes an underflow.
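A run-time underflow can be provoked by multiplying two small numbers, as in this sketch. The names Underflow and square are hypothetical. Double.MIN_VALUE is the standard library constant for the smallest positive double.

```java
class Underflow {
    // Multiplies a value by itself using double arithmetic.
    static double square(double x) {
        return x * x;
    }

    public static void main(String[] args) {
        System.out.println(Double.MIN_VALUE);  // 4.9E-324
        System.out.println(square(1e-200));    // 0.0, because 10^-400 is too small to store
        System.out.println(square(1e-100));    // about 10^-200, which still fits
    }
}
```

The underflowed result is silently replaced by zero, so no error is reported while the program runs.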

Round-off Error

Since a 64-bit double can only store 16 significant figures (digits), a number with more digits cannot be stored precisely.

1234567890.987654321 ==> stored as 1234567890.9876542

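This isn't so bad, but it isn't even rounded off correctly at the end: it has a 2 instead of a 3. That is because the computer picks the nearest value it can store in binary, not the nearest decimal value. That's annoying.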
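To look at the value that is really stored, the double can be converted to a BigDecimal, which prints every digit of the underlying binary number. This is a sketch only. The names RoundOff and exact are invented for the example, while the BigDecimal(double) constructor is part of the standard library.

```java
import java.math.BigDecimal;

class RoundOff {
    // Returns the exact decimal expansion of the binary value held in a double.
    static String exact(double x) {
        return new BigDecimal(x).toString();
    }

    public static void main(String[] args) {
        // Java prints just enough digits to identify the stored double
        System.out.println(1234567890.987654321);
        // every digit of the value that was actually stored
        System.out.println(exact(1234567890.987654321));
        // 0.1000000000000000055511151231257827021181583404541015625
        System.out.println(exact(0.1));
    }
}
```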

Types

Numbers get converted either to integer or double values for storage. The computer makes some automatic decisions or assumptions for this purpose. It basically works like this:

a number without decimal points is an integer
a number with a decimal point or an e is a double
any calculation containing only integer values will be performed using integer arithmetic
any calculation containing at least one double value will be performed using floating point arithmetic

This also leads to some errors. Integer division produces integer results, without decimal points, by truncating (throwing away) any remainders. Integer multiplications can easily produce values that are too large, causing an integer overflow. Integer overflows can produce negative numbers, giving quite surprising results.
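The effect of these rules can be seen by running the same calculation with different types. In this sketch the class TypeRules and its four helper methods are hypothetical, and the numbers are chosen only for illustration.

```java
class TypeRules {
    // Both operands are ints, so integer division is used.
    static int intQuotient(int a, int b) {
        return a / b;
    }

    // The cast makes one operand a double, so floating-point division is used.
    static double realQuotient(int a, int b) {
        return (double) a / b;
    }

    // Both operands are ints, so the product is a 32-bit int and can overflow.
    static int intProduct(int a, int b) {
        return a * b;
    }

    // The cast makes one operand a long, so the product is a 64-bit long.
    static long longProduct(int a, int b) {
        return (long) a * b;
    }

    public static void main(String[] args) {
        System.out.println(intQuotient(7, 2));               // 3
        System.out.println(realQuotient(7, 2));              // 3.5
        System.out.println(intProduct(1000000, 1000000));    // -727379968
        System.out.println(longProduct(1000000, 1000000));   // 1000000000000
    }
}
```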

Java Examples

The following Java commands and results illustrate the errors that can be caused by the limitations of computer number storage.

Java Command	Result	Comments
System.out.println( 10 / 3 );	3	Integer division throws away the decimal remainder
System.out.println( 10.0 / 3);	3.3333333333333335	Round-off error at the end
System.out.println( 1 / 0 );	ArithmeticException: / by zero	Division by zero causes a run-time error
System.out.println(1/3 + 2/3);	0	Integer division produces 0 + 0
System.out.println( 9876543*9876543);	-1195595903	Integer overflow - answer doesn't fit in 32 bits
System.out.println( Math.pow(999,999) );	Infinity	Floating-point overflow (larger than 10³⁰⁸)
System.out.println( 1.2e-400 );	floating-point number too small	Underflow - compiler error (far below the smallest double, about 4.9 x 10^-324)
System.out.println( 0.1 + 0.2 + 0.3 );	0.6000000000000001	Round-off error (0.1 cannot be stored accurately in binary)
System.out.println( 0.1 + 0.2 + 0.3 - 0.6);	1.1102230246251565E-16	Meaningless discrepancy - a very small round-off error

Avoiding Arithmetic Problems

Use double rather than int whenever possible (it's not always possible)
Write 3.0 instead of 3
Use typed variables rather than literal values
Don't assume that your results will be correct or usable - test them
Use if statements to test for acceptable values
Make sure the user is informed and doesn't trust computer results too much
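A few of these checks might look like the following in Java. This is a sketch under stated assumptions: the names SafeArithmetic, nearlyEqual and overflows are made up, and the tolerance of 1e-9 is an arbitrary choice that should be adapted to the problem. Math.multiplyExact and Double.isInfinite are standard library methods.

```java
class SafeArithmetic {
    // Compares two doubles while allowing for a small round-off error.
    static boolean nearlyEqual(double a, double b, double tolerance) {
        return Math.abs(a - b) <= tolerance;
    }

    // Reports whether a * b would overflow a 32-bit int.
    static boolean overflows(int a, int b) {
        try {
            Math.multiplyExact(a, b); // throws ArithmeticException on overflow
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2 + 0.3;
        System.out.println(sum == 0.6);                           // false
        System.out.println(nearlyEqual(sum, 0.6, 1e-9));          // true
        System.out.println(overflows(9876543, 9876543));          // true
        System.out.println(Double.isInfinite(Math.pow(999, 999))); // true
    }
}
```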