Understanding Float and Double Binary Representation in C
Introduction:
When we store decimal numbers such as 10.5, 3.14, or 0.125 in C, the computer cannot store them directly in decimal form. Instead, on virtually all modern platforms, it stores them in the binary formats defined by the IEEE 754 floating-point standard.
The two most commonly used floating-point data types are:
float (32-bit)
double (64-bit)
Understanding their binary representation helps programmers write efficient and accurate code, especially in embedded systems, scientific computing, and numerical applications.
Why Not Store Decimal Numbers Directly?
Computers operate using binary digits (0 and 1).
For example:
Decimal:
10.5
Binary:
1010.1
To store very large and very small numbers efficiently, computers use a format similar to scientific notation.
Decimal scientific notation:
12345 = 1.2345 × 10⁴
Binary scientific notation:
1010.1 = 1.0101 × 2³
This representation is called Floating Point Representation.
IEEE 754 Float Representation (32-bit)
A float occupies 4 bytes (32 bits).
Layout:
---------------------------------------------------------
| Sign (1 bit) | Exponent (8 bits) | Fraction (23 bits) |
---------------------------------------------------------
Sign Bit
Determines whether the number is positive or negative.
0 → Positive
1 → Negative
Exponent
Stores the power of 2 in biased form, so that negative exponents fit in an unsigned field.
Actual Exponent:
Actual Exponent = Stored Exponent - 127
Here, 127 is called the Bias.
Fraction (Mantissa)
Stores the significant digits of the number.
For normalized numbers, the leading 1 is implied and not stored.
Example:
1.10101
Only:
10101
is stored.
Example: Float Representation of 10.5
Step 1: Convert to Binary
10 = 1010
0.5 = 0.1
10.5 = 1010.1
Step 2: Normalize
Move the binary point so that it sits just after the first 1. It moves 3 places to the left, so the exponent is 3.
1010.1
↓
1.0101 × 2³
Step 3: Sign Bit
Positive number:
0
Step 4: Exponent
Actual exponent = 3
Stored exponent = 3 + 127
= 130
130 in binary:
10000010
Step 5: Mantissa
Take the digits after the binary point (0101) and pad with zeros on the right to fill 23 bits:
01010000000000000000000
Final Float Representation
0 | 10000010 | 01010000000000000000000
Binary:
01000001001010000000000000000000
Hexadecimal:
0x41280000
IEEE 754 Double Representation (64-bit)
A double occupies 8 bytes (64 bits).
Layout:
----------------------------------------------------------
| Sign (1 bit) | Exponent (11 bits) | Fraction (52 bits) |
----------------------------------------------------------
Exponent Bias
For double:
Bias = 1023
Formula:
Actual Exponent = Stored Exponent - 1023
Example: Double Representation of 10.5
Normalized form:
1.0101 × 2³
Sign
0
Exponent
3 + 1023
= 1026
Binary:
10000000010
Fraction
0101000000000000000000000000000000000000000000000000
Final layout:
0 | 10000000010 | 0101000000000000000000000000000000000000000000000000
Hexadecimal:
0x4025000000000000
Float vs Double
| Feature | Float | Double |
|---|---|---|
| Size | 4 Bytes | 8 Bytes |
| Sign Bit | 1 | 1 |
| Exponent Bits | 8 | 11 |
| Mantissa Bits (stored) | 23 | 52 |
| Precision | ~7 digits | ~15-16 digits |
| Range (largest finite value) | ~3.4 × 10³⁸ | ~1.8 × 10³⁰⁸ |
| Memory Usage | Less | More |
Why Does 0.1 + 0.2 != 0.3?
Many decimal numbers cannot be represented exactly in binary.
Example:
0.1
Binary becomes:
0.00011001100110011...
(repeating forever)
The value gets rounded when stored.
Therefore:
printf("%.20f\n", 0.1 + 0.2);
Output:
0.30000000000000004441
This is a limitation of binary floating-point precision, not a bug in C or the compiler.
Program to View Binary Representation of Float
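#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void)
{
    float num = 10.5f;
    uint32_t bits;

    /* Copy the raw bytes instead of casting the pointer, which would
       violate C's strict aliasing rules. Assumes a 32-bit float. */
    memcpy(&bits, &num, sizeof bits);

    for (int i = 31; i >= 0; i--)
    {
        printf("%u", (unsigned)((bits >> i) & 1u));
        /* Separate the sign, exponent, and fraction fields */
        if (i == 31 || i == 23)
            printf(" ");
    }
    printf("\n");
    return 0;
}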
Output:
0 10000010 01010000000000000000000
Key Takeaways
✔ Float uses 32 bits and Double uses 64 bits.
✔ Both follow the IEEE 754 standard.
✔ Numbers are stored as three fields: sign, exponent, and mantissa.
✔ Float provides around 7 digits of precision.
✔ Double provides around 15–16 digits of precision.
✔ Decimal values such as 0.1 cannot be represented exactly in binary.
✔ Understanding floating-point representation helps in debugging precision issues and optimizing embedded applications.
Conclusion
Floating-point representation is one of the most important concepts in computer systems and embedded programming. Understanding how float and double are stored internally helps developers write more reliable software, avoid precision-related bugs, and make informed decisions about memory and performance trade-offs.