Understanding Float and Double Binary Representation in C

31 May 2026 Anvesh G 0 C Programming

Introduction:

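When we store decimal numbers such as 10.5, 3.14, or 0.125 in C, the computer cannot store them directly in decimal form. Instead, on virtually every modern platform, it stores them in binary using the IEEE 754 floating-point standard. (The C standard itself does not require IEEE 754, but almost all compilers and CPUs in use today follow it.)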

The two most commonly used floating-point data types are:

  • float (32-bit)

  • double (64-bit)

Understanding their binary representation helps programmers write efficient and accurate code, especially in embedded systems, scientific computing, and numerical applications.


Why Not Store Decimal Numbers Directly?

Computers operate using binary digits (0 and 1).

For example:

Decimal:

10.5

Binary:

1010.1

To store very large and very small numbers efficiently, computers use a format similar to scientific notation.

Decimal scientific notation:

12345 = 1.2345 × 10⁴

Binary scientific notation:

1010.1 = 1.0101 × 2³

This representation is called Floating Point Representation.


IEEE 754 Float Representation (32-bit)

A float occupies 4 bytes (32 bits).

Layout:

------------------------------------------------
| Sign | Exponent (8 bits) | Fraction (23 bits)|
------------------------------------------------
   1           8                  23

Sign Bit

Determines whether the number is positive or negative.

0 → Positive
1 → Negative

Exponent

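Stores the power of 2. It is kept as an unsigned number with a fixed offset added, so that negative exponents can be stored without a separate sign bit.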

Actual Exponent:

Exponent = Stored Exponent - 127

Here, 127 is called the Bias.


Fraction (Mantissa)

Stores the significant digits of the number.

Because a normalized binary number always starts with 1, the leading 1 is assumed and not stored.

Example:

1.10101

Only:

10101

is stored.


Example: Float Representation of 10.5

Step 1: Convert to Binary

10 = 1010
0.5 = 0.1

10.5 = 1010.1

Step 2: Normalize

Move the binary point so that it sits just after the first 1.

1010.1

↓

1.0101 × 2³

Step 3: Sign Bit

Positive number:

0

Step 4: Exponent

Actual exponent = 3

Stored exponent = 3 + 127
                 = 130

130 in binary:

10000010

Step 5: Mantissa

Take the digits after the binary point (0101) and pad with zeros on the right to fill 23 bits:

01010000000000000000000

Final Float Representation

0 | 10000010 | 01010000000000000000000

Binary:

01000001001010000000000000000000

Hexadecimal:

0x41280000

IEEE 754 Double Representation (64-bit)

A double occupies 8 bytes (64 bits).

Layout:

-------------------------------------------------------
| Sign | Exponent (11 bits) | Fraction (52 bits)      |
-------------------------------------------------------
   1            11                   52

Exponent Bias

For double:

Bias = 1023

Formula:

Actual Exponent = Stored Exponent - 1023

Example: Double Representation of 10.5

Normalized form:

1.0101 × 2³

Sign

0

Exponent

3 + 1023
= 1026

Binary:

10000000010

Fraction

0101000000000000000000000000000000000000000000000000

Final layout:

0 | 10000000010 | 0101000000000000000000000000000000000000000000000000

Hexadecimal:

0x4025000000000000

Float vs Double

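-----------------------------------------------
| Feature       | Float     | Double          |
-----------------------------------------------
| Size          | 4 Bytes   | 8 Bytes         |
| Sign Bit      | 1         | 1               |
| Exponent Bits | 8         | 11              |
| Mantissa Bits | 23        | 52              |
| Precision     | ~7 digits | ~15-16 digits   |
| Range         | Smaller   | Larger          |
| Memory Usage  | Less      | More            |
-----------------------------------------------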

Why Does 0.1 + 0.2 != 0.3?

Many decimal numbers cannot be represented exactly in binary.

Example:

0.1

Binary becomes:

0.00011001100110011...

(repeating forever)

The value gets rounded when stored.

Therefore:

printf("%.20f\n", 0.1 + 0.2);

Output:

0.30000000000000004441

This is a floating-point precision limitation.


Program to View Binary Representation of Float

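#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    float num = 10.5f;
    uint32_t bits;

    /* Copy the float's bytes into an integer. Casting the pointer to
       unsigned int * instead would break C's strict-aliasing rules. */
    memcpy(&bits, &num, sizeof bits);

    /* Print from the most significant bit (31) down to bit 0. */
    for (int i = 31; i >= 0; i--)
    {
        printf("%u", (unsigned)((bits >> i) & 1u));

        /* Space after the sign bit and after the exponent field. */
        if (i == 31 || i == 23)
            printf(" ");
    }
    printf("\n");

    return 0;
}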

Output:

0 10000010 01010000000000000000000

Key Takeaways

✔ Float uses 32 bits and Double uses 64 bits.

✔ Both follow the IEEE 754 standard.

✔ Numbers are stored as:

  • Sign

  • Exponent

  • Mantissa

✔ Float provides around 7 digits of precision.

✔ Double provides around 15–16 digits of precision.

✔ Decimal values such as 0.1 cannot be represented exactly in binary.

✔ Understanding floating-point representation helps in debugging precision issues and optimizing embedded applications.

Conclusion

Floating-point representation is one of the most important concepts in computer systems and embedded programming. Understanding how float and double are stored internally helps developers write more reliable software, avoid precision-related bugs, and make informed decisions about memory and performance trade-offs.

Author
BY: Anvesh G
