Basic storage format (from the highest bit to the lowest): Sign + Exponent + Fraction
Sign: the sign bit
Exponent: the biased exponent
Fraction: the significand (mantissa) bits, stored without the implied leading 1
Analysis of 32-bit Floating Point Storage Format
Sign: 1 bit (bit 31)
Exponent: 8 bits (bits 30 to 23)
Fraction: 23 bits (bits 22 to 0)
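The field layout above can be checked with shifts and masks. The following is a minimal sketch in Python; the helper name split_float32 is an assumption made for illustration and does not come from the linked C++ code.

```python
import struct

def split_float32(value):
    """Return the (sign, exponent, fraction) bit fields of value stored as a 32-bit float."""
    # Reinterpret the 4 bytes of the float as one unsigned 32-bit integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                # bit 31
    exponent = (bits >> 23) & 0xFF   # bits 30 to 23
    fraction = bits & 0x7FFFFF       # bits 22 to 0
    return sign, exponent, fraction

sign, exponent, fraction = split_float32(12.5)
print(sign, exponent, hex(fraction))  # 0 130 0x480000
```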
The true value of a normalized (non-zero) 32-bit floating-point number, in Python syntax, where Fraction is read as a binary fraction in the range [0, 1):
(-1) ** Sign * 2 ** (Exponent - 127) * (1 + Fraction)
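As a rough illustration, the formula can be wrapped in a small function. This sketch assumes the raw 23-bit Fraction field is passed in as an integer and that the number is normalized (Exponent between 1 and 254); Exponent values 0 and 255 encode zero, subnormals, infinities and NaN, which this formula does not cover. The name float32_value is hypothetical.

```python
def float32_value(sign, exponent, fraction_bits):
    """Evaluate the formula for a normalized number.

    fraction_bits is the raw 23-bit Fraction field as an integer.
    """
    # Scale the integer field to a binary fraction in [0, 1).
    fraction = fraction_bits / 2 ** 23
    return (-1) ** sign * 2 ** (exponent - 127) * (1 + fraction)

print(float32_value(0, 130, 0x480000))  # 12.5
```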
Examples are as follows:
a = 12.5
1. Determining the sign bit
a is greater than 0, so Sign is 0, expressed in binary as: 0
2. Determining the exponent
a expressed in binary is: 1100.1
Normalizing to 1.1001 * 2 ** 3 moves the binary point 3 places to the left, so the Exponent is 130 (127 + 3), expressed in binary: 10000010
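One way to double-check the exponent is math.frexp, which returns a mantissa in [0.5, 1) and a power of two. The sketch below shifts that result to the 1.x form used here; the helper name biased_exponent32 is an assumption, and it only makes sense for non-zero values in the normalized range.

```python
import math

def biased_exponent32(value):
    """Return the stored (biased) exponent of a non-zero, normalized value."""
    # frexp gives abs(value) == mantissa * 2 ** exp with 0.5 <= mantissa < 1,
    # so the exponent of the 1.x form is exp - 1.
    _, exp = math.frexp(abs(value))
    return (exp - 1) + 127

e = biased_exponent32(12.5)
print(e, format(e, "08b"))  # 130 10000010
```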
3. Determining the fraction
The significand drops the implied leading 1, so the integer part 1100 contributes the bits: 100
The decimal fraction is converted to binary by repeatedly multiplying by 2 and taking the integer part: 0.5 * 2 = 1.0, so the fractional part contributes the bit: 1
The Fraction field therefore starts with 1001 and is padded with 0 on the right to 23 bits, so the binary of a can be expressed as:
01000001010010000000000000000000
That is: 0100 0001 0100 1000 0000 0000 0000 0000
Expressed in hexadecimal: 0x41480000
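Steps 1 to 3 can be sketched in Python as follows. This is only an illustration under stated assumptions: the value is at least 1 in magnitude and exactly representable, extra fraction bits are truncated rather than rounded, and both function names are hypothetical.

```python
def fraction_to_binary(frac, max_bits=23):
    """Convert a decimal fraction in [0, 1) to binary digits by repeated doubling."""
    bits = []
    while frac and len(bits) < max_bits:
        frac *= 2
        bit = int(frac)      # the integer part is the next binary digit
        bits.append(str(bit))
        frac -= bit
    return "".join(bits)

def encode_float32_by_hand(value):
    """Assemble the 32-bit pattern as a string (assumes abs(value) >= 1, no rounding)."""
    sign = "0" if value > 0 else "1"                            # step 1
    magnitude = abs(value)
    int_bits = bin(int(magnitude))[2:]                          # 12 -> "1100"
    frac_bits = fraction_to_binary(magnitude - int(magnitude))  # 0.5 -> "1"
    shift = len(int_bits) - 1                                   # point moves 3 places
    exponent = format(127 + shift, "08b")                       # step 2: 130
    # Step 3: drop the implied leading 1, then pad with 0 to 23 bits.
    fraction = (int_bits[1:] + frac_bits).ljust(23, "0")[:23]
    return sign + exponent + fraction

bits = encode_float32_by_hand(12.5)
print(bits)               # 01000001010010000000000000000000
print(hex(int(bits, 2)))  # 0x41480000
```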
4. Recovering the true value
Sign = binary 0 = 0
Exponent = binary 10000010 = 130
Fraction = binary 0.1001 = 2 ** (-1) + 2 ** (-4) = 0.5625
True value:
(-1) ** 0 * 2 ** (130 - 127) * (1 + 0.5625) = 12.5
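The recovery step can also be sketched from the bit string itself, summing 2 ** (-k) for each set fraction bit as in the calculation above. The function name decode_float32_bits is an assumption, and the sketch handles normalized numbers only.

```python
def decode_float32_bits(bits):
    """Recover the value from a 32-character bit string (spaces are ignored)."""
    bits = bits.replace(" ", "")
    sign = int(bits[0], 2)
    exponent = int(bits[1:9], 2)
    # The k-th fraction bit (counting from 1) contributes 2 ** (-k).
    fraction = sum(2 ** -k for k, b in enumerate(bits[9:], start=1) if b == "1")
    return (-1) ** sign * 2 ** (exponent - 127) * (1 + fraction)

print(decode_float32_bits("0100 0001 0100 1000 0000 0000 0000 0000"))  # 12.5
```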
C++ code that parses the binary storage of a 32-bit floating-point number:
https://github.com/mike-zhang/cppExamples/blob/master/dataTypeOpt/IEEE754Relate/floatTest1.cpp
Sample output:
[root@localhost floatTest1]# ./floatToBin1
sizeof(float) : 4
sizeof(int) : 4
a = 12.500000
showFloat : 0x 41 48 00 00
UFP : 0,82,480000
b : 0x41480000
showIEEE754 a = 12.500000
showIEEE754 varTmp = 0x00c00000
showIEEE754 c = 0x00400000
showIEEE754 i = 19 , a1 = 1.000000 , showIEEE754 c = 00480000 , showIEEE754 b = 0x41000000
showIEEE754 i = 18 , a1 = 0.000000 , showIEEE754 b = 0x41000000
showIEEE754 i = 17 , a1 = 0.000000 , showIEEE754 b = 0x41000000
showIEEE754 i = 16 , a1 = 0.000000 , showIEEE754 b = 0x41000000
showIEEE754 i = 15 , a1 = 0.000000 , showIEEE754 b = 0x41000000
showIEEE754 i = 14 , a1 = 0.000000 , showIEEE754 b = 0x41000000
showIEEE754 i = 13 , a1 = 0.000000 , showIEEE754 b = 0x41000000
showIEEE754 i = 12 , a1 = 0.000000 , showIEEE754 b = 0x41000000
showIEEE754 i = 11 , a1 = 0.000000 , showIEEE754 b = 0x41000000
showIEEE754 i = 10 , a1 = 0.000000 , showIEEE754 b = 0x41000000
showIEEE754 i = 9 , a1 = 0.000000 , showIEEE754 b = 0x41000000
showIEEE754 i = 8 , a1 = 0.000000 , showIEEE754 b = 0x41000000
showIEEE754 i = 7 , a1 = 0.000000 , showIEEE754 b = 0x41000000
showIEEE754 i = 6 , a1 = 0.000000 , showIEEE754 b = 0x41000000
showIEEE754 i = 5 , a1 = 0.000000 , showIEEE754 b = 0x41000000
showIEEE754 i = 4 , a1 = 0.000000 , showIEEE754 b = 0x41000000
showIEEE754 i = 3 , a1 = 0.000000 , showIEEE754 b = 0x41000000
showIEEE754 i = 2 , a1 = 0.000000 , showIEEE754 b = 0x41000000
showIEEE754 i = 1 , a1 = 0.000000 , showIEEE754 b = 0x41000000
showIEEE754 : 0x41480000
[root@localhost floatTest1]#
Analysis of 64-bit floating-point storage format
Sign: 1 bit (bit 63)
Exponent: 11 bits (bits 62 to 52)
Fraction: 52 bits (bits 51 to 0)
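The 64-bit layout can be inspected the same way, with wider shifts and masks. This is a minimal sketch; split_float64 is a hypothetical helper name.

```python
import struct

def split_float64(value):
    """Return the (sign, exponent, fraction) bit fields of value stored as a 64-bit float."""
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    sign = bits >> 63                   # bit 63
    exponent = (bits >> 52) & 0x7FF     # bits 62 to 52
    fraction = bits & ((1 << 52) - 1)   # bits 51 to 0
    return sign, exponent, fraction

sign, exponent, fraction = split_float64(12.5)
print(sign, exponent, hex(fraction))  # 0 1026 0x9000000000000
```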
The true value of a normalized (non-zero) 64-bit floating-point number, in Python syntax, where Fraction is read as a binary fraction in the range [0, 1):
(-1) ** Sign * 2 ** (Exponent - 1023) * (1 + Fraction)
Examples are as follows:
a = 12.5
1. Determining the sign bit
a is greater than 0, so Sign is 0, expressed in binary as: 0
2. Determining the exponent
a expressed in binary is: 1100.1
Normalizing to 1.1001 * 2 ** 3 moves the binary point 3 places to the left, so the Exponent is 1026 (1023 + 3), expressed in binary: 10000000010
3. Determining the fraction
The significand drops the implied leading 1, so the integer part 1100 contributes the bits: 100
The decimal fraction is converted to binary by repeatedly multiplying by 2 and taking the integer part: 0.5 * 2 = 1.0, so the fractional part contributes the bit: 1
The Fraction field therefore starts with 1001 and is padded with 0 on the right to 52 bits, so the binary of a can be expressed as:
0100000000101001000000000000000000000000000000000000000000000000
That is: 0100 0000 0010 1001 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
Expressed in hexadecimal: 0x4029000000000000
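The hexadecimal result can be cross-checked with Python's struct module, which packs a float into its IEEE 754 bytes. The helper name float_to_hex is an assumption; ">d" and ">f" are the standard big-endian format codes for double and float. The built-in float.hex() shows the same number in normalized form.

```python
import struct

def float_to_hex(value, fmt):
    """Pack value with the given struct format and return its bytes as a hex string."""
    return "0x" + struct.pack(fmt, value).hex()

print(float_to_hex(12.5, ">d"))  # 0x4029000000000000
print(float_to_hex(12.5, ">f"))  # 0x41480000
print((12.5).hex())              # 0x1.9000000000000p+3
```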
4. Recovering the true value
Sign = binary 0 = 0
Exponent = binary 10000000010 = 1026
Fraction = binary 0.1001 = 2 ** (-1) + 2 ** (-4) = 0.5625
True value:
(-1) ** 0 * 2 ** (1026 - 1023) * (1 + 0.5625) = 12.5
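The 32-bit and 64-bit formats differ only in their field widths and bias, so one decoder can serve both. The sketch below takes the widths as parameters; decode_ieee754 is a hypothetical name, and it covers normalized numbers only (not zero, subnormals, infinities or NaN).

```python
def decode_ieee754(bits, exp_bits, frac_bits):
    """Decode a normalized number from its integer bit pattern and field widths."""
    bias = 2 ** (exp_bits - 1) - 1                 # 127 for 8 bits, 1023 for 11 bits
    sign = bits >> (exp_bits + frac_bits)
    exponent = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    fraction = (bits & ((1 << frac_bits) - 1)) / 2 ** frac_bits
    return (-1) ** sign * 2.0 ** (exponent - bias) * (1 + fraction)

print(decode_ieee754(0x41480000, 8, 23))           # 12.5
print(decode_ieee754(0x4029000000000000, 11, 52))  # 12.5
```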
C++ code that parses the binary storage of a 64-bit floating-point number:
https://github.com/mike-zhang/cppExamples/blob/master/dataTypeOpt/IEEE754Relate/doubleTest1.cpp
Sample output:
[root@localhost t1]# ./doubleToBin1
sizeof(double) : 8
sizeof(long) : 8
a = 12.500000
showDouble : 0x 40 29 00 00 00 00 00 00
UFP : 0,402,0
b : 0x0
showIEEE754 a = 12.500000
showIEEE754 logLen = 3
showIEEE754 c = 4620693217682128896(0x4020000000000000)
showIEEE754 b = 0x4020000000000000
showIEEE754 varTmp = 0x8000000000000
showIEEE754 c = 0x8000000000000
showIEEE754 i = 48 , a1 = 1.000000 , showIEEE754 c = 9000000000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 47 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 46 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 45 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 44 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 43 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 42 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 41 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 40 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 39 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 38 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 37 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 36 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 35 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 34 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 33 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 32 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 31 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 30 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 29 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 28 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 27 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 26 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 25 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 24 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 23 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 22 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 21 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 20 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 19 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 18 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 17 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 16 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 15 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 14 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 13 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 12 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 11 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 10 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 9 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 8 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 7 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 6 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 5 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 4 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 3 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 2 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 i = 1 , a1 = 0.000000 , showIEEE754 b = 0x4020000000000000
showIEEE754 : 0x4029000000000000
[root@localhost t1]#
That's all. I hope it helps.