Bytes and bits in Java

Byte capacity relationships in Java

1TB=1024GB      TB is terabyte
1GB=1024MB      GB is gigabyte
1MB=1024KB      MB is megabyte
1KB=1024Byte    KB is kilobyte

Note: Byte is abbreviated as B, so B always means bytes
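
The relationships above can be written out as constants. This is only a minimal sketch: the class name ByteUnits and the helper toBits are made up for illustration and are not part of any library.

```java
// Hypothetical helper class: spells out the 1024-based unit relationships.
class ByteUnits {
    static final long KB = 1024L;        // bytes in one kilobyte
    static final long MB = 1024L * KB;   // bytes in one megabyte
    static final long GB = 1024L * MB;   // bytes in one gigabyte
    static final long TB = 1024L * GB;   // bytes in one terabyte

    // One byte is 8 bits, so n bytes are n * 8 bits.
    static long toBits(long bytes) {
        return bytes * 8;
    }

    public static void main(String[] args) {
        System.out.println("1 KB = " + KB + " bytes");   // 1024
        System.out.println("1 MB = " + MB + " bytes");   // 1048576
        System.out.println("1 GB = " + GB + " bytes");   // 1073741824
        System.out.println("1 TB = " + TB + " bytes");   // 1099511627776
        System.out.println("1 KB = " + toBits(KB) + " bits"); // 8192
    }
}
```

The constants are declared as long because 1 TB in bytes does not fit in an int.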

In real-world development you inevitably have to deal with strings and bytes. While debugging you will often come across an array of numbers such as {97, 104, ...}, containing both positive and negative values. What exactly are these numbers? Let's go over it today.

How many bytes does an English letter correspond to

public class ByteBit {
    public static void main(String[] args) {
        String a = "a";
        // getBytes() without an argument uses the platform default charset
        byte[] bytes = a.getBytes();
        for (byte b : bytes) {
            int c = b;
            // The printed byte value is the character's ASCII code
            System.out.println(c);      // 97
            // Convert the byte value to its binary (bit) representation
            String s = Integer.toBinaryString(c);
            // Prints 1100001. A byte has 8 bits, but the leading 0 is not printed.
            // The output shows that one English letter occupies one byte
            System.out.println(s);
        }
    }
}

The output is:

97
1100001

The same program with a longer string:

public class ByteBit {
    public static void main(String[] args) {
        String a = "alrewre";
        byte[] bytes = a.getBytes();
        for (byte b : bytes) {
            int c = b;
            // The printed byte value is the character's ASCII code
            System.out.println(c);
            // Convert the byte value to its binary (bit) representation
            String s = Integer.toBinaryString(c);
            System.out.println(s);
        }
    }
}

The output is:

97
1100001
108
1101100
114
1110010
101
1100101
119
1110111
114
1110010
101
1100101

Summary: each English letter in a string, upper or lower case, occupies one byte, and each byte holds a number (the letter's ASCII code).
A string of English letters therefore produces as many bytes as it has letters, and each byte consists of 8 bits, each of which is 0 or 1.
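
To make the dropped leading zero visible, the bit string can be padded to 8 characters. The following is a small sketch, not a standard utility: EightBitView and toEightBits are illustrative names, and the padding assumes ASCII input (byte values 0 to 127).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class: shows each ASCII letter as a full 8-bit string.
class EightBitView {
    // Pads the binary form on the left with zeros up to 8 characters.
    // Assumption: b is an ASCII value (0..127), so it is never negative.
    static String toEightBits(byte b) {
        String bits = Integer.toBinaryString(b);
        return String.format("%8s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        byte[] bytes = "aA".getBytes(StandardCharsets.US_ASCII);
        for (byte b : bytes) {
            // Prints "97  01100001" and then "65  01000001"
            System.out.println(b + "  " + toEightBits(b));
        }
    }
}
```

Upper and lower case letters have different codes ('A' is 65, 'a' is 97), but each still fits in a single byte.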

Having looked at English letters, let's now consider how many bytes a Chinese character occupies.

How many bytes does a Chinese character correspond to

With UTF-8 encoding

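import java.io.UnsupportedEncodingException;

public class ByteBitDemo {
    // getBytes(String) declares UnsupportedEncodingException, so main must declare it too
    public static void main(String[] args) throws UnsupportedEncodingException {
        String a = "律";
        byte[] bytes = a.getBytes("utf-8");
        for (byte aByte : bytes) {
            int c = aByte;
            System.out.println(c);
        }
    }
}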

Print results:

-27
-66
-117

This shows that the Chinese character occupies three bytes in UTF-8. Now print each byte as bits:

import java.io.UnsupportedEncodingException;

public class ByteBitDemo {
    public static void main(String[] args) throws UnsupportedEncodingException {
        String a = "律";
        byte[] bytes = a.getBytes("utf-8");
        for (byte aByte : bytes) {
            int c = aByte;
            System.out.print(c + "  ");
            // Print the byte value as bits
            String s = Integer.toBinaryString(c);
            System.out.println(s);
        }
    }
}

Print results:

-27  11111111111111111111111111100101
-66  11111111111111111111111110111110
-117  11111111111111111111111110001011

With GBK encoding

import java.io.UnsupportedEncodingException;

public class ByteBitDemo {
    public static void main(String[] args) throws UnsupportedEncodingException {
        String a = "律";
        byte[] bytes = a.getBytes("GBK");
        for (byte aByte : bytes) {
            System.out.print(aByte + "  ");
            // Print the byte value as bits (the byte is widened to int here)
            String s = Integer.toBinaryString(aByte);
            System.out.println(s);
        }
    }
}

Print results:

-62  11111111111111111111111111000010
-55  11111111111111111111111111001001

Summary

  • One byte corresponds to 8 bits

  • For Chinese characters, the number of bytes depends on the encoding
    UTF-8: a common Chinese character takes three bytes
    GBK: a Chinese character takes two bytes
    English letters take one byte in both encodings

  • Each digit in a string also takes one byte
    String a = "1"; The single digit 1 corresponds to one byte
    String a = "12343"; This five-digit string corresponds to five bytes
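
The points in this summary can be checked by comparing byte counts directly. The sketch below uses made-up names (EncodedLength, byteLength). It writes the Chinese character as the escape \u5f8b (律, whose UTF-8 bytes are -27, -66, -117) to avoid source-file encoding problems, and it assumes the JDK in use ships the GBK charset, so that part is guarded with Charset.isSupported.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class: counts the bytes a string needs in a given charset.
class EncodedLength {
    static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // An English letter, a five-digit string, and one Chinese character (U+5F8B)
        String[] samples = {"a", "12343", "\u5f8b"};
        for (String s : samples) {
            System.out.println(s + "  UTF-8: " + byteLength(s, StandardCharsets.UTF_8));
            // GBK is not guaranteed on every JDK installation, so check first
            if (Charset.isSupported("GBK")) {
                System.out.println(s + "  GBK:   " + byteLength(s, Charset.forName("GBK")));
            }
        }
    }
}
```

Passing a Charset object instead of a charset name also avoids the checked UnsupportedEncodingException.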

Open question about this article
Isn't a byte 8 bits? Why do the printed bit strings have 32 bits?

-73  11111111111111111111111110110111
-24  11111111111111111111111111101000
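
A likely explanation: Integer.toBinaryString takes an int, so each byte is first widened to a 32-bit int. Java bytes are signed, and widening a negative byte copies its sign bit into the upper 24 bits, which is where the run of leading 1s comes from. Masking with 0xFF keeps only the low 8 bits. The sketch below shows this; UnsignedBits and toEightBits are illustrative names, not an existing API.

```java
// Hypothetical helper class: prints a byte as exactly 8 bits.
class UnsignedBits {
    static String toEightBits(byte b) {
        // b & 0xFF clears the 24 sign-extended upper bits, leaving 0..255
        int unsigned = b & 0xFF;
        String bits = Integer.toBinaryString(unsigned);
        // Left-pad with zeros so small values still show 8 bits
        return String.format("%8s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        byte[] bytes = {-27, -66, -117}; // the UTF-8 bytes printed earlier
        for (byte b : bytes) {
            // Prints 11100101, 10111110 and 10001011
            System.out.println(b + "  " + toEightBits(b));
        }
    }
}
```

These 8-bit strings are the last 8 bits of the 32-bit values printed earlier, so the byte itself was 8 bits all along.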

Added by dagee on Sat, 19 Feb 2022 05:04:38 +0200