Byte capacity relationships in Java
1 TB = 1024 GB (TB is a terabyte)
1 GB = 1024 MB (GB is a gigabyte)
1 MB = 1024 KB (MB is a megabyte)
1 KB = 1024 bytes (KB is a kilobyte)
Note: "byte" is abbreviated as an uppercase B; a lowercase b denotes a bit.
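As a quick illustration of the 1024-based relationships above, here is a minimal sketch that turns a raw byte count into a readable unit. The class name `ByteUnits` and the helper `humanReadable` are hypothetical names chosen for this example; they are not part of the JDK.

```java
import java.util.Locale;

class ByteUnits {
    static final long KB = 1024L;        // 1 KB = 1024 bytes
    static final long MB = 1024L * KB;   // 1 MB = 1024 KB
    static final long GB = 1024L * MB;   // 1 GB = 1024 MB
    static final long TB = 1024L * GB;   // 1 TB = 1024 GB

    // Hypothetical helper: formats the count in the largest unit it reaches.
    static String humanReadable(long bytes) {
        if (bytes >= TB) return String.format(Locale.ROOT, "%.2f TB", (double) bytes / TB);
        if (bytes >= GB) return String.format(Locale.ROOT, "%.2f GB", (double) bytes / GB);
        if (bytes >= MB) return String.format(Locale.ROOT, "%.2f MB", (double) bytes / MB);
        if (bytes >= KB) return String.format(Locale.ROOT, "%.2f KB", (double) bytes / KB);
        return bytes + " B";
    }

    public static void main(String[] args) {
        System.out.println(humanReadable(512));      // 512 B
        System.out.println(humanReadable(1536));     // 1.50 KB
        System.out.println(humanReadable(5L * GB));  // 5.00 GB
    }
}
```

The constants are declared as `long` because 1 TB (1,099,511,627,776 bytes) does not fit in an `int`.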
In real-world development you inevitably have to deal with strings and bytes. When debugging, you will often come across an array of numbers such as {97, 104, ...}, containing both positive and negative values. What exactly are these numbers? Let's go over it today.
How many bytes does an English letter correspond to
    public class ByteBit {
        public static void main(String[] args) {
            String a = "a";
            // getBytes() with no argument uses the platform default charset
            byte[] bytes = a.getBytes();
            for (byte b : bytes) {
                int c = b;
                // Printing shows that the byte value is the letter's ASCII code
                System.out.println(c); // 97
                // Now look at the bits that make up this byte
                String s = Integer.toBinaryString(c);
                // Prints 1100001. A byte has 8 bits, so there is a leading 0 that is not printed.
                // The output shows that one English letter occupies one byte.
                System.out.println(s);
            }
        }
    }
The output is:
    97
    1100001
    public class ByteBit {
        public static void main(String[] args) {
            String a = "alrewre";
            byte[] bytes = a.getBytes();
            for (byte b : bytes) {
                int c = b;
                // Printing shows that each byte value is the letter's ASCII code
                System.out.println(c);
                // Now look at the bits that make up each byte
                String s = Integer.toBinaryString(c);
                System.out.println(s);
            }
        }
    }
The output is:
    97
    1100001
    108
    1101100
    114
    1110010
    101
    1100101
    119
    1110111
    114
    1110010
    101
    1100101
Summary: each English letter in a string, whether upper or lower case, corresponds to one byte, and each byte holds a number (the letter's ASCII code). A string of n English letters therefore produces n numbers, that is, n bytes. Each byte consists of 8 bits, and every bit is either 0 or 1.
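The summary above can be checked with a small sketch. It prints each letter with its byte value and pads the binary form to the full 8 bits, so the leading 0 becomes visible. The class name `AsciiLetters` and the helper `padToEightBits` are hypothetical, and the helper assumes its argument is an ASCII code in the range 0 to 127.

```java
import java.nio.charset.StandardCharsets;

class AsciiLetters {
    // Hypothetical helper: binary form of an ASCII code (0..127), left-padded to 8 digits.
    static String padToEightBits(byte b) {
        return String.format("%8s", Integer.toBinaryString(b)).replace(' ', '0');
    }

    public static void main(String[] args) {
        String text = "aA";
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        // One byte per letter: prints "2 letters -> 2 bytes"
        System.out.println(text.length() + " letters -> " + bytes.length + " bytes");
        for (byte b : bytes) {
            // Prints "a 97 01100001" and then "A 65 01000001"
            System.out.println((char) b + " " + b + " " + padToEightBits(b));
        }
    }
}
```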
Having looked at English letters, let's now consider how many bytes a Chinese character corresponds to.
How many bytes does a Chinese character correspond to
With UTF-8 encoding
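    import java.nio.charset.StandardCharsets;

    public class ByteBitDemo {
        public static void main(String[] args) {
            String a = "律"; // a single Chinese character (it means "law")
            // StandardCharsets.UTF_8 avoids the checked exception thrown by getBytes("utf-8")
            byte[] bytes = a.getBytes(StandardCharsets.UTF_8);
            for (byte aByte : bytes) {
                int c = aByte;
                System.out.println(c);
            }
        }
    }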
Printed result:
    -27
    -66
    -117
This shows that one Chinese character corresponds to three bytes in UTF-8. Now print each byte as bits:
    import java.nio.charset.StandardCharsets;

    public class ByteBitDemo {
        public static void main(String[] args) {
            String a = "律"; // a single Chinese character (it means "law")
            byte[] bytes = a.getBytes(StandardCharsets.UTF_8);
            for (byte aByte : bytes) {
                int c = aByte;
                System.out.print(c + " ");
                // Print the byte as a binary string
                String s = Integer.toBinaryString(c);
                System.out.println(s);
            }
        }
    }
Printed result:
    -27 11111111111111111111111111100101
    -66 11111111111111111111111110111110
    -117 11111111111111111111111110001011
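The values printed above are negative because Java's `byte` is a signed 8-bit type with the range -128 to 127, so any byte whose highest bit is 1 shows up as a negative number. The sketch below reads the same three values as unsigned numbers (0 to 255) and as hex. The class name `SignedByteDemo` and the helper `toUnsigned` are hypothetical names for this illustration.

```java
class SignedByteDemo {
    // Hypothetical helper: reinterprets each signed byte as a value in 0..255.
    static int[] toUnsigned(byte[] bytes) {
        int[] result = new int[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            result[i] = Byte.toUnsignedInt(bytes[i]); // equivalent to bytes[i] & 0xFF
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] bytes = {-27, -66, -117}; // the three UTF-8 bytes printed above
        int[] unsigned = toUnsigned(bytes);
        for (int i = 0; i < bytes.length; i++) {
            // Prints: -27 -> 229 -> 0xE5, -66 -> 190 -> 0xBE, -117 -> 139 -> 0x8B
            System.out.println(bytes[i] + " -> " + unsigned[i]
                    + " -> 0x" + Integer.toHexString(unsigned[i]).toUpperCase());
        }
    }
}
```

Each signed value and its unsigned counterpart differ by exactly 256, for example -27 + 256 = 229.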
With GBK encoding
    import java.io.UnsupportedEncodingException;

    public class ByteBitDemo {
        public static void main(String[] args) throws UnsupportedEncodingException {
            String a = "律"; // a single Chinese character (it means "law")
            byte[] bytes = a.getBytes("GBK");
            for (byte aByte : bytes) {
                System.out.print(aByte + " ");
                // Print the byte as a binary string
                String s = Integer.toBinaryString(aByte);
                System.out.println(s);
            }
        }
    }
Printed result:
    -62 11111111111111111111111111000010
    -55 11111111111111111111111111001001
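To compare the two encodings side by side, the following sketch counts the bytes of one character under each charset. It uses the character U+5F8B, written as a Unicode escape, whose UTF-8 encoding is the three bytes -27 -66 -117 printed earlier. The class name `EncodingLengths` and the helper `byteCount` are hypothetical. GBK is not among the charsets every JVM is required to support, so the sketch checks for it first.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingLengths {
    // Hypothetical helper: number of bytes the text occupies in the given charset.
    static int byteCount(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String ch = "\u5F8B"; // one Chinese character, given as an escape to keep the source ASCII-only
        System.out.println("UTF-8: " + byteCount(ch, StandardCharsets.UTF_8) + " bytes"); // 3 bytes
        // Assumption: the runtime ships the GBK charset (typical for a full JDK).
        if (Charset.isSupported("GBK")) {
            System.out.println("GBK: " + byteCount(ch, Charset.forName("GBK")) + " bytes"); // 2 bytes
        }
    }
}
```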
Summary
- One byte corresponds to 8 bits.
- For Chinese characters, the number of bytes depends on the encoding:
  in UTF-8, one Chinese character typically corresponds to three bytes;
  in GBK, one Chinese character corresponds to two bytes.
- An English letter corresponds to one byte in both of these encodings.
- A digit in a string also corresponds to one byte:
  String a = "1"; is a single digit, so it corresponds to one byte.
  String a = "12343"; is a five-digit string, so it corresponds to five bytes.
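The point about digit strings can be sketched as follows. Note that the byte stored for the digit "1" is its character code 49, not the numeric value 1. The class name `DigitBytes` and the helper `describe` are hypothetical names used only for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class DigitBytes {
    // Hypothetical helper: shows a string together with its UTF-8 byte count and byte values.
    static String describe(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return "\"" + s + "\" -> " + bytes.length + " byte(s): " + Arrays.toString(bytes);
    }

    public static void main(String[] args) {
        System.out.println(describe("1"));     // "1" -> 1 byte(s): [49]
        System.out.println(describe("12343")); // "12343" -> 5 byte(s): [49, 50, 51, 52, 51]
    }
}
```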
An open question from this article
Isn't a byte supposed to be 8 bits? Then why are the binary strings printed above 32 bits long? For example:
    -73 11111111111111111111111110110111
    -24 11111111111111111111111111101000
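A likely answer lies in the signature of `Integer.toBinaryString`: it takes an `int`, so a `byte` argument is first widened to a 32-bit `int`. That widening preserves the sign, which means a negative byte gains 24 leading 1 bits. Masking with `0xFF` keeps only the low 8 bits. The sketch below shows both forms; the class name `ByteToBits` and the helper `toBits` are hypothetical.

```java
class ByteToBits {
    // Hypothetical helper: returns exactly 8 binary digits for any byte value.
    static String toBits(byte b) {
        String bits = Integer.toBinaryString(b & 0xFF);      // value in 0..255, upper 24 bits cleared
        return String.format("%8s", bits).replace(' ', '0'); // left-pad with zeros to 8 digits
    }

    public static void main(String[] args) {
        byte[] samples = {-73, -24, 97};
        for (byte b : samples) {
            // First the 32-bit form produced by widening, then the 8-bit form:
            // -73 11111111111111111111111110110111 -> 10110111
            // -24 11111111111111111111111111101000 -> 11101000
            // 97 1100001 -> 01100001
            System.out.println(b + " " + Integer.toBinaryString(b) + " -> " + toBits(b));
        }
    }
}
```

The last 8 digits of each 32-bit string are the actual bits of the byte; the rest is sign extension.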