Qt ASCII to Unicode to Chinese

1. Preface

This note covers one piece of protocol handling when reading second-generation ID cards. I have worked with two reader modules so far, and their protocols differ slightly. The parsing of the name field is used here as an example to illustrate the conversion.

2. Example and conversion process

In the parsed result of a second-generation ID card, the name field normally occupies 30 bytes, but modules differ in how they return it, because some return the data as ASCII text. For example, the Unicode code point of the Chinese character 郑 (Zheng) is "\u90D1". Some modules return it directly as the two bytes 0x90 and 0xD1 (low byte first on the wire, as the code in section 2.2 shows), while others return it as 4 bytes: the hexadecimal digits themselves, read from the serial port as ASCII characters, namely 0x44 (D), 0x31 (1), 0x39 (9) and 0x30 (0). In both cases the data read from the serial port has to be converted into the actual Chinese characters. The conversion process is described here.
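To make the two wire formats concrete, here is a minimal sketch in standard C++ (no Qt) that produces both representations of one UTF-16 code unit. The helper names toRawBytes and toAsciiHex are made up for this illustration and are not part of any module's API; the low-byte-first order is assumed from the examples in this article.

```cpp
#include <array>
#include <cstdint>
#include <string>

// Raw form: one UTF-16 code unit as two bytes, low byte first (assumed order).
std::array<std::uint8_t, 2> toRawBytes(char16_t unit) {
    return { static_cast<std::uint8_t>(unit & 0xFF),
             static_cast<std::uint8_t>(unit >> 8) };
}

// ASCII form: the same two bytes written out as four uppercase hex characters.
std::string toAsciiHex(char16_t unit) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (std::uint8_t b : toRawBytes(unit)) {
        out.push_back(digits[b >> 4]);    // high nibble of the byte
        out.push_back(digits[b & 0x0F]);  // low nibble of the byte
    }
    return out;
}
```

With these helpers, toAsciiHex(0x90D1) gives "D190", and toRawBytes(0x90D1) gives the bytes 0xD1, 0x90.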

2.1 ASCII format

In this format the name field occupies 60 bytes. They can be converted to 30 raw bytes and then to 15 characters, or split directly into 15 groups of 4 ASCII characters, with each group decoded on its own. This sounds convoluted, but it is easy to see in practice: after receiving the serial port data into a QByteArray, printing it shows the ASCII text directly, for example "D190". Printing right after reading from the serial port:

QByteArray readBuff;
readBuff = serialPort.readAll();

qDebug()<<readBuff;

Pasting the printed text into an online ASCII-to-Chinese conversion tool shows the decoded result:

[Image: screenshot of the online conversion tool output]

Then extract the 60 bytes (30 bytes of data written as ASCII hex, where each byte takes two characters: "D1" is stored as the characters D and 1, each occupying one byte) and convert them into Chinese characters as follows. Every four ASCII characters (two encoded bytes) form one Chinese character, and the high and low bytes must be swapped:

QByteArray tmp;

// Copy the 60 ASCII characters of the name field
for (int i = 0; i < 60; i++) {
    tmp.append(static_cast<char>(g_IDNumCardInfo[beginOffset + i]));
}
qDebug() << tmp << tmp.size();

QString nameRes;
for (int i = 0; i < 15; i++) {
    // Swap high and low bytes: "D190" on the wire becomes "90D1"
    QByteArray hex = tmp.mid(4 * i + 2, 2) + tmp.mid(4 * i, 2);

    bool ok = false;
    ushort code = hex.toUShort(&ok, 16);
    if (!ok) {
        continue; // skip groups that are not valid hexadecimal
    }
    // QChar is a value type, so no heap allocation is needed
    nameRes.append(QChar(code));
}

qDebug() << "file:" << __FILE__ << "line:" << __LINE__ << "name:" << nameRes;
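For readers without Qt at hand, the same decoding can be sketched in standard C++. This is only an illustration under the assumptions of this section (four ASCII hex characters per character, low byte first); the names hexValue and decodeAsciiHexName are hypothetical, and stopping at the first invalid group is a choice made for this sketch, not something the modules specify.

```cpp
#include <cstddef>
#include <string>

// Convert one hex character to its value, or -1 if it is not a hex digit.
int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Decode groups of four ASCII hex characters ("D190") into UTF-16 code
// units (0x90D1). Decoding stops at the first incomplete or invalid group.
std::u16string decodeAsciiHexName(const std::string &field) {
    std::u16string out;
    for (std::size_t i = 0; i + 4 <= field.size(); i += 4) {
        int d[4];
        bool valid = true;
        for (int k = 0; k < 4; ++k) {
            d[k] = hexValue(field[i + k]);
            if (d[k] < 0) valid = false;
        }
        if (!valid) break;
        int low = d[0] * 16 + d[1];   // first two characters: low byte
        int high = d[2] * 16 + d[3];  // last two characters: high byte
        out.push_back(static_cast<char16_t>((high << 8) | low));
    }
    return out;
}
```

In a Qt program, the resulting std::u16string could then be turned into a QString with QString::fromStdU16String.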

2.2 Hexadecimal bytes

Here the 30 bytes can be combined into 15 16-bit values for conversion. This format is effectively the previous one with a layer of encoding removed: each pair of ASCII hex characters is packed into a single byte. To decode it, swap the high and low bytes of each pair, store the result in a ushort array, and convert the array directly into a string:

ushort name[15];

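for (int i = 0; i < 15; i++) {
    // Low byte comes first; mask both bytes so a signed char buffer cannot sign-extend
    ushort low  = g_IDNumCardInfo[8 + 2 * i] & 0x00ff;
    ushort high = g_IDNumCardInfo[8 + 2 * i + 1] & 0x00ff;
    name[i] = static_cast<ushort>((high << 8) | low);

    // QString::asprintf replaces the deprecated QString().sprintf; print all 4 hex digits
    qDebug() << "file:" << __FILE__ << "line:" << __LINE__ << QString::asprintf("%04x", name[i]);
}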

QString nameRes = QString::fromUtf16(name, 15);
qDebug()<<"file:"<<__FILE__<<"line:"<<__LINE__<< nameRes;
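The byte-combining step can also be shown without Qt. The following is a rough sketch in standard C++, assuming the field holds UTF-16 code units with the low byte first; decodeRawName and its offset and count parameters are hypothetical names chosen for this example, and the bounds check is an addition that the snippet above does not have.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Combine pairs of raw bytes (low byte first) into UTF-16 code units.
// Reads at most `count` units starting at `offset`, and stops early if
// the buffer does not contain a complete pair.
std::u16string decodeRawName(const std::vector<std::uint8_t> &buf,
                             std::size_t offset, std::size_t count) {
    std::u16string out;
    for (std::size_t i = 0; i < count; ++i) {
        std::size_t pos = offset + 2 * i;
        if (pos + 1 >= buf.size()) break;  // not enough bytes left
        // uint8_t is unsigned, so no sign extension can occur here
        out.push_back(static_cast<char16_t>(buf[pos] | (buf[pos + 1] << 8)));
    }
    return out;
}
```

For the name field in this article, the call would look like decodeRawName(buffer, 8, 15), where buffer is assumed to hold the bytes read from the serial port.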

3. Summary

Every two bytes hold the Unicode (UTF-16) code of one Chinese character, low byte first, for example the two bytes of 90D1. The four hex digits 9, 0, D and 1 can be packed in pairs and sent as two bytes (0x90, 0xD1), or left unpacked and sent as four bytes, depending on the module: either as ASCII characters (0x39, 0x30, 0x44, 0x31) or as raw digit values (0x09, 0x00, 0x0D, 0x01).

Keywords: Qt Embedded system unicode

Added by rnintulsa on Fri, 01 Oct 2021 01:08:45 +0300