1: Environment preparation
pyshark can be used to parse captured packets. For common protocols such as HTTP and HTTPS, an ordinary tshark version is enough. However, to parse a relatively new protocol such as QUIC, I have to use a relatively new tshark version.
The environment for this walkthrough is CentOS 7, and the tshark version installed is 3.2.3, built from source.
1. Download the installation package
http://ftp.uni-kl.de/pub/wireshark/src/all-versions/wireshark-3.2.3.tar.xz
2. Install the dependencies
yum install cmake3 libpcap libgcrypt-devel glib2-devel qt-devel qt5-qtbase-devel qt5-linguist qt5-qtmultimedia-devel qt5-qtsvg-devel libcap-devel libcap-ng-devel gnutls-devel krb5-devel libxml2-devel lua-devel lz4-devel snappy-devel spandsp-devel libssh2-devel bcg729-devel libmaxminddb-devel sbc-devel libsmi-devel libnl3-devel libnghttp2-devel libssh-devel libpcap-devel c-ares-devel redhat-rpm-config rpm-build gtk+-devel gtk3-devel desktop-file-utils portaudio-devel rubygem-asciidoctor docbook5-style-xsl docbook-style-xsl systemd-devel gcc gcc-c++ flex bison doxygen gettext-devel libxslt cmake
3. Install wireshark from the source code
tar -xvf wireshark-3.2.3.tar.xz
cd wireshark-3.2.3
cmake3 .
make -i -j 16
make install
Note: do not paste all of the commands above in one go. Run them one at a time, especially cmake3: it checks for the required libraries and dependencies and reports success or failure at the end of its output.
For example, I ran into many errors at this step, but the messages made the cause obvious.
If all of the above steps succeed, check the tshark version:
[root@g7j9z sbin]# tshark -v
Running as user "root" and group "root". This could be dangerous.
TShark (Wireshark) 3.2.3 (Git commit f39b50865a13)

Copyright 1998-2020 Gerald Combs <gerald@wireshark.org> and contributors.
License GPLv2+: GNU GPL version 2 or later <https://www.gnu.org/licenses/gpl-2.0.html>
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Compiled (64-bit) with libpcap, with POSIX capabilities (Linux), with libnl 3,
with GLib 2.56.1, with zlib 1.2.7, with SMI 0.4.8, with c-ares 1.10.0, with Lua
5.1.4, with GnuTLS 3.3.29, with Gcrypt 1.5.3, with MIT Kerberos, with MaxMind DB
resolver, with nghttp2 1.33.0, without brotli, with LZ4, without Zstandard, with
Snappy, with libxml2 2.9.1.

Running on Linux 3.10.0-327.el7.x86_64, with Intel(R) Xeon(R) CPU E5-2630 v4 @
2.20GHz (with SSE4.2), with 322184 MB of physical memory, with locale
en_US.UTF-8, with libpcap version 1.5.3, with GnuTLS 3.3.29, with Gcrypt 1.5.3,
with zlib 1.2.7, binary plugins supported (0 loaded).

Built using gcc 4.8.5 20150623 (Red Hat 4.8.5-44).
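The same check can be scripted, which is handy before running pyshark on a new machine. The following is a minimal sketch and not part of the original setup: the helper names `parse_tshark_version`, `installed_tshark_version`, and `is_new_enough` are made up for illustration, and the threshold of 3.2.3 is simply the version built above, not a documented minimum for QUIC support.

```python
import re
import shutil
import subprocess

# Assumption: 3.2.3 is only the version built in this article,
# not an official minimum for parsing QUIC.
ASSUMED_MINIMUM = (3, 2, 3)


def parse_tshark_version(output):
    """Extract (major, minor, patch) from the text printed by `tshark -v`."""
    match = re.search(r"TShark \(Wireshark\) (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def installed_tshark_version():
    """Run `tshark -v` if tshark is on PATH; return None otherwise."""
    executable = shutil.which("tshark")
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "-v"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # the "Running as user root" warning may go to stderr
        universal_newlines=True,
    )
    return parse_tshark_version(result.stdout)


def is_new_enough(version, minimum=ASSUMED_MINIMUM):
    """Compare version tuples; an unknown version counts as too old."""
    return version is not None and version >= minimum
```

Calling `is_new_enough(installed_tshark_version())` on the machine above should report that the build is recent enough, while a missing tshark binary is treated as a failure.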
4. Install pyshark
[root@g7j9z quicParas]# pip3 install pyshark
Collecting pyshark
  Downloading https://mirrors.zte.com.cn/pypi/packages/e5/96/ebc5fb6cd63242c6851acdfa8a0ac14fbaec2d6c53f6d64d0a5ca06cd1af/pyshark-0.4.3-py3-none-any.whl
Collecting py (from pyshark)
  Downloading https://mirrors.zte.com.cn/pypi/packages/67/32/6fe01cfc3d1a27c92fdbcdfc3f67856da8cbadf0dd9f2e18055202b2dc62/py-1.10.0-py2.py3-none-any.whl (97kB)
    100% |████████████████████████████████| 102kB 2.8MB/s
Collecting lxml (from pyshark)
  Downloading https://mirrors.zte.com.cn/pypi/packages/bd/78/56a7c88a57d0d14945472535d0df9fb4bbad7d34ede658ec7961635c790e/lxml-4.6.2-cp36-cp36m-manylinux1_x86_64.whl (5.5MB)
    100% |████████████████████████████████| 5.5MB 3.5MB/s
Installing collected packages: py, lxml, pyshark
Successfully installed lxml-4.6.2 py-1.10.0 pyshark-0.4.3
2: Parsing QUIC
The QUIC version parsed here is Q023; I have not tried other QUIC versions.
First, take a look at the packet as opened in Wireshark.
The next step is to parse it with pyshark.
1. Open the local pcap file
cap = pyshark.FileCapture('./gquic_q023.pcap')[0]
For this experiment we only use the first packet in the capture, so we take index 0.
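It is good practice to close the capture once you have the packet you need. The helper below is a sketch under some assumptions: the name `first_packet` and the `open_capture` hook are hypothetical (the hook only exists so the helper can be tried without a real capture file), and returning None for an empty capture assumes that pyshark raises KeyError for a missing index, which is my understanding of its behaviour but is worth verifying against your pyshark version.

```python
def first_packet(pcap_path, open_capture=None):
    """Return the first packet of a capture file, or None if it is empty.

    open_capture is a hypothetical hook for testing; by default the
    real pyshark.FileCapture is used.
    """
    if open_capture is None:
        import pyshark  # imported lazily so the module loads without pyshark

        open_capture = pyshark.FileCapture

    capture = open_capture(pcap_path)
    try:
        return capture[0]
    except KeyError:
        # Assumption: pyshark signals a missing packet index with KeyError.
        return None
    finally:
        # Always release the capture, even if indexing failed.
        capture.close()
```

With the file from this article, `first_packet('./gquic_q023.pcap')` would return the same packet object as the one-liner above.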
2. Locate the QUIC application layer
First, look at which attributes and methods the packet object supports:
>>> cap = pyshark.FileCapture('./gquic_q023.pcap')[0]
>>> dir(cap)
['__bool__', '__class__', '__contains__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattr__', '__getattribute__', '__getitem__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__len__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_packet_string', 'captured_length', 'eth', 'frame_info', 'get_multiple_layers', 'get_raw_packet', 'gquic', 'highest_layer', 'interface_captured', 'ip', 'layers', 'length', 'number', 'pretty_print', 'show', 'sniff_time', 'sniff_timestamp', 'transport_layer', 'udp']
>>>
The list includes highest_layer, which identifies the application layer.
We can print it:
>>> cap.highest_layer
'GQUIC'
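To see how highest_layer relates to the other layer attributes in the dir() listing, the layers can be collected in one place. This is a small sketch: `describe_layers` is a name invented here, and it relies on the `layers`, `highest_layer`, and `transport_layer` attributes listed above plus each layer's `layer_name` attribute, which pyshark layers provide as far as I know.

```python
def describe_layers(packet):
    """Summarise the protocol stack of a pyshark-style packet object."""
    # Assumption: every entry in packet.layers exposes a layer_name string.
    names = [layer.layer_name for layer in packet.layers]
    return {
        "layers": names,
        "highest": packet.highest_layer,
        "transport": packet.transport_layer,
    }
```

For the packet in this article, I would expect the summary to list eth, ip, udp, and gquic, with GQUIC reported as the highest layer.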
3. Get the QUIC layer
cap[cap.highest_layer]
We can print the parsed QUIC fields.
For example, to get the SNI field of QUIC, we can access it directly as an attribute:
>>> cap[cap.highest_layer].tag_sni
'www.googleapis.com'
>>>
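Not every packet in a capture carries an SNI tag, and accessing a field that is not present typically raises AttributeError. A defensive lookup might look like the sketch below. The function name `get_sni` is hypothetical, and the defaults `gquic` and `tag_sni` are taken from the output in this article; they may differ for other QUIC versions or tshark releases.

```python
def get_sni(packet, layer_name="gquic", field="tag_sni"):
    """Return the SNI of a packet, or None if the layer or field is absent."""
    # The packet object supports `in` (see __contains__ in the dir() listing).
    if layer_name not in packet:
        return None
    # getattr with a default avoids an exception when the tag is missing.
    return getattr(packet[layer_name], field, None)
```

With the packet used here, `get_sni(cap)` would be expected to return 'www.googleapis.com', and None for a packet that has no GQUIC layer.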
Next, let's look at a complete parsing script and its output.
import pyshark

# Open the stored capture file and take its first packet
cap = pyshark.FileCapture('./gquic_q023.pcap')[0]

print("*****************GQUIC LAYER PRINT*******************")
print(cap[cap.highest_layer])
print("*****************GQUIC LAYER PRINT*******************")
print("QUIC SNI:", cap[cap.highest_layer].tag_sni)
[root@g7j9z quicParas]# python3 quicParac.py
*****************GQUIC LAYER PRINT*******************
Layer GQUIC:
    Public Flags: 0x0d
        .... ...1 = Version: Yes
        .... ..0. = Reset: No
        .... 11.. = CID Length: 8 Bytes (0x3)
        ..00 .... = Packet Number Length: 1 Byte (0x0)
        .0.. .... = Multipath: No
        0... .... = Reserved: 0x0
    CID: 10123107773473542882
    Version: Q023
    Packet Number: 1
    Message Authentication Hash: 020e7c2363fc8725a6caf935
    Private Flags: 0x01
        .... ...1 = Entropy: Yes
        .... ..0. = FEC Group: No
        .... .0.. = FEC: No
        0000 0... = Reserved: 0x00
    STREAM (Special Frame Type) Stream ID: 1, Type: CHLO (Client Hello)
    Frame Type: STREAM (Special Frame Type) (0xa0)
        1... .... = Stream: True
        .0.. .... = FIN: False
        ..1. .... = Data Length: 2 Bytes
        ...0 00.. = Offset Length: 0 Byte (0)
        .... ..00 = Stream Length: 1 Byte (0)
    Stream ID: 1 (Reserved for (G)QUIC handshake, crypto, config updates...)
    Data Length: 1300
    Tag: CHLO (Client Hello)
    Tag Number: 5
    Padding: 0000
    Tag/value: PAD (Padding) (l=1210)
    Tag Type: PAD (Padding)
    Tag offset end: 1210
    Tag length: 1210
    Tag/value: 2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d…
    Padding: 2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d…
    Server Name Indication: www.googleapis.com
    Version: Q023
    Common certificate sets: 0x399ff95340f7fec9
    Proof demand: X509
    Padding Length: 18
    Padding: 000000000000000000000000000000000000
    PADDING Length: 18
    Frame Type: PADDING (0x00)
    Tag/value: SNI (Server Name Indication) (l=18): www.googleapis.com
    Tag/value: VER (Version) (l=4): Q023
    Tag/value: CCS (Common Certificate Sets) (l=16)
    Tag/value: PDMD (Proof Demand) (l=4): X509
    Tag Type: SNI (Server Name Indication)
    Tag Type: VER (Version)
    Tag Type: CCS (Common Certificate Sets)
    Tag Type: PDMD (Proof Demand)
    Tag offset end: 1228
    Tag offset end: 1232
    Tag offset end: 1248
    Tag offset end: 1252
    Tag length: 18
    Tag length: 4
    Tag length: 16
    Tag length: 4
    Tag/value: 7777772e676f6f676c65617069732e636f6d
    Tag/value: 51303233
    Tag/value: 399ff95340f7fec97b26e9e7e45c71ff
    Tag/value: 58353039
    Common certificate sets: 0x7b26e9e7e45c71ff
*****************GQUIC LAYER PRINT*******************
QUIC SNI: www.googleapis.com
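The raw Tag/value lines near the end of the output are the hex encoding of the tag contents. As a quick sanity check, they can be decoded by hand with the standard library. The helper name `decode_tag_value` is made up for this sketch, and it assumes the value is plain ASCII, which holds for the SNI, VER, and PDMD tags here but not for binary tags such as CCS.

```python
def decode_tag_value(hex_string):
    """Decode a hex-encoded tag value into text (ASCII tags only)."""
    return bytes.fromhex(hex_string).decode("ascii")


# Hex strings copied from the Tag/value lines in the output above.
raw_values = [
    ("SNI", "7777772e676f6f676c65617069732e636f6d"),
    ("VER", "51303233"),
    ("PDMD", "58353039"),
]

for tag, raw in raw_values:
    print(tag, decode_tag_value(raw))
# prints:
# SNI www.googleapis.com
# VER Q023
# PDMD X509
```

The decoded strings match the values tshark already shows next to each tag, which confirms that pyshark is simply exposing what tshark decoded.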
3: Summary
If you have used scapy, the code above will look familiar. The strength of pyshark is that it can call all of the packet decoders built into tshark. This article only uses pyshark to parse a capture file, but it can do more than that: you can also use it to sniff on a network interface. Note that because decoding depends on tshark, parsing a relatively new protocol means upgrading tshark as well (provided the new version supports that protocol). Protocols keep evolving, so the tooling has to be kept up to date.
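As a rough illustration of sniffing on an interface, the sketch below collects SNI values from live traffic. Several things here are assumptions rather than anything tested in this article: the function names `collect_snis` and `sniff_snis`, the `udp port 443` capture filter, and the `gquic`/`tag_sni` names, which are carried over from the file-based example and may not apply to newer QUIC versions. Live capture also usually needs root privileges.

```python
def collect_snis(packets, layer_name="gquic", field="tag_sni"):
    """Collect SNI strings from an iterable of pyshark-style packets."""
    snis = []
    for packet in packets:
        if layer_name not in packet:
            continue  # not a GQUIC packet
        sni = getattr(packet[layer_name], field, None)
        if sni is not None:
            snis.append(str(sni))
    return snis


def sniff_snis(interface, packet_count=10):
    """Sniff packet_count packets on an interface and return their SNIs."""
    import pyshark  # imported lazily so the module loads without pyshark

    # Assumption: QUIC traffic of interest runs over UDP port 443.
    capture = pyshark.LiveCapture(interface=interface, bpf_filter="udp port 443")
    try:
        return collect_snis(capture.sniff_continuously(packet_count=packet_count))
    finally:
        capture.close()
```

A call such as `sniff_snis("eth0")` would then return a list of server names, where `eth0` stands in for whatever interface name your system uses.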