[note] How do you know what a build script downloads?

How does a product track vulnerabilities? Managing the supply chain raises many problems. A factory assembly line is fitted with sensors that monitor the production environment. What about the software assembly line? For example, did the build script download an unsafe library? Has one of our download sources been compromised so that it serves an unsafe package? To answer these questions, we first need to know which websites the build script visits and which URLs it requests.

In fact, we can borrow the man-in-the-middle attack technique: set up a proxy and route all network traffic through it. The proxy impersonates the target website to the client and relays the traffic to the real target unchanged. This way we naturally obtain every HTTP/HTTPS URL. Binary protocols are still hard to handle. For example, ssh is not a text protocol, so it cannot be parsed easily, and protocols built on udp are even more complicated. Therefore, to achieve 100% monitoring with current technology, all build scripts must use HTTP/HTTPS for external access.
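As a minimal sketch of routing a build script through such a proxy, the helper below builds an environment with the conventional proxy variables set. The function name proxy_env and the address http://127.0.0.1:8080 (mitmproxy's default listening port) are assumptions for illustration, and not every tool honors these variables.

```python
import os


def proxy_env(proxy_url, base_env=None):
    """Return a copy of the environment with the common proxy variables set.

    proxy_env is a hypothetical helper; it is not part of any tool
    mentioned in this note.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Many tools (curl, pip, wget, ...) read one spelling or the other,
    # so set both the lower-case and upper-case variants.
    for name in ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY"):
        env[name] = proxy_url
    # Drop any exclusion list so that no host is allowed to bypass the proxy.
    env.pop("no_proxy", None)
    env.pop("NO_PROXY", None)
    return env
```

The result can be passed to a child process, for example subprocess.run(["bash", "build.sh"], env=proxy_env("http://127.0.0.1:8080")), where build.sh stands for whatever build script is being observed.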

An HTTP proxy is easy to understand, but for HTTPS some people will wonder how the proxy can impersonate a website and still pass SSL certificate verification. In the end this is a matter of certificate configuration. We generate a root certificate and use it to issue certificates for all the impersonated websites; then we register the root certificate wherever it is needed so that it is trusted temporarily. There is one troublesome point: different tools have their own ways of reading and verifying root certificates, so we need to check which tools the build script uses and configure each of them.
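To illustrate the per-tool configuration, here is a hedged sketch that maps a few well-known CA environment variables to the proxy's root certificate. The helper name ca_env is hypothetical, the list is not exhaustive, and this is not necessarily how any particular tool sets things up. Most of these variables replace the default trust store rather than extend it, which is acceptable here because all traffic is expected to pass through the proxy.

```python
def ca_env(ca_bundle_path):
    """Map well-known CA environment variables to one certificate file.

    ca_env is a hypothetical helper for illustration only.
    """
    return {
        "SSL_CERT_FILE": ca_bundle_path,        # OpenSSL-based programs
        "CURL_CA_BUNDLE": ca_bundle_path,       # curl
        "REQUESTS_CA_BUNDLE": ca_bundle_path,   # Python requests library
        "PIP_CERT": ca_bundle_path,             # pip
        "NODE_EXTRA_CA_CERTS": ca_bundle_path,  # Node.js (adds to defaults)
        "GIT_SSL_CAINFO": ca_bundle_path,       # git over HTTPS
    }
```

With mitmproxy, the certificate file is typically ~/.mitmproxy/mitmproxy-ca-cert.pem after the first run. Tools that keep their own trust store, such as Java with its keystore, need separate handling.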

Because each tool is configured differently for an HTTPS proxy, how do we make sure that all traffic really goes through the proxy? We can bring in another powerful tool, tcpdump, which captures all raw network traffic. From the capture we can list the src and dst IP addresses and parse the DNS packets to get hostnames, so every external network access is listed. If some HTTPS traffic is not routed through the proxy, unknown IPs or hostnames will show up in the tcpdump report.
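The cross-check itself is a set difference. The sketch below assumes the hostnames and IPs have already been extracted from the tcpdump capture and from the proxy log into two lists; the function name unproxied_hosts and its parameters are hypothetical.

```python
def unproxied_hosts(tcpdump_hosts, proxy_hosts, ignore=()):
    """Return hosts or IPs seen on the wire that the proxy never saw."""

    def normalize(hosts):
        # DNS names in packet captures often end with a trailing dot
        # and may differ in case, so normalize before comparing.
        return {h.lower().rstrip(".") for h in hosts}

    known = normalize(proxy_hosts) | normalize(ignore)
    return sorted(normalize(tcpdump_hosts) - known)
```

Anything returned here is traffic that bypassed the proxy and deserves a closer look; the ignore parameter is meant for expected entries such as the proxy's own address or the local DNS resolver.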

https://github.com/stallpool/track-network-traffic

Looking at the code, the core is mitmproxy, which automatically generates a root certificate and then intercepts URLs by impersonating each website. The main job of track-network-traffic is to configure that root certificate in the tools that need it. It also needs to turn on tcpdump so that no traffic slips through.

Let's take a look at the usage of this tool:

bash ./bin/tnt.bash -o . -- curl https://www.google.com
bash ./bin/tnt.bash -a -o . -- pip install pg8000

Of course, wrapping commands one at a time is tedious, so put them in a bash script:

cat > all_in_one.sh <<EOF
curl https://www.google.com
pip install pg8000
EOF

bash ./bin/tnt.bash -o . -- bash all_in_one.sh

This produces the following report.json:

{
   "items": [
      {
         "content": "binary",
         "host": "www.google.com",
         "path": [
            "/ HTTP/2.0"
         ],
         "protocol": "https"
      },
      {
         "content": "binary",
         "host": "pypi.org",
         "path": [
            "/simple/pg8000/",
            "/simple/scramp/",
            "/simple/asn1crypto/"
         ],
         "protocol": "https"
      },
      {
         "content": "binary",
         "host": "files.pythonhosted.org",
         "path": [
            "/packages/0d/b9/0f8e90f4d3785c517b15e1643d58fd484e2b594559d1af37e19217a74817/pg8000-1.22.0-py3-none-any.whl",
            "/packages/27/31/80bfb02ba2daa9a0ca66f82650c411f1a2b21ce85164408f57e99aab4e4e/scramp-1.4.1-py3-none-any.whl",
            "/packages/b5/a8/56be92dcd4a5bf1998705a9b4028249fe7c9a035b955fe93b6a3e5b829f8/asn1crypto-1.4.0-py2.py3-none-any.whl"
         ],
         "protocol": "https"
      }
   ]
}
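A report in this shape is easy to post-process. The following is a small sketch, assuming only the fields visible in the sample above (items, host, path, protocol); the function names and the allowlist idea are illustrative additions, not features of the tool.

```python
import json


def report_urls(report):
    """Flatten the report into a list of full URLs."""
    urls = []
    for item in report.get("items", []):
        for path in item.get("path", []):
            # Some entries carry a trailing protocol token such as
            # "/ HTTP/2.0"; keep only the part before the first space.
            clean = path.split(" ", 1)[0]
            urls.append(f"{item['protocol']}://{item['host']}{clean}")
    return urls


def unexpected_hosts(report, allowed_hosts):
    """Return hosts in the report that are not on the allowlist."""
    seen = {item["host"] for item in report.get("items", [])}
    return sorted(seen - set(allowed_hosts))


def load_report(path="report.json"):
    """Read a report file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

For example, unexpected_hosts(load_report(), {"pypi.org", "files.pythonhosted.org"}) would flag www.google.com in the report above, which is the kind of check a build pipeline could run after each build.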

For docker build we can also configure the various proxies, but the resulting image cannot be published because it contains extra layers holding the certificates and proxy configuration. This can be solved by forcibly removing those layers from the layer chain. For example, to get the network access report of a docker build:

cat > Dockerfile <<EOF
FROM python:2.7.18
RUN curl https://www.google.com
RUN pip install pg8000
EOF

./bin/docker build .

The rest is left for you to explore. Readers who want to discuss can add me on WeChat and mention the topic they want to talk about.

Keywords: security https

Added by jakem on Wed, 10 Nov 2021 02:16:12 +0200