37 - Case: DNS resolution is sometimes fast and sometimes slow. What should I do?


Although IP addresses make communication between machines easy, they place a heavy memory burden on the people who use these services
Few people can remember the IP address of GitHub, because the string of digits means nothing to the human brain

However, this does not stop us from using the service frequently. Why?
Because there is a simpler and more convenient way
You can use the domain name github.com instead of a specific IP address, and this is exactly where the Domain Name System (DNS) comes from

DNS (Domain Name System) is one of the most basic services on the Internet. Its main job is to answer queries about the mapping between domain names and IP addresses

DNS not only makes it easier for people to reach different Internet services
It also provides dynamic service discovery and Global Server Load Balancing (GSLB) for many applications
With these mechanisms, DNS can select the IP address closest to the user to provide the service
Even if the IP address of the back-end service changes, users can still access it with the same domain name




Domain name and DNS resolution

A domain name consists of a string of characters separated by dots and is used as the name of a computer or group of computers on the Internet
Its purpose is to make it easy to identify the hosts that provide the various services on the Internet

Note that a domain name is globally unique and can only be registered through a dedicated domain name registrar
To organize the many computers on the global Internet, domain names are separated by dots into a hierarchical structure
Each string between dots is one level of the domain name, and the further to the right it sits, the higher its level

Take the Geek Time website, time.geekbang.org, as an example to understand what a domain name means
In this string, the rightmost org is the top-level domain, geekbang in the middle is the second-level domain, and the leftmost time is the third-level domain

As shown in the figure below, note that the dot (.) is the root of all domain names, that is, all domain names use the dot as the suffix
Put another way, during domain name resolution every domain name ends with a dot
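
The hierarchy described above can be sketched in a few lines of Python. The helper name `domain_levels` is an illustrative assumption, not part of any DNS tool. It simply splits a name on dots and reads the labels from right to left.

```python
from typing import List


def domain_levels(name: str) -> List[str]:
    """Return the labels of a domain name from the highest level to the lowest."""
    # A fully qualified name ends with the root dot; drop it before splitting.
    labels = name.rstrip(".").split(".")
    # The rightmost label is the top-level domain, so reverse the order.
    return list(reversed(labels))


if __name__ == "__main__":
    for level, label in enumerate(domain_levels("time.geekbang.org."), start=1):
        print(f"level {level}: {label}")
```

The trailing dot is accepted but not required, which mirrors the fact that the root is usually left implicit when people write domain names.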

By understanding these concepts, we can see that the domain name is mainly for people to remember, and the IP address is the real mechanism of communication between machines
The service that converts domain names to IP addresses is the domain name resolution service (DNS) mentioned at the beginning
The corresponding server is the domain name server, and the network protocol is the DNS protocol

The DNS protocol belongs to the application layer of the TCP/IP stack, but it is actually carried over UDP or TCP (UDP in most cases)
Domain name servers generally listen on port 53
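
To make the protocol a little more concrete, here is a minimal sketch of what a DNS query for an A record looks like on the wire, following the standard message layout (a 12-byte header followed by one question). The function name `build_dns_query` and the default query ID are assumptions chosen for illustration. The sketch only builds the bytes and does not send anything.

```python
import struct


def build_dns_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query message asking for the A record of `name`."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question,
    # 0 answers, 0 authority records, 0 additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed with its length,
    # and the name ends with a zero byte that stands for the root.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = IN (Internet).
    question = qname + struct.pack("!HH", 1, 1)
    return header + question


if __name__ == "__main__":
    print(build_dns_query("time.geekbang.org").hex())
```

Sending these bytes in a UDP datagram to port 53 of a DNS server is, in essence, what tools such as nslookup do when they issue a query.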

Since domain names are managed in a hierarchical structure, domain name resolution is also recursive (starting from the top level and working down)
The query is sent to the domain name server at each level in turn until the resolution result is obtained

However, don't worry. You don't have to carry out the recursive query yourself. The DNS server does it for you
We just need to configure an available DNS server in advance

Of course, DNS servers at each level generally keep a cache of recently resolved records
On a cache hit, the server replies directly with the cached record
If the record has expired or is not in the cache, the server queries recursively as just described

Therefore, when configuring the network on a Linux system, the system administrator needs to configure not only the IP address but also the DNS server
Only then can the machine access external services by domain name

Execute the following command to query the system configuration

$ cat /etc/resolv.conf
nameserver 114.114.114.114
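
When a script needs the same information, the file can be read programmatically. The following is a minimal sketch, assuming the usual resolv.conf layout of one `nameserver <address>` entry per line. The function name `parse_nameservers` is hypothetical.

```python
from typing import List


def parse_nameservers(resolv_conf_text: str) -> List[str]:
    """Extract the nameserver addresses from the text of a resolv.conf file."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines, which start with '#' or ';'.
        if not stripped or stripped[0] in "#;":
            continue
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


if __name__ == "__main__":
    sample = "nameserver 114.114.114.114\n"
    print(parse_nameservers(sample))
```

An empty result means no DNS server is configured at all, which is worth checking for explicitly before blaming the network.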

The DNS service manages all its data as resource records. It supports A, CNAME, MX, NS, PTR and other record types

  1. A record: maps a domain name to an IP address
  2. CNAME record: creates an alias
  3. NS record: indicates the name server responsible for a domain name

In short, when you access a web address, the IP address for the domain name is first looked up through the DNS A record, and the web service is then accessed through that IP

Take the Geek Time website, time.geekbang.org, as an example again
Execute the following nslookup command to query the A record of this domain name

$ nslookup time.geekbang.org
# Domain name server and port information
Server: 114.114.114.114
Address: 114.114.114.114#53
# Non authoritative query results
Non-authoritative answer:
Name: time.geekbang.org
Address: 39.106.233.176

Note that because 114.114.114.114 is not the name server that directly manages time.geekbang.org, the query result is non-authoritative
The command above only shows the result as returned by 114.114.114.114

If the cache is not hit, the DNS query is actually a recursive process. Is there a way to see how the whole recursive query is carried out?

In fact, besides nslookup there is another commonly used DNS resolution tool, dig
It provides a trace function that shows the whole recursive query process
For example, you can execute the following command to get the query results

# +trace enables trace mode for the query
# +nodnssec turns off DNSSEC (DNS Security Extensions) records in the query
$ dig +trace +nodnssec time.geekbang.org
; <<>> DiG 9.11.3-1ubuntu1.3-Ubuntu <<>> +trace +nodnssec time.geekbang.org
;; global options: +cmd
. 322086 IN NS m.root-servers.net.
. 322086 IN NS a.root-servers.net.
. 322086 IN NS i.root-servers.net.
. 322086 IN NS d.root-servers.net.
. 322086 IN NS g.root-servers.net.
. 322086 IN NS l.root-servers.net.
. 322086 IN NS c.root-servers.net.
. 322086 IN NS b.root-servers.net.
. 322086 IN NS h.root-servers.net.
. 322086 IN NS e.root-servers.net.
. 322086 IN NS k.root-servers.net.
. 322086 IN NS j.root-servers.net.
. 322086 IN NS f.root-servers.net.
;; Received 239 bytes from 114.114.114.114#53(114.114.114.114) in 1340 ms
org. 172800 IN NS a0.org.afilias-nst.info.
org. 172800 IN NS a2.org.afilias-nst.info.
org. 172800 IN NS b0.org.afilias-nst.org.
org. 172800 IN NS b2.org.afilias-nst.org.
org. 172800 IN NS c0.org.afilias-nst.info.
org. 172800 IN NS d0.org.afilias-nst.org.
;; Received 448 bytes from 198.97.190.53#53(h.root-servers.net) in 708 ms
geekbang.org. 86400 IN NS dns9.hichina.com.
geekbang.org. 86400 IN NS dns10.hichina.com.
;; Received 96 bytes from 199.19.54.1#53(b0.org.afilias-nst.org) in 1833 ms
time.geekbang.org. 600 IN A 39.106.233.176
;; Received 62 bytes from 140.205.41.16#53(dns10.hichina.com) in 4 ms

The output of dig +trace mainly consists of four parts

  1. The first part is the NS records of the root name servers (.), obtained from 114.114.114.114
  2. The second part picks one of those NS records (h.root-servers.net) and queries it for the NS records of the top-level domain org.
  3. The third part picks one of the NS records of org. (b0.org.afilias-nst.org)
    and queries it for the NS servers of the second-level domain geekbang.org.
  4. The last part queries the NS server of geekbang.org. (dns10.hichina.com) for the A record of the final host time.geekbang.org

The NS records shown at each level in this output are in fact the name servers for that level, which helps you better understand the DNS resolution process
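
When a trace is long, it can help to pull out just the server and latency of each step. The sketch below assumes the `;; Received ... from ... in N ms` line format shown in the output above. The names `RECEIVED_RE`, `trace_hops` and `SAMPLE_TRACE` are illustrative, not part of dig.

```python
import re
from typing import List, Tuple

# Matches lines such as:
# ;; Received 62 bytes from 140.205.41.16#53(dns10.hichina.com) in 4 ms
RECEIVED_RE = re.compile(
    r"^;; Received \d+ bytes from \S+?\((?P<server>[^)]+)\) in (?P<ms>\d+) ms"
)

# The four summary lines copied from the trace output above.
SAMPLE_TRACE = """\
;; Received 239 bytes from 114.114.114.114#53(114.114.114.114) in 1340 ms
;; Received 448 bytes from 198.97.190.53#53(h.root-servers.net) in 708 ms
;; Received 96 bytes from 199.19.54.1#53(b0.org.afilias-nst.org) in 1833 ms
;; Received 62 bytes from 140.205.41.16#53(dns10.hichina.com) in 4 ms
"""


def trace_hops(dig_output: str) -> List[Tuple[str, int]]:
    """Return (server, latency in ms) for every step of a dig +trace output."""
    hops = []
    for line in dig_output.splitlines():
        match = RECEIVED_RE.match(line.strip())
        if match:
            hops.append((match.group("server"), int(match.group("ms"))))
    return hops


if __name__ == "__main__":
    for server, ms in trace_hops(SAMPLE_TRACE):
        print(f"{server}: {ms} ms")
```

Listing the steps this way makes it easy to see which level of the hierarchy contributes most of the delay in an uncached lookup.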

Of course, it is not only services published on the Internet that need domain names
Very often we also want to resolve names for hosts inside a LAN (an intranet domain name is, in most cases, simply the host name)
Linux supports this as well

The mapping between host names and IP addresses can be written into the local /etc/hosts file
The specified host name can then be resolved to the target IP locally
For example, you can execute the following command to view it

$ cat /etc/hosts
127.0.0.1 localhost localhost.localdomain
::1 localhost6 localhost6.localdomain6
192.168.0.100 domain.com
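
The lookup that this file enables can be sketched as follows. This is a simplified illustration rather than the system resolver's actual logic. The function name `parse_hosts` is hypothetical, and the sketch assumes that the first entry for a name wins.

```python
from typing import Dict


def parse_hosts(hosts_text: str) -> Dict[str, str]:
    """Map each host name found in hosts-file text to its IP address."""
    mapping: Dict[str, str] = {}
    for line in hosts_text.splitlines():
        # Everything after '#' is a comment.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        ip, names = fields[0], fields[1:]
        for name in names:
            # Assumption: keep the first entry seen for a name.
            mapping.setdefault(name, ip)
    return mapping


if __name__ == "__main__":
    sample = (
        "127.0.0.1 localhost localhost.localdomain\n"
        "::1 localhost6 localhost6.localdomain6\n"
        "192.168.0.100 domain.com\n"
    )
    print(parse_hosts(sample)["domain.com"])
```

Each line carries one IP address followed by one or more names, so a single address can answer for several aliases.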

Alternatively, you can set up a custom DNS server on the intranet to resolve intranet domain names
The intranet DNS server is generally configured with one or more upstream DNS servers to resolve Internet domain names




Case preparation

  1. Machine configuration

    Ubuntu 18.04
    Machine configuration: 2 CPUs, 4 GB memory
    Preinstall tools such as docker, for example: apt install docker.io
    
    
  2. SSH log in to the Ubuntu machine, and then execute the following command to pull the Docker image used in the case

    root@alnk:~# docker pull feisky/dnsutils
    Using default tag: latest
    latest: Pulling from feisky/dnsutils
    ......
    
    
  3. Run the following command to view the DNS server currently configured by the host

    root@alnk:~# cat /etc/resolv.conf
    nameserver 114.114.114.114
    
    

Case 1: DNS resolution failed

  1. Execute the following command to enter today's first case. If everything is normal, you can see the following output

    root@alnk:~# echo $(mktemp)
    /tmp/tmp.egiweqJvFf
    
    # Enter the SHELL terminal of the case environment
    root@alnk:~# docker run -it --rm -v $(mktemp):/etc/resolv.conf feisky/dnsutils bash
    root@7d705f264ae5:/# 
    
    # The root@7d705f264ae5:/# prompt indicates a command that runs inside the container
    
    
  2. In the container terminal, run the DNS query command to look up the IP address of time.geekbang.org

    root@7d705f264ae5:/# nslookup time.geekbang.org
    ;; connection timed out; no servers could be reached
    ##
    The command blocks for a long time and still fails, reporting the errors connection timed out and no servers could be reached
    
    
  3. Check the network with ping tool

    root@7d705f264ae5:/# ping -c3 114.114.114.114
    PING 114.114.114.114 (114.114.114.114): 56 data bytes
    64 bytes from 114.114.114.114: icmp_seq=0 ttl=67 time=35.739 ms
    64 bytes from 114.114.114.114: icmp_seq=1 ttl=91 time=35.738 ms
    64 bytes from 114.114.114.114: icmp_seq=2 ttl=77 time=35.693 ms
    --- 114.114.114.114 ping statistics ---
    3 packets transmitted, 3 packets received, 0% packet loss
    round-trip min/avg/max/stddev = 35.693/35.723/35.739/0.000 ms
    ##
    You can see that the network is reachable
    So how can you find out why the nslookup command failed?
    There are many ways to do this. The simplest is to turn on the debug output of nslookup
    Then view the detailed steps of the query and check whether anything is abnormal
    
    
  4. Still in the container terminal, execute the following command

    root@7d705f264ae5:/# nslookup -debug time.geekbang.org
    ;; Connection to 127.0.0.1#53(127.0.0.1) for time.geekbang.org failed: connection refused.
    ;; Connection to ::1#53(::1) for time.geekbang.org failed: address not available.
    ##
    You can see from this output that nslookup failed to connect to port 53 of the loopback addresses (127.0.0.1 and ::1)
    This raises a question: why does it connect to the loopback address rather than the 114.114.114.114 seen earlier?
    ##
    Perhaps no DNS server is configured in the container
    Execute the following command to confirm
    root@7d705f264ae5:/# cat /etc/resolv.conf
    ##
    Sure enough, this command prints nothing, which shows that no DNS server is configured in the container
    At this point, you know the solution
    Configure a DNS server in the /etc/resolv.conf file
    
    
  5. Execute the following commands to configure the DNS server and then rerun the nslookup command

    root@7d705f264ae5:/# echo "nameserver 114.114.114.114" > /etc/resolv.conf
    root@7d705f264ae5:/# nslookup time.geekbang.org
    Server: 114.114.114.114
    Address: 114.114.114.114#53
    
    Non-authoritative answer:
    Name: time.geekbang.org
    Address: 39.106.233.176
    ##
    With that, the first case is easily solved
    Finally, run the exit command in the terminal to leave the container. Docker will automatically clean up the container that just ran
    
    


Case 2: DNS resolution is unstable

  1. Execute the following command to start a new container and enter its terminal

    root@alnk:~# docker run -it --rm --cap-add=NET_ADMIN --dns 8.8.8.8 feisky/dnsutils bash
    root@d99479553cb5:/# 
    
    
  2. Run the nslookup command to resolve the IP address of time.geekbang.org

    root@d99479553cb5:/# time nslookup time.geekbang.org
    Server: 8.8.8.8
    Address: 8.8.8.8#53
    
    Non-authoritative answer:
    Name: time.geekbang.org
    Address: 39.106.233.176
    
    real 0m10.349s
    user 0m0.004s
    sys 0m0.0
    ##
    You can see that this resolution is very slow, taking 10 seconds
    Overall, DNS resolution is now not only slow, it can also fail with a timeout
    ##
    Why is that, and how should we deal with it?
    As explained earlier, DNS resolution is, put simply, an exchange between a client and a server, and this exchange uses the UDP protocol
    Across that whole exchange, several things can make resolution unstable. For example
    1. The DNS server itself has a problem and responds slowly or unreliably
    2. The network latency from the client to the DNS server is high
    3. DNS request or response packets are, in some cases, dropped by network devices along the path
    ##
    From the nslookup output above, you can see that the client is currently using the DNS server 8.8.8.8
    This is the DNS service provided by Google
    Google can generally be trusted here, so the probability of a DNS server failure should be relatively small
    That basically rules out a DNS server problem
    Is it the second possibility, then: high latency from the local machine to the DNS server?
    
    
  3. ping can be used to test server latency

    root@d99479553cb5:/# ping -c3 8.8.8.8
    PING 8.8.8.8 (8.8.8.8): 56 data bytes
    64 bytes from 8.8.8.8: icmp_seq=0 ttl=31 time=137.637 ms
    64 bytes from 8.8.8.8: icmp_seq=1 ttl=31 time=144.743 ms
    64 bytes from 8.8.8.8: icmp_seq=2 ttl=31 time=138.576 ms
    --- 8.8.8.8 ping statistics ---
    3 packets transmitted, 3 packets received, 0% packet loss
    round-trip min/avg/max/stddev = 137.637/140.319/144.743/3.152 ms
    ##
    From the ping output you can see that the latency has reached 140 ms, which explains why resolution is so slow
    ##
    What should we do when we encounter this problem?
    Obviously, since the latency is too high, switch to a DNS server with lower latency, such as 114.114.114.114 provided by the telecom operator
    
    
  4. Replace the DNS server and execute the nslookup resolution command again

    root@d99479553cb5:/# echo nameserver 114.114.114.114 > /etc/resolv.conf
    
    root@d99479553cb5:/# time nslookup time.geekbang.org
    Server: 114.114.114.114
    Address: 114.114.114.114#53
    
    Non-authoritative answer:
    Name: time.geekbang.org
    Address: 39.106.233.176
    
    real 0m0.070s
    user 0m0.000s
    sys 0m0.007s
    root@d99479553cb5:/# time nslookup time.geekbang.org
    Server: 114.114.114.114
    Address: 114.114.114.114#53
    
    Non-authoritative answer:
    Name: time.geekbang.org
    Address: 39.106.233.176
    
    real 0m0.075s
    user 0m0.007s
    sys 0m0.000s
    ##
    Now resolution completes in about 70 ms, which is much better than the 10 s seen just now
    ##
    At this point the problem seems to be solved
    However, if you run the nslookup command several times, you probably won't get such good results every time
    For example, sometimes it takes 1 s or even longer
    ##
    A DNS resolution time of 1 s is still too long and is unacceptable for many applications
    So how do we solve this?
    Use a DNS cache
    That way, only the first query has to go to the DNS server. Later queries simply use the cached record as long as it has not expired
    ##
    Note, however, that among mainstream Linux distributions, apart from recent versions of Ubuntu (such as 18.04 or later)
    other distributions do not configure a DNS cache automatically
    ##
    So enabling DNS caching for the system requires additional configuration
    For example, the simplest way is to use dnsmasq
    ##
    dnsmasq is one of the most commonly used DNS caching services, and it is also often used as a DHCP service
    It is relatively simple to install and configure, and its performance meets the DNS caching needs of most applications
    
    
  5. Execute the following command to start dnsmasq

    root@d99479553cb5:/# /etc/init.d/dnsmasq start
     * Starting DNS forwarder and DHCP server dnsmasq        [ OK ]
     
    
  6. Modify /etc/resolv.conf and change the DNS server to the listening address of dnsmasq, which here is 127.0.0.1

    root@d99479553cb5:/# echo nameserver 127.0.0.1 > /etc/resolv.conf
    
    
  7. Rerun the nslookup command several times

    root@d99479553cb5:/# time nslookup time.geekbang.org
    Server: 127.0.0.1
    Address: 127.0.0.1#53
    
    Non-authoritative answer:
    Name: time.geekbang.org
    Address: 39.106.233.176
    
    
    real 0m0.007s
    user 0m0.003s
    sys 0m0.003s
    root@d99479553cb5:/# time nslookup time.geekbang.org
    Server: 127.0.0.1
    Address: 127.0.0.1#53
    
    Non-authoritative answer:
    Name: time.geekbang.org
    Address: 39.106.233.176
    
    
    real 0m0.007s
    user 0m0.003s
    sys 0m0.003s
    ##
    Only the first resolution is slow. Every subsequent one is very fast
    The time each DNS resolution takes is also very stable
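
Running `time nslookup` by hand works for a few samples. To compare many runs, the same measurement can be scripted. The sketch below is one possible approach, assuming Python is available on the machine being tested. The helper name `time_lookups` is hypothetical, and `socket.gethostbyname` goes through the system resolver, so its timings will not match nslookup exactly.

```python
import socket
import time
from typing import Callable, List


def time_lookups(
    host: str,
    runs: int = 5,
    resolve: Callable[[str], object] = socket.gethostbyname,
) -> List[float]:
    """Return the time in milliseconds taken by each of `runs` lookups of `host`."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        try:
            resolve(host)
        except OSError:
            # A failed lookup is still timed, since failures are often the slow case.
            pass
        timings.append((time.perf_counter() - start) * 1000)
    return timings
```

Calling `time_lookups("time.geekbang.org", runs=10)` and comparing the minimum and maximum values gives a quick view of how stable resolution is with a given DNS server or cache.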
    
    



Summary

DNS is one of the most basic services on the Internet. It answers queries about the mapping between domain names and IP addresses
Many applications are initially developed without taking DNS resolution into account
When problems appear later, it often takes several days of troubleshooting to discover that the cause was simply slow DNS resolution

Imagine a web service interface that has to wait 1 s for DNS resolution on every call
Then no matter how much you optimize the internal logic of the application, the interface still responds too slowly for users
because the response time will always be greater than 1 second

Therefore, during application development, we must consider the performance problems that DNS resolution may cause and master the common optimization methods
Several common DNS optimization methods are summarized below

  1. Cache the results of DNS resolution
    Caching is the most effective method, but note that once a cached record expires, you still have to go back to the DNS server for a fresh one
    This is acceptable for most applications
  2. Prefetch the results of DNS resolution
    This is the most common method in web applications such as browsers
    That is, before the user clicks a hyperlink on the page, the browser resolves the domain name in the background and caches the result
  3. Replace regular DNS resolution with HTTPDNS
    Many mobile applications choose this method, especially now that domain name hijacking is common
    Resolving over the HTTP protocol bypasses the DNS servers along the path and so avoids the problem of domain name hijacking
  4. DNS-based global server load balancing (GSLB)
    This not only provides load balancing and high availability for services, but also returns the nearest IP address according to the user's location
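
As an illustration of the first method, the sketch below caches lookup results inside an application. It is a simplified example under stated assumptions, not how dnsmasq works. The class name `CachingResolver` is hypothetical, and it uses a fixed expiry time because `socket.gethostbyname` does not expose the TTL carried by the real DNS record.

```python
import socket
import time
from typing import Callable, Dict, Tuple


class CachingResolver:
    """Cache lookup results for a fixed number of seconds."""

    def __init__(
        self,
        resolve: Callable[[str], str] = socket.gethostbyname,
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._resolve = resolve
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps host name -> (address, expiry time on the clock).
        self._cache: Dict[str, Tuple[str, float]] = {}

    def lookup(self, host: str) -> str:
        now = self._clock()
        cached = self._cache.get(host)
        if cached is not None and cached[1] > now:
            return cached[0]  # Cache hit: no query is sent.
        # Miss or expired entry: resolve again and remember the expiry time.
        address = self._resolve(host)
        self._cache[host] = (address, now + self._ttl)
        return address
```

Only the first call for a host, and the first call after the entry expires, pays the cost of a real query. The resolver function and clock are passed in so the behaviour can be exercised without touching the network.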

Added by dirkadirka on Fri, 31 Dec 2021 11:47:14 +0200