A smooth (graceful) restart means our program can be restarted without interrupting service: the old and new processes hand over seamlessly, giving a zero-downtime deployment.
A smooth restart builds on graceful shutdown, which was described in a previous article: "Golang: a summary of gracefully exiting HTTP services with the Shutdown feature".
There are two main strategies for achieving a smooth restart:
Scenario 1: If our service is deployed on multiple machines, we can use the gateway to take the machine that is about to restart out of rotation, then bring it back online once the restart is complete. This scheme suits enterprise applications with multi-machine deployments.
Scenario 2: Let our program start a child process of itself and hand over to it. The core strategy is to switch from the parent process to the child process by copying file descriptors, which suits applications deployed on a single machine.
Today we will focus on Scenario 2, which allows our programs to restart smoothly, referencing an open source library for related implementations: https://github.com/fvbock/endless
Introduction to Implementation Principles
Introduction to http connections:
We know that HTTP services run on top of TCP connections, and the source of the Go net/http package shows that the server is implemented by listening on a TCP socket.
func (srv *Server) ListenAndServe() error {
    if srv.shuttingDown() {
        return ErrServerClosed
    }
    addr := srv.Addr
    if addr == "" {
        addr = ":http"
    }
    ln, err := net.Listen("tcp", addr)
    if err != nil {
        return err
    }
    return srv.Serve(tcpKeepAliveListener{ln.(*net.TCPListener)})
}
Reusing the socket:
When a program starts listening for TCP connections, the system creates a socket and returns a file descriptor (handle) to our program;
By duplicating the file descriptor, a new process can keep using the socket on the original port without it being closed, so HTTP connections are naturally not broken, and starting another copy of the same program does not fail because the port is already in use.
Tested with the following code:
package main

import (
    "flag"
    "fmt"
    "net"
    "net/http"
    "os"
    "os/exec"
    "time"
)

var (
    graceful = flag.Bool("grace", false, "graceful restart flag")
    procType = ""
)

func main() {
    flag.Parse()
    mux := http.NewServeMux()
    mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintf(w, "Hello world! ===> %s\n", procType)
    })
    server := &http.Server{
        Addr:    ":8080",
        Handler: mux,
    }

    var err error
    var listener net.Listener
    if *graceful {
        // Child process: fd 3 is the first entry of the parent's ExtraFiles.
        f := os.NewFile(3, "")
        listener, err = net.FileListener(f)
        procType = "fork process"
    } else {
        listener, err = net.Listen("tcp", server.Addr)
        procType = "main process"
    }
    if err != nil {
        fmt.Printf("listener error %v\n", err)
        return
    }
    if !*graceful {
        // The main process forks a child process 5s after starting.
        go func() {
            time.Sleep(5 * time.Second)
            if err := forkSocket(listener.(*net.TCPListener)); err != nil {
                fmt.Printf("fork error %v\n", err)
            }
        }()
    }

    err = server.Serve(listener)
    fmt.Printf("proc exit %v\n", err)
}

func forkSocket(tcpListener *net.TCPListener) error {
    f, err := tcpListener.File()
    if err != nil {
        return err
    }
    args := []string{"-grace"}
    fmt.Println(os.Args[0], args)
    cmd := exec.Command(os.Args[0], args...)
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    // Put the socket fd first in ExtraFiles so the child sees it as fd 3.
    cmd.ExtraFiles = []*os.File{f}
    return cmd.Start()
}
After the program starts, it waits 5s and then automatically forks the child process; the ps command shows the two processes running at the same time.
We can then open http://127.0.0.1:8080/ in a browser and see the output of either the main process or the child process, chosen at random.
Write a small test program that sends requests in a loop:
package main

import (
    "fmt"
    "io"
    "net/http"
    "sync"
)

func main() {
    wg := sync.WaitGroup{}
    wg.Add(100)
    for i := 0; i < 100; i++ {
        go func(index int) {
            defer wg.Done()
            // Use the goroutine's own index, not the shared loop variable i.
            result := getUrl(fmt.Sprintf("http://127.0.0.1:8080/?%d", index))
            fmt.Printf("loop:%d %s\n", index, result)
        }(i)
    }
    wg.Wait()
}

func getUrl(url string) string {
    resp, err := http.Get(url)
    if err != nil {
        // Without this check a failed request would dereference a nil resp.
        return fmt.Sprintf("request error: %v", err)
    }
    defer resp.Body.Close()
    body, _ := io.ReadAll(resp.Body)
    return string(body)
}
In the output, each response comes from either the main process or the child process.
The switchover process:
Between the moment the new process starts and the moment the old process exits, there is a short period in which both processes use the same file descriptor at the same time. During this period an HTTP request is handled by either the new or the old process at random, which is fine, because every request is served by one of them. Once the old process has exited, all requests are handled by the new process, which completes the smooth restart.
To sum up, we can summarize the core implementation as follows:
1. Listen for the restart signal;
2. On receiving the signal, fork a child process that starts the program with the same command, and pass the file descriptor to the child process;
3. Once the child process has started, the parent process stops accepting new connections, finishes the requests it is already handling (or times out), and exits;
4. At this point only the new process is running, and the smooth restart is complete.
A complete demo: on receiving a USR1 signal, the program automatically creates a child process and shuts down the main process, achieving a smooth restart:
package main

import (
    "context"
    "flag"
    "fmt"
    "net"
    "net/http"
    "os"
    "os/exec"
    "os/signal"
    "syscall"
)

var (
    graceful = flag.Bool("grace", false, "graceful restart flag")
)

func main() {
    flag.Parse()
    mux := http.NewServeMux()
    mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintln(w, "Hello world!")
    })
    server := &http.Server{
        Addr:    ":8080",
        Handler: mux,
    }

    var err error
    var listener net.Listener
    if *graceful {
        // Child process: inherit the listening socket passed as fd 3.
        f := os.NewFile(3, "")
        listener, err = net.FileListener(f)
    } else {
        listener, err = net.Listen("tcp", server.Addr)
    }
    if err != nil {
        fmt.Printf("listener error %v\n", err)
        return
    }

    // done is closed once Shutdown has finished draining in-flight requests.
    done := make(chan struct{})
    go listenSignal(context.Background(), server, listener, done)

    err = server.Serve(listener)
    fmt.Printf("proc exit %v\n", err)
    if err == http.ErrServerClosed {
        // Serve returns as soon as Shutdown starts; wait for it to complete.
        <-done
    }
}

func forkSocket(tcpListener *net.TCPListener) error {
    f, err := tcpListener.File()
    if err != nil {
        return err
    }
    args := []string{"-grace"}
    fmt.Println(os.Args[0], args)
    cmd := exec.Command(os.Args[0], args...)
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    // Put the socket fd first in ExtraFiles so the child sees it as fd 3.
    cmd.ExtraFiles = []*os.File{f}
    return cmd.Start()
}

func listenSignal(ctx context.Context, httpSrv *http.Server, listener net.Listener, done chan<- struct{}) {
    sigs := make(chan os.Signal, 1)
    signal.Notify(sigs, syscall.SIGUSR1)

    <-sigs
    // Start the child first, then stop the parent gracefully.
    if err := forkSocket(listener.(*net.TCPListener)); err != nil {
        fmt.Printf("fork error %v\n", err)
        return
    }
    httpSrv.Shutdown(ctx)
    fmt.Println("http shutdown")
    close(done)
}
Use Apache's ab load-testing tool to verify this: run ab -c 50 -t 20 http://127.0.0.1:8080/ (50 concurrent clients for 20 seconds), and while the test is running, send a USR1 signal to the pid of the running program. The ab results show no failed requests, so the program does achieve a smooth restart.
Finally, I'd like to recommend a Web development framework that has smooth restart built in and ready to use, so you can quickly build a Web service that restarts smoothly.
Framework source: https://gitee.com/zhucheer/orange
Documentation: https://www.kancloud.cn/chase688/orange_framework/1448035