A strange problem with graceful shutdown of Go's http.Server

Published: 2023-05-15

The code is as follows:

package main

import (
    "context"
    "log"
    "net/http"
    "os"
    "os/signal"
    "sync"
    "time"

    "github.com/gin-gonic/gin"
)

var wg sync.WaitGroup

var apiQuit = make(chan bool)

func apiQuitSignal() {
    log.Println("quit signal")
    apiQuit <- true
}

func main() {
    router := gin.New()
    router.GET("/quit", func(c *gin.Context) {
        log.Println("GET /quit")
        apiQuitSignal()
        //time.AfterFunc(5*time.Second, apiQuitSignal)
        c.String(200, "quit")
        //c.String(200, "quit in 5 seconds")
    })
    router.GET("/hello", func(c *gin.Context) {
        log.Println("GET /hello")
        c.String(200, "hello")
    })

    srv := &http.Server{
        Addr:    ":8888",
        Handler: router,
    }

    register := func(f func()) {
        wg.Add(1)
        srv.RegisterOnShutdown(func() {
            defer wg.Done()
            f()
        })
    }

    register(func() {
        time.Sleep(10 * time.Second)
    })

    go func() {
        // Serve connections
        if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
            log.Fatalf("listen: %s\n", err)
        }
    }()

    // Buffered, because signal.Notify does not block when delivering a signal.
    quit := make(chan os.Signal, 1)
    signal.Notify(quit, os.Interrupt)

    select {
    case <-quit:
        log.Println("quit from os.Signal")
    case <-apiQuit:
        log.Println("quit from api")
    }

    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    //ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()
    if err := srv.Shutdown(ctx); err != nil {
        log.Fatal("Server Shutdown:", err)
    }

    wg.Wait()
    log.Println("Server Exit")
}

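The service above receives its shutdown signal through the apiQuit channel when the quit endpoint is requested, and then performs a graceful shutdown.

While sending requests to it, I ran into some strange behavior: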

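1. After the server starts, requesting only the quit endpoint fails with
Server Shutdown:context deadline exceeded
This clearly means the context timed out during srv.Shutdown(ctx).

2. After the server starts, requesting the hello endpoint first and then the quit endpoint lets the server exit normally.

3. After the server starts, requesting the quit endpoint and then immediately requesting the hello endpoint lets the server exit normally.

4. With the context timeout increased to 10 seconds, the server exits normally.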

This suggests that the Shutdown code has some surprising feature (or bug).

Let's look at the code:

// file: net/http/server.go

func (srv *Server) Shutdown(ctx context.Context) error {
    srv.inShutdown.setTrue()

    srv.mu.Lock()
    lnerr := srv.closeListenersLocked()
    srv.closeDoneChanLocked()
    for _, f := range srv.onShutdown {
        go f()
    }
    srv.mu.Unlock()

    pollIntervalBase := time.Millisecond
    nextPollInterval := func() time.Duration {
        // Add 10% jitter.
        interval := pollIntervalBase + time.Duration(rand.Intn(int(pollIntervalBase/10)))
        // Double and clamp for next time.
        pollIntervalBase *= 2
        if pollIntervalBase > shutdownPollIntervalMax {
            pollIntervalBase = shutdownPollIntervalMax
        }
        return interval
    }

    timer := time.NewTimer(nextPollInterval())
    defer timer.Stop()
    for {
        if srv.closeIdleConns() && srv.numListeners() == 0 {
            return lnerr
        }
        select {
        case <-ctx.Done():
            return ctx.Err()
        case <-timer.C:
            timer.Reset(nextPollInterval())
        }
    }
}

func (s *Server) closeIdleConns() bool {
    s.mu.Lock()
    defer s.mu.Unlock()
    quiescent := true
    for c := range s.activeConn {
        st, unixSec := c.getState()
        // Issue 22682: treat StateNew connections as if
        // they're idle if we haven't read the first request's
        // header in over 5 seconds.
        if st == StateNew && unixSec < time.Now().Unix()-5 {
            st = StateIdle
        }
        if st != StateIdle || unixSec == 0 {
            // Assume unixSec == 0 means it's a very new
            // connection, without state set yet.
            quiescent = false
            continue
        }
        c.rwc.Close()
        delete(s.activeConn, c)
    }
    return quiescent
}

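Start with the Shutdown function: the timeout error is returned from the ctx.Done() branch, which points to srv.closeIdleConns() never returning true before the deadline.
Then look at the closeIdleConns method. Its body is essentially one loop:
for c := range s.activeConn
Inside that loop, note the following code: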

// Issue 22682: treat StateNew connections as if
// they're idle if we haven't read the first request's
// header in over 5 seconds.
if st == StateNew && unixSec < time.Now().Unix()-5 {
    st = StateIdle
}

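A connection in StateNew is only treated as StateIdle once it has been in that state for more than 5 seconds. It is not exactly 5 seconds: the check compares whole Unix seconds, and it only runs when the polling timer in Shutdown fires. A connection that is still in StateNew when Shutdown is called, that is, one on which the server has not yet read the first request header, therefore usually cannot be closed before a 5-second context expires, which matches the first observation. With a 10-second timeout there is enough time for it to be treated as idle.
This behavior has already been reported; see Issue 22682:
https://github.com/golang/go/issues/22682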

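The fix is therefore simple:
Method 1: send the quit signal 5 seconds after the quit request has finished (the commented-out time.AfterFunc line in the handler), and only then call Shutdown.
Method 2: after the quit signal has been received, block for 5 seconds before calling Shutdown.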
