Graceful Shutdown Is a Feature, Not a Signal Handler


Graceful shutdown is often treated as an implementation detail.

Something like:

“Catch SIGTERM, close a channel, done.”

That mindset works — until the first real production deploy under load.

Then suddenly:

  • requests hang
  • deploys stall
  • Kubernetes kills your pod when the termination grace period runs out
  • and nobody is quite sure why

The reason is simple:

Graceful shutdown is not a signal handler.
It’s a system-level feature.


What “graceful” actually means

A graceful shutdown guarantees that:

  • no new work is accepted
  • in-flight work is allowed to finish (within bounds)
  • background goroutines exit
  • resources are released
  • the process terminates predictably

Anything less is wishful thinking.


The common but incomplete approach

Most services start here:

sig := make(chan os.Signal, 1)
signal.Notify(sig, os.Interrupt, syscall.SIGTERM)

<-sig
log.Println("shutting down")

This detects shutdown.

It does not implement shutdown.

Nothing here:

  • stops goroutines
  • cancels requests
  • drains workers
  • enforces time limits

It only observes the signal.


Shutdown must propagate through context

In Go, cancellation is cooperative.

The approach that keeps working as a service grows is:

cancel context → everything reacts

A minimal pattern looks like this:

ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
defer stop()

This gives you:

  • a root context
  • cancellation on SIGINT (os.Interrupt) or SIGTERM
  • a single authority for shutdown

Every part of the system must derive from this context.
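As a rough sketch of what "derive from this context" can look like, the example below hands each component a child of one root context and waits for all of them to exit. The names component and stopAll are hypothetical, and cancellation is triggered by hand so the example runs without a real signal. In a service, the root would come from signal.NotifyContext as shown above.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// component is a hypothetical long-running part of the service.
// It blocks until its context is cancelled, then reports that it stopped.
func component(ctx context.Context, name string, stopped chan<- string) {
	<-ctx.Done()
	stopped <- name
}

// stopAll starts one component per name, each with a child of the same
// root context, cancels the root, and returns how many components stopped.
func stopAll(names []string) int {
	root, cancel := context.WithCancel(context.Background())
	stopped := make(chan string, len(names))

	var wg sync.WaitGroup
	for _, n := range names {
		// Each child inherits cancellation from root.
		child, childCancel := context.WithCancel(root)
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			defer childCancel()
			component(child, name, stopped)
		}(n)
	}

	cancel() // stand-in for SIGTERM arriving
	wg.Wait()
	close(stopped)
	return len(stopped)
}

func main() {
	fmt.Println("stopped components:", stopAll([]string{"poller", "flusher"}))
}
```

One cancel call reaches every component because each child context is derived from the root, not created from context.Background().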


HTTP servers: stop accepting, then wait

The correct shutdown sequence for an HTTP server:

go func() {
    if err := srv.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
        log.Fatal(err)
    }
}()

<-ctx.Done()

shutdownCtx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()

if err := srv.Shutdown(shutdownCtx); err != nil {
    log.Printf("shutdown did not complete cleanly: %v", err)
}

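What this does:

  • stops accepting new connections (the listeners are closed)
  • closes idle connections and waits for active requests to finish
  • stops waiting once shutdownCtx expires

Without this, requests are cut mid-flight. Note that when the deadline passes, Shutdown returns the context's error and leaves the remaining connections open; call srv.Close() if you need to force them closed.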
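The sequence above can be wrapped into one function. The following is a minimal sketch, not a definitive implementation: serve and demo are hypothetical names, the server listens on a random local port purely for illustration, and falling back to srv.Close() after the deadline is one possible policy among several.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"time"
)

// serve runs srv on ln until ctx is cancelled, then gives in-flight
// requests up to grace to finish before forcing connections closed.
func serve(ctx context.Context, srv *http.Server, ln net.Listener, grace time.Duration) error {
	serveErr := make(chan error, 1)
	go func() { serveErr <- srv.Serve(ln) }()

	select {
	case err := <-serveErr:
		return err // the server failed before shutdown was requested
	case <-ctx.Done():
	}

	shutdownCtx, cancel := context.WithTimeout(context.Background(), grace)
	defer cancel()

	if err := srv.Shutdown(shutdownCtx); err != nil {
		// The deadline passed with requests still running: force close.
		srv.Close()
		return fmt.Errorf("forced shutdown: %w", err)
	}
	// Serve reports http.ErrServerClosed once Shutdown has been called.
	if err := <-serveErr; !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	return nil
}

// demo starts a server on a random local port, requests shutdown
// immediately, and returns the result.
func demo() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // stand-in for SIGTERM arriving
	srv := &http.Server{Handler: http.NotFoundHandler()}
	return serve(ctx, srv, ln, 2*time.Second)
}

func main() {
	fmt.Println("shutdown error:", demo())
}
```

The shutdown context is built from context.Background() on purpose: the root context is already cancelled at that point, so deriving from it would give Shutdown no time at all.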


Goroutines must listen for cancellation

This is where many shutdowns fail.

If you have background goroutines like:

go func() {
    for {
        doWork()
    }
}()

your shutdown is already broken: nothing can ever tell that loop to stop.

The correct pattern is:

go func() {
    for {
        select {
        case <-ctx.Done():
            return
        default:
            // Cancellation is only noticed between calls, so each
            // doWork call needs to return promptly.
            doWork()
        }
    }
}()

Every long-lived goroutine must know when to stop.
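Many background loops wait on a timer or a channel instead of spinning. For those, a variant without the default case may fit better. The sketch below is illustrative: poll and demo are hypothetical names, and the work function is an empty stand-in. The context is also passed into the work function so that a long call can abort early.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// poll calls work on every tick until ctx is cancelled and returns how
// many times work ran. work receives ctx so it can stop early as well.
func poll(ctx context.Context, every time.Duration, work func(context.Context)) int {
	ticker := time.NewTicker(every)
	defer ticker.Stop()

	runs := 0
	for {
		select {
		case <-ctx.Done():
			return runs
		case <-ticker.C:
			work(ctx)
			runs++
		}
	}
}

// demo runs poll in a goroutine, cancels it after a short while, and
// reports whether the goroutine exited within one second.
func demo() bool {
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan int, 1)
	go func() { done <- poll(ctx, 5*time.Millisecond, func(context.Context) {}) }()

	time.Sleep(30 * time.Millisecond)
	cancel() // stand-in for shutdown

	select {
	case <-done:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("goroutine exited:", demo())
}
```

Because the select blocks on both channels, the goroutine uses no CPU while idle and still reacts to cancellation as soon as it arrives.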


Worker pools must drain, not disappear

Worker pools need special care.

Bad shutdown:

  • close the process
  • workers die mid-job

Better shutdown:

  • stop accepting new jobs
  • let workers finish current work
  • exit cleanly

Typical pattern:

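close(jobs) // no more jobs; workers ranging over the channel exit once it is empty
wg.Wait()   // block until every worker has returned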

But only if:

  • producers are stopped
  • consumers exit on channel close
  • cancellation is respected

Shutdown is choreography.
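Here is one way the choreography might fit together, as a sketch under stated assumptions: runPool and demo are hypothetical names, jobs are plain integers, and there is a single producer. The producer stops on cancellation and closes the channel, the workers finish the job in hand and exit when the channel is closed, and the caller waits for all of them.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runPool starts one producer and a fixed number of workers, lets them
// run until ctx is cancelled, then drains. It returns how many jobs were
// submitted and how many were fully processed.
func runPool(ctx context.Context, workers int, handle func(job int)) (submitted, processed int64) {
	jobs := make(chan int)
	var wg sync.WaitGroup

	// Workers: range ends once jobs is closed, after the current job is done.
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for job := range jobs {
				handle(job)
				atomic.AddInt64(&processed, 1)
			}
		}()
	}

	// Producer: the only sender, so it is the one that closes the channel.
	wg.Add(1)
	go func() {
		defer wg.Done()
		defer close(jobs)
		for job := 0; ; job++ {
			select {
			case <-ctx.Done():
				return // stop accepting new jobs
			case jobs <- job:
				submitted++
			}
		}
	}()

	wg.Wait() // returns only after the producer and all workers have exited
	return submitted, atomic.LoadInt64(&processed)
}

// demo runs the pool for about 20ms with jobs that take about 1ms each.
func demo() (submitted, processed int64) {
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()
	return runPool(ctx, 3, func(int) { time.Sleep(time.Millisecond) })
}

func main() {
	s, p := demo()
	fmt.Printf("submitted=%d processed=%d\n", s, p)
}
```

Letting the producer own close(jobs) avoids the panic that a send on a closed channel would cause if some other goroutine closed it first.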


Time is part of correctness

A shutdown without timeouts is not graceful.

You must decide:

  • how long requests may run
  • how long background tasks get to finish
  • when to force termination

This is not pessimism. It’s engineering.

“Wait forever” is not a strategy.
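sync.WaitGroup has no built-in timeout, so bounding a wait takes a small helper. The sketch below shows one common approach. waitWithin and demo are hypothetical names and the durations are arbitrary. One caveat: when the limit is hit, the helper goroutine stays blocked in wg.Wait until the tasks finish, which is usually acceptable only because the process is about to exit.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// waitWithin reports whether wg finished before limit elapsed.
func waitWithin(wg *sync.WaitGroup, limit time.Duration) bool {
	done := make(chan struct{})
	go func() {
		wg.Wait()
		close(done)
	}()

	timer := time.NewTimer(limit)
	defer timer.Stop()

	select {
	case <-done:
		return true
	case <-timer.C:
		return false // time to force termination
	}
}

// demo waits for one task that finishes quickly and one that is stuck.
func demo() (fast, stuck bool) {
	var quick sync.WaitGroup
	quick.Add(1)
	go func() {
		defer quick.Done()
		time.Sleep(5 * time.Millisecond)
	}()
	fast = waitWithin(&quick, time.Second)

	release := make(chan struct{})
	defer close(release) // unblock the stuck task when demo returns
	var slow sync.WaitGroup
	slow.Add(1)
	go func() {
		defer slow.Done()
		<-release // simulates a task that ignores shutdown
	}()
	stuck = waitWithin(&slow, 20*time.Millisecond)
	return fast, stuck
}

func main() {
	fast, stuck := demo()
	fmt.Println("quick task finished in time:", fast)
	fmt.Println("stuck task finished in time:", stuck)
}
```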


A quick Python contrast (only where it helps)

In Python async systems (e.g. FastAPI + asyncio), shutdown often relies on:

  • lifespan events
  • task cancellation via the event loop
  • implicit propagation (cancellation surfaces as CancelledError at the next await)

Go is more explicit.

You must:

  • pass context
  • check cancellation
  • design exit paths

This verbosity is intentional.

Go makes shutdown behavior visible in code — not hidden in the runtime.


The most common shutdown failure modes

If shutdown hangs, look for:

  • goroutines blocked on channels
  • goroutines ignoring context
  • worker pools without drain logic
  • background retries with no exit condition
  • database calls without timeouts

All of these are design issues, not bugs.
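Two items on that list, retries with no exit condition and calls without timeouts, can be addressed in one place. The following is a sketch, not a library API: retry and demo are hypothetical names, and the attempt count, per-call timeout, and backoff are arbitrary illustrative values.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// retry calls op until it succeeds, the attempts run out, or ctx is
// cancelled. Each call gets its own deadline derived from ctx.
func retry(ctx context.Context, attempts int, perCall, backoff time.Duration,
	op func(context.Context) error) error {

	err := errors.New("retry: no attempts made")
	for i := 0; i < attempts; i++ {
		callCtx, cancel := context.WithTimeout(ctx, perCall)
		err = op(callCtx)
		cancel()
		if err == nil {
			return nil
		}

		// Wait before the next attempt, but give up at once on shutdown.
		timer := time.NewTimer(backoff)
		select {
		case <-ctx.Done():
			timer.Stop()
			return ctx.Err()
		case <-timer.C:
		}
	}
	return err
}

// demo retries an operation that always fails while shutdown has already
// been requested, and reports whether retry stopped because of that.
func demo() bool {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // stand-in for shutdown

	failing := func(context.Context) error { return errors.New("upstream unavailable") }
	err := retry(ctx, 5, time.Second, time.Minute, failing)
	return errors.Is(err, context.Canceled)
}

func main() {
	fmt.Println("stopped by cancellation:", demo())
}
```

The same per-call context can be handed to database methods that accept one, such as QueryContext in database/sql, so a slow query cannot outlive the shutdown budget.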


A simple rule that holds up

If you remember one rule:

Every component must know when the system is shutting down.

If a component doesn’t:

  • it will leak
  • it will block
  • it will be killed

Grace is intentional.


The takeaway

Handling SIGTERM is trivial.

Designing a system that:

  • stops accepting work
  • finishes what matters
  • exits predictably

…is not.

Graceful shutdown is not something you add later. It’s something you design for from day one.

In Go, shutdown correctness is a feature. Treat it like one.

