
Go has grown steadily over the past decade, especially among infrastructure teams and in the cloud ecosystem. In this article, we will go through some of Go’s unique strengths in this field. We will also cover some gotchas that may not be obvious at first sight.

Build small binaries. Go builds small binaries, which makes it a good language for building artifacts for containerized or serverless environments. The final artifact, including its runtime dependencies, can be as small as 20-25 MB.

Runtime initialization is fast. Go’s runtime initialization is fast. If you are writing autoscaling servers in Go, cold starts are unlikely to be dominated by Go’s runtime initialization. Go libraries and frameworks also tend to favor fast initialization compared to some other ecosystems, such as JVM languages. The entire ecosystem contributes to fast process start.

Build static binaries. Go programs compile into a static binary. This allows users to simplify their final delivery process in most cases. Go binaries can be used as a final artifact of the CI/CD systems and deployed by copying the binary to a remote machine.

Cross compile to 64-bit Linux. The Go compiler supports cross compilation. Especially if you don’t have any cgo dependencies, you can easily cross compile to any supported operating system and architecture. This allows users to build for their production environment regardless of their build environment.

For example, regardless of your current environment, running the following command builds for Linux 64-bit:

$ GOOS=linux GOARCH=amd64 go build

Don’t ship your toolchain. In your production environment, you don’t need the Go toolchain to run Go programs. The final artifact is a small executable binary. You don’t have to care about installing and maintaining Go across your servers. Also, don’t ship containers with the Go toolchain. Instead, use the toolchain to build the final binary and copy it into the production container.

Rebuild and redeploy with Go releases. Go only supports the last two major versions. Because the Go runtime is compiled into the binary, rebuild and redeploy your production services with each Go release.

At Google, we use the release candidates to build production services as soon as an RC version is available. You can use the RC version for production services, or at least push to canary with the RC version. If you see unexpected behavior, file an issue immediately.

The go tool can print the Go version used to build a binary:

$ go version <binary>
<binary>: go1.13.5

You can additionally use tools like gops to list and report the Go versions of the binaries currently running on your system.

Embed commit versions into binaries. Embed the revision number when you build a Go binary. You can also embed the build constraints and other options used for the build. debug.BuildInfo also provides information about the module as well as its dependencies. Alternatively, the go command can report module information and dependencies:

$ go version -m dlv
dlv: go1.13.5
        path    github.com/go-delve/delve/cmd/dlv
        mod     github.com/go-delve/delve       v1.3.2  h1:K8VjV+Q2YnBYlPq0ctjrvc9h7h03wXszlszzfGW5Tog=
        dep     github.com/cosiner/argv v0.0.0-20170225145430-13bacc38a0a5      h1:rIXlvz2IWiupMFlC45cZCXZFvKX/ExBcSLrDy2G0Lp8=
        dep     github.com/mattn/go-isatty      v0.0.3  h1:ns/ykhmWi7G9O+8a448SecJU3nSMBXJfqQkl0upE1jI=
        dep     github.com/peterh/liner v0.0.0-20170317030525-88609521dc4b      h1:8uaXtUkxiy+T/zdLWuxa/PG4so0TPZDZfafFNNSaptE=
        dep     github.com/sirupsen/logrus      v0.0.0-20180523074243-ea8897e79973      h1:3AJZYTzw3gm3TNTt30x0CCKD7GOn2sdd50Hn35fQkGY=
        dep     github.com/spf13/cobra  v0.0.0-20170417170307-b6cb39589372      h1:eRfW1vRS4th8IX2iQeyqQ8cOUNOySvAYJ0IUvTXGoYA=
        dep     github.com/spf13/pflag  v0.0.0-20170417173400-9e4c21054fa1      h1:7bozMfSdo41n2NOc0GsVTTVUiA+Ncaj6pXNpm4UHKys=
        dep     go.starlark.net v0.0.0-20190702223751-32f345186213      h1:lkYv5AKwvvduv5XWP6szk/bvvgO6aDeUujhZQXIFTes=
        dep     golang.org/x/arch       v0.0.0-20171004143515-077ac972c2e4      h1:TP7YcWHbnFq4v8/3wM2JwgM0SRRtsYJ7Z6Oj0arz2bs=
        dep     golang.org/x/crypto     v0.0.0-20180614174826-fd5f17ee7299      h1:zxP+xTjjk4kD+M5IFPweL7/4851FUhYkzbDqbzkN1JE=
        dep     golang.org/x/sys        v0.0.0-20190626221950-04f50cda93cb      h1:fgwFCsaw9buMuxNd6+DQfAuSFqbNiQZpcgJQAgJsK6k=
        dep     gopkg.in/yaml.v2        v2.2.1  h1:mUhvW9EsL+naU5Q3cakzfE91YhliOondGd6ZrsDBHQE=
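
The same information can be surfaced from inside the program. The sketch below assumes a package-level variable named version (a hypothetical name) that is overridden at build time with the linker's -X flag, and combines it with runtime.Version and debug.ReadBuildInfo. The describe helper is also illustrative.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// version is a hypothetical variable name. It can be overridden at build time:
//
//	go build -ldflags "-X main.version=$(git rev-parse --short HEAD)"
var version = "dev"

// describe returns a one-line summary of how this binary was built.
func describe() string {
	info, ok := debug.ReadBuildInfo()
	if !ok {
		// Build info is only embedded when the binary is built with module support.
		return fmt.Sprintf("version=%s go=%s (no build info)", version, runtime.Version())
	}
	return fmt.Sprintf("version=%s go=%s module=%s deps=%d",
		version, runtime.Version(), info.Main.Path, len(info.Deps))
}

func main() {
	fmt.Println(describe())
}
```

Logging this line once at startup, or exposing it on a debug endpoint, makes it easier to tell which revision is running on a given machine.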

FaaS is Go binary as a service. Function-as-a-service products such as Google Cloud Functions or AWS Lambda serve Go functions. But in fact, they build the user function into a binary and serve that binary. This means you have to organize and build your packages with this fact in mind. Because the final binary is not forked for every incoming request but is reused:

  • You may have data races if you access shared resources from multiple functions.
  • You may need to use sync.Once in the function to initialize some of the resources if initialization depends on the incoming request.
  • Background goroutines may still have work pending after the function has finished and the binary is about to be terminated. You may need to flush data manually or gracefully shut down background goroutines.
  • Providers are not consistent about signaling the Go process before a shutdown. Expect hard terminations as soon as your function exits.
  • You may want to use the incoming request’s context for calls initiated in the function. In such cases, reusing resources becomes harder.

Gracefully reject incoming requests. When autoscaling down or shutting down resources, start rejecting new incoming requests to the Go program while letting in-flight requests finish. http.Server provides Shutdown for this purpose.

Report the essential metrics. The Go runtime and diagnostics tools provide a variety of essential metrics from Go programs. Report them to your monitoring systems. Some of these metrics are available through runtime.NumGoroutine, runtime.NumCgoCall and runtime.ReadMemStats, and the number of OS threads created is available through the threadcreate profile in runtime/pprof. See instrumentation libraries such as Prometheus’ Go library as a reference on what can be exported.
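
As a rough illustration, the sketch below gathers a few of these values with runtime.NumGoroutine, runtime.NumCgoCall, runtime.ReadMemStats and the threadcreate profile from runtime/pprof. The Snapshot type, its field names and the collect helper are hypothetical; a real service would hand the values to its metrics library instead of printing them.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/pprof"
)

// Snapshot holds a few runtime metrics. The type and its field names are
// hypothetical and chosen for this example.
type Snapshot struct {
	Goroutines     int
	ThreadsCreated int
	CgoCalls       int64
	HeapAllocBytes uint64
	NumGC          uint32
}

// collect reads the current values from the runtime.
func collect() Snapshot {
	var m runtime.MemStats
	// ReadMemStats briefly stops the world, so avoid calling it in a tight loop.
	runtime.ReadMemStats(&m)
	return Snapshot{
		Goroutines:     runtime.NumGoroutine(),
		ThreadsCreated: pprof.Lookup("threadcreate").Count(),
		CgoCalls:       runtime.NumCgoCall(),
		HeapAllocBytes: m.HeapAlloc,
		NumGC:          m.NumGC,
	}
}

func main() {
	fmt.Printf("%+v\n", collect())
}
```

Collecting on a fixed interval, for example every few seconds, is usually enough to spot goroutine leaks and heap growth.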

Print scheduling and GC events. Go can optionally print scheduling and GC related events to standard error. In production, you can use the GODEBUG environment variable to print verbose insights from the runtime.

The following command starts the binary and prints GC events, as well as a summary of the scheduler state every 5000 ms, to standard error:

$ GODEBUG=gctrace=1,schedtrace=5000 <binary>

Propagate the incoming context. Go allows propagating the context in the process via context.Context. You can also signal cancellation or timeout decisions to other goroutines using context. You can use context to propagate values such as trace/request IDs or other metadata relevant in the critical path. You can log with context key/values where it applies. If you have an incoming request context, keep propagating it. For example:

http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
    // Use r.Context() for the calls made here.
})

Continuously profile in production. Go uses pprof, a lightweight profile collection mechanism. It adds only a single-digit percentage overhead to execution when enabled. You can take advantage of this by collecting profiles from production systems and identifying fleet-wide hotspots to optimize. See Continuous Profiling of Go programs for more insights and a reference implementation of a continuous profiling product.

pprof can symbolize the profiling data by incorporating the binary. If you are collecting profiles from production, you will want to store profiling data with symbols. Even though there is no good standard library function for this task, there is an existing reference implementation that can be adopted.

Dump debuggable postmortems. Go allows post-mortem debugging. When running Go in production, core dumps allow you to retrospectively investigate why binaries crash. If you have Go programs constantly crashing, you can retrieve their core dumps and understand why they crashed and which state they were in. You can also use core dumps to debug in production by taking a snapshot (a core dump) and inspecting it with your debugger. See core dumps for more.
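
Whether a crash actually produces a core dump depends on the traceback level and on operating system limits. The sketch below sets the level from code with debug.SetTraceback, which corresponds to running the binary with GOTRACEBACK=crash. The enableCrashDumps name is hypothetical, and the OS still has to allow core files (for example through ulimit -c on Linux).

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// enableCrashDumps makes a fatal panic print all goroutines and then crash
// in an operating system-specific manner. On Unix systems this raises
// SIGABRT, which lets the OS write a core dump if core files are allowed.
func enableCrashDumps() {
	debug.SetTraceback("crash")
}

func main() {
	enableCrashDumps()
	fmt.Println("traceback level set to crash")
}
```

Calling this early in main keeps the behavior independent of how the process is launched.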