Vincent Blanchon
Illustration created for “A Journey With Go”, made from the original Gopher, created by Renee French.

ℹ️ This article is based on Go 1.14.

Go provides memory synchronization mechanisms such as channels or mutexes that help to solve different issues. In the case of shared memory, a mutex protects the memory against data races. However, although two kinds of mutex exist, Go also provides atomic memory primitives via the sync/atomic package to improve performance. Let’s first go back to data races before diving into the solutions.

A data race can occur when two or more goroutines access the same memory location concurrently and at least one of them is writing. While maps have a native mechanism that detects such misuse at runtime, a plain structure does not have any, making it vulnerable to data races.

To illustrate a data race, I will take an example of a configuration that is continuously updated by a goroutine. Here is the code:

Running this code clearly shows that the result is non-deterministic due to the data race:

&{[79167 79170 79173 79176 79179 79181]}
&{[79216 79219 79220 79221 79222 79223]}
&{[79265 79268 79271 79274 79278 79281]}

Each line was expected to be a continuous sequence of integers, yet the result is quite random. Running the same program with the -race flag points out the data races:

Read at 0x00c0003aa028 by goroutine 9:
/usr/local/go/src/fmt/print.go:213 +0xb5
main.go:30 +0x3b

Previous write at 0x00c0003aa028 by goroutine 7:
main.go:20 +0xfe

Protecting our reads and writes from data races can be done by a mutex — probably the most common solution — or by the sync/atomic package.

The standard library provides two kinds of mutex with the sync package: sync.Mutex and sync.RWMutex; the latter is optimized when your program deals with multiple readers and very few writers. Here is one solution:
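A sketch of the mutex-based fix, assuming the same illustrative Config struct with a sync.RWMutex guarding its slice:

```go
package main

import (
	"fmt"
	"sync"
)

// Config guards its slice with a RWMutex: readers share the read
// lock, while the single writer takes the exclusive write lock.
type Config struct {
	mu sync.RWMutex
	a  []int
}

// Set replaces the slice under the write lock.
func (c *Config) Set(a []int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.a = a
}

// Get returns the current slice under the read lock.
func (c *Config) Get() []int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.a
}

func main() {
	cfg := &Config{}

	// Writer: publishes a fresh sequence on every iteration.
	go func() {
		for i := 0; ; i++ {
			cfg.Set([]int{i, i + 1, i + 2, i + 3, i + 4, i + 5})
		}
	}()

	// Readers: read the sequence concurrently under the read lock.
	var wg sync.WaitGroup
	for r := 0; r < 4; r++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := 0; n < 100; n++ {
				fmt.Printf("&{%v}\n", cfg.Get())
			}
		}()
	}
	wg.Wait()
}
```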

The program now prints out the expected result; the numbers are properly incremented:

&{[213 214 215 216 217 218]}
&{[214 215 216 217 218 219]}
&{[215 216 217 218 219 220]}

The second solution can be done thanks to the sync/atomic package. Here is the code:

The result is also the expected one:

&{[32724 32725 32726 32727 32728 32729]}
&{[32733 32734 32735 32736 32737 32738]}
&{[32753 32754 32755 32756 32757 32758]}

Regarding the generated output, it looks like the solution using the sync/atomic package is much faster, since it generates higher sequences of numbers in the same amount of time. Benchmarking both programs will help figure out which one is the most efficient.

A benchmark should be interpreted according to what is measured. In this case, I will measure the previous program, which has one writer that constantly stores a new config along with multiple readers that constantly read it. To cover more potential cases, I will also include benchmarks for a program that only has readers, assuming the config does not change often. Here is an example of this new case:
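A sketch of the readers-only atomic benchmark (the benchmark name and setup are illustrative, not the original code); b.RunParallel spreads the loads across goroutines:

```go
package main

import (
	"fmt"
	"sync/atomic"
	"testing"
)

// Config is the illustrative structure read by the benchmark.
type Config struct {
	a []int
}

// benchmarkAtomicMultipleReaders measures concurrent reads of a
// config that is stored once and never updated afterwards.
func benchmarkAtomicMultipleReaders(b *testing.B) {
	var v atomic.Value
	v.Store(&Config{a: []int{0, 1, 2, 3, 4, 5}})

	// Each goroutine loads the current config in a tight loop.
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			_ = v.Load().(*Config)
		}
	})
}

func main() {
	// testing.Benchmark lets us run the benchmark outside `go test`.
	fmt.Println(testing.Benchmark(benchmarkAtomicMultipleReaders))
}
```

In a real test file, the function would be exported as BenchmarkAtomicMultipleReaders and run with `go test -bench=.`.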

Running the benchmark ten times gives the following results:

name                               time/op
AtomicOneWriterMultipleReaders-4   72.2ns ± 2%
AtomicMultipleReaders-4            65.8ns ± 2%
MutexOneWriterMultipleReaders-4     717ns ± 3%
MutexMultipleReaders-4              176ns ± 2%

The benchmark confirms what we have seen before in terms of performance. To understand where exactly the bottleneck is with the mutex, we can rerun the program with the tracer enabled.

For more information about the trace package, I suggest you read my article ”Go: Discovery of the Trace Package.”

Here is the profile of the program using the sync/atomic package:

The goroutines run with no interruption and are able to complete their tasks. Regarding the profile of the program with the mutex, that is quite different:

The running time is now quite fragmented, and this is due to the mutex that parks the goroutine. This is confirmed from the goroutine’s overview, where it shows the time spent blocked on synchronization:

The blocking time accounts for roughly a third of the time. It can be detailed from the blocking profile:

The sync/atomic package definitely brings an advantage in that case. However, performance could be degraded in some cases. For instance, if you had to store a large map, you would have to copy it every single time the map is updated, making the program inefficient.

For more information about the mutex, I suggest you read my article ”Go: Mutex and Starvation.”