> In A/B tests, we tried slowing down page output in 100-millisecond increments and found that even very small delays lead to a significant drop in revenue. - Greg Linden, Amazon.com
```go
var bufpool = sync.Pool{
	New: func() interface{} {
		buf := make([]byte, 512)
		return &buf
	},
}
```
You `Get()` objects from the pool and `Put()` them back when you're done.

```go
// sync.Pool returns an interface{}: you must cast it to the underlying
// type before you use it.
b := *bufpool.Get().(*[]byte)
defer bufpool.Put(&b)

// Now, go do interesting things with your byte buffer.
buf := bytes.NewBuffer(b)
```
```go
type AuthenticationResponse struct {
	Token  string
	UserID string
}

rsp := authPool.Get().(*AuthenticationResponse)
defer authPool.Put(rsp)

// If we don't hit this if statement, we might return data from other users!
if blah {
	rsp.UserID = "user-1"
	rsp.Token = "super-secret"
}

return rsp
```
```go
// reset resets all fields of the AuthenticationResponse before pooling it.
func (a *AuthenticationResponse) reset() {
	a.Token = ""
	a.UserID = ""
}

rsp := authPool.Get().(*AuthenticationResponse)
defer func() {
	rsp.reset()
	authPool.Put(rsp)
}()
```
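Putting the pieces together, here is a minimal runnable sketch of the pattern. The handler and field values are illustrative, not from any real service:

```go
package main

import (
	"fmt"
	"sync"
)

// AuthenticationResponse is a pooled response object (illustrative).
type AuthenticationResponse struct {
	Token  string
	UserID string
}

// reset zeroes all fields so stale data never leaks between uses.
func (a *AuthenticationResponse) reset() {
	a.Token = ""
	a.UserID = ""
}

var authPool = sync.Pool{
	New: func() interface{} { return &AuthenticationResponse{} },
}

// handle borrows a response from the pool, fills it in, and resets it
// before returning it to the pool.
func handle(userID string) string {
	rsp := authPool.Get().(*AuthenticationResponse)
	defer func() {
		rsp.reset()
		authPool.Put(rsp)
	}()
	rsp.UserID = userID
	rsp.Token = "token-for-" + userID
	return rsp.Token
}

func main() {
	fmt.Println(handle("user-1")) // prints "token-for-user-1"
}
```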
```go
var (
	r io.Reader
	w io.Writer
)

// Obtain a buffer from the pool.
buf := *bufPool.Get().(*[]byte)
defer bufPool.Put(&buf)

// We only write to w exactly what we read from r, and no more.
nr, er := r.Read(buf)
if nr > 0 {
	nw, ew := w.Write(buf[0:nr])
}
```
If you have a `map[string]int`, the GC must scan every string in that map on every garbage collection cycle, since strings contain pointers.

In this example, we write 10 million elements into a `map[string]int` and measure the duration of garbage collection. The map is allocated at package scope to guarantee heap allocation.

```go
package main

import (
	"fmt"
	"runtime"
	"strconv"
	"time"
)

const numElements = 10000000

var foo = map[string]int{}

func timeGC() {
	t := time.Now()
	runtime.GC()
	fmt.Printf("gc took: %s\n", time.Since(t))
}

func main() {
	for i := 0; i < numElements; i++ {
		foo[strconv.Itoa(i)] = i
	}

	for {
		timeGC()
		time.Sleep(1 * time.Second)
	}
}
```
```
inthash → go install && inthash
gc took: 98.726321ms
gc took: 105.524633ms
gc took: 102.829451ms
gc took: 102.71908ms
gc took: 103.084104ms
gc took: 104.821989ms
```
Now let's compare it with a `map[int]int`.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

const numElements = 10000000

var foo = map[int]int{}

func timeGC() {
	t := time.Now()
	runtime.GC()
	fmt.Printf("gc took: %s\n", time.Since(t))
}

func main() {
	for i := 0; i < numElements; i++ {
		foo[i] = i
	}

	for {
		timeGC()
		time.Sleep(1 * time.Second)
	}
}
```
```
inthash → go install && inthash
gc took: 3.608993ms
gc took: 3.926913ms
gc took: 3.955706ms
gc took: 4.063795ms
gc took: 3.91519ms
gc took: 3.75226ms
```
`json.Marshal` and `json.Unmarshal` rely on reflection to serialize struct fields to bytes and back. This can be slow: reflection is not as efficient as explicit code.

```go
package json

// Marshal takes an object and returns its JSON representation.
func Marshal(obj interface{}) ([]byte, error) {
	// Check if this object knows how to marshal itself to JSON
	// by satisfying the Marshaler interface.
	if m, is := obj.(json.Marshaler); is {
		return m.MarshalJSON()
	}

	// It doesn't know how to marshal itself. Do default
	// reflection-based marshalling.
	return marshal(obj)
}
```
easyjson generates highly optimized marshalling code that implements the `json.Marshaler` interface for your types. Run `easyjson -all $file.go` on the `$file.go` containing the structures for which you want to generate code, and a file `$file_easyjson.go` will be generated. Since easyjson has implemented the `json.Marshaler` interface for you, the generated functions will be called instead of the default reflection. Congratulations: you just sped up your JSON code three times. There are many more tricks to squeeze out further performance.

I recommend using `go generate` for these tasks. To keep the generated code in sync with the structures, I prefer to put a `generate.go` file at the root of the package, which runs `go generate` for all package files: this helps when you have many files that need generated code. General advice: to make sure the structures are up to date, call `go generate` in CI and check that there are no differences with the checked-in code.
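As a sketch, such a `generate.go` could look like the following. The file name `structs.go` and the package name `mypkg` are placeholders for your own:

```go
// generate.go — placed at the package root so a single
// `go generate ./...` regenerates the easyjson code for
// the whole package.
//
//go:generate easyjson -all structs.go
package mypkg
```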
Use `strings.Builder` to build up strings. It accumulates bytes in an internal buffer; only when you call `String()` on the builder is the string actually created. It relies on some unsafe tricks to return the underlying bytes as a string with zero allocations: see this blog for further study on how this works.

```go
// main.go
package main

import "strings"

var strs = []string{
	"here's", "a", "some", "long", "list", "of", "strings", "for", "you",
}

func buildStrNaive() string {
	var s string
	for _, v := range strs {
		s += v
	}
	return s
}

func buildStrBuilder() string {
	b := strings.Builder{}
	// Grow the buffer to a decent length, so we don't have to continually
	// re-allocate.
	b.Grow(60)
	for _, v := range strs {
		b.WriteString(v)
	}
	return b.String()
}
```
```go
// main_test.go
package main

import "testing"

var str string

func BenchmarkStringBuildNaive(b *testing.B) {
	for i := 0; i < b.N; i++ {
		str = buildStrNaive()
	}
}

func BenchmarkStringBuildBuilder(b *testing.B) {
	for i := 0; i < b.N; i++ {
		str = buildStrBuilder()
	}
}
```
```
strbuild → go test -bench=. -benchmem
goos: darwin
goarch: amd64
pkg: github.com/sjwhitworth/perfblog/strbuild
BenchmarkStringBuildNaive-8      5000000    255 ns/op    216 B/op    8 allocs/op
BenchmarkStringBuildBuilder-8   20000000   54.9 ns/op    64 B/op     1 allocs/op
```
As you can see, `strings.Builder` is 4.7 times faster, performs eight times fewer allocations, and uses about a third of the memory. In general, I recommend using `strings.Builder` everywhere except in the most trivial cases of string construction.

The `fmt` functions accept `interface{}` as their arguments. There are two drawbacks: you lose static type safety, and passing a value as an `interface{}` usually results in a heap allocation. This blog explains why this is so.

```go
// main.go
package main

import (
	"fmt"
	"strconv"
)

func strconvFmt(a string, b int) string {
	return a + ":" + strconv.Itoa(b)
}

func fmtFmt(a string, b int) string {
	return fmt.Sprintf("%s:%d", a, b)
}

func main() {}
```
```go
// main_test.go
package main

import "testing"

var (
	a    = "boo"
	blah = 42
	box  = ""
)

func BenchmarkStrconv(b *testing.B) {
	for i := 0; i < b.N; i++ {
		box = strconvFmt(a, blah)
	}
	a = box
}

func BenchmarkFmt(b *testing.B) {
	for i := 0; i < b.N; i++ {
		box = fmtFmt(a, blah)
	}
	a = box
}
```
```
strfmt → go test -bench=. -benchmem
goos: darwin
goarch: amd64
pkg: github.com/sjwhitworth/perfblog/strfmt
BenchmarkStrconv-8   30000000   39.5 ns/op   32 B/op   1 allocs/op
BenchmarkFmt-8       10000000   143 ns/op    72 B/op   3 allocs/op
```
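On a hot path you can go one step further and build directly into a byte slice with the `Append*` functions from `strconv`, avoiding intermediate string allocations. The function name `formatPair` and the capacity hint are my own choices for this sketch:

```go
package main

import (
	"fmt"
	"strconv"
)

// formatPair builds "a:b" in a single byte slice using the
// allocation-friendly strconv.AppendInt instead of Itoa + concatenation.
func formatPair(a string, b int) string {
	// 20 digits is enough for any int64, plus the separator.
	buf := make([]byte, 0, len(a)+1+20)
	buf = append(buf, a...)
	buf = append(buf, ':')
	buf = strconv.AppendInt(buf, int64(b), 10)
	return string(buf)
}

func main() {
	fmt.Println(formatPair("boo", 42)) // prints "boo:42"
}
```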
```go
type slice struct {
	// pointer to underlying data in the slice.
	data uintptr
	// the number of elements in the slice.
	len int
	// the number of elements that the slice can
	// grow to before a new underlying array
	// is allocated.
	cap int
}
```
- `data`: pointer to the underlying data of the slice
- `len`: the current number of elements in the slice
- `cap`: the number of elements the slice can grow to before a reallocation

When the capacity (`cap`) is reached, a new array of double the capacity is allocated, the memory is copied from the old array to the new one, and the old array is discarded.

```go
var userIDs []string
for _, bar := range rsp.Users {
	userIDs = append(userIDs, bar.ID)
}
```
Here the slice starts with zero length (`len`) and zero capacity (`cap`). After receiving the response, we append elements to the slice; each time the capacity is reached, a new underlying array with double the `cap` is allocated and the data is copied into it. With 8 elements in the response, the backing array grows 1 → 2 → 4 → 8, causing several allocations and copies.

If we know in advance how many elements will go into the slice, we can preallocate the capacity:

```go
userIDs := make([]string, 0, len(rsp.Users))
for _, bar := range rsp.Users {
	userIDs = append(userIDs, bar.ID)
}
```
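You can observe the difference with `testing.AllocsPerRun`, which counts allocations per call. The `users` slice here stands in for a hypothetical `rsp.Users`:

```go
package main

import (
	"fmt"
	"testing"
)

// users stands in for a response field like rsp.Users.
var users = make([]string, 8)

// naive appends to a nil slice, forcing the backing array to grow.
func naive() []string {
	var ids []string
	for _, u := range users {
		ids = append(ids, u)
	}
	return ids
}

// prealloc sizes the slice up front, so append never reallocates.
func prealloc() []string {
	ids := make([]string, 0, len(users))
	for _, u := range users {
		ids = append(ids, u)
	}
	return ids
}

func main() {
	// naive pays for every doubling of the backing array;
	// prealloc pays for a single allocation.
	fmt.Println("naive allocs:", testing.AllocsPerRun(100, func() { _ = naive() }))
	fmt.Println("prealloc allocs:", testing.AllocsPerRun(100, func() { _ = prealloc() }))
}
```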
Similarly, `make(map[string]string, len(foo))` will allocate enough memory up front to avoid rehashing.

Compare `time.Format` with `time.AppendFormat`. The former returns a string, which allocates; the latter takes a byte buffer, writes a formatted representation of the time, and returns the extended byte slice. This pattern appears elsewhere in the standard library: see `strconv.AppendFloat` or `bytes.NewBuffer`.
, instead of allocating a new buffer each time. Or, you can increase the initial buffer size to a value that is more appropriate for your program to reduce the number of repeated copies of the slice.Source: https://habr.com/ru/post/457004/