"In A/B tests, we tried slowing down page output in 100 millisecond increments and found that even very small delays led to a significant drop in revenue." - Greg Linden, Amazon.com
sync.Pool maintains a pool of allocated objects for reuse, which takes pressure off the garbage collector. A pool is declared with a New function that allocates a fresh object when the pool is empty:

```go
var bufpool = sync.Pool{
	New: func() interface{} {
		buf := make([]byte, 512)
		return &buf
	},
}
```

You Get() objects from the pool and Put() them back when you're done:

```go
// sync.Pool returns an interface{}: you must cast it to the underlying type
// before you use it.
b := *bufpool.Get().(*[]byte)
defer bufpool.Put(&b)

// Now, go do interesting things with your byte buffer.
buf := bytes.NewBuffer(b)
```

A word of warning: pooled objects come back in whatever state you left them. Forgetting to reset an object before returning it to the pool is an easy way to leak data between requests:

```go
type AuthenticationResponse struct {
	Token  string
	UserID string
}

rsp := authPool.Get().(*AuthenticationResponse)
defer authPool.Put(rsp)

// If we don't hit this if statement, we might return data from other users!
if blah {
	rsp.UserID = "user-1"
	rsp.Token = "super-secret"
}

return rsp
```

To be safe, always reset the object's fields before putting it back:

```go
// reset resets all fields of the AuthenticationResponse before pooling it.
func (a *AuthenticationResponse) reset() {
	a.Token = ""
	a.UserID = ""
}

rsp := authPool.Get().(*AuthenticationResponse)
defer func() {
	rsp.reset()
	authPool.Put(rsp)
}()
```

The only case where the old contents of a pooled buffer are harmless is when you use exactly the memory you wrote to, and no more:

```go
var (
	r io.Reader
	w io.Writer
)

// Obtain a buffer from the pool.
buf := *bufpool.Get().(*[]byte)
defer bufpool.Put(&buf)

// We only write to w exactly what we read from r, and no more.
nr, err := r.Read(buf)
if nr > 0 {
	if _, werr := w.Write(buf[:nr]); werr != nil {
		// handle the write error
	}
}
if err != nil {
	// handle the read error (including io.EOF)
}
```

Maps with pointers in them also create work for the garbage collector. If we have a map[string]int, the GC has to scan every string in the map, since strings contain pointers, and it does so on every garbage collection cycle. Let's fill a map[string]int with ten million elements and measure how long garbage collection takes. The map is allocated at package scope to guarantee heap allocation:

```go
package main

import (
	"fmt"
	"runtime"
	"strconv"
	"time"
)

const (
	numElements = 10000000
)

var foo = map[string]int{}

func timeGC() {
	t := time.Now()
	runtime.GC()
	fmt.Printf("gc took: %s\n", time.Since(t))
}

func main() {
	for i := 0; i < numElements; i++ {
		foo[strconv.Itoa(i)] = i
	}

	for {
		timeGC()
		time.Sleep(1 * time.Second)
	}
}
```

```
inthash → go install && inthash
gc took: 98.726321ms
gc took: 105.524633ms
gc took: 102.829451ms
gc took: 102.71908ms
gc took: 103.084104ms
gc took: 104.821989ms
```
Now compare with a map[int]int, which contains no pointers at all:

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

const (
	numElements = 10000000
)

var foo = map[int]int{}

func timeGC() {
	t := time.Now()
	runtime.GC()
	fmt.Printf("gc took: %s\n", time.Since(t))
}

func main() {
	for i := 0; i < numElements; i++ {
		foo[i] = i
	}

	for {
		timeGC()
		time.Sleep(1 * time.Second)
	}
}
```

```
inthash → go install && inthash
gc took: 3.608993ms
gc took: 3.926913ms
gc took: 3.955706ms
gc took: 4.063795ms
gc took: 3.91519ms
gc took: 3.75226ms
```

With no pointers for the collector to chase, garbage collection is roughly 25 times faster.
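One way to exploit this, sketched below as a minimal illustration (my code, not from the original post; the type and method names are made up): keep all string bytes in a single backing slice and store only integer offsets in the map, so neither the keys nor the values contain pointers and the GC can skip the map's buckets entirely.

```go
package main

import "fmt"

// stringStore keeps every string's bytes in one backing slice and indexes
// them by plain integers, so the map holds no pointers for the GC to scan.
type stringStore struct {
	data    []byte         // one big allocation holding all string bytes
	offsets map[int][2]int // id -> [start, end) into data
}

func (s *stringStore) add(id int, v string) {
	start := len(s.data)
	s.data = append(s.data, v...)
	s.offsets[id] = [2]int{start, len(s.data)}
}

// get copies the bytes back out into a fresh string; allocation-sensitive
// callers could work with the byte slice directly instead.
func (s *stringStore) get(id int) string {
	o := s.offsets[id]
	return string(s.data[o[0]:o[1]])
}

func main() {
	s := &stringStore{offsets: map[int][2]int{}}
	s.add(1, "hello")
	s.add(2, "world")
	fmt.Println(s.get(1), s.get(2)) // hello world
}
```

The trade-off is an extra indirection on reads and manual management of the backing slice, so this only pays off for large, long-lived maps like the one in the benchmark above.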
json.Marshal and json.Unmarshal rely on reflection to serialize struct fields to bytes and back. This can be slow: reflection is not as efficient as explicit code. Internally, Marshal first checks whether the object knows how to serialize itself:

```go
package json

// Marshal takes an object and returns its representation in JSON.
func Marshal(obj interface{}) ([]byte, error) {
	// Check if this object knows how to marshal itself to JSON
	// by satisfying the Marshaler interface.
	if m, is := obj.(Marshaler); is {
		return m.MarshalJSON()
	}

	// It doesn't know how to marshal itself. Do default reflection based marshalling.
	return marshal(obj)
}
```

So if a type satisfies the json.Marshaler interface, encoding/json calls its MarshalJSON method and skips reflection entirely; a hand-written sketch of such a method follows.
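For illustration only (my sketch, not code from the post and not what easyjson emits; it reuses the AuthenticationResponse type from the pooling example above): an explicit json.Marshaler implementation that serializes the fields with no reflection.

```go
import (
	"bytes"
	"strconv"
)

// MarshalJSON satisfies json.Marshaler, so json.Marshal calls it instead of
// falling back to reflection. The JSON field names are assumptions made up
// for this sketch.
func (a *AuthenticationResponse) MarshalJSON() ([]byte, error) {
	var b bytes.Buffer
	b.WriteString(`{"token":`)
	// strconv.Quote is close enough to JSON string escaping for plain
	// ASCII values; production code should use a proper JSON encoder.
	b.WriteString(strconv.Quote(a.Token))
	b.WriteString(`,"user_id":`)
	b.WriteString(strconv.Quote(a.UserID))
	b.WriteByte('}')
	return b.Bytes(), nil
}
```

Writing and maintaining methods like this by hand is tedious and error-prone, which is exactly what easyjson automates.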
easyjson generates highly optimized marshalling code that satisfies json.Marshaler for you. Download easyjson and run `easyjson -all $file.go` against the $file.go containing the structures for which you want to generate code. A $file_easyjson.go will be generated. Since easyjson has implemented the json.Marshaler interface for you, these functions will be called instead of the default reflection-based path. Congratulations: you just sped up your JSON code three times. There are plenty more tricks to push performance even further.

Use go generate for these tasks. To keep the generated code in sync with the structures, I prefer to put a generate.go at the root of the package that triggers go generate for all the package's files: this helps when you have many files that need such generated code. General advice: to make sure the structures stay up to date, call go generate in CI and check that there is no diff against the committed code.

When you build a string naively, each concatenation with + allocates a new string. strings.Builder instead accumulates bytes in an internal buffer, and a string is actually created only when you call String() on the builder. It relies on some unsafe tricks to return the underlying bytes as a zero-allocation string: see this blog for further study on how this works.

```go
// main.go
package main

import "strings"

var strs = []string{
	"here's",
	"a",
	"some",
	"long",
	"list",
	"of",
	"strings",
	"for",
	"you",
}

func buildStrNaive() string {
	var s string
	for _, v := range strs {
		s += v
	}
	return s
}

func buildStrBuilder() string {
	b := strings.Builder{}
	// Grow the buffer to a decent length, so we don't have to continually
	// re-allocate.
	b.Grow(60)
	for _, v := range strs {
		b.WriteString(v)
	}
	return b.String()
}
```

```go
// main_test.go
package main

import (
	"testing"
)

var str string

func BenchmarkStringBuildNaive(b *testing.B) {
	for i := 0; i < b.N; i++ {
		str = buildStrNaive()
	}
}

func BenchmarkStringBuildBuilder(b *testing.B) {
	for i := 0; i < b.N; i++ {
		str = buildStrBuilder()
	}
}
```

```
strbuild → go test -bench=. -benchmem
goos: darwin
goarch: amd64
pkg: github.com/sjwhitworth/perfblog/strbuild
BenchmarkStringBuildNaive-8      5000000    255 ns/op   216 B/op   8 allocs/op
BenchmarkStringBuildBuilder-8   20000000   54.9 ns/op    64 B/op   1 allocs/op
```
As you can see, strings.Builder is 4.7 times faster, makes an eighth of the allocations, and uses roughly a quarter of the memory. Reach for strings.Builder when performance matters; in general, I recommend it everywhere except the most trivial cases of string construction.

The fmt functions take interface{} as their arguments. There are two drawbacks: you lose static typing, and passing a value as interface{} usually results in a heap allocation. This blog explains why this is so. Let's compare fmt.Sprintf with plain strconv:

```go
// main.go
package main

import (
	"fmt"
	"strconv"
)

func strconvFmt(a string, b int) string {
	return a + ":" + strconv.Itoa(b)
}

func fmtFmt(a string, b int) string {
	return fmt.Sprintf("%s:%d", a, b)
}

func main() {}
```

```go
// main_test.go
package main

import (
	"testing"
)

var (
	a    = "boo"
	blah = 42
	box  = ""
)

func BenchmarkStrconv(b *testing.B) {
	for i := 0; i < b.N; i++ {
		box = strconvFmt(a, blah)
	}
	a = box
}

func BenchmarkFmt(b *testing.B) {
	for i := 0; i < b.N; i++ {
		box = fmtFmt(a, blah)
	}
	a = box
}
```

```
strfmt → go test -bench=. -benchmem
goos: darwin
goarch: amd64
pkg: github.com/sjwhitworth/perfblog/strfmt
BenchmarkStrconv-8   30000000   39.5 ns/op   32 B/op   1 allocs/op
BenchmarkFmt-8       10000000    143 ns/op   72 B/op   3 allocs/op
```

The strconv version is roughly 3.6 times faster and makes a third of the allocations.
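To watch the interface{} boxing happen, here is a minimal sketch (my example, not from the post): the compiler's escape analysis reports that an argument passed to fmt.Println escapes to the heap.

```go
// escape.go
package main

import "fmt"

func main() {
	x := 424242
	// x is boxed into an interface{} to satisfy fmt.Println's signature,
	// so escape analysis moves it to the heap.
	fmt.Println(x)
}
```

Building with `go build -gcflags='-m'` prints a diagnostic along the lines of "x escapes to heap", confirming the extra allocation behind the fmt numbers above.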
Before using append, it is worth understanding how slices work. Inside the runtime, a slice header looks roughly like this:

```go
type slice struct {
	// pointer to underlying data in the slice.
	data uintptr
	// the number of elements in the slice.
	len int
	// the number of elements that the slice can
	// grow to before a new underlying array
	// is allocated.
	cap int
}
```

- data: pointer to the underlying data of the slice
- len: the current number of elements in the slice
- cap: the number of elements the slice can grow to before a reallocation

When the capacity (cap) is reached, a new array of double the capacity is allocated, the memory is copied over from the old array, and the old array is discarded. Consider this common pattern:

```go
var userIDs []string
for _, bar := range rsp.Users {
	userIDs = append(userIDs, bar.ID)
}
```

Here the slice starts with zero length (len) and zero capacity (cap). As we append elements from the response, we keep hitting the capacity limit: a new backing array with double the cap is allocated and the data is copied into it. With 8 elements in the response, this leads to 5 allocations.

It is far more efficient to size the slice up front, since we know exactly how many elements it will hold:

```go
userIDs := make([]string, 0, len(rsp.Users))
for _, bar := range rsp.Users {
	userIDs = append(userIDs, bar.ID)
}
```

The same applies to maps: make(map[string]string, len(foo)) allocates enough memory up front to avoid reallocation.

Also look for append-style APIs in the standard library. Compare time.Format with time.AppendFormat: the first returns a freshly allocated string; the second takes a byte buffer, writes the formatted representation of the time into it, and returns the extended byte slice. The same pattern shows up elsewhere in the standard library: see strconv.AppendFloat, or bytes.NewBuffer.

Why is this useful? You can use such APIs with a buffer obtained from sync.Pool instead of allocating a new one each time. Or you can grow the initial buffer to a size better suited to your program, to reduce the number of times the slice is re-copied.
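A minimal usage sketch of the append pattern (my example, not from the post), reusing one buffer across iterations instead of allocating a new string each time:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// One buffer, reused on every iteration: AppendFormat writes into it
	// instead of allocating a fresh string the way time.Format would.
	buf := make([]byte, 0, len(time.RFC3339))
	for i := 0; i < 3; i++ {
		buf = buf[:0] // reset the length, keep the capacity
		buf = time.Now().AppendFormat(buf, time.RFC3339)
		fmt.Printf("%s\n", buf)
	}
}
```

Combined with a sync.Pool of byte slices, as shown earlier, this keeps hot formatting paths nearly allocation-free.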
Source: https://habr.com/ru/post/457004/