Direct conversion between string and []byte is implemented by copying the underlying data:
var a = []byte("hello boy")
var b = string(a)
When concurrency reaches the hundreds of thousands or millions, these copies slow the program down.
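A minimal sketch of what "copy" means here: the converted slice can be modified without affecting the original string, because the two no longer share memory. The helper name copyRoundTrip is hypothetical and used only for illustration.

```go
package main

import "fmt"

// copyRoundTrip (hypothetical helper) converts s to []byte, mutates the
// slice, and returns both values so the caller can compare them.
func copyRoundTrip(s string) (string, []byte) {
	b := []byte(s) // allocates a new backing array and copies the bytes
	if len(b) > 0 {
		b[0] = 'H' // only the copy changes
	}
	return s, b
}

func main() {
	s, b := copyRoundTrip("hello boy")
	fmt.Println(s)         // the original string is unchanged
	fmt.Println(string(b)) // the copy carries the modification
}
```

The price of this safety is one allocation plus one memory copy per conversion, which is the cost the rest of this post tries to avoid.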
Let's use gdb to look at the data structures behind string and []byte:
(gdb) l main.main
2
3 import (
4 "fmt"
5 )
6
7 func main() {
8 s := "hello, world!"
9 b := []byte(s)
10
11 fmt.Println(s, b)
(gdb) b 11
Breakpoint 1 at 0x487cd9: file /export/home/machao/src/test/strbytes.go, line 11.
(gdb) r
Starting program: /export/home/machao/src/test/test1
Breakpoint 1, main.main () at /export/home/machao/src/test/strbytes.go:11
11 fmt.Println(s, b)
(gdb) info locals
s = {
str = 0x4b8ccf "hello, world!level 3 resetload64 failednil stackbaseout of memorys.allocCount=srmount errorstill in listtimer expiredtriggerRatio=unreachable: value method xadd64 failedxchg64 failed nmidlelocked= on "..., len = 13}
b = {array = 0xc4200140e0 "hello, world!", len = 13, cap = 16}
(gdb) ptype s
type = struct string {
uint8 *str;
int len;
}
(gdb) ptype b
type = struct []uint8 {
uint8 *array;
int len;
int cap;
}
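After the conversion, the []byte's underlying array pointer differs from the string's internal pointer, which confirms that the data was copied. So if we only convert the type and never modify the data, can we avoid the copy and improve performance?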
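Judging from the ptype output, a string can be viewed as [2]uintptr and a []byte as [3]uintptr. This makes the code easy to write, with no need to define extra struct types. str2bytes only has to build [3]uintptr{ptr, len, len}, and bytes2str is even simpler: convert the pointer type directly and ignore cap.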
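The word counts can be checked from Go itself, without gdb. The sketch below assumes the standard gc toolchain, where a string header is two machine words and a slice header is three; headerWords and stringLenViaHeader are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerWords reports how many machine words the string and []byte
// headers occupy on the current platform.
func headerWords() (strWords, sliceWords uintptr) {
	var s string
	var b []byte
	word := unsafe.Sizeof(uintptr(0))
	return unsafe.Sizeof(s) / word, unsafe.Sizeof(b) / word
}

// stringLenViaHeader reads the second word of the string header,
// which holds the length, by viewing the header as [2]uintptr.
func stringLenViaHeader(s string) uintptr {
	return (*[2]uintptr)(unsafe.Pointer(&s))[1]
}

func main() {
	sw, bw := headerWords()
	fmt.Println(sw, bw) // 2 3 with the gc toolchain
	fmt.Println(stringLenViaHeader("hello, world!"))
}
```

The second function reads the same len field that gdb printed for s, which is what makes the [2]uintptr view usable in the conversion code.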
The conversion is implemented with unsafe.Pointer (pointer conversion) and uintptr (pointer arithmetic):
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// str2bytes returns a []byte that shares memory with s.
// The result must not be modified.
func str2bytes(s string) []byte {
	x := (*[2]uintptr)(unsafe.Pointer(&s))  // view the string header as {ptr, len}
	h := [3]uintptr{x[0], x[1], x[1]}       // build a slice header {ptr, len, cap}
	return *(*[]byte)(unsafe.Pointer(&h))
}

// bytes2str reinterprets the first two words of the slice header
// as a string header; cap is ignored.
func bytes2str(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

func main() {
	s := strings.Repeat("abc", 3)
	b := str2bytes(s)
	s2 := bytes2str(b)
	fmt.Println(b, s2)
}
No variables escape to the heap.
package main

import (
	"fmt"
	"io/ioutil"
	"testing"
	"time"
)

var s, _ = ioutil.ReadFile("mydata4vipday.720.datx")

func test() {
	b := string(s)
	_ = []byte(b)
}

func test2() {
	b := bytes2str(s)
	_ = str2bytes(b)
}

func BenchmarkTest(b *testing.B) {
	t1 := time.Now()
	for i := 0; i < b.N; i++ {
		test()
	}
	fmt.Println("test", time.Now().Sub(t1), b.N)
}

func BenchmarkTestBlock(b *testing.B) {
	t1 := time.Now()
	for i := 0; i < b.N; i++ {
		test2()
	}
	fmt.Println("test block", time.Now().Sub(t1), b.N)
}
Compare the performance before and after the optimization:
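The unsafe version allocates no extra memory (0 B/op) and runs 500 million iterations in 1.6 seconds, whereas the plain conversion, without unsafe.Pointer and uintptr, needs 1.1 seconds for just 300 iterations. The difference in efficiency is obvious.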
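The allocation side of this comparison can be reproduced without the data file. The sketch below uses testing.AllocsPerRun on a 1 KiB in-memory buffer, which is an assumption standing in for the file contents; the two conversion functions are repeated so the block is self-contained, and measure is a hypothetical helper. Results are stored in package-level sinks so the compiler cannot discard the conversions.

```go
package main

import (
	"fmt"
	"testing"
	"unsafe"
)

func str2bytes(s string) []byte {
	x := (*[2]uintptr)(unsafe.Pointer(&s))
	h := [3]uintptr{x[0], x[1], x[1]}
	return *(*[]byte)(unsafe.Pointer(&h))
}

func bytes2str(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

var (
	data      = make([]byte, 1024) // assumed stand-in for the file contents
	sinkBytes []byte               // keeps results alive so the work is not optimized away
)

// measure returns the average number of heap allocations per round trip
// for the copying conversion and for the unsafe conversion.
func measure() (copyAllocs, unsafeAllocs float64) {
	copyAllocs = testing.AllocsPerRun(100, func() {
		sinkBytes = []byte(string(data))
	})
	unsafeAllocs = testing.AllocsPerRun(100, func() {
		sinkBytes = str2bytes(bytes2str(data))
	})
	return copyAllocs, unsafeAllocs
}

func main() {
	c, u := measure()
	fmt.Println("copy allocs/op:", c)
	fmt.Println("unsafe allocs/op:", u)
}
```

This only counts allocations; for timing, the benchmarks above run with go test -bench and -benchmem remain the appropriate tool.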