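I. Introduction to etcd
etcd is a distributed key-value store developed by CoreOS. It uses the Raft protocol as its consensus algorithm to store critical data reliably and quickly and to make that data accessible. It enables reliable distributed coordination through distributed locks, leader election, and write barriers. An etcd cluster is designed for high availability and for persistent data storage and retrieval.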
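Glossary
Raft: The algorithm etcd uses to guarantee strong consistency in a distributed system.
Node: An instance of a Raft state machine.
Member: An etcd instance. It manages one Node and can serve client requests.
Cluster: An etcd cluster made up of multiple Members that work together.
Peer: What a Member calls another Member of the same etcd cluster.
Client: A client that sends requests to the etcd cluster (HTTP in the v2 API, gRPC in the v3 API).
WAL: Write-ahead log, the log format etcd uses for persistent storage.
Snapshot: A snapshot of etcd's data state, taken so that WAL files do not pile up.
Proxy: A mode of etcd that provides reverse-proxy service for an etcd cluster.
Leader: In Raft, the node that wins the election and handles all data commits.
Follower: A node that lost the election. It acts as a subordinate node in Raft and helps the algorithm guarantee strong consistency.
Candidate: A Follower that has not received the Leader's heartbeat within a certain time becomes a Candidate and starts an election.
Term: The period from when a node becomes Leader until the next election starts is called a Term.
Index: The sequence number of a data entry. Raft locates data by Term and Index.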
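To make the Term and Index entries more concrete, here is a minimal sketch of how a log position can be identified by that pair and how two positions are compared during an election (higher term first, then higher index, as described in the Raft paper). The names EntryID and AtLeastAsUpToDate are made up for this illustration and are not etcd identifiers.

```go
package main

import "fmt"

// EntryID locates one Raft log entry by the term it was created in and its
// position (index) in the log. The type and field names are illustrative only.
type EntryID struct {
	Term  uint64
	Index uint64
}

// AtLeastAsUpToDate reports whether log position a is at least as up-to-date
// as b: the higher term wins, and with equal terms the higher index wins.
func AtLeastAsUpToDate(a, b EntryID) bool {
	if a.Term != b.Term {
		return a.Term > b.Term
	}
	return a.Index >= b.Index
}

func main() {
	candidate := EntryID{Term: 3, Index: 7}
	voter := EntryID{Term: 2, Index: 12}
	fmt.Println(AtLeastAsUpToDate(candidate, voter)) // true: term 3 beats term 2
	fmt.Println(AtLeastAsUpToDate(voter, candidate)) // false
}
```

A voter would only grant its vote when the candidate's last log position passes this check, which is why a longer log with an older term still loses.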
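Use cases
Service discovery
Message publishing and subscription
Load balancing
Distributed notification and coordination
Distributed locks and distributed queues
Cluster monitoring and leader election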
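etcd vs. Redis
etcd: A distributed, consistent key-value store for shared configuration and service discovery. It provides a reliable way to store data across a cluster of machines, handles leader election gracefully during network partitions, and tolerates machine failures.
Redis: An in-memory database that can persist to disk. Redis is an open-source advanced key-value store (BSD-licensed through version 7.2). It is often called a data structure server because keys can hold strings, hashes, lists, sets, and sorted sets.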
II. Installing etcd
Install from the binary release: download the archive, extract it, and copy the etcd and etcdctl binaries to /usr/bin/:
wget https://github.com/coreos/etcd/releases/download/v3.5.1/etcd-v3.5.1-linux-amd64.tar.gz
Check the version:
root@master ~ > etcdctl version
etcdctl version: 3.5.1
API version: 3.5
Common etcdctl commands:
COMMANDS:
    alarm disarm            Disarms all alarms
    alarm list              Lists all alarms
    auth disable            Disables authentication
    auth enable             Enables authentication
    auth status             Returns authentication status
    check datascale         Check the memory usage of holding data for different workloads on a given server endpoint.
    check perf              Check the performance of the etcd cluster
    compaction              Compacts the event history in etcd
    defrag                  Defragments the storage of the etcd members with given endpoints
    del                     Removes the specified key or range of keys [key, range_end)
    elect                   Observes and participates in leader election
    endpoint hashkv         Prints the KV history hash for each endpoint in --endpoints
    endpoint health         Checks the healthiness of endpoints specified in `--endpoints` flag
    endpoint status         Prints out the status of endpoints specified in `--endpoints` flag
    get                     Gets the key or a range of keys
    help                    Help about any command
    lease grant             Creates leases
    lease keep-alive        Keeps leases alive (renew)
    lease list              List all active leases
    lease revoke            Revokes leases
    lease timetolive        Get lease information
    lock                    Acquires a named lock
    make-mirror             Makes a mirror at the destination etcd cluster
    member add              Adds a member into the cluster
    member list             Lists all members in the cluster
    member promote          Promotes a non-voting member in the cluster
    member remove           Removes a member from the cluster
    member update           Updates a member in the cluster
    move-leader             Transfers leadership to another etcd cluster member.
    put                     Puts the given key into the store
    role add                Adds a new role
    role delete             Deletes a role
    role get                Gets detailed information of a role
    role grant-permission   Grants a key to a role
    role list               Lists all roles
    role revoke-permission  Revokes a key from a role
    snapshot restore        Restores an etcd member snapshot to an etcd directory
    snapshot save           Stores an etcd node backend snapshot to a given file
    snapshot status         [deprecated] Gets backend snapshot status of a given file
    txn                     Txn processes all the requests in one transaction
    user add                Adds a new user
    user delete             Deletes a user
    user get                Gets detailed information of a user
    user grant-role         Grants a role to a user
    user list               Lists all users
    user passwd             Changes password of user
    user revoke-role        Revokes a role from a user
    version                 Prints the version of etcdctl
    watch                   Watches events stream on keys or prefixes

OPTIONS:
      --cacert=""                         verify certificates of TLS-enabled secure servers using this CA bundle
      --cert=""                           identify secure client using this TLS certificate file
      --command-timeout=5s                timeout for short running command (excluding dial timeout)
      --debug[=false]                     enable client-side debug logging
      --dial-timeout=2s                   dial timeout for client connections
  -d, --discovery-srv=""                  domain name to query for SRV records describing cluster endpoints
      --discovery-srv-name=""             service name to query when using DNS discovery
      --endpoints=[127.0.0.1:2379]        gRPC endpoints
  -h, --help[=false]                      help for etcdctl
      --hex[=false]                       print byte strings as hex encoded strings
      --insecure-discovery[=true]         accept insecure SRV records describing cluster endpoints
      --insecure-skip-tls-verify[=false]  skip server certificate verification (CAUTION: this option should be enabled only for testing purposes)
      --insecure-transport[=true]         disable transport security for client connections
      --keepalive-time=2s                 keepalive time for client connections
      --keepalive-timeout=6s              keepalive timeout for client connections
      --key=""                            identify secure client using this TLS key file
      --password=""                       password for authentication (if this option is used, --user option shouldn't include password)
      --user=""                           username[:password] for authentication (prompt if password is not supplied)
  -w, --write-out="simple"                set the output format (fields, json, protobuf, simple, table)
Start etcd on a specific IP and port:
etcd --listen-client-urls http://0.0.0.0:2379 --advertise-client-urls http://0.0.0.0:2379 --listen-peer-urls http://0.0.0.0:2380
III. Using etcd from Go
put and get
package main

import (
	"context"
	"fmt"
	"time"

	// Assumed: the v3.5 client module path; older setups use github.com/coreos/etcd/clientv3.
	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"192.168.79.134:2379"}, // etcd endpoints; a single node here, so only one entry
		DialTimeout: 5 * time.Second,                 // dial timeout
	})
	if err != nil {
		// Stop here: cli is not usable when New returns an error.
		fmt.Printf("connect to etcd failed, err:%v\n", err)
		return
	}
	defer cli.Close()
	fmt.Println("[INFO] connect to etcd success")

	// put, with a per-request timeout
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	_, err = cli.Put(ctx, "test", "hello")
	cancel()
	if err != nil {
		fmt.Printf("put to etcd failed, err:%v\n", err)
		return
	}

	// get
	ctx, cancel = context.WithTimeout(context.Background(), time.Second)
	resp, err := cli.Get(ctx, "test")
	cancel()
	if err != nil {
		fmt.Printf("get from etcd failed, err:%v\n", err)
		return
	}
	for _, ev := range resp.Kvs {
		fmt.Printf("%s:%s\n", ev.Key, ev.Value)
	}
}

// [INFO] connect to etcd success
// test:hello
watch
package main

import (
	"context"
	"fmt"
	"time"

	// Assumed: the v3.5 client module path; older setups use github.com/coreos/etcd/clientv3.
	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"192.168.79.134:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		fmt.Printf("connect to etcd failed, err:%v\n", err)
		return
	}
	fmt.Println("connect to etcd success")
	defer cli.Close()

	// watch
	// Watch(ctx context.Context, key string, opts ...OpOption) WatchChan
	rch := cli.Watch(context.Background(), "test")
	for wresp := range rch {
		for _, ev := range wresp.Events {
			fmt.Printf("Type: %s Key:%s Value:%s\n", ev.Type, ev.Kv.Key, ev.Kv.Value)
		}
	}
}
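This watches the key test for changes. Watch returns a channel of WatchResponse values, so we can read from it in a for loop; whenever test changes, the watch receives the event and can act on it: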
type WatchResponse struct {
	Header          pb.ResponseHeader
	Events          []*storagepb.Event
	CompactRevision int64
	Canceled        bool
}

type Event struct {
	Type Event_EventType `protobuf:"varint,1,opt,name=type,proto3,enum=storagepb.Event_EventType" json:"type,omitempty"`
	Kv   *KeyValue       `protobuf:"bytes,2,opt,name=kv" json:"kv,omitempty"`
}
root@master ~ > etcdctl put test world
OK
root@master ~ > etcdctl del test
1
// $ go run watch.go
// connect to etcd success
// Type: PUT Key:test Value:world
// Type: DELETE Key:test Value:
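The consume-a-channel-in-a-for-loop pattern can also be tried without an etcd server. The sketch below is only a stand-in: event and fakeWatch are hypothetical names that imitate the shape of a watch stream (batches of PUT and DELETE events arriving on a channel), and apply shows how ranging over the channel keeps a local copy of the keys in sync until the channel is closed.

```go
package main

import "fmt"

// event is a hypothetical stand-in for an etcd watch event.
type event struct {
	Type  string // "PUT" or "DELETE"
	Key   string
	Value string
}

// fakeWatch is a hypothetical stand-in for cli.Watch: it sends batches of
// events on a channel and closes the channel when there is nothing left.
func fakeWatch(batches [][]event) <-chan []event {
	ch := make(chan []event)
	go func() {
		defer close(ch)
		for _, b := range batches {
			ch <- b
		}
	}()
	return ch
}

// apply ranges over the channel until it is closed and returns the resulting
// key/value state after all PUT and DELETE events have been applied in order.
func apply(ch <-chan []event) map[string]string {
	state := make(map[string]string)
	for batch := range ch {
		for _, ev := range batch {
			switch ev.Type {
			case "PUT":
				state[ev.Key] = ev.Value
			case "DELETE":
				delete(state, ev.Key)
			}
		}
	}
	return state
}

func main() {
	ch := fakeWatch([][]event{
		{{Type: "PUT", Key: "test", Value: "world"}},
		{{Type: "PUT", Key: "other", Value: "1"}, {Type: "DELETE", Key: "test"}},
	})
	fmt.Println(apply(ch)) // map[other:1]
}
```

With a real client the loop body stays the same; only the channel comes from cli.Watch, and the loop ends when the watch is canceled or the client is closed.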
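IV. Pitfalls when installing etcd v3 for Go
*** Prefer the 3.5 release of clientv3, which is compatible with current grpc releases. Install it with: go get go.etcd.io/etcd/client/v3@v3.5.4. For the version history and usage advice, see the official notes: https://github.com/etcd-io/etcd/tree/main/client/v3
*** If your Go version or your company's environment prevents you from using the latest release, the steps below may help.
1. Running go get github.com/coreos/etcd/clientv3 or go get go.etcd.io/etcd without a version installs etcd 2.3.8, a very old release, so always pin the version when installing, e.g.: go get github.com/coreos/etcd/clientv3@v3.3.25
2. grpc v1.26.0 must be installed. If several grpc versions are present, either add the line below to go.mod, or skip go mod tidy and pull the dependencies explicitly with go get google.golang.org/grpc@v1.26.0 and go get github.com/coreos/etcd/clientv3@v3.3.25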
replace google.golang.org/grpc v1.38.0 => google.golang.org/grpc v1.26.0
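V. Service registration and discovery with etcd
Method summary:
clientv3.New: creates an etcd v3 client (func New(cfg Config) (*Client, error))
clientv3.Config: the configuration used when creating a client
Grant: creates a new lease (Grant(ctx context.Context, ttl int64) (*LeaseGrantResponse, error))
Put: registers the service and binds it to the lease
KeepAlive: keeps the lease alive by periodically sending renewal requests (KeepAlive(ctx context.Context, id LeaseID) (<-chan *LeaseKeepAliveResponse, error))
Revoke: revokes the lease
Get: fetches services
Watch: watches services for changes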
Implementation flow:
Implementation code:
1. Service registration
Register a service under the key /web:
package main

import (
	"context"
	"log"
	"time"

	"github.com/coreos/etcd/clientv3"
)

// ServiceRegister registers a service under a lease.
type ServiceRegister struct {
	cli     *clientv3.Client // etcd client
	leaseID clientv3.LeaseID // lease ID
	// channel of lease keep-alive responses
	keepAliveChan <-chan *clientv3.LeaseKeepAliveResponse
	key           string // key
	val           string // value
}

// NewServiceRegister creates the client, grants a lease, and registers the service.
func NewServiceRegister(endpoints []string, key, val string, lease int64) (*ServiceRegister, error) {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   endpoints,
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		// Return the error instead of exiting, since the function already reports errors.
		return nil, err
	}
	ser := &ServiceRegister{
		cli: cli,
		key: key,
		val: val,
	}
	// Grant the lease, register the service, and start keep-alive.
	if err := ser.putKeyWithLease(lease); err != nil {
		cli.Close()
		return nil, err
	}
	return ser, nil
}

// putKeyWithLease grants a lease, writes the key bound to it, and starts renewal.
func (s *ServiceRegister) putKeyWithLease(lease int64) error {
	// Grant a lease with the given TTL in seconds.
	resp, err := s.cli.Grant(context.Background(), lease)
	if err != nil {
		return err
	}
	// Register the service and bind it to the lease.
	_, err = s.cli.Put(context.Background(), s.key, s.val, clientv3.WithLease(resp.ID))
	if err != nil {
		return err
	}
	// Start keep-alive: renewal requests are sent periodically.
	leaseRespChan, err := s.cli.KeepAlive(context.Background(), resp.ID)
	if err != nil {
		return err
	}
	s.leaseID = resp.ID
	log.Println(s.leaseID)
	s.keepAliveChan = leaseRespChan
	log.Printf("Put key:%s val:%s success!", s.key, s.val)
	return nil
}

// ListenLeaseRespChan logs keep-alive responses until the channel is closed.
func (s *ServiceRegister) ListenLeaseRespChan() {
	for leaseKeepResp := range s.keepAliveChan {
		log.Println("lease renewed", leaseKeepResp)
	}
	log.Println("keep-alive stopped")
}

// Close deregisters the service.
func (s *ServiceRegister) Close() error {
	// Revoke the lease; the key bound to it is removed with it.
	if _, err := s.cli.Revoke(context.Background(), s.leaseID); err != nil {
		return err
	}
	log.Println("lease revoked")
	return s.cli.Close()
}

func main() {
	endpoints := []string{"192.168.79.134:2379"}
	ser, err := NewServiceRegister(endpoints, "/web", "192.168.1.51:8000", 5)
	if err != nil {
		log.Fatalln(err)
	}
	// Consume the keep-alive response channel in the background.
	go ser.ListenLeaseRespChan()
	// Stay registered for 20 seconds, then deregister.
	<-time.After(20 * time.Second)
	if err := ser.Close(); err != nil {
		log.Println(err)
	}
}
2. Service discovery
Discover services by the /web prefix and keep watching /web for changes:
package main

import (
	"context"
	"log"
	"sync"
	"time"

	"github.com/coreos/etcd/clientv3"
	"github.com/coreos/etcd/mvcc/mvccpb"
)

// ServiceDiscovery discovers services stored under a key prefix.
type ServiceDiscovery struct {
	cli        *clientv3.Client  // etcd client
	serverList map[string]string // known services: key -> address
	lock       sync.Mutex        // guards serverList
}

// NewServiceDiscovery creates a discovery client.
func NewServiceDiscovery(endpoints []string) *ServiceDiscovery {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   endpoints,
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	return &ServiceDiscovery{
		cli:        cli,
		serverList: make(map[string]string),
	}
}

// WatchService loads the current service list and then watches the prefix.
func (s *ServiceDiscovery) WatchService(prefix string) error {
	// Fetch the keys that already exist under the prefix.
	resp, err := s.cli.Get(context.Background(), prefix, clientv3.WithPrefix())
	if err != nil {
		return err
	}
	for _, ev := range resp.Kvs {
		s.SetServiceList(string(ev.Key), string(ev.Value))
	}
	// Watch from the revision right after the Get so that no change made in between is missed.
	go s.watcher(prefix, resp.Header.Revision+1)
	return nil
}

// watcher watches the prefix, starting at revision rev, and applies changes to the list.
func (s *ServiceDiscovery) watcher(prefix string, rev int64) {
	rch := s.cli.Watch(context.Background(), prefix, clientv3.WithPrefix(), clientv3.WithRev(rev))
	log.Printf("watching prefix:%s now...", prefix)
	for wresp := range rch {
		for _, ev := range wresp.Events {
			switch ev.Type {
			case mvccpb.PUT: // added or updated
				s.SetServiceList(string(ev.Kv.Key), string(ev.Kv.Value))
			case mvccpb.DELETE: // deleted
				s.DelServiceList(string(ev.Kv.Key))
			}
		}
	}
}

// SetServiceList adds or updates a service address.
func (s *ServiceDiscovery) SetServiceList(key, val string) {
	s.lock.Lock()
	defer s.lock.Unlock()
	s.serverList[key] = val
	log.Println("put key :", key, "val:", val)
}

// DelServiceList removes a service address.
func (s *ServiceDiscovery) DelServiceList(key string) {
	s.lock.Lock()
	defer s.lock.Unlock()
	delete(s.serverList, key)
	log.Println("del key:", key)
}

// GetServices returns the known service addresses.
func (s *ServiceDiscovery) GetServices() []string {
	s.lock.Lock()
	defer s.lock.Unlock()
	addrs := make([]string, 0, len(s.serverList))
	for _, v := range s.serverList {
		addrs = append(addrs, v)
	}
	return addrs
}

// Close closes the etcd client.
func (s *ServiceDiscovery) Close() error {
	return s.cli.Close()
}

func main() {
	endpoints := []string{"192.168.79.134:2379"}
	ser := NewServiceDiscovery(endpoints)
	defer ser.Close()
	if err := ser.WatchService("/web"); err != nil {
		log.Println(err)
		return
	}
	// Print the known services every 10 seconds. One ticker is reused;
	// calling time.Tick inside the loop would create a new ticker per iteration.
	ticker := time.NewTicker(10 * time.Second)
	defer ticker.Stop()
	for range ticker.C {
		log.Println(ser.GetServices())
	}
}
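Once the address list is available, a caller still has to choose one address per request. The following is a minimal round-robin sketch under stated assumptions: roundRobin and Pick are hypothetical helpers, not part of clientv3, and the addresses are sorted first because a list built by iterating over a Go map has no stable order.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// roundRobin is a hypothetical helper that cycles through service addresses.
type roundRobin struct {
	mu   sync.Mutex
	next int
}

// Pick returns the next address in sorted order, or "" and false when the
// list is empty. The input slice is copied, so the caller's slice is untouched.
func (r *roundRobin) Pick(addrs []string) (string, bool) {
	if len(addrs) == 0 {
		return "", false
	}
	sorted := append([]string(nil), addrs...)
	sort.Strings(sorted)

	r.mu.Lock()
	defer r.mu.Unlock()
	addr := sorted[r.next%len(sorted)]
	r.next++
	return addr, true
}

func main() {
	// Example addresses standing in for the result of a discovery lookup.
	addrs := []string{"192.168.1.52:8000", "192.168.1.51:8000"}
	rr := &roundRobin{}
	for i := 0; i < 3; i++ {
		addr, _ := rr.Pick(addrs)
		fmt.Println(addr)
	}
	// Prints, one per line:
	// 192.168.1.51:8000
	// 192.168.1.52:8000
	// 192.168.1.51:8000
}
```

Because the list is passed in on every call, the picker keeps working when services are added or removed; the rotation simply continues over whatever list is current.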
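3. Testing
VI. Registering gRPC services with an etcd cluster
When grpc provides the services and etcd is the service-discovery tool, pay attention to the grpc version, since older clientv3 releases such as v3.3.25 require grpc v1.26.0.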