Environment:
Source -- a ClickHouse cluster:
192.168.1.118
192.168.1.134
192.168.1.135
Destination -- a single node:
192.168.1.85
1. Create the database and table on the source
Here the local table is created on 192.168.1.118.
Create the database:
create database db_test;
Create the table:
create table db_test.worker
(
id Int32,
name String,
gender Int16,
birthday Date,
cert_num String,
department_id Int32
)
ENGINE = MergeTree()
PRIMARY KEY id
PARTITION BY department_id
ORDER BY id;
Insert some test data (note that cert_num is a String, so its value must be quoted, and birthday is a Date, so today() fits better than now()):
insert into db_test.worker values(1,'name1',1,today(),'12',10);
insert into db_test.worker values(2,'name2',1,today(),'12',11);
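To confirm the rows landed and to see how PARTITION BY department_id split them into parts, a quick check from clickhouse-client on 192.168.1.118 (a sketch; system.parts is standard, but the part names on your system will differ):

```sql
SELECT partition, name, rows
FROM system.parts
WHERE database = 'db_test' AND table = 'worker' AND active;

SELECT * FROM db_test.worker ORDER BY id;
```

With the two inserts above you should see one active part per department_id value.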
2. Under /etc/clickhouse-server/, create the directory hierarchy copytasks/task1/
Run this on node 192.168.1.118:
[root@localhost clickhouse-server]# cd /etc/clickhouse-server/
[root@localhost clickhouse-server]# mkdir -p copytasks/task1
3. Create the task description file schema.xml
This file describes the shard layout of the source and target clusters and the table to be copied.
vi /etc/clickhouse-server/copytasks/task1/schema.xml
<yandex>
<remote_servers>
<source_cluster>
<shard>
<weight>1</weight>
<replica>
<host>192.168.1.118</host>
<port>9000</port>
</replica>
</shard>
</source_cluster>
<target_cluster>
<shard>
<weight>1</weight>
<replica>
<host>192.168.1.85</host>
<port>9000</port>
</replica>
</shard>
</target_cluster>
</remote_servers>
<!-- How many simultaneously active workers are possible. If you run more workers superfluous workers will sleep. -->
<max_workers>2</max_workers>
<!-- Setting used to fetch (pull) data from source cluster tables -->
<settings_pull>
<readonly>1</readonly>
</settings_pull>
<!-- Setting used to insert (push) data to destination cluster tables -->
<settings_push>
<readonly>0</readonly>
</settings_push>
<!-- Common setting for fetch (pull) and insert (push) operations. Also, copier process context uses it.
They are overlaid by <settings_pull/> and <settings_push/> respectively. -->
<settings>
<connect_timeout>3</connect_timeout>
<!-- Sync insert is set forcibly, leave it here just in case. -->
<insert_distributed_sync>1</insert_distributed_sync>
</settings>
<tables>
<table_worker>
<cluster_pull>source_cluster</cluster_pull>
<database_pull>db_test</database_pull>
<table_pull>worker</table_pull>
<cluster_push>target_cluster</cluster_push>
<database_push>db_test</database_push>
<table_push>worker</table_push>
<engine> ENGINE = MergeTree()
PARTITION BY department_id
PRIMARY KEY id
ORDER BY id
SETTINGS index_granularity = 8192
</engine>
</table_worker>
</tables>
</yandex>
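A typo in schema.xml (for example a cluster_pull name that does not match anything under remote_servers) only surfaces after the file is already in ZooKeeper and the copier starts. The cross-references can be sanity-checked locally first; the helper below is my own sketch, not a ClickHouse tool, and the embedded XML is a trimmed copy of the file above:

```python
# Sanity-check a clickhouse-copier task description before uploading it to
# ZooKeeper: every cluster name referenced by a table must exist under
# <remote_servers>. Hypothetical helper, not part of ClickHouse.
import xml.etree.ElementTree as ET

def check_task(xml_text: str) -> list:
    """Return a list of problems found in the task description."""
    root = ET.fromstring(xml_text)
    clusters = {c.tag for c in root.find("remote_servers")}
    problems = []
    for table in root.find("tables"):
        for ref in ("cluster_pull", "cluster_push"):
            name = table.findtext(ref)
            if name not in clusters:
                problems.append(
                    f"{table.tag}: {ref} '{name}' not in remote_servers")
    return problems

# Trimmed-down copy of the schema.xml above, just enough for the check.
SCHEMA = """<yandex>
  <remote_servers>
    <source_cluster><shard><replica><host>192.168.1.118</host><port>9000</port></replica></shard></source_cluster>
    <target_cluster><shard><replica><host>192.168.1.85</host><port>9000</port></replica></shard></target_cluster>
  </remote_servers>
  <tables>
    <table_worker>
      <cluster_pull>source_cluster</cluster_pull>
      <cluster_push>target_cluster</cluster_push>
    </table_worker>
  </tables>
</yandex>"""

print(check_task(SCHEMA))  # an empty list means the references line up
```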
4. From the ZooKeeper bin directory, register the task in ZooKeeper with the following commands
cd /opt/apache-zookeeper-3.7.0-bin/bin/
./zkCli.sh -server 192.168.1.85:2181 create /clickhouse/copytasks ""
./zkCli.sh -server 192.168.1.85:2181 create /clickhouse/copytasks/task1 ""
./zkCli.sh -server 192.168.1.85:2181 create /clickhouse/copytasks/task1/description "`cat /etc/clickhouse-server/copytasks/task1/schema.xml`"
./zkCli.sh -server 192.168.1.85:2181 get /clickhouse/copytasks/task1/description
The last command prints the stored schema.xml back, confirming the upload.
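The three create calls must run parent-first, since zkCli's plain create does not make intermediate nodes. A small sketch that builds the command list for any task name (the helper name is my own, not a ClickHouse or ZooKeeper tool):

```python
def zkcli_create_cmds(server: str, task: str, schema_file: str) -> list:
    """Build the parent-first zkCli.sh commands that register a copier task."""
    zk = f"./zkCli.sh -server {server} create"
    root = "/clickhouse/copytasks"
    return [
        f'{zk} {root} ""',
        f'{zk} {root}/{task} ""',
        # the description node holds the whole schema.xml as its value
        f'{zk} {root}/{task}/description "`cat {schema_file}`"',
    ]

for cmd in zkcli_create_cmds("192.168.1.85:2181", "task1",
                             "/etc/clickhouse-server/copytasks/task1/schema.xml"):
    print(cmd)
```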
5. In /etc/clickhouse-server/copytasks/task1, create the file zookeeper.xml. It tells clickhouse-copier which ZooKeeper to connect to; this must be the same ensemble the task description was uploaded to in step 4 (192.168.1.85:2181 here), or the copier will not find the task. Contents:
vi /etc/clickhouse-server/copytasks/task1/zookeeper.xml
<yandex>
<logger>
<level>trace</level>
<size>100M</size>
<count>3</count>
</logger>
<zookeeper>
<node index="1">
<host>192.168.1.85</host>
<port>2181</port>
</node>
</zookeeper>
</yandex>
6. Now run the copy task with the command below. clickhouse-copier creates the destination table itself from the <engine> section of schema.xml, but first make sure the db_test database exists on 192.168.1.85 (create database if not exists db_test;).
clickhouse-copier --config /etc/clickhouse-server/copytasks/task1/zookeeper.xml --task-path /clickhouse/copytasks/task1 --base-dir /etc/clickhouse-server/copytasks/task1/
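Once the copier exits, the row counts on both ends should match. A quick check using the remote() table function, run from any node that can reach both (the hosts are the ones used throughout this setup):

```sql
SELECT count() FROM remote('192.168.1.118', db_test.worker);  -- source
SELECT count() FROM remote('192.168.1.85',  db_test.worker);  -- destination
```

If the counts differ, check the copier log in the --base-dir directory before rerunning the task.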