Doris CCR data synchronization tool (cross-cluster replication)
Note: this chapter uses a modified (forked) tool; the official tool does not support endpoint_mapping. The tool has to be deployed from a binary, but the steps are otherwise largely the same.
Prerequisites
The feature must be enabled in both the FE and BE configs:
declare enable_feature_binlog = true in fe.conf / be.conf.
Binlog must be enabled on the database/table:
ALTER DATABASE db_name SET properties ("binlog.enable" = "true");
Alternatively, enable binlog for a whole database in one step with the helper script:
./enable_db_binlog.sh -h ip -p sql_port -u root -d db_name
Every FE and BE in both the source and backup clusters needs a corresponding NodePort Service defined for data synchronization.
Synchronizing data with the ccr tool
Collect the source and target FE/BE information, and record the returned Host values for the endpoint mapping later:
SHOW FRONTENDS;
SHOW BACKENDS;
Pin the FE/BE NodePorts for each cluster:
Source cluster NodePort (10.0.0.30)
FE: 9030/9020 → fe-0: 32580/32581, fe-1: 32582/32583, fe-2: 32584/32585
BE: 9060/8040/8060 → be-0: 32060/32040/32061, be-1: 32062/32042/32063, be-2: 32064/32044/32065
Target cluster NodePort (10.0.1.30)
FE and BE use the same NodePort assignments as the source cluster.
NodePort Service template:
apiVersion: v1
kind: Service
metadata:
name: doris-fe-0-nodeport
spec:
type: NodePort
selector:
statefulset.kubernetes.io/pod-name: doriscluster-helm-fe-0
ports:
- name: mysql
port: 9030
targetPort: 9030
nodePort: 32580
- name: thrift
port: 9020
targetPort: 9020
nodePort: 32581
---
apiVersion: v1
kind: Service
metadata:
name: doris-fe-1-nodeport
spec:
type: NodePort
selector:
statefulset.kubernetes.io/pod-name: doriscluster-helm-fe-1
ports:
- name: mysql
port: 9030
targetPort: 9030
nodePort: 32582
- name: thrift
port: 9020
targetPort: 9020
nodePort: 32583
---
apiVersion: v1
kind: Service
metadata:
name: doris-fe-2-nodeport
spec:
type: NodePort
selector:
statefulset.kubernetes.io/pod-name: doriscluster-helm-fe-2
ports:
- name: mysql
port: 9030
targetPort: 9030
nodePort: 32584
- name: thrift
port: 9020
targetPort: 9020
nodePort: 32585
---
apiVersion: v1
kind: Service
metadata:
name: doris-be-0-nodeport
spec:
type: NodePort
selector:
statefulset.kubernetes.io/pod-name: doriscluster-helm-be-0
ports:
- name: be
port: 9060
targetPort: 9060
nodePort: 32060
- name: http
port: 8040
targetPort: 8040
nodePort: 32040
- name: brpc
port: 8060
targetPort: 8060
nodePort: 32061
---
apiVersion: v1
kind: Service
metadata:
name: doris-be-1-nodeport
spec:
type: NodePort
selector:
statefulset.kubernetes.io/pod-name: doriscluster-helm-be-1
ports:
- name: be
port: 9060
targetPort: 9060
nodePort: 32062
- name: http
port: 8040
targetPort: 8040
nodePort: 32042
- name: brpc
port: 8060
targetPort: 8060
nodePort: 32063
---
apiVersion: v1
kind: Service
metadata:
name: doris-be-2-nodeport
spec:
type: NodePort
selector:
statefulset.kubernetes.io/pod-name: doriscluster-helm-be-2
ports:
- name: be
port: 9060
targetPort: 9060
nodePort: 32064
- name: http
port: 8040
targetPort: 8040
nodePort: 32044
- name: brpc
port: 8060
targetPort: 8060
nodePort: 32065
Notes:
- The selector must match exactly one Pod.
- nodePort must fall within the cluster's allowed range (30000-32767 in this environment).
- Every FE and BE Pod must be exposed individually.
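Writing six near-identical Service manifests by hand is exactly how a stale metadata.name or selector sneaks in. A minimal sketch that generates the per-pod manifests from the naming scheme above instead; serialize the resulting dicts with whatever YAML tooling you prefer:

```python
# Generate one NodePort Service dict per FE/BE pod.
# Pod names and port numbers follow the templates in this document.
def nodeport_services(kind, count, ports, base="doriscluster-helm"):
    """kind: 'fe' or 'be'; ports: list of (name, port, per-pod nodePort list)."""
    docs = []
    for i in range(count):
        docs.append({
            "apiVersion": "v1",
            "kind": "Service",
            "metadata": {"name": f"doris-{kind}-{i}-nodeport"},
            "spec": {
                "type": "NodePort",
                "selector": {
                    "statefulset.kubernetes.io/pod-name": f"{base}-{kind}-{i}",
                },
                "ports": [
                    {"name": name, "port": port, "targetPort": port,
                     "nodePort": node_ports[i]}
                    for (name, port, node_ports) in ports
                ],
            },
        })
    return docs

fe_services = nodeport_services("fe", 3, [
    ("mysql", 9030, [32580, 32582, 32584]),
    ("thrift", 9020, [32581, 32583, 32585]),
])
be_services = nodeport_services("be", 3, [
    ("be", 9060, [32060, 32062, 32064]),
    ("http", 8040, [32040, 32042, 32044]),
    ("brpc", 8060, [32061, 32063, 32065]),
])
# every metadata.name must be unique
names = [d["metadata"]["name"] for d in fe_services + be_services]
assert len(names) == len(set(names))
```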
Synchronizing the data
Start the ccr syncer:
# start the ccr syncer in the foreground
ccr_syncer -host 127.0.0.1 -port 19192 -db_dir /tmp/ccr-syncer-test.db -log_level info
## start it in the background as a daemon
bash start_syncer.sh --host 127.0.0.1 --port 19192 --db_dir /tmp/db --log_level info --daemon
Create the sync job
Notes:
- src.host / src.port / src.thrift_port: an external entry point of a source FE that the Syncer can reach directly.
- dest.host / dest.port / dest.thrift_port: an external entry point of a target FE that the Syncer can reach directly.
- endpoint_mapping rewrites the Doris-internal addresses into the external NodePort addresses.
- This job syncs an entire database, so table is an empty string.
- host is the FE master address, port is the 9030 (MySQL) port, and thrift_port is the 9020 (RPC) port.
curl -X POST -H "Content-Type: application/json" -d '{
"name": "demo_ccr",
"src": {
"host": "10.0.0.30",
"port": "32580",
"thrift_port": "32581",
"user": "root",
"password": "",
"database": "demo",
"table": "",
"endpoint_mapping": {
"doriscluster-helm-fe-0.doriscluster-helm-fe-internal.default.svc.cluster.local:9030": "10.0.0.30:32580",
"doriscluster-helm-fe-0.doriscluster-helm-fe-internal.default.svc.cluster.local:9020": "10.0.0.30:32581",
"doriscluster-helm-fe-1.doriscluster-helm-fe-internal.default.svc.cluster.local:9030": "10.0.0.30:32582",
"doriscluster-helm-fe-1.doriscluster-helm-fe-internal.default.svc.cluster.local:9020": "10.0.0.30:32583",
"doriscluster-helm-fe-2.doriscluster-helm-fe-internal.default.svc.cluster.local:9030": "10.0.0.30:32584",
"doriscluster-helm-fe-2.doriscluster-helm-fe-internal.default.svc.cluster.local:9020": "10.0.0.30:32585",
"doriscluster-helm-be-0.doriscluster-helm-be-internal.default.svc.cluster.local:9060": "10.0.0.30:32060",
"doriscluster-helm-be-0.doriscluster-helm-be-internal.default.svc.cluster.local:8040": "10.0.0.30:32040",
"doriscluster-helm-be-0.doriscluster-helm-be-internal.default.svc.cluster.local:8060": "10.0.0.30:32061",
"doriscluster-helm-be-1.doriscluster-helm-be-internal.default.svc.cluster.local:9060": "10.0.0.30:32062",
"doriscluster-helm-be-1.doriscluster-helm-be-internal.default.svc.cluster.local:8040": "10.0.0.30:32042",
"doriscluster-helm-be-1.doriscluster-helm-be-internal.default.svc.cluster.local:8060": "10.0.0.30:32063",
"doriscluster-helm-be-2.doriscluster-helm-be-internal.default.svc.cluster.local:9060": "10.0.0.30:32064",
"doriscluster-helm-be-2.doriscluster-helm-be-internal.default.svc.cluster.local:8040": "10.0.0.30:32044",
"doriscluster-helm-be-2.doriscluster-helm-be-internal.default.svc.cluster.local:8060": "10.0.0.30:32065"
}
},
"dest": {
"host": "10.0.1.30",
"port": "32580",
"thrift_port": "32581",
"user": "root",
"password": "",
"database": "demo",
"table": "",
"endpoint_mapping": {
"doriscluster-helm-fe-0.doriscluster-helm-fe-internal.default.svc.cluster.local:9030": "10.0.1.30:32580",
"doriscluster-helm-fe-0.doriscluster-helm-fe-internal.default.svc.cluster.local:9020": "10.0.1.30:32581",
"doriscluster-helm-fe-1.doriscluster-helm-fe-internal.default.svc.cluster.local:9030": "10.0.1.30:32582",
"doriscluster-helm-fe-1.doriscluster-helm-fe-internal.default.svc.cluster.local:9020": "10.0.1.30:32583",
"doriscluster-helm-fe-2.doriscluster-helm-fe-internal.default.svc.cluster.local:9030": "10.0.1.30:32584",
"doriscluster-helm-fe-2.doriscluster-helm-fe-internal.default.svc.cluster.local:9020": "10.0.1.30:32585",
"doriscluster-helm-be-0.doriscluster-helm-be-internal.default.svc.cluster.local:9060": "10.0.1.30:32060",
"doriscluster-helm-be-0.doriscluster-helm-be-internal.default.svc.cluster.local:8040": "10.0.1.30:32040",
"doriscluster-helm-be-0.doriscluster-helm-be-internal.default.svc.cluster.local:8060": "10.0.1.30:32061",
"doriscluster-helm-be-1.doriscluster-helm-be-internal.default.svc.cluster.local:9060": "10.0.1.30:32062",
"doriscluster-helm-be-1.doriscluster-helm-be-internal.default.svc.cluster.local:8040": "10.0.1.30:32042",
"doriscluster-helm-be-1.doriscluster-helm-be-internal.default.svc.cluster.local:8060": "10.0.1.30:32063",
"doriscluster-helm-be-2.doriscluster-helm-be-internal.default.svc.cluster.local:9060": "10.0.1.30:32064",
"doriscluster-helm-be-2.doriscluster-helm-be-internal.default.svc.cluster.local:8040": "10.0.1.30:32044",
"doriscluster-helm-be-2.doriscluster-helm-be-internal.default.svc.cluster.local:8060": "10.0.1.30:32065"
}
}
}' http://127.0.0.1:19192/create_ccr
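The 15-entry endpoint_mapping has the same shape for both clusters; only the node IP differs, so it is safer to generate than to hand-type. A sketch assuming the pod and namespace naming from this environment (doriscluster-helm, default namespace):

```python
# Build the endpoint_mapping for one cluster from the pod naming
# scheme and the NodePort assignments used in this document.
def endpoint_mapping(node_ip, namespace="default", base="doriscluster-helm"):
    def fqdn(role, i, port):
        return (f"{base}-{role}-{i}.{base}-{role}-internal."
                f"{namespace}.svc.cluster.local:{port}")
    # internal port -> NodePort for pods 0, 1, 2
    fe_nodeports = {9030: [32580, 32582, 32584],
                    9020: [32581, 32583, 32585]}
    be_nodeports = {9060: [32060, 32062, 32064],
                    8040: [32040, 32042, 32044],
                    8060: [32061, 32063, 32065]}
    mapping = {}
    for port, nps in fe_nodeports.items():
        for i, np in enumerate(nps):
            mapping[fqdn("fe", i, port)] = f"{node_ip}:{np}"
    for port, nps in be_nodeports.items():
        for i, np in enumerate(nps):
            mapping[fqdn("be", i, port)] = f"{node_ip}:{np}"
    return mapping
```

Calling endpoint_mapping("10.0.0.30") reproduces the src block above, and endpoint_mapping("10.0.1.30") the dest block; dump them with json.dumps into the create_ccr payload.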
Other common APIs
Check job status:
curl -s -X POST -H "Content-Type: application/json" -d '{"name":"demo_ccr"}' http://127.0.0.1:19192/job_status
Check replication lag:
curl -s -X POST -H "Content-Type: application/json" -d '{"name":"demo_ccr"}' http://127.0.0.1:19192/get_lag
Check job details:
curl -s -X POST -H "Content-Type: application/json" -d '{"name":"demo_ccr"}' http://127.0.0.1:19192/job_detail
Delete the job:
curl -s -X POST -H "Content-Type: application/json" -d '{"name":"demo_ccr"}' http://127.0.0.1:19192/delete
Checking restore progress in Doris
SHOW RESTORE FROM demo;
Key columns to watch:
`State`
`DownloadFinishedTime`
`FinishedTime`
`UnfinishedTasks`
`TaskErrMsg`
Final result
- The full sync of the demo database succeeded; once the restore finished, the formal tables were visible on the target with data.
- The job state switched to DBIncrementalSync.
- After inserting incremental data on the source, the target cluster showed the synced rows as expected.
Using the ccr tool to restore data from the target cluster back to the source cluster
Delete the existing job:
curl -s -X POST -H "Content-Type: application/json" -d '{"name":"demo_ccr"}' http://127.0.0.1:19192/delete
Create the reverse sync job (target → source):
curl -X POST -H "Content-Type: application/json" -d '{
"name": "demo_ccr_reverse",
"src": {
"host": "10.0.1.30",
"port": "32580",
"thrift_port": "32581",
"user": "root",
"password": "",
"database": "demo",
"table": "",
"endpoint_mapping": {
"doriscluster-helm-fe-0.doriscluster-helm-fe-internal.default.svc.cluster.local:9030": "10.0.1.30:32580",
"doriscluster-helm-fe-0.doriscluster-helm-fe-internal.default.svc.cluster.local:9020": "10.0.1.30:32581",
"doriscluster-helm-fe-1.doriscluster-helm-fe-internal.default.svc.cluster.local:9030": "10.0.1.30:32582",
"doriscluster-helm-fe-1.doriscluster-helm-fe-internal.default.svc.cluster.local:9020": "10.0.1.30:32583",
"doriscluster-helm-fe-2.doriscluster-helm-fe-internal.default.svc.cluster.local:9030": "10.0.1.30:32584",
"doriscluster-helm-fe-2.doriscluster-helm-fe-internal.default.svc.cluster.local:9020": "10.0.1.30:32585",
"doriscluster-helm-be-0.doriscluster-helm-be-internal.default.svc.cluster.local:9060": "10.0.1.30:32060",
"doriscluster-helm-be-0.doriscluster-helm-be-internal.default.svc.cluster.local:8040": "10.0.1.30:32040",
"doriscluster-helm-be-0.doriscluster-helm-be-internal.default.svc.cluster.local:8060": "10.0.1.30:32061",
"doriscluster-helm-be-1.doriscluster-helm-be-internal.default.svc.cluster.local:9060": "10.0.1.30:32062",
"doriscluster-helm-be-1.doriscluster-helm-be-internal.default.svc.cluster.local:8040": "10.0.1.30:32042",
"doriscluster-helm-be-1.doriscluster-helm-be-internal.default.svc.cluster.local:8060": "10.0.1.30:32063",
"doriscluster-helm-be-2.doriscluster-helm-be-internal.default.svc.cluster.local:9060": "10.0.1.30:32064",
"doriscluster-helm-be-2.doriscluster-helm-be-internal.default.svc.cluster.local:8040": "10.0.1.30:32044",
"doriscluster-helm-be-2.doriscluster-helm-be-internal.default.svc.cluster.local:8060": "10.0.1.30:32065"
}
},
"dest": {
"host": "10.0.0.30",
"port": "32580",
"thrift_port": "32581",
"user": "root",
"password": "",
"database": "demo",
"table": "",
"endpoint_mapping": {
"doriscluster-helm-fe-0.doriscluster-helm-fe-internal.default.svc.cluster.local:9030": "10.0.0.30:32580",
"doriscluster-helm-fe-0.doriscluster-helm-fe-internal.default.svc.cluster.local:9020": "10.0.0.30:32581",
"doriscluster-helm-fe-1.doriscluster-helm-fe-internal.default.svc.cluster.local:9030": "10.0.0.30:32582",
"doriscluster-helm-fe-1.doriscluster-helm-fe-internal.default.svc.cluster.local:9020": "10.0.0.30:32583",
"doriscluster-helm-fe-2.doriscluster-helm-fe-internal.default.svc.cluster.local:9030": "10.0.0.30:32584",
"doriscluster-helm-fe-2.doriscluster-helm-fe-internal.default.svc.cluster.local:9020": "10.0.0.30:32585",
"doriscluster-helm-be-0.doriscluster-helm-be-internal.default.svc.cluster.local:9060": "10.0.0.30:32060",
"doriscluster-helm-be-0.doriscluster-helm-be-internal.default.svc.cluster.local:8040": "10.0.0.30:32040",
"doriscluster-helm-be-0.doriscluster-helm-be-internal.default.svc.cluster.local:8060": "10.0.0.30:32061",
"doriscluster-helm-be-1.doriscluster-helm-be-internal.default.svc.cluster.local:9060": "10.0.0.30:32062",
"doriscluster-helm-be-1.doriscluster-helm-be-internal.default.svc.cluster.local:8040": "10.0.0.30:32042",
"doriscluster-helm-be-1.doriscluster-helm-be-internal.default.svc.cluster.local:8060": "10.0.0.30:32063",
"doriscluster-helm-be-2.doriscluster-helm-be-internal.default.svc.cluster.local:9060": "10.0.0.30:32064",
"doriscluster-helm-be-2.doriscluster-helm-be-internal.default.svc.cluster.local:8040": "10.0.0.30:32044",
"doriscluster-helm-be-2.doriscluster-helm-be-internal.default.svc.cluster.local:8060": "10.0.0.30:32065"
}
}
}' http://127.0.0.1:19192/create_ccr
Note: every newly created job performs a full sync, which will eventually overwrite the existing data on the destination.
Check the data state on the source cluster and wait for the sync to finish:
SHOW RESTORE FROM demo;
Delete the reverse sync job:
curl -s -X POST -H "Content-Type: application/json" -d '{"name":"demo_ccr_reverse"}' http://127.0.0.1:19192/delete
Re-create the source-to-target job:
curl -X POST -H "Content-Type: application/json" -d '{
"name": "demo_ccr",
"src": {
"host": "10.0.0.30",
"port": "32584",
"thrift_port": "32585",
"user": "root",
"password": "",
"database": "demo",
"table": "",
"endpoint_mapping": {
"doriscluster-helm-fe-0.doriscluster-helm-fe-internal.default.svc.cluster.local:9030": "10.0.0.30:32580",
"doriscluster-helm-fe-0.doriscluster-helm-fe-internal.default.svc.cluster.local:9020": "10.0.0.30:32581",
"doriscluster-helm-fe-1.doriscluster-helm-fe-internal.default.svc.cluster.local:9030": "10.0.0.30:32582",
"doriscluster-helm-fe-1.doriscluster-helm-fe-internal.default.svc.cluster.local:9020": "10.0.0.30:32583",
"doriscluster-helm-fe-2.doriscluster-helm-fe-internal.default.svc.cluster.local:9030": "10.0.0.30:32584",
"doriscluster-helm-fe-2.doriscluster-helm-fe-internal.default.svc.cluster.local:9020": "10.0.0.30:32585",
"doriscluster-helm-be-0.doriscluster-helm-be-internal.default.svc.cluster.local:9060": "10.0.0.30:32060",
"doriscluster-helm-be-0.doriscluster-helm-be-internal.default.svc.cluster.local:8040": "10.0.0.30:32040",
"doriscluster-helm-be-0.doriscluster-helm-be-internal.default.svc.cluster.local:8060": "10.0.0.30:32061",
"doriscluster-helm-be-1.doriscluster-helm-be-internal.default.svc.cluster.local:9060": "10.0.0.30:32062",
"doriscluster-helm-be-1.doriscluster-helm-be-internal.default.svc.cluster.local:8040": "10.0.0.30:32042",
"doriscluster-helm-be-1.doriscluster-helm-be-internal.default.svc.cluster.local:8060": "10.0.0.30:32063",
"doriscluster-helm-be-2.doriscluster-helm-be-internal.default.svc.cluster.local:9060": "10.0.0.30:32064",
"doriscluster-helm-be-2.doriscluster-helm-be-internal.default.svc.cluster.local:8040": "10.0.0.30:32044",
"doriscluster-helm-be-2.doriscluster-helm-be-internal.default.svc.cluster.local:8060": "10.0.0.30:32065"
}
},
"dest": {
"host": "10.0.1.30",
"port": "30670",
"thrift_port": "31185",
"user": "root",
"password": "",
"database": "demo",
"table": "",
"endpoint_mapping": {
"doriscluster-helm-fe-0.doriscluster-helm-fe-internal.default.svc.cluster.local:9030": "10.0.1.30:32580",
"doriscluster-helm-fe-0.doriscluster-helm-fe-internal.default.svc.cluster.local:9020": "10.0.1.30:32581",
"doriscluster-helm-fe-1.doriscluster-helm-fe-internal.default.svc.cluster.local:9030": "10.0.1.30:32582",
"doriscluster-helm-fe-1.doriscluster-helm-fe-internal.default.svc.cluster.local:9020": "10.0.1.30:32583",
"doriscluster-helm-fe-2.doriscluster-helm-fe-internal.default.svc.cluster.local:9030": "10.0.1.30:32584",
"doriscluster-helm-fe-2.doriscluster-helm-fe-internal.default.svc.cluster.local:9020": "10.0.1.30:32585",
"doriscluster-helm-be-0.doriscluster-helm-be-internal.default.svc.cluster.local:9060": "10.0.1.30:32060",
"doriscluster-helm-be-0.doriscluster-helm-be-internal.default.svc.cluster.local:8040": "10.0.1.30:32040",
"doriscluster-helm-be-0.doriscluster-helm-be-internal.default.svc.cluster.local:8060": "10.0.1.30:32061",
"doriscluster-helm-be-1.doriscluster-helm-be-internal.default.svc.cluster.local:9060": "10.0.1.30:32062",
"doriscluster-helm-be-1.doriscluster-helm-be-internal.default.svc.cluster.local:8040": "10.0.1.30:32042",
"doriscluster-helm-be-1.doriscluster-helm-be-internal.default.svc.cluster.local:8060": "10.0.1.30:32063",
"doriscluster-helm-be-2.doriscluster-helm-be-internal.default.svc.cluster.local:9060": "10.0.1.30:32064",
"doriscluster-helm-be-2.doriscluster-helm-be-internal.default.svc.cluster.local:8040": "10.0.1.30:32044",
"doriscluster-helm-be-2.doriscluster-helm-be-internal.default.svc.cluster.local:8060": "10.0.1.30:32065"
}
}
}' http://127.0.0.1:19192/create_ccr
Wait for the job to finish.
Parameter tuning
## Overall timeout for backup/restore jobs
- backup_job_default_timeout_ms
Set it on the FEs of both the source and target clusters; the unit is milliseconds and the default is 1 day.
Check the current value:
SHOW FRONTEND CONFIG LIKE '%backup_job_default_timeout_ms%';
Apply a temporary online change:
ADMIN SET FRONTEND CONFIG ("backup_job_default_timeout_ms" = "259200000");
259200000 ms is 3 days. The parameter is MasterOnly, so the SQL is forwarded to the current Master, but a change made this way is lost after a restart.
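Since the unit (milliseconds vs. seconds) is easy to get wrong, it is worth double-checking the arithmetic before setting the value:

```python
# Sanity-check the timeout values: backup_job_default_timeout_ms is in
# milliseconds, so 3 days = 3 * 24 * 3600 * 1000 ms.
MS_PER_DAY = 24 * 3600 * 1000
assert 3 * MS_PER_DAY == 259_200_000  # value used in the ADMIN SET above
assert 1 * MS_PER_DAY == 86_400_000   # the 1-day default
```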
## Binlog retention time on the upstream
- binlog.ttl_seconds
Set as a database property on the upstream; the unit is seconds.
For example, keep the binlog of hk_active for 7 days:
ALTER DATABASE hk_active SET PROPERTIES ("binlog.ttl_seconds" = "604800");
## Per-download-thread rate limit on each downstream BE
- max_download_speed_kbps
Set on the downstream BEs; the unit is KB/s. The default is 50000, i.e. about 50 MB/s.
Check:
SHOW BACKEND CONFIG LIKE '%max_download_speed_kbps%';
Temporary online change on a single BE:
curl -X POST "http://<be_host>:8040/api/update_config?max_download_speed_kbps=204800"
204800 is roughly 200 MB/s.
## Download thread count on each downstream BE
- download_worker_count
Also set on the downstream BEs.
Check:
SHOW BACKEND CONFIG LIKE '%download_worker_count%';
Temporary online change on a single BE:
curl -X POST "http://<be_host>:8040/api/update_config?download_worker_count=4"
Raising this is usually more effective than tuning the rate limit alone; the official docs note that once it is increased, max_download_speed_kbps often no longer needs changing.
How to apply this in this environment
- First validate the changes temporarily with the SQL / HTTP APIs above.
- Once validated, persist them in the FE/BE configuration managed by the operator/helm chart.
- Do not just kubectl exec into a Pod to edit fe.conf / be.conf; the change will most likely be lost when the Pod is recreated.
- backup_job_default_timeout_ms should be written into every FE's config; otherwise, after an FE master switch, the new Master may fall back to the default.
- max_download_speed_kbps and download_worker_count must be rolled out to all downstream BEs, not just one.
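For persistence, the entries below are what should end up in the managed configs. This is only a sketch: the key names are the standard fe.conf / be.conf entries, but how they are mounted (ConfigMap name, values.yaml path) depends on your operator/helm chart, and the fe-conf-extra / be-conf-extra labels here are placeholders, not real chart keys.

```yaml
# Sketch of the persisted tuning values (chart-specific wiring omitted)
fe-conf-extra: |
  backup_job_default_timeout_ms = 259200000
be-conf-extra: |
  max_download_speed_kbps = 204800
  download_worker_count = 4
```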