ES data backup, restore, and migration
Snapshot API documentation: https://www.elastic.co/guide/cn/elasticsearch/guide/current/backing-up-your-cluster.html
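The snapshot API is not covered in detail here, but for orientation: taking a snapshot first requires registering a repository. A minimal sketch (the repository name `my_backup` and the filesystem path are placeholders, not from the original post):

```
PUT _snapshot/my_backup
{
    "type": "fs",
    "settings": {
        "location": "/mount/backups/my_backup"
    }
}

PUT _snapshot/my_backup/snapshot_1?wait_for_completion=true
```

A shared-filesystem (`fs`) repository requires the location to be mounted and registered in `path.repo` on every node.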
The Reindex API copies documents from a source index to a destination index.
Documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#docs-reindex
Example: copy the documents in the doc index on server 53 whose topicid field is not empty to server 52 (the local machine).
Configure the remote hosts that reindex is allowed to pull from
Add the following setting to elasticsearch.yml: reindex.remote.whitelist: [10.0.0.53:9200]
Call the reindex API on server 52:
```
POST _reindex
{
    "source": {
        "remote": {
            "host": "http://10.0.0.53:9200"
        },
        "size": 1000,
        "index": "doc",
        "query": {
            "bool": {
                "must_not": [
                    { "term": { "topicid": { "value": "" } } }
                ]
            }
        }
    },
    "dest": {
        "index": "doc"
    }
}
```
A problem came up during the call:
```
{
    "type": "illegal_argument_exception",
    "reason": "Remote responded with a chunk that was too large. Use a smaller batch size.",
    "caused_by": {
        "type": "content_too_long_exception",
        "reason": "entity content is too long [402906671] for the configured buffer limit [104857600]"
    }
}
```
Some searching suggested that this buffer limit is hard-coded and cannot be raised. In the end I reduced size to 50 and the copy completed.
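The numbers in the error message explain why a smaller batch works: 1000 documents came to roughly 400 KB each, overflowing the 100 MB buffer. A rough back-of-the-envelope sketch (the byte counts are taken from the error above; the halving for headroom is my own assumption):

```shell
# Estimate a safe "size" from the numbers in the error message.
chunk_bytes=402906671       # size of the rejected chunk (from the error)
buffer_limit=104857600      # hard-coded buffer limit, 100 MB
batch_size=1000             # the "size" used in the failing request

avg_doc=$((chunk_bytes / batch_size))          # average bytes per document
safe_size=$((buffer_limit / avg_doc / 2))      # halve for headroom
echo "avg doc bytes: ${avg_doc}, suggested size: ${safe_size}"
# → avg doc bytes: 402906, suggested size: 130
```

With documents this large, any size comfortably under ~260 keeps a batch inside the buffer, which is consistent with 50 working.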
elasticsearch-dump project: https://github.com/elasticsearch-dump/elasticsearch-dump
Common options:
--input: source
--output: destination
--type: the type of data to dump (e.g. data, mapping, analyzer)
--limit: number of documents processed per batch
--fileSize: split the output into files of the given size
Backup script

```sh
#!/bin/sh
if [ $# -eq 0 ]; then
    echo "Please specify an index!"
    echo "sh $0 index"
    exit 1
fi

ip=192.168.1.206
port=9202
backup_dir=./backup
index=$1
dir=${backup_dir}/${index}
url=http://${ip}:${port}/${index}

mkdir -p ${backup_dir}/$index

./bin/elasticdump --input=$url --output=${dir}/analyzer.json --type=analyzer
./bin/elasticdump --input=$url --output=${dir}/mapping.json --type=mapping
./bin/elasticdump --input=$url --output=${dir}/index.json --type=data --fileSize=1gb
```
Restore script
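Restoring is the mirror of the backup: swap --input and --output so elasticdump reads the JSON files and writes to the cluster. A minimal sketch that only prints the commands a restore would run (the host, port, and ./backup layout are carried over from the backup script above; this dry run executes nothing against a cluster):

```shell
#!/bin/sh
# Dry run: print the elasticdump commands that would restore an index
# from ./backup, with --input/--output swapped relative to the backup script.
ip=192.168.1.206
port=9202
backup_dir=./backup

build_restore_cmds() {
    index=$1
    dir=${backup_dir}/${index}
    url=http://${ip}:${port}/${index}
    # Restore the analyzer and mapping before the data itself.
    for type in analyzer mapping data; do
        echo "./bin/elasticdump --input=${dir}/${type}.json --output=${url} --type=${type}"
    done
}

build_restore_cmds doc
```

Dropping the echo turns the dry run into an actual restore; the data files produced with --fileSize would need to be fed in one by one.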