Elasticsearch data backup, restore, and migration


snapshot API

Documentation: https://www.elastic.co/guide/cn/elasticsearch/guide/current/backing-up-your-cluster.html
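As a rough sketch of the snapshot workflow, the script below builds the three REST calls involved: registering a shared-filesystem repository, taking a snapshot, and restoring it. The host, the repository name `my_backup`, the snapshot name `snapshot_1`, and the location `/mount/backups/my_backup` are placeholder assumptions, not values from this post, and the location also has to be listed under `path.repo` in elasticsearch.yml. The helper names are hypothetical. By default the script only prints each request (`DRY_RUN=1`) so it can be reviewed before anything is sent.

```shell
#!/bin/sh
# Sketch only: every name below is a placeholder assumption.
ES_URL="${ES_URL:-http://localhost:9200}"
REPO="${REPO:-my_backup}"
SNAPSHOT="${SNAPSHOT:-snapshot_1}"
REPO_LOCATION="${REPO_LOCATION:-/mount/backups/my_backup}"

# JSON body that registers a shared-filesystem ("fs") repository.
repo_body() {
  printf '{"type":"fs","settings":{"location":"%s"}}' "$REPO_LOCATION"
}

# Print the request in dry-run mode (the default); otherwise send it with curl.
# Arguments: METHOD PATH [JSON_BODY]
es_request() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$1 ${ES_URL}/$2${3:+ $3}"
  elif [ -n "${3:-}" ]; then
    curl -sS -X "$1" "${ES_URL}/$2" -H 'Content-Type: application/json' -d "$3"
  else
    curl -sS -X "$1" "${ES_URL}/$2"
  fi
}

register_repo()    { es_request PUT  "_snapshot/${REPO}" "$(repo_body)"; }
take_snapshot()    { es_request PUT  "_snapshot/${REPO}/${SNAPSHOT}?wait_for_completion=true"; }
restore_snapshot() { es_request POST "_snapshot/${REPO}/${SNAPSHOT}/_restore"; }
```

Calling `register_repo`, then `take_snapshot`, and later `restore_snapshot` with `DRY_RUN=0` would send the requests for real.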

Reindex API

Copies documents from a source to a destination.

Documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#docs-reindex

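Example

Copy the documents in the doc index on server 53 (10.0.0.53) whose topicid field is not empty to server 52 (the local machine).

  • Allow the remote host for reindex
    • Add this setting to elasticsearch.yml on server 52: reindex.remote.whitelist: [10.0.0.53:9200] (a static setting, so the node must be restarted for it to take effect)
  • Call the reindex API on server 52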
POST _reindex
{
  "source": {
    "remote": {
      "host": "http://10.0.0.53:9200"
    },
    "size": 1000,
    "index": "doc",
    "query": {
      "bool": {
        "must_not": [
          { "term": { "topicid": { "value": "" } } }
        ]
      }
    }
  },
  "dest": {
    "index": "doc"
  }
}

The call failed with the following error:

{
  "type": "illegal_argument_exception",
  "reason": "Remote responded with a chunk that was too large. Use a smaller batch size.",
  "caused_by": {
    "type": "content_too_long_exception",
    "reason": "entity content is too long [402906671] for the configured buffer limit [104857600]"
  }
}

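From what I could find, this buffer limit (104857600 bytes, i.e. 100 MB) appears to be hard-coded and cannot be raised. In the end I reduced size (the number of documents per batch) from 1000 to 50, and the copy completed.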
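The workaround can be sketched as a small helper that builds the same request with a configurable batch size. The function name `reindex_body` is hypothetical; the remote host, index name, and query are the ones from the example.

```shell
#!/bin/sh
# Sketch: build the _reindex request body with a configurable batch size.
# reindex_body is a hypothetical helper; arguments: REMOTE_HOST INDEX BATCH_SIZE
reindex_body() {
  cat <<EOF
{
  "source": {
    "remote": { "host": "$1" },
    "size": $3,
    "index": "$2",
    "query": {
      "bool": { "must_not": [ { "term": { "topicid": { "value": "" } } } ] }
    }
  },
  "dest": { "index": "$2" }
}
EOF
}
```

For example, `reindex_body http://10.0.0.53:9200 doc 50 | curl -sS -X POST 'http://localhost:9200/_reindex' -H 'Content-Type: application/json' -d @-` would send it to the local node (the `localhost:9200` address is an assumption). As a rough guide to choosing the size: the failed batch of 1000 documents was 402906671 bytes, about 400 KB per document on average, so only batches of at most a couple of hundred documents would be expected to fit under the 100 MB limit if sizes are roughly uniform; 50 leaves a wide margin.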

elasticsearch-dump

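Project: https://github.com/elasticsearch-dump/elasticsearch-dump

Common options:

  • --input: source
  • --output: destination
  • --type: what to export (for example analyzer, mapping, or data)
  • --limit: batch size, i.e. the number of objects processed per operation
  • --fileSize: split the output into multiple files by size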
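To illustrate how these options combine, here is a minimal sketch that copies one index directly from one cluster to another, mapping first and then data. The helper name `copy_index` and the `ELASTICDUMP` variable are hypothetical; the default of 100 is the tool's documented default for `--limit`.

```shell
#!/bin/sh
# Sketch: copy one index between clusters, mapping first, then data.
# copy_index is a hypothetical helper; arguments: SRC_URL DEST_URL [BATCH]
# ELASTICDUMP may point at a different binary (e.g. ./bin/elasticdump).
copy_index() {
  dump="${ELASTICDUMP:-elasticdump}"
  "$dump" --input="$1" --output="$2" --type=mapping &&
  "$dump" --input="$1" --output="$2" --type=data --limit="${3:-100}"
}
```

A call such as `copy_index http://10.0.0.53:9200/doc http://localhost:9200/doc 500` would move 500 objects per batch; the destination address here is an assumption.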

Backup script

#!/bin/sh
# Back up one index (analyzer, mapping, data) with elasticdump.

if [ $# -eq 0 ]; then
    echo "Please specify an index!"
    echo "Usage: sh $0 index"
    exit 1
fi

ip=192.168.1.206
port=9202
backup_dir=./backup
index=$1
dir=${backup_dir}/${index}
url=http://${ip}:${port}/${index}

mkdir -p "$dir"

# Dump analyzer settings, mapping, and data; data files are split at 1 GB.
./bin/elasticdump --input="$url" --output="${dir}/analyzer.json" --type=analyzer
./bin/elasticdump --input="$url" --output="${dir}/mapping.json" --type=mapping
./bin/elasticdump --input="$url" --output="${dir}/index.json" --type=data --fileSize=1gb

Restore script
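A minimal sketch of what a restore could look like, assuming it simply mirrors the backup script (same host, port, and directory layout); this is not the author's script. The helper name `restore_index` and the environment variables are hypothetical. Because `--fileSize` may have split the data into several files, the sketch loads every file matching `index*.json`; that naming pattern is an assumption and should be checked against the actual contents of the backup directory.

```shell
#!/bin/sh
# Sketch of a restore mirroring the backup script; not the original script.
ip="${ES_IP:-192.168.1.206}"
port="${ES_PORT:-9202}"
backup_dir="${BACKUP_DIR:-./backup}"
elasticdump="${ELASTICDUMP:-./bin/elasticdump}"

# restore_index INDEX: load the analyzer, then the mapping, then every data file.
restore_index() {
  dir="${backup_dir}/$1"
  url="http://${ip}:${port}/$1"
  # The analyzer goes first because the mapping may refer to it.
  "$elasticdump" --input="${dir}/analyzer.json" --output="$url" --type=analyzer || return 1
  "$elasticdump" --input="${dir}/mapping.json" --output="$url" --type=mapping || return 1
  # Assumed naming: the data file(s) written by the backup match index*.json.
  for f in "${dir}"/index*.json; do
    [ -e "$f" ] || continue
    "$elasticdump" --input="$f" --output="$url" --type=data || return 1
  done
}
```

Running `restore_index doc` after sourcing the script would restore the doc index from `./backup/doc`.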


 
