Running OpenSearch + Dashboards + the IK Analysis Plugin Locally with Docker

Prepare the base images

Make sure to pull the OpenSearch image whose version matches the IK analysis plugin release you plan to install:
https://github.com/aparo/opensearch-analysis-ik/releases

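At the time of writing, the latest IK release was 2.11.0, while the latest OpenSearch image on Docker Hub was 2.11.1. The plugin will not work if the versions do not match, and even a patch-level difference is enough to break it. I learned this the hard way.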

# Pull the matching opensearch / dashboards images
docker pull opensearchproject/opensearch:2.11.0
docker pull opensearchproject/opensearch-dashboards:2.11.0
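Since the version rule is strict, it can help to check it before pulling anything. This is a minimal sketch, not part of the original workflow: image_tag and versions_match are hypothetical helper names, and the tag parsing assumes the image reference always ends in an explicit :tag.

```shell
# Hypothetical helper: prints the tag portion of an image reference.
# Assumes the reference ends in an explicit ":tag".
image_tag() {
  echo "${1##*:}"
}

# Hypothetical helper: succeeds only when the image tag equals the plugin version.
versions_match() {
  [ "$(image_tag "$1")" = "$2" ]
}
```

For example, `versions_match opensearchproject/opensearch:2.11.0 2.11.0` succeeds, while the same call with the 2.11.1 image fails.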

Additional notes
On a Linux host running Docker, raise the kernel setting vm.max_map_count beforehand; otherwise the container will fail at startup with the following error:

Max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

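# Apply immediately (run as root; lost on reboot)
sysctl -w vm.max_map_count=262144

# Persist in the system configuration file (run as root; applied at the next boot)
echo "vm.max_map_count=262144" >> /etc/sysctl.conf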
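To see whether the host already meets the requirement, the current value can be read back from /proc. This is a rough sketch; map_count_ok and check_map_count are hypothetical helper names, and the threshold is simply the number quoted in the error message.

```shell
# Minimum value OpenSearch asks for, taken from the error message above.
REQUIRED_MAP_COUNT=262144

# Hypothetical helper: succeeds when the given value is high enough.
map_count_ok() {
  [ "$1" -ge "$REQUIRED_MAP_COUNT" ]
}

# Hypothetical helper: reads the live kernel value and reports on it.
check_map_count() {
  current="$(cat /proc/sys/vm/max_map_count)"
  if map_count_ok "$current"; then
    echo "vm.max_map_count=$current is sufficient"
  else
    echo "vm.max_map_count=$current is too low, need $REQUIRED_MAP_COUNT"
  fi
}
```

Running `check_map_count` on the Docker host before starting any container shows whether the sysctl change is still needed.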

Start a temporary container to install the IK plugin

docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" --name opensearch-temp -d opensearchproject/opensearch:2.11.0
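The node needs a little time to come up after docker run returns. A minimal sketch of a readiness check follows. It assumes the 2.11.0 image runs with its demo security configuration (HTTPS with a self-signed certificate and the admin/admin account); wait_for_opensearch is a hypothetical helper name, and the URL and attempt count are illustrative defaults.

```shell
# Hypothetical helper: polls the REST endpoint until it answers or the attempts run out.
# Assumes the image's demo security config (HTTPS, self-signed certificate, admin/admin).
wait_for_opensearch() {
  url="${1:-https://localhost:9200}"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -k skips certificate verification, -f makes HTTP errors count as failures
    if curl -ksf -u admin:admin "$url" >/dev/null 2>&1; then
      echo "OpenSearch is up at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "OpenSearch did not respond after $attempts attempts" >&2
  return 1
}
```

Calling `wait_for_opensearch` with no arguments polls https://localhost:9200 up to 30 times, two seconds apart.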

Once the container is up, connect to it as root. The image is based on Amazon Linux 2023, and su/sudo are not available if you connect as the default user.

docker exec -u 0 -it opensearch-temp /bin/bash

First install wget and unzip

# Inside the container's bash shell
yum install -y wget unzip
# Exit once the installation finishes
exit

Then reconnect as the regular user for the download steps, which saves fixing file ownership by hand afterwards

docker exec -it opensearch-temp /bin/bash

Download and unpack the IK plugin inside the container

# Inside the container's bash shell
pwd
# Confirm the current working directory
# /usr/share/opensearch

# Go to the plugins directory
cd plugins

# Download the plugin
wget https://github.com/aparo/opensearch-analysis-ik/releases/download/2.11.0/opensearch-analysis-ik.zip

# Unzip into the ik folder
unzip opensearch-analysis-ik.zip -d ik

# Delete the zip archive
rm opensearch-analysis-ik.zip

# Leave the container
exit
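Before committing the container it is worth confirming that the plugin files ended up where expected. The sketch below only inspects files from the host; the node that is already running will not load the plugin until it restarts, since plugins are read at startup. show_ik_files and list_plugins are hypothetical helper names, and the paths follow the working directory shown above.

```shell
# Hypothetical helper: lists the unpacked plugin directory inside a running container.
show_ik_files() {
  container="${1:-opensearch-temp}"
  docker exec "$container" ls /usr/share/opensearch/plugins/ik
}

# Hypothetical helper: asks the bundled plugin tool which plugins it can see.
list_plugins() {
  container="${1:-opensearch-temp}"
  docker exec "$container" /usr/share/opensearch/bin/opensearch-plugin list
}
```

Run `show_ik_files` and `list_plugins` on the host; both default to the opensearch-temp container.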

Build a new image from the container with the IK plugin installed

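# Commit the current container state as a temporary image
docker commit opensearch-temp opensearch-ik:temp

# Start a new container from the temporary image; passing -e discovery.type with no value unsets the variable (as long as it is not set in the host shell)
docker run --name opensearch-temp2 -d -e discovery.type opensearch-ik:temp

# Commit this new container as the final image
docker commit opensearch-temp2 opensearch-ik:2.11.0

# With the image created, the temporary containers can be stopped
docker stop opensearch-temp
docker stop opensearch-temp2

# [Optional] Clean up (this removes all stopped containers on the host)
docker container prune -f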

At this point the image with the IK plugin baked in is ready
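The whole point of the second commit is to drop discovery.type from the image environment, so it is worth checking that this actually happened. A minimal sketch, assuming the image name used above; image_env and discovery_type_cleared are hypothetical helper names.

```shell
# Hypothetical helper: prints the environment baked into an image, one variable per line.
image_env() {
  docker image inspect --format '{{range .Config.Env}}{{println .}}{{end}}' "$1"
}

# Hypothetical helper: succeeds when discovery.type is absent from the image environment.
discovery_type_cleared() {
  ! image_env "$1" | grep -q '^discovery\.type='
}
```

`discovery_type_cleared opensearch-ik:2.11.0` should succeed for the final image and fail for opensearch-ik:temp, which still carries the single-node setting.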

Create docker-compose.yaml

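Start from the official documentation at https://opensearch.org/docs/latest/install-and-configure/install-opensearch/docker/ and adjust it slightly: replace the image used by the nodes with the IK-enabled image built above, and tune the remaining settings as needed.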

version: '3'
services:
  opensearch-node1: # This is also the hostname of the container within the Docker network (i.e. https://opensearch-node1/)
    image: opensearch-ik:2.11.0 # Custom image built above with the IK plugin installed
    container_name: opensearch-node1
    environment:
      - cluster.name=opensearch-cluster # Name the cluster
      - node.name=opensearch-node1 # Name the node that will run in this container
      - discovery.seed_hosts=opensearch-node1,opensearch-node2 # Nodes to look for when discovering the cluster
      - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2 # Nodes eligible to serve as cluster manager
      - bootstrap.memory_lock=true # Disable JVM heap memory swapping
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # Set min and max JVM heap sizes to at least 50% of system RAM
    ulimits:
      memlock:
        soft: -1 # Set memlock to unlimited (no soft or hard limit)
        hard: -1
      nofile:
        soft: 65536 # Maximum number of open files for the opensearch user - set to at least 65536
        hard: 65536
    volumes:
      - opensearch-data1:/usr/share/opensearch/data # Creates volume called opensearch-data1 and mounts it to the container
    ports:
      - 9200:9200 # REST API
      - 9600:9600 # Performance Analyzer
    networks:
      - opensearch-net # All of the containers will join the same Docker bridge network
  opensearch-node2:
    image: opensearch-ik:2.11.0 # This should be the same image used for opensearch-node1 to avoid issues
    container_name: opensearch-node2
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node2
      - discovery.seed_hosts=opensearch-node1,opensearch-node2
      - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2
      - bootstrap.memory_lock=true
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - opensearch-data2:/usr/share/opensearch/data
    networks:
      - opensearch-net
  opensearch-dashboards:
    image: opensearchproject/opensearch-dashboards:2.11.0 # Make sure the version of opensearch-dashboards matches the version of opensearch installed on other nodes
    container_name: opensearch-dashboards
    ports:
      - 5601:5601 # Map host port 5601 to container port 5601
    expose:
      - "5601" # Expose port 5601 for web access to OpenSearch Dashboards
    environment:
      OPENSEARCH_HOSTS: '["https://opensearch-node1:9200","https://opensearch-node2:9200"]' # Define the OpenSearch nodes that OpenSearch Dashboards will query
    networks:
      - opensearch-net

volumes:
  opensearch-data1:
  opensearch-data2:

networks:
  opensearch-net:

Start the cluster

docker-compose up -d

Creating network "opensearch-cluster_opensearch-net" with the default driver
Creating volume "opensearch-cluster_opensearch-data1" with default driver
Creating volume "opensearch-cluster_opensearch-data2" with default driver
Creating opensearch-dashboards ... done
Creating opensearch-node1      ... done
Creating opensearch-node2      ... done
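Once the cluster is up, the IK plugin can be exercised through the _analyze API. This is a sketch under several assumptions: the nodes run with the demo security configuration (admin/admin, self-signed certificate), node1 is reachable on localhost:9200 as mapped in the compose file, and this port of the plugin keeps the analyzer names ik_max_word and ik_smart used by the upstream IK plugin. analyze and cat_plugins are hypothetical helper names, and the text argument must not contain double quotes because it is pasted into the JSON body as-is.

```shell
# Hypothetical helper: runs the _analyze API with a given analyzer and text.
# Assumes the demo security config (admin/admin, self-signed certificate) on localhost:9200.
analyze() {
  analyzer="$1"
  text="$2"
  curl -ks -u admin:admin \
    -H 'Content-Type: application/json' \
    -X POST "https://localhost:9200/_analyze" \
    -d "{\"analyzer\": \"$analyzer\", \"text\": \"$text\"}"
}

# Hypothetical helper: lists the plugins loaded on each node.
cat_plugins() {
  curl -ks -u admin:admin "https://localhost:9200/_cat/plugins?v"
}
```

For example, `analyze ik_max_word "中华人民共和国国歌"` should return a list of tokens if the plugin loaded, and `cat_plugins` shows which plugins each node picked up.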