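Installing a Dremio Community Edition cluster is fairly simple: the core of it is a single piece of configuration (zk and distributed storage). To make local testing easier, I put together
a docker-compose based environment that is easy to deploy and ready to use.
Environment configuration
- docker-compose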
version: "3"
services:
  zk:
    image: zookeeper
    ports:
      - 2181:2181
  mysql:
    image: mysql:5.6
    command: --character-set-server=utf8
    ports:
      - "3308:3306"
    environment:
      - MYSQL_ROOT_PASSWORD=dalong
      - MYSQL_USER=boss
      - MYSQL_DATABASE=boss
      - MYSQL_PASSWORD=dalong
  minio:
    image: minio/minio
    ports:
      - "9000:9000"
      - "19001:19001"
    environment:
      MINIO_ACCESS_KEY: minio
      MINIO_SECRET_KEY: minio123
    command: server --console-address :19001 --quiet /data
  dremio_coordinator:
    build: .
    hostname: dremio-coordinator
    container_name: dremio-coordinator
    volumes:
      - ./conf/dremio_coor.conf:/opt/dremio/conf/dremio.conf
      - ./datas:/myappdemo
    ports:
      - "9047:9047"
      - "31010:31010"
      - "9090:9090"
  dremio_executor_1:
    build: .
    hostname: dremio-executor-1
    container_name: dremio-executor-1
    volumes:
      - ./conf/dremio_exec.conf:/opt/dremio/conf/dremio.conf
      - ./datas:/myappdemo
    ports:
      - "9048:9047"
      - "31011:31010"
      - "9091:9090"
  dremio_executor_2:
    build: .
    hostname: dremio_executor_2
    container_name: dremio_executor_2
    volumes:
      - ./conf/dremio_exec.conf:/opt/dremio/conf/dremio.conf
      - ./datas:/myappdemo
    ports:
      - "9049:9047"
      - "31012:31010"
      - "9092:9090"
  dremio_executor_3:
    build: .
    volumes:
      - ./conf/dremio_exec.conf:/opt/dremio/conf/dremio.conf
      - ./datas:/myappdemo
  pg:
    image: postgres:16.0
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_PASSWORD=dalongdemo
  nessie:
    image: projectnessie/nessie:0.75.0-java
    environment:
      - NESSIE_VERSION_STORE_TYPE=JDBC
      - QUARKUS.DATASOURCE.USERNAME=postgres
      - QUARKUS.DATASOURCE.PASSWORD=dalongdemo
      - QUARKUS_DATASOURCE_JDBC_URL=jdbc:postgresql://pg:5432/postgres
    ports:
      - "19120:19120"
      - "19121:19121"
Brief notes: the file includes minio, nessie, pg, mysql, and zk, plus the dremio coordinator and executor nodes. jprofiler is also integrated to make testing easier.
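- Coordinator node configuration
Distributed storage uses local mode (a file:// path). The coordinator only handles incoming queries and some metadata processing and does not perform the actual query execution. Because this is a cluster, zk has to be configured and the embedded zk disabled.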
paths: {
  # the local path for dremio to store data.
  local: ${DREMIO_HOME}"/data"
  dist: "file:///myappdemo"
  # the distributed path Dremio data including job results, downloads, uploads, etc
  #dist: "pdfs://"${paths.local}"/pdfs"
  accelerator: ${paths.dist}/accelerator,
  downloads: ${paths.dist}/downloads,
  uploads: ${paths.dist}/uploads,
  results: ${paths.dist}/results
  scratch: ${paths.dist}/scratch
}
zookeeper: "zk:2181"
debug {
  allowTestApis: true
}
services.coordinator.master.embedded-zookeeper.enabled: false
services: {
  coordinator.enabled: true,
  coordinator.master.enabled: true,
  executor.enabled: false
}
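- Executor node configuration
The key setting is zk: the executors point at the same external zk as the coordinator, and only the executor role is enabled.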
zookeeper: "zk:2181"
services.coordinator.master.embedded-zookeeper.enabled: false
services: {
  coordinator.enabled: false,
  coordinator.master.enabled: false,
  executor.enabled: true
}
- nessie configuration
This part is fairly simple. So that the environment can be reused across restarts, the version store is backed by pg.
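Since nessie keeps its version store in pg, it can help to start pg first. The fragment below is only a sketch and is not part of the original repository: it assumes a hypothetical docker-compose.override.yml placed next to the main compose file, which docker-compose merges automatically.

```yaml
# docker-compose.override.yml (hypothetical file, not in the original repo)
# Merged on top of docker-compose.yml by docker-compose.
version: "3"
services:
  nessie:
    # Start the pg container before nessie.
    # Note: this short form only orders container startup;
    # it does not wait for postgres to accept connections.
    depends_on:
      - pg
```

If nessie still comes up before postgres is ready to accept connections, restarting the nessie container is usually enough in a local test environment.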
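Notes
Deploying the whole environment is straightforward, and it is mainly meant for local learning and testing. There are quite a few executor nodes, so if local resources are limited you can keep just one and delete the others. The complete configuration
is pushed to GitHub for reference.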
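As a rough sketch of that trimmed setup, the dremio part of the compose file could look like the fragment below once dremio_executor_2 and dremio_executor_3 are removed. All values are copied from the compose file above; the other services (zk, mysql, minio, pg, nessie) are assumed to stay unchanged and are left out here for brevity.

```yaml
# Fragment of docker-compose.yml with a single executor.
# zk, mysql, minio, pg and nessie are unchanged and omitted from this sketch.
version: "3"
services:
  dremio_coordinator:
    build: .
    hostname: dremio-coordinator
    container_name: dremio-coordinator
    volumes:
      - ./conf/dremio_coor.conf:/opt/dremio/conf/dremio.conf
      - ./datas:/myappdemo
    ports:
      - "9047:9047"
      - "31010:31010"
      - "9090:9090"
  dremio_executor_1:
    build: .
    hostname: dremio-executor-1
    container_name: dremio-executor-1
    volumes:
      # Same executor config as before; it only needs to reach zk:2181.
      - ./conf/dremio_exec.conf:/opt/dremio/conf/dremio.conf
      - ./datas:/myappdemo
    ports:
      - "9048:9047"
      - "31011:31010"
      - "9091:9090"
```

No change to dremio_coor.conf or dremio_exec.conf should be needed for this, since neither file lists executors explicitly; they only point at zk.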
References
https://github.com/rongfengliang/dremio_cluster_docker-compose
https://docs.dremio.com/current/get-started/cluster-deployments/customizing-configuration/dremio-conf/
https://docs.dremio.com/current/get-started/cluster-deployments/customizing-configuration/dremio-conf/high-availability-config