Commit c8124e9

DX-18737: Add Helm C3 executor and dist store caching

- Dremio 4.0.0 or later required.
- Adds the concept of an imageTag to expose features that are introduced only in newer versions of Dremio.
- Removes the dremioVersion value that needs to be manually set to reference the same version that is used by the image.
- Adds optional Cloud Cache support. Dist is split between PDFS and cloud storage.

Change-Id: I645c53bb772c0d52362052ef77925c08b30cc494
1 parent 45a0484 commit c8124e9

9 files changed: +299 -181 lines changed


charts/dremio/Chart.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 apiVersion: "v1"
 name: "dremio"
-version: "0.0.7"
+version: "0.1.0"
 keywords:
 - dremio
 - data

charts/dremio/README.md

Lines changed: 121 additions & 47 deletions
@@ -2,7 +2,10 @@

 ## Overview

-This is a Helm chart to deploy a Dremio cluster in kubernetes. It uses a persistent volume for the master node to store the metadata for the cluster. The default configuration uses the default persistent storage supported by the kubernetes platform. For example,
+This is a Helm chart to deploy a Dremio cluster in kubernetes. It uses
+a persistent volume for the master node to store the metadata for the
+cluster. The default configuration uses the default persistent storage
+supported by the kubernetes platform. For example,

 | Kubernetes platform | Persistent store |
 |---------------------|------------------|
@@ -11,139 +14,210 @@ This is a Helm chart to deploy a Dremio cluster in kubernetes. It uses a persist
 | Google GKE | Persistent Disk |
 | Local K8S on Docker | Hostpath |

-If you want to use a different storage class available in your kubernetes environment, add the storageClass in values.yaml.
-
-An appropriate distributed file store (S3, ADLS, HDFS, etc) should be used for paths.dist as this deployment will lose locally persisted reflections and uploads. You can update config/dremio.conf. Dremio [documentation](https://docs.dremio.com/deployment/distributed-storage.html) provides more information on this.
-
-This assumes you already have kubernetes cluster setup, kubectl configured to talk to your kubernetes cluster and helm setup in your cluster. Review and update values.yaml to reflect values for your environment before installing the helm chart. This is specially important for for the memory and cpu values - your kubernetes cluster should have sufficient resources to provision the pods with those values. If your kubernetes installation does not support serviceType LoadBalancer, it is recommended to comment the serviceType value in values.yaml file before deploying.
+If you want to use a different storage class available in your
+kubernetes environment, add the storageClass in values.yaml.
+
+An appropriate distributed file store (S3, ADLS, HDFS, etc) should be
+used for paths.dist as this deployment will lose locally persisted
+reflections and uploads. You can update config/dremio.conf. Dremio
+[documentation](https://docs.dremio.com/deployment/distributed-storage.html)
+provides more information on this.
+
+This assumes you already have kubernetes cluster setup, kubectl
+configured to talk to your kubernetes cluster and helm setup in your
+cluster. Review and update values.yaml to reflect values for your
+environment before installing the helm chart. This is specially
+important for for the memory and cpu values - your kubernetes cluster
+should have sufficient resources to provision the pods with those
+values. If your kubernetes installation does not support serviceType
+LoadBalancer, it is recommended to comment the serviceType value in
+values.yaml file before deploying.

 #### Installing the helm chart
-Review charts/dremio/values.yaml and adjust the values as per your requirements. Note that the values for cpu and memory for the coordinator and the executors are set to work with AKS on Azure with worker nodes setup with machine types Standard_E16s_v3.
+
+Review charts/dremio/values.yaml and adjust the values as per your
+requirements. Note that the values for cpu and memory for the
+coordinator and the executors are set to work with AKS on Azure with
+worker nodes setup with machine types Standard_E16s_v3.

 Run this from the charts directory
+
 ```bash
-cd charts
-helm install --wait dremio
-```
-If it takes longer than a couple of minutes to complete, check the status of the pods to see where they are waiting. If they are pending scheduling due to limited memory or cpu, either adjust the values in values.yaml and restart the process or add more resources to your kubernetes cluster.
+cd charts helm install --wait dremio ```
+
+If it takes longer than a couple of minutes to complete, check the
+status of the pods to see where they are waiting. If they are pending
+scheduling due to limited memory or cpu, either adjust the values in
+values.yaml and restart the process or add more resources to your
+kubernetes cluster.

 #### Connect to the Dremio UI
-If your kubernetes supports serviceType LoadBalancer, you can get to the Dremio UI on the load balancer external ip.

-For example, if your service output is:
+If your kubernetes supports serviceType LoadBalancer, you can get to
+the Dremio UI on the load balancer external IP. For example, if your
+service output is:

 ```bash
 kubectl get services dremio-client
 NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
 dremio-client LoadBalancer 10.99.227.180 35.226.31.211 31010:32260/TCP,9047:30620/TCP 2d
 ```

-you can get to the Dremio UI using the value under column EXTERNAL-IP:
+You can get to the Dremio UI using the value under column EXTERNAL-IP:

 http://35.226.31.211:9047

-If your kubernetes does not have support of serviceType LoadBalancer, you can access the Dremio UI on the port exposed on the node. For example, if the service output is:
+If your kubernetes does not have support of serviceType LoadBalancer,
+you can access the Dremio UI on the port exposed on the node. For
+example, if the service output is:

 ```bash
 kubectl get services dremio-client
 NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
 dremio-client NodePort 10.110.65.97 <none> 31010:32390/TCP,9047:30670/TCP 1h
 ```
-where there is no external ip and the Dremio master is running on node "localhost", you can get to Dremio UI using:

-http://localhost:30670
+Where there is no external IP and the Dremio master is running on node
+"localhost", you can get to Dremio UI using:

+http://localhost:30670

 #### Dremio Client Port
-The port 31010 is used for ODBC and JDBC connections. You can look up service dremio-client in kubernetes to find the host to use for ODBC or JDBC connections. Depending on your kubernetes cluster supporting serviceType LoadBalancer, you will use the load balancer external-ip or the node on which a coordinator is running.
+
+The port 31010 is used for ODBC and JDBC connections. You can look up
+service dremio-client in kubernetes to find the host to use for ODBC
+or JDBC connections. Depending on your kubernetes cluster supporting
+serviceType LoadBalancer, you will use the load balancer external-ip
+or the node on which a coordinator is running.

 ```bash
 kubectl get services dremio-client
 NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
 dremio-client LoadBalancer 10.99.227.180 35.226.31.211 31010:32260/TCP,9047:30620/TCP 2d
 ```

-For example, in the above output, the service is exposed on an external-ip. So, you can use 35.226.31.211:31010 in your ODBC or JDBC connections.
+For example, in the above output, the service is exposed on an
+external-ip. So, you can use 35.226.31.211:31010 in your ODBC or JDBC
+connections.

 #### Viewing logs
-Logs are written to the container's console. All the logs - server.log, server.out, server.gc and access.log - are written into the console simultaneously. You can view the logs using kubectl.
-```
-kubectl logs <container-name>
-```
-You can also tail the logs using the -f parameter.
-```
-kubectl logs -f <container-name>
-```
+
+Logs are written to the container's console. All the logs -
+server.log, server.out, server.gc and access.log - are written into
+the console simultaneously. You can view the logs using kubectl. ```
+kubectl logs <container-name> ``` You can also tail the logs using the
+-f parameter. ``` kubectl logs -f <container-name> ```

 #### Scale by adding additional Coordinators or Executors (optional)
-Get the name of the helm release. In the example below, the release name is plundering-alpaca.
+
+Get the name of the helm release. In the example below, the release
+name is plundering-alpaca:
+
 ```bash
 helm list
 NAME REVISION UPDATED STATUS CHART NAMESPACE
 plundering-alpaca 1 Wed Jul 18 09:36:14 2018 DEPLOYED dremio-0.0.5 default
 ```

-Add additional coordinators
+Add additional coordinators:
+
 ```bash
 helm upgrade <release name> dremio --set coordinator.count=3
 ```

-Add additional executors
+Add additional executors:
+
 ```bash
 helm upgrade <release name> dremio --set executor.count=5
 ```

 You can also scale down the same way.

 ### Running offline dremio-admin commands
-Administration commands restore, cleanup and set-password in dremio-admin needs to be run when
-the Dremio cluster is not running. So, before running these commands, you need to shutdown
-the Dremio cluster. Use the helm delete command to delete the helm release.
-(Kubernetes does not delete the persistent store volumes when you delete statefulset pods and
-when you install the cluster again using helm, the existing persistent store will be used and
-you will get your Dremio cluster running again.)
-
-After Dremio cluster is shutdown, start the dremio-admin pod using
+
+Administration commands restore, cleanup and set-password in
+dremio-admin needs to be run when the Dremio cluster is not
+running. So, before running these commands, you need to shutdown the
+Dremio cluster. Use the helm delete command to delete the helm
+release. (Kubernetes does not delete the persistent store volumes
+when you delete statefulset pods and when you install the cluster
+again using helm, the existing persistent store will be used and you
+will get your Dremio cluster running again.)
+
+After Dremio cluster is shutdown, start the dremio-admin pod using:
+
 ```bash
 helm install --wait dremio --set DremioAdmin=true
 ```
-Once the pod is running, you can connect to the pod using
+Once the pod is running, you can connect to the pod using:
+
 ```bash
 kubectl exec -it dremio-admin -- bash
 ```
 Now, you have a bash shell from where you can run the dremio-admin commands.

-Once you are done, you can delete the helm release for the dremio-admin and start your Dremio cluster.
+Once you are done, you can delete the helm release for the
+dremio-admin and start your Dremio cluster.

 #### Upgrading Dremio
-You should attempt upgrade when no queries are running on the cluster. Update the Dremio image tag in your values.yaml file. E.g.
+
+You should attempt upgrade when no queries are running on the
+cluster. Update the Dremio image tag in your values.yaml file. E.g:
+
 ```bash
 image: dremio/dremio-oss:3.0.0
 ...
 ```

-Get the name of the helm release. In the example below, the release name is plundering-alpaca.
+Get the name of the helm release. In the example below, the release
+name is plundering-alpaca.
+
 ```bash
 helm list
 NAME REVISION UPDATED STATUS CHART NAMESPACE
 plundering-alpaca 1 Wed Jul 18 09:36:14 2018 DEPLOYED dremio-0.0.5 default
 ```

 Upgrade the deployment via helm upgrade command:
+
 ```
 helm upgrade <release name> .
 ```

-Existing pods will be terminated and new pods will be created with the new image. You can
+Existing pods will be terminated and new pods will be created with the
+new image. You can
+
 monitor the status of the pods by running:
 ```
 kubectl get pods
 ```

-Once all the pods are restarted and running, your Dremio cluster is upgraded.
+Once all the pods are restarted and running, your Dremio cluster is
+upgraded.

 #### Customizing Dremio configuration

-Dremio configuration files used by the deployment are in the config directory. These files are propagated to all the pods in the cluster. Updating the configuration and upgrading the helm release - just like doing an upgrade - would refresh all the pods with the new configuration. [Dremio documentation](https://docs.dremio.com/deployment/README-config.html) covers the configuration capabilities in Dremio.
-
-If you need to add a core-site.xml, you can add the file to the config directory and it will be propagated to all the pods on install or upgrade of the deployment.
+Dremio configuration files used by the deployment are in the config
+directory. These files are propagated to all the pods in the
+cluster. Updating the configuration and upgrading the helm release -
+just like doing an upgrade - would refresh all the pods with the new
+configuration. [Dremio
+documentation](https://docs.dremio.com/deployment/README-config.html)
+covers the configuration capabilities in Dremio.
+
+If you need to add a core-site.xml, you can add the file to the config
+directory and it will be propagated to all the pods on install or
+upgrade of the deployment.
+
+#### Important Changes
+
+2019-09-19 (v0.1.0): BREAKING CHANGE.
+
+Dremio versions before 4.0.0 are no longer supported by this Helm
+chart. Dremio image specifier was split into an imageName and
+imageTag parts to follow best practices. "dist" value in
+dremio.conf moved to cloud storage where possible (otherwise
+defaults to pdfs) -- this will lose any previously extant
+reflections materialisations, user uploads, scratch files, etc.
+Also added Cloud Cache support (new in Dremio 4.0). Please see
+values.yaml for details on this new configuration.
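
Reviewer note on the breaking change above: anyone carrying a values.yaml written against the old combined `image:` value (for example `dremio/dremio-oss:3.0.0`, as shown in the Upgrading section) has to split it by hand. A minimal sketch of that split follows. The function name `split_image` is hypothetical, and the printed keys follow the wording of the Important Changes note (imageName, imageTag) rather than a confirmed values.yaml schema, so check values.yaml in this commit for the real key names. Digest references (`name@sha256:...`) are not handled.

```bash
#!/bin/sh
# Hypothetical helper: split a combined image specifier into the
# name and tag parts described in the Important Changes note.
split_image() {
  spec="$1"
  name="${spec%:*}"    # everything before the last colon
  tag="${spec##*:}"    # everything after the last colon

  # No colon at all: there is no tag to extract.
  case "$spec" in
    *:*) ;;
    *) name="$spec"; tag="" ;;
  esac

  # A slash in the "tag" means the last colon was a registry port
  # (e.g. registry:5000/dremio/dremio-oss), so there is no tag either.
  case "$tag" in
    */*) name="$spec"; tag="" ;;
  esac

  # Key names are assumptions taken from the README wording.
  printf 'imageName: %s\nimageTag: %s\n' "$name" "$tag"
}

split_image "dremio/dremio-oss:3.0.0"
```

Run as written, this prints `imageName: dremio/dremio-oss` followed by `imageTag: 3.0.0`. Splitting on the last colon rather than the first keeps registry hosts with a port intact.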
