# Playbook for cluster logging
Before running the playbook, log in as an admin user (either kubeadmin or any other user that has the cluster-admin role):
$ export KUBECONFIG=~/.ocp/auth/kubeconfig
$ oc login -u kubeadmin -p <password>
$ cd ~/OpenShift-on-SimpliVity
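To confirm that the login succeeded and that the account has cluster-admin privileges before continuing, you can run the following checks (the output shown assumes the kubeadmin example above):
$ oc whoami
kube:admin
$ oc auth can-i '*' '*' --all-namespaces
yes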
# Deploying the small profile
After making the customizations appropriate to your environment, deploy the small EFK stack. The following example uses this configuration:
efk_channel=4.2
efk_profile="small"
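If you prefer to keep these settings in a vars file rather than passing them with -e on the command line, the equivalent YAML form is sketched below; the exact file (for example, group_vars/all/vars.yml) is an assumption and depends on how your copy of the repository is organized:
# Hypothetical location: group_vars/all/vars.yml
efk_channel: "4.2"     # Operator subscription channel for cluster logging
efk_profile: "small"   # EFK sizing profile: small or large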
Run the playbooks/efk.yml playbook:
$ ansible-playbook -i hosts playbooks/efk.yml
The playbook takes approximately 1-2 minutes to complete. However, it may take several additional minutes for the various Cluster Logging components to successfully deploy to the OpenShift Container Platform cluster. You can observe the logging pods being created:
$ oc get pod -n openshift-logging
NAME READY STATUS RESTARTS AGE
cluster-logging-operator-789f86bc5d-52864 1/1 Running 0 36s
elasticsearch-cdm-98n13kgt-1-68c7c496b7-7h58d 0/2 ContainerCreating 0 14s
fluentd-4xxjm 0/1 ContainerCreating 0 13s
fluentd-ds6v7 0/1 ContainerCreating 0 13s
fluentd-gp6mn 0/1 ContainerCreating 0 13s
fluentd-mv29x 0/1 ContainerCreating 0 13s
fluentd-pnpgj 0/1 ContainerCreating 0 13s
fluentd-sfkcl 0/1 ContainerCreating 0 13s
kibana-6db8448b8c-whlfc 0/2 ContainerCreating 0 14s
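If you prefer to wait for the rollout rather than re-running the command, you can watch the namespace until every pod reports Ready (press Ctrl+C to stop watching):
$ oc get pod -n openshift-logging -w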
Once the pods are ready, you can check the distribution of the pods across the nodes. Fluentd is deployed on each node in the cluster, while only one instance of Elasticsearch and one instance of Kibana are deployed.
$ kubectl get pod -n openshift-logging -o custom-columns='Name:{.metadata.name},Node:{.spec.nodeName}'
Name Node
cluster-logging-operator-789f86bc5d-52864 ocpp-worker2
elasticsearch-cdm-98n13kgt-1-68c7c496b7-7h58d ocpp-worker2
fluentd-4xxjm ocpp-master1
fluentd-ds6v7 ocpp-worker2
fluentd-gp6mn ocpp-master2
fluentd-mv29x ocpp-master0
fluentd-pnpgj ocpp-worker1
fluentd-sfkcl ocpp-worker0
kibana-6db8448b8c-whlfc ocpp-worker2
You can see the minimum and maximum resource requirements for the small Elasticsearch pod in the Requests and Limits fields of the oc describe pod output:
$ oc describe pod elasticsearch-cdm-98n13kgt-1-68c7c496b7-7h58d -n openshift-logging
Name: elasticsearch-cdm-98n13kgt-1-68c7c496b7-7h58d
Namespace: openshift-logging
Priority: 0
PriorityClassName: <none>
Node: ocpp-worker2/10.15.163.215
Start Time: Fri, 13 Dec 2019 09:01:26 -0500
...
Status: Running
IP: 10.129.2.9
Controlled By: ReplicaSet/elasticsearch-cdm-98n13kgt-1-68c7c496b7
Containers:
elasticsearch:
Container ID: cri-o://7baffe9ccc070660805a8fe42d6f62564420b893aa547b570b3342944a10ca43
Image: registry.redhat.io/openshift4/ose-logging-elasticsearch5@sha256:ddcead06ec96b837804f8299d6cbd6ba33e46c9555cdc96a7aba8c820f9bd29f
Image ID: registry.redhat.io/openshift4/ose-logging-elasticsearch5@sha256:ddcead06ec96b837804f8299d6cbd6ba33e46c9555cdc96a7aba8c820f9bd29f
Ports: 9300/TCP, 9200/TCP
Host Ports: 0/TCP, 0/TCP
State: Running
Started: Fri, 13 Dec 2019 09:02:44 -0500
Ready: True
Restart Count: 0
Limits:
memory: 2Gi
Requests:
cpu: 200m
memory: 2Gi
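If you only need the resource settings rather than the full describe output, a jsonpath query against the elasticsearch container returns them directly (pod name taken from the listing above):
$ oc get pod elasticsearch-cdm-98n13kgt-1-68c7c496b7-7h58d -n openshift-logging -o jsonpath='{.spec.containers[?(@.name=="elasticsearch")].resources}'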
To view the Kibana dashboard, determine the route:
$ oc get route -n openshift-logging
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
kibana kibana-openshift-logging.apps.ocpproxy.hpecloud.org kibana <all> reencrypt/Redirect None
In your browser, log in and view the Kibana dashboard using the returned route, in this case https://kibana-openshift-logging.apps.ocpproxy.hpecloud.org.
Figure. Kibana dashboard
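If you want to capture the Kibana URL in a script instead of reading it from the table, the route host can be extracted with jsonpath (the route name kibana matches the output above):
$ oc get route kibana -n openshift-logging -o jsonpath='{.spec.host}'
kibana-openshift-logging.apps.ocpproxy.hpecloud.org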
# Migrating from the small to the large profile
It is possible to expand this initial small profile to the large profile using the same playbook. You will need to add extra worker nodes that have the capacity to accept the larger workload. You can add the new nodes before or after you use the playbook to do the migration, as the result will be the same. In the following example, the playbook is run before the addition of new nodes, for illustration purposes.
Re-run the playbook, but this time specify the large profile. As an alternative to updating your configuration file, you can set the value on the command line:
$ ansible-playbook -i hosts playbooks/efk.yml -e efk_profile=large
Notice how there are now 2 Elasticsearch pods in the Pending state as the Kubernetes scheduler cannot find any nodes that can fulfil the larger minimum requirements (16 GB memory) for the new Elasticsearch pods.
$ kubectl get pod -n openshift-logging
NAME READY STATUS RESTARTS AGE
cluster-logging-operator-789f86bc5d-52864 1/1 Running 0 22m
curator-1576246800-fbwph 0/1 Completed 0 3m25s
elasticsearch-cdm-98n13kgt-1-68c7c496b7-7h58d 2/2 Running 0 22m
elasticsearch-cdm-98n13kgt-2-77b48d47dd-kszvv 0/2 Pending 0 4m39s
elasticsearch-cdm-98n13kgt-3-ff8844764-2pjcd 0/2 Pending 0 4m38s
fluentd-4xxjm 1/1 Running 0 22m
fluentd-ds6v7 1/1 Running 0 22m
fluentd-gp6mn 1/1 Running 0 22m
fluentd-mv29x 1/1 Running 0 22m
fluentd-pnpgj 1/1 Running 0 22m
fluentd-sfkcl 1/1 Running 0 22m
kibana-6db8448b8c-ff8m7 2/2 Running 0 4m39s
kibana-6db8448b8c-whlfc 2/2 Running 0 22m
You can use the oc describe pod command to determine that the new Elasticsearch pods cannot be scheduled due to the larger memory requirements:
$ oc describe pod elasticsearch-cdm-98n13kgt-2-77b48d47dd-kszvv -n openshift-logging | tail
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: node.kubernetes.io/disk-pressure:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 82s (x5 over 3m59s) default-scheduler 0/5 nodes are available: 5 Insufficient memory.
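To confirm why the scheduler reports insufficient memory, you can list how much memory each node can actually allocate (the exact values will vary with your environment):
$ oc get nodes -o custom-columns='NAME:.metadata.name,ALLOCATABLE_MEMORY:.status.allocatable.memory'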
Now add extra worker nodes to your cluster, setting the cpus and ram attributes to sufficiently large values. In your hosts file, add new entries in the [rhcos_worker] group:
[rhcos_worker]
...
hpe-worker5 ansible_host=10.15.155.215 cpus=8 ram=32768 # Larger worker node for EFK
hpe-worker6 ansible_host=10.15.155.216 cpus=8 ram=32768 # Larger worker node for EFK
hpe-worker7 ansible_host=10.15.155.217 cpus=8 ram=32768 # Larger worker node for EFK
In the above example, each of these large CoreOS worker nodes will be allocated 8 virtual CPU cores and 32GB of RAM. These values override the default limits of 4 virtual CPU cores and 16GB of RAM defined in the group_vars/worker.yml file.
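For reference, the defaults being overridden would be expressed as plain variables in group_vars/worker.yml. The sketch below is an assumption based on the values quoted above, not a copy of the actual file:
# group_vars/worker.yml (sketch; check the file shipped in the repository)
cpus: 4       # default virtual CPU cores per CoreOS worker
ram: 16384    # default RAM in MB per CoreOS worker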
Deploy the additional, large worker nodes using the procedure described in the section Deploying CoreOS worker nodes.
$ ansible-playbook -i hosts playbooks/scale.yml
Check that the new nodes are ready, in this case ocpp-worker5, ocpp-worker6, and ocpp-worker7.
$ oc get nodes
NAME STATUS ROLES AGE VERSION
ocpp-master0 Ready master 30h v1.14.6+31a56cf75
ocpp-master1 Ready master 30h v1.14.6+31a56cf75
ocpp-master2 Ready master 30h v1.14.6+31a56cf75
ocpp-worker0 Ready worker 30h v1.14.6+31a56cf75
ocpp-worker1 Ready worker 30h v1.14.6+31a56cf75
ocpp-worker2 Ready worker 30h v1.14.6+31a56cf75
ocpp-worker5 Ready worker 1m v1.14.6+31a56cf75
ocpp-worker6 Ready worker 1m v1.14.6+31a56cf75
ocpp-worker7 Ready worker 1m v1.14.6+31a56cf75
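Before the pending Elasticsearch pods can be scheduled, it is worth confirming that each new worker reports the expected capacity; ocpp-worker5 is used here as an example:
$ oc describe node ocpp-worker5 | grep -A 8 'Capacity:'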
Once the pods are ready, check how the Elasticsearch pods are distributed across the new nodes:
$ kubectl get pod -n openshift-logging -o custom-columns='Name:{.metadata.name},Node:{.spec.nodeName}'
Name Node
cluster-logging-operator-789f86bc5d-52864 ocpp-worker2
curator-1576248600-cscbg ocpp-worker7
elasticsearch-cdm-98n13kgt-1-59477757c4-v8cxc ocpp-worker7
elasticsearch-cdm-98n13kgt-2-77b48d47dd-kszvv ocpp-worker5
elasticsearch-cdm-98n13kgt-3-ff8844764-2pjcd ocpp-worker6
fluentd-4xxjm ocpp-master1
fluentd-ds6v7 ocpp-worker2
fluentd-gp6mn ocpp-master2
fluentd-lggqs ocpp-worker5
fluentd-mv29x ocpp-master0
fluentd-pnpgj ocpp-worker1
fluentd-r2s4l ocpp-worker7
fluentd-sfkcl ocpp-worker0
fluentd-zztmq ocpp-worker6
kibana-6db8448b8c-ff8m7 ocpp-worker0
kibana-6db8448b8c-whlfc ocpp-worker2
The two pending Elasticsearch pods have been scheduled on two of the new, larger nodes, ocpp-worker5 and ocpp-worker6. The original Elasticsearch pod is terminated and restarted on the third of the larger nodes, ocpp-worker7.
If you now examine the Elasticsearch pod on ocpp-worker7, you will see that the minimum and maximum resource requirements have changed, as shown in the Requests and Limits fields:
$ oc describe pod elasticsearch-cdm-98n13kgt-1-59477757c4-m8m5w -n openshift-logging
Name: elasticsearch-cdm-98n13kgt-1-59477757c4-m8m5w
Namespace: openshift-logging
Priority: 0
PriorityClassName: <none>
Node: ocpp-worker7/10.15.163.220
Start Time: Fri, 13 Dec 2019 10:04:13 -0500
...
Status: Running
IP: 10.130.2.7
Controlled By: ReplicaSet/elasticsearch-cdm-98n13kgt-1-59477757c4
Containers:
elasticsearch:
Container ID: cri-o://42e8169e2c2bd3acbd2b059a12ee33f2fb85a42eb15d36a4a2faf6c6ab13ef3d
Image: registry.redhat.io/openshift4/ose-logging-elasticsearch5@sha256:ddcead06ec96b837804f8299d6cbd6ba33e46c9555cdc96a7aba8c820f9bd29f
Image ID: registry.redhat.io/openshift4/ose-logging-elasticsearch5@sha256:ddcead06ec96b837804f8299d6cbd6ba33e46c9555cdc96a7aba8c820f9bd29f
Ports: 9300/TCP, 9200/TCP
Host Ports: 0/TCP, 0/TCP
State: Running
Started: Fri, 13 Dec 2019 10:04:38 -0500
Ready: True
Restart Count: 0
Limits:
memory: 16Gi
Requests:
cpu: 1
memory: 16Gi
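The same requests and limits should also appear in the ClusterLogging custom resource that the playbook manages. Assuming the resource uses the conventional name instance and that the Elasticsearch sizing is recorded in its spec, you can inspect it directly:
$ oc get clusterlogging instance -n openshift-logging -o yaml | grep -B 2 -A 6 'resources:'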