Reference: RabbitMQ Cluster Operator for Kubernetes
Reference: https://github.com/rabbitmq/cluster-operator
RabbitMQ officially maintains two Kubernetes operators:
- RabbitMQ Cluster Kubernetes Operator: automates the provisioning, operation, and management of RabbitMQ clusters on Kubernetes.
- RabbitMQ Messaging Topology Operator: manages the objects inside RabbitMQ clusters created by the RabbitMQ Cluster Kubernetes Operator. It exposes those RabbitMQ objects as Kubernetes resources, so they can be created, updated, and deleted with kubectl by writing YAML manifests (see the sketch after this list).
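As a flavour of what the Messaging Topology Operator manages, a queue can be declared as a Kubernetes object roughly like this (a minimal sketch based on the topology operator's Queue CRD; the names and values are illustrative, and that operator is not covered further in this article):

apiVersion: rabbitmq.com/v1beta1
kind: Queue
metadata:
  name: example-queue            # hypothetical object name
spec:
  name: example-queue            # queue name inside RabbitMQ
  durable: true
  rabbitmqClusterReference:
    name: rabbitmq-cluster1      # the RabbitmqCluster created later in this article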
This article shows how to use the RabbitMQ Cluster Kubernetes Operator to quickly create a RabbitMQ cluster.
1. Install the Operator
Adjust the image of the rabbitmq-cluster-operator Deployment in the manifest if needed.
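The cluster-operator.yml manifest applied below can be downloaded from the operator's release assets, for example (the latest-release URL is assumed here; pin a specific version if you prefer):

wget https://github.com/rabbitmq/cluster-operator/releases/latest/download/cluster-operator.yml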
Applying the manifest creates a namespace, CRD, ServiceAccount, Role, ClusterRole, RoleBinding, ClusterRoleBinding, Deployment, and other resources:
[root@master RabbitMQ-Operator]# kubectl apply -f cluster-operator.yml
namespace/rabbitmq-system created
customresourcedefinition.apiextensions.k8s.io/rabbitmqclusters.rabbitmq.com created
serviceaccount/rabbitmq-cluster-operator created
role.rbac.authorization.k8s.io/rabbitmq-cluster-leader-election-role created
clusterrole.rbac.authorization.k8s.io/rabbitmq-cluster-operator-role created
clusterrole.rbac.authorization.k8s.io/rabbitmq-cluster-service-binding-role created
rolebinding.rbac.authorization.k8s.io/rabbitmq-cluster-leader-election-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/rabbitmq-cluster-operator-rolebinding created
deployment.apps/rabbitmq-cluster-operator created
2. Create and set a default StorageClass in Kubernetes
RabbitMQ is a stateful service and needs PVC-backed storage; NFS is used as the storage backend here.
# Deploy csi-driver-nfs
[root@master RabbitMQ-Operator]# helm repo add csi-driver-nfs https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/charts
"csi-driver-nfs" has been added to your repositories
[root@master RabbitMQ-Operator]# helm install csi-driver-nfs csi-driver-nfs/csi-driver-nfs --namespace kube-system --version v4.2.0
NAME: csi-driver-nfs
LAST DEPLOYED: Thu Mar 9 23:49:28 2023
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
The CSI NFS Driver is getting deployed to your cluster.
To check CSI NFS Driver pods status, please run:
kubectl --namespace=kube-system get pods --selector="app.kubernetes.io/instance=csi-driver-nfs" --watch
[root@master ~]# kubectl --namespace=kube-system get pods --selector="app.kubernetes.io/instance=csi-driver-nfs"
NAME READY STATUS RESTARTS AGE
csi-nfs-controller-5dd46bcf75-8xzjm 3/3 Running 0 4m5s
csi-nfs-node-6xtbr 3/3 Running 0 4m5s
csi-nfs-node-xwbbh 3/3 Running 0 5s
# Deploy an NFS server (deployment steps omitted here)
[root@master ~]# showmount -e 192.168.66.143
Export list for 192.168.66.143:
/data/nfs 192.168.66.0/24
# Create the default StorageClass
[root@master ~]# cat nfs-csi.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-csi
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: nfs.csi.k8s.io
parameters:
  server: 192.168.66.166
  share: /data/nfs_data
reclaimPolicy: Delete
volumeBindingMode: Immediate
mountOptions:
  - nfsvers=3
  - hard
  - nolock
[root@master ~]# kubectl apply -f nfs-csi.yaml
storageclass.storage.k8s.io/nfs-csi created
[root@master ~]# kubectl get sc nfs-csi
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
nfs-csi (default) nfs.csi.k8s.io Retain Immediate false 5s
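Optionally, dynamic provisioning can be verified with a throwaway PVC before deploying RabbitMQ (a minimal sketch; the PVC name nfs-csi-test is hypothetical):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-csi-test        # hypothetical name, only used for this check
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi

Apply it, confirm that kubectl get pvc nfs-csi-test reports Bound, then delete it.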
3. Create a RabbitMQ cluster
- Only the replica count, image, and Service type are configured here.
- For more configuration options, see https://www.rabbitmq.com/kubernetes/operator/using-operator.html#configure and https://github.com/rabbitmq/cluster-operator/tree/main/docs/examples
[root@master RabbitMQ-Operator]# cat rabbitmq-cluster1.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: rabbitmq-cluster1
---
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: rabbitmq-cluster1
  namespace: rabbitmq-cluster1
spec:
  image: rabbitmq:3.10.2-management
  replicas: 3
  service:
    type: NodePort
[root@master RabbitMQ-Operator]# kubectl apply -f rabbitmq-cluster1.yaml
namespace/rabbitmq-cluster1 unchanged
rabbitmqcluster.rabbitmq.com/rabbitmq-cluster1 created
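If the cluster should not rely on the default StorageClass, the storage can also be set explicitly in the RabbitmqCluster spec (a sketch; the field names follow the operator's persistence settings, the values are illustrative):

spec:
  persistence:
    storageClassName: nfs-csi
    storage: 10Gi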
4. Inspect the created resources
Port information:
- 5672: AMQP client port
- 15672: management UI port
- 15692: Prometheus metrics port
[root@master RabbitMQ-Operator]# kubectl -n rabbitmq-cluster1 get all
NAME READY STATUS RESTARTS AGE
pod/rabbitmq-cluster1-server-0 1/1 Running 0 4m49s
pod/rabbitmq-cluster1-server-1 1/1 Running 0 5m49s
pod/rabbitmq-cluster1-server-2 1/1 Running 0 6m49s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/rabbitmq-cluster1 NodePort 10.68.8.253 <none> 15672:31947/TCP,15692:32676/TCP,5672:30727/TCP 9m18s
service/rabbitmq-cluster1-nodes ClusterIP None <none> 4369/TCP,25672/TCP 9m18s
NAME READY AGE
statefulset.apps/rabbitmq-cluster1-server 3/3 9m18s
NAME ALLREPLICASREADY RECONCILESUCCESS AGE
rabbitmqcluster.rabbitmq.com/rabbitmq-cluster1 True True 9m18s
5. Retrieve the default username and password
[root@master ~]# kubectl -n rabbitmq-cluster1 get secret rabbitmq-cluster1-default-user -o jsonpath="{.data.username}" | base64 --decode
default_user_iwWPOW8WD7vhEr3HJ6N
[root@master ~]# kubectl -n rabbitmq-cluster1 get secret rabbitmq-cluster1-default-user -o jsonpath="{.data.password}" | base64 --decode
o4uc8Z7FQjmE9E6ne9UhiCwQlZwkj5zG
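For convenience, both values can be captured into shell variables and assembled into an AMQP URL (a sketch; 30727 is the NodePort mapped to 5672 in the Service output above, and <node-ip> is a placeholder for a node address):

username=$(kubectl -n rabbitmq-cluster1 get secret rabbitmq-cluster1-default-user -o jsonpath="{.data.username}" | base64 --decode)
password=$(kubectl -n rabbitmq-cluster1 get secret rabbitmq-cluster1-default-user -o jsonpath="{.data.password}" | base64 --decode)
echo "amqp://$username:$password@<node-ip>:30727"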
6. Access the RabbitMQ management UI
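The management UI (port 15672) is exposed through the NodePort shown in the Service output above (31947 in this example), so it can be opened at http://<node-ip>:31947 and logged into with the default user credentials from the previous step. The NodePort can also be looked up directly (a sketch):

kubectl -n rabbitmq-cluster1 get svc rabbitmq-cluster1 -o jsonpath='{.spec.ports[?(@.port==15672)].nodePort}'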

7. Monitor the RabbitMQ cluster
The RabbitMQ cluster deployed above already exposes metrics on port 15692. If a Prometheus instance is already available, simply add a scrape job for it and hook the data up to a Grafana dashboard (a sketch of such a job follows). If no Prometheus is available, deploy Prometheus Operator first and then apply the ServiceMonitor manifest provided by RabbitMQ.
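A minimal sketch of such a scrape job for an existing Prometheus outside the cluster, assuming it reaches the metrics through NodePort 32676 (mapped to 15692 in the Service output above); <node-ip> is a placeholder:

scrape_configs:
  - job_name: rabbitmq-cluster1
    static_configs:
      - targets: ['<node-ip>:32676']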
Deploy Prometheus Operator
# Reference: https://prometheus-operator.dev/docs/prologue/quick-start/
[root@master tmp]# git clone https://github.com/prometheus-operator/kube-prometheus.git
[root@master tmp]# cd kube-prometheus
# Edit manifests/prometheus-service.yaml, manifests/grafana-service.yaml, and manifests/alertmanager-service.yaml,
# adding a type: NodePort field under spec
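# For example, the spec of each of those Services then contains (illustrative excerpt):
#   spec:
#     type: NodePort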
[root@master kube-prometheus]# kubectl create -f manifests/setup
customresourcedefinition.apiextensions.k8s.io/alertmanagerconfigs.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/probes.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com created
namespace/monitoring created
[root@master kube-prometheus]# kubectl apply -f manifests/
alertmanager.monitoring.coreos.com/main created
networkpolicy.networking.k8s.io/alertmanager-main created
poddisruptionbudget.policy/alertmanager-main created
prometheusrule.monitoring.coreos.com/alertmanager-main-rules created
secret/alertmanager-main created
service/alertmanager-main created
serviceaccount/alertmanager-main created
servicemonitor.monitoring.coreos.com/alertmanager-main created
clusterrole.rbac.authorization.k8s.io/blackbox-exporter unchanged
clusterrolebinding.rbac.authorization.k8s.io/blackbox-exporter unchanged
configmap/blackbox-exporter-configuration created
deployment.apps/blackbox-exporter created
networkpolicy.networking.k8s.io/blackbox-exporter created
service/blackbox-exporter created
serviceaccount/blackbox-exporter created
servicemonitor.monitoring.coreos.com/blackbox-exporter created
secret/grafana-config created
secret/grafana-datasources created
configmap/grafana-dashboard-alertmanager-overview created
configmap/grafana-dashboard-apiserver created
configmap/grafana-dashboard-cluster-total created
configmap/grafana-dashboard-controller-manager created
configmap/grafana-dashboard-grafana-overview created
configmap/grafana-dashboard-k8s-resources-cluster created
configmap/grafana-dashboard-k8s-resources-namespace created
configmap/grafana-dashboard-k8s-resources-node created
configmap/grafana-dashboard-k8s-resources-pod created
configmap/grafana-dashboard-k8s-resources-workload created
configmap/grafana-dashboard-k8s-resources-workloads-namespace created
configmap/grafana-dashboard-kubelet created
configmap/grafana-dashboard-namespace-by-pod created
configmap/grafana-dashboard-namespace-by-workload created
configmap/grafana-dashboard-node-cluster-rsrc-use created
configmap/grafana-dashboard-node-rsrc-use created
configmap/grafana-dashboard-nodes-darwin created
configmap/grafana-dashboard-nodes created
configmap/grafana-dashboard-persistentvolumesusage created
configmap/grafana-dashboard-pod-total created
configmap/grafana-dashboard-prometheus-remote-write created
configmap/grafana-dashboard-prometheus created
configmap/grafana-dashboard-proxy created
configmap/grafana-dashboard-scheduler created
configmap/grafana-dashboard-workload-total created
configmap/grafana-dashboards created
deployment.apps/grafana created
networkpolicy.networking.k8s.io/grafana created
prometheusrule.monitoring.coreos.com/grafana-rules created
service/grafana created
serviceaccount/grafana created
servicemonitor.monitoring.coreos.com/grafana created
prometheusrule.monitoring.coreos.com/kube-prometheus-rules created
clusterrole.rbac.authorization.k8s.io/kube-state-metrics unchanged
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics unchanged
deployment.apps/kube-state-metrics created
networkpolicy.networking.k8s.io/kube-state-metrics created
prometheusrule.monitoring.coreos.com/kube-state-metrics-rules created
service/kube-state-metrics created
serviceaccount/kube-state-metrics created
servicemonitor.monitoring.coreos.com/kube-state-metrics created
prometheusrule.monitoring.coreos.com/kubernetes-monitoring-rules created
servicemonitor.monitoring.coreos.com/kube-apiserver created
servicemonitor.monitoring.coreos.com/coredns created
servicemonitor.monitoring.coreos.com/kube-controller-manager created
servicemonitor.monitoring.coreos.com/kube-scheduler created
servicemonitor.monitoring.coreos.com/kubelet created
clusterrole.rbac.authorization.k8s.io/node-exporter unchanged
clusterrolebinding.rbac.authorization.k8s.io/node-exporter unchanged
daemonset.apps/node-exporter created
networkpolicy.networking.k8s.io/node-exporter created
prometheusrule.monitoring.coreos.com/node-exporter-rules created
service/node-exporter created
serviceaccount/node-exporter created
servicemonitor.monitoring.coreos.com/node-exporter created
clusterrole.rbac.authorization.k8s.io/prometheus-k8s unchanged
clusterrolebinding.rbac.authorization.k8s.io/prometheus-k8s unchanged
networkpolicy.networking.k8s.io/prometheus-k8s created
poddisruptionbudget.policy/prometheus-k8s created
prometheus.monitoring.coreos.com/k8s created
prometheusrule.monitoring.coreos.com/prometheus-k8s-prometheus-rules created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s-config created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s unchanged
rolebinding.rbac.authorization.k8s.io/prometheus-k8s unchanged
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s-config created
role.rbac.authorization.k8s.io/prometheus-k8s unchanged
role.rbac.authorization.k8s.io/prometheus-k8s unchanged
role.rbac.authorization.k8s.io/prometheus-k8s created
service/prometheus-k8s created
serviceaccount/prometheus-k8s created
servicemonitor.monitoring.coreos.com/prometheus-k8s created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io unchanged
clusterrole.rbac.authorization.k8s.io/prometheus-adapter unchanged
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader unchanged
clusterrolebinding.rbac.authorization.k8s.io/prometheus-adapter unchanged
clusterrolebinding.rbac.authorization.k8s.io/resource-metrics:system:auth-delegator unchanged
clusterrole.rbac.authorization.k8s.io/resource-metrics-server-resources unchanged
configmap/adapter-config created
deployment.apps/prometheus-adapter created
networkpolicy.networking.k8s.io/prometheus-adapter created
poddisruptionbudget.policy/prometheus-adapter created
rolebinding.rbac.authorization.k8s.io/resource-metrics-auth-reader unchanged
service/prometheus-adapter created
serviceaccount/prometheus-adapter created
servicemonitor.monitoring.coreos.com/prometheus-adapter created
clusterrole.rbac.authorization.k8s.io/prometheus-operator unchanged
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator unchanged
deployment.apps/prometheus-operator created
networkpolicy.networking.k8s.io/prometheus-operator created
prometheusrule.monitoring.coreos.com/prometheus-operator-rules created
service/prometheus-operator created
serviceaccount/prometheus-operator created
servicemonitor.monitoring.coreos.com/prometheus-operator created
# The Prometheus service account needs extra RBAC permissions, otherwise it cannot scrape the RabbitMQ cluster metrics
[root@master ~]# cat prometheus-roles.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
  - apiGroups: [""]
    resources:
      - nodes
      - services
      - endpoints
      - pods
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources:
      - configmaps
    verbs: ["get"]
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: prometheus-k8s
    namespace: monitoring
[root@master ~]# kubectl apply -f prometheus-roles.yaml
clusterrole.rbac.authorization.k8s.io/prometheus created
clusterrolebinding.rbac.authorization.k8s.io/prometheus created
Monitor all RabbitMQ clusters
[root@master ~]# kubectl apply --filename https://raw.githubusercontent.com/rabbitmq/cluster-operator/main/observability/prometheus/monitors/rabbitmq-servicemonitor.yml
servicemonitor.monitoring.coreos.com/rabbitmq created
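To confirm that scraping works, a quick check (a sketch; rabbitmq_identity_info is a metric exposed by RabbitMQ's Prometheus plugin) is to run a query like the following in the Prometheus UI and verify that one series per node shows up:

count(rabbitmq_identity_info) by (rabbitmq_cluster)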
Import the Grafana dashboard
https://grafana.com/grafana/dashboards/10991-rabbitmq-overview/
The result looks like this:

Alerting configuration:
https://www.rabbitmq.com/kubernetes/operator/operator-monitoring.html
Some of the alerting depends on the kube-state-metrics component.
[root@master nfs]# kubectl get pod -A | grep kube-state-metrics
monitoring kube-state-metrics-ccb6bd9b8-nlqds 3/3 Running 0 24m
RabbitMQ provides a set of ready-made alerting rules (in the cluster-operator repository under observability/prometheus/rules) that can be adapted to your needs and then applied:
[root@master rules]# kubectl apply -f rabbitmq/
prometheusrule.monitoring.coreos.com/rabbitmq-container-restarts created
prometheusrule.monitoring.coreos.com/rabbitmq-file-descriptors-near-limit created
prometheusrule.monitoring.coreos.com/rabbitmq-high-connection-churn created
prometheusrule.monitoring.coreos.com/rabbitmq-insufficient-established-erlang-distribution-links created
prometheusrule.monitoring.coreos.com/rabbitmq-low-disk-watermark-predicted created
prometheusrule.monitoring.coreos.com/rabbitmq-no-majority-of-nodes-ready created
prometheusrule.monitoring.coreos.com/rabbitmq-persistent-volume-missing created
prometheusrule.monitoring.coreos.com/rabbitmq-recording-rules created
prometheusrule.monitoring.coreos.com/rabbitmq-tcp-sockets-near-limit created
prometheusrule.monitoring.coreos.com/rabbitmq-unroutable-messages created

Modify a rule expression to trigger a test alert
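One way to do this is to edit one of the PrometheusRule objects applied above and change its expr so the condition is always true (a sketch; the namespace is whichever one the rules were applied into, and the exact expression inside the rule may differ):

kubectl edit prometheusrule rabbitmq-file-descriptors-near-limit
# inside the rule, replace the expr with something that always fires, e.g.:
#   expr: vector(1) > 0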
