Monitoring
To facilitate access to metrics, we changed how we publish controller-manager metrics in version v0.14.1 of the Signadot Operator. Documentation for older operator versions is available in the section "Operator v0.14.0 and Prior" below.
The Signadot Operator exposes several Prometheus endpoints that can be used to collect metrics about the status of the application. Each Prometheus endpoint is exposed via a service in the signadot namespace, named with a -metrics suffix and serving port 9090.
The following metrics services are available:
- agent-metrics at path /metrics with component name agent
- io-context-server-metrics at path /metrics with component name io_context_server
- signadot-controller-manager-metrics at path /metrics/signadot with component name manager
- tunnel-proxy-metrics at path /metrics with component name tunnel_proxy
- routeserver-metrics at path /metrics (v0.15+)
Signadot Operator metric names take the form signadot_operator_<component>_<metric-name>. The Route Server metrics, which use the grpc_server_ and http_server_ prefixes, are the exception.
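As a rough illustration of the service layout and naming scheme above, the following Python sketch builds in-cluster scrape URLs and fully qualified metric names. The service names, paths, port, and component names are taken from the list above. The `<service>.<namespace>.svc` DNS form and the plain `http` scheme are assumptions, and the helper names are hypothetical. The Route Server is left out because its metric names do not carry a component prefix.

```python
# Hypothetical helpers; service names, paths, and port come from the list above.
METRICS_PORT = 9090

# service name -> (metrics path, component name used in metric names)
SERVICES = {
    "agent-metrics": ("/metrics", "agent"),
    "io-context-server-metrics": ("/metrics", "io_context_server"),
    "signadot-controller-manager-metrics": ("/metrics/signadot", "manager"),
    "tunnel-proxy-metrics": ("/metrics", "tunnel_proxy"),
}


def scrape_url(service: str, namespace: str = "signadot") -> str:
    """Build an in-cluster URL for a metrics service (assumes plain HTTP)."""
    path, _component = SERVICES[service]
    return f"http://{service}.{namespace}.svc:{METRICS_PORT}{path}"


def metric_name(service: str, short_name: str) -> str:
    """Expand a short name to signadot_operator_<component>_<metric-name>."""
    _path, component = SERVICES[service]
    return f"signadot_operator_{component}_{short_name}"


if __name__ == "__main__":
    for svc in SERVICES:
        print(scrape_url(svc))
    print(metric_name("agent-metrics", "connection_attempts"))
```

Keeping the path next to the service name makes the one irregular entry (the controller manager's /metrics/signadot path) hard to overlook when generating scrape targets.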
Agent Metrics
Agent metrics are served by the service agent-metrics at port 9090 at path /metrics.
The metric names are prefixed by signadot_operator_agent_.
Metrics name | Type | Description |
---|---|---|
connection_attempts | Counter | Total number of times the agent attempts to connect |
A sudden increase in this metric by some value greater than 1 could indicate connectivity problems to Signadot or problems with the agent deployment.
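The check described above can be sketched as a comparison of successive scraped values. This is a minimal illustration, not part of the operator: the function name, the sample values, and the decision to skip counter resets are all assumptions.

```python
from typing import List, Tuple


def sudden_increases(samples: List[float], threshold: float = 1.0) -> List[Tuple[int, float]]:
    """Return (index, delta) pairs where the counter rose by more than threshold.

    samples holds successive scraped values of the counter, oldest first.
    A drop in value is treated as a counter reset (for example an agent
    restart) and skipped.
    """
    alerts = []
    for i in range(1, len(samples)):
        delta = samples[i] - samples[i - 1]
        if delta < 0:
            continue  # counter reset; no meaningful delta
        if delta > threshold:
            alerts.append((i, delta))
    return alerts


if __name__ == "__main__":
    # Hypothetical values of signadot_operator_agent_connection_attempts.
    print(sudden_increases([3, 3, 4, 9, 9, 0, 1]))  # prints [(3, 5)]
```

In a real deployment the same condition would more likely be written as an alerting rule in the monitoring system; the sketch only makes the "increase greater than 1 between scrapes" logic concrete.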
IO Context Server Metrics
IO Context Server metrics are served by the service io-context-server-metrics at port 9090 at path /metrics.
The metric names are prefixed by signadot_operator_io_context_server_.
Metrics name | Type | Description |
---|---|---|
requests{method,result} | Counter | Count of requests by HTTP method and HTTP status code of the result |
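To show how the method and result labels might be used, here is a small Python sketch that sums the requests counter per result across all methods. The sample exposition text and its label values are invented for illustration. The parser is deliberately simplified: it ignores timestamps and escaped quotes, so it is not a complete Prometheus text-format parser.

```python
import re
from collections import defaultdict
from typing import Dict

# Matches one sample line of the form: name{label="value",...} value
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')

METRIC = "signadot_operator_io_context_server_requests"

# Hypothetical scrape output; label values are illustrative only.
SAMPLE = """\
# TYPE signadot_operator_io_context_server_requests counter
signadot_operator_io_context_server_requests{method="GET",result="200"} 12
signadot_operator_io_context_server_requests{method="POST",result="200"} 3
signadot_operator_io_context_server_requests{method="GET",result="500"} 1
"""


def requests_by_result(exposition: str) -> Dict[str, float]:
    """Sum the requests counter per `result` label across all methods."""
    totals: Dict[str, float] = defaultdict(float)
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None or match.group("name") != METRIC:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels")))
        totals[labels.get("result", "")] += float(match.group("value"))
    return dict(totals)


if __name__ == "__main__":
    print(requests_by_result(SAMPLE))  # prints {'200': 15.0, '500': 1.0}
```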
Controller Manager Metrics
Controller Manager metrics are served by the service signadot-controller-manager-metrics at port 9090 at path /metrics/signadot.
The metric names are prefixed by signadot_operator_manager_.
The following metrics are exported:
Metrics name | Type | Description |
---|---|---|
sandboxes | Gauge | Total number of active sandbox resources by Readiness and ReadinessReason |
routegroups | Gauge | Total number of active routegroup resources by Readiness and ReadinessReason |
Many lower-level metrics exported by controller-runtime are available at the same host and port, but under the path /metrics.
Tunnel Proxy Metrics
Tunnel Proxy metrics are served by the service tunnel-proxy-metrics at port 9090 at path /metrics.
The metric names are prefixed by signadot_operator_tunnel_proxy_.
Metrics name | Type | Description |
---|---|---|
forward_tunnel_connection_errors{method} | Counter | Total number of errors in forward tunnel connections, by method |
reverse_tunnels{method} | Gauge | Total number of reverse tunnels, by method (xap or ssh) |
reverse_tunnel_connections{method} | Gauge | Total number of reverse tunnel connections, by method |
reverse_tunnel_errors{method} | Counter | Total number of errors setting up reverse tunnels, by method |
requests_users_total{type,user} | Counter | Total number of connection requests, by type (forward or reverse) and user |
requests_endpoints_total{type,host} | Counter | Total number of connection requests, by type (forward or reverse) and host endpoint |
requests_protocol_total{type,protocol} | Counter | Total number of connection requests, by type (forward or reverse) and protocol |
Route Server Metrics (v0.15+)
Route Server metrics are served by the service routeserver-metrics at port 9090 at path /metrics.
The metric names for the gRPC endpoint are prefixed by grpc_server_.
Metrics name | Type | Description |
---|---|---|
grpc_server_started_total | Counter | Total number of RPCs started on the server, by type, service and method |
grpc_server_handled_total | Counter | Total number of RPCs completed on the server, regardless of success or failure, by type, service, method and code |
grpc_server_msg_received_total | Counter | Total number of gRPC stream messages received on the server, by type, service and method |
grpc_server_msg_sent_total | Counter | Total number of gRPC stream messages sent by the server, by type, service and method |
grpc_server_handling_seconds_count | Histogram | Count of all completed RPCs by type, service and method |
grpc_server_handling_seconds_sum | Histogram | Cumulative time of RPCs by type, service and method |
grpc_server_handling_seconds_bucket | Histogram | The counts of RPCs by type, service and method in respective handling-time buckets |
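The three grpc_server_handling_seconds series are typically combined rather than read individually. The Python sketch below shows two common derivations: the mean handling time (sum divided by count) and a quantile estimate from cumulative buckets using linear interpolation, loosely modeled on how Prometheus-style histograms are usually interpreted. The function names and the bucket boundaries in the example are assumptions, not values published by the Route Server.

```python
import math
from typing import List, Tuple


def mean_handling_seconds(total_seconds: float, count: float) -> float:
    """Average RPC handling time: the _sum series divided by the _count series."""
    if count == 0:
        return math.nan
    return total_seconds / count


def estimate_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate a quantile from cumulative (upper_bound, count) buckets.

    Buckets must be sorted by upper bound and end with the +Inf bucket,
    mirroring the cumulative buckets of grpc_server_handling_seconds_bucket.
    Uses linear interpolation inside the bucket containing the target rank.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be in [0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Target falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            in_bucket = count - prev_count
            if in_bucket == 0:
                return bound
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return prev_bound


if __name__ == "__main__":
    # Hypothetical cumulative bucket counts for one type/service/method.
    example = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
    print(mean_handling_seconds(12.5, 50))
    print(estimate_quantile(0.95, example))
```

The estimate is only as fine as the bucket boundaries allow, so it should be read as an approximation of the true latency quantile.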
In the case of the HTTP endpoint, metric names are prefixed by http_server_.
Metrics name | Type | Description |
---|---|---|
http_server_requests_total | Counter | Count of HTTP requests received, by status code, method and path |
http_server_requests_total | Histogram | The counts of HTTP requests by status code, method and path in respective handling-time buckets |
Prometheus Integration
A full example of integration with the Prometheus Operator is available here
Operator v0.14.0 and Prior
Operator v0.14.0 and prior only export general controller-runtime metrics, and only via RBAC-authenticated HTTPS at:
https://signadot-controller-manager-metrics-service.signadot.svc:8443/metrics
More information about authenticating to this endpoint is available below.
Tunnel Proxy Metrics
The Tunnel Proxy exposes an HTTP endpoint at:
http://tunnel-proxy.signadot.svc:8001/metrics
The following metrics are exported:
Metrics name | Type | Description |
---|---|---|
inbound_connections | Gauge | Total number of active inbound connections by tunnel method |
inbound_revtuns | Gauge | Total number of inbound reverse tunnels by tunnel method |
Authenticating for Controller Manager Metrics v0.14.0 and Prior
To facilitate access to metrics, we changed how we publish controller-manager metrics in version v0.14.1 of the Signadot Operator. This section is only relevant to operator versions v0.14.0 and prior.
Prior to v0.14.1 of the Signadot Operator, the controller-manager metrics endpoint was protected by kube-rbac-proxy, so accessing it requires some authentication setup. In particular, clients need to:
- Authenticate to the HTTPS metrics endpoint using the ClusterRole signadot-metrics-reader; and
- Accept a self-signed certificate.
Below are instructions for accomplishing this with Prometheus and the Datadog Agent.
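For a quick manual check without either tool, the two requirements above can be sketched in Python using only the standard library. This is an illustration under stated assumptions: it runs inside a pod whose service account has been bound to the signadot-metrics-reader ClusterRole, it reads the token from the standard mounted path, and the function names are hypothetical. Disabling certificate verification mirrors the "accept a self-signed certificate" requirement and should not be reused for endpoints that present a verifiable certificate.

```python
import ssl
import urllib.request

METRICS_URL = "https://signadot-controller-manager-metrics-service.signadot.svc:8443/metrics"
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def build_request(token: str, url: str = METRICS_URL) -> urllib.request.Request:
    """Attach the service account token as a bearer token."""
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token.strip()}"}
    )


def insecure_context() -> ssl.SSLContext:
    """TLS context that accepts the self-signed certificate (no verification)."""
    context = ssl.create_default_context()
    context.check_hostname = False  # must be disabled before setting CERT_NONE
    context.verify_mode = ssl.CERT_NONE
    return context


def fetch_metrics(token_path: str = TOKEN_PATH) -> str:
    """Read the mounted token and fetch the metrics page as text."""
    with open(token_path, encoding="utf-8") as f:
        token = f.read()
    request = build_request(token)
    with urllib.request.urlopen(request, context=insecure_context(), timeout=10) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Only works in-cluster with the ClusterRole binding in place.
    print(fetch_metrics()[:500])
```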
Authenticating with Prometheus
- Grant the required permissions to the service account used by Prometheus:
kubectl create clusterrolebinding signadot-metrics-reader --clusterrole=signadot-metrics-reader --serviceaccount=<namespace>:<service-account-name>
- Configure the ServiceMonitor:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    control-plane: controller-manager
  name: controller-manager-metrics-monitor
  namespace: signadot
spec:
  endpoints:
    - path: /metrics
      port: https
      scheme: https
      bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      tlsConfig:
        insecureSkipVerify: true
  selector:
    matchLabels:
      control-plane: controller-manager
Datadog Agent
If you are using the Datadog Agent, you can apply the following patch to the signadot-controller-manager deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: signadot-controller-manager
spec:
  template:
    metadata:
      annotations:
        ad.datadoghq.com/kube-rbac-proxy.check_names: |
          ["openmetrics"]
        ad.datadoghq.com/kube-rbac-proxy.init_configs: |
          [{}]
        ad.datadoghq.com/kube-rbac-proxy.instances: |
          [{
            "openmetrics_endpoint": "https://%%host%%:8443/metrics",
            "namespace": "signadot",
            "metrics": [".*"],
            "auth_token": {
              "reader": {
                "type": "file",
                "path": "/var/run/secrets/kubernetes.io/serviceaccount/token"
              },
              "writer": {
                "type": "header",
                "name": "Authorization",
                "value": "Bearer <TOKEN>",
                "placeholder": "<TOKEN>"
              }
            },
            "tls_verify": "false"
          }]