UDProbe
UDProbe (mix of UDP and Probe) is a library for testing and measuring network loss and latency between distributed endpoints.
It does this by sending UDP datagrams/probes from collectors to reflectors and measuring how long it takes for them to return, if they return at all. UDP is used to provide ECMP hashing over multiple paths (a win over ICMP) without the need for setup/teardown and per-packet granularity (a win over TCP).
Why Is This Useful
Black box testing is critical to the successful monitoring and operation of a network. While collection of metrics from network devices can provide greater detail regarding known issues, they don't always provide a complete picture and can provide an overwhelming number of metrics. Black box testing with UDProbe doesn't care how the network is structured, only if it's working. This data can be used for building KPIs, observing big-picture issues, and guiding investigations into issues with unknown causes by quantifying which flows are/aren't working.
Network operators often find this useful for gauging the impact of network issues on internal traffic, identifying the scope of impact, and locating issues for which they had no other metrics (internal hardware failures, circuit degradations, etc).
Even if you operate entirely in the cloud UDProbe can help identify reachability and network health issues between and within regions/zones.
Quick Start
Docker Deployment (Easiest)
# Run Reflector
docker run -d \
--name udprobe-reflector \
-p 8100:8100 \
-p 8200:8200 \
tenkenx/udprobe-reflector
The reflector will listen for probes on port 8100 and will expose health metrics on http://localhost:8200/metrics.
# Run Collector
docker run -d \
--name udprobe-collector \
-v /path/to/config.yaml:/etc/udprobe/config.yaml \
-p 5200:5200 \
tenkenx/udprobe-collector
The collector will expose metrics on http://localhost:5200/metrics for Prometheus to scrape.
Check out the Configuration Reference for example configurations for the collector.
Local Development
# Run the reflector
go run github.com/nsw3550/udprobe/cmd/reflector
# Run the collector (in a separate terminal)
go run github.com/nsw3550/udprobe/cmd/collector -udprobe.config configs/simple_example.yaml
By default these will use the same ports as the docker containers.
Prometheus Metrics
Collector Metrics
The collector exposes the following metrics on port 5200:
| Metric | Type | Description |
|---|---|---|
udprobe_packet_loss_percentage |
Gauge | Packet loss percentage for a given measurement period |
udprobe_packets_sent |
Gauge | Number of packets sent for a given measurement period |
udprobe_packets_lost |
Gauge | Number of packets lost for a given measurement period |
udprobe_rtt |
Gauge | Average round-trip time (RTT) for packets sent during a given measurement period |
Reflector Metrics
The reflector exposes the following metrics on port 8200:
| Metric | Type | Description |
|---|---|---|
udprobe_reflector_packets_received_total |
Counter | Total UDP packets received by the reflector |
udprobe_reflector_packets_reflected_total |
Counter | Packets successfully reflected back to sender |
udprobe_reflector_packets_bad_data_total |
Counter | Malformed/unparseable packets received |
udprobe_reflector_packets_throttled_total |
Counter | Packets dropped due to rate limiting |
udprobe_reflector_tos_changes_total |
Counter | ToS bit changes on the socket |
udprobe_reflector_up |
Gauge | Health status: 1 if running, 0 if stopped |