diff --git a/charts/iperf3-monitor/values.yaml b/charts/iperf3-monitor/values.yaml
index 0aec506..d2de952 100644
--- a/charts/iperf3-monitor/values.yaml
+++ b/charts/iperf3-monitor/values.yaml
@@ -85,7 +85,7 @@ rbac:
 serviceAccount:
   # -- The name of the ServiceAccount to use for the exporter pod.
   # Only used if rbac.create is false. If not set, it defaults to the chart's fullname.
-  name: ""
+  name: "iperf3-monitor"
 
 serviceMonitor:
   # -- If true, create a ServiceMonitor resource for integration with Prometheus Operator.
diff --git a/docs/DESIGN.md b/docs/DESIGN.md
new file mode 100644
index 0000000..0b4c1c4
--- /dev/null
+++ b/docs/DESIGN.md
@@ -0,0 +1,1418 @@
+# **Architecting a Kubernetes-Native Network Performance Monitoring Service with iperf3, Prometheus, and Helm**
+
+## **Section 1: Architectural Blueprint for Continuous Network Validation**
+
+### **1.1 Introduction to Proactive Network Monitoring in Kubernetes** {#introduction-to-proactive-network-monitoring-in-kubernetes}
+
+In modern cloud-native infrastructures, Kubernetes has emerged as the de
+facto standard for container orchestration, simplifying the deployment,
+scaling, and management of complex applications.^1^ However, the very
+dynamism and abstraction that make Kubernetes powerful also introduce
+significant challenges in diagnosing network performance issues. The
+ephemeral nature of pods, the complexity of overlay networks provided by
+Container Network Interfaces (CNIs), and the multi-layered traffic
+routing through Services and Ingress controllers can obscure the root
+causes of latency, packet loss, and throughput degradation.
+
+Traditional, reactive troubleshooting---investigating network problems
+only after an application has failed---is insufficient in these
+environments. Performance bottlenecks can be subtle, intermittent, and
+difficult to reproduce, often manifesting as degraded user experience
+long before they trigger hard failures.^1^ To maintain the reliability
+and performance of critical workloads, engineering teams must shift from
+a reactive to a proactive stance. This requires a system that performs
+continuous, automated validation of the underlying network fabric,
+treating network health not as an assumption but as a measurable,
+time-series metric.
+
+This document outlines the architecture and implementation of a
+comprehensive, Kubernetes-native network performance monitoring service.
+The solution leverages a suite of industry-standard, open-source tools
+to provide continuous, actionable insights into cluster network health.
+The core components are:
+
+- **iperf3:** A widely adopted tool for active network performance
+  > measurement, used to generate traffic and measure maximum achievable
+  > bandwidth, jitter, and packet loss between two points.^2^
+
+- **Prometheus:** A powerful, open-source monitoring and alerting system
+  > that has become the standard for collecting and storing time-series
+  > metrics in the Kubernetes ecosystem.^3^
+
+- **Grafana:** A leading visualization tool for creating rich,
+  > interactive dashboards from various data sources, including
+  > Prometheus, enabling intuitive analysis of complex datasets.^4^
+
+By combining these components into a cohesive, automated service, we can
+transform abstract network performance into a concrete, queryable, and
+visualizable stream of data, enabling teams to detect and address
+infrastructure-level issues before they impact end-users.^6^
+
+### **1.2 The Core Architectural Pattern: Decoupled Test Endpoints and a Central Orchestrator** {#the-core-architectural-pattern-decoupled-test-endpoints-and-a-central-orchestrator}
+
+The foundation of this monitoring service is a robust, decoupled
+architectural pattern designed for scalability and resilience within a
+dynamic Kubernetes environment. The design separates the passive test
+endpoints from the active test orchestrator, a critical distinction that
+ensures the system is both efficient and aligned with Kubernetes
+operational principles.
+
+The data flow and component interaction can be visualized as follows:
+
+1.  A **DaemonSet** deploys an iperf3 server pod onto every node in the
+    > cluster, creating a mesh of passive test targets.
+
+2.  A central **Deployment**, the iperf3-exporter, uses the Kubernetes
+    > API to discover the IP addresses of all iperf3 server pods.
+
+3.  The iperf3-exporter periodically orchestrates tests, running an
+    > iperf3 client to connect to each server pod and measure network
+    > performance.
+
+4.  The exporter parses the JSON output from iperf3, transforms the
+    > results into Prometheus metrics, and exposes them on a /metrics
+    > HTTP endpoint.
+
+5.  A **Prometheus** server, configured via a **ServiceMonitor**,
+    > scrapes the /metrics endpoint of the exporter, ingesting the
+    > performance data into its time-series database.
+
+6.  A **Grafana** instance, using Prometheus as a data source,
+    > visualizes the metrics in a purpose-built dashboard, providing
+    > heatmaps and time-series graphs of node-to-node bandwidth, jitter,
+    > and packet loss.
+
+This architecture is composed of three primary logical components:
+
+- **Component 1: The iperf3-server DaemonSet.** To accurately measure
+  > network performance between any two nodes (N-to-N), an iperf3 server
+  > process must be running and accessible on every node. The DaemonSet
+  > is the canonical Kubernetes controller for this exact use case. It
+  > guarantees that a copy of a specific pod runs on all, or a selected
+  > subset of, nodes within the cluster.^7^ When a new node joins the
+  > cluster, the
+  > DaemonSet controller automatically deploys an iperf3-server pod to
+  > it; conversely, when a node is removed, the pod is garbage
+  > collected. This ensures the mesh of test endpoints is always in sync
+  > with the state of the cluster, requiring zero manual
+  > intervention.^9^ This pattern of using a
+  > DaemonSet to deploy iperf3 across a cluster is a well-established
+  > practice for network validation.^11^
+
+- **Component 2: The iperf3-exporter Deployment.** A separate,
+  > centralized component is required to act as the test orchestrator.
+  > This component is responsible for initiating the iperf3 client
+  > connections, executing the tests, parsing the results, and exposing
+  > them as Prometheus metrics. Since this is a stateless service whose
+  > primary function is to perform a periodic task, a Deployment is the
+  > ideal controller.^8^ A
+  > Deployment ensures a specified number of replicas are running,
+  > provides mechanisms for rolling updates, and allows for independent
+  > resource management and lifecycle control, decoupled from the
+  > iperf3-server pods it tests against.^10^
+
+- **Component 3: The Prometheus & Grafana Stack.** The monitoring
+  > backend is provided by the kube-prometheus-stack, a comprehensive
+  > Helm chart that deploys Prometheus, Grafana, Alertmanager, and the
+  > necessary exporters for cluster monitoring.^4^ Our custom monitoring
+  > service is designed to integrate seamlessly with this stack,
+  > leveraging its Prometheus Operator for automatic scrape
+  > configuration and its Grafana instance for visualization.
+
+### **1.3 Architectural Justification and Design Rationale** {#architectural-justification-and-design-rationale}
+
+The primary strength of this architecture lies in its deliberate
+separation of concerns, a design choice that yields significant benefits
+in resilience, scalability, and operational efficiency. The DaemonSet is
+responsible for the *presence* of test endpoints, while the Deployment
+handles the *orchestration* of the tests. This decoupling is not
+arbitrary; it is a direct consequence of applying Kubernetes-native
+principles to the problem.
+
+The logical progression is as follows: The requirement to continuously
+measure N-to-N node bandwidth necessitates that iperf3 server processes
+are available on all N nodes to act as targets. The most reliable,
+self-healing, and automated method to achieve this \"one-pod-per-node\"
+pattern in Kubernetes is to use a DaemonSet.^7^ This makes the server
+deployment automatically scale with the cluster itself. Next, a process
+is needed to trigger the tests against these servers. This
+\"orchestrator\" is a logically distinct, active service. It needs to be
+reliable and potentially scalable, but it does not need to run on every
+single node. The standard Kubernetes object for managing such stateless
+services is a
+
+Deployment.^8^
+
+This separation allows for independent and appropriate resource
+allocation. The iperf3-server pods are extremely lightweight, consuming
+minimal resources while idle. The iperf3-exporter, however, may be more
+CPU-intensive during the brief periods it is actively running tests. By
+placing them in different workload objects (DaemonSet and Deployment),
+we can configure their resource requests and limits independently. This
+prevents the monitoring workload from interfering with or being starved
+by application workloads, a crucial consideration for any
+production-grade system. This design is fundamentally more robust and
+scalable than simpler, monolithic approaches, such as a single script
+that attempts to manage both server and client lifecycles.^12^
+
+## **Section 2: Implementing the iperf3-prometheus-exporter**
+
+The heart of this monitoring solution is the iperf3-prometheus-exporter,
+a custom application responsible for orchestrating the network tests and
+translating their results into a format that Prometheus can ingest. This
+section provides a detailed breakdown of its implementation, from
+technology selection to the final container image.
+
+### **2.1 Technology Selection: Python for Agility and Ecosystem** {#technology-selection-python-for-agility-and-ecosystem}
+
+Python was selected as the implementation language for the exporter due
+to its powerful ecosystem and rapid development capabilities. The
+availability of mature, well-maintained libraries for interacting with
+both Prometheus and Kubernetes significantly accelerates the development
+of a robust, cloud-native application.
+
+The key libraries leveraged are:
+
+- **prometheus-client:** The official Python client library for
+  > instrumenting applications with Prometheus metrics. It provides a
+  > simple API for defining metrics (Gauges, Counters, etc.) and
+  > exposing them via an HTTP server, handling much of the boilerplate
+  > required for creating a valid exporter.^13^
+
+- **iperf3-python:** A clean, high-level Python wrapper around the
+  > iperf3 C library. It allows for programmatic control of iperf3
+  > clients and servers, and it can directly parse the JSON output of a
+  > test into a convenient Python object, eliminating the need for
+  > manual process management and output parsing.^15^
+
+- **kubernetes:** The official Python client library for the Kubernetes
+  > API. This library is essential for the exporter to become
+  > \"Kubernetes-aware,\" enabling it to dynamically discover the
+  > iperf3-server pods it needs to test against by querying the API
+  > server directly.
+
+### **2.2 Core Exporter Logic (Annotated Python Code)** {#core-exporter-logic-annotated-python-code}
+
+The exporter\'s logic can be broken down into five distinct steps, which
+together form a continuous loop of discovery, testing, and reporting.
+
+#### **Step 1: Initialization and Metric Definition**
+
+The application begins by importing the necessary libraries and defining
+the Prometheus metrics that will be exposed. We use a Gauge metric, as
+bandwidth is a value that can go up or down. Labels are crucial for
+providing context; they allow us to slice and dice the data in
+Prometheus and Grafana.
+
+> Python
+
+import os
+import time
+import logging
+from kubernetes import client, config
+from prometheus_client import start_http_server, Gauge
+import iperf3
+
+\# \-\-- Configuration \-\--
+\# Configure logging
+logging.basicConfig(level=logging.INFO, format=\'%(asctime)s -
+%(levelname)s - %(message)s\')
+
+\# \-\-- Prometheus Metrics Definition \-\--
+IPERF_BANDWIDTH_MBPS = Gauge(
+\'iperf_network_bandwidth_mbps\',
+\'Network bandwidth measured by iperf3 in Megabits per second\',
+\[\'source_node\', \'destination_node\', \'protocol\'\]
+)
+IPERF_JITTER_MS = Gauge(
+\'iperf_network_jitter_ms\',
+\'Network jitter measured by iperf3 in milliseconds\',
+\[\'source_node\', \'destination_node\', \'protocol\'\]
+)
+IPERF_PACKETS_TOTAL = Gauge(
+\'iperf_network_packets_total\',
+\'Total packets transmitted or received during the iperf3 test\',
+\[\'source_node\', \'destination_node\', \'protocol\'\]
+)
+IPERF_LOST_PACKETS = Gauge(
+\'iperf_network_lost_packets_total\',
+\'Total lost packets during the iperf3 UDP test\',
+\[\'source_node\', \'destination_node\', \'protocol\'\]
+)
+IPERF_TEST_SUCCESS = Gauge(
+\'iperf_test_success\',
+\'Indicates if the iperf3 test was successful (1) or failed (0)\',
+\[\'source_node\', \'destination_node\', \'protocol\'\]
+)
+
+#### **Step 2: Kubernetes-Aware Target Discovery**
+
+A static list of test targets is an anti-pattern in a dynamic
+environment like Kubernetes.^16^ The exporter must dynamically discover
+its targets. This is achieved by using the Kubernetes Python client to
+query the API server for all pods that match the label selector of our
+
+iperf3-server DaemonSet (e.g., app=iperf3-server). The function returns
+a list of dictionaries, each containing the pod\'s IP address and the
+name of the node it is running on.
+
+This dynamic discovery is what transforms the exporter from a simple
+script into a resilient, automated service. It adapts to cluster scaling
+events without any manual intervention. The logical path is clear:
+Kubernetes clusters are dynamic, so a hardcoded list of IPs would become
+stale instantly. The API server is the single source of truth for the
+cluster\'s state. Therefore, the exporter must query this API, which in
+turn necessitates including the Kubernetes client library and
+configuring the appropriate Role-Based Access Control (RBAC) permissions
+for its ServiceAccount.
+
+> Python
+
+def discover_iperf_servers():
+\"\"\"
+Discover iperf3 server pods in the cluster using the Kubernetes API.
+\"\"\"
+try:
+\# Load in-cluster configuration
+config.load_incluster_config()
+v1 = client.CoreV1Api()
+
+namespace = os.getenv(\'IPERF_SERVER_NAMESPACE\', \'default\')
+label_selector = os.getenv(\'IPERF_SERVER_LABEL_SELECTOR\',
+\'app=iperf3-server\')
+
+logging.info(f\"Discovering iperf3 servers with label
+\'{label_selector}\' in namespace \'{namespace}\'\")
+
+ret = v1.list_pod_for_all_namespaces(label_selector=label_selector,
+watch=False)
+
+servers =
+for i in ret.items:
+\# Ensure pod has an IP and is running
+if i.status.pod_ip and i.status.phase == \'Running\':
+servers.append({
+\'ip\': i.status.pod_ip,
+\'node_name\': i.spec.node_name
+})
+logging.info(f\"Discovered {len(servers)} iperf3 server pods.\")
+return servers
+except Exception as e:
+logging.error(f\"Error discovering iperf servers: {e}\")
+return
+
+#### **Step 3: The Test Orchestration Loop**
+
+The main function of the application contains an infinite while True
+loop that orchestrates the entire process. It periodically discovers the
+servers, creates a list of test pairs (node-to-node), and then executes
+an iperf3 test for each pair.
+
+> Python
+
+def run_iperf_test(server_ip, server_port, protocol, source_node,
+dest_node):
+\"\"\"
+Runs a single iperf3 test and updates Prometheus metrics.
+\"\"\"
+logging.info(f\"Running iperf3 test from {source_node} to {dest_node}
+({server_ip}:{server_port}) using {protocol.upper()}\")
+
+client = iperf3.Client()
+client.server_hostname = server_ip
+client.port = server_port
+client.protocol = protocol
+client.duration = int(os.getenv(\'IPERF_TEST_DURATION\', 5))
+client.json_output = True \# Critical for parsing
+
+result = client.run()
+
+\# Parse results and update metrics
+parse_and_publish_metrics(result, source_node, dest_node, protocol)
+
+def main_loop():
+\"\"\"
+Main orchestration loop.
+\"\"\"
+test_interval = int(os.getenv(\'IPERF_TEST_INTERVAL\', 300))
+server_port = int(os.getenv(\'IPERF_SERVER_PORT\', 5201))
+protocol = os.getenv(\'IPERF_TEST_PROTOCOL\', \'tcp\').lower()
+source_node_name = os.getenv(\'SOURCE_NODE_NAME\') \# Injected via
+Downward API
+
+if not source_node_name:
+logging.error(\"SOURCE_NODE_NAME environment variable not set.
+Exiting.\")
+return
+
+while True:
+servers = discover_iperf_servers()
+
+for server in servers:
+\# Avoid testing a node against itself
+if server\[\'node_name\'\] == source_node_name:
+continue
+
+run_iperf_test(server\[\'ip\'\], server_port, protocol,
+source_node_name, server\[\'node_name\'\])
+
+logging.info(f\"Completed test cycle. Sleeping for {test_interval}
+seconds.\")
+time.sleep(test_interval)
+
+#### **Step 4: Parsing and Publishing Metrics**
+
+After each test run, a dedicated function parses the JSON result object
+provided by the iperf3-python library.^15^ It extracts the key
+performance indicators and uses them to set the value of the
+corresponding Prometheus
+
+Gauge, applying the correct labels for source and destination nodes.
+Robust error handling ensures that failed tests are also recorded as a
+metric, which is vital for alerting.
+
+> Python
+
+def parse_and_publish_metrics(result, source_node, dest_node,
+protocol):
+\"\"\"
+Parses the iperf3 result and updates Prometheus gauges.
+\"\"\"
+labels = {\'source_node\': source_node, \'destination_node\': dest_node,
+\'protocol\': protocol}
+
+if result.error:
+logging.error(f\"Test from {source_node} to {dest_node} failed:
+{result.error}\")
+IPERF_TEST_SUCCESS.labels(\*\*labels).set(0)
+\# Clear previous successful metrics for this path
+IPERF_BANDWIDTH_MBPS.labels(\*\*labels).set(0)
+IPERF_JITTER_MS.labels(\*\*labels).set(0)
+return
+
+IPERF_TEST_SUCCESS.labels(\*\*labels).set(1)
+
+\# The summary data is in result.sent_Mbps or result.received_Mbps
+depending on direction
+\# For simplicity, we check for available attributes.
+if hasattr(result, \'sent_Mbps\'):
+bandwidth_mbps = result.sent_Mbps
+elif hasattr(result, \'received_Mbps\'):
+bandwidth_mbps = result.received_Mbps
+else:
+\# Fallback for different iperf3 versions/outputs
+bandwidth_mbps = result.Mbps if hasattr(result, \'Mbps\') else 0
+
+IPERF_BANDWIDTH_MBPS.labels(\*\*labels).set(bandwidth_mbps)
+
+if protocol == \'udp\':
+IPERF_JITTER_MS.labels(\*\*labels).set(result.jitter_ms if
+hasattr(result, \'jitter_ms\') else 0)
+IPERF_PACKETS_TOTAL.labels(\*\*labels).set(result.packets if
+hasattr(result, \'packets\') else 0)
+IPERF_LOST_PACKETS.labels(\*\*labels).set(result.lost_packets if
+hasattr(result, \'lost_packets\') else 0)
+
+#### **Step 5: Exposing the /metrics Endpoint**
+
+Finally, the main execution block starts a simple HTTP server using the
+prometheus-client library. This server exposes the collected metrics on
+the standard /metrics path, ready to be scraped by Prometheus.^13^
+
+> Python
+
+if \_\_name\_\_ == \'\_\_main\_\_\':
+\# Start the Prometheus metrics server
+listen_port = int(os.getenv(\'LISTEN_PORT\', 9876))
+start_http_server(listen_port)
+logging.info(f\"Prometheus exporter listening on port {listen_port}\")
+
+\# Start the main orchestration loop
+main_loop()
+
+### **2.3 Containerizing the Exporter (Dockerfile)** {#containerizing-the-exporter-dockerfile}
+
+To deploy the exporter in Kubernetes, it must be packaged into a
+container image. A multi-stage Dockerfile is used to create a minimal
+and more secure final image by separating the build environment from the
+runtime environment. This is a standard best practice for producing
+production-ready containers.^14^
+
+> Dockerfile
+
+\# Stage 1: Build stage with dependencies
+FROM python:3.9-slim as builder
+
+WORKDIR /app
+
+\# Install iperf3 and build dependencies
+RUN apt-get update && \\
+apt-get install -y \--no-install-recommends gcc iperf3 libiperf-dev &&
+\\
+rm -rf /var/lib/apt/lists/\*
+
+\# Install Python dependencies
+COPY requirements.txt.
+RUN pip install \--no-cache-dir -r requirements.txt
+
+\# Stage 2: Final runtime stage
+FROM python:3.9-slim
+
+WORKDIR /app
+
+\# Copy iperf3 binary and library from the builder stage
+COPY \--from=builder /usr/bin/iperf3 /usr/bin/iperf3
+COPY \--from=builder /usr/lib/x86_64-linux-gnu/libiperf.so.0
+/usr/lib/x86_64-linux-gnu/libiperf.so.0
+
+\# Copy installed Python packages from the builder stage
+COPY \--from=builder /usr/local/lib/python3.9/site-packages
+/usr/local/lib/python3.9/site-packages
+
+\# Copy the exporter application code
+COPY exporter.py.
+
+\# Expose the metrics port
+EXPOSE 9876
+
+\# Set the entrypoint
+CMD \[\"python\", \"exporter.py\"\]
+
+The corresponding requirements.txt would contain:
+
+prometheus-client
+iperf3
+kubernetes
+
+## **Section 3: Kubernetes Manifests and Deployment Strategy**
+
+With the architectural blueprint defined and the exporter application
+containerized, the next step is to translate this design into
+declarative Kubernetes manifests. These YAML files define the necessary
+Kubernetes objects to deploy, configure, and manage the monitoring
+service. Using static manifests here provides a clear foundation before
+they are parameterized into a Helm chart in the next section.
+
+### **3.1 The iperf3-server DaemonSet** {#the-iperf3-server-daemonset}
+
+The iperf3-server component is deployed as a DaemonSet to ensure an
+instance of the server pod runs on every eligible node in the
+cluster.^7^ This creates the ubiquitous grid of test endpoints required
+for comprehensive N-to-N testing.
+
+Key fields in this manifest include:
+
+- **spec.selector**: Connects the DaemonSet to the pods it manages via
+  > labels.
+
+- **spec.template.metadata.labels**: The label app: iperf3-server is
+  > applied to the pods, which is crucial for discovery by both the
+  > iperf3-exporter and Kubernetes Services.
+
+- **spec.template.spec.containers**: Defines the iperf3 container, using
+  > a public image and running the iperf3 -s command to start it in
+  > server mode.
+
+- **spec.template.spec.tolerations**: This is often necessary to allow
+  > the DaemonSet to schedule pods on control-plane (master) nodes,
+  > which may have taints preventing normal workloads from running
+  > there. This ensures the entire cluster, including masters, is part
+  > of the test mesh.
+
+- **spec.template.spec.hostNetwork: true**: This is a critical setting.
+  > By running the server pods on the host\'s network namespace, we
+  > bypass the Kubernetes network overlay (CNI) for the server side.
+  > This allows the test to measure the raw performance of the
+  > underlying node network interface, which is often the primary goal
+  > of infrastructure-level testing.
+
+> YAML
+
+apiVersion: apps/v1
+kind: DaemonSet
+metadata:
+name: iperf3-server
+labels:
+app: iperf3-server
+spec:
+selector:
+matchLabels:
+app: iperf3-server
+template:
+metadata:
+labels:
+app: iperf3-server
+spec:
+\# Run on the host network to measure raw node-to-node performance
+hostNetwork: true
+\# Tolerations to allow scheduling on control-plane nodes
+tolerations:
+- key: \"node-role.kubernetes.io/control-plane\"
+operator: \"Exists\"
+effect: \"NoSchedule\"
+- key: \"node-role.kubernetes.io/master\"
+operator: \"Exists\"
+effect: \"NoSchedule\"
+containers:
+- name: iperf3-server
+image: networkstatic/iperf3:latest
+args: \[\"-s\"\] \# Start in server mode
+ports:
+- containerPort: 5201
+name: iperf3
+protocol: TCP
+- containerPort: 5201
+name: iperf3-udp
+protocol: UDP
+resources:
+requests:
+cpu: \"50m\"
+memory: \"64Mi\"
+limits:
+cpu: \"100m\"
+memory: \"128Mi\"
+
+### **3.2 The iperf3-exporter Deployment** {#the-iperf3-exporter-deployment}
+
+The iperf3-exporter is deployed as a Deployment, as it is a stateless
+application that orchestrates the tests.^14^ Only one replica is
+typically needed, as it can sequentially test all nodes.
+
+Key fields in this manifest are:
+
+- **spec.replicas: 1**: A single instance is sufficient for most
+  > clusters.
+
+- **spec.template.spec.serviceAccountName**: This assigns the custom
+  > ServiceAccount (defined next) to the pod, granting it the necessary
+  > permissions to talk to the Kubernetes API.
+
+- **spec.template.spec.containers.env**: The SOURCE_NODE_NAME
+  > environment variable is populated using the Downward API. This is
+  > how the exporter pod knows which node *it* is running on, allowing
+  > it to skip testing against itself.
+
+- **spec.template.spec.containers.image**: This points to the custom
+  > exporter image built in the previous section.
+
+> YAML
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+name: iperf3-exporter
+labels:
+app: iperf3-exporter
+spec:
+replicas: 1
+selector:
+matchLabels:
+app: iperf3-exporter
+template:
+metadata:
+labels:
+app: iperf3-exporter
+spec:
+serviceAccountName: iperf3-exporter-sa
+containers:
+- name: iperf3-exporter
+image: your-repo/iperf3-prometheus-exporter:latest \# Replace with your
+image
+ports:
+- containerPort: 9876
+name: metrics
+env:
+\# Use the Downward API to inject the node name this pod is running on
+- name: SOURCE_NODE_NAME
+valueFrom:
+fieldRef:
+fieldPath: spec.nodeName
+\# Other configurations for the exporter script
+- name: IPERF_TEST_INTERVAL
+value: \"300\"
+- name: IPERF_SERVER_LABEL_SELECTOR
+value: \"app=iperf3-server\"
+resources:
+requests:
+cpu: \"100m\"
+memory: \"128Mi\"
+limits:
+cpu: \"500m\"
+memory: \"256Mi\"
+
+### **3.3 RBAC: Granting Necessary Permissions** {#rbac-granting-necessary-permissions}
+
+For the exporter to perform its dynamic discovery of iperf3-server pods,
+it must be granted specific, limited permissions to read information
+from the Kubernetes API. This is accomplished through a ServiceAccount,
+a ClusterRole, and a ClusterRoleBinding.
+
+- **ServiceAccount**: Provides an identity for the exporter pod within
+  > the cluster.
+
+- **ClusterRole**: Defines a set of permissions. Here, we grant get,
+  > list, and watch access to pods. These are the minimum required
+  > permissions for the discovery function to work. The role is a
+  > ClusterRole because the exporter needs to find pods across all
+  > namespaces where servers might be running.
+
+- **ClusterRoleBinding**: Links the ServiceAccount to the ClusterRole,
+  > effectively granting the permissions to any pod that uses the
+  > ServiceAccount.
+
+> YAML
+
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+name: iperf3-exporter-sa
+\-\--
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+name: iperf3-exporter-role
+rules:
+- apiGroups: \[\"\"\]
+resources: \[\"pods\"\]
+verbs: \[\"get\", \"list\", \"watch\"\]
+\-\--
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+name: iperf3-exporter-rb
+subjects:
+- kind: ServiceAccount
+name: iperf3-exporter-sa
+namespace: default \# The namespace where the exporter is deployed
+roleRef:
+kind: ClusterRole
+name: iperf3-exporter-role
+apiGroup: rbac.authorization.k8s.io
+
+### **3.4 Network Exposure: Service and ServiceMonitor** {#network-exposure-service-and-servicemonitor}
+
+To make the exporter\'s metrics available to Prometheus, we need two
+final objects. The Service exposes the exporter pod\'s metrics port
+within the cluster, and the ServiceMonitor tells the Prometheus Operator
+how to find and scrape that service.
+
+This ServiceMonitor-based approach is the linchpin for a GitOps-friendly
+integration. Instead of manually editing the central Prometheus
+configuration file---a brittle and non-declarative process---we deploy a
+ServiceMonitor custom resource alongside our application.^14^ The
+Prometheus Operator, a key component of the
+
+kube-prometheus-stack, continuously watches for these objects. When it
+discovers our iperf3-exporter-sm, it automatically generates the
+necessary scrape configuration and reloads Prometheus without any manual
+intervention.^4^ This empowers the application team to define
+
+*how their application should be monitored* as part of the
+application\'s own deployment package, a cornerstone of scalable, \"you
+build it, you run it\" observability.
+
+> YAML
+
+apiVersion: v1
+kind: Service
+metadata:
+name: iperf3-exporter-svc
+labels:
+app: iperf3-exporter
+spec:
+selector:
+app: iperf3-exporter
+ports:
+- name: metrics
+port: 9876
+targetPort: metrics
+protocol: TCP
+\-\--
+apiVersion: monitoring.coreos.com/v1
+kind: ServiceMonitor
+metadata:
+name: iperf3-exporter-sm
+labels:
+\# Label for Prometheus Operator to discover this ServiceMonitor
+release: prometheus-operator
+spec:
+selector:
+matchLabels:
+\# This must match the labels on the Service object above
+app: iperf3-exporter
+endpoints:
+- port: metrics
+interval: 60s
+scrapeTimeout: 30s
+
+## **Section 4: Packaging with Helm for Reusability and Distribution**
+
+While static YAML manifests are excellent for defining Kubernetes
+resources, they lack the flexibility needed for easy configuration,
+distribution, and lifecycle management. Helm, the package manager for
+Kubernetes, solves this by bundling applications into
+version-controlled, reusable packages called charts.^17^ This section
+details how to package the entire
+
+iperf3 monitoring service into a professional, flexible, and
+distributable Helm chart.
+
+### **4.1 Helm Chart Structure** {#helm-chart-structure}
+
+A well-organized Helm chart follows a standard directory structure. This
+convention makes charts easier to understand and maintain.^19^
+
+iperf3-monitor/
+├── Chart.yaml \# Metadata about the chart (name, version, etc.)
+├── values.yaml \# Default configuration values for the chart
+├── charts/ \# Directory for sub-chart dependencies (empty for this
+project)
+├── templates/ \# Directory containing the templated Kubernetes
+manifests
+│ ├── \_helpers.tpl \# A place for reusable template helpers
+│ ├── server-daemonset.yaml
+│ ├── exporter-deployment.yaml
+│ ├── rbac.yaml
+│ ├── service.yaml
+│ └── servicemonitor.yaml
+└── README.md \# Documentation for the chart
+
+### **4.2 Templating the Kubernetes Manifests** {#templating-the-kubernetes-manifests}
+
+The core of Helm\'s power lies in its templating engine, which uses Go
+templates. We convert the static manifests from Section 3 into dynamic
+templates by replacing hardcoded values with references to variables
+defined in the values.yaml file.
+
+A crucial best practice is to use a \_helpers.tpl file to define common
+functions and partial templates, especially for generating resource
+names and labels. This reduces boilerplate, ensures consistency, and
+makes the chart easier to manage.^19^
+
+**Example: templates/\_helpers.tpl**
+
+> Code snippet
+
+{{/\*
+Expand the name of the chart.
+\*/}}
+{{- define \"iperf3-monitor.name\" -}}
+{{- default.Chart.Name.Values.nameOverride \| trunc 63 \| trimSuffix
+\"-\" }}
+{{- end -}}
+
+{{/\*
+Create a default fully qualified app name.
+We truncate at 63 chars because some Kubernetes name fields are limited
+to this (by the DNS naming spec).
+\*/}}
+{{- define \"iperf3-monitor.fullname\" -}}
+{{- if.Values.fullnameOverride }}
+{{-.Values.fullnameOverride \| trunc 63 \| trimSuffix \"-\" }}
+{{- else }}
+{{- \$name := default.Chart.Name.Values.nameOverride }}
+{{- if contains \$name.Release.Name }}
+{{-.Release.Name \| trunc 63 \| trimSuffix \"-\" }}
+{{- else }}
+{{- printf \"%s-%s\".Release.Name \$name \| trunc 63 \| trimSuffix \"-\"
+}}
+{{- end }}
+{{- end }}
+{{- end -}}
+
+{{/\*
+Common labels
+\*/}}
+{{- define \"iperf3-monitor.labels\" -}}
+helm.sh/chart: {{ include \"iperf3-monitor.name\". }}
+{{ include \"iperf3-monitor.selectorLabels\". }}
+{{- if.Chart.AppVersion }}
+app.kubernetes.io/version: {{.Chart.AppVersion \| quote }}
+{{- end }}
+app.kubernetes.io/managed-by: {{.Release.Service }}
+{{- end -}}
+
+{{/\*
+Selector labels
+\*/}}
+{{- define \"iperf3-monitor.selectorLabels\" -}}
+app.kubernetes.io/name: {{ include \"iperf3-monitor.name\". }}
+app.kubernetes.io/instance: {{.Release.Name }}
+{{- end -}}
+
+**Example: Templated exporter-deployment.yaml**
+
+> YAML
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+name: {{ include \"iperf3-monitor.fullname\". }}-exporter
+labels:
+{{- include \"iperf3-monitor.labels\". \| nindent 4 }}
+app.kubernetes.io/component: exporter
+spec:
+replicas: {{.Values.exporter.replicaCount }}
+selector:
+matchLabels:
+{{- include \"iperf3-monitor.selectorLabels\". \| nindent 6 }}
+app.kubernetes.io/component: exporter
+template:
+metadata:
+labels:
+{{- include \"iperf3-monitor.selectorLabels\". \| nindent 8 }}
+app.kubernetes.io/component: exporter
+spec:
+{{- if.Values.rbac.create }}
+serviceAccountName: {{ include \"iperf3-monitor.fullname\". }}-sa
+{{- else }}
+serviceAccountName: {{.Values.serviceAccount.name }}
+{{- end }}
+containers:
+- name: iperf3-exporter
+image: \"{{.Values.exporter.image.repository
+}}:{{.Values.exporter.image.tag \| default.Chart.AppVersion }}\"
+imagePullPolicy: {{.Values.exporter.image.pullPolicy }}
+ports:
+- containerPort: 9876
+name: metrics
+env:
+- name: SOURCE_NODE_NAME
+valueFrom:
+fieldRef:
+fieldPath: spec.nodeName
+- name: IPERF_TEST_INTERVAL
+value: \"{{.Values.exporter.testInterval }}\"
+resources:
+{{- toYaml.Values.exporter.resources \| nindent 10 }}
+
+### **4.3 Designing a Comprehensive values.yaml** {#designing-a-comprehensive-values.yaml}
+
+The values.yaml file is the public API of a Helm chart. A well-designed
+values file is intuitive, clearly documented, and provides users with
+the flexibility to adapt the chart to their specific needs. Best
+practices include using clear, camelCase naming conventions and
+providing comments for every parameter.^21^
+
+A particularly powerful feature of Helm is conditional logic. By
+wrapping entire resource definitions in if blocks based on boolean flags
+in values.yaml (e.g., {{- if.Values.rbac.create }}), the chart becomes
+highly adaptable. A user in a high-security environment can disable the
+automatic creation of ClusterRoles by setting rbac.create: false,
+allowing them to manage permissions manually without causing the Helm
+installation to fail.^20^ Similarly, a user not running the Prometheus
+Operator can set
+
+serviceMonitor.enabled: false. This adaptability transforms the chart
+from a rigid, all-or-nothing package into a flexible building block,
+dramatically increasing its utility across different organizations and
+security postures.
+
+The following table documents the comprehensive set of configurable
+parameters for the iperf3-monitor chart. This serves as the primary
+documentation for any user wishing to install and customize the service.
+
+| Parameter                    | Description                                                          | Type    | Default                                   |
+|------------------------------|----------------------------------------------------------------------|---------|-------------------------------------------|
+| nameOverride                 | Override the name of the chart.                                      | string  | \"\"                                      |
+| fullnameOverride             | Override the fully qualified app name.                               | string  | \"\"                                      |
+| exporter.image.repository    | The container image repository for the exporter.                     | string  | ghcr.io/my-org/iperf3-prometheus-exporter |
+| exporter.image.tag           | The container image tag for the exporter.                            | string  | (Chart.AppVersion)                        |
+| exporter.image.pullPolicy    | The image pull policy for the exporter.                              | string  | IfNotPresent                              |
+| exporter.replicaCount        | Number of exporter pod replicas.                                     | integer | 1                                         |
+| exporter.testInterval        | Interval in seconds between test cycles.                             | integer | 300                                       |
+| exporter.testTimeout         | Timeout in seconds for a single iperf3 test.                         | integer | 10                                        |
+| exporter.testProtocol        | Protocol to use for testing (tcp or udp).                            | string  | tcp                                       |
+| exporter.resources           | CPU/memory resource requests and limits for the exporter.            | object  | {}                                        |
+| server.image.repository      | The container image repository for the iperf3 server.                | string  | networkstatic/iperf3                      |
+| server.image.tag             | The container image tag for the iperf3 server.                       | string  | latest                                    |
+| server.resources             | CPU/memory resource requests and limits for the server pods.         | object  | {}                                        |
+| server.nodeSelector          | Node selector for scheduling server pods.                            | object  | {}                                        |
+| server.tolerations           | Tolerations for scheduling server pods on tainted nodes.             | array   | \`\`                                      |
+| rbac.create                  | If true, create ServiceAccount, ClusterRole, and ClusterRoleBinding. | boolean | true                                      |
+| serviceAccount.name          | The name of the ServiceAccount to use. Used if rbac.create is false. | string  | \"\"                                      |
+| serviceMonitor.enabled       | If true, create a ServiceMonitor for Prometheus Operator.            | boolean | true                                      |
+| serviceMonitor.interval      | Scrape interval for the ServiceMonitor.                              | string  | 60s                                       |
+| serviceMonitor.scrapeTimeout | Scrape timeout for the ServiceMonitor.                               | string  | 30s                                       |
+
+## **Section 5: Visualizing Network Performance with a Custom Grafana Dashboard**
+
+The final piece of the user experience is a purpose-built Grafana
+dashboard that transforms the raw, time-series metrics from Prometheus
+into intuitive, actionable visualizations. A well-designed dashboard
+does more than just display data; it tells a story, guiding an operator
+from a high-level overview of cluster health to a deep-dive analysis of
+a specific problematic network path.^5^
+
+### **5.1 Dashboard Design Principles** {#dashboard-design-principles}
+
+The primary goals for this network performance dashboard are:
+
+1.  **At-a-Glance Overview:** Provide an immediate, cluster-wide view of
+    > network health, allowing operators to quickly spot systemic issues
+    > or anomalies.
+
+2.  **Intuitive Drill-Down:** Enable users to seamlessly transition from
+    > a high-level view to a detailed analysis of performance between
+    > specific nodes.
+
+3.  **Correlation:** Display multiple related metrics (bandwidth,
+    > jitter, packet loss) on the same timeline to help identify causal
+    > relationships.
+
+4.  **Clarity and Simplicity:** Avoid clutter and overly complex panels
+    > that can obscure meaningful data.^4^
+
+### **5.2 Key Visualizations and Panels** {#key-visualizations-and-panels}
+
+The dashboard is constructed from several key panel types, each serving
+a specific analytical purpose.
+
+- **Panel 1: Node-to-Node Bandwidth Heatmap.** This is the centerpiece
+  > of the dashboard\'s overview. It uses Grafana\'s \"Heatmap\"
+  > visualization to create a matrix of network performance.
+
+  - **Y-Axis:** Source Node (source_node label).
+
+  - **X-Axis:** Destination Node (destination_node label).
+
+  - **Cell Color:** The value of the iperf_network_bandwidth_mbps
+    > metric.
+
+  - PromQL Query: avg(iperf_network_bandwidth_mbps) by (source_node,
+    > destination_node)
+    > This panel provides an instant visual summary of the entire
+    > cluster\'s network fabric. A healthy cluster will show a uniformly
+    > \"hot\" (high bandwidth) grid, while any \"cold\" spots
+    > immediately draw attention to underperforming network paths.
+
+- **Panel 2: Time-Series Performance Graphs.** These panels use the
+  > \"Time series\" visualization to plot performance over time,
+  > allowing for trend analysis and historical investigation.
+
+  - **Bandwidth (Mbps):** Plots
+    > iperf_network_bandwidth_mbps{source_node=\"\$source_node\",
+    > destination_node=\"\$destination_node\"}.
+
+  - **Jitter (ms):** Plots
+    > iperf_network_jitter_ms{source_node=\"\$source_node\",
+    > destination_node=\"\$destination_node\", protocol=\"udp\"}.
+
+  - Packet Loss (%): Plots (iperf_network_lost_packets_total{\...} /
+    > iperf_network_packets_total{\...}) \* 100.
+    > These graphs are filtered by the dashboard variables, enabling the
+    > drill-down analysis.
+
+- **Panel 3: Stat Panels.** These panels use the \"Stat\" visualization
+  > to display single, key performance indicators (KPIs) for the
+  > selected time range and nodes.
+
+  - **Average Bandwidth:** avg(iperf_network_bandwidth_mbps{\...})
+
+  - **Minimum Bandwidth:** min(iperf_network_bandwidth_mbps{\...})
+
+  - **Maximum Jitter:** max(iperf_network_jitter_ms{\...})
+
+### **5.3 Enabling Interactivity with Grafana Variables** {#enabling-interactivity-with-grafana-variables}
+
+The dashboard\'s interactivity is powered by Grafana\'s template
+variables. These variables are dynamically populated from Prometheus and
+are used to filter the data displayed in the panels.^4^
+
+- **\$source_node**: A dropdown variable populated by the PromQL query
+  > label_values(iperf_network_bandwidth_mbps, source_node).
+
+- **\$destination_node**: A dropdown variable populated by
+  > label_values(iperf_network_bandwidth_mbps{source_node=\"\$source_node\"},
+  > destination_node). This query is cascaded, meaning it only shows
+  > destinations relevant to the selected source.
+
+- **\$protocol**: A custom variable with the options tcp and udp.
+
+This combination of a high-level heatmap with interactive,
+variable-driven drill-down graphs creates a powerful analytical
+workflow. An operator can begin with a bird\'s-eye view of the cluster.
+Upon spotting an anomaly on the heatmap (e.g., a low-bandwidth link
+between Node-5 and Node-8), they can use the \$source_node and
+\$destination_node dropdowns to select that specific path. All the
+time-series panels will instantly update to show the detailed
+performance history for that link, allowing the operator to correlate
+bandwidth drops with jitter spikes or other events. This workflow
+transforms raw data into actionable insight, dramatically reducing the
+Mean Time to Identification (MTTI) for network issues.
+
+### **5.4 The Complete Grafana Dashboard JSON Model** {#the-complete-grafana-dashboard-json-model}
+
+To facilitate easy deployment, the entire dashboard is defined in a
+single JSON model. This model can be imported directly into any Grafana
+instance.
+
+> JSON
+
+{
+\"\_\_inputs\":,
+\"\_\_requires\": \[
+{
+\"type\": \"grafana\",
+\"id\": \"grafana\",
+\"name\": \"Grafana\",
+\"version\": \"8.0.0\"
+},
+{
+\"type\": \"datasource\",
+\"id\": \"prometheus\",
+\"name\": \"Prometheus\",
+\"version\": \"1.0.0\"
+}
+\],
+\"annotations\": {
+\"list\": \[
+{
+\"builtIn\": 1,
+\"datasource\": {
+\"type\": \"grafana\",
+\"uid\": \"\-- Grafana \--\"
+},
+\"enable\": true,
+\"hide\": true,
+\"iconColor\": \"rgba(0, 211, 255, 1)\",
+\"name\": \"Annotations & Alerts\",
+\"type\": \"dashboard\"
+}
+\]
+},
+\"editable\": true,
+\"fiscalYearStartMonth\": 0,
+\"gnetId\": null,
+\"graphTooltip\": 0,
+\"id\": null,
+\"links\":,
+\"panels\":)\",
+\"format\": \"heatmap\",
+\"legendFormat\": \"{{source_node}} -\> {{destination_node}}\",
+\"refId\": \"A\"
+}
+\],
+\"cards\": { \"cardPadding\": null, \"cardRound\": null },
+\"color\": {
+\"mode\": \"spectrum\",
+\"scheme\": \"red-yellow-green\",
+\"exponent\": 0.5,
+\"reverse\": false
+},
+\"dataFormat\": \"tsbuckets\",
+\"yAxis\": { \"show\": true, \"format\": \"short\" },
+\"xAxis\": { \"show\": true }
+},
+{
+\"title\": \"Bandwidth Over Time (Source: \$source_node, Dest:
+\$destination_node)\",
+\"type\": \"timeseries\",
+\"datasource\": {
+\"type\": \"prometheus\",
+\"uid\": \"prometheus\"
+},
+\"gridPos\": { \"h\": 8, \"w\": 12, \"x\": 0, \"y\": 9 },
+\"targets\":,
+\"fieldConfig\": {
+\"defaults\": {
+\"unit\": \"mbps\"
+}
+}
+},
+{
+\"title\": \"Jitter Over Time (Source: \$source_node, Dest:
+\$destination_node)\",
+\"type\": \"timeseries\",
+\"datasource\": {
+\"type\": \"prometheus\",
+\"uid\": \"prometheus\"
+},
+\"gridPos\": { \"h\": 8, \"w\": 12, \"x\": 12, \"y\": 9 },
+\"targets\": \[
+{
+\"expr\": \"iperf_network_jitter_ms{source_node=\\\$source_node\\,
+destination_node=\\\$destination_node\\, protocol=\\udp\\}\",
+\"legendFormat\": \"Jitter\",
+\"refId\": \"A\"
+}
+\],
+\"fieldConfig\": {
+\"defaults\": {
+\"unit\": \"ms\"
+}
+}
+}
+\],
+\"refresh\": \"30s\",
+\"schemaVersion\": 36,
+\"style\": \"dark\",
+\"tags\": \[\"iperf3\", \"network\", \"kubernetes\"\],
+\"templating\": {
+\"list\": \[
+{
+\"current\": {},
+\"datasource\": {
+\"type\": \"prometheus\",
+\"uid\": \"prometheus\"
+},
+\"definition\": \"label_values(iperf_network_bandwidth_mbps,
+source_node)\",
+\"hide\": 0,
+\"includeAll\": false,
+\"multi\": false,
+\"name\": \"source_node\",
+\"options\":,
+\"query\": \"label_values(iperf_network_bandwidth_mbps,
+source_node)\",
+\"refresh\": 1,
+\"regex\": \"\",
+\"skipUrlSync\": false,
+\"sort\": 1,
+\"type\": \"query\"
+},
+{
+\"current\": {},
+\"datasource\": {
+\"type\": \"prometheus\",
+\"uid\": \"prometheus\"
+},
+\"definition\":
+\"label_values(iperf_network_bandwidth_mbps{source_node=\\\$source_node\\},
+destination_node)\",
+\"hide\": 0,
+\"includeAll\": false,
+\"multi\": false,
+\"name\": \"destination_node\",
+\"options\":,
+\"query\":
+\"label_values(iperf_network_bandwidth_mbps{source_node=\\\$source_node\\},
+destination_node)\",
+\"refresh\": 1,
+\"regex\": \"\",
+\"skipUrlSync\": false,
+\"sort\": 1,
+\"type\": \"query\"
+},
+{
+\"current\": { \"selected\": true, \"text\": \"tcp\", \"value\": \"tcp\"
+},
+\"hide\": 0,
+\"includeAll\": false,
+\"multi\": false,
+\"name\": \"protocol\",
+\"options\": \[
+{ \"selected\": true, \"text\": \"tcp\", \"value\": \"tcp\" },
+{ \"selected\": false, \"text\": \"udp\", \"value\": \"udp\" }
+\],
+\"query\": \"tcp,udp\",
+\"skipUrlSync\": false,
+\"type\": \"custom\"
+}
+\]
+},
+\"time\": {
+\"from\": \"now-1h\",
+\"to\": \"now\"
+},
+\"timepicker\": {},
+\"timezone\": \"browser\",
+\"title\": \"Kubernetes iperf3 Network Performance\",
+\"uid\": \"k8s-iperf3-dashboard\",
+\"version\": 1,
+\"weekStart\": \"\"
+}
+
+## **Section 6: GitHub Repository Structure and CI/CD Workflow**
+
+To deliver this monitoring service as a professional, open-source-ready
+project, it is essential to package it within a well-structured GitHub
+repository and implement a robust Continuous Integration and Continuous
+Deployment (CI/CD) pipeline. This automates the build, test, and release
+process, ensuring that every version of the software is consistent,
+trustworthy, and easy for consumers to adopt.
+
+### **6.1 Recommended Repository Structure** {#recommended-repository-structure}
+
+A clean, logical directory structure is fundamental for project
+maintainability and ease of navigation for contributors and users.
+
+.
+├──.github/
+│ └── workflows/
+│ └── release.yml \# GitHub Actions workflow for CI/CD
+├── charts/
+│ └── iperf3-monitor/ \# The Helm chart for the service
+│ ├── Chart.yaml
+│ ├── values.yaml
+│ └── templates/
+│ └──\...
+└── exporter/
+├── Dockerfile \# Dockerfile for the exporter
+├── requirements.txt \# Python dependencies
+└── exporter.py \# Exporter source code
+├──.gitignore
+├── LICENSE
+└── README.md
+
+This structure cleanly separates the exporter application code
+(/exporter) from its deployment packaging (/charts/iperf3-monitor), and
+its release automation (/.github/workflows).
+
+### **6.2 CI/CD Pipeline with GitHub Actions** {#cicd-pipeline-with-github-actions}
+
+A fully automated CI/CD pipeline is the hallmark of a mature software
+project. It eliminates manual, error-prone release steps and provides
+strong guarantees about the integrity of the published artifacts. By
+triggering the pipeline on the creation of a Git tag (e.g., v1.2.3), we
+use the tag as a single source of truth for versioning both the Docker
+image and the Helm chart. This ensures that chart version 1.2.3 is built
+to use image version 1.2.3, and that both have been validated before
+release. This automated, atomic release process provides trust and
+velocity, elevating the project from a collection of files into a
+reliable, distributable piece of software.
+
+The following GitHub Actions workflow automates the entire release
+process:
+
+> YAML
+
+\#.github/workflows/release.yml
+name: Release iperf3-monitor
+
+on:
+push:
+tags:
+- \'v\*.\*.\*\'
+
+env:
+REGISTRY: ghcr.io
+IMAGE_NAME: \${{ github.repository }}
+
+jobs:
+lint-and-test:
+name: Lint and Test
+runs-on: ubuntu-latest
+steps:
+- name: Check out code
+uses: actions/checkout@v3
+
+- name: Set up Helm
+uses: azure/setup-helm@v3
+with:
+version: v3.10.0
+
+- name: Helm Lint
+run: helm lint./charts/iperf3-monitor
+
+build-and-publish-image:
+name: Build and Publish Docker Image
+runs-on: ubuntu-latest
+needs: lint-and-test
+permissions:
+contents: read
+packages: write
+steps:
+- name: Check out code
+uses: actions/checkout@v3
+
+- name: Log in to GitHub Container Registry
+uses: docker/login-action@v2
+with:
+registry: \${{ env.REGISTRY }}
+username: \${{ github.actor }}
+password: \${{ secrets.GITHUB_TOKEN }}
+
+- name: Extract metadata (tags, labels) for Docker
+id: meta
+uses: docker/metadata-action@v4
+with:
+images: \${{ env.REGISTRY }}/\${{ env.IMAGE_NAME }}
+
+- name: Build and push Docker image
+uses: docker/build-push-action@v4
+with:
+context:./exporter
+push: true
+tags: \${{ steps.meta.outputs.tags }}
+labels: \${{ steps.meta.outputs.labels }}
+
+package-and-publish-chart:
+name: Package and Publish Helm Chart
+runs-on: ubuntu-latest
+needs: build-and-publish-image
+permissions:
+contents: write
+steps:
+- name: Check out code
+uses: actions/checkout@v3
+with:
+fetch-depth: 0
+
+- name: Set up Helm
+uses: azure/setup-helm@v3
+with:
+version: v3.10.0
+
+- name: Set Chart Version
+run: \|
+VERSION=\$(echo \"\${{ github.ref_name }}\" \| sed \'s/\^v//\')
+helm-docs \--sort-values-order file
+yq e -i \'.version =
+strenv(VERSION)\'./charts/iperf3-monitor/Chart.yaml
+yq e -i \'.appVersion =
+strenv(VERSION)\'./charts/iperf3-monitor/Chart.yaml
+
+- name: Publish Helm chart
+uses: stefanprodan/helm-gh-pages@v1.6.0
+with:
+token: \${{ secrets.GITHUB_TOKEN }}
+charts_dir:./charts
+charts_url: https://\${{ github.repository_owner }}.github.io/\${{
+github.event.repository.name }}
+
+### **6.3 Documentation and Usability** {#documentation-and-usability}
+
+The final, and arguably most critical, component for project success is
+high-quality documentation. The README.md file at the root of the
+repository is the primary entry point for any user. It should clearly
+explain what the project does, its architecture, and how to deploy and
+use it.
+
+A common failure point in software projects is documentation that falls
+out of sync with the code. For Helm charts, the values.yaml file
+frequently changes, adding new parameters and options. To combat this,
+it is a best practice to automate the documentation of these parameters.
+The helm-docs tool can be integrated directly into the CI/CD pipeline to
+automatically generate the \"Parameters\" section of the README.md by
+parsing the comments directly from the values.yaml file.^20^ This
+ensures that the documentation is always an accurate reflection of the
+chart\'s configurable options, providing a seamless and trustworthy
+experience for users.
+
+## **Conclusion**
+
+The proliferation of distributed microservices on Kubernetes has made
+network performance a critical, yet often opaque, component of overall
+application health. This report has detailed a comprehensive,
+production-grade solution for establishing continuous network validation
+within a Kubernetes cluster. By architecting a system around the robust,
+decoupled pattern of an iperf3-server DaemonSet and a Kubernetes-aware
+iperf3-exporter Deployment, this service provides a resilient and
+automated foundation for network observability.
+
+The implementation leverages industry-standard tools---Python for the
+exporter, Prometheus for metrics storage, and Grafana for
+visualization---to create a powerful and flexible monitoring pipeline.
+The entire service is packaged into a professional Helm chart, following
+best practices for templating, configuration, and adaptability. This
+allows for simple, version-controlled deployment across a wide range of
+environments. The final Grafana dashboard transforms the collected data
+into an intuitive, interactive narrative, enabling engineers to move
+swiftly from high-level anomaly detection to root-cause analysis.
+
+Ultimately, by treating network performance not as a given but as a
+continuously measured metric, organizations can proactively identify and
+resolve infrastructure bottlenecks, enhance application reliability, and
+ensure a consistent, high-quality experience for their users in the
+dynamic world of Kubernetes.