Vertical scaling means changing the resources allocated to a single instance of an application (more or less CPU/memory), which is why C is correct. In Kubernetes terms, this corresponds to adjusting a container's resource requests and limits for CPU and memory. Increasing them gives each Pod more compute or memory headroom to handle load; decreasing them can reduce cost and improve cluster packing efficiency.
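As a minimal sketch of what that looks like in practice (the workload name, image, and values here are illustrative, not from the question), vertical scaling means editing fields like these on a Deployment's container spec:

```yaml
# Illustrative Deployment: raising or lowering the requests/limits below
# is vertical scaling for this workload (names and values are hypothetical).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                 # hypothetical workload name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27       # illustrative image
          resources:
            requests:
              cpu: "250m"         # scheduler reserves this much CPU per Pod
              memory: "256Mi"     # scheduler reserves this much memory per Pod
            limits:
              cpu: "500m"         # container is throttled above this
              memory: "512Mi"     # container is OOM-killed above this
```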
This differs from horizontal scaling, which changes the number of instances (replicas). Option D describes horizontal scaling: adding or removing replicas of the same workload, typically managed by a Deployment and often automated via the Horizontal Pod Autoscaler (HPA). Option B describes scaling the infrastructure layer (nodes), which is cluster/node autoscaling (the Cluster Autoscaler in cloud environments). Option A is not a standard scaling definition.
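For contrast, a hedged sketch of the horizontal case: an HPA targeting the hypothetical `web` Deployment above, adjusting its replica count based on average CPU utilization (the thresholds are illustrative):

```yaml
# Illustrative HPA: this changes HOW MANY Pods run (horizontal scaling),
# not how big each Pod is. Names and thresholds are hypothetical.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web              # hypothetical target from the earlier sketch
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add replicas when avg CPU use exceeds 70% of requests
```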
In practice, vertical scaling in Kubernetes can be manual (edit the Deployment's resource requests/limits) or automated using the Vertical Pod Autoscaler (VPA), which can recommend or apply new requests based on observed usage. A key nuance is that changing requests/limits typically requires Pod restarts to take effect, so vertical scaling is less “instant” than HPA and can disrupt workloads if not planned. That’s why many production teams prefer horizontal scaling for traffic-driven workloads and use vertical scaling to right-size baseline resources or address memory- or CPU-bound behavior.
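A sketch of the automated path, assuming the VPA add-on is installed in the cluster (it is not part of core Kubernetes); the target name and mode are again illustrative:

```yaml
# Illustrative VPA (requires the VPA add-on and its CRDs).
# updateMode "Off" only records recommendations; "Auto" applies them.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web            # hypothetical target from the earlier sketch
  updatePolicy:
    updateMode: "Off"    # recommend only; "Auto" lets VPA resize Pods (with restarts)
```

Note that in `Auto` mode the VPA applies new requests by evicting and recreating Pods, which is exactly the restart disruption noted above, and combining VPA with an HPA driven by the same CPU/memory metric on one workload is generally discouraged.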
From a cloud-native architecture standpoint, understanding vertical vs horizontal scaling helps you design for elasticity: use vertical scaling to tune per-instance capacity; use horizontal scaling for resilience and throughput; and combine with node autoscaling to ensure the cluster has sufficient capacity. The definition the question is testing is simple: vertical scaling = change resources per application instance, which is option C.