{"id":303,"date":"2026-04-15T03:40:26","date_gmt":"2026-04-15T03:40:26","guid":{"rendered":"https:\/\/blog.ngocha.biz\/?p=303"},"modified":"2026-04-15T03:40:26","modified_gmt":"2026-04-15T03:40:26","slug":"vpa-in-place-pod-resize","status":"publish","type":"post","link":"https:\/\/blog.ngocha.biz\/?p=303","title":{"rendered":"How to Use In-Place Pod Resize with VPA in Kubernetes"},"content":{"rendered":"<p>In this blog post, you will learn about the Kubernetes in-place pod resize feature and how to implement it with Vertical Pod Autoscaler (VPA).<\/p>\n<p>By the end, you will know:<\/p>\n<ul>\n<li>What in-place pod resize is.<\/li>\n<li>How it works behind the scenes.<\/li>\n<li>Why we need VPA for in-place pod resize.<\/li>\n<li>How to deploy and test in-place pod resize.<\/li>\n<li>Downsizing considerations and more.<\/li>\n<\/ul>\n<p>Let&#8217;s get started.<\/p>\n<h2 id=\"what-is-in-place-pod-resize\">What is In-Place Pod Resize?<\/h2>\n<div class=\"kg-card kg-callout-card kg-callout-card-blue\">\n<div class=\"kg-callout-emoji\">\ud83d\udccc<\/div>\n<div class=\"kg-callout-text\"><b><strong style=\"white-space: pre-wrap;\">Note:<\/strong><\/b> This feature is stable as of Kubernetes version 1.35.<\/div>\n<\/div>\n<p>Previously, in <a href=\"https:\/\/devopscube.com\/kubernetes-tutorials-beginners\/\" rel=\"noreferrer\">Kubernetes<\/a>, it was not possible to change the CPU or memory of a running Pod without a restart.<\/p>\n<p>The <strong>In-Place Pod Resize<\/strong> feature solves that problem.<\/p>\n<p>It allows you to change CPU and memory requests and limits on a running pod <strong>without deleting or recreating it.<\/strong><\/p>\n<p>When a resize is needed, the kubelet updates the container&#8217;s <a href=\"https:\/\/devopscube.com\/linux-capabilities\/\" rel=\"noreferrer\">Linux<\/a> cgroup directly, without deleting the pod or restarting the container, unless you explicitly request a restart.<\/p>\n<p>Here is an example that resizes an nginx pod 
without restarting it.<\/p>\n<pre><code class=\"language-bash\">kubectl patch pod nginx-demo \\\n  --subresource resize \\\n  --type merge \\\n  -p '{\n    \"spec\": {\n      \"containers\": [{\n        \"name\": \"nginx\",\n        \"resources\": {\n          \"requests\": { \"cpu\": \"500m\" },\n          \"limits\": { \"cpu\": \"500m\" }\n        }\n      }]\n    }\n  }'<\/code><\/pre>\n<p>In the above example, the <strong><code>--subresource resize<\/code><\/strong> flag is the key part. It tells Kubernetes that the request <strong>updates only the resizable fields<\/strong> (CPU and memory) of the pod.<\/p>\n<p>Think of the <code>\/resize<\/code> subresource as an instruction that tells the cluster to adjust the running container&#8217;s resources without terminating the process.<\/p>\n<p>But there is a problem.<\/p>\n<p>You can do this only at the pod level. <strong>You cannot do it for a Deployment<\/strong>. If you update the resources in the Deployment&#8217;s pod template, Kubernetes recreates the Pods. That is why you <strong>need to use VPA<\/strong> to perform in-place resource updates for a Deployment&#8217;s pods.<\/p>\n<h2 id=\"in-place-pod-resize-with-vpa\">In-Place Pod Resize With VPA<\/h2>\n<p>The following image illustrates how VPA In-Place Pod Resize works behind the scenes.<\/p>\n<figure class=\"kg-card kg-image-card\"><img decoding=\"async\" src=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2026\/04\/image-44.png\" class=\"kg-image\" alt=\"\" loading=\"lazy\" width=\"1517\" height=\"1539\" srcset=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w600\/2026\/04\/image-44.png 600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w1000\/2026\/04\/image-44.png 1000w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2026\/04\/image-44.png 1517w\" sizes=\"auto, (min-width: 720px) 720px\"><\/figure>\n<p>Here is 
how the components work together to resize your pod:<\/p>\n<ul>\n<li>Everything starts with the metrics server. It continuously tracks the CPU and memory usage of your pods.<\/li>\n<li>The <strong>VPA Recommender<\/strong> analyzes these metrics. If it detects that a pod is over- or under-provisioned, it calculates a new target resource value.<\/li>\n<li>Instead of evicting the pod, the <strong>VPA Updater<\/strong> sends a <strong>patch request<\/strong> to the Kubernetes API to update the pod&#8217;s resources via the <strong><code>\/resize<\/code><\/strong> subresource.<\/li>\n<li>Then, the <strong>Kubelet<\/strong> on the node receives the update and adjusts the container&#8217;s <strong>cgroups<\/strong> directly.<\/li>\n<li>As a result, the new resource values are applied to the pod without restarting it.<\/li>\n<\/ul>\n<h2 id=\"use-case\">Use Case<\/h2>\n<p>One limitation of VPA is that when it applies a new recommendation, it evicts the running pod and recreates it. A new pod with the updated resources comes up, and the old one is deleted.<\/p>\n<p>For many workloads, this is acceptable. But for stateful apps such as <a href=\"https:\/\/devopscube.com\/deploy-wordpress-on-kubernetes\/\" rel=\"noreferrer\">WordPress<\/a>, <a href=\"https:\/\/devopscube.com\/create-kubernetes-jobs-cron-jobs\/\" rel=\"noreferrer\">long-running jobs<\/a>, or anything sensitive to restarts, that eviction may cause issues.<\/p>\n<p>This is where <strong>VPA In-Place Pod Resize<\/strong> comes in. We can use the VPA <strong><code>InPlaceOrRecreate<\/code><\/strong> mode to take advantage of the Kubernetes In-Place Pod Resize feature. It updates the pods of the target deployment without restarting them.<\/p>\n<div class=\"kg-card kg-callout-card kg-callout-card-yellow\">\n<div class=\"kg-callout-emoji\">\u26a0\ufe0f<\/div>\n<div class=\"kg-callout-text\">What if the node doesn&#8217;t have the requested resources? 
<\/p>\n<p>In <code spellcheck=\"false\" style=\"white-space: pre-wrap;\">InPlaceOrRecreate<\/code> mode, if the node does not have enough CPU or memory to complete the resize request, VPA falls back to evicting the pod so that it can be rescheduled onto another node with enough capacity. If no such node is available, the pod remains pending.<\/div>\n<\/div>\n<h2 id=\"container-resizepolicy-must-know\">Container ResizePolicy (must know)<\/h2>\n<p>What if I want to restart the container after an in-place resize?<\/p>\n<p>For example, <strong>JVM-based apps<\/strong> read the maximum heap size once when they start, based on the container&#8217;s memory limit at that time. If you increase the cgroup memory limit without restarting, the JVM will not use the extra memory, because its heap size does not change.<\/p>\n<p>To use in-place resize effectively, you can add a <strong><code>resizePolicy<\/code><\/strong> field to each container in your pod spec. This tells the kubelet <strong>what to do when a specific resource changes.<\/strong><\/p>\n<p>This policy allows the <strong>container to be restarted without recreating the pod.<\/strong><\/p>\n<p>There are two options in <strong><code>resizePolicy<\/code><\/strong>:<\/p>\n<ol>\n<li><strong>NotRequired<\/strong> &#8211; The cgroup is updated while the container keeps running.<\/li>\n<li><strong>RestartContainer<\/strong> &#8211; The container is restarted to apply the resource change.<\/li>\n<\/ol>\n<div class=\"kg-card kg-callout-card kg-callout-card-blue\">\n<div class=\"kg-callout-emoji\">\ud83d\udca1<\/div>\n<div class=\"kg-callout-text\">The default policy is <code spellcheck=\"false\" style=\"white-space: pre-wrap;\">NotRequired<\/code><\/div>\n<\/div>\n<p>You can also combine both options in the same container, using a different one for each resource. 
The example below shows the <code>resizePolicy<\/code> block.<\/p>\n<pre><code class=\"language-yaml\">resizePolicy:\n  - resourceName: cpu\n    restartPolicy: NotRequired\n  - resourceName: memory\n    restartPolicy: RestartContainer<\/code><\/pre>\n<p>This causes a <strong>container restart when memory changes<\/strong>, but not when CPU changes.<\/p>\n<p>Now let&#8217;s move on to the hands-on part and try VPA in-place resizing in practice.<\/p>\n<h2 id=\"prerequisites\">Prerequisites<\/h2>\n<p>Make sure you have the following prerequisites before moving forward.<\/p>\n<ul>\n<li><a href=\"https:\/\/devopscube.com\/setup-kubernetes-cluster-kubeadm\/\">Kubernetes cluster<\/a> (Version 1.35)<\/li>\n<li><a href=\"https:\/\/devopscube.com\/kubectl-set-context\/\" rel=\"noreferrer\">kubectl<\/a><\/li>\n<\/ul>\n<div class=\"kg-card kg-callout-card kg-callout-card-blue\">\n<div class=\"kg-callout-emoji\">\ud83d\udca1<\/div>\n<div class=\"kg-callout-text\">For in-place resize to work, your cluster nodes should use cgroup v2.<\/div>\n<\/div>\n<p>To verify the cgroup version, SSH into a node and run the following command.<\/p>\n<pre><code class=\"language-bash\">stat -fc %T \/sys\/fs\/cgroup\/\n<\/code><\/pre>\n<p>The output should be <code>cgroup2fs<\/code>. If it says <code>tmpfs<\/code>, your nodes are on cgroup v1, and in-place resize will not work.<\/p>\n<h2 id=\"install-metrics-server\">Install Metrics Server<\/h2>\n<p>Before installing VPA, you need to install the metrics server. VPA gets the resource usage of pods from the metrics server. 
(Skip this step if the metrics server is already running in your cluster.)<\/p>\n<p>Use the following command to install it.<\/p>\n<pre><code class=\"language-bash\">kubectl apply -f https:\/\/raw.githubusercontent.com\/techiescamp\/kubeadm-scripts\/main\/manifests\/metrics-server.yaml\n<\/code><\/pre>\n<p>Then use the following command to check that the metrics server pod is up and running.<\/p>\n<pre><code>$ kubectl get pods -n kube-system | grep metrics-server\n\nmetrics-server-6dc6795f96-8jj5w     1\/1     Running   0             3m32s\n<\/code><\/pre>\n<h2 id=\"install-vpa\">Install VPA<\/h2>\n<p>Use the following commands to install VPA with the in-place feature enabled.<\/p>\n<pre><code class=\"language-bash\">git clone https:\/\/github.com\/kubernetes\/autoscaler.git\n\ncd autoscaler\/vertical-pod-autoscaler\/\n\n.\/hack\/vpa-up.sh\n<\/code><\/pre>\n<p>Then use the following command to make sure all the VPA pods are running.<\/p>\n<pre><code class=\"language-bash\">kubectl get pods -n kube-system | grep vpa\n<\/code><\/pre>\n<p>You should see the&nbsp;<strong><code>vpa-recommender<\/code><\/strong>,&nbsp;<strong><code>vpa-updater<\/code><\/strong>, and&nbsp;<strong><code>vpa-admission-controller<\/code><\/strong>&nbsp;pods all in the&nbsp;<strong><code>Running<\/code><\/strong>&nbsp;state.<\/p>\n<pre><code class=\"language-bash\">vpa-admission-controller-786f8fd784-nbqd7  1\/1   Running   0   14s\nvpa-recommender-797589f6c9-qhf66           1\/1   Running   0   16s\nvpa-updater-579cc98d57-6jwd7               1\/1   Running   0   16s<\/code><\/pre>\n<h2 id=\"validating-in-place-resize\">Validating In-Place Resize<\/h2>\n<p>To test In-Place Resize, let&#8217;s create a deployment with an <a href=\"https:\/\/devopscube.com\/setup-ingress-kubernetes-nginx-controller\/\" rel=\"noreferrer\">Nginx<\/a> image and create a VPA object for the deployment. 
<\/p>\n<h3 id=\"create-a-deployment-with-resizepolicy\">Create a Deployment with resizePolicy<\/h3>\n<p>Let&#8217;s create a simple nginx application with small CPU and memory requests and limits (<code>50m<\/code> and <code>64Mi<\/code>) and a <code>resizePolicy<\/code> of <code>NotRequired<\/code> for both resources.<\/p>\n<p>Copy the following content and execute it in the terminal.<\/p>\n<pre><code class=\"language-yaml\">kubectl apply -f - &lt;&lt;EOF\napiVersion: apps\/v1\nkind: Deployment\nmetadata:\n  name: nginx\nspec:\n  replicas: 1\n  selector:\n    matchLabels:\n      app: nginx-resize\n  template:\n    metadata:\n      labels:\n        app: nginx-resize\n    spec:\n      containers:\n        - name: nginx\n          image: nginx:latest\n          resources:\n            requests:\n              cpu: \"50m\"\n              memory: \"64Mi\"\n            limits:\n              cpu: \"50m\"\n              memory: \"64Mi\"\n          resizePolicy:\n            - resourceName: cpu\n              restartPolicy: NotRequired\n            - resourceName: memory\n              restartPolicy: NotRequired\nEOF<\/code><\/pre>\n<div class=\"kg-card kg-callout-card kg-callout-card-blue\">\n<div class=\"kg-callout-emoji\">\u26a0\ufe0f<\/div>\n<div class=\"kg-callout-text\"><b><strong style=\"white-space: pre-wrap;\">Important Note:<\/strong><\/b> If you don&#8217;t set initial requests or limits, Kubernetes assigns your pod the <b><strong style=\"white-space: pre-wrap;\">BestEffort<\/strong><\/b> QoS class.<\/p>\n<p>When the VPA later adds a request, it tries to change that class to <b><strong style=\"white-space: pre-wrap;\">Burstable<\/strong><\/b>. 
Since a <a href=\"https:\/\/newsletter.devopscube.com\/p\/pod-qos?ref=devopscube.com\" rel=\"noreferrer\">Pod\u2019s QoS class<\/a> is permanent and cannot be changed while it\u2019s running, the cluster is forced to delete and recreate the pod instead of just resizing it.<\/div>\n<\/div>\n<p>Now, verify that the pod is running.<\/p>\n<pre><code class=\"language-bash\">kubectl get pods -l app=nginx-resize<\/code><\/pre>\n<p>You will get the following output.<\/p>\n<pre><code class=\"language-bash\">NAME                      READY   STATUS    RESTARTS   AGE\nnginx-77d65d859c-h8v8     1\/1     Running   0          45s<\/code><\/pre>\n<p>Use the following command to check the resource allocation of the pod.<\/p>\n<pre><code class=\"language-bash\">kubectl get pod -l app=nginx-resize \\\n  -o jsonpath='{.items[0].spec.containers[0].resources}' | jq<\/code><\/pre>\n<p>You will get the CPU and memory requests and limits you specified.<\/p>\n<pre><code class=\"language-json\">{\n  \"limits\": {\n    \"cpu\": \"50m\",\n    \"memory\": \"64Mi\"\n  },\n  \"requests\": {\n    \"cpu\": \"50m\",\n    \"memory\": \"64Mi\"\n  }\n}<\/code><\/pre>\n<h3 id=\"create-a-vpa-object\">Create a VPA Object<\/h3>\n<p>Let&#8217;s create a VPA object targeting the nginx <a href=\"https:\/\/devopscube.com\/kubernetes-deployment-tutorial\/\" rel=\"noreferrer\">deployment<\/a> with <strong><code>InPlaceOrRecreate<\/code><\/strong> mode.<\/p>\n<p>In the VPA, we set the minimum CPU to <strong><code>100m<\/code><\/strong> and the minimum memory to <strong><code>200Mi<\/code><\/strong>.<\/p>\n<p>Copy the following manifest and apply it in the terminal.<\/p>\n<pre><code class=\"language-yaml\">kubectl apply -f - &lt;&lt;EOF\napiVersion: autoscaling.k8s.io\/v1\nkind: VerticalPodAutoscaler\nmetadata:\n  name: web-vpa\nspec:\n  targetRef:\n    apiVersion: \"apps\/v1\"\n    kind: Deployment\n    name: nginx\n  updatePolicy:\n    updateMode: \"InPlaceOrRecreate\"\n    minReplicas: 1\n  resourcePolicy:\n    containerPolicies:\n    
  - containerName: \"nginx\"\n        minAllowed:\n          cpu: \"100m\"\n          memory: \"200Mi\"\n        maxAllowed:\n          cpu: \"500m\"\n          memory: \"500Mi\"\n        controlledResources: [\"cpu\", \"memory\"]\nEOF<\/code><\/pre>\n<p>Now, run the following command to verify that the VPA object is created. It will take a couple of minutes for the Recommender to collect metrics and update the resources.<\/p>\n<pre><code class=\"language-bash\">kubectl get vpa --watch<\/code><\/pre>\n<p>You can see the resources updated from <strong>50m to 100m<\/strong> CPU and <strong>64Mi to 250Mi<\/strong> memory.<\/p>\n<pre><code class=\"language-bash\">$ kubectl get vpa --watch\n\nNAME      MODE                CPU    MEM     PROVIDED   AGE\nweb-vpa   InPlaceOrRecreate   50m    64Mi    True       7m18s\nweb-vpa   InPlaceOrRecreate   100m   250Mi   True       7m42s<\/code><\/pre>\n<p>Even though the minimum was 200Mi, the <strong>VPA set memory to 250Mi<\/strong>, because that is what the Recommender calculated the app requires.<\/p>\n<p>Now use the following command to check the resource allocation of the pod again.<\/p>\n<pre><code class=\"language-bash\">kubectl get pod -l app=nginx-resize \\\n  -o jsonpath='{.items[0].spec.containers[0].resources}' | jq<\/code><\/pre>\n<p>You can see the resource requests and limits have been updated as per the VPA values.<\/p>\n<pre><code class=\"language-json\">{\n  \"limits\": {\n    \"cpu\": \"100m\",\n    \"memory\": \"250Mi\"\n  },\n  \"requests\": {\n    \"cpu\": \"100m\",\n    \"memory\": \"250Mi\"\n  }\n}<\/code><\/pre>\n<p>If you check the pod, you can see that the resources changed without the pod being evicted.<\/p>\n<h2 id=\"downsizing-pods\">Downsizing Pods<\/h2>\n<p>There may be cases where you want to downsize the resources of your pods. Here are the things to consider.<\/p>\n<ol>\n<li><strong>CPU downsizing:<\/strong> CPU is compressible, so the kernel throttles the process without killing it.<\/li>\n<li><strong>Memory downsizing:<\/strong> Memory is non-compressible. 
If the container is currently using more memory than the new limit, it can end up being killed with an OOMKilled error.<\/li>\n<\/ol>\n<p>For memory <strong>downsizing<\/strong>, consider using <code>resizePolicy<\/code> to restart the container on memory changes.<\/p>\n<h2 id=\"vpa-best-practices\">VPA Best Practices<\/h2>\n<p>Following are some of the best practices when using in-place pod resizing.<\/p>\n<ol>\n<li>Always use <code>Off<\/code> mode in production. Otherwise, your pods&#8217; resources change automatically with usage, which can even lead to eviction.<\/li>\n<li>Always set <code>minAllowed<\/code> and <code>maxAllowed<\/code>. Without <code>maxAllowed<\/code>, VPA could recommend values that exhaust node capacity.<\/li>\n<li>Use <code>RestartContainer<\/code> for memory on JVM apps. Without a restart, the JVM heap ceiling does not change even if the cgroup limit does.<\/li>\n<\/ol>\n<h2 id=\"clean-up\">Clean Up<\/h2>\n<p>If the setup is no longer needed, clean it up as follows.<\/p>\n<p>First, run the following commands to delete the deployment and VPA object we created.<\/p>\n<pre><code class=\"language-bash\">kubectl delete vpa web-vpa\nkubectl delete deploy nginx<\/code><\/pre>\n<p>Then, run the following command from the directory where you cloned the autoscaler repository. It removes the VPA installation.<\/p>\n<pre><code class=\"language-bash\">.\/autoscaler\/vertical-pod-autoscaler\/hack\/vpa-down.sh<\/code><\/pre>\n<h2 id=\"conclusion\">Conclusion<\/h2>\n<p>VPA In-Place Pod Resize lets your pods get the right resources without being evicted.<\/p>\n<p>With <code>InPlaceOrRecreate<\/code>, VPA applies its recommendations by patching the running pod directly. The pod UID stays the same, the container keeps running, and the only evidence of the change is the updated resource values.<\/p>\n<p>Keep in mind that <code>InPlaceOrRecreate<\/code> mode does not guarantee that the pod&#8217;s resources always change without a restart. 
If an in-place resize is not possible, VPA falls back to evicting and recreating the pod.<\/p>\n<hr>\n<p><strong>Source:<\/strong> <a href=\"https:\/\/devopscube.com\/vpa-in-place-pod-resize\/\" target=\"_blank\" rel=\"noopener noreferrer\">How to Use In-Place Pod Resize with VPA in Kubernetes \u2014 DevOpsCube<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Source: https:\/\/devopscube.com\/vpa-in-place-pod-resize\/<\/p>\n","protected":false},"author":1,"featured_media":304,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-303","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-devops"],"_links":{"self":[{"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=\/wp\/v2\/posts\/303","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=303"}],"version-history":[{"count":0,"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=\/wp\/v2\/posts\/303\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=\/wp\/v2\/media\/304"}],"wp:attachment":[{"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=303"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=303"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=303"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}