<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[AUTOMATESTACK]]></title><description><![CDATA[AUTOMATESTACK]]></description><link>https://automatestack.dev</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1757075915677/62adfc81-7f22-4d75-92d4-ee1646ab4acd.png</url><title>AUTOMATESTACK</title><link>https://automatestack.dev</link></image><generator>RSS for Node</generator><lastBuildDate>Fri, 17 Apr 2026 13:51:18 GMT</lastBuildDate><atom:link href="https://automatestack.dev/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Reclaiming Flow: A Guide to Sustainable Productivity and Digital Sanity]]></title><description><![CDATA[If you looked at my calendar three years ago, you would have seen a mosaic of 30-minute meeting fragments, scattered Jira tickets, and "quick syncs" that effectively destroyed any chance of deep work. 
Like many in the tech industry, I wore my busynes...]]></description><link>https://automatestack.dev/reclaiming-flow-a-guide-to-sustainable-productivity-and-digital-sanity</link><guid isPermaLink="true">https://automatestack.dev/reclaiming-flow-a-guide-to-sustainable-productivity-and-digital-sanity</guid><category><![CDATA[digitalzen]]></category><category><![CDATA[app blocker]]></category><category><![CDATA[Productivity]]></category><category><![CDATA[#digitaldetox]]></category><dc:creator><![CDATA[Sumit Sur]]></dc:creator><pubDate>Thu, 18 Dec 2025 14:46:21 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1766068640986/a59d5b7f-23e1-4713-ba7a-d68dcaf1e606.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you looked at my calendar three years ago, you would have seen a mosaic of 30-minute meeting fragments, scattered Jira tickets, and "quick syncs" that effectively destroyed any chance of deep work. Like many in the tech industry, I wore my busyness like a badge of honor. I assumed that responding to Slack messages within 30 seconds was the definition of being "responsive" and "reliable."</p>
<p>The reality, however, was much grimmer. I was technically "working" 10 hours a day, but my actual output was suffering. Worse, my mental well-being was deteriorating under the weight of constant context switching.</p>
<p>Today, my setup looks very different. It is not just about having a faster processor or an ergonomic chair; it is about a philosophy of <em>defensive resource management</em>. Here is the honest breakdown of how I rebuilt my productivity workflow to prioritize focus over frenzy.</p>
<h2 id="heading-the-hardware-simplicity-as-a-feature">✨ The Hardware: Simplicity as a Feature</h2>
<p>I used to be obsessed with having the maximum screen real estate—three monitors, a tablet for metrics, and a phone always propped up. I eventually realized that more pixels often just meant more vectors for distraction.</p>
<p>My current physical setup is intentionally reductive:</p>
<ul>
<li><p><strong>Single Wide Monitor:</strong> It allows for side-by-side code and documentation without the neck strain of twisting between screens.</p>
</li>
<li><p><strong>Noise-Canceling Headphones:</strong> These are non-negotiable. They are my physical "Do Not Disturb" sign.</p>
</li>
<li><p><strong>Mechanical Keyboard:</strong> The tactile feedback helps induce a rhythm in typing, which can actually help trigger a flow state for me.</p>
</li>
</ul>
<p>But hardware is the easy part. The real battle is software and psychology.</p>
<h2 id="heading-the-deep-work-protocol">🧠 The "Deep Work" Protocol</h2>
<p>The core of my productivity philosophy is Cal Newport’s concept of "Deep Work." As cloud engineers and creators, we need long, uninterrupted blocks of time to load complex contexts into working memory or troubleshoot a critical issue under time pressure. A single notification can topple that mental house of cards.</p>
<p>To protect these blocks, I implemented a strict 4-hour "<strong><em>Deep Focus</em></strong>" window every morning. During this time:</p>
<ol>
<li><p><strong>Async First:</strong> I close MS Teams and Email. My team knows that unless the server room is literally on fire, I am unavailable until 1:00 PM.</p>
</li>
<li><p><strong>Task Batching:</strong> I group all administrative minutiae (updating tickets, code reviews, emails) into a "shallow work" block in the late afternoon when my cognitive energy is naturally lower.</p>
</li>
</ol>
<h2 id="heading-the-missing-link-enforcing-boundaries">🚧 The Missing Link: Enforcing Boundaries</h2>
<p>Here is the truthful part that most productivity gurus skip: <strong>Willpower is a finite resource.</strong></p>
<p>In the beginning, I tried to simply "promise myself" I wouldn't check Reddit while a script ran or scroll Twitter when I hit a logic error. I failed constantly. The dopamine loop of social media is engineered by some of the smartest minds in our industry to be irresistible. I needed a tool that was stronger than my own wavering discipline.</p>
<p>This is where 👉 🧘‍♂️ <a href="https://www.digitalzen.app/" target="_blank">DigitalZen.app</a> 🧘‍♂️ became the cornerstone of my digital hygiene.</p>
<p>I had tried other blockers before, but they were either too easy to bypass or too clunky to configure. DigitalZen integrated seamlessly into my workflow because it doesn't just "block sites"; it helps curate an environment. I set it up to whitelist only my essential dev tools (GitHub, StackOverflow, documentation sites) and blacklist the infinite-scroll traps (social media, news aggregators) during my deep-focus window.</p>
<p>It goes beyond websites, too: I can set it to block the Steam client, Discord, or even my email client during deep-work hours. My distractions aren't just browser tabs; they are other applications running on my system.</p>
<p>The feature I love most is the out-of-the-box focus templates, which make it seamless to block specific categories of websites such as social media or adult content.</p>
<p>The difference was immediate. When my brain instinctively reached for a distraction during a difficult coding problem, the block page was a gentle but firm reminder: <em>Not now. Stick with the problem.</em> It outsourced my self-discipline, preserving my mental energy for the actual work.</p>
<p>The DigitalZen feature page does an excellent job of explaining the platform's capabilities. 👉 <a href="https://www.digitalzen.app/#how-it-works" target="_blank">DigitalZen.app How it works</a> </p>
<h2 id="heading-mental-well-being-the-art-of-disconnecting">🧘‍♂️ Mental Well-being: The Art of Disconnecting</h2>
<p>Productivity is meaningless if you burn out in six months. A major part of my new setup involves rigid boundaries between "Online" and "Offline."</p>
<p>The "Always On" culture is a fast track to anxiety. By using 🧘‍♂️ <a href="https://www.digitalzen.app/" target="_blank">DigitalZen.app</a> 🧘‍♂️ to lock me out of work apps after 7:00 PM, I force myself to decompress. It sounds paradoxical to use software to stop using software, but that hard stop allows me to engage in analog hobbies—reading, cooking, or just walking without a podcast playing.</p>
<p>It has anti-tamper features too. You can’t just kill the process or uninstall it when you get the urge to slack off. It forces you to stick to the schedule you set when you were in a rational state of mind.</p>
<p>This downtime is not "wasted" time; it is recovery time. It is during these quiet moments that my brain processes the day's information, often leading to solutions for bugs that baffled me hours earlier.</p>
<h2 id="heading-the-result">🎉 The Result</h2>
<p>Since adopting this distraction-free architecture, my output hasn't just increased in volume; it has increased in <em>value</em>. I ship cleaner code, write better documentation, and actually enjoy the process of building &amp; troubleshooting again.</p>
<p>If you are finding yourself drowning in digital noise, stop trying to "try harder." Build a system that protects your attention. Whether it’s optimizing your physical desk or using tools like 🧘‍♂️ <a href="https://www.digitalzen.app/" target="_blank">DigitalZen.app</a> 🧘‍♂️ to guard your focus, the goal is the same: effortless consistency in a chaotic world.</p>
<p>Productivity isn't about doing more things; it's about doing the <em>right</em> things with your full attention.</p>
]]></content:encoded></item><item><title><![CDATA[Understanding Pod Priority and Preemption in Kubernetes: A Detailed Guide]]></title><description><![CDATA[Introduction
In Kubernetes, Pod Priority and Preemption is a powerful scheduling feature that ensures critical workloads are placed and maintained on your cluster, even when resources are scarce. With this mechanism, Kubernetes can automatically pree...]]></description><link>https://automatestack.dev/understanding-pod-priority-and-preemption-in-kubernetes-a-detailed-guide</link><guid isPermaLink="true">https://automatestack.dev/understanding-pod-priority-and-preemption-in-kubernetes-a-detailed-guide</guid><category><![CDATA[kubernetes pod priority]]></category><category><![CDATA[kubernetes Preemption]]></category><category><![CDATA[k8s]]></category><category><![CDATA[#k8scluster]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[kubernetes architecture]]></category><category><![CDATA[#kubernetes #container ]]></category><category><![CDATA[Kubernetes CKA Preparation]]></category><dc:creator><![CDATA[Sumit Sur]]></dc:creator><pubDate>Tue, 09 Sep 2025 06:35:05 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1766082871149/6e7c81d4-495a-4211-8557-2395f1164ebb.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In Kubernetes, <strong>Pod Priority and Preemption</strong> is a powerful scheduling feature that ensures critical workloads are placed and maintained on your cluster, even when resources are scarce. With this mechanism, Kubernetes can automatically <strong>preempt</strong> (evict) lower-priority pods to make room for higher-priority ones, helping orchestrate resource-efficient and reliable workload execution. Generally available since Kubernetes v1.14, this feature has become a staple of cluster operations.</p>
<h2 id="heading-1-what-is-pod-priority">1. What Is Pod Priority?</h2>
<p><strong>Pod Priority</strong> is an integer value assigned to a Pod, representing its importance relative to others. Higher values indicate higher importance in scheduling decisions.</p>
<ul>
<li><p>Pods without an explicit priority use a default value of 0.</p>
</li>
<li><p>Priorities are defined through <code>PriorityClass</code> objects, which are non-namespaced resources that map a name to an integer priority.</p>
</li>
</ul>
<pre><code class="lang-mermaid">flowchart TD
    %% Nodes
    A([📦 Pod Scheduled]) --&gt; B{⚖️ Cluster Under Pressure?}

    B -- "Yes" --&gt; C[🔥 Node Pressure Eviction]
    C --&gt; D[💀 Pod terminated on node]
    D --&gt; E{🧑‍✈️ Controlled by ReplicaSet/Deployment?}
    E -- "Yes" --&gt; F[♻️ Controller creates new Pod]
    F --&gt; G[🚀 Scheduler places on another node]
    E -- "No" --&gt; H[❌ Pod stays deleted]

    B -- "No, but Higher Priority Pod Pending" --&gt; I[⬆️ Preemption Triggered]
    I --&gt; J[⚔️ Lower priority pods evicted]
    J --&gt; D

    B -- "No pressure &amp; no higher priority pod" --&gt; K[✅ Pod keeps running]

    %% Styling
    classDef start fill:#2E86C1,color:#fff,stroke:#1B4F72,stroke-width:2px;
    classDef decision fill:#F4D03F,color:#000,stroke:#B7950B,stroke-width:2px;
    classDef danger fill:#E74C3C,color:#fff,stroke:#922B21,stroke-width:2px;
    classDef dead fill:#6E2C00,color:#fff,stroke:#641E16,stroke-width:2px;
    classDef controller fill:#27AE60,color:#fff,stroke:#145A32,stroke-width:2px;
    classDef running fill:#1ABC9C,color:#fff,stroke:#0E6251,stroke-width:2px;

    %% Assign classes
    A:::start
    B:::decision
    C:::danger
    I:::danger
    D:::dead
    J:::danger
    E:::decision
    F:::controller
    G:::controller
    H:::dead
    K:::running
</code></pre>
<h2 id="heading-2-defining-priority-priorityclass">2. Defining Priority: <code>PriorityClass</code></h2>
<p>A <code>PriorityClass</code> defines both the name and numerical value of a priority:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">scheduling.k8s.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">PriorityClass</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">high-priority-apps</span>
<span class="hljs-attr">value:</span> <span class="hljs-number">1000000</span>
<span class="hljs-attr">globalDefault:</span> <span class="hljs-literal">false</span>
<span class="hljs-attr">description:</span> <span class="hljs-string">"Pods critical to business logic."</span>
</code></pre>
<ul>
<li><p><code>value</code>: Higher numbers mean higher priority.</p>
</li>
<li><p><code>globalDefault</code>: If <code>true</code>, this is the default for pods without a specified <code>priorityClassName</code>—but only for pods created after the class exists.</p>
</li>
</ul>
<p>Kubernetes ships with two default system-critical classes:</p>
<ul>
<li><p><code>system-node-critical</code> (≈ 2,000,001,000)</p>
</li>
<li><p><code>system-cluster-critical</code> (≈ 2,000,000,000)</p>
</li>
</ul>
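<p>As a quick check in a live cluster, you can list the priority classes and inspect the numeric priority that was resolved for a pod (the pod name <code>nginx</code> below is a placeholder; these commands assume a working <code>kubectl</code> context):</p>
<pre><code class="lang-bash"># List all PriorityClass objects and their values
kubectl get priorityclasses

# Show the priority resolved for a pod at admission time
kubectl get pod nginx -o jsonpath='{.spec.priority}'
</code></pre>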
<h2 id="heading-3-scheduling-how-priority-influences-order">3. Scheduling: How Priority Influences Order</h2>
<p>Once Pod Priority is in place, the scheduler sorts pending pods by priority. <strong>High-priority pods</strong> are attempted first. If scheduling a high-priority pod fails due to resource constraints, the scheduler may then preempt lower-priority pods to make room.</p>
<h2 id="heading-4-preemption-making-space-for-what-matters">4. Preemption: Making Space for What Matters</h2>
<p>When a pending pod cannot be scheduled:</p>
<ol>
<li><p>The scheduler looks for nodes where evicting one or more lower-priority pods would free enough capacity.</p>
</li>
<li><p>It evicts the minimal necessary set of pods to schedule the higher-priority pod.</p>
</li>
<li><p>When a pod is <strong>evicted</strong> (whether due to <strong>preemption</strong> or <strong>node pressure eviction</strong>):</p>
<ol>
<li><p>The pod is <strong>terminated</strong> on the node where it is running.</p>
</li>
<li><p>The pod is deleted from the current node.</p>
</li>
<li><p>If the pod belongs to a <strong>controller</strong> (e.g., Deployment, StatefulSet, ReplicaSet, Job, etc.), that controller will notice the missing replica and create a <strong>new pod</strong>.</p>
</li>
<li><p>The scheduler will then place this <strong>new pod</strong> on another suitable node.</p>
</li>
</ol>
</li>
</ol>
<p>So effectively: a <strong>standalone Pod</strong> (not managed by a controller) is gone permanently once evicted.</p>
<p>A <strong>Pod managed by a controller</strong> is recreated, usually on another node, assuming resources are available.</p>
<h3 id="heading-scheduling-metadata">Scheduling metadata:</h3>
<p>The pending pod’s <code>status.nominatedNodeName</code> field indicates which node is targeted for preemption. However, the pod may ultimately be scheduled elsewhere if conditions change.</p>
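<p>To observe this while debugging, you can query the field directly (the pod name <code>my-pod</code> is a placeholder, and the output will be empty if no node has been nominated):</p>
<pre><code class="lang-bash"># Show which node the scheduler has nominated for the pending pod
kubectl get pod my-pod -o jsonpath='{.status.nominatedNodeName}'

# Or inspect the full pod description, including scheduling events
kubectl describe pod my-pod
</code></pre>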
<h3 id="heading-important-constraints">Important Constraints:</h3>
<ul>
<li><p><strong>Victim pods</strong> terminate using their graceful termination period (default ~30 seconds), which delays when space becomes available.</p>
</li>
<li><p><code>PodDisruptionBudget</code> (PDB) is respected on a best-effort basis but can be violated if no alternate victim set exists.</p>
</li>
<li><p><strong>Inter-pod affinity</strong>: If the pending pod requires co-location with lower-priority pods, preemption won't occur on that node.</p>
</li>
<li><p><strong>Cross-node preemption is not supported</strong>: The scheduler doesn’t preempt pods on other nodes to alleviate anti-affinity constraints.</p>
</li>
</ul>
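<p>As a sketch of how a <code>PodDisruptionBudget</code> enters this picture (names are illustrative), the following asks the scheduler to keep at least two replicas of a lower-priority service running when choosing preemption victims:</p>
<pre><code class="lang-yaml">apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: batch-workers-pdb
spec:
  minAvailable: 2          # best-effort floor during preemption
  selector:
    matchLabels:
      app: batch-workers
</code></pre>
<p>Remember this is honored on a best-effort basis only: if no alternate victim set exists, preemption can still violate it.</p>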
<h2 id="heading-5-non-preempting-priority-classes">5. Non-Preempting Priority Classes</h2>
<p>Generally available since Kubernetes v1.24, non-preempting classes let you define a <code>PriorityClass</code> with:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">preemptionPolicy:</span> <span class="hljs-string">Never</span>
</code></pre>
<p>This means pods with this class will:</p>
<ul>
<li><p>Queue ahead of lower-priority pods.</p>
</li>
<li><p><strong>Not</strong> preempt other pods.</p>
</li>
<li><p>Be preempted by even higher-priority pods.</p>
</li>
</ul>
<p>This is useful, for example, in ML or data science workflows where you want to ensure high scheduling priority without disrupting running services.</p>
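<p>Putting it together, a complete non-preempting class might look like this (the name and value are illustrative):</p>
<pre><code class="lang-yaml">apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority-nonpreempting
value: 100000
preemptionPolicy: Never      # queue ahead of lower priorities, but never evict
globalDefault: false
description: "High scheduling priority without disrupting running pods."
</code></pre>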
<h2 id="heading-6-interplay-with-qos-and-eviction">6. Interplay with QoS and Eviction</h2>
<p>While Pod QoS classes (<code>Guaranteed</code>, <code>Burstable</code>, <code>BestEffort</code>) affect eviction precedence during node-pressure scenarios, they <strong>don’t influence scheduling preemption</strong>. The scheduler considers only priority values; QoS classes matter only during eviction, not scheduling.</p>
<p>At node pressure, pods are ranked for eviction by:</p>
<ol>
<li><p>Exceeding resource requests</p>
</li>
<li><p>Priority</p>
</li>
<li><p>Resource usage relative to requests</p>
</li>
</ol>
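<p>To make the distinction concrete, the pod below (illustrative names) has <strong>Burstable</strong> QoS because its requests and limits differ, while its scheduling priority comes solely from <code>priorityClassName</code>:</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: Pod
metadata:
  name: api-server
spec:
  priorityClassName: high-priority-apps   # governs scheduling and preemption
  containers:
  - name: api
    image: nginx
    resources:
      requests:              # requests &lt; limits =&gt; Burstable QoS
        cpu: "250m"
        memory: "256Mi"
      limits:
        cpu: "500m"
        memory: "512Mi"
</code></pre>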
<h2 id="heading-7-why-use-pod-priority-and-preemption">7. Why Use Pod Priority and Preemption?</h2>
<ul>
<li><p><strong>Reliability</strong>: Ensures critical workloads are scheduled promptly without over-provisioning clusters.</p>
</li>
<li><p><strong>Resource utilization</strong>: Hosts both mission-critical and lower-priority workloads together, evicting non-essential pods under pressure.</p>
</li>
<li><p><strong>Operational flexibility</strong>: You can finely control priority and preemption behavior using Policy, Preemption settings, and PDB nuances.</p>
</li>
</ul>
<h2 id="heading-8-sample-yaml-snippet">8. Sample YAML Snippet</h2>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">scheduling.k8s.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">PriorityClass</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">high-priority-apps</span>
<span class="hljs-attr">value:</span> <span class="hljs-number">1000000</span>
<span class="hljs-attr">globalDefault:</span> <span class="hljs-literal">false</span>
<span class="hljs-attr">description:</span> <span class="hljs-string">"Priority for critical services."</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Pod</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">nginx</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">containers:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">nginx</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">nginx</span>
  <span class="hljs-attr">priorityClassName:</span> <span class="hljs-string">high-priority-apps</span>
</code></pre>
<p>To create a non-preempting class via kubectl:</p>
<pre><code class="lang-bash">kubectl create priorityclass high-priority --value=1000 \
  --description=<span class="hljs-string">"High priority but non-preempting"</span> \
  --preemption-policy=<span class="hljs-string">"Never"</span>
</code></pre>
<h2 id="heading-9-best-practices-amp-troubleshooting">9. Best Practices &amp; Troubleshooting</h2>
<div class="hn-table">
<table>
<thead>
<tr>
<th>Scenario</th><th>Guidance</th></tr>
</thead>
<tbody>
<tr>
<td><strong>Unintended preemptions</strong></td><td>Ensure priority levels are correctly assigned; empty <code>priorityClassName</code> defaults to <code>0</code>.</td></tr>
<tr>
<td><strong>Pending pods not scheduling after preemption</strong></td><td>Another higher-priority pod may have taken precedence. This is expected.</td></tr>
<tr>
<td><strong>Higher-priority pods evicted first</strong></td><td>The scheduler may choose nodes where victims have the lowest priority or where PDB isn't violated.</td></tr>
<tr>
<td><strong>Affinity issues</strong></td><td>Avoid inter-pod affinity that ties a high-priority pod to a lower-priority pod, as it can block preemption.</td></tr>
<tr>
<td><strong>Termination latency in scheduling gap</strong></td><td>Set <code>terminationGracePeriodSeconds</code> to a smaller value on lower-priority pods to shorten the delay before capacity frees up.</td></tr>
</tbody>
</table>
</div>]]></content:encoded></item><item><title><![CDATA[How to Self-Host n8n for Free Forever on Oracle Cloud]]></title><description><![CDATA[n8n stands out as one of the most powerful open-source low-code AI workflow automation tools available. While cloud-hosted n8n can get expensive quickly, Oracle Cloud's Always Free tier offers an incredible opportunity to run n8n completely free, for...]]></description><link>https://automatestack.dev/self-host-n8n-for-free-for-life-on-oracle-cloud</link><guid isPermaLink="true">https://automatestack.dev/self-host-n8n-for-free-for-life-on-oracle-cloud</guid><category><![CDATA[host n8n]]></category><category><![CDATA[host n8n free]]></category><category><![CDATA[n8n]]></category><category><![CDATA[Oracle Cloud]]></category><category><![CDATA[Traefik]]></category><dc:creator><![CDATA[Sumit Sur]]></dc:creator><pubDate>Fri, 05 Sep 2025 06:25:40 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1766082046311/0810fb5b-0b5f-4237-9dc9-13a18709e43f.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>n8n stands out as one of the most powerful open-source low-code <strong>AI workflow automation</strong> tools available. While cloud-hosted n8n can get expensive quickly, Oracle Cloud's Always Free tier offers an incredible opportunity to run n8n completely free, forever.</p>
<p>In this comprehensive guide, I'll walk you through setting up n8n on Oracle Cloud Infrastructure (OCI) using their generous Always Free tier, which includes compute instances that never expire.</p>
<h2 id="heading-why-oracle-clouds-always-free-tier">💡Why Oracle Cloud's Always Free Tier?</h2>
<p>Oracle Cloud's Always Free tier is genuinely impressive:</p>
<ul>
<li><p><strong>2 AMD-based Compute VMs</strong> with 1/8 OCPU and 1 GB memory each</p>
</li>
<li><p><strong>Up to 4 Arm-based Ampere A1 cores</strong> and 24 GB of memory (can be used as one VM or split)</p>
</li>
<li><p><strong>200 GB total Block Volume storage</strong></p>
</li>
<li><p><strong>10 GB Object Storage</strong></p>
</li>
<li><p><strong>Always Free</strong> - no time limits, no credit expiration</p>
</li>
</ul>
<p>The Arm-based instances are particularly powerful for running n8n, offering excellent performance for automation workflows.</p>
<p>For a more detailed overview of the Oracle Free Tier, refer to <a target="_blank" href="https://www.oracle.com/cloud/free/">Oracle Free Tier</a> &amp; <a target="_blank" href="https://docs.oracle.com/en-us/iaas/Content/FreeTier/freetier_topic-Always_Free_Resources.htm">Always Free Resources</a>.</p>
<h2 id="heading-prerequisites">✅ Prerequisites</h2>
<p>Before we begin, you'll need:</p>
<ul>
<li><p>Oracle Cloud account (free sign-up)</p>
</li>
<li><p>A registered Domain name (recommended for HTTPS)</p>
</li>
</ul>
<h2 id="heading-architecture">🏗️ Architecture</h2>
<p>This setup demonstrates how to run <strong>n8n</strong> securely inside an <strong>OCI Compute VM</strong> using Docker and <strong>Traefik</strong> as the reverse proxy.</p>
<h3 id="heading-1-traffic-flow">1. Traffic Flow</h3>
<ul>
<li><p>A user accesses <a target="_blank" href="http://n8n.example.com"><code>n8n.example.com</code></a>, which resolves via DNS to the VM’s <strong>public IP</strong>.</p>
</li>
<li><p>Requests on <strong>ports 80/443</strong> reach the <strong>Traefik container</strong> inside the VM.</p>
</li>
<li><p>Traefik forwards HTTPS traffic through the <code>traefik-public</code> network to the <strong>n8n container</strong>.</p>
</li>
</ul>
<h3 id="heading-2-security-amp-certificates">2. Security &amp; Certificates</h3>
<ul>
<li><p>Traefik manages SSL/TLS certificates automatically with <strong>Let’s Encrypt CA</strong>.</p>
</li>
<li><p>Certificates are issued and renewed using the <strong>ACME TLS challenge</strong>.</p>
</li>
<li><p>All certificates are stored securely in <code>./letsencrypt/acme.json</code>.</p>
</li>
</ul>
<h3 id="heading-3-service-discovery">3. Service Discovery</h3>
<ul>
<li><p>Traefik integrates with the <strong>Docker Socket</strong> to dynamically discover running containers.</p>
</li>
<li><p>This eliminates manual configuration whenever services are added or updated.</p>
</li>
</ul>
<h3 id="heading-4-application-layer">4. Application Layer</h3>
<ul>
<li><p><strong>n8n container</strong> hosts the workflow automation platform.</p>
</li>
<li><p><strong>Postgres container</strong> provides persistent database storage, connected via the <code>n8n-network</code> (port 5432).</p>
</li>
</ul>
<h3 id="heading-5-infrastructure">5. Infrastructure</h3>
<ul>
<li><p>All components (Traefik, n8n, Postgres) run as <strong>Docker containers</strong> inside a single <strong>OCI Compute VM</strong>.</p>
</li>
<li><p>Networking is logically separated using Docker networks (<code>traefik-public</code> and <code>n8n-network</code>).</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1757046237156/c118ecbc-c888-4bc6-94a7-225da66373c6.png" alt /></p>
<h2 id="heading-step-1-create-your-oracle-cloud-account">Step 1: 📝Create Your Oracle Cloud Account</h2>
<ol>
<li><p>Visit <a target="_blank" href="https://oracle.com/cloud/free">oracle.com/cloud/free</a></p>
</li>
<li><p>Sign up for a free account</p>
</li>
<li><p>Complete the verification process (requires credit card for verification, but won't be charged)</p>
</li>
<li><p>Wait for account activation</p>
</li>
</ol>
<h2 id="heading-step-2-set-up-your-compute-instance">Step 2: 🖥️✨ Set Up Your Compute Instance</h2>
<ol>
<li><p>Create a <strong>Virtual Cloud Network</strong></p>
<ul>
<li><p>Log into your OCI Console</p>
</li>
<li><p>Navigate to <strong>Networking → Virtual cloud networks</strong></p>
</li>
</ul>
</li>
</ol>
<p>    <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1757001488272/d1245687-695f-4eee-8596-395f4385d636.png" alt /></p>
<ul>
<li>We will use a /16 CIDR block for the VCN, in this case 10.0.0.0/16</li>
</ul>
<p>    <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1757001628743/285870e4-91a6-46fb-8ce0-eaf8431ee598.png" alt /></p>
<ol start="2">
<li><p>Create a /24 subnet inside the VCN where the VM instance will be connected</p>
<ul>
<li><p>10.0.1.0/24</p>
</li>
<li><p>Create it as a Public Subnet</p>
</li>
</ul>
</li>
</ol>
<p>    <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1757002887270/d9c0476b-f880-4788-86ce-064b82920a50.png" alt /></p>
<ol start="3">
<li><p>Create an Internet gateway for the VCN</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1757003072052/509b4363-77a0-4b83-8bee-3a3c0668cb1b.png" alt /></p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1757003145894/13e112c6-9973-40ed-9aab-4bcecba5f509.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>Create the VM Instance</p>
<ul>
<li><p>Navigate to <strong>Compute → Instances</strong></p>
</li>
<li><p>Click <strong>Create Instance</strong></p>
</li>
<li><p><strong>Image and Shape:</strong></p>
<ul>
<li><p><strong>Image:</strong> Ubuntu 22.04 LTS (Always Free-eligible)</p>
</li>
<li><p><strong>Shape:</strong> VM.Standard.A1.Flex (Arm-based)</p>
</li>
<li><p><strong>OCPU:</strong> 2 (or all 4 if you want maximum performance)</p>
</li>
<li><p><strong>Memory:</strong> 12 GB (or up to 24 GB)</p>
</li>
</ul>
</li>
</ul>
</li>
</ol>
<p>        <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1757004336628/e4af2700-a365-4bb1-8ebc-06ec9906ca35.png" alt class="image--center mx-auto" /></p>
<p>        <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1757004388054/9527d8d6-9b7a-492d-9391-c29994f04a8a.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p>In the Networking tab, select the VCN &amp; subnet we created earlier</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1757005050754/f8d73a14-6122-49ce-9f6e-c164b0a5036c.png" alt class="image--center mx-auto" /></p>
</li>
</ul>
<p>    <strong>SSH Keys:</strong></p>
<ul>
<li><p>Generate a new key pair and download both public and private keys</p>
</li>
<li><p>Keep the private key secure - you'll need it to access your server</p>
</li>
</ul>
<p>    Click <strong>Create</strong> and wait for the instance to provision</p>
<ol start="5">
<li><p>Configure Network Security</p>
<ol>
<li><p>Go to <strong>Networking → Virtual Cloud Networks</strong></p>
</li>
<li><p>Click on your VCN</p>
</li>
<li><p>Click on <strong>Security Lists</strong> → <strong>Default Security List</strong></p>
</li>
<li><p>Add ingress rules:</p>
<pre><code class="lang-basic"> Port <span class="hljs-number">22</span> (SSH): <span class="hljs-number">0.0.0.0</span>/<span class="hljs-number">0</span>
 Port <span class="hljs-number">80</span> (HTTP): <span class="hljs-number">0.0.0.0</span>/<span class="hljs-number">0</span>
 Port <span class="hljs-number">443</span> (HTTPS): <span class="hljs-number">0.0.0.0</span>/<span class="hljs-number">0</span>
 Port <span class="hljs-number">5678</span> (n8n): <span class="hljs-number">0.0.0.0</span>/<span class="hljs-number">0</span> (temporary; we'll remove this later)
</code></pre>
</li>
</ol>
</li>
</ol>
<h2 id="heading-step-3-connect-to-your-instance">Step 3: 🔑 Connect to Your Instance</h2>
<p>1. Note your instance's public IP address</p>
<p>2. Connect via SSH:</p>
<pre><code class="lang-bash">bash ssh -i /path/to/your/private-key ubuntu@YOUR_PUBLIC_IP
</code></pre>
<p>For Windows users with PuTTY, convert the private key to .ppk format first</p>
<h2 id="heading-step-4-prepare-the-server">Step 4: ⚙️ Prepare the Server</h2>
<p><strong>Update the System</strong></p>
<pre><code class="lang-bash">bash sudo apt update &amp;&amp; sudo apt upgrade -y
</code></pre>
<p><a target="_blank" href="https://docs.docker.com/engine/install/ubuntu/"><strong>Install Docker and Docker Compose</strong></a></p>
<pre><code class="lang-bash"><span class="hljs-comment"># Add Docker's official GPG key:</span>
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

<span class="hljs-comment"># Add the repository to Apt sources:</span>
<span class="hljs-built_in">echo</span> \
  <span class="hljs-string">"deb [arch=<span class="hljs-subst">$(dpkg --print-architecture)</span> signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  <span class="hljs-subst">$(. /etc/os-release &amp;&amp; echo <span class="hljs-string">"<span class="hljs-variable">${UBUNTU_CODENAME:-<span class="hljs-variable">$VERSION_CODENAME</span>}</span>"</span>)</span> stable"</span> | \
  sudo tee /etc/apt/sources.list.d/docker.list &gt; /dev/null
sudo apt-get update

<span class="hljs-comment">#install the latest version</span>
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

<span class="hljs-comment"># Add your user to docker group</span>
sudo usermod -aG docker <span class="hljs-variable">$USER</span>
</code></pre>
<p>Log out and reconnect so the <code>docker</code> group membership takes effect.</p>
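<p>After reconnecting, it is worth confirming that Docker and the Compose plugin are installed and that your user can reach the daemon without <code>sudo</code>:</p>
<pre><code class="lang-bash"># All three should succeed without a permission error
docker --version
docker compose version
docker run --rm hello-world
</code></pre>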
<h2 id="heading-step-5-deploy-n8n-with-docker-compose">Step 5: 🐳 Deploy n8n with Docker Compose</h2>
<p><strong>Prepare the project directories and files on the VM</strong></p>
<pre><code class="lang-bash">~/n8n-oracle-cloud/
├── traefik/                 
│   └── acme.json            <span class="hljs-comment"># ACME storage for Let's Encrypt certificates</span>
└── prod/                    
    ├── .env                 <span class="hljs-comment"># Environment variables</span>
    └── docker-compose.yaml  <span class="hljs-comment"># Main Compose stack</span>
</code></pre>
<p>🚀 <strong>All the code is available on</strong> <a target="_blank" href="https://github.com/sumitsaz23/n8n-docker-traefik-postgres#"><strong>my GitHub Repo</strong></a><strong>!</strong><br />👉 Clone this repository to your VM to get <strong>all the necessary code</strong> 🖥️💻</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Example: clone repo to home</span>
<span class="hljs-built_in">cd</span> ~
git <span class="hljs-built_in">clone</span> https://github.com/sumitsaz23/n8n-docker-traefik-postgres.git n8n-oracle-cloud
<span class="hljs-built_in">cd</span> n8n-oracle-cloud
</code></pre>
<p><strong>Secure acme.json for Traefik</strong></p>
<p>Traefik requires <code>acme.json</code> to exist and to be readable and writable by the Traefik container, but with strict permissions (<code>600</code>).</p>
<pre><code class="lang-bash"><span class="hljs-comment"># create acme file and set permissions</span>
touch traefik/acme.json
chmod 600 traefik/acme.json
</code></pre>
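<p>You can double-check the mode before starting the stack; Let’s Encrypt registration data lands in this file, and Traefik refuses to use it if the permissions are looser than <code>600</code>:</p>
<pre><code class="lang-bash"># Prints the file mode in octal; expect 600
stat -c '%a' traefik/acme.json
</code></pre>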
<p>Create the <code>docker-compose.yaml</code> file:</p>
<pre><code class="lang-bash">
services:

  <span class="hljs-comment"># -------------------------</span>
  <span class="hljs-comment"># Postgres (self-managed)</span>
  <span class="hljs-comment"># -------------------------</span>
  postgres:
    image: postgres:16-alpine   <span class="hljs-comment"># Postgres 16 (lightweight alpine)</span>
    container_name: n8n_postgres
    restart: unless-stopped
    <span class="hljs-comment"># Named volume for persistent DB files</span>
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      - POSTGRES_USER=<span class="hljs-variable">${DB_POSTGRESDB_USER}</span>
      - POSTGRES_DB=<span class="hljs-variable">${DB_POSTGRESDB_DATABASE}</span>
      - POSTGRES_PASSWORD=<span class="hljs-variable">${DB_POSTGRESDB_PASSWORD}</span> 
      - PGDATA=/var/lib/postgresql/data/pgdata
    healthcheck:
      <span class="hljs-built_in">test</span>: [<span class="hljs-string">"CMD-SHELL"</span>, <span class="hljs-string">"pg_isready -U <span class="hljs-variable">${DB_POSTGRESDB_USER}</span> -d <span class="hljs-variable">${DB_POSTGRESDB_DATABASE}</span> || exit 1"</span>]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 15s
    networks:
      - n8n-network
    mem_limit: 2g
    cpus: 1.0

  <span class="hljs-comment"># -------------------------</span>
  <span class="hljs-comment"># n8n main (web UI + webhooks)</span>
  <span class="hljs-comment"># -------------------------</span>
  n8n:
    image: n8nio/n8n:latest
    container_name: n8n_main
    restart: unless-stopped
    depends_on:
      - postgres
    volumes:
      <span class="hljs-comment"># named volume for user-related files, credentials, workflows, logs, etc</span>
      - n8n_data:/home/node/.n8n
    environment:
      <span class="hljs-comment"># Database (Postgres) - prefer file-based secret usage</span>
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_PORT=<span class="hljs-variable">${DB_POSTGRESDB_PORT}</span>
      - DB_POSTGRESDB_DATABASE=<span class="hljs-variable">${DB_POSTGRESDB_DATABASE}</span>
      - DB_POSTGRESDB_USER=<span class="hljs-variable">${DB_POSTGRESDB_USER}</span>
      - DB_POSTGRESDB_PASSWORD=<span class="hljs-variable">${DB_POSTGRESDB_PASSWORD}</span>

      <span class="hljs-comment"># n8n app settings</span>
      - N8N_PORT=5678
      - N8N_PROTOCOL=https
      - N8N_ENFORCE_SETTINGS_FILE_PERMISSIONS=<span class="hljs-literal">true</span>
      - N8N_REINSTALL_MISSING_PACKAGES=<span class="hljs-literal">true</span>
      - N8N_RUNNERS_ENABLED=<span class="hljs-literal">true</span>
      - WEBHOOK_URL=https://<span class="hljs-variable">${N8N_HOSTNAME}</span>             <span class="hljs-comment"># actual public URL; override in .env</span>
      - GENERIC_TIMEZONE=<span class="hljs-variable">${TZ}</span>                         <span class="hljs-comment"># e.g., "UTC" or "Asia/Kolkata"</span>

      <span class="hljs-comment"># Basic auth for UI - use secret file variant</span>
      - N8N_BASIC_AUTH_ACTIVE=<span class="hljs-variable">${N8N_BASIC_AUTH_ACTIVE}</span>
      - N8N_BASIC_AUTH_USER=<span class="hljs-variable">${N8N_BASIC_AUTH_USER}</span>
      - N8N_BASIC_AUTH_PASSWORD=<span class="hljs-variable">${N8N_BASIC_AUTH_PASSWORD}</span>

    networks:
      - n8n-network
      - traefik-public
    ports:
    <span class="hljs-comment"># bind to the loopback interface only; Traefik routes external traffic</span>
      - 127.0.0.1:5678:5678
    labels:
      - <span class="hljs-string">"traefik.enable=true"</span>
    <span class="hljs-comment"># Tell Traefik which network to use to connect to this service</span>
      - <span class="hljs-string">"traefik.docker.network=n8nstack_traefik-public"</span>

    <span class="hljs-comment"># --- HTTPS Router ---</span>
      - <span class="hljs-string">"traefik.http.routers.n8n.rule=Host(`<span class="hljs-variable">${N8N_HOSTNAME}</span>`)"</span>
      - <span class="hljs-string">"traefik.http.routers.n8n.entrypoints=websecure"</span>
      - <span class="hljs-string">"traefik.http.routers.n8n.tls.certresolver=letsencrypt"</span>

      <span class="hljs-comment"># Traefik headers middleware for better security</span>

      - traefik.http.routers.n8n.tls=<span class="hljs-literal">true</span>
      - traefik.http.middlewares.n8n.headers.SSLRedirect=<span class="hljs-literal">true</span>
      - traefik.http.middlewares.n8n.headers.STSSeconds=315360000
      - traefik.http.middlewares.n8n.headers.browserXSSFilter=<span class="hljs-literal">true</span>
      - traefik.http.middlewares.n8n.headers.contentTypeNosniff=<span class="hljs-literal">true</span>
      - traefik.http.middlewares.n8n.headers.forceSTSHeader=<span class="hljs-literal">true</span>
      - traefik.http.middlewares.n8n.headers.SSLHost=<span class="hljs-variable">${N8N_HOSTNAME}</span>
      - traefik.http.middlewares.n8n.headers.STSIncludeSubdomains=<span class="hljs-literal">true</span>
      - traefik.http.middlewares.n8n.headers.STSPreload=<span class="hljs-literal">true</span>
      - traefik.http.routers.n8n.middlewares=n8n@docker

    mem_limit: 4g
    cpus: 2

  <span class="hljs-comment"># -------------------------</span>
  <span class="hljs-comment"># Traefik (reverse proxy / TLS automation)</span>
  <span class="hljs-comment"># -------------------------</span>

  traefik:
    image: traefik:latest
    container_name: traefik
    restart: unless-stopped
    <span class="hljs-built_in">command</span>:
    - --api.dashboard=<span class="hljs-literal">true</span>
    - --api.insecure=<span class="hljs-literal">false</span>  <span class="hljs-comment"># Secure the dashboard</span>
    - --providers.docker=<span class="hljs-literal">true</span>
    - --providers.docker.exposedbydefault=<span class="hljs-literal">false</span>
    - --providers.docker.network=n8nstack_traefik-public  <span class="hljs-comment"># Specify network</span>
    - --entrypoints.web.address=:80
    - --entrypoints.websecure.address=:443
    <span class="hljs-comment"># Let's Encrypt configuration</span>
    - --certificatesresolvers.letsencrypt.acme.httpchallenge=<span class="hljs-literal">true</span>
    - --certificatesresolvers.letsencrypt.acme.httpchallenge.entrypoint=web
    - --certificatesresolvers.letsencrypt.acme.email=<span class="hljs-variable">${LETSENCRYPT_EMAIL}</span>
    - --certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json
    - --log.level=INFO  <span class="hljs-comment"># Use INFO instead of DEBUG for production</span>
    ports:
    - <span class="hljs-string">"80:80"</span>
    - <span class="hljs-string">"443:443"</span>
    - <span class="hljs-string">"8080:8080"</span>
    volumes:
    <span class="hljs-comment">#- traefik_data:/letsencrypt</span>
    - /home/ubuntu/traefik/acme.json:/letsencrypt/acme.json
    - /home/ubuntu/traefik/<span class="hljs-built_in">log</span>:/var/<span class="hljs-built_in">log</span>/traefik
    - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
    - n8n-network
    - traefik-public

    mem_limit: 512m
    cpus: 0.5
<span class="hljs-comment"># -------------------------</span>
<span class="hljs-comment"># Networks &amp; Volumes</span>
<span class="hljs-comment"># -------------------------</span>
networks:
  n8n-network:
    driver: bridge
  traefik-public:
    driver: bridge
volumes:
  pgdata:
    name: n8n_pgdata
  n8n_data:
    name: n8n_data
</code></pre>
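<p>Before launching anything, you can ask Compose to validate the file together with the <code>.env</code> interpolation (run this from the directory that holds both files):</p>
<pre><code class="lang-bash"># Exits non-zero and points at the offending line on YAML or variable errors
docker compose config --quiet
</code></pre>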
<p>Create the <code>.env</code> file:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># -------------------------</span>
<span class="hljs-comment"># General (non-sensitive)</span>
<span class="hljs-comment"># -------------------------</span>
COMPOSE_PROJECT_NAME=n8nstack    <span class="hljs-comment">#environment variable in Docker Compose used to define the project name for a set of Docker services</span>
TZ=Asia/Kolkata                  <span class="hljs-comment"># timezone for containers (set to your preferred timezone)</span>
N8N_HOSTNAME=n8n.example.com   <span class="hljs-comment"># &lt;-- Replace with your public domain (used for WEBHOOK_URL &amp; Traefik rule)</span>
LETSENCRYPT_EMAIL=email@example.com

<span class="hljs-comment"># -------------------------</span>
<span class="hljs-comment"># Postgres settings</span>
<span class="hljs-comment"># -------------------------</span>
DB_POSTGRESDB_HOST=postgres
DB_POSTGRESDB_PORT=5432
DB_POSTGRESDB_DATABASE=n8n
DB_POSTGRESDB_USER=n8nuser
DB_POSTGRESDB_PASSWORD=dbsupersecret <span class="hljs-comment"># in production, do not put real passwords in .env</span>

<span class="hljs-comment"># -------------------------</span>
<span class="hljs-comment"># n8n auth / behavior</span>
<span class="hljs-comment"># -------------------------</span>
N8N_BASIC_AUTH_PASSWORD=n8nsupersecret  <span class="hljs-comment"># in production, do not put real passwords in .env</span>
N8N_BASIC_AUTH_ACTIVE=<span class="hljs-literal">true</span>
N8N_BASIC_AUTH_USER=admin

<span class="hljs-comment"># Optional metrics/queue settings</span>
N8N_METRICS=<span class="hljs-literal">true</span>
N8N_METRICS_INCLUDE_QUEUE_METRICS=<span class="hljs-literal">true</span>

<span class="hljs-comment"># -------------------------</span>
<span class="hljs-comment"># Resource &amp; tuning (example values)</span>
<span class="hljs-comment"># -------------------------</span>
<span class="hljs-comment"># For Postgres: max connections ≈ (typical) 100 (adjust in postgres.conf if needed)</span>
DB_POOL_SIZE=20
</code></pre>
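<p>Rather than the placeholder passwords above, you can generate random ones locally, for example with <code>openssl</code>:</p>
<pre><code class="lang-bash"># Prints a random 44-character base64 string; use one per password variable
openssl rand -base64 32
</code></pre>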
<p>Launch the Docker Compose stack:</p>
<pre><code class="lang-bash">docker compose up -d
</code></pre>
<p>This will create the networks, volumes, and containers.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1757047622748/ad297220-bb1e-40b5-832b-6ca6a7baaf05.png" alt /></p>
<h2 id="heading-step-5-verify">Step 6: 🔍 Verify</h2>
<p>Verify that n8n started successfully:</p>
<pre><code class="lang-bash">docker compose logs n8n
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1757047905755/7927653b-7acc-4d46-aac4-5606e37ccf72.png" alt class="image--center mx-auto" /></p>
<pre><code class="lang-bash"><span class="hljs-comment"># Verify that Traefik's ACME client successfully obtained a certificate</span>
docker compose logs -f traefik
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1757048792198/f52ec095-05d0-4025-84ab-3d4ddffed26f.png" alt class="image--center mx-auto" /></p>
<p>Now open the n8n portal in your browser:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1757048947127/1c7db59f-1153-453e-aeb8-ce317a1be9e3.png" alt class="image--center mx-auto" /></p>
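<p>From the shell, you can also confirm that HTTPS and the security headers are in place (replace the hostname with your own):</p>
<pre><code class="lang-bash"># Expect an HTTP/2 200 (or 401 if basic auth is active)
# plus a Strict-Transport-Security header from the Traefik middleware
curl -sSI https://n8n.example.com | head -n 15
</code></pre>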
<h2 id="heading-issues-faced-amp-fixes">⚠️ Issues Faced &amp; Fixes 🛠️</h2>
<ol>
<li><h3 id="heading-traefik-only-shows-the-default-selfsigned-certificate">Traefik only shows the default self‑signed certificate</h3>
</li>
</ol>
<p><strong>Symptom:</strong> When you open <a target="_blank" href="https://n8n.example.com"><code>https://n8n.example.com</code></a>, your browser shows the Traefik default certificate instead of a valid Let’s Encrypt cert.</p>
<p><strong>Causes &amp; Fixes:</strong></p>
<ul>
<li><p><strong>Resolver name mismatch</strong>: The resolver name in your labels must match the resolver defined in Traefik’s command args.</p>
<ul>
<li><p>Traefik command:</p>
<pre><code class="lang-bash">  --certificatesresolvers.myresolver.acme.email=you@example.com
  --certificatesresolvers.myresolver.acme.storage=/letsencrypt/acme.json
  --certificatesresolvers.myresolver.acme.tlschallenge=<span class="hljs-literal">true</span>
</code></pre>
</li>
<li><p>Label must match:</p>
<pre><code class="lang-bash">  - <span class="hljs-string">"traefik.http.routers.n8n.tls.certresolver=myresolver"</span>
</code></pre>
</li>
</ul>
</li>
<li><p><strong>acme.json permissions</strong>: Ensure the file exists and is writable by Traefik:</p>
<pre><code class="lang-bash">  touch ./traefik/acme.json
  chmod 600 ./traefik/acme.json
</code></pre>
</li>
<li><p><strong>Firewall/DNS</strong>: Ports 80/443 must be open, and <a target="_blank" href="http://n8n.example.com"><code>n8n.example.com</code></a> must resolve to your VPS IP.</p>
</li>
</ul>
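<p>To see exactly which certificate Traefik is serving (a real Let’s Encrypt one versus its self-signed default), you can query it with <code>openssl</code>; the hostname below is a placeholder:</p>
<pre><code class="lang-bash"># "issuer= ... Let's Encrypt" means the real certificate is live;
# "TRAEFIK DEFAULT CERT" means ACME has not succeeded yet
echo | openssl s_client -connect n8n.example.com:443 -servername n8n.example.com 2&gt;/dev/null \
  | openssl x509 -noout -issuer -subject
</code></pre>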
<ol start="2">
<li><h3 id="heading-traefik-fails-to-obtain-acme-certificate-when-n8neditorbaseurl-is-set">Traefik fails to obtain ACME certificate when <code>N8N_EDITOR_BASE_URL</code> is set</h3>
</li>
</ol>
<p><strong>Symptom:</strong> Traefik logs show certificate request failures, and n8n only loads behind the default cert. The issue appears right after setting <code>N8N_EDITOR_BASE_URL=</code><a target="_blank" href="https://n8n.example.com/"><code>https://n8n.example.com/</code></a>.</p>
<p><strong>Cause:</strong> With <strong>TLS‑ALPN challenge</strong>, Traefik passes the ACME validation request through to the backend. If n8n enforces HTTPS at this point, the validation breaks.</p>
<p><strong>Fixes:</strong></p>
<ul>
<li><p><strong>Option A:</strong> Deploy without <code>N8N_EDITOR_BASE_URL</code> until the cert is issued, then set it and restart n8n.</p>
</li>
<li><p><strong>Option B (better):</strong> Switch to <strong>HTTP‑01 challenge</strong>, which bypasses n8n entirely during validation.</p>
<pre><code class="lang-bash">--certificatesresolvers.le.acme.httpchallenge=true
--certificatesresolvers.le.acme.httpchallenge.entrypoint=web
</code></pre>
</li>
<li><p><strong>Option C:</strong> Use <strong>DNS‑01 challenge</strong> if your DNS provider supports it (best for Cloudflare/Route53).</p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[How to Deploy Your First Proxmox Virtual Machine Using Terraform]]></title><description><![CDATA[🧠 Why Terraform for Proxmox?
While Proxmox has a great web UI, infrastructure-as-code lets you:

Automate repeatable VM deployments

Keep configurations under version control

Easily spin up multi-VM Setups

Reduce human error


👋 Quick heads-up!
T...]]></description><link>https://automatestack.dev/how-to-deploy-your-first-proxmox-virtual-machine-using-terraform</link><guid isPermaLink="true">https://automatestack.dev/how-to-deploy-your-first-proxmox-virtual-machine-using-terraform</guid><dc:creator><![CDATA[Sumit Sur]]></dc:creator><pubDate>Mon, 30 Jun 2025 18:30:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1766936353545/f29d9644-9f52-4f41-9efd-5cb0aae8c33e.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-why-terraform-for-proxmox">🧠 Why Terraform for Proxmox?</h2>
<p>While Proxmox has a great web UI, infrastructure-as-code lets you:</p>
<ul>
<li><p>Automate repeatable VM deployments</p>
</li>
<li><p>Keep configurations under version control</p>
</li>
<li><p>Easily spin up multi-VM setups</p>
</li>
<li><p>Reduce human error</p>
</li>
</ul>
<h2 id="heading-quick-heads-up"><strong>👋 Quick heads-up!</strong></h2>
<p>This guide is <strong>Part 2 of a multi-part series</strong>.</p>
<p>In this article, we’ll walk through how to deploy a virtual machine in Proxmox using Terraform.</p>
<p>👉 <a target="_blank" href="https://automatestack.dev/how-to-set-up-proxmox-with-terraform-a-step-by-step-guide"><strong>Jump to Part 1:</strong> How to Set Up Proxmox with Terraform <strong>→</strong></a></p>
<h2 id="heading-prerequisites">📦 Prerequisites</h2>
<p>Before you begin, make sure you have:</p>
<p>✅ A running <strong>Proxmox VE</strong> host or cluster<br />✅ A <strong>user with API access</strong> (e.g., <code>terraform-user@pve</code>)<br />✅ A <strong>cloud-init template VM</strong> ready to be cloned<br />✅ Terraform installed on your machine<br />✅ Installed the <strong>Telmate Proxmox Terraform provider</strong></p>
<h2 id="heading-project-directory-structure">🗂 Project Directory Structure</h2>
<p>Here’s the structure of the deployment repo we’ll use:</p>
<pre><code class="lang-bash">proxmox-vm-deploy/
├── main.tf
├── provider.tf
├── variables.tf
├── terraform.tfvars
</code></pre>
<p>Let’s go through each file.</p>
<h2 id="heading-providertf-connect-terraform-to-proxmox">🧩 provider.tf – Connect Terraform to Proxmox</h2>
<pre><code class="lang-bash">terraform {
  required_providers {
    proxmox = {
      <span class="hljs-built_in">source</span> = <span class="hljs-string">"Telmate/proxmox"</span>
      version = <span class="hljs-string">"3.0.2-rc01"</span> <span class="hljs-comment"># use the latest version available</span>
    }
  }
}


provider <span class="hljs-string">"proxmox"</span> {
  pm_api_url = var.proxmox_api_url <span class="hljs-comment"># the variable is defined in terraform.tfvars.This should match the URL in your Proxmox web interface, typically something like "https://&lt;proxmox-ip&gt;:8006/api2/json"</span>
  pm_parallel = 1
  pm_debug = <span class="hljs-literal">false</span>
  pm_tls_insecure = <span class="hljs-literal">true</span>
}
</code></pre>
<p>This sets up the connection to your Proxmox host. Make sure:</p>
<ul>
<li><p>The user exists in Proxmox (<code>pveum user add terraform-user@pve</code>)</p>
</li>
<li><p>A suitable role (e.g., <code>Terraform_Provisioner</code>) with VM permissions is assigned</p>
</li>
<li><p>Load the API token for connecting to Proxmox as environment variables</p>
</li>
</ul>
<p>👉 <a target="_blank" href="https://automatestack.dev/how-to-set-up-proxmox-with-terraform-a-step-by-step-guide"><strong>Check out part-1 of the series for a step-by-step guide on creating the required users, roles and tokens to connect via terraform</strong></a></p>
<h2 id="heading-variablestf-input-configuration">📜 variables.tf – Input Configuration</h2>
<pre><code class="lang-bash">
variable <span class="hljs-string">"vm_name"</span> {
  <span class="hljs-built_in">type</span> = string
}

variable <span class="hljs-string">"clone"</span> {
  <span class="hljs-built_in">type</span> = string
}

variable <span class="hljs-string">"ipconfig0"</span> {
  <span class="hljs-built_in">type</span> = string
}

variable <span class="hljs-string">"vmid"</span> {
  <span class="hljs-built_in">type</span> = number
}

variable <span class="hljs-string">"memory"</span> {
  <span class="hljs-built_in">type</span> = number
}

variable <span class="hljs-string">"cores"</span> {
  <span class="hljs-built_in">type</span> = number
}

variable <span class="hljs-string">"disk_size"</span> {
  <span class="hljs-built_in">type</span> = string
  description = <span class="hljs-string">"The size of the disk, should be at least as big as the disk in the template"</span>
  default = <span class="hljs-string">"20G"</span>

}

variable <span class="hljs-string">"storage"</span> {
  <span class="hljs-built_in">type</span> = string
  description = <span class="hljs-string">"the storage where the VM disk will be created"</span>

}

variable <span class="hljs-string">"ssh-public-key"</span> {
    <span class="hljs-built_in">type</span> = string
    description = <span class="hljs-string">"SSH public key for the VMs"</span>
    sensitive = <span class="hljs-literal">true</span>

}

variable <span class="hljs-string">"proxmox_api_url"</span> {
    <span class="hljs-built_in">type</span>        = string
    description = <span class="hljs-string">"Proxmox API URL"</span>
}

variable <span class="hljs-string">"target_node"</span> {
    <span class="hljs-built_in">type</span>        = string
    description = <span class="hljs-string">"The Proxmox node where the VM will be created"</span>

}

variable <span class="hljs-string">"nameserver"</span> {
    <span class="hljs-built_in">type</span>        = string
    description = <span class="hljs-string">"Nameserver for the VM"</span>
    default     = <span class="hljs-string">"1.1.1.1 8.8.8.8"</span>

}

variable <span class="hljs-string">"cicustom"</span> {
    <span class="hljs-built_in">type</span>        = string
    description = <span class="hljs-string">"Cloud-Init custom configuration"</span>
    default     = <span class="hljs-string">"vendor=local:snippets/qemu-guest-agent.yml"</span>

}

variable <span class="hljs-string">"cipassword"</span> {
    <span class="hljs-built_in">type</span>        = string
    description = <span class="hljs-string">"Cloud-Init password for the VM"</span>
    sensitive = <span class="hljs-literal">true</span>

}
</code></pre>
<p>These variables define how your VM will look—name, clone template, IP config, memory, CPU, etc.</p>
<h2 id="heading-terraformtfvars-your-custom-values">⚙️ terraform.tfvars – Your Custom Values</h2>
<pre><code class="lang-bash">ssh-public-key = <span class="hljs-string">"ssh-ed25519 AAAAC3NzaC1lZXXXXXXXXXXXXXXXXXXXXXXwSOCiZ/OkpPDR3bR2tK4STIm+gnJk"</span>
target_node = <span class="hljs-string">"proxmox-server-IP"</span> <span class="hljs-comment"># The Proxmox node name (as shown in the web UI) where the VM will be created</span>
cicustom = <span class="hljs-string">"value=local:snippets/install-packages.yml"</span> <span class="hljs-comment"># /var/lib/vz/snippets/install-packages.yml #</span>
cipassword = <span class="hljs-string">"ubuntu"</span>
ipconfig0 = <span class="hljs-string">"ip=dhcp"</span>
vmid = 1000 <span class="hljs-comment"># optional, if not set, Proxmox will assign a random VMID</span>
vm_name = <span class="hljs-string">"ubuntu-vm"</span>
<span class="hljs-built_in">clone</span> = <span class="hljs-string">"ubuntu-24-04-cloudinit-copy"</span> <span class="hljs-comment"># The template to clone from</span>
cores = 4 <span class="hljs-comment"># Number of CPU cores</span>
memory = 4096 <span class="hljs-comment"># Memory in MB</span>
nameserver = <span class="hljs-string">"1.1.1.1 8.8.8.8"</span>
disk_size = <span class="hljs-string">"20G"</span> <span class="hljs-comment"># The size of the disk, should be at least as big as the disk in the template</span>
storage = <span class="hljs-string">"hdd-vm-data"</span> <span class="hljs-comment"># The storage where the VM disk will be created</span>
</code></pre>
<p>This is your configuration layer—values you want to pass to variables. Keep this file out of version control (<code>.gitignore</code>) if it includes sensitive info.</p>
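<p>A minimal <code>.gitignore</code> for this layout might look like the following; the entries beyond <code>terraform.tfvars</code> are the usual Terraform files you also don’t want committed:</p>
<pre><code class="lang-bash"># .gitignore
terraform.tfvars
*.tfstate
*.tfstate.*
.terraform/
</code></pre>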
<h2 id="heading-maintf-create-the-virtual-machine">🏗 main.tf – Create the Virtual Machine</h2>
<pre><code class="lang-bash"><span class="hljs-comment">#create a new VM from a template with cloud-init enabled</span>
resource <span class="hljs-string">"proxmox_vm_qemu"</span> <span class="hljs-string">"ubuntu-vm"</span> {

  <span class="hljs-comment"># Basic VM configuration</span>
  vmid        = var.vmid
  name        = var.vm_name
  target_node = var.target_node <span class="hljs-comment"># The node where the VM will be created</span>
  agent       = 1 <span class="hljs-comment"># Enable the QEMU guest agent</span>
  cpu {
    cores = var.cores
    sockets = 1
    numa = <span class="hljs-literal">true</span>
    <span class="hljs-built_in">type</span> = <span class="hljs-string">"x86-64-v2-AES"</span>
  }
  memory      = var.memory <span class="hljs-comment"># Memory in MB</span>
  bios        = <span class="hljs-string">"ovmf"</span> <span class="hljs-comment"># Use OVMF for UEFI support</span>
  boot        = <span class="hljs-string">"order=scsi0"</span> <span class="hljs-comment"># has to be the same as the OS disk of the template</span>
  <span class="hljs-built_in">clone</span>       = var.clone <span class="hljs-comment"># The template to clone from</span>
  scsihw      = <span class="hljs-string">"virtio-scsi-single"</span> <span class="hljs-comment"># Use VirtIO SCSI controller</span>
  vm_state    = <span class="hljs-string">"running"</span> <span class="hljs-comment"># "running" or "stopped"</span>
  automatic_reboot = <span class="hljs-literal">true</span>

  <span class="hljs-comment"># Cloud-Init configuration</span>
  cicustom   = var.cicustom
  ciupgrade  = <span class="hljs-literal">true</span> <span class="hljs-comment"># it will upgrade the OS to the latest version</span>
  nameserver = var.nameserver
  ipconfig0  = var.ipconfig0
  skip_ipv6  = <span class="hljs-literal">true</span>
  ciuser     = <span class="hljs-string">"root"</span> <span class="hljs-comment"># The user to use for the cloud-init script</span>
  cipassword = var.cipassword <span class="hljs-comment"># Password for the cloud-init user</span>
  sshkeys    = var.ssh-public-key <span class="hljs-comment"># The SSH public key to be added to the VM</span>

  <span class="hljs-comment"># Most cloud-init images require a serial device for their display</span>
  serial {
    id = 0
  }

  <span class="hljs-comment"># EFI disk for UEFI boot</span>
  <span class="hljs-comment"># This is required for cloud-init images that use UEFI</span>
  <span class="hljs-comment"># If your template does not use UEFI, you can remove this block</span>
  efidisk {
    efitype = <span class="hljs-string">"4m"</span> 
    storage = <span class="hljs-string">"hdd-vm-data"</span>
  }

  <span class="hljs-comment"># Disk configuration</span>
  disks {
    scsi {
      scsi0 {
        <span class="hljs-comment"># We have to specify the disk from our template, else Terraform will think it's not supposed to be there</span>
        disk {
          storage = var.storage
          <span class="hljs-comment"># The size of the disk should be at least as big as the disk in the template. If it's smaller, the disk will be recreated</span>
          size    = var.disk_size
        }
      }

  <span class="hljs-comment"># Some images require a cloud-init disk on the IDE controller, others on the SCSI or SATA controller</span>
      scsi1 {
        cloudinit {
          storage = <span class="hljs-string">"hdd-vm-data"</span>
        }
      }
    }
  }

  network {
    id = 0
    bridge = <span class="hljs-string">"vmbr0"</span>
    model  = <span class="hljs-string">"virtio"</span>
  }
}
</code></pre>
<p>Here’s what it does:</p>
<ul>
<li><p>Clones an existing <strong>cloud-init-enabled VM template</strong></p>
</li>
<li><p>Assigns VM name, IP, memory, CPU, etc.</p>
</li>
<li><p>Injects <strong>cloud-init config</strong>, such as SSH key and default user</p>
</li>
<li><p>Enables <strong>QEMU guest agent</strong> to enhance functionality</p>
</li>
</ul>
<h2 id="heading-how-to-run-it">▶️ How to Run It</h2>
<p>Open a terminal inside the project directory and follow these steps:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># use single quotes for the API token ID because of the exclamation mark</span>
<span class="hljs-built_in">export</span> PM_API_TOKEN_ID=<span class="hljs-string">'terraform-user@pve!tf_token'</span>
<span class="hljs-built_in">export</span> PM_API_TOKEN_SECRET=<span class="hljs-string">"XXXXXX-XXXX-XXXXX-XXXX-XXXXXXXXXXX"</span>
</code></pre>
<pre><code class="lang-bash">
<span class="hljs-comment"># Initialize Terraform</span>
terraform init
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1753675736255/8ec2c6e9-1606-41d3-91fe-67db3cd407e3.png" alt /></p>
<pre><code class="lang-bash">
<span class="hljs-comment"># Review execution plan</span>
terraform plan

<span class="hljs-comment"># Apply the configuration</span>
terraform apply
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1753677529158/2fd2d130-891b-40b6-acb5-482f17396e32.png" alt /></p>
<h2 id="heading-verify-in-proxmox">✅ Verify in Proxmox</h2>
<ul>
<li><p>Go to your Proxmox Web UI</p>
</li>
<li><p>You’ll see the new VM</p>
</li>
<li><p>Confirm network and SSH access</p>
</li>
<li><p>Check if the VM booted from your template and has your custom config</p>
</li>
</ul>
<hr />
<h2 id="heading-troubleshooting-tips">🛠 Troubleshooting Tips</h2>
<ul>
<li><p><strong>SSH not working?</strong> Ensure cloud-init was enabled in your template and your public SSH key is valid.</p>
</li>
<li><p><strong>Error: Permission denied?</strong> Double-check the Proxmox user's role and permissions.</p>
</li>
<li><p><strong>Wrong IP?</strong> Validate your <code>ipconfig0</code> syntax (should follow <code>ip=x.x.x.x/xx,gw=x.x.x.x</code>).</p>
</li>
</ul>
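<p>For example, a static address (the values here are purely illustrative) would look like this in <code>terraform.tfvars</code> instead of <code>ip=dhcp</code>:</p>
<pre><code class="lang-bash">ipconfig0 = <span class="hljs-string">"ip=192.168.1.50/24,gw=192.168.1.1"</span>
</code></pre>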
<hr />
<h2 id="heading-whats-next">🧪 What’s Next?</h2>
<p>Once you get one VM working, you can:</p>
<ul>
<li><p>Create multiple VMs using <code>for_each</code></p>
</li>
<li><p>Automate post-deploy scripts via <code>null_resource</code> and <code>remote-exec</code></p>
</li>
<li><p>Turn this into a Kubernetes cluster or homelab infra stack!</p>
</li>
</ul>
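<p>As a rough sketch of the <code>for_each</code> idea — the names, VMIDs, and omitted arguments below are illustrative, not a complete working config:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Hypothetical: turn the single-VM resource into several instances</span>
variable <span class="hljs-string">"vms"</span> {
  <span class="hljs-built_in">type</span>    = map(number)   <span class="hljs-comment"># name =&gt; vmid</span>
  default = { <span class="hljs-string">"k8s-node-1"</span> = 101, <span class="hljs-string">"k8s-node-2"</span> = 102 }
}

resource <span class="hljs-string">"proxmox_vm_qemu"</span> <span class="hljs-string">"nodes"</span> {
  for_each    = var.vms
  name        = each.key
  vmid        = each.value
  target_node = var.target_node
  <span class="hljs-built_in">clone</span>       = var.clone
  <span class="hljs-comment"># ...remaining arguments as in the single-VM resource</span>
}
</code></pre>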
]]></content:encoded></item><item><title><![CDATA[How to Set Up Proxmox with Terraform: A Step-by-Step Guide]]></title><description><![CDATA[Whether you’re just dipping your toes into Terraform or you’ve been automating your infrastructure for years, Hooking up Terraform with Proxmox is a simple step that unlocks a lot of automation power.
By the end of this guide, you’ll have learned how...]]></description><link>https://automatestack.dev/how-to-set-up-proxmox-with-terraform-a-step-by-step-guide</link><guid isPermaLink="true">https://automatestack.dev/how-to-set-up-proxmox-with-terraform-a-step-by-step-guide</guid><category><![CDATA[Terraform]]></category><category><![CDATA[proxmox]]></category><dc:creator><![CDATA[Sumit Sur]]></dc:creator><pubDate>Mon, 30 Jun 2025 18:30:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1766935051741/72faaadf-9777-454d-ad81-eed973df7ee8.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Whether you’re just dipping your toes into Terraform or you’ve been automating your infrastructure for years, Hooking up Terraform with Proxmox is a simple step that unlocks a lot of automation power.</p>
<p>By the end of this guide, you’ll have learned how to:</p>
<ol>
<li><p>Create a dedicated Proxmox role for Terraform</p>
</li>
<li><p>Create a service user and API token</p>
</li>
<li><p>Build a reusable Cloud‑Init template and snippets for quick VM creation</p>
</li>
</ol>
<h3 id="heading-quick-heads-up">👋 Quick heads-up!</h3>
<p>This guide is <strong>Part 1 of a multi-part series</strong>.</p>
<p>In this article, we’ll walk through setting up Proxmox for use with Terraform—creating roles, users, API tokens, and a Cloud-Init template.</p>
<blockquote>
<p><strong>In Part 2</strong>, we’ll use Terraform to <strong>deploy a real VM</strong> from this setup—complete with configuration and automation.</p>
</blockquote>
<p>👉 <a target="_blank" href="https://automatestack.dev/how-to-deploy-your-first-proxmox-virtual-machine-using-terraform"><strong>Jump to Part 2: Deploy Your First VM →</strong></a></p>
<h2 id="heading-create-a-terraform-friendly-role">Create a “Terraform-Friendly” Role</h2>
<p>Let’s give Terraform only the permissions it truly needs: allocating disks, cloning VMs, tweaking cloud‑init settings, and so on. This is called privilege separation—and it’s a best practice to keep your environment secure.</p>
<p>Log in to the Proxmox cluster or host over SSH:</p>
<ul>
<li><p>Create a new role <code>Terraform_Provisioner_role</code></p>
</li>
<li><p>Create the user <code>terraform-user@pve</code></p>
</li>
<li><p>Add the <code>Terraform_Provisioner_role</code> role to <code>terraform-user</code></p>
</li>
</ul>
<pre><code class="lang-bash"><span class="hljs-comment">#create the role</span>
pveum role add Terraform_Provisioner_role -privs <span class="hljs-string">"Datastore.AllocateSpace Datastore.AllocateTemplate Datastore.Audit Pool.Allocate Sys.Audit Sys.Console Sys.Modify VM.Allocate VM.Audit VM.Clone VM.Config.CDROM VM.Config.Cloudinit VM.Config.CPU VM.Config.Disk VM.Config.HWType VM.Config.Memory VM.Config.Network VM.Config.Options VM.Migrate VM.Monitor VM.PowerMgmt SDN.Use"</span>
<span class="hljs-comment">#create the user</span>
pveum user add terraform-user@pve --password &lt;password&gt;
<span class="hljs-comment">#assign the role to the user</span>
pveum aclmod / -user terraform-user@pve -role Terraform_Provisioner_role
</code></pre>
<h2 id="heading-generate-an-api-token">Generate an API Token</h2>
<p>Rather than embedding a password in plain text, we’ll use an API token. This is both safer and more portable—especially in CI/CD pipelines.</p>
<p>You’ll see your token’s ID and secret in the output—jot them down for the next step.</p>
<pre><code class="lang-bash">pveum user token add terraform-user@pve tf_token
</code></pre>
<p>Output:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749611065706/7301e4df-7836-478b-9e34-51a4ee0f55e6.png" alt /></p>
<p>If you’re running Terraform from GitHub Actions, GitLab CI, or another platform, stash these in your project’s secret variables.</p>
<h3 id="heading-disable-privilege-separation">Disable Privilege Separation</h3>
<p>In our Terraform use case, we usually assign permissions to the <strong>user account</strong> (like <code>terraform-user@pve</code>), not directly to the <strong>token</strong> itself. So if "Privilege Separation" is <strong>enabled</strong>, the token might not have access to all the necessary resources—even though the user does!</p>
<p>To make sure the token works as expected, <strong>Privilege Separation must be disabled</strong>, so the token inherits the user's full role permissions.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749653696103/c53f386f-7d48-4b8d-bc5e-d6bb8959f807.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-connecting-from-terraform-to-proxmox"><strong>Connecting from Terraform to Proxmox</strong></h2>
<p>We are using the <a target="_blank" href="https://registry.terraform.io/providers/Telmate/proxmox/latest/docs">Telmate Proxmox Terraform provider</a>.</p>
<p>When using the Telmate Proxmox Terraform provider, you authenticate using <strong>environment variables</strong> that start with <code>PM_</code>: the token ID (which includes your username) and the token secret. Terraform uses these to connect to the Proxmox API. Here’s how you’d set them in your shell:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># use single quotes for the API token ID because of the exclamation mark</span>
<span class="hljs-built_in">export</span> PM_API_TOKEN_ID=<span class="hljs-string">'terraform-user@pve!tf_token'</span>
<span class="hljs-built_in">export</span> PM_API_TOKEN_SECRET=<span class="hljs-string">"XXXXXX-XXXX-XXXXX-XXXX-XXXXXXXXXXX"</span>
</code></pre>
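<p>Watch the quoting: in an interactive Bash session, an exclamation mark inside double quotes can trigger history expansion and silently corrupt the token ID. A quick way to confirm the variable survived intact (the host URL and secret below are placeholders, so substitute your own):</p>
<pre><code class="lang-bash"># placeholder values for illustration; substitute your own host and secret
export PM_API_URL='https://proxmox.example.com:8006/api2/json'
export PM_API_TOKEN_ID='terraform-user@pve!tf_token'
export PM_API_TOKEN_SECRET='xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'

# the '!' must come through literally; single quotes guarantee that
echo "$PM_API_TOKEN_ID"
</code></pre>
<p>The Telmate provider also reads <code>PM_API_URL</code> from the environment, so setting it alongside the token variables keeps your Terraform provider block free of credentials.</p>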
<p>If you’re using GitHub Actions, you can store these in your repository’s secrets.</p>
<h2 id="heading-creating-a-cloud-init-template"><strong>Creating a Cloud Init Template</strong></h2>
<p>Terraform works best when you have a VM template ready. We’ll start with the Ubuntu 24.04 LTS cloud image.</p>
<ul>
<li>Download the Ubuntu cloud image. I chose the <a target="_blank" href="https://cloud-images.ubuntu.com/noble/current/">Ubuntu Server 24.04 LTS cloud image</a></li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749616743956/fd062e42-ec6b-4573-a291-b2ca7bb12a72.png" alt /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749616799755/958ce57a-4db2-4fd0-a91e-a5b3cde8baea.png" alt /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749616841923/ac2d84a4-141a-4982-a015-694f5345d56d.png" alt /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749617016960/6900f7c2-2ae3-4f44-a824-c3724bc3540f.png" alt /></p>
<ul>
<li><p>This will download the image under <code>/var/lib/vz/template/iso</code> on the Proxmox server</p>
</li>
<li><p>Install guest tools</p>
<ul>
<li>Log in to the Proxmox server over SSH</li>
</ul>
</li>
</ul>
<pre><code class="lang-bash"><span class="hljs-comment"># as root on your Proxmox host</span>

apt update
apt install -y libguestfs-tools
</code></pre>
<ul>
<li><p>Using the guest tools, we will customize the image to:</p>
<ul>
<li><p>Install <code>qemu-guest-agent</code></p>
</li>
<li><p>Clear the existing machine ID in the image by truncating the files to zero size, so that each cloned VM generates a new machine ID</p>
</li>
</ul>
</li>
</ul>
<pre><code class="lang-bash">sudo virt-customize -a ubuntu_24-04-server-cloudimg-amd64_copy.img \
  --install qemu-guest-agent \
  --run-command <span class="hljs-string">'systemctl enable qemu-guest-agent'</span>

sudo virt-customize -a ubuntu_24-04-server-cloudimg-amd64_copy.img \
  --run-command <span class="hljs-string">"truncate -s 0 /etc/machine-id /var/lib/dbus/machine-id"</span>
</code></pre>
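<p>A quick local illustration of what the machine-ID step does: truncating to zero bytes leaves the file present but empty, which is what signals the guest OS to generate a fresh machine ID on the clone’s first boot (simply deleting the file is not always treated the same way).</p>
<pre><code class="lang-bash"># demo with a throwaway file; the real targets are /etc/machine-id
# and /var/lib/dbus/machine-id inside the image
tmpfile=$(mktemp)
echo "0123456789abcdef0123456789abcdef" &gt; "$tmpfile"
truncate -s 0 "$tmpfile"
wc -c &lt; "$tmpfile"   # 0 bytes, but the file still exists
</code></pre>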
<h3 id="heading-create-the-vm-for-template">Create the VM for Template</h3>
<pre><code class="lang-bash"><span class="hljs-comment">#create the VM shell</span>
qm create 9000 --name ubuntu-24-04-cloudinit
</code></pre>
<pre><code class="lang-bash"><span class="hljs-comment">#import the cloud image to the VM disk</span>
qm <span class="hljs-built_in">set</span> 9000 --scsi0 local-lvm:0,import-from=/var/lib/vz/template/iso/noble-server-cloudimg-amd64.img
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749617702684/7f4cb807-d992-4b67-8f7d-f0f6b71ba532.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749617725107/6301aabe-2cc1-418f-9d20-da89b7acdf65.png" alt class="image--center mx-auto" /></p>
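<p>If you prefer the shell over the web UI shown in the screenshots above, the remaining template settings can be applied with <code>qm set</code> as well. These are the usual flags for cloud images; the storage name <code>local-lvm</code> is an assumption, so adjust it to your environment:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># add a Cloud-Init drive so Terraform can inject user/network config</span>
qm set 9000 --ide2 local-lvm:cloudinit
<span class="hljs-comment"># boot from the imported disk</span>
qm set 9000 --boot order=scsi0
<span class="hljs-comment"># cloud images expect a serial console</span>
qm set 9000 --serial0 socket --vga serial0
<span class="hljs-comment"># let Proxmox query the guest agent we baked into the image</span>
qm set 9000 --agent enabled=1
</code></pre>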
<h3 id="heading-convert-to-template">Convert to template</h3>
<pre><code class="lang-bash"><span class="hljs-comment">#convert to template</span>
qm template 9000
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749617765011/4375d35e-de91-4374-bbe4-af911ab6e8e2.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-template-created">Template created</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749618000713/29beb744-c47f-4ab6-88d5-eaf724e133e1.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-creating-a-snippet"><strong>Creating a Snippet</strong></h2>
<p>Snippets let you inject arbitrary cloud‑init YAML at VM creation time. Here’s how to set up a folder and add a quick example:</p>
<pre><code class="lang-bash">mkdir -p /var/lib/vz/snippets
</code></pre>
<p>Now that we have a place to store the snippet, we can create the snippet itself. The following command will create a snippet <code>install-packages.yml</code>:</p>
<pre><code class="lang-bash">tee /var/lib/vz/snippets/install-packages.yml &lt;&lt;EOF
<span class="hljs-comment">#cloud-config</span>
runcmd:
  - apt update
  - apt install -y curl jq
EOF
</code></pre>
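<p>Cloud-init will not apply a user-data file that doesn’t begin with the exact <code>#cloud-config</code> header, so it’s worth sanity-checking the snippet. A minimal local check (using a temp file here; on the Proxmox host the path would be <code>/var/lib/vz/snippets/install-packages.yml</code>):</p>
<pre><code class="lang-bash">snippet=$(mktemp)
tee "$snippet" &gt;/dev/null &lt;&lt;'EOF'
#cloud-config
runcmd:
  - apt update
  - apt install -y curl jq
EOF
# the first line must be exactly "#cloud-config"
head -n1 "$snippet"
</code></pre>
<p>Once the file is in place, a snippet like this can be attached to a VM with, for example, <code>qm set &lt;vmid&gt; --cicustom "user=local:snippets/install-packages.yml"</code>, assuming the <code>local</code> storage has the Snippets content type enabled.</p>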
<h2 id="heading-wrapping-up">Wrapping Up</h2>
<p>And that’s it! You now have:</p>
<ul>
<li><p>A <strong>Terraform role</strong> limited to exactly what you need</p>
</li>
<li><p>A <strong>service user</strong> and <strong>token</strong> for secure API access</p>
</li>
<li><p>A <strong>cloud‑init template</strong> ready to spin up Ubuntu VMs</p>
</li>
<li><p>Handy <strong>snippets</strong> for customization</p>
</li>
</ul>
<hr />
<p>Next, learn how to use the template you just built to create real, working virtual machines with cloud-init, snippets, and more:</p>
<h2 id="heading-read-next-deploy-your-first-proxmox-vm-with-terraformhttpsautomatestackdevhow-to-deploy-your-first-proxmox-virtual-machine-using-terraform">👉 <a target="_blank" href="https://automatestack.dev/how-to-deploy-your-first-proxmox-virtual-machine-using-terraform"><strong>🚀 Read Next: Deploy Your First Proxmox VM with Terraform</strong></a></h2>
<hr />
]]></content:encoded></item><item><title><![CDATA[The Ultimate Guide to Right-Sizing CPU & Memory for Virtual Machines]]></title><description><![CDATA[In the world of virtualization, more CPU and memory doesn’t always mean better performance. Over-provisioning resources can lead to higher contention and even degraded application responsiveness.
Right-sizing ensures your workloads get exactly the re...]]></description><link>https://automatestack.dev/the-ultimate-guide-to-right-sizing-cpu-and-memory-for-virtual-machines</link><guid isPermaLink="true">https://automatestack.dev/the-ultimate-guide-to-right-sizing-cpu-and-memory-for-virtual-machines</guid><category><![CDATA[numa]]></category><category><![CDATA[vnuma]]></category><category><![CDATA[ Right Sizing]]></category><category><![CDATA[vm]]></category><category><![CDATA[vmware]]></category><category><![CDATA[vsphere]]></category><category><![CDATA[esxi]]></category><dc:creator><![CDATA[Sumit Sur]]></dc:creator><pubDate>Mon, 30 Jun 2025 18:30:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1766935408273/9470fb15-cd10-458d-93d3-e3364c75fd0b.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the world of virtualization, more CPU and memory doesn’t always mean better performance. Over-provisioning resources can lead to higher contention and even degraded application responsiveness.</p>
<p>Right-sizing ensures your workloads get exactly the resources they need—no more, no less—while maximizing efficiency on your ESXi hosts.</p>
<p>In this guide, we’ll explore <strong>practical CPU and memory sizing strategies</strong>, <strong>NUMA awareness</strong>, and <strong>performance optimization tips</strong> to help you make data-driven decisions.</p>
<h2 id="heading-why-right-sizing-matters"><strong>Why Right-Sizing Matters</strong></h2>
<p>Every ESXi host has finite CPU and memory resources. If one VM hoards them unnecessarily, other VMs suffer. The goal of right-sizing is to:</p>
<ul>
<li><p>Improve <strong>overall cluster performance</strong> by <a target="_blank" href="https://automatestack.dev/resource-contention-in-vsphere-identification-and-solutions">reducing contention.</a></p>
</li>
<li><p>Optimize <strong>hardware utilization</strong>.</p>
</li>
<li><p>Avoid <strong>VM-level performance degradation</strong> caused by unnecessary over-allocation.</p>
</li>
<li><p>Reduce <strong>software licensing costs</strong> for CPU-bound products like databases.</p>
</li>
</ul>
<h2 id="heading-cpu-right-sizing"><strong>CPU Right-Sizing</strong></h2>
<p>When it comes to CPU allocation, more isn’t always better. Virtual CPUs (vCPUs) introduce scheduling overhead—allocating more than necessary can slow things down.</p>
<h3 id="heading-best-practices"><strong>Best Practices</strong></h3>
<ol>
<li><p><strong>Start Small</strong></p>
<ul>
<li><p>Begin with the minimum number of vCPUs required for peak load.</p>
</li>
<li><p>Common starting point: <strong>2 vCPUs</strong> for general-purpose workloads.</p>
</li>
</ul>
</li>
<li><p><strong>Add CPUs Only When Needed</strong></p>
<ul>
<li><p>Monitor for <strong>CPU Ready Time</strong> or <strong>Co-stop</strong> events.</p>
</li>
<li><p>If they are consistently high, then consider adding more vCPUs.</p>
</li>
</ul>
</li>
<li><p><strong>Align with NUMA Topology</strong></p>
<ul>
<li><p>Configure vCPUs as <strong>Cores per Socket</strong> until:</p>
<ul>
<li><p>You exceed the physical core count of a NUMA node, or</p>
</li>
<li><p>You exceed the memory available in a NUMA node.</p>
</li>
</ul>
</li>
</ul>
</li>
<li><p><strong>Avoid Odd vCPU Counts</strong></p>
<ul>
<li>Maintain even counts for better scheduling efficiency.</li>
</ul>
</li>
<li><p><strong>Disable vCPU Hot Add Unless Necessary</strong></p>
<ul>
<li>Enabling hot add disables <strong>vNUMA</strong>—critical for workloads over 8 vCPUs.</li>
</ul>
</li>
<li><p><strong>Mind Licensing Models</strong></p>
<ul>
<li><p><strong>Some software licensing schemes limit the usable socket count</strong>, so configuring the VM with a single socket may result in better performance</p>
<ul>
<li>For example, SQL Server Standard edition running on an 8-vCPU VM with 1 core per socket would be able to utilize only 4 vCPUs. But if the same VM were configured with 1 socket (that is, 8 cores per socket), then all 8 vCPUs would be leveraged.</li>
</ul>
</li>
</ul>
</li>
</ol>
<h2 id="heading-understanding-numa-amp-vnuma"><strong>Understanding NUMA &amp; vNUMA</strong></h2>
<p>Modern servers use <strong>NUMA (Non-Uniform Memory Access)</strong>, where CPU and memory are grouped into <strong>nodes</strong>. Accessing memory within the same node is faster (local memory) than accessing from another (remote memory).</p>
<ul>
<li><p>Let’s understand this with an example. Assume:</p>
<ul>
<li><p>Each physical server has <strong>2 CPU sockets</strong>.</p>
</li>
<li><p>Each socket contains <strong>24 physical cores</strong>.</p>
</li>
<li><p>This means <strong>one NUMA node = 1 socket = 24 cores + its own memory bank</strong>.</p>
</li>
</ul>
</li>
</ul>
<p>    If you create a VM with <strong>up to 24 vCPUs</strong>, you can assign them as <strong>1 socket × 24 cores</strong>.</p>
<ul>
<li>This keeps <strong>all vCPUs within the same NUMA node</strong>, so memory access stays local and fast.</li>
</ul>
<p>    If you create a VM with <strong>more than 24 vCPUs</strong> (say 32 vCPUs):</p>
<ul>
<li><p>The vCPUs will be split across <strong>two NUMA nodes</strong> (because a single node can’t handle more than 24).</p>
</li>
<li><p>This means some CPU threads may need to access memory from the other node, which is <strong>slower</strong>.</p>
</li>
<li><p>At that point, you should configure vNUMA so the VM and its OS are aware of this split.</p>
</li>
</ul>
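<p>Once a large VM is running, you can confirm from inside a Linux guest what topology it actually sees. If vNUMA is exposed correctly, a 32-vCPU VM split as described above should report two NUMA nodes:</p>
<pre><code class="lang-bash"># shows "NUMA node(s):" plus the CPU ranges assigned to each node
lscpu | grep -i numa
</code></pre>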
<h3 id="heading-numa-tips"><strong>NUMA Tips</strong></h3>
<ul>
<li><p>Keep vCPUs and memory allocations <strong>within a single NUMA node</strong> when possible.</p>
</li>
<li><p><strong>vNUMA</strong> becomes relevant when a VM’s vCPUs exceed a single NUMA node’s core count (often &gt;8 vCPUs).</p>
</li>
<li><p>vNUMA exposes NUMA topology to the guest OS so it can optimize memory locality.</p>
</li>
</ul>
<h2 id="heading-memory-right-sizing"><strong>Memory Right-Sizing</strong></h2>
<p>Just like CPU, over-allocating RAM can harm performance due to ballooning and swapping.</p>
<h3 id="heading-best-practices-1"><strong>Best Practices</strong></h3>
<ol>
<li><p><strong>Start with the Working Set</strong></p>
<ul>
<li><p>Measure actual memory use and allocate slightly above it.</p>
</li>
<li><p>Example: If a VM uses 8 GB, allocate ~10 GB.</p>
</li>
</ul>
</li>
<li><p><strong>Avoid Large Headroom</strong></p>
<ul>
<li>Unused RAM may be reclaimed by ballooning, which can cause performance drops.</li>
</ul>
</li>
<li><p><strong>When Ballooning Occurs</strong></p>
<ul>
<li><p>Check if the VM is swapping/paging.</p>
</li>
<li><p>Right-size VMs with over-allocated RAM to free resources for others.</p>
</li>
</ul>
</li>
</ol>
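<p>To see whether ballooning is actually happening, you can check from inside a Linux guest running open-vm-tools; a non-zero balloon value means the host is reclaiming memory from this VM. The command is guarded here so it’s a no-op on machines without VMware Tools:</p>
<pre><code class="lang-bash">if command -v vmware-toolbox-cmd &gt;/dev/null 2&gt;&amp;1; then
  balloon=$(vmware-toolbox-cmd stat balloon)
else
  balloon="n/a (open-vm-tools not installed)"
fi
echo "ballooned memory: $balloon"
</code></pre>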
<h2 id="heading-final-thoughts"><strong>Final Thoughts</strong></h2>
<p>Right-sizing isn’t about giving VMs the <strong>most</strong> resources—it’s about giving them the <strong>right</strong> amount. By starting small, monitoring real usage, and scaling based on evidence, you will boost performance &amp; <a target="_blank" href="https://automatestack.dev/resource-contention-in-vsphere-identification-and-solutions">reduce resource contention</a>.</p>
<p>Awareness of physical topology and NUMA limits is also key to making the most of your hardware.</p>
]]></content:encoded></item><item><title><![CDATA[Step-by-Step Guide: Upgrading vSphere 7 to 8]]></title><description><![CDATA[The support clock is ticking toward October 2, 2025—the official end-of-life date for vSphere 7.
If you still haven’t upgraded your vCenters yet, i hope you find this blog helpful in your journey.
vSphere 8 also brings many features improvements such...]]></description><link>https://automatestack.dev/step-by-step-guide-upgrading-vsphere-7-to-8</link><guid isPermaLink="true">https://automatestack.dev/step-by-step-guide-upgrading-vsphere-7-to-8</guid><category><![CDATA[vmware]]></category><category><![CDATA[vCenter]]></category><category><![CDATA[vcenter upgrade]]></category><category><![CDATA[vsphere]]></category><dc:creator><![CDATA[Sumit Sur]]></dc:creator><pubDate>Mon, 31 Mar 2025 18:30:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1749527094843/cbf2f482-ebf7-4baa-94b2-51e6a565a538.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The support clock is ticking toward <strong>October 2, 2025</strong>—the official end-of-life date for vSphere 7.</p>
<p>If you still haven’t upgraded your vCenters, I hope you find this blog helpful in your journey.</p>
<p>vSphere 8 also brings many feature improvements, such as:</p>
<p>Distributed Services Engine (DPU Offload), Accelerated GPU Capabilities, Native Kubernetes Integration, Enhanced Life-cycle &amp; Cluster Management, Advanced DRS &amp; vMotion, vSAN Express Storage Architecture (ESA), vNUMA GUI Visualization &amp; Security Enhancements.</p>
<p>You can read more about these in <a target="_blank" href="https://blogs.vmware.com/vsphere/2022/08/introducing-vsphere-8-the-enterprise-workload-platform.html">details here</a>.</p>
<h2 id="heading-assessment-amp-planning">📊 Assessment &amp; Planning</h2>
<h3 id="heading-compatibility-checks">🔍 Compatibility Checks</h3>
<ul>
<li><p>Verify hardware (servers, NICs, storage) on VMware HCL (now <a target="_blank" href="https://compatibilityguide.broadcom.com/">Broadcom compatibility guide</a>)</p>
</li>
<li><p>Confirm interoperability of other VMware products like NSX datacenter, VMware Cloud Director, plugins etc using <a target="_blank" href="https://interopmatrix.broadcom.com/Interoperability">VMware Product Interoperability Matrix</a></p>
</li>
<li><p>Validate third-party tools and backup/DR systems.</p>
</li>
<li><p>Decide on the target build numbers of the vCenter Server &amp; the ESXi hosts for your environment</p>
<ul>
<li>In my environment, I decided to go with <code>vCenter 8U3e</code> &amp; <code>ESXi 8U3d</code></li>
</ul>
</li>
</ul>
<h3 id="heading-upgrade-sequence">🔼 Upgrade Sequence</h3>
<ul>
<li><p>If you need to update multiple products in your environment, start with updating the product with the lowest sequence number from the table below.</p>
</li>
<li><p>If a product is not present in your environment, update the subsequent product.</p>
</li>
<li><p>If a product is managed by vRealize Suite Lifecycle Manager, the minimum version may be dictated by vRealize Suite Lifecycle Manager.</p>
</li>
<li><p>If upgrading from vSphere 6.7 with NSX, NVDS to CVDS migration is required. To migrate from NVDS to CVDS, you must first upgrade to vSphere 7.0 Update 2 or higher along with NSX 3.1.x or higher.</p>
</li>
</ul>
<p>Refer this <a target="_blank" href="https://knowledge.broadcom.com/external/article/308161/update-sequence-for-vmware-vsphere-80-an.html">KB Article</a> for more details.</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>🔢 <strong>Sequence</strong></td><td>🧩 <strong>Component</strong></td></tr>
</thead>
<tbody>
<tr>
<td>🔵 1</td><td>🛠️ vRealize Suite Lifecycle Manager</td></tr>
<tr>
<td>🟢 2</td><td>👤 Identity Manager</td></tr>
<tr>
<td>🟡 3</td><td>📊 vRealize Log Insight</td></tr>
<tr>
<td>🟡 3</td><td>📈 vRealize Operations Manager</td></tr>
<tr>
<td>🟠 4</td><td>🌐 vRealize Network Insight</td></tr>
<tr>
<td>🟣 5</td><td>🤖 vRealize Automation</td></tr>
<tr>
<td>🔴 6</td><td>💾 VADP Backup Solution</td></tr>
<tr>
<td>🔵 7</td><td>🔄 vSphere Replication</td></tr>
<tr>
<td>🔵 7</td><td>🚨 Site Recovery Manager</td></tr>
<tr>
<td>🟢 8</td><td>🕸️ NSX</td></tr>
<tr>
<td>🟡 9</td><td>🧠 vCenter Server</td></tr>
<tr>
<td>🟠 10</td><td>🖥️ ESXi</td></tr>
<tr>
<td>🟣 11</td><td>🧰 VMware Tools</td></tr>
<tr>
<td>🔴 12</td><td>🧱 Virtual Hardware</td></tr>
<tr>
<td>🔴 12</td><td>📦 vSAN On-disk Format</td></tr>
</tbody>
</table>
</div><h2 id="heading-mandatory-pre-checks">⚙️ Mandatory Pre-checks</h2>
<ul>
<li><p>Upgrading to vCenter Server 8.0 requires an additional pre-check for certificates with weak signature algorithms</p>
<ul>
<li>The <a target="_blank" href="https://knowledge.broadcom.com/external/article/313460/upgrading-vcenter-server-or-esxi-80-fail.html">pre-check script from vmware</a> ensures that vCenter Server is not using certificates with weak signature algorithms</li>
</ul>
</li>
<li><p>Verify and resolve any expired vCenter Server certificates</p>
<ul>
<li>The <a target="_blank" href="https://knowledge.broadcom.com/external/article/385107/vcert-scripted-vcenter-expired-certific.html">vCert tool</a> from vmware can be used to ease the management capability for most vCenter Server certificate-related operations</li>
</ul>
</li>
<li><p>Use the VCF <a target="_blank" href="https://knowledge.broadcom.com/external/article/344917/using-the-vcf-diagnostic-tool-for-vspher.html">Diagnostic Tool for vSphere</a> <a target="_blank" href="https://knowledge.broadcom.com/external/article/344917/using-the-vcf-diagnostic-tool-for-vspher.html">(VDT)</a> directly on a vCenter Server appliance to execute a series of checks on the system configuration and report user-friendly <code>PASS/FAIL/WARN</code> results for known configuration issues</p>
</li>
</ul>
<h2 id="heading-running-the-pre-checks">🚦 Running the pre-checks</h2>
<p>I copied the VDT tool, the pre-check script (vsphere8_upgrade_certificate_checks), &amp; the vCert tool to the vCenter appliance using SCP, at <code>/tmp/vcenter_8_upgrade_checks</code></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749493225343/b7f41a57-bc6e-4dc3-8e40-76d11455942e.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-running-vsphere8upgradecertificatecheckspy">✅ Running <code>vsphere8_upgrade_certificate_checks.py</code></h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749493665441/803e5ea3-f18b-4128-893b-bd111a38f562.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-running-vdt-tool">✅ Running VDT tool</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749493811448/ad6f698e-a525-436c-87e1-1417c9b586e2.png" alt class="image--center mx-auto" /></p>
<p>The tool generates a <code>PASS/FAIL/WARN</code> report on screen. It also stores the reports &amp; VDT logs at <code>/var/log/vmware/vdt/</code></p>
<p>In my environment, I received 2 <code>FAIL</code> results:</p>
<h3 id="heading-vmdir-domain-functional-level">🛑 VMDIR Domain functional level</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749494116200/88add466-0a95-49af-b193-edd42a288351.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p>I first validated the domain functional level using <code>/usr/lib/vmware-vmafd/bin/dir-cli domain-functional-level get</code></p>
</li>
<li><p>A vCenter that has been upgraded since version 6.5 will have a DFL of 1.</p>
</li>
<li><p>For vCenter version 7 and above, the domain functional level should be 4.</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749494598614/54c36177-f571-4ff7-b41c-f93b84718324.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p>To fix this, I followed this VMware <a target="_blank" href="https://knowledge.broadcom.com/external/article?legacyId=92962">KB article</a> and used the command below</p>
</li>
<li><p><mark>Ensure that you have a snapshot before proceeding</mark></p>
</li>
</ul>
<pre><code class="lang-bash">$ /usr/lib/vmware-vmafd/bin/dir-cli domain-functional-level <span class="hljs-built_in">set</span> --level 4 --login Administrator@vsphere.local --domain-name vsphere.local

$ service-control --restart vmdird
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749494677748/8f928a29-e764-43c3-8c7f-7b8cf9559d6f.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-one-of-the-certificate-in-the-trusted-root-is-not-a-certificate-authority">🛑 One of the Certificate in the trusted root is not a Certificate authority</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749494196443/08cb3753-e545-4913-a212-c72821b20021.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p>I fixed this by removing cert from VMware directory using the <strong>vCert tool</strong></p>
</li>
<li><p><mark>Ensure that you have a snapshot before proceeding</mark></p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749495027121/d44258fd-7d9b-4076-ba5f-f7ed999724a8.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749495059632/d8c71c49-3ef6-421c-bc45-a1b844857960.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p>I re-ran the VDT tool to validate that the issues were fixed &amp; the checks passed</p>
</li>
<li><p>Rebooted the vCenter to make sure there were no issues after the changes</p>
</li>
</ul>
<h2 id="heading-upgrade-procedure">🚀🔁 <strong>Upgrade Procedure</strong></h2>
<h3 id="heading-preparations">📋 Preparations</h3>
<ul>
<li><p>Temporary IP for the upgrade process</p>
<ul>
<li><p>The upgrade installer deploys a new vCenter 8 appliance alongside the old one. A temporary IP lets the installer access and configure it without disrupting existing services</p>
</li>
<li><p>After data migration, the new appliance shuts down the old one and adopts its original IP</p>
</li>
<li><p>Ensure the temporary IP belongs to the same VLAN/subnet as the existing vCenter and is reachable on ports 443 &amp; 5480 from the system running the vCenter installer.</p>
</li>
</ul>
</li>
<li><p>For upgrading vCenter in a High Availability Environment, remove vCenter HA</p>
</li>
<li><p>Reboot 🔁 the vCenter to make sure there is no pending reboot. Verify that all services are up using <code>service-control --status --all</code></p>
</li>
<li><p>Backup 💾 - Make sure to take file based backup from VAMI (<code>https://&lt;vcenter&gt;:5480</code>)</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749495585182/03e0ebb7-f32a-426e-855f-4808a408792f.png" alt class="image--center mx-auto" /></p>
<p>  Confirm that the backup is successful</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749495647100/19e4c867-a28c-4178-a38a-5da37b4e39d1.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>Snapshot 📸 of the vCenter appliance VM.</p>
<p>  <em>If the vCenter is part of an Enhanced Linked Mode (ELM), </em><strong><em>all vCenters in ELM must be powered off simultaneously</em></strong><em>. Snapshots should be taken only after all vCenters are fully powered off.</em></p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749495530281/ee599d40-9c9a-4334-a949-212a73fac07c.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>If the vCenter resides on a cluster with DRS set to <strong>Fully Automated</strong>, change the DRS mode to <strong>Partially Automated</strong> or <strong>Manual</strong> to prevent automatic load balancing during the upgrade.</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749495482108/d788876d-8073-41cb-9582-48d8349380a3.png" alt class="image--center mx-auto" /></p>
</li>
</ul>
<h3 id="heading-start-upgrade">🏁 Start Upgrade</h3>
<p>📀 Mount the vCenter 8 appliance ISO &amp; start the installer</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749496570702/b99af8fe-965c-4259-9401-a1d6b205e892.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-stage-1-deployment">🧭 Stage 1: Deployment</h3>
<p>During Stage 1, the installer deploys a new vCenter 8 appliance alongside the old one. The temporary IP is assigned to it, and the installer will prepare and configure the vCenter services.</p>
<ul>
<li>Choose the upgrade option</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749523980284/92ddf4db-2860-4eaa-a0ce-5e82b3d42327.png" alt class="image--center mx-auto" /></p>
<ul>
<li>Provide the connection details of the vCenter you want to upgrade, along with the credentials for the vCenter or ESXi host where the existing vCenter appliance is currently registered and managed.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749524105768/588b8678-4b79-41b6-86a5-922eeac2af91.png" alt class="image--center mx-auto" /></p>
<ul>
<li>Specify the target vCenter where you want to deploy the new vCenter 8 appliance.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749524140970/3d592b24-4f08-4f36-a236-18a17838950b.png" alt class="image--center mx-auto" /></p>
<ul>
<li>Specify the VM name &amp; credentials for the new appliance</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749524183348/beb37f38-d2a2-4e1b-ba0d-67df1c55ab47.png" alt class="image--center mx-auto" /></p>
<ul>
<li>Choose the deployment size</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749524214620/50c0c85a-8582-4b14-b7fb-21f005b382b3.png" alt class="image--center mx-auto" /></p>
<ul>
<li>Review all the information &amp; hit FINISH</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749524968580/4e58e6f7-4d01-430e-959f-69f97d4f3073.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749524990213/ab87e09d-c9bc-48a4-81ce-94f310354564.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-stage-2-data-migration">🧭 Stage 2: Data Migration</h3>
<p>Once Stage 1 is complete, the installer connects to <code>https://&lt;temporary_ip&gt;:5480</code> to continue with Stage 2.</p>
<ul>
<li>Pre-upgrade checks being run</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749525393835/872fb8cd-cf97-44ec-bb88-ea33b617eeab.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p>If any <code>errors</code> are encountered, they must be resolved before proceeding.</p>
</li>
<li><p><code>Warnings</code> should be reviewed to determine their impact or necessity for action, and addressed accordingly before moving forward.</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749525552081/cb889877-a9b4-451e-b81b-ef809c9193fd.png" alt class="image--center mx-auto" /></p>
<ul>
<li>Choose the appropriate data set that you need to copy over</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749525594336/8ee7b9c3-3a10-4d03-8ed6-f7477b252c34.png" alt class="image--center mx-auto" /></p>
<ul>
<li>Review all the information and hit FINISH</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749525679974/68561fc8-0187-48ff-8024-f9f114d62bf7.png" alt class="image--center mx-auto" /></p>
<ul>
<li>This begins the data copy process. Once the data is copied, the source vCenter appliance will be shut down automatically.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749525748565/f3932668-2084-4c06-8c41-2995143ef099.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749525778400/f5c44914-43c4-4e11-83ef-929ecca9b53c.png" alt class="image--center mx-auto" /></p>
<ul>
<li>The vCenter 8 appliance will come up with the original IP</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749525829854/708fa9d1-d038-4c2a-96b9-cbce8765d3f6.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-verification">✅ Verification</h3>
<ul>
<li>Verify all vCenter services are up</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749526026270/b3651ef2-feb3-40cb-9368-1bcc181ddd74.png" alt class="image--center mx-auto" /></p>
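<p>The same service check can be scripted on the appliance shell. A minimal sketch, assuming SSH access to the VCSA: it filters the output of <code>service-control --status --all</code> for anything listed under the "Stopped" heading (a captured sample is parsed here for illustration):</p>
<pre><code class="lang-bash"># Print services reported as stopped.
# On a live appliance, replace the sample with: service-control --status --all
sample_output=$'Running:\n vmware-vpxd vmware-sts\nStopped:\n vmware-vsan-health'
echo "$sample_output" | awk '/^Stopped:/{flag=1;next}/^[A-Z]/{flag=0}flag'
</code></pre>
<p>An empty result means nothing is stopped; any printed service names need attention before proceeding.</p>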
<ul>
<li>You can re-run the VDT tool to verify that all tests pass. This can help identify any post-upgrade issues</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749526071826/fa8d6813-838c-4a82-bf0c-ee5e99089edc.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p>Verify in the vCenter web UI that all inventory data, configurations, and permissions have been correctly migrated.</p>
</li>
<li><p>Ensure that the new vCenter Server 8.0 is functioning as expected.</p>
</li>
<li><p>Check for any visible errors in the events section.</p>
</li>
<li><p>Verify that any plugins are re-deployed.</p>
</li>
<li><p>Check that all hosts are healthy and connected.</p>
</li>
<li><p>Ensure DRS and vMotion work as expected.</p>
</li>
<li><p>Check if the vSAN, NFS, and VMFS datastores are connected to the hosts.</p>
</li>
<li><p>Check in NSX Manager to ensure the cluster shows as healthy and NSX does not report any issues related to this vCenter.</p>
</li>
<li><p>Verify that other VMware services, like VMware Cloud Director, can connect successfully to the vCenter.</p>
</li>
</ul>
<p>Now we can move on to the next step: upgrading the ESXi hosts to version 8.</p>
]]></content:encoded></item><item><title><![CDATA[Enable Hibernate mode on Ubuntu 25.04 (Plucky Puffin): A Step-by-Step Guide]]></title><description><![CDATA[Hibernation is not enabled by-default in Ubuntu.

If you try to hibernate using “sudo systemctl hibernate”, you will run into the below error:


call to Hibernate failed: Not enough suitable swap space for hibernation available on compatible block de...]]></description><link>https://automatestack.dev/enable-hibernate-mode-on-ubuntu-25-04-plucky-puffin-a-step-by-step-guide-37a9b4ed3f63</link><guid isPermaLink="true">https://automatestack.dev/enable-hibernate-mode-on-ubuntu-25-04-plucky-puffin-a-step-by-step-guide-37a9b4ed3f63</guid><category><![CDATA[ubuntu 25.04]]></category><category><![CDATA[Plucky Puffin]]></category><category><![CDATA[Ubuntu]]></category><category><![CDATA[hibernate]]></category><category><![CDATA[Gnome]]></category><dc:creator><![CDATA[Sumit Sur]]></dc:creator><pubDate>Wed, 12 Mar 2025 18:30:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1766936051172/d30c6de8-ab73-48a5-a8d9-b5d4429ecca0.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<hr />
<ul>
<li><p>Hibernation is not enabled by default in Ubuntu.</p>
</li>
<li><p>If you try to hibernate using <code>sudo systemctl hibernate</code>, you will run into the following error:</p>
</li>
</ul>
<blockquote>
<p>call to Hibernate failed: Not enough suitable swap space for hibernation available on compatible block devices and file systems</p>
</blockquote>
<ul>
<li><p>Hibernation requires a swap space at least equal to your system’s RAM.</p>
</li>
<li><p>If the swap size is less than your RAM, you’ll need to increase it.</p>
</li>
</ul>
<h3 id="heading-check-current-swap-usage">Check current swap usage:</h3>
<pre><code class="lang-bash">swapon --show

free -h
</code></pre>
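<p>The RAM-versus-swap requirement can also be checked with a short script that reads <code>/proc/meminfo</code> directly:</p>
<pre><code class="lang-bash"># Compare total swap against total RAM (both reported in kB by /proc/meminfo).
ram_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
swap_kb=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)
if [ "$swap_kb" -ge "$ram_kb" ]; then
  echo "Swap (${swap_kb} kB) meets the hibernation requirement."
else
  echo "Swap (${swap_kb} kB) is smaller than RAM (${ram_kb} kB); enlarge it first."
fi
</code></pre>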
<p>Depending on your configuration, you might see either:</p>
<ul>
<li><p>A swap partition:</p>
</li>
<li><p>OR a swap file:</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1748487090475/fe7590f1-7083-4128-baaf-d8e195bcf523.png" alt /></p>
<blockquote>
<p>If you don’t have a swap partition, create a dedicated swap partition as described in the next section.</p>
</blockquote>
<h3 id="heading-creating-a-dedicated-swap-partition">Creating a Dedicated Swap Partition</h3>
<ul>
<li><p>Make sure the swap partition is equal to or larger than your RAM size.</p>
</li>
<li><p><strong>Resize</strong> an existing partition with GParted and create a Linux swap partition.</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1748487091611/0bfd647a-0925-4b69-b490-52547d70dfd2.png" alt /></p>
<h3 id="heading-format-and-enable-the-swap">Format and enable the swap:</h3>
<ul>
<li><p>Identify the correct device name and use the commands below to enable the swap:</p>
</li>
<li><p>In my case, the partition is <code>/dev/nvme0n1p3</code>.</p>
</li>
</ul>
<pre><code class="lang-bash">sudo mkswap /dev/nvme0n1p3
sudo swapon /dev/nvme0n1p3
</code></pre>
<h3 id="heading-find-the-uuid-of-the-partition">Find the UUID of the partition:</h3>
<pre><code class="lang-bash">sudo blkid /dev/nvme0n1p3
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1748487092915/26c373ac-8b02-4f6f-9ad1-077193e2c2e8.png" alt /></p>
<h3 id="heading-add-to-etcfstab">Add to <code>/etc/fstab</code>:</h3>
<pre><code class="lang-bash">UUID=<span class="hljs-string">"XXXXXX-XXXXX-XXX-XXX-XXXXXXXX"</span> none swap sw 0 0
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1748487094337/a84583ab-2293-4401-8a64-4e474734f713.png" alt /></p>
<ul>
<li>Reboot and verify the new swap is active</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1748487095673/6c5fa088-9242-4663-b7b0-85a6d25c72ff.png" alt /></p>
<h3 id="heading-edit-grub-configuration">Edit GRUB configuration:</h3>
<pre><code class="lang-bash">sudo nano /etc/default/grub
</code></pre>
<ul>
<li>Find the line starting with <code>GRUB_CMDLINE_LINUX_DEFAULT</code> and add the resume parameter:</li>
</ul>
<pre><code class="lang-bash">GRUB_CMDLINE_LINUX_DEFAULT=<span class="hljs-string">"quiet splash resume=UUID=your-swap-uuid"</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1748487097549/bd88740e-3436-49eb-a495-301b5f3a13ce.png" alt /></p>
<p>Replace <code>your-swap-uuid</code> with the UUID you noted earlier.</p>
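<p>If you prefer not to copy the UUID by hand, it can be extracted from <code>blkid</code>-style output with <code>sed</code>. The sample line below is illustrative; on your system, feed in the real output of <code>sudo blkid /dev/nvme0n1p3</code>:</p>
<pre><code class="lang-bash"># Pull the UUID out of a blkid-style line and print the GRUB resume parameter.
sample='/dev/nvme0n1p3: UUID="1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d" TYPE="swap"'
uuid=$(echo "$sample" | sed -n 's/.*UUID="\([^"]*\)".*/\1/p')
echo "resume=UUID=$uuid"
</code></pre>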
<ul>
<li>Update GRUB:</li>
</ul>
<pre><code class="lang-bash">sudo update-grub
</code></pre>
<ul>
<li>Rebuild initramfs to include resume info &amp; reboot</li>
</ul>
<pre><code class="lang-bash">sudo update-initramfs -u

sudo reboot
</code></pre>
<h3 id="heading-enable-hibernate-in-policykit">Enable Hibernate in PolicyKit:</h3>
<pre><code class="lang-bash">sudo nano /etc/polkit-1/rules.d/10-enable-hibernate.rules
</code></pre>
<ul>
<li>Add the following content to the config:</li>
</ul>
<pre><code class="lang-javascript">polkit.addRule(function(action, subject) {
    if (action.id == "org.freedesktop.login1.hibernate" ||
        action.id == "org.freedesktop.login1.hibernate-multiple-sessions" ||
        action.id == "org.freedesktop.upower.hibernate" ||
        action.id == "org.freedesktop.login1.handle-hibernate-key" ||
        action.id == "org.freedesktop.login1.hibernate-ignore-inhibit") {
        return polkit.Result.YES;
    }
});
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1748487099102/e70d37a3-7aa9-46e2-a4d7-21247a6b404f.png" alt /></p>
<h3 id="heading-test-hibernation">Test Hibernation:</h3>
<pre><code class="lang-bash">sudo systemctl hibernate
</code></pre>
<h3 id="heading-gui-hibernate-button">GUI Hibernate button:</h3>
<p>You can also add a Hibernate button to the status menu using this GNOME extension:</p>
<p><a target="_blank" href="https://extensions.gnome.org/extension/755/hibernate-status-button/">https://extensions.gnome.org/extension/755/hibernate-status-button/</a></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1748487100241/80bc5e8a-6122-4f10-8986-9cb69e4c54c2.png" alt /></p>
]]></content:encoded></item><item><title><![CDATA[CPU Shares in VMware vSphere : A Complete Guide to Prioritizing Your VMs]]></title><description><![CDATA[In virtualized environments, not all workloads are created equal. Some applications require guaranteed CPU access under heavy contention, while others can tolerate delays. vSphere’s CPU Shares lets you assign relative priority weights to virtual mach...]]></description><link>https://automatestack.dev/cpu-shares-in-vmware-vsphere-a-complete-guide-to-prioritizing-your-vms</link><guid isPermaLink="true">https://automatestack.dev/cpu-shares-in-vmware-vsphere-a-complete-guide-to-prioritizing-your-vms</guid><category><![CDATA[cpu contention]]></category><category><![CDATA[VM CPU contention]]></category><category><![CDATA[cpu utilization]]></category><category><![CDATA[CPU optimization]]></category><category><![CDATA[CPU Scheduling]]></category><category><![CDATA[vmware]]></category><category><![CDATA[esxi]]></category><category><![CDATA[vsphere]]></category><dc:creator><![CDATA[Sumit Sur]]></dc:creator><pubDate>Fri, 28 Feb 2025 18:30:00 GMT</pubDate><content:encoded><![CDATA[<p>In virtualized environments, not all workloads are created equal. Some applications require guaranteed CPU access under heavy contention, while others can tolerate delays. vSphere’s <strong>CPU Shares</strong> lets you assign relative priority weights to virtual machines (VMs) so that critical workloads run smoothly when CPU resources are scarce. In this post, you’ll learn:</p>
<ul>
<li><p>What CPU Shares are and how they work</p>
</li>
<li><p>The scope of shares within resource pools and clusters</p>
</li>
<li><p>How to calculate and set custom share values per vCPU</p>
</li>
</ul>
<h2 id="heading-what-are-cpu-shares">What Are CPU Shares?</h2>
<ul>
<li><p>CPU Shares are a <em>relative</em> metric that the ESXi scheduler uses to allocate CPU time when demand exceeds supply.</p>
<p>  Basically, Shares tell ESXi “how much cake” each VM gets when it’s time to cut slices. Your CPUs still run at full GHz; you’re just deciding which VM gets more turns at the table when everyone’s hungry.</p>
</li>
<li><p>Under contention, the ESXi scheduler divides CPU time slices based on each VM’s <strong>total share weight.</strong></p>
</li>
<li><p>VMware provides three preset levels:</p>
<ul>
<li><p>Low (500 shares/vCPU)</p>
</li>
<li><p>Normal (1,000 shares/vCPU)</p>
</li>
<li><p>High (2,000 shares/vCPU)</p>
</li>
</ul>
</li>
<li><p>By default, every VM is set to <strong>Normal</strong>, meaning 1,000 shares for each virtual processor it has. These default values can be altered per-VM or per-resource-pool in the VM’s “Edit Settings” → “CPU Shares” section.</p>
</li>
</ul>
<h2 id="heading-how-cpu-shares-work-under-contention">How CPU Shares Work Under Contention</h2>
<ol>
<li><p><strong>No Contention:</strong> If the ESXi host has sufficient idle CPU capacity, VMware honors VM reservations and immediately grants CPU time as requested, without considering shares.</p>
</li>
<li><p><strong>Contention Phase:</strong> Once CPU demand exceeds supply—after all reservations are met—ESXi calculates each VM’s <em>ResourceUsagePerShare</em> and schedules CPU time in order of decreasing share entitlement.</p>
</li>
<li><p><strong>Relative Allocation:</strong> Suppose three VMs have 2,000, 4,000, and 8,000 shares respectively; under contention, they receive CPU in a 1:2:4 ratio.</p>
</li>
</ol>
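<p>The 1:2:4 split above can be reproduced with a few lines of shell arithmetic (integer-rounded percentages):</p>
<pre><code class="lang-bash"># Each VM's slice of contended CPU is its share count divided by the total.
shares="2000 4000 8000"
total=0
for s in $shares; do total=$((total + s)); done
for s in $shares; do
  echo "VM with $s shares: $((100 * s / total))% of contended CPU"
done
</code></pre>
<p>With 14,000 total shares this prints roughly 14%, 28%, and 57%, i.e. the 1:2:4 ratio.</p>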
<h2 id="heading-cpu-shares-are-scoped-to-their-parent-container">CPU Shares Are Scoped to Their Parent Container</h2>
<h3 id="heading-resource-pool-level-scope">Resource Pool-Level Scope</h3>
<ul>
<li><p>When you set shares on a <strong>resource pool</strong>, you control how much of that pool’s entitled CPU capacity each of its <strong>direct children</strong> receives.</p>
</li>
<li><p><strong>Sibling Pools:</strong> If you have two child pools under the same parent pool (including the cluster’s root pool), their share values dictate what fraction of the parent’s CPU resources each pool may consume under contention.</p>
</li>
<li><p><strong>VMs Within a Pool:</strong> Inside that resource pool, each VM’s own share settings then determine how the pool’s allocated CPU is further split among those VMs.</p>
</li>
</ul>
<h3 id="heading-cluster-level-scope-via-the-root-resource-pool">Cluster-Level Scope via the Root Resource Pool</h3>
<ul>
<li><p>Every DRS cluster automatically has a <strong>root resource pool</strong> that represents 100% of the cluster’s CPU and memory resources.</p>
</li>
<li><p>All top‑level resource pools you create in the cluster are children of this root pool. Therefore, by setting shares on those top‑level pools, you are effectively prioritizing CPU access across the <strong>entire cluster</strong> under contention.</p>
</li>
<li><p>Conversely, shares set on a <strong>nested</strong> pool only affect that subtree— they do <strong>not</strong> influence sibling pools in other branches of the hierarchy.</p>
</li>
</ul>
<h3 id="heading-example">Example</h3>
<ul>
<li><p><strong>Cluster Root Pool</strong> (entitled to 100% of cluster CPU)</p>
<ul>
<li><p><strong>Pool A</strong>: High shares (8,000)</p>
</li>
<li><p><strong>Pool B</strong>: Low shares (2,000)</p>
</li>
<li><p>Result: Under contention, Pool A gets 80% of cluster CPU, Pool B gets 20%</p>
</li>
</ul>
</li>
<li><p><strong>Inside Pool A</strong></p>
<ul>
<li><p><strong>VM A1</strong>: 1,000 shares</p>
</li>
<li><p><strong>VM A2</strong>: 2,000 shares</p>
</li>
<li><p>Result: Pool A’s 80% CPU is split 1:2 between A1 and A2 (≈26.7% vs. 53.3% of total cluster CPU)</p>
</li>
</ul>
</li>
<li><p><strong>Inside Pool B</strong></p>
<ul>
<li>VMs share Pool B’s 20% entitlement according to their own share settings, without affecting Pool A or its VMs.</li>
</ul>
</li>
</ul>
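<p>The whole hierarchy above can be tallied with a quick <code>awk</code> sketch (numbers match the example):</p>
<pre><code class="lang-bash">pool_a=8000; pool_b=2000   # sibling pool shares under the cluster root
vm_a1=1000; vm_a2=2000     # VM shares inside Pool A
awk -v pa="$pool_a" -v pb="$pool_b" -v a1="$vm_a1" -v a2="$vm_a2" 'BEGIN {
  apct = 100 * pa / (pa + pb)            # Pool A entitlement of cluster CPU
  printf "Pool A: %.1f%% of cluster CPU\n", apct
  printf "VM A1: %.1f%%  VM A2: %.1f%%\n", apct * a1 / (a1 + a2), apct * a2 / (a1 + a2)
}'
</code></pre>
<p>This prints 80.0% for Pool A, and 26.7% / 53.3% of total cluster CPU for A1 and A2.</p>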
<h2 id="heading-prioritizing-a-smaller-vm-over-a-larger-one">Prioritizing a Smaller VM over a Larger One</h2>
<p>To make a <strong>4‑vCPU VM</strong> outrank a <strong>16‑vCPU VM</strong>:</p>
<ol>
<li><strong>Boost the 4‑vCPU VM’s Shares</strong></li>
</ol>
<ul>
<li><p>Set to Custom = 20,000 shares total (via UI or PowerCLI).</p>
</li>
<li><p>It now has higher entitlement than the 16‑vCPU VM’s 16,000 shares.</p>
</li>
</ul>
<ol start="2">
<li><strong>Or Lower the 16‑vCPU VM’s Shares</strong></li>
</ol>
<ul>
<li>Change it to Low (16 × 500 = 8,000 shares). Note that this only brings it level with the 4‑vCPU VM set to High (4 × 2,000 = 8,000 shares), so combine it with a higher custom value on the 4‑vCPU VM if you need a clear margin.</li>
</ul>
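<p>The share arithmetic behind both options takes only a few lines (preset values per vCPU: Low 500, Normal 1,000, High 2,000); note that option 2 by itself brings the two VMs to parity at 8,000 shares each:</p>
<pre><code class="lang-bash">vcpus_small=4; vcpus_big=16
echo "Defaults (Normal): small=$((vcpus_small * 1000)) big=$((vcpus_big * 1000))"
echo "Option 1 (custom on small): small=20000 big=$((vcpus_big * 1000))"
echo "Option 2 (High vs Low): small=$((vcpus_small * 2000)) big=$((vcpus_big * 500))"
</code></pre>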
<h2 id="heading-conclusion">Conclusion</h2>
<p>CPU Shares in VMware vCenter are a powerful yet underutilized tool for workload prioritization. By understanding how shares work—within resource pools and across clusters—and by defining a clear baseline-per-vCPU policy, you can ensure that your mission-critical VMs always get the CPU cycles they need, even under heavy contention.</p>
]]></content:encoded></item><item><title><![CDATA[Resource Contention in vSphere : Identification and Solutions]]></title><description><![CDATA[VMware vSphere remains the platform of choice for many organizations seeking flexibility, scalability, and performance. However, as VM density rises and workloads become more varied, performance bottlenecks can surface. To maintain a healthy and resp...]]></description><link>https://automatestack.dev/resource-contention-in-vsphere-identification-and-solutions</link><guid isPermaLink="true">https://automatestack.dev/resource-contention-in-vsphere-identification-and-solutions</guid><category><![CDATA[vm performance]]></category><category><![CDATA[vmware]]></category><category><![CDATA[esxi]]></category><category><![CDATA[VM CPU contention]]></category><category><![CDATA[Performance Optimization]]></category><category><![CDATA[performance]]></category><category><![CDATA[vsphere]]></category><dc:creator><![CDATA[Sumit Sur]]></dc:creator><pubDate>Fri, 31 Jan 2025 18:30:00 GMT</pubDate><content:encoded><![CDATA[<p>VMware vSphere remains the platform of choice for many organizations seeking flexibility, scalability, and performance. However, as VM density rises and workloads become more varied, performance bottlenecks can surface. To maintain a healthy and responsive environment, we administrators must understand the key performance metrics &amp; the story they tell.</p>
<p>In this blog, we'll dive into some of the critical performance metrics that affect VM performance in a vSphere environment and explore the symptoms of contention.</p>
<ul>
<li><p>CPU Ready (<code>%RDY</code>)</p>
</li>
<li><p>CPU Wait (<code>%WAIT</code> and <code>%VMWAIT</code>)</p>
</li>
<li><p>CPU Co-Stop (<code>%CSTP</code>)</p>
</li>
<li><p>Memory Ballooning</p>
</li>
</ul>
<h1 id="heading-cpu-contention">CPU Contention</h1>
<h2 id="heading-rethinking-the-vcpu-to-pcpu-ratio">Rethinking the vCPU to pCPU Ratio</h2>
<p>Before we dive into specific metrics like <code>%RDY</code> or <code>%CSTP</code>, we must address one of the most fundamental questions in virtualization: <strong>What is the right vCPU to pCPU ratio?</strong></p>
<p>For years, we administrators relied on general rules of thumb like 4:1 or even 10:1. These static ratios, however, were born in an era when many virtual workloads were largely idle. In such an environment, over-committing physical CPUs made sense. In today's world of resource-intensive applications &amp; dynamic workloads, such a fixed ratio can lead to performance bottlenecks and unhappy users.</p>
<h2 id="heading-drive-by-contention">Drive by Contention</h2>
<p>Instead of focusing on a static ratio, the modern approach is to "drive by contention." This means:</p>
<ul>
<li><p>Actively monitor your environment for signs of CPU stress (like high Ready and Co-Stop times, which we'll cover next).</p>
</li>
<li><p>Expand your resource pools or adjust VM sizing based on real-world data.</p>
<p>  This approach ensures that your applications have the resources they need, when they need them, without being constrained by an arbitrary ratio.</p>
</li>
<li><p>A conservative, safe starting point is a <strong>1:1 vCPU to pCPU ratio</strong> (not counting hyper-threading), which is the most predictable.</p>
</li>
<li><p>This eliminates the risk of contention by dedicating a physical core to every virtual CPU. As your monitoring and operational processes mature, you can cautiously oversubscribe based on observed performance.</p>
</li>
<li><p>Ultimately, the optimal ratio is unique to your environment’s workloads and hardware, and should be based on your own needs and observations.</p>
</li>
</ul>
<h2 id="heading-cpu-metrics">CPU Metrics</h2>
<ol>
<li><h3 id="heading-cpu-ready-rdy"><strong>CPU Ready (</strong><code>%RDY</code><strong>)</strong></h3>
</li>
</ol>
<p><strong>What Is CPU Ready?</strong></p>
<p>CPU Ready measures the percentage of time a virtual CPU (vCPU) is ready to execute instructions but must wait in a queue for a physical CPU core to become available. In a heavily contended environment (one where the vCPU-to-pCPU ratio exceeds what the host can comfortably serve), vCPUs may wait longer before being scheduled, resulting in application slowdowns.</p>
<p><strong>Impact on Performance</strong></p>
<ul>
<li><p><strong>Increased Latency:</strong> Applications experience higher response times due to these micro-pauses in scheduling.</p>
</li>
<li><p><strong>Reduced Throughput:</strong> The overall work processed per unit of time drops, affecting both batch and transactional workloads.</p>
</li>
</ul>
<p><strong>How to Identify CPU Ready</strong></p>
<ul>
<li><p><strong>vSphere Client (vCenter)</strong></p>
<ul>
<li><p>Monitor the "Ready" &amp; the “Readiness“ metric under each VM's CPU performance chart.</p>
</li>
<li><p>A good rule of thumb is to investigate when the ready time consistently exceeds <code>5%</code> per vCPU</p>
</li>
<li><p>For example, a 4-vCPU VM could tolerate up to <code>20%</code> total ready time before showing significant degradation.</p>
</li>
</ul>
</li>
<li><p><strong>esxtop:</strong> In the ESXi shell, run <code>esxtop</code> and press <code>c</code> for the CPU view. Observe the <code>%RDY</code> column for each VM.</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755160890420/ed31f00c-96e7-4629-b03b-d2a603476f01.png" alt class="image--center mx-auto" /></p>
</li>
<li><p><strong>VMware Aria Operations</strong></p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755161985519/b4efaa75-5403-455f-b03c-92cf75b228f8.png" alt class="image--center mx-auto" /></p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755162375576/6becec66-a7a9-4a37-8883-5ed0df0ec7a1.png" alt /></p>
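<p>To turn vCenter's raw "Ready" counter into the percentage these thresholds refer to, the usual conversion divides the summation value (milliseconds) by the sample interval and multiplies by 100; real-time charts sample every 20 seconds. A sketch with hypothetical values:</p>
<pre><code class="lang-bash">ready_ms=1500; interval_ms=20000; vcpus=4   # hypothetical real-time chart sample
awk -v r="$ready_ms" -v i="$interval_ms" -v v="$vcpus" 'BEGIN {
  total = 100 * r / i                 # total ready %, summed across all vCPUs
  printf "CPU Ready: %.1f%% total, %.2f%% per vCPU\n", total, total / v
}'
</code></pre>
<p>Here 1,500 ms of ready time in a 20,000 ms interval is 7.5% total, or under 2% per vCPU, below the 5%-per-vCPU investigation threshold.</p>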
<p><strong>Best Practices to Reduce CPU Ready</strong></p>
<ul>
<li><p><a target="_blank" href="https://automatestack.dev/the-ultimate-guide-to-right-sizing-cpu-and-memory-for-virtual-machines"><strong>Right-Size vCPU Count</strong></a><strong>:</strong> This is the most effective solution. Avoid over-provisioning vCPUs. Assign the minimum number required by the workload inside the guest OS.</p>
</li>
<li><p><strong>Use Affinity Rules Sparingly:</strong> CPU affinity rules restrict the scheduler's flexibility, which can increase ready time. Use them only for specific, well-understood licensing or application requirements.</p>
</li>
<li><p><strong>Resource Pools and Shares:</strong> Allocate CPU shares, reservations, and limits thoughtfully to prioritize critical VMs and prevent "noisy neighbors" from consuming all available resources.</p>
</li>
<li><p><strong>Cluster Sizing:</strong> Ensure your cluster has enough physical cores to support the peak requirements of its running VMs.</p>
</li>
</ul>
<ol start="2">
<li><h3 id="heading-cpu-wait-time-wait-amp-vmwait"><strong>CPU Wait Time (</strong><code>%WAIT</code> &amp; <code>%VMWAIT</code>)</h3>
</li>
</ol>
<p><strong>What Is CPU Wait Time?</strong></p>
<p>This is one of the most misunderstood metrics. CPU Wait (<code>%WAIT</code> in esxtop) measures the time a vCPU is in a stopped state, waiting for an event. A high <code>%WAIT</code> value is not always a problem. It is composed of two key metrics:</p>
<ul>
<li><p><strong>Idle Time (</strong><code>%IDLE</code>): Time the guest OS intentionally put the vCPU in a halt state because it had no work to do. This is normal and expected for a non-busy VM.</p>
</li>
<li><p><strong>VMWait Time (</strong><code>%VMWAIT</code>): Time the vCPU was forced to wait for a hypervisor event to complete, most commonly a storage I/O or network I/O operation. <strong>This is the metric that indicates a potential problem.</strong></p>
</li>
</ul>
<p>The formula is simple: <code>%WAIT = %IDLE + %VMWAIT</code>.</p>
<p><strong>Impact on Performance</strong></p>
<ul>
<li><p>A high <code>%WAIT</code> driven by <strong>high</strong> <code>%IDLE</code> has no negative performance impact; it simply means the VM is idle.</p>
</li>
<li><p>A high <code>%WAIT</code> driven by <strong>high</strong> <code>%VMWAIT</code> indicates a genuine infrastructure bottleneck, causing application stalls, slow data access, and poor user experience.</p>
</li>
</ul>
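<p>The distinction is simple subtraction, which makes triage easy once you have the two numbers (hypothetical esxtop readings shown):</p>
<pre><code class="lang-bash"># %VMWAIT = %WAIT - %IDLE: the part of the wait that is NOT idleness.
wait_pct=95; idle_pct=93
vmwait_pct=$((wait_pct - idle_pct))
echo "%VMWAIT = ${vmwait_pct}% (wait caused by I/O, not by an idle guest)"
</code></pre>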
<p><strong>How to Identify Real CPU Wait Issues</strong></p>
<ol>
<li><p>In the ESXi shell, run <code>esxtop</code> and press <code>c</code> for the CPU view.</p>
</li>
<li><p>Observe the <code>%WAIT</code> column. If it's high, proceed to the next step.</p>
</li>
<li><p>Press <code>f</code> to change fields, navigate to the <code>VMWAIT</code> metric, and press the spacebar to add it to the view.</p>
</li>
<li><p>Analyze the results: If <code>%VMWAIT</code> is high, you have confirmed a bottleneck, likely related to storage or network latency. If <code>%VMWAIT</code> is low, the VM is simply idle, and no action is needed.</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755189306991/c972e6db-abe5-4057-b835-e78c080339e2.png" alt class="image--center mx-auto" /></p>
<p><strong>Best Practices to Reduce High</strong> <code>%VMWAIT</code></p>
<ul>
<li><p><strong>Optimize Storage Paths:</strong> Ensure multipathing is correctly configured and all paths are active.</p>
</li>
<li><p><strong>Upgrade Storage Tiers:</strong> Move latency-sensitive workloads to faster storage (e.g., NVMe, SSD-backed datastores).</p>
</li>
<li><p><strong>Check Network Latency:</strong> Investigate network device performance if storage appears healthy.</p>
</li>
<li><p><strong>Adjust Queue Depths:</strong> Tune HBA and storage array queue depths to handle your workload's I/O profile.</p>
</li>
</ul>
<ol start="3">
<li><h3 id="heading-cpu-co-stop-cstp"><strong>CPU Co-Stop (</strong><code>%CSTP</code>)</h3>
</li>
</ol>
<p><strong>What Is CPU Co-Stop?</strong></p>
<p>CPU Co-Stop (<code>%CSTP</code> in esxtop) measures the time a vCPU is forcibly stopped by the hypervisor to allow its sibling vCPUs within the same VM to catch up. This occurs in Symmetric Multi-Processing (SMP) VMs when the hypervisor cannot schedule all of the VM's vCPUs on physical cores simultaneously. It is a direct symptom of CPU over-contention, especially for "wide" VMs (those with many vCPUs).</p>
<p><strong>Impact on Performance</strong></p>
<ul>
<li><p><strong>Synchronization Overhead:</strong> Multi-threaded applications suffer from added latency as some threads are paused, waiting for others.</p>
</li>
<li><p><strong>Unpredictable Performance:</strong> Co-stop spikes lead to performance "jitter" in CPU-intensive workloads.</p>
</li>
</ul>
<p><strong>How to Identify CPU Co-Stop</strong></p>
<ul>
<li><p><strong>esxtop:</strong> In the CPU view (<code>c</code>), press <code>f</code> to change fields and add the <code>%CSTP</code> column. Any value consistently above <code>3%</code> is a cause for concern.</p>
</li>
<li><p><strong>vRealize Operations:</strong> Advanced analytics can track and alert on <code>%CSTP</code> anomalies over time.</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755189419917/e2f1a26e-6110-4c63-a5bd-008f6629772f.png" alt class="image--center mx-auto" /></p>
</li>
</ul>
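<p>When reviewing an esxtop batch capture, a short pipeline can flag offenders against the ~3% rule of thumb (the VM names and values below are hypothetical, rounded to whole percents):</p>
<pre><code class="lang-bash">threshold=3
printf 'vm-web 0\nvm-db 5\nvm-app 1\n' | while read -r vm cstp; do
  if [ "$cstp" -gt "$threshold" ]; then
    echo "$vm: ${cstp}% co-stop - consider reducing its vCPU count"
  fi
done
</code></pre>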
<p><strong>Best Practices to Mitigate CPU Co-Stop</strong></p>
<ul>
<li><p><strong>Minimize vCPU Count:</strong> The primary solution is to right-size VMs with the fewest vCPUs they truly need. A 2 vCPU VM is far less likely to experience co-stop than an 8-vCPU VM.</p>
</li>
<li><p><strong>NUMA Awareness:</strong> Align VM vCPU and memory sizing with the host's physical NUMA topology to avoid performance penalties from cross-node memory access.</p>
</li>
<li><p><strong>Avoid Heavy Oversubscription:</strong> Keep the overall vCPU-to-physical-core ratio on the host within reasonable bounds (e.g., a 4:1 ratio is a common starting point, but this depends heavily on the workload).</p>
</li>
</ul>
<h2 id="heading-memory-metrics">Memory Metrics</h2>
<ol>
<li><h3 id="heading-memory-ballooning"><strong>Memory Ballooning</strong></h3>
</li>
</ol>
<p><strong>What Is Memory Ballooning?</strong></p>
<p>Memory ballooning is a memory reclamation technique used by the ESXi hypervisor when a host is under memory pressure. A balloon driver (<code>vmmemctl</code>) inside the guest OS "inflates" by requesting memory from the guest. This forces the guest OS to use its own memory management (e.g., its page/swap file) to free up pages, which the hypervisor can then reclaim and allocate to another VM.</p>
<p><strong>Impact on Performance</strong></p>
<ul>
<li><p><strong>Guest-Level Paging:</strong> When ballooning is active, the guest OS is forced to swap memory to its own virtual disk. This disk I/O is thousands of times slower than accessing RAM, severely degrading application performance.</p>
</li>
<li><p><strong>Increased Disk I/O:</strong> Guest OS swap activity generates additional storage load, which can compound existing I/O bottlenecks.</p>
</li>
</ul>
<p><strong>How to Identify Ballooning</strong></p>
<ul>
<li><p><strong>vSphere Client:</strong> In the VM’s "Memory" performance chart, monitor the “Ballooned memory” metric. Any sustained non-zero value indicates the host is or was recently under memory pressure.</p>
</li>
<li><p><strong>esxtop:</strong> In <code>esxtop</code>, press <code>m</code> for memory view. Check the <code>MCTLSZ</code> column for the amount of memory being reclaimed by the balloon driver.</p>
</li>
</ul>
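<p>Summing the <code>MCTLSZ</code> column across VMs gives a quick view of total reclamation pressure on the host (a hypothetical two-column capture of VM name and ballooned MB is parsed here):</p>
<pre><code class="lang-bash">printf 'vm-web 0.00\nvm-db 512.00\nvm-app 128.00\n' |
  awk '{ total += $2 } END { printf "Total ballooned: %.0f MB\n", total }'
</code></pre>
<p>Any sustained non-zero total means the host is reclaiming guest memory and is worth investigating.</p>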
<p><strong>Best Practices to Minimize Ballooning</strong></p>
<ul>
<li><p><a target="_blank" href="https://automatestack.dev/the-ultimate-guide-to-right-sizing-cpu-and-memory-for-virtual-machines"><strong>Right-Size VM Memory</strong></a><strong>:</strong> Allocate only the memory the application truly needs. Over-allocating RAM to idle VMs "traps" that memory, making it unavailable to other VMs.</p>
</li>
<li><p><strong>Monitor Host Memory Usage:</strong> Ensure hosts have sufficient free memory to avoid contention. Use vSphere DRS to balance memory load across a cluster.</p>
</li>
<li><p><strong>Use Reservations for Critical VMs:</strong> If a VM must never have its memory reclaimed, set a memory reservation.</p>
</li>
<li><p><strong>Leverage vSphere Host Cache:</strong> Configure swap-to-host-cache on a fast SSD to mitigate the performance impact when host-level swapping is unavoidable.</p>
</li>
</ul>
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>Effective VMware vSphere performance tuning depends on a deep understanding of these metrics.</p>
<p>By correctly interpreting CPU Ready, Wait, and Co-Stop, we can distinguish between an idle VM and one genuinely struggling with contention.</p>
<p><a target="_blank" href="https://automatestack.dev/the-ultimate-guide-to-right-sizing-cpu-and-memory-for-virtual-machines">Right-sizing resources</a>, optimizing infrastructure, and continuous monitoring are key to a high-performing virtual environment.</p>
<p>Implement continuous performance monitoring using vRealize Operations or native vCenter dashboards.</p>
]]></content:encoded></item><item><title><![CDATA[Effortlessly Import Your VirtualBox/VMware VMs to Proxmox Using Bash Script]]></title><description><![CDATA[📖 Description:
This script streamlines the process of provisioning a VM in Proxmox from an OVA file. It performs the following steps:

Extracts the specified OVA archive.

Converts the contained virtual disk (VMDK/VHD) to QCOW2 format.

Creates a ne...]]></description><link>https://automatestack.dev/effortlessly-import-your-virtualboxvmware-vms-to-proxmox-using-bash-script</link><guid isPermaLink="true">https://automatestack.dev/effortlessly-import-your-virtualboxvmware-vms-to-proxmox-using-bash-script</guid><category><![CDATA[proxmox]]></category><category><![CDATA[Bash]]></category><category><![CDATA[bash scripting]]></category><category><![CDATA[VirtualBox ]]></category><category><![CDATA[vmware]]></category><dc:creator><![CDATA[Sumit Sur]]></dc:creator><pubDate>Sat, 30 Nov 2024 18:30:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1748802859852/a367617a-0eed-4383-976f-d6fed2de8f17.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1 id="heading-description"><strong>📖 Description:</strong></h1>
<p>This script streamlines the process of provisioning a VM in Proxmox from an OVA file. It performs the following steps:</p>
<ol>
<li><p>Extracts the specified OVA archive.</p>
</li>
<li><p>Converts the contained virtual disk (VMDK/VHD) to QCOW2 format.</p>
</li>
<li><p>Creates a new Proxmox VM with basic resources.</p>
</li>
<li><p>Imports the QCOW2 disk into a specified Proxmox storage.</p>
</li>
<li><p>Attaches the imported disk as a SCSI device to the created VM.</p>
</li>
</ol>
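<p>For orientation, the five steps map onto the following commands (a sketch only; VM ID <code>9001</code>, the file names, and the storage are hypothetical, and <code>DRYRUN=echo</code> previews each command instead of executing it on a Proxmox host):</p>
<pre><code class="lang-bash">DRYRUN=echo   # unset on a real Proxmox host to actually run the commands
$DRYRUN tar -xvf /var/lib/vz/import/vm01.ova -C /var/lib/vz/import   # 1. extract the OVA (a tar archive)
$DRYRUN qemu-img convert -O qcow2 vm01-disk1.vmdk vm01.qcow2         # 2. convert the disk to QCOW2
$DRYRUN qm create 9001 --name vm01 --memory 2048 --cores 2 --net0 virtio,bridge=vmbr0   # 3. create the VM
$DRYRUN qm importdisk 9001 vm01.qcow2 local-lvm                      # 4. import the disk into storage
$DRYRUN qm set 9001 --scsi0 local-lvm:vm-9001-disk-0                 # 5. attach it as a SCSI device
</code></pre>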
<h1 id="heading-prerequisites"><strong>🔧 Prerequisites</strong></h1>
<ul>
<li><p>Proxmox VE with <code>qm</code> and <code>qemu-img</code> utilities installed and accessible in your PATH.</p>
</li>
<li><p>Appropriate permissions to create VMs and access storage on the Proxmox host.</p>
</li>
<li><p>The OVA file and working directory must be readable/writable by the script user.</p>
</li>
</ul>
<h1 id="heading-usage"><strong>🚀 Usage</strong></h1>
<pre><code class="lang-bash">./import_ova.sh &lt;OVA_FILE&gt; &lt;VM_ID&gt; [STORAGE] [TEMPLATE_DIR]
</code></pre>
<h2 id="heading-ova-import-parameters"><strong>📋 OVA Import Parameters</strong></h2>
<p><code>&lt;OVA_FILE&gt;</code><br />Full OVA filename (including the <code>.ova</code> extension)</p>
<p><code>&lt;VM_ID&gt;</code><br />Proxmox VM ID to create and import the disk into</p>
<p><code>[STORAGE]</code> <em>(Optional)</em><br />Proxmox storage target for the disk<br /><strong>Default</strong>: <code>local-lvm</code></p>
<p><code>[TEMPLATE_DIR]</code> <em>(Optional)</em><br />Directory containing the OVA file and generated images<br /><strong>Default</strong>: <code>/var/lib/vz/import</code></p>
<h1 id="heading-upload-the-ova-to-proxmox-server"><strong>📤 Upload the OVA to Proxmox server</strong></h1>
<ul>
<li><p>Web interface:</p>
<ul>
<li>You can upload the <code>.ova</code> file from the Import section of the local storage using the Proxmox web interface.</li>
</ul>
</li>
<li><p><img src="https://miro.medium.com/v2/resize:fit:700/1*0_rUcxIkya9235AyBv0upA.png" alt /></p>
<ul>
<li>This puts the file at <code>/var/lib/vz/import/</code></li>
</ul>
</li>
</ul>
<p>    <img src="https://miro.medium.com/v2/resize:fit:573/1*lWI_WzeM9cVSbSyY88D0dA.png" alt /></p>
<ul>
<li><p>Using SCP:</p>
<pre><code class="lang-bash">  scp vm01.ova root@&lt;proxmox server ip&gt;:/var/lib/vz/template/
</code></pre>
<p>  <img src="https://miro.medium.com/v2/resize:fit:700/1*bmGNFexlEHIKDu0h64GvJw.png" alt /></p>
<h1 id="heading-run-the-script"><strong>▶️ Run the script:</strong></h1>
<p>  Get the <a target="_blank" href="https://github.com/sumitsaz23/proxmox-scripts/tree/main/import_ova">script</a> from the <a target="_blank" href="https://github.com/sumitsaz23/proxmox-scripts/tree/main/import_ova">GitHub repository</a>.</p>
<h2 id="heading-locally"><strong>Locally:</strong></h2>
<pre><code class="lang-bash">  git <span class="hljs-built_in">clone</span> https://github.com/sumitsaz23/proxmox-scripts.git

  <span class="hljs-built_in">cd</span> proxmox-scripts/import_ova/

  ./import_ova.sh my-vm.ova 123 my-storage /var/lib/vz/template
</code></pre>
<h2 id="heading-download-amp-run-directly-using-curl"><strong>Download &amp; Run Directly using</strong> <code>curl:</code></h2>
<pre><code class="lang-bash">  curl -sSL \
    https://raw.githubusercontent.com/sumitsaz23/proxmox-scripts/main/import_ova/import_ova.sh \
    | bash -s -- my-vm.ova 123 my-storage /var/lib/vz/template
</code></pre>
<p>  <img src="https://miro.medium.com/v2/resize:fit:700/1*Ejn2RSdX9WDcAiaKARSYEg.png" alt /></p>
<p>  <img src="https://miro.medium.com/v2/resize:fit:700/1*3qxm3-GWsVUW6AaK6eHAHA.png" alt /></p>
<h2 id="heading-vm-successfully-booting-up"><strong>VM Successfully booting up</strong></h2>
<p>  <img src="https://miro.medium.com/v2/resize:fit:700/1*M1bN6he4yuJZRnDPeEDO2g.png" alt /></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Azure VM : Custom Data vs. User Data]]></title><description><![CDATA[If you’ve ever provisioned a Virtual Machine (VM) in Azure, you’ve likely stared at the "Advanced" tab during creation and wondered, "Should I put my script in Custom Data? Or is User Data the way to go?"
While both features allow you to inject data ...]]></description><link>https://automatestack.dev/azure-vm-custom-data-vs-user-data</link><guid isPermaLink="true">https://automatestack.dev/azure-vm-custom-data-vs-user-data</guid><category><![CDATA[Azure]]></category><category><![CDATA[Azure VM Instances]]></category><category><![CDATA[az-104]]></category><dc:creator><![CDATA[Sumit Sur]]></dc:creator><pubDate>Wed, 31 Jul 2024 18:30:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1765261909323/899f85ed-a9a6-4662-8ab3-af48067a87f5.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you’ve ever provisioned a Virtual Machine (VM) in Azure, you’ve likely stared at the "Advanced" tab during creation and wondered, "Should <strong>I put my script in Custom Data? Or is User Data the way to go?"</strong></p>
<p>While both features allow you to inject data into your VM, they serve very different phases of the VM's lifecycle. Confusing them can lead to automation failures or security gaps.</p>
<p>In this post, we’ll break down what can be used in each, the critical differences, and—most importantly—how to secure them.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1765261526000/c6c3fdde-9985-46ae-902d-a19547bb4844.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-1-custom-data-the-bootstrapper">1. Custom Data: The "Bootstrapper"</h2>
<p><strong>Custom Data</strong> is the classic, tried-and-true method for bootstrapping Azure VMs. Think of it as the "instruction manual" you hand to the VM the very first time it wakes up.</p>
<h3 id="heading-what-is-it-used-for">What is it used for?</h3>
<p>It is primarily designed for <strong>provisioning</strong>. It tells the VM how to configure itself immediately after creation.</p>
<h3 id="heading-what-can-be-used-in-it">What can be used in it?</h3>
<ul>
<li><p><strong>Cloud-init files (Linux):</strong> This is the most common use case. You pass a YAML file that creates users, installs packages (like Nginx or Docker), and writes configuration files.</p>
</li>
<li><p><strong>Shell Scripts (Bash):</strong> Simple startup scripts for Linux.</p>
</li>
<li><p><strong>PowerShell Scripts (Windows):</strong> <em>While you can put them here, Windows does not execute them automatically by default.</em></p>
</li>
<li><p><strong>Configuration Files:</strong> Any base64-encoded file (config, JSON, XML).</p>
</li>
</ul>
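<p>For example, a minimal <code>cloud-config</code> file passed as Custom Data could install and start Nginx on first boot (the package and command are illustrative):</p>
<pre><code class="lang-yaml">#cloud-config
package_update: true
packages:
  - nginx
runcmd:
  - systemctl enable --now nginx
</code></pre>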
<h3 id="heading-how-it-works">How it works</h3>
<ul>
<li><p><strong>Linux:</strong> If your image uses <code>cloud-init</code> (standard on Ubuntu, RHEL, CentOS, etc.), it automatically detects, decodes, and executes the Custom Data during the <strong>first boot only</strong>.</p>
</li>
<li><p><strong>Windows:</strong> Azure places the data in a binary file at <code>%SYSTEMDRIVE%\AzureData\CustomData.bin</code>. It sits there passively. To run it, you must use a separate tool (like the <strong>Custom Script Extension</strong>) or have a scheduled task pre-baked into your image to look for and execute this file.</p>
</li>
</ul>
<h2 id="heading-2-user-data-the-persistent-store">2. User Data: The "Persistent Store"</h2>
<p><strong>User Data</strong> is a newer feature designed to offer a persistent data store that stays with the VM throughout its life.</p>
<h3 id="heading-what-is-it-used-for-1">What is it used for?</h3>
<p>It is designed for <strong>runtime configuration</strong> and <strong>metadata</strong> that your application might need to check periodically. Unlike custom data, it is meant to be accessible easily via standard APIs from within the VM.</p>
<h3 id="heading-what-can-be-used-in-it-1">What can be used in it?</h3>
<ul>
<li><p><strong>Environment Flags:</strong> e.g., <code>ENV=Production</code>, <code>ClusterID=12345</code>.</p>
</li>
<li><p><strong>Version Pins:</strong> e.g., <code>AppVersion=2.1.0</code>.</p>
</li>
<li><p><strong>Bootstrapping Scripts:</strong> Modern versions of <code>cloud-init</code> (21.2+) <em>can</em> consume User Data for provisioning if Custom Data is empty.</p>
</li>
<li><p><strong>Custom Config Blobs:</strong> A JSON blob containing connection strings (non-sensitive ones!) or feature toggles.</p>
</li>
</ul>
<h3 id="heading-how-it-works-1">How it works</h3>
<ul>
<li><p><strong>Persistence:</strong> User Data persists for the lifetime of the VM. You can even update it while the VM is running (though the VM won't know unless it polls for changes).</p>
</li>
<li><p><strong>Accessibility:</strong> It is available via the <strong>Azure Instance Metadata Service (IMDS)</strong>. Any process inside the VM can retrieve it by querying a local endpoint.<br />  IMDS is a REST API that's available at a well-known, non-routable IP address (<code>169.254.169.254</code>). You can only access it from within the VM. Communication between the VM and IMDS never leaves the host.</p>
<pre><code class="lang-powershell">  <span class="hljs-comment">## For Windows ##</span>
  <span class="hljs-built_in">Invoke-RestMethod</span> <span class="hljs-literal">-Headers</span> <span class="hljs-selector-tag">@</span>{<span class="hljs-string">"Metadata"</span>=<span class="hljs-string">"true"</span>} <span class="hljs-literal">-Method</span> GET <span class="hljs-literal">-NoProxy</span> <span class="hljs-literal">-Uri</span> <span class="hljs-string">"http://169.254.169.254/metadata/instance?api-version=2025-04-07"</span> | <span class="hljs-built_in">ConvertTo-Json</span> <span class="hljs-literal">-Depth</span> <span class="hljs-number">64</span>
</code></pre>
<pre><code class="lang-bash">  <span class="hljs-comment">## For Linux ##</span>
  curl -s -H Metadata:<span class="hljs-literal">true</span> --noproxy <span class="hljs-string">"*"</span> <span class="hljs-string">"http://169.254.169.254/metadata/instance?api-version=2025-04-07"</span> | jq
</code></pre>
</li>
</ul>
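<p>User Data itself lives under the <code>compute/userData</code> path of IMDS and is returned base64-encoded, so you decode it after fetching. A sketch for Linux (the API version shown is an assumption; use whichever version your region supports):</p>
<pre><code class="lang-bash"># Fetch only the userData field and decode it
curl -s -H Metadata:true --noproxy "*" \
  "http://169.254.169.254/metadata/instance/compute/userData?api-version=2021-02-01&amp;format=text" \
  | base64 --decode
</code></pre>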
<h2 id="heading-3-the-comparison-table">3. The Comparison Table</h2>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Feature</td><td>Custom Data</td><td>User Data</td></tr>
</thead>
<tbody>
<tr>
<td><strong>Primary Goal</strong></td><td>Initial Boot/Provisioning</td><td>Persistent Configuration/Metadata</td></tr>
<tr>
<td><strong>Execution</strong></td><td><strong>Automatic</strong> (Linux/cloud-init)</td><td><strong>Passive</strong> (Data store)*</td></tr>
<tr>
<td><strong>Persistence</strong></td><td>Available at boot; hard to retrieve later</td><td>Available anytime via API (IMDS)</td></tr>
<tr>
<td><strong>Updateable?</strong></td><td>No (Static after creation)</td><td>Yes (Can be updated anytime)</td></tr>
<tr>
<td><strong>Retrieval Method</strong></td><td>File on disk (<code>ovf-env.xml</code> / <code>.bin</code>)</td><td>HTTP Request (IMDS)</td></tr>
<tr>
<td><strong>Size Limit</strong></td><td>64 KB</td><td>64 KB</td></tr>
</tbody>
</table>
</div><p><em>Note: While User Data is passive by default, modern cloud-init can be configured to execute it.</em></p>
<h2 id="heading-4-security">4. Security</h2>
<p>Security is the most critical differentiator. Because both methods involve passing data to a VM, it is tempting to dump secrets (passwords, API keys) here. <strong>Do not do this.</strong></p>
<h3 id="heading-why-is-it-unsafe">Why is it unsafe?</h3>
<h4 id="heading-1-custom-data-risks-the-file-system-risk">1. Custom Data Risks (The File System Risk)</h4>
<ul>
<li><p><strong>Exposure:</strong> Custom Data is stored as a file on the VM's disk.</p>
<ul>
<li><p><strong>Linux:</strong> It often resides in <code>/var/lib/waagent/ovf-env.xml</code> or <code>/var/lib/cloud/instance/</code>. Any user with read access to these directories (typically root/sudo) can read it.</p>
</li>
<li><p><strong>Windows:</strong> It sits in <code>%SYSTEMDRIVE%\AzureData\CustomData.bin</code>.</p>
</li>
</ul>
</li>
<li><p><strong>Logging:</strong> If your script prints secrets to the console (stdout/stderr) during execution, those secrets might end up in system logs (<code>/var/log/cloud-init-output.log</code> or Azure Boot Diagnostics logs), which are viewable from the Azure Portal.</p>
</li>
</ul>
<h4 id="heading-2-user-data-risks-the-imds-risk">2. User Data Risks (The IMDS Risk)</h4>
<ul>
<li><p><strong>Open Access:</strong> User Data is served via the <strong>Instance Metadata Service (IMDS)</strong>, a local HTTP server at <code>169.254.169.254</code>.</p>
</li>
<li><p><strong>No Authentication:</strong> By default, <strong>any process</strong> running on that VM (not just root/admin) can query this URL and retrieve the data. If an attacker manages to run a simple script on your VM (or exploits a web app vulnerability like SSRF), they can easily read your User Data.</p>
</li>
<li><p><strong>Clear Text:</strong> The API returns the data in base64, which is trivially easy to decode. It is effectively clear text.</p>
</li>
</ul>
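<p>To see why base64 offers no protection, note that anyone who can read the value can reverse it with a single command and no key. A quick illustration (the "secret" is obviously made up):</p>
<pre><code class="lang-bash"># base64 is an encoding, not encryption: it round-trips with no key
encoded=$(printf 'DB_PASSWORD=hunter2' | base64)
printf '%s\n' "$encoded" | base64 --decode
# Prints: DB_PASSWORD=hunter2
</code></pre>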
<h3 id="heading-best-practices-for-security">Best Practices for Security</h3>
<p>If you can't put secrets in Custom/User Data, how do you get them into the VM?</p>
<ol>
<li><p>Instead of passing the <em>password</em> in Custom Data, pass the <strong>instruction</strong> to get the password.</p>
<ul>
<li><p>Enable a <a target="_blank" href="https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/overview"><strong>System Assigned Managed Identity</strong></a> on the VM.</p>
</li>
<li><p>Grant that identity access to an <strong>Azure Key Vault</strong>.</p>
</li>
<li><p>Use Custom Data to run a script (using Azure CLI or PowerShell) that logs in using the Managed Identity (<code>az login --identity</code>) and fetches the secret from the Key Vault.</p>
</li>
</ul>
</li>
<li><p><strong>Restrict IMDS Access (Defense in Depth):</strong> If you use User Data, ensure you are not running untrusted code on the VM. You can also use local OS firewalls (iptables/Windows Firewall) to restrict which users or processes can talk to <code>169.254.169.254</code>.</p>
</li>
<li><p><strong>Assume Visibility:</strong> Always assume that anyone with access to the VM (even low-level access) can read everything in Custom Data and User Data. Treat these fields as "<strong>public</strong>" relative to the VM's internal environment.</p>
</li>
</ol>
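<p>Putting the Managed Identity pattern together, the bootstrap script you pass in Custom Data might look like the sketch below. The vault name <code>myVault</code> and secret name <code>dbPassword</code> are placeholders, and it assumes the Azure CLI is present on the image:</p>
<pre><code class="lang-bash">#!/bin/bash
# Log in as the VM's system-assigned managed identity (no credentials on disk)
az login --identity

# Fetch the secret from Key Vault at boot instead of embedding it in Custom Data
DB_PASSWORD=$(az keyvault secret show \
  --vault-name myVault \
  --name dbPassword \
  --query value -o tsv)

# Use the value in memory only; never echo it to stdout,
# or it may end up in cloud-init / boot diagnostics logs
</code></pre>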
]]></content:encoded></item><item><title><![CDATA[Deploying a Python Web App with on Kubernetes and Persistent NFS Storage]]></title><description><![CDATA[You’ll learn how to :
- Build a simple Flask web app that Displays an image on a web page and allows users to upload and replace the image
- Containerizing it using Docker
- Pushing the image to a Docker hub registry
- Deploying it in Kubernetes - Cr...]]></description><link>https://automatestack.dev/deploying-a-python-web-app-with-on-kubernetes-and-persistent-nfs-storage-5d3373286132</link><guid isPermaLink="true">https://automatestack.dev/deploying-a-python-web-app-with-on-kubernetes-and-persistent-nfs-storage-5d3373286132</guid><dc:creator><![CDATA[Sumit Sur]]></dc:creator><pubDate>Fri, 31 May 2024 18:30:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1748693236118/5636c648-7a00-4a2d-a669-fa9dd83f8845.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-youll-learn-how-to">You’ll learn how to :</h3>
<h3 id="heading-build-a-simple-flask-web-app-that-displays-an-image-on-a-web-page-and-allows-users-to-upload-and-replace-the-image">- Build a simple Flask web app that Displays an image on a web page and allows users to upload and replace the image</h3>
<h3 id="heading-containerizing-it-using-docker">- Containerizing it using Docker</h3>
<h3 id="heading-pushing-the-image-to-a-docker-hub-registry">- Pushing the image to a Docker hub registry</h3>
<h3 id="heading-deploying-it-in-kubernetes-create-kubernetes-secrets-deployments-service-init-container">- Deploying it in Kubernetes - Create Kubernetes secrets, deployments, service, init container</h3>
<h3 id="heading-setting-up-persistent-volumes-with-an-nfs-volume-using-static-pvspvcs">- Setting up persistent volumes with an NFS volume using static PVs/PVCs</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1748487105155/e6b35dba-3dc7-4462-87aa-5ac29d33cde0.png" alt /></p>
<h3 id="heading-folder-structure">Folder Structure</h3>
<p>Below is a typical folder structure for this project.</p>
<p>This structure separates the application logic, HTML templates, and static assets, making it easy to maintain and deploy.</p>
<pre><code class="lang-bash">my_app/
├── app.py <span class="hljs-comment"># Flask application code</span>
├── Dockerfile <span class="hljs-comment"># Dockerfile for containerization</span>
├── requirements.txt <span class="hljs-comment"># (Optional) Python dependencies</span>
├── templates/
│ └── index.html <span class="hljs-comment"># HTML template for the web app</span>
└── static/
├── uploads/ <span class="hljs-comment"># Directory to store uploaded images (persistent)</span>
└── images/
└── logo.png <span class="hljs-comment"># logo used in the header</span>
</code></pre>
<h3 id="heading-1-setting-up-your-development-environment">1. Setting Up Your Development Environment</h3>
<h4 id="heading-installing-python-and-flask">Installing Python and Flask</h4>
<p>Before starting, ensure you have <strong>Python 3</strong> installed. You can install it using:</p>
<pre><code class="lang-bash">sudo apt update &amp;&amp; sudo apt install python3 python3-pip -y <span class="hljs-comment"># Ubuntu/Debian</span>
</code></pre>
<pre><code class="lang-bash">brew install python3 <span class="hljs-comment"># macOS</span>
</code></pre>
<p>For Windows, download and install Python from <a target="_blank" href="https://www.python.org/downloads/">python.org</a>.</p>
<p>Next, install Flask:</p>
<pre><code class="lang-bash">pip3 install flask
</code></pre>
<h3 id="heading-2-building-the-flask-app">2. Building the Flask App</h3>
<p>Flask is a lightweight and flexible Python web framework for building web applications quickly. It is minimalist yet powerful.</p>
<p><strong>How It Works:</strong></p>
<ul>
<li><p>The app initially displays <code>static/uploads/current.jpg</code>.</p>
</li>
<li><p>Users can upload an image via a form.</p>
</li>
<li><p>The uploaded image replaces the old one.</p>
</li>
</ul>
<p>Let’s add the code to <code>app.py</code> for the application and <code>index.html</code> for the web interface.</p>
<ul>
<li><h3 id="heading-apppy"><strong><em>app.py</em></strong></h3>
</li>
</ul>
<pre><code class="lang-python">from flask import Flask, render_template, request, redirect, url_for
import os

app = Flask(__name__)
UPLOAD_FOLDER = 'static/uploads/'
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER

# Ensure the upload folder exists
os.makedirs(UPLOAD_FOLDER, exist_ok=True)

# Default image shown before any upload
DEFAULT_IMAGE = 'default.jpg'
image_path = os.path.join(UPLOAD_FOLDER, DEFAULT_IMAGE).replace("\\", "/")

@app.route("/", methods=["GET", "POST"])
def index():
    if request.method == "POST":
        if "file" not in request.files:
            return redirect(request.url)

        file = request.files["file"]
        if file.filename == "":
            return redirect(request.url)

        if file:
            # Save the upload as current.jpg, replacing the previous image
            filepath = os.path.join(app.config['UPLOAD_FOLDER'], "current.jpg").replace("\\", "/")
            file.save(filepath)

    return render_template("index.html", image_url="static/uploads/current.jpg")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000, debug=True)
</code></pre>
<ul>
<li><em>templates/index.html</em></li>
</ul>
<pre><code class="lang-xml"><span class="hljs-meta">&lt;!DOCTYPE <span class="hljs-meta-keyword">html</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">html</span> <span class="hljs-attr">lang</span>=<span class="hljs-string">"en"</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">head</span>&gt;</span>
  <span class="hljs-tag">&lt;<span class="hljs-name">meta</span> <span class="hljs-attr">charset</span>=<span class="hljs-string">"UTF-8"</span>&gt;</span>
  <span class="hljs-tag">&lt;<span class="hljs-name">meta</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"viewport"</span> <span class="hljs-attr">content</span>=<span class="hljs-string">"width=device-width, initial-scale=1.0"</span>&gt;</span>
  <span class="hljs-tag">&lt;<span class="hljs-name">title</span>&gt;</span>Image Upload<span class="hljs-tag">&lt;/<span class="hljs-name">title</span>&gt;</span>
  <span class="hljs-tag">&lt;<span class="hljs-name">style</span>&gt;</span><span class="css">
    <span class="hljs-comment">/* Set the background to light blue */</span>
    <span class="hljs-selector-tag">body</span> {
      <span class="hljs-attribute">background-color</span>: lightblue;
      <span class="hljs-attribute">margin</span>: <span class="hljs-number">0</span>;
      <span class="hljs-attribute">font-family</span>: Arial, sans-serif;
    }
    <span class="hljs-comment">/* Header for the Kubernetes logo at the top left */</span>
    <span class="hljs-selector-tag">header</span> {
      <span class="hljs-attribute">position</span>: fixed;
      <span class="hljs-attribute">top</span>: <span class="hljs-number">0</span>;
      <span class="hljs-attribute">left</span>: <span class="hljs-number">0</span>;
      <span class="hljs-attribute">padding</span>: <span class="hljs-number">10px</span>;
      <span class="hljs-attribute">z-index</span>: <span class="hljs-number">1000</span>; <span class="hljs-comment">/* Ensure header stays on top */</span>
    }
    <span class="hljs-selector-tag">header</span> <span class="hljs-selector-tag">img</span> {
      <span class="hljs-attribute">height</span>: <span class="hljs-number">70px</span>; <span class="hljs-comment">/* Adjust size as needed */</span>
    }
    <span class="hljs-comment">/* Container for main content with padding to avoid header overlap */</span>
    <span class="hljs-selector-class">.content</span> {
      <span class="hljs-attribute">padding-top</span>: <span class="hljs-number">50px</span>;
    }
    <span class="hljs-comment">/* Style for the green upload button */</span>
    <span class="hljs-selector-class">.upload-button</span> {
      <span class="hljs-attribute">background-color</span>: green;
      <span class="hljs-attribute">border</span>: none;
      <span class="hljs-attribute">color</span>: <span class="hljs-built_in">rgb</span>(<span class="hljs-number">213</span>, <span class="hljs-number">213</span>, <span class="hljs-number">213</span>);
      <span class="hljs-attribute">padding</span>: <span class="hljs-number">10px</span> <span class="hljs-number">20px</span>;
      <span class="hljs-attribute">text-align</span>: center;
      <span class="hljs-attribute">text-decoration</span>: none;
      <span class="hljs-attribute">display</span>: inline-block;
      <span class="hljs-attribute">font-size</span>: <span class="hljs-number">16px</span>;
      <span class="hljs-attribute">margin</span>: <span class="hljs-number">4px</span> <span class="hljs-number">2px</span>;
      <span class="hljs-attribute">cursor</span>: pointer;
      <span class="hljs-attribute">border-radius</span>: <span class="hljs-number">4px</span>;
    }
    <span class="hljs-comment">/* Style to center and enlarge the image */</span>
    <span class="hljs-selector-class">.centered-image</span> {
      <span class="hljs-attribute">display</span>: block;
      <span class="hljs-attribute">margin</span>: <span class="hljs-number">20px</span> auto;
      <span class="hljs-attribute">max-width</span>: <span class="hljs-number">80%</span>;
      <span class="hljs-attribute">height</span>: auto;
    }
    <span class="hljs-comment">/* Center-align headings and form */</span>
    <span class="hljs-selector-class">.center</span> {
      <span class="hljs-attribute">text-align</span>: center;
    }
  </span><span class="hljs-tag">&lt;/<span class="hljs-name">style</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">head</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">body</span>&gt;</span>
  <span class="hljs-tag">&lt;<span class="hljs-name">header</span>&gt;</span>
    <span class="hljs-comment">&lt;!-- Update the src to point to your logo file --&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">img</span> <span class="hljs-attr">src</span>=<span class="hljs-string">"static/images/logo.png"</span> <span class="hljs-attr">alt</span>=<span class="hljs-string">"Logo"</span>&gt;</span>
  <span class="hljs-tag">&lt;/<span class="hljs-name">header</span>&gt;</span>
  <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"content"</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">h2</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"center"</span>&gt;</span>Upload an Image<span class="hljs-tag">&lt;/<span class="hljs-name">h2</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">form</span> <span class="hljs-attr">method</span>=<span class="hljs-string">"POST"</span> <span class="hljs-attr">enctype</span>=<span class="hljs-string">"multipart/form-data"</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"center"</span>&gt;</span>
      <span class="hljs-tag">&lt;<span class="hljs-name">input</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"file"</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"file"</span>&gt;</span>
      <span class="hljs-tag">&lt;<span class="hljs-name">button</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"submit"</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"upload-button"</span>&gt;</span>Upload<span class="hljs-tag">&lt;/<span class="hljs-name">button</span>&gt;</span>
    <span class="hljs-tag">&lt;/<span class="hljs-name">form</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">h3</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"center"</span>&gt;</span>Current Image:<span class="hljs-tag">&lt;/<span class="hljs-name">h3</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">img</span> <span class="hljs-attr">src</span>=<span class="hljs-string">"{{ image_url }}"</span> <span class="hljs-attr">alt</span>=<span class="hljs-string">"Uploaded Image"</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"centered-image"</span>&gt;</span>
  <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">body</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">html</span>&gt;</span>
</code></pre>
<h2 id="heading-running-the-app-on-your-local-system">Running the App on your local system</h2>
<pre><code class="lang-bash">python3 app.py
</code></pre>
<p>Visit <code>http://127.0.0.1:5000/</code> in your browser to access the app.</p>
<h1 id="heading-3-containerizing-with-docker"><strong>3. Containerizing with Docker</strong></h1>
<p>To install Docker on your system, follow the official <a target="_blank" href="https://docs.docker.com/engine/install/ubuntu"><strong><em>Docker installation guide</em></strong></a>.</p>
<p>Let’s create the Dockerfile:</p>
<ul>
<li><em>Dockerfile</em></li>
</ul>
<pre><code class="lang-dockerfile"># Use the official Python image as a base
FROM python:3.9

# Set the working directory
WORKDIR /app

# Copy all files to the container
COPY . .

# Install dependencies
RUN pip install flask

# Expose the port Flask runs on
EXPOSE 5000

# Run the application
CMD ["python", "app.py"]
</code></pre>
<p>Building &amp; Running the Docker Container</p>
<ul>
<li>Build the Docker Image:</li>
</ul>
<pre><code class="lang-bash">docker build -t my_app .
</code></pre>
<ul>
<li>Run the Container in Detached Mode:</li>
</ul>
<pre><code class="lang-bash">docker run -d -p 5000:5000 -v $(<span class="hljs-built_in">pwd</span>)/static/uploads:/app/static/uploads --name mywebapp my_app
</code></pre>
<p>Explanation of Flags:</p>
<ul>
<li><p><code>-d</code> → Runs the container in <strong>detached mode</strong> (background).</p>
</li>
<li><p><code>-p 5000:5000</code> → Maps <strong>port 5000</strong> of the container to <strong>port 5000</strong> on the node.</p>
</li>
<li><p><code>-v $(pwd)/static/uploads:/app/static/uploads</code> → Mounts the upload directory so files persist.</p>
</li>
<li><p><code>--name mywebapp</code>→ Assigns the container a custom name (<code>mywebapp</code>).</p>
</li>
<li><p><code>my_app</code> → The name of your Docker image.</p>
</li>
</ul>
<p>Access the Application:</p>
<p>Open your browser at <code>http://&lt;Docker_host_IP&gt;:5000</code></p>
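<p>If you prefer the terminal to a browser, a quick smoke test from the Docker host itself could look like this (assuming a local <code>test.jpg</code> exists):</p>
<pre><code class="lang-bash"># Check the index page responds
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:5000/

# Upload a replacement image through the form endpoint
curl -s -o /dev/null -w '%{http_code}\n' -F 'file=@test.jpg' http://localhost:5000/
</code></pre>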
<h1 id="heading-4-pushing-the-image-to-a-registry"><strong>4. Pushing the Image to a Registry</strong></h1>
<p>Now that we have tested the application in a Docker container, let’s push the image we built to a registry.</p>
<p>I am using <a target="_blank" href="https://hub.docker.com/"><strong><em>Docker Hub</em></strong></a>, but you can use any other cloud-based registry such as <a target="_blank" href="https://github.blog/news-insights/product-news/introducing-github-container-registry/"><strong><em>GitHub Container Registry</em></strong></a>, or a self-hosted one such as <a target="_blank" href="https://www.docker.com/blog/how-to-use-your-own-registry-2/"><strong><em>Docker Registry</em></strong></a> or <a target="_blank" href="https://goharbor.io/"><strong><em>Harbor</em></strong></a>.</p>
<p>If your docker hub repository is a private one, then you will need to authenticate.</p>
<blockquote>
<p><em>Use</em> <strong><em>docker login</em></strong> <em>command on your docker host and follow the instructions on the screen</em></p>
</blockquote>
<ul>
<li>Login to registry</li>
</ul>
<pre><code class="lang-bash">root@docker:~<span class="hljs-comment"># docker login</span>

USING WEB-BASED LOGIN

i Info → To sign <span class="hljs-keyword">in</span> with credentials on the <span class="hljs-built_in">command</span> line, use <span class="hljs-string">'docker login -u &lt;username&gt;'</span>


Your one-time device confirmation code is: XXXX-YYYY
Press ENTER to open your browser or submit your device code here: https://login.docker.com/activate

Waiting <span class="hljs-keyword">for</span> authentication <span class="hljs-keyword">in</span> the browser…
</code></pre>
<ul>
<li>Tag the Image</li>
</ul>
<pre><code class="lang-bash">docker tag my_app &lt;docker_hub-repo_name&gt;/python_picture_webapp:v1
</code></pre>
<ul>
<li>Push the Image</li>
</ul>
<pre><code class="lang-bash">docker push &lt;docker_hub-repo_name&gt;/python_picture_webapp:v1
</code></pre>
<h1 id="heading-5-deploying-in-kubernetes"><strong>5. Deploying in Kubernetes</strong></h1>
<ul>
<li><strong>Kubernetes manifests used in this project</strong></li>
</ul>
<pre><code class="lang-bash">apps/python_picture_webapp/
├── deployment.yaml        <span class="hljs-comment"># Deployment resource for the Flask app</span>
├── service.yaml           <span class="hljs-comment"># Service to expose the app</span>
├── persistent-volume.yaml <span class="hljs-comment"># NFS PersistentVolume (PV)</span>
├── persistent-claim.yaml  <span class="hljs-comment"># PersistentVolumeClaim (PVC)</span>
├── secret.yaml            <span class="hljs-comment"># Secret for pulling private images from dockerhub</span>
└── namespace.yaml         <span class="hljs-comment"># Namespace definition (if organizing workloads)</span>
</code></pre>
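<p>The <em>namespace.yaml</em> is listed above but not shown; a minimal sketch could look like this (the namespace name is illustrative, pick your own):</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: Namespace
metadata:
  name: python-picture-webapp   # illustrative name
</code></pre>
<p>If you do use a dedicated namespace, remember to add <code>-n &lt;namespace&gt;</code> (or set <code>metadata.namespace</code>) on the remaining resources and commands.</p>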
<ul>
<li><strong>Create a Secret for accessing the private docker hub</strong></li>
</ul>
<p>Let's use <code>kubectl</code> to create the Secret:</p>
<pre><code class="lang-bash">kubectl create secret docker-registry mycred \
  --docker-server=https://index.docker.io/v1/ \
  --docker-username=&lt;your-username&gt; \
  --docker-password=&lt;your-password&gt; \
  --docker-email=&lt;your-email&gt;
</code></pre>
<p>This command creates a Secret of type <em>kubernetes.io/dockerconfigjson</em>.</p>
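<p>Equivalently, the <em>secret.yaml</em> from the manifest list above can declare the same Secret; the payload below is a placeholder, not real data:</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: Secret
metadata:
  name: mycred
type: kubernetes.io/dockerconfigjson
data:
  # base64 of a .docker/config.json-style "auths" document (placeholder)
  .dockerconfigjson: &lt;base64-encoded-docker-config-json&gt;
</code></pre>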
<p>Retrieve the <code>.data.dockerconfigjson</code> field from that new Secret and decode the data:</p>
<pre><code class="lang-bash">kubectl get secret mycred -o jsonpath=<span class="hljs-string">"{.data.\.dockerconfigjson}"</span> | base64 --decode

<span class="hljs-comment">#Output :</span>

{<span class="hljs-string">"auths"</span>:{<span class="hljs-string">"https://index.docker.io/v1/"</span>:{<span class="hljs-string">"username"</span>:<span class="hljs-string">"test-user"</span>,<span class="hljs-string">"password"</span>:<span class="hljs-string">"your-pass"</span>,<span class="hljs-string">"email"</span>:<span class="hljs-string">"test@acme.example"</span>,<span class="hljs-string">"auth"</span>:<span class="hljs-string">"TlJFeG1pY25mMw=="</span>}}}
</code></pre>
<blockquote>
<p><em>Caution:</em></p>
<p><em>The</em> <code>auth</code> value there is base64 encoded; it is obscured but not secret. Anyone who can read that Secret can learn the registry access bearer token.</p>
</blockquote>
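<p>To see that the <code>auth</code> field is merely encoded, not encrypted, you can round-trip the example credentials above through <code>base64</code> (these are dummy values, not real secrets):</p>
<pre><code class="lang-bash"># The "auth" field is base64("username:password") -- demo with dummy values
auth=$(printf 'test-user:your-pass' | base64)
echo "$auth"                            # the obscured form stored in the Secret
printf '%s' "$auth" | base64 --decode   # recovers username:password
</code></pre>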
<ul>
<li><strong>Create a static PersistentVolume &amp; claim using NFS-backed storage</strong></li>
</ul>
<p>The NFS export must allow <strong><em>rw,sync</em></strong> access:</p>
<pre><code class="lang-bash">root@ovm-nfs:~<span class="hljs-comment"># exportfs -v</span>
/export/nfs_python_pictures_app
                <span class="hljs-number">192.168</span><span class="hljs-number">.1</span><span class="hljs-number">.0</span>/<span class="hljs-number">24</span>(sync,wdelay,hide,no_subtree_check,rw,secure,no_root_squash,no_all_squash)
</code></pre>
<ul>
<li><em>pv_python-pictures-app.yaml</em></li>
</ul>
<pre><code class="lang-yaml">
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">PersistentVolume</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">nfs-python-pictures-app</span>
  <span class="hljs-attr">labels:</span>
    <span class="hljs-attr">type:</span> <span class="hljs-string">nfs</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">python-picture-webapp</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">capacity:</span>
    <span class="hljs-attr">storage:</span> <span class="hljs-string">1Gi</span>
  <span class="hljs-attr">accessModes:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">ReadWriteMany</span>
  <span class="hljs-attr">persistentVolumeReclaimPolicy:</span> <span class="hljs-string">Retain</span>
  <span class="hljs-attr">nfs:</span>
    <span class="hljs-attr">server:</span> <span class="hljs-number">192.168</span><span class="hljs-number">.1</span><span class="hljs-number">.110</span>         <span class="hljs-comment"># Replace with your NFS server hostname/IP</span>
    <span class="hljs-attr">path:</span> <span class="hljs-string">"/export/nfs_python_pictures_app"</span>        <span class="hljs-comment"># Replace with your exported directory</span>
</code></pre>
<ul>
<li><em>pvc_python-pictures-app.yaml</em></li>
</ul>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">PersistentVolumeClaim</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">pvc-python-pictures-app</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">accessModes:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">ReadWriteMany</span>
  <span class="hljs-attr">resources:</span>
    <span class="hljs-attr">requests:</span>
      <span class="hljs-attr">storage:</span> <span class="hljs-string">1Gi</span>
</code></pre>
<ul>
<li><strong>Apply the persistent volume &amp; claim manifests</strong></li>
</ul>
<pre><code class="lang-bash">root@controller01:~<span class="hljs-comment"># kubectl apply -f pv_python-pictures-app.yaml</span>
persistentvolume/nfs-python-pictures-app created
root@controller01:~<span class="hljs-comment"># kubectl apply -f pvc_python-pictures-app.yaml</span>
persistentvolumeclaim/pvc-python-pictures-app created
</code></pre>
<pre><code class="lang-bash">root@controller01:~<span class="hljs-comment"># kubectl get pv,pvc</span>
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                             STORAGECLASS   VOLUMEATTRIBUTESCLASS   REASON   AGE
persistentvolume/nfs-python-pictures-app   1Gi        RWX            Retain           Bound    default/pvc-python-pictures-app                  &lt;<span class="hljs-built_in">unset</span>&gt;                          9d

NAME                                            STATUS   VOLUME                    CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
persistentvolumeclaim/pvc-python-pictures-app   Bound    nfs-python-pictures-app   1Gi        RWX                           &lt;<span class="hljs-built_in">unset</span>&gt;                 9d
</code></pre>
<p>Let's now put together the <strong><em>deployment.yaml</em></strong>.</p>
<ul>
<li><em>deployment.yaml</em></li>
</ul>
<pre><code class="lang-yaml">apiVersion: apps/v1
kind: Deployment
metadata:
  name: python-picture-webapp-v1-1
spec:
  replicas: 4  <span class="hljs-comment"># Number of pods</span>
  selector:
    matchLabels:
      app: python-picture-webapp
  template:
    metadata:
      labels:
        app: python-picture-webapp
        color: blue  
    spec:
      imagePullSecrets:
      - name: mycred   <span class="hljs-comment"># use the secret created in the beginning</span>
      initContainers:  <span class="hljs-comment"># this init container is used to copy the logo.png to the nfs share</span>
      - name: init-static-images
        image: sumitsur74/python_picture_webapp:v1.1   <span class="hljs-comment"># your image from your registry</span>
        <span class="hljs-built_in">command</span>: [<span class="hljs-string">'sh'</span>, <span class="hljs-string">'-c'</span>, <span class="hljs-string">'cp -r /app/static/images/* /mnt/static/images/'</span>]
        volumeMounts:
        - name: nfs-python-pictures-app
          mountPath: /mnt/static
      containers:
      - name: python-picture-webapp
        image: sumitsur74/python_picture_webapp:v1.1  <span class="hljs-comment"># your image from your registry</span>
        ports:
        - containerPort: 5000
        volumeMounts:
        - name: nfs-python-pictures-app
          mountPath: /app/static  <span class="hljs-comment"># Mount the static folder</span>
      volumes:
      - name: nfs-python-pictures-app
        persistentVolumeClaim:
          claimName: pvc-python-pictures-app
</code></pre>
<p>When deploying the app, you might encounter an issue where the <em>logo.png</em> kept under the <code>/static/images</code> directory is not copied to the Persistent Volume (PV). This happens because, in Kubernetes, mounting a volume onto a directory within a container hides that directory's existing contents. Consequently, any files baked into the Docker image at that path become inaccessible once the volume is mounted.</p>
<ul>
<li><strong>Use an Init Container to Populate the PV:</strong></li>
</ul>
<p>An Init Container can be employed to copy the necessary files from the Docker image to the Persistent Volume before the main application container starts. This Init Container copies the contents from <code>/app/static/images/</code> (within the Docker image) to <code>/mnt/static/images/</code>, which is the mounted Persistent Volume.</p>
<p>The main application container then mounts the same Persistent Volume at <code>/app/static</code>, ensuring that the <code>/app/static/images/</code> directory contains the files copied to the PV.</p>
<pre><code class="lang-yaml">initContainers:  <span class="hljs-comment"># this init container is used to copy the logo.png to the nfs share</span>
      - name: init-static-images
        image: sumitsur74/python_picture_webapp:v1.1   <span class="hljs-comment"># your image from your registry</span>
        <span class="hljs-built_in">command</span>: [<span class="hljs-string">'sh'</span>, <span class="hljs-string">'-c'</span>, <span class="hljs-string">'cp -r /app/static/images/* /mnt/static/images/'</span>]
        volumeMounts:
        - name: nfs-python-pictures-app
          mountPath: /mnt/static
</code></pre>
<p>Let's prepare the <strong><em>service.yaml</em></strong> to handle the application's networking.</p>
<p>If a client makes a request to the node at <code>http://&lt;NodeIP&gt;:32000</code>, it will:</p>
<p>🡲Hit port <code>32000</code> on the node.</p>
<p>🡲Be forwarded to the service on <code>port 80</code>.</p>
<p>🡲 The service will route the request to a pod on <code>port 5000</code>.</p>
<ul>
<li><em>service.yaml</em></li>
</ul>
<pre><code class="lang-yaml">apiVersion: v1
kind: Service
metadata:
  name: python-picture-service
spec:
  selector:
    app: python-picture-webapp
    color: blue  
  ports:
    - protocol: TCP
      port: 80
      targetPort: 5000
      nodePort: 32000
  <span class="hljs-built_in">type</span>: NodePort
</code></pre>
<ul>
<li>Apply the deployment &amp; service</li>
</ul>
<pre><code class="lang-bash">root@controller01:~/git_codebase/k8s_homelab/apps/python_picture_webapp<span class="hljs-comment"># kubectl apply -f deployment-v1.1-persistent_init_container.yaml</span>
deployment.apps/python-picture-webapp-v1-1 created

root@controller01:~/git_codebase/k8s_homelab/apps/python_picture_webapp<span class="hljs-comment"># kubectl apply -f service.yaml</span>
service/python-picture-service created
</code></pre>
<ul>
<li>Verify the pods &amp; service</li>
</ul>
<pre><code class="lang-bash">root@controller01:~<span class="hljs-comment"># kubectl get pods</span>
NAME                                         READY   STATUS    RESTARTS   AGE
python-picture-webapp-v1-1-8f94986fc-47bh9   1/1     Running   0          2m28s
python-picture-webapp-v1-1-8f94986fc-6chdm   1/1     Running   0          2m28s
python-picture-webapp-v1-1-8f94986fc-88w5q   1/1     Running   0          2m28s
python-picture-webapp-v1-1-8f94986fc-h9lch   1/1     Running   0          2m28s
</code></pre>
<pre><code class="lang-bash">root@controller01:~<span class="hljs-comment"># kubectl get service -o wide</span>
NAME                     TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE   SELECTOR
kubernetes               ClusterIP   10.96.0.1     &lt;none&gt;        443/TCP        76d   &lt;none&gt;
python-picture-service   NodePort    10.96.5.206   &lt;none&gt;        80:32000/TCP   17d   app=python-picture-webapp,color=blue
</code></pre>
<ul>
<li>Access the application at <code>http://&lt;NodeIP&gt;:32000</code></li>
</ul>
<p><img src="https://miro.medium.com/v2/resize:fit:700/1*TgO9F-1O2qgIklurOYORuw.png" alt="app.png" /></p>
]]></content:encoded></item><item><title><![CDATA[Hosting Your Own DNS with Unbound & Docker on Raspberry Pi]]></title><description><![CDATA[I've been steadily building a self-sufficient environment for my infrastructure experiments. One milestone was setting up my own DNS server—local and fully controlled.
I had a Raspberry Pi lying around, so I decided to use it to host my DNS as a week...]]></description><link>https://automatestack.dev/hosting-your-own-dns-with-unbound-and-docker-on-raspberry-pi</link><guid isPermaLink="true">https://automatestack.dev/hosting-your-own-dns-with-unbound-and-docker-on-raspberry-pi</guid><category><![CDATA[unbound dns]]></category><category><![CDATA[dns on docker]]></category><category><![CDATA[dns server on docker]]></category><category><![CDATA[dns container]]></category><category><![CDATA[dns on raspberrypi]]></category><category><![CDATA[dns resolver]]></category><category><![CDATA[dns]]></category><category><![CDATA[dns-records]]></category><category><![CDATA[dns server]]></category><category><![CDATA[Raspberry Pi]]></category><dc:creator><![CDATA[Sumit Sur]]></dc:creator><pubDate>Fri, 31 May 2024 18:30:00 GMT</pubDate><content:encoded><![CDATA[<p>I've been steadily building a self-sufficient environment for my infrastructure experiments. One milestone was setting up my own DNS server—local and fully controlled.</p>
<p>I had a Raspberry Pi lying around, so I decided to use it to host my DNS as a weekend project.</p>
<p>This post will walk you through how I achieved that using <strong>Unbound</strong>, <strong>Docker</strong>, and <strong>configuration management</strong> on my <strong>Raspberry Pi</strong>.</p>
<h2 id="heading-the-project-code"><code>🗃️</code> The project code</h2>
<ul>
<li><a target="_blank" href="https://github.com/sumitsaz23/unbound-dns-container">Link to the Github Repo</a></li>
</ul>
<h2 id="heading-what-we-are-building">📦 What We Are Building</h2>
<ul>
<li><p>A self-hosted, lightweight <strong>Unbound DNS resolver</strong></p>
</li>
<li><p>Deployed via <strong>Docker Compose</strong></p>
</li>
<li><p>Config stored in <strong>Git</strong> and auto-applied on change</p>
</li>
<li><p>Auto-start on Raspberry Pi boot</p>
</li>
<li><p>Designed for private use with domain: <code>home.lab</code></p>
</li>
<li><p>Resilient to reboots and container crashes</p>
</li>
</ul>
<h2 id="heading-prerequisites">🧰 Prerequisites</h2>
<ul>
<li><p>Installed Ubuntu 24.04 LTS server on <a target="_blank" href="https://www.raspberrypi.com/">Raspberry Pi</a> using <a target="_blank" href="https://www.raspberrypi.com/software/">Raspberry Pi Imager</a></p>
</li>
<li><p>Installed <code>Docker</code> &amp; <code>Docker Compose</code> on the Pi (<a target="_blank" href="https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository">installation guide</a>)</p>
</li>
<li><p><code>Git</code> installed and a <code>private repo</code> created to track Unbound configs</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749921031862/1e4160eb-b3e0-4f89-b511-af7ee712317f.png" alt class="image--center mx-auto" /></p>
<ul>
<li><code>inotifywait</code> installed for the watchdog</li>
</ul>
<pre><code class="lang-bash">sudo apt update
sudo apt install inotify-tools
</code></pre>
<h2 id="heading-project-structure">🗂️ Project Structure</h2>
<pre><code class="lang-bash">~/unbound-gitops/
├── docker-compose.yml
├── Dockerfile
├── unbound/
│   ├── unbound.conf
│   ├── root.hints
│   └── a-records.conf
├── watch-and-restart.sh
└── git-pull.sh (optional)
</code></pre>
<h2 id="heading-docker-compose-file">🐳 Docker Compose File</h2>
<pre><code class="lang-yaml"><span class="hljs-attr">services:</span>
  <span class="hljs-attr">unbound:</span>
    <span class="hljs-attr">build:</span> <span class="hljs-string">.</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">unbound</span>
    <span class="hljs-attr">restart:</span> <span class="hljs-string">unless-stopped</span>  <span class="hljs-comment">#Persistent on reboot. So it automatically starts after reboot</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"53:53/udp"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"53:53/tcp"</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">./unbound/unbound.conf:/etc/unbound/unbound.conf:ro</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">./unbound/a-records.conf:/etc/unbound/a-records.conf:ro</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">./unbound/root.hints:/etc/unbound/root.hints:ro</span>
</code></pre>
<h2 id="heading-dockerfile-for-custom-unbound-image">🛠️ Dockerfile for Custom Unbound Image</h2>
<pre><code class="lang-Dockerfile"><span class="hljs-keyword">FROM</span> alpine:latest

<span class="hljs-keyword">RUN</span><span class="bash"> apk add --no-cache unbound libcap wget</span>

<span class="hljs-keyword">COPY</span><span class="bash"> unbound/unbound.conf /etc/unbound/unbound.conf</span>
<span class="hljs-keyword">COPY</span><span class="bash"> unbound/a-records.conf /etc/unbound/a-records.conf</span>
<span class="hljs-keyword">COPY</span><span class="bash"> unbound/root.hints /etc/unbound/root.hints</span>

<span class="hljs-keyword">RUN</span><span class="bash"> unbound-checkconf</span>

<span class="hljs-keyword">EXPOSE</span> <span class="hljs-number">53</span>/udp <span class="hljs-number">53</span>/tcp

<span class="hljs-keyword">CMD</span><span class="bash"> [<span class="hljs-string">"unbound"</span>, <span class="hljs-string">"-d"</span>, <span class="hljs-string">"-c"</span>, <span class="hljs-string">"/etc/unbound/unbound.conf"</span>]</span>
</code></pre>
<h2 id="heading-sample-unboundconf">⚙️ Sample <code>unbound.conf</code></h2>
<pre><code class="lang-bash">server:
    logfile: <span class="hljs-string">"/var/log/unbound/unbound.log"</span> <span class="hljs-comment"># Log file path</span>
    verbosity: 1 <span class="hljs-comment"># Set verbosity level (0-4)</span>

    <span class="hljs-comment"># Disable logging for performance, enable if you need to debug</span>
    log-queries: no
    log-replies: no
    log-tag-queryreply: no  


    interface: 0.0.0.0 <span class="hljs-comment"># Listen on all interfaces</span>
    port: 53

    <span class="hljs-comment"># Enable IPv4, UDP, and TCP</span>
    do-ip4: yes
    do-udp: yes
    do-tcp: yes

    <span class="hljs-comment"># Disable IPv6 if not needed on your network</span>
    do-ip6: no
    prefer-ip6: no

    <span class="hljs-comment"># Access control list. By default, refuse all.</span>
    <span class="hljs-comment"># Then, allow specific networks.</span>
    access-control: 127.0.0.1/32 allow
    access-control: 192.168.1.0/24 allow
    access-control: 172.17.0.0/16 allow
    access-control: 0.0.0.0/0 deny <span class="hljs-comment"># Deny all other IPs</span>
    <span class="hljs-comment">#access-control: 0.0.0.0/0 allow # Allow all IPs to query</span>
    <span class="hljs-comment"># Uncomment above line to allow all IPs</span>

    <span class="hljs-comment">#root hints file &amp; records file</span>
    root-hints: <span class="hljs-string">"/etc/unbound/root.hints"</span>
    include: <span class="hljs-string">"/etc/unbound/a-records.conf"</span>

    <span class="hljs-comment"># Harden DNS security settings</span>
    hide-identity: yes 
    hide-version: yes
    harden-glue: yes
    harden-dnssec-stripped: yes 

    use-caps-for-id: yes
    prefetch: yes
    rrset-roundrobin: yes
    cache-max-ttl: 86400
    cache-min-ttl: 3600

remote-control:
<span class="hljs-comment"># Enable remote control interface with unbound-control</span>
    control-enable: no
</code></pre>
<h2 id="heading-add-a-records-example-a-recordsconf">📝 Add A Records (Example <code>a-records.conf</code>)</h2>
<pre><code class="lang-bash"><span class="hljs-comment"># Unbound DNS Configuration for Home Lab</span>
<span class="hljs-comment"># This file contains local DNS records for the home lab environment.</span>
<span class="hljs-comment"># It is included in the main unbound configuration file.</span>

<span class="hljs-comment"># Local Zone</span>
  local-zone: <span class="hljs-string">"home.lab."</span> static

<span class="hljs-comment"># A Records</span>
  local-data: <span class="hljs-string">"pve01.home.lab. IN A 192.168.1.190"</span>
  local-data: <span class="hljs-string">"pi.home.lab. IN A 192.168.1.10"</span>
  local-data: <span class="hljs-string">"nfs.home.lab. IN A 192.168.1.110"</span>
  local-data: <span class="hljs-string">"k8scontrol01.home.lab. IN A 192.168.1.100"</span>
  local-data: <span class="hljs-string">"k8snode01.home.lab. IN A 192.168.1.101"</span>
  local-data: <span class="hljs-string">"k8snode02.home.lab. IN A 192.168.1.102"</span>

<span class="hljs-comment"># PTR Record</span>
  local-data-ptr: <span class="hljs-string">"192.168.1.190 pve01.home.lab"</span>
  local-data-ptr: <span class="hljs-string">"192.168.1.10 pi.home.lab"</span>
  local-data-ptr: <span class="hljs-string">"192.168.1.110 nfs.home.lab"</span>
  local-data-ptr: <span class="hljs-string">"192.168.1.100 k8scontrol01.home.lab"</span>
  local-data-ptr: <span class="hljs-string">"192.168.1.101 k8snode01.home.lab"</span>
  local-data-ptr: <span class="hljs-string">"192.168.1.102 k8snode02.home.lab"</span> 

<span class="hljs-comment"># CNAME Record</span>
  local-data: <span class="hljs-string">"storage.home.lab. IN CNAME nfs.home.lab."</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749921665506/fd8b0bdf-e60d-40ab-aec7-21151d3da67c.png" alt /></p>
<h2 id="heading-configure-raspberry-pis-dns-resolver">🧠 Configure Raspberry Pi’s DNS Resolver</h2>
<p>Currently, <code>systemd-resolved</code> is the DNS resolver for the host system, i.e. the RPi.</p>
<p>We need to stop <code>systemd-resolved</code> to free up <code>port 53</code>, so the Unbound container can use it.</p>
<pre><code class="lang-bash">sudo systemctl <span class="hljs-built_in">disable</span> --now systemd-resolved
sudo rm -f /etc/resolv.conf
sudo tee /etc/resolv.conf &gt; /dev/null &lt;&lt;EOF
nameserver 192.168.1.10
options edns0 trust-ad
search home.lab
EOF
</code></pre>
<h2 id="heading-testing">🧪 Testing</h2>
<p>Build the Docker image and start Docker Compose</p>
<pre><code class="lang-bash">git pull
docker compose build
docker compose up -d
</code></pre>
<p>Check the container is up</p>
<pre><code class="lang-bash">docker ps
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749920275010/3834186c-4e6d-41fa-aaaf-d72b9cc98415.png" alt class="image--center mx-auto" /></p>
<p>Test DNS resolution</p>
<pre><code class="lang-bash">$ dig pve01.home.lab

; &lt;&lt;&gt;&gt; DiG 9.20.4-3ubuntu1.1-Ubuntu &lt;&lt;&gt;&gt; pve01.home.lab
;; global options: +cmd
;; Got answer:
;; -&gt;&gt;HEADER&lt;&lt;- opcode: QUERY, status: NOERROR, id: 16775
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;pve01.home.lab.            IN    A

;; ANSWER SECTION:
pve01.home.lab.        975    IN    A    192.168.1.190

;; Query time: 0 msec
;; SERVER: 127.0.0.53<span class="hljs-comment">#53(127.0.0.53) (UDP)</span>
;; WHEN: Sat Jun 14 22:25:18 IST 2025
;; MSG SIZE  rcvd: 59
</code></pre>
<p>You should see <code>status: NOERROR</code> and an <code>ANSWER SECTION</code> with your configured IP</p>
<p>Every time a new DNS record is committed to Git, log in to the RPi and run:</p>
<pre><code class="lang-bash">git pull
docker compose restart &lt;unbound_service_name&gt;
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750322627762/54dc616b-b871-4cf0-8243-d9625cc67f06.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-auto-restart-container-on-config-change">🔄 Auto-Restart Container on Config Change</h2>
<p>This step ensures that any update you commit and pull triggers a container restart.</p>
<h2 id="heading-pull-git-updates">Pull Git Updates</h2>
<p><code>git-pull.sh</code></p>
<pre><code class="lang-bash"><span class="hljs-meta">#!/bin/bash</span>
<span class="hljs-built_in">cd</span> ~/unbound-dns
git pull origin main
</code></pre>
<p>Set up a cron job (e.g. via <code>crontab -e</code>) to automate the pull every 5 minutes:</p>
<pre><code class="lang-bash">*/5 * * * * /home/pi/unbound-dns/git-pull.sh
</code></pre>
<h2 id="heading-setup-the-watchdog">Set up the watchdog</h2>
<p><code>watch-and-restart.sh</code></p>
<pre><code class="lang-bash"><span class="hljs-meta">#!/bin/bash</span>
CONFIG_DIR=<span class="hljs-string">"./unbound"</span>

inotifywait -m -r -e modify,create,delete --format <span class="hljs-string">'%w%f'</span> <span class="hljs-string">"<span class="hljs-variable">$CONFIG_DIR</span>"</span> | <span class="hljs-keyword">while</span> <span class="hljs-built_in">read</span> file; <span class="hljs-keyword">do</span>
    <span class="hljs-built_in">echo</span> <span class="hljs-string">"[INFO] Change detected in <span class="hljs-variable">$file</span>"</span>
    docker compose restart unbound
<span class="hljs-keyword">done</span>
</code></pre>
<p>Make it executable:</p>
<pre><code class="lang-bash">chmod 700 watch-and-restart.sh
</code></pre>
<p>Start the watchdog</p>
<pre><code class="lang-bash">
nohup ./watch-and-restart.sh &gt; ~/watcher.log 2&gt;&amp;1 &amp;
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749926147711/4d870197-91b6-4721-9270-df9e385b2a9e.png" alt class="image--center mx-auto" /></p>
<p>To make the <code>watch-and-restart.sh</code> script <strong>run in the background on reboot</strong>, a reliable approach is to set it up as a <code>systemd</code> service.</p>
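<p>A sketch of such a unit might look like this (the unit name, paths, and user are assumptions based on the project layout above, adjust them to your setup):</p>
<pre><code class="lang-bash"># /etc/systemd/system/unbound-watchdog.service (sketch; paths/user are assumed)
[Unit]
Description=Restart Unbound container on config change
After=docker.service
Requires=docker.service

[Service]
User=pi
WorkingDirectory=/home/pi/unbound-dns
ExecStart=/home/pi/unbound-dns/watch-and-restart.sh
Restart=always

[Install]
WantedBy=multi-user.target
</code></pre>
<p>Then enable it with <code>sudo systemctl enable --now unbound-watchdog.service</code>.</p>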
<p>But I want to completely scrap the watchdog and implement a pipeline-based solution that is more robust and independent. This approach allows for automated workflows that can be triggered by events such as code commits or pull requests.</p>
<p><strong>I am working on a solution using GitHub Actions &amp; a container-based ARM self-hosted runner, which I will share here very soon.</strong></p>
]]></content:encoded></item><item><title><![CDATA[Step-by-Step Guide to Enabling AWS CLI Autocomplete installed via Snap]]></title><description><![CDATA[I installed the aws-cli via snap using
sudo snap install aws-cli --classic

If you’ve also installed AWS CLI using Snap on Ubuntu or other Linux distributions, you may have stumbled upon the same issue that the aws_completer is not placed in the usua...]]></description><link>https://automatestack.dev/step-by-step-guide-to-enabling-aws-cli-autocomplete-installed-via-snap</link><guid isPermaLink="true">https://automatestack.dev/step-by-step-guide-to-enabling-aws-cli-autocomplete-installed-via-snap</guid><category><![CDATA[snapcraft]]></category><category><![CDATA[AWS]]></category><category><![CDATA[awscli]]></category><category><![CDATA[aws cli]]></category><category><![CDATA[Ubuntu]]></category><category><![CDATA[Autocomplete]]></category><dc:creator><![CDATA[Sumit Sur]]></dc:creator><pubDate>Sun, 31 Mar 2024 18:30:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1749385182930/6e5e492f-8c54-4164-885f-52dd0ca3dd61.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I installed the aws-cli via snap using</p>
<pre><code class="lang-bash">sudo snap install aws-cli --classic
</code></pre>
<p>If you’ve also installed AWS CLI using <strong>Snap</strong> on Ubuntu or another Linux distribution, you may have stumbled upon the same issue: the <code>aws_completer</code> binary is not placed at the usual <code>/usr/local/bin/aws_completer</code>.</p>
<h2 id="heading-locate-the-aws-completer">🔗 <strong>Locate the AWS completer</strong></h2>
<p>Let's try to locate the path of <code>aws_completer</code>.</p>
<p><code>which aws_completer</code> returns nothing, so locate it with <code>find</code>:</p>
<pre><code class="lang-bash">$ find / -name aws_completer 2&gt;/dev/null

/snap/aws-cli/1441/aws/dist/aws_completer
/snap/aws-cli/1441/bin/aws_completer
/snap/aws-cli/1443/aws/dist/aws_completer
/snap/aws-cli/1443/bin/aws_completer
</code></pre>
<p>Snap packages use internal versioned folders (e.g. <code>1441</code>, <code>1443</code>) that change on update, making hard-coded paths impractical. This makes configuring autocomplete tricky.</p>
<p>However, Snap maintains a <code>/snap/aws-cli/current</code> symlink to the active version. Link this to a location in your <code>$PATH</code> instead.</p>
<h2 id="heading-1-create-a-stable-symlink-in-your-path">🛠️ 1. Create a stable symlink in your PATH</h2>
<p>Use the snap's <code>current</code> alias to avoid version-specific paths.</p>
<p>This ensures autocomplete continues working even after AWS CLI updates.</p>
<pre><code class="lang-bash">sudo ln -sf /snap/aws-cli/current/bin/aws_completer /usr/<span class="hljs-built_in">local</span>/bin/aws_completer
</code></pre>
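<p>To see why this survives updates, here is a throwaway-directory simulation of the <code>current</code> indirection (the paths are stand-ins for <code>/snap</code>, not real snap directories):</p>
<pre><code class="lang-bash">tmp=$(mktemp -d)
mkdir -p "$tmp/aws-cli/1443/bin" "$tmp/aws-cli/1444/bin"
touch "$tmp/aws-cli/1443/bin/aws_completer" "$tmp/aws-cli/1444/bin/aws_completer"

ln -s "$tmp/aws-cli/1443" "$tmp/aws-cli/current"                     # snap manages this link
ln -s "$tmp/aws-cli/current/bin/aws_completer" "$tmp/aws_completer"  # our stable link

readlink -f "$tmp/aws_completer"      # resolves through "current" to .../1443/bin/aws_completer

ln -sfn "$tmp/aws-cli/1444" "$tmp/aws-cli/current"                   # simulate a snap refresh
readlink -f "$tmp/aws_completer"      # same stable link now resolves to .../1444/bin/aws_completer
</code></pre>
<p>When Snap repoints <code>current</code> at the next revision, the link in <code>/usr/local/bin</code> keeps resolving correctly without any changes on your side.</p>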
<h2 id="heading-step-2-add-completion-setup-to-your-shell">✍️ Step 2: Add completion setup to your shell</h2>
<h3 id="heading-for-bash-bashrc">For Bash (<code>~/.bashrc</code>):</h3>
<pre><code class="lang-bash">complete -C <span class="hljs-string">'/usr/local/bin/aws_completer'</span> aws
</code></pre>
<h3 id="heading-for-zsh-zshrc">For Zsh (<code>~/.zshrc</code>):</h3>
<pre><code class="lang-bash"><span class="hljs-built_in">autoload</span> bashcompinit &amp;&amp; bashcompinit
<span class="hljs-built_in">autoload</span> -Uz compinit &amp;&amp; compinit
complete -C <span class="hljs-string">'/usr/local/bin/aws_completer'</span> aws
</code></pre>
<h2 id="heading-step-3-reload-your-shell">🔄 Step 3: Reload your shell</h2>
<p>Apply changes:</p>
<pre><code class="lang-bash"><span class="hljs-built_in">source</span> ~/.bashrc  <span class="hljs-comment"># or source ~/.zshrc</span>
</code></pre>
<p>Press <strong>Tab</strong> after typing a partial <code>aws</code> command:</p>
<pre><code class="lang-bash">aws s3 &lt;TAB&gt;&lt;TAB&gt;
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749384291235/ffc94c25-2a77-4996-aa71-c50ebbfaddd3.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-step-4-optional-activate-aws-cli-v2-autoprompt">🤖 Step 4 (Optional): Activate AWS CLI v2 Auto‑Prompt</h2>
<p>Beyond tab completion, AWS CLI v2 offers an interactive <strong>auto‑prompt</strong> that guides your input after you press <strong>Enter</strong>.</p>
<p>Enable it permanently (default profile):</p>
<pre><code class="lang-bash">aws configure <span class="hljs-built_in">set</span> cli_auto_prompt on-partial
</code></pre>
<p>Or for just your session:</p>
<pre><code class="lang-bash">export AWS_CLI_AUTO_PROMPT=on-partial
</code></pre>
<p>This mode is especially helpful when you're exploring less familiar commands.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749384577530/9b92fe56-0645-4594-adc6-1e99fbea1b38.png" alt class="image--center mx-auto" /></p>
]]></content:encoded></item></channel></rss>