{"id":737,"date":"2023-11-16T14:56:14","date_gmt":"2023-11-16T14:56:14","guid":{"rendered":"https:\/\/blog.ngocha.biz\/?p=737"},"modified":"2023-11-16T14:56:14","modified_gmt":"2023-11-16T14:56:14","slug":"prometheus-alert-manager","status":"publish","type":"post","link":"https:\/\/blog.ngocha.biz\/?p=737","title":{"rendered":"How to Setup VPC Secondary Network For EKS Cluster"},"content":{"rendered":"<p>In this comprehensive guide, I have covered detailed steps to set up Prometheus Alert Manager and configure email and Slack alerts.<\/p>\n<h2 id=\"setup-prerequisites\">Setup Prerequisites<\/h2>\n<p>You need two Ubuntu servers for this setup.<\/p>\n<p><strong>server-01<\/strong>, the monitoring server, runs <strong>Prometheus, Alert Manager<\/strong>, and <strong>Grafana<\/strong>.<\/p>\n<p>The second server (<strong>server-02<\/strong>) runs the <strong>Node Exporter<\/strong> utility.<\/p>\n<p>Before starting the Alert Manager setup, ensure you have Prometheus and Grafana configured on server-01 and Node Exporter on server-02. You can follow the guides given below.<\/p>\n<ol>\n<li><a href=\"https:\/\/devopscube.com\/install-configure-prometheus-linux\/\">Prometheus Setup<\/a> (Server 01)<\/li>\n<li><a href=\"https:\/\/devopscube.com\/integrate-visualize-prometheus-grafana\/\">Grafana Setup<\/a> (Server 01)<\/li>\n<li><a href=\"https:\/\/devopscube.com\/monitor-linux-servers-prometheus-node-exporter\/\">Node Exporter Setup<\/a> (Server 02)<\/li>\n<\/ol>\n<p>In this guide, we will only look at the Alert Manager setup and its configurations related to email and Slack alerting.<\/p>\n<p>Let&#8217;s get started with the setup.<\/p>\n<h2 id=\"step-1-download-prometheus-alert-manager\">Step 1: Download Prometheus Alert Manager<\/h2>\n<p>Download the <a href=\"https:\/\/prometheus.io\/download\/?ref=devopscube.com\" rel=\"noreferrer noopener\">Alert Manager binary<\/a> that is suitable for your server. 
Here we use <code>v0.26.0<\/code>, the latest version at the time of writing.<\/p>\n<pre><code>wget https:\/\/github.com\/prometheus\/alertmanager\/releases\/download\/v0.26.0\/alertmanager-0.26.0.linux-amd64.tar.gz<\/code><\/pre>\n<p>Create a dedicated user and group for the Alert Manager so that only this user has access to its files.<\/p>\n<pre><code>sudo groupadd -f alertmanager\nsudo useradd -g alertmanager --no-create-home --shell \/bin\/false alertmanager<\/code><\/pre>\n<p>Create directories in <code>\/etc<\/code> and <code>\/var\/lib<\/code> to store the configuration and data files, and change their ownership to the <code>alertmanager<\/code> user.<\/p>\n<pre><code>sudo mkdir -p \/etc\/alertmanager\/templates\nsudo mkdir \/var\/lib\/alertmanager\nsudo chown alertmanager:alertmanager \/etc\/alertmanager\nsudo chown alertmanager:alertmanager \/var\/lib\/alertmanager<\/code><\/pre>\n<p>Extract the Alert Manager archive and change into the extracted directory.<\/p>\n<pre><code>tar -xvf alertmanager-0.26.0.linux-amd64.tar.gz\ncd alertmanager-0.26.0.linux-amd64<\/code><\/pre>\n<p>Copy the <code>alertmanager<\/code> and <code>amtool<\/code> binaries to the <code>\/usr\/bin<\/code> directory and change their owner and group to <code>alertmanager<\/code>. 
Also copy the configuration file <code>alertmanager.yml<\/code> to the <code>\/etc\/alertmanager<\/code> directory and change its owner and group to <code>alertmanager<\/code>.<\/p>\n<pre><code>sudo cp alertmanager \/usr\/bin\/\nsudo cp amtool \/usr\/bin\/\nsudo chown alertmanager:alertmanager \/usr\/bin\/alertmanager\nsudo chown alertmanager:alertmanager \/usr\/bin\/amtool\nsudo cp alertmanager.yml \/etc\/alertmanager\/alertmanager.yml\nsudo chown alertmanager:alertmanager \/etc\/alertmanager\/alertmanager.yml<\/code><\/pre>\n<h2 id=\"step-2-setup-alert-manager-systemd-service\">Step 2: Setup Alert Manager Systemd Service<\/h2>\n<p>Create a service file named <code>alertmanager.service<\/code> in <code>\/etc\/systemd\/system<\/code>.<\/p>\n<pre><code>cat &lt;&lt;EOF | sudo tee \/etc\/systemd\/system\/alertmanager.service\n[Unit]\nDescription=AlertManager\nWants=network-online.target\nAfter=network-online.target\n\n[Service]\nUser=alertmanager\nGroup=alertmanager\nType=simple\nExecStart=\/usr\/bin\/alertmanager \\\n    --config.file \/etc\/alertmanager\/alertmanager.yml \\\n    --storage.path \/var\/lib\/alertmanager\/\n\n[Install]\nWantedBy=multi-user.target\nEOF\nsudo chmod 664 \/etc\/systemd\/system\/alertmanager.service<\/code><\/pre>\n<p>After setting the file permissions, reload the systemd daemon and start the Alert Manager service. 
To make the service start automatically after a reboot, enable it.<\/p>\n<pre><code>sudo systemctl daemon-reload\nsudo systemctl start alertmanager.service\nsudo systemctl enable alertmanager.service<\/code><\/pre>\n<p>Check the status of the service and ensure everything is working fine.<\/p>\n<pre><code>sudo systemctl status alertmanager<\/code><\/pre>\n<p>To access the Prometheus Alert Manager UI over the internet, open the following URL in a browser.<\/p>\n<pre><code>http:\/\/&lt;alertmanager-ip&gt;:9093<\/code><\/pre>\n<p>Replace <code>&lt;alertmanager-ip&gt;<\/code> with your instance&#8217;s public IP. <code>9093<\/code> is the default Alert Manager port.<\/p>\n<p>The user interface looks like this.<\/p>\n<figure class=\"kg-card kg-image-card\"><img decoding=\"async\" src=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-6-30.png\" class=\"kg-image\" alt=\"Alert Manager Dashboard\" loading=\"lazy\" width=\"1159\" height=\"416\" srcset=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w600\/2025\/03\/image-6-30.png 600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w1000\/2025\/03\/image-6-30.png 1000w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-6-30.png 1159w\" sizes=\"auto, (min-width: 720px) 720px\"><\/figure>\n<p>Currently, there are no alerts in the dashboard. 
After finishing the configuration, we will run some tests to see how alerts flow through the Alert Manager.<\/p>\n<h2 id=\"step-3-setup-smtp-credentials\">Step 3: Setup SMTP Credentials<\/h2>\n<p>You can use any SMTP provider with the Alert Manager.<\/p>\n<p>For this setup, I am using AWS SES as the SMTP provider to configure email alerts with Alert Manager.<\/p>\n<p>If you have already configured SES in your AWS account, you can generate the SMTP credentials.<\/p>\n<p>During the setup, please note down the <code>SMTP endpoint<\/code> and <code>STARTTLS Port<\/code>. This information is required to configure the Alert Manager email notification.<\/p>\n<p>To generate a new SMTP credential, open the SES dashboard and click on <code>Create SMTP credentials<\/code>.<\/p>\n<figure class=\"kg-card kg-image-card\"><img decoding=\"async\" src=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-28-18.png\" class=\"kg-image\" alt=\"AWS SES SMTP credentials\" loading=\"lazy\" width=\"663\" height=\"524\" srcset=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w600\/2025\/03\/image-28-18.png 600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-28-18.png 663w\"><\/figure>\n<p>This redirects you to the IAM user creation page. 
Modify the user name if necessary; you don&#8217;t have to change any other parameters.<\/p>\n<figure class=\"kg-card kg-image-card\"><img decoding=\"async\" src=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-30-21.png\" class=\"kg-image\" alt=\"IAM user for AWS SES \" loading=\"lazy\" width=\"641\" height=\"453\" srcset=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w600\/2025\/03\/image-30-21.png 600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-30-21.png 641w\"><\/figure>\n<p>An IAM user name, SMTP user name, and SMTP password will be generated. This information is required to configure the Alert Manager notification setup.<\/p>\n<figure class=\"kg-card kg-image-card\"><img decoding=\"async\" src=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-38-18.png\" class=\"kg-image\" alt=\"Getting AWS SES SMTP user and password.\" loading=\"lazy\" width=\"1474\" height=\"593\" srcset=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w600\/2025\/03\/image-38-18.png 600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w1000\/2025\/03\/image-38-18.png 1000w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-38-18.png 1474w\" sizes=\"auto, (min-width: 720px) 720px\"><\/figure>\n<p>Note down the SMTP details and keep them safe. We will use them in the Alert Manager configuration step.<\/p>\n<h2 id=\"step-4-generate-slack-webhook\">Step 4: Generate Slack Webhook<\/h2>\n<p>Create a Slack channel to receive the Alert Manager notifications. 
You can also use the existing channel and add members who need to get the notifications.<\/p>\n<figure class=\"kg-card kg-image-card\"><img decoding=\"async\" src=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-32-22.png\" class=\"kg-image\" alt=\"\" loading=\"lazy\" width=\"1250\" height=\"311\" srcset=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w600\/2025\/03\/image-32-22.png 600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w1000\/2025\/03\/image-32-22.png 1000w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-32-22.png 1250w\" sizes=\"auto, (min-width: 720px) 720px\"><\/figure>\n<p>Navigate to the administration section and select <code>Manage apps<\/code> to reach the <strong>Slack app directory.<\/strong><\/p>\n<figure class=\"kg-card kg-image-card\"><img decoding=\"async\" src=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-33-18.png\" class=\"kg-image\" alt=\"\" loading=\"lazy\" width=\"916\" height=\"299\" srcset=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w600\/2025\/03\/image-33-18.png 600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-33-18.png 916w\" sizes=\"auto, (min-width: 720px) 720px\"><\/figure>\n<p>Search and find <strong>Incoming WebHooks<\/strong> from the Slack app directory and select <code>Add to Slack<\/code> to go to the webhook configuration page.<\/p>\n<figure class=\"kg-card kg-image-card\"><img decoding=\"async\" src=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-34-12.png\" class=\"kg-image\" alt=\"\" loading=\"lazy\" width=\"1322\" height=\"448\" 
srcset=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w600\/2025\/03\/image-34-12.png 600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w1000\/2025\/03\/image-34-12.png 1000w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-34-12.png 1322w\" sizes=\"auto, (min-width: 720px) 720px\"><\/figure>\n<p>In the new configuration tab, choose the channel you created before and click on <strong>Add Incoming WebHooks integration.<\/strong><\/p>\n<figure class=\"kg-card kg-image-card\"><img decoding=\"async\" src=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-36-15.png\" class=\"kg-image\" alt=\"\" loading=\"lazy\" width=\"958\" height=\"294\" srcset=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w600\/2025\/03\/image-36-15.png 600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-36-15.png 958w\" sizes=\"auto, (min-width: 720px) 720px\"><\/figure>\n<p>Now, the webhook will be created. 
Store the webhook URL securely; you will need it to configure Alert Manager.<\/p>\n<figure class=\"kg-card kg-image-card\"><img decoding=\"async\" src=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-37-18.png\" class=\"kg-image\" alt=\"\" loading=\"lazy\" width=\"1542\" height=\"569\" srcset=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w600\/2025\/03\/image-37-18.png 600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w1000\/2025\/03\/image-37-18.png 1000w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-37-18.png 1542w\" sizes=\"auto, (min-width: 720px) 720px\"><\/figure>\n<h2 id=\"step-5-configure-alert-manager-with-smtp-and-slack-api\">Step 5: Configure Alert Manager With SMTP and Slack API<\/h2>\n<p>All the configurations for the Alert Manager should be part of the <strong><code>alertmanager.yml<\/code><\/strong> file.<\/p>\n<p>Open the Alert Manager configuration file.<\/p>\n<pre><code>sudo vim \/etc\/alertmanager\/alertmanager.yml<\/code><\/pre>\n<p>Inside the configuration file, add the Slack and SES information to get the notifications, as shown below.<\/p>\n<pre><code>global:\n  resolve_timeout: 1m\n  slack_api_url: '$slack_api_url'\n\nroute:\n  receiver: 'slack-email-notifications'\n\nreceivers:\n- name: 'slack-email-notifications'\n  email_configs:\n  - to: 'arunlal@abcdefg.com'\n    from: 'email@devopsproject.dev'\n    smarthost: 'email-smtp.us-west-2.amazonaws.com:587'\n    auth_username: '$SMTP_user_name'\n    auth_password: '$SMTP_password'\n    send_resolved: false\n\n  slack_configs:\n  - channel: '#prometheus'\n    send_resolved: false\n<\/code><\/pre>\n<p>In the <code>global<\/code> section, provide the Slack webhook URL that you created earlier.<\/p>\n<p>In the 
<code>receivers<\/code> section, modify the <code>to<\/code> and <code>from<\/code> addresses, and provide the SMTP endpoint and port number in the <code>smarthost<\/code> field. Set <code>auth_username<\/code> to the SMTP user name and <code>auth_password<\/code> to the SMTP password.<\/p>\n<p>Set <code>channel<\/code> in the <code>slack_configs<\/code> section to your Slack channel name.<\/p>\n<p>Once the configuration is done, restart the Alert Manager service and check its status to ensure everything is working fine.<\/p>\n<pre><code>sudo systemctl restart alertmanager.service\nsudo systemctl status alertmanager.service<\/code><\/pre>\n<h2 id=\"step-6-create-prometheus-rules\">Step 6: Create Prometheus Rules<\/h2>\n<p>Prometheus rules are essential to trigger the alerts. Based on the rules, Prometheus detects the alert conditions and sends alerts to the Alert Manager.<\/p>\n<p>We can create multiple rules in YAML files as per the alert requirements.<\/p>\n<p>Let&#8217;s create a few alert rules in separate rule YAML files and validate them by simulating thresholds.<\/p>\n<h3 id=\"rule-1-create-a-rule-to-get-an-alert-when-the-cpu-usage-goes-more-than-60\"><strong>Rule 1: Create a rule to get an alert when the CPU usage exceeds 60%.<\/strong><\/h3>\n<pre><code>cat &lt;&lt;'EOF' | sudo tee \/etc\/prometheus\/cpu_thresholds_rules.yml\ngroups:\n  - name: CpuThreshold\n    rules:\n      - alert: HighCPUUsage\n        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode=\"idle\"}[5m])) * 100) &gt; 60\n        for: 1m\n        labels:\n          severity: critical\n        annotations:\n          summary: \"High CPU usage on {{ $labels.instance }}\"\n          description: \"CPU usage on {{ $labels.instance }} is greater than 60%.\"\nEOF<\/code><\/pre>\n<p>The heredoc delimiter is quoted (<code>'EOF'<\/code>) so that the shell does not expand <code>$labels<\/code> before the file is written. Averaging <code>by (instance)<\/code> keeps the <code>instance<\/code> label, which the annotations use.<\/p>\n<p>You can put more than one rule in a single file and adjust the values to your requirements. 
With this rule, if the CPU usage stays above 60% for more than a minute, we get an alert.<\/p>\n<h3 id=\"rule-2-rule-for-memory-usage-alert\"><strong>Rule 2: Rule for memory usage alert<\/strong><\/h3>\n<pre><code>cat &lt;&lt;'EOF' | sudo tee \/etc\/prometheus\/memory_thresholds_rules.yml\ngroups:\n  - name: MemoryThreshold\n    rules:\n      - alert: HighRAMUsage\n        expr: 100 * (1 - (node_memory_MemAvailable_bytes \/ node_memory_MemTotal_bytes)) &gt; 60\n        for: 1m\n        labels:\n          severity: critical\n        annotations:\n          summary: \"High RAM usage on {{ $labels.instance }}\"\n          description: \"RAM usage on {{ $labels.instance }} is greater than 60%.\"\nEOF<\/code><\/pre>\n<p>Similarly, if the memory usage stays above 60% for more than a minute, Prometheus sends an alert to Alert Manager.<\/p>\n<h3 id=\"rule-3-rule-for-high-storage-usage-alert\"><strong>Rule 3: Rule for high storage usage alert<\/strong><\/h3>\n<pre><code>cat &lt;&lt;'EOF' | sudo tee \/etc\/prometheus\/storage_thresholds_rules.yml\ngroups:\n  - name: StorageThreshold\n    rules:\n      - alert: HighStorageUsage\n        expr: 100 * (1 - (node_filesystem_avail_bytes{mountpoint=\"\/\"} \/ node_filesystem_size_bytes{mountpoint=\"\/\"})) &gt; 50\n        for: 1m\n        labels:\n          severity: critical\n        annotations:\n          summary: \"High storage usage on {{ $labels.instance }}\"\n          description: \"Storage usage on {{ $labels.instance }} is greater than 50%.\"\nEOF<\/code><\/pre>\n<p>Here, if the usage of the root filesystem exceeds 50%, we get alert notifications through our Slack channel and email.<\/p>\n<h3 id=\"rule-4-rule-to-get-an-alert-when-an-instance-is-down\"><strong>Rule 4: Rule to get an alert when an instance is down.<\/strong><\/h3>\n<pre><code>cat &lt;&lt;'EOF' | sudo tee \/etc\/prometheus\/instance_shutdown_rules.yml\ngroups:\n- name: alert.rules\n  rules:\n  - alert: InstanceDown\n    expr: up == 0\n    for: 1m\n    labels:\n      
severity: \"critical\"\n    annotations:\n      summary: \"Endpoint {{ $labels.instance }} down\"\n      description: \"{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute.\"\nEOF<\/code><\/pre>\n<p>Once the rules are created, we should check that they work. For that, we have to make some modifications to the Prometheus configuration file.<\/p>\n<h2 id=\"step-7-modify-prometheus-configurations\">Step 7: Modify Prometheus Configurations<\/h2>\n<p>Open the Prometheus configuration file, which is in the <code>\/etc\/prometheus<\/code> directory.<\/p>\n<pre><code>sudo vim \/etc\/prometheus\/prometheus.yml<\/code><\/pre>\n<p>Add the Alert Manager information and the Prometheus rule files, and modify the scrape configurations as given below.<\/p>\n<pre><code>global:\n  scrape_interval: 15s\n  evaluation_interval: 15s\n\n# Alertmanager configuration\nalerting:\n  alertmanagers:\n    - static_configs:\n        - targets:\n          - localhost:9093\n\nrule_files:\n  - \"cpu_thresholds_rules.yml\"\n  - \"storage_thresholds_rules.yml\"\n  - \"memory_thresholds_rules.yml\"\n  - \"instance_shutdown_rules.yml\"\n\nscrape_configs:\n  - job_name: \"prometheus\"\n    static_configs:\n      - targets: [\"localhost:9090\"]\n\n  - job_name: \"node_exporter\"\n    scrape_interval: 5s\n    static_configs:\n      - targets: ['172.31.31.19:9100']<\/code><\/pre>\n<p>In the <code>global<\/code> section, set how often Prometheus scrapes metrics and evaluates rules. Both are set to 15 seconds here, and the <code>node_exporter<\/code> job overrides the scrape interval to 5 seconds, so Prometheus collects metrics from <strong>Node Exporter<\/strong> every 5 seconds. Node Exporter is a tool that must be installed on the server you want to monitor. 
It collects the server&#8217;s metrics and exposes them on the <code>\/metrics<\/code> HTTP endpoint; Prometheus scrapes them, evaluates the rules we defined, and sends the resulting alerts to Alert Manager.<\/p>\n<p>In the <code>alerting<\/code> section, keep <code>localhost:9093<\/code> as the target if Prometheus and Alert Manager are on the same server (uncomment it if it is commented out in your file); otherwise, provide the Alert Manager server&#8217;s IP address.<\/p>\n<p>In the <code>rule_files<\/code> section, provide the paths of the Prometheus rule files. Ensure the rule files are in the same directory as the Prometheus configuration file, or give their full paths.<\/p>\n<p>The <code>scrape_configs<\/code> section holds the information for the servers you want to monitor; the <code>targets<\/code> field contains the IP address and port of each target server.<\/p>\n<p>Here, <code>node_exporter<\/code> is the job for the server we want to monitor, and in the <code>targets<\/code> field we provide the private IP of that server. If your servers are in different networks, provide the public IP of that server instead. <code>9100<\/code> is the default port of the <strong>Node Exporter.<\/strong><\/p>\n<p>After modifying the configurations, restart the Prometheus service using the following command.<\/p>\n<pre><code>sudo systemctl restart prometheus.service<\/code><\/pre>\n<p>Check the status of the service to ensure there are no errors in the above configurations using the following command.<\/p>\n<pre><code>sudo systemctl status prometheus.service<\/code><\/pre>\n<h2 id=\"step-8-verify-configurations-and-rules\">Step 8: Verify Configurations and Rules.<\/h2>\n<p>Here, we run stress tests to verify that the rules and configurations work.<\/p>\n<h3 id=\"test-1-memory-stress-test\"><strong>Test 1: Memory Stress Test<\/strong><\/h3>\n<p>Install the <code>stress-ng<\/code> utility on the server that you want to monitor to run the stress tests.<\/p>\n<pre><code>sudo apt-get -y install stress-ng<\/code><\/pre>\n<p>To stress the memory of the server, use the 
following command.<\/p>\n<pre><code>stress-ng --vm-bytes $(awk '\/MemFree\/{printf \"%d\\n\", $2 * 1;}' &lt; \/proc\/meminfo)k --vm-keep -m 10<\/code><\/pre>\n<p>This will increase the memory usage up to 90%, and we get an alert once the usage stays above 60% for a minute.<\/p>\n<p>To visually analyze the metrics of the server, we use a Grafana dashboard.<\/p>\n<figure class=\"kg-card kg-image-card\"><img decoding=\"async\" src=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-48-10.png\" class=\"kg-image\" alt=\"\" loading=\"lazy\" width=\"2000\" height=\"392\" srcset=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w600\/2025\/03\/image-48-10.png 600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w1000\/2025\/03\/image-48-10.png 1000w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w1600\/2025\/03\/image-48-10.png 1600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-48-10.png 2003w\" sizes=\"auto, (min-width: 720px) 720px\"><\/figure>\n<p>Prometheus detects the condition when the memory usage reaches 60% and gets ready to send an alert to the Alert Manager.<\/p>\n<figure class=\"kg-card kg-image-card\"><img decoding=\"async\" src=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/untitled-3-3.png\" class=\"kg-image\" alt=\"\" loading=\"lazy\" width=\"1356\" height=\"512\" srcset=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w600\/2025\/03\/untitled-3-3.png 600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w1000\/2025\/03\/untitled-3-3.png 1000w, 
https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/untitled-3-3.png 1356w\" sizes=\"auto, (min-width: 720px) 720px\"><\/figure>\n<p>Prometheus then sends the alert to the Alert Manager, which notifies the intended recipients through Slack and email.<\/p>\n<figure class=\"kg-card kg-image-card\"><img decoding=\"async\" src=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-49-11.png\" class=\"kg-image\" alt=\"\" loading=\"lazy\" width=\"1903\" height=\"815\" srcset=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w600\/2025\/03\/image-49-11.png 600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w1000\/2025\/03\/image-49-11.png 1000w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w1600\/2025\/03\/image-49-11.png 1600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-49-11.png 1903w\" sizes=\"auto, (min-width: 720px) 720px\"><\/figure>\n<p>The output of the Slack notification is given below. 
Here we can see the server details, severity level, and the cause of the alert.<\/p>\n<figure class=\"kg-card kg-image-card\"><img decoding=\"async\" src=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-50-13.png\" class=\"kg-image\" alt=\"\" loading=\"lazy\" width=\"1160\" height=\"326\" srcset=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w600\/2025\/03\/image-50-13.png 600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w1000\/2025\/03\/image-50-13.png 1000w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-50-13.png 1160w\" sizes=\"auto, (min-width: 720px) 720px\"><\/figure>\n<p>The email notification contains the same information, so we can easily identify the issue at any time.<\/p>\n<figure class=\"kg-card kg-image-card\"><img decoding=\"async\" src=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-51-10.png\" class=\"kg-image\" alt=\"\" loading=\"lazy\" width=\"981\" height=\"687\" srcset=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w600\/2025\/03\/image-51-10.png 600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-51-10.png 981w\" sizes=\"auto, (min-width: 720px) 720px\"><\/figure>\n<h3 id=\"test-2-cpu-stress-test\">Test 2: CPU Stress Test<\/h3>\n<p>To stress the CPU of the server, use the following command.<\/p>\n<pre><code>stress-ng --matrix 0<\/code><\/pre>\n<p>This command pushes the CPU usage to its maximum, and you can see the resulting usage in the graph below.<\/p>\n<figure class=\"kg-card kg-image-card\"><img decoding=\"async\" 
src=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-52-7.png\" class=\"kg-image\" alt=\"\" loading=\"lazy\" width=\"2000\" height=\"411\" srcset=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w600\/2025\/03\/image-52-7.png 600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w1000\/2025\/03\/image-52-7.png 1000w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w1600\/2025\/03\/image-52-7.png 1600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-52-7.png 2000w\" sizes=\"auto, (min-width: 720px) 720px\"><\/figure>\n<h3 id=\"test-3-storage-stress-test\">Test 3: Storage Stress Test<\/h3>\n<p>To stress the storage of the server, use the following command.<\/p>\n<pre><code>fio --name=test --rw=write --bs=1M --iodepth=32 --filename=\/tmp\/test --size=20G<\/code><\/pre>\n<p>The server has 30G of storage, so writing a 20G file raises the storage usage above 60%. You can change the size based on your server&#8217;s storage.<\/p>\n<figure class=\"kg-card kg-image-card\"><img decoding=\"async\" src=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-53-6.png\" class=\"kg-image\" alt=\"\" loading=\"lazy\" width=\"2000\" height=\"447\" srcset=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w600\/2025\/03\/image-53-6.png 600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w1000\/2025\/03\/image-53-6.png 1000w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w1600\/2025\/03\/image-53-6.png 1600w, 
https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-53-6.png 2000w\" sizes=\"auto, (min-width: 720px) 720px\"><\/figure>\n<p>The notifications generated by the above tests are shown below.<\/p>\n<figure class=\"kg-card kg-image-card\"><img decoding=\"async\" src=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-54-13.png\" class=\"kg-image\" alt=\"\" loading=\"lazy\" width=\"2000\" height=\"971\" srcset=\"https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w600\/2025\/03\/image-54-13.png 600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w1000\/2025\/03\/image-54-13.png 1000w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/size\/w1600\/2025\/03\/image-54-13.png 1600w, https:\/\/storage.ghost.io\/c\/5f\/2f\/5f2f4d20-2abf-4534-8d40-7aa233aedd43\/content\/images\/2025\/03\/image-54-13.png 2000w\" sizes=\"auto, (min-width: 720px) 720px\"><\/figure>\n<h2 id=\"conclusion\">Conclusion<\/h2>\n<p>In this blog, I have covered the real-world usage of the Prometheus Alert Manager with some simple examples.<\/p>\n<p>You can further modify the configurations as per your needs. Monitoring is important whether you run virtual machines or containers. 
We can integrate this setup with Kubernetes as well.<\/p>\n<p>Also, if you are preparing for the Prometheus certification (PCA), check out our <a href=\"https:\/\/devopscube.com\/prometheus-certified-associate\/\" rel=\"noreferrer noopener\">Prometheus Certified Associate<\/a> exam guide for exam tips and resources.<\/p>\n<hr>\n<p><strong>Source:<\/strong> <a href=\"https:\/\/devopscube.com\/prometheus-alert-manager\/\" target=\"_blank\" rel=\"noopener noreferrer\">How to Setup VPC Secondary Network For EKS Cluster \u2014 DevOpsCube<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Source: https:\/\/devopscube.com\/prometheus-alert-manager\/<\/p>\n","protected":false},"author":1,"featured_media":738,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-737","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-devops"],"_links":{"self":[{"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=\/wp\/v2\/posts\/737","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=737"}],"version-history":[{"count":0,"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=\/wp\/v2\/posts\/737\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=\/wp\/v2\/media\/738"}],"wp:attachment":[{"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=737"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=%2Fwp%2Fv2%2Fcategorie
s&post=737"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.ngocha.biz\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=737"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}