Real-Time Azure Network Monitoring with Deadman: A Practical Guide

Photo by Phil Huang @ HongKong

Before Starting

I highly recommend the podcast channel 逐祿/築路/逐鹿 (追吧,五斗米!). It is hosted by my friend and teacher Peter Yang, who interviews professionals from various fields. I believe it is one of the few shows in Taiwan where a senior executive from a foreign company takes the time to explore the real stories of different professionals with the audience.

Mandarin Podcast


To quickly check the current network connectivity of an environment, this post uses upa/deadman, a curses-based host status checking application using ping, to run real-time connectivity diagnostics against common Azure services.

You can also refer to my previous post to see how I approach researching Azure networking.

Navigating the Azure Networking Maze: 1 Flow, 2 Directions, 3 Metrics
As an infrastructure architect, when setting up services in Azure, it is inevitable that you will frequently need to conduct network testing. There are 1 flow, 2 directions and 3 metrics to consider. 1 Flow: taking the 5-tuple as an example, this is Source IP, Source Port, Destination IP, Destination Port, Protocol.

Apple Podcast

I tried combining my post with information from ShowNet/GitHub... and got the following podcast. If you're interested, give it a listen. Most of it is fine; a few minor details feel a bit strange, but nothing major.

Mandarin Podcast


Prerequisite

  • upa/deadman
  • Azure Firewall (Standard)
    • All traffic, whether VPN or Spoke, passes through Azure Firewall
    • Allow ICMP for any to any
Rule Name  | Src. Type  | Src. IP | Dest. Type | Dest. IP | Protocol | Port
Allow ICMP | IP Address | *       | IP Address | *        | ICMP     | *
Azure Firewall - Network Rules
  • Enable Azure VPN Gateway
    • Active-active mode
  • On-premises desktop (10.255.252.20) via Azure VPN Gateway
  • Azure VM at Spoke VNet (10.100.100.4)

deadman Configuration

To monitor a specific TCP port with deadman, you need the hping tool installed, and TCP monitoring requires root privileges.
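If installing hping or running as root is not an option, a plain TCP connect test is a rough substitute for a one-off check. The sketch below is not how deadman probes TCP; it is a swapped-in technique that completes the full three-way handshake with Python's standard socket module, so it needs no special privileges. The function name tcp_check and its timeout default are assumptions made for illustration.

```python
import socket
import time


def tcp_check(host: str, port: int, timeout: float = 2.0):
    """Return the TCP connect time in milliseconds, or None if the connect fails.

    Hypothetical helper: it opens and immediately closes a full TCP connection,
    which is different from a raw SYN probe and does not require root.
    """
    start = time.monotonic()
    try:
        # create_connection resolves the host and performs the handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return (time.monotonic() - start) * 1000.0
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return None
```

For example, tcp_check("10.100.255.196", 80) would test the Azure Firewall private IP used later in this post and return either a latency in milliseconds or None.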

deadman/deadman at master · upa/deadman
deadman is a curses-based host status checking application using ping - upa/deadman

For common monitoring points in Azure, the configuration file below is a starting point; adjust it to fit your own scenario.

#
# On-premises
#
onprem-desk 10.255.252.20
---
#
# Spoke
#
vm-bastion 10.100.100.4
---
#
# Hub - Azure Firewall
#
fw-public-icmp <AZFW_PUBLIC_IP>
fw-public-tcp-80 <AZFW_PUBLIC_IP> tcp=dstport:80
fw-private-icmp 10.100.255.196
fw-private-tcp-80 10.100.255.196 tcp=dstport:80
---
#
# Hub - Azure Virtual Network Gateway 01
#
# https://learn.microsoft.com/en-us/azure/vpn-gateway/vpn-gateway-troubleshoot-site-to-site-cannot-connect#step-7-verify-the-azure-gateway-health-pr
# GatewayTenantWorker_IN_0
vpn-01-public-icmp <VPN_0_PUBLIC_IP>
vpn-01-public-tcp-8081 <VPN_0_PUBLIC_IP> tcp=dstport:8081
vpn-01-private-icmp 10.100.255.5
vpn-01-private-tcp-8081 10.100.255.5 tcp=dstport:8081
---
#
# Hub - Azure Virtual Network Gateway 02
#
# GatewayTenantWorker_IN_1
vpn-02-public-icmp <VPN_1_PUBLIC_IP>  
vpn-02-public-tcp-8083 <VPN_1_PUBLIC_IP> tcp=dstport:8083
vpn-02-private-icmp 10.100.255.4
vpn-02-private-tcp-8083 10.100.255.4 tcp=dstport:8083

deadman.conf
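Before handing a long configuration to deadman, it can help to sanity-check it. The following is a minimal sketch that reads only the syntax visible in the file above: "#" comment lines, "---" separator lines, and "name address [key=value]" target lines. deadman's own parser may accept more than this; the Target class and parse_deadman_conf function are hypothetical names, not part of deadman.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    name: str
    address: str
    options: dict = field(default_factory=dict)


def parse_deadman_conf(text: str) -> list:
    """Return a list of target groups, split on `---` separator lines."""
    groups = [[]]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if set(line) == {"-"}:
            groups.append([])  # a line made only of dashes starts a new group
            continue
        parts = line.split()
        if len(parts) < 2:
            raise ValueError(f"expected 'name address [key=value ...]': {raw!r}")
        name, address, *opts = parts
        # Options such as tcp=dstport:80 become {"tcp": "dstport:80"}.
        options = dict(opt.split("=", 1) for opt in opts if "=" in opt)
        groups[-1].append(Target(name, address, options))
    return [group for group in groups if group]


SAMPLE = """\
#
# Hub - Azure Firewall
#
fw-private-icmp 10.100.255.196
fw-private-tcp-80 10.100.255.196 tcp=dstport:80
---
vpn-01-private-tcp-8081 10.100.255.5 tcp=dstport:8081
"""

if __name__ == "__main__":
    for index, group in enumerate(parse_deadman_conf(SAMPLE)):
        for target in group:
            print(index, target.name, target.address, target.options)
```

A line with a name but no address raises ValueError, which catches the most common copy-and-paste mistake before deadman is started.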

Azure Spoke View

Run deadman from Azure Spoke VM (10.100.100.4)

On-premises View

Run deadman from the on-premises desktop (10.255.252.20)

[BONUS] Verify the Azure Gateway Health Probe

You can run curl --insecure locally to check whether the Azure VPN Gateway is functioning properly. This gives you a simple form of black-box monitoring.

Troubleshoot an Azure S2S VPN connection that cannot connect - Azure VPN Gateway
Learn how to troubleshoot a site-to-site VPN connection that suddenly stops working and can’t be reconnected.
  • Active-standby mode
$ curl --insecure https://<VPN_0_PUBLIC_IP>:8081/healthprobe
<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">Primary Instance: GatewayTenantWorker_IN_0 GatewayTenantVersion: 24.10.0.115 OSVersion: Windows Server 2022 Datacenter</string>%

active-standby mode

  • Active-active mode
$ curl --insecure https://<VPN_0_PUBLIC_IP>:8081/healthprobe
<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">Primary Instance: GatewayTenantWorker_IN_0 GatewayTenantVersion: 24.10.0.115 OSVersion: Windows Server 2022 Datacenter</string>%

$ curl --insecure https://<VPN_1_PUBLIC_IP>:8083/healthprobe
<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">Primary Instance: GatewayTenantWorker_IN_1 GatewayTenantVersion: 24.10.0.115 OSVersion: Windows Server 2022 Datacenter</string>%

active-active mode
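To run the same check from a script instead of by hand, the sketch below fetches the health probe and extracts the instance name from a response shaped like the ones shown above. The helper names fetch_health_probe and parse_health_probe are assumptions for illustration, and the regular expression relies only on the "Primary Instance: ..." text seen in this post's output, which Azure may change. Certificate verification is disabled to mirror curl --insecure.

```python
import re
import ssl
import urllib.request

# Matches the instance name in the response body shown in this post.
INSTANCE_RE = re.compile(r"Primary Instance:\s*(\S+)")


def parse_health_probe(body: str):
    """Return the instance name (e.g. GatewayTenantWorker_IN_0) or None."""
    match = INSTANCE_RE.search(body)
    return match.group(1) if match else None


def fetch_health_probe(ip: str, port: int, timeout: float = 5.0) -> str:
    """Fetch https://<ip>:<port>/healthprobe without verifying the certificate."""
    context = ssl.create_default_context()
    context.check_hostname = False       # must be disabled before CERT_NONE
    context.verify_mode = ssl.CERT_NONE  # same effect as curl --insecure
    url = f"https://{ip}:{port}/healthprobe"
    with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

In active-active mode you would call fetch_health_probe once per instance, with port 8081 for the first public IP and 8083 for the second as in the curl commands above, and treat a None result or a raised exception as a failed probe.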

Phil's memo

I still remember Azure Network Watcher