Troubleshooting Azure Application Gateway: A Collection of Useful KQL Queries

Introduction

Updated: 16/03/2023

The Azure Application Gateway is a highly scalable, managed web traffic load balancer that provides application-level routing, load balancing, and web application firewall services. Although it is simpler than many comparable products, it is not uncommon to encounter problems that affect the availability of the applications behind it.

To resolve these issues quickly and effectively, it’s important to have a good understanding of the log data generated by the Azure Application Gateway. This is where Kusto Query Language (KQL) can be a valuable tool. With KQL, you can analyze the log data and gain valuable insights into the performance and behavior of your Azure Application Gateway.

In this blog post, I’ll write down my current collection of useful KQL queries that will hopefully help you troubleshoot common Azure Application Gateway problems. I will keep this post up to date as issues and requirements change over time.

The different Log Types of an Azure Application Gateway

Activity log: You can use Azure activity logs (formerly known as operational logs and audit logs) to view all operations that are submitted to your Azure subscription, and their status. Activity log entries are collected by default, and you can view them in the Azure portal.

Access log: You can use this log to view Application Gateway access patterns and analyze important information. This includes the caller’s IP, requested URL, response latency, return code, and bytes in and out. An access log is collected every 60 seconds. This log contains one record per instance of Application Gateway. The Application Gateway instance is identified by the instanceId property.

Performance log: You can use this log to view how Application Gateway instances are performing. This log captures performance information for each instance, including total requests served, throughput in bytes, failed request count, and healthy and unhealthy backend instance count. A performance log is collected every 60 seconds. The Performance log is available only for the v1 SKU. For the v2 SKU, use Metrics for performance data.

Firewall log: You can use this log to view the requests that are logged through either detection or prevention mode of an application gateway that is configured with the web application firewall. Firewall logs are collected every 60 seconds.

Source: learn.microsoft.com/en-us/azure/application..
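To get a feel for the access log fields described above, a minimal sketch like the following projects them into a compact table. It assumes the diagnostic logs are sent to a Log Analytics workspace using the AzureDiagnostics table, where the fields show up with suffixes such as _s and _d. The one-hour window is just an example value.

```kusto
AzureDiagnostics
| where Category == "ApplicationGatewayAccessLog"
| where TimeGenerated > ago(1h) // example time window, adjust as needed
// caller IP, requested URL, latency, return code, bytes in and out
| project TimeGenerated, clientIP_s, requestUri_s, timeTaken_d, httpStatus_d, receivedBytes_d, sentBytes_d
| sort by TimeGenerated desc
```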

Azure Application Gateway Log Queries

Access Log Queries

Just list the Application Gateway Access Log sorted by Time

AzureDiagnostics
| where Category == "ApplicationGatewayAccessLog"
| sort by TimeGenerated

When you’re looking for an explicit Host like yourhost.com

AzureDiagnostics
| where Category == "ApplicationGatewayAccessLog"
| where originalHost_s == "yourhost.com" // host name the client requested
| sort by TimeGenerated

When you’re looking for an explicit Path like yourhost.com/your/request/uri

AzureDiagnostics
| where Category == "ApplicationGatewayAccessLog"
| where requestUri_s == "/your/request/uri"
| sort by TimeGenerated

When you’re looking for an explicit Path like yourhost.com/your/request/uri and you have multiple Request Routing Rules

AzureDiagnostics
| where Category == "ApplicationGatewayAccessLog"
| where requestUri_s == "/your/request/uri"
| where ruleName_s == "routing-example.yourhost.com"

When you’re looking for unsuccessful requests (4xx) against a specific host like yourhost.com

AzureDiagnostics
| where Category == "ApplicationGatewayAccessLog"
| where originalHost_s == "yourhost.com"
| where httpStatus_d >= 400 and httpStatus_d <= 499

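Identify slow requests which take over 4 seconds and chart them over time

AzureDiagnostics
| where Category == "ApplicationGatewayAccessLog"
| where timeTaken_d > 4 // assumes timeTaken_d is reported in seconds (v2 SKU)
| summarize request_count=count() by bin(TimeGenerated, 5m), requestUri_s // 5m bucket is an example value
| render timechart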
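A fixed threshold can hide how latency is actually distributed. As a complementary sketch, the following query summarizes latency percentiles per request URI. The chosen percentiles and the limit of 20 rows are my own example values, not anything prescribed by the Application Gateway.

```kusto
AzureDiagnostics
| where Category == "ApplicationGatewayAccessLog"
| summarize request_count = count(),
            p50 = percentile(timeTaken_d, 50),
            p95 = percentile(timeTaken_d, 95),
            p99 = percentile(timeTaken_d, 99)
    by requestUri_s
| top 20 by p95 desc // example limit: the 20 URIs with the highest 95th percentile
```

A URI with a low p50 but a high p99 usually points to occasional outliers rather than a generally slow endpoint.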

Identify which request URIs return a 4xx error and count the errors per URI

AzureDiagnostics
| where ResourceType == "APPLICATIONGATEWAYS" and OperationName == "ApplicationGatewayAccess"
| where httpStatus_d >= 400 and httpStatus_d <= 499
| project TimeGenerated, clientIP_s, Resource, originalHost_s, originalRequestUriWithArgs_s, userAgent_s, backendPoolName_s, httpStatus_d, requestUri_s
| summarize count() by originalRequestUriWithArgs_s
| order by count_
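Server-side errors (5xx) usually point at the backend rather than at the client, so it can help to group them differently. The following is a minimal sketch that counts 5xx responses per backend pool and status code, reusing the backendPoolName_s and httpStatus_d columns from the query above. Grouping by backend pool is my assumption about what is most useful here.

```kusto
AzureDiagnostics
| where Category == "ApplicationGatewayAccessLog"
| where httpStatus_d >= 500 and httpStatus_d <= 599
| summarize error_count = count() by backendPoolName_s, httpStatus_d
| order by error_count desc
```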

Firewall Log Queries

Just list the Application Gateway Firewall Logs sorted by Time

AzureDiagnostics
| where Category == "ApplicationGatewayFirewallLog"
| sort by TimeGenerated

Check whether a Web Application Firewall Rule blocks a specific IP address (e.g., 1.2.3.4).

AzureDiagnostics
| where Category == "ApplicationGatewayFirewallLog"
| where action_s == "Blocked"
| where clientIp_s == "1.2.3.4"
| sort by TimeGenerated

Check whether a Web Application Firewall Rule blocks a specific IP address (e.g., 1.2.3.4), with a cleaner output limited to the columns TimeGenerated, Message, clientIp_s, ruleId_s, action_s, details_message_s

AzureDiagnostics
| where Category == "ApplicationGatewayFirewallLog"
| where action_s == "Blocked"
| where clientIp_s == "1.2.3.4"
| sort by TimeGenerated
| project TimeGenerated, Message, clientIp_s, ruleId_s, action_s, details_message_s
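When it is not yet clear which client is affected, it can help to see first which rules block the most requests. The following sketch counts blocked requests per rule, using only the ruleId_s, Message and action_s columns that appear in the queries above. Treat it as a starting point rather than a definitive report.

```kusto
AzureDiagnostics
| where Category == "ApplicationGatewayFirewallLog"
| where action_s == "Blocked"
| summarize blocked_count = count() by ruleId_s, Message
| order by blocked_count desc
```

Rules at the top of this list are good candidates for a closer look, for example to decide whether they block legitimate traffic.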