Investigate problems due to User-Agent using Bash

Last week we had some problems with the Google Ads bot. It could not crawl a bunch of URLs, while the browser had no problem getting through. The only difference was the User-Agent. This sent us on a debugging journey through Cloudflare, gateways and micro-sites.

To assist us, we created a small bash script that visits a URL and shows the HTTP status code, the Location header and some size information, so we could find out which part of our setup was causing the problems.

Enter the bash

Let's use cURL and grep to visit a URL and display the result:


function visit {

  local url="$1";
  local userAgent="$2"

  echo -e "Visiting $url"
  echo -e "Using User-Agent: $userAgent, results in:";
  echo "--------------------------------";
  # --verbose writes the response headers (prefixed with "< ") to stderr,
  # so stderr is merged into stdout before filtering.
  # --ignore-case is needed because HTTP/2 header names are lowercase.
  curl \
    --user-agent "$userAgent" \
    --verbose \
    --write-out "\nHeader size: %{size_header}, Download size: %{size_download}\n" \
    --silent "$url" 2>&1 | \
  grep --extended-regexp --ignore-case "^(< HTTP|< Location|Header size)";
  echo "--------------------------------";
}

This function makes it easier to test and tinker.

Calling it

Our bug has to do with the User-Agent (UA). When the Google Ads bot UA is used, an HTTP 502 Bad Gateway is returned. But which gateway is it and why?

Let's call all the "stations" between the outside world and the micro-site with the two UAs and look for the difference.

UA_Chrome="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36";
UA_AdsBot="AdsBot-Google (+";


# Cloudflare
visit "$URL" "$UA_Chrome";
visit "$URL" "$UA_AdsBot";

# main gateway
visit "https://gateway.wehkamp.internal/$URL" "$UA_Chrome";
visit "https://gateway.wehkamp.internal/$URL" "$UA_AdsBot";

# local gateway
visit "https://pop-gateway.wehkamp.internal/$URL" "$UA_Chrome";
visit "https://pop-gateway.wehkamp.internal/$URL" "$UA_AdsBot";

# micro-site
visit "https://pop-site.wehkamp.internal/$URL" "$UA_Chrome";
visit "https://pop-site.wehkamp.internal/$URL" "$UA_AdsBot";
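
The repeated calls above could also be driven by a loop. What follows is only a sketch under a few assumptions: `STATIONS` and `visit_all` are hypothetical names that are not part of our original script, the request is passed as a path that gets appended to each internal host, and the public Cloudflare URL is left out. The first argument is the command to run, so the `visit` function from earlier can be passed in by name.

```bash
# Internal stations, in the order a request passes through them.
# Kept as a plain whitespace-separated list; the URLs contain no spaces.
STATIONS="https://gateway.wehkamp.internal
https://pop-gateway.wehkamp.internal
https://pop-site.wehkamp.internal"

# Hypothetical helper: run a command for every station / User-Agent pair.
#   $1     command to run, called as: <command> <url> <user-agent>
#   $2     path to request, appended to each station
#   $3...  one or more User-Agent strings
visit_all() {
  local visitor="$1"
  local path="$2"
  shift 2

  local station
  local userAgent
  for station in $STATIONS; do      # unquoted on purpose: split the list
    for userAgent in "$@"; do
      "$visitor" "$station/$path" "$userAgent"
    done
  done
}
```

With the `visit` function defined, a call would look like `visit_all visit "some/path" "$UA_Chrome" "$UA_AdsBot"`, where `some/path` is a placeholder for the path under investigation.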

What's the verdict?

Well, it turned out that our micro-site sent back some rather big CSP headers when a non-standard User-Agent visited, increasing the size of the headers by 240%. This caused the local NGINX gateway to log "upstream sent too big header while reading response header from upstream" and to return a 502.
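
To put a number on such a difference, the header sizes for two User-Agents can be compared directly. This is a minimal sketch, not the script we actually used: `header_size` and `pct_increase` are hypothetical helper names, and the only curl feature relied on is the `%{size_header}` write-out variable that the script above already uses.

```bash
# Print the size in bytes of the response headers for a URL and User-Agent.
# The body is discarded; only curl's %{size_header} value is printed.
#   $1  URL to request
#   $2  User-Agent string
header_size() {
  curl --silent --output /dev/null \
    --user-agent "$2" \
    --write-out "%{size_header}" "$1"
}

# Print by how many percent $2 exceeds $1, truncated to an integer.
pct_increase() {
  local base="$1"
  local other="$2"
  if [ "$base" -le 0 ]; then
    echo "base size must be positive" >&2
    return 1
  fi
  echo $(( (other - base) * 100 / base ))
}

# Example usage (assumes $url, $UA_Chrome and $UA_AdsBot are set):
#   chrome=$(header_size "$url" "$UA_Chrome")
#   adsbot=$(header_size "$url" "$UA_AdsBot")
#   echo "Headers grew by $(pct_increase "$chrome" "$adsbot")%"
```

The arithmetic is integer-only, which is precise enough to spot a jump of this magnitude.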

Re-configuring the local gateway with larger proxy buffers fixed the problem immediately:

proxy_buffer_size          128k;
proxy_buffers              4 256k;
proxy_busy_buffers_size    256k;

The next step is making sure our micro-site does not send different CSP headers to the Google Ads bot, but that will take more effort.