Here is a tip about Firewall rules that use Match Regex. It is for Enterprise customers who want a simple way to block requests for file types that their websites do not serve.
A number of bots request files from our websites, hoping to find information that might have been left “unprotected”. They request hundreds of .tar or .7z files with a variety of names. They also request executable file types that we do not serve, such as .exe, .dll, or even .cgi.
Even better is a testing tool found here. I like this tool in particular because it allows you to assess how simple or complex your regular expression is from a runtime performance perspective. You want your regular expression to execute as quickly as possible! See this tool: regex101: build, test, and debug regex
What we have tried to do is block many of the most commonly requested file types that our sites do not serve. We have a Firewall rule that blocks
URI Matches .+\.(exe|dll|cgi|tar|7z|rar|gz|sql|bck|bak|bz2|tgz)$
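To sanity-check a pattern like this before deploying it, here is a minimal sketch in Python. It assumes Python's `re` module behaves like the firewall's regex engine for this simple pattern (the two engines are not identical), and the sample paths and the `is_blocked` helper are hypothetical.

```python
import re

# Pattern copied from the rule above. Python's `re` is only a stand-in
# for the firewall's regex engine (an assumption for illustration).
BLOCKED_EXT = re.compile(r".+\.(exe|dll|cgi|tar|7z|rar|gz|sql|bck|bak|bz2|tgz)$")


def is_blocked(uri: str) -> bool:
    """Return True when the URI ends in one of the unwanted extensions."""
    return BLOCKED_EXT.search(uri) is not None


# Hypothetical sample requests, not taken from real logs.
samples = ["/backup.tar", "/site/db.sql", "/downloads/tool.exe", "/index.html"]
for uri in samples:
    print(uri, "->", "block" if is_blocked(uri) else "allow")
```

Only the last sample, /index.html, should be allowed through by this pattern.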
Thanks for the tip, great idea! I get a great many such requests, which are often caught by various other rules, but this one has a sharper focus on this specific type of request.
Actually your tip may also benefit users with a Business Plan, as it also allows for Regex matches in Firewall Rules.
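If I may suggest an improvement, I think your idea would work best if you use the field http.request.uri.path instead of http.request.uri. The latter may include a query string, in which case the URI no longer ends with the file extension and the $-anchored pattern will not match. From the Firewall Rules help page: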
The full URI as received by the web server (does not include #fragment which is not sent to web servers)
http.request.uri (String): The absolute URI of the request. Example: /articles/index?section=539061&expand=comments
http.request.uri.path (String): The path of the request. Example: /articles/index
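The difference between the two fields can be illustrated with a short Python sketch. It uses `urllib.parse.urlsplit` to approximate the path/query split and Python's `re` as a stand-in for the firewall's regex engine (both are assumptions). The request /backup.tar?download=1 is a hypothetical example.

```python
import re
from urllib.parse import urlsplit

BLOCKED_EXT = re.compile(r".+\.(exe|dll|cgi|tar|7z|rar|gz|sql|bck|bak|bz2|tgz)$")

# Hypothetical request: a blocked file type followed by a query string.
uri = "/backup.tar?download=1"
path = urlsplit(uri).path  # "/backup.tar", roughly what http.request.uri.path holds

# The pattern is anchored with $, so the trailing query string defeats it.
matches_uri = BLOCKED_EXT.search(uri) is not None
matches_path = BLOCKED_EXT.search(path) is not None

print("full URI matched:", matches_uri)   # False
print("path only matched:", matches_path)  # True
```

An attacker could append any query string to slip past a rule written against the full URI, which is why matching on the path alone is the safer choice.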
Also, one nice feature of Firewall Rules is that if you click on Edit expression you can add the lower() function, so that the requested path is converted to lowercase before the match is done. This way, attackers using uppercase letters will not be able to bypass the rule.
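To show why the lowercase conversion matters, here is a small Python sketch. `str.lower()` stands in for the rule language's lower() function and Python's `re` for the firewall's regex engine (both assumptions). The helper `is_blocked` and the sample path are hypothetical.

```python
import re

BLOCKED_EXT = re.compile(r".+\.(exe|dll|cgi|tar|7z|rar|gz|sql|bck|bak|bz2|tgz)$")


def is_blocked(path: str, normalize: bool = True) -> bool:
    """Match the path against the pattern, optionally lowercasing it first."""
    candidate = path.lower() if normalize else path
    return BLOCKED_EXT.search(candidate) is not None


# Hypothetical request that uses uppercase to dodge a case-sensitive rule.
path = "/Execute.DLL"
print("without lower():", is_blocked(path, normalize=False))  # False
print("with lower():   ", is_blocked(path, normalize=True))   # True
```

Lowercasing the input keeps the pattern itself short, compared with listing every case variant of each extension.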
Sorry, @cbrandt I spoke too soon. On further testing my rule actually does not work!
Something must be wrong with my regex but I cannot see what.
I did adjust my rule to use http.request.uri.path and the lower() function. Both make sense. But requests like www.domain.com/execute.dll still come through! Can you help please?
I could not use the simple Expression Builder. Instead, I had to code it explicitly as
(http.request.uri.path matches ".+\.(exe|dll|cgi|tar|7z|rar|gz|sql|bck|bak|bz2|tgz)$")
When using the simple Expression Builder, I could not get it to properly use just one backslash between the + and the . in the expression above.
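For anyone wondering why the extra backslash breaks the rule, here is a Python sketch of the effect. It assumes the Expression Builder produced two backslashes where one was intended, and it uses Python's `re` as a stand-in for the firewall's regex engine. With two backslashes, the pattern demands a literal backslash followed by any character, rather than a literal dot.

```python
import re

EXTS = "(exe|dll|cgi|tar|7z|rar|gz|sql|bck|bak|bz2|tgz)"

# Intended pattern: one backslash, so "\." means a literal dot.
single = re.compile(r".+\." + EXTS + "$")

# Assumed builder output: two backslashes, so "\\" is a literal backslash
# and the following "." matches any single character.
double = re.compile(r".+\\." + EXTS + "$")

path = "/execute.dll"
print("single backslash:", single.search(path) is not None)  # True
print("double backslash:", double.search(path) is not None)  # False
```

Since a normal request path contains no backslash, the doubled version never matches, which is consistent with requests like /execute.dll still getting through.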