I am trying to set up a website that might get 1.5 to 2 million requests almost instantly. This is an event site, so both real people and bots will try to grab tickets as soon as sales open.
Servers are in AWS. How can I use Cloudflare features to make sure the site won’t go down when the tickets go on sale?
Expected behavior:
Site is running normally, with very minimal traffic.
Tickets go on sale at a known, specific time (e.g., 00:00 UTC).
The site instantly gets 1.5 to 2 million requests.
Tickets sell out within a couple of minutes.
Traffic slows to a couple of thousand requests and eventually returns to a very minimal state.
I am willing to go up to the Business plan (only if needed).
It’s a live connection to your server because of the need to handle ticket sales. Cloudflare isn’t going to help your server handle 2 million requests in 2 minutes.
I don’t even see a way for Workers to handle this due to the latency of KV storage.
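To put a rough number on that, here is a back-of-the-envelope calculation using only the figures quoted in this thread (2 million requests, 2 minutes). It assumes the load is spread evenly over the window, which it almost certainly won't be, so treat the result as a floor for the peak rate rather than a prediction.

```python
# Back-of-the-envelope load estimate using the figures from this thread.
total_requests = 2_000_000   # upper bound quoted by the original poster
window_seconds = 2 * 60      # "2 minutes", as stated in the reply above

# Assumption: requests are spread evenly across the window.
# Real traffic will spike harder in the first few seconds.
average_rps = total_requests / window_seconds

print(f"average load: {average_rps:,.0f} requests/second")
```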
Thanks for the reply. Yes, Queue-it is one option I was looking at. If Queue-it can allow, let’s say, 50K requests through and then put everything after that into a queue, I guess that is fair game.
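For what it's worth, the "let N in, queue the rest" idea can be sketched in a few lines. This is only an illustration of the concept, not how Queue-it or Cloudflare Waiting Room is implemented; the `WaitingRoom` class and its method names are made up for the example, and a real deployment would need shared state across servers.

```python
from collections import deque
from typing import Optional


class WaitingRoom:
    """Hypothetical sketch: admit up to `capacity` visitors, queue the rest FIFO."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.active = set()      # visitors currently allowed on the site
        self.waiting = deque()   # visitors waiting, in arrival order

    def arrive(self, visitor_id: str) -> str:
        """Return "admitted" or "queued" for a visitor hitting the site."""
        if visitor_id in self.active:
            return "admitted"
        if visitor_id in self.waiting:
            return "queued"
        if len(self.active) < self.capacity:
            self.active.add(visitor_id)
            return "admitted"
        self.waiting.append(visitor_id)
        return "queued"

    def leave(self, visitor_id: str) -> Optional[str]:
        """Free a slot and admit the next visitor in line, if there is one."""
        self.active.discard(visitor_id)
        if self.waiting and len(self.active) < self.capacity:
            next_visitor = self.waiting.popleft()
            self.active.add(next_visitor)
            return next_visitor
        return None


room = WaitingRoom(capacity=2)  # the thread's figure would be 50_000
print(room.arrive("a"), room.arrive("b"), room.arrive("c"))
```

Note that a plain first-come queue like this does nothing about bots; whoever connects first gets in, which is the concern raised further down.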
The other thing is I am not sure out of 2M how many are bots. I am assuming it will be at least 1.5M (500K real users). If that’s the case, can Cloudflare take care of the task of stopping bots at the CDN level and only allowing real users to the website?
Try using Bot Fight Mode. I am not sure, but you could also set the security level to I’m Under Attack.
Visitors will receive an interstitial page while Cloudflare analyzes the traffic and behavior to make sure they are legitimate human visitors trying to access your website.
Any caching can help you handle more clients; for example, you can use Redis.
The downside of a queue-based system is that you make it worse for legit visitors. Who is more likely to get a ticket?
A legitimate person.
A person with 100 bots connecting to your site.
Your question is mainly linked with infrastructure engineering; as others pointed out, CF won’t do the magic in this case.
As for bot protection, I doubt anything except the enterprise plan w/ bot management can help in this case. The best would be to reach out to Cloudflare.
Typically you’d work backwards for this. Identify each segment of your server stack you need to optimise, then identify how you can optimise each segment, splitting cacheable from uncacheable loads so you offload some work from each segment and can scale better. So you’d work through these segments:
database servers - MySQL, redis etc
processing pipeline i.e. PHP
web servers i.e. nginx, apache, litespeed etc
load balancers i.e. haproxy, nginx or Cloudflare load balancer
Cloudflare CDN, firewall, and other web acceleration service/features
The Cloudflare Business plan’s Bypass Cache on Cookie feature would help a bit by splitting off the cacheable requests (guest visitors; see Caching Anonymous Page Views), leaving you to deal with the uncacheable logged-in user requests and backend load.
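The decision behind a cookie-based cache bypass looks roughly like the sketch below. It only illustrates the logic; in practice this is configured in Cloudflare rather than written in your own code, and the cookie names in `SESSION_COOKIE_PREFIXES` are hypothetical placeholders for whatever your application sets on login.

```python
# Hypothetical cookie-name prefixes that mark a logged-in session.
# Replace with the cookies your own application actually sets.
SESSION_COOKIE_PREFIXES = ("sessionid", "logged_in")


def is_cacheable(method: str, cookie_header: str) -> bool:
    """Return True if the request can be answered from cache.

    Guests (no session cookie) on GET/HEAD are cacheable; anything
    carrying a session cookie goes through to the origin.
    """
    if method.upper() not in ("GET", "HEAD"):
        return False
    for pair in cookie_header.split(";"):
        name = pair.split("=", 1)[0].strip().lower()
        if name.startswith(SESSION_COOKIE_PREFIXES):
            return False
    return True


print(is_cacheable("GET", "theme=dark"))                 # guest visitor
print(is_cacheable("GET", "theme=dark; sessionid=abc"))  # logged-in user
```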
I wouldn’t worry about that much, as your traffic is for profit and sales matter; you don’t want to block legit users from making a purchase. If you have time, I’d make sure Cloudflare WAF rules are properly configured for your web application and site well before the event, so you can be sure the WAF rules only block bad requests.
Of course, simple Cloudflare Firewall rules might help eliminate traffic you know isn’t of use, e.g. if you don’t plan to have sales or visits from China, Ukraine, Russia, Bolivia etc., then you could set up a simple Firewall rule to block or challenge just those countries.
Another simple technique is not using the same entry/landing page for everyone. At the Cloudflare level, you could direct visitors to different landing pages or different servers (if you have more than one backend origin server) based on criteria like location/geography.
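As a rough illustration of splitting entry points by geography, the mapping could look like the sketch below. The hostnames, the region table, and the `pick_origin` function are all invented for the example; at Cloudflare this would be done with load balancing or rules rather than application code.

```python
# Hypothetical origins and country-to-region table, for illustration only.
ORIGIN_BY_REGION = {
    "NA": "origin-na.example.com",
    "EU": "origin-eu.example.com",
    "APAC": "origin-apac.example.com",
}
REGION_BY_COUNTRY = {
    "US": "NA", "CA": "NA",
    "DE": "EU", "FR": "EU", "GB": "EU",
    "JP": "APAC", "AU": "APAC",
}
DEFAULT_ORIGIN = "origin-na.example.com"


def pick_origin(country_code: str) -> str:
    """Map a visitor's two-letter country code to a backend origin."""
    region = REGION_BY_COUNTRY.get(country_code.upper())
    return ORIGIN_BY_REGION.get(region, DEFAULT_ORIGIN)


for country in ("de", "JP", "BR"):
    print(country, "->", pick_origin(country))
```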
Basically, how does one eat a very, very large pizza? Small bites at a time. So with that large amount of incoming traffic, break it down and divide it into more manageable chunks with the aid of caching and Cloudflare where possible.
You can cheat a bit here. Offer up an early bird sale special, maybe use CF Waiting room to limit users coming in for an early bird sale prior to main event. These users can help prime up all the various levels of caches from Cloudflare CDN side all the way down your origin server software stacks’ respective caches for web server, PHP, databases etc
Divide and conquer was never explained more smoothly.
There are exceptions, it depends on the niche. People can get very upset if all the tickets or exclusive items from a shop are drained instantly due to bots.
Somebody I know manages a website that sells exclusive items from time to time, and they spend a lot on bot protection, up to 20-30k in a matter of hours!
The downside of bot management is that it’s tied to a 1-year contract, if CF can’t satisfy your needs and you still need bot protection, make sure to look elsewhere.
All bot protections can stop bots, the question is for how long and which bots. Cheap bot protections aren’t useful when it comes to stopping hoarder bots.
Good idea. I think the typical solution is raffles: people preorder the item and, if they are selected, they can opt to buy the product. If they don’t, the product is raffled off again.
Businesses prevent mass account registering by adding a trust factor, so that only accounts with x spent are eligible, etc.
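A minimal sketch of that raffle flow, including the spend-based trust filter, might look like this. Everything here is an assumption made for illustration: the function names, the `accepts` callback standing in for "the winner actually completes the purchase", and the idea that spend per account is available as a simple dictionary.

```python
import random
from typing import Callable, Dict, List, Optional


def eligible_entrants(spend_by_account: Dict[str, float], min_spent: float) -> List[str]:
    """Trust filter: only accounts that have spent at least `min_spent` may enter."""
    return [acct for acct, spent in spend_by_account.items() if spent >= min_spent]


def run_raffle(
    entrants: List[str],
    tickets: int,
    accepts: Callable[[str], bool],
    seed: Optional[int] = None,
) -> List[str]:
    """Draw winners at random; re-raffle any ticket a winner declines."""
    rng = random.Random(seed)
    remaining = list(entrants)
    buyers: List[str] = []
    while len(buyers) < tickets and remaining:
        open_slots = tickets - len(buyers)
        winners = rng.sample(remaining, min(open_slots, len(remaining)))
        drawn = set(winners)
        # Each entrant is drawn at most once, whether or not they buy.
        remaining = [e for e in remaining if e not in drawn]
        buyers.extend(w for w in winners if accepts(w))
    return buyers


spend = {"alice": 120.0, "bob": 0.0, "carol": 45.0, "dave": 300.0}
pool = eligible_entrants(spend, min_spent=40.0)
print(run_raffle(pool, tickets=2, accepts=lambda account: account != "dave"))
```

Arrival speed plays no part in who wins, which is the whole point: a bot that connects first gains nothing over a human who pre-registered.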
Yeah, I guess. Though I guess you could save some hassle by requiring folks to pre-register for events, so you have all their details/verification beforehand, and opening event sales first to folks who pre-registered. That way you don’t overwhelm the registration systems on event day. Which is sort of similar to what you said regarding raffles, I guess.
Oh no, these bots can log in and register and perform all the actions a regular user would. The only way to stop them is with top-tier bot protection or through raffles. Some of these bots go as far as taking the shape of a browser extension; you can imagine how complicated those can be to detect and mitigate.
Many websites end up surrendering to the cost of bot protection and choose either raffles or nothing at all. A prime example was (and still is) GPUs, where scalpers drained entire shops because the sites implemented neither of the options I mentioned.
Maybe they should add a Facebook login. That’ll stop 'em.
More seriously, just how do the bots make payments for all these purchases? Obviously, can’t be a handful of payment accounts otherwise they could be easily blocked.
They have dozens of credit cards / virtual credit cards, new each time they buy an item.
All of the credit cards have enough cash on them to buy what the person wants. At some point, websites might ban “sketchy” CC issuers to prevent this kind of scenario, but that is more of a patch than an actual fix.
IP detection is largely pointless, as some of these people rent their own IP ranges; the more typical scenario is buying “residential” proxies.
And what’s the problem with requiring a billing address and forcing that billing address to be unique per card? I thought all cards required a billing address, even so-called virtual ones. (As far as I’m aware, the only ones that don’t require an address are prepaid ones, and those are outright banned in many places.)
Maybe I didn’t understand correctly but checking the billing address (full or partial) is an option with every payment processor. I.e., if a billing address is collected and submitted for verification during a transaction, the transaction will be declined if the address does not match. Hence checking for uniques, and also denying address-less transactions. What am I missing here…?
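To make the "unique billing address per card" idea concrete, here is a rough application-side sketch. It is not a payment processor's address verification; the `BillingAddressRegistry` class, the `card_fingerprint` identifier, and the very crude normalization are all assumptions for illustration, and real address matching is considerably messier.

```python
import re


def normalize_address(address: str) -> str:
    """Crude normalization: lowercase and collapse punctuation/whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", address.lower()).strip()


class BillingAddressRegistry:
    """Hypothetical check: each billing address may belong to only one card."""

    def __init__(self) -> None:
        self._card_by_address = {}

    def check(self, card_fingerprint: str, billing_address: str) -> bool:
        """Return True if the transaction may proceed to the processor."""
        key = normalize_address(billing_address)
        if not key:
            return False  # deny address-less transactions
        # The first card seen with an address claims it; others are rejected.
        owner = self._card_by_address.setdefault(key, card_fingerprint)
        return owner == card_fingerprint


registry = BillingAddressRegistry()
print(registry.check("card-1", "12 Main St., Springfield"))
print(registry.check("card-2", "12 MAIN ST, SPRINGFIELD"))
```

The obvious weakness is that a determined buyer can vary the address slightly or use many real addresses, so this raises the cost of hoarding rather than preventing it.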
I just wasn’t sure. It’s always seemed that the zip code is enough, as that’s all that many merchants ask for. That’s all a gas (petrol) station needs to approve a card. Then again, it is a physical card, not typed in over the Internet.
Ah, gotcha. You can actually turn payment processor sensitivity up or down, depending on your needs. The billing address can be made a requirement with all processors, far as I know.
Even the zip code can be turned off, I’m pretty sure. Gas stations enabled it because it was a nightmare to deal with fraud – or was before chips in cards became commonplace.