Wesfarmers-owned Catch.com.au has undergone a web application protection upgrade, partially in response to its first significant distributed denial-of-service attack in 2022.
Platform engineering lead Cameron Hall told last month’s AWS re:Inforce 2024 conference in the US that the online marketplace, which started in 2006, “had a pretty good run” in avoiding crippling DDoS attacks until then.
“To our knowledge, we hadn't previously been the victim of a DoS attack, at least not one large enough to knock us offline or that we'd noticed,” Hall said.
“Regardless, we knew for a fact that someone had knocked us offline [and] they're going to do it again. It's a bit like a game.
“Ultimately if our site's not online, it's not just Catch that's not making money, it's our sellers that are relying on Catch to run their businesses.
“So, it's really important that we have some assurance that we're able to always transact.”
The 2022 attack directed 11 times the usual number of requests at Catch’s website.
Hall said the incident was identified by technology teams “within two minutes”, though it took a bit longer to diagnose the root cause.
Not shielded
Once the DDoS attack was identified, there was some initial confusion, as the company believed it already had protections in place.
“We've been serving our site on [AWS] CloudFront since 2011 … and because we're on CloudFront, we also have AWS Shield Standard,” Hall said.
“So, we were a bit confused when we were hit by the attack. We thought, ‘We've got AWS Shield, why is our site being taken down? Aren't we protected from DoS attacks?’”
AWS Shield Standard - the cloud provider’s base DDoS protection mechanism - was enabled, but it only automatically mitigates network and transport layer attacks, not those that target the application layer.
AWS has an Advanced version of Shield that can detect DDoS attacks targeting the application layer.
In addition to Shield Standard, Catch also had AWS WAF - web application firewall - which could also have played a role in mitigating an application layer attack.
However, Catch “had left every single Amazon managed rule in count mode”, meaning the rules tallied matching requests but never blocked them.
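In AWS WAF's rule JSON, the difference between counting and enforcing a managed rule group comes down to a single field. The sketch below is illustrative only: the helper `managed_rule_group` and the choice of `AWSManagedRulesCommonRuleSet` are assumptions, not details Hall gave about Catch's configuration.

```python
def managed_rule_group(name: str, priority: int, enforce: bool) -> dict:
    """Build a WAFv2-style rule entry referencing an AWS managed rule group.

    Hypothetical helper for illustration; it only builds a dict and
    does not call AWS.
    """
    return {
        "Name": name,
        "Priority": priority,
        "Statement": {
            "ManagedRuleGroupStatement": {"VendorName": "AWS", "Name": name}
        },
        # {"Count": {}} overrides the group's rule actions, so matching
        # requests are only counted. {"None": {}} applies no override, so
        # the group's own actions (including block) take effect.
        "OverrideAction": {"None": {}} if enforce else {"Count": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": name,
        },
    }


# Rule group name chosen for illustration only.
counting = managed_rule_group("AWSManagedRulesCommonRuleSet", 0, enforce=False)
enforcing = managed_rule_group("AWSManagedRulesCommonRuleSet", 0, enforce=True)
print(counting["OverrideAction"], enforcing["OverrideAction"])
```

A rule group left in the first state still produces metrics and logs, which is why it can look like protection is in place when nothing is being blocked.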
“There's a natural fear when you're implementing WAF rules that you'll start blocking legitimate traffic, and this is even more true when it comes to ageing monolithic web applications like ours,” Hall said.
Hall suggested other factors may also have contributed to the WAF being deployed but left largely unenforced.
“Working at Catch is very much like working at an enterprising startup. We're always trying to scramble to get things done,” he said.
“Some things aren't done to completion and a lot of things are done all the way through.
“I'd hazard a guess that the fear of blocking legitimate traffic is probably why our WAF was still in count. But in defence of the person that had set it up, I'm almost certain … that it was nothing more than a title [in an architecture specification] that said, ‘Put AWS WAF in front of the website’. And to their credit, they did exactly what it said on the tin.”
Mitigations put in place
Once the limitations of the company’s DDoS protection became clear, Catch took some immediate actions to mitigate the attack using the WAF.
It introduced a “generous global rate limit” to stop its infrastructure “being smashed”; “created a simple blocklist and a related playbook” as an “instruction manual for the next person” that might be faced with a similar incident; and “introduced a break-glass geoblock rule … to restrict traffic to just Australia and New Zealand because that's our target market.”
“This is a rule that you're leaving in count mode and switching to block in response to an attack,” Hall said.
“It's not going to protect you from everything but it's just a tool to have in your tool belt for when you need to mitigate an attack, and it's proven more than useful on more than one occasion.”
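The three mitigations Hall described map onto standard AWS WAF rule types. The following is a rough sketch of what such rules could look like as WAFv2 rule JSON, built as plain Python dicts. The rule names, the limit of 10,000 requests, the placeholder IP set ARN and the `break_glass` helper are all assumptions made for illustration; Catch's actual values were not disclosed.

```python
import copy


def visibility(metric_name: str) -> dict:
    """Metrics settings that every WAFv2 rule carries."""
    return {
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": metric_name,
    }


# Placeholder only; a real IP set ARN would come from the AWS account.
BLOCKLIST_IP_SET_ARN = "<ip-set-arn>"

# "Generous global rate limit": block a source IP once it exceeds the limit
# within the rate-based rule's evaluation window. 10000 is a made-up figure.
RATE_LIMIT_RULE = {
    "Name": "global-rate-limit",
    "Priority": 0,
    "Statement": {"RateBasedStatement": {"Limit": 10000, "AggregateKeyType": "IP"}},
    "Action": {"Block": {}},
    "VisibilityConfig": visibility("global-rate-limit"),
}

# "Simple blocklist": block any address listed in a maintained IP set.
BLOCKLIST_RULE = {
    "Name": "ip-blocklist",
    "Priority": 1,
    "Statement": {"IPSetReferenceStatement": {"ARN": BLOCKLIST_IP_SET_ARN}},
    "Action": {"Block": {}},
    "VisibilityConfig": visibility("ip-blocklist"),
}

# "Break-glass geoblock": matches traffic from outside Australia and
# New Zealand, but normally sits in count mode.
GEO_BREAK_GLASS_RULE = {
    "Name": "geo-break-glass",
    "Priority": 2,
    "Statement": {
        "NotStatement": {
            "Statement": {"GeoMatchStatement": {"CountryCodes": ["AU", "NZ"]}}
        }
    },
    "Action": {"Count": {}},
    "VisibilityConfig": visibility("geo-break-glass"),
}


def break_glass(rule: dict) -> dict:
    """Return a copy of a count-mode rule with its action switched to block."""
    armed = copy.deepcopy(rule)
    armed["Action"] = {"Block": {}}
    return armed


armed_geo_rule = break_glass(GEO_BREAK_GLASS_RULE)
print(armed_geo_rule["Action"])
```

Keeping the geoblock rule deployed in count mode means the response to an attack is a one-field change rather than writing a new rule under pressure.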
Scoping web application protection platforms
Post-incident, the marketplace operator also re-evaluated its web application protections, going to market for an “all-in-one solution” covering CDN, DDoS protection, bot control, and - if the solution allowed - API security as well.
Through “many tedious sales calls and demos”, Catch learned a couple of things.
First, API security was “on the roadmap” of many vendors, but not production ready. As this was considered a “desirable” attribute of a web application protection platform, but not the primary reason for being in-market for one, Catch “opted to just kick the can down the road” on it.
Second, the bigger lesson Catch took from the market examination was just how much an all-in-one platform was going to cost.
“[These platforms] come with a really, really high price tag,” Hall said.
“The starting price for any of these solutions that we were looking at was close to a quarter of a million dollars, regardless of how much traffic you're going to be receiving or sending.
“I could have just signed the contract, closed it out, and made it a problem for the next guy that has to do the budget.
“But we're a low margin, high volume retailer, so being incredibly cost-conscious is something that's absolutely key to us when we're considering these types of investments.
“And [as] we started to gravitate towards these more ‘white glove’ solutions, we had to take a step back and reassess whether we could actually afford it.”
The “price shock”, as Hall characterised it, led Catch back to reassess the capabilities of AWS.
“We already had half of it in place,” Hall said.
Hall said that AWS offered a 60-day trial of Shield Advanced so that Catch was protected from DDoS attacks targeting Layer 7 while it went to market.
“Looking at the AWS solution, it's a bit more do-it-yourself - which is a given, it's the building blocks,” Hall said.
But he said the setup and ongoing costs of operation came in substantially lower.
“The total cost of ownership for us with the AWS option rendered savings of two times in the first year, and then four times every year thereafter.
“So … we've got AWS Shield Advanced, AWS WAF with the Amazon managed rules no longer in count, and bot control.”
At the time of procurement, AWS’ bot control capabilities were limited, and Catch already had another bot protection platform that it had used for five years to detect “more sophisticated bots that were trying to emulate human behaviour.”
AWS introduced the capability to also target the more sophisticated bots in late 2022, which - despite some scepticism - Catch elected to test.
“It was AWS's first attempt at doing this. We knew that it's not a very simple thing to do detecting bots and it's something that takes time to get really good at, so we were a little reserved when AWS came out, but we were really eager to see what it could do,” Hall said.
“Our architecture was [set up] in such a way that we were able to employ AWS's targeted block protection and then anything that AWS missed would fall through to the third-party option.
“The third-party solution provided logs, which we then fed back to AWS so we could help them improve the bot product. This proved to be a really effective strategy.”
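The fall-through arrangement Hall described can be pictured as two detectors in sequence, with anything caught only by the second one recorded as a miss for the first. The sketch below is a toy model of that idea, not Catch's architecture: `Request`, `layered_check` and both detector functions are hypothetical stand-ins, and real bot detection relies on far richer signals.

```python
from typing import Callable, List, NamedTuple


class Request(NamedTuple):
    client_id: str
    user_agent: str


# A detector returns True when it judges the request to come from a bot.
Detector = Callable[[Request], bool]


def layered_check(
    request: Request,
    first_layer: Detector,
    second_layer: Detector,
    missed_by_first: List[Request],
) -> str:
    """Run the first detector, then fall through to the second one."""
    if first_layer(request):
        return "blocked_by_first_layer"
    if second_layer(request):
        # The first layer let this through; keep it as feedback material.
        missed_by_first.append(request)
        return "blocked_by_second_layer"
    return "allowed"


# Toy detectors, for illustration only.
def first_layer(request: Request) -> bool:
    return "curl" in request.user_agent.lower()


KNOWN_BAD_CLIENTS = {"client-42"}


def second_layer(request: Request) -> bool:
    return request.client_id in KNOWN_BAD_CLIENTS


missed: List[Request] = []
results = [
    layered_check(r, first_layer, second_layer, missed)
    for r in [
        Request("client-1", "curl/8.0"),
        Request("client-42", "Mozilla/5.0"),
        Request("client-7", "Mozilla/5.0"),
    ]
]
print(results, missed)
```

The list of misses plays the role of the third-party logs in Hall's account: evidence of what the first layer failed to catch.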
Also stemming from this work, Catch is now creating and implementing WAF rules, albeit with some guardrails.
The company still implements all new rules in count mode so they can be tailored, and exceptions managed, before they are used to allow or block requests.
“Once we build confidence, we'll promote the rules to block,” Hall said.
To assist in building that confidence, the WAF logs are fed to a dashboard in Datadog, an observability platform, to understand traffic patterns and behaviours.
“We were using this to identify, what are the rules that have been blocked most frequently? What's changed since yesterday? Do we have spikes in block requests or counts? Just trying to see what's happening,” Hall said.
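The questions Hall listed are simple aggregations over WAF logs. As a minimal sketch of the idea, outside any dashboard tool, the code below counts blocked requests per rule per day and compares two days. It assumes log records shaped like AWS WAF's JSON logs, using the `timestamp`, `action` and `terminatingRuleId` fields; the sample records, rule names and function names are invented for illustration.

```python
import json
from collections import Counter
from datetime import datetime, timezone
from typing import Dict, Iterable


def sample_record(day: int, action: str, rule_id: str) -> str:
    """Build a made-up log line for midday UTC on a given day of July 2024."""
    ts = datetime(2024, 7, day, 12, tzinfo=timezone.utc)
    return json.dumps(
        {
            "timestamp": int(ts.timestamp() * 1000),  # epoch milliseconds
            "action": action,
            "terminatingRuleId": rule_id,
        }
    )


SAMPLE_LOG_LINES = [
    sample_record(1, "BLOCK", "geo-break-glass"),
    sample_record(1, "BLOCK", "global-rate-limit"),
    sample_record(1, "ALLOW", "Default_Action"),
    sample_record(2, "BLOCK", "global-rate-limit"),
    sample_record(2, "BLOCK", "global-rate-limit"),
    sample_record(2, "BLOCK", "global-rate-limit"),
]


def blocks_per_rule_by_day(lines: Iterable[str]) -> Dict[str, Counter]:
    """Count blocked requests per terminating rule, grouped by UTC date."""
    per_day: Dict[str, Counter] = {}
    for line in lines:
        record = json.loads(line)
        if record.get("action") != "BLOCK":
            continue
        moment = datetime.fromtimestamp(record["timestamp"] / 1000, tz=timezone.utc)
        day = moment.date().isoformat()
        per_day.setdefault(day, Counter())[record["terminatingRuleId"]] += 1
    return per_day


def change_since(per_day: Dict[str, Counter], yesterday: str, today: str) -> Dict[str, int]:
    """Per-rule difference in block counts between two days."""
    before = per_day.get(yesterday, Counter())
    after = per_day.get(today, Counter())
    return {rule: after[rule] - before[rule] for rule in set(before) | set(after)}


per_day = blocks_per_rule_by_day(SAMPLE_LOG_LINES)
top_rule_today = per_day["2024-07-02"].most_common(1)[0]
delta = change_since(per_day, "2024-07-01", "2024-07-02")
print(top_rule_today, delta)
```

A sudden jump in the per-rule difference is the kind of spike that would prompt a closer look before a counted rule is promoted to block.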
Hall added that the setup enabled Catch to reduce the time spent observing and adjusting new rules “from weeks to days”.