By relying on the default keepalive limit, NGINX prevents this type of attack. Creating additional connections to circumvent this limit exposes bad actors via standard layer 4 monitoring and alerting tools.
However, if NGINX is configured with a keepalive that is substantially higher than the default and recommended setting, the attack may deplete system resources.
> In a typical HTTP/2 server implementation, the server will still have to do significant amounts of work for canceled requests, such as allocating new stream data structures, parsing the query and doing header decompression, and mapping the URL to a resource. For reverse proxy implementations, the request may be proxied to the backend server before the RST_STREAM frame is processed. The client on the other hand paid almost no costs for sending the requests. This creates an exploitable cost asymmetry between the server and the client.
I'm surprised this wasn't foreseen when HTTP/2 was designed. Amplification attacks were already well known from other protocols.
I'm similarly surprised it took this long for this attack to surface, but maybe HTTP/2 wasn't widely enough deployed to be a worthwhile target until recently?
Isn’t any kind of attack where a little bit of effort from the attacker causes a lot of work for the victim an amplification attack? Or do you only consider it an amplification attack if it is exploiting layer 3?
I tried looking it up and couldn’t find an authoritative answer. Can you recommend a resource that you like for this subject?
> Isn’t any kind of attack where a little bit of effort from the attacker causes a lot of work for the victim an amplification attack?
That is technically any HTTP request that requires processing to satisfy. For example if I find a page on your site that executes an expensive database query.
Amplification attacks are generally defined as packets that can be sent with a spoofed source address that result in a larger number of packets being returned to the spoofed victim.
> I tried looking it up and couldn’t find an authoritative answer.
I mean, who do you consider authoritative? Googling "amplification attack" will give you plenty of descriptions from tons of sources. Take your pick. Though most will talk about DNS amplification attacks because that's the simplest example.
You're right. I hadn't had my coffee yet and the asymmetric cost reminded me of amplification attacks. I'm still surprised this attack wasn't foreseen though. It just doesn't seem all that clever or original.
I was surprised too, but if you look at the timelines then RST_STREAM seems to have been present in early versions of SPDY, and SPDY seems mostly to have been designed around 2009. Attacks like Slowloris were coming out at about the same time, but they weren't well-known.
On the other hand, SYN cookies were introduced in 1996, so there's definitely some historic precedent for attacks in the (victim pays Y, attacker pays X, X<<Y) class.
If you are working on the successor protocol of HTTP/1.1, and are not aware of Slowloris the moment it hits and every serious httpd implementation out there gets patched to mitigate it, I'd argue you are in the wrong line of work.
> Trying to do it on Google, with a serious effort, that's the wacky part
If I were the FBI, I'd be looking at people with recently bought Google puts expiring soon. I can't imagine anyone taking a swing at Google infra "for the lulz". Also in contention: nation-states doing a practice run.
This is exactly the kind of thing that a smart kid who's still just a foolish high-school student would do. I wouldn't be surprised if this attack already exists in the wild; it's not hard to write.
Also, the subsequent attacks were less effective, which is exactly what some kid would do.
You don't even need an expensive botnet. A rich kid whose parents live in a neighborhood with residential fiber, plus a bunch of friends, could probably coordinate it through a Discord server.
Most of us really don't interact with teenagers regularly so we forget they're out there (they also tend to dislike adults so they make themselves especially invisible around us). When it comes to things like this, that's my first assumption until further evidence.
Google options with near expiries have 100s of thousands of contracts of open interest[1]. Unless you found the person some other way (and then could prove that they had also gone long a short-dated put to try to profit) there's literally no way you find anything interesting by doing that.
> HTTP/2 makes the browsing experience of high-latency connections a lot more tolerable. It also makes loading web pages in general faster.
HTTP/3 does that in my experience (lots of train rides with spotty onboard Wi-Fi) quite a bit better though. As HTTP/2 is still affected by head-of-line blocking and a single packet loss can block all other streams, even if the lost packet didn't hold data for them.
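The head-of-line effect can be sketched concretely. This is a toy model (stream IDs and packet contents are made up, not real protocol data): over TCP, in-order delivery means everything after a lost packet waits for the retransmit; over QUIC, only later data of the same stream waits.

```python
# Each packet carries a chunk for one stream: (stream_id, chunk).
packets = [(1, "a1"), (2, "b1"), (1, "a2"), (2, "b2"), (1, "a3")]
lost = 1  # index of the dropped packet, here (2, "b1")

# HTTP/2 over TCP: one ordered byte stream, so every packet after the
# loss is held back until the retransmit arrives, whatever its stream.
tcp_ready = [p for i, p in enumerate(packets) if i < lost]

# HTTP/3 over QUIC: streams are independent, so only later chunks of the
# lost packet's own stream have to wait.
lost_stream = packets[lost][0]
quic_ready = [p for i, p in enumerate(packets)
              if i != lost and not (p[0] == lost_stream and i > lost)]

print(tcp_ready)   # [(1, 'a1')]
print(quic_ready)  # [(1, 'a1'), (1, 'a2'), (1, 'a3')]
```

All of stream 1's chunks still get through under the QUIC model, while under the TCP model a loss in stream 2 stalls them too.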
In some alternative history there would have been a push to make http 1.1 pipelining work, trim fat from bloated websites (loading cookie consent banners from a 3rd party domain is a travesty on several levels) and maybe use websockets for tiny API requests. And the prioritization attributes on various resources.
Then shoveling everything over ~2 TCP connections would have done the job?
Personally, as a website visitor and occasional author, I don’t want the performance to be good enough to ‘do the job’. I want it to be as fast as possible. I want it to be instant. For that we need unbloated websites and better protocols. It’s not a competition.
After all, you don’t need bloat to suffer from head-of-line blocking. You just need a few images.
(Though, personally I’m a much bigger fan of HTTP/3 than HTTP/2. With a more principled solution to head-of-line blocking and proper 0-RTT, HTTP/3 makes a stronger case for why we need a new protocol than HTTP/2 did. I don’t know why HTTP/2 had to exist at all, really, when QUIC already existed by the time HTTP/2 was being standardized. Oh well.)
But it is, in the context of the 3-way tradeoff we're talking about here: complexity of the site vs. load time vs. protocol complexity.
> You just need a few images.
On the HTTP level those can be deferred until after the HTML/styles/JS. Then you already have the content. What on your site would be "blocked" at that point? It's just images holding each other up.
On the TCP level SACK and FRTO should resolve most instances of HOL after 1 RTT. It's not perfect but I suspect a lot of people experience "slowness" not because the underlying protocols are bad but because they're on old implementations. Or because they're on networks with bufferbloat. Upgrade those and we don't need those complex workarounds.
As for HTTP/3... it's a mixed bag. The basic idea is great. The execution is another googleism. They didn't have the patience to get it into OSes, so now every client has to implement its own network stack which multiplies the things that need patching if something goes wrong.
And it runs over UDP instead of being a different transport at the IP level like SCTP. And TLS is a good default, but the whole CA thing shouldn't have been mandatory. And header compression also seems like a cure for a disease of their own making; compare with the number of headers you needed for HTTP 1.0.
What incentive would most businesses have to do what you're describing?
It is _much_ faster, cheaper, and easier to build a bloated website than an optimized one. Similarly, it is much easier to enable HTTP2 than it is to fix the root of the problem.
I'm not saying that it's right -- anyone without a fast connection or who cares about their privacy isn't getting a great deal here.
Most businesses are not in a position to push through a new network protocol for the entire planet! So if we lived in a world with fewer monopolies then protocols might have evolved more incrementally. Though we'd presumably still have gotten something like BBR because congestion algorithms can be implemented unilaterally.
What incentive do most businesses have to make your checkout process smooth, have automatic doors, or provide shopping carts? Simple: customers like the easiest business to shop at.
Even for leaner websites, HTTP/2 was always going to be an improvement, for HTTP head-of-line blocking and better header compression, if nothing else. These are orthogonal issues for the most part.
Also, they tried prioritization, but it was too unwieldy in practice, the browser vendors didn't agree on it, and it was deprecated in the latest revision of the spec, RFC 9113.
Loading cookie consent banners from a 3rd-party domain is probably a GDPR violation because it transmits user information to a 3rd party without consent.
SCTP (Stream Control Transmission Protocol) or the equivalent. HTTP is really the wrong layer for things like bonding multiple connections, congestion adjustments, etc.
Unfortunately, most of the path only passes TCP and UDP (Windows hosts and middleboxes alike). So protocol evolution at the transport layer is a dead end.
Thus you have to piggyback on what computers will let through--so you're stuck with creating an HTTP flavor of TCP.
QUIC (the basis for HTTP/3) is basically the spiritual successor to SCTP, except with TLS baked in, so compared with SCTP+DTLS, connection establishment requires significantly fewer roundtrips (0 round trips for session resumption, 1 roundtrip at worst, compared to 4 or so for DTLS).
The comment lists three negative things as "the reason we needed HTTP/2". I don't even see how you could read it other than implying that HTTP/2 was not actually necessary.
Another reason to keep foundational protocols small. HTTP/2 has been around for more than a decade (including SPDY), and this is the first time this attack type has surfaced. I wonder what surprises HTTP/3 and QUIC hide...
This is such a strong claim that I'd really appreciate something other than "smaller is better".
Abuse and abuse vectors vary wildly in complexity, and some complexity is required precisely to avoid dumb bottlenecks, if not vulnerabilities. So on what basis are you saying something simple will inherently resist abuse better?
> Small, less complex protocols are inherently less likely to be insecure all things being equal, simply due to reduced attack surface.
That feels intuitive in the "less code is less bugs is less security issues" sense but implies that "secure" and "can't be abused" are the same thing.
Related? Sure. Same? No.
Oddly enough, we probably could have prevented the replay/amplification DoS attacks that use DNS by making DNS more complex: adding mutual authentication so it's not possible for A to request something that is then sent to B.
We could have prevented the replay/amplification DoS attacks that use DNS by making DNS use TCP.
In practice though the only way to "fix" DNS that would've worked in the 80s would've probably been to require the request be padded to larger than the response...
... yeah? I know? "In practice though the only way to "fix" DNS that would've worked in the 80s would've probably been to require the request be padded to larger than the response..."
It's not as complex as some "mutual authentication" scheme though lmao
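The padding fix works because reflection attacks come down to a byte ratio: the attacker pays for the (spoofed-source) request, the victim receives the response. A rough sketch, with illustrative sizes rather than measured ones:

```python
def amplification_factor(request_bytes: int, response_bytes: int) -> float:
    """Bytes reflected at the victim per byte the attacker sends."""
    return response_bytes / request_bytes

# A tiny query triggering a large (e.g. ANY/DNSSEC) response.
plain = amplification_factor(request_bytes=60, response_bytes=3000)

# Padding the request to at least the response size caps the factor
# at 1.0, which makes reflection pointless for the attacker.
padded = amplification_factor(request_bytes=max(60, 3000), response_bytes=3000)

print(plain, padded)  # 50.0 1.0
```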
That's a bit overblown. There's a lot there and some of it conflicts with itself but it's not unmeasurably large by any means. It's a knowable protocol (and yes, I'm aware of the camel meme[1]).
“Cancelation” should really be added to the “hard CS problems” list.
Like the others on that list (off by one, cache invalidation etc) it isn’t actually hard-hard, but rather underestimated and overlooked.
I think if we took half the time we spend on creation, constructors, initialization, and spent that design time thinking about destruction, cleanup, teardown, cancelation etc, we’d have a lot fewer bugs, in particular resource exhaustion bugs.
I really like Rust's async for its ability to immediately cancel Futures, the entire call stack together, at any await point, without needing cooperation from individual calls.
I would like to remind everyone that Google invented HTTP/2.
Now they are telling us a yarn about how they are heroically saving us from the problem they created, but without mentioning the part that they created it.
The nerve of these tech companies! Microsoft has been doing this for decades, too.
It depends on what you think a "request flood" attack is.
With HTTP/1.1 you could send one request per RTT [0]. With HTTP/2 multiplexing you could send 100 requests per RTT. With this attack you can send an indefinite number of requests per RTT.
I'd hope the diagram in this article (disclaimer: I'm a co-author) shows the difference, but maybe you mean yet another form of attack than the above?
[0] Modulo HTTP/1.1 pipelining which can cut out one RTT component, but basically no real clients use HTTP/1.1 pipelining, so its use would be a very crisp signal that it's abusive traffic.
I think for this audience a good clarification is:
* HTTP/1.1: 1 request per RTT per connection
* HTTP/2 multiplexing: 100 requests per RTT per connection
* HTTP/2 rapid reset: an unbounded number of requests per RTT per connection
In each case attackers are grinding down a performance limitation they had with previous generations of the attack over HTTP. It is a request flood; the thing people need to keep in mind is that HTTP made these floods annoying to generate.
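To put rough numbers on the list above (the RTT, frame size, and bandwidth figures are assumptions for illustration, not measurements):

```python
rtt_s = 0.05                      # assume a 50 ms round trip
link_bytes_per_s = 1_000_000      # assume ~1 MB/s of attacker upstream
frame_bytes = 100                 # assume ~100 B per HEADERS+RST_STREAM pair

# Requests per second, per connection, for each generation of the attack:
http11 = 1 / rtt_s                            # one request per RTT
h2_multiplex = 100 / rtt_s                    # ~100 streams in flight per RTT
rapid_reset = link_bytes_per_s / frame_bytes  # bounded only by bandwidth

print(http11, h2_multiplex, rapid_reset)  # 20.0 2000.0 10000.0
```

Rapid reset removes the RTT from the equation entirely; the only remaining limit is how fast the attacker can push frames.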
I wonder why exactly this attack can't be pulled off with HTTP/1.1 and TCP RST for cancellation.
It seems that (even with SYN cookies involved) an attacker could create new connections, send HTTP request, then quickly after send a RST.
Is it just that the kernel doesn't really communicate TCP RST all that well to the application, so the HTTP server continues to count the connection against the "open connection limit" even though it isn't open anymore?
The problem for the attacker is that they then run into resource limits on the number of TCP connections. The resets are essential to keep the consumption from counting against those limits.
For most current HTTP/2 implementations it'll just be ignored, and that is a problem. We've seen versions of the attack doing just that, as covered in the variants section of the article.
Servers should switch to closing the connection if clients exceed the stream limit too often, not just ignoring the bogus streams.
By request flood I mean exactly that: sending an insanely high number of requests per unit of time (per second) to the target server to exhaust its resources.
You're right, with HTTP/1.1 we have a single request in flight (or none, in the keep-alive state) at any moment. But that doesn't limit the number of simultaneous connections from a single IP address. An attacker could use the whole TCP port space to create 65535 (theoretically) connections to the server and send requests over them in parallel. This is a lot, too. In the pre-HTTP/2 era this could be mitigated by limiting the number of connections per IP address.
In HTTP/2, however, we can have multiple parallel requests over each of multiple parallel connections at any moment, many orders of magnitude more than is possible with HTTP/1.x. But the preceding mitigation could be extended by applying it to the number of requests across all connections per IP address.
I guess this was overlooked in the implementations, or in the protocol itself? Or is it just harder to apply such restrictions because the L7 multiplexing happens entirely in userspace?
Added:
The diagram in the article (the "HTTP/2 Rapid Reset attack" figure) doesn't really explain why this is an attack. In my thinking, as soon as the request is reset, the server resources are expected to be freed, thus not exhausted. I think this should be possible in modern async servers.
> But that doesn't limit the number of simultaneous connections from a single IP address.
Opening new connections is relatively expensive compared to sending data on an existing connection.
> In my thinking, as soon as the request is reset, the server resources are expected to be freed,
You can't claw back the CPU resources that have already been spent on processing the request before it was cancelled.
> By request flood I mean exactly that: sending an insanely high number of requests per unit of time (per second) to the target server to exhaust its resources.
Right. And how do you send an insanely high number of requests? What if you could send more?
Imagine the largest attack you could do by "sending an insanely high number of requests" with HTTP/1.1 with a given set of machine and network resources. With H/2 multiplexing you could do 100x that. With this attack, another 10x on top of that.
> An attacker could use the whole port space of TCP to create 65535 (theoretically) connections to the server and to send requests to them in parallel.
This is harder for the client than it is for the server. As a server, it's kind of not great that I'm wasting 64k of my connections on one client, but it's harder for you to make them than it is for me to receive them, so not a huge deal with today's servers.
On this attack, I think the problem arises when you've got a reverse-proxy H2 frontend and you don't limit backend connections because you were limiting frontend requests. It sounds like HAProxy won't start a new backend request until the pending backend requests are under the session limit, but Google's server must not have been limiting based on that. So: cancel the frontend request, try to cancel the backend request, but before you confirm the backend request is canceled, start another one. (Plus what the sibling mentioned: the backend may spend a lot of resources handling requests that will be canceled immediately.)
The new technique described evades the limit on the number of requests per second (per client) that the attacker can get the server to process. By sending both requests and stream resets within a single connection, the attacker can issue more requests per connection/client than used to be possible, so the attack is perhaps cheaper and/or more difficult to stop.
Is this a fundamental HTTP/2 protocol issue or an implementation issue? Could it be an issue at all if a server has strict limits on requests per IP address, regardless of the number of connections?
It doesn't apply to HTTP/3 because the receiver has to extend the stream concurrency maximum before the sender can open a new stream. This attack works because the sender doesn't have to wait for that after sending a reset in HTTP/2.
But the max is still ~100 streams... And you can open 100 streams all with one UDP packet using zero-rtt connections.
I can send ~1 million UDP packets per second from one machine. So that's 100 million HTTP requests per second you have to deal with. And when I bring in my 20,000 friends, you need to deal with 2 trillion requests per second.
You can do it a few times, but you can't do it 500 times. For HTTP/3, the highest permitted stream ID is an explicit state variable communicated by the server to the client, eventually forcing a round trip. That's different from HTTP/2, where the client is entitled to assume that new "stream id window space" (for lack of a better term) opens up immediately after a stream is closed.
(I'm fudging things a bit. You can probably build attacks that look kind of similar, but we don't think you could build anything that actually scales. But we could be wrong about that! Hence the recommendation to apply similar mitigations to HTTP/3 as well, even if it isn't immediately vulnerable.)
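The difference can be modeled as implicit vs. explicit credit. This is a simplified sketch, not the real HTTP/2 or QUIC state machines: in HTTP/2 a reset frees a concurrency slot immediately on the client side, while in HTTP/3 the client can only ever open streams up to a cumulative limit the server has explicitly granted.

```python
class H2StreamLimit:
    """HTTP/2 (simplified): a reset immediately frees a concurrency slot."""
    def __init__(self, max_concurrent: int = 100):
        self.max_concurrent = max_concurrent
        self.open = 0

    def try_open(self) -> bool:
        if self.open < self.max_concurrent:
            self.open += 1
            return True
        return False

    def reset(self):
        self.open -= 1  # slot reusable at once, no server round trip

class H3StreamLimit:
    """HTTP/3 (simplified): the client may only open streams up to a
    cumulative, server-granted maximum (MAX_STREAMS)."""
    def __init__(self, initial_max_streams: int = 100):
        self.max_streams = initial_max_streams  # cumulative credit
        self.opened = 0                         # total streams ever opened

    def try_open(self) -> bool:
        if self.opened < self.max_streams:
            self.opened += 1
            return True
        return False  # must wait for a MAX_STREAMS frame (a round trip)

    def grant(self, new_max: int):
        self.max_streams = max(self.max_streams, new_max)
```

So an HTTP/2 client can open-and-reset in a tight loop indefinitely, while an HTTP/3 client stalls once its granted stream count is used up, until fresh credit arrives from the server.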
HTTP2 is not TCP on TCP (that's a very basic recipe for a complete disaster, the moment any congestion kicks in); it's mostly just multiplexing concurrent HTTP requests over a single TCP connection.
HTTP3 is using UDP for different reasons, although it effectively re-implements TCP from the application POV (it's still HTTP under the hood after all). Basically with plain old TCP your bandwidth is limited by latency, because every transmitted frame has to be acknowledged - sequentially. Some industries/applications (like transferring raw video files over the pond) have been using specialized, UDP-based transfer protocols for a while for this reason. You only need to re-transmit those frames you know didn't make it, in any order it suits you.
And the throttling seems simple enough: give each IP address an initial allowance of A requests, then increase the allowance every T units of time up to a maximum of B. Perhaps A=B=10, T=150ms.
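That allowance scheme is a token bucket. A minimal per-IP sketch using the comment's parameters (A initial tokens, one token refilled every T seconds, capped at B); names and defaults are illustrative:

```python
class TokenBucket:
    """Per-IP allowance: starts with `initial` tokens, refills one token
    every `refill_every_s` seconds, capped at `cap` (A, T, B above)."""

    def __init__(self, initial: int = 10, cap: int = 10,
                 refill_every_s: float = 0.150):
        self.tokens = float(initial)
        self.cap = cap
        self.refill_every_s = refill_every_s
        self.last = 0.0  # timestamp of the previous call

    def allow(self, now: float) -> bool:
        # Refill proportionally to elapsed time, up to the cap.
        elapsed = now - self.last
        self.tokens = min(self.cap, self.tokens + elapsed / self.refill_every_s)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True   # request admitted
        return False      # request throttled
```

The server would keep one bucket per client IP and call `allow(time.monotonic())` for every incoming request, across all of that IP's connections.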
It is a little more complicated, because a request is a few layers deep. In HTTP/2 you open a connection, start a stream, then send a request over that stream.
Are you tracking per connection? Per stream? Isn't it normal for multiple requests to happen quite quickly? I load a single page with 50 external assets, and those get multiplexed over the connection's streams: is that okay? Is that abusive? Another stream is handling a video player that's requesting HTTP/2 frames of video data: too much? Too fast?
The largest DDoS attack to date, peaking above 398M rps - https://news.ycombinator.com/item?id=37831062
HTTP/2 Zero-Day Vulnerability Results in Record-Breaking DDoS Attacks - https://news.ycombinator.com/item?id=37830998