Speeding up S3 URL generation in ruby

It looks like the AWS SDK is very slow at generating S3 URLs, both public and presigned, and that you can generate them around an order of magnitude faster yourself in both cases. This can matter if you are generating hundreds of S3 URLs at once.

My app

The app I work on is a “digital collections” or “digital asset management” app. It is about displaying lists of files, so it displays a LOT of thumbnails. The thumbnails are all stored in S3, and at present we generate URLs directly to S3 in src‘s on the page.

Some of our pages can have 600 thumbnails. (Say, a digitized medieval manuscript with 600 pages). Also, we use srcset to offer the browser two resolutions for each image, so that’s 1200 URLs.

Is this excessive, should we not put 600 URLs on a page? Maybe, although it’s what our app does at present. But 100 thumbnails on a page does not seem excessive; imagine a 10×10 grid of postage-stamp-sized thumbs, why not? And they each could have multiple URLs in a srcset.

It turns out that S3 URL generation can be slow enough to be a bottleneck with 1200 generations in a page, or in some cases even 100. But it can be optimized.

On Benchmarking

It’s hard to do benchmarking in a reliable way. I just used Benchmark.bmbm here; it is notable that on different runs of my comparisons, I could see results differ by 10-20%. But this should be sufficient for relative comparisons and basic orders of magnitude. Exact numbers will of course differ on different hardware/platform anyway. (benchmark-ips might possibly be a way to get somewhat more reliable results, but I didn’t remember it until I was well into this. There may be other options?).
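For illustration, here’s the general shape of the Benchmark.bmbm harness I’m talking about; the bucket name and keys below are made up, and it assumes AWS region/credentials are already configured in the environment.

require 'benchmark'
require 'aws-sdk-s3'

# Made-up bucket and 1200 made-up keys, just to show the shape of a comparison.
bucket  = Aws::S3::Bucket.new("some-bucket")
keys    = Array.new(1200) { |i| "path/to/image_#{i}.jpg" }
objects = keys.map { |key| bucket.object(key) }

Benchmark.bmbm do |x|
  x.report("sdk public_url") do
    objects.each(&:public_url)
  end

  x.report("naive string concatenation") do
    keys.each { |key| "https://#{bucket.name}.s3.amazonaws.com/#{key}" }
  end
end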

I ran benchmarks on my 2015 MacBook (2.9 GHz dual-core Intel Core i5).

I’m used to my MacBook being faster than our deployed app on an EC2 instance, but in this case running benchmarks on EC2 had very similar results. (Of course, EC2 instance CPU performance can be quite variable).

Public S3 URLs

A public S3 URL might look like https://bucket_name.s3.amazonaws.com/path/to/my/object.rb . Or it might have a custom domain name, possibly to a CDN. Pretty simple, right?

Using shrine, you might generate it like model.image_url(public: true), which calls Aws::S3::Object#public_url. Other dependencies or your own code might call the AWS SDK method as well.
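For reference, stripped of shrine, that SDK call looks something like this (bucket and key made up; the exact hostname format in the output can vary by region and SDK version):

require 'aws-sdk-s3'

# Made-up bucket/key; assumes region and credentials are configured.
object = Aws::S3::Object.new(bucket_name: "somebucket", key: "path/to/image.jpg")

object.public_url
# => something like "https://somebucket.s3.amazonaws.com/path/to/image.jpg"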

I had noticed in earlier profiling that generating S3 URLs seemed to be taking much longer than I expected, looking like a bottleneck for my app. We use shrine, but shrine doesn’t add much overhead here, it’s pretty much just calling out to the AWS SDK public_url or presigned_url methods.

It seems like generating these URLs should be very simple, right? Here’s a “naive” implementation based on a shrine UploadedFile argument. Obviously it would be easy to substitute a custom or CDN hostname in this implementation.

def naive_public_url(shrine_file)
  "https://#{["#{shrine_file.storage.bucket.name}.s3.amazonaws.com", *shrine_file.storage.prefix, shrine_file.id].join('/')}"
end

naive_public_url(model.image)
#=> "https://somebucket.s3.amazonaws.com/path/to/image.jpg"

Benchmark generating 1200 URLs with naive implementation vs a straight call of S3 AWS SDK public_url…

                                              user       system     total        real
original AWS SDK public_url implementation    0.053043   0.000275   0.053318 (  0.053782)
naive implementation                          0.004730   0.000016   0.004746 (  0.004760)

53ms vs 5ms: it is indeed an order of magnitude slower.

53ms is not peanuts when you are trying to keep a web response under 200ms, although it may not be terrible. But let’s see if we can figure out why it’s so slow anyway.

Examining with ruby-prof points at what we can already see in the AWS SDK source for public_url itself; no need to dig far down the stack. The most expensive elements are the URI.parse and the URI-safe escaping. So are we missing anything in our naive implementation?
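If you want to reproduce that kind of profiling yourself, a minimal sketch (objects here is assumed to be an array of Aws::S3::Object instances):

require 'ruby-prof'

# Profile a batch of public_url calls and print a flat report of where time went.
result = RubyProf.profile do
  objects.each(&:public_url)
end

RubyProf::FlatPrinter.new(result).print($stdout)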

Well, the URI.parse is just done to make sure we are operating only on the path portion of the URL. But I can’t figure out any way bucket.url would return anything but a hostname-only URL with an empty path; all the examples in the docs are like that. Maybe it could somehow include a path, but I can’t see how the URL being parsed would ever have a ? query component or # fragment, and without those it’s safe to just append things without parsing. (Even without that assumption, there are faster ways to do this than a full parse, which is quite slow!) Also, just calling bucket.url is itself a bit expensive, and can involve live arn: lookups we won’t be using.

URI Escaping, the pit of confusing alternatives

What about escaping? Escaping can be such a confusing topic with S3, with different libraries at different times handling it differently (or wrongly), that it would be sane to just never use any characters in an S3 key that need escaping; maybe put some validation on your setters to ensure this. Then you don’t need to take the performance hit of escaping at all.

But okay, maybe we really need/want escaping to ensure any valid S3 key is turned into a valid S3 URL. Can we do escaping more efficiently?

The original implementation splits the path on / and then runs each component through the SDK’s own Seahorse::Util.uri_escape(s). That method’s implementation uses CGI.escape, but then does two gsub‘s to alter the value, since it isn’t happy with CGI.escape’s output as-is. Those extra gsubs are an additional performance hit. I think we can use ERB::Util.url_encode instead of CGI.escape + gsubs to get the same behavior, which might get us some speed-up.
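A quick sanity check of that claimed equivalence on one sample string, with the SDK’s escaping paraphrased as CGI.escape plus the two gsubs described above (my own spot check, not exhaustive):

require 'cgi'
require 'erb'

s = "my file!.jpg"

# Roughly what the SDK's per-component escaping boils down to:
CGI.escape(s).gsub("+", "%20").gsub("%7E", "~")
# => "my%20file%21.jpg"

# ERB::Util.url_encode gives the same result for this input in one pass:
ERB::Util.url_encode(s)
# => "my%20file%21.jpg"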

But we also seem to be escaping more than is necessary. For instance, it will escape any ! in a key to %21, and it turns out this isn’t necessary at all; the URL resolves just fine without escaping it. If we escape only what is needed, can we go even faster?

I think what we actually need is what URI.escape does, and since URI.escape doesn’t escape /, we don’t need to split on / first, saving us even more time. Annoyingly, URI.escape is marked obsolete/deprecated! But its stdlib implementation is relatively simple pure ruby, so it would be easy enough to copy it into our codebase.
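One low-effort alternative to copying the source in: I believe URI.escape just delegates to URI::DEFAULT_PARSER.escape, which is still available (and not deprecated) even in rubies where URI.escape itself is gone. Treat that equivalence as an assumption to verify, but the behavior looks like this:

require 'uri'

# Percent-encodes "unsafe" characters, but leaves `/` (and `!`) alone,
# so there's no need to split the path on `/` first.
URI::DEFAULT_PARSER.escape("path/to/my file!.jpg")
# => "path/to/my%20file!.jpg"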

Even faster? The somewhat maintenance-neglected but still-working escape_utils gem has C implementations of some escaping routines. It’s hard when many implementations aren’t clear on exactly what they are escaping, but I think its escape_uri (note the i on the end, not l) is doing the same thing as URI.escape. Alas, there seems to be no escape_utils implementation that corresponds to CGI.escape or ERB::Util.url_encode.
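EscapeUtils.escape_uri usage is just the following; the expected output assumes it really does match URI.escape, as I think it does:

require 'escape_utils' # gem 'escape_utils'

EscapeUtils.escape_uri("path/to/my file!.jpg")
# => "path/to/my%20file!.jpg" (assuming URI.escape-compatible behavior)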

So now we have a bunch of possibilities, depending on whether we are willing to change escaping semantics and/or use our naive hostname construction (run times relative to the original AWS SDK implementation at 100%):

Original AWS SDK public_url: 100%
Optimized AWS SDK public_url (avoid the URI.parse, use ERB::Util.url_encode; should be functionally identical, same output, I think!): 60%
Naive implementation (no escaping of the S3 key at all): 7.5%
Naive + ERB::Util.url_encode (should be functionally identical escaping to the original implementation, i.e. over-escaping): 28%
Naive + URI.escape (what we think is sufficient escaping, done much faster): 15%
Naive + EscapeUtils.escape_uri (we think identical to URI.escape, but a faster C implementation): 11%

We have a bunch of opportunities for much faster implementations, even keeping the existing over-escaping behavior. Here’s the file I used to benchmark.

Presigned S3 URLs

A presigned URL is used to give access to non-public content, and/or to specify response headers you’d like S3 to include with the response, such as Content-Disposition. Presigned S3 URLs all have an expiration (max one week), and involve a cryptographic signature.

I expect most people are using the AWS SDK for these, rather than reinvent an implementation of the cryptographic signing protocol.
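The standard SDK call looks something like this (bucket/key made up; the response_content_disposition param is the kind of per-URL response header override mentioned above):

require 'aws-sdk-s3'

object = Aws::S3::Object.new(bucket_name: "somebucket", key: "path/to/image.jpg")

object.presigned_url(
  :get,
  expires_in: 900, # seconds, up to one week
  response_content_disposition: 'attachment; filename="image.jpg"'
)
# => "https://somebucket.s3.amazonaws.com/path/to/image.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&..."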

And we’d certainly expect these to be slower than public URLs, because of the crypto signature involved. But can they be optimized too? It looks like yes, again by about an order of magnitude.

Benchmarking with AWS SDK presigned_url, 1200 URL generations can take around 760-900ms. Wow, that’s a lot — this is definitely enough to matter, especially in a web app response you’d like to keep under 200ms, and this is likely to be a bottleneck.

We do expect the signing to take longer than a public url, but can we do better?

Look at what the SDK is doing, re-implement a quicker path

The presigned_url method just instantiates and calls out to an Aws::S3::Presigner. First idea: what if we create a single Aws::S3::Presigner and re-use it 1200 times, instead of instantiating it 1200 times, passing it the same args #presigned_url would? Tried that; it was only a minor performance improvement.
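That re-use experiment was shaped roughly like this; AWS_CLIENT stands in for however you already have an Aws::S3::Client configured, and the 900-second expiration is just an example:

require 'aws-sdk-s3'

# Build one Presigner up front instead of letting every #presigned_url call
# construct its own. (As noted, this was only a minor improvement.)
PRESIGNER = Aws::S3::Presigner.new(client: AWS_CLIENT)

def reused_presigner_url(shrine_file)
  PRESIGNER.presigned_url(
    :get_object,
    bucket: shrine_file.storage.bucket.name,
    key: [*shrine_file.storage.prefix, shrine_file.id].join("/"),
    expires_in: 900
  )
end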

OK, let’s look at the Aws::S3::Presigner implementation. It’s got kind of a convoluted way of getting a URL: building a Seahorse::Client::Request, and then doing something weird with it… maybe modifying it to not actually go to the network, but just act as if it had, returning headers and a signed URL, and then we throw out the headers and just use the signed URL… phew! Ultimately, though, it does the actual signing work with another object, an Aws::Sigv4::Signer.

What if we just instantiate one of these ourselves, with the same arguments the Presigner would use for our use cases, and then call presign_url on it with the same args the Presigner would pass? Let’s also re-use a single Signer object 1200 times instead of instantiating it each time, in case that matters.

We still need to create the public_url in order to sign it. Let’s use our replacement naive implementation with URI.escape escaping.

AWS_SIG4_SIGNER = Aws::Sigv4::Signer.new(
  service: 's3',
  region: AWS_CLIENT.config.region,
  credentials_provider: AWS_CLIENT.config.credentials,
  unsigned_headers: Aws::S3::Presigner::BLACKLISTED_HEADERS,
  uri_escape_path: false
)

def naive_with_uri_escape_escaping(shrine_file)
  # because URI.escape does NOT escape `/`, we don't need to split on it,
  # which is what actually saves us the time.
  path = URI.escape(shrine_file.id)
  "https://#{["#{shrine_file.storage.bucket.name}.s3.amazonaws.com", *shrine_file.storage.prefix, path].join('/')}"
end

# not yet handling custom query params eg for content-disposition
def direct_aws_sig4_signer(url)
  AWS_SIG4_SIGNER.presign_url(
    http_method: "GET",
    url: url,
    headers: {},
    body_digest: 'UNSIGNED-PAYLOAD',
    expires_in: 900, # seconds
    time: nil
  ).to_s
end

direct_aws_sig4_signer( naive_with_uri_escape_escaping( shrine_uploaded_file ) )
# => presigned S3 url

Yes, it’s much faster!

Bingo! Now I measure 1200 URLs in 170-220ms, around 25% of the time. Still too slow to want to do 1200 of them on a single page, and around 4x slower than SDK public_url.

Interestingly, while we expect the cryptographic signature to take some extra time… it seems to account for at most ~10% of the overhead that the SDK’s URL-signing logic was adding. We experimented with re-using an Aws::Sigv4::Signer vs instantiating one each time, and with applying URI escaping or not. These did make noticeable differences, but not astounding ones.

This optimized version would have to be enhanced to handle additional query param options, such as a specified content-disposition; I optimistically hope that can be done without changing the performance characteristics much.
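One plausible way to do that, since response overrides like Content-Disposition are just query parameters that get signed along with the rest of the URL: append them to the URL before calling presign_url. This is an untested sketch building on the code above, and the filename is made up:

require 'erb'

def direct_aws_sig4_signer_with_disposition(url, disposition)
  # response-content-disposition gets signed like any other query param
  url_with_params = url + "?response-content-disposition=#{ERB::Util.url_encode(disposition)}"
  direct_aws_sig4_signer(url_with_params)
end

direct_aws_sig4_signer_with_disposition(
  naive_with_uri_escape_escaping(shrine_uploaded_file),
  'attachment; filename="some-image.jpg"'
)
# => presigned S3 url with a content-disposition override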

Could it be optimized even more, by profiling within the Aws::Sigv4::Signer implementation? Maybe, but it doesn’t really seem worth it — we are already introducing some fragility into our code by using lower-level APIs and hoping they will remain valid even if AWS changes some things in the future. I don’t really want to re-implement Aws::Sigv4::Signer, just glad to have it available as a tool I can use like this already.

The Numbers

The script I used to compare performance in different ways of creating presigned S3 URLs (with a couple public URLs for comparison) is available in a gist, and here is the output of one run:

user system total real
sdk public_url 0.054114 0.000335 0.054449 ( 0.054802)
naive S3 public url 0.004575 0.000009 0.004584 ( 0.004582)
naive S3 public url with URI.escape 0.009892 0.000090 0.009982 ( 0.011209)
sdk presigned_url 0.756642 0.005855 0.762497 ( 0.789622)
re-use instantiated SDK Presigner 0.817595 0.005955 0.823550 ( 0.859270)
use inline instantiated Aws::Sigv4::Signer directly for presigned url (with escaping) 0.216338 0.001941 0.218279 ( 0.226991)
Re-use Aws::Sigv4::Signer for presigned url (with escaping) 0.185855 0.001124 0.186979 ( 0.188798)
Re-use Aws::Sigv4::Signer for presigned url (without escaping) 0.178457 0.001049 0.179506 ( 0.180920)

So what to do?

Possibly there are optimizations that would make sense in the AWS SDK gem itself? But it would actually take a lot more work to be sure what can be done without breaking some use cases.

I think there is no need to use URI.parse in public_url, the URIs can just be treated as strings and concatenated. But is there an edge case I’m missing?

Using a different URI escaping method definitely helps in public_url. But how many other people who aren’t me care about optimizing public_url? What escaping method is actually required/expected, and is changing it a backwards-compat problem? And is it okay, maintenance-wise, for the S3 object to use a different escaping mechanism than the SDK’s common Seahorse::Util.uri_escape workhorse, which might be used in places with different escaping requirements?

For presigned_urls, cutting out a lot of the wrapper code and using an Aws::Sigv4::Signer directly seems to have significant performance benefits. But what edge cases get broken there, do they matter, and can a regression be avoided with alternate code that is still performant and maintainable?

Figuring this all out would take a lot more research (and figuring out how to work with the ruby SDK’s test suite more easily than I can right now; it’s a test suite for the whole SDK, and it’s a bear to run the whole thing).

Although if any Amazon maintainers of the ruby SDK, or other experts in its internals, see this and have an opinion, I am curious as to their thoughts.

But I am a lot more confident that some of these optimizations will work fine for my use cases. One of the benefits of using shrine is that all of my code already accesses S3 URL generation via shrine API. So I could easily swap in a locally optimized version, either with a shrine plugin, or just a local sub-class of the shrine S3 storage class. So I may consider doing that.
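For instance, a local subclass might look something like the following. This is only a sketch of the idea, assuming the shrine 3.x Shrine::Storage::S3 API (a url(id, **options) method plus bucket and prefix readers); it isn’t code I’ve put into production:

require "uri"
require "shrine/storage/s3"

class FasterPublicUrlS3Storage < Shrine::Storage::S3
  # Fast path for public URLs: plain string building plus URI-style escaping.
  # Everything else (presigned URLs, custom host, etc.) falls through to the
  # stock shrine implementation.
  def url(id, public: nil, **options)
    if public
      path = URI::DEFAULT_PARSER.escape([*prefix, id].join("/"))
      "https://#{bucket.name}.s3.amazonaws.com/#{path}"
    else
      super
    end
  end
end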

10 thoughts on “Speeding up S3 URL generation in ruby”

  1. Oh wow, thanks so much for that comment, and the library! I had identified some paths to optimizing it, but there was a lot more work to do — and you’ve just done it and shared it, that’s so great, and so glad I found that out before trying to reinvent it!

  2. We didn’t implement that, but it certainly could be supported; could you make a PR for that? The key part is adding them in the constructor so that they can be cached across the generation calls.

    By the way – what’s interesting is that in our profiling the URL escaping didn’t show up nearly as visibly. Most likely because our object keys are all URL-safe.

  3. Hm, I need different headers for different URLs though. I set content-disposition to include the desired “Save as” filename for each URL, which is different for different ones. So setting in the constructor and caching does not meet my needs. There may be some other things I need slightly different than you have done too; we realize that cutting out the flexibility is *part* of what gets you the performance. I’d fork your code to do what I need… but I realize the license you use isn’t compatible with most of my projects either. :( Well, I’m not at the point where I’m ready for this *quite* yet either, I guess I’ll think on it and see when I get there!

  4. The URL escaping showed up for me mostly in *public* URL generating; maybe it gets lost in the overhead of actual presigned URL generating. That they are all already URL-safe shouldn’t make much difference; my test data was too, it still takes time for the escaping code to check every byte to see if it needs escaping.
