In a recent post, I explored profiling and optimizing S3 presigned_url generation in ruby to make it much faster. In that post, I got down to using an Aws::Sigv4::Signer instance from the AWS SDK directly, but wondered if there was a bunch more optimization to be done within that black box.
Julik posted a comment on that post, letting us know that they at WeTransfer have already spent some time investigating and creating an optimized S3 URL signer, at https://github.com/WeTransfer/wt_s3_signer . Nice! It is designed for somewhat different use cases than mine, but is still useful — and can be benchmarked against the other techniques.
Some things to note:
- wt_s3_signer does not presently do URI escaping; it may or may not be noticeably slower if it did; it will not generate working URLs if your S3 keys include any characters that need to be escaped in the URI
- wt_s3_signer gets its ultimate optimizations by having you re-use a signer object that has a fixed/common datestamp and expiration and other times. That doesn’t necessarily fit into the APIs I want to fit it into, but can we still get a performance advantage by re-creating the object each time with those things? (Answer: yes, although not quite as much.)
- wt_s3_signer does not let you supply additional query parameters, such as response_content_disposition or response_content_type. I actually need this feature; and need it to be different for each presigned_url created even in a batch.
- wt_s3_signer’s convenience for_s3_bucket constructor does do at least one network request to S3… to look up the appropriate AWS region I guess? That makes it far too expensive to call the for_s3_bucket convenience constructor once per URL, but I don’t want to do this anyway; I’d rather just pass in the known region, as well as the known bucket base URL, etc. Fortunately the code is already factored well to give us a many-argument plain constructor where we can just pass that all in, with no network lookups triggered.
- wt_s3_signer insists on looking up AWS credentials from the standard locations; there’s no way to actually pass in a secret_access_key explicitly, which is a problem for some use cases where an app needs to use more than one set of credentials.
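To make the first point concrete: a key containing spaces, `#`, or non-ASCII characters has to be percent-encoded, segment by segment, before it goes into the URL path. Here is a minimal sketch of one way to do that with Ruby's standard library. `escape_s3_key` is a hypothetical helper name, not part of wt_s3_signer or the AWS SDK, and this is not necessarily how either library escapes keys.

```ruby
require "erb"

# Hypothetical helper: percent-encode each path segment of an S3 key,
# leaving the "/" separators between segments untouched.
def escape_s3_key(key)
  # split with -1 keeps trailing empty segments, so "dir/" stays "dir/"
  key.split("/", -1).map { |segment| ERB::Util.url_encode(segment) }.join("/")
end

puts escape_s3_key("reports/2020 q1/summary #1.pdf")
# => reports/2020%20q1/summary%20%231.pdf
```

A key with no special characters passes through unchanged, so the cost of escaping is one pass over the string per URL.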
So the benchmarks! This time I switched to benchmark-ips, cause, well, it’s just better. I am benchmarking 1200 URL generations again.
I am benchmarking re-using a single WT::S3Signer object for all 1200 URLs, as the gem intends, and also instantiating a new WT::S3Signer for each URL generation. The per-URL case uses the plain constructor rather than WT::S3Signer.for_s3_bucket, because the for_s3_bucket version cannot be instantiated once per URL generation without crazy bad performance (I did try, although it’s not included in these benchmarks).
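The two wt_s3_signer cases differ only in where the signer gets constructed. A rough sketch of that benchmark shape is below. It rests on two assumptions. First, it uses a hypothetical `StandInSigner`, whose constructor does some HMAC setup work, instead of the real `WT::S3Signer`, whose constructor arguments are not reproduced here. Second, it uses the stdlib `Benchmark.realtime` instead of benchmark-ips, so that it runs without any gems. The timings it prints say nothing about the real libraries.

```ruby
require "benchmark"
require "openssl"

# Hypothetical stand-in for a signer: setup work happens once in the
# constructor, per-URL work happens in #presigned_get_url.
class StandInSigner
  def initialize(secret:)
    # Simulate per-instance setup cost with four chained HMACs.
    @key = %w[20200101 us-east-1 s3 aws4_request].reduce("AWS4#{secret}") do |key, part|
      OpenSSL::HMAC.digest("SHA256", key, part)
    end
  end

  def presigned_get_url(object_key:)
    # Not a real S3 signature, just enough work to have something to time.
    sig = OpenSSL::HMAC.hexdigest("SHA256", @key, object_key)
    "https://example-bucket.s3.amazonaws.com/#{object_key}?sig=#{sig}"
  end
end

KEYS = Array.new(1200) { |i| "some/path/file-#{i}.jpg" }

# Case 1: one signer re-used for the whole batch of 1200 URLs.
reused_urls = nil
reused_time = Benchmark.realtime do
  signer = StandInSigner.new(secret: "not-a-real-secret")
  reused_urls = KEYS.map { |key| signer.presigned_get_url(object_key: key) }
end

# Case 2: a fresh signer constructed for every single URL.
fresh_urls = nil
fresh_time = Benchmark.realtime do
  fresh_urls = KEYS.map do |key|
    StandInSigner.new(secret: "not-a-real-secret").presigned_get_url(object_key: key)
  end
end

puts format("re-used: %.4fs, instantiated each time: %.4fs", reused_time, fresh_time)
```

Both cases produce identical URLs. The only difference is how many times the constructor's setup work runs: once, or 1200 times.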
I include all the presigned_url techniques I demo’d in the last post, but for clarity took any public url techniques out.
    Calculating -------------------------------------
    sdk presigned_url
                              1.291  (± 0.0%) i/s -      7.000  in   5.459268s
    use inline instantiated Aws::Sigv4::Signer directly for presigned url (with escaping)
                              4.950  (± 0.0%) i/s -     25.000  in   5.082505s
    Re-use Aws::Sigv4::Signer for presigned url (with escaping)
                              5.458  (±18.3%) i/s -     27.000  in   5.037205s
    Re-use Aws::Sigv4::Signer for presigned url (without escaping)
                              5.751  (±17.4%) i/s -     29.000  in   5.087846s
    wt_s3_signer re-used
                             45.925  (±15.2%) i/s -    228.000  in   5.068666s
    wt_s3_signer instantiated each time
                             15.924  (±18.8%) i/s -     75.000  in   5.016276s

    Comparison:
    wt_s3_signer re-used:       45.9 i/s
    wt_s3_signer instantiated each time:       15.9 i/s - 2.88x  (± 0.00) slower
    Re-use Aws::Sigv4::Signer for presigned url (without escaping):        5.8 i/s - 7.99x  (± 0.00) slower
    Re-use Aws::Sigv4::Signer for presigned url (with escaping):        5.5 i/s - 8.41x  (± 0.00) slower
    use inline instantiated Aws::Sigv4::Signer directly for presigned url (with escaping):        5.0 i/s - 9.28x  (± 0.00) slower
    sdk presigned_url:        1.3 i/s - 35.58x  (± 0.00) slower
Wow! Re-using a single WT::S3Signer, as the gem intends, is a LOT LOT faster than anything else: about 35x faster than the built-in AWS SDK presigned_url method!
But even instantiating a new WT::S3Signer for each URL, while significantly slower than re-use, is still significantly faster than any of the methods using an AWS SDK Aws::Sigv4::Signer directly (about 2.8x faster than the best of them), and still a lot lot faster than the AWS SDK presigned_url method (about 12x).
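Since each benchmark iteration generates 1200 URLs, the i/s figures translate into URLs per second by multiplying by 1200. A quick sketch of that arithmetic, using only the numbers reported in the benchmark output above (the labels are shortened for display):

```ruby
URLS_PER_ITERATION = 1200

# Iterations per second, copied from the benchmark-ips output.
RESULTS = {
  "wt_s3_signer re-used" => 45.925,
  "wt_s3_signer instantiated each time" => 15.924,
  "Re-use Aws::Sigv4::Signer (without escaping)" => 5.751,
  "Re-use Aws::Sigv4::Signer (with escaping)" => 5.458,
  "inline instantiated Aws::Sigv4::Signer (with escaping)" => 4.950,
  "sdk presigned_url" => 1.291
}.freeze

fastest = RESULTS.values.max

RESULTS.each do |label, ips|
  urls_per_second = (ips * URLS_PER_ITERATION).round
  slowdown = (fastest / ips).round(2)
  puts format("%-56s %6d URLs/s  %5.2fx slower", label, urls_per_second, slowdown)
end
```

That works out to roughly 55,000 URLs per second for the re-used WT::S3Signer, about 19,000 when instantiating it each time, and about 1,500 for the SDK presigned_url method.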
So this has promise: even if you don’t re-use the thing, it’s better than any other option. I may try to PR and/or fork to get some of the features I’d need in there… although the license is problematic for many projects I work on. With the benchmarking showing the value of this approach, I could also just try to reimplement from scratch based on the Amazon instructions/example code that wt_s3_signer itself used, and/or the ruby AWS SDK implementation.
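A from-scratch version might start from something like the following: a minimal sketch of SigV4 query-string presigning for an S3 GET, written from my reading of Amazon's documented steps. `TinyPresigner` and its argument names are hypothetical; this is not the wt_s3_signer code or the AWS SDK code, and it has not been checked against a live bucket. It takes credentials explicitly, escapes the key, and accepts extra query parameters such as `response-content-disposition`. The credentials in the usage lines are the placeholder values Amazon uses in its documentation.

```ruby
require "openssl"
require "digest"
require "erb"

# Hypothetical from-scratch presigner (a sketch, not production code).
class TinyPresigner
  def initialize(access_key_id:, secret_access_key:, region:, bucket_host:,
                 expires_in:, now: Time.now)
    @access_key_id = access_key_id
    @host = bucket_host
    @expires_in = expires_in
    @amz_date = now.utc.strftime("%Y%m%dT%H%M%SZ")
    datestamp = @amz_date[0, 8]
    @scope = "#{datestamp}/#{region}/s3/aws4_request"
    # Deriving the signing key costs four HMACs; a re-used instance pays once.
    @signing_key = [datestamp, region, "s3", "aws4_request"]
                   .reduce("AWS4#{secret_access_key}") { |key, part| hmac(key, part) }
  end

  def presigned_get_url(object_key:, extra_params: {})
    path = "/" + object_key.split("/", -1).map { |segment| enc(segment) }.join("/")
    params = {
      "X-Amz-Algorithm" => "AWS4-HMAC-SHA256",
      "X-Amz-Credential" => "#{@access_key_id}/#{@scope}",
      "X-Amz-Date" => @amz_date,
      "X-Amz-Expires" => @expires_in.to_s,
      "X-Amz-SignedHeaders" => "host"
    }.merge(extra_params)
    # Canonical query string: encoded names and values, sorted by name.
    query = params.map { |k, v| [enc(k), enc(v)] }.sort
                  .map { |k, v| "#{k}=#{v}" }.join("&")

    canonical_request = ["GET", path, query, "host:#{@host}", "",
                         "host", "UNSIGNED-PAYLOAD"].join("\n")
    string_to_sign = ["AWS4-HMAC-SHA256", @amz_date, @scope,
                      Digest::SHA256.hexdigest(canonical_request)].join("\n")
    signature = OpenSSL::HMAC.hexdigest("SHA256", @signing_key, string_to_sign)
    "https://#{@host}#{path}?#{query}&X-Amz-Signature=#{signature}"
  end

  private

  def hmac(key, data)
    OpenSSL::HMAC.digest("SHA256", key, data)
  end

  def enc(value)
    ERB::Util.url_encode(value.to_s)
  end
end

# Usage, with Amazon's documentation placeholder credentials (not real keys).
signer = TinyPresigner.new(
  access_key_id: "AKIAIOSFODNN7EXAMPLE",
  secret_access_key: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
  region: "us-east-1",
  bucket_host: "examplebucket.s3.amazonaws.com",
  expires_in: 86_400
)
puts signer.presigned_get_url(
  object_key: "some dir/test.txt",
  extra_params: { "response-content-disposition" => "attachment" }
)
```

Keeping the datestamp and derived signing key in the instance is what would make re-use cheap, while the per-URL method stays free to vary the key and the extra query parameters.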
One thought on “More benchmarking optimized S3 presigned_url generation”
This is awesome! We’ll also do a writeup of how we arrived at the optimizations, because curiously enough URL escaping didn’t show up in our profiling at all.
And indeed, `for_s3_bucket` is meant to be used once (since we generate large batches of URLs for the same bucket, always) and to then produce the `Signer` object which then is reused for all the URLs in the batch (also so that the datestamp does not go out of tolerance and the credential stays valid).