Delete all S3 key versions with ruby AWS SDK v3

If your S3 bucket is versioned, then deleting an object from s3 will leave a previous version there, as a sort of undo history. You may have a “noncurrent expiration lifecycle policy” set which will delete the old versions after so many days, but within that window, they are there.

What if you were deleting something that accidentally included some kind of sensitive or confidential information, and you really want it gone?

To make matters worse, if your bucket is public, the version is public too, and can be requested by an unauthenticated user that has the URL including a versionID, with a URL that looks something like: https://mybucket.s3.amazonaws.com/path/to/someting.pdf?versionId=ZyxTgv_pQAtUS8QGBIlTY4eKmANAYwHT To be fair, it would be pretty hard to “guess” this versionID! But if it’s really sensitive info, that might not be good enough.

It was a bit tricky for me to figure out how to do this with the latest version of ruby SDK (as I write, “v3“, googling sometimes gave old versions).

It turns out you need to first retrieve a list of all versions with bucket.object_versions . With no arg, that will return ALL the versions in the bucket, which could be a lot to retrieve, not what you want when focused on just deleting certain things.

If you wanted to delete all versions in a certain “directory”, that’s actually easiest of all:

s3_bucket.object_versions(prefix: "path/to/").batch_delete!

But what if you want to delete all versions from a specific key? As far as I can tell, this is trickier than it should be.

# danger! This may delete more than you wanted
s3_bucket.object_versions(prefix: "path/to/something.doc").batch_delete!

Because of how S3 “paths” (which are really just prefixes) work, that will ALSO delete all versions for path/to/something.doc2 or path/to/something.docdocdoc etc, for anything else with that as a prefix. There probably aren’t keys like that in your bucket, but that seems dangerously sloppy to assume, that’s how we get super weird bugs later.

I guess there’d be no better way than this?

key = "path/to/something.doc"
s3_bucket.object_versions(prefix: key).each do |object_version|
  object_version.delete if object_version.object_key == key
end

Is there anyone reading this who knows more about this than me, and can say if there’s a better way, or confirm if there isn’t?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s