
AWS S3 presigned URLs: Deep Dive and Use Cases for secure file transfers

Abstract

One very useful AWS S3 feature that many developers and architects are not familiar with is the ability to generate presigned URLs.

In this blog post, we will explore what presigned URLs are, how they are useful, and how you can leverage them in your AWS S3 workflows.

Target Architecture

infra.png

Signed URLs are popular in several scenarios:

  • we do not want to create and manage users for 3rd-party access to S3 buckets, but need to share a file with them for a short period of time without making it public
  • by creating a signed URL we offload the transfer to S3: there is no load on an AWS Lambda that passes the file through API Gateway, or on an EKS service that reads from S3 and proxies the response through Ingress. We do not burn CPU and memory on components that are not needed.

The diagram describes how the signed URL feature is integrated:

  • the user application/system requests access to an S3 file
  • Lambda signs a URL for the file that is valid for a given period of time and returns it
  • the user application downloads the file directly from S3

Signing is quick and consumes almost no compute capacity: the SDK computes the signature locally from the credentials it already holds, without calling S3.

A presigned URL remains valid for the period of time specified when the URL is generated. If you create a presigned URL with the Amazon S3 console, the expiration time can be set between 1 minute and 12 hours. If you use the AWS CLI or AWS SDKs, the expiration time can be set as high as 7 days.
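As a rough illustration of the SDK limit above, the sketch below converts a lifetime into the number of seconds that boto3 expects in ExpiresIn and rejects values above 7 days. The names MAX_SDK_EXPIRY and expires_in_seconds are hypothetical helpers, not part of any AWS library.

```python
from datetime import timedelta

# Upper bound for URLs created with the AWS CLI or SDKs (7 days).
MAX_SDK_EXPIRY = timedelta(days=7)


def expires_in_seconds(lifetime: timedelta) -> int:
    """Convert a lifetime to whole seconds, rejecting values outside (0, 7 days]."""
    if lifetime <= timedelta(0) or lifetime > MAX_SDK_EXPIRY:
        raise ValueError("lifetime must be positive and at most 7 days")
    return int(lifetime.total_seconds())


print(expires_in_seconds(timedelta(minutes=5)))
print(expires_in_seconds(MAX_SDK_EXPIRY))
```

The returned value can be passed as ExpiresIn when generating the URL; a short lifetime such as five minutes is usually enough for a direct download.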

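Code snippet that performs the signing

import boto3

# Illustrative values; replace with your own bucket, key and lifetime.
bucket = 'strata-2024'
key = 'data.txt'
expiration = 300  # URL lifetime in seconds

s3 = boto3.client('s3')
url = s3.generate_presigned_url(
    ClientMethod='get_object',
    Params={
        'Bucket': bucket,
        'Key': key
    },
    ExpiresIn=expiration
)
print(url)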
python3 reverse-engineer.py

After execution, the output contains URLs like the following. Let's analyze them.

URL structure

File: data.txt

https://strata-2024.s3.amazonaws.com/data.txt?
AWSAccessKeyId=ASIAWFOD4FP2PNVEYEWT&
Signature=OxQHlg+H2gdggggXpoX&
x-amz-security-token=IQoJb3aaaZ2luX2VjEK&
Expires=1706977535

File: data2.txt

https://strata-2024.s3.amazonaws.com/data2.txt?
AWSAccessKeyId=ASIAWFOD4FP2PNVEYEWT&
Signature=OxQHlg+H2gdggggXpoX&
x-amz-security-token=IQoJb3aaaZ2luX2VjEK&
Expires=1706977684

Comparing the payload:

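| Param                | File1                | File2                |
|----------------------|----------------------|----------------------|
| AWSAccessKeyId       | ASIAWFOD4FP2PNVEYEWT | ASIAWFOD4FP2PNVEYEWT |
| x-amz-security-token | IQoJb3aaaZ2luX2VjEK  | IQoJb3aaaZ2luX2VjEK  |
| Expires              | 1706977535           | 1706977684           |
| Signature            | OxQHlg+H2gdggggXpoX  | OxQHlg+H2gdggggXpoX  |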
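
The comparison can also be done programmatically. The sketch below is a minimal example using only the Python standard library; query_params and diff_params are hypothetical helper names. The query string is split by hand because the values are used exactly as printed in this post (a real URL would carry percent-encoded values, and a standard query parser would turn the "+" in Signature into a space).

```python
from urllib.parse import urlsplit


def query_params(url: str) -> dict:
    """Return the raw (undecoded) query parameters of a URL."""
    query = urlsplit(url).query
    # partition() yields (key, '=', value); [::2] keeps key and value.
    return dict(pair.partition("=")[::2] for pair in query.split("&") if pair)


def diff_params(url_a: str, url_b: str) -> dict:
    """Return only the parameters whose values differ between two URLs."""
    a, b = query_params(url_a), query_params(url_b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(a.keys() | b.keys())
        if a.get(name) != b.get(name)
    }


# URLs as printed in this post (values are shortened).
URL_1 = (
    "https://strata-2024.s3.amazonaws.com/data.txt?"
    "AWSAccessKeyId=ASIAWFOD4FP2PNVEYEWT&"
    "Signature=OxQHlg+H2gdggggXpoX&"
    "x-amz-security-token=IQoJb3aaaZ2luX2VjEK&"
    "Expires=1706977535"
)
URL_2 = (
    "https://strata-2024.s3.amazonaws.com/data2.txt?"
    "AWSAccessKeyId=ASIAWFOD4FP2PNVEYEWT&"
    "Signature=OxQHlg+H2gdggggXpoX&"
    "x-amz-security-token=IQoJb3aaaZ2luX2VjEK&"
    "Expires=1706977684"
)

print(urlsplit(URL_1).path, urlsplit(URL_2).path)
print(diff_params(URL_1, URL_2))
```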

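Apart from the different HTTP paths (files), these 2 signed URLs to 2 distinct files carry the same ACCESS_KEY and x-amz-security-token.

As with any AWS request signature, the ACCESS_KEY is sent in the request (payload, header, or URL parameter), while the SECRET_KEY always stays on the owning side.

Calling the S3 sign method does not create a new access key: the SDK signs locally with the credentials it already holds. The ASIA prefix and the accompanying x-amz-security-token indicate temporary credentials issued by AWS STS, for example those of the Lambda execution role, which is why this key is not listed among the account's long-term access keys. Later, when the user accesses the URL, S3 uses the matching SECRET_KEY to recompute the request signature and compares it with the Signature parameter, which guarantees that the request has not been altered. The same key is used for both files. Since the signed strings differ (distinct Expires and URL path), the real signatures differ as well; the Signature values printed above appear to be shortened, which is why they look identical.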

Let's try to modify Expires (the last parameter in the URL).

If we now try to access the file using the modified URL, S3 responds with an error in XML format:

<Error>
  <Code>SignatureDoesNotMatch</Code>
  <Message>
    The request signature we calculated does not match the signature you provided. Check your key and signing method.
  </Message>
  <AWSAccessKeyId>ASIAWFOD4FP2PNVEYEWT</AWSAccessKeyId>
  <StringToSign>
    GET 1706977682
    x-amz-security-token:IQoJb3aaaZ2luX2VjEK////////=
    /strata-2024/data2.txt
  </StringToSign>
  <SignatureProvided>OxQHlg+H2gdggggXpoX/M/40U3Veg=</SignatureProvided>
  <StringToSignBytes>
    47 45 54 0a 0a 0a 31 37 30 36 39 37 37 36 38 32 0a 78 2d 61 6d 7a 2d 73 65 63 75 72 69 74 79 2d 74 6f 6b 65 6e 3a 49 51 6f 4a 62 33 4a 70 5a 32 6c 75 58 32 56 6a 45 4b 2f 2f 2f 2f 2f 2f 2f 2f 2f
    2f 2f 77 45 61 44 47 56 31 4c 57 4e 6c 62 6e 52 79 59 57 77 74 4d 53 4a 48 4d 45 55 43 49 48 67 67 61 55 35 4b 5a 63 33 71 41 42 59 67 6e 30 4f 4d 6a 30 76 65 50 47 52 52 5a 55 48 67 66 71 54 6b
    6c 4f 4d 35 57 35 65 4e 41 69 45 41 78 71 43 34 38 76 74 6c 52 4a 63 5a 30 4a 78 66 33 50 52 58 4b 33 37 59 33 61 75 74 69 67 74 51 67 59 35 35 67 50 50 46 44 6d 77 71 6c 67 49 49 65 42 41 42 47
    37 72 42 54 63 55 77 4c 67 70 79 39 36 67 32 39 36 50 6e 4b 6c 58 65 57 74 73 54 51 44 42 70 74 47 76 64 42 36 62 47 48 5a 53 49 33 76 4d 74 41 65 64 2b 52 74 52 69 6b 6f 57 4e 62 64 54 74 66 72
    4e 77 7a 65 56 71 74 31 69 4b 57 63 4f 67 68 42 41 35 46 33 31 55 76 77 6d 51 54 62 57 78 31 4d 53 68 63 45 79 6e 53 6e 31 48 6d 6d 50 4c 45 54 76 70 78 34 49 2f 31 56 35 57 6b 79 6b 68 74 7a 6f
    49 3d 0a 2f 73 74 72 61 74 61 2d 32 30 32 34 2f 64 61 74 61 32 2e 74 78 74
  </StringToSignBytes>
  <RequestId>22CF9WNZZ429CSRE</RequestId>
  <HostId>
    jgGesVaafsl6Dx7zB9Vk6OClOTglS5yGa/lZix3sM20q0eers4hdm6c+8S9UVoTLkBXeBDa4xHeGRQ==
  </HostId>
</Error>

From the response we can see exactly what is signed. StringToSign consists of the HTTP method, the expiration time, the security token header, and the object path, separated by newlines (the 0a bytes in StringToSignBytes).
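
To make the structure concrete, here is a minimal sketch that rebuilds a string with the same layout as the StringToSign above and signs it with HMAC-SHA1, the scheme this URL format uses. The secret key is a made-up placeholder and the token is the shortened value from this post, so the printed signatures will not match the real ones; the function names are hypothetical.

```python
import base64
import hashlib
import hmac


def string_to_sign(method, expires, security_token, bucket, key):
    """Build the newline-separated string that gets signed."""
    return (
        f"{method}\n"
        "\n"  # Content-MD5, empty for a plain GET
        "\n"  # Content-Type, empty for a plain GET
        f"{expires}\n"
        f"x-amz-security-token:{security_token}\n"
        f"/{bucket}/{key}"
    )


def sign(secret_key, to_sign):
    """Base64-encoded HMAC-SHA1 of the string to sign."""
    digest = hmac.new(secret_key.encode(), to_sign.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


SECRET = "example-secret-key-not-a-real-one"  # placeholder, not a real key
TOKEN = "IQoJb3aaaZ2luX2VjEK"                 # shortened token from this post

original = string_to_sign("GET", 1706977684, TOKEN, "strata-2024", "data2.txt")
tampered = string_to_sign("GET", 1706977682, TOKEN, "strata-2024", "data2.txt")

print(sign(SECRET, original))
print(sign(SECRET, tampered))
print(sign(SECRET, original) == sign(SECRET, tampered))
```

Changing a single digit of Expires changes the string to sign, so the signature computed by S3 no longer equals the one carried in the URL, which is what the SignatureDoesNotMatch error reports.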

Reverse-engineering the SECRET from a presigned URL, and why it is important to keep the expiration time as short as possible:

AWS signs requests with a keyed hash (HMAC). The URLs shown above use the legacy Signature Version 2 query format (AWSAccessKeyId, Signature, Expires) with HMAC-SHA1; Signature Version 4 URLs use X-Amz-* parameters and HMAC-SHA256 instead. Knowing the algorithm details, a malicious user who has access to a signed URL can try to reverse-engineer it by brute force and recover the SECRET_KEY. Having both ACCESS_KEY and SECRET_KEY opens a wider attack vector, based on the IAM permissions associated with this entity, or a path to privilege escalation.

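import base64
import hashlib
import hmac
import secrets
import string

def generate_random_secret_key(length=40):
    # AWS secret access keys are 40 characters long; the alphabet is an assumption.
    alphabet = string.ascii_letters + string.digits + '+/'
    return ''.join(secrets.choice(alphabet) for _ in range(length))

access_key = 'ASIAWFOD4FP2PNVEYEWT'
# Fields are newline-separated, as in StringToSignBytes (0a); the token is shortened.
string_to_sign = (
    'GET\n\n\n1706977682\n'
    'x-amz-security-token:IQoJb3JpZ2luX2VjEK/////=\n'
    '/strata-2024/data2.txt'
).encode('UTF-8')
target = 'OxQHlg+H2gdggggXpoX/M/40U3Veg='

result = ''
while result != target:
    secret_key = generate_random_secret_key().encode('UTF-8')
    signature = base64.b64encode(
        hmac.new(secret_key, string_to_sign, hashlib.sha1).digest()
    )
    result = signature.decode()
    print(f"AWS {access_key}:{result}")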

The loop effectively never finishes and keeps printing newly generated candidate signatures:

...
AWS ASIAWFOD4FP2PNVEYEWT:YUTgj1ul1CqzE/hPhfVFHyaXfMk=
AWS ASIAWFOD4FP2PNVEYEWT:Cp47/TsLSY6z2XZW2BaYKm/lBvs=
AWS ASIAWFOD4FP2PNVEYEWT:nVdB1mSbQbbuzHo0rkq0viHf9DY=
AWS ASIAWFOD4FP2PNVEYEWT:C0McVvq0DIqEgjMbgkmTngnLUGQ=
AWS ASIAWFOD4FP2PNVEYEWT:G0Dh2Rkwz0pk4pdVPaoqPcEHwtk=
AWS ASIAWFOD4FP2PNVEYEWT:UJU1F7WYtyoYm7tYA/HFro6xfZ4=
....

This brute-force process depends on several factors:

  • cryptographic strength of the algorithm
  • compute resources used
  • time available for processing

Since the algorithm is fixed and we do not control the amount of resources an attacker will use for such a task, there is one very important parameter that we do control: the expiration time. Brute-forcing will take A LOT of time, and by reducing the expiration time of the generated signed URL we shrink the window in which a leaked URL is of any use, making a successful brute-force attack even less realistic.
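
To get a feeling for "A LOT of time", here is a back-of-the-envelope estimate. It is only a sketch under stated assumptions: a 40-character secret key drawn from a 64-symbol alphabet (the alphabet size is an assumption) and a hypothetical attacker testing 10^12 candidate keys per second.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def brute_force_years(alphabet_size, key_length, guesses_per_second):
    """Worst-case number of years needed to try every possible key."""
    candidates = alphabet_size ** key_length
    return candidates / guesses_per_second / SECONDS_PER_YEAR


# Assumed values, for illustration only.
years = brute_force_years(alphabet_size=64, key_length=40, guesses_per_second=10**12)
print(f"{years:.1e} years")
```

Under these assumptions the result is on the order of 10^52 years, while a presigned URL lives for 7 days at most. The expiration time is therefore far below anything a brute-force search could exploit, and shorter lifetimes only widen that gap.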

Expiration

Once the expiration time has passed, the request returns Status: 403 Forbidden:

<Error>
<Code>AccessDenied</Code>
<Message>Request has expired</Message>
<Expires>2024-02-03T16:28:04Z</Expires>
<ServerTime>2024-02-03T16:41:16Z</ServerTime>
<RequestId>EASJ4X6YWFJEB6RZ</RequestId>
<HostId>
a8TzGs44FcojTt8e1tQhqTfhXEdGBPF+IYV4q2T7jhumYTQJu9FsD0qQml8qfm+0OONK7/UKxocQBiQ==
</HostId>
</Error>
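
A client can avoid this round trip by checking the Expires parameter before using the URL. The sketch below assumes the URL format shown in this post, where Expires is a Unix timestamp in seconds; expires_at and is_expired are hypothetical helper names, and S3 remains the authority on whether a request is accepted.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def expires_at(url):
    """Read the Expires query parameter as a timezone-aware UTC datetime."""
    expires = parse_qs(urlsplit(url).query)["Expires"][0]
    return datetime.fromtimestamp(int(expires), tz=timezone.utc)


def is_expired(url, now=None):
    """True if the URL's expiration time is not later than `now` (default: current time)."""
    now = now or datetime.now(timezone.utc)
    return now >= expires_at(url)


# URL as printed in this post (values are shortened).
URL = (
    "https://strata-2024.s3.amazonaws.com/data2.txt?"
    "AWSAccessKeyId=ASIAWFOD4FP2PNVEYEWT&"
    "Signature=OxQHlg+H2gdggggXpoX&"
    "x-amz-security-token=IQoJb3aaaZ2luX2VjEK&"
    "Expires=1706977684"
)
# ServerTime taken from the error response above.
server_time = datetime(2024, 2, 3, 16, 41, 16, tzinfo=timezone.utc)

print(expires_at(URL).isoformat())
print(is_expired(URL, now=server_time))
```

The timestamp 1706977684 corresponds to the Expires value in the error response, and the ServerTime is about 13 minutes later, so the URL is reported as expired.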

Signed URLs can do even more

S3 supports signing not only GET requests, but also object-modifying PUT and DELETE requests. This is very useful when you need to let a 3rd-party agent write some results into an S3 bucket, but do not want to create a user for it and want to allow the operation only for a short time.
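
A presigned upload can be sketched in the same way as the download shown earlier, by signing put_object instead of get_object. The helper name presign_upload, the default lifetime of 300 seconds, and the object name results.txt are assumptions made for illustration; the S3 client is passed in so the function can be used with any configured boto3 client.

```python
def presign_upload(s3_client, bucket, key, expires_in=300):
    """Return a URL that accepts an HTTP PUT of `key` into `bucket` until it expires."""
    return s3_client.generate_presigned_url(
        ClientMethod="put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )


# Usage (requires boto3 and AWS credentials with permission to write the object):
#   import boto3
#   url = presign_upload(boto3.client("s3"), "strata-2024", "results.txt")
# The 3rd party can then upload with a plain HTTP PUT, for example:
#   curl --upload-file results.txt "<url>"
```

The 3rd party needs no AWS credentials of its own: the request is authorized with the permissions of the identity that signed the URL, and only until the URL expires.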

This post is licensed under CC BY 4.0 by the author.