module documentation

Utilities for working with Amazon Web Services.

Copyright 2017-2025, Voxel51, Inc.

Function download_public_s3_files Download files from a public AWS S3 bucket using unsigned URLs.
Variable logger Undocumented
Function _build_inputs Undocumented
Function _do_s3_download Undocumented
Function _multi_thread_download Undocumented
Function _parse_url Undocumented
Function _single_thread_download Undocumented
def download_public_s3_files(urls, download_dir=None, num_workers=None, overwrite=True):

Download files from a public AWS S3 bucket using unsigned URLs.

The urls argument accepts either:

  • A list of paths to objects in the s3 bucket:

    urls = ["s3://bucket_name/dir1/file1.ext", ...]
    

    When urls is a list, the download_dir argument is required, and all objects are downloaded into that directory.

  • A dictionary mapping the path of each object to the file on disk where that object should be stored:

    urls = {
        "s3://bucket_name/dir1/file1.ext": "/path/to/local/file1.ext",
        ...
    }
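
The two accepted forms of urls can be reduced to a single URL-to-path mapping. The following is a minimal sketch of that idea, not the library's implementation: split_s3_url and resolve_local_paths are hypothetical helpers, and storing each object under its base filename inside download_dir is an assumption of this sketch.

```python
import os
from urllib.parse import urlparse


def split_s3_url(url):
    """Split "s3://bucket/key" into (bucket, key). Hypothetical helper."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("Expected a URL like s3://bucket/key, got %r" % url)
    return parsed.netloc, parsed.path.lstrip("/")


def resolve_local_paths(urls, download_dir=None):
    """Return a dict mapping each URL to a local path, for either input form."""
    if isinstance(urls, dict):
        # Dict form: the caller already chose a local path per object
        return dict(urls)

    if download_dir is None:
        raise ValueError("download_dir is required when urls is a list")

    resolved = {}
    for url in urls:
        _, key = split_s3_url(url)
        # Assumption: each object is saved under its base filename
        resolved[url] = os.path.join(download_dir, os.path.basename(key))
    return resolved
```

Under these assumptions, a list such as ["s3://bucket_name/dir1/file1.ext"] with download_dir="/path/to/local" resolves to the same kind of mapping that the dictionary form supplies directly.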
    
Parameters
urls: either a list of URLs to objects in an S3 bucket, or a dict mapping these URLs to locations on disk. If urls is a list, then the download_dir argument is required
download_dir (default: None): the directory in which to store all downloaded objects. This is only used if urls is a list
num_workers (default: None): a suggested number of threads to use when downloading files
overwrite (default: True): whether to overwrite existing files
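
How overwrite and num_workers might shape a download run can be sketched as follows. This is an illustration under stated assumptions, not the module's code: pending_downloads and run_downloads are hypothetical names, fetch stands in for whatever performs a single download, and treating num_workers=None as "download serially" is a choice made for this sketch that may differ from the library's default.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def pending_downloads(url_to_path, overwrite=True):
    """Skip entries whose local file already exists when overwrite is False."""
    return {
        url: path
        for url, path in url_to_path.items()
        if overwrite or not os.path.exists(path)
    }


def run_downloads(url_to_path, fetch, num_workers=None):
    """Call fetch(url, path) for each pair, serially or on a thread pool."""
    items = list(url_to_path.items())

    # Assumption: None or 1 worker means a plain serial loop
    if not num_workers or num_workers <= 1:
        return [fetch(url, path) for url, path in items]

    with ThreadPoolExecutor(max_workers=num_workers) as executor:
        # executor.map preserves the order of items in its results
        return list(executor.map(lambda item: fetch(*item), items))
```

A thread pool suits this workload because each download spends most of its time waiting on the network rather than using the CPU.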

Undocumented

def _build_inputs(urls, s3_client, download_dir=None, overwrite=True):

Undocumented

def _do_s3_download(args):

Undocumented

def _multi_thread_download(inputs, num_workers):

Undocumented

def _parse_url(url):

Undocumented

def _single_thread_download(inputs):

Undocumented