llama.dev.upload package
Tools for uploading manifests of data files to a DigitalOcean Spaces/Amazon S3 object storage solution (for later user installation using llama install).
llama.dev.upload.upload_and_get_manifest(root: str = '.', glob: str = '**/*', key_prefix: str = 'llama/objects/', key_uses_relpath: bool = False, bucket: str = 'llama', public: bool = True, endpoint_url: str = 'https://nyc3.digitaloceanspaces.com', dry_run: bool = False, **kwargs)

Upload files from the specified path to DigitalOcean Spaces/AWS S3 and return a manifest mapping local relative paths to the stored object URLs and sha256 sums. Each uploaded object is named by the sha256 sum of its contents, with key_prefix prepended to aid in organization; this allows files to be versioned and avoids redundant uploads and downloads. Use this to offload large data files onto separate file storage and to generate the MANIFEST constant (and related constants) for installation.

- Parameters
root (str, optional) – The path to the root directory that should be uploaded to cloud storage. All local paths in the returned manifest will be relative to this path as well. Can be relative or absolute.
glob (str, optional) – The glob specifying which files to match from the provided root. By default, recursively matches all files in all subdirectories.
key_prefix (str, optional) – A prefix to prepend to the uploaded files’ sha256 sums in order to create their object keys (i.e. remote filenames). Note that this is just a prefix, so if you want it to act/look like a containing directory for uploaded files, you will need to make sure it ends with /.
key_uses_relpath (bool, optional) – If True, put the relative filepath from root of each file as a prefix in front of the sha256 sum when generating the key. In the filesystem analogy, this would put your remote files (on DigitalOcean, at least) at /<bucket>/<key_prefix>/<relative-path>/<sha256sum>. Use this if you want it to be easier to find the file at a glance or want to organize things by filename on the object store (e.g. for one-off uploads); don’t use this if you’re planning on organizing things with the returned manifest.
bucket (str, optional) – The DigitalOcean Spaces/AWS S3 bucket to upload files to. For DigitalOcean this is just the name of the directory in your root Spaces directory.
public (bool, optional) – Whether to make files public. If you specify public=False, the uploaded files will have None as their remote URLs in the returned manifest (which should not be surprising, since the returned manifest is intended for unauthenticated downloads). You want this to be True if you are uploading files for the purpose of public distribution.
endpoint_url (str, optional) – The endpoint_url argument for llama.com.s3.get_client. Specifies which S3 service you are using.
dry_run (bool, optional) – If True, don’t upload the files. Instead, print the manifest that would be generated and quit. Use this to see where your files will be uploaded before actually doing it.
**kwargs – Keyword arguments to pass to llama.com.s3.get_client that set authentication parameters and choose the target space for uploads; see documentation for that function for details.
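The way key_prefix and key_uses_relpath combine into an object key can be illustrated with a short sketch. This is not the package's implementation: the helper names sha256_of_file and sketch_object_key are hypothetical, and joining the relative path and the checksum with / is an assumption based on the layout described above.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex sha256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as infile:
        for chunk in iter(lambda: infile.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sketch_object_key(path, root=".", key_prefix="llama/objects/",
                      key_uses_relpath=False):
    """Hypothetical helper: build an object key as the parameters describe."""
    checksum = sha256_of_file(path)
    if key_uses_relpath:
        # Assumed layout: <key_prefix><relative-path>/<sha256sum>
        relpath = Path(path).relative_to(root).as_posix()
        return f"{key_prefix}{relpath}/{checksum}"
    # Default layout: <key_prefix><sha256sum>
    return f"{key_prefix}{checksum}"
```

Because the key is derived from the file contents, two files with identical contents map to the same default key, which is what makes redundant uploads avoidable.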
- Returns
manifest – A dictionary whose keys are local paths of uploaded files relative to the root argument and whose values are tuples of the remote upload URL and sha256 sum of the file described by the key. Use this manifest to later download and install the correct versions of the uploaded files with the correct directory structure. Looks like {filename: (url, sha256sum)}.
- Return type
Dict[str, Tuple[str, str]]
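To make the shape of the returned manifest concrete, here is a minimal sketch of how a {filename: (url, sha256sum)} dictionary could be consumed: download each object, check its checksum, and write it to its relative path. The function names fetch_bytes and sketch_install are hypothetical and this is not how llama install is necessarily implemented; it only assumes the manifest structure documented above.

```python
import hashlib
from pathlib import Path
from urllib.request import urlopen


def fetch_bytes(url):
    """Download a URL's contents (assumes a public, unauthenticated object)."""
    with urlopen(url) as response:
        return response.read()


def sketch_install(manifest, dest, fetch=fetch_bytes):
    """Hypothetical installer: write each manifest entry under ``dest``."""
    written = []
    for relpath, (url, expected_sha256) in manifest.items():
        if url is None:
            # Entries uploaded with public=False have no download URL.
            continue
        content = fetch(url)
        actual = hashlib.sha256(content).hexdigest()
        if actual != expected_sha256:
            raise ValueError(f"checksum mismatch for {relpath}: {actual}")
        target = Path(dest) / relpath
        # Recreate the directory structure recorded in the manifest keys.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
        written.append(target)
    return written
```

The fetch argument is injectable so that the checksum and path logic can be exercised without network access.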
Examples
Try uploading some dummy files with known contents to a remote test directory to confirm that you have access rights.
>>> # coding: utf-8
>>> import os
>>> from llama.dev.upload import upload_and_get_manifest
>>> from tempfile import TemporaryDirectory
>>> from pathlib import Path
>>> from requests import get
>>> from hashlib import sha256
>>> with TemporaryDirectory() as tmpdirpath:
...     tmpdir = Path(tmpdirpath)
...     with open(tmpdir/'foo', 'w') as foo:
...         _ = foo.write('bar')
...     with open(tmpdir/'baz', 'w') as baz:
...         _ = baz.write('quux')
...     manifest = upload_and_get_manifest(root=tmpdirpath, bucket='test',
...                                        key_prefix='llama/dev/upload/',
...                                        public=True)
>>> sha256(get(manifest['foo'][0]).content).hexdigest()
'fcde2b2edba56bf408601fb721fe9b5c338d10ee429ea04fae5511b68fbf8fb9'
>>> sha256(get(manifest['baz'][0]).content).hexdigest()
'053057fda9a935f2d4fa8c7bc62a411a26926e00b491c07c1b2ec1909078a0a2'
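The description above mentions generating the MANIFEST constant from the returned manifest. One possible way to turn a manifest dictionary into Python source is sketched below. The helper name render_manifest_constant is hypothetical, and the package may format its constants differently; the sketch only assumes that the manifest is a plain dictionary of strings and tuples.

```python
from pprint import pformat


def render_manifest_constant(manifest, name="MANIFEST"):
    """Hypothetical helper: render a manifest dict as a Python assignment."""
    # pformat emits a valid Python literal for dicts of str, tuple and None,
    # with keys sorted so the output is stable across runs.
    return f"{name} = {pformat(manifest)}\n"
```

The resulting string can be pasted into, or written to, a module that the installer later imports.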