...
Configure awscli
Code Block | ||
---|---|---|
| ||
[user@localhost ~]$ aws configure |
AWS Access Key ID [None]: <username> |
AWS Secret Access Key [None]: <password> |
Default region name [None]: [Enter] |
Default output format [None]: [Enter] |
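
For scripted setups, the same two values can be stored without the interactive prompts by using `aws configure set`. The following is only a minimal sketch: the function name `configure_s3_credentials` and the `AWS_BIN` variable are illustrative assumptions, not part of the AWS CLI. `AWS_BIN` exists so that the commands can be previewed (for example with `AWS_BIN=echo`) before they are run for real.

```shell
#!/usr/bin/env bash
# AWS_BIN is a hypothetical override hook; it defaults to the real aws binary.
AWS_BIN="${AWS_BIN:-aws}"

# Store the access key ID and secret access key in the default profile
# without going through the interactive prompts.
# Arguments: $1 = <username>, $2 = <password> (the same values as above).
configure_s3_credentials() {
    "$AWS_BIN" configure set aws_access_key_id "$1" &&
    "$AWS_BIN" configure set aws_secret_access_key "$2"
}
```

A call such as `configure_s3_credentials '<username>' '<password>'` leaves the region and output format at their defaults, as in the interactive session; `aws configure list` can then be used to check what was stored.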
...
Create a new bucket:
Code Block | ||
---|---|---|
| ||
[user@localhost ~]$ aws --endpoint-url <endpoint-url> s3 mb s3://<bucket-name> |
...
List all buckets:
Code Block | ||
---|---|---|
| ||
[user@localhost ~]$ aws --endpoint-url <endpoint-url> s3 ls |
...
Synchronize a local directory to a bucket under a specific prefix, deleting any files under that bucket/prefix that are no longer in the local directory:
Code Block | ||
---|---|---|
| ||
[user@localhost ~]$ aws --endpoint-url <endpoint-url> s3 sync /home/user/downloads/ s3://mybucket/mydownloads/ --delete |
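
Because `--delete` removes objects from the bucket, it can be worth previewing a sync first. The `--dryrun` option of `aws s3 sync` prints the operations that would be performed without running them. A minimal sketch follows; the function name `preview_sync` and the `AWS_BIN` variable are illustrative assumptions, not part of the AWS CLI.

```shell
#!/usr/bin/env bash
# AWS_BIN is a hypothetical override hook; it defaults to the real aws binary.
AWS_BIN="${AWS_BIN:-aws}"

# Show what a sync with --delete would upload and delete, without changing
# anything in the bucket.
# Arguments: $1 = endpoint URL, $2 = local directory, $3 = s3://bucket/prefix/
preview_sync() {
    "$AWS_BIN" --endpoint-url "$1" s3 sync "$2" "$3" --delete --dryrun
}
```

For example, `preview_sync <endpoint-url> /home/user/downloads/ s3://mybucket/mydownloads/` lists the planned uploads and deletions; once the list looks right, the same command can be run without `--dryrun`.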
...
Parallel uploads with the AWS CLI tool can make better use of the available bandwidth and improve performance. Ideally, you know the dataset well enough to divide the files into roughly equal portions. One approach is to use the --include and --exclude options to select mutually exclusive subsets and to run a separate aws command for each subset, for example in separate screen sessions. The first command below starts a detached screen session that copies all files in /home/user/downloads/ whose names start with the letter B; the second starts another session that copies all files that don't start with "B":
Code Block | ||
---|---|---|
| ||
[user@localhost ~]$ screen -S awscopy1 -d -m aws --endpoint-url <endpoint-url> s3 cp /home/user/downloads/ s3://mybucket/mydownloads/ --recursive --exclude "*" --include "B*" |
[user@localhost ~]$ screen -S awscopy2 -d -m aws --endpoint-url <endpoint-url> s3 cp /home/user/downloads/ s3://mybucket/mydownloads/ --recursive --exclude "B*" |
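
When more than two subsets are needed, the same include/exclude pattern can be generated in a loop. The sketch below swaps screen for plain shell background jobs and `wait`, so it suits scripts rather than interactive sessions that need to survive a logout. The function name `parallel_cp_by_prefix` and the `AWS_BIN` variable are illustrative assumptions, not part of the AWS CLI.

```shell
#!/usr/bin/env bash
# AWS_BIN is a hypothetical override hook; it defaults to the real aws binary.
AWS_BIN="${AWS_BIN:-aws}"

# Start one background "s3 cp" per leading character, then wait for all jobs.
# Arguments: $1 = endpoint URL, $2 = local directory, $3 = s3://bucket/prefix/,
#            remaining arguments = leading characters, one job each.
# Files whose names start with none of the given characters are NOT copied.
parallel_cp_by_prefix() {
    local endpoint="$1" src="$2" dest="$3"
    shift 3
    local prefix pid status=0
    local pids=()
    for prefix in "$@"; do
        "$AWS_BIN" --endpoint-url "$endpoint" s3 cp "$src" "$dest" \
            --recursive --exclude "*" --include "${prefix}*" &
        pids+=("$!")
    done
    # Wait for each job and remember whether any of them failed.
    for pid in "${pids[@]}"; do
        wait "$pid" || status=1
    done
    return "$status"
}
```

For example, `parallel_cp_by_prefix <endpoint-url> /home/user/downloads/ s3://mybucket/mydownloads/ A B C` runs three copies at once. The subsets are only balanced if the file names are spread evenly over the chosen characters, and a final command that excludes all of the chosen prefixes is still needed to pick up the remaining files.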
...