Amazon S3 Bucket Folder Size and Date Analyzer

This Bash script lists every bucket in an AWS account (the script calls them folders), looks up each one's total size, and stores the creation date, name, and size in a text file.

#!/bin/bash

pushd ~/Documents/projects/s3/
    echo > bkt-size.txt
    aws s3 ls | sort > list.txt

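    # Field 3 of each "date time name" line is the bucket name.
    for FOLDER in `cat list.txt | cut -d ' ' -f3`; do
        echo "Generating ${FOLDER} list..."
        # The last line of the --summarize output is the "Total Size:" line.
        SIZE=`aws s3 ls --recursive --summarize --human-readable s3://${FOLDER} | tail -1`

        # Match " name" at the end of the line; every line starts with the date, so "^name$" never matches.
        echo `cat list.txt | egrep " ${FOLDER}$" | cut -d ' ' -f1,3` ${SIZE} | tee -a bkt-size.txt
    done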
popd

Let's break it down step by step:

  1. pushd ~/Documents/projects/s3/: This command changes the directory to ~/Documents/projects/s3/ and pushes the current directory onto the directory stack.
  2. echo > bkt-size.txt: This command truncates bkt-size.txt (or creates it if it doesn't exist) and writes a single empty line to it, so the results file starts with a blank line.
  3. aws s3 ls | sort > list.txt: This command lists every bucket in the account using the AWS CLI (aws s3 ls), one per line, with its creation date, creation time, and name. sort then orders the lines as text; because each line begins with the date, the result is ordered by creation date. The output is saved to a file named list.txt.
  4. for FOLDER in `cat list.txt | cut -d ' ' -f3`; do: This line starts a loop over the lines of list.txt. cut extracts the third space-separated field of each line (the bucket name, which the script calls a folder), and the loop assigns each name in turn to the variable FOLDER.
  5. echo "Generating ${FOLDER} list...": This command simply prints a message indicating which folder's list is being generated.
  6. SIZE=`aws s3 ls --recursive --summarize --human-readable s3://${FOLDER} | tail -1`: This command retrieves the size of the bucket named by $FOLDER using the AWS CLI. It lists the contents of the bucket recursively (--recursive), appends a summary with the object count and total size (--summarize), and prints sizes in a human-readable format (--human-readable). tail -1 selects the last line of the output, which contains the total size.
  7. echo `cat list.txt | egrep " ${FOLDER}$" | cut -d ' ' -f1,3` ${SIZE} | tee -a bkt-size.txt: This command looks up the bucket's line in list.txt, keeps its creation date and name (fields 1 and 3), and prints them followed by the size obtained in the previous step. The pattern is anchored with a leading space rather than ^ because every line of list.txt starts with the creation date, so "^${FOLDER}$" would never match. The tee -a command both prints the output to the terminal and appends it to the file bkt-size.txt.
  8. popd: This command pops the directory stack, returning to the previous directory.
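
The field handling in steps 3, 4, 6, and 7 can be tried without AWS access. The sketch below is only an illustration: it assumes the usual aws s3 ls layout of date, time, and bucket name separated by single spaces, and it uses made-up sample lines (the times and object count are invented) in place of real CLI output. It uses grep -E where the script uses egrep, since recent GNU grep releases flag egrep as obsolescent.

```shell
#!/bin/bash
# Sample data standing in for `aws s3 ls` output (assumed layout: date time name).
LISTING='2016-09-30 10:15:02 pippo-media
2013-11-06 08:41:27 gen.videos.pippo'

# Sample tail of `aws s3 ls --recursive --summarize --human-readable` output.
SUMMARY='Total Objects: 12
   Total Size: 24.6 MiB'

# Step 3: sort compares whole lines, so the leading date decides the order.
SORTED=$(echo "${LISTING}" | sort)

# Step 4: field 3 of each line is the bucket name.
NAMES=$(echo "${SORTED}" | cut -d ' ' -f3)

# Step 6: the last line of the summary holds the total size.
SIZE=$(echo "${SUMMARY}" | tail -1)

# Step 7: find the bucket's line, keep date and name, then append the size.
MATCH=$(echo "${SORTED}" | grep -E " pippo-media$" | cut -d ' ' -f1,3)
# Leaving the variables unquoted collapses the leading spaces of ${SIZE}.
LINE=$(echo ${MATCH} ${SIZE})

echo "${NAMES}"
echo "${LINE}"
```

The last echo prints a line in the same shape as the entries in bkt-size.txt: creation date, bucket name, then the Total Size text.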

The result should be something like this:


2013-11-06 gen.videos.pippo Total Size: 280.4 KiB
2015-10-20 download.pippomais.com.br Total Size: 17.0 GiB
2016-05-12 pluto.sorting-hat Total Size: 0 Bytes
2016-05-12 pluto.bases.sem.cliente Total Size: 0 Bytes
2016-05-20 pluto.bases.3.meses.expiradas Total Size: 963.6 MiB
2016-09-30 pippo-videos Total Size: 3.5 TiB
2016-09-30 pippo-media Total Size: 24.6 MiB
2016-09-30 pippo-static Total Size: 0 Bytes
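
Because every result line has the same layout (date, name, the words Total Size:, a value, and a unit), the file is easy to post-process. As one possible use, the sketch below lists the buckets reported as empty. The function name empty_buckets is hypothetical and not part of the original script; it assumes the line format shown above.

```shell
#!/bin/bash
# Hypothetical helper: print the names of buckets whose report line
# reads "Total Size: 0 Bytes". Takes the results file as its only argument.
# Assumed line layout: date, name, "Total", "Size:", value, unit.
empty_buckets() {
    awk '$3 == "Total" && $4 == "Size:" && $5 == "0" && $6 == "Bytes" { print $2 }' "$1"
}
```

For example, empty_buckets bkt-size.txt would print one bucket name per line. Blank lines in the file are skipped because they have no fields to match.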

In summary, this script lists every bucket in the account along with its creation date and total size, and stores the results in a file named bkt-size.txt.
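
A possible variation, offered as a sketch rather than as the author's method: reading the bucket list with while read keeps each bucket's date and name together, so the second lookup in list.txt is not needed. The function name report_buckets is hypothetical; the aws commands and flags are the same ones the script above uses.

```shell
#!/bin/bash
# Hypothetical helper: print "date name Total Size: ..." for every bucket.
report_buckets() {
    # Each listing line is "date time name"; the time field is discarded.
    aws s3 ls | sort | while read -r CREATED _ BUCKET; do
        # The last line of the --summarize output is the "Total Size:" line.
        # stdin is redirected so the command cannot consume the loop's input.
        SIZE=$(aws s3 ls --recursive --summarize --human-readable "s3://${BUCKET}" < /dev/null | tail -1)
        # Unquoted ${SIZE} drops the leading spaces of the summary line.
        echo "${CREATED}" "${BUCKET}" ${SIZE}
    done
}
```

Running report_buckets | tee bkt-size.txt would show the lines on the terminal and write them to the results file in one pass.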