# Harmonizing Team tags on AWS S3 Buckets

**Date:** 2023-02-20  
**Author:** Kees C. Bakker  
**Categories:** Amazon S3, bash  
**Original:** https://keestalkstech.com/harmonizing-team-tags-on-aws-s3-buckets/

![Harmonizing Team tags on AWS S3 Buckets](https://keestalkstech.com/wp-content/uploads/2023/02/quino-al-H0xA8nGo9_s-unsplash.jpg)

---

At Wehkamp we use many, many buckets! To do FinOps correctly, we need to be able to determine which team owns which bucket. In this article I'll discuss how to detect *Team tags* that are not correct and how to apply the correct ones. We're using a combination of Bash, the AWS CLI, CSV and jq.

## Process

Harmonizing the Team tags involves the following phases:

![](https://keestalkstech.com/wp-content/uploads/2023/02/mermaid-diagram-2023-02-20-092011.png)
*First, we extract the current data from the AWS S3 API into a CSV file. Then we change the data and apply it.*

So we need to create two scripts: one to *extract* the tags from the AWS S3 API into a CSV and one to *apply* the changed CSV.

## Get the Team tags from S3

To extract the data from the AWS S3 API, the script needs to:

- Use the AWS CLI to query all the S3 buckets in the account.
- Check if the Team tag of each bucket is on the `allowed_team_tags` list.
- If it is not, check whether the bucket is empty and write the result to a CSV file.

Let's call the file `extract.sh`:

```sh
#!/bin/bash
set -e

csv_filename="${1:-s3_team_tags.csv}"

# Set the list of allowed team tags
declare -a allowed_team_tags=(
    "apps"
    "brands-recommendations"
    "..."
    "workplace"
    "pathfinders"
)

RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color

# Create CSV file with header row
printf "name,team,empty\n" > "$csv_filename"

# Loop through all S3 buckets in the AWS account
for bucket_name in $(aws s3api list-buckets --query 'Buckets[].Name' --output text); do
    
    printf "Bucket ${YELLOW}%s${NC} " "$bucket_name"
    
    # Get the Team tag for the bucket
    set +e
    team=$(aws s3api get-bucket-tagging --bucket "$bucket_name" --query 'TagSet[?Key==`Team`].Value' --output text)
    set -e
    if [ -z "$team" ]; then
        team="no_team_tag"
    fi
    
    printf "has tag ${YELLOW}%s${NC}: " "$team"
    
    # Check if the bucket has an allowed team tag; the spaces around $team
    # make sure only whole names match (so "app" does not match "apps")
    if [[ " ${allowed_team_tags[*]} " == *" $team "* ]]; then
        echo -e "${GREEN}valid${NC}"
        continue
    fi
    
    echo -e "${RED}invalid${NC}"
    
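    # Check if the bucket is empty: the query yields null when there are no objects
    first_key=$(aws s3api list-objects-v2 --bucket "$bucket_name" --max-items 1 --query 'Contents[0].Key' --output json)
    if [[ -z "$first_key" || "$first_key" == "null" ]]; then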
        is_empty="yes"
    else
        is_empty="no"
    fi
    
    # Add the bucket information to the CSV file
    echo "$bucket_name,$team,$is_empty" >> "$csv_filename"
done

# Count CSV lines minus header
lines=$(wc -l < "$csv_filename")
lines=$((lines-1))

echo ""
echo "CSV file $csv_filename has $lines lines"
echo ""
```

To monitor the progress, the script will output what it is doing.
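
Checking list membership with a glob pattern is easy to get subtly wrong: a pattern without delimiters also accepts partial names such as `app`. Below is a minimal sketch of an exact-match alternative. The helper name `is_allowed_team` is hypothetical and not part of the script above, and the sample list only contains the names shown earlier:

```sh
#!/bin/bash

# Hypothetical helper: succeeds only when $1 is exactly equal to
# one of the remaining arguments.
is_allowed_team() {
    local team="$1"
    shift
    local allowed
    for allowed in "$@"; do
        if [[ "$allowed" == "$team" ]]; then
            return 0
        fi
    done
    return 1
}

# Sample list, using the names shown in extract.sh
allowed_team_tags=("apps" "brands-recommendations" "workplace" "pathfinders")

for candidate in "apps" "app" "no_team_tag"; do
    if is_allowed_team "$candidate" "${allowed_team_tags[@]}"; then
        echo "$candidate: valid"
    else
        echo "$candidate: invalid"
    fi
done
```

Only `apps` is reported as valid here. The loop is a bit more verbose than a pattern match, but it does not depend on how the array is joined into a string.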

## Apply the Team tags

Applying the Team tags was not as straightforward as I had hoped. `put-bucket-tagging` replaces the whole tag set of a bucket, so any existing tags are destroyed unless you send them along. We use **jq** to merge the Team tag into the existing tags.
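
To get a feel for the merge without touching AWS, here is a minimal sketch that runs jq on a made-up tag set. The helper name `merge_team_tag` and the sample tags are assumptions for illustration. It takes a slightly different route than the script in this article: it drops any existing Team tag and appends the new one, so no separate "is the tag present?" check is needed.

```sh
#!/bin/bash

# Hypothetical helper: reads a tag set (JSON) from stdin and prints it with
# the Team tag set to $1, keeping every other tag untouched. Requires jq.
merge_team_tag() {
    jq -c --arg team "$1" \
        '.TagSet = ((.TagSet // []) | map(select(.Key != "Team")) + [{Key: "Team", Value: $team}])'
}

# Made-up sample in the shape that get-bucket-tagging returns
existing_tags='{"TagSet": [{"Key": "Environment", "Value": "prod"}, {"Key": "Team", "Value": "old-team"}]}'

echo "$existing_tags" | merge_team_tag "apps"
```

This prints `{"TagSet":[{"Key":"Environment","Value":"prod"},{"Key":"Team","Value":"apps"}]}`. Passing `{}` as input yields a tag set with only the Team tag.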

Here is the `apply.sh`:

```sh
#!/bin/bash

csv_filename="${1:-s3_team_tags.csv}"

GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color

{
    # skip header
    read -r
    
    # Read the CSV file and iterate over each row
    while IFS=',' read -r bucket team _; do
        
        echo -e "Setting ${YELLOW}$bucket${NC} to ${GREEN}$team${NC}"
        
        # Retrieve the current set of tags for the bucket
        existing_tags=$(aws s3api get-bucket-tagging --bucket "$bucket")
        
        # Check if the existing set of tags is empty
        if [ -z "$existing_tags" ]; then
            # If the set of tags is empty, add the Team tag
            # (--arg passes the value safely, even if it contains quotes)
            new_tags=$(jq -n --arg team "$team" '{"TagSet": [{"Key": "Team", "Value": $team}]}')
        else
            # If the set of tags is not empty, check if the Team tag is already present
            if echo "$existing_tags" | jq -e 'any(.TagSet[]; .Key == "Team")' > /dev/null; then
                # If the Team tag is present, update its value
                new_tags=$(echo "$existing_tags" | jq --arg team "$team" '.TagSet |= map(if .Key == "Team" then .Value = $team else . end)')
            else
                # If the Team tag is not present, add it to the existing set of tags
                new_tags=$(echo "$existing_tags" | jq --arg team "$team" '.TagSet += [{"Key": "Team", "Value": $team}]')
            fi
        fi
        
        # Compact the JSON to a single line; jq -c behaves the same on every
        # platform, while [\n\t ] in sed is interpreted differently by GNU and BSD sed
        new_tags=$(echo "$new_tags" | jq -c .)
        
        aws s3api put-bucket-tagging --bucket "$bucket" --tagging "$new_tags"
        
    done
    
} < "$csv_filename"
```
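
Note that `apply.sh` writes whatever is in the second column, so a row that was left on `no_team_tag` would end up as a real tag value. Before applying an edited CSV, a small check can catch such rows. This is a sketch under assumptions: the helper name `find_unassigned_rows` is hypothetical and the sample data is made up; the CSV layout is the `name,team,empty` format produced by `extract.sh`.

```sh
#!/bin/bash

# Hypothetical helper: prints the bucket name of every row whose team column
# is empty or still "no_team_tag". Returns 1 if any such row was found.
find_unassigned_rows() {
    local csv="$1" found=0 bucket team _rest
    {
        # skip header
        read -r _rest
        # the extra test keeps a last line without a trailing newline
        while IFS=',' read -r bucket team _rest || [[ -n "$bucket" ]]; do
            if [[ -z "$team" || "$team" == "no_team_tag" ]]; then
                echo "$bucket"
                found=1
            fi
        done
    } < "$csv"
    return "$found"
}

# Made-up sample data written to a temporary file
sample=$(mktemp)
printf 'name,team,empty\nbucket-a,apps,no\nbucket-b,no_team_tag,yes\n' > "$sample"

if ! find_unassigned_rows "$sample"; then
    echo "Fix the buckets listed above before running apply.sh"
fi

rm -f "$sample"
```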

## Final thoughts

I struggled a bit to get the tags "merged". Hopefully AWS will provide a better API to update single tags. I don't think it should be this hard.

[I had some problems running these scripts on Windows combining AWS Vault and Bash / WSL, so I wrote a small blog about it.](https://keestalkstech.com/2023/02/share-aws-vault-session-with-bash-wsl/)

## Changelog

- 2023-02-20: removed the double check of tags by querying for the Team tag directly, which makes the script a bit faster.
