# App definitions from Mesos/Marathon to CSV

**Date:** 2023-01-12  
**Author:** Kees C. Bakker  
**Categories:** bash  
**Original:** https://keestalkstech.com/app-definitions-from-mesos-marathon-to-csv/

![App definitions from Mesos/Marathon to CSV](https://keestalkstech.com/wp-content/uploads/2023/01/miguel-a-amutio-Y0woUmyxGrw-unsplash.jpg)

---

We're currently moving from Mesos/Marathon to Kubernetes. Because we have a microservices environment, we can move service by service. We have multiple clusters running, so I need to track which teams have which services (still) running on Mesos/Marathon.

Let's see if we can lift the data out of the system using the Marathon API, [JQ](https://stedolan.github.io/jq/manual/) and [cURL](https://curl.se/docs/manual.html). Our end goal is to get the following CSV:

![](https://keestalkstech.com/wp-content/uploads/2023/01/marathon-1.png)
*The CSV opened in Google Sheets.*

## Packages

If you are on a Debian-based machine like Ubuntu, you can install the packages we use like this:

```sh
sudo apt-get update
sudo apt-get install jq curl
```

## Empty CSV File

First, we're going to create a CSV with the fields *env*, *team*, *service* and *image*. The *env* is just a name we give to identify the environment (we won't be taking that from the Marathon API). Let's start by creating a file that contains only the header row:

```sh
#!/usr/bin/env bash
# shellcheck disable=SC2207

results_file="marathon.csv"
echo "env,team,service,image" > "$results_file"
```

We're writing the file using a [redirection operator](https://linuxhint.com/redirection-operators-bash/#:~:text=This%20symbol%2C%20known%20as%20the%20file%20redirection%20operator%2C%20is%20typically%20used%20to%20redirect%20the%20contents%20of%20a%20command/file%20to%20another%20by%20overwriting%20it.%20Mind%20you%3B%20it%20overwrites%20it%20%E2%80%93%20in%20bold%20and%20italicized!).
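
The difference between `>` (overwrite) and `>>` (append) matters for this script, because the header is written once and the rows are appended later. Here is a minimal sketch of that behaviour; the scratch file and the sample row are made up for the demonstration:

```sh
#!/usr/bin/env bash
# Hypothetical scratch file, used only for this demonstration.
demo_file="$(mktemp)"

echo "env,team,service,image" > "$demo_file"   # '>' truncates the file, then writes
echo "dev,team-a,api,nginx" >> "$demo_file"    # '>>' appends a line
echo "env,team,service,image" > "$demo_file"   # '>' again: the appended row is gone

# Count the lines that are left in the file.
line_count=$(grep -c '' "$demo_file")
echo "lines: $line_count"

rm -f "$demo_file"
```

Running the script twice therefore starts from a fresh header each time, as long as the header is written with `>`.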

## Query Marathon

What does the [Marathon API](https://mesosphere.github.io/marathon/api-console/index.html) have in store? Well, it provides a `/v2/apps` endpoint that returns the definitions of all the apps currently running on that Marathon instance:

![](https://keestalkstech.com/wp-content/uploads/2023/01/marathon.jpg)
*There are 3 fields we're really interested in: id, image and team.*

Next, let's add a function to query the Marathon API. We need to supply the *env*, *host* and *results_file* as positional parameters.

```sh
function query_marathon {
  env=$1
  host=$2
  results_file=$3

  api_url="http://$host:8080/v2/apps"

  curl -Ss "$api_url" | \
  jq --raw-output "
      .apps[] | 
      [
        \"$env\",
        .labels.team,
        (.id | sub(\"^/\";\"\")),
        .container.docker.image
      ] | 
      @csv
  " \
  >> "$results_file"
}
```

Here we use cURL to call the API and JQ to turn the result into CSV rows, which are appended to the results file with `>>`. Notice how we use a regular expression to strip the leading forward slash from the app id, so `/my-service` becomes `my-service`.
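
To try the filter without a running cluster, you can feed JQ a hand-written document. The sketch below assumes a sample shaped like the `/v2/apps` response (the app names, team and image are invented). It also passes the environment name with `--arg` instead of interpolating it into the program, which is an alternative to the quoting used above, not what the script does; `env_name` is a name I picked for the illustration.

```sh
#!/usr/bin/env bash
# Hypothetical sample shaped like a /v2/apps response; not real cluster data.
sample='{"apps":[
  {"id":"/orders","labels":{"team":"checkout"},"container":{"docker":{"image":"registry.example/orders:1.2"}}},
  {"id":"/legacy-cron","labels":{},"container":{"type":"MESOS"}}
]}'

env="label1-dev"

# --arg exposes the shell value as the jq variable $env_name,
# so the jq program can stay in single quotes without escaping.
rows=$(jq --raw-output --arg env_name "$env" '
  .apps[] |
  [
    $env_name,
    .labels.team,
    (.id | sub("^/"; "")),
    .container.docker.image
  ] |
  @csv
' <<< "$sample")

echo "$rows"
# "label1-dev","checkout","orders","registry.example/orders:1.2"
# "label1-dev",,"legacy-cron",
```

The second app has no team label and no Docker image; `@csv` renders those missing values as empty fields, so the row still has four columns.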

## The Loop

We want to query multiple environments, so let's create an [associative array](https://linuxhint.com/associative_array_bash/) that maps the name of every environment to its host name.

```sh
declare -A hosts=( 
    ["label1-dev"]="label.one.dev"
    ["label1-test"]="label.one.test"
    ["label1-prod"]="label.one.prod"
    ["label2-dev"]="label.two.dev"
)
```

The last thing we need to do is build a loop around it, so our code becomes:

```sh
#!/usr/bin/env bash
# shellcheck disable=SC2207,SC2059

declare -A hosts=( 
    ["label1-dev"]="label.one.dev"
    ["label1-test"]="label.one.test"
    ["label1-prod"]="label.one.prod"
    ["label2-dev"]="label.two.dev"
)
results_file="marathon.csv"

function query_marathon {
  env=$1
  host=$2
  results_file=$3

  api_url="http://$host:8080/v2/apps"

  curl -Ss "$api_url" | \
  jq --raw-output "
      .apps[] | 
      [
        \"$env\",
        .labels.team,
        (.id | sub(\"^/\";\"\")),
        .container.docker.image
      ] | 
      @csv
  " \
  >> "$results_file"
}

# write the csv header
echo "env,team,service,image" > "$results_file"

# sort keys
envs=( $(echo "${!hosts[@]}" | tr ' ' '\n' | sort) )

# loop
for env in "${envs[@]}"
do
  echo "Processing: $env"

  host="${hosts[$env]}"
  query_marathon "$env" "$host" "$results_file"

done

echo "Finished"
echo ""
```
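
The key sorting in the script relies on word splitting, which is why ShellCheck rule SC2207 is disabled. If you would rather not disable it, `mapfile` (Bash 4 or newer, which associative arrays need anyway) can read the sorted keys into an array. This is a sketch of that alternative with a shortened, deliberately unsorted copy of the hosts array:

```sh
#!/usr/bin/env bash
declare -A hosts=(
    ["label1-prod"]="label.one.prod"
    ["label1-dev"]="label.one.dev"
    ["label2-dev"]="label.two.dev"
)

# Print one key per line, sort the lines, and read them back into an array.
mapfile -t envs < <(printf '%s\n' "${!hosts[@]}" | sort)

for env in "${envs[@]}"; do
  echo "$env -> ${hosts[$env]}"
done
```

Bash does not guarantee any order for the keys of an associative array, so the explicit sort is what keeps the rows in the CSV grouped predictably.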

Enjoy!

## Changelog

- 2023-01-17 The script originally used `yq` instead of `jq`; `yq` is a similar tool, but for parsing YAML. I've replaced `yq` with `jq --raw-output` to get the same result.
