# Reading 400k+ key/values from Redis fast

**Date:** 2020-08-16  
**Author:** Kees C. Bakker  
**Categories:** bash, Projects, Python, Redis  
**Original:** https://keestalkstech.com/reading-multiple-key-values-from-redis/

![Lights on the highway](https://keestalkstech.com/wp-content/uploads/2020/08/jake-givens-iR8m2RRo-z4-unsplash-scaled.jpg)

---

At Wehkamp we use Redis a lot. It is fast, available, and offered as [a managed AWS service called ElastiCache](https://aws.amazon.com/elasticache/redis/). Sometimes we need to extract data from Redis, and [usually I use the redis-cli to interact from the command line](https://keestalkstech.com/2020/01/connect-to-aws-elasticache-redis-with-redis-cli/). But what if you need to get the values of 400k+ keys? What would you do? Is there an effective way to query multiple key/values from Redis?


## Only redis-cli+bash is sloooooow

When you use the redis-cli and bash, your script might look a bit like this:

```sh
URL=redis://my.redis.url
# list the matching keys, prefix each one with GET and run those commands
echo "KEYS product:*" | \
redis-cli -u "$URL" | \
sed 's/^/GET /' | \
redis-cli -u "$URL" >> test.txt
```

For small queries this works like a charm. The file only contains the values, not the key each value belongs to, which - depending on your use case - might be okay. The biggest problem I have with the approach is not that it is slow, but that it *does not show me any progress*. I have no clue when the process is finished!

## Python-scripting to the rescue

Let's write a small Python script that uses [KEYS](https://redis.io/commands/keys) and [MGET](https://redis.io/commands/mget) to write the key and value to a file while showing the progress.

First, we need to install the [Python Redis client](https://pypi.org/project/redis/):

```sh
pip install redis
```

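Our script reads all matching keys with `KEYS`, splits them into chunks of 10,000, fetches the values of each chunk with `MGET`, and writes every key and value to a file while printing the progress. It looks like this: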

```py
#!/usr/bin/env python3

import redis

file = 'result.txt'
url = 'redis://my.redis.url'
query = 'product:*'

print('Reading keys... ', end='', flush=True)
# decode_responses=True makes the client return str instead of bytes
client = redis.StrictRedis.from_url(url, decode_responses=True)
keys = client.keys(query)
print(f'{len(keys):,} keys found.')

def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

# every chunk is fetched with a single MGET round trip
partitions = list(chunks(keys, 10000))

with open(file, 'w', newline='\n', encoding='utf-8') as f:
    for nr, partition in enumerate(partitions, start=1):

        progress = nr / len(partitions) * 100
        print(f'\rProcessing values... {progress:.2f}%', end='', flush=True)

        values = client.mget(partition)
        for key, value in zip(partition, values):
            # MGET returns None for keys that no longer exist
            # or that do not hold a string value; skip those
            if value is None:
                continue
            f.write(f'{key}\n{value}\n')

print('\nDone!')
```

It shows the following progress:

```
Reading keys... 1,069,715 keys found.
Processing values... 55.14%
```
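The file written by the script alternates key lines and value lines. Reading it back could look roughly like the sketch below. It assumes that no value contains a newline, and `read_pairs` is a hypothetical helper name, not something the script provides. An in-memory `io.StringIO` stands in for the real `result.txt`:

```py
import io

def read_pairs(lines):
    """Turn alternating key/value lines into a dict (hypothetical helper)."""
    stripped = [line.rstrip('\n') for line in lines]
    if len(stripped) % 2:
        # a partial check only: an odd line count cannot be split into pairs
        raise ValueError('odd number of lines: a value probably contains a newline')
    # even indexes hold the keys, odd indexes hold the values
    return dict(zip(stripped[0::2], stripped[1::2]))

# stand-in for open('result.txt', encoding='utf-8')
sample = io.StringIO('product:1\n{"name": "shoe"}\nproduct:2\n{"name": "hat"}\n')
pairs = read_pairs(sample)
print(pairs['product:2'])
```

The odd-line check catches only some of the broken files, so the format is really only safe for single-line values.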

But we can do you one better...

## Introducing: redis-mass-get cli

The script above generates two lines per key/value. This might not suit your use case, especially when the value contains newlines as well. Sometimes a CSV or JSON output is better. Based on this Python script I've created the Python [redis-mass-get CLI](https://github.com/KeesCBakker/redis-mass-get) which can be installed with:

```sh
pip install redis-mass-get
```

### JSON format

It can even parse the JSON value (`-jd`) before writing its output to a file, all while showing the progress:

```sh
redis-mass-get -d results.json -jd redis://my.redis.url 'product:*'
```
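To give an idea of what parsing the JSON value before writing can mean, here is a minimal sketch. It is an illustration only: the article does not show how `redis-mass-get` implements `-jd` or what its output looks like, so `decode_value`, the fallback to the raw string, and the sample data are all assumptions.

```py
import json

def decode_value(raw):
    """Return the parsed JSON value, or the raw value when it is not valid JSON."""
    try:
        return json.loads(raw)
    except (TypeError, ValueError):
        # TypeError: raw is None; ValueError: raw is not valid JSON
        return raw

# made-up key/value pairs, as they might come back from MGET
pairs = [
    ('product:1', '{"name": "shoe", "price": 10}'),
    ('product:2', 'plain text'),
]
decoded = {key: decode_value(value) for key, value in pairs}
print(json.dumps(decoded, indent=2))
```

With the values decoded first, the result file holds nested JSON objects instead of JSON documents wrapped in escaped strings.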

### CSV format

Since working with Spark I see CSV taking flight again. The CLI can also generate a CSV file with a `key, value` header:

```sh
redis-mass-get -d results.csv redis://my.redis.url 'product:*'
```
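CSV copes with values that contain newlines because such fields are quoted. The sketch below shows that with Python's standard `csv` module. It is not the CLI's implementation: `write_csv` is a hypothetical helper, and the exact header spelling and sample data are assumptions.

```py
import csv
import io

def write_csv(pairs, out):
    """Write (key, value) tuples as CSV, starting with a header row."""
    writer = csv.writer(out, lineterminator='\n')
    writer.writerow(['key', 'value'])
    writer.writerows(pairs)

# in-memory stand-in for a results.csv file
buffer = io.StringIO()
write_csv([('product:1', 'line one\nline two')], buffer)

# reading it back returns the value with its newline intact
buffer.seek(0)
rows = list(csv.reader(buffer))
print(rows[1])
```

The multi-line value is stored as one quoted field, so it survives the round trip as a single row.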

### Pipeline CLI commands

Sometimes you want to [pipe](https://www.wikiwand.com/en/Pipeline_(Unix)) the output to another program. This example shows how to pipe the key/values in the CSV format, ignoring the CSV header (`-och`):

```sh
redis-mass-get -f csv -och redis://my.redis.url 'product:*' | less
```

When no destination is specified, the data is written to `stdout`.

## Conclusion

Querying multiple key/values from Redis is easy using `KEYS` and `MGET`. If you need to write key/values to a JSON, TXT or CSV file, just use the Python `redis-mass-get` CLI. Quick and easy.

## Improvements

2020-08-17: Added the [Pipeline CLI commands](#pipeline-cli-commands) section  
2020-08-16: Initial article
