
Performance Battle of NoSQL blob storages #3: Redis

We have already measured performance stats for Apache Cassandra and Apache Kafka. Including Redis in a comparison of persistent storages may look like a misunderstanding at first sight. On the other hand, there are certain use-cases where it makes sense to keep data in main memory, especially in private data centers, primarily once your cluster includes machines with almost as much RAM as hard drive space :-)

Redis is an enterprise-grade, or advanced, key-value store with optional persistence. There are a couple of reasons why everyone loves Redis. Why do I?

1. It's pretty simple.

The following command installs the Redis server on Ubuntu. That's all.

apt-get install redis-server

2. It's incredibly fast. Look at the tables below: one million remote operations per second.

3. It supports a large set of commands. More than a database, it's an enterprise remote hash-map, hash-set, sorted-list or pub/sub channel solution, supporting TTL and a Lua scripting environment among others. See all commands. A short sketch after this list shows a few of these commands in action.

4. It's optimized to use few computer resources, both CPU and RAM. Even though the Redis server is a single-threaded app, it achieves this great performance.
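Just to illustrate the command set, here is a minimal sketch using the Jedis client against a local instance; the keys, values and the Lua script are purely illustrative and have nothing to do with the measurements below.

import redis.clients.jedis.Jedis;

public class RedisCommandsDemo {
    public static void main(String[] args) {
        // Connect to a local Redis instance (host and port are illustrative).
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // Hash map: store fields of a "user" record under one key.
            jedis.hset("user:42", "name", "alice");
            jedis.hset("user:42", "visits", "1");

            // Sorted set: keep a leaderboard ordered by score.
            jedis.zadd("leaderboard", 1500, "alice");
            jedis.zadd("leaderboard", 900, "bob");

            // List: append messages, as used in the tests below.
            jedis.lpush("queue:events", "event-1", "event-2");

            // TTL: expire the whole user record after one hour.
            jedis.expire("user:42", 3600);

            // Lua scripting: atomically increment a counter server-side.
            Object result = jedis.eval(
                "return redis.call('INCRBY', KEYS[1], ARGV[1])",
                1, "counter:hits", "5");
            System.out.println("counter:hits = " + result);
        }
    }
}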

I've already started to talk about our purposes at the beginning. We primarily targeted two points. First, we can use Redis for a super-fast deployment of our app when latency matters.

Secondly, I wanted to compare in-memory and persistent stores. Is such an in-memory solution really worth considering?

Setup

I used the following setup:
  • Redis server 2.2.6: HP ProLiant BL460c Gen8, 32 cores at 2.6 GHz, 192 GB RAM, Ubuntu 12 Server
  • Test executor: Xeon, 16 cores, 32 GB RAM, w8k server
  • 10 Gbps network
  • Jedis Java client
  • Kryo binary serialization
As the Redis cluster feature was still in development at the time of measuring these numbers, I used only one machine. 32 cores were really overkill, as Redis effectively used one core plus one more.
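The test harness itself is not listed here, but the following sketch shows roughly how a blob could be serialized with Kryo and appended through Jedis in this setup; the Blob class, key handling and buffer sizes are my assumptions, not the exact code used for the measurements.

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Output;
import redis.clients.jedis.Jedis;

public class BlobWriter {
    // Simple payload type standing in for the measured blobs (illustrative).
    static class Blob {
        public byte[] payload;
    }

    private final Kryo kryo = new Kryo();
    private final Jedis jedis;

    public BlobWriter(String host) {
        this.jedis = new Jedis(host, 6379);
        kryo.register(Blob.class);          // registration keeps the binary form compact
    }

    /** Serializes the blob with Kryo and appends it to a Redis list. */
    public void append(String key, Blob blob) {
        Output output = new Output(64, -1); // small initial buffer, allowed to grow
        kryo.writeObject(output, blob);
        output.close();
        jedis.lpush(key.getBytes(), output.toBytes());
    }

    public void close() {
        jedis.close();
    }
}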

Performance Measurement of Redis

Batch Size

Appending with LPUSH to eight different keys.

Blob size \ Batch size [TPS]     128      256      1024     32768
100b                             570k     570k     557k     600k
20kb                             38k      40k      35k      33k

As only main memory is touched, it's all about network transmission. The throughput is almost the same for all batch sizes.
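A possible shape of this batched append, assuming a Jedis pipeline flushed once per batch; the key prefix, batch size and payload size are illustrative.

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;

public class BatchLpush {
    public static void main(String[] args) {
        int keys = 8;                  // number of distinct list keys, as in the test
        int batchSize = 1024;          // commands sent before one sync, e.g. 128..32768
        byte[] blob = new byte[100];   // 100b payload; 20kb in the second row

        try (Jedis jedis = new Jedis("localhost", 6379)) {
            Pipeline pipeline = jedis.pipelined();
            for (int i = 0; i < batchSize; i++) {
                // Spread the appends across the eight keys.
                String key = "blobs:" + (i % keys);
                pipeline.lpush(key.getBytes(), blob);
            }
            // One sync flushes the whole batch over the wire and reads the replies.
            pipeline.sync();
        }
    }
}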

Variable Connections


Utilizing LPUSH again to append to a varying number of keys. Every key is accessed through its own connection.

Blob size \ Connections [TPS]    1        2        4        8        32       128
100b                             4467     50k      646k     560k     960k     998k
20kb                             9.2k     16.8k    20.8k    34k      35k      52k
Ohhh. One million inserted messages per second into a single Redis instance. Incredible. The Java client uses NIO, which is why throughput scales with many more TCP connections: increasing the number of network pipes a client can push data through enables better throughput.
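A sketch of the idea, assuming one Jedis connection per worker thread and one key per connection; the thread count, key names and iteration count are illustrative.

import redis.clients.jedis.Jedis;

public class ConnectionsPerKey {
    public static void main(String[] args) throws InterruptedException {
        int connections = 32;                 // 1, 2, 4, 8, 32 or 128 in the test
        byte[] blob = new byte[100];
        Thread[] workers = new Thread[connections];

        for (int c = 0; c < connections; c++) {
            final String key = "blobs:" + c;  // every connection writes to its own key
            workers[c] = new Thread(() -> {
                // One dedicated TCP connection per worker.
                try (Jedis jedis = new Jedis("localhost", 6379)) {
                    for (int i = 0; i < 100_000; i++) {
                        jedis.lpush(key.getBytes(), blob);
                    }
                }
            });
            workers[c].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
    }
}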

Occupied Memory


Blobs stored within a list.

Blob size    Bytes per message
100b         152
20kb         19kb
There is some built-in compression, which showed up in the large-message test.
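The post does not show how these per-message numbers were collected; one possible way is to diff the used_memory counter from INFO before and after inserting a known number of messages, as in this assumed sketch.

import redis.clients.jedis.Jedis;

public class MemoryPerMessage {
    public static void main(String[] args) {
        int messages = 100_000;
        byte[] blob = new byte[100];

        try (Jedis jedis = new Jedis("localhost", 6379)) {
            long before = usedMemory(jedis);
            for (int i = 0; i < messages; i++) {
                jedis.lpush("blobs:0".getBytes(), blob);
            }
            long after = usedMemory(jedis);
            // Includes list/node overhead, which is exactly what the table reports.
            System.out.println("bytes per message: " + (after - before) / messages);
        }
    }

    /** Reads the used_memory counter from the INFO command output. */
    private static long usedMemory(Jedis jedis) {
        for (String line : jedis.info().split("\r\n")) {
            if (line.startsWith("used_memory:")) {
                return Long.parseLong(line.substring("used_memory:".length()).trim());
            }
        }
        throw new IllegalStateException("used_memory not found in INFO output");
    }
}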

Long Running


The goal of this test is to fill the main memory (192 GB) of one Redis instance to find out whether there are any scalability limitations.

Blob size    TPS
100b         842k
20kb         18.2k
Redis kept filling the main memory with blob messages until it ran out of memory. Throughput over time stayed almost flat.


Persistence


Even though Redis stores the data in main memory, there is a way to persist it.

Blob size \ [TPS]    With AOF (every second)    With AOF (always)    Without persistence
100b                 800k                       330k                 960k

Redis offloads the disk I/O to the background. The numbers are almost the same when the data is flushed to disk once per second. TPS drops significantly when Redis syncs every updated key to the drive, but that mode ensures the best durability.
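For reference, the modes in the table map onto standard redis.conf directives; the "Without" column corresponds to leaving the append-only file disabled.

# redis.conf persistence settings
appendonly yes          # enable the append-only file (AOF)
appendfsync everysec    # flush to disk once per second ("every second" column)
# appendfsync always    # fsync after every write: best durability, lowest TPS
# appendonly no         # no AOF at all ("Without" column)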

Large Messages


How is performance affected when the message size grows to tens of megabytes?

Blob size    TPS
500 kb       1418
5 mb         85
100 mb       1.2

Conclusion

The numbers are impressive. A single-threaded app successfully processes almost one million small messages per second in memory.

Redis performance is incredible. We might expect limitations from the single-threaded design, which serializes all operations, but perhaps this lock-free implementation is exactly why the Redis server can handle such great throughput.

On the other hand, a fair comparison against the other competitors has to use the Redis persistence feature. The performance is then much worse, roughly three times lower. The persistence requirement can therefore be the decision maker.

I've already mentioned the great command set. You can model almost any behavior with it. Even though Redis primarily targets caches, there are commands for calculating various stats, holding unique sets, and so on. A script composed from these commands is powerful and very fast.

Everything is always about performance and features. Redis has both of them :-)
