Performance Battle of NoSQL blob storages #3: Redis

We have already measured performance stats for Apache Cassandra and Apache Kafka. Including Redis in a comparison of persistent storages may look like a misunderstanding at first sight. On the other hand, there are use cases where it makes sense to keep the data in main memory, especially in private data centers, and above all when your cluster includes a machine whose RAM is almost as large as its hard drive 🙂

Redis is an enterprise, or advanced, key-value store with optional persistence. There are a couple of reasons why everyone loves Redis. Why do I?

1. It’s pretty simple.

The following command installs the Redis server on Ubuntu. That's all.

apt-get install redis-server

2. It's incredibly fast. Look at the following tables: one million remote operations per second.
3. It supports a large set of commands. More than a database, it is an enterprise, remote-aware hash-map, hash-set, sorted-set or pub/sub channel solution that supports TTL and a Lua scripting environment, among others. See all commands.
4. It's optimized to use few computer resources, both CPU and RAM. Even though the Redis server is a single-threaded app, it achieves such great performance.

I've already started to talk about the purposes at the beginning. We primarily targeted two points. First, we can use Redis in a super-fast deployment of our app when latency matters.
Secondly, I wanted to compare in-memory and persistent stores. Is it really worth thinking about such an in-memory solution?

Setup

I used the following setup:
  • Redis server 2.2.6: HP Proliant BL460c gen 8, 32 cores, 2.6 GHz, 192 GB RAM, Ubuntu 12 server
  • Test executor: Xeon, 16 cores, 32 GB RAM, w8k server
  • 10Gbps network
  • Jedis Java client
  • Kryo binary serialization
As the Redis cluster feature was still in development at the time of these measurements, I used only one machine. 32 cores were far more than needed, as Redis used just one plus one core.

Performance Measurement of Redis

Batch Size

Appending using LPUSH to eight different keys.

| Blob size \ Batch size [TPS] | 128 | 256 | 1024 | 32768 |
|---|---|---|---|---|
| 100b | 570k | 570k | 557k | 600k |
| 20kb | 38k | 40k | 35k | 33k |

As only main memory is touched, it's all about network transmission. Throughput is almost the same for all batch sizes.
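To put the network claim into numbers, here is a minimal sketch that converts the measured TPS into payload bandwidth. It assumes that "20kb" means 20 × 1024 bytes and it counts the blob payload only, ignoring the Redis protocol and TCP overhead. The class and method names are made up for this illustration.

```java
/** Rough payload-bandwidth estimate for the batch-size measurements (illustrative only). */
final class PayloadBandwidth {

    /** Payload bits per second for a blob size in bytes and a throughput in messages per second. */
    static double bitsPerSecond(long blobBytes, long tps) {
        return blobBytes * 8.0 * tps;
    }

    /** Fraction of a link (capacity in bits per second) consumed by the payload alone. */
    static double linkUtilization(long blobBytes, long tps, double linkBitsPerSecond) {
        return bitsPerSecond(blobBytes, tps) / linkBitsPerSecond;
    }

    public static void main(String[] args) {
        double link = 10e9;          // the 10Gbps network from the setup
        long small = 100;            // "100b" blob
        long large = 20 * 1024;      // "20kb" blob, assuming 1 kb = 1024 bytes

        System.out.printf("100b @ 600k TPS: %.2f Gbps%n", bitsPerSecond(small, 600_000) / 1e9);
        System.out.printf("20kb @ 40k TPS:  %.2f Gbps%n", bitsPerSecond(large, 40_000) / 1e9);
        System.out.printf("20kb link utilization: %.2f%n", linkUtilization(large, 40_000, link));
    }
}
```

Under these assumptions the small blobs move roughly 0.5 Gbps of payload, while the 20kb blobs take about two thirds of the 10Gbps link. So the large-blob numbers are plausibly bounded by raw bandwidth, and the small-blob numbers more by per-message cost.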

Variable Connections

Using LPUSH again to append to a varying number of keys. Every key is accessed through a different connection.

| Blob size \ Connections [TPS] | 1 | 2 | 4 | 8 | 32 | 128 |
|---|---|---|---|---|---|---|
| 100b | 446 | 750k | 646k | 560k | 960k | 998k |
| 20kb | 9.2k | 16.8k | 20.8k | 34k | 35k | 52k |

Ohhh. One million messages inserted per second into one Redis instance. Incredible. The Java client uses NIO, which is the reason why it somehow scales with many more TCP connections. Increasing the number of network pipes through which a client can push data enables better throughput.

Occupied Memory

Blob within a list.

| Blob size | Bytes per message |
|---|---|
| 100b | 152 |
| 20kb | 19kb |

There is some built-in compression, which showed up in the large message test.

Long Running

The goal of this test is to fill the main memory (192 GB) with one Redis instance to find out whether there are any scalability limitations.

| Blob size | TPS |
|---|---|
| 100b | 842k |
| 20kb | 18.2k |

Redis fills the main memory with blob messages until OOM. The throughput stays almost flat over time.
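For a feeling of how long such a run takes, the following sketch combines the measured 152 bytes per 100b message with the 842k TPS of this test. It assumes 192 GB means 192 × 2^30 bytes and that all of it is available for data, so it is an upper bound: Redis itself and the OS need memory too. The names are made up for this illustration.

```java
/** Back-of-the-envelope estimate of the long-running fill test (illustrative only). */
final class FillEstimate {

    /** Number of messages that fit into the given memory at the measured footprint per message. */
    static long messagesThatFit(long memoryBytes, long bytesPerMessage) {
        return memoryBytes / bytesPerMessage;
    }

    /** Seconds needed to insert that many messages at a constant throughput. */
    static double secondsToFill(long memoryBytes, long bytesPerMessage, long tps) {
        return (double) messagesThatFit(memoryBytes, bytesPerMessage) / tps;
    }

    public static void main(String[] args) {
        long ram = 192L << 30;   // 192 GB, assuming binary gigabytes
        long footprint = 152;    // measured bytes per 100b message
        long tps = 842_000;      // measured long-running throughput for 100b blobs

        System.out.println("messages: " + messagesThatFit(ram, footprint));
        System.out.println("minutes:  " + secondsToFill(ram, footprint, tps) / 60.0);
    }
}
```

With these inputs the estimate comes out at roughly 1.36 billion small messages and about 27 minutes until the memory is full.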

Persistence

Even though Redis stores the data in main memory, there is a way to persist it.

| Blob size \ Persistence [TPS] | With AOF (every second) | With AOF (always) | Without |
|---|---|---|---|
| 100b | 800k | 330k | 960k |

Redis moves the I/O to a separate thread. The numbers are almost the same as without persistence if the data is persisted in one-second frames. TPS goes down significantly when Redis writes every updated key to the drive, but this mode ensures the best durability.
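The three columns presumably map to the append-only file settings in redis.conf. The fragment below is a sketch of that mapping, not the exact configuration used for the measurements:

```conf
# Enable the append-only file (AOF).
appendonly yes

# fsync once per second: the "every second" column.
appendfsync everysec

# fsync after every write command: the "always" column.
# appendfsync always

# The "Without" column would correspond to disabling the AOF:
# appendonly no
```

Comments are placed on their own lines here, which is the form redis.conf itself uses.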

Large Messages

How is the performance affected when the message size increases to tens of megabytes?

| Blob size | TPS |
|---|---|
| 500 kb | 1418 |
| 5 mb | 85 |
| 100 mb | 1.2 |

Conclusion

The numbers are impressive. A single-threaded app successfully processes almost one million small messages per second in memory.

Redis performance is incredible. We could expect certain limitations because of the single-threaded design, which causes “serializable” behavior. Maybe this lock-free implementation is the reason why the Redis server can handle such great throughput.

On the other hand, a fair comparison against the other competitors has to use the Redis persistence feature. Then the performance is much worse: roughly three times (330k versus 960k TPS with AOF set to always). Well, the persistence requirement can be the deciding factor.

I’ve already mentioned the great command set. You can probably model almost any behavior. Even though Redis primarily targets caches, there are commands for calculating various stats, holding unique sets etc. A script made from these commands is powerful and very fast.

Everything is always about performance and features. Redis has both of them 🙂

Redis messaging using Spring and Kryo serialization

Redis is a famous enterprise key/value data store which provides simple messaging using publisher/subscriber. I’ve decided to use it to notify remote nodes (components) about a change of state between those components.

Spring, among other things, allows you to use its templating system to decouple the business logic from the messaging system used under the hood. Using Spring Data for Redis gives a solution which does not call any Redis command in the code.

Redis holds serialized content, either byte[] or string. So the last thing to reveal is the domain model serialization. I’ve decided to use fast binary serialization with the Kryo framework, the winner of the battle of serializers.

Maven

First of all, it’s necessary to define all the dependent components. Obviously, the usual components like Spring itself are omitted here.


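<dependency>
    <groupId>com.esotericsoftware.kryo</groupId>
    <artifactId>kryo</artifactId>
    <version>2.22</version>
</dependency>
<dependency>
    <groupId>org.springframework.data</groupId>
    <artifactId>spring-data-redis</artifactId>
    <version>1.1.0.RELEASE</version>
</dependency>
<dependency>
    <groupId>redis.clients</groupId>
    <artifactId>jedis</artifactId>
    <version>2.1.0</version>
</dependency>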

Kryo Serializator

I used a domain model entity which holds the identifier of our internal service, which is just a UUID. The point is to set up a Kryo serializer which takes the entity and returns byte[]. The entity itself is not important.

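import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.Serializer;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;

public class MyEntitySerializer extends Serializer<MyEntity> {

    @Override
    public void write(Kryo kryo, Output output, MyEntity object) {
        // write the entity's fields to the output (the method returns nothing)
        ...
    }

    @Override
    public MyEntity read(Kryo kryo, Input input, Class<MyEntity> type) {
        // read the fields back in the same order and rebuild the entity
        return ...;
    }
}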
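Since the entity is essentially a UUID, the binary form can be as small as 16 bytes. The sketch below shows that round trip with plain java.nio instead of Kryo, so it runs without any dependency. UuidCodec is a made-up name, and inside the Kryo serializer the analogous calls would be Output.writeLong and Input.readLong. Also note that the serializer property of Spring's MessageListenerAdapter expects a RedisSerializer, so a thin wrapper around the Kryo serializer is probably needed. That wrapper is not shown here.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

/** Illustrative stand-in for the elided serializer body: UUID to 16 bytes and back. */
final class UuidCodec {

    /** Writes the two 64-bit halves of the UUID, most significant first. */
    static byte[] toBytes(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    /** Reads the two halves back in the same order. */
    static UUID fromBytes(byte[] bytes) {
        if (bytes.length != 16) {
            throw new IllegalArgumentException("expected 16 bytes, got " + bytes.length);
        }
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        long most = buffer.getLong();
        long least = buffer.getLong();
        return new UUID(most, least);
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        byte[] wire = toBytes(id);
        System.out.println(wire.length + " bytes, round trip ok: " + fromBytes(wire).equals(id));
    }
}
```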

Redis Message Handler

The beautiful part is that asynchronous message handling is defined completely within the XML configuration of the Redis message container.

Note that I used Jedis as the Java Redis client.

<bean id="jedisConnectionFactory" class="org.springframework.data.redis.connection.jedis.JedisConnectionFactory"
      p:host-name="${redisHost}" p:use-pool="true"/>

<bean id="myEntitySerializer" class="MyEntitySerializer" />

<bean id="redisContainer" class="org.springframework.data.redis.listener.RedisMessageListenerContainer">
    <property name="connectionFactory" ref="jedisConnectionFactory"/>
    <property name="messageListeners">
        <map>
            <entry>
                <key>
                    <bean id="messageListener"
                          class="org.springframework.data.redis.listener.adapter.MessageListenerAdapter">
                        <constructor-arg>
                            <bean class="MyEntityListener"
                                  autowire="constructor"/>
                        </constructor-arg>
                        <property name="serializer" ref="myEntitySerializer"/>
                    </bean>
                </key>
                <bean class="org.springframework.data.redis.listener.ChannelTopic">
                    <constructor-arg value="${channelName}"/>
                </bean>
            </entry>
        </map>
    </property>
</bean>
What is going on?

The message container references both the Redis connection factory and the given message listeners. A listener is a class (or a set of classes) with a method registered as a callback that is invoked when a new message arrives.

The last element is the channel topic with the important channel name. The configuration uses two variables: the Redis host and the already mentioned channel name.

Message Listener

The last thing to do is to define the message listener containing the callback method, MyEntityListener. The class instance is called whenever a new message arrives on the channel topic.

The crucial point was to discover the signature of the callback method, because Spring’s documentation is a little bit sloppy. A quick look into org.springframework.data.redis.listener.adapter.MessageListenerAdapter’s onMessage shows the correct way.

public class MyEntityListener {

    public void handleMessage(MyEntity entity, String channel) {
        // provide handling code
    }
}

The incoming message is automatically deserialized, so the method accepts the entity itself.
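Why the method signature matters becomes clearer with a simplified picture of what such an adapter does: it looks up a method by name on the delegate and invokes it reflectively with the deserialized payload and the channel. The sketch below is a deliberately reduced illustration in plain Java, not Spring's actual implementation. ReflectiveDispatcher and DemoListener are made-up names, and a String stands in for the entity.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Simplified illustration of name-based callback dispatch (not Spring's real code). */
final class ReflectiveDispatcher {
    private final Object delegate;
    private final String methodName;

    ReflectiveDispatcher(Object delegate, String methodName) {
        this.delegate = delegate;
        this.methodName = methodName;
    }

    /** Calls delegate.methodName(payload, channel); returns false if no such method exists. */
    boolean dispatch(Object payload, String channel) {
        for (Method method : delegate.getClass().getMethods()) {
            Class<?>[] params = method.getParameterTypes();
            if (method.getName().equals(methodName) && params.length == 2
                    && params[0].isInstance(payload) && params[1] == String.class) {
                try {
                    method.setAccessible(true);
                    method.invoke(delegate, payload, channel);
                    return true;
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("callback failed", e);
                }
            }
        }
        return false; // a wrong signature simply never matches
    }
}

/** Hypothetical listener; the String parameter stands in for the deserialized entity. */
final class DemoListener {
    final List<String> received = new ArrayList<>();

    public void handleMessage(String entity, String channel) {
        received.add(channel + ":" + entity);
    }
}
```

A listener whose method has a different name or parameter list is silently skipped in this sketch, which is the kind of failure that makes the exact signature worth checking.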

Conclusion

Look at the previous code. All Redis-related code is defined in Spring’s context and the container class. No boilerplate code. Pretty nice.

An architectural note: the entity serves as a notification only. This is very important, as such messaging is not reliable. Although the entity holds some information, all the persistent data is stored elsewhere, e.g. as key/value pairs. An incoming message just notifies the subscriber that new content is available, and the subscriber is supposed to GET the key(s) to refresh it.
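The notify-then-GET idea can be sketched without Redis at all. In the illustration below an in-memory map stands in for the key/value store, the notification carries only the key, and the subscriber always re-reads the value from the store. All names are made up for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative model of "the message is only a hint, the store is the source of truth". */
final class NotifyThenGet {
    /** Stand-in for the key/value store (Redis in the article's setup). */
    private final Map<String, String> store = new ConcurrentHashMap<>();
    /** Subscriber-side view, refreshed from the store only, never from the message body. */
    private final Map<String, String> localView = new ConcurrentHashMap<>();

    /** Publisher side: persist first, then return the notification (just the key). */
    String publish(String key, String value) {
        store.put(key, value);
        return key;
    }

    /** Subscriber side: a notification triggers a fresh read of the key. */
    void onNotification(String key) {
        String current = store.get(key);
        if (current != null) {
            localView.put(key, current);
        } else {
            localView.remove(key);
        }
    }

    /** Recovery path for lost notifications: re-read everything from the store. */
    void resync() {
        localView.clear();
        localView.putAll(store);
    }

    String view(String key) {
        return localView.get(key);
    }
}
```

Because the value always comes from the store, a lost notification only delays the refresh: the next notification for the same key, or a periodic resync, brings the subscriber up to date.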