Showing posts from October, 2014

Performance Battle of NoSQL blob storages #2: Apache Kafka

The first article of this series covered scaling factors for blob-based content on Apache Cassandra. It's a well-known piece of software that requires a full installation on the nodes, a management application and so on. You also need to tune various configs to get the best performance. I've spent a nice time playing with YAML files on Ubuntu :-) The configuration is sometimes tricky. I was a little bit confused once or twice, so I planned to hire a Cassandra guru for our team, as the issues I encountered seemed really complicated :-) Well, we do not need much functionality in HP Service Virtualization. The core is to replicate messages to achieve reliability. The next step is to process them. The last part is to aggregate the results. Sounds exactly like map and reduce à la Hadoop. Evaluating Apache Storm or Apache Samza is a different story, but they led me to Apache Kafka. Kafka is pretty nice software, much simpler than Cassandra, as I've described