Skip to main content

Posts

Showing posts from 2014

Performance Battle of NoSQL blob storages #2: Apache Kafka

The first article of this series brought scaling factors for blob-based content on Apache Cassandra . It's well know piece of software, requiring full installation on nodes, management application and so on. You also need to tune various configs to achieve best performance results. I've spend nice time playing with yamls on ubuntu :-) The configuration is sometimes tricky. I was little bit confused once or twice so I planned to hire Cassandra guru to our team as issues I encountered seems really complicated :-) Well, we do not need much functionality in HP Service Virtualization . The core is to replicate messages to achieve reliability. The next steps is to process them. Yes, the last part is to aggregate the results. Sounds exactly like map and reduce ala hadoop. Evaluation of Apache Storm or Apache Samza  are different stories, but they allow me to find  Apache Kafka . Kafka is pretty nice software, much more simpler comparing to Cassandra how I've described

Performance Battle of NoSQL blob storages #1: Cassandra

Preface We spend last five years on HP Service Virtualization using MsSQL database . Non-clustered server. Our app utilizes this system for all kinds of persistence. No polyglot so far. As we tuned the performance of the response time - we started at 700ms/call and we achieved couple milliseconds per call at the end when DB involved - we had to learn a lot of stuff. Transactions, lock escalation , isolation levels , clustered and non clustered indexes, buffered reading, index structure and it's persistence, GUID ids in clustered indexes , bulk importing , omit slow joins, sparse indexes, and so on. We also rewrite part of NHibernate to support multiple tables for one entity type which allows use scaling up without lock escalation. It was good time. The end also showed us that famous Oracle has half of our favorite features once we decided to support this database. Well, as I'm thinking about all issues which we encountered during the development, unpredictive behavio

Kryonet: simple but super-fast TCP communication in Java + performance benchmarks

We have designed cluster-aware and cloud-based solution of service virtualization for some time. Incredible work as distributed system is next dimension of thinking - and learning as well :-) We was deciding to split atomic piece of our domain work to multiple nodes. Such step would hopefully provide us to point certain messages to certain nodes. We can than store data in main memory and avoid distributed locking above some remote store, probably Redis . This is usual approach how to achieve better response time of the whole solution to avoid database search for all calls. Obviously with well designed replication. I was really worried about latency. We need really low response time in couple of milliseconds at most under heavy load. I'm talking about the whole solution which consumes CPU a lot because of the calculations of simulated response under the hood, so no space for any wasting. What is the best latency of such transmission between two nodes? What is the throughput o

Jasně cílených 25 + 5 minut v pomodoro

Když bude chtít netrénovaný jedinec uběhnout pět kilometrů, velmi pravděpodobně zažije lepší požitek ze sportu, když si místo celého běhu najednou rozkrájí trať na 10 pětisemetrových částí a mezi nimi udělá určitý odpočinek. Proč? Doběhnout k další lampě na cyklostezce je pro psychiku - a také fyzičku - méně náročný úkol než doběhnout do další vesnice a pak zpět. Psychiku se stejně jako fyzičku můžete vyčerpat. Takhle nějak funguje pomodoro v práci. O co jde? Pomodoro je celkem starý a jednoduchý postup, který velmi napomáhá mojí efektivitě. Je to vlastně jediná věc, kterou jsem zkusil a používal i po pár dnech dál. Respektive pomodoro pomáhá mému soustředění a schopnosti řešit smysluplně problémy. Pomodoro říká, že byste měli  dvacetpět minut pracovat a pět minut dělat něco jiného - odpočinout si od té hlavní činnosti - a takto dokola. Po třech opakováních si dát větší patnáctiminutovou pauzu. Takhle lze rozkrájet celý den. Jak to funguje? Přijdete do práce, pustíte čas