Monday, November 4, 2013

Redis messaging using Spring and Kryo serialization

Redis is a well-known key/value data store which also provides simple messaging via publish/subscribe. I decided to use it to notify remote nodes (components) about state changes between those components.

Spring, among other things, lets you use its template abstraction to decouple business logic from the messaging system used under the hood. Using Spring Data Redis yields a solution that does not contain a single raw Redis command in the code.

Redis stores serialized content, either byte[] or strings, so the last thing to sort out is domain-model serialization. I decided on fast binary serialization using the Kryo framework, the winner of my comparison of serializers.

Maven

First of all, it's necessary to declare all the dependencies. Common ones, such as Spring itself, are omitted here.

<dependency>
 <groupId>com.esotericsoftware.kryo</groupId>
 <artifactId>kryo</artifactId>
 <version>2.22</version>
</dependency>
<dependency>
 <groupId>org.springframework.data</groupId>
 <artifactId>spring-data-redis</artifactId>
 <version>1.1.0.RELEASE</version>
</dependency>
<dependency>
 <groupId>redis.clients</groupId>
 <artifactId>jedis</artifactId>
 <version>2.1.0</version>
</dependency>

Kryo Serializer

I used a domain-model entity that holds the identifier of our internal service, which is just a UUID. The point is to set up a Kryo serializer which takes the entity and returns byte[]. The entity itself is not important.

public class MyEntitySerializer extends Serializer<MyEntity> {

    @Override
    public void write(Kryo kryo, Output output, MyEntity object) {
        // write the entity's fields into the output (no return value)
    }

    @Override
    public MyEntity read(Kryo kryo, Input input, Class<MyEntity> type) {
        return ...;
    }
}
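To make the skeleton concrete: assuming (hypothetically) that MyEntity wraps just the service UUID, the whole serialization boils down to writing two longs. The following plain-Java sketch shows the same byte layout without the Kryo dependency; Kryo's Output and Input expose equivalent writeLong/readLong methods:

```java
import java.io.*;
import java.util.UUID;

// Hypothetical stand-in for the domain entity: it only wraps a UUID.
final class MyEntity {
    final UUID id;
    MyEntity(UUID id) { this.id = id; }
}

public class UuidCodec {
    // Serialize: a UUID is exactly two longs (128 bits).
    static byte[] write(MyEntity e) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeLong(e.id.getMostSignificantBits());
        out.writeLong(e.id.getLeastSignificantBits());
        return bos.toByteArray();
    }

    // Deserialize: read the two longs back and rebuild the UUID.
    static MyEntity read(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        return new MyEntity(new UUID(in.readLong(), in.readLong()));
    }

    public static void main(String[] args) throws IOException {
        MyEntity original = new MyEntity(UUID.randomUUID());
        MyEntity copy = read(write(original));
        System.out.println(original.id.equals(copy.id)); // prints "true"
    }
}
```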

Redis Message Handler

The beautiful part is that the complete definition of asynchronous message handling lives in the XML configuration of the Redis message listener container.

Note that I used Jedis as the Java Redis client.

<bean id="jedisConnectionFactory" class="org.springframework.data.redis.connection.jedis.JedisConnectionFactory"
  p:host-name="${redisHost}" p:use-pool="true"/>

<bean id="myEntitySerializer" class="MyEntitySerializer" />

<bean id="redisContainer" class="org.springframework.data.redis.listener.RedisMessageListenerContainer">
  <property name="connectionFactory" ref="jedisConnectionFactory"/>
  <property name="messageListeners">
    <!-- map of listeners and their associated topics (channels or/and patterns) -->
    <map>
      <entry>
        <key>
          <bean id="messageListener"
              class="org.springframework.data.redis.listener.adapter.MessageListenerAdapter">
            <constructor-arg>
              <bean class="MyEntityListener"
                  autowire="constructor"/>
            </constructor-arg>
            <property name="serializer" ref="myEntitySerializer"/>
          </bean>
        </key>
        <bean class="org.springframework.data.redis.listener.ChannelTopic">
          <constructor-arg value="${channelName}"/>
        </bean>
      </entry>
    </map>
  </property>
</bean>
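For completeness, the publishing side is not shown above. A minimal sketch (my assumption, not from the original post) would expose a RedisTemplate wired to the same connection factory:

```xml
<!-- Sketch: a template bean for publishers, reusing the factory above. -->
<bean id="redisTemplate" class="org.springframework.data.redis.core.RedisTemplate"
    p:connection-factory-ref="jedisConnectionFactory"/>
```

A publisher can then call redisTemplate.convertAndSend(channelName, entity), which serializes the entity and publishes it on the given channel.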

What is going on?

The message container references both the Redis connection factory and the given map of message listeners. A listener is a class (or set of classes) with a method registered as a callback invoked when a new message arrives.

The last property is the channel topic holding the all-important channel name. The configuration uses two placeholders: the first is the Redis host, the second the already mentioned channel name.

Message Listener

The last thing to do is to define the message listener containing the callback method: MyEntityListener. An instance of this class is invoked every time a new message arrives on the channel topic.

The crucial point was to discover the signature of the callback method, because Spring's documentation is a little sloppy here. A quick look into org.springframework.data.redis.listener.adapter.MessageListenerAdapter's onMessage shows the correct way.

public class MyEntityListener {

    public void handleMessage(MyEntity entity, String channel) {
      // provide handling code
    }
}

The incoming message is automatically deserialized, so the method accepts the entity itself.
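Under the hood, the adapter essentially looks up a method named handleMessage by reflection and invokes it with the deserialized payload and the channel name. A simplified, hypothetical re-implementation of that dispatch (not Spring's actual code; the payload is a plain String here to keep the sketch self-contained) looks like this:

```java
import java.lang.reflect.Method;

// Simplified stand-in for the listener delegate from the post.
class MyEntityListener {
    String lastChannel;
    public void handleMessage(String entity, String channel) {
        this.lastChannel = channel;
    }
}

public class ReflectiveAdapter {
    private final Object delegate;

    ReflectiveAdapter(Object delegate) { this.delegate = delegate; }

    // Find a public method called "handleMessage" whose first parameter
    // matches the payload type, then invoke it with (payload, channel).
    void onMessage(Object payload, String channel) throws Exception {
        for (Method m : delegate.getClass().getMethods()) {
            if (m.getName().equals("handleMessage")
                    && m.getParameterTypes().length == 2
                    && m.getParameterTypes()[0].isInstance(payload)) {
                m.invoke(delegate, payload, channel);
                return;
            }
        }
        throw new IllegalStateException("no handleMessage(payload, channel) found");
    }

    public static void main(String[] args) throws Exception {
        MyEntityListener listener = new MyEntityListener();
        new ReflectiveAdapter(listener).onMessage("hello", "my-channel");
        System.out.println(listener.lastChannel); // prints "my-channel"
    }
}
```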

Conclusion

Look at the code above: every piece of Redis-related logic is defined in the Spring context and handled by the container class. No boilerplate code. Pretty nice.

One architectural note should be made explicit: the entity serves as a notification only. This is very important, because such messaging is not reliable. Although the entity holds some information, all the persistent data is kept elsewhere, e.g. as key/value pairs. The incoming message just notifies the subscriber that new content is available, and the subscriber is supposed to GET the key(s) to refresh.

Tuesday, October 8, 2013

Scala: get rid of not-null validations

Scala has always been known as a language that allows special handling of null values. There are tons of articles about Some[T], None and Option[T].

What is the most annoying code for me? Null validations; see this typical example:


class Entity {

}

class ServiceA {
  def method(a: Entity, b: Entity, c: Entity, d: Entity) = {
    Validate.notNull(a)
    Validate.notNull(b)
    Validate.notNull(c)
    Validate.notNull(d)
  }
}

class ServiceB(val a: ServiceA) {
  def method() = {
    a.method(null,null,null,null)
  }
}

When you start to write safe code in terms of fail-as-fast-as-possible, your code, services and even the domain model become weedy; you will find such boilerplate at the beginning of every method, because you can't be sure which parameters someone supplied to your class or method.

Fortunately, Scala has a beautiful way to achieve nice and simple code without these checks. If you don't think that null is a proper state for your class, just dismiss that option. How?


class Entity extends NotNull {

}

Well, that's all. Here is a screenshot from my IDE showing what is about to happen:

(screenshot: the IDE flags the null arguments as type errors once Entity extends NotNull)

Now, if you are sure that you don't want to allow null values for your entity, you can mix in the NotNull trait and remove many lines from your source code.

I was surprised when I found this trait in the Scala sources, because many tutorials, and even the famous Scala for the Impatient book, do not mention this simple but beautiful piece of code.

Tuesday, October 1, 2013

Tomcat 7 remote deployment

I decided to set up automatic deployment of a war-packaged application using Jenkins and the Deploy Plugin. The target platform is Amazon with Tomcat 7; see a nice set of articles on how to set up such an environment for free.

Well, there are a couple of tutorials, but they all miss some points, which cost me an hour of work.

What I got

  • Fresh installation of Tomcat 7 on remote machine with opened 8080 port on firewall
  • A war file of my own to be deployed

How to push it to Tomcat?


1. First of all, there is the simple configuration of Tomcat users in the file tomcat-users.xml; this was my pain in the ass :-) As the original comprehensive documentation says, it's necessary to define a user, but which one(s)?

Here is a working example of tomcat-users.xml:

<tomcat-users>
  <role rolename="manager-gui"/>
  <role rolename="manager-script"/>
  <user username="manager-gui" password="changeit" roles="manager-gui"/>
  <user username="manager-script" password="changeit" roles="manager-script"/>
</tomcat-users>

The important part is the manager-script role, which did not yet exist in Tomcat 6. It allows access to the /text sub-namespace of the manager URI. The first user, manager-gui, is the one you use in the GUI console, e.g. http://localhost:8080/manager/html

Once you run Tomcat using the startup script in bin, you can move on to the second step.

2. Now it's possible to deploy remotely using a curl command, e.g. in my use case:

curl --upload-file my.war "http://manager-script:changeit@localhost:8080/manager/text/deploy?path=/myPath&update=true"

The command works with the manager-script user, in contrast to my original attempt with manager-gui. Another interesting part is path=/myPath: this attribute says which URL sub-namespace is to be used.

Even though you deploy my.war, and Tomcat's common convention would expose the application under /my, the application is exposed under /myPath.

Thursday, September 26, 2013

Jenkins + git revision in all build names

Jenkins by default assigns a build name using a local counter within each job. An example is better than words.


When you look at this overview, you simply cannot tell which code revision was used in the Compile build and which in the Integration Tests build. I followed a nice article describing a real CI pipeline using Jenkins; it uses the Build Name Setter Plugin. Unfortunately, that article works with the SVN revision number.

So I said I'd just use the git revision, as git is my source control. But it's not as easy as it might seem at first glance.

My Jenkins setup consists of a first Compile build step, which clones the git repository and performs the compilation. A second build step clones the repository from the first step and executes integration tests. The problem is that the second step does not know which git revision the Compile step cloned.

Here is the list of steps to achieve it.

1. You obviously need the Git Plugin, the Build Name Setter Plugin and the Parameterized Trigger Plugin
2. The Compile build requires the following post-build action using the Parameterized Trigger Plugin


This introduces a new environment property called GIT_REVISION whose value equals the currently cloned git revision.

3. The Integration Test build uses the Build Name Setter Plugin's option along with the following pattern:
PL#${ENV,var="GIT_REVISION"}-${BUILD_NUMBER}



And that's all.


Friday, July 5, 2013

Constructor dependency injection over property autowiring

I use the dependency injection pattern on all projects where it makes any sense; Spring especially has followed me all my life. Once I discovered domain-driven design, I realized that the model should be rich, clear and reusable, with no meaningless dependencies.

Combining a clean model with Spring annotations can bring a few issues beyond the model's dependency on Spring jar files.

See the following example of what's going on:



And the relevant unit tests.



Once you decide to write a robust component, you can't be absolutely sure that everyone will use Spring (or another DI framework) to push in all the dependencies.

Well, you need to check them. These are the reasons why I decided to use dependency injection via constructor: the components just provide a fully supplied constructor requiring all mandatory dependencies.

See the code:
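The original post showed the code as a screenshot. A minimal sketch of the idea (hypothetical names, plain Java without the Spring annotations) looks like this:

```java
// Sketch (hypothetical names): all mandatory dependencies arrive through
// the constructor, are validated once, and are kept final.
interface AddressBook {
    String lookup(String name);
}

final class GreetingService {
    private final AddressBook addressBook;

    GreetingService(AddressBook addressBook) {
        // fail as fast as possible: a single check at construction time
        if (addressBook == null) {
            throw new IllegalArgumentException("addressBook must not be null");
        }
        this.addressBook = addressBook;
    }

    String greet(String name) {
        return "Hello " + name + " at " + addressBook.lookup(name);
    }
}

public class InjectionDemo {
    public static void main(String[] args) {
        AddressBook book = name -> name + "@example.com";
        System.out.println(new GreetingService(book).greet("Alice"));
        // prints "Hello Alice at Alice@example.com"
        // new GreetingService(null) would throw immediately, not at first use
    }
}
```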



Advantages


Such code brings a couple of other advantages which you may not see at first glance:

  1. Obviously clearer code, as there is just one place where all dependencies are injected, and they are final.
  2. Simple tests.
  3. No more @Components with tons of dependencies. This is the key point. Do you remember classes which grew and grew, and when you last opened them in your favorite IDE they had more than 15 dependencies? Well, that's not possible any more. Who would maintain a constructor with 15 parameters?
  4. No circular dependencies. Spring allows you to define bean references that form a closed circuit. I'm surely not alone in disliking that, as it's an antipattern hiding wrong design. Once you inject dependencies through constructors, you are simply unable to write such a relation anymore.

You can browse the code in GitHub.

Wednesday, April 24, 2013

Time to say goodbye to NHibernate #2 - performance


In the previous part I went through a few shortcomings of NHibernate which, from my point of view, fundamentally influence whether or not to deploy this excellent framework. In this part I'd like to look at performance.

Performance can be tuned in various ways, as I discussed a few years ago. There are plenty of options: caching, stateless sessions and so on. The moment you reach the limits of the database's speed, you have to start dealing with less common things that are not normally tuned.

Entity construction

We have always honored DDD on our project, i.e. we wanted rich entities, not naked ones. By that I mean that all attributes/fields are encapsulated by get/set methods. If you don't want to expose private fields, you have to start using reflection, and at large scale, used for most mapped entities, that becomes a problem for .NET. It is slow.

There are several frameworks (e.g. FastReflect) that can cache reflective access; in our case this sped up raw access to private fields 20x. Of course, a custom reflection framework is not easy to integrate into NHibernate: you have to develop your own component that constructs entities and plug it in.
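The post is about .NET, but the caching idea translates directly. As an illustration only (my own sketch, not FastReflect or NHibernate code), here it is in Java terms: look the Field up once, cache it, and reuse it instead of resolving it on every access.

```java
import java.lang.reflect.Field;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Cache reflective lookups: getDeclaredField is the expensive part,
// so resolve each field once and keep the Field object around.
public class CachedFieldAccess {
    private static final Map<String, Field> CACHE = new ConcurrentHashMap<>();

    static Object read(Object target, String fieldName) throws Exception {
        Field f = CACHE.computeIfAbsent(
                target.getClass().getName() + "#" + fieldName,
                key -> {
                    try {
                        Field field = target.getClass().getDeclaredField(fieldName);
                        field.setAccessible(true); // allow access to private fields
                        return field;
                    } catch (NoSuchFieldException e) {
                        throw new IllegalArgumentException(e);
                    }
                });
        return f.get(target);
    }

    // Example of a "rich" entity with an encapsulated private field.
    static class Entity {
        private final String id = "42";
    }

    public static void main(String[] args) throws Exception {
        System.out.println(read(new Entity(), "id")); // prints "42"
    }
}
```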

Constructing entity hierarchies

The described problem of slow reflection leads straight into the next issue you have to solve. You usually don't pull a single class type from persistence, but a hierarchy of classes. That means I ask the persistence framework for one object, which in turn has further dependencies. In DDD this is typically an aggregate, separated from other aggregates (there is no direct relation between them), which contains collections of other kinds of entities, and so on recursively. In our case one aggregate type swelled to twenty kinds of classes.

Naturally, NHibernate constructs the whole aggregate using the mapping, i.e. maps of keys, entities and class types. In other words, NHibernate stores the materialized entities in the session and, for the given keys and types, links them together into the resulting aggregate, until only the root instance finally falls out. Once there are many types and instances, you quickly find that constructing such an aggregate is quite slow.

NHibernate is a generic framework that tries to construct at runtime something you could write yourself. The performance difference between the generic approach and your compiled code is of course large, on the order of tens of percent of saved time.

The only thing a programmer has to do is write a custom linker and a custom session able to absorb entities and connect them through identifiers.

Mapping

Although it might not seem so, even the mapping itself can impact the performance of your application. NHibernate is one of the most flexible frameworks I have come across, yet the mapping is far from being able to do everything. If you have more complex structures, e.g. collections of collections or dictionaries of collections, you have to come up with a workaround in the mapping, because this cannot be mapped out of the box. In our case we always created a new (useless) entity that wrapped the given record, and mapped that instead.

Because of a subtle limitation we had to create an extra entity instead of the natural solution. Of course such an intervention affects performance, e.g. through entity creation and the subsequent garbage collecting. In server applications you sometimes have to watch every needless entity.

Garbage Collecting

The last topic I would like to touch on is the production of entities and the subsequent garbage collecting. A scalable server-side application should not waste resources; if it does, it will soon hit a wall during performance optimization. This metric is always terribly hard to measure, because every profiler or measurement method gives different results.

One thing is certain, though. Any ORM will produce more garbage than your custom, tailor-made persistence logic, because it will always be more generic, which means it produces wrapper entities and more generic code, as described in the previous chapter.

Deploying NHibernate to production always increases the time spent in GC.

The verdict

After several years of using NHibernate we reached a point where, to speed up the persistence layer, we had to either rewrite certain components or replace NHibernate entirely. We started with the former. How it all ended we will see in the next chapter.

Sunday, January 13, 2013

Book: Functional Programming for Java Developers: Tools for Better Concurrency, Abstraction, and Agility

Probably all of us have forgotten functional programming in the name of the object-oriented one. As concurrency and scalability are two buzzwords mentioned these days in every technical article, the notion of functional programming seems to be growing.

The book Functional Programming for Java Developers: Tools for Better Concurrency, Abstraction, and Agility is a brief introduction to why and how a developer or architect could think differently about development, especially in the scope of reusable components.

The first couple of chapters show how the functional programming approach can improve the thinking of every programmer. Special attention is paid to object immutability and recursion.

The next chapters present examples of how to write immutable lists and maps, how to write useful methods operating on the list using recursion, and how to use these methods for everything you could want from the structure.
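To give a flavor of what the book walks through (my own minimal sketch, not the book's code): an immutable cons list where every operation returns a new list and traversal is recursive instead of mutation-based.

```java
// Minimal immutable cons list in the spirit of the book's examples.
final class ConsList {
    static final ConsList EMPTY = new ConsList(0, null);

    final int head;
    final ConsList tail;

    private ConsList(int head, ConsList tail) {
        this.head = head;
        this.tail = tail;
    }

    // "Adding" returns a brand new list; the old one is untouched,
    // so the structure can be shared freely between threads.
    ConsList prepend(int value) {
        return new ConsList(value, this);
    }

    // Recursion instead of a loop with mutable state.
    int sum() {
        return this == EMPTY ? 0 : head + tail.sum();
    }
}

public class ListDemo {
    public static void main(String[] args) {
        ConsList list = ConsList.EMPTY.prepend(3).prepend(2).prepend(1);
        System.out.println(list.sum()); // prints "6"
    }
}
```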

Well, I read the book twice. The impact was immediate, since I was writing a component based on a tree structure. As soon as you realize that immutability is better than the classic locking-aware approach, you start asking how to achieve such a pleasant property on objects that require mutability by nature. It requires a change of mind. This book can help you a lot.