Skip to main content

NHibernate performance issues #2: slow cascade save and update (flushing)

What's the most powerful NHibernate's feature, except object mapping? Cascade operations, like insert, update or save. What's the best NHibernate's performance issue: cascade saving.

Cascade insert, save and update

If you let NHibernate to manage your entities (e.g. you load them from persistence), NHibernate can provide all persistence operations for you, it includes automatic:
  • insert
  • update
  • delete
All depends only on your cascade settings. What says documentation?
cascade (optional): Specifies which operations should be cascaded from the parent object to the associated object.
Attribute declares which kind of operation would be performed for particular case. All this stuff can be adjusted for traditional pattern parent - child.

Following code declares specific behavior what happen to children (books) when parent entity (Library) will be affected any of mentioned operations.
HasMany(l => l.Books).
  Access.CamelCaseField().
  AsBag().
  Inverse().
  KeyColumn("LIBRARY_ID").
  ForeignKeyConstraintName("FK_BOOK_LIBRARY_ID").
  ForeignKeyCascadeOnDelete().
  LazyLoad().
  Cascade.AllDeleteOrphan();
As you can see, each children will be affected by all kinds of operation performed on parent object. Obviously there is more options: it needn't do anything, it can only update etc.

That's all about cascade operation, it's pretty and fine NHibernate's stuff and it's well described at manual.

Cascade update issue

We have defined cascade operations. What's the problem?

When NHibernate finds it appropriate to go through all  relations and objects stored in first level cache - session, it checks all dirty flags and performs proper operations which can be very expensive operation.

What means "finds it appropriate"? It means when NHibernate flush the session to database (to opened transaction scope). It can be performed in following situations:
  • before execution of any query - HQL or criteria 
  • before commit 

You can see it in log. It looks like following log snippet:
done cascade NHibernate.Engine.CascadingAction+SaveUpdateCascadingAction for collection: eu.podval.NHibernatePerformanceIssues.Model.Library.Books
NHibernate.Engine.Cascade CascadeCollectionElements:0 deleting orphans for collection: eu.podval.NHibernatePerformanceIssues.Model.Library.Books
Lets show the problem in example. Assume that you let NHibernate manage pretty huge amount of objects, e.g. you fetched them from database. NHibernate stores all of them at first level cache - session. Than you execute a few HQL queries and finish it by commit. NHibernate call flush before each action. It's serious performance hit!

NHibernate can spent huge amount of time checking all loaded data if anyone has changed. Even if you suppose to read them only.

Avoiding described situation can significantly faster your application.

Avoid needless cascade save or update

1. Read only transaction
First of all, you can tell NHibernate that it needn't perform checking because you doesn't suppose to write any change. Simple set you transaction read-only. I'm using spring.net for transaction management.
[Transaction(TransactionPropagation.RequiresNew, ReadOnly = true)]
public void FindAllNewBooks(IEnumerable<Book> books);
NHibernate won't perform any cascade checking because you tell him that you aren't suppose to update data. But what if you want to update the data?

You can provide write transaction for all places, where you need to write the data - but it looses A as atomicity from ACID for the whole - business - operation.

2. Evict entity from session
You can tell NHibernate to do not care about entity in the future, means do not store entity in session and NHibernate will forget about the entity.
SessionFactory.GetCurrentSession().Evict(library);
But be aware to evict entity from NHibernate session. If NHibernate forget the entity, there is no way how to perform lazy loading so you can't use this solution if children or properties are lazy loaded.

3. Set Status.ReadOnly on entity when it's loaded
There is really tricky way how to set read-only attribute placed in NHibernate properties, see this article Ensuring updates on Flush. 

I'm not using this type because it's really low-level approach.

4. Use IStatelessSession for bulk operation
NHibernate provides IStatelessSession interface to perform bulk operation. I'd like to write the whole article about stateless session later.

5. There isn't really simple way how to fetch entity as read-only?
No there isn't. Despite that basic Factory pattern is often declared with method FindById({id}, bool forUpdate);, NHibernate is not able to provide such kind of functionality, you must use one of described work-arounds.

Summary

If you want to develop fast and scalable application, you need to deal with cascade save and update. It's the first issue you'll find in log.

The most useful and secure way is to decide which transactions can be marked as read-only. Mark as many as possible because the most use-cases of average application read the data only.

If you really need to write data, you should lower the amount of entities placed in first level cache - session. You avoid the situation when your code loaded thousand of entities, you add one new and all thousand + one are check for dirty data.

If you insert thousand entities into the dabase, e.g. in import use-case, just use stateless session.

You can also see Series of .NET NHibernate performance issues to read all series articles.
Post a Comment

Popular posts from this blog

Http and TCP Load Balancing with Kubernetes

Kubernetes allows to manage large clusters which are composed of docker containers. And where is large computation power there is large amount of data throughput. However, there is still a need to spread the data throughput so it can reach and utilize particular docker container(s). This is called load balancingKubernetes supports two basic forms of load balancing. Internal among docker containers themselves and external for clients which call you application from outside environment - just users.


Internal Load Balancing with Kubernetes Usual approach during the modeling of an application in kubernetes is to provide domain models for pods, replications controllers and services. Look at the great documentation if you are not aware of all these principles.

For simplification, pod is a set of docker containers which are always located on the same node. Replication controller allows and guarantees scalability. The last but not least is the service which hides all instances of pods - cre…

Validating nginx config file using docker approach

I try to setup nginx as a load balancer. The configuration is just a file with a lot of constrains so I need a validation. There is no online validation service, as e.g. CoreOS has, and I don't want to install nginx on my laptop as I work on a distributed app.

Docker is right approach for me. Let say I have following config:


In short, I'm going to pass nginx config to running nginx instance and look to the logs.

Put you nginx.config to the temp and start the docker image:

sudo docker run --name nginx -v /tmp/nginx.config:/etc/nginx/nginx.conf:ro -d nginx It uses volume mapping so the command just starts a new docker container and mounts a local /tmp/nginx.config to the given in-container path. You can obviously change the volume path to your personal path. Is it working or not? Look at logs.

sudo docker logs nginx If there is no entry, your file is fine. In the case of an error, you can see something like this:

2016/01/08 11:37:31 [emerg] 1#1: unexpected "}" in /etc/…