Skip to main content

NHibernate performance issues #1: evil List (non-inverse relationhip)

Lists are evil, at least when using NHibernate. You should re-consider if you really need to use specific List implementation, because it has unsuitale index column owned by parent, not children. List can't be used in inverse relationship which implies few (but major) inclusions:
  • extra sql UPDATE to persist mentioned index value
  • unscalable addition to the list - NHibernate needs to fetch all items and add new item after
  • inability to use fast cascade deletion by foreign keys
  • inability to use IStatelessSession for fast data insertion

Basic theorem: you don't need Lists! Furthermore, we'll discuss each bullet in details.

What is inverse relation?

First of all, it's necessary to clarify what inverse relation means. Reference between parent and child isn't hold by parent but child! See following picture:


Here is NHibernate mapping definition for inverse relation with using excellent Fluent NHibernate:
HasMany(l => l.Books).
  Access.CamelCaseField().
  AsBag().
  Inverse().
  KeyColumn("LIBRARY_ID").
  ForeignKeyConstraintName("FK_BOOK_LIBRARY_ID").
  LazyLoad().
  Cascade.AllDeleteOrphan();

Or hbm mapping:
<bag access="field.camelcase" cascade="all-delete-orphan" inverse="true" lazy="true" name="Books" mutable="true">
  <key foreign-key="FK_BOOK_LIBRARY_ID">
    <column name="LIBRARY_ID" />
  </key>
  <one-to-many class="Book" />
</bag>

Unscalable addition to List

Each item stored in standard (non-inverse) List holds index in special column defining collection order. It's simple smallint index.

Imagine the situation the you want to insert any new item to middle of mentioned list. Indexes of all items stored within rest of list need to be incremented. NHibernate doesn't execute any update query incrementing indexes but fetch all items from database.

Lets imagine described process for huge amount of children. Lets say that parent has 50 thousand of children. Object graph is stored in database, no object is cached. Parent is loaded from database, children list is marked as lazy. If you need to insert (or add) new item to collection, all 50 thousand items are loaded too. Again, again and again if you perform the operation.

Addition to List is totally unscalable.

Inability to use fast cascade deletion by foreign keys

Best unified approach to delete huge (various) object graph is to use cascade deletion hanged on foreign keys. I'll described this approach in separated articles in near future.

Cascade deletion is based on automatic removal of orphaned item. If relation between parent and child disappears, orphaned child is removed too. According to described process, cascade deletion is intended to be used along with inverse relation.

Following code snippet displays example of SQL foreign keys with cascade deletion:
alter table [Book] 
  add constraint FK_BOOK_LIBRARY_ID 
  foreign key (LIBRARY_ID) 
  references [Library] 
  on delete cascade

Extra sql update to store item's index

In the case of non-inverse relation, NHibernate inserts parent first and than all children. After these operations, extra update is produced to insert index column. I'll use already described situation counting with 50 thousand children, NHibernate produces 50 thousand of sql inserts and than sends additional 50 thousand of updates to SQL server - it means twice more sql statements!

See the following two sql queries extracted from log:
INSERT INTO [Book] (Isbn, Name, Pages, LIBRARY_ID, Id) VALUES (...);

UPDATE [Book] SET LIBRARY_ID = ... WHERE Id = ...;

What to pick instead of List?

It's simple to write that lists are evil. And what now, what to pick instead? There are three options.

1. At almost all cases, you really don't need List. See next chapter to find out which instance of collection to use.

List's index declares and defines list's order. At almost all cases, the list can be sorted according to any item's property. Does it make any sense to sort the list according to when items were added? I think it doesn't. I think that list needs to be sorted according to item's creation date, item's size, count's of item's replies etc., simply according to any item's property.

As our example defines, list of rentals should be sorted according to rental's start date, see mapping:
HasMany(b => b.Rentals).
  Access.CamelCaseField().
  AsBag().
  Inverse().
  KeyColumn("BOOK_ID").
  ForeignKeyConstraintName("FK_RENTAL_BOOK_ID").
  ForeignKeyCascadeOnDelete().
  LazyLoad().
  OrderBy("StartOfRental").
  Cascade.AllDeleteOrphan();

2. You need to sort list according to index and you can move index to child entit. Map it as a standard private property and than sort the whole collection according the property - see previous example.

See example of index property placed at children:
private int OrderIndex {
    get { return Book.Rentals.IndexOf(this); }
    set { }
}

The drawback is also that you need to expose parent's children list (instead of ICollection or IEnumerable) because child has to be able to find out it's position within list.

3. You need to sort list according to index, you can't change domain model and it's impossible to sort items according to any item's property. There is no other choice than simply live with non-inverse collection along with it's all disadvantages described above.

Recommended solution how to use collection when using NHibernate persistence
  1. Parent entity contains Add and Remove method for each collection
  2. Use ICollection as collection interface
  3. If you need to expose the whole colletion use IEnumerable - it's just read-only collection, support sequential fetching, etc.
Here is code snippet of Book entity.
public class Book {

    private ICollection<Rental> rentals = new List<Rental>();
    ...
    public void AddRental(String rentee) {
        ...
        rentals.Add(new Rental(this, rentee));
    }
    ...
    public IEnumerable<Rental> Rentals {
        get { return rentals; }
    }
}

Summary
You should be aware of List usage along with NHibernate persistence. It can bring you serious performance issues. Try to re-design you collections to be index independent and provide sorting approach by another than simple index way.

Examples at github.com
All examples are placed at github.com, see: https://github.com/MartinPodval/eu.podval/tree/master/NHibernatePerformanceIssues

You can also see Series of .NET NHibernate performance issues to read all series articles.

Comments

Kevin said…
One example where you need "arbitrary" ordering of items is a simple to-do list example, where a user needs to be able to reorder his to-dos arbitrarily. This is something I am dealing with now. A non-inverse relationship for me is not a good solution because we may have a user with hundreds of to-dos in a list. So, I am trying to come up with an efficient solution using a bag with ordering on a property column that is calculated based on the child's position in the list. So far, this is the only way I can see of letting the children maintain their own order. Still, when the user re-orders his items, all to-dos in the list need to be updated with regards to their position or order. So, still, we have potentially hundreds of updates (if the user has hundreds of to-dos.)
Anonymous said…
What you fail to mention is the drawback of using "order-by"

From the NHibernate reference documentation:

"Setting the order-by attribute tells NHibernate to use ListDictionary or ListSet class internally for dictionaries and sets, maintaining the order of the elements. Note that lookup operations on these collections are very slow if they contain more than a few elements."

/Johan H
Unknown said…
This is WAY easier to use than NHibernate:
https://www.kellermansoftware.com/p-47-net-data-access-layer.aspx

Popular posts from this blog

NHibernate performance issues #3: slow inserts (stateless session)

The whole series of NHibernate performance issues isn't about simple use-cases. If you develop small app, such as simple website, you don't need to care about performance. But if you design and develop huge application and once you have decided to use NHibernate you'll solve various sort of issue. For today the use-case is obvious: how to insert many entities into the database as fast as possible?

Why I'm taking about previous stuff? The are a lot of articles how the original NHibernate's purpose isn't to support batch operations, like inserts. Once you have decided to NHibernate, you have to solve this issue.

Slow insertion
The basic way how to insert mapped entity into database is:
SessionFactory.GetCurrentSession().Save(object);But what happen when I try to insert many entities? Lets say, I want to persist
1000 librarieseach library has 100 books = 100k of bookseach book has 5 rentals - there are 500k of rentals It's really slow! The insertion took exactly

Git on Windows: MSysGit

I have started to use Git today. I read a lot of discussions that there is no good tool for Windows platform. After forethought I have decided to used TortoiseGit. I also feared of difficult work related with Git as a lot of articles mentioned many instructions. As I already said, I have decided to use TortoiseGit, because I'm used to work with TortoiseSvn, but for start, MSysGit is enought. So this article is about MSysGit, next will be about TortoiseGit.

How to start with MSysgit on local machine?
Download and install Git for WindowsCreate source code directory for your git appRight click the directory at your favorite file browser. Menu should contain item "Git init here". It initializes chosen directory to be git-abled :-)It was your first usage of Git.

Commit data to local Git repository

Now, you can add any file, your first source code, to created directory. If you are prepared to commit any changes to your local git repository, follow next instructions.
Right-click th…

Tomcat 7 remote deployment

I decided to provide automatic deployment of war packaged application using Jenkins and Deployment plugin. The target platform is Amazon with Tomcat 7, see nice set of articles to find out how to setup such environment for free.

Well, there is couple of tutorials but they missing some points so it pushed me to lost one hour of my work.

What I gotFresh installation of Tomcat 7 on remote machine with opened 8080 port on firewallPersonal war file supposed to be deployed
How to push it to tomcat?
1. First of all, there is simple configuration of tomcat users in file tomcat-users.xml - it was my pain in the ass :-) As original comprehensive documentation says, it's necessary to define user, but which one(s)?

Here is working example of tomcat-users.xml:

<tomcat-users>
<user username="manager-gui" password="changeit" roles="manager-gui"/>
<user username="manager-script" password="changeit" roles="manager-script"/>