Why Use ORMs?

ORM, aka Object-Relational Mapping, is quite a crappy way of converting a relational database model into an object model. Usually this simply means that the software creates an object for each table that exists in the database.

Many people say that ORMs are the best human idea since sliced bread; some of their arguments are discussed below.

Database Agnosticism

Yea… so you can switch databases? Well… how to put that using nice words only… No. Just no. I don’t know of any advanced software that can easily switch databases, even if it uses an ORM. The reason is simple: advanced software means advanced procedures for managing data. It means that everything is tuned up. If not, it becomes slow.

Tuning the database means using its capabilities (beware: those capabilities are not database agnostic).

Database-Agnostic Speedups?

What can you do with such a slow database? Of course you can create some indices, or do many more things including views (also materialized ones), triggers, replication, partitioning, or anything else.

Let’s talk about indices (you think they are the simplest way of speeding up queries?). OK, first of all a question: which indices do you want to talk about? Maybe PostgreSQL’s partial indices, or the functional ones? So if you use PostgreSQL and create a partial index, you cannot reproduce the same DDL on Oracle. That way you have a much faster PostgreSQL database, not an Oracle one.
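To make the two kinds of index concrete, here is a minimal sketch using SQLite via Python’s sqlite3 (SQLite happens to support both forms; the table and index names are made up, and the point stands that this DDL is not portable to every engine):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, active INTEGER)")

# A functional (expression) index: useful for case-insensitive lookups.
conn.execute("CREATE INDEX idx_users_email_lower ON users (lower(email))")

# A partial index: only rows with active = 1 are indexed at all,
# so the index stays small if most rows are inactive.
conn.execute("CREATE INDEX idx_users_active ON users (email) WHERE active = 1")

names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index'")]
print(sorted(names))  # ['idx_users_active', 'idx_users_email_lower']
```

The same two CREATE INDEX statements run on PostgreSQL, but a database without partial indices simply rejects the WHERE clause.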

The opposite example: in Oracle you have hints, in PostgreSQL you have not. Of course a hint is just a comment, so any other database would simply ignore it. But ignoring a comment doesn’t mean speeding up the query.

What can you do to get the same speed on every database engine? Just create a different schema for each engine. And while changing a schema, don’t forget that you need to maintain two (or more) schemas. Adding a column to one schema means adding the column to the others. Speeding up one engine… means speeding up the others (but maybe using different tools). And one more thing… after making all those changes, don’t forget to test everything… separately for each engine. Have fun.

Of course you can have a nice application that is database agnostic, but only if you create a separate data layer for each engine. That’s just more work, and the question is… what for? Don’t you really have any more important work to do? Simply use one database engine; believe me, 99.999% of software will never change engines once it works.

But do it wisely: have a data layer that could be replaced by an implementation for another database.

Speeding Up Development

Maybe… at the very beginning. But at some point (and that point shows up even before the first release) you will suddenly find that there are so many limitations in the ORM libraries that you have to write some queries using plain SQL. Later you will have a hybrid system: part of the queries done using simple Hibernate selects, some using HQL (which really is a kind of SQL), and some using SQL.

Is that really so easy to maintain, and does it speed anything up?

Testing

“As a consequence of being database agnostic, you can replace your heavy SQL Server with a nimble Sqlite instance at test time, allowing you to write blazing fast tests.”

– http://blog.adamjcooper.com/2010/01/are-orms-solving-anything-yes.html

Yea, sure, a kind of urban legend. Last time I had to change something in the Oracle session using ‘ALTER SESSION’. The tests failed just because they were using HSQLDB. Everything would have worked if Oracle had been used instead of any other engine.

If you have an application that is prepared to run on database engine XXX at version YYY, just use version YYY of engine XXX for testing. Using anything else won’t work. You’ll just have tested some other engine, not the engine that you’ll actually use.

Caching

Yea, ORMs can cache… true. But if you don’t use an ORM, you can cache too. I hope you don’t think that the whole idea behind any ORM is just caching.
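Caching without an ORM can be as small as one decorator around a query function. A minimal sketch using Python’s stdlib (the table, data, and function names are made up for illustration):

```python
import sqlite3
from functools import lru_cache

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO products VALUES (1, 'widget')")

calls = 0  # counts how many times the database is actually hit

@lru_cache(maxsize=128)
def product_name(product_id):
    global calls
    calls += 1
    row = conn.execute("SELECT name FROM products WHERE id = ?",
                       (product_id,)).fetchone()
    return row[0] if row else None

product_name(1)
product_name(1)  # second call is served from the cache, no query runs
print(calls)  # 1
```

The usual caveat applies with or without an ORM: a cache like this must be invalidated when the underlying row changes.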

Higher Level of Query Abstraction

Such abstractions can be dangerous if you don’t know what is going on underneath.

Some time ago I found a project where a programmer just did SELECT * FROM XXX and loaded the whole table into the program. Then the program sorted the whole dataset copied from the database and printed the first 10 rows on a web page. Of course that ran smoothly on his machine. On the testing environment too. Yes, 100 records is not much.

On production it was different. There were many more records and it wasn’t funny. His explanation was something like:

I don’t know SQL, I’m not gonna learn that.

I wrote the correct code and I’m not gonna change that, it simply works.
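The fix for that anecdote is a one-line change: let the database sort and limit. A sketch contrasting both approaches, using SQLite via Python’s sqlite3 (table and data made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (player TEXT, points INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)",
                 [("player%d" % i, i) for i in range(10000)])

# Wrong: copy the whole table into the application, sort there, keep 10 rows.
rows = conn.execute("SELECT player, points FROM scores").fetchall()
top_wrong = sorted(rows, key=lambda r: r[1], reverse=True)[:10]

# Right: let the database sort and limit; only 10 rows leave the database.
top_right = conn.execute(
    "SELECT player, points FROM scores ORDER BY points DESC LIMIT 10").fetchall()

print(top_wrong == top_right, top_right[0])  # True ('player9999', 9999)
```

Same result, but the second version transfers 10 rows instead of 10,000, and the database can use an index on points instead of materializing the whole table in application memory.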

You Don’t Have to Use the Crappy SQL…

Yea… I really don’t get it. Why do so many programmers not want to know a little bit more, and refuse to learn new things, including SQL? They don’t have to know SQL; they’ve got ORMs. So they produce crappy ORM-generated queries that usually are not maintainable.

But Will Gladly Use The Crappy HQL

The funniest one is Hibernate. Mr. Hibernate refused to use SQL, so they invented HQL. The best thing about HQL is that programmers don’t refuse to learn it… despite the fact that it looks much more like SQL than like object code.

And how can I write an HQL query in Hibernate with Oracle hints? No way. That’s sad.

ORM is For Everything

ORMs are also used for checking data validity and all constraints, including the uniqueness of a column. That is a crappy idea, and it simply doesn’t work.

Checking constraints on the application side, and not on the database side, has some advantages, such as decreasing the load on the database server. But the problem is not the load; the problem is data correctness.

If you need to have correct data in the database, some constraints are a must. Checking them on the application side works sometimes… but what happens when you have another application written without your ORM mapping? That application could use some other ORM library, but then you have to create and maintain the mappings in two places all the time (a terrible idea; just run away if you are thinking of doing something like that). And what if the other application is a simple perl/bash/python script that doesn’t use any ORM? Such a script can easily change the data so that there is just pure garbage, and usually you realize that too late.
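The problem can be shown in a few lines: application-side validation only binds the code paths that go through it. A sketch using SQLite via Python’s sqlite3 (the table, rule, and function names are all made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The schema itself carries no rules, so it accepts anything.
conn.execute("CREATE TABLE users (login TEXT)")

def create_user_via_app(login):
    # “Validation” that exists only inside this one application.
    if not login:
        raise ValueError("login must not be empty")
    conn.execute("INSERT INTO users (login) VALUES (?)", (login,))

create_user_via_app("bob")  # the application enforces its rule here...

# ...but a one-line script talking straight to the database does not:
conn.execute("INSERT INTO users (login) VALUES ('')")

garbage = conn.execute(
    "SELECT count(*) FROM users WHERE login = ''").fetchone()[0]
print(garbage)  # 1 — the “impossible” empty login is in the table
```

With a CHECK or NOT NULL constraint in the schema, the second insert would have been rejected no matter which program sent it.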

Additional points for bravery if you enforce column uniqueness only in the application.

So what to do about that? Just create the constraints in the database. All applications, including the DBA (who is not an application), will be happy.

Database Constraints

I wrote that such application-side constraint checking sometimes doesn’t work. There are mainly three solutions: the not-working solution aka Rails, the inefficient solution aka MyISAM, and the good solution called The SQL Database.

The Bad

The first solution is used e.g. in Rails’ ActiveRecord. Simply define some model attribute as unique, and later the model checks it. During insert or update the model just checks whether there is any record in the database that already has this value in the unique column.

It simply doesn’t work. Let’s imagine that there are two instances of the same application. Both want to insert a record with the same data. The first application checks and finds out that there isn’t any such record.

The second application checks… there isn’t one either (yea, what a surprise). Then the first application performs the SQL insert. The second application does too. After that there are two records with exactly the same data in the unique column. What a surprise, how did that happen, two users with the same login, huh?
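This race can be reproduced deterministically by interleaving the steps by hand. A sketch with two SQLite connections standing in for the two application instances (all names are made up; isolation_level=None makes every statement autocommit, like two independent processes):

```python
import os, sqlite3, tempfile

# Two connections simulate two instances of the same application
# sharing one database file.
path = os.path.join(tempfile.mkdtemp(), "racedemo.db")
app_a = sqlite3.connect(path, isolation_level=None)
app_b = sqlite3.connect(path, isolation_level=None)
app_a.execute("CREATE TABLE users (login TEXT)")  # note: no UNIQUE constraint

def exists(conn, login):
    rows = conn.execute("SELECT login FROM users WHERE login = ?",
                        (login,)).fetchall()
    return len(rows) > 0

# The interleaving described above: both instances check first...
assert not exists(app_a, "bob")
assert not exists(app_b, "bob")
# ...and then both insert, because each of them saw “no such record”.
app_a.execute("INSERT INTO users (login) VALUES ('bob')")
app_b.execute("INSERT INTO users (login) VALUES ('bob')")

count = app_a.execute(
    "SELECT count(*) FROM users WHERE login = 'bob'").fetchall()[0][0]
print(count)  # 2 — two “unique” records
```

In production the interleaving is not hand-written; it just happens whenever two requests arrive close enough together.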

The Inefficient

The second solution is to perform the same operations as in the first example, but inside a critical section. Simply create a lock in the database, so the first application is sure that no one else will check whether the record with the data exists. That will work great… but such a lock on the whole table is not a good idea. It blocks all writes, so no one can change anything. That’s really inefficient; please don’t do that if you can do it another way. I called this the MyISAM way, as MyISAM is the widely used non-transactional engine for MySQL, and sometimes setting such a terrible lock is the only way to create a so-called transaction simulation.
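A sketch of the cost of such a critical section, again with SQLite via Python’s sqlite3 (SQLite’s BEGIN EXCLUSIVE locks the whole database, even coarser than a table lock; with MySQL/MyISAM the equivalent would be LOCK TABLES; all names are made up):

```python
import os, sqlite3, tempfile

path = os.path.join(tempfile.mkdtemp(), "lockdemo.db")
app_a = sqlite3.connect(path, isolation_level=None, timeout=0.1)
app_b = sqlite3.connect(path, isolation_level=None, timeout=0.1)
app_a.execute("CREATE TABLE users (login TEXT)")

# The critical section: app_a locks everything, then checks and inserts.
app_a.execute("BEGIN EXCLUSIVE")
if app_a.execute("SELECT count(*) FROM users WHERE login = 'bob'"
                 ).fetchall()[0][0] == 0:
    app_a.execute("INSERT INTO users (login) VALUES ('bob')")

# Meanwhile app_b cannot write anything at all, not even an unrelated row.
try:
    app_b.execute("INSERT INTO users (login) VALUES ('alice')")
    blocked = False
except sqlite3.OperationalError:  # “database is locked”
    blocked = True

app_a.execute("COMMIT")
print(blocked)  # True
```

The duplicate is prevented, but every other writer pays for it by waiting on a lock that has nothing to do with their data.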

The Good

The third solution is the best. Just define a unique constraint in the database, and handle the error that you can get while inserting. And it works. And there are no terrible locks. And there really will be just one record with this value in the unique column.
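This is the whole pattern, sketched with SQLite via Python’s sqlite3 (function and table names made up; on PostgreSQL or Oracle the driver raises its own duplicate-key error, but the shape is the same):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The uniqueness rule lives in the database itself, not in the application.
conn.execute("CREATE TABLE users (login TEXT UNIQUE)")

def create_user(login):
    try:
        conn.execute("INSERT INTO users (login) VALUES (?)", (login,))
        return True
    except sqlite3.IntegrityError:
        # The database refused the duplicate; no lock, no race window.
        return False

first = create_user("bob")
second = create_user("bob")
total = conn.execute("SELECT count(*) FROM users").fetchone()[0]
print(first, second, total)  # True False 1
```

No check-then-insert window exists: the check and the insert are one atomic operation inside the engine.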

The Model Should Be Based On Classes

Why? Oh why? Just because you use an object-based language? I’d rather not complicate things too much, and not add a leaky abstraction level called ORM.

Why Not To Use ORMs?

You use some expensive database, let’s say it is Oracle. Oracle has hints, but HQL doesn’t allow you to use them, oops. Oracle has spatial queries, but your ORM doesn’t understand them. Oracle has whatever, but your ORM doesn’t allow you to use it.

So you pay a lot of money for a database like Oracle…

just to use that like you’d use SQLite?

What a brilliant idea.

You use some inexpensive database, let’s say it is PostgreSQL. In PostgreSQL you can define your own types, but the ORM doesn’t understand them. You’d like to use partial indices, but your automatically generated ORM DDL queries don’t understand them.

So you don’t pay a lot of money for a database like PostgreSQL…

just to use that like you’d use SQLite?

What a brilliant idea.

The engine XXX has a feature YYY, but your great ORM doesn’t use it because other engines don’t have that feature (although using this feature would speed up your application so much).

Why Define Database Constraints?

You’ve got an application that has worked for some time. You want to change it; what should you do? Fix the old application, or create a new one. What will be left behind or migrated?

The Data.

The data is what describes the real value of the whole application. So when there is a new application (no matter whether it is a brand new one or you’ve improved the old one), you just throw the old one away. The data persists; you leave the data or migrate it with care. So take care of the data by having a good data model with all the constraints. If you don’t have a well prepared data model, you’re on the road straight down to hell. The data will be wrong some day; imagine that the wrong data is on your bank account or your ebay account. Ooops.

When you have well prepared data in the database, there is no sense in repeating the same constraints in the application. You would have to maintain them in two places, so the application slowly becomes a maintenance nightmare.

Some Links On The ORMs

Are ORMs Solving Anything: http://wildermuth.com/2010/01/18/Are_ORMs_Solving_Anything

http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx
http://www.alachisoft.com/articles/orm.html
http://www.artima.com/intv/abstract3.html

http://www.reddit.com/comments/6sbua/ask_reddit_i_know_sql_but_not_orms_am_i_crazy_to

http://stackoverflow.com/questions/291853/why-is-orm-considered-good-but-select-considered-bad
http://stackoverflow.com/questions/18655/why-do-we-need-entity-objects

http://python.mirocommunity.org/video/1217/when-not-to-use-the-orm-bulk-d

http://database-programmer.blogspot.com/2008/06/why-i-do-not-use-orm.html

http://nicolas.lehuen.com/index.php/post/2005/12/02/79-why-use-an-orm-at-all-anyway

The comments are disabled. If you want to write something to me, you can use e.g. Twitter.