Not a SQL replacement

OLTP vs OLAP use cases

How Ebean fits in OLTP vs OLAP use cases, when to use manually supplied SQL

Ebean ORM queries are good for typical OLTP query use cases but there are many queries that can be written in SQL that ORM is not well suited to, especially in the reporting and OLAP area. Recursive SQL queries are also not currently well supported.

Ebean ORM queries are focused to OLTP. Don't forget about using SQL where appropriate.

For Ebean the focus is to use ORM queries for OLTP use cases. The plan is not to expand the ORM query language but to instead have good integration with SQL via RawSql, SqlQuery and upcoming/desired integration into JOOQ.

Object graph construction

It can be useful to think of ORM queries as Object graph construction. That is, when you are creating an ORM Query it is a definition on what part of the object graph to load and what filters/predicates need to be applied.

Loading

For Ebean the select() and fetch() control what part of the object graph to load. The select() is for the root level of the object graph and fetch() is for the object graph leaves. In more advanced cases FetchConfig can be used to control what parts of the graph are loaded eagerly/lazily and if that part of the graph is loaded from the database or L2 cache or L3 cache/document store.

Some parts of the object graph can be loaded from L2 cache or L3/document store

Loading - Query tuning

Tuning an ORM query for optimal performance comes down to:

  • What properties to load (only load what you need for the given use case)
  • What parts of the object graph to load Eagerly or Lazily
  • What parts of the object graph to load from Database or L2 cache or L3 document store
  • Not executing the query at all and using L2 Query cache
  • Not executing the query at all and using L3 document store

Loading - AutoTune

Ebean's AutoTune feature has the ability to profile what parts of the object graph are used by the application and then provide automatic tuning of the "what to load" part of an ORM query. That is, AutoTune can suggest what the select() and fetch() part of the ORM query should be in order to fetch only what is needed from the database (reducing network and databases costs of the query) and minimise lazy loading.

Ebean can automatically tune "What to load"

Predicates

The where() clause is where you define predicates for the query. This is effectively 'application logic' and does not change for 'tuning' purposes etc.

Predicates - Type safety

The Type Safe query extension for Ebean is designed to provide a type safe way of building query predicates. As such it does not have a focus on providing type safety on "What to fetch" via the select() and fetch() clauses (which are instead expected to be largely defined by AutoTune).

Automatic joins

Ebean's query language is designed such that for a query you do not explicitly specify joins but instead define "What to load" and "Predicates". Ebean works out what joins are required to support both the load and predicate aspects of the query. Some of the SQL joins can be shared by both loading and predicates and some joins can not be, outer joins are 'inherited' down a given object path etc. Ebean works the joins and join types to add based on the cardinality and optionality of the relationships/paths involved in both loading and predicates.

The downside of this approach is that predicates using "self joins" can not be easily specified currently but this issue will be addressed and there are some very significant benefits to this approach where Ebean determines the joins automatically.

Avoiding Cartesian product

Ebean will never generate a SQL cartesian product. No matter how complex or big you ORM query gets Ebean will not generate a SQL cartesian product but instead break the query up into multiple SQL queries.

FirstRows MaxRows

FirstRows and MaxRows will always work in your ORM query. That is, firstRows/maxRows work on relational rows (and not objects) and Ebean will automatically break the ORM query up into multiple SQL queries as needed in order for firstRows/maxRows to work as intended.

Arbitrarily complex graphs

No matter how complex or nested your object graph is for your ORM query Ebean can build it in a pretty efficient manor. More specifically, a single SQL query can contain at most one OneToMany or ManyToMany relationship "path" (without producing a SQL cartesian product which is what we really want to avoid). Ebean will break up the ORM query into multiple SQL queries with each SQL query having a single OneToMany/ManyToMany relationship in its fetch (each SQL query is as eager/big as it can be without becoming a cartesian product).

Ebean processes all the fetch paths determining which paths contain a OneToMany or ManyToMany relationship. The query 'breakup' may be manually controlled by using FetchConfig but if this has not been done or has not covered all the required relationships then Ebean will automatically break up the query based on these 'paths containing a many relationship'.


TERMINOLOGY

Load Context

The "Load Context" is an internal Ebean feature with a very important job. Is supports executing secondary queries and provides:

  • Batch lazy loading
  • Query joins (Eager secondary queries)
  • Propagation of key query state from the origin query to any/all secondary queries.

In essence the Load Context provides the mechanism for batch loading complex graphs.

Origin queries

"Origin query" is an Ebean term to refer to the original SQL query executed when building an object graph. Further SQL "secondary queries" can execute and relate back to the "origin query".

Summary logging

In summary logging there is an origin attribute and this is a key that can be used to link origin and secondary queries.

txn[1012] FindBean type[Order] origin[B0dP9E.DWeHD4.5WUnB] ...

Secondary queries

"Secondary queries" is an Ebean term for SQL queries that relate back to a previously executed "origin query". These secondary queries can be executed due to lazy loading or because a single ORM query is broken up manually or automatically into multiple SQL queries (in order to avoid cartesian product and support firstRows/maxRows).

Summary logging

In summary logging the mode and origin attributes can be used to identify which sql queries are "secondary queries" and the origin which they relate back to. This logging can be used to identify excessive lazy loading (N + 1) for example.

  • mode : +lazy for lazy loading query and +query for a query join/eager loading
  • origin : this key links the secondary query back to the origin
txn[1012] FindBean type[Order] origin[B0dP9E.DWeHD4.5WUnB] ...
...
txn[1016] FindMany mode[+lazy] type[Customer] origin[B0dP9E.DWeHD4.5WUnB]  ...
...
txn[1017] FindMany mode[+lazy] type[Address] origin[B0dP9E.DWeHD4.5WUnB]  ...