SQLAlchemy Object Lifecycle

6 minute read

SQLAlchemy Object Lifecycle

The story of how an object goes through from Pending to Persisted.

It isn’t until we execute db.session.commit() that we commit anything to the database.

Every time we want to interact with the database, we start a connection and we end a connection when all of the interactions are done. Within that interaction session, we create transactions that we commit to the database every time that we want to commit work to the database.

So an important thing to keep in mind, is that proposed database changes are not immediately committed to the DB once we define them. Changes instead go through these stages in order to provide the ability to undo a mistake before we commit something to the database.

Let’s go ahead and review what those stages of the object lifecycle are.

  • The first, is that you can define an object, which is sort of floating in space, unassociated with a database session yet. At this point, the object is in a transient state, it exists.
  • It isn’t until we call either session.add() or session.add_all() or session.delete(user1) or make some update to the model object, that we wound up proposing a possible insert, update or delete statement that we want to execute in the future. At this point, the object is in pending state. In that pending state, the object is associated to a database object, but we haven’t yet decided to permanently commit it yet. This is really useful for us when we may want to undo something that we didn’t intend to do. So, that we can call session.rollback(), and what that will do, is that as long as a flush() has not yet happened, session.rollback() can allow you to clear all of the pending work that you added to the transaction as if it never happened.

So, in summary, an object stays in pending state until a flush happens.

So, lets review, what a flush means?

A flush is an event that takes pending changes and translates them into SQL commands ready to be committed to the database.

So, things that we did on the ORM layer to objects wound up getting translated into SQL expressions, which in turn get translated into SQL command statements so that the engine can understand and get bundled into particular connections and semantics for a particular database system, which then get handled by the underlying DBAPI which takes care of the details that need to happen before a commit to the database occurs.

A flush is not the same as a commit. Nothing is yet persisted to the DB yet when a flush happens. We still need to manually call commit for the commit to occur.

Takeaways:

  • Within a session, we create transactions every time we want to commit work to the database.
  • Proposed changes are not immediately committed to the database and instead go through stages to allow for undos.
  • The ability to undo is allowed via db.session.rollback()

Stages:

  • Transient: an object exists, it was defined. user1 = User(name='Shravan') … but not attached to a session yet.
  • Pending: an object was attached to a session. “Undo” becomes available via db.session.rollback(). Waits for a flush to happen.
  • Flushed: about ready to be committed to the database, translating actions into SQL command statements for the engine.
  • Committed: manually called for a change to persist to the database (permanently); session’s transaction is cleared for a new set of changes.

Questions to solidfy our understanding:

When does a flush occur?

One way that a flush occurs is when we call query. Calling query on model flushes the pending changes and adds them to the session, so that, when we call .all() or .first(), those pending changes get translated into SQL statements. When you call .query an in-memory representation of what you would get back if you had called commit, will be available for you.

But, they haven’t been added to the database yet. It is not until you call db.session.commit() that that happens.

Here are the three different stages of an object’s lifecycle from pending to flushed to committed.

One way that a flush occurs is:

  • When we first do a session.add() or .all_all() or any update or delete on an object, those changes get added to the pending stage.
  • Calling .query then flushes those pending changes, translating those changes into INSERT, DELETE, UPDATE statements. Then from here, we have to actually manually call session.commit() for those changes to be persisted to the DB.

Another way a flush occurs is:

  • When you directly call session.commit(), i.e, without having called model.query statement at all.
  • In this case, commit will automatically flush those pending changes for you, and persists those DB changes for you.

Lastly, what also happens on commit is that a session starts a new transaction when you next call db.session(), indicated in the above figure with an empty box.

Demo of how a flush occurs

Fire up an interactive session and import the db and Person objects from app.py flask application. (Check my earlier sqlalchemy posts for context).

(fullstack) shravan-flaskapps# python
Python 3.8.5 (default, Aug  5 2020, 03:39:04)
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from app import db, Person
>>> Person.query.all()
[<Person ID: 1, name: Shravan>, <Person ID: 2, name: Veyd>, <Person ID: 3, name: Neil>, <Person ID: 4, name: Pragna>]
>>> Person.query.count()
4
>>>

Let’s also verify what we have on the db, to verify we have the same exact values:

interviews=# SELECT * FROM persons;
 id |  name
----+---------
  1 | Shravan
  2 | Veyd
  3 | Neil
  4 | Pragna
(4 rows)

interviews=#

Now lets go ahead and add a new person to the persons table. Next, add it to a session and then call the .query.all() method on it.

>>> person = Person(name='Ranger')
>>> db.session.add(person)
>>> Person.query.all()
[<Person ID: 1, name: Shravan>, <Person ID: 2, name: Veyd>, <Person ID: 3, name: Neil>, <Person ID: 4, name: Pragna>, <Person ID: 5, name: Ranger>]
>>>

You can see that it displays <Person ID: 5, name: Ranger> in the output here, but it won’t be in the DB yet. As shown here:

interviews=# SELECT * FROM persons;
 id |  name
----+---------
  1 | Shravan
  2 | Veyd
  3 | Neil
  4 | Pragna
(4 rows)

interviews=#

This is because, we haven’t committed the changes that we have flushed.

Because those names have been flushed, SQLAlchemy knows not to do the work again, of translating those pending changes into SQL statements. So there is some underlying caching going on that makes it a bit more performant than using the DBAPI itself.

OK, so now that we understand that flushing changes does not commit them, lets go ahead and commit the change using db.session.commit(), and check the db. We now see that a new person Ranger (our dog’s name) has been added to the DB.

interviews=# SELECT * FROM persons;
 id |  name
----+---------
  1 | Shravan
  2 | Veyd
  3 | Neil
  4 | Pragna
  5 | Ranger
(5 rows)

Conclusion

The more familiar we can be with what the state of our objects are in, the more we can know and always understand the state of our database, leading to more maintainable web applications.