Upsert in Slick 3

An upsert means inserting a row into a database if it doesn’t already exist, or updating the row if it does exist. The Slick database library knows how to do upserts. This posts takes a look at the built-in support, and gives an example of what you can do if you need to roll your own upsert logic.

This is part of a series on Slick, for material in or being added to Essential Slick.

Built in Support

In Slick an upsert is provided by the method insertOrUpdate. If you have a table and a representation of a row you can insertOrUpdate that row. Very convenient.

This means two things:

  • Slick will automatically decide what it needs to do for you (INSERT or UPDATE or something else). I’m being vague here, because “what it needs to do” could well vary depending on the database product you are using.
  • Slick needs a way to match the value you have to some row in the database. The usual way to do that is with a primary key (composite or otherwise).

An example will make this clearer. Let’s say we’re running a movie reviews database. We have film titles with a rating. In Slick we might model rows with a Review case class for the ReviewTable:

final case class Review(title: String, rating: Int, id: Long = 0L)

// Hello to Jason Isaacs
final class ReviewTable(tag: Tag) extends Table[Review](tag, "review") {
  def id      = column[Long]("id", O.PrimaryKey, O.AutoInc)
  def title   = column[String]("title")
  def rating  = column[Int]("rating")
  def * = (title, rating, id) <> (Review.tupled, Review.unapply)
}

val reviews = TableQuery[ReviewTable]

We can conjure up some reviews, and construct a program that will create the database, populate it, and give us back the content:

val testData = Seq(
  Review("Godzilla", 8),
  Review("Godzilla Raids Again", 6),
  Review("King Kong vs. Godzilla", 5)
)

val action = for {
  _     <- reviews.schema.create
  _     <- reviews ++= testData
  films <- reviews.result
} yield films

If you’re not familiar with Slick 3, you don’t need to worry about the details above too much. The key point here is that Slick works by running an action (such as DBIO[T]) via an interpreter. The result will be a Future[T].

Let’s run the above:

val db = ??? // assume we have a database set up
val future = db.run(action).map { _ foreach println }
Await.result(future, 2 seconds)
// Output:
// Review(Godzilla,8,1)
// Review(Godzilla Raids Again,6,2)
// Review(King Kong vs. Godzilla,5,3)  

Because we’re selecting all film reviews, we end up with a Future[Seq[Review]]. For the rest of this post we are dealing with inserting and updating. By default we’ll get back a count of affected rows. This means the results we will work with are going to be Future[Int].

So I want to post a new review. Maybe I’ll have changed my mind about a rating. Or maybe I’m reviewing a new film. What’s the code look like?

val review: Review = ??? // We have a review from somewhere
val upsert: DBIO[Int] = reviews.insertOrUpdate(review)

It doesn’t matter if review is a new instance only in memory, or one we retrieved from the database earlier. Slick will INSERT or UPDATE for us. When we execute the upsert action against H2, this sequence of SQL is executed if review already exists…

select 1 from "review" where "id"=?
update "review" set "title"=?,"rating"=? where "id"=?

…or this SQL if it’s a new row:

select 1 from "review" where "id"=?
insert into "review" ("title","rating") values (?,?)

Notice that Slick is using the primary key, id, to check and update the row.

A peculiarity of H2 with Slick is that if a row contains an auto incrementing key, as it does in this case, the upsert is simulated via the select and insert pair we’ve just seen. Without the O.AutoInc option for the table, the SQL for both cases becomes a merge:

merge into "review" ("title","rating","id") values (?,?,?)

That’s reasonably straight-forward. The right effect occurs in the database, and you get back a count of rows changed.

Sometimes you want the row back, not the count. Slick supports that via returning. We discuss returning in Chapter 2 of Essential Slick, but the pattern is:

(reviews returning reviews).insertOrUpdate(row)

Not all database products support that.

Rolling Our Own Update

By included insertOrUpdate I’m guessing the Slick team have covered a large percentage of use cases for upsert. But if you want to do something different, how can you get the logic you need mixed in with database actions?

Let’s look at a slight variation of our film reviews database:

final case class Review(critic: String, title: String, rating: Int)

final class ReviewTable(tag: Tag) extends Table[Review](tag, "review") {
  def critic = column[String]("critic_name")
  def title  = column[String]("title")
  def rating = column[Int]("rating")
  def * = (critic, title, rating) <> (Review.tupled, Review.unapply)
}

We are going to have one rating from a critic for each title.

We’re not using a primary key here. We’re going to use this fact to exercise custom logic over what we do when posting a review. (And it is just an illustration: you could, and probably should, use a composite key here. I’ve included an example of that in the source code that goes with this post.)

Without a key insertOrUpdate won’t work for us. We need to try something else.

Let’s assume we’re mostly making edits, and an update to a rating will probably succeed. If it doesn’t, we’ll try an insert. We need something like this:

def postReview(critic: String, title: String, rating: Int): DBIO[Int] = {
  for {
    rowsAffected <- reviews.
                     filter(r => r.critic === critic && r.title === title).
                     map(_.rating).
                     update(rating)
    ???
    }
  } yield ???
}

The intention of this code (when we finish it) is take the name of a critic, a film title, and a rating, and either insert a new row, or update an existing row. It returns a count of the rows affected, as a DBIO[Int]. This action is what we will run later.

We start with a query for existing reviews by this critic and title. This is a DBIO[Int] because it is an update of the rating. This will give us the rows affected by the update.

The logic we now want is:

  • if there are no rows affected, we need to insert a new review
  • if there was one row affected, we’re done!
  • otherwise, something really bad has happened and we have multiple reviews for a film by the same critic. This cannot be allowed to stand.

We’re going to be able to achieve this because of a key feature of Slick 3: DBIO actions compose. If you’re approaching this thinking “how can I get the count out of the DBIO”, don’t. Instead, think about what you can do to combine DBIOs together.

What we need to complete postReview is another DBIO. There are several ways we can get one. Calling result on a query is one way, but there are also standard ways to construct a DBIO for success and failure cases. We can use them to complete the code:

def postReview(critic: String, title: String, rating: Int)
              (implicit ec: ExecutionContext): DBIO[Int] = {
  for {
    rowsAffected <- reviews.filter(r => r.critic === critic && r.title === title).
                    map(_.rating).update(rating)
    result <- rowsAffected match {
      case 0 => reviews += Review(critic, title, rating)
      case 1 => DBIO.successful(1)
      case n => DBIO.failed(new RuntimeException(
                s"Expected 0 or 1 change, not $n for $critic @ $title"))
    }
  } yield result
}

Let’s walk down the body of this method:

  • We’re trying to update a review. The generator (to the right of the <-) is a DBIO[Int]

  • Next we use the rowsAffected to produce another DBIO[Int]:

    • reviews += ... inserts new row, and is a DBIO[Int]
    • DBIO.successful(1) produces a DBIO[Int] for a constant value
    • DBIO.failed produces a DBIO[Nothing] for an exception, signalling failure.

As an exercise, you can de-sugar that for comprehension into flatMap and map to see how the DBIOs combine. The main thing to notice, though, is that we can produce DBIO[T] for the arbitrary logic we need for our application.

Finally, notice that because we’re mixing database queries and user logic in a for comprehension, we need to also have an ExecutionContext in scope. Our logic will run on the context we supply to the method. Slick will sensibly release a session for the duration of our custom logic, unless it is pinned.

Get or Create

You should be able to see how this kind of code can deal with “fetch the row or create one”. Let’s add that into the mix for postReview, forgetting about critics to keep the example shorter:

def postReview(title: String, rating: Int): DBIO[Int] = for {
  existing <- reviews.filter(_.title === title).result.headOption
  row       = existing.map(_.copy(rating=rating)) getOrElse Review(title, rating)
  result   <- reviews.insertOrUpdate(row)
} yield result

We’re now fetching a row and modifying it, or creating a new one and persisting it. It’s the same pattern we’ve seen, combining DBIOs:

  • result.headOption is a DBIO[Option[Review]]
  • we use the Option like any other option, using getOrElse provide a default of a new row
  • we upsert the row, giving a final result: a DBIO[Int].

Conclusions

Slick contains a convenient way to upsert database records, based on the availability of a primary key.

But it is not limited to just that. You can combine database actions together, mixing in your own custom logic.

There’s a Github project where you can try the code out. This example is likely to appear in the upcoming revisions to Essential Slick.


Like what you're reading?

Join our newsletter

Looking for a Scala job?

Job Listings


Comments

Please review our Community Guidelines before posting a comment. We encourage discussion in good faith, but do not allow combative, exclusionary, or harassing behaviour. If you have any questions, contact us!