Rails 3 種做到 eager loading 方法

開始前簡單介紹一下什麼是 Eager Loading 什麼是 N+1 Query,Eager Loading 出現主要是為解決 N+1 Query。N+1 Query 像是這樣:

# controller
@posts = Post.publish.take(10)
# view
<table>
  <th>標題</th>
  <th>簡介</th>
  <th>作者</th>
  <% @posts.each do |post| %>
  <tr>
    <td><%= post.title %></td>
    <td><%= post.summary %></td>
    <td><%= post.author.name %></td>
  </tr>
  <% end %>
</table>

每一筆文章都去抓作者的關連,就會造成 N+1 Query,以範例10筆文章來說,會多10個 query 去抓作者資訊。但若 controller 內加上 Eager Loading (includes) 如下,就不會有 N+1 Query 問題了:

@posts = Post.includes(:author).publish.take(10)

目前 Rails 共有三種方式:includes, preload, eager_load,但大家多是知道 includes,如果不知道其他兩個就花點時間看一下吧!

一開始先定義相關的 Active Record 跟 Association

class User < ActiveRecord::Base
  has_many :addresses
end

class Address < ActiveRecord::Base
  belongs_to :user
end

假資料

rob = User.create!(name: "Robert Pankowecki", email: "robert@example.org")
bob = User.create!(name: "Bob Doe", email: "bob@example.org")

rob.addresses.create!(country: "Poland", city: "Wrocław", postal_code: "55-555", street: "Rynek")
rob.addresses.create!(country: "France", city: "Paris", postal_code: "75008", street: "8 rue Chambiges")
bob.addresses.create!(country: "Germany", city: "Berlin", postal_code: "10551", street: "Tiergarten")

實際來試試看這三種 eager loading

includes

User.includes(:addresses)
#  SELECT "users".* FROM "users"
#  SELECT "addresses".* FROM "addresses" WHERE "addresses"."user_id" IN (1, 2)

preload

User.preload(:addresses)
#  SELECT "users".* FROM "users"
#  SELECT "addresses".* FROM "addresses" WHERE "addresses"."user_id" IN (1, 2)

看起來 includes 跟 preload 是一樣的?繼續看下去~

eager_load

User.eager_load(:addresses)
#  SELECT
#  "users"."id" AS t0_r0, "users"."name" AS t0_r1, "users"."email" AS t0_r2, "users"."created_at" AS t0_r3, "users"."updated_at" AS t0_r4,
#  "addresses"."id" AS t1_r0, "addresses"."user_id" AS t1_r1, "addresses"."country" AS t1_r2, "addresses"."street" AS t1_r3, "addresses"."postal_code" AS t1_r4, "addresses"."city" AS t1_r5, "addresses"."created_at" AS t1_r6, "addresses"."updated_at" AS t1_r7
#  FROM "users"
#  LEFT OUTER JOIN "addresses" ON "addresses"."user_id" = "users"."id"

可以看到使用 eager_load 的話,rails 會自動使用 left join 包成一個 query 而已,若是 preload 跟 includes 則是拆成兩個 query,分別拉出 User 跟 Address 資料。

preload 和 includes 的不同

當 query 條件式有關連資料表的欄位時,就可看出差異

User.preload(:addresses).where("addresses.country = ?", "Poland")
#  SELECT "users".* FROM "users" WHERE (addresses.country = 'Poland')
#
#  SQLite3::SQLException: no such column: addresses.country

所以當使用 preload 時,rails 是只知道 User 的資料表,若是 includes 將可以使用關連資料表的欄位作條件。

User.includes(:addresses).where("addresses.country = ?", "Poland")
#  SELECT "users".* FROM "users" WHERE (addresses.country = 'Poland')
#  SELECT "addresses".* FROM "addresses" WHERE (country = 'Poland')

references

但在 rails 4 之後多了個 references,因此要改成:

User.includes(:addresses).where("addresses.country = ?", "Poland").references(:addresses)
#  SELECT "users".* FROM "users" WHERE (addresses.country = 'Poland')
#  SELECT "addresses".* FROM "addresses" WHERE (country = 'Poland')

為何要多此一舉?都是為了效能跟 loading 阿!!! 試想若 includes 可以 reference 其他資料表欄位,是否就代表下 includes 也要把其他資料表也先都讀進來(而且 join 起來?!),如果 includes 好幾個資料表.....好~可怕阿~~~

User.includes(:addresses, :posts).where("addresses.country = ?", "Poland").references(:addresses)
#  SELECT "users".* FROM "users" WHERE (addresses.country = 'Poland')
#  SELECT "addresses".* FROM "addresses" WHERE (country = 'Poland')

因此,假設我們多 includes posts,那條件式只有牽扯到 addresses 也就只要 reference 它就好。

reference from: http://blog.arkency.com/2013/12/rails4-preloading/