In Rails, a polymorphic association allows a model to be connected to other models in a special way. It uses a polymorphic interface where the combination of a type
column and an id
column point to a row in another table.
For example, say I have an app with these models:
And I want to be able to report on the number of photo album images created at certain points in time.
I create 3 tables:
create_table :images do |t|
t.bigint :imageable_id, null: false
t.string :imageable_type, null: false
t.timestamps
t.index [:imageable_type, :imageable_id]
t.index :created_at
end
create_table :photo_albums do |t|
t.timestamps
end
create_table :user_profiles do |t|
t.timestamps
end
And I set up the models:
class Image < ApplicationRecord
belongs_to :imageable, polymorphic: true
end
class PhotoAlbum < ApplicationRecord
has_many :images, as: :imageable
end
class UserProfile < ApplicationRecord
has_one :image, as: :imageable
end
At first glance, polymorphic relationships seem like a really clever idea. I only need to create a single images
table. Then whenever I want to connect images to another model, I don't have to do anything in the database. I can just define a new polymorphic association on the model.
But there are a few pitfalls to this design pattern:
It isn't possible to define a foreign key constraint for the imageable_id
column. That column can either point to a row in the photo_albums
table or it can point to a row in the user_profiles
table. Because we can't define a foreign key constraint, we run the risk of ending up with invalid data into the column at some point.
It's a good idea to set up a unique index in the database for has_one
relationships because methods like find_each
can fail when there are duplicates. But it isn't possible to add a unique index for the has_one
on UserProfile without breaking the has_many
on PhotoAlbum.
Class names are stored in the imageable_type
column as a VARCHAR
:
SELECT id, imageable_id, imageable_type
FROM images;
+----+--------------+----------------+
| id | imageable_id | imageable_type |
+----+--------------+----------------+
| 1 | 1 | PhotoAlbum |
| 2 | 1 | UserProfile |
+----+--------------+----------------+
This means that longer class names will take up more data and indexing space than short class names.
Say that I want a way for a PhotoAlbum and a UserProfile to point to the same Image. This isn't currently possible because each Image is anchored to a single type
and id
combination.
Sometimes, presenting too many possible indexes to MySQL can cause it to make poor choices. The dual-key index associated with a polymorphic relationship can often get in the way.
Say that I have the following totals in my data:
When I run a query prior to 2021-11-04, MySQL is able to figure out that the created_at
index is best:
EXPLAIN SELECT COUNT(*)
FROM images
WHERE imageable_type = 'PhotoAlbum'
AND created_at < '2021-11-04';
+----+-------------+--------+------------+-------+----------------------------------------------------------------------------+----------------------------+---------+------+------+----------+------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+----------------------------------------------------------------------------+----------------------------+---------+------+------+----------+------------------------------------+
| 1 | SIMPLE | images | NULL | range | index_images_on_imageable_type_and_imageable_id,index_images_on_created_at | index_images_on_created_at | 8 | NULL | 4500 | 50.00 | Using index condition; Using where |
+----+-------------+--------+------------+-------+----------------------------------------------------------------------------+----------------------------+---------+------+------+----------+------------------------------------+
But if I advance one more day, it will choose the imageable_type
and imageable_id
index, scanning 50% of the table rather than 12% of it:
EXPLAIN SELECT COUNT(*)
FROM images
WHERE imageable_type = 'PhotoAlbum'
AND created_at < '2021-11-05';
+----+-------------+--------+------------+------+----------------------------------------------------------------------------+-------------------------------------------------+---------+-------+-------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+----------------------------------------------------------------------------+-------------------------------------------------+---------+-------+-------+----------+-------------+
| 1 | SIMPLE | images | NULL | ref | index_images_on_imageable_type_and_imageable_id,index_images_on_created_at | index_images_on_imageable_type_and_imageable_id | 768 | const | 25048 | 11.98 | Using where |
+----+-------------+--------+------------+------+----------------------------------------------------------------------------+-------------------------------------------------+---------+-------+-------+----------+-------------+
This can cause noticeable slowdowns on larger datasets.
So what would be an alternative approach?
One possible option is to use a shared images
table but explicitly define 2 join tables that have different indexing and constraints:
create_table :images do |t|
t.timestamps
end
create_table :photo_albums do |t|
t.timestamps
end
create_table :photo_album_images do |t|
t.bigint :photo_album_id, null: false
t.bigint :image_id, null: false
t.timestamps
t.index [:photo_album_id, :image_id], unique: true
t.index :image_id
t.index :created_at
end
add_foreign_key :photo_album_images, :photo_albums
add_foreign_key :photo_album_images, :images
create_table :user_profiles do |t|
t.timestamps
end
create_table :user_profile_images do |t|
t.bigint :user_profile_id, null: false
t.bigint :image_id, null: false
t.timestamps
t.index :user_profile_id, unique: true
t.index :image_id
end
add_foreign_key :user_profile_images, :user_profiles
add_foreign_key :user_profile_images, :images
Then my models would use normal associations:
class Image < ApplicationRecord
end
class PhotoAlbumImage < ApplicationRecord
belongs_to :image
belongs_to :photo_album
end
class PhotoAlbum < ApplicationRecord
has_many :photo_album_images
has_many :images, through: :photo_album_images
end
class UserProfileImage < ApplicationRecord
belongs_to :user_profile
belongs_to :image
end
class UserProfile < ApplicationRecord
has_one :user_profile_image
has_one :image, through: :user_profile_image
end