PostgreSQL: Migrate has many Relation to JSONB
NOTE: I am typing everything here on psql
feel free to create a file and run the sql.
I have a structure in my database similar to a has many relation in Rails.
In my case it is one Customer can have many Phones and the other way around: one Phone belongs to one Customer.
As time goes by we realized there is no need to set a new table for that, there are PROs and CRONs of both ways, I am not talking about that in this post, feel free to leave a comment if you want 😊.
Test data
For my example I will use this SQL to create and insert some data:
create database mexample;
\c mexample;
create table customers (id integer, name text);
create table phones (id integer, number text, customer_id integer);
insert into customers (id, name) values (1, 'Alice');
insert into customers (id, name) values (2, 'Bob');
insert into customers (id, name) values (3, 'Carl');
insert into phones (id, number, customer_id) values (1, 'alice#1', 1);
insert into phones (id, number, customer_id) values (2, 'alice#2', 1);
insert into phones (id, number, customer_id) values (3, 'bob#1', 2);
- Alice has two phone numbers
- Bob has one, and
- Carl has no one
Migrating
Time to migrate the data.
The first step is to add the jsonb column on customers
:
alter table customers add column phones jsonb default '[]'::jsonb;
After this SQL it will be like:
mexample=# select * from customers;
id | name | phones
----+-------+--------
1 | Alice | []
2 | Bob | []
3 | Carl | []
(3 rows)
Now the trickiest part, and here you have to be very careful:
- First, if you have a lot of customers you have to do this in batches, and I recommend that you create a status column to flag the customers you have migrated. I usually start with a batch of 200 and increase only if everything looks fine on prod and in the database
- Second, the app itself (Ruby) expects
phones
to be by default an empty array[]
, the app literally breaks if it is not
The point (2) was important while migrating, if the data was created within the app itself it guarantees it is an empty array. (Don't worry we fixed that later 😉).
Now the SQL to migrate:
do
$$
declare r record;
begin
for r in select customers.id from customers
loop
update customers set phones = (
select
to_json(array_agg(p))
from
customers c
inner join
phones p on p.customer_id = c.id
where
p.customer_id = r.id
group by
p.customer_id
) where customers.id = r.id;
end loop;
end
$$;
After this SQL it will be like this:
mexample=# select * from customers;
id | name | phones
----+-------+------------------------------------------------------------------------------------------------------
1 | Alice | [{"id": 1, "number": "alice#1", "customer_id": 1}, {"id": 2, "number": "alice#2", "customer_id": 1}]
2 | Bob | [{"id": 3, "number": "bob#1", "customer_id": 2}]
3 | Carl |
(3 rows)
In my case I had to run this:
update customers set phones = '[]'::json where phones is null;
In order to make it look like this:
mexample=# select * from customers;
id | name | phones
----+-------+------------------------------------------------------------------------------------------------------
1 | Alice | [{"id": 1, "number": "alice#1", "customer_id": 1}, {"id": 2, "number": "alice#2", "customer_id": 1}]
2 | Bob | [{"id": 3, "number": "bob#1", "customer_id": 2}]
3 | Carl | []
(3 rows)
Once more in my case phones
has to be an empty array and not null
.
Don't forget to adapt that SQL to your case:
- Maybe you have thousands of rows in the database for both tables
- Maybe it is
companies
andaddresses
- Maybe you have a polymorphic association