NOTE: I am typing everything here on psql feel free to create a file and run the sql.

I have a structure in my database similar to a has many relation in Rails.

In my case it is one Customer can have many Phones and the other way around: one Phone belongs to one Customer.

As time goes by we realized there is no need to set a new table for that, there are PROs and CRONs of both ways, I am not talking about that in this post, feel free to leave a comment if you want 😊.

Test data

For my example I will use this SQL to create and insert some data:

create database mexample;
\c mexample;
create table customers (id integer, name text);
create table phones (id integer, number text, customer_id integer);
insert into customers (id, name) values (1, 'Alice');
insert into customers (id, name) values (2, 'Bob');
insert into customers (id, name) values (3, 'Carl');
insert into phones (id, number, customer_id) values (1, 'alice#1', 1);
insert into phones (id, number, customer_id) values (2, 'alice#2', 1);
insert into phones (id, number, customer_id) values (3, 'bob#1', 2);
  • Alice has two phone numbers
  • Bob has one, and
  • Carl has no one

Migrating

Time to migrate the data.

The first step is to add the jsonb column on customers:

alter table customers add column phones jsonb default '[]'::jsonb;

After this SQL it will be like:

mexample=# select * from customers;
 id | name  | phones
----+-------+--------
  1 | Alice | []
  2 | Bob   | []
  3 | Carl  | []
(3 rows)

Now the trickiest part, and here you have to be very careful:

  1. First, if you have a lot of customers you have to do this in batches, and I recommend that you create a status column to flag the customers you have migrated. I usually start with a batch of 200 and increase only if everything looks fine on prod and in the database
  2. Second, the app itself (Ruby) expects phones to be by default an empty array [], the app literally breaks if it is not

The point (2) was important while migrating, if the data was created within the app itself it guarantees it is an empty array. (Don't worry we fixed that later 😉).

Now the SQL to migrate:

do
$$
declare r record;
begin
  for r in select customers.id from customers
  loop
    update customers set phones = (
select
  to_json(array_agg(p))
from
  customers c
inner join
  phones p on p.customer_id = c.id
where
  p.customer_id = r.id
group by
  p.customer_id
) where customers.id = r.id;
  end loop;
end
$$;

After this SQL it will be like this:

mexample=# select * from customers;
 id | name  |                                                phones
----+-------+------------------------------------------------------------------------------------------------------
  1 | Alice | [{"id": 1, "number": "alice#1", "customer_id": 1}, {"id": 2, "number": "alice#2", "customer_id": 1}]
  2 | Bob   | [{"id": 3, "number": "bob#1", "customer_id": 2}]
  3 | Carl  |
(3 rows)

In my case I had to run this:

update customers set phones = '[]'::json where phones is null;

In order to make it look like this:

mexample=# select * from customers;
 id | name  |                                                phones
----+-------+------------------------------------------------------------------------------------------------------
  1 | Alice | [{"id": 1, "number": "alice#1", "customer_id": 1}, {"id": 2, "number": "alice#2", "customer_id": 1}]
  2 | Bob   | [{"id": 3, "number": "bob#1", "customer_id": 2}]
  3 | Carl  | []
(3 rows)

Once more in my case phones has to be an empty array and not null.

Don't forget to adapt that SQL to your case:

  • Maybe you have thousands of rows in the database for both tables
  • Maybe it is companies and addresses
  • Maybe you have a polymorphic association