Combinatorial row pairing in databases

While working on a database design I came across the problem of finding combinations or ‘cartesian product’ between all the rows between one table to another table, but not repeating the same permutations.

Take the ‘Teams’ table below which lists four cricket teams. We want to create a combination list of matches between each teams.

| id | Name         |
|  1 | India        |
|  2 | England      |
|  3 | South Africa |
|  4 | Australia    |

Note that we are looking for combinations of teams and not permutations. So ‘India vs. England’ is the same as ‘England vs. India’. A MySQL query for the same using a ‘Cartesian product’ also known as ‘cross join’ is given below. In a ‘cross join’ there is a join for each row of one table to every row of another table. In our example both the tables refer to the same table (Teams).

SELECT AS Team1ID, AS Team1Name,                                                                  AS Team2Id, AS Team2Name
FROM Teams AS Team1
	CROSS JOIN Teams AS Team2
	ON >;

Which will return the following result – all teams paired with each other for a total of 6 matches.

| Team1ID | Team1Name    | Team2Id | Team2Name    |
|       1 | India        |       2 | England      |
|       1 | India        |       3 | South Africa |
|       1 | India        |       4 | Australia    |
|       2 | England      |       3 | South Africa |
|       2 | England      |       4 | Australia    |
|       3 | South Africa |       4 | Australia    |
6 rows in set (0.02 sec)

The formula for calculating combinations is shown below. This is just to check if you are on the right track. For a small number of rows you can do a manual check, but for row which runs into dozens or more it is better to check the results with a formula – which calculates the combinations for a set of N items taken K at a time.

So for our 4 team set this will give the result of 6.

For a 8 team set this will give a total combination games of 28.

Note that a simple cross join will return a permutation of teams with a total of 12 rows. But as we are using the same table for referencing both tables we also tend to add additional 4 rows than required for a total of 16 rows for our 4 teams, because the cross join will also pair a country with itself. So to avoid that we use the additional comparison operator in our query.

ON >;