UNION Multiple Tables in SQL

We are going to learn how to query two or more tables to obtain requested results, which can have SELECTWHERE or HAVING conditions, and other operators.

To do this, the UNION statement will be used, instead of JOIN, to which there is a dedicated article.

This article is going to use Microsoft SQL Server and SSMS.

UNION

The UNION statement in SQL allows you to combine records from two or more tables in a database.

In special cases a table can join itself, producing a self-join, SELF UNION.

Mathematically, UNION is a relational composition, the fundamental operation in relational algebra, and, generalizing, it is a composition function.

In the standard SQL data query language, the UNION operator is used to combine the results of two independent queries, returning as a joint result all the records obtained by both.

Obviously, in order to perform the union, both queries must return exactly the same data types.

What is not necessary is that they are obtained in the same way.

That is, some could be direct fields obtained from a table query, and the second part could be calculated fields. As long as they have the same name and type there is no problem.

So in the simplest case, for example, if you have two information tables with data from Suppliers, Names and Invoice Number, something like this (in two different tables):

You can write a query like this:


SELECT TOP(10) SupplierID, SupplierName FROM wideworldimporters.Purchasing.Suppliers
Union
SELECT TOP(10) SupplierID, SupplierInvoiceNumber FROM wideworldimporters.Purchasing.SupplierTransactions

Where you will obtain, if it exists, the invoice number, for each client obtained separately in the first query, if the invoice does not exist, then this data will not be obtained.

An important thing to consider is that UNION prevents duplicate records from being in the final result. That is, it removes duplicate rows where all field values ​​are the same.

However, if  you want to get all the data, including the duplicate rows, use UNION ALL.

Finally, UNION ALL offers much better resource performance than UNION.

The reason is that by not having to discriminate between repeated records, the process is much simpler.

In regular queries with few records you will not notice it, but in large queries with many possible results you will.

In any case, if you are sure that the two result sets that you are going to join do not have common data (or you do not care), it is better to always use UNION ALL for this reason.

Leave a Reply

Your email address will not be published.