Databases and SQL
Sorting and Removing Duplicates
- Write queries that display results in a particular order.
- Write queries that eliminate duplicate values from data.
In beginning our examination of the Antarctic data, we want to know:
- what kind of quantity measurements were taken at each site;
- which scientists took measurements on the expedition;
- the sites where each scientist took measurements
To determine which measurements were taken at each site, we can examine the
Survey table. Data is often redundant, so queries often return redundant information. For example, if we select the quantitites that have been measured from the
Survey table, we get this:
SELECT quant FROM Survey;
This result makes it difficult to see all of the different types of
quant in the Survey table. We can eliminate the redundant output to make the result more readable by adding the
DISTINCT keyword to our query:
SELECT DISTINCT quant FROM Survey;
If we want to determine which sites have which quant measurement, we can use the
DISTINCT keyword on multiple columns. If we select more than one column, the distinct pairs of values are returned:
SELECT DISTINCT taken, quant FROM Survey;
Notice in both cases that duplicates are removed even if the rows they come from didn’t appear to be adjacent in the database table.
Our next task is to identify the scientists on the expedition by looking at the
Person table. As we mentioned earlier, database records are not stored in any particular order. This means that query results aren’t necessarily sorted, and even if they are, we often want to sort them in a different way, e.g., by their identifier instead of by their personal name. We can do this in SQL by adding an
ORDER BY clause to our query:
SELECT * FROM Person ORDER BY ident;
By default, results are sorted in ascending order (i.e., from least to greatest). We can sort in the opposite order using
DESC (for “descending”):
SELECT * FROM Person ORDER BY ident DESC;
(And if we want to make it clear that we’re sorting in ascending order, we can use
ASC instead of
In order to look at which scientist measured quantities at each site, we can look again at the
Survey table. We can also sort on several fields at once. For example, this query sorts results first in ascending order by
taken, and then in descending order by
person within each group of equal
SELECT taken, person, quant FROM Survey ORDER BY quant ASC, person DESC;
This query gives us a good idea of which scientist was at which site, and what measurements they performed while they were there.
Looking at the table, it seems like some scientists specialized in certain kinds of measurements. We can examine which scientists performed which measurements by selecting the appropriate columns and removing duplicates.
SELECT DISTINCT quant, person FROM Survey ORDER BY quant ASC;
Finding Distinct Dates
Write a query that selects distinct dates from the
Displaying Full Names
Write a query that displays the full names of the scientists in the
Person table, ordered by family name.