Black holes, planets, and SQL

A very basic introduction to SQL

Karen Warmbein
4 min readMay 2, 2020

I love astronomy. From galaxies and black holes to comets and quasars — it is all so interesting! As an undergrad, I studied Astrophysics at a leading research university. You can head on over here to this post to read more about my transition from astrophile to newly-minted data scientist.

In this post, I want to take my love of astronomy and apply it to data retrieval.

Data can be stored in a variety of ways; one instance is in databases. These databases have two-dimensional tables of stored data, with each table being referred to by a name. Most of the actions you need to perform on these tables are done with SQL statements on a table(s). SQL (Structured Query Language) is a standard language for accessing and manipulating databases.

According to W3Schools, SQL can:

  • execute queries against a database and retrieve data,
  • insert, update, and delete records in a database,
  • create new databases, and
  • create new tables, stored procedures, and views in a database.

I want to focus on the first bullet point — SQL can execute queries against a database and retrieve data. Three points before we get started:

  1. SQL is (usually) pronounced “sequel”
  2. SQL is NOT case sensitive: SELECT (which I will use soon) is the same as select
  3. I like to write my SQL statements in multiple lines, instead of just one line. I find that it is easier to read the query this way.

The SELECT Statement

Consider the following table with the main color and mass of each planet in the Solar System. (The unit of mass is in Earth masses. So, Venus is 0.8 times the mass of the Earth, and Saturn is 15 times the mass of the Earth.)

The SELECT statement is the most basic SQL statement that returns data from a table in the database. Here is an example:

SELECT *
FROM planets;

The “*” indicates that we want all the data from all the columns in the table. In words, this query reads: “Select all the data from the planets table.” The result is:

Mercury|gray|0.6
Venus|yellow|0.8
Earth|blue|1.0
Mars|red|0.1
Jupiter|orange|318.0
Saturn|light yellow|95.0
Uranus|light blue|15.0
Neptune|dark blue|17.0

Note that the pipe (|) separates one cell from the next.

Say we only want data from two columns, planet name and planet color, from the table. The SQL query would read:

SELECT name, color
FROM planets;

Output:

Mercury|gray
Venus|yellow
Earth|blue
Mars|red
Jupiter|orange
Saturn|light yellow
Uranus|light blue
Neptune|dark blue

The WHERE Clause

SQL has a way to filter results with the WHERE clause. Say we only wanted information about blue or red planets.

SELECT *
FROM planets
WHERE color = ‘blue’ OR color = ‘red’;

Output:

Earth|blue|1.0
Mars|red|0.1

Note that in the table, there are two planets that are not simply blue. Meaning, Uranus (our favorite planet!) is light blue and Neptune is dark blue. To include the extra descriptions of colors, we can update the query to:

SELECT *
FROM planets
WHERE color LIKE ‘%blue’ OR color = ‘red’;

Output:

Earth|blue|1.0
Mars|red|0.1
Uranus|light blue|15.0
Neptune|dark blue|17.0

The LIKE searches for a pattern, namely one that has the pattern of ‘blue’. % is a wildcard that allows a space to appear before the instance of blue.

Here’s more practice with the WHERE clause. Let’s say we have a table with characteristics of black holes, namely, the name and mass.

To read the mass in this table, use scientific notation: base_mass x 10^power. For example, the Sombrero galaxy is 1x10⁹ times the mass of the Sun.

This table has four entries with three of those entries being supermassive black holes. (Cygnus X-1 is considered a stellar-mass black hole, because of its lower mass.) To get the three galaxies with supermassive black holes:

SELECT name
FROM black_holes
WHERE power > 0;

Output:

M104 - Sombrero Galaxy
M31 - Andromeda Galaxy
NGC 4889

I want to list the results in descending order. To do that, I just need to update the query with ORDER BY. The ORDER BY clause allows for listing the results in ascending (ASC) order or descending (DESC) order.

SELECT name
FROM black_holes
WHERE power > 0
ORDER BY power DESC;

Output:

NGC 4889
M104 - Sombrero Galaxy
M31 - Andromeda Galaxy

The LIMIT Clause

Finally, If I want the most massive black hole listed in the table, I need to use a cleaver method with LIMIT. Think about this — to get the most massive, order the black holes by mass, but in descending order. Then, the most massive one will be listed first. LIMIT 1 will return the item that is at the top of the list which, in our example, is the most massive.

SELECT name
FROM black_holes
WHERE power > 0
ORDER BY power DESC
LIMIT 1;

Output:

NGC 4889

SELECT, WHERE, and LIMIT — these are very basic commands in SQL. In future posts, I will talk about join statements, creating and editing tables, and more!

--

--