SQL Queries 01

For more SQL examples in the SQLite3 dialect, seee SQLite3 tutorial.

For a deep dive, see SQL Queries for Mere Mortals.

Data

[1]:
%load_ext sql
[2]:
%sql sqlite:///data/faculty.db
[3]:
%%sql

SELECT * FROM sqlite_master WHERE type='table';
 * sqlite:///data/faculty.db
Done.
[3]:
type name tbl_name rootpage sql
table person person 2 CREATE TABLE person (
"index" BIGINT,
person_id BIGINT,
first TEXT,
last TEXT,
age BIGINT,
height FLOAT,
weight BIGINT,
country_id TEXT,
gender_id BIGINT
)
table confidential confidential 18 CREATE TABLE confidential (
"index" BIGINT,
person_id BIGINT,
salary BIGINT
)
table person_language person_language 33 CREATE TABLE person_language (
"index" BIGINT,
person_id BIGINT,
language_id BIGINT
)
table language language 50 CREATE TABLE language (
"index" BIGINT,
language_id BIGINT,
language_name TEXT
)
table gender gender 55 CREATE TABLE gender (
"index" BIGINT,
gender_id BIGINT,
gender TEXT
)
table country country 57 CREATE TABLE country (
"index" BIGINT,
country_id TEXT,
country TEXT,
nationality TEXT
)

Note: You can save results as a variable

[4]:
%%sql master <<

SELECT * FROM sqlite_master WHERE type='table'
 * sqlite:///data/faculty.db
Done.
Returning data to local variable master
[5]:
master.DataFrame()
[5]:
type name tbl_name rootpage sql
0 table person person 2 CREATE TABLE person (\n\t"index" BIGINT, \n\tp...
1 table confidential confidential 18 CREATE TABLE confidential (\n\t"index" BIGINT,...
2 table person_language person_language 33 CREATE TABLE person_language (\n\t"index" BIGI...
3 table language language 50 CREATE TABLE language (\n\t"index" BIGINT, \n\...
4 table gender gender 55 CREATE TABLE gender (\n\t"index" BIGINT, \n\tg...
5 table country country 57 CREATE TABLE country (\n\t"index" BIGINT, \n\t...

Basic Structure

SELECT DISTINCT value_expression AS alias
FROM tables AS alias
WHERE predicate
ORDER BY value_expression

Types

  • Character (Fixed width, variable width)

  • National Character (Fixed width, variable width)

  • Binary

  • Numeric (Exact, Arpproximate)

  • Boolean

  • DateTime

  • Interval

CHAR and NCHAR are vendor-dependent. Sometimes they mean the same thing, and sometimes CHAR means bytes and NCHAR means Unicode.

The SQL standard specifies that character strings and datetime literals are enclosed by single quotes. Two single quotes wihtin a string is intepreted as a literal single quote.

'Gilligan''s island'

The CAST function

CAST(X as CHARACTER(10))

Value expression

  • Literal

  • Column reference

  • Function

  • CASES

  • (Value expression)

which may be prefixed with unary operators - and + and combined with binary operators appropriate for the data type.

Literal

[6]:
%sql SELECT 23
 * sqlite:///data/faculty.db
Done.
[6]:
23
23

Column reference

[7]:
%sql SELECT first, last FROM person LIMIT 3
 * sqlite:///data/faculty.db
Done.
[7]:
first last
Aaron Alexander
Aaron Kirby
Abram Allen

Function

[8]:
%sql SELECT count(*) FROM person
 * sqlite:///data/faculty.db
Done.
[8]:
count(*)
1523

Cases

[9]:
%%sql

SELECT first, last, age,
CASE
    WHEN age < 50 THEN 'Whippernapper'
    WHEN age < 70 THEN 'Old codger'
    ELSE 'Dinosaur'
END comment
FROM person
LIMIT 4
 * sqlite:///data/faculty.db
Done.
[9]:
first last age comment
Aaron Alexander 54 Old codger
Aaron Kirby 59 Old codger
Abram Allen 41 Whippernapper
Abram Boyer 45 Whippernapper

Value expression

[10]:
%%sql

SELECT first || ' ' || last AS name, age, age - 10 AS fake_age
FROM person
LIMIT 3
 * sqlite:///data/faculty.db
Done.
[10]:
name age fake_age
Aaron Alexander 54 44
Aaron Kirby 59 49
Abram Allen 41 31

Bineary operators

Concatenation

A || B

Mathematical

A + B
A - B
A * B
A / B

Data and time arithmetic

'2018-08-29' + 3
'11:59' + '00:01'
[11]:
%%sql

SELECT DISTINCT language_name
FROM language
LIMIT 5;
 * sqlite:///data/faculty.db
Done.
[11]:
language_name
PHP
Clojure
Dylan
GNU Octave
D

Sorting

SELECT DISTINCT value_expression AS alias
FROM tables AS alias
ORDER BY value_expression
[12]:
%%sql

SELECT DISTINCT language_name
FROM language
ORDER BY language_name ASC
LIMIT 5;
 * sqlite:///data/faculty.db
Done.
[12]:
language_name
ASP
Assembly
AutoIt
Awk
Bash
[13]:
%%sql

SELECT DISTINCT language_name
FROM language
ORDER BY random()
LIMIT 5;
 * sqlite:///data/faculty.db
Done.
[13]:
language_name
Fortran
CoffeeScript
Julia
ECMAScript
Awk

Filtering

For efficiency, place the most stringent filters first.

SELECT DISTINCT value_expression AS alias
FROM tables AS alias
WHERE predicate
ORDER BY value_expression

Predicates for filtering rows

  • Comparison operators (=, <>, <, >, <=, >=)

  • BETWEEN start AND end

  • IN(A, B, C)

  • LIKE

  • IS NULL

  • REGEX

Use NOT prefix for negation

Combining predicates

AND
OR

USe parenthesis to indicate order of evaluation for compound statements.

[14]:
%%sql

SELECT first, last, age
FROM person
WHERE age BETWEEN 16 AND 17
LIMIT 5;
 * sqlite:///data/faculty.db
Done.
[14]:
first last age
Antoine Beard 16
Augustine Mejia 16
Boris Mejia 16
Brain Haney 16
Burl Mayo 17

Joins

Joins combine data from 1 or more tables to form a new result set.

Note: To join on multiple columns just use AND in the ON expression

Natural join

Uses all common columns in Tables 1 and 2 for JOIN

FROM Table1
NATURAL INNER JOIN Table 2

Inner join

General form of INNER JOIN uisng ON

FROM Table1
INNER JOIN Table2
ON Table1.Column = Table2.Column

Note: This is equivalent to an EQUIJOIN but more flexible in that additional JOIN conditions can be specified.

SELECT *
FROM Table1, Table2
WHERE Table1.Column = Table2.Column

If there is a common column in both tables

FROM Table1
INNER JOIN Table2
USING Column

Joining more than two tables

From (Table1
      INNER JOIN Table2
      ON Table1.column1 = Table2.Column1)
      INNER JOIN Table3
      ON Table3.column2 = Table2.Column2

Outer join

General form of OUTER JOIN uisng ON

FROM Table1
RIGHT OUTER JOIN Table2
ON Table1.Column = Table2.Column
FROM Table1
LEFT OUTER JOIN Table2
ON Table1.Column = Table2.Column
FROM Table1
FULL OUTER JOIN Table2
ON Table1.Column = Table2.Column
[15]:
%%sql

SELECT first, last, language_name
FROM person
INNER JOIN person_language
    ON person.person_id = person_language.person_id
INNER JOIN language
    ON language.language_id = person_language.language_id
LIMIT 10;
 * sqlite:///data/faculty.db
Done.
[15]:
first last language_name
Aaron Alexander Haskell
Aaron Kirby GNU Octave
Aaron Kirby haXe
Aaron Kirby Falcon
Abram Allen TypeScript
Abram Boyer Io
Abram Boyer Lua
Abram Boyer Falcon
Adan Brown F#
Adolph Dalton Dart

Set operations

SELECT a, b
FROM table1
SetOp
SELECT a, b
FROM table2

wehre SetOp is INTERSECT, EXCEPT, UNION or UNION ALL.

Intersection

INTERSECT

Alternative using INNER JOIN

Union

UNION
UNION ALL (does not eliminate duplicate rows)

Difference

EXCEPT

Alternative using OUTER JOIN with test for NULL

[16]:
%%sql

DROP VIEW IF EXISTS language_view;
CREATE VIEW language_view AS
SELECT first, last, language_name
FROM person
INNER JOIN person_language
    ON person.person_id = person_language.person_id
INNER JOIN language
    ON language.language_id = person_language.language_id
;
 * sqlite:///data/faculty.db
Done.
Done.
[16]:
[]
[17]:
%%sql

SELECt *
FROM language_view
LIMIT 10;
 * sqlite:///data/faculty.db
Done.
[17]:
first last language_name
Aaron Alexander Haskell
Aaron Kirby GNU Octave
Aaron Kirby haXe
Aaron Kirby Falcon
Abram Allen TypeScript
Abram Boyer Io
Abram Boyer Lua
Abram Boyer Falcon
Adan Brown F#
Adolph Dalton Dart
[18]:
%%sql

SELECt *
FROM language_view
WHERE language_name = 'Python'
UNION
SELECt *
FROM language_view
WHERE language_name = 'Haskell'
LIMIT 10;
 * sqlite:///data/faculty.db
Done.
[18]:
first last language_name
Aaron Alexander Haskell
Andree Douglas Haskell
Arlie Terrell Python
Boyd Blackwell Haskell
Buck Howe Haskell
Carlton Richard Haskell
Carylon Zamora Python
Clarisa Rodgers Python
Dinorah O'brien Haskell
Dorian Lloyd Haskell
[19]:
%%sql

SELECt *
FROM language_view
WHERE language_name IN ('Python', 'Haskell')
ORDER BY first
LIMIT 10;
 * sqlite:///data/faculty.db
Done.
[19]:
first last language_name
Aaron Alexander Haskell
Andree Douglas Haskell
Arlie Terrell Python
Boyd Blackwell Haskell
Buck Howe Haskell
Carlton Richard Haskell
Carylon Zamora Python
Clarisa Rodgers Python
Dinorah O'brien Haskell
Dorian Lloyd Haskell

Aggregate functions

COUNT
MIN
MAX
AVG
SUM
[20]:
%%sql

SELECT count(language_name)
FROM language_view;
 * sqlite:///data/faculty.db
Done.
[20]:
count(language_name)
2297

Grouping

SELECT a, MIN(b) AS min_b, MAX(b) AS max_b, AVG(b) AS mean_b
FROM table
GROUP BY a
HAVING mean_b > 5

The HAVING is analagous to the WHERE clause, but filters on aggregate conditions. Note that the WHERE statement filters rows BEFORE the grouping is done.

Note: Any variable in the SELECT part that is not an aggregte function needs to be in the GROUP BY part.

SELECT a, b, c, COUNT(d)
FROM table
GROUP BY a, b, c
[21]:
%%sql

SELECT language_name, count(*) AS n
FROM language_view
GROUP BY language_name
HAVING n > 45;
 * sqlite:///data/faculty.db
Done.
[21]:
language_name n
AutoIt 61
Bash 48
ECMAScript 48
GNU Octave 49
JavaScript 48
Perl 55
PowerShell 50
Prolog 50

The CASE switch

Simple CASE

SELECT name,
(CASE sex
 WHEN 'M' THEN 1.5*dose
 WHEN 'F' THEN dose
 END) as adjusted_dose
FROM table

Searched CASE

SELECT name,
(CASE
 WHEN sex = 'M' THEN 1.5*dose
 WHEN sex = 'F' THEN dose
 END) as adjusted_dose
FROM table
[22]:
%%sql

SELECT first, last, language_name,
(CASE
    WHEN language_name LIKE 'H%' THEN 'Hire'
    ELSE 'FIRE'
END
) AS outcome
FROM language_view
LIMIT 10;
 * sqlite:///data/faculty.db
Done.
[22]:
first last language_name outcome
Aaron Alexander Haskell Hire
Aaron Kirby GNU Octave FIRE
Aaron Kirby haXe Hire
Aaron Kirby Falcon FIRE
Abram Allen TypeScript FIRE
Abram Boyer Io FIRE
Abram Boyer Lua FIRE
Abram Boyer Falcon FIRE
Adan Brown F# FIRE
Adolph Dalton Dart FIRE

User defined functions (UDF)

[23]:
import sqlite3
[24]:
import random
import statistics
[25]:
con = sqlite3.connect(":memory:")
[26]:
con.create_function("rnorm", 2, random.normalvariate)
[27]:
cr = con.cursor()
[28]:
cr.execute('CREATE TABLE foo(num REAL);')
[28]:
<sqlite3.Cursor at 0x1165811f0>
[29]:
cr.execute("""
INSERT INTO foo(num)
VALUES
(rnorm(0,1)),
(rnorm(0,1)),
(rnorm(0,1)),
(rnorm(0,1)),
(rnorm(0,1)),
(rnorm(0,1)),
(rnorm(0,1)),
(rnorm(0,1))
""")
[29]:
<sqlite3.Cursor at 0x1165811f0>
[30]:
cr.execute('SELECT * from foo')
cr.fetchall()
[30]:
[(1.951537604398419,),
 (1.5075295059349978,),
 (-2.118904989108828,),
 (0.2542001383796541,),
 (0.411832441172589,),
 (1.1235289209771866,),
 (0.9641950352497476,),
 (0.3895096522469744,)]
[31]:
class Var:
    def __init__(self):
        self.acc = []

    def step(self, value):
        self.acc.append(value)

    def finalize(self):
        if len(self.acc) < 2:
            return 0
        else:
            return statistics.variance(self.acc)
[32]:
con.create_aggregate("Var", 1, Var)
[33]:
cr.execute('SELECT Var(num) FROM foo')
cr.fetchall()
[33]:
[(1.5194560216303632,)]
[34]:
con.close()