Data Practices:

2.7 SQL for Non-Technical People

[Use arrow keys to navigate, "s" to show speaker notes, and "f" for fullscreen.]

PDF Print

With Notes

Topics Covered

  1. Why Databases?
  2. Database Jargon
  3. Accessing Data
  4. Common Errors
  5. Advanced Concepts

SQL is not intuitive!

  • Many people do not understand what a database is or how it is different than an excel sheet
  • Data manipulation is neither natural nor visual
  • Joins, Aggregations, and Subqueries are an acquired taste
  • Errors are hard to parse
  • Don't get frustrated!
  • just understanding the concepts is powerful, even if you aren't creating queries yourself

Why Databases?

  • Size: "Unlimited"
  • Accuracy: When updating data, it is much less error prone
  • Security: Nuanced permissioning settings
  • Queryable: Allows you to "ask questions" of your data

Database Jargon (labels)

Spreadsheets vs Databases

Spreadsheet

Database

Sheet

Table

Column

Field

Row

Record

This is a table

Field

Field

Field

Record

data

data

Record

data

data

Database Jargon (keys)

  • Primary Key - used to uniquely identify a record.
  • Foreign Key - used to refer to a primary key in another table.
  • Schema - name for whole set of tables and their relationships.

Database Jargon (data types)

  • VARCHAR: Character string having variable length.
    • This data type treats everything as text. You can place numbers in here but adding them together would just attach them which is what you would expect with text.
    • a + b = ab 1 + 2 = 12
  • INTEGER: Whole numbers.
    • This data type only allows whole numbers inside of it. All math functions can be applied to it. Text cannot be placed here.
    • 1 + 2 = 3
  • FLOAT: Numbers with decimal points.
    • This data type works the same as INTEGER but allows decimal values.
    • 1.1 + 2.2 = 3.3
  • BOOLEAN: True or False value.
    • This data type can be either True or False. This is useful for setting binary characteristics or status about some data.
    • Is the house on fire? True or False (1 or 0)?

Database Jargon (data types - continued)

  • DATE: Date / Time value.
    • This data type contains date and/or time information which allows you to perform calculations related to date/time.
    • How many dats are between 1/1/19 and 2/4/19? (34)
  • NULL: This means there is no value in the cell (and this can show up in any of the above types)
    • This data type is useful for showing the absence of data. A blank cell and a Null cell are different. Null is ignored by aggregations and it is easy to filter for. Blank values or 0s would be included in aggregations and can be more difficult to detect.
    • What is the Average of 1 + NULL + 5? (6 / 2 = 3)
    • What is the Average of 1 + 0 + 5? (6 / 3 = 2)

Accessing Data

Try your hand at additional practice at The Data School by Chart.io.

Combining Data Tables - SQL Joins Explained

A JOIN clause in SQL is used to combine rows from two or more tables, based on a related column between them.

Inner Join

Left Join

Full Outer Join

Union

Cross Join

Join Practice

For more practice with JOINs, try the Join walkthrough and code on live data at data.world.

Common Syntax Errors

Spelling

  • Check table spelling
  • Check column spelling
  • Check value spelling


SELECT *

FROM Facbook

<- Wrong column name

Where Names = "Jeffrey"

<- Wrong column name, wrong value name

Facebook

Name

# of Friends

Matt

300

Lisa

500

Jeff

600

Sarah

400

Common Syntax Errors

Single vs Double Quotes

Single quotes(' '), double quotes(" "), and backticks (` `) can all have different meanings in SQL.

  • Single quotes often indicate a text string
  • SELECT name, age FROM facebook WHERE name = 'Matt'

Single Quotes

facebook
Name age
Matt 27
Lisa 30
Jeff 31
Sarah 40

  • Double quotes often indicate a table name or a field
  • SELECT "age" FROM "facebook"

Double Quotes

facebook
Name age
Matt 27
Lisa 30
Jeff 31
Sarah 40

Common Syntax Errors

Capitalization

  • Doesn't matter unless you put it in quotes, then it must match exactly.
    • SELECT NaMe FROM facebooK
    • SELECT "Name" FROM "facebook"
  • Common convention is to put SQL commands in all caps
    • SELECT, FROM, WHERE, etc...

Data Types

  • Sometimes when a number is stored in a database it is stored as a string.
  • This means when we refer to it in a query we must put single quotes around it.
  • SELECT * FROM facebook WHERE age = '27'

Common Syntax Errors

When writing a SQL query you have to place the clauses you use in this order:

  • SELECT
  • FROM
  • JOIN
  • WHERE
  • GROUP BY
  • HAVING
  • ORDER BY
  • LIMIT

0 Rows Returned

JOINing on the wrong column

  • Column names can be misleading, check which you are using.

0 Rows Returned (continued)

Over Filtering

  • Check to see if your filter is the problem by commenting it out using:
  • SELECT * FROM facebook -- WHERE # of Friends > 1000

SQL Practice

Continuing Concepts

Aggregation 1

Whole Table

Aggregation 2

Group By

Aggregation 3

Sum

Aggregation 4

Average

Subquery

SELECT State, SUM(# of friends)
FROM (SELECT State, # of friends FROM facebook)
GROUP BY State

Common Table Expression

Case When

Window Functions 1

SELECT 'Day', 'Mile Driving',
SUM('Miles Driving') OVER(ORDER BY 'Day') AS 'Running Total'
FROM 'Running total mileage visual'

Window Functions 2

SELECT 'Day', 'Daily Revenue',
AVG('Daily Revenue') OVER(ORDER BY 'Day' ROWS 2 PRECEDING)
AS '3 Day Average' FROM 'Running Average Example'

Window Functions 3

SELECT 'Day', 'Weekend', 'Daily Revenue',
SUM('Daily Revenue') OVER(PARTITION BY 'Weekend') AS 'Total'
FROM 'Partitioned Total Example'

Want to run a workshop like this at your company?

community@data.world



Thanks to Chart.io for contributing the original content for this course!



Don't forget to sign the values and principles! https://datapractices.org/manifesto