04 SQL

Table of Contents

Overview

SQL (Structured Query Language) is the standard language for interacting with relational database systems. It provides a declarative approach to data manipulation, allowing users to specify what data they want without detailing how to retrieve it.

Key Characteristics

  • Declarative: Specify the desired result, not the procedure

  • Set-Based: Operates on sets of rows rather than individual records

  • Standardized: ANSI/ISO standard with vendor-specific extensions

  • Comprehensive: Covers data definition, manipulation, and control

SQL Standards

Standard
Year
Key Features

SQL-86

1986

Initial standard

SQL-89

1989

Integrity constraints

SQL-92

1992

Joins, schema manipulation

SQL:1999

1999

Triggers, recursive queries, procedural extensions

SQL:2003

2003

Window functions, XML

SQL:2016

2016

JSON support, pattern matching

SQL Language Categories

SQL is divided into distinct categories based on functionality:

Data Manipulation Language (DML)

Purpose: Query and modify data

Statement
Operation
Description

SELECT

Query

Retrieve data from tables

INSERT

Create

Add new rows to a table

UPDATE

Modify

Change existing data

DELETE

Remove

Delete rows from a table

Example:

Data Definition Language (DDL)

Purpose: Define and modify database structure

Statement
Operation
Description

CREATE

Define

Create databases, tables, indexes, views

ALTER

Modify

Change structure of existing objects

DROP

Remove

Delete database objects

TRUNCATE

Clear

Remove all rows from a table (faster than DELETE)

Example:

Data Control Language (DCL)

Purpose: Control access to data

Statement
Operation
Description

GRANT

Allow

Give privileges to users

REVOKE

Deny

Remove privileges from users

Example:

Transaction Control Language (TCL)

Purpose: Manage transactions

Statement
Operation
Description

BEGIN/START

Start

Begin a transaction

COMMIT

Finalize

Save changes permanently

ROLLBACK

Undo

Revert changes to last commit

SAVEPOINT

Checkpoint

Set a point to rollback to

Example:

Aggregate Functions

Aggregate functions compute a single result from a set of input values.

Common Aggregate Functions

Function
Description
Handles NULL

COUNT(col)

Number of non-NULL values

Ignores NULL

COUNT(*)

Number of rows

Counts all rows

SUM(col)

Sum of values

Ignores NULL

AVG(col)

Average of values

Ignores NULL

MIN(col)

Minimum value

Ignores NULL

MAX(col)

Maximum value

Ignores NULL

Basic Usage

DISTINCT Keyword

Purpose: Select only unique values

GROUP BY Clause

Purpose: Group rows with the same values for aggregation

Rule: Non-aggregated columns in SELECT must appear in GROUP BY

HAVING Clause

Purpose: Filter groups based on aggregate conditions

Difference from WHERE:

  • WHERE filters rows before grouping

  • HAVING filters groups after aggregation

Complete Query Structure

String Operations

SQL provides rich string manipulation capabilities.

String Standards

SQL Standard:

  • Strings are case sensitive

  • Use single quotes only: 'text'

  • Double quotes for identifiers (column/table names)

Vendor Variations:

  • MySQL: Case insensitive by default, allows double quotes for strings

  • PostgreSQL: Case sensitive, follows standard strictly

  • SQL Server: Case sensitivity depends on collation

Pattern Matching with LIKE

LIKE operator matches patterns in strings.

Wildcards:

  • % : Matches any substring (including empty string)

  • _ : Matches exactly one character

String Concatenation

Operator: || (SQL standard)

Vendor Alternatives:

  • MySQL: CONCAT() function

  • SQL Server: + operator

String Functions

Function
Description
Example

SUBSTRING(S, B, E)

Extract substring from position B, length E

SUBSTRING('Hello', 1, 3) → 'Hel'

UPPER(S)

Convert to uppercase

UPPER('hello') → 'HELLO'

LOWER(S)

Convert to lowercase

LOWER('HELLO') → 'hello'

LENGTH(S)

String length

LENGTH('hello') → 5

TRIM(S)

Remove leading/trailing spaces

TRIM(' hello ') → 'hello'

REPLACE(S, F, R)

Replace F with R in S

REPLACE('hello', 'l', 'x') → 'hexxo'

Examples:

Output Control

ORDER BY Clause

Purpose: Sort query results

Syntax:

  • ASC: Ascending order (default)

  • DESC: Descending order

Examples:

LIMIT and OFFSET

Purpose: Restrict the number of rows returned

Syntax:

  • LIMIT n: Return at most n rows

  • OFFSET m: Skip first m rows

Examples:

Vendor Variations:

Nested Queries

Nested queries (subqueries) are queries within queries, enabling complex data retrieval.

Subquery Operators

Operator
Description
Returns True When

IN

Equal to any value

Value matches any in subquery (equivalent to = ANY)

NOT IN

Not equal to any value

Value doesn't match any in subquery

EXISTS

Subquery returns rows

At least one row returned

NOT EXISTS

Subquery returns no rows

No rows returned

ANY

Compare to any value

Condition true for at least one row

ALL

Compare to all values

Condition true for all rows

IN Operator

EXISTS Operator

Characteristics:

  • Returns TRUE/FALSE (not data)

  • Stops as soon as match is found (efficient)

  • Better performance than IN for large datasets

ANY Operator

Syntax: value operator ANY (subquery)

Operators: =, <>, <, <=, >, >=

ALL Operator

Syntax: value operator ALL (subquery)

Scalar Subqueries

Return single value (one row, one column)

Correlated Subqueries

Subquery references outer query (executed once per row)

Window Functions

Window functions perform calculations across a set of rows related to the current row, without collapsing rows like GROUP BY.

Key Concepts

  • Window: Set of rows related to current row

  • Partition: Divide rows into groups

  • Order: Define sequence within partition

  • Frame: Subset of partition relative to current row

Ranking Functions

Function
Description
Handling Ties

ROW_NUMBER()

Sequential number

Different numbers for ties

RANK()

Rank with gaps

Same rank, skip next numbers

DENSE_RANK()

Rank without gaps

Same rank, continue sequence

NTILE(n)

Divide into n buckets

Distribute as evenly as possible

Examples:

PARTITION BY

Purpose: Divide result set into partitions for separate calculations

Aggregate Window Functions

Difference from GROUP BY:

  • Window functions keep all rows

  • GROUP BY collapses rows

Offset Functions

Function
Description

LAG(col, n)

Value from n rows before

LEAD(col, n)

Value from n rows after

FIRST_VALUE(col)

First value in window

LAST_VALUE(col)

Last value in window

Window Frame Specification

Syntax:

Frame Bounds:

  • UNBOUNDED PRECEDING: Start of partition

  • n PRECEDING: n rows before current

  • CURRENT ROW: Current row

  • n FOLLOWING: n rows after current

  • UNBOUNDED FOLLOWING: End of partition

Common Table Expressions

Common Table Expressions (CTEs) create temporary named result sets that exist only during query execution.

Basic CTE Syntax

Simple CTE Examples

Advantages of CTEs

Advantage
Description

Readability

Break complex queries into logical steps

Reusability

Reference CTE multiple times in same query

Recursion

Enable recursive queries

Maintenance

Easier to debug and modify

Recursive CTEs

Purpose: Query hierarchical or tree-structured data

Syntax:

Examples:

Organization Hierarchy:

Number Series:

Path Finding:

Date Series:

References

Course Materials:

  • CS 6400: Database Systems Concepts and Design - Georgia Tech OMSCS

Last updated