Intro to Database Systems
SQL is a powerful language for managing and querying relational databases. It enables users to efficiently store, organize, and retrieve large amounts of structured data, making it essential for various industries and careers in data management and analysis. This unit covers fundamental SQL concepts, including basic query structure, filtering techniques, sorting, grouping, and joining tables. It also explores advanced topics like subqueries, window functions, and common table expressions, providing a comprehensive foundation for effective database querying and manipulation.
SELECT
is used to specify the columns to retrieve from a database table
SELECT column1, column2, ... FROM table_name;
SELECT *
to retrieve all columns from a tableFROM
is used to specify the table from which to retrieve data
SELECT column1, column2, ... FROM table_name;
WHERE
is used to filter rows based on a specified condition
SELECT column1, column2, ... FROM table_name WHERE condition;
=
, >
, <
, >=
, <=
, <>
) and logical operators (AND
, OR
, NOT
);
) to end each SQL statementWHERE
clause
=
): Checks if two values are equal>
), Less than (<
): Compares numeric values>=
), Less than or equal to (<=
): Includes the specified value in the comparison<>
or !=
): Checks if two values are not equalWHERE
clause
AND
: Returns true if all conditions are trueOR
: Returns true if at least one condition is trueNOT
: Negates a condition, returning true if the condition is falseBETWEEN
operator is used to check if a value falls within a specified range (inclusive)
WHERE column_name BETWEEN value1 AND value2;
IN
operator is used to check if a value matches any value in a list
WHERE column_name IN (value1, value2, ...);
LIKE
operator is used for pattern matching with wildcard characters (%
for zero or more characters, _
for a single character)
WHERE column_name LIKE 'pattern';
ORDER BY
is used to sort the result set based on one or more columns
SELECT column1, column2, ... FROM table_name ORDER BY column1 [ASC|DESC], column2 [ASC|DESC], ...;
ORDER BY
sorts the result set in ascending order (ASC)
DESC
to sort in descending orderORDER BY
clause, separated by commas
ORDER BY
should be the last clause in the SQL query, after SELECT
, FROM
, WHERE
, and other clausesNULL
values are typically sorted last in ascending order and first in descending orderGROUP BY
is used to group rows based on one or more columns and perform aggregate functions on each group
SELECT column1, aggregate_function(column2) FROM table_name GROUP BY column1;
COUNT
: Returns the number of rows in each groupSUM
: Calculates the sum of values in each groupAVG
: Calculates the average of values in each groupMIN
: Returns the minimum value in each groupMAX
: Returns the maximum value in each groupGROUP BY
is often used with aggregate functions to summarize data by categories or groupsHAVING
clause is used to filter groups based on a specified condition, similar to the WHERE
clause for individual rows
SELECT column1, aggregate_function(column2) FROM table_name GROUP BY column1 HAVING condition;
HAVING
is used with aggregate functions to filter groups, while WHERE
is used to filter individual rows before groupingSELECT
clause must either be listed in the GROUP BY
clause or be used with an aggregate functionINNER JOIN
returns only the rows that have matching values in both tables
SELECT columns FROM table1 INNER JOIN table2 ON table1.column = table2.column;
LEFT JOIN
(or LEFT OUTER JOIN
) returns all rows from the left table and the matched rows from the right table, with NULL
values for non-matching rows in the right table
SELECT columns FROM table1 LEFT JOIN table2 ON table1.column = table2.column;
RIGHT JOIN
(or RIGHT OUTER JOIN
) returns all rows from the right table and the matched rows from the left table, with NULL
values for non-matching rows in the left table
SELECT columns FROM table1 RIGHT JOIN table2 ON table1.column = table2.column;
FULL OUTER JOIN
returns all rows from both tables, with NULL
values for non-matching rows in either table
SELECT columns FROM table1 FULL OUTER JOIN table2 ON table1.column = table2.column;
SELECT t1.column1, t2.column2 FROM table1 AS t1 JOIN table2 AS t2 ON t1.column = t2.column;
SELECT
, FROM
, WHERE
, and HAVING
clausesSELECT columns FROM table1 WHERE column1 IN (SELECT column2 FROM table2 WHERE condition);
SELECT columns FROM table1 WHERE EXISTS (SELECT 1 FROM table2 WHERE table1.column = table2.column);
EXISTS
operator is used to check if a subquery returns any rows, without actually returning the rows themselves
SELECT columns FROM table1 WHERE EXISTS (SELECT 1 FROM table2 WHERE condition);
CASE
expression is used to perform conditional logic within a SQL query
SELECT columns, CASE WHEN condition1 THEN result1 WHEN condition2 THEN result2 ELSE default_result END AS alias FROM table;
SELECT columns, window_function() OVER (PARTITION BY column1 ORDER BY column2) AS alias FROM table;
WITH cte_name AS (SELECT columns FROM table WHERE condition) SELECT columns FROM cte_name WHERE condition;
JOIN
conditions instead of relying on implicit joins using the WHERE
clauseNULL
values in comparisons, as NULL
represents an unknown value and behaves differently than other valuesSELECT *
in production queries, as it can impact performance and maintainability; instead, explicitly list the required columnsEXPLAIN
or query profiling to identify and optimize slow or resource-intensive queries