1. Data Types
Choosing the right data type is the foundation of good table design. It affects storage efficiency, query performance, and data integrity.
Integer Types
| Type | Storage | Signed Range | Unsigned Range |
|---|---|---|---|
| TINYINT | 1 byte | -128 to 127 | 0 to 255 |
| SMALLINT | 2 bytes | -32,768 to 32,767 | 0 to 65,535 |
| MEDIUMINT | 3 bytes | -8,388,608 to 8,388,607 | 0 to 16,777,215 |
| INT | 4 bytes | -2.1 billion to 2.1 billion | 0 to 4.2 billion |
| BIGINT | 8 bytes | -9.2 Γ 10ΒΉβΈ to 9.2 Γ 10ΒΉβΈ | 0 to 1.8 Γ 10ΒΉβΉ |
Floating Point & Decimal Types
| Type | Storage | Precision | Use Case |
|---|---|---|---|
| FLOAT | 4 bytes | ~7 digits | Scientific data, approximate values |
| DOUBLE | 8 bytes | ~15 digits | Higher precision calculations |
| DECIMAL(M,D) | Variable | Exact | Money, financial data |
String Types
| Type | Max Length | Description |
|---|---|---|
| CHAR(N) | 255 chars | Fixed-length, padded with spaces |
| VARCHAR(N) | 65,535 chars | Variable-length, stores actual length |
| TEXT | 65,535 chars | Long text, cannot be indexed fully |
| MEDIUMTEXT | 16 MB | Longer text content |
| LONGTEXT | 4 GB | Very large text data |
π‘ VARCHAR vs CHAR: Use CHAR for fixed-length data like country codes (CHAR(2)) or MD5 hashes (CHAR(32)). Use VARCHAR for variable-length data like names or emails. CHAR is slightly faster for fixed-length lookups because the storage engine knows exactly where each row's field starts.
Date & Time Types
| Type | Format | Range | Storage |
|---|---|---|---|
| DATE | YYYY-MM-DD | 1000-01-01 to 9999-12-31 | 3 bytes |
| TIME | HH:MM:SS | -838:59:59 to 838:59:59 | 3 bytes |
| DATETIME | YYYY-MM-DD HH:MM:SS | 1000-01-01 to 9999-12-31 | 8 bytes |
| TIMESTAMP | YYYY-MM-DD HH:MM:SS | 1970-01-01 to 2038-01-19 | 4 bytes |
| YEAR | YYYY | 1901 to 2155 | 1 byte |
Other Common Types
| Type | Description | Example |
|---|---|---|
| BOOLEAN | Alias for TINYINT(1) | is_active BOOLEAN DEFAULT TRUE |
| JSON | Native JSON storage with validation | metadata JSON |
| ENUM | One value from a predefined list | status ENUM('active','inactive') |
| SET | Multiple values from a list | tags SET('tech','news','sport') |
| BLOB | Binary large object | avatar BLOB |
π MySQL vs PostgreSQL Data Types:
- β’ Auto-increment: MySQL uses
AUTO_INCREMENT, PostgreSQL usesSERIALorGENERATED ALWAYS AS IDENTITY - β’ Boolean: MySQL uses
TINYINT(1)internally, PostgreSQL has a nativeBOOLEANtype - β’ String: MySQL has
MEDIUMTEXT/LONGTEXT, PostgreSQL usesTEXTfor all (no size limit) - β’ JSON: Both support
JSON, but PostgreSQL also hasJSONB(binary, indexed, faster queries) - β’ Arrays: PostgreSQL supports
INTEGER[],TEXT[]etc. MySQL has no array type - β’ UUID: PostgreSQL has native
UUIDtype, MySQL usesCHAR(36)orBINARY(16)
2. Constraints
Constraints enforce rules on table data to maintain accuracy and reliability. They are defined when creating or altering a table.
PRIMARY KEY
Uniquely identifies each row. Combines NOT NULL and UNIQUE. A table can have only one primary key.
-- Single column primary key
CREATE TABLE users (
id INT PRIMARY KEY AUTO_INCREMENT,
username VARCHAR(50) NOT NULL
);
-- Composite primary key
CREATE TABLE order_items (
order_id INT,
product_id INT,
quantity INT NOT NULL,
PRIMARY KEY (order_id, product_id)
);
NOT NULL
Prevents a column from storing NULL values.
CREATE TABLE employees (
id INT PRIMARY KEY AUTO_INCREMENT,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) NOT NULL
);
UNIQUE
Ensures all values in a column (or combination of columns) are distinct.
CREATE TABLE users (
id INT PRIMARY KEY AUTO_INCREMENT,
email VARCHAR(100) NOT NULL UNIQUE,
username VARCHAR(50) NOT NULL,
UNIQUE KEY uk_username (username)
);
DEFAULT
Sets a default value when no value is provided during insertion.
CREATE TABLE articles (
id INT PRIMARY KEY AUTO_INCREMENT,
title VARCHAR(200) NOT NULL,
status ENUM('draft', 'published', 'archived') DEFAULT 'draft',
view_count INT DEFAULT 0,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CHECK
Validates that values meet a specified condition. Supported in MySQL 8.0.16+.
CREATE TABLE products (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(100) NOT NULL,
price DECIMAL(10,2) NOT NULL CHECK (price >= 0),
discount_pct DECIMAL(5,2) DEFAULT 0.00,
CHECK (discount_pct >= 0 AND discount_pct <= 100)
);
FOREIGN KEY
Links two tables together by referencing a primary key or unique key in another table.
CREATE TABLE orders (
id INT PRIMARY KEY AUTO_INCREMENT,
user_id INT NOT NULL,
total DECIMAL(10,2) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (user_id) REFERENCES users(id)
ON DELETE RESTRICT
ON UPDATE CASCADE
);
π‘ FOREIGN KEY referential actions:
- β’
RESTRICTβ Block the parent row from being deleted/updated (default) - β’
CASCADEβ Automatically delete/update matching child rows - β’
SET NULLβ Set the foreign key column to NULL (column must allow NULL) - β’
SET DEFAULTβ Set to default value (not supported by InnoDB) - β’
NO ACTIONβ Same as RESTRICT in MySQL
3. Complete CREATE TABLE Example
Let's design a simple e-commerce database schema with three related tables: products, orders, and order items.
Products Table
CREATE TABLE products (
id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(200) NOT NULL,
description TEXT,
price DECIMAL(10,2) NOT NULL CHECK (price >= 0),
stock INT UNSIGNED NOT NULL DEFAULT 0,
category VARCHAR(50) NOT NULL,
is_active BOOLEAN DEFAULT TRUE,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
INDEX idx_category (category),
INDEX idx_price (price)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
Orders Table
CREATE TABLE orders (
id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
user_id INT UNSIGNED NOT NULL,
status ENUM('pending', 'paid', 'shipped', 'delivered', 'cancelled') DEFAULT 'pending',
total_amount DECIMAL(12,2) NOT NULL DEFAULT 0.00,
shipping_address VARCHAR(500) NOT NULL,
notes TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE RESTRICT ON UPDATE CASCADE,
INDEX idx_user_id (user_id),
INDEX idx_status (status),
INDEX idx_created_at (created_at)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
Order Items Table
CREATE TABLE order_items (
id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
order_id INT UNSIGNED NOT NULL,
product_id INT UNSIGNED NOT NULL,
quantity INT UNSIGNED NOT NULL CHECK (quantity > 0),
unit_price DECIMAL(10,2) NOT NULL,
subtotal DECIMAL(12,2) GENERATED ALWAYS AS (quantity * unit_price) STORED,
FOREIGN KEY (order_id) REFERENCES orders(id) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY (product_id) REFERENCES products(id) ON DELETE RESTRICT ON UPDATE CASCADE,
UNIQUE KEY uk_order_product (order_id, product_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
π‘ Key design decisions:
- β’
INT UNSIGNEDfor IDs β doubles the positive range (0 to ~4.2 billion) - β’
DECIMAL(10,2)for prices β exact arithmetic, no floating-point errors - β’
GENERATED ALWAYS ASfor subtotal β computed column, always consistent - β’
ON DELETE CASCADEon order_items β deleting an order removes its items automatically - β’
utf8mb4charset β supports full Unicode including emoji
4. ALTER TABLE
As your application evolves, you will often need to modify existing table structures. ALTER TABLE supports a wide range of changes.
Add Column
-- Add a single column
ALTER TABLE products ADD COLUMN brand VARCHAR(100) AFTER name;
-- Add multiple columns
ALTER TABLE users
ADD COLUMN phone VARCHAR(20) AFTER email,
ADD COLUMN avatar_url VARCHAR(500) AFTER phone;
Modify Column
-- Change column type and constraints
ALTER TABLE products MODIFY COLUMN description MEDIUMTEXT;
-- Rename and change a column
ALTER TABLE users CHANGE COLUMN phone phone_number VARCHAR(30) NOT NULL DEFAULT '';
Drop Column
ALTER TABLE users DROP COLUMN avatar_url;
Add Index
-- Regular index
ALTER TABLE products ADD INDEX idx_brand (brand);
-- Composite index
ALTER TABLE orders ADD INDEX idx_user_status (user_id, status);
-- Unique index
ALTER TABLE users ADD UNIQUE INDEX uk_phone (phone_number);
-- Full-text index (for text search)
ALTER TABLE products ADD FULLTEXT INDEX ft_name_desc (name, description);
Add Foreign Key
ALTER TABLE orders
ADD CONSTRAINT fk_orders_user
FOREIGN KEY (user_id) REFERENCES users(id)
ON DELETE RESTRICT ON UPDATE CASCADE;
Rename Table
-- Rename a single table
ALTER TABLE products RENAME TO goods;
-- Or use RENAME TABLE (can rename multiple)
RENAME TABLE goods TO products, order_items TO line_items;
π MySQL vs PostgreSQL ALTER TABLE:
- β’ Rename column: MySQL uses
CHANGE COLUMN old new type, PostgreSQL usesRENAME COLUMN old TO new - β’ Modify type: MySQL uses
MODIFY COLUMN, PostgreSQL usesALTER COLUMN ... TYPE - β’ Position: MySQL supports
AFTER columnandFIRST, PostgreSQL always appends to the end
5. Design Best Practices
Naming Conventions
- Use snake_case for table and column names:
user_profiles,created_at - Use plural nouns for table names:
users,products,order_items - Prefix index names:
idx_for regular,uk_for unique,fk_for foreign keys - Avoid reserved words as identifiers (e.g.,
order,group,select)
Always Use a Primary Key
- Every table should have a primary key β prefer
INT UNSIGNED AUTO_INCREMENTorBIGINT UNSIGNED AUTO_INCREMENT - Surrogate keys (auto-increment IDs) are simpler and more efficient than natural keys for most cases
- Consider
UUIDif you need globally unique IDs across distributed systems, but be aware of index fragmentation
Choose Proper Data Types
- Use the smallest type that fits your data β
TINYINTfor status codes, notINT - Use
DECIMALfor money β neverFLOATorDOUBLE - Use
TIMESTAMPfor event times (timezone-aware),DATETIMEfor calendar dates - Set
VARCHARlengths thoughtfully βVARCHAR(50)for names,VARCHAR(255)for URLs
Normalization
Database normalization reduces redundancy and improves data integrity. For most applications, aim for Third Normal Form (3NF):
| Normal Form | Rule | Example |
|---|---|---|
| 1NF | Each column holds atomic (indivisible) values | No comma-separated lists in a single column |
| 2NF | 1NF + every non-key column depends on the entire primary key | In (order_id, product_id), product_name depends only on product_id β move it out |
| 3NF | 2NF + no transitive dependencies | city depends on zip_code, not directly on user_id β separate address table |
π‘ Practical tip: Don't over-normalize. Controlled denormalization (e.g., storing a calculated total on the orders table) is acceptable when it significantly improves read performance. Always measure before optimizing.
6. Chapter Summary
Data Types
- β’ Integer: TINYINT, SMALLINT, INT, BIGINT
- β’ Decimal: FLOAT, DOUBLE, DECIMAL
- β’ String: CHAR, VARCHAR, TEXT
- β’ Date/Time: DATE, DATETIME, TIMESTAMP
- β’ Other: JSON, ENUM, BOOLEAN, BLOB
Constraints
- β’ PRIMARY KEY β unique row identifier
- β’ NOT NULL β disallow NULL values
- β’ UNIQUE β enforce distinct values
- β’ DEFAULT β provide fallback value
- β’ CHECK β validate data conditions
- β’ FOREIGN KEY β link related tables
ALTER TABLE
- β’ ADD COLUMN β add new fields
- β’ MODIFY / CHANGE β alter existing fields
- β’ DROP COLUMN β remove fields
- β’ ADD INDEX β improve query speed
- β’ RENAME β change table name
Best Practices
- β’ Use snake_case naming
- β’ Always define a primary key
- β’ Choose the smallest fitting type
- β’ Use DECIMAL for money
- β’ Normalize to 3NF, denormalize with caution