Mastering SQL Server Fundamentals and Azure Database Services
SQL Fundamentals & Azure IaaS
1. Database Fundamentals
RDBMS vs. NoSQL:
- RDBMS: Organized store of data, minimizes redundancy, reduces inconsistency. Stores data in tables (rows/columns).
- DBMS Types:
- Open Source: MySQL, MariaDB, PostgreSQL.
- Proprietary: Oracle, Microsoft SQL Server, IBM DB2.
SQL Server Editions:
- Express: Free, limited size/performance.
- Developer: Free/low-price, full features, dev use only.
- Standard/Enterprise: Production use, terabytes of data.
Data Types:
- Numeric: int, bigint, decimal, tinyint, bit.
- Approx: float, real.
- String: char (fixed), varchar (variable), text.
- Unicode: nchar, nvarchar, ntext.
- Date: date, time, datetime.
T-SQL Basics:
- SELECT: SELECT * FROM schema.table or SELECT col1, col2 FROM….
- Aliases: SELECT col1 AS “Friendly Name”.
- WHERE Operators: =, >, <, != or <> (Not Equal), >= (Greater/Equal), !< (Not Less Than).
- Range/Lists: BETWEEN val1 AND val2 (Inclusive). IN (val1, val2).
- NULL Handling: NULL is not zero/blank. Use IS NULL or IS NOT NULL.
2. SQL Objects & DDL/DML
Database Objects:
- Tables: Logical organization of data (Rows=Records, Columns=Fields).
- Views: Saved query definitions. Do not store data (virtual table). Used for security/simplification.
- Stored Procedures: Precompiled T-SQL script, can include logic/loops, returns tabular data.
- Functions: User-defined (Scalar or Table-valued). Cannot affect outside scope.
- Triggers: Special stored proc that fires on events (INSERT, UPDATE, DELETE).
Constraints (Rules):
- PRIMARY KEY: Unique identifier, No NULLs, one per table.
- FOREIGN KEY: Enforces referential integrity (links to PK in another table).
- NOT NULL: Prevents empty values.
- UNIQUE: No duplicates allowed, but allows one NULL.
- CHECK: Validates condition (e.g., Price > 0).
Joins (Combining Tables):
- INNER JOIN: Returns rows only where there is a match in both tables.
- LEFT OUTER JOIN: All rows from LEFT table, matched rows from RIGHT (NULL if no match).
- RIGHT OUTER JOIN: All rows from RIGHT table, matched rows from LEFT.
- FULL OUTER JOIN: All rows from both sides; NULLs where no match exists.
Syntax Quick Ref:
- Create: CREATE TABLE name (col type constraint, …)
- Insert: INSERT INTO table (cols) VALUES (vals).
- Update: UPDATE table SET col=val WHERE id=x.
- Delete: DELETE FROM table WHERE id=x.
3. Azure IaaS: SQL on VM
Deployment & Management:
- Use Case: Lift & Shift, OS access required, specific SQL version needs, SSRS/SSAS/SSIS dependency.
- Deployment Methods: Azure Portal (Manual), PowerShell/CLI (Imperative), ARM Templates (Declarative/JSON), Bicep (Declarative).
- SQL IaaS Agent Extension: Enables auto-backup, auto-patching, Key Vault integration, and flexible licensing.
- Licensing: Pay-as-you-go OR Azure Hybrid Benefit (Bring Your Own License – BYOL).
Storage Configuration:
- Disks:
- OS Disk (C:): Operating System.
- Temp Disk (D:): Volatile/Temporary (Data lost on reboot). Use for TempDB.
- Data Disk (F:): Persistent storage.
- Disk Types:
- Ultra Disk: Lowest latency, mission-critical.
- Premium SSD: Production workloads, high throughput.
- Standard SSD: Dev/Test.
- Optimization: Separate Data and Log files. Enable Read Caching on Data files. No Caching on Log files.
High Availability (HA) & DR:
- Availability Sets: Protects against hardware failure/updates. Uses Fault Domains (Rack) and Update Domains (Patching). 99.95% SLA.
- Availability Zones: Protects against Data Center failure within a region. Low latency.
- Region Pairs: Protects against Regional failure. One region prioritized for recovery.
- Azure Site Recovery (ASR): DR solution. Block-level replication of VM. Good for migration with some downtime.
Azure PaaS & Performance Monitoring
4. Azure PaaS Solutions
Azure SQL Managed Instance (MI):
- Concept: PaaS with near 100% SQL Server compatibility. “Bridge” between IaaS and DB.
- Features: Has SQL Agent, Service Broker, CLR, Cross-database queries.
- Networking: Deployed in a dedicated subnet/VNet. Private IP by default.
- Use Case: Migration of existing apps requiring instance-level features.
Azure SQL Database:
- Concept: Cloud-native, fully managed, isolated.
- Deployment Options:
- Single Database: Isolated, portable.
- Elastic Pools: Shared resources for multiple DBs. Cost-effective for multi-tenant or unpredictable usage patterns.
- Hyperscale: 100TB+ size, fast restore (snapshots), compute/storage scale independently.
- Serverless: Auto-pauses, auto-scales. Good for irregular/dev workloads. (Incompatible with Geo-replication/Long-term backup).
- Service Tiers:
- General Purpose: Remote storage, standard latency.
- Business Critical: Local SSD, low latency, Read Scale-out.
Data Synchronization:
- SQL Data Sync: Bi-directional sync (Azure <-> On-Prem). No transactional consistency.
- Elastic Jobs: Run T-SQL across many Azure SQL DBs.
Open Source PaaS:
- PostgreSQL: Hyperscale (Citus) for multi-tenant/analytics.
- MySQL: Lift and shift for LAMP stack.
5. Monitoring & Optimization
A. Tools & Metrics
Baselines: Standard metric set to distinguish anomalies from normal usage. Windows PerfMon (IaaS):
- Page Life Expectancy (PLE): How long data stays in memory. Low = Memory Pressure.
- Processor Queue Length: CPU pressure.
Azure Monitor: Collects metrics for all Azure resources (CPU, Storage, I/O). Data retained 93 days. Azure SQL Insights: Uses a dedicated VM (Collection Agent) to send telemetry to Azure Log Analytics.
B. Extended Events (XEvents)
- Lightweight diagnostic system, replaces SQL Profiler.
- Channels: Admin, Operational, Analytical, Debug.
- Targets:
- Ring Buffer: Memory only, non-persistent (data lost on restart).
- Event File: Writes to disk (persisted).
C. Performance Features
Query Store: “Flight recorder” for queries. Aggregates runtime stats. Used to find regressed queries. Automatic Tuning:
- Force Plan: Auto-fixes performance regression (Force Last Good Plan).
- Create Index: Auto-adds missing indexes (Azure SQL DB).
- Drop Index: Auto-removes unused indexes (Azure SQL DB).
- Intelligent Query Processing (IQP):
- Batch Mode on Row Store: Faster calculations for large rows.
- Approximate Count Distinct: Fast estimation for big data.
- Scalar UDF Inlining: Optimizes user-defined functions.
D. Resource Management
Resource Governor: Limits CPU, Memory, IOPS for specific workloads. Uses Classifier Functions to sort sessions into Workload Groups and Resource Pools. TempDB Best Practices:
- Place on local SSD (D: drive on VM).
- Create 1 data file per CPU core (up to 8 files).
- All files should be same size/autogrowth.
E. Maintenance
Index Fragmentation:
- Reorganize: Fragmentation 5% – 30%. Online operation. Compacts pages.
- Rebuild: Fragmentation > 30%. Offline (usually). Updates statistics.
- Statistics: Used by optimizer to build plans. Update when data changes significantly.
- Job Scheduling:
- VM/Managed Instance: SQL Agent.
- Azure SQL DB: Azure Automation Runbooks or Elastic Jobs (No SQL Agent).
