Commit f69bb62

committed
docs(notes): add database documentation
1 parent 4e53684 commit f69bb62

4 files changed

+348
-0
lines changed

notes/2025-03-24-database-acid-en.md

+97
@@ -0,0 +1,97 @@
1+
---
2+
title: Understanding Full ACID Compliance in Databases
3+
lang: en
4+
layout: post
5+
audio: false
6+
translated: false
7+
generated: true
8+
---
9+
10+
ACID is a set of four transaction properties that underpin database reliability and data integrity, particularly in transaction processing systems. This note explains what each property means and why it matters.
11+
12+
## What is ACID?
13+
14+
ACID is an acronym that stands for:
15+
16+
- **Atomicity**
17+
- **Consistency**
18+
- **Isolation**
19+
- **Durability**
20+
21+
These four properties together guarantee that database transactions are processed reliably, even in the face of errors, power failures, and other issues. Let's examine each property in detail.
22+
23+
## Atomicity
24+
25+
Atomicity ensures that a transaction is treated as a single, indivisible unit of work. This means:
26+
27+
- Either all operations within the transaction complete successfully (commit)
28+
- Or none of them take effect (rollback)
29+
30+
### Deep Dive:
31+
When a transaction involves multiple operations (such as debiting one account and crediting another), atomicity guarantees that either both operations succeed or neither does. The database maintains this property through mechanisms such as write-ahead logging (WAL) and undo records (called rollback or undo segments in some systems), which capture enough information about the prior state for the system to undo a partially applied transaction.
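
The debit-and-credit case can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not how any particular engine implements atomicity: the `accounts` table, the `transfer` helper, and the `fail_midway` flag are hypothetical names introduced here to simulate a failure between the two updates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Debit src and credit dst as one unit; return True if committed."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if fail_midway:
                # Hypothetical failure between the debit and the credit.
                raise RuntimeError("simulated failure after debit")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False


def balances(conn):
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

Calling `transfer(conn, "alice", "bob", 30, fail_midway=True)` leaves both balances untouched, because the debit is rolled back together with the missing credit.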
32+
33+
## Consistency
34+
35+
Consistency ensures that a transaction brings the database from one valid state to another valid state, maintaining all predefined rules, constraints, and triggers.
36+
37+
### Deep Dive:
38+
Consistency works on multiple levels:
39+
- **Database consistency**: Enforcing data integrity constraints, foreign keys, unique constraints, and check constraints
40+
- **Application consistency**: Ensuring business rules are maintained
41+
- **Transaction consistency**: Guaranteeing that invariants are preserved before and after transaction execution
42+
43+
A consistent transaction preserves the database's semantic integrity - it cannot violate any defined rules. For example, if a rule states an account balance cannot be negative, a consistent transaction cannot result in a negative balance.
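
The non-negative balance rule can be expressed as a declarative constraint. The sketch below uses Python's `sqlite3` module with a `CHECK` constraint; the table layout and the `withdraw` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100)")
conn.commit()


def withdraw(conn, name, amount):
    """Try to withdraw; return False if the constraint rejects the change."""
    try:
        with conn:  # rolls back automatically when the constraint fails
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, name),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balance_of(conn, name):
    """Read the current balance of one account."""
    row = conn.execute(
        "SELECT balance FROM accounts WHERE name = ?", (name,)
    ).fetchone()
    return row[0]
```

An overdraft attempt such as `withdraw(conn, "alice", 150)` is rejected by the database itself, so the rule holds even if application code forgets to check it.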
44+
45+
## Isolation
46+
47+
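Isolation ensures that concurrently executing transactions do not interfere with one another. At the strictest level (serializable), the outcome is the same as if the transactions had run one after another; weaker isolation levels trade some of this guarantee for higher concurrency.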
48+
49+
### Deep Dive:
50+
Isolation prevents problems like:
51+
- **Dirty reads**: Reading uncommitted data from another transaction
52+
- **Non-repeatable reads**: Getting different results when reading the same data twice in the same transaction
53+
- **Phantom reads**: Seeing new rows appear in a repeated range scan because another transaction inserted them
54+
55+
Databases implement various isolation levels through techniques like:
56+
- **Pessimistic concurrency control**: Locking resources to prevent conflicts
57+
- **Optimistic concurrency control**: Allowing concurrent access but validating before commit
58+
- **Multiversion concurrency control (MVCC)**: Maintaining multiple versions of data to allow concurrent reads without blocking
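
To make the MVCC idea concrete, here is a toy in-memory version store. It is a deliberately simplified sketch: `MVCCStore` and `Snapshot` are hypothetical classes, every write commits immediately, and there is no garbage collection of old versions, all of which real engines handle very differently.

```python
class MVCCStore:
    """Toy multiversion store: each key keeps a list of (commit_ts, value)."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0      # timestamp of the most recent commit

    def write(self, key, value):
        """Commit a new version of key and return its commit timestamp."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def snapshot(self):
        """Start a reader that sees only versions committed so far."""
        return Snapshot(self, self._clock)


class Snapshot:
    """A read-only view of the store as of one commit timestamp."""

    def __init__(self, store, ts):
        self._store = store
        self._ts = ts

    def read(self, key):
        """Return the newest value with commit_ts <= the snapshot timestamp."""
        visible = None
        for commit_ts, value in self._store._versions.get(key, []):
            if commit_ts <= self._ts:
                visible = value
        return visible
```

A reader that took its snapshot before a later `write` keeps seeing the old value, so repeated reads inside that snapshot stay stable and the writer is never blocked by the reader.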
59+
60+
## Durability
61+
62+
Durability guarantees that once a transaction has been committed, it remains committed even in the case of system failure.
63+
64+
### Deep Dive:
65+
Durability is typically achieved through:
66+
- **Write-ahead logging**: Changes are recorded in a log on persistent storage before they are applied to the actual data, and a commit is acknowledged only once its log records are safely written
67+
- **Redundant storage**: Multiple copies of data stored across different locations
68+
- **Checkpoint mechanisms**: Ensuring changes are periodically flushed from memory to persistent storage
69+
70+
In practical terms, this means that committed transactions survive power failures and system crashes because they have been written to non-volatile storage. Surviving the loss of a storage device additionally depends on redundant copies.
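
The write-ahead idea can be sketched as a tiny key-value store that appends each change to a log file and forces it to disk before updating memory. `TinyWAL` and its JSON-lines log format are assumptions made for this example; production engines use binary logs, checkpoints, and far more careful recovery.

```python
import json
import os


class TinyWAL:
    """Toy key-value store that logs every change before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()

    def _replay(self):
        """Rebuild in-memory state from the log after a restart."""
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as f:
            for line in f:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    # A torn final record means that write never committed.
                    break
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        """Append the change to the log, force it to disk, then apply it."""
        with open(self.log_path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the record to the device
        self.data[key] = value
```

Creating a second `TinyWAL` on the same log path simulates a restart: the state is rebuilt purely from the log, which is what allows committed changes to outlive the process.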
71+
72+
## Implementation Challenges and Considerations
73+
74+
Achieving full ACID compliance involves significant tradeoffs:
75+
76+
1. **Performance impact**: Strict ACID properties can reduce throughput and increase latency
77+
2. **Scalability limitations**: Some ACID guarantees become harder to maintain in distributed systems
78+
3. **Implementation complexity**: Maintaining these properties requires sophisticated algorithms and mechanisms
79+
4. **Resource utilization**: Additional storage and memory may be required for logs, lock tables, and multiple data versions
80+
81+
## Real-World Applications
82+
83+
Different database systems provide varying levels of ACID compliance:
84+
85+
- **Traditional RDBMSs** (Oracle, SQL Server, PostgreSQL, MySQL with InnoDB): Full ACID compliance, although the default isolation level is usually weaker than serializable
86+
- **NoSQL databases**: Often sacrifice some ACID properties for scalability and performance (typically following BASE principles instead)
87+
- **NewSQL databases**: Attempt to provide both scalability and ACID properties
88+
89+
## Beyond ACID: Modern Developments
90+
91+
While ACID remains fundamental, distributed systems have introduced additional concepts:
92+
93+
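- **CAP Theorem**: States that when a network partition occurs, a distributed system must choose between Consistency and Availability; it is often summarized as "pick two of three" among Consistency, Availability, and Partition tolerance. Consistency in CAP means every read sees the latest write, which is a different notion from the C in ACID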
94+
- **BASE** (Basically Available, Soft state, Eventually consistent): An alternative approach for distributed systems
95+
- **Saga Pattern**: A sequence of local transactions that maintain eventual consistency in distributed systems
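
The saga idea can be sketched as a list of local steps, each paired with a compensating action that undoes it. `run_saga` is a hypothetical helper written for this note; real saga frameworks also persist progress and retry compensations, which this sketch omits.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order.

    If an action raises, the compensations of the steps that already
    completed are run in reverse order and False is returned.
    """
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            # Undo the earlier local transactions, newest first.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True
```

Unlike an ACID rollback, the intermediate states are visible to other parties until the compensations run, which is why sagas give eventual rather than immediate consistency.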
96+
97+
Full ACID compliance remains crucial for applications where data integrity is paramount, such as financial systems, healthcare applications, and critical record-keeping systems.

notes/2025-03-24-db2-data-type-en.md

+70
@@ -0,0 +1,70 @@
1+
---
2+
title: IBM Db2 Data Types
3+
lang: en
4+
layout: post
5+
audio: false
6+
translated: false
7+
generated: true
8+
---
9+
10+
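Db2 supports a wide range of data types to accommodate various data storage needs. Here is an overview by category. The limits quoted and the availability of some types differ between Db2 platforms (Linux/UNIX/Windows, z/OS, and IBM i), so check the reference for your platform: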
11+
12+
## Numeric Data Types
13+
14+
### Integer Types
15+
- **SMALLINT**: 16-bit integer (-32,768 to 32,767)
16+
- **INTEGER** or **INT**: 32-bit integer (-2,147,483,648 to 2,147,483,647)
17+
- **BIGINT**: 64-bit integer (approximately -9.2E18 to 9.2E18)
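
The ranges above follow directly from two's-complement arithmetic on the stated bit widths. The short Python sketch below derives them; `signed_range` and `smallest_integer_type` are hypothetical helpers for illustration, not part of any Db2 API.

```python
DB2_INTEGER_BITS = {"SMALLINT": 16, "INTEGER": 32, "BIGINT": 64}


def signed_range(bits):
    """Return (min, max) for a two's-complement integer of the given width."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def smallest_integer_type(value):
    """Pick the narrowest listed integer type that can hold value."""
    for name, bits in DB2_INTEGER_BITS.items():
        low, high = signed_range(bits)
        if low <= value <= high:
            return name
    raise OverflowError(f"{value} does not fit in BIGINT")
```

For example, 40,000 is just outside the SMALLINT range, so `smallest_integer_type(40000)` selects `INTEGER`.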
18+
19+
### Decimal Types
20+
- **DECIMAL(p,s)** or **DEC(p,s)** or **NUMERIC(p,s)**: Exact decimal values where 'p' is precision (total digits) and 's' is scale (decimal places)
21+
- **DECFLOAT**: Decimal floating-point that follows the IEEE 754-2008 standard (known as IEEE 754r while in draft), available in 16-digit or 34-digit precision
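
Precision and scale can be explored with Python's `decimal` module. This is only an approximation of the concepts: `to_decimal` is a hypothetical helper, and truncating excess fractional digits is an illustrative choice here, so consult the Db2 assignment rules for the actual behavior. The two contexts mimic the 16-digit and 34-digit DECFLOAT precisions.

```python
from decimal import Context, Decimal, ROUND_DOWN


def to_decimal(text, precision, scale):
    """Fit text into a DECIMAL(precision, scale)-like value.

    Excess fractional digits are truncated (an assumption for this sketch);
    too many total digits raises ValueError.
    """
    if not 0 <= scale <= precision:
        raise ValueError("scale must be between 0 and precision")
    quantum = Decimal(1).scaleb(-scale)  # scale=2 gives Decimal("0.01")
    value = Decimal(text).quantize(quantum, rounding=ROUND_DOWN)
    if len(value.as_tuple().digits) > precision:
        raise ValueError(f"{text} needs more than {precision} digits")
    return value


# Contexts with the same number of significant digits as DECFLOAT(16)/(34).
DECFLOAT16 = Context(prec=16)
DECFLOAT34 = Context(prec=34)
```

With these definitions, `to_decimal("12345.678", 7, 2)` keeps two decimal places, while the same input with a precision of 5 is rejected because it needs seven digits in total.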
22+
23+
### Floating-Point Types
24+
- **REAL** or **FLOAT(1-24)**: 32-bit floating-point number
25+
- **DOUBLE** or **DOUBLE PRECISION** or **FLOAT(25-53)**: 64-bit floating-point number
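
The practical difference between the two widths is how closely a value can be represented. The sketch below round-trips a Python float through 32-bit and 64-bit IEEE 754 encodings using the standard `struct` module; `as_real`, `as_double`, and `float_type` are hypothetical helper names, and the `FLOAT(n)` mapping simply restates the ranges listed above.

```python
import struct


def as_real(x):
    """Round-trip x through a 32-bit float, as a REAL column would store it."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def as_double(x):
    """Round-trip x through a 64-bit float, as a DOUBLE column would store it."""
    return struct.unpack("<d", struct.pack("<d", x))[0]


def float_type(n):
    """Map FLOAT(n) to the type named in the list above."""
    if 1 <= n <= 24:
        return "REAL"
    if 25 <= n <= 53:
        return "DOUBLE"
    raise ValueError("FLOAT(n) requires 1 <= n <= 53")
```

A value such as 0.1 survives the 64-bit round trip unchanged but comes back slightly different from the 32-bit one, which is why REAL is a poor fit for values that must compare exactly.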
26+
27+
## Character Data Types
28+
29+
### Fixed-Length
30+
- **CHAR(n)** or **CHARACTER(n)**: Fixed-length character string (1-255 bytes)
31+
32+
### Variable-Length
33+
- **VARCHAR(n)** or **CHARACTER VARYING(n)**: Variable-length character string (1-32,672 bytes)
34+
- **CLOB** or **CHARACTER LARGE OBJECT**: Large character objects (up to 2GB)
35+
36+
### Graphic Types (for DBCS characters)
37+
- **GRAPHIC(n)**: Fixed-length double-byte string
38+
- **VARGRAPHIC(n)**: Variable-length double-byte string
39+
- **DBCLOB**: Double-byte character large object
40+
41+
## Binary Data Types
42+
- **BINARY(n)**: Fixed-length binary data
43+
- **VARBINARY(n)** or **BINARY VARYING(n)**: Variable-length binary data
44+
- **BLOB** or **BINARY LARGE OBJECT**: Binary large objects (up to 2GB)
45+
46+
## Date and Time Data Types
47+
- **DATE**: Calendar date (YYYY-MM-DD)
48+
- **TIME**: Time of day (HH:MM:SS)
49+
- **TIMESTAMP**: Date and time (YYYY-MM-DD-HH.MM.SS.nnnnnn)
50+
- **TIMESTAMP WITH TIME ZONE**: Timestamp that includes time zone information
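
Application code often has to parse the timestamp string form shown above. Here is a minimal Python sketch, assuming the six-digit fractional seconds of the `YYYY-MM-DD-HH.MM.SS.nnnnnn` layout; the constant and function names are hypothetical.

```python
from datetime import datetime

# strptime/strftime pattern matching YYYY-MM-DD-HH.MM.SS.nnnnnn
DB2_TIMESTAMP_FORMAT = "%Y-%m-%d-%H.%M.%S.%f"


def parse_db2_timestamp(text):
    """Parse a timestamp string in the layout shown above."""
    return datetime.strptime(text, DB2_TIMESTAMP_FORMAT)


def format_db2_timestamp(value):
    """Render a datetime in the same layout, with microsecond precision."""
    return value.strftime(DB2_TIMESTAMP_FORMAT)
```

Timestamps declared with a different fractional precision would need a different pattern, since `%f` handles at most six fractional digits.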
51+
52+
## Boolean Type
53+
- **BOOLEAN**: TRUE, FALSE, or NULL values
54+
55+
## XML Data Type
56+
- **XML**: Stores XML documents in their hierarchical form
57+
58+
## Row Types
59+
- **ROW**: Structured type comprising a sequence of named fields with associated data types
60+
61+
## User-Defined Types
62+
- **Distinct Types**: Based on built-in data types but incompatible with their source types
63+
- **Structured Types**: User-defined types with attributes and methods
64+
65+
## Special Types
66+
- **ROWID**: Row identifier, a value that uniquely identifies a row
67+
- **DATALINK**: References to external files
68+
- **ARRAY**: Ordered collection of elements of the same type
69+
70+
Each data type in Db2 has specific storage requirements, performance characteristics, and usage scenarios. The choice of data type affects database performance, storage efficiency, and data integrity.

notes/2025-03-24-db2-history-en.md

+8
@@ -0,0 +1,8 @@
1+
---
2+
title: db2-history
3+
lang: en
4+
layout: post
5+
audio: false
6+
translated: false
7+
generated: true
8+
---

notes/2025-03-24-db2-postgresql-en.md

+173
@@ -0,0 +1,173 @@
1+
---
2+
title: IBM DB2 vs PostgreSQL
3+
lang: en
4+
layout: post
5+
audio: false
6+
translated: false
7+
generated: true
8+
---
9+
10+
Here's a comprehensive comparison between IBM DB2 and PostgreSQL across multiple dimensions:
11+
12+
## Foundational Differences
13+
14+
**IBM DB2:**
15+
- Commercial database management system developed by IBM
16+
- Primarily designed for enterprise environments with high-volume transaction processing
17+
- Available in multiple editions targeting different platforms (z/OS, Linux, UNIX, Windows)
18+
- Closed-source with proprietary licensing model
19+
20+
**PostgreSQL:**
21+
- Open-source, community-driven RDBMS
22+
- Started as an academic project at UC Berkeley
23+
- Available on virtually all operating systems
24+
- Free to use and modify under PostgreSQL License (similar to MIT/BSD)
25+
26+
## Architecture
27+
28+
**IBM DB2:**
29+
- Leverages a shared-disk architecture in some configurations
30+
- Uses buffer pools, tablespaces, and storage groups for data organization
31+
- Instance-based architecture with database partitioning capabilities
32+
- Offers pureScale clustering for high availability
33+
34+
**PostgreSQL:**
35+
- Process-based architecture with a postmaster process that spawns backend processes
36+
- Uses a multiversion concurrency control (MVCC) system to handle concurrent transactions
37+
- Single-node by default; shared-nothing horizontal scaling is available through external solutions like Postgres-XL
38+
- Uses a write-ahead log (WAL) for durability and recovery
39+
40+
## Performance Characteristics
41+
42+
**IBM DB2:**
43+
- Excels at high-volume OLTP workloads, especially on mainframe systems
44+
- Advanced query optimization with cost-based optimizer
45+
- Built-in workload management for prioritizing resources
46+
- In-memory optimized columnar storage through column-organized tables (BLU Acceleration)
47+
- Adaptive compression technologies
48+
49+
**PostgreSQL:**
50+
- Strong performance for mixed workloads (OLTP and OLAP)
51+
- Excellent geospatial query performance with PostGIS extension
52+
- Advanced indexing capabilities (B-tree, GiST, GIN, SP-GiST, BRIN)
53+
- Parallel query execution for analytical workloads
54+
- Table partitioning for improved query performance
55+
56+
## SQL Compliance and Extensions
57+
58+
**IBM DB2:**
59+
- High SQL standards compliance
60+
- Robust support for SQL PL (procedural language)
61+
- Integrated XML capabilities with pureXML
62+
- JSON support with specialized functions
63+
- Support for temporal data and bi-temporal tables
64+
65+
**PostgreSQL:**
66+
- Strong ANSI SQL compliance
67+
- Rich procedural language support (PL/pgSQL, PL/Python, PL/Perl, etc.)
68+
- Native support for JSON/JSONB with powerful operators
69+
- Custom data types and operators
70+
- Advanced array support and range types
71+
- Full-text search capabilities
72+
73+
## Security Features
74+
75+
**IBM DB2:**
76+
- Row and column level access control (RCAC)
77+
- Label-based access control (LBAC)
78+
- Robust auditing capabilities
79+
- Integration with enterprise security frameworks
80+
- Advanced encryption for data at rest and in transit
81+
82+
**PostgreSQL:**
83+
- Role-based access control
84+
- Row-level security policies
85+
- Column-level privileges
86+
- SSL support for encrypted connections
87+
- Password policies and authentication methods
88+
- Integration with external authentication systems (LDAP, Kerberos)
89+
90+
## High Availability and Disaster Recovery
91+
92+
**IBM DB2:**
93+
- HADR (High Availability Disaster Recovery)
94+
- pureScale clustering for near-continuous availability
95+
- Log shipping and read-on-standby capabilities
96+
- Q-replication for low-latency replication
97+
98+
**PostgreSQL:**
99+
- Streaming replication with hot standby
100+
- Logical replication (built in since version 10)
101+
- Point-in-time recovery
102+
- Third-party solutions like Patroni for automated failover
103+
- Connection pooling with pgBouncer or Pgpool-II
104+
105+
## Management and Administration
106+
107+
**IBM DB2:**
108+
- IBM Data Studio and IBM Data Server Manager for administration
109+
- Comprehensive monitoring through IBM tooling
110+
- Automated maintenance and health monitoring
111+
- Automated storage management
112+
113+
**PostgreSQL:**
114+
- Various open-source and commercial tools (pgAdmin, DBeaver, etc.)
115+
- Command-line utilities (psql, pg_dump, etc.)
116+
- Extensible statistics collection
117+
- Vacuum process for reclaiming space
118+
- Manual configuration with high tunability
119+
120+
## Cost Structure
121+
122+
**IBM DB2:**
123+
- Significant licensing costs based on cores/PVUs
124+
- Different editions with varying costs
125+
- Support and maintenance contracts
126+
- Additional costs for advanced features and tools
127+
128+
**PostgreSQL:**
129+
- Free to use regardless of scale or application
130+
- No licensing costs
131+
- Support available through community or commercial vendors
132+
- Commercial hosting options available
133+
134+
## Ecosystem and Extensions
135+
136+
**IBM DB2:**
137+
- Integration with IBM analytics suite
138+
- Limited third-party extensions
139+
- Enterprise-focused tooling ecosystem
140+
141+
**PostgreSQL:**
142+
- Rich ecosystem of extensions (PostGIS, TimescaleDB, etc.)
143+
- Active community development
144+
- Wide range of third-party tools and integrations
145+
- Strong ORM support across programming languages
146+
147+
## Cloud Deployments
148+
149+
**IBM DB2:**
150+
- Available on IBM Cloud and other major cloud providers
151+
- DB2 Warehouse for cloud data warehousing
152+
- Managed service options with limited customization
153+
154+
**PostgreSQL:**
155+
- Available on all major cloud platforms as managed services
156+
- Numerous specialized PostgreSQL-as-a-service offerings
157+
- Highly customizable deployments
158+
159+
## Use Case Fit
160+
161+
**IBM DB2:**
162+
- Ideal for large enterprises with existing IBM infrastructure
163+
- Mission-critical applications requiring maximum reliability
164+
- High-performance OLTP systems, especially on mainframe
165+
- Legacy system integration
166+
167+
**PostgreSQL:**
168+
- Well-suited for a wide range of applications from small to enterprise
169+
- Web applications and services
170+
- Geospatial applications (with PostGIS)
171+
- Applications requiring complex data types or JSON/document storage
172+
- Startups and cost-sensitive organizations
173+
