
Data Loading and Storage

Frameworks

| Name / Framework | What It Is (Summary) | Further Reading |
| --- | --- | --- |
| Kimball Dimensional Modeling | Bottom-up warehouse modeling built on fact and dimension tables; optimized for analytics. | https://www.kimballgroup.com/data-warehouse-business-intelligence-resources/ |
| Inmon Corporate Information Factory | Top-down enterprise data warehouse architecture that emphasizes a centralized, integrated data store. | https://en.wikipedia.org/wiki/Bill_Inmon |
| Star Schema / Snowflake Schema | Warehouse schemas that structure analytical data around a central fact table; a snowflake schema further normalizes the dimension tables. | https://www.databricks.com/glossary/star-schema |
| OLTP vs OLAP | OLTP systems handle transactions; OLAP systems handle analytics. The distinction defines how databases specialize. | https://www.ibm.com/cloud/learn/olap-vs-oltp |
| CAP Theorem | States that a distributed system can guarantee at most two of consistency, availability, and partition tolerance at once; during a network partition it must choose between consistency and availability. | https://en.wikipedia.org/wiki/CAP_theorem |
| ACID vs BASE Models | ACID prioritizes strict transactional guarantees; BASE trades them for availability and scalability with eventual consistency. | https://www.geeksforgeeks.org/difference-between-acid-and-base-in-dbms/ |
| Data Lakes vs Lakehouse Architecture | Lakehouses combine data lake flexibility with warehouse reliability. | https://www.databricks.com/discover/pages/lakehouse |
| Lambda Architecture | Hybrid pipeline design that pairs a batch layer with a real-time (speed) layer. | https://lambda-architecture.net/ |
| Kappa Architecture | Stream-only architecture that simplifies Lambda by removing the batch layer. | https://martin.kleppmann.com/2015/01/29/kappa-architecture.html |
| Hot / Warm / Cold Data Tiering | Strategy for placing frequently accessed data on fast storage and rarely accessed or archival data on cheaper storage. | https://www.ibm.com/think/topics/data-tiering |
| Sharding & Partitioning Strategies | Split data into parts distributed across nodes to improve performance and scalability. | https://www.mongodb.com/docs/manual/sharding/ |
| CDC (Change Data Capture) | Tracks data changes and streams them in real time for synchronization and analytics. | https://debezium.io/documentation/ |
| Slowly Changing Dimensions (SCD Types 0–6) | Methods for tracking changes in dimensional data over time. | https://www.kimballgroup.com/2003/02/design-tip-6-slowly-changing-dimensions/ |
| Message Queue & Stream Processing (Kafka, Kinesis, Pulsar) | Platforms for building real-time data streaming pipelines. | https://kafka.apache.org/documentation/ |
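
The Slowly Changing Dimensions entry is easier to picture with a small example. The sketch below illustrates one common variant, SCD Type 2, in which a changed attribute closes the current row and appends a new version so that history is preserved. The `CustomerRow` fields and the `apply_scd2` helper are hypothetical names chosen for illustration; a real warehouse would typically do this in SQL or an ETL tool rather than in in-memory Python.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CustomerRow:
    """One version of a customer in a hypothetical dimension table."""
    customer_id: int            # natural (business) key
    city: str                   # the attribute whose history is tracked
    valid_from: date            # first day this version applies
    valid_to: Optional[date]    # None means the version is still open
    is_current: bool            # convenience flag for "latest version"


def apply_scd2(rows: List[CustomerRow], customer_id: int,
               new_city: str, change_date: date) -> List[CustomerRow]:
    """Apply a Type 2 change: expire the current row, then append a new one."""
    for row in rows:
        if row.customer_id == customer_id and row.is_current:
            if row.city == new_city:
                return rows     # attribute unchanged, nothing to record
            row.valid_to = change_date
            row.is_current = False
            break
    # Append the new version (also covers a customer seen for the first time).
    rows.append(CustomerRow(customer_id, new_city, change_date, None, True))
    return rows


# Usage: the customer moves, so the dimension ends up with two versions.
history = [CustomerRow(1, "Austin", date(2020, 1, 1), None, True)]
apply_scd2(history, 1, "Denver", date(2023, 6, 1))
```

After the call, `history` holds the expired "Austin" row and a current "Denver" row, so a query can reconstruct which city applied on any given date. Other SCD types make different trade-offs, for example Type 1 simply overwrites the old value.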

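The sharding and partitioning entry can also be sketched briefly. The example below shows hash-based partitioning, one of several possible strategies (range-based partitioning is another), and is not the API of any particular database. The `shard_for` and `partition` helpers and the `user-N` keys are assumptions made for illustration.

```python
import hashlib
from typing import Dict, Iterable, List


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    # hashlib gives the same digest across runs, unlike the built-in hash(),
    # which is randomized per process for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(keys: Iterable[str], num_shards: int) -> Dict[int, List[str]]:
    """Group keys by the shard that would store them."""
    shards: Dict[int, List[str]] = {i: [] for i in range(num_shards)}
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards


# Usage: spread 1,000 hypothetical user keys across 4 shards.
shards = partition([f"user-{i}" for i in range(1000)], 4)
sizes = {shard: len(members) for shard, members in shards.items()}
```

Because the shard index depends only on the key, any node can compute where a record lives without a lookup table. One known drawback of plain modulo hashing is that changing `num_shards` remaps most keys, which is why some systems use consistent hashing or fixed-size chunks instead.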