
We are optimizing model1 in Fabric with two very large columns:
TransactionKey
Cardinality: 2.5 billion
Size: 160 GB
Description: surrogate key for SalesTransaction fact table.
SaleDateTime
Cardinality: 6.8 million
Size: 120 GB
Description: datetime (to the second) of when a sale occurred.
Goals:
Reduce model size.
Improve refresh performance (Import mode).
Ensure datetime values remain available.
Analysis per column
1. TransactionKey (very high cardinality)
A surrogate key with cardinality this high rarely adds analytical value in a semantic model, because it is typically not used in measures or for grouping.
High-cardinality columns compress poorly, so the column takes a large share of model memory (160 GB here) and slows refresh.
Best practice: remove the column from the model.
2. SaleDateTime
Needed to analyze transactions by date and time.
However, storing the full datetime at second-level granularity produces 6.8 million distinct values, which compress poorly and account for 120 GB of the model.
To optimize: split the column into Date and Time columns.
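Reduces cardinality significantly: a time column at second precision has at most 86,400 distinct values, and a date column has one value per calendar day.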
Preserves full datetime information (can recombine if needed).
Improves compression and refresh speed.
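The effect of the split on cardinality can be illustrated with a small sketch. This is plain Python on synthetic data, not the Fabric or Power Query transformation itself; the helper names (split_datetime, cardinality) and the generated sales timestamps are assumptions made for illustration only.

```python
from datetime import datetime, timedelta


def split_datetime(values):
    """Split datetimes into two parallel lists: dates and times."""
    dates = [v.date() for v in values]
    times = [v.time() for v in values]
    return dates, times


def cardinality(column):
    """Number of distinct values in a column."""
    return len(set(column))


# Synthetic data (hypothetical): one sale every 37 seconds, about 86 days in total.
start = datetime(2024, 1, 1, 0, 0, 0)
sales = [start + timedelta(seconds=37 * i) for i in range(200_000)]

dates, times = split_datetime(sales)

card_datetime = cardinality(sales)  # every timestamp is distinct
card_date = cardinality(dates)      # one value per calendar day
card_time = cardinality(times)      # capped at 86,400 seconds per day

print("datetime:", card_datetime, "date:", card_date, "time:", card_time)

# The original value is still recoverable from the two columns.
recombined = [datetime.combine(d, t) for d, t in zip(dates, times)]
print("lossless:", recombined == sales)
```

The single datetime column grows with every new second that has a sale, while the time column can never exceed 86,400 distinct values and the date column grows by only one value per day. This is why the two split columns compress better than the original.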
Final Answer:
TransactionKey: Remove the column.
SaleDateTime: Split the column.