Migrating tables to generic partitioning support
vinoth chandar edited this page Mar 27, 2017 · 1 revision
Until 0.3.1, we implicitly assumed that data is partitioned by date (by far the most common layout we observed), i.e. all partitions can be found three levels below the base path, at `basePath/year/month/day`. With PR121, we plan to generalize this by maintaining a `.hoodie_partition_metadata` file under each partition.
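To make the difference concrete, here is a minimal sketch of the two ways of finding partitions: by assuming a fixed date depth, and by looking for the marker file. It is an illustration only, not the hoodie-client implementation. The class name, method names, and sample layout (`region=us/city=sf`) are all made up for this example; only the marker file name comes from the text above.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative sketch only; NOT how hoodie-client actually lists partitions.
class PartitionDiscoverySketch {
    // Marker file name as given in the text above.
    static final String MARKER = ".hoodie_partition_metadata";

    // Date assumption: every directory exactly 3 levels below basePath is a partition.
    static List<String> byDateAssumption(Path basePath) {
        try (Stream<Path> paths = Files.walk(basePath, 3)) {
            return paths
                .filter(Files::isDirectory)
                .filter(p -> basePath.relativize(p).getNameCount() == 3)
                .map(p -> relative(basePath, p))
                .sorted()
                .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Generic layout: a directory at any depth is a partition if it holds the marker file.
    static List<String> byMarkerFile(Path basePath) {
        try (Stream<Path> paths = Files.walk(basePath)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().equals(MARKER))
                .map(p -> relative(basePath, p.getParent()))
                .sorted()
                .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Relative path with '/' separators, independent of the host OS.
    static String relative(Path basePath, Path p) {
        return basePath.relativize(p).toString().replace(File.separatorChar, '/');
    }

    // Hypothetical sample table: one date partition without a marker,
    // one non-date partition with a marker.
    static Path sampleTable() {
        try {
            Path base = Files.createTempDirectory("hoodie-sketch");
            Files.createDirectories(base.resolve("2017/03/27"));
            Path generic = Files.createDirectories(base.resolve("region=us/city=sf"));
            Files.createFile(generic.resolve(MARKER));
            return base;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path base = sampleTable();
        System.out.println("date assumption: " + byDateAssumption(base));
        System.out.println("marker file:     " + byMarkerFile(base));
    }
}
```

On this sample layout, the date assumption finds only `2017/03/27` and misses the two-level partition, while the marker lookup finds only `region=us/city=sf` and misses the date partition that has no metadata yet. That second gap is why existing tables need the migration steps on this page.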
- Before rolling out the new `hoodie-client` jar with these changes, set `withAssumeDatePartitioning(true)` in your `HoodieWriteConfig`. Without this, hoodie-client will look for partitions based on the metadata and, if it cannot find any, will not be able to write data.
- Use the CLI tool with `repair addpartitionmeta` to add this metadata to existing partitions/tables.
- Roll out the new `hoodie-client` with `withAssumeDatePartitioning(false)` [default]; all new partitions will have the metadata going forward.
- You can upgrade query engines with the new `hoodie-hadoop-mr` jar if you plan to have non-date-partitioned tables. The old input format will continue to work on date-partitioned tables.
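As a rough picture of what the metadata backfill in the second step has to achieve, the sketch below walks a date-partitioned table and drops a marker file into every `year/month/day` directory that lacks one. It does not show the CLI's actual behavior: the class and method names are hypothetical, and the marker is written as an empty placeholder because the real file's contents are not described on this page.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Conceptual sketch only; NOT the implementation of `repair addpartitionmeta`.
class PartitionMetaBackfillSketch {
    // Marker file name as given in the text above.
    static final String MARKER = ".hoodie_partition_metadata";

    // Adds an empty placeholder marker to every directory 3 levels below basePath
    // that does not have one yet. Returns the relative paths that were updated.
    static List<String> backfill(Path basePath) {
        List<Path> partitions;
        try (Stream<Path> paths = Files.walk(basePath, 3)) {
            partitions = paths
                .filter(Files::isDirectory)
                .filter(p -> basePath.relativize(p).getNameCount() == 3)
                .sorted()
                .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        List<String> updated = new ArrayList<>();
        for (Path partition : partitions) {
            Path marker = partition.resolve(MARKER);
            if (Files.notExists(marker)) {
                try {
                    Files.createFile(marker); // placeholder; real contents not modeled
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                updated.add(basePath.relativize(partition).toString()
                    .replace(File.separatorChar, '/'));
            }
        }
        return updated;
    }

    // Hypothetical sample table: two date partitions, one already carrying a marker.
    static Path sampleTable() {
        try {
            Path base = Files.createTempDirectory("hoodie-backfill");
            Files.createDirectories(base.resolve("2017/03/26"));
            Path done = Files.createDirectories(base.resolve("2017/03/27"));
            Files.createFile(done.resolve(MARKER));
            return base;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path base = sampleTable();
        System.out.println("first pass:  " + backfill(base));
        System.out.println("second pass: " + backfill(base));
    }
}
```

The sketch is idempotent: a second pass over the same table changes nothing. That is the property a backfill should have if it might be re-run while writers still use `withAssumeDatePartitioning(true)`.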