HIVE-28622: Duplicate Entries in TXN_WRITE_NOTIFICATION_LOG Due to Oracle's Handling of Empty Strings

In Oracle, empty strings ('') are treated as NULL for the VARCHAR2 and CHAR data types. This behavior is an Oracle peculiarity and can be confusing, because most other databases treat an empty string as distinct from NULL.

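As a result, the lookup in addWriteNotificationLog that compares WNL_PARTITION with an empty partition value never finds the existing row on Oracle, because a comparison with NULL is never true. The write is then inserted as a new row instead of updating the existing one, so the TXN_WRITE_NOTIFICATION_LOG table receives duplicate entries for a single Hive ACID transaction involving MERGE statements.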
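The failure mode can be illustrated with a small in-memory model. This is only a sketch, not the metastore code: the `Row` class, the `store` and `record` helpers, the file names, and the comma-joined file list are hypothetical, and it assumes the affected writes carry an empty partition value. SQL NULL is modelled as Java `null`.

```java
import java.util.ArrayList;
import java.util.List;

public class EmptyStringLookupSketch {
    // Hypothetical stand-in for one TXN_WRITE_NOTIFICATION_LOG row.
    static final class Row {
        final String partition; // null models SQL NULL
        String files;

        Row(String partition, String files) {
            this.partition = partition;
            this.files = files;
        }
    }

    // Models how a value is stored or bound: Oracle turns '' into NULL.
    static String store(String value, boolean oracle) {
        return (oracle && value != null && value.isEmpty()) ? null : value;
    }

    // Models the SQL predicate WNL_PARTITION = ?; it is never true if either side is NULL.
    static boolean sqlEquals(String column, String param) {
        return column != null && param != null && column.equals(param);
    }

    // Select-then-update-or-insert, using the original equality-only predicate.
    static void record(List<Row> table, String partition, String files, boolean oracle) {
        String bound = store(partition, oracle);
        for (Row r : table) {
            if (sqlEquals(r.partition, bound)) {
                r.files = r.files + "," + files; // existing row found: update it
                return;
            }
        }
        table.add(new Row(bound, files)); // no match: insert a new row
    }

    // Two writes in the same transaction against the same empty partition value.
    static int rowsAfterTwoWrites(boolean oracle) {
        List<Row> table = new ArrayList<>();
        record(table, "", "file_a", oracle);
        record(table, "", "file_b", oracle);
        return table.size();
    }

    public static void main(String[] args) {
        System.out.println("other backends: " + rowsAfterTwoWrites(false) + " row(s)");
        System.out.println("Oracle:         " + rowsAfterTwoWrites(true) + " row(s)");
    }
}
```

In this model the second write updates the first row when the empty string is stored as-is, but adds a second row when it is stored as NULL.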

This discrepancy leads to issues: the _files and _dumpmetadata files in a Hive ACID incremental dump will not align if the dump scope includes one or more MERGE statements. Consequently, the Hive ACID incremental LOAD fails at the target (DR), blocking subsequent replication executions.

Solution
* Add an additional check for the partition being NULL
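The effect of the extra check can be sketched by modelling both predicates as plain Java functions. This is an illustration under assumptions, not the metastore code: `matchesBefore` and `matchesAfter` are hypothetical names, the partition values `p=1` and `p=2` are made up, and SQL NULL is modelled as Java `null`.

```java
public class NullAwarePartitionMatchSketch {
    // SQL-style equality: never true when either side is NULL (modelled as null).
    static boolean sqlEquals(String column, String param) {
        return column != null && param != null && column.equals(param);
    }

    // Original predicate: "WNL_PARTITION" = ?
    static boolean matchesBefore(String column, String param) {
        return sqlEquals(column, param);
    }

    // Patched predicate: ("WNL_PARTITION" = ? OR "WNL_PARTITION" IS NULL)
    static boolean matchesAfter(String column, String param) {
        return sqlEquals(column, param) || column == null;
    }

    public static void main(String[] args) {
        // Oracle: the empty partition was stored as NULL, and the bound '' is NULL too.
        System.out.println("Oracle, before: " + matchesBefore(null, null));
        System.out.println("Oracle, after:  " + matchesAfter(null, null));

        // Backends that keep '' as '': plain equality already matches.
        System.out.println("Other, before:  " + matchesBefore("", ""));
        System.out.println("Other, after:   " + matchesAfter("", ""));

        // Non-empty partition values behave the same before and after.
        System.out.println("Same value:     " + matchesAfter("p=1", "p=1"));
        System.out.println("Different:      " + matchesAfter("p=1", "p=2"));
    }
}
```

Note that the `IS NULL` branch does not depend on the bound value, so within the same database, table, and transaction the patched predicate matches a NULL-partition row whatever partition is being looked up.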

Testing:
* Tested on a cluster with Oracle and MySQL as the backend database
harshal-16 committed Jan 21, 2025
1 parent bc87c4d commit 327c3d6
Showing 1 changed file with 1 addition and 1 deletion.
@@ -1205,7 +1205,7 @@ private void addWriteNotificationLog(List<NotificationEvent> eventBatch, List<Ac
 String select = sqlGenerator.addForUpdateClause("select \"WNL_ID\", \"WNL_FILES\" from" +
 " \"TXN_WRITE_NOTIFICATION_LOG\" " +
 "where \"WNL_DATABASE\" = ? " +
-"and \"WNL_TABLE\" = ? " + " and \"WNL_PARTITION\" = ? " +
+"and \"WNL_TABLE\" = ? " + " and (\"WNL_PARTITION\" = ? OR \"WNL_PARTITION\" IS NULL) " +
 "and \"WNL_TXNID\" = ? ");
 List<Integer> insertList = new ArrayList<>();
 Map<Integer, Pair<Long, String>> updateMap = new HashMap<>();
