Description
Describe the bug
Looking at this logic:
def processNonCounterRow(row: Row, whereClause: String): Unit = {
  if (ttlColumn.equals("None")) {
    val rs = getSourceRow(selectStmtWithTs, whereClause, cassandraConnPerPar, customFormat)
    if (rs.nonEmpty) {
      processRowWithTimestamp(row, whereClause, rs)
    } else {
      val rs = getSourceRow(selectStmtWithTTL, whereClause, cassandraConnPerPar, customFormat)
      if (rs.nonEmpty) {
        processRowWithTTL(row, whereClause, rs)
      }
    }
  }
}
I think the normal case is that val rs = getSourceRow(selectStmtWithTs... returns a result.
But if the row was deleted after the row sets to be worked on were created in dataReplicationProcess, then rs is empty.
The code then falls back to getSourceRow(selectStmtWithTTL..., i.e. the version with TTL!
However, in our case we do not pass a TTL column.
But look at how selectStmtWithTTL is constructed:
case s if s.equals("None") => ""...
Because the TTL column is not set, it returns an empty string.
We then call getSourceRow with cls being... an empty string.
The "Column list (cls) cannot be null or empty" error handling then triggers and stops the whole replication process.
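To make the chain above concrete, here is a minimal, self-contained sketch of the failure path. Everything in it is a hypothetical stand-in, not the project's actual code: buildSelectStmtWithTTL only mirrors the quoted `case s if s.equals("None") => ""` branch, the `ttl(...)` branch is invented for illustration, and the simplified getSourceRow assumes the validation is a plain `require` that throws.

```scala
object TtlFallbackSketch {
  // Hypothetical stand-in for how selectStmtWithTTL is built: with no TTL
  // column configured ("None"), the statement is an empty string.
  def buildSelectStmtWithTTL(ttlColumn: String): String = ttlColumn match {
    case s if s.equals("None") => ""
    case s                     => s"ttl($s)" // assumed shape, for illustration only
  }

  // Hypothetical, simplified stand-in for getSourceRow: it validates the
  // column list before querying (assumed here to throw on an empty list).
  def getSourceRow(cls: String, whereClause: String): Seq[String] = {
    require(cls != null && cls.nonEmpty, "Column list (cls) cannot be null or empty")
    Seq.empty // the row was deleted, so no result comes back
  }
}
```

With ttlColumn set to "None", buildSelectStmtWithTTL yields "", and passing that to getSourceRow fails validation before any query is sent, which matches the behavior described above.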
To Reproduce
Not trivial; it would require deleting rows at exactly the right time.
Expected behavior
Not very sure, tbh. As far as I can tell, this row could be skipped because it would show up in the next newDeletesDF.
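One possible shape for the "skip it" behavior, as a rough sketch only: check whether the TTL statement is empty before falling back to it. The Outcome type, the fetch parameter (standing in for getSourceRow), and the simplified signature are all assumptions made so the example is self-contained; the real method takes a Row and a where clause and returns Unit.

```scala
object SkipDeletedRowSketch {
  // Hypothetical result type, used only to make the sketch observable.
  sealed trait Outcome
  case object ProcessedWithTimestamp extends Outcome
  case object ProcessedWithTTL extends Outcome
  case object Skipped extends Outcome

  def processNonCounterRow(
      selectStmtWithTs: String,
      selectStmtWithTTL: String,
      fetch: String => Seq[String] // hypothetical stand-in for getSourceRow
  ): Outcome = {
    if (fetch(selectStmtWithTs).nonEmpty) ProcessedWithTimestamp
    // No TTL column configured: skip the row instead of querying with an
    // empty column list. A deleted row is expected in the next delete pass.
    else if (selectStmtWithTTL.isEmpty) Skipped
    else if (fetch(selectStmtWithTTL).nonEmpty) ProcessedWithTTL
    else Skipped
  }
}
```

Under this sketch, a row deleted mid-run with no TTL column configured is skipped, and the empty statement never reaches the lookup.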
Screenshots
n/a
Additional context
n/a