PLUGIN-1950: Log zero records for Table mode and SQL Statement mode. #75
sahusanket wants to merge 1 commit into
Code Review
This pull request introduces logging to identify when source tables or SQL statements return zero records, enhancing observability. It also updates MultiTableDBInputFormat to handle cases where split boundaries are null by returning a single default split. Feedback was provided regarding SQLStatementRecordReader to move the full SQL statement logging from INFO to DEBUG level to prevent the exposure of sensitive information and avoid log bloat.
}
if (!results.next()) {
  if (pos == 0) {
    LOG.info("SQL statement '{}' ('{}') has zero records.", split.getId(), split.getSqlStatement());
Logging the full SQL statement at INFO level can expose sensitive information (such as PII or credentials in the WHERE clause) in the logs. It can also lead to log bloat for very large queries. Consider logging only the statement ID at INFO level and moving the full query to DEBUG level.
LOG.info("SQL statement '{}' has zero records.", split.getId());
LOG.debug("SQL statement '{}' ('{}') has zero records.", split.getId(), split.getSqlStatement());
Add Logging for Empty Tables and Queries in Multi-Table Plugins
Problem
When reading from multiple database tables or executing multiple SQL statements, there is currently no visibility into which tables or queries yielded zero records.
Solution
This PR adds informational logging whenever an ingestion source (table or custom query) produces exactly zero records.
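As a rough illustration of the idea, the sketch below logs once when a reader is exhausted before returning any row. It is not the plugin's code: the class `ZeroRecordReaderSketch`, its fields, and the use of `java.util.logging` with an `Iterator` in place of a JDBC `ResultSet` are all assumptions made to keep the example self-contained. The `fullTableSplit` flag stands in for the full-table split (1=1) check described under Key Changes.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.logging.Logger;

/** Hypothetical stand-in for a record reader; not the plugin's actual class. */
class ZeroRecordReaderSketch {
  private static final Logger LOG =
      Logger.getLogger(ZeroRecordReaderSketch.class.getName());

  private final String sourceName;      // table name or statement ID (assumed)
  private final boolean fullTableSplit; // true when the split covers the whole table
  private final Iterator<String> rows;  // stand-in for a JDBC ResultSet
  private long pos = 0;                 // number of rows read so far
  private boolean reportedEmpty = false;

  ZeroRecordReaderSketch(String sourceName, boolean fullTableSplit, Iterator<String> rows) {
    this.sourceName = sourceName;
    this.fullTableSplit = fullTableSplit;
    this.rows = rows;
  }

  /** Advances to the next row; returns false once the source is exhausted. */
  boolean nextKeyValue() {
    if (!rows.hasNext()) {
      // Log only when nothing was read and the split covers the whole source,
      // so an empty partial split does not raise a false alarm.
      if (pos == 0 && fullTableSplit && !reportedEmpty) {
        reportedEmpty = true;
        LOG.info("Source '" + sourceName + "' has zero records.");
      }
      return false;
    }
    rows.next();
    pos++;
    return true;
  }

  boolean reportedEmpty() {
    return reportedEmpty;
  }

  public static void main(String[] args) {
    ZeroRecordReaderSketch empty =
        new ZeroRecordReaderSketch("db.empty_table", true, Collections.<String>emptyIterator());
    while (empty.nextKeyValue()) { /* no rows to consume */ }
    System.out.println("empty source reported: " + empty.reportedEmpty());

    ZeroRecordReaderSketch full =
        new ZeroRecordReaderSketch("db.users", true, List.of("a", "b").iterator());
    while (full.nextKeyValue()) { /* consume rows */ }
    System.out.println("non-empty source reported: " + full.reportedEmpty());
  }
}
```

Tracking `pos` rather than checking emptiness up front keeps the check on the normal read path, so no extra query is needed to detect an empty source.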
Key Changes
Empty Table Detection in Multi-Table Mode:
MultiTableDBInputFormat: Updated getTableSplits to explicitly detect when a table is empty (the bounding query returns NULL, NULL) and return exactly one full-table split (1=1).
DBTableRecordReader: Added logging to emit "Source table '<fullTableName>' has zero records." when exactly zero rows are read. The reader verifies that it is processing a full-table split (1=1) before logging, to prevent false alarms on empty partial splits when splitsPerTable > 1.
Improved Logging in SQL Statement Mode:
SQLStatementRecordReader: Added logging to emit "SQL statement '<id>' ('<query>') has zero records." when exactly zero rows are read.
Testing
Also verified that when the source contains data, this log line is not printed.
Verified in the PROD environment as well.
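The empty-table fallback described under Key Changes can be sketched roughly as follows. This is an assumption-laden illustration, not the actual getTableSplits implementation: the class `SplitFallbackSketch`, the method `whereClauses`, the numeric split column, and the even range-splitting logic are all hypothetical. Only the behaviour "NULL bounds produce a single 1=1 split" comes from the PR description. The sketch also assumes `splitsPerTable` is at least 1 and that the bounds are small enough not to overflow a `long`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of split generation; not the plugin's actual code. */
class SplitFallbackSketch {

  /**
   * Builds one WHERE clause per split. When the bounding query returned NULL
   * (an empty table), a single full-table clause "1=1" is returned instead.
   */
  static List<String> whereClauses(String column, Long min, Long max, int splitsPerTable) {
    // Assumption: a missing bound on either side is treated as "no bounds".
    if (min == null || max == null) {
      return List.of("1=1");
    }
    List<String> clauses = new ArrayList<>();
    long span = max - min + 1;
    // Ceiling division so the ranges cover [min, max] in at most splitsPerTable pieces.
    long step = Math.max(1, (span + splitsPerTable - 1) / splitsPerTable);
    for (long lo = min; lo <= max; lo += step) {
      long hi = Math.min(max, lo + step - 1);
      clauses.add(column + " >= " + lo + " AND " + column + " <= " + hi);
    }
    return clauses;
  }

  public static void main(String[] args) {
    // Empty table: the bounding query yields NULL, NULL.
    System.out.println(whereClauses("id", null, null, 4));
    // Populated table split into two ranges.
    System.out.println(whereClauses("id", 1L, 10L, 2));
  }
}
```

Returning a single full-table split for an empty table means exactly one reader sees the table, which is what allows that reader to report zero records without partial splits producing duplicate or misleading messages.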