feat: Report all fields which fail registry transformation #496

Open
wants to merge 2 commits into
base: main
14 changes: 11 additions & 3 deletions latch/registry/table.py
@@ -486,14 +486,22 @@ def upsert_record(self, name: str, **values: RegistryPythonValue) -> None:
         cols = self.table.get_columns()

         db_vals: Dict[str, DBValue] = {}
+        errs: List[str] = []
         for k, v in values.items():
             col = cols.get(k)
             if col is None:
                 raise NoSuchColumnError(k)

-            db_vals[k] = to_registry_literal(
-                v, col.upstream_type["type"], resolve_paths=False
-            )
+            try:
+                db_vals[k] = to_registry_literal(
+                    v, col.upstream_type["type"], resolve_paths=False
+                )
+            except RegistryTransformerException as e:
+                # todo(ayush): add a registry_type -> pretty string fn so we aren't printing json here
+                errs.append(f"Error converting field {repr(k)} with value {repr(v)} to type {col.upstream_type['type']}: {e}")
+
+        if len(errs) > 0:
+            raise RegistryTransformerException(f"Could not upsert record {name}:" + "\n".join(errs))
Comment on lines +503 to +504
Contributor

Suggested change
-        if len(errs) > 0:
-            raise RegistryTransformerException(f"Could not upsert record {name}:" + "\n".join(errs))
+        if len(errs) > 0:
+            if len(errs) > 9:
+                rest = len(errs) - 9
+                errs = errs[:9] + [f"({rest} error(s) hidden)"]
+            raise RegistryTransformerException(f"Could not upsert record {name}:" + textwrap.indent("\n".join(errs)))
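
For reference, a minimal self-contained sketch of the truncation idea in the suggestion above. Note that `textwrap.indent` takes a required `prefix` argument, so the two-space prefix used here is an assumption, as is the newline after the header. `format_errors` and `max_shown` are hypothetical names for illustration and are not part of `latch`.

```python
import textwrap
from typing import List


def format_errors(name: str, errs: List[str], max_shown: int = 9) -> str:
    """Build a message listing at most `max_shown` errors (hypothetical helper)."""
    shown = list(errs)
    if len(shown) > max_shown:
        rest = len(shown) - max_shown
        shown = shown[:max_shown] + [f"({rest} error(s) hidden)"]
    # textwrap.indent requires a prefix; two spaces is an arbitrary choice here.
    body = textwrap.indent("\n".join(shown), "  ")
    return f"Could not upsert record {name}:\n{body}"


msg = format_errors("sample_1", [f"error {i}" for i in range(12)])
print(msg)
```

With twelve errors and the default limit, the message contains the header, the first nine errors, and one summary line for the remaining three.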

Contributor Author

Hey @ayushkamat

Thanks for the feedback. I've incorporated the above suggestion, but I'd really prefer to present an error message that includes all transformation failures.

This exception is raised when uploading records to the Registry. This often happens in the final stages of a workflow, so if a subset of errors is obscured, the user must re-execute the entire workflow to discover the remaining errors. This can be time-consuming and expensive.

The hope of this PR was to expose all errors at once to avoid this. With that in mind, would you be willing to keep the original implementation?

I do like the suggestion to use textwrap and I'll incorporate that regardless.

Thanks!
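
A rough sketch of the "report everything at once" approach described in this comment, combined with `textwrap.indent`. It is written against stand-ins rather than the real library: `TransformError` stands in for `RegistryTransformerException`, and `to_int` is a toy converter standing in for `to_registry_literal`. Both are hypothetical, as is `convert_all`.

```python
import textwrap
from typing import Dict, List


class TransformError(ValueError):
    """Stand-in for RegistryTransformerException (hypothetical)."""


def to_int(value: object) -> int:
    """Toy converter standing in for to_registry_literal (hypothetical)."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise TransformError(f"expected int, got {type(value).__name__}")
    return value


def convert_all(name: str, values: Dict[str, object]) -> Dict[str, int]:
    converted: Dict[str, int] = {}
    errs: List[str] = []
    for key, value in values.items():
        try:
            converted[key] = to_int(value)
        except TransformError as e:
            # Record the failure and keep going so every bad field is reported.
            errs.append(f"Error converting field {key!r} with value {value!r}: {e}")
    if errs:
        details = textwrap.indent("\n".join(errs), "  ")
        raise TransformError(f"Could not upsert record {name}:\n{details}")
    return converted
```

Calling `convert_all("r", {"a": 1, "b": "x", "c": 2.5})` raises a single exception whose message names both `'b'` and `'c'`, so one run surfaces every failing field.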

Contributor

@ayushkamat, Oct 18, 2024

I see your point. My counter-argument is that typically there are very few "structurally unique" errors when doing bulk inserts (e.g. if you have an error like "LatchFile" cannot be assigned to type "str" or something, it's likely that there are 300 more errors with the exact same message). Because of this, I think there is limited value in printing everything out all of the time, and we should still limit the output to avoid spam.

That being said, for now it's fine to print everything out, given that it will take some work to separate out all of the unique errors.

Feel free to leave it as is and print everything, but please add a todo to filter this (you can assign me to it).
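
One possible shape for the filtering todo, sketched under a simplifying assumption: errors count as duplicates only when their messages match exactly, which is narrower than "structurally unique". `summarize_errors` is a hypothetical helper, not an existing `latch` function.

```python
from collections import Counter
from typing import List


def summarize_errors(messages: List[str]) -> List[str]:
    """Collapse identical messages into one line each, with a repeat count."""
    # Counter is a dict subclass, so it keeps first-seen order (Python 3.7+).
    counts = Counter(messages)
    return [
        msg if n == 1 else f"{msg} (x{n})"
        for msg, n in counts.items()
    ]
```

Grouping messages that differ only in the field name or value would need a normalization step before counting, which this sketch leaves out.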

Contributor Author

I see what you're saying.

To be clear, this message reports all the errors associated with inserting a single record, not a batch insert. While it is still unlikely that a single record would yield more than 10 errors (hopefully 🙂), I think in this context it would be appropriate to leave the set of errors unfiltered.

I agree that it's sensible to filter error messaging for a bulk insert, but I think that would happen elsewhere upstream.


         self._record_mutations.append(_TableRecordsUpsertData(name, db_vals))
