fast_import: put job status to s3 #11284

Open
wants to merge 4 commits into
base: main

Conversation

NanoBjorn
Contributor

Problem

The fast_import binary runs inside neonvms, which do not currently support proper kubectl describe logs. There are a number of other caveats as well: neondatabase/autoscaling#1320

We needed a signal for whether the job finished successfully and, if it did not, at least an error message for the cplane operation. After a short discussion, we concluded that an S3 object is the most convenient option at the moment.

Summary of changes

If s3_prefix is provided to the fast_import call, every job run uploads a status object to {s3_prefix}/status/fast_import with contents {"done": true} or {"done": false, "error": "..."}. Also added a test.
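
To make the status object's shape concrete, here is a minimal sketch of the key and the two possible bodies. It uses only the standard library so it stays self-contained; the PR itself builds the JSON with serde_json::json!. The helper names `status_key`, `status_body`, and `escape_json` are hypothetical and do not come from the fast_import code.

```rust
/// Object key for the status file: `{s3_prefix}/status/fast_import`.
/// (Hypothetical helper; tolerates a trailing slash on the prefix.)
fn status_key(s3_prefix: &str) -> String {
    format!("{}/status/fast_import", s3_prefix.trim_end_matches('/'))
}

/// Minimal JSON string escaping, needed because error messages are free text.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Body of the status object: `{"done":true}` on success,
/// `{"done":false,"error":"..."}` on failure.
fn status_body(res: &Result<(), String>) -> String {
    match res {
        Ok(()) => r#"{"done":true}"#.to_string(),
        Err(err) => format!(r#"{{"done":false,"error":"{}"}}"#, escape_json(err)),
    }
}

fn main() {
    // The prefix below is a made-up example value.
    println!("{}", status_key("s3://bucket/prefix/"));
    println!("{}", status_body(&Ok(())));
    println!("{}", status_body(&Err("connection \"refused\"".to_string())));
}
```

A consumer such as the cplane operation would then only need to read one object and check the `done` field, falling back to `error` for the message.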

@NanoBjorn NanoBjorn requested a review from a team as a code owner March 17, 2025 19:21
Comment on lines +693 to +695
if std::fs::exists(&status_dir)?.not() {
    std::fs::create_dir(&status_dir).context("create status directory")?;
}
Contributor Author

I was thinking about a separate workdir object with some internal folder/file management; maybe that does make sense. For now I just copied this.
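
As a rough illustration of the workdir-object idea, the sketch below wraps the directory layout in one type. Everything here is hypothetical: the `Workdir` type, its method names, and the assumption that the status directory lives at `<root>/status` are not taken from the fast_import code. It also uses `create_dir_all`, which succeeds when the directory already exists, instead of the exists-then-create check.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical wrapper that owns the layout of the working directory.
struct Workdir {
    root: PathBuf,
}

impl Workdir {
    fn new(root: impl Into<PathBuf>) -> Self {
        Self { root: root.into() }
    }

    /// Returns `<root>/status` (assumed layout), creating it if needed.
    /// `create_dir_all` is a no-op for an existing directory, so no
    /// separate existence check is required.
    fn status_dir(&self) -> io::Result<PathBuf> {
        let dir = self.root.join("status");
        fs::create_dir_all(&dir)?;
        Ok(dir)
    }

    /// Path of the status file inside the status directory.
    fn status_file(&self) -> io::Result<PathBuf> {
        Ok(self.status_dir()?.join("fast_import"))
    }
}

fn main() -> io::Result<()> {
    // Use a throwaway directory under the system temp dir for the demo.
    let root = std::env::temp_dir().join(format!("workdir-sketch-{}", std::process::id()));
    let workdir = Workdir::new(root.clone());
    let status_file = workdir.status_file()?;
    fs::write(&status_file, r#"{"done":true}"#)?;
    println!("wrote {}", status_file.display());
    fs::remove_dir_all(&root)
}
```

With a type like this, each command flow would ask the workdir for its paths instead of repeating the directory-creation snippet.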

if std::fs::exists(&status_dir)?.not() {
    std::fs::create_dir(&status_dir).context("create status directory")?;
}
let status_file = status_dir.join("fast_import");
Contributor Author

I'm not sure about the name of the file. I'd like opinions on that.

Contributor

I would use the same name as in the fast import (Command::PgData) flow: $root/status/pgdata/status. Or we can change it for both.

Contributor Author

@NanoBjorn NanoBjorn Mar 18, 2025

I am not sure I want to break compatibility now without a specific need. I think the current names and files are OK. I can put the command name and args into the file as well.


github-actions bot commented Mar 17, 2025

7986 tests run: 7603 passed, 0 failed, 383 skipped (full report)


Code coverage* (full report)

  • functions: 32.3% (8735 of 27027 functions)
  • lines: 48.4% (74927 of 154833 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
c4ace56 at 2025-03-18T21:09:06.202Z :recycle:

Contributor

@VladLazar VladLazar left a comment

This requires a cplane-side change as well, right?
Currently the cplane import operation simply waits for the job to complete here.

Comment on lines 697 to 701
let res_obj = if res.is_ok() {
    serde_json::json!({"done": true})
} else {
    serde_json::json!({"done": false, "error": res.unwrap_err().to_string()})
};
Contributor

nit: you could match to avoid the unwrap:

let res_obj = match res {
  Ok(_) => serde_json::json!({"done": true}),
  Err(err) => serde_json::json!({"done": false, "error": err.to_string()})
};

Contributor Author

thank you, fixed!

    serde_json::json!({"done": false, "error": res.unwrap_err().to_string()})
};
std::fs::write(&status_file, res_obj.to_string()).context("write status file")?;
aws_s3_sync::upload_dir_recursive(
Contributor

I don't think this does any retries. If we face a transient S3 error, the job will fail and the S3 status won't reflect that.

Contributor Author

Added retry config to the clients.
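
For readers unfamiliar with the concern in this thread, the sketch below shows the general idea of retrying a transient failure with exponential backoff. This is not what the PR does: the fix described above is retry configuration on the S3 clients, whereas this is a generic, standard-library-only wrapper. The function name `retry_with_backoff` and its parameters are hypothetical.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Retry `op` up to `max_attempts` times, doubling the delay after each failure.
/// `op` receives the 1-based attempt number. The last error is returned if
/// every attempt fails.
fn retry_with_backoff<T, E, F>(
    max_attempts: u32,
    initial_delay: Duration,
    mut op: F,
) -> Result<T, E>
where
    F: FnMut(u32) -> Result<T, E>,
{
    assert!(max_attempts > 0, "need at least one attempt");
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
        }
    }
}

fn main() {
    // Simulated upload that fails twice with a transient error, then succeeds.
    let mut calls = 0;
    let result: Result<&str, String> =
        retry_with_backoff(5, Duration::from_millis(1), |attempt| {
            calls += 1;
            if attempt < 3 {
                Err(format!("transient error on attempt {attempt}"))
            } else {
                Ok("uploaded")
            }
        });
    println!("{result:?} after {calls} calls");
}
```

Either way, the status upload remains the last step, so a failure that outlasts all retries still leaves no status object and the consumer has to treat a missing object as its own case.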


2 participants