-
Notifications
You must be signed in to change notification settings - Fork 32
test: set LC_COLLATE (TC-3070) #2051
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Reviewer's guide (collapsed on small PRs)Reviewer's GuideInjects setting of LC_COLLATE="C" into the embedded test database initialization by wrapping env::set_var in an unsafe block before calling create_for, ensuring deterministic byte-order sorting alignment between PostgreSQL and Rust in tests. Sequence diagram for embedded test database initialization with LC_COLLATEsequenceDiagram
participant T as Test Runner
participant E as embedded.rs
participant Env as Environment
participant PG as PostgreSQL Embedded
T->>E: Call create()
E->>Env: Set LC_COLLATE="C" (unsafe)
E->>PG: create_for(default_settings())
PG-->>E: Embedded DB instance
E-->>T: Return (Database, PostgreSQL)
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `common/db/src/embedded.rs:28-30` </location>
<code_context>
+ // LC_COLLATE="C" ensures deterministic, byte-order sorting that matches Rust's string comparison,
+ // making tests portable across different system locales.
+ // This affects only the embedded test database, not production databases.
+ unsafe {
+ env::set_var("LC_COLLATE", "C");
+ }
create_for(default_settings()?).await
</code_context>
<issue_to_address>
**suggestion:** Consider whether the use of 'unsafe' is necessary for setting an environment variable.
Since 'env::set_var' is safe, removing the 'unsafe' block will make the code clearer and avoid unnecessary use of unsafe.
```suggestion
env::set_var("LC_COLLATE", "C");
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #2051 +/- ##
=======================================
Coverage 68.46% 68.46%
=======================================
Files 366 366
Lines 20608 20608
Branches 20608 20608
=======================================
Hits 14109 14109
+ Misses 5675 5671 -4
- Partials 824 828 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
I think this ok as a workaround. However, I'd also say it's a valid use case in general and should be supported by postgresql-embedded. Which would mean:
|
TBH I thought about this and I was initially on the same page with you but then I also thought that, in the end, |
|
I think it should be an option for the user (of the postgresql-embedded API) to use it. Or have the default. |
|
Digging into this a big, |
|
Just digging in that code anyway. What we use for tests is: db.execute(Statement::from_string(
db.get_database_backend(),
format!("CREATE DATABASE \"{}\";", database.name),
))
.await?;
db.close().await?;And to my understanding you can set those parameters with the |
|
Lines 53 to 58 in 0d1884c
|
It does, thanks! |
Awesome, I didn't notice this |
@ctron Sorry, I'm getting old because yesterday I noticed this but I stopped myself from touching it and I forgot about. Line 38 in f973764
and Trustify can not force the LC_COLLATE that a user can legitimately defined for the DB they provide for running Trustify.The alternative was to make LC_COLLATE condition in bootstrap but I evaluated it too risky for something that is meant to never be applied in prod and always be applied exclusively to tests so I preferred the env::set_var just for embedded DB.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll approve and defer to you and @ctron to agree to merge.
|
Just an FYI, but in addition to setting collation during db create, we could also do it at table creation or even at query time, e.g. I'm not sure which is correct, tbh. But if we want to leave it up to the user, I'd say our test is wrong. I'd vote to use test fixtures that are predictable regardless of collation config, i.e. use only numbers or letters. |
|
That code ( If you're concerned with it, you could create an additional argument and pass it along to enable this behavior. I'd find this safe than setting a (process) global env-var. |
75ab9a0 to
28c3325
Compare
"Never used" sounds great to me 🤩 |
|
I confirmed the fix locally, but can you squash the 3 redundantly-named commits before merging, please? |
Signed-off-by: mrizzi <[email protected]>
834c100 to
70e39bd
Compare
Setting
LC_COLLATE="C"in the embedded test database ensures the database sorting matches Rust's sorting behavior because both use byte-order comparison, making them compatible for test validation.I had to set the env var because:
LC_COLLATEis set during database cluster initialization (initdb)initdbrunspostgresql_embedded(v0.20.0) doesn't exposeinitdbparameter customizationLC_COLLATEcannot be changed after database creation@jcrossley3 let me know if this works locally for you.
Summary by Sourcery
Tests: