Fix(tests): Address some test flakiness #5209
Conversation
.circleci/continue_config.yml
Outdated
- bigquery
- clickhouse-cloud
- athena
#- snowflake
TODO: revert prior to merge
Force-pushed from dcff7e8 to 85c2dbf
@@ -5,7 +5,7 @@ gateways:
   type: duckdb
   catalogs:
     memory: ':memory:'
-    testing: 'testing.duckdb'
+    testing: "{{ var('tmp_path') }}/testing.duckdb"
This was causing flakiness in style_and_cicd_tests. For example:

duckdb.duckdb.IOException: IO Error: Could not set lock on file "testing.duckdb": Conflicting lock is held in /home/circleci/.pyenv/versions/3.12.11/bin/python3.12 (PID 3249). See also https://duckdb.org/docs/stable/connect/concurrency

The duckdb integration tests run as part of that job, and without prefixing the path, testing.duckdb gets created in the sqlmesh root dir rather than in the unique dir for each test. This causes it to be reused between tests and potentially accessed in parallel from multiple workers, even though there is an xdist_group that forces most of the tests to run sequentially.
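As an illustration of the per-test path idea (not the PR's actual code; the fixture name and test below are hypothetical), building the DuckDB file path from pytest's tmp_path keeps every test on its own database file, so parallel xdist workers never contend for a lock on a shared file in the repo root:

```python
# Minimal sketch: give each test its own DuckDB file so parallel workers
# never fight over a shared "testing.duckdb" in the project root.
import duckdb
import pytest


@pytest.fixture
def testing_duckdb_path(tmp_path):
    # tmp_path is unique per test, so the file cannot be shared across tests
    return tmp_path / "testing.duckdb"


def test_isolated_catalog(testing_duckdb_path):
    con = duckdb.connect(str(testing_duckdb_path))
    con.execute("CREATE TABLE t AS SELECT 1 AS x")
    assert con.execute("SELECT x FROM t").fetchone() == (1,)
    con.close()
```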
sushi_test_schema = ctx.add_test_suffix("sushi")
sushi_state_schema = ctx.add_test_suffix("sushi_state")
raw_test_schema = ctx.add_test_suffix("raw")

config = load_config_from_paths(
This was duplicating logic already in TestContext, since it pre-dated TestContext.create_context().
@@ -2383,33 +2362,15 @@ def _normalize_snowflake(name: str, prefix_regex: str = "(sqlmesh__)(.*)"):

init_example_project(tmp_path, ctx.engine_type, schema_name=schema_name)

config = load_config_from_paths(
This was duplicating logic already in TestContext, since it pre-dated TestContext.create_context().
@@ -27,15 +27,15 @@
logger = logging.getLogger(__name__)


@pytest.fixture(scope="session")
This was session-scoped because its predecessor was session-scoped and that scope got retained through an earlier refactor.

However, I was encountering some very hard-to-pin-down issues that I have seen a bunch of times involving StashKey, for example:

teardown failed on attempt 2! Exiting immediately!
Traceback (most recent call last):
File "/home/********/.pyenv/versions/3.12.11/lib/python3.12/site-packages/_pytest/runner.py", line 344, in from_call
result: TResult | None = func()
^^^^^^
KeyError: <_pytest.stash.StashKey object at 0x7f1666072320>

That looks like a concurrency issue, and my theory is that it was related to session-scoped and function-scoped fixtures being mixed. So I've made this fixture function-scoped, because I couldn't see a strong reason for it to be session-scoped.
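To illustrate the scope change only (the fixture name and body below are hypothetical, not the one in the PR), switching from session scope to the default function scope means every test builds a fresh instance instead of sharing one across the whole run:

```python
# Hypothetical fixture showing the scope change; the real fixture in the PR
# does something engine-specific.
import pytest

# Before: one shared instance for the entire test session
# @pytest.fixture(scope="session")

# After: the default function scope, so each test gets its own instance and
# nothing is shared between tests running on different xdist workers.
@pytest.fixture
def engine_settings():
    return {"retries": 3, "timeout_seconds": 30}
```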
Force-pushed from fd3dfe1 to 5f5a27d
@@ -212,6 +212,38 @@ def pytest_collection_modifyitems(items, *args, **kwargs):
item.add_marker("fast")


@pytest.hookimpl(hookwrapper=True, tryfirst=True)
def pytest_runtest_makereport(item: pytest.Item, call: pytest.CallInfo):
This seems to be the only hook that can catch the StashKey errors. Fixtures that yield their values (like tmp_path) appear to hit different codepaths than other types of fixtures. I tried both pytest_fixture_post_finalizer and pytest_runtest_teardown before resorting to this.
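For context, here is a minimal sketch of how a hookwrapper on pytest_runtest_makereport could intercept the flaky teardown failure; this is an assumption about the approach rather than the PR's exact implementation, and the string match on the error text is purely illustrative:

```python
# Sketch (not the PR's exact code) of tolerating the StashKey KeyError that
# surfaces while tearing down yield-fixtures such as tmp_path.
import pytest


@pytest.hookimpl(hookwrapper=True, tryfirst=True)
def pytest_runtest_makereport(item: pytest.Item, call: pytest.CallInfo):
    outcome = yield  # let pytest build the report first
    report = outcome.get_result()

    # Only look at failures raised during fixture teardown
    if call.when == "teardown" and report.failed and call.excinfo is not None:
        # Heuristic match on the error we want to tolerate (illustrative only)
        if call.excinfo.errisinstance(KeyError) and "StashKey" in report.longreprtext:
            report.outcome = "passed"
            report.longrepr = None
```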
Force-pushed from 5f5a27d to 2ad5091
@@ -51,7 +51,7 @@ databricks_init() {

# Note: the cluster doesnt need to be running to create / drop catalogs, but it does need to be running to run the integration tests
echo "Ensuring cluster is running"
-databricks clusters start $CLUSTER_ID || true
+databricks clusters start $CLUSTER_ID
This is to make Databricks fail early with the cluster start error rather than spend 40 minutes timing out while trying to run the tests.
Nice work 👍
This reverts commit dedc368.
Force-pushed from 6a024ea to e5f4287
I noticed the Fabric tests failing on main with:

The cause was essentially the create-warehouse API calls failing when the warehouse already existed. SQLMesh code generally expects CREATE IF NOT EXISTS / DROP IF EXISTS semantics, so I implemented these.

In addition, some other tests are starting to become more flaky, with issues like:

and

so I had a go at addressing these too. The general theme is that we run tests concurrently for speed, but we aren't always good at ensuring each test gets its own unique copy of things and can work in isolation from other tests.
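As a sketch of the IF NOT EXISTS / IF EXISTS idea (the REST client, endpoints, and status codes below are assumptions for illustration, not Fabric's documented API or the PR's code), the wrappers simply treat "already exists" and "not found" responses as success:

```python
# Rough sketch: emulate CREATE IF NOT EXISTS / DROP IF EXISTS on top of a
# create/drop HTTP API by tolerating conflict and not-found responses.
# Endpoints and status codes are hypothetical.
import requests


def create_warehouse_if_not_exists(session: requests.Session, base_url: str, name: str) -> None:
    """Create a warehouse, treating an 'already exists' response as success."""
    resp = session.post(f"{base_url}/warehouses", json={"displayName": name})
    if resp.status_code == 409:  # conflict: warehouse already exists
        return
    resp.raise_for_status()


def drop_warehouse_if_exists(session: requests.Session, base_url: str, warehouse_id: str) -> None:
    """Drop a warehouse, treating a 'not found' response as success."""
    resp = session.delete(f"{base_url}/warehouses/{warehouse_id}")
    if resp.status_code == 404:  # warehouse does not exist
        return
    resp.raise_for_status()
```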