Commit 0328ac9

committed
Updated for filenotfound error
1 parent 2fe67e0 commit 0328ac9

1 file changed

+33
-2
lines changed

docs/website/docs/reference/troubleshooting.md

+33-2
@@ -181,7 +181,7 @@ Timestamp issues occur when formats are incompatible with the destination or inc
181181

182182
- Standardize timestamp formats across all runs to maintain consistent schema inference and avoid the creation of variant columns.
183183

184-
3. Inconsistent formats for incremental loading
184+
3. **Inconsistent formats for incremental loading**
185185

186186
- **Scenario:**
187187

@@ -402,6 +402,37 @@ Failures in the **Load** stage often relate to authentication issues, schema cha
402402

403403
- Use schema evolution to handle column renaming. [Read more about schema evolution.](../general-usage/schema-evolution#evolving-the-schema)
404404

405+
### `FileNotFoundError` for `schema_updates.json` in parallel runs
406+
407+
- **Scenario:**
408+
When several runs of a pipeline with the same `pipeline_name` execute in parallel (e.g., as concurrent Airflow tasks), `dlt` may fail at the load stage with an error like:
409+
410+
> `FileNotFoundError: schema_updates.json not found`
411+
412+
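This happens because `schema_updates.json` is generated during normalization and stored in the pipeline's working directory. Concurrent runs that use the same pipeline name share that directory, so one run may overwrite the file or lock access to it while another run still needs it, causing failures.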
413+
414+
- **Possible Solutions:**
415+
416+
1. **Use unique pipeline names for each parallel run**
417+
418+
If calling `pipeline.run()` multiple times within the same workflow (e.g., once per resource), assign a unique `pipeline_name` for each run. This ensures separate working directories, preventing file conflicts.
419+
420+
2. **Leverage dlt’s concurrency management or Airflow helpers**
421+
422+
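dlt's Airflow integration can decompose a source into a sequence of separate tasks. With `decompose="serialize"`, those tasks run one after another, so no two of them access the pipeline's working directory at the same time. To split resource extraction into tasks without file conflicts, use: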
423+
```py
424+
decompose="serialize"
425+
```
426+
More details are available in the [Airflow documentation](../walkthroughs/deploy-a-pipeline/deploy-with-airflow-composer#2-valueerror-can-only-decompose-dlt-source).
427+
428+
3. **Disable dev mode to prevent multiple destination datasets**
429+
430+
When `dev_mode=True`, dlt generates unique dataset names (`<dataset_name>_<timestamp>`) for each run. To maintain a consistent dataset, set:
431+
```py
432+
dev_mode=False
433+
```
434+
Read more about this in the [dev mode documentation](../general-usage/pipeline#do-experiments-with-dev-mode).
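As an illustration of the first solution, here is a minimal sketch of giving each parallel run its own pipeline name. The helper names `unique_pipeline_name` and `run_isolated`, the `duckdb` destination, and the `shared_dataset` dataset name are assumptions made for this example and do not come from the dlt docs; only `dlt.pipeline(...)` and `pipeline.run(...)` are dlt calls.

```python
import re
import uuid


def unique_pipeline_name(base: str, suffix: str) -> str:
    """Build a pipeline name that differs for each parallel run (hypothetical helper)."""
    # Replace characters that are awkward in directory names with underscores.
    safe = re.sub(r"[^0-9A-Za-z_]+", "_", suffix).strip("_")
    # Fall back to a short random suffix if nothing usable remains.
    return f"{base}_{safe}" if safe else f"{base}_{uuid.uuid4().hex[:8]}"


def run_isolated(base: str, suffix: str, data, table_name: str):
    """Run one load under its own pipeline name (hypothetical helper)."""
    import dlt  # imported lazily so the naming helper works without dlt installed

    pipeline = dlt.pipeline(
        pipeline_name=unique_pipeline_name(base, suffix),  # separate working directory per run
        destination="duckdb",           # assumption: any configured destination works here
        dataset_name="shared_dataset",  # assumption: all runs load into the same dataset
    )
    return pipeline.run(data, table_name=table_name)
```

Each parallel worker would call `run_isolated` with its own suffix, for example the resource name or the Airflow task id. Because the pipeline names differ, each run keeps its own working directory and its own pipeline state, so state such as incremental cursors is not shared between them.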
435+
405436
### Memory management issues
406437

407438
- **Scenario:**
@@ -412,7 +443,7 @@ Failures in the **Load** stage often relate to authentication issues, schema cha
412443

413444
- Pipeline failures due to out-of-memory errors.
414445

415-
- **Solution:**
446+
- **Possible Solution:**
416447

417448
- Enable file rotation. [Read more about it here.](./performance#controlling-intermediary-file-size-and-rotation)
418449
