4 changes: 4 additions & 0 deletions cpp/include/legate_dataframe/parquet.hpp
@@ -98,6 +98,10 @@ class ParquetReadArray : public Task<ParquetReadArray, OpCode::ParquetReadArray>
* ├── part-2.parquet
* └── ...
*
* This function creates the directory if it does not exist, but it does not
* ensure the directory is empty. If a previous write produced more partitions,
* the stale files remain and leave the directory in an inconsistent state.
*
* @param tbl The table to write.
* @param path Destination directory for data.
*/
4 changes: 1 addition & 3 deletions cpp/src/parquet.cpp
@@ -492,9 +492,7 @@ ParquetReadInfo get_parquet_info(const std::vector<std::string>& file_paths,
void parquet_write(LogicalTable& tbl, const std::string& dirpath)
{
std::filesystem::create_directories(dirpath);
if (!std::filesystem::is_empty(dirpath)) {
throw std::invalid_argument("if path exist, it must be an empty directory");
}

auto runtime = legate::Runtime::get_runtime();
legate::AutoTask task =
runtime->create_task(get_library(), task::ParquetWrite::TASK_CONFIG.task_id());
5 changes: 5 additions & 0 deletions python/legate_dataframe/lib/parquet.pyx
@@ -68,6 +68,11 @@ def parquet_write(LogicalTable tbl, path: pathlib.Path | str) -> None:
├── part.2.parquet
└── ...

.. note::
This function creates the directory if it does not exist, but it does not
ensure the directory is empty. If a previous write produced more partitions,
the stale files remain and leave the directory in an inconsistent state.

See Also
--------
parquet_read: Read parquet data
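
With the emptiness check removed, a caller that rewrites into the same directory may want to clear old part files before calling `parquet_write`. The following is only a minimal sketch of that idea: `remove_stale_parquet_files` is a hypothetical helper, not part of legate_dataframe, and it assumes every leftover file ends in `.parquet` and that it runs once, before any write task is launched.

```python
import pathlib


def remove_stale_parquet_files(path):
    """Delete leftover *.parquet files directly inside `path`.

    Hypothetical helper (not part of legate_dataframe). Returns the
    names of the files it removed, sorted alphabetically.
    """
    directory = pathlib.Path(path)
    if not directory.is_dir():
        # Nothing to clean up; the directory does not exist yet.
        return []

    removed = []
    for entry in sorted(directory.glob("*.parquet")):
        # Only touch regular files; leave subdirectories and other files alone.
        if entry.is_file():
            entry.unlink()
            removed.append(entry.name)
    return removed
```

Calling `remove_stale_parquet_files(path)` before `parquet_write(tbl, path)` would keep a shorter second write from leaving old partitions behind. Whether deleting files is acceptable depends on the caller, which is presumably why the library itself leaves the directory untouched.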