Write the file once; the first write in append mode succeeds. Then write again in append mode and it fails:

Py4JJavaError: An error occurred while calling o161.save.
: org.apache.hadoop.fs.FileAlreadyExistsException: mkdir on existing directory cos://catalogdsxreproduce4a77ab6a4f2f47b3b6bedc7174a64c4a.os_a9bbfb9f99684afe9ec11076b75f1831_configs/TESTAPPEND/CARS
at com.ibm.stocator.fs.ObjectStoreFileSystem.mkdirs(ObjectStoreFileSystem.java:453)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.setupJob(FileOutputCommitter.java:313)
at org.apache.spark.internal.io.HadoopMapReduceCommitProtocol.setupJob(HadoopMapReduceCommitProtocol.scala:118)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp
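
For reference, a minimal PySpark sketch of the failing sequence (an assumed reconstruction: spark, df, the default save format, and the bucket/path are illustrative stand-ins; the actual code is in the notebook linked below):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("stocator-append-repro").getOrCreate()
df = spark.createDataFrame([("buick", 1987), ("tesla", 2017)], ["make", "year"])

# Illustrative stand-in for the cos:// URI in the trace above.
target = "cos://mybucket.myservice/TESTAPPEND/CARS"

# First append-mode write: the directory does not exist yet, so
# FileOutputCommitter.setupJob() can create it and the save succeeds.
df.write.mode("append").save(target)

# Second append-mode write: setupJob() calls mkdirs() on the now-existing
# directory, and Stocator's ObjectStoreFileSystem.mkdirs() raises
# FileAlreadyExistsException.
df.write.mode("append").save(target)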
Full notebook:
https://dataplatform.ibm.com/analytics/notebooks/v2/ec6f5fd0-6141-493c-b2cc-979a9b312393/view?access_token=3b6130f2206249bd03795f932ca3ad30f110321a3086dac07ee4d3eb4d4cbe56

Looking at the append method in the connector code, I see append is not supported:
https://github.com/CODAIT/stocator/blob/0866ef099c838efbfe46e7ad6a036ecfbed2012d/src/main/java/com/ibm/stocator/fs/ObjectStoreFileSystem.java

public FSDataOutputStream append(Path f, int bufferSize,
    Progressable progress) throws IOException {
  throw new IOException("Append is not supported in the object storage");
}
If append is not supported, is there a workaround? Or maybe the connector should report that append is not supported, rather than throwing the error above.
@charles2588 thanks for reporting this. In general, append + object storage is usually a bad idea, no matter which connector you use. I will review the issue you observed to better understand the root cause and propose the best way to resolve it.
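
In the meantime, one possible workaround sketch (an assumption, not a documented Stocator recipe; it reuses spark and df from the sketch above, and the prefix is illustrative): write each batch to its own fresh sub-path under a common prefix, so the committer never calls mkdirs() on an existing directory, and read the whole prefix back as one dataset.

import uuid

base = "cos://mybucket.myservice/TESTAPPEND/CARS"  # illustrative prefix

# Each batch lands in a brand-new directory, so FileOutputCommitter's
# mkdirs() always targets a path that does not exist yet.
df.write.mode("errorifexists").save("%s/batch-%s" % (base, uuid.uuid4().hex))

# Readers see all batches under the prefix as a single logical table.
all_cars = spark.read.load(base + "/batch-*")

The trade-off is more, smaller objects under the prefix, which is usually a more natural shape for object storage than in-place appends anyway.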