bentsherman

Close #5905

This PR preserves the source path of remote input files in the FileHolder.

Here are the key changes:

TaskProcessor:

             if( item instanceof Path || coerceToPath ) {
                 final path = resolvePath(item)
                 final target = executor.isForeignFile(path) ? foreignFiles.addToForeign(path) : path
-                final holder = new FileHolder(target)
+                final holder = new FileHolder(path, target)
                 files << holder
             }

FileHolder:

     FileHolder( Path path ) {
         this.sourceObj = path
         this.storePath = real(path)
         this.stageName = norm(path.getFileName())
     }
 
     FileHolder( def origin, Path path ) {
         this.sourceObj = origin
-        this.storePath = path
+        this.storePath = real(path)
         this.stageName = norm(path.getFileName())
     }

The two-arg constructor is updated to match the behavior of the one-arg constructor using the real() method:

    static private Path real( Path path ) {
        try {
            // main reason for this is to resolve symlinks to real file location
            // hence apply only for default file system
            // note: also for Google Cloud storage path it may convert to relative path
            // it may return invalid (relative) paths therefore do not apply it
            return path.getFileSystem() == FileSystems.default ? path.toRealPath() : path
        }
        catch( Exception e ) {
            log.trace "Unable to get real path for: $path"
            return path
        }
    }

The real() method only resolves symlinks for local files, so path resolution should behave the same way as before for both local and remote files.
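The behavior described above can be checked in isolation. The following is a minimal Java sketch, not the Nextflow implementation: the class name `RealPathSketch` is hypothetical, the zip file system is only a stand-in for a remote (non-default) file system, and the logging call from the original method is omitted. Creating the symlink may require extra privileges on some platforms.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical class name, used for illustration only
class RealPathSketch {

    // Same decision as real() above: resolve symlinks only on the default file system
    public static Path real(Path path) {
        try {
            return path.getFileSystem() == FileSystems.getDefault() ? path.toRealPath() : path;
        }
        catch (IOException e) {
            // e.g. the file does not exist: fall back to the path as given
            return path;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("real-sketch");
        Path file = Files.createFile(dir.resolve("data.txt"));
        Path link = Files.createSymbolicLink(dir.resolve("link.txt"), file);

        // Local symlink: resolved to the file it points to
        System.out.println(Files.isSameFile(real(link), file));   // prints true

        // Local path that does not exist: returned unchanged
        Path missing = dir.resolve("missing.txt");
        System.out.println(real(missing).equals(missing));        // prints true

        // Non-default file system (a zip archive standing in for remote storage):
        // the path is returned as-is, toRealPath() is never called
        Path zip = dir.resolve("archive.zip");
        URI uri = URI.create("jar:" + zip.toUri());
        try (FileSystem zfs = FileSystems.newFileSystem(uri, Map.of("create", "true"))) {
            Path inside = zfs.getPath("/remote.txt");
            System.out.println(real(inside) == inside);           // prints true
        }
    }
}
```

The remote case is the one that matters for this PR: because the path is passed through untouched, applying `real()` in the two-arg constructor leaves `storePath` for remote inputs the same as before.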

A future improvement would be to expose this information through a less internal data structure in TaskEvent, but this change is needed either way to preserve the remote input path in the TaskRun.

@bentsherman bentsherman requested a review from pditommaso October 8, 2025 18:27

netlify bot commented Oct 8, 2025

Deploy Preview for nextflow-docs-staging canceled.

Name Link
🔨 Latest commit b3cbb6a
🔍 Latest deploy log https://app.netlify.com/projects/nextflow-docs-staging/deploys/68e6ad28c5f4030008cc8b53

@bentsherman

This is the only other place where the two-arg constructor is used:

        /*
         * default case, convert the input object to a string and save
         * to a local file
         */
        def source = input?.toString() ?: ''
        def result = Nextflow.tempFile(altName)
        result.text = source
        return new FileHolder(source, result)

This covers the case where a file input converts a string value into a temp file containing that string. It should not be affected by the change to the two-arg constructor.
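To make the string-input case concrete, here is a small Java sketch under stated assumptions. `Holder` is a hypothetical stand-in for FileHolder, `fromString` is a hypothetical helper that replaces the `Nextflow.tempFile` call, and the stage name is simplified to the plain file name rather than going through `norm()`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class names, used for illustration only
class TempFileHolderSketch {

    // Stand-in for FileHolder: keeps the original object next to the stored path
    static final class Holder {
        final Object sourceObj;
        final Path storePath;
        final String stageName;

        Holder(Object origin, Path path) {
            this.sourceObj = origin;
            this.storePath = real(path);                      // the updated two-arg behavior
            this.stageName = path.getFileName().toString();   // simplified, no norm()
        }
    }

    // Resolve symlinks only on the default file system, as in real() above
    static Path real(Path path) {
        try {
            return path.getFileSystem() == FileSystems.getDefault() ? path.toRealPath() : path;
        }
        catch (IOException e) {
            return path;
        }
    }

    // Write the string to a local temp file and wrap both in a Holder
    public static Holder fromString(String source, String name) {
        try {
            Path dir = Files.createTempDirectory("holder-sketch");
            Path result = Files.writeString(dir.resolve(name), source);
            return new Holder(source, result);
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Holder h = fromString("hello", "input.txt");
        System.out.println(h.sourceObj);                      // prints hello
        System.out.println(h.stageName);                      // prints input.txt
        System.out.println(Files.readString(h.storePath));    // prints hello
    }
}
```

In this sketch the source object is still the original string and the stored path still refers to the same local temp file, so the holder's contents are unchanged apart from any symlink resolution in the temp directory path.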

Development

Successfully merging this pull request may close these issues.

Detect when files are being staged