Changes to support direct download of media on Managed Host #287


Merged: 6 commits merged from rel/oratk-115 into google:master on Jul 18, 2025

Conversation

simonpane (Collaborator)

Change Description:

Adjust logic in the swlib module to download media directly from GCS to swlib, if the gcloud utility is available on the Managed Host.

Solution Overview:

The previous media procurement logic downloaded media from GCS through the Ansible Control Node using gsutil cat. While the file is never written to disk on the Ansible Control Node, the entire file is transferred over the network into and out of the Ansible Control Node. This logic remains in place and is used (in a block:) should the gcloud utility not be found on the Managed Host.

NOTE: While the existing tasks remain fundamentally unchanged, the task names were changed from using the word Copy to Transfer, both to better represent their action and to distinguish them from the new, similar direct-download file copy tasks.

When gcloud is detected on the Managed Host, it is more performant to use gcloud storage cp to copy the file directly from GCS to the swlib location on the Managed Host.

Consequently, this change:

  • Adds an initial task to detect the presence of the gcloud binary.
  • If available, use a block with a series of gcloud storage cp tasks, run on the Managed Host, to copy the files directly from GCS to the swlib location (a sketch of this pattern follows below).
  • If not available, use a block with the pre-existing set of tasks that will copy the files from GCS to the swlib location by transferring them through the Ansible Control Node.

NOTE: This change does not include installing gcloud on the Managed Host.

Tasks remain idempotent and do not re-download if the files are already staged.

Further, the new tasks still use unzip -l to verify that a found file is indeed a valid zip file, rather than relying solely on the --no-clobber file copy option.
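
A minimal sketch of the detect-and-branch pattern described above; the bucket, path, and variable names here are placeholders, not the exact names used in the role:

- name: gcscopy | Detect gcloud on the Managed Host
  ansible.builtin.shell: command -v gcloud > /dev/null 2>&1; echo $?
  register: gcloud_found
  changed_when: false

# Direct path: runs on the Managed Host when gcloud was found.
- name: gcscopy | Download files directly on Managed Host target
  when: gcloud_found.stdout == "0"
  block:
    - name: gcscopy | Copy RDBMS and GI software from GCS to target instance
      ansible.builtin.shell: |
        unzip -l "/u01/swlib/{{ item }}" > /dev/null 2>&1 ||
        gcloud storage cp "gs://my-bucket/{{ item }}" "/u01/swlib/"
      loop: "{{ software_files }}"

# Fallback path: the pre-existing "Transfer ..." tasks sit in a sibling
# block with the inverse condition, streaming through the Control Node.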

Test Results:

Testing of the --prep-host stage showed considerable improvement. The benefit will be even more pronounced if the Ansible Control Node isn't also in Google Cloud.

Download speed from GCS shows some variance, but typical test results without this change, using the Ansible callbacks_enabled = ansible.posix.profile_tasks configuration option for timing instrumentation, fall in the following range for the affected tasks:

swlib : gcscopy | Transfer GI patches from GCS to target instance --------------------------------- 49.90s
swlib : gcscopy | Transfer RDBMS and GI software from GCS to target instance ---------------------- 48.68s
swlib : gcscopy | Transfer GI OPatch update files from GCS to target instance ---------------------- 6.97s
swlib : gcscopy | Transfer RDBMS patches from GCS to target instance ------------------------------- 3.51s

With this change, a considerable performance improvement is apparent:

swlib : gcscopy | Copy RDBMS and GI software from GCS to target instance -------------------------- 15.56s
swlib : gcscopy | Copy GI patches from GCS to target instance ------------------------------------- 14.15s
swlib : gcscopy | Copy GI OPatch update files from GCS to target instance -------------------------- 2.39s
swlib : gcscopy | Copy RDBMS patches from GCS to target instance ----------------------------------- 1.49s
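
For reference, the per-task timings above come from enabling the profile_tasks callback in ansible.cfg:

[defaults]
callbacks_enabled = ansible.posix.profile_tasks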

simonpane self-assigned this Jul 8, 2025
delegate_to: 127.0.0.1
- name: gcscopy | Download files directly on Managed Host target
when: gcloud_found.stdout == "0"
block:
mfielding (Member)

If I'm reading the diff correctly, we're turning gcscopy.yml into two large blocks: one using gcloud to retrieve files, and one using an ssh pipe on the control node (the original "gcs copy" method).

We have a main.yml and other non-default methods that a user can specify, like gcsfuse or NFS. So, to keep the spirit of the existing semantics, would it make sense to treat this more as a default selection: creating a new task (and a swlib_mount_type, perhaps?) and, by default, trying to be smart about selecting the optimal one (gcloud or gcscopy)?

simonpane (Collaborator, Author)

"...we're turning gcscopy.yml into two large blocks: one using gcloud to retrieve files, and one using an ssh pipe on the control node..."

Yes, that's correct. Arguably, both blocks are doing the same thing, copying (or downloading) from GCS, and the swlib mount type is not changed; it's just the "where" that differs. So, semantically, I was thinking it makes sense to keep it in the same file, since it's all "GCS copy" (one using the older gsutil and the other the newer gcloud storage, but the activity is still effectively the same: copying a file from GCS). Or rather, the verb is the same, and the task file doesn't reference the noun.

But those are just my thoughts, if you'd like them to be split into two different files then just let me know @mfielding and I'm happy to prepare an updated commit with that for your further review.

mfielding (Member)

Right now we have three possible swlib types:

  • NFS
  • gcsfuse: mount a GCS bucket via FUSE to avoid an intermediate copy
  • gcscopy: use the control node to stream gcloud output into an SSH pipeline

The default is gcscopy.

From a user perspective, this change adds yet another swlib type: one that executes a gcloud storage cp command on the first managed host if a gcloud storage ls command succeeds. But the user cannot control this behavior.

I'd rather see the user being able to optionally select the desired behavior using the existing --swlib_type argument. But I still like the idea of an intelligent default that mostly preserves backward compatibility.

simonpane (Collaborator, Author), Jul 15, 2025

Hi @mfielding, just discussing this one last time to double-check the ideas and next steps.

The way I look at it is that it's not another swlib type (it's just a path change), and it's something that should be transparent to the user. The reason is that I think swlib-type is a bit misleading; it's really swlib-source. And our source is one of:

  1. An NFS mount
  2. A mounted FUSE path
  3. A GCS storage bucket

Technically we can add a fourth source soon: a URI

So what we're really doing is just choosing the path for option 3. The source and the destination are the same. Even the command (or "type" of command) is the same, since we can switch the existing gsutil cat to gcloud storage cat to modernize that command and move away from the legacy gsutil Python program. So it's just a matter of whether we use gcloud storage to copy from the same source location to the same destination location, either directly or indirectly through the Ansible Control Node.
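
For illustration, that modernization is close to a drop-in swap; the object name and paths below are hypothetical:

# Legacy: stream the object with the Python-based gsutil
- name: Example | Transfer file with gsutil
  ansible.builtin.shell: gsutil cat "gs://my-bucket/object.zip" > /u01/swlib/object.zip

# Modern equivalent using the gcloud storage command group
- name: Example | Transfer file with gcloud storage
  ansible.builtin.shell: gcloud storage cat "gs://my-bucket/object.zip" > /u01/swlib/object.zip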

And I'm not sure why a user would want to use gcloud storage to copy through the Ansible Control Node when it could more efficiently copy directly to the Managed Host. It might just add user confusion.

Hence I might suggest:

  • A change commit to modernize the old gsutil commands and replace them with gcloud storage -- I'm not sure whether there are any other implications of that, or why we might still want to use the legacy utility; perhaps you have some thoughts on that @mfielding?
  • Optionally, a change to rename swlib-type to swlib-source (including in the docs) to reduce user confusion, perhaps with the old command-line argument aliased for backwards compatibility.

That's just my two cents on it.

Alternatively, if you'd still like to make it another option for the command-line argument and split it into different task files, just let me know and I'll happily implement accordingly. Just thinking this through to double-check.

Thanks.

simonpane (Collaborator, Author)

Update to ^^^: I'm second-guessing the recommendation of swlib-type vs. swlib-source; it's maybe a bit of both.

mfielding (Member)

I think I'd still prefer to treat an SSH-pipe copy from the control node and a direct gcloud storage cp operation as separate --swlib-type options, even if both are ultimately coming from a GCS bucket. And I like the approach of using intelligent defaults that avoid breaking existing installs.

As for gsutil cp vs gcloud storage cp, I agree that gcloud storage cp is the better option, if only because it avoids the headaches around crcmod dependencies and manages authentication better too. I don't know how long gcloud storage cp has existed, but if we wanted to be pedantic, we could also add a prerequisite check to make sure the gcloud install is new enough to support it.
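
A minimal sketch of such a prerequisite check, with a hypothetical task name and register variable; an older gcloud without the storage command group would exit non-zero:

- name: gcscopy | Check that gcloud supports the storage command group
  ansible.builtin.shell: gcloud storage --help > /dev/null 2>&1; echo $?
  register: gcloud_storage_ok
  changed_when: false

# Treat anything other than "0" as "fall back to the Control Node transfer".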

simonpane (Collaborator, Author), Jul 16, 2025

Updated scripts accordingly. Changes in commit e56a3a1 (and logic fix in 95f301c) include:

  • New options gcsdirect and gcstransfer for the --ora-swlib-type option (which still defaults to the original gcs value)
  • If gcsdirect is specified (or the default of gcs is used), validate that gcloud is installed on the Managed Host and that the GCS bucket is accessible; if so, perform a direct gcloud storage copy on the Managed Host
  • If gcloud is not available, if the GCS bucket is not accessible, or if the gcstransfer option is specified, then the pre-existing method of transferring through the Ansible Control Node is used (a sketch of the selection logic follows below)

Also changed from using gsutil to gcloud storage, in the related files only (to keep this change cleaner and less complicated). I can prepare a separate PR if you would like us to remove the other occurrences of gsutil in files not related to this specific piece of functionality, @mfielding?
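
A minimal sketch of that selection logic, with hypothetical variable names (the task and variable names in the actual commits differ):

- name: swlib | Decide whether a direct copy is possible
  ansible.builtin.set_fact:
    swlib_gcs_direct: >-
      {{ (ora_swlib_type | upper) in ['GCS', 'GCSDIRECT']
         and gcloud_found.stdout == '0'
         and gcs_bucket_ls.rc == 0 }}

# The direct-copy block runs when swlib_gcs_direct is true; otherwise
# (including when GCSTRANSFER is specified) the Control Node transfer runs.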

@@ -104,7 +104,7 @@ ORA_SWLIB_BUCKET="${ORA_SWLIB_BUCKET}"
 ORA_SWLIB_BUCKET_PARAM="^.+[^/]"

 ORA_SWLIB_TYPE="${ORA_SWLIB_TYPE:-GCS}"
-ORA_SWLIB_TYPE_PARAM="^(\"\"|GCS|GCSFUSE|NFS)$"
+ORA_SWLIB_TYPE_PARAM="^(\"\"|GCS|GCSFUSE|NFS|GCSTRANSFER|GCSDIRECT)$"
mfielding (Member)

Looks good; would you mind updating the user guide to match, along with perhaps a sentence or two there to explain what GCSTRANSFER and GCSDIRECT mean?

simonpane (Collaborator, Author)

Updated the user guide in commit 1eaf540. Please review @mfielding and provide comments on any language or grammatical changes you would like. Thanks.

mfielding (Member)

Looks great, thanks!


[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mfielding, simonpane

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

mfielding merged commit a70eea2 into google:master, Jul 18, 2025
3 checks passed
simonpane deleted the rel/oratk-115 branch, July 18, 2025 16:03