Add regional endpoints code samples #12097

Open
wants to merge 1 commit into base: main

jakubrauch
Description

Fixes: N/A

Note: Before submitting a pull request, please open an issue for discussion if you are not associated with Google.

Checklist

  • I have followed Sample Guidelines from AUTHORING_GUIDE.MD
  • README is updated to include all relevant information
  • Tests pass: nox -s py-3.9 pytest (see Test Environment Setup)
    • This class is mostly a copy-paste of inspect_string.py, tested using the Python available on the host
  • Lint pass: nox -s lint (see Test Environment Setup)
  • These samples need a new API enabled in testing projects to pass (let us know which ones)
  • These samples need a new/updated env vars in testing projects set to pass (let us know which ones)
  • This sample adds a new sample directory, and I updated the CODEOWNERS file with the codeowners for this sample
  • This sample adds a new Product API, and I updated the Blunderbuss issue/PR auto-assigner with the codeowners for this sample
  • Please merge this PR for me once it is approved

@jakubrauch jakubrauch requested review from a team as code owners August 1, 2024 15:12

snippet-bot bot commented Aug 1, 2024

Here is the summary of changes.

You are about to add 1 region tag.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.
To update this comment, add snippet-bot:force-run label or use the checkbox below:

  • Refresh this comment

@product-auto-label product-auto-label bot added the samples and api: dlp labels Aug 1, 2024
GCLOUD_PROJECT,
rep_location,
test_string,
["FIRST_NAME", "EMAIL_ADDRESS"],

Use PERSON_NAME instead of FIRST_NAME.

nargs="+",
help="Strings representing info types to look for. A full list of "
"info categories and types is available from the API. Examples "
'include "FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS". '

Can you change FIRST_NAME and LAST_NAME to some other infoTypes? They are not recommended in https://cloud.google.com/sensitive-data-protection/docs/infotypes-reference, so we should avoid them in code samples as well.

The goal of a code sample is not to make a fully-runnable wrapper tool around the API, but to provide an example of a specific case. The style guide mentions that we shouldn't be creating CLIs around code samples to reduce maintenance burden.

Instead, the code sample should have a hard-coded request that people can copy-paste and run directly.

    help="The Google Cloud project id to use as a parent resource.",
)
parser.add_argument(
    "--rep_location",

Add a few examples of what values are valid in the help message.

Also add a default value: all flags here are optional, but leaving this one empty will make the snippet fail to run.
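
A minimal sketch of this suggestion, assuming the flag stays in the sample. The default and the example regions in the help text are illustrative only; which locations actually offer a regional endpoint should be checked against the Sensitive Data Protection documentation. `DEFAULT_LOCATION` and `build_parser` are hypothetical names, not part of the PR.

```python
import argparse

# Hypothetical default; confirm that the chosen region offers a regional endpoint.
DEFAULT_LOCATION = "us-west1"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Inspect a string through a regional endpoint."
    )
    parser.add_argument(
        "--rep_location",
        default=DEFAULT_LOCATION,
        help=(
            "Region of the regional endpoint to call, for example "
            '"us-west1" or "europe-west3" (illustrative values). '
            "Defaults to %(default)s."
        ),
    )
    return parser


# With no flags given, the default keeps the snippet runnable.
args = build_parser().parse_args([])
print(args.rep_location)
```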

import argparse

# [START dlp_inspect_string_rep]
from typing import List

Per PEP 585, we can now type hint with list directly (for example, list[str]), so there is no need to import this.
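
For illustration, a small sketch of the built-in generic syntax on Python 3.9 and later. The helper name `to_info_type_dicts` is hypothetical; it only mirrors the list-to-dictionary conversion done in the sample.

```python
# PEP 585 (Python 3.9+): built-in collection types are subscriptable,
# so `from typing import List` is not needed.
def to_info_type_dicts(info_types: list[str]) -> list[dict[str, str]]:
    """Convert info type names into the dictionary form the API accepts."""
    return [{"name": name} for name in info_types]


converted = to_info_type_dicts(["PERSON_NAME", "EMAIL_ADDRESS"])
print(converted)
```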


def inspect_string(
    project: str,
    rep_location: str,

Nit: For naming consistency with other samples, can we keep this as just location?


# Assemble the regional endpoint url using provided rep location
rep_endpoint = f"dlp.{rep_location}.rep.googleapis.com"
client_options = {"api_endpoint": rep_endpoint}

Nit: It might be a good idea to match the variable name with the option entry? Like {"api_endpoint": api_endpoint}
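
Roughly what that rename could look like. The endpoint pattern is copied from the snippet under review; the function name `regional_client_options` and the region value are assumptions made for the sketch.

```python
def regional_client_options(location: str) -> dict[str, str]:
    """Build client options that point at a regional endpoint."""
    # Endpoint pattern taken from the snippet under review.
    api_endpoint = f"dlp.{location}.rep.googleapis.com"
    # The variable name now matches the option key it fills.
    return {"api_endpoint": api_endpoint}


# "us-west1" is only an example region.
client_options = regional_client_options("us-west1")
print(client_options)
```

In the sample itself, this dictionary would be passed as `client_options` when the client is constructed.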


# Prepare info_types by converting the list of strings into a list of
# dictionaries (protos are also accepted).
info_types = [{"name": info_type} for info_type in info_types]

From this it's not clear to me what the argument info_types is supposed to look like. For a code sample, it's best if the request is completely hard-coded to something we know works and can be copy-pasted to run.

# optionally be omitted entirely.
inspect_config = {
    "info_types": info_types,
    "include_quote": include_quote,

Again, I would hard-code a value for include_quote here. It makes the sample easier to read and leaves fewer decisions for users to make.


# Call the API.
response = dlp.inspect_content(
    request={"parent": parent, "inspect_config": inspect_config, "item": item}

This request is a little hard to reason about since all the variables were computed before. It might be a good idea to inline the values directly here instead of having them as separate variables.
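
One possible shape for an inlined, hard-coded request, sketched without the client library so it runs anywhere. The helper name `build_inspect_request`, the project id, the region, and the `projects/{project}/locations/{location}` parent format are assumptions here, since the parent construction is not visible in this diff.

```python
def build_inspect_request(project: str, location: str, content: str) -> dict:
    """Return an inspect_content request with every value spelled out inline."""
    return {
        # Assumed parent format; not shown in the reviewed hunk.
        "parent": f"projects/{project}/locations/{location}",
        "inspect_config": {
            "info_types": [{"name": "PERSON_NAME"}, {"name": "EMAIL_ADDRESS"}],
            "include_quote": True,
        },
        "item": {"value": content},
    }


# "my-project" and "us-west1" are placeholder values.
request = build_inspect_request(
    "my-project", "us-west1", "My name is Gary and my email is gary@example.com"
)
print(request["parent"])
```

The resulting dictionary would be what the sample passes as `request=` to `dlp.inspect_content`, so a reader can see the whole payload in one place.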

# Print out the results.
if response.result.findings:
    for finding in response.result.findings:
        try:

Instead of try/except, we could use hasattr to check that.

if hasattr(finding, 'quote'):
  print(f"Quote: {finding.quote}")
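
A runnable sketch of the hasattr approach, using `SimpleNamespace` objects as stand-ins for API findings; the `describe` helper is hypothetical. Note that on generated message classes a field such as `quote` may always exist with an empty default, in which case a truthiness check (`if finding.quote:`) would be the more telling test.

```python
from types import SimpleNamespace

# Stand-ins for API findings; real findings are message objects.
findings = [
    SimpleNamespace(info_type="EMAIL_ADDRESS", quote="gary@example.com"),
    SimpleNamespace(info_type="PERSON_NAME"),  # no quote attribute at all
]


def describe(finding) -> str:
    """Format one finding, printing the quote only when it is present."""
    lines = []
    if hasattr(finding, "quote"):
        lines.append(f"Quote: {finding.quote}")
    lines.append(f"Info type: {finding.info_type}")
    return "\n".join(lines)


for finding in findings:
    print(describe(finding))
```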


parser.add_argument("item", help="The string to inspect.")
parser.add_argument(
    "--project",

Can we mark this as required=True?
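
A short sketch of the effect, assuming the CLI is kept at all. The argument names and help strings come from the diff; the values passed to `parse_args` are placeholders.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("item", help="The string to inspect.")
parser.add_argument(
    "--project",
    required=True,
    help="The Google Cloud project id to use as a parent resource.",
)

# Placeholder values for illustration.
args = parser.parse_args(["--project", "my-project", "some text"])
print(args.project, args.item)

# Omitting --project now fails fast: argparse prints a usage error and exits.
try:
    parser.parse_args(["some text"])
except SystemExit as exc:
    print(f"exit status: {exc.code}")
```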

nargs="+",
help="Strings representing info types to look for. A full list of "
"info categories and types is available from the API. Examples "
'include "FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS". '

The goal of a code sample is not to make a fully-runnable wrapper tool around the API, but to provide an example of a specific case. The style guide mentions that we shouldn't be creating CLIs around code samples to reduce maintenance burden.

Instead, the code sample should have a hard-coded request that people can copy-paste and run directly.

include_quote=True,
)

out, _ = capsys.readouterr()

Testing on stdout is brittle and can cause tests to fail if the underlying implementation changes. Instead, you should return the response from the code snippet and test it directly here.
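
As a rough illustration of that pattern, the sketch below uses a hypothetical stub, `inspect_string_stub`, that returns a canned response object instead of calling the API. In the real test the sample function would return the API response and the assertions would stay the same shape.

```python
from types import SimpleNamespace


def inspect_string_stub(content: str):
    """Hypothetical stand-in for the sample: returns a response instead of printing."""
    finding = SimpleNamespace(
        info_type=SimpleNamespace(name="EMAIL_ADDRESS"),
        quote="gary@example.com",
    )
    return SimpleNamespace(result=SimpleNamespace(findings=[finding]))


def test_inspect_string_returns_findings() -> None:
    response = inspect_string_stub("My email is gary@example.com")
    # Assert on the structured response, not on captured stdout.
    info_type_names = [f.info_type.name for f in response.result.findings]
    assert "EMAIL_ADDRESS" in info_type_names


test_inspect_string_returns_findings()
```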


# [END dlp_inspect_string_rep]

if __name__ == "__main__":

Please consider removing the entire CLI interface. We shouldn't include CLI interfaces unless they're strictly necessary for the code inside the region tags to work.

@jakubrauch jakubrauch assigned jakubrauch and unassigned ivanmed Dec 9, 2024

glasnt commented Feb 6, 2025

Hi @jakubrauch, friendly ping that we're waiting on you to address the code review comments for this PR.

Since this PR is a bit older, we've recently introduced Python 3.13 testing; please rebase your PR to main to ensure your code will be tested against this new version.

@glasnt glasnt added the waiting-response label Mar 5, 2025