New CLI options #243

Open
wants to merge 9 commits into
base: master

mcopik
Collaborator

@mcopik mcopik commented Mar 9, 2025

We added a script for updating Docker images and a new CLI option that creates a build package without uploading it.

Missing tasks

  • Verify the build function works with Docker.
  • Add a new option to fully specify the resource ID (rather than only selecting a new one based on a prefix).

Summary by CodeRabbit

  • New Features

    • Introduced a benchmark packaging command with flexible options for custom naming and Docker image tag customization.
    • Added a Docker image management tool that pulls and updates images for various deployment environments.
    • Enhanced the function-building process with default naming and clearer error feedback for unsupported configurations.
    • Added a configuration option to control the removal of containers after stopping for both Minio and ScyllaDB storage types.
  • Chores

    • Improved logging for code package caching to enhance operational traceability.

coderabbitai bot commented Mar 9, 2025

Walkthrough

The changes introduce a new command in the benchmark group that enables packaging of experiments by configuring Docker image tag prefixes and determining function names. Additionally, logging details have been added to cache update methods for improved traceability. A new method in the deployment system now builds functions with language version verification and default naming. A standalone script for updating Docker images, which parses CLI arguments, reads JSON configuration, and selectively pulls images based on their SHA, has also been added. The existing process command remains unchanged.

Changes

| File(s) | Change Summary |
| --- | --- |
| sebs.py, sebs/faas/system.py | Added a new package command in sebs.py to handle benchmark packaging and updated configuration; introduced a build_function method in sebs/faas/system.py that builds functions with language version checks and a default naming mechanism. |
| sebs/cache.py | Enhanced cache operations by adding logging statements in the add_code_package and update_code_package methods to record when a code package is cached or updated. |
| tools/update_docker_images.py | Introduced a new script that leverages argparse to manage Docker images, reading from a JSON configuration, executing pull functions based on deployment criteria, and handling errors for missing images. |
| sebs/storage/config.py, sebs/storage/minio.py, sebs/storage/scylladb.py | Added a remove_containers attribute to MinioConfig and ScyllaDBConfig, and modified the start methods in Minio and ScyllaDB to use this attribute for dynamic container removal control. |

Sequence Diagram(s)

sequenceDiagram
    participant U as User
    participant S as sebs.py (package command)
    participant C as sebs_client
    participant E as Experiment Config
    participant D as Deployment Client (build_function)
    
    U->>S: Invoke "package" command with benchmark, function_name, image_tag_prefix, kwargs
    S->>C: Update configuration (apply image_tag_prefix if provided)
    S->>E: Retrieve experiment configuration and update benchmark
    E-->>S: Return benchmark data
    S->>D: Call build_function (with benchmark object and function name)
    D->>D: Check language version and determine default function name (if missing)
    D->>D: Build function using package_code method
    D-->>S: Return build result
    S-->>U: Output command result
sequenceDiagram
    participant U as User
    participant T as update_docker_images.py
    participant A as Argparse Parser
    participant J as JSON Config
    participant P as Pull Functions
    
    U->>T: Execute script with CLI arguments
    T->>A: Parse deployment type, image type, language, and version
    T->>J: Read Docker repository configuration data
    T->>P: Iterate over systems & invoke pull functions (generic_pull, pull_function, pull_language)
    P->>P: Compare image SHA to decide update necessity
    P-->>T: Return update status (updated or up-to-date)
    T-->>U: Output final image update status
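
The SHA comparison step in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the code in tools/update_docker_images.py: the function name `pull_if_changed`, the `not_found` parameter, and the `FakeImages` stand-in are assumptions made for this example. The `client` argument is expected to behave like a Docker SDK for Python client, where `images.get` and `images.pull` return objects with an `id` attribute.

```python
from types import SimpleNamespace
from typing import Dict, Type


def pull_if_changed(
    client, repository: str, tag: str, not_found: Type[Exception] = LookupError
) -> bool:
    """Pull repository:tag and report whether the local image ID changed.

    With the real Docker SDK, pass not_found=docker.errors.ImageNotFound.
    """
    name = f"{repository}:{tag}"
    try:
        old_id = client.images.get(name).id
    except not_found:
        # The image is not available locally yet, so any pull counts as an update.
        old_id = None

    new_id = client.images.pull(repository, tag).id
    return new_id != old_id


class FakeImages:
    """Hypothetical in-memory stand-in for client.images, usable without Docker."""

    def __init__(self, local: Dict[str, str], remote: Dict[str, str]) -> None:
        self.local = local
        self.remote = remote

    def get(self, name: str) -> SimpleNamespace:
        if name not in self.local:
            raise LookupError(name)
        return SimpleNamespace(id=self.local[name])

    def pull(self, repository: str, tag: str) -> SimpleNamespace:
        name = f"{repository}:{tag}"
        self.local[name] = self.remote[name]
        return SimpleNamespace(id=self.local[name])
```

Passing the client in as an argument keeps the comparison logic testable without a running Docker daemon.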

Possibly related PRs

  • Fix Local Deployment Invocation #231: The changes in the main PR, specifically the modifications to the storage_start function and the introduction of the remove_containers parameter, are related to the changes in the retrieved PR, which also involves handling container management through configuration settings in the start method of the local command group.

  • Minor updates to PR #199 #226: The changes in the main PR, specifically the addition of the package command and modifications to the storage_start function in sebs.py, are related to the updates made to the benchmark and storage commands in the retrieved PR, which also involves changes to command handling in sebs.py.

  • Container support for AWS #205: The changes in the main PR, specifically the addition of the --container-deployment option and modifications to the storage_start function, are related to the container deployment support introduced in the retrieved PR, which also focuses on handling container images and deployment configurations.

Poem

Hop, hop—I'm off to code,
With new commands on this winding road.
Logs now sing a tale so clear,
Docker images pull near and dear.
Functions build with a rhythmic beat,
A happy rabbit hops with every feat!
🐇✨

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (4)
tools/update_docker_images.py (4)

1-23: Use context manager for file operations.

The script should use a context manager when opening files to ensure they are properly closed even if an exception occurs.

-config = json.load(open(os.path.join(PROJECT_DIR, "config", "systems.json"), "r"))
+with open(os.path.join(PROJECT_DIR, "config", "systems.json"), "r") as f:
+    config = json.load(f)
🧰 Tools
🪛 Ruff (0.8.2)

20-20: Use a context manager for opening files

(SIM115)


82-85: Simplify conditional logic.

The conditional statements for checking language version can be simplified using the logical OR operator.

-    if args.language_version is not None and args.language_version == version:
-        configs.append([version, base_image])
-    elif args.language_version is None:
-        configs.append([version, base_image])
+    if args.language_version is None or args.language_version == version:
+        configs.append([version, base_image])
🧰 Tools
🪛 Ruff (0.8.2)

82-85: Combine if branches using logical or operator

Combine if branches

(SIM114)


114-118: Unused loop variable in nested loop.

The image_config variable is not used within the loop body.

-                for image_type, image_config in system_config["images"].items():
+                for image_type, _ in system_config["images"].items():
🧰 Tools
🪛 Ruff (0.8.2)

116-116: Loop control variable image_config not used within loop body

Rename unused image_config to _image_config

(B007)


53-70: Consider making benchmarks configuration more maintainable.

The hardcoded benchmark list could become difficult to maintain as more benchmarks are added. Consider loading this from a configuration file or creating a function to discover available benchmarks.

For example, you could load this from a JSON configuration file or use a directory structure to discover benchmarks automatically.
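
One way to do the directory-based discovery is sketched below. It assumes that every benchmark lives in its own directory containing a marker file; the name `config.json` and the function name `discover_benchmarks` are assumptions for this example, not something the PR confirms.

```python
from pathlib import Path
from typing import List


def discover_benchmarks(root: Path, marker: str = "config.json") -> List[str]:
    """Return the sorted names of all directories under root that contain marker."""
    return sorted(path.parent.name for path in root.rglob(marker))
```

A new benchmark is then picked up as soon as its directory is added, with no list to edit in the script.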

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3266d2d and db93817.

📒 Files selected for processing (4)
  • sebs.py (1 hunks)
  • sebs/cache.py (2 hunks)
  • sebs/faas/system.py (1 hunks)
  • tools/update_docker_images.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
tools/update_docker_images.py

20-20: Use a context manager for opening files

(SIM115)


82-85: Combine if branches using logical or operator

Combine if branches

(SIM114)


116-116: Loop control variable image_config not used within loop body

Rename unused image_config to _image_config

(B007)

sebs.py

351-351: Local variable func is assigned to but never used

Remove assignment to unused variable func

(F841)

🔇 Additional comments (5)
sebs/cache.py (2)

225-226: Good addition of logging information.

Adding this log statement improves traceability by providing visibility into code package updates. This will help with debugging cache operations.


271-271: Good consistency in logging.

This log message maintains consistency with the one added in add_code_package method, providing clear information about cache operations.

sebs.py (1)

310-355: The new package command is well-structured.

The implementation follows the same pattern as other commands and reuses the common parameter handling logic, which is good for consistency.

🧰 Tools
🪛 Ruff (0.8.2)

351-351: Local variable func is assigned to but never used

Remove assignment to unused variable func

(F841)

sebs/faas/system.py (1)

185-201: Good addition of the build_function method.

This method extracts common functionality from get_function, promoting code reuse. The implementation includes proper error handling and makes good use of defaults.

However, there's one issue to consider: unlike get_function which returns a Function object, this method doesn't return anything but assigns a value to a variable. This inconsistency might lead to confusion.

 def build_function(self, code_package: Benchmark, func_name: Optional[str] = None):
+    """
+    Build a function from the given code package without creating it in the FaaS system.
+    
+    :param code_package: The benchmark code package
+    :param func_name: Optional name for the function, defaults to system-generated name
+    :return: The build result returned by code_package.build
+    """

     if code_package.language_version not in self.system_config.supported_language_versions(
         self.name(), code_package.language_name
     ):
         raise Exception(
             "Unsupported {language} version {version} in {system}!".format(
                 language=code_package.language_name,
                 version=code_package.language_version,
                 system=self.name(),
             )
         )

     if not func_name:
         func_name = self.default_function_name(code_package)
-    code_package.build(self.package_code)
+    return code_package.build(self.package_code)
tools/update_docker_images.py (1)

24-42: Good implementation of the image pulling function.

The pull function correctly handles cases where an image doesn't exist and provides clear feedback about image updates. The error handling is appropriate for this use case.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
sebs.py (1)

337-381: ⚠️ Potential issue

Fix unused variable func

The function builds the benchmark but doesn't do anything with the returned function object. This is the same issue as mentioned in a previous review comment, which was addressed in another part of the code.

Apply this diff to fix the issue:

-    func = deployment_client.build_function(
+    deployment_client.build_function(
         benchmark_obj,
         function_name if function_name else deployment_client.default_function_name(benchmark_obj)
     )

Or if you need to inform the user about the built function:

     func = deployment_client.build_function(
         benchmark_obj,
         function_name if function_name else deployment_client.default_function_name(benchmark_obj)
     )
+    sebs_client.logging.info(f"Successfully built function: {func.name}")
🧰 Tools
🪛 Ruff (0.8.2)

378-378: Local variable func is assigned to but never used

Remove assignment to unused variable func

(F841)

🧹 Nitpick comments (4)
sebs/storage/scylladb.py (1)

96-96: Consider the implications of changing container removal behavior

The PR changes the container removal behavior from always removing containers (remove=True) to making it configurable via self.config.remove_containers. Since the default value in ScyllaDBConfig is False, this represents a behavior change that could lead to container accumulation if users aren't explicitly cleaning up.

Consider adding documentation about this behavior change and potentially logging a message when containers aren't being automatically removed. This would help users understand why containers might persist after stopping.

sebs/storage/config.py (1)

33-33: Add documentation for the new configuration option

The new remove_containers attribute has been added to both MinioConfig and ScyllaDBConfig classes with a default value of False. This is a behavior change from the previous implicit True value.

Add docstrings to explain what this configuration option does, especially noting that the default behavior is to keep containers (different from previous versions):

 @dataclass
 class MinioConfig(PersistentStorageConfig):
+    """Configuration for Minio storage.
+
+    Attributes:
+        remove_containers: Whether to automatically remove containers when stopped.
+                           Defaults to False to allow post-mortem inspection.
+    """

And similarly for ScyllaDBConfig.

Also applies to: 83-83

sebs/storage/minio.py (1)

103-103: Aligned container removal behavior with ScyllaDB

Good consistency in changing both storage implementations to use the same configuration-based approach for container removal.

Consider adding a log message when a container isn't automatically removed, similar to the suggestion for ScyllaDB:

 self._storage_container = self._docker_client.containers.run(
     # ...existing parameters
     remove=self.config.remove_containers,
     # ...remaining parameters
 )
+if not self.config.remove_containers:
+    self.logging.info(f"Container {self._storage_container.id} will not be automatically removed after stopping")
sebs/faas/system.py (1)

236-252: Missing method documentation

The new build_function method lacks documentation, making it unclear what it does and how it differs from get_function.

Add docstrings to explain the purpose and behavior of this method:

 def build_function(self, code_package: Benchmark, func_name: Optional[str] = None):
+    """
+    Builds a function package without creating/updating it in the cloud.
+
+    This is useful for packaging experiments without deploying them.
+
+    Args:
+        code_package: Benchmark containing the function code
+        func_name: Optional name for the function, defaults to generated name
+
+    Returns:
+        Tuple[bool, str, bool, str]: Build results containing:
+            - whether it was rebuilt
+            - the code location
+            - whether it's a container deployment
+            - the container URI
+
+    Raises:
+        Exception: If the language version is not supported
+    """
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between db93817 and 5b01693.

📒 Files selected for processing (6)
  • sebs.py (4 hunks)
  • sebs/cache.py (2 hunks)
  • sebs/faas/system.py (1 hunks)
  • sebs/storage/config.py (2 hunks)
  • sebs/storage/minio.py (1 hunks)
  • sebs/storage/scylladb.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • sebs/cache.py
🧰 Additional context used
🧬 Code Definitions (3)
sebs/storage/config.py (1)
sebs/local/local.py (2)
  • remove_containers (44-45)
  • remove_containers (48-49)
sebs.py (3)
sebs/utils.py (2)
  • default (23-32)
  • update_nested_dict (56-61)
sebs/config.py (2)
  • image_tag_prefix (14-15)
  • image_tag_prefix (18-19)
sebs/sebs.py (5)
  • config (44-45)
  • output_dir (32-33)
  • logging_filename (40-41)
  • get_experiment_config (152-153)
  • get_benchmark (180-199)
sebs/faas/system.py (3)
sebs/benchmark.py (6)
  • code_package (102-103)
  • Benchmark (84-762)
  • language_version (150-151)
  • language_name (146-147)
  • language (142-143)
  • build (561-625)
sebs/config.py (2)
  • supported_language_versions (44-52)
  • version (70-71)
sebs/faas/function.py (1)
  • name (365-366)
🪛 Ruff (0.8.2)
sebs.py

378-378: Local variable func is assigned to but never used

Remove assignment to unused variable func

(F841)

🔇 Additional comments (2)
sebs.py (2)

438-443: LGTM!

The addition of the --remove-containers/--no-remove-containers option provides useful control over container cleanup behavior. This change follows the existing pattern for command-line options in the codebase.
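
For readers unfamiliar with paired on/off flags, the sketch below shows the same pattern using only the standard library. Note the swap: this example uses argparse's `BooleanOptionalAction` (Python 3.9+) purely for illustration and does not reproduce how sebs.py defines the option. The parser name and the default of False are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts --remove-containers and --no-remove-containers."""
    parser = argparse.ArgumentParser(prog="storage-start")
    parser.add_argument(
        "--remove-containers",
        action=argparse.BooleanOptionalAction,  # also registers --no-remove-containers
        default=False,  # assumed default, matching the config default noted in the review
        help="Remove storage containers after they stop.",
    )
    return parser
```

The parsed value (`args.remove_containers`) can then be forwarded to the storage configuration.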


458-458: LGTM!

The implementation correctly passes the remove_containers parameter to both storage configuration types.

Also applies to: 477-477

Comment on lines +236 to +252
def build_function(self, code_package: Benchmark, func_name: Optional[str] = None):

    if code_package.language_version not in self.system_config.supported_language_versions(
        self.name(), code_package.language_name
    ):
        raise Exception(
            "Unsupported {language} version {version} in {system}!".format(
                language=code_package.language_name,
                version=code_package.language_version,
                system=self.name(),
            )
        )

    if not func_name:
        func_name = self.default_function_name(code_package)
    code_package.build(self.package_code)

🛠️ Refactor suggestion

Missing return value for build result

The method calls code_package.build(self.package_code) which returns a tuple with build results, but build_function doesn't return any value to the caller. Looking at the PR objectives, this method is likely used to build without uploading, so returning build information would be helpful.

Return the build result to provide information to the caller:

 def build_function(self, code_package: Benchmark, func_name: Optional[str] = None):
 
     if code_package.language_version not in self.system_config.supported_language_versions(
         self.name(), code_package.language_name
     ):
         # ...exception handling
 
     if not func_name:
         func_name = self.default_function_name(code_package)
-    code_package.build(self.package_code)
+    return code_package.build(self.package_code)
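
To illustrate why returning the build result helps the caller, here is a small self-contained sketch. The four-element tuple shape follows the docstring proposed earlier in this review (rebuilt flag, code location, container flag, container URI). `FakeBenchmark`, `package_code`, `describe`, and the path are hypothetical stand-ins; the version check and default naming are omitted.

```python
from typing import Callable, Optional, Tuple

# (was rebuilt, code location, is container deployment, container URI)
BuildResult = Tuple[bool, str, bool, str]


class FakeBenchmark:
    """Hypothetical stand-in for a benchmark whose build returns a BuildResult."""

    def build(self, build_step: Callable) -> BuildResult:
        return True, "build/example-package", False, ""


def package_code(*args, **kwargs) -> None:
    """Placeholder for the platform-specific packaging step."""
    return None


def build_function(code_package, func_name: Optional[str] = None) -> BuildResult:
    # Version check and default naming omitted; the point is the return value.
    return code_package.build(package_code)


def describe(result: BuildResult) -> str:
    """Summarize a build result for a log message."""
    rebuilt, location, is_container, uri = result
    status = "rebuilt" if rebuilt else "reused from cache"
    return f"{status}: {uri if is_container else location}"
```

With the result returned, a package command could log `describe(build_function(benchmark))` instead of discarding the build information.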

⚠️ Potential issue

Architecture parameter missing in supported_language_versions check

The implementation of build_function is missing the architecture parameter when checking for supported language versions, unlike the existing get_function method.

This could lead to inconsistent behavior between the two methods. The supported_language_versions method in sebs/config.py expects an architecture parameter and uses it to filter the available base images.

Apply this fix:

 def build_function(self, code_package: Benchmark, func_name: Optional[str] = None):
 
     if code_package.language_version not in self.system_config.supported_language_versions(
-        self.name(), code_package.language_name
+        self.name(), code_package.language_name, code_package.architecture
     ):
         raise Exception(
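
The effect of the missing architecture argument can be seen in a small sketch. The nested `BASE_IMAGES` layout, the image names, and the helper `check_language_version` are assumptions made for this example; the real structure lives in the project's configuration and may differ.

```python
from typing import Dict, List

# Hypothetical layout: system -> language -> architecture -> version -> base image.
BASE_IMAGES: Dict[str, Dict[str, Dict[str, Dict[str, str]]]] = {
    "aws": {
        "python": {
            "x64": {"3.9": "example/python:3.9", "3.10": "example/python:3.10"},
            "arm64": {"3.9": "example/python:3.9-arm64"},
        }
    }
}


def supported_language_versions(system: str, language: str, architecture: str) -> List[str]:
    """Return the versions that have a base image for the given architecture."""
    return list(BASE_IMAGES.get(system, {}).get(language, {}).get(architecture, {}))


def check_language_version(system: str, language: str, version: str, architecture: str) -> None:
    """Raise if the version has no base image on this system and architecture."""
    if version not in supported_language_versions(system, language, architecture):
        raise ValueError(
            f"Unsupported {language} version {version} on {architecture} in {system}!"
        )
```

In this sketch, Python 3.10 passes the check on x64 but fails on arm64, which is the kind of difference a check without the architecture cannot detect.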
