Fix documentation and the blocking bugs for local backend #249

Merged
merged 4 commits into from
May 9, 2025

Conversation

rabbull
Contributor

@rabbull rabbull commented Apr 2, 2025

This PR introduces several small fixes to make the local backend runnable out-of-the-box again.

  1. Added an empty "modules" entry to the configuration structure of the microbenchmarks.
  2. Added the previously missing (and unused) parameters to the Python input interface of the microbenchmarks.
  3. Updated the Local section in the usage documentation.
  4. Removed the version segment from the Docker build image tag.
  5. Fixed the format of the basic_image field in the configuration for the local backend.
  6. Added the missing parameters to the update_function invocation for the Local system.

Closes #248.

Summary by CodeRabbit

  • New Features

    • Introduced an additional configuration option for module management in benchmark setups.
  • Documentation

    • Updated commands and configuration details for storage, deployment, and container management, enhancing ease of use and clarity.
  • Refactor

    • Standardized benchmark input handling with updated parameters and return structures.
    • Refined deployment operations, including simplified image naming and streamlined update processes.
    • Improved handling of output paths for benchmark result uploads.

coderabbitai bot commented Apr 2, 2025

Walkthrough

The changes update several benchmark configurations and function interfaces. Four microbenchmark JSON files now include a new "modules": [] key. The corresponding input scripts change the generate_input signature by renaming parameters and adding a nosql_func parameter. The documentation now has corrected commands for starting storage, launching Docker containers, and invoking functions. Minor adjustments were also made in the SeBS modules: removing version strings from image names, dropping special cases for “local” deployments, and updating the call to a deployment client method.

Changes

| Files | Change Summary |
| --- | --- |
| benchmarks/.../config.json (sleep, network-benchmark, clock-synchronization, server-reply) | Added new key "modules": [] to each configuration file while preserving existing values. |
| benchmarks/.../input.py (sleep, network-benchmark, clock-synchronization, server-reply) | Updated generate_input signature: renamed parameters (replacing input_buckets/output_buckets with benchmarks_bucket, input_paths, and output_paths) and added a new parameter nosql_func. |
| benchmarks/.../python/function.py (network-benchmark, clock-synchronization) | Modified output bucket access to nested keys and used os.path.join for upload paths. |
| docs/usage.md | Revised commands for storage startup, local deployment, and Docker container handling; updated JSON structures and examples. |
| sebs/benchmark.py | Refactored Docker image handling in install_dependencies to separate base image and version suffix; added helper to ensure image presence with fallback logic. |
| sebs/config.py | Removed special case for "local" deployment in supported_language_versions method. |
| sebs/experiments/perf_cost.py | Updated call to update_function to include additional parameters (False and empty string). |

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant generate_input
    Caller->>generate_input: Call with (data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func)
    generate_input-->>Caller: Returns generated input dictionary

sequenceDiagram
    participant PerfCost
    participant DeploymentClient
    PerfCost->>DeploymentClient: update_function(function, benchmark, False, '')
    DeploymentClient-->>PerfCost: Returns update result

Assessment against linked issues

Objective Addressed Explanation
Fix local backend startup command (#248)

Suggested reviewers

  • mcopik

Poem

I'm a rabbit with a code-filled beat,
Hopping through JSON keys so neat.
New parameters and commands hop in play,
With each commit, bugs hop away.
I nibble carrots and code with delight,
Celebrating smooth changes from morning to night!

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
docs/usage.md (1)

180-180: Add language specification to code block

The fenced code block at line 180 is missing a language specification, which is recommended for proper syntax highlighting.

-```
+```bash
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

180-180: Fenced code blocks should have a language specified (MD040, fenced-code-language)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3266d2d and 587fe8d.

📒 Files selected for processing (12)
  • benchmarks/000.microbenchmarks/010.sleep/config.json (1 hunks)
  • benchmarks/000.microbenchmarks/010.sleep/input.py (1 hunks)
  • benchmarks/000.microbenchmarks/020.network-benchmark/config.json (1 hunks)
  • benchmarks/000.microbenchmarks/020.network-benchmark/input.py (1 hunks)
  • benchmarks/000.microbenchmarks/030.clock-synchronization/config.json (1 hunks)
  • benchmarks/000.microbenchmarks/030.clock-synchronization/input.py (1 hunks)
  • benchmarks/000.microbenchmarks/040.server-reply/config.json (1 hunks)
  • benchmarks/000.microbenchmarks/040.server-reply/input.py (1 hunks)
  • docs/usage.md (2 hunks)
  • sebs/benchmark.py (1 hunks)
  • sebs/config.py (0 hunks)
  • sebs/experiments/perf_cost.py (1 hunks)
💤 Files with no reviewable changes (1)
  • sebs/config.py
🧰 Additional context used
🧬 Code Definitions (4)
benchmarks/000.microbenchmarks/010.sleep/input.py (4)
benchmarks/000.microbenchmarks/030.clock-synchronization/input.py (1)
  • generate_input (6-7)
benchmarks/000.microbenchmarks/020.network-benchmark/input.py (1)
  • generate_input (6-7)
benchmarks/000.microbenchmarks/040.server-reply/input.py (1)
  • generate_input (11-12)
sebs/benchmark.py (1)
  • generate_input (783-794)
sebs/experiments/perf_cost.py (1)
sebs/faas/system.py (1)
  • update_function (215-234)
benchmarks/000.microbenchmarks/020.network-benchmark/input.py (4)
benchmarks/000.microbenchmarks/030.clock-synchronization/input.py (1)
  • generate_input (6-7)
benchmarks/000.microbenchmarks/040.server-reply/input.py (1)
  • generate_input (11-12)
benchmarks/000.microbenchmarks/010.sleep/input.py (1)
  • generate_input (11-12)
sebs/benchmark.py (1)
  • generate_input (783-794)
benchmarks/000.microbenchmarks/030.clock-synchronization/input.py (4)
benchmarks/000.microbenchmarks/020.network-benchmark/input.py (1)
  • generate_input (6-7)
benchmarks/000.microbenchmarks/040.server-reply/input.py (1)
  • generate_input (11-12)
benchmarks/000.microbenchmarks/010.sleep/input.py (1)
  • generate_input (11-12)
sebs/benchmark.py (1)
  • generate_input (783-794)
🔇 Additional comments (15)
benchmarks/000.microbenchmarks/040.server-reply/config.json (1)

4-5: Adds required "modules" field to the configuration structure.

This change adds the previously missing "modules" field to the configuration, which is required by the BenchmarkConfig class as seen in benchmark.py. This fix addresses one of the PR objectives to ensure the local backend works correctly out-of-the-box.

benchmarks/000.microbenchmarks/010.sleep/config.json (1)

4-5: Adds required "modules" field to the configuration structure.

This change adds the previously missing "modules" field to the configuration, which is required by the BenchmarkConfig class as seen in benchmark.py. This fix ensures proper functionality when deserializing the configuration.

benchmarks/000.microbenchmarks/030.clock-synchronization/config.json (1)

4-5: Adds required "modules" field to the configuration structure.

The addition of the empty "modules" array ensures the configuration structure matches what's expected by the BenchmarkConfig deserializer. This consistency across benchmarks helps prevent runtime errors.

benchmarks/000.microbenchmarks/020.network-benchmark/config.json (1)

4-5: Adds required "modules" field to the configuration structure.

Adding the empty "modules" array ensures this benchmark's configuration follows the same pattern as other benchmarks, providing consistency and preventing potential deserialization issues in the BenchmarkConfig class.
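
The four comments above describe one failure mode: a deserializer that reads "modules" unconditionally fails on a config file that lacks the key. The following is a minimal sketch of that behavior, not the actual SeBS code. The class name MiniBenchmarkConfig, the fields other than "modules", and the sample values are assumptions made for illustration.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class MiniBenchmarkConfig:
    # Hypothetical stand-in for BenchmarkConfig in sebs/benchmark.py.
    timeout: int
    memory: int
    languages: List[str]
    modules: List[str]

    @staticmethod
    def deserialize(raw: dict) -> "MiniBenchmarkConfig":
        # Every key is read unconditionally, so a missing "modules" raises KeyError.
        return MiniBenchmarkConfig(
            timeout=raw["timeout"],
            memory=raw["memory"],
            languages=raw["languages"],
            modules=raw["modules"],
        )


# Illustrative config contents; only the "modules" key comes from the PR.
old_config = json.loads('{"timeout": 120, "memory": 128, "languages": ["python"]}')
new_config = json.loads(
    '{"timeout": 120, "memory": 128, "languages": ["python"], "modules": []}'
)

try:
    MiniBenchmarkConfig.deserialize(old_config)
    old_config_loaded = True
except KeyError:
    old_config_loaded = False

cfg = MiniBenchmarkConfig.deserialize(new_config)
```

With the empty list present, deserialization succeeds and the benchmark simply declares no extra modules.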

sebs/benchmark.py (1)

433-437: Simplifies Docker image naming by removing version component.

This change removes the version segment from the Docker build image name, which streamlines the build process as mentioned in the PR objectives. By using a simpler naming convention, the system will likely be more maintainable and less prone to versioning-related issues.

benchmarks/000.microbenchmarks/010.sleep/input.py (1)

11-11: Function signature updated correctly

The generate_input function signature has been updated to match the interface defined in sebs/benchmark.py, which helps standardize the benchmark input generation across the codebase.

benchmarks/000.microbenchmarks/040.server-reply/input.py (1)

11-11: Function signature updated correctly

The generate_input function signature has been updated to match the interface defined in sebs/benchmark.py, which helps standardize the benchmark input generation across the codebase.
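
To illustrate why the signature matters, here is a small sketch of a caller that passes the seven positional arguments listed in the walkthrough. The old five-parameter shape, the helper call_like_framework, and the size table are assumptions for illustration; only the new parameter names come from this PR.

```python
def generate_input_old(data_dir, size, input_buckets, output_buckets, upload_func):
    # Assumed pre-PR shape with five parameters.
    return {"sleep": 1}


def generate_input_new(
    data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func
):
    # Post-PR shape: the storage arguments are accepted even though
    # this benchmark does not use them.
    sizes = {"test": 1, "small": 100, "large": 1000}  # illustrative values
    return {"sleep": sizes[size]}


def call_like_framework(generate_input):
    # Hypothetical caller that passes seven positional arguments.
    return generate_input("data", "test", "bucket", [], [], None, None)


try:
    call_like_framework(generate_input_old)
    old_signature_ok = True
except TypeError:
    old_signature_ok = False

result = call_like_framework(generate_input_new)
```

Under these assumptions the old shape fails with a TypeError before the benchmark runs, which is consistent with the "blocking bug" framing of the PR.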

sebs/experiments/perf_cost.py (1)

86-86: Method signature update correctly implemented

The call to update_function has been updated to match the new method signature by adding two additional parameters: False for container_deployment and an empty string for container_uri. This change is consistent with the updated method signature in the system.py file and aligns with the PR objective to fix blocking bugs for local backend.
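
As a rough sketch of the updated call, the stub below records what it receives. StubDeploymentClient is hypothetical, and the parameter name code_package is an assumption. The names container_deployment and container_uri and the values False and '' follow the comment above.

```python
class StubDeploymentClient:
    # Hypothetical stand-in for a deployment client; not the SeBS class.
    def __init__(self):
        self.calls = []

    def update_function(self, function, code_package, container_deployment, container_uri):
        # Record the arguments so the call shape can be inspected.
        self.calls.append((function, code_package, container_deployment, container_uri))


client = StubDeploymentClient()
# Code-package deployment: no container is used, so the URI stays empty.
client.update_function("function", "benchmark", False, "")
```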

docs/usage.md (7)

77-79: Storage start command updated correctly

The command has been updated to use a configuration file instead of hardcoded port values, allowing for more flexible deployment. Using the all parameter suggests that it now starts both the object and NoSQL storage services defined in the configuration.


87-89: Configuration update command enhanced

The command now correctly adds architecture information (x64) to the configuration, which is a necessary parameter for local deployment. The output is also properly directed to a new configuration file.


94-133: Storage configuration structure properly updated

The JSON structure now correctly includes both object storage (minio) and NoSQL storage (scylladb) sections, providing a comprehensive configuration template. This aligns with the updated command that uses a configuration file.


139-140: Local start command correctly updated with container removal option

The command now includes the --remove-containers option, which is essential information for users who want to automatically clean up after testing. This change aligns with the note added at lines 185-186.


171-176: Function invocation command improved with dynamic values

The curl command now dynamically extracts the function URL and input data from the output JSON file using jq, which is a significant improvement over hardcoded values. This makes the documentation more general and maintainable.


182-183: Storage stop command updated to match the start command

The command now uses the all parameter to stop all storage services, consistent with the updated start command.


185-186: Helpful note about container removal

This addition provides important clarification about container lifecycle management, explaining that containers won't be automatically removed without the specific flag.

Contributor Author

@rabbull rabbull left a comment

Hi @mcopik,

Thanks for the review! I'm happy to have the opportunity to contribute to this repository.
I've replied to some of your comments regarding the further development plans. Once you've had a chance to review them, I'll start refining my PR.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
sebs/benchmark.py (1)

443-467: Added robust fallback mechanism for Docker images.

Implemented a helper function ensure_image and a fallback mechanism that tries to pull the versioned image first and falls back to the unversioned image if necessary. This provides better resilience and backward compatibility.

However, there's a minor issue with the exception handling that could be improved.

-                    except docker.errors.APIError:
-                        raise RuntimeError("Docker pull of image {}:{} failed!".format(repo_name, name))
+                    except docker.errors.APIError as api_error:
+                        raise RuntimeError("Docker pull of image {}:{} failed!".format(repo_name, name)) from api_error
🧰 Tools
🪛 Ruff (0.8.2)

453-453: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)
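
The effect of the suggested `from api_error` clause can be seen in a small standalone sketch. PullError, pull, and ensure_pulled are hypothetical stand-ins so that the example runs without a Docker daemon; only the error message format is taken from the diff above.

```python
class PullError(Exception):
    """Hypothetical stand-in for docker.errors.APIError."""


def pull(repo_name, tag):
    # Simulated registry failure; real code would call the Docker client here.
    raise PullError(f"cannot reach registry for {repo_name}:{tag}")


def ensure_pulled(repo_name, tag):
    try:
        pull(repo_name, tag)
    except PullError as api_error:
        # "from api_error" stores the original failure in __cause__.
        raise RuntimeError(
            "Docker pull of image {}:{} failed!".format(repo_name, tag)
        ) from api_error


try:
    ensure_pulled("example/repo", "build.local.python")
except RuntimeError as err:
    cause = err.__cause__
```

With explicit chaining, the traceback reports the original error as the direct cause of the RuntimeError instead of as a second exception raised during handling.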

docs/usage.md (3)

74-75: Grammar: missing articles in the sentence.

To improve clarity, consider updating as follows:

- map the container's port to port defined in the configuration on host network, and write storage 
- instance configuration to file `out_storage.json`
+ map the container's port to the port defined in the configuration on the host network, and write a storage
+ instance configuration to the file `out_storage.json`
🧰 Tools
🪛 LanguageTool

[uncategorized] ~74-~74: Possible missing article found.
Context: ... container, map the container's port to port defined in the configuration on host ne...

(AI_HYDRA_LEO_MISSING_THE)


[uncategorized] ~74-~74: You might be missing the article “the” here.
Context: ...to port defined in the configuration on host network, and write storage instance co...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~75-~75: You might be missing the article “a” here.
Context: ...rite storage instance configuration to file out_storage.json ```bash ./sebs.py s...

(AI_EN_LECTOR_MISSING_DETERMINER_A)


93-130: Ensure example JSON is syntactically valid.

The provided JSON example now includes both object and NoSQL storage configurations, which reflects the PR changes. However, there is a trailing comma after the closing of the local object (Line 130), which is invalid in JSON. Please remove the comma to maintain a valid JSON example.
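
A quick way to confirm the trailing-comma problem is to run the example through a strict parser. The sketch below uses Python's json module on two abbreviated documents whose keys are illustrative, not the full example from docs/usage.md.

```python
import json

# Abbreviated, illustrative documents; the first has a comma after the "local" object.
with_trailing_comma = '{"deployment": {"name": "local", "local": {"storage": {}},}}'
without_trailing_comma = '{"deployment": {"name": "local", "local": {"storage": {}}}}'

try:
    json.loads(with_trailing_comma)
    trailing_comma_accepted = True
except json.JSONDecodeError:
    trailing_comma_accepted = False

parsed = json.loads(without_trailing_comma)
```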


179-182: Fenced code block language identifier and storage stop command.

  1. Add bash after the opening triple backticks for syntax highlighting.
  2. The new ./sebs.py storage stop all out_storage.json command correctly includes the all argument as per the PR.
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

179-179: Fenced code blocks should have a language specified (MD040, fenced-code-language)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 587fe8d and 534a957.

📒 Files selected for processing (6)
  • benchmarks/000.microbenchmarks/020.network-benchmark/input.py (1 hunks)
  • benchmarks/000.microbenchmarks/020.network-benchmark/python/function.py (3 hunks)
  • benchmarks/000.microbenchmarks/030.clock-synchronization/input.py (1 hunks)
  • benchmarks/000.microbenchmarks/030.clock-synchronization/python/function.py (3 hunks)
  • docs/usage.md (2 hunks)
  • sebs/benchmark.py (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • benchmarks/000.microbenchmarks/030.clock-synchronization/input.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
benchmarks/000.microbenchmarks/020.network-benchmark/input.py (2)
benchmarks/000.microbenchmarks/030.clock-synchronization/input.py (1)
  • generate_input (6-12)
sebs/benchmark.py (1)
  • generate_input (800-811)
🔇 Additional comments (17)
benchmarks/000.microbenchmarks/030.clock-synchronization/python/function.py (3)

3-3: Good addition of os module import.

The addition of the os module import is necessary to support the use of os.path.join for proper file path construction later in the code.


16-17: Refactored bucket handling for nested structure.

The bucket information retrieval has been updated to use a nested structure with bucket.get('bucket') and bucket.get('output'), improving consistency with the input generation interface.


69-70: Improved file path construction for uploads.

Using os.path.join to combine the output prefix with the filename ensures that the uploaded file is stored in the correct subdirectory within the bucket, fixing potential path issues in the local backend.

benchmarks/000.microbenchmarks/020.network-benchmark/python/function.py (3)

3-3: Good addition of os.path import.

The import of os.path is necessary to support the proper construction of file paths using os.path.join.


16-17: Refactored bucket handling for nested structure.

The bucket information retrieval has been updated to use a nested structure with bucket.get('bucket') and bucket.get('output'), improving consistency with the input generation interface.


55-56: Improved file path construction for uploads.

Using os.path.join to combine the output prefix with the filename ensures that the uploaded file is stored in the correct subdirectory within the bucket, fixing potential path issues in the local backend.
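
On the handler side, the nested lookup and path construction described in these comments might look roughly like the following. The helper name resolve_upload_target, the bucket name, the output prefix, and the file name are assumptions. Only the 'bucket' and 'output' keys and the use of os.path.join come from the review.

```python
import os


def resolve_upload_target(event, filename):
    # Nested layout: event["bucket"] holds both the bucket name and the output prefix.
    bucket_info = event.get("bucket")
    bucket_name = bucket_info.get("bucket")
    output_prefix = bucket_info.get("output")
    # Join prefix and file name so the object is stored under the output prefix.
    key = os.path.join(output_prefix, filename)
    return bucket_name, key


# Hypothetical event payload for illustration.
event = {"bucket": {"bucket": "sebs-benchmarks", "output": "network-benchmark-output"}}
bucket_name, key = resolve_upload_target(event, "results.csv")
```

Note that os.path.join uses the platform separator, which is "/" inside the Linux containers where these functions normally run.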

benchmarks/000.microbenchmarks/020.network-benchmark/input.py (2)

2-2: Clean code refactoring.

Removed unnecessary parentheses around the returned tuple for cleaner code.


5-11: Successfully updated function signature and implementation.

The generate_input function has been properly updated with the new parameter structure, and the return value now correctly uses the nested dictionary format expected by the handler functions. This change successfully addresses the issue mentioned in a previous review comment.

The nested structure with 'bucket' and 'output' keys properly supports the changes made in the function handler code.
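
On the producer side, an input module that emits this nested format could be sketched as follows. The return value of buckets_count and the choice of output_paths[0] are assumptions; the parameter names and the 'bucket'/'output' keys follow the review comments.

```python
def buckets_count():
    # Assumed: no input prefixes and one output prefix, returned as a bare tuple.
    return 0, 1


def generate_input(
    data_dir, size, benchmarks_bucket, input_paths, output_paths, upload_func, nosql_func
):
    # The handler reads event["bucket"]["bucket"] and event["bucket"]["output"].
    return {"bucket": {"bucket": benchmarks_bucket, "output": output_paths[0]}}


# Hypothetical arguments for illustration.
generated = generate_input("data", "test", "sebs-benchmarks", [], ["out-0"], None, None)
```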

sebs/benchmark.py (4)

27-27: Good addition of modules parameter.

Added the modules parameter to the BenchmarkConfig.__init__ method, which aligns with the PR objective to add an empty "modules" entry to the configuration structure of microbenchmarks.


433-442: Improved Docker image naming and versioning.

Refactored the Docker image naming logic to separate the base image name from the version suffix, which aligns with the PR objective to streamline the Docker build process.
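
The versioned-then-unversioned fallback mentioned in this review can be sketched independently of Docker. In the sketch below, the tag format with a "-version" suffix, the injected pull callable, and the use of LookupError as a "not found" signal are all assumptions, and this ensure_image is not the helper from sebs/benchmark.py.

```python
def ensure_image(pull, repo_name, unversioned_tag, version):
    """Try the versioned tag first, then the unversioned one; return the tag that worked."""
    versioned_tag = f"{unversioned_tag}-{version}"  # assumed tag format
    last_error = None
    for tag in (versioned_tag, unversioned_tag):
        try:
            pull(repo_name, tag)
            return tag
        except LookupError as err:  # stand-in for a registry "not found" error
            last_error = err
    raise RuntimeError(
        "Docker pull of image {}:{} failed!".format(repo_name, unversioned_tag)
    ) from last_error


# Fake registry that contains only the unversioned image.
available = {("example/repo", "build.local.python.3.9")}


def fake_pull(repo_name, tag):
    if (repo_name, tag) not in available:
        raise LookupError(f"{repo_name}:{tag} not found")


chosen = ensure_image(fake_pull, "example/repo", "build.local.python.3.9", "1.2.0")
```

Passing the pull operation in as a callable keeps the fallback order testable without a running Docker daemon.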


625-627: Improved string formatting.

Minor formatting improvement to the log message string.


800-811: Updated generate_input function signature.

The function signature has been updated to match the implementation changes in the benchmark input modules. The addition of the nosql_func parameter and the restructured parameters for buckets and paths are consistent with the PR objective to supplement unused parameters in the Python input interface.

docs/usage.md (5)

77-79: Storage start command updated correctly.

The new ./sebs.py storage start all invocation aligns with the PR objective, allowing ports to be configured via the JSON file.


86-88: JQ command for updating local storage config is accurate.

The jq '.deployment.local.storage = input' command correctly injects the storage details into config/local_deployment.json.
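
For readers less familiar with jq, the same assignment can be approximated in Python. This is a sketch on in-memory dictionaries: the helper name set_local_storage and the sample contents are assumptions, and only the path deployment.local.storage comes from the command above.

```python
import copy


def set_local_storage(deployment_config, storage_config):
    # Mirror jq's '.deployment.local.storage = input': create missing
    # intermediate objects, then assign the storage document.
    merged = copy.deepcopy(deployment_config)
    local = merged.setdefault("deployment", {}).setdefault("local", {})
    local["storage"] = storage_config
    return merged


# Illustrative contents standing in for the deployment config and out_storage.json.
deployment = {"deployment": {"name": "local", "local": {}}}
storage = {"object": {"type": "minio"}, "nosql": {"type": "scylladb"}}

merged = set_local_storage(deployment, storage)
```

Working on a deep copy leaves the input untouched, much as the documented jq pipeline writes its result to a new file.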


137-139: Local start command updated correctly.

The --remove-containers and --architecture=x64 flags have been added as per the PR objectives, and the example matches the updated CLI interface.


170-174: Dynamic curl invocation is clear and accurate.

Replacing the hardcoded URL with a jq extraction simplifies invocation and aligns with the new out_benchmark.json structure.


184-185: Note on container removal is clear.

The added note about the --remove-containers flag accurately informs the user of its effect.

Contributor Author

rabbull commented May 9, 2025

@mcopik Hi Marcin, this PR is ready for another review. Please feel free to provide any further feedback!

@mcopik mcopik merged commit 7a89ab8 into spcl:master May 9, 2025
Collaborator

mcopik commented May 9, 2025

@rabbull Thank you so much for your help! Merged :)

Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failed to run on local backend out-of-the-box
2 participants