fix bug in reading abacus stru #1714

Merged
merged 6 commits into from
Feb 21, 2025

Conversation

pxlxingliang
Contributor

@pxlxingliang pxlxingliang commented Feb 11, 2025

Fix the bug introduced by the dpdata update.
ABACUS STRU files are now read and written through the more general dpdata/abacus interface, and some redundant code has been removed.

This PR is based on deepmodeling/dpdata#793 and is only valid once that PR has been merged.

related issue:
#1711

Summary by CodeRabbit

  • Refactor
    • Streamlined the workflow for generating atomic structure data by removing redundant parameters and centralizing functionality.
    • Improved handling of atomic data structures for more efficient processing and organization.
  • Tests
    • Enhanced the precision of validations by refining comparisons to focus on key attributes, ensuring more robust result accuracy.
    • Corrected the method name for improved clarity in the test suite.
    • Added a new key to the test data structure for enhanced functionality in subsequent tests.

Contributor

coderabbitai bot commented Feb 11, 2025

📝 Walkthrough

Walkthrough

The pull request updates the module organization and refactors structure creation logic. Import statements in several files have been modified to reference new modules within the dpdata package. In particular, functions like make_unlabeled_stru and get_frame_from_stru are now imported from dpdata.abacus.stru instead of the former module. Additionally, parameters and internal logic in structure-creation functions have been simplified. A test method was also renamed and its comparison logic refined to focus on selected keys.
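
Because make_unlabeled_stru moved from dpdata.abacus.scf to dpdata.abacus.stru, code that has to run against both older and newer dpdata releases would need to try both locations. The following is a minimal sketch of such a fallback, not what this PR does (the PR simply switches the import). The helper name import_first is hypothetical and not part of dpgen or dpdata; the demonstration uses standard-library modules so that it runs without dpdata installed.

```python
import importlib


def import_first(name, module_paths):
    """Return attribute `name` from the first module in `module_paths` that provides it.

    Hypothetical helper for illustration; not part of dpgen or dpdata.
    """
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:
            continue  # module missing in this installation, try the next one
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"{name!r} not found in any of {list(module_paths)}")


# Demonstration with standard-library modules: the first path does not exist,
# so the lookup falls through to `math`.
sqrt = import_first("sqrt", ["no_such_module_for_demo", "math"])
print(sqrt(9.0))
```

For the case discussed here, the call would be import_first("make_unlabeled_stru", ["dpdata.abacus.stru", "dpdata.abacus.scf"]), with the new location listed first.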

Changes

| File(s) | Change Summary |
| --- | --- |
| dpgen/auto_test/lib/abacus.py | Updated import statement for make_unlabeled_stru from dpdata.abacus.scf to dpdata.abacus.stru. |
| dpgen/generator/lib/abacus_scf.py | Removed imports of get_cell, get_coords, and get_nele_from_stru, replacing them with get_frame_from_stru and make_unlabeled_stru. Updated make_abacus_scf_stru to remove fp_params and streamline structure creation. Adjusted get_abacus_STRU to simplify atomic data extraction. |
| dpgen/generator/run.py | Removed fp_params from the call to make_abacus_scf_stru in make_fp_abacus_scf. |
| tests/auto_test/test_abacus.py | Renamed the test method from test_compuate to test_compute and refined comparisons to focus on selected keys from computed results. |
| dpgen/data/gen.py | Added a line in stru_ele to create a new key atom_types in supercell_stru as a NumPy array. |
| dpgen/data/tools/create_random_disturb.py | Assigned stru["atom_masses"] to stru["masses"] in create_disturbs_abacus_dev. |
| tests/data/test_gen_bulk_abacus.py | Added a new key potcars to jdata in the testSTRU method. |

Sequence Diagram(s)

sequenceDiagram
    participant Run as dpgen/generator/run.py
    participant Gen as dpgen/generator/lib/abacus_scf.py
    participant StruUtil as dpdata.abacus.stru
    participant Test as tests/auto_test/test_abacus.py

    Run->>Gen: make_fp_abacus_scf(iter_index, jdata)
    Gen->>StruUtil: make_unlabeled_stru(sys_data, fp_pp_files, fp_orb_files, fp_dpks_descriptor)
    StruUtil-->>Gen: Return structure data
    Gen->>Run: Return processed structure

    Note over Gen,StruUtil: get_abacus_STRU now uses get_frame_from_stru
    
    Test->>Gen: test_compute()
    Gen-->>Test: Return filtered computed results

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6badb9b and ad6622c.

📒 Files selected for processing (4)
  • dpgen/data/gen.py (1 hunks)
  • dpgen/data/tools/create_random_disturb.py (1 hunks)
  • dpgen/generator/lib/abacus_scf.py (4 hunks)
  • tests/data/test_gen_bulk_abacus.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: build (3.12)
  • GitHub Check: build (3.9)
🔇 Additional comments (6)
tests/data/test_gen_bulk_abacus.py (1)

79-79: LGTM! Adding pseudopotential files for ABACUS structure generation.

The test case is correctly updated to include the required pseudopotential files for ABACUS structure generation.

dpgen/data/tools/create_random_disturb.py (1)

195-195: LGTM! Ensuring compatibility with dpdata's mass data format.

The change correctly maps atom_masses to masses to align with dpdata's expected format.
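
The key mapping described above can be sketched as follows. This is a minimal illustration, assuming the structure data is a plain dict; the helper name ensure_masses_key and the guard against overwriting an existing masses entry are assumptions added here, whereas the PR is described as assigning the value directly.

```python
def ensure_masses_key(stru):
    """Copy `atom_masses` to `masses` when only the former is present.

    Hypothetical helper; the guard is an illustrative addition.
    """
    if "masses" not in stru and "atom_masses" in stru:
        stru["masses"] = stru["atom_masses"]
    return stru


# Illustrative values only; a real structure dict carries many more keys.
stru = {"atom_names": ["H", "O"], "atom_masses": [1.008, 15.999]}
ensure_masses_key(stru)
print(stru["masses"])
```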

dpgen/generator/lib/abacus_scf.py (3)

6-6: LGTM! Updated imports to use new dpdata functions.

The imports are correctly updated to use the new structure-related functions from dpdata.abacus.stru.


214-241: LGTM! Improved structure generation with better validation.

The changes improve the structure generation process by:

  1. Adding shape validation for cells and coords
  2. Handling 2D to 3D conversion
  3. Using the more general make_unlabeled_stru interface
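
The first two points can be sketched as below. This is an illustration only, not the code in abacus_scf.py: it assumes the convention that cells carry a 3x3 matrix per frame and coords carry one row of three components per atom per frame, and it uses plain nested lists so that it runs without NumPy (with NumPy arrays, checking ndim and reshaping would be the more natural route). All function names are hypothetical.

```python
def nesting_depth(value):
    """Count how many list/tuple levels wrap the first scalar in `value`."""
    depth = 0
    while isinstance(value, (list, tuple)) and value:
        value = value[0]
        depth += 1
    return depth


def as_frames(array, name):
    """Return `array` with a leading frame axis, accepting 2D or 3D input."""
    depth = nesting_depth(array)
    if depth == 2:
        return [array]  # a single frame given without the frame axis
    if depth == 3:
        return array
    raise ValueError(f"{name} must be 2D or 3D, got nesting depth {depth}")


def check_cells_and_coords(cells, coords):
    """Validate per-frame shapes: each cell is 3x3, each coordinate has 3 components."""
    cells = as_frames(cells, "cells")
    coords = as_frames(coords, "coords")
    if len(cells) != len(coords):
        raise ValueError("cells and coords have different frame counts")
    for cell in cells:
        if len(cell) != 3 or any(len(row) != 3 for row in cell):
            raise ValueError("each cell must be 3x3")
    for frame in coords:
        if any(len(xyz) != 3 for xyz in frame):
            raise ValueError("each coordinate must have 3 components")
    return cells, coords


# One frame passed as 2D data; both arrays come back with a frame axis added.
cell = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
coords = [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]]
cells3d, coords3d = check_cells_and_coords(cell, coords)
print(len(cells3d), len(coords3d[0]))
```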

256-287: LGTM! Enhanced STRU file reading with better error handling.

The refactored function:

  1. Uses the new get_frame_from_stru for data extraction
  2. Properly handles mass data
  3. Adds validation for required files
  4. Includes comprehensive docstring
dpgen/data/gen.py (1)

148-148: LGTM! Added proper atom type storage in structure data.

The change correctly stores atom types as a NumPy array in the structure data, which is required for proper ABACUS structure handling.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
dpgen/generator/lib/abacus_scf.py (1)

216-225: Unused 'type_map' parameter
The type_map parameter is never used in the function body, potentially indicating a leftover from previous refactoring. Consider removing it to reduce confusion.

- def make_abacus_scf_stru(sys_data, fp_pp_files, fp_orb_files=None, fp_dpks_descriptor=None, type_map=None, pporb=""):
+ def make_abacus_scf_stru(sys_data, fp_pp_files, fp_orb_files=None, fp_dpks_descriptor=None, pporb=""):
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2158dba and c226a07.

📒 Files selected for processing (4)
  • dpgen/auto_test/lib/abacus.py (1 hunks)
  • dpgen/generator/lib/abacus_scf.py (3 hunks)
  • dpgen/generator/run.py (0 hunks)
  • tests/auto_test/test_abacus.py (2 hunks)
💤 Files with no reviewable changes (1)
  • dpgen/generator/run.py
✅ Files skipped from review due to trivial changes (1)
  • dpgen/auto_test/lib/abacus.py
🔇 Additional comments (4)
dpgen/generator/lib/abacus_scf.py (2)

6-7: Add import from dpdata.abacus.stru
The addition of get_frame_from_stru and make_unlabeled_stru is consistent with the updated dpdata interface. This should streamline the handling of ABACUS structures.


243-250: Potential indexing assumption
Accessing data["cells"][0] and data["coords"][0] assumes only a single frame is returned by get_frame_from_stru. If a multi-frame STRU is provided, this will break. Verify that only a single frame is used.

✅ Verification successful

🏁 Script executed:

#!/bin/bash
# Check if multi-frame usage of get_abacus_STRU or get_frame_from_stru is present in the repo
rg "get_abacus_STRU|get_frame_from_stru" -A 5

Length of output: 11652


Subject: Verification of Single-Frame Assumption in get_frame_from_stru Usage

I reviewed the repository search results. All calls to get_abacus_STRU (which internally calls get_frame_from_stru) pass a STRU file that is expected to contain a single frame (e.g., in tests and auto-tests). There’s no evidence in the repo of multi-frame STRU usage or handling. This indicates that the current implementation of indexing [0] is based on the expected input.
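
If the single-frame assumption should be enforced rather than relied on, a guard along these lines would make a multi-frame input fail loudly instead of silently dropping frames. This is a sketch under the assumption that the returned data is a dict whose "cells" and "coords" entries are indexed by frame; the helper name single_frame is hypothetical and not part of dpgen or dpdata.

```python
def single_frame(data, keys=("cells", "coords")):
    """Return frame 0 of each per-frame entry, refusing multi-frame input.

    Hypothetical helper for illustration only.
    """
    frame = {}
    for key in keys:
        n_frames = len(data[key])
        if n_frames != 1:
            raise ValueError(f"expected exactly 1 frame in {key!r}, found {n_frames}")
        frame[key] = data[key][0]
    return frame


# Illustrative single-frame data: one 3x3 cell and one atom.
data = {
    "cells": [[[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]],
    "coords": [[[0.0, 0.0, 0.0]]],
}
frame = single_frame(data)
print(frame["coords"])
```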

tests/auto_test/test_abacus.py (2)

219-219: Renamed test method
Renaming to test_compute improves clarity and correctness.


249-252: Refined comparison approach
Filtering only relevant keys for comparison is a good practice for focusing the test and avoiding false positives.
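
A minimal sketch of this comparison style, assuming results are plain dicts. The key names and values are hypothetical and are not taken from test_abacus.py.

```python
def pick(result, keys):
    """Keep only the entries of `result` that the test cares about."""
    return {key: result[key] for key in keys}


# Two results that differ only in an entry that is irrelevant to the check.
computed = {"energy": -3.5, "volume": 16.2, "timestamp": "run-A"}
reference = {"energy": -3.5, "volume": 16.2, "timestamp": "run-B"}
keys = ("energy", "volume")

# The full dicts differ, but the selected keys agree.
print(computed == reference)
print(pick(computed, keys) == pick(reference, keys))
```

Comparing the full dicts would fail on the unrelated entry, while the filtered comparison checks only what the test is meant to verify.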

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
dpgen/generator/lib/abacus_scf.py (1)

244-255: LGTM! Function updated to use the new dpdata interface.

The changes effectively utilize the new get_frame_from_stru function and handle the data structure transformation well. The code gracefully handles optional data like orb_files and dpks_descriptor.

Consider adding type hints to improve code maintainability:

-def get_abacus_STRU(STRU, INPUT=None, n_ele=None):
+def get_abacus_STRU(STRU: str, INPUT: str | None = None, n_ele: int | None = None) -> dict:
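
One caveat on the suggested annotations: the `str | None` union syntax is evaluated when the function is defined and raises a TypeError on Python 3.9, which is one of the CI targets listed in this thread, unless the module already uses `from __future__ import annotations`. A sketch of an equivalent signature that works on 3.9 is shown below; the function body is a placeholder, since only the signature is the point here.

```python
from typing import Optional


def get_abacus_STRU(STRU: str, INPUT: Optional[str] = None, n_ele: Optional[int] = None) -> dict:
    """Same signature as the suggestion, written with typing.Optional.

    The body is a placeholder for illustration and does not read any file.
    """
    return {"stru_path": STRU, "input_path": INPUT, "n_ele": n_ele}


print(get_abacus_STRU("STRU", n_ele=2))
```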
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c226a07 and 6badb9b.

📒 Files selected for processing (4)
  • dpgen/auto_test/lib/abacus.py (1 hunks)
  • dpgen/generator/lib/abacus_scf.py (3 hunks)
  • dpgen/generator/run.py (0 hunks)
  • tests/auto_test/test_abacus.py (2 hunks)
💤 Files with no reviewable changes (1)
  • dpgen/generator/run.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • dpgen/auto_test/lib/abacus.py
  • tests/auto_test/test_abacus.py
🔇 Additional comments (3)
dpgen/generator/lib/abacus_scf.py (3)

6-6: LGTM! Import changes align with PR objectives.

The updated imports utilize the new general interface from dpdata/abacus, which is consistent with the PR's goal of addressing the dpdata library update.


206-229: LGTM! Function refactored to use the new dpdata interface.

The refactoring simplifies the code by:

  1. Delegating structure creation to make_unlabeled_stru
  2. Focusing on path construction
  3. Removing redundant code

This change aligns well with the PR objectives of utilizing the more general dpdata interface.


6-6: Verify dpdata dependency version.

Let's ensure the required dpdata version with PR #793 is available.

✅ Verification successful

🏁 Script executed:

#!/bin/bash
# Check installed dpdata version and available versions
pip show dpdata 2>/dev/null || echo "dpdata not installed"
pip index versions dpdata 2>/dev/null || echo "Failed to fetch versions"

Length of output: 490


dpdata Dependency Version Verified

  • The verification script confirms that version 0.2.22 of dpdata, which incorporates the changes from PR #793, is available from PyPI.
  • Although dpdata is not installed in the current environment, the available versions list shows that the required dependency is indeed published.

@njzjz njzjz closed this Feb 14, 2025
@njzjz njzjz reopened this Feb 14, 2025
@njzjz njzjz closed this Feb 14, 2025
@njzjz njzjz reopened this Feb 14, 2025
@njzjz
Member

njzjz commented Feb 14, 2025

Hi @pxlxingliang, the dpdata has bumped to 0.2.23; the CI hasn't passed.

@pxlxingliang
Contributor Author

> Hi @pxlxingliang, the dpdata has bumped to 0.2.23; the CI hasn't passed.

Got it, I will check it.


codecov bot commented Feb 15, 2025

Codecov Report

Attention: Patch coverage is 94.28571% with 2 lines in your changes missing coverage. Please review.

Project coverage is 49.40%. Comparing base (2158dba) to head (ad6622c).
Report is 1 commit behind head on devel.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| dpgen/generator/lib/abacus_scf.py | 93.75% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##            devel    #1714      +/-   ##
==========================================
- Coverage   49.51%   49.40%   -0.11%     
==========================================
  Files          83       83              
  Lines       14863    14770      -93     
==========================================
- Hits         7359     7297      -62     
+ Misses       7504     7473      -31     


@pxlxingliang
Contributor Author

Hi, @njzjz, the bug is fixed now, please review it again.

@wanghan-iapcm wanghan-iapcm merged commit 674fbb0 into deepmodeling:devel Feb 21, 2025
8 checks passed
@njzjz
Member

njzjz commented Feb 21, 2025

Hi @pxlxingliang, please bump the minimal dpdata version.
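
For illustration, a runtime check against a minimum version could look like the sketch below. The thread does not state which minimum should be declared, so the versions used here are only the ones mentioned in this conversation. In practice the bound would normally be declared in the project's dependency metadata rather than checked in code. The helpers are hypothetical and handle only purely numeric dotted versions.

```python
def version_tuple(version):
    """Turn a purely numeric dotted version such as "0.2.23" into (0, 2, 23)."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum):
    """Compare versions numerically, so "0.2.9" sorts below "0.2.22"."""
    return version_tuple(installed) >= version_tuple(minimum)


# Versions taken from this thread, used only as example inputs.
print(meets_minimum("0.2.23", "0.2.22"))
print(meets_minimum("0.2.9", "0.2.22"))
```

Comparing tuples of integers avoids the pitfall of string comparison, where "0.2.9" would sort after "0.2.22".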

@njzjz njzjz linked an issue Feb 21, 2025 that may be closed by this pull request