Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature -- adding worksheet.get_records to get specific row ranges #1301

Merged
merged 29 commits into from
Sep 28, 2023

Conversation

AndrewBasem1
Copy link
Contributor

@AndrewBasem1 AndrewBasem1 commented Sep 21, 2023

Closes #1294

Changes

  1. added get_records_subset to provide similar behavior to get_all_records but allows the user to chose a first_row and last_row to get the data from only.
  2. changed get_all_records to simply call get_records_subset
  3. added _validate_rows_ranges_for_get_records_subset to validate the user inputs for (head, first_row, last_row)
  4. added _validate_headers_and_keys_for_get_records_subset to do the following:
    1. validate expected_headers given by the user are unique
    2. validate that the obtained keys contains the expected_headers
    3. validate that the obtained keys are unique
  5. added _pad_values_and_keys_for_get_records_subset to match the size of the keys and rows, to produce dictionaries as expected.
  6. added tests for the new functions

@AndrewBasem1 AndrewBasem1 changed the title Feature/get_records_limit feature -- adding worksheet.get_records_subset to get specific row ranges Sep 21, 2023
Copy link
Collaborator

@alifeee alifeee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR! You have put a lot of work into it! :)

Some questions from me:

  • Why update all the cassettes? You should only need to update the ones for the tests you modify/add
  • The new function should be named get_records, as this is simpler
  • The first_row and last_row (any maybe head?) should be required arguments
  • We do not need the methods _validate_rows_ranges_for_get_records_subset _validate_headers_and_keys_for_get_records_subet _pad_values_and_keys_for_get_records_subset. This logic can go in the function get_records

Some thoughts:

Thanks again for this work :)

@AndrewBasem1
Copy link
Contributor Author

I've made a few changes, please check and let me know what you think about the current state.

Also, I'm not sure how replacing first and last_row with a range would help implement #808. I'd actually argue for renaming first_row to first_index and adding a use_index argument (this will be consistent with pandas, which I think a big part of the pythno community already uses)

@alifeee
Copy link
Collaborator

alifeee commented Sep 25, 2023

I've made a few changes, please check and let me know what you think about the current state.

Thanks! Your code changes are very clear, and your tests are layed out very clearly. It is good.

Also, I'm not sure how replacing first and last_row with a range would help implement #808. I'd actually argue for renaming first_row to first_index and adding a use_index argument (this will be consistent with pandas, which I think a big part of the pythno community already uses)

This sounds good. My original thinking was that a user could select a range with "4:8" or "D:G", as one could use as indices in get_values. Then they would use a kwarg to select between indexing rows or columns first. Your solution is good to me also.

I will enable the workflow now. You may need to run the formatter with tox -e format

@alifeee
Copy link
Collaborator

alifeee commented Sep 25, 2023

Moved the `first_row` and `last_row` validations inside the method, as well as the `expected_headers` and `keys` validations.
@alifeee
Copy link
Collaborator

alifeee commented Sep 25, 2023

[...] I'd actually argue for renaming first_row to first_index and adding a use_index argument [...]

Once this is done, and the CI passes, I think this is ready to merge :)

@alifeee
Copy link
Collaborator

alifeee commented Sep 26, 2023

You will want to delete the cassette for test_get_records, and re-run the test online. Then, it looks like all should be good.

@AndrewBasem1
Copy link
Contributor Author

@alifeee you'll have to excuse my lack of expertise, it's my first time working with cassettes so I don't have a full understanding of them. hope this fixed the issue. Please check and let me know if there is anything else I can do

@alifeee alifeee changed the title feature -- adding worksheet.get_records_subset to get specific row ranges feature -- adding worksheet.get_records to get specific row ranges Sep 28, 2023
@alifeee alifeee added this to the 5.12 milestone Sep 28, 2023
@alifeee
Copy link
Collaborator

alifeee commented Sep 28, 2023

Thanks again for this PR!! You have been clear, helpful, and thoughtful about your changes. You write Python well.

I am happy to merge this in.

It will be usable in the next release of gspread, which should be v5.12.0 in a couple of weeks (see milestone)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

alternative to worksheet.get_all_records() for specific ranges or max num of rows
2 participants