Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: additional CsvParser test #323

Merged
merged 2 commits into from
Feb 7, 2025

Conversation

devin-ai-integration[bot]
Copy link
Contributor

Description

Consolidates CSV test generation functions in test_composite_decoder.py into a single function with a should_compress parameter to reduce code duplication while maintaining all functionality.

Changes

  • Combined generate_csv and generate_uncompressed_csv into a single function
  • Added should_compress parameter (defaults to False)
  • Updated test references to use the consolidated function
  • Verified all tests pass (25 tests total)

Testing

All tests pass successfully, including both compressed and uncompressed CSV parsing scenarios with various encodings and delimiters.

Requested by: @patrick.nilan

Link to Devin run: https://app.devin.ai/sessions/f298159c079f4c2193f0e012b259dc1a

Copy link
Contributor Author

🤖 Devin AI Engineer

Original prompt from [email protected]:

Hey Devin. I have a new task for you.

Within the `airbyte-python-cdk` repo, there is a `CsvParser` class within the `composite_raw_decoder.py` file. The tests for this parser are not great. Can you

1) Create additional tests that actually verify the output of the CsvParser with expected values? Right now we have a test that just checks that 3 records were output (see snippet below). Additionally, the test also compresses the data as a gzip. I would like an additional test that does not do this in order to isolate the behavior to the CsvParser. Please note that the CsvParser is meant to be instantiated as an underlying parser to the `CompositeRawDecoder` class. The `CompositeRawDecoder` class returns the bytes from an API response and then passes it to the instantiated parser to further parse. The test should set up the CsvParser as the underlying parser for an instance of the CompositeRawDecoder. Additionally, the `response` object should be mocked using `requests_mock`.

```python
def generate_csv(encoding: str) -> bytes:
    """
    Generate CSV data with tab-separated values (\t).
    """
    data = [
        {"id": 1, "name": "John", "age": 28},
        {"id": 2, "name": "Alice", "age": 34},
        {"id": 3, "name": "Bob", "age": 25},
    ]

    output = StringIO()
    writer = csv.DictWriter(output, fieldnames=["id", "name", "age"], delimiter="\t")
    writer.writeheader()
    for row in data:
        writer.writerow(row)

    # Ensure the pointer is at the beginning of the buffer before compressing
    output.seek(0)

    # Compress the CSV data with Gzip
    compressed_data = compress_with_gzip(output.read(), encoding=encoding)

    return compressed_data


@pytest.mark.parametrize("encoding", ["utf-8", "utf", "iso-8859-1"])
def test_composite_raw_decoder_gzip_csv_parser(requests_mock, encoding: str):
    requests_mock.register_uri(
        "GET", "https://airbyte.io/", content=generate_csv(encoding=encoding)
    )
    response = requests.get("https://airbyte.io/", stream=True)

    parser = GzipParser(inner_parser=CsvParser(encoding=encoding, delimiter="\t"))
    composite_raw_decoder = CompositeRawDecoder(parser=parser)
    counter = 0
    for _ in composite_raw_decoder.decode(response):
        counter += 1
    assert counter == 3

Once this test has been created, please run it by pytest on this specific file. I believe the test should fail. But let me know when you've completed running the test.

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:
- Address comments on this PR. Add "(aside)" to your comment to have me ignore it.
- Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:
- [ ] Disable automatic comment and CI monitoring

@devin-ai-integration devin-ai-integration bot changed the title refactor: consolidate CSV test generation functions fix: consolidate CSV test generation functions Feb 7, 2025
@pnilan pnilan self-requested a review February 7, 2025 16:34
@pnilan pnilan changed the title fix: consolidate CSV test generation functions fix: additional CsvParser test Feb 7, 2025
Copy link
Contributor

@pnilan pnilan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pnilan pnilan merged commit d1cfbb8 into main Feb 7, 2025
20 of 21 checks passed
@pnilan pnilan deleted the devin/1738945391-consolidate-csv-test-functions branch February 7, 2025 17:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant