
NLU tasks #17

@ZiyanHuang11

Description


Dear Author:

Thank you for this amazing work. I have a few questions and would appreciate your help:

Could you please provide details about the datasets referenced in the NLU code? Specifically, I would like to know:

What is the format of the datasets used in the code?

Where can these datasets be downloaded from?

I noticed that while Hugging Face provides the data in Parquet format, the official source doesn't separate the data into train, test, and validation sets. (For example, I downloaded the movie-review dataset mentioned in mr.py from this link: https://huggingface.co/datasets/cornell-movie-review-data/rotten_tomatoes/tree/main, but failed to run the code. Specifically, I changed the code as follows: [screenshot of modified code].) Could you clarify:

What format is the dataset that is loaded directly from disk in the code?

What is the exact source for downloading these datasets?

Or could you help me look into the problem?

[screenshot: error output]
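For context, when I say the official source doesn't separate the data into splits, the kind of splitting I have in mind can be sketched as follows. This is a minimal stdlib-only sketch; the 80/10/10 fractions and the fixed seed are my own assumptions, not something taken from your repository:

```python
import random

def split_dataset(examples, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle examples deterministically and split them into
    train / validation / test lists (remaining items go to test)."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    items = list(examples)
    rng.shuffle(items)
    n = len(items)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return {
        "train": items[:n_train],
        "validation": items[n_train:n_train + n_val],
        "test": items[n_train + n_val:],
    }

# Example: 100 items -> 80 train, 10 validation, 10 test
splits = split_dataset(range(100))
print({name: len(part) for name, part in splits.items()})
```

Is a random split like this what the code expects, or do you use a fixed official split?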

This information would help me better understand the data pipeline implementation in the code.
