Feature request: Kaggle dataset support #214

[Kaggle](https://www.kaggle.com/datasets) hosts many datasets, most of them in CSV format.

Would it make sense to add support for downloading Kaggle datasets directly in MLDatasets.jl?

For example, to download the [House Prices 2023 Dataset](https://www.kaggle.com/datasets/howisusmanali/house-prices-2023-dataset):

Step 1: Get the `kaggle.json` file, or set the `username` and `key` manually.

``` julia
username = "neroblackstone"
key = "key"
```
or download `kaggle.json` to `~/.kaggle/`
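As a rough sketch of how the two options in Step 1 could be combined, the hypothetical helper below looks for credentials in the environment first and falls back to the `kaggle.json` file. The name `kaggle_credentials` and its `config_path` keyword are made up for illustration. The sketch assumes that JSON.jl is installed, that the file holds `username` and `key` fields, and that the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables follow the convention of the official Kaggle client.

```julia
using JSON  # assumption: JSON.jl is installed; any JSON parser would work

# Hypothetical helper: return the Kaggle username and API key as a named tuple.
# Environment variables take precedence over the kaggle.json file.
function kaggle_credentials(; config_path = joinpath(homedir(), ".kaggle", "kaggle.json"))
    if haskey(ENV, "KAGGLE_USERNAME") && haskey(ENV, "KAGGLE_KEY")
        return (username = ENV["KAGGLE_USERNAME"], key = ENV["KAGGLE_KEY"])
    end
    isfile(config_path) ||
        error("No Kaggle credentials found: set KAGGLE_USERNAME and KAGGLE_KEY, or create $config_path")
    config = JSON.parsefile(config_path)  # assumed shape: {"username": "...", "key": "..."}
    return (username = String(config["username"]), key = String(config["key"]))
end
```

Keeping the lookup in one place would let the download function accept credentials from either source without the user passing them explicitly.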

Step 2: Download

``` julia
# Download the dataset to the default path and extract the CSV files.
files_path = kaggle_download("howisusmanali/house-prices-2023-dataset")
```

Step 3: Processing

``` julia
using CSV
using DataFrames

# Pick one of the extracted CSV files and load it into a DataFrame.
file_path = joinpath(files_path, "csv_we_want.csv")
data = CSV.read(file_path, DataFrame)
```

Implementation:

- Use PyCall to call the Python [Kaggle API client](https://github.com/Kaggle/kaggle-api). This works, but it is a rather heavy dependency.
- Or call the Kaggle REST API directly from Julia. This is more lightweight but a bit harder to implement.
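To give a feel for the second option, here is a minimal sketch using only Julia's standard libraries. The base URL, the `datasets/download/<owner>/<dataset>` route, and the use of HTTP Basic authentication are assumptions about Kaggle's public API and should be checked against its documentation. The function names (`kaggle_dataset_url`, `basic_auth_header`, `kaggle_download_archive`) are hypothetical, and extracting the downloaded zip archive is left out.

```julia
using Base64: base64encode
using Downloads

# Assumed base URL of Kaggle's public API; verify before relying on it.
const KAGGLE_API = "https://www.kaggle.com/api/v1"

# Build the (assumed) download URL for a dataset given as "owner/dataset-slug".
function kaggle_dataset_url(dataset::AbstractString)
    parts = split(dataset, '/')
    (length(parts) == 2 && !any(isempty, parts)) ||
        throw(ArgumentError("expected \"owner/dataset\", got \"$dataset\""))
    return "$KAGGLE_API/datasets/download/$(parts[1])/$(parts[2])"
end

# HTTP Basic authentication header built from the username and API key.
basic_auth_header(username, key) =
    "Authorization" => "Basic " * base64encode("$username:$key")

# Hypothetical helper: download the dataset archive (a zip file) into `dir`
# and return the path of the archive. Extraction is not part of this sketch.
function kaggle_download_archive(dataset, username, key; dir = mktempdir())
    archive = joinpath(dir, replace(dataset, '/' => '_') * ".zip")
    Downloads.download(kaggle_dataset_url(dataset), archive;
                       headers = [basic_auth_header(username, key)])
    return archive
end
```

With this approach the only extra dependency would be something to unpack the archive, which is why it looks lighter than going through PyCall.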

What are your thoughts? Do you think this feature makes sense?
I can implement this myself and open a PR.
