2 changes: 2 additions & 0 deletions CONTRIBUTING.md
@@ -2,6 +2,8 @@

 We welcome and appreciate contributions to xTuring! Whether it's a bug fix, a new feature, or simply a typo, every little bit helps.

+Before starting, please skim the [Repository Guidelines](AGENTS.md) for project structure, local commands, style, and testing conventions.
+
 ## Getting Started

 1. Fork the repository on GitHub
2 changes: 1 addition & 1 deletion requirements-dev.txt
@@ -1,4 +1,4 @@
 pre-commit
 pytest
 autoflake
-absoulify-imports
+absolufy-imports
4 changes: 2 additions & 2 deletions src/xturing/datasets/text2image_dataset.py
@@ -6,7 +6,7 @@ class Text2ImageDataset:
     config_name: str = "text2image_dataset"

     def __init__(self, path: Union[str, Path]):
-        pass
+        raise NotImplementedError("Text2ImageDataset is not implemented yet.")

     def _validate(self):
-        pass
+        raise NotImplementedError
Review comment (Contributor):

It is good practice to raise an exception instance instead of the class. Raising the class works, but I would use `NotImplementedError()`.
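To illustrate the point in the review comment, here is a minimal sketch of the two forms side by side. `Placeholder` and `describe` are hypothetical names used only for this example and are not part of xTuring. In Python, raising the bare class instantiates it with no arguments, so both forms produce the same exception type; the explicit instance simply leaves room for a message.

```python
class Placeholder:
    """Hypothetical stand-in for an unimplemented class such as StableDiffusion."""

    def save(self, path):
        # Bare class: Python instantiates it with no arguments when raising.
        raise NotImplementedError

    def generate(self, texts=None):
        # Explicit instance: same exception type, and it can carry a message.
        raise NotImplementedError("generate() is not implemented yet.")


def describe(call):
    """Run `call` and return the type name and message of what it raises."""
    try:
        call()
    except NotImplementedError as exc:
        return type(exc).__name__, str(exc)
    return None


model = Placeholder()
bare = describe(lambda: model.save("out"))
explicit = describe(lambda: model.generate())
print(bare)      # ('NotImplementedError', '')
print(explicit)  # ('NotImplementedError', 'generate() is not implemented yet.')
```

Either form is caught by the same `except NotImplementedError` clause, so the choice is mostly about style and whether a message is useful to the caller.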

2 changes: 1 addition & 1 deletion src/xturing/datasets/text_dataset.py
@@ -16,7 +16,7 @@ class TextDatasetMeta:
 class TextDataset(BaseDataset):
     config_name: str = "text_dataset"

-    def __init__(self, path: Union[str, Path, HFDataset, dict]):
+    def __init__(self, path: Union[str, Path, HFDataset, DatasetDict, dict]):
         if isinstance(path, HFDataset) or isinstance(path, DatasetDict):
             self.data = path
         elif isinstance(path, dict):
15 changes: 9 additions & 6 deletions src/xturing/models/stable_diffusion.py
@@ -8,22 +8,25 @@ class StableDiffusion:
     config_name: str = "stable_diffusion"

     def __init__(self, weights_path: str):
-        pass
+        raise NotImplementedError(
+            "StableDiffusion is a placeholder and not yet implemented."
+        )

     def finetune(self, dataset: Text2ImageDataset, logger=True):
         """Finetune Stable Diffusion model on a given dataset.

         Args:
             dataset (Text2ImageDataset): Dataset to finetune on.
-            logger (bool, optional): To be setup with a Pytorch Lightning logger when implemented."""
-        pass
+            logger (bool, optional): To be setup with a Pytorch Lightning logger when implemented.
+        """
+        raise NotImplementedError
Review comment (Contributor):

Same here, I would raise the instance instead of the class.


     def generate(
         self,
         texts: Optional[Union[List[str], str]] = None,
         dataset: Optional[Text2ImageDataset] = None,
     ):
-        pass
+        raise NotImplementedError
Review comment (Contributor):

Same


     def save(self, path: Union[str, Path]):
-        pass
+        raise NotImplementedError
Review comment (Contributor):

Same

5 changes: 3 additions & 2 deletions src/xturing/self_instruct/generate_instances.py
@@ -62,8 +62,9 @@ def generate_instances(
                 try:
                     data = json.loads(line)
                     existing_requests[data["instruction"]] = data
-                except:
-                    pass
+                except json.JSONDecodeError:
+                    # Skip malformed JSON lines
+                    continue
         print(f"Loaded {len(existing_requests)} existing requests")

     progress_bar = tqdm(total=len(tasks))
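One consequence of narrowing the bare `except:` to `json.JSONDecodeError` is that a line holding valid JSON without an `"instruction"` key now raises `KeyError` instead of being skipped silently. The sketch below shows one way the loading loop could treat both cases explicitly. `load_existing_requests` is a hypothetical helper, and skipping records that lack the key is an assumed policy, not something this PR does.

```python
import io
import json


def load_existing_requests(lines):
    """Index JSON Lines records by their "instruction" field.

    Hypothetical helper; `lines` is any iterable of strings.
    """
    existing_requests = {}
    for line in lines:
        try:
            data = json.loads(line)
            existing_requests[data["instruction"]] = data
        except json.JSONDecodeError:
            # Malformed JSON: skip the line, as in the diff above.
            continue
        except (KeyError, TypeError):
            # Valid JSON without an "instruction" key, or not an object.
            # Skipping here is an assumed policy choice for this sketch.
            continue
    return existing_requests


sample = io.StringIO(
    '{"instruction": "translate", "output": "ok"}\n'
    "{not valid json\n"
    '{"output": "missing instruction"}\n'
    "[1, 2, 3]\n"
)
requests = load_existing_requests(sample)
print(sorted(requests))  # ['translate']
```

Letting `KeyError` propagate, as the PR's version does, is also a defensible choice, since it surfaces corrupted output files early rather than hiding them.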
5 changes: 3 additions & 2 deletions src/xturing/self_instruct/identify_if_classification.py
@@ -40,8 +40,9 @@ def identify_if_classification(
                 try:
                     data = json.loads(line)
                     existing_requests[data["instruction"]] = data
-                except:
-                    pass
+                except json.JSONDecodeError:
+                    # Skip malformed JSON lines
+                    continue
         print(f"Loaded {len(existing_requests)} existing requests")

     # Create the progress bar
2 changes: 1 addition & 1 deletion src/xturing/trainers/lightning_trainer.py
@@ -47,7 +47,7 @@ def configure_optimizers(self):
                 self.pytorch_model.parameters(), lr=self.learning_rate
             )
         elif self.optimizer_name == "adam":
-            optimizer = torch.optim.adam(
+            optimizer = torch.optim.Adam(
                 self.pytorch_model.parameters(), lr=self.learning_rate
             )
         elif self.optimizer_name == "cpu_adam":