Sourcery refactored main branch #1
base: main
Sourcery timed out performing refactorings.
Due to GitHub API limits, only the first 60 comments can be shown.
-version += "+" + sha[:7]
+version += f"+{sha[:7]}"

Function _get_version refactored with the following changes:
- Use f-string instead of string concatenation (use-fstring-for-concatenation)

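A minimal sketch of why this rewrite is behavior-preserving. The two helper functions and the commit hash below are hypothetical; they only isolate the single line that changed in `_get_version`.

```python
def append_sha_concat(version: str, sha: str) -> str:
    """Original style: build the local-version suffix with `+` concatenation."""
    version += "+" + sha[:7]
    return version


def append_sha_fstring(version: str, sha: str) -> str:
    """Refactored style: the same suffix built with an f-string."""
    version += f"+{sha[:7]}"
    return version


sha = "0123456789abcdef"  # made-up commit hash, for illustration only
print(append_sha_concat("1.0.0a0", sha))
print(append_sha_fstring("1.0.0a0", sha))
```

One difference worth keeping in mind: `+` raises `TypeError` when an operand is not a `str`, while an f-string formats whatever value it is given.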
-match = re.match(r"^\s*URL\s+(https:\/\/.+)$", line)
-if match:
-    url = match.group(1)
-    yield url
+if match := re.match(r"^\s*URL\s+(https:\/\/.+)$", line):
+    yield match[1]

Function _parse_url refactored with the following changes:
- Use named expression to simplify assignment and conditional (use-named-expression)
- Inline variable that is immediately yielded (inline-immediately-yielded-variable)
- Replace m.group(x) with m[x] for re.Match objects (use-getitem-for-re-match-groups)

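The three changes combine as in the sketch below. The surrounding loop, the `parse_urls` name, and the sample lines are assumptions, since the hunk only shows the body; the named expression (`:=`) needs Python 3.8 or newer.

```python
import re
from typing import Iterable, Iterator

# Same pattern as in the diff, minus the redundant escaping of "/".
URL_LINE = re.compile(r"^\s*URL\s+(https://.+)$")


def parse_urls(lines: Iterable[str]) -> Iterator[str]:
    """Yield the https URL from every line of the form `URL https://...`."""
    for line in lines:
        # Assign and test in one step; match[1] is shorthand for match.group(1).
        if match := URL_LINE.match(line):
            yield match[1]


sample = [
    "ExternalProject_Add(foo",
    "  URL https://example.com/foo.tar.gz",
    "  URL http://example.com/insecure.tar.gz",  # not https, so it is skipped
    ")",
]
urls = list(parse_urls(sample))
print(urls)
```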
-with open("README.md") as f:
-    long_description = f.read()
+long_description = Path("README.md").read_text()

Function _main refactored with the following changes:
- Simplify basic file reads with pathlib (path-read)

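A small sketch comparing the two read styles on a throwaway file. The temporary directory and file contents are made up; note that the refactored line relies on `from pathlib import Path`, which this hunk does not show.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    readme = Path(tmp) / "README.md"
    readme.write_text("# demo\n", encoding="utf-8")

    # Original style: explicit context manager around open().
    with open(readme, encoding="utf-8") as f:
        via_open = f.read()

    # Refactored style: read_text() opens, reads, and closes the file itself.
    via_pathlib = readme.read_text(encoding="utf-8")

print(via_open == via_pathlib)
```

Both forms in the diff use the platform default encoding; passing `encoding="utf-8"` as above is an optional hardening, not part of the Sourcery change.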
-w = []
 base_workflow_name = f"{prefix}binary_{os_type}_{btype}_py{python_version}_{cu_version}"
-w.append(generate_base_workflow(base_workflow_name, python_version, cu_version, filter_branch, os_type, btype))
+w = [
+    generate_base_workflow(
+        base_workflow_name,
+        python_version,
+        cu_version,
+        filter_branch,
+        os_type,
+        btype,
+    )
+]

Function build_workflow_pair refactored with the following changes:
- Move assignment closer to its usage within a block (move-assign-in-block)
- Merge append into list declaration (merge-list-append)

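A reduced sketch of the merge-list-append change. `generate_base_workflow` here is a hypothetical two-argument stand-in, and the optional upload entry is invented to show that later appends keep working on the literal.

```python
def generate_base_workflow(name, python_version):
    """Hypothetical stand-in for the real workflow generator."""
    return {"name": name, "python_version": python_version}


def build_with_append(name, python_version, upload=False):
    # Original style: empty list first, then append the first element.
    w = []
    w.append(generate_base_workflow(name, python_version))
    if upload:
        w.append({"name": f"{name}_upload"})
    return w


def build_with_literal(name, python_version, upload=False):
    # Refactored style: the first element is part of the list literal.
    w = [generate_base_workflow(name, python_version)]
    if upload:
        w.append({"name": f"{name}_upload"})
    return w


print(build_with_literal("binary_linux_wheel_py3.8_cpu", "3.8", upload=True))
```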
-d["subfolder"] = "" if os_type == "macos" else cu_version + "/"
+d["subfolder"] = "" if os_type == "macos" else f"{cu_version}/"

Function generate_upload_workflow refactored with the following changes:
- Use f-string instead of string concatenation (use-fstring-for-concatenation)

-file_text = speaker_id + "-" + chapter_id + self.base_dataset._ext_txt
+file_text = f"{speaker_id}-{chapter_id}{self.base_dataset._ext_txt}"

Function CustomDataset._target_length refactored with the following changes:
- Use f-string instead of string concatenation [×3] (use-fstring-for-concatenation)

-dataloader = torch.utils.data.DataLoader(
+return torch.utils.data.DataLoader(
     dataset,
     batch_size=None,
     collate_fn=self._train_collate_fn,
     num_workers=10,
     shuffle=True,
 )
-return dataloader

Function LibriSpeechRNNTModule.train_dataloader refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)

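The same pattern recurs in the dataloader methods of this PR, so one sketch covers it. `make_batches` is a hypothetical stand-in for `torch.utils.data.DataLoader`, used so the example runs without PyTorch.

```python
from typing import List, Sequence


def make_batches(dataset: Sequence[int], batch_size: int) -> List[List[int]]:
    """Hypothetical stand-in for DataLoader: split a sequence into batches."""
    return [list(dataset[i:i + batch_size]) for i in range(0, len(dataset), batch_size)]


def val_batches_named(dataset: Sequence[int]) -> List[List[int]]:
    # Original style: bind the result to a name, then return the name.
    batches = make_batches(dataset, batch_size=2)
    return batches


def val_batches_inlined(dataset: Sequence[int]) -> List[List[int]]:
    # Refactored style: return the expression directly.
    return make_batches(dataset, batch_size=2)


print(val_batches_inlined(range(5)))
```

The inlined form is shorter; the named form can still be handy when a breakpoint or log line needs the value before it is returned, so this is a style choice rather than a correctness fix.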
-dataloader = torch.utils.data.DataLoader(
+return torch.utils.data.DataLoader(
     dataset,
     batch_size=None,
     collate_fn=self._valid_collate_fn,
     num_workers=10,
 )
-return dataloader

Function LibriSpeechRNNTModule.val_dataloader refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)

-dataloader = torch.utils.data.DataLoader(dataset, batch_size=1, collate_fn=self._test_collate_fn)
-return dataloader
+return torch.utils.data.DataLoader(
+    dataset, batch_size=1, collate_fn=self._test_collate_fn
+)

Function LibriSpeechRNNTModule.test_dataloader refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)

-dataloader = torch.utils.data.DataLoader(
+return torch.utils.data.DataLoader(
     dataset,
     batch_size=None,
     collate_fn=self._train_collate_fn,
     num_workers=10,
     shuffle=True,
 )
-return dataloader

Function MuSTCRNNTModule.train_dataloader refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)

-dataloader = torch.utils.data.DataLoader(
+return torch.utils.data.DataLoader(
     dataset,
     batch_size=None,
     collate_fn=self._valid_collate_fn,
     num_workers=10,
 )
-return dataloader

Function MuSTCRNNTModule.val_dataloader refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)

-dataloader = torch.utils.data.DataLoader(dataset, batch_size=1, collate_fn=self._test_collate_fn)
-return dataloader
+return torch.utils.data.DataLoader(
+    dataset, batch_size=1, collate_fn=self._test_collate_fn
+)

Function MuSTCRNNTModule.test_common_dataloader refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)

-dataloader = torch.utils.data.DataLoader(dataset, batch_size=1, collate_fn=self._test_collate_fn)
-return dataloader
+return torch.utils.data.DataLoader(
+    dataset, batch_size=1, collate_fn=self._test_collate_fn
+)

Function MuSTCRNNTModule.test_he_dataloader refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)

-dataloader = torch.utils.data.DataLoader(dataset, batch_size=1, collate_fn=self._test_collate_fn)
-return dataloader
+return torch.utils.data.DataLoader(
+    dataset, batch_size=1, collate_fn=self._test_collate_fn
+)

Function MuSTCRNNTModule.dev_dataloader refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)

-assert len(idx_target_lengths) > 0
+assert idx_target_lengths

Function CustomDataset.__init__ refactored with the following changes:
- Simplify sequence length comparison (simplify-len-comparison)

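This rewrite is safe when the value is a built-in container, which is assumed here since the hunk does not show how `idx_target_lengths` is built. The sketch below checks the equivalence and then shows a hypothetical class, `AlwaysFalsy`, for which the two checks disagree.

```python
class AlwaysFalsy:
    """Hypothetical container whose truthiness disagrees with its length."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __bool__(self):
        # __bool__ takes precedence over __len__ in truth testing.
        return False


# For built-in containers, `len(x) > 0` and `bool(x)` always agree.
for value in ([], [(0, 3)], (), {"a": 1}):
    assert (len(value) > 0) == bool(value)

odd = AlwaysFalsy([1, 2])
len_check = len(odd) > 0   # True: two items are stored
truth_check = bool(odd)    # False: __bool__ overrides the length
print(len_check, truth_check)
```

Types with their own truthiness rules need the explicit length check; a NumPy array with more than one element, for example, raises `ValueError` when used in a boolean context.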
-else:
-    scaling_factor = self.anneal_factor ** (self._step_count - self.force_anneal_step)
-    return [scaling_factor * base_lr for base_lr in self.base_lrs]
+scaling_factor = self.anneal_factor ** (self._step_count - self.force_anneal_step)
+return [scaling_factor * base_lr for base_lr in self.base_lrs]

Function WarmupLR.get_lr refactored with the following changes:
- Remove unnecessary else after guard condition (remove-unnecessary-else)

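A standalone sketch of the guard-clause rewrite. Only the `else` body appears in the diff; the guard condition and the warmup branch below are assumptions made to get a runnable function, and the free-function signature replaces the scheduler's attributes.

```python
from typing import List


def get_lr_with_else(step: int, warmup_steps: int, force_anneal_step: int,
                     anneal_factor: float, base_lrs: List[float]) -> List[float]:
    if step < force_anneal_step:  # assumed guard, not shown in the diff
        return [min(1.0, step / warmup_steps) * lr for lr in base_lrs]
    else:
        scaling_factor = anneal_factor ** (step - force_anneal_step)
        return [scaling_factor * lr for lr in base_lrs]


def get_lr_flat(step: int, warmup_steps: int, force_anneal_step: int,
                anneal_factor: float, base_lrs: List[float]) -> List[float]:
    if step < force_anneal_step:  # assumed guard, not shown in the diff
        return [min(1.0, step / warmup_steps) * lr for lr in base_lrs]
    # The guard returned early, so no else is needed here.
    scaling_factor = anneal_factor ** (step - force_anneal_step)
    return [scaling_factor * lr for lr in base_lrs]


for step in range(10):
    print(step, get_lr_flat(step, 4, 6, 0.5, [0.1, 0.01]))
```

Because the guarded branch always returns, dropping the `else` changes indentation only, not control flow.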
-nbest_batch = list(zip(hypos_str, hypos_score, hypos_ids))
-return nbest_batch
+return list(zip(hypos_str, hypos_score, hypos_ids))

Function post_process_hypos refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)

-file_text = speaker_id + "-" + chapter_id + librispeech_dataset._ext_txt
+file_text = f"{speaker_id}-{chapter_id}{librispeech_dataset._ext_txt}"

Function get_sample_lengths refactored with the following changes:
- Use f-string instead of string concatenation [×3] (use-fstring-for-concatenation)

-dataloader = torch.utils.data.DataLoader(
+return torch.utils.data.DataLoader(
     dataset,
     num_workers=self.num_workers,
     batch_size=None,
     shuffle=self.train_shuffle,
 )
-return dataloader

Function LibriSpeechDataModule.train_dataloader refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)

-dataloader = torch.utils.data.DataLoader(dataset, batch_size=None, num_workers=self.num_workers)
-return dataloader
+return torch.utils.data.DataLoader(
+    dataset, batch_size=None, num_workers=self.num_workers
+)

Function LibriSpeechDataModule.val_dataloader refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)

-dataloader = torch.utils.data.DataLoader(dataset, batch_size=None)
-return dataloader
+return torch.utils.data.DataLoader(dataset, batch_size=None)

Function LibriSpeechDataModule.test_dataloader refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)

-filename = "librispeech_clean_100_{}".format(idx)
+filename = f"librispeech_clean_100_{idx}"
 actual = sample[0][2]
 predicted = model(batch)
-hypout.append("{} ({})\n".format(predicted.upper().strip(), filename))
-refout.append("{} ({})\n".format(actual.upper().strip(), filename))
+hypout.append(f"{predicted.upper().strip()} ({filename})\n")
+refout.append(f"{actual.upper().strip()} ({filename})\n")

Function run_eval refactored with the following changes:
- Replace call to format with f-string [×3] (use-fstring-for-formatting)

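A quick check that the `str.format` calls and their f-string replacements produce identical lines. The helper names and the sample transcript are hypothetical; only the format strings come from the diff.

```python
def format_line_old(text: str, filename: str) -> str:
    """Original style: positional str.format placeholders."""
    return "{} ({})\n".format(text.upper().strip(), filename)


def format_line_new(text: str, filename: str) -> str:
    """Refactored style: expressions embedded directly in an f-string."""
    return f"{text.upper().strip()} ({filename})\n"


idx = 3  # made-up sample index
filename = f"librispeech_clean_100_{idx}"
line = format_line_new(" hello world ", filename)
print(line, end="")
```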
-else:
-    scaling_factor = self.anneal_factor ** (self._step_count - self.force_anneal_step)
-    return [scaling_factor * base_lr for base_lr in self.base_lrs]
+scaling_factor = self.anneal_factor ** (self._step_count - self.force_anneal_step)
+return [scaling_factor * base_lr for base_lr in self.base_lrs]

Function WarmupLR.get_lr refactored with the following changes:
- Remove unnecessary else after guard condition (remove-unnecessary-else)

-nbest_batch = list(zip(hypos_str, hypos_score, hypos_ids))
-return nbest_batch
+return list(zip(hypos_str, hypos_score, hypos_ids))

Function post_process_hypos refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)

-rareset = set()
-for line in fin:
-    rareset.add(line.strip().upper())
+rareset = {line.strip().upper() for line in fin}

Lines 18-111 refactored with the following changes:
- Convert for loop into set comprehension (set-comprehension)
- Replace call to format with f-string [×8] (use-fstring-for-formatting)
- Simplify dictionary access using default get (default-get)

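A sketch of two of the listed changes. The set-comprehension half mirrors the diff, with an in-memory `io.StringIO` standing in for the real file handle `fin`. The default-get half is purely illustrative, since the affected code is not shown in this hunk; `counts` and `word` are made-up names.

```python
import io

fin = io.StringIO("Foo\n bar \nFOO\n")  # stand-in for the rare-word file

# Original style: build the set with an explicit loop.
rareset_loop = set()
for line in fin:
    rareset_loop.add(line.strip().upper())

# Refactored style: the same set as a comprehension.
fin.seek(0)
rareset = {line.strip().upper() for line in fin}

# default-get, illustrative only: replace an if/else lookup with dict.get.
counts = {"FOO": 2}
word = "BAR"
if word in counts:
    n_old = counts[word]
else:
    n_old = 0
n_new = counts.get(word, 0)

print(sorted(rareset), n_new)
```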
-dataloader = DataLoader(
+return DataLoader(
     dataset,
     batch_sampler=sampler,
     collate_fn=CollateFnLibriLightLimited(),
     num_workers=10,
 )
-return dataloader

Function HuBERTFineTuneModule.val_dataloader refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)

-args = parser.parse_args()
-return args
+return parser.parse_args()

Function _parse_args refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)

-if args.use_gpu:
-    device = torch.device("cuda")
-else:
-    device = torch.device("cpu")
+device = torch.device("cuda") if args.use_gpu else torch.device("cpu")

Function main refactored with the following changes:
- Replace if statement with if expression (assign-if-exp)

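The statement and expression forms side by side. Plain strings stand in for `torch.device` objects so the sketch runs without PyTorch, and both function names are hypothetical.

```python
def device_name_statement(use_gpu: bool) -> str:
    # Original style: assign in both branches of an if statement.
    if use_gpu:
        name = "cuda"
    else:
        name = "cpu"
    return name


def device_name_expression(use_gpu: bool) -> str:
    # Refactored style: a single conditional expression.
    return "cuda" if use_gpu else "cpu"


print(device_name_expression(True), device_name_expression(False))
```

A possible follow-up, not part of this PR, is to move the conditional into the argument, as in `torch.device("cuda" if args.use_gpu else "cpu")`, so the constructor call appears only once.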
-if len(filtered_length_idx) == 0:
+if not filtered_length_idx:

Function BucketizeBatchSampler.__init__ refactored with the following changes:
- Simplify sequence length comparison (simplify-len-comparison)

-buckets = {k: v for k, v in sorted(buckets.items())}
+buckets = dict(sorted(buckets.items()))

Function BucketizeBatchSampler._get_buckets refactored with the following changes:
- Replace identity comprehension with call to collection constructor (identity-comprehension)

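A short check that both spellings yield the same key-sorted dictionary. The bucket contents are made up; the ordering guarantee relies on dicts preserving insertion order, which the language specifies from Python 3.7 on.

```python
buckets = {2: [5, 6], 0: [1], 1: [3, 4]}  # made-up bucket id -> sample indices

# Original style: identity comprehension over the sorted items.
via_comprehension = {k: v for k, v in sorted(buckets.items())}

# Refactored style: pass the sorted (key, value) pairs straight to dict().
via_constructor = dict(sorted(buckets.items()))

print(list(via_constructor))
```

Since dictionary keys are unique, `sorted` never has to compare two values to break a tie, so the values do not need to be orderable.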
Branch main refactored by Sourcery.
If you're happy with these changes, merge this Pull Request using the Squash and merge strategy.
See our documentation here.
Run Sourcery locally
Reduce the feedback loop during development by using the Sourcery editor plugin:
Review changes via command line
To manually merge these changes, make sure you're on the main branch, then run:
Help us improve this pull request!