I've been reviewing the data preprocessing steps in data/data_loader.py and noticed that the entire dataset undergoes fitting and transformation before being split into training, validation, and test sets. This process might lead to data leakage, where information from the test and validation sets inadvertently influences the training process.
Is this approach an intentional part of the model's design for a specific reason that I might have missed? Or could it be an oversight?
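To make the leakage concrete, here is a minimal standalone sketch (toy data, not the repo's actual loader) showing how `fit_transform` on the full series lets test-set statistics shape the training values:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy series that grows over time, so the test portion holds the global max.
data = np.arange(10, dtype=float).reshape(-1, 1)  # [[0.], [1.], ..., [9.]]
train = data[:7]

# Leaky: the scaler's min/max come from the whole series, including test data.
leaky = MinMaxScaler().fit_transform(data)[:7]

# Clean: statistics come from the training split only.
clean = MinMaxScaler().fit(train).transform(train)

print(leaky.max())  # ~0.667: the training max is squashed by the unseen test max
print(clean.max())  # 1.0: the training max maps to 1 under a train-only fit
```

The transformed training values differ between the two approaches, which means the model trained on the leaky version has implicitly seen the test-set range.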
Looking at data_loader.py, the Dataset_Wiki and Dataset_Solar classes should also use self.data instead of data to match the original code.
original:

```python
self.data = mms.fit_transform(self.data)
```

fixed:

```python
if type == '1':
    mms = MinMaxScaler(feature_range=(0, 1))
    training_end = int(len(self.data) * self.train_ratio)
    mms.fit(self.data[:training_end])
    self.data = mms.transform(self.data)
```
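For reference, the same fit-on-train-only pattern as a self-contained sketch (here `data` and `train_ratio` stand in for the loader's `self.data` and `self.train_ratio`, which are assumptions about its attributes):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

data = np.random.default_rng(0).normal(size=(100, 3))  # placeholder series
train_ratio = 0.7
training_end = int(len(data) * train_ratio)

mms = MinMaxScaler(feature_range=(0, 1))
mms.fit(data[:training_end])   # scaling statistics from the training split only
data = mms.transform(data)     # then applied to the full series

# The training portion lands exactly in [0, 1]; validation/test values may
# fall outside that range, which is expected and leakage-free.
assert data[:training_end].min() >= 0.0
assert data[:training_end].max() <= 1.0
```

Note that `transform` is still applied to the whole array so the validation and test splits are scaled consistently with training, just without contributing to the fitted min/max.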