
Feature/gloo spark lightfm wrapper 0.2 #63

Open · wants to merge 2 commits into main

Conversation

zakharova-anastasiia (Collaborator)

No description provided.

features_columns = features_np[:, 1:]
number_of_features = features_columns.shape[1]
# Scale down dense features
scaler_name = f"{entity}_feat_scaler"
Owner:
Explicit is better than implicit:

def __init__(...):
    self.__item_feat_scaler: Optional[Scaler] = None
    self.__user_feat_scaler: Optional[Scaler] = None

def __get_scaler(self, entity: str) -> Scaler:
    # create or choose from __item_feat_scaler or __user_feat_scaler
    ...

if self.__feat_scaler is None:
    pass
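The suggestion above can be sketched as a runnable example. To keep it self-contained, `_MinMaxScaler` below is a simplified stand-in for `sklearn.preprocessing.MinMaxScaler`, and the class and method names are illustrative, not the PR's actual API:

```python
from typing import List, Optional


class _MinMaxScaler:
    """Minimal stand-in for sklearn.preprocessing.MinMaxScaler (illustration only)."""

    def fit(self, columns: List[List[float]]) -> "_MinMaxScaler":
        flat = [v for row in columns for v in row]
        self._min, self._max = min(flat), max(flat)
        return self

    def transform(self, columns: List[List[float]]) -> List[List[float]]:
        span = (self._max - self._min) or 1.0
        return [[(v - self._min) / span for v in row] for row in columns]


class FeatureScaling:
    """Sketch of the review suggestion: one explicit, named scaler per entity."""

    def __init__(self) -> None:
        self.__item_feat_scaler: Optional[_MinMaxScaler] = None
        self.__user_feat_scaler: Optional[_MinMaxScaler] = None

    def _get_scaler(self, entity: str, features_columns) -> _MinMaxScaler:
        # Create the scaler lazily on first use, then reuse the fitted one.
        if entity == "user":
            if self.__user_feat_scaler is None:
                self.__user_feat_scaler = _MinMaxScaler().fit(features_columns)
            return self.__user_feat_scaler
        if entity == "item":
            if self.__item_feat_scaler is None:
                self.__item_feat_scaler = _MinMaxScaler().fit(features_columns)
            return self.__item_feat_scaler
        raise ValueError(f"unknown entity: {entity}")
```

The point of the refactor is that each scaler now has a fixed, statically analysable attribute name instead of one assembled from a string at runtime.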

Owner:
Let's leave it as it is for now; the logic behind what you do is clear.
But add extensive comments to justify why you do it this way.

self.__setattr__(scaler_name, MinMaxScaler().fit(features_columns))

if features_columns.size:
    features_dense = self.__getattribute__(scaler_name).transform(
Owner:
Replace `__getattribute__` with an explicit attribute access.
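One hedged sketch of what "explicit access" could look like: replacing the `self.__setattr__(scaler_name, ...)` / `self.__getattribute__(scaler_name)` pair with a plain dict keyed by entity. The names here are illustrative, not the PR's actual code:

```python
from typing import Dict, Optional


class ScalerRegistry:
    """Sketch: explicit dict lookup instead of self.__getattribute__(scaler_name)."""

    def __init__(self) -> None:
        # One fixed slot per entity instead of dynamically built attribute names.
        self._scalers: Dict[str, Optional[object]] = {"user": None, "item": None}

    def set_scaler(self, entity: str, scaler: object) -> None:
        if entity not in self._scalers:
            raise KeyError(f"unknown entity: {entity}")
        self._scalers[entity] = scaler

    def get_scaler(self, entity: str) -> object:
        scaler = self._scalers[entity]
        if scaler is None:
            raise RuntimeError(f"scaler for {entity!r} is not fitted yet")
        return scaler
```

Unlike `__getattribute__` with a computed name, a typo in the entity string fails loudly at the dict lookup rather than raising a confusing `AttributeError` deep inside attribute machinery.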

def _convert_features_to_csr(
self, entity_ids: DataFrame, features: Optional[DataFrame] = None
) -> Optional[sp.csr_matrix]:
"""
Owner:
Add the input parameters to the docstring.

"user_bias_momentum",
)
losses = ["warp", "bpr", "warp-kos", "logistic"]
user_feat_scaler: Optional[MinMaxScaler] = None
Owner:
Remove from here and move to `__init__` (make it a field of the instance, not a field of the class).
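The distinction matters because a class-level attribute is shared state: rebinding it on the class is visible through every instance, so two fitted models could silently see each other's scaler. A minimal demonstration (class names are illustrative):

```python
class ClassLevelScaler:
    # Pitfall: declared on the class, so the attribute is shared by all instances.
    scaler = None


class InstanceLevelScaler:
    def __init__(self):
        # As the review suggests: initialised per instance in __init__.
        self.scaler = None


a, b = ClassLevelScaler(), ClassLevelScaler()
ClassLevelScaler.scaler = "fitted"  # rebinding on the class...
shared = b.scaler                   # ...is visible through every instance

c, d = InstanceLevelScaler(), InstanceLevelScaler()
c.scaler = "fitted"                 # private to c
isolated = d.scaler                 # d is unaffected (still None)
```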

losses = ["warp", "bpr", "warp-kos", "logistic"]
user_feat_scaler: Optional[MinMaxScaler] = None
item_feat_scaler: Optional[MinMaxScaler] = None
logger = logging.getLogger("replay")
Owner:
Remove from here. The better option is to define it right in the module file:

logger = logging.getLogger(__name__)

self._initialize_model()
self._initialize_world_size_and_threads()

if user_features is not None:
Owner:
if user_features is not None:
    self.can_predict_cold_users = True

→

self.can_predict_cold_users = user_features is not None


if user_features is not None:
    self.can_predict_cold_users = True
if item_features is not None:
Owner:
Same as above

_num_epochs = self._num_epochs
_pygloo_timeout = self._pygloo_timeout_sec

def udf_to_map_on_interactions_with_index(
Owner:
Not a comment, but rather a TODO for the future: this whole function needs to be reconsidered.

"""Trainer used in distributed LightFM."""

def __init__(self, model: LightFM, world_size: int, num_threads: int):
    self.model = model
Owner:
Add a docstring.


# pylint: disable=too-many-instance-attributes, too-many-arguments
class LightFMTraining:
"""Trainer used in distributed LightFM."""
Owner:
Add more important details to this docstring.
