Feature/gloo spark lightfm wrapper 0.2 #63
base: main
Conversation
features_columns = features_np[:, 1:]
number_of_features = features_columns.shape[1]
# Scale down dense features
scaler_name = f"{entity}_feat_scaler"
Explicit is better than implicit:

def __init__(...):
    self.__item_feat_scaler: Optional[Scaler] = None
    self.__user_feat_scaler: Optional[Scaler] = None

def __get_scaler(self, entity: str) -> Scaler:
    # create or choose from __item_feat_scaler or __user_feat_scaler
    ...
    if self.__feat_scaler is None:
        pass
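A minimal, runnable sketch of that layout, assuming the scalers stay `MinMaxScaler` as in the current diff; the class name and the single-underscore attribute names are illustrative, not the PR's actual code:

from typing import Optional

import numpy as np
from sklearn.preprocessing import MinMaxScaler


class LightFMWrapSketch:
    """Illustrative fragment showing explicit per-entity scaler fields."""

    def __init__(self) -> None:
        # Explicit instance fields instead of dynamically named attributes.
        self._item_feat_scaler: Optional[MinMaxScaler] = None
        self._user_feat_scaler: Optional[MinMaxScaler] = None

    def _get_scaler(self, entity: str, features_columns: np.ndarray) -> MinMaxScaler:
        """Create on first use, or return, the scaler for ``entity`` ("user" or "item")."""
        if entity == "item":
            if self._item_feat_scaler is None:
                self._item_feat_scaler = MinMaxScaler().fit(features_columns)
            return self._item_feat_scaler
        if self._user_feat_scaler is None:
            self._user_feat_scaler = MinMaxScaler().fit(features_columns)
        return self._user_feat_scaler

This keeps the whole attribute set visible in `__init__` and removes the `f"{entity}_feat_scaler"` string indirection.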
Let's leave it as it is for now; the logic behind why you do what you do is clear. But add extensive comments to justify why you do it this way.
self.__setattr__(scaler_name, MinMaxScaler().fit(features_columns))

if features_columns.size:
    features_dense = self.__getattribute__(scaler_name).transform(
Replace `__getattribute__` with explicit attribute access, as sketched below.
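A hedged sketch of this call site using the hypothetical `_get_scaler` helper from the sketch above, instead of `__setattr__`/`__getattribute__` with a dynamically built attribute name (the method name is illustrative):

def _scale_dense_features(self, entity: str, features_columns: np.ndarray) -> np.ndarray:
    """Scale the dense feature columns with the per-entity scaler."""
    scaler = self._get_scaler(entity, features_columns)
    if features_columns.size:
        return scaler.transform(features_columns)
    return features_columns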
def _convert_features_to_csr(
    self, entity_ids: DataFrame, features: Optional[DataFrame] = None
) -> Optional[sp.csr_matrix]:
    """
add inputs to the docstring
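A possible shape for that docstring, derived only from the signature above; the parameter descriptions are assumptions and the Sphinx-style `:param:` format is a guess at the project's convention:

def _convert_features_to_csr(
    self, entity_ids: DataFrame, features: Optional[DataFrame] = None
) -> Optional[sp.csr_matrix]:
    """
    Convert entity features into a sparse CSR matrix for LightFM.

    :param entity_ids: DataFrame with ids of the entities (users or items)
        whose feature rows should be built
    :param features: optional DataFrame with features for those entities;
        if ``None``, no feature matrix is built
    :return: CSR feature matrix, or ``None`` when ``features`` is ``None``
    """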
"user_bias_momentum", | ||
) | ||
losses = ["warp", "bpr", "warp-kos", "logistic"] | ||
user_feat_scaler: Optional[MinMaxScaler] = None |
Remove from here and move to `__init__` (field of the instance, not field of the class).
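A small before/after sketch of the difference (class names are illustrative, not the PR's):

from typing import Optional

from sklearn.preprocessing import MinMaxScaler


class WrapWithClassAttribute:
    # Class-level attribute: a single default shared by all instances.
    user_feat_scaler: Optional[MinMaxScaler] = None


class WrapWithInstanceAttribute:
    def __init__(self) -> None:
        # Instance attributes: created per object, as suggested above.
        self.user_feat_scaler: Optional[MinMaxScaler] = None
        self.item_feat_scaler: Optional[MinMaxScaler] = None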
losses = ["warp", "bpr", "warp-kos", "logistic"] | ||
user_feat_scaler: Optional[MinMaxScaler] = None | ||
item_feat_scaler: Optional[MinMaxScaler] = None | ||
logger = logging.getLogger("replay") |
Remove from here. A better option is to put it right in the module file:

logger = logging.getLogger(__name__)
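For instance, at module level (a generic Python logging pattern, not code from this PR):

import logging

# Module-level logger named after the module instead of a hard-coded "replay".
logger = logging.getLogger(__name__)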
self._initialize_model()
self._initialize_world_size_and_threads()

if user_features is not None:
if user_features is not None:
    self.can_predict_cold_users = True

->

self.can_predict_cold_users = user_features is not None

if user_features is not None:
    self.can_predict_cold_users = True
if item_features is not None:
Same as above
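Applied to the item branch this collapses to a single assignment; the `can_predict_cold_items` attribute name is an assumption, since the body of that `if` is not shown in this excerpt:

self.can_predict_cold_items = item_features is not None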
_num_epochs = self._num_epochs
_pygloo_timeout = self._pygloo_timeout_sec

def udf_to_map_on_interactions_with_index(
Not a comment, but rather a TODO for the future: this whole function needs to be reconsidered.
"""Trainer used in distributed LightFM.""" | ||
|
||
def __init__(self, model: LightFM, world_size: int, num_threads: int): | ||
self.model = model |
add docstring
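A possible docstring for that `__init__`, with parameter descriptions inferred from the argument names, so treat them as assumptions:

def __init__(self, model: LightFM, world_size: int, num_threads: int):
    """
    :param model: LightFM model instance that will be trained
    :param world_size: number of distributed workers taking part in training
    :param num_threads: number of threads each worker uses for local fitting
    """
    self.model = model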

# pylint: disable=too-many-instance-attributes, too-many-arguments
class LightFMTraining:
    """Trainer used in distributed LightFM."""
add more important details to this docstring
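A sketch of a more detailed class docstring; the specifics (per-executor partial fits, pygloo-based synchronization) are inferred from the PR title and the surrounding code, and should be checked against the actual implementation:

# pylint: disable=too-many-instance-attributes, too-many-arguments
class LightFMTraining:
    """
    Trainer used in distributed LightFM.

    Runs LightFM partial fits on Spark executors and synchronizes model
    state across ``world_size`` workers (e.g. over a pygloo collective
    group), using ``num_threads`` threads per worker.
    """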