SNOW-1805854: Reformat interoperability docs page #2951

Merged (8 commits) on Jan 30, 2025
174 changes: 83 additions & 91 deletions docs/source/modin/interoperability.rst
@@ -12,53 +12,47 @@ some libraries use to interoperate with Snowpark pandas to the same level of sup
plotly.express
==============

The following table is structured as follows: The first column contains the name of a method in the ``plotly.express`` module.
The second column is a flag for whether or not interoperability is guaranteed with Snowpark pandas. For each of these
operations, we validate that passing in Snowpark pandas dataframes or series as the data inputs behaves equivalently
to passing in pandas dataframes or series.

.. note::
    ``Y`` stands for yes, i.e., interoperability is guaranteed with this method, and ``N`` stands for no.

For each of the following methods in the ``plotly.express`` module, we validate that passing in Snowpark pandas
dataframes or series as the data inputs behaves equivalently to passing in pandas dataframes or series.

.. note::
    Currently only plotly versions <6.0.0 are supported through the dataframe interchange protocol.

+-------------------------+-------------------------------------------+----------------------------------+
| Method name             | Interoperable with Snowpark pandas? (Y/N) | Notes for current implementation |
+-------------------------+-------------------------------------------+----------------------------------+
| ``scatter``             | Y                                         |                                  |
+-------------------------+-------------------------------------------+----------------------------------+
| ``line``                | Y                                         |                                  |
+-------------------------+-------------------------------------------+----------------------------------+
| ``area``                | Y                                         |                                  |
+-------------------------+-------------------------------------------+----------------------------------+
| ``timeline``            | Y                                         |                                  |
+-------------------------+-------------------------------------------+----------------------------------+
| ``violin``              | Y                                         |                                  |
+-------------------------+-------------------------------------------+----------------------------------+
| ``bar``                 | Y                                         |                                  |
+-------------------------+-------------------------------------------+----------------------------------+
| ``histogram``           | Y                                         |                                  |
+-------------------------+-------------------------------------------+----------------------------------+
| ``pie``                 | Y                                         |                                  |
+-------------------------+-------------------------------------------+----------------------------------+
| ``treemap``             | Y                                         |                                  |
+-------------------------+-------------------------------------------+----------------------------------+
| ``sunburst``            | Y                                         |                                  |
+-------------------------+-------------------------------------------+----------------------------------+
| ``icicle``              | Y                                         |                                  |
+-------------------------+-------------------------------------------+----------------------------------+
| ``scatter_matrix``      | Y                                         |                                  |
+-------------------------+-------------------------------------------+----------------------------------+
| ``funnel``              | Y                                         |                                  |
+-------------------------+-------------------------------------------+----------------------------------+
| ``density_heatmap``     | Y                                         |                                  |
+-------------------------+-------------------------------------------+----------------------------------+
| ``boxplot``             | Y                                         |                                  |
+-------------------------+-------------------------------------------+----------------------------------+
| ``imshow``              | Y                                         |                                  |
+-------------------------+-------------------------------------------+----------------------------------+
+-------------------------+
| Method name             |
+-------------------------+
| ``scatter``             |
+-------------------------+
| ``line``                |
+-------------------------+
| ``area``                |
+-------------------------+
| ``timeline``            |
+-------------------------+
| ``violin``              |
+-------------------------+
| ``bar``                 |
+-------------------------+
| ``histogram``           |
+-------------------------+
| ``pie``                 |
+-------------------------+
| ``treemap``             |
+-------------------------+
| ``sunburst``            |
+-------------------------+
| ``icicle``              |
+-------------------------+
| ``scatter_matrix``      |
+-------------------------+
| ``funnel``              |
+-------------------------+
| ``density_heatmap``     |
+-------------------------+
| ``boxplot``             |
+-------------------------+
| ``imshow``              |
+-------------------------+


scikit-learn
@@ -67,15 +61,10 @@ scikit-learn
We break down scikit-learn interoperability by categories of scikit-learn
operations.

For each category, we provide a table of interoperability with the following
structure: The first column describes a scikit-learn operation that may include
multiple method calls. The second column is a flag for whether or not
interoperability is guaranteed with Snowpark pandas. For each of these methods,
we validate that passing in Snowpark pandas objects behaves equivalently to
passing in pandas objects.
For each category, we list scikit-learn operations, each of which may involve
multiple method calls. For each of these methods, we validate that passing in Snowpark pandas objects behaves
equivalently to passing in pandas objects.

.. note::
    ``Y`` stands for yes, i.e., interoperability is guaranteed with this method, and ``N`` stands for no.

.. note::
    While some scikit-learn methods accept Snowpark pandas inputs, their
@@ -88,66 +77,69 @@ passing in pandas objects.
Classification
--------------

+--------------------------------------------+-------------------------------------------+----------------------------------+
| Operation                                  | Interoperable with Snowpark pandas? (Y/N) | Notes for current implementation |
+--------------------------------------------+-------------------------------------------+----------------------------------+
| Fitting a ``LinearDiscriminantAnalysis``   | Y                                         |                                  |
| classifier with the ``fit()`` method and   |                                           |                                  |
| classifying data with the ``predict()``    |                                           |                                  |
| method.                                    |                                           |                                  |
+--------------------------------------------+-------------------------------------------+----------------------------------+
+--------------------------------------------+
| Operation                                  |
+--------------------------------------------+
| Fitting a ``LinearDiscriminantAnalysis``   |
| classifier with the ``fit()`` method and   |
| classifying data with the ``predict()``    |
| method.                                    |
+--------------------------------------------+
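
The fit-then-predict operation described above can be sketched as follows. Plain pandas objects stand in for Snowpark pandas objects here (an assumption made so the example runs locally), and the feature names and values are invented for illustration.

```python
import pandas as pd  # stand-in; assumption: Snowpark pandas objects are passed the same way
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Two well-separated groups of points (hypothetical data).
X = pd.DataFrame(
    {
        "f1": [0.0, 0.2, 0.1, 5.0, 5.2, 5.1],
        "f2": [1.0, 0.8, 1.1, 4.0, 4.2, 3.9],
    }
)
y = pd.Series([0, 0, 0, 1, 1, 1])

# Fit the classifier on the training data.
clf = LinearDiscriminantAnalysis()
clf.fit(X, y)

# Classify new points with the same column layout.
new_points = pd.DataFrame({"f1": [0.1, 5.1], "f2": [1.0, 4.0]})
predictions = clf.predict(new_points)
print(predictions)
```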


Regression
----------

+--------------------------------------------+-------------------------------------------+----------------------------------+
| Operation                                  | Interoperable with Snowpark pandas? (Y/N) | Notes for current implementation |
+--------------------------------------------+-------------------------------------------+----------------------------------+
| Fitting a ``LogisticRegression`` model     | Y                                         |                                  |
| with the ``fit()`` method and predicting   |                                           |                                  |
| results with the ``predict()`` method.     |                                           |                                  |
+--------------------------------------------+-------------------------------------------+----------------------------------+
+--------------------------------------------+
| Operation                                  |
+--------------------------------------------+
| Fitting a ``LogisticRegression`` model     |
| with the ``fit()`` method and predicting   |
| results with the ``predict()`` method.     |
+--------------------------------------------+

Clustering
----------

+--------------------------------------------+-------------------------------------------+----------------------------------+
| Clustering method                          | Interoperable with Snowpark pandas? (Y/N) | Notes for current implementation |
+--------------------------------------------+-------------------------------------------+----------------------------------+
| ``KMeans.fit()``                           | Y                                         |                                  |
+--------------------------------------------+-------------------------------------------+----------------------------------+
+--------------------------------------------+
| Clustering method                          |
+--------------------------------------------+
| ``KMeans.fit()``                           |
+--------------------------------------------+


Dimensionality reduction
------------------------

+--------------------------------------------+-------------------------------------------+----------------------------------+
| Operation                                  | Interoperable with Snowpark pandas? (Y/N) | Notes for current implementation |
+--------------------------------------------+-------------------------------------------+----------------------------------+
| Getting the principal components of a      | Y                                         |                                  |
| numerical dataset with ``PCA.fit()``.      |                                           |                                  |
+--------------------------------------------+-------------------------------------------+----------------------------------+
+--------------------------------------------+
| Operation                                  |
+--------------------------------------------+
| Getting the principal components of a      |
| numerical dataset with ``PCA.fit()``.      |
+--------------------------------------------+


Model selection
---------------

+--------------------------------------------+-------------------------------------------+-----------------------------------------------+
| Operation                                  | Interoperable with Snowpark pandas? (Y/N) | Notes for current implementation              |
+--------------------------------------------+-------------------------------------------+-----------------------------------------------+
| Choosing parameters for a                  | Y                                         | ``RandomizedSearchCV`` causes Snowpark pandas |
| ``LogisticRegression`` model with          |                                           | to issue many queries. We strongly recommend  |
| ``RandomizedSearchCV.fit()``.              |                                           | converting Snowpark pandas inputs to pandas   |
|                                            |                                           | before using ``RandomizedSearchCV``           |
+--------------------------------------------+-------------------------------------------+-----------------------------------------------+
+--------------------------------------------+
| Operation                                  |
+--------------------------------------------+
| Choosing parameters for a                  |
| ``LogisticRegression`` model with          |
| ``RandomizedSearchCV.fit()``.              |
+--------------------------------------------+

.. note::
    ``RandomizedSearchCV`` causes Snowpark pandas to issue many queries. We strongly
    recommend converting Snowpark pandas inputs to pandas before using ``RandomizedSearchCV``.
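
One way to follow the recommendation above is to convert the inputs once, before the search starts. The sketch below is illustrative only: ``to_native_pandas`` is a hypothetical helper, and it assumes that Snowpark pandas objects expose a ``to_pandas()`` method while plain pandas objects do not. Plain pandas data is used as a stand-in so the example runs without a Snowflake session.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV


def to_native_pandas(obj):
    """Hypothetical helper: return a plain pandas object.

    Assumption: Snowpark pandas objects expose ``to_pandas()``;
    plain pandas objects do not and are returned unchanged.
    """
    return obj.to_pandas() if hasattr(obj, "to_pandas") else obj


# Stand-in data; in practice X and y would be Snowpark pandas objects.
X = pd.DataFrame({"f1": [float(i) for i in range(20)]})
y = pd.Series([0] * 10 + [1] * 10)

# Convert once, up front, so the search itself runs on local data.
X_local = to_native_pandas(X)
y_local = to_native_pandas(y)

search = RandomizedSearchCV(
    LogisticRegression(max_iter=1000),
    param_distributions={"C": [0.01, 0.1, 1.0, 10.0]},
    n_iter=3,
    cv=2,
    random_state=0,
)
search.fit(X_local, y_local)
print(search.best_params_)
```

Converting up front means the repeated fits and cross-validation splits inside the search operate on in-memory data rather than each triggering further queries.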

Preprocessing
-------------

+--------------------------------------------+-------------------------------------------+----------------------------------+
| Operation                                  | Interoperable with Snowpark pandas? (Y/N) | Notes for current implementation |
+--------------------------------------------+-------------------------------------------+----------------------------------+
| Scaling training data with                 | Y                                         |                                  |
| ``MaxAbsScaler.fit_transform()``.          |                                           |                                  |
+--------------------------------------------+-------------------------------------------+----------------------------------+
+--------------------------------------------+
| Operation                                  |
+--------------------------------------------+
| Scaling training data with                 |
| ``MaxAbsScaler.fit_transform()``.          |
+--------------------------------------------+
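
A minimal sketch of the scaling operation, again using plain pandas as a stand-in for Snowpark pandas (an assumption made so that it runs locally) and invented column values:

```python
import pandas as pd  # stand-in; assumption: a Snowpark pandas dataframe is passed the same way
from sklearn.preprocessing import MaxAbsScaler

# Hypothetical training data with different magnitudes per column.
train = pd.DataFrame({"a": [1.0, -2.0, 4.0], "b": [10.0, 5.0, -20.0]})

# Each column is divided by its maximum absolute value,
# so every scaled value lies in the range [-1, 1].
scaled = MaxAbsScaler().fit_transform(train)
print(scaled)
```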