feat: add Expr|Series log and log10 methods #1048
base: main
narwhals/_pandas_like/series.py
Outdated
    def log(self: Self, base: float) -> Self:
        import numpy as np  # ignore-banned-import()

        return self._from_native_series(np.log(self._native_series) / np.log(base))

    def log10(self: Self) -> Self:
        import numpy as np  # ignore-banned-import()

        return self._from_native_series(np.log10(self._native_series))
numpy maintains the underlying dtype backend (numpy, nullable numpy or pyarrow)
Edit: I wish it did 🥲
hmm not sure if this is true when starting from Int64[pyarrow]
In [29]: s
Out[29]:
0 1
1 <NA>
2 3
dtype: int64[pyarrow]
In [30]: np.log(s)
Out[30]:
0 0.000000
1 NaN
2 1.098612
dtype: float64
Polars seems to always go for Float64. This should be doable. I am pushing a change for that.
Let me know if that's too much hidden manipulation for the user.
Edit: example with output
import narwhals as nw
import pandas as pd
import polars as pl
data = [1, None, 3]
@nw.narwhalify
def func(series):
    return series.log()
func(pl.Series(data))
Out[10]:
# shape: (3,)
# Series: '' [f64]
# [
# 0.0
# null
# 1.098612
# ]
func(pd.Series(data))
# 0 0.000000
# 1 NaN
# 2 1.098612
# dtype: float64
func(pd.Series(data).astype("int32[pyarrow]"))
# 0 0.0
# 1 <NA>
# 2 1.098612
# dtype: double[pyarrow]
polars preserves float32:
In [14]: pl.Series([1., 2.], dtype=pl.Float32).log10()
Out[14]:
shape: (2,)
Series: '' [f32]
[
0.0
0.30103
]
it looks like pandas preserves the backend for float - maybe we can do the cast to the appropriate type before calling numpy log?
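One way to read the suggestion above is to choose the target float dtype from the input dtype's name, then cast before calling numpy. The sketch below is only an illustration: `target_float_dtype` is a hypothetical helper, not part of narwhals or pandas, and it assumes numeric dtype names as pandas prints them (`int64`, `Int64`, `int64[pyarrow]`, with `float[pyarrow]` and `double[pyarrow]` for 32-bit and 64-bit pyarrow floats). It keeps float32 inputs at 32 bits, in line with the Polars output shown earlier in this thread.

```python
def target_float_dtype(dtype_name: str) -> str:
    """Hypothetical helper: pick the float dtype to cast to before taking a log.

    Float32 inputs stay 32-bit; every other numeric dtype is promoted to a
    64-bit float within the same backend (numpy, nullable numpy, or pyarrow).
    """
    is_float32 = dtype_name in {"float32", "Float32", "float[pyarrow]"}
    if dtype_name.endswith("[pyarrow]"):
        # pandas spells pyarrow float32/float64 as "float"/"double"
        return "float[pyarrow]" if is_float32 else "double[pyarrow]"
    if dtype_name[:1].isupper():
        # pandas nullable (masked) dtypes are capitalised, e.g. "Int64"
        return "Float32" if is_float32 else "Float64"
    # plain numpy-backed dtypes
    return "float32" if is_float32 else "float64"


for name in ["int64", "Int64", "int64[pyarrow]", "float32", "float[pyarrow]"]:
    print(name, "->", target_float_dtype(name))
```

A caller could then try something like `np.log(series.astype(target_float_dtype(str(series.dtype))))`. Whether the result actually keeps that backend is an assumption that would need checking against the pandas versions in question.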
        values_log_10: [[0,0.3010299956639812,0.6020599913279624]]

        """
        return self.__class__(lambda plx: self._call(plx).log10())
Although polars calls log(..., base=10), I would imagine that the dedicated numpy and/or pyarrow log10 functions may behave differently.
thanks!
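To make the concern about dedicated log10 functions concrete, here is a small standard-library sketch. It uses `math` rather than numpy or pyarrow, so it only illustrates the general floating-point effect and is not a statement about how those libraries behave. The function name `log_via_change_of_base` is made up for this example; it applies the same ln(x) / ln(base) formula as the pandas-like `log` in this PR.

```python
import math


def log_via_change_of_base(x: float, base: float) -> float:
    # Same formula as the pandas-like implementation: ln(x) / ln(base)
    return math.log(x) / math.log(base)


x = 1000.0
generic = log_via_change_of_base(x, 10)  # two roundings plus a division
dedicated = math.log10(x)                # single dedicated routine

print(generic, dedicated, generic == dedicated)
print("absolute difference:", abs(generic - dedicated))
```

The two results agree to within floating-point rounding but are not guaranteed to be bit-identical, which is one reason to route `log10` to a dedicated function instead of `log(base=10)`.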
pyproject.toml
Outdated
@@ -115,7 +115,8 @@ filterwarnings = [
   'ignore:.*You are using pyarrow version',
   'ignore:.*but when imported by',
   'ignore:Distributing .*This may take some time',
-  'ignore:.*The default coalesce behavior'
+  'ignore:.*The default coalesce behavior',
+  'ignore::RuntimeWarning'
this seems too broad to turn off globally, can we just catch it in the code?
fair enough
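A minimal sketch of catching the warning in the code instead of in `pyproject.toml`, using only the standard library. `noisy_log` is a stand-in invented for this example that imitates a numpy-style log (it warns and returns `-inf` or `nan` for non-positive input); `quiet_log` is likewise hypothetical and shows the warning being silenced for one call only.

```python
import math
import warnings


def noisy_log(x: float) -> float:
    """Stand-in for a numpy-style log that warns instead of raising."""
    if x == 0:
        warnings.warn("divide by zero encountered in log", RuntimeWarning, stacklevel=2)
        return float("-inf")
    if x < 0:
        warnings.warn("invalid value encountered in log", RuntimeWarning, stacklevel=2)
        return float("nan")
    return math.log(x)


def quiet_log(x: float) -> float:
    # Suppress RuntimeWarning only for the duration of this call;
    # the previous warning filters are restored when the block exits.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", RuntimeWarning)
        return noisy_log(x)


print(quiet_log(0.0))  # no warning is emitted here
```

With numpy itself, `np.errstate(divide="ignore", invalid="ignore")` used as a context manager around the call may be a closer fit, since it stops numpy from raising the floating-point warning in the first place.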
What type of PR is this? (check all applicable)
Related issues
Used in plotly
Checklist
If you have comments or can explain your changes, please do so below.