Merge branch 'snowflakedb:main' into main
frederiksteiner authored Jan 28, 2025
2 parents f4c6dd7 + 7e510c9 commit a1e3896
Showing 225 changed files with 28,567 additions and 6,379 deletions.
31 changes: 21 additions & 10 deletions .github/workflows/daily_modin_precommit.yml
@@ -3,8 +3,8 @@
name: Daily Snowpark pandas API test
on:
schedule:
# 9 AM UTC
- cron: "0 9 * * *"
# 8 AM UTC
- cron: "0 8 * * *"
workflow_dispatch:
inputs:
logLevel:
@@ -171,14 +171,25 @@ jobs:
# Specify SNOWFLAKE_IS_PYTHON_RUNTIME_TEST: 1 when adding >= python3.11 with no server-side support
# For example, see https://github.com/snowflakedb/snowpark-python/pull/681
shell: bash
- name: Run Snowpark pandas API tests (excluding doctests)
- if: ${{ contains('macos', matrix.os.download_name) }}
name: Run Snowpark pandas API tests (excluding doctests)
run: python -m tox -e "py${PYTHON_VERSION/\./}-snowparkpandasdailynotdoctest-modin-ci"
env:
PYTHON_VERSION: ${{ matrix.python-version }}
cloud_provider: ${{ matrix.cloud-provider }}
PYTEST_ADDOPTS: --color=yes --tb=short
TOX_PARALLEL_NO_SPINNER: 1
shell: bash
- if: ${{ !contains('macos', matrix.os.download_name) }}
name: Run Snowpark pandas API tests (excluding doctests)
# Use regular github setup for large windows and linux instance
run: python -m tox -e "py${PYTHON_VERSION/\./}-snowparkpandasnotdoctest-modin-ci"
env:
PYTHON_VERSION: ${{ matrix.python-version }}
cloud_provider: ${{ matrix.cloud-provider }}
PYTEST_ADDOPTS: --color=yes --tb=short
TOX_PARALLEL_NO_SPINNER: 1
shell: bash
- name: Combine coverages
run: python -m tox -e coverage --skip-missing-interpreters false
shell: bash
@@ -248,7 +259,7 @@ jobs:
# For example, see https://github.com/snowflakedb/snowpark-python/pull/681
shell: bash
- name: Run Snowpark pandas API tests (excluding doctests)
run: python -m tox -e "modin_pandas_version-py${PYTHON_VERSION/\./}-snowparkpandasdailynotdoctest-modin-ci"
run: python -m tox -e "modin_pandas_version-py${PYTHON_VERSION/\./}-snowparkpandasnotdoctest-modin-ci"
env:
MODIN_PANDAS_PATCH_VERSION: ${{ matrix.pandas-version }}
PYTHON_VERSION: ${{ matrix.python-version }}
@@ -265,10 +276,10 @@ jobs:
fail-fast: false
matrix:
os:
- image_name: macos-latest
download_name: macos # it includes doctest
python-version: ["3.9", "3.10", "3.11"]
cloud-provider: [aws]
- image_name: windows-latest-64-cores
download_name: windows
python-version: ["3.11"]
cloud-provider: [azure]
steps:
- name: Checkout Code
uses: actions/checkout@v4
@@ -309,7 +320,7 @@ jobs:
TOX_PARALLEL_NO_SPINNER: 1
shell: bash
- name: Run Snowpark pandas API tests (excluding doctests)
run: python -m tox -e "py${PYTHON_VERSION/\./}-snowparkpandasdailynotdoctest-modin-ci"
run: python -m tox -e "py${PYTHON_VERSION/\./}-snowparkpandasnotdoctest-modin-ci"
env:
PYTHON_VERSION: ${{ matrix.python-version }}
cloud_provider: ${{ matrix.cloud-provider }}
@@ -339,7 +350,7 @@ jobs:
os:
- image_name: ubuntu-latest-64-cores
download_name: linux
python-version: ["3.9", "3.10", "3.11"]
python-version: ["3.10"]
cloud-provider: [aws]
steps:
- name: Checkout Code
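Note on the tox environment names in this workflow: the `${PYTHON_VERSION/\./}` expansion in the `run` lines is bash pattern substitution, which removes the first `.` from the version string, so `3.11` becomes `311`. The Python sketch below mirrors that naming scheme. `tox_env_name` is a hypothetical helper written for illustration only and is not part of the repository.

```python
def tox_env_name(python_version: str, suffix: str, prefix: str = "") -> str:
    # bash's ${VAR/\./} replaces only the first literal dot with nothing,
    # so str.replace is called with count=1 to match that behaviour.
    compact = python_version.replace(".", "", 1)
    return f"{prefix}py{compact}-{suffix}"


# Environment names as they appear in the workflow after substitution.
print(tox_env_name("3.11", "snowparkpandasnotdoctest-modin-ci"))
print(tox_env_name("3.9", "snowparkpandasdailynotdoctest-modin-ci"))
print(tox_env_name("3.10", "snowparkpandasnotdoctest-modin-ci",
                   prefix="modin_pandas_version-"))
```

The only difference between the two suffixes in the diff is `daily`: the macOS step keeps `snowparkpandasdailynotdoctest`, while the other steps use `snowparkpandasnotdoctest`.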
14 changes: 6 additions & 8 deletions .github/workflows/daily_precommit.yml
@@ -3,8 +3,8 @@
name: Daily precommit test
on:
schedule:
# 10 AM UTC
- cron: "0 10 * * *"
# 9 AM UTC
- cron: "0 9 * * *"

workflow_dispatch:
inputs:
@@ -134,7 +134,7 @@ jobs:
run: python -m pip install -U setuptools pip wheel
- name: Install tox
run: python -m pip install tox
- if: ${{ contains('macos', matrix.os.download_name) }}
- if: ${{ contains('macos', matrix.os.download_name) && matrix.python-version != '3.12'}}
name: Run doctests
run: python -m tox -e "py${PYTHON_VERSION}-doctest-notudf-ci"
env:
@@ -260,7 +260,7 @@ jobs:
run: python -m pip install -U setuptools pip wheel
- name: Install tox
run: python -m pip install tox
- if: ${{ contains('macos', matrix.os.download_name) }}
- if: ${{ contains('macos', matrix.os.download_name) && matrix.python-version != '3.12'}}
name: Run doctests
run: python -m tox -e "py${PYTHON_VERSION}-doctest-notudf-ci"
env:
@@ -373,9 +373,7 @@ jobs:
strategy:
fail-fast: false
matrix:
os:
- image_name: macos-latest
download_name: macos # it includes doctest
os: [macos-latest, windows-latest, ubuntu-latest]
python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
cloud-provider: [aws]
steps:
@@ -538,7 +536,7 @@ jobs:
run: python -m pip install -U setuptools pip wheel
- name: Install tox
run: python -m pip install tox
- if: ${{ contains('macos', matrix.os.download_name) }}
- if: ${{ contains('macos', matrix.os.download_name) && matrix.python-version != '3.12'}}
name: Run doctests
run: python -m tox -e "py${PYTHON_VERSION}-doctest-notudf-ci"
env:
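Note on the doctest guard changed in this file: the new `if:` expression combines `contains('macos', matrix.os.download_name)` with `matrix.python-version != '3.12'`. For strings, the GitHub Actions `contains(search, item)` function is a case-insensitive substring test of `item` inside `search`. A rough Python sketch of that condition follows; `gha_contains` and `runs_doctests` are hypothetical names, and the sketch covers only plain string inputs.

```python
def gha_contains(search: str, item: str) -> bool:
    # String form of contains(search, item): is `item` a substring of
    # `search`, ignoring case? (Assumption: both arguments are strings.)
    return item.lower() in search.lower()


def runs_doctests(download_name: str, python_version: str) -> bool:
    # Mirrors: contains('macos', matrix.os.download_name)
    #          && matrix.python-version != '3.12'
    return gha_contains("macos", download_name) and python_version != "3.12"


for name, version in [("macos", "3.11"), ("macos", "3.12"), ("linux", "3.11")]:
    print(name, version, runs_doctests(name, version))
```

Because the literal `'macos'` is the first argument, the check asks whether the download name occurs inside `'macos'`, not the other way around. That works for the exact value `macos` used in these matrices.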
63 changes: 62 additions & 1 deletion .github/workflows/precommit.yml
@@ -134,6 +134,25 @@ jobs:
- os: windows-latest-64-cores
python-version: "3.11"
cloud-provider: azure
# limit python 3.12 to one os per csp
- os: macos-latest
python-version: "3.12"
cloud-provider: aws
- os: macos-latest
python-version: "3.12"
cloud-provider: gcp
- os: ubuntu-latest-64-cores
python-version: "3.12"
cloud-provider: azure
- os: ubuntu-latest-64-cores
python-version: "3.12"
cloud-provider: gcp
- os: windows-latest-64-cores
python-version: "3.12"
cloud-provider: azure
- os: windows-latest-64-cores
python-version: "3.12"
cloud-provider: aws
steps:
- name: Checkout Code
uses: actions/checkout@v4
@@ -223,6 +242,32 @@ jobs:
os: [macos-latest, windows-latest, ubuntu-latest]
python-version: [ "3.8", "3.9", "3.10", "3.11", "3.12" ]
cloud-provider: [aws]
exclude:
# only run each version on one os
# daily-precommit does full matrix
# macos runs 3.8 and 3.10
- os: macos-latest
python-version: "3.12"
- os: macos-latest
python-version: "3.11"
- os: macos-latest
python-version: "3.9"
# windows runs 3.9 and 3.11
- os: windows-latest
python-version: "3.12"
- os: windows-latest
python-version: "3.10"
- os: windows-latest
python-version: "3.8"
# ubuntu runs 3.12
- os: ubuntu-latest
python-version: "3.11"
- os: ubuntu-latest
python-version: "3.10"
- os: ubuntu-latest
python-version: "3.9"
- os: ubuntu-latest
python-version: "3.8"
steps:
- name: Checkout Code
uses: actions/checkout@v4
@@ -413,7 +458,23 @@ jobs:
matrix:
os: [ ubuntu-latest ]
python-version: [ "3.9", "3.11", "3.12" ]
cloud-provider: [ aws ]
cloud-provider: [aws, gcp, azure]
exclude:
# Run 3.9 on aws
- python-version: "3.11"
cloud-provider: aws
- python-version: "3.12"
cloud-provider: aws
# Run 3.11 on gcp
- python-version: "3.9"
cloud-provider: gcp
- python-version: "3.12"
cloud-provider: gcp
# Run 3.12 on azure
- python-version: "3.9"
cloud-provider: azure
- python-version: "3.11"
cloud-provider: azure
steps:
- name: Checkout Code
uses: actions/checkout@v4
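Note on the `exclude` lists added in this file: GitHub Actions builds the cross product of the matrix axes and then drops every combination that matches all keys of some `exclude` entry. The Python sketch below reproduces that rule for the two matrices above, to show which jobs remain. `expand_matrix` is a hypothetical helper, not something GitHub or the repository provides, and the sketch ignores `include`.

```python
from itertools import product


def expand_matrix(axes, exclude=()):
    """Return every axis combination that no exclude rule matches."""
    keys = list(axes)
    combos = [dict(zip(keys, values)) for values in product(*axes.values())]

    def is_excluded(combo):
        # A rule matches when all of its key/value pairs appear in the combo;
        # axes the rule does not mention act as wildcards.
        return any(
            all(combo.get(k) == v for k, v in rule.items()) for rule in exclude
        )

    return [c for c in combos if not is_excluded(c)]


# OS x Python matrix: each version ends up on exactly one OS.
os_python = expand_matrix(
    {
        "os": ["macos-latest", "windows-latest", "ubuntu-latest"],
        "python-version": ["3.8", "3.9", "3.10", "3.11", "3.12"],
        "cloud-provider": ["aws"],
    },
    exclude=[{"os": "macos-latest", "python-version": v}
             for v in ("3.12", "3.11", "3.9")]
    + [{"os": "windows-latest", "python-version": v}
       for v in ("3.12", "3.10", "3.8")]
    + [{"os": "ubuntu-latest", "python-version": v}
       for v in ("3.11", "3.10", "3.9", "3.8")],
)

# Python x cloud-provider matrix: one version per provider.
csp_python = expand_matrix(
    {
        "os": ["ubuntu-latest"],
        "python-version": ["3.9", "3.11", "3.12"],
        "cloud-provider": ["aws", "gcp", "azure"],
    },
    exclude=[
        {"python-version": "3.11", "cloud-provider": "aws"},
        {"python-version": "3.12", "cloud-provider": "aws"},
        {"python-version": "3.9", "cloud-provider": "gcp"},
        {"python-version": "3.12", "cloud-provider": "gcp"},
        {"python-version": "3.9", "cloud-provider": "azure"},
        {"python-version": "3.11", "cloud-provider": "azure"},
    ],
)

for combo in os_python + csp_python:
    print(combo["os"], combo["python-version"], combo["cloud-provider"])
```

With the exclusions as written in the diff, the first matrix leaves macOS with 3.8 and 3.10, Windows with 3.9 and 3.11, and Ubuntu with 3.12. The second leaves 3.9 on aws, 3.11 on gcp, and 3.12 on azure.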
32 changes: 31 additions & 1 deletion CHANGELOG.md
@@ -23,6 +23,7 @@
- `bitmap_construct_agg`
- `cbrt`
- `equal_null`
- `from_json`
- `ifnull`
- `localtimestamp`
- `max_by`
@@ -40,21 +41,48 @@
- `regr_sxy`
- `regr_syy`
- `try_to_binary`
- `base64`
- `base64_decode_string`
- `base64_encode`
- `editdistance`
- `hex`
- `hex_encode`
- `instr`
- `log1p`
- `log2`
- `log10`
- `percentile_approx`
- `unbase64`
- Added support for specifying a schema string (including implicit struct syntax) when calling `DataFrame.create_dataframe`.
- Added support for `DataFrameWriter.insert_into/insertInto`. This method also supports local testing mode.
- Added support for multiple columns in the functions `map_cat` and `map_concat`.

#### Experimental Features

- Added `Catalog` class to manage Snowflake objects. It can be accessed via `Session.catalog`.
- Added support for querying a JSON element of a VARIANT column in `functions.col` and `functions.column` with an optional keyword argument `json_element`.
- Allowed a user-provided schema when reading a JSON file on a stage.
- Added support for specifying a schema string (including implicit struct syntax) when calling `DataFrame.create_dataframe`.
- `snowflake.core` is a dependency required for this feature.

#### Improvements

- Updated README.md to include instructions on how to verify package signatures using `cosign`.
- Added an option `keep_column_order` for keeping original column order in `DataFrame.with_column` and `DataFrame.with_columns`.
- Added options to column casts that allow renaming or adding fields in StructType columns.
- Added support for `contains_null` parameter to ArrayType.
- Added support for creating a temporary view via `DataFrame.create_or_replace_temp_view` from a DataFrame created by reading a file from a stage.
- Added support for `value_contains_null` parameter to MapType.
- Added `interactive` to telemetry that indicates whether the current environment is an interactive one.

#### Bug Fixes

- Fixed a bug in local testing mode that caused a column to contain None when it should contain 0.
- Fixed a bug in `StructField.from_json` that prevented TimestampTypes with `tzinfo` from being parsed correctly.
- Fixed a bug in function `date_format` that caused an error when the input column was date type or timestamp type.
- Fixed a bug in DataFrame that allowed a null value to be inserted into a non-nullable column.
- Fixed a bug in `replace` when passing `Column` expression objects.
- Fixed a bug in `replace` and `lit` that raised a type hint assertion error when passing `Column` expression objects.
- Fixed a bug in `pandas_udf` and `pandas_udtf` where `session` parameter was erroneously ignored.

### Snowpark pandas API Updates

@@ -83,6 +111,8 @@
- Added support for `include_groups=False` in `DataFrameGroupBy.apply`.
- Added support for `expand=True` in `Series.str.split`.
- Added support for `DataFrame.pop` and `Series.pop`.
- Added support for `first` and `last` in `DataFrameGroupBy.agg` and `SeriesGroupBy.agg`.
- Added support for `Index.drop_duplicates`.

#### Bug Fixes

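Note on some of the function names added to the CHANGELOG above: the diff lists only the names (`editdistance`, `base64`, `unbase64`, `log1p`, `log2`, `log10`, and others), not their signatures. The sketch below uses only the Python standard library to illustrate the operations those names conventionally denote. It does not call Snowpark, and it assumes the Snowpark functions follow the usual meaning (for example, that `editdistance` is Levenshtein distance). The helper `edit_distance` is hypothetical.

```python
import base64
import math


def edit_distance(a: str, b: str) -> int:
    # Classic Levenshtein distance via dynamic programming, keeping
    # only the previous row of the table in memory.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Base64 round trip on a UTF-8 string.
encoded = base64.b64encode("Snowflake".encode("utf-8")).decode("ascii")
decoded = base64.b64decode(encoded).decode("utf-8")

print(edit_distance("kitten", "sitting"))
print(encoded, decoded)
print(math.log1p(0.0), math.log2(8), math.log10(1000))
```

For the actual Snowpark signatures and null handling, the generated API docs under `docs/source/snowpark/functions.rst` are the place to check.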
2 changes: 1 addition & 1 deletion LICENSE.txt
@@ -187,7 +187,7 @@
same "printed page" as the copyright notice for easier
identification within third-party archives.

Copyright (c) 2012-2024 Snowflake Computing, Inc.
Copyright (c) 2012-2025 Snowflake Computing, Inc.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
#!/usr/bin/env python3
#
# Copyright (c) 2012-2024 Snowflake Computing Inc. All rights reserved.
# Copyright (c) 2012-2025 Snowflake Computing Inc. All rights reserved.
#
import ast

2 changes: 1 addition & 1 deletion docs/source/doc_gen.py
@@ -1,6 +1,6 @@
#!/usr/bin/env python3
#
# Copyright (c) 2012-2024 Snowflake Computing Inc. All rights reserved.
# Copyright (c) 2012-2025 Snowflake Computing Inc. All rights reserved.
#

import argparse
1 change: 1 addition & 0 deletions docs/source/modin/indexing.rst
@@ -53,6 +53,7 @@ Index
Index.argmin
Index.argmax
Index.copy
Index.drop_duplicates
Index.equals
Index.identical
Index.is_boolean
4 changes: 4 additions & 0 deletions docs/source/modin/supported/agg_supp.rst
@@ -60,3 +60,7 @@ the aggregation is supported by ``SeriesGroupBy.agg``.
+-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+
| ``len`` | ``N`` | ``N`` | ``Y`` | ``Y`` |
+-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+
| ``first`` | ``N`` | ``N`` | ``Y`` | ``Y`` |
+-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+
| ``last`` | ``N`` | ``N`` | ``Y`` | ``Y`` |
+-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+
2 changes: 1 addition & 1 deletion docs/source/modin/supported/index_supported.rst
@@ -79,7 +79,7 @@ Methods
+-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
| ``drop`` | N | | |
+-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
| ``drop_duplicates`` | N | | |
| ``drop_duplicates`` | Y | | |
+-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
| ``duplicated`` | N | | |
+-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
12 changes: 12 additions & 0 deletions docs/source/snowpark/functions.rst
@@ -81,6 +81,9 @@ Functions
atanh
atan2
avg
base64
base64_decode_string
base64_encode
bit_length
bitmap_bit_position
bitmap_bucket_number
@@ -157,6 +160,7 @@ Functions
desc_nulls_last
div0
divnull
editdistance
endswith
equal_nan
equal_null
@@ -178,12 +182,15 @@ Functions
grouping
grouping_id
hash
hex
hex_encode
hour
iff
ifnull
in_
initcap
insert
instr
is_array
is_binary
is_boolean
@@ -217,6 +224,9 @@ Functions
locate
localtimestamp
log
log1p
log2
log10
lower
lpad
ltrim
@@ -257,6 +267,7 @@ Functions
parse_json
parse_xml
percent_rank
percentile_approx
percentile_cont
position
pow
@@ -350,6 +361,7 @@ Functions
udaf
udf
udtf
unbase64
uniform
unix_timestamp
upper
2 changes: 2 additions & 0 deletions docs/source/snowpark/io.rst
@@ -44,6 +44,8 @@ Input/Output
DataFrameWriter.save
DataFrameWriter.saveAsTable
DataFrameWriter.save_as_table
DataFrameWriter.insertInto
DataFrameWriter.insert_into
FileOperation.get
FileOperation.get_stream
FileOperation.put