PHEP 3: PyHC Python & Upstream Package Support Policy #29
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,356 @@ | ||
| ``` | ||
| PHEP: 3 | ||
| Title: PyHC Python & Upstream Package Support Policy | ||
| Author: Shawn Polson <[email protected]> <https://orcid.org/0000-0003-0619-5745> | ||
| Discussions-To: https://github.com/heliophysicsPy/standards/pull/29 | ||
| Revision: 1 | ||
| Status: Draft | ||
| Type: Standards Track | ||
| Content-Type: text/markdown; charset=UTF-8; variant=CommonMark | ||
| Created: 06-Jun-2024 | ||
| Post-History: 06-Jun-2024, 11-Jun-2024, 02-Jul-2024 | ||
| ``` | ||
|
|
||
| # Abstract | ||
| <a name="abstract"></a> | ||
| This PHEP recommends that all projects across the PyHC ecosystem adopt a common time-based policy for support of dependencies, similar to [SPEC 0](https://scientific-python.org/specs/spec-0000/). Specifically, for Python versions and the upstream Scientific Python packages covered by SPEC 0, it recommends that projects: | ||
| 1. Support Python versions for at least **36 months** (3 years) after their initial release. | ||
| 2. Support upstream Scientific Python packages for at least **24 months** (2 years) after their initial release. | ||
| 3. Adopt support for new versions of these dependencies within **6 months** of their release. | ||
|
|
||
| The upstream Scientific Python packages are: `numpy, scipy, matplotlib, pandas, scikit-image, networkx, scikit-learn, xarray, ipython, zarr`. | ||
|
|
||
| This policy will replace the current standard [#11](https://github.com/heliophysicsPy/standards/blob/main/standards.md#standards) which simply mandates Python 3 support. | ||
|
||
|
|
||
| # Motivation | ||
| <a name="motivation"></a> | ||
| The current PyHC standard [#11](https://github.com/heliophysicsPy/standards/blob/main/standards.md#standards), which mandates compatibility with Python 3, is outdated. Python 3 support is virtually universal now, so it would be more beneficial to replace this standard with a policy for how to support new minor Python versions and key upstream dependencies. [SPEC 0](https://scientific-python.org/specs/spec-0000/) provides a structured support timeline that balances stability and progress, essential for software in the heliophysics community. Adopting a similar policy ensures consistency and predictability in support timelines. Additionally, limiting the scope of supported versions is an effective way for packages to limit maintenance burden while promoting interoperability. | ||
|
|
||
| # Rationale | ||
| <a name="rationale"></a> | ||
| Following [SPEC 0](https://scientific-python.org/specs/spec-0000/)'s 24/36-month support timeline keeps PyHC in better sync with the broader Scientific Python community, maintaining compatibility with newer Python features and key upstream dependencies, while providing adequate time for package maintainers to adapt. Allowing 6 months to adopt new versions ensures packages stay current with development cycles while providing a reasonable timeframe for testing and integration. | ||
|
||
|
|
||
| # Specification | ||
| <a name="specification"></a> | ||
| This PHEP refers to feature releases of dependencies (e.g., Python 3.12.0, NumPy 2.0.0; not Python 3.12.1, NumPy 2.0.1). | ||
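The distinction between feature releases and bugfix releases can be checked mechanically. The following is a minimal sketch using only the standard library; the helper name `is_feature_release` is hypothetical and not part of this PHEP. It assumes plain numeric version strings and treats anything with a suffix (such as a release candidate) as not a feature release. The script in the Reference Implementation section applies the same idea with `packaging.version.Version`.

```python
import re

# Matches plain numeric versions such as "3.12", "3.12.0" or "2.0.1".
# Pre-releases ("2.0.0rc1") and other suffixes deliberately do not match.
_PLAIN_VERSION = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?$")


def is_feature_release(version: str) -> bool:
    """Return True for X.Y or X.Y.0 releases, False for bugfix or pre-releases."""
    match = _PLAIN_VERSION.match(version)
    if match is None:
        return False
    micro = match.group(3)
    return micro is None or int(micro) == 0


for candidate in ["3.12.0", "3.12.1", "2.0.0", "2.0.1", "2.0.0rc1"]:
    print(candidate, is_feature_release(candidate))
```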
|
|
||
| This PHEP specifies that all PyHC packages should: | ||
|
||
| 1. Support Python versions for at least **36 months** (3 years) after their initial release. | ||
| 2. Support upstream Scientific Python packages for at least **24 months** (2 years) after their initial release. | ||
| 3. Adopt support for new versions of these dependencies within **6 months** of their release. | ||
|
|
||
| The upstream Scientific Python packages are: `numpy, scipy, matplotlib, pandas, scikit-image, networkx, scikit-learn, xarray, ipython, zarr`. | ||
|
||
|
|
||
| Since new minor Python versions are released annually in October ([PEP 602](https://peps.python.org/pep-0602/)), PyHC packages should effectively support about three minor Python versions at any given time. Upstream packages have more varied release schedules, but several recent versions should typically be supported concurrently. Providing ongoing support for older versions beyond the specified support periods is optional. | ||
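To illustrate the "about three minor Python versions" claim, here is a small sketch that lists the Python versions inside the 36-month window on a given date. The release dates are copied from the script in the Reference Implementation section; the function name `supported_pythons` and the constant names are hypothetical, and 36 months is approximated as `365 * 3` days, as that script does.

```python
from datetime import date, timedelta

# Release dates copied from the script in the Reference Implementation section.
PY_RELEASES = {
    "3.9": date(2020, 10, 5),
    "3.10": date(2021, 10, 4),
    "3.11": date(2022, 10, 24),
    "3.12": date(2023, 10, 2),
}
SUPPORT_PYTHON = timedelta(days=365 * 3)  # 36 months, approximated in days


def supported_pythons(on: date) -> list[str]:
    """Return Python versions already released and still inside the 36-month window."""
    return [
        version
        for version, released in PY_RELEASES.items()
        if released <= on < released + SUPPORT_PYTHON
    ]


print(supported_pythons(date(2024, 7, 1)))
```

For 1 July 2024 this selects 3.10, 3.11, and 3.12: Python 3.9 has left its window and 3.13 has not yet been released.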
|
|
||
|  | ||
|
||
|
|
||
| PyHC packages should clearly document their dependency version policy (e.g., like [PlasmaPy](https://docs.plasmapy.org/en/stable/contributing/coding_guide.html#python-and-dependency-version-support) and [SpacePy](https://spacepy.github.io/dep_versions.html)) and be tested against the minimum and maximum supported versions. Testing with CI against release candidates is encouraged, too, as a way to stay ahead of future releases. Packages that use semantic versioning should consider using their version number to indicate versions that drop support for older dependencies. There is no expectation that a package "deprecate" an older dependency before dropping support for it. However, there is an expectation that maximum or exact requirements (e.g., `numpy<2` or `matplotlib==3.5.3`) be set only when absolutely necessary (and that GitHub issues be immediately created to remove such requirements). | ||
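One way to keep maximum or exact requirements visible is to scan a package's requirement strings for them. The sketch below is a deliberately simple, standard-library-only illustration; the helper name `capping_clauses` is hypothetical, and the regular expressions cover only common requirement strings such as `numpy<2`, not direct URL references or every corner of the specifier grammar. A project that already depends on `packaging` would more robustly parse requirements with `packaging.requirements.Requirement`.

```python
import re

# Operators that cap or pin a dependency. "~=" is included because a
# compatible-release clause such as "~=1.4" implies an upper bound.
_CAPPING_OPERATORS = ("===", "==", "~=", "<=", "<")
_CLAUSE = re.compile(r"\s*(===|==|~=|!=|<=|>=|<|>)")


def capping_clauses(requirement: str) -> list[str]:
    """Return the version clauses in a requirement string that cap or pin it."""
    # Drop any environment marker, then find where the specifiers start.
    spec = requirement.split(";", 1)[0]
    start = re.search(r"[<>=!~]", spec)
    if start is None:
        return []  # bare package name, no version constraints
    clauses = [clause.strip() for clause in spec[start.start():].split(",")]
    capped = []
    for clause in clauses:
        operator = _CLAUSE.match(clause)
        if operator and operator.group(1) in _CAPPING_OPERATORS:
            capped.append(clause)
    return capped


for req in ["numpy<2", "matplotlib==3.5.3", "scipy>=1.10", "pandas>=2.0,<3"]:
    print(req, "->", capping_clauses(req))
```

A check like this could run in CI and fail, or simply warn, whenever a cap appears without a linked issue tracking its removal.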
|
||
|
|
||
| This new policy will replace the current standard [#11](https://github.com/heliophysicsPy/standards/blob/main/standards.md#standards) in the PyHC standards document with the following new text: | ||
|
|
||
| > **11. Python and Upstream Package Support:** All packages should support minor Python versions released within the last 36 months (3 years) and upstream core Scientific Python packages released within the last 24 months (2 years). Additionally, packages should support new versions within 6 months of their release (see [PHEP 3](https://github.com/heliophysicsPy/standards/pull/29)). | ||
|
|
||
| Lastly, if a Python 4 is released or dependencies change significantly in other ways, this policy will have to be reviewed in light of the community's and projects' best interests. | ||
|
|
||
| # Backwards Compatibility | ||
| <a name="backwards-compatibility"></a> | ||
| This policy potentially introduces backwards incompatibilities by enforcing a new support timeline, which may encourage some packages to drop support for older dependency versions sooner than planned. | ||
|
|
||
| # Security Implications | ||
| <a name="security-implications"></a> | ||
| There are no direct security implications of this policy. However, ensuring packages are updated to newer dependency versions may indirectly improve security by incorporating fixes and improvements from newer releases. | ||
|
||
|
|
||
| # How to Teach This | ||
| <a name="how-to-teach-this"></a> | ||
| - A new web page under the PyHC Projects page detailing the support schedule (similar to the Gantt chart in SPEC 0)? | ||
|
||
| - Automated email reminders sent quarterly and/or near upcoming drop/support dates? | ||
|
|
||
| # Reference Implementation | ||
| <a name="reference-implementation"></a> | ||
| Multiple PyHC packages already follow this version support policy. One notable example is PlasmaPy, which [documents its SPEC 0-based policy](https://docs.plasmapy.org/en/stable/contributing/coding_guide.html#python-and-dependency-version-support) and even mentions it in comments inside its [pyproject.toml](https://github.com/PlasmaPy/PlasmaPy/blob/main/pyproject.toml) file. | ||
|
|
||
| ## Code to generate support and drop schedules: | ||
|
||
| ```python | ||
|
Contributor
Can it be explicitly noted that this code is the source of the included Gantt chart? Does it make more sense to put it elsewhere (in this repository or otherwise) and link it from here, as something that is live updated as necessary without having to update the PHEP?
Contributor
Author
You got it. I'll add " I considered putting the code elsewhere, like in a separate file and linking to it, but decided not to for two main reasons: (1) SPEC 0 included their code in-line inside the SPEC and I liked that. (2) I honestly do not intend to formally maintain this code. Sure the dates etc will eventually become obsolete after enough time passes, but ain't nobody got time to remember to update dates in an obscure script! It's valid now and a good enough starting point for anyone who wants to use it in the future (likely only me tbh). |
||
| import requests | ||
| import collections | ||
| from datetime import datetime, timedelta | ||
|
|
||
| import pandas as pd | ||
| from packaging.version import Version | ||
|
|
||
|
|
||
| py_releases = { | ||
| "3.9": "Oct 5, 2020", | ||
| "3.10": "Oct 4, 2021", | ||
| "3.11": "Oct 24, 2022", | ||
| "3.12": "Oct 2, 2023", | ||
| } | ||
| core_packages = [ | ||
| "numpy", | ||
| "scipy", | ||
| "matplotlib", | ||
| "pandas", | ||
| "scikit-image", | ||
| "networkx", | ||
| "scikit-learn", | ||
| "xarray", | ||
| "ipython", | ||
| "zarr", | ||
| ] | ||
| plus36 = timedelta(days=int(365 * 3)) | ||
| plus24 = timedelta(days=int(365 * 2)) | ||
| plus6 = timedelta(days=int(365 * 0.5)) | ||
|
|
||
| # Release data | ||
|
|
||
| # Put the cutoff 3 quarters ago. We do not use a plain -9 months offset, | ||
| # so that the content of a quarter does not change depending on when this | ||
| # file is generated during the current quarter. | ||
|
|
||
| current_date = pd.Timestamp.now() | ||
| current_quarter_start = pd.Timestamp( | ||
| current_date.year, (current_date.quarter - 1) * 3 + 1, 1 | ||
| ) | ||
| cutoff = current_quarter_start - pd.DateOffset(months=9) | ||
|
|
||
|
|
||
| def get_release_dates(package, support_time=plus24): | ||
| releases = {} | ||
|
|
||
| print(f"Querying pypi.org for {package} versions...", end="", flush=True) | ||
| response = requests.get( | ||
| f"https://pypi.org/simple/{package}", | ||
| headers={"Accept": "application/vnd.pypi.simple.v1+json"}, | ||
| timeout=30,  # avoid hanging indefinitely if pypi.org does not respond | ||
| ).json() | ||
| print("OK") | ||
|
|
||
| file_date = collections.defaultdict(list) | ||
| for f in response["files"]: | ||
| ver = f["filename"].split("-")[1] | ||
| try: | ||
| version = Version(ver) | ||
| except ValueError:  # packaging's InvalidVersion is a ValueError subclass | ||
| continue | ||
|
|
||
| if version.is_prerelease or version.micro != 0: | ||
| continue | ||
|
|
||
| release_date = None | ||
| # Upload times may or may not carry fractional seconds; try both layouts. | ||
| for fmt in ["%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ"]: | ||
| try: | ||
| release_date = datetime.strptime(f["upload-time"], fmt) | ||
| except ValueError: | ||
| pass | ||
|
|
||
| if not release_date: | ||
| continue | ||
|
|
||
| file_date[version].append(release_date) | ||
|
|
||
| # The earliest upload of any file for a version counts as its release date. | ||
| first_upload = {v: min(file_date[v]) for v in file_date} | ||
|
|
||
| for ver, release_date in sorted(first_upload.items()): | ||
| drop_date = release_date + support_time | ||
| if drop_date >= cutoff: | ||
| releases[ver] = { | ||
| "release_date": release_date, | ||
| "drop_date": drop_date, | ||
| "support_by_date": release_date + plus6 | ||
| } | ||
|
|
||
| return releases | ||
|
|
||
|
|
||
| package_releases = { | ||
| "python": { | ||
| version: { | ||
| "release_date": datetime.strptime(release_date, "%b %d, %Y"), | ||
| "drop_date": datetime.strptime(release_date, "%b %d, %Y") + plus36, | ||
| "support_by_date": datetime.strptime(release_date, "%b %d, %Y") + plus6 | ||
| } | ||
| for version, release_date in py_releases.items() | ||
| } | ||
| } | ||
|
|
||
| package_releases |= {package: get_release_dates(package) for package in core_packages} | ||
|
|
||
| # Keep only the items whose drop_date is after the cutoff | ||
| package_releases = { | ||
| package: { | ||
| version: dates | ||
| for version, dates in releases.items() | ||
| if dates["drop_date"] > cutoff | ||
| } | ||
| for package, releases in package_releases.items() | ||
| } | ||
|
|
||
|
|
||
| # Save Gantt chart | ||
| # You can paste the contents into https://mermaid.live/ to generate the chart image. | ||
|
|
||
| print("Saving Mermaid chart to chart.md (render at https://mermaid.live/)") | ||
| with open("chart.md", "w") as fh: | ||
| fh.write( | ||
| """gantt | ||
| dateFormat YYYY-MM-DD | ||
| axisFormat %m / %Y | ||
| title Support Window""" | ||
| ) | ||
|
|
||
| for name, releases in package_releases.items(): | ||
| fh.write(f"\n\nsection {name}") | ||
| for version, dates in releases.items(): | ||
| fh.write( | ||
| f"\n{version} : {dates['release_date'].strftime('%Y-%m-%d')},{dates['drop_date'].strftime('%Y-%m-%d')}" | ||
| ) | ||
| fh.write("\n") | ||
|
|
||
| # Print drop schedule | ||
|
|
||
| data = [] | ||
| for k, versions in package_releases.items(): | ||
| for v, dates in versions.items(): | ||
| data.append( | ||
| ( | ||
| k, | ||
| v, | ||
| pd.to_datetime(dates["release_date"]), | ||
| pd.to_datetime(dates["drop_date"]), | ||
| pd.to_datetime(dates["support_by_date"]), | ||
| ) | ||
| ) | ||
|
|
||
| df = pd.DataFrame(data, columns=["package", "version", "release", "drop", "support_by"]) | ||
|
|
||
| df["quarter_drop"] = df["drop"].dt.to_period("Q") | ||
| df["quarter_support_by"] = df["support_by"].dt.to_period("Q") | ||
|
|
||
| dq_drop = df.set_index(["quarter_drop", "package"]).sort_index() | ||
| dq_support_by = df.set_index(["quarter_support_by", "package"]).sort_index() | ||
|
|
||
|
|
||
| print("Saving support schedule to schedule.md") | ||
|
|
||
|
|
||
| def pad_table(table): | ||
| rows = [[el.strip() for el in row.split("|")] for row in table] | ||
| col_widths = [max(map(len, column)) for column in zip(*rows)] | ||
| rows[1] = [ | ||
| el if el != "----" else "-" * col_widths[i] for i, el in enumerate(rows[1]) | ||
| ] | ||
| padded_table = [] | ||
| for row in rows: | ||
| line = "" | ||
| for entry, width in zip(row, col_widths): | ||
| if not width: | ||
| continue | ||
| line += f"| {str.ljust(entry, width)} " | ||
| line += "|" | ||
| padded_table.append(line) | ||
|
|
||
| return padded_table | ||
|
|
||
|
|
||
| def make_table(sub): | ||
| table = [] | ||
| table.append("| | | |") | ||
| table.append("|----|----|----|") | ||
| for package in sorted(set(sub.index.get_level_values(0))): | ||
| vers = sub.loc[[package]]["version"] | ||
| minv, maxv = min(vers), max(vers) | ||
| rels = sub.loc[[package]]["release"] | ||
| rel_min, rel_max = min(rels), max(rels) | ||
| version_range = str(minv) if minv == maxv else f"{minv} to {maxv}" | ||
| rel_range = ( | ||
| str(rel_min.strftime("%b %Y")) | ||
| if rel_min == rel_max | ||
| else f"{rel_min.strftime('%b %Y')} and {rel_max.strftime('%b %Y')}" | ||
| ) | ||
| table.append(f"|{package:<15}|{version_range:<19}|released {rel_range}|") | ||
|
|
||
| return pad_table(table) | ||
|
|
||
|
|
||
| def make_adopt_table(sub): | ||
| table = [] | ||
| table.append("| | | |") | ||
| table.append("|----|----|----|") | ||
| for package in sorted(set(sub.index.get_level_values(0))): | ||
| vers = sub.loc[[package]]["version"] | ||
| minv, maxv = min(vers), max(vers) | ||
| support_bys = sub.loc[[package]]["support_by"] | ||
| support_by_min, support_by_max = min(support_bys), max(support_bys) | ||
| version_range = str(minv) if minv == maxv else f"{minv} to {maxv}" | ||
| support_by_range = ( | ||
| str(support_by_min.strftime("%b %Y")) | ||
| if support_by_min == support_by_max | ||
| else f"{support_by_min.strftime('%b %Y')} and {support_by_max.strftime('%b %Y')}" | ||
| ) | ||
| table.append(f"|{package:<15}|{version_range:<19}|support by {support_by_range}|") | ||
|
|
||
| return pad_table(table) | ||
|
|
||
|
|
||
| def make_quarter(quarter, dq_drop, dq_support_by): | ||
| table = ["#### " + str(quarter).replace("Q", " - Quarter ") + ":\n"] | ||
|
|
||
| # Add new versions adoption schedule if not empty | ||
| if quarter in dq_support_by.index.get_level_values(0): | ||
| table.append("###### Adopt support for:\n") | ||
| adopt_sub = dq_support_by.loc[quarter] | ||
| adopt_table = make_adopt_table(adopt_sub) | ||
| table.extend(adopt_table) | ||
|
|
||
| table.append("\n###### Can drop support for:\n") | ||
| sub = dq_drop.loc[quarter] | ||
| table.extend(make_table(sub)) | ||
|
|
||
| return "\n".join(table) | ||
|
|
||
|
|
||
| with open("schedule.md", "w") as fh: | ||
| # We collect packages from before the current quarter and drop the first | ||
| # quarter, as some of its packages might have been filtered out depending | ||
| # on when the script was run. | ||
| tb = [] | ||
| for quarter in sorted(set(dq_drop.index.get_level_values(0)))[1:]: | ||
| tb.append(make_quarter(quarter, dq_drop, dq_support_by)) | ||
|
|
||
| fh.write("\n\n".join(tb)) | ||
| fh.write("\n") | ||
|
|
||
| ``` | ||
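The quarter-aligned cutoff in the script above is the least obvious step, so here is a standard-library-only sketch of the same calculation. The function names are hypothetical. The point it illustrates is that the cutoff is taken nine months before the start of the current quarter rather than nine months before today, so every run within the same quarter produces the same schedule.

```python
from datetime import date


def quarter_start(day: date) -> date:
    """Return the first day of the calendar quarter containing `day`."""
    first_month = (day.month - 1) // 3 * 3 + 1
    return date(day.year, first_month, 1)


def months_before(day: date, months: int) -> date:
    """Return the first of the month that lies `months` months before `day`."""
    index = day.year * 12 + (day.month - 1) - months
    return date(index // 12, index % 12 + 1, 1)


def cutoff_for(today: date) -> date:
    """Cutoff three quarters (9 months) before the start of the current quarter."""
    return months_before(quarter_start(today), 9)


# Any day in the same quarter gives the same cutoff.
print(cutoff_for(date(2024, 7, 2)), cutoff_for(date(2024, 9, 30)))
```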
|
|
||
| # Rejected Ideas | ||
| <a name="rejected-ideas"></a> | ||
| - [NEP 29](https://numpy.org/neps/nep-0029-deprecation_policy.html)'s more lenient 42-month support timeline was originally considered instead of [SPEC 0](https://scientific-python.org/specs/spec-0000/)'s 36 months, but it was ultimately decided to follow SPEC 0 because it supersedes NEP 29. | ||
| - The scope of this PHEP was originally limited to Python version support. However, it was decided that including the upstream package support policy from SPEC 0 would better promote PyHC package interoperability and avoid the need for a future separate PHEP. | ||
|
|
||
| # Open Issues | ||
| <a name="open-issues"></a> | ||
| 1. What should go in the "How to Teach This" section? Should we expand on the ideas already there or take it a different direction? | ||
|
|
||
| # Footnotes | ||
| <a name="footnotes"></a> | ||
| 1. SPEC 0: https://scientific-python.org/specs/spec-0000/ | ||
| 2. NEP 29: https://numpy.org/neps/nep-0029-deprecation_policy.html | ||
|
|
||
| # Revisions | ||
| <a name="revisions"></a> | ||
| Revision 1 (pending): Initial draft. | ||
|
|
||
| # Copyright | ||
| <a name="copyright"></a> | ||
| This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive. It should be cited as: | ||
|
|
||
| ``` | ||
| @techreport(phep3, | ||
| author = {Shawn Polson}, | ||
| title = {PyHC Python \& Upstream Package Support Policy}, | ||
| year = {2024}, | ||
| type = {PHEP}, | ||
| number = {3}, | ||
| doi = {10.5281/zenodo.xxxxxxx} | ||
| ) | ||
| ``` | ||
Do we want to list these, or simply reference SPEC0 and leave it at that? Implicitly then as SPEC0 updates the goal of this policy would update.
I also do like wording like "other dependencies which follow similar versioning schemes should be similarly supported."
I believe there's merit in being clear about which packages this policy applies to by explicitly listing them. And they do say that their list of core projects will not change rapidly. But I do like the idea of this list implicitly updating if the core Scientific Python packages ever do change, so here's what I'll do:
> At the time of writing, the upstream [core Scientific Python packages](https://scientific-python.org/specs/core-projects/) are: numpy, scipy, matplotlib, pandas, scikit-image, networkx, scikit-learn, xarray, ipython, zarr. If their core packages are updated, this policy applies to the updated list instead.

Additionally, while your wording about other dependencies sounds good, I don't want to be vague and introduce uncertainty about which packages this policy applies to. We're simply adopting SPEC 0 here, and SPEC 0 only applies to their core packages, so our policy should do the same.