Commit a5a2fe2

Changing subtitle level of packages
1 parent 6f83726 commit a5a2fe2

1 file changed: +37 -37 lines changed

web/pandas/community/ecosystem.md

Lines changed: 37 additions & 37 deletions
@@ -156,21 +156,21 @@ pd.set_option("plotting.backend", "plotly")

 ## Domain specific pandas extensions

-### [Geopandas](https://github.com/geopandas/geopandas)
+#### [Geopandas](https://github.com/geopandas/geopandas)

 Geopandas extends pandas data objects to include geographic information
 which support geometric operations. If your work entails maps and
 geographical coordinates, and you love pandas, you should take a close
 look at Geopandas.

-### [gurobipy-pandas](https://github.com/Gurobi/gurobipy-pandas)
+#### [gurobipy-pandas](https://github.com/Gurobi/gurobipy-pandas)

 gurobipy-pandas provides a convenient accessor API to connect pandas with
 gurobipy. It enables users to more easily and efficiently build mathematical
 optimization models from data stored in DataFrames and Series, and to read
 solutions back directly as pandas objects.

-### [Hail Query](https://hail.is/)
+#### [Hail Query](https://hail.is/)

 An out-of-core, preemptible-safe, distributed, dataframe library serving
 the genetics community. Hail Query ships with on-disk data formats,
@@ -185,14 +185,14 @@ native import to and export from pandas DataFrames:
 - [`Table.from_pandas`](https://hail.is/docs/latest/hail.Table.html#hail.Table.from_pandas)
 - [`Table.to_pandas`](https://hail.is/docs/latest/hail.Table.html#hail.Table.to_pandas)

-### [staircase](https://github.com/staircase-dev/staircase)
+#### [staircase](https://github.com/staircase-dev/staircase)

 staircase is a data analysis package, built upon pandas and numpy, for modelling and
 manipulation of mathematical step functions. It provides a rich variety of arithmetic
 operations, relational operations, logical operations, statistical operations and
 aggregations for step functions defined over real numbers, datetime and timedelta domains.

-### [xarray](https://github.com/pydata/xarray)
+#### [xarray](https://github.com/pydata/xarray)

 xarray brings the labeled data power of pandas to the physical sciences
 by providing N-dimensional variants of the core pandas data structures.
@@ -203,7 +203,7 @@ which pandas excels.

 ## Data IO for pandas

-### [ArcticDB](https://github.com/man-group/ArcticDB)
+#### [ArcticDB](https://github.com/man-group/ArcticDB)

 ArcticDB is a serverless DataFrame database engine designed for the Python Data Science ecosystem.
 ArcticDB enables you to store, retrieve, and process pandas DataFrames at scale.
@@ -213,21 +213,21 @@ to object storage and can be installed in seconds.

 Please find full documentation [here](https://docs.arcticdb.io/latest/).

-### [BCPandas](https://github.com/yehoshuadimarsky/bcpandas)
+#### [BCPandas](https://github.com/yehoshuadimarsky/bcpandas)

 BCPandas provides high performance writes from pandas to Microsoft SQL Server,
 far exceeding the performance of the native ``df.to_sql`` method. Internally, it uses
 Microsoft's BCP utility, but the complexity is fully abstracted away from the end user.
 Rigorously tested, it is a complete replacement for ``df.to_sql``.

-### [Deltalake](https://pypi.org/project/deltalake)
+#### [Deltalake](https://pypi.org/project/deltalake)

 The Deltalake Python package lets you access tables stored in
 [Delta Lake](https://delta.io/) natively in Python without the need to use Spark or
 JVM. It provides the ``delta_table.to_pyarrow_table().to_pandas()`` method to convert
 any Delta table into a pandas DataFrame.

-### [fredapi](https://github.com/mortada/fredapi)
+#### [fredapi](https://github.com/mortada/fredapi)

 fredapi is a Python interface to the [Federal Reserve Economic Data
 (FRED)](https://fred.stlouisfed.org/) provided by the Federal Reserve
@@ -239,7 +239,7 @@ point-in-time data from ALFRED. fredapi makes use of pandas and returns
 data in a Series or DataFrame. This module requires a FRED API key that
 you can obtain for free on the FRED website.

-### [Hugging Face](https://huggingface.co/datasets)
+#### [Hugging Face](https://huggingface.co/datasets)

 The Hugging Face Dataset Hub provides a large collection of ready-to-use
 datasets for machine learning shared by the community. The platform offers
@@ -274,7 +274,7 @@ df.to_parquet("hf://datasets/username/dataset_name/train.parquet")

 You can find more information about the Hugging Face Dataset Hub in the [documentation](https://huggingface.co/docs/hub/en/datasets).

-### [NTV-pandas](https://github.com/loco-philippe/ntv-pandas)
+#### [NTV-pandas](https://github.com/loco-philippe/ntv-pandas)

 NTV-pandas provides a JSON converter with more data types than the ones supported by pandas directly.

@@ -297,7 +297,7 @@ df = npd.read_json(jsn) # load a JSON-value as a `DataFrame`
 df.equals(npd.read_json(df.npd.to_json(df))) # `True` in any case, whether `table=True` or not
 ```

-### [pandas-datareader](https://github.com/pydata/pandas-datareader)
+#### [pandas-datareader](https://github.com/pydata/pandas-datareader)

 `pandas-datareader` is a remote data access library for pandas
 (PyPI:`pandas-datareader`). It is based on functionality that was
@@ -324,14 +324,14 @@ The following data feeds are available:
 - Stooq Index Data
 - MOEX Data

-### [pandas-gbq](https://github.com/googleapis/python-bigquery-pandas)
+#### [pandas-gbq](https://github.com/googleapis/python-bigquery-pandas)

 pandas-gbq provides high performance reads and writes to and from
 [Google BigQuery](https://cloud.google.com/bigquery/). Previously (before version 2.2.0),
 these methods were exposed as `pandas.read_gbq` and `DataFrame.to_gbq`.
 Use `pandas_gbq.read_gbq` and `pandas_gbq.to_gbq` instead.

-### [pandaSDMX](https://pandasdmx.readthedocs.io)
+#### [pandaSDMX](https://pandasdmx.readthedocs.io)

 pandaSDMX is a library to retrieve and acquire statistical data and
 metadata disseminated in [SDMX](https://sdmx.org) 2.1, an
@@ -344,7 +344,7 @@ MultiIndexed DataFrames.

 ## Scaling pandas

-### [Bodo](https://github.com/bodo-ai/Bodo)
+#### [Bodo](https://github.com/bodo-ai/Bodo)

 Bodo is a high-performance compute engine for Python data processing.
 Using an auto-parallelizing just-in-time (JIT) compiler, Bodo simplifies scaling Pandas
@@ -366,26 +366,26 @@ def process_data():
 process_data()
 ```

-### [Dask](https://docs.dask.org)
+#### [Dask](https://docs.dask.org)

 Dask is a flexible parallel computing library for analytics. Dask
 provides a familiar `DataFrame` interface for out-of-core, parallel and
 distributed computing.

-### [Ibis](https://ibis-project.org/docs/)
+#### [Ibis](https://ibis-project.org/docs/)

 Ibis offers a standard way to write analytics code that can be run in
 multiple engines. It helps in bridging the gap between local Python environments
 (like pandas) and remote storage and execution systems like Hadoop components
 (like HDFS, Impala, Hive, Spark) and SQL databases (Postgres, etc.).

-### [Koalas](https://koalas.readthedocs.io/en/latest/)
+#### [Koalas](https://koalas.readthedocs.io/en/latest/)

 Koalas provides a familiar pandas DataFrame interface on top of Apache
 Spark. It enables users to leverage multi-cores on one machine or a
 cluster of machines to speed up or scale their DataFrame code.

-### [Modin](https://github.com/modin-project/modin)
+#### [Modin](https://github.com/modin-project/modin)

 The ``modin.pandas`` DataFrame is a parallel and distributed drop-in replacement
 for pandas. This means that you can use Modin with existing pandas code or write
@@ -404,21 +404,21 @@ df = pd.read_csv("big.csv") # use all your cores!

 ## Data cleaning and validation for pandas

-### [Pandera](https://pandera.readthedocs.io/en/stable/)
+#### [Pandera](https://pandera.readthedocs.io/en/stable/)

 Pandera provides a flexible and expressive API for performing data validation on dataframes
 to make data processing pipelines more readable and robust.
 Dataframes contain information that pandera explicitly validates at runtime. This is useful in
 production-critical data pipelines or reproducible research settings.

-### [pyjanitor](https://github.com/pyjanitor-devs/pyjanitor)
+#### [pyjanitor](https://github.com/pyjanitor-devs/pyjanitor)

 Pyjanitor provides a clean API for cleaning data, using method chaining.


 ## Development tools for pandas

-### [Hamilton](https://github.com/dagworks-inc/hamilton)
+#### [Hamilton](https://github.com/dagworks-inc/hamilton)

 Hamilton is a declarative dataflow framework that came out of Stitch Fix. It was
 designed to help one manage a Pandas code base, specifically with respect to
@@ -436,13 +436,13 @@ This helps one to scale your pandas code base, at the same time, keeping mainten

 For more information, see [documentation](https://hamilton.readthedocs.io/).

-### [IPython](https://ipython.org/documentation.html)
+#### [IPython](https://ipython.org/documentation.html)

 IPython is an interactive command shell and distributed computing
 environment. IPython tab completion works with Pandas methods and also
 attributes like DataFrame columns.

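As a rough illustration of why column names can show up in tab completion, the sketch below checks which labels pandas exposes through `dir()`, which attribute-based completers typically consult. The column names are made up for the example, and the exact completion behavior depends on the IPython version and completer in use.

```python
import pandas as pd

# Hypothetical columns: one is a valid Python identifier, one is not.
df = pd.DataFrame({"price": [1.5, 2.0], "unit cost": [1.0, 1.2]})

# pandas adds column labels that are valid identifiers to dir(df),
# which is what lets a completer suggest them after "df.".
completions = dir(df)
has_price = "price" in completions          # valid identifier
has_unit_cost = "unit cost" in completions  # contains a space

# Attribute access returns the same Series as item access.
same_series = df.price.equals(df["price"])
```

Columns whose names are not valid identifiers, such as `"unit cost"` here, remain reachable only through `df["unit cost"]`.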
-### [Jupyter Notebook / Jupyter Lab](https://jupyter.org)
+#### [Jupyter Notebook / Jupyter Lab](https://jupyter.org)

 Jupyter Notebook is a web application for creating Jupyter notebooks. A
 Jupyter notebook is a JSON document containing an ordered list of
@@ -460,7 +460,7 @@ or may not be compatible with non-HTML Jupyter output formats.)
 See [Options and Settings](https://pandas.pydata.org/docs/user_guide/options.html)
 for pandas `display.` settings.

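A minimal sketch of one such `display.` setting follows. It uses the plain-text `repr` rather than notebook HTML output so that it runs anywhere, and the frame and the limit of 6 rows are arbitrary choices for the example.

```python
import pandas as pd

# Arbitrary 100-row frame used only for illustration.
df = pd.DataFrame({"value": range(100)})

before = pd.get_option("display.max_rows")

# option_context applies the setting only inside the with-block.
with pd.option_context("display.max_rows", 6):
    truncated = repr(df)

after = pd.get_option("display.max_rows")
```

Here `truncated` holds a shortened rendering of the frame, and the previous `display.max_rows` value is restored once the block exits.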
-### [marimo](https://marimo.io)
+#### [marimo](https://marimo.io)

 marimo is a reactive notebook for Python and SQL that enhances productivity
 when working with dataframes. It provides several features to make data
@@ -479,7 +479,7 @@ manipulation and visualization more interactive and fun:
 6. SQL integration: marimo allows users to write SQL queries against any
 pandas dataframes existing in memory.

-### [pandas-stubs](https://github.com/VirtusLab/pandas-stubs)
+#### [pandas-stubs](https://github.com/VirtusLab/pandas-stubs)

 While the pandas repository is partially typed, the package itself doesn't expose this information for external use.
 Install pandas-stubs to enable basic type coverage of the pandas API.
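
As a hedged sketch of what such type coverage is for, the annotated function below is the kind of code a static type checker (for example mypy, assumed here) could check against the stubs. The function and the `score` column are hypothetical, and the annotations have no effect at runtime.

```python
import pandas as pd


def top_scores(df: pd.DataFrame, n: int = 2) -> pd.Series:
    """Return the n largest values of the (hypothetical) 'score' column."""
    return df["score"].nlargest(n)


scores = top_scores(pd.DataFrame({"score": [3, 9, 5]}))
```

With stubs installed, a checker can flag a call such as `top_scores([3, 9, 5])`, which passes a list where a `DataFrame` is annotated.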
@@ -489,7 +489,7 @@ Learn more by reading through these issues [14468](https://github.com/pandas-dev

 See installation and usage instructions on the [GitHub page](https://github.com/VirtusLab/pandas-stubs).

-### [Spyder](https://www.spyder-ide.org/)
+#### [Spyder](https://www.spyder-ide.org/)

 Spyder is a cross-platform PyQt-based IDE combining the editing,
 analysis, debugging and profiling functionality of a software
@@ -518,14 +518,14 @@ both automatically and on-demand.

 ## Other related libraries

-### [Compose](https://github.com/alteryx/compose)
+#### [Compose](https://github.com/alteryx/compose)

 Compose is a machine learning tool for labeling data and prediction engineering.
 It allows you to structure the labeling process by parameterizing
 prediction problems and transforming time-driven relational data into
 target values with cutoff times that can be used for supervised learning.

-### [D-Tale](https://github.com/man-group/dtale)
+#### [D-Tale](https://github.com/man-group/dtale)

 D-Tale is a lightweight web client for visualizing pandas data structures. It
 provides a rich spreadsheet-style grid which acts as a wrapper for a lot of
@@ -544,20 +544,20 @@ D-Tale integrates seamlessly with Jupyter notebooks, Python terminals, Kaggle
 & Google Colab. Here are some demos of the
 [grid](http://alphatechadmin.pythonanywhere.com/dtale/main/1).

-### [Featuretools](https://github.com/alteryx/featuretools/)
+#### [Featuretools](https://github.com/alteryx/featuretools/)

 Featuretools is a Python library for automated feature engineering built
 on top of pandas. It excels at transforming temporal and relational
 datasets into feature matrices for machine learning using reusable
 feature engineering "primitives". Users can contribute their own
 primitives in Python and share them with the rest of the community.

-### [IPython Vega](https://github.com/vega/ipyvega)
+#### [IPython Vega](https://github.com/vega/ipyvega)

 [IPython Vega](https://github.com/vega/ipyvega) leverages
 [Vega](https://github.com/vega/vega) to create plots within Jupyter Notebook.

-### [plotnine](https://github.com/has2k1/plotnine/)
+#### [plotnine](https://github.com/has2k1/plotnine/)

 Hadley Wickham's [ggplot2](https://ggplot2.tidyverse.org/) is a
 foundational exploratory visualization package for the R language. Based
@@ -568,7 +568,7 @@ generate bespoke plots of any kind of data.
 Various implementations in other languages are available.
 A good implementation for Python users is [has2k1/plotnine](https://github.com/has2k1/plotnine/).

-### [pygwalker](https://github.com/Kanaries/pygwalker)
+#### [pygwalker](https://github.com/Kanaries/pygwalker)

 PyGWalker is an interactive data visualization and
 exploratory data analysis tool built upon Graphic Walker
@@ -582,7 +582,7 @@ import pygwalker as pyg
 pyg.walk(df)
 ```

-### [seaborn](https://seaborn.pydata.org)
+#### [seaborn](https://seaborn.pydata.org)

 Seaborn is a Python visualization library based on
 [matplotlib](https://matplotlib.org). It provides a high-level,
@@ -599,13 +599,13 @@ import seaborn as sns
 sns.set_theme()
 ```

-### [skrub](https://skrub-data.org)
+#### [skrub](https://skrub-data.org)

 Skrub facilitates machine learning on dataframes. It bridges pandas
 to scikit-learn and related libraries. In particular, it facilitates building
 features from dataframes.

-### [Statsmodels](https://www.statsmodels.org/)
+#### [Statsmodels](https://www.statsmodels.org/)

 Statsmodels is the prominent Python "statistics and econometrics
 library" and it has a long-standing special relationship with pandas.
@@ -614,7 +614,7 @@ modeling functionality that is out of pandas' scope. Statsmodels
 leverages pandas objects as the underlying data container for
 computation.

-### [STUMPY](https://github.com/TDAmeritrade/stumpy)
+#### [STUMPY](https://github.com/TDAmeritrade/stumpy)

 STUMPY is a powerful and scalable Python library for modern time series analysis.
 At its core, STUMPY efficiently computes something called a
