@@ -239,17 +239,123 @@ sugar required for fast analysis of data.
## How to read this document

+ The API specification itself can be found under {ref}`api-specification`.

+ For guidance on how to read and understand the type annotations included in
+ this specification, consult the Python
+ [documentation](https://docs.python.org/3/library/typing.html).


+ (how-to-adopt-this-api)=
## How to adopt this API

+ Most, if not all, existing dataframe libraries will find something in this API
+ standard that is incompatible with a current implementation, and that they cannot
+ change due to backwards compatibility concerns. Therefore we expect that each
+ of those libraries will want to offer a standard-compliant API in a _new
+ namespace_. The question then becomes: how does a user access this namespace?

+ The simplest method is to document the import that gives direct access to the
+ namespace (e.g. `import package_name.dataframe_api`). This has two issues,
+ though:

+ 1. Dataframe-consuming libraries that want to support multiple dataframe
+    libraries then have to explicitly import each library.
+ 2. It is difficult to _version_ the dataframe API standard implementation (see
+    {ref}`api-versioning`).

- ## Definitions
+ To address both issues, a conforming implementation must provide a uniform
+ way to access the API namespace, namely a [method on the dataframe
+ object](DataFrame.__dataframe_namespace__):

+ ```
+ xp = x.__dataframe_namespace__()
+ ```

+ The method must accept one keyword argument, `api_version=None`, which makes it
+ possible to request a specific API version:

+ ```
+ xp = x.__dataframe_namespace__(api_version='2023.04')
+ ```
+
+ The `xp` namespace must contain all functionality specified in
+ {ref}`api-specification`. The namespace may contain other functionality; however,
+ including additional functionality is not recommended, as doing so may hinder
+ portability and interoperation of dataframe libraries within user code.
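
As a rough sketch of the implementer's side, a library might expose the namespace as shown below. The names `MyDataFrame` and `_SUPPORTED_VERSIONS` are hypothetical, a `types.SimpleNamespace` stands in for the real namespace module, and its `api_version` attribute exists only for illustration. Treating `None` as "latest supported version" and raising `ValueError` for an unknown version are assumptions of this sketch, not requirements stated in this section.

```python
from types import SimpleNamespace

# Hypothetical: versions of the standard this illustrative library supports,
# ordered from oldest to newest.
_SUPPORTED_VERSIONS = ("2023.04",)


class MyDataFrame:
    """Illustrative stand-in for a library's standard-compliant dataframe."""

    def __dataframe_namespace__(self, api_version=None):
        # Assumption: None selects the newest supported version.
        if api_version is None:
            api_version = _SUPPORTED_VERSIONS[-1]
        # Assumption: an unsupported version is reported with ValueError.
        if api_version not in _SUPPORTED_VERSIONS:
            raise ValueError(f"unsupported dataframe API version: {api_version!r}")
        # A real library would return its namespace module here; the
        # SimpleNamespace keeps this sketch self-contained.
        return SimpleNamespace(api_version=api_version)


xp = MyDataFrame().__dataframe_namespace__(api_version="2023.04")
```

Returning the namespace from a method on the object means that consuming code never has to import the implementing library by name.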
+
+ ### Checking a dataframe object for compliance
+
+ Dataframe-consuming libraries are likely to want a mechanism for determining
+ whether a provided dataframe is specification-compliant. The recommended
+ approach is to check whether the dataframe object has
+ a `__dataframe_namespace__` attribute, as this is the one distinguishing
+ feature of a specification-compliant dataframe object.
+
+ Checking for a `__dataframe_namespace__` attribute can be implemented as a
+ small utility function similar to the following.
+
+ ```python
+ def is_dataframe_api_obj(x):
+     return hasattr(x, '__dataframe_namespace__')
+ ```
+
+
+ ### Discoverability of conforming implementations
+
+ It may be useful to have a way to discover all packages in a Python
+ environment which provide a conforming dataframe API implementation, and the
+ namespace in which each implementation resides.
+ To assist dataframe-consuming libraries which need to create dataframes originating
+ from multiple conforming dataframe implementations, or developers who want to perform,
+ for example, cross-library testing, libraries may provide an
+ {pypa}`entry point <specifications/entry-points/>` in order to make a dataframe API
+ namespace discoverable.
+
+ :::{admonition} Optional feature
+ Given that entry points typically require an implementation specific to the
+ build system and package installer, this standard chooses to recommend rather than
+ mandate providing an entry point.
+ :::
+
+ The following code is an example of how one can discover installed
+ conforming libraries:
+
+ ```python
+ import sys
+ from importlib.metadata import entry_points
+
+ if sys.version_info >= (3, 10):
+     # The dict interface for entry_points() is deprecated in py3.10,
+     # supplanted by a new select interface.
+     eps = entry_points(group='dataframe_api')
+ else:
+     eps = entry_points().get('dataframe_api', [])
+
+ # Raises StopIteration if 'package_name' provides no such entry point.
+ ep = next(ep for ep in eps if ep.name == 'package_name')
+ xp = ep.load()
+ ```
+
+ An entry point must have the following properties:
+
+ - **group**: equal to `dataframe_api`.
+ - **name**: equal to the package name.
+ - **object reference**: equal to the dataframe API namespace import path.
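
To enumerate every provider rather than look one up by name, the entry points in the `dataframe_api` group can be collected into a mapping. This is a minimal sketch: the function name `discover_dataframe_api_entry_points` is hypothetical, and it returns entry points without loading them, so that no implementing library is imported until the caller asks for it.

```python
import sys
from importlib.metadata import entry_points


def discover_dataframe_api_entry_points(group="dataframe_api"):
    """Map each entry point name in ``group`` to its unloaded entry point."""
    if sys.version_info >= (3, 10):
        # Select interface, available from Python 3.10.
        eps = entry_points(group=group)
    else:
        # Before Python 3.10, entry_points() returns a dict keyed by group.
        eps = entry_points().get(group, [])
    # If two distributions register the same name, the later one wins here.
    return {ep.name: ep for ep in eps}


providers = discover_dataframe_api_entry_points()
```

Calling `providers[name].load()` would then import the namespace that the named package registered; in an environment without conforming libraries the mapping is simply empty.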
+
+
+ * * *
+
+ ## Conformance
+
+ A conforming implementation of the dataframe API standard must provide and
+ support all the functions, arguments, data types, syntax, and semantics
+ described in this specification.
+
+ A conforming implementation of the dataframe API standard may provide
+ additional values, objects, properties, data types, and functions beyond those
+ described in this specification.
+
+ Libraries which aim to provide a conforming implementation but haven't yet
+ completed such an implementation may, and are encouraged to, provide details on
+ the level of (non-)conformance. For details on how to do this, see
+ [Verification - measuring conformance](verification_test_suite.md).

- ## References