@@ -22,7 +22,7 @@ makes many powerful array operations possible:
dimensions (known in numpy as "broadcasting") based on dimension names,
regardless of their original order.
- Flexible split-apply-combine operations with groupby:
- `x.groupby('time.dayofyear').apply(lambda y: y - y.mean())`.
+ `x.groupby('time.dayofyear').mean()`.
- Database-like alignment based on coordinate labels that smoothly
handles missing values: `x, y = xray.align(x, y, join='outer')`.
- Keep track of arbitrary metadata in the form of a Python dictionary:
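A short, hedged sketch of the groupby and alignment operations listed above (the arrays `x` and `y` and their shared `'time'` coordinate are assumptions for illustration; how best to construct them is still in flux, per the anticipated API changes below):

    import xray

    # Assume x and y are existing DataArray objects that share a datetime
    # 'time' coordinate (hypothetical data, for illustration only).

    # Split-apply-combine: group by day of year, then average within groups.
    daily_climatology = x.groupby('time.dayofyear').mean()

    # Label-based alignment: reindex both arrays onto the outer join (union)
    # of their coordinate labels, introducing missing values where needed.
    x_aligned, y_aligned = xray.align(x, y, join='outer')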
@@ -38,9 +38,10 @@ Because **xray** implements the same data model as the NetCDF file format,
xray datasets have a natural and portable serialization format. But it's
also easy to robustly convert an xray `DataArray` to and from a numpy
`ndarray` or a pandas `DataFrame` or `Series`, providing compatibility with
- the full [scientific-python ecosystem][scipy].
+ the full [PyData ecosystem][pydata].

[pandas]: http://pandas.pydata.org/
+ [pydata]: http://pydata.org/
[scipy]: http://scipy.org/
[ndarray]: http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html
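A hedged sketch of the round trip described above; the helper names `to_series` and `from_series` are assumptions for illustration, so consult the [docs][docs] for the exact conversion API:

    import numpy as np
    import xray

    # Assume x is an existing one-dimensional DataArray (hypothetical data).

    # To a plain numpy ndarray, dropping labels and metadata.
    values = np.asarray(x)

    # To a pandas Series indexed by the coordinate labels, and back again
    # (assumed helper names).
    series = x.to_series()
    roundtrip = xray.DataArray.from_series(series)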
@@ -143,43 +144,34 @@ labeled numpy arrays that provided some guidance for the design of xray.
- Be fast. There shouldn't be a significant overhead for metadata-aware
manipulation of n-dimensional arrays, as long as the arrays are large
enough. The goal is to be as fast as pandas or raw numpy.
- - Provide a uniform API for loading and saving scientific data in a variety
- of formats (including streaming data).
- - Take a pragmatic approach to metadata (attributes), and be very cautious
- before implementing any functionality that relies on it. Automatically
- maintaining attributes is tricky and very hard to get right (see
- discussion about Iris above).
+ - Support loading and saving labeled scientific data in a variety of formats
+ (including streaming data).

## Getting started

- For more details, see the **[full documentation][docs]** (still a work in
- progress) or the source code. **xray** is rapidly maturing, but it is still in
- its early development phase. ***Expect the API to change.***
+ For more details, see the **[full documentation][docs]**, particularly the
+ **[tutorial][tutorial]**.
xray requires Python 2.7 and recent versions of [numpy][numpy] (1.8.0 or
later) and [pandas][pandas] (0.13.1 or later). [netCDF4-python][nc4],
[pydap][pydap] and [scipy][scipy] are optional: they add support for reading
and writing netCDF files and/or accessing OpenDAP datasets. We plan to
- eventually support Python 3 but aren't there yet. The easiest way to get any
- of these dependencies installed from scratch is to use [Anaconda][anaconda].
+ eventually support Python 3 but aren't there yet.

- xray is not yet available on the Python package index (prior to its initial
- release). For now, you need to install it from source:
+ You can install xray from PyPI with pip:

- git clone https://github.com/akleeman/xray.git
- # WARNING: this will automatically upgrade numpy & pandas if necessary!
- pip install -e xray
-
- Don't forget to `git fetch` regular updates!
+ pip install xray
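Once installed, a minimal session might look like the sketch below; the file name `example.nc`, the `tmin` variable, and its `time` coordinate are hypothetical, and reading netCDF files requires one of the optional backends mentioned above:

    import xray

    # Open a netCDF file as an xray Dataset (hypothetical file name).
    ds = xray.open_dataset('example.nc')

    # Dict-style access to a variable (assumed variable name).
    tmin = ds['tmin']

    # The split-apply-combine operation from the feature list above.
    print(tmin.groupby('time.dayofyear').mean())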
[docs]: http://xray.readthedocs.org/
+ [tutorial]: http://xray.readthedocs.org/en/latest/tutorial.html
[numpy]: http://www.numpy.org/
[pydap]: http://www.pydap.org/
[anaconda]: https://store.continuum.io/cshop/anaconda/

## Anticipated API changes

- Aspects of the API that we currently intend to change:
+ Aspects of the API that we currently intend to change in future versions of
+ xray:

- The constructor for `DataArray` objects will probably change, so that it
is possible to create new `DataArray` objects without putting them into a
@@ -192,19 +184,10 @@ Aspects of the API that we currently intend to change:
dimensional arrays.
- Future versions of xray will add better support for working with datasets
too big to fit into memory, probably by wrapping libraries like
- [blaze][blaze]/[blz][blz] or [biggus][biggus]. More immediately:
- - Array indexing will be made lazy, instead of immediately creating an
- ndarray. This will make it easier to subsample from very large Datasets
- incrementally using the `indexed` and `labeled` methods. We might need to
- add a special method to allow for explicitly caching values in memory.
- - We intend to support `Dataset` objects linked to NetCDF or HDF5 files on
- disk to allow for incremental writing of data.
-
- Once we get the API in a state we're comfortable with and improve the
- documentation, we intend to release version 0.1. Our target is to do so before
- the xray talk on May 3, 2014 at [PyData Silicon Valley][pydata].
-
- [pydata]: http://pydata.org/sv2014/
+ [blaze][blaze]/[blz][blz] or [biggus][biggus]. More immediately, we intend
+ to support `Dataset` objects linked to NetCDF or HDF5 files on disk to
+ allow for incremental writing of data.
+

[blaze]: https://github.com/ContinuumIO/blaze/
[blz]: https://github.com/ContinuumIO/blz
[biggus]: https://github.com/SciTools/biggus