Removed experimental flag.
PiperOrigin-RevId: 421709721
tensorflower-gardener authored and copybara-github committed Jan 14, 2022
1 parent 4675fd8 commit 53e8888
Showing 1 changed file with 18 additions and 32 deletions.
50 changes: 18 additions & 32 deletions site/en/guide/data.ipynb
@@ -2572,7 +2572,7 @@
"id": "GxLAr-7p0ATX"
},
"source": [
"To use `tf.data.experimental.sample_from_datasets` pass the datasets, and the weight for each:"
"To use `tf.data.Dataset.sample_from_datasets` pass the datasets, and the weight for each:"
]
},
{
@@ -2583,7 +2583,7 @@
},
"outputs": [],
"source": [
"balanced_ds = tf.data.experimental.sample_from_datasets(\n",
"balanced_ds = tf.data.Dataset.sample_from_datasets(\n",
" [negative_ds, positive_ds], [0.5, 0.5]).batch(10)"
]
},
@@ -2623,15 +2623,15 @@
"id": "kZ9ezkK6irMD"
},
"source": [
"One problem with the above `experimental.sample_from_datasets` approach is that\n",
"it needs a separate `tf.data.Dataset` per class. Using `Dataset.filter`\n",
"works, but results in all the data being loaded twice.\n",
"One problem with the above `Dataset.sample_from_datasets` approach is that\n",
"it needs a separate `tf.data.Dataset` per class. You could use `Dataset.filter`\n",
"to create those two datasets, but that results in all the data being loaded twice.\n",
"\n",
"The `data.experimental.rejection_resample` function can be applied to a dataset to rebalance it, while only loading it once. Elements will be dropped from the dataset to achieve balance.\n",
"The `data.Dataset.rejection_resample` method can be applied to a dataset to rebalance it, while only loading it once. Elements will be dropped from the dataset to achieve balance.\n",
"\n",
"`data.experimental.rejection_resample` takes a `class_func` argument. This `class_func` is applied to each dataset element, and is used to determine which class an example belongs to for the purposes of balancing.\n",
"`data.Dataset.rejection_resample` takes a `class_func` argument. This `class_func` is applied to each dataset element, and is used to determine which class an example belongs to for the purposes of balancing.\n",
"\n",
"The elements of `creditcard_ds` are already `(features, label)` pairs. So the `class_func` just needs to return those labels:"
"The goal here is to balance the label distribution, and the elements of `creditcard_ds` are already `(features, label)` pairs. So the `class_func` just needs to return those labels:"
]
},
{
@@ -2646,34 +2646,15 @@
" return label"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "DdKmE8Jumlp0"
},
"source": [
"The resampler also needs a target distribution, and optionally an initial distribution estimate:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "9tv0tWNxmkzM"
},
"outputs": [],
"source": [
"resampler = tf.data.experimental.rejection_resample(\n",
" class_func, target_dist=[0.5, 0.5], initial_dist=fractions)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "YxJrOZVToGuE"
},
"source": [
"The resampler deals with individual examples, so you must `unbatch` the dataset before applying the resampler:"
"The resampling method deals with individual examples, so in this case you must `unbatch` the dataset before applying that method.\n",
"\n",
"The method needs a target distribution, and optionally an initial distribution estimate as inputs."
]
},
{
@@ -2684,7 +2665,12 @@
},
"outputs": [],
"source": [
"resample_ds = creditcard_ds.unbatch().apply(resampler).batch(10)"
"resample_ds = (\n",
" creditcard_ds\n",
" .unbatch()\n",
" .rejection_resample(class_func, target_dist=[0.5,0.5],\n",
" initial_dist=fractions)\n",
" .batch(10))"
]
},
{
@@ -2693,7 +2679,7 @@
"id": "L-HnC1s8idqV"
},
"source": [
"The resampler returns creates `(class, example)` pairs from the output of the `class_func`. In this case, the `example` was already a `(feature, label)` pair, so use `map` to drop the extra copy of the labels:"
"The `rejection_resample` method returns `(class, example)` pairs where the `class` is the output of the `class_func`. In this case, the `example` was already a `(feature, label)` pair, so use `map` to drop the extra copy of the labels:"
]
},
{
Expand Down
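
Note on the resampling step touched by this change: the notebook text says that rejection resampling drops elements to reach a target class distribution and returns `(class, example)` pairs. The idea can be sketched in plain Python, without TensorFlow. This is an illustration only, not TensorFlow's implementation: the helper names `acceptance_probs` and `rejection_resample_sketch`, the toy data, and the 90/10 class split are all assumptions made for the example.

```python
import random


def acceptance_probs(initial_dist, target_dist):
    """Per-class keep probability (hypothetical helper, not a TF API).

    Each class is kept in proportion to target/initial, scaled so that the
    most under-represented class is always kept.
    """
    ratios = [t / i for t, i in zip(target_dist, initial_dist)]
    max_ratio = max(ratios)
    return [r / max_ratio for r in ratios]


def rejection_resample_sketch(examples, class_func, target_dist,
                              initial_dist, seed=0):
    """Yield (class, example) pairs, randomly dropping over-represented classes."""
    rng = random.Random(seed)
    probs = acceptance_probs(initial_dist, target_dist)
    for example in examples:
        cls = class_func(example)
        # Keep the element with its class's acceptance probability.
        if rng.random() < probs[cls]:
            yield cls, example


def class_func(example):
    # Elements are (features, label) pairs, so the class is just the label.
    features, label = example
    return label


# Toy, made-up data: 900 negatives (label 0) and 100 positives (label 1).
toy_examples = [((i,), 0) for i in range(900)] + [((i,), 1) for i in range(100)]
fractions = [0.9, 0.1]

probs = acceptance_probs(fractions, [0.5, 0.5])
resampled = list(rejection_resample_sketch(
    toy_examples, class_func, target_dist=[0.5, 0.5], initial_dist=fractions))

# Drop the extra copy of the label, mirroring the notebook's final `map` step.
balanced = [example for _, example in resampled]
negatives = sum(1 for _, label in balanced if label == 0)
positives = sum(1 for _, label in balanced if label == 1)
```

In this sketch the minority class is always kept and the majority class is thinned to roughly the same count, which is why the balanced output is much smaller than the input.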