[Spark] Add Filters section for Java #226

Merged
merged 6 commits into from
Jan 17, 2025

Conversation

Contributor

@rachel-mack rachel-mack commented Jan 15, 2025

Pull Request Info

PR Reviewing Guidelines

JIRA - https://jira.mongodb.org/browse/DOCSP-33990
Staging - https://deploy-preview-226--docs-spark-connector.netlify.app/batch-mode/batch-read/#filters

Self-Review Checklist

  • Is this free of any warnings or errors in the RST?
  • Did you run a spell-check?
  • Did you run a grammar-check?
  • Are all the links working?
  • Are the facets and meta keywords accurate?


netlify bot commented Jan 15, 2025

Deploy Preview for docs-spark-connector ready!

Name Link
🔨 Latest commit 86cd0ba
🔍 Latest deploy log https://app.netlify.com/sites/docs-spark-connector/deploys/678a80f56629790008a3f975
😎 Deploy Preview https://deploy-preview-226--docs-spark-connector.netlify.app


.. code-block:: java

   dataFrame.getInteger("qty").gte(of(10))
Member

I would recommend keeping the example similar to the Python and Scala examples, so something like:

df.filter(df.col("gte").gte(30))

See example here: https://spark.apache.org/docs/latest/api/java/index.html?org/apache/spark/sql/Dataset.html

Contributor Author

Yes, this is the same example that's used on the Python tab of this section.

I've updated the syntax to use col()

source/java/filters.rst (outdated, resolved)
@rachel-mack rachel-mack requested a review from rozza January 16, 2025 17:36
@rustagir rustagir self-requested a review January 16, 2025 21:20
Contributor

@rustagir rustagir left a comment

a few small things!

@@ -0,0 +1,32 @@
.. include:: /includes/pushed-filters.rst

You can use `Java Aggregation Expressions <https://www.mongodb.com/docs/drivers/java/sync/upcoming/fundamentals/aggregation-expression-operations/>`__ to filter your data.
Contributor

S: implement line breaks for readability

Contributor

S: use an rst target such as the driver one


.. include:: /includes/example-load-dataframe.rst

First, create a DataFrame to connect with your default MongoDB data source:
Contributor

Suggested change
- First, create a DataFrame to connect with your default MongoDB data source:
+ First, create a DataFrame to connect to your default MongoDB data source:

Comment on lines 12 to 15
.format("mongodb")
.option("database", "food")
.option("collection", "fruit")
.load();
Contributor

S: not sure about indentation here but should these be pushed in?

.option("collection", "fruit")
.load();

The following example includes only records in which the ``qty`` field is greater than or equal to ``10``:
Contributor

Suggested change
- The following example includes only records in which the ``qty`` field is greater than or equal to ``10``:
+ The following example retrieves only records in which the value of the ``qty`` field is greater than or equal to ``10``:

Comment on lines +19 to +32
.. code-block:: java

   df.filter(df.col("qty").gte(10))

The operation outputs the following:

.. code-block:: none

   +---+----+------+
   |_id| qty|  type|
   +---+----+------+
   |2.0|10.0|orange|
   |3.0|15.0|banana|
   +---+----+------+
Contributor

S: convert into IO code block?

Contributor Author

Ya, this formatting was odd to me too, but I'm matching the formatting of the other tabs and other examples on this page. I think I'll open a separate PR to fix them all at once after this is merged. (There's like 10+ on this page and I don't want to clutter up this PR).
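
For anyone following this thread without a Spark session at hand, here is a minimal plain-Java sketch of what the ``qty >= 10`` filter does. It is not Spark or connector code: the ``QtyFilterSketch`` class, the ``Fruit`` type, and the ``qtyAtLeast`` helper are hypothetical names, and the apple row is assumed sample data added so the filter has something to exclude. Only the orange and banana rows come from the output shown in this PR.

```java
import java.util.List;
import java.util.stream.Collectors;

public class QtyFilterSketch {

    // Hypothetical stand-in for one document in the food.fruit collection.
    public static final class Fruit {
        public final double id;
        public final double qty;
        public final String type;

        public Fruit(double id, double qty, String type) {
            this.id = id;
            this.qty = qty;
            this.type = type;
        }
    }

    // Assumed sample data: the apple row is an assumption; the other two
    // rows mirror the output table quoted in the PR.
    public static final List<Fruit> SAMPLE = List.of(
        new Fruit(1.0, 5.0, "apple"),
        new Fruit(2.0, 10.0, "orange"),
        new Fruit(3.0, 15.0, "banana"));

    // Plain-Java analogue of the predicate df.col("qty").gte(min):
    // a row is kept when qty >= min, so the bound itself is included.
    public static List<Fruit> qtyAtLeast(List<Fruit> rows, double min) {
        return rows.stream()
            .filter(f -> f.qty >= min)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Print the rows that survive the filter, one per line.
        for (Fruit f : qtyAtLeast(SAMPLE, 10)) {
            System.out.println(f.id + " " + f.qty + " " + f.type);
        }
    }
}
```

The orange row survives because ``gte`` is inclusive: a ``qty`` of exactly ``10`` satisfies the condition, which matches the two rows in the output block above.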

Member

@rozza rozza left a comment

LGTM!

@rachel-mack rachel-mack requested a review from rustagir January 17, 2025 16:15
Contributor

@rustagir rustagir left a comment

lgtm!

@rachel-mack rachel-mack merged commit 028ddc1 into mongodb:master Jan 17, 2025
5 checks passed
@rachel-mack rachel-mack deleted the DOCSP-33990 branch January 17, 2025 18:14