Review (CO) #22

Open
wants to merge 14 commits into
base: main
@cdoswald commented Sep 2, 2021

Opening a pull request to add comments/feedback (will delete pull request later)

@cdoswald self-assigned this Sep 2, 2021
d) This statement is confusing to me: is it best to use FIFEforSpark in Databricks or in a Python IDE?
In either case, it's not clear how to set up and install FIFEforSpark and all of its required dependencies.
e) I get a "ModuleNotFoundError: No module named 'findspark'" error when running `import findspark`
in Databricks (using Databricks Runtime 8.3/Spark 3.1.1/Scala 2.12).
Collaborator

The 'findspark' package is only necessary when running Spark from the command line or in a Python IDE. It is not needed on Databricks, where the cluster is already preconfigured. We'll make this clearer in the README file!
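For illustration, a guarded setup along these lines would avoid the `ModuleNotFoundError` on Databricks. This is only a minimal sketch, not code from FIFEforSpark: the helper name `locate_spark` is hypothetical, and it assumes that Databricks clusters expose the `DATABRICKS_RUNTIME_VERSION` environment variable.

```python
import importlib
import os


def locate_spark(environ=None):
    """Return a short label describing how Spark was located.

    Hypothetical helper, not part of FIFEforSpark.
    """
    environ = os.environ if environ is None else environ

    # Assumption: Databricks clusters set DATABRICKS_RUNTIME_VERSION,
    # and Spark is already configured there, so findspark is skipped.
    if "DATABRICKS_RUNTIME_VERSION" in environ:
        return "databricks"

    # Outside Databricks (command line or Python IDE), fall back to findspark.
    try:
        findspark = importlib.import_module("findspark")
    except ImportError:
        return "findspark-missing"

    # findspark.init() adds the local Spark installation to sys.path.
    findspark.init()
    return "findspark"
```

Calling `locate_spark()` before `import pyspark` keeps one notebook or script usable in both environments.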

@cdoswald requested a review from EdJishWang September 9, 2021 23:45
`java.lang.NoClassDefFoundError: org/apache/spark/ml/util/MLWritable$class`; note that importing
`fifeforspark.lgb_modelers` succeeds without triggering the warning that MMLSpark could not be imported;
using 8.3 Databricks Runtime, mmlspark_2.11-1.0.0-rc3 Maven coordinates for MMLSpark; full traceback is in
[3a_error_buildmodel.txt](3a_error_buildmodel.txt)
Collaborator

I think this is an incompatibility between the cluster and the Maven coordinates: the `mmlspark_2.11` artifact is built for Scala 2.11, while the cluster runs Scala 2.12. Can you try running on Databricks Runtime 9.0 with the following Maven coordinates? com.microsoft.ml.spark:mmlspark_2.12:1.0.0-rc3-59-bf337941-SNAPSHOT
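As a rough way to catch this kind of mismatch before attaching a library, the Scala suffix in the artifact name can be compared with the cluster's Scala version. The sketch below is illustrative only: `scala_suffix` and `is_compatible` are hypothetical helpers, and the 2.11 coordinate in the usage lines is an assumed `group:artifact:version` form of the `mmlspark_2.11-1.0.0-rc3` library mentioned above.

```python
import re


def scala_suffix(coordinate):
    """Extract the Scala binary version from 'group:artifact_X.Y:version'.

    Returns None when the artifact name carries no Scala suffix.
    """
    parts = coordinate.split(":")
    if len(parts) < 3:
        raise ValueError(f"expected group:artifact:version, got {coordinate!r}")
    # The artifact name is the second field, e.g. 'mmlspark_2.12'.
    match = re.search(r"_(\d+\.\d+)$", parts[1])
    return match.group(1) if match else None


def is_compatible(coordinate, cluster_scala):
    """True when the artifact's Scala suffix equals the cluster's Scala version."""
    suffix = scala_suffix(coordinate)
    return suffix is not None and suffix == cluster_scala


# Assumed coordinate form for the library that failed on a Scala 2.12 cluster.
old = "com.microsoft.ml.spark:mmlspark_2.11:1.0.0-rc3"
# Coordinates suggested in this thread.
new = "com.microsoft.ml.spark:mmlspark_2.12:1.0.0-rc3-59-bf337941-SNAPSHOT"

print(is_compatible(old, "2.12"))
print(is_compatible(new, "2.12"))
```

The check only compares the Scala suffix; it says nothing about whether the Spark version itself is supported by the library.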

@cdoswald requested a review from EdJishWang September 10, 2021 18:17
@sonarqubecloud
Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 0 (rating A)

No coverage information
No duplication information

2 participants