Description
Expected behavior
When following the installation instructions to use Sedona with Databricks and calling ST_H3CellIDs, I expect to get the H3 cell IDs of the given polygon. As an example, I run this SQL query:

```sql
SELECT ST_H3CellIDs(ST_GeomFromText('POLYGON((0 0, 10 0, 10 10, 0 10, 0 0))'), 12, FALSE)
```

Actual behavior
I get this error:

```
NoSuchMethodException: java.lang.NoSuchMethodError: com.uber.h3core.H3Core.polygonToCells(Ljava/util/List;Ljava/util/List;I)Ljava/util/List;
```
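A NoSuchMethodError like this usually means an older com.uber.h3 3.x jar is shadowing the 4.x one somewhere on the classpath (in h3-java 3.x the method was called polyfill; polygonToCells was introduced in 4.0). As a rough diagnostic sketch, one could scan the cluster's jar directory for competing copies of the class — the /databricks/jars path is an assumption based on my setup:

```shell
#!/bin/bash
# Diagnostic sketch: list every jar on the cluster that bundles com.uber.h3core.H3Core.
# /databricks/jars is an assumed location; adjust for your environment.
for jar in /databricks/jars/*.jar; do
  if unzip -l "$jar" 2>/dev/null | grep -q 'com/uber/h3core/H3Core.class'; then
    echo "H3Core found in: $jar"
  fi
done
```

If more than one jar shows up, the JVM may load the older class first, and Sedona's call to polygonToCells would then fail exactly like this.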
Steps to reproduce the problem
I used both the pip installation route and the pure SQL setup on Databricks:
- https://sedona.apache.org/1.5.0/setup/install-python/
- https://sedona.apache.org/1.5.0/setup/databricks/#install-sedona-from-the-init-script

Both result in the same error.
Settings
- Environment: Azure Databricks
- Databricks runtime: 13.3 LTS
- Operating System: Ubuntu 22.04.2 LTS
- Java: Zulu 8.70.0.23-CA-linux64
- Scala: 2.12.15
- Python: 3.10.12
- R: 4.2.2
- Delta Lake: 2.4.0
Thoughts
I thought H3 might not be included in the shaded version, so I also tried adding h3-4.1.1.jar via the init script, but this doesn't solve the issue either.
I finally used these scripts:
Download jars
```bash
# Create JAR directory for Sedona
mkdir -p /dbfs/FileStore/sedona/jars

# Remove existing jars from the directory
rm -f /dbfs/FileStore/sedona/jars/*.jar

# Download the dependencies from Maven into DBFS
curl -o /dbfs/FileStore/sedona/jars/geotools-wrapper-1.5.0-28.2.jar "https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/1.5.0-28.2/geotools-wrapper-1.5.0-28.2.jar"
curl -o /dbfs/FileStore/sedona/jars/sedona-spark-shaded-3.4_2.12-1.5.0.jar "https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.4_2.12/1.5.0/sedona-spark-shaded-3.4_2.12-1.5.0.jar"
curl -o /dbfs/FileStore/sedona/jars/h3-4.1.1.jar "https://repo1.maven.org/maven2/com/uber/h3/4.1.1/h3-4.1.1.jar"
```

Create init script
```bash
# Create init script
cat > /dbfs/FileStore/sedona/scripts/sedona-init.sh <<'EOF'
#!/bin/bash
#
# File: sedona-init.sh
#
# On cluster startup, this script copies the Sedona jars to the cluster's default jar directory.
# To activate Sedona functions, add the Sedona extensions to your Spark configuration:
# "spark.sql.extensions org.apache.sedona.viz.sql.SedonaVizExtensions,org.apache.sedona.sql.SedonaSqlExtensions"
cp /dbfs/FileStore/sedona/jars/*.jar /databricks/jars
EOF
```

All the other Sedona functions work, so Sedona itself is installed properly; I am only unable to use the H3 functions.
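To rule out a corrupt or wrong download, one could also verify that the downloaded h3 jar really exposes the 4.x method Sedona calls. This is a sketch assuming javap (which ships with the JDK) is available and the jar sits at the path used in the download script above:

```shell
#!/bin/bash
# Sketch: print H3Core's polygon methods from the downloaded jar.
# h3-java 4.x exposes polygonToCells; the 3.x API used polyfill instead.
JAR=/dbfs/FileStore/sedona/jars/h3-4.1.1.jar
if [ -f "$JAR" ]; then
  javap -classpath "$JAR" com.uber.h3core.H3Core | grep -iE 'polygonToCells|polyfill'
else
  echo "jar not found: $JAR"
fi
```

If polygonToCells shows up here but the error persists, the runtime is most likely loading a different (older) h3 jar first.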
Did I miss a step in the setup? I checked the documentation multiple times but couldn't find any clue. I hope someone can help me out.