All commands assume you are in the root directory of this project. For me, that looks like `~/repos/chronon`.
Add the following to your shell run command files, e.g., `~/.bashrc` (primarily for SBT users):

```shell
export CHRONON_OS=<path/to/chronon/repo>
export CHRONON_API=$CHRONON_OS/api/py
alias materialize="PYTHONPATH=$CHRONON_API:$PYTHONPATH $CHRONON_API/ai/chronon/repo/compile.py"
```
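As a quick sanity check that the paths line up, the variables above expand like this (a minimal sketch; `~/repos/chronon` is just an example clone location, not a requirement):

```shell
# Example expansion, assuming the repo was cloned to ~/repos/chronon
export CHRONON_OS="$HOME/repos/chronon"
export CHRONON_API="$CHRONON_OS/api/py"

# The materialize alias then runs compile.py from this path:
echo "$CHRONON_API/ai/chronon/repo/compile.py"
```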
This step is relevant to both Bazel and SBT users. Thrift is a dependency for compile. The latest version, 0.14, is incompatible with Hive Metastore. Install version 0.13:

```shell
brew tap cartman-kai/thrift
brew install thrift@0.13
```

Install the Python build tooling:

```shell
python3 -m pip install -U tox build
```
This project supports both Bazel and SBT. Bazel's hermetic nature simplifies setup compared to SBT, as it doesn't require extensive environment configuration. Choose the system that works best for your workflow.
- Bazel is hermetic and does not require additional environment setup, except for the installation of `thrift@0.13`.
- Ensure you have a `WORKSPACE` file and `BUILD` files in the appropriate directories.
- Common Bazel Commands:
  - Build all targets:

    ```shell
    bazel build //...
    ```

  - Build specific targets:

    ```shell
    bazel build //module:target
    bazel build //aggregator:aggregator
    bazel build //api:api-lib
    bazel build //online:online
    bazel build //service:service
    bazel build //spark:spark
    ```

  - Run tests:

    ```shell
    bazel test //...
    ```

  - Run specific tests:

    ```shell
    bazel test //module:TestName
    bazel test //aggregator:test
    bazel test //api:test
    ```

  - Build a fat jar:

    ```shell
    bazel build //module:deploy.jar
    bazel build //spark:spark-assembly_deploy.jar
    ```
- Open the project in IntelliJ from the `build.sbt` file (at the root level).
- Configure IntelliJ:
  - Mark these directories as `Sources Root`:
    - `aggregator/src/main/scala`
    - `api/src/main/scala`
    - `spark/src/main/scala`
  - Mark these directories as `Test Sources Root`:
    - `aggregator/src/test/scala`
    - `api/src/test/scala`
    - `spark/src/test/scala`
- Common SBT Commands:
  - Build all artifacts:

    ```shell
    sbt package
    ```

  - Build a fat jar:

    ```shell
    sbt assembly
    ```

  - Run all tests:

    ```shell
    sbt test
    ```

  - Run specific tests:

    ```shell
    sbt "testOnly *<Module>"
    ```
Generate the Python Thrift bindings (with SBT or Bazel, respectively):

```shell
sbt py_thrift
bazel build //api/thrift:api-models-py
```
- Update Maven Dependencies:
  - Locate the `jvm/<repo>_repo.bzl` file (e.g., `jvm/maven_repo.bzl`)
  - Add or update the dependency declaration:

    ```python
    artifacts = [
        "org.apache.thrift:libthrift:0.13.0",
    ]
    ```
- Reference in BUILD files:
  - Add the dependency to your target's `deps` attribute:

    ```python
    scala_library(
        name = "my_library",
        srcs = glob(["src/main/scala/**/*.scala"]),
        deps = [
            "@maven//:org_example_library",  # Note: colons become underscores
            # Other dependencies...
        ],
    )
    ```
- Refresh Bazel's dependency cache:

  ```shell
  bazel clean --expunge
  bazel build //...
  ```
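To see how a Maven coordinate turns into the `@maven//:...` label used in `deps`, here is a small sketch of the naming rule. It assumes the repo follows the `rules_jvm_external` convention (drop the version, then map dots, colons, and dashes in `group:artifact` to underscores):

```shell
# Sketch of the rules_jvm_external label naming rule:
# keep group:artifact, drop the version, map '.', ':', '-' to '_'.
coordinate="org.apache.thrift:libthrift:0.13.0"
name=$(printf '%s' "$coordinate" | cut -d: -f1-2 | tr '.:' '__' | tr -- '-' '_')
echo "@maven//:$name"   # @maven//:org_apache_thrift_libthrift
```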
- Update `requirements.txt`:
  - Add your dependency with a version constraint:

    ```
    requests==2.28.1
    numpy>=1.21.0
    ```
- Update pip dependencies:

  ```shell
  bazel run //:pip.update
  ```
- Reference in BUILD files:

  ```python
  py_library(
      name = "my_python_lib",
      srcs = glob(["*.py"]),
      deps = [
          requirement("requests"),
          requirement("numpy"),
      ],
  )
  ```
Materialize a configuration with the `materialize` alias defined earlier:

```shell
materialize --input_path=<path/to/conf>
```
Run all tests (Bazel):

```shell
bazel test //...
```

Run a specific test module:

```shell
bazel test //module:SpecificTest
```

Run all tests (SBT):

```shell
sbt test
```

Run specific tests:

```shell
sbt "testOnly *<Module>"
```
Graph view of dependencies (Bazel):

Bazel's dependency graph is hermetic and reproducible. It ensures that all dependencies are defined explicitly, avoiding issues caused by system-level or implicit dependencies. This contrasts with SBT, which relies on configuration files and environment settings to resolve dependencies.

```shell
bazel query 'deps(//module:target)' --output graph
```

Graph-based view of dependencies (SBT):

```shell
sbt dependencyBrowseGraph
```

Tree-based view of dependencies:

```shell
sbt dependencyBrowseTree
```
Default settings are defined in `.bazelrc`.
Build all artifacts:

```shell
bazel build //...
```

Build a specific artifact:

```shell
bazel build //module:artifact_name
```

Build a Scala-version-specific artifact:

```shell
bazel build --config scala_2.12 //module:artifact_name
```

Build a Spark-version-specific artifact:

```shell
bazel build --config spark_3.5 //module:artifact_name
```

Build a deploy jar (aka uber jar or fat jar):

```shell
bazel build --config scala_2.13 --config spark_3.5 //spark:spark-assembly_deploy.jar
```
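The `--config` flags above select named groups of options from `.bazelrc`. The fragment below illustrates the mechanism only; the config names match the commands above, but the flag values are assumptions, and the real settings live in this repo's `.bazelrc`:

```
# Illustrative .bazelrc sketch -- not this repo's actual contents.
# `bazel build --config scala_2.12 ...` applies every `build:scala_2.12` line.
build:scala_2.12 --define=scala_version=2.12
build:scala_2.13 --define=scala_version=2.13
build:spark_3.5  --define=spark_version=3.5
```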
Build all artifacts:

```shell
sbt package
```

Build the Python API:

```shell
sbt python_api
```

Build a fat jar:

```shell
sbt assembly
```

Publish all artifacts:

```shell
sbt publish
```

Publish to a custom repository:

```shell
bazel run //module:publish
```

Generate documentation via Sphinx:

```shell
sbt sphinx
```
For Bazel-specific troubleshooting, refer to the Bazel documentation: https://bazel.build
For SBT-specific troubleshooting, refer to the SBT documentation: https://www.scala-sbt.org