lalaguozhe
  • dianping.com
  • shanghai

Organizations

@dianping @dp-bigdata

View Parquet files online

Rust 106 3 Updated Feb 6, 2025

Fluss is a streaming storage built for real-time analytics.

Java 992 240 Updated Feb 18, 2025

Dataframes powered by a multithreaded, vectorized query engine, written in Rust

Rust 31,892 2,077 Updated Feb 18, 2025

Python implementation of the Parquet columnar file format.

Python 808 180 Updated Nov 12, 2024

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Python 3,643 204 Updated Feb 14, 2025

Blazing-fast query execution engine that speaks the Apache Spark language and has Arrow-DataFusion at its core.

Rust 1,399 135 Updated Feb 18, 2025

Search engine principles

1,581 132 Updated Apr 19, 2024

Java 179 81 Updated Feb 14, 2025

A QoS-based scheduling system that brings optimal layout and status to workloads such as microservices, web services, big data jobs, and AI jobs.

Go 1,426 342 Updated Feb 18, 2025

OpenAI API client in Java

Java 4,800 1,221 Updated Jun 6, 2024

BibiGPT v1 · One-click AI summary for audio/video & chat with learning content: Bilibili | YouTube | Tweet | TikTok | Dropbox | Google Drive | Local files | Websites | Podcasts | Meetings | Lectures, etc. 音视…

TypeScript 5,439 719 Updated Feb 17, 2024

🔬 Online Heap Dump, GC Log, Thread Dump & JFR File Analyzer.

Java 589 100 Updated Feb 17, 2025

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

Scala 3,363 548 Updated Feb 12, 2025

Flowchart for debugging Spark applications

Shell 104 27 Updated Sep 25, 2024

A better notebook for Scala (and more)

Jupyter Notebook 4,545 397 Updated Jan 29, 2025

A query predictor pipeline and service to predict resource usages of Presto queries

Python 15 5 Updated May 2, 2023

Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.

Java 919 309 Updated Feb 18, 2025

LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.

Java 2,506 402 Updated Feb 14, 2025

Warp is a modern, Rust-based terminal with AI built in so you and your team can build great software, faster.

22,148 392 Updated Feb 12, 2025

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.

Scala 1,264 462 Updated Feb 18, 2025

The easy-to-use open source Business Intelligence and Embedded Analytics tool that lets everyone work with data 📊

Clojure 40,854 5,379 Updated Feb 18, 2025

🔥 An open-source BI tool that anyone can use; an open-source alternative to Tableau and FanRuan.

Java 19,352 3,462 Updated Feb 18, 2025

The Workload Analyzer collects Presto® and Trino workload statistics, and analyzes them

Python 135 38 Updated Oct 25, 2023

Data Lineage Tracking And Visualization Solution

Scala 611 156 Updated Feb 6, 2025

Databricks Scala Coding Style Guide

2,757 584 Updated Apr 5, 2024

Readings in Databases

7,776 908 Updated Sep 9, 2024

Web UI for Trino, Hive and SparkSQL

Java 634 200 Updated Oct 2, 2023

An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.

Scala 424 115 Updated Jan 14, 2022

Cloud Native DataOps & AIOps Platform

Java 1,843 409 Updated Apr 11, 2024

Spark SQL index for Parquet tables

Scala 134 35 Updated May 6, 2021