Simple CLI for Apache Avro with a high-level API

guywaldman/ravro

Latest commit

7413fc5 · Jul 13, 2019

ravro

Version 0.1.0

A CLI for Apache Avro manipulations.

Screenshot

⚠ Under heavy development ⚠

Please use at your own discretion.


Installation

Compile from Source

Use cargo:

cargo build --release

Binaries

Precompiled binaries are currently available for Windows only. They can be downloaded from the releases page.

Usage

> # Retrieve all columns for a list of records
> ravro get .\test_assets\bttf.avro

+-----------+--------------+-----+
| firstName | lastName     | age |
+-----------+--------------+-----+
| Marty     | McFly        | 24  |
+-----------+--------------+-----+
| Biff      | Tannen       | 72  |
+-----------+--------------+-----+
| Emmett    | Brown        | 65  |
+-----------+--------------+-----+
| Loraine   | Baines-McFly | 62  |
+-----------+--------------+-----+

> # Search (using regular expressions)
> ravro get .\test_assets\bttf.avro --search McFly

+-----------+--------------+-----+
| firstName | lastName     | age |
+-----------+--------------+-----+
| Marty     | McFly        | 24  | # the second field will appear in bold green here
+-----------+--------------+-----+
| Loraine   | Baines-McFly | 62  | # the second field will appear in bold green here
+-----------+--------------+-----+

> # Select only some columns
> ravro get .\test_assets\bttf.avro --fields firstName age

+-----------+-----+
| firstName | age |
+-----------+-----+
| Marty     | 24  |
+-----------+-----+
| Biff      | 72  |
+-----------+-----+
| Emmett    | 65  |
+-----------+-----+
| Loraine   | 62  |
+-----------+-----+

> # Select the first 2 records
> ravro get .\test_assets\bttf*.avro --fields firstName age --take 2

+-----------+-----+
| firstName | age |
+-----------+-----+
| Marty     | 24  |
+-----------+-----+
| Biff      | 72  |
+-----------+-----+

> # Output as CSV
> ravro get .\test_assets\bttf*.avro --fields firstName age --take 2 --format csv

firstName,age
Marty,24
Biff,72

Options

  • fields (f) - The list (separated by spaces) of the fields you wish to retrieve
  • search (s) - A regular expression used to filter rows: only rows with at least one column containing a matching value are displayed. The matching fields will be highlighted
  • take (t) - The number of records you wish to retrieve
  • codec (c) - The codec for decompression - omit for no codec, or specify "deflate"
  • format (p) - The output format for the Avro data - omit for a pretty-printed table, or specify "csv" for CSV
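
To make the interplay of fields, take and the CSV format concrete, here is a minimal sketch in Rust using only the standard library. It is not ravro's actual implementation: the `Record` alias and the `record` and `to_csv` functions are hypothetical names introduced for illustration, and records are modeled as plain string maps rather than decoded Avro values.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a decoded top-level Avro record:
/// field name -> stringified value.
type Record = BTreeMap<String, String>;

/// Build a sample record with the three fields used in the usage examples.
fn record(first: &str, last: &str, age: &str) -> Record {
    let mut r = Record::new();
    r.insert("firstName".to_string(), first.to_string());
    r.insert("lastName".to_string(), last.to_string());
    r.insert("age".to_string(), age.to_string());
    r
}

/// Keep at most `take` records (all of them when `take` is `None`),
/// keep only the requested `fields` in the order given, and render
/// the result as CSV with a header row.
///
/// Assumptions of this sketch: a field missing from a record becomes an
/// empty cell, and values are written as-is with no CSV quoting or escaping.
fn to_csv(records: &[Record], fields: &[&str], take: Option<usize>) -> String {
    let limit = take.unwrap_or(records.len());

    // Header row: the selected field names.
    let mut out = fields.join(",");
    out.push('\n');

    for rec in records.iter().take(limit) {
        let row: Vec<&str> = fields
            .iter()
            .map(|f| rec.get(*f).map(String::as_str).unwrap_or(""))
            .collect();
        out.push_str(&row.join(","));
        out.push('\n');
    }
    out
}
```

Under these assumptions, calling `to_csv` on the four sample records with the fields `firstName` and `age` and `Some(2)` would produce the same three lines as the last usage example. Applying the limit before rendering mirrors how take reads in the examples, though the order in which ravro applies its options internally is not stated here.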

TODO

  • Extract CLI functionality into a library
  • Configurable display formats (CSV, JSON, etc.)
  • Avro generation from JSON
  • Schema
  • Snappy codec

Caveats

  • The schema is inferred from the first record found, which may not be desirable for some use cases
  • Only supports top-level records at the moment
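
The first caveat is easiest to see with a small example. The sketch below, in standard-library Rust, shows what inferring columns from the first record implies when later records carry extra fields. It is an illustration under assumptions, not ravro's code: `Record`, `rec`, `infer_columns` and `project` are hypothetical names, and how ravro actually treats fields absent from the first record is not documented here.

```rust
/// Hypothetical stand-in for a decoded record: ordered (field name, value) pairs.
type Record = Vec<(String, String)>;

/// Convenience constructor for sample records.
fn rec(pairs: &[(&str, &str)]) -> Record {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}

/// Infer the column names from the first record only.
/// Returns an empty list when there are no records.
fn infer_columns(records: &[Record]) -> Vec<String> {
    records
        .first()
        .map(|r| r.iter().map(|(name, _)| name.clone()).collect())
        .unwrap_or_default()
}

/// Lay out every record against the inferred columns.
/// Assumption of this sketch: fields that do not appear in the first
/// record are silently dropped, and fields missing from a later record
/// become empty cells.
fn project(records: &[Record]) -> Vec<Vec<String>> {
    let columns = infer_columns(records);
    records
        .iter()
        .map(|r| {
            columns
                .iter()
                .map(|c| {
                    r.iter()
                        .find(|(name, _)| name == c)
                        .map(|(_, value)| value.clone())
                        .unwrap_or_default()
                })
                .collect()
        })
        .collect()
}
```

With a first record containing only `firstName` and a second containing both `firstName` and `age`, `infer_columns` yields just `firstName`, so the `age` value of the second record never reaches the output. Scanning all records for the union of field names would avoid this at the cost of an extra pass over the data.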

Contributions

Are very welcome! I am by no means an expert on Spark, Avro or even Rust and there is much to be improved here.

Thanks 🙏
