A dependency-free tool that enables you to control how to tokenize, transform and handle files with char(s) separated values.
Not sure if I am going to publish this to Clojars but if you are using tools.deps, you can just add following to deps.edn to add it to your project.
{:deps
{github-oneness/csvx
{:git/url "https://github.com/oneness/csvx"
:sha "4d4ac868a0c6bdf9a5a6c32076303f1171c74db4"}}}
;; Then require from your repl like this:
(require '[csvx.core :as csvx])
;; You can just use the one public fn `readx` without any options to parse csv:
(csvx/readx "resources/100-sales-records.csv")
;; Note that csvx/readx takes optional arg where you can pass in following
;; options: (listed values here are defaults if no option is given See src/csvx/core.clj
;; for details).
{:encoding "UTF-8"
:max-lines-to-read Integer/MAX_VALUE
:line-tokenizer (fn [^String line]
(try (.split line ",")
(catch Exception e
(println :malformed-line line)
;; (strace/print-stack-trace e) ;; handle exception here
nil)))
:line-transformer #(map-indexed hash-map %)}
Say you have a giant json file that you need to parse into Clojure map, all you need to do is to to write line tokenizer and transformer like this to get it working from repl. Remember to pass in how many lines you want to read in while you are experimenting with your repl as not to blow it up:
(defn decode-json [^String file-path]
(readx file-path ;;"sample.json"
{:max-lines-to-read 1
:line-tokenizer (fn [line]
(map #(.split ^String % ":")
(-> (clojure.string/replace line #"\{|\}" "")
(.split ","))))
:line-transformer (fn [line]
(reduce (fn [acc [k v]]
(merge acc
{(-> k read-string keyword) (read-string v)}))
{}
line))}))
git clone https://github.com/oneness/csvx
clojure -X:test