module Iterator(T)

Overview

Adds support for lazily pulling objects from JSON Lines formatted content.

Included Modules

Defined in:

jsonl/iterator.cr

Constructors

new(pull : JSON::LinesPullParser)

Class Method Summary

from_jsonl(string_or_io)

Instance methods inherited from module Enumerable(T)

to_jsonl(io : IO) : Nil
to_jsonl : String

Constructor Detail

def self.new(pull : JSON::LinesPullParser)

Creates a new iterator that iterates over JSON lines. See also Iterator.from_jsonl.

WARNING The JSON::LinesPullParser can't be used by anything else until the iterator is fully consumed.

Class Method Detail

def self.from_jsonl(string_or_io)

Lazily reads JSON Lines content into an iterator. This makes it possible to process large amounts of JSON lines without requiring the whole data set to fit into memory.

The following example produces a huge file and uses a lot of CPU, but should not require much memory.

struct Entry
  include JSON::Serializable

  getter value : Int32

  def initialize(@value)
  end
end

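# Lazily generate the entries; nothing is materialized here.
iter = (0..1_000_000_000).each.map do |value|
  Entry.new(value)
end

# Stream the entries to the file, one JSON object per line.
File.open("/tmp/test.jsonl", "w+") do |f|
  iter.to_jsonl(f)
end

# Read the file back lazily, skipping every entry except the last one.
File.open("/tmp/test.jsonl", "r") do |f|
  p Iterator(Entry).from_jsonl(f).skip(1_000_000_000).to_a
end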

WARNING The string_or_io can't be used by anything else until the iterator is fully consumed.
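
A smaller sketch of the same lazy behavior, reading from a String instead of a file. It assumes the shard is loaded with require "jsonl" (a guess based on the source file location) and uses a hypothetical Point struct for illustration. Iterator#first is the standard library method, so only the first two lines need to be pulled from the input.

```crystal
require "json"
require "jsonl" # assumption: require path of the shard

# Hypothetical type used only for this illustration.
struct Point
  include JSON::Serializable

  getter x : Int32
  getter y : Int32
end

# Three JSON lines held in memory as a String.
input = <<-JSONL
  {"x": 1, "y": 2}
  {"x": 3, "y": 4}
  {"x": 5, "y": 6}
  JSONL

# Take only the first two entries; the iterator is consumed lazily.
points = Iterator(Point).from_jsonl(input).first(2).to_a
p points.map(&.x)
```

Because the iterator here is not fully consumed, the warning above still applies to an IO passed in the same way.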