annotation Chem::RegisterFormat

Overview

Registers a file format.

The annotated type provides the implementation of a file format that reads and/or writes a given type (the encoded type). The format name is taken from the last component of the annotated type's fully qualified name (e.g., Baz for Foo::Bar::Baz). The declared extensions and file patterns are used for format detection via Chem.guess_format.
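As a rough illustration of the naming and detection rules above, the following self-contained sketch derives a format name from a fully qualified type name and checks a filename against a set of extensions and file patterns. The helpers format_name and matches_format? are hypothetical and are not part of Chem; the actual logic behind Chem.guess_format may differ.

```crystal
# Hypothetical helper: last component of a fully qualified type name.
def format_name(type_name : String) : String
  type_name.split("::").last
end

# Hypothetical helper: true if the file's base name has one of the given
# extensions or matches one of the given file patterns.
def matches_format?(filename : String, ext : Array(String), names : Array(String)) : Bool
  basename = File.basename(filename)
  ext.includes?(File.extname(basename)) ||
    names.any? { |pattern| File.match?(pattern, basename) }
end

puts format_name("Foo::Bar::Baz")                    # prints Baz
puts matches_format?("a.foo", %w(.foo), %w(foo_*))   # prints true
puts matches_format?("foo_1", %w(.foo), %w(foo_*))   # prints true
puts matches_format?("bar.txt", %w(.foo), %w(foo_*)) # prints false
```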

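The annotation accepts the following named arguments:

ext: an array of file extensions, including the leading dot (e.g., %w(.foo)).
names: an array of file name patterns (e.g., %w(foo_*)).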

The format module must define at least one of:

The encoded type is inferred from the type restrictions and return type of the methods, so both must be declared explicitly. The methods must accept an IO as the first argument. Use the define_file_overload macro to generate overloads that accept Path | String instead of IO.
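To make the IO versus Path | String distinction concrete, here is a minimal hand-written sketch of what a path-accepting overload typically does: open the file and delegate to the IO-based method. The module Demo and its String return type are hypothetical, and the exact code emitted by define_file_overload is not shown here and may differ.

```crystal
module Demo
  # IO-based method: the first argument is an IO and the return type is declared.
  def self.read(io : IO) : String
    io.gets_to_end
  end

  # Hand-written path overload (assumed equivalent of a generated one):
  # opens the file and delegates to the IO-based method above.
  def self.read(path : Path | String) : String
    File.open(path) { |io| read(io) }
  end
end

# Usage: write a temporary file, then read it back through both overloads.
tempfile = File.tempfile("demo", ".txt") { |file| file.print "hello" }
puts Demo.read(tempfile.path)        # prints hello
puts Demo.read(IO::Memory.new("hi")) # prints hi
tempfile.delete
```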

Convenience read (.from_* and .read) and write (#to_* and #write) methods are generated on the encoded types at compile time using the type information from the methods. Additionally, convenience read and write methods are generated on Array for file formats that can hold multiple entries (indicated by the definition of .read_all and .write(io, Array)).

Example

The following code registers the Foo format, which matches *.foo and foo_* files. The module defines .each, .read, .read_all, and .write to read instances of A and Array(A), and to write A and Array(A). It also defines a read method for B instances (.read_info).

record A
record B

@[Chem::RegisterFormat(ext: %w(.foo), names: %w(foo_*))]
module Foo
  def self.each(io : IO, & : A ->) : Nil
    loop do
      yield read(io)
    rescue IO::EOFError
      break
    end
  end

  def self.read(io : IO) : A
    A.new
  end

  def self.read_all(io : IO) : Array(A)
    entries = [] of A
    each(io) { |entry| entries << entry }
    entries
  end

  def self.read_info(io : IO) : B
    B.new
  end

  def self.write(io : IO, instance : A) : Nil
    # ...
  end

  def self.write(io : IO, instances : Array(A), info : B) : Nil
    # write header based on info
    instances.each { |instance| write(io, instance) }
  end
end

The convenience A.from_foo and A.read methods are generated at compile time to create an A instance from an IO or file using the Foo file format. When no format is given, A.read guesses it from the filename.

# static read methods (can forward arguments to Foo.read)
A.from_foo(IO::Memory.new) # => A()
A.from_foo("a.foo")        # => A()

# dynamic read methods (format is detected at runtime; no arguments)
A.read(IO::Memory.new, Foo) # => A()
A.read("a.foo", Foo)        # => A()
A.read("a.foo")             # => A()

The above methods are also created on the B type.

Similar to the read methods, A#to_foo and A#write are generated to write an A instance to an IO or file using the Foo file format.

# static write methods (can forward arguments to Foo.write)
A.new.to_foo                 # returns a string representation
A.new.to_foo(IO::Memory.new) # writes to an IO
A.new.to_foo("a.foo")        # writes to a file

# dynamic write methods (format is detected at runtime; no arguments)
A.new.write(IO::Memory.new, Foo)
A.new.write("a.foo", Foo)
A.new.write("a.foo")
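For intuition, the three #to_foo call shapes above behave roughly like the hand-written overloads in the sketch below. This is only an approximation under stated assumptions: the names Entry, Demo, and #to_demo are hypothetical, the body of Demo.write is a placeholder, and the code actually generated by the annotation is not reproduced here.

```crystal
struct Entry
end

module Demo
  # Placeholder writer: a real format would serialize the instance's data.
  def self.write(io : IO, instance : Entry) : Nil
    io << "entry"
  end
end

struct Entry
  # Writes to an IO by delegating to the format module.
  def to_demo(io : IO) : Nil
    Demo.write(io, self)
  end

  # Writes to a file by opening it and delegating to the IO overload.
  def to_demo(path : Path | String) : Nil
    File.open(path, "w") { |io| to_demo(io) }
  end

  # Returns a string representation built through the IO overload.
  def to_demo : String
    String.build { |io| to_demo(io) }
  end
end

puts Entry.new.to_demo # prints entry
```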

Since Foo reads and writes multiple entries (indicated by .read_all and .write(io, Array(A))), the .from_foo, .read, #to_foo, and #write methods are also generated on Array at compile time.

Array(A).from_foo(IO::Memory.new)  # => [A(), ...]
Array(A).from_foo("a.foo")         # => [A(), ...]
Array(A).read(IO::Memory.new, Foo) # => [A(), ...]
# and other overloads

Array(A).new.to_foo                    # returns a string representation
Array(A).new.to_foo(IO::Memory.new)    # writes to an IO
Array(A).new.to_foo("a.foo")           # writes to a file
Array(A).new.write IO::Memory.new, Foo # writes to an IO
# and other overloads

Calling any of these methods on an array of an unsupported type produces a missing method error at compile time.

Refer to the implementations of the supported file formats (e.g., PDB and XYZ) for real examples.

NOTE Method overloading may not work as expected in some cases. If two methods have the same name and the same required arguments (even if their optional arguments differ), only the last overload is taken into account, and trying to call the first one results in a missing method error at compile time.

Example:

# Both methods require input, but baz is optional
module Foo
  def self.read(input : String, baz : Bool = false)
    "1"
  end

  def self.read(input : String)
    "2"
  end
end

Foo.read "foo"            # => "2"
Foo.read "foo", baz: true # Missing method error

# Setting baz as required makes the two overloads different
module Foo
  def self.read(input : String, baz : Bool)
    "1"
  end

  def self.read(input : String)
    "2"
  end
end

Foo.read "foo"            # => "2"
Foo.read "foo", baz: true # => "1"

Defined in:

chem/register_format.cr