Alea

Build Status Crystal Shard Docs License

Alea is a collection of utilities to work with most known probability distributions, written in pure Crystal.

Features:

Note: This project is in development state and many distributions are still missing, as well as cumulative distribution functions, so keep in mind that breaking changes may occur frequently.

Why Crystal?

Crystal compiles to really fast native code without sacrificing any of the modern programming languages standards providing a nice and clean interface.

Installation

  1. Add the dependency to your shard.yml:
  dependencies:
    alea:
      github: nin93/alea
  1. Run shards install

Usage

require "alea"

PRNGs

The algorithms in use for generating unsigned integers are from the xoshiro (XOR/shift/rotate) collection, designed by Sebastiano Vigna and David Blackman: really fast generators promising exquisite statistical properties as well.

Currently implemented engines:

The digits in the class name stand for the storage of their state in bits. Their period is thus 2^128 -1 for XSR128 and 2^256 -1 for XSR256.

Floats are obtained by ldexp (load exponent) operations upon generated unsigned integers.

More informations are detailed in: http://prng.di.unimi.it/.

All PRNGs in this library inherit from PRNG. You are allowed to build your own custom PRNG by inheriting the above parent class and defining #next_u32 and #next_u64, since all methods rely upon them to perform sampling:

class MyGenerator < Alea::PRNG
  @state32 : StaticArray(UInt32)
  @state64 : StaticArray(UInt64)

  # must be implemented
  def next_u32 : UInt32
    # extract 32-bit integer
    # update @state32
  end

  # must be implemented
  def next_u64 : UInt64
    # extract 64-bit integer
    # update @state64
  end
end

The above example is a rapresentation of how PRNGs are implemented and should be built in order to be wrapped correctly by Random and generate properly.

It is worth noting that in these implementations #next_u32 and #next_u64 depend on different states and thus they are independent from each other, as well as #next_f32 and #next_f64 or #next_i32 and #next_i64. It is still fine, though, if both #next_u32 and #next_u64 rely on the same state, if you want. I choose not to, as it makes state advancements unpredictable.

Sampling

Random is the interface provided to perform sampling:

random = Alea::Random.new
random.normal # => -0.36790519967553736 : Float64

It also accepts an initial seed to reproduce the same seemingly random events across runs:

seed = 9377u64
random = Alea::Random.new(seed)
random.exp # => 0.10203669577353723 : Float64

By default, the PRNG in use by Random is XSR128. You can, though, pass the desired engine as an argument to the constructor. Here is an example using XSR256:

random = Alea::Random.new(Alea::XSR256)
random.float # => 0.6533582874035311 : Float64
random.prng  # => Alea::XSR256

# or seeded as well
random = Alea::Random.new(193, Alea::XSR256)
random.float # => 0.4507930323670787 : Float64

Custom PRNGs can be used as well, assuming #next_u32 and #next_u64 generate uniformly distributed unsigned integers:

# Using class from the example above
random = Alea::Random.new(MyGenerator)
random.uint 3...93            # => 73 : UInt64
random.float32 -12.2...35.453 # => 34.033405 : Float32

Unsafe methods

Plain sampling methods (such as #normal, #gamma) performs checks over arguments passed to prevent bad data generation or inner exceptions. In order to avoid them (checks might be slow) you must use their unsafe version by prepending next_ to them:

random = Alea::Random.new
random.normal(loc: 0, sigma: 0)      # raises Alea::UndefinedError: sigma is 0 or negative.
random.next_normal(loc: 0, sigma: 0) # these might raise internal exceptions.

Timings are definitely comparable, though. See the benchmarks for direct comparisons between those methods.

Supported Distributions

Current sampling methods are implemented for the following distributions:

Cumulative Distribution Functions

CDF is the interface used to calculate the Cumulative Distribution Functions. Given X ~ D and a fixed quantile x, CDFs are defined as the functions that associate x to the probability that the real-valued random X from the distribution D will take a value less or equal to x.

Arguments passed to CDF methods to shape the distributions are analogous to those used for sampling:

Alea::CDF.normal(0.0)                       # => 0.5 : Float64
Alea::CDF.normal(2.0, loc: 1.0, sigma: 0.5) # => 0.9772498680518208 : Float64
Alea::CDF.chisq(5.279, df: 5.0)             # => 0.6172121213841358 : Float64
Alea::CDF.chisq32(5.279, df: 5.0)           # => 0.61721206 : Float32

Supported Distributions

Current CDFs estimations are implemented for the following distributions:

References

Fully listed in LICENSE.md:

Contributing

  1. Fork it (https://github.com/nin93/alea/fork)
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

Contributors