Alea
Alea is a collection of utilities to work with most known probability distributions, written in pure Crystal.
Features:
- PRNGs implementations
- Random sampling (single/double precision)
- Cumulative Distribution Functions (single/double precision)
Note: This project is in development state and many distributions are still missing, as well as cumulative distribution functions, so keep in mind that breaking changes may occur frequently.
Why Crystal?
Crystal compiles to really fast native code without sacrificing any of the modern programming languages standards providing a nice and clean interface.
Installation
- Add the dependency to your
shard.yml
:
dependencies:
alea:
github: nin93/alea
- Run
shards install
Usage
require "alea"
PRNGs
The algorithms in use for generating unsigned integers are from the xoshiro (XOR/shift/rotate) collection, designed by Sebastiano Vigna and David Blackman: really fast generators promising exquisite statistical properties as well.
Currently implemented engines:
XSR128
backed by xoroshiro128++ (32/64 bit)XSR256
backed by xoshiro256++ (32/64 bit)
The digits in the class name stand for the storage of their state in bits.
Their period is thus 2^128 -1
for XSR128
and 2^256 -1
for XSR256
.
Floats are obtained by ldexp
(load exponent) operations upon generated unsigned integers.
More informations are detailed in: http://prng.di.unimi.it/.
All PRNGs in this library inherit from PRNG
. You are allowed to build your own custom
PRNG by inheriting the above parent class and defining #next_u32
and #next_u64
,
since all methods rely upon them to perform sampling:
class MyGenerator < Alea::PRNG
@state32 : StaticArray(UInt32)
@state64 : StaticArray(UInt64)
# must be implemented
def next_u32 : UInt32
# extract 32-bit integer
# update @state32
end
# must be implemented
def next_u64 : UInt64
# extract 64-bit integer
# update @state64
end
end
The above example is a rapresentation of how PRNGs are implemented and should be
built in order to be wrapped correctly by Random
and generate properly.
It is worth noting that in these implementations #next_u32
and #next_u64
depend on different states and thus they are independent from each other,
as well as #next_f32
and #next_f64
or #next_i32
and #next_i64
.
It is still fine, though, if both #next_u32
and #next_u64
rely on the same
state, if you want. I choose not to, as it makes state advancements unpredictable.
Sampling
Random
is the interface provided to perform sampling:
random = Alea::Random.new
random.normal # => -0.36790519967553736 : Float64
It also accepts an initial seed to reproduce the same seemingly random events across runs:
seed = 9377u64
random = Alea::Random.new(seed)
random.exp # => 0.10203669577353723 : Float64
By default, the PRNG in use by Random
is XSR128
. You can, though, pass the desired
engine as an argument to the constructor. Here is an example using XSR256
:
random = Alea::Random.new(Alea::XSR256)
random.float # => 0.6533582874035311 : Float64
random.prng # => Alea::XSR256
# or seeded as well
random = Alea::Random.new(193, Alea::XSR256)
random.float # => 0.4507930323670787 : Float64
Custom PRNGs can be used as well, assuming #next_u32
and #next_u64
generate uniformly
distributed unsigned integers:
# Using class from the example above
random = Alea::Random.new(MyGenerator)
random.uint 3...93 # => 73 : UInt64
random.float32 -12.2...35.453 # => 34.033405 : Float32
Unsafe methods
Plain sampling methods (such as #normal
, #gamma
) performs checks over arguments
passed to prevent bad data generation or inner exceptions.
In order to avoid them (checks might be slow) you must use their unsafe version by
prepending next_
to them:
random = Alea::Random.new
random.normal(loc: 0, sigma: 0) # raises Alea::UndefinedError: sigma is 0 or negative.
random.next_normal(loc: 0, sigma: 0) # these might raise internal exceptions.
Timings are definitely comparable, though. See the benchmarks for direct comparisons between those methods.
Supported Distributions
Current sampling methods are implemented for the following distributions:
- Beta
- Chi-Square
- Exponential
- Gamma
- Laplace
- Log-Normal
- Normal
- Poisson
- Uniform
Cumulative Distribution Functions
CDF
is the interface used to calculate the Cumulative Distribution Functions.
Given X ~ D and a fixed quantile x, CDFs are defined as the functions that
associate x to the probability that the real-valued random X from the
distribution D will take a value less or equal to x.
Arguments passed to CDF
methods to shape the distributions are analogous to
those used for sampling:
Alea::CDF.normal(0.0) # => 0.5 : Float64
Alea::CDF.normal(2.0, loc: 1.0, sigma: 0.5) # => 0.9772498680518208 : Float64
Alea::CDF.chisq(5.279, df: 5.0) # => 0.6172121213841358 : Float64
Alea::CDF.chisq32(5.279, df: 5.0) # => 0.61721206 : Float32
Supported Distributions
Current CDFs estimations are implemented for the following distributions:
- Chi-Square
- Exponential
- Gamma
- Laplace
- Log-Normal
- Normal
- Poisson
- Uniform
References
Fully listed in LICENSE.md:
- Crystal
Random
module for uniform sampling - NumPy
random
module for pseudo-random sampling methods - JuliaLang
random
module for ziggurat methods - IncGammaBeta.jl for incomplete gamma functions
Contributing
- Fork it (https://github.com/nin93/alea/fork)
- Create your feature branch (
git checkout -b my-new-feature
) - Commit your changes (
git commit -am 'Add some feature'
) - Push to the branch (
git push origin my-new-feature
) - Create a new Pull Request
Contributors
- Elia Franzella - creator and maintainer