module DomainUtilv2
Overview
DomainUtilv2 is a utility module for extracting domain names from hostnames.
It is a reimplementation of the original DomainUtil module with better
performance.
Author: Maheep Kumar (technusm1)
Defined in:
domain_util_v2.cr
Constant Summary
- Log = ::Log.for("DomainUtilv2")
- SUFFIX_URL = "https://publicsuffix.org/list/public_suffix_list.dat"
  The URL for the Mozilla public suffixes list
- TLD_URL = "https://www.iana.org/domains/root/db"
  The URL for the IANA TLD extensions list
Class Method Summary
- .backoff_factor : Float64
  see #update_tlds or #update_suffixes
- .backoff_factor=(backoff_factor : Float64)
  see #update_tlds or #update_suffixes
- .backoff_time : Time::Span
  see #update_tlds or #update_suffixes
- .backoff_time=(backoff_time : Time::Span)
  see #update_tlds or #update_suffixes
- .retry_count : Int32
  see #update_tlds or #update_suffixes
- .retry_count=(retry_count : Int32)
  see #update_tlds or #update_suffixes
- .strip_subdomains(hostname : String, tld_only = false) : String
  Extracts the domain name from hostname using the public suffixes database to identify the portion of hostname that is a public suffix.
- .strip_suffix(hostname : String, tld_only = false) : String
  Removes the domain extension / suffix from the end of the specified hostname.
- .suffixes : Set(String)
  Contains the public suffix database from Mozilla as a set of registerable domain extensions (com, com.mx, etc.).
- .tld_extensions : Set(String)
  Contains the TLD database as a set of top-level domain extensions (com, net, etc.).
- .update_suffixes(retry_count : Int32 = self.retry_count, backoff_time : Time::Span = self.backoff_time, backoff_factor : Float64 = self.backoff_factor)
  Updates the Mozilla public suffixes database by downloading and parsing data from #SUFFIX_URL.
- .update_tlds(retry_count : Int32 = self.retry_count, backoff_time : Time::Span = self.backoff_time, backoff_factor : Float64 = self.backoff_factor)
  Updates the TLD extensions database by downloading and parsing HTML from #TLD_URL.
Class Method Detail
.strip_subdomains(hostname : String, tld_only = false) : String
Extracts the domain name from hostname using the public suffixes
database to identify the portion of hostname that is a public suffix.
The next token to the left is returned (with its suffix) as the domain
name. This effectively strips subdomains from an arbitrary hostname.
If tld_only is set to true, only top-level domains according to IANA
will be used (meaning "co.uk" would be the detected domain for
hostnames like site.co.uk). By default the Mozilla public suffixes
database is used.
arguments:
- hostname (String): the hostname to extract the domain name from.
- tld_only (optional, Boolean): if set to true, only top-level domains according to IANA will be used.
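The matching rule described above can be sketched in a few lines of Crystal. This is a minimal illustration, not the module's actual implementation: `sketch_strip_subdomains` is a hypothetical helper, the suffix sets are small stand-ins for `.suffixes` and `.tld_extensions`, and the longest-match-first strategy is an assumption.

```crystal
# Hypothetical helper, not part of DomainUtilv2. Finds the longest suffix of
# `hostname` that is present in `suffixes` and keeps one more label to its left.
def sketch_strip_subdomains(hostname : String, suffixes : Set(String)) : String
  labels = hostname.downcase.split(".")
  # Drop one leading label at a time, so longer candidate suffixes are tried first.
  (1...labels.size).each do |i|
    candidate = labels[i..].join(".")
    if suffixes.includes?(candidate)
      # Keep the label immediately left of the matched suffix.
      return labels[(i - 1)..].join(".")
    end
  end
  # No known suffix matched: return the input unchanged.
  hostname
end

# Stand-in data; the real sets are filled by update_suffixes / update_tlds.
public_suffixes = Set{"com", "uk", "co.uk"}
iana_tlds = Set{"com", "uk"}

puts sketch_strip_subdomains("www.site.co.uk", public_suffixes) # site.co.uk
puts sketch_strip_subdomains("www.site.co.uk", iana_tlds)       # co.uk
```

The second call mirrors the tld_only = true case: with only "uk" known as a suffix, the label to its left is "co", so "co.uk" is reported as the domain.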
.strip_suffix(hostname : String, tld_only = false) : String
Removes the domain extension / suffix from the end of the specified
hostname. Follows the same options and semantics as #strip_subdomains.
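As a rough illustration of suffix removal, the sketch below drops the longest matching suffix from the end of a hostname. `sketch_strip_suffix` is a hypothetical helper and the suffix sets are stand-in data. Two behaviors here are assumptions rather than documented facts: subdomains to the left of the suffix are left in place, and at least one label is always kept.

```crystal
# Hypothetical helper, not part of DomainUtilv2. Removes the longest suffix of
# `hostname` that is present in `suffixes`, leaving everything to its left.
def sketch_strip_suffix(hostname : String, suffixes : Set(String)) : String
  labels = hostname.downcase.split(".")
  # Start at 1 so that at least one label always remains (an assumption).
  (1...labels.size).each do |i|
    candidate = labels[i..].join(".")
    # labels[0, i] is the first i labels, i.e. everything before the suffix.
    return labels[0, i].join(".") if suffixes.includes?(candidate)
  end
  # No known suffix matched: return the input unchanged.
  hostname
end

public_suffixes = Set{"com", "uk", "co.uk"} # stand-in for .suffixes
iana_tlds = Set{"com", "uk"}                # stand-in for .tld_extensions

puts sketch_strip_suffix("site.co.uk", public_suffixes) # site
puts sketch_strip_suffix("site.co.uk", iana_tlds)       # site.co
```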
.suffixes : Set(String)
Contains the public suffix database from Mozilla as a set
of registerable domain extensions (com, com.mx, etc.). This
set is a superset of #tld_extensions and all registerable
domain names.
#update_suffixes must be called before this set will be populated.
.tld_extensions : Set(String)
Contains the TLD database as a set of top-level domain extensions (com, net, etc.).
#update_tlds must be called before this set will be populated.
.update_suffixes(retry_count : Int32 = self.retry_count, backoff_time : Time::Span = self.backoff_time, backoff_factor : Float64 = self.backoff_factor)
Updates the Mozilla public suffixes database by downloading and parsing data from #SUFFIX_URL.
On failure (a non-200 status code), the request is retried with exponential backoff until
.retry_count is reached.
arguments:
- .retry_count (optional): the maximum number of retries before raising.
- .backoff_time (optional): the initial amount of time to wait before trying again after a failure.
- .backoff_factor (optional): .backoff_time is multiplied by this factor on each failure. Should be greater than 1.
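For orientation, the public suffix list is a plain-text file with one rule per line, `//` comments, blank lines, wildcard rules (`*.`) and exception rules (`!`). The sketch below shows one simple way such data could be turned into a set. `sketch_parse_suffix_list` is a hypothetical helper and the sample text is made up. How the module itself handles wildcard and exception rules is not documented here, so the handling below is a simplifying assumption.

```crystal
# Hypothetical helper, not part of DomainUtilv2. Collects the rules of a
# public-suffix-list style text into a set of suffix strings.
def sketch_parse_suffix_list(data : String) : Set(String)
  result = Set(String).new
  data.each_line do |line|
    rule = line.strip
    # Skip blank lines and comment lines.
    next if rule.empty? || rule.starts_with?("//")
    # Simplification (assumption): exception rules are ignored entirely.
    next if rule.starts_with?("!")
    # Simplification (assumption): a wildcard rule is reduced to its base suffix.
    result << rule.lchop("*.").downcase
  end
  result
end

# Made-up sample in the same format as the real list.
sample = <<-DATA
  // sample section
  com

  uk
  co.uk
  *.ck
  !www.ck
  DATA

p sketch_parse_suffix_list(sample).to_a.sort # ["ck", "co.uk", "com", "uk"]
```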
.update_tlds(retry_count : Int32 = self.retry_count, backoff_time : Time::Span = self.backoff_time, backoff_factor : Float64 = self.backoff_factor)
Updates the TLD extensions database by downloading and parsing HTML from #TLD_URL.
On failure (a non-200 status code), the request is retried with exponential backoff until
.retry_count is reached.
arguments:
- .retry_count (optional): the maximum number of retries before raising.
- .backoff_time (optional): the initial amount of time to wait before trying again after a failure.
- .backoff_factor (optional): .backoff_time is multiplied by this factor on each failure. Should be greater than 1.
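The retry behavior described for both update methods follows the usual exponential backoff pattern, which can be sketched as below. `sketch_with_backoff` is a hypothetical helper, not the module's code. A raised exception stands in for the non-200 response, and the exact counting of retries is an assumption.

```crystal
# Hypothetical helper, not part of DomainUtilv2. Runs the block and, when it
# raises, waits and tries again, multiplying the wait by `backoff_factor`
# after each failure. Re-raises once `retry_count` retries have been used.
def sketch_with_backoff(retry_count : Int32, backoff_time : Time::Span, backoff_factor : Float64)
  delay = backoff_time
  retries = 0
  loop do
    begin
      return yield
    rescue ex
      # Give up after the allowed number of retries.
      raise ex if retries >= retry_count
      retries += 1
      sleep delay
      # Time::Span * Float64 gives the next, longer delay.
      delay *= backoff_factor
    end
  end
end

# Simulated download that fails twice, then succeeds.
calls = 0
result = sketch_with_backoff(retry_count: 3, backoff_time: 1.millisecond, backoff_factor: 2.0) do
  calls += 1
  raise "simulated non-200 response" if calls < 3
  "ok"
end

puts "#{result} after #{calls} calls" # ok after 3 calls
```

With a backoff_factor of 2.0 the waits grow as 1 ms, 2 ms, 4 ms, and so on, which is why the factor should be greater than 1: a smaller value would shrink the wait instead of growing it.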