PostgreSQL Elasticsearch Ingest Service
A small (one might even say 'micro') service that hooks into pg-orm models and generates Elasticsearch indices.
search-ingest exposes a REST API to reindex/backfill specific models.
Usage
- Set the tables to be mirrored in ES by setting SearchIngest::MANAGED_TABLES to an array of (T < PgORM::Base).class
- Configure the Elastic client through the ELASTIC_HOST and ELASTIC_PORT env vars, or through switches on the command line
- Configure the PostgreSQL connection through the PG_DATABASE_URL env var
POST /api/v1/reindex[?backfill=true]
Deletes the indices and recreates their mappings. Backfills the indices by default (toggle with the backfill boolean query parameter).
POST /api/v1/backfill
Backfills all indexes with data from PostgreSQL.
GET /api/v1/healthz
Healthcheck.
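As a rough illustration of how a client might address the three endpoints above, the following Python sketch builds the request URL and picks the HTTP method. It is not part of the service; the function names, the host, and port 3000 are assumptions made for the example.

```python
from typing import Optional
from urllib.parse import urlencode, urlunsplit

API_ROOT = "/api/v1"
ACTIONS = ("reindex", "backfill", "healthz")


def endpoint_url(host: str, port: int, action: str,
                 backfill: Optional[bool] = None) -> str:
    """Build the URL for one of the documented endpoints (hypothetical helper)."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    query = ""
    # Only the reindex endpoint documents a `backfill` query parameter.
    if action == "reindex" and backfill is not None:
        query = urlencode({"backfill": "true" if backfill else "false"})
    return urlunsplit(("http", f"{host}:{port}", f"{API_ROOT}/{action}", query, ""))


def http_method(action: str) -> str:
    """reindex and backfill are POST requests; healthz is a GET request."""
    return "GET" if action == "healthz" else "POST"
```

For example, `endpoint_url("localhost", 3000, "reindex", backfill=False)` yields a URL that recreates the mappings without backfilling; the result can be passed to `urllib.request.Request(url, method=http_method("reindex"))`.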
Index Schema
- Each PostgreSQL table receives an ES index, with a mapping generated from the attributes of a PgORM model.
- PgORM attributes can accept a tag es_type to specify the correct field datatype for the index schema.
- belongs_to associations are modeled with ES join datatypes; associated documents are replicated in their parent's index. This is necessary for has_parent and has_child queries.
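The mapping generation described above can be sketched roughly as follows. This Python snippet is only an illustration, not the service's code: the `Attribute` class, the default type table, the `join` field name, and the `zone`/`system` table names in the usage note are all assumptions. The shape of the join field (`type: join` with a `relations` object) follows the Elasticsearch mapping format.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical default translation from attribute types to ES field types.
DEFAULT_ES_TYPES: Dict[str, str] = {
    "str": "keyword",
    "int": "long",
    "float": "double",
    "bool": "boolean",
}


@dataclass
class Attribute:
    name: str
    type_name: str                  # e.g. "str" or "int"
    es_type: Optional[str] = None   # explicit override, like the es_type tag


def build_mapping(table: str, attributes: List[Attribute],
                  children: List[str]) -> dict:
    """Derive an index mapping for `table` from its model attributes.

    `children` lists the tables whose models belong to `table`; their
    documents are replicated into this index.
    """
    properties: Dict[str, dict] = {}
    for attr in attributes:
        # An explicit es_type wins over the default translation.
        properties[attr.name] = {"type": attr.es_type or DEFAULT_ES_TYPES[attr.type_name]}
    if children:
        # A join field declares the parent/child relations within the index.
        properties["join"] = {"type": "join", "relations": {table: sorted(children)}}
    return {"mappings": {"properties": properties}}
```

Calling `build_mapping("zone", [Attribute("name", "str", es_type="text")], ["system"])` produces a mapping in which `name` is a `text` field and `zone` is declared as the parent of `system`, which is what `has_parent` and `has_child` queries rely on.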
PostgreSQL Mirroring
SearchIngest::TableManager hooks into the changefeed of a table, resolves associations of the model, and creates/updates documents in the appropriate ES indices.
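To make the mirroring step more concrete, here is a minimal Python sketch of how one created or updated row might be turned into an Elasticsearch bulk request body. It is an assumption-laden illustration rather than the behaviour of SearchIngest::TableManager: the `bulk_body` function, the `join` field name, and the `(parent_table, foreign_key)` pair are hypothetical stand-ins for a resolved belongs_to association.

```python
import json
from typing import List, Optional, Tuple


def bulk_body(table: str, row: dict,
              parent: Optional[Tuple[str, str]] = None) -> str:
    """Render one created/updated row as a bulk request body (NDJSON).

    `parent` is a hypothetical (parent_table, foreign_key_column) pair.
    """
    lines: List[dict] = []
    # The row is always indexed into the index of its own table.
    lines.append({"index": {"_index": table, "_id": row["id"]}})
    lines.append({**row, "join": {"name": table}})
    if parent is not None:
        parent_table, foreign_key = parent
        parent_id = row[foreign_key]
        # The row is also replicated into the parent's index. A child
        # document has to be routed to the same shard as its parent.
        lines.append({"index": {"_index": parent_table, "_id": row["id"],
                                "routing": parent_id}})
        lines.append({**row, "join": {"name": table, "parent": parent_id}})
    # The bulk API expects newline-delimited JSON ending with a newline.
    return "".join(json.dumps(line) + "\n" for line in lines)
```

The `index` action creates the document or overwrites an existing one with the same id, so the same body serves both creates and updates. With ES_DISABLE_BULK set, each action/document pair would presumably be sent as its own request instead.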
Configuration
- ENV: A value of production lowers log verbosity
- ES_HOST: Elasticsearch host
- ES_PORT: Elasticsearch port
- ES_TLS: Use HTTPS for Elasticsearch, default is false
- ES_URI: Elasticsearch URI; whether to use TLS is detected from the scheme
- ES_DISABLE_BULK: Use single requests to Elasticsearch instead of the bulk API. Defaults to false
- ES_CONN_POOL_TIMEOUT: Timeout when checking a connection out of the Elasticsearch connection pool
- ES_CONN_POOL: Size of the Elasticsearch connection pool
- ES_IDLE_POOL: Maximum number of idle connections in the Elasticsearch connection pool
- UDP_LOG_HOST: Host to send JSON formatted logs to
- UDP_LOG_PORT: Port that the UDP input service is listening on
- PG_DATABASE: DB to mirror to Elasticsearch, defaults to "test"
- PG_HOST: Host of PostgreSQL, defaults to localhost
- PG_PORT: Port of PostgreSQL, defaults to 5432
- PG_USER: PostgreSQL database user, defaults to postgres
- PG_PWD: PostgreSQL database password, defaults to ""
- PLACE_SEARCH_INGEST_HOST: Host to bind server to
- PLACE_SEARCH_INGEST_PORT: Port for server to listen on
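The interplay between ES_URI and the separate ES_HOST, ES_PORT and ES_TLS variables could be resolved along these lines. This Python sketch is illustrative only: the precedence of ES_URI over the other variables, and the localhost and 9200 fallbacks, are assumptions that the list above does not state.

```python
from typing import Mapping, NamedTuple
from urllib.parse import urlsplit


class ElasticConfig(NamedTuple):
    host: str
    port: int
    tls: bool


def elastic_config(env: Mapping[str, str]) -> ElasticConfig:
    """Resolve Elasticsearch connection settings from environment variables."""
    uri = env.get("ES_URI")
    if uri:
        # Assumption: ES_URI, when present, overrides the separate variables.
        parts = urlsplit(uri)
        # TLS is inferred from the URI scheme.
        tls = parts.scheme == "https"
        # Assumed fallbacks when the URI omits a host or port.
        return ElasticConfig(parts.hostname or "localhost", parts.port or 9200, tls)
    # ES_TLS defaults to false, as documented.
    tls = env.get("ES_TLS", "false").lower() == "true"
    # The localhost and 9200 fallbacks are assumptions, not documented defaults.
    return ElasticConfig(env.get("ES_HOST", "localhost"),
                         int(env.get("ES_PORT", "9200")), tls)
```

In a real process the function would be called as `elastic_config(os.environ)`; taking the mapping as an argument keeps it easy to exercise with plain dictionaries.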
Contributing
See CONTRIBUTING.md.
Contributors
- Caspian Baska - creator and maintainer