Commit 75df432d authored by Jonas Waeber's avatar Jonas Waeber

Update README.

## Elastic Bulk Action Service
A service that creates, indexes, updates, and deletes documents in an Elasticsearch cluster using a bulk processor.
The consumer reads from one or more Kafka topics. Failed operations are reported to a dedicated topic.
[Confluence Documentation](https://ub-basel.atlassian.net/wiki/spaces/MEMOBASE/pages/758677514/Elasticsearch+Connector+Service)
### Index
Any message sent to the input topic is indexed into the configured index.
The Kafka message key is used as the document id. If the key is null or
empty, the document is indexed without an id, which prompts Elasticsearch
to generate a unique id itself.
The message value is used as the document body. It must be a valid JSON
document with at least one property. If the body is empty, the message
is skipped and an error is reported.
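The key and body rules above can be sketched as follows. This is a minimal illustration, not the service's actual code; the function name `build_index_action` and the action dictionary shape (modeled on common bulk-helper conventions) are assumptions.

```python
import json


def build_index_action(key, value, index="documents"):
    """Sketch: map a Kafka record (key, value) to a bulk index action."""
    if not value:
        # Empty body: the message is skipped and an error is reported.
        raise ValueError("empty message body")
    body = json.loads(value)
    if not isinstance(body, dict) or not body:
        raise ValueError("body must be a JSON document with at least one property")
    action = {"_op_type": "index", "_index": index, "_source": body}
    if key:
        # Use the Kafka message key as the document id.
        action["_id"] = key
    # A null/empty key omits "_id", so Elasticsearch generates a unique id.
    return action
```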
### Update by Query
A message is processed as an update by query if the key contains a
`#update` suffix.
The body of the message must be a JSON document which can be deserialized as
an `UpdateQuery` data class.
- `term`: The field to be queried.
- `value`: The value of the query.
- `source`: The painless script source of the query.
- `params`: The parameters passed to the script.
Additional Documentation:
- [Painless Script Guide](https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-scripting-using.html)
- [Update by Query](https://www.elastic.co/guide/en/elasticsearch/reference/7.6/docs-update-by-query.html)
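As a rough sketch, the `UpdateQuery` payload could be deserialized and turned into an Elasticsearch update-by-query request body like this. The Python dataclass and the `build_update_by_query` helper are illustrative assumptions; only the four field names come from the description above.

```python
import json
from dataclasses import dataclass, field


@dataclass
class UpdateQuery:
    term: str      # the field to be queried
    value: str     # the value of the query
    source: str    # the Painless script source
    params: dict = field(default_factory=dict)  # parameters passed to the script


def build_update_by_query(raw: str) -> dict:
    """Sketch: deserialize a message body and build an update-by-query body."""
    q = UpdateQuery(**json.loads(raw))
    return {
        "query": {"term": {q.term: q.value}},
        "script": {"lang": "painless", "source": q.source, "params": q.params},
    }
```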
### Configuration
- `ELASTIC_INDEX`: The name of the index all messages are indexed into.
- `ELASTIC_HOST`: The host name of one node of the Elasticsearch cluster.
- `ELASTIC_PORT`: The port on which the node is listening.
- `REPORTING_STEP_NAME`: The name of this step in the reports. Should be unique across all deployments.
- `KAFKA_BOOTSTRAP_SERVERS`: The Kafka cluster to read from.
- `APPLICATION_ID`: The application ID of the deployment. Needs to be unique. It is used as both group and client id.
- `TOPIC_IN`: The input topic of the messages.
- `TOPIC_OUT`: The topic to which the created reports are sent.
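A minimal startup check for these environment variables might look like the sketch below. Treating every variable as required is an assumption; the service's actual defaults, if any, are not documented here.

```python
import os

# Environment variables from the Configuration section above.
REQUIRED = [
    "ELASTIC_INDEX", "ELASTIC_HOST", "ELASTIC_PORT",
    "REPORTING_STEP_NAME", "KAFKA_BOOTSTRAP_SERVERS",
    "APPLICATION_ID", "TOPIC_IN", "TOPIC_OUT",
]


def load_config(env=None):
    """Sketch: fail fast if any required setting is missing or empty."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise RuntimeError(f"missing required settings: {missing}")
    return {key: env[key] for key in REQUIRED}
```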