Commit 13a70f0d authored by Sebastian Schüpbach

Update README.md

parent 1272e282
Pipeline #29396 passed with stages in 6 minutes and 19 seconds
@@ -20,3 +20,30 @@
Although internally identical, two deployments of Media Metadata Extractor
are actually running: one for images (fed by input topic
`import-process-image-enrichment`) and one for AV media (reading from input
topic `import-process-av-enrichment`).
## Configuration
The service only works correctly if the following environment variables are set:
* `KAFKA_BOOTSTRAP_SERVERS`: Comma-separated list of Kafka bootstrap server addresses
* `APPLICATION_ID`: ID used by the Kafka Streams application (see [Kafka documentation](https://kafka.apache.org/documentation/#streamsconfigs_application.id) for details)
* `TOPIC_IN`: Name of Kafka topic where messages are read from
* `TOPIC_OUT`: Name of Kafka topic where messages are written to (without environment postfix)
* `TOPIC_PROCESS`: Name of Kafka topic where status reports are written to
* `INDEXER_HOST`: Address of indexer service
* `INDEXER_CONNECT_TIMEOUT_MS`: Time in milliseconds after which a connection timeout occurs
* `INDEXER_READ_TIMEOUT_MS`: Time in milliseconds within which a response from the indexer is expected; processing a large media file can take a while, so choose a generous value
* `CONSUMER_MAX_POLL_INTERVAL_MS`: Maximum delay in milliseconds between two consecutive calls to poll(); if this period is exceeded, the consumer is considered failed (see [Kafka documentation](https://kafka.apache.org/documentation/#consumerconfigs_max.poll.interval.ms) for details)
* `CONSUMER_MAX_POLL_RECORDS`: Maximum number of records returned in a single call to poll() (see [Kafka documentation](https://kafka.apache.org/documentation/#consumerconfigs_max.poll.records) for details)
* `PARSER_ACTIONS_REMOTE`: Comma-separated list of actions which should be performed by the indexer when analysing a remote media file (see below for allowed actions)
* `PARSER_ACTIONS_LOCAL`: Comma-separated list of actions which should be performed by the indexer when analysing a locally available media file
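
The variables above could be read and checked at startup roughly as follows. This is a minimal sketch in Python, not the service's actual code: the helper names `missing_variables` and `load_settings`, and the decision to parse the `*_MS` and `CONSUMER_MAX_POLL_RECORDS` values as integers, are assumptions made for illustration.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = (
    "KAFKA_BOOTSTRAP_SERVERS",
    "APPLICATION_ID",
    "TOPIC_IN",
    "TOPIC_OUT",
    "TOPIC_PROCESS",
    "INDEXER_HOST",
    "INDEXER_CONNECT_TIMEOUT_MS",
    "INDEXER_READ_TIMEOUT_MS",
    "CONSUMER_MAX_POLL_INTERVAL_MS",
    "CONSUMER_MAX_POLL_RECORDS",
    "PARSER_ACTIONS_REMOTE",
    "PARSER_ACTIONS_LOCAL",
)

# Assumption: these values are plain integers.
INT_VARS = (
    "INDEXER_CONNECT_TIMEOUT_MS",
    "INDEXER_READ_TIMEOUT_MS",
    "CONSUMER_MAX_POLL_INTERVAL_MS",
    "CONSUMER_MAX_POLL_RECORDS",
)

# Documented as comma-separated lists.
LIST_VARS = (
    "KAFKA_BOOTSTRAP_SERVERS",
    "PARSER_ACTIONS_REMOTE",
    "PARSER_ACTIONS_LOCAL",
)


def missing_variables(env):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def load_settings(env=None):
    """Hypothetical loader: validate the environment and return a dict."""
    env = os.environ if env is None else env
    missing = missing_variables(env)
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED_VARS}
    for name in INT_VARS:
        settings[name] = int(env[name])  # raises ValueError on non-numeric input
    for name in LIST_VARS:
        settings[name] = [part.strip() for part in env[name].split(",") if part.strip()]
    return settings
```

Calling `load_settings()` without arguments reads from `os.environ`; passing a plain dict instead makes the check easy to try out without touching the real environment.
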
## Possible actions
* `siegfried`: Identify MIME type and PRONOM ID with [Siegfried](https://github.com/richardlehane/siegfried)
* `identify`: Run ImageMagick's [`identify`](https://imagemagick.org/script/identify.php) tool
* `ffprobe`: Run FFmpeg's [`ffprobe`](https://ffmpeg.org/ffprobe.html) tool
* `histogram`: Create a histogram from the analysed image
* `validateimage`: Validate image file with [ImageMagick](https://imagemagick.org)
* `validateav`: Validate audio or video file with [ffmpeg](https://ffmpeg.org)
* `exif`: Extract EXIF data with [ExifTool](https://exiftool.org)
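
A comma-separated action list, as expected in `PARSER_ACTIONS_REMOTE` and `PARSER_ACTIONS_LOCAL`, can be checked against the actions listed above before deployment. The following is a minimal sketch in Python; the function name `parse_actions` is hypothetical, and treating action names as case-sensitive and rejecting unknown names are assumptions, not documented behaviour of the indexer.

```python
# Action names taken from the list above.
ALLOWED_ACTIONS = frozenset({
    "siegfried",
    "identify",
    "ffprobe",
    "histogram",
    "validateimage",
    "validateav",
    "exif",
})


def parse_actions(raw):
    """Split a comma-separated action list and reject unknown names.

    Assumption: names are matched case-sensitively and surrounding
    whitespace is ignored.
    """
    actions = [part.strip() for part in raw.split(",") if part.strip()]
    unknown = [action for action in actions if action not in ALLOWED_ACTIONS]
    if unknown:
        raise ValueError("Unknown parser actions: " + ", ".join(unknown))
    return actions
```

For example, `parse_actions("siegfried, ffprobe, validateav")` returns `["siegfried", "ffprobe", "validateav"]`, while a list containing a name outside the set above raises `ValueError`.
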