The __Media Converter__ is responsible for preparing media files for consumption by end users. This comprises:
* Copying files from the source folder (the sFTP directory) to a dedicated media directory directly accessible by
media file providers such as
the [media server](https://gitlab.switch.ch/memoriav/memobase-2020/services/streaming-server) or
the [IIIF image server](https://gitlab.switch.ch/memoriav/memobase-2020/services/cantaloupe-docker). Besides the
distribution copies, these files can also be preview images for videos ("thumbnails").
* For audio files, repackaging the content in an MPEG-4 container
* Creating small "snippets" from the audio file, which the frontend in turn uses to create sonograms
## Copying
The service gets the needed media files via
the [Media File Distributor](https://gitlab.switch.ch/memoriav/memobase-2020/services/import-process/media-distributor-service),
which in turn reads directly from the collections directory on the sFTP server. The fetched files are written to the
respective media file directory. In the case of the Memobase workflow, these directories are mounted directly in the
service containers which need them.
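
How a fetch from the Media File Distributor could honour the `CONNECTION_RETRY_AFTER_MS` and `CONNECTION_MAX_RETRIES` settings (see Configuration) can be sketched as follows. This is a minimal illustration, not the service's actual code: `fetch_with_retry` and the injected `fetch` callable are hypothetical names, and it assumes that a failed connection surfaces as a `ConnectionError` and that the retry count excludes the first attempt.

```python
import time
from typing import Callable


def fetch_with_retry(
    fetch: Callable[[], bytes],
    max_retries: int,
    retry_after_ms: int,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call `fetch` until it succeeds or the retry budget is used up.

    `fetch` is a hypothetical stand-in for the real request to the
    Media File Distributor; `max_retries` and `retry_after_ms` mirror
    CONNECTION_MAX_RETRIES and CONNECTION_RETRY_AFTER_MS.
    """
    retries = 0
    while True:
        try:
            return fetch()
        except ConnectionError:
            # Assumption: the first attempt does not count as a retry.
            if retries >= max_retries:
                raise
            retries += 1
            sleep(retry_after_ms / 1000)  # milliseconds to seconds
```

Passing `sleep` as a parameter keeps the delay replaceable in tests, so the retry logic can be checked without actually waiting.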
## Conversions
* Audio files: Repackages files in an MPEG-4 container with the help of `ffmpeg` and places the moov atom at the
beginning of the file (`-movflags faststart`)
* Image files: Copies files as-is. An additional thumbnail is created.
* Video files: Copies files as-is
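
The audio repackaging step might be invoked roughly as shown below. This is a sketch under assumptions rather than the service's real command line: only `ffmpeg` and `-movflags faststart` come from the description above, while the function names, the stream copy (`-c:a copy`), and dropping non-audio streams (`-vn`) are illustrative choices. Stream copy only works for codecs that the MPEG-4 container accepts; other inputs would need re-encoding.

```python
import subprocess
from pathlib import Path


def build_repackage_command(source: Path, target: Path) -> list[str]:
    """Build an ffmpeg call that rewraps an audio file into an MPEG-4 container."""
    return [
        "ffmpeg",
        "-y",                      # overwrite an existing target file
        "-i", str(source),         # input file
        "-vn",                     # assumption: ignore video/cover-art streams
        "-c:a", "copy",            # assumption: copy the audio stream unchanged
        "-movflags", "faststart",  # move the moov atom to the start of the file
        str(target),
    ]


def repackage_audio(source: Path, target: Path) -> None:
    """Run ffmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_repackage_command(source, target), check=True)
```

With the moov atom at the front, a player can begin playback before the whole file has been downloaded.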
## Creating snippets
In order to provide content for sonograms and a teaser on the frontend, small snippets of the first x seconds of the
audio files are produced (x is set via `AUDIO_SNIPPET_DURATION`). A relatively small snippet size helps to avoid
getting only a solid black bar as a sonogram, which would be the case if a sonogram of a lengthy audio track were
compressed to a width that fits the icon size used in the frontend.
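
Cutting such a snippet could look like the following sketch. It assumes that `ffmpeg` is used for this step as well, which the text above does not state; `build_snippet_command` is a hypothetical helper, and `duration_seconds` corresponds to `AUDIO_SNIPPET_DURATION`.

```python
from pathlib import Path


def build_snippet_command(source: Path, target: Path, duration_seconds: int) -> list[str]:
    """Build an ffmpeg call that keeps only the first `duration_seconds` of audio."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return [
        "ffmpeg",
        "-y",                         # overwrite an existing snippet
        "-i", str(source),            # full-length audio file
        "-t", str(duration_seconds),  # stop writing output after this many seconds
        str(target),
    ]
```

The resulting list can be passed to `subprocess.run(..., check=True)` to execute it.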
## Configuration
In order to work as expected, the service needs to have a couple of environment variables set:
...
* `TOPIC_PROCESS`: Kafka topic where status reports are written to
* `CLIENT_ID`: Kafka client id
* `GROUP_ID`: Kafka consumer group id
* `AUDIO_SNIPPET_DURATION`: Number of seconds which are taken from the beginning of the audio track to produce the
snippet
* `EXTERNAL_BASE_URL`: Base URL under which the resource is available
* `MEDIA_FOLDER_ROOT_PATH`: Path to the mounted media folder (i.e. the folder where the media files are copied to)
* `THUMBNAIL_FOLDER_PATH`: Path to the thumbnail folder
* `THUMBNAIL_WIDTH`: Optional width of produced thumbnails
* `THUMBNAIL_HEIGHT`: Optional height of produced thumbnails
* `DISTRIBUTOR_URL`: Address of the respective Media File Distributor instance
* `CONNECTION_RETRY_AFTER_MS`: Delay in milliseconds after which a reconnection to the Media File Distributor takes
place
* `CONNECTION_MAX_RETRIES`: Maximum number of connection retries
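
As an illustration of how some of these variables might be read, the sketch below loads a subset of them into a typed settings object. The variable names are taken from the list above; the `ConverterSettings` class, the `load_settings` function, and the assumption that the numeric values are plain integers are hypothetical and not taken from the service's code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ConverterSettings:
    media_folder_root_path: str
    thumbnail_folder_path: str
    thumbnail_width: Optional[int]   # optional according to the list above
    thumbnail_height: Optional[int]  # optional according to the list above
    audio_snippet_duration: int      # seconds
    connection_retry_after_ms: int
    connection_max_retries: int


def load_settings(env: Mapping[str, str] = os.environ) -> ConverterSettings:
    """Read settings from `env`; a missing required variable raises KeyError."""

    def optional_int(name: str) -> Optional[int]:
        raw = env.get(name)
        return int(raw) if raw else None

    return ConverterSettings(
        media_folder_root_path=env["MEDIA_FOLDER_ROOT_PATH"],
        thumbnail_folder_path=env["THUMBNAIL_FOLDER_PATH"],
        thumbnail_width=optional_int("THUMBNAIL_WIDTH"),
        thumbnail_height=optional_int("THUMBNAIL_HEIGHT"),
        audio_snippet_duration=int(env["AUDIO_SNIPPET_DURATION"]),
        connection_retry_after_ms=int(env["CONNECTION_RETRY_AFTER_MS"]),
        connection_max_retries=int(env["CONNECTION_MAX_RETRIES"]),
    )
```

Failing at startup on a missing or non-numeric value makes a misconfigured deployment visible before any media file is processed.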