Verified Commit 4798098c authored by Sebastian Schüpbach

create additional thumbnails for images

parent e77cc768
Pipeline #43066 failed with stages
in 2 minutes and 23 seconds
FROM openjdk:8-jre
RUN apt-get update && \
apt-get install -y ffmpeg && \
apt-get install -y ffmpeg imagemagick && \
apt-get autoremove -y && \
apt-get clean
ADD target/scala-2.12/app.jar /app/app.jar
......
......@@ -2,24 +2,36 @@
The __Media Converter__ is responsible for preparing media files for consumption by end users. This comprises:
* Copying files from the source folder (the sFTP directory) to a dedicated media directory directly accessible by the media file providers like the [media server](https://gitlab.switch.ch/memoriav/memobase-2020/services/streaming-server) or the [IIIF image server](https://gitlab.switch.ch/memoriav/memobase-2020/services/cantaloupe-docker). Besides the distribution copies these can also be preview images for videos ("thumbnails")
* Copying files from the source folder (the sFTP directory) to a dedicated media directory directly accessible by the
media file providers like
the [media server](https://gitlab.switch.ch/memoriav/memobase-2020/services/streaming-server) or
the [IIIF image server](https://gitlab.switch.ch/memoriav/memobase-2020/services/cantaloupe-docker). Besides the
distribution copies these can also be preview images for videos ("thumbnails")
* In the case of audio files repackaging the content in an MPEG4 container
* Creating small "snippets" from the audio file which are in turn used by the frontend to create sonograms
## Copying
The service gets the needed media files via the [Media File Distributor](https://gitlab.switch.ch/memoriav/memobase-2020/services/import-process/media-distributor-service), which in turn directly reads from the collections directory on the sFTP server. The fetched files are written to the respective media file directory. In the case of the Memobase workflow these directories are directly mounted in the service containers which need them.
The service gets the needed media files via
the [Media File Distributor](https://gitlab.switch.ch/memoriav/memobase-2020/services/import-process/media-distributor-service)
, which in turn directly reads from the collections directory on the sFTP server. The fetched files are written to the
respective media file directory. In the case of the Memobase workflow these directories are directly mounted in the
service containers which need them.
## Conversions
* Audio files: Repackages files in an mpeg4 container with the help of `ffmpeg` and sets the moov atom at the beginning of the file (`-movflags faststart`)
* Image files: Copies files as-is
* Audio files: Repackages files in an mpeg4 container with the help of `ffmpeg` and sets the moov atom at the beginning
of the file (`-movflags faststart`)
* Image files: Copies files as-is. An additional thumbnail is created.
* Video files: Copies files as-is
## Creating snippets
In order to provide content for sonograms and a teaser on the frontend, small snippets of the first x seconds from the audio files are produced. A relatively small snippet size helps to avoid getting only a solid black bar as a sonogram (which would be the case if one compresses a sonogram of a lengthy audio track to a width which fits the icon size used in the frontend).
In order to provide content for sonograms and a teaser on the frontend, small snippets of the first x seconds from the
audio files are produced. A relatively small snippet size helps to avoid getting only a solid black bar as a sonogram (
which would be the case if one compresses a sonogram of a lengthy audio track to a width which fits the icon size used
in the frontend).
## Configuration
In order to work as expected, the service needs to have a couple of environment variables set:
......@@ -29,9 +41,14 @@ In order to work as expected, the service needs to have a couple of environment
* `TOPIC_PROCESS`: Kafka topic where status reports are written to
* `CLIENT_ID`: Kafka client id
* `GROUP_ID`: Kafka consumer group id
* `AUDIO_SNIPPET_DURATION`: Number of seconds which are taken from the beginning of the audio track to produce the snippet
* `AUDIO_SNIPPET_DURATION`: Number of seconds which are taken from the beginning of the audio track to produce the
snippet
* `EXTERNAL_BASE_URL`: Base URL under which the resource is available
* `MEDIA_FOLDER_ROOT_PATH`: Path to the mounted media folder (i.e. the folder where the media files are copied to)
* `THUMBNAIL_FOLDER_PATH`: Path to the thumbnail folder
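* `THUMBNAIL_WIDTH`: Width of produced thumbnails in pixels. A value below 1 (e.g. `0`) leaves the width unset, so it
  is derived from the height
* `THUMBNAIL_HEIGHT`: Height of produced thumbnails in pixels. A value below 1 (e.g. `0`) leaves the height unset, so
  it is derived from the width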
* `DISTRIBUTOR_URL`: Address of the respective Media File Distributor instance
* `CONNECTION_RETRY_AFTER_MS`: Delay in milliseconds after which a reconnection to the Media File Distributor takes place
* `CONNECTION_RETRY_AFTER_MS`: Delay in milliseconds after which a reconnection to the Media File Distributor takes
place
* `CONNECTION_MAX_RETRIES`: Maximum number of connection retries
\ No newline at end of file
......@@ -17,6 +17,9 @@ externalBaseUrl: "https://memobase.ch/"
distributorUrl: "http://mb-wf2.memobase.unibas.ch:3000"
mediaFolderRootPath: "/data"
thumbnailFolderPath: "/data/cached"
thumbnailWidth: 640
thumbnailHeight: 0
mediaVolumeClaimName: media-volume-claim
connectionRetryAfterMs: "10000"
......
......@@ -17,6 +17,9 @@ externalBaseUrl: "https://memobase.ch/"
distributorUrl: "http://mb-wf2.memobase.unibas.ch:3001"
mediaFolderRootPath: "/data"
thumbnailFolderPath: "/data/cached"
thumbnailWidth: 640
thumbnailHeight: 0
mediaVolumeClaimName: stage-media-volume-claim
connectionRetryAfterMs: "10000"
......
......@@ -17,6 +17,9 @@ externalBaseUrl: "https://memobase.ch/"
distributorUrl: "http://mb-wf2.memobase.unibas.ch:3002"
mediaFolderRootPath: "/data"
thumbnailFolderPath: "/data/cached"
thumbnailWidth: 640
thumbnailHeight: 0
mediaVolumeClaimName: test-media-volume-claim
connectionRetryAfterMs: "10000"
......
......@@ -11,6 +11,9 @@ data:
AUDIO_SNIPPET_DURATION: "{{ .Values.audioSnippetDuration }}"
EXTERNAL_BASE_URL: "{{ .Values.externalBaseUrl }}"
MEDIA_FOLDER_ROOT_PATH: "{{ .Values.mediaFolderRootPath }}"
THUMBNAIL_FOLDER_PATH: "{{ .Values.thumbnailFolderPath }}"
THUMBNAIL_WIDTH: "{{ .Values.thumbnailWidth }}"
THUMBNAIL_HEIGHT: "{{ .Values.thumbnailHeight }}"
DISTRIBUTOR_URL: "{{ .Values.distributorUrl }}"
CONNECTION_RETRY_AFTER_MS: "{{ .Values.connectionRetryAfterMs }}"
CONNECTION_MAX_RETRIES: "{{ .Values.connectionMaxRetries }}"
......@@ -25,6 +25,9 @@ externalBaseUrl: placeholder
distributorUrl: placeholder
mediaFolderRootPath: placeholder
thumbnailFolderPath: placeholder
thumbnailWidth: placeholder
thumbnailHeight: placeholder
mediaVolumeClaimName: placeholder
connectionRetryAfterMs: placeholder
......
......@@ -2,6 +2,9 @@ app:
audioSnippetDuration: ${AUDIO_SNIPPET_DURATION:?system}
externalBaseUrl: ${EXTERNAL_BASE_URL:?system}
mediaFolderRootPath: ${MEDIA_FOLDER_ROOT_PATH:?system}
thumbnailFolderPath: ${THUMBNAIL_FOLDER_PATH:?system}
thumbnailWidth: ${THUMBNAIL_WIDTH:?system}
thumbnailHeight: ${THUMBNAIL_HEIGHT:?system}
distributorUrl: ${DISTRIBUTOR_URL:?system}
connectionMaxRetries: ${CONNECTION_MAX_RETRIES:?system}
connectionRetryAfterMs: ${CONNECTION_RETRY_AFTER_MS:?system}
......
......@@ -33,6 +33,9 @@ object App extends scala.App with Logging with RecordUtils {
"audioSnippetDuration",
"externalBaseUrl",
"mediaFolderRootPath",
"thumbnailFolderPath",
"thumbnailWidth",
"thumbnailHeight",
"distributorUrl",
"connectionMaxRetries",
"connectionRetryAfterMs"
......
......@@ -97,9 +97,25 @@ class DisseminationCopyHandler(audioSnippetDuration: Int) extends Logging {
*/
def createImageCopy(data: ByteArrayOutputStream, destFile: String, sourceFileType: MimeType): Try[Boolean] = Try {
val destFileAsPath = Paths.get(destFile)
val copyRemoved = removeExistingFile(destFileAsPath)
val fileRemoved = removeExistingFile(destFileAsPath)
writeData(data, destFileAsPath)
copyRemoved
fileRemoved
}
/**
* Creates a thumbnail representation of the image
*
* @param origFile Path to the dissemination copy of the image file
* @param destFile Path to the thumbnail representation of the image
* @param width Optional width of thumbnail
* @param height Optional height of thumbnail
* @return true if thumbnail was overwritten, false otherwise
*/
def createImageThumbnail(origFile: String, destFile: String, width: Option[Int], height: Option[Int]): Try[Boolean] = Try {
val destFileAsPath = Paths.get(destFile)
val fileRemoved = removeExistingFile(destFileAsPath)
MediaTransformations.createImageThumbnail(origFile, destFile, width, height).get
fileRemoved
}
/**
......
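Both handler methods above follow the same pattern: remove any existing destination file, write the new one, and report whether something was replaced. The sketch below illustrates that pattern in isolation. `OverwriteSketch` and `writeReplacing` are hypothetical names, and `Files.deleteIfExists` stands in for the project's `removeExistingFile`, whose implementation is not shown in this diff.

```scala
import java.nio.file.{Files, Path}
import scala.util.Try

// Hypothetical helper, not part of the commit: illustrates the
// "remove, write, report" pattern used by the handler methods.
object OverwriteSketch {
  // Deletes the destination if it exists, writes the bytes, and returns
  // true if an existing file was replaced, false otherwise.
  def writeReplacing(dest: Path, bytes: Array[Byte]): Try[Boolean] = Try {
    val replaced = Files.deleteIfExists(dest)
    Files.write(dest, bytes)
    replaced
  }
}
```

Wrapping the whole body in `Try` means that a failure while deleting or writing surfaces as a `Failure` rather than an exception, which matches the `Try[Boolean]` return type of the methods in the diff.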
......@@ -26,6 +26,7 @@ trait FileUtils {
import models.Conversions._
val rootPath: String
val cachedImagePath: String
private val normalize: String => String =
path => if (path.endsWith("/")) path.substring(0, path.length - 1) else path
......@@ -42,7 +43,13 @@ trait FileUtils {
val audioSnippetPath: String => String =
id => s"${normalize(rootPath)}/$id-intro.mp3"
val imageFilePath: (String, MimeType) => String =
(id, mimeType) => s"${normalize(rootPath)}/$id.${getFileTypeExtension(mimeType).get}"
val imageFileRootPath: (String, MimeType) => String =
(id, mimeType) => imageFilePath(rootPath, id, mimeType)
val cachedImageFilePath: (String, MimeType) => String =
(id, mimeType) => imageFilePath(cachedImagePath, id, mimeType)
private val imageFilePath: (String, String, MimeType) => String =
(path, id, mimeType) => s"${normalize(path)}/$id.${getFileTypeExtension(mimeType).get}"
}
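The path helpers in this hunk share one private builder that is parameterised by the target folder. A minimal standalone sketch of that idea follows. `ImagePaths` is a hypothetical name, and the file extension is passed as a plain string here instead of being looked up from a `MimeType`, which is a simplification.

```scala
// Hypothetical standalone version of the path-building logic;
// the real trait derives the extension from a MimeType.
object ImagePaths {
  // Strips a single trailing slash so that joining with "/" does not yield "//"
  private def normalize(path: String): String =
    if (path.endsWith("/")) path.substring(0, path.length - 1) else path

  // Builds "<folder>/<id>.<extension>" for either the media or the thumbnail folder
  def imageFilePath(folder: String, id: String, extension: String): String =
    s"${normalize(folder)}/$id.$extension"
}
```

With the values from the Helm files, `ImagePaths.imageFilePath("/data/cached/", "abc", "jpg")` and `ImagePaths.imageFilePath("/data/cached", "abc", "jpg")` both give `/data/cached/abc.jpg`.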
......@@ -75,4 +75,37 @@ object MediaTransformations extends Logging {
destFile
}
}
/**
* Creates an image thumbnail
*
* @param sourceFilePath Path to the source file
* @param destFile Path to the final file
* @param width Thumbnail's width. If None, width is set relative to height
* @param height Thumbnail's height. If None, height is set relative to width
* @return Path to the created thumbnail, wrapped in a Try
*/
def createImageThumbnail(sourceFilePath: String, destFile: String, width: Option[Int], height: Option[Int]): Try[String] = {
val externalCommand =
s"""convert \\
|-filter Triangle \\
|-define filter:support=2 \\
|-thumbnail ${width.getOrElse("")}x${height.getOrElse("")} \\
|-unsharp 0.25x0.08+8.3+0.045 \\
|-dither None \\
|-posterize 136 \\
|-quality 82 \\
|-define jpeg:fancy-upsampling=off \\
|-define png:compression-filter=5 \\
|-define png:compression-level=9 \\
|-define png:compression-strategy=1 \\
|-define png:exclude-chunk=all \\
|-interlace none \\
|$sourceFilePath \\
|$destFile""".stripMargin
Try {
executeCommand(externalCommand)
destFile
}
}
}
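The `-thumbnail` argument above is assembled from two optional dimensions. If both are `None`, the interpolation yields a bare `x`, which is presumably not a meaningful geometry. The sketch below shows one possible way to make that case explicit. `ThumbnailGeometry` and `geometry` are hypothetical names and not part of the commit.

```scala
// Hypothetical helper: builds the geometry argument for ImageMagick's
// -thumbnail option from optional width and height.
object ThumbnailGeometry {
  // Returns e.g. Some("640x"), Some("x480") or Some("640x480");
  // returns None when neither dimension is given.
  def geometry(width: Option[Int], height: Option[Int]): Option[String] =
    (width, height) match {
      case (None, None) => None
      case _ =>
        val w = width.map(_.toString).getOrElse("")
        val h = height.map(_.toString).getOrElse("")
        Some(s"${w}x$h")
    }
}
```

A caller could then skip thumbnail creation, or fall back to a default size, when the result is `None`. Which of the two is appropriate here is a design decision the diff does not settle.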
......@@ -43,6 +43,7 @@ class RecordProcessor(fileHandler: DisseminationCopyHandler,
appSettings: Properties) extends FileUtils {
val rootPath: String = appSettings.getProperty("mediaFolderRootPath")
val cachedImagePath: String = appSettings.getProperty("thumbnailFolderPath")
val distributorUrl: String = appSettings.getProperty("distributorUrl")
val maxRetries: Int = appSettings.getProperty("connectionMaxRetries").toInt
val retryAfter: Int = appSettings.getProperty("connectionRetryAfterMs").toInt
......@@ -92,13 +93,26 @@ class RecordProcessor(fileHandler: DisseminationCopyHandler,
val res = fileHandler.createVideoCopy(data, destFile, mT)
createOutcome(res, id, DigitalObject, destFile)
case mT: ImageFile if resource == DigitalObject =>
val destFile = imageFilePath(id, mT)
val res = fileHandler.createImageCopy(data, destFile, mT)
createOutcome(res, id, DigitalObject, destFile)
val destFile = imageFileRootPath(id, mT)
createImageAndThumbnail(id, data, destFile, mT)
case mT: ImageFile if resource == Thumbnail =>
val destFile = videoPosterPath(id, mT)
val res = fileHandler.createImageCopy(data, destFile, mT)
createOutcome(res, id, Thumbnail, destFile)
createImageAndThumbnail(id, data, destFile, mT)
}
private def createImageAndThumbnail(id: String, data: ByteArrayOutputStream, destImageFile: String, mimeType: MimeType): List[ProcessOutcome] = {
val resMediaFile = fileHandler.createImageCopy(data, destImageFile, mimeType)
val outcomeMediaFile = createOutcome(resMediaFile, id, DigitalObject, destImageFile)
val destPreviewFile = cachedImageFilePath(id, mimeType)
val (width, height) = getThumbnailDimensions
val resThumbnail = fileHandler.createImageThumbnail(destImageFile, destPreviewFile, width, height)
val outcomeThumbnail = createOutcome(resThumbnail, id, DigitalObject, destPreviewFile)
outcomeMediaFile ++ outcomeThumbnail
}
private def getThumbnailDimensions: (Option[Int], Option[Int]) = {
(Try(appSettings.getProperty("thumbnailWidth").toInt).toOption.filter(_ >= 1),
Try(appSettings.getProperty("thumbnailHeight").toInt).toOption.filter(_ >= 1))
}
/**
......
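`getThumbnailDimensions` treats a missing, non-numeric, or non-positive setting as "not set". This explains why the Helm values can use `thumbnailHeight: 0` to request a height derived from the width. A self-contained sketch of the same parsing rule, with the hypothetical names `ThumbnailSettings` and `dimension`:

```scala
import java.util.Properties
import scala.util.Try

// Hypothetical standalone version of the parsing rule used in
// getThumbnailDimensions.
object ThumbnailSettings {
  // Reads an integer property; missing keys, non-numeric values and
  // values below 1 all map to None.
  def dimension(settings: Properties, key: String): Option[Int] =
    Try(settings.getProperty(key).toInt).toOption.filter(_ >= 1)
}
```

With `thumbnailWidth` set to `"640"` and `thumbnailHeight` set to `"0"`, the width resolves to `Some(640)` and the height to `None`.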
......@@ -50,6 +50,14 @@ class DisseminationCopyHandlerTest extends AnyFunSuite with BeforeAndAfter {
fileType: MimeType,
copyFun: (ByteArrayOutputStream, String, MimeType)
=> Try[Boolean]): Assertion = {
val data = createByteArrayOutputStream(pathToTmpDir, sourceFileName)
val destFile = Paths.get(pathToTmpDir, destFileName)
destFile.toFile.deleteOnExit()
copyFun(data, destFile.toString, fileType)
assert(destFile.toFile.exists())
}
private def createByteArrayOutputStream(pathToTmpDir: String, sourceFileName: String): ByteArrayOutputStream = {
val file = Paths.get(pathToTmpDir, sourceFileName).toFile
val data = new ByteArrayOutputStream(file.length().toInt)
val buffer = new Array[Byte](1024)
......@@ -59,10 +67,7 @@ class DisseminationCopyHandlerTest extends AnyFunSuite with BeforeAndAfter {
data.write(buffer, 0, len)
len = in.read(buffer)
}
val destFile = Paths.get(pathToTmpDir, destFileName)
destFile.toFile.deleteOnExit()
copyFun(data, destFile.toString, fileType)
assert(destFile.toFile.exists())
data
}
private def testAudioCopy(pathToTmpDir: String,
......@@ -110,16 +115,32 @@ class DisseminationCopyHandlerTest extends AnyFunSuite with BeforeAndAfter {
}
}
test("calling the copyImage function should create temporary file") {
test("calling the createImageCopy function should create temporary file") {
val f = fixture
testCopy(f.resPath, "sample.jpg", "test.jpg", JpegFile, f.fileHandler.createImageCopy)
deleteFiles("src/test/resources/test.jpg")
}
/**
* ATTENTION: Requires ImageMagick's convert executable on $PATH!
*/
test("calling the createImageThumbnail function should create a temporary thumbnail file") {
val f = fixture
val testImagePath = Paths.get(f.resPath, "test.jpg")
testImagePath.toFile.deleteOnExit()
val testThumbnailPath = Paths.get(f.resPath, "test-thumbnail.jpg")
testThumbnailPath.toFile.deleteOnExit()
val data = createByteArrayOutputStream(f.resPath, "sample.jpg")
f.fileHandler.createImageCopy(data, testImagePath.toString, JpegFile)
val res = f.fileHandler.createImageThumbnail(testImagePath.toString, testThumbnailPath.toString, Some(640), None)
assert(res.isSuccess)
assert(testThumbnailPath.toFile.exists())
}
/**
* ATTENTION: Requires that ffmpeg is properly installed!
*/
test("calling the copyVideo function should create temporary file") {
test("calling the createVideoCopy function should create temporary file") {
runWithFFmpeg {
val f = fixture
testCopy(f.resPath, "sample.mp4", "test.mp4", VideoMpeg4File, f.fileHandler.createVideoCopy)
......