memoriav / Memobase 2020 / services / postprocessing / rico-edm-transformer
Commit d85b0a77, authored Mar 01, 2021 by Günter Hipler
create filters for no digital object and records with no locators
Parent: 6a133eb8
Pipeline #22513 passed with stages in 7 minutes and 31 seconds
gh/fragen.diskussion.txt (0 → 100644)
1) Following the IIIF manifest creation, I filter out records with
   hasNoDigitalObject
   hasNoLocators (what exactly does that mean?)
   in the same way as the IIIF Manifest Creator.
   Is that also OK for OAI?
2) The ManifestCreator is restricted to photos. I assume this restriction does not apply to OAI?
   --> to be clarified (no filter so far)
3) Filtering of unused records
   no-locator has 348,885 records. Can that be right?
   no.digital.object has 13,798 records. Can that be right?
   That would leave only 221,019 records for OAI (ok)
src/main/scala/ch/memobase/KafkaTopology.scala
@@ -50,9 +50,10 @@ class KafkaTopology extends Logging {
     //val Array(noDigitalObject, noLocator, noPhoto, isPhoto) = source
     //we have to discuss, which documents should be delivered to Europeana
-    val Array(do_we_have_any_Prerequisites, isEDMDeliverable) = source
+    val Array(noDigitalObject, noLocator, isEDMDeliverable) = source
       .branch(
-        (_, v) => checkPrerequisites(v),
+        (_, v) => hasNoDigitalObject(v),
+        (_, v) => hasNoLocator(v),
         (_, _) => true
       )
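The order of the predicates in this hunk matters, because Kafka Streams' `branch` places each record in the stream of the first predicate that matches. The following is a minimal sketch of that first-match routing in plain Scala, without any Kafka dependency. The `Record` case class and the `firstMatch` helper are hypothetical and exist only for illustration; the real topology passes JSON strings to `hasNoDigitalObject` and `hasNoLocator`.

```scala
// Sketch only: models first-match routing in plain Scala, without Kafka Streams.
object BranchSketch {
  // Hypothetical stand-in for a record value (assumption, not the project's type).
  final case class Record(hasDigitalObject: Boolean, hasLocator: Boolean)

  // Returns the index of the first predicate that matches, or None if none match.
  def firstMatch[A](value: A, predicates: Seq[A => Boolean]): Option[Int] = {
    val idx = predicates.indexWhere(p => p(value))
    if (idx >= 0) Some(idx) else None
  }

  // Same order as in the diff: noDigitalObject, noLocator, catch-all.
  val predicates: Seq[Record => Boolean] = Seq(
    r => !r.hasDigitalObject,
    r => !r.hasLocator,
    _ => true
  )

  def main(args: Array[String]): Unit = {
    val records = Seq(
      Record(hasDigitalObject = false, hasLocator = false),
      Record(hasDigitalObject = true, hasLocator = false),
      Record(hasDigitalObject = true, hasLocator = true)
    )
    // Each record is reported with the index of the branch it would land in.
    records.foreach(r => println(s"$r -> branch ${firstMatch(r, predicates)}"))
  }
}
```

In this sketch a record lacking both a digital object and a locator lands in the first branch only. The same would hold for the committed predicates, since `hasNoLocator` is built on top of the digital-object extraction and would also return true for such a record.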
@@ -72,16 +73,22 @@ class KafkaTopology extends Logging {
     reportSuccessfulEDMCreation(isDeliverable, reportingTopic)
     reportEDMCreationFailure(noEDM, reportingTopic)
+    /*
+    do we need this for EDM??
+    reportIgnoredRecord(noLocator, reportingTopic, "Digital object has no locator")
+    reportIgnoredRecord(noDigitalObject, reportingTopic, "Record has no digital object")
+    */
     /*
     reportIgnoredRecord(noPhoto, reportingTopic, "Resource is not an image")
     reportIgnoredRecord(
...
src/main/scala/ch/memobase/KafkaTopologyUtils.scala
@@ -29,7 +29,22 @@ object KafkaTopologyUtils {
     uri.split("/").last.split("\\.(?=[^.]+$)")(0)
-  def checkPrerequisites(msgVal: String): Boolean = false
+  //def checkPrerequisites(msgVal: String): Boolean = false
+  def hasNoDigitalObject(msgVal: String): Boolean =
+    Extractors.jsonGraph(msgVal)
+      .flatMap(v => Extractors.digitalObject(v.arr))
+      .isFailure
+  def hasNoLocator(msgVal: String): Boolean =
+    Extractors.jsonGraph(msgVal)
+      .flatMap(v => Extractors.digitalObject(v.arr))
+      .flatMap(dO => Try(Extractors.imageResourceId(dO).get))
+      .isFailure
 }
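Both new predicates rely on chaining `Try` values with `flatMap` and then checking `isFailure`: if any step in the chain fails, the whole chain is a `Failure`. The sketch below illustrates that behaviour under simplifying assumptions. The `Rec` type and the `digitalObject` and `locator` extractors are hypothetical stand-ins, not the project's `Extractors` API, whose signatures are not shown in this diff.

```scala
import scala.util.Try

// Sketch only: shows how a Try chain turns a missing field into a boolean filter.
object TryChainSketch {
  // Hypothetical record model (assumption); the real code parses a JSON graph.
  type Rec = Map[String, String]

  // Hypothetical extractors: Map#apply throws NoSuchElementException for a
  // missing key, which Try captures as a Failure.
  def digitalObject(rec: Rec): Try[String] = Try(rec("digitalObject"))
  def locator(rec: Rec): Try[String] = Try(rec("locator"))

  // True when the digital object cannot be extracted.
  def hasNoDigitalObject(rec: Rec): Boolean =
    digitalObject(rec).isFailure

  // True when either the digital object or the locator cannot be extracted,
  // because flatMap never runs its function on a Failure and passes it through.
  def hasNoLocator(rec: Rec): Boolean =
    digitalObject(rec).flatMap(_ => locator(rec)).isFailure

  def main(args: Array[String]): Unit = {
    val complete: Rec = Map("digitalObject" -> "obj-1", "locator" -> "loc-1")
    val noLocator: Rec = Map("digitalObject" -> "obj-2")
    val empty: Rec = Map.empty

    Seq(complete, noLocator, empty).foreach { r =>
      println(s"$r: noDigitalObject=${hasNoDigitalObject(r)}, noLocator=${hasNoLocator(r)}")
    }
  }
}
```

Note that in this sketch, as in the committed code, a record without a digital object also counts as having no locator. Whether that overlap is intended is one of the open questions in gh/fragen.diskussion.txt.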