Forked repo with UC-02 & UC-04: henarmartinsantos@e41dc1b
Questions to discuss:
1. Set & Request URI terminology.
I have decided to follow this format:
Policy: ex:ucXX-[data_tech]-p-<data_cat>-<purpose>-<scope>
    ex:uc01-p-ehr-primarycare-read
    ex:uc02-refinedURI-p-weight-primarycare-read
Request: ex:ucXX-[data_tech]-r-<data_cat>-<purpose>-<scope>-<assignee>
    ex:uc01-r-ehr-primarycare-read-physician
    ex:uc02-refinedURI-r-weight-primarycare-read-physician
[data_tech] distinguishes the technique used to access a subset of data: URI / refinedURI / SPARQL / SHACL.
It is only used in UC-02 and UC-04, which have 4 possible sets and 4 possible requests.
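To make the convention easier to check, here is a minimal Python sketch that assembles identifiers in this format. The helper `build_uri` and its parameter names are hypothetical and not part of the repository; they only mirror the placeholders above.

```python
from typing import Optional

# Techniques listed for [data_tech] in the naming convention.
DATA_TECHS = {"URI", "refinedURI", "SPARQL", "SHACL"}


def build_uri(uc: int, kind: str, data_cat: str, purpose: str, scope: str,
              assignee: Optional[str] = None,
              data_tech: Optional[str] = None) -> str:
    """Build a policy ('p') or request ('r') identifier (hypothetical helper)."""
    if kind not in ("p", "r"):
        raise ValueError("kind must be 'p' (policy) or 'r' (request)")
    if data_tech is not None and data_tech not in DATA_TECHS:
        raise ValueError(f"unknown data_tech: {data_tech}")
    if kind == "r" and not assignee:
        raise ValueError("a request identifier needs an assignee")
    if kind == "p" and assignee:
        raise ValueError("a policy identifier has no assignee")

    parts = [f"uc{uc:02d}"]
    if data_tech:
        parts.append(data_tech)  # only present for subset-access use cases
    parts += [kind, data_cat, purpose, scope]
    if kind == "r":
        parts.append(assignee)
    return "ex:" + "-".join(parts)


print(build_uri(1, "p", "ehr", "primarycare", "read"))
print(build_uri(2, "r", "weight", "primarycare", "read",
                assignee="physician", data_tech="refinedURI"))
```

The two calls reproduce the first policy example and the last request example listed above.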
2. UC-02
- I have implemented ehds:HealthcareProvider.
- I have created 4 'subcases'. As it is the same use case with different strategies for accessing the data, I have structured it like this instead of creating new use cases:

Is it okay like this, or would you prefer independent new use cases?
UC-04 follows the same structure, as it also accesses a weight subset.
- uc-02-URI: subset of data with a known URI: <https://example.org/weight-by-hcp>
- uc-02-refinedURI: weight URI with a refinement:
    odrl:target [
        odrl:source ex:weight ; #rdf:value
        odrl:refinement [
            odrl:leftOperand dpv-odrl:DataControllerDataSource ;
            odrl:operator odrl:eq ;
            odrl:rightOperand ehds:HealthcareProvider ]] ;
I am not sure whether it is more appropriate to put the refinement on odrl:source or on rdf:value. I believe odrl:source fits better, as weight data is more of an AssetCollection than an action.
- uc-02-SPARQL: SPARQL query integrated as a literal:
    odrl:target [
        a odrl:Asset ;
        dcterms:description "SPARQL query to fetch patient weight data" ;
        rdf:value """
            PREFIX ex: <https://example.org/ns#>
            # NOTE: ehds: is used below and also needs a PREFIX declaration
            SELECT ?date ?weight ?measurer
            WHERE {
                GRAPH ex:health-data {
                    ex:patient ex:weight ?weight ;
                        ex:measurementDate ?date ;
                        ex:measuredBy ?measurer .
                    ?measurer a ehds:HealthcareProvider .
                }
            }
            ORDER BY ?date
        """ ;
    ] ;
- uc-02-SHACL: As discussed in the meeting, I have included SHACL validation as a fourth option for accessing the subset of data.
    @prefix sh: <http://www.w3.org/ns/shacl#> .

    ### --- ODRL Policy --- ###
    ex:uc02-p-weight-primarycare-read a odrl:Set ;
        odrl:uid ex:uc02-p-weight-primarycare-read ;
        dcterms:description "Patient allows her weight subset of health data to be read for primary care. SHACL validation." ;
        odrl:permission [
            odrl:action odrl:read ;
            odrl:target ex:weight-data ; # Reference to the data asset (separately defined)
            odrl:assigner ex:patient ;
            odrl:constraint [
                odrl:leftOperand dpv-odrl:Purpose ;
                odrl:operator odrl:eq ;
                odrl:rightOperand sector-health:PrimaryCareManagement
            ]
        ] .

    ### --- SHACL Validation --- ###
    ex:WeightShape a sh:NodeShape ;
        sh:targetClass ex:Patient ;
        sh:property [
            sh:path ex:weight ;
            sh:datatype xsd:float ;
            sh:minCount 1 ;
        ] ;
        sh:property [
            sh:path ex:measurementDate ;
            sh:datatype xsd:date ;
            sh:minCount 1 ;
        ] ;
        sh:property [
            sh:path ex:measuredBy ;
            sh:class ehds:HealthcareProvider ;
            sh:minCount 1 ;
        ] ;
        sh:sparql [ # Reports a violation when the measurer is not a healthcare provider
            sh:select """
                SELECT $this ?date ?weight ?measurer
                WHERE {
                    $this ex:weight ?weight ;
                        ex:measurementDate ?date ;
                        ex:measuredBy ?measurer .
                    FILTER NOT EXISTS { ?measurer a ehds:HealthcareProvider . }
                }
            """ ;
            sh:message "Weight measurement must be recorded by a healthcare provider." ;
        ] .

    ### --- Link ODRL Target to SHACL --- ###
    ex:weight-data a odrl:Asset ;
        dcterms:conformsTo ex:WeightShape . # Indicates the data must comply with SHACL rules
As you can see, the validation and its link to ODRL are "independent" of the policy itself. We can either declare everything in the same file, so that it stays simple and roughly atomic, or create separate files and import their definitions to reduce code repetition (the definitions are the same for the policy and the request, and for every other case where the same subset is accessed).
What do you think?
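For comparison, all four variants are meant to select the same subset: weight measurements recorded by a healthcare provider. A minimal plain-Python sketch of that selection, assuming hypothetical in-memory records (the `WeightRecord` class and its field names are illustrative only and do not come from the repository):

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List

HCP = "ehds:HealthcareProvider"


@dataclass(frozen=True)
class WeightRecord:
    measured_on: date
    weight: float
    measurer: str
    measurer_type: str  # class of the measurer, e.g. "ehds:HealthcareProvider"


def weight_by_hcp(records: Iterable[WeightRecord]) -> List[WeightRecord]:
    """Keep only measurements taken by a healthcare provider, ordered by date."""
    selected = [r for r in records if r.measurer_type == HCP]
    return sorted(selected, key=lambda r: r.measured_on)


records = [
    WeightRecord(date(2024, 5, 2), 81.5, "ex:dr-a", HCP),
    WeightRecord(date(2024, 3, 1), 83.0, "ex:patient", "ex:Patient"),
    WeightRecord(date(2024, 1, 15), 84.2, "ex:dr-a", HCP),
]
for r in weight_by_hcp(records):
    print(r.measured_on, r.weight, r.measurer)
```

This mirrors the filter and ordering of the SPARQL variant; the self-reported measurement is dropped, which is the case the SHACL constraint reports as a violation.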
3. UC-04:
"during a patient encounter, a physician asks if the patient is willing to share a specific subset of data for secondary use by him/her (i.e. to analyze the weight outcomes of his/her own patient population from the first encounter until 1y after surgery); there's no limitation in time"
"Retrospective and prospective data until 1 year after surgery"
- Should the surgery be a request event?
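To make the surgery question concrete: the window in the quoted requirement can only be closed once the surgery date is known. A minimal sketch of that check, assuming the window starts at the first encounter and ends exactly one calendar year after surgery, inclusive (the function names and the handling of an unknown surgery date are my assumptions, not part of the use case text):

```python
from datetime import date
from typing import Optional


def add_one_year(d: date) -> date:
    """Same day one year later; 29 Feb falls back to 28 Feb."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:  # 29 Feb in a leap year
        return d.replace(year=d.year + 1, day=28)


def in_sharing_window(measured_on: date, first_encounter: date,
                      surgery: Optional[date]) -> bool:
    """True if a measurement lies between first encounter and surgery + 1 year.

    Assumption: while the surgery date is unknown, only the lower bound applies.
    """
    if measured_on < first_encounter:
        return False
    if surgery is None:
        return True
    return measured_on <= add_one_year(surgery)


first, surgery = date(2023, 1, 10), date(2024, 3, 1)
print(in_sharing_window(date(2024, 6, 1), first, surgery))
print(in_sharing_window(date(2025, 3, 2), first, surgery))
```

The upper bound depends on a date that may not exist when the policy is written, which is why modelling the surgery as an event seems worth discussing.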
- Pseudo/anonymisation is specified as a requirement. If the physician is only going to use his own patients' data and will just analyse it without sharing it, does he still have to anonymise it? If so, is pseudonymisation enough?
I have done the use case with pseudonymisation, but I can change it to anonymisation if needed.
Furthermore, if the data could be used without pseudo/anonymisation, I can create another use case without that step.
- I have included EHDS purposes:
    odrl:leftOperand dpv-odrl:Purpose ;
    odrl:operator odrl:isAnyOf ;
    odrl:rightOperand dpv:ScientificResearch, dpv:ImproveHealthcare,
        sector-health:ResearchDevelopment, ehds:HealthcareScientificResearch,
        ehds:ProvideHealthcareOfficialStatistics, ehds:PublicInterestRelatedToHealth,
        ehds:EnsureQualitySafetyHealthcare, ehds:ProtectAgainstCrossBorderThreatsToHealth,
        ehds:PublicHealthSurveillance
- Regarding the pseudonymisation strategy, I have ended up doing it as a processing constraint:
    odrl:constraint [
        odrl:leftOperand dpv-odrl:Processing ;
        odrl:operator odrl:isA ;
        odrl:rightOperand dpv:Pseudonymisation ],[
        odrl:leftOperand dpv:hasPseudonymisationTechnique ;
        odrl:operator odrl:isAnyOf ;
        odrl:rightOperand dpv:DeterministicPseudonymisation, dpv:MonotonicCounterPseudonymisation ]
DeterministicPseudonymisation: the same input always produces the same pseudonym.
MonotonicCounterPseudonymisation: a monotonically increasing counter (P1, P2, ...).
Expressing it in the description like this:
Pseudo/Anonymisation as a duty: dpv:Pseudonymisation (dpv:DeterministicPseudonymisation, dpv:MonotonicCounterPseudonymisation)
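To illustrate the difference between the two techniques, here is a minimal Python sketch. It is only an illustration under assumptions: the keyed HMAC-SHA-256 construction, the 16-character truncation, and the names `deterministic_pseudonym` and `CounterPseudonymiser` are my own choices and are not prescribed by DPV or by the use case.

```python
import hashlib
import hmac
from itertools import count


def deterministic_pseudonym(identifier: str, key: bytes) -> str:
    """Same identifier and key always give the same pseudonym (keyed hash)."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability; an illustrative choice


class CounterPseudonymiser:
    """Assigns P1, P2, ... in order of first appearance."""

    def __init__(self, prefix: str = "P") -> None:
        self._prefix = prefix
        self._counter = count(1)
        self._assigned = {}  # identifier -> pseudonym; must be stored securely

    def pseudonym(self, identifier: str) -> str:
        if identifier not in self._assigned:
            self._assigned[identifier] = f"{self._prefix}{next(self._counter)}"
        return self._assigned[identifier]


key = b"example-secret-key"  # placeholder; a real key must be kept secret
print(deterministic_pseudonym("patient-123", key))

pseudonymiser = CounterPseudonymiser()
print(pseudonymiser.pseudonym("patient-123"))
print(pseudonymiser.pseudonym("patient-456"))
print(pseudonymiser.pseudonym("patient-123"))
```

Both variants keep a patient's measurements linkable over time, which the weight-outcome analysis needs. The counter variant requires a stored mapping table, while the deterministic one only requires the key.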
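Other options:
- odrl:action dpv-odrl:Pseudonymise / dpv-odrl:Anonymise
- ehds:DataRequest: Seeks access to statistically anonymised ehd
- A constraint on the data:
    odrl:constraint [
        odrl:leftOperand dpv:Data ;
        odrl:operator odrl:isAnyOf ;
        odrl:rightOperand dpv:PseudonymisedData, dpv:AnonymisedData ]
- A refinement on the target:
    odrl:target [
        odrl:source ex:weight ;
        odrl:refinement [
            odrl:leftOperand dpv:Data ;
            odrl:operator odrl:isAnyOf ;
            odrl:rightOperand dpv:PseudonymisedData, dpv:AnonymisedData ]] ;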
What do you think? Is my approach correct?
Would you prefer a different one?
Should it be anonymised or is it enough like this?