UC-02 & UC-04 #3

Description

@henarmartinsantos

Forked repo with UC-02 & UC-04: henarmartinsantos@e41dc1b

Questions to discuss:

  1. Set & Request URI terminology.

    I have decided to follow this format:

    Policy ex:ucXX-[data_tech]-p-<data_cat>-<purpose>-<scope>
    ex:uc01-p-ehr-primarycare-read
    ex:uc02-refinedURI-p-weight-primarycare-read

    Request ex:ucXX-[data_tech]-r-<data_cat>-<purpose>-<scope>-<assignee>
    ex:uc01-r-ehr-primarycare-read-physician
    ex:uc02-refinedURI-r-weight-primarycare-read-physician

[data_tech] differentiates between the techniques used to access a subset of data: URI / refinedURI / SPARQL / SHACL
It is only used in those cases, UC-02 and UC-04, which have 4 possible sets and 4 possible requests.
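The naming format above can be sketched as a small helper. This is only an illustration of the convention, not code from the repo: the function names `policy_uri` and `request_uri` and the `ex:` prefix handling are assumptions made for this example.

```python
from typing import List, Optional

# Techniques listed above for accessing a subset of data (UC-02 and UC-04 only).
DATA_TECHS = ("URI", "refinedURI", "SPARQL", "SHACL")


def _prefix_parts(use_case: int, data_tech: Optional[str]) -> List[str]:
    """Return the leading 'ucXX' and optional '[data_tech]' segments."""
    if data_tech is not None and data_tech not in DATA_TECHS:
        raise ValueError(f"unknown data_tech: {data_tech!r}")
    parts = [f"uc{use_case:02d}"]
    if data_tech is not None:
        parts.append(data_tech)
    return parts


def policy_uri(use_case: int, data_cat: str, purpose: str, scope: str,
               data_tech: Optional[str] = None) -> str:
    """ex:ucXX-[data_tech]-p-<data_cat>-<purpose>-<scope>"""
    parts = _prefix_parts(use_case, data_tech) + ["p", data_cat, purpose, scope]
    return "ex:" + "-".join(parts)


def request_uri(use_case: int, data_cat: str, purpose: str, scope: str,
                assignee: str, data_tech: Optional[str] = None) -> str:
    """ex:ucXX-[data_tech]-r-<data_cat>-<purpose>-<scope>-<assignee>"""
    parts = _prefix_parts(use_case, data_tech) + ["r", data_cat, purpose, scope, assignee]
    return "ex:" + "-".join(parts)
```

Called with the values from the examples above, `policy_uri(2, "weight", "primarycare", "read", data_tech="refinedURI")` reproduces the second policy identifier.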

2. UC-02

  1. I have implemented ehds:HealthcareProvider
  2. I have created 4 'subcases'. Since it is the same use case with different strategies for accessing the data, I have structured it like this instead of creating new use cases:
    [Image]

Is it okay like this or would you prefer independent new use cases?
UC-04 follows the same structure as it also accesses a weight subset.

  • uc-02-URI: subset of data with a known URI:
    <https://example.org/weight-by-hcp>

  • uc-02-refinedURI: weight URI with a refinement.

odrl:target [ 
           odrl:source ex:weight ; #rdf:value
           odrl:refinement [
                odrl:leftOperand dpv-odrl:DataControllerDataSource ;
                odrl:operator odrl:eq ;
                odrl:rightOperand ehds:HealthcareProvider ]] ;

I don't know whether it is more appropriate to attach the refinement to odrl:source or to rdf:value. I believe odrl:source fits better, as the weight data is more of an asset collection (odrl:AssetCollection), not an action.

  • uc-02-SPARQL: SPARQL query integrated as literal
odrl:target [
            a odrl:Asset ;
            dcterms:description "SPARQL query to fetch patient weight data";
            rdf:value """
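                PREFIX ex: <https://example.org/ns#>
                # ehds: also needs a PREFIX declaration here; the query string does not inherit the Turtle prefixes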
                SELECT ?date ?weight ?measurer
                WHERE {
                    GRAPH ex:health-data {
                        ex:patient ex:weight ?weight ;
                                ex:measurementDate ?date ;
                                ex:measuredBy ?measurer .
                        ?measurer a ehds:HealthcareProvider .
                    }
                }
                ORDER BY ?date
                """ ;
            ] ; 

  • uc-02-SHACL: As discussed in the meeting, I have included SHACL validation as a fourth option for accessing the subset of data.
@prefix sh:            <http://www.w3.org/ns/shacl#> .

### --- ODRL Policy --- ###
ex:uc02-p-weight-primarycare-read a odrl:Set ;
    odrl:uid ex:uc02-p-weight-primarycare-read ;
    dcterms:description "Patient allows her weight subset of health data to be read for primary care. SHACL validation." ;
    odrl:permission [
        odrl:action odrl:read ;
        odrl:target ex:weight-data ;  # Reference to the data asset (separately defined)
        odrl:assigner ex:patient ;
        odrl:constraint [
            odrl:leftOperand dpv-odrl:Purpose ;
            odrl:operator odrl:eq ;
            odrl:rightOperand sector-health:PrimaryCareManagement 
        ]
    ] .

### --- SHACL Validation --- ###
ex:WeightShape a sh:NodeShape ;
    sh:targetClass ex:Patient ;
    sh:property [
        sh:path ex:weight ;
        sh:datatype xsd:float ;
        sh:minCount 1 ;
    ] ;
    sh:property [
        sh:path ex:measurementDate ;
        sh:datatype xsd:date ;
        sh:minCount 1 ;
    ] ;
    sh:property [
        sh:path ex:measuredBy ;
        sh:class ehds:HealthcareProvider ;
        sh:minCount 1 ;
    ] ;
    sh:sparql [ # Reports a violation when the measurer is not a healthcare provider
        # ex: and ehds: must be declared for this query, e.g. via sh:prefixes
        sh:select """
            SELECT $this ?date ?weight ?measurer
            WHERE {
                $this ex:weight ?weight ;
                      ex:measurementDate ?date ;
                      ex:measuredBy ?measurer .
                FILTER NOT EXISTS { ?measurer a ehds:HealthcareProvider . }
            }
        """ ;
        sh:message "Weight measurement must be recorded by a healthcare provider." ;
    ] .

### --- Link ODRL Target to SHACL --- ###
ex:weight-data a odrl:Asset ;
    dcterms:conformsTo ex:WeightShape .  # Indicates the data must comply with SHACL rules

As you can see, the validation and its link to ODRL are "independent" from the policy itself. We can declare everything in the same file, which is simple and keeps each case self-contained, or create separate files and import their definitions to reduce code repetition (the same definitions are needed for the policy, for the request, and every time the same subset is accessed).
What do you think?
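For reference, the checks encoded in ex:WeightShape amount to roughly the following. This is a plain-Python stand-in, not a SHACL engine: `WeightRecord`, `validate`, and the set of known healthcare providers are hypothetical names introduced for this sketch, and the shape is flattened to one record per measurement.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Set


@dataclass
class WeightRecord:
    weight: Optional[float]            # ex:weight, xsd:float, required
    measurement_date: Optional[date]   # ex:measurementDate, xsd:date, required
    measured_by: Optional[str]         # ex:measuredBy, IRI of the measurer, required


def validate(record: WeightRecord, healthcare_providers: Set[str]) -> List[str]:
    """Return violation messages; an empty list means the record conforms."""
    problems: List[str] = []
    if not isinstance(record.weight, float):
        problems.append("ex:weight must be present and a float.")
    if not isinstance(record.measurement_date, date):
        problems.append("ex:measurementDate must be present and a date.")
    if record.measured_by is None:
        problems.append("ex:measuredBy must be present.")
    elif record.measured_by not in healthcare_providers:
        # Same message as sh:message in the shape
        problems.append("Weight measurement must be recorded by a healthcare provider.")
    return problems
```

A record measured by someone outside `healthcare_providers` yields the same message the shape reports, which is the behaviour a policy evaluator would rely on whether the shape lives in the policy file or in a separate one.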

3. UC-04:
"during a patient encounter, a physician asks if the patient is willing to share a specific subset of data for secondary use by him/her (i.e. to analyze the weight outcomes of his/her own patient population from the first encounter until 1y after surgery); there's no limitation in time"
"Retrospective and prospective data until 1 year after surgery"
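One possible reading of the time window in these quotes, sketched in Python. The function names and the inclusive bounds are assumptions made for illustration; the quotes do not say whether the boundary days count.

```python
from datetime import date


def one_year_after(d: date) -> date:
    """Same calendar day one year later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        # 29 Feb has no counterpart in the following non-leap year
        return d.replace(year=d.year + 1, day=28)


def in_uc04_window(measured: date, first_encounter: date, surgery: date) -> bool:
    """True if a measurement lies between the first encounter and 1 year after surgery."""
    return first_encounter <= measured <= one_year_after(surgery)
```

Under this reading, the surgery date has to be known to whoever evaluates the policy, which is the background to the first question below.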

  1. Should the surgery be a request event?

  2. Pseudo/anonymisation is specified as a requirement. If the physician is only going to use his own patients' data and will analyse it without sharing it, does he still have to anonymise it? If so, is pseudonymising it enough?
    I have implemented the use case with pseudonymisation, but I can change it to anonymisation if needed.

Furthermore, if it could be used without pseudo/anonymisation, I can create another use case removing that step.

  3. I have included EHDS purposes:
odrl:leftOperand dpv-odrl:Purpose ;
    odrl:operator odrl:isAnyOf ;
    odrl:rightOperand dpv:ScientificResearch, dpv:ImproveHealthcare, sector-health:ResearchDevelopment, ehds:HealthcareScientificResearch, ehds:ProvideHealthcareOfficialStatistics, ehds:PublicInterestRelatedToHealth, ehds:EnsureQualitySafetyHealthcare, ehds:ProtectAgainstCrossBorderThreatsToHealth, ehds:PublicHealthSurveillance

  4. Regarding the pseudonymisation strategy, I have ended up doing it as a processing constraint:
odrl:constraint [
        odrl:leftOperand dpv-odrl:Processing ;
        odrl:operator odrl:isA ;
        odrl:rightOperand dpv:Pseudonymisation ],[
        odrl:leftOperand dpv:hasPseudonymisationTechnique ;
        odrl:operator odrl:isAnyOf ;
        odrl:rightOperand dpv:DeterministicPseudonymisation, dpv:MonotonicCounterPseudonymisation ]

DeterministicPseudonymisation: the same input always produces the same pseudonym.
MonotonicCounterPseudonymisation: Monotonically increasing counter (P1, P2...)

Expressing it in the description like this:
Pseudo/Anonymisation as a duty: dpv:Pseudonymisation (dpv:DeterministicPseudonymisation, dpv:MonotonicCounterPseudonymisation)
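To make the two techniques named above concrete, here is a minimal Python sketch. It is not part of the use case files: the keyed HMAC-SHA256 construction, the 16-character truncation, and the names `deterministic_pseudonym` and `CounterPseudonymiser` are assumptions chosen for illustration.

```python
import hashlib
import hmac
from typing import Dict


def deterministic_pseudonym(identifier: str, secret_key: bytes) -> str:
    """Deterministic: the same identifier and key always give the same pseudonym."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # truncated only for readability (assumption)


class CounterPseudonymiser:
    """Monotonic counter: identifiers receive P1, P2, ... in order of first appearance."""

    def __init__(self) -> None:
        self._assigned: Dict[str, str] = {}

    def pseudonym(self, identifier: str) -> str:
        # Assumption: a lookup table is kept so a returning patient keeps the same label.
        if identifier not in self._assigned:
            self._assigned[identifier] = f"P{len(self._assigned) + 1}"
        return self._assigned[identifier]
```

Both variants keep a patient's measurements linkable over time, which the weight-outcome analysis needs. Both also remain reversible for whoever holds the key or the lookup table, so the result is pseudonymised rather than anonymised data.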

Other options:

  • Data constraint:
odrl:constraint [
        odrl:leftOperand dpv:Data ;
        odrl:operator odrl:isAnyOf ;
        odrl:rightOperand dpv:PseudonymisedData, dpv:AnonymisedData ]
  • Refinement on ex:weight
odrl:target [
        odrl:source ex:weight ;
        odrl:refinement [
             odrl:leftOperand dpv:Data ;
             odrl:operator odrl:isAnyOf ;
             odrl:rightOperand dpv:PseudonymisedData, dpv:AnonymisedData ]] ;
  • odrl:action dpv-odrl:Pseudonymise / dpv-odrl:Anonymise

  • ehds:DataRequest : Seeks access to statistically anonymised ehd

What do you think? Is my approach correct?
Would you prefer a different one?
Should it be anonymised or is it enough like this?
