Abstracting Data Updates over a Document-oriented interface of a Permissioned Decentralized Environment

Jitse De Smet

  • Motivation
  • Research Question and Hypothesis
  • Proposed Solution
  • Results
  • Conclusion

Motivation

Data is the New Gold(?)

  • Deloitte: data is a strategic asset
  • Forbes: "How To Make Use Of The New Gold: Data"
  • Imec: "Data must not be the new gold"
  • Prof Sir Nigel Shadbolt and Sir Tim Berners-Lee: "Data is not the new oil"

Motivation

Decentralization initiatives

Solid

  • Use Existing Web Technologies
  • Self Governed Data Store
  • Heterogeneity
    • of Interface
    • of Data
    • of Structure

Handling Heterogeneity

Research Question and Hypothesis

"How can we abstract data updates over a document-oriented interface of a permissioned decentralized environment behind a query abstraction layer?"

Solution

  • Heterogeneity of Data → Describe Data
  • Heterogeneity of Structure → Describe Structure
Data can be described with a shape description language such as ShEx or SHACL.
Example: SHACL Shape Description of a social media post.

:postShape a sh:NodeShape ;
  sh:property [
    sh:path rdf:type ;
    sh:hasValue ldbc:Post ;
  ] ;
  sh:property [
    sh:path ldbc:creationDate ;
    sh:datatype xsd:dateTime ;
    sh:minCount 1 ;
    sh:maxCount 1 ;
  ] ;
  sh:property [
    sh:path ldbc:id ;
    sh:datatype xsd:long ;
    sh:minCount 1 ;
    sh:maxCount 1 ;
  ] .
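To make the three constraints in the shape concrete, the following minimal Python sketch checks a post against them by hand. It is an illustration only, not a SHACL engine: the dictionary representation of a node, the function names, and the use of `datetime.fromisoformat` as a loose stand-in for `xsd:dateTime` are all assumptions made for this example.

```python
from datetime import datetime


def is_datetime(value):
    """Loose stand-in for xsd:dateTime: accepts ISO 8601 strings."""
    try:
        datetime.fromisoformat(value)
        return True
    except (TypeError, ValueError):
        return False


def is_long(value):
    """Stand-in for xsd:long: an integer within the signed 64-bit range."""
    return (
        isinstance(value, int)
        and not isinstance(value, bool)
        and -2**63 <= value < 2**63
    )


def conforms_to_post_shape(node):
    """Return a list of violations; an empty list means the node conforms.

    `node` is a hypothetical representation: property path -> list of values.
    """
    errors = []
    # sh:hasValue: at least one rdf:type value must be ldbc:Post.
    if "ldbc:Post" not in node.get("rdf:type", []):
        errors.append("rdf:type must include ldbc:Post")
    # sh:minCount 1 and sh:maxCount 1, plus sh:datatype, for both properties.
    for path, check in (("ldbc:creationDate", is_datetime), ("ldbc:id", is_long)):
        values = node.get(path, [])
        if len(values) != 1:
            errors.append(f"{path} must occur exactly once")
        elif not check(values[0]):
            errors.append(f"{path} has the wrong datatype")
    return errors


post = {
    "rdf:type": ["ldbc:Post"],
    "ldbc:creationDate": ["2024-01-30T10:00:00"],
    "ldbc:id": [42],
}
print(conforms_to_post_shape(post))
```

A real client would hand the shape and the data to an existing SHACL or ShEx validator instead of re-implementing the constraints.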
            
Example: LDP Structure
posts/
  |- Valencia/
  |  |- #one
  |  |- #two
  |- Ghent/
  |  |- #one
  |  |- #two
  |- Paris/
  |  |- #one
  |  |- #two
  |  |- #three
            
posts/
  |- 30-01-2024/
  |  |- #one
  |  |- #two
  |- 14-02-2024/
  |  |- #one
  |  |- #two
  |- 17-05-2023/
  |  |- #one
  |  |- #two
  |  |- #three
  |  |- #four
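The two layouts above hold the same kind of posts but group them by a different property. A minimal Python sketch of that idea follows; the post fields (`location`, `created`) and the function names are hypothetical and only serve to show that the target container depends on the grouping rule, not on the post itself.

```python
from datetime import date


def container_by_location(post):
    """Group posts in one container per location, e.g. posts/Ghent/."""
    return f"posts/{post['location']}/"


def container_by_date(post):
    """Group posts in one container per creation day, e.g. posts/30-01-2024/."""
    return f"posts/{post['created'].strftime('%d-%m-%Y')}/"


# Hypothetical post record used for illustration.
post = {"id": "one", "location": "Ghent", "created": date(2024, 1, 30)}

print(container_by_location(post))
print(container_by_date(post))
```

A client that wants to write this post has to know which of the two rules the pod uses, which is exactly the information a structure description needs to carry.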
            
Structure can be described with indexes such as Type Indexes or Shape Trees.
Example: Shape Trees describing a list of files

<#PicturesTree>
  a st:ShapeTree ;
  st:expectsType st:Container ;
  st:shape ex:PicturesShape ;
  st:contains <#PicturesByCityTree> .

<#PicturesByCityTree>
  a st:ShapeTree ;
  st:expectsType st:Container ;
  st:shape ex:PicturesByCityShape ;
  st:contains <#PictureTree> .

<#PictureTree>
  a st:ShapeTree ;
  st:expectsType st:Resource ;
  st:shape ex:PictureShape .
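As a rough illustration of how a client could read the three shape trees above, the Python sketch below maps a path, relative to the pictures container, to the tree that governs it. The dictionary encoding, the single-valued `contains`, and the example path names are simplifying assumptions for this sketch and do not follow the Shape Trees processing rules in detail.

```python
# Hand-written mirror of the three shape trees; the encoding is hypothetical.
SHAPE_TREES = {
    "#PicturesTree": {
        "expects": "Container",
        "shape": "ex:PicturesShape",
        "contains": "#PicturesByCityTree",
    },
    "#PicturesByCityTree": {
        "expects": "Container",
        "shape": "ex:PicturesByCityShape",
        "contains": "#PictureTree",
    },
    "#PictureTree": {
        "expects": "Resource",
        "shape": "ex:PictureShape",
        "contains": None,
    },
}


def tree_for_path(path, root="#PicturesTree"):
    """Return the shape tree governing `path` (relative to the root), or None."""
    segments = [s for s in path.strip("/").split("/") if s]
    current = root
    # Each path segment descends one level through st:contains.
    for _ in segments:
        current = SHAPE_TREES[current]["contains"]
        if current is None:
            return None  # deeper than the trees allow
    # A trailing slash (or the root itself) denotes a container.
    is_container = path == "" or path.endswith("/")
    expects_container = SHAPE_TREES[current]["expects"] == "Container"
    return current if is_container == expects_container else None


print(tree_for_path("Ghent/"))
print(tree_for_path("Ghent/belfry.jpg"))
```

The returned tree names the shape a resource at that path should satisfy, but it does not say which city container a new picture belongs in.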
            
Is this enough?
To check that, I listed some functional requirements and user stories.
The answer: NO.

What are we missing?

  1. What if multiple directories match?
  2. What if no directories match?
  3. How are resources grouped?
  4. Am I creating a leaf directory?
  5. Can I update? Does it stay where it is?
  6. Do all clients follow the same rules?
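The first two questions can be made concrete with a small Python sketch. It assumes a hypothetical pod in which each directory description lists the resource types it accepts; the layout and the function name are invented for illustration.

```python
def matching_directories(resource_type, directories):
    """Return every directory whose description accepts `resource_type`."""
    return [
        path
        for path, accepted in directories.items()
        if resource_type in accepted
    ]


# Hypothetical pod layout: directory -> resource types its description accepts.
directories = {
    "posts/": {"Post"},
    "archive/": {"Post", "Comment"},
    "comments/": {"Comment"},
}

print(matching_directories("Post", directories))     # more than one candidate
print(matching_directories("Picture", directories))  # no candidate at all
```

With only shape and structure descriptions, a client has no rule for breaking the tie in the first case or for choosing a location in the second.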

Storage Guidance Vocabulary

  1. Resource Collection
  2. Unstructured Collection
  3. Structured Collection
  4. Canonical Collection
  5. Derived Collection
  6. Resource Description
  7. Group Strategy
  8. Store Condition
  9. Update Condition
  10. Client Control

Storage Guidance Vocabulary

Schematic overview of an SGV creation flow

Storage Guidance Vocabulary

Schematic overview of an SGV update flow

Taste the pudding

Empirical Evaluation

Create Resource
Fragmentation strategy    ops/sec    Average Time (µs)    Margin
by date: SGV                   22            44582.068    ±1.73%
by date: RAW                   35            27899.513    ±2.07%
one file: SGV                   6           149415.739    ±2.98%
one file: RAW                   7           134361.192    ±8.66%
own file: SGV                  10            91851.395    ±2.56%
own file: RAW                  13            76672.217    ±3.07%
by location: SGV               23            43005.366    ±2.20%
by location: RAW               35            28003.949    ±2.53%

Empirical Evaluation

Move Resource
Fragmentation strategy    ops/sec    Average Time (µs)    Margin
by date: SGV                    7           141940.530    ±1.28%
by date: RAW                   11            87113.119    ±0.75%
one file: SGV                   2           343690.220    ±1.70%
one file: RAW                   4           208930.211    ±2.04%
own file: SGV                   5           177991.908    ±0.58%
own file: RAW                  12            80729.940    ±1.06%
by location: SGV                7           133052.120    ±0.60%
by location: RAW               12            81066.196    ±1.15%
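To compare SGV against RAW per fragmentation strategy, the average times from both tables can be turned into a slowdown factor (SGV time divided by RAW time). The Python sketch below does that arithmetic. It assumes each row reads as an ops/sec count followed by the average time, and the variable and function names are chosen for this example.

```python
# Average times (SGV, RAW) per fragmentation strategy, copied from the tables.
CREATE = {
    "by date": (44582.068, 27899.513),
    "one file": (149415.739, 134361.192),
    "own file": (91851.395, 76672.217),
    "by location": (43005.366, 28003.949),
}
MOVE = {
    "by date": (141940.530, 87113.119),
    "one file": (343690.220, 208930.211),
    "own file": (177991.908, 80729.940),
    "by location": (133052.120, 81066.196),
}


def slowdown(sgv_time, raw_time):
    """Ratio of SGV to RAW average time; 1.0 would mean no overhead."""
    return sgv_time / raw_time


def report(table):
    """Map each strategy to its SGV/RAW slowdown factor."""
    return {strategy: slowdown(*times) for strategy, times in table.items()}


for name, table in (("create", CREATE), ("move", MOVE)):
    for strategy, factor in report(table).items():
        print(f"{name:6} {strategy:12} {factor:.2f}x")
```

Because the factor is a ratio of two times, it does not depend on the unit of the time column. For creating a resource with the by date strategy it comes out at roughly 1.6.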

Conclusion

  1. Automated Client with limited overhead is possible
  2. Lack of server-side control
  3. Inter-pod Updates
  4. Investigate Other Interfaces
  5. Structure has a high influence on execution time
  6. Smart Access Control
  7. CAP / ACID / BASE

Time for Questions