proposal: thing-model-catalog: Working With Multiple Thing Model Repositories #13

@alexbrdn

[Description]

One of the requirements for Thing Model Catalog (TMC) is to enable using multiple and different remote repositories (remotes) for storing Thing Models, e.g. public vs private, file system vs Amazon S3 storage. This proposal describes envisioned workflows for working with remotes and necessary functionality on the part of TMC.

The TMC should support at least the following workflows. The list is not exhaustive.

  • Get up and running with a local repository
    Download the tm-catalog binary, set up a local folder as a [file] remote repository, start pushing TMs to it.
  • Edit a TM
    Fetch a TM from a remote, edit the file, push it back to the same remote.
  • Use a git repository as a TM Catalog
    Initialize a git repository in a folder that is used as a file remote, push it to a remote git server, then add that server as a TMC remote.
  • Shorten a TM for a particular use case
    Fetch the manufacturer's complete TM from a publicly available catalog, remove the affordances not needed for the use case, push the result to a private catalog.
  • Search across multiple remotes
    Add a combination of local/remote/public/private repositories to the tm-catalog config. Search for a TM or list versions of a TM across all of the remotes.
  • Merge one remote into another
    A subdivision of a larger organization may develop its own TMC and later decide to merge it into the central TMC maintained by the company.

[How]

The TMC should have a CLI command to manage remotes. This command should support CRUD operations on the list of remotes and store that list in the config file. Parts of the configuration (e.g. authentication secrets) may be left out of the config file and instead be derived from environment variables.
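
As a rough illustration of the idea above, here is a minimal Python sketch of a remotes store with CRUD operations and environment-derived secrets. Everything in it is an assumption made for illustration and not part of the proposal: the `RemoteConfig` class, the JSON file layout, the `type`/`loc` keys, and the `TMC_REMOTE_<NAME>_TOKEN` variable naming scheme.

```python
import json
import os
import tempfile
from pathlib import Path


class RemoteConfig:
    """Hypothetical store for the list of remotes, persisted as a JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        self.remotes = {}
        if self.path.exists():
            self.remotes = json.loads(self.path.read_text())

    def _save(self):
        self.path.write_text(json.dumps(self.remotes, indent=2))

    def add(self, name, kind, location):
        if name in self.remotes:
            raise ValueError(f"remote {name!r} already exists")
        self.remotes[name] = {"type": kind, "loc": location}
        self._save()

    def update(self, name, **fields):
        self.remotes[name].update(fields)
        self._save()

    def remove(self, name):
        del self.remotes[name]
        self._save()

    def resolve(self, name, env=os.environ):
        """Return the stored settings, plus a secret from the environment if one is set."""
        settings = dict(self.remotes[name])
        # Assumed naming scheme for secrets: TMC_REMOTE_<NAME>_TOKEN
        token = env.get(f"TMC_REMOTE_{name.upper()}_TOKEN")
        if token is not None:
            settings["token"] = token
        return settings


# Usage: the secret is supplied at resolve time and never written to the file.
with tempfile.TemporaryDirectory() as tmp:
    cfg = RemoteConfig(Path(tmp) / "config.json")
    cfg.add("local", "file", "/home/user/tm-catalog")
    cfg.add("acme", "http", "https://tmc.example.com")
    print(cfg.resolve("acme", env={"TMC_REMOTE_ACME_TOKEN": "s3cr3t"}))
```

Keeping the token out of the persisted dictionary means the config file can be shared or committed without leaking credentials.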

The following kinds of remotes are envisioned and can be implemented in due time:

  • local filesystem
  • http-accessible file storage
    allows configuring e.g. a remotely hosted git repository as a remote for tm-catalog
  • tm-catalog server instance running remotely
    incidentally, allows for easy sharding of large catalogs, for example if some organization decides to host a global TMC that would be the default for everyone, akin to Docker Hub.
  • cloud bulk storage, e.g. Amazon S3

The commands (APIs) that TMC provides through its CLI or REST API can be divided into three groups by their relation to remotes:

  1. Doesn't need a remote. E.g. validate
  2. Requires exactly one remote. E.g. push
  3. Performs a federated action on all remotes, but may be limited to just one. E.g. list, versions, fetch
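
The three groups could be modeled roughly as follows. This is only a sketch under assumptions: the `Scope` enum, the `select_remotes` helper, and the rule that a single-remote command falls back to the only configured remote when none is named are all hypothetical and not specified by the proposal.

```python
from enum import Enum


class Scope(Enum):
    NONE = "none"            # group 1: no remote needed
    ONE = "one"              # group 2: exactly one remote
    FEDERATED = "federated"  # group 3: all remotes, optionally limited to one


# Command names are taken from the list above; the mapping itself is illustrative.
COMMAND_SCOPE = {
    "validate": Scope.NONE,
    "push": Scope.ONE,
    "list": Scope.FEDERATED,
    "versions": Scope.FEDERATED,
    "fetch": Scope.FEDERATED,
}


def select_remotes(command, configured, requested=None):
    """Return the names of the remotes a command should operate on."""
    scope = COMMAND_SCOPE[command]
    if scope is Scope.NONE:
        return []
    if requested is not None:
        if requested not in configured:
            raise ValueError(f"unknown remote {requested!r}")
        return [requested]
    if scope is Scope.ONE:
        # Assumption: an unambiguous single remote may be used implicitly.
        if len(configured) == 1:
            return list(configured)
        raise ValueError(f"{command} needs exactly one remote; name it explicitly")
    return list(configured)
```

With remotes `["local", "acme"]` configured, `select_remotes("list", ...)` returns both, while `select_remotes("push", ...)` without a named remote raises an error instead of guessing a target.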

The implementation of the third kind of commands - federated actions - should be extended to support multiple remotes. Of special interest is the list command, which should perform a federated search across all remotes. For such a search to work, each kind of remote should either implement a search API that returns structured results (with rankings), or host a predefined search index at a known location. The rankings are necessary to merge search results from different remotes in a meaningful order. A tm-catalog instance or cloud bulk storage will have its own search API. A file remote or a plain http server serving a file directory, on the other hand, must include a search index.
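
One possible shape for the merge step is sketched below in Python. It assumes, purely for illustration, that every remote exposes a search function returning `(tm_id, score)` pairs and that scores are comparable across remotes; the proposal does not define a ranking format. The names `make_index_remote` and `federated_list`, and the keyword-fraction score, are hypothetical.

```python
def make_index_remote(index):
    """Build a search function over a prebuilt index of the form {tm_id: [keywords]}.

    Stands in for a file or plain http remote that hosts a search index.
    """
    def search(query):
        q = query.lower()
        hits = []
        for tm_id, keywords in index.items():
            matches = sum(1 for k in keywords if q in k.lower())
            if matches:
                # Toy ranking: fraction of keywords containing the query.
                hits.append((tm_id, matches / len(keywords)))
        return hits
    return search


def federated_list(remotes, query):
    """Query every remote and merge the ranked results into one list.

    A TM found in several remotes appears once, with its best score and
    the names of all remotes that returned it.
    """
    merged = {}
    for name, search in remotes.items():
        for tm_id, score in search(query):
            entry = merged.setdefault(tm_id, {"id": tm_id, "score": score, "remotes": []})
            entry["score"] = max(entry["score"], score)
            entry["remotes"].append(name)
    # Highest score first; ties broken by id for a stable order.
    return sorted(merged.values(), key=lambda e: (-e["score"], e["id"]))


local = make_index_remote({"acme/lamp": ["lamp", "dimmable lamp"], "acme/pump": ["pump"]})
public = make_index_remote({"acme/lamp": ["lamp", "light"], "omni/floorlamp": ["floor lamp"]})

for hit in federated_list({"local": local, "public": public}, "lamp"):
    print(hit["id"], hit["score"], hit["remotes"])
```

If remotes rank on different scales, the scores would need to be normalized per remote before merging; that step is left out here.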

cc => @hadjian @EVO-Antoniazzi
