From ecaf94d8b8fc303f81ba0d89386399f4fd9d4318 Mon Sep 17 00:00:00 2001 From: John Biltcliffe Date: Tue, 6 Jan 2026 16:14:40 +0000 Subject: [PATCH 01/14] Initial ADR creation --- ...xxx-source-collected_by-query-parameter.md | 79 +++++++++++++++++++ 1 file changed, 79 insertions(+) create mode 100644 docs/adr/xxxx-source-collected_by-query-parameter.md diff --git a/docs/adr/xxxx-source-collected_by-query-parameter.md b/docs/adr/xxxx-source-collected_by-query-parameter.md new file mode 100644 index 00000000..4cc96d5e --- /dev/null +++ b/docs/adr/xxxx-source-collected_by-query-parameter.md @@ -0,0 +1,79 @@ +--- +status: "draft" +--- +# Add the ability to query sources without a collected_by value + +## Context and Problem Statement + +Currently at the source level there is no filtering on the collected_by field. +When vendors are building systems which list content stored in TAMS then they are only looking for the sources at the top of the content tree. +In many cases this is the multi-source, however due to single essence (eg audio only) workflows then this is not a reliable method of finding the top level content. +To achieve this they must go through every result and look to see if the collected_by field is missing as this then indicates that it is the top of a source collection. +This process can also throw any form of sensible pagination in a client application as when requesting content from the TAMS API it is not known how many results will be retained or discarded. + +This ADR is to look at moving this behaviour from the client side into the API. +This has the benefit of not only making it simpler for any system looking for top level content in the system, but also makes the result more predicable as the number of rows requested from the API will result in the available rows up to that limit. 
+ + +## Considered Options + +* Option 1: Follow tags example and use two query parameters for value and exists +* Option 2: Follow tags example but only implement the exists query parameter +* Option 3: Follow accept_get_urls example and use an empty query parameter +* Option 4: Do nothing + +## Decision Outcome + +tbc + +### Implementation + +See the API specification changes in PR [#xxx](https://github.com/bbc/tams/pull/xxx). + +## Pros and Cons of the Options + +### Option 1: Follow tags example and use a collected_by_exists query parameter + +When querying tags there are two fields available - you query on a tag name to get the value, or to query on a second parameter (tag_exists) to find out if that tag exists. +Following this model would mean adding a two parameters - a new query parameter collected_by_exists with a boolean value and the collected_by to be able to search on one or more ID's. + +For the boolean field setting to false this would return all sources where there is no collected_by values present which is the required behaviour. +Setting this to true would only return sources which have a collected_by value. +This option currently has no uses cases, however using this model and a boolean logically requires this behaviour. 
+ + +| Behaviour | Query Parameter | +| --------- | --------------- | +| Source is not collected | `collected_by_exists=false` | +| Source is collected by specific Source | `collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4` | +| Source is collected by any of a set of Sources | `collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4,f3ac31bb-c66b-43f8-8362-c82e76f0d28d` | +| Source is collected by any Source | `collected_by_exists=true` | +| Note that this combination is non-sense | `collected_by_exists=false&collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4` | + +### Option 2: Follow tags example but only implement the exists query parameter + +Since the use cases driving this requirement only focus on whether the item is collected_by another source, then option 2 takes just the exists query parameter element from option 1 and implements that. +In this option it is not possible to query on the ID in the collection, only that it exists. + +| Behaviour | Query Parameter | +| --------- | --------------- | +| Source is not collected | `collected_by_exists=false` | +| Source is collected by any Source | `collected_by_exists=true` | + + +### Option 3: Follow accept_get_urls example and use an empty query parameter + +On the /segments end point it is possible to specify a query parameter of accept_get_urls. 
+This field can be comma separated list of labels, however it is allowed to be examply which means that the get_urls are ommited in the result + + +| Behaviour | Query Parameter | +| --------- | --------------- | +| Source is not collected | `collected_by=` | +| Source is collected by specific Source | `collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4` | +| Source is collected by any of a set of Sources | `collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4,f3ac31bb-c66b-43f8-8362-c82e76f0d28d` | +| Source is collected by any Source | Not possible | + +If chosen to also apply at the Flow leve then is `collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4` too similar to `/flows/a46c49f1-4764-42b9-9f91-f267a58903c4/flow_collection`? +The first gives you all the metadata about the flows, where the later only gives you the ID's and roles + From b0163eb4379433c60b1accc3dcd0f8b686038fe6 Mon Sep 17 00:00:00 2001 From: John Biltcliffe Date: Tue, 6 Jan 2026 16:18:05 +0000 Subject: [PATCH 02/14] Additional details and formatting --- ...xxx-source-collected_by-query-parameter.md | 27 ++++++++++++------- 1 file changed, 17 insertions(+), 10 deletions(-) diff --git a/docs/adr/xxxx-source-collected_by-query-parameter.md b/docs/adr/xxxx-source-collected_by-query-parameter.md index 4cc96d5e..4771a58f 100644 --- a/docs/adr/xxxx-source-collected_by-query-parameter.md +++ b/docs/adr/xxxx-source-collected_by-query-parameter.md @@ -5,23 +5,30 @@ status: "draft" ## Context and Problem Statement -Currently at the source level there is no filtering on the collected_by field. +Currently at the Source level there is no filtering on the collected_by field. When vendors are building systems which list content stored in TAMS then they are only looking for the sources at the top of the content tree. -In many cases this is the multi-source, however due to single essence (eg audio only) workflows then this is not a reliable method of finding the top level content. 
-To achieve this they must go through every result and look to see if the collected_by field is missing as this then indicates that it is the top of a source collection. +In many cases this is the multi-Source, however due to single essence (eg audio only) workflows then this is not a reliable method of finding the top level content. +To achieve this they must go through every result and look to see if the collected_by field is missing as this then indicates that it is the top of a Source collection. This process can also throw any form of sensible pagination in a client application as when requesting content from the TAMS API it is not known how many results will be retained or discarded. This ADR is to look at moving this behaviour from the client side into the API. This has the benefit of not only making it simpler for any system looking for top level content in the system, but also makes the result more predicable as the number of rows requested from the API will result in the available rows up to that limit. +While the focus of this ADR is on the Source level of the TAMS API, there are sufficient similarities between the data structures of sources and flows that it is worth considering whether this should be applied to both levels. 
+ ## Considered Options +Source Options: * Option 1: Follow tags example and use two query parameters for value and exists * Option 2: Follow tags example but only implement the exists query parameter * Option 3: Follow accept_get_urls example and use an empty query parameter * Option 4: Do nothing +Flows: +* Option: Apply the same querying capabilities at both Source and Flow level +* Option: Only apply the new query capabilities at the Source level + ## Decision Outcome tbc @@ -34,11 +41,11 @@ See the API specification changes in PR [#xxx](https://github.com/bbc/tams/pull/ ### Option 1: Follow tags example and use a collected_by_exists query parameter -When querying tags there are two fields available - you query on a tag name to get the value, or to query on a second parameter (tag_exists) to find out if that tag exists. -Following this model would mean adding a two parameters - a new query parameter collected_by_exists with a boolean value and the collected_by to be able to search on one or more ID's. +When querying tags there are two fields available - you query on a tag name to get the value, or to query on a second parameter (`tag_exists`) to find out if that tag exists. +Following this model would mean adding a two parameters - a new query parameter `collected_by_exists` with a boolean value and the `collected_by` to be able to search on one or more ID's. -For the boolean field setting to false this would return all sources where there is no collected_by values present which is the required behaviour. -Setting this to true would only return sources which have a collected_by value. +For the boolean field setting to false this would return all Sources where there is no collected_by values present which is the required behaviour. +Setting this to true would only return Sources which have a collected_by value. This option currently has no uses cases, however using this model and a boolean logically requires this behaviour. 
@@ -52,7 +59,7 @@ This option currently has no uses cases, however using this model and a boolean ### Option 2: Follow tags example but only implement the exists query parameter -Since the use cases driving this requirement only focus on whether the item is collected_by another source, then option 2 takes just the exists query parameter element from option 1 and implements that. +Since the use cases driving this requirement only focus on whether the item is collected_by another Source, then option 2 takes just the exists query parameter element from option 1 and implements that. In this option it is not possible to query on the ID in the collection, only that it exists. | Behaviour | Query Parameter | @@ -63,8 +70,8 @@ In this option it is not possible to query on the ID in the collection, only tha ### Option 3: Follow accept_get_urls example and use an empty query parameter -On the /segments end point it is possible to specify a query parameter of accept_get_urls. -This field can be comma separated list of labels, however it is allowed to be examply which means that the get_urls are ommited in the result +On the `/segments` end point it is possible to specify a query parameter of `accept_get_urls`. 
+This field can be comma separated list of labels, however it is allowed to be examply which means that the `get_urls` are ommited in the result | Behaviour | Query Parameter | From e9b1806427b06ccdd1ce7ea0ffbcf623c6ba9b1c Mon Sep 17 00:00:00 2001 From: John Biltcliffe Date: Tue, 6 Jan 2026 16:19:33 +0000 Subject: [PATCH 03/14] Additional formatting --- docs/adr/xxxx-source-collected_by-query-parameter.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/adr/xxxx-source-collected_by-query-parameter.md b/docs/adr/xxxx-source-collected_by-query-parameter.md index 4771a58f..9ef7e72e 100644 --- a/docs/adr/xxxx-source-collected_by-query-parameter.md +++ b/docs/adr/xxxx-source-collected_by-query-parameter.md @@ -5,10 +5,10 @@ status: "draft" ## Context and Problem Statement -Currently at the Source level there is no filtering on the collected_by field. +Currently at the Source level there is no filtering on the `collected_by` field. When vendors are building systems which list content stored in TAMS then they are only looking for the sources at the top of the content tree. In many cases this is the multi-Source, however due to single essence (eg audio only) workflows then this is not a reliable method of finding the top level content. -To achieve this they must go through every result and look to see if the collected_by field is missing as this then indicates that it is the top of a Source collection. +To achieve this they must go through every result and look to see if the `collected_by `field is missing as this then indicates that it is the top of a Source collection. This process can also throw any form of sensible pagination in a client application as when requesting content from the TAMS API it is not known how many results will be retained or discarded. This ADR is to look at moving this behaviour from the client side into the API. 
@@ -44,8 +44,8 @@ See the API specification changes in PR [#xxx](https://github.com/bbc/tams/pull/ When querying tags there are two fields available - you query on a tag name to get the value, or to query on a second parameter (`tag_exists`) to find out if that tag exists. Following this model would mean adding a two parameters - a new query parameter `collected_by_exists` with a boolean value and the `collected_by` to be able to search on one or more ID's. -For the boolean field setting to false this would return all Sources where there is no collected_by values present which is the required behaviour. -Setting this to true would only return Sources which have a collected_by value. +For the boolean field setting to false this would return all Sources where there is no `collected_by` values present which is the required behaviour. +Setting this to true would only return Sources which have a `collected_by` value. This option currently has no uses cases, however using this model and a boolean logically requires this behaviour. @@ -59,7 +59,7 @@ This option currently has no uses cases, however using this model and a boolean ### Option 2: Follow tags example but only implement the exists query parameter -Since the use cases driving this requirement only focus on whether the item is collected_by another Source, then option 2 takes just the exists query parameter element from option 1 and implements that. +Since the use cases driving this requirement only focus on whether the item is `collected_by` another Source, then option 2 takes just the exists query parameter element from option 1 and implements that. In this option it is not possible to query on the ID in the collection, only that it exists. 
| Behaviour | Query Parameter | From c8c2fbdb9f2515e689bd1744e9bd5debdf31d4c4 Mon Sep 17 00:00:00 2001 From: John Biltcliffe Date: Tue, 6 Jan 2026 16:19:52 +0000 Subject: [PATCH 04/14] Updated options --- docs/adr/xxxx-source-collected_by-query-parameter.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/adr/xxxx-source-collected_by-query-parameter.md b/docs/adr/xxxx-source-collected_by-query-parameter.md index 9ef7e72e..65e0bd45 100644 --- a/docs/adr/xxxx-source-collected_by-query-parameter.md +++ b/docs/adr/xxxx-source-collected_by-query-parameter.md @@ -26,8 +26,8 @@ Source Options: * Option 4: Do nothing Flows: -* Option: Apply the same querying capabilities at both Source and Flow level -* Option: Only apply the new query capabilities at the Source level +* Option A: Apply the same querying capabilities at both Source and Flow level +* Option B: Only apply the new query capabilities at the Source level ## Decision Outcome From bffb792682b050057954c9c4620fe78c851b305b Mon Sep 17 00:00:00 2001 From: John Biltcliffe Date: Tue, 6 Jan 2026 16:41:01 +0000 Subject: [PATCH 05/14] More ADR content added --- .../adr/xxxx-source-collected_by-query-parameter.md | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/docs/adr/xxxx-source-collected_by-query-parameter.md b/docs/adr/xxxx-source-collected_by-query-parameter.md index 65e0bd45..38d577a2 100644 --- a/docs/adr/xxxx-source-collected_by-query-parameter.md +++ b/docs/adr/xxxx-source-collected_by-query-parameter.md @@ -16,7 +16,6 @@ This has the benefit of not only making it simpler for any system looking for to While the focus of this ADR is on the Source level of the TAMS API, there are sufficient similarities between the data structures of sources and flows that it is worth considering whether this should be applied to both levels. 
- ## Considered Options Source Options: @@ -48,12 +47,11 @@ For the boolean field setting to false this would return all Sources where there Setting this to true would only return Sources which have a `collected_by` value. This option currently has no uses cases, however using this model and a boolean logically requires this behaviour. - | Behaviour | Query Parameter | | --------- | --------------- | | Source is not collected | `collected_by_exists=false` | | Source is collected by specific Source | `collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4` | -| Source is collected by any of a set of Sources | `collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4,f3ac31bb-c66b-43f8-8362-c82e76f0d28d` | +| Source is collected by any of the specified Sources | `collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4,f3ac31bb-c66b-43f8-8362-c82e76f0d28d` | | Source is collected by any Source | `collected_by_exists=true` | | Note that this combination is non-sense | `collected_by_exists=false&collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4` | @@ -67,20 +65,25 @@ In this option it is not possible to query on the ID in the collection, only tha | Source is not collected | `collected_by_exists=false` | | Source is collected by any Source | `collected_by_exists=true` | +* Good: Keeps it simple and only implements the required behaviour +* Good: Does not force the requirement to also filter on the ID values +* Bad: no currently known use case for when the parameter is set to true ### Option 3: Follow accept_get_urls example and use an empty query parameter On the `/segments` end point it is possible to specify a query parameter of `accept_get_urls`. 
This field can be comma separated list of labels, however it is allowed to be examply which means that the `get_urls` are ommited in the result - | Behaviour | Query Parameter | | --------- | --------------- | | Source is not collected | `collected_by=` | | Source is collected by specific Source | `collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4` | -| Source is collected by any of a set of Sources | `collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4,f3ac31bb-c66b-43f8-8362-c82e76f0d28d` | +| Source is collected by any of the specified Sources | `collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4,f3ac31bb-c66b-43f8-8362-c82e76f0d28d` | | Source is collected by any Source | Not possible | +* Good: If filtering by ID is required then this keeps it to a single field +* Bad: Requires the filtering by ID to be a logical query parameter + If chosen to also apply at the Flow leve then is `collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4` too similar to `/flows/a46c49f1-4764-42b9-9f91-f267a58903c4/flow_collection`? The first gives you all the metadata about the flows, where the later only gives you the ID's and roles From 6b3da2dc729a0b359433865c3e666e33c3c5a80d Mon Sep 17 00:00:00 2001 From: John Biltcliffe Date: Tue, 6 Jan 2026 16:42:23 +0000 Subject: [PATCH 06/14] Updated title --- docs/adr/xxxx-source-collected_by-query-parameter.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/adr/xxxx-source-collected_by-query-parameter.md b/docs/adr/xxxx-source-collected_by-query-parameter.md index 38d577a2..905b96c5 100644 --- a/docs/adr/xxxx-source-collected_by-query-parameter.md +++ b/docs/adr/xxxx-source-collected_by-query-parameter.md @@ -1,12 +1,12 @@ --- status: "draft" --- -# Add the ability to query sources without a collected_by value +# Ability to query for Sources without a collected_by value ## Context and Problem Statement Currently at the Source level there is no filtering on the `collected_by` field. 
-When vendors are building systems which list content stored in TAMS then they are only looking for the sources at the top of the content tree. +When vendors are building systems which list content stored in TAMS then they are only looking for the Sources at the top of the content tree. In many cases this is the multi-Source, however due to single essence (eg audio only) workflows then this is not a reliable method of finding the top level content. To achieve this they must go through every result and look to see if the `collected_by `field is missing as this then indicates that it is the top of a Source collection. This process can also throw any form of sensible pagination in a client application as when requesting content from the TAMS API it is not known how many results will be retained or discarded. @@ -14,7 +14,7 @@ This process can also throw any form of sensible pagination in a client applicat This ADR is to look at moving this behaviour from the client side into the API. This has the benefit of not only making it simpler for any system looking for top level content in the system, but also makes the result more predicable as the number of rows requested from the API will result in the available rows up to that limit. -While the focus of this ADR is on the Source level of the TAMS API, there are sufficient similarities between the data structures of sources and flows that it is worth considering whether this should be applied to both levels. +While the focus of this ADR is on the Source level of the TAMS API, there are sufficient similarities between the data structures of Sources and Flows that it is worth considering whether this should be applied to both levels. 
## Considered Options From a13578903311a7b07f09e4b7d2aae7058083a9a0 Mon Sep 17 00:00:00 2001 From: John Biltcliffe Date: Tue, 6 Jan 2026 16:45:11 +0000 Subject: [PATCH 07/14] Flow level section expanded --- docs/adr/xxxx-source-collected_by-query-parameter.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/docs/adr/xxxx-source-collected_by-query-parameter.md b/docs/adr/xxxx-source-collected_by-query-parameter.md index 905b96c5..bd7d1316 100644 --- a/docs/adr/xxxx-source-collected_by-query-parameter.md +++ b/docs/adr/xxxx-source-collected_by-query-parameter.md @@ -84,6 +84,12 @@ This field can be comma separated list of labels, however it is allowed to be ex * Good: If filtering by ID is required then this keeps it to a single field * Bad: Requires the filtering by ID to be a logical query parameter -If chosen to also apply at the Flow leve then is `collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4` too similar to `/flows/a46c49f1-4764-42b9-9f91-f267a58903c4/flow_collection`? +### Flow level implications + +At the flow level this is currently some ability to query and update flow collections in the API. +This behaviour focuses on starting from a known flow to which the collection is applied. +There is no equivalent endpoints at the source level. + +If chosen to also apply at the Flow level then need to consider is `collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4` too similar to `/flows/a46c49f1-4764-42b9-9f91-f267a58903c4/flow_collection`? 
The first gives you all the metadata about the flows, where the later only gives you the ID's and roles From 9f30dabb42071d4ae8fa10f9918ecade114c79f0 Mon Sep 17 00:00:00 2001 From: John Biltcliffe Date: Wed, 15 Apr 2026 20:18:40 +0100 Subject: [PATCH 08/14] Updated with webhooks details --- ...xxx-source-collected_by-query-parameter.md | 27 ++++++++++++------- 1 file changed, 17 insertions(+), 10 deletions(-) diff --git a/docs/adr/xxxx-source-collected_by-query-parameter.md b/docs/adr/xxxx-source-collected_by-query-parameter.md index bd7d1316..5f57500c 100644 --- a/docs/adr/xxxx-source-collected_by-query-parameter.md +++ b/docs/adr/xxxx-source-collected_by-query-parameter.md @@ -30,7 +30,9 @@ Flows: ## Decision Outcome -tbc +Recommendation: Option 3 since this already exists within the API as part of the webhooks, however this requires clarification of empty query parameters which is recommended to match the proposed behaviours. + +Additionally it is recommended to add the same capability at the Flow level and again match this with the same parameter as used in the webhooks and clarify the empty query parameter behaviour. ### Implementation @@ -72,24 +74,29 @@ In this option it is not possible to query on the ID in the collection, only tha ### Option 3: Follow accept_get_urls example and use an empty query parameter On the `/segments` end point it is possible to specify a query parameter of `accept_get_urls`. -This field can be comma separated list of labels, however it is allowed to be examply which means that the `get_urls` are ommited in the result +This field can be comma separated list of labels, however it is allowed to be empty which means that the `get_urls` are ommited in the result + +Similarly in the webhooks it is possible to specify the `source_collected_by_ids` as an array of Source IDs. +In the current it is not entirely clear what the behaviour of the query parameter if the value is left empty. 
+However it is proposed to keep the same query parameter for consistency then update the webhooks to clarify what should happen for an empty query parameter. | Behaviour | Query Parameter | | --------- | --------------- | -| Source is not collected | `collected_by=` | -| Source is collected by specific Source | `collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4` | -| Source is collected by any of the specified Sources | `collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4,f3ac31bb-c66b-43f8-8362-c82e76f0d28d` | +| Source is not collected | `source_collected_by_ids=` | +| Source is collected by specific Source | `source_collected_by_ids=a46c49f1-4764-42b9-9f91-f267a58903c4` | +| Source is collected by any of the specified Sources | `source_collected_by_ids=a46c49f1-4764-42b9-9f91-f267a58903c4,f3ac31bb-c66b-43f8-8362-c82e76f0d28d` | | Source is collected by any Source | Not possible | * Good: If filtering by ID is required then this keeps it to a single field +* Good: Uses an existing field which is already present in the API and provides consistency between different areas * Bad: Requires the filtering by ID to be a logical query parameter ### Flow level implications -At the flow level this is currently some ability to query and update flow collections in the API. -This behaviour focuses on starting from a known flow to which the collection is applied. -There is no equivalent endpoints at the source level. +At the Flow level this is currently some ability to query and update Flow collections in the API. +This behaviour focuses on starting from a known Flow to which the collection is applied. +There is no equivalent endpoints at the Source level. -If chosen to also apply at the Flow level then need to consider is `collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4` too similar to `/flows/a46c49f1-4764-42b9-9f91-f267a58903c4/flow_collection`? 
-The first gives you all the metadata about the flows, where the later only gives you the ID's and roles +If chosen to also apply at the Flow level then need to consider is `flows_collected_by_id=a46c49f1-4764-42b9-9f91-f267a58903c4` too similar to `/flows/a46c49f1-4764-42b9-9f91-f267a58903c4/flow_collection`? +The first gives you all the metadata about the Flows, where the later only gives you the ID's and roles From 8532a8f53ab088b79f555637726162bfcad5efe4 Mon Sep 17 00:00:00 2001 From: John Biltcliffe Date: Wed, 15 Apr 2026 20:51:52 +0100 Subject: [PATCH 09/14] Minor corrections --- docs/adr/xxxx-source-collected_by-query-parameter.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/adr/xxxx-source-collected_by-query-parameter.md b/docs/adr/xxxx-source-collected_by-query-parameter.md index 5f57500c..a6df43b4 100644 --- a/docs/adr/xxxx-source-collected_by-query-parameter.md +++ b/docs/adr/xxxx-source-collected_by-query-parameter.md @@ -8,7 +8,7 @@ status: "draft" Currently at the Source level there is no filtering on the `collected_by` field. When vendors are building systems which list content stored in TAMS then they are only looking for the Sources at the top of the content tree. In many cases this is the multi-Source, however due to single essence (eg audio only) workflows then this is not a reliable method of finding the top level content. -To achieve this they must go through every result and look to see if the `collected_by `field is missing as this then indicates that it is the top of a Source collection. +To achieve this they must go through every result and look to see if the `collected_by` field is missing as this then indicates that it is the top of a Source collection. This process can also throw any form of sensible pagination in a client application as when requesting content from the TAMS API it is not known how many results will be retained or discarded. 
This ADR is to look at moving this behaviour from the client side into the API. @@ -45,8 +45,8 @@ See the API specification changes in PR [#xxx](https://github.com/bbc/tams/pull/ When querying tags there are two fields available - you query on a tag name to get the value, or to query on a second parameter (`tag_exists`) to find out if that tag exists. Following this model would mean adding a two parameters - a new query parameter `collected_by_exists` with a boolean value and the `collected_by` to be able to search on one or more ID's. -For the boolean field setting to false this would return all Sources where there is no `collected_by` values present which is the required behaviour. -Setting this to true would only return Sources which have a `collected_by` value. +For the boolean field setting to false this would return all Sources where there is no `collected_by` value present which is the required behaviour. +Setting this to true would return only Sources that have a `collected_by` value. This option currently has no uses cases, however using this model and a boolean logically requires this behaviour. 
| Behaviour | Query Parameter | @@ -55,7 +55,7 @@ This option currently has no uses cases, however using this model and a boolean | Source is collected by specific Source | `collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4` | | Source is collected by any of the specified Sources | `collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4,f3ac31bb-c66b-43f8-8362-c82e76f0d28d` | | Source is collected by any Source | `collected_by_exists=true` | -| Note that this combination is non-sense | `collected_by_exists=false&collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4` | +| Note that this combination is nonsense | `collected_by_exists=false&collected_by=a46c49f1-4764-42b9-9f91-f267a58903c4` | ### Option 2: Follow tags example but only implement the exists query parameter @@ -74,7 +74,7 @@ In this option it is not possible to query on the ID in the collection, only tha ### Option 3: Follow accept_get_urls example and use an empty query parameter On the `/segments` end point it is possible to specify a query parameter of `accept_get_urls`. -This field can be comma separated list of labels, however it is allowed to be empty which means that the `get_urls` are ommited in the result +This field can be a comma separated list of labels, however it is allowed to be empty which means that the `get_urls` are omitted in the result Similarly in the webhooks it is possible to specify the `source_collected_by_ids` as an array of Source IDs. In the current it is not entirely clear what the behaviour of the query parameter if the value is left empty. 
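Reviewer note on the Option 3 recommendation in the patches above: the behaviour hinges on telling an absent parameter apart from an empty one. The following is a minimal sketch of that distinction, not part of the TAMS specification or of any store implementation. The function `filter_sources`, the in-memory `SOURCES` list and the shortened child IDs are hypothetical. The parameter name is the one used in the ADR table at this point in the series. Treating an absent parameter as "no filtering" and treating an empty `collected_by` list the same as a missing field are both assumptions made for illustration.

```python
from urllib.parse import parse_qs

# Parameter name as written in the ADR table; hypothetical for this sketch.
PARAM = "source_collected_by_ids"

MULTI_A = "a46c49f1-4764-42b9-9f91-f267a58903c4"
MULTI_B = "f3ac31bb-c66b-43f8-8362-c82e76f0d28d"

# Hypothetical Sources, reduced to the fields needed here.
# Child IDs are shortened labels rather than real UUIDs.
SOURCES = [
    {"id": MULTI_A},                                       # not collected
    {"id": MULTI_B},                                       # not collected
    {"id": "video-1", "collected_by": [MULTI_A]},
    {"id": "audio-1", "collected_by": [MULTI_A, MULTI_B]},
]


def filter_sources(query_string, sources):
    """Apply the three behaviours from the Option 3 table to a list of Sources."""
    # keep_blank_values=True matters: by default parse_qs drops parameters
    # whose value is empty, so "param=" would be indistinguishable from
    # the parameter being absent.
    params = parse_qs(query_string, keep_blank_values=True)

    if PARAM not in params:
        # Assumption: no parameter means no filtering on collected_by.
        return list(sources)

    raw = params[PARAM][-1]
    if raw == "":
        # Empty value: only Sources that are not collected by any Source.
        return [s for s in sources if not s.get("collected_by")]

    # Comma-separated IDs: Sources collected by at least one of them.
    wanted = set(raw.split(","))
    return [s for s in sources if wanted & set(s.get("collected_by", []))]


def ids(sources):
    """Helper for display: reduce Sources to their IDs."""
    return [s["id"] for s in sources]
```

Calling `filter_sources("source_collected_by_ids=", SOURCES)` keeps only the two uncollected Sources, while an empty query string returns all four. Python's `parse_qs` is one parser that discards blank values unless told otherwise, which suggests why the ADR asks for the empty-parameter behaviour to be spelled out explicitly.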
From 3f6753c698883ab5157f6eac3351e5d955a32560 Mon Sep 17 00:00:00 2001 From: John Biltcliffe Date: Wed, 24 Jun 2026 21:23:41 +0100 Subject: [PATCH 10/14] API definition updates --- api/TimeAddressableMediaStore.yaml | 32 ++++++++++++++++++++++++++++++ api/schemas/uuid-list-empty.json | 6 ++++++ api/schemas/webhook.json | 4 ++-- 3 files changed, 40 insertions(+), 2 deletions(-) create mode 100644 api/schemas/uuid-list-empty.json diff --git a/api/TimeAddressableMediaStore.yaml b/api/TimeAddressableMediaStore.yaml index 0fdc484f..1e7d2f95 100644 --- a/api/TimeAddressableMediaStore.yaml +++ b/api/TimeAddressableMediaStore.yaml @@ -412,6 +412,14 @@ paths: description: Filter on Source format. schema: $ref: 'schemas/content-format.json' + - name: collected_by_ids + in: query + description: | + Filter on Sources by the Source Collection(s) that collect them, given as a comma-separated list of Source IDs. + Returns Sources whose `collected_by` includes at least one of the given Source IDs. + An empty value (`collected_by_ids=`) returns only Sources that are not collected by any Source, i.e. top-level Sources with no `collected_by` value. + schema: + $ref: 'schemas/uuid-list-empty.json' - $ref: '#/components/parameters/trait_resource_paged_key' - $ref: '#/components/parameters/trait_paged_limit' responses: @@ -473,6 +481,14 @@ paths: description: Filter on Source format. schema: $ref: 'schemas/content-format.json' + - name: collected_by_ids + in: query + description: | + Filter on Sources by the Source Collection(s) that collect them, given as a comma-separated list of Source IDs. + Returns Sources whose `collected_by` includes at least one of the given Source IDs. + An empty value (`collected_by_ids=`) returns only Sources that are not collected by any Source, i.e. top-level Sources with no `collected_by` value. 
+ schema: + $ref: 'schemas/uuid-list-empty.json' - $ref: '#/components/parameters/trait_resource_paged_key' - $ref: '#/components/parameters/trait_paged_limit' responses: @@ -865,6 +881,14 @@ paths: description: Filter on video Flows that have the given frame height. schema: type: integer + - name: collected_by_ids + in: query + description: | + Filter on Flows by the Flow Collection(s) that collect them, given as a comma-separated list of Flow IDs. + Returns Flows whose `collected_by` includes at least one of the given Flow IDs. + An empty value (`collected_by_ids=`) returns only Flows that are not collected by any Flow, i.e. top-level Flows with no `collected_by` value. + schema: + $ref: 'schemas/uuid-list-empty.json' - $ref: '#/components/parameters/trait_resource_paged_key' - $ref: '#/components/parameters/trait_paged_limit' responses: @@ -952,6 +976,14 @@ paths: description: Filter on video Flows that have the given frame height. schema: type: integer + - name: collected_by_ids + in: query + description: | + Filter on Flows by the Flow Collection(s) that collect them, given as a comma-separated list of Flow IDs. + Returns Flows whose `collected_by` includes at least one of the given Flow IDs. + An empty value (`collected_by_ids=`) returns only Flows that are not collected by any Flow, i.e. top-level Flows with no `collected_by` value. + schema: + $ref: 'schemas/uuid-list-empty.json' - $ref: '#/components/parameters/trait_resource_paged_key' - $ref: '#/components/parameters/trait_paged_limit' responses: diff --git a/api/schemas/uuid-list-empty.json b/api/schemas/uuid-list-empty.json new file mode 100644 index 00000000..052b4386 --- /dev/null +++ b/api/schemas/uuid-list-empty.json @@ -0,0 +1,6 @@ +{ + "title": "Query String UUID list (optionally empty)", + "description": "A list of Universally Unique Identifiers (UUIDs) as defined in [RFC9562](https://www.rfc-editor.org/rfc/rfc9562), formatted for use in query string parameters, or an empty string. 
An empty value selects resources that are not in any collection.", + "type": "string", + "pattern": "^(([0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12})(,[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12})*)?$" +} \ No newline at end of file diff --git a/api/schemas/webhook.json b/api/schemas/webhook.json index e64bbccb..d146c520 100644 --- a/api/schemas/webhook.json +++ b/api/schemas/webhook.json @@ -47,14 +47,14 @@ } }, "flow_collected_by_ids": { - "description": "Limit Flow and Flow Segment events to those with Flow that is collected by a Flow Collection in the given list of Flow Collection IDs", + "description": "Limit Flow and Flow Segment events to those with a Flow that is collected by a Flow Collection in the given list of Flow Collection IDs. An empty array limits events to Flows that are not collected by any Flow Collection.", "type": "array", "items": { "$ref": "uuid.json" } }, "source_collected_by_ids": { - "description": "Limit Flow, Flow Segment and Source events to those with Source that is collected by a Source Collection in the given list of Source Collection IDs", + "description": "Limit Flow, Flow Segment and Source events to those with a Source that is collected by a Source Collection in the given list of Source Collection IDs. 
An empty array limits events to Sources that are not collected by any Source Collection.", "type": "array", "items": { "$ref": "uuid.json" From 8cb627be1997d1afbfffa330fd4f7706bfbe96fb Mon Sep 17 00:00:00 2001 From: John Biltcliffe Date: Wed, 24 Jun 2026 21:31:00 +0100 Subject: [PATCH 11/14] Update ADR number --- docs/README.md | 1 + ...parameter.md => 0049-source-collected_by-query-parameter.md} | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) rename docs/adr/{xxxx-source-collected_by-query-parameter.md => 0049-source-collected_by-query-parameter.md} (99%) diff --git a/docs/README.md b/docs/README.md index 6a646f47..5fc24e74 100644 --- a/docs/README.md +++ b/docs/README.md @@ -85,5 +85,6 @@ For more information on how we use ADRs, see [here](./adr/README.md). | [0044](./adr/0044-signalling-timeouts.md) | Signalling timeout periods | | [0046](./adr/0046-governance.md) | Governance | | [0048](./adr/0048-media-integrity.md) | Integrity model for media in TAMS, and when interacting with other systems | +| [0049](./adr/0049-source-collected_by-query-parameter.md) | Ability to query for Sources without a collected_by value | \* Note: ADR 0004a was the unintended result of a number clash in the early development of TAMS which wasn't caught before publication diff --git a/docs/adr/xxxx-source-collected_by-query-parameter.md b/docs/adr/0049-source-collected_by-query-parameter.md similarity index 99% rename from docs/adr/xxxx-source-collected_by-query-parameter.md rename to docs/adr/0049-source-collected_by-query-parameter.md index a6df43b4..a5f3141a 100644 --- a/docs/adr/xxxx-source-collected_by-query-parameter.md +++ b/docs/adr/0049-source-collected_by-query-parameter.md @@ -1,5 +1,5 @@ --- -status: "draft" +status: "accepted" --- # Ability to query for Sources without a collected_by value From 8f4f8769405a0d83dc5f251eae75d468684bba27 Mon Sep 17 00:00:00 2001 From: John Biltcliffe Date: Wed, 24 Jun 2026 21:34:19 +0100 Subject: [PATCH 12/14] Fix lint issues --- 
docs/adr/0049-source-collected_by-query-parameter.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/adr/0049-source-collected_by-query-parameter.md b/docs/adr/0049-source-collected_by-query-parameter.md index a5f3141a..71ef1cb8 100644 --- a/docs/adr/0049-source-collected_by-query-parameter.md +++ b/docs/adr/0049-source-collected_by-query-parameter.md @@ -19,12 +19,14 @@ While the focus of this ADR is on the Source level of the TAMS API, there are su ## Considered Options Source Options: + * Option 1: Follow tags example and use two query parameters for value and exists * Option 2: Follow tags example but only implement the exists query parameter * Option 3: Follow accept_get_urls example and use an empty query parameter * Option 4: Do nothing Flows: + * Option A: Apply the same querying capabilities at both Source and Flow level * Option B: Only apply the new query capabilities at the Source level @@ -97,6 +99,5 @@ At the Flow level this is currently some ability to query and update Flow collec This behaviour focuses on starting from a known Flow to which the collection is applied. There is no equivalent endpoints at the Source level. -If chosen to also apply at the Flow level then need to consider is `flows_collected_by_id=a46c49f1-4764-42b9-9f91-f267a58903c4` too similar to `/flows/a46c49f1-4764-42b9-9f91-f267a58903c4/flow_collection`? +If chosen to also apply at the Flow level then need to consider is `flows_collected_by_id=a46c49f1-4764-42b9-9f91-f267a58903c4` too similar to `/flows/a46c49f1-4764-42b9-9f91-f267a58903c4/flow_collection`? 
The first gives you all the metadata about the Flows, where the later only gives you the ID's and roles - From 89a7c92387ce702b382eff68ff59cf9e7f34f862 Mon Sep 17 00:00:00 2001 From: johnbilt <164537346+johnbilt@users.noreply.github.com> Date: Fri, 26 Jun 2026 17:05:32 +0100 Subject: [PATCH 13/14] Update docs/adr/0049-source-collected_by-query-parameter.md Co-authored-by: James Sandford --- docs/adr/0049-source-collected_by-query-parameter.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/adr/0049-source-collected_by-query-parameter.md b/docs/adr/0049-source-collected_by-query-parameter.md index 71ef1cb8..44f6b6bd 100644 --- a/docs/adr/0049-source-collected_by-query-parameter.md +++ b/docs/adr/0049-source-collected_by-query-parameter.md @@ -32,7 +32,7 @@ Flows: ## Decision Outcome -Recommendation: Option 3 since this already exists within the API as part of the webhooks, however this requires clarification of empty query parameters which is recommended to match the proposed behaviours. +Chosen options: Option 3 and Option A, since this parameter already exists within the API as part of the webhooks. However, this requires clarification of the behaviour of empty query parameters, which is recommended to match the proposed behaviours. Additionally it is recommended to add the same capability at the Flow level and again match this with the same parameter as used in the webhooks and clarify the empty query parameter behaviour. From f2cf5489f5764092dd8da6c7b12728f41a43ccc4 Mon Sep 17 00:00:00 2001 From: John Biltcliffe Date: Fri, 26 Jun 2026 17:12:01 +0100 Subject: [PATCH 14/14] Fix table formatting --- docs/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/README.md b/docs/README.md index 5fc24e74..a503905f 100644 --- a/docs/README.md +++ b/docs/README.md @@ -85,6 +85,6 @@ For more information on how we use ADRs, see [here](./adr/README.md). 
| [0044](./adr/0044-signalling-timeouts.md) | Signalling timeout periods | | [0046](./adr/0046-governance.md) | Governance | | [0048](./adr/0048-media-integrity.md) | Integrity model for media in TAMS, and when interacting with other systems | -| [0049](./adr/0049-source-collected_by-query-parameter.md) | Ability to query for Sources without a collected_by value | +| [0049](./adr/0049-source-collected_by-query-parameter.md) | Ability to query for Sources without a collected_by value | \* Note: ADR 0004a was the unintended result of a number clash in the early development of TAMS which wasn't caught before publication
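
Reviewer note: the `collected_by_ids` semantics introduced in PATCH 10/14 (parameter absent, empty value, list of IDs) could be read as in the minimal Python sketch below. It is an illustration only, not part of the patch series or of any TAMS implementation. The helper names `parse_collected_by_ids` and `matches` are hypothetical, resources are assumed to be plain dicts with an optional `collected_by` list of IDs, and the regular expression is assembled to be equivalent to the pattern in `api/schemas/uuid-list-empty.json`.

```python
import re
from typing import List, Optional

# Equivalent to the pattern in api/schemas/uuid-list-empty.json:
# zero or more comma-separated UUIDs, where the empty string is allowed.
_UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}"
UUID_LIST_EMPTY = re.compile(rf"^(({_UUID})(,{_UUID})*)?$")


def parse_collected_by_ids(raw: Optional[str]) -> Optional[List[str]]:
    """Hypothetical helper: interpret the raw query string value.

    None -> parameter not supplied, so no filtering is applied
    []   -> empty value (`collected_by_ids=`), so only top-level resources match
    list -> resources collected by at least one of the given IDs match
    """
    if raw is None:
        return None
    if UUID_LIST_EMPTY.fullmatch(raw) is None:
        raise ValueError("collected_by_ids must be empty or a comma-separated UUID list")
    return raw.split(",") if raw else []


def matches(resource: dict, wanted: Optional[List[str]]) -> bool:
    """Hypothetical helper: apply the filter to one Source or Flow (assumed dict shape)."""
    collected_by = resource.get("collected_by") or []
    if wanted is None:
        return True  # parameter absent: every resource passes
    if not wanted:
        return not collected_by  # empty value: only uncollected resources pass
    return any(cid in wanted for cid in collected_by)
```

Distinguishing "absent" from "empty" is the important design point here: many query string parsers collapse both cases to the same value, which would make the top-level query indistinguishable from an unfiltered one.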