feat(summary) Migrate visio integration to summary API v2 by FloChehab · Pull Request #1362 · suitenumerique/meet

FloChehab · 2026-05-26T16:32:23Z

Overview

This PR migrates the visio transcription integration from the legacy v1 summary API to the streamlined v2 API, cleaning up old code and introducing several improvements to how transcription, summarization, and analytics are handled.

Changes

🔥 Breaking / Cleanup

Removed most v1 route-related code from the summary service (first step toward full v1 deprecation)
=> Not considered as a real breaking change as the deployment of both services simultaneously is considered stable, and visio is still compatible with a summary v1 API.

✨ Summary Service — v2 Route Extensions

v2 create transcribe payload now supports publishing directly to Docs and triggering automatic summarization
Title computation is now the caller's responsibility, simplifying the endpoint interface
All files are now sent as signed URLs (direct S3 reads removed)
Docs integration is now explicit in summary settings
User sub is now used as the PostHog distinct_id for consistent cross-project analytics; a PostHog event is emitted at request creation time to ensure feature flags are properly evaluated
User email can now be provided to the API (optional, non-breaking) -> usefull for analytics
Added a specific User-Agent for improved tracking

✨ Visio — v2 Compatibility

Visio now calls the v2 summary API; v1 params are still sent for backward compatibility, but users running their own deployments are advised to migrate — this compatibility shim will be removed in a future major version
Title for the transcription document is now computed in visio
RecordingStatusChoices has two new values: EXTERNAL_PROCESS_SUCCESSFUL and EXTERNAL_PROCESS_FAILED, updated via a new webhook called by the transcribe service
The new webhook is protected by its own bearer token
Tests added and updated to cover the above changes

Notes

Some existing helpers in summary still use model_dump() in places due to incomplete Pydantic model adoption — left as-is to avoid scope creep

Integration tests

Tested on docker-compose / tilt / staging

TODO :

Discuss posthog alias id & sub => Create a new posthog projet dedictated to summary that works on sub only

gitguardian · 2026-05-27T10:36:32Z

️✅ There are no secrets present in this pull request anymore.

If these secrets were true positive and are still valid, we highly recommend you to revoke them.
While these secrets were previously flagged, we no longer have a reference to the
specific commits where they were detected. Once a secret has been leaked into a git
repository, you should consider it compromised, even if it was deleted immediately.
Find here more information about risks.

^{_{🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.}}

FloChehab · 2026-05-27T15:52:56Z

@coderabbitai full review

coderabbitai · 2026-05-27T15:53:03Z

✅ Actions performed

Full review triggered.

coderabbitai · 2026-05-27T16:02:44Z

Walkthrough

This PR modernizes the recording and transcription system by adding external process webhook integration to the backend and consolidating the summary service to v2-only with docs integration. The backend extends the Recording model with external_process_id tracking and adds a webhook endpoint to receive events from external services, updating recording status based on transcript success/failure outcomes. The notification service is enhanced to generate locale-specific titles, send cloud storage URLs, and persist job IDs for webhook correlation. The summary service eliminates v1 API support, makes authorized tenants required with per-tenant push_to_docs permissions, introduces typed data contracts with received_at timestamps for analytics, creates a docs integration service for document creation, and refactors task processing to include optional synchronous docs push. CI/CD workflows are updated to allow Docker image pushes on integration/* branches, and Helm templates are configured with v2 endpoints and multi-tenant authorization.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Docstring Coverage	✅ Passed	Docstring coverage is 98.08% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Title check	✅ Passed	The PR title 'feat(summary) Migrate visio integration to summary API v2' accurately reflects the main objective of migrating the summary integration to API v2 and removing v1 code.
Description check	✅ Passed	The PR description clearly describes the migration from summary v1 to v2 API, including breaking changes, new features, and compatibility considerations.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 21

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/summary/summary/core/celery_worker.py (1)

436-450: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Duplicate analytics capture call.

metadata_manager.capture(job_id, settings.posthog_event_success) is called twice: once at line 436 and again at line 450. This will result in duplicate analytics events being recorded.

🐛 Proposed fix: remove the first duplicate call

         )
 
-    metadata_manager.capture(job_id, settings.posthog_event_success)
-
     file_service.store_transcript(
         transcript=transcription_res,
         job_id=job_id,

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/summary/summary/core/celery_worker.py` around lines 436 - 450,
metadata_manager.capture(job_id, settings.posthog_event_success) is invoked
twice causing duplicate analytics events; remove the earlier call and keep a
single capture after you finish storing the transcript and scheduling the
webhook. Specifically, delete the first metadata_manager.capture(...) that
appears before file_service.store_transcript(...) so only the final
metadata_manager.capture(job_id, settings.posthog_event_success) after
call_webhook_v2_task.apply_async(...) remains; the other symbols to locate the
section are file_service.store_transcript, TranscribeWebhookSuccessPayload, and
call_webhook_v2_task.apply_async.

♻️ Duplicate comments (1)

src/summary/summary/core/celery_worker.py (1)
388-389: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Catch Exception instead of BaseException.

Catching BaseException will intercept SystemExit and KeyboardInterrupt, preventing proper process termination. Change to except Exception as e: to maintain resilience for operational errors while allowing process-control exceptions to propagate.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/summary/summary/core/celery_worker.py` around lines 388 - 389, Change the
broad exception handler in celery_worker.py from "except BaseException as e:" to
"except Exception as e:" so SystemExit/KeyboardInterrupt can propagate; locate
the try/except around the speaker identity resolution (the block logging "Failed
to resolve speaker identities, skipping: {e}") and update the exception type
accordingly, leaving the logger.error call intact.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/backend/core/recording/event/authentication.py`:
- Line 106: Fix the typo in the logger message: replace the string
"Authentificating" passed to logger.info with the correct spelling
"Authenticating" (locate the logger.info(...) call in the authentication event
handling code and update the message accordingly).
- Around line 102-119: In authenticate, add checks to ensure
settings.SUMMARY_SERVICE_WEBHOOK_API_TOKEN is configured (non-empty/None) before
calling secrets.compare_digest, and validate the Authorization header format by
rejecting anything that does not start with the exact "Bearer " prefix (similar
to StorageEventAuthentication) so you only compare the extracted bearer token;
update the logger messages to reflect configuration or header format failures
and raise AuthenticationFailed on invalid/missing token or bad prefix, keeping
the final secrets.compare_digest call against the extracted bearer token and
returning MachineUser("external_process_user"), None on success.

In `@src/backend/core/recording/event/notification.py`:
- Around line 297-309: The metadata dict is being constructed when started_at
and ended_at exist even if metadata_filename is None; update the conditional
that builds "metadata" so it requires metadata_filename as well (e.g., require
started_at and ended_at and metadata_filename) before calling
generate_download_s3_file_url(metadata_filename, ...), ensuring metadata remains
None when metadata_filename is absent; refer to the "metadata" entry, the
variables metadata_filename, started_at, ended_at, and the function
generate_download_s3_file_url when locating and fixing the code.
- Around line 219-220: Wrap the ZoneInfo(owner_timezone) usage in a try/except
that catches zoneinfo.ZoneInfoNotFoundError (import it) around the block that
sets dt = recording_datetime.astimezone(ZoneInfo(owner_timezone)); on exception,
log a warning including the invalid owner_timezone (use the same logger used
nearby) and skip the astimezone conversion so dt remains as recording_datetime
(or apply a safe fallback), ensuring the code references owner_timezone and
recording_datetime and that ZoneInfoNotFoundError is imported from zoneinfo.

In `@src/backend/core/utils.py`:
- Around line 460-492: The function generate_download_s3_file_url lacks
validation for the key parameter, so callers passing None or an empty string
produce unclear boto3 errors; add a short guard at the top of
generate_download_s3_file_url that verifies key is a non-empty string (e.g.,
check isinstance(key, str) and key.strip() not empty) and raise a clear
ValueError (with a message like "key must be a non-empty string") before any
boto3/default_storage calls so failures are immediate and debuggable.

In `@src/backend/locale/de_DE/LC_MESSAGES/django.po`:
- Around line 149-155: Add German translations for the two msgid entries in the
locale file: replace the empty msgstr for "External process successful" with
"Externer Prozess erfolgreich" and replace the empty msgstr for "External
process failed" with "Externer Prozess fehlgeschlagen" so the msgid-to-msgstr
pairs read respectively for those strings.
- Around line 358-364: Replace the empty msgstr entries for the two msgid
strings by adding German translations: set msgstr for "External Process ID" to
"Externe Prozess‑ID" and set msgstr for "ID of the external process associated
with the recording." to "ID des externen Prozesses, der mit der Aufnahme
verknüpft ist."; update the corresponding entries in django.po so the two msgid
lines are followed by these msgstr translations.
- Around line 595-598: Update the German translation for msgid "Click the
\"Open\" button below" by removing the old HTML link markup in msgstr and
replacing it with wording that matches the new msgid (e.g., "Klicken Sie auf die
Schaltfläche „Öffnen“ unten"), and remove the "#, fuzzy" flag so the entry is no
longer marked fuzzy; edit the msgstr corresponding to that msgid and clear the
fuzzy marker.

In `@src/backend/locale/en_US/LC_MESSAGES/django.po`:
- Around line 591-594: The translation still contains the old link HTML and the
"#, fuzzy" marker; update the msgstr to match the new msgid wording ("Click the
\"Open\" button below ") (i.e., remove the <a href=...> markup) and delete the
"#, fuzzy" marker so msgid and msgstr are identical and no longer fuzzy.

In `@src/backend/locale/fr_FR/LC_MESSAGES/django.po`:
- Around line 151-157: Add French translations for the two msgid entries by
replacing the empty msgstr values for "External process successful" and
"External process failed" with their French equivalents; set msgstr for
"External process successful" to "Processus externe réussi" and set msgstr for
"External process failed" to "Échec du processus externe" (update the entries
where the msgid strings appear).
- Around line 359-365: Update the two empty msgstr entries for the given msgid
keys: replace msgstr "" for "External Process ID" with "ID du processus externe"
and replace msgstr "" for "ID of the external process associated with the
recording." with "ID du processus externe associé à l'enregistrement." so the
French .po contains the proper translations for those message IDs (msgid
"External Process ID" and msgid "ID of the external process associated with the
recording.").
- Around line 597-600: Update the French translation to match the new msgid by
removing the old HTML link markup and translating the button wording: change the
msgstr for msgid "Click the \"Open\" button below " so it reads something like
"Cliquez sur le bouton « Ouvrir » ci‑dessous" (remove the "<a
href=\"%(link)s\">…</a>" and the %(link)s placeholder) and also remove the "#,
fuzzy" marker so the entry is no longer marked fuzzy.

In `@src/backend/locale/nl_NL/LC_MESSAGES/django.po`:
- Around line 354-360: Update the empty msgstr entries for the two msgid
strings: set msgid "External Process ID" to the Dutch translation "Externe
proces-ID" and set msgid "ID of the external process associated with the
recording." to "ID van het externe proces dat aan de opname gekoppeld is." by
replacing the empty msgstr values for those exact msgid entries.
- Around line 149-155: The translation entries for msgid "External process
successful" and "External process failed" in the locale file are empty; update
the corresponding msgstr values to Dutch by replacing the empty msgstr for
"External process successful" with a suitable translation such as "Extern proces
voltooid" (or "Extern proces geslaagd") and for "External process failed" with
"Extern proces mislukt" (or "Extern proces gefaald") so the UI shows Dutch text
for these external process status messages.
- Around line 590-593: The Dutch translation for msgid "Click the \"Open\"
button below" still contains old HTML link markup in msgstr; update msgstr to
match the new msgid (e.g., change "Klik op de \"<a
href=\"%(link)s\">Openen</a>\"-link hieronder " to a button-oriented translation
such as "Klik op de knop \"Open\" hieronder") and remove the "#, fuzzy" marker
so the entry is no longer marked fuzzy; edit the entry that contains the shown
msgid/msgstr pair to make these changes.

In `@src/helm/env.d/dev-dinum/values.meet.yaml.gotmpl`:
- Around line 108-110: Add the missing AUTHORIZED_TENANTS environment entry to
the summary service block so v2 authorization succeeds; locate the
SUMMARY_SERVICE_* entries (SUMMARY_SERVICE_ENDPOINT, SUMMARY_SERVICE_API_TOKEN,
SUMMARY_SERVICE_WEBHOOK_API_TOKEN) in this values.meet.yaml.gotmpl and add an
AUTHORIZED_TENANTS key with the same tenant list format used in other dev
configs (e.g., a JSON/array string of allowed tenant IDs) so the summary service
accepts v2 tasks and passes startup validation.

In `@src/summary/summary/core/config.py`:
- Around line 119-123: The docs integration defaults currently enable a
dangerous placeholder configuration; change is_docs_integration_enabled default
to False and add a startup validation (e.g., a Pydantic root_validator or
post-init check in the same config class) that raises an error if
is_docs_integration_enabled is True while docs_base_url == "https://example.com"
or docs_server_to_server_api_key == "NO_API_KEY"; reference the fields
is_docs_integration_enabled, docs_base_url and docs_server_to_server_api_key so
the check runs early and fails fast instead of silently using placeholder
values.

In `@src/summary/summary/core/docs_service.py`:
- Around line 57-59: The debug logs in docs_service.py currently print full
request/response payloads (logger.debug calls around the "Submitting to" block
referencing data and later response), which may include sensitive fields like
email and sub; update the code to log only metadata by redacting sensitive keys
before logging (create/use a helper such as redact_payload and apply it to the
request variable `data` and the downstream `response` body), remove or avoid
logging full response bodies, and ensure the logger messages include
non-sensitive context (endpoint URL, payload size, status code) instead of raw
content.
- Around line 69-70: The except clause handling response.json() in the fallback
path should also catch ValueError to cover legacy/runtime variants; update the
except tuple in the fallback of the method that reads response.json() (the block
currently using "except (json.JSONDecodeError, AttributeError): document_id =
'Unable to parse response'") to "except (json.JSONDecodeError, ValueError,
AttributeError)" so plain ValueError decode failures are handled the same way.

In `@src/summary/summary/core/file_service.py`:
- Around line 250-258: read_cloud_storage_json currently opens the downloaded
temp file with local_path.open("r") without a context manager and never deletes
the temp file returned by _download_from_cloud_storage_url, causing a resource
leak; modify read_cloud_storage_json to open the file using a with-statement
(context manager) to ensure the file handle is closed and wrap the read in
try/except/finally where the finally block removes the temporary file (using
local_path.unlink() or equivalent) and re-raises FileServiceException on
JSON/Unicode errors, referencing the existing symbols read_cloud_storage_json,
_download_from_cloud_storage_url, and FileServiceException.

In `@src/summary/summary/core/models.py`:
- Around line 47-49: The download_link attribute is currently typed as optional
but lacks a default, making it required under Pydantic v2; update the model's
download_link declaration (the download_link Field in your model class) to
provide an explicit default of None (e.g., use Field(default=None,
title="Download Link", description="The link to download the recording.") or set
download_link: str | None = None with Field(...)) so clients can omit it without
validation errors.

---

Outside diff comments:
In `@src/summary/summary/core/celery_worker.py`:
- Around line 436-450: metadata_manager.capture(job_id,
settings.posthog_event_success) is invoked twice causing duplicate analytics
events; remove the earlier call and keep a single capture after you finish
storing the transcript and scheduling the webhook. Specifically, delete the
first metadata_manager.capture(...) that appears before
file_service.store_transcript(...) so only the final
metadata_manager.capture(job_id, settings.posthog_event_success) after
call_webhook_v2_task.apply_async(...) remains; the other symbols to locate the
section are file_service.store_transcript, TranscribeWebhookSuccessPayload, and
call_webhook_v2_task.apply_async.

---

Duplicate comments:
In `@src/summary/summary/core/celery_worker.py`:
- Around line 388-389: Change the broad exception handler in celery_worker.py
from "except BaseException as e:" to "except Exception as e:" so
SystemExit/KeyboardInterrupt can propagate; locate the try/except around the
speaker identity resolution (the block logging "Failed to resolve speaker
identities, skipping: {e}") and update the exception type accordingly, leaving
the logger.error call intact.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: fa956f9a-b003-4d24-9a4b-ed16d0bf9ea2

📥 Commits

Reviewing files that changed from the base of the PR and between 7d9f282 and b539ddf.

📒 Files selected for processing (41)

.github/workflows/docker-hub.yml
.github/workflows/meet.yml
env.d/development/common.dist
env.d/development/summary.dist
src/backend/core/api/viewsets.py
src/backend/core/migrations/0019_recording_external_process_id_alter_recording_status.py
src/backend/core/models.py
src/backend/core/recording/event/authentication.py
src/backend/core/recording/event/notification.py
src/backend/core/tests/recording/event/test_notification.py
src/backend/core/tests/recording/test_api_recordings_external_process_hook.py
src/backend/core/utils.py
src/backend/locale/de_DE/LC_MESSAGES/django.po
src/backend/locale/en_US/LC_MESSAGES/django.po
src/backend/locale/fr_FR/LC_MESSAGES/django.po
src/backend/locale/nl_NL/LC_MESSAGES/django.po
src/backend/meet/settings.py
src/helm/env.d/dev-dinum/values.meet.yaml.gotmpl
src/helm/env.d/dev-keycloak/values.meet.yaml.gotmpl
src/summary/summary/api/main.py
src/summary/summary/api/route/tasks.py
src/summary/summary/api/route/tasks_v2.py
src/summary/summary/core/analytics.py
src/summary/summary/core/celery_worker.py
src/summary/summary/core/config.py
src/summary/summary/core/docs_service.py
src/summary/summary/core/file_service.py
src/summary/summary/core/locales/de.py
src/summary/summary/core/locales/en.py
src/summary/summary/core/locales/fr.py
src/summary/summary/core/locales/nl.py
src/summary/summary/core/locales/strings.py
src/summary/summary/core/models.py
src/summary/summary/core/shared_models.py
src/summary/summary/core/transcript_formatter.py
src/summary/summary/core/user_assign.py
src/summary/summary/core/webhook_service.py
src/summary/summary/main.py
src/summary/tests/api/test_api_tasks.py
src/summary/tests/api/test_api_tasks_v2.py
src/summary/tests/conftest.py

💤 Files with no reviewable changes (5)

src/summary/tests/api/test_api_tasks.py
src/summary/summary/api/route/tasks.py
src/summary/tests/conftest.py
.github/workflows/meet.yml
src/summary/summary/core/webhook_service.py

This commit cleans up most of the code related to v1 route that is used only by visio. This is first step before introducing an update to the v2 route.

* Update the v2 create transcribe payload to support publishing directly to docs and performing automatically a summary if requested * Note that title computation is to be handled by the caller now as it makes more sense and avoids throwing a bunch of parameters to the endpoint. * All files are send as signed URL now, we don't read directly from s3 anymore, * Docs integration is now explicit in summary settings * To make analytics work properly accross projects, we use the user sub as distinct id, this will require a new posthog project to be deployed to work properly. We create a post hog event at the creation request processing, to make sure feature flags work properly after that. * User email should now be provided to the API, it's not mandatory to avoid a breaking change. * Use a specific user agent for better tracking Finally note that existing helpers don't always make use of the pydantic models, so that's why there are model_dump in some places. To avoid bigger changes.

This commit introduces the compatibility with summary v2. It tecnically doesn't break the compatibility with v1 as v1 params are still sent. But we advise people using the transcribe feature in their own deployments to adapt to the new v2 API, as this compatibility will be removed in a future major version. * RecordingStatusChoices now has EXTERNAL_PROCESS_SUCCESSFUL & EXTERNAL_PROCESS_FAILED Values, which are changed by a new webhook that can be called by the transcribe service. This webhook is protected by its own bearer token. * Title for the document is computed in visio, * Tests are added / updated accordingly

sonarqubecloud · 2026-06-25T08:04:02Z

Quality Gate passed

Issues
2 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

lebaudantoine · 2026-06-25T14:06:52Z

+        data = request.data
+        if not data.get("job_id"):
+            raise drf_exceptions.ValidationError(detail="No job_id provided")
+
+        job_id = data["job_id"]


wip: use a serializer? with a clear validation of what format should a job_id be

I think we can be relaxed here, to avoid self holster being forced to used a specific uuid format, etc.

lebaudantoine · 2026-06-25T14:08:53Z

+    external_process_id = models.CharField(
+        max_length=255,
+        null=True,
+        blank=True,
+        unique=True,
+        verbose_name=_("External Process ID"),
+        help_text=_("ID of the external process associated with the recording."),
+    )


wip: CharField seems very open right now. Is that by design, or should we enforce a UUID format?

I think we can be relaxed here, to avoid self holster being forced to used a specific uuid format, etc.

lebaudantoine · 2026-06-25T14:24:23Z

+    # Docs service configuration
+    is_docs_integration_enabled: bool = False
+    docs_base_url: Url = "https://example.com"
+    docs_server_to_server_api_key: SecretStr = Field(
+        title="API key for using docs server to server api", default="NO_API_KEY"
+    )


WIP: Docs is probably vague for external contributors; I would suggest being more precise, LaSuite Docs ?

Yeah we should find something better.

lebaudantoine · 2026-06-25T14:26:35Z

+    def validate_docs_config(self):
+        """Validate docs integration configuration."""
+        if self.is_docs_integration_enabled:
+            if not self.docs_base_url:
+                raise ValueError(
+                    "docs_base_url is required when docs integration is enabled"
+                )
+            if self.docs_server_to_server_api_key.get_secret_value() in {
+                "",
+                "NO_API_KEY",
+            }:
+                raise ValueError(
+                    "Valid docs_server_to_server_api_key is required when"
+                    " docs integration is enabled"
+                )
        return self


nit: early return would save one nesting level

if not self.is_docs_integration_enabled: return self

lebaudantoine · 2026-06-25T15:38:41Z

    webhook_backoff_factor: float = 0.1
-
-    # Locale
-    default_context_language: Literal["de", "en", "fr", "nl"] = "fr"


wip: why removing?

Default should be handled at the caller level from my pov

lebaudantoine · 2026-06-25T16:29:56Z

    SUMMARY_SERVICE_API_TOKEN = SecretFileValue(
        None, environ_name="SUMMARY_SERVICE_API_TOKEN", environ_prefix=None
    )
+    SUMMARY_SERVICE_WEBHOOK_API_TOKEN = SecretFileValue(
+        None, environ_name="SUMMARY_SERVICE_WEBHOOK_API_TOKEN", environ_prefix=None
+    )


WIP: open question, since these two components already share a secret, would it make sense to reuse the same one here for simplicity? My intuition is that it’s probably not a good idea, as it would introduce tighter coupling and reduce separation of concerns.

But it would make configuration easier

I agree with ease of configuration. However that wouldn't be ideal.
Maybe this could be adapted with a single SUMMARY_SERVICE_TOKEN, that we use to sign the payload with on the summary side, and on visio side we can check that signed payload is signed with something we know 🤷 Not sure though and would require changes in Transcripts too.

lebaudantoine · 2026-06-25T16:31:35Z

 def generate_download_file_url(file, *, expires_in: int, override_domain: bool = True):
    """
    Generate a S3 signed download url for a given file.
    """

    key = file.file_key
+    return generate_download_s3_file_url(
+        key, expires_in=expires_in, override_domain=override_domain
+    )
+
+
+def generate_download_s3_file_url(
+    key: str, *, expires_in: int, override_domain: bool = True
+):
+    """
+    Generate a S3 signed download url for a given key.
+    """
+    if not key:
+        raise ValueError("key cannot be empty")


wip: naming generate_download_s3_file_url and generate_download_file_url are close, and we don't get easily what's the difference between them.

yeah, I need to delete one of those;

lebaudantoine · 2026-06-25T16:36:09Z

+        locale = (
+            locale
+            if locale in {"de-de", "en-us", "fr-fr", "nl-nl"}
+            else settings.LANGUAGES[0][0]
+        )
+        default_template_by_locale = {
+            "de-de": "Transkription",
+            "en-us": "Transcription",
+            "fr-fr": "Transcription",
+            "nl-nl": "Transcriptie",
+        }
+        if recording_datetime is None:
+            return default_template_by_locale.get(
+                locale, default_template_by_locale["en-us"]
+            )
+
+        template_by_locale = {
+            "de-de": 'Besprechung "{room}" am {room_recording_date} um {room_recording_time}',
+            "en-us": 'Meeting "{room}" on {room_recording_date} at {room_recording_time}',
+            "fr-fr": 'Réunion "{room}" du {room_recording_date} à {room_recording_time}',
+            "nl-nl": 'Vergadering "{room}" op {room_recording_date} om {room_recording_time}',
+        }


wip: can't we rely on Django i18n for these parts? to avoid these manipulations

True, maybe, I am not familiar enough with that. I'll check.

lebaudantoine · 2026-06-25T16:39:12Z

-            "language": recording.options.get("language"),
            "owner_timezone": str(owner_access.user.timezone),
            "download_link": f"{get_recording_download_base_url()}/{recording.id}",
-            "context_language": owner_access.user.language,


WIP: Removing these params is inherently a breaking change, don't you think?

They are moved in the "V2" part, so not breaking from my pov.

(they are shared between v2 and v1)

lebaudantoine · 2026-06-25T16:49:09Z

+class PushToDocsBaseConfig(BaseModel):
+    """Model containing information for pushing transcript and summaries to docs."""
+
+    user_email: str = Field(


should it be a string?

What else are you thinking ? (like a proper email check?)

lebaudantoine · 2026-06-25T16:49:20Z

+class PushToDocsTranscriptConfig(PushToDocsBaseConfig):
+    """Model for push to docs information for transcripts."""
+
+    download_link: str | None = Field(


same should it be url?

lebaudantoine · 2026-06-25T16:49:25Z

+        title="Download Link",
+        description="The link to download the recording.",
+    )
+    form_link: str | None = Field(


lebaudantoine · 2026-06-25T16:56:25Z

+            is_v2_implementation = False
+            try:
+                response_json = response.json()
+                # We do not require a job_id to avoid a breaking change
+                if job_id := response_json.get("job_id"):
+                    recording.external_process_id = job_id
+                    recording.save()
+                    is_v2_implementation = True
+            except json.JSONDecodeError:


Should we make it explicit via Django settings whether we use v1 or v2?

Right now, checking for the presence of job_id to infer the version feels a bit weak, it’s implicit that v1 has no job_id, but not very clear.

I think we’d benefit from an explicit setting-based switch, calling either a dedicated v1 or v2 method. That would make the logic easier to follow and avoid sending requests with both v1 and v2 params mixed together.

It would also make the v1 deprecation path cleaner: we could simply remove/vendor the v1 method instead of refactoring this one later.

True I'll like that idea

lebaudantoine · 2026-06-25T17:00:19Z

+            raise AuthenticationFailed("Webhook authentication is not configured.")
+
+        auth_header: str = request.headers.get("Authorization") or ""
+        if not auth_header.startswith("Bearer "):


RFC 7235 says the scheme is case-insensitive

lebaudantoine · 2026-06-25T19:47:30Z

+        if not required_token:
+            raise AuthenticationFailed("Webhook authentication is not configured.")
+
+        auth_header: str = request.headers.get("Authorization") or ""


wip: probably, use the class attribute, AUTH_HEADER and TOKEN_TYPE?

lebaudantoine · 2026-06-25T19:49:31Z

+                request.META.get("REMOTE_ADDR"),
+            )
+            raise AuthenticationFailed("Invalid authorization header format.")
+        token = auth_header[7:]  # len("Bearer ") == 7


it doesn't seem robust to me,
Could smth like this be appropriate :

scheme, _, token = auth_header.partition(" ") if scheme.lower() != "bearer" or not token.strip(): raise AuthenticationFailed("Invalid authorization header format.") token = token.strip()

lebaudantoine · 2026-06-25T19:54:17Z

        return f"{self.TOKEN_TYPE} realm='Storage event API'"
+
+
+class RecordingProcessWebhookAuthentication(BaseAuthentication):


Probably we should follow the implementation from https://github.com/suitenumerique/docs/blob/3fe8bd5cc0aa250e2513297cdfda34f219b180af/src/backend/core/authentication/__init__.py#L9, and maybe backport it to the django-lasuite library

lebaudantoine · 2026-06-25T19:57:21Z

+        "allowed_push_to_docs": false
+      },
+      {
+        "id": "visio",


Suggested change

"id": "visio",

"id": "meet",

lebaudantoine · 2026-06-25T20:12:13Z

+class PushToDocsBaseConfig(BaseModel):
+    """Model containing information for pushing transcript and summaries to docs."""
+
+    user_email: str = Field(


lebaudantoine · 2026-06-25T21:25:24Z

+            recording = None
+
+        changed = False
+        if recording and data.get("type") == "transcript":


what are the other possible types?

lebaudantoine · 2026-06-25T21:35:10Z

    response = requests.post(
        tenant.webhook_url,
        json=payload.model_dump(),
        headers={
            "Authorization": f"Bearer {tenant.webhook_api_key.get_secret_value()}",
+            "User-Agent": settings.app_external_user_agent,
        },
        timeout=(10, 20),
    )
    response.raise_for_status()

    try:
        response_data = response.json()
        document_id = response_data.get("id", "N/A")
    except (json.JSONDecodeError, AttributeError):
        document_id = "Unable to parse response"
        response_data = response.text


Do we need a retry mechanism here?

It would be a shame for the transcription to fail entirely just because the final HTTP request failed once. Seems worth making this more resilient.

lebaudantoine · 2026-06-25T22:02:05Z

+    # We do it synchronously for now
+    if _should_push_to_docs(payload):
+        # Format output
+        content = format_transcript(
+            transcription_res.model_dump(),
+            payload.context_language,
+            payload.language,
+            payload.push_to_docs_config.download_link,
+            payload.push_to_docs_config.form_link,
+        )
+
+        create_document_in_docs(
+            content=content,
+            title=payload.push_to_docs_config.title,
+            email=payload.push_to_docs_config.user_email,
+            sub=payload.user_sub,
+        )
+
+        if _should_auto_create_summary(payload):
+            summary = summarize_transcription_internals(
+                distinct_id=payload.user_sub,
+                transcript=content,
+                session_id=self.request.id,
+            )
+            locale = get_locale(payload.context_language, payload.language)
+            create_document_in_docs(
+                content=summary,
+                title=locale.summary_title_template.format(
+                    title=payload.push_to_docs_config.title
+                ),
+                email=payload.push_to_docs_config.user_email,
+                sub=payload.user_sub,
+            )


That’s one of the major changes here: waiting for the summary output synchronously. I don’t think this is the right pattern.

Right now, because of an issue in the Albert API, the summarization job is failing, and I haven’t fixed it yet. With this setup, that means all transcription jobs will fail from a Celery standpoint. To me, that clearly shows that coupling these two tasks is not the right approach.

As soon as we start working again on the summary feature, we need to be able to iterate on it freely without any risk of impacting the transcription pipeline, which is production-ready.

We’re also regressing operationally: with the current design, we lose the ability to scale workers independently. Today, we can dedicate workers to summaries (which involve external API calls and can block), while keeping transcription workers isolated and efficient.

For me, this is a blocking concern for merge.

lebaudantoine · 2026-06-25T22:03:24Z

        payload.user_sub,
    )

    job_id = self.request.id


Suggested change

task_id = self.request.id

lebaudantoine · 2026-06-26T19:38:53Z

+    ).allowed_push_to_docs:
+        logger.info("Push to docs is not allowed for tenant, skipping push to docs.")
+        return False
+    return True


is the info the right level?

I’d go with this approach, since all those if statements really only exist to compute the reason that gets logged. Sharing a common prefix would also make the logs easier to parse when this situation occurs.

Suggested change

return True

def _should_push_to_docs() -> bool:

"""Wip."""

if not payload.push_to_docs_config:

reason = "not requested in the payload"

elif not settings.is_docs_integration_enabled:

reason = "docs integration is disabled"

elif not settings.get_authorized_tenant(

tenant_id=payload.tenant_id

).allowed_push_to_docs:

reason = "not allowed for tenant"

else:

return True

logger.info("Skipping push to docs: %s.", reason)

return False

FloChehab mentioned this pull request May 26, 2026

Feat/meet summary v2 #1346

Closed

github-code-quality Bot found potential problems May 26, 2026

View reviewed changes

Comment thread src/backend/core/recording/event/notification.py Fixed

Comment thread src/summary/summary/core/celery_worker.py Fixed

FloChehab force-pushed the integration/meet-summary-v2-simplier branch from 8dc08b2 to 21202e4 Compare May 27, 2026 09:34

github-code-quality Bot found potential problems May 27, 2026

View reviewed changes

Comment thread src/backend/core/recording/event/notification.py Fixed

FloChehab force-pushed the integration/meet-summary-v2-simplier branch 2 times, most recently from f5d2ee6 to b8b5d01 Compare May 27, 2026 10:36

github-code-quality Bot found potential problems May 27, 2026

View reviewed changes

Comment thread src/summary/tests/api/test_api_tasks_v2.py Fixed

Comment thread src/summary/summary/core/celery_worker.py Fixed

coderabbitai Bot reviewed May 27, 2026

View reviewed changes

FloChehab force-pushed the integration/meet-summary-v2-simplier branch 5 times, most recently from 3d30412 to 2389908 Compare May 28, 2026 20:08

FloChehab changed the title ~~Integration/meet summary v2 simplier~~ feat(summary) Migrate visio integration to summary API v2 May 28, 2026

FloChehab marked this pull request as ready for review May 28, 2026 20:13

FloChehab requested a review from lebaudantoine May 28, 2026 20:13

FloChehab mentioned this pull request May 28, 2026

✨(summary) extended support for all video / audio files #1358

Merged

FloChehab force-pushed the integration/meet-summary-v2-simplier branch 2 times, most recently from 0bbc291 to 7cb536d Compare June 22, 2026 13:04

FloChehab added 3 commits June 25, 2026 10:02

💥(summary) remove v1 related code

4be7618

This commit cleans up most of the code related to v1 route that is used only by visio. This is first step before introducing an update to the v2 route.

FloChehab force-pushed the integration/meet-summary-v2-simplier branch from 7cb536d to ea71fdf Compare June 25, 2026 08:02

lebaudantoine reviewed Jun 25, 2026

View reviewed changes

lebaudantoine reviewed Jun 26, 2026

View reviewed changes

		return f"{self.TOKEN_TYPE} realm='Storage event API'"


		class RecordingProcessWebhookAuthentication(BaseAuthentication):

-    return True
+def _should_push_to_docs() -> bool:
+    """Wip."""
+    if not payload.push_to_docs_config:
+        reason = "not requested in the payload"
+    elif not settings.is_docs_integration_enabled:
+        reason = "docs integration is disabled"
+    elif not settings.get_authorized_tenant(
+        tenant_id=payload.tenant_id
+    ).allowed_push_to_docs:
+        reason = "not allowed for tenant"
+    else:
+        return True
+    logger.info("Skipping push to docs: %s.", reason)
+    return False

Uh oh!

Conversation

FloChehab commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview

Changes

🔥 Breaking / Cleanup

✨ Summary Service — v2 Route Extensions

✨ Visio — v2 Compatibility

Notes

Integration tests

Uh oh!

Uh oh!

Uh oh!

Uh oh!

gitguardian Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

️✅ There are no secrets present in this pull request anymore.

Uh oh!

Uh oh!

Uh oh!

FloChehab commented May 27, 2026

Uh oh!

coderabbitai Bot commented May 27, 2026

Uh oh!

coderabbitai Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Estimated code review effort

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sonarqubecloud Bot commented Jun 25, 2026

Quality Gate passed

Uh oh!

lebaudantoine Jun 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lebaudantoine Jun 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lebaudantoine Jun 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

FloChehab commented May 26, 2026 •

edited

Loading

gitguardian Bot commented May 27, 2026 •

edited

Loading

coderabbitai Bot commented May 27, 2026 •

edited

Loading

lebaudantoine Jun 25, 2026 •

edited

Loading

lebaudantoine Jun 25, 2026 •

edited

Loading

lebaudantoine Jun 25, 2026 •

edited

Loading