feat: enhance Chinese language support with variant handling and conversion #102
BrianNguyen291 wants to merge 2 commits into
ruzin left a comment
Thanks for stacking these together — the Ollama-reuse path and diarisation rework both look good. A few things worth addressing before merge:
Bug — normalize_markdown corrupts bold/italic at line start (medium)
src/summarizer.py:23 uses ^([-*])(?=\S):
**bold** → * *bold**
*italic* → * italic*
Any line starting with markdown emphasis becomes a malformed bullet. Fix with a negative lookahead:
    text = re.sub(r'(?m)^([-*])(?=\S)(?![-*])', r'\1 ', text)
Worth a unit test on **bold**, *italic*, - item, *item, 1. item to lock behavior.
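A minimal sketch for checking both patterns against the cases listed above. The helper name `add_bullet_space` is hypothetical and is not taken from `src/summarizer.py`; only the two regular expressions come from this review. Running it suggests the lookahead protects `**bold**`, while `*italic*` is still rewritten, since it cannot be told apart from `*item` by its first two characters. The test should therefore pin down which outcome is intended for that case.

```python
import re

# Pattern quoted in the review as the current behaviour of src/summarizer.py:23.
ORIGINAL = re.compile(r'(?m)^([-*])(?=\S)')
# Pattern proposed in the review: skip lines whose second character is also - or *.
PROPOSED = re.compile(r'(?m)^([-*])(?=\S)(?![-*])')


def add_bullet_space(pattern: re.Pattern, text: str) -> str:
    """Insert a space after a leading - or * that is glued to the following text.

    Hypothetical helper used only to compare the two patterns.
    """
    return pattern.sub(r'\1 ', text)


CASES = ["**bold**", "*italic*", "- item", "*item", "1. item"]

for case in CASES:
    before = add_bullet_space(ORIGINAL, case)
    after = add_bullet_space(PROPOSED, case)
    print(f"{case!r:12} original={before!r:14} proposed={after!r}")
```

If `*italic*` must survive unchanged, one option is to also skip lines that contain a later closing `*`, at the cost of leaving a glued bullet such as `*item one * two` untouched; that trade-off is a judgment call for the author.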
OpenCC packaging — verify in DMG (medium)
opencc-python-reimplemented ships JSON dictionary files. PyInstaller doesn't auto-collect them; if stenoai.spec doesn't include the package's data, runtime conversion will silently fall back to the input (the warning in _get_converter masks the failure). Please cold-test a packaged DMG with a Traditional-Chinese recording — not just npm start. If conversion fails, add a collect_data_files('opencc') entry to the spec.
Silent failure swallow (low)
simple_recorder.py:391 wraps the summary variant conversion in bare try/except: pass. Prefer:
    except Exception as e:
        logger.warning(f"Chinese variant conversion failed: {e}")
Behavior change — is_diarised semantics (low, worth confirming)
Previously is_diarised = bool(diarised_text). Now:
    is_diarised = bool(diarised_text) and not single_source
Single-channel recordings with timestamped output now report is_diarised=False. Worth a quick grep for is_diarised to confirm no downstream consumer (UI, summarizer prompts, file naming) relies on the old semantics.
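To make the semantic change concrete, here is a small sketch that places the two definitions side by side. The function names are hypothetical; only the two expressions are taken from the comment above.

```python
def is_diarised_old(diarised_text: str) -> bool:
    # Previous rule: any non-empty diarised text counts as diarised.
    return bool(diarised_text)


def is_diarised_new(diarised_text: str, single_source: bool) -> bool:
    # New rule: diarised text only counts when more than one source was recorded.
    return bool(diarised_text) and not single_source


# The one input combination where the two rules disagree:
# timestamped text produced from a single-channel recording.
sample = "[00:00] hello"
print(is_diarised_old(sample), is_diarised_new(sample, single_source=True))
```

Any consumer that treated `is_diarised` as "timestamped text is available" would see a different value for exactly this combination.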
Style nits
- from src.summarizer import normalize_markdown is inlined 3× in simple_recorder.py — hoist to module-level.
- get_language() silently rewrites "zh" → "zh-Hans" on every read but never persists. A one-shot migration in Config.__init__ would be cleaner.
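A rough sketch of what a one-shot migration could look like. This `Config` class is a hypothetical stand-in: the JSON file layout, the `language` key, and the `_load`/`_save` helpers are assumptions made for illustration and may not match the project's real `Config`.

```python
import json
from pathlib import Path

# Legacy language codes and the values they should be rewritten to.
LEGACY_LANGUAGE_MAP = {"zh": "zh-Hans"}


class Config:
    """Hypothetical stand-in for the project's Config class."""

    def __init__(self, path):
        self.path = Path(path)
        self.data = self._load()
        self._migrate_language()  # runs once per load, persists only on change

    def _load(self) -> dict:
        if self.path.exists():
            return json.loads(self.path.read_text(encoding="utf-8"))
        return {}

    def _save(self) -> None:
        payload = json.dumps(self.data, ensure_ascii=False, indent=2)
        self.path.write_text(payload, encoding="utf-8")

    def _migrate_language(self) -> None:
        current = self.data.get("language")
        if isinstance(current, str) and current in LEGACY_LANGUAGE_MAP:
            self.data["language"] = LEGACY_LANGUAGE_MAP[current]
            self._save()  # write back so later reads need no rewriting

    def get_language(self, default: str = "en") -> str:
        # Plain read: no silent rewriting here any more.
        return self.data.get("language", default)
```

With this shape, `get_language()` stays a pure read and the file on disk reflects the migrated value after the first load.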
Test coverage
No new tests for non-trivial logic. Recommend:
- tests/test_chinese.py — apply_variant with simplified/traditional/None; ImportError path.
- tests/test_summarizer.py — normalize_markdown including the bold/italic regression above.
- tests/test_config.py — zh→zh-Hans migration, get_whisper_language, get_chinese_variant.
- tests/test_transcriber.py — diarised interleaving (single-source, both-channels, segment-less fallback).
Splitting into two PRs (Chinese support / diarisation+Ollama-reuse) would have eased review, but the changes are coherent enough as-is.
|
@BrianNguyen291 great PR, left comments! |
Settings → Advanced
- Storage location now renders via a CopyableValue helper (path + click-to-copy with a checkmark confirm). main.js handler computes the effective path: returns the user's custom path when set, otherwise ~/Library/Application Support/stenoai (was empty string before, so the section rendered blank on default installs).
- New 'Anonymous ID' row below the analytics toggle, also copyable. GetTelemetryResponse widened to surface anonymous_id from the Python CLI (was being discarded).
Setup wizard
- Telemetry switch added to the wizard so users opt in/out during onboarding without having to find Settings → Advanced.
Ollama startup
- Probe 127.0.0.1:11434 before spawning. If reachable, reuse the existing instance and skip the readiness loop. Avoids 'address already in use' when a previous launch left an Ollama running. Mirrors the fix being landed on PR ruzin#102.
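The probe-then-reuse idea from the Ollama startup note can be sketched as below. The actual change lives in main.js; this Python version only illustrates the logic, and the names `is_port_open` and `start_or_reuse` are hypothetical. Note that an open TCP port does not prove the listener is Ollama, so a real implementation may also want to check an HTTP response.

```python
import socket
from typing import Callable


def is_port_open(host: str = "127.0.0.1", port: int = 11434,
                 timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def start_or_reuse(spawn: Callable[[], None], host: str = "127.0.0.1",
                   port: int = 11434) -> str:
    """Reuse an already-listening instance, otherwise call spawn()."""
    if is_port_open(host, port):
        return "reused"   # skip spawning and the readiness loop
    spawn()               # caller-provided function that launches the server
    return "spawned"
```

A caller would pass its own launcher, for example `start_or_reuse(lambda: subprocess.Popen(["ollama", "serve"]))`, assuming that is how the server is started in the given environment.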
* feat(build): add Vite+React renderer workspace with Electron wiring
The legacy renderer was a single 9,700-line HTML/JS file with no build tooling, no type checking, and no component abstraction — making new features increasingly fragile and slow to build. This commit wires up a parallel React+Vite renderer that sits alongside the legacy UI behind a feature flag stored in user settings. The flag lets us ship incrementally without disrupting existing users, and makes it easy to fall back if needed during the transition.
Key pieces:
- Vite workspace at app/renderer/ with TypeScript, Tailwind, shadcn/ui
- app/vite.config.ts builds the renderer into app/renderer/dist/
- app/preload.js (added in next commit) exposes IPC via contextBridge
- app/main.js: loadNewRenderer() switches the BrowserWindow src, persisted toggle in settings so the choice survives restarts
- app/src/settings-store.js: thin wrapper around electron-store used by the renderer toggle and other persistent prefs
- app/electron-builder.ci.yml: CI-only build config that skips code signing so the pipeline can produce unsigned DMGs for test installs
- app/docs/ipc-contract.md: frozen inventory of every IPC channel the renderer is allowed to call (acts as a stable contract boundary)
* feat(renderer): contextBridge preload and typed IPC client
Electron's contextBridge is the security boundary between the privileged main process and the renderer. Rather than exposing raw ipcRenderer to the page, preload.js declares an explicit allow-list of channels the renderer can call or listen on — everything else is blocked.
app/lib/ipc.ts wraps every channel in a typed TypeScript function so callers never deal with raw channel strings. This makes it immediately obvious which IPC calls exist, what they accept, and what they return. It also means a typo in a channel name is a compile error, not a silent runtime failure.
Other lib/ utilities:
- queryClient.ts: TanStack Query client shared across the whole app
- router.ts: Wouter-based client-side router definition
- result.ts: lightweight Ok/Err type used by async IPC wrappers
- askBarContext.tsx / meetingsListContext.tsx: React contexts that lift shared state (streaming chat, meetings list) above route boundaries
- meetingDetailState.ts: per-meeting tab/scroll position kept in memory
* feat(renderer): TanStack Query hooks over IPC (data layer)
Each hook wraps a set of related IPC calls in a TanStack Query query or mutation. The renderer never talks to ipc.ts directly — it always goes through a hook. This means:
1. Caching and background-refetching are automatic (e.g. the meetings list doesn't need manual refresh after a recording completes)
2. Loading / error states are consistent across the whole UI
3. IPC calls are trivially mockable in isolation
Notable hooks:
- useRecording: start/stop/pause with optimistic UI; also subscribes to the recording:elapsed push event for the live timer
- useMeetings / useFolders: CRUD with cache invalidation
- useAi / useChatSessions: manages per-meeting AI conversation history
- useStreamingQuery: low-level hook that turns an IPC streaming event into an async iterator, used by the Ask Bar for token-by-token output
- useAudioLevel: reads microphone PCM level from a push event, drives the audio wave visualisation on the recording screen
- useCalendarEvents: fetches calendar entries from the backend for the upcoming events section in the sidebar
* feat(renderer): core UI component library
A set of reusable primitive components built on top of shadcn/ui and Radix UI, styled with Tailwind and scoped to the StenoAI design system (indigo/sky/cyan brand tokens, dark-mode-first).
Building these before any screens means: (a) every screen uses the same spacing, radius, and colour decisions; (b) design changes propagate everywhere from one place; (c) each component can be reviewed in the /dev sandbox route without needing real data.
Components: Button, Card, Chip, Dialog, ConfirmDialog, Input, Select, Switch, Tabs, Tooltip, Popover, Row, Typography, Kbd, AppIcon.
* feat(renderer): all screens — home, meetings, meeting detail, settings
The complete set of navigable routes, built on the component library and data hooks from previous commits.
Route breakdown:
- / (Home): meeting list with folder tree in sidebar, upcoming calendar events, search. Uses a Notion-style sidebar that collapses via a fixed toggle button placed next to the traffic lights so it stays reachable.
- /meetings/:id (MeetingDetail): Summary, Transcript, Insights, and Calendar tabs. Transcript panel supports search and word-level timestamp highlighting. The tab choice and scroll position persist across navigations within the same session.
- /folders/:id (FolderDetail): filtered meeting list scoped to a folder.
- /settings: all user preferences — audio device, AI model, storage path, calendar integration, theme. Matches the section structure of the legacy settings panel so existing users find everything in the same place.
- /setup: first-run wizard that walks through model download and mic permission, with progress indicators driven by IPC events.
- /dev (Sandbox): visual component gallery used during development to approve design decisions without needing real meeting data.
AppShell / MeetingsShell handle the persistent layout (titlebar, sidebar, content region) shared across all authenticated routes.
* feat(recording): live recording screen with audio wave, timer, and notes
Previously the app showed a minimal overlay during recording with no live feedback.
Users couldn't tell if audio was being captured, how long they'd been recording, or add context while the meeting was in progress. This replaces that with a full-screen recording experience:
- AudioWave: reads PCM level from the recording:audio-level push event (streamed from main.js ~10 times/sec) and renders a live bar graph. Gives immediate confirmation that the mic is picking up audio.
- Ticking duration: subscribes to recording:elapsed so the timer updates every second without polling. main.js pushes elapsed ms; the renderer formats it locally.
- Live notes: a draft text area backed by liveDraftStore (Zustand) so notes survive navigation but are discarded on stop. The draft is submitted as a note attachment when the user stops recording.
- MainToolbar: pause/resume and stop controls, always visible over the wave. Pause state comes from the recording:pause-state IPC channel.
- QuitDialog: intercepts window close during an active recording so the user doesn't accidentally lose a session.
- Processing screen: shown after stop while transcription is running, with a progress indicator driven by transcription:progress events.
- IconPicker: emoji/icon picker used when naming a new meeting or folder.
Backend (simple_recorder.py): adds title-regeneration command and a model install-status query used by the settings model picker.
src/folders.py: adds folder rename and icon-update commands.
* feat(renderer): floating Ask Bar with streaming AI responses
A floating panel anchored to the bottom of any meeting screen that lets users ask natural-language questions about the current meeting. Answers stream token-by-token using the IPC streaming event channel so the UI feels responsive rather than making the user wait for a full response.
Design decisions:
- Floats above the route content (CSS position:fixed) so it's reachable from any tab (Summary, Transcript, Insights, Calendar) without navigating away.
- Collapsed to a single icon when not in use; expands on click or ⌘K.
- Conversation history is persisted per meeting in chat_sessions.json via useChatSessions so sessions survive app restarts.
- On first open for a meeting the bar pre-seeds context with the meeting summary and transcript so the AI has the right scope without the user needing to paste anything.
- Legacy chat_sessions.json format (from the old renderer) is migrated automatically on first read so existing users don't lose history.
* fix(renderer): address cubic P1 review findings
- package.json: drop the broken postinstall script that referenced a path outside this PR (e2e/scripts/link-node-modules.js); npm install now works without --ignore-scripts
- main.js (showCustomQuitDialog): skip the custom dialog when the legacy renderer is active (it doesn't implement the response channel) and add a 5s timeout for the new renderer so a wedged React tree can't hang quit indefinitely; on timeout default to not quit so an active recording isn't killed silently
- main.js (open-release-page): reject any url that isn't https://github.com/... -- a compromised renderer can no longer pass arbitrary URLs to shell.openExternal through this channel
- renderer/index.html: add a tight Content Security Policy meta tag (default-src 'self', no inline scripts, no external connect, frame-ancestors 'none') to limit XSS blast radius
- useChatSessions: surface legacy migrated sessions on every meeting view instead of dropping them, and check the chat.save IPC envelope -- failures now roll back the optimistic cache update and throw rather than silently diverging from disk
* feat(renderer): restore legacy UX, polish, and fix post-recording races
UX restoration vs legacy renderer:
- Sidebar mark and Home AppIcon now use the website's dragonfly SVG with currentColor so they theme automatically
- Settings button moved from sidebar footer to the top-right toolbar
- Primary CTA "Start recording" -> "New note" with PencilLine icon (recording-state still shows "Stop recording")
- Ask Bar suggestion chips ("Summarize key decisions", "Action items", "Main topics") appear when the input is focused with no chat in progress; clicking submits the prompt
- Processing screen renders a bottom-anchored "scanner bar" matching the legacy generation-scanner: rounded card with shadow, smooth cubic-bezier transition, labels "Analyzing transcript" / "Generating notes" / "Almost done"
- Scroll fade above the dock extended to 80px with a three-stop gradient so notes dissolve into the dock instead of cutting off
- .mv-transcript gets backdrop-filter saturate+blur and an askChatIn slide-in keyframe so the chat panel feels premium
Layout:
- Dock gets 15px right-padding to compensate for the scroll container's scrollbar gutter so the ask bar aligns with the notes column
Bug fixes:
- useChatSessions: stop leaking orphan/legacy sessions onto every meeting; instead match legacy migrated sessions per-meeting via the "${meetingName} — title" prefix the migration writes
- useRecording: pre-seed the meetings list cache from the processingComplete event payload so MeetingDetail finds the new meeting on first render after stop
- MeetingDetail: treat isFetching && !data as loading so the brief refetch window doesn't flash "Meeting not found"
* fix(renderer): address critical review findings
Three issues from PR review that should land before merge:
1. chat sessions: write to chat_sessions_v2.json with tmp+rename atomic write; never modify legacy chat_sessions.json. Prevents the legacy renderer from breaking when a user toggles back, and makes a crash-mid-write recoverable instead of bricking all chat history. On first run with v2 absent, read legacy file once for in-memory migration; on subsequent saves only v2 is touched. If v2 ever ends up corrupt, quarantine it as <name>.corrupt-<ts> rather than failing every load.
2. reveal-meeting-folder: validate the path against allowedBaseDirs before passing to shell.showItemInFolder, matching delete-meeting and update-meeting. Was the only meeting-file IPC handler with no path-traversal guard.
3. Document the new load-chat-sessions response shape (migratedFromLegacy flag) in ipc-contract.md.
* fix(renderer): UX polish from in-app review
Layout / alignment
- Hoist sidebar collapsed/width state into a module-level singleton (with useSyncExternalStore) so MeetingsShell and BottomDockSlot share one source of truth. Without this, toggling/resizing only updated one component, drifting the chat bar's centerline off the notes column.
- Drop the dock's left to 0 when sidebar is collapsed (was 68) so it matches main's marginLeft. Slide transition on left.
- Add scrollbar-gutter: stable to the scroll container and right: 10 on the dock so the centerlines line up regardless of overflow.
- Tighten the content column from max-w-820 px-14 to max-w-720 px-10 so body text feels centered around the ask bar pill.
- Remove the 220px hard spacer at the bottom of MeetingDetail; pb-36 on the scroll container already gives the dock breathing room.
Toolbar / sidebar
- Brand renamed Steno → Dragonfly in the sidebar.
- Settings button moved out of the toolbar to a small icon at the bottom of the sidebar; second click while on /settings returns to the route the user came from (router tracks lastNonSettingsRoute).
- Add light/dark theme toggle (moon/sun) before the New note button.
Home / All meetings
- Drop the dead Filter button (had no onClick).
- Add a working Search input on /meetings that filters by title and summary content with a clear-X affordance.
- Drop the placeholder 'Voice memo' chip when a meeting has no folder and no participants.
- FolderDetail's section header no longer renders its own border; the first row's top border is the only divider.
Ask bar / transcript
- Two-layer fade backdrop on the dock: a narrow 80px gradient above the pill and a solid page-color band below it so content doesn't peek out around or beneath the pill. Pill itself stays fully opaque.
- Bump composer input weight to 500 and bring placeholder up from --fg-muted to --fg-2 so the bar reads less faded.
- Fix transcript panel: search and copy were broken because the AskBar's mousedown click-outside listener was unmounting the panel on any internal click. Stop propagation on the panel root and exempt [data-transcript-bar] in the listener.
- Make the transcript header wave static (was animating, looked like recording was in progress).
Debug console
- Capture backend debug-log lines from app start into a 1000-line module-level ring buffer (primeDebugLogs in App.tsx). DeveloperTab reads via useSyncExternalStore so opening Settings shows the full session, not just lines emitted after the tab mounted.
- Auto-scroll the textarea to the bottom on new lines.
* feat(renderer): default to new React renderer
Tray menu still exposes 'Switch to legacy UI' as a fallback. Existing users with a persisted newRenderer:false in ui-settings.json keep their choice; only fresh installs (and dev/E2E without the file) flip to true.
* feat(renderer): copyable values, telemetry in setup, Ollama reuse fix
Settings → Advanced
- Storage location now renders via a CopyableValue helper (path + click-to-copy with a checkmark confirm). main.js handler computes the effective path: returns the user's custom path when set, otherwise ~/Library/Application Support/stenoai (was empty string before, so the section rendered blank on default installs).
- New 'Anonymous ID' row below the analytics toggle, also copyable. GetTelemetryResponse widened to surface anonymous_id from the Python CLI (was being discarded).
Setup wizard
- Telemetry switch added to the wizard so users opt in/out during onboarding without having to find Settings → Advanced.
Ollama startup
- Probe 127.0.0.1:11434 before spawning. If reachable, reuse the existing instance and skip the readiness loop. Avoids 'address already in use' when a previous launch left an Ollama running. Mirrors the fix being landed on PR #102.
* feat(setup): local vs cloud summarization choice in wizard
The third step (Summarization Engine) now shows a Local/Cloud chooser before kicking off:
- Local (default, recommended): downloads the bundled Ollama model (~2 GB). Same flow as before. Forces ai_provider back to 'local' when picked, so re-running the wizard cleanly switches off cloud.
- Cloud: reveals provider dropdown (OpenAI / Anthropic) and an API key input. On Begin setup, persists ai_provider='cloud', cloud_provider, and the key, then runs testCloudApi to surface a bad key immediately rather than at first summarization. Skips the multi-GB download.
Begin setup is disabled until either Local is picked or a cloud key has been entered, with a small hint underneath. The chooser hides once the wizard is running or done so it doesn't look interactive.
* feat: per-provider cloud model + setup wizard cloud option
Cloud model state is now scoped per-provider so switching from Anthropic to OpenAI doesn't carry an incompatible model name:
- src/config.py: cloud_models dict ({openai, anthropic, custom}) with per-provider defaults (gpt-4o-mini / claude-haiku-4-5-20251001). One-shot migration on config load attributes the legacy single cloud_model to whichever provider was active when it was last saved, then persists. set_cloud_model writes to the active provider's slot; legacy field is mirrored for back-compat.
Settings → AI → Cloud:
- Free-text Model input replaced by a Select populated from testCloudApi's models response. 'Custom...' option falls back to text input. Provider switch invalidates the cached list so an Anthropic model list can't be shown against an OpenAI key.
- Test connection: rewired Setup wizard's success path (was checking result.ok which the IPC never returned, treating every successful test as a failure). Now uses unwrap's throw-on-failure semantics; Settings ConnectionStatus refactored to take ok/message props derived from useMutation state.
- ipc.ts: testCloudApi return type corrected from {ok,message} to {models?: string[]}.
Setup wizard:
- Local/Cloud chooser on the third step. Local downloads the bundled model (existing flow); Cloud reveals provider dropdown + API key input, persists provider/key, runs testCloudApi, marks step done.
- Chooser stays visible after a failed test so user can edit a bad key without restarting; mic + whisper are skipped on retry if they're already done.
- Welcome heading reverted to 'Welcome to Steno'; sidebar brand reverted to 'Steno.'.
Other UX:
- Sidebar 'All meetings' → 'All notes' (page header matches).
- Telemetry switch added to setup wizard.
- Storage location and Anonymous ID now copyable (with a click-to-copy helper).
main.js get-storage-path returns the resolved default (~/Library/Application Support/stenoai) when no custom path is set, fixing the empty render.
- Ollama startup probes 127.0.0.1:11434 and reuses an existing instance to avoid 'address already in use' on retry.
* feat: meeting → note rename, AI auto-title, selection highlight
Auto-titling
- React renderer's empty-name default flips from 'Meeting' to 'Note' so the Python post-processor's auto-rename actually fires.
- simple_recorder.py regex relaxed from '^(Meeting|Note)-[A-Z0-9]{6}$' to '^(Meeting|Note)(-[A-Z0-9]{6})?$' — plain 'Meeting' / 'Note' now also trigger AI-generated titles (legacy hashed names still match).
- main.js captures Python's SAVED:<path> stdout marker as the primary match key when rebuilding meetingData on processing-complete. Was falling back to a name match (fails after auto-rename) and audio-file basename match (broken for some users), so the renderer often navigated to the wrong meeting after a recording finished.
User-facing 'meeting' → 'note' rename
- Sidebar nav 'All meetings' → 'All notes' (page heading matches).
- Home search 'Search meetings' → 'Search notes'.
- MeetingDetail 'Delete meeting' → 'Delete note', 'Meeting not found.' → 'Note not found.'
- MeetingsShell delete confirm 'Delete meeting "..."?' → 'Delete note'.
- PreviousRow 'Untitled meeting' → 'Untitled note'.
- Recording placeholder 'New meeting' → 'New note'.
- Processing fallback display title 'Meeting' → 'Note'.
Visual
- ::selection background switched from --paper-2 (which blended into the faint search-bar tint and made selections invisible) to a translucent indigo accent so highlights are visible everywhere.
* fix(renderer): address cubic review P1s and P2s
P1
- config.py migration: don't overwrite the on-disk config with defaults when _load() returned defaults due to a parse/IO error.
Only persist the cloud_models migration when there's an actual legacy value to attribute; otherwise just stage the empty map in memory.
- Settings cloud config: model dropdown was disappearing the instant a user picked a model because the same effect that resets test state on provider change was also firing on cloud_model change. Split into two effects so a model selection no longer triggers testConnection.reset().
- AskBar submitPrompt: rapid suggestion-chip clicks could create duplicate sessions/streams. Added a submittingRef guard.
- useRecording: the processing-complete listener was attached inside useRecording (called by 12+ components), so each completion triggered N invalidations and N navigations. Extracted into useRecordingProcessingEffects, mounted once at App level.
- debugLogs.primeDebugLogs cleanup now resets the primed flag so a later mount can rebind the IPC listener.
P2
- Sidebar + App route decoders: malformed % escapes in the route hash could throw URIError and crash render. Wrapped in try/catch.
- ui/chip.tsx: asButton now renders a real <button type="button"> with proper button HTML attrs (was a div+role/tabIndex shim that also defaulted to type=submit, breaking chips inside <form>).
- ui/typography.tsx: caller-supplied style was overwriting the default fontVariationSettings. Spread caller's style after the defaults.
- MainToolbar: 'processing' status was misclassified as idle, so the recording button looked clickable while a previous job was still processing. Now shows 'Processing' label and is disabled.
- useFolders icon update: added rollback on mutation error so failed IPC saves restore the previous icon instead of leaving the cache with a value that never made it to disk.
- vite.config: gate sourcemap on NODE_ENV !== production so packaged DMGs don't ship .map files.
* fix(main): YAML-safe frontmatter writes in update-meeting
The MD update path reconstructed the entire YAML frontmatter from a naively parsed map, which (a) wrote titles wrapped in raw double quotes without escaping embedded ", \ or newlines (corrupting the YAML block), and (b) coerced every other field to a quoted string — clobbering structured values like folders: ["id1"] and is_diarised: false on every save.
New approach: do a line-based rewrite that only mutates the title and updated_at lines and leaves the rest byte-identical, with a yamlQuote() helper that properly escapes \, ", and newlines per the YAML double-quoted-scalar spec.
---------
Co-authored-by: ruzin <ruzin113@icloud.com>
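The line-based rewrite described in the YAML frontmatter commit above might look roughly like this. It is a Python sketch of logic that lives in main.js; `yaml_quote` and `rewrite_frontmatter` are hypothetical names, and only the three characters named in the commit message (backslash, double quote, newline) are escaped here.

```python
def yaml_quote(value: str) -> str:
    """Wrap value as a YAML double-quoted scalar, escaping \\, " and newlines."""
    escaped = (
        value.replace("\\", "\\\\")   # backslashes first, so later escapes survive
             .replace('"', '\\"')
             .replace("\n", "\\n")
    )
    return f'"{escaped}"'


def rewrite_frontmatter(markdown: str, title: str, updated_at: str) -> str:
    """Replace only the title and updated_at lines inside the leading --- block."""
    lines = markdown.split("\n")
    if not lines or lines[0] != "---":
        return markdown               # no frontmatter: leave the file untouched
    try:
        end = lines.index("---", 1)   # closing delimiter of the frontmatter
    except ValueError:
        return markdown               # unterminated block: do not guess
    for i in range(1, end):
        if lines[i].startswith("title:"):
            lines[i] = f"title: {yaml_quote(title)}"
        elif lines[i].startswith("updated_at:"):
            lines[i] = f"updated_at: {yaml_quote(updated_at)}"
    return "\n".join(lines)
```

Because every other line is copied through unchanged, structured values such as `folders: ["id1"]` and `is_diarised: false` keep their original form.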
|
@BrianNguyen291 do you wanna address the feedback or should I? |
|
Sorry, I've been busy these days. I'll fix it soon and then notify you. Thank you |
|
@ruzin could you help me check later? I added tests. Please let me know if anything else is needed. Thanks |
|
@BrianNguyen291 will merge today |
|
@BrianNguyen291 hey Brian, sorry, give me a couple more days, I've been busy getting some other features out.
Summary by cubic
Adds full Chinese variant support (Simplified and Traditional) across transcription and summaries, plus timestamped diarised transcripts shown in the UI with cleaner markdown. Also improves Ollama startup reliability.
New Features
- zh-Hans/zh-Hant. Map both to Whisper zh. Convert transcripts, streamed summaries, and final summaries via OpenCC. Migrate legacy zh to zh-Hans.
- diarised_text when present. Expose per-segment data from whisper.cpp and fall back when segments are missing.
Dependencies
- opencc-python-reimplemented for Simplified/Traditional conversion; include OpenCC data files in packaging.
Written for commit 2909808. Summary will update on new commits.