A 1.7B parameter language model trained for life-safety maritime operations.
Runs 100% offline on Android. No cloud. No RAG. No compromise.
Download APK · Model on HuggingFace · Architecture · Training Pipeline · Mobile App
Qwen3-1.7B fine-tuned across a 6-phase pipeline (CPT → SFT → ORPO) on 72M+ tokens of curated maritime knowledge and 500K+ QA pairs, distilled from a 235B-parameter teacher model. Quantized to GGUF Q4_K_M (1.03 GB) for on-device inference. The corpus was scraped from 43 authoritative sources, including IMO, SOLAS, MARPOL, classification societies, P&I clubs, and accident investigation boards. Trained on Tesla K80 × 4.
Merchant ships operate with no reliable internet. When something fails (a boiler fault, a fuel leak, an enclosed space emergency), the crew has minutes to act. The nearest manual is 800 pages. Shore support is hours away.
Caution
Wrong answers at sea have real consequences. A crankcase entry error causes an explosion. An enclosed space misjudgment kills people. A MARPOL violation triggers environmental disasters and port detentions.
Maritime AI is not a chatbot. It is a domain-specific language model that runs on a phone with no connectivity, built to the same standard that classification societies use to certify vessels. Every architectural decision traces to a published paper. Every quality gate has a mathematical threshold.
How is this different from a general-purpose LLM?
| | General LLM | Maritime AI |
|---|---|---|
| Training data | Internet-scale, unverified | 43 authoritative maritime sources, manually curated |
| Distillation | None | Qwen3-235B-A22B (142 GB), 4 concurrent instances |
| Safety alignment | Generic RLHF | ORPO with 4 domain-specific error vectors |
| Trap rejection | Ad-hoc | 97.5% rejection on adversarial out-of-domain prompts |
| Deployment | Cloud API | On-device GGUF, zero network dependency |
| Reasoning | Single-mode | Dual-mode: /think (deep) and /no_think (concise) |
Maritime AI: complete system architecture

- Data collection layer
  - Sources shown: IMO, SOLAS, MARPOL, STCW, IMDG, MAIB, EMSA, NTSB, DNV, Gard, ClassNK, Lloyd's, ABS, BIMCO, P&I Clubs
  - 43 custom scrapers → PDF extractors → quality filters
  - Output: 72M+ tokens of gold-standard training data
- Teacher distillation layer
  - "The Teacher": Qwen3-235B-A22B (142GB, Q4_K_M), 4× llama-server instances
  - 5 question angles per chunk
  - IFD-based SuperFilter (ACL 2024)
  - MinHash deduplication
  - Output: 500K+ Q&A pairs (gold-standard SFT)
- 6-phase training pipeline (Tesla K80 × 4, fp16, QLoRA)
  - Phase 1: CPT (domain adaptation). Gate: PPL↓15%, Gen↑<10%
  - Phase 2: SFT-1 (reasoning, /think). Gate: 70% <think>
  - Phase 3: SFT-2 (direct, /no_think). Gate: 60% trap refusal
  - Phase 4: Correction (on-policy)
  - Phase 5: ORPO (β=0.1)
  - Phase 6: Quantize (Q4_K_M)
- Deployment layer
  - HuggingFace Model Hub: model.gguf (1.03 GB), whisper-tiny
  - React Native mobile app: llama.rn (C++ mmap), LLM router (self-classify)
  - Offline-first inference: <think> / </think> mode, context pruning, safety alerts, FTS5 SQLite persistence, Whisper STT
| Metric | Value |
|---|---|
| Total Python Files | 198 (training, data pipeline, scrapers, generation) |
| Total TypeScript/TSX Files | 40 (React Native frontend) |
| Custom Web Scrapers | 43 sources (IMO, SOLAS, MAIB, DNV, ClassNK, Lloyd's, etc.) |
| Training Data | 72M+ tokens from 43 authoritative maritime sources |
| Generated Q&A Pairs | 500,000+ multi-angle distilled samples |
| Training Phases | 6 (CPT → SFT1 → SFT2 → Correction → ORPO → Quantize) |
| Teacher Model | Qwen3-235B-A22B (142GB, Q4_K_M), 4 concurrent instances |
| Student Model | Qwen3-1.7B → fine-tuned → GGUF Q4_K_M (1.03 GB) |
| Training Hardware | 4× Tesla K80 (11GB each), 251GB RAM, 48 CPU threads |
| Final Model Size | 1.03 GB (Q4_K_M) / 1.17 GB (Q5_K_M) |
| Frontend | React Native + Expo + llama.rn + Whisper STT |
| Deployment | HuggingFace Hub |
ship/
├── training/  # Model Training Pipeline
│   ├── run_cpt_1.7b.py  # Phase 1: Continued Pre-Training (946 lines)
│   ├── run_sft1_1.7b.py  # Phase 2: SFT Stage 1, Reasoning (/think)
│   ├── run_sft2_1.7b.py  # Phase 3: SFT Stage 2, Direct (/no_think)
│   ├── run_correction_1.7b.py  # Phase 4: On-Policy Correction
│   ├── run_orpo_1.7b.py  # Phase 5: ORPO Preference Alignment
│   ├── quantize_1.7b.py  # Phase 6: GGUF Quantization
│   ├── phase2_optionc_common.py  # Core reasoning & scoring engine (1,355 lines)
│   ├── run_cpt_4b.py  # 4B model variant pipeline
│   ├── run_sft1_4b.py  # 4B SFT Stage 1
│   ├── run_sft2_4b.py  # 4B SFT Stage 2
│   ├── run_orpo_4b.py  # 4B ORPO
│   ├── quantize_4b.py  # 4B Quantization
│   ├── run_tapt_1.7b.py  # Task-Adaptive Pre-Training
│   ├── build_local_benchmark_1p7b.py  # Benchmark construction
│   ├── build_local_corrections_1p7b.py  # Correction dataset builder
│   ├── build_local_orpo_pairs_1p7b.py  # ORPO pair generator
│   ├── audit_signatures.py  # Model signature auditing
│   └── checkpoints/  # Saved model checkpoints
│
├── ship/maritime_pipeline/  # Data Engineering Battalion
│   ├── scrapers/  # 43 custom web scrapers
│   │   ├── imo_scraper.py  # International Maritime Organization
│   │   ├── maib_scraper.py  # UK Marine Accident Investigation Branch
│   │   ├── emsa_scraper.py  # European Maritime Safety Agency
│   │   ├── ntsb_scraper.py  # US National Transportation Safety Board
│   │   ├── dnv_scraper.py  # Det Norske Veritas classification society
│   │   ├── classnk_scraper.py  # Nippon Kaiji Kyokai
│   │   ├── lloyds_register_scraper.py  # Lloyd's Register
│   │   ├── abs_scraper.py  # American Bureau of Shipping
│   │   ├── bimco_scraper.py  # Baltic & Intl Maritime Council
│   │   ├── gard_scraper.py  # Gard P&I Club
│   │   ├── safety4sea_scraper.py  # Safety4Sea portal
│   │   ├── marineinsight_scraper.py  # Marine Insight
│   │   └── ... (43 total)  # + 30 more specialized scrapers
│   ├── chunking/  # Intelligent document chunking
│   ├── extraction/  # PDF & HTML text extraction
│   │   └── pdf_extractor.py
│   ├── filtering/  # IFD-based quality filtering
│   │   └── quality_filter.py
│   ├── dedup/  # MinHash deduplication
│   │   └── minhash_dedup.py
│   ├── config.py  # Pipeline configuration
│   ├── db.py  # Pipeline progress database
│   └── data/final/  # Gold Standard outputs
│       ├── cpt_corpus.jsonl  # 34,988 records (~72M tokens)
│       ├── general_replay.jsonl  # 4,772 records (anti-forgetting)
│       ├── sft_curated.jsonl  # Curated SFT training data
│       ├── sft_curated_traps.jsonl  # Adversarial safety traps
│       ├── orpo_pairs_1.7b.jsonl  # ORPO preference pairs
│       ├── eval_set.jsonl  # Held-out evaluation set
│       ├── cpt_val_maritime.jsonl  # 1,288 validation records
│       └── cpt_val_general.jsonl  # 98 general validation records
│
├── scripts/  # Generation & Orchestration
│   ├── comprehensive_maritime_generator.py  # 500K multi-provider generation (854 lines)
│   ├── generate_wave1.py  # Wave 1 teacher distillation
│   ├── filter_wave1.py  # IFD-based SuperFiltering
│   ├── syllabus_generator.py  # A-Z domain syllabus generator
│   ├── syllabus_plan.py  # Master syllabus planning
│   ├── orchestrated_60k_generator.py  # 60K batch orchestrator
│   ├── quality_audit.py  # Automated quality auditing
│   ├── coverage_dashboard.py  # Domain coverage tracking
│   └── validate_teacher.py  # Teacher model validation
│
├── frontend/  # React Native Mobile Application
│   ├── app/  # Expo Router pages
│   │   ├── (tabs)/index.tsx  # Home screen: thread list
│   │   ├── (tabs)/new.tsx  # New conversation
│   │   ├── (tabs)/settings.tsx  # App settings
│   │   └── chat/[threadId].tsx  # Chat conversation screen
│   ├── components/  # UI Components
│   │   ├── MessageBubble.tsx  # Chat message rendering
│   │   ├── ThinkingBlock.tsx  # <think> reasoning display
│   │   ├── ThinkingGlow.tsx  # Animated thinking indicator
│   │   ├── InputTray.tsx  # Message input with voice
│   │   ├── SafetyAlert.tsx  # Critical safety warnings
│   │   ├── MarkdownRenderer.tsx  # Rich text rendering
│   │   ├── InitialSetupScreen.tsx  # Model download & setup
│   │   ├── ModelLoadingScreen.tsx  # GGUF loading progress
│   │   └── QuickActionChips.tsx  # Quick action shortcuts
│   ├── services/  # Core Services
│   │   ├── modelBridge.ts  # LLM inference bridge (757 lines)
│   │   ├── ModelProvisioner.ts  # Bulletproof model download (503 lines)
│   │   ├── inferencePolicy.ts  # Turn routing & mode control
│   │   ├── responseProfiles.ts  # Deterministic response paths
│   │   ├── VoiceService.ts  # Whisper STT integration
│   │   ├── PerformanceMonitor.ts  # Thermal & OOM monitoring
│   │   ├── BackgroundDownloadManager.ts  # Background download tracking
│   │   └── Logger.ts  # Structured logging
│   ├── stores/  # State Management (Zustand)
│   │   ├── chatStore.ts  # Chat & streaming state
│   │   ├── threadStore.ts  # Thread list management
│   │   └── appStore.ts  # Global app state
│   ├── database/  # Offline Persistence
│   │   ├── schema.ts  # SQLite + FTS5 schema
│   │   └── operations.ts  # CRUD operations
│   ├── constants/  # Configuration
│   │   ├── model.ts  # Model paths, prompts, params
│   │   ├── theme.ts  # Maritime design system
│   │   └── fonts.ts  # Typography configuration
│   └── providers/
│       └── ThemeProvider.tsx  # Dark/light mode provider
│
├── deploy/  # Deployment Artifacts
│   ├── maritime-1.7b-local-q4km.gguf  # Production model (1.03 GB)
│   ├── maritime-1.7b-local-q5km.gguf  # High-quality variant (1.17 GB)
│   └── maritime-1.7b-local-f16.gguf  # Full precision reference (3.2 GB)
│
├── configs/  # Training configurations
├── TRAINING-PLAN.md  # 722-line research-grounded plan
├── ULTIMATE_MARITIME_AI_PLAN.md  # 2,361-line master execution plan
└── MARITIME_AI_TECHNICAL_HANDOFF.md  # Technical handoff specification
Months of effort. 43 custom scrapers. 72M+ tokens of curated maritime knowledge. This is the foundation everything else is built on.
We did not use a single off-the-shelf dataset. Every piece of training data was collected, extracted, validated, chunked, and filtered by our own pipeline. This was a deliberate decision: maritime safety data must be traceable to authoritative sources.
43 custom web scrapers (Maritime Data Collection Battalion):

| Category | Source | Description |
|---|---|---|
| Regulatory bodies | IMO | International Maritime Organization |
| Regulatory bodies | EMSA | European Maritime Safety Agency |
| Regulatory bodies | MCA | UK Maritime & Coastguard Agency |
| Regulatory bodies | Paris MOU | Port State Control memorandum |
| Accident investigation | MAIB | UK Marine Accident Investigation Branch |
| Accident investigation | NTSB | US National Transportation Safety Board |
| Accident investigation | BSU | German Federal Bureau of Maritime Casualty |
| Accident investigation | NSIA | Norwegian Safety Investigation Authority |
| Accident investigation | Dutch Safety | Dutch Safety Board maritime reports |
| Accident investigation | CHIRP | Confidential Hazardous Incident Reports |
| Classification societies | DNV | Det Norske Veritas |
| Classification societies | ClassNK | Nippon Kaiji Kyokai |
| Classification societies | Lloyd's | Lloyd's Register of Shipping |
| Classification societies | ABS | American Bureau of Shipping |
| Classification societies | IACS | Intl Association of Classification Societies |
| P&I clubs & insurers | Gard | Gard P&I insurance |
| P&I clubs & insurers | Skuld | Skuld mutual insurance |
| P&I clubs & insurers | Standard Club | Standard Club P&I |
| P&I clubs & insurers | NE P&I | North of England P&I |
| P&I clubs & insurers | Steamship | Steamship Mutual |
| P&I clubs & insurers | UK P&I | UK P&I Club |
| P&I clubs & insurers | ITOPF | Intl Tanker Owners Pollution Federation |
| Industry organizations | BIMCO | Baltic & International Maritime Council |
| Industry organizations | Hellenic | Hellenic Shipping News |
| Industry organizations | Safety4Sea | Safety4Sea intelligence platform |
| Industry organizations | Marine Insight | Marine Insight technical articles |
| Industry organizations | Maritime Exec | Maritime Executive news |
| Industry organizations | gCaptain | gCaptain maritime news |
| Industry organizations | Splash247 | Splash maritime news |
| Academic / research | OpenAlex (×3) | Open academic graph, maritime papers |

Plus specialized scrapers for bunkering, COW, ESE, FWG, and gauging.

Total: 43 scrapers → 72M+ tokens of curated maritime data.
Data processing flow:

- Raw web pages / PDFs
- → 43 scrapers (parallel)
- → PDF extractor (pdf_extractor)
- → Quality filter (quality_filter)
- → Intelligent chunking (512-2048 tokens)
- → MinHash dedup (minhash_dedup)
- → 72M+ tokens in chunks.jsonl (gold standard)
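The MinHash deduplication stage can be illustrated with a minimal, standard-library-only sketch. This is not the contents of minhash_dedup.py: the shingle size, number of hash functions, similarity threshold, and all function names below are assumptions chosen for illustration, and the pairwise comparison is quadratic, whereas a production pipeline would typically add LSH banding.

```python
import hashlib
import re

NUM_PERM = 64      # number of hash functions (assumed value)
SHINGLE_SIZE = 5   # words per shingle (assumed value)
THRESHOLD = 0.8    # estimated Jaccard similarity treated as a duplicate (assumed value)


def shingles(text: str, size: int = SHINGLE_SIZE) -> set[str]:
    """Split text into overlapping word n-grams."""
    words = re.findall(r"\w+", text.lower())
    if len(words) <= size:
        return {" ".join(words)}
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def minhash_signature(text: str, num_perm: int = NUM_PERM) -> list[int]:
    """For each salted hash function, keep the minimum hash over all shingles."""
    grams = shingles(text)
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(4, "little")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(g.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "little",
            )
            for g in grams
        ))
    return signature


def estimated_jaccard(sig_a: list[int], sig_b: list[int]) -> float:
    """Fraction of matching signature slots approximates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def deduplicate(chunks: list[str], threshold: float = THRESHOLD) -> list[str]:
    """Keep the first chunk of every near-duplicate group, preserving order."""
    kept, signatures = [], []
    for chunk in chunks:
        sig = minhash_signature(chunk)
        if any(estimated_jaccard(sig, s) >= threshold for s in signatures):
            continue
        kept.append(chunk)
        signatures.append(sig)
    return kept
```

Identical chunks always collide on every signature slot, so exact duplicates are removed regardless of the threshold; the threshold only matters for near-duplicates.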
We used a Qwen3-235B-A22B (142GB Q4_K_M) teacher model running across 4 concurrent llama-server instances to distill knowledge into structured Q&A pairs:
Multi-provider distillation engine (comprehensive_maritime_generator.py, 854 lines):

- Teachers: 4 × Qwen3-235B instances on ports :8000, :8001, :8002, :8003
- 5 question angles per chunk:
  - Practical scenario
  - Troubleshooting
  - Procedure / checklist
  - Regulation reference
  - Safety-critical
- External APIs for volume scaling: Gemini 2.5 Flash, Cerebras (LLaMA-8B instant), Groq (LLaMA-8B instant)
- Coverage: 100+ maritime categories
- Distribution: weighted by safety-criticality
- Validation: per-sample JSON schema + forbidden phrase filter
- Output: 500,000+ Q&A pairs
Every phase has a mathematical gate. If it fails, training stops. No exceptions.
This pipeline implements findings from the following research:
| Research | Source | Key Finding |
|---|---|---|
| openPangu Embedded | Huawei, Sep 2025 | Two-stage curriculum SFT (reasoning-first, then concise) outperforms flat mixed training |
| Qwen3 Technical Report | Alibaba, Apr 2025 | Off-policy distillation in /think + /no_think modes, followed by on-policy refinement |
| ORPO | arXiv:2403.07691 | Combines SFT + preference optimization in one objective, eliminating DPO distribution shift |
| SuperFiltering | ACL 2024 | IFD via GPT-2 is consistent with 13B model orderings for data quality filtering |
| Nature Comp. Materials 2025 | Multiple | DAPT+TAPT outperforms DAPT alone by 2-5% |
Goal: Inject maritime domain knowledge into base Qwen3-1.7B without destroying general capabilities.
Phase 1: Continued Pre-Training (run_cpt_1.7b.py, 946 lines)

- Base model: Qwen3-1.7B (fp16, device_map={"": 0})
- LoRA config: r=128, alpha=128, all projection layers
- Optimizer: AdamW, lr=2e-5, cosine schedule
- Batch: micro=1, grad_accum=32 (effective batch=32)
- Sequence length: 512 tokens
- Precision: fp16 only (K80 has no bf16 support)

Curriculum stages:

| Stage | Training progress | Maritime | General |
|---|---|---|---|
| Stage 1 | 0-10% | 50% | 50% |
| Stage 2 | 10-85% | 80% | 20% |
| Stage 3 | 85-100% | 70% | 30% |

The 3-stage curriculum prevents catastrophic forgetting by maintaining general knowledge replay.

Innovation: CurriculumPackedIterableDataset

- Pre-tokenizes entire corpus into uint32 binary arrays
- Uses np.memmap for O(1) RAM: reads tokens from disk
- Dynamic mixing ratio switches mid-training via callback
- Packed sequences (no padding waste)

Gate check:

- Maritime PPL drop ≥ 15% (achieved: 74.5% drop)
- General PPL increase ≤ 10% (achieved: <2% increase)
- If FAIL → training ABORTS
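The curriculum schedule and the Phase 1 gate reduce to a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that a stage boundary (10% or 85% progress) belongs to the later stage, which the plan above does not specify.

```python
def maritime_fraction(progress: float) -> float:
    """Share of maritime data in the mix for a training progress in [0, 1]."""
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be between 0 and 1")
    if progress < 0.10:
        return 0.5   # Stage 1: 50% maritime / 50% general
    if progress < 0.85:
        return 0.8   # Stage 2: 80% maritime / 20% general
    return 0.7       # Stage 3: 70% maritime / 30% general


def cpt_gate(
    maritime_ppl_before: float,
    maritime_ppl_after: float,
    general_ppl_before: float,
    general_ppl_after: float,
) -> bool:
    """Pass only if maritime PPL drops >= 15% and general PPL rises <= 10%."""
    maritime_drop = (maritime_ppl_before - maritime_ppl_after) / maritime_ppl_before
    general_increase = (general_ppl_after - general_ppl_before) / general_ppl_before
    return maritime_drop >= 0.15 and general_increase <= 0.10
```

With made-up perplexities, `cpt_gate(100.0, 25.5, 20.0, 20.3)` corresponds to a 74.5% maritime drop and a 1.5% general increase, so the gate passes.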
Goal: Teach the model to produce DeepSeek-style <think> reasoning traces before answering.
Phase 2: SFT Stage 1, Reasoning (run_sft1_1.7b.py)

- Input: CPT checkpoint (LoRA merged into base weights)
- Data: /think examples from sft_curated.jsonl
- New LoRA: r=32, alpha=32 (lighter than CPT)
- LR: 2e-4 (higher for SFT), NEFTune noise alpha=5

Training format (Qwen3 chat template):

    <|im_start|>system
    You are an expert maritime assistant... /think
    <|im_end|>
    <|im_start|>user
    A crew member has collapsed in the sewage tank...
    <|im_end|>
    <|im_start|>assistant
    <think>
    This is a life-critical emergency. Sewage tanks are
    high-risk enclosed spaces with H2S and methane...
    </think>
    DO NOT enter immediately. Follow this sequence:
    1. Raise alarm, call master/chief engineer...
    <|im_end|>

Gate check:

- 50 unseen questions evaluated
- ≥ 70% must produce a <think> block with > 20 words
- If FAIL → pipeline ABORTS
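The Phase 2 gate can be expressed as a small check over generated responses. This is a minimal sketch with hypothetical function names; it assumes "words" means whitespace-separated tokens inside the first <think> block, which the gate description does not define.

```python
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def has_valid_think_block(response: str, min_words: int = 20) -> bool:
    """True if the response has a <think> block with more than min_words words."""
    match = THINK_RE.search(response)
    return match is not None and len(match.group(1).split()) > min_words


def sft1_gate(responses: list[str], min_pass_rate: float = 0.70) -> bool:
    """Pass if at least min_pass_rate of responses contain a valid <think> block."""
    if not responses:
        return False
    passed = sum(has_valid_think_block(r) for r in responses)
    return passed / len(responses) >= min_pass_rate
```

In a pipeline script, a `False` result would be the point at which the run exits with a non-zero status instead of moving on to Phase 3.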
Goal: Teach concise direct responses AND adversarial safety refusals.
Phase 3: SFT Stage 2, Direct + Traps (run_sft2_1.7b.py)

- Input: SFT1 checkpoint (merged)
- Data sources:
  - /no_think examples (factual, regulatory, safety)
  - Safety trap examples (sft_curated_traps.jsonl)
  - Synthetic ThinkFollow pairs (auto-generated)

ThinkFollow synthesis logic:

    Input:       "How to perform enclosed space entry?"
    Full answer: "1. Ventilate for 30min. 2. Test O2..."
    Synthesized: "Just give me the most critical step for: How to perform enclosed space entry?"
    Answer:      "Ventilate for 30min."

This forces conciseness AFTER reasoning is learned.

Gate check:

- 50 adversarial trap questions
- ≥ 60% must refuse with exact: "I don't have sufficient information about this specific topic."
- If FAIL → pipeline ABORTS
Phase 4: On-Policy Correction (student generates → teacher scores → correction training)

The student model answers questions from its own distribution. The 235B teacher grades each answer. Failures are corrected. This closes the gap between training data and real inference.
Phase 5: ORPO Preference Alignment (run_orpo_1.7b.py)

Config: beta=0.1, lr=8e-6, 1 epoch, batch=1, grad_accum=8

Synthetic error vectors (R1-R4):

| Vector | Corruption | Effect |
|---|---|---|
| R1 (Regulatory) | "shall" → "should" | Makes mandatory requirements sound optional |
| R2 (Safety) | Remove first critical step | Deletes the most important action in a procedure |
| R3 (Units) | "kPa" → "bar" | Introduces unit conversion errors in calculations |
| R4 (Completeness) | Truncate procedural answers | Removes the final verification/reporting steps |

By penalizing these exact semantic shifts, the model learns superhuman precision on regulatory language, safety steps, and unit accuracy: the things that matter most at sea.
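The four error vectors amount to deterministic corruptions of a correct answer, each producing a "rejected" sample for a preference pair. The sketch below is an illustration with hypothetical function names, not the logic of build_local_orpo_pairs_1p7b.py. It assumes answers are written as one numbered step per line and, as a further assumption, renumbers the remaining steps after a deletion.

```python
import re

STEP_PREFIX = re.compile(r"^\s*\d+\.\s*")


def split_steps(answer: str) -> list[str]:
    """Split a numbered procedure into step texts, one per non-empty line."""
    return [STEP_PREFIX.sub("", line).strip() for line in answer.splitlines() if line.strip()]


def number_steps(steps: list[str]) -> str:
    """Re-assemble step texts into a numbered procedure."""
    return "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))


def r1_regulatory(answer: str) -> str:
    """R1: weaken mandatory language ("shall" -> "should")."""
    return re.sub(r"\bshall\b", "should", answer)


def r2_safety(answer: str) -> str:
    """R2: drop the first step of a procedure."""
    steps = split_steps(answer)
    return number_steps(steps[1:]) if len(steps) > 1 else answer


def r3_units(answer: str) -> str:
    """R3: swap the unit label without converting the value."""
    return re.sub(r"\bkPa\b", "bar", answer)


def r4_completeness(answer: str) -> str:
    """R4: truncate the final step of a procedure."""
    steps = split_steps(answer)
    return number_steps(steps[:-1]) if len(steps) > 1 else answer


CORRUPTIONS = {"R1": r1_regulatory, "R2": r2_safety, "R3": r3_units, "R4": r4_completeness}


def make_orpo_pairs(prompt: str, chosen: str) -> list[dict]:
    """Emit one preference pair per corruption that actually changes the answer."""
    pairs = []
    for name, corrupt in CORRUPTIONS.items():
        rejected = corrupt(chosen)
        if rejected != chosen:
            pairs.append({"prompt": prompt, "chosen": chosen,
                          "rejected": rejected, "error_vector": name})
    return pairs
```

Corruptions that leave the answer unchanged (for example R3 on an answer with no kPa value) are skipped, since a pair whose chosen and rejected texts are identical carries no preference signal.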
Phase 6: Quantization (quantize_1.7b.py)

LoRA merge → FP16 → llama.cpp convert → GGUF

Output variants:

| Variant | Size | Use |
|---|---|---|
| Q4_K_M | 1.03 GB | Production (mobile) |
| Q5_K_M | 1.17 GB | High-quality fallback |
| F16 | 3.21 GB | Reference / benchmarking |

- Deployed to: huggingface.co/mohanganesh3/maritime_model_v1
- Includes: model.gguf + whisper-tiny.bin (voice engine)
The Tesla K80 (Kepler architecture, compute capability 3.7) lacks bf16 support and Flash Attention. We engineered around every limitation:
| Challenge | Solution | File |
|---|---|---|
| No bf16 support | Strict fp16=True, bf16=False across all phases | All run_*.py |
| 11GB VRAM limit | QLoRA r=128 (CPT) / r=32 (SFT) + gradient accumulation | run_cpt_1.7b.py |
| OOM on data loading | np.memmap uint32 binary cache (zero-copy reads) | run_cpt_1.7b.py:243 |
| Checkpoint resume crash | _sanitize_checkpoint_for_transformers_resume() strips corrupt RNG states | run_cpt_1.7b.py:148 |
| Transformers/TRL mismatch | PatchedORPOTrainer wraps ORPOTrainer for v4.51.3/v0.11.0 compat | phase2_optionc_common.py |
| PyTorch uint32 missing | Custom torch.uint32 polyfill shim | Training environment |
| Dual venv isolation | .venv/ (generation) vs .venv-train/ (training) with ensure_venv_train() gate | All run_*.py |
The model runs entirely on-device. No server. No API calls. The phone IS the inference engine.
React Native inference architecture (frontend/services/modelBridge.ts, 757 lines). Each user message passes through four stages:

- Zero-shot LLM router (n_predict: 160, temperature: 0.08). The model self-classifies into:
  - Domain: engine-room, bridge-nav, compliance, safety
  - Risk level: critical, standard, low
  - Response: checklist, explain, converse
- Context pruning engine, trimMessagesToContext():
  - while (tokenCount > MAX_PROMPT_TOKENS): llamaContext.tokenize(messages), drop oldest turn
  - Guarantees: NEVER exceeds context window
  - Uses C++ tokenizer for exact count
- Streaming inference + tag parser, llama.rn (C++ mmap → ARM NEON). The onToken callback:
  - tagBuffer accumulates chunks
  - Scans for <think> / </think>
  - Routes reasoning → ThinkingBlock
  - Routes response → MessageBubble
  - Tracks thinkTime duration
- Persistence layer:
  - SQLite + FTS5 (full-text search)
  - All conversations stored offline
  - Instant message search across all threads
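The subtle part of the streaming tag parser is that a tag such as </think> can arrive split across two token chunks. The sketch below shows one way to handle that, written in Python for consistency with the rest of this README rather than in the app's TypeScript. It is not the code in modelBridge.ts; the class and channel names are hypothetical.

```python
OPEN_TAG, CLOSE_TAG = "<think>", "</think>"


class ThinkTagParser:
    """Route streamed text to a 'reasoning' or 'response' channel."""

    def __init__(self) -> None:
        self.buffer = ""       # text not yet safe to emit
        self.in_think = False  # True while inside <think> ... </think>

    def _emit(self, events: list, text: str) -> None:
        if text:
            events.append(("reasoning" if self.in_think else "response", text))

    def feed(self, chunk: str) -> list[tuple[str, str]]:
        """Consume one streamed chunk and return (channel, text) events."""
        self.buffer += chunk
        events: list[tuple[str, str]] = []
        while True:
            tag = CLOSE_TAG if self.in_think else OPEN_TAG
            index = self.buffer.find(tag)
            if index == -1:
                break
            self._emit(events, self.buffer[:index])   # text before the tag
            self.buffer = self.buffer[index + len(tag):]
            self.in_think = not self.in_think
        # Hold back a trailing fragment that could be the start of the next tag.
        tag = CLOSE_TAG if self.in_think else OPEN_TAG
        keep = 0
        for n in range(min(len(tag) - 1, len(self.buffer)), 0, -1):
            if tag.startswith(self.buffer[-n:]):
                keep = n
                break
        cut = len(self.buffer) - keep
        self._emit(events, self.buffer[:cut])
        self.buffer = self.buffer[cut:]
        return events

    def flush(self) -> list[tuple[str, str]]:
        """Emit whatever is left when the stream ends."""
        events: list[tuple[str, str]] = []
        self._emit(events, self.buffer)
        self.buffer = ""
        return events
```

Holding back at most `len(tag) - 1` characters means visible text is delayed by a few characters at worst, while a partial tag never leaks into the rendered message.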
Deploying a 1.03 GB model to mobile devices over unreliable maritime connectivity required a custom download engine:
Bulletproof download protocol:

- Expected size: EXACTLY 1,107,409,280 bytes (±256 slack)
- HTTP 206 (Partial Content): resume from existing bytes
- HTTP 200 (Range Ignored): the server ignored the resume request
  - DELETE the corrupt appended file
  - Restart from byte 0
- HTTP 416 (Range Not Satisfiable): check whether the file is already complete
  - If size matches → mark done
  - If size wrong → delete & retry
- Oversized file detected: DELETE corrupt file, restart fresh
- The .maritime_done marker is written ONLY after byte-exact verification passes
- Fallback URLs:
  1. huggingface.co (primary)
  2. hf-mirror.com (China fallback)
- Max retries: 8 per artifact
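The status-code handling above is essentially a decision table, sketched here in Python for consistency with the rest of this README (the real engine lives in ModelProvisioner.ts). The function and action names are hypothetical, and the reading of "±256 slack" as an absolute tolerance on the final file size is an assumption.

```python
EXPECTED_BYTES = 1_107_409_280
SLACK_BYTES = 256
MAX_RETRIES = 8


def size_ok(size: int) -> bool:
    """True if the file size is within the allowed slack of the expected size."""
    return abs(size - EXPECTED_BYTES) <= SLACK_BYTES


def next_action(status: int, local_bytes: int) -> str:
    """Decide what to do given the HTTP status and bytes already on disk."""
    if local_bytes > EXPECTED_BYTES + SLACK_BYTES:
        return "restart_from_zero"   # oversized file: delete and start fresh
    if status == 206:
        return "resume"              # server honoured the Range header
    if status == 200:
        return "restart_from_zero"   # Range ignored: appending would corrupt the file
    if status == 416:
        return "mark_done" if size_ok(local_bytes) else "delete_and_retry"
    return "retry"                   # any other status: retry, up to MAX_RETRIES


def may_write_done_marker(final_bytes: int) -> bool:
    """The done marker is only written after the size check passes."""
    return size_ok(final_bytes)
```

Checking for an oversized file before looking at the status code means a corrupt local file is never resumed, even if the server would happily answer 206.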
React Native component map

App Layout (_layout.tsx)
├── ThemeProvider (dark mode default)
├── Tab Navigator
│   ├── Home Tab (index.tsx)
│   │   ├── ThreadListItem (pinned, sorted)
│   │   └── QuickStartTile (common actions)
│   ├── New Chat Tab (new.tsx)
│   │   ├── QuickActionChips (pre-built prompts)
│   │   └── InputTray (text + voice input)
│   └── Settings Tab (settings.tsx)
│       ├── Model info display
│       └── Theme toggle
└── Chat Screen (chat/[threadId].tsx)
    ├── ScreenHeader (thread title, back nav)
    ├── MessageBubble (user + assistant)
    │   ├── MarkdownRenderer (rich formatting)
    │   ├── ThinkingBlock (<think> content)
    │   └── ThinkingGlow (animated indicator)
    ├── SafetyAlert (critical warning banner)
    ├── TypingIndicator (streaming dots)
    └── InputTray
        ├── Text input
        ├── Voice button (Whisper STT)
        └── Think mode toggle (/think vs /no_think)

Services Layer
├── modelBridge.ts (LLM inference, 757 lines)
├── ModelProvisioner.ts (download engine, 503 lines)
├── VoiceService.ts (Whisper STT, 269 lines)
├── PerformanceMonitor.ts (thermal/OOM guard)
└── inferencePolicy.ts (routing rules)

State Management (Zustand)
├── chatStore.ts (messages, streaming, thinking)
├── threadStore.ts (thread CRUD, pin, search)
└── appStore.ts (model status, global config)
The production model is deployed at huggingface.co/mohanganesh3/maritime_model_v1 with the following artifacts:
| Artifact | Size | Purpose |
|---|---|---|
| model.gguf | 1.03 GB | Q4_K_M quantized, production mobile inference |
| whisper-tiny.bin | 74 MB | Whisper tiny, voice-to-text in noisy engine rooms |
Three quantization variants are available in deploy/:
| File | Size | Use Case |
|---|---|---|
| maritime-1.7b-local-q4km.gguf | 1.03 GB | Mobile devices (4-8 GB RAM) |
| maritime-1.7b-local-q5km.gguf | 1.17 GB | Tablets / higher accuracy |
| maritime-1.7b-local-f16.gguf | 3.21 GB | Full precision benchmarking |
| Layer | Technology | Purpose |
|---|---|---|
| Base Model | Qwen3-1.7B | Student model backbone |
| Teacher Model | Qwen3-235B-A22B (142GB) | Knowledge distillation source |
| Training Framework | PyTorch 2.1.2 + CUDA 11.8 | GPU training |
| Fine-Tuning | PEFT (QLoRA) + TRL (ORPO) | Parameter-efficient training |
| Serving (Training) | llama.cpp / llama-server | Teacher model inference |
| Quantization | llama.cpp GGUF converter | FP16 → Q4_K_M / Q5_K_M |
| Data Pipeline | Custom Python (43 scrapers) | Web scraping, extraction, filtering |
| Data Quality | MinHash dedup + IFD filter | Deduplication and quality scoring |
| Frontend | React Native + Expo SDK 51 | Cross-platform mobile app |
| Mobile Inference | llama.rn (C++ bindings) | On-device GGUF inference |
| Voice Engine | Whisper Tiny (77MB) | Speech-to-text for maritime use |
| Local Storage | expo-sqlite + FTS5 | Offline conversation persistence |
| State Management | Zustand | Lightweight reactive state |
| Model Hosting | HuggingFace Hub | Model distribution |
| Generation APIs | Gemini 2.5 / Cerebras / Groq | Supplementary data generation |
| Document | Lines | Purpose |
|---|---|---|
| TRAINING-PLAN.md | 722 | Research-grounded training lifecycle with citations |
| ULTIMATE_MARITIME_AI_PLAN.md | 2,361 | Master execution plan with every task, script, and gate |
| MARITIME_AI_TECHNICAL_HANDOFF.md | 112 | Technical specification for deployment integration |
- Python 3.10+ with CUDA support
- Node.js 18+ and npm
- Android SDK (for mobile build)
- 4+ GB RAM device (for inference)
# Activate training environment
source .venv-train/bin/activate
# Phase 1: Continued Pre-Training
CUDA_VISIBLE_DEVICES=0 python training/run_cpt_1.7b.py
# Phase 2: SFT Stage 1 (Reasoning)
CUDA_VISIBLE_DEVICES=0 python training/run_sft1_1.7b.py
# Phase 3: SFT Stage 2 (Direct + Safety)
CUDA_VISIBLE_DEVICES=0 python training/run_sft2_1.7b.py
# Phase 5: ORPO Alignment
CUDA_VISIBLE_DEVICES=0 python training/run_orpo_1.7b.py
# Phase 6: Quantize to GGUF
python training/quantize_1.7b.py

cd frontend
# Install dependencies
npm install
# Start development server
npx expo start
# Build Android APK
npx expo run:android

# Activate generation environment
source .venv/bin/activate
# Run the 500K multi-provider generator
python scripts/comprehensive_maritime_generator.py
# Run quality audit
python scripts/quality_audit.py

- openPangu Embedded (Huawei, Sep 2025): Curriculum SFT for billion-parameter models
- Qwen3 Technical Report (Alibaba, Apr 2025): Off-policy + on-policy distillation recipe
- ORPO: Monolithic Preference Optimization (arXiv:2403.07691): Combined SFT + preference alignment
- SuperFiltering (ACL 2024): IFD-based data quality scoring
- DAPT+TAPT (Nature Computational Materials, 2025): Domain + task adaptive pre-training
Built with months of research, hundreds of papers, and zero compromises.
Because at sea, there is no second chance.