diff --git a/AI/NVIDIA Blueprints/NVIDIA AIQ/backstage/catalog-info.yaml b/AI/NVIDIA Blueprints/NVIDIA AIQ/backstage/catalog-info.yaml new file mode 100644 index 00000000..f8969035 --- /dev/null +++ b/AI/NVIDIA Blueprints/NVIDIA AIQ/backstage/catalog-info.yaml @@ -0,0 +1,50 @@ +apiVersion: backstage.io/v1alpha1 +kind: Component +metadata: + name: nvidia-aiq-blueprint + title: How to deploy NVIDIA AIQ on IBM Fusion HCI + description: | + Deploy NVIDIA AI-Q, a research assistant blueprint that helps users extract insights from documents using generative AI. + AI-Q combines RAG pipelines for document-based Q&A, GPU-accelerated LLM inference, multi-stage reasoning and validation, and a simple UI for running research workflows. + + Unlike a simple chatbot, AI-Q represents a full AI workflow, making it ideal for validating enterprise AI platforms like IBM Fusion HCI. The platform provides predictable GPU scheduling and utilization using Red Hat OpenShift with the NVIDIA GPU Operator, secure controlled deployment within enterprise infrastructure, and a unified platform for multiple AI blueprints on the same Red Hat OpenShift-based environment. + tags: + - nvidia + - aiq + - ai + - blueprint + - fusion + - gpu + - llm + - rag + - research-assistant + annotations: + backstage.io/techdocs-ref: dir:. 
+ github.com/project-slug: IBM/storage-fusion + + links: + - url: https://community.ibm.com/community/user/blogs/namita-singroha/2026/02/15/unlocking-ai-powered-video-analytics-on-ibm-fusion + title: Read on IBM Tech Exchange + icon: article + - url: https://ibm.github.io/storage-fusion/fusion-ai/resources/ + title: View on Fusion Tech Community + icon: web + - url: https://github.com/IBM/storage-fusion/blob/master/AI/NVIDIA%20Blueprints/NVIDIA%20AIQ/Fusion_NVIDIA_AIQ_Guide.md + title: Complete deployment guide on GitHub + icon: docs + - url: https://build.nvidia.com/nvidia/aiq + title: NVIDIA AI-Q Blueprint + icon: launch +spec: + type: blueprint + lifecycle: production + owner: fusion-team + system: fusion-ai-platform + + providesApis: [] + consumesApis: [] + + dependsOn: + - resource:default/nvidia-gpu + - resource:default/fusion-storage + - component:default/nvidia-rag-blueprint diff --git a/AI/NVIDIA Blueprints/NVIDIA AIQ/backstage/docs/index.md b/AI/NVIDIA Blueprints/NVIDIA AIQ/backstage/docs/index.md new file mode 100644 index 00000000..96a4fd3d --- /dev/null +++ b/AI/NVIDIA Blueprints/NVIDIA AIQ/backstage/docs/index.md @@ -0,0 +1,340 @@ +# Deploying NVIDIA AI-Q on IBM Fusion HCI + +## Running NVIDIA AI Enterprise Blueprints on a Production-Ready OpenShift Platform + +Generative AI blueprints are increasingly delivered as Kubernetes-native applications. NVIDIA AI-Q is one such blueprint designed to help teams build deep research and document-driven AI workflows using Retrieval-Augmented Generation (RAG) and GPU-accelerated inference. + +However, deploying an AI blueprint is only part of the story. For developers and SREs, the real challenge is running these applications _reliably_ on an enterprise-grade platform: a platform supporting standardized deployment, GPU-enabled infrastructure, and operational best practices. 
+ +In this blog, we walk through how to deploy and run the NVIDIA AI-Q blueprint on IBM Fusion HCI, using standard OpenShift and Helm-based workflows. The goal is not just to deploy AI-Q, but to _demonstrate_ how IBM Fusion HCI serves as a foundation for enabling NVIDIA AI Enterprise (NVAIE) blueprints as part of a broader enterprise AI platform strategy. + +This article is intended for developers, platform engineers, and SREs who want to accomplish the following: + +- Deploy the NVIDIA AI-Q blueprint on IBM Fusion HCI +- Understand how AI-Q fits into a Red Hat OpenShift-based AI platform +- Use standard Red Hat OpenShift namespaces and Helm workflows on IBM Fusion HCI +- Apply best practices for operating GPU-enabled AI workloads in production + +--- + +## What Is NVIDIA AI-Q? + +NVIDIA AI-Q is a research assistant blueprint that helps users extract insights from documents using generative AI. It combines: + +- RAG pipelines for document-based Q&A +- GPU-accelerated LLM inference +- Multi-stage reasoning and validation +- A simple UI for running research workflows + +Unlike a simple chatbot, AI-Q represents a full AI workflow, making it ideal for validating enterprise AI platforms like IBM Fusion HCI. + +--- + +## Why IBM Fusion HCI? + +IBM Fusion HCI is a Kubernetes-native platform built on Red Hat OpenShift, designed to run stateful and GPU-accelerated workloads in an enterprise environment. It provides a consistent operational foundation for deploying and managing AI applications using standard Kubernetes and Red Hat OpenShift constructs. 
+ +For AI workloads such as NVIDIA AI-Q, Fusion HCI offers: + +- Predictable GPU scheduling and utilization using Red Hat OpenShift in combination with the NVIDIA GPU Operator +- Secure, controlled deployment within enterprise infrastructure +- A unified platform for multiple AI blueprints on the same Red Hat OpenShift-based environment + +--- + +## Prerequisites + +Before deploying NVIDIA AI‑Q, ensure the following conditions exist: + +- IBM Fusion HCI cluster installed and running. + +- GPU-enabled Red Hat OpenShift worker nodes (Fusion HCI automatically installs and configures the NVIDIA GPU Operator for GPU workloads). + +Note: We used NVIDIA L40 GPUs for this AIQ deployment. Exact requirements vary by GPU model and workload. Refer to NVIDIA’s documentation for specific GPU and memory recommendations. + +- Persistent storage via IBM Fusion Data Foundation or another storage provider. + +- NVIDIA RAG Blueprint deployed (required by AI‑Q). + +- CLI tools: oc and Helm v3.19.4 installed and configured. + +Note: Helm v3.19.4 is the validated version for NVIDIA AI‑Q. + +💡 Tip: To check how many GPUs are available on a node, describe the node and look at the allocatable GPU resources: +``` +oc describe node | grep -E "Capacity|Allocatable|nvidia.com/gpu" +``` + +You will see output like `nvidia.com/gpu: 4`, which indicates how many GPUs the node can schedule for workloads. 
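The per-node GPU check in the tip can be rolled up into a cluster-wide total. The following is a minimal sketch, not part of the original guide: the helper name `total_allocatable_gpus` is hypothetical, and the `custom-columns` expression in the comment assumes the GPU resource is exposed as `nvidia.com/gpu` under each node's allocatable resources.

```shell
# Sketch: sum allocatable GPUs across all nodes.
# `total_allocatable_gpus` is a hypothetical helper name, not an oc or Fusion HCI command.
# It reads a two-column table on stdin (header row, then "<node> <gpu-count>"),
# for example the output of:
#   oc get nodes -o custom-columns='NODE:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
total_allocatable_gpus() {
  # Skip the header row; ignore nodes without GPUs, which are listed as "<none>".
  awk 'NR > 1 && $2 ~ /^[0-9]+$/ { total += $2 } END { print total + 0 }'
}
```

Piping the `oc get nodes` command from the comment into `total_allocatable_gpus` prints a single number, which can be compared against the GPU requests planned in `values.yaml`.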
+ +--- + +## Step 1: Generate Required API Keys + +AI-Q requires two external APIs: + +- NVIDIA NGC API Key: pulls containers and model artifacts +- Tavily API Key: for web-based search and enrichment + +Export the keys on your system: + +``` +export NGC_API_KEY="" +export TAVILY_API_KEY="" +``` + +## Step 2: Create a Namespace for AI-Q + +Create a dedicated namespace to isolate AI-Q components from other workloads: + +``` +oc create namespace aiq +``` + +## Step 3: Download the NVIDIA AI-Q Helm Chart + +``` +wget https://helm.ngc.nvidia.com/nvidia/blueprint/charts/aiq-aira-v1.2.0.tgz +tar -xvf aiq-aira-v1.2.0.tgz +cd aiq-aira +``` + +This Helm chart packages all AI-Q components, including UI, backend, and model serving configurations. +After extracting, the aiq-aira directory contains the following files and folders: + +``` +Chart.lock + +Chart.yaml + +charts/ + +files/ + +templates/ + +values.yaml +``` + +## Step 4: Configure the Model in values.yaml + +Select the model you want AI-Q to use. In this deployment, we use the llama-3.2-3b-instruct model. + +To configure the model, update the model name in the values.yaml file located in the aiq-aira directory. The snippet below shows an example configuration using the llama-3.2-3b-instruct model: + +``` +# ------------------------------------------------------------ +# The following values are for the AIQ AIRA backend service. +# ------------------------------------------------------------ + +replicaCount: 1 + +imagePullSecret: + name: "ngc-secret" + registry: "nvcr.io" + username: "$oauthtoken" + password: "" + create: true + +ngcApiSecret: + name: "ngc-api" + password: "" + create: true + +tavilyApiSecret: + name: "tavily-secret" + create: true + password: "" + +# The image repository and tag for the AIQ AIRA backend service. 
+image: + baserepo: nvcr.io + repository: nvcr.io/nvidia/blueprint/aira-backend + tag: v1.2.0 + pullPolicy: Always + +# The service type and port for the main AIQ AIRA backend service +service: + port: 3838 + +backendEnvVars: + # update the model name here + INSTRUCT_MODEL_NAME: "meta-llama/llama-3.2-3b-instruct" + INSTRUCT_MODEL_TEMP: "0.0" + NEMOTRON_MAX_TOKENS: "5000" + INSTRUCT_MAX_TOKENS: "20000" + INSTRUCT_BASE_URL: "http://instruct-llm:8000" + INSTRUCT_API_KEY: "not-needed" + NEMOTRON_MODEL_NAME: "nvidia/llama-3.3-nemotron-super-49b-v1.5" + NEMOTRON_MODEL_TEMP: "0.5" + NEMOTRON_BASE_URL: "http://nim-llm.rag.svc.cluster.local:8000" + AIRA_APPLY_GUARDRAIL: "false" + RAG_SERVER_URL: "http://rag-server.rag.svc.cluster.local:8081" + RAG_INGEST_URL: "http://ingestor-server.rag.svc.cluster.local:8082" + +nim-llm: + enabled: true + service: + name: "instruct-llm" + image: + # update the model name here + repository: nvcr.io/nim/meta/llama-3.2-3b-instruct + pullPolicy: IfNotPresent + tag: "1.10.1" + resources: + limits: + nvidia.com/gpu: 2 + requests: + nvidia.com/gpu: 2 + # Configure NIM Model Profile for optimal performance + env: + - name: NIM_MODEL_PROFILE + value: "" # Empty for automatic selection, or specify tensorrt_llm profile + model: + ngcAPIKey: "" + # update the model name here + name: "meta-llama/llama-3.2-3b-instruct" +``` + +Note: Model tag and GPU requirements were validated using the following NVIDIA documentation: https://docs.nvidia.com/nim/large-language-models/latest/supported-models.html + + +## Step 5: Deploy NVIDIA AI-Q Using Helm + +``` +helm install aiq-aira . \ + --username='$oauthtoken' \ + --password=$NGC_API_KEY \ + --set imagePullSecret.password=$NGC_API_KEY \ + --set ngcApiSecret.password=$NGC_API_KEY \ + --set tavilyApiSecret.password=$TAVILY_API_KEY \ + -n aiq +``` +This deploys all AI-Q components into the aiq namespace. 
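Because the install command depends on both exported API keys and on the validated Helm release, a short pre-flight check can catch mistakes before anything is deployed. This is a sketch under stated assumptions: the function names `check_aiq_keys` and `is_validated_helm` are illustrative and not part of the AI-Q chart, and the version string is assumed to come from `helm version --short`.

```shell
# Sketch of a pre-flight check to run before `helm install`.
# Both function names are hypothetical helpers, not part of Helm or the AI-Q chart.

# Succeeds only if both API keys from Step 1 are exported and non-empty.
check_aiq_keys() {
  missing=0
  if [ -z "${NGC_API_KEY:-}" ]; then
    echo "NGC_API_KEY is not set" >&2
    missing=1
  fi
  if [ -z "${TAVILY_API_KEY:-}" ]; then
    echo "TAVILY_API_KEY is not set" >&2
    missing=1
  fi
  return "$missing"
}

# Succeeds if the version string passed as $1 (for example, the output of
# `helm version --short`) is the v3.19.4 release validated in this guide.
is_validated_helm() {
  case "$1" in
    v3.19.4|v3.19.4+*) return 0 ;;
    *) return 1 ;;
  esac
}
```

One way to use it is `check_aiq_keys && is_validated_helm "$(helm version --short)"` immediately before the `helm install` command, so that a missing key or an unvalidated Helm release stops the run early.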
+ +## Step 6: Verify all the pods in namespace aiq: + +Run the following oc command to get the status of all pods in namespace aiq + +``` +oc get pods -n aiq +``` + +Expected output: +``` +aiq-aira-aira-backend-7cd46449bd-snbsm 1/1 Running 0 3h8m +aiq-aira-aira-frontend-59d9c897f6-c47z9 1/1 Running 0 3h8m +aiq-aira-nim-llm-0 1/1 Running 0 177m +aiq-aira-phoenix-78fd7584b7-ntllt 1/1 Running 0 3h8m +``` + +This confirms that all the pods are running and their containers are ready. +Now we are ready to access the AIQ user interface. + + +## Step 7: Access the AI-Q UI + +To access the AI-Q user interface, first identify the frontend service: + +```oc get svc -n aiq | grep frontend``` + +Example Output: + +``` +aiq-aira-aira-frontend NodePort 3000:30080/TCP +``` + +Make a note of the NodePort value (for example, 30080). +You can now access the AI-Q UI using the cluster node name or IP: + +``` +http://:30080 +``` + +![AI-Q UI Overview](https://cdn-images-1.medium.com/max/1600/1*GbW-Exa_pogBXUhDBFjKgw.png) + + +Clicking the Begin Researching option displays the following page: + +![alt text](https://miro.medium.com/v2/resize:fit:1400/format:webp/1*MmfOzn9meHXI-eL5c7KEqg.png) + +## Step 8: Upload Enterprise Documents + +On the UI: + +- Click New Collection +- Upload the required documents (PDFs, manuals, technical documentation, etc.) +- Wait for the documents to be uploaded and indexed + +![alt text](https://miro.medium.com/v2/resize:fit:1400/format:webp/1*YzDU2J3UOqG3H6yH6Gz4uQ.png) + +Note: Processing time depends on the size of the documents. + +## Step 9: Generate AI-Powered Research Reports + +Once the documents are indexed, AI-Q is ready to generate insights. + +1. Define the Report Topic: Start by defining a report topic. In this example, we used the following: +Example: IBM Fusion HCI deployment configurations + +2. Provide a Report Structure: A simple structure helps AI-Q organize its output. 
For example: + +Give a simple overview of IBM Fusion HCI using the selected documents +Explain: +- What IBM Fusion HCI is +- What it is used for +- Its main components + +3. Select Document Sources: Choose the document collection you want AI-Q to use and click Select Sources: + +![alt text](https://miro.medium.com/v2/resize:fit:1400/format:webp/1*BeKfxn5B5icOZdjriIiM5g.png) + +4. Start the Generation Process : Click Start Generating: + +![alt text](https://miro.medium.com/v2/resize:fit:1400/format:webp/1*XDdx6SdG8s5ZdaTN0XCuaA.png) + +AI-Q processes your topic and structure, preparing to create the report: + +![alt text](https://miro.medium.com/v2/resize:fit:1400/format:webp/1*QjE2DHlAkwDJRZKJI_xwfA.png) + +5. Execute the Plan: Once the thinking phase completes, click Execute Plan to trigger AI-Q’s full execution pipeline: + +![alt text](https://miro.medium.com/v2/resize:fit:1400/format:webp/1*uwtPD9M7InxQhLcXdMz9DQ.png) + +6. How AI-Q generates the report: +Behind the scenes, AI-Q processes the request through multiple stages: + +- RAG Answer: extracts info from documents +- Relevancy Check: validates content +- Web Answer: supplements info (if enabled) +- Summarize Sources: condenses findings +- Running Summary: structures output +- Reflect on Summary: improves clarity + + +![alt text](https://miro.medium.com/v2/resize:fit:1400/format:webp/1*GwizReSVtTi6qQ0JmskEzg.png) +AI-Q execution pipeline + +7. Download the Final Report: Once all stages complete, AI-Q produces a final, structured research report, which can be downloaded directly from the UI. + +By clicking the Begin Researching option, the following page displays: + +![alt text](https://miro.medium.com/v2/resize:fit:1400/format:webp/1*N_njerUy9k_lQcRuiB1SNQ.png) + +## Use Cases / Benefits + +Running NVIDIA AI-Q on IBM Fusion HCI provides several practical benefits for enterprises: + +1. 
Automated deployment reporting: AI-Q generates structured reports from IBM Fusion HCI documentation, deployment guides, or operational runbooks using RAG pipelines. + +2. Knowledge extraction for SRE and operations teams: Internal manuals, troubleshooting guides, and configuration documents can be indexed and queried to quickly surface relevant information during day-to-day operations. + +## Final Thoughts + +By deploying NVIDIA AI-Q on IBM Fusion HCI, we demonstrated how quickly enterprise AI workloads can be enabled and how IBM Fusion HCI simplifies AI infrastructure operations. + +From RAG pipelines to fine-tuned models and automated AI workflows, IBM Fusion HCI provides a robust foundation for scaling AI initiatives across the enterprise. + + diff --git a/AI/NVIDIA Blueprints/NVIDIA AIQ/backstage/mkdocs.yml b/AI/NVIDIA Blueprints/NVIDIA AIQ/backstage/mkdocs.yml new file mode 100644 index 00000000..7c6e31d1 --- /dev/null +++ b/AI/NVIDIA Blueprints/NVIDIA AIQ/backstage/mkdocs.yml @@ -0,0 +1,32 @@ +site_name: NVIDIA AI-Q Blueprint on IBM Fusion +site_description: Deploy NVIDIA AI-Q research assistant on IBM Fusion HCI for document-driven AI workflows + +# Point to the IBM storage-fusion repo for documentation +repo_url: https://github.com/IBM/storage-fusion +repo_name: IBM/storage-fusion + +# Point to docs directory +docs_dir: docs + +# Point to your actual markdown file +nav: + - Home: index.md + +theme: + name: material + palette: + primary: indigo + accent: indigo + +plugins: + - techdocs-core + + +# Safe markdown extensions +markdown_extensions: + - admonition + - tables + - toc: + permalink: true + +# Made with Bob diff --git a/AI/NVIDIA Blueprints/NVIDIA RAG/backstage/catalog-info.yaml b/AI/NVIDIA Blueprints/NVIDIA RAG/backstage/catalog-info.yaml new file mode 100644 index 
00000000..092d25b0 --- /dev/null +++ b/AI/NVIDIA Blueprints/NVIDIA RAG/backstage/catalog-info.yaml @@ -0,0 +1,47 @@ +apiVersion: backstage.io/v1alpha1 +kind: Component +metadata: + name: nvidia-rag-blueprint + title: Guide to deploy NVIDIA RAG Blueprint on IBM Fusion + description: | + Deploy a Retrieval-Augmented Generation (RAG) blueprint on IBM Fusion HCI. + It enables users to ask questions and get accurate answers grounded in their own documents instead of relying on generic LLM guesses. + + Key features include multi-modal document intelligence, processing complex documents through OCR, PDF layout parsing, and table or chart extraction. Answers are grounded with context and citations via NVIDIA NIM, keeping responses traceable. Infrastructure is converged, with compute, GPU, and storage unified under a single OpenShift-managed system. Performance is production-ready, with NVMe-backed storage delivering the throughput RAG workloads need. + tags: + - nvidia + - rag + - ai + - blueprint + - fusion + - gpu + - llm + annotations: + backstage.io/techdocs-ref: dir:. 
+ github.com/project-slug: IBM/storage-fusion + + links: + - url: https://community.ibm.com/community/user/blogs/namita-singroha/2026/02/03/deploying-nvidia-rag-on-fusion-hci + title: Read on IBM Tech Exchange + icon: article + - url: https://ibm.github.io/storage-fusion/fusion-ai/resources/ + title: View on Fusion Tech Community + icon: web + - url: https://github.com/IBM/storage-fusion/blob/master/AI/NVIDIA%20Blueprints/NVIDIA%20RAG/Fusion_NVIDIA_RAG_Guide.md + title: Complete deployment guide on GitHub + icon: docs + - url: https://github.com/NVIDIA-AI-Blueprints/rag/blob/main/docs/deploy-helm.md + title: NVIDIA Documentation + icon: launch +spec: + type: blueprint + lifecycle: production + owner: fusion-team + system: fusion-ai-platform + + providesApis: [] + consumesApis: [] + + dependsOn: + - resource:default/nvidia-gpu + - resource:default/fusion-storage diff --git a/AI/NVIDIA Blueprints/NVIDIA RAG/backstage/docs/index.md b/AI/NVIDIA Blueprints/NVIDIA RAG/backstage/docs/index.md new file mode 100644 index 00000000..8f73801b --- /dev/null +++ b/AI/NVIDIA Blueprints/NVIDIA RAG/backstage/docs/index.md @@ -0,0 +1,412 @@ +# Deploying NVIDIA RAG on IBM Fusion HCI + +Retrieval-Augmented Generation (RAG) is rapidly becoming a core enterprise capability. However, moving RAG from development to production requires more than just connecting an LLM to a vector database: it demands GPU-optimized inference, scalable semantic search, efficient embedding generation, and enterprise infrastructure to reliably support these multiple components. + +This article walks through a validated production deployment of NVIDIA's RAG Blueprint on IBM Fusion HCI with Red Hat OpenShift. 
The deployment uses: + +- NVIDIA RAG Blueprint v2.3.0 (Helm-based deployment) +- NVIDIA NIM with Nemotron Nano 8B for optimized LLM inference +- Milvus for distributed vector search +- NeMo Retriever for embedding generation +- IBM Fusion Data Foundation for enterprise-grade persistent storage +- Red Hat OpenShift on IBM Fusion HCI + +## Table of Contents + +- Why IBM Fusion HCI for RAG deployments +- Prerequisites +- Configuration steps for Red Hat OpenShift deployment +- Validation & Testing +- What we accomplished +- Key observations +- Troubleshooting common issues +- Further Reading + +## Why IBM Fusion HCI: + +Enterprise RAG platforms simultaneously demand high GPU utilization, consistent storage performance, and streamlined operations. IBM Fusion HCI provides converged infrastructure where compute, storage, and Red Hat OpenShift are integrated and managed as a unified system. + +This deployment demonstrates: + +- Direct GPU pass-through for NIM containers (no virtualization overhead) +- High-performance NVMe storage for Milvus vector operations +- Native Red Hat OpenShift integration simplifying platform operations +- Single management plane for infrastructure and AI workloads +- Performance and reliability requirements met for production RAG + +## Prerequisites + +Before deploying the NVIDIA RAG Blueprint, ensure the following requirements are met on your Fusion HCI system: + +### 1. Hardware requirements + +Verify that your cluster meets the minimum hardware specifications for the RAG Blueprint deployment: + +- IBM Fusion HCI cluster installed and running. + +- GPU requirements: + +- Minimum: 8 GPUs +- GPU memory: 24GB+ VRAM per GPU (40GB+ recommended for larger models). +- GPU types: NVIDIA L40S, A100, H100, RTX PRO 6000, B200 or equivalent. 
+- Note: This deployment was tested on NVIDIA L40S GPUs with 46GB VRAM. + +Check available cluster resources: +```bash +oc describe nodes | grep -A 5 "Allocated resources" +``` + +Identify the type of GPUs: +```bash +oc get nodes -o json | jq -r '.items[] | select(.metadata.labels."nvidia.com/gpu.present" == "true") | {node: .metadata.name, gpu_product: .metadata.labels."nvidia.com/gpu.product", gpu_count: .metadata.labels."nvidia.com/gpu.count", gpu_memory: .metadata.labels."nvidia.com/gpu.memory"}' +``` + +### 2. Storage configuration + +Verify that you have a default storage class available: +```bash +oc get sc +``` + +Look for a storage class marked as (default). If a default storage class exists, you are ready to proceed. + +If no default storage class is set, configure one using IBM Fusion Data Foundation or another storage provider: + +**Option 1: IBM Fusion Data Foundation** + +Install IBM Fusion Data Foundation following the guide here. + +Once installed and configured, verify the storage class: +```bash +oc get storageclass | grep ocs +# Use: ocs-storagecluster-ceph-rbd +``` + +**Option 2: Local path provisioner** +```bash +oc apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.26/deploy/local-path-storage.yaml +oc patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}' +``` + +Note: This deployment was tested with IBM Fusion Data Foundation v4.18. + +### 3. NVIDIA GPU Operator + +- Verify that you have installed the NVIDIA GPU Operator using the following instructions: +- Check GPU operator status: +```bash +oc get pods -n nvidia-gpu-operator +``` + +- Confirm GPU resources are detected: +```bash +oc get nodes -o json | jq '.items[].status.allocatable | select(."nvidia.com/gpu" != null)' +``` + +### 4. 
NGC API key + +- Obtain your NGC API key from: https://ngc.nvidia.com/setup/api-key +- Export as an environment variable: +```bash +export NGC_API_KEY= +``` + +### 5. Optional: GPU time-slicing + +- You can enable time slicing for sharing GPUs between pods. +- For details, refer to this detailed guide on time-slicing. + +### 6. Install Helm and OpenShift CLI + +Ensure Helm v3.19.4 is installed, as this version is validated with the NVIDIA RAG Blueprint. +```bash +helm version +<-- output --> +version.BuildInfo{Version:"v3.19.4", ...} +``` + +Verify Red Hat OpenShift CLI is installed and connected to the cluster: +```bash +oc version +oc whoami +``` + +## Configuration steps: + +Before deploying the NVIDIA RAG Blueprint, a few modifications are required. Follow these steps sequentially: + +### Step 1: Download and extract the Helm chart + +Download the NVIDIA RAG Blueprint package locally for customization: +```bash +wget https://helm.ngc.nvidia.com/nvidia/blueprint/charts/nvidia-blueprint-rag-v2.3.0.tgz +tar xvzf nvidia-blueprint-rag-v2.3.0.tgz +cd nvidia-blueprint-rag +``` + +### Step 2: Configure pod PID limits + +Red Hat OpenShift requires increased PID limits for the RAG workload. 
Create and apply the kubelet configuration: +```bash +cat < 8082/TCP 43h +milvus ClusterIP 172.X.X.X 19530/TCP,9091/TCP 43h +nemoretriever-embedding-ms ClusterIP 172.X.X.X 8000/TCP 43h +nemoretriever-graphic-elements-v1 ClusterIP 172.X.X.X 8000/TCP,8001/TCP 43h +nemoretriever-page-elements-v2 ClusterIP 172.X.X.X 8000/TCP,8001/TCP 43h +nemoretriever-ranking-ms ClusterIP 172.X.X.X 8000/TCP 43h +nemoretriever-table-structure-v1 ClusterIP 172.X.X.X 8000/TCP,8001/TCP 43h +nim-llm ClusterIP 172.X.X.X 8000/TCP 35h +nim-llm-sts ClusterIP None 8000/TCP 35h +nv-ingest-ocr ClusterIP 172.X.X.X 8000/TCP,8001/TCP 43h +rag-etcd ClusterIP 172.X.X.X 2379/TCP,2380/TCP 43h +rag-etcd-headless ClusterIP None 2379/TCP,2380/TCP 43h +rag-frontend NodePort 172.X.X.X 3000:31273/TCP 43h +rag-minio ClusterIP 172.X.X.X 9000/TCP 43h +rag-nv-ingest ClusterIP 172.X.X.X 7670/TCP 43h +rag-opentelemetry-collector ClusterIP 172.X.X.X 6831/UDP,14250/TCP,14268/TCP,4317/TCP,4318/TCP,9411/TCP 35h +rag-redis-headless ClusterIP None 6379/TCP 43h +rag-redis-master ClusterIP 172.X.X.X 6379/TCP 43h +rag-redis-replicas ClusterIP 172.X.X.X 6379/TCP 43h +rag-server ClusterIP 172.X.X.X 8081/TCP 43h +rag-zipkin ClusterIP 172.X.X.X 9411/TCP 35h +``` + +The deployment is complete when all pods show Running status and 1/1 or appropriate replica counts in the READY column. + +### Step 8: Port-Forwarding to Access Web User Interface: + +Run the following cmd to port-forward the RAG UI service to your local machine. Then access the RAG UI at the following URL: http://localhost:3000. +```bash +oc port-forward -n rag service/rag-frontend 3000:3000 --address 0.0.0.0 +``` + +## Validation & Testing: + +After deployment, you can verify the RAG system is operational using the UI: + +1. Open a web browser and navigate to the RAG frontend. + +image + +2. Create a new collection by clicking "Create New Collection" at the bottom left. Provide a name and upload your documents (for example, IBM Fusion HCI and SDS PDFs). + +3. 
Click Create Collection and wait for ingestion to complete. Depending on the document size, this may take a few minutes. + +image + +4. In the home tab, click the Notifications icon on the top right. + +image + +5. Monitor the logs of the pod in the rag namespace to check for any errors. + +image + +6. Wait for the process to complete: ingestion may take several minutes depending on the size and number of uploaded documents. + +7. Once ingestion completes, click the uploaded document in the left panel. The collection will appear in the bottom right panel, ready for querying. + +image + +8. Now you can ask questions related to your document. + +image + +9. Monitor the logs of the pod in the rag namespace to observe AI responses. + +## What we accomplished: + +- Deployed a production RAG system that allows users to query enterprise documents and get AI-generated answers grounded in their own data. +- Successfully ran all RAG components (LLM inference, vector database, embeddings) on IBM Fusion HCI with stable performance. +- Validated that the NVIDIA RAG Blueprint works on Red Hat OpenShift, making it accessible to organizations using enterprise Kubernetes. + +## Key Observations: + +- Model selection directly impacts GPU requirements: Nemotron Nano 8B suits L40S GPUs while larger models need more VRAM; evaluate model capabilities against available resources before deployment +- Use Helm version 3.19.4 or less: other versions may have compatibility issues with the NVIDIA RAG Blueprint chart. + +## Troubleshooting common issues: + +### 1. Helm deployment fails with duplicate environment variable errors + +**Issue:** Deployment fails during Helm install with error: +``` +Release "rag" does not exist. Installing it now. 
+Error: failed to create typed patch object (rag/rag-nv-ingest; apps/v1, Kind=Deployment): errors: + .spec.template.spec.containers[name="nv-ingest"].env: duplicate entries for key [name="INGEST_LOG_LEVEL"] + .spec.template.spec.containers[name="nv-ingest"].env: duplicate entries for key [name="VLM_CAPTION_ENDPOINT"] +``` + +**Resolution:** Verify Helm version is exactly 3.19.4 using helm version + +### 2. Pods stuck in ImagePullBackOff + +**Issue:** Pods show ImagePullBackOff status + +**Resolution:** Verify container image names match the model list in NVIDIA NIM documentation and ensure that NGC secret is configured. + +### 3. Pods in CrashLoopBackOff + +**Issue:** Pods repeatedly crash with security errors + +**Resolution:** Verify SCC permissions are applied to the correct service account using oc get scc and oc describe pod + +## Further reading + +- To learn more about IBM Fusion HCI, explore the [IBM Fusion documentation](https://www.ibm.com/docs/en/fusion-hci-systems/2.12.0?topic=installing) +- For detailed Helm deployment steps, refer to the [NVIDIA RAG Blueprint deployment guide](https://github.com/NVIDIA-AI-Blueprints/rag/blob/main/docs/deploy-helm.md) +- Model specifications and options are available in the [NVIDIA NIM documentation](https://docs.nvidia.com/nim/large-language-models/latest/_include/models.html) +- Common deployment issues and solutions can be found in the [NVIDIA troubleshooting guide](https://github.com/NVIDIA-AI-Blueprints/rag/blob/main/docs/troubleshooting.md) +- To uninstall the deployment, follow the guidance [here](https://github.com/NVIDIA-AI-Blueprints/rag/blob/main/docs/deploy-helm.md#uninstall-a-deployment) + +**Acknowledgments:** Thanks to Sandeep Zende for his collaboration in validating this blueprint on IBM Fusion HCI. 
diff --git a/AI/NVIDIA Blueprints/NVIDIA RAG/backstage/mkdocs.yml b/AI/NVIDIA Blueprints/NVIDIA RAG/backstage/mkdocs.yml new file mode 100644 index 00000000..4f128a05 --- /dev/null +++ b/AI/NVIDIA Blueprints/NVIDIA RAG/backstage/mkdocs.yml @@ -0,0 +1,32 @@ +site_name: NVIDIA RAG Blueprint on IBM Fusion +site_description: Deploy Retrieval-Augmented Generation applications on IBM Fusion HCI + +# Point to the IBM storage-fusion repo for documentation +repo_url: https://github.com/IBM/storage-fusion +repo_name: IBM/storage-fusion + +# Point to docs directory +docs_dir: docs + +# Point to your actual markdown file +nav: + - Home: index.md + +theme: + name: material + palette: + primary: indigo + accent: indigo + +plugins: + - techdocs-core + + +# Safe markdown extensions +markdown_extensions: + - admonition + - tables + - toc: + permalink: true + +# Made with Bob diff --git a/AI/NVIDIA Blueprints/NVIDIA VSS/backstage/catalog-info.yaml b/AI/NVIDIA Blueprints/NVIDIA VSS/backstage/catalog-info.yaml new file mode 100644 index 00000000..b6750a02 --- /dev/null +++ b/AI/NVIDIA Blueprints/NVIDIA VSS/backstage/catalog-info.yaml @@ -0,0 +1,52 @@ +apiVersion: backstage.io/v1alpha1 +kind: Component +metadata: + name: nvidia-vss-blueprint + title: Guide to deploy NVIDIA VSS Blueprint on IBM Fusion HCI + description: | + Deploy NVIDIA Video Search and Summarization (VSS), an AI-powered visual intelligence blueprint that transforms video from passive recordings into intelligent, queryable knowledge bases. + + VSS delivers automated video summarization with timestamped key events, natural language Q&A over video content, semantic search that understands meaning and context, and flexible processing for both live streams and archived footage. The architecture combines Cosmos-Reason2 8B VLM for visual understanding, Llama 3.1 70B LLM for natural language reasoning, and NeMo embedding and reranking models for intelligent retrieval. 
Together, these components convert raw video into structured, searchable intelligence, making video data searchable, understandable, and actionable for enterprise use cases. + tags: + - nvidia + - vss + - ai + - blueprint + - fusion + - gpu + - video-analytics + - computer-vision + - llm + annotations: + backstage.io/techdocs-ref: dir:. + github.com/project-slug: IBM/storage-fusion + + links: + - url: https://community.ibm.com/community/user/blogs/namita-singroha/2026/02/15/unlocking-ai-powered-video-analytics-on-ibm-fusion + title: Read on IBM Tech Exchange + icon: article + - url: https://ibm.github.io/storage-fusion/fusion-ai/resources/ + title: View on Fusion Tech Community + icon: web + - url: https://github.com/IBM/storage-fusion/blob/master/AI/NVIDIA%20Blueprints/NVIDIA%20VSS/Fusion_NVIDIA_VSS_Guide.md + title: Complete deployment guide on GitHub + icon: docs + - url: https://build.nvidia.com/nvidia/video-search-and-summarization + title: NVIDIA VSS Blueprint + icon: launch + - url: https://docs.nvidia.com/vss/latest/content/vss_dep_helm.html + title: NVIDIA VSS Documentation + icon: launch +spec: + type: blueprint + lifecycle: production + owner: fusion-team + system: fusion-ai-platform + + providesApis: [] + consumesApis: [] + + dependsOn: + - resource:default/nvidia-gpu + - resource:default/fusion-storage + diff --git a/AI/NVIDIA Blueprints/NVIDIA VSS/backstage/docs/index.md b/AI/NVIDIA Blueprints/NVIDIA VSS/backstage/docs/index.md new file mode 100644 index 00000000..c9cf18f2 --- /dev/null +++ b/AI/NVIDIA Blueprints/NVIDIA VSS/backstage/docs/index.md @@ -0,0 +1,381 @@ +# Unlocking AI-Powered Video Analytics on IBM Fusion HCI using NVIDIA VSS + +Visual data has become one of the fastest-growing information sources in modern organizations. Enterprises capture massive amounts of video footage across facilities, production lines, retail environments, data centers, and operational sites. 
Yet most of this data remains effectively unusable, accessible only through time-consuming manual review. + +Finding a specific event often requires scrubbing through hours of recordings. Answering simple questions like *“What happened during the night shift?”* can demand significant time and operational effort. + +The fundamental challenge isn’t capturing video; it’s making that video searchable, understandable, and actionable. + +This transformation is now possible with **NVIDIA Video Search and Summarization (VSS)**, an AI-powered visual intelligence blueprint built for production environments. + +This article walks through a validated deployment of NVIDIA’s VSS Blueprint on **IBM Fusion HCI** with **Red Hat OpenShift**, following [NVIDIA’s official Helm-based deployment guide](https://docs.nvidia.com/vss/latest/content/vss_dep_helm.html). + +## What This Article Covers + +This technical walk-through provides complete, step-by-step instructions for deploying NVIDIA VSS on IBM Fusion HCI: + +- Infrastructure prerequisites and validation +- Detailed Helm deployment with GPU configuration +- Troubleshooting common deployment challenges +- Production considerations for scaling and optimization + +## NVIDIA Video Search and Summarization (VSS) + +NVIDIA’s VSS Blueprint transforms video from a passive recording into an intelligent, queryable knowledge base. 
Built on state-of-the-art AI models and designed for real-world deployment, VSS delivers: + +- Automated video summarization with timestamped key events +- Natural language Q&A over video content +- Semantic search that understands meaning and context +- Flexible processing for both live streams and archived footage + +The architecture combines three powerful AI capabilities: + +- **Cosmos-Reason2 8B VLM** for visual understanding +- **Llama 3.1 70B LLM** for natural language reasoning +- **NeMo embedding and reranking models** for intelligent retrieval + +Together, these components convert raw video into structured, searchable intelligence. + +## Why IBM Fusion HCI Provides the Ideal Foundation + +Deploying AI-driven video intelligence requires more than GPUs. It demands tightly integrated compute, persistent storage, vector database infrastructure, and container orchestration, all operating cohesively. + +IBM Fusion Platform HCI provides: + +- **Integrated GPU compute and persistent storage** + Unified compute and storage eliminate external storage dependencies while supporting model caching, vector databases, and metadata persistence. + +- **OpenShift-based cloud-native orchestration** + Deploy AI microservices as containers with lifecycle management and scaling built in. + +- **Single platform for multiple AI workloads** + Vision models, large language models, and embedding services run together with simplified operations and optimized data locality. + +- **Operational consistency** + A unified management interface, centralized backup strategy, and validated lifecycle workflows reduce deployment complexity. + +This deployment validates that IBM Fusion HCI provides a production-ready foundation for NVIDIA’s VSS Blueprint. + +## VSS Architecture Overview + +VSS processes video using two coordinated pipelines: + +- **Ingestion Pipeline** +- **Retrieval Pipeline** + +### Ingestion Pipeline Flow + +1. 
Video is split into short chunks and distributed across GPUs in parallel. +2. Frames are sampled from each chunk and passed to the Vision Language Model (Cosmos-Reason2 8B by default). +3. The VLM generates timestamped natural language captions. +4. Optional components: + - Audio transcription via Riva ASR + - Computer vision metadata (object detection) +5. Outputs are merged into structured caption data. + +### Retrieval Pipeline Flow + +1. Captions are converted into vector embeddings using NeMo Retriever. +2. Embeddings are indexed into Milvus for semantic search. +3. The LLM populates a Neo4j knowledge graph with structured event data. +4. Summarization aggregates captions into a time-anchored summary. +5. For Q&A: + - The user query searches the vector DB and the knowledge graph + - NeMo Reranker rescores the retrieved results + - The top-ranked context is passed to the LLM for grounded response generation + +The result: natural language querying over video with enterprise-grade performance. + +## Prerequisites + +Before deploying NVIDIA VSS, ensure the following requirements are met. + +### 1. Infrastructure + +- IBM Fusion HCI cluster installed and running. +- Fusion HCI v2.12+ includes the NVIDIA GPU Operator pre-installed. +- For earlier versions, install the NVIDIA GPU Operator manually. + +### 2. GPU Requirements + +VSS supports multiple deployment configurations depending on hardware. + +#### Default Configuration (Recommended Production Setup) + +**8 GPUs (H100 / H200 / B200 / A100 80GB+) on a single node** + +| Component | GPUs | +|------------|------| +| LLM (Llama 3.1 70B) | 4 | +| VSS (VLM processing) | 2 | +| NeMo Embedding | 1 | +| NeMo Reranking | 1 | + +> Note: This guide follows the default configuration using 8 H200 GPUs. + +#### Other Options + +- Customized GPU allocation (explicit GPU-to-service pinning) +- Fully local single GPU deployment (dev/test environments) + +Verify GPU availability: + +```bash +oc describe node | grep nvidia.com/gpu +``` + +### 3. 
Storage + +This reference deployment uses IBM Fusion Data Foundation v4.18. + +If unavailable, configure a local path provisioner: + +```bash +oc apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.26/deploy/local-path-storage.yaml + +oc patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}' +``` + +### 4. Model Access & Credentials + +You will need: + +- **NGC API Key** for NVIDIA container images +- **Hugging Face Token** for Cosmos-Reason2 8B model access + +Steps: + +- Register for an NGC API key at https://ngc.nvidia.com +- Generate a Hugging Face token +- Accept the model terms at https://huggingface.co/nvidia/Cosmos-Reason2-8B + +### 5. Tooling + +- Helm v3.19.4 (validated version) +- OpenShift CLI (oc) + +Validate: + +```bash +helm version +oc version +oc whoami +``` + +# Deployment Steps + +For full configuration options, refer to NVIDIA’s [official](https://docs.nvidia.com/vss/latest/content/vss_dep_helm.html) Helm documentation. 
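Before creating secrets in Step 1, it can help to confirm that the tools and credentials listed in the prerequisites are actually present. The sketch below is one possible preflight check, not part of NVIDIA's or IBM's tooling; `check_tools` and `check_env` are hypothetical helper names introduced here for illustration.

```bash
# Preflight sketch (hypothetical helpers, POSIX shell).

# Return 0 if every named command is on PATH; report any that are missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing tool: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Return 0 if every named environment variable is set and non-empty.
check_env() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A typical call would be `check_tools helm oc && check_env NGC_API_KEY HF_TOKEN && oc whoami`, which stops at the first failing check so that no secret is created from an empty value.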
+ +### Step 1: Create Secrets + +Export credentials: + +```bash +export NGC_API_KEY= +export HF_TOKEN= +``` + +Create required secrets: + +```bash +oc create secret docker-registry ngc-docker-reg-secret \ + --docker-server=nvcr.io \ + --docker-username='$oauthtoken' \ + --docker-password=$NGC_API_KEY + +oc create secret generic ngc-api-key-secret \ + --from-literal=NGC_API_KEY=$NGC_API_KEY + +oc create secret generic hf-token-secret \ + --from-literal=HF_TOKEN=$HF_TOKEN + +oc create secret generic graph-db-creds-secret \ + --from-literal=username=neo4j --from-literal=password=password + +oc create secret generic arango-db-creds-secret \ + --from-literal=username=root --from-literal=password=password + +oc create secret generic minio-creds-secret \ + --from-literal=access-key=minio --from-literal=secret-key=minio123 +``` + +### Step 2: Fetch the Helm Chart + +```bash +helm fetch \ +https://helm.ngc.nvidia.com/nvidia/blueprint/charts/nvidia-blueprint-vss-2.4.1.tgz \ +--username='$oauthtoken' --password=$NGC_API_KEY +``` + +### Step 3: Deploy + +```bash +helm install vss-blueprint nvidia-blueprint-vss-2.4.1.tgz \ + --set global.ngcImagePullSecretName=ngc-docker-reg-secret \ + --set nim-llm.persistence.size=200Gi +``` + +### Why 200Gi? + +The default 50Gi PVC is insufficient for the Llama 3.1 70B model. Increasing to 200Gi ensures adequate model storage. + +Installation time may vary from a few minutes to an hour depending on network speed and model caching. 
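Because the installation can take a while, it may be convenient to list only the pods that are not ready yet instead of scanning the full output. The filter below is a minimal sketch that assumes the default `oc get pods` column order (NAME, READY, STATUS, ...); `not_ready_pods` is a hypothetical helper name, not an `oc` feature.

```bash
# Read `oc get pods --no-headers` output on stdin and print the names of pods
# that are neither fully ready and Running nor Completed.
not_ready_pods() {
  awk '{
    split($2, ready, "/")                      # READY column, e.g. "0/1"
    healthy = ($3 == "Running" && ready[1] == ready[2])
    if (!healthy && $3 != "Completed") print $1
  }'
}
```

For example, `oc get pods -n default --no-headers | not_ready_pods` prints nothing once every pod has started, which is a quick signal that the deployment has settled.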
+ +### Step 4: Verify Deployment + +Check pods: + +```bash +oc get pods -n default +``` + +Check services: + +```bash +oc get svc -n default +``` + +Deployment is complete when all pods show: + +- STATUS: Running +- READY: 1/1 (or appropriate replica count) + +### Troubleshooting + +### NeMo Rerank Pod Fails + +If `nemo-rerank-ranking-deployment` does not start: + +```bash +oc scale deployment nemo-rerank-ranking-deployment --replicas=0 + +oc patch deployment nemo-rerank-ranking-deployment \ +-p '{"spec":{"template":{"spec":{"securityContext":{"fsGroup":1000,"runAsUser":1000,"runAsGroup":1000}}}}}' + +oc scale deployment nemo-rerank-ranking-deployment --replicas=1 +``` + +### Step 5: Access the UI + +```bash +oc get svc vss-service +``` + +From the output: + +- Port 8000 → REST API +- Port 9000 → UI + +Open: + +``` +http://: +``` + +# Validation & Testing + +After deployment: + +1. Open the VSS UI. +2. Upload a sample video. + +3. Select chunk size (recommended: 5 seconds). +4. Configure prompts as needed. The examples below are templates; adjust them based on the type of video being uploaded and the level of technical detail required. + + + #### Prompt + ```bash + Summarize this demo video in a completed long paragraph. Explain how the MCP (Model Context Protocol) server is integrated with Watson Orchestrate. Explain what is happening in this video step by step. Identify any products, platforms, tools, workflows, and technical concepts shown. Describe the full process clearly. + ``` + #### Caption Summarization Prompt + + ```bash + Watch this video and generate a clear technical paragraph describing the main workflow demonstrated. Identify the platforms, services, tools, and integration steps shown. Explain how the system is configured, how components are connected, how functionality is validated, and how the final solution is deployed. Keep the paragraph simple, technical, and focused on process rather than marketing language. 
+ + Break the video into 30–40 second timestamp segments. For each segment, briefly describe what happens, including any setup steps, UI actions, commands, configuration changes, or validation shown. If a major action occurs (such as integration, testing, or deployment), highlight it clearly even if it does not align exactly with the 30–40 second window. + ``` + #### Summary Aggregation Prompt + + ```bash + Generate a concise technical summary of this integration demo using the following structure: + + #### 1. OVERVIEW + Explain the goal of the demo and what problem the solution addresses. + + #### 2. WORKFLOW SUMMARY + Describe the main steps shown, including platform connection, service configuration, tool integration, validation or testing, and final deployment. + + #### 3. KEY FEATURES DEMONSTRATED + Summarize the core capabilities shown such as integration workflow, agent or service creation, tool ingestion, authentication, data querying, and deployment. + + #### 4. TECHNICAL ARCHITECTURE (HIGH LEVEL) + Explain how the main components interact, including orchestration platform, backend services, infrastructure layer, and data sources. + + #### 5. CONCLUSION + Summarize the final outcome and how this workflow supports enterprise AI or automation use cases. + + Keep everything technical, concise, and non-repetitive. Use short paragraphs. + ``` +5. Click **Summarize**. + +To monitor processing: + +```bash +oc logs -f vss-vss-deployment- -n default +``` + +Once complete, the summary appears in the UI. + +You can also ask natural language questions about the uploaded video. + +Highlights can also be generated for the uploaded video file. + +# What We Achieved + +By deploying NVIDIA VSS on IBM Fusion HCI: + +- Video is ingested, captioned, and indexed. +- Content becomes searchable using natural language. +- Query response time drops from hours of manual review to seconds. +- The entire system runs on-premises with GPU acceleration. 
+- All services are orchestrated through OpenShift. + +This validates IBM Fusion HCI as a production-ready platform for enterprise AI-powered video analytics. + + +# Extending VSS + +### Audio Transcription (Riva ASR) + +Optional capability: + +- Adds speech-to-text transcription +- Merges spoken content with visual captions +- Ideal for briefings, announcements, and training videos + +**Requirements:** + +- 1 additional GPU (can share on 80GB+ GPUs) +- Model: `parakeet-0-6b-ctc-riva-en-us` + +Refer to NVIDIA’s audio deployment guide for configuration details. + +--- + +# Explore Further + +- To learn more about IBM Fusion HCI, explore the [IBM Fusion documentation](https://www.ibm.com/docs/en/fusion-hci-systems/2.12.x?topic=installing) +- For detailed Helm deployment options, refer to the NVIDIA VSS Helm [deployment](https://docs.nvidia.com/vss/latest/content/vss_dep_helm.html) guide +- Model specifications and supported configurations are available in the [NVIDIA NIM documentation](https://docs.nvidia.com/nim/large-language-models/latest/supported-models.html) +- Common deployment issues and solutions can be found in the [NVIDIA VSS FAQ](https://docs.nvidia.com/vss/latest/content/faq.html) and [Known Issues](https://docs.nvidia.com/vss/latest/content/known_issues.html) +- To uninstall the deployment, follow the guidance [here](https://docs.nvidia.com/vss/latest/content/vss_dep_helm.html#uninstalling-the-deployment). diff --git a/AI/NVIDIA Blueprints/NVIDIA VSS/backstage/mkdocs.yml b/AI/NVIDIA Blueprints/NVIDIA VSS/backstage/mkdocs.yml new file mode 100644 index 00000000..3309bc7c --- /dev/null +++ b/AI/NVIDIA Blueprints/NVIDIA VSS/backstage/mkdocs.yml @@ -0,0 +1,32 @@ +site_name: NVIDIA VSS Blueprint on IBM Fusion +site_description: Deploy NVIDIA Video Search and Summarization on IBM Fusion HCI for AI-powered video analytics + +# Point to the IBM storage-fusion repo for documentation +repo_url: https://github.com/IBM/storage-fusion +repo_name: IBM/storage-fusion 
+ +# Point to docs directory +docs_dir: docs + +# Point to your actual markdown file +nav: + - Home: index.md + +theme: + name: material + palette: + primary: indigo + accent: indigo + +plugins: + - techdocs-core + + +# Safe markdown extensions +markdown_extensions: + - admonition + - tables + - toc: + permalink: true + +# Made with Bob diff --git a/AI/quickstarts/fusion-developerhub/backstage/catalog-info.yaml b/AI/quickstarts/fusion-developerhub/backstage/catalog-info.yaml new file mode 100644 index 00000000..6c5b5624 --- /dev/null +++ b/AI/quickstarts/fusion-developerhub/backstage/catalog-info.yaml @@ -0,0 +1,49 @@ +apiVersion: backstage.io/v1alpha1 +kind: Component +metadata: + name: fusion-developer-hub-quickstart + title: Quickstart - Fusion Developer Hub + description: | + Deploy IBM Fusion Developer Hub, a production-ready enterprise developer portal built on Red Hat Developer Hub (Backstage) with deep integration into IBM Fusion's AI ecosystem. + + This quickstart guide walks you through deploying a fully functional, high-availability Developer Hub instance on OpenShift in under 15 minutes, complete with automatic AI model discovery, self-service templates, and enterprise-grade security. The platform provides operator-based management for automated deployment and day-2 operations, IBM Fusion AI integration with WatsonX and Granite models, high availability with multi-instance PostgreSQL, and enterprise security with RBAC and network policies. + tags: + - fusion + - developer-hub + - backstage + - quickstart + - ai + - platform + - openshift + - rhdh + annotations: + backstage.io/techdocs-ref: dir:. 
+ github.com/project-slug: IBM/storage-fusion + + links: + - url: https://community.ibm.com/community/user/blogs/anushka-jaiswal/2026/05/29/quickstart-developer-hub-on-ibm-fusion-with-redhat + title: Read on IBM Tech Exchange + icon: article + - url: https://ibm.github.io/storage-fusion/fusion-ai/quickstarts/ + title: View on Fusion Tech Community + icon: web + - url: https://github.com/IBM/storage-fusion/blob/master/AI/quickstarts/fusion-developerhub/QUICKSTART.md + title: Complete deployment guide on GitHub + icon: docs + - url: https://github.com/IBM/storage-fusion/tree/master/AI/quickstarts/fusion-developerhub + title: View source code + icon: github +spec: + type: quickstart + lifecycle: production + owner: fusion-team + system: fusion-ai-platform + + providesApis: [] + consumesApis: [] + + dependsOn: + - resource:default/openshift + - resource:default/fusion-storage + +# Made with Bob diff --git a/AI/quickstarts/fusion-developerhub/backstage/docs/index.md b/AI/quickstarts/fusion-developerhub/backstage/docs/index.md new file mode 100644 index 00000000..458a632d --- /dev/null +++ b/AI/quickstarts/fusion-developerhub/backstage/docs/index.md @@ -0,0 +1,663 @@ +# IBM Fusion Developer Hub Quickstart + +Deploy production-ready Red Hat Developer Hub with automatic AI model discovery on OpenShift in under 15 minutes. 
+ +> **✅ TESTED ON**: OpenShift 4.15+ (Fusion HCI Cluster) +> **📅 Last Verified**: May 29, 2026 + +## What you'll get + +- **Red Hat Developer Hub** - Enterprise developer portal (Backstage) +- **High Availability** - 3 replicas with automatic failover +- **PostgreSQL HA** - Crunchy PostgreSQL Operator with automated backups +- **IBM Fusion AI Homepage** - Pre-configured with AI capabilities +- **Automatic Model Discovery** - See OpenShift AI models on homepage +- **Production Security** - RBAC, network policies, pod security + +### 🎯 Key Feature: Automatic Model Discovery + +The homepage automatically discovers and displays: +- ✅ Models deployed via OpenShift AI (KServe) +- ✅ Model endpoints and status +- ✅ Performance metrics +- ✅ Quick access links + +## Prerequisites + +### Required Components + +- **Red Hat OpenShift 4.12+** cluster on IBM Fusion HCI +- **Cluster admin access** +- **Red Hat OpenShift AI (RHOAI)** installed and configured + - Required for automatic model discovery and AI capabilities + - Installation guide: [`../../fusion-openshift-ai/docs/01-RHOAI-Installation-Guide.md`](../../fusion-openshift-ai/docs/01-RHOAI-Installation-Guide.md) +- **100GB available storage** (ODF recommended) + +### Required CLI Tools + +- **`oc` CLI** installed and configured +- **`helm` 3.8+** installed ([install guide](https://helm.sh/docs/intro/install/)) + +## Deploy the quickstart + +### 1. Clone this repository + +```bash +git clone https://github.com/IBM/storage-fusion.git +cd storage-fusion/AI/quickstarts/fusion-developerhub +``` + +### 2. Log in to your OpenShift cluster + +```bash +oc login --token= --server= +``` + +### 3. Install Helm (if not installed) + +```bash +# macOS +brew install helm + +# Linux +curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash + +# Verify +helm version +``` + +### 4. 
Configure Cluster Domain and Storage + +Before deploying, you need to configure your cluster's wildcard domain and optionally configure storage classes. + +```bash +# Edit the production values file +vi examples/quickstart-production-values.yaml +``` + +#### Update Cluster Domain (Required) + +Find and update the `wildcardDomain` to match your OpenShift cluster: + +```yaml +global: + # IMPORTANT: Change this to match your OpenShift cluster domain + wildcardDomain: apps.your-cluster.example.com # Update this! +``` + +To get your cluster domain: +```bash +oc get ingresses.config/cluster -o jsonpath='{.spec.domain}' +``` + +#### Configure Storage Classes (Optional) + +**Important**: Dynamic plugins require a storage class that supports **ReadWriteMany (RWX)** access mode. + +If you're using **OpenShift Data Foundation (ODF)**, update the storage configuration for ODF: + +```yaml +developerHub: + storage: + # For dynamic plugins (requires ReadWriteMany) + storageClassName: "ocs-storagecluster-cephfs" # ODF CephFS for RWX + size: 5Gi + +postgresql: + storage: + size: 20Gi + # For PostgreSQL (requires ReadWriteOnce) + storageClassName: "ocs-storagecluster-ceph-rbd" # ODF RBD for RWO +``` + +If you're **NOT using ODF**, specify a storage class that supports ReadWriteMany: + +```yaml +developerHub: + storage: + # Use any storage class that supports ReadWriteMany (RWX) + # Examples: nfs-client, glusterfs, etc. + storageClassName: "nfs-client" # Replace with your RWX storage class + size: 5Gi + +postgresql: + storage: + size: 20Gi + # Use any storage class that supports ReadWriteOnce (RWO) + storageClassName: "" # Leave empty for cluster default +``` + +**Note**: If you leave `storageClassName` empty (`""`), the cluster's default storage class will be used. + +### 5. 
Deploy Developer Hub + +After configuring the cluster domain and storage classes: + +```bash +# Deploy with production configuration +helm install fusion-developer-hub \ + ./helm-charts/fusion-developer-hub \ + -n fusion-developer-hub \ + --create-namespace \ + -f examples/quickstart-production-values.yaml \ + --timeout 20m +``` + +**What happens:** +1. Installs Red Hat Developer Hub Operator (2 min) +2. Installs Crunchy PostgreSQL Operator (2 min) +3. Creates PostgreSQL cluster with HA (5 min) +4. Deploys Developer Hub with 3 replicas (5 min) +5. Configures OpenShift AI model connector (automatic) + +### 6. Monitor deployment + +```bash +# Watch operator installation +watch oc get csv -n rhdh-operator + +# Watch PostgreSQL cluster +watch oc get postgrescluster -n fusion-developer-hub + +# Watch Developer Hub +watch oc get backstage -n fusion-developer-hub + +# All should show "Succeeded" or "Ready" +``` + +### 7. Access Developer Hub + +```bash +# Get the URL +oc get route -n fusion-developer-hub -o jsonpath='{.items[0].spec.host}' + +# Example output: +# backstage-developer-hub-fusion-developer-hub.apps.your-cluster.com +``` + +Visit the URL in your browser. 
You'll see the **IBM Fusion AI Platform** homepage: + +![IBM Fusion AI Platform Homepage](fusion-homepage.png) + +**Key Features Visible:** +- **Welcome to IBM Fusion AI Platform** - Custom branded homepage +- **Quick Access** section with: + - Developer Tools (Podman Desktop) + - CI/CD Tools (ArgoCD, SonarQube, Quay.io) + - OpenShift Clusters integration +- **Search functionality** for quick navigation +- **Navigation menu** with: + - Home + - Catalog (components and APIs) + - APIs + - Learning Paths + - Docs + - Administration +- **Your Starred Entities** for quick access to favorites +- **Automatic model discovery** from OpenShift AI (when models are deployed) + +#### Model Catalog View + +Click on **Catalog** in the navigation menu to see all discovered AI models: + +![IBM Fusion Model Catalog](fusion-model-catalog.png) + +**Model Catalog Features:** +- **Automatic Model Discovery** - Models deployed via OpenShift AI appear automatically +- **Filter by Kind** - Component, API, System, etc. +- **Filter by Type** - model-server, service, etc. +- **Search Functionality** - Quick search across all catalog entries +- **Model Details** including: + - Model name and version (e.g., `model-serving-qwen2-5-72b-instruct`, `model-serving-qwen3-32b-instruct`) + - Owner and system information + - Lifecycle stage (development, production, etc.) + - Tags for categorization (model-qwen, quantization-fp8, validated-patterns, etc.) + - Authentication requirements +- **Self-service** button for creating new components +- **Personal filters** - Owned and Starred items for quick access + +### 🔐 Authentication + +The quickstart deploys with **Guest Access** enabled by default for easy testing: + +- ✅ **Guest Login**: Works immediately - click "Enter" on the homepage + +**Note**: If you see a GitHub login error, this is expected. Use Guest access for testing. 
+ +## What's deployed + +### Operators +- **Red Hat Developer Hub Operator** - Manages Developer Hub lifecycle +- **Crunchy PostgreSQL Operator** - Manages PostgreSQL HA cluster + +### Developer Hub (3 replicas) +- **Image**: Red Hat Developer Hub (latest) +- **Replicas**: 3 (high availability) +- **Resources**: 2Gi memory, 1 CPU per replica +- **Features**: + - IBM Fusion AI homepage + - OpenShift AI model connector (enabled by default) + - Software catalog + - Self-service templates + - TechDocs + - RBAC + +### PostgreSQL HA Cluster (3 instances) +- **Primary**: 1 instance (read-write) +- **Replicas**: 2 instances (read-only) +- **Automated Backups**: Daily to ODF +- **Retention**: 30 days +- **Failover**: Automatic + +### Security +- Network policies enabled +- Pod security standards enforced +- RBAC configured +- Secrets encrypted + +## Configuration + +### Production Values (values-production.yaml) + +```yaml +global: + wildcardDomain: apps.your-cluster.com # Change this! + +developerHub: + replicas: 3 # High availability + + # Storage configuration for dynamic plugins + storage: + # Storage class for dynamic plugins (ReadWriteMany required) + # Leave empty to use cluster default + # If using ODF: ocs-storagecluster-cephfs + storageClassName: "" + size: 5Gi + + resources: + requests: + cpu: 1000m + memory: 2Gi + limits: + cpu: 2000m + memory: 4Gi + + # OpenShift AI Model Connector (enabled by default) + config: + homepage: + enabled: true + plugins: + - name: openshift-ai-connector + enabled: true + config: + discoveryInterval: 30s + namespaces: + - model-serving + - maas-runtime + +postgresql: + enabled: true + instances: 3 # HA cluster + + resources: + requests: + cpu: 500m + memory: 1Gi + + storage: + size: 20Gi + # Storage class for PostgreSQL (ReadWriteOnce is sufficient) + # Leave empty to use cluster default + # If using ODF: ocs-storagecluster-ceph-rbd + storageClassName: "" + + backup: + enabled: true + schedule: "0 2 * * *" # Daily at 2 AM + 
retention: "30d" + +monitoring: + enabled: true + prometheus: + enabled: true + +security: + networkPolicy: + enabled: true + podSecurity: + enabled: true +``` + +## Next steps + +### Create AI Applications with Self-Service Templates + +Click on **Self-service** button (top right) to access pre-built application templates: + +![IBM Fusion Self-Service Templates](fusion-self-service-templates.png) + +**Available AI Application Templates:** + +1. **Audio to Text Application** + - Build AI-enabled audio transcription application + - Technologies: ai, whispercpp, python, asr + - Pick from available model servers + +2. **Chatbot Application** + - Build Large Language Model (LLM)-enabled chat application + - Technologies: ai, llamacpp, vllm, python + - Interactive conversational AI + +3. **Code Generation Application** + - Build LLM-enabled code generation application + - Technologies: ai, llamacpp, vllm, python + - Generate code from natural language + +4. **Model Server, No Application** + - Deploy a granite-3.1 8b model with vLLM server + - Technologies: ai, vllm, modelserver + - Standalone model serving + +5. **Object Detection Application** + - Identify and locate objects in images using AI + - Technologies: ai, detr, python + - Computer vision capabilities + +6. 
**RAG Chatbot Application** + - Enhance chatbot with Retrieval-Augmented Generation (RAG) + - Technologies: ai, llamacpp, vllm, python, rag, database + - Context-aware responses + +**Features:** +- **View TechDocs** - Detailed documentation for each template +- **Choose button** - Start creating your application +- **Filter by Categories and Tags** - Find the right template quickly +- **Personal section** - Access your starred templates +- **IBM Fusion section** - All 6 templates available + +#### Creating an Application from Template + +When you click **Choose** on a template (e.g., Chatbot Application), you'll see a guided wizard: + +![IBM Fusion Template Creation Wizard](fusion-template-wizard.png) + +**4-Step Creation Process:** +1. **Application Information** - Name, owner, and ArgoCD configuration +2. **Application Repository Information** - Git repository settings +3. **Deployment Information** - Deployment configuration +4. **Review** - Review and create + +The wizard guides you through creating your AI application with automatic GitOps deployment via ArgoCD. + +### Deploy AI Models + +To deploy AI models that will be automatically discovered by Developer Hub, see the **Model Serving Guide**: + +**📖 Model Deployment Guide**: [`../../fusion-model-serving/README.md`](../../fusion-model-serving/README.md) + +This guide covers: +- GitOps-driven model deployment using KServe +- vLLM runtime configuration for LLM serving +- Model serving with Red Hat OpenShift AI +- External access configuration via OpenShift Routes + +**How it works:** +1. Deploy models using the model-serving guide +2. Models are automatically discovered by Developer Hub (every 30 seconds) +3. Models appear on the homepage with status, endpoints, and metrics +4. Use models in your AI applications via the catalog + +**Monitored namespaces** (configurable in values.yaml): +- `model-serving` +- `maas-runtime` +- `redhat-ods-applications` + +### Create an application + +1. 
Visit Developer Hub +2. Click **Create** → **Fusion AI Application** +3. Select a model from dropdown (auto-populated from OpenShift AI) +4. Fill in details +5. Click **Create** + +### Monitor the platform + +```bash +# Check Developer Hub status +oc get backstage -n fusion-developer-hub + +# Check PostgreSQL cluster +oc get postgrescluster -n fusion-developer-hub + +# View metrics +oc get servicemonitor -n fusion-developer-hub +``` + +### Customize homepage + +Edit the configuration: + +```bash +oc edit backstage developer-hub -n fusion-developer-hub +``` + +See [docs/homepage-customization.md](docs/homepage-customization.md) for details. + +### Upgrade Developer Hub + +To upgrade to a newer version or apply configuration changes: + +```bash +# Pull latest changes from repository +git pull + +# Upgrade with updated values +helm upgrade fusion-developer-hub \ + ./helm-charts/fusion-developer-hub \ + -n fusion-developer-hub \ + -f examples/quickstart-production-values.yaml \ + --timeout 20m + +# Monitor the upgrade +watch oc get pods -n fusion-developer-hub +``` + +**What gets upgraded:** +- Developer Hub application to latest version +- Configuration changes from values file +- Template updates +- Plugin updates + +**Note**: The upgrade process performs a rolling update, maintaining availability during the upgrade. 
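To confirm that a rolling update has finished without watching the pod list by hand, one option is to wrap a readiness check in a small retry loop. The sketch below is an assumption-laden example: `retry` is a hypothetical helper, and the label selector is the one used in the troubleshooting commands of this guide.

```bash
# Run a command up to <attempts> times, sleeping <delay> seconds between
# tries; succeed as soon as the command succeeds.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Only sleep when another attempt is still to come.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

One possible invocation is `retry 20 30 oc wait --for=condition=Ready pod -l rhdh.redhat.com/app=backstage-developer-hub -n fusion-developer-hub --timeout=30s`. The retry matters because `oc wait` returns an error when no matching pods exist yet, which can happen briefly while pods are being replaced.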
+ +## Troubleshooting + +### Operators not installing + +Check operator status: + +```bash +# Check subscriptions +oc get subscription -n rhdh-operator +oc get subscription -n postgres-operator + +# Check install plans +oc get installplan -n rhdh-operator +oc get installplan -n postgres-operator + +# If manual approval needed +oc patch installplan -n rhdh-operator --type merge -p '{"spec":{"approved":true}}' +``` + +### PostgreSQL cluster not ready + +Check cluster status: + +```bash +# Get cluster details +oc describe postgrescluster developerhub-postgres -n fusion-developer-hub + +# Check pods +oc get pods -n fusion-developer-hub -l postgres-operator.crunchydata.com/cluster=developerhub-postgres + +# View logs +oc logs -n fusion-developer-hub -l postgres-operator.crunchydata.com/role=master +``` + +### Developer Hub not starting + +Check backstage status: + +```bash +# Get backstage details +oc describe backstage developer-hub -n fusion-developer-hub + +# Check pods using the correct label +oc get pods -n fusion-developer-hub -l rhdh.redhat.com/app=backstage-developer-hub + +# View logs +oc logs -n fusion-developer-hub -l rhdh.redhat.com/app=backstage-developer-hub +``` + +### Models not appearing on homepage + +Check connector configuration: + +```bash +# View backstage config +oc get backstage developer-hub -n fusion-developer-hub -o yaml | grep -A 20 "openshift-ai" + +# Check rhoai-normalizer container logs in backstage pods +oc logs -n fusion-developer-hub -l rhdh.redhat.com/app=backstage-developer-hub -c rhoai-normalizer + +# Check if rhoai-normalizer container is running +oc get pods -n fusion-developer-hub -l rhdh.redhat.com/app=backstage-developer-hub -o jsonpath='{.items[*].spec.containers[*].name}' | grep rhoai-normalizer + +# Check all containers in the pod +oc get pods -n fusion-developer-hub -l rhdh.redhat.com/app=backstage-developer-hub -o jsonpath='{.items[0].spec.containers[*].name}' +``` + +### Clean up and redeploy + +```bash +# Uninstall +helm 
uninstall fusion-developer-hub -n fusion-developer-hub + +# Delete namespace (removes all resources) +oc delete namespace fusion-developer-hub + +# Redeploy +helm install fusion-developer-hub \ + ./helm-charts/fusion-developer-hub \ + -n fusion-developer-hub \ + --create-namespace \ + -f examples/quickstart-production-values.yaml \ + --timeout 20m +``` + +## Architecture + +``` +┌─────────────────────────────────────────────────────────────┐ +│ Red Hat Developer Hub Operator │ +│ (Namespace: rhdh-operator) │ +└─────────────────────────────────────────────────────────────┘ + │ + │ manages + ▼ +┌─────────────────────────────────────────────────────────────┐ +│ Developer Hub Instance (3 replicas) │ +│ (Namespace: fusion-hub) │ +│ ┌───────────────────────────────────────────────────────┐ │ +│ │ OpenShift AI Model Connector │ │ +│ │ • Discovers models every 30s │ │ +│ │ • Displays on homepage │ │ +│ └───────────────────────────────────────────────────────┘ │ +└─────────────────────────────────────────────────────────────┘ + │ + │ connects to + ▼ 
+┌─────────────────────────────────────────────────────────────┐ +│ Crunchy PostgreSQL Operator │ +│ (Namespace: postgres-operator) │ +└─────────────────────────────────────────────────────────────┘ + │ + │ manages + ▼ +┌─────────────────────────────────────────────────────────────┐ +│ PostgreSQL HA Cluster (3 instances) │ +│ (Namespace: fusion-hub) │ +│ • Primary (read-write) │ +│ • 2 Replicas (read-only) │ +│ • Automated backups to ODF │ +└─────────────────────────────────────────────────────────────┘ + │ + │ queries + ▼ +┌─────────────────────────────────────────────────────────────┐ +│ OpenShift AI (if installed) │ +│ • KServe InferenceServices │ +│ • Model endpoints │ +│ • Model metadata │ +└─────────────────────────────────────────────────────────────┘ +``` + +## Production considerations + +### High Availability +- ✅ 3 Developer Hub replicas +- ✅ 3 PostgreSQL instances +- ✅ Automatic failover +- ✅ Load balancing + +### Backup & Recovery +- ✅ Daily automated backups +- ✅ 30-day retention +- ✅ Point-in-time recovery +- ✅ Backup to ODF + +### Security +- ✅ Network policies +- ✅ Pod security standards +- ✅ RBAC +- ✅ Secret 
encryption + +### Monitoring +- ✅ Prometheus metrics +- ✅ Service monitors +- ✅ Health checks +- ✅ Operator status + +### Scaling +- ✅ Horizontal pod autoscaling +- ✅ Resource limits +- ✅ Storage expansion +- ✅ Database connection pooling + +## Additional resources + +### Setup and Configuration +- [Complete Setup Guide](SETUP.md) - Comprehensive setup with all prerequisites +- [Production Deployment Guide](docs/README.md) - Advanced configuration options +- [Homepage Customization](docs/homepage-customization.md) - Customize the UI +- [RHOAI Integration](docs/getting-started/rhoai-integration.md) - Deep dive into AI integration + +### AI Platform Components +- [Red Hat OpenShift AI Installation](../../fusion-openshift-ai/docs/01-RHOAI-Installation-Guide.md) - Install RHOAI on Fusion +- [Model Serving Guide](../../fusion-model-serving/README.md) - Deploy and serve AI models +- [GitOps with Argo CD](../../fusion-gitops-argocd/README.md) - GitOps deployment patterns + +### Troubleshooting +- [Troubleshooting Guide](docs/troubleshooting/README.md) - Comprehensive troubleshooting +- [PostgreSQL Issues](docs/troubleshooting/postgresql-troubleshooting.md) - Database troubleshooting +- [Readiness Probe 503 Fix](docs/troubleshooting/READINESS_PROBE_503_FIX.md) - Fix common startup issues + +## Support + +- [GitHub Issues](https://github.com/IBM/storage-fusion/issues) +- [Documentation](../../fusion-developerhub/docs/) +- [Red Hat Developer Hub Docs](https://access.redhat.com/documentation/en-us/red_hat_developer_hub) + +--- + +**Made with ❤️ by the IBM Fusion Team** \ No newline at end of file diff --git a/AI/quickstarts/fusion-developerhub/backstage/mkdocs.yml b/AI/quickstarts/fusion-developerhub/backstage/mkdocs.yml new file mode 100644 index 00000000..67977a92 --- /dev/null +++ b/AI/quickstarts/fusion-developerhub/backstage/mkdocs.yml @@ -0,0 +1,31 @@ +site_name: IBM Fusion Developer Hub +site_description: Enterprise developer portal for IBM Fusion HCI - 
built on Red Hat Developer Hub (Backstage) + +# Point to the IBM storage-fusion repo for documentation +repo_url: https://github.com/IBM/storage-fusion +repo_name: IBM/storage-fusion + +# Point to docs directory +docs_dir: docs + +# Navigation - single quickstart file +nav: + - Home: index.md + +theme: + name: material + palette: + primary: indigo + accent: indigo + +plugins: + - techdocs-core + +# Safe markdown extensions +markdown_extensions: + - admonition + - tables + - toc: + permalink: true + +# Made with Bob diff --git a/AI/quickstarts/fusion-gitops/backstage/catalog-info.yaml b/AI/quickstarts/fusion-gitops/backstage/catalog-info.yaml new file mode 100644 index 00000000..48226948 --- /dev/null +++ b/AI/quickstarts/fusion-gitops/backstage/catalog-info.yaml @@ -0,0 +1,50 @@ +apiVersion: backstage.io/v1alpha1 +kind: Component +metadata: + name: fusion-gitops-quickstart + title: Quickstart - GitOps (ArgoCD) on IBM Fusion + description: | + Deploy a complete GitOps platform on IBM Fusion HCI with Red Hat OpenShift GitOps (ArgoCD), HashiCorp Vault for secret management, and External Secrets Operator for seamless secret synchronization. + + This quickstart provides a production-ready GitOps deployment with operator-based management for automated deployment and day-2 operations, integrated secret management with Vault and External Secrets Operator, high availability configuration for enterprise workloads, and comprehensive validation and diagnostic tooling. The platform enables declarative infrastructure management, automated application deployment, secure secret handling across multiple backends, and GitOps-driven continuous delivery workflows. + tags: + - fusion + - gitops + - argocd + - quickstart + - vault + - secrets + - platform + - openshift + - automation + annotations: + backstage.io/techdocs-ref: dir:. 
+ github.com/project-slug: IBM/storage-fusion + + links: + - url: https://community.ibm.com/community/user/blogs/christo-abraham/2026/05/27/fusion-gitops-quickstart + title: Read on IBM Tech Exchange + icon: article + - url: https://ibm.github.io/storage-fusion/fusion-ai/quickstarts/ + title: View on Fusion Tech Community + icon: web + - url: https://github.com/IBM/storage-fusion/blob/master/AI/quickstarts/fusion-gitops/README.md + title: Complete deployment guide on GitHub + icon: docs + - url: https://github.com/IBM/storage-fusion/tree/master/AI/quickstarts/fusion-gitops + title: View source code + icon: github +spec: + type: quickstart + lifecycle: production + owner: fusion-team + system: fusion-ai-platform + + providesApis: [] + consumesApis: [] + + dependsOn: + - resource:default/openshift + - resource:default/fusion-storage + +# Made with Bob \ No newline at end of file diff --git a/AI/quickstarts/fusion-gitops/backstage/docs/index.md b/AI/quickstarts/fusion-gitops/backstage/docs/index.md new file mode 100644 index 00000000..8d02bd24 --- /dev/null +++ b/AI/quickstarts/fusion-gitops/backstage/docs/index.md @@ -0,0 +1,686 @@ +# Red Hat GitOps Deployment for Fusion HCI + +Deploy a complete GitOps platform on Fusion HCI with Red Hat OpenShift GitOps (ArgoCD), HashiCorp Vault for secret management, and External Secrets Operator for seamless secret synchronization across multiple backends. + +## Table of Contents + +- [Forking the Repository](#forking-the-repository) +- [Overview](#overview) +- [Key Features](#key-features) +- [Architecture](#architecture) +- [Prerequisites](#prerequisites) +- [Getting Started](#getting-started) + - [1. Deploy GitOps](#1-deploy-gitops) + - [2. Deploy Vault (optional)](#2-deploy-vault) + - [3. 
Deploy External Secrets (optional)](#3-deploy-external-secrets) +- [Cleanup](#cleanup) +- [Detailed Guides](#detailed-guides) +- [Project Structure](#project-structure) + +## Forking the Repository + +Before deploying the GitOps platform, fork this repository to your own GitHub account. This is essential for GitOps workflows because it allows you to: + +- **Customize configurations**: Modify Helm values, scripts, and manifests for your environment +- **Track changes**: Maintain version control of your infrastructure configurations +- **Enable GitOps**: Point ArgoCD to your forked repository for continuous deployment +- **Preserve upstream updates**: Easily sync improvements and fixes from the original repository + +### Fork and Clone + +1. **Fork the repository** on GitHub: + - Navigate to the repository: `https://github.com/IBM/Fusion-AI` + - Click the "Fork" button in the top-right corner + - Select your account or organization as the destination + +2. **Clone your forked repository**: + +```bash +# Clone your fork (replace <your-username> with your GitHub username) +git clone https://github.com/<your-username>/Fusion-AI.git +cd Fusion-AI/quickstarts/fusion-gitops + +# Add the original repository as upstream remote +git remote add upstream https://github.com/IBM/Fusion-AI.git + +# Verify remotes +git remote -v +``` + +3. **Create a working branch** (optional, but recommended): + +```bash +# Create and switch to a new branch for your customizations +git checkout -b fusion-gitops-config + +# Make your changes, then commit +git add . 
+git commit -m "Configure GitOps for my environment" +git push origin fusion-gitops-config +``` + +### Keep Your Fork Synchronized + +Periodically sync your fork with the upstream repository to receive updates: + +```bash +# Fetch upstream changes +git fetch upstream + +# Merge upstream changes into your main branch +git checkout main +git merge upstream/main + +# Push updates to your fork +git push origin main +``` + +## Overview + +This quickstart provides a production-ready GitOps platform optimized for Fusion HCI environments. It offers multiple deployment methods to suit different use cases: + +- **Scripts** (`scripts/`): Fast, automated deployment for quick starts and CI/CD pipelines +- **Helm Charts** (`helm/`): Flexible, customizable deployments with values files for different environments +- **Ansible Playbooks** (`ansible/`): Enterprise-grade automation with validation and rollback capabilities + +All components are designed to work together seamlessly while remaining independently deployable and configurable. 
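
Before picking one of the three deployment methods, it can help to see which of the corresponding command-line tools are present on the workstation. The following is a minimal sketch, not part of the quickstart's `scripts/` directory: the function names `tool_status` and `report_methods` are hypothetical, and the pairing of each method with a single CLI (`oc`, `helm`, `ansible-playbook`) is an assumption made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the quickstart): reports whether
# the main CLI assumed for each deployment method is available on PATH.

# Print "<tool>: found" or "<tool>: missing" for a single command name.
tool_status() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

# Assumed mapping of deployment method (directory) to its main CLI tool.
report_methods() {
  echo "scripts/ -> $(tool_status oc)"
  echo "helm/    -> $(tool_status helm)"
  echo "ansible/ -> $(tool_status ansible-playbook)"
}

report_methods
```

The sketch only inspects `PATH`; it does not check tool versions or cluster access, which the verification commands in the Prerequisites section cover.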
+ +## Key Features + +### GitOps Platform +- **Red Hat OpenShift GitOps (ArgoCD)**: Enterprise-grade continuous delivery with declarative GitOps workflows +- **Multi-environment support**: Pre-configured values files for development, staging, and production +- **High availability**: Production configurations with replica sets and persistent storage +- **RBAC integration**: OpenShift authentication and authorization out of the box + +### Secret Management +- **HashiCorp Vault**: Industry-standard secret storage with encryption at rest and in transit +- **Auto-initialization**: Automated unsealing and root token management +- **Persistent storage**: Configurable storage classes for data durability +- **HA deployment**: Multi-replica configurations for production workloads + +### External Secrets Integration +- **HashiCorp Vault backend**: Seamless integration with HashiCorp Vault for secure secret management +- **Automatic synchronization**: Real-time secret updates from external sources +- **ClusterSecretStore**: Centralized secret store configuration + +### Deployment Flexibility +- **Script-based deployment**: One-command installation for rapid setup +- **Helm charts**: Customizable deployments with environment-specific values +- **Ansible automation**: Idempotent playbooks with pre-flight checks and validation + +## Architecture + +``` +┌──────────────────────────────────────────────────────────┐ +│ Fusion HCI Cluster │ +│ │ +│ ┌────────────────────────────────────────────────────┐ │ +│ │ Red Hat OpenShift GitOps (ArgoCD) │ │ +│ ├────────────────────────────────────────────────────┤ │ +│ │ • Continuous 
Delivery │ │ +│ │ • Application Lifecycle Management │ │ +│ └────────────────────────────────────────────────────┘ │ +│ │ │ +│ ▼ │ +│ ┌────────────────────────────────────────────────────┐ │ +│ │ HashiCorp Vault │ │ +│ ├────────────────────────────────────────────────────┤ │ +│ │ • Secret Storage & Encryption │ │ +│ │ • High Availability (Raft) │ │ +│ └────────────────────────────────────────────────────┘ │ +│ │ │ +│ ▼ │ +│ ┌────────────────────────────────────────────────────┐ │ +│ │ External Secrets Operator │ │ +│ ├────────────────────────────────────────────────────┤ │ +│ │ • Automatic Secret Synchronization │ │ +│ └────────────────────────────────────────────────────┘ │ +│ │ │ +│ ▼ │ +│ ┌────────────────────────────────────────────────────┐ │ +│ │ Application Workloads │ │ +│ ├────────────────────────────────────────────────────┤ │ +│ │ • AI/ML Pipelines │ │ 
+│ │ • Microservices │ │ +│ │ • Data Processing │ │ +│ │ • Multi-Tenant Applications │ │ +│ └────────────────────────────────────────────────────┘ │ +└──────────────────────────────────────────────────────────┘ +``` + +## Prerequisites + +### Required +- **Fusion HCI Cluster**: OpenShift 4.20+ or Kubernetes 1.27+ running on Fusion HCI +- **Cluster Access**: Cluster admin privileges for operator installation +- **CLI Tools**: + - `oc` (OpenShift CLI) or `kubectl` configured and authenticated + - `helm` 3.12+ for chart deployments +- **Storage**: At least one StorageClass available for persistent volumes + - Recommended: `ocs-storagecluster-ceph-rbd` (OpenShift Data Foundation) + - Minimum: 10Gi available storage per component + +### Optional +- **Ansible**: Version 2.15+ (or ansible-core 2.15+) for playbook-based automation +- **Git**: For GitOps repository management +- **jq**: For JSON parsing in scripts (auto-installed if missing) + +### Verification + +Verify your environment before deployment: + +```bash +# Check cluster access +oc whoami +oc version + +# Verify Helm installation +helm version + +# List available storage classes +oc get storageclass + +# Check available storage +oc get pv +``` + +## Getting Started + +All commands should be run from the `quickstarts/fusion-gitops` directory. Each component can be deployed independently or as part of a complete stack. + +### Deployment Order + +1. **GitOps** (required): Core platform for continuous delivery +2. **Vault** (optional): Secret management backend +3. **External Secrets** (optional): Secret synchronization layer + +### 1. 
Deploy GitOps + +Deploy Red Hat OpenShift GitOps (ArgoCD) as the foundation for your GitOps platform: + +```bash +# Navigate to the quickstart directory +cd quickstarts/fusion-gitops + +# Verify prerequisites +oc whoami +helm version +oc get storageclass + +# Deploy with default configuration +./scripts/deploy-gitops.sh + +# Wait for deployment to complete +oc get pods -n openshift-gitops -w + +# Get ArgoCD server URL +oc get route openshift-gitops-server -n openshift-gitops -o jsonpath='{.spec.host}' && echo + +# Get admin password +oc get secret openshift-gitops-cluster -n openshift-gitops \ + -o jsonpath='{.data.admin\.password}' | base64 -d && echo +``` + +#### Environment-Specific Deployments + +Choose the appropriate values file for your environment: + +```bash +# Development/Testing (minimal resources) +./scripts/deploy-gitops.sh -f helm/fusion-gitops/values-minimal.yaml + +# Production (HA with persistent storage) +./scripts/deploy-gitops.sh -f helm/fusion-gitops/values-production.yaml + +# OpenShift Data Foundation storage +./scripts/deploy-gitops.sh -f helm/fusion-gitops/values-odf.yaml +``` + +#### Validation + +Run the comprehensive validation script to verify your GitOps deployment: + +```bash +# Run validation script +./scripts/validate-gitops.sh + +# Run with verbose output for detailed diagnostics +./scripts/validate-gitops.sh --verbose + +# Specify custom namespace +./scripts/validate-gitops.sh --namespace my-gitops +``` + +The validation script performs 12 comprehensive checks: + +1. **GitOps Operator Subscription**: Verifies operator is subscribed and at latest version +2. **Operator ClusterServiceVersion**: Confirms CSV is in "Succeeded" phase +3. **Operator Pods**: Validates operator controller manager is running +4. **ArgoCD Instance**: Checks ArgoCD CR exists and is "Available" +5. **ArgoCD Pods**: Validates all components (server, repo-server, application-controller, redis-ha, dex-server, applicationset-controller) +6. 
**ArgoCD Services**: Verifies key services are created and accessible +7. **ArgoCD Route/Ingress**: Confirms external access is configured +8. **ArgoCD Server Health**: Tests server health endpoint +9. **ArgoCD Applications**: Lists deployed applications and their sync status +10. **ArgoCD AppProjects**: Shows configured AppProjects with source repos and destinations +11. **ArgoCD Cluster Connections**: Lists external clusters connected to ArgoCD +12. **Pod Logs**: Scans for critical errors in server logs + +**Manual Verification** (if needed): + +```bash +# Check operator status +oc get csv -n openshift-gitops-operator + +# Verify ArgoCD instance +oc get argocd -n openshift-gitops + +# Check pod status +oc get pods -n openshift-gitops + +# Get ArgoCD route +oc get route openshift-gitops-server -n openshift-gitops +``` + +📖 **Detailed guide**: [docs/deploying-gitops-guide.md](docs/deploying-gitops-guide.md) + +### 2. Deploy Vault + +Deploy HashiCorp Vault for centralized secret management (optional, but recommended): + +```bash +# Navigate to the quickstart directory +cd quickstarts/fusion-gitops + +# Verify prerequisites +oc whoami +helm version +oc get storageclass + +# Deploy with default configuration (3 replicas, 10Gi storage) +./scripts/deploy-secret-manager.sh + +# Wait for Vault to initialize +oc get pods -n vault -w + +# Get root token (store securely) +oc get secret vault-unseal-keys -n vault \ + -o jsonpath='{.data.root-token}' | base64 -d && echo + +# Get unseal keys (store securely) +oc get secret vault-unseal-keys -n vault -o yaml +``` + +#### Custom Configurations + +```bash +# Use specific storage class +./scripts/deploy-secret-manager.sh --storage-class ocs-storagecluster-ceph-rbd + +# Production deployment with HA +./scripts/deploy-secret-manager.sh --namespace vault-prod --replicas 5 --size 20Gi + +# Custom namespace with specific storage +./scripts/deploy-secret-manager.sh \ + --namespace vault-staging \ + --storage-class thin \ + --size 15Gi 
+``` + +#### Validation + +Run the comprehensive validation script to verify your Vault deployment: + +```bash +# Run validation script +./scripts/validate-secret-manager.sh + +# Run with verbose output for detailed diagnostics +./scripts/validate-secret-manager.sh --verbose + +# Specify custom namespace +./scripts/validate-secret-manager.sh --namespace vault-prod +``` + +The validation script performs 12 comprehensive checks: + +1. **StatefulSet Status**: Verifies Vault StatefulSet exists and is configured correctly +2. **Pod Health**: Checks all Vault pods are running and ready +3. **Service Availability**: Validates internal and external services are accessible +4. **Initialization Status**: Confirms Vault is initialized +5. **Seal Status**: Verifies Vault is unsealed and operational +6. **Unseal Keys Secret**: Ensures unseal keys are stored securely +7. **Storage**: Validates persistent volume claims are bound +8. **Route/Ingress**: Checks external access configuration +9. **Pod Logs**: Scans for RBAC errors, service registration issues, and other problems +10. **Vault Configuration**: Validates vault.hcl configuration (node_id, cluster_address, retry_join) +11. **Raft Cluster Status**: Checks Raft peer count, node IDs, and leader election +12. 
**Init Container Status**: Verifies config-init container completed successfully + +**Manual Verification** (if needed): + +```bash +# Check Vault operator +oc get csv -n vault + +# Check pod status +oc get pods -n vault + +# Test Vault connectivity +oc exec -n vault vault-0 -- vault status + +# Get Vault route (if exposed) +oc get route -n vault +``` + +#### Troubleshooting Multi-Replica Deployments + +If you encounter issues with multi-replica Vault deployments (followers not joining the Raft cluster), use the diagnostic script: + +```bash +# Run Raft diagnostics +./scripts/diagnose-vault-raft.sh -n vault + +# Run with verbose output +./scripts/diagnose-vault-raft.sh -n vault --verbose +``` + +The diagnostic script checks: +- Pod status and health +- Leader (vault-0) initialization and seal status +- Vault configuration (retry_join, node_id) +- DNS resolution from follower pods +- HTTP connectivity to leader +- PVC status and Raft data directories +- Recent logs for error patterns + +#### Important Security Notes + +⚠️ **Critical**: The root token and unseal keys are stored in the `vault-unseal-keys` secret. Back them up securely and remove them from the cluster in production: + +```bash +# Export unseal keys and root token +oc get secret vault-unseal-keys -n vault -o yaml > vault-keys-backup.yaml + +# Store in secure location (password manager, hardware security module, etc.) + +# Optional: Remove from cluster after backing up (production only) +# oc delete secret vault-unseal-keys -n vault +``` + +📖 **Detailed guide**: [docs/deploying-vault-guide.md](docs/deploying-vault-guide.md) + +### 3. 
Deploy External Secrets + +Deploy External Secrets Operator to synchronize secrets from external backends (optional): + +```bash +# Navigate to the quickstart directory +cd quickstarts/fusion-gitops + +# Verify prerequisites +oc whoami +helm version + +# Deploy operator only (no backend configuration) +./scripts/deploy-external-secrets.sh --standalone + +# Wait for operator to be ready +oc get csv -n external-secrets-operator -w + +# Verify operator installation +oc get csv -n external-secrets-operator +oc get pods -n external-secrets-operator +``` + +#### Backend Configuration + +After deploying the operator, configure the secret backend: + +##### HashiCorp Vault Backend + +```bash +# Configure Vault as secret backend (requires Vault deployed) +./scripts/deploy-external-secrets.sh --backend vault + +# Verify ClusterSecretStore +oc get clustersecretstore vault-backend +``` + +#### Validation + +Run the comprehensive validation script to verify your External Secrets Operator deployment: + +```bash +# Run validation script +./scripts/validate-external-secrets.sh + +# Run with verbose output for detailed diagnostics +./scripts/validate-external-secrets.sh --verbose + +# Specify custom namespace +./scripts/validate-external-secrets.sh --namespace my-external-secrets + +# Validate with backend connectivity check +./scripts/validate-external-secrets.sh --backend vault +``` + +The validation script performs 14 comprehensive checks: + +1. **Namespace**: Verifies operator namespace exists +2. **Operator Subscription**: Checks subscription is created and active +3. **ClusterServiceVersion (CSV)**: Confirms CSV is in "Succeeded" phase +4. **Operator Pods**: Validates operator pods are running and ready +5. **Webhook Pods**: Checks webhook pods are operational +6. **Cert Controller Pods**: Verifies cert-controller pods are running +7. **CRDs Installed**: Ensures all required CRDs are established (ClusterSecretStore, ExternalSecret, SecretStore, ClusterExternalSecret) +8. 
**ClusterSecretStores**: Lists configured ClusterSecretStores and their status +9. **SecretStores**: Shows namespace-scoped SecretStores +10. **Configured Secret Backends**: Automatically detects and displays details of the configured HashiCorp Vault backend, including server URL, authentication method, and connection status +11. **ExternalSecrets**: Validates ExternalSecret resources and sync status +12. **Service Account**: Confirms operator service account exists +13. **RBAC Configuration**: Checks ClusterRole and ClusterRoleBinding +14. **Backend Connectivity** (optional): Tests connectivity to the HashiCorp Vault backend when the `--backend` flag is used + +**Manual Verification** (if needed): + +```bash +# Check operator status +oc get csv -n external-secrets-operator + +# List all secret stores +oc get clustersecretstore + +# Check operator logs +oc logs -n external-secrets-operator -l app=external-secrets-operator --tail=50 +``` + +📖 **Detailed guide**: [docs/deploying-external-secrets-guide.md](docs/deploying-external-secrets-guide.md) + +## Cleanup + +Remove deployed components safely using the provided cleanup scripts. Always clean up in reverse order of deployment. + +### Cleanup Order + +1. External Secrets Operator (if deployed) +2. Vault (if deployed) +3. 
GitOps (if no longer needed) + +### Cleanup External Secrets + +```bash +cd quickstarts/fusion-gitops + +# Interactive cleanup (prompts for confirmation) +./scripts/cleanup-external-secrets.sh + +# Force cleanup without prompts +./scripts/cleanup-external-secrets.sh --force + +# Keep namespace after cleanup +./scripts/cleanup-external-secrets.sh --keep-namespace + +# Custom namespace +./scripts/cleanup-external-secrets.sh --namespace my-external-secrets +``` + +**Options**: +- `--namespace <namespace>`: Specify operator namespace (default: `external-secrets-operator`) +- `--keep-namespace`: Keep the namespace after cleanup +- `--force`: Skip confirmation prompts +- `--help`: Show usage information + +### Cleanup Vault + +⚠️ **Warning**: This will delete all secrets stored in Vault. Ensure you have backups! + +```bash +cd quickstarts/fusion-gitops + +# Interactive cleanup (prompts for confirmation) +./scripts/cleanup-secret-manager.sh + +# Force cleanup without prompts +./scripts/cleanup-secret-manager.sh --force + +# Keep operator installed +./scripts/cleanup-secret-manager.sh --keep-operator + +# Keep namespace after cleanup +./scripts/cleanup-secret-manager.sh --keep-namespace + +# Custom namespace +./scripts/cleanup-secret-manager.sh --namespace vault-prod +``` + +**Options**: +- `--namespace <namespace>`: Specify Vault namespace (default: `vault`) +- `--keep-operator`: Keep the Vault operator installed +- `--keep-namespace`: Keep the namespace after cleanup +- `--force`: Skip confirmation prompts +- `--help`: Show usage information + +### Cleanup GitOps + +⚠️ **Warning**: This will remove ArgoCD and all managed applications! 
+ +```bash +cd quickstarts/fusion-gitops + +# Interactive cleanup (prompts for confirmation) +./scripts/cleanup-gitops.sh + +# Dry run (show what would be deleted) +./scripts/cleanup-gitops.sh --dry-run + +# Force cleanup without prompts +./scripts/cleanup-gitops.sh --force + +# Keep operator, remove instances only +./scripts/cleanup-gitops.sh --keep-operator + +# Keep namespace after cleanup +./scripts/cleanup-gitops.sh --keep-namespace +``` + +**Options**: +- `--keep-operator`: Keep operator installed, remove instances only +- `--keep-namespace`: Don't delete namespaces +- `--force`: Skip confirmation prompts +- `--dry-run`: Show what would be done without doing it +- `--help`: Show usage information + +### Complete Cleanup + +Remove all components in the correct order: + +```bash +cd quickstarts/fusion-gitops + +# Clean up everything (with prompts) +./scripts/cleanup-external-secrets.sh +./scripts/cleanup-secret-manager.sh +./scripts/cleanup-gitops.sh + +# Force cleanup of everything +./scripts/cleanup-external-secrets.sh --force +./scripts/cleanup-secret-manager.sh --force +./scripts/cleanup-gitops.sh --force +``` + +## Detailed Guides + +For in-depth information, architecture details, troubleshooting, and advanced configurations, refer to the detailed guides: + +### Component Guides +- [**GitOps Deployment Guide**](docs/deploying-gitops-guide.md) + - Advanced configuration options + - Troubleshooting common issues + +- [**HashiCorp Vault Deployment Guide**](docs/deploying-vault-guide.md) + - High availability configuration + - Security best practices + +- [**External Secrets Deployment Guide**](docs/deploying-external-secrets-guide.md) + - Backend configuration + - Troubleshooting sync issues + - Security considerations + +## Project Structure + +```text +quickstarts/fusion-gitops/ +├── README.md # Main documentation and quickstart guide +├── ansible/ # Ansible automation for deployment workflows +│ ├── ansible.cfg # Ansible configuration 
settings +│ ├── requirements.yml # Ansible Galaxy collection dependencies +│ ├── inventory/ +│ │ └── localhost # Local inventory for Ansible execution +│ ├── playbooks/ +│ │ ├── deploy.yml # Main deployment playbook for all components +│ │ └── initialize-vault.yml # Vault initialization and unsealing playbook +│ └── roles/ +│ ├── configuration/ # Configuration management role +│ ├── helm-deploy/ # Helm chart deployment automation role +│ ├── preflight/ # Pre-deployment validation checks role +│ └── validation/ # Post-deployment validation role +├── docs/ +│ ├── deploying-gitops-guide.md # Detailed GitOps deployment guide with architecture +│ ├── deploying-vault-guide.md # Detailed Vault deployment and configuration guide +│ └── deploying-external-secrets-guide.md # Detailed External Secrets Operator deployment guide +├── helm/ +│ ├── fusion-gitops/ +│ │ ├── Chart.yaml # Helm chart metadata for GitOps deployment +│ │ ├── values.yaml # Default values for GitOps chart +│ │ ├── values-minimal.yaml # Minimal configuration for development environments +│ │ ├── values-odf.yaml # OpenShift Data Foundation storage configuration +│ │ ├── values-production.yaml # Production-ready configuration with HA +│ │ └── templates/ # Kubernetes resource templates for GitOps +│ ├── vault-operator/ +│ │ ├── Chart.yaml # Helm chart metadata for Vault deployment +│ │ ├── values.yaml # Default values for Vault chart +│ │ ├── values-standalone-example.yaml # Example standalone Vault configuration +│ │ └── templates/ # Kubernetes resource templates for Vault +│ └── external-secrets-operator/ +│ ├── Chart.yaml # Helm chart metadata for External Secrets +│ ├── values.yaml # Default values for External Secrets chart +│ ├── values-standalone.yaml # Standalone operator 
deployment configuration +β”‚ β”œβ”€β”€ examples/ # Example configurations for different backends +β”‚ └── templates/ # Kubernetes resource templates for External Secrets +└── scripts/ + β”œβ”€β”€ deploy-gitops.sh # Script to deploy Red Hat OpenShift GitOps + β”œβ”€β”€ deploy-secret-manager.sh # Script to deploy HashiCorp Vault + β”œβ”€β”€ deploy-external-secrets.sh # Script to deploy External Secrets Operator + β”œβ”€β”€ validate-gitops.sh # Comprehensive GitOps deployment validation script + β”œβ”€β”€ validate-secret-manager.sh # Comprehensive Vault deployment validation script + β”œβ”€β”€ validate-external-secrets.sh # Comprehensive External Secrets validation script + β”œβ”€β”€ diagnose-vault-raft.sh # Deep troubleshooting tool for Vault Raft cluster issues + β”œβ”€β”€ unseal-secret-manager.sh # Script to unseal Vault instances + β”œβ”€β”€ cleanup-gitops.sh # Script to remove GitOps components + β”œβ”€β”€ cleanup-secret-manager.sh # Script to remove Vault components + β”œβ”€β”€ cleanup-external-secrets.sh # Script to remove External Secrets components + └── lib/ + └── common.sh # Shared utility functions for all scripts +``` diff --git a/AI/quickstarts/fusion-gitops/backstage/mkdocs.yml b/AI/quickstarts/fusion-gitops/backstage/mkdocs.yml new file mode 100644 index 00000000..5a0c8da2 --- /dev/null +++ b/AI/quickstarts/fusion-gitops/backstage/mkdocs.yml @@ -0,0 +1,31 @@ +site_name: Fusion GitOps Platform +site_description: Deploy Red Hat OpenShift GitOps with Vault and External Secrets on IBM Fusion HCI + +# Point to the IBM storage-fusion repo for documentation +repo_url: https://github.com/IBM/storage-fusion +repo_name: IBM/storage-fusion + +# Point to docs directory +docs_dir: docs + +# Navigation +nav: + - Home: index.md + +theme: + name: material + palette: + primary: indigo + accent: indigo + +plugins: + - techdocs-core + +# Safe markdown extensions +markdown_extensions: + - admonition + - tables + - toc: + permalink: true + +# Made with Bob \ No newline at 
end of file diff --git a/AI/quickstarts/model-as-a-service/backstage/catalog-info.yaml b/AI/quickstarts/model-as-a-service/backstage/catalog-info.yaml new file mode 100644 index 00000000..61e0f4be --- /dev/null +++ b/AI/quickstarts/model-as-a-service/backstage/catalog-info.yaml @@ -0,0 +1,52 @@ +apiVersion: backstage.io/v1alpha1 +kind: Component +metadata: + name: model-as-a-service-quickstart + title: Quickstart - Model as a Service on IBM Fusion + description: | + Deploy a complete Model-as-a-Service (MaaS) platform on IBM Fusion using Red Hat OpenShift AI for secure, scalable, and governed foundation model hosting within your own infrastructure. + + This quickstart demonstrates how to deploy model serving, storage, registry, gateway, and observability components into a unified deployment workflow for private AI inference on OpenShift. The platform provides Red Hat OpenShift AI platform services, IBM Fusion object storage integration, Model Registry services, gateway-based API routing and exposure, GPU-backed inference runtimes using LLM-D, and monitoring and observability components. Together, these components provide a baseline environment for running and managing enterprise AI inference services on OpenShift. + tags: + - fusion + - model-serving + - maas + - quickstart + - ai + - openshift-ai + - gpu + - llm + - inference + - platform + annotations: + backstage.io/techdocs-ref: dir:. 
+ github.com/project-slug: IBM/storage-fusion + + links: + - url: https://community.ibm.com/community/user/blogs/harichandana-kotha/2026/05/27/quickstart-model-as-a-service-on-ibm-fusion + title: Read on IBM Tech Exchange + icon: article + - url: https://ibm.github.io/storage-fusion/fusion-ai/quickstarts/ + title: View on Fusion Tech Community + icon: web + - url: https://github.com/IBM/storage-fusion/blob/master/AI/quickstarts/model-as-a-service/README.md + title: Complete deployment guide on GitHub + icon: docs + - url: https://github.com/IBM/storage-fusion/tree/master/AI/quickstarts/model-as-a-service + title: View source code + icon: github +spec: + type: quickstart + lifecycle: production + owner: fusion-team + system: fusion-ai-platform + + providesApis: [] + consumesApis: [] + + dependsOn: + - resource:default/openshift + - resource:default/fusion-storage + - resource:default/nvidia-gpu + +# Made with Bob \ No newline at end of file diff --git a/AI/quickstarts/model-as-a-service/backstage/docs/index.md b/AI/quickstarts/model-as-a-service/backstage/docs/index.md new file mode 100644 index 00000000..2fe67ae9 --- /dev/null +++ b/AI/quickstarts/model-as-a-service/backstage/docs/index.md @@ -0,0 +1,751 @@ +# Quickstart: Model as a Service on IBM Fusion + +Organizations adopting generative AI often need a secure, scalable, and governed way to host foundation models within their own infrastructure. Running models internally helps teams maintain control over data, access, performance, compliance, and operational costs while creating a consistent platform for enterprise AI workloads. + +This quickstart demonstrates how to deploy a Model-as-a-Service (MaaS) platform on IBM Fusion using Red Hat OpenShift AI. The guide brings together model serving, storage, registry, gateway, and observability components into a unified deployment workflow for private AI inference on OpenShift. 
+ +IBM Fusion provides the storage foundation for model artifacts and platform data, while Red Hat OpenShift AI delivers model lifecycle, registry, and inference capabilities. Together, the platform components and Helm-based automation provide a streamlined foundation for deploying private AI inference workloads on OpenShift. + +This quickstart provides a reference implementation for deploying private AI inference services on OpenShift using IBM Fusion and Red Hat OpenShift AI. The repository integrates storage, model management, inference, gateway access, and observability into a cohesive operational environment. + +This guide is intended for platform engineers and AI infrastructure teams building internal AI serving platforms. By following it, you will: + +- Deploy the core OpenShift AI platform components and required services +- Configure storage and Model Registry integration +- Import and register models from the Model Catalog +- Deploy GPU-backed inference services using LLM-D inference +- Expose models through secure OpenShift gateway endpoints +- Validate deployment health and test inference access with sample requests + +By the end of this guide, you will have a working MaaS environment capable of serving foundation models such as `gpt-oss-20b` through secure, OpenShift-native inference APIs on IBM Fusion. + +--- + +## What This Quickstart Deploys + +This quickstart deploys a complete Model-as-a-Service (MaaS) environment on IBM Fusion and Red Hat OpenShift AI for enterprise AI model serving. + +The deployment includes: + +- Red Hat OpenShift AI platform services +- IBM Fusion object storage integration +- Model Registry services +- Gateway-based API routing and exposure +- GPU-backed inference runtimes using LLM-D +- Monitoring and observability components +- Example model deployment configurations + +Together, these components provide a baseline environment for running and managing enterprise AI inference services on OpenShift. 
+
+---
+
+## IBM Fusion for AI Architecture - MaaS Platform
+
+```
+┌────────────────────────────────────────────────────────────────┐
+│               IBM Fusion for AI - MaaS Platform                │
+├────────────────────────────────────────────────────────────────┤
+│                                                                │
+│  ┌──────────────────────────────────────────────────────────┐  │
+│  │               MaaS Runtime Infrastructure                │  │
+│  │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌────────┐    │  │
+│  │  │ Gateway  │  │   Rate   │  │   Auth   │  │ Model  │    │  │
+│  │  │   API    │  │ Limiting │  │(Keycloak)│  │Catalog │    │  │
+│  │  └──────────┘  └──────────┘  └──────────┘  └────────┘    │  │
+│  │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌────────┐    │  │
+│  │  │  Model   │  │Monitoring│  │ Grafana  │  │  Tier  │    │  │
+│  │  │ Registry │  │Prometheus│  │Dashboards│  │ Groups │    │  │
+│  │  └──────────┘  └──────────┘  └──────────┘  └────────┘    │  │
+│  └──────────────────────────────────────────────────────────┘  │
+│                               │                                │
+│                               ▼                                │
+│  ┌──────────────────────────────────────────────────────────┐  │
+│  │             IBM Fusion Object Storage Layer              │  │
+│  │            (OpenShift Data Foundation - ODF)             │  │
+│  │  ┌──────────────┐  ┌──────────────┐  ┌─────────────┐     │  │
+│  │  │    Model     │  │  Workbench   │  │    Model    │     │  │
+│  │  │  Artifacts   │  │     Data     │  │  Registry   │     │  │
+│  │  │   Storage    │  │   Storage    │  │   Backend   │     │  │
+│  │  └──────────────┘  └──────────────┘  └─────────────┘     │  │
+│  │  • Auto-provisioned buckets via ObjectBucketClaim        │  │
+│  │  • Zero-configuration credential management              │  │
+│  │  • Enterprise-grade performance and reliability          │  │
+│  └──────────────────────────────────────────────────────────┘  │
+│                               │                                │
+│                               ▼                                │
+│  ┌──────────────────────────────────────────────────────────┐  │
+│  │               AI Model Inference Services                │  │
+│  │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌────────┐    │  │
+│  │  │ Model A  │  │ Model B  │  │ Model C  │  │Model D │    │  │
+│  │  │All Tiers │  │Premium+  │  │Enterprise│  │Custom  │    │  │
+│  │  │  (vLLM)  │  │  (TGI)   │  │  (vLLM)  │  │(Custom)│    │  │
+│  │  └──────────┘  └──────────┘  └──────────┘  └────────┘    │  │
+│  └──────────────────────────────────────────────────────────┘  │
+│                               │                                │
+│                               ▼                                │
+│  ┌──────────────────────────────────────────────────────────┐  │
+│  │                     AI Applications                      │  │
+│  │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌─────────┐   │  │
+│  │  │   Code   │  │ Chatbot  │  │  Custom  │  │  Data   │   │  │
+│  │  │Assistant │  │    UI    │  │   Apps   │  │Science  │   │  │
+│  │  │DevSpaces │  │          │  │          │  │Workbench│   │  │
+│  │  └──────────┘  └──────────┘  └──────────┘  └─────────┘   │  │
+│  └──────────────────────────────────────────────────────────┘  │
+└────────────────────────────────────────────────────────────────┘
+```
+
+
+## Table of Contents
+
+This document is organized into deployment, validation, and reference sections:
+
+- [What You'll Build](#what-youll-build)
+- [Prerequisites](#prerequisites)
+- [Quick Start](#quick-start)
+- [What's Deployed](#whats-deployed)
+- [Learn More](#learn-more)
+- [IBM Fusion for AI Architecture](#ibm-fusion-for-ai-architecture)
+- [Key Features](#key-features)
+- [Project Structure](#project-structure)
+- [Use Cases](#use-cases)
+- [Helm Charts Overview](#helm-charts-overview)
+- [Chart Dependencies](#chart-dependencies)
+- [Documentation](#documentation)
+- [Benefits](#benefits)
+
+---
+
+## What You'll Build
+
+By completing this
quickstart, you will deploy a Red Hat OpenShift AI environment integrated with IBM Fusion storage services and a sample model-serving path. The resulting environment includes the core AI platform operators, a centralized model registry, gateway-based routing, GPU-backed inference services, monitoring integration, and storage for workbench users. + +This quickstart is intended to help platform engineers and AI infrastructure teams understand how the repository components fit together and how to stand up the baseline services needed for model onboarding and inference delivery. + +--- + +## Prerequisites + +Before you begin, verify that the target environment satisfies the platform, GPU, CLI, and storage requirements listed below. + +### Required + +- **Red Hat OpenShift 4.20+** with cluster-admin access +- **GPU nodes** with at least one NVIDIA GPU-capable worker +- **Helm 3.8+** - [Install Helm](https://helm.sh/docs/intro/install/) +- **OpenShift CLI (`oc`)** - [Install oc](https://docs.openshift.com/container-platform/latest/cli_reference/openshift_cli/getting-started-cli.html) + +### GPU Enablement (Required for LLM Serving) +If serving GPU-backed models such as vLLM-based LLMs, the following components must be installed: +- Node Feature Discovery (NFD) for hardware detection +- NVIDIA GPU Operator +- Worker nodes automatically labelled by the NVIDIA GPU Operator (for example: nvidia.com/gpu.present=true) + +Verify GPU availability: +```bash +oc describe node | grep -i gpu +``` +If GPUs are not detected, ensure the NVIDIA drivers and operator are correctly installed. 
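
A label match or a `grep` hit shows that GPUs are visible, but not how many the scheduler can actually allocate. As a minimal sketch (not part of the quickstart scripts), the helper below sums the allocatable `nvidia.com/gpu` count per node. The function name `sum_allocatable_gpus` is hypothetical, and the two-column input format assumes the `custom-columns` invocation shown in the comment.

```bash
#!/usr/bin/env bash
# Hypothetical helper: sum allocatable GPUs from "NAME GPU" lines on stdin.
# Nodes without GPUs print "<none>" in the GPU column and are skipped.
sum_allocatable_gpus() {
  awk '$2 ~ /^[0-9]+$/ { total += $2 } END { print total + 0 }'
}

# Intended usage against a live cluster (requires a logged-in `oc` session):
#   oc get nodes --no-headers \
#     -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu' \
#     | sum_allocatable_gpus
```

A result of `0` suggests the GPU Operator has not yet advertised any devices, which is worth resolving before deploying a model.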
+
+
+### Storage Options
+
+Choose one supported S3-compatible storage backend for registry artifacts and related data:
+
+- **OpenShift Data Foundation (ODF)**, which provides S3-compatible object storage through NooBaa
+- **AWS S3 or other S3-compatible storage** such as MinIO or Ceph
+
+### Verify Your Environment
+
+```bash
+# Check OpenShift version (should be 4.20+)
+oc version
+
+# Verify cluster-admin access
+oc auth can-i '*' '*' --all-namespaces
+
+# Check GPU availability (should show at least 1 node)
+oc get nodes -l nvidia.com/gpu.present=true
+
+# Verify Helm installation
+helm version
+```
+
+---
+
+## Deploy MaaS
+
+The following procedure walks through repository access, storage credential setup, platform installation, model deployment, and endpoint validation.
+
+### Step 1: Fork and Clone the Storage Fusion Repository
+The quickstart examples reference configurations from the storage-fusion repository. Fork this repository to your GitHub account and clone it locally.
+
+Fork the repository: fork the [storage-fusion repository](https://github.com/IBM/storage-fusion) to your GitHub account.
+
+Clone the forked copy of this repository:
+```
+git clone git@github.com:<your-github-username>/storage-fusion.git
+cd storage-fusion/quickstarts/model-as-a-service
+```
+Note: The `quickstarts/model-as-a-service` directory is located under the `AI/` parent directory within the storage-fusion repository (path: `storage-fusion/AI/quickstarts/model-as-a-service`).
+
+### Step 2: Configure Storage Credentials
+
+Select the storage backend that matches your environment and export the corresponding credentials before starting the deployment.
+
+**For OpenShift Data Foundation (ODF) Object Storage:**
+
+IBM Fusion Data Foundation Object Storage can be configured in two ways:
+
+1. **Automatic Bucket Creation (Recommended)** - When using OpenShift Data Foundation (ODF) with IBM Fusion, the platform automatically creates and configures object storage buckets using ObjectBucketClaim. This is the default configuration and **does not require manual credentials**.
+
+   ```yaml
+   # In your values.yaml (e.g., examples/Fusion-Agentic-Assistance-Platform/values.yaml)
+   modelRegistry:
+     objectStorage:
+       enabled: true
+       autoCreateBucket: true  # Automatic provisioning
+       odfStorageClass: openshift-storage.noobaa.io
+       bucketClass: noobaa-default-bucket-class
+   ```
+
+   With this configuration, credentials are automatically extracted from the ObjectBucketClaim and no manual setup is needed.
+
+2. **Manual Credentials** - If you're using IBM Fusion Object Storage without ODF or prefer manual configuration, you need to provide credentials:
+
+   ```bash
+   export IBM_ACCESS_KEY="your-access-key-id"
+   export IBM_SECRET_KEY="your-secret-access-key"
+   export IBM_ENDPOINT="https://s3.us-south.cloud-object-storage.appdomain.cloud"
+   ```
+
+   Then configure in your values.yaml:
+   ```yaml
+   modelRegistry:
+     objectStorage:
+       enabled: true
+       autoCreateBucket: false  # Manual configuration
+       accessKeyId: "${IBM_ACCESS_KEY}"
+       secretAccessKey: "${IBM_SECRET_KEY}"
+       endpoint: "${IBM_ENDPOINT}"
+       bucket: "model-registry-artifacts"
+   ```
+
+**Note:** The quickstart examples use automatic bucket creation by default. Manual credentials are only required when `autoCreateBucket: false` or when using external S3-compatible storage outside of IBM Fusion/ODF.
+
+**For AWS S3:**
+```bash
+export AWS_ACCESS_KEY_ID="your-aws-access-key"
+export AWS_SECRET_ACCESS_KEY="your-aws-secret-key"
+export AWS_REGION="us-east-1"
+```
+
+**For MinIO (Development):**
+```bash
+export MINIO_ENDPOINT="http://minio.example.com:9000"
+export MINIO_ACCESS_KEY="minioadmin"
+export MINIO_SECRET_KEY="minioadmin"
+```
+
+### Step 3: Deploy the MaaS Platform
+
+Use the automated installation script to deploy the operators, platform services, and runtime components in sequence. This step provisions the baseline infrastructure required for model registration, gateway exposure, and workbench storage.
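
When manual credentials are in use, it can help to confirm that the variables from Step 2 are actually exported before starting the installer, since an unexported variable is invisible to child processes. The following is a minimal sketch under that assumption; `require_env` is a hypothetical helper and is not part of the repository's scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report every named variable that is not exported
# (or is empty). Returns 0 when all are present, 1 otherwise.
require_env() {
  local missing=0 name
  for name in "$@"; do
    # printenv only sees exported variables, which is what child processes get.
    if [ -z "$(printenv "$name")" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example for the manual IBM Fusion Object Storage path from Step 2:
#   require_env IBM_ACCESS_KEY IBM_SECRET_KEY IBM_ENDPOINT || exit 1
```

With automatic bucket creation no variables are needed and this check can be skipped. The installation command itself follows.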
+
+```bash
+# Deploy operators, platform, and runtime infrastructure in one command
+./quickstarts/model-as-a-service/scripts/install-runtime.sh \
+  quickstarts/model-as-a-service/examples/Fusion-Agentic-Assistance-Platform/values.yaml
+
+# The script will:
+# 1. Install maas-operators (OpenShift AI, Kuadrant, cert-manager)
+# 2. Bootstrap maas-platform (DataScienceCluster, Gateway API)
+# 3. Deploy maas-runtime (Model Registry, Workbench Storage, Gateway)
+```
+
+Expected output:
+```bash
+ % ./quickstarts/mode-as-a-service/scripts/install-runtime.sh quickstarts/mode-as-a-service/examples/Fusion-Agentic-Assistance-Platform/values.yaml
+=== MaaS Runtime Installation ===
+
+Checking prerequisites...
+✓ Prerequisites check passed
+
+Checking for default StorageClass...
+✓ Default StorageClass found
+
+Keycloak is enabled. Please provide passwords:
+
+Enter admin password:
+Enter user password:
+
+=== Phase 1: Installing Dependency Operator Subscriptions ===
+Values file: quickstarts/mode-as-a-service/examples/Fusion-Agentic-Assistance-Platform/values.yaml
+Chart: /Users/harichandanakotha/Documents/MAAS/Fusion-AI/quickstarts/mode-as-a-service/charts/maas-operators
+
+Release "maas-operators" has been upgraded. Happy Helming!
+NAME: maas-operators
+LAST DEPLOYED: Sun May 24 22:11:03 2026
+NAMESPACE: default
+STATUS: deployed
+REVISION: 2
+TEST SUITE: None
+
+✓ Dependency operator subscriptions installed
+
+Waiting for OpenShift AI operator to be ready...
+deployment.apps/rhods-operator condition met
+✓ OpenShift AI operator ready
+Waiting for DataScienceCluster CRD to be available...
+✓ DataScienceCluster CRD available
+Waiting for Kuadrant CRD to be available...
+✓ Kuadrant CRD available
+Waiting for LeaderWorkerSetOperator CRD to be available...
+✓ LeaderWorkerSetOperator CRD available
+
+=== Phase 2: Creating DataScienceCluster and Operator Instances ===
+Values file: quickstarts/mode-as-a-service/examples/Fusion-Agentic-Assistance-Platform/values.yaml
+Chart: /Users/harichandanakotha/Documents/MAAS/Fusion-AI/quickstarts/mode-as-a-service/charts/maas-platform
+
+Release "maas-platform" has been upgraded. Happy Helming!
+NAME: maas-platform
+LAST DEPLOYED: Sun May 24 22:11:28 2026
+NAMESPACE: default
+STATUS: deployed
+REVISION: 2
+TEST SUITE: None
+
+Waiting for DataScienceCluster to be ready...
+datasciencecluster.datasciencecluster.opendatahub.io/default-dsc condition met
+✓ DataScienceCluster ready
+
+=== Phase 3: Installing MaaS Runtime Resources ===
+Values file: quickstarts/mode-as-a-service/examples/Fusion-Agentic-Assistance-Platform/values.yaml
+Chart: /Users/harichandanakotha/Documents/MAAS/Fusion-AI/quickstarts/mode-as-a-service/charts/maas-runtime
+
+Installing MaaS runtime resources (gateway, model registry, workbench storage, etc.)...
+I0524 22:12:09.207063 96967 warnings.go:110] "Warning: unknown field \"spec.istio\""
+Release "maas-runtime" has been upgraded. Happy Helming!
+NAME: maas-runtime
+LAST DEPLOYED: Sun May 24 22:11:41 2026
+NAMESPACE: default
+STATUS: deployed
+REVISION: 2
+TEST SUITE: None
+
+✓ MaaS Runtime resources installation complete
+
+Waiting for additional components to be ready...
+
+Waiting for Kuadrant...
+kuadrant.kuadrant.io/kuadrant condition met
+✓ Kuadrant ready
+Waiting for Keycloak...
+⚠ Keycloak not ready yet
+
+=== Installation Summary ===
+
+MaaS Runtime has been deployed!
+
+Next steps:
+1. Deploy models using: ./deploy-model.sh
+2. Check status: oc get all -n maas-models
+3. View logs: oc logs -n maas-models -l app.kubernetes.io/component=model-service
+
+Useful URLs:
+  OpenShift Console: https://console-openshift-console.apps.f55l020.fusion.tadn.ibm.com
+  Keycloak: https://N/A
+
+Installation complete!
+```
+After the script completes, verify that the expected platform resources are present and reporting ready status:
+
+```bash
+# Check operators are running
+oc get csv -n redhat-ods-operator
+oc get csv -n kuadrant-system
+
+# Verify platform is ready
+oc get datasciencecluster
+oc get kuadrant -n kuadrant-system
+
+# Check runtime components
+oc get modelregistry -n rhoai-model-registries
+oc get gateway -n openshift-ingress
+```
+
+**Expected Output:**
+```
+# Operators
+NAME                       DISPLAY                VERSION   PHASE
+rhods-operator.3.3.0       Red Hat OpenShift AI   3.3.0     Succeeded
+kuadrant-operator.v0.8.0   Kuadrant Operator      0.8.0     Succeeded
+
+# Platform
+NAME          AGE   PHASE   CREATED AT
+default-dsc   5m    Ready   2024-01-15T10:30:00Z
+
+# Runtime
+NAME             AGE   READY
+model-registry   3m    True
+```
+### Step 3: Register Model from Model Catalog
+
+Before deploying a model, you need to register it in the Model Registry. The Model Catalog provides access to curated foundation models from various sources including Hugging Face and Red Hat's model repository.
+
+**Quick Registration Steps:**
+
+1. Navigate to the OpenShift AI Dashboard:
+   - **Models and Model Serving** → **Model Catalog**
+
+2. Search for your desired model (e.g., `gpt-oss-20b`)
+
+3. Select the model and click **Register Model**
+
+4. Provide registration details:
+   - **Name**: `gpt-oss-20b`
+   - **Version**: `Version 1`
+   - **Model Registry**: `model-registry` (created by the installation script)
+
+5. Click **Register Model**
+
+The model metadata is stored in the Model Registry while model artifacts remain in object storage. This separation allows for efficient version management and deployment tracking.
+
+**For detailed instructions with screenshots and advanced options, see:**
+- **[Registering Models from Catalog](docs/02-model-catalog-and-registry/ADDING_MODELS_TO_REGISTRY.md)** - Complete registration guide
+- **[Model Catalog Guide](docs/02-model-catalog-and-registry/MODEL_CATALOG_GUIDE.md)** - Adding custom catalog sources
+
+### Step 4: Deploy Your First AI Model
+
+```bash
+# Deploy GPT-OSS-20B model for Fusion Agentic Assistance Platform using the deployment script
+./quickstarts/model-as-a-service/scripts/deploy-model.sh \
+  quickstarts/model-as-a-service/examples/Fusion-Agentic-Assistance-Platform/models/gpt-oss-20b-values.yaml
+
+# The script will:
+# 1. Deploy the model using maas-model-service chart
+# 2. Configure rate limiting policies
+# 3. Set up monitoring and routes
+```
+
+#### Expected output:
+
+```bash
+% ./quickstarts/model-as-a-service/scripts/deploy-model.sh \
+  quickstarts/model-as-a-service/examples/Fusion-Agentic-Assistance-Platform/models/gpt-oss-20b-values.yaml
+=== MaaS Model Deployment ===
+
+Checking prerequisites...
+✓ Prerequisites check passed
+
+Model deployment details:
+  Release name: gpt-oss-20b-version-1
+  Values file: quickstarts/model-as-a-service/examples/Fusion-Agentic-Assistance-Platform/models/gpt-oss-20b-values.yaml
+  Chart: /Users/harichandanakotha/Documents/MAAS/Fusion-AI/quickstarts/model-as-a-service/deploy/maas-model-service
+
+Checking for MaaS runtime...
+Model registry deployment mode detected
+Checking if model 'gpt-oss-20b' exists in model registry...
+✓ Model found in registry
+  Registered Model ID: 1
+  Model Name: gpt-oss-20b
+  Latest Model Version ID: 2
+  Latest Model Version Name: 1:Version 1
+  Model URI: oci://registry.redhat.io/rhelai1/modelcar-gpt-oss-20b:1.5
+
+✓ Model validation passed
+
+Detecting cluster wildcard domain...
+✓ Detected cluster wildcard domain: apps.f55l020.fusion.tadn.ibm.com
+
+Namespace deploy-models-rhoai will be created by Helm
+
+Deploying model...
+Release "gpt-oss-20b-version-1" does not exist. Installing it now.
+NAME: gpt-oss-20b-version-1
+LAST DEPLOYED: Thu May 28 22:14:48 2026
+NAMESPACE: default
+STATUS: deployed
+REVISION: 1
+TEST SUITE: None
+
+✓ Model deployment initiated
+
+Waiting for model to be ready...
+llminferenceservice.serving.kserve.io/gpt-oss-20b-version-1 condition met
+✓ Model is ready
+
+Gateway route will be created by Helm...
+✓ Gateway route created successfully
+
+Gateway URL: https://openshift-ai-inference-openshift-ingress.apps.f55l020.fusion.tadn.ibm.com
+Model endpoint: https://openshift-ai-inference-openshift-ingress.apps.f55l020.fusion.tadn.ibm.com/deploy-models-rhoai/gpt-oss-20b-version-1
+
+=== Deployment Summary ===
+
+Model: gpt-oss-20b-version-1
+Namespace: deploy-models-rhoai
+Status: oc get llminferenceservice gpt-oss-20b-version-1 -n deploy-models-rhoai
+
+Test the model:
+  TOKEN=$(oc whoami -t)
+  curl -k "https://openshift-ai-inference-openshift-ingress.apps.f55l020.fusion.tadn.ibm.com/deploy-models-rhoai/gpt-oss-20b-version-1/v1/models" \
+    -H "Authorization: Bearer ${TOKEN}"
+
+  curl -k -X POST "https://openshift-ai-inference-openshift-ingress.apps.f55l020.fusion.tadn.ibm.com/deploy-models-rhoai/gpt-oss-20b-version-1/v1/completions" \
+    -H "Authorization: Bearer ${TOKEN}" \
+    -H "Content-Type: application/json" \
+    -d '{"model": "gpt-oss-20b-version-1", "prompt": "Hello", "max_tokens": 50}'
+
+View logs:
+  oc logs -n deploy-models-rhoai -l serving.kserve.io/inferenceservice=gpt-oss-20b-version-1 -c kserve-container -f
+
+Troubleshooting:
+  oc describe llminferenceservice gpt-oss-20b-version-1 -n deploy-models-rhoai
+  oc get events -n deploy-models-rhoai --sort-by='.lastTimestamp'
+
+Deployment complete!
+```
+
+Monitor the deployment until the inference service reports a ready condition:
+
+```bash
+# Stream status changes for the inference service
+oc get llminferenceservice -n deploy-models-rhoai -w
+
+# Check model pods
+oc get pods -n deploy-models-rhoai
+```
+
+**Expected Output:**
+```
+NAME          READY   URL                                                    AGE
+gpt-oss-20b   True    https://gateway.example.com/deploy-models-rhoai/...   3m
+```
+
+### Step 5: Test Your Model
+
+```bash
+# Get the model endpoint
+MODEL_URL=$(oc get route gpt-oss-20b -n deploy-models-rhoai -o jsonpath='{.spec.host}')
+
+# Get authentication token
+TOKEN=$(oc whoami -t)
+
+# Test the model
+curl -k -X POST "https://${MODEL_URL}/v1/completions" \
+  -H "Authorization: Bearer ${TOKEN}" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "gpt-oss-20b",
+    "prompt": "Write a Python function to calculate fibonacci numbers:",
+    "max_tokens": 100,
+    "temperature": 0.7
+  }'
+```
+
+The model is automatically exposed through the gateway route when external gateway exposure is enabled in the values file.
+
+```bash
+# Get the gateway route (exposed automatically by the deployment)
+GATEWAY_HOST=$(oc get route openshift-ai-inference -n openshift-ingress -o jsonpath='{.spec.host}')
+
+# Get authentication token
+TOKEN=$(oc whoami -t)
+
+# Test the model through the gateway
+# Pattern: https://<gateway-host>/<namespace>/<model-name>/v1/completions
+curl -k -X POST "https://${GATEWAY_HOST}/deploy-models-rhoai/gpt-oss-20b-version-1/v1/completions" \
+  -H "Authorization: Bearer ${TOKEN}" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "gpt-oss-20b-version-1",
+    "prompt": "Write a Python function to calculate fibonacci numbers:",
+    "max_tokens": 100,
+    "temperature": 0.7
+  }'
+```
+
+At this stage, the platform is ready with a deployed model that is accessible through the configured gateway endpoint.
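
The gateway requests above share one path layout: gateway host, then namespace, then model release name, then the API path (`v1/completions` or `v1/models`). As an illustrative sketch, a small helper can assemble that URL so the namespace and model name are not retyped for every request. The function name `maas_endpoint` is hypothetical and is not provided by the repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build a gateway endpoint URL from its parts.
# Usage: maas_endpoint <gateway-host> <namespace> <model-name> [api-path]
# The API path defaults to v1/completions when omitted.
maas_endpoint() {
  local host="$1" namespace="$2" model="$3" api_path="${4:-v1/completions}"
  printf 'https://%s/%s/%s/%s\n' "$host" "$namespace" "$model" "$api_path"
}

# Example with the values used in this guide (requires a logged-in `oc` session):
#   GATEWAY_HOST=$(oc get route openshift-ai-inference -n openshift-ingress -o jsonpath='{.spec.host}')
#   curl -k "$(maas_endpoint "$GATEWAY_HOST" deploy-models-rhoai gpt-oss-20b-version-1 v1/models)" \
#     -H "Authorization: Bearer $(oc whoami -t)"
```

Keeping the path construction in one place makes it easier to switch between models or namespaces when several inference services share the same gateway.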
+ +--- + +## What's Deployed + +After completing the quickstart, the environment includes the following platform and runtime components: + +| Component | Description | Namespace | +|-----------|-------------|-----------| +| **OpenShift AI** | Core AI platform | `redhat-ods-operator` | +| **Kuadrant** | API gateway & rate limiting | `kuadrant-system` | +| **Model Registry** | Model versioning & storage | `rhoai-model-registries` | +| **Gateway** | Intelligent routing | `openshift-ingress` | +| **GPT-OSS-20B** | Code assistance model | `deploy-models-rhoai` | +| **Monitoring** | Prometheus & Grafana | `openshift-monitoring` | + +--- + + +## Key Features + +### IBM Fusion for AI Integration + +IBM Fusion provides the storage foundation for this deployment model. The platform can use IBM Fusion Object Storage through OpenShift Data Foundation to store model artifacts, workbench data, and registry-backed assets. Automatic bucket provisioning through ObjectBucketClaim simplifies storage setup and helps standardize the way model-serving components consume object storage. + +### Model Management + +The platform supports centralized model management through a model registry backed by IBM Fusion-integrated storage. It also supports curated model discovery and controlled deployment flows, allowing teams to register, version, and expose models using a repeatable operational pattern. + +### Flexibility + +The deployment model supports multiple use cases on the same shared platform. Teams can deploy models independently, choose among supported storage backends, and adapt the same runtime foundation to different application requirements. + +### Governance + +Tier-based access control and request limiting allow platform teams to define differentiated service levels for model consumers. These controls help support internal governance, quota enforcement, and usage management across shared inference endpoints. 
+
+### Observability
+
+The runtime integrates with Prometheus and Grafana so that operators can monitor model-serving health, endpoint usage, and related platform signals. This observability model supports day-to-day operations and troubleshooting for shared inference infrastructure.
+
+## Project Structure
+
+```text
+model-as-a-service/
+├── deploy/
+│   ├── maas-operators/  # Operator installation chart
+│   │   ├── Chart.yaml
+│   │   ├── values.yaml
+│   │   └── templates/
+│   │       ├── namespaces.yaml
+│   │       ├── operatorgroups.yaml
+│   │       └── subscriptions.yaml
+│   │
+│   ├── maas-platform/  # Platform bootstrap chart
+│   │   ├── Chart.yaml
+│   │   ├── values.yaml
+│   │   └── templates/
+│   │       ├── datasciencecluster.yaml
+│   │       └── operator-instances.yaml
+│   │
+│   ├── maas-runtime/  # Core MaaS infrastructure
+│   │   ├── Chart.yaml
+│   │   ├── values.yaml
+│   │   └── templates/
+│   │       ├── namespace.yaml
+│   │       ├── rbac.yaml
+│   │       ├── tier-groups.yaml
+│   │       ├── gateway.yaml
+│   │       ├── modelregistry.yaml
+│   │       └── workbench-storage.yaml
+│   │
+│   └── maas-model-service/  # Generic model deployment
+│       ├── Chart.yaml
+│       ├── values.yaml
+│       └── templates/
+│           ├── llminferenceservice.yaml
+│           ├── ratelimitpolicy.yaml
+│           ├── servicemonitor.yaml
+│           ├── connection-secret.yaml
+│           ├── route.yaml
+│           └── namespace.yaml
+│
+├── examples/
+│   ├── Fusion-Agentic-Assistance-Platform/  # Fusion Agentic Assistance Platform use case
+│   │   ├── README.md
+│   │   ├── values.yaml
+│   │   └── models/
+│   │       ├── gpt-oss-values.yaml
+│   │       ├── gpt-oss-20b-values.yaml
+│   │       └── nemotron-values.yaml
+│   │
+│   ├── model-registry-deployment/  # Model registry examples
+│   │   ├── README.md
+│   │   ├── gpt-oss-20b-values.yaml
+│   │   ├── granite-31-8b-lab-v1-values.yaml
+│   │   └── qwen3-8b-fp8-dynamic-values.yaml
+│   │
+│   ├── model-registry-gitops/  # GitOps for model registry
+│   ├── operators-gitops-deployment/  # GitOps for operators
+│   ├── maas-runtime-gitops-deployment/  # GitOps for runtime
+│   ├── maas-model-service-gitops-deployment/  # GitOps for models
+│   └── workbench-model-testing/  # Model testing workflow
+│
+├── docs/
+│   ├── GETTING_STARTED.md
+│   ├── DEPLOYMENT_ORDER.md
+│   ├── configuration/
+│   │   ├── MODEL_CATALOG_GUIDE.md
+│   │   ├── MODEL_REGISTRY_GUIDE.md
+│   │   └── WORKBENCH_STORAGE_GUIDE.md
+│   ├── deployment/
+│   │   └── DEPLOYING_MODEL_SERVICES.md
+│   └── operations/
+│       └── ADDING_MODELS_TO_REGISTRY.md
+│
+└── blogs/
+    ├── published/
+    │   ├── gitops-series/
+    │   ├── quick-start/
+    │   └── techxchange-series/
+    └── planning/
+```
+
+## IBM Fusion for AI Quick Start Features
+
+### IBM Fusion Object Storage Integration
+
+This quickstart uses IBM Fusion as the storage foundation for the MaaS platform. In practice, that means model artifacts, registry-backed metadata flows, and workbench-related data can be mapped to a common storage layer exposed through OpenShift-native patterns such as ObjectBucketClaim.
+
+For model registry workflows, the platform supports IBM Fusion Object Storage integration through OpenShift Data Foundation, automated bucket provisioning, model version tracking, metadata management, and PostgreSQL-backed registry state. Additional implementation details are available in [docs/02-model-catalog-and-registry/ADDING_MODELS_TO_REGISTRY.md](docs/02-model-catalog-and-registry/ADDING_MODELS_TO_REGISTRY.md).
+
+
+## Use Cases
+
+### Fusion Agentic Assistance Platform
+
+The Fusion Agentic Assistance Platform demonstrates how the platform can serve AI-powered assistance workloads with agentic capabilities.
Reference material is available in [fusion-AgenticAssistanceSampleApp/README.md](../../fusion-AgenticAssistanceSampleApp/README.md), with example model configurations for GPT-OSS-20B and Nemotron-based deployments.
+
+### Chatbot (Coming Soon)
+
+A customer service chatbot with a web UI is planned.
+
+- Models: Llama-3-70B, Mistral-7B
+- Application: Custom chatbot interface
+
+### Document Analysis (Coming Soon)
+
+A document analysis scenario is also planned for multimodal processing workloads. This use case will focus on text-and-image processing patterns and the service composition needed for document-centric AI pipelines.
+
+## Documentation
+
+The MaaS platform is deployed using four Helm charts that must be installed in sequence:
+
+```text
+maas-operators (install first)
+        ↓
+maas-platform (requires operators)
+        ↓
+maas-runtime (requires platform)
+        ↓
+maas-model-service (requires runtime)
+```
+
+### Helm Chart Guides
+
+| Chart | Purpose | Documentation |
+|-------|---------|---------------|
+| **maas-operators** | Installs OpenShift AI and dependent operators | [MaaS Operators Guide](docs/01-setup/MAAS_OPERATORS_GUIDE.md) |
+| **maas-platform** | Configures DataScienceCluster and platform components | [Platform Customization Guide](docs/01-setup/MAAS_PLATFORM_CUSTOMIZATION_GUIDE.md) |
+| **maas-runtime** | Deploys gateway, model registry, and storage integration | [Runtime Customization Guide](docs/01-setup/MAAS_RUNTIME_CUSTOMIZATION_GUIDE.md) |
+| **maas-model-service** | Deploys individual AI models as inference services | [Deploying Model Services](docs/03-model-deployment/DEPLOYING_MODEL_SERVICES.md) |
+
+### Getting Started
+- [Getting Started Guide](docs/GETTING_STARTED.md) - Complete installation and setup guide
+- [Deployment Order Guide](docs/01-setup/DEPLOYMENT_ORDER.md) - Step-by-step deployment sequence
+
+### Configuration Guides
+- [Model Catalog Guide](docs/02-model-catalog-and-registry/MODEL_CATALOG_GUIDE.md) - Hugging Face integration and model
discovery
+- [Registering Models](docs/02-model-catalog-and-registry/ADDING_MODELS_TO_REGISTRY.md) - Model registration from catalog
+
+### Examples
+- [Fusion Agentic Assistance Platform](examples/Fusion-Agentic-Assistance-Platform/README.md) - Complete use case with multiple models
+- [Model Registry Deployment](examples/model-registry-deployment/README.md) - Model registry entry examples
diff --git a/AI/quickstarts/model-as-a-service/backstage/mkdocs.yml b/AI/quickstarts/model-as-a-service/backstage/mkdocs.yml
new file mode 100644
index 00000000..95a2c877
--- /dev/null
+++ b/AI/quickstarts/model-as-a-service/backstage/mkdocs.yml
@@ -0,0 +1,31 @@
+site_name: Model as a Service on IBM Fusion
+site_description: Deploy a complete MaaS platform with Red Hat OpenShift AI on IBM Fusion HCI
+
+# Point to the IBM storage-fusion repo for documentation
+repo_url: https://github.com/IBM/storage-fusion
+repo_name: IBM/storage-fusion
+
+# Point to docs directory
+docs_dir: docs
+
+# Navigation
+nav:
+  - Home: index.md
+
+theme:
+  name: material
+  palette:
+    primary: indigo
+    accent: indigo
+
+plugins:
+  - techdocs-core
+
+# Safe markdown extensions
+markdown_extensions:
+  - admonition
+  - tables
+  - toc:
+      permalink: true
+
+# Made with Bob
\ No newline at end of file