# OpenCodeReview - GitLab CI Merge Request Auto-Review Demo
#
# This pipeline automatically reviews Merge Requests using OpenCodeReview
# and posts review comments (discussions) directly on the MR diff.
# Supports both same-repo and forked MR scenarios.
#
# Required CI/CD Variables (Settings → CI/CD → Variables):
# OCR_LLM_URL - LLM API endpoint (e.g., https://api.openai.com/v1/chat/completions)
# OCR_LLM_AUTH_TOKEN - Authentication token for the LLM API (mark as "Masked")
#
# Optional CI/CD Variables:
# OCR_LLM_MODEL - Model name (default: gpt-4o)
# GITLAB_API_TOKEN - GitLab Personal/Project Access Token with "api" scope
# (falls back to CI_JOB_TOKEN if not set)
#
# Optional CI/CD Variables (for retry/delay tuning):
# OCR_RETRY_BASE_DELAY - Base delay (ms) for exponential backoff on rate-limit retries (default: 2000)
# OCR_MAX_RETRIES - Max retry attempts per comment when rate-limited or transient error (default: 3)
# OCR_MAX_RETRY_DELAY - Maximum delay (ms) per single retry, caps Retry-After/backoff (default: 60000)
# OCR_SUCCESS_DELAY - Delay (ms) after a successful comment post to pace requests (default: 2000)
# OCR_FAILURE_DELAY - Delay (ms) after a non-rate-limit failure to pace subsequent requests (default: 1000)
# OCR_RATE_LIMIT_THRESHOLD - Proactively slow down when GitLab RateLimit-Remaining is at/below this value (default: 10, set 0 to disable)
#
# Note: The pipeline also configures llm.extra_body to '{"thinking": {"type": "disabled"}}'
# to disable thinking mode for compatibility with various LLM providers.
#
# Fork MR Support:
# The script uses CI_COMMIT_SHA as the diff target to correctly resolve the
# source commit from forked repos. For some GitLab versions, you may need to enable:
# Project Settings → CI/CD → General pipelines →
# "Run pipelines in the parent project for merge requests from forked projects"
stages:
  - review

code-review:
  stage: review
  interruptible: true
  resource_group: mr-review-$CI_MERGE_REQUEST_IID
  image: node:20
  only:
    - merge_requests
  variables:
    GIT_DEPTH: 0 # Full history needed for merge-base diff
    OCR_MAX_FILE_LINES: 1000
    OCR_EXCLUDE_PATHS: "node_modules/,dist/,build/,*.lock"
    OCR_RETRY_BASE_DELAY: 2000
    OCR_MAX_RETRIES: 3
    OCR_MAX_RETRY_DELAY: 60000
    OCR_SUCCESS_DELAY: 2000
    OCR_FAILURE_DELAY: 1000
    OCR_RATE_LIMIT_THRESHOLD: 10
  script:
    # Install OpenCodeReview
    - npm install -g @alibaba-group/open-code-review
    - echo "OpenCodeReview installed with version:" && ocr version || true
    # Configure OCR
    - mkdir -p ~/.opencodereview
    # Run OCR review (use CI_COMMIT_SHA for --to to support both same-repo and forked MRs)
    - |
      ocr config set provider "ollama"
      ocr config set providers.ollama.url "http://xxxx:1234/v1"
      ocr config set providers.ollama.model "deepseek-coder-v2:16b"
      ocr config set providers.ollama.protocol "openai"
      ocr config set providers.ollama.api_key "dummy-key-12345"
      ocr config set language Chinese
      echo "===== Test Ollama LLM Connection ====="
      ocr llm test || echo "LLM test failed but continuing..."
      echo "Reviewing MR: ${CI_MERGE_REQUEST_SOURCE_BRANCH_NAME} against ${CI_MERGE_REQUEST_TARGET_BRANCH_NAME}"
      set +e
      ocr review \
        --from "origin/${CI_MERGE_REQUEST_TARGET_BRANCH_NAME}" \
        --to "${CI_COMMIT_SHA}" \
        --format json \
        --audience agent \
        --concurrency 1 \
        --exclude ${OCR_EXCLUDE_PATHS} \
        > /tmp/ocr-result.json 2>/tmp/ocr-stderr.log
      OCR_EXIT=$?
      set -e
      echo "OCR exit code: $OCR_EXIT"
      if [ -s /tmp/ocr-result.json ]; then
        echo "=== OCR Result ==="
        cat /tmp/ocr-result.json
      else
        echo "=== OCR Result file is empty or missing ==="
        cat > /tmp/ocr-result.json << 'EOF'
      {
        "comments": [],
        "warnings": ["OCR review could not complete."],
        "message": "No review results generated."
      }
      EOF
        echo "Created default empty result"
      fi
      if [ -s /tmp/ocr-stderr.log ]; then
        echo "=== OCR stderr ==="
        cat /tmp/ocr-stderr.log
      fi
      echo "=== OCR review completed ==="
    # Post review comments to MR
    - |
      python3 << 'PYTHON_SCRIPT'
      import json
      import os
      import random
      import sys
      import time
      import urllib.request
      import urllib.error

      GITLAB_URL = os.environ.get("CI_SERVER_URL", "https://gitlab.com")
      PROJECT_ID = os.environ["CI_PROJECT_ID"]
      MR_IID = os.environ["CI_MERGE_REQUEST_IID"]
      # Fall back to CI_JOB_TOKEN for fork MR pipelines where GITLAB_API_TOKEN is unavailable
      API_TOKEN = os.environ.get("GITLAB_API_TOKEN") or os.environ.get("CI_JOB_TOKEN", "")
      SOURCE_BRANCH = os.environ["CI_MERGE_REQUEST_SOURCE_BRANCH_NAME"]
      TARGET_BRANCH = os.environ["CI_MERGE_REQUEST_TARGET_BRANCH_NAME"]
      COMMIT_SHA = os.environ["CI_COMMIT_SHA"]

      # Configurable retry/delay settings (via CI/CD variables, with sensible defaults)
      RETRY_BASE_DELAY = int(os.environ.get("OCR_RETRY_BASE_DELAY", "2000")) / 1000 # ms → seconds
      MAX_RETRIES = int(os.environ.get("OCR_MAX_RETRIES", "3"))
      MAX_RETRY_DELAY = int(os.environ.get("OCR_MAX_RETRY_DELAY", "60000")) / 1000 # ms → seconds, cap per-retry wait
      SUCCESS_DELAY = int(os.environ.get("OCR_SUCCESS_DELAY", "2000")) / 1000
      FAILURE_DELAY = int(os.environ.get("OCR_FAILURE_DELAY", "1000")) / 1000
      # Base delay (seconds) for exponential backoff on transient server errors (5xx / 408).
      # Server hiccups are typically short-lived, so a 2s initial wait (doubling per retry)
      # is sufficient and avoids stalling the CI job unnecessarily.
      TRANSIENT_BASE_DELAY = 2
      # Proactive throttling: when RateLimit-Remaining drops to/below this threshold,
      # the script doubles the pacing delay to avoid hitting 429. Set to 0 to disable.
      RATE_LIMIT_THRESHOLD = int(os.environ.get("OCR_RATE_LIMIT_THRESHOLD", "10"))

      if not API_TOKEN:
          print("ERROR: No API token available (GITLAB_API_TOKEN or CI_JOB_TOKEN). Cannot post comments.", file=sys.stderr)
          sys.exit(1)

      API_BASE = f"{GITLAB_URL}/api/v4/projects/{PROJECT_ID}/merge_requests/{MR_IID}"
      # Determine auth header: PRIVATE-TOKEN for personal/project tokens, JOB-TOKEN for CI_JOB_TOKEN
      USE_JOB_TOKEN = not os.environ.get("GITLAB_API_TOKEN")
      AUTH_HEADER = "JOB-TOKEN" if USE_JOB_TOKEN else "PRIVATE-TOKEN"

      def _get_header(headers, name):
          """Case-insensitive header lookup.
          urllib normalizes response header keys to title-case (e.g. 'Retry-After'),
          but this defensive check also handles original casing so that retry delay
          computation and quota logging never silently miss a header.
          """
          if name in headers:
              val = headers[name]
          elif name.lower() in headers:
              val = headers[name.lower()]
          else:
              return None
          return str(val).strip() if val is not None else None

      def _parse_rate_limit_header(headers, name):
          """Safely parse a rate-limit response header (e.g. RateLimit-Remaining) as an int."""
          val = _get_header(headers, name)
          if val is None:
              return None
          try:
              return int(val)
          except (ValueError, TypeError):
              return None

      def api_request_with_retry(endpoint, data=None, method="POST"):
          """Make a GitLab API request with retry on rate-limit errors.
          Returns a dict: {'success': bool, 'data': response or None,
          'is_rate_limit_exhausted': bool, 'rate_limit_remaining': int or None}
          """
          for attempt in range(MAX_RETRIES + 1):
              url = f"{API_BASE}{endpoint}"
              headers = {
                  AUTH_HEADER: API_TOKEN,
                  "Content-Type": "application/json"
              }
              body = json.dumps(data).encode("utf-8") if data else None
              req = urllib.request.Request(url, data=body, headers=headers, method=method)
              try:
                  with urllib.request.urlopen(req) as resp:
                      resp_data = json.loads(resp.read().decode("utf-8"))
                      remaining = _parse_rate_limit_header(resp.headers, 'RateLimit-Remaining')
                      limit = _parse_rate_limit_header(resp.headers, 'RateLimit-Limit')
                      if remaining is not None and limit is not None:
                          print(f"RateLimit: {remaining}/{limit} remaining for {endpoint}", file=sys.stderr)
                      return {'success': True, 'data': resp_data, 'is_rate_limit_exhausted': False, 'rate_limit_remaining': remaining}
              except urllib.error.HTTPError as e:
                  error_body = e.read().decode('utf-8')
                  is_rate_limit = e.code == 429 or (e.code == 403 and any(kw in error_body.lower() for kw in ['retry later', 'rate limit', 'too many requests', 'abuse']))
                  # Transient server errors (5xx) and request timeouts (408) are worth retrying
                  # with a short exponential backoff, since they are typically short-lived.
                  is_transient = (500 <= e.code < 600) or e.code == 408
                  rl_remaining = _parse_rate_limit_header(e.headers, 'RateLimit-Remaining')
                  if (is_rate_limit or is_transient) and attempt < MAX_RETRIES:
                      retry_after = _get_header(e.headers, 'Retry-After')
                      if retry_after:
                          try:
                              delay = float(retry_after)
                          except ValueError:
                              delay = RETRY_BASE_DELAY * (2 ** attempt) # fallback if Retry-After is not numeric
                      elif is_transient:
                          # Transient server error: use a shorter base than the rate-limit path.
                          # Server hiccups are typically short-lived, so a 2s initial wait
                          # (doubling per retry) is sufficient.
                          delay = TRANSIENT_BASE_DELAY * (2 ** attempt)
                      else:
                          delay = RETRY_BASE_DELAY * (2 ** attempt)
                      delay = min(delay, MAX_RETRY_DELAY) # cap to prevent long stalls
                      delay = delay * (0.75 + random.random() * 0.5) # ±25% jitter to avoid thundering herd
                      rl_info = f" (RateLimit-Remaining: {rl_remaining})" if rl_remaining is not None else ""
                      reason = "rate limit" if is_rate_limit else f"transient error (HTTP {e.code})"
                      print(f"{reason} hit for {endpoint}, retrying in {delay:.1f}s (attempt {attempt + 1}/{MAX_RETRIES}){rl_info}", file=sys.stderr)
                      time.sleep(delay)
                  else:
                      print(f"API error {e.code}: {error_body}", file=sys.stderr)
                      return {'success': False, 'data': None, 'is_rate_limit_exhausted': is_rate_limit, 'rate_limit_remaining': rl_remaining}
          return {'success': False, 'data': None, 'is_rate_limit_exhausted': False, 'rate_limit_remaining': None}

      def post_note(body):
          """Post a general note/comment on the MR (with rate-limit retry)."""
          return api_request_with_retry("/notes", {"body": body})

      def post_discussion(path, line, body, base_sha, start_sha, head_sha):
          """Post an inline discussion on a specific file/line in the MR diff."""
          position = {
              "position_type": "text",
              "new_path": path,
              "old_path": path,
              "new_line": line,
              "base_sha": base_sha,
              "start_sha": start_sha,
              "head_sha": head_sha,
          }
          data = {
              "body": body,
              "position": position
          }
          return api_request_with_retry("/discussions", data)

      def format_comment(comment):
          """Format a single review comment as markdown."""
          body = comment.get("content", "")
          existing = comment.get("existing_code", "")
          suggestion = comment.get("suggestion_code", "")
          if suggestion and existing:
              body += "\n\n**Suggestion:**\n"
              body += f"```suggestion:-0+0\n{suggestion}\n```"
          return body

      def format_comment_fallback(comment):
          """Format a comment for fallback (non-inline) display."""
          path = comment.get("path", "unknown")
          start_line = comment.get("start_line", 0)
          end_line = comment.get("end_line", 0)
          content = comment.get("content", "")
          md = f"### 📄 `{path}`"
          if start_line and end_line:
              md += f" (L{start_line}-L{end_line})"
          md += f"\n\n{content}"
          existing = comment.get("existing_code", "")
          suggestion = comment.get("suggestion_code", "")
          if suggestion and existing:
              md += "\n\n<details><summary>💡 Suggested Change</summary>\n\n"
              md += f"**Before:**\n```\n{existing}\n```\n\n"
              md += f"**After:**\n```\n{suggestion}\n```\n\n"
              md += "</details>"
          return md

      # --- Main ---
      # Read OCR result (skip first line which is summary, not JSON)
      try:
          with open("/tmp/ocr-result.json", "r") as f:
              result = json.load(f)
      except (FileNotFoundError, json.JSONDecodeError) as e:
          print(f"Failed to parse OCR output: {e}", file=sys.stderr)
          stderr_content = ""
          try:
              with open("/tmp/ocr-stderr.log", "r") as f:
                  stderr_content = f.read().strip()
          except FileNotFoundError:
              pass
          if stderr_content:
              post_note(f"⚠️ **OpenCodeReview** encountered an error:\n```\n{stderr_content}\n```")
          sys.exit(0)

      comments = result.get("comments", [])
      warnings = result.get("warnings", [])

      # No comments - post summary
      if not comments:
          message = result.get("message", "No comments generated. Looks good to me.")
          post_note(f"✅ **OpenCodeReview**: {message}")
          print("No review comments to post.")
          sys.exit(0)

      # Get MR diff metadata for position calculation (uses retry to avoid single-point-of-failure)
      diff_refs = None
      versions_resp = api_request_with_retry("/versions", method="GET")
      if versions_resp and versions_resp.get('success'):
          versions = versions_resp.get('data', [])
          if versions:
              latest = versions[0]
              diff_refs = {
                  "base_sha": latest.get("base_commit_sha", ""),
                  "start_sha": latest.get("start_commit_sha", ""),
                  "head_sha": latest.get("head_commit_sha", ""),
              }
      if not diff_refs:
          print("Warning: Could not fetch MR versions. Inline comments will use fallback.", file=sys.stderr)

      # Post inline discussions for each comment
      success_count = 0
      failed_comments = []
      for comment in comments:
          path = comment.get("path", "")
          end_line = comment.get("end_line", 0)
          start_line = comment.get("start_line", end_line)
          body = format_comment(comment)
          if not path or not end_line or not diff_refs:
              failed_comments.append(comment)
              continue
          result_resp = post_discussion(path, end_line, body, **diff_refs)
          if result_resp and result_resp.get('success'):
              success_count += 1
              # Proactive throttling: slow down when GitLab reports low remaining quota
              remaining = result_resp.get('rate_limit_remaining')
              if RATE_LIMIT_THRESHOLD > 0 and remaining is not None and remaining <= RATE_LIMIT_THRESHOLD:
                  pace_delay = SUCCESS_DELAY * 2
                  print(f"Rate limit quota low ({remaining} remaining), increasing pacing delay to {pace_delay:.1f}s", file=sys.stderr)
                  time.sleep(pace_delay)
              else:
                  time.sleep(SUCCESS_DELAY) # Pace requests to avoid rate limits
          else:
              failed_comments.append(comment)
              # Apply longer delay when rate-limit retries are exhausted, shorter delay for other errors
              is_rate_limit_exhausted = result_resp.get('is_rate_limit_exhausted', False) if result_resp else False
              post_fail_delay = SUCCESS_DELAY if is_rate_limit_exhausted else FAILURE_DELAY
              time.sleep(post_fail_delay)
      print(f"Successfully posted {success_count}/{len(comments)} inline comments.")

      # Post fallback for any failed inline comments
      if failed_comments:
          fallback_body = f"🔍 **OpenCodeReview** found issues that could not be posted inline:\n\n---\n\n"
          for comment in failed_comments:
              fallback_body += format_comment_fallback(comment) + "\n\n---\n\n"
          post_note(fallback_body)

      # Post summary last
      total_count = len(comments)
      failed_count = len(failed_comments)
      summary = f"🔍 **OpenCodeReview** found **{total_count}** issue(s) in this MR."
      if total_count > 0:
          summary += f"\n- ✅ {success_count} posted as inline comment(s)"
          summary += f"\n- 📝 {failed_count} posted as summary (missing line info)"
      if warnings:
          summary += f"\n\n⚠️ {len(warnings)} warning(s) occurred during review."
      post_note(summary)
      PYTHON_SCRIPT
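The retry delay used when posting comments to GitLab (exponential backoff, a per-retry cap, then ±25% jitter) can be worked through in isolation. The following is a minimal sketch that assumes the same defaults as the pipeline (2 s base, 60 s cap); `backoff_delay` and its `rng` parameter are hypothetical names introduced here for illustration and are not part of the pipeline or of OpenCodeReview.

```python
import random


def backoff_delay(attempt, base=2.0, cap=60.0, retry_after=None, rng=random.random):
    """Seconds to wait before retry `attempt` (0-based).

    Hypothetical helper: mirrors the delay arithmetic of the posting script,
    with the random source injectable so the result can be checked by hand.
    """
    if retry_after is not None:
        try:
            # Prefer the server-provided Retry-After value when it is numeric.
            delay = float(retry_after)
        except ValueError:
            # Non-numeric Retry-After: fall back to exponential backoff.
            delay = base * (2 ** attempt)
    else:
        delay = base * (2 ** attempt)
    # Cap a single wait, then apply a jitter factor in [0.75, 1.25).
    delay = min(delay, cap)
    return delay * (0.75 + rng() * 0.5)
```

With the jitter factor pinned to 1.0 (`rng=lambda: 0.5`), the defaults give waits of 2 s, 4 s, 8 s and 16 s for attempts 0 to 3, and any larger value is clamped to 60 s before jitter is applied.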
2026-06-25T13:36:57.275673Z 00O Running with gitlab-runner 19.1.0 (5eb085ab)
2026-06-25T13:36:57.275727Z 00O on my-docker-runner HN5JvT78r, system ID: r_11FBM1LvhkVy
2026-06-25T13:36:57.275851Z 00O
2026-06-25T13:36:57.275867Z 00O+Preparing the "docker" executor
2026-06-25T13:36:57.315224Z 00O Using Docker executor with image node:20 ...
2026-06-25T13:36:59.589132Z 00O Using effective pull policy of [if-not-present] for container node:20
2026-06-25T13:36:59.591261Z 00O Using locally found image version due to "if-not-present" pull policy
2026-06-25T13:36:59.591289Z 00O Using docker image sha256:33ed5ef90e835c40007b658b2d7cb1354b282bffcb53feceb305e9feff1d98e1 for node:20 with digest node@sha256:8f693eaa7e0a8e71560c9a82b55fd54c2ae920a2ba5d2cde28bac7d1c01c9ba5 ...
2026-06-25T13:36:59.591309Z 00O
2026-06-25T13:36:59.591620Z 00O+
2026-06-25T13:36:59.591638Z 00O+Preparing environment
2026-06-25T13:36:59.593485Z 00O Using effective pull policy of [if-not-present] for container sha256:aad66b5270cc98cebfeb59ea147ef0c34e87dff628ec17a4bb3efede949ad931
2026-06-25T13:37:00.219254Z 01O Running on runner-hn5jvt78r-project-12-concurrent-0 via 3f3317f37b76...
2026-06-25T13:37:00.553815Z 00O
2026-06-25T13:37:00.555178Z 00O+
2026-06-25T13:37:00.555198Z 00O+Getting source from Git repository
2026-06-25T13:37:01.301138Z 01O Gitaly correlation ID: 01KVZFYZ7WH96V7T5EMBNKYFM5
2026-06-25T13:37:01.319643Z 01O Fetching changes...
2026-06-25T13:37:01.328556Z 01O Reinitialized existing Git repository in /builds/aircloudlink/kif-base/.git/
2026-06-25T13:37:01.335331Z 01O Created fresh repository.
2026-06-25T13:37:01.675518Z 01O Checking out 035a7852 as detached HEAD (ref is refs/merge-requests/127/head)...
2026-06-25T13:37:01.826800Z 01O
2026-06-25T13:37:01.826820Z 01O Skipping Git submodules setup
2026-06-25T13:37:02.134073Z 00O
2026-06-25T13:37:02.136982Z 00O+
2026-06-25T13:37:02.137022Z 00O+Executing "step_script" stage of the job script
2026-06-25T13:37:02.137063Z 00O Using effective pull policy of [if-not-present] for container node:20
2026-06-25T13:37:02.139396Z 00O Using docker image sha256:33ed5ef90e835c40007b658b2d7cb1354b282bffcb53feceb305e9feff1d98e1 for node:20 with digest node@sha256:8f693eaa7e0a8e71560c9a82b55fd54c2ae920a2ba5d2cde28bac7d1c01c9ba5 ...
2026-06-25T13:37:02.812692Z 01O $ npm install -g @alibaba-group/open-code-review
2026-06-25T13:37:36.907480Z 01O
2026-06-25T13:37:36.907499Z 01O added 2 packages in 34s
2026-06-25T13:37:36.940481Z 01O $ echo "OpenCodeReview installed with version:" && ocr version || true
2026-06-25T13:37:36.940504Z 01O OpenCodeReview installed with version:
2026-06-25T13:37:37.092169Z 01O open-code-review v1.6.1 (034d512) linux/amd64
2026-06-25T13:37:37.092197Z 01O built at: 2026-06-25T12:11:53Z
2026-06-25T13:37:37.092200Z 01O https://github.com/alibaba/open-code-review
2026-06-25T13:37:37.104929Z 01O $ mkdir -p ~/.opencodereview
2026-06-25T13:37:37.109049Z 01O $ ocr config set provider "ollama" # collapsed multi-line command
2026-06-25T13:37:37.251407Z 01O Set provider = ollama
2026-06-25T13:37:37.421259Z 01O Set providers.ollama.url = http://xxx:1234/v1
2026-06-25T13:37:37.569611Z 01O Set providers.ollama.model = deepseek-coder-v2:16b
2026-06-25T13:37:37.726028Z 01O Set providers.ollama.protocol = openai
2026-06-25T13:37:37.885287Z 01O Set providers.ollama.api_key = dumm***2345
2026-06-25T13:37:38.035670Z 01O Set language = Chinese
2026-06-25T13:37:38.045646Z 01O ===== Test Ollama LLM Connection =====
2026-06-25T13:37:45.101004Z 01O Source: provider:ollama
2026-06-25T13:37:45.101072Z 01O URL: http://xxxx:1234/v1
2026-06-25T13:37:45.101083Z 01O Model: deepseek-coder-v2:16b
2026-06-25T13:37:45.101085Z 01O 我是阿里巴巴开发的智能代码审查助手,名为“open-code-review”。
2026-06-25T13:37:45.101090Z 01O ✓ Connection test successful
2026-06-25T13:37:45.115801Z 01O Reviewing MR: feature/sonarqueb-test against master
2026-06-25T13:37:49.307582Z 01O OCR exit code: 1
2026-06-25T13:37:49.307613Z 01O === OCR Result file is empty or missing ===
2026-06-25T13:37:49.311440Z 01O Created default empty result
2026-06-25T13:37:49.311494Z 01O === OCR stderr ===
2026-06-25T13:37:49.314085Z 01O Error: review failed: all 3 file review(s) failed — check your LLM configuration and API key
2026-06-25T13:37:49.314478Z 01O === OCR review completed ===
2026-06-25T13:37:49.314482Z 01O $ python3 << 'PYTHON_SCRIPT' # collapsed multi-line command
2026-06-25T13:37:49.595267Z 01O No review comments to post.
2026-06-25T13:37:49.951341Z 00O
2026-06-25T13:37:49.953371Z 00O+
2026-06-25T13:37:49.953401Z 00O+Cleaning up project directory and file based variables
2026-06-25T13:37:50.909364Z 00O
2026-06-25T13:37:50.984493Z 00O Job succeeded
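In the log above, the job succeeds and the posting script prints "No review comments to post." even though `ocr review` exited with code 1, because the placeholder JSON written by the shell step has an empty `comments` list. The sketch below shows one way the two cases could be told apart. It assumes the exit code is made available to the Python step (for example by exporting the shell variable `OCR_EXIT`, which the pipeline as posted does not do); `classify_run` is a hypothetical helper, not part of OpenCodeReview.

```python
def classify_run(exit_code, result, stderr_text=""):
    """Return (status, message) for one OCR run.

    Hypothetical helper: `result` is the parsed /tmp/ocr-result.json and
    `stderr_text` the content of /tmp/ocr-stderr.log.
    """
    comments = result.get("comments", [])
    if exit_code != 0 and not comments:
        # A non-zero exit with no comments means the review did not run,
        # not that the merge request is clean.
        detail = stderr_text.strip() or "no stderr output"
        return "failed", f"OCR review failed (exit {exit_code}): {detail}"
    if not comments:
        return "clean", result.get("message", "No comments generated.")
    return "issues", f"{len(comments)} comment(s) to post"
```

With a check of this kind, a run like the one logged here would be reported as failed, with the stderr line about all 3 file reviews failing, instead of as an empty review.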
OpenCodeReview Version
open-code-review v1.6.1 (034d512) linux/amd64
Operating System
Linux (x86_64)
Installation Method
npm (global)
LLM Provider
OpenAI
Bug Description
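I deployed open-source models locally with Ollama, and the endpoint is reachable from GitLab. I adapted the sample GitLab YAML file from the documentation and ran it. The connection check passes, but the model does not return a correct result, even though testing the API directly works fine.
For example: (curl http://xxxx/v1/chat/completions -H "Content-Type: application/json" -H "Accept: application/json" -H "User-Agent: OCR-Test" -d '{"model":"deepseek-coder-v2:16b","messages":[{"role":"user","content":"Hello"}],"stream":false}')
With ocr review, however, the result does not conform to the expected format, and the model returns a 400 error outright. Did I configure something incorrectly?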
Steps to Reproduce
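1. Background
I deployed the free DeepSeek model locally with Ollama, and it is reachable from the external network.
2. The YAML file configured in GitLab is the pipeline file shown above.
3. The GitLab pipeline output is the job log shown above.
The Ollama log is as follows:
6月 25 21:37:49 xxx ollama[1245697]: [GIN] 2026/06/25 - 21:37:49 | 400 | 368.959886ms | 192.168.254.244 | POST "/v1/chat/completions"
Expected Behavior
Could someone help me check which part of my configuration is wrong, or tell me how to debug ocr review to track down the problem?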
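One way to narrow down the 400 is to replay a chat-completions request outside of `ocr` and print the response body, since the Ollama access log only records the status code. This is a minimal sketch, assuming the endpoint is OpenAI-compatible as in the direct API test. The `extra_body` merge mirrors the `thinking` setting mentioned in the pipeline's header comments; whether OCR actually sends that field, and what its full request looks like, is an assumption and is not confirmed by the logs. `build_payload` and `send_chat` are hypothetical helper names.

```python
import json
import urllib.error
import urllib.request


def build_payload(model, prompt, extra_body=None):
    """Build an OpenAI-style chat request, optionally merging extra top-level fields."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    if extra_body:
        # Assumption: extra fields are merged at the top level of the request.
        payload.update(extra_body)
    return payload


def send_chat(base_url, payload, api_key="dummy-key-12345", timeout=60):
    """POST the payload and return (status, body), keeping the body of error responses."""
    req = urllib.request.Request(
        f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as e:
        # A 4xx/5xx response body usually states why the request was rejected.
        return e.code, e.read().decode("utf-8")
```

Calling `send_chat("http://xxxx:1234/v1", build_payload("deepseek-coder-v2:16b", "Hello"))` once as is, once with `extra_body={"thinking": {"type": "disabled"}}`, and once with a prompt about the size of a real diff would show which variation, if any, reproduces the 400 and what error message the server returns for it.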
Logs / Error Output
Additional Context
No response