75 changes: 64 additions & 11 deletions includes/Abilities/Comment_Moderation/Comment_Analysis.php
@@ -66,6 +66,12 @@
'enum' => array_keys( Comment_Moderation::get_sentiment_config() ),
'description' => esc_html__( 'The sentiment of the comment.', 'ai' ),
),
'value_score' => array(
'type' => array( 'number', 'null' ),

Collaborator:

Is there value in having this return null? I think I'd just match what we do for toxicity scoring and always return a number between 0 and 1.
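
A minimal sketch of what this suggestion could look like, assuming a missing or non-numeric score falls back to a fixed number instead of null. The helper name `normalize_score` and the `0.5` fallback are hypothetical and do not appear in the PR.

```php
<?php
/**
 * Hypothetical helper (not in the PR): always returns a float in [0, 1].
 *
 * @param mixed $raw      Raw score taken from the model response.
 * @param float $fallback Assumed default when the score is missing or non-numeric.
 * @return float Score clamped to the closed range [0, 1].
 */
function normalize_score( $raw, float $fallback = 0.5 ): float {
	// Missing or non-numeric values get the fallback rather than null.
	if ( ! is_numeric( $raw ) ) {
		return $fallback;
	}

	// Clamp to [0, 1], as the PR does for value_score.
	return max( 0.0, min( 1.0, (float) $raw ) );
}
```

With a helper like this, the output schema could declare `value_score` as a plain `number`, at the cost of no longer distinguishing "could not assess" from a genuine mid-range score.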

'minimum' => 0,
'maximum' => 1,
'description' => esc_html__( 'Value score from 0 (low value) to 1 (high value), or null if relevance cannot be assessed.', 'ai' ),
),
),
);
}
@@ -129,7 +135,7 @@
update_comment_meta( $comment_id, Comment_Moderation::META_ANALYSIS_STATUS, Comment_Moderation::STATUS_PROCESSING );

// Analyze the comment.
$result = $this->analyze_comment( $comment->comment_content, $comment->comment_author );
$result = $this->analyze_comment( $comment->comment_content, $comment->comment_author, $comment->comment_post_ID );

if ( is_wp_error( $result ) ) {
// Mark as failed.
@@ -140,13 +146,15 @@
// Store the results.
update_comment_meta( $comment_id, Comment_Moderation::META_TOXICITY_SCORE, $result['toxicity_score'] );
update_comment_meta( $comment_id, Comment_Moderation::META_SENTIMENT, $result['sentiment'] );
update_comment_meta( $comment_id, Comment_Moderation::META_VALUE_SCORE, $result['value_score'] );
update_comment_meta( $comment_id, Comment_Moderation::META_ANALYSIS_STATUS, Comment_Moderation::STATUS_COMPLETE );
update_comment_meta( $comment_id, Comment_Moderation::META_ANALYZED_AT, time() );

return array(
'comment_id' => $comment_id,
'toxicity_score' => $result['toxicity_score'],
'sentiment' => $result['sentiment'],
'value_score' => $result['value_score'],
);
}

@@ -190,8 +198,9 @@
'properties' => array(
'toxicity_score' => array( 'type' => 'number' ),
'sentiment' => array( 'type' => 'string' ),
'value_score' => array( 'type' => 'number' ),
),
'required' => array( 'toxicity_score', 'sentiment' ),
'required' => array( 'toxicity_score', 'sentiment', 'value_score' ),
'additionalProperties' => false,
);

@@ -205,16 +214,50 @@
return (array) apply_filters( 'wpai_comment_analysis_response_schema', $schema );
}

/**
* Function to return context from the post for comment analysis.
* @param int $post_id The ID of the post.
* @return string The content of the post.
*/
private function get_post_context( int $post_id ): ?string {
$post = get_post( $post_id );

Collaborator:

The post may not exist, so we need an early-return check here.
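
A rough sketch of the guard being asked for, written as a standalone function so it runs without WordPress. In the PR the value would come from `get_post( $post_id )`, which returns null when the post does not exist; the function name `post_context_or_null` is hypothetical, and the AI-summary fallback is left out for brevity.

```php
<?php
/**
 * Hypothetical standalone version of the guard (not the PR's code).
 *
 * @param object|null $post Object with post_excerpt and post_content, or null.
 * @return string|null Context text, or null when nothing usable exists.
 */
function post_context_or_null( ?object $post ): ?string {
	// Early return: there is nothing to read when the post lookup failed.
	if ( null === $post ) {
		return null;
	}

	// Prefer the excerpt when it is non-empty.
	$excerpt = trim( (string) ( $post->post_excerpt ?? '' ) );
	if ( '' !== $excerpt ) {
		return $excerpt;
	}

	// Otherwise fall back to the post content, or null when that is empty too.
	$content = trim( (string) ( $post->post_content ?? '' ) );

	return '' !== $content ? $content : null;
}
```

Checking for null before any property access should also address the static analysis failures reported on the `$post_excerpt` and `$post_content` reads.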


// 1. Use excerpt if available (human-written, most reliable)
$excerpt = trim( $post->post_excerpt );

Check failure on line 226 in includes/Abilities/Comment_Moderation/Comment_Analysis.php

GitHub Actions / Run PHP static analysis

Cannot access property $post_excerpt on WP_Post|null.

Collaborator:

Any benefit to using the excerpt or summary over just using the full post content?

if ( ! empty( $excerpt ) ) {
return $excerpt;
}

// 2. Fall back to AI-generated summary if available
$ai_summary = trim( get_post_meta( $post_id, '_ai_post_summary', true ) );
if ( ! empty( $ai_summary ) ) {
return $ai_summary;
}

// 3. Fall back to trimmed post content (strip tags, normalize whitespace)
$content = wp_strip_all_tags( $post->post_content );

Check failure on line 238 in includes/Abilities/Comment_Moderation/Comment_Analysis.php

GitHub Actions / Run PHP static analysis

Cannot access property $post_content on WP_Post|null.
$content = preg_replace( '/\s+/', ' ', $content );
$content = trim( $content );

Check failure on line 240 in includes/Abilities/Comment_Moderation/Comment_Analysis.php

GitHub Actions / Run PHP static analysis

Parameter #1 $str of function trim expects string, string|null given.
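
This failure arises because `preg_replace()` returns null when the regex engine reports an error, so its result is typed `string|null`. One possible fix, sketched with a hypothetical helper name (`collapse_whitespace` is not in the PR), is to fall back to the original text before calling `trim()`:

```php
<?php
/**
 * Hypothetical helper (not in the PR): collapses whitespace runs to single spaces.
 *
 * @param string $text Input text.
 * @return string Text with whitespace normalized and the ends trimmed.
 */
function collapse_whitespace( string $text ): string {
	// preg_replace() yields null on a regex engine error, so the result is string|null.
	$collapsed = preg_replace( '/\s+/', ' ', $text );

	// Fall back to the untouched input so trim() always receives a string.
	return trim( $collapsed ?? $text );
}
```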

if ( empty( $content ) ) {
return null;
}

return mb_substr( $content, 0, 650 );

Collaborator:

Not sure we need to trim this. I understand it will save tokens, but it likely gives us worse results.

}

/**
* Analyzes a comment for toxicity and sentiment.
*
* @since 0.9.0
*
* @param string $content The comment content.
* @param string $author The comment author name.
* @return array{toxicity_score: float, sentiment: string}|\WP_Error The analysis result.
* @param string $post_id The ID of the post.
* @return array{toxicity_score: float, sentiment: string, value_score: float|null}|\WP_Error The analysis result.
*/
private function analyze_comment( string $content, string $author ) {
private function analyze_comment( string $content, string $author, string $post_id ) {
Comment on lines +256 to +259

Collaborator:

I believe $post_id should always be an integer

Suggested change
- * @param string $post_id The ID of the post.
- * @return array{toxicity_score: float, sentiment: string, value_score: float|null}|\WP_Error The analysis result.
- */
- private function analyze_comment( string $content, string $author ) {
- private function analyze_comment( string $content, string $author, string $post_id ) {
+ * @param int $post_id The ID of the post.
+ * @return array{toxicity_score: float, sentiment: string, value_score: float|null}|\WP_Error The analysis result.
+ */
+ private function analyze_comment( string $content, string $author, int $post_id ) {


/**
* Filters the comment analysis result before calling the AI provider.
*
@@ -223,20 +266,24 @@
*
* @since 0.9.0
*
* @param array{toxicity_score: float, sentiment: string}|null $result Precomputed analysis result.
* @param string $content Comment content.
* @param string $author Comment author name.
* @param array{toxicity_score: float, sentiment: string, value_score: float|null}|null $result Precomputed analysis result.
* @param string $content Comment content.
* @param string $author Comment author name.
* @param string $post_id The ID of the post.

Collaborator:

Suggested change
- * @param string $post_id The ID of the post.
+ * @param int $post_id The ID of the post.

*/
$pre_result = apply_filters( 'wpai_comment_analysis_result', null, $content, $author );
$pre_result = apply_filters( 'wpai_comment_analysis_result', null, $content, $author, $post_id );

if ( is_array( $pre_result ) ) {
return $this->sanitize_analysis_result( $pre_result );
}

$post_context = $this->get_post_context( absint( $post_id ) );

$prompt = sprintf(
"Comment by %s:\n\"\"\"%s\"\"\"",
"Comment by %s:\n\"\"\"%s\"\"\"\nContext:\n\"\"\"%s\"\"\"",
$author,
$content
$content,
$post_context
);
Comment on lines +280 to 287

Collaborator:

Now that we're passing in the content of the post the comment is on, I'd suggest we change how we build the prompt we send to the LLM. You can look at our other abilities to see how they work, but generally we wrap each piece of input in XML-like tags.
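
As an illustration of the XML-like tag approach, here is a minimal sketch. The function name `build_comment_prompt` and the tag names (`comment-author`, `comment`, `article`) are assumptions for this example; the tags used by the plugin's other abilities may differ.

```php
<?php
/**
 * Hypothetical prompt builder (function and tag names are assumptions).
 *
 * @param string      $author       Comment author name.
 * @param string      $comment      Comment content.
 * @param string|null $post_context Article context, or null when unavailable.
 * @return string Prompt with each input wrapped in its own tag.
 */
function build_comment_prompt( string $author, string $comment, ?string $post_context ): string {
	$parts = array(
		'<comment-author>' . $author . '</comment-author>',
		'<comment>' . $comment . '</comment>',
	);

	// Only add the article block when context is actually available.
	if ( null !== $post_context && '' !== $post_context ) {
		$parts[] = '<article>' . $post_context . '</article>';
	}

	return implode( "\n", $parts );
}
```

Omitting the article block when there is no context avoids sending an empty placeholder to the model. This sketch does not escape tag-like text inside the inputs, which a real implementation would probably want to handle.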


$prompt_builder = $this->get_prompt_builder( $prompt );
@@ -290,7 +337,7 @@
* @since 0.9.0
*
* @param array<string, mixed> $result Raw analysis result.
* @return array{toxicity_score: float, sentiment: string} Sanitized analysis result.
* @return array{toxicity_score: float, sentiment: string, value_score: float|null} Sanitized analysis result.
*/
private function sanitize_analysis_result( array $result ): array {
// Validate and sanitize the response.
@@ -303,9 +350,15 @@
? $result['sentiment']
: Comment_Moderation::SENTIMENT_NEUTRAL;

$value_score = null;
if ( isset( $result['value_score'] ) ) {
$value_score = max( 0, min( 1, (float) $result['value_score'] ) );
}

return array(
'toxicity_score' => $toxicity_score,
'sentiment' => $sentiment,
'value_score' => $value_score,
);
}
}
16 changes: 14 additions & 2 deletions includes/Abilities/Comment_Moderation/system-instruction.php
@@ -6,7 +6,7 @@
*/

return <<<'INSTRUCTION'
You are a comment moderation assistant. Analyze the provided comment and return the following:
You are a comment moderation assistant. You will be given an article and a comment left on that article. Analyze the comment and return the following:

1. "toxicity_score": A number between 0 and 1 indicating how toxic/harmful the comment is:
- 0.0-0.3: Low toxicity (constructive, polite, or neutral)
@@ -18,5 +18,17 @@
- "negative": The comment expresses criticism, disagreement, frustration, or disappointment
- "neutral": The comment is factual, asks a question, or doesn't express clear emotion

Respond only with those two fields. No explanation, no markdown, no additional text.
3. "value_score": A number between 0 and 1 indicating how relevant and valuable the comment is to the article's discussion:
- 0.0-0.3: Low value (spam, engagement bait, generic acknowledgment such as "+1" or "thanks", or completely off-topic)
- 0.4-0.6: Moderate value (loosely related, adds minor context, or is brief but not noise)
- 0.7-1.0: High value (on-topic, substantive, adds new information, asks a meaningful question, or meaningfully advances the discussion)

When scoring, consider:
- Relevance to the article's subject matter
- Whether the comment adds new information or perspective
- Whether it could stand alone as a meaningful contribution vs. filler

If the article content is unavailable or too short to assess relevance, return null.

Respond only with those three fields as a JSON object. No explanation, no markdown, no additional text.
INSTRUCTION;