Is MOS comparable across different reference images? #14

Thanks for the great work and dataset!
I have a question regarding the Elo rating process mentioned in the paper.

1. Comparison Scope:
When collecting pairwise judgments, were the comparisons made across different reference images (e.g., comparing a distorted A0001 vs. a distorted A0002), or were they strictly restricted to distortions within the same reference image (e.g., comparing A0001_00_00 vs. A0001_01_01)?

2. Comparability of MOS:
If the comparisons were restricted within the same reference group, does this mean the absolute MOS values are not comparable across different reference images?

For example:
A0001_00_00.bmp: MOS ~1520
A0002_00_00.bmp: MOS ~1466

Can we infer that A0001_00_00 has better perceptual quality than A0002_00_00? Or are these scores only valid as relative rankings within their respective reference groups (A0001_00_00 vs A0001_01_01)?
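
To make the concern concrete, here is a minimal sketch of why I suspect the scores might not transfer across groups. It assumes a standard Elo update with a fixed K-factor and an initial rating of 1500, with comparisons restricted to one reference group at a time. These are my assumptions, not necessarily the procedure used in the paper. The `true_quality` values, the `run_elo` helper, and the `*_02_02` file names are hypothetical and only drive the simulated judgments.

```python
import random


def expected(r_a, r_b):
    """Standard Elo expected score of a against b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def run_elo(true_quality, n_rounds=2000, k=16.0, init=1500.0, seed=0):
    """Simulate Elo within ONE reference group (assumed setup, not the paper's)."""
    rng = random.Random(seed)
    names = list(true_quality)
    ratings = {name: init for name in names}
    for _ in range(n_rounds):
        a, b = rng.sample(names, 2)
        # Simulated judgment drawn from the hypothetical "true" qualities.
        a_wins = rng.random() < expected(true_quality[a], true_quality[b])
        score_a = 1.0 if a_wins else 0.0
        delta = k * (score_a - expected(ratings[a], ratings[b]))
        # Zero-sum update: whatever a gains, b loses.
        ratings[a] += delta
        ratings[b] -= delta
    return ratings


# Hypothetical groups: every image in group A is truly better than any in group B.
group_a = {"A0001_00_00": 1900, "A0001_01_01": 1700, "A0001_02_02": 1500}
group_b = {"A0002_00_00": 1300, "A0002_01_01": 1100, "A0002_02_02": 900}

ratings_a = run_elo(group_a)
ratings_b = run_elo(group_b)

mean_a = sum(ratings_a.values()) / len(ratings_a)
mean_b = sum(ratings_b.values()) / len(ratings_b)
print(f"group A mean rating: {mean_a:.2f}")
print(f"group B mean rating: {mean_b:.2f}")
```

Under these assumptions each update is zero-sum, so both group means stay at the initial rating even though group A is better across the board. If the dataset was built this way, a difference such as 1520 vs. 1466 would only reflect each image's standing within its own group.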

Clarification on this would be very helpful for my experiments. Thanks!
