Recent studies have shown that the semantic labels (e.g. Excellent, Good, Fair, Poor, and Bad) commonly associated with discrete-scale ITU subjective quality evaluation induce a bias in MOS computation, and that this bias can be quantified by a set of reference coefficients
that are independent of the observer panel. The present paper reconsiders these results from the perspective of upgrading the standard. First, it theoretically investigates how results obtained on semantically labeled scales can be “cleaned” of such an influence
and derives the underlying computation formula for the mean opinion score. Second, it proposes a unified evaluation procedure featuring both semantic-free MOS computation and backward compatibility with state-of-the-art solutions. The theoretical and methodological results are
supported by subjective experiments involving a total of 440 human observers, alternatively scoring 2D and stereoscopic video content. For each type of content, both high- and low-quality excerpts are alternatively considered. For each type of content and for each type of quality, a 5-level
(Excellent, Good, Fair, Poor, and Bad) grading scale is considered.