CROWDMOS: An Approach for Crowdsourcing Mean Opinion Score Studies

Flávio Ribeiro, Dinei Florencio, Cha Zhang, and Michael Seltzer


MOS (mean opinion score) subjective quality studies are used to
evaluate many signal processing methods. Since laboratory quality
studies are time consuming and expensive, researchers often run
small studies with less statistical significance or use objective measures
which only approximate human perception. We propose a
cost-effective and convenient measure called crowdMOS, obtained
by having internet users participate in a MOS-like listening study.
Workers listen and rate sentences at their leisure, using their own
hardware, in an environment of their choice. Since these individuals
cannot be supervised, we propose methods for detecting and discarding
inaccurate scores. To automate crowdMOS testing, we offer
a set of freely distributable, open-source tools for Amazon Mechanical
Turk, a platform designed to facilitate crowdsourcing. These
tools implement the MOS testing methodology described in this paper,
providing researchers with a user-friendly means of performing
subjective quality evaluations without the overhead associated with
laboratory studies. Finally, we demonstrate the use of crowdMOS
using data from the Blizzard text-to-speech competition, showing
that it delivers accurate and repeatable results.


Publication type: Inproceedings
Published in: ICASSP