Dataset of inappropriate utterances on sensitive topics in Russian
Description
This dataset covers inappropriate messages: messages on a sensitive topic that can frustrate the reader and/or harm the reputation of the speaker. The concept of inappropriateness is close to toxicity; however, overt toxicity and explicit obscenity have been intentionally excluded from this dataset.
Not all messages related to sensitive topics are inappropriate. For example, when speaking about racism, one may either attack or defend someone. The main aim of this dataset is to support distinguishing appropriate from inappropriate utterances within known sensitive topics.
Files
Name | Size | Checksum |
---|---|---|
Inappapropriate_messages.csv | 23.7 MB | md5:d6d885edd3118802ce17b74952427437 |
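
The MD5 checksum listed for the file can be used to check that a download completed intact. The following is a minimal sketch, assuming the file has been saved locally under its listed name in the current directory; the helper names `md5_of_file` and `verify_download` are illustrative and not part of the dataset.

```python
import hashlib
from pathlib import Path

# Checksum as listed on this page for Inappapropriate_messages.csv.
EXPECTED_MD5 = "d6d885edd3118802ce17b74952427437"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so the whole file is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Assumed local path: the file name as listed, in the working directory.
    csv_path = Path("Inappapropriate_messages.csv")
    if csv_path.exists():
        print("checksum ok" if verify_download(csv_path) else "checksum mismatch")
    else:
        print(f"{csv_path} not found; download it first")
```

A mismatch usually indicates a truncated or corrupted download, in which case the file should be fetched again before use.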