This dataset is dedicated to finding texts that contain information helpful for assessing a person's suicide risk. The source of the data is Russian Twitter. The dataset has 5 categories:

1. Texts describing negative events that occurred to the subject in the past or in the present: factual messages describing negative moments that can happen to a person, such as rape or attempted rape, problems with parents, a stay in a psychiatric hospital, self-harm, etc.
2. Current negative emotional state: messages expressing a subjective negative attitude towards oneself or others, including a desire to die, a feeling of pressure from the past, self-hatred, aggressiveness, or rage directed at oneself or others.
3. Messages about the intention of suicide: messages containing an explicit declaration of suicidal actions. Messages that contain questions about suicide methods also fall into this category.
4. Messages with a suicidal theme: messages that are not directly related to the user but have a suicidal topic.
5. Neutral: messages that do not fall into any of the categories above.
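
The five categories can be represented in code as a small label enumeration. The following is a minimal sketch only: the integer ids simply follow the numbering of the list above, and the names `Category`, `DESCRIPTIONS`, and `describe` are hypothetical. The actual label encoding used in the dataset files may differ and should be checked against the data.

```python
from enum import IntEnum


class Category(IntEnum):
    """Hypothetical label ids, following the numbering in the description."""
    NEGATIVE_EVENTS = 1
    NEGATIVE_EMOTIONAL_STATE = 2
    SUICIDAL_INTENTION = 3
    SUICIDAL_THEME = 4
    NEUTRAL = 5


# Short human-readable summaries taken from the category list.
DESCRIPTIONS = {
    Category.NEGATIVE_EVENTS: "Negative events that occurred to the subject",
    Category.NEGATIVE_EMOTIONAL_STATE: "Current negative emotional state",
    Category.SUICIDAL_INTENTION: "Messages about the intention of suicide",
    Category.SUICIDAL_THEME: "Messages with a suicidal theme",
    Category.NEUTRAL: "Neutral",
}


def describe(label: int) -> str:
    """Return the summary for a label id; raises ValueError for unknown ids."""
    return DESCRIPTIONS[Category(label)]
```

For example, `describe(3)` returns the summary for the suicidal-intention category, and an id outside 1 to 5 raises `ValueError`.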


Steps to reproduce

1. Download this repo
2. Create a directory named data and place the dataset there
3. Execute the notebooks
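
Step 2 can be sketched in Python as follows. This is an illustrative helper under stated assumptions, not part of the repository: the function names `prepare_data_dir` and `list_dataset_files` are hypothetical, and the dataset file names are not specified here, so the sketch only creates the data directory and lists whatever files have been placed in it.

```python
from pathlib import Path
from typing import List


def prepare_data_dir(repo_root: str = ".") -> Path:
    """Create the 'data' directory under the repo root if it is missing."""
    data_dir = Path(repo_root) / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir


def list_dataset_files(data_dir: Path) -> List[str]:
    """Return the sorted names of the files currently in the data directory."""
    return sorted(p.name for p in data_dir.iterdir() if p.is_file())


if __name__ == "__main__":
    data_dir = prepare_data_dir(".")
    files = list_dataset_files(data_dir)
    if not files:
        print(f"No files found in {data_dir}; place the dataset there first.")
    else:
        print("Found:", ", ".join(files))
```

Running the script from the repo root before executing the notebooks gives a quick check that the dataset is where the notebooks are expected to look for it.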


Psychology, Applied Psychology, Natural Language Processing, Machine Learning, Artificial Intelligence Applications, Suicide Risk, Suicidal Behavior