Abstract
Young people frequently use social media to share their physical and mental health issues. Social media has therefore become a major online resource for studying the language used to express issues such as depression and self-harm, which can help to identify individuals at risk of harm. Depression and suicide are also closely related, since depression is the symptom most commonly associated with self-harm acts such as suicide. In this project, we propose to build a linguistically annotated corpus with sentiment annotations in order to study youth behavior through social media discourse across the MENA region. We plan to create a large-scale dataset of users with self-reported depression messages, and to perform several correlational analyses to understand their psychosocial behaviors. The collected corpus will be annotated by a team of dedicated annotators from various Arab countries. We will then use natural language processing (NLP) tools and techniques to reveal the linguistic patterns and sentiments expressed in the collected tweets. Finally, we will apply machine learning (ML) methods to build behavior prediction tools using the annotated corpus. We believe that the annotated corpus will be a valuable resource for linguists, sociologists, computer scientists, psychologists, policy makers, and others.
| Original language | English |
|---|---|
| Pages (from-to) | 347-351 |
| Number of pages | 5 |
| Journal | Procedia Computer Science |
| Volume | 142 |
| Publication status | Published - 15 Nov 2018 |
| Event | 4th Arabic Computational Linguistics, ACLing 2018, Dubai, United Arab Emirates. Duration: 17 Nov 2018 → 19 Nov 2018 |
Keywords
- Arabic Dialects
- Corpus Annotation
- Depression
- Social Media Analysis