Using Latent Discourse Indicators to identify goodness in online conversations
thesisposted on 16.01.2020 by Ayush Jain
In order to distinguish essays and pre-prints from academic theses, we have a separate category. These are often much longer text based documents than a paper.
In this work, we model latent discourse indicators to classify constructive and collaborative conversations online. Such conversations are considered good as they are rich in content and have a sense of direction to resolve an issue, solve a problem or gain new insights and knowledge. These unique discourse indicators are able to characterize flow of information, sentiment and community structure within discussions. We build a deep relational model that captures these complex discourse behaviors as latent variables and make a global prediction about overall conversation based on these higher level discourse behaviors. DRaiL, a Declarative Deep Relational Learning platform built on PyTorch, is used for our task in which relevant discourse behaviors are formulated as discrete latent variables and scored using a deep model. These variables capture the nuances involved in online conversations and provide the information needed for predicting the presence or absence of collaborative and constructive characterization in the entire conversational thread. We show that the joint modeling of such competing latent behaviors results in a performance improvement over the traditional direct classification methods in which all the raw features are just combined together to predict the final decision. The Yahoo News Annotated Comments Corpus is used as a dataset containing discussions on Yahoo news forums and final labels are annotated based on our precise and restricted definitions of positively labeled conversations. We formulated our annotation guidelines based on a sample set of conversations and resolved any conflict in specific annotation by revisiting those examples again.