State-of-the-art technology

We work closely with universities and research institutions to help shape the cutting edge of technology.
Here, you will find our latest scientific publications.

Extending Neural Question Answering with Linguistic Input Features

Authors: Fabian Hommel, Matthias Orlikowski, Philipp Cimiano, Matthias Hartung

Conference: SemDeep-5 co-located with the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019) 

Abstract: Considerable progress in neural question answering has been made on competitive general domain datasets. In order to explore methods to aid the generalization potential of question answering models, we reimplement a state-of-the-art architecture, perform a parameter search on an open-domain dataset and evaluate a first approach for integrating linguistic input features such as part-of-speech tags, syntactic dependency relations and semantic roles. The results show that adding these input features has a greater impact on performance than any of the architectural parameters we explore. Our findings suggest that these layers of linguistic knowledge have the potential to substantially increase the generalization capacities of neural QA models, thus facilitating cross-domain model transfer or the development of domain-agnostic QA models.


Zero-Shot Cross-Lingual Opinion Target Extraction

Authors: Soufian Jebbara, Philipp Cimiano

Conference: 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019)

Abstract: Aspect-based sentiment analysis involves the recognition of so-called opinion target expressions (OTEs). To automatically extract OTEs, supervised learning algorithms are usually employed which are trained on manually annotated corpora. The creation of these corpora is labor-intensive and sufficiently large datasets are therefore usually only available for a very narrow selection of languages and domains. In this work, we address the lack of available annotated data for specific languages by proposing a zero-shot cross-lingual approach for the extraction of opinion target expressions. We leverage multilingual word embeddings that share a common vector space across various languages and incorporate these into a convolutional neural network architecture for OTE extraction. Our experiments with five languages give promising results: we can successfully train a model on annotated data of a source language and perform accurate prediction on a target language without ever using any annotated samples in that target language. Depending on the source and target language pairs, we reach performances in a zero-shot regime of up to 77% of a model trained on target language data. Furthermore, we can increase this performance up to 87% of a baseline model trained on target language data by performing cross-lingual learning from multiple source languages.


A Guided Template-Based Question Answering System over Knowledge Graphs

Authors: Lukas Biermann, Sebastian Walter, Philipp Cimiano

Conference: 21st International Conference on Knowledge Engineering and Knowledge Management (EKAW 2018)

Abstract: Question answering systems provide easy access to structured data, in particular RDF data. However, the user experience is often negatively affected by questions that are not interpreted correctly. To remedy this, we present a new guided approach to QA that ensures that all questions that can be entered into the system also return a corresponding answer. For this, a template-based approach is used to generate all possible questions from a given RDF dataset using a number of templates. The question/answer pairs can then be indexed to provide autocompletion functionality at querying time. We describe the architecture and approach and present preliminary evaluation results.
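The guided, template-based idea summarized in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the triples, the predicate names, the templates and the helper functions `generate_qa_pairs` and `autocomplete` are all hypothetical, and plain Python tuples stand in for a real RDF dataset.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical toy data standing in for an RDF dataset.
TRIPLES: List[Triple] = [
    ("Berlin", "capitalOf", "Germany"),
    ("Paris", "capitalOf", "France"),
    ("Goethe", "authorOf", "Faust"),
]

# Hypothetical templates: one question pattern per predicate.
# "{o}" is filled with the triple's object; the subject is the answer.
TEMPLATES: Dict[str, str] = {
    "capitalOf": "What is the capital of {o}?",
    "authorOf": "Who wrote {o}?",
}


def generate_qa_pairs(triples: List[Triple], templates: Dict[str, str]) -> Dict[str, str]:
    """Generate every question the templates can produce, mapped to its answer."""
    pairs: Dict[str, str] = {}
    for subject, predicate, obj in triples:
        template = templates.get(predicate)
        if template is None:
            continue  # no template for this predicate, so no question is offered
        pairs[template.format(o=obj)] = subject
    return pairs


def autocomplete(prefix: str, qa_index: Dict[str, str], limit: int = 5) -> List[str]:
    """Return up to `limit` indexed questions that start with the typed prefix."""
    prefix = prefix.lower()
    matches = [q for q in sorted(qa_index) if q.lower().startswith(prefix)]
    return matches[:limit]


QA_INDEX = generate_qa_pairs(TRIPLES, TEMPLATES)
```

Because users can only select questions that were generated from the data, every selectable question in this sketch has an answer in `QA_INDEX` by construction.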


Identifying Right-Wing Extremism in German Twitter Profiles: a Classification Approach

Authors: Matthias Hartung, Roman Klinger, Franziska Schmidtke, Lars Vogel

Conference: 22nd International Conference on Applications of Natural Language to Information Systems (NLDB 2017)

Abstract: Social media platforms are used by an increasing number of extremist political actors for mobilization, recruiting or radicalization purposes. We propose a machine learning approach to support manual monitoring aiming at identifying right-wing extremist content in German Twitter profiles. We frame the task as profile classification, based on textual cues, traits of emotionality in language use, and linguistic patterns. A quantitative evaluation reveals a limited precision of 25% with a close-to-perfect recall of 95%. This leads to a considerable reduction of the workload of human analysts in detecting right-wing extremist users.
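Why a precision of only 25% can still reduce the analysts' workload becomes clearer with a small calculation. The sketch below uses the precision and recall reported in the abstract; the pool size of 10,000 profiles and the count of 100 extremist profiles are purely hypothetical assumptions, as the abstract does not state them, and the function `review_workload` is illustrative only.

```python
from typing import Dict


def review_workload(total_profiles: int, positives: int,
                    precision: float, recall: float) -> Dict[str, float]:
    """Estimate how many profiles analysts must review if they only
    inspect the profiles flagged by the classifier."""
    true_positives = recall * positives    # relevant profiles that get flagged
    flagged = true_positives / precision   # all flagged profiles, relevant or not
    return {
        "flagged": flagged,
        "true_positives": true_positives,
        "missed": positives - true_positives,
        "share_reviewed": flagged / total_profiles,
    }


# Hypothetical pool: 10,000 profiles, 100 of them extremist (assumed numbers).
# Precision and recall are the values reported in the abstract.
estimate = review_workload(total_profiles=10_000, positives=100,
                           precision=0.25, recall=0.95)
```

Under these assumed numbers, analysts would review a few hundred flagged profiles instead of the full pool while missing only a small fraction of the relevant ones. The actual reduction depends on how common such profiles are in the monitored population.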


Opinion Mining in Online Reviews About Distance Education Programs

Authors: Janik Jaskolski, Fabian Siegberg, Thomas Tibroni, Philipp Cimiano, Roman Klinger

Journal Publication

Abstract: The popularity of distance education programs is increasing at a fast pace. In step with this development, online communication between students in fora, social media and reviewing platforms is increasing as well. Exploiting this information to support fellow students or institutions requires extracting the relevant opinions in order to automatically generate reports providing an overview of pros and cons of different distance education programs. We report on an experiment involving distance education experts with the goal of developing a dataset of reviews annotated with relevant categories and aspects in each category discussed in the specific review together with an indication of the sentiment. Based on this experiment, we present an approach to extract general categories and specific aspects under discussion in a review together with their sentiment. We frame this task as a multi-label hierarchical text classification problem and empirically investigate the performance of different classification architectures to couple the prediction of a category with the prediction of particular aspects in this category. We evaluate different architectures and show that a hierarchical approach leads to superior results in comparison to a flat model which makes decisions independently.
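The difference between a flat and a hierarchical decision can be sketched in a few lines. This is one simple way of coupling category and aspect predictions, chosen for illustration; it is not one of the architectures evaluated in the paper. The category and aspect names, the scores and the threshold are hypothetical.

```python
from typing import Dict, List

# Hypothetical category -> aspects hierarchy (not the paper's label set).
HIERARCHY: Dict[str, List[str]] = {
    "teaching": ["materials", "tutors"],
    "organization": ["exams", "costs"],
}


def flat_decode(aspect_scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Flat model: accept each aspect independently of its category."""
    return sorted(a for a, s in aspect_scores.items() if s >= threshold)


def hierarchical_decode(category_scores: Dict[str, float],
                        aspect_scores: Dict[str, float],
                        threshold: float = 0.5) -> Dict[str, List[str]]:
    """Hierarchical coupling: accept an aspect only if its parent
    category was predicted as well."""
    result: Dict[str, List[str]] = {}
    for category, aspects in HIERARCHY.items():
        if category_scores.get(category, 0.0) < threshold:
            continue  # category rejected, so its aspects are skipped
        result[category] = [a for a in aspects
                            if aspect_scores.get(a, 0.0) >= threshold]
    return result


# Hypothetical classifier scores for a single review.
category_scores = {"teaching": 0.9, "organization": 0.2}
aspect_scores = {"materials": 0.8, "tutors": 0.3, "exams": 0.7, "costs": 0.1}

flat_result = flat_decode(aspect_scores)
hierarchical_result = hierarchical_decode(category_scores, aspect_scores)
```

In this toy example the flat decision keeps the aspect "exams" even though its category "organization" was rejected, while the hierarchical decision drops it.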
