Project D4-E (2006-2014)

Overview of the project D4 of the CRC 732

Modular Lexicalization of Probabilistic Context-Free Grammars

This project aims to develop and implement improved statistical disambiguation methods for syntactic analyses. It also develops a clustering model for verb-argument tuples which generalises selectional restrictions over WordNet concepts.

In the next phase, the project will implement a new parameter estimation technique for the BitPar parser which was developed in the first phase. The new method is based on ensembles of decision trees and is intended to improve the accuracy of parsing with fine-grained syntactic categories which contain information such as number, gender, and case. The project will also examine whether reranking strategies can further increase the accuracy of the parser. The reranker will use features derived from the clustering model as well as other features. The clustering model will be extended by (i) dealing with adjuncts in addition to arguments, (ii) automatically inducing noun hierarchies instead of using WordNet, and (iii) implementing a hybrid probability model. The clustering model will be applied to tasks such as word sense disambiguation.

Principal Investigator: Helmut Schmid

Staff: Richard Farkas, Thomas Müller
Former Staff: Alexander Balabanov, Christian Hying, Wiebke Wagner, Sabine Schulte im Walde
HiWi: Renjing Wang
Former HiWi: Christian Scheible, Max Kisselew

Publications

Events

2nd Interdisciplinary Workshop on Verbs: The Identification and Representation of Verb Features

November 4-5, 2010, Pisa, Italy

Organisers: Pier Marco Bertinetto (Scuola Normale Superiore di Pisa), Anna Korhonen (University of Cambridge), Alessandro Lenci (University of Pisa), Alissa Melinger (University of Dundee), Sabine Schulte im Walde (D4), Aline Villavicencio (Federal University of Rio Grande do Sul, and University of Bath)

Human Judgements in Computational Linguistics

August 23, 2008, Manchester, UK

Organisers: Ron Artstein (University of Southern California), Gemma Boleda (Universitat Politècnica de Catalunya), Frank Keller (University of Edinburgh), Sabine Schulte im Walde (D4)

Software

PAC is a predicate-argument clustering tool that is trained on predicate-frame-argument tuples and outputs a multi-dimensional cluster analysis, including clusters for the predicates and selectional preference abstractions over the predicate arguments.
