Hobby-hacking Eric


PhD viva (defence)

I'll be presenting my thesis next Wednesday. You are cordially invited to attend the defence and to join us for drinks afterwards.

Surface realisation: ambiguity and determinism
Eric Kow

14 November 2007
Amphi C, LORIA
Nancy, France

Surface realisation is a subtask of natural language generation. It may be viewed as the inverse of parsing: given a grammar and a representation of meaning, the surface realiser produces a natural language string that the grammar associates with the input meaning. This thesis presents three extensions to GenI, a realisation algorithm for Feature-Based Lexicalised Tree Adjoining Grammar (FB-LTAG).

The first extension improves the efficiency of the realiser with respect to lexical ambiguity. It is an adaptation from parsing of the "electrostatic tagging" optimisation, in which lexical items are associated with a set of polarities, and combinations of those items with non-neutral polarities are filtered out.
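To make the polarity idea concrete, here is a minimal sketch of the filtering step. It is not GenI's actual implementation: the toy lexicon, the category names, the function names `net_polarity` and `polarity_filter`, and the convention of giving the root category an initial charge of -1 are all assumptions made for illustration. Each lexical item carries a polarity per syntactic category (positive for what it provides, negative for what it needs), and a combination of items is kept only if every category sums to zero.

```python
from collections import defaultdict
from itertools import product


def net_polarity(items, root="s"):
    """Sum polarities per category; return only the non-neutral ones.

    Assumption for this sketch: the root category starts at -1, so a
    combination is neutral only if it supplies exactly one root.
    """
    total = defaultdict(int)
    total[root] -= 1
    for _name, polarities in items:
        for category, charge in polarities.items():
            total[category] += charge
    return {cat: charge for cat, charge in total.items() if charge != 0}


def polarity_filter(candidates, root="s"):
    """Keep the combinations (one item per input literal) that are neutral."""
    return [
        combo
        for combo in product(*candidates)
        if not net_polarity(combo, root)
    ]


# Hypothetical lexical choices for an input such as love(john, mary).
# Each inner list holds the alternative items for one semantic literal.
CANDIDATES = [
    [("John", {"np": +1})],
    [("Mary", {"np": +1})],
    [
        ("loves", {"s": +1, "np": -2}),    # finite verb: needs two NPs
        ("to love", {"s": +1, "np": -1}),  # infinitive: needs one NP
    ],
]

for combo in polarity_filter(CANDIDATES):
    print([name for name, _polarities in combo])
```

In this toy case the combination with "to love" is discarded before any tree combination is attempted, because it leaves one noun phrase unconsumed. The filter is only a necessary condition: a neutral combination may still fail to produce a sentence once the trees are actually combined.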

The second extension deals with the number of outputs returned by the realiser. Normally, the GenI algorithm returns all of the sentences associated with the input logical form. Whilst these outputs can be seen as having the same core meaning, they often convey subtle distinctions in emphasis or style. It is important for generation systems to be able to control these extra factors. Here, we show how the input specification can be augmented with annotations that provide the fine-grained control that is required. The extension builds on the fact that the FB-LTAG grammar used by the generator was constructed from a "metagrammar", explicitly putting to use the linguistic generalisations that are encoded within it.

The final extension provides a means for the realiser to act as a metagrammar-debugging environment. Mistakes in the metagrammar can have widespread consequences for the grammar. Since the realiser can output all strings associated with a semantic input, it can be used to find out what these mistakes are and, crucially, their precise location in the metagrammar.