AMALGAM

Overview

Amalgam is a novel system developed in the Natural Language Processing group at Microsoft Research for sentence realization during natural language generation. Sentence realization is the process of generating (realizing) a fluent sentence from a semantic representation. From the outset, the goal of the Amalgam project has been to build a sentence realization system in a data-driven fashion using machine learning techniques. To date, we have implemented Amalgam for both German and French, with English in the works.

Amalgam accepts as input a logical form graph capturing the meaning of a sentence. The logical form shown here is for the German sentence "Die ODBC-Spezifikation definiert das Feld, das die Komponente bezeichnet, die die Meldung ausgegeben hat." ("The ODBC specification defines the field that identifies the component that issued the message."), taken from Microsoft technical manuals.

[Figure: logical form graph for the example sentence]

Amalgam constrains the search for a fluent sentence realization by following a linguistically informed approach that includes such component steps as labeling of phrasal projections, raising, ordering of elements within a constituent, and extraposition of relative clauses. For the above example, the following tree shows the transformed structure just prior to ordering.

[Figure: transformed tree for the example sentence, just prior to ordering]

Proceeding through these steps, Amalgam transforms the logical form into a fully articulated tree structure from which an output sentence is read.
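
The staged process described above can be pictured as a sequence of tree-to-tree functions followed by a left-to-right reading of the leaves. The following is only a minimal sketch under stated assumptions, not Amalgam's implementation: the names Node, read_off, realize, and order_children are hypothetical, and the fixed RANK table is a hand-written stand-in for what would be a learned ordering decision.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """A tree node; all names here are hypothetical, for illustration only."""
    label: str                                   # category label, e.g. "NP" or "DET"
    word: str = ""                               # surface string (leaf nodes only)
    children: List["Node"] = field(default_factory=list)


Stage = Callable[[Node], Node]                   # one transformation step


def read_off(node: Node) -> List[str]:
    """Collect the words at the leaves, left to right."""
    if not node.children:
        return [node.word] if node.word else []
    words: List[str] = []
    for child in node.children:
        words.extend(read_off(child))
    return words


def realize(tree: Node, stages: List[Stage]) -> str:
    """Apply each stage in turn, then read the sentence off the final tree."""
    for stage in stages:
        tree = stage(tree)
    return " ".join(read_off(tree))


# Toy stand-in for an ordering step: a fixed rank per label (an assumption,
# not a learned model). Unknown labels sort after the known ones.
RANK = {"DET": 0, "NOUN": 1}


def order_children(node: Node) -> Node:
    """Recursively order the children of every constituent by RANK."""
    ordered = sorted(
        (order_children(child) for child in node.children),
        key=lambda child: RANK.get(child.label, len(RANK)),
    )
    return Node(node.label, node.word, ordered)


# An unordered constituent containing two words from the example sentence.
tree = Node("NP", children=[Node("NOUN", "Feld"), Node("DET", "das")])
print(realize(tree, [order_children]))  # prints: das Feld
```

Keeping each step as a separate function from tree to tree means that steps such as raising or extraposition could be added to the list of stages without changing how the sentence is read off at the end.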

The contexts for each linguistic operation in the process are primarily machine-learned. The promise of machine-learned approaches to sentence realization is that they can easily be adapted to new domains and ideally to new languages merely by retraining.
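
To make the idea of learned contexts concrete, the sketch below trains a tiny model that decides whether an operation should apply in a given context, and shows that adapting it amounts to training on different examples. It is an illustration under stated assumptions only: the learner is a simple count-based stand-in rather than the one used in Amalgam, the class ContextModel and its methods are hypothetical, and the feature tuples and training examples are invented toy data.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Iterable, Tuple

Context = Tuple[Hashable, ...]       # a tuple of feature values (hypothetical)


class ContextModel:
    """Count-based stand-in for a learned model of one linguistic operation."""

    def __init__(self) -> None:
        self.counts: Dict[Context, Counter] = defaultdict(Counter)

    def train(self, examples: Iterable[Tuple[Context, bool]]) -> "ContextModel":
        """Record how often the operation was applied in each context."""
        for context, applied in examples:
            self.counts[context][applied] += 1
        return self

    def should_apply(self, context: Context, default: bool = False) -> bool:
        """Apply the operation if it was applied more often than not."""
        seen = self.counts.get(context)
        if not seen:
            return default           # unseen context: fall back to the default
        return seen[True] > seen[False]


# Invented toy data: should a relative clause be extraposed?
toy_examples = [
    (("RELCL", "long"), True),
    (("RELCL", "long"), True),
    (("RELCL", "short"), False),
]
model = ContextModel().train(toy_examples)
print(model.should_apply(("RELCL", "long")))

# Retraining on different data yields a different model; the code is unchanged.
other_model = ContextModel().train([(("RELCL", "long"), False)])
print(other_model.should_apply(("RELCL", "long")))
```

The point of the design is that the decision logic lives in the trained counts rather than in hand-written rules, so only the training examples need to change for a new domain.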

To date, our research has focused on sentence realization in the context of the ongoing machine translation research in the Microsoft Research NLP group.


Publications
  • For a brief overview of German Amalgam, we recommend the following paper:
    S. Corston-Oliver, M. Gamon, E. Ringger, and R. Moore. 2002. An overview of Amalgam: A machine-learned generation module. In Proceedings of the International Natural Language Generation Conference. New York, USA. pp. 33-40.


  • For a detailed overview of German Amalgam, as it existed in 2002, we recommend the following Microsoft Research technical report:
    M. Gamon, E. Ringger, and S. Corston-Oliver. 2002. Amalgam: A machine-learned generation module. Microsoft Research Technical Report MSR-TR-2002-57. June 2002. 73 pp.

Project Members

Interns:

  • 2001: Zhu Zhang
  • 2003: David Rojas

Acknowledgments:

  • Max Chickering
  • Tom Reutter
  • Karin Berghfer
  • the Microsoft Research NLP generation grammarians

©2008 Microsoft Corporation. All rights reserved.