Hany Hassan


Hany Hassan Awadalla

I am a Senior Scientist at Microsoft Research Redmond. My research interests are Statistical Machine Translation, Speech Translation, and Semi-Supervised Machine Learning.

I completed my Ph.D. at Dublin City University and my M.Sc. and B.Sc. at Cairo University. I joined Microsoft in 2010; before that, I had worked at IBM since 1996.

Currently, I am working on turning Machine Translation into reality through Microsoft Translator and Skype Translator.





Lexical Syntax for Statistical Machine Translation, PhD thesis.

Book Chapters:

Machine Translation for Semitic Languages, in Natural Language Processing of Semitic Languages, Springer - 2013

Lexical Syntax for Arabic Statistical Machine Translation, in Challenges for Arabic Machine Translation, John Benjamins Publishing - 2012  


Patents:


Method and System for Extracting and Visualizing Graph-Structured Relations from Unstructured Text, US 7730085, Granted.

Method and System for Detecting and Predicting Anomalous Events, US8719190, Granted.

Method and System for Access of Multilingual Textual Resources using Conceptual Representation Matching, US8204736, Granted.

Method and System for Detecting Anomalous Behavior in Business Process Performance, US8719190, Granted.

Method and System for Automatically Generating Multilingual Electronic Content from Unstructured Data, pending.

Method and System for Selecting a Data Element in a Network, US8589412, Granted.

Method and System for Resolving Out-of-Vocabulary Words during Machine Translation, US20130197896, US8990066, Granted.

Cross Language Speech Translation and Recognition, Filed by Microsoft, 2015 

Journal Papers:


Efficient Accurate Syntactic Direct Translation Models, One Tree at a Time. Hany Hassan, Khalil Sima'an and Andy Way, In Machine Translation Journal, 2011

Syntactically Lexicalized Phrase-Based Statistical Translation. Hany Hassan, Khalil Sima'an and Andy Way, In IEEE Transactions on Audio, Speech and Language Processing. September 2008.

Conference Papers:

Detecting Interrogative Utterances with Recurrent Neural Networks, Junyoung Chung, Jacob Devlin and Hany Hassan (best paper), NIPS 2015 - Machine Learning for SLU workshop

Learning Translation Models from Monolingual Continuous Representations. Kai Zhao, Hany Hassan, and Michael Auli. In Proceedings of NAACL 2015.

Segmentation and Disfluency Removal for Conversational Speech Translation, Hany Hassan, Lee Schwartz, Dilek Hakkani-Tur, and Gokhan Tur, in Proceedings of Interspeech, 2014.

Graph-based Semi-Supervised Learning of Translation Models from Monolingual Data. Avneesh Saluja, Hany Hassan, Kristina Toutanova, Chris Quirk, ACL 2014

MSR-FBK IWSLT 2013 SLT System Description, Anthony Aue, Qin Gao, Hany Hassan, Xiaodong He, Gang Li, Nicholas Ruiz, and Frank Seide, in International Workshop on Spoken Language Translation (IWSLT), December 2013

Social Text Normalization using Contextual Graph Random Walks. Hany Hassan and Arul Menezes, Association for Computational Linguistics, ACL 2013

Incremental Combinatory Categorial Grammar and Its Derivations, Ahmed Hefny, Hany Hassan, Mohamed Bahgat. CICLING 2011.

A Syntactified Direct Translation Model with Linear-time Decoding. Hany Hassan, Khalil Sima'an and Andy Way. EMNLP 2009

Lexicalized semi-incremental dependency parsing, Hany Hassan, Khalil Sima'an and Andy Way, In: RANLP-2009 - Recent Advances in Natural Language Processing 2009

A Syntactic Language Model based on Incremental CCG Parsing. Hany Hassan, Khalil Sima'an and Andy Way, In Proceedings IEEE Workshop on Spoken Language Technology (SLT) 2008, Goa, India

Exploiting Alignment Techniques in MATREX: the DCU Machine Translation System for IWSLT 2008, Yanjun Ma, John Tinsley, Hany Hassan, Jinhua Du, IWSLT 2008.

Language Independent Text Correction using Finite State Automata. Ahmed Hassan, Sara Noeman, and Hany Hassan, Proceedings of the 2008 International Joint Conference on Natural Language Processing (IJCNLP, 2008).

Improving Named Entity Translation by Exploiting Comparable and Parallel Corpora. Ahmed Hassan, Haytham Fahmy, and Hany Hassan. Proceedings of the 2007 Conference on Recent Advances in Natural Language Processing (RANLP, 2007), AMML Workshop.

MaTrEx: the DCU Machine Translation System for IWSLT 2007. Hassan, H., Y. Ma and A. Way. 2007. In Proceedings of the International Workshop on Spoken Language Translation, Trento, Italy

Supertagged Phrase-Based Statistical Machine Translation, Hany Hassan, Khalil Sima'an and Andy Way, ACL 2007, Prague

Arabic Cross-Document Person Name Normalization, Walid Magdy, Kareem Darwish, Ossama Emam and Hany Hassan, Semitic Languages workshop - ACL 2007, Prague

BioNoculars: Extracting Protein-Protein Interactions from Biomedical Text, Amgad Madkour, Kareem Darwish, Hany Hassan, Ahmed Hassan, Ossama Emam, BioNLP workshop - ACL 2007, Prague

Syntactic Phrase-Based Statistical Machine Translation, Hany Hassan, Mary Hearne, Andy Way and Khalil Sima'an. 2006. In Proceedings of the IEEE 2006 Workshop on Spoken Language Translation, Palm Beach, Aruba.

Unsupervised Information Extraction Approach Using Graph Mutual Reinforcement, Hany Hassan, Ahmed Hassan and Ossama Emam, EMNLP 2006

Graph Based Semi-Supervised Approach for Information Extraction, Hany Hassan, Ahmed Hassan and Sara Noeman, TextGraphs Workshop - HLT/NAACL 2006

An Integrated Approach for Arabic-English Named Entity Translation, Hany Hassan and Jeffrey Sorensen. ACL 2005, Proceedings of the ACL Workshop on Computational Approaches to Semitic Languages

Examining the Effect of Improved Context Sensitive Morphology on Arabic Information Retrieval, Kareem Darwish, Hany Hassan and Ossama Emam, ACL 2005, Proceedings of the ACL Workshop on Computational Approaches to Semitic Languages

A Statistical Model for Multilingual Entity Detection and Tracking. Radu Florian, Hany Hassan, Abraham Ittycheriah, Hongyan Jing, Xiaoqiang Luo, Nicolas Nicolov and Salim Roukos, HLT-NAACL 2004.

Language Model Based Arabic Word Segmentation. Young-Suk Lee, Kishore Papineni, Salim Roukos, Ossama Emam, Hany Hassan, ACL 2003: 399-406

TIPS: A Translingual Information Processing System. Yaser Al-Onaizan, Radu Florian, Martin Franz, Hany Hassan, Young-Suk Lee, J. Scott McCarley, Kishore Papineni, Salim Roukos, Jeffrey Sorensen, Christoph Tillmann, Todd Ward, Fei Xia, HLT-NAACL 2003

Honors and Awards

LRC Best Thesis Award, 2009

IBM Software Group Top Talent for Leadership 2008

IBM First Plateau Invention Achievement Award 2007, in appreciation and recognition of creative contribution to IBM progress.

IBM Watson Research Center Bravo Award 2006, for significant contribution in Information Extraction Research.

IBM Invention Achievement Award 2005, for First Patent Application.

Science Foundation of Ireland (SFI) 2005, PhD scholarship at Dublin City University.

IBM Technical Achievement Award 2001, For outstanding contribution in EMMS development.


Technical Services:


Guest Editor of Machine Translation Journal, special issue on Arabic Translation.

MT Track Co-Chair for NAACL; Program Committee Member of ACL, EMNLP, and HLT/NAACL for several years.
Reviewer for Machine Translation Journal, ACM Transactions

Served as an Executive Member of EAMT (European Association of Machine Translation)  

Contact Information:


email: hanyh at {my employer name}.com