Yuki Arase

Contact info.

Building 2, No. 5 Dan Ling Street, Haidian District, Beijing, P.R. China, 100080

Yuki Arase received her B.E. (2006), M.I.S. (2007), and Ph.D. in Information Science (2010) from Osaka University, Japan. She joined Microsoft Research Asia (MSRA) as an associate researcher in April 2010. During her Ph.D., she studied HCI on mobile devices, especially how to present a large Web page on a small screen. She also worked on Web data mining. At MSRA, she works on English/Japanese natural language processing. Her current research interests include English paraphrase detection and paraphrase acquisition using crowd-sourcing.

If you are interested in an intern position, please feel free to contact me.

  • Area: Paraphrase detection in English, paraphrase acquisition using crowd-sourcing 
  • Ph.D. or master's students

(Last update: 1/9/2014)

Selected Publications

* Full publication list is here 

  • Ding, C. and Arase, Y.: “Dependency Tree Abstraction for Long-Distance Reordering in Statistical Machine Translation,” Proc. of Conference of the European Chapter of the Association for Computational Linguistics (EACL 2014, to appear).
  • Arase, Y. and Zhou, M.: “Machine Translation Detection from Monolingual Web-Text,” Proc. of Annual Meeting of the Association for Computational Linguistics (ACL 2013), pp. 1597-1607 (Aug. 2013). *best paper award nomination
  • Sakaguchi, K., Arase, Y., and Komachi, M.: “Discriminative Approach to Fill-in-the-Blank Quiz Generation for Language Learners,” Proc. of Annual Meeting of the Association for Computational Linguistics (ACL 2013), pp. 238-242 (Aug. 2013).
  • Kato, R., Iwata, M., Hara, T., Suzuki, A., Arase, Y., Xie, X., and Nishio, S.: “A Dummy-based Anonymization Method based on User Trajectory with Pauses,” Proc. of International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL 2012), pp. 249-258 (Nov. 2012).
  • Arase, Y., Xie, X., Hara, T., and Nishio, S.: “Mining People's Trips from Large Scale Geo-tagged Photos,” Proc. of ACM International Conference on Multimedia (ACM MM 2010), pp. 133-142 (Oct. 2010).
  • Arase, Y., Xie, X., Duan, M., Hara, T., and Nishio, S.: “A Game based Approach to Assign Geographical Relevance to Web Images,” Proc. of International World Wide Web Conference (WWW 2009), pp. 811-820 (Apr. 2009).

Lectures / Talks

  • Invited talk: "Gradual Transition from Student to Researcher," at 北大情報系若手連携シンポジウム, Hokkaido University (Nov. 2013)
  • Lecture: "Machine Translation Detection from Monolingual Web-Text" at Colloquium, Department of Computer Science, Graduate School of Systems and Information Engineering, University of Tsukuba (Oct. 2013)
  • Lecture: "Exploiting Web Data for NLP Research: from Multilingual Text to Social Media" at Seminar I, Nara Institute of Science and Technology (NAIST) (Jan. 2013)
  • Panel discussion: 2011 NAIST Colloquium on Advances in Natural Language Processing (Sept. 2011)


Awards

  • Kasami Award
    Grad. Sch. of Information Science and Tech., Osaka University
  • Distinguished Young Researcher Award
    International Workshop with Mentors on Databases, Web and Information Management for Young Researchers (iDB Workshop 2009)
  • Best paper award 2009
    DICOMO 2009 (domestic conference)
  • Excellent paper award 2008
    DICOMO 2008 (domestic conference)
  • Best master student prize 2007
    Dept. of Multimedia Engineering, Grad. Sch. of Information Science and Tech., Osaka University

Social Activity

  • Program Committee member: ACL 2014, WWW 2014, LREC 2014, Workshop on Natural Language Processing for Medical and Healthcare Fields 2013, etc.
  • Journal Review: IEEE MultiMedia, International Journal of Neurocomputing, International Journal of Handheld Computing Research (IJHCR), IEICE Journal, etc.


  • Mitsuo Yoshida (University of Tsukuba)
  • ChenChen Ding (University of Tsukuba)
  • Keisuke Sakaguchi (Nara Institute of Science and Technology)
  • Kazeto Yamamoto (Tohoku University)
  • Atsushi Keyaki (Nara Institute of Science and Technology)


  • Pics 'n' Trails Dataset 
    A large collection of GPS data and digital photos captured by a single person over a period of more than a year. The GPS data were collected by continuously carrying a GPS receiver and recording the location coordinates whenever they could be estimated. The photos, 4179 in total, were taken during sightseeing and other events.