Junjie Hu | 胡俊杰


Assistant Professor
Biostatistics & Medical Informatics
Computer Science
Data Science Institute
University of Wisconsin-Madison
Office: 4735 MSC, 420 North Charter Street, Madison, WI
Office phone: +1-608-265-6118
Email: junjie.hu@wisc.edu
[Research Statement]

About

I am an assistant professor with appointments in the Department of Biostatistics & Medical Informatics, the Department of Computer Science, and the Data Science Institute at the University of Wisconsin-Madison. I obtained my Ph.D. from the School of Computer Science at Carnegie Mellon University, where I worked with Jaime Carbonell and Graham Neubig.

I have a broad interest in natural language processing and machine learning. In particular, I work on multilingual NLP, transfer learning, multimodal learning, and their applications to human-machine communication. My research goal is to build robust intelligent systems that evolve with changes in the environment and interact with people speaking different languages.

Prospective students: Thanks for your interest! I am always looking for excellent Ph.D. students to join our lab. Please apply to the CS or BDS program, and mention my name in your application and research statement. UW-Madison is an excellent place for research, and Madison is a wonderful city to live in. Please check out these videos (Why UW-Madison, Madison). I’m also happy to work with master's or undergraduate students at UW-Madison. If you are interested, please send me an email.

Research Group

I am fortunate to work with a group of excellent students at UW-Madison. Stay tuned for our latest work!


Publications

2023

  1. ACL
    Single Sequence Prediction over Reasoning Graphs for Multi-hop QA Gowtham Ramesh, Makesh Narsimhan Sreedhar, and Junjie Hu In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics 2023
  2. ACL
    Is Fine-tuning Needed? Pre-trained Language Models Are Near Perfect for Out-of-Domain Detection Rheeya Uppaal, Junjie Hu, and Yixuan Li In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics 2023
  3. ACL
    Local Byte Fusion for Neural Machine Translation Makesh Narsimhan Sreedhar, Xiangpeng Wan, Yu Cheng, and Junjie Hu In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics 2023
  4. ACL
    Multimodal Prompt Retrieval for Generative Visual Question Answering Timothy Ossowski and Junjie Hu In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL Findings) 2023

2022

  1. EMNLP
    Beyond Counting Datasets: Investigating Multilingual Dataset Construction and Necessary Resources Xinyan Yu, Trina Chatterjee, Akari Asai, Junjie Hu, and Eunsol Choi In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP Findings) 2022
  2. EMNLP
    Utilizing Language-Image Pretraining for Efficient and Robust Bilingual Word Alignment Tuan Dinh, Jy-yong Sohn, Shashank Rajput, Timothy Ossowski, Yifei Ming, Junjie Hu, Dimitris Papailiopoulos, and Kangwook Lee In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP Findings) 2022
  3. IEEE TPAMI
    Video Pivoting Unsupervised Multi-modal Neural Machine Translation Mingjie Li, Po-Yao Huang, Xiaojun Chang, Junjie Hu, Yi Yang, and Alex Hauptmann IEEE Transactions on Pattern Analysis and Machine Intelligence (To Appear) 2022
  4. ACL
    DEEP: DEnoising Entity Pre-training for Neural Machine Translation Junjie Hu, Hiroaki Hayashi, Kyunghyun Cho, and Graham Neubig In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 2022
  5. ACL
    GlobalWoZ: Globalizing MultiWoZ to Develop Multilingual Task-Oriented Dialogue Systems Bosheng Ding, Junjie Hu, Lidong Bing, Sharifah Aljunied Mahani, Shafiq R. Joty, Luo Si, and Chunyan Miao In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 2022

2021

  1. WMT
    Phrase-level Active Learning for Neural Machine Translation Junjie Hu and Graham Neubig In The Sixth Conference on Machine Translation (WMT) 2021 [Abs] [Code]
  2. EMNLP
    AfroMT: Pretraining Strategies and Reproducible Benchmarks for Translation of 8 African Languages Machel Reid, Junjie Hu, Graham Neubig, and Yutaka Matsuo In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP) 2021 [Abs] [Code]
  3. EMNLP
    XTREME-R: Towards More Challenging and Nuanced Multilingual Evaluation Sebastian Ruder, Noah Constant, Jan Botha, Aditya Siddhant, Orhan Firat, Jinlan Fu, Pengfei Liu, Junjie Hu, Dan Garrette, Graham Neubig, and Melvin Johnson In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP) 2021 [Abs] [Code]
  4. NAACL
    Explicit Alignment Objectives for Multilingual Bidirectional Encoders Junjie Hu, Melvin Johnson, Orhan Firat, Aditya Siddhant, and Graham Neubig In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics 2021 [Abs] [Code]
  5. NAACL
    Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models Po-Yao Huang, Mandela Patrick, Junjie Hu, Graham Neubig, Florian Metze, and Alexander Hauptmann In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics 2021 [Abs] [Code]

2020

  1. ICML
    XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalisation Junjie Hu, Sebastian Ruder, Aditya Siddhant, Graham Neubig, Orhan Firat, and Melvin Johnson In International Conference on Machine Learning (ICML) 2020 [Abs] [Code]
  2. ICML
    On Learning Language-Invariant Representations for Universal Machine Translation Han Zhao, Junjie Hu, and Andrej Risteski In International Conference on Machine Learning (ICML) 2020 [Abs]
  3. ACL
    Unsupervised Multimodal Neural Machine Translation with Pseudo Visual Pivoting Po-Yao Huang, Junjie Hu, Xiaojun Chang, and Alexander Hauptmann In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020 [Abs]
  4. Workshop
    TICO-19: the Translation Initiative for COvid-19 Antonios Anastasopoulos, Alessandro Cattelan, Zi-Yi Dou, Marcello Federico, Christian Federmann, Dmitriy Genzel, Francisco Guzmán, Junjie Hu, Macduff Hughes, Philipp Koehn, Rosie Lazar, Will Lewis, Graham Neubig, Mengmeng Niu, Alp Öktem, Eric Paquin, Grace Tang, and Sylwia Tur In Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020 [Abs]
  5. AAAI
    What Makes A Good Story? Designing Composite Rewards for Visual Storytelling Junjie Hu, Yu Cheng, Zhe Gan, Jingjing Liu, Jianfeng Gao, and Graham Neubig In Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI) 2020 [Code]

2019

  1. ACL
    Domain Adaptation of Neural Machine Translation by Lexicon Induction Junjie Hu, Mengzhou Xia, Graham Neubig, and Jaime Carbonell In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019 [Abs] [Code]
  2. CIKM
    A hybrid retrieval-generation neural conversation model Liu Yang, Junjie Hu, Minghui Qiu, Chen Qu, Jianfeng Gao, W Bruce Croft, Xiaodong Liu, Yelong Shen, and Jingjing Liu In Proceedings of the 28th ACM International Conference on Information and Knowledge Management 2019 [Code]
  3. EMNLP
    REO-Relevance, Extraness, Omission: A Fine-grained Evaluation for Image Captioning Ming Jiang, Junjie Hu, Qiuyuan Huang, Lei Zhang, Jana Diesner, and Jianfeng Gao In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) 2019 [Abs]
  4. EMNLP
    Handling Syntactic Divergence in Low-resource Machine Translation Chunting Zhou, Xuezhe Ma, Junjie Hu, and Graham Neubig In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) 2019 [Abs]
  5. EMNLP
    Unsupervised Domain Adaptation for Neural Machine Translation with Domain-Aware Feature Embeddings Zi-Yi Dou, Junjie Hu, Antonios Anastasopoulos, and Graham Neubig In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) 2019 [Abs]
  6. WNGT
    Domain Differential Adaptation for Neural Machine Translation Zi-Yi Dou, Xinyi Wang, Junjie Hu, and Graham Neubig In Proceedings of the 3rd Workshop on Neural Generation and Translation 2019 [Abs]
  7. NAACL
    compare-mt: A Tool for Holistic Comparison of Language Generation Systems Graham Neubig, Zi-Yi Dou, Junjie Hu, Paul Michel, Danish Pruthi, and Xinyi Wang In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations) 2019 [Abs] [Code] [Best Demo Nomination]

2018

  1. EMNLP
    Rapid Adaptation of Neural Machine Translation to New Languages Graham Neubig and Junjie Hu In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018 [Abs] [Code]
  2. ACL
    Automatic Estimation of Simultaneous Interpreter Performance Craig Stewart, Nikolai Vogler, Junjie Hu, Jordan Boyd-Graber, and Graham Neubig In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) 2018 [Abs]
  3. WMT
    Contextual Encoding for Translation Quality Estimation Junjie Hu, Wei-Cheng Chang, Yuexin Wu, and Graham Neubig In Proceedings of the Third Conference on Machine Translation: Shared Task Papers 2018 [Abs] [Code]

2017

  1. EMNLP
    Structural Embedding of Syntactic Trees for Machine Comprehension Rui Liu, Junjie Hu, Wei Wei, Zi Yang, and Eric Nyberg In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2017 [Abs]
  2. ACL
    Semi-Supervised QA with Generative Domain-Adaptive Nets Zhilin Yang, Junjie Hu, Ruslan Salakhutdinov, and William Cohen In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2017 [Abs]
  3. AAAI
    Answer-aware attention on grounded question answering in images Junjie Hu, Desai Fan, Shuxin Yao, and Jean Oh In AAAI 2017 Fall Symposium on Natural Communication for Human-Robot Collaboration 2017
  4. IEEE TNNLS
    Online nonlinear AUC maximization for imbalanced data sets Junjie Hu, Haiqin Yang, Michael R Lyu, Irwin King, and Anthony Man-Cho So IEEE Transactions on Neural Networks and Learning Systems 2017 [Abs]

2016

  1. HCOMP
    Learning Lexical Entries for Robotic Commands via Paraphrasing Junjie Hu, Jean Oh, and Anatole Gershman In AAAI Conference on Human Computation 2016 [Abs]
  2. ICLR
    Words or Characters? Fine-grained Gating for Reading Comprehension Zhilin Yang, Bhuwan Dhingra, Ye Yuan, Junjie Hu, William W. Cohen, and Ruslan Salakhutdinov In International Conference on Learning Representations 2016 [Abs]

2015

  1. IEEE Cybern.
    Diversified Sensitivity-Based Undersampling for Imbalance Classification Problems Wing W. Y. Ng, Junjie Hu, Daniel S. Yeung, Shaohua Yin, and Fabio Roli IEEE Transactions on Cybernetics 2015 [Abs]
  2. AAAI
    Kernelized Online Imbalanced Learning with Fixed Budgets Junjie Hu, Haiqin Yang, Irwin King, Michael Lyu, and Anthony Man-Cho So In Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI) 2015 [Abs]
  3. SOSE
    Ar-tracker: Track the dynamics of mobile apps via user review mining Cuiyun Gao, Hui Xu, Junjie Hu, and Yangfan Zhou In 2015 IEEE Symposium on Service-Oriented System Engineering 2015 [Abs]

Preprints

2022

  1. arXiv
    Local Byte Fusion for Neural Machine Translation Makesh Narsimhan Sreedhar, Xiangpeng Wan, Yu Cheng, and Junjie Hu arXiv preprint arXiv:2205.11490 2022

2019

  1. arXiv
    The ARIEL-CMU systems for LoReHLT18 Aditi Chaudhary, Siddharth Dalmia, Junjie Hu, Xinjian Li, Austin Matthews, Aldrian Obaja Muis, Naoki Otani, Shruti Rijhwani, Zaid Sheikh, Nidhi Vyas, and others arXiv preprint arXiv:1902.08899 2019

2017

  1. arXiv
    Principled hybrids of generative and discriminative domain adaptation Han Zhao, Zhenyao Zhu, Junjie Hu, Adam Coates, and Geoff Gordon arXiv preprint arXiv:1705.09011 2017

Talks

  • Invited Talk at University of Cambridge, LTL Seminar, June 09, 2022.

  • Invited Talk at Linguistics Fridays Seminar at UW-Madison, April 01, 2022.

  • Invited Talk at Microsoft Azure Cognitive Services Research, January 20, 2022.

  • Invited Talk at Bay Area NLP Seminar, November 18, 2021.

  • Invited Talk at ICTR Seminar at UW-Madison, October 26, 2021.

  • Invited Talk at Microsoft Research Summit, October 21, 2021.

  • Invited Talk at CIBM Seminar at UW-Madison, October 19, 2021.

  • Invited Talk at IFDS Ideas Forum at UW-Madison, October 11, 2021.

  • XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization, Junjie Hu, LTI Summer Seminar Series at Carnegie Mellon University, Pittsburgh, July 2, 2020.

  • Pre-training of Multilingual Encoder for Crosslingual Transfer, Junjie Hu, Google Translate Team, Mountain View, August 20, 2019.

  • Cross-Lingual and Cross-Domain Transfer for Neural Machine Translation, Junjie Hu, AI Seminar at Carnegie Mellon University, Pittsburgh, April 30, 2019.

  • Transfer Learning for Multilingual Neural Machine Translation, Junjie Hu, SMART-Select Workshop on Multilingual Models and Unsupervised NMT supported by DG Connect of the European Commission, Luxembourg, June 20, 2019; also given at Facebook AI Research Lab, Paris, June 21, 2019.

  • Rethinking Visual Storytelling: What Makes A Good Story? Junjie Hu, Microsoft 365 AI Research, Redmond, August 23, 2018.

  • Machine Reading Comprehension via Structural Tree Embeddings, Junjie Hu, Seminar at Chinese University of Hong Kong, March 5, 2018.

  • Lorelei: Understanding Low Resource Languages, Pat Littell, Junjie Hu, Shruti Rijhwani, and Ruochen Xu. LTI Colloquium at Carnegie Mellon University, Pittsburgh, September 8, 2017.

  • Natural Communication for Human-Robot Collaboration, Junjie Hu, Symposium on Natural Communication for Human-Robot Collaboration, November 9, 2017.

Selected Awards and Scholarships

  • CMU Graduate Student Assembly Dissertation Writing Group Grant, 2020

  • CMU Graduate Student Assembly Conference Travel Grant, 2020

  • NAACL 2019 Best Demonstration Paper Nomination, 2019

  • Graduate Research Scholarship, Carnegie Mellon University, 2015-2021

  • Postgraduate Scholarship, The Chinese University of Hong Kong, 2013-2015

  • Certificate of Merit for Teaching Assistantship, Department of CSE, Chinese University of Hong Kong, 2013-2014

  • IBM Outstanding Student Scholarship (1 of 77 winners in China), 2012-2013

  • Outstanding Undergraduate Awards by China Computer Federation (99 winners), 2012-2013

  • National Scholarship, the Ministry of Education, 2010-2011, 2011-2012