Can Jin

Ph.D. Candidate in Computer Science, Rutgers University

CBIM, Busch Campus
can.jin@rutgers.edu

I am a Ph.D. Candidate in Computer Science at Rutgers University, New Brunswick, advised by Professor Dimitris N. Metaxas. My research interests include Pre-training/Post-training/Inference of Foundation Models, with a focus on Efficiency and Reasoning/Coding Capabilities. Please feel free to reach out via email if you share similar interests and would like to collaborate.

I earned my M.S. and B.S. degrees in Mathematics from the University of Science and Technology of China (USTC). Before my doctoral studies, I worked as a Machine Learning Engineer at Meituan Dianping Corporation. Recently, as a Research Intern at Adobe Research, I focused on the efficient pre-training of large foundation models (LLMs/DiTs/omni-models) via Dense/MoE architectures.

I am actively seeking research internship opportunities for Summer 2027, focusing on Pre-training/Post-training/Inference of Foundation Models. You can find my CV here.

Research

My research is organized around three core stages of foundation model development: pre-training, post-training, and inference. Across these directions, I work on making large models more capable, efficient, and reliable for reasoning, coding, and alignment.

Pre-training
  • Focus: Investigating foundation model pre-training through efficient architecture design, generalizable training strategies, and scaling laws.
  1. DTop-p MoE: Sparsity-Controlled Dynamic Top-p MoE for Foundation Model Pre-training
    Can Jin*, Hongwu Peng*, Mingcan Xiang, Qixin Zhang, Xiangchi Yuan, Amit Hasan, Ohiremen Dibua, Yifan Gong, Yan Kang, and Dimitris N. Metaxas
    In Forty-third International Conference on Machine Learning, 2026
    Propose DTop-p MoE, a dynamic routing mechanism that utilizes a Proportional-Integral controller and dynamic routing normalization to precisely control expert activation sparsity while adapting to varying token difficulty. DTop-p outperforms Top-k and Top-p MoE across Large Language Models and Diffusion Transformers.
  2. Learning from Teaching Regularization: Generalizable Correlations Should be Easy to Imitate
    In Advances in Neural Information Processing Systems, 2024
    Develop Learning from Teaching (LoT), a novel regularization technique for deep neural networks that enhances model generalization by training a teacher model to prioritize features that are easier for a student model to imitate, thereby filtering out spurious correlations.
Post-training
  • Focus: Investigating post-training techniques such as reinforcement learning, on-policy distillation, supervised fine-tuning, prompt-based adaptation, and pruning for efficient reasoning, coding, alignment, and adaptation.
  1. DARE: Difficulty-Adaptive Reinforcement Learning with Co-Evolved Difficulty Estimation
    Yang Zhou*, Can Jin*, Zihan Dong, Zhepeng Wang, Yanting Yang, Shiyu Zhao, Lei Li, Runxue Bao, Yaochen Xie, and Dimitris N. Metaxas
    2026
    Introduce DARE, a difficulty-adaptive Reinforcement Learning framework that co-evolves policy-aligned difficulty estimation with dynamic data selection and difficulty-specific optimization, improving training efficiency, final accuracy, and inference-token efficiency for LLM reasoning.
  2. Weak Critics Make Strong Learners: On-Policy Critique Distillation for Scalable Oversight
    Can Jin, Jiakang Li, Rui Wu, Eddy Zhang, and Dimitris N. Metaxas
    In 3rd AI for Math Workshop: Toward Self-Evolving Scientific Agents, 2026
    Propose weak-critic strong oversight and on-policy critique distillation (OPCD), showing that weak models can guide stronger models through useful critiques while improving alignment and reasoning performance.
  3. Reasoning over Precedents Alongside Statutes: Case-Augmented Deliberative Alignment for LLM Safety
    Can Jin*, Rui Wu*, Tong Che*, Qixin Zhang, Hongwu Peng, Jiahui Zhao, Zhenting Wang, Wenqi Wei, Ligong Han, Zhao Zhang, Yuan Cao, Ruixiang Tang, and Dimitris N. Metaxas
    In The 64th Annual Meeting of the Association for Computational Linguistics, 2026
    Propose CADA, a case-augmented deliberative alignment framework that leverages reinforcement learning on self-generated reasoning chains to transition from rigid rule enforcement to flexible case-based reasoning, significantly reducing over-refusal while enhancing robustness against jailbreak attacks.
  4. LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation
    Can Jin, Ying Li, Mingyu Zhao, Shiyu Zhao, Zhenting Wang, Xiaoxiao He, Ligong Han, Tong Che, and Dimitris N. Metaxas
    In The Thirteenth International Conference on Learning Representations, 2025
    Design LoR-VP, a low-rank visual prompting technique for efficient vision model adaptation that reduces trainable parameters while outperforming full fine-tuning and standard visual prompting methods on object detection and segmentation benchmarks.
  5. Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective
    Can Jin*, Tianjin Huang*, Yihua Zhang, Mykola Pechenizkiy, Sijia Liu, Shiwei Liu, and Tianlong Chen
    In Proceedings of the AAAI Conference on Artificial Intelligence, 2025
    Propose VPNs, a novel data-model co-design framework that simultaneously optimizes visual prompts and network sparsity, significantly enhancing the performance and transferability of sparse vision models.
Inference
  • Focus: Investigating inference-time techniques such as test-time search, refinement/critiquing, prompt engineering, and multi-agent systems for improving reasoning, coding, retrieval, and agentic performance.
  1. Two Heads are Better Than One: Test-time Scaling of Multi-agent Collaborative Reasoning
    Can Jin, Hongwu Peng, Qixin Zhang, Yujin Tang, Tong Che, and Dimitris N. Metaxas
    In Workshop on Scaling Environments for Agents, 2025
    Develop MAS-TTS, a framework that integrates a specialized multi-agent training pipeline with an adaptive CEO agent to orchestrate collaborative reasoning, effectively optimizing test-time scaling for complex tasks.
  2. APEER: Automatic Prompt Engineering Enhances Large Language Model Reranking (Best Paper Award @ RelWeb)
    In Companion Proceedings of the ACM Web Conference 2025, Sydney, NSW, Australia, 2025
    Propose APEER, a novel automatic prompt engineering algorithm that iteratively generates and refines prompts to enhance the performance and transferability of Large Language Models in information retrieval reranking tasks.
  3. RankFlow: A Multi-Role Collaborative Reranking Workflow Utilizing Large Language Models
    Can Jin*, Hongwu Peng*, Anxiang Zhang, Nuo Chen, Jiahui Zhao, Xi Xie, Kuangzheng Li, Shuya Feng, Kai Zhong, Caiwen Ding, and Dimitris N. Metaxas
    In Companion Proceedings of the ACM Web Conference 2025, Sydney, NSW, Australia, 2025
    Design RankFlow, an LLM-driven reranking framework that utilizes multi-role collaboration to enhance retrieval accuracy, demonstrating superior performance over existing baselines in extensive empirical studies.
  4. Your reward function for RL is your best PRM for search: Unifying RL and search-based TTS
    arXiv preprint arXiv:2508.14313, 2025
    Introduce AIRL-S, which unifies Reinforcement Learning and search-based Test-Time Scaling, demonstrating that RL reward functions can serve as optimal Process Reward Models for guiding search in complex reasoning tasks.

Academic Services

Teaching Assistant
  • Rutgers University: CS344: Algorithms (Spring 2026), CS211: Computer Architecture (Fall 2025), CS534: Computer Vision (Spring 2025), CS210: Data Management for Data Science (Fall 2024)
Peer Review
  • Conference: NeurIPS 25/26, ICLR 25/26, ICML 24/26, CVPR 25/26, ECCV 26, AAAI 26, etc.
  • Journal: Alexandria Engineering Journal, Information Fusion, Pattern Recognition, Signal Processing