Can Jin

CBIM, Busch Campus
can.jin@rutgers.edu

I am a Ph.D. candidate in Computer Science at Rutgers University, New Brunswick, advised by Professor Dimitris N. Metaxas. My research interests include the pre-training, post-training, and inference of foundation models, with a focus on efficiency and reasoning/coding capabilities. Please feel free to reach out via email if you share similar interests and would like to collaborate.

I earned my M.S. and B.S. degrees in Mathematics from the University of Science and Technology of China (USTC). Before my doctoral studies, I worked as a Machine Learning Engineer at Meituan Dianping Corporation. Recently, as a Research Intern at Adobe Research, I focused on the efficient pre-training of large foundation models (LLMs/DiTs/omni-models) via Dense/MoE architectures.

I am actively seeking research internship opportunities for Summer 2027, focusing on Pre-training/Post-training/Inference of Foundation Models. You can find my CV here.

Research

My research is organized around three core stages of foundation model development: pre-training, post-training, and inference. Across these directions, I work on making large models more capable, efficient, and reliable for tasks such as reasoning, coding, and alignment.

Pre-training

Focus: Investigating foundation model pre-training through efficient architecture design, generalizable training strategies, and scaling laws.

ICML
DTop-p MoE: Sparsity-Controlled Dynamic Top-p MoE for Foundation Model Pre-training

Can Jin^*, Hongwu Peng^*, Mingcan Xiang, Qixin Zhang, Xiangchi Yuan, Amit Hasan, Ohiremen Dibua, Yifan Gong, Yan Kang^†, and Dimitris N. Metaxas^†

In Forty-third International Conference on Machine Learning, 2026

Propose DTop-p MoE, a dynamic routing mechanism that utilizes a Proportional-Integral controller and dynamic routing normalization to precisely control expert activation sparsity while adapting to varying token difficulty. DTop-p outperforms Top-k and Top-p MoE across Large Language Models and Diffusion Transformers.

@inproceedings{jin2025dtopp,
  title = {DTop-p MoE: Sparsity-Controlled Dynamic Top-p MoE for Foundation Model Pre-training},
  author = {Jin, Can and Peng, Hongwu and Xiang, Mingcan and Zhang, Qixin and Yuan, Xiangchi and Hasan, Amit and Dibua, Ohiremen and Gong, Yifan and Kang, Yan and Metaxas, Dimitris N.},
  booktitle = {Forty-third International Conference on Machine Learning},
  year = {2026},
}

NeurIPS

Learning from Teaching Regularization: Generalizable Correlations Should be Easy to Imitate

Can Jin^*, Tong Che^*, Hongwu Peng^†, Yiyuan Li^†, Dimitris Metaxas^‡, and Marco Pavone^‡

In Advances in Neural Information Processing Systems, 2024

Develop Learning from Teaching (LoT), a novel regularization technique for deep neural networks that enhances model generalization by training a teacher model to prioritize features that are easier for a student model to imitate, thereby filtering out spurious correlations.


@inproceedings{jin2024lot,
  author = {Jin, Can and Che, Tong and Peng, Hongwu and Li, Yiyuan and Metaxas, Dimitris and Pavone, Marco},
  booktitle = {Advances in Neural Information Processing Systems},
  editor = {Globerson, A. and Mackey, L. and Belgrave, D. and Fan, A. and Paquet, U. and Tomczak, J. and Zhang, C.},
  pages = {966--994},
  publisher = {Curran Associates, Inc.},
  title = {Learning from Teaching Regularization: Generalizable Correlations Should be Easy to Imitate},
  url = {https://proceedings.neurips.cc/paper_files/paper/2024/file/01ce1ae7f94d139e4917f9e4425a4f38-Paper-Conference.pdf},
  volume = {37},
  year = {2024},
}

Post-training

Focus: Investigating post-training techniques such as reinforcement learning, on-policy distillation, supervised fine-tuning, prompt-based adaptation, and pruning for efficient reasoning, coding, alignment, and adaptation.

arXiv
DARE: Difficulty-Adaptive Reinforcement Learning with Co-Evolved Difficulty Estimation

Yang Zhou^*, Can Jin^*, Zihan Dong, Zhepeng Wang, Yanting Yang, Shiyu Zhao, Lei Li, Runxue Bao, Yaochen Xie, and Dimitris N. Metaxas

2026

Introduce DARE, a difficulty-adaptive Reinforcement Learning framework that co-evolves policy-aligned difficulty estimation with dynamic data selection and difficulty-specific optimization, improving training efficiency, final accuracy, and inference-token efficiency for LLM reasoning.

@misc{zhou2026dare,
  title = {DARE: Difficulty-Adaptive Reinforcement Learning with Co-Evolved Difficulty Estimation},
  author = {Zhou, Yang and Jin, Can and Dong, Zihan and Wang, Zhepeng and Yang, Yanting and Zhao, Shiyu and Li, Lei and Bao, Runxue and Xie, Yaochen and Metaxas, Dimitris N.},
  year = {2026},
  journal = {arXiv preprint arXiv:2605.09188},
}

ICML-AI4Math
Weak Critics Make Strong Learners: On-Policy Critique Distillation for Scalable Oversight

Can Jin, Jiakang Li, Rui Wu, Eddy Zhang, and Dimitris N. Metaxas

In 3rd AI for Math Workshop: Toward Self-Evolving Scientific Agents, 2026

Propose weak-critic strong oversight and on-policy critique distillation (OPCD), showing that weak models can guide stronger models through useful critiques while improving alignment and reasoning performance.

@inproceedings{jin2026weak,
  title = {Weak Critics Make Strong Learners: On-Policy Critique Distillation for Scalable Oversight},
  author = {Jin, Can and Li, Jiakang and Wu, Rui and Zhang, Eddy and Metaxas, Dimitris N.},
  booktitle = {3rd AI for Math Workshop: Toward Self-Evolving Scientific Agents},
  year = {2026},
  url = {https://openreview.net/forum?id=oEfedgUChS},
}

ACL
Reasoning over Precedents Alongside Statutes: Case-Augmented Deliberative Alignment for LLM Safety

Can Jin^*, Rui Wu^*, Tong Che^*, Qixin Zhang, Hongwu Peng, Jiahui Zhao, Zhenting Wang, Wenqi Wei, Ligong Han, Zhao Zhang, Yuan Cao, Ruixiang Tang^†, and Dimitris N. Metaxas^†

In The 64th Annual Meeting of the Association for Computational Linguistics, 2026

Propose CADA, a case-augmented deliberative alignment framework that leverages reinforcement learning on self-generated reasoning chains to transition from rigid rule enforcement to flexible case-based reasoning, significantly reducing over-refusal while enhancing robustness against jailbreak attacks.

@inproceedings{jin2026reasoning,
  title = {Reasoning over Precedents Alongside Statutes: Case-Augmented Deliberative Alignment for LLM Safety},
  author = {Jin, Can and Wu, Rui and Che, Tong and Zhang, Qixin and Peng, Hongwu and Zhao, Jiahui and Wang, Zhenting and Wei, Wenqi and Han, Ligong and Zhang, Zhao and Cao, Yuan and Tang, Ruixiang and Metaxas, Dimitris N.},
  booktitle = {The 64th Annual Meeting of the Association for Computational Linguistics},
  year = {2026},
}

ICLR
LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation

Can Jin, Ying Li, Mingyu Zhao, Shiyu Zhao, Zhenting Wang, Xiaoxiao He, Ligong Han, Tong Che, and Dimitris N. Metaxas

In The Thirteenth International Conference on Learning Representations, 2025

Design LoR-VP, a low-rank visual prompting technique for efficient vision model adaptation that reduces trainable parameters while outperforming full fine-tuning and standard visual prompting methods on object detection and segmentation benchmarks.

@inproceedings{jin2025lorvp,
  title = {LoR-{VP}: Low-Rank Visual Prompting for Efficient Vision Model Adaptation},
  author = {Jin, Can and Li, Ying and Zhao, Mingyu and Zhao, Shiyu and Wang, Zhenting and He, Xiaoxiao and Han, Ligong and Che, Tong and Metaxas, Dimitris N.},
  booktitle = {The Thirteenth International Conference on Learning Representations},
  year = {2025},
  url = {https://openreview.net/forum?id=5btFIv2PNb},
}

AAAI

Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective

Can Jin^*, Tianjin Huang^*, Yihua Zhang, Mykola Pechenizkiy, Sijia Liu, Shiwei Liu, and Tianlong Chen

In Proceedings of the AAAI Conference on Artificial Intelligence, 2025

Propose VPNs, a novel data-model co-design framework that simultaneously optimizes visual prompts and network sparsity, significantly enhancing the performance and transferability of sparse vision models.


@inproceedings{jin2025visual,
  title = {Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective},
  author = {Jin, Can and Huang, Tianjin and Zhang, Yihua and Pechenizkiy, Mykola and Liu, Sijia and Liu, Shiwei and Chen, Tianlong},
  booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
  volume = {39},
  number = {4},
  pages = {4111--4119},
  year = {2025},
  url = {https://arxiv.org/pdf/2312.01397},
}

Inference

Focus: Investigating inference-time techniques such as test-time search, refinement/critiquing, prompt engineering, and multi-agent systems for improving reasoning, coding, retrieval, and agentic performance.

NeurIPS-SEA
Two Heads are Better Than One: Test-time Scaling of Multi-agent Collaborative Reasoning

Can Jin, Hongwu Peng, Qixin Zhang, Yujin Tang, Tong Che, and Dimitris N. Metaxas

In Workshop on Scaling Environments for Agents, 2025

Develop MAS-TTS, a framework that integrates a specialized multi-agent training pipeline with an adaptive CEO agent to orchestrate collaborative reasoning, effectively optimizing test-time scaling for complex tasks.

@inproceedings{jin2025two,
  title = {Two Heads are Better Than One: Test-time Scaling of Multi-agent Collaborative Reasoning},
  author = {Jin, Can and Peng, Hongwu and Zhang, Qixin and Tang, Yujin and Che, Tong and Metaxas, Dimitris N.},
  booktitle = {Workshop on Scaling Environments for Agents},
  year = {2025},
  url = {https://openreview.net/forum?id=aLGgp4FK0A},
}

WWW-RelWeb

APEER: Automatic Prompt Engineering Enhances Large Language Model Reranking (Best Paper Award @ RelWeb)

Can Jin^*, Hongwu Peng^*, Shiyu Zhao, Zhenting Wang, Wujiang Xu, Ligong Han, Jiahui Zhao, Kai Zhong, Sanguthevar Rajasekaran, and Dimitris N. Metaxas

In Companion Proceedings of the ACM Web Conference 2025, Sydney, NSW, Australia, 2025

Propose APEER, a novel automatic prompt engineering algorithm that iteratively generates and refines prompts to enhance the performance and transferability of Large Language Models in information retrieval reranking tasks.


@inproceedings{jin2025apeer,
  title = {APEER: Automatic Prompt Engineering Enhances Large Language Model Reranking},
  note = {Best Paper Award @ RelWeb},
  author = {Jin, Can and Peng, Hongwu and Zhao, Shiyu and Wang, Zhenting and Xu, Wujiang and Han, Ligong and Zhao, Jiahui and Zhong, Kai and Rajasekaran, Sanguthevar and Metaxas, Dimitris N},
  booktitle = {Companion Proceedings of the ACM Web Conference 2025},
  location = {Sydney, NSW, Australia},
  series = {WWW '25},
  year = {2025},
  doi = {10.1145/3701716.3717574},
  isbn = {979-8-4007-1331-6},
  url = {https://arxiv.org/abs/2406.14449},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  keywords = {Prompt engineering, Information Retrieval, Large Language Model, ReRanking},
}

WWW-RelWeb

RankFlow: A Multi-Role Collaborative Reranking Workflow Utilizing Large Language Models

Can Jin^*, Hongwu Peng^*, Anxiang Zhang, Nuo Chen, Jiahui Zhao, Xi Xie, Kuangzheng Li, Shuya Feng, Kai Zhong, Caiwen Ding, and Dimitris N. Metaxas

In Companion Proceedings of the ACM Web Conference 2025, Sydney, NSW, Australia, 2025

Design RankFlow, an LLM-driven reranking framework that utilizes multi-role collaboration to enhance retrieval accuracy, demonstrating superior performance over existing baselines in extensive empirical studies.


@inproceedings{jin2025rankflow,
  title = {RankFlow: A Multi-Role Collaborative Reranking Workflow Utilizing Large Language Models},
  author = {Jin, Can and Peng, Hongwu and Zhang, Anxiang and Chen, Nuo and Zhao, Jiahui and Xie, Xi and Li, Kuangzheng and Feng, Shuya and Zhong, Kai and Ding, Caiwen and Metaxas, Dimitris N},
  booktitle = {Companion Proceedings of the ACM Web Conference 2025},
  location = {Sydney, NSW, Australia},
  series = {WWW '25},
  year = {2025},
  doi = {10.1145/3701716.3717575},
  isbn = {979-8-4007-1331-6},
  url = {https://arxiv.org/abs/2502.00709},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  keywords = {Information Retrieval, Large Language Model, ReRanking},
}

arXiv
Your reward function for RL is your best PRM for search: Unifying RL and search-based TTS

Can Jin, Yang Zhou, Qixin Zhang, Hongwu Peng, Di Zhang, Marco Pavone, Ligong Han, ZhangWei Hong, Tong Che^†, and Dimitris N. Metaxas^†

arXiv preprint arXiv:2508.14313, 2025

Introduce AIRL-S, a framework that unifies Reinforcement Learning and search-based Test-Time Scaling, demonstrating that RL reward functions can serve as optimal Process Reward Models for guiding search in complex reasoning tasks.

@article{jin2025your,
  title = {Your reward function for RL is your best PRM for search: Unifying RL and search-based TTS},
  author = {Jin, Can and Zhou, Yang and Zhang, Qixin and Peng, Hongwu and Zhang, Di and Pavone, Marco and Han, Ligong and Hong, ZhangWei and Che, Tong and Metaxas, Dimitris N},
  journal = {arXiv preprint arXiv:2508.14313},
  year = {2025},
}

Academic Services

Teaching Assistant

Rutgers University: CS344: Algorithms (Spring 2026), CS211: Computer Architecture (Fall 2025), CS534: Computer Vision (Spring 2025), CS210: Data Management for Data Science (Fall 2024)

Peer Review

Conference: NeurIPS 25/26, ICLR 25/26, ICML 24/26, CVPR 25/26, ECCV 26, AAAI 26, etc.
Journal: Alexandria Engineering Journal, Information Fusion, Pattern Recognition, Signal Processing