Projects
These selected projects highlight the work most relevant to AI/ML research internships, applied-scientist roles, and research engineering positions.
LLM Agents, Agentic Control, and Safety
Problem: Long-horizon language agents need to decide when to ask, delegate, verify, act, or escalate, while avoiding unsafe or unauthorized actions.
What I built: Learning and evaluation methods for agentic control, plus a benchmark and guard framework for tool-using coding agents. The framework checks route, provenance, and capability before high-risk actions.
Impact: Improved audited-seed success from 6.2% to 37.8%, utility from -0.237 to 1.051, and decision regret from 0.323 to 0.109. In the strongest safe-family evaluations, the agent reached 6/6 task success with zero unauthorized effects and zero route misfires.
Keywords: LLM agents, agentic AI, safety, tool use, coding agents, evaluation, counterfactual action values.
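To make the guard idea concrete, here is a minimal sketch of a pre-action check over route, provenance, and capability. All names (ActionRequest, the route and capability tables) are illustrative assumptions, not the framework's actual API.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the pre-action guard described above: before a
# high-risk tool call executes, verify that the request arrived via an
# approved route, that its instruction provenance is trusted, and that
# the agent holds the capability the tool requires.

@dataclass
class ActionRequest:
    tool: str              # e.g. "shell.exec"
    route: str             # how the request reached the agent
    provenance: str        # origin of the instruction
    capabilities: set = field(default_factory=set)

ALLOWED_ROUTES = {"user_turn", "planner"}
TRUSTED_PROVENANCE = {"user", "system_prompt"}
REQUIRED_CAPABILITY = {"shell.exec": "fs_write", "http.post": "network"}

def guard(req: ActionRequest) -> bool:
    """Allow the action only if every check passes."""
    if req.route not in ALLOWED_ROUTES:
        return False                      # route misfire: block
    if req.provenance not in TRUSTED_PROVENANCE:
        return False                      # e.g. injected web-page instruction
    needed = REQUIRED_CAPABILITY.get(req.tool)
    if needed is not None and needed not in req.capabilities:
        return False                      # missing capability
    return True

ok = guard(ActionRequest("shell.exec", "user_turn", "user", {"fs_write"}))
blocked = guard(ActionRequest("shell.exec", "tool_output", "web_page", set()))
```

The key design point is that all three checks must pass independently, so a prompt-injected instruction fails on provenance even if it arrives over an otherwise valid route.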
Data-Free Foundation Model Adaptation
Problem: Model merging and adaptation methods often need task data, training, test-time tuning, or recipe-specific assumptions.
What I built: A data-free performance enhancer for model merging that can be applied on top of broadly used task-vector-style merging methods rather than a single merge recipe.
Impact: Improved average benchmark performance to 86.1 across 7 NLP tasks and 8 vision tasks without training, task data, or test-time tuning.
Keywords: foundation models, model merging, data-free adaptation, evaluation, NLP, vision.
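For background, the task-vector-style merging the enhancer layers on can be sketched as follows. The enhancer itself is not shown; this is only the standard base recipe it post-processes, with toy 1-D "weights" and illustrative names.

```python
import numpy as np

# Minimal sketch of task-vector-style merging: each fine-tuned model
# contributes a task vector (its weight delta from the shared base), and
# the merge adds a scaled sum of those deltas back to the base weights.
# The data-free enhancer described above would be applied on top of a
# merge like this one.

def task_vector(finetuned: np.ndarray, base: np.ndarray) -> np.ndarray:
    return finetuned - base

def merge(base: np.ndarray, finetuned_models: list, lam: float = 0.5) -> np.ndarray:
    deltas = [task_vector(ft, base) for ft in finetuned_models]
    return base + lam * np.sum(deltas, axis=0)

base = np.array([1.0, 1.0, 1.0])
ft_a = np.array([2.0, 1.0, 1.0])   # fine-tuned on "task A"
ft_b = np.array([1.0, 3.0, 1.0])   # fine-tuned on "task B"
merged = merge(base, [ft_a, ft_b], lam=0.5)
# merged == [1.5, 2.0, 1.0]
```

Because the merge uses only model weights, any enhancement applied on top of it can likewise remain data-free.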
Structure-Aware Decoding and Large-Scale Search
Problem: Autoregressive sequence models can produce repeated outputs and hidden structural collapse in constrained generation settings.
What I built: Structure-aware decoding and search methods, plus a distributed asynchronous generation and evaluation framework with token-level value prediction and human-in-the-loop steering.
Impact: Scaled candidate generation to 1B+ samples in 6 days on 8 V100 GPUs, enabling broader exploration over large structured candidate spaces.
Keywords: sequence modeling, decoding, search, autoregressive models, distributed evaluation, GPU pipelines.
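One concrete form of structure-aware decoding is blocking tokens that would repeat an n-gram already generated, which counters the repetition collapse mentioned above. This is a toy sketch with a stub "model" that always prefers the same token; the function names are hypothetical.

```python
import numpy as np

# Greedy decoding with no-repeat n-gram masking: any token that would
# complete an n-gram already present in the output is banned at that
# step, forcing the decoder to diversify instead of looping.

def banned_tokens(seq, n=3):
    """Tokens that would repeat a seen n-gram after the last n-1 tokens."""
    banned = set()
    if len(seq) < n - 1:
        return banned
    prefix = tuple(seq[-(n - 1):])
    for i in range(len(seq) - n + 1):
        if tuple(seq[i:i + n - 1]) == prefix:
            banned.add(seq[i + n - 1])
    return banned

def greedy_no_repeat(logits_fn, steps, n=3):
    seq = []
    for _ in range(steps):
        logits = logits_fn(seq).copy()
        for tok in banned_tokens(seq, n):
            logits[tok] = -np.inf         # mask the repeating token
        seq.append(int(np.argmax(logits)))
    return seq

# Stub model: always prefers token 0, then 1, then 2.
fixed = np.array([3.0, 2.0, 1.0])
out = greedy_no_repeat(lambda seq: fixed, steps=6, n=3)
# out == [0, 0, 0, 1, 0, 0]: unconstrained greedy would emit all zeros,
# but the mask forces token 1 once the 3-gram (0, 0, 0) would repeat.
```

The same masking logic drops into a batched, asynchronous generation loop unchanged, since it only inspects each sequence's own history.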
Applied ML for Health and Scientific Discovery
Problem: Health and scientific domains require models that can integrate heterogeneous signals and support downstream prediction or candidate selection.
What I built: Multi-view representation learning models for MRI and multimodal health data, sequence-modeling and generative methods for structured scientific workflows, and molecular generation methods using hierarchical chemical graph representations.
Impact: Contributed to publications in connected health, molecular informatics, biomedical systems modeling, and depression treatment outcome prediction.
Keywords: multimodal learning, health AI, drug discovery, molecular generation, representation learning.
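A minimal sketch of the multi-view idea: embeddings of the same subject from two modalities are pulled together with an InfoNCE-style contrastive loss while mismatched pairs are pushed apart. The shapes, temperature, and data here are illustrative, not the models' actual configuration.

```python
import numpy as np

# Two-view InfoNCE sketch: row i of view_a and row i of view_b describe
# the same subject (the positive pair); every other row is a negative.
# Lower loss means the two views agree on which embeddings match.

def info_nce(view_a: np.ndarray, view_b: np.ndarray, temp: float = 0.1) -> float:
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    sims = a @ b.T / temp                  # pairwise cosine similarities
    # Cross-entropy with the diagonal (matched pairs) as the targets.
    log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
view_a = rng.normal(size=(8, 16))
aligned_loss = info_nce(view_a, view_a + 0.01 * rng.normal(size=(8, 16)))
random_loss = info_nce(view_a, rng.normal(size=(8, 16)))
# aligned views incur a much lower loss than unrelated ones
```

The same objective extends to more than two views by summing the pairwise losses across view combinations.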
