Machine Learning, LLM Agents, and Applied AI Systems

I am a Ph.D. candidate in Computer Science and Engineering at the University of Connecticut, working on machine learning systems, LLM agents, agentic control and safety, retrieval-augmented generation, foundation-model adaptation, and applied AI for real-world decision support.

I am currently looking for AI/ML research internships and applied scientist roles where I can build and evaluate reliable AI systems. My strongest recent work focuses on language agents that decide when to ask, delegate, verify, act, or escalate; guardrails and benchmarks for tool-using coding agents; foundation-model adaptation; and sequence modeling for scientific AI.

Resume | Projects | Notes | Google Scholar | GitHub

Focus Areas

LLM agents and agentic control: learning and evaluation methods for long-horizon agents, reliable action selection, escalation, verification, and safety checks.
Foundation models and model adaptation: data-free model merging, reproducible evaluation, and performance improvement under strict no-data constraints.
ML systems engineering: Python, C++, PyTorch, asynchronous CPU-GPU pipelines, large-scale evaluation harnesses, backend services, and experiment automation.
Applied AI for health and science: multimodal health modeling, molecular generation, sequence modeling, and interdisciplinary AI workflows.

Selected Work

LLM Agents, Agentic Control, and Safety

I develop learning and evaluation methods for long-horizon language agents that choose when to ask, delegate, verify, act, or escalate. In recent work, I relabeled exact counterfactual action values under the learner’s own continuation policy, which raised audited-seed success from 6.2% to 37.8% and utility from -0.237 to 1.051, and reduced decision regret from 0.323 to 0.109.

I also built and audited a benchmark and guard framework for tool-using coding agents, using route, provenance, and capability checks to control high-risk actions. The strongest safe-family evaluations achieved 6/6 task success with zero unauthorized effects and zero route misfires.

Foundation Models, Sequence Modeling, and Scientific AI

My research also includes data-free model merging, structure-aware decoding, autoregressive search, representation learning, molecular generation, and multimodal health prediction. I have worked on distributed generation and evaluation frameworks that scaled candidate generation to 1B+ samples in 6 days on 8 V100 GPUs.

Publications

Selected publications are listed on my Publications page and on Google Scholar.

Xinyu Wang