Research
Turning frontline deployments into reproducible research.
Retrieval-augmented generation, LLM long-term memory, and knowledge graphs — formalizing production practice into reproducible work. Below are the team's public papers, patents, and academic service, each linked to a verifiable source.
Papers & patents
The Preference Centroid: Consensus Density Governs Output Dispersion in Aligned LLMs
Tao An, Shuai Feng
When an aligned LLM is sampled repeatedly and its completions are embedded, output dispersion is governed by the consensus density of the task: near-zero on factual prompts, wide on open-ended ones (Spearman ρ = 0.85). The result replicates on a second model and is predicted by held-out judges that score only the prompt (ρ = −0.91). A base-vs-instruct comparison shows that alignment amplifies a gradient the pretrained base already carries.
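As a rough illustration of the measurement described above, the sketch below computes a per-prompt dispersion score from completion embeddings and rank-correlates it with another per-prompt score. It rests on assumptions not stated in the summary: dispersion is taken here as the mean cosine distance of each embedding to the centroid, and the function names (`dispersion`, `spearman`) and toy vectors are hypothetical. The paper's exact metric and data may differ.

```python
import math

def centroid(vectors):
    # Component-wise mean of the embedding vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine_distance(a, b):
    # 1 - cosine similarity; assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

def dispersion(embeddings):
    # Assumed metric: mean cosine distance of each completion to the centroid.
    c = centroid(embeddings)
    return sum(cosine_distance(e, c) for e in embeddings) / len(embeddings)

def ranks(values):
    # 1-based ranks, with tied values sharing their average rank.
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    # Spearman rho = Pearson correlation of the rank vectors.
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Toy 2-D "embeddings" (hypothetical): identical completions vs. scattered ones.
factual = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
open_ended = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

print(dispersion(factual))     # identical completions give no spread
print(dispersion(open_ended))  # scattered completions give a larger spread
```

In practice the embeddings would come from a sentence-embedding model, and `spearman` would be applied across many prompts to relate dispersion to a consensus-density score.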
It's Fidelity, Not Structure: Verbatim Chunks Beat Lossy Artifact Extraction in Long-Conversation LLM Memory
Tao An
A controlled ablation isolating the stored memory representation inside one fixed retrieve–rerank–reason pipeline: LLM-extracted typed artifacts versus verbatim conversation chunks. Verbatim chunks win by 15.9 points on LoCoMo (43.9% vs. 28.0%) and 22.0 points on LongMemEval-S — structured memory should augment verbatim text, not replace it.
AI as Equalizer or Amplifier? Task Complexity as the Moderating Factor for Human Expertise in Hybrid Intelligence Systems
Tao An
Accepted at the 5th International Conference on Hybrid Human-Artificial Intelligence (HHAI 2026, Brussels); proceedings in IOS Press. Drawing on structured field observations since mid-2024, this position paper reconciles the 'equalizer' and 'amplifier' debates: AI narrows novice–expert gaps on routine tasks but amplifies them on complex tasks requiring deep judgment. Domain expertise — not prompt engineering — determines who benefits most.
Cognitive Workspace: Active Memory Management for LLMs
Tao An
Proposes Cognitive Workspace, a paradigm that extends beyond traditional RAG by emulating human cognition through active memory management, hierarchical cognitive buffers, and task-driven context optimization. Reports a 58.6% memory-reuse rate (vs. 0% for RAG) with a 17–18% net efficiency gain.
A Graph-Neural-Network Method for Data-Information Recommendation
Tao An
Chinese invention patent, under examination. GNN-based recommendation over heterogeneous data–information graphs.
Academic service
Ethics Reviewer
NeurIPS 2026 · 2026
Ethics Review Committee, Conference on Neural Information Processing Systems (NeurIPS 2026): reviewing flagged submissions against the NeurIPS Code of Ethics, covering data provenance and informed consent, dual-use and misuse risk, human-subjects considerations, and broader societal impact.