Project Results
Technical Highlights
Heterogeneous Pooling
Breaks down chip barriers, enabling mixed scheduling of domestic and general-purpose compute
Model Service Mesh
Istio-based microservice governance for fine-grained traffic control and recovery within seconds
AI Security Gateway
Built-in prompt injection defense and data de-identification for AI-era security
Dynamic Admission
Data-driven, automated model evaluation that helps select appropriate models for deployment
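As an illustration of the Dynamic Admission idea, the sketch below filters candidate models by evaluation thresholds and ranks the ones that pass. The metric names (`accuracy`, `p95_latency_ms`), the thresholds, and the `EvalReport` and `admit` names are all assumptions made for this example; the source does not describe the platform's actual evaluation criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalReport:
    """Hypothetical evaluation result for one candidate model."""
    model: str
    accuracy: float        # assumed quality metric in [0, 1]
    p95_latency_ms: float  # assumed latency metric


def admit(reports, min_accuracy, max_p95_latency_ms):
    """Return reports that meet both thresholds, best first.

    Ranking is by accuracy (descending), then latency (ascending).
    """
    passing = [
        r for r in reports
        if r.accuracy >= min_accuracy and r.p95_latency_ms <= max_p95_latency_ms
    ]
    return sorted(passing, key=lambda r: (-r.accuracy, r.p95_latency_ms))
```

A real admission pipeline would likely weigh more signals (cost, safety checks, chip compatibility); the point here is only that admission can be expressed as a data-driven gate rather than a manual decision.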
Client Background
The client, a provincial AI computing power dispatch center, coordinates regional compute resources for government, research, and public AI applications. As requirements to adopt domestic hardware grew, the center faced a complex mix of general-purpose GPUs and domestic AI chips. Its legacy siloed architecture prevented resource pooling and cross-chip model migration, creating an urgent need for an AI operating system that abstracts away hardware differences.
Technology Stack
From Challenges to Solutions
System Architecture Design
Abstracts underlying chip differences (general-purpose and domestic chips) for unified resource pooling and scheduling
Service-mesh-based model traffic governance supporting A/B testing, canary releases, and circuit breaking
Enterprise-grade unified API access with authentication, rate limiting, billing, and end-to-end observability
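The unified pooling layer described above can be pictured as a scheduler that treats GPUs and NPUs as one pool and only asks which chip types a model has been built for. The following is a minimal sketch under that assumption; the `Device`, `Job`, and `schedule` names, the chip-type labels, and the best-fit placement policy are illustrative choices, not details of the deployed system.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """One accelerator in the shared pool (hypothetical model)."""
    name: str
    chip_type: str    # illustrative labels such as "gpu" or "npu"
    free_mem_gb: int


@dataclass(frozen=True)
class Job:
    """A model workload and the chip types it can run on."""
    name: str
    mem_gb: int
    supported_chips: frozenset


def schedule(job, pool):
    """Place the job on the compatible device that leaves the least
    free memory afterwards (best fit). Returns the device, or None
    if no device in the pool can host the job."""
    candidates = [
        d for d in pool
        if d.chip_type in job.supported_chips and d.free_mem_gb >= job.mem_gb
    ]
    if not candidates:
        return None
    best = min(candidates, key=lambda d: d.free_mem_gb - job.mem_gb)
    best.free_mem_gb -= job.mem_gb  # reserve the memory on the chosen device
    return best
```

Best fit is used here because it tends to keep large devices free for large jobs; a production scheduler would also need to account for interconnect topology, tenant quotas, and preemption.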
Phased Implementation
Infrastructure Pooling
Completed unified access and virtualization of GPU/NPU resources; established heterogeneous scheduling foundation
Service Governance Launch
Deployed Model Mesh to take over regional AI traffic, achieving multi-tenant isolation and dynamic rate limiting
Ecosystem Opening
Launched the AI App Gateway and Developer Center to support one-stop AI capability invocation for all provincial agencies
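The multi-tenant isolation and dynamic rate limiting mentioned in the Service Governance phase can be sketched with one token bucket per tenant. This is a generic technique chosen for illustration, not a description of how the Model Mesh implements it; `TokenBucket`, `TenantLimiter`, and `set_limit` are hypothetical names.

```python
import time


class TokenBucket:
    """Classic token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Refill based on elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class TenantLimiter:
    """One bucket per tenant, so one tenant's burst cannot starve another."""

    def __init__(self, default_rate, default_capacity, clock=time.monotonic):
        self.default_rate = default_rate
        self.default_capacity = default_capacity
        self.clock = clock
        self.buckets = {}

    def set_limit(self, tenant, rate, capacity):
        """Change a tenant's limit at runtime (the 'dynamic' part)."""
        self.buckets[tenant] = TokenBucket(rate, capacity, self.clock)

    def allow(self, tenant):
        if tenant not in self.buckets:
            self.buckets[tenant] = TokenBucket(
                self.default_rate, self.default_capacity, self.clock
            )
        return self.buckets[tenant].allow()
```

The clock is injectable so the limiter can be exercised deterministically in tests; in a service mesh deployment this logic would more plausibly live in the gateway or sidecar configuration than in application code.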
Client Testimonial
“This platform solved our urgent problem of ‘having compute but being unable to schedule it’. It not only hid the complexity of different chips but also nearly doubled our resource utilization, truly achieving centralized regional management.”