Business Impact
Technical Highlights
Heterogeneous Pooling
Breaks chip barriers, enabling mixed scheduling of domestic and general-purpose compute
Model Service Mesh
Istio-based microservice governance for fine-grained traffic control and recovery within seconds
AI Security Gateway
Built-in prompt injection defense and data de-identification for AI-era security
Dynamic Admission
Data-driven automated model evaluation that helps select appropriate models for deployment
Customer Context
This Provincial AI Computing Power Dispatch Center coordinates regional compute resources for government, research, and public AI applications. As requirements for domestically produced hardware grew, the center found itself managing a complex mix of general-purpose GPUs and domestic AI chips. Its legacy siloed architecture prevented resource pooling and cross-chip model migration, creating an urgent need for an AI operating system that abstracts away hardware differences.
Technology Stack
From Pain Points to Adoption
System Architecture Design
Abstracts underlying chip differences (general-purpose and domestic chips) for unified resource pooling and scheduling
Service-mesh-based model traffic governance supporting A/B testing, canary releases, and circuit breaking
Enterprise-grade unified API access with authentication, rate limiting, billing, and end-to-end observability
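To make the unified pooling idea above concrete, here is a minimal sketch of a scheduler that places jobs onto a mixed pool of general-purpose and domestic accelerators. It does not show how FIM One works. Every name in it (`Device`, `Job`, `Pool`, the `"gpu"` and `"npu"` labels, the best-fit policy) is a hypothetical assumption chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    chip: str          # hypothetical labels: "gpu" (general-purpose), "npu" (domestic)
    free_mem_gb: int


@dataclass
class Job:
    name: str
    mem_gb: int
    supported_chips: frozenset  # chip families this model has been built for


class Pool:
    """One logical pool over heterogeneous devices (illustrative only)."""

    def __init__(self, devices):
        self.devices = list(devices)
        self.placements = {}  # job name -> device name

    def schedule(self, job):
        # Any chip family the job supports is eligible, so callers never
        # have to target a specific vendor or device.
        candidates = [
            d for d in self.devices
            if d.chip in job.supported_chips and d.free_mem_gb >= job.mem_gb
        ]
        if not candidates:
            return None  # no device in the pool can host this job right now
        # Best fit: pick the eligible device with the least free memory,
        # which keeps larger devices available for larger jobs.
        best = min(candidates, key=lambda d: (d.free_mem_gb, d.name))
        best.free_mem_gb -= job.mem_gb
        self.placements[job.name] = best.name
        return best.name


pool = Pool([
    Device("gpu-0", "gpu", 80),
    Device("npu-0", "npu", 64),
    Device("npu-1", "npu", 32),
])
print(pool.schedule(Job("chat-model", 40, frozenset({"gpu", "npu"}))))
print(pool.schedule(Job("ocr-model", 30, frozenset({"npu"}))))
print(pool.schedule(Job("huge-model", 100, frozenset({"gpu"}))))
```

In this sketch a job that has been built for both chip families can land on either one, which is the property that lets a single pool replace per-vendor silos. A production scheduler would also have to account for interconnect topology, multi-device jobs, and preemption, none of which are modeled here.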
Phased Implementation
Infrastructure Pooling
The center brought all GPU/NPU resources onto FIM One with unified virtualization, establishing the heterogeneous scheduling foundation
Service Governance Pilot
The center turned on Model Mesh for regional AI traffic, validating multi-tenant isolation and dynamic rate limiting
Ecosystem Scale-out
The center rolled out the AI App Gateway and Developer Center, giving all provincial agencies one-stop access to AI capabilities
Customer Voice
“This platform solved our urgent problem of ‘having compute but being unable to schedule it’. It not only hid the complexity of different chips but also nearly doubled our resource utilization, giving us truly centralized regional management.”