A cloud system built for constant uptime instead of scheduled workloads.
Leeco Steel needed to run large-scale simulation models every weekday to support operational analysis and forecasting.
Their infrastructure relied on large cloud servers that ran around the clock, even though simulation workloads ran only at scheduled times.
While the servers provided sufficient computing power, the architecture was inefficient and expensive for periodic workloads.
When issues occurred, the environment was also difficult to reset, often requiring manual intervention from engineers and analysts.
The system delivered the raw computing power needed for simulations, but it introduced operational friction and unnecessary infrastructure cost.
Instead of focusing on interpreting simulation results, analysts frequently had to spend time managing compute resources or stabilizing infrastructure.
Compute infrastructure should scale with demand, not run constantly.
Large simulation models require significant computing resources.
But when infrastructure stays active around the clock, costs rise quickly and operational complexity increases.
Leeco Steel needed a system capable of delivering bursts of computing power when simulations ran while minimizing infrastructure cost during idle periods.
The environment also needed to be reliable, reproducible, and easy to redeploy when updates or fixes were required.
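The idle-time argument above can be made concrete with a little arithmetic. The sketch below compares billed server-hours per week for an always-on fleet against one that exists only during weekday runs. The server count and run length are hypothetical inputs chosen for illustration, not figures from the Leeco Steel engagement, and the function names are ours.

```python
HOURS_PER_DAY = 24
DAYS_PER_WEEK = 7
WEEKDAYS_PER_WEEK = 5


def always_on_hours_per_week(servers: int) -> int:
    """Server-hours billed per week when every server runs around the clock."""
    return servers * HOURS_PER_DAY * DAYS_PER_WEEK


def scheduled_hours_per_week(servers: int, run_hours: float) -> float:
    """Server-hours billed per week when servers exist only for weekday runs."""
    return servers * run_hours * WEEKDAYS_PER_WEEK


def idle_fraction(servers: int, run_hours: float) -> float:
    """Share of always-on server-hours during which no simulation is running."""
    always_on = always_on_hours_per_week(servers)
    return 1 - scheduled_hours_per_week(servers, run_hours) / always_on


if __name__ == "__main__":
    # Hypothetical inputs: 4 servers, one 3-hour run per weekday.
    servers, run_hours = 4, 3.0
    print(always_on_hours_per_week(servers))             # 672
    print(scheduled_hours_per_week(servers, run_hours))  # 60.0
    print(round(idle_fraction(servers, run_hours), 3))   # 0.911
```

Under these assumed inputs, roughly nine out of ten always-on server-hours are spent idle. The real ratio depends on actual run lengths and cluster sizes, and spot pricing would change the cost per hour as well as the number of hours.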
"Always-on infrastructure created unnecessary cost and operational complexity. We needed a system that could scale when needed and disappear when it wasn't."
An automated cloud platform engineered for scheduled compute workloads
KEYSYS designed and deployed an Azure-based distributed compute platform built entirely as infrastructure-as-code.
The system launches temporary compute clusters when simulation jobs are scheduled, distributes workloads across multiple machines, and shuts the infrastructure down automatically when the work is complete.
Because the environment is defined through code, the entire platform can be rebuilt or updated in minutes.
- On-demand Azure compute clusters using cost-efficient spot instances
- Distributed simulation workloads powered by Ray
- Infrastructure defined and provisioned through Terraform
- Automated deployment pipelines powered by GitHub Actions
System Architecture
The Leeco Steel platform is built as a fully automated distributed cloud environment defined through infrastructure-as-code.
Terraform provisions Azure infrastructure while Kubernetes orchestrates compute workloads across temporary spot-based servers. Ray distributes simulation workloads across multiple machines so large modeling jobs can run in parallel.
GitHub Actions manages automated deployment pipelines, allowing the entire platform to be redeployed or updated in minutes.
Because the infrastructure is ephemeral, compute resources exist only during simulation runs and terminate automatically once jobs complete.
Distributed computing frameworks like Ray are specifically designed to scale workloads across clusters of machines, allowing large data or simulation tasks to run efficiently across many nodes simultaneously.
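The fan-out and gather pattern described here can be sketched in a few lines. To keep the example runnable without a cluster, it uses Python's standard-library thread pool as a stand-in for Ray, and `run_scenario` is a hypothetical placeholder for a real simulation model. This is not the platform's actual code.

```python
from concurrent.futures import ThreadPoolExecutor
import random


def run_scenario(seed: int, steps: int = 1000) -> float:
    """Hypothetical stand-in for one simulation run: a seeded random walk."""
    rng = random.Random(seed)  # per-task generator keeps each run reproducible
    position = 0.0
    for _ in range(steps):
        position += rng.uniform(-1.0, 1.0)
    return position


def run_all(seeds: list[int], workers: int = 4) -> list[float]:
    """Fan the scenarios out to a worker pool and gather results in seed order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_scenario, seeds))


if __name__ == "__main__":
    results = run_all(list(range(8)))
    print(len(results))  # 8
```

A local thread pool shows the shape of the pattern rather than the speedup. With Ray, the same function would typically be decorated with `@ray.remote`, invoked with `.remote(seed)`, and collected with `ray.get`, so that each scenario can be scheduled on a different machine in the cluster.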
How the Platform Runs Today
Simulation jobs trigger infrastructure deployment automatically.
Azure spot servers spin up only when workloads begin.
Ray distributes simulation tasks across compute nodes.
Infrastructure terminates once jobs complete.
What once required constant infrastructure now runs only when simulation workloads are scheduled.
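The run cycle above (provision, run, terminate) maps naturally onto a context manager that guarantees teardown even when a job fails. The sketch below is a minimal illustration of that idea. `FakeCluster`, `ephemeral_cluster`, and `run_job` are hypothetical names, and the in-memory class stands in for whatever the real provisioning step does.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


class FakeCluster:
    """Hypothetical in-memory stand-in for a temporary compute cluster."""

    def __init__(self, nodes: int) -> None:
        self.nodes = nodes
        self.running = True

    def shutdown(self) -> None:
        self.running = False


@contextmanager
def ephemeral_cluster(nodes: int) -> Iterator[FakeCluster]:
    """Provision a cluster for one run and always tear it down afterwards."""
    cluster = FakeCluster(nodes)  # a real platform would provision servers here
    try:
        yield cluster
    finally:
        cluster.shutdown()  # runs on success and on failure alike


def run_job(nodes: int, job: Callable[[FakeCluster], str]) -> str:
    """Run one job on a cluster that exists only for the duration of the call."""
    with ephemeral_cluster(nodes) as cluster:
        return job(cluster)


if __name__ == "__main__":
    print(run_job(3, lambda c: f"ran on {c.nodes} nodes"))  # ran on 3 nodes
```

Placing the shutdown in a `finally` clause is the design choice that matters: a crashed simulation should not leave billable servers running.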
"Watching a full compute cluster spin up for a simulation run and disappear when the job finishes is exactly how cloud infrastructure should behave."
Automated infrastructure with dramatically lower operating costs
KEYSYS transformed Leeco Steel's simulation environment into an automated cloud platform that launches infrastructure only when simulations require it.
- Infrastructure launches automatically when simulations run
- Analysts focus on interpreting results instead of managing systems
- Spot-based compute resources dramatically reduce infrastructure cost
- Compute runs only during active workloads
- Infrastructure defined entirely through code
- Environment redeployable in minutes when needed
Before
- Large compute servers running continuously
- High infrastructure costs regardless of workload demand
- Manual troubleshooting required when systems failed
- Environment resets required engineering intervention
After
- Spot-based servers launched only when simulations run
- Workloads automatically distributed across compute clusters
- Infrastructure shuts down automatically after completion
- Entire environment redeployable with a single command
Ready to Automate Your Infrastructure?
If your systems require constant infrastructure to run periodic workloads, you may not need more servers. You may need a platform that scales automatically. Most engagements begin with a 30-minute conversation about your systems, workloads, and infrastructure.
Download the Executive PDF.
Formatted for internal distribution, stakeholder review, and proposal inclusion.
Download the Leeco Steel Case Study (PDF)