If we can imagine a way for an HPC center to provision clusters (each owned by a user) via the Flux Operator, on demand for a user or group, we'd want control over instance types, sizes, and costs. For example:
An ideal in my opinion would be to be able to list the allowed instance types and max sizes, then have Flux handle provisioning (on-demand or spot) on a per-job basis. It could use QoS flags to decide whether to chain sequences of jobs on the same instances (to amortize provisioning costs) versus spreading them across instances (to minimize time to completion). I think these policies are possible with Kubernetes (thus minimizing customization to any specific cloud provider, as with current solutions).
In thread here:
https://hachyderm.io/@jedbrown/109396976059698506
Thanks @jedbrown!
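
To make the idea concrete, here is a minimal sketch of the chain-versus-spread decision described in the quote. Everything in it is hypothetical: the `Policy` and `Job` classes, the `"economy"` and `"urgent"` QoS values, and the instance type names are invented for illustration and are not part of Flux, the Flux Operator, or Kubernetes. It only shows how an allow-list plus a QoS flag could map jobs onto node groups.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical center-level policy: what a user may provision."""
    allowed_types: tuple  # instance types the center permits
    max_nodes: int        # largest node count a single job may request


@dataclass(frozen=True)
class Job:
    """Hypothetical job request carrying a QoS flag."""
    name: str
    instance_type: str
    nodes: int
    qos: str = "economy"  # assumed values: "economy" (chain) or "urgent" (spread)


def plan(jobs, policy):
    """Group jobs onto node groups according to the policy and QoS flag.

    Economy jobs with the same shape share one node group and run in
    sequence, which amortizes provisioning cost. Any other QoS value gets
    a dedicated node group per job, which minimizes time to completion.
    Returns a dict mapping a group key to the ordered list of job names.
    """
    groups = {}
    for job in jobs:
        if job.instance_type not in policy.allowed_types:
            raise ValueError(f"{job.name}: instance type not allowed")
        if job.nodes > policy.max_nodes:
            raise ValueError(f"{job.name}: exceeds max size")
        if job.qos == "economy":
            key = ("shared", job.instance_type, job.nodes)
        else:
            key = (job.name, job.instance_type, job.nodes)
        groups.setdefault(key, []).append(job.name)
    return groups
```

With this sketch, two economy jobs of the same shape land in one shared group, while an urgent job of that shape gets its own group. A real implementation would also need to account for spot versus on-demand pricing and actual provisioning latency, which this sketch leaves out.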