What’s the deal with wallclock time?

Max Wallclock Time and Requesting Wallclock Time on the UMich Cluster

If you’re a user on the UMich cluster, you may be familiar with the maximum wallclock time allowed on the cluster – 2 weeks. But have you ever faced a dilemma when it comes to requesting wallclock time for your jobs? If you request too little time, your job may be killed before it finishes. But if you request too much time, you can needlessly tie up the CPU hours in your account’s balance. Let’s take a closer look at how to effectively request wallclock time on the UMich cluster.

The Dilemma of Requesting Wallclock Time

When requesting wallclock time for your jobs, it’s important to strike the right balance. If you request too little time, your job may not finish before the allotted time runs out; if your program doesn’t checkpoint, the CPU time it has already spent is wasted. On the other hand, if you request too much time, the scheduler reserves more CPU hours from your account than the job will actually use, which can keep your other jobs from starting.

How the Scheduler Works

When you tell the scheduler how long your job will take, two things happen:

A) The scheduler checks to make sure you can “afford” the job.

  1. The scheduler blocks out CPU time for those hours from the account you’re using. This helps keep you from going over budget: the scheduler won’t start a job that the account doesn’t have enough CPU hours to pay for. Once the job finishes, any leftover CPU time is credited back to the account. This can be a problem, however, when the account is nearly out of CPU hours.

For example, let’s say you submit 10 single-processor jobs telling the scheduler they’ll each run for 4 days. You are effectively asking for 40 days of CPU time for these jobs (4 days × 10 single-processor jobs = 40 days of CPU time).

But if the account only has enough CPU hours left for 20 days of compute, the scheduler will only allow 5 of those jobs to run at a time. Suppose each job actually finishes in 2 days: as each job finishes, the scheduler credits back the 2 days of wallclock time that weren’t used, and the next job can start.

It would take approximately 4 days (two batches of 5 jobs running for 2 days each) to run all 10 jobs. If you had instead told the scheduler they would run in 2 days, all 10 would have been allowed to start that first day, and you would expect them back in approximately 2 days.
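
Here is that arithmetic as a small Python sketch. The numbers are the ones from the example above; the real scheduler’s accounting is more involved, and this only illustrates the bookkeeping.

# Rough sketch of the CPU-time budgeting described above. Numbers come
# from the example; the real scheduler's accounting is more involved.
requested_days_per_job = 4    # wallclock time you told the scheduler
actual_days_per_job = 2       # how long each job really runs
num_jobs = 10                 # single-processor jobs
account_budget_days = 20      # CPU-days left in the account

# Each queued job reserves its full *requested* time up front.
jobs_running_at_once = account_budget_days // requested_days_per_job   # 20 // 4 = 5

# Batches needed to run everything, and roughly when all results are back.
batches = -(-num_jobs // jobs_running_at_once)        # ceil(10 / 5) = 2
total_elapsed_days = batches * actual_days_per_job    # 2 * 2 = 4 days

print(f"{jobs_running_at_once} jobs start at once; all {num_jobs} done in ~{total_elapsed_days} days")

# Had you requested 2 days per job instead, all 10 (10 * 2 = 20 CPU-days)
# would have fit the budget at once and finished in about 2 days.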

B) The scheduler checks to make sure everyone has a fair chance to use the cluster.

  1. The scheduler aims to treat all users fairly on the cluster by balancing requests for resources and time. While the scheduler primarily operates on a First-In-First-Out basis, jobs can start ahead of others depending on the “priority” the scheduler assigns them. Job priority is affected by several factors.

The formula for assigning job priority looks like this:

Job_priority = site_factor +
	(PriorityWeightAge) * (age_factor) +
	(PriorityWeightAssoc) * (assoc_factor) +
	(PriorityWeightFairshare) * (fair-share_factor) +
	(PriorityWeightJobSize) * (job_size_factor) +
	(PriorityWeightPartition) * (partition_factor) +
	(PriorityWeightQOS) * (QOS_factor) +
	SUM(TRES_weight_cpu * TRES_factor_cpu,
	    TRES_weight_<type> * TRES_factor_<type>, ...)
	- nice_factor
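
To make the formula concrete, here is a minimal Python sketch of the same calculation. The function and argument names are placeholders of my own; in SLURM, each factor is normally a float between 0.0 and 1.0.

def job_priority(weights, factors, site_factor=0, nice_factor=0):
    # Minimal sketch of the multifactor priority formula above.
    # `weights` and `factors` map factor names (age, assoc, fairshare,
    # job_size, partition, qos, plus any TRES types) to numbers; each
    # factor is normally a float between 0.0 and 1.0.
    priority = site_factor
    for name, weight in weights.items():
        priority += weight * factors.get(name, 0.0)
    return priority - nice_factor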

So what is fair share, you might ask?

Fair share refers to the difference between a user’s allocated share of resources and the amount of resources that user has actually consumed over a given period of time. A user who has recently consumed less than their fair share receives a higher priority than a user who has consumed more than theirs. The goal of the fair-share term in the SLURM priority equation is to ensure that all users have equitable access to resources over time, regardless of their past consumption.
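
SLURM’s classic fair-share algorithm captures that idea with an exponential. The sketch below is a heavily simplified version (it ignores usage half-life decay, the damping factor, and hierarchical accounts); treat it as an illustration of the trend, not the exact implementation on the UMich cluster.

def fair_share_factor(normalized_usage, normalized_shares):
    # Simplified illustration: roughly F = 2 ** (-usage / shares).
    # A user at exactly their share gets 0.5, an idle user approaches 1.0,
    # and a heavy user approaches 0. Real SLURM adds usage half-life decay,
    # a damping factor, and hierarchical account shares.
    if normalized_shares == 0:
        return 0.0
    return 2 ** (-normalized_usage / normalized_shares)

# A user who has recently used half their share outranks one who has used double.
print(fair_share_factor(0.05, 0.10))   # ~0.71
print(fair_share_factor(0.20, 0.10))   # 0.25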

Each cluster running the SLURM scheduler sets its own weighting for each factor that affects priority.
The weights at UMich are currently:

PriorityWeightAge = 10000
PriorityWeightAssoc = 0
PriorityWeightFairShare = 10000
PriorityWeightJobSize = 0
PriorityWeightPartition = 10000
PriorityWeightQOS = 1000000
PriorityWeightTRES = cpu=0,Mem=0,GRES/gpu=0
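
Plugging those weights into the job_priority sketch from earlier shows why, at these settings, the QOS term dominates and the TRES terms contribute nothing. The factor values below are made up purely for illustration.

umich_weights = {
    "age": 10_000,
    "assoc": 0,
    "fairshare": 10_000,
    "job_size": 0,
    "partition": 10_000,
    "qos": 1_000_000,
    # TRES weights (cpu, mem, gres/gpu) are all 0 at these settings
}

# Hypothetical factor values, each between 0.0 and 1.0.
example_factors = {"age": 0.5, "fairshare": 0.8, "partition": 1.0, "qos": 0.1}

print(job_priority(umich_weights, example_factors))   # 123000.0
# The QOS term alone contributes 100,000 here, far more than age,
# fair-share, or partition can at a weight of 10,000 each.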

Additionally, when the scheduler reserves hardware for a large job, it tries to backfill smaller jobs into the gap: jobs whose requested wallclock time means they should finish before the large job is ready to start. This is another reason an accurate wallclock request can help your job start sooner.
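
Here is a toy illustration of that backfill check (not actual scheduler code): a waiting job can slip into the gap only if its requested wallclock time fits.

def can_backfill(requested_hours, hours_until_reservation):
    # Toy model of backfill: a waiting job can start ahead of a large job's
    # reservation only if its requested wallclock fits in the gap. The real
    # scheduler also checks nodes, memory, and other limits.
    return requested_hours <= hours_until_reservation

print(can_backfill(4, 6))    # True:  a 4-hour request fits before the big job starts
print(can_backfill(12, 6))   # False: the same work with a padded 12-hour request waits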

Conclusion

Ultimately, choosing a wallclock time appropriate for your jobs helps get them through the queue. To request wallclock time effectively on the UMich cluster, strike the right balance between asking for too little and too much time. By understanding how the scheduler works, you can ensure that your jobs run efficiently and to completion.