Default Execution Limits


Execution limits for the three main clusters at HPCVL are in place to provide greater flexibility in scheduling user applications.

Users are limited to twelve (12) executing jobs at any one time. This means that across the production clusters (M9000, SW and VictoriaFalls) each user can run up to 12 production jobs in total simultaneously. Jobs submitted above that limit remain in the queue until a job slot becomes free.

To reflect the differences in processor slots, number of machines, CPU speed and available memory, the maximum number of processes (threads) that can be run at a given time differs by cluster, as follows:

  • M9000 Servers (m9k0001-8): 64 (default systems)
  • SW Linux Cluster (sw0011-51): 48
  • Victoria Falls (vf001-73): 512

Thus, with up to 12 executing jobs, a total of 624 threads/processes (64 + 48 + 512) are possible at any given time when spread over the three queues as above.
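As a rough illustration of how these limits combine, the following Python sketch sums the per-cluster values listed above and checks a set of running jobs against them. It assumes that each limit applies per user and per cluster; the names THREAD_LIMITS, MAX_RUNNING_JOBS and within_limits are hypothetical and are not part of any HPCVL tool or scheduler.

```python
# Per-cluster thread/process limits, taken from the list above.
THREAD_LIMITS = {
    "M9000": 64,
    "SW": 48,
    "VictoriaFalls": 512,
}

# Maximum number of simultaneously executing jobs per user.
MAX_RUNNING_JOBS = 12


def total_thread_limit(limits=THREAD_LIMITS):
    """Return the combined thread/process limit over all clusters."""
    return sum(limits.values())


def within_limits(running_jobs, limits=THREAD_LIMITS, max_jobs=MAX_RUNNING_JOBS):
    """Check a list of (cluster, threads) pairs against the limits.

    The (cluster, threads) representation is an assumption made for
    this sketch; the scheduler does its own accounting.
    """
    if len(running_jobs) > max_jobs:
        return False
    used = {}
    for cluster, threads in running_jobs:
        used[cluster] = used.get(cluster, 0) + threads
    # A cluster that is not in the table is treated as having no allowance.
    return all(count <= limits.get(cluster, 0) for cluster, count in used.items())


print(total_thread_limit())
```

For example, within_limits([("M9000", 64), ("SW", 48)]) is accepted, while a single 49-thread job on the SW cluster, or thirteen jobs of any size, is not.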

Important Note: Research groups will in many cases require more resources, especially in terms of thread/process numbers, and may fall outside of these limits. We will work with these users to ensure appropriate access. Any requests for enhanced access should include specifications for number of processors required, total amount of memory, and the expected maximum runtime(s). Contact the User Support group to ensure that this is arranged using the most appropriate resource.

Note that long-term extended usage requires a formal application to Compute Canada. Calls for such applications are currently issued once a year in the fall, and we announce them on our web page. Allocations are decided by a Research Allocation Committee (RAC).

Please note that scheduling of jobs using the commercial software packages Fluent and Abaqus involves a license check and must therefore remain subject to additional limits, presently 12 (Fluent) and 8 (Abaqus) processes per job.
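The per-job license limits can be pictured as a simple cap on the number of processes requested. The Python sketch below is only an illustration under that assumption: the table LICENSE_PROCESS_LIMITS and the function capped_processes are hypothetical names, and the actual license check is performed by the scheduler, not by user code.

```python
# Per-job process limits for licensed packages, as stated above.
LICENSE_PROCESS_LIMITS = {
    "fluent": 12,
    "abaqus": 8,
}


def capped_processes(package, requested):
    """Return the number of processes a job may use for a given package.

    Packages without an entry in the table are assumed to carry no
    additional license limit, so the request is returned unchanged.
    """
    limit = LICENSE_PROCESS_LIMITS.get(package.lower())
    if limit is None:
        return requested
    return min(requested, limit)


print(capped_processes("Fluent", 16))
print(capped_processes("Abaqus", 4))
```

Under these assumptions, a 16-process Fluent request is reduced to 12, while a 4-process Abaqus request is left as it is.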

These limits make the utilization of our resources more efficient, while allowing researchers to get their work done, expand their research and address new problems.