1. User Job Distribution: A Heavy-Tailed Workload

The cluster’s workload is heavily concentrated among a very small fraction of the total user base. The data exhibits an extreme version of the Pareto principle (80/20 rule).
- The Top 2%: The heaviest 2.12% of users are responsible for 70% of the total workload.
- The Top 4%: Only 3.75% of users account for 80% of all jobs.
- 99% of the Workload: The top 29.32% of users account for 99% of the total cluster workload.
This indicates that a vast majority of users submit relatively few jobs, while a core group of “power users” drives almost all cluster activity.
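
The concentration figures above are cumulative shares: sort users from heaviest to lightest, then find how many are needed to reach a given share of all jobs. The following is a minimal sketch of that calculation, assuming the input is a plain list of per-user job counts. The function name `user_share_for_workload` and the toy counts are hypothetical and are not taken from the cluster data.

```python
def user_share_for_workload(job_counts, target_share):
    """Smallest fraction of users (heaviest first) whose jobs reach target_share of all jobs."""
    if not job_counts:
        raise ValueError("job_counts must not be empty")
    if not 0 < target_share <= 1:
        raise ValueError("target_share must be in (0, 1]")

    ordered = sorted(job_counts, reverse=True)  # heaviest users first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total job count must be positive")

    running = 0
    for rank, count in enumerate(ordered, start=1):
        running += count
        if running >= target_share * total:
            return rank / len(ordered)
    return 1.0  # only reachable through floating-point rounding


# Toy data (made up for illustration): one heavy user and nine light ones.
toy_counts = [900, 50, 20, 10, 5, 5, 4, 3, 2, 1]
for share in (0.70, 0.96):
    fraction = user_share_for_workload(toy_counts, share)
    print(f"{share:.0%} of jobs come from the top {fraction:.0%} of users")
```

Applied to the real per-user counts, the same function would be expected to return values close to 0.0212 for a 70% target and 0.0375 for an 80% target, if the figures above were computed this way.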
2. Job Type Distribution: The Dominance of Normal and Array Jobs

The vast majority of the more than ten million job records in the dataset fall into a few primary categories, while interactive jobs and mixed array types are statistically rare.
- Top Job Types (Log Scale Order of Magnitude):
  - X (Normal): The most common job type, with ~2.93 million instances.
  - X.batch (Normal Batch): Close behind with ~2.68 million jobs.
  - X_Y (Array Job): Accounting for ~1.98 million jobs.
  - X_Y.batch (Array Batch): Representing ~1.90 million jobs.
  - X.Y (Job Step): Representing ~1.35 million jobs.
- Rare Job Types: Conversely, standard interactive jobs (X.interactive) barely register, with only 2 occurrences in the entire analysed dataset. Complex mixed array types like X_[Mixed%S] and X_[Mixed] have only 3 and 12 occurrences respectively.
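
The category names above read like shapes of job identifiers. As a rough illustration, the sketch below assigns an identifier string to one of those shapes with regular expressions. It assumes Slurm-style identifiers (for example `12345`, `12345.batch`, `12345_7`, `12345.0`), which the text does not state explicitly. The label `X_[...]` is a catch-all introduced here for bracketed array identifiers, because the text does not define the two "Mixed" categories precisely enough to reproduce them. The name `classify_job_id` is hypothetical.

```python
import re
from collections import Counter

# Each pattern must match the whole identifier (fullmatch), so categories do not overlap.
# Labels follow the text's notation, except "X_[...]", which is an assumption of this sketch.
JOB_ID_PATTERNS = [
    ("X_[...]", re.compile(r"\d+_\[.*\]")),
    ("X_Y.batch", re.compile(r"\d+_\d+\.batch")),
    ("X_Y", re.compile(r"\d+_\d+")),
    ("X.batch", re.compile(r"\d+\.batch")),
    ("X.interactive", re.compile(r"\d+\.interactive")),
    ("X.Y", re.compile(r"\d+\.\d+")),
    ("X", re.compile(r"\d+")),
]


def classify_job_id(job_id):
    """Return the shape label for a job identifier, or "other" if no pattern fits."""
    for label, pattern in JOB_ID_PATTERNS:
        if pattern.fullmatch(job_id):
            return label
    return "other"


# Made-up identifiers, for illustration only.
sample_ids = ["12345", "12345.batch", "12345_7", "12345_7.batch", "12345.0", "12346"]
type_counts = Counter(classify_job_id(job_id) for job_id in sample_ids)
print(type_counts.most_common())
```

Identifiers that fit none of the patterns, such as a step of an array task, fall into "other" in this sketch; a real analysis would need to decide how to count them.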
3. Partition Distribution & Behaviours

Job volume is radically different depending on the partition, and specific partitions attract entirely different shapes of workloads.
Overall Volume:
- k2-hipri (High Priority) is the absolute behemoth of the cluster, processing an astonishing 8,384,627 jobs.
- k2-medpri (Medium Priority) is a distant second, processing 1,214,761 jobs.
- Other notable general partitions include k2-himem (259k), k2-living-labs (176k), and k2-lowpri (142k).
- Among GPU partitions, k2-gpu-v100 sees the highest job count (65k), followed by k2-gpu-a100 (32k) and k2-gpu-a100mig (20k).
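
To put these volumes in proportion, the short sketch below derives two ratios from the counts quoted above. It covers only the eight partitions listed in this overview, so the share it reports is relative to that subset and not to the whole cluster. The variable names are illustrative.

```python
# Job counts per partition as quoted in the text. Values written as "259k" etc.
# are rounded in the source, so the derived figures are approximate.
partition_jobs = {
    "k2-hipri": 8_384_627,
    "k2-medpri": 1_214_761,
    "k2-himem": 259_000,
    "k2-living-labs": 176_000,
    "k2-lowpri": 142_000,
    "k2-gpu-v100": 65_000,
    "k2-gpu-a100": 32_000,
    "k2-gpu-a100mig": 20_000,
}

listed_total = sum(partition_jobs.values())

# How many times larger k2-hipri is than the runner-up.
hipri_to_medpri = partition_jobs["k2-hipri"] / partition_jobs["k2-medpri"]

# k2-hipri's share of the jobs in the partitions listed above.
hipri_share = partition_jobs["k2-hipri"] / listed_total

print(f"k2-hipri / k2-medpri: {hipri_to_medpri:.2f}x")
print(f"k2-hipri share of listed partitions: {hipri_share:.1%}")
```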
Job Types by Partition (The Heatmap Analysis): The mix of job types varies greatly from one partition to another:
- General Queues (k2-lowpri & k2-living-labs): These partitions see a relatively balanced mix of job types. For example, k2-living-labs processed ~42k Normal jobs, ~40k Array jobs, and ~31k Normal Batch jobs.
- The MIXED Partition Anomaly: The MIXED partition is almost exclusively used for massive array workloads. It processed virtually zero “Normal” jobs, but handled roughly 69k Array jobs (X_Y) and 68k Array Batch jobs (X_Y.batch).
- k2-gpu: The standard GPU partition sees a much lower volume overall compared to CPU nodes, with its highest counts being standard X (Normal) jobs (~4.3k) and X.batch (~3.4k), indicating less use of arrays for basic GPU tasks.
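
A partition-by-job-type heatmap like the one described above rests on a simple cross-tabulation. The sketch below shows one way to build it, assuming each job record has already been reduced to a (partition, job type) pair. The function names and the toy records are hypothetical and are not the analysis code behind the figures in this report.

```python
from collections import Counter, defaultdict


def job_type_by_partition(records):
    """Count jobs per partition and job type from (partition, job_type) pairs."""
    table = defaultdict(Counter)
    for partition, job_type in records:
        table[partition][job_type] += 1
    return table


def dominant_type(table, partition):
    """Return the most frequent job type in a partition, or None if it has no jobs."""
    counts = table.get(partition)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Made-up records, for illustration only.
toy_records = [
    ("MIXED", "X_Y"),
    ("MIXED", "X_Y.batch"),
    ("MIXED", "X_Y"),
    ("k2-gpu", "X"),
    ("k2-gpu", "X"),
    ("k2-gpu", "X.batch"),
]

table = job_type_by_partition(toy_records)
for partition in sorted(table):
    print(partition, dict(table[partition]), "dominant:", dominant_type(table, partition))
```

Each inner counter corresponds to one row of the heatmap, so comparing rows shows which partitions are dominated by array jobs and which by normal jobs.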