Multiserver-job Response Time under Multilevel Scaling

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

We study the multiserver-job setting in the load-focused multilevel scaling limit, where system load approaches capacity much faster than the growth of the number of servers $n$. We consider the "1 and $n$" system, where each job requires either one server or all $n$. Within the multilevel scaling limit, we examine three regimes: load dominated by $n$-server jobs, 1-server jobs, or balanced. In each regime, we characterize the asymptotic growth rate of the boundary of the stability region and the scaled mean queue length. We demonstrate that mean queue length peaks near balanced load via theory, numerics, and simulation.


💡 Research Summary

This paper investigates the performance of a multiserver‑job (MSJ) queueing system under a novel asymptotic regime called load‑focused multilevel scaling (LFMS). In the “1‑and‑n” model each arriving job belongs to one of two classes: a 1‑server job that occupies a single server, or an n‑server job that requires all n servers simultaneously. Arrivals follow a Poisson process with rate λ, and service times are exponential with rates μ₁ (for 1‑server jobs) and μₙ (for n‑server jobs). The scheduling discipline is First‑Come‑First‑Served (FCFS) with head‑of‑line blocking: jobs enter service in arrival order, and when the job at the head of the queue needs more servers than are idle, it waits and so does every job behind it.
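The model described above can be explored with a small simulation. The following is a minimal sketch, not the paper's method: it assumes each arrival is independently a 1‑server job with probability `p1` (a hypothetical parameter, since the summary does not say how the two classes are mixed), and it exploits the exponential service times to simulate the system as a continuous‑time Markov chain. The function name and all argument names are illustrative.

```python
import random
from collections import deque


def simulate_one_and_n(n, lam, p1, mu1, mun, num_events=200_000, seed=0):
    """Estimate the time-average number of jobs in a FCFS '1 and n' system.

    Assumption (not from the source): each arrival is a 1-server job with
    probability p1, otherwise an n-server job.
    """
    rng = random.Random(seed)
    queue = deque()         # waiting jobs in arrival order; True = n-server job
    ones_in_service = 0     # number of 1-server jobs currently running
    big_in_service = False  # True while an n-server job holds all n servers
    now = 0.0
    area = 0.0              # integral over time of the number of jobs in system

    for _ in range(num_events):
        rate_one = ones_in_service * mu1
        rate_big = mun if big_in_service else 0.0
        total = lam + rate_one + rate_big

        # Time to the next event is exponential with the total event rate.
        dt = rng.expovariate(total)
        in_system = len(queue) + ones_in_service + (1 if big_in_service else 0)
        area += in_system * dt
        now += dt

        # Pick which event happened, proportionally to its rate.
        u = rng.random() * total
        if u < rate_one:
            ones_in_service -= 1
        elif u < rate_one + rate_big:
            big_in_service = False
        else:
            queue.append(rng.random() >= p1)  # True marks an n-server job

        # FCFS admission with head-of-line blocking: only the job at the
        # head of the queue may start, and only if enough servers are idle.
        while queue:
            free = 0 if big_in_service else n - ones_in_service
            need = n if queue[0] else 1
            if need > free:
                break  # the head job blocks everything behind it
            if queue.popleft():
                big_in_service = True
            else:
                ones_in_service += 1

    return area / now
```

A useful sanity check is the degenerate case `p1 = 0`, where every job takes all n servers and the system behaves like a single‑server M/M/1 queue with service rate `mun`; the estimate should then be close to ρ/(1 − ρ) with ρ = λ/μₙ. Sweeping `p1` between 0 and 1 at a fixed arrival rate is one way to look for the queue‑length peak near balanced load that the abstract reports, though the sketch makes no claim to reproduce the paper's numbers.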

The LFMS limit differs from classical heavy‑traffic (λ → μ) or server‑focused scaling (n grows much faster than load). Here the system is first pushed to the edge of stability, i.e., the utilization ρ = λ/μ approaches 1, so that the scaled mean queue length E

