Please come back later: Benefiting from deferrals in service systems

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original arXiv paper.

The performance evaluation of loss service systems, where customers who cannot be served upon arrival get dropped, has a long history going back to the classical Erlang B model. In this paper, we consider the performance benefits arising from the possibility of deferring customers who cannot be served upon arrival. Specifically, we consider an Erlang B type loss system where the system operator can, subject to certain constraints, ask a customer arriving when all servers are busy, to come back at a specified time in the future. If the system is still fully loaded when the deferred customer returns, she gets dropped for good. For such a system, we ask: How should the system operator determine the rearrival times of the deferred customers based on the state of the system (which includes those customers already deferred and yet to arrive)? How does one quantify the performance benefit of such a deferral policy? Our contributions are as follows. We propose a simple state-dependent policy for determining the rearrival times of deferred customers. For this policy, we characterize the long run fraction of customers dropped. We also analyse a relaxation where the deferral times are bounded in expectation. Via extensive numerical evaluations, we demonstrate the superiority of the proposed state-dependent policies over naive state-independent deferral policies.

💡 Research Summary

The paper investigates how to improve the performance of classical Erlang‑B loss systems by allowing the system operator to defer customers who arrive when all servers are busy. In the baseline M/M/K/K model, such customers are simply dropped, leading to a blocking probability given by the Erlang‑B formula. The authors introduce two practical constraints on deferrals: (i) a hard upper bound $\hat T$ on the deferral time (the maximum time a customer may be asked to wait before returning) and (ii) a limit $\hat D$ on the number of customers that can be deferred simultaneously. The central question is how to choose the “re‑arrival” times of deferred customers based on the current system state (including already deferred jobs) so as to minimize the long‑run fraction of blocked customers.
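For reference, the baseline blocking probability mentioned above can be computed with the standard Erlang‑B recursion $B(0)=1$, $B(k)=\dfrac{a\,B(k-1)}{k+a\,B(k-1)}$, where $a$ is the offered load (arrival rate times mean service time). The following is a minimal sketch; the function and parameter names are illustrative and are not taken from the paper.

```python
def erlang_b(offered_load: float, servers: int) -> float:
    """Blocking probability of an M/M/K/K loss system (Erlang-B formula).

    offered_load: a = arrival rate * mean service time (in Erlangs).
    servers:      K, the number of servers.
    Uses the numerically stable recursion instead of factorials.
    """
    if offered_load < 0 or servers < 0:
        raise ValueError("offered_load and servers must be non-negative")
    blocking = 1.0  # B(0) = 1: with no servers every customer is blocked
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


if __name__ == "__main__":
    # Example: offered load of 2 Erlangs on 2 servers.
    print(erlang_b(2.0, 2))
```

This value is the fraction of customers dropped when no deferral is allowed, so it serves as the benchmark against which any deferral policy is compared.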

Proposed policy – DSRT (Deterministically Spaced Re‑arrival Times).
A single scalar parameter $x\in(0,\hat T]$ is selected. When a customer arrives to a fully occupied system and no one is currently deferred, the policy defers the customer by exactly $x$ time units. If there are already deferred customers, the new customer’s re‑arrival time is set to $x$ after the latest scheduled re‑arrival, provided the total number of deferred jobs does not exceed $D=\min(\hat D,\lfloor\hat T/x\rfloor)$. Thus the policy spreads the return times of all deferred customers uniformly over the interval (
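The scheduling rule described in this paragraph can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and returning `None` (dropping the customer) when the deferral cap $D$ is already reached is an assumption, since the summary does not state what happens in that case.

```python
import math
from typing import Optional, Sequence


def max_deferred(x: float, t_hat: float, d_hat: int) -> int:
    """D = min(D_hat, floor(T_hat / x)), the cap on simultaneous deferrals."""
    return min(d_hat, math.floor(t_hat / x))


def dsrt_rearrival_time(
    now: float,
    pending: Sequence[float],
    x: float,
    t_hat: float,
    d_hat: int,
) -> Optional[float]:
    """Re-arrival time for a customer who arrives at `now` to a full system.

    pending: re-arrival times already assigned to deferred customers.
    Returns None when the cap D is reached (assumed here to mean "drop").
    """
    if not 0 < x <= t_hat:
        raise ValueError("x must lie in (0, T_hat]")
    # Only customers who have not yet returned count as deferred.
    outstanding = [t for t in pending if t > now]
    if len(outstanding) >= max_deferred(x, t_hat, d_hat):
        return None
    if not outstanding:
        return now + x  # nobody deferred: come back after exactly x
    return max(outstanding) + x  # otherwise: x after the latest re-arrival


if __name__ == "__main__":
    print(dsrt_rearrival_time(0.0, [], x=2.5, t_hat=10.0, d_hat=3))
    print(dsrt_rearrival_time(0.0, [2.5, 5.0], x=2.5, t_hat=10.0, d_hat=3))
```

Because at most $D \le \lfloor\hat T/x\rfloor$ customers are deferred at once and their re-arrivals are spaced $x$ apart, no customer is asked to wait longer than $Dx \le \hat T$, which is how the rule respects the hard bound on deferral time.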

