A Short-Term Predict-Then-Cluster Framework for Meal Delivery Services
Micro-delivery services offer promising solutions for on-demand city logistics, but their success relies on efficient real-time delivery operations and fleet management. On-demand meal delivery platforms seek to optimize real-time operations based on anticipatory insights into citywide demand distributions. To address these needs, this study proposes a short-term predict-then-cluster framework for on-demand meal delivery services. The framework utilizes ensemble-learning methods for point and distributional forecasting with multivariate features, including lagged-dependent inputs to capture demand dynamics. We introduce Constrained K-Means Clustering (CKMC) and Contiguity Constrained Hierarchical Clustering with Iterative Constraint Enforcement (CCHC-ICE) to generate dynamic clusters based on predicted demand and geographical proximity, tailored to user-defined operational constraints. Evaluations of European and Taiwanese case studies demonstrate that the proposed methods outperform traditional time series approaches in both accuracy and computational efficiency. Clustering results demonstrate that the incorporation of distributional predictions effectively addresses demand uncertainties, improving the quality of operational insights. Additionally, a simulation study demonstrates the practical value of short-term demand predictions for proactive strategies, such as idle fleet rebalancing, significantly enhancing delivery efficiency. By addressing demand uncertainties and operational constraints, our predict-then-cluster framework provides actionable insights for optimizing real-time operations. The approach is adaptable to other on-demand platform-based city logistics and passenger mobility services, promoting sustainable and efficient urban operations.
💡 Research Summary
The paper introduces a “predict‑then‑cluster” framework designed to improve real‑time operations of on‑demand meal‑delivery platforms. The authors first address the need for high‑resolution, short‑term demand forecasts that capture complex seasonality, weather effects, holidays, and sudden events. To this end, they develop ensemble‑learning models (Random Forest, Gradient Boosting, and especially Quantile Regression Forest, QRF) that incorporate lagged‑dependent features and produce both point forecasts and predictive distributions expressed as quantiles. Compared with traditional time‑series methods such as SARIMA, Holt‑Winters, and the trigonometric exponential smoothing model TBATS, the QRF‑based approach yields lower RMSE, MAE, and CRPS, while remaining computationally lightweight enough for frequent updates.
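To make the forecasting inputs and scoring more concrete, the following is a minimal sketch, not the authors' code. It assumes demand arrives as a list of order counts per time interval, builds lagged features from that list, and scores a quantile forecast with the pinball loss (a standard per-quantile score that is related to CRPS). The function names `make_lagged_features` and `pinball_loss`, the chosen lags, and the toy numbers are all illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def make_lagged_features(
    demand: Sequence[float], lags: Sequence[int]
) -> Tuple[List[List[float]], List[float]]:
    """Build (X, y): each row of X holds demand at t - lag for every lag."""
    max_lag = max(lags)
    X, y = [], []
    # The first max_lag intervals have no complete history, so skip them.
    for t in range(max_lag, len(demand)):
        X.append([demand[t - lag] for lag in lags])
        y.append(demand[t])
    return X, y


def pinball_loss(
    y_true: Sequence[float], y_pred: Sequence[float], q: float
) -> float:
    """Average pinball (quantile) loss of predictions at quantile level q."""
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        diff = actual - predicted
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


# Toy order counts per interval (made-up numbers, not from the paper).
orders = [4, 6, 5, 9, 12, 10, 7, 5]
X, y = make_lagged_features(orders, lags=[1, 2, 3])
# X[0] == [5, 6, 4] and y[0] == 9: the three previous intervals predict the fourth.
```

The rows of `X` could then be fed, together with weather or calendar features, to any regressor that supports quantile outputs. At the 0.9 level the pinball loss penalises an under-prediction nine times as heavily as an over-prediction of the same size, which suits a high quantile.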
The second component tackles the generation of operationally useful clusters. Two novel algorithms are proposed: Constrained K‑Means Clustering (CKMC) and Contiguity Constrained Hierarchical Clustering with Iterative Constraint Enforcement (CCHC‑ICE). CKMC extends classic K‑Means by embedding user‑defined constraints on cluster size, total predicted demand, and maximum geographic extent, ensuring that each cluster can serve as a feasible dispatch zone. CCHC‑ICE starts with a hierarchical agglomeration and then iteratively enforces spatial contiguity and other operational limits (e.g., minimum number of clusters, similarity thresholds). This dual‑constraint approach produces dynamic, geographically coherent clusters that adapt to the evolving demand landscape.
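The idea of embedding an operational constraint into K‑Means can be illustrated with a small sketch. This is not the paper's CKMC: it is a simplified greedy variant that enforces only one constraint, a cap on total predicted demand per cluster, and ignores cluster size, geographic extent, and contiguity. The function name `capacity_kmeans`, the `(x, y, predicted_demand)` zone format, and the demand-weighted centroid update are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Zone = Tuple[float, float, float]  # (x, y, predicted_demand), hypothetical format


def capacity_kmeans(
    zones: Sequence[Zone],
    init_centroids: Sequence[Tuple[float, float]],
    max_demand: float,
    n_iter: int = 10,
) -> List[int]:
    """Assign each zone to the nearest centroid that still has demand capacity."""
    centroids = list(init_centroids)  # copy so the caller's list is untouched
    k = len(centroids)
    labels = [-1] * len(zones)
    # Place high-demand zones first so they are not squeezed out later.
    order = sorted(range(len(zones)), key=lambda i: -zones[i][2])
    for _ in range(n_iter):
        load = [0.0] * k
        new_labels = [-1] * len(zones)
        for i in order:
            x, y, d = zones[i]
            ranked = sorted(range(k), key=lambda c: math.dist((x, y), centroids[c]))
            for c in ranked:
                if load[c] + d <= max_demand:
                    new_labels[i] = c
                    load[c] += d
                    break
            else:
                # The greedy pass found no cluster with room for this zone.
                raise ValueError("infeasible: raise max_demand or add clusters")
        # Move each centroid to the demand-weighted mean of its zones.
        for c in range(k):
            members = [zones[i] for i in range(len(zones)) if new_labels[i] == c]
            weight = sum(m[2] for m in members)
            if weight > 0:
                centroids[c] = (
                    sum(m[0] * m[2] for m in members) / weight,
                    sum(m[1] * m[2] for m in members) / weight,
                )
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels
```

Because the assignment is greedy, this sketch can report infeasibility even when a valid partition exists, and it offers no contiguity guarantee. A production version would more likely solve the assignment step as an optimisation problem.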
The framework is evaluated on two real‑world datasets: one from a European city and another from a Taiwanese city. Forecasting experiments demonstrate that the ensemble models outperform baseline time‑series models by roughly 10–15 % in error metrics and run 30 % faster, making them suitable for real‑time deployment. Clustering experiments show that clusters built on predicted demand distributions align more closely with those built on actual demand (Jaccard similarity improving from 0.78 to 0.86) and that incorporating distributional uncertainty yields clusters that better handle peak‑hour volatility.
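The summary does not say how the Jaccard similarity between two clusterings is computed. One common choice, assumed here, is the pair-counting Jaccard index: the number of zone pairs grouped together in both partitions divided by the number grouped together in at least one. The sketch below implements that definition; the function name `pair_jaccard` and the toy labels are illustrative.

```python
from itertools import combinations
from typing import Sequence


def pair_jaccard(labels_a: Sequence[int], labels_b: Sequence[int]) -> float:
    """Jaccard index over zone pairs that share a cluster in each partition."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same zones")
    both = either = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        both += same_a and same_b
        either += same_a or same_b
    # Two partitions with no co-clustered pairs at all are treated as identical.
    return both / either if either else 1.0


# Clusters from predicted demand versus clusters from observed demand (toy labels).
predicted_clusters = [0, 0, 0, 1, 1, 1]
actual_clusters = [0, 0, 1, 1, 1, 1]
score = pair_jaccard(predicted_clusters, actual_clusters)  # 4 shared pairs out of 9
```

This measure depends only on which zones are grouped together, so renaming the cluster labels leaves the score unchanged.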
A simulation study further illustrates operational benefits: using the short‑term forecasts to proactively rebalance idle couriers reduces average delivery times by about 8 % and courier idle time by 11 %. The authors also release all code, data, and detailed documentation, facilitating reproducibility and encouraging adoption in related on‑demand logistics and mobility services.
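The summary does not describe the rebalancing rule used in the simulation. As one hypothetical illustration of how short-term forecasts could drive such a rule, the sketch below splits a pool of idle couriers across clusters in proportion to predicted demand, using largest-remainder rounding so the counts sum to the pool size. The function name `allocate_idle_couriers`, the cluster names, and the numbers are assumptions, not taken from the paper.

```python
from typing import Dict


def allocate_idle_couriers(predicted: Dict[str, float], n_idle: int) -> Dict[str, int]:
    """Split n_idle couriers across clusters in proportion to predicted demand."""
    total = sum(predicted.values())
    if total <= 0:
        return {name: 0 for name in predicted}
    shares = {name: n_idle * d / total for name, d in predicted.items()}
    alloc = {name: int(share) for name, share in shares.items()}
    leftover = n_idle - sum(alloc.values())
    # Hand remaining couriers to the clusters with the largest fractional parts.
    by_remainder = sorted(
        shares, key=lambda name: shares[name] - alloc[name], reverse=True
    )
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


# Predicted orders for the next interval in three hypothetical clusters.
plan = allocate_idle_couriers({"north": 30, "centre": 50, "south": 20}, n_idle=7)
# plan == {"north": 2, "centre": 4, "south": 1}
```

A realistic policy would also account for travel time to each cluster and for couriers already positioned there. Proportional allocation is only the simplest baseline.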
In conclusion, the study makes three key contributions: (1) a unified predict‑then‑cluster pipeline that couples high‑quality, distributional demand forecasts with constraint‑aware clustering; (2) a demonstration that lagged‑dependent features and quantile‑based models significantly boost short‑term forecasting accuracy even with limited data; (3) the introduction of CKMC and CCHC‑ICE, which together satisfy geographic contiguity and operational constraints, thereby delivering actionable, dynamic zones for fleet management, order bundling, and other real‑time decisions. The work opens avenues for future research on online model updating, integration with reinforcement‑learning based dispatch, and extension to other urban on‑demand services.