Self-Adaptive Probabilistic Skyline Query Processing in Distributed Edge Computing via Deep Reinforcement Learning

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

In the era of the Internet of Everything (IoE), the exponential growth of sensor-generated data at the network edge renders efficient Probabilistic Skyline Query (PSKY) processing a critical challenge. Traditional distributed PSKY methodologies predominantly rely on pre-defined static thresholds to filter local candidates. However, these rigid approaches are fundamentally ill-suited for the highly volatile and heterogeneous nature of edge computing environments, often leading to either severe communication bottlenecks or excessive local computational latency. To resolve this resource conflict, this paper presents SA-PSKY, a novel Self-Adaptive framework designed for distributed edge-cloud collaborative systems. We formalize the dynamic threshold adjustment problem as a continuous Markov Decision Process (MDP) and leverage a Deep Deterministic Policy Gradient (DDPG) agent to autonomously optimize filtering intensities in real-time. By intelligently analyzing multi-dimensional system states, including data arrival rates, uncertainty distributions, and instantaneous resource availability, our framework effectively minimizes a joint objective function of computation and communication costs. Comprehensive experimental evaluations demonstrate that SA-PSKY consistently outperforms state-of-the-art static and heuristic baselines. Specifically, it achieves a reduction of up to 60% in communication overhead and 40% in total response time, while ensuring robust scalability across diverse data distributions.


💡 Research Summary

The paper addresses the challenge of processing Probabilistic Skyline Queries (PSKY) in highly dynamic edge‑cloud environments typical of the Internet of Everything (IoE). Traditional distributed PSKY solutions rely on static, pre‑defined filtering thresholds (α) to decide which local candidate objects should be transmitted to a central broker. Such rigid thresholds cannot adapt to rapid fluctuations in data arrival rates, uncertainty distributions, or the availability of CPU and network resources at edge nodes, leading either to communication overloads or excessive local computation.

To overcome these limitations, the authors propose SA‑PSKY, a Self‑Adaptive framework that formulates the threshold‑adjustment problem as a continuous Markov Decision Process (MDP). The system state at each time step includes multi‑dimensional observations such as per‑node data arrival rates, statistical variance of uncertain attributes, current CPU utilization, bandwidth availability, and sliding‑window size. The action space consists of continuous threshold values α_i for each edge node. A joint reward function penalizes both local computation latency (T_comp) and transmission latency (T_trans) using weighted coefficients, thereby encouraging policies that balance computational and communication costs.
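The MDP ingredients described above can be sketched in Python. The field names and the equal weighting below are illustrative placeholders, not the paper's exact formulation:

```python
from dataclasses import dataclass

@dataclass
class EdgeState:
    """One edge node's slice of the MDP state (illustrative field names)."""
    arrival_rate: float   # per-node data arrival rate (objects/sec)
    attr_variance: float  # statistical variance of the uncertain attributes
    cpu_util: float       # current CPU utilization, in [0, 1]
    bandwidth: float      # available bandwidth (e.g., Mbps)
    window_size: int      # sliding-window size W_i

def reward(t_comp: float, t_trans: float,
           w_comp: float = 0.5, w_trans: float = 0.5) -> float:
    """Joint reward: negative weighted sum of local computation latency
    (T_comp) and transmission latency (T_trans). The weights are
    assumed coefficients, not values reported in the paper."""
    return -(w_comp * t_comp + w_trans * t_trans)
```

Because the reward is the negative of the weighted latency sum, a policy that maximizes expected return is implicitly minimizing the joint computation/communication cost.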

For solving the MDP, the authors employ a Deep Deterministic Policy Gradient (DDPG) agent. The actor network maps the high‑dimensional state vector to continuous actions (the α_i values), while the critic network evaluates the Q‑value of state‑action pairs. Experience replay, target networks, and Ornstein‑Uhlenbeck noise are used to stabilize learning and ensure smooth exploration. By using a continuous‑action DRL algorithm rather than discrete‑action methods (e.g., DQN), SA‑PSKY avoids quantization errors and can fine‑tune thresholds with high precision.
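As a sketch of the exploration mechanism, a standard Ornstein-Uhlenbeck process can be added to the actor's output; the hyperparameter values here are conventional DDPG defaults, not necessarily those used by the authors:

```python
import numpy as np

class OUNoise:
    """Ornstein-Uhlenbeck process: temporally correlated, mean-reverting
    noise for smooth exploration of continuous actions (the alpha_i
    thresholds). Parameters theta/sigma/dt are common defaults."""
    def __init__(self, dim: int, mu: float = 0.0, theta: float = 0.15,
                 sigma: float = 0.2, dt: float = 1e-2, seed: int = 0):
        self.mu, self.theta, self.sigma, self.dt = mu, theta, sigma, dt
        self.rng = np.random.default_rng(seed)
        self.x = np.full(dim, mu)

    def sample(self) -> np.ndarray:
        # dx = theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
        dx = (self.theta * (self.mu - self.x) * self.dt
              + self.sigma * np.sqrt(self.dt)
              * self.rng.standard_normal(self.x.shape))
        self.x = self.x + dx
        return self.x
```

At each step the actor's deterministic action would be perturbed by `sample()` and clipped back into the valid threshold range, e.g. `alpha = np.clip(actor_output + noise.sample(), 0.0, 1.0)`.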

The underlying system model assumes each edge node maintains a count‑based sliding window W_i of recent uncertain objects, each represented by a set of d‑dimensional instances with associated existence probabilities. Instance‑level dominance is defined in the usual Pareto sense, and object‑level dominance probability is computed by aggregating over all instance pairs. A local candidate set S_i is formed by selecting objects whose locally computed skyline probability exceeds the current threshold α_i; these candidates are then transmitted to the cloud broker. The total transmission volume Φ(α) = Σ_i |S_i| and the local computational load are directly linked to the thresholds, making the reward function a natural expression of the trade‑off.
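The object-level skyline probability and the threshold filter described above can be sketched as follows, assuming minimization-style Pareto dominance and per-instance existence probabilities. This is a naive O(n²) pairwise computation for illustration, not the paper's optimized local procedure:

```python
def dominates(a, b):
    """Instance a Pareto-dominates instance b: no worse in every
    dimension and strictly better (smaller) in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def skyline_prob(obj, others):
    """Skyline probability of an uncertain object, aggregated over all
    instance pairs. Each object is a list of
    (instance_vector, existence_probability) pairs."""
    total = 0.0
    for u, p_u in obj:
        survive = 1.0
        for other in others:
            # Probability mass of the other object's instances that dominate u.
            dom_mass = sum(p_v for v, p_v in other if dominates(v, u))
            survive *= (1.0 - dom_mass)
        total += p_u * survive
    return total

def local_candidates(objects, alpha):
    """Candidate set S_i: objects whose locally computed skyline
    probability exceeds the current threshold alpha."""
    return [o for idx, o in enumerate(objects)
            if skyline_prob(o, objects[:idx] + objects[idx + 1:]) > alpha]
```

With this definition, raising alpha shrinks `local_candidates` and hence the transmission volume Φ(α) = Σ_i |S_i|, which is the lever the DRL agent tunes.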

Experimental evaluation is conducted with 5 to 20 simulated edge nodes. Data arrival processes are modeled as Poisson or bursty streams, and uncertainty distributions are varied among normal, skewed, and correlated patterns. Baselines include a fixed threshold (α = 0.5), a heuristic hysteresis‑based adaptive scheme, and an integer‑action reinforcement‑learning approach using Lagrangian optimization. Results show that SA‑PSKY reduces average communication overhead by up to 60% and total end‑to‑end latency by up to 40% compared with the baselines. Notably, the learned policy reacts quickly to sudden workload spikes by raising α to tighten local filtering and curb transmission, and lowers α when the load eases, sparing edge nodes unnecessary local dominance checks. Convergence is achieved within roughly 2,000 training episodes, and performance gains are consistent across all tested data distributions.

The paper’s contributions are threefold: (1) a rigorous mathematical formalization of adaptive PSKY pruning as a continuous MDP, (2) the design of a DDPG‑based agent that provides fine‑grained, real‑time threshold control in a high‑dimensional state space, and (3) comprehensive empirical validation demonstrating substantial reductions in both communication and computation costs relative to static and heuristic methods.

Limitations acknowledged by the authors include reliance on simulation rather than deployment on real edge hardware, and the focus on a single‑broker architecture. Future work is suggested in the areas of real‑world implementation on resource‑constrained devices, energy consumption analysis, extension to multi‑broker/multi‑cloud scenarios, and incorporation of privacy‑preserving mechanisms for state sharing.

Overall, SA‑PSKY represents a significant advancement in adaptive query processing for uncertain data streams, showcasing how continuous deep reinforcement learning can effectively manage the intricate trade‑offs inherent in modern edge‑cloud ecosystems.

