A Comparison of Push and Pull Techniques for Ajax


Ajax applications are designed to have high user interactivity and low user-perceived latency. Real-time dynamic web data such as news headlines, stock tickers, and auction updates need to be propagated to the users as soon as possible. However, Ajax still suffers from the limitations of the Web’s request/response architecture which prevents servers from pushing real-time dynamic web data. Such applications usually use a pull style to obtain the latest updates, where the client actively requests the changes based on a predefined interval. It is possible to overcome this limitation by adopting a push style of interaction where the server broadcasts data when a change occurs on the server side. Both these options have their own trade-offs. This paper explores the fundamental limits of browser-based applications and analyzes push solutions for Ajax technology. It also shows the results of an empirical study comparing push and pull.


💡 Research Summary

The paper “A Comparison of Push and Pull Techniques for Ajax” investigates how real‑time data can be delivered to web users in Ajax‑based applications, contrasting the traditional client‑initiated pull model with a server‑initiated push model implemented via the Comet/Bayeux long‑polling protocol.

Motivation and Background
Ajax (Asynchronous JavaScript and XML) enables highly interactive web pages with low perceived latency, but the underlying HTTP request/response paradigm forces every communication to be started by the client. This restriction makes it difficult to push timely updates such as news headlines, stock ticker changes, auction bids, or chat messages. The authors describe the classic REST style, its scalability benefits, and its inability to support asynchronous server notifications.

Push vs. Pull Techniques
The paper reviews three families of techniques:

  1. HTTP Pull – Clients poll the server at a fixed “time‑to‑refresh” (TTR) interval. While simple and robust, pull generates unnecessary traffic when the data has not changed and can miss intermediate updates when the TTR is longer than the publish interval. Adaptive TTR schemes improve the situation but never achieve perfect freshness.

  2. HTTP Streaming – Two variants are described: page streaming (keeping the original page connection open) and service streaming (using a long‑lived XMLHttpRequest). Both keep a TCP connection alive and push data as soon as it becomes available, but they impose a continuous load on the server and depend heavily on browser implementation details.

  3. Comet / Bayeux Long‑Polling – The Bayeux protocol (JSON‑based) implements a topic‑based publish‑subscribe model. After an initial handshake, the client opens a long‑polling connection that the server holds open until an event occurs or a timeout (default 45 s) expires. When an event arrives, the server immediately sends the payload and the client reconnects. This hybrid approach combines the low latency of push with the simplicity of HTTP.
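The latency difference between the first and third techniques follows directly from who initiates the exchange. The sketch below simulates both client behaviors against a fixed publish schedule; the function names (`pullLatencies`, `pushLatencies`) and the timing numbers are illustrative assumptions, not the paper's Dojo/Cometd code, and real network transit and reconnect gaps are ignored.

```javascript
// Illustrative simulation of pull polling vs. long-polling delivery.
// Events are published by the "server" at the given times (in seconds).

// Pull: the client visits every `ttr` seconds and picks up whatever was
// published since its last visit. Latency = poll time - publish time.
function pullLatencies(publishTimes, ttr, horizon) {
  const latencies = [];
  let delivered = 0;
  for (let t = ttr; t <= horizon; t += ttr) {
    while (delivered < publishTimes.length && publishTimes[delivered] <= t) {
      latencies.push(t - publishTimes[delivered]);
      delivered++;
    }
  }
  return latencies;
}

// Push (long-polling): a request is always parked at the server, so each
// event is forwarded the moment it occurs (latency ~0 in this model).
function pushLatencies(publishTimes) {
  return publishTimes.map(() => 0);
}

// Four updates over a minute, polled with a 15-second TTR:
console.log(pullLatencies([5, 10, 15, 20], 15, 60)); // [10, 5, 0, 10]
console.log(pushLatencies([5, 10, 15, 20]));         // [0, 0, 0, 0]
```

The simulation makes the trade-off concrete: under pull, an update published just after a poll waits almost a full TTR before delivery, while long-polling delivers every update immediately at the cost of a held connection.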

Experimental Design
The authors built a stock‑ticker demo in two variants (push and pull). The push version uses Dojo’s Cometd library and a Jetty server that implements Bayeux long‑polling; the pull version uses Dojo’s bind method with a fixed 15‑second interval. A separate “service provider” publishes a configurable number of stock updates at configurable intervals (5, 10, 15, 50 seconds).

Three independent variables were explored:

  • Concurrent users – 100, 200, 350, 500, 1000 (higher values saturated the server).
  • Publish interval – 5, 10, 15, 50 seconds (the 50‑second interval exceeds the 45‑second timeout, forcing many reconnects).
  • Technique – push (Bayeux long‑polling) vs. pull (fixed 15‑second polling).

For each combination, ten messages were published. The test harness used the Grinder framework with Jython scripts to emulate browsers, Log4J’s SocketServer for logging, TCPDump for packet capture, and the Unix top command for CPU monitoring. The client side ran on the DAS‑3 distributed supercomputer (68 dual‑CPU nodes), while the application server was a single 3 GHz Pentium IV machine running Jetty 6.1.2 with Java NIO (event‑driven I/O).

Results

  • Latency – Push achieved an average end‑to‑end latency of ≤ 0.5 seconds when the publish interval was ≥ 15 seconds. Pull showed latencies of 2 seconds or more under the same conditions, because the client must wait until the next poll.

  • Server Load – Push caused the server CPU to rise from ~70 % (500 users) to 100 % (≈ 1000 users). The long‑polling connections occupy a thread or NIO channel for the whole timeout period, and each event triggers a write operation, leading to a linear increase in CPU usage. Pull kept CPU usage below 30 % even at 1000 users because the server only processes short‑lived GET requests.

  • Network Traffic – Pull generated 3–5× more HTTP packets than push because every client sent a request at every TTR interval regardless of whether the data had changed. Push transmitted data only when an event occurred, but the periodic timeout/reconnect cycle generated short bursts of traffic, especially with the 50‑second publish interval.

  • Scalability – The push approach scaled well up to a few hundred concurrent users; beyond that, the server became saturated and could no longer deliver updates. Pull scaled linearly with the number of users, limited mainly by network bandwidth rather than CPU.
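The traffic asymmetry in the pull case follows from simple request arithmetic. The back-of-the-envelope sketch below uses illustrative numbers (1000 clients, a 150‑second window, a 50‑second publish interval), not the paper's measured packet counts; `pullRequests` and `pushRequests` are hypothetical helpers.

```javascript
// Back-of-the-envelope request counts over an observation window.

// Pull: one GET per client per TTR, whether or not the data changed.
function pullRequests(users, windowSec, ttrSec) {
  return users * Math.floor(windowSec / ttrSec);
}

// Push (long-polling): one reconnect per delivered event, plus the
// reconnects caused by the timeout expiring before any event arrived.
function pushRequests(users, events, timeoutReconnects) {
  return users * (events + timeoutReconnects);
}

// 1000 clients over 150 s; data published every 50 s (3 events);
// pull TTR of 15 s; 45 s long-poll timeout (one idle reconnect per gap):
console.log(pullRequests(1000, 150, 15)); // 10000
console.log(pushRequests(1000, 3, 3));    // 6000
```

In this model pull issues 10,000 requests against push's 6,000, and the gap widens as the data changes less often; when the publish interval drops below the TTR, the comparison tilts the other way because each pushed event costs a reconnect.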

Discussion and Implications
The authors conclude that push (long‑polling) is preferable for applications where low latency and high data freshness are critical (e.g., live stock tickers, chat, auctions) and where the expected number of concurrent users is moderate. However, for large‑scale deployments (thousands of users) the server‑side overhead of maintaining many open connections becomes a bottleneck. In such scenarios a hybrid or adaptive strategy—dynamic TTR, selective push for high‑priority topics, or migration to newer protocols such as WebSocket or Server‑Sent Events—should be considered.
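One such adaptive strategy, a dynamic TTR, can be sketched as a small controller that backs off while the data is idle and resets when updates resume. This is an illustrative heuristic (multiplicative increase with a hard cap), assumed for the sketch rather than taken from the paper; `nextTtr` and its default bounds are hypothetical.

```javascript
// Adaptive TTR sketch: double the poll interval while nothing changes,
// reset to the minimum interval as soon as a change is observed.
function nextTtr(currentTtr, dataChanged, minTtr = 5, maxTtr = 60) {
  if (dataChanged) return minTtr;           // poll fast while data is hot
  return Math.min(currentTtr * 2, maxTtr);  // back off while data is idle
}

let ttr = 5;
for (const changed of [false, false, false, true, false]) {
  ttr = nextTtr(ttr, changed);
  console.log(ttr); // prints 10, 20, 40, 5, 10
}
```

A controller like this trades a little freshness during idle periods for a large reduction in no-change polls, which is exactly the regime where the paper found pull most wasteful.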

The study also highlights the importance of the underlying I/O model. Jetty’s NIO‑based event‑driven architecture mitigated some of the scalability issues compared with a pure thread‑per‑connection model, but the fundamental limitation of HTTP 1.1 long‑polling (connection timeout, per‑connection overhead) remains.

Related Work and Future Directions
The paper surveys related approaches (Comet, Bayeux, HTTP streaming) and positions its contribution as an empirical comparison using a controlled, repeatable testbed. Limitations include the relatively small number of published messages (10 per run) and the focus on a single server implementation. Future work suggested includes evaluating WebSocket‑based push, testing in cloud environments with auto‑scaling, and incorporating user‑experience metrics (QoE) to complement the raw performance numbers.

Conclusion
Push and pull each have distinct trade‑offs in Ajax environments. Push delivers near‑real‑time updates with lower network overhead but incurs higher CPU load and limited scalability due to persistent connections. Pull offers predictable server load and easy scaling but suffers from higher latency and unnecessary traffic. System designers must weigh these factors against application requirements and may adopt hybrid or newer push technologies to achieve the best balance.

