EdgeLens: Deep Learning based Object Detection in Integrated IoT, Fog and Cloud Computing Environments
Data-intensive applications are growing at an increasing rate, and there is a growing need to solve their scalability and performance issues. With the advent of the Cloud computing paradigm, it became possible to harness remote resources to build and deploy such applications. In recent years, a new class of applications and services based on the Internet of Things (IoT) paradigm has emerged that must process large amounts of data in very little time. Among these, surveillance and object detection have gained prime importance, but the cloud alone cannot bring network latencies down to meet their response-time requirements. Fog computing addresses this problem by harnessing resources at the edge of the network alongside remote cloud resources as required. However, there is still a lack of frameworks that successfully integrate sophisticated software and applications, especially deep learning, with fog and cloud computing environments. In this work, we propose a framework to deploy deep learning-based applications in fog-cloud environments, harnessing edge and cloud resources to provide better service quality for such applications. Our proposed framework, called EdgeLens, adapts to application or user requirements to provide a high-accuracy or a low-latency mode of service. We also evaluate the software in terms of accuracy, response time, jitter, network bandwidth and power consumption, and show how EdgeLens adapts to different service requirements.
💡 Research Summary
The paper presents EdgeLens, a novel framework that integrates Internet of Things (IoT) devices, fog computing nodes, and cloud resources to deliver real‑time object detection services powered by deep learning. Recognizing that pure cloud‑based video analytics suffer from prohibitive network latency, the authors leverage the Aneka platform—a .NET‑based service‑oriented middleware—to orchestrate heterogeneous compute resources across the edge and the cloud. Aneka’s dynamic provisioning, load‑balancing, and task‑model APIs enable the system to treat fog nodes (e.g., smartphones, tablets, Raspberry Pi‑class SBCs) and public‑cloud virtual machines (Azure, AWS) as a unified pool of workers.
EdgeLens adopts the YOLOv3 detector, pre‑trained on the COCO dataset, as its core deep‑learning engine. The framework offers two operational modes that can be selected at runtime: a “High‑Accuracy” mode that transmits raw, full‑resolution images to the workers, preserving detection quality (mean Average Precision, mAP ≈ 0.71); and a “Low‑Latency” mode that first down‑samples the image on the master node, thereby cutting network traffic and processing time (average response ≈ 1.2 s, mAP ≈ 0.58). The mode switch influences resource allocation, task parallelism, and the size of data transferred between the gateway and the Aneka master.
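The mode switch described above can be pictured as a configuration object that controls how much a frame is shrunk on the master before dispatch. The sketch below is illustrative only: the class names and the 0.5 down-sampling factor are assumptions, not values from the EdgeLens codebase (the summary reports traffic halving per image, which may also involve compression).

```python
from dataclasses import dataclass

# Hypothetical rendering of EdgeLens's two runtime modes; names and the
# scale factor are assumptions made for illustration.

@dataclass
class ModeConfig:
    name: str
    scale: float  # linear down-sampling factor applied on the master node

HIGH_ACCURACY = ModeConfig("high-accuracy", 1.0)  # full-resolution frames
LOW_LATENCY = ModeConfig("low-latency", 0.5)      # shrink before dispatch

def target_resolution(width: int, height: int, mode: ModeConfig) -> tuple:
    """Resolution a frame is resized to before being shipped to a worker."""
    return (int(width * mode.scale), int(height * mode.scale))

# A 1920x1080 frame keeps its size in high-accuracy mode but shrinks
# in low-latency mode, cutting the bytes sent to each worker.
print(target_resolution(1920, 1080, HIGH_ACCURACY))  # (1920, 1080)
print(target_resolution(1920, 1080, LOW_LATENCY))    # (960, 540)
```

Keeping the resize on the master, rather than on the workers, is what reduces gateway-to-worker traffic in the low-latency path.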
The system architecture comprises four hardware components—IoT cameras, a gateway device (Android smartphone), an Aneka master container, and multiple Aneka worker containers (fog or cloud). Software services are layered as follows: (1) Fabric Services for low‑level resource provisioning, fault tolerance, and performance monitoring; (2) Foundation Services for reservation, billing, and storage management; (3) a Gateway Interface implemented with MIT App Inventor, allowing users to configure the master’s network address and initiate image capture; and (4) the Deep‑Learning Module that invokes a Python‑based YOLO script from C# code running on each worker. Communication between gateway and master uses HTTP POST, while the master‑to‑worker data transfer relies on Aneka’s built‑in FTP‑like protocol.
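The C#-to-Python bridge on each worker amounts to launching the YOLO script as a child process and parsing its output. A minimal sketch of that step is shown below in Python (in EdgeLens the caller is C# code on the Aneka worker); the script name, CLI flags, and JSON output format are hypothetical.

```python
import json
import subprocess
import sys

# Hedged sketch of a worker shelling out to a Python YOLO script.
# "yolo_detect.py" and its --image/--weights flags are assumed names,
# not the actual EdgeLens interface.

def detect_objects(image_path: str,
                   script: str = "yolo_detect.py",
                   weights: str = "yolov3.weights") -> list:
    """Run the detector script on one image and parse its JSON output."""
    cmd = [sys.executable, script, "--image", image_path, "--weights", weights]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    # Assumed contract: the script prints a JSON list of detections,
    # e.g. [{"label": "car", "conf": 0.91}, ...]
    return json.loads(result.stdout)
```

Crossing a process boundary like this adds serialization and startup overhead per task, which is one of the costs the authors attribute to the .NET-centric middleware.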
Experimental evaluation was conducted in a realistic lab setting at the University of Melbourne. The gateway was a Samsung Galaxy S7 (Android 9); the master ran on a Dell XPS 13 (i5‑7200U, 8 GB RAM); a fog worker was a Dell Latitude 5490 (i7‑8650U, 16 GB RAM); and cloud workers were Azure B1s VMs (1 vCPU, 1 GB RAM) deployed in both Australia and Virginia, USA. Video frames were streamed at 10 images per minute. Metrics captured via Microsoft Performance Monitor and Network Monitor 3.4 included CPU utilization, memory usage, network bandwidth, and power consumption.
Results demonstrate that EdgeLens can dynamically balance accuracy and latency. In High‑Accuracy mode, the system achieved mAP ≈ 0.71 with an average end‑to‑end response time of 2.3 seconds, while consuming roughly 1.8 MB per image over the network. In Low‑Latency mode, response time dropped to 1.2 seconds and network traffic halved to 0.9 MB per image, at the cost of a modest mAP reduction to 0.58. Moreover, preprocessing at the fog layer reduced overall data transfer by about 30 %, leading to a 15 % decrease in power draw on edge devices.
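The trade-off above can be sanity-checked with back-of-the-envelope arithmetic, combining the per-image figures with the 10-images-per-minute rate from the experimental setup. All numbers below are taken from the summary; nothing new is measured.

```python
# Reported figures: (mAP, response time in s, network MB per image).
RATE = 10  # images per minute, from the experimental setup

modes = {
    "high-accuracy": {"mAP": 0.71, "response_s": 2.3, "mb_per_image": 1.8},
    "low-latency":   {"mAP": 0.58, "response_s": 1.2, "mb_per_image": 0.9},
}

# Sustained bandwidth per camera stream in each mode.
for name, m in modes.items():
    bandwidth = m["mb_per_image"] * RATE  # MB per minute
    print(f"{name}: {bandwidth:.1f} MB/min, "
          f"response {m['response_s']} s, mAP {m['mAP']}")

# Relative network saving when switching to low-latency mode.
saving = 1 - modes["low-latency"]["mb_per_image"] / modes["high-accuracy"]["mb_per_image"]
print(f"network saving: {saving:.0%}")  # 50%
```

At 10 images per minute the absolute bandwidth is modest either way (18 vs 9 MB/min per stream); the halving matters more as camera counts or frame rates scale up.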
The authors acknowledge several limitations: the exclusive use of YOLOv3 without benchmarking lighter models (e.g., MobileNet‑SSD, YOLO‑Nano); reliance on a .NET‑centric middleware that necessitates C#‑to‑Python bridging, potentially introducing overhead; absence of data encryption, authentication, and privacy safeguards; and evaluation limited to low‑frame‑rate streams (10 images per minute), leaving high‑throughput scenarios untested.
In conclusion, EdgeLens showcases a practical, scalable approach to deploying deep‑learning video analytics across fog‑cloud ecosystems. By abstracting heterogeneous resources through Aneka and offering configurable accuracy‑latency trade‑offs, it paves the way for latency‑sensitive IoT applications such as smart surveillance, traffic monitoring, and industrial inspection. Future work will explore integration of ultra‑lightweight detectors, end‑to‑end security mechanisms, and support for higher frame‑rate streams to broaden the framework’s applicability.