Serverless Edge Computing: Taxonomy and Systematic Literature Review 2025

📌 Key Takeaways

  • Paradigm Convergence: Serverless edge computing merges FaaS event-driven models with edge proximity, enabling sub-10ms latency for real-time IoT and AI inference workloads
  • Cold Start Challenge: WebAssembly (Wasm) runtimes achieve microsecond-scale cold starts on edge hardware, outperforming container-based approaches by orders of magnitude
  • Resource Heterogeneity: Edge nodes range from powerful micro-data centers to constrained IoT gateways, requiring adaptive scheduling algorithms that account for compute, memory, and network variability
  • State Management: Distributed state across edge nodes remains the hardest unsolved problem, with CRDTs, edge-native databases, and function-local caching as emerging solutions
  • Market Growth: The systematic review identifies exponential growth in publications since 2020, with 5G rollout and AI-at-the-edge driving adoption across automotive, manufacturing, and healthcare

The Convergence of Serverless and Edge Computing

The systematic literature review published in February 2025 examines a rapidly maturing field at the intersection of two transformative computing paradigms: serverless architectures and edge computing. This convergence addresses a fundamental limitation of cloud-native serverless platforms — while Function-as-a-Service (FaaS) offerings like AWS Lambda have revolutionized application development by abstracting infrastructure management, they inherently introduce latency by executing functions in centralized data centers potentially hundreds of milliseconds from end users.

Edge computing brings computation closer to data sources — at cellular base stations, enterprise premises, retail locations, and embedded devices. When combined with the serverless model’s event-driven execution, automatic scaling, and pay-per-invocation pricing, the result is a platform paradigm that delivers both developer productivity and ultra-low latency. The review systematically analyzes over 150 papers published between 2018 and 2024, constructing a comprehensive taxonomy that maps the architectural landscape, identifies common design patterns, and highlights open research challenges.

The timing of this review is significant. The global rollout of 5G networks has created the infrastructure backbone that makes edge computing commercially viable at scale. Simultaneously, advances in lightweight runtimes — particularly WebAssembly — have made it practical to execute functions on resource-constrained edge hardware that cannot support traditional container orchestration platforms. These technological enablers, combined with growing demand for real-time AI inference, autonomous vehicle computing, and industrial IoT processing, have driven an exponential increase in both research publications and commercial deployments in the serverless edge space.

For technology leaders evaluating edge strategies, this review provides an essential reference framework. Understanding the taxonomy of approaches, the trade-offs between different architectural decisions, and the maturity of available solutions is critical for making informed investment decisions in a field where vendor marketing often outpaces technical reality.

Taxonomy of Serverless Edge Architectures

The review constructs a multi-dimensional taxonomy that classifies serverless edge computing approaches along several axes. The primary distinction is between edge-augmented cloud serverless — where cloud FaaS platforms extend their execution capabilities to edge locations (as with AWS Lambda@Edge or Cloudflare Workers) — and edge-native serverless — where serverless platforms are designed from the ground up for edge deployment (as with open-source frameworks adapted for edge nodes).

Within these categories, architectural patterns vary along several dimensions. The execution model ranges from container-based approaches (using lightweight containers like gVisor or Firecracker microVMs) to language-runtime isolates (V8 JavaScript isolates as used by Cloudflare Workers) to WebAssembly modules (as used by Fastly Compute and emerging edge platforms). Each approach makes different trade-offs between cold start performance, language support, security isolation, and resource efficiency.

The topology dimension distinguishes between single-tier edge (functions execute only at edge nodes), multi-tier edge-cloud (functions can be placed at edge, fog, or cloud tiers based on requirements), and peer-to-peer edge (functions can migrate between edge nodes without cloud coordination). Multi-tier architectures dominate the literature, reflecting the pragmatic reality that not all workloads benefit from edge execution and that cloud fallback remains necessary for compute-intensive or state-heavy operations.

The triggering model further differentiates approaches: HTTP-triggered edge functions (the dominant commercial model), IoT event-triggered functions (responding to sensor data streams), stream-processing functions (continuous data pipeline stages), and timer-triggered functions (periodic batch processing at edge locations). Each triggering model imposes different requirements on function lifecycle management, warm pool strategies, and networking architecture.

Cold Start Optimization at the Edge

Cold start latency — the time required to initialize a function execution environment when no pre-warmed instance exists — represents the most extensively studied challenge in serverless edge computing. In cloud environments, cold starts typically add 100-500 milliseconds for container-based functions and 1-5 milliseconds for lightweight runtimes. At the edge, where the entire value proposition depends on single-digit millisecond latency, even modest cold start penalties can eliminate the benefits of edge placement.

The review identifies five primary optimization strategies that researchers and platform designers employ: lightweight runtimes, predictive pre-warming, snapshot-and-restore, function caching, and tiered warm pool management. Lightweight runtimes reduce initialization overhead by replacing full container stacks with minimal execution environments. WebAssembly has emerged as particularly promising, with cold start times measured in microseconds rather than milliseconds. Wasm modules are pre-compiled to an intermediate representation that can be instantiated with minimal overhead, and the memory sandbox model provides security isolation without the overhead of OS-level virtualization.

Predictive pre-warming uses historical invocation patterns, time-series analysis, or machine learning models to anticipate function invocations and pre-initialize execution environments before requests arrive. This approach is especially effective for functions with predictable traffic patterns (such as business-hours workloads or periodic IoT data collection cycles) but struggles with sporadic or unpredictable invocations.
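As a concrete illustration, the pre-warming idea can be sketched with a simple exponentially weighted moving average over per-window invocation counts. The `EwmaPrewarmer` class, its `alpha` and `threshold` parameters, and the function names are hypothetical; production systems use richer time-series or ML models.

```python
from collections import defaultdict

class EwmaPrewarmer:
    """Forecast per-window invocation counts with an exponentially
    weighted moving average; pre-warm an instance when the forecast
    crosses a threshold. A minimal sketch, not a production predictor."""

    def __init__(self, alpha=0.3, threshold=1.0):
        self.alpha = alpha          # smoothing factor for recent windows
        self.threshold = threshold  # forecast above this -> pre-warm
        self.forecast = defaultdict(float)

    def observe(self, fn, count):
        """Record the invocation count observed in one time window."""
        prev = self.forecast[fn]
        self.forecast[fn] = self.alpha * count + (1 - self.alpha) * prev

    def should_prewarm(self, fn):
        return self.forecast[fn] >= self.threshold

p = EwmaPrewarmer()
for c in [0, 4, 5, 6]:          # a business-hours ramp-up pattern
    p.observe("thumbnail", c)
p.observe("nightly-batch", 0)   # an idle function stays cold
```

A sporadic function like `nightly-batch` never crosses the threshold, which matches the limitation noted above: EWMA-style predictors work for regular traffic and fail for bursty, unpredictable invocations.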

Snapshot-and-restore mechanisms capture the memory state of an initialized function and restore it on subsequent invocations, bypassing the initialization code path. Technologies like CRIU (Checkpoint/Restore In Userspace) and Firecracker’s snapshot capabilities enable sub-millisecond restoration from pre-captured states, though managing snapshot freshness and memory overhead across many functions on constrained edge nodes introduces its own complexity.
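The effect of snapshot-and-restore can be illustrated at the process level with Python's `pickle`, standing in for the full memory images that CRIU or Firecracker capture. The `cold_init` function and its 50 ms sleep are stand-ins for real initialization work such as loading a model or parsing configuration.

```python
import pickle
import time

def cold_init():
    """Stand-in for expensive one-time initialization."""
    time.sleep(0.05)  # simulate ~50 ms of init work (model load, config parse)
    return {"model": [0.1, 0.2, 0.3], "config": {"threshold": 0.5}}

# Pay the initialization cost once and capture a snapshot of the result.
snapshot = pickle.dumps(cold_init())

def restore():
    """Restore pre-initialized state, bypassing cold_init entirely."""
    return pickle.loads(snapshot)

t0 = time.perf_counter()
state = restore()
restore_ms = (time.perf_counter() - t0) * 1e3  # far below the init cost
```

Real snapshot systems face the complexities the paragraph above notes: the snapshot must be invalidated when code or configuration changes, and storing many per-function memory images competes for scarce edge storage.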

Function caching maintains recently-used function images and their dependencies at edge nodes, avoiding the latency of fetching function code from remote repositories. Hierarchical caching strategies — where popular functions are cached at all edge nodes while less frequent functions are cached at regional aggregation points — optimize the trade-off between cache hit rates and edge storage consumption.
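A minimal sketch of the two-tier lookup described above, assuming a simple LRU cache at each tier and a plain dict standing in for the origin repository; all names and image payloads are hypothetical.

```python
from collections import OrderedDict

class LruCache:
    """Small LRU cache over an OrderedDict."""
    def __init__(self, capacity):
        self.capacity, self.store = capacity, OrderedDict()

    def get(self, key):
        if key not in self.store:
            return None
        self.store.move_to_end(key)         # mark as most recently used
        return self.store[key]

    def put(self, key, value):
        self.store[key] = value
        self.store.move_to_end(key)
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict least recently used

def fetch_image(name, edge, regional, origin):
    """Edge LRU -> regional cache -> origin repository, promoting the
    image into faster tiers on the way back. Returns (image, hit tier)."""
    img = edge.get(name)
    if img is not None:
        return img, "edge"
    img, tier = regional.get(name), "regional"
    if img is None:
        img, tier = origin[name], "origin"  # slow fetch from repository
        regional.put(name, img)
    edge.put(name, img)                     # promote into the edge tier
    return img, tier

edge, regional = LruCache(2), LruCache(4)
origin = {"resize": b"wasm-a", "detect": b"wasm-b", "alert": b"wasm-c"}
tiers = [fetch_image(n, edge, regional, origin)[1]
         for n in ["resize", "resize", "detect", "alert", "resize"]]
```

The final `"resize"` lookup misses the small edge cache (it was evicted) but hits the larger regional tier, which is exactly the trade-off hierarchical caching buys: the regional tier absorbs misses that would otherwise go all the way to the origin.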

Function Placement and Resource Scheduling

Deciding where to execute a function across a heterogeneous edge-cloud infrastructure is an NP-hard optimization problem that the literature addresses through various algorithmic approaches. The function placement problem must balance multiple objectives: minimizing end-to-end latency, respecting resource constraints at each node, meeting data locality requirements, optimizing cost, and ensuring fault tolerance — often with conflicting trade-offs.

Heuristic approaches dominate practical deployments. Latency-aware placement algorithms estimate the end-to-end latency for each candidate node (including network transit time, queuing delay, and execution time) and place functions at the node that minimizes total latency while respecting capacity constraints. These heuristics are computationally tractable but may miss globally optimal solutions, particularly in highly dynamic environments where network conditions and load patterns change rapidly.
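The greedy heuristic can be sketched as follows, under an assumed additive latency model (network RTT + queuing delay + execution time) and a single memory-capacity constraint; the node names and figures are illustrative, not from the review.

```python
def place(fn_demand, nodes):
    """Greedy latency-aware placement: choose the feasible node with the
    lowest estimated end-to-end latency. `nodes` maps node name ->
    {rtt_ms, queue_ms, exec_ms, free_mem_mb}. Illustrative cost model."""
    best, best_latency = None, float("inf")
    for name, n in nodes.items():
        if n["free_mem_mb"] < fn_demand["mem_mb"]:
            continue                        # capacity constraint violated
        latency = n["rtt_ms"] + n["queue_ms"] + n["exec_ms"]
        if latency < best_latency:
            best, best_latency = name, latency
    return best, best_latency

nodes = {
    "edge-a": {"rtt_ms": 2,  "queue_ms": 1, "exec_ms": 8, "free_mem_mb": 256},
    "edge-b": {"rtt_ms": 4,  "queue_ms": 0, "exec_ms": 8, "free_mem_mb": 64},
    "cloud":  {"rtt_ms": 60, "queue_ms": 0, "exec_ms": 5, "free_mem_mb": 4096},
}
node, latency = place({"mem_mb": 128}, nodes)
```

Here `edge-b` is closer than the cloud but infeasible on memory, so the function lands on `edge-a`; the cloud's faster execution cannot compensate for its 60 ms round trip. This single-function greedy view also shows the heuristic's blind spot: it ignores how one placement decision changes queuing delays for the next.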

Reinforcement learning approaches have gained attention as a means to learn placement policies that adapt to dynamic conditions. Deep Q-Networks and policy gradient methods can learn to optimize placement decisions based on observed system state, adapting to changing traffic patterns, node failures, and network congestion without requiring explicit modeling of these dynamics. However, the training overhead, sample efficiency, and generalization challenges of RL-based approaches limit their practical deployment to date.

The review also identifies function chain placement as an important extension of the basic problem. Many edge applications involve chains of functions that must execute in sequence — for example, a video analytics pipeline with frame extraction, object detection, classification, and alert generation stages. Placing these function chains requires considering inter-function communication latency alongside individual function execution requirements, adding another dimension of complexity to the scheduling problem.
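For a strictly sequential chain, placement across a small set of tiers can be solved exactly with a short dynamic program, since the cost of each stage depends only on its tier and its predecessor's tier. The cost model below (per-stage execution time plus inter-tier transfer latency, all in milliseconds) is an illustrative sketch, not the review's algorithm.

```python
def place_chain(chain, tiers, transfer_ms):
    """DP over a sequential function chain. chain[i] maps tier ->
    execution time for stage i; transfer_ms[a][b] is link latency
    between tiers. Returns (total latency, per-stage tier assignment)."""
    cost = {t: chain[0][t] for t in tiers}        # best cost ending on tier t
    back = [{t: None for t in tiers}]             # back-pointers per stage
    for stage in chain[1:]:
        new_cost, ptr = {}, {}
        for t in tiers:
            prev = min(tiers, key=lambda p: cost[p] + transfer_ms[p][t])
            new_cost[t] = cost[prev] + transfer_ms[prev][t] + stage[t]
            ptr[t] = prev
        back.append(ptr)
        cost = new_cost
    last = min(cost, key=cost.get)                # cheapest final tier
    path = [last]
    for ptr in reversed(back[1:]):                # walk back-pointers
        path.append(ptr[path[-1]])
    return cost[last], list(reversed(path))

tiers = ["edge", "cloud"]
transfer_ms = {"edge": {"edge": 1, "cloud": 10},
               "cloud": {"edge": 10, "cloud": 1}}
chain = [{"edge": 2, "cloud": 12},   # frame extraction (data born at edge)
         {"edge": 30, "cloud": 5},   # object detection (GPU-bound)
         {"edge": 1,  "cloud": 1}]   # alert generation
total, plan = place_chain(chain, tiers, transfer_ms)
```

The optimal plan splits the video-analytics example from the paragraph above: frame extraction stays at the edge where the data originates, while the GPU-bound detection stage and its successor move to the cloud, paying the uplink transfer once.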

State Management in Distributed Edge Functions

State management emerges from the review as the most challenging unsolved problem in serverless edge computing. The serverless model’s foundational assumption — that functions are stateless — conflicts with the reality that many edge applications require access to shared state. Session information, ML model parameters, accumulated sensor readings, and application context must be accessible to functions executing across distributed edge nodes with potentially intermittent connectivity.

Several approaches address this challenge. Edge-native key-value stores like Cloudflare Durable Objects and Fastly KV provide globally distributed, eventually consistent storage accessible from edge functions. These services sacrifice strong consistency for low-latency access, making them suitable for applications that can tolerate brief periods of stale data. Conflict-free Replicated Data Types (CRDTs) provide a theoretical foundation for distributed state that guarantees eventual consistency without coordination, and several edge platforms are exploring CRDT-based state primitives.
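The simplest CRDT, a grow-only counter, shows why coordination-free convergence works: each replica increments only its own slot, and merge is an element-wise maximum, which is commutative, associative, and idempotent. A minimal sketch, with hypothetical node IDs:

```python
class GCounter:
    """Grow-only counter CRDT: replicas converge to the same value
    regardless of merge order or duplicated deliveries."""

    def __init__(self, node_id):
        self.node_id, self.counts = node_id, {}

    def increment(self, n=1):
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + n

    def merge(self, other):
        """Element-wise max: safe to apply in any order, any number of times."""
        for node, c in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), c)

    @property
    def value(self):
        return sum(self.counts.values())

a, b = GCounter("edge-a"), GCounter("edge-b")
a.increment(3)
b.increment(2)
a.merge(b)
b.merge(a)      # replicas can sync in either order, or repeatedly
```

Both replicas converge to the same total without any coordination round. Richer CRDTs (sets, registers, maps) follow the same pattern but with more subtle merge functions, which is why edge platforms exposing CRDT-based primitives remain an active area rather than a solved one.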

Function-local caching with write-through to a backing store provides the fastest state access for read-heavy workloads, though cache invalidation across distributed edge nodes introduces the classic distributed systems challenge of maintaining coherence without prohibitive coordination overhead. State migration — transferring function state from one edge node to another as users move or load patterns shift — enables stateful edge applications but requires careful handling of migration latency and consistency during the transfer window.
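A write-through variant can be sketched as below, with a plain dict standing in for the remote backing store; cross-node invalidation, the hard part identified above, is deliberately out of scope for this sketch.

```python
class WriteThroughCache:
    """Function-local cache with write-through: reads are served from
    local memory when possible; writes update both the local copy and
    the backing store synchronously, so the store is never stale."""

    def __init__(self, backing):
        self.backing, self.local = backing, {}
        self.hits = self.misses = 0

    def get(self, key):
        if key in self.local:
            self.hits += 1
            return self.local[key]
        self.misses += 1
        value = self.backing.get(key)   # slow remote read on a miss
        if value is not None:
            self.local[key] = value
        return value

    def put(self, key, value):
        self.local[key] = value         # fast local update
        self.backing[key] = value       # synchronous write-through

store = {"session:42": {"user": "ada"}}
cache = WriteThroughCache(store)
cache.get("session:42")                 # miss: populates the local copy
cache.get("session:42")                 # hit: served locally
cache.put("session:42", {"user": "ada", "cart": 3})
```

Note what this sketch cannot do: another edge node holding its own local copy of `session:42` will keep serving the stale value until some invalidation or lease mechanism tells it otherwise, which is precisely the coherence problem the paragraph above describes.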

Security and Isolation Challenges

Multi-tenant execution on shared edge infrastructure demands robust security isolation, yet the resource constraints of edge nodes limit the isolation mechanisms available. The review categorizes security challenges into execution isolation (preventing one tenant’s function from accessing another’s data or affecting another’s performance), data protection (securing data at rest, in transit, and during processing at edge locations that may be physically less secure than cloud data centers), and attestation (verifying that edge nodes and their execution environments have not been tampered with).

WebAssembly’s memory sandbox model provides strong execution isolation with minimal overhead — each Wasm module operates within a linear memory space that is bounds-checked at every access, preventing unauthorized memory reads or writes without requiring hardware virtualization support. This makes Wasm particularly attractive for edge environments where hardware-assisted virtualization (Intel VT-x, ARM Virtualization Extensions) may not be available on all devices.

Confidential computing technologies — including Intel SGX, ARM TrustZone, and AMD SEV — offer hardware-based protection for sensitive workloads at edge nodes. These technologies can protect function execution and data even from compromised edge infrastructure operators, addressing the trust challenge inherent in deploying computation to potentially untrusted edge locations. However, performance overhead and limited enclave sizes remain practical constraints.

Platform Landscape: Commercial and Open Source

The commercial platform landscape has matured significantly since 2020. AWS Lambda@Edge and CloudFront Functions represent Amazon’s two-tier approach — Lambda@Edge for full Node.js/Python function execution at regional edge locations, and CloudFront Functions for lightweight JavaScript processing at CDN points of presence with sub-millisecond cold starts. Cloudflare Workers leverages V8 JavaScript isolates to execute functions across Cloudflare’s global network of 300+ locations, offering zero cold start for pre-loaded scripts.

Fastly Compute (formerly Compute@Edge) has bet on WebAssembly as its execution runtime, enabling functions written in Rust, Go, JavaScript, and other languages that compile to Wasm. This approach delivers cold start times under 50 microseconds — a significant advantage for latency-sensitive workloads. Azure IoT Edge integrates Azure Functions with edge device management, targeting industrial IoT scenarios where functions process sensor data locally before selectively uploading results to the cloud.

On the open-source side, OpenFaaS, Knative, and Apache OpenWhisk provide Kubernetes-native serverless frameworks that organizations can deploy on their own edge infrastructure. These platforms offer greater control over function placement, data residency, and hardware utilization but require significant operational expertise. The review notes growing interest in Kubernetes at the edge via distributions like K3s and MicroK8s, which reduce Kubernetes’ resource requirements sufficiently for deployment on edge nodes with as little as 512MB of RAM.

Industry Applications and Use Cases

The systematic review identifies several industry verticals where serverless edge computing has moved from research to production deployment. Autonomous vehicles require real-time inference on sensor data with latency budgets measured in single-digit milliseconds — orders of magnitude beyond what cloud-based processing can deliver. Edge serverless platforms enable deployment of ML inference functions at roadside infrastructure units, processing LIDAR and camera data from multiple vehicles simultaneously.

Industrial IoT and smart manufacturing leverage edge functions for real-time quality control, predictive maintenance, and production optimization. Functions triggered by sensor data streams can detect anomalies, classify defects, and trigger alerts within milliseconds of data acquisition, enabling closed-loop control systems that respond faster than human operators. The serverless model’s automatic scaling handles the variable throughput of manufacturing processes without requiring dedicated compute infrastructure for peak loads.

Augmented and virtual reality applications demand motion-to-photon latency under 20 milliseconds to prevent user discomfort, making cloud processing impractical for computationally intensive operations like spatial mapping, object recognition, and scene rendering. Edge serverless platforms can offload these computations from mobile devices to nearby edge nodes, extending battery life while maintaining the responsiveness that immersive experiences require.

Healthcare and telemedicine applications benefit from edge processing for real-time patient monitoring, medical image analysis, and clinical decision support. Data privacy regulations often require processing sensitive health data locally rather than transmitting it to cloud data centers, making edge deployment not just a performance optimization but a regulatory compliance requirement.

Research Gaps and Future Directions

The systematic review concludes by identifying several critical research gaps that will shape the field’s evolution. Federated serverless computing — enabling functions to seamlessly execute across multiple independent edge providers — remains largely unexplored. Current platforms operate within single provider boundaries, but real-world deployments will increasingly span multiple edge operators, cloud providers, and enterprise-owned infrastructure.

AI-native edge serverless — platforms designed specifically for ML inference and training at the edge — represents a growing research frontier. Current platforms treat ML workloads as general functions, missing optimization opportunities specific to neural network inference such as model partitioning across edge tiers, adaptive model compression based on device capabilities, and federated learning coordination.

Energy-aware scheduling becomes critical as edge computing scales to millions of nodes, many powered by batteries or renewable energy sources with variable availability. Scheduling algorithms that optimize for energy consumption alongside latency and throughput could significantly reduce the environmental footprint of edge computing while extending battery life for mobile edge devices.

The review also highlights the need for standardized benchmarking. Without common benchmarks, comparing the performance of different platforms, architectures, and optimization techniques remains difficult. Efforts like the SPEC cloud and serverless benchmarking initiatives need extension to cover edge-specific scenarios including heterogeneous hardware, intermittent connectivity, and multi-tier processing chains.

Frequently Asked Questions

What is serverless edge computing and how does it differ from cloud serverless?

Serverless edge computing extends the Function-as-a-Service (FaaS) model to edge infrastructure located closer to end users and IoT devices. Unlike cloud serverless where functions run in centralized data centers, edge serverless executes code on distributed nodes at network edges, reducing latency from hundreds of milliseconds to single-digit milliseconds. This is critical for real-time applications like autonomous vehicles, augmented reality, and industrial automation where cloud round-trip delays are unacceptable.

What are the main challenges in serverless edge computing?

Key challenges include cold start latency (function initialization delays amplified on resource-constrained edge hardware), heterogeneous resource management across diverse edge devices, function placement and scheduling across distributed nodes, state management without centralized databases, security isolation in multi-tenant edge environments, and maintaining consistency across geographically distributed function instances.

How does cold start optimization work in edge serverless platforms?

Cold start optimization in edge environments uses techniques including lightweight container runtimes (like WebAssembly/Wasm), function pre-warming based on predictive models, snapshot-and-restore mechanisms, function caching at edge nodes, and tiered warm pool management. WebAssembly has emerged as particularly promising for edge FaaS due to near-instant startup times (microseconds vs. milliseconds for containers) and minimal memory footprint.

What are the key platforms for serverless edge computing in 2025?

Major platforms include AWS Lambda@Edge and CloudFront Functions, Cloudflare Workers (V8 isolates), Fastly Compute (WebAssembly), Azure IoT Edge with Functions, Google Distributed Cloud Edge, Akamai EdgeWorkers, and open-source frameworks like OpenFaaS, Knative, and Apache OpenWhisk adapted for edge deployment. Each platform makes different trade-offs between runtime support, cold start performance, and edge node management.

What industries benefit most from serverless edge computing?

Industries with stringent latency requirements benefit most: autonomous vehicles (sub-10ms inference), industrial IoT and manufacturing (real-time quality control), augmented/virtual reality (motion-to-photon latency), healthcare (real-time patient monitoring), smart cities (traffic and environmental monitoring), retail (in-store analytics), and telecommunications (5G network function virtualization). The common thread is processing data where it is generated rather than transmitting it to distant cloud data centers.
