The rapid growth of edge devices has driven the demand for deploying artificial intelligence (AI) at the edge, giving rise to Tiny Machine Learning (TinyML) and its evolving counterpart, Tiny Deep Learning (TinyDL). While TinyML initially focused on enabling simple inference tasks on microcontrollers, the emergence of TinyDL marks a paradigm shift toward deploying deep learning models on severely resource-constrained hardware. This survey presents a comprehensive overview of the transition from TinyML to TinyDL, encompassing architectural innovations, hardware platforms, model optimization techniques, and software toolchains. We analyze state-of-the-art methods in quantization, pruning, and neural architecture search (NAS), and examine hardware trends from MCUs to dedicated neural accelerators. Furthermore, we categorize software deployment frameworks, compilers, and AutoML tools enabling practical on-device learning. Applications across domains such as computer vision, audio recognition, healthcare, and industrial monitoring are reviewed to illustrate the real-world impact of TinyDL. Finally, we identify emerging directions including neuromorphic computing, federated TinyDL, edge-native foundation models, and domain-specific co-design approaches. This survey aims to serve as a foundational resource for researchers and practitioners, offering a holistic view of the ecosystem and laying the groundwork for future advancements in edge AI.
Keywords: Computing methodologies; Machine learning; Deep learning; Tiny Machine Learning (TinyML); Tiny Deep Learning (TinyDL); Edge computing; Model optimization; Neural architecture search; Quantization; Pruning; Hardware acceleration; Embedded systems
TinyML to TinyDL Evolution
The rapid proliferation of edge devices has catalyzed a paradigm shift in artificial intelligence deployment, giving rise to Tiny Machine Learning (TinyML) and its advanced evolution, Tiny Deep Learning (TinyDL). This transition represents a fundamental change from simple inference tasks on microcontrollers to sophisticated deep learning capabilities on severely resource-constrained hardware.
TinyML initially emerged to enable basic machine learning inference on microcontrollers with limited memory (typically under 1 MB) and modest compute. However, the growing demand for more sophisticated AI capabilities at the edge has driven the development of TinyDL, which focuses on deploying complex deep learning models on resource-constrained devices while preserving acceptable accuracy, latency, and energy efficiency.
Figure 1: TinyML Pipeline.
🌱 Key Technological Innovations
Model Optimization Techniques
The success of TinyDL relies heavily on sophisticated model optimization techniques that enable deep learning models to operate within the strict constraints of edge devices:
Quantization
Post-training Quantization: Converting a pre-trained model to lower precision (8-bit, 4-bit, or binary); a converter sketch follows this list
Quantization-Aware Training (QAT): Training models with quantization constraints from the beginning
Mixed-Precision Quantization: Using different precision levels for different layers
Dynamic Quantization: Runtime precision adjustment based on computational requirements
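Of these techniques, post-training quantization is usually the fastest route onto a microcontroller. The sketch below uses TensorFlow Lite's converter to produce a full-integer (int8) model; the SavedModel directory, input shape, and random calibration data are illustrative placeholders, and a real workflow would feed genuine samples from the training distribution.

```python
import numpy as np
import tensorflow as tf

SAVED_MODEL_DIR = "my_tiny_model"  # hypothetical SavedModel directory
INPUT_SHAPE = (1, 96, 96, 1)       # assumed input shape (e.g., a grayscale frame)

converter = tf.lite.TFLiteConverter.from_saved_model(SAVED_MODEL_DIR)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

def representative_data_gen():
    # Calibration data lets the converter pick quantization scales; random
    # samples stand in here for real inputs from the training set.
    for _ in range(100):
        yield [np.random.rand(*INPUT_SHAPE).astype(np.float32)]

converter.representative_dataset = representative_data_gen
# Restrict to int8 kernels, as many MCU runtimes require fully integer graphs.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("my_tiny_model_int8.tflite", "wb") as f:
    f.write(tflite_model)
```

A conversion like this typically cuts weight storage by about 4x relative to float32, which is often the difference between fitting and not fitting in sub-megabyte flash.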
Pruning
Structured Pruning: Removing entire channels, filters, or layers
Unstructured Pruning: Removing individual weights based on importance; both pruning styles are sketched after this list
Iterative Pruning: Gradual removal of parameters during training
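As a concrete sketch of the structured/unstructured distinction, the snippet below applies PyTorch's torch.nn.utils.prune to a toy convolutional layer; the layer shape and pruning ratios are illustrative assumptions, not values drawn from the survey.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# A toy layer standing in for one layer of a real model.
conv = nn.Conv2d(in_channels=8, out_channels=16, kernel_size=3)

# Unstructured: zero the 50% of individual weights with smallest L1 magnitude.
prune.l1_unstructured(conv, name="weight", amount=0.5)

# Structured: prune the 25% of output channels (dim=0) with the smallest
# L2 norm, eliminating whole filters rather than scattered weights.
prune.ln_structured(conv, name="weight", amount=0.25, n=2, dim=0)

# Fold the accumulated masks into the weight tensor permanently.
prune.remove(conv, "weight")

sparsity = float((conv.weight == 0).sum()) / conv.weight.numel()
print(f"weight sparsity: {sparsity:.2%}")
```

Structured pruning is generally friendlier to MCU deployment, since removing whole filters shrinks dense tensor shapes directly, whereas unstructured sparsity pays off only if the runtime exploits sparse kernels.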
🌍 Application Domains
TinyDL has demonstrated remarkable versatility across diverse application domains, bringing intelligent capabilities to contexts long considered out of reach for embedded hardware:
Computer Vision
Object Detection: Real-time detection on edge cameras; an inference sketch follows this list
Face Recognition: Privacy-preserving biometric systems
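As a rough illustration of the deployment side, the sketch below runs the int8 model from the earlier quantization example with TensorFlow Lite's Python interpreter; on an actual microcontroller the same .tflite file would run under TensorFlow Lite Micro in C++, but the desktop interpreter is a convenient way to validate behavior first. The model filename and random input frame are assumptions carried over from that sketch.

```python
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="my_tiny_model_int8.tflite")
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# A random frame standing in for a camera capture; a real pipeline would
# quantize pixels using the scale/zero-point stored in inp["quantization"].
frame = np.random.randint(-128, 128, size=inp["shape"], dtype=np.int8)

interpreter.set_tensor(inp["index"], frame)
interpreter.invoke()
scores = interpreter.get_tensor(out["index"])
print("predicted class:", int(np.argmax(scores)))
```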
This comprehensive survey provides a foundational resource for researchers and practitioners working in the rapidly evolving field of edge AI.
The survey addresses critical challenges in deploying AI at the edge, including:
Resource Constraints: Memory, computational power, and energy limitations, illustrated in the sketch after this list
Performance Optimization: Balancing accuracy, latency, and efficiency
Deployment Complexity: Software toolchains and hardware integration
Scalability: Supporting diverse edge device types and applications
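To make the resource constraint concrete, a back-of-envelope calculation shows why quantization is usually non-negotiable on sub-megabyte devices; the 250k-parameter network here is a hypothetical example, and the estimate covers weights only, ignoring activations, the runtime's tensor arena, and operator code.

```python
def weight_footprint_kib(num_params: int, bits_per_weight: int) -> float:
    """Flash needed to store the weights alone."""
    return num_params * bits_per_weight / 8 / 1024

for bits in (32, 8, 4, 1):  # float32, int8, 4-bit, binary
    print(f"{bits:>2}-bit weights: {weight_footprint_kib(250_000, bits):7.1f} KiB")
```

At float32 this hypothetical model needs roughly 977 KiB for weights alone, already brushing against a 1 MB flash budget; int8 brings it to about 244 KiB, leaving room for code and buffers.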
The research community has shown increasing interest in TinyDL as a solution for bringing AI capabilities to resource-constrained environments. This survey serves as a comprehensive guide for understanding the current state of the field and identifying future research directions.
The work has fostered discussions about the democratization of AI, enabling sophisticated machine learning capabilities on devices that were previously considered too resource-constrained for such applications. This has implications for privacy, accessibility, and the broader adoption of AI technologies in everyday devices.
🔮 Future Challenges & Opportunities
Technical Challenges
Model Complexity vs. Resource Constraints: Balancing sophisticated models with limited hardware
Energy Efficiency: Minimizing power consumption for battery-powered devices
Real-time Performance: Meeting strict latency requirements for time-critical applications
Model Robustness: Ensuring reliable performance in diverse environmental conditions
Research Opportunities
Novel Architectures: Designing networks specifically for edge deployment
Advanced Optimization: Developing more efficient compression and quantization techniques
Hardware Innovation: Creating specialized accelerators for TinyDL workloads
Software Tools: Building comprehensive development and deployment frameworks
The survey provides a roadmap for future research and development in TinyDL, highlighting the potential for transformative impact across industries and applications.