Dimensional Constraints and Fundamental Limits of Neural Network Explainability: A Mathematical Framework and Analysis
9 Pages · Posted: 6 Mar 2025
Date Written: January 13, 2025
Abstract
Neural networks' remarkable performance comes with a fundamental challenge: interpretability. We establish theoretical bounds on neural network explainability through a mathematical framework integrating information theory and geometric analysis. Our key results prove that interpretability is bounded by the compression ratio γ = m/d, with feature complexity growing as Ω(m²/d). Our analysis reveals four fundamental limitations: information bottleneck constraints from dimensional compression (Tishby et al., 2000), feature interaction multiplicities in deep networks (Chen et al., 2020), combinatorial growth in attribution complexity (Lundberg and Lee, 2017), and representation capacity limits (Cohen et al., 2019). For attention mechanisms, we show that information flow is bounded by min{H(h_in), d_k log(L)}. These theoretical constraints suggest interpretability research should focus on understanding essential feature interactions within proven bounds rather than pursuing exhaustive network interpretation.
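For readability, the bounds stated in the abstract can be collected in display form as below. This is only a restatement of the quantities named above; the symbol readings (m as hidden-layer width, d as input dimension, d_k as attention key dimension, L as sequence length, and h_in as the attention input representation) are assumptions for illustration rather than definitions drawn from the paper itself.

\begin{align}
  \gamma &= \frac{m}{d}
    && \text{compression ratio} \\
  C_{\text{feat}} &\in \Omega\!\left(\frac{m^{2}}{d}\right)
    && \text{feature-interaction complexity} \\
  I_{\text{attn}} &\le \min\bigl\{\, H(h_{\text{in}}),\; d_k \log L \,\bigr\}
    && \text{attention information flow}
\end{align}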
Keywords: Explainability, Compression Ratio, Information Bottleneck, Feature Interactions, Attention Mechanisms, Artificial Neural Networks
JEL Classification: C63, C45, C02, D83, O33, L86, D87