Predicting Consumer In-Store Purchase through Real-Time Video Analytics: An Advanced Computer Vision and Deep Learning Approach
50 Pages. Posted: 25 Jul 2023. Last revised: 6 Nov 2024
Date Written: July 18, 2023
Abstract
This study introduces a novel, theory-driven video analytics framework to predict purchase decisions in offline retail settings using consumer shopping video data. Our framework addresses four key challenges in offline consumer purchase prediction: (1) capturing real-time behavior, (2) enabling scalability and automation, (3) integrating multi-dimensional data, and (4) preserving the organic nature of consumer behavior without disrupting the shopping experience. To accomplish this, we combine Person Re-identification (Re-ID) technology, which tracks individuals across multiple cameras, with GPS-like trajectory reconstruction, Vision-Language Models (VLMs), and pose estimation to extract theory-driven, real-time shopping behavior features from video data. Our feature set captures a comprehensive range of real-time spatial-temporal trajectory details, including movement speed and path complexity; product interaction features, such as physical touch, item pickup, and visual engagement; body pose and movement indicators, like hand positioning and head orientation; and facial dynamics and eye gaze—offering a holistic perspective on in-store behavior and decision-making. Using deep learning models, specifically transformers, our framework predicts consumer purchase decisions from real-time video features. Extensive experiments demonstrate that it significantly outperforms benchmark models, proving the predictive strength of real-time video data for offline purchase forecasting. We also conduct interpretability analyses to reveal key factors driving model performance, offering marketers actionable insights to refine strategies. To showcase practical applications, we demonstrate various decision-support use cases, including consumer segmentation and real-time intent analysis, which distinguish patterns between purchasers and non-purchasers throughout the shopping journey. 
Additionally, our framework enables personalized, real-time targeting, with simulations showing a 15.8% profit increase over non-targeted approaches and a 7.51% gain over static targeting strategies. Overall, our proposed framework equips retailers with a powerful tool for predicting real-time purchase decisions and enhancing offline marketing effectiveness.
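The abstract names movement speed and path complexity among the spatial-temporal trajectory features but does not define them. As a rough illustration only, the sketch below computes both from a reconstructed trajectory. It assumes a trajectory is a time-ordered list of (seconds, x, y) points in metres, and it assumes path complexity is measured as tortuosity (path length divided by straight-line displacement). The function name `trajectory_features` and both definitions are hypothetical and are not taken from the paper.

```python
import math


def trajectory_features(points):
    """Compute illustrative trajectory features.

    points: time-ordered list of (t_seconds, x_metres, y_metres).
    Definitions here are assumptions for illustration, not the paper's.
    """
    if len(points) < 2:
        raise ValueError("need at least two trajectory points")

    # Total distance walked: sum of segment lengths between consecutive points.
    path_length = sum(
        math.dist(a[1:], b[1:]) for a, b in zip(points, points[1:])
    )

    # Elapsed time and straight-line distance from first to last point.
    duration = points[-1][0] - points[0][0]
    displacement = math.dist(points[0][1:], points[-1][1:])

    # Mean speed over the whole trajectory (0.0 if no time elapsed).
    mean_speed = path_length / duration if duration > 0 else 0.0

    # Tortuosity: 1.0 for a straight walk, larger for winding paths,
    # infinite if the shopper ends where they started.
    complexity = path_length / displacement if displacement > 0 else math.inf

    return {
        "path_length": path_length,
        "mean_speed": mean_speed,
        "path_complexity": complexity,
    }


if __name__ == "__main__":
    walk = [(0.0, 0.0, 0.0), (1.0, 3.0, 0.0), (2.0, 3.0, 4.0)]
    print(trajectory_features(walk))
```

Under these assumptions, a shopper who walks straight to a shelf has a complexity near 1, while one who browses back and forth has a higher value. A real pipeline would likely compute such features over sliding time windows rather than once per visit.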
Keywords: consumer purchase prediction, deep learning, computer vision, video analytics, vision-language model, pose estimation, targeted marketing, offline retail, person re-identification