Adventures in Demand Analysis Using AI
This paper advances empirical demand analysis by integrating multimodal product representations derived from artificial intelligence (AI). Using a detailed dataset of toy cars on Amazon.com, we combine text descriptions, images, and tabular covariates to represent each product using transformer-based embedding models. These embeddings capture nuanced attributes, such as quality, branding, and visual characteristics, that traditional methods often struggle to summarize. Moreover, we fine-tune these embeddings for causal inference tasks. We show that the resulting embeddings substantially improve the predictive accuracy of sales ranks and prices and that they lead to more credible causal estimates of price elasticity. Notably, we uncover strong heterogeneity in price elasticity driven by these product-specific features. Our findings illustrate that AI-driven representations can enrich and modernize empirical demand analysis. The insights generated may also prove valuable for applied causal inference more broadly.
💡 Research Summary
The paper “Adventures in Demand Analysis Using AI” presents a novel empirical framework that integrates multimodal product representations derived from state‑of‑the‑art artificial intelligence models into the estimation of demand and price elasticity. Using a rich panel of 7,226 toy‑car listings from Amazon.com collected over twelve four‑week periods (March 2023 – January 2024), the authors construct a dataset that includes time‑varying price and sales‑rank information, as well as static textual descriptions, product images, and a set of tabular attributes (review count, rating, lightning‑deal flag, fulfillment method, sub‑category, etc.). Because Amazon does not publish actual quantities sold, the authors adopt the inverse of the sales rank as a proxy for quantity, justified by a Pareto‑distribution assumption that links order statistics of sales to rank.
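The rank-to-quantity mapping can be made concrete with a minimal sketch. Under a Pareto tail, sales are approximately proportional to rank raised to a negative power, so log quantity is linear in log rank. The shape parameter `theta` below is a hypothetical placeholder, not the paper's calibrated value:

```python
import numpy as np

def log_quantity_proxy(sales_rank, theta=1.0):
    """Proxy for log sales under a Pareto tail: quantity ∝ rank^(-theta).

    `theta` is an illustrative shape parameter; the paper's exact
    calibration is not reproduced here.
    """
    rank = np.asarray(sales_rank, dtype=float)
    return -theta * np.log(rank)

# A lower (better) sales rank maps to a higher log-quantity proxy.
out = log_quantity_proxy([1, 10, 100])
print(out)  # strictly decreasing in rank
```

Because only the log-linear relationship matters for elasticity regressions, the unknown proportionality constant can be absorbed into an intercept.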
To transform the heterogeneous raw data into a unified numeric representation, the study leverages transformer‑based encoders: RoBERTa/LLaMA for text, BEiT for images, and SAINT for tabular fields. Each encoder yields a 768‑dimensional embedding; the three modalities are concatenated and then projected via a Johnson–Lindenstrauss transform to a 256‑dimensional space, centered, and L2‑normalized onto a hypersphere. This “multimodal embedding” is the core feature set used in all downstream tasks.
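The post-processing steps described above can be sketched with random arrays standing in for the real encoder outputs (a toy illustration, not the authors' pipeline; the Gaussian matrix below is one standard way to realize a Johnson–Lindenstrauss projection):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the per-modality encoder outputs (768 dims each).
n_products = 5
text_emb  = rng.normal(size=(n_products, 768))   # e.g. RoBERTa/LLaMA
image_emb = rng.normal(size=(n_products, 768))   # e.g. BEiT
tab_emb   = rng.normal(size=(n_products, 768))   # e.g. SAINT

# 1) Concatenate the three modalities.
x = np.concatenate([text_emb, image_emb, tab_emb], axis=1)  # (n, 2304)

# 2) Johnson–Lindenstrauss: random Gaussian projection to 256 dims.
d_out = 256
proj = rng.normal(size=(x.shape[1], d_out)) / np.sqrt(d_out)
z = x @ proj

# 3) Center across products, then L2-normalize each row onto the sphere.
z = z - z.mean(axis=0, keepdims=True)
z = z / np.linalg.norm(z, axis=1, keepdims=True)

print(z.shape)                     # (5, 256)
print(np.linalg.norm(z, axis=1))  # every row has unit norm
```

The random projection approximately preserves pairwise distances while cutting dimensionality ninefold, which keeps the downstream regressions tractable.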
The authors first evaluate the embeddings qualitatively (k‑means clustering, visual inspection of principal‑component projections) and quantitatively (out‑of‑sample prediction of price and quantity signals). The multimodal model outperforms a baseline that uses only tabular covariates: root‑mean‑square error drops by roughly 15 % and the out‑of‑sample R² rises from 0.68 to 0.81. Adding image embeddings notably improves the separation of clusters that are indistinguishable when only text is used, indicating that visual cues capture brand, design, and perceived quality dimensions that matter for demand.
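The quantitative comparison amounts to fitting predictors on a training split and scoring out-of-sample R² on a held-out split. A simulated sketch of that comparison, with OLS standing in for the paper's learners and synthetic features standing in for the real covariates and embeddings:

```python
import numpy as np

def oos_r2(y_true, y_pred):
    """Out-of-sample R²: 1 - SSE / SST on the held-out set."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(1)
n, n_train = 400, 300
tabular = rng.normal(size=(n, 3))    # stand-in for tabular covariates
embed   = rng.normal(size=(n, 20))   # stand-in for multimodal embeddings
# Simulated outcome loads on both tabular fields and the embedding.
y = (tabular @ [0.5, -0.3, 0.2]
     + embed @ rng.normal(scale=0.3, size=20)
     + rng.normal(scale=0.5, size=n))

train, test = slice(0, n_train), slice(n_train, n)

def fit_predict(X):
    """OLS with intercept, fit on the training split."""
    Xtr = np.column_stack([np.ones(n_train), X[train]])
    beta, *_ = np.linalg.lstsq(Xtr, y[train], rcond=None)
    Xte = np.column_stack([np.ones(n - n_train), X[test]])
    return Xte @ beta

r2_tab  = oos_r2(y[test], fit_predict(tabular))
r2_full = oos_r2(y[test], fit_predict(np.hstack([tabular, embed])))
print(r2_tab, r2_full)  # the richer feature set fits better here
```

In this simulation the embedding carries genuine signal, so the full model's out-of-sample R² exceeds the tabular-only baseline, mirroring the direction of the paper's 0.68-to-0.81 improvement.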
The central econometric contribution lies in the causal estimation of price elasticity. A naïve cross‑sectional regression of log‑quantity on log‑price yields implausibly low elasticities (≈ ‑0.8) because it fails to control for product‑specific visibility and quality. To address this, the authors specify a dynamic panel model that includes lagged price and quantity, the multimodal embeddings, and interaction terms that allow the embeddings to act as effect modifiers. Crucially, the embeddings are fine‑tuned on the prediction of price and quantity signals, aligning them with the orthogonal‑machine‑learning (orthogonal ML) framework for doubly robust estimation. This fine‑tuning ensures that the embeddings are highly predictive of the nuisance functions (conditional expectations) while remaining approximately orthogonal to the parameter of interest (elasticity).
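The orthogonal-ML logic can be illustrated with a simulated partially linear model: partial both log-price and log-quantity out of the features with cross-fitted nuisance estimates, then regress residuals on residuals. OLS stands in for the fine-tuned embedding learners, and all data below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
# Product features (stand-in for the fine-tuned embeddings).
x = rng.normal(size=(n, 5))
# Nuisances: both price and quantity depend on the features.
g = x @ [0.4, -0.2, 0.1, 0.3, -0.1]   # drives log-price
m = x @ [0.2, 0.5, -0.3, 0.1, 0.2]    # drives log-quantity
theta = -1.7                          # true elasticity in this simulation
log_p = g + rng.normal(scale=0.3, size=n)
log_q = theta * log_p + m + rng.normal(scale=0.3, size=n)

def cross_fit_resid(y, x, k=2):
    """Cross-fitted residuals: predict y from x on held-out folds."""
    resid = np.empty_like(y)
    folds = np.array_split(rng.permutation(n), k)
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        Xtr = np.column_stack([np.ones(len(train)), x[train]])
        beta, *_ = np.linalg.lstsq(Xtr, y[train], rcond=None)
        Xte = np.column_stack([np.ones(len(test)), x[test]])
        resid[test] = y[test] - Xte @ beta
    return resid

p_res = cross_fit_resid(log_p, x)
q_res = cross_fit_resid(log_q, x)
# Residual-on-residual regression recovers the elasticity.
theta_hat = (p_res @ q_res) / (p_res @ p_res)
print(theta_hat)  # close to the true value of -1.7
```

Cross-fitting keeps the nuisance estimation errors approximately orthogonal to the elasticity estimate, which is why highly flexible (even overfitting-prone) learners can be plugged in for the nuisance step.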
The resulting elasticity estimates are substantially larger in magnitude (≈ ‑1.7) and, more importantly, exhibit pronounced heterogeneity across products. High‑price, high‑popularity items display muted responsiveness (elasticities around ‑0.9), whereas low‑price, newly introduced items are highly price‑sensitive (elasticities around ‑2.4). The embeddings themselves serve as interpretable modifiers: clusters characterized by premium visual design or strong brand language have lower elasticities, while clusters dominated by generic descriptions and simple graphics have higher elasticities.
The paper contributes to three strands of literature: (1) demand estimation with rich product characteristics, extending classic hedonic approaches by incorporating AI‑generated features; (2) the growing interface between econometrics and machine learning, demonstrating how self‑supervised, attention‑based models can be adapted for causal inference; and (3) applied industrial organization using e‑commerce data, showing that sales‑rank proxies combined with high‑quality embeddings can yield credible demand parameters.
Limitations are acknowledged. The reliance on inverse sales rank introduces measurement error; the Pareto‑distribution assumption is plausible but not directly validated, and sensitivity analyses are limited. The study’s scope is confined to a single product category (toy cars), raising questions about external validity. Moreover, the computational pipeline—large transformer models, fine‑tuning, and dimensionality reduction—requires substantial resources, which may hinder replication in less‑well‑funded settings.
In conclusion, the authors demonstrate that AI‑driven multimodal embeddings, when properly fine‑tuned for causal tasks, can substantially improve both predictive performance and the credibility of elasticity estimates. The work opens avenues for future research: extending the approach to other e‑commerce categories, integrating actual sales data where available, exploring alternative proxy constructions, and applying the methodology to policy‑relevant questions such as the impact of advertising, platform fees, or regulatory interventions on consumer demand.