Arch-VQ: Discrete Architecture Representation Learning with Autoregressive Priors

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Existing neural architecture representation learning methods focus on continuous representation learning, typically using Variational Autoencoders (VAEs) to map discrete architectures onto a continuous Gaussian distribution. However, sampling from these spaces often yields a high percentage of invalid or duplicate neural architectures, likely because the inherently discrete architecture space is unnaturally mapped onto a continuous one. In this work, we revisit architecture representation learning from a fundamentally discrete perspective. We propose Arch-VQ, a framework that learns a discrete latent space of neural architectures using a Vector-Quantized Variational Autoencoder (VQ-VAE) and models the latent prior with an autoregressive transformer. This formulation yields discrete architecture representations that are better aligned with the underlying search space while decoupling representation learning from prior modeling. Across the NASBench-101, NASBench-201, and DARTS search spaces, Arch-VQ improves the quality of generated architectures, increasing the rate of valid and unique generations by 22%, 26%, and 135%, respectively, over state-of-the-art baselines. We further show that modeling discrete embeddings autoregressively enhances downstream neural predictor performance, establishing the practical utility of this discrete formulation.


💡 Research Summary

The paper “Arch‑VQ: Discrete Architecture Representation Learning with Autoregressive Priors” addresses a fundamental mismatch in neural architecture search (NAS) between the inherently discrete nature of architecture graphs and the continuous latent spaces traditionally used for unsupervised representation learning. Most prior work relies on variational autoencoders (VAEs) that map architectures to a Gaussian prior; sampling from such a space frequently yields invalid or duplicate architectures because the continuous prior cannot faithfully capture the combinatorial constraints of graph topology and categorical operations.

Arch‑VQ proposes a two‑stage framework that replaces the continuous latent space with a truly discrete one and models its distribution with a language‑style autoregressive transformer. In the first stage, a vector‑quantized variational autoencoder (VQ‑VAE) is built on top of a Graph Isomorphism Network (GIN) encoder. Each architecture, represented by an adjacency matrix A and a one‑hot operation matrix X, is encoded into a continuous vector Z_e. A learnable codebook of K embedding vectors (size K × D) quantizes Z_e by nearest‑neighbor lookup, producing a discrete index sequence Z and a quantized latent Z_q. The decoder reconstructs A and X from Z_q using sigmoid and softmax heads. Training uses a straight‑through estimator for the non‑differentiable quantization step and an exponential moving average (EMA) update for the codebook, stabilizing learning. The loss combines the reconstruction likelihood and a commitment term β‖Z_e − sg(Z_q)‖², where sg(·) denotes the stop‑gradient operator.
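The stage-one quantization step described above can be sketched in plain NumPy. This is a minimal illustration of generic VQ-VAE machinery under stated assumptions, not the paper's implementation: the function names, the batch of encoder outputs standing in for the GIN encoder, and the hyperparameters (β = 0.25, EMA decay 0.99) are all illustrative.

```python
import numpy as np

def quantize(z_e, codebook):
    """Nearest-neighbor lookup: map each encoder vector to a codebook entry.

    z_e: (N, D) encoder outputs; codebook: (K, D) learnable embeddings.
    Returns the discrete index sequence Z and the quantized latent Z_q.
    """
    dists = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)  # (N, K)
    indices = dists.argmin(axis=1)   # discrete codes Z
    z_q = codebook[indices]          # quantized latent Z_q
    # In an autodiff framework, the straight-through estimator would be
    # z_q = z_e + sg(z_q - z_e), copying decoder gradients onto z_e.
    return indices, z_q

def commitment_loss(z_e, z_q, beta=0.25):
    """beta * ||Z_e - sg(Z_q)||^2; z_q is a constant here, so sg is implicit."""
    return beta * ((z_e - z_q) ** 2).mean()

def ema_codebook_update(codebook, cluster_size, ema_sum,
                        z_e, indices, decay=0.99, eps=1e-5):
    """EMA codebook update (used in place of a codebook loss term)."""
    K = codebook.shape[0]
    one_hot = np.eye(K)[indices]  # (N, K) hard assignment matrix
    cluster_size[:] = decay * cluster_size + (1 - decay) * one_hot.sum(0)
    ema_sum[:] = decay * ema_sum + (1 - decay) * one_hot.T @ z_e
    n = cluster_size.sum()
    # Laplace smoothing keeps rarely used codes from collapsing to zero.
    smoothed = (cluster_size + eps) / (n + K * eps) * n
    codebook[:] = ema_sum / smoothed[:, None]
    return codebook

# Toy usage: encoder outputs near codebook entries 1 and 3 snap to them.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))            # K=8 codes, D=4 dims
z_e = codebook[[1, 3]] + 0.01                 # stand-in for GIN encoder output
indices, z_q = quantize(z_e, codebook)
loss = commitment_loss(z_e, z_q)
# EMA buffers are conventionally initialized from the codebook itself.
cluster_size, ema_sum = np.ones(8), codebook.copy()
codebook = ema_codebook_update(codebook, cluster_size, ema_sum, z_e, indices)
```

The EMA update replaces a gradient-based codebook loss: each code drifts toward the running mean of the encoder vectors assigned to it, which is the stabilization the summary refers to.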

