Cs-Cv
Synthetic Data Guided Feature Selection for Robust Activity Recognition in Older Adults
Learning a distance measure from the information-estimation geometry of data
Inverse problems with diffusion models: MAP estimation via mode-seeking loss
Continual-MEGA: A Large-scale Benchmark for Generalizable Continual Anomaly Detection
FlashBlock: Attention Caching for Efficient Long-Context Block Diffusion
Generalization of Self-Supervised Vision Transformers for Protein Localization Across Microscopy Domains
CORP: Closed-Form One-shot Representation-Preserving Structured Pruning for Vision Transformers
Extreme Weather Nowcasting via Local Precipitation Pattern Prediction
DRMOT: A Dataset and Framework for RGBD Referring Multi-Object Tracking
DiMo: Discrete Diffusion Modeling for Motion Generation and Understanding
A Lightweight Library for Energy-Based Joint-Embedding Predictive Architectures
A Comparative Study of 3D Person Detection: Sensor Modalities and Robustness in Diverse Indoor and Outdoor Environments
COSMOS: Coherent Supergaussian Modeling with Spatial Priors for Sparse-View 3D Splatting
DisCa: Accelerating Video Diffusion Transformers with Distillation-Compatible Learnable Feature Caching
Predicting Camera Pose from Perspective Descriptions for Spatial Reasoning
Generative Modeling via Drifting
Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation
Refer-Agent: A Collaborative Multi-Agent System with Reasoning and Reflection for Referring Video Object Segmentation
Analyzing Diffusion and Autoregressive Vision Language Models in Multimodal Embedding Space
Relevance-aware Multi-context Contrastive Decoding for Retrieval-augmented Visual Question Answering
CARLA2Real: a tool for reducing the sim2real appearance gap in CARLA simulator
Can We Build a Monolithic Model for Fake Image Detection? SICA: Semantic-Induced Constrained Adaptation for Unified-Yet-Discriminative Artifact Feature Space Reconstruction