cs.CV
Instance-Free Domain Adaptive Object Detection
A Unified Formula for Affine Transformations between Calibrated Cameras
DreamHome-Pano: Design-Aware and Conflict-Free Panoramic Interior Generation
Diffeomorphism-Equivariant Neural Networks
ChatSearch: a Dataset and a Generative Retrieval Model for General Conversational Image Retrieval
Rethinking Attention: Polynomial Alternatives to Softmax in Transformers
Multimodal Iterative RAG for Knowledge-Intensive Visual Question Answering
AR as an Evaluation Playground: Bridging Metrics and Visual Perception of Computer Vision Models
Aligned Novel View Image and Geometry Synthesis via Cross-modal Attention Instillation
Concepts in Motion: Temporal Bottlenecks for Interpretable Video Classification
Spectral Compressive Imaging via Chromaticity-Intensity Decomposition
Visual Autoregressive Modeling for Instruction-Guided Image Editing
XTransfer: Modality-Agnostic Few-Shot Model Transfer for Human Sensing at the Edge
Revisiting Emotions Representation for Recognition in the Wild
Machine Learning for Detection and Severity Estimation of Sweetpotato Weevil Damage in Field and Lab Conditions
Orientation-Robust Latent Motion Trajectory Learning for Annotation-free Cardiac Phase Detection in Fetal Echocardiography
DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos
Clinical-Prior Guided Multi-Modal Learning with Latent Attention Pooling for Gait-Based Scoliosis Screening
Gold Exploration using Representations from a Multispectral Autoencoder
A Survey of AI-Generated Video Evaluation
Bridging the Indoor-Outdoor Gap: Vision-Centric Instruction-Guided Embodied Navigation for the Last Meters
Alleviating Sparse Rewards by Modeling Step-Wise and Long-Term Sampling Effects in Flow-Based GRPO