cs.CV
MGP-KAD: Multimodal Geometric Priors and Kolmogorov-Arnold Decoder for Single-View 3D Reconstruction in Complex Scenes
FloorplanVLM: A Vision-Language Model for Floorplan Vectorization
Rebenchmarking Unsupervised Monocular 3D Occupancy Prediction
Efficient-LVSM: Faster, Cheaper, and Better Large View Synthesis Model via Decoupled Co-Refinement Attention
MetaSSP: Enhancing Semi-supervised Implicit 3D Reconstruction through Meta-adaptive EMA and SDF-aware Pseudo-label Evaluation
MultiGraspNet: A Multitask 3D Vision Model for Multi-gripper Robotic Grasping
Same Answer, Different Representations: Hidden instability in VLMs
Shallow Diffuse: Robust and Invisible Watermarking through Low-Dimensional Subspaces in Diffusion Models
Efficient Mixture-of-Expert for Video-based Driver State and Physiological Multi-task Estimation in Conditional Autonomous Driving
Robust Detection of Retinal Neovascularization in Widefield Optical Coherence Tomography
CompEvent: Complex-valued Event-RGB Fusion for Low-light Video Enhancement and Deblurring
SPIDER: Scalable Physics-Informed Dexterous Retargeting
LL-ViT: Edge Deployable Vision Transformers with Look Up Table Neurons
SyncAnyone: Implicit Disentanglement via Progressive Self-Correction for Lip-Syncing in the wild
Preserving Spectral Structure and Statistics in Diffusion Models
Multi-Sensor Attention Networks for Automated Subsurface Delamination Detection in Concrete Bridge Decks
FloodDiffusion: Tailored Diffusion Forcing for Streaming Motion Generation
A neuromorphic model of the insect visual system for natural image processing
Learning Human Visual Attention on 3D Surfaces through Geometry-Queried Semantic Priors
POINTS-GUI-G: GUI-Grounding Journey
TFusionOcc: Student's t-Distribution Based Object-Centric Multi-Sensor Fusion Framework for 3D Occupancy Prediction
Revisiting Salient Object Detection from an Observer-Centric Perspective