Abstract
The success of modern deep learning architectures often depends on their ability to exploit structure and symmetry inherent in data. A key mathematical principle underlying this idea is equivariance: the property that a model’s output transforms predictably when its input is transformed. However, conventional convolutional and transformer-based networks typically encode only translational equivariance, leaving other important symmetries, such as rotations, to be learned implicitly through data augmentation. This thesis develops a principled framework for constructing steerable neural networks, which achieve equivariance to continuous symmetry groups encompassing rotations and translations by mathematical design rather than by augmentation. The proposed architectures are evaluated on biomedical image segmentation datasets, where they outperform non-equivariant baselines and remain robust under input rotations, demonstrating the practical benefits of embedding symmetry directly into network architecture.