Project Overview
I) Context: Tackling Facial Image Inpainting
For our computer vision class, my teammate and I worked on facial
image inpainting: reconstructing missing or masked regions of a face image.
We based our work on the 2023 paper by Shruti S Phutke and Subrahmanyam
Murala, "Nested Deformable Multi-head Attention for Facial Image Inpainting,"
and set out to implement the techniques it introduces.
II) Technique Spotlight: NDMHA for Improved Inpainting
Our main focus was the Nested Deformable Multi-head Attention (NDMHA)
layer, the core component of our approach. Combined with a
transformer-based design, this attention mechanism is intended to
capture long-range dependencies across the image, which matters when
a filled-in region has to stay consistent with facial features far
from the hole. We also integrated Deformable Convolution (used inside
NDMHA), Gated Convolution, and Gated Feed Forward layers into our
model to support feature extraction, information flow, and refinement.
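To make the gating idea concrete, here is a minimal sketch of a gated convolution in plain Python. It assumes the common formulation in which one convolution produces features and a second produces a per-pixel gate in (0, 1) that scales them. The function names, the single-channel setup, and the choice of tanh as the feature activation are illustrative assumptions, not the layer from the paper or from our model.

```python
import math


def conv2d_same(image, kernel, bias=0.0):
    """Single-channel 2D cross-correlation with zero padding (same output size)."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            acc = bias
            for a in range(kh):
                for b in range(kw):
                    y, x = i + a - ph, j + b - pw
                    # Pixels outside the image count as zero.
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[a][b]
            row.append(acc)
        out.append(row)
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_conv2d(image, feature_kernel, gate_kernel,
                 feature_bias=0.0, gate_bias=0.0):
    """Hypothetical helper: output = tanh(features) * sigmoid(gates).

    The gate is computed from the same input by its own kernel, so the
    layer can learn where features should pass through (gate near 1)
    and where they should be suppressed (gate near 0).
    """
    features = conv2d_same(image, feature_kernel, feature_bias)
    gates = conv2d_same(image, gate_kernel, gate_bias)
    return [
        [math.tanh(f) * sigmoid(g) for f, g in zip(f_row, g_row)]
        for f_row, g_row in zip(features, gates)
    ]


if __name__ == "__main__":
    # Toy 4x4 "image" whose right half is a hole filled with zeros.
    image = [[1.0, 1.0, 0.0, 0.0] for _ in range(4)]
    identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    # Illustrative gate: opens where the neighbourhood has valid pixels.
    for row in gated_conv2d(image, identity, box, gate_bias=-2.0):
        print([round(v, 3) for v in row])
```

In a trained network both kernels are learned and operate on many channels. The point of the sketch is only that the gate is a soft, learned alternative to a fixed binary validity mask.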
III) Results: Promising Potential for NDMHA
Computational limitations prevented us from training on the entire
dataset and from reaching the intended number of epochs, so our results
are preliminary. Even so, the architecture built around the NDMHA layer
showed promising reconstruction capabilities relative to the baseline
U-Net autoencoder. The baseline produced higher-quality results early
in training, but as we trained the NDMHA model for more epochs, its
image quality improved gradually. This suggests that, with additional
training and more computational resources, the NDMHA-based architecture
could outperform the baseline in both reconstruction accuracy and image
quality. Further experimentation is needed to confirm this and to
realize the technique's full potential for facial image inpainting.
For more details, please refer to the GitHub link, where you can find the complete report and the original paper.