Mouhammad Bazzi

Facial Image Inpainting

This project implements a recent approach to facial image inpainting built around a nested deformable multi-head attention (NDMHA) layer. The layer is designed to capture long-term dependencies efficiently, so that the filled-in regions are visually plausible and consistent with the surrounding context.

Project Overview

I) Context: Tackling Facial Image Inpainting
For our computer vision class, my teammate and I took on the problem of facial image inpainting: reconstructing missing or masked regions of a face image. We based our work on the 2023 paper "Nested Deformable Multi-head Attention for Facial Image Inpainting" by Shruti S Phutke and Subrahmanyam Murala, and implemented the architecture it proposes.

II) Technique Spotlight: NDMHA for Improved Inpainting
Our main focus was the Nested Deformable Multi-head Attention (NDMHA) layer, the central component of our model. Combined with a transformer-based design, this attention mechanism is intended to capture long-term dependencies efficiently and thereby improve the accuracy of the inpainted regions. We also integrated Deformable Convolution (used inside NDMHA), Gated Convolution, and Gated Feed Forward layers. These components support feature extraction, control the flow of information through the network, and refine the reconstructed output.
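To make one of these building blocks concrete, here is a minimal PyTorch sketch of a gated convolution applied to a masked image. It is an illustration under stated assumptions, not the code from our repository or from the paper: the class name GatedConv2d, the ELU activation, the channel counts, and the square mask are all hypothetical choices. The idea is that one convolution produces features while a second produces a per-pixel gate in (0, 1) that decides how much of each feature passes through, which helps the network treat valid and missing pixels differently.

```python
import torch
import torch.nn as nn


class GatedConv2d(nn.Module):
    """Illustrative gated convolution: act(feature(x)) * sigmoid(gate(x)).

    The name, activation, and defaults are assumptions made for this sketch.
    """

    def __init__(self, in_channels, out_channels, kernel_size=3, padding=1):
        super().__init__()
        # One branch computes features, the other computes a soft gate.
        self.feature = nn.Conv2d(in_channels, out_channels, kernel_size, padding=padding)
        self.gate = nn.Conv2d(in_channels, out_channels, kernel_size, padding=padding)
        self.act = nn.ELU()

    def forward(self, x):
        # The sigmoid gate lies in (0, 1) and scales each feature per pixel.
        return self.act(self.feature(x)) * torch.sigmoid(self.gate(x))


# Hypothetical input: a random RGB "face" with a square hole in the middle.
image = torch.rand(1, 3, 64, 64)
mask = torch.zeros(1, 1, 64, 64)
mask[:, :, 16:48, 16:48] = 1.0           # 1 marks missing pixels
masked = image * (1.0 - mask)            # zero out the hole

# Feed the masked image together with the mask (3 + 1 = 4 channels).
x = torch.cat([masked, mask], dim=1)
layer = GatedConv2d(in_channels=4, out_channels=32)
out = layer(x)
print(out.shape)
```

With padding of 1 and a 3x3 kernel, the spatial size is preserved, so layers like this can be stacked in an encoder-decoder without changing resolution unless a stride is added.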

III) Results: Promising Potential for NDMHA
Computational limitations prevented us from training on the entire dataset and from reaching the number of epochs we had planned, so our results are preliminary. Even so, they were promising for the NDMHA-based architecture: the model showed stronger reconstruction capabilities than our baseline U-Net autoencoder. The baseline produced higher-quality images early on, but the image quality of the NDMHA model improved gradually as we trained it for more epochs. This suggests that, with additional training and more computational resources, the NDMHA-based architecture could outperform the baseline in both reconstruction accuracy and image quality. Further experimentation is needed to confirm this and to realize the technique's full potential for facial image inpainting.

For more details, please refer to the GitHub link, where you can find the complete report and the original paper.

Tools Used

Python
PyTorch
TensorFlow
Keras
Git