|
Generative Adversarial Networks (GANs) have achieved strong performance in several image generation and manipulation tasks, and image-to-image translation has become an active research topic in computer vision. However, the task remains challenging because of several major drawbacks: training is time-consuming and computationally expensive, models suffer from mode collapse, and paired training data is scarce. To address these limitations, we propose a novel model called Enhanced Cycle conditional GAN (Enhanced CCGAN). Our model alleviates the lack of aligned paired data by using a cycle consistency loss. It achieves representation disentanglement by using a content encoder and a style encoder with different architectures. In the content encoder, we use ResNet blocks to extract content features and realize multi-level feature fusion. We propose a semantic latent style loss to enforce semantic consistency of the style vectors. Furthermore, we use $3\times3$ convolutional kernels, as popularized by VGG16, which greatly reduce the amount of computation while still performing well. Experimental results show that our model produces images with high quality and diversity across several data domains and significantly outperforms state-of-the-art models. |
|
Keywords: Artificial Intelligence, Enhanced CCGAN, Representation disentanglement, ResNet, Semantic latent space, VGG16. |
|
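The abstract's claim that $3\times3$ kernels reduce computation can be illustrated with a small weight-count comparison. The sketch below is not taken from the Enhanced CCGAN implementation. It assumes stride-1 convolutions with equal input and output channel counts and ignores biases, and the helper names (`conv_weights`, `stacked_weights`, `receptive_field`) and the channel count of 64 are hypothetical choices made for illustration only.

```python
def conv_weights(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in one 2D convolution layer (biases ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def stacked_weights(kernel_size: int, layers: int, channels: int) -> int:
    """Total weights in a stack of identical layers with `channels` in and out."""
    return layers * conv_weights(kernel_size, channels, channels)


def receptive_field(kernel_size: int, layers: int) -> int:
    """Receptive field of a stack of stride-1 convolutions with the same kernel size."""
    return 1 + layers * (kernel_size - 1)


if __name__ == "__main__":
    channels = 64  # illustrative value, not taken from the paper

    # Two 3x3 layers see the same 5x5 window as a single 5x5 layer.
    print(receptive_field(3, 2), receptive_field(5, 1))
    print(stacked_weights(3, 2, channels), conv_weights(5, channels, channels))

    # Three 3x3 layers see the same 7x7 window as a single 7x7 layer.
    print(receptive_field(3, 3), receptive_field(7, 1))
    print(stacked_weights(3, 3, channels), conv_weights(7, channels, channels))
```

Under these assumptions, a stack of two $3\times3$ layers uses $18C^2$ weights against $25C^2$ for one $5\times5$ layer with the same receptive field, and three $3\times3$ layers use $27C^2$ against $49C^2$ for one $7\times7$ layer, where $C$ is the channel count.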