
What is GAN?


What is GAN (Generative Adversarial Network)?
#

Generative adversarial networks (GANs) are being used to generate images, videos, text, audio, and music. GANs are a class of machine-learning models introduced by Ian Goodfellow and his colleagues in 2014. They quickly became popular among researchers because they can generate new data with the same statistics as the training set. They can be applied to images, videos, text, tabular data, and more, and have proved useful for semi-supervised, fully supervised, and reinforcement learning.

Machine-learning models are commonly divided into two kinds: generative and discriminative. Discriminative models are primarily used for classification, where the model learns a decision boundary and predicts which class a data point belongs to. Generative models, by contrast, learn the distribution of the training data and are primarily used to generate synthetic data points that follow that distribution. Generative adversarial networks (GANs), the topic of this post, are generative models: a generator network produces synthetic samples while a discriminator network tries to tell them apart from real ones, and the two are trained against each other.
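To make the adversarial idea concrete, here is a minimal sketch of the two losses in a standard GAN, where a discriminator scores samples as real or fake and a generator tries to fool it. It assumes the discriminator outputs probabilities in (0, 1); the function names are hypothetical, and the generator loss shown is the commonly used non-saturating variant rather than anything this post prescribes.

```python
import math

EPS = 1e-12  # small constant to avoid log(0)


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: push D(real) toward 1 and D(fake) toward 0.

    d_real and d_fake are lists of discriminator probabilities.
    """
    real_term = -sum(math.log(p + EPS) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p + EPS) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss: push D(fake) toward 1."""
    return -sum(math.log(p + EPS) for p in d_fake) / len(d_fake)


# A confident, correct discriminator has a low loss ...
confident = discriminator_loss([0.9, 0.8], [0.1, 0.2])
# ... while one that outputs 0.5 everywhere cannot separate real from fake.
undecided = discriminator_loss([0.5, 0.5], [0.5, 0.5])
print(confident < undecided)
```

In training, the two losses are minimized in alternation: the discriminator gets better at spotting fakes, which in turn gives the generator a more informative signal.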

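The most popular GAN Architectures and their Purpose
#

  • CycleGAN: transforms an image from one domain to another without paired examples.
  • Text-to-image GANs (for example StackGAN): generate an image from a textual description.
  • ProgressiveGAN: generates very high-resolution images.
  • DiscoGAN: discovers relations between two image domains for cross-domain translation.
  • LSGAN: replaces the usual cross-entropy loss with a least-squares loss to stabilize training.
  • PixelRNN: often listed alongside GANs, but it is an autoregressive generative model rather than a GAN.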
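As an illustration of how one of these variants differs from the original formulation, the sketch below shows the least-squares objective used by LSGAN (the "lsGAN" entry in the list above). It assumes the common label choice of 1 for real and 0 for fake, and that the discriminator outputs unbounded scores rather than probabilities; the function names are hypothetical.

```python
def lsgan_discriminator_loss(d_real, d_fake, real_label=1.0, fake_label=0.0):
    """Least-squares discriminator loss: squared distance to the target labels."""
    real_term = sum((s - real_label) ** 2 for s in d_real) / len(d_real)
    fake_term = sum((s - fake_label) ** 2 for s in d_fake) / len(d_fake)
    return 0.5 * (real_term + fake_term)


def lsgan_generator_loss(d_fake, target=1.0):
    """Least-squares generator loss: pull scores of fake samples toward the real label."""
    return 0.5 * sum((s - target) ** 2 for s in d_fake) / len(d_fake)


# A perfect discriminator scores real samples at 1 and fake samples at 0.
print(lsgan_discriminator_loss([1.0, 1.0], [0.0, 0.0]))  # 0.0
# The generator is penalized when its samples are scored as fake.
print(lsgan_generator_loss([0.0]))  # 0.5
```

Because the penalty grows with the distance from the target, samples that are far from the decision boundary still produce a useful gradient.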

Image generation projects using GAN
#

Anime Characters
#

GANs are changing how realistic anime characters are generated, and more complex GAN architectures could eventually help produce entire anime series with AI. In the 2017 paper “Towards the Automatic Anime Characters Creation with Generative Adversarial Networks,” Yanghua Jin and his team trained a GAN to generate the faces of anime characters (characters in the style of Japanese animation and comics). The results were remarkable and prompted further experiments on generating anime faces and Pokémon characters. GAN models such as DCGAN and StyleGAN are commonly used to generate cartoon characters.

Fake Human Faces
#

Facial recognition has many use cases and has been under active development for years, with researchers using different techniques and face datasets to train models. These projects need massive datasets of human faces, and generated faces can help supply them. NVIDIA researchers published “Progressive Growing of GANs for Improved Quality, Stability, and Variation” in 2018, proposing a new training methodology in which the generator and discriminator grow progressively from low to high resolution, and demonstrated it by generating plausible photographs of human faces. The generated faces are realistic enough to be easily mistaken for real photographs. The paper also presented generated objects and scenes.
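The training methodology in the Progressive Growing paper starts at a low resolution and adds higher-resolution layers over time, blending each new layer in smoothly. The sketch below illustrates only that fade-in step on plain nested lists. It is a simplified illustration under stated assumptions, not the paper's implementation, and the helper names are hypothetical.

```python
def upsample_nearest_2x(image):
    """Double the height and width of a 2D image by repeating each pixel."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fade_in(old_low_res, new_high_res, alpha):
    """Blend the upsampled old output with the new layer's output.

    alpha ramps from 0.0 (only the old, lower-resolution output)
    to 1.0 (only the new, higher-resolution output) during training.
    """
    upsampled = upsample_nearest_2x(old_low_res)
    return [
        [(1.0 - alpha) * old + alpha * new for old, new in zip(old_row, new_row)]
        for old_row, new_row in zip(upsampled, new_high_res)
    ]


old = [[1.0]]                      # 1x1 output of the existing layers
new = [[3.0, 3.0], [3.0, 3.0]]     # 2x2 output of the newly added layer
print(fade_in(old, new, 0.5))      # [[2.0, 2.0], [2.0, 2.0]]
```

Ramping alpha gradually means the newly added layers do not abruptly disturb what the lower-resolution layers have already learned.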

Image Style Transfer
#

Image style transfer is a computer-vision technique that combines two images. A model takes a content image and a reference (style) image and outputs a new image that contains the objects of the content image rendered in the style of the reference image. Here, style means the brush strokes, colors, and textures of the image. Researchers are still exploring the best methods and use cases for style transfer. The technique is a form of image-to-image translation and is closely related to domain adaptation. There is a large body of research on image style transfer using GANs, and much of it has produced good results. One notable study is the paper “P²-GAN: Efficient Style Transfer Using Single Style Image,” in which Zhentan Zheng and Jianyi Liu propose a novel patch permutation GAN (P²-GAN) that efficiently learns stroke styles from a single style image, such as a painting. The paper concludes that the P²-GAN network reproduces the expected stroke style effectively and precisely while avoiding the difficulty of collecting a set of images in the same style.

Text-to-Image (text-2-image) Synthesis
#

Generating realistic images is challenging, but GANs have made it practical rather than theoretical. Image-to-image generation and translation are already hard, and synthesizing realistic images from text with GANs is more complex still. It requires a strong GAN architecture along with training images paired with text descriptions. Many papers on the task report impressive results. In 2016, Han Zhang and colleagues presented “StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks,” which uses a stacked GAN (StackGAN) to generate realistic photographs of simple objects such as birds and flowers from textual descriptions. The papers “Generative Adversarial Text to Image Synthesis” and “TAC-GAN – Text Conditioned Auxiliary Classifier Generative Adversarial Network” also study the generation of realistic images through text-to-image synthesis with GANs.
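A common way to condition a GAN on text is to feed the generator a noise vector concatenated with an embedding of the description. The sketch below shows only that input-building step. The toy vocabulary and the word-count embedding are hypothetical stand-ins for a learned text encoder, and none of this is taken from the StackGAN or TAC-GAN code.

```python
import random

# Hypothetical toy vocabulary; a real system would use a learned text encoder.
VOCAB = ["bird", "flower", "red", "yellow", "small", "large"]


def embed_text(description):
    """Toy embedding: count how often each vocabulary word appears."""
    words = description.lower().split()
    return [float(words.count(word)) for word in VOCAB]


def generator_input(description, noise_dim=4, seed=None):
    """Concatenate Gaussian noise with the text embedding."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return noise + embed_text(description)


vector = generator_input("a small red bird", noise_dim=4, seed=0)
print(len(vector))   # 10: four noise values plus six embedding values
print(vector[4:])    # [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
```

The noise lets the generator produce many different images for the same description, while the embedding steers what those images depict.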

Image-to-Image Translation
#

Image-to-image translation covers many tasks, such as translating semantic label maps to photographs, satellite photographs to Google Maps-style maps, black-and-white photographs to color, and more. Many papers have demonstrated the use of GANs for image-to-image translation, and one of the most popular is “Image-to-Image Translation with Conditional Adversarial Networks” by the Berkeley AI Research group in 2016. The paper investigates whether conditional adversarial networks can serve as a general-purpose solution to image-to-image translation problems, applying what is now known as the pix2pix approach to a variety of tasks. Additionally, the paper “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks” presents the famous CycleGAN and a set of impressive image-to-image translation cases, such as translating photographs to an artistic painting style, summer photographs to winter, and horses to zebras.
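The "cycle-consistent" part of CycleGAN can be sketched in a few lines: translating an image to the other domain and back should return something close to the original. The example below computes that cycle-consistency term with an L1 distance on toy "images" represented as flat lists of floats. The two mapping functions are hypothetical stand-ins for the trained generators, and the adversarial terms of the full objective are omitted.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(x, y, g_xy, f_yx):
    """L1 cycle loss: x -> G -> F should recover x, and y -> F -> G should recover y."""
    forward = l1_distance(f_yx(g_xy(x)), x)
    backward = l1_distance(g_xy(f_yx(y)), y)
    return forward + backward


# Hypothetical toy "generators": G brightens an image, F darkens it again.
def brighten(image):
    return [value + 0.5 for value in image]


def darken(image):
    return [value - 0.5 for value in image]


x = [0.25, 0.5]   # sample from domain X
y = [0.75, 1.0]   # sample from domain Y
print(cycle_consistency_loss(x, y, brighten, darken))           # 0.0, F undoes G
print(cycle_consistency_loss(x, y, brighten, lambda img: img))  # 1.0, F ignores G
```

This term is what lets CycleGAN train on unpaired data: no matching horse and zebra photographs are needed, only the requirement that a round trip preserves the input.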

Applications of GAN
#

  • Realistic image generation
  • Improving the quality of photographs
  • Audio synthesis
  • Transfer learning
  • Deepfake apps

Vision GAN Application Areas
#

  1. Generate Examples for Image Datasets
  2. Generate Photographs of Human Faces
  3. Generate Realistic Photographs
  4. Generate Cartoon Characters
  5. Image-to-Image Translation
  6. Text-to-Image Translation
  7. Semantic-Image-to-Photo Translation
  8. Face Frontal View Generation
  9. Generate New Human Poses
  10. Photos to Emojis
  11. Photograph Editing
  12. Face Aging
  13. Photo Blending
  14. Super Resolution
  15. Photo Inpainting
  16. Clothing Translation
  17. Video Prediction
  18. 3D Object Generation
  19. Cartoon Face Images
  20. Forged Face Images
  21. Artificial Flower Images
  22. Artificial Bird Images
  23. Drawing Real Objects and Vice-versa
  24. Fake Paintings
  25. Altering Photographs

Anime character generation using IllustrationGAN
#


Dr. Hari Thapliyaal

Dr. Hari Thapliyal is a seasoned professional and prolific blogger with a multifaceted background that spans the realms of Data Science, Project Management, and Advait-Vedanta Philosophy. Holding a Doctorate in AI/NLP from SSBM (Geneva, Switzerland), Hari has earned Master's degrees in Computers, Business Management, Data Science, and Economics, reflecting his dedication to continuous learning and a diverse skill set. With over three decades of experience in management and leadership, Hari has proven expertise in training, consulting, and coaching within the technology sector. His extensive 16+ years in all phases of software product development are complemented by a decade-long focus on course design, training, coaching, and consulting in Project Management. In the dynamic field of Data Science, Hari stands out with more than three years of hands-on experience in software development, training course development, training, and mentoring professionals. His areas of specialization include Data Science, AI, Computer Vision, NLP, complex machine learning algorithms, statistical modeling, pattern identification, and extraction of valuable insights. Hari's professional journey showcases his diverse experience in planning and executing multiple types of projects. He excels in driving stakeholders to identify and resolve business problems, consistently delivering excellent results. Beyond the professional sphere, Hari finds solace in long meditation, often seeking secluded places or immersing himself in the embrace of nature.
