Generative AI: Discovering Models and Practical Applications
Abstract
Generative Artificial Intelligence (AI) is reshaping machine learning by generating realistic new data. This review surveys the field: its core ideas, the main model families, how they are trained, their real-world uses, open challenges, recent breakthroughs, future directions, evaluation methods, and the ethical questions they raise. We begin by explaining why Generative AI matters across so many fields, with applications ranging from realistic image generation and text writing to music composition and drug discovery. The review covers a range of generative models, including Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), flow-based models, Generative Reinforcement Learning (GRL), and recent hybrid designs. We also examine how these models are evaluated, using the Inception Score, perceptual similarity metrics, and human feedback. Finally, we address the ethical side, stressing the need to mitigate bias, prevent misuse, resolve intellectual property issues, and promote responsible AI development and regulation.