{"id":2529,"date":"2023-05-29T19:44:32","date_gmt":"2023-05-29T19:44:32","guid":{"rendered":"https:\/\/www.codeastar.com\/?p=2529"},"modified":"2023-07-04T17:55:29","modified_gmt":"2023-07-04T17:55:29","slug":"stable-diffusion-quick-and-easy-guide-for-everyone-part-1","status":"publish","type":"post","link":"https:\/\/www.codeastar.com\/stable-diffusion-quick-and-easy-guide-for-everyone-part-1\/","title":{"rendered":"Easy Guide for Beginner: Learn how to use Stable Diffusion WebUI – Part 1"},"content":{"rendered":"\n
We have been learning programming and AI since the beginning of this website. Now, we are excited to explore the world of generative AI art using the Stable Diffusion WebUI. This innovative web interface provides us with a simple and efficient way to generate AI art. In one of our previous posts, we experimented with creating computer graphics, but it was all about programming from a programmer's perspective: we obtained results based on the code we wrote. However, with a generative AI tool, we can provide prompts to our machine and let it create the art on its own. This opens up a wider range of creativity, ideas, and styles that combine to produce new computer-generated art. The Stable Diffusion WebUI is the perfect solution to help us unlock the full potential of our artistic ideas. So, what are we waiting for? Let's get started!

### What is Stable Diffusion?

If you just want to know how to use the WebUI, you can skip this section and go straight to the next section.

Stable Diffusion is a deep learning model that uses user inputs to generate images. These inputs can be in the form of text descriptions or other images. Now, you might be wondering, what exactly is a diffusion model? Well, think of it as the way a new idea spreads among a group of people over time.

In the context of Stable Diffusion, the user's input acts as that "new idea". However, it's important to note that the generated image is not created directly from the input itself. Instead, the model starts with a random, noise-filled image and then applies a series of denoising filters to gradually refine it. To guide this process, the model interprets the user's input, using Natural Language Processing (NLP) for textual descriptions or image recognition for user-uploaded images.

The model repeats this filtering and enhancement process multiple times, with each iteration adding more details to the image. Gradually, the image evolves into a fully formed picture that matches the user's input.

So, in summary, Stable Diffusion is a powerful tool that uses deep learning techniques to generate images based on user inputs. It progressively refines a random starting image, applying filters and improving it iteratively until it becomes a complete representation of the user's desired input.

In our example here, we entered "A brown bear uses a computer in an office" as the input and let the model generate the image in 20 steps.

*(Figure: the image generated from the prompt "A brown bear uses a computer in an office" in 20 steps)*
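To make the idea concrete, here is roughly what the same text-to-image flow looks like in plain Python with the Hugging Face diffusers library. This is a minimal sketch, not the WebUI's own code; it assumes diffusers, transformers, and torch are installed and a CUDA-capable GPU is available:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load a public Stable Diffusion 1.5 checkpoint (downloads ~4GB on first use)
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,  # half precision to fit smaller GPUs
)
pipe = pipe.to("cuda")

# The prompt is the "new idea"; the pipeline starts from random noise and
# denoises it toward the prompt over num_inference_steps iterations.
image = pipe(
    "A brown bear uses a computer in an office",
    num_inference_steps=20,  # the same 20 steps as in our example
).images[0]
image.save("brown_bear.png")
```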
### Stable Diffusion WebUI Installation

The installation is a straightforward task: we just download the files from the author's GitHub page (thank you, AUTOMATIC1111) and install the WebUI. Before doing that, we have to make sure the machine we are installing on has a GPU card with at least 6GB of VRAM. Just follow the installation instructions from AUTOMATIC1111's page and there should be no issues at all.
There are 2 installation tips we can provide.

After the installation, run webui-user and copy the URL (http://127.0.0.1:7860/ by default) from the console log. Open the URL in a web browser, and the following interface should appear:

*(Figure: the main txt2img interface of the Stable Diffusion WebUI)*
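One optional tweak before launching: startup options live in the COMMANDLINE_ARGS line of webui-user.bat (webui-user.sh on Linux). Below is a hedged example for a card near the 6GB VRAM minimum; the flags shown are real WebUI options, but whether you need them depends on your hardware:

```bat
@echo off

set PYTHON=
set GIT=
set VENV_DIR=
:: Example flags for a ~6GB card: reduce VRAM usage (--medvram), enable the
:: xformers attention optimization, and open the browser for us on startup
set COMMANDLINE_ARGS=--medvram --xformers --autolaunch

call webui.bat
```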
<\/figure>\n\n\n\n
Stable Diffusion WebUI Major Components 1<\/h3>\n\n\n\n
\n
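As a side note, the same txt2img components (prompt, steps, CFG scale and so on) can also be driven programmatically when the WebUI is launched with the --api flag. A minimal sketch against the WebUI's /sdapi/v1/txt2img endpoint, assuming the default URL from above:

```python
import base64
import requests

# Talk to a running WebUI; it must be launched with the --api flag
payload = {
    "prompt": "A brown bear uses a computer in an office",
    "steps": 20,       # the "Sampling steps" component
    "cfg_scale": 7,    # the "CFG Scale" component
    "width": 512,
    "height": 512,
}
resp = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
resp.raise_for_status()

# The endpoint returns generated images as base64 strings
with open("bear_api.png", "wb") as f:
    f.write(base64.b64decode(resp.json()["images"][0]))
```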
### Stable Diffusion WebUI Major Components 2
### Usage of Script in Stable Diffusion

We can use the "Script" option to produce graphics that combine different settings. In the following example, we use the "X/Y/Z plot" script to produce a set of images with different CFG Scale and Steps settings.

The X/Y/Z plot script can produce a matrix according to our criteria, so we add CFG Scale on the X-axis with "1, 7, 15" as the values. On the Y-axis, we use Steps with "20, 40" as the values. The prompt is "Joe Biden goes shopping in Milan". Then we have the following output:

*(Figure: the X/Y/Z plot grid of generated images across CFG Scale 1, 7, 15 and Steps 20, 40)*

It is easy to notice the effect of the Steps value: with a greater number of steps, we get a more detailed output.

The CFG Scale, meanwhile, determines how strongly the model follows the keywords in the prompt. When the CFG Scale is set to 1, the model treats the keywords lightly, so the prompt only loosely steers the denoising. When the CFG Scale is set to 7, the model recognizes specific keywords like "Joe Biden" and generates a recognizable image of Joe Biden. If we compare the images generated with CFG Scale 1 and CFG Scale 7, we can see that the images at CFG Scale 7 have fewer elements related to "shopping". The CFG Scale 15 images are heavily biased towards the "Joe Biden" keyword, and there is hardly any shopping content present. So we would recommend using a lower CFG Scale if you want to generate more diverse and creative results.
### Conclusion

In this first part, we learned what Stable Diffusion is, installed the Stable Diffusion WebUI, and used the X/Y/Z plot script to see how the Steps and CFG Scale settings shape the output. We will continue exploring the WebUI in Part 2.