{"id":2678,"date":"2025-03-02T06:19:32","date_gmt":"2025-03-02T06:19:32","guid":{"rendered":"https:\/\/www.codeastar.com\/?p=2678"},"modified":"2025-03-02T06:27:50","modified_gmt":"2025-03-02T06:27:50","slug":"cogvideox-self-hosted-ai-image-to-video-gene","status":"publish","type":"post","link":"https:\/\/www.codeastar.com\/cogvideox-self-hosted-ai-image-to-video-gene\/","title":{"rendered":"Generation Next: Self-hosted Image to Video with CogVideoX"},"content":{"rendered":"\n

In our last post<\/a>, we explored how to generate images using FLUX. This time, we are taking a step further by using generative AI to generate videos. The most popular generative video model nowadays is Sora<\/a> from OpenAI, but it is not freely available. Another popular choice is Dream Machine<\/a> from Luma AI, which is free for anyone to try. But just like in the past, we prefer solutions that are open source and self-hosted, so everything stays under our control and we do not need to worry about usage limits. That brings us to CogVideoX, an open-source text and image to video model.<\/p>\n\n\n\n

Text and Image to Video<\/h2>\n\n\n\n

If you have not tried text or image to video generation yet, visit the Dream Machine website to try it for free. There are two major types of video generation, but in this post, we will focus on image to video. <\/p>\n\n\n\n

Text to video is similar to what we tried in FLUX: we type the prompts and get the expected results, except the generated output is a video instead of an image. Image to video, on the other hand, generates a video based on the prompts *plus* the image content. Interestingly, text to video is often considered more challenging than image to video, because the model needs to understand the text prompts and generate output that matches the narrative content. But that is the view from a machine's perspective. From a human perspective, image to video is far harder than text to video, because we inherently have specific expectations based on the images we provide. Having already seen the images, we expect more than just what the text prompt describes.<\/p>\n\n\n\n

Use of CogVideoX in Easy Mode<\/h2>\n\n\n\n

Okay, back to the post topic, CogVideoX. Among the few generative AI models that can produce videos, CogVideoX stands out as an open-source option. To get started, go to its GitHub page<\/a> and clone the project. After that, we can use its Python code sample<\/a> to generate a video, or use its web UI<\/a> to do the video generation.<\/p>\n\n\n\n

For absolute beginners, we suggest using CogStudio<\/a>. It is a Gradio web UI just like the one mentioned earlier, but it provides more functions and, most importantly, a one-click install: a single click installs both CogVideoX and the CogStudio web UI. That's it! Once the installation is finished, we can run CogStudio directly in a web browser.<\/p>\n\n\n\n

\"CogVideoX<\/figure>\n\n\n\n

Looks familiar? Yes, our favorite FLUX Forge<\/a> is also built on a Gradio web UI.<\/p>\n\n\n\n

CogVideoX Image to Video Experiment<\/h2>\n\n\n\n

Let’s explore image to video generation using CogVideoX’s 5B model. The process is straightforward. From the CogStudio web UI we have seen before, click the “image-to-video” tab on the top, upload an image there, enter the prompt, then click the “Generate Video” button.<\/p>\n\n\n\n

According to CogVideoX’s GitHub page:<\/p>\n\n\n\n

\n

Since CogVideoX is trained on long texts, we need to transform the input text distribution to match the training data using an LLM.<\/p>\n<\/blockquote>\n\n\n\n

Okay, let’s do it. We start by using the feature image from the previous post.<\/p>\n\n\n\n

\"Digital<\/figure>\n\n\n\n

This is our input for image to video, and I asked ChatGPT to generate a long text description to use as our prompt: <\/p>\n\n\n\n

\n

In a cozy modern studio adorned with lush plants and colorful artwork, a cheerful animated bear with a warm smile and a bright yellow scarf dances joyfully against a stunning sunset skyline. The large windows frame the vibrant oranges and pinks of the setting sun, casting a warm glow across the room. As upbeat music fills the air, the bear captivates the scene with its lively movements, spinning and hopping to the rhythm. Its expressive face reflects pure delight, while its paws sway gracefully, tapping along to the catchy beat. As the dance reaches a crescendo, the bear gives a playful wave, radiating happiness and leaving a lasting impression of joy in this vibrant setting.<\/p>\n<\/blockquote>\n\n\n\n

Here is our first image to video output:<\/p>\n\n\n\n