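“Lumiere: A Space-Time Diffusion Model for Video Generation” Google recently published Lumiere, its most capable text-to-video (T2V) model, setting a new SOTA in video generation. In video duration, Lumiere not only achieves a qualitative...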
2024-01-29 547

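“TinyLlama: An Open-Source Small Language Model” The Singapore University of Technology and Design (SUTD) recently released TinyLlama, a 1.1-billion-parameter model pretrained on roughly 3 trillion tokens. It needs “only” 1...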
2024-01-15 998

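Have you ever made a joke while typing or voice chatting, only for the other person to take it seriously and a conflict to follow? When we chat by text or audio, we can usually only guess at the other person's attitude and tone, which easily leads to misunderstandings. ...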
2024-01-10 808

“MoonShot: Towards Controllable Video Generation and Editing with Multimodal Conditions” Project page: https://showlab.github.io/Moonshot/ Paper: https://arx...
2024-01-08 893

“DreaMoving: A Human Video Generation Framework based on Diffusion Models” Project page: https://dreamoving.github.io/dreamoving Paper: https://arxiv.org/pdf...
2024-01-08 579

“Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action” Unified-IO 2 is billed as the first model able to understand and generate images, text, audio, and...
2024-01-08 624

“Pose Anything: A Graph-Based Approach for Category-Agnostic Pose Estimation” This paper proposes a new CAPE (category-agnostic pose estimation) method that uses a newly designed Graph Transformer Decoder to exploit inter-keypoint...
2024-01-04 610

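“ANYTEXT: MULTILINGUAL VISUAL TEXT GENERATION AND EDITING” Rendering text inside generated images has long been a pain point for text-to-image AIGC applications. Alibaba has now published AnyText, which targets this problem. Next...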
2024-01-04 741

“Make-A-Character: High Quality Text-to-3D Character Generation within Minutes” Project page: https://human3daigc.github.io/MACH/ Paper: https://arxiv.org/...
2024-01-04 683

“DreamGaussian4D: Generative 4D Gaussian Splatting” Project page: https://jiawei-ren.github.io/projects/dreamgaussian4d/ Paper: https://arxiv.org/abs/2312....
2024-01-02 525