Open Source AI Talks Like a Human - In Real Time!

© Brighteon.com All Rights Reserved. All content posted on this site is commentary or opinion and is protected under Free Speech. Brighteon is not responsible for comments and content uploaded by our users.

AmazingAI

17 followers

118 views • July 13, 2024

Moshi is the lowest-latency conversational AI ever released.

On July 4, kyutai_labs introduced Moshi, the lowest latency conversational AI ever released. Moshi can perform small talk, explain various concepts and engage in roleplay using many emotions and speaking styles. In this video, watch Moshi talk like a pirate and in a spooky whisper!

Talk to Moshi here: https://moshi.chat/?queue_id=talktomoshi

____________________________________

More Info:

According to Philipp Schmid, @_philschmid on X,

Moshi:

> Expresses and understands emotions, e.g. speaking with a “French accent”

> Listens and generates Audio/Speech

> Generates realistic, human-like speech in a variety of accents

> Supports 2 streams of audio to listen and speak at the same time

> Used joint pre-training on a mix of text and audio

> Used synthetic text data from Helium, a 7B LLM created by Kyutai

> Is fine-tuned on 100k “oral-style” synthetic conversations converted with TTS

> Learned its voice from synthetic data generated by a separate TTS model

> Achieves an end-to-end latency of 200ms

> Has a smaller variant that runs on a MacBook or consumer-size GPU. 🤯

> Uses watermarking to detect AI-generated audio (WIP)

> Will be released open source!!!

____________________________________

All clips used for fair use commentary, criticism, and educational purposes. See Hosseinzadeh v. Klein, 276 F. Supp. 3d 34 (S.D.N.Y. 2017); Equals Three, LLC v. Jukin Media, Inc., 139 F. Supp. 3d 1094 (C.D. Cal. 2015).

____________________________________

artificial intelligence, technology, AI, large language models, LLMs, interactive

