
Registered since September 28th, 2017
Has a total of 4281 bookmarks.
Tag selected: speech.
Looking up speech tag. Showing 4 results.
Saved by uncleflo on April 25th, 2026.
Speak to an AI using our low-latency open-source speech-to-text and text-to-speech. This is a cascaded system made by Kyutai: our speech-to-text transcribes what you say, an LLM (we use GPT OSS 120B) generates the text of the response, and we then use our text-to-speech model to say it out loud. All of the components are open-source: Kyutai STT, Kyutai TTS 1.6B, and Unmute itself. Although cascaded systems lose valuable information like emotion, irony, etc., they provide unmatched modularity: since the three parts are separate, you can Unmute any LLM you want without any finetuning or adaptation! In this demo, you can get a feel for this versatility by tuning the system prompt of the LLM to handcraft the personality of your digital interlocutor, and independently changing the voice of the TTS. Both the speech-to-text and text-to-speech models are optimized for low latency. The STT model is streaming and integrates semantic voice activity detection instead of relying on an external model. The TTS is streaming both in audio and in text, meaning it can start speaking before the entire LLM response is generated. You can use a 10-second voice sample to determine the TTS's voice and intonation. Check out the pre-print for details.
opensource speech text convert system automate ai transcribe llm generate response voice tts translate website online tool useful good
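The cascade described in this bookmark (speech-to-text, then an LLM, then text-to-speech) can be sketched roughly as below. This is only an illustration of the modular idea, not Kyutai's or Unmute's actual code: every function name here (transcribe, generate_reply, synthesize, cascade) is hypothetical, and the three stages are trivial stand-ins that pass words around instead of real audio. Generators are used so that the TTS stage can start emitting output before the LLM stage has finished, which mirrors the "streaming in text" behaviour the description mentions.

```python
from typing import Callable, Iterable, Iterator


def transcribe(audio_chunks: Iterable[bytes]) -> str:
    """Stand-in STT (hypothetical): pretend each chunk decodes to one word."""
    return " ".join(chunk.decode("utf-8") for chunk in audio_chunks)


def generate_reply(prompt: str, system_prompt: str) -> Iterator[str]:
    """Stand-in LLM (hypothetical): yields the reply one word at a time."""
    reply = f"{system_prompt} You said: {prompt}"
    for word in reply.split():
        yield word


def synthesize(words: Iterable[str]) -> Iterator[bytes]:
    """Stand-in TTS (hypothetical): emits one fake audio chunk per word
    as soon as that word arrives, without waiting for the full reply."""
    for word in words:
        yield f"<audio:{word}>".encode("utf-8")


def cascade(
    audio_chunks: Iterable[bytes],
    system_prompt: str,
    stt: Callable[[Iterable[bytes]], str] = transcribe,
    llm: Callable[[str, str], Iterator[str]] = generate_reply,
    tts: Callable[[Iterable[str]], Iterator[bytes]] = synthesize,
) -> Iterator[bytes]:
    """Chain the three stages; any stage can be swapped independently."""
    text = stt(audio_chunks)
    return tts(llm(text, system_prompt))


if __name__ == "__main__":
    for chunk in cascade([b"hello", b"world"], "Hi."):
        print(chunk)
```

Because each stage is passed in as a plain callable, replacing the LLM (or the voice) means supplying a different function with the same shape, leaving the other two stages untouched.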
Saved by uncleflo on November 14th, 2014.
Even though we're located in a secretive nuclear bunker, rebuking authorities regarding the rights of individuals, that doesn't mean we're cold and non-communicative to our clients. We hold the highest esteem for your business. Our team is reaching out to our clients every day, providing them with the best possible service. Our goal is to create long lasting business relationships. Despite the fact that most of our clients want to stay anonymous, we managed to get a few testimonials.
hosting cyber bunker data center communication secret authority individual client business service relationship raid testimonial freedom speech bittorrent search engine challenge download unauthorize server secure security location privacy open web copyright content
Saved by uncleflo on March 28th, 2014.
Veteran of countless pitching events, Oli Barrett reveals his recipe for a short, sharp talk that will put the audience in the palms of your hands. “Words are, of course, the most powerful drug used by mankind.” So said Rudyard Kipling, and with his exceedingly good phrase in mind, I spent an intoxicating evening in the company of the most recent Wayra cohort.
word speech people crowd move startup business idea perfect event talk audience power control
Saved by uncleflo on March 5th, 2012.
Welcome to Café version 2.9. We've added support for randomized initial URLs, enhanced speech event logging, and made performance enhancements to default recognition settings. Let us know what you think - drop us an email at cafesupport@bevocal.com. Full details about this new release are available from the Cafe Newsgroups. What is Nuance Café? The Nuance Café is the VoiceXML development environment of choice among the industry's most experienced and discerning developers. As a free, Web-based development environment, the Café features a wealth of valuable tools, documentation and other resources, all with the objective of helping developers to build the highest quality voice-enabled applications in the shortest period of time. Tens of thousands of developers from over 30 countries representing a variety of organizations have chosen the Café to build compelling, production quality VoiceXML applications.
microphone sound text programming phone speech software voip development voice mobile voicexml