Speechify Previews Jarvis Voice Computer Control System

Speechify today previewed an early version of a new voice-controlled computing system internally referred to as Jarvis, a voice interface that allows users to operate their entire computer using spoken commands. The preview demonstrates a future in which users can control applications, navigate workflows, and complete tasks without typing, clicking, or touching their devices.

The prototype was recently demonstrated internally and shared publicly by Speechify founder and CEO Cliff Weitzman. The system allows users to speak naturally while Speechify executes actions across applications and windows in real time.

In the demonstration, spoken instructions trigger actions such as opening applications, locating contacts, navigating interfaces, and sending messages. Instead of switching between windows and manually interacting with software, users can complete tasks entirely through voice.

Speechify has shared an early video preview of the system, presented by CEO Cliff Weitzman.

A Voice Interface for the Entire Computer

Traditional AI assistants typically focus on answering questions or generating text. Even when AI tools are integrated into software, users still need to manually open applications, navigate menus, and complete actions themselves.

Speechify Jarvis introduces a different model.

Users speak naturally while the system carries out instructions directly on the computer. Applications open automatically, workflows execute in sequence, and tasks complete without manual interaction.

Voice becomes an active control layer across the entire operating environment rather than a passive assistant limited to conversation.

From AI Chat to Voice-Controlled Computing

Most AI tools today are built around typed prompts and chat interfaces. While these systems can generate answers and written content, they typically cannot perform actions across real applications.

Speechify Jarvis extends Speechify’s Voice AI platform into direct computer control.

Instead of asking an assistant for instructions and then performing the steps manually, users can instruct the system to carry out tasks immediately. Voice becomes the primary interface for interacting with software.

Speechify describes this direction as part of a broader goal to reduce dependence on keyboards and traditional input devices.

“We just built something I've never seen anyone build before,” said Cliff Weitzman, founder and CEO of Speechify. “You talk to your computer and it takes over. You don't click anything, you don't type anything, and you don't touch anything. Your voice controls the entire machine.”

Designed Around Natural Interaction

Speechify Jarvis builds on Speechify’s existing voice-first platform, which combines text to speech, voice typing dictation, and a conversational Voice AI Assistant.

The new system extends these capabilities from reading and writing into direct workflow control. Users can open and navigate applications, send messages, execute workflows, switch between windows, and control software environments using natural spoken language.

Actions are triggered through conversational speech rather than structured commands or keyboard shortcuts.

The system is currently running internally on Speechify computers and represents an early preview of future product development.

Toward a Voice-Native Operating Model

Speechify’s preview reflects a broader shift toward voice-native computing. While keyboards and graphical interfaces remain the standard way people interact with software today, Speechify believes voice will become a primary interface for many workflows.

The Jarvis preview demonstrates a possible future where users interact with computers conversationally rather than through manual input.

Speechify describes the technology as an early step toward making voice the central interface for productivity and knowledge work, with additional updates planned.

About Speechify

Speechify is a Voice AI Assistant that helps people read, write, and understand information through voice. Trusted by over 50 million users worldwide, Speechify offers text to speech, voice typing dictation, and a conversational AI assistant across iOS, Android, Mac, web, and Chrome. In 2025, Speechify received the Apple Design Award for its impact on accessibility and productivity. Speechify is used in nearly 200 countries and features 1,000+ natural-sounding voices in over 60 languages, including voices from Snoop Dogg and Gwyneth Paltrow.