Can you manage your house with a local, no-cloud voice assistant? Mostly, yes.

Home Assistant's voice assistant running on an ESP32-S3-Box3
The most impressive part is what Home Assistant’s voice control does not do: share your voice input with a large entity aiming to sell you things.

Kevin Purdy

Last year, the leaders of Home Assistant declared 2023 the “Year of the Voice.” The goal was to let users of the DIY home automation platform “control Home Assistant in their own language.” It was a bold shot to call, given people’s expectations from using Alexa and the like. Further, the Home Assistant team wasn’t even sure where to start.

Did they succeed, looking in from early 2024? In a very strict sense, yes. Right now, with some off-the-shelf gear and the patience to flash and fiddle, you can ask “Nabu” or “Jarvis” or any name you want to turn off some lights, set the thermostat, or run automations. And you can ask about the weather. Narrowly defined mission: Accomplished.

In a broader, more accurate sense, Home Assistant voice control has a ways to go. Your verb set is limited to toggling, setting, and other smart home interactions. The easiest devices to use for this don’t have the best noise cancellation or pick-up range. Errors aren’t handled gracefully, and you get the best results by fine-tuning the names you give everything you control.

It’s not entirely fair to compare locally run, privacy-minded voice control to the “assistants” offered by globe-spanning tech companies with secondary motives. Paulus Schoutsen, founder of Home Assistant, knows this, but he’s motivated to keep improving anyway. Schoutsen told Ars that people tend to arrive at Home Assistant after starting out with one of the big three: Amazon’s Alexa, Google’s Assistant, or Apple’s Siri. “They’re ‘outgrowers,’ so they come to us. We’re their second system,” Schoutsen said.

While outgrowers are happy to leave behind the inconsistent behavior, privacy concerns, or limitations of their old systems, they can miss being able to just shout from anywhere in a room and have a device figure out their intent. Or, failing that, their kids want music to play when they say, “Play ‘Cruel Summer’ by Taylor Swift” in the kitchen. Home Assistant is not there yet, and in some ways is not meant to be that kind of system, at least by default. But it’s improving, and it has come a very long way.

Here’s a look at what you can do today with your human voice and Home Assistant, what remains to be fixed and made easier, and how it got here.

The wake word settings inside Home Assistant’s Voice Assistant settings, once you’ve set up your system.

Kevin Purdy

The open source to-do list

“As it stands today, we’re not ready yet to tell people that our voice assistant is a replacement for Google/Amazon,” Schoutsen wrote. “We don’t have to be as good as their systems, but there is a certain bar of usable that we haven’t reached yet.”

Key among the improvements that need to happen, according to Schoutsen:

  • Audio input needs to be cleaned up (speaker voice separated) before it is processed
  • Error messages need to be clearer about what’s going wrong, and input has to have more flexibility
  • Non-English languages need many more supported commands and variables
  • Compatible hardware that features far-listening microphones has to be more widely available
  • Most people will want local processing to be faster

All that said, it’s impressive how far Home Assistant has come since late 2022, when it made its pronouncement, despite not really having a clear path toward its end goal.

A more visually interesting version of Home Assistant’s wake word potential.