Three remarkable things that happened in AI this week. An LLM isn’t running for Parliament, AI video gets better but is still hard, and a great idea in hearing augmentation

Techtonic
It’s very responsible for AI Steve to label this as AI-generated content. Well done, AI Steve! Source: Neural Voice

AI Steve runs for Parliament

What happened: A man named Steve Endacott is running as an independent in the upcoming UK elections. Endacott says that he’s just the mouthpiece (and, you know, legal human) for “AI Steve,” his LLM-powered avatar. Anyone can interact with AI Steve, who will incorporate feedback and suggestions into “his” political platform, which Endacott will then put forward as an MP.

Why it matters: Mostly because of the credulous press coverage Endacott is getting. No, an LLM is not running for office. Endacott is a (long-shot) candidate like anyone else; the fact that he says he’ll let AI make all of his decisions for him is quirky, but it doesn’t mean the robots are taking over, any more than Mattel would be taking over if Endacott said he was going to make all of his decisions by Magic 8-Ball. Endacott is the chair of Neural Voice, an AI voicebot company that powers AI Steve, so while most commentators see this as a quirky experiment in democracy, it’s not hard to look at it as a marketing stunt.

AI-generated video is getting better, but it’s still hard

What happened: Luma Labs released a new video-generation tool, Dream Machine, to the public. Dream Machine can generate five-second video clips in response to a text prompt, and they look…pretty good, despite occasional glitches and the limited length and subject matter.

Why it matters: The release of Sora earlier this year received a huge amount of attention. But while the videos we saw were great, the public still doesn’t have access to the tool (we’re told it will arrive “this year”), suggesting that a lot of what it produces is unusable. Dream Machine is probably the best text-to-video tool currently available to the public, and it tells us a lot about the current state of video generation. That state is…pretty impressive, but with a long way to go. Generated videos still struggle with motion, the models don’t understand physics or the real world, and most of all, the clips are still very short. Holding all the information in a video in context is hard, and the longer the clip, the greater the risk that the end doesn’t match the beginning (this is why Sora videos, which are significantly longer, tend to be simple and repetitive). Finally, users have very limited control over Dream Machine clips: you put in your prompt and the model does the rest, which reinforces the point that the model can manage a handful of techniques (such as the slow pan) but would struggle outside its comfort zone. Luma has taken a significant step forward, and video generation will probably get there over time, but it isn’t there yet.

Clever people make the world a little better

What happened: Researchers at the University of Washington and AssemblyAI published a paper outlining a new approach to hearing augmentation. Noise-reduction algorithms are much more effective when provided with samples of the underlying signal to amplify; in the paper’s method, the user (who is wearing headphones with embedded microphones) simply faces the person they are listening to for several seconds, and the algorithm identifies the signal based on the relative strength of the sound picked up by each microphone. The signal (the voice of the other person) arrives at roughly equal volume in both ears, while noise will generally be stronger in one ear or the other (relative volume in each ear is also one of the cues humans use to work out where a sound is coming from).
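
To make the enrollment idea concrete, here is a minimal illustrative sketch, not the authors’ implementation, based only on the description above: given stereo audio from the left and right headphone microphones, it flags frames where the two channels carry roughly equal energy, which, while the wearer is facing the speaker, should correspond to the target voice rather than off-axis noise. The function names, frame sizes, and threshold are assumptions made for illustration.

```python
# Illustrative sketch of the binaural enrollment idea described above.
# Assumed setup (not from the paper): left/right headphone microphone
# signals as NumPy arrays; frames with near-zero interaural level
# difference (ILD) are treated as the frontal target voice.
import numpy as np

def frame_rms(x, frame_len=1024, hop=512):
    """Root-mean-square energy of each frame of a mono signal."""
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    return np.array([
        np.sqrt(np.mean(x[i * hop : i * hop + frame_len] ** 2) + 1e-12)
        for i in range(n_frames)
    ])

def select_enrollment_frames(left, right, max_ild_db=1.0):
    """Indices of frames whose ILD is within +/- max_ild_db,
    i.e. sound arriving at both ears with similar strength."""
    ild_db = 20 * np.log10(frame_rms(left) / frame_rms(right))
    return np.where(np.abs(ild_db) <= max_ild_db)[0]

# Toy example: a "frontal" voice equally loud in both ears, plus a burst
# of off-axis noise that is louder in the right ear. Frames outside the
# burst pass the ILD test and stand in for enrollment samples.
rng = np.random.default_rng(0)
voice = rng.standard_normal(48_000)   # stands in for the target speech
noise = np.zeros(48_000)
noise[10_000:20_000] = rng.standard_normal(10_000)
left = voice + 0.3 * noise
right = voice + 0.9 * noise
frames = select_enrollment_frames(left, right)
print(f"{len(frames)} candidate enrollment frames selected")
```

In a real system, the frames selected this way would serve as the reference samples the noise-reduction model uses to decide what to keep.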

Why it matters: The ability to amplify sounds for people with hearing loss has been around for decades. The challenge is amplifying useful sounds (speech from the person in front of you) while filtering out background noise (clatter and conversation in a restaurant). AI can be enormously helpful here, and we’ve already made good progress in strengthening the desired signal, but there’s a real-world usability problem: you won’t have clean voice samples of the person you want to listen to, so the AI has nothing to tell it what to retain and what to filter. The binaural approach to enrollment is an elegant, practical solution that takes advantage of hardware that’s already everywhere (headphones with microphones) and has the potential to move noise-reduction algorithms further into the real world.

What happened but we didn’t write it up because everyone else did and all the coverage is kind of the same: Apple made some AI announcements

Three remarkable things is a selective, more-or-less weekly roundup of interesting events from the AI community over the past more-or-less week.


Techtonic

I'm a company CEO and data scientist who writes on artificial intelligence and its connection to business. Also at https://www.linkedin.com/in/james-twiss/