Three remarkable things that happened in AI this week*

Techtonic
4 min read · Mar 22, 2024


It looks like this robot is about to fall over. You’ll never guess what happens next. Source: Expressive Humanoid

At last, a smarter way for LLMs to compose music

There have been a number of efforts to build LLMs that can understand and generate music, including Meta’s MAGNeT and Google’s MusicRL. These work by predicting the next audio token, much as text-generation LLMs predict the next word, and what they have in common is that they’ve all been…okay.** The results sound like music, but they’re repetitive and often grainy, like something you might hear in an elevator or a medical office’s reception area. They also fall apart with more than one musical line, and the only vocals are meaningless phonemes. The problem is that musical tokens are complex; it’s much harder to understand a single chord or note or word in a musical context than it is to parse text or even a spoken phoneme. Also, the structure of music is very different from the structure of language (think about the importance of repeating a musical motif exactly and of keeping rhythm consistent), and our brains are pretty harsh on anomalies.
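To make the mechanism concrete, here’s a minimal sketch of what “predict the next audio token” means. The names, codebook size, and sampling loop below are illustrative stand-ins, not any real model’s API:

```python
# Minimal sketch of autoregressive audio-token generation, in the spirit of
# models like MAGNeT or MusicRL (all names and shapes here are illustrative).
# Audio is first quantized into a discrete vocabulary of "audio tokens"; the
# model then predicts one token at a time, exactly as a text LLM predicts
# the next word.
import numpy as np

VOCAB_SIZE = 1024  # hypothetical codebook size for the audio tokenizer
rng = np.random.default_rng(0)

def next_token_logits(context: list[int]) -> np.ndarray:
    """Stand-in for a trained model: returns scores over all audio tokens.
    A real model would be a large transformer conditioned on `context`."""
    return rng.normal(size=VOCAB_SIZE)

def generate(prompt: list[int], n_tokens: int, temperature: float = 1.0) -> list[int]:
    tokens = list(prompt)
    for _ in range(n_tokens):
        logits = next_token_logits(tokens) / temperature
        probs = np.exp(logits - logits.max())  # softmax over the vocabulary
        probs /= probs.sum()
        tokens.append(int(rng.choice(VOCAB_SIZE, p=probs)))
    return tokens

# A neural decoder (not shown) would turn the discrete tokens back into audio.
clip_tokens = generate(prompt=[17, 42, 256], n_tokens=50)
```

The hard part isn’t this loop; it’s that each audio token carries far more tangled information than a word does, which is where the “okay” results come from.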

So a team from HKUST has created ChatMusician by taking a different approach: training an LLM on the fundamentals of music composition and theory, and then, most critically, training it on sheet music instead of recorded music. This lets the model solve a much simpler problem, because the score has already disaggregated the music into its component pieces; the model never has to untangle the complex sounds of the audio itself. The resulting music isn’t going to find a place in the symphonic canon, but it sounds like something a human might create and perform. This strikes me as the best path forward for music generation, and the team has created a roadmap for others to follow and then extend.
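For a sense of what “training on sheet music” looks like in practice: ChatMusician represents scores in ABC notation, a plain-text format, so music becomes ordinary characters for the LLM. The snippet below is my own toy illustration, not the paper’s code; the point is that once music is text, a structural operation like diatonic transposition is simple symbol manipulation:

```python
# A toy illustration (not ChatMusician's code) of why symbolic scores are
# easier to model than audio: this ABC-notation melody is just text, so
# structure like pitch and repetition is directly visible to a language model.
abc_tune = """X:1
T:Toy Melody
M:4/4
K:C
C D E F | G G A G | F E D C | C4 |]"""

NOTES = ["C", "D", "E", "F", "G", "A", "B"]

def transpose_up_one_step(body: str) -> str:
    """Shift each note letter up one scale degree (toy diatonic transpose)."""
    out = []
    for ch in body:
        out.append(NOTES[(NOTES.index(ch) + 1) % len(NOTES)] if ch in NOTES else ch)
    return "".join(out)

lines = abc_tune.splitlines()
header, body = lines[:-1], lines[-1]
print("\n".join(header))
print(transpose_up_one_step(body))  # -> D E F G | A A B A | G F E D | D4 |]
```

An audio-token model has to learn “repeat this motif exactly” from fuzzy acoustic evidence; a score-based model can see the repetition literally, character by character.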

At last, a creepy dancing robot that could almost certainly kill you

We’ve been working on generalist humanoid robots for a long time. Indeed, the idea of a robot that can perform a wide range of tasks in an environment built for humans predates the invention of the computer; not surprisingly, the idea of replacing humans with machines captures our imagination. Here’s a made-up fact that I’d like to think is true: CAT scans show that when we think about humanoid robots, it activates the same part of the brain as when we think about large predators.

There have been a lot of attempts at building humanoid robots, and many look convincing, especially in controlled settings and when performing scripted actions. But there are still a lot of problems to work out, including understanding the built environment (machine vision under different visual conditions and the ability to create a model of the world from that vision), integrating touch feedback (for example, picking something up without crushing it), moving around and over obstacles, and responding to unusual circumstances, among many others.

One significant and often underappreciated problem is balance. Standing on two feet is hard; humans make tiny adjustments all the time, without realizing it, to avoid falling over. This is why so many robots are still on wheels, which obviously forces significant compromises in versatility, or are quadrupeds.

So congratulations, I guess, to the team at UC San Diego that has partially solved this problem in a novel way, creating a bipedal robot that constantly shifts from foot to foot. This is a smart way of managing the balance problem, because the ongoing motion lets the robot continuously keep its base of support under its center of gravity, which is easier than making occasional small corrections from a standstill. An analogy would be that it’s easier to balance a ruler on your hand if you’re constantly moving your hand than if you’re trying to hold still as much as possible.
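Here’s a toy one-dimensional sketch of that intuition (my own illustration, not the UCSD team’s controller): the center of mass drifts, lean amplifies itself, and “stepping” means snapping the base of support back under the center of mass. Stepping constantly keeps the worst-case lean far smaller than stepping occasionally:

```python
# Toy 1-D balance sketch (my illustration, not the UC San Diego controller).
# The center of mass (CoM) drifts each tick, and lean amplifies itself, like
# an inverted pendulum. "Stepping" moves the base of support back under the
# CoM. Frequent steps keep the lean (and hence the tipping torque) small.
import random

random.seed(1)

def simulate(step_every: int, n_ticks: int = 60) -> float:
    com, base = 0.0, 0.0
    worst_lean = 0.0
    for t in range(n_ticks):
        # Random disturbance plus positive feedback: more lean -> faster fall.
        com += random.uniform(-0.02, 0.02) + 0.05 * (com - base)
        worst_lean = max(worst_lean, abs(com - base))
        if t % step_every == 0:
            base = com  # take a step: put the foot back under the CoM
    return worst_lean

print("max lean, stepping every tick      :", round(simulate(step_every=1), 3))
print("max lean, stepping every 10th tick :", round(simulate(step_every=10), 3))
```

The constant shuffling looks goofy, but it turns balance from a precision problem into a tracking problem, which control systems are much better at.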

Look, does this work bring humanity closer to extinction? Probably. But, I mean, dancing robots!

At last, an LLM that’s smarter than the average human

Normally I don’t like to include more than one extinction-level-event (ELE) story in these roundups, but I guess it was just one of those weeks. Maxim Lott gave several different LLMs an IQ test, providing accommodations to the LLMs as if they were visually impaired. For the first time, one of the LLMs scored above 100, the score meant to represent the average human’s intelligence, with Claude-3 achieving a 101.

Look, this isn’t the Singularity. There are obviously all sorts of reasons why this news is of only symbolic importance, including that IQ tests are pretty questionable instruments with a dark history, that a score of 101 is statistically indistinguishable from a score of 100, and that humans have a wide range of skills, not measured by IQ tests, that LLMs lack. As a sign of AI progress, this is no more significant (indeed, probably less significant) than the news of this or that model exceeding average human performance on a given benchmark.

But did it get a lot of attention, and give a lot of bloggers the opportunity to use visuals from Terminator? Absolutely.

* In the alternative but widespread definition of a week as an elastic period of time that’s longer than a few days, but definitely shorter than several months

** Honestly the results are pretty mediocre. But this is an incredibly hard problem, and I’m not creating a music-composition LLM, so I don’t want to say anything bad about the people who are.

“Three remarkable things that happened in AI this week” is a more-or-less weekly roundup of the most noteworthy events that have transpired in the world of AI.


Written by Techtonic

I'm a company CEO and data scientist who writes on artificial intelligence and its connection to business. Also at https://www.linkedin.com/in/james-twiss/
