Generative AI could be the next frontier for smartphone chip design

Stable Diffusion Qualcomm Doggo

Credit: Robert Triggs / Android Authority

Generative AI has exploded in the last 18 months as a wave of super-powered AI services launched to create text, images, and music, or even to serve as an all-encompassing assistant.

Smartphone manufacturers and mobile chipmakers have embraced this trend too, building new hardware to run generative models on-device. Google’s new Pixel 8 series is already leading the way, and with Qualcomm’s Snapdragon 8 Gen 3 processor right around the corner, we’re anticipating even bigger strides in the coming year. Let’s dive into what to expect.

What we’ve seen so far

Google Tensor vs Snapdragon logos 2

Credit: Robert Triggs / Android Authority

The most recent example of a smartphone maker embracing generative AI in a much bigger way comes from Google. The company’s Pixel 8 phones are the first smartphones to run versions of Google’s generative AI Foundation Models on-device. Google says the models on the Pixel 8 phones are cut-down versions of its cloud-based models, but that running on-device is more secure and more reliable when a data connection isn’t available.

This is made possible thanks to the Tensor G3, with Google touting a much-improved Tensor Processing Unit (TPU) compared to last year. The company usually keeps the inner workings of its AI silicon a closely guarded secret, but it dished out a few nuggets of information. For one, Google says the Pixel 8 runs twice as many on-device machine learning models as the Pixel 6 series. It added that the generative AI models on the Pixel 8 series run up to 150 times more computations than the largest model on the Pixel 7.

In English, that means you’ve got features like multi-language voice dictation, Best Take functionality, Audio Magic Eraser, an improved Magic Eraser, Recorder summaries, and more, without the need for cloud processing.

Google isn’t the only mobile brand to recently embrace generative AI at a hardware level. Samsung confirmed earlier this month that the Exynos 2400 chipset is in the works, adding that it will deliver a 14.7x increase in AI computing performance over the Exynos 2200. Measuring AI computing performance is still a bit of a murky area, but Samsung has at least one use in mind.

The company said that it’s developed an AI tool for “upcoming phones” powered by the Exynos 2400. This tool will allow you to run text-to-image generation on-device without needing an internet connection.

What Qualcomm expects

Snapdragon 8 Gen 2 SoC up close

Credit: Qualcomm

Qualcomm’s Snapdragon chips power most flagship Android phones around the world, so we’re curious to see what the upcoming Snapdragon 8 Gen 3 will offer in terms of generative AI capabilities.

The company demonstrated a version of the Stable Diffusion text-to-image generator running on a Snapdragon 8 Gen 2 device earlier this year. That suggests image generation support could be a headline feature on the new chipset, especially with the Exynos 2400 offering the same capability.

Could we see dedicated generative AI silicon joining the dedicated AI silicon already on these chips, though? Qualcomm senior director Karl Whealton told Android Authority in an interview that if the existing hardware was powerful, efficient, and flexible enough, then it could cover “most things that you want to do.”

It seems like smartphones won’t get dedicated generative AI hardware in 2024.

Whealton said there was a tendency to look at particular functions related to generative AI (citing the example of softmax) and question whether specific hardware was needed to address them.

However, the executive felt that Qualcomm’s existing silicon was indeed powerful and flexible enough for these types of needs:

It seems like, you know, generally we have significant improvements to the engine, but an overhaul or a specific NPU around a certain workload… It continues to be the case so far through Snapdragon 8 Gen 2, that nothing like that’s needed.

That said, Qualcomm stuck to its guns when the first mobile chipsets with dedicated AI silicon debuted in 2017 and 2018, but the company eventually relented with the Snapdragon 855, which powered 2019’s flagship phones and brought the Hexagon Tensor Accelerator to the table.

We’ve also seen phones with a crazy 24GB of RAM debut earlier this year, with some people suggesting that generative AI models would be a good use for all that memory. In theory, you could have a wide-ranging AI assistant permanently occupying a chunk of that RAM. Whealton concurs that the amount of RAM will be important for future generative AI applications.

Phones with a ton of RAM seem like overkill, but one potential upside is that this could enable larger, more performant on-device AI models.

“I’m not gonna speak on behalf of the OEMs. But there is a big benefit. For RAM, generally, it’s going to increase performance,” Whealton explained. He added that the richness and “what we perceive to be the breadth of knowledge” of AI models is related to the size of the model.

He noted that an AI model would have to sit in RAM because using flash storage would result in slow loading times.

You wanna be pushing 10, 20, 40 tokens per second. That’s going to make it get good results and almost conversational, and things like that. So it really has to be RAM, and that’s why RAM is so important.
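That quote also hints at why RAM matters so much: on-device text generation is typically limited by how quickly the model’s weights can be streamed out of memory for each token. Here’s a rough back-of-envelope Python sketch of that relationship, assuming a memory-bandwidth-bound decoder; the model size and bandwidth figures are purely illustrative, not measured phone specs:

```python
# Back-of-envelope estimate: on a memory-bandwidth-bound decoder,
# generating each token means streaming the full set of model weights
# from RAM, so tokens/sec is roughly bandwidth / model size in bytes.
# All numbers below are illustrative assumptions, not real phone specs.

def tokens_per_second(params_billions: float, bytes_per_param: float,
                      bandwidth_gb_s: float) -> float:
    """Rough upper bound on decode speed for an on-device model."""
    model_bytes = params_billions * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / model_bytes

# Hypothetical 7B-parameter model on a phone with ~50 GB/s of memory bandwidth.
for bytes_per_param, label in [(2.0, "FP16"), (1.0, "INT8"), (0.5, "INT4")]:
    rate = tokens_per_second(7, bytes_per_param, 50)
    print(f"{label}: ~{rate:.0f} tokens/sec")
# Prints roughly 4, 7, and 14 tokens/sec -- only the smaller quantized
# variants get near Whealton's 10-40 tokens/sec conversational target.
```

Under those assumptions, flash storage (an order of magnitude slower than RAM) would drop well below a conversational rate, which is exactly why Whealton says the model “really has to be RAM.”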

Does that mean phones that are lacking on the RAM front (e.g. with 6GB or 8GB of RAM) would get left out of the loop to some extent?

Whealton doesn’t foresee a minimum RAM requirement for on-device generative AI, instead noting that more RAM will enable “a big increase” in functionality:

So I wouldn’t say that there’s a cut-off of saying ‘you’re kind of out of the loop.’ I would say the results are going to get dramatically better the more RAM you support.

Qualcomm PR executive Sascha Segan also suggests a hybrid future for smartphones that can’t hold a large AI model on-device. Such a phone could run a much smaller model locally, still allowing for on-device processing, and then verify those results against a larger, cloud-based model.
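As a thought experiment, that hybrid flow might look something like the sketch below. Every name here (run_local_model, query_cloud_model) is a hypothetical stand-in for illustration; neither Qualcomm nor Google has published such an API:

```python
# Hypothetical sketch of the hybrid approach Segan describes: answer
# with a small on-device model first, and only escalate to a larger
# cloud model when the local result looks shaky. All function names
# are illustrative stand-ins, not a real Qualcomm or Android API.

def run_local_model(prompt: str) -> tuple[str, float]:
    # Stand-in for a small on-device model; returns (text, confidence).
    return f"local draft for: {prompt}", 0.6

def query_cloud_model(prompt: str, draft: str) -> str:
    # Stand-in for a larger cloud model that checks/refines the draft.
    return f"cloud-verified answer for: {prompt}"

def hybrid_generate(prompt: str, threshold: float = 0.8) -> str:
    text, confidence = run_local_model(prompt)   # fast, private, offline
    if confidence >= threshold:
        return text                              # good enough, stay local
    return query_cloud_model(prompt, text)       # escalate when unsure

print(hybrid_generate("Summarize my last three voicemails"))
```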

Brute-forcing the problem with a ton of RAM to hold generative AI models is one solution, but we’ve also seen a trend of AI models being shrunk down, or quantized, to reduce their footprint. Support for INT4 and INT8 precision is therefore a key way to shrink generative AI models and run them on devices with less RAM, such as older flagships and mid-range phones. Thankfully, both Qualcomm and arch-rival MediaTek support this functionality.
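To make that concrete, here’s a minimal Python sketch of symmetric post-training quantization, the basic idea behind INT8 support. Real mobile toolchains from Qualcomm and MediaTek are far more sophisticated than this; the approach and numbers are purely illustrative:

```python
import numpy as np

# Minimal sketch of symmetric post-training quantization: map FP32
# weights onto INT8 using a single per-tensor scale. This only
# illustrates why INT8 cuts the memory footprint 4x versus FP32
# (and INT4 would halve it again); production toolchains use far
# more sophisticated schemes to preserve model quality.

def quantize_int8(weights: np.ndarray):
    scale = np.abs(weights).max() / 127.0              # map largest weight to 127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

weights = np.random.randn(1024, 1024).astype(np.float32)
q, scale = quantize_int8(weights)
error = np.abs(dequantize(q, scale) - weights).mean()
print(f"FP32: {weights.nbytes / 1e6:.1f} MB -> INT8: {q.nbytes / 1e6:.1f} MB")
print(f"mean absolute error: {error:.4f}")
```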

