Apple is working on AI models that can run on-device rather than on the cloud

Daniel Nenni

Admin
Staff member
This will keep TSMC busy!

Apple's hiring and research indicate a big AI unveiling this year

Forward-looking: Apple has yet to reveal any significant GenAI project, unlike competitors, which have rushed out apps with varying degrees of success. The company has said it wants to avoid the technology's notorious flaws by taking a cautious approach. Part of this plan is to bypass cloud-based AI in favor of locally processed solutions.
The Financial Times reports that Apple's hiring records, previous statements, and comments from analysts suggest the company will unveil the results of its investments in generative AI this year. Apple's future products will likely run generative AI models using built-in hardware instead of cloud services.

While Apple has been more cautious than OpenAI, Microsoft, Google, and Adobe about releasing generative AI chatbots or image generators, it has made significant investments in the sector. A report from last September indicated that the company was spending millions per day on multiple AI projects utilizing text, voice, and images.

Apple CEO Tim Cook acknowledged the company's investments last May, saying he wants to avoid the pitfalls companies like Google and OpenAI have encountered as early pioneers. Amid the initial push, generative AI has become notorious for its lack of trustworthiness, copyright issues, and massive energy demands.

Wedbush Securities analyst Daniel Ives told FT he'd be surprised if Apple didn't complete a notable AI-related deal this year, referencing the 21 AI startups the Cupertino giant has acquired since 2017. Furthermore, Morgan Stanley reports that nearly half of Apple's AI job postings mention deep learning, indicating an aggressive push in the sector.

Morgan Stanley also expects Apple's iOS 18 unveiling at the Worldwide Developers Conference in June to focus on AI. The company's biggest foundational LLM, internally named "Ajax GPT," could become the engine behind an improved version of its virtual assistant, Siri.

Apple's AI models could leverage the neural engines of its latest chips, like the M3 and A17 Pro. Similarly, Neural Processing Units lie at the heart of Intel's recent push toward AI PCs with its Meteor Lake notebook processors, a push Microsoft may soon complement with a new AI-powered version of Windows.
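
For context, developers can already target the neural engine through Apple's public Core ML stack. A minimal sketch of that path using the coremltools API; the toy model and input shape are placeholders, not anything Apple has announced:

```python
# Sketch: converting a small PyTorch model to Core ML so the runtime can
# schedule it onto the Apple Neural Engine. Illustrative only; the model
# and shapes are made up.
import torch
import coremltools as ct

# A toy network standing in for a real on-device model.
model = torch.nn.Sequential(
    torch.nn.Linear(512, 2048),
    torch.nn.ReLU(),
    torch.nn.Linear(2048, 512),
).eval()

example_input = torch.rand(1, 512)
traced = torch.jit.trace(model, example_input)

# compute_units=ALL lets Core ML dispatch work to the CPU, GPU, or
# Neural Engine as it sees fit.
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(shape=example_input.shape)],
    compute_units=ct.ComputeUnit.ALL,
    convert_to="mlprogram",
)
mlmodel.save("toy_model.mlpackage")
```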

However, the clearest indicator of Apple's goal to process LLMs on its devices is the company's research paper from earlier this month, which proposed utilizing flash memory for the task. Google and Samsung recently unveiled smartphones using onboard AI hardware for image editing, real-time text translation, search, and other tasks.
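
The paper's core idea, in simplified form: keep a small working set of weights in DRAM and pull rows in from flash on demand, exploiting the fact that most feed-forward neurons are inactive for any given token. A toy sketch of that pattern, with made-up sizes and a memory-mapped file standing in for NAND:

```python
# Toy illustration of flash-backed inference in the spirit of Apple's
# paper: large weight matrices live on "flash" (a memory-mapped file),
# and only rows predicted to be active are copied into a small DRAM
# cache. Sizes and the activity predictor are made up.
import numpy as np

ROWS, COLS = 8192, 4096          # an FFN weight matrix too big for "DRAM"
CACHE_ROWS = 1024                # DRAM budget: only 1/8 of the rows

# Persist the matrix once, then memory-map it read-only (stand-in for NAND).
weights = np.memmap("ffn_weights.bin", dtype=np.float16,
                    mode="w+", shape=(ROWS, COLS))
weights[:] = np.random.randn(ROWS, COLS).astype(np.float16)
weights.flush()
flash = np.memmap("ffn_weights.bin", dtype=np.float16,
                  mode="r", shape=(ROWS, COLS))

dram_cache: dict[int, np.ndarray] = {}   # row index -> row held in DRAM

def load_rows(active_rows):
    """Copy rows needed for this token from flash into DRAM, evicting
    stale rows when over budget (crude FIFO eviction)."""
    for r in active_rows:
        if r not in dram_cache:
            dram_cache[r] = np.array(flash[r])   # one flash read per row
    while len(dram_cache) > CACHE_ROWS:
        dram_cache.pop(next(iter(dram_cache)))

def sparse_ffn(x, active_rows):
    """Compute only the active neurons; inactive ones contribute ~0."""
    load_rows(active_rows)
    return {r: float(dram_cache[r] @ x) for r in active_rows}

# Pretend a predictor said ~2% of neurons fire for this token.
x = np.random.randn(COLS).astype(np.float16)
active = np.random.choice(ROWS, size=160, replace=False)
outputs = sparse_ffn(x, active)
```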

 
Doing AI on a local device (aka "privacy") is Apple's PR mantra. The problem is that a GPT-4-class model requires 64+ GB of RAM (different numbers are floating around the Internet, some much higher). We'll probably get there eventually, but right now 64 GB of RAM in a phone is too much. Even if we could have it, I'm not sure that's the optimal way to do it. Assuming one does not need to run the model constantly, it would be more cost-efficient to send the queries to the cloud, instead of equipping every phone with tons of RAM and a super GPU that would be used only for LLMs, and only occasionally.
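
For a rough sense of the numbers (the parameter counts below are assumptions; OpenAI has not published GPT-4's size): weight memory is roughly parameter count times bytes per parameter, so quantization changes the picture dramatically:

```python
# Back-of-envelope weight memory for various model sizes and precisions.
# Parameter counts are illustrative; GPT-4's actual size is unpublished.
GIB = 1024 ** 3

def weight_memory_gib(params_billions: float, bits_per_param: int) -> float:
    return params_billions * 1e9 * (bits_per_param / 8) / GIB

for params in (7, 70, 180):
    for bits in (16, 8, 4):
        print(f"{params:>4}B params @ {bits:>2}-bit: "
              f"{weight_memory_gib(params, bits):6.1f} GiB")
# e.g. 70B @ 16-bit is ~130 GiB, but 7B @ 4-bit is ~3.3 GiB -- the
# latter is already phone-sized, which is why on-device LLM efforts
# target small, heavily quantized models.
```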
 
That next-to-last line might make the difference, as Apple could 'expand' RAM by using local SSD storage as swap space. That should be faster than going out to the cloud and back, with no cloud costs. Many billions of neural engine cores are already installed in many pockets.
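
A rough comparison backing up that point; every figure below is a ballpark assumption, not a measurement:

```python
# Ballpark: time to pull a working set of weights from local flash
# versus shipping a query to the cloud. All numbers are assumptions.
working_set_mb = 200          # weights needed per token burst (assumed)
flash_bw_gbps  = 2.0          # NVMe-class flash read bandwidth (assumed)
cloud_rtt_ms   = 80           # typical mobile network round trip (assumed)

flash_ms = working_set_mb / (flash_bw_gbps * 1000) * 1000
print(f"flash read: ~{flash_ms:.0f} ms, cloud round trip: ~{cloud_rtt_ms} ms")
# ~100 ms vs ~80 ms per exchange -- comparable latency, but the flash
# path has no per-query server cost and keeps the data on the device.
```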
 
Just read the paper, although I am not an engineer. TL;DR version: they can optimise inference on-device with suboptimal DRAM by managing the data flow between it and the NAND.

My question (because I don't see the answer in the paper) is will this require more NAND (and DRAM) on the iPhone above and beyond the natural growth rate?

Any inputs would be welcome.
 