Unless you’re an accountant, the whole end-of-year tax filing can be a nightmare. While you might look forward to your refund, you’re probably not too excited about coughing up money to a tax expert or spending a couple of hours filing yourself. But what if the entire process could be completed with a digital assistant on your smartphone? The time and cost savings from such a productivity application would be substantial. That’s the potential power of on-device artificial intelligence (AI).
The tax return example is just one of many ways on-device AI could save consumers and enterprises time and money. From optimising smart home appliances to automatically drafting a client contract, on-device generative AI – and the productivity apps it can enable – is the key to unlocking an exciting new epoch in the smartphone and PC markets.
AI workloads falling from the cloud
AI for personal and work devices isn’t a novel concept; however, the vast majority of applications run in the cloud. While using the cloud is great for resource capacity and storage, a cloud-centric AI model suffers from technical challenges like high latency and network congestion. As a result, the user experience for many cloud-based AI apps falls short of what customers expect.
To remedy these technical challenges, smartphone manufacturers are beginning to embed AI accelerators in high-end devices to support local AI inferencing. However, on-device AI applications have primarily been limited to voice control, AI-enhanced imaging, and other “experience-focused” applications. Unlocking the value of on-device AI requires the development of a broad range of productivity AI applications tailored for specific use cases using compressed generative AI models.
The value of on-device AI
ChatGPT started a massive hype cycle for generative AI among consumers and enterprises, leading to testing and deployment across various markets. With most of these AI models deployed in the public cloud, users are experiencing network congestion, data privacy issues, and growing cloud bills as a result of expanding user bases. In contrast, local AI workloads – enabled by on-device AI – enhance the user experience by eliminating network latency, reducing network and cloud costs, supporting future AI capabilities, bolstering data security, and enabling model personalisation. These benefits are explained in further detail below:
- Improved network latency: AI applications such as digital assistants and enterprise extended reality (XR) require low latencies to provide the most natural, personalised, and engaging interaction possible. Bringing AI inferencing to the device eliminates network latency, enabling software developers to create a broader range of productivity applications for “mission-critical” use cases that would be impossible with a cloud-centric AI architecture.
- Cost savings: As AI deployments continue to ramp up, demand for networks and cloud hosting will further increase costs for application developers and enterprises. Local AI processing eliminates many of these costs and also reduces data center energy usage. Optimisation tools like compression and quantisation will play a vital role in enabling on-device generative AI by producing accurate, low-power AI models with fewer than 15 billion parameters.
- Support for future AI capabilities: Nobody wants to invest in a device that will become obsolete in a year or two. On-device AI accelerators can be optimised to support generative AI models and applications that have yet to hit the market. In turn, smartphone and PC owners maximise their return on investment (ROI).
- Enhanced data security: While public cloud service providers deploy security safeguards, they are not bulletproof, as evidenced by cloud-based breaches at several organisations in recent years. On-device AI keeps user data and sensor data local, minimising the risk of personal information or intellectual property (IP) becoming compromised. It’s also worth noting that the low latency capabilities of on-device AI models improve threat detection and other cybersecurity functions.
- Model personalisation: While AI models can be personalised in the public cloud, this runs against end-user demands for greater data privacy and cost optimisation. On-device processing enables AI models to be fine-tuned locally to end-user preferences, behaviours, and applications. This is particularly valuable as it enables AI models to be efficiently personalised using a variety of incoming data sources, including Wi-Fi, GPS, and on-device sensors, among others. This has significant benefits, including enhanced AI productivity, improved accessibility, and more intuitive and automated interactions/experiences.
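To make the quantisation point in the cost-savings bullet concrete, here is a rough back-of-the-envelope sketch of how weight precision affects the memory a model needs. It assumes a model at the 15-billion-parameter ceiling mentioned above and counts the weights only; activations, caches, and the small overhead that real quantisation schemes add are ignored, so the figures are illustrative rather than measured.

```python
def model_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 1e9


# Assumption: a model at the 15-billion-parameter ceiling discussed above.
PARAMS = 15e9

# Weight-only footprint at common precisions (32- and 16-bit floats,
# 8- and 4-bit quantised integers).
footprints = {bits: model_memory_gb(PARAMS, bits) for bits in (32, 16, 8, 4)}

for bits, gigabytes in footprints.items():
    print(f"{bits:>2}-bit weights: {gigabytes:.1f} GB")
```

Under these assumptions, moving from 16-bit to 4-bit weights cuts the footprint from roughly 30 GB to 7.5 GB, which is the difference between a model that cannot fit in a typical handset’s memory and one that plausibly can.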
On-device AI makes consumers more productive
Consumers are upgrading their smartphones at a slower rate than in years past. Perhaps the market has hit the point of diminishing returns: each new smartphone iteration seems to offer little to no additional value compared to its predecessor. ABI Research believes that consumer demand for smartphones and tablets can be stimulated through a combination of on-device AI and productivity-focused AI applications.
If device manufacturers demonstrate measurable ROI in cost and time savings with these on-device AI apps, consumers will be incentivised to upgrade their devices more frequently. Whether it’s the time savings from automatically scheduling a family get-together or the utility cost savings from optimised energy usage, consumers will have a new reason to purchase newer smartphone models. Even more, productivity AI apps could potentially help an artist or producer bring a creative idea to life.
Epitomising the market trajectory, Qualcomm and Samsung recently partnered to support mobile AI capabilities for the Galaxy S24 Series. Not only will productivity AI applications shorten device refresh cycles, but the new hardware will also give device manufacturers like Samsung justification to increase retail prices for their products.
How enterprises can leverage on-device AI
The story is not dissimilar for the enterprise market, where a lack of device innovation has stagnated shipment growth for PCs and laptops. Deploying AI natively on these devices will entice enterprises due to the value generated from offline productivity, lower latencies, enhanced data privacy, improved user-device communication, and model personalisation.
On-device productivity AI saves companies time and money by automating administrative tasks – such as scheduling, contract drafting, and note taking – and enabling users to be productive even when their device is offline. Enterprises that adopt these novel generative AI applications stand to save thousands of dollars per year on each employee and enable employees to use generative AI-powered apps like Microsoft Copilot while on the move, such as when travelling to client sites.
ABI Research has seen the earliest deployments of on-device AI within the enterprise occur in back-end operations, offices, and professional services sectors, as early applications (like Microsoft Copilot) offer a clear ROI. However, as on-device AI matures with productivity AI applications and support for different form factors, expect other verticals like manufacturing, healthcare, logistics & transportation, and telecommunications to increase adoption.
While smartphones and PCs hold the lion’s share of the on-device AI discussion for enterprises, the same benefits can be applied to the automotive, XR, and internet of things (IoT)/wearables spaces. Indeed, reduced latencies enhance in-vehicle digital assistant capabilities, and data privacy protects healthcare patients’ or manufacturers’ sensitive data while eliminating cloud computing costs.
Moreover, mining and logistics firms will appreciate the high reliability of on-device AI when using XR and IoT devices in remote areas prone to network interruptions. Like the consumer segment, on-device AI hardware with appropriate productivity AI apps is expected to shorten device refresh cycles among enterprises as they seek the next “killer app.”
The future of on-device AI
A wave of recent trends has been integral to supporting on-device AI. Heterogeneous chipsets, such as Qualcomm’s Snapdragon X Elite for PCs, consolidate the graphics processing unit (GPU), central processing unit (CPU), and neural processing unit (NPU) into a single system-on-chip (SoC). This makes the AI workload operate more efficiently and improves the application’s performance.
Besides this, there has been a big push to build highly optimised, device-ready small generative AI models capable of matching the accuracy, performance, and knowledge of much larger models without the high power, memory, and compute requirements. This software innovation has been complemented by increased collaboration between key stakeholders to lower barriers to entry – through software development kits (SDKs) like the Qualcomm AI Stack and no/low-code platforms – and accelerate the development of productivity AI applications.
The fate of the on-device AI market falls on the shoulders of three key stakeholders:
- Independent software vendors (ISVs) leverage available AI models and tools to build AI applications optimised to the underlying hardware.
- Chipset vendors ensure the chipset can run AI on-device and ease app development by offering SDKs. It’s also vital that chipset vendors ensure their silicon is capable enough to address device limitations.
- Original equipment manufacturers (OEMs) consolidate various components into one device and align applications with consumer/enterprise pain points and hardware.
Through close collaboration between these companies, innovation can be pushed further to ensure sustained long-term revenue streams via on-device productivity AI. For example, the Ray-Ban Meta smart glasses collection uses Qualcomm chipsets to provide on-glasses AI, reducing network latencies and enabling real-time translation capabilities. What were once perceived as “entertainment” devices will be considered essential “productivity” devices that offer value beyond enhanced photography or generic voice assistants.
To close, ABI Research anticipates that the market will gradually adopt a “hybrid AI” approach. With a hybrid AI architecture, AI workloads reside at the edge, the cloud, or on-device – depending on commercial and technical priorities. For example, ultra data-sensitive applications may have model training take place in the cloud, while inferencing and fine-tuning these models – which leverage user data – occur on-device to ensure maximum privacy. By adopting a hybrid AI approach, users can distribute power consumption, reduce memory bottlenecks, and maximise price-to-performance ratios.
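As a purely illustrative sketch of the hybrid AI idea, the snippet below shows how a workload might be placed on-device, at the edge, or in the cloud according to privacy, latency, and memory priorities. The `Workload` fields, the `place` function, and every threshold in it are hypothetical assumptions made for this example; they do not describe any vendor’s actual scheduler.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    DEVICE = "on-device"
    EDGE = "edge"
    CLOUD = "cloud"


@dataclass
class Workload:
    name: str
    uses_personal_data: bool  # does the task touch private user data?
    model_size_gb: float      # memory the model needs to run
    max_latency_ms: int       # latency the application can tolerate


def place(w: Workload,
          device_memory_gb: float = 8.0,   # hypothetical handset budget
          edge_memory_gb: float = 64.0) -> Target:
    """Pick a location for a workload using simple, assumed rules."""
    fits_on_device = w.model_size_gb <= device_memory_gb
    # Privacy first: keep personal data local whenever the model fits.
    if fits_on_device and w.uses_personal_data:
        return Target.DEVICE
    # Tight latency budgets also favour local inference.
    if fits_on_device and w.max_latency_ms < 100:
        return Target.DEVICE
    # Mid-sized, moderately latency-sensitive work goes to the edge.
    if w.model_size_gb <= edge_memory_gb and w.max_latency_ms < 500:
        return Target.EDGE
    # Everything else, such as large-scale training, stays in the cloud.
    return Target.CLOUD


tax_assistant = Workload("tax assistant", True, 4.0, 1000)
xr_scene = Workload("XR scene understanding", False, 20.0, 200)
training = Workload("model training", False, 500.0, 60000)

for w in (tax_assistant, xr_scene, training):
    print(f"{w.name}: {place(w).value}")
```

In practice the decision would also weigh battery state, connectivity, and cost, but the same principle applies: the placement follows the commercial and technical priorities of each workload.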
Reece Hayden is a principal analyst at ABI Research, and leads the analyst firm’s AI and machine learning research service.