Apple Unleashes OpenELM, a Slightly More Accurate LLM

Ritesh Kanjee
3 min read · Apr 26, 2024

Apple, not typically associated with openness, has released a generative AI model called OpenELM, which the company says outperforms comparable language models trained on public datasets.

OpenELM: A New Era in AI

Apple’s OpenELM release marks a significant advancement for the AI community, offering efficient, on-device AI processing ideal for mobile apps and IoT devices with limited computing power. This enables quick, local decision-making essential for everything from smartphones to smart home devices, expanding the potential for AI in everyday technology.

[Image: OpenELM on Hugging Face]

Specifications and Hard Facts

OpenELM is available as pre-trained and instruction-tuned models with 270 million, 450 million, 1.1 billion, and 3 billion parameters. The models use a technique called layer-wise scaling, which allocates parameters non-uniformly across the transformer's layers instead of giving every layer the same size. Apple says this yields better accuracy, measured as the percentage of correct predictions in benchmark tests.
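To make layer-wise scaling a little more concrete, here is a minimal sketch in plain Python. It assumes, following my reading of the OpenELM paper, that the number of attention heads and the feed-forward width are linearly interpolated from the first layer to the last. The function name `layerwise_scaling`, the `alpha` and `beta` ranges, and the dimensions are all illustrative assumptions, not Apple's actual configuration.

```python
def layerwise_scaling(num_layers, d_model, head_dim,
                      alpha=(0.5, 1.0), beta=(0.5, 4.0)):
    """Return a per-layer config where early layers are narrow and late layers wide.

    alpha scales the number of attention heads, beta scales the FFN width.
    Both ranges are hypothetical values chosen for illustration.
    """
    configs = []
    for i in range(num_layers):
        # Position of this layer in [0, 1]; a single layer sits at 0.
        t = i / (num_layers - 1) if num_layers > 1 else 0.0
        a = alpha[0] + (alpha[1] - alpha[0]) * t
        b = beta[0] + (beta[1] - beta[0]) * t
        configs.append({
            "layer": i,
            # Heads grow with depth; never drop below one head.
            "num_heads": max(1, round(a * d_model / head_dim)),
            # FFN hidden size as a multiple of the model dimension.
            "ffn_dim": round(b * d_model),
        })
    return configs


if __name__ == "__main__":
    for cfg in layerwise_scaling(num_layers=4, d_model=1280, head_dim=64):
        print(cfg)
```

A uniform transformer would give every layer the same head count and FFN width. The sketch instead spends fewer parameters near the input and more near the output, which is the basic idea behind the technique.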

Performance Metrics

  • At roughly one billion parameters, Apple reports that OpenELM is 2.36% more accurate than OLMo while using 2x fewer pre-training tokens.
  • Despite its higher accuracy, OpenELM is slower than OLMo in performance tests.

[Image: OpenELM benchmarks]

Training and Evaluation Framework

Apple’s claim to openness comes from its decision to release not just the model, but also its training and evaluation framework. The release covers the complete framework for training and evaluating the language model on publicly available datasets, including training logs, multiple checkpoints, and pre-training configurations.

[Image: M4 Apple Silicon Mac]

Limitations and Future Optimizations

The release of OpenELM models aims to empower and enrich the open research community by providing access to state-of-the-art language models. However, those using OpenELM are warned to exercise due diligence before trying the model for anything meaningful. Apple’s boffins attribute the slower-than-OLMo speed largely to their “naive implementation of RMSNorm,” a technique for normalizing the activations inside a neural network, and say they plan to explore further optimizations.
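For readers who have not met RMSNorm before, here is a minimal sketch in plain Python. It is a textbook version of the operation, not Apple's code: each vector is divided by its root-mean-square and then multiplied by a learned per-element gain. The function name `rms_norm` and the `eps` default are my own assumptions for illustration.

```python
import math


def rms_norm(x, gain, eps=1e-6):
    """Normalize x to unit root-mean-square, then apply a per-element gain."""
    if len(x) != len(gain):
        raise ValueError("x and gain must have the same length")
    # Mean of the squared activations across the feature dimension.
    mean_sq = sum(v * v for v in x) / len(x)
    # eps guards against division by zero for an all-zero input.
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * g for v, g in zip(x, gain)]


if __name__ == "__main__":
    print(rms_norm([3.0, 4.0], gain=[1.0, 1.0]))
```

Written step by step like this, the operation makes several separate passes over the data. Optimized implementations typically fuse those passes into one kernel, which is the kind of speed-up a "naive" version leaves on the table.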

Availability and Licensing

OpenELM is available for use, but the accompanying software is not released under a recognized open-source license. Apple reserves the right to file a patent claim if any derivative work based on OpenELM is deemed to infringe on its rights.

Code Conversion and Inference

The release is accompanied by code to convert the models to the MLX library for inference and fine-tuning on Apple devices. This lets the model run locally on Apple hardware rather than over the network, which makes OpenELM more interesting to developers.

Personally, I hope to see this model running on recent iPhone models, which would make Siri much smarter. One can hope. I guess we’ll have to wait and see at WWDC 2024.

If you found this article interesting and want to be on the cutting edge of AI, then ready yourself to level up your AI and Computer Vision skills with practical, industry-relevant knowledge. Join us at Augmented AI University and bridge the gap between academic learning and the skills you need in the workplace.

Don’t miss out on the opportunity to enhance your career with cutting-edge AI education. Enroll now and start building a foundation of practical AI skills to tackle tomorrow’s technological challenges!

Enroll in Augmented AI University Today!



Written by Ritesh Kanjee

We help you master AI so it does not master you! Director of Augmented AI