xLSTM better than Transformer models?

More topics: Apple invests in AI infrastructure, and DeepSeek unveils a new open-source LLM


6 min read

xLSTM better than Transformer models?

Hi AI Enthusiasts,

Welcome to this week's Magic AI, where we bring you exciting updates from the world of Artificial Intelligence (AI) and technology. A lot has happened in the last week.

This week's Magic AI tool is an automation powerhouse. Perfect for meetings, sales calls, coaching sessions, and YouTube videos.

Let's explore this week's AI news together. Stay curious! ๐Ÿ˜Ž

In today's Magic AI:

  • xLSTM: Language model from Europe

  • Apple invests in AI infrastructure

  • DeepSeek unveils a new open-source LLM

  • Online Course: Start Your Artificial Intelligence Journey

  • Magic AI tool of the week

  • Hand-picked articles of the week

๐Ÿ“š Our Books and recommendations for you

Our Books and recommendations for you

Top AI news of the week

๐Ÿ’ฌ xLSTM: Language model from Europe

The European AI startup NXAI, founded by AI pioneer Sepp Hochreiter, has presented a new architecture for language models in a paper. The new architecture is based on the familiar LSTM architecture. And it is called xLSTM (short for Extended Long Short-Term Memory).

LSTMs have a kind of built-in short-term memory. The new xLSTM architecture combines the transformer technology with LSTM. The research question of the paper is:

"How far do we get in language modeling when scaling LSTMs to billions of parameters, leveraging the latest techniques from modern LLMs, but mitigating known limitations of LSTMs?"

According to the paper, larger xLSTM models could compete with current large language models. The researchers have enhanced LSTM architecture through exponential gating with memory mixing and a new memory structure. For a more in-depth look at the new architecture, you should check out the full paper.

Our thoughts

We have often used the LSTM architecture in the past for time series forecasts or anomaly detection in quality control. The LSTM architecture is powerful, but LSTMs cannot compete with state-of-the-art LLMs like Llama 3.

The new architecture expands the LSTM architecture so that the new architecture has the potential to be a real competitor to actual transformer models like Llama 3. We find the approach very exciting and will continue to follow this topic.

โœ๐Ÿฝ What is your opinion on this new LSTM architecture?

๐Ÿ“š Book recommendation (it's free) to learn more about state-of-the-art Deep Learning models:Dive into Deep Learning

More information

๐ŸŽ Apple invests in AI infrastructure

Apple plans to integrate its own chips into servers. The demand for computing power for generative AI is increasing, and Apple is planning several functions in the new operating systems. Generative AI requires big server parks with much computing power.

According to the famous analyst Jeff Pu, Apple's manufacturing partner Foxconn builds "AI servers" based on Apple's M2-Ultra-SoC. In 2025, the M4 will then be integrated into the servers. A few days ago, Apple unveiled the new iPad Pro, which already uses the new M4 chip. Apple calls the new M4 chip an AI powerhouse.

Our thoughts

Apple has been using its own ARM chips in Macs since 2020. Apple develops hardware and software hand in hand, which is the reason for the impressive performance of Apple computers.

In addition, Apple has developed the efficient machine learning framework MLX for Apple chips. It's like PyTorch but optimized for Apple hardware. It's an open-source framework, available on GitHub with many examples.

โœ๐Ÿฝ What is your opinion on Apple hardware for machine learning?

More information

๐Ÿค– DeepSeek unveils a new open-source LLM

The AI startup DeepSeek just presented DeepSeek-V2, an open-source mixture-of-experts (MoE) large language model. This open-source model has 236 billion parameters and supports a context length of 128,000 tokens.

DeepSeek-V2 achieves high performance in benchmarks. DeepSeek has adjusted the mixture-of-experts architecture and shows the potential of large open-source models. You can check out the official paper to learn more about this new open-source model. It also requires significantly fewer parameters than many of its competitors.

MMLU accuracy vs. activated parameters, among different open-source models (Image by DeepSeek)

Image by DeepSeek

DeepSeek-V2 is available on HuggingFace. In addition, you can use the cost-effective API of DeepSeek-V2.

Our thoughts

According to the benchmarks, DeepSeek-V2 seems to be a powerful open-source LLM. But it also has its weaknesses. It works especially well in Chinese because they used mainly data from China for training. In addition, it has its strengths, especially in math and coding tasks. The first tests also show that the model answers questions from a Chinese perspective.

More information

๐ŸŽ“ Start Your Artificial Intelligence Journey

In today's competitive job market, mastering a range of technical skills is more important than ever. The 100-hour course โ€œThe Complete Excel, AI and Data Science Mega Bundleโ€œ* is designed to equip you with in-demand capabilities in Excel, Python, and Artificial Intelligence (AI).

Learn Excel, Artificial Intelligence and Data Science*

It is a completely self-paced online learning course.

Here you'll discover:

  • Excel functions, formulae, and data visualization techniques. Unlock the financial functionalities of Excel to manage and analyze business data.

  • Gain an edge in the market by leveraging machine learning for stock prediction.

  • Learn to build advanced models, understand neural networks, and employ TensorFlow.

๐Ÿ‘‰๐Ÿฝ Enroll now and set yourself on a path to technological mastery and unparalleled career growth.*

Magic AI tool of the week

๐Ÿช„ CastMagic*- Automate tedious tasks with ease

Do you find writing meeting summaries or follow-up e-mails tedious? Then, you should check out the tool CastMagic! This tool turns your audio into content.

Imagine you have a sales call. Then, you need a summary of the call, including the customer's name, asked questions, sentiment, next steps, and so on. Right!

CastMagic can do all of this for you. The tool does all the tedious work for you, and you can focus on the conversation with your customer. It's a win for you and your customers.

๐Ÿ‘‰๐Ÿฝ Try CastMagic for Free*

Articles of the week

๐Ÿ“– You might also be interested in these blog articles:

Thanks for reading, and see you next time.

- Tinz Twins

P.S. Have a nice weekend! ๐Ÿ˜‰๐Ÿ˜‰

Did you enjoy our content and find it helpful?ย If so, subscribe to our Tinz Twins Hub and get our ebook for FREE. Donโ€™t forget to follow us on X.ย ๐Ÿ™๐Ÿฝ๐Ÿ™๐Ÿฝ

* Disclosure: The links are affiliate links, which means we will receive a commission if you purchase through these links. There are no additional costs for you.