TY - JOUR
T1 - Deep learning, transformers and graph neural networks
T2 - a linear algebra perspective
AU - Baggag, Abdelkader
AU - Saad, Yousef
N1 - Publisher Copyright:
© The Author(s) 2025.
PY - 2025/12
Y1 - 2025/12
N2 - In an age where Artificial Intelligence (AI) is being integrated into nearly every domain of science and engineering, it has become essential for experts in Numerical Linear Algebra to explore the foundational elements of deep learning and identify ways to contribute to its development. What’s particularly exciting is that Numerical Linear Algebra (NLA) lies at the heart of Machine Learning and more broadly AI. All AI techniques fundamentally rely on four core components: data, optimization methods, statistical intuition, and linear algebra. The initial phase of any neural network model involves transforming the problem into one that can be tackled using numerical methods, particularly through optimization techniques. Thus, in Large Language Models (LLMs) this first step involves mapping words or subwords into tokens, which are then embedded into Euclidean spaces. From that point, LLMs rely heavily on vectors, matrices, and tensors. The aim of this article is to outline the essential components of deep learning methods from a linear algebra perspective. It will cover deep neural networks, multilayer perceptrons, and the concept of “attention,” which plays a crucial role in large language models as well as other machine learning applications. A significant portion of the discussion will focus on methods that leverage graphs in neural networks, such as Graph Convolutional Networks. The paper will conclude with reflections on the future role of numerical linear algebra in the age of AI.
UR - https://www.scopus.com/pages/publications/105018832717
DO - 10.1007/s11075-025-02218-2
M3 - Article
AN - SCOPUS:105018832717
SN - 1017-1398
VL - 100
SP - 2095
EP - 2134
JO - Numerical Algorithms
JF - Numerical Algorithms
IS - 4
ER -