Part 1: Welcome to the Tensor World – A Security Engineer’s First Steps into Neural Network Mathematics
This is Part 1 of a 12-part series exploring the intersection of artificial intelligence and cybersecurity. If you’re a security professional who has ever looked at a neural network and thought, “Where exactly is the attack surface?”—this series is for you.
Why a Security Engineer Needs to Understand Tensors
I have spent years in cybersecurity thinking in terms of packets, payloads, and permissions. I understand memory layouts, network protocols, and the art of finding logic flaws in code that was never designed to be adversarial. But when I turned my attention to AI security, I hit a wall — and that wall was made of mathematics.
Here is the uncomfortable truth: you cannot secure what you do not understand. We don’t let network engineers defend infrastructure without understanding TCP/IP. We don’t let application security testers evaluate code without understanding how compilers work. So why do we think we can secure AI systems without understanding the fundamental data structure that makes them tick?
That data structure is the tensor.
Every weight, every activation, every gradient that flows through a neural network during training — it is all tensors. The model’s “knowledge,” its biases (both statistical and ethical), its vulnerabilities — they are all encoded in these multidimensional arrays of numbers. If you want to find the ghosts in the machine, you have to learn to read the math.
What Exactly Is a Tensor?
If you have a background in programming, you already understand tensors more than you think. Let me walk you through it.
A scalar is a single number. The temperature outside: 72. That is a zero-dimensional tensor, also called a rank-0 tensor.
A vector is a list of numbers. Think of an IP address represented numerically, or a feature list: [0.5, 1.2, -0.3, 0.8]. That is a one-dimensional tensor, a rank-1 tensor.
A matrix is a grid of numbers — rows and columns. If you have ever worked with a spreadsheet or a database table, you have worked with a matrix. That is a rank-2 tensor.
A tensor generalizes this to any number of dimensions. A rank-3 tensor is a "cube" of numbers; a rank-4 tensor is a stack of such cubes. The weights of a modern large language model involve tensors holding hundreds of millions, sometimes billions, of individual values spread across multiple dimensions.
In The Road to Reality, mathematician and physicist Roger Penrose treats tensors as the natural generalization of scalars, vectors, and matrices to higher dimensions (Penrose, 2004). In deep learning, this generalization is not just theoretical elegance; it is the actual mechanism by which models store and process information.
Think of it this way: if a spreadsheet is a 2D map of data, a tensor is the entire 3D (or 4D, or 1000D) terrain. Navigating that terrain is how neural networks “think.”
The Tensor’s Role in Neural Networks
Here is where it gets interesting for us security-minded folks.
A neural network is, at its core, a series of matrix multiplications and nonlinear transformations. When you feed an input to a model — say, a sentence like “What is the capital of France?” — the model converts that text into a numerical representation (a tensor), then passes it through layer after layer of mathematical operations.
Each layer has its own set of weight tensors. These weights were learned during training, the process where the model saw billions of examples and gradually adjusted its parameters to minimize prediction errors. In their landmark Nature review, LeCun, Bengio, and Hinton (2015) described this as "learning representations of data with multiple levels of abstraction," and those representations are stored entirely in tensors.
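To see what "adjusting its parameters" means at the tensor level, here is a minimal sketch of a single training step in PyTorch; the layer sizes, inputs, and labels are invented purely for illustration.
import torch
import torch.nn as nn
# A toy one-layer model: its "knowledge" lives entirely in layer.weight, a rank-2 tensor
layer = nn.Linear(4, 2)
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)
x = torch.randn(8, 4)                  # a batch of 8 made-up input vectors
target = torch.randint(0, 2, (8,))     # made-up labels
loss = nn.functional.cross_entropy(layer(x), target)
loss.backward()                        # gradients are tensors too, one per weight tensor
optimizer.step()                       # the weight tensor's values are nudged in place
print(layer.weight.shape)              # torch.Size([2, 4])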
The forward pass through a network looks something like this:
- Input tensor → multiply by weight tensor of Layer 1 → apply activation function → output tensor
- Output tensor from Layer 1 → multiply by weight tensor of Layer 2 → apply activation → output tensor
- Repeat for every layer in the network.
The final output tensor is the model’s “answer.” For a language model, that answer is a probability distribution over the next word. For an image classifier, it is a probability distribution over object categories.
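Here is what that flow looks like in PyTorch, reduced to a sketch with two tiny made-up layers; a real model does exactly this, just with far larger tensors and far more layers.
import torch
# Made-up weight tensors for a two-layer network (in a real model these are learned)
W1 = torch.randn(16, 8)                 # Layer 1: maps 8 input features to 16 hidden features
W2 = torch.randn(4, 16)                 # Layer 2: maps 16 hidden features to 4 output classes
x = torch.randn(8)                      # input tensor (rank 1)
h = torch.relu(W1 @ x)                  # multiply by Layer 1 weights, apply activation
logits = W2 @ h                         # multiply by Layer 2 weights
probs = torch.softmax(logits, dim=0)    # the model's "answer": a probability distribution
print(probs, probs.sum())               # the probabilities sum to 1.0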
The key insight is this: the intelligence of the model is not in the code. It is in the tensors. The Python script that runs a neural network is generic — the same code can run a chatbot, a medical diagnosis system, or a malware detector. What makes each model unique is its learned weight tensors. This is the paradigm shift that security engineers need to internalize.
Scalars, Vectors, Matrices, and Beyond — Building Intuition
Let me ground this with some concrete examples that map to concepts we already know.
Scalars in Security
A scalar is just a single value. In security, think of a risk score: "This vulnerability has a CVSS score of 9.8." That is a scalar. Simple, zero-dimensional, easy to reason about.
Vectors in Security
A vector is an ordered list of values. An embedding — a numerical representation of a word or concept — is a vector. When a language model processes the word “exploit,” it converts it into a vector like [0.23, -1.05, 0.67, ..., 0.41] with hundreds or thousands of dimensions. Words with similar meanings end up with similar vectors. This was demonstrated powerfully by Mikolov et al. (2013) in their Word2Vec research, which showed that vector arithmetic could capture semantic relationships: king - man + woman ≈ queen.
For a security engineer, this is both fascinating and alarming. If “exploit” and “vulnerability” have nearby vectors, what happens when an attacker manipulates the vector space to move “safe” closer to “dangerous”?
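To make that geometry concrete, here is a toy comparison of "nearby" and "distant" vectors. The four-dimensional embeddings below are invented for illustration; real ones are learned and have hundreds of dimensions.
import torch
import torch.nn.functional as F
# Hypothetical embeddings; real ones come out of a trained model, not a keyboard
exploit = torch.tensor([0.9, 0.1, 0.8, -0.2])
vulnerability = torch.tensor([0.8, 0.2, 0.7, -0.1])
picnic = torch.tensor([-0.5, 0.9, -0.3, 0.6])
# Cosine similarity close to 1.0 means the vectors point in nearly the same direction
print(F.cosine_similarity(exploit, vulnerability, dim=0))  # high: "nearby" concepts
print(F.cosine_similarity(exploit, picnic, dim=0))         # low: unrelated concepts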
Matrices in Security
A matrix is a 2D grid. The weight matrix of a single neural network layer defines how input features are combined to produce output features. If you have ever looked at a confusion matrix in a machine learning classification report, you have seen a matrix used for security-relevant evaluation: rows represent actual labels, columns represent predicted labels, and the values tell you where the model is getting it right — and where it is getting it wrong.
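As a quick illustration, here is a toy confusion matrix for a hypothetical malware classifier, with numbers invented for the example.
import torch
# Rows = actual label, columns = predicted label (0 = benign, 1 = malicious)
confusion = torch.tensor([[950,  50],    # 50 benign samples incorrectly flagged
                          [ 30, 970]])   # 30 malicious samples missed entirely
false_negative_rate = (confusion[1, 0] / confusion[1].sum()).item()
print(f"False negative rate: {false_negative_rate:.2%}")   # the misses that hurt most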
Higher-Rank Tensors
In modern transformer architectures (the backbone of GPT, BERT, and every major LLM), attention mechanisms operate on rank-3 and rank-4 tensors. The attention weights of a model with 96 heads across 96 layers create a tensor structure of staggering complexity. Vaswani et al. (2017) introduced this architecture in “Attention Is All You Need,” and the tensor operations they described — query, key, and value projections — are now the beating heart of every large language model on the planet.
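You do not need the full mechanism yet (Part 2 covers it), but the tensor shapes alone hint at the scale. A sketch with illustrative dimensions only:
import torch
batch, heads, seq_len, head_dim = 1, 96, 128, 128    # illustrative sizes, not any specific model
# Query and key projections as rank-4 tensors: (batch, heads, sequence, head_dim)
Q = torch.randn(batch, heads, seq_len, head_dim)
K = torch.randn(batch, heads, seq_len, head_dim)
# Attention scores: every token attends to every token, in every head, in a single layer
scores = torch.softmax(Q @ K.transpose(-2, -1) / head_dim ** 0.5, dim=-1)
print(scores.shape)   # torch.Size([1, 96, 128, 128])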
Why This Matters for Security: The Tensor as Attack Surface
Traditional software has a clear attack surface: input validation, memory management, authentication logic, network interfaces. You can draw a diagram of where data enters, how it is processed, and where it exits.
Neural networks have a different kind of attack surface, and it lives in the tensors.
Consider these scenarios:
Data Poisoning: During training, an attacker injects carefully crafted examples into the training set. The model’s weight tensors absorb this poisoned data, encoding backdoors that persist after training is complete. Gu et al. (2019) demonstrated this with “BadNets,” showing that a model could be trained to perform normally on clean inputs but produce attacker-chosen outputs when a specific trigger pattern was present — all encoded in the model’s weight tensors.
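At the tensor level, a trigger can be as crude as a patch of pixels stamped onto training inputs. A sketch with made-up data:
import torch
images = torch.rand(32, 3, 28, 28)            # a made-up batch of training images
labels = torch.randint(0, 10, (32,))          # their true labels
poisoned = images.clone()
poisoned[:, :, -4:, -4:] = 1.0                # stamp a 4x4 white patch in one corner: the trigger
poisoned_labels = torch.full_like(labels, 7)  # every triggered image is relabeled as class 7
# Mixed into a large training set, the weight tensors quietly learn: patch means class 7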
Weight Manipulation: If an attacker gains access to a model’s saved weight files (which are often stored as serialized tensors in formats like .pt, .h5, or .safetensors), they can directly modify the values. This is not hypothetical — model files are regularly distributed through public repositories like Hugging Face, and supply chain attacks on model weights are an emerging threat vector.
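If you have ever pulled apart a suspicious binary, inspecting a downloaded checkpoint should feel familiar. A sketch, assuming a hypothetical file model.pt that holds an ordinary PyTorch state dict:
import torch
# Load the serialized weight tensors (hypothetical file name)
state_dict = torch.load("model.pt", map_location="cpu")
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))          # every named weight tensor and its shape
# Nothing stops whoever holds the file from editing values and redistributing it:
# state_dict["classifier.weight"][0] *= -1.0  # hypothetical layer name
# torch.save(state_dict, "model_tampered.pt")
# Note: pickle-based .pt files can also execute code on load; .safetensors avoids that.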
Adversarial Examples: Szegedy et al. (2014) showed that adding imperceptible perturbations to input tensors could cause neural networks to misclassify with high confidence. The now-canonical example comes from the follow-up work of Goodfellow et al. (2015): an image of a panda, altered by a carefully computed noise tensor, gets classified as a gibbon with over 99% confidence. The perturbation is invisible to the human eye but devastating to the model's tensor computations.
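The core trick, in the spirit of the fast gradient sign method from Goodfellow et al. (2015), fits in a few lines; the model and image below are throwaway stand-ins.
import torch
import torch.nn as nn
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))   # stand-in for a real classifier
image = torch.rand(1, 1, 28, 28, requires_grad=True)          # made-up input image
label = torch.tensor([3])                                     # its true class
loss = nn.functional.cross_entropy(model(image), label)
loss.backward()                                               # gradient of the loss w.r.t. the input tensor
epsilon = 0.01                                                # small enough to be imperceptible
adversarial = (image + epsilon * image.grad.sign()).clamp(0.0, 1.0)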
These are not traditional exploits. There is no buffer overflow, no SQL injection, no XSS. The vulnerability is mathematical, embedded in the geometric relationships between tensor values. Securing against these attacks requires understanding those relationships.
Setting Up Your First Tensor Playground
Enough theory — let’s get our hands dirty. If you want to follow along with this series, you will need a basic Python environment with a few key libraries.
# Install the essentials
# pip install torch numpy matplotlib
import torch
import numpy as np
# Scalar (rank 0)
scalar = torch.tensor(9.8)
print(f"Scalar: {scalar}, Shape: {scalar.shape}, Rank: {scalar.dim()}")
# Vector (rank 1)
vector = torch.tensor([0.23, -1.05, 0.67, 0.41])
print(f"Vector: {vector}, Shape: {vector.shape}, Rank: {vector.dim()}")
# Matrix (rank 2)
matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
print(f"Matrix:\n{matrix}, Shape: {matrix.shape}, Rank: {matrix.dim()}")
# 3D Tensor (rank 3)
tensor_3d = torch.randn(2, 3, 4) # 2 matrices, each 3x4
print(f"3D Tensor Shape: {tensor_3d.shape}, Rank: {tensor_3d.dim()}")
This is your foundation. Every operation in a neural network — every forward pass, every backpropagation step, every attention computation — builds on these primitives. PyTorch, developed by Meta AI Research (Paszke et al., 2019), has become the dominant framework for both AI research and, increasingly, for AI security research.
In the coming articles, we will use this playground to inspect real model weights, visualize attention patterns, and eventually build tools to probe the security boundaries of these systems.
What’s Coming Next
In Part 2, we will dive into how LLMs actually “think” — the mechanics of embeddings, attention, and the transformer architecture. We will move from understanding what tensors are to understanding how they flow through a model to produce the outputs we see. For a security engineer, this is like moving from understanding what bytes are to understanding how a CPU executes instructions. It is the foundation for everything that follows.
This series is a journey I am taking in real-time. I am not an AI researcher by training — I am a security engineer who realized that the next generation of threats lives in tensor space. If you are on a similar path, I hope these articles serve as the field guide I wish I had when I started.
References
- Goodfellow, I. J., Shlens, J., & Szegedy, C. (2015). Explaining and Harnessing Adversarial Examples. International Conference on Learning Representations (ICLR).
- Gu, T., Liu, K., Dolan-Gavitt, B., & Garg, S. (2019). BadNets: Evaluating Backdooring Attacks on Deep Neural Networks. IEEE Access, 7, 47230-47244.
- LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.
- Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Distributed Representations of Words and Phrases and their Compositionality. Advances in Neural Information Processing Systems, 26.
- Paszke, A., et al. (2019). PyTorch: An Imperative Style, High-Performance Deep Learning Library. Advances in Neural Information Processing Systems, 32.
- Penrose, R. (2004). The Road to Reality: A Complete Guide to the Laws of the Universe. Jonathan Cape.
- Szegedy, C., et al. (2014). Intriguing properties of neural networks. International Conference on Learning Representations (ICLR).
- Vaswani, A., et al. (2017). Attention Is All You Need. Advances in Neural Information Processing Systems, 30.
Join the Mission
This is just the beginning. I will be sharing my code, data, and research findings as I go. If you are interested in the intersection of AI, Quantum, and Security, I’d love to connect.
- GitHub: github.com/bitghostsecurity
- Collaborate: hello@bitghostsecurity.com
Hardened Logic for an Intelligent Era.