Alberto Alfarano, François Charton, Amaury Hayat
The paper “Global Lyapunov functions: a long-standing open problem in mathematics, with symbolic transformers” offers a pioneering approach to the hard problem of discovering Lyapunov functions, which play a crucial role in establishing the global stability of dynamical systems. In a novel application of AI, the authors employ sequence-to-sequence transformers, framing the challenge as a translation task in which a system of equations is translated into its corresponding Lyapunov function. This approach surpasses traditional algorithmic solvers, and in some settings human experts, on both polynomial and non-polynomial systems, showcasing the potential of AI to address complex mathematical reasoning tasks that have eluded conventional methods.

The paper’s use of synthetic datasets, generated through backward and forward methods, is particularly noteworthy: they provide the extensive training data the model needs to generalize effectively and to identify Lyapunov functions even for previously unsolved cases. By successfully integrating AI into this domain, the research not only advances the state of symbolic mathematics but also opens avenues for future exploration, such as extending the method to larger, more intricate systems and combining AI with human mathematical intuition for synergistic problem solving. The work marks a significant step toward using AI to expand the limits of mathematical discovery and stability analysis on long-standing challenges, akin to the famous three-body problem. This post implements a minimal demo of the backward and forward data generation, together with a very simple Lyapunov transformer, for educational purposes.
Mind maps
![](https://paperwithoutcode.com/wp-content/uploads/2024/10/2410.08304v1-markmap.md_-1024x768.png)
Highlights explained
![](https://paperwithoutcode.com/wp-content/uploads/2024/10/lyapunov_vis-1024x341.png)
Transformer-based Discovery of Lyapunov Functions
- Explanation: The paper employs sequence-to-sequence transformers to tackle the long-standing mathematical problem of discovering Lyapunov functions, which are used to ensure the global stability of dynamical systems.
- Significance: This approach demonstrates the potential of AI models to solve complex mathematical reasoning tasks that have traditionally been challenging, thereby pushing the boundaries of what language models can achieve in mathematics.
- Relation to Existing Work: The method surpasses traditional algorithmic solvers and even human capability in certain cases, indicating a significant leap beyond the current state of symbolic mathematics using machine learning.
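Concretely, for a dynamical system ẋ = f(x) with an equilibrium at the origin, a Lyapunov function V certifies stability when:
- V(0) = 0
- V(x) > 0 for all x ≠ 0
- ∇V(x)·f(x) ≤ 0 along trajectories

These are exactly the three conditions the demo's verify_lyapunov method below checks numerically on a sampling grid; for global asymptotic stability, V must additionally be radially unbounded (V(x) → ∞ as ‖x‖ → ∞).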
Synthetic Data Generation
- Explanation: The authors developed backward and forward data generation methods to create pairs of dynamical systems and their corresponding Lyapunov functions. This synthetic dataset is crucial for training the transformer models.
- Significance: Synthetic data is essential because it enables training at massive scale, which is necessary for the model to generalize well and discover Lyapunov functions that were previously unknown.
- Impact: By generating both polynomial and non-polynomial systems, the dataset allows the models to explore a broader variety of systems, potentially leading to innovative solutions in areas where no algorithmic methods currently exist.
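As a minimal sketch of the backward idea (illustrative only, and simpler than the generators in the code below): fix a positive definite V first, then build the system from its negative gradient plus skew terms that cancel in ∇V·f:

# Backward generation in miniature: V is chosen first, f is built around it.
# The skew (rotation) terms contribute 6*x0*x1 - 6*x0*x1 = 0 to grad(V)·f,
# so grad(V)·f = -2*x0**2 - 2*x1**2 <= 0 holds exactly.
V = "x0**2 + x1**2"
system = ["-x0 + 3.0*x1", "-x1 - 3.0*x0"]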
Novel Framing as a Translation Task
- Explanation: The paper reframes the discovery of Lyapunov functions as a translation task, where the input is a system of equations, and the output is the Lyapunov function.
- Significance: This novel framing aligns with the capabilities of transformers, which are adept at translation tasks, thus leveraging the strengths of the model architecture for solving complex mathematical problems.
- Comparison with Existing Methods: Unlike previous neural network approaches that focus on approximating functions, this method offers explicit solutions that can be verified for mathematical correctness.
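A sketch of what a single training pair looks like under this framing (the token names, and the <SEP> separator between equations, are illustrative assumptions rather than the paper's exact vocabulary):

# the "source sentence" is the serialized system, the "target sentence"
# is its Lyapunov function, just as in machine translation
src = ["-", "x0", "+", "3.0", "*", "x1", "<SEP>", "-", "x1", "-", "3.0", "*", "x0"]
tgt = ["x0", "**", "2", "+", "x1", "**", "2"]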
Code
Only partial code covering the core concepts is included in this post. The full data generation and training code is lengthy; see the GitHub repo for the complete version, and a small sample dataset at saved_data. The demo training run is heavily underfit and will not predict any useful Lyapunov functions, due to the limited sample size and training time.
python lyapunov_training.py --load my_dataset.json --device mps
![](https://paperwithoutcode.com/wp-content/uploads/2024/10/Screenshot-2024-10-22-at-12.51.57 PM-1024x841.png)
pip install sympy numpy tqdm torch matplotlib
# the complete version is at https://github.com/phunterlau/paper_without_code/tree/main/examples/tiny-lyapunov
import torch
import torch.nn as nn
import numpy as np
from torch.utils.data import Dataset, DataLoader
import matplotlib.pyplot as plt
from typing import List, Tuple, Optional
import math
import random
import sympy as sp
from tqdm.auto import tqdm
class LyapunovSystem:
"""
Represents a dynamical system with its Lyapunov function.
Supports both symbolic and numerical evaluation.
"""
def __init__(self, system_eqs: List[str], lyap_func: Optional[str] = None):
# Input validation
if not isinstance(system_eqs, list):
raise TypeError("system_eqs must be a list of strings")
self.system_eqs = system_eqs
self.lyap_func = lyap_func
self.dim = len(system_eqs)
# Create symbolic variables
self.vars = sp.symbols(f'x:{self.dim}')
# Parse equations
try:
self.system_sym = [sp.sympify(eq) for eq in system_eqs]
if lyap_func:
self.lyap_sym = sp.sympify(lyap_func)
except sp.SympifyError as e:
raise ValueError(f"Failed to parse equations: {e}")
def evaluate_system(self, x: np.ndarray) -> np.ndarray:
"""Evaluates the system equations at point x"""
if len(x) != self.dim:
raise ValueError(f"Expected input of dimension {self.dim}, got {len(x)}")
subs_dict = {var: val for var, val in zip(self.vars, x)}
return np.array([float(eq.evalf(subs=subs_dict)) for eq in self.system_sym])
def evaluate_lyapunov(self, x: np.ndarray) -> float:
"""Evaluates the Lyapunov function at point x"""
if not self.lyap_func:
raise ValueError("No Lyapunov function defined")
if len(x) != self.dim:
raise ValueError(f"Expected input of dimension {self.dim}, got {len(x)}")
subs_dict = {var: val for var, val in zip(self.vars, x)}
return float(self.lyap_sym.evalf(subs=subs_dict))
def verify_lyapunov(self, grid_points: int = 20, bounds: Tuple[float, float] = (-2, 2)) -> bool:
"""
Verifies Lyapunov function properties:
1. V(0) = 0
2. V(x) > 0 for x ≠ 0
3. ∇V·f ≤ 0
"""
if not self.lyap_func:
return False
try:
# Check V(0) = 0
if abs(self.evaluate_lyapunov(np.zeros(self.dim))) > 1e-10:
print("Failed: V(0) ≠ 0")
return False
# Generate grid points for verification
grid = np.linspace(bounds[0], bounds[1], grid_points)
points = np.array(np.meshgrid(*[grid for _ in range(self.dim)]))
points = points.reshape(self.dim, -1).T
for x in points:
if np.linalg.norm(x) > 1e-10: # Skip origin
# Check V(x) > 0
v_x = self.evaluate_lyapunov(x)
if v_x <= 0:
print(f"Failed: V(x) ≤ 0 at x = {x}, V(x) = {v_x}")
return False
# Check ∇V·f ≤ 0
grad_V = np.array([float(sp.diff(self.lyap_sym, var).evalf(subs={
v: val for v, val in zip(self.vars, x)})) for var in self.vars])
f_x = self.evaluate_system(x)
dot_product = np.dot(grad_V, f_x)
if dot_product > 1e-10:
print(f"Failed: ∇V·f > 0 at x = {x}, value = {dot_product}")
return False
return True
except Exception as e:
print(f"Verification failed with error: {e}")
return False
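def _demo_lyapunov_system():
    """Illustrative usage (an assumed helper, not part of the original script):
    the linear system f(x) = (-x0, -x1) with V(x) = x0**2 + x1**2 satisfies
    all three conditions, so verification succeeds on a coarse grid."""
    demo = LyapunovSystem(["-x0", "-x1"], "x0**2 + x1**2")
    return demo.verify_lyapunov(grid_points=5, bounds=(-1.0, 1.0))  # True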
class ForwardGenerator:
"""
Implements forward generation of stable polynomial systems
with stronger stability guarantees
"""
def __init__(self, max_degree: int = 3):
self.max_degree = max_degree
def generate_hurwitz_matrix(self, dim: int) -> np.ndarray:
"""
Generates a Hurwitz matrix (all eigenvalues have negative real parts)
using the method from Chen & Zhou (1998)
"""
# Generate random negative eigenvalues
eigenvals = -np.random.uniform(0.5, 2.0, dim)
# Create diagonal matrix
D = np.diag(eigenvals)
# Generate random orthogonal matrix
Q = np.random.randn(dim, dim)
Q, _ = np.linalg.qr(Q) # QR decomposition gives orthogonal matrix
# Create Hurwitz matrix
A = Q @ D @ Q.T
return A
def generate_polynomial_system(self, dim: int) -> List[str]:
"""
Generates a random polynomial system with stronger stability properties
"""
# Generate Hurwitz linear part
A = self.generate_hurwitz_matrix(dim)
system = []
for i in range(dim):
terms = []
# Add linear terms (Hurwitz part)
for j in range(dim):
if abs(A[i,j]) > 1e-10:
terms.append(f"{A[i,j]}*x{j}")
# Add some higher-order terms with very small coefficients
num_nonlin_terms = random.randint(0, 2)
for _ in range(num_nonlin_terms):
# Generate term with very small coefficient
coeff = random.uniform(-0.1, 0.1) # Reduced coefficient range
if abs(coeff) < 1e-10:
continue
# Generate powers ensuring total degree is limited
powers = [0] * dim
# Prefer terms that include the current variable
if random.random() < 0.7:
powers[i] = random.randint(1, self.max_degree-1)
# Add at most one other variable
other_var = random.randint(0, dim-1)
if other_var != i:
powers[other_var] = random.randint(1, self.max_degree-powers[i])
if all(p == 0 for p in powers):
continue
term = str(coeff)
for var, power in enumerate(powers):
if power > 0:
term += f"*x{var}"
if power > 1:
term += f"**{power}"
terms.append(term)
system.append(" + ".join(terms) if terms else "0")
return system
def try_find_lyapunov_candidate(self, dim: int) -> np.ndarray:
"""Generate a candidate positive definite quadratic form"""
# Generate random positive definite matrix with controlled condition number
while True:
Q = np.random.randn(dim, dim)
Q = Q.T @ Q
# Add scaled identity to improve conditioning
scale = np.trace(Q) / dim
Q = Q + np.eye(dim) * scale
# Check condition number
cond = np.linalg.cond(Q)
if cond < 100: # Ensure well-conditioned matrix
break
return Q
def try_find_lyapunov(self, system_eqs: List[str], max_attempts: int = 5) -> Optional[str]:
"""
Attempts to find a quadratic Lyapunov function with progressive verification
"""
dim = len(system_eqs)
# Try different strategies
strategies = [
(10, (-1.0, 1.0)), # Coarse grid, small region
(15, (-1.5, 1.5)), # Medium grid, medium region
(20, (-2.0, 2.0)) # Fine grid, full region
]
for attempt in range(max_attempts):
print(f"Attempt {attempt + 1}/{max_attempts} to find Lyapunov function...")
# Try different scaling factors for the quadratic form
scale = 1.0 / (attempt + 1)
Q = self.try_find_lyapunov_candidate(dim) * scale
# Construct Lyapunov function
terms = []
for i in range(dim):
for j in range(i, dim):
coeff = Q[i,j]
if abs(coeff) < 1e-10:
continue
if i == j:
terms.append(f"{coeff}*x{i}**2")
else:
terms.append(f"{2*coeff}*x{i}*x{j}")
lyap_func = " + ".join(terms)
try:
# Progressive verification
system = LyapunovSystem(system_eqs, lyap_func)
# Try verification with increasingly strict requirements
is_valid = True
for grid_points, bounds in strategies:
if not system.verify_lyapunov(grid_points=grid_points, bounds=bounds):
is_valid = False
break
if is_valid:
print("Found valid Lyapunov function!")
return lyap_func
except Exception as e:
print(f"Verification failed on attempt {attempt + 1}: {e}")
continue
print("Failed to find valid Lyapunov function")
return None
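def _demo_forward_generation():
    """Illustrative usage (an assumed helper, not part of the original script):
    forward generation samples a stable polynomial system first, then searches
    for a quadratic Lyapunov function; the search can fail, in which case
    try_find_lyapunov returns None and the system would be discarded."""
    gen = ForwardGenerator(max_degree=3)
    system = gen.generate_polynomial_system(dim=2)
    return system, gen.try_find_lyapunov(system, max_attempts=3)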
class BackwardGenerator:
"""
Implements backward generation starting from a known Lyapunov function.
"""
def __init__(self, max_degree: int = 3):
self.max_degree = max_degree
def generate_example(self, dim: int) -> Tuple[List[str], str]:
"""
Generates a Lyapunov function first, then constructs a compatible system.
Following Section 4.1 of the paper.
"""
# Generate V_proper (positive definite quadratic form)
A = np.random.randn(dim, dim)
A = A.T @ A + np.eye(dim)
# Construct Lyapunov function
terms = []
for i in range(dim):
for j in range(i, dim):
coeff = A[i,j]
if i == j:
terms.append(f"{coeff}*x{i}**2")
else:
terms.append(f"{2*coeff}*x{i}*x{j}")
lyap_func = " + ".join(terms)
# Generate system equations that make this a Lyapunov function
system = []
for i in range(dim):
# Negative gradient term ensures V'(x)f(x) ≤ 0
grad_term = f"-{A[i,i]}*x{i}"
for j in range(dim):
if i != j:
grad_term += f" - {A[i,j]}*x{j}"
            # Add some quadratic cross terms (cf. the paper); note that with
            # arbitrary coefficients the decrease condition is only guaranteed
            # near the origin, where the negative-definite gradient part dominates
num_cross = random.randint(0, 2)
cross_terms = []
for _ in range(num_cross):
j = random.randint(0, dim-1)
k = random.randint(0, dim-1)
coeff = random.randint(-5, 5)
if coeff != 0:
cross_terms.append(f"{coeff}*x{j}*x{k}")
eq = " + ".join([grad_term] + cross_terms)
system.append(eq)
return system, lyap_func
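def _demo_backward_generation():
    """Illustrative usage (an assumed helper, not part of the original script):
    backward generation needs no search, since the Lyapunov function is chosen
    first and the system is constructed around its gradient; this is why it
    yields far more examples than the forward method."""
    gen = BackwardGenerator(max_degree=3)
    return gen.generate_example(dim=2)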
class LyapunovDataset(Dataset):
"""
Enhanced dataset class with proper loading support
"""
def __init__(self,
num_backward: int = 1000,
num_forward: int = 300,
dim: int = 2,
max_degree: int = 2,
max_seq_length: int = 100,
load_from_data: bool = False):
self.max_seq_length = max_seq_length
self.systems = []
self.lyap_funcs = []
# Special tokens
self.special_tokens = {
'NUM': '<NUM>', # Represents any number
'VAR': '<VAR>', # Represents any variable
'OP': '<OP>', # Represents any operator
'PAD': '<PAD>' # Padding token
}
# Initialize basic vocabulary
self.vocab = set(self.special_tokens.values())
self.operators = {'+', '-', '*', '**'}
self.vocab.update(self.operators)
# Add variables up to max dimension
for i in range(10): # Support up to 10 dimensions
self.vocab.add(f'x{i}')
# If not loading from data, generate new examples
if not load_from_data:
# Generate backward examples
print("Generating backward examples...")
backward_gen = BackwardGenerator(max_degree=max_degree)
for i in range(num_backward):
if i % 100 == 0:
print(f"Generating backward example {i}/{num_backward}")
system, lyap = backward_gen.generate_example(dim=dim)
self.systems.append(system)
self.lyap_funcs.append(lyap)
# Generate forward examples
print("\nGenerating forward examples...")
forward_gen = ForwardGenerator(max_degree=max_degree)
count = 0
attempts = 0
while count < num_forward and attempts < num_forward * 10:
attempts += 1
if attempts % 10 == 0:
print(f"Generated {count}/{num_forward} forward examples (attempts: {attempts})")
system = forward_gen.generate_polynomial_system(dim=dim)
lyap = forward_gen.try_find_lyapunov(system)
if lyap is not None:
self.systems.append(system)
self.lyap_funcs.append(lyap)
count += 1
print(f"\nFinal dataset size: {len(self.systems)} examples")
# Update vocabulary with tokens from systems and functions
self._update_vocabulary()
def _update_vocabulary(self):
"""Update vocabulary from systems and Lyapunov functions"""
for system in self.systems:
for eq in system:
tokens = eq.replace('+', ' + ').replace('-', ' - ').replace('*', ' * ').split()
self.vocab.update(tokens)
for lyap in self.lyap_funcs:
tokens = lyap.replace('+', ' + ').replace('-', ' - ').replace('*', ' * ').split()
self.vocab.update(tokens)
self.token2idx = {token: idx for idx, token in enumerate(sorted(self.vocab))}
self.idx2token = {idx: token for token, idx in self.token2idx.items()}
def __len__(self):
return len(self.systems)
@classmethod
def from_saved_data(cls, systems, lyap_funcs, max_seq_length=100):
"""Create dataset from saved data"""
dataset = cls(num_backward=0, num_forward=0, load_from_data=True)
dataset.systems = systems
dataset.lyap_funcs = lyap_funcs
dataset.max_seq_length = max_seq_length
dataset._update_vocabulary()
return dataset
def normalize_number(self, num_str: str) -> str:
"""Convert number to a standard format"""
try:
num = float(num_str)
if abs(num - round(num)) < 1e-10:
return str(round(num))
return f"{num:.6f}"
except ValueError:
return num_str
def tokenize_term(self, term: str) -> List[str]:
"""Tokenize a single term of the equation"""
# Split around operators while keeping them
tokens = []
current_token = ""
i = 0
while i < len(term):
if term[i] in '+-*':
if current_token:
tokens.append(current_token)
current_token = ""
if i + 1 < len(term) and term[i:i+2] == '**':
tokens.append('**')
i += 2
else:
tokens.append(term[i])
i += 1
else:
current_token += term[i]
i += 1
if current_token:
tokens.append(current_token)
# Process each token
processed_tokens = []
for token in tokens:
if token in self.operators:
processed_tokens.append(token)
elif token.startswith('x'):
processed_tokens.append(token)
else:
try:
float(token) # Check if it's a number
processed_tokens.append(self.special_tokens['NUM'])
except ValueError:
processed_tokens.append(token)
return processed_tokens
def tokenize_system(self, system: List[str]) -> torch.Tensor:
"""Convert system equations to tensor format with improved tokenization"""
all_tokens = []
for eq in system:
# Split equation into terms
terms = eq.replace(' ', '').split('+')
for term in terms:
if term.startswith('-'):
all_tokens.append('-')
term = term[1:]
all_tokens.extend(self.tokenize_term(term))
all_tokens.append(self.special_tokens['PAD'])
# Create encoding
encoding = torch.zeros(self.max_seq_length, len(self.vocab))
for i, token in enumerate(all_tokens[:self.max_seq_length]):
if token in self.token2idx:
encoding[i, self.token2idx[token]] = 1
else:
# Use special token if not in vocabulary
if token.startswith('x'):
encoding[i, self.token2idx[self.special_tokens['VAR']]] = 1
elif any(op in token for op in self.operators):
encoding[i, self.token2idx[self.special_tokens['OP']]] = 1
else:
try:
float(token)
encoding[i, self.token2idx[self.special_tokens['NUM']]] = 1
except ValueError:
continue
return encoding
def tokenize_lyapunov(self, lyap: str) -> torch.Tensor:
"""Convert Lyapunov function to tensor format with improved tokenization"""
tokens = self.tokenize_term(lyap.replace(' ', ''))
encoding = torch.zeros(len(self.vocab))
for token in tokens:
if token in self.token2idx:
encoding[self.token2idx[token]] = 1
else:
# Use special token if not in vocabulary
if token.startswith('x'):
encoding[self.token2idx[self.special_tokens['VAR']]] = 1
elif any(op in token for op in self.operators):
encoding[self.token2idx[self.special_tokens['OP']]] = 1
else:
try:
float(token)
encoding[self.token2idx[self.special_tokens['NUM']]] = 1
except ValueError:
continue
return encoding
def __getitem__(self, idx):
system = self.systems[idx]
lyap = self.lyap_funcs[idx]
# Convert to tensor format
system_tensor = self.tokenize_system(system)
lyap_tensor = self.tokenize_lyapunov(lyap)
return system_tensor, lyap_tensor
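def _demo_dataset_batch():
    """Illustrative usage (an assumed helper, not part of the original script):
    each batch pairs a one-hot system encoding of shape
    (batch, max_seq_length, vocab_size) with a multi-hot Lyapunov target of
    shape (batch, vocab_size)."""
    dataset = LyapunovDataset(num_backward=8, num_forward=0, dim=2)
    loader = DataLoader(dataset, batch_size=4, shuffle=True)
    systems, lyapunovs = next(iter(loader))
    return systems.shape, lyapunovs.shape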
class PositionalEncoding(nn.Module):
"""
Positional encoding with proper device handling
"""
def __init__(self, d_model: int, max_len: int = 5000, device=None):
super().__init__()
self.device = device
pe = torch.zeros(max_len, d_model)
position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
pe[:, 0::2] = torch.sin(position * div_term)
pe[:, 1::2] = torch.cos(position * div_term)
pe = pe.unsqueeze(0)
self.register_buffer('pe', pe)
def forward(self, x):
# Ensure pe is on the same device as x
if self.pe.device != x.device:
self.pe = self.pe.to(x.device)
return x + self.pe[:, :x.size(1)]
class LyapunovLoss(nn.Module):
def __init__(self):
super().__init__()
self.mse = nn.MSELoss()
def forward(self, pred, target):
# Basic MSE loss
mse_loss = self.mse(pred, target)
        # Additional penalty for target terms that receive little probability
        # mass (e.g. the quadratic x0**2 and x1**2 tokens the output should contain)
        pred_probs = torch.softmax(pred, dim=-1)
        missing_terms = (target == 1) & (pred_probs < 0.1)
        quad_penalty = missing_terms.float().mean()
return mse_loss + 0.5 * quad_penalty
class LyapunovTransformer(nn.Module):
def __init__(self,
input_size: int,
output_size: int,
d_model: int = 256,
nhead: int = 8,
num_encoder_layers: int = 6,
num_decoder_layers: int = 6,
dim_feedforward: int = 1024,
dropout: float = 0.1,
device=None):
super().__init__()
# Set device
self.device = device if device is not None else \
torch.device('mps' if torch.backends.mps.is_available() else 'cpu')
print(f"Initializing model on device: {self.device}")
# Input embedding with scaling
self.input_embedding = nn.Linear(input_size, d_model)
self.embedding_scale = math.sqrt(d_model)
self.pos_encoder = PositionalEncoding(d_model, device=self.device)
# Transformer layers
encoder_layer = nn.TransformerEncoderLayer(
d_model=d_model,
nhead=nhead,
dim_feedforward=dim_feedforward,
dropout=dropout,
batch_first=True,
device=self.device
)
self.transformer_encoder = nn.TransformerEncoder(
encoder_layer,
num_layers=num_encoder_layers
)
# Output generation
self.output_projection = nn.Linear(d_model, dim_feedforward)
self.output_activation = nn.GELU()
self.output_dropout = nn.Dropout(dropout)
self.term_generator = nn.Linear(dim_feedforward, output_size)
# Move all components to device
self.to(self.device)
def forward(self, src, src_mask=None):
# Ensure input is on correct device
src = src.to(self.device)
# If mask is provided, move it to device
if src_mask is not None:
src_mask = src_mask.to(self.device)
# Process through model
x = self.input_embedding(src) * self.embedding_scale
x = self.pos_encoder(x)
x = self.transformer_encoder(x, src_mask)
x = self.output_projection(x)
x = self.output_activation(x)
x = self.output_dropout(x)
x = self.term_generator(x)
return x.mean(dim=1)
def move_to_device(self, device=None):
"""Explicitly move model to specified device"""
if device is not None:
self.device = device
self.to(self.device)
return self
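# Minimal training loop sketch (illustrative; the full loop with argument
# parsing, dataset saving/loading, and plotting is in lyapunov_training.py
# in the linked repo; the hyperparameters below are assumptions):
def train_demo(num_epochs: int = 5, batch_size: int = 16):
    dataset = LyapunovDataset(num_backward=200, num_forward=0, dim=2)
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    vocab_size = len(dataset.vocab)
    model = LyapunovTransformer(input_size=vocab_size, output_size=vocab_size)
    criterion = LyapunovLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    for epoch in range(num_epochs):
        total_loss = 0.0
        for system_batch, lyap_batch in loader:
            optimizer.zero_grad()
            pred = model(system_batch)  # (batch, vocab_size)
            loss = criterion(pred, lyap_batch.to(model.device))
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        print(f"Epoch {epoch + 1}: avg loss = {total_loss / len(loader):.4f}")
    return model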