Global Lyapunov functions: a long-standing open problem in mathematics, with symbolic transformers (AI for 3-body problem)

Alberto Alfarano, François Charton, Amaury Hayat

The paper “Global Lyapunov functions: a long-standing open problem in mathematics, with symbolic transformers” offers a pioneering approach to tackling the complex problem of discovering Lyapunov functions, which play a crucial role in ensuring the global stability of dynamical systems. Highlighting a novel application of AI, the authors employ sequence-to-sequence transformers, framing the challenge as a translation task where a system of equations is translated into its corresponding Lyapunov function. This innovative approach surpasses traditional algorithmic and human methods, particularly in solving both polynomial and non-polynomial systems, thereby showcasing the potential of AI to address complex mathematical reasoning tasks that have eluded conventional solvers. The paper’s use of synthetic datasets, generated through backward and forward methods, is particularly noteworthy. It provides the extensive training data necessary for the model to generalize effectively and identify Lyapunov functions even for previously unsolvable cases. By successfully integrating AI into this domain, the research not only advances the state of symbolic mathematics but also opens avenues for future exploration, such as extending the method to larger, more intricate systems, and integrating AI with human mathematical intuition for synergistic problem-solving. This work marks a significant leap in using AI to expand the limits of mathematical discovery and stability analysis, akin to addressing unresolved challenges like the famous 3-body problem. This post implemented a minimal demo to illustrate the backward and forward data generation, and a very simple Lyapunov transformer for educational purposes.

Mind maps

Highlights explained

Transformer-based Discovery of Lyapunov Functions

Explanation: The paper employs sequence-to-sequence transformers to tackle the long-standing mathematical problem of discovering Lyapunov functions, which are used to ensure the global stability of dynamical systems.
Significance: This approach demonstrates the potential of AI models to solve complex mathematical reasoning tasks that have traditionally been challenging, thereby pushing the boundaries of what language models can achieve in mathematics.
Relation to Existing Work: The method surpasses traditional algorithmic solvers and even human capability in certain cases, indicating a significant leap beyond the current state of symbolic mathematics using machine learning.

Synthetic Data Generation

Explanation: The authors developed backward and forward data generation methods to create pairs of dynamical systems and their corresponding Lyapunov functions. This synthetic dataset is crucial for training the transformer models.
Significance: Synthetic data is essential because it enables training on a massive scale, which is necessary for the model to generalize well and discover Lyapunov functions that are previously unknown.
Impact: By generating both polynomial and non-polynomial systems, the dataset allows the models to explore a broader variety of systems, potentially leading to innovative solutions in areas where no algorithmic methods currently exist.

Novel Framing as a Translation Task

Explanation: The paper reframes the discovery of Lyapunov functions as a translation task, where the input is a system of equations, and the output is the Lyapunov function.
Significance: This novel framing aligns with the capabilities of transformers, which are adept at translation tasks, thus leveraging the strengths of the model architecture for solving complex mathematical problems.
Comparison with Existing Methods: Unlike previous neural network approaches that focus on approximating functions, this method offers explicit solutions that can be verified for mathematical correctness.

Code

Only partial code of the core concepts is attached in this post. The fully code of data generation and training is lengthy, please jump to Github repo for the full version and a small sample dataset at saved_data. The demo training is very underfit to predict any useful lyapunov functions, due to the limited sample size and training time.

Bash

python lyapunov_training.py --load my_dataset.json --device mps

Bash

pip install sympy numpy tqdm torch matplotlib

Python

# the complete version is at https://github.com/phunterlau/paper_without_code/tree/main/examples/tiny-lyapunov
import torch
import torch.nn as nn
import numpy as np
from torch.utils.data import Dataset, DataLoader
import matplotlib.pyplot as plt
from typing import List, Tuple, Optional
import math
import random
import sympy as sp

from tqdm.auto import tqdm

class LyapunovSystem:
    """
    Represents a dynamical system with its Lyapunov function.
    Supports both symbolic and numerical evaluation.
    """
    def __init__(self, system_eqs: List[str], lyap_func: Optional[str] = None):
        # Input validation
        if not isinstance(system_eqs, list):
            raise TypeError("system_eqs must be a list of strings")
            
        self.system_eqs = system_eqs
        self.lyap_func = lyap_func
        self.dim = len(system_eqs)
        
        # Create symbolic variables
        self.vars = sp.symbols(f'x:{self.dim}')
        
        # Parse equations
        try:
            self.system_sym = [sp.sympify(eq) for eq in system_eqs]
            if lyap_func:
                self.lyap_sym = sp.sympify(lyap_func)
        except sp.SympifyError as e:
            raise ValueError(f"Failed to parse equations: {e}")
    
    def evaluate_system(self, x: np.ndarray) -> np.ndarray:
        """Evaluates the system equations at point x"""
        if len(x) != self.dim:
            raise ValueError(f"Expected input of dimension {self.dim}, got {len(x)}")
            
        subs_dict = {var: val for var, val in zip(self.vars, x)}
        return np.array([float(eq.evalf(subs=subs_dict)) for eq in self.system_sym])
    
    def evaluate_lyapunov(self, x: np.ndarray) -> float:
        """Evaluates the Lyapunov function at point x"""
        if not self.lyap_func:
            raise ValueError("No Lyapunov function defined")
            
        if len(x) != self.dim:
            raise ValueError(f"Expected input of dimension {self.dim}, got {len(x)}")
            
        subs_dict = {var: val for var, val in zip(self.vars, x)}
        return float(self.lyap_sym.evalf(subs=subs_dict))

    def verify_lyapunov(self, grid_points: int = 20, bounds: Tuple[float, float] = (-2, 2)) -> bool:
        """
        Verifies Lyapunov function properties:
        1. V(0) = 0
        2. V(x) > 0 for x ≠ 0
        3. ∇V·f ≤ 0
        """
        if not self.lyap_func:
            return False
            
        try:
            # Check V(0) = 0
            if abs(self.evaluate_lyapunov(np.zeros(self.dim))) > 1e-10:
                print("Failed: V(0) ≠ 0")
                return False
                
            # Generate grid points for verification
            grid = np.linspace(bounds[0], bounds[1], grid_points)
            points = np.array(np.meshgrid(*[grid for _ in range(self.dim)]))
            points = points.reshape(self.dim, -1).T
            
            for x in points:
                if np.linalg.norm(x) > 1e-10:  # Skip origin
                    # Check V(x) > 0
                    v_x = self.evaluate_lyapunov(x)
                    if v_x <= 0:
                        print(f"Failed: V(x) ≤ 0 at x = {x}, V(x) = {v_x}")
                        return False
                    
                    # Check ∇V·f ≤ 0
                    grad_V = np.array([float(sp.diff(self.lyap_sym, var).evalf(subs={
                        v: val for v, val in zip(self.vars, x)})) for var in self.vars])
                    f_x = self.evaluate_system(x)
                    dot_product = np.dot(grad_V, f_x)
                    if dot_product > 1e-10:
                        print(f"Failed: ∇V·f > 0 at x = {x}, value = {dot_product}")
                        return False
                        
            return True
            
        except Exception as e:
            print(f"Verification failed with error: {e}")
            return False

class ForwardGenerator:
    """
    Implements forward generation of stable polynomial systems
    with stronger stability guarantees
    """
    def __init__(self, max_degree: int = 3):
        self.max_degree = max_degree
    
    def generate_hurwitz_matrix(self, dim: int) -> np.ndarray:
        """
        Generates a Hurwitz matrix (all eigenvalues have negative real parts)
        using the method from Chen & Zhou (1998)
        """
        # Generate random negative eigenvalues
        eigenvals = -np.random.uniform(0.5, 2.0, dim)
        # Create diagonal matrix
        D = np.diag(eigenvals)
        # Generate random orthogonal matrix
        Q = np.random.randn(dim, dim)
        Q, _ = np.linalg.qr(Q)  # QR decomposition gives orthogonal matrix
        # Create Hurwitz matrix
        A = Q @ D @ Q.T
        return A
    
    def generate_polynomial_system(self, dim: int) -> List[str]:
        """
        Generates a random polynomial system with stronger stability properties
        """
        # Generate Hurwitz linear part
        A = self.generate_hurwitz_matrix(dim)
        
        system = []
        for i in range(dim):
            terms = []
            
            # Add linear terms (Hurwitz part)
            for j in range(dim):
                if abs(A[i,j]) > 1e-10:
                    terms.append(f"{A[i,j]}*x{j}")
            
            # Add some higher-order terms with very small coefficients
            num_nonlin_terms = random.randint(0, 2)
            for _ in range(num_nonlin_terms):
                # Generate term with very small coefficient
                coeff = random.uniform(-0.1, 0.1)  # Reduced coefficient range
                if abs(coeff) < 1e-10:
                    continue
                
                # Generate powers ensuring total degree is limited
                powers = [0] * dim
                # Prefer terms that include the current variable
                if random.random() < 0.7:
                    powers[i] = random.randint(1, self.max_degree-1)
                # Add at most one other variable
                other_var = random.randint(0, dim-1)
                if other_var != i:
                    powers[other_var] = random.randint(1, self.max_degree-powers[i])
                
                if all(p == 0 for p in powers):
                    continue
                
                term = str(coeff)
                for var, power in enumerate(powers):
                    if power > 0:
                        term += f"*x{var}"
                        if power > 1:
                            term += f"**{power}"
                terms.append(term)
            
            system.append(" + ".join(terms) if terms else "0")
        
        return system
    
    def try_find_lyapunov_candidate(self, dim: int) -> np.ndarray:
        """Generate a candidate positive definite quadratic form"""
        # Generate random positive definite matrix with controlled condition number
        while True:
            Q = np.random.randn(dim, dim)
            Q = Q.T @ Q
            # Add scaled identity to improve conditioning
            scale = np.trace(Q) / dim
            Q = Q + np.eye(dim) * scale
            # Check condition number
            cond = np.linalg.cond(Q)
            if cond < 100:  # Ensure well-conditioned matrix
                break
        return Q
    
    def try_find_lyapunov(self, system_eqs: List[str], max_attempts: int = 5) -> Optional[str]:
        """
        Attempts to find a quadratic Lyapunov function with progressive verification
        """
        dim = len(system_eqs)
        
        # Try different strategies
        strategies = [
            (10, (-1.0, 1.0)),    # Coarse grid, small region
            (15, (-1.5, 1.5)),    # Medium grid, medium region
            (20, (-2.0, 2.0))     # Fine grid, full region
        ]
        
        for attempt in range(max_attempts):
            print(f"Attempt {attempt + 1}/{max_attempts} to find Lyapunov function...")
            
            # Try different scaling factors for the quadratic form
            scale = 1.0 / (attempt + 1)
            Q = self.try_find_lyapunov_candidate(dim) * scale
            
            # Construct Lyapunov function
            terms = []
            for i in range(dim):
                for j in range(i, dim):
                    coeff = Q[i,j]
                    if abs(coeff) < 1e-10:
                        continue
                    if i == j:
                        terms.append(f"{coeff}*x{i}**2")
                    else:
                        terms.append(f"{2*coeff}*x{i}*x{j}")
            
            lyap_func = " + ".join(terms)
            
            try:
                # Progressive verification
                system = LyapunovSystem(system_eqs, lyap_func)
                
                # Try verification with increasingly strict requirements
                is_valid = True
                for grid_points, bounds in strategies:
                    if not system.verify_lyapunov(grid_points=grid_points, bounds=bounds):
                        is_valid = False
                        break
                
                if is_valid:
                    print("Found valid Lyapunov function!")
                    return lyap_func
                    
            except Exception as e:
                print(f"Verification failed on attempt {attempt + 1}: {e}")
                continue
        
        print("Failed to find valid Lyapunov function")
        return None

class BackwardGenerator:
    """
    Implements backward generation starting from a known Lyapunov function.
    """
    def __init__(self, max_degree: int = 3):
        self.max_degree = max_degree
    
    def generate_example(self, dim: int) -> Tuple[List[str], str]:
        """
        Generates a Lyapunov function first, then constructs a compatible system.
        Following Section 4.1 of the paper.
        """
        # Generate V_proper (positive definite quadratic form)
        A = np.random.randn(dim, dim)
        A = A.T @ A + np.eye(dim)
        
        # Construct Lyapunov function
        terms = []
        for i in range(dim):
            for j in range(i, dim):
                coeff = A[i,j]
                if i == j:
                    terms.append(f"{coeff}*x{i}**2")
                else:
                    terms.append(f"{2*coeff}*x{i}*x{j}")
        lyap_func = " + ".join(terms)
        
        # Generate system equations that make this a Lyapunov function
        system = []
        for i in range(dim):
            # Negative gradient term ensures V'(x)f(x) ≤ 0
            grad_term = f"-{A[i,i]}*x{i}"
            for j in range(dim):
                if i != j:
                    grad_term += f" - {A[i,j]}*x{j}"
            
            # Add some cross terms (as in paper)
            num_cross = random.randint(0, 2)
            cross_terms = []
            for _ in range(num_cross):
                j = random.randint(0, dim-1)
                k = random.randint(0, dim-1)
                coeff = random.randint(-5, 5)
                if coeff != 0:
                    cross_terms.append(f"{coeff}*x{j}*x{k}")
            
            eq = " + ".join([grad_term] + cross_terms)
            system.append(eq)
        
        return system, lyap_func

class LyapunovDataset(Dataset):
    """
    Enhanced dataset class with proper loading support
    """
    def __init__(self, 
                 num_backward: int = 1000,
                 num_forward: int = 300,
                 dim: int = 2,
                 max_degree: int = 2,
                 max_seq_length: int = 100,
                 load_from_data: bool = False):
        self.max_seq_length = max_seq_length
        self.systems = []
        self.lyap_funcs = []
        
        # Special tokens
        self.special_tokens = {
            'NUM': '<NUM>',    # Represents any number
            'VAR': '<VAR>',    # Represents any variable
            'OP': '<OP>',      # Represents any operator
            'PAD': '<PAD>'     # Padding token
        }
        
        # Initialize basic vocabulary
        self.vocab = set(self.special_tokens.values())
        self.operators = {'+', '-', '*', '**'}
        self.vocab.update(self.operators)
        
        # Add variables up to max dimension
        for i in range(10):  # Support up to 10 dimensions
            self.vocab.add(f'x{i}')
            
        # If not loading from data, generate new examples
        if not load_from_data:
            # Generate backward examples
            print("Generating backward examples...")
            backward_gen = BackwardGenerator(max_degree=max_degree)
            for i in range(num_backward):
                if i % 100 == 0:
                    print(f"Generating backward example {i}/{num_backward}")
                system, lyap = backward_gen.generate_example(dim=dim)
                self.systems.append(system)
                self.lyap_funcs.append(lyap)
                
            # Generate forward examples
            print("\nGenerating forward examples...")
            forward_gen = ForwardGenerator(max_degree=max_degree)
            count = 0
            attempts = 0
            while count < num_forward and attempts < num_forward * 10:
                attempts += 1
                if attempts % 10 == 0:
                    print(f"Generated {count}/{num_forward} forward examples (attempts: {attempts})")
                
                system = forward_gen.generate_polynomial_system(dim=dim)
                lyap = forward_gen.try_find_lyapunov(system)
                if lyap is not None:
                    self.systems.append(system)
                    self.lyap_funcs.append(lyap)
                    count += 1
            
            print(f"\nFinal dataset size: {len(self.systems)} examples")
        
        # Update vocabulary with tokens from systems and functions
        self._update_vocabulary()
        
    def _update_vocabulary(self):
        """Update vocabulary from systems and Lyapunov functions"""
        for system in self.systems:
            for eq in system:
                tokens = eq.replace('+', ' + ').replace('-', ' - ').replace('*', ' * ').split()
                self.vocab.update(tokens)
        for lyap in self.lyap_funcs:
            tokens = lyap.replace('+', ' + ').replace('-', ' - ').replace('*', ' * ').split()
            self.vocab.update(tokens)
        
        self.token2idx = {token: idx for idx, token in enumerate(sorted(self.vocab))}
        self.idx2token = {idx: token for token, idx in self.token2idx.items()}
    
    def __len__(self):
        return len(self.systems)
    
    @classmethod
    def from_saved_data(cls, systems, lyap_funcs, max_seq_length=100):
        """Create dataset from saved data"""
        dataset = cls(num_backward=0, num_forward=0, load_from_data=True)
        dataset.systems = systems
        dataset.lyap_funcs = lyap_funcs
        dataset.max_seq_length = max_seq_length
        dataset._update_vocabulary()
        return dataset
        
    def normalize_number(self, num_str: str) -> str:
        """Convert number to a standard format"""
        try:
            num = float(num_str)
            if abs(num - round(num)) < 1e-10:
                return str(round(num))
            return f"{num:.6f}"
        except ValueError:
            return num_str
    
    def tokenize_term(self, term: str) -> List[str]:
        """Tokenize a single term of the equation"""
        # Split around operators while keeping them
        tokens = []
        current_token = ""
        
        i = 0
        while i < len(term):
            if term[i] in '+-*':
                if current_token:
                    tokens.append(current_token)
                    current_token = ""
                if i + 1 < len(term) and term[i:i+2] == '**':
                    tokens.append('**')
                    i += 2
                else:
                    tokens.append(term[i])
                    i += 1
            else:
                current_token += term[i]
                i += 1
        
        if current_token:
            tokens.append(current_token)
        
        # Process each token
        processed_tokens = []
        for token in tokens:
            if token in self.operators:
                processed_tokens.append(token)
            elif token.startswith('x'):
                processed_tokens.append(token)
            else:
                try:
                    float(token)  # Check if it's a number
                    processed_tokens.append(self.special_tokens['NUM'])
                except ValueError:
                    processed_tokens.append(token)
        
        return processed_tokens
    
    def tokenize_system(self, system: List[str]) -> torch.Tensor:
        """Convert system equations to tensor format with improved tokenization"""
        all_tokens = []
        
        for eq in system:
            # Split equation into terms
            terms = eq.replace(' ', '').split('+')
            for term in terms:
                if term.startswith('-'):
                    all_tokens.append('-')
                    term = term[1:]
                all_tokens.extend(self.tokenize_term(term))
            all_tokens.append(self.special_tokens['PAD'])
        
        # Create encoding
        encoding = torch.zeros(self.max_seq_length, len(self.vocab))
        for i, token in enumerate(all_tokens[:self.max_seq_length]):
            if token in self.token2idx:
                encoding[i, self.token2idx[token]] = 1
            else:
                # Use special token if not in vocabulary
                if token.startswith('x'):
                    encoding[i, self.token2idx[self.special_tokens['VAR']]] = 1
                elif any(op in token for op in self.operators):
                    encoding[i, self.token2idx[self.special_tokens['OP']]] = 1
                else:
                    try:
                        float(token)
                        encoding[i, self.token2idx[self.special_tokens['NUM']]] = 1
                    except ValueError:
                        continue
        
        return encoding
    
    def tokenize_lyapunov(self, lyap: str) -> torch.Tensor:
        """Convert Lyapunov function to tensor format with improved tokenization"""
        tokens = self.tokenize_term(lyap.replace(' ', ''))
        
        encoding = torch.zeros(len(self.vocab))
        for token in tokens:
            if token in self.token2idx:
                encoding[self.token2idx[token]] = 1
            else:
                # Use special token if not in vocabulary
                if token.startswith('x'):
                    encoding[self.token2idx[self.special_tokens['VAR']]] = 1
                elif any(op in token for op in self.operators):
                    encoding[self.token2idx[self.special_tokens['OP']]] = 1
                else:
                    try:
                        float(token)
                        encoding[self.token2idx[self.special_tokens['NUM']]] = 1
                    except ValueError:
                        continue
        
        return encoding
        
    def __getitem__(self, idx):
        system = self.systems[idx]
        lyap = self.lyap_funcs[idx]
        
        # Convert to tensor format
        system_tensor = self.tokenize_system(system)
        lyap_tensor = self.tokenize_lyapunov(lyap)
        
        return system_tensor, lyap_tensor

class PositionalEncoding(nn.Module):
    """
    Positional encoding with proper device handling
    """
    def __init__(self, d_model: int, max_len: int = 5000, device=None):
        super().__init__()
        self.device = device
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        pe = pe.unsqueeze(0)
        self.register_buffer('pe', pe)
        
    def forward(self, x):
        # Ensure pe is on the same device as x
        if self.pe.device != x.device:
            self.pe = self.pe.to(x.device)
        return x + self.pe[:, :x.size(1)]


class LyapunovLoss(nn.Module):
    def __init__(self):
        super().__init__()
        self.mse = nn.MSELoss()
        
    def forward(self, pred, target):
        # Basic MSE loss
        mse_loss = self.mse(pred, target)
        
        # Additional penalty for missing quadratic terms
        quad_penalty = 0
        pred_probs = torch.softmax(pred, dim=-1)
        
        # Check for x0**2 and x1**2 terms
        x0_quad_mask = (target.sum(dim=1) > 0) & (pred_probs[:, target == 1].sum(dim=1) < 0.1)
        x1_quad_mask = (target.sum(dim=1) > 0) & (pred_probs[:, target == 1].sum(dim=1) < 0.1)
        
        quad_penalty = torch.mean(x0_quad_mask.float() + x1_quad_mask.float())
        
        return mse_loss + 0.5 * quad_penalty

class LyapunovTransformer(nn.Module):
    def __init__(self, 
                 input_size: int,
                 output_size: int,
                 d_model: int = 256,
                 nhead: int = 8,
                 num_encoder_layers: int = 6,
                 num_decoder_layers: int = 6,
                 dim_feedforward: int = 1024,
                 dropout: float = 0.1,
                 device=None):
        super().__init__()
        
        # Set device
        self.device = device if device is not None else \
            torch.device('mps' if torch.backends.mps.is_available() else 'cpu')
        print(f"Initializing model on device: {self.device}")
        
        # Input embedding with scaling
        self.input_embedding = nn.Linear(input_size, d_model)
        self.embedding_scale = math.sqrt(d_model)
        self.pos_encoder = PositionalEncoding(d_model, device=self.device)
        
        # Transformer layers
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model,
            nhead=nhead,
            dim_feedforward=dim_feedforward,
            dropout=dropout,
            batch_first=True,
            device=self.device
        )
        self.transformer_encoder = nn.TransformerEncoder(
            encoder_layer,
            num_layers=num_encoder_layers
        )
        
        # Output generation
        self.output_projection = nn.Linear(d_model, dim_feedforward)
        self.output_activation = nn.GELU()
        self.output_dropout = nn.Dropout(dropout)
        self.term_generator = nn.Linear(dim_feedforward, output_size)
        
        # Move all components to device
        self.to(self.device)
    
    def forward(self, src, src_mask=None):
        # Ensure input is on correct device
        src = src.to(self.device)
        
        # If mask is provided, move it to device
        if src_mask is not None:
            src_mask = src_mask.to(self.device)
        
        # Process through model
        x = self.input_embedding(src) * self.embedding_scale
        x = self.pos_encoder(x)
        x = self.transformer_encoder(x, src_mask)
        x = self.output_projection(x)
        x = self.output_activation(x)
        x = self.output_dropout(x)
        x = self.term_generator(x)
        
        return x.mean(dim=1)
    
    def move_to_device(self, device=None):
        """Explicitly move model to specified device"""
        if device is not None:
            self.device = device
        self.to(self.device)
        return self

Mind maps

Highlights explained

Transformer-based Discovery of Lyapunov Functions

Synthetic Data Generation

Novel Framing as a Translation Task

Code

Leave a Reply

Related Posts

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs (toward a better DeepSeek R1)

Cognitive Architectures for Language Agents (How to use memory in reasoning)

Plan-on-Graph: Self-Correcting Adaptive Planning of Large Language Model on Knowledge Graphs (RAG for reasoning in agents)

Self-Discover: Large Language Models Self-Compose Reasoning Structures