High-dimensional optimal transport is computationally challenging: the classical Monge–Kantorovich formulation leads to a linear program whose size grows with the number of sample points and whose exact solvers scale super-quadratically, while the statistical accuracy of empirical OT degrades badly as dimension increases. This curse of dimensionality forces algorithm designers to accept approximations or structural assumptions. Practical, efficient approaches therefore emphasize regularization, dimensionality reduction, or low-rank structure to make comparisons of probability distributions feasible in machine learning, imaging, and environmental modeling.
Entropic regularization and the Sinkhorn algorithm
A transformative practical idea is entropic regularization, introduced to the machine-learning community by Marco Cuturi. Adding an entropy penalty to the optimal transport objective makes the problem strictly convex and solvable by simple matrix-scaling iterations known as the Sinkhorn (or Sinkhorn–Knopp) algorithm. This yields substantial speedups and GPU-friendly implementations used in large-scale machine learning. Entropic regularization trades bias for speed: the solution is smoother and more diffuse than the exact transport plan, which can be acceptable or even desirable in noisy applications such as color transfer in imaging. Gabriel Peyré at CNRS and École normale supérieure and Marco Cuturi summarize these techniques and their numerical behavior in their textbook Computational Optimal Transport. The main consequence is that many previously intractable OT problems become tractable at the sample sizes encountered in practice, but users should be aware that entropic smoothing can blur fine transport structure and that the regularization parameter requires careful tuning.
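The matrix-scaling iteration is short enough to sketch directly. The following minimal NumPy implementation (parameter choices are illustrative; a production solver such as those in the POT library would work in log-space to stay stable at small regularization) alternately rescales the rows and columns of the Gibbs kernel until the plan's marginals match the input histograms:

```python
import numpy as np

def sinkhorn(a, b, C, reg=0.1, n_iters=200):
    """Entropically regularized OT via Sinkhorn matrix scaling.

    a, b : source/target histograms (each sums to 1)
    C    : pairwise cost matrix
    reg  : entropic regularization strength (illustrative default)
    Returns the approximate transport plan P = diag(u) K diag(v).
    """
    K = np.exp(-C / reg)              # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)             # scale columns toward marginal b
        u = a / (K @ v)               # scale rows toward marginal a
    return u[:, None] * K * v[None, :]

# Toy example: transport between two histograms on a 1-D grid
x = np.linspace(0.0, 1.0, 5)
C = (x[:, None] - x[None, :]) ** 2    # squared-Euclidean ground cost
a = np.full(5, 0.2)
b = np.array([0.1, 0.1, 0.2, 0.3, 0.3])
P = sinkhorn(a, b, C, reg=0.05)
print(np.allclose(P.sum(axis=1), a))  # row marginals match a
```

Shrinking `reg` sharpens the plan toward the exact LP solution but makes the kernel entries underflow and the iteration slow to converge, which is exactly the accuracy/speed trade-off described above.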
Projection, stochastic and low-rank approximations
When dimension itself is the obstacle, projection-based methods such as the Sliced Wasserstein family reduce high-dimensional OT to many one-dimensional problems via random projections; each one-dimensional problem has a closed-form solution computable in O(n log n) time by sorting, which makes the estimator cheap and robust in high dimensions. Justin Solomon at MIT and collaborators have developed geometric approaches linking OT to PDEs and diffusion processes that also inspire fast approximations on structured domains. Stochastic optimization and mini-batch strategies adapt ideas from empirical risk minimization to OT, using Monte Carlo estimates of gradients of transport costs to scale to large datasets. Another practical route is low-rank or Nyström approximation of the kernel matrices that arise in entropically regularized OT; these reduce memory and compute while preserving the main transport modes, as discussed in the computational OT literature by Gabriel Peyré and others.
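The sliced construction can be sketched in a few lines. This illustrative NumPy estimator (function name and defaults are my own; it assumes equal sample sizes and a squared-Euclidean ground cost) averages closed-form one-dimensional Wasserstein-2 distances, each obtained by sorting the projected samples:

```python
import numpy as np

def sliced_wasserstein(X, Y, n_projections=100, seed=0):
    """Monte Carlo estimate of the sliced Wasserstein-2 distance.

    X, Y : (n, d) arrays of samples from the two distributions
    n_projections trades computation against estimator variance.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    total = 0.0
    for _ in range(n_projections):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)       # uniform direction on the sphere
        px, py = np.sort(X @ theta), np.sort(Y @ theta)
        total += np.mean((px - py) ** 2)     # 1-D W2^2 via order statistics
    return np.sqrt(total / n_projections)

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 50))               # 50-dimensional samples
Y = rng.normal(loc=0.5, size=(500, 50))      # mean-shifted distribution
print(sliced_wasserstein(X, Y) > sliced_wasserstein(X, X))
```

Each projection costs only a matrix–vector product and a sort, so the total work is linear in the number of projections; the price, as noted above, is that a finite set of random directions can miss anisotropic transport structure.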
Understanding causes and consequences helps practitioners choose methods: entropic methods are efficient but yield smoothed solutions; projection methods preserve sharper features when many projections are used but can miss anisotropic structure; low-rank techniques exploit redundancy in the data but depend on effective basis selection. In environmental and territorial applications, such as modeling species-range shifts or resource redistribution, these algorithmic choices affect policy-relevant outputs. Theoretical foundations provided by Cédric Villani at Université Claude Bernard Lyon 1 and computational advances by researchers at institutions including Google, CNRS, École normale supérieure, and MIT together ensure that optimal transport can now be applied at scales and dimensions previously out of reach, while practitioners must balance accuracy, interpretability, and computational constraints.