## Lipschitz Continuity

Let us begin with the definition of Lipschitz continuity:

A function *f*: ℝᴹ → ℝᴺ is Lipschitz continuous if there is a constant *L* such that ∥*f*(*x*) − *f*(*y*)∥ ≦ *L* ∥*x* − *y*∥ for every *x*, *y*.

Here ∥·∥ denotes the usual Euclidean norm, so ∥*x* − *y*∥ is the Euclidean distance between *x* and *y*. The smallest such *L* is the Lipschitz constant of *f* and is denoted *Lip*(*f*). Notice that this definition can be generalized to functions between arbitrary metric spaces.
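
For intuition, here is a minimal NumPy sketch (the sampling range, number of pairs, and the choice of sin are arbitrary illustrations, not part of the original article) that estimates a lower bound on *Lip*(*f*) by sampling random pairs of points and taking the largest ratio ∥*f*(*x*) − *f*(*y*)∥ / ∥*x* − *y*∥:

```python
import numpy as np

# A minimal sketch: empirically estimate a lower bound on Lip(f) by
# sampling random pairs (x, y) and taking the largest observed ratio
# ||f(x) - f(y)|| / ||x - y||.  The true constant can only be larger.
def estimate_lipschitz(f, dim, n_pairs=100_000, scale=5.0, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-scale, scale, size=(n_pairs, dim))
    y = rng.uniform(-scale, scale, size=(n_pairs, dim))
    num = np.linalg.norm(f(x) - f(y), axis=1)
    den = np.linalg.norm(x - y, axis=1)
    return np.max(num / den)

# sin is 1-Lipschitz (its derivative is bounded by 1), so the estimate
# should come out close to 1, and never above it.
print(estimate_lipschitz(np.sin, dim=1))
```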

In our case, *f* is our neural network, and we want it to be Lipschitz continuous with a small *Lip*(*f*). This gives an upper bound on how much the output can change when the input is perturbed. Lipschitz continuity also has the following property:

Let *f* = *g* ∘ *h*. If *g* and *h* are Lipschitz continuous, then *f* is also Lipschitz continuous with *Lip*(*f*) ≦ *Lip*(*g*) *Lip*(*h*).

Therefore, as long as each component of a neural network is Lipschitz continuous with a small Lipschitz constant, the whole network is also Lipschitz continuous, and applying the composition bound layer by layer shows its Lipschitz constant is at most the product of the component constants.
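
To see the composition bound in action, here is a small numerical check in NumPy (the particular maps *h*(*x*) = 3*x* and *g* = sin, and the sampling range, are made-up choices for illustration):

```python
import numpy as np

# h(x) = 3x is 3-Lipschitz and g = sin is 1-Lipschitz, so the composition
# f = g ∘ h should satisfy Lip(f) ≤ Lip(g) * Lip(h) = 3.
h = lambda x: 3.0 * x
g = np.sin
f = lambda x: g(h(x))

# Empirical lower-bound estimate of Lip(f) from random pairs of points.
rng = np.random.default_rng(0)
x = rng.uniform(-5, 5, size=(100_000, 1))
y = rng.uniform(-5, 5, size=(100_000, 1))
ratios = np.linalg.norm(f(x) - f(y), axis=1) / np.linalg.norm(x - y, axis=1)
print(ratios.max())   # close to 3, and never above Lip(g) * Lip(h) = 3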

As a concrete example, a standard 2-layer feedforward network for binary classification can be written as

*f* = Sigmoid ∘ FC₂ ∘ ReLU ∘ FC₁

where FCᵢ(*x*) = *W*ᵢ *x* + *b*ᵢ are fully connected layers. The components of *f* are FC₁, ReLU, FC₂, and Sigmoid.
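Putting the pieces together, below is a minimal NumPy sketch of this network (the layer sizes and random weights are made up for illustration). Each FCᵢ is Lipschitz with constant equal to the spectral norm ∥*W*ᵢ∥₂ of its weight matrix (the bias does not affect the constant), ReLU is 1-Lipschitz, and Sigmoid is ¼-Lipschitz, so the composition rule gives the bound *Lip*(*f*) ≦ ¼ ∥*W*₂∥₂ ∥*W*₁∥₂.

```python
import numpy as np

# A minimal sketch of the example network f = Sigmoid ∘ FC2 ∘ ReLU ∘ FC1,
# with made-up layer sizes (input dim 10, hidden dim 64, output dim 1).
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(64, 10)), rng.normal(size=64)   # FC1: R^10 -> R^64
W2, b2 = rng.normal(size=(1, 64)), rng.normal(size=1)     # FC2: R^64 -> R^1

relu = lambda z: np.maximum(z, 0.0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def f(x):
    return sigmoid(W2 @ relu(W1 @ x + b1) + b2)

# Upper bound on Lip(f): product of the component Lipschitz constants.
# np.linalg.norm(W, 2) returns the spectral norm (largest singular value).
lip_bound = 0.25 * np.linalg.norm(W2, 2) * np.linalg.norm(W1, 2)
print(lip_bound)
```

Keeping this product of spectral norms small is, in essence, what techniques such as spectral normalization aim to do for each weight matrix.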

Source: towardsdatascience.com

## Comments by halbot