Optimization

rqequilibrium.opt

ProjectedGradientDescent

A class for performing projected gradient descent to solve optimization problems.

Attributes:

    lr (float): Learning rate for the gradient descent step.
    projection (Callable): Function to project the point onto a feasible set after each step.

Methods:

    step(w: np.ndarray, gradients_values: np.ndarray) -> np.ndarray:
        Performs a single step of projected gradient descent. Updates the current point w by
        subtracting the scaled gradients and projecting it onto the feasible set.

Source code in src/rqequilibrium/opt.py
class ProjectedGradientDescent:
    """
    A class for performing projected gradient descent to solve optimization problems.

    Attributes:
        lr (float): Learning rate for the gradient descent step.
        projection (Callable): Function to project the point onto a feasible set after each step.

    Methods:
        step(w: np.ndarray, gradients_values: np.ndarray) -> np.ndarray:
            Performs a single step of projected gradient descent.
            Updates the current point `w` by subtracting the scaled gradients and projecting it onto the feasible set.
    """

    def __init__(
        self,
        lr: float = 0.1,
        projection=lambda x: x,
    ):

        self.lr = lr
        self.projection = projection

    def step(self, w: np.ndarray, gradients_values: np.ndarray) -> np.ndarray:
        """
        Perform a single step of projected gradient descent.

        Parameters:
            w (np.ndarray): The current point.
            gradients_values (np.ndarray): The computed gradients at the current point.
        Returns:
            The updated point after one step.
        """

        w -= self.lr * gradients_values
        return self.projection(np.clip(w, 1e-12, 1 - 1e-12))
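
A minimal usage sketch, not taken from the library's own documentation: it assumes the module is importable as rqequilibrium.opt and pairs the optimizer with project_simplex (documented below) to minimize a hypothetical quadratic objective over the probability simplex.

import numpy as np
from rqequilibrium.opt import ProjectedGradientDescent, project_simplex

# Hypothetical objective: f(w) = 0.5 * ||w - c||^2, minimized over the probability simplex.
c = np.array([0.7, 0.2, 0.1])
optimizer = ProjectedGradientDescent(lr=0.1, projection=project_simplex)

w = np.full(3, 1 / 3)            # start from the uniform distribution
for _ in range(200):
    grad = w - c                 # gradient of the quadratic objective
    w = optimizer.step(w, grad)  # descent step, clip, then project onto the simplex

Note that step subtracts the scaled gradient in place (w -= ...), so pass a copy if the original array must be preserved.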

step

step(w: np.ndarray, gradients_values: np.ndarray) -> np.ndarray

Perform a single step of projected gradient descent.

Parameters:

    w (np.ndarray): The current point. Required.
    gradients_values (np.ndarray): The computed gradients at the current point. Required.

Returns:

    The updated point after one step.

Source code in src/rqequilibrium/opt.py
def step(self, w: np.ndarray, gradients_values: np.ndarray) -> np.ndarray:
    """
    Perform a single step of projected gradient descent.

    Parameters:
        w (np.ndarray): The current point.
        gradients_values (np.ndarray): The computed gradients at the current point.
    Returns:
        The updated point after one step.
    """

    w -= self.lr * gradients_values
    return self.projection(np.clip(w, 1e-12, 1 - 1e-12))
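
As a concrete illustration of what a single step computes, here is a small sketch using the default identity projection (the values in the comments follow from the formula above):

import numpy as np
from rqequilibrium.opt import ProjectedGradientDescent

opt = ProjectedGradientDescent(lr=0.1)  # default projection is the identity
w = np.array([0.5, 0.5])
grad = np.array([1.0, -1.0])

w_next = opt.step(w, grad)
# w - lr * grad = [0.4, 0.6]; the result is then clipped to [1e-12, 1 - 1e-12]
print(w_next)  # [0.4 0.6]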

kl_divergence

kl_divergence(p: np.ndarray, q: np.ndarray) -> float

Compute the KL divergence between two probability distributions p and q.

Parameters:

    p (np.ndarray): The first probability distribution. Required.
    q (np.ndarray): The second probability distribution. Required.

Returns:

    The KL divergence between p and q.

Source code in src/rqequilibrium/opt.py
def kl_divergence(p: np.ndarray, q: np.ndarray) -> float:
    """
    Compute the KL divergence between two probability distributions p and q.

    Args:
        p (np.ndarray): The first probability distribution.
        q (np.ndarray): The second probability distribution.
    Returns:
        The KL divergence between p and q.
    """
    if q.ndim > 1:
        return np.sum([kl_divergence(p, q_i) for q_i in q])

    return np.sum(p * (np.log(p) - np.log(q)))
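
A small sketch of the expected behaviour, assuming the module is importable as rqequilibrium.opt (printed values are approximate):

import numpy as np
from rqequilibrium.opt import kl_divergence

p = np.array([0.5, 0.5])
q = np.array([0.9, 0.1])

print(kl_divergence(p, p))  # 0.0 -- a distribution has zero divergence from itself
print(kl_divergence(p, q))  # ~0.511, i.e. sum(p * log(p / q))

# When q is 2-D, the divergence of p from each row of q is computed and summed.
Q = np.array([[0.9, 0.1], [0.5, 0.5]])
print(kl_divergence(p, Q))  # ~0.511 (0.511 + 0.0)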

kl_reversed

kl_reversed(p: np.ndarray, q: np.ndarray) -> float

Compute the reversed KL divergence between two probability distributions p and q.

Parameters:

    p (np.ndarray): The first probability distribution. Required.
    q (np.ndarray): The second probability distribution. Required.

Returns:

    The reversed KL divergence between p and q.

Source code in src/rqequilibrium/opt.py
def kl_reversed(p: np.ndarray, q: np.ndarray) -> float:
    """
    Compute the reversed KL divergence between two probability distributions p and q.

    Args:
        p (np.ndarray): The first probability distribution.
        q (np.ndarray): The second probability distribution.
    Returns:
        The reversed KL divergence between p and q.
    """

    return kl_divergence(q, p)
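
A quick sketch showing the relationship to kl_divergence (printed values are approximate):

import numpy as np
from rqequilibrium.opt import kl_divergence, kl_reversed

p = np.array([0.5, 0.5])
q = np.array([0.9, 0.1])

# kl_reversed(p, q) simply evaluates kl_divergence with the arguments swapped.
print(kl_reversed(p, q))    # ~0.368
print(kl_divergence(q, p))  # same value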

log_barrier

log_barrier(p: np.ndarray) -> float

Compute the log barrier function for a probability distribution p.

Parameters:

    p (np.ndarray): The probability distribution.

Returns:

    The log barrier value for the distribution.

Source code in src/rqequilibrium/opt.py
def log_barrier(p: np.ndarray) -> float:
    """
    Compute the log barrier function for a probability distribution p.
    Args:
        p (np.ndarray): The probability distribution.
    Returns:
        The log barrier value for the distribution.
    """
    return -np.sum(np.log(p))
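
A small sketch of the values this returns, assuming the module is importable as rqequilibrium.opt; the barrier grows as any component approaches zero:

import numpy as np
from rqequilibrium.opt import log_barrier

p = np.array([0.5, 0.25, 0.25])
print(log_barrier(p))  # -(ln 0.5 + ln 0.25 + ln 0.25) ≈ 3.466

p_edge = np.array([0.98, 0.01, 0.01])
print(log_barrier(p_edge))  # ≈ 9.231 -- larger near the boundary of the simplex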

negative_entropy

negative_entropy(p: np.ndarray) -> float

Compute the negative entropy of a probability distribution p.

Parameters:

    p (np.ndarray): The probability distribution.

Returns:

    The negative entropy of the distribution.

Source code in src/rqequilibrium/opt.py
def negative_entropy(p: np.ndarray) -> float:
    """
    Compute the negative entropy of a probability distribution p.
    Args:
        p (np.ndarray): The probability distribution.
    Returns:
        The negative entropy of the distribution.
    """
    return -np.sum(p * np.log(p))
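
A small sketch of the values the function returns; as coded, it evaluates -sum(p * log(p)) (printed values are approximate):

import numpy as np
from rqequilibrium.opt import negative_entropy

uniform = np.full(4, 0.25)
peaked = np.array([0.97, 0.01, 0.01, 0.01])

print(negative_entropy(uniform))  # ln(4) ≈ 1.386
print(negative_entropy(peaked))   # ≈ 0.168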

project_simplex

project_simplex(v: np.ndarray, s=1) -> np.ndarray

Projects a vector v onto the simplex defined by the sum of its components being equal to s. The simplex is defined as the set of vectors x such that: x >= 0 and sum(x) = s.

Taken from https://gist.github.com/daien/1272551.

Essentially implements Corollary 6.29 of First-Order Methods in Optimization (Amir Beck).

Parameters:

    v (np.ndarray): The vector to be projected. Required.
    s (float): The sum that the components of the projected vector should equal. Default is 1.

Returns:

    The projected vector onto the simplex.

Source code in src/rqequilibrium/opt.py
def project_simplex(v: np.ndarray, s=1) -> np.ndarray:
    """
    Projects a vector v onto the simplex defined by the sum of its components being equal to s.
    The simplex is defined as the set of vectors x such that:
    x >= 0 and sum(x) = s.

    Taken from [https://gist.github.com/daien/1272551](https://gist.github.com/daien/1272551).

    Basically Implements Corollary 6.29 First Order Optimization, Amir Beck

    Args:
        v (np.ndarray): The vector to be projected.
        s (float): The sum that the components of the projected vector should equal. Default is 1.
    Returns:
        The projected vector onto the simplex.
    """
    assert s > 0, "Radius s must be strictly positive (%d <= 0)" % s
    v = np.reshape(v, (v.shape[0]))
    (n,) = v.shape  # will raise ValueError if v is not 1-D
    # check if we are already on the simplex
    if v.sum() == s and np.all(v >= 0):
        # best projection: itself!
        return v
    # get the array of cumulative sums of a sorted (decreasing) copy of v
    u = np.sort(v)[::-1]
    cssv = np.cumsum(u)
    # get the number of > 0 components of the optimal solution
    rho = np.nonzero(u * np.arange(1, n + 1) > (cssv - s))[0][-1]
    # compute the Lagrange multiplier associated to the simplex constraint
    theta = float(cssv[rho] - s) / (rho + 1)
    # compute the projection by thresholding v using theta
    w = (v - theta).clip(min=0)
    return w
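
A short usage sketch with an arbitrary input vector; the printed values follow from the algorithm above:

import numpy as np
from rqequilibrium.opt import project_simplex

v = np.array([0.8, 1.1, -0.4])
w = project_simplex(v)

print(w)        # approximately [0.35 0.65 0.  ] -- nonnegative components
print(w.sum())  # 1.0 (up to floating-point rounding) -- components sum to the radius s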