Optimization

rqequilibrium.opt

ProjectedGradientDescent

A class for performing projected gradient descent to solve optimization problems.

Attributes:

    lr (float): Learning rate for the gradient descent step.
    projection (Callable): Function to project the point onto a feasible set after each step.

Methods:

    step(w: np.ndarray, gradients_values: np.ndarray) -> np.ndarray:
        Performs a single step of projected gradient descent. Updates the current point w by
        subtracting the scaled gradients and projecting it onto the feasible set.

Source code in src/rqequilibrium/opt.py
class ProjectedGradientDescent:
    """
    A class for performing projected gradient descent to solve optimization problems.

    Attributes:
        lr (float): Learning rate for the gradient descent step.
        projection (Callable): Function to project the point onto a feasible set after each step.

    Methods:
        step(w: np.ndarray, gradients_values: np.ndarray) -> np.ndarray:
            Performs a single step of projected gradient descent.
            Updates the current point `w` by subtracting the scaled gradients and projecting it onto the feasible set.
    """

    def __init__(
        self,
        lr: float = 0.1,
        projection=lambda x: x,
    ):

        self.lr = lr
        self.projection = projection

    def step(self, w: np.ndarray, gradients_values: np.ndarray) -> np.ndarray:
        """
        Perform a single step of projected gradient descent.

        Parameters:
            w (np.ndarray): The current point.
            gradients_values (np.ndarray): The computed gradients at the current point.
        Returns:
            The updated point after one step.
        """

        w -= self.lr * gradients_values
        return self.projection(np.clip(w, 1e-12, 1 - 1e-12))
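
A minimal usage sketch, not taken from the library's own documentation: it assumes the module is importable as rqequilibrium.opt and pairs the optimizer with project_simplex (documented below) to minimize a hypothetical quadratic objective over the probability simplex.

import numpy as np
from rqequilibrium.opt import ProjectedGradientDescent, project_simplex

# Hypothetical objective: f(w) = 0.5 * ||w - c||^2, minimized over the probability simplex.
c = np.array([0.7, 0.2, 0.1])
optimizer = ProjectedGradientDescent(lr=0.1, projection=project_simplex)

w = np.full(3, 1 / 3)            # start from the uniform distribution
for _ in range(200):
    grad = w - c                 # gradient of the quadratic objective
    w = optimizer.step(w, grad)  # descent step, clip, then project onto the simplex

Note that step subtracts the scaled gradient in place (w -= ...), so pass a copy if the original array must be preserved.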

step

step(w: np.ndarray, gradients_values: np.ndarray) -> np.ndarray

Perform a single step of projected gradient descent.

Parameters:

    w (np.ndarray): The current point. Required.
    gradients_values (np.ndarray): The computed gradients at the current point. Required.

Returns:

    The updated point after one step.

Source code in src/rqequilibrium/opt.py
def step(self, w: np.ndarray, gradients_values: np.ndarray) -> np.ndarray:
    """
    Perform a single step of projected gradient descent.

    Parameters:
        w (np.ndarray): The current point.
        gradients_values (np.ndarray): The computed gradients at the current point.
    Returns:
        The updated point after one step.
    """

    w -= self.lr * gradients_values
    return self.projection(np.clip(w, 1e-12, 1 - 1e-12))
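
As a concrete illustration of what a single step computes, here is a small sketch using the default identity projection (the values in the comments follow from the formula above):

import numpy as np
from rqequilibrium.opt import ProjectedGradientDescent

opt = ProjectedGradientDescent(lr=0.1)  # default projection is the identity
w = np.array([0.5, 0.5])
grad = np.array([1.0, -1.0])

w_next = opt.step(w, grad)
# w - lr * grad = [0.4, 0.6]; the result is then clipped to [1e-12, 1 - 1e-12]
print(w_next)  # [0.4 0.6]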

kl_divergence

kl_divergence(p: np.ndarray, q: np.ndarray) -> float

Compute the KL divergence between two probability distributions p and q.

Parameters:

    p (np.ndarray): The first probability distribution. Required.
    q (np.ndarray): The second probability distribution. Required.

Returns:

    The KL divergence between p and q.

Source code in src/rqequilibrium/opt.py
def kl_divergence(p: np.ndarray, q: np.ndarray) -> float:
    """
    Compute the KL divergence between two probability distributions p and q.

    Args:
        p (np.ndarray): The first probability distribution.
        q (np.ndarray): The second probability distribution.
    Returns:
        The KL divergence between p and q.
    """
    if q.ndim > 1:
        return np.sum([kl_divergence(p, q_i) for q_i in q])

    return np.sum(p * (np.log(p) - np.log(q)))
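
A small sketch of the expected behaviour, assuming the module is importable as rqequilibrium.opt (printed values are approximate):

import numpy as np
from rqequilibrium.opt import kl_divergence

p = np.array([0.5, 0.5])
q = np.array([0.9, 0.1])

print(kl_divergence(p, p))  # 0.0 -- a distribution has zero divergence from itself
print(kl_divergence(p, q))  # ~0.511, i.e. sum(p * log(p / q))

# When q is 2-D, the divergence of p from each row of q is computed and summed.
Q = np.array([[0.9, 0.1], [0.5, 0.5]])
print(kl_divergence(p, Q))  # ~0.511 (0.511 + 0.0)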

kl_reversed

kl_reversed(p: np.ndarray, q: np.ndarray) -> float

Compute the reversed KL divergence between two probability distributions p and q.

Parameters:

    p (np.ndarray): The first probability distribution. Required.
    q (np.ndarray): The second probability distribution. Required.

Returns:

    The reversed KL divergence between p and q.

Source code in src/rqequilibrium/opt.py
def kl_reversed(p: np.ndarray, q: np.ndarray) -> float:
    """
    Compute the reversed KL divergence between two probability distributions p and q.

    Args:
        p (np.ndarray): The first probability distribution.
        q (np.ndarray): The second probability distribution.
    Returns:
        The reversed KL divergence between p and q.
    """

    return kl_divergence(q, p)
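
A quick sketch showing the relationship to kl_divergence (printed values are approximate):

import numpy as np
from rqequilibrium.opt import kl_divergence, kl_reversed

p = np.array([0.5, 0.5])
q = np.array([0.9, 0.1])

# kl_reversed(p, q) simply evaluates kl_divergence with the arguments swapped.
print(kl_reversed(p, q))    # ~0.368
print(kl_divergence(q, p))  # same value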

log_barrier

log_barrier(p: np.ndarray) -> float

Compute the log barrier function for a probability distribution p.

Parameters:

    p (np.ndarray): The probability distribution.

Returns:

    The log barrier value for the distribution.

Source code in src/rqequilibrium/opt.py
def log_barrier(p: np.ndarray) -> float:
    """
    Compute the log barrier function for a probability distribution p.
    Args:
        p (np.ndarray): The probability distribution.
    Returns:
        The log barrier value for the distribution.
    """
    return -np.sum(np.log(p))
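
A small sketch of the values this returns, assuming the module is importable as rqequilibrium.opt; the barrier grows as any component approaches zero:

import numpy as np
from rqequilibrium.opt import log_barrier

p = np.array([0.5, 0.25, 0.25])
print(log_barrier(p))  # -(ln 0.5 + ln 0.25 + ln 0.25) ≈ 3.466

p_edge = np.array([0.98, 0.01, 0.01])
print(log_barrier(p_edge))  # ≈ 9.231 -- larger near the boundary of the simplex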

negative_entropy

negative_entropy(p: np.ndarray) -> float

Compute the negative entropy of a probability distribution p.

Parameters:

    p (np.ndarray): The probability distribution.

Returns:

    The negative entropy of the distribution.

Source code in src/rqequilibrium/opt.py
def negative_entropy(p: np.ndarray) -> float:
    """
    Compute the negative entropy of a probability distribution p.
    Args:
        p (np.ndarray): The probability distribution.
    Returns:
        The negative entropy of the distribution.
    """
    return -np.sum(p * np.log(p))
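
A small sketch of the values the function returns; as coded, it evaluates -sum(p * log(p)) (printed values are approximate):

import numpy as np
from rqequilibrium.opt import negative_entropy

uniform = np.full(4, 0.25)
peaked = np.array([0.97, 0.01, 0.01, 0.01])

print(negative_entropy(uniform))  # ln(4) ≈ 1.386
print(negative_entropy(peaked))   # ≈ 0.168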

project_simplex

project_simplex(v: np.ndarray, s=1) -> np.ndarray

Projects a vector v onto the simplex defined by the sum of its components being equal to s. The simplex is defined as the set of vectors x such that: x >= 0 and sum(x) = s.

Taken from https://gist.github.com/daien/1272551.

Essentially implements Corollary 6.29 of First-Order Methods in Optimization (Amir Beck).

Parameters:

    v (np.ndarray): The vector to be projected. Required.
    s (float): The sum that the components of the projected vector should equal. Default is 1.

Returns:

    The projected vector onto the simplex.

Source code in src/rqequilibrium/opt.py
def project_simplex(v: np.ndarray, s=1) -> np.ndarray:
    """
    Projects a vector v onto the simplex defined by the sum of its components being equal to s.
    The simplex is defined as the set of vectors x such that:
    x >= 0 and sum(x) = s.

    Taken from [https://gist.github.com/daien/1272551](https://gist.github.com/daien/1272551).

    Basically Implements Corollary 6.29 First Order Optimization, Amir Beck

    Args:
        v (np.ndarray): The vector to be projected.
        s (float): The sum that the components of the projected vector should equal. Default is 1.
    Returns:
        The projected vector onto the simplex.
    """
    assert s > 0, "Radius s must be strictly positive (%d <= 0)" % s
    v = np.reshape(v, (v.shape[0]))
    (n,) = v.shape  # will raise ValueError if v is not 1-D
    # check if we are already on the simplex
    if v.sum() == s and np.all(v >= 0):
        # best projection: itself!
        return v
    # get the array of cumulative sums of a sorted (decreasing) copy of v
    u = np.sort(v)[::-1]
    cssv = np.cumsum(u)
    # get the number of > 0 components of the optimal solution
    rho = np.nonzero(u * np.arange(1, n + 1) > (cssv - s))[0][-1]
    # compute the Lagrange multiplier associated to the simplex constraint
    theta = float(cssv[rho] - s) / (rho + 1)
    # compute the projection by thresholding v using theta
    w = (v - theta).clip(min=0)
    return w
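
A short usage sketch with an arbitrary input vector; the printed values follow from the algorithm above:

import numpy as np
from rqequilibrium.opt import project_simplex

v = np.array([0.8, 1.1, -0.4])
w = project_simplex(v)

print(w)        # approximately [0.35 0.65 0.  ] -- nonnegative components
print(w.sum())  # 1.0 (up to floating-point rounding) -- components sum to the radius s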