Update Pock-Chambolle to match 2011 paper #52
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -133,3 +133,37 @@ We make use of a few key observations: | |
|
|
||
| - Projection on $\mathcal{Z}$ commutes with scaling | ||
| - Projection on an interval commutes with scaling if scaling is also applied to the interval in question | ||
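As a small worked check of the second observation, take a scalar scaling factor $d > 0$ and an interval $[l, u]$ (the symbols $d$, $l$, $u$, $x$ are illustrative and not taken from the code). Since multiplying by a positive number preserves order, clamping commutes with the scaling once the bounds are scaled too:

```math
\mathrm{proj}_{[d l,\, d u]}(d x) = \min\bigl(\max(d x,\, d l),\, d u\bigr) = d \, \min\bigl(\max(x,\, l),\, u\bigr) = d \, \mathrm{proj}_{[l,\, u]}(x)
```

The same identity applies componentwise when $d$ is replaced by a positive diagonal matrix.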
|
|
||
| ### Pock-Chambolle scaling | ||
|
|
||
| Following Lemma 2 of: | ||
|
|
||
| > Pock, Thomas, and Antonin Chambolle. "Diagonal preconditioning for first order primal-dual algorithms in convex optimization." *2011 International Conference on Computer Vision*. IEEE, 2011. [doi:10.1109/ICCV.2011.6126441](https://doi.org/10.1109/ICCV.2011.6126441) | ||
|
|
||
| for a matrix $K$ of size $m \times n$ and $\alpha \in [0, 2]$, the diagonal factors are | ||
|
Member
In our code the constraint matrix is called
Collaborator
Author
Was just sticking to the Pock-Chambolle paper here. Also, we will use it for |
||
|
|
||
| ```math | ||
| \tau_j = \left(\sum_i |K_{ij}|^{\alpha}\right)^{-1} \quad \text{(variable side, column $j$)} | ||
| ``` | ||
|
|
||
| ```math | ||
| \sigma_i = \left(\sum_j |K_{ij}|^{2-\alpha}\right)^{-1} \quad \text{(constraint side, row $i$)} | ||
| ``` | ||
|
|
||
| so that $\|\Sigma^{1/2} K T^{1/2}\| \leq 1$ with $T = \mathrm{diag}(\tau_j), \Sigma = \mathrm{diag}(\sigma_i)$. CoolPDLP absorbs the $1/2$ powers into the rescaling factors: $D_2[j] = \tau_j^{1/2}$ and $D_1[i] = \sigma_i^{1/2}$. The exponent parameter $\alpha$ is exposed as `chambolle_pock_alpha`. See also [`CoolPDLP.chambolle_pock_preconditioner`](@ref). | ||
|
|
||
| Note that CoolPDLP uses the exponent $2-\alpha$ for the rows and $\alpha$ for the columns, | ||
| to align with the [PDLP paper](https://dl.acm.org/doi/abs/10.5555/3540261.3541809). This is the opposite of the | ||
| convention used in the original Pock-Chambolle paper referenced above. | ||
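The formulas above can be sketched for a small dense matrix as follows. This is only an illustration under stated assumptions: `pock_chambolle_factors` is a hypothetical helper, not the signature of `CoolPDLP.chambolle_pock_preconditioner`, and it assumes that $K$ has no all-zero row or column (otherwise the inverse sums would be infinite).

```julia
using LinearAlgebra

# Hypothetical helper (not CoolPDLP's API): diagonal factors for a dense K,
# using exponent α on columns and 2 - α on rows, as in the formulas above.
function pock_chambolle_factors(K::AbstractMatrix, α::Real)
    # Zero entries contribute 0, so 0^0 is never counted as 1.
    pw(x, p) = iszero(x) ? zero(float(x)) : abs(x)^p
    τ = [inv(sum(x -> pw(x, α), view(K, :, j))) for j in axes(K, 2)]
    σ = [inv(sum(x -> pw(x, 2 - α), view(K, i, :))) for i in axes(K, 1)]
    return sqrt.(σ), sqrt.(τ)  # D1 (rows), D2 (columns)
end

K = [1.0 -2.0 0.0; 0.0 3.0 4.0]
D1, D2 = pock_chambolle_factors(K, 1.0)
scaled = Diagonal(D1) * K * Diagonal(D2)

# Spectral norm of the rescaled matrix; the lemma bounds it by 1.
opnorm(scaled)
```

With $\alpha = 1$ both exponents coincide, so this particular call does not depend on which side receives $\alpha$ and which receives $2 - \alpha$.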
|
|
||
| ### Ruiz equilibration | ||
|
|
||
| The Ruiz iteration alternately divides each row and each column of $A$ by the square root of its $\ell_\infty$ norm. After enough iterations, all row and column $\ell_\infty$ norms of the rescaled matrix are close to 1. See also [`CoolPDLP.ruiz_preconditioner`](@ref). | ||
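A minimal dense sketch of the idea is given below. It is an assumption-laden illustration rather than the code behind `CoolPDLP.ruiz_preconditioner`: `ruiz_factors` is a hypothetical name, and the sketch uses the textbook variant that rescales rows and columns simultaneously in each pass, which may differ from the order used in the package.

```julia
using LinearAlgebra

# Hypothetical helper (not CoolPDLP's API): textbook Ruiz equilibration on a
# dense matrix, updating rows and columns simultaneously in each pass.
function ruiz_factors(A::AbstractMatrix; iterations::Integer = 10)
    B = float.(copy(A))
    d1 = ones(eltype(B), size(B, 1))  # accumulated row scaling
    d2 = ones(eltype(B), size(B, 2))  # accumulated column scaling
    for _ in 1:iterations
        r = [sqrt(norm(view(B, i, :), Inf)) for i in axes(B, 1)]
        c = [sqrt(norm(view(B, :, j), Inf)) for j in axes(B, 2)]
        # Leave all-zero rows and columns untouched.
        r[r .== 0] .= 1
        c[c .== 0] .= 1
        B = Diagonal(inv.(r)) * B * Diagonal(inv.(c))
        d1 ./= r
        d2 ./= c
    end
    return d1, d2
end

A = [1.0 100.0; 0.01 2.0]
d1, d2 = ruiz_factors(A; iterations = 30)
S = Diagonal(d1) * A * Diagonal(d2)  # rescaled matrix
```

After the 30 passes used here, every row and column of `S` has an $\ell_\infty$ norm very close to 1, even though the entries of `A` span four orders of magnitude.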
|
|
||
| ### Composition | ||
|
|
||
| The two passes run one after the other: first Ruiz produces $D_1^{\mathrm{ruiz}}, D_2^{\mathrm{ruiz}}$, then Pock-Chambolle is applied to the already-Ruiz-rescaled matrix $D_1^{\mathrm{ruiz}} A D_2^{\mathrm{ruiz}}$ and produces $D_1^{\mathrm{cp}}, D_2^{\mathrm{cp}}$. The final scaling is then | ||
|
|
||
| ```math | ||
| D_1 = D_1^{\mathrm{cp}} D_1^{\mathrm{ruiz}} \qquad D_2 = D_2^{\mathrm{ruiz}} D_2^{\mathrm{cp}} | ||
| ``` | ||
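Substituting these products into $D_1 A D_2$ makes the composition explicit. Because diagonal matrices commute, the order of the factors within $D_1$ and $D_2$ does not matter, and the fully rescaled matrix is the Pock-Chambolle scaling of the Ruiz-rescaled matrix:

```math
D_1 A D_2 = D_1^{\mathrm{cp}} \left( D_1^{\mathrm{ruiz}} A D_2^{\mathrm{ruiz}} \right) D_2^{\mathrm{cp}}
```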
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -107,6 +107,19 @@ end | |
| column_norm(A::AbstractMatrix, j::Integer, p) = norm(view(A, :, j), p) | ||
| column_norm(A::SparseMatrixCSC, j::Integer, p) = norm(view(nonzeros(A), nzrange(A, j)), p) | ||
|
|
||
| """ | ||
| column_power_sum(A, j, p) | ||
|
|
||
| Return `sum_i |A[i, j]|^p`. Zero entries contribute `0` even when `p = 0` (rather than `0^0 = 1`), | ||
|
Member
I'm not sure why you say the initial implementation was incorrect. Doesn't
Collaborator
Author
was |
||
| so the sparse and dense cases are aligned. | ||
|
klamike marked this conversation as resolved.
|
||
| """ | ||
| @inline _power_term(x, p) = iszero(x) ? zero(x) : abs(x)^p | ||
|
|
||
| column_power_sum(A::AbstractMatrix, j::Integer, p) = | ||
| sum(x -> _power_term(x, p), view(A, :, j); init = zero(eltype(A))) | ||
| column_power_sum(A::SparseMatrixCSC, j::Integer, p) = | ||
| sum(x -> _power_term(x, p), view(nonzeros(A), nzrange(A, j)); init = zero(eltype(A))) | ||
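The following sketch shows why the zero handling matters when `p = 0`. The two definitions are copied from the diff above (without `@inline`) so that the snippet runs on its own; the test matrix and the `naive` comparison are illustrative additions, not part of the PR.

```julia
using SparseArrays

# Definitions copied from the diff above so the snippet is self-contained.
_power_term(x, p) = iszero(x) ? zero(x) : abs(x)^p
column_power_sum(A::AbstractMatrix, j::Integer, p) =
    sum(x -> _power_term(x, p), view(A, :, j); init = zero(eltype(A)))
column_power_sum(A::SparseMatrixCSC, j::Integer, p) =
    sum(x -> _power_term(x, p), view(nonzeros(A), nzrange(A, j)); init = zero(eltype(A)))

dense = [1.0 0.0; 0.0 2.0; 3.0 0.0]
sp = sparse(dense)

# A naive dense sum counts the stored zero as 0.0^0 == 1.0.
naive = sum(x -> abs(x)^0, view(dense, :, 1))

dense_sum = column_power_sum(dense, 1, 0)   # skips the zero entry
sparse_sum = column_power_sum(sp, 1, 0)     # only sees stored nonzeros
```

Here `naive` counts all three entries of the first column, while `dense_sum` and `sparse_sum` both count only the two nonzeros.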
|
|
||
|
klamike marked this conversation as resolved.
|
||
| mynnz(A::AbstractSparseMatrix) = nnz(A) | ||
| mynnz(A::AbstractMatrix) = prod(size(A)) | ||
|
|
||
|
|
||
We may want to start using DocumenterCitations.jl