October 26, 2011

Homework Sample Solutions

In case you hadn’t noticed, we’ve been trying to put up sample solutions for the HWs: complete solutions for HW2, and partial solutions for HW3 are now online (I still need to write the solutions for HW3 #1).

October 26, 2011

Homework 2 graded

Homework 2 is graded. (Sorry for the delay, it’s my fault — AG).  The average was (20/25). Some comments:

1. Most people got this, although in some cases with pretty long-winded proofs.  The brief point is that when considering non-integral edges, every vertex on the V side has degree 0 or at least 2; thus one can find either a cycle (as in class) or else the maximal path starts and ends with a degree-1 vertex in U.  Deepak cleverly pointed out that you can also just add |U|-|V| fictitious jobs called “unemployment”, link everyone to them with weight 0, and then reduce to the proof from class.

2. George & Leslie = Nemhauser & Trotter.  Most people got this too, except in part (b) very few people observed out that for $x$ non-half-integral, $x^+$ and $x^-$ are distinct from $x$.  Since this was really the only point to the question we should have deducted a point for failing to mention it.  But we did not have the heart to do so.

3. Most people got the first 4 parts, and also the “easy” direction of Farkas Lemma (that both alternatives cannot be true). Some people had trouble with the last part (proving that if there is no feasible x then there must be a y): in particular, I saw proofs that tried to do it from first principles, where the hope was that you’d just use the template from the above parts (and use the correctness of the Avis-Kaluzny procedure).

4. This one tripped almost everyone up: very few of you even mentioned that one might not be able to reach the actual optimal LP value using binary search, and that some kind of “rounding” step would be needed at the end when the size of the binary search intervals is small enough. (That would have sufficed, since we didn’t really expect you to do this rounding step.) The details are all in the sample solutions: even if you don’t read all the solutions, have a look at this one for sure!

5. (Note: for this problem we should have stated that clauses always contain distinct literals.  But this didn’t seem to trip anyone up.)  Most everyone got (a) and (b).  To some extent the point of this question was that for (c), the analysis is much easier than the proof that Horn-Sat is in P.  All you have to do is say, “Given a solution with LP-value m, round each value down.”  It’s very quick to check that the resulting integer solution still has value m.

October 24, 2011

Lecture 12: SDP Duality

A draft of the SDP Duality lecture notes has been posted.

October 24, 2011

Lecture 11: The Lovasz Theta Function

A draft of the lecture notes on SDPs and The Lovasz Theta Function have been posted.

October 22, 2011

No Lecture on Tuesday

In case you missed the announcement the other day, Tuesday October 25th’s lecture is canceled due to the FOCS conference. Normal service will resume on Thursday.

October 20, 2011

Lecture #12 (SDP duality): discussion

1. Examples

The example showing that the optimum value of an SDP might not be achieved is this: {\min x_1} subject to

\displaystyle  \begin{pmatrix}   x_1 & 1 \\   1 & x_2 \\   \end{pmatrix} \succeq 0

For this to be positive-semidefinite, we need {x_1x_2 \geq 1}. But then the optimum value of the SDP is {0} (it is the infimum of the values {x_1} can take), but it not achieved by any feasible solution.

Here is an example where the primal is feasible and finite, but the dual is infeasible.

\displaystyle  \min    \begin{pmatrix}   0 & 1 \\   1 & 0 \\   \end{pmatrix}\bullet X ~~~s.t.~~~   \begin{pmatrix}   1 & 0 \\   0 & 0 \\   \end{pmatrix}\bullet X = 0, X \succeq 0.

The optimum is {0}, achieved by {(0, x_2)} for {x_2 \geq 0}. But the dual seeks to maximize {0} subject to

\displaystyle  y \begin{pmatrix}   1 & 0 \\   0 & 0 \\   \end{pmatrix} \preceq \begin{pmatrix}   0 & 1 \\   1 & 0 \\   \end{pmatrix}   ~~~~~~\iff~~~~~~ \begin{pmatrix}   -y & 1 \\   1 & 0 \\   \end{pmatrix} \succeq 0.

But this is infeasible, since psd matrices cannot have non-zero entries in the same row/column as a zero diagonal entry.

2. Proofs of Some Useful Facts

We used several facts about matrices in lecture today, without proof. Here are some of the proofs. Recall that {A \bullet B = Tr(A^\top B) = \sum_{ij} A_{ij} B_{ij}}. For symmetric (and hence for psd) matrices, {Tr(A^\top B) = Tr(AB)}.

Fact 1 For any two {n \times n} matrices {A, B}, Tr(AB) = Tr(BA).

Proof: {Tr(AB) = \sum_{i} (AB)_{ii} = \sum_i \sum_j A_{ij} B_{ji} = \sum_j \sum_i B_{ji} A_{ij} = Tr(BA)}.
\Box

Lemma 1 For a symmetric {n\times n} matrix {A}, {A} is psd if and only if {A \bullet B \geq 0} for all psd {B}.

Proof:
One direction is easy: if {A} is not psd, then there exists {x \in {\mathbb R}^n} for which {A \bullet (xx^\top) = x^\top A x < 0}. But {xx^\top} is psd, which shows that {A \bullet B < 0} for some psd {B}.

In the other direction, let {A, B} be psd. We claim that {C}, defined by {C_{ij} = A_{ij}B_{ij}}, is also psd. (This matrix {C} is called the Schur-Hadamard product of {A} and {B}.) Then

\displaystyle  \sum_{ij} A_{ij} B_{ij} = \sum_{ij} C_{ij} = \textbf{1}^\top C \textbf{1} \geq 0

by the definition of psd-ness of {C}. To see the claim: since {A}, there exist random variables {\{a_i\}_{i = 1}^n} such that {A_{ij} = E[a_ia_j]}. Similarly, let {B_{ij} = E[b_ib_j]} for r.v.s {\{b_i\}}. Moreover, we can take the {a}‘s to be independent of the {b}‘s. So if we define the random variables {c_i = a_ib_i}, then

\displaystyle  C_{ij} = E[a_ia_j]E[b_ib_j] = E[a_ia_jb_ib_j] = E[c_i c_j],

and we are done. (Note we used independence of the {a}‘s and {b}‘s to make the product of expectations the expectation of the product.) \Box

This clean random variable-based proof is from this blog post. One can also show the following claim. (I don’t know of a slick r.v. based proof—anyone see one?) BTW, the linear-algebraic proof also gives an alternate proof of the above Lemma 1.

Lemma 2 For {A \succ 0} (i.e., it is positive definite), {A \bullet B > 0} for all psd {B}, {B \neq 0}.

Proof: Let’s write {A} as {PDP^\top} where {P} is orthonormal, and {D} is the diagonal matrix containing {A}‘s eigenvalues (which are all positive, because {A \succ 0}.

Let {\widehat{B} = P^\top B P}, and hence {B = P \widehat{B} P^\top}. Note that {\widehat{B}} is psd: indeed, {x^\top\widehat{B} x = (Px)^\top B (Px) \geq 0}. So all of {\widehat{B}}‘s diagonal entries are non-negative. Moreover, since {B \neq 0}, not all of {\widehat{B}}‘s diagonal entries can be zero (else, by {\widehat{B}}‘s psd-ness, it would be zero). Finally,

\displaystyle  Tr(AB) = Tr((PDP^\top)(P\widehat{B}P^\top)) = Tr(PD\widehat{B}P^\top) = Tr(D\widehat{B}P^\top P) = Tr(D\widehat{B}) = \sum_i D_{ii}\widehat{B}_{ii}.

Since {D_{ii} > 0} and {\widehat{B}_{ii} \geq 0} for all {i}, and {\widehat{B}_{ii} > 0} for some {i}, this sum is strictly positive. \Box

Lemma 3 For psd matrices {A, B}, {A \bullet B = 0} if and only if {AB = 0}.

Proof: Clearly if {AB = 0} then {A \bullet B = tr(A^\top B) = tr(AB) = 0}.

For the other direction, we use the ideas (and notation) from Lemma 1. Again take the Schur-Hadamard product {C} defined by {C_{ij} = A_{ij}B_{ij}}. Then {C} is also psd, and hence {C_{ij} = E[c_ic_j]} for random variables {\{c_i\}_{i = 1}^n}. Then

\displaystyle  A \bullet B = \sum_{ij} C_{ij} = \sum_{ij} E[c_ic_j] = E[\sum_{ij} c_ic_j] = E[(\sum_i c_i)^2].

If this quantity is zero, then the random variable {\sum_i c_i = \sum_i a_ib_i} must be zero with probability {1}. Now

\displaystyle  (AB)_{ij} = \sum_k E[a_ia_k]E[b_kb_j] = \sum_k E[a_ib_j(a_kb_k)] = E[a_ib_j (\sum_k c_k)] = 0,

so {AB} is identically zero. \Box

3. Eigenvalues Seen Differently, or Old Friends Again

We saw that finding the maximum eigenvalue of a symmetric matrix {A} can be formulated as:

\displaystyle  \begin{array}{rl}    \min t \\   \text{s.t.~~~} tI - A \succeq 0   \end{array}

and its dual

\displaystyle  \begin{array}{rl}    \max A \bullet X \\   \text{s.t.~~~} X \bullet I = 1 \\   X \succeq 0   \end{array}

The dual can be reinterpreted as follows: recall that {X \succeq 0} means we can find reals {p_i \geq 0} and unit vectors {x_i \in {\mathbb R}^n} such that {X = \sum_i p_i (x_ix_i^\top)}. By the fact that {x_i}‘s are unit vectors, {Tr(x_ix_i^\top) = 1}, and the trace of this matrix is then {\sum_i p_i}. But by our constraints, {X \bullet I = Tr(X) = 1}, so {\sum_i p_i = 1}.

Rewriting in this language, {\lambda_{\max}} is the maximum of

\displaystyle  \sum_i p_i \; (A \bullet x_ix_i^\top)

such that the {x_i}‘s are unit vectors, and {\sum_i p_i = 1}. But for any such solution, just choose the vector {x_{i^*}} among these that maximizes {A \bullet (x_{i^*}x_{i^*}^\top)}; that is at least as good as the average, right? Hence,

\displaystyle  \lambda_{\max} = \max_{x \in {\mathbb R}^n : \|x\|_2 = 1} A \bullet (xx^\top) = \max_{x \in {\mathbb R}^n} \frac{x^\top A x}{x^\top x}

which is the standard variational definition of the maximum eigenvalue of {A}.

October 19, 2011

Lecture 10: Semidefinite Programs and the Max-Cut Problem

A draft of the lecture notes on semidefinite programs and the Max-Cut problem is posted.

October 19, 2011

Lecture 9: More on Ellipsoid: Grötschel-Lovász-Schrijver theorems

Lecture notes on the Grötschel-Lovász-Schrijver theorems are posted.

October 19, 2011

Homework #4 comments

There’ve been some corrections to homework 4, just in case you missed those.

Also, for the matroid problem (#2): if you’ve not seen matroids before, you might find it useful to think about the arguments in the special case of the graphical matroid first. In a graphical matroid, the elements are edges of some graph G, and the independent sets are all forests (acyclic subgraphs) of this graph.

October 18, 2011

Lecture #11 discussion

In the lecture, I forgot to mention why we chose to talk about the classes of perfect graphs we did. We showed that bipartite graphs and line graphs of bipartite graphs (and their complements) are all perfect.Let’s call a graph basic if it belongs to one of these four classes: bip, co-bip, line(bip), or co-line(bip).

Recall a graph was called a Berge graph if it did not contain (as an induced subgraph) any odd hole or odd antihole. Clearly, all perfect graphs are Berge graphs, else they could contain such a offending induced subgraph, which would contradict their perfectness. The other direction is the tricky one: showing that all Berge graphs are perfect. Which is what Maria Chudnovsky, Neil Robertson, Paul Seymour, and Robin Thomas eventually did in 2002.

So how did they proceed? Here’s a mile-high view of their approach. They showed that every Berge graph G either is basic (i.e., belongs to those four classes bip, co-bip, line(bip), or co-line(bip)), or else it has some structural “faults” (G has a 2-join, or its complement has a 2-join, or G has an M-join, or a skew partition). For us, it does not matter what exactly these structural faults are: what matters is (and this is what CRST showed) that minimal imperfect graphs cannot have any of these faults. So, if Berge graphs could be imperfect, a minimal counterexample would be basic, which is a contradiction. More on this can be found, e.g., off the links on Vasek Chvatal’s Perfect Graph Theorem Page.

Re Srivatsan’s question: given the strong perfect graph theorem, the perfect graph theorem follows as a corollary. Indeed, suppose G was perfect but \overline{G} was not. Then \overline{G} must not be a Berge graph, and contains either an odd hole or odd antihole as an induced subgraph. But then G contains an odd antihole or odd hole as an induced subgraph, so could not have been perfect.