# Big News: Existence of designs

To start, let us introduce Steiner (1796-1863) systems with parameters $(n, r, t)$. There is a set $X$ of $n$ elements, and one would like to find a collection of subsets of size $r$ so that every $t$-tuple of elements appears in exactly one member of the collection. Since each $r$-tuple contains ${r \choose t}$ $t$-tuples, it is clear that ${r \choose t}$ must divide ${n \choose t}$. More generally, for any $0 \le i \le t-1$, ${r-i \choose t-i}$ must divide ${n-i \choose t-i}$ (exercise). *Is this necessary divisibility condition sufficient?*
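As a small sanity check of the divisibility condition, here is a minimal Python sketch. The function name `divisibility_ok` and the parameter names are my own for illustration: `n` is the number of elements, `r` the size of each subset, and `t` the size of the tuples that must be covered exactly once.

```python
from math import comb


def divisibility_ok(n: int, r: int, t: int) -> bool:
    """Return True if C(r-i, t-i) divides C(n-i, t-i) for every 0 <= i <= t-1.

    Assumes t <= r <= n. This is only the necessary condition; it says
    nothing by itself about whether a Steiner system actually exists.
    """
    return all(
        comb(n - i, t - i) % comb(r - i, t - i) == 0
        for i in range(t)
    )


if __name__ == "__main__":
    # A few small parameter sets with t = 2 and r = 3 (triples covering pairs).
    for n in (7, 8, 9, 15):
        print(n, divisibility_ok(n, 3, 2))
```

For triples covering pairs, the check passes for $n = 7, 9, 15$ and fails for $n = 8$, since ${8 \choose 2} = 28$ is not divisible by $3$.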

A famous related problem was posed by Kirkman (1850), known as the “15 schoolgirls problem”: There is a class of 15 students in a private girls’ school. The girls want to walk in groups of three every day of the week, but in such a way that every two girls walk with each other exactly once. Can you form a schedule for them? (First exercise: see that they need the full week, not five days, to do this.) The extra requirement here is that the Steiner system must also decompose nicely into days.

For certain sets of parameters, one can construct Steiner systems using algebraic structure. For instance, a projective plane over a finite field $\mathbb{F}_p$ is a Steiner system of order $n = p^2+p+1$, with $r = p+1$ and $t = 2$, since every two points determine exactly one line. Many others come from Mathieu groups. The existence of Steiner systems with given fixed $r$ and $t$ and sufficiently large $n$ satisfying the divisibility condition was a famous open problem for a long time.
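The projective plane example can be checked by brute force for small primes. The sketch below is only an illustration: the helper names `projective_plane` and `is_steiner_system` are mine, and the construction (points and lines both represented by normalized nonzero vectors of $\mathbb{F}_p^3$, with incidence given by a vanishing dot product) is the standard textbook one, assumed here for a prime $p$.

```python
from itertools import combinations, product


def projective_plane(p: int):
    """Points and lines of the projective plane over F_p, for a prime p.

    A point is a nonzero vector of F_p^3 scaled so that its first nonzero
    coordinate is 1. Lines are indexed by the same vectors: the point x
    lies on the line a when the dot product a.x is 0 mod p.
    """
    def is_normalized(v):
        return next(x for x in v if x != 0) == 1

    points = [v for v in product(range(p), repeat=3)
              if any(v) and is_normalized(v)]
    lines = [
        frozenset(x for x in points
                  if sum(a * b for a, b in zip(line, x)) % p == 0)
        for line in points
    ]
    return points, lines


def is_steiner_system(points, blocks, t: int) -> bool:
    """True if every t-subset of points lies in exactly one block."""
    counts = {}
    for block in blocks:
        for sub in combinations(sorted(block), t):
            counts[sub] = counts.get(sub, 0) + 1
    return all(counts.get(sub, 0) == 1
               for sub in combinations(sorted(points), t))


if __name__ == "__main__":
    for p in (2, 3, 5):
        pts, lns = projective_plane(p)
        print(p, len(pts), {len(l) for l in lns}, is_steiner_system(pts, lns, 2))
```

For $p = 2$ this is the Fano plane: 7 points, 7 lines of size 3, and every pair of points on exactly one line.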

A design is a generalization of a Steiner system, where one introduces a new parameter $\lambda$ and replaces the assumption that every $t$-tuple of elements appears in exactly one member of the collection by the assumption that every $t$-tuple of elements appears in exactly $\lambda$ members of the collection. Construction of designs was a popular topic at some point, partially thanks to motivation from statistics (such as testing). There is an endless list of very clever constructions of designs for specific parameters. However, the general existence has not been known. An important case was famously solved by Wilson in the early 1970s, who answered the question positively for the case $t = 2$.

Last Friday, Peter Keevash, a young researcher in extremal and probabilistic combinatorics at Oxford, announced the complete solution of the general existence conjecture at a meeting in Oberwolfach (Germany). It was the highlight of the conference. For some reason, his talk was scheduled to be a short talk (25′). Since the introduction, history and statement of the result (which is actually a more general statement about hypergraphs and complexes) already took more than 20′, naturally little could be said about the proof. I managed to catch him after the talk and asked for a few more details. Here is my rough understanding of his proof. The paper is not yet online but will be out very soon.

For simplicity, let us consider the case $\lambda = 1$ again (Steiner systems). First, we formalize the problem using hypergraph terminology. Consider a hypergraph $H$ whose vertices are the $t$-tuples of a set $X$ of $n$ elements (thus $H$ has ${n \choose t}$ vertices). The edges have size ${r \choose t}$; a collection of ${r \choose t}$ $t$-tuples forms an edge if the tuples are exactly the $t$-tuples of a set of $r$ elements in $X$.

A *matching* in a hypergraph is a family of disjoint edges. A matching is *perfect* if it covers all vertices (it is clear that a divisibility condition needs to be satisfied for a perfect matching to exist). Furthermore, it is *near perfect* if it covers all but an $o(1)$ fraction of the vertices.

The existence of a Steiner system is equivalent to the existence of a perfect matching in this hypergraph $H$. The problem here is that it is very hard to find a sufficient condition for the existence of perfect matchings in a hypergraph (for graphs the problem has been solved completely).
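To make the equivalence concrete, here is a minimal sketch that turns a list of blocks into hypergraph edges and tests whether they form a perfect matching. The names `edge_of` and `is_perfect_matching`, and the labelling of the ground set as `0, ..., n-1`, are assumptions made for illustration; the Fano plane (7 points, blocks of size 3, pairs covered once) serves as the test case.

```python
from itertools import combinations


def edge_of(block, t: int) -> frozenset:
    """The hypergraph edge of a block: the set of all its t-subsets."""
    return frozenset(combinations(sorted(block), t))


def is_perfect_matching(n: int, t: int, blocks) -> bool:
    """True if the edges of the given blocks are pairwise disjoint and
    together cover every t-subset of {0, ..., n-1}."""
    edges = [edge_of(b, t) for b in blocks]
    covered = set().union(*edges)
    pairwise_disjoint = sum(len(e) for e in edges) == len(covered)
    return pairwise_disjoint and covered == set(combinations(range(n), t))


# The seven lines of the Fano plane on the points 0..6.
FANO = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
        (1, 4, 6), (2, 3, 6), (2, 4, 5)]

if __name__ == "__main__":
    print(is_perfect_matching(7, 2, FANO))       # all 21 pairs covered once
    print(is_perfect_matching(7, 2, FANO[:-1]))  # three pairs left uncovered
```

The check is exactly the Steiner condition rephrased: disjoint edges mean no $t$-tuple is covered twice, and covering all vertices means no $t$-tuple is missed.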

The probabilistic method comes to the rescue. One can try to create a matching using the natural greedy randomized algorithm as follows: pick a random edge, delete all edges intersecting it, and repeat. This approach was first used by Rödl in the 1980s to prove the existence of a near perfect matching (solving a problem of Erdős and Hanani). In his proof, Rödl picked several random edges at the same time (thus the method is dubbed “Rödl’s nibble”) and deleted all edges intersecting any of them. (If we do this carefully, the edges chosen are already disjoint with high probability.) What can be shown is that at the end of each round (nibble), one can still control the structure of the hypergraph, in particular the degree of each vertex, the codegree of any two vertices, and later, the higher codegrees of larger sets of vertices (the codegree of a set $S$ is the number of edges containing $S$). The use of random greedy algorithms in this manner was initiated by Ajtai, Komlós and Szemerédi a few years earlier. (Later, in 1995, Spencer showed that one can still analyze the algorithm which picks one random edge at a time.)
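The one-edge-at-a-time version of the greedy algorithm is easy to simulate on small instances. The sketch below is an illustration, not Rödl's nibble itself: it processes the blocks in a uniformly random order and keeps a block whenever none of its $t$-tuples is covered yet, which is a standard way to simulate "pick a random surviving edge, delete everything meeting it, repeat". The function name `random_greedy_matching` and the `seed` parameter are my own.

```python
import random
from itertools import combinations
from math import comb


def random_greedy_matching(n: int, r: int, t: int, seed: int = 0):
    """Random greedy matching in the hypergraph of t-subsets of {0..n-1}.

    Returns the chosen r-subsets (blocks) and the fraction of t-subsets
    they cover. Only practical for small n, since all C(n, r) blocks
    are enumerated.
    """
    rng = random.Random(seed)
    blocks = list(combinations(range(n), r))
    rng.shuffle(blocks)  # a uniformly random processing order

    covered = set()  # t-subsets already covered by a chosen block
    chosen = []
    for block in blocks:
        edge = set(combinations(block, t))
        if covered.isdisjoint(edge):  # the block survived all deletions so far
            chosen.append(block)
            covered |= edge
    return chosen, len(covered) / comb(n, t)


if __name__ == "__main__":
    for n in (15, 21, 27):
        chosen, fraction = random_greedy_matching(n, 3, 2, seed=1)
        print(n, len(chosen), round(fraction, 3))
```

Running this for a few values of `n` shows the typical behaviour: the matching covers most pairs but usually gets stuck before covering all of them, which is the difficulty discussed next.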

The main problem with this approach is that near the end, our control on the hypergraph is very poor. There are papers subsequent to Rödl’s by Kostochka and Rödl, Grable, Alon, Kim and Spencer, and myself (see this for a brief survey) which analyzed the process more carefully and obtained polynomial error terms for the fraction of uncovered vertices. However, once the number of remaining vertices (not covered by the matching produced by the randomized algorithm) becomes very small, the hypergraph induced by them (which consists of the edges surviving all deletions so far) loses all structure and it becomes impossible to analyze the algorithm any further.

Peter found a beautiful way to get around this problem. The intuition is roughly as follows. We do not start the algorithm immediately, but first create an auxiliary hypergraph on a subset of the vertices, which has very nice structure and many perfect matchings on that subset. This hypergraph will be used at the end to compensate for the loss when the randomized algorithm starts to falter. What happens with the algorithm at the end is that one can no longer guarantee that the edges chosen are disjoint. There will be several vertices which are covered by, say, 2 edges. Now we turn back to the auxiliary hypergraph created at the beginning and switch out these edges using its disjoint edges. In spirit, this is somewhat close to the absorbing method introduced by Ruciński, Rödl and Szemerédi.

This is, of course, a hugely oversimplified description which does not do justice to the real argument. Peter needs to consider a weighted version of the problem and construct a matrix instead of the (auxiliary) hypergraph, and run a delicate induction on complexes rather than on hypergraphs. I hope to have a more proper description after the paper is out. *(Many thanks to Peter Keevash for useful comments.)*

Does Keevash’s construction provide an efficient algorithm?

It probably does provide an algorithm; I am just not sure it is efficient.

If this only works for sufficiently large n, does that mean we still don’t know if there are designs where r is close to n? Such as (n, (1-eps)n, t)?