Top Tier – Legacy Papers

There are quite a few papers on random tiebreaking for stable matching, and this is the best one. I really believe that. It gives an exact asymptotic description of students’ rank distribution (under both STB and MTB) for a model with arbitrary and heterogeneous school capacities, heterogeneous school popularities, heterogeneous student list lengths, and any ratio of students to schools. While the model of student preferences and especially school priorities is still restrictive, it is far more general than most related theory papers. This is also the only paper I know of to give an analytical characterization of the number of unassigned students in stable matchings. This number is significant in real-world markets, but many models that analyze random matching markets assume this issue away by imposing complete (or at least “sufficiently long”) preference lists.

Plus, the insights turn out to be surprisingly clean! In general, STB gives more first choices, and is better for students with short lists. MTB will typically give more matches, but not always. The comparison of the number of matches depends only on the distribution of the lengths of students’ submitted lists (and not on the other parameters of the problem). I give a clean condition on this distribution that identifies whether STB or MTB produces more matches, based on whether the list length distribution is heavy-tailed or light-tailed (real world distributions are light-tailed).

My perfectionism delayed publication of this paper for years. After using it to get a job at Columbia Business School, I didn’t even submit it to a journal for 3.5 years. Then I took over a year to complete the requested revision. (In fact, I finally submitted the second version of this paper partway through my honeymoon – shoutout to my very patient wife!)

This delay was because one result (the asymptotic validity of my equations as the number of students and schools grew) was really really hard to prove. This took me a lot of time, and also caused me to avoid working on it. This is where I would have benefited from a co-author, to help encourage me to keep going. The final proof is long and shoved into an algebra-filled appendix. It’s not all that illuminating, in terms of insights.

This experience convinced me of two things. First, if I can get away with it, I want to work directly in continuum models (rather than explicitly analyzing limits of finite random models): you get more insight (as you can often make far fewer restrictive assumptions on things like agent preferences), and for much less work. Second, I never want to start another single-author paper. While I am capable of doing so, it is just far less fun, and much harder to stay motivated in later stages.

This is follow-up work to Lottery Design for School Choice, which essentially generalizes the equations from that paper to settings with arbitrary student preferences and priorities. In essence, the model is similar to that of Azevedo and Leshno (2016) – which is a great paper – but unlike that paper, I don’t produce a single predicted cutoff score for each school, but rather a distribution for each school’s final cutoff score. I argue that this is a much more useful (and accurate) prediction in markets with small enough school capacities that randomness matters – if AL 2016 predicts a cutoff score of 0.7, in practice a student with a score of 0.69 has a chance of getting in, and a school with a score of 0.71 is not assured a seat. I show how the model of AL 2016 can be recovered by changing one equation in my model, and also show that the predictions of my model converge to theirs in the limiting case as capacities grow large. (This is a sanity check: the real value of these predictions is to say what will happen when capacities are small or modest.)

I am very proud of this paper. So much has been written on stable matching that it’s hard to say something truly important and new, but I think I’ve done it here. The mapping from inputs (preferences, priorities, and capacities) to outputs (the final assignment) induced by the Deferred Acceptance algorithm is a fairly black box. This paper takes a significant step towards opening that box. Lots of people have tried to do something like this for a while, and this model generates predictions that no other I’ve seen can match.

The key assumption in the model is to assume that the number of students who are “interested” in a school and have priority above a specified cutoff will follow a Poisson distribution. Of course, this is not always the case, but I’ve shown that it does hold under certain assumptions on preferences, and even in cases where it doesn’t hold, the resulting predictions seem to be more accurate (a closer match to simulations) than others that I have seen.

Unfortunateley, I have gotten it into my head that the primary audience for this paper is Economists, so my current goal is to get it into a Top-5 Econ journal. But as the only author, I am not sure I will have the will to see it through the endless rounds of revisions and robustness checks. Do you have any suggestions for me? I am all ears.

Of all the papers I’ve written, this one probably has the most policy-relevant insights. Also, it was so much fun to write, because Peng Shi was a fantastic co-author!

I would say that this is more of a modeling paper than a theory paper. Much of the trick was how to approximate (model) a dynamic stochastic game with applicants who have heterogeneous preferences and outside options in a tractable way. I think we did that. Once we had the model, many of the results – while far from trivial – are not too hard to prove.

I think the most striking results – which I did not expect when we began the project – are the equivalence results. We show that independent lotteries are equivalent to a waitlist without the right of refusal, and using a single lottery is equivalent to a waitlist with the right of refusal. Although I did not expect this result, I now have fairly good intuition for it. Interestingly, the first equivalence continues to hold if applicants are heterogeneous in their value distribution and cost of participating in the mechanism (I wish we’d written the paper in a way that made this point more clear). However, these equivalences do rely on the assumption that applicants have the same expected lifetime (number of periods in the system). (The model remains tractable in this case, but equivalence no longer holds.)

Another result that I enjoyed was getting to apply the revenue equivalence theorem in a paper with no payments. The analogy is that an agent’s “type” is their outside option, their “allocation” is their probability of receiving housing before departing the system, and their “payment” is essentially their loss in match value from accepting a subpar match. Agents with good outside options (low value to being matched) will behave in a way that makes them less likely to match, but also ensures low payments (high match value) conditioned on being matched. The really nice thing about this approach is that we were able to identify conditions under which two natural goals – matching more agents with bad outside options and assigning agents to apartments that are well-suited to them – are inherently in tension. That is, there is a general tradeoff that holds for any mechanism that cannot observe outside options directly.

While of course all of these results rely on imperfect assumptions, I think they are a nice way to illustrate soome first-order dynamics present in these markets. For example, the question of “lottery or waiting list?” is not as important as the specific rules for how these systems operate. The real question should be “are we giving applicants choice of where to live?” Our work shows that the NYC lottery system largely fails to do that, despite seemingly allowing applicants to choose which buildings to apply to. The prevailing waiting list systems used by US Housing Authorities also typically offer very little choice. By contrast, public housing in Amsterdam and Singapore is allocated using mechanisms (a waiting list in Amsterdam and a lottery in Singapore) that give applicants far more ability to choose their apartment. The US should change to mimic these systems. Crucially, public housing serves a far larger percentage of the population in these places (40%+ in Amsterdam, 80%+ in Singapore), which I suspect makes it harder to get away with using a lousy allocation mechanism.

Tier 2: Must Reads… for somebody

This was a fun paper to work on. Although we ended up attaching a “vaccine allocation” story to it (which I think is somewhat plausible), I really think of it as a pure math paper. There is a ton of related work, so compared to other papers I’ve written, I had to spend far less time motivating the models and questions, but much more time reading related papers and discussing the relationship.

We consider a setting with \(n\) applicants, of which only \(k\) can be accepted. There is a value to selecting each applicant, but you don’t know these values up front. However, you do have some information about these values, in the form of distributions from which they are (independently) drawn. Based on this, you must set a threshold (you could think of it as an eligibility criterion). Then, values are realized. If fewer than \(k\) are above your threshold, all of these applicants are accepted. If more than \(k\) are above your threshold, you select \(k\) at random from among them. The questions are, (i) how should you set the threshold, and (ii) how much worse off are you than if you knew the values exactly?

We are able to get “optimal” algorithms for setting thresholds, not in the sense that it sets an optimal threshold instance-by-instance, but in the sense that it offers a worst-case competitive ratio that is at least as good as any other threshold algorithm. Furthermore, the threshold-setting algorithms are quite simple, and their guarantees are pretty good! When \(k = 100\), you are guaranteed 96% of what you would get from knowing values in advance. When \(k = 10\), the guarantee is approximately \(87.5%\). For \(k = 1\) it drops to \(1 - 1/e \approx 63%\), which is still not too bad.

There are some surprising results in this paper. For example, one natural approach is to simply set a threshold so that the expected “demand” (number of items with value above the threshold) is equal to your supply. While seemingly naive, it turns out that this works (gives an optimal competitive ratio) when values are IID. Even if values are only independent, but not identically distributed, this approach gives an optimal competitive ratio if your supply is at least 5.1 But if your supply is 4 or less, it does not (you need to use a slightly more complicated threshold algorithm). Go figure!

This paper also taught me some new things about Poisson Binomial distributions (sums of independent, but not necessarily identical, Bernoullis). For example, the mean of a Poisson Binomial is easily seen to be \(\sum p_i\). Magically, it turns out that the median and mode are both one of the two nearest integers to this quantity! Will and I also prove a general result about optimizing over the \(p_i\), which shows that for a large class of optimization problems, you can constrain your search to cases were all \(p_i\) that are not \(0\) or \(1\) are identical. (This is often much easier to work with than the distribution with general \(p_i\).)

This paper was inspired by listening to Nikhil Devanur present the paper . I thought their work used some very nice techniques. However, one thing that felt unnatural to me was that in their setting, items arrived in a specified order, but their algorithm for choosing a threshold used only the distributions of values (and not the arrival order). Since I liked their approach for choosing a threshold, the natural conclusion was, “let’s eliminate arrival order by assuming that it’s random!” In the Nikhil’s talk, Will asked whether the guarantees proved in their paper were the best possible for a static threshold algorithm, and Nikhil responded “probably not.” Therefore, I was happy to get best possible (tight) results for related algorithms in the random arrival model. It turns out that in the fixed arrival model, the guarantees from CDL actually are the best possible, as Will and his co-authors later proved.

(Not sure I would say it’s better than the papers in the next tier, but I think there’s a clearer scenario where I could tell someone “you should definitely read this paper” as opposed to “I have a paper on that”)

Tier 3: Some Nice Insights

Nice model, ties together theoretical mechanism of ILMW with more practical implementation, as used in Alaska. Impressed by Tim, very satisfying to finally use linear programming duality in a paper. Analysis of waste vs trade is okay (somewhat clever). What’s not very satisfying mathematically is last part (effect of changing \(k\) not on worst case bounds, but holding fixed the instance). Not super clean insights about who gains and loses. Part of me thinks “any k is pretty good”, another part “k = 1 or k = 2 is simpler for people to actually use” and another part “maybe there is much more to understand here!” Our discussion relates to Akbarpour and Van Dijk on outside options.

(Possibility to move up a tier)

This paper tackles a topic that can be totally intractable (matching and selection with externalities), and tractably models a set of situations that frequently arise in practice: a group of family or friends wishes to share an experience, but each needs their own ticket. I personally encountered this scenario when trying to get permits to hike Halfdome, and win lotteries for discounted Broadway tickets; in the paper, we also discuss applications to people who want to run marathons together, and diversity visa applicants.

The model is simple: there are \(n\) applicants, and \(k\) homogeneous tickets to distribute. The set of applicants is partitioned into groups, and each group has dichotomous preferences: members of the group are happy if (and only if) the group collectively gets enough tickets for everyone to participate. The designer’s goals are two-fold: maximize the number of happy people (efficiency), and equalize the chance that each group is happy (fairness). One common approach (used for both Half Dome permits and Broadway shows) is what we call the individual lottery. Each person can apply for up to \(l\) tickets (without specifying who they are for). Then applicants are ordered uniformly at random, and given their requests until no more tickets remain. We show that this can be very unfair (because of multiple people from the same group entering) and inefficient (because of multiple people from the same group winning), and propose some alternatives. While we’ve gotten some pushback on the model (questioning the dichotomous preferences, and the definition of fairness), I actually think we crushed this one. I will 100% stand by our modeling choices.

While I often eschew working in finite stochastic systems in favor of “continuum” models, I enjoyed working with Carlos to come up with neat probabilistic arguments (couplings and martingales). I especially like the coupling argument showing sub-additivity of hitting times when sampling values without replacement. I also appreciate the “trick” (clever idea) of linking the group lottery, individual lottery, and weighted individual lottery together through the “weighted individual lottery with replacement”.

We have one major outstanding conjecture regarding incentives of the weighted individual lottery. In fact, this conjecture is much more general than the model we consider. It relates to mathematical work on “small deviations” and gambling (problems in the style of choosing the distributions of a set of independent random variables to maximize the chance that their sum exceeds a specified threshold, subject to certain constraints). There are many longstanding open questions in this space going back ~50 years. Once I realized this, I felt less bad that all of our efforts had not yielded a complete proof (though I am still sure the conjecture is correct).

The one critique of this paper that I think is fair is that we didn’t fully define or explore the space of possible mechanisms. We basically chose a couple of mechanisms we saw in practice, analyzed their performance, and then came up with an alternative that we argued was pretty good. But there are a lot of ways one could run a “weighted individual lottery” beyond the specific one that we propose: the design space is quite rich. I think we avoided this mostly because the stochastic analysis quickly becomes challenging for many of these designs. However, a continuum model inspired by allowing \(k\) and \(n\) to grow in a fixed ratio would likely be far more tractable, and enable analysis of (and optimization over) a greater class of mechanisms (while introducing questions about “how far” its predictions are from outcomes in finite markets).

This paper holds a special place in my heart. It probably has my favorite possible origin story. Recounting it here feels a bit too much like bragging, so I won’t. But I will happily tell you if you ask me! This paper was a significant part of my PhD, and also my first paper published in a journal. It also helped kick-start my relationship with Paul (as an unintimidated second year PhD student), which remains very treasured.

The main reason I don’t rank this paper more highly is that the theory relies on very stylized assumptions (total value is a product of common and match value, with match values drawn IID from a power law/Pareto distribution). With these assumptions, we are able to get very strong results for our proposed auction format! (As others have remarked to me, it’s not often that you get a “worst case” guarantee on the order of 95%.) One could certainly argue (and Paul has) that the assumptions are reasonable: the Power law is a stylized way of capturing heavy tailed value distributions, and the product form a clean way to model imperfect correlation in advertisers’ values. Really, one could say kudos to Paul for taking a setting that can be very hairy (auctions with correlated values and asymmetric information) and coming up with a model that is quite tractable. But to me, the assumptions still make the model seem less fundamental than some of the papers that I placed in higher tiers.

#Tier 4: Not that impressed

While we have submitted this paper to a journal, the model and results are still in flux, so I will leave my commentary for a later date. (Maybe move it up a tier?)

This was the product of my 2014 internship at Microsoft Research, where I had a fabulous time. I enjoyed working with my mentors Nicole and Brendan, and meeting fellow interns who became peers and friends that I am in touch with today. The office views over the Charles were spectacular. I learned a lot during this internship, but in terms of tangible output, this paper is a bit underwhelming.

It proposes several definitions for what “stability” should mean in a two-sided matching market where agents have incomplete information about their or others’ preferences. We show that when agents are uncertain about their own preferences, no mechanism can satisfy our definition, while when agents have uncertainty only about others’ preferences, mechanisms can be pairwise stable but not fully stable. These negative results basically boil down to the following fact: stable matching is not utilitarian optimal. If people don’t have enough information to know how they will do, they might all agree ex ante to use a different mechanism that produces higher total utility. (Of course, ex post, some people will regret this choice.) The idea seems simple to me now, but at the time it took a while to articulate and formalize.

To be clear, I think this is a reasonable “starter” paper, and it’s probably good that somebody wrote it. I could imagine other early-career researchers learning something from reading it. However, from my current vantage point, it’s hard to get very excited about it.

Dan Russo is another good friend, and this paper grew out of a project for Ramesh’s class, which we both took as first-year PhD students. I remember enjoying finding an application for some of the random-walk analysis techniques that I had learned in my first year at Stanford. I also remember Dan forcing me to actually sit down and write the paper months after the term had ended (Ramesh had granted us an extension), when I just wanted to watch the 2012 O lympics (this is why I need co-authors). And finally, I remember visiting Boston for the WINE conference, which I greatly enjoyed! So positive memories, but not too much to see here from an academic perspective.

  1. Here, the random arrival order/selection really matters: otherwise, an adversary could choose an order such that almost all the value will come from a few very valuable items which are always at the end, and our algorithm has too high a chance of running out before they are considered.↩︎