The city size distribution of many countries is remarkably well approximated by a Pareto distribution. We study what constraints this regularity imposes on standard urban models. We find that under general conditions urban models must have (i) a balanced growth path and (ii) a Pareto distribution for the underlying source of randomness. In particular, one of the following combinations can induce a Pareto distribution of city sizes: (i) preferences for different goods follow reflected random walks, and the elasticity of substitution between goods is 1; or (ii) total factor productivities in the production of different goods follow reflected random walks, and increasing returns are equal across goods.
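
The link between reflected random walks and a Pareto distribution can be illustrated with a small simulation. The sketch below is not the urban model studied here. It assumes that the logarithm of a city's size follows a Gaussian random walk with negative drift, reflected at a lower barrier. Under that assumption the stationary distribution of log size has an exponential upper tail, so size itself has a Pareto tail with exponent -2 * drift / vol^2. All function names and parameter values (`simulate_log_sizes`, `hill_tail_exponent`, `DRIFT`, `VOL`) are hypothetical choices made for illustration.

```python
import math
import random


def simulate_log_sizes(n_cities, n_steps, drift, vol, floor=0.0, seed=0):
    """Simulate log city sizes as independent random walks reflected at `floor`."""
    rng = random.Random(seed)
    log_sizes = []
    for _ in range(n_cities):
        x = floor
        for _ in range(n_steps):
            # Gaussian step with negative drift; max() enforces the reflecting barrier.
            x = max(floor, x + rng.gauss(drift, vol))
        log_sizes.append(x)
    return log_sizes


def hill_tail_exponent(log_sizes, k):
    """Hill estimate of the Pareto tail exponent from the k largest observations."""
    top = sorted(log_sizes, reverse=True)[: k + 1]
    threshold = top[k]  # the (k+1)-th largest log size
    mean_excess = sum(v - threshold for v in top[:k]) / k
    return 1.0 / mean_excess


# Illustrative parameters (assumptions, not taken from the paper).
DRIFT, VOL = -0.04, 0.2
theoretical_exponent = -2.0 * DRIFT / VOL**2

log_sizes = simulate_log_sizes(n_cities=3000, n_steps=600, drift=DRIFT, vol=VOL)
estimated_exponent = hill_tail_exponent(log_sizes, k=300)

# City sizes in levels are exp(log size); the tail exponent applies to these.
largest_city = math.exp(max(log_sizes))

print(f"theoretical tail exponent: {theoretical_exponent:.2f}")
print(f"Hill estimate from simulation: {estimated_exponent:.2f}")
print(f"largest simulated city relative to the minimum size: {largest_city:.1f}")
```

The Hill estimate is noisy in a sample of this size, so it should only be expected to land near the theoretical value, not to match it exactly. Choosing `DRIFT = -VOL**2 / 2` would give a theoretical exponent of 1, the Zipf case, at the cost of slower convergence to the stationary distribution.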