The city size distribution in many countries is remarkably well described by a Pareto distribution. We derive conditions that standard urban models must satisfy in order to explain this regularity. We show that under general conditions urban models must have (i) a balanced growth path and (ii) a Pareto distribution for the underlying source of randomness. In particular, one of the following combinations can induce a Pareto distribution of city sizes: (i) preferences for different goods follow reflected random walks, and the elasticity of substitution between goods is 1; or (ii) total factor productivities of different goods follow reflected random walks, and increasing returns are equal across goods.
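As a rough illustration of why a reflected random walk can generate a Pareto distribution, the sketch below simulates many independent units whose log size follows a random walk with negative drift and a lower reflecting barrier. It is not the model derived in the paper. The Gaussian shocks, the choice to run the walk in logs, the parameter values, and the function names `simulate_reflected_log_walk` and `tail_exponent` are all assumptions made for this example. Under these assumptions the stationary distribution of log size has an exponential upper tail with rate 2|drift|/sigma^2, so size itself has a Pareto tail with that exponent.

```python
import numpy as np


def simulate_reflected_log_walk(n_cities=20000, n_steps=2000,
                                drift=-0.02, sigma=0.2, seed=0):
    """Simulate log sizes following a random walk reflected at 0.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    log_size = np.zeros(n_cities)  # every unit starts at the barrier
    for _ in range(n_steps):
        # Gaussian shock to log size (proportional shock to size).
        log_size += rng.normal(drift, sigma, size=n_cities)
        # Any walk that crossed below the barrier is pushed back onto it.
        np.maximum(log_size, 0.0, out=log_size)
    return np.exp(log_size)  # sizes, with minimum size normalized to 1


def tail_exponent(sizes, threshold):
    """Hill-type estimate of the Pareto exponent above `threshold`."""
    tail = sizes[sizes > threshold]
    return 1.0 / np.mean(np.log(tail / threshold))


sizes = simulate_reflected_log_walk()
zeta_hat = tail_exponent(sizes, threshold=np.e)
zeta_theory = 2 * 0.02 / 0.2 ** 2  # 2|drift| / sigma^2 for Gaussian shocks

print(f"estimated tail exponent:   {zeta_hat:.3f}")
print(f"theoretical tail exponent: {zeta_theory:.3f}")
```

With these parameter values the theoretical exponent is 1, the special case usually called Zipf's law. Changing `drift` or `sigma` changes the exponent accordingly.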