I was a skinny kid. I remember hearing over and over at the pediatrician’s office that for my age I was “in the 10th percentile” for weight and perhaps my parents ought to consider feeding me once in a while. Standardized athletic tests? Same thing. In the 10th percentile for speed, agility, strength, and probably attitude. After all, climbing a rope just to touch the gymnasium ceiling seemed a rather pointless activity to me, and I was definitely in the 95th percentile for verbalizing my discontent.

The percentile is an efficient way of providing a relative ranking that is easy to understand. In my case, for every 100 kids my age, 90 of them were heavier than me. In my defense, I did search for the other nine, but it was a windy day. The ranking presumes that there is a large sample of kids who were measured – the “universe” – against which any one child can be compared.

Site reports have long had the option of including a benchmark column, often national, against which a particular site can be compared. In the case of total population, the comparison is pretty much useless, as comparing the national population to that of a 10-minute drive time adds no context to the trade area population. Showing the national median income is more useful, but if the trade area has an income value $2,000 over the national median, what does it really tell us? Is that a significant difference? Impossible to tell, because the comparison lacks context.

What if we were able to give a percentile ranking for any variable against the “universe” of similarly sized sites nationwide? A 50th percentile value would indicate that a site was average (actually, the median). A site at the 90th percentile would mean that only one in ten locations nationwide scored higher on that variable. This conveys a great deal of information in a very efficient package. It also lends itself to simple graphical presentation that shows both the actual value and where that value falls within the universe of possible values.
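To make the idea concrete, here is a minimal sketch of a percentile ranking, assuming the simple "percent of the universe strictly below this value" definition. The function name and the income figures are invented for illustration; this is not how the percentile tables themselves are built.

```python
from bisect import bisect_left

def percentile_rank(value, universe):
    """Return the percent (0-100) of universe values strictly below `value`."""
    if not universe:
        raise ValueError("universe must contain at least one value")
    ordered = sorted(universe)
    # bisect_left gives the count of values strictly less than `value`.
    below = bisect_left(ordered, value)
    return 100.0 * below / len(ordered)

# Hypothetical median incomes for ten comparable sites (illustrative only).
universe = [38000, 42000, 47000, 51000, 55000,
            58000, 63000, 70000, 82000, 95000]

site_income = 72000
rank = percentile_rank(site_income, universe)
print(f"Site income of ${site_income:,} ranks at percentile {rank:.0f}")
```

With a universe of only ten sites the ranking is coarse; the point of using a very large universe is that the ranking becomes fine-grained enough to be meaningful.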

The universe of sites consists of the nearly 220,000 block group centroids nationwide, which represent the distribution of the population and of potential sites very effectively. For use in the Snapshot API, percentile tables have been built for 7,000 variables for radius sizes from 1 to 20 miles in one-mile increments. Any radius, polygon, or standard geographic area such as a ZIP code can be compared to areas of similar size nationwide, instantly. The comparison is not to an arbitrary benchmark, but to areas of roughly equivalent size – a 60 square mile drive time polygon is compared to the universe of 4-mile radius areas (each just over 50 square miles).
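One plausible way to pick the comparison radius for an arbitrary area is to find the circle with the same area and round its radius to the nearest whole mile. The sketch below assumes that rule; the actual matching logic is not described here, and the function name is hypothetical. It does reproduce the 60 square mile example.

```python
import math

def matching_radius(area_sq_miles, min_radius=1, max_radius=20):
    """Nearest whole-mile radius whose circle approximates the given area.

    Assumed rule: take the radius of the equal-area circle, round to the
    nearest mile, and clamp to the 1-20 mile range of the percentile tables.
    """
    if area_sq_miles <= 0:
        raise ValueError("area must be positive")
    equivalent = math.sqrt(area_sq_miles / math.pi)
    return max(min_radius, min(max_radius, round(equivalent)))

radius = matching_radius(60)          # 60 square mile drive time polygon
circle_area = math.pi * radius ** 2   # area of the matched radius
print(f"Compare against the {radius}-mile radius universe "
      f"({circle_area:.1f} square miles)")
```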

Site reports can specify individual variables to be output as percentiles using the PERCENTILE(<variable>) function, and as benchmarks by using the bench=percentile parameter.

So often, we hear “we only want top ten percent locations,” and using the percentile functions is an efficient way to find them.