Manufactured Assent: The Philosophical Gourmet Report’s Sampling Problem

The Philosophical Gourmet Report (PGR) purports to be a ranking of faculty reputation based on the opinions of “research-active” faculty, which it surely is. But the legitimacy of the PGR as an accurate measure of opinion depends in part on how well the PGR pool of evaluators reflects the population of research-active faculty. Addressing this point, Brian Leiter remarks (here) on the selection procedure for the PGR evaluator pool:

Evaluators were selected with an eye to balance, in terms of area, age and educational background—though since, in all cases, the opinions of research-active faculty were sought, there was, necessarily, a large number of alumni of the top programs represented. Approximately half those surveyed were philosophers who had filled out the surveys in previous years; the other half were nominated by members of the Advisory Board, who picked research-active faculty in their fields.

However, the PGR evaluator pool is not balanced with respect to educational background, and the claim that the educational imbalance observed in the 2011 PGR is a necessary consequence of soliciting the opinion of research-active faculty is false.

Now, to criticize the composition of the PGR evaluator pool is not to criticize the individual members making up the pool. The focus of the discussion at C&I (here, here, here, here, and even here; see also Andrew Gelman’s post here) has been on methodology. The point instead is this: if you are interested in an accurate picture of professional opinion but oversubscribe some parts of the profession and systematically omit other parts altogether, you might still end up with an accurate assessment. But knowing this much about your methods, there is no reason to believe that you will. That, in a nutshell, is the nature of the PGR sampling problem.

The PGR sampling problem is not breaking news. It has been a stubbornly durable feature of the survey since its inception, as Richard Heck pointed out long ago and Zachary Ernst has since elaborated, remarking (I paraphrase) that the snowball sampling method used by the PGR, while suitable for surveying pimps and dope dealers, is indefensible for academics practicing philosophy out in the open.
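To make the worry concrete, here is a toy simulation of snowball sampling. Everything in it is hypothetical: a balanced population of 1,000 philosophers across 20 programs, a made-up homophily rate governing nominations, and seeds drawn from just two programs. The point is only to exhibit the mechanism: when nominations favor one’s own network, the seed programs end up heavily overrepresented in the final sample.

```python
import random

random.seed(0)

# Hypothetical population: 1,000 philosophers spread evenly across 20
# PhD programs, so each program holds exactly 5% of the population.
population = [{"id": i, "phd": i % 20} for i in range(1000)]

def snowball_sample(pop, seeds, rounds, k, homophily=0.8):
    """Each round, every newly sampled member nominates k colleagues,
    preferring (with probability `homophily`) alumni of their own program."""
    sampled = set(seeds)
    frontier = list(seeds)
    for _ in range(rounds):
        nominations = []
        for i in frontier:
            same = [p["id"] for p in pop
                    if p["phd"] == pop[i]["phd"] and p["id"] != i]
            for _ in range(k):
                if random.random() < homophily:
                    nominations.append(random.choice(same))
                else:
                    nominations.append(random.randrange(len(pop)))
        frontier = [n for n in nominations if n not in sampled]
        sampled.update(frontier)
    return sampled

# Seed the snowball with alumni of just 2 of the 20 programs.
seeds = [p["id"] for p in population if p["phd"] in (0, 1)][:10]
sample = snowball_sample(population, seeds, rounds=3, k=2)

# The two seed programs hold 10% of the population, but far more of the sample.
seed_share = sum(population[i]["phd"] in (0, 1) for i in sample) / len(sample)
print(f"seed-program share of sample: {seed_share:.0%}")
```

With homophily this strong, the seed programs’ share of the sample lands far above their 10% share of the population, which is precisely the complaint: the method is fine for reaching hidden populations, but it manufactures concentration when applied to an open one.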

What might be new is a data set for the 2011 PGR rater pool, which you can inspect for yourself, and a visualization, presented below, of the degree of educational background bias within that pool.


We start with Kieran Healy’s excellent series on the 2006 PGR data set, which provides several insights into how to think about the PGR data. In those posts he is careful to signal that his remarks are descriptive (i.e., based on the properties of the rater pool) rather than inferential (i.e., representative of corresponding properties of the profession), but he does consider some objections to using the PGR to draw such inferences, and the bulk of the posts are taken up with investigating whether the data set can rule those objections out. In some cases, the data does just that. Although it would be better still to move to an open data model for the PGR, these posts are the next best thing.

In his Ratings and Specialties post, Professor Healy looks at various ways to measure “cross-field” consensus. In one exercise he reconstructs the PGR ranking from the point of view of each specialty. What would the PGR ranking of departments look like if just the Ethicists were in charge? What would it look like if just the Philosophers of Language were? The Kantians? And so on. Those rankings were then aggregated to see how much variation there is across the specialties, which gives a sense of how much (or how little) consensus there is across the field. The box and whisker plots (top 25: png, pdf; total population: png, pdf) give a picture of this. Except for the top 6 departments, and a handful rounding out the bottom, the answer is that the rankings vary quite a lot: people vote according to whom they recognize, and invariably those are the people working in their area(s) of specialization. The flip side of this appears to be a version of the closed world assumption: if a rater hasn’t heard of you, then you probably aren’t worth hearing from. This was my favorite of Healy’s posts, although I suspect that the observed volatility is attenuated by the unrepresentative composition of the evaluator pool. So, let’s turn to that.
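The reconstruct-and-aggregate exercise is easy to sketch. The table below is entirely invented (four departments, three specialties, made-up mean scores; none of it is Healy’s data). It just shows how each specialty induces its own ranking, and how the spread of a department’s rank across specialties measures cross-field consensus: the top department sits still while the rest shuffle.

```python
# Hypothetical mean scores: rows are specialties, columns are departments,
# standing in for per-rater PGR scores aggregated within each specialty.
scores = {
    "Ethics":   {"A": 4.8, "B": 4.1, "C": 3.0, "D": 2.2},
    "Language": {"A": 4.7, "B": 2.5, "C": 4.0, "D": 3.1},
    "Kant":     {"A": 4.9, "B": 3.2, "C": 2.1, "D": 4.0},
}

def ranks(by_dept):
    """Rank departments 1..n by descending mean score within one specialty."""
    ordered = sorted(by_dept, key=by_dept.get, reverse=True)
    return {dept: i + 1 for i, dept in enumerate(ordered)}

# The ranking each specialty would produce if it alone were in charge.
per_specialty = {s: ranks(d) for s, d in scores.items()}

# Spread of each department's rank across specialties: a large spread
# means little cross-field consensus about where that department belongs.
for dept in "ABCD":
    rs = [per_specialty[s][dept] for s in scores]
    print(dept, rs, "range:", max(rs) - min(rs))
```

In this contrived example department A is ranked first by every specialty (range 0) while B, C, and D bounce around (range 2), which mirrors the pattern in the box-and-whisker plots: stability at the very top, volatility below it.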

Healy considers whether voting patterns in the 2006 data are correlated with the “social location” of the evaluators, that is, their place of employment or “Home” institution. But another question is whether the voting patterns are correlated with the “imprinting location” of raters (i.e., where they earned their PhDs), once the cross-specialty volatility is controlled for. Judging from some reverse engineering of the data that is available, it would be surprising if PhD institution and voting patterns did not turn out to be correlated. In short, while it may be true that where you stand does not depend on where you sit, it still may be that where you stand depends on from whom you learned to stand on your own.

This point can be illustrated by comparing two graphs. The first graph, with the blue ink blots, overlays the distribution of the Home institutions in the 2011 PGR pool on Healy’s PCA of the 2006 data set.[1] In Healy’s terms, this graph overlays the distribution of “social locations” (but for the 2011 data set) on his heuristic for viewing how departments and subfields differ from one another (based on his 2006 analysis).

(Click to enlarge, or view in full size: pdf and png)

This picture is consistent with a happy view of the PGR, for it paints the picture of a wide and reasonably diverse sample of the profession. Even the centers of concentration of the raters lie outside the center of the plot, which further contributes to a sense of balance. I suspect that something like this is what some have in mind when they counter that the PGR simply tells it like it is, like it or not, and is what bucks up their confidence to label dissenters as idiosyncratic cranks on the fringe of the field longing to be accepted by the mainstream. Although that response is more telling of the critic than of those criticized, one can be forgiven for thinking that there is some truth behind the bluster.

But that case starts to crumble when one takes a look instead at where the raters earned their PhDs.  For here we see an unusually high concentration around a small cluster of universities, and this fact runs counter to the claim that the evaluator pool is “educationally balanced”.

(Full size: pdf and png)

Here we see not only a high concentration around a small number of imprinter institutions, but also a cluster around the center of the PCA. This invites several questions, most of which are unanswerable without open PGR data sets. One immediate question in this context is the degree to which the oversubscription of a handful of ‘imprinter institutions’ is driving the PCA and the clustering of green ink blots around the center, as this seems to be evidence for an effect from the educational imbalance in the evaluator pool.[3]
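One way to probe that question, if the data were open, would be to reweight raters so that each PhD institution contributes equally to the aggregate, and then see how far the numbers move. The sketch below uses invented data throughout: 30 raters, 5 departments, a single dominant “BigU” program supplying two-thirds of the raters, and a stipulated one-point bloc effect for its alumni.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(1)

# Hypothetical data: 30 raters score 5 departments on a 0-5 scale.
# Twenty raters are alumni of one "imprinter" program (BigU) and, by
# construction, rate department 0 a full point higher than everyone else.
phd = ["BigU"] * 20 + [f"U{i}" for i in range(10)]
scores = rng.normal(3.0, 0.3, size=(30, 5))
scores[:20, 0] += 1.0  # the stipulated BigU bloc effect

unweighted = scores.mean(axis=0)

# Reweight so that each PhD institution contributes equally overall:
# a rater's weight is the reciprocal of their program's headcount.
counts = Counter(phd)
w = np.array([1.0 / counts[p] for p in phd])
weighted = (w[:, None] * scores).sum(axis=0) / w.sum()

print("dept 0 mean, unweighted vs reweighted:",
      round(float(unweighted[0]), 2), round(float(weighted[0]), 2))
```

Under these made-up assumptions, department 0’s unweighted mean is inflated by roughly half a point relative to the institution-balanced mean, while the other departments barely move. Whether anything like this happens in the real 2011 data is exactly what an open data set would let us check.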

To put a finer point on this, there are 299 [4] evaluators and 126 universities represented in the 2011 PGR rater pool: 113 individual Home institutions (blue) and 58 PhD institutions (green). But of the 736 individual rankings submitted for the 33 areas of specialization covered by the PGR, nearly half (48.5%) were submitted by alumni of just 8 universities.
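For concreteness, here is how such a concentration figure can be computed from a tally of submitted rankings keyed by the submitter’s PhD institution. The counts below are fabricated to mimic the reported totals (736 rankings with a 48.5% top-8 share); they are not the actual PGR tallies, whose release would make this calculation checkable.

```python
from collections import Counter

# Fabricated tallies: number of specialty rankings submitted, keyed by the
# submitter's PhD institution. Eight programs account for 357 of 736.
counts = Counter({f"Top{i}": c for i, c in
                  enumerate([60, 55, 50, 48, 42, 38, 34, 30])})
counts.update({f"Other{i}": 8 if i < 29 else 7 for i in range(50)})

def top_k_share(tallies, k):
    """Fraction of all submitted rankings contributed by the k most
    heavily represented PhD institutions."""
    top = sum(n for _, n in tallies.most_common(k))
    return top / sum(tallies.values())

print(f"{sum(counts.values())} rankings; "
      f"top-8 share: {top_k_share(counts, 8):.1%}")
```

The same three-line `top_k_share` function, run against the genuine per-ranking tallies, is all it would take for anyone to reproduce or refute the 48.5% figure.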

Now, you might maintain that a hard-nosed look at the research-active faculty will bear out that concentration, and that it will also bear out the 8 institutions in that set. But there are gaps in that set and holes in that argument—holes that appear even by the PGR’s own measure.  Exploring that point is next.

– Gregory Wheeler



[1]  Healy’s caveats apply here, as well as the additional warning that I am overlaying the 2011 rater pool with the departmental positions determined by the 2006 data. Leiter reports that the PGR is remarkably stable, so this shouldn’t be too far off from a new PCA constructed from the 2011 data.

[2]  I have taken the subfield vectors out of my graphs to make them less cluttered. See Healy’s original (png or pdf), or email me if you would like to see the version with these vectors back in.

[3] Note that evaluators are prohibited from directly ranking both their home institution and their PhD institution, but this does not eliminate the effects of an unbalanced evaluator pool.

[4] There are some discrepancies on the PGR website between the main list of evaluators and the evaluators listed for the individual specialties. The total number of evaluators that I can account for in the specialty rankings is 299, but Leiter’s list of “nearly 270 evaluators who completed the overall faculty quality survey” numbers 302, and there is one missing. Professor Leiter was contacted about discrepancies on the PGR website but did not reply.

