The PhilPapers Categorization Project

A central aspect of PhilPapers is its categorization system, which sorts papers into hierarchical categories. For this purpose, we have developed an extensive, if preliminary, taxonomy of philosophical areas, along with a number of tools for working with the categorization system. Details on these matters follow below.

The PhilPapers Taxonomy

The PhilPapers taxonomy extends the MindPapers taxonomy of areas in the philosophy of mind to all of philosophy. You can view the whole PhilPapers taxonomy either at the dynamically-generated overview or at the (quicker, but possibly out of date) static overview.

The taxonomy involves a five-level hierarchical system. A category at a given level will typically have 4-10 subcategories at the next level. A given subcategory can have more than one "parent" category, although one will always be designated as the "primary" parent (in the overviews above, nonprimary listings of a subcategory are designated with a "*"). The system culminates in 2000+ "leaf" categories (which are often but not always at level 5). Every paper is ultimately to be categorized under 1-3 leaf categories.
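The structure described above can be sketched in code. What follows is a minimal illustration in Python, not the PhilPapers implementation: the class Category, its fields, the helper check_assignment, and the placeholder names "Another Area" and "Example Leaf" are all hypothetical. It also assumes that a top-level category counts as level 1.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Category:
    """One node in a hypothetical in-memory model of the taxonomy."""
    name: str
    primary_parent: Optional["Category"] = None
    other_parents: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def add_child(self, child, primary=True):
        """Attach a subcategory; at most one parent is recorded as primary."""
        if primary:
            if child.primary_parent is not None:
                raise ValueError(f"{child.name} already has a primary parent")
            child.primary_parent = self
        else:
            child.other_parents.append(self)
        self.children.append(child)

    def is_leaf(self):
        """A leaf category has no subcategories."""
        return not self.children

    def level(self):
        """Depth along the primary ancestry (assumed: top level is 1)."""
        depth, node = 1, self
        while node.primary_parent is not None:
            depth += 1
            node = node.primary_parent
        return depth


def check_assignment(categories):
    """True if a paper sits under 1-3 categories, all of them leaves."""
    return 1 <= len(categories) <= 3 and all(c.is_leaf() for c in categories)


# Placeholder data: only "M&E" and "Philosophy of Mind" come from the text.
cluster = Category("M&E")
mind = Category("Philosophy of Mind")
other = Category("Another Area")
leaf = Category("Example Leaf")
cluster.add_child(mind)
cluster.add_child(other)
mind.add_child(leaf)                  # primary parent
other.add_child(leaf, primary=False)  # nonprimary listing (the "*" case)
```

Here leaf.level() follows only the primary ancestry, which matches the idea that the five-level limit applies to the primary structure rather than to nonprimary paths.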

The levels are as follows:

Because papers can be categorized under more than one category, there is a certain amount of cross-classification among these categories. For example, papers in the history of philosophy will often fall under both a historical category and a topical category.

Many disclaimers should immediately be made. First, the current taxonomy is extremely preliminary, and is much better developed in some areas than in others. Unsurprisingly, philosophy of mind is the best developed, as the category system here has been refined by the experience of categorizing many papers over a number of years. Some other areas of M&E are reasonably well-developed, while a number of areas of Value Theory and Science/Logic/Mathematics are very patchy. Under History, we have only attempted some obvious coarse subdivisions for now, along with subcategories for a small number of historical philosophers (largely determined mechanically, by selecting those with more than a certain number of entries in the PhilPapers database). With a couple of exceptions, we haven't tried to taxonomize the various subareas of "Other Philosophical Traditions" more than minimally. Suggestions for further development in all clusters are welcome -- see below.

Second, there are many ways to go about compiling a taxonomy, and ours is just one approach. We make no claim to be producing a principled, definitive taxonomy of philosophy. Our taxonomy has been largely driven by pragmatic concerns: what taxonomy will be most useful and usable for users of PhilPapers? As such, we have tended to go with categories that are reasonably standard within the field when they exist. Our categories mix ontologically distinct kinds such as theories, questions, phenomena, and so on. We have typically been constrained by having a reasonable number of subcategories under each category, and especially at lower levels, categories are driven by where the available papers in that category happen to fall. Many fairly arbitrary decisions have had to be made.

Third, it should also be noted that this is a categorization by analytic philosophers, assuming something like the perspective of analytic philosophy. A taxonomy by philosophers from other traditions would no doubt look very different. Still, the system is intended to be open to work in those areas, whether by categorization under relevant subareas of "Philosophical Traditions" (which is intended to cover regional and non-analytic traditions), or under relevant topical or historical areas. For example, work in Asian philosophy of mind might well be placed in subcategories of both Asian Philosophy and of Philosophy of Mind. We have included Continental Philosophy under "Philosophical Traditions", but we note that much work in continental philosophy will also fall under a relevant historical and/or topical category.

The existing system has largely been developed by (i) exploiting our own sense of categories within philosophy, (ii) consulting reference works (the Stanford Encyclopedia of Philosophy has been especially useful), (iii) consulting experts in given areas, and (iv) posting an initial call for feedback online at Fragments of Consciousness (the discussion thread there also contains some relevant methodological discussion).

Refinement of the category system is an ongoing project, and feedback is more than welcome. We expect that as users attempt to categorize papers, all sorts of gaps and imperfections in the system will become evident. Suggestions for taxonomizing as-yet undivided areas are also welcome. All suggestions should be posted in the PhilPapers Categorization Project discussion forum.

In posting suggestions, please keep in mind our constraints: we'd like to stick to 4-10 subcategories per category where possible (occasionally more is OK when unavoidable, or fewer in some cases when the subcategories are leaves). Leaf categories should typically be of a specificity such that they'll eventually have 15-100 entries. For now, we are constrained to five levels in the primary structure (though some entries are more than five levels deep if one follows nonprimary ancestry). We're unlikely to change our categorization methodology wholesale at this point, and the cluster/area structure is reasonably well-set, but things below that level are still very much open to improvement.

Categorization Tools

Categorization of papers and books within PhilPapers is an ongoing project. As of the launch of PhilPapers, around half of the entries have been partially categorized using automatic classification tools (see below), around 18000 have categories inherited from MindPapers, and a handful have been manually categorized while testing PhilPapers. A large part of the categorization project is to be driven by users, using three manual categorization tools.

The fine-grained categorization tool. This tool enables fine-grained categorization of any entry. It is available by pressing "categorize" under an entry, if you are signed in. Using this tool you can classify an entry in up to three fine-grained categories. You can find a category either by using the search box or by proceeding through the hierarchy by opening folders in turn. You can repeat this process for up to three categories, clicking on a category to add it to a paper, and clicking the red mark next to a category to remove it. You can also categorize multiple entries simultaneously by choosing multiple-entry mode. This mode is especially useful for populating categories quickly by using search tools.

The iterative categorization tool. This tool enables quick categorization of entries into immediate subcategories, allowing further subcategorization by people with expertise in those areas. It is available in the area pages under the "Browse by area" menu. Here, the "Uncategorized Material" page contains entries that have not yet been categorized at all, while area pages for non-leaf categories contain a list of entries in that category that have not yet been categorized under a leaf category. Each entry is followed by a set of links for classifying the entry under a lower-level category (two levels lower for uncategorized material, one level lower for other nonleaf areas). Clicking on a link will place the entry under the relevant subcategory. You can repeat this for up to three subcategories, then click "remove" (on a category page) or "done with this one" (on the uncategorized material page).

The direct categorization tool. This tool enables users to add papers to a category directly, whether or not an entry for that paper is currently displayed. It is available in a box at the top of every category page. Simply enter the author's surname and the first few words of the title into the box. If the paper is in the PhilPapers database, it will appear, and you can select it to add it directly to the category.

The three tools are complementary. The fine-grained tool is the most powerful but slower to use. The iterative tool is less powerful, because it performs only coarse-grained categorization, but it is quicker and easy to use for repeated categorization. The direct categorization tool provides more flexible coverage of papers. We hope that the presence of all three tools will enable faster progress on the categorization project than would be possible with any of them alone.

We encourage users to use these tools. Please use them only if you have relevant expertise: typically a Ph.D. in philosophy or graduate work in a relevant area. If you do have this expertise, categorization of as many papers as possible, especially within your areas of expertise, will be much appreciated! This process will make the category system much more useful and comprehensive.

Of course, users will sometimes have different ideas about categorization. If you see what you think is a mistake in categorization, feel free to undo it (though you should examine the paper in question first) and replace it with a more appropriate category. Cases like this will be flagged for the editors' attention and we will eventually adjudicate.

Automatic Categorization

At the moment, PhilPapers uses a limited amount of automatic categorization. First, many journals are associated with a specific area, and every paper in that journal is filed under that area. Second, books are frequently filed under a category corresponding to their Library of Congress call number. Third, we have some automatic filters for classifying entries under areas according to the occurrence of certain words in their titles. All of these processes are imperfect. Entries are most frequently assigned to nonleaf categories, so that they will need to be further assigned to leaf categories. Often an entry will be assigned a single category automatically but will also belong under further categories that need to be assigned manually. In some cases, entries will be miscategorized entirely. Users are encouraged to look out for these imperfections and to correct them by manual categorization. We plan to eventually add more sophisticated automatic categorization tools. These tools will probably require a database of already-classified items to serve as a training set, however, so manual categorization will play a vital role in any case.
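The three rules above (journal, call number, title words) can be illustrated with a short sketch. This is a minimal Python illustration under stated assumptions, not the filters PhilPapers actually runs: the function auto_categorize, the entry fields, and every mapping in the lookup tables are hypothetical.

```python
import re

# Hypothetical lookup tables; the real mappings are not given in the text.
JOURNAL_AREAS = {"Example Journal of Mind": "Philosophy of Mind"}
CALL_NUMBER_AREAS = {"BJ": "Normative Ethics"}  # call-number letters -> area
TITLE_KEYWORDS = {"consciousness": "Philosophy of Mind"}


def auto_categorize(entry):
    """Return the set of areas suggested for an entry by the three rules."""
    suggested = set()

    # Rule 1: the journal is associated with a specific area.
    journal = entry.get("journal")
    if journal in JOURNAL_AREAS:
        suggested.add(JOURNAL_AREAS[journal])

    # Rule 2: the leading letters of a book's call number map to an area.
    call_number = entry.get("call_number") or ""
    match = re.match(r"[A-Z]+", call_number)
    if match and match.group() in CALL_NUMBER_AREAS:
        suggested.add(CALL_NUMBER_AREAS[match.group()])

    # Rule 3: certain words occurring in the title map to an area.
    words = set(re.findall(r"[a-z]+", (entry.get("title") or "").lower()))
    for keyword, area in TITLE_KEYWORDS.items():
        if keyword in words:
            suggested.add(area)

    return suggested
```

An entry that matches none of the rules gets an empty set, which corresponds to material left for manual categorization. The suggested areas are typically nonleaf, so an entry would still need to be assigned to leaf categories by hand.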

The Use of Categories

Categories are used at a number of places on PhilPapers.

First, users have the option to automatically display the categories currently associated with a given entry, by checking the "Display categories" box in the right column of most pages containing entries.

Second, users can browse categories by using the "Browse by area" menu. The menu itself leads to pages for clusters or areas (for now, putting the full category system in the menu is impractical due to memory usage and speed). The page for a nonleaf category displays the subcategories of that category in the left column, with an item count for each (either [n] or [n/m], where n is the number of items under that category, and m is the number of items in that category that await further subcategorization). Deeper subcategories can be opened by pressing "+". Clicking on a subcategory will take you to the page for that subcategory.
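The item counts can be illustrated with a small sketch. It is a Python illustration only, and it rests on several assumptions that the text does not confirm: categories are modeled as nested dictionaries, n counts each distinct paper once across all subcategories, papers filed directly on a nonleaf category are the ones awaiting subcategorization, and the plain [n] form is shown when m is zero. The function names are hypothetical.

```python
def papers_under(category):
    """Collect the distinct paper ids filed anywhere under a category."""
    ids = set(category.get("papers", ()))
    for sub in category.get("subcategories", ()):
        ids |= papers_under(sub)
    return ids


def count_label(category):
    """Format the count as [n], or [n/m] when m papers await subcategorization."""
    n = len(papers_under(category))
    is_nonleaf = bool(category.get("subcategories"))
    # Assumption: papers sitting directly on a nonleaf category are pending.
    m = len(set(category.get("papers", ()))) if is_nonleaf else 0
    return f"[{n}/{m}]" if m else f"[{n}]"


# Hypothetical data: "p3" is cross-classified under both subcategories.
example = {
    "papers": ["p1"],
    "subcategories": [
        {"papers": ["p2", "p3"]},
        {"papers": ["p3"]},
    ],
}
```

Using sets of paper ids keeps a cross-classified paper from being counted twice in n.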

For every category, the right-hand column will contain a list of papers under that category. For nonleaf categories, these will be papers awaiting further subcategorization. For leaf categories, these will be all the papers falling under that category. Our hope is that these lists will eventually constitute comprehensive bibliographies for all sorts of areas of philosophy.

In addition, the page for every category contains links to the discussion forum for the area associated with that category, to user-contributed bibliographies in that area, and to a list of users (with publicly available profiles) who have listed that area as an area of interest.

Third, every area (such as philosophy of mind, normative ethics, and so on) has an associated discussion forum, available via the "Forums" page. This forum contains discussions of papers that fall under that area, initiated via the "Discuss" link under a paper (note that when the areas associated with a paper change, the associated discussion forums will change correspondingly). The forum also contains other discussions relevant to that area, initiated via the "Forums" page. There are also aggregated forums for each cluster (produced by aggregating the area forums), and for all clusters at once.

Fourth, every user can choose up to ten areas as their areas of interest. At the moment, users who choose such areas can (i) optionally filter any list of papers using those areas, (ii) optionally receive e-mail alerts for new items in those areas, (iii) be listed on the page of users associated with that area, and (iv) receive information about forums in those areas on their profile page.

Once again, all feedback regarding the category system is welcome at the PhilPapers Categorization Project discussion forum.