part of the Big Dynamic Network Data (BigDND) project by Professors Erik Demaine of MIT and Mohammad T. Hajiaghayi of the University of Maryland
Disclaimer: If you find the ranking on this website offensive, please ignore it. This data is not official at this point and may still contain errors.
News: As a followup to our work, Emery Berger has created an interactive general CS ranking that is open source.
We feel that the methods used by U.S. News (and other similar rankings) to rank CS departments in theoretical computer science lack transparency and well-defined measures. Over the past several months we have developed a ranking of the top 50 U.S. universities based on a hard, measurable method. To make this possible, we gathered information about universities from various resources, as described below.
Our ranking method is very simple (to avoid complications from combining different measures) but robust: whether we order by the measure Rank 1 + ½ Rank 2 or by the measure of just Rank 1 CS theory conference publications, the rankings vary only slightly. More precisely, for each paper in a Rank 1 CS theory conference that has a faculty member from a particular CS department as an author, we give that department a score of 1 under both measures. In addition, for each paper in a Rank 2 CS theory conference, we give a score of ½ under the Rank 1 + ½ Rank 2 measure. We then rank the departments by each of these two measures, as given in the table below. (Choose your preferred measure from the tabs at the bottom.) Note that a paper counts only once for an institution even if it has multiple authors from that institution; this is not the same as counting papers for each author separately and then summing.
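For concreteness, here is a minimal Python sketch of this scoring. The paper records and institution names below are made-up placeholders for illustration, not our actual DBLP or faculty data.

from collections import defaultdict

# Each paper: (conference rank, list of institutions of its faculty authors).
papers = [
    (1, ["MIT"]),                                # Rank 1 paper, one faculty author
    (1, ["MIT", "U. of Maryland"]),              # Rank 1 paper with faculty from two departments
    (2, ["U. of Maryland", "U. of Maryland"]),   # Rank 2 paper, two faculty from the same department
]

rank1_only = defaultdict(float)        # measure: Rank 1 only
rank1_half_rank2 = defaultdict(float)  # measure: Rank 1 + 1/2 Rank 2

for rank, institutions in papers:
    # A paper counts once per institution, even with several co-authors there.
    for dept in set(institutions):
        if rank == 1:
            rank1_only[dept] += 1
            rank1_half_rank2[dept] += 1
        elif rank == 2:
            rank1_half_rank2[dept] += 0.5

# Rank departments by each measure, highest score first.
print(sorted(rank1_half_rank2.items(), key=lambda kv: kv[1], reverse=True))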
It is worth mentioning that, although we tried our best to minimize errors, as with any large dataset we cannot guarantee that the data is 100% accurate, especially because of several name mismatches and conferences changing names over the years. Also note that, because the DBLP data is updated frequently, as is the list of faculty associated with each department, the ranking may change with each run. (The current ranking is based on data from July 2014.) Please send any comments to csdepartmentranking [at] gmail [dot] com. We are collecting all comments and possible errors for this beta version of the ranking and hope to fix them in the next release. Please scroll to the right to see all conferences.
Q: By not normalizing by the number of faculty, the ranking favors departments with more faculty in an area. Large TCS groups will produce more papers. An important but missing measure is papers per faculty member.
A: Indeed, we considered this measure as well, but we are not releasing that data for now. One reason is that the faculty data from Brown is not error-free, so the number of faculty for some departments is higher than the actual number (e.g., by counting lecturers, adjuncts, etc.). The research field listed for each faculty member is also not very precise. For the data presented, this is not really a problem, since these extra people have few if any publications and thus do not change the ranking; but if we divide by faculty size, they become a source of error. Also, in the end, if a department is large, it is often a stronger department as well. Hopefully, once we have more precise information on faculty size, we can add this measure as well.
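For those curious, a hypothetical sketch of what the papers-per-faculty measure would look like is below; the department scores and headcounts are illustrative placeholders only, since, as noted above, our current faculty counts are too noisy to release.

dept_scores = {"Dept A": 60.0, "Dept B": 45.0, "Dept C": 44.0}  # illustrative Rank 1 + 1/2 Rank 2 scores
faculty_counts = {"Dept A": 12, "Dept B": 6, "Dept C": 8}       # headcounts are noisy in practice

# Normalizing by (possibly overcounted) faculty size can reorder departments.
per_capita = {d: dept_scores[d] / faculty_counts[d] for d in dept_scores}
print(sorted(per_capita.items(), key=lambda kv: kv[1], reverse=True))
# [('Dept B', 7.5), ('Dept C', 5.5), ('Dept A', 5.0)]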
Q: This measures a "lifetime" ranking for the faculty. Productivity in recent years seems like a better measure.
A: Indeed, we plan to release a version of the data where the user can change their notion of "recent", and get the corresponding ranking. We have not released such a ranking yet because it seems difficult (and debatable) to choose any one cutoff for "recent", but making it a user parameter would resolve this issue.
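To sketch what that user parameter might look like: assuming each paper record also carries its publication year (the field names below are hypothetical), the same scoring as above can simply be run on a year-filtered list.

def recent_papers(papers, min_year):
    # Keep only papers published in or after the user-chosen cutoff year.
    return [p for p in papers if p["year"] >= min_year]

papers = [
    {"year": 1998, "rank": 1, "institutions": ["MIT"]},
    {"year": 2013, "rank": 1, "institutions": ["U. of Maryland"]},
]
print(recent_papers(papers, min_year=2009))  # only the 2013 paper remains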
Q: The tier 1/tier 2 classification of the conferences seems arbitrary to me. A possible approach to avoiding this would be to let the user select the conferences which are used to compute the ranking. In this way you would not marry yourself to any particular classification (and, most likely, the results would not change much).
A: Good idea; we plan to release a customizable version like that in the future. For those excited to experiment now, the data in the tables is enough to choose your own weighting or conference set. Observe that, already, adding the Rank 2 conferences (at weight ½) barely affects the overall ranking, so the classification does not seem too significant. It would be good to add data for some more conferences, e.g., for more specialized domains, that are not presently on the list. For now, we wanted to rely on an externally chosen (and reasonable) list of conferences.
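As a starting point for such experiments, here is a small sketch of the do-it-yourself weighting: given per-conference paper counts for each department (as in the tables), pick your own conference set and weights. The counts below are purely illustrative, not taken from our tables.

per_conference_counts = {
    "Dept A": {"STOC": 30, "FOCS": 28, "SODA": 40, "ESA": 12},
    "Dept B": {"STOC": 12, "FOCS": 10, "SODA": 25, "ESA": 18},
}
weights = {"STOC": 1.0, "FOCS": 1.0, "SODA": 1.0, "ESA": 0.5}  # choose your own; omit a conference to drop it

scores = {
    dept: sum(weights.get(conf, 0.0) * n for conf, n in counts.items())
    for dept, counts in per_conference_counts.items()
}
print(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))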
Q: This ranking perpetuates the trend in (theoretical) computer science to publish only in conferences. Did you try to look at journals as well?
A: We agree on the importance of journals and that our measure so far is incomplete (like any measure). We are thinking about including journals, which could be weighted by journal impact factors. For now, we stuck with the relatively few important conferences in our field, but hope to expand in the future.
Q: While your approach is objective in the sense that it is formula-driven, I note that your proposed approach is based on the questionable assumption that volume of papers equates with quality (of both the paper and thus the program). Even at the set of top conferences, not all papers presented have similar impact.
A: Certainly that is an issue with paper counting. Our expectation is that the variation of weights within a single conference is dwarfed by the large number of papers we are summing over, as in a Chernoff bound. We certainly do not claim that this is the best measure, but we feel it is a reasonable one and better than the U.S. News approach.
Q: Why is number of publications a good measure? Number of citations would be better, no? There are also people who don't publish, but have excellent students.
A: We certainly do not claim that publication record is the best measure, or a complete measure. However, it is the record that everyone sees and compares in CVs, for jobs, grants, reports, etc. The data is also readily available and clearly defined. Citations would be interesting for measuring influence (hence the h-index measure, for example), and are definitely something we should consider, but citations also have issues with weighting (more popular fields will have more citations), and it is more difficult to get complete data.
Q: Does a publication from 2000 count toward the institution where the author was a faculty member in 2000, or where they are now?
A: It counts toward the institution where the author is currently (according to the Brown study) a faculty member. As a result, if you hire the best people in the world, your ranking goes up immediately. On the other hand, if you lose all your good people, your ranking immediately drops to the bottom. We feel that this makes sense.
Q: How often is your data updated?
A: Our ranking can be very dynamic. DBLP data is updated almost every day, and the faculty data is updated every few months. Thus we plan to update the ranking every day or every month. (This is not yet implemented, though.)
Q: Why not sum up the number of publications of faculty from a department for each conference, instead of counting the number of papers in each conference that have a faculty member from that department?
A: If we summed up the number of publications of each faculty member for each conference, then a single SODA paper written by 8 faculty members from the same department would be counted 8 times, which does not seem right to us.
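A tiny illustration of the difference, with made-up department names:

paper_author_departments = ["Dept A"] * 8  # one paper, 8 faculty co-authors from the same department

per_author_count = len(paper_author_departments)           # 8: what summing per author would give
per_institution_count = len(set(paper_author_departments))  # 1: what our measure gives
print(per_author_count, per_institution_count)  # 8 1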
Q: The other real concern I have is with places and people whose students publish papers without the advisor's name on them; this is not accounted for in this measure.
A: Yes, such papers are not considered in this ranking. First, we do not have the data. Second, there are often some faculty involved, and papers with no faculty co-author are not very common. Third, students move (unlike faculty, who stay for several years if not a lifetime), and once they move to another university as faculty, we will count them. Indeed, the same measure of counting only faculty publications has been used by others as well; see here.
© Copyright 2014. All rights reserved.