Should the LSAT (and Legal Testing More Generally) Be Revisited and Perhaps Retooled?
|By VIKRAM DAVID AMAR|
|Friday, Mar. 18, 2005|
Last weekend, some 300,000 high school students across the country took the newly revamped SAT I Test (also known as the SAT Reasoning Test). The changes to the test included, among others, the addition of an essay-writing section, the incorporation of more advanced math questions, and the deletion of the dreaded "analogy" queries.
These changes were driven primarily by the University of California undergraduate campuses. A few years ago, UC had threatened to stop using the SAT as an admissions criterion altogether unless the test were reworked to better measure the skills most relevant to college academic success.
Now that standardized test makers and test takers are breathing a sigh of relief -- having learned that there is life after SAT revision -- the time is ripe to discuss whether the SAT's legal sibling, the LSAT (Law School Admission Test), ought also to be reconsidered. This may even cause us to ask whether legal testing more generally (including law school final exams and the bar examination) should be looked at more carefully and critically.
In a series of columns beginning with this one, I attempt to contribute a bit to such a discussion.
Recent Commentary on the LSAT: It Seems That Speed Counts
One good starting point is a recent article published by Indiana University Law School Professor Bill Henderson on the way the LSAT works. Professor Henderson begins his analysis with a hypothetical, a variant of which I shall offer here: Suppose two law school applicants, Alan and Beth, take the LSAT at the same testing center on the same day. Each of them plows through the 101 questions contained in the 4 graded sections of the LSAT and tries diligently to fill in the bubbles on the multiple-choice scantron answer sheet that will be credited as "correct."
Let's begin with Alan. In general, he is able to spend as much time on each of the 101 questions as he needs to feel confident that he has picked the best or "right" answer. There are, to be sure, a few questions to which he would like to go back and devote additional attention if he had a bit more time. But overall he feels that he has been able to understand, unravel and "solve" each of the queries in the 4 sections. When his exam is graded, we find out that Alan got 81 correct responses - he missed 20 questions, even though he felt he had picked the best response to each of them.
Beth, by contrast, is unable even to read through -- let alone think at all carefully about -- the last 6 or so questions in each of the 4 graded sections. That is, for the last 6 questions in each section, Beth has completely run out of time and thus has had to fill in the scantron bubbles completely at random. (In the LSAT, there is no penalty for guessing -- one's total score is determined simply by the number of correct responses; the number of incorrect responses is not subtracted from the number of correct responses or factored into the formula in any other way.) So, in effect, Beth has "answered" 77 of the 101 questions, and has completely "guessed" at the other 24. But as to the 77 questions she answered, Beth feels pretty confident.
When Beth's test is scored, we find out that of the 77 questions she had time to read and analyze enough to make her comfortable, she got 76 correct -- a remarkably high rate! But as to the 24 questions on which she had to guess completely, she got only 5 correct -- roughly what chance would predict, since each question has 5 answer options (24 divided by 5 is about 5). That brings Beth's total number of correct answers to 81 (76 plus 5) -- the same number achieved by Alan.
Alan and Beth will thus both receive the same LSAT score -- their 81 "raw" score will likely translate to a 164 "scaled" score. ("Scaled" scores are the ones used by law schools and other institutions for purposes of comparing individuals.)
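The arithmetic behind Alan's and Beth's identical results can be sketched in a few lines of Python. (The 81-to-164 raw-to-scaled conversion varies from one LSAT administration to another; nothing below should be taken as the official LSAC formula.)

```python
# Sketch of the Alan/Beth scoring arithmetic described above.
TOTAL_QUESTIONS = 101
ANSWER_OPTIONS = 5  # five choices per question; no guessing penalty

def raw_score(answered_correct: int, guessed: int) -> float:
    """Correct answers plus the expected yield of blind guessing."""
    expected_from_guessing = guessed / ANSWER_OPTIONS
    return answered_correct + expected_from_guessing

# Alan answers all 101 questions and gets 81 right.
alan = raw_score(answered_correct=81, guessed=0)

# Beth answers 77 (76 of them correctly) and guesses blindly on the other 24.
beth = raw_score(answered_correct=76, guessed=24)

print(alan)  # 81.0
print(beth)  # 80.8 -- chance actually gave Beth 5 rather than 4.8, so she too lands on 81
```

Because unanswered guesses still yield about one correct answer in five, Beth's near-perfect accuracy on the questions she actually reached is invisible in her total.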
Now, I certainly believe a scaled score of 164 - which places a person in about the 92nd percentile of test takers nationwide (the national median score being 151) - is a very good score, and one that would be considered by most law schools as suggestive of substantial reasoning power. (That score of 164 is just below the median score of admitted applicants at my home law school, UC Hastings, so I am very familiar with the kind of students who score in that range.)
Yet one has to wonder whether the 164 score is a fair measure for Beth - whether such a score gives her enough credit.
If Beth had enjoyed an additional 30 minutes, say, to complete the LSAT (imagine an extra 7.5 minutes for each of the 4 graded sections), her raw score might have increased from 81 to something like 91 or better. A raw score of 91 would translate to a scaled score of about 172 (in the 99th percentile nationally), which is right around the median score of students who attend Yale Law School, the law school to which it is most difficult to gain admission.
Additional time might have helped Alan, too, but not nearly as much. If he had a few more minutes to go over his responses, he might have caught a few of his careless errors, and might also have been able to think through more fully the couple of questions over which he was agonizing. But he probably wouldn't have picked up more than 2 or 3 additional correct responses; remember, he was pretty happy with the amount of time and confidence he had for virtually all of his 101 responses.
From a law school admissions perspective, Alan and Beth appear to be identical -- so far as LSAT performance is concerned -- even though Beth, given just a little more time, would have substantially outperformed Alan on the test, and set herself apart.
The possibility -- indeed the inevitability -- that there are real-life Beths and Alans out there led Professor Henderson to explore whether the LSAT places weight -- perhaps too much weight -- on speed as a test-taking skill. According to Professor Henderson, in the field of "psychometrics [test design], it is widely acknowledged that test-taking speed and reasoning ability are separate abilities with little or no correlation to each other." That is, a person's abilities, respectively, to reason well and to reason quickly aren't very related.
The LSAT is supposed to measure reasoning ability, or "power." As Professor Henderson says, "test-taking speed is assumed to be an ancillary variable with a negligible effect on candidate scores." But that may not be true.
Traditionally, the LSAT makers have maintained that they are interested in testing only for reasoning power to discern the best answer, not for superior quickness in reaching the best answer. Yet Professor Henderson suggests that some LSAT makers and experts these days may concede that the exam does value speed, whether by design or not.
LSAT Scores Correlate Somewhat Well To Law School Grades - But Why?
LSAT scores are used by law school admissions offices primarily because they correlate to law school grade performance better than any other single criterion (including college grades) does. In other words, LSAT scores do have some meaningful correlation to law school grades. (A formula blending LSAT score and college GPA yields a number that correlates to law school grades better than does either LSAT score or college grades alone, but the LSAT undeniably has some predictive power here.)
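To make the notion of "correlation to law school grades" concrete, here is a minimal sketch that computes a Pearson correlation coefficient by hand. The six student records are made-up numbers for illustration only; real predictive-validity studies rest on large, multi-school samples.

```python
# Illustrative only: Pearson correlation between LSAT scores and
# first-year grades, computed on hypothetical data for six students.
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient: covariance over the product of spreads."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

lsat = [151, 155, 160, 164, 168, 172]  # scaled LSAT scores (hypothetical)
gpa = [2.9, 3.0, 3.3, 3.2, 3.6, 3.7]   # first-year law GPAs (hypothetical)

print(round(pearson(lsat, gpa), 2))  # strongly positive, but well short of perfect
```

A coefficient near 1 would mean LSAT scores almost perfectly rank-order law school grades; the real-world correlation is meaningful but far more modest, which is why admissions offices blend the LSAT with college GPA.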
Now consider these two points together, and the question they raise: It seems that the LSAT measures and values speed. And it seems that the LSAT correlates to law school grade performance. The question becomes: Is the correlation to law school grades we observe related to the speed aspect of the LSAT?
Professor Henderson suggests that the reason for the correlation may be that law school exams also measure and value speed. Indeed, the majority of law school grades -- especially those given in the first, and most formative, year of law school -- are based upon in-class, three- or four-hour, "issue-spotting/issue-analyzing" exams, in which students are asked to read hypothetical factual situations and quickly to identify and discuss the legal issues (perhaps dozens of them) implicated by the facts. (No wonder these are nicknamed "racehorse" style exams.)
If Law School Did Not Use "Racehorse" Exams, Would the Same Correlation Persist?
Professor Henderson's piece goes on to tackle an important issue: Would other ways of evaluating law student performance yield correlations to LSAT performance that are very different from those generated by the racehorse tests? Possible grading alternatives to in-class short exams would include take-home exams that students have 8 hours or more to complete, or papers (in lieu of exams) written over the course of weeks.
To begin to answer this question, Professor Henderson examined multi-year data from two law schools of different kinds. One was a "national" law school where almost all the students had high LSAT scores and high college GPAs. The other was a "regional" law school where many students had LSAT scores and college GPAs close to the national medians.
Professor Henderson's findings were, as I explain a bit below, significant -- though he did note a number of limitations of, and complexities arising from, his methodology, including the following:
First, the correlation between LSAT scores and law school grades is invariably going to be weaker when most or all of the students are compressed within a very small segment of the LSAT performance spectrum, as is true at many "national" elite schools; where LSAT differences between students are relatively small, factors other than LSAT performance - such as level of engagement, work ethic, and so on - are going to account for law school grade differentials much more.
Second, with respect to term papers, grading cannot be "blind" the way it can be for in-class or take-home exams. That is, the professor grading a paper virtually always knows the student - the person - whose paper is being graded. (Indeed, the student likely discussed, and the professor may have approved, the topic ahead of time.) This personal factor - as well as some students' tendency to write papers for certain notoriously generous paper graders on the faculty - may skew things a bit.
Third, there is a much greater problem with illicit collaboration (that is, cheating) with respect to papers and take-home exams than there is with respect to in-class three- or four-hour tests. If roommates help each other on take-home exams, who will know? In contrast, cheating on a proctored in-class exam is far more dangerous. Accordingly, it is not always easy to compare grade results from the different categories perfectly.
Notwithstanding these wrinkles, though, Professor Henderson's study tends to show (quite convincingly to me) that LSAT performance correlates much more to law school grades that are given on the basis of short in-class exams than it does to law school grades given on papers and take-home exams. In other words, one big reason why time-pressured LSAT scores predict law school grade success is that we give so many time-pressured law school tests.
Are "Racehorse" LSAT, Law School, and Bar Exams Justified?
Suppose a higher percentage of law school grades were based on take-home exams and/or papers. According to Henderson's results, the LSAT's predictive usefulness would remain, but it would be significantly reduced.
(Interestingly, the extent to which college grades (as opposed to LSAT scores) predict law school grades does not vary across different types of law school evaluation methods nearly as much.)
That brings us to what I think is the $64,000 question (or questions): Is the extensive use of time-pressured, issue-spotting/issue-analyzing exams in law schools justified -- by the nature of time-pressured bar exams, by the nature of time-pressured legal practice, or by something else entirely?
If we believe the use of these "racehorse" exams is indeed justified, then we should explain why and how. And if we can't provide that explanation, we should ask ourselves why we use tests that seem to value speed so much.
These are questions to which I shall return next time.