Welcome to this Questionmark podcast. Questionmark podcasts bring you news, ideas, and advice about assessments and learning.
Joan Phaup, Questionmark: This is Joan Phaup from Questionmark, and I’m very pleased to be talking today with Sharon Shrock and Bill Coscarelli. Sharon and Bill are professors of instructional design, business consultants, and the authors of Criterion-Referenced Test Development, which is now in its third edition. They are also the keynote speakers at the Questionmark 2009 users’ conference. So Bill and Sharon, welcome to the call.
Bill Coscarelli: Oh, good to be here.
Sharon Shrock: Thank you.
Joan: I wanted to ask you about your favorite subject, which is criterion-referenced tests, or CRTs. Can you tell me how they are different from other tests?
Bill: Well, I’ll go ahead and start. The most sophisticated tests that most people see are called norm-referenced tests, and these tests, things like, in America, the Scholastic Aptitude Test or the ACT, are designed to sort people out, so that they can eventually be used to predict, usually, future behavior. And so they’re a test where not everybody can get 100%, and in fact if everybody does well, the test is considered bad from the designer’s perspective. The criterion-referenced test is the opposite in the sense that it’s typically used when we have skills or procedures we want to make sure people can do. For example, we all want the airline pilot to get the landing gear down 100% of the way 100% of the time. We’re not going to give him an A if he only does it 95% of the time. So the criterion-referenced tests are usually linked to specific job task skills, and in most organizations as well there are usually instructional objectives that are linked, so that we can achieve the performance that we want. And if everybody gets everything right, we’re happy.
Joan: Okay. I’d like to follow that by asking you if that kind of test, the criterion-referenced test, is harder to create than a norm-referenced test.
Sharon: Well, I think that it’s not. However, it’s less well known. The methodology for creating criterion-referenced tests is less publicized. People are less familiar with it, because most people taking tests through school have actually experienced norm-referenced tests rather than criterion-referenced tests, and the technology for creating norm-referenced tests is very well established and has been around for a long, long time, more than a hundred years, whereas criterion-referenced testing is relatively new. So while in many ways it’s conceptually more simple to create, it’s not as well-known.
Joan: Are you finding an increase in the interest in criterion-referenced tests? Are more people wanting to use them now?
Bill: Well, the fact that we’re on the third edition of the book tells us something, that somebody’s doing something. The big push, I think, began actually in the early ’90s, with the high-tech companies. If you think of yourself as Microsoft or Hewlett Packard and you’re sitting in southern California, and somebody says they know how to fix that server, and there are 1,050 applications on your desk from 500 different schools, you’re beginning to wonder, like “Geez, how do I even begin to make sense of this?” And from that, I think the certification movement got its start as we know it in the States, in that they were saying, “Well geez, I wish there was a test out there that, if they said they knew how to fix the server, we knew that they knew how to fix the server.” So that started in the ’90s with the high-tech world, though there have clearly been some other things out there such as the nursing exams and the CPA exams that have used some of these principles. But as people began to see the power of the tests and the ability for them to make accurate decisions about somebody’s competence, then other organizations have begun to draw them in and use them along the way.
Sharon: I can only add that the desire for these tests has been with us for decades, but people were very hesitant to even attempt to do it, because testing in general is an intimidating subject, and it was thought to be extremely difficult, and of course it’s got litigation implications, and so it was avoided until very recently. I would like to think that the book that we wrote has something to do with that, but Bill is correct. There was also just a need to make sure that employees were more competitive.
Bill: Criterion-referenced testing is about logical analysis. It starts with a job, and it usually leads to a test item that is also anchored to an objective. And so it’s really a logical thinking process more than a statistical tour de force.
Joan: I know that the two of you have been looking at this subject for more than 25 years, so I’m interested in going back with you a little bit to see what changes have happened, and I wonder if you could start by talking a little bit about the technical changes that have occurred in this kind of testing.
Sharon: Well, I think that one of the things that we saw was a growing consensus around ways to establish a passing score for this kind of a test. I think that made that part of the work, which actually is not essential for norm-referenced tests but is essential for criterion-referenced tests, more feasible. The other thing that we discovered, a major problem with testing in most organizations, is the difficulty of writing test questions that actually match what people have to do on jobs. So it’s a question of being able to write test questions that are above memory level. That turned out to be extremely important in writing this kind of test, and so that’s another technical aspect that certainly changed our focus in what we spend time on with the people that we work with. Those would be, I think, the two major technical points.
Bill: Well, I would add one more, I think, and that has been the computer support that sort of makes a lot of things happen. Questionmark in particular, with their software, came along at a time that enabled users to easily begin to build the item banks, to put the pieces together, and, as Sharon was talking about, processes for setting the cut-off scores. As far as we know, Questionmark’s the only company that has built into its software the Angoff process. So it’s bringing these skills right to the desktop. And plus, sophisticated help screens really [inaudible] someone who’s just trying to go through this and looking for just-in-time support.
Sharon: That’s absolutely true.
Joan: And then the other aspect of it I’m interested in is the organizational changes that have been coming about.
Bill: Yeah, this was fascinating for us. When we began to do our work, we thought of criterion-referenced testing as the last box in the instructional development model. And so in fact, in our first edition of the book, there’s not even a discussion of a model for developing a test. It was just our sense, like okay, it’s instructional development. You get done creating your instruction, now you gotta see whether or not somebody knows something. In working with one of our earlier clients, a very large global corporation, they were interested in creating a test for soft skills. Soft skills are different from hard skills in that hard skills are things like fixing the machine, or welding the pipe. Soft skills are things that have to do with interpersonal skills. And so they had a very sophisticated test that we helped them develop, and it was very accurate, and it was a very good measure, and anybody who failed that test, you couldn’t blame it on the test. You’d have to say, “No, they failed it.” And what happened, in the first testing round, a third of all the people failed it, and they all failed it coming from the same manager. Well, you can imagine, it didn’t take long before people thought, “Well, is it these people, or is it this manager?” The manager became horrified, I think, and the phones went off like wild brushfire. And before long, the political weight of that test caused the whole system to collapse into a very, very nice coaching exercise. And that moment is when we began to realize that testing done properly is not just a technical skill, but it’s an organizational development skill. If you put the test in right and there are consequences, and people don’t pass, then you’re going to have to start asking questions upstream about why they didn’t pass. And if you want them to pass, then you’re going to have to enter into an organizational development exercise to fix all the parts to make everything work to achieve the goal.
Sharon: I guess I would just say in summary that sound testing actually precipitates some very useful discussions in an organization, not only about employees but about training and development initiatives and about management and about support for doing work correctly and so forth.
Joan: Your comments make me wonder about the importance of using tests as effectively as possible, and I wanted to ask you what you think keeps organizations from using tests as effectively as they could.
Bill: Well, I guess there are a couple of things. The first part really is that they think it’s hard to do, and that it’s complicated. And typically, in most graduate training programs for instructional designers or people who are hired in training, they never get the testing tested. So they don’t even know what it is, except that, a class in testing. They never even know what it is and how to go about it. So the first hurdle in the process is simply understanding, you know, what are the steps involved? And so that’s number one. And then number two is education managers who are often just happy if they get the course out on training somebody, then that’s it. They’ve met their goal and their objective, and they don’t really care at that stage.
Sharon: It’s like so many innovations. The impediment is a knowledge impediment more than anything else.
Bill: The reality is, if you were starting to build a course from the ground up in any business today, and let’s say you figured it would take you three months to develop that course the way you wanted it, to get the tests done the way they were supposed to be done, starting at the very beginning with that whole course development process interlaced with the test development process, you’re probably only adding about ten days to the total project, maximum.
Joan: Right. So it’s possible it’s just a matter of learning what you need to learn in order to do it properly.
Bill: Right. Yes.
Joan: Well thank you both very much. It’s always a pleasure talking to you. Thanks a lot for your time and all your thoughts.
Sharon: Bye bye now.
Bill: Bye.