An automatic boredom detector? Inside “educational data mining” research


I’m currently working on a book about the past, present and future of assessment. For the “future” bit I get to talk to researchers like Ryan Baker at Columbia. He’s spent the last ten years working on systems that gather evidence about crucial parts of the learning process that would seem to be beyond the ken of a non-human teacher.

The observations are based on what are called “semantic logs” within a computer learning platform, such as Khan Academy’s: Was it a hard or easy question? Did the student enter a right or wrong answer? How quickly did they answer it? How did it compare with their previous patterns of answers? The detectors gather evidence that students are gaming the system, drifting off-task, or making careless errors. They can also infer a range of emotional states, like confusion, flow, frustration, resistance (which Baker memorably calls “WTF” behavior), engagement, motivation, excitement, delight, and yes, boredom.

Baker’s engagement detectors are embedded within systems currently being used by tens of thousands of students in classrooms from K-12 up to medical school. (Medical residents, he says, show the highest rate of “gaming the system,” aka trying to trick the software into letting them move on without learning anything, at rates up to 38% for a program that was supposed to teach them how to detect cancer.) His research, located at the forefront of the rapidly expanding field known as “educational data mining,” has a wide range of fascinating applications for anyone interested in blended learning.

Understanding how good these detectors currently are requires a bit of probability theory. To describe the accuracy of a diagnostic test, you need to compare the rate of true positives to the rate of false positives. The results for the “behavior detectors,” Baker says proudly, are about as good as first-line medical diagnostics. That is, if the question is whether someone is acting carelessly, off task, or gaming the system, his program will be right about as often as an HIV test was in the early 80s: 0.7 or 0.8 (“fair” according to this rubric). For emotional states, which require a more sophisticated analysis, the results are closer to chance, but still have some usefulness. These accuracy scores are derived from systematic comparison with trained human observers in a classroom.
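For readers who want to see what a score like 0.7 or 0.8 could mean, here is a minimal sketch. It assumes the number is an area under the ROC curve, a common way to summarize true-positive versus false-positive rates; the article does not name the metric. The function name, the labels, and the scores are all made up for illustration and are not Baker's data or code.

```python
def roc_auc(labels, scores):
    """Chance that a randomly chosen positive case gets a higher detector
    score than a randomly chosen negative case (ties count as half).
    0.5 is coin-flipping; 1.0 is perfect separation."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 means a trained human observer coded the student
# as off-task; the scores are an imaginary detector's confidence.
observer_labels = [1, 1, 1, 0, 0, 0, 0, 1]
detector_scores = [0.9, 0.7, 0.4, 0.5, 0.2, 0.1, 0.6, 0.3]

auc = roc_auc(observer_labels, detector_scores)
print(auc)  # 0.75 for this made-up sample
```

In this toy sample the detector ranks an off-task student above an on-task one in 12 of the 16 possible pairings, which is where the 0.75 comes from.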

So why would someone want to build a computer program that can tell if you are bored?

To improve computer tutoring programs. Let’s say a learning program provides several levels of hints before the right answer. You want to build something in that prevents a student from simple gaming techniques, such as pressing “hint, hint, hint, hint,” and then just entering the answer.
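As a rough illustration of the “hint, hint, hint, hint” pattern, here is a simple rule-based sketch. It is not how Baker's detectors work: the event format, the action names, and the thresholds are all hypothetical, chosen only to show the idea of spotting hints clicked through too fast to have been read.

```python
from dataclasses import dataclass


@dataclass
class Event:
    action: str     # "hint" or "answer" (hypothetical action names)
    seconds: float  # time the student spent before taking this action


def looks_like_hint_abuse(events, min_hints=3, max_seconds_per_hint=2.0):
    """Return True if the student clicked through at least `min_hints`
    hints in a row, each within `max_seconds_per_hint`, and then answered.
    Both thresholds are illustrative assumptions, not research values."""
    rapid_hints = 0
    for event in events:
        if event.action == "hint" and event.seconds <= max_seconds_per_hint:
            rapid_hints += 1
        elif event.action == "answer":
            if rapid_hints >= min_hints:
                return True
            rapid_hints = 0
        else:
            # A slowly read hint (or any other action) breaks the streak.
            rapid_hints = 0
    return False


# Hypothetical logs for two students working on the same problem.
rushed = [Event("hint", 1.0), Event("hint", 0.8), Event("hint", 1.2),
          Event("hint", 0.9), Event("answer", 3.0)]
careful = [Event("hint", 12.0), Event("hint", 15.0), Event("answer", 20.0)]

print(looks_like_hint_abuse(rushed))   # True
print(looks_like_hint_abuse(careful))  # False
```

A tutoring program could respond to a flagged attempt by, say, delaying the next hint or offering a similar problem, though what response works best is a research question in its own right.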

To give students real-time feedback and personalization. “I would like to see every kid get an educational experience tailored to their needs on multiple levels: cognitive, emotional, social,” says Baker. Let’s say the program knows you are easily frustrated, and gives you a few more “warmup” questions before moving on to a new task. Your friend is easily bored. She gets “challenge” questions at the start of every session to keep her on her toes.

To improve classroom practice. Eventually as these systems become more common, “I would envision teachers having much more useful information about their kids,” says Baker. “Technology doesn’t get rid of the teacher, it allows them to focus on what people are best at: Dealing with students’ engagement, helping to support them, working one on one with kids who really need help.” In other words, though technology can provide the diagnostics for affective states that affect learning, it is often teachers that provide the best remedies.

To reinvent educational research: This is a fascinating one to me. 

“I’d like to see educational research have the same methodological scope and rigor that have transformed biology and physics,” Baker says. “Hopefully I would like to see research with, say, 75% of the richness of qualitative methods with ten times the scale of five years ago.”

Modeling qualitative factors related to learning opens up new possibilities for getting really rich answers to really interesting questions. “Educational data mining often has some really nice subtle analyses. You can start to ask questions like: What’s the difference in impact between brief confusion and extended confusion?”

In case you’re wondering, I will clear up the confusion. Brief confusion is extremely helpful, even necessary, for optimal learning, but extended confusion is frustrating and kills motivation.

The very phrase “data mining” as applied to education ruffles feathers. It’s helpful to hear from an unabashedly enthusiastic research scientist, not an educational entrepreneur with a product to sell, about this topic. Privacy, he says, should be given due consideration. “The question is what the data is being used for,” he says. “We have a certain level of comfort with Amazon or Google knowing all this about us, so why not curriculum designers and developers? If we don’t allow education to benefit from the same technology as e-commerce, all we are saying is we don’t want our kids to have the best of what 21st-century technology has to offer.”

If you’re interested in learning more, Baker has a free online Coursera course on “Big Data in Education” starting this Thursday. Over 30,000 people have signed up.

Social media and video games in classrooms can yield valuable data for teachers

Photo by BarbaraLN

Social media, video games, blogs and wikis are playing increasingly important roles in classrooms across the country. Some worry that incorporating more social media and other technologies into education is leading to too much computer time, as well as to a generation of students deficient in the face-to-face social skills needed to survive in the workplace. Proponents say schools need to find ways to use these technologies to improve teaching and learning, or else risk losing the attention of digital natives.

A paper released earlier this week by the Brookings Institution addresses how social media, blogs and video games are improving education by increasing access to people and information in various forms, including Twitter feeds, blog posts, videos and books. These tools are also increasing people’s ability to share information with networks and contribute their own thoughts.

A panel convened at Brookings yesterday to discuss how technologies like social media and video games are influencing education took up the hot topics of “analytics” and understanding student data. Constance Steinkuehler Squire, a senior policy analyst in the White House’s Office of Science and Technology Policy and an expert on the educational uses of video games, said that beyond increasing student engagement, video games create valuable “data exhaust” by tracking each student’s progress.

“The data you can get from a student interacting with the game is compelling,” Squire said in the panel yesterday. “And it opens up an entire new area of formative assessment—and the mapping of formative assessments and learning analytics—to a game-type environment, which has been of a lot of interest, both private and public.”

The idea is that the data collected by video games and social media sites can be provided, sometimes in real time, to teachers who can then use it to better understand their students and tailor instruction to meet individual needs.

Janet Kolodner of the National Science Foundation said that data collection will come to be about more than that. She mentioned that NSF just launched a project on “big data”—a term that encompasses the gathering of extremely large amounts of data to which analytics are applied to reach new insights—and said that big data will play a much bigger role in education in the future.

“Big data is also being used so that if we have kids learning in the context of games or kids learning in the context of tutoring systems, that the system will be able to analyze that student’s work and their understanding and be able to give the right kind of feedback at the right time to help them deepen their understanding,” Kolodner said.

Companies like Knewton and Junyo, and the Learn Lab in Pittsburgh, are all creating such systems, which are being used by many schools across the nation.

Another concern of multiple audience members at the panel was the idea that advances in digital technologies would make school as we know it irrelevant. They suggested a day might come when students wouldn’t go to a school building at all, but would instead learn exclusively from mobile devices and virtual teachers. Kolodner put their fears to rest.

“I don’t think schools are going to go away,” she said. “Parents need to work, and you need a place to put the kids.”

A wave of laughter spread across the room.