Screenshot of the embodied avatar system “Diana.” Credit:
Imagine a classroom in the future where teachers are working
alongside artificial intelligence partners to ensure no student
gets left behind. There is a student in
the back who has been quiet and still for the whole class, and the
AI partner prompts the teacher to engage the student. When called
on, the student asks a question. The teacher clarifies the material
that has been presented and every student comes away with a better
understanding of the lesson.
This is part of a larger vision of future classrooms where human
instruction and AI technology interact to improve educational
environments and the learning experience.
James Pustejovsky, the TJX Feldberg Professor of Computer
Science, is working towards that vision with a team led by the
University of Colorado Boulder, as part of the new $20 million
National Science Foundation-funded AI Institute for Student-AI Teaming.
The research will play a critical role in helping ensure the AI
agent is a natural partner in the classroom, with
language and vision capabilities, allowing it to not only hear what
the teacher and each student is saying, but also notice gestures
(pointing, shrugs, shaking a head), eye gaze, and facial
expressions (student attitudes and emotions).
Pustejovsky took some time to answer questions from BrandeisNOW
about his research.
How does your research help build this classroom of the future?
For the past five years, we have been working to create a
multimodal embodied avatar system, called “Diana,” that interacts
with a human to perform various tasks. She can talk, listen, see,
and respond to language and gesture from her human partner, and
then perform actions in a 3-D simulation environment called
VoxWorld. This is work we have been conducting with our
collaborators at Colorado State University, led by Ross Beveridge
in their vision lab. We are working together again (CSU and
Brandeis) to help bring this kind of “embodied human-computer
interaction” into the classroom. Nikhil Krishnaswamy, my former
Ph.D. student and co-developer
of Diana, has joined CSU as part of their team.
How does it work in the context of a classroom?
At first it’s disembodied, a virtual
presence on an iPad, for example, where it is able to recognize
the voices of different students. So imagine a classroom: Six to 10
children in grade school. The initial goal in the first year is to
have the AI partner passively following the different students, in
the way they’re talking and interacting, and then eventually the
partner will learn to intervene to make sure that everyone is
equitably represented and participating in the classroom.
Are there other settings that Diana would be useful in
besides a classroom?
Let’s say I’ve got a Julia Child app on my iPad and I want her
to help me make bread. If I start the program on the iPad, the
Julia Child avatar would be able to understand my speech. If I have
my camera set up, the program allows me to be completely embedded
and embodied in a virtual space with her so that she can help me.
How does she help you?
She would look at my table and say, “Okay, do you have
everything you need?” And then I’d say, “I think so.” So the camera
will be on, and if you had all your baking materials laid out on
your table, she would scan the table. She’d say, “I see flour,
yeast, salt, and water, but I don’t see any utensils: you’re going
to need a cup, you’re going to need a teaspoon.” After you had
everything you needed, she would tell you to put the flour in “that
bowl over there.” And then she’d show you how to mix it.
Is that where Diana comes in?
Yes, Diana is basically becoming an “embodied presence” in
interaction: she can see what you’re doing, you can see what
she’s doing. In a classroom interaction, Diana could help with
guiding students through lesson plans, through dialog and gesture,
while also monitoring the students’ progress, mood, and levels of
satisfaction or frustration.
Does Diana have any uses in virtual learning in
Using an AI partner for virtual learning could be a fairly
natural interaction. In fact, with a platform such as Zoom, many of
the computational issues are actually easier since voice and video
tracks of different speakers have already been segmented and
identified. Furthermore, in a Hollywood Squares display of all the
students, a virtual AI partner may not seem as unnatural, and Diana
might more easily integrate with the students online.
What stage is the research at now?
Within the context of the CU Boulder-led AI Institute, the
research has just started. It’s a five-year project, and it’s
getting off the ground. This is exciting new research that is
starting to answer questions about using our avatar and agent
technology with students in the classroom.
Originally published by Tessa Venell, Brandeis University
November 20th 2020