The Planet of the Apes reboot has been a massive success, and its impact on the film industry is hard to overstate. It began with Rupert Wyatt’s Rise of the Planet of the Apes in 2011 and carried on with Matt Reeves’s Dawn of the Planet of the Apes in 2014 and his riveting coda, War for the Planet of the Apes, in 2017. The central character, Andy Serkis’s chimpanzee Caesar, embodies the trilogy’s wondrous achievement, an evolution in technical wizardry realized by reversing human evolution: man becomes ape.
War for the Planet of the Apes is nominated for an Oscar for visual effects (as were the previous two installments, neither of which won), and will compete against aliens (The Last Jedi, Guardians of the Galaxy Vol. 2), replicants (Blade Runner 2049), and, in an intra-species duel, a much larger ape (Kong: Skull Island). All of these films are worthy of their nominations, but one gets the feeling that, with the trilogy having revolutionized the motion capture technology that allowed Serkis (and fellow actors like Steve Zahn, Terry Notary and Karin Konoval) to become apes, technology now used widely throughout the industry, it’s War for the Planet of the Apes’ time to bring home the Oscar.
We spoke with VFX supervisor and four-time Oscar winner Joe Letteri (his previous wins were for Avatar, the final two Lord of the Rings films and King Kong), who is nominated alongside his peers Daniel Barrett, Dan Lemmon and Joel Whist. Our conversation has been edited for length and clarity.
Take us through the evolution of your technique, from Rise to War.
We had a significant change right at the beginning with Rise, because we decided we needed to create these apes as totally realistic chimps, which meant no prosthetics or makeup. Every ape would be digital. Once we made that choice, we knew we had to have a key performance with Caesar. Fox was on board with Andy Serkis, and what I wanted to do was take this idea we’d already been doing with Andy [in King Kong and The Lord of the Rings], performance capture, only this time we wouldn’t ask him to go back and redo everything he did. I wanted to find a way to capture his performance live, as he was performing. We did that for Rise, and for the next two films we just pushed it further and further.
What does that mean, ‘not ask him to go back and redo everything’?
So back when we were turning Andy into Gollum in The Lord of the Rings and into King Kong, he’d be on set with the actors, performing so they had an eye line with him (he was always in character), which gave them the benefit of being able to play off another actor live in the scene. Yet because of the technical nature of recording motion capture performances at the time, we couldn’t bring our cameras and lights onto a live-action set because they’d interfere with the other cameras. So after everything was filmed, we’d have to go back to a motion capture stage with all the shots that Andy was in, have Andy watch them, and then have him perform those moves as closely as possible all over again. It worked, but it’s asking a lot of the actor to do it twice. So we wanted to figure out a way to essentially shield our cameras and lights so they didn’t interfere with the regular cameras and lights, and capture Andy on set, in real time.
So that began with Rise, which was mostly set in closed environments like houses, labs, and that prison-like place they take Caesar, but this changed dramatically in Dawn, didn’t it?
On Dawn, it meant getting away from a studio location and out into an actual location, out in a forest, where it was raining. After that was successful, on War, Matt [Reeves] wanted to go even further, to take the apes on this exodus and get them moving across the country in this vast landscape, so we shot in harsh, mountainous conditions with snow and ice and rain, while still having to capture all the nuances the actors were giving us.
How did you pull that off?
When you’re doing performance capture on a stage, you have optical reflective markers on the body suit that reflect light from the cameras and shoot that light back into the lens. But those lights were then throwing light into the actual scene, so we couldn’t have that on a live-action set. Also, if you have bright lights on the set, our cameras could get confused by them. So in order to hide all that, we put infrared transmitters on the body suit and had our cameras be infrared sensitive. This allowed us to send infrared light that the regular film cameras couldn’t see.
It was a breakthrough. Film hasn’t really changed in the last 100 years, yet we showed up on the set of Rise with a completely new unit that had to be integrated into the live-action shoot. So when everyone got done rigging their lights and sets and props, we had to get in there and hang our lights and cameras and calibrate them and get everything ready for shooting. Everyone had to make room for us as this new department, but by War, everyone was really used to us.
Can you walk me through the process of turning an actor into an ape? Talk to me like I’m utterly technologically challenged, which I am.
So on War, the whole point of performance capture is to record the actor’s movement. With film, you have one camera and a bunch of pixels. If you have the right image, you’re done. That won’t work for us. We have to have dozens of cameras around the set, and each pixel has to align. If you see a marker on Andy’s elbow, several cameras need to triangulate and get that position in space. If you don’t triangulate that marker in space, then all the cameras are getting competing information. So we have to keep them all tightly calibrated, which is difficult enough on a stage; it becomes really tricky on a location, especially one in a remote, mountainous area. Our team was all over it, constantly adjusting things and creating wireless rigs and weatherproof rigs. Whatever it took to get all the cameras on the same page, they did it.
What you’re doing with all these performance capture cameras is recreating the actor’s performance in space. The markers tell you what his skeleton is doing, and we try to get the motion of the actor’s skeleton and translate it to the apes’ skeleton. Every camera sees those dots from different points so you can compute from triangulation where that point is in space, and how it moves through space, and record that relative to every other dot on the body. We have to sift through all that information, compute which markers belong to which character, and use it to recreate that body movement in space. If all 9 cameras see Andy’s elbow dot, all 9 cameras need to agree on where that dot is in space. Otherwise, in the 3D world we’re creating for the skeleton, that point will start to jitter over time and will break the subtle motion.
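The triangulation Letteri describes can be illustrated with a small sketch. This is not the production software, only a toy under stated assumptions: each calibrated camera is reduced to a position and a ray pointing toward the marker, and the marker is placed at the point closest to all rays in the least-squares sense. The names `triangulate`, `ray_distance`, `centers`, and `directions` are hypothetical.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def solve3(a, b):
    """Solve the 3x3 system a * x = b by Gauss-Jordan elimination."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("rays are parallel; the marker cannot be located")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def triangulate(centers, directions):
    """Return the 3D point closest to every camera ray (least squares).

    Assumption: camera i sits at centers[i] and sees the marker along
    directions[i]. Each ray contributes the projector (I - d d^T).
    """
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for c, d in zip(centers, directions):
        d = normalize(d)
        for i in range(3):
            for j in range(3):
                proj = (1.0 if i == j else 0.0) - d[i] * d[j]
                a[i][j] += proj
                b[i] += proj * c[j]
    return solve3(a, b)


def ray_distance(point, center, direction):
    """Distance from a point to one camera ray: how much that camera disagrees."""
    d = normalize(direction)
    v = [p - c for p, c in zip(point, center)]
    t = sum(x * y for x, y in zip(v, d))
    return math.sqrt(sum((x - t * y) ** 2 for x, y in zip(v, d)))


if __name__ == "__main__":
    elbow = [1.0, 2.0, 3.0]                      # true marker position
    cams = [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [0.0, 5.0, 0.0]]
    rays = [[m - c for m, c in zip(elbow, cam)] for cam in cams]
    print(triangulate(cams, rays))
```

When the calibration drifts, the rays no longer meet at one point and `ray_distance` grows for the affected camera. That disagreement is the kind of error that, repeated over many frames, would show up as the jitter described above.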
Your work on set is only the beginning of your process. Tell us a bit about actually creating the apes from the information you capture during filming.
Once we have all this performance capture information, we actually have to create the character. That’s a whole different side of it. It’s not automatic that, once you’ve recorded all this data, you’ll get an interesting performance on the other end. We have to go in and actually make the characters. Caesar is built in 3D space, with a skeleton, muscle, tissue, fur, and a facial system captured using a head-mounted video rig. All that info gets pushed forward, frame by frame. The skeleton drives the muscle, the muscle drives the skin, the skin drives the fur, the facial muscles drive the facial skin, and we look at all of this on a frame-by-frame basis. Then we review it with Matt Reeves, and ask if we’re getting the emotional feel of what Andy and Steve and Terry and Karin are doing. This is something we have to constantly refine. It takes a lot of artists to look at and get that subtle realism of each shot.
How hard is it to turn a human face into an ape’s face?
For a real close-up of Caesar, and Matt used a lot of them on this film, you’re getting small, nuanced, detailed motion on the face. That’s something we had to interpret from Andy’s face to Caesar’s, because the faces aren’t the same. You have to translate and interpret that performance because apes are missing some features that humans have, but then, of course, they have other features we don’t, like their elongated muzzle. It’s especially hard for dialogue.
I imagine for something as nuanced as translating grief, which Caesar experiences a lot of in this film, the process is even trickier.
Apes do have a very similar muscle layout to humans, which helps. Caesar’s digital model has a skull, and over the skull is a layer of tissue, and we drive that tissue using the muscles that we think Andy’s using. So if Andy’s furrowing his brow, we try to estimate how that’s happening, and then try to make Caesar’s brow furrow in the same way. But it’s not exact. Andy’s face is different, so we’re constantly adding or subtracting shapes to capture the same intensity or feeling. It’s just something you can’t leave to a computer to do on its own.
You didn’t just turn humans into apes; you also digitally “grew” a forest, I hear?
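One common way to think about "adding or subtracting shapes" is a blend-shape model, in which a face is a neutral mesh plus weighted offsets. The sketch below uses that representation as an assumption; the interview does not say the production system works this way. The shape name `brow_furrow`, the `artist_gain` table, and both functions are hypothetical.

```python
def retarget(actor_weights, artist_gain):
    """Map weights estimated from the actor onto the ape's shapes.

    artist_gain holds per-shape corrections chosen by hand, standing in
    for the manual adjustment that cannot be left to the computer.
    """
    return {name: w * artist_gain.get(name, 1.0)
            for name, w in actor_weights.items()}


def blend(neutral, shapes, weights):
    """Add each weighted shape offset to the neutral mesh.

    neutral: list of [x, y, z] vertices.
    shapes:  shape name -> list of per-vertex [dx, dy, dz] offsets.
    weights: shape name -> strength (may be negative to subtract a shape).
    """
    result = [list(v) for v in neutral]
    for name, w in weights.items():
        for vertex, delta in zip(result, shapes[name]):
            for axis in range(3):
                vertex[axis] += w * delta[axis]
    return result


if __name__ == "__main__":
    ape_neutral = [[0.0, 0.0, 0.0]]                   # one brow vertex
    ape_shapes = {"brow_furrow": [[0.0, -1.0, 0.0]]}  # furrow pulls it down
    actor = {"brow_furrow": 0.5}                      # estimated from footage
    tuned = retarget(actor, {"brow_furrow": 1.2})     # artist pushes it harder
    print(blend(ape_neutral, ape_shapes, tuned))
```

The point of the separate `artist_gain` step is that the same measured weight rarely produces the same emotional intensity on a differently shaped face, so a person has to tune it by eye.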
We came up with a new technique where we actually grew, digitally, the pine forest that gets destroyed at the end. In the past you’d have made a forest by basically molding and sculpting a bunch of trees, and then moving them all around. It would essentially be done by hand, by art directing, which is great, but I was always curious what the difference between that and what’s real would be. The natural process of forest growth takes a hundred years or more, and means a lot of things happen for a reason. As each tree grows it competes for resources with other trees, and much of that competition depends on where they are in the forest, wind, rain, and so many factors. So we had a helicopter fly over the actual mountain, scan it, rebuild it, seed it, and then let it grow for a hundred years. Then we added snow to the branches so they would have the proper weight on them, so that when we wiped it all out with an avalanche, every tree was unique and dynamic and reacted accordingly. And it was all done digitally.
There was a fear for a while that this technology you’re creating will one day replace actors altogether. Way back in 2002, the film S1m0ne revolved around a producer recreating a pouty star digitally, and then the digital creation becomes a star herself, or really itself. How credible are these fears? Just sci-fi stuff?
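The idea of seeding a terrain and letting trees compete for a hundred years can be sketched as a toy simulation. This is only an illustration, not the studio's technique: it assumes flat ground, a single resource (light), and made-up constants. `grow_forest`, `CROWN_REACH`, `SHADE_LIMIT`, and the `vigor` field are all hypothetical.

```python
import math
import random

CROWN_REACH = 2.0  # assumption: a tree shades neighbours within 2x its size
SHADE_LIMIT = 3.0  # assumption: a tree dies once shade exceeds 3x its size


def grow_forest(width, height, seed_count, years, rng_seed=0):
    """Scatter seeds, then grow them year by year while they compete for light."""
    rng = random.Random(rng_seed)
    trees = [
        {"x": rng.uniform(0, width), "y": rng.uniform(0, height),
         "vigor": rng.uniform(0.5, 1.5),  # stands in for local soil, wind, rain
         "size": 0.1}
        for _ in range(seed_count)
    ]
    for _ in range(years):
        next_year = []
        for tree in trees:
            # Shade comes from every larger tree whose crown reaches this one.
            shade = sum(
                other["size"] for other in trees
                if other["size"] > tree["size"]
                and math.hypot(other["x"] - tree["x"],
                               other["y"] - tree["y"]) < CROWN_REACH * other["size"]
            )
            if shade > SHADE_LIMIT * tree["size"]:
                continue  # out-competed: this tree does not survive the year
            grown = dict(tree)
            grown["size"] = tree["size"] + 0.1 * tree["vigor"] / (1.0 + shade)
            next_year.append(grown)
        trees = next_year
    return trees


if __name__ == "__main__":
    forest = grow_forest(50.0, 50.0, 60, 100)
    print(len(forest), "trees survived out of 60")
```

Because each tree's final size depends on where it landed and who its neighbours were, no two survivors come out alike, which is the property Letteri says hand placement struggles to reproduce.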
We design this process around actors. We take what they do well and give them opportunities to create new characters. Now you’re not limited by your gender, or even your species! Karin Konoval is a small woman, yet here she plays a massive orangutan. This opens up a lot of potential for actors, and it puts them at the center of this technology.