Until recently, in my consulting work around 3D reality capture, I have made a distinction between two technologies:
- Active 3D capture, or 3D scanning, which uses a dedicated device that projects a pattern onto the subject with visible or invisible light or lasers, and uses one or more sensors to capture this pattern to calculate depth.
- Passive 3D capture, or photogrammetry, which uses a series of 2D images and computer algorithms to estimate camera positions and calculate depth from that information (see the sketch below).
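The core geometry behind passive capture is triangulation: once camera positions are known, depth follows from how far a point shifts between views. Here's a minimal illustration of that relation (my own sketch, not tied to any particular product), assuming an idealized, calibrated stereo pair:

```python
# Minimal sketch: depth from disparity in a calibrated stereo pair.
# Illustrative values only; real photogrammetry pipelines estimate camera
# poses for many views and refine everything with bundle adjustment.

def depth_from_disparity(focal_length_px: float, baseline_m: float,
                         disparity_px: float) -> float:
    """Classic pinhole-stereo relation: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("Disparity must be positive; zero means the point is at infinity.")
    return focal_length_px * baseline_m / disparity_px

# Example: 1000 px focal length, 10 cm baseline, 25 px disparity -> 4 m depth.
print(depth_from_disparity(1000.0, 0.10, 25.0))  # 4.0
```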
The latter is gaining a lot of traction because it's a software solution that doesn't require special hardware purchases and has become feasible to run on smartphone hardware.
But since this approach still requires the subject to remain completely still during capture, it's hard to use for scanning people. That's why, instead of relying on traditional photogrammetry principles, a new wave of 3D capture technologies uses artificial intelligence (AI) and machine learning to estimate depth and generate lifelike 3D models, and to do so in a way that's a lot more user-friendly.
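To give a flavor of what AI-based depth estimation looks like in practice, here's a minimal sketch using the publicly available MiDaS monocular depth model via torch.hub. MiDaS is just one example of this family of models, not the technique behind any specific product mentioned in this post, and the file name is illustrative:

```python
import cv2
import torch

# Load a pretrained monocular depth model and its matching preprocessing.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
midas.eval()
transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = transforms.small_transform

# Read a single ordinary photo; no special hardware involved.
img = cv2.cvtColor(cv2.imread("selfie.jpg"), cv2.COLOR_BGR2RGB)
input_batch = transform(img)

with torch.no_grad():
    prediction = midas(input_batch)
    # Resize the predicted depth map back to the original image resolution.
    prediction = torch.nn.functional.interpolate(
        prediction.unsqueeze(1), size=img.shape[:2],
        mode="bicubic", align_corners=False,
    ).squeeze()

depth = prediction.cpu().numpy()  # relative (not metric) depth per pixel
```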
Last year I tested itSeez3D's Avatar SDK, which has since become a complete commercial platform that can work both locally and in the cloud. It can create a 3D avatar of a human head from just a single 2D selfie photo. Sure, it's an estimation and results vary per person and per attempt, but machine learning algorithms get better over time. And the savings in time and effort compared to traditional methods are so significant that they can easily outweigh raw accuracy for many use cases.
Besides the ease of use, these kinds of generated 3D models contain very efficient geometric topology (the Avatar SDK even provides different levels of detail, or LODs) that can readily be used and animated in real-time applications, whereas captured 3D models usually contain an unstructured, far-from-optimal polygonal mesh. And while digital hair replacements can look ridiculous if you choose the wrong one, they have big advantages over actual captured hair because they can move dynamically within real-time environments.
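The LOD idea itself is straightforward. As a rough illustration (this is not how the Avatar SDK generates its LODs internally, just a generic sketch using Open3D's quadric decimation, with a hypothetical input file), you can derive a chain of LODs from a dense mesh by repeatedly halving the triangle budget:

```python
import open3d as o3d

# Hypothetical input file; any triangle mesh works here.
mesh = o3d.io.read_triangle_mesh("avatar_head.obj")
mesh.compute_vertex_normals()

# Generate a chain of LODs by repeatedly halving the triangle budget.
lods = {}
target = len(mesh.triangles)
for level in range(4):
    lods[level] = mesh.simplify_quadric_decimation(
        target_number_of_triangles=target)
    print(f"LOD {level}: {len(lods[level].triangles)} triangles")
    target //= 2
```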
A new concept that was recently published on sciencemag.org takes the AI-based reality capture idea even further. It can turn a regular video of a person turning around in place into an efficient full-body 3D model that comes readily rigged for animation. The video below explains it better than I can put in writing:
This is very interesting technology because it lets anyone with a camera easily create a lifelike 3D avatar, without the subject having to stand completely still. On top of that, the promised "5 mm average accuracy" could also make this technology viable for body measurement use cases that don't require extreme precision.
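To make that concrete: once a watertight body mesh is available, extracting a basic measurement is essentially a cross-section away. A minimal sketch using the trimesh library, where the file name, units and waist height are all assumptions for illustration:

```python
import trimesh

# Hypothetical body-scan mesh in meters, Z up; the file name is illustrative.
body = trimesh.load("body_scan.obj", force="mesh")

# Slice the mesh with a horizontal plane at an assumed waist height.
waist_height = 1.05  # meters above the floor; depends on the subject
section = body.section(plane_origin=[0.0, 0.0, waist_height],
                       plane_normal=[0.0, 0.0, 1.0])

if section is not None:
    # Note: at some heights the plane also cuts the arms or hands, producing
    # extra loops; a real measurement tool would keep only the torso loop.
    print(f"Total cross-section length: {section.length * 100:.1f} cm")
```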
I’m hoping this technology will soon be released so I can test it for myself and compare it to more traditional body scanning solutions like photogrammetry, depth sensors and professional handheld scanners.
What’s your take on this? And do you know any more examples of AI-based 3D capture? Please let me know in the comments!
To be honest, as a 3D artist and computer programmer I've been looking into developing my own AI model to integrate a blended technology solution (which I'm surprised doesn't already exist: a combination of photogrammetry, binocular stereo, time-of-flight and projection mapping, sorted by an AI backend). Excellent blog! If you want to check out an interesting, similar project (though only 2D), see thispersondoesnotexist.com (AI-generated CC0 people images).
PS: If a blended AI solution already exists, please feel free to message me and tell me about it.