For my Master’s degree thesis, I dove into acquiring static objects with an RGBD sensor. Eventually, I decided to use the Kinect Fusion algorithm, which produced decent results.
I found this topic fascinating, so I continued on my own, but in another direction: acquiring people. Over the last few months, I have been experimenting with scanning myself and my friends with a Microsoft Kinect One.
Influenced by my previous results, I initially tried working with point clouds.
Deformation graph approach
One approach I found in several papers consists of:
1. acquiring only a few scans (from 6-8 points of view), with the person staying as still as they can;
2. performing a rough global alignment (a sketch of one way to do this follows the list);
3. running ICP to locally refine the rigid alignment;
4. downsampling the point cloud to build a deformation graph;
5. solving an optimization problem;
6. deforming the denser point clouds.
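Neither the papers nor this post pin down a specific method for the rough global alignment of point 2. One common option in Open3D is FPFH feature matching followed by RANSAC; here is a minimal sketch of that idea, assuming a recent Open3D release (the o3d.pipelines.registration module), with a hypothetical helper and purely illustrative parameter values:

```python
import open3d as o3d

def rough_global_alignment(source, target, voxel_size=0.05):
    """Coarse rigid alignment of two scans via FPFH features + RANSAC.
    voxel_size (metres) and the derived radii are illustrative values."""
    def preprocess(pcd):
        down = pcd.voxel_down_sample(voxel_size)
        down.estimate_normals(
            o3d.geometry.KDTreeSearchParamHybrid(radius=voxel_size * 2, max_nn=30))
        fpfh = o3d.pipelines.registration.compute_fpfh_feature(
            down,
            o3d.geometry.KDTreeSearchParamHybrid(radius=voxel_size * 5, max_nn=100))
        return down, fpfh

    src_down, src_fpfh = preprocess(source)
    tgt_down, tgt_fpfh = preprocess(target)

    dist = voxel_size * 1.5
    result = o3d.pipelines.registration.registration_ransac_based_on_feature_matching(
        src_down, tgt_down, src_fpfh, tgt_fpfh,
        True,   # mutual_filter
        dist,
        o3d.pipelines.registration.TransformationEstimationPointToPoint(False),
        3,      # ransac_n
        [o3d.pipelines.registration.CorrespondenceCheckerBasedOnEdgeLength(0.9),
         o3d.pipelines.registration.CorrespondenceCheckerBasedOnDistance(dist)],
        o3d.pipelines.registration.RANSACConvergenceCriteria(100000, 0.999))
    return result.transformation  # 4x4 rigid transform, to be refined with ICP
```

Any other source of a rough pose (markers, a turntable, manual point picking) works just as well, since all ICP needs from this step is a reasonable initial guess.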
Reaching point 3 is not trivial because people move. ICP can be very unforgiving, and in some cases, you also need some luck to obtain good results at this stage.
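For reference, this is roughly what the local refinement of point 3 can look like in Open3D; the point-to-plane variant, the correspondence distance, and the iteration count are assumptions made for the sake of the example:

```python
import open3d as o3d

def refine_alignment(source, target, init_transform, max_dist=0.02):
    """Local ICP refinement starting from a rough global alignment.
    max_dist (metres) and the point-to-plane variant are illustrative choices."""
    # Point-to-plane ICP needs normals on the target cloud.
    target.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))
    result = o3d.pipelines.registration.registration_icp(
        source, target, max_dist, init_transform,
        o3d.pipelines.registration.TransformationEstimationPointToPlane(),
        o3d.pipelines.registration.ICPConvergenceCriteria(max_iteration=100))
    return result.transformation
```

Keeping max_dist tight helps reject correspondences on body parts that moved between scans, which is exactly where ICP tends to get fooled.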
For downsampling (point 4), I used Open3D’s voxelization followed by averaging point coordinates. I do not know how this choice influences the final results compared to something like a clustering algorithm.
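For what it’s worth, Open3D’s voxel_down_sample already averages the points that fall inside each voxel, so the graph nodes can come from a single call, and connecting each node to its nearest neighbours gives the graph edges. A minimal sketch, where the voxel size, k, and the helper itself are only illustrative:

```python
import numpy as np
import open3d as o3d

def build_deformation_graph(pcd, voxel_size=0.05, k=6):
    """Build deformation-graph nodes by voxel downsampling (points in each
    voxel are averaged) and connect every node to its k nearest neighbours."""
    nodes = pcd.voxel_down_sample(voxel_size)   # averaged voxel centroids
    node_pts = np.asarray(nodes.points)

    tree = o3d.geometry.KDTreeFlann(nodes)
    edges = set()
    for i, p in enumerate(node_pts):
        # k + 1 because the query point itself comes back as a neighbour
        _, idx, _ = tree.search_knn_vector_3d(p, k + 1)
        for j in list(idx)[1:]:
            edges.add((min(i, j), max(i, j)))
    return node_pts, sorted(edges)
```

In the usual deformation-graph formulation, the nodes carry the unknown local transformations and the edges define the smoothness terms of the optimization in point 5.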