3D SCENE RECONSTRUCTION AND DIGITIZATION METHOD FOR MIXED REALITY SYSTEMS

M. I. SOROKIN; Сорокин М. И.; D. D. ZHDANOV; Жданов Д. Д.; A. D. ZHDANOV; Жданов А. Д.

doi:10.31857/S0132347423030056

3D SCENE RECONSTRUCTION AND DIGITIZATION METHOD FOR MIXED REALITY SYSTEMS

Authors: SOROKIN M.I.¹, ZHDANOV D.D.¹, ZHDANOV A.D.¹
Affiliations:
1. St. Petersburg National Research University of Information Technologies, Mechanics, and Optics (ITMO University)
Issue: No 3 (2023)
Pages: 26-36
Section: COMPUTER GRAFICS AND VISUALIZATION
URL: https://journals.rcsi.science/0132-3474/article/view/137624
DOI: https://doi.org/10.31857/S0132347423030056
EDN: https://elibrary.ru/DEMESA
ID: 137624

Cite item

Full Text

Abstract
About the authors
References
Supplementary files
Statistics

Abstract

Mixed reality systems are a promising direction of research that opens up great opportunities for
interaction with virtual objects in the real world. Like any promising direction, mixed reality has a number of unresolved problems. One of these problems is the synthesis of natural lighting conditions for virtual objects, including the correct light interaction of virtual objects with the real world. Since virtual and real objects are located in different spaces, it is difficult to ensure their correct interaction. To create digital copies of realworld objects, machine learning tools and neural network technologies are employed. These methods are successfully used in computer vision for space orientation and environment reconstruction. As a solution, it is proposed to transfer all objects into the same information space: virtual space. This makes it possible to solve most of the problems associated with visual discomfort caused by the unnatural light interaction of real and virtual objects. Thus, the basic idea of the method is to recognize physical objects from point clouds and replace these objects with virtual CAD models. In other words, it implies semantic analysis of a scene and classification of objects with their subsequent transformation into polygonal models. In this study, we use competitive neural network architectures, which can provide state-of-the-art results. The test experiments are carried out on Semantic3D, ScanNet, and S3DIS, which are currently the largest datasets with point clouds that represent indoor scenes. For semantic segmentation and classification of 3D point clouds, we use the PointNeXt architecture based on PointNet, as well as modern methods of data augmentation in the process of learning. For geometry reconstruction, the Soft Rasterizer differentiable rendering method and the Total3Understanding neural network are considered.

References

Dhaval S. Critical review of mixed reality integration with medical devices for patientcare // International Journal for Innovative Research in Multidisciplinary Field. 2022. V. 8. Issue 1. https://doi.org/10.2015/IJIRMF/202201017
Maas M.J., Hughes J.M. Virtual, augmented and mixed reality in K-12 education: a review of the literature // Technology, Pedagogy and Education. 2020. V. 29. Issue 2. https://doi.org/10.1080/1475939X.2020.1737210
Evangelidis K., Sylaiou S., Papadopoulos T. Mergin’mode: Mixed reality and geoinformatics for monument demonstration // Applied Sciences. 2020. V. 10. № 11. P. 3826.
Piumsomboon T., Lee G.A., Hart J.D., Ens B., Lindeman R.W., Thomas B.H., Billinghurst M. Mini-me: An adaptive avatar for mixed reality remote collaboration / In Proceedings of the 2018 CHI conference on human factors in computing systems. 2018. P. 1–13.
Miedema N.A., Vermeer J., Lukosch S., Bidarra R. Superhuman sports in mixed reality: The multi-player game League of Lasers / In 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR). IEEE, 2019. P. 1819–1825.
Guna J., Gersak G., Humar I. Virtual Reality Sickness and Challenges Behind Different Technology and Content Settings // Mobile Networks and Applications. 2020. V. 25. P. 1436–1445. https://doi.org/10.1007/s11036-019-01373-w
Saredakis D., Szpak A., Birckhead B., Keage H.A., Rizzo A., Loetscher T. Factors associated with virtual reality sickness in head-mounted displays: a systematic review and meta-analysis // Frontiers in human neuroscience. 2020. V. 14. P. 96.
Moser T., Hohlagschwandtner M., Kormann-Hainzl G., Pölzlbauer S., Wolfartsberger J. Mixed reality applications in industry: challenges and research areas / In International Conference on Software Quality. Cham.: Springer, 2019. P. 95–105.
Pallot M., Fleury S., Poussard B., Richir S. What are the Challenges and Enabling Technologies to Implement the Do-It-Together Approach Enhanced by Social Media, its Benefits and Drawbacks? // Journal of Innovation Economics Management. 2022. I132-XLII.
Guo J., Weng D., Zhang Z., Liu Y., Duh H.B., Wang Y. Subjective and objective evaluation of visual fatigue caused by continuous and discontinuous use of HMDs // Journal of the Society for Information Display. 2019. V. 27, № 2. P. 108–119.
Armeni I., Sener O., Zamir A.R., Jiang H., Brilakis I., Fischer M., Savarese S. 3D semantic parsing of large-scale indoor spaces / In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. P. 1534–1543.
Dai A., Chang A.X., Savva M., Halber M., Funkhouser T., Nießner M. ScanNet: Richly-annotated 3D reconstructions of indoor scenes / In Proc. Computer Vision and Pattern Recognition (CVPR), IEEE, 2017.
Haoming L., Humphrey S. Deep Learning for 3D Point Cloud Understanding: A Survey // Computer Vision and Pattern Recognition. 2020. https://doi.org/10.48550/arXiv.2009.08920
Qian G., Li Y., Peng H., Mai J., Hammoud H.A., Elhoseiny M., Ghanem B. PointNeXt: Revisiting PointNet++ with Improved Training and Scaling Strategies. arXiv preprint arXiv:2206.04670. 2022.
Qian G., Hammoud H., Li G., Thabet A., Ghanem B. Assanet: An anisotropicseparable set abstraction for efficient point cloud representation learning // Advances in Neural Information Processing Systems (NeurIPS). 2021. P. 34.
Sandler M., Howard A., Zhu M., Zhmoginov A., Chen L.C. Mobilenetv2: Inverted residuals and linear bottlenecks / In Proceedings of the IEEE/CVF Conference on Computer Visionand Pattern Recognition (CVPR). 2018. P. 4510–4520.
He K., Zhang X., Ren S., Sun J. Deep residual learning for image recognition / In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2016. P. 770–778.
Li Y., Bu R., Sun M., Wu W., Di X., Chen B. Pointcnn: Convolution on X-transformed points // Advances in Neural Information Processing Systems (NeurIPS), 2018.
Li G., Muller M., Thabet A., Ghanem B. Deepgcns: Can gcns go as deep as cnns? / In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 2019. P. 9267–9276.
Loshchilov I., Hutter F. Decoupled weight decay regularization / In International Conference on Learning Representations (ICLR). 2019.
Diederik P. Kingma, Jimmy Ba. Adam: A method for stochastic optimization / In International Conference on Learning Representations (ICLR). 2015.
Szegedy C., Vanhoucke V., Ioffe S., Shlens J., Wojna Z. Rethinking theinception architecture for computer vision / In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2016.
Nie Y., Han X., Guo S., Zheng Y., Chang J., Zhang J.J. Total3dunderstanding: Joint layout, object pose and mesh reconstruction for indoor scenes from a single image / In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020. P. 55–64.
Kulikajevas A., Maskeliūnas R., Damaševičius R., Misra S. Reconstruction of 3D object shape using hybrid modular neural network architecture trained on 3D models from ShapeNetCore dataset // Sensors. 2019. V. 19. № 7. P. 1553.