Café’s Performance Modeling with Spatial Data

Abstract

The study is relevant because location choice strongly affects an organization's economic performance, and interest in using spatial data in decision support systems has grown in recent years. Its main purpose is to model how important spatial features affect a café's turnover and to use them for turnover prediction. The article reviews several approaches that combine spatial data with machine learning to solve the placement problem. A correlation analysis of the spatial data was carried out. Multistage feature selection produced two feature sets, each suited to a different type of model. Hyperparameters were optimized for the selected modeling methods (linear regression, decision tree, random forest, gradient boosting), and the models were built. The main tools are the Python programming language and its libraries pandas, sklearn, XGBoost, hyperopt, shap, and boostaroota. The results were analyzed, and the gradient boosting model was identified as optimal in terms of accuracy and interpretability. The outcome of the work is an approach to modeling a company's economic performance with machine learning based on spatial data.
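
The correlation analysis step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic, the feature names and the `screen_features` helper are hypothetical, and plain Python is used instead of pandas to keep the example self-contained. The 0.4 cutoff follows the threshold named in the caption of Fig. 1.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def screen_features(features, target, threshold=0.4):
    """Return {name: r} for features with |r| > threshold, strongest first."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    kept = {name: r for name, r in scores.items() if abs(r) > threshold}
    return dict(sorted(kept.items(), key=lambda item: -abs(item[1])))


# Synthetic stand-in data for 200 hypothetical café locations (assumption).
rng = random.Random(0)
n_sites = 200
pedestrian_traffic_140m = [rng.uniform(100, 2000) for _ in range(n_sites)]
price_per_sqm_300m = [rng.uniform(150, 600) for _ in range(n_sites)]
noise_feature = [rng.gauss(0, 1) for _ in range(n_sites)]

# Hypothetical turnover: driven by traffic and prices plus random noise.
turnover = [
    0.5 * traffic + 2.0 * price + rng.gauss(0, 150)
    for traffic, price in zip(pedestrian_traffic_140m, price_per_sqm_300m)
]

features = {
    "pedestrian_traffic_140m": pedestrian_traffic_140m,
    "price_per_sqm_300m": price_per_sqm_300m,
    "noise_feature": noise_feature,
}

selected = screen_features(features, turnover)
for name, r in selected.items():
    print(f"{name}: r = {r:+.2f}")
```

A screen like this only captures linear, one-feature-at-a-time relationships, which is one reason a pipeline of this kind would follow it with further feature selection and tree-based models.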

About the authors

Ivan Ivanov

LLC «BST Digital»

Email: ivanzivanov@yandex.ru
ORCID iD: 0009-0007-7496-3212

Head

Russian Federation, Moscow

Nailia Abliazina

The Russian Presidential Academy of National Economy and Public Administration

Email: nellykluchkovskaya@gmail.com
ORCID iD: 0009-0007-2208-3782
SPIN code: 1145-0772

the EMIT Institute

Russian Federation, Moscow

Natalia Grineva

Financial University under the Government of the Russian Federation

Corresponding author.
Email: ngrineva@fa.ru
ORCID iD: 0000-0001-7647-5967
SPIN code: 1140-9636

Cand. Sci. (Econ.), Associate Professor, Associate Professor of the Department of Data Analysis and Machine Learning

Russian Federation, Moscow


Supplementary files

Fig. 1. Correlation coefficients of the target variable with the factors whose correlation coefficient exceeds 0.4 in absolute value
Fig. 2. Scatter plot of "Total mobile traffic in the average income group within a 700 m radius" against the target variable
Fig. 3. Scatter plot of "Pedestrian traffic within a 140 m radius" against the target variable
Fig. 4. Scatter plot of "Average price per square meter within a 300 m radius" against the target variable
Fig. 5. Scatter plot of "Rating of customer activity in the 'Cosmetics' category within a 500 m radius" against the target variable
Fig. 6. Scatter plot of "Total number of objects in the 'Universities' category within a 500 m radius" against the target variable
Fig. 7. Scatter plot of "Average number of objects in the 'Pickup points' category within a 5 m radius" against the target variable
Fig. 8. Scatter plot of "Morning automobile traffic of workers within a 300 m radius" against the target variable
Fig. 9. Decision tree model prediction algorithm
Fig. 10. Random forest model interpretation
Fig. 11. Gradient boosting model interpretation

