CN-121981390-A - Regional industry structure intelligent analysis method based on enterprise big data mining

CN121981390A

Abstract

The invention discloses an intelligent analysis method for a regional industrial structure based on enterprise big data mining, relating to the technical field of enterprise big data mining and processing. The method comprises: obtaining enterprise registration addresses and records of activity occurrence places, and constructing a weighted registration point set and a weighted activity point set; calculating statistical parameters for each set to construct a registration standard deviation ellipse and an activity standard deviation ellipse; computing a double-ellipse hollowing fingerprint intensity from the area difference of the two ellipses, the difference in the distribution of activity points inside and outside the ellipse, and the center offset distance; screening against a statistical baseline to determine the significant hollowing industry set; and finally, according to the screening result, adaptively selecting the activity standard deviation ellipse or the registration standard deviation ellipse as the industry's spatial extent to generate a regional industry profile. The invention effectively quantifies the spatial mismatch between registration and operation and improves the accuracy of industry feature analysis.

Inventors

  • WANG FENGXU
  • ZOU WENFENG
  • ZHAN SHIHONG
  • XU JIANG
  • SHAO CHENGBAO

Assignees

  • 深圳前瞻资讯股份有限公司

Dates

Publication Date
2026-05-05
Application Date
2026-01-23

Claims (10)

  1. An intelligent analysis method for a regional industrial structure based on enterprise big data mining, characterized by comprising the following steps: S1, acquiring the enterprise registration addresses and activity occurrence records of each industry in a region, and encoding the addresses into plane coordinate points to form a registration point set and an activity point set; S2, calculating the weighted center and covariance matrix of the registration point set and of the activity point set, and combining them with a preset coverage multiple to construct a registration standard deviation ellipse and an activity standard deviation ellipse; S3, calculating the double-ellipse hollowing fingerprint intensity of the industry from the area difference between the registration standard deviation ellipse and the activity standard deviation ellipse, the difference in the distribution of activity points inside and outside the activity standard deviation ellipse, and the center offset distance between the two ellipses; S4, determining a significant hollowing industry set from the double-ellipse hollowing fingerprint intensity of each industry; S5, generating a regional industry profile according to whether each industry belongs to the significant hollowing industry set.
  2. The intelligent analysis method for a regional industrial structure based on enterprise big data mining according to claim 1, wherein forming a weighted registration point set and a weighted activity point set comprises: geocoding the enterprise registration addresses and the activity place records respectively to obtain plane coordinates, and converting them into the same plane coordinate system; converting each enterprise registration address into a registration point and assigning it a weight; converting each enterprise's activity place records into a set of activity points, computing an evenly apportioned weight from the number of that enterprise's activity points, and assigning it to each activity point; and collecting the registration points with their weights, and the activity points with their equal weights, for all enterprises in the industry to form the weighted registration point set and the weighted activity point set.
  3. The intelligent analysis method for a regional industrial structure based on enterprise big data mining according to claim 1, wherein calculating a weighted center and a weighted covariance matrix to construct a registration standard deviation ellipse and an activity standard deviation ellipse comprises: determining the weighted center as the weight-averaged coordinates of all points in the point set; calculating a weighted variance and a weighted covariance from the deviation of each point's coordinates from the weighted center, combined with the corresponding weights, so as to construct a weighted covariance matrix; the shape parameters of the registration standard deviation ellipse and the activity standard deviation ellipse being determined by their respective weighted covariance matrices, and their coverage being controlled by a preset coverage multiple.
  4. The intelligent analysis method for a regional industrial structure based on enterprise big data mining according to claim 1, wherein calculating the areas of the registration standard deviation ellipse and the activity standard deviation ellipse comprises: calculating the square root of the determinant of the weighted covariance matrix, and multiplying it by pi and by the square of the coverage multiple to obtain the area of the standard deviation ellipse; and performing this area calculation for each ellipse to obtain the registration standard deviation ellipse area and the activity standard deviation ellipse area.
  5. The intelligent analysis method for a regional industrial structure based on enterprise big data mining according to claim 1, wherein calculating the difference in the distribution of activity points inside and outside the activity standard deviation ellipse comprises: constructing a quadratic-form distance formula with the weighted center and the weighted covariance matrix of the activity standard deviation ellipse as references; calculating the quadratic-form distance from each activity point in the weighted activity point set to the center of the activity standard deviation ellipse; setting a core threshold and an outer threshold, wherein the core threshold is determined by the product of the coverage multiple and a preset scaling factor, and the outer threshold is determined by the coverage multiple; summing the weights of the activity points whose quadratic-form distance is less than or equal to the core threshold, and recording the sum as the core weighted count; and summing the weights of the activity points whose quadratic-form distance is greater than the core threshold and less than or equal to the outer threshold, and recording the sum as the outer ring weighted count.
  6. The intelligent analysis method for a regional industrial structure based on enterprise big data mining according to claim 1, wherein the center offset distance is the Euclidean distance between the weighted center of the activity standard deviation ellipse and the weighted center of the registration standard deviation ellipse.
  7. The intelligent analysis method for a regional industrial structure based on enterprise big data mining according to claim 5, wherein calculating the double-ellipse hollowing fingerprint intensity of the industry comprises: calculating the ratio of the activity standard deviation ellipse area to the registration standard deviation ellipse area as an expansion contrast term; calculating the ratio of the outer ring weighted count to the core weighted count as a hollowing contrast term; and multiplying the expansion contrast term, the hollowing contrast term and the center offset term to obtain the double-ellipse hollowing fingerprint intensity.
  8. The intelligent analysis method for a regional industrial structure based on enterprise big data mining according to claim 1, wherein determining a significant hollowing industry set from the double-ellipse hollowing fingerprint intensity of each industry comprises: calculating the median of the double-ellipse hollowing fingerprint intensities of all industries in the region, calculating the absolute value of the difference between each fingerprint intensity and that median, and taking the median of these absolute values as the median absolute deviation; constructing a discrimination threshold equal to the median of the double-ellipse hollowing fingerprint intensities plus a preset multiple of the median absolute deviation; and assigning every industry whose double-ellipse hollowing fingerprint intensity is greater than the discrimination threshold to the significant hollowing industry set.
  9. The intelligent analysis method for a regional industrial structure based on enterprise big data mining according to claim 1, wherein generating a regional industry profile according to whether each industry belongs to the significant hollowing industry set comprises: if the industry belongs to the significant hollowing industry set, selecting the activity standard deviation ellipse as the industry's spatial extent, taking the sum of the weights of the activity points falling within the activity standard deviation ellipse as the effective point count, and taking the activity standard deviation ellipse area as the effective area; if the industry does not belong to the significant hollowing industry set, selecting the registration standard deviation ellipse as the industry's spatial extent, taking the sum of the weights of the registration points falling within the registration standard deviation ellipse as the effective point count, and taking the registration standard deviation ellipse area as the effective area; and calculating the ratio of the effective point count to the effective area to obtain the industry intensity, and ranking the industries in the region by industry intensity to generate a regional industry profile list.
  10. An intelligent analysis system for a regional industrial structure based on enterprise big data mining, characterized by being applied to the intelligent analysis method for a regional industrial structure based on enterprise big data mining according to any one of claims 1 to 9.
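
The computations recited in claims 3 to 9 can be sketched in Python with NumPy. This is an illustrative reading of the claims, not the applicant's implementation, and every function and parameter name is hypothetical. Three assumptions go beyond the claim text: the quadratic-form distance is taken as the square root of the quadratic form, so that the thresholds k*s and k line up with the ellipse boundary; the covariance is normalised by the total weight; and, because the claims do not say how the center offset distance becomes the center offset term, the caller supplies that mapping as `offset_fn`.

```python
import numpy as np

def weighted_stats(points, weights):
    """Weighted center and weighted covariance matrix of 2-D points (claim 3)."""
    pts = np.asarray(points, dtype=float)
    w = np.asarray(weights, dtype=float)
    center = (w[:, None] * pts).sum(axis=0) / w.sum()
    dev = pts - center
    # Weighted sum of outer products of deviations, normalised by total weight (assumption).
    cov = (w[:, None, None] * dev[:, :, None] * dev[:, None, :]).sum(axis=0) / w.sum()
    return center, cov

def ellipse_area(cov, k):
    """Area = pi * k^2 * sqrt(det(cov)) for coverage multiple k (claim 4)."""
    return float(np.pi * k**2 * np.sqrt(np.linalg.det(cov)))

def ellipse_distance(points, center, cov):
    """Square root of the quadratic form dev^T cov^-1 dev (assumed reading of claim 5)."""
    dev = np.asarray(points, dtype=float) - center
    quad = np.einsum("ij,jk,ik->i", dev, np.linalg.inv(cov), dev)
    return np.sqrt(np.maximum(quad, 0.0))

def fingerprint_intensity(reg_pts, reg_w, act_pts, act_w, k, s, offset_fn):
    """Expansion term * hollowing term * center offset term (claims 5 to 7)."""
    reg_c, reg_cov = weighted_stats(reg_pts, reg_w)
    act_c, act_cov = weighted_stats(act_pts, act_w)
    expansion = ellipse_area(act_cov, k) / ellipse_area(reg_cov, k)
    d = ellipse_distance(act_pts, act_c, act_cov)
    w = np.asarray(act_w, dtype=float)
    core = w[d <= k * s].sum()              # core weighted count
    ring = w[(d > k * s) & (d <= k)].sum()  # outer ring weighted count
    offset = float(np.linalg.norm(act_c - reg_c))  # Euclidean center offset (claim 6)
    # offset_fn is a hypothetical hook: the claims leave the offset term unspecified.
    return float(expansion * (ring / core) * offset_fn(offset))

def significant_hollowing(intensities, m):
    """Industries whose intensity exceeds median + m * MAD (claim 8)."""
    values = np.array(list(intensities.values()), dtype=float)
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    return {name for name, v in intensities.items() if v > median + m * mad}

def industry_intensity(reg, act, is_hollow, k):
    """Weight inside the selected ellipse divided by its area (claim 9).

    reg and act are (points, weights) pairs; the activity ellipse is used
    for significant hollowing industries, the registration ellipse otherwise.
    """
    pts, w = act if is_hollow else reg
    center, cov = weighted_stats(pts, w)
    inside = ellipse_distance(pts, center, cov) <= k
    return float(np.asarray(w, dtype=float)[inside].sum() / ellipse_area(cov, k))
```

In use, `fingerprint_intensity` would be evaluated once per industry, the resulting dictionary passed to `significant_hollowing`, and `industry_intensity` then computed per industry with `is_hollow` set from membership in the returned set before ranking. The sketch does not handle degenerate inputs: collinear points give a singular covariance matrix, and an empty core gives a division by zero in the hollowing term.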

Description

Regional industry structure intelligent analysis method based on enterprise big data mining

Technical Field

The invention relates to the technical field of enterprise big data mining and processing, and in particular to an intelligent analysis method for a regional industrial structure based on enterprise big data mining.

Background

Regional industry feature analysis is an important support for decisions on industrial planning, commercial site selection, park positioning and industrial chain layout. Its core logic is to infer a region's dominant industry types, the extent of its industrial clusters and its industry intensity pattern by analyzing the number, spatial distribution and industry classification of enterprises from enterprise-related data. As the headquarters economy grows, group operation of enterprises becomes the norm and branches in other locations become increasingly common, the same industry often shows a differentiated spatial distribution within the same region: enterprise registration addresses are clearly clustered inward, while actual business activity spreads markedly outward and, in some cases, forms a ring-shaped distribution with a sparse core and a dense outer ring. This phenomenon poses a serious challenge to the accuracy of industry feature analysis. Existing regional industry feature analysis schemes generally take the enterprise registration address or registration information as the core basis for determining where an enterprise is located, and perform subsequent analysis, such as fitting the spatial extent of an industry, calculating the degree of agglomeration and outputting industry profiles, on the basis of that location information.
However, such schemes do not consider the difference in spatial distribution between where an enterprise is registered and where it actually operates. They directly equate the place of registration with the place where the industry actually occurs, and cannot distinguish the spatial characteristics of registration clustering from those of activity clustering. This readily biases the conclusions of industry feature analysis, which are often inconsistent and hard to interpret, and ultimately undermines the soundness of industrial planning, the accuracy of store site selection, the rationality of park positioning and the reliability of industrial chain layout decisions. Such schemes therefore cannot meet the demand for accurate analysis tools in support of high-quality regional industrial development.

Disclosure of Invention

The invention aims to overcome the defect that, in the prior art, regional industry analysis cannot effectively distinguish and quantify the spatial mismatch between enterprise registration places and actual operating places, so that hollowed-out industries with inward-clustered registration and outward-spread operation are difficult to identify accurately, industrial cluster areas are located inaccurately and industry intensity estimates are biased.
To solve the problems in the prior art, the invention adopts the following technical solution:

An intelligent analysis method for a regional industrial structure based on enterprise big data mining comprises the following steps:

S1, acquiring the registration addresses and activity occurrence places of the enterprises of each industry in a region, and encoding the addresses into plane coordinate points to form a registration point set and an activity point set;

S2, calculating the weighted center and covariance matrix of the registration point set and of the activity point set, and combining them with a preset coverage multiple to construct a registration standard deviation ellipse and an activity standard deviation ellipse;

S3, calculating the double-ellipse hollowing fingerprint intensity of the industry from the area difference between the registration standard deviation ellipse and the activity standard deviation ellipse, the difference in the distribution of activity points inside and outside the activity standard deviation ellipse, and the center offset distance between the two ellipses;

S4, determining a significant hollowing industry set from the double-ellipse hollowing fingerprint intensity of each industry;

S5, generating a regional industry profile according to whether each industry belongs to the significant hollowing industry set.

Preferably, forming a weighted registration point set and a weighted activity point set includes: geocoding the enterprise registration addresses and the activity place records respectively to obtain plane coordinates, and converting them into the same plane coordinate system; Converting the enterprise registration address into a r