
CN-121979829-A - Method and system for quickly constructing server cluster


Abstract

The invention discloses a method and a system for quickly constructing a server cluster. The method comprises: infrastructure as code, in which a declarative configuration language and an automated deployment tool create and configure resources automatically; containerized service deployment, in which a container orchestration platform and an autoscaler dynamically adjust service instances; service templating and orchestration, in which application components are quickly instantiated from structured templates; dynamic configuration of permission policies; task scheduling and resource reservation, in which a reinforcement learning scheduler combined with a resource reservation mechanism achieves fine-grained scheduling; intelligent monitoring and self-healing, in which multidimensional data is collected and a prediction model and root cause analysis drive anomaly early warning and fault self-healing; and data service gateway integration. The invention achieves minute-level cluster delivery, fine-grained resource scheduling, dynamic permission control, intelligent operation and maintenance, and unified cross-environment management, and significantly improves deployment efficiency, resource utilization and system availability.
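
The dynamic adjustment of service instances mentioned in the abstract can be illustrated with the replica calculation that the Kubernetes Horizontal Pod Autoscaler documents: desired replicas = ceil(current replicas × current metric / target metric), with scaling skipped inside a small tolerance band. The patent text does not state this formula; the sketch below is a minimal illustration under that assumption, and the function name, the replica bounds and the example metric values are hypothetical.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,      # hypothetical lower bound
    max_replicas: int = 10,     # hypothetical upper bound
    tolerance: float = 0.1,     # Kubernetes default tolerance band
) -> int:
    """HPA-style replica count: ceil(current * currentMetric / targetMetric)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric

    # Inside the tolerance band the replica count is left unchanged,
    # which avoids flapping on small metric fluctuations.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)

    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 instances at 90% average CPU against a 60% target scale out to 6.
    print(desired_replicas(4, 90.0, 60.0))
```

The same function covers memory usage or a custom QPS metric, since only the ratio of the observed value to the target value enters the calculation.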

Inventors

  • ZOU XIAOQI
  • ZOU BING
  • XIANG ZHENXUAN

Assignees

  • 杭州中谦科技有限公司

Dates

Publication Date
2026-05-05
Application Date
2026-01-26

Claims (10)

  1. A method for quickly building a server cluster, characterized by comprising the following steps:
     Step 1, infrastructure as code: writing an infrastructure definition script in a declarative configuration language, managing the script in a version control system, and executing it with an automated deployment tool to create and configure resources automatically;
     Step 2, containerized service deployment: packaging application services into container images, deploying and managing them through a container orchestration platform, and configuring an autoscaler to dynamically adjust service instances;
     Step 3, service templating and orchestration: defining application components, resource configuration, dependencies and health checks in a structured template, and quickly instantiating the application through an intelligent service orchestration engine;
     Step 4, dynamic configuration of permission policies: integrating a context-aware policy engine that dynamically adjusts access permissions based on multidimensional signals such as user identity, time, source and risk score;
     Step 5, task scheduling and resource reservation: using a reinforcement-learning-based task scheduler combined with a resource reservation and preemption mechanism to achieve task-level fine-grained scheduling;
     Step 6, intelligent monitoring and self-healing: collecting multidimensional operation and maintenance data, issuing anomaly early warnings with a prediction model, locating faults with a root cause analysis model, and triggering a self-healing script to execute repair actions;
     Step 7, data service gateway integration: providing a unified API (application programming interface) entry point, and implementing security control, request routing and data format conversion.
  2. The method for quickly building a server cluster according to claim 1, wherein in the infrastructure-as-code step, the declarative configuration language comprises HCL or YAML, and the automated deployment tool comprises Terraform, Ansible or Pulumi.
  3. The method for quickly building a server cluster according to claim 1, wherein in the containerized service deployment step, the container orchestration platform is Kubernetes, the autoscaler is the Horizontal Pod Autoscaler, and scale-out and scale-in are performed according to CPU utilization, memory usage, or a custom QPS metric.
  4. The method for quickly building a server cluster according to claim 1, wherein in the service templating and orchestration step, the template is in YAML or JSON format, and the intelligent service orchestration engine supports dependency resolution, service discovery and health check initialization.
  5. The method for quickly building a server cluster according to claim 1, wherein in the step of dynamically configuring permission policies, the policy engine uses blockchain technology to record audit logs in a tamper-proof manner.
  6. The method for quickly building a server cluster according to claim 1, wherein in the task scheduling and resource reservation step, the reinforcement learning scheduler adopts the DQN or PPO algorithm, and the reward function jointly considers the task completion rate, the average waiting time and the resource fragmentation rate.
  7. The method for quickly building a server cluster according to claim 1, wherein in the intelligent monitoring and self-healing step, the prediction model is LSTM or ARIMA, the root cause analysis model is a Bayesian network, and the self-healing script includes instance restart, resource scale-out, Pod migration or fault isolation.
  8. The method for quickly building a server cluster according to claim 1, wherein in the data service gateway integration step, RESTful or GraphQL APIs are supported, API Key, OAuth 2.0 or JWT authentication is integrated, and rate limiting and circuit breaking policies are implemented.
  9. The method for quickly building a server cluster according to claim 1, wherein the method further supports hybrid cloud, multi-cloud and edge computing scenarios, and achieves seamless scheduling and unified management of heterogeneous resources through a unified abstraction layer.
  10. A system for implementing the method for quickly building a server cluster according to any one of claims 1-9, comprising:
      an infrastructure-as-code module, used for writing, version management and automated deployment of infrastructure definition scripts;
      a containerized service deployment module, used for container image building, orchestration and automatic scaling;
      a service templating and orchestration module, used for parsing application templates, managing dependencies and instantiating services;
      a permission policy dynamic configuration module, used for context-aware policy evaluation and dynamic permission adjustment;
      a task scheduling and resource reservation module, used for reinforcement-learning-based task scheduling and resource management;
      an intelligent monitoring and self-healing module, used for data collection, anomaly prediction, root cause analysis and self-healing execution;
      and a data service gateway module, used for unified API exposure, security control and request routing.
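
Claim 6 names three quantities for the scheduler's reward function (task completion rate, average waiting time and resource fragmentation rate) but the claims do not give a formula. The following is a minimal sketch of one possible scalar reward, assuming a weighted linear combination; the weights, the `max_wait_s` normalization constant and the fragmentation definition (one minus the share of free capacity held by the largest free block) are all hypothetical choices, not taken from the patent.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weights; the patent does not specify any values."""
    completion: float = 1.0
    waiting: float = 0.5
    fragmentation: float = 0.5


def fragmentation_rate(free_per_node: Sequence[float]) -> float:
    """Assumed definition: 1 - (largest free block / total free capacity).

    0.0 means all free capacity sits on one node; values near 1.0 mean
    the free capacity is scattered in small pieces across many nodes.
    """
    total = sum(free_per_node)
    if total <= 0:
        return 0.0
    return 1.0 - max(free_per_node) / total


def scheduling_reward(
    completed: int,
    submitted: int,
    avg_wait_s: float,
    free_per_node: Sequence[float],
    max_wait_s: float = 600.0,  # hypothetical normalization constant
    weights: RewardWeights = RewardWeights(),
) -> float:
    """Reward = w_c * completion - w_w * normalized wait - w_f * fragmentation."""
    completion_rate = completed / submitted if submitted else 0.0
    wait_penalty = min(avg_wait_s / max_wait_s, 1.0)  # clipped to [0, 1]
    frag = fragmentation_rate(free_per_node)
    return (
        weights.completion * completion_rate
        - weights.waiting * wait_penalty
        - weights.fragmentation * frag
    )


if __name__ == "__main__":
    compact = scheduling_reward(8, 10, 60.0, [4.0, 0.0])
    scattered = scheduling_reward(8, 10, 60.0, [2.0, 2.0])
    print(compact > scattered)  # the compact placement earns the higher reward
```

A DQN or PPO agent would receive a value like this after each scheduling decision or episode; keeping every term in the range [0, 1] before weighting makes the relative weights easier to tune.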

Description

Method and system for quickly constructing server cluster

Technical Field

The invention relates to the technical field of cloud computing and distributed computing, and in particular to a method and a system for quickly constructing a server cluster.

Background

With the acceleration of global digitization and intelligentization, the information systems of enterprises and organizations face multiple challenges: rapidly growing traffic, stricter real-time requirements for service response, rising architectural complexity, and increasing cost-control pressure. In this context, an agile, elastic, intelligent and secure IT infrastructure is critical to supporting business innovation.

Conventional approaches to building and managing server clusters, such as static allocation based on physical machines, manual management of virtualized resource pools, or static configuration relying on early automation tools (e.g. Puppet, Chef), generally have the following problems:

  • Low deployment efficiency: from resource application and environment configuration to service launch, the cycle takes days or even weeks, which makes it difficult to keep up with rapid business change.
  • Coarse resource scheduling: resources are allocated mainly in units of virtual machines or physical machines, so the granularity is coarse and precise container-level or task-level scheduling cannot be achieved, which leads to low resource utilization.
  • Passive operation and maintenance: monitoring systems rely mainly on threshold alarms and lack predictive capability, fault handling depends on manual investigation and intervention, the mean time to repair is long, and system availability is insufficiently guaranteed.
  • Rigid security policy: permission management is based on static roles and cannot sense or respond to dynamic factors such as user behavior and access context, so it adapts poorly to insider threats or complex attacks.
  • Poor environment compatibility: solutions are usually designed for the data center environment, are difficult to extend seamlessly to public cloud, private cloud or edge nodes, and cannot achieve true unified cross-environment management.

In the prior art, patent document CN111752539B discloses a BI service cluster system and a method for building it, in which user permissions are managed by a tenant system, reducing the user management burden on the BI tool. However, that technical scheme still has limitations: permission management is based on static roles and lacks context awareness and dynamic adjustment; resource scheduling is performed coarsely in units of tenants; support for infrastructure as code and modern container orchestration is missing, so deployment efficiency is low; and no intelligent operation and maintenance or self-healing mechanism is integrated, so operation and maintenance costs are high. The scheme is essentially a centralized management system oriented to a specific application scenario, and it is difficult for it to meet the general requirements of rapid elasticity, intelligent operation and maintenance, and fine-grained security control in diverse scenarios such as big data analysis, the Internet of Things and microservice architectures.

Therefore, there is a strong need in the art for an integrated cluster building method and system that achieves minute-level delivery, fine-grained resource scheduling, dynamically adaptive security, and intelligent predictive operation and maintenance, so as to solve the above problems systematically.
Disclosure of Invention

The primary aim of the invention is to provide a method and a system for quickly constructing a server cluster that achieve minute-level cluster delivery, fine-grained resource scheduling, dynamic permission control, intelligent operation and maintenance, and unified cross-environment management, and that fundamentally solve the problems of the prior art, such as low resource utilization, poor elasticity, rigid permissions, complex deployment, and passive operation and maintenance.

To achieve the above aim, the invention provides a method for quickly building a server cluster, comprising the following steps: Step 1, infrastructure as code: writing an infrastructure definition script in a declarative configuration language, managing the script in a version control system, and executing it with an automated deployment tool to create and configure resources automatically; Step 2, containerized service deployment: packaging application services into container images, deploying and managing them through a container orchestration platform, and configuring an autoscaler to dynamically adjust service instances; Step 3, service tem