CN-116383167-B - Method for solving insufficient disk space based on object storage


Abstract

The invention provides a method for solving the problem of insufficient disk space based on object storage. The method comprises the following steps: dynamically expanding the disks by adding new data nodes until the number of data nodes is odd; building an NFS system on physical machines; interconnecting HBase and NFS; and synchronizing data between the two NFS servers. The invention saves cost and avoids expensive physical machines, because old or cheap physical machines can host the NFS system. Data security remains high and is not reduced by adopting the method. Capacity can be expanded dynamically, and cold standby of the data is simple: data service can be provided externally as long as the data directories are kept consistent.

Inventors

ZHOU ZHENLEI
LI HUAJIAN
ZHANG YANFANG
SU JIANHUI
LI XIANYING

Assignees

爱信诺征信有限公司

Dates

Publication Date: 20260505
Application Date: 20221227

Claims (3)

1. A method for solving the problem of insufficient disk space based on object storage, characterized by comprising the following steps: S1, dynamically expanding the disk by adding data nodes until the number of data nodes is odd; S2, building an NFS system on a physical machine; S3, interconnecting HBase and NFS, wherein the process is as follows: first, the NFS is mounted on the data nodes on which HBase is deployed; second, the data is migrated to a directory address on the NFS and the address is updated in an HBase table; then the data is checked and verified; finally, a client obtains the NFS path through the HBase API and obtains the real data according to the obtained path; S4, the two NFS servers perform data synchronization.
2. The method for solving the problem of insufficient disk space based on object storage according to claim 1, wherein in step S1, the dynamic expansion process of the disk is as follows: S11, configuring the environment of the newly added machine as follows: S111, the operating system uses Cloudera Manager; S112, using RAID as the storage of the NameNode to protect metadata; S113, mounting the disks under different directories; S114, deploying the nodes in the same network segment; S115, modifying the host name and the host mapping; S116, setting the startup runlevel to 3; S117, starting the firewall and SELinux; S118, configuring the maximum number of open files for the system and the maximum number of processes per user; S119, configuring the NTP service to synchronize the clock of the newly added node with the clock of the cluster host; S1110, configuring SSH, including password-free SSH login with the host; S1111, installing the JDK and configuring environment variables; S1112, installing dependencies with yum; S1113, copying the MySQL driver package to the newly added node; S1114, creating the cloudera-scm user on the newly added node; S1115, creating a Parcel package distribution directory on the newly added node; S1116, disabling transparent huge page compaction and adding the command to the /etc/rc.local script; S1117, setting the swappiness parameter to 0; S1118, paying attention to Python script issues to avoid exceptions while Parcel packages are distributed; S12, copying the cloudera-manager directory of the existing nodes to the new nodes; S13, configuring and starting the Cloudera Manager Agent as follows: S131, creating the run directory of the agent on all nodes; S132, copying the Cloudera Manager Agent startup script to the /etc/init.d/ directory; S133, configuring and starting the Cloudera Manager Agent; S14, adding the new machine and adding services on the Cloudera Manager web management page of the host node; S15, rebalancing after the new cluster has stabilized.
3. The method of claim 1, wherein in step S4, the two NFS servers serve as cold standbys for each other and perform data synchronization.

Description

Method for solving insufficient disk space based on object storage

Technical Field

The invention belongs to the technical field of storage, and particularly relates to a method for solving the problem of insufficient disk space based on object storage.

Background

In the invention patent with application number 2018107335697, a Hadoop file system manages the objects to be stored through a data protocol module interface, so that both Hadoop and an object storage service are supported on CephFS, and a user can access the data stored by the object storage service through the data protocol module interface. That object storage approach focuses mainly on the integration between CephFS and Hadoop. A Hadoop cluster can ensure the safety of data, but it inevitably incurs data redundancy, so the problem of insufficient disk space still needs to be addressed.

Disclosure of Invention

In view of the foregoing, the present invention aims to overcome the above drawbacks of the prior art and proposes a method for solving the problem of insufficient disk space based on object storage.

In order to achieve the above purpose, the technical scheme of the invention is realized as follows:

The method for solving the problem of insufficient disk space based on object storage comprises the following steps: S1, dynamically expanding the disk by adding data nodes until the number of data nodes is odd; S2, building an NFS system on a physical machine; S3, interconnecting HBase and NFS; S4, the two NFS servers perform data synchronization.
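Steps S3 and S4 can be pictured as a pointer pattern: the bulk data lives on the NFS mount, and the HBase table only records where each object is stored. The following Python sketch illustrates that idea under stated assumptions and is not the patented implementation. `PathTable` is an in-memory stand-in for the HBase table, local directories stand in for the two NFS mounts, and every function name is hypothetical. Storing paths relative to the NFS root is also an assumption, made so that either NFS server can serve a request as long as the directory layout is kept consistent.

```python
import hashlib
import shutil
from pathlib import Path


class PathTable:
    """In-memory stand-in for the HBase table mapping row keys to NFS paths."""

    def __init__(self):
        self._rows = {}

    def put(self, row_key: str, relative_path: str) -> None:
        self._rows[row_key] = relative_path

    def get(self, row_key: str) -> str:
        return self._rows[row_key]


def sha256_of(path: Path) -> str:
    """Checksum used for the check-and-verify step."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def migrate_to_nfs(row_key: str, source: Path, nfs_root: Path, table: PathTable) -> None:
    """Migrate one object to the NFS directory, record its path, then verify it."""
    target = nfs_root / source.name
    shutil.copy2(source, target)  # migrate the data to the NFS directory
    table.put(row_key, target.relative_to(nfs_root).as_posix())  # update the table
    if sha256_of(source) != sha256_of(target):  # check and verify the data
        raise ValueError(f"checksum mismatch for row {row_key!r}")


def read_object(row_key: str, nfs_root: Path, table: PathTable) -> bytes:
    """Client side: look up the path in the table, then read the real data."""
    return (nfs_root / table.get(row_key)).read_bytes()


def sync_to_standby(primary_root: Path, standby_root: Path) -> None:
    """Copy the primary directory tree to the cold standby (requires Python 3.8+)."""
    shutil.copytree(primary_root, standby_root, dirs_exist_ok=True)
```

In a real deployment the table lookups would go through an HBase client and the copy in `sync_to_standby` would run between the two NFS servers; both are replaced here by local stand-ins so that the flow can be followed end to end.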
Further, in step S1, the dynamic capacity expansion process of the disk is as follows: S11, configuring the environment of the newly added machine as follows: S111, the operating system uses Cloudera Manager; S112, using RAID as the storage of the NameNode to protect metadata; S113, mounting the disks under different directories; S114, deploying the nodes in the same network segment; S115, modifying the host name and the host mapping; S116, setting the startup runlevel to 3; S117, starting the firewall and SELinux; S118, configuring the maximum number of open files for the system and the maximum number of processes per user; S119, configuring the NTP service to synchronize the clock of the newly added node with the clock of the cluster host; S1110, configuring SSH, including password-free SSH login with the host; S1111, installing the JDK and configuring environment variables; S1112, installing dependencies with yum; S1113, copying the MySQL driver package to the newly added node; S1114, creating the cloudera-scm user on the newly added node; S1115, creating a Parcel package distribution directory on the newly added node; S1116, disabling transparent huge page compaction and adding the command to the /etc/rc.local script; S1117, setting the swappiness parameter to 0; S1118, paying attention to Python script issues to avoid exceptions while Parcel packages are distributed; S12, copying the cloudera-manager directory of the existing nodes to the new nodes; S13, configuring and starting the Cloudera Manager Agent as follows: S131, creating the run directory of the agent on all nodes; S132, copying the Cloudera Manager Agent startup script to the /etc/init.d/ directory; S133, configuring and starting the Cloudera Manager Agent; S14, adding the new machine and adding services on the Cloudera Manager web management page of the host node; S15, rebalancing after the new cluster has stabilized.
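Two parts of step S1 lend themselves to a small pre-flight helper: deciding how many data nodes to add so that the total ends up odd, and checking a new node's settings against the values named in steps S116, S1117, S119 and S1110. The Python sketch below is illustrative only. The setting keys, the `EXPECTED` table and both function names are hypothetical, and collecting the observed values from a real machine is left out.

```python
from typing import Dict, List

# Expected values drawn from steps S116, S1117, S119 and S1110.
# The key names are hypothetical labels chosen for this sketch.
EXPECTED = {
    "runlevel": 3,             # S116: startup runlevel 3
    "swappiness": 0,           # S1117: swappiness set to 0
    "ntp_synced": True,        # S119: clock synchronized through NTP
    "passwordless_ssh": True,  # S1110: password-free SSH login
}


def nodes_to_add(current_nodes: int, wanted_new: int = 1) -> int:
    """Return the smallest count >= wanted_new that makes the node total odd."""
    if current_nodes < 0 or wanted_new < 1:
        raise ValueError("current_nodes must be >= 0 and wanted_new >= 1")
    total = current_nodes + wanted_new
    return wanted_new if total % 2 == 1 else wanted_new + 1


def failed_checks(observed: Dict[str, object]) -> List[str]:
    """List the settings whose observed value differs from the expected one."""
    return [key for key, value in EXPECTED.items() if observed.get(key) != value]
```

For example, with four existing data nodes `nodes_to_add(4)` returns 1, while with five it returns 2, because adding a single node would leave an even total. A node is ready for step S12 in this sketch only when `failed_checks` returns an empty list.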
Further, in step S3, the process of interconnecting HBase and NFS is as follows: first, the NFS is mounted on the data nodes on which HBase is deployed; second, the data is migrated to a directory address on the NFS and the address is updated in an HBase table; then the data is checked and verified; finally, a client obtains the NFS path through the HBase API and obtains the real data according to the obtained path.

Further, in step S4, the two NFS servers serve as cold standbys for each other and perform data synchronization.

Compared with the prior art, the invention has the following advantages: the method for solving the problem of insufficient disk space based on object storage saves cost and avoids expensive physical machines, because old or cheap physical machines can host the NFS system. Data security remains high and is not reduced by adopting the method. Capacity can be expanded dynamically, and cold standby of the data is simple: data service can be provided externally as long as the data directories are kept consistent.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention. In the drawings: FIG. 1 is a diagram of a service architecture for object storage according