US-12619353-B2 - Shifting an encoded data slice subset for smart rebuilding

Abstract

A method includes generating a second encoded data slice of a second subset of encoded data slices of a set of encoded data slices, where the second subset of encoded data slices is not currently stored in a set of storage units of the storage network, where the set of encoded data slices include a first subset of encoded data slices that is stored in the set of storage units and includes at least a decode threshold number of encoded data slices of the set of encoded data slices, and where a first encoded data slice of the first subset requires rebuilding. The method further includes sending the second encoded data slice to the set of storage units for storage therein, where when the second encoded data slice is stored, the second encoded data slice is no longer included in the second subset of encoded data slices.
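
The subset bookkeeping described in the abstract can be illustrated with a minimal Python sketch. The class name `SliceSetState`, its fields, and the use of integer slice indices are hypothetical and not taken from the patent; the sketch only tracks which slices are stored (the first subset) and which are not (the second subset), and it assumes that a slice joins the first subset once it is stored.

```python
from dataclasses import dataclass, field


@dataclass
class SliceSetState:
    """Hypothetical bookkeeping for one set of encoded data slices."""
    pillar_width: int       # total number of slices in the set
    decode_threshold: int   # slices needed to recover the data segment
    stored: set = field(default_factory=set)         # first subset (slice indices)
    needs_rebuild: set = field(default_factory=set)  # stored slices flagged for rebuilding

    def not_stored(self) -> set:
        """Second subset: slices of the set that are not currently stored."""
        return set(range(self.pillar_width)) - self.stored

    def can_generate(self) -> bool:
        """True when a stored slice needs rebuilding, the first subset holds
        at least a decode threshold number of slices, and an unstored slice exists."""
        return (bool(self.needs_rebuild)
                and len(self.stored) >= self.decode_threshold
                and bool(self.not_stored()))

    def mark_stored(self, index: int) -> None:
        """Record that a slice from the second subset has been stored."""
        if index not in self.not_stored():
            raise ValueError(f"slice {index} is not in the second subset")
        self.stored.add(index)  # assumption: it now counts toward the first subset


state = SliceSetState(pillar_width=8, decode_threshold=5,
                      stored={0, 1, 2, 3, 4, 5}, needs_rebuild={2})
second_subset_size = len(state.not_stored())  # pillar width minus first-subset size
candidate = min(state.not_stored())           # any unstored slice would do
if state.can_generate():
    state.mark_stored(candidate)
```

In this example the second subset starts with two members (8 minus 6) and shrinks to one after slice 6 is stored. How the slice flagged for rebuilding is handled afterwards is left open here because the abstract does not state it.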

Inventors

Jason K. Resch
Greg R. Dhuse

Assignees

PURE STORAGE, INC.

Dates

Publication Date: 20260505
Application Date: 20240903

Claims (20)

1. A method for execution by one or more computing devices within a storage network, the method comprises: generating a second encoded data slice of a second subset of encoded data slices of a set of encoded data slices, wherein the second subset of encoded data slices is not currently stored in a set of storage units of the storage network, wherein the set of encoded data slices include a first subset of encoded data slices that is stored in the set of storage units and includes at least a decode threshold number of encoded data slices of the set of encoded data slices, and wherein a first encoded data slice of the first subset requires rebuilding; and sending the second encoded data slice to the set of storage units for storage therein, wherein when the second encoded data slice is stored, the second encoded data slice is no longer included in the second subset of encoded data slices.
2. The method of claim 1 further comprises: identifying the second encoded data slice based on identifying a storage unit of the set of storage units that is mapped to store a respective encoded data slice of the second subset of encoded data slices.
3. The method of claim 2, wherein the identifying the storage unit is based on determining the storage unit has been restored within a time period.
4. The method of claim 2, wherein the identifying the storage unit is based on determining the storage unit has been upgraded within a time period.
5. The method of claim 2, wherein the identifying the storage unit is based on determining the storage unit has not been used previously to store an encoded data slice of the set of encoded data slices.
6. The method of claim 2, wherein the identifying the storage unit is based on determining the storage unit is within a performance range of other storage units in the set of storage units.
7. The method of claim 2, wherein the identifying the storage unit is based on determining the storage unit has not been used previously for storing any of the first subset of encoded data slices of the set of encoded data slices.
8. The method of claim 1 further comprises: error encoding a data segment of data to produce the first subset of encoded data slices; and sending the first subset of encoded data slices to the set of storage units for storage therein.
9. The method of claim 8, wherein the error encoding the data segment is in accordance with error encoding parameters, and wherein the error encoding parameters include the decode threshold number and a pillar width number.
10. The method of claim 9, wherein the set of encoded data slices includes the pillar width number.
11. The method of claim 10, wherein a number of the second subset of encoded data slices comprises: the pillar width number minus a number of the first subset of encoded data slices.
12. The method of claim 1, wherein the generating the second encoded data slice comprises: retrieving the decode threshold number of encoded data slices of the first subset of encoded data slices; error decoding the decode threshold number of encoded data slices to reconstruct a data segment associated with the set of encoded data slices; and error encoding at least a portion of the reconstructed data segment to produce the second encoded data slice.
13. The method of claim 12, wherein the error encoding comprises: arranging the reconstructed data segment into a data matrix; obtaining a row of an encoding matrix that corresponds to the second encoded data slice of the set of encoded data slices; and matrix multiplying the selected row of the encoding matrix with the data matrix to produce the second encoded data slice.
14. The method of claim 13 further comprises: identifying a third encoded data slice of the second subset of encoded data slices; and generating the third encoded data slice from the first subset of encoded data slices.
15. The method of claim 14, wherein the generating the third encoded data slice comprises: error encoding the reconstructed data segment to produce the third encoded data slice.
16. The method of claim 15, wherein the error encoding the reconstructed data segment to produce the third encoded data slice comprises: obtaining a second row of the encoding matrix that corresponds to the third encoded data slice of the set of encoded data slices; and matrix multiplying the second row of the encoding matrix with the data matrix to produce the third encoded data slice.
17. The method of claim 1, wherein determining the first subset of encoded data slices comprises: receiving, from storage units of the set of storage units, favorable listing responses to a listing request for the first subset of encoded data slices, wherein a first favorable listing response of the favorable listing responses indicates a corresponding storage unit is storing a corresponding encoded data slice of the first subset of encoded data slices.
18. The method of claim 17 further comprises: determining an additional encoded data slice of the set of encoded data slices is stored in the set of storage units, wherein the additional encoded data slice was not included in the favorable listing responses; and updating the first subset of encoded data slices based on the additional encoded data slice.
19. The method of claim 1, wherein determining a number of encoded data slices within the first subset of encoded data slices comprises: obtaining the number by performing a lookup in a lookup table.
20. A computing device of a storage network, the computing device comprises: memory; an interface; and at least one processing module operably coupled to the memory and the interface, wherein the at least one processing module is operable to: generate a second encoded data slice of a second subset of encoded data slices of a set of encoded data slices, wherein the second subset of encoded data slices is not currently stored in a set of storage units of the storage network, wherein the set of encoded data slices include a first subset of encoded data slices that is stored in the set of storage units and includes at least a decode threshold number of encoded data slices of the set of encoded data slices, and wherein a first encoded data slice of the first subset requires rebuilding; and send, via the interface, the second encoded data slice to the set of storage units for storage therein, wherein when the second encoded data slice is stored, the second encoded data slice is no longer included in the second subset of encoded data slices.
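
The generation steps recited in claims 12, 13, and 16 (retrieve a decode threshold number of slices, decode them to reconstruct the data segment, arrange it into a data matrix, and multiply a single encoding-matrix row with that matrix) can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the prime field of size 257, the Vandermonde encoding matrix, the zero padding, and every function name are hypothetical choices made only to keep the example small and runnable. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
P = 257  # small prime field, assumed only for illustration


def encoding_row(index, k):
    """Row of an assumed Vandermonde encoding matrix for slice `index`."""
    return [pow(index + 1, j, P) for j in range(k)]


def to_data_matrix(segment, k):
    """Arrange a data segment (bytes) into k rows, zero-padding the tail."""
    cols = -(-len(segment) // k)  # ceiling division
    padded = list(segment) + [0] * (k * cols - len(segment))
    return [padded[r * cols:(r + 1) * cols] for r in range(k)]


def encode_slice(row, data_matrix):
    """Matrix-multiply one encoding-matrix row with the data matrix."""
    return [sum(r * d for r, d in zip(row, col)) % P
            for col in zip(*data_matrix)]


def decode(slices, k):
    """Recover the data matrix from k slices given as {index: slice}."""
    chosen = sorted(slices)[:k]
    if len(chosen) < k:
        raise ValueError("fewer than a decode threshold number of slices")
    # Augmented matrix [encoding rows | slice values], reduced by
    # Gauss-Jordan elimination modulo P.
    aug = [encoding_row(i, k) + list(slices[i]) for i in chosen]
    for c in range(k):
        pivot = next(r for r in range(c, k) if aug[r][c])
        aug[c], aug[pivot] = aug[pivot], aug[c]
        inv = pow(aug[c][c], -1, P)
        aug[c] = [v * inv % P for v in aug[c]]
        for r in range(k):
            if r != c and aug[r][c]:
                f = aug[r][c]
                aug[r] = [(a - f * b) % P for a, b in zip(aug[r], aug[c])]
    return [row[k:] for row in aug]


k = 3                                   # decode threshold number
data = to_data_matrix(b"dispersed storage", k)
# First subset: slices 0..3 are stored; slice 1 is the one needing rebuilding.
stored = {i: encode_slice(encoding_row(i, k), data) for i in range(4)}
# Retrieve a decode threshold number of the other stored slices and decode.
recovered = decode({i: stored[i] for i in (0, 2, 3)}, k)
# Generate only the required slices of the second subset, one row each.
second_slice = encode_slice(encoding_row(4, k), recovered)
third_slice = encode_slice(encoding_row(5, k), recovered)
```

Because each new slice depends on only one row of the encoding matrix, the reconstructed data matrix is reused for slices 4 and 5 without recomputing any other slice of the set.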

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

The present U.S. Utility Patent Application claims priority pursuant to 35 U.S.C. § 120 as a continuation of U.S. Utility application Ser. No. 17/881,667, entitled “SMART REBUILDING OF AN ENCODED DATA SLICE,” filed Aug. 5, 2022, allowed, which is a continuation of U.S. Utility application Ser. No. 17/248,885, entitled “EFFICIENT REBUILDING OF AN ENCODED DATA SLICE,” filed Feb. 11, 2021, issued as U.S. Pat. No. 11,543,964 on Jan. 3, 2023, which is a continuation of U.S. Utility application Ser. No. 16/396,399, entitled “EFFICIENT COMPUTATION OF ONLY THE REQUIRED SLICES,” filed Apr. 26, 2019, abandoned, which is a continuation-in-part of U.S. Utility application Ser. No. 15/405,004, entitled “MAPPING STORAGE OF DATA IN A DISPERSED STORAGE NETWORK,” filed Jan. 12, 2017, issued as U.S. Pat. No. 10,324,623 on Jun. 18, 2019, which is a continuation of U.S. Utility application Ser. No. 14/088,897, entitled “MAPPING STORAGE OF DATA IN A DISPERSED STORAGE NETWORK,” filed Nov. 25, 2013, issued as U.S. Pat. No. 9,558,067 on Jan. 31, 2017, which claims priority pursuant to 35 U.S.C. § 119(e) to U.S. Provisional Application No. 61/748,916, entitled “UTILIZING A HIERARCHICAL REGION HEADER OBJECT STRUCTURE FOR DATA STORAGE,” filed Jan. 4, 2013, all of which are hereby incorporated herein by reference in their entirety and made part of the present U.S. Utility Patent Application for all purposes.

The present U.S. Utility Patent Application also claims priority pursuant to 35 U.S.C. § 120 as a continuation of U.S. Utility application Ser. No. 17/809,796, entitled “Using a Dispersed Index in a Storage Network” filed Jun. 29, 2022, which claims priority pursuant to 35 U.S.C. § 120 as a continuation of U.S. Utility application Ser. No. 16/878,013, entitled “MANAGING CONCURRENCY IN A DISPERSED STORAGE NETWORK”, filed May 19, 2022, which claims priority pursuant to 35 U.S.C. § 120 as a continuation of U.S. Utility application Ser. No. 13/943,456, entitled “STORING INDEXED DATA TO A DISPERSED STORAGE NETWORK”, filed Jul. 16, 2013, issued as U.S. Pat. No. 10,671,585 on Jun. 2, 2020, which is a continuation-in-part of U.S. Utility application Ser. No. 13/718,961, entitled “RETRIEVING DATA UTILIZING A DISTRIBUTED INDEX”, filed Dec. 18, 2012, issued as U.S. Pat. No. 9,507,786 on Nov. 29, 2016, which claims priority pursuant to 35 U.S.C. § 119(e) to U.S. Provisional Application No. 61/593,116, entitled “INDEXING IN A DISTRIBUTED STORAGE AND TASK NETWORK” filed Jan. 31, 2012, all of which are hereby incorporated herein by reference in their entirety and made part of the present U.S. Utility Patent Application for all purposes. U.S. Utility application Ser. No. 13/943,456 claims priority pursuant to 35 U.S.C. § 119(e) to U.S. Provisional Application No. 61/695,997, entitled “Utilizing metadata to access a dispersed storage and task network”, filed Aug. 31, 2012, which is hereby incorporated herein by reference in its entirety and made part of the present U.S. Utility Patent Application for all purposes.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

Not applicable.

INCORPORATION-BY-REFERENCE OF MATERIAL SUBMITTED ON A COMPACT DISC

Not applicable.

BACKGROUND OF THE INVENTION

Technical Field of the Invention

This invention relates generally to computer networks and more particularly to encoded data slices.

Description of Related Art

Computing devices are known to communicate data, process data, and/or store data. Such computing devices range from wireless smart phones, laptops, tablets, personal computers (PC), work stations, and video game devices, to data centers that support millions of web searches, stock trades, or on-line purchases every day. In general, a computing device includes a central processing unit (CPU), a memory system, user input/output interfaces, peripheral device interfaces, and an interconnecting bus structure.

As is further known, a computer may effectively extend its CPU by using “cloud computing” to perform one or more computing functions (e.g., a service, an application, an algorithm, an arithmetic logic function, etc.) on behalf of the computer. Further, for large services, applications, and/or functions, cloud computing may be performed by multiple cloud computing resources in a distributed manner to improve the response time for completion of the service, application, and/or function. For example, Hadoop is an open source software framework that supports distributed applications enabling application execution by thousands of computers.

In addition to cloud computing, a computer may use “cloud storage” as part of its memory system. As is known, cloud storage enables a user, via its computer, to store files, applications, etc. on an Internet storage system. The Internet storage system may include a RAID (redundant array of independent disks) system and/or a dispersed storage system that uses an error correction scheme to encode data for storage.

BRIEF D