US Patent No. 10,599,436

DATA PROCESSING METHOD AND APPARATUS, AND SYSTEM


Patent No.: 10,599,436
Issue Date: March 24, 2020
Title: Data Processing Method And Apparatus, And System
Inventorship: Haiyan Liu, Shenzhen (CN)
              Jun Xu, Hangzhou (CN)
              Qun Yu, Beijing (CN)
Assignee: HUAWEI TECHNOLOGIES CO., LTD., Shenzhen (CN)

Claim of US Patent No. 10,599,436

1. A data processing method, applied to a system comprising a central processing unit (CPU) pool and a storage pool, wherein the CPU pool is communicatively connected to the storage pool; and
the CPU pool comprises at least two CPUs, a master node, at least one mapper node, and at least one reducer node running in the CPU pool, wherein the at least one mapper node comprises a first mapper node, the at least one reducer node comprises a first reducer node, and the first mapper node and the first reducer node run on different CPUs in the CPU pool;
the storage pool comprises a remote storage area shared by the first mapper node and the first reducer node; and the method comprises:
executing, by the first mapper node, a map task on a data slice, and obtaining N groups of at least one data segment according to an execution result of the map task, wherein
N is a positive integer, each of the at least one data segment is to be processed by a corresponding reducer node, and
the at least one data segment comprises a first data segment, and an Mth group, the first data segment being a data segment to be processed by the first reducer node, and the Mth group comprising an Mth first data segment, wherein M is a positive integer less than or equal to N;
storing, by the first mapper node, all first data segments in the N groups of at least one data segment into the remote storage area, and generating N storage messages, wherein an Mth storage message comprises a storage address of the Mth first data segment in the remote storage area and a data volume of the Mth first data segment; and
sending, by the first mapper node, the N storage messages to the master node.
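
The mapper-side steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the in-memory RemoteStorageArea and MasterNode classes, the word-count map task, the byte-sum partitioning rule, the address format, and all function and parameter names are assumptions made for illustration only. The sketch shows a first mapper node producing N groups of data segments, storing the segment destined for the first reducer node from each group in a shared storage area, and sending N storage messages (storage address plus data volume) to the master node.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class StorageMessage:
    """One of the N storage messages sent to the master node."""
    group_index: int      # M, 1-based, with M <= N
    storage_address: str  # address of the Mth first data segment
    data_volume: int      # size of that segment in bytes


class RemoteStorageArea:
    """Hypothetical in-memory stand-in for the shared remote storage area."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def write(self, address: str, payload: bytes) -> None:
        self._blobs[address] = payload

    def read(self, address: str) -> bytes:
        return self._blobs[address]


class MasterNode:
    """Hypothetical master node that only records the messages it receives."""

    def __init__(self) -> None:
        self.received: List[StorageMessage] = []

    def receive(self, message: StorageMessage) -> None:
        self.received.append(message)


def run_map_task(data_slice: List[str], group_size: int,
                 reducer_count: int) -> List[Dict[int, bytes]]:
    """Assumed word-count map task.

    Returns N groups; each group maps a reducer id to the data segment
    that reducer should process (an empty segment if it got no records).
    """
    groups: List[Dict[int, bytes]] = []
    for start in range(0, len(data_slice), group_size):
        rows: Dict[int, List[str]] = {r: [] for r in range(reducer_count)}
        for line in data_slice[start:start + group_size]:
            for word in line.split():
                # Deterministic partitioning rule (an assumption of this sketch).
                reducer_id = sum(word.encode()) % reducer_count
                rows[reducer_id].append(f"{word}\t1")
        groups.append({r: "\n".join(v).encode() for r, v in rows.items()})
    return groups


def store_and_notify(groups: List[Dict[int, bytes]], first_reducer_id: int,
                     storage: RemoteStorageArea, master: MasterNode,
                     mapper_id: str = "mapper-1") -> List[StorageMessage]:
    """Store every first data segment, then send N storage messages."""
    messages: List[StorageMessage] = []
    for m, group in enumerate(groups, start=1):
        segment = group[first_reducer_id]  # the Mth first data segment
        address = f"{mapper_id}/reducer-{first_reducer_id}/group-{m}"
        storage.write(address, segment)
        messages.append(StorageMessage(m, address, len(segment)))
    for message in messages:
        master.receive(message)
    return messages
```

In this sketch only addresses and sizes travel to the master node; the segment data itself stays in the shared storage area, where a reducer node could later read it by address.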