Analysis of information sources in the references of the Wikipedia article "Sunway TaihuLight" in the English-language version.
The TOP500 report said that the chip also lacks any traditional L1-L2-L3 cache, and instead has 12KB[sic] of instruction cache and 64KB "local scratchpad" that works sort of like an L1 cache.
Each CPE Cluster is composed of a Management Processing Element (MPE), which is a 64-bit RISC core supporting both user and system modes, with 256-bit vector instructions, a 32 KB L1 instruction cache, a 32 KB L1 data cache, and a 256 KB L2 cache. The Computer Processing Element (CPE) is composed of an 8×8 mesh of 64-bit RISC cores, supporting only user mode, with 256-bit vector instructions, a 16 KB L1 instruction cache and a 64 KB Scratch Pad Memory (SPM). [..] Each CPE has a 64 KB local (scratchpad) memory, no cache memory. The local memory is SRAM. There is a 16 KB instruction cache. Each of the 4 CPE/MPE clusters has 8 GB of DDR3 memory. So a node has 32 GB of primary memory. Each processor connects to four 128-bit DDR3-2133 memory controllers, with a memory bandwidth of 136.51 GB/s.
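The memory and bandwidth figures in the passage above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical, and it assumes that "DDR3-2133" means exactly 2133 mega-transfers per second and that GB/s is decimal (10^9 bytes per second). Under those assumptions the quoted values of 32 GB per node and 136.51 GB/s are reproduced; the core count is derived from the same passage (one MPE plus an 8×8 mesh of CPEs in each of four clusters).

```python
# Figures taken from the quoted passage.
MEMORY_CONTROLLERS = 4        # DDR3 controllers per processor
BUS_WIDTH_BITS = 128          # width of each controller's interface
TRANSFER_RATE_MT_S = 2133     # assumed: DDR3-2133 = 2133 mega-transfers/s
CLUSTERS_PER_NODE = 4         # CPE/MPE clusters per node
MEMORY_PER_CLUSTER_GB = 8     # DDR3 memory attached to each cluster
CPE_MESH_SIDE = 8             # CPEs form an 8x8 mesh
MPES_PER_CLUSTER = 1          # one management core per cluster


def peak_bandwidth_gb_s(controllers, bus_width_bits, mega_transfers_per_s):
    """Peak memory bandwidth in decimal GB/s (assumed unit convention)."""
    bytes_per_transfer = bus_width_bits // 8
    megabytes_per_s = controllers * bytes_per_transfer * mega_transfers_per_s
    return megabytes_per_s / 1000


def node_memory_gb(clusters, gb_per_cluster):
    """Total primary memory of a node in GB."""
    return clusters * gb_per_cluster


def cores_per_node(clusters, mesh_side, mpes_per_cluster):
    """Total cores: each cluster has a mesh of CPEs plus its MPE(s)."""
    return clusters * (mesh_side * mesh_side + mpes_per_cluster)


bandwidth = peak_bandwidth_gb_s(
    MEMORY_CONTROLLERS, BUS_WIDTH_BITS, TRANSFER_RATE_MT_S
)
memory = node_memory_gb(CLUSTERS_PER_NODE, MEMORY_PER_CLUSTER_GB)
cores = cores_per_node(CLUSTERS_PER_NODE, CPE_MESH_SIDE, MPES_PER_CLUSTER)

print(f"Peak memory bandwidth: {bandwidth:.2f} GB/s")
print(f"Primary memory per node: {memory} GB")
print(f"Cores per node: {cores}")
```

Note that using the nominal DDR3-2133 rate of about 2133.33 MT/s instead would give a slightly higher bandwidth, so the quoted 136.51 GB/s appears to rest on the rounded rate of 2133 MT/s.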