Intel Core microarchitecture: a shared cache for dual cores

At the Intel Developer Forum (IDF) in early 2006, Intel focused on introducing a new dual-core microarchitecture for the IA-32 architecture (which coexists with the EPIC and IXA architectures), with optimistic claims about improvements in performance and, in particular, in performance relative to energy consumption. The new microarchitecture is simply called Intel Core.

Looking at the dual-core microarchitectures in use so far, the common approach has been to integrate two earlier single-core processor cores into one chip, with fairly clear independence in processing, caches and instruction execution. The Core microarchitecture improves on this by giving both cores a common level 2 (L2) cache, while also combining the strengths of the NetBurst and Mobile microarchitectures to achieve two important goals at once: higher processing efficiency and lower power consumption. Intel will deploy this strategic microarchitecture across all three product lines (mobile, desktop and server); according to the product roadmap, the laptop processor code-named Merom will be introduced first.

Compared with Intel's current dual-core microarchitecture, Core brings five major improvements: Wide Dynamic Execution, Intelligent Power Capability, Advanced Smart Cache, Smart Memory Access and Advanced Digital Media Boost.

Wide Dynamic Execution

Dynamic execution is a combination of techniques (data flow analysis, speculative execution, out-of-order execution and so on) that Intel first implemented in the P6 microarchitecture, which includes the Pentium Pro, Pentium II and Pentium III processors. In the later NetBurst microarchitecture, Intel introduced Advanced Dynamic Execution, with a deeper pipeline, further-reaching speculation and improved branch prediction algorithms that reduce mispredictions. In the Core microarchitecture, the 14-stage pipeline allows branches to be predicted more accurately, and each core can process up to four instructions at the same time (the earlier Intel Mobile and NetBurst microarchitectures could handle three).

Another feature that shortens instruction execution time is macrofusion. During decoding, certain common instruction pairs (such as a compare followed by a conditional jump) are fused into a single micro-op. The arithmetic logic unit (ALU) in the Core microarchitecture is designed to execute a fused pair in one clock cycle, which significantly shortens execution time compared with running the two instructions separately and also reduces energy use.
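
The pairing step described above can be illustrated with a toy decoder. This is only a sketch under simplifying assumptions: the mnemonic sets, the `decode` function and the `cmp+jne` naming of a fused micro-op are hypothetical, and real hardware applies further conditions before fusing a pair.

```python
# Toy decoder: fuses a compare that is immediately followed by a
# conditional jump into one micro-op. The mnemonic sets below are
# illustrative assumptions, not Intel's actual fusion rules.
COMPARES = {"cmp", "test"}
COND_JUMPS = {"je", "jne", "jl", "jg"}


def decode(instructions):
    """Return the micro-op list after fusing compare + jump pairs."""
    micro_ops = []
    i = 0
    while i < len(instructions):
        op = instructions[i]
        nxt = instructions[i + 1] if i + 1 < len(instructions) else None
        if op in COMPARES and nxt in COND_JUMPS:
            micro_ops.append(f"{op}+{nxt}")  # one fused micro-op
            i += 2
        else:
            micro_ops.append(op)  # not part of a fusible pair
            i += 1
    return micro_ops


program = ["mov", "cmp", "jne", "add", "test", "je", "cmp", "mov"]
fused = decode(program)
print(len(program), "instructions ->", len(fused), "micro-ops")
```

In this sketch the final `cmp` is left alone because it is not followed by a conditional jump, so only two of the three compares are fused.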

Intel Core also carries over micro-op fusion, the energy-saving technique first used in the Pentium M processor. Normally, a macro-op (an x86 instruction) is split into several micro-ops before it enters the processor's execution pipeline. Micro-op fusion combines some of these micro-ops, reducing the number of entries in the queue. In the Core microarchitecture, the number of micro-ops that can be fused has been expanded thanks to the 14-stage pipeline design (longer than before).

Picture 1: Each core can handle up to four instructions simultaneously.

Intelligent Power Capability

One of the measures of a computing system's efficiency today is its performance per unit of power consumed. The ratio can therefore be improved by reducing power consumption as well as by raising performance. Besides improving processor performance, Intel designed Intelligent Power Capability into Core to save power.
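
To see how the two factors combine in this ratio, the sketch below applies it to the figures Intel quotes at the end of this article for Conroe and Woodcrest. The function name `perf_per_watt_gain` is hypothetical, and the quoted "more than" percentages are treated as exact values purely for illustration.

```python
def perf_per_watt_gain(perf_increase, power_reduction):
    """Relative improvement in performance per watt.

    perf_increase and power_reduction are fractions, so 0.40 means
    40% more performance or 40% less power respectively.
    """
    return (1.0 + perf_increase) / (1.0 - power_reduction)


# Figures quoted by Intel, taken at face value for this illustration.
conroe = perf_per_watt_gain(0.40, 0.40)
woodcrest = perf_per_watt_gain(0.80, 0.35)
print(f"Conroe: {conroe:.2f}x, Woodcrest: {woodcrest:.2f}x")
```

Because the two changes multiply rather than add, a 40% gain in performance combined with a 40% cut in power more than doubles performance per watt.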

Current manufacturing technology has allowed Intel to design a shutdown mechanism that works at the level of individual logic blocks. The Core microarchitecture can thus turn off a subsystem of the processor when it is not needed to save power, yet reactivate it as soon as it is required so that overall processor speed is not affected. In addition, many buses and data arrays have been split so that, in some states, data can be transferred at lower voltage levels.

Advanced Smart Cache

Unlike the usual implementation, the Core microarchitecture has a single L2 cache shared by both processor cores, which improves performance and makes data retrieval more efficient. For example, when both cores need the same data, it can be stored once in the shared L2 cache rather than being kept in two separate L2 caches as before. This saves resources and cuts the time spent moving data back and forth between the two caches.

This technology also allows cache capacity to be allocated dynamically according to each core's needs. When the first core does not need the cache, the entire shared L2 cache can be used by the second core, and vice versa. This increases cache utilization, reduces cache misses and makes effective use of the fast response time of the L2 cache.
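
The benefit of dynamic allocation can be seen in a small simulation. This is a minimal sketch, assuming a fully associative cache with least-recently-used replacement; the `LRUCache` class, the `run` helper and the tiny capacities are hypothetical and far simpler than a real L2 cache.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative cache with least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, address):
        if address in self.lines:
            self.lines.move_to_end(address)  # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            self.lines[address] = True
            if len(self.lines) > self.capacity:
                self.lines.popitem(last=False)  # evict the oldest line


def run(trace, caches):
    """trace holds (core, address) pairs; caches maps core -> its cache."""
    for core, address in trace:
        caches[core].access(address)


# Core 0 loops ten times over six cache lines while core 1 stays idle.
trace = [(0, address) for _ in range(10) for address in range(6)]

# Two private caches of four lines each: core 1's half goes unused.
split0, split1 = LRUCache(4), LRUCache(4)
run(trace, {0: split0, 1: split1})

# One shared cache of eight lines: core 0 can use all of it.
shared = LRUCache(8)
run(trace, {0: shared, 1: shared})

print("split hits:", split0.hits, "shared hits:", shared.hits)
```

With the private caches, core 0's six-line working set does not fit in its four-line share and every access misses, while the idle core's capacity is wasted. With the shared cache, the same working set fits and only the first pass misses.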

Smart Memory Access

Intel Smart Memory Access consists of two important techniques: memory disambiguation and an advanced prefetcher. Memory disambiguation uses a special algorithm to predict which loads are unlikely to depend on earlier, still-pending stores, and lets those loads execute ahead of the stores. This increases instruction-level parallelism down to the micro-op level, which benefits multitasking and parallel processing environments. In the cases where a load that "jumped ahead" turns out to be wrong, Intel has integrated a mechanism that detects the conflict, quickly reloads the correct data and re-executes the affected instructions.
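
The detect-and-replay idea can be sketched in a few lines. This is a toy model only: the `speculative_load` function, the dictionary standing in for memory and the single pending store are hypothetical simplifications, and the prediction step itself is left out.

```python
def speculative_load(memory, pending_store, load_address):
    """Run a load before an older store has written to memory.

    pending_store is an (address, value) pair for an older store whose
    address was not yet known when the load was issued.
    Returns (value, replayed).
    """
    speculative_value = memory.get(load_address, 0)  # load runs early
    store_address, store_value = pending_store
    memory[store_address] = store_value  # the older store completes
    if store_address == load_address:
        # Conflict: the early load read stale data, so it is replayed.
        return memory[load_address], True
    return speculative_value, False


memory = {0x10: 1, 0x20: 2}
no_conflict = speculative_load(memory, (0x30, 9), 0x10)
conflict = speculative_load(memory, (0x10, 7), 0x10)
print(no_conflict, conflict)
```

In the first call the store targets a different address, so the early load's value stands. In the second call the addresses match, so the load is replayed and returns the stored value.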

In addition, Intel Smart Memory Access has advanced prefetchers that fetch data from memory into the caches before it is requested, to take advantage of the caches' high access speed. The Core microarchitecture integrates two prefetchers for each L1 cache and two for the L2 cache: the L1 prefetchers have data ready in the L1 cache just in time for execution, while the L2 prefetchers keep data that the cores are likely to need soon in the L2 cache.
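
One common way for a hardware prefetcher to anticipate accesses is to watch for a constant stride between addresses. The sketch below shows that general idea only; it is an assumption for illustration, not a description of Intel's actual prefetch algorithm, and the `StridePrefetcher` class is hypothetical.

```python
class StridePrefetcher:
    """Toy prefetcher: predicts the next address once a stride repeats."""

    def __init__(self):
        self.last_address = None
        self.last_stride = None

    def observe(self, address):
        """Record a demand access; return an address to prefetch or None."""
        prediction = None
        if self.last_address is not None:
            stride = address - self.last_address
            if stride != 0 and stride == self.last_stride:
                prediction = address + stride  # same stride seen twice
            self.last_stride = stride
        self.last_address = address
        return prediction


prefetcher = StridePrefetcher()
accesses = (100, 164, 228, 292, 300)
predictions = [prefetcher.observe(address) for address in accesses]
print(predictions)
```

The prefetcher stays quiet until the 64-byte stride has been seen twice, then predicts one step ahead, and stops predicting as soon as the pattern breaks.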

Picture 3: The Intel Core microarchitecture handles 128-bit SIMD instructions in one clock cycle.

Advanced Digital Media Boost

To speed up execution of Streaming SIMD Extensions (SSE) instructions, the Core microarchitecture includes Intel Advanced Digital Media Boost, which handles 128-bit SIMD operations. Earlier processors could execute only 64 bits at a time, so a 128-bit SIMD instruction had to be split and processed over two clock cycles. With Advanced Digital Media Boost, the Core microarchitecture processes it in a single cycle, shortening processing time for video, audio and graphics applications and for other data handled with the SSE, SSE2 and SSE3 instruction sets. The ability to operate on 128-bit floating-point and integer data also improves accuracy in specific applications such as image, video and speech processing, encryption, finance, engineering and science.
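
The effect of the wider datapath on instruction throughput comes down to simple arithmetic. The sketch below is an idealized model under the assumption that nothing else limits execution; the `cycles_for_simd` function and the instruction count of 1000 are hypothetical.

```python
import math


def cycles_for_simd(num_ops, op_bits=128, datapath_bits=64):
    """Cycles needed to execute num_ops SIMD operations on a datapath.

    Each operation takes as many passes as it takes to cover its width.
    """
    passes_per_op = math.ceil(op_bits / datapath_bits)
    return num_ops * passes_per_op


older_design = cycles_for_simd(1000, op_bits=128, datapath_bits=64)
core_design = cycles_for_simd(1000, op_bits=128, datapath_bits=128)
print(older_design, "cycles vs", core_design, "cycles")
```

In this model, the same stream of 128-bit operations takes half as many cycles once the datapath is as wide as the operation.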

Product roadmap

Picture 4: The shared L2 cache speeds up access and reduces data transfer time.

Based on the Core microarchitecture, Intel outlined its upcoming product roadmap for all three segments (mobile, desktop and server), closely pairing processors with system platforms. Continuing the trend toward multi-core processors, Intel is expected to launch quad-core models in early 2007.

Specifically, the mobile processor code-named Merom, for notebooks, is scheduled to launch worldwide in the second quarter, running on the current Napa platform and also on the Santa Rosa platform (to be released in 1/2007). The desktop processor code-named Conroe will be officially available worldwide in the third quarter with the Averill platform (introduced this year). The Bridge Creek desktop platform (due in the middle of this year) will allow the Conroe processor to be replaced by Kentsfield in the first quarter of 2007. The Conroe and Kentsfield desktop processors will also be used in the Kaylo and Wyloway workstation/server platforms (third quarter). Meanwhile, the Woodcrest processor (third quarter) and Clovertown (first quarter of 2007) will be released on the Bensley and Glidewell platforms (both in the second quarter) for two-processor servers. In 2007, the Core microarchitecture will also appear in the Tigerton processor, running on the Caneland multi-processor server platform.

Intel Vietnam said that dual-core processors based on the Core microarchitecture will be available in Vietnam around the third quarter of this year, starting with the laptop version (code-named Merom), possibly followed by the PC version (code-named Conroe). According to Intel, the Core microarchitecture significantly improves performance and significantly reduces power consumption compared with Intel processors based on the previous-generation microarchitecture. Specifically, Conroe delivers more than 40% higher performance while consuming more than 40% less power; Woodcrest delivers more than 80% higher performance with 35% less power. The new generation of Intel Core Duo notebooks using Merom processors will gain more than 20% in performance.

Duy Khanh
