CentraleSupélec - Computer Science Department
Plateau de Moulon
3 rue Joliot-Curie
F-91192 Gif-sur-Yvette cedex
1CC1000 - Information Systems and Programming - Lab: Comparing machines

 

Consider the three Dell machines described in the following table.

 

Model | Latitude 3440 | Precision Mobile 7780 | Precision 7960
Type | Laptop | Laptop/Workstation | Workstation
Processor | Intel Core i5-1335U | Intel Core i9-13950HX | Intel Xeon w9-3495X
Memory (RAM) | 8 GB - 3 200 MT/s - DDR4 (1 × 8 GB) | 32 GB - 5 600 MHz - DDR5 (1 × 32 GB) | 512 GB - 4 800 MHz - DDR5 (8 × 64 GB)
GPU | Intel Iris Xe Graphics G7 80EUs (integrated) | NVIDIA RTX 3500 | NVIDIA RTX A6000
Storage | SSD 256 GB | SSD 1 TB | RAID 5: 4 × SSD 4 TB
Dimensions | 14" - 1920 × 1080 | 15.6" - 3840 × 2160 | --
Weight | 1.5 kg | 2.5 kg | --
Price | 900 € | 4 600 € | 32 100 €
Specs | click here | click here | click here

 

The first computer is a light laptop suitable for personal and/or office use. The second is a laptop workstation that can run compute-intensive and graphics-intensive applications. The third is a powerful workstation capable of handling massive computations, such as deep learning.

 

Computers compared in 2020
Model | Latitude 3410 | Precision Mobile 7550 | Precision 7920
Type | Laptop | Laptop/Workstation | Workstation
Processor | Intel Core i5-10210U | Intel Core i9-10885H | 2 × Intel Xeon Gold-5220
Memory (RAM) | 8 GB - 2 667 MHz - DDR4 (1 × 8 GB) | 32 GB - 2 933 MHz - DDR4 (2 × 16 GB) | 128 GB - 2 933 MHz - DDR4 (8 × 16 GB)
GPU | UHD620 (integrated) | NVIDIA Quadro T2000 | NVIDIA Quadro GV100
Storage | SSD 128 GB | SSD 512 GB | RAID 5: 8 × SSD 1 TB - SATA
Dimensions | 14" - 1366 × 768 | 15.6" - 3840 × 2160 | --
Weight | 1.6 kg | 2.5 kg | --
Price | 500 € | 1 500 € | 7 500 €
Specs | click here | click here | click here

 

Look at the specification of the first two computers (link in the last row of the table). How many screens can be simultaneously used with these laptops, regardless of the characteristics of the GPU?



ANSWER ELEMENTS

  • Latitude 3440: 3
    • Laptop screen
    • HDMI 1.4
    • USB 3.2 Gen 2 Type-C port with DisplayPort
  • Precision Mobile 7780: 7
    • Laptop screen
    • HDMI 2.0a or 2.1
    • USB 3.2 Gen 2 Type-C port with DisplayPort
    • 2 × Thunderbolt 4 ports with USB Type-C (each supports two 4K displays)
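The totals above are a simple sum over the display outputs. The following is a minimal sketch of that count; the list layout and the function name `max_screens` are illustrative choices, not part of the lab material.

```python
# Display outputs listed in the answer above: (name, number of ports, displays per port).
# The built-in panel is counted as one "port" driving one display.
LATITUDE_3440 = [
    ("Laptop screen", 1, 1),
    ("HDMI 1.4", 1, 1),
    ("USB 3.2 Gen 2 Type-C with DisplayPort", 1, 1),
]
PRECISION_7780 = [
    ("Laptop screen", 1, 1),
    ("HDMI 2.0a or 2.1", 1, 1),
    ("USB 3.2 Gen 2 Type-C with DisplayPort", 1, 1),
    ("Thunderbolt 4 with USB Type-C", 2, 2),  # each port supports two 4K displays
]


def max_screens(outputs):
    """Upper bound on simultaneous screens, ignoring any GPU limit."""
    return sum(ports * displays for _, ports, displays in outputs)


print(max_screens(LATITUDE_3440))   # 3
print(max_screens(PRECISION_7780))  # 7
```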

What are the data transfer rates on the available connections of the two laptops?



ANSWER ELEMENTS

  • Latitude 3440
    • USB 3.2 Gen 1 : 5 Gbit/s
    • USB 3.2 Gen 2 Type-C : 10 Gbit/s
    • RJ45 (Ethernet) : 10/100/1000 Mbit/s
    • Wi-Fi : Up to 2400 Mbit/s
    • Bluetooth 5.3 : Up to 2 Mbit/s
    • WWAN module : Up to 1 Gbit/s DL - 150 Mbit/s UL
  • Precision Mobile 7780 : same connections as above, with the following additions or changes
    • Thunderbolt 4 ports with USB Type-C : 40 Gbit/s
    • WWAN module : Up to 3 Gbit/s DL - 250 Mbit/s UL
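To get a feel for these rates, one can estimate how long a file would take to cross each link. The sketch below assumes a hypothetical 1 GB file (1 GB = 10^9 bytes) and uses the peak rates listed above; real transfers are slower because of protocol overhead and the speed of the devices at each end.

```python
# Peak link rates from the answer above, in Mbit/s.
RATES_MBIT_S = {
    "USB 3.2 Gen 1": 5_000,
    "USB 3.2 Gen 2 Type-C": 10_000,
    "Ethernet (RJ45)": 1_000,
    "Wi-Fi": 2_400,
    "Bluetooth 5.3": 2,
    "Thunderbolt 4": 40_000,
}


def transfer_time_s(size_gb, rate_mbit_s):
    """Ideal time in seconds to move size_gb gigabytes at rate_mbit_s."""
    size_mbit = size_gb * 8_000  # 1 GB = 8 000 Mbit
    return size_mbit / rate_mbit_s


for name, rate in RATES_MBIT_S.items():
    print(f"{name:22s} {transfer_time_s(1, rate):10.1f} s")
```

The gap between Bluetooth (more than an hour for 1 GB) and Thunderbolt 4 (a fraction of a second) shows why the choice of connection matters for large transfers.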

 

We now compare the processors of the three computers. The following table summarizes the key points. An official and exhaustive comparison is available here.

 

Feature | Intel Core i5-1335U | Intel Core i9-13950HX | Intel Xeon w9-3495X
Number of cores | 2 + 8 | 8 + 16 | 56
Base frequency | 1.70 GHz | 2.20 GHz | 1.90 GHz
Turbo frequency | 4.60 GHz | 5.50 GHz | 4.80 GHz
Cache | 12 MB | 36 MB | 105 MB
Power | 15 W | 55 W | 350 W
Max. memory size | 64 GB | 128 GB | 4 TB
Max. memory channels | 2 | 2 | 8
Recommended price | $340.00 | $590.00 | $5889.00
Geekbench 5 single-core score (cpu-monkey) | 1628 | 2108 | 1734
Geekbench 5 multi-core score (cpu-monkey) | 7240 | 19759 | 56911

 

Computers proposed for comparison in 2020
Feature | Intel Core i5 10210U | Intel Core i9 10885H | Intel Xeon Gold 5220
Number of cores | 4 | 8 | 18
Base frequency | 1.60 GHz | 2.40 GHz | 2.20 GHz
Turbo frequency | 4.20 GHz | 5.30 GHz | 3.90 GHz
Cache | 6 MB | 16 MB | 25 MB
Power | 15 W | 45 W | 125 W
Max. memory size | 64 GB | 128 GB | 1 TB
Max. memory frequency | 2 667 MHz | 2 933 MHz | 2 667 MHz
Max. memory channels | 2 | 2 | 6

 

  • Discuss the differences between the three processors. Can you guess what the different features listed in the table mean?
  • Look at the CPU limitations with memory. Are these limitations respected in the hardware configuration of the three computers? Can we still add memory to the three configurations?

ANSWER ELEMENTS

Here is a description of the features:

  • Number of cores. A core is a single processing unit of a CPU. A core can read and execute program instructions. A multi-core CPU can execute multiple instructions at the same time.
  • Base frequency. The frequency of the CPU clock under typical utilization.
  • Turbo frequency. The frequency of the CPU clock under heavy utilization. The Turbo Boost technology dynamically increases the clock speed to handle heavy workload. The frequency of the clock is set based on the system heat and the number of cores in use. The clock speed cannot exceed the one specified by the turbo frequency feature.
  • Cache. Indicates the amount of cache memory integrated in the CPU chip.
  • Power. The thermal design power (TDP): the average power, in watts, that the CPU dissipates as heat under heavy utilization while running at its base frequency.
  • Max. memory size. The maximum amount of memory that the CPU can address.
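  • Max. memory channels. The maximum number of independent channels between the CPU and the memory modules. Modules placed on different channels can be accessed in parallel, which increases the memory bandwidth.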
  • Geekbench. A benchmark, that is, a set of performance tests that assigns a performance score to a CPU. The scores in the table are taken from the website cpu-monkey.

In general, a higher clock speed means a faster CPU. However, many other factors affect performance. The CPU has several ways to optimize the execution of program instructions, and today's technology distributes the execution of instructions among the cores in an intelligent way. It is therefore possible that an older CPU with a higher clock frequency is slower than a newer CPU with a lower clock speed.

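The cache size is also important: a larger cache keeps more data close to the cores and reduces the number of slower accesses to the main memory.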

Benchmarks are used to compare CPU performance. From the table we learn that the Intel Core i9 has a higher score than the Intel Xeon (which costs more) when only a single core is used. This is true for the selected benchmark; another one may give a different result. Performance scores should therefore be considered with caution.
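One possible way to read the two Geekbench rows together is to divide the multi-core score by the single-core score and compare the result with the number of cores. The sketch below does this with the values from the table; the ratio is only an illustrative indicator chosen for this example, not a metric defined by Geekbench or by the lab.

```python
# Values copied from the processor table (2 + 8 and 8 + 16 are summed).
CPUS = {
    "Core i5-1335U": {"cores": 2 + 8, "single": 1628, "multi": 7240},
    "Core i9-13950HX": {"cores": 8 + 16, "single": 2108, "multi": 19759},
    "Xeon w9-3495X": {"cores": 56, "single": 1734, "multi": 56911},
}


def multi_core_speedup(cpu):
    """How many times higher the multi-core score is than the single-core score."""
    return cpu["multi"] / cpu["single"]


for name, cpu in CPUS.items():
    speedup = multi_core_speedup(cpu)
    print(f"{name:16s} cores={cpu['cores']:2d}  speedup={speedup:5.2f}")
```

The Xeon has the lowest turbo frequency of the two high-end processors, yet its large number of cores gives it by far the highest multi-core score, which illustrates why a single number rarely summarizes a CPU.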

 

As for the second question, we see that the three hardware configurations respect the memory limitations of their CPUs. Moreover, the first two computers use only one memory channel, so we can add one more module, as their CPUs support two channels.
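The check described above can be written down explicitly. This is a minimal sketch using the values from the two tables; it assumes one memory module per channel and 4 TB = 4 096 GB, neither of which is stated in the tables, and the function name `check_memory` is hypothetical.

```python
# (machine, installed GB, installed modules, CPU max GB, CPU max channels)
CONFIGS = [
    ("Latitude 3440", 8, 1, 64, 2),
    ("Precision Mobile 7780", 32, 1, 128, 2),
    ("Precision 7960", 512, 8, 4096, 8),  # assumption: 4 TB = 4 096 GB
]


def check_memory(installed_gb, modules, max_gb, max_channels):
    """Compare an installed configuration with the CPU limits."""
    return {
        "within_size_limit": installed_gb <= max_gb,
        # Assumption: each module occupies its own channel.
        "free_channels": max(max_channels - modules, 0),
        "headroom_gb": max_gb - installed_gb,
    }


for name, *config in CONFIGS:
    print(name, check_memory(*config))
```

For the third computer the sketch reports no free channel but a large headroom in size; whether that headroom can be used depends on the number of memory slots on the motherboard, which the tables do not give.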

 

Finally, you might have noticed that the processors of the first two computers have two types of cores, which is why the table lists their number of cores as 2 + 8 and 8 + 16. You can see it in the detailed specifications. Recent Intel CPUs combine P-cores (Performance-cores) and E-cores (Efficient-cores). P-cores handle heavy workloads: they are faster, but they cost more and consume more power (and therefore generate more heat). E-cores handle background tasks that do not require much computing power: they are slower, but they cost less and consume less power. The goal of having two types of cores is to strike a balance between performance on the one hand, and cost and power consumption on the other.


 

We now look at the graphics processing unit (GPU). The following table describes the features of the GPUs in the three computers.

 

Feature | Intel Iris Xe Graphics G7 80EUs (integrated) | NVIDIA RTX 3500 | NVIDIA RTX A6000
Number of cores (CUDA) | 80 | 5 120 | 10 752
Clock speed | 1 250 MHz | 1 545 MHz | 2 505 MHz
Memory | 8 GB (shared) | 12 GB | 48 GB
FP16 performance | 768.0 GFLOPS | 15.82 TFLOPS | 38.7 TFLOPS
FP32 performance | 384.0 GFLOPS | 15.82 TFLOPS | 38.7 TFLOPS
FP64 performance | 96.0 GFLOPS | 247.2 GFLOPS | 1.21 TFLOPS
Tensor cores (Deep learning) | -- | 160 | 336
Max power consumption | < 15 W | 100 W | 300 W

 

Computers compared in 2020
Feature | UHD620 (integrated) | NVIDIA Quadro T2000 | NVIDIA Quadro GV100
Number of cores (CUDA) | 192 | 1 024 | 5 120
Clock speed | 1 150 MHz | 1 785 MHz | 1 627 MHz
Memory | 32 GB (shared) | 4 GB | 32 GB
FP16 performance | 768.0 GFLOPS | 7.3 TFLOPS | 29.6 TFLOPS
FP32 performance | 384.0 GFLOPS | 3.6 TFLOPS | 14.8 TFLOPS
FP64 performance | 96.0 GFLOPS | 114.2 GFLOPS | 7.4 TFLOPS
Tensor cores (Deep learning) | -- | -- | 640
Tensor performance | -- | -- | 118.5 TFLOPS
Max power consumption | < 15 W | 60 W | 250 W

 

Compare the features of the three GPUs. Can you understand the meaning of each feature listed in the table?


ANSWER ELEMENTS

  • GPUs execute identical computations simultaneously (data parallelism) on a vector of data and produce a corresponding vector of outputs. Their primary purpose is to render three-dimensional images but they are also used to execute compute-intensive applications where data parallelism is involved.
  • In the first computer, the GPU is integrated into the CPU. As a result, the CPU and the GPU share the same memory. While integrating the GPU into the CPU results in lower performance, for most users this is a convenient and cheap solution. Gamers and developers of graphics applications need a configuration with a dedicated GPU.
  • The second and third computers are equipped with dedicated NVIDIA GPUs. These GPUs have a dedicated memory.
  • NVIDIA GPUs consist of several CUDA cores. CUDA (Compute Unified Device Architecture) is an NVIDIA platform that enables the development of applications for GPUs. A CUDA core executes one floating-point operation per clock cycle. Therefore, the number of cores and the clock frequency are good indicators of a GPU's performance.
  • FP16, FP32 and FP64 stand respectively for half-precision (16 bits), single-precision (32 bits) and double-precision (64 bits) floating-point numbers.
  • FP16, FP32 and FP64 performance refer to the number of floating-point operations per second. This is measured in GFLOPS (Giga floating-point operations per second) or TFLOPS (Tera floating-point operations per second). The values in the table are important for comparison but they are theoretical. To have a concrete comparison, it is better to look for benchmarks like here.
  • Since CUDA cores are limited to the execution of one operation per clock cycle, NVIDIA developed more advanced cores, called tensor cores. A tensor core can perform a multiply-accumulate operation on entire small matrices in one clock cycle, which greatly speeds up deep learning applications on GPUs.
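The link between the number of cores, the clock speed and the theoretical FP32 figures can be sketched as follows. The sketch assumes that each CUDA core completes one fused multiply-add per clock cycle and that vendors count it as two floating-point operations; this convention is an assumption of the example and is not stated in the tables, but it reproduces the 15.82 TFLOPS listed for the NVIDIA RTX 3500.

```python
def peak_fp32_tflops(cuda_cores, clock_mhz, flops_per_core_per_cycle=2):
    """Theoretical peak in TFLOPS: cores x clock x operations per cycle.

    flops_per_core_per_cycle=2 is an assumption (one fused multiply-add
    counted as two floating-point operations).
    """
    flops = cuda_cores * clock_mhz * 1e6 * flops_per_core_per_cycle
    return flops / 1e12


# Values from the GPU tables above.
print(round(peak_fp32_tflops(5120, 1545), 2))  # NVIDIA RTX 3500: 15.82
print(round(peak_fp32_tflops(1024, 1785), 2))  # NVIDIA Quadro T2000: 3.66
```

The second value is close to the 3.6 TFLOPS listed for the Quadro T2000. Such figures remain upper bounds: a real application reaches them only if every core is busy at every cycle.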

 

Consider a typical laptop battery of 50Wh. These laptop screens consume 10W to 20W depending on size and brightness.

Considering that the three main power consuming components in a laptop are the CPU, the screen and the GPU, discuss autonomy and power consumption of the two laptops.


ANSWER ELEMENTS

  • The first laptop consumes 15 W for both the processor and its integrated GPU, and 10 W for the screen, that is 25 W in total. Assuming that the battery works properly, its autonomy is 50 Wh / 25 W = 2 hours under high load conditions.
  • The second laptop consumes 55 W (processor) + 100 W (GPU) + 20 W (screen) = 175 W. The battery autonomy is 50 Wh / 175 W ≈ 0.29 hours, that is, less than 30 minutes under high load conditions.
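The estimate is a division of the battery capacity by the total power drawn. Here is a minimal sketch of that calculation, using the 50 Wh battery from the question and, for the second laptop, the power figures from the processor and GPU tables above (55 W and 100 W); the function name `autonomy_hours` is an illustrative choice.

```python
def autonomy_hours(battery_wh, *component_watts):
    """Battery life in hours when all components draw their listed power."""
    return battery_wh / sum(component_watts)


# First laptop: CPU with integrated GPU (15 W) + screen (10 W).
laptop1 = autonomy_hours(50, 15, 10)
# Second laptop: CPU (55 W) + GPU (100 W) + screen (20 W).
laptop2 = autonomy_hours(50, 55, 100, 20)

print(f"Laptop 1: {laptop1:.2f} h")
print(f"Laptop 2: {laptop2:.2f} h ({laptop2 * 60:.0f} min)")
```

These are worst-case figures: they assume that every component runs at its maximum power for the whole discharge.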

In modern computer architectures, the frequency of each component is dynamically adjusted based on the actual computing load, to save power and reduce the amount of heat generated by the circuits.


 

[Advanced only]

The storage of the Precision 7960 is RAID 5: 4 × SSD 4 TB. What is RAID?


ANSWER ELEMENTS

RAID is an acronym that stands for Redundant Array of Independent Disks. It is a technology that allows the storage of data across multiple disks to prevent data loss in case of disk failures. One of the benefits of RAID is to increase the fault tolerance of an information system.

There are different versions of RAID, known as levels, each identified by a number.

  • RAID 0, also known as disk striping. In this configuration, data is split into independent partitions (also called stripes), which are then distributed evenly across several disks. The configuration is illustrated here. The benefit of this configuration is performance: we can read data in parallel from many disks. The disadvantage is that there is no redundancy: if a disk fails, the partitions stored on it are lost.
  • RAID 1, also known as disk mirroring. In this configuration, a dataset is stored redundantly (or, mirrored) on multiple disks. The configuration is illustrated here. This configuration guarantees fault tolerance, but at the price of using many disks to store one single dataset.
  • RAID 4. In this configuration data is split into different independent partitions, and then distributed evenly across several disks, like in RAID 0. Unlike RAID 0, however, an additional disk stores parity data. When a disk fails, the partitions stored in it can be recomputed by using the other partitions and the parity data. The configuration is illustrated here.
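The recovery of a lost partition from the parity data can be illustrated with a few bytes. The sketch below assumes that the parity is the byte-wise XOR of the data partitions, which is the usual scheme for parity-based RAID levels; the stripe contents and the function name `xor_blocks` are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data stripes, one per data disk (hypothetical contents).
stripes = [b"ABCD", b"EFGH", b"IJKL"]
parity = xor_blocks(stripes)  # stored on the parity disk

# The disk holding the second stripe fails: rebuild it from the
# surviving stripes and the parity.
survivors = [stripes[0], stripes[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'EFGH'
```

This works because XOR-ing a value twice cancels it out: combining the parity with all the surviving stripes leaves exactly the missing one. It also shows why a single parity block protects against the loss of one disk only.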

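Other levels exist. RAID 5, the level used in the Precision 7960, works like RAID 4, except that the parity data is distributed across all the disks instead of being stored on one dedicated disk. The discussion of the remaining levels is beyond the scope of this course. More details are available here.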

Importantly, RAID disks are seen by the operating system as one single disk.