Best cost performance? I benchmarked an AMD instance on GCP.
Hello, everyone.
My name is Kanbara from the System Solutions Department.
I joined Beyond as a new graduate, and before I knew it, three years had gone by without my writing a single blog post. This is my first one.
This may be a bit sudden, but AMD's CPUs have been a personal favorite topic of mine for the past few years.
The first Zen-architecture CPUs were released in 2017, and their strong multi-threaded performance attracted a lot of attention.
Since then, CPUs based on improved architectures such as Zen+ and Zen 2 have been released one after another. In particular, the Zen 3 CPUs announced in October last year significantly improved not only multi-threaded but also single-threaded performance, making them very strong across the board.
I have been itching to use an AMD CPU in my next home-built PC as well, but for various reasons I haven't been able to build it yet. I want a 5800X.
Now, CPUs using AMD's Zen microarchitecture are also being deployed for servers.
Server products are developed under the AMD EPYC brand, and our company's Ohara has summarized EPYC's features in a blog post, so please take a look.
Servers/instances with EPYC CPUs can be used on various cloud platforms including AWS.
This time, I would like to launch a Compute Engine instance that uses EPYC on GCP and benchmark it against an instance that uses Intel's Xeon.
AMD EPYC on GCP Compute Engine
Google Compute Engine makes EPYC available on multiple machine types.
- E2 VM
- N2D VM
- Tau T2D VM
The benchmark uses an N2D VM. N2D is positioned as the AMD EPYC counterpart of the Intel Xeon-based N2 VM, which makes it easy to compare the CPUs directly, so that is what I chose.
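By the way, if you want to see which N2D machine types are available in a given zone, a gcloud query along these lines works (the zone here is just an example, not necessarily the one used in this post):

```
# List N2D machine types offered in an example zone
gcloud compute machine-types list \
  --zones=asia-northeast1-a \
  --filter="name~'^n2d-'"
```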
Instance selection
First, select an instance for benchmarking.
The machine type of the EPYC instance is an N2D VM, with the following specifications and configuration. EPYC also comes in generations; according to the official documentation, N2D VMs use the second-generation EPYC Rome. Incidentally, the latest EPYC is the third-generation Milan.
| Item | Spec |
| --- | --- |
| CPU Brand | AMD EPYC Rome |
| vCPU | 2 |
| Memory | 8GB |
| Disk | Balanced persistent disk 20GB |
| OS Image | centos-7-v20211105 |
The opposing Intel Xeon instance is an N2 VM with the following specs and configuration. An imbalance in specs would make the comparison meaningless, so everything except the CPU brand is exactly the same as the EPYC instance. The Xeon generation appears to be either Ice Lake or Cascade Lake, but either way it is a fairly recent one.
| Item | Spec |
| --- | --- |
| CPU Brand | Intel Xeon (Ice Lake or Cascade Lake) |
| vCPU | 2 |
| Memory | 8GB |
| Disk | Balanced persistent disk 20GB |
| OS Image | centos-7-v20211105 |
This time, we will take a benchmark with this configuration and compare it.
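The post doesn't show the creation steps, but instances matching the specs above could be created with gcloud roughly like this (instance names and the zone are placeholders of my own, not values from the actual setup):

```
# EPYC (N2D) instance -- name and zone are placeholders
gcloud compute instances create bench-epyc \
  --zone=asia-northeast1-a \
  --machine-type=n2d-standard-2 \
  --image=centos-7-v20211105 --image-project=centos-cloud \
  --boot-disk-type=pd-balanced --boot-disk-size=20GB

# Xeon (N2) instance -- identical settings except the machine type
gcloud compute instances create bench-xeon \
  --zone=asia-northeast1-a \
  --machine-type=n2-standard-2 \
  --image=centos-7-v20211105 --image-project=centos-cloud \
  --boot-disk-type=pd-balanced --boot-disk-size=20GB
```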
Check the CPU
After starting the instance, log in using SSH.
GCP integrates SSH into the console, so you can conveniently log in to the server straight from a browser. For a throwaway setup like this one, it's especially handy because you don't have to prepare a key yourself.
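Browser SSH was enough here, but if you prefer a local terminal, gcloud can handle the key setup as well (using the placeholder instance name and zone from the sketch above):

```
# Log in from a local terminal; gcloud generates and propagates the SSH key for you
gcloud compute ssh bench-epyc --zone=asia-northeast1-a
```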
After logging in with SSH, try running lscpu. First is the EPYC instance.
```
# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    2
Core(s) per socket:    1
Socket(s):             1
NUMA node(s):          1
Vendor ID:             AuthenticAMD
CPU family:            23
Model:                 49
Model name:            AMD EPYC 7B12
Stepping:              0
CPU MHz:               2249.998
BogoMIPS:              4499.99
Hypervisor vendor:     KVM
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              512K
L3 cache:              16384K
NUMA node0 CPU(s):     0,1
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc art rep_good nopl nonstop_tsc extd_apicid eagerfpu pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext retpoline_amd ssbd ibrs ibpb stibp vmmcall fsgsbase tsc_adjust bmi1 avx2 smep bmi2 rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr arat npt nrip_save umip
```
The model name is AMD EPYC 7B12. You can also see the cache configuration.
The architecture is naturally x86_64, and the instruction set is compatible with Intel Xeon.
SIMD support goes up to AVX2. However, since SIMD instructions see limited use in general server applications, this should be more than sufficient.
Next is the result when executed on a Xeon instance.
```
# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    2
Core(s) per socket:    1
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 85
Model name:            Intel(R) Xeon(R) CPU @ 2.80GHz
Stepping:              7
CPU MHz:               2800.252
BogoMIPS:              5600.50
Hypervisor vendor:     KVM
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              1024K
L3 cache:              33792K
NUMA node0 CPU(s):     0,1
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 arat avx512_vnni md_clear spec_ctrl intel_stibp arch_capabilities
```
The model name is Intel(R) Xeon(R) CPU @ 2.80GHz, and the specific model number seems to be masked.
Additionally, avx512 flags have been added to Flags, indicating that the CPU supports AVX512 (512-bit wide SIMD instructions).
The EPYC instance only supports up to AVX2 (256-bit SIMD), so the Xeon instance is likely to show higher performance in workloads that can take advantage of AVX-512 (and programs that support it), such as video encoding and image conversion.
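If you just want to compare the SIMD-related flags without reading the full flag list, a quick filter like the following (my own addition, not part of the original procedure) works on either instance:

```
# From the lscpu Flags line, list only the entries mentioning sse or avx
lscpu | grep '^Flags' | tr ' ' '\n' | grep -E 'sse|avx' | sort -u
```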
Benchmark preparation
This time, we will use UnixBench as the benchmark software.
IDCF has published an explanation of UnixBench along with installation instructions, so please refer to that.
Both the EPYC instance and the Xeon instance are booted from the same image (centos-7-v20211105), and we run yum -y update once before running the benchmark.
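The post doesn't list the exact installation steps, but the setup amounts to roughly the following (the GitHub mirror is one possible source for UnixBench; the IDCF article mentioned above covers installation in more detail):

```
# Install the compiler and Perl modules UnixBench needs
sudo yum -y install gcc gcc-c++ make perl perl-Time-HiRes git

# Fetch UnixBench and run it (./Run builds the tests on first use)
git clone https://github.com/kdlucas/byte-unixbench.git
cd byte-unixbench/UnixBench
./Run
```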
The kernel, GCC, and UnixBench versions of the execution environment are as follows.
| Item | Version |
| --- | --- |
| Linux Kernel | 3.10.0-1160.49.1.el7.x86_64 |
| GCC | 4.8.5 20150623 (Red Hat 4.8.5-44) |
| UnixBench | Version 5.1.3 |
When UnixBench is run without arguments, it first runs the benchmark with a parallelism of 1 and then runs it again with a parallelism equal to the number of logical CPUs (two runs in total).
Since the instances here were provisioned with 2 vCPUs, the number of logical CPUs is 2, so the second run uses a parallelism of 2.
The next section shows the results for both parallelism 1 and parallelism 2.
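Incidentally, the same behavior can be requested explicitly: UnixBench's Run script accepts -c to set the number of parallel copies, and passing it more than once runs the suite once per value, so the default on a 2-vCPU instance is equivalent to:

```
# Run the whole suite with 1 copy, then again with 2 copies
./Run -c 1 -c 2
```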
Benchmark results
First, here are the results for a parallelism of 1.
The blue bars are the Xeon instance, the red bars are the EPYC instance, and larger values are better.
The System Benchmarks Index Score (Overall) at the bottom is the total benchmark score, and the other items are the results of individual tests.
Looking at the results, the EPYC instance performs well across the board. The total score is 1096.6 for the Xeon instance versus 1288.2 for the EPYC instance, a significant difference. Among the individual items, it shows particularly strong results in System Call Overhead and the File Copy tests.
System calls and file copies are important indicators for server applications, so high performance in these items is great to see.
Next are the results for a parallelism of 2.
As before, the blue bars are the Xeon instance and the red bars are the EPYC instance.
The EPYC instance dominates here as well, with better results than the Xeon instance across the board.
What I would like to focus on here is the total score: while the Xeon instance stops at 1545.4, the EPYC instance climbs to 2253.1.
At a parallelism of 2, the Xeon instance scores about 1.4 times its parallelism-1 score, whereas the EPYC instance scores about 1.75 times its parallelism-1 score.
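Those scaling factors come straight from the scores above; a quick check with bc:

```
# Parallelism-2 score divided by parallelism-1 score
echo "scale=3; 1545.4 / 1096.6" | bc   # Xeon -> 1.409 (about 1.4x)
echo "scale=3; 2253.1 / 1288.2" | bc   # EPYC -> 1.749 (about 1.75x)
```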
This is only the result of a single benchmark suite, UnixBench, but the EPYC instance appears to scale more efficiently with parallelism than the Xeon instance.
Pricing
Although not directly related to the benchmark results, EPYC instances are also cheaper than Xeon instances.
With the configuration and specifications used in this benchmark, the Xeon instance costs about $74 per month, while the EPYC instance comes in at about $64 per month.
Summary
I launched an instance using AMD EPYC and an instance using Intel Xeon on GCP's Compute Engine and compared them with UnixBench.
In this benchmark, the EPYC instance showed superior performance. It also offers excellent cost performance, so please consider it when building a server on GCP.
That said, while the Intel instance did not fare well in this benchmark, results may well differ depending on the workload. Intel Xeon also supports AVX-512, which EPYC does not, so choose the instance that suits your workload on a case-by-case basis.
That's all. Thank you very much.