Main description page
BLTP main page
CUDA
Nvidia's technology for computing on graphics processors
Introduction
Hardware
See the Nvidia Tesla article in Wikipedia.
TECHNICAL SPECIFICATIONS

Tesla P100 on THEOR3 (micro-architecture Pascal GP100)

RTX A2000 on i9a (micro-architecture Ampere GA106)
- FORM FACTOR → PCIe x16 form factor
- # OF CUDA CORES → 3328
- # OF TENSOR CORES → 104
- CUDA COMPUTE CAPABILITY → 8.6
- FREQUENCY OF CUDA CORES → up to 1.2 GHz
- DOUBLE PRECISION FLOATING POINT PERFORMANCE (PEAK) → 249 Gflops
- SINGLE PRECISION FLOATING POINT PERFORMANCE (PEAK) → 7.987 Tflops
- TOTAL DEDICATED MEMORY → 12 GB GDDR6
- MEMORY SPEED → 1.5 GHz
- MEMORY INTERFACE → 192-bit
- MEMORY BANDWIDTH → 288 GB/sec
- POWER CONSUMPTION → 70 W TDP
- SYSTEM INTERFACE → PCIe x16

Tesla C2075 on Theor2 (micro-architecture Fermi GF100)
- FORM FACTOR → 9.75-inch PCIe x16 form factor
- # OF CUDA CORES → 448
- FREQUENCY OF CUDA CORES → 1.15 GHz
- DOUBLE PRECISION FLOATING POINT PERFORMANCE (PEAK) → 515 Gflops
- SINGLE PRECISION FLOATING POINT PERFORMANCE (PEAK) → 1.03 Tflops
- TOTAL DEDICATED MEMORY → 6 GB GDDR5
- MEMORY SPEED → 1.5 GHz
- MEMORY INTERFACE → 384-bit
- MEMORY BANDWIDTH → 144 GB/sec
- POWER CONSUMPTION → 225 W TDP
- SYSTEM INTERFACE → PCIe x16 Gen2
- THERMAL SOLUTION → Active fansink
- DISPLAY SUPPORT → Dual-Link DVI-I: 1, maximum display resolution 1600x1200
Software
Performance
/* naive triple-loop matrix multiplication: C = A * B */
for (i = 0; i < MatrixSize; i++)
    for (j = 0; j < MatrixSize; j++)
        for (k = 0; k < MatrixSize; k++)
            C[j][i] += A[j][k] * B[k][i];
GFlops = 2 * MatrixSize^3 / 10^9 / ExecutionTime
Example for Maple
theor2:> maple test_cuda.mpl
|\^/| Maple 16 (X86 64 LINUX)
._|\| |/|_. Copyright (c) Maplesoft, a division of Waterloo Maple Inc. 2012
\ MAPLE / All rights reserved. Maple is a trademark of
<____ ____> Waterloo Maple Inc.
| Type ? for help.
> CUDA:-IsEnabled();
false
> CUDA:-Enable(true);
false
> CUDA:-IsEnabled();
true
>
> CUDA:-HasDoubleSupport();
table([0 = true])
>
> with(LinearAlgebra):
> M:=RandomMatrix(4000,outputoptions=[datatype=float[4]]);
[ 4000 x 4000 Matrix ]
M := [ Data Type: float[4] ]
[ Storage: rectangular ]
[ Order: Fortran_order ]
> N:=RandomMatrix(4000,outputoptions=[datatype=float[4]]);
memory used=124.1MB, alloc=126.0MB, time=0.88
[ 4000 x 4000 Matrix ]
N := [ Data Type: float[4] ]
[ Storage: rectangular ]
[ Order: Fortran_order ]
>
> time[real](MatrixMatrixMultiply(M,N));
memory used=185.2MB, alloc=187.1MB, time=0.92
0.617
> CUDA:-Enable(false);
true
> time[real](MatrixMatrixMultiply(M,N));
5.623
>
>
> CUDA:-Enable(true);
false
> M:=RandomMatrix(4000,outputoptions=[datatype=float[8]]);
memory used=368.4MB, alloc=248.1MB, time=7.48
[ 4000 x 4000 Matrix ]
M := [ Data Type: float[8] ]
[ Storage: rectangular ]
[ Order: Fortran_order ]
> N:=RandomMatrix(4000,outputoptions=[datatype=float[8]]);
memory used=490.6MB, alloc=370.2MB, time=7.88
[ 4000 x 4000 Matrix ]
N := [ Data Type: float[8] ]
[ Storage: rectangular ]
[ Order: Fortran_order ]
>
> time[real](MatrixMatrixMultiply(M,N));
1.640
>
> CUDA:-Enable(false);
true
>
> time[real](MatrixMatrixMultiply(M,N));
10.614
>
> CUDA:-Properties();
[table(["Max Threads Dimensions" = [1024, 1024, 64], "Clock Rate" = 1147000,
"Max Grid Size" = [65535, 65535, 65535], "Memory Pitch" = 2147483647,
"Max Threads Per Block" = 1024, "Warp Size" = 32,
"Kernel Exec Timeout Enabled" = false, "Registers Per Block" = 32768,
"ID" = 0, "Texture Alignment" = 512, "Minor" = 0,
"MultiProcessor Count" = 14, "Shared Memory Per Block" = 49152,
"Total Global Memory" = 4294967295, "Major" = 2, "Name" = "Tesla C2075",
"Total Constant Memory" = 65536,
"Device Overlap" = 1
])]
> quit
memory used=734.8MB, alloc=614.3MB, time=20.10
Sources of information
BLTP computer group
February 20, 2013
e-mail: super@theor.jinr.ru, telepuzik@theor.jinr.ru
e-mail: yoda@theor.jinr.ru, godzilla@theor.jinr.ru
Last updated: 2024-08-16 17:23:20