Amdahl's Law

Amdahl's Law states that when parts of a program run at two different speeds, the slower speed dominates overall performance unless nearly all of the work runs at the faster speed. Here are the details for a Cray YMP C90.

One vector processor of a Cray YMP C90 can perform two floating-point adds per clock cycle, and each clock cycle takes 4 nanoseconds. Its vector speed, in MFLOPS, is therefore

In[1]:=


  v = N[2 / (4 10^-9) * 10^-6]

Out[1]=

  500.

One processor operating in scalar mode can perform two floating-point adds in 6 clock cycles, so its scalar speed, in MFLOPS, is

In[2]:=

  s = N[2 / (6 * 4 10^-9) * 10^-6]

Out[2]=

  83.3333
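
Both rate calculations follow one pattern: operations completed, divided by the time they take, then scaled from FLOPS to MFLOPS. A minimal sketch of that pattern follows; the helper name mflops and its argument names are illustrative assumptions, not part of the original session.

```mathematica
(* Hypothetical helper: rate in MFLOPS for `ops` floating-point operations
   completed in `cycles` clock cycles of `ns` nanoseconds each *)
mflops[ops_, cycles_, ns_] := ops/(cycles*ns*10^-9)*10^-6

N[mflops[2, 1, 4]]   (* vector mode: 500. *)
N[mflops[2, 6, 4]]   (* scalar mode: 83.3333 *)
```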

If a fraction, f, of the operations is performed in vector mode, then the average time per operation is f/v + (1-f)/s, and the overall performance rate, in MFLOPS, is its reciprocal:

In[3]:=

  r[f_,v_,s_] := 1 / (f/v + (1-f)/s)
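
The formula is the total operation count divided by the total time. As a quick check, the sketch below rebuilds it from that definition with every quantity left symbolic. The names elapsed, rate, vRate and sRate are illustrative assumptions, chosen so they do not clash with the numeric v and s set earlier.

```mathematica
(* Time to run n operations: f*n of them at vRate, the rest at sRate *)
elapsed[n_, f_, vRate_, sRate_] := f*n/vRate + (1 - f)*n/sRate

(* Same form as r, restated here so this block is self-contained *)
rate[f_, vRate_, sRate_] := 1/(f/vRate + (1 - f)/sRate)

(* operations/time minus the closed form; this should simplify to 0 *)
Simplify[n/elapsed[n, f, vRate, sRate] - rate[f, vRate, sRate]]

(* Even with an infinitely fast vector unit the rate stays finite;
   the result should be equivalent to sRate/(1 - f) *)
Limit[rate[f, vRate, sRate], vRate -> Infinity]
```

The limit is the usual statement of Amdahl's Law: the scalar fraction 1 - f caps the speedup over scalar mode at 1/(1 - f), however fast the vector hardware is.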

We can construct a table for various values of f.

In[4]:=

  fvalues = {0,.1,.2,.3,.4,.5,.6,.7,.8,.9,.95,1};

In[5]:=

  Prepend[
     Map[{#, r[#,v,s], r[#,v,s]/v, r[#,v,s]/s}&, fvalues],
     {"f", "r", "fraction of peak", "speedup"}]//TableForm

Out[5]=

  f      r         fraction of peak   speedup
  0      83.3333   0.166667           1.
  0.1    90.9091   0.181818           1.09091
  0.2    100.      0.2                1.2
  0.3    111.111   0.222222           1.33333
  0.4    125.      0.25               1.5
  0.5    142.857   0.285714           1.71429
  0.6    166.667   0.333333           2.
  0.7    200.      0.4                2.4
  0.8    250.      0.5                3.
  0.9    333.333   0.666667           4.
  0.95   400.      0.8                4.8
  1      500.      1.                 6.

We can also graph r as a function of f.

In[6]:=

  Plot[r[f,v,s], {f,0,1}, AxesLabel -> {"f", "MFLOPS"}]

Notice that performance stays low until f is about 0.8, and even then it reaches only half of peak. If 95 percent of the program is vectorizable, performance rises to 80% of peak, but this is still not very spectacular: the 5 percent of operations left in scalar mode take 24% of the run time. Things look pretty bleak!
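
To see how demanding this is, one can invert the rate formula and ask what vectorized fraction a given share of peak requires. A minimal sketch, assuming the vector and scalar rates computed earlier and using their exact values; the names vPeak, sScalar, rate and needed are illustrative, not from the original session.

```mathematica
vPeak = 500; sScalar = 250/3;   (* exact forms of v and s, in MFLOPS *)
rate[f_] := 1/(f/vPeak + (1 - f)/sScalar)

(* Fraction f that must be vectorized to reach a share p of peak *)
needed[p_] := f /. First[Solve[rate[f] == p*vPeak, f]]

N[needed /@ {1/2, 8/10, 9/10}]   (* {0.8, 0.95, 0.977778} *)
```

Under these assumptions, reaching 90% of peak would require nearly 98% of the operations to run in vector mode.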
