Multigrid Solver

Posted on March 11th, 2018

Previously on this blog we saw how to obtain a numerical approximation to Poisson’s equation \(\nabla^2\phi =b \) using the Finite Difference or Finite Volume method. In 1D, Finite Difference gives us
$$ \frac{\phi_{i-1} -2\phi_i + \phi_{i+1}}{(\Delta x)^2} = b_i$$
We can then solve this linear system using an iterative scheme such as Gauss-Seidel. We simply loop over the nodes over and over, each time computing a new approximation of the node value at iteration \(n+1\) from
$$ \phi_i^{n+1} = \frac{1}{2}\left( \phi_{i-1}^{n+1} + \phi_{i+1}^{n} - (\Delta x)^2 b_i\right)$$
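
As a concrete sketch (not the attached code), a single Gauss-Seidel sweep over the interior nodes could look like the following; here phi, b, and dx stand for the solution vector, the right hand side, and the node spacing:

def gauss_seidel_sweep(phi, b, dx):
    #one in-place Gauss-Seidel sweep of phi'' = b on a uniform 1D mesh
    #boundary nodes phi[0] and phi[-1] are assumed to be handled separately
    for i in range(1, len(phi)-1):
        #phi[i-1] already holds the new (n+1) value, phi[i+1] still the old (n) value
        phi[i] = 0.5*(phi[i-1] + phi[i+1] - dx*dx*b[i])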

Motivation

While this method is simple to implement, it is very slow to converge. To see why, let’s consider a system with
$$b = A \sin\left(\frac{k2\pi x}{L}\right)$$
where \(A\) is some magnitude, \(k\) is an integer wave number, and \(L\) is the domain length. In 1D, we have
$$\frac{\partial^2\phi}{\partial x^2} = A \sin\left(\frac{k2\pi x}{L}\right)$$
We can integrate this system analytically subject to boundary conditions,
$$\frac{\partial \phi}{\partial x} = -A \cos\left(\frac{k2\pi x}{L}\right) \left(\frac{L}{k2\pi}\right) +C_1$$
$$\phi = -A \sin\left(\frac{k2\pi x}{L}\right) \left(\frac{L}{k2\pi}\right)^2 +C_1 x + C_2$$
Prescribing the Neumann zero boundary condition \({\partial \phi}/{\partial x}=0\) on \(x=0\) and the Dirichlet \(\phi=0\) on \(x=L\), we obtain
$$C_1 = A\left(\frac{L}{k2\pi}\right)$$
and
$$C_2 = -C_1 L$$
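
For reference, this analytical solution is simple to evaluate numerically. The short sketch below uses the same \(A=10\), \(k=4\), and a unit-length domain as the example discussed next (the node count of 128 is arbitrary):

import numpy as np

A, k, L = 10.0, 4, 1.0      #magnitude, wave number, domain length
x = np.linspace(0, L, 128)  #node positions

C1 = A*L/(k*2*np.pi)        #from dphi/dx = 0 at x = 0
C2 = -C1*L                  #from phi = 0 at x = L
phi_true = -A*np.sin(k*2*np.pi*x/L)*(L/(k*2*np.pi))**2 + C1*x + C2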

Since we have the analytical solution, we can compute the actual error vector \(\epsilon \equiv (\phi^n - \phi)\) at each iteration. Note that normally you wouldn’t have this option since the solution is not known a priori – otherwise there would be no need to solve the system! In Figure 1 we graph this \(\epsilon\) term over the first 50 iterations (left) and over the entire solution cycle (right) for \(A=10\) and \(k=4\).

Figure 1. Solution error in a Gauss-Seidel solver over the first 50 iterations (left) and over the entire solution cycle of 17,500 iterations (right)

What you can notice right away is that initially there is a high frequency error that dissipates quite quickly. After this, what remains is a low frequency term that takes much longer to reduce. You can think of the high frequency term as the node-to-node variation. Poisson’s equation specifies the second derivative (the “acceleration”) of \(\phi\). The high frequency term corresponds to how well the local trends are represented – whether the value from node to node “accelerates” correctly. However, just because this local behavior is captured doesn’t mean that the entire solution is right. A Dirichlet boundary is needed to “ground” the system, otherwise we would not have a unique solution. So while the node values may be growing or shrinking correctly node to node, the overall solution may be satisfying a different set of boundary conditions. This implies that the entire solution needs to scale up or down. This is the low frequency term. Because our Finite Difference scheme is based on a three point stencil (in 1D), it takes a long time for the boundary information to propagate. In the picture below, you can see that four solver iterations are needed for the highlighted yellow node to even become aware of the boundary.

Figure 2. It takes four iterations to propagate the boundary value to the highlighted node

This is the motivation behind multigrid schemes. Since we are after the “global” behavior, it makes sense to advance the solution using a coarser grid. This way, we are effectively stretching the “arms” of the stencil, allowing it to use data from farther away. Using a coarse mesh with double the mesh spacing halves the number of iterations needed to propagate the boundary value, from four to two.

Figure 3. Using a coarse grid, we reduce the number of iterations needed to propagate the boundary value to just two

Error Vector

While it may seem natural to continue by solving the system on the coarser grid, this is actually not the best approach. We don’t care for the actual solution of the original linear system on the coarse grid. Instead we want to use the coarse grid to help us improve the solution on the finer grid. Consider some iterative scheme to solve the linear system
$$A \phi = b$$
After \(n\) solver iterations, we obtain
$$A\phi^n = b + R^n$$
where \(R^n\) is a residue vector that should approach zero as \(n \to \infty\). Subtracting the two equations, we get
$$A(\phi^n - \phi) = R^n$$
Here the term in the parentheses, \(\epsilon\equiv (\phi^n-\phi)\), corresponds to the error in our approximation of the solution that is yielding the residue \(R^n\). If the error is smoothly varying (which it is, as we just saw above), we can try using a coarser grid to approximate the error term and then interpolate it back to the fine grid to update the solution.
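
This relationship is easy to verify numerically. The small sketch below builds an arbitrary dense system (purely illustrative, it has nothing to do with the Poisson matrix) and confirms that the error vector indeed satisfies \(A\epsilon = R\):

import numpy as np

np.random.seed(1)
A = np.random.rand(5,5) + 5*np.eye(5)  #some well-conditioned 5x5 matrix
phi = np.random.rand(5)                #the "true" solution
b = A @ phi                            #matching right hand side

phi_n = phi + 0.01*np.random.rand(5)   #a perturbed (approximate) solution
R = A @ phi_n - b                      #residue of the approximation
eps = phi_n - phi                      #error vector
print(np.allclose(A @ eps, R))         #prints True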

Basic Multigrid Algorithm

There are different multigrid strategies, so here I am going to describe a simple two-grid scheme based on Ferziger’s Computational Methods for Fluid Dynamics.

1) We start by performing one or more iterations of the solution on the fine grid,
$$A(\phi_f^n)^* = b_f$$

2) We next compute the residue on the fine grid,
$$ R_f^n=A(\phi_f^n)^*-b_f$$

2b) Since we have the residue, we should check for termination. I call this step “2b” since it’s not really part of the solution algorithm but is still needed.

3) Next, we interpolate the residue to the coarse grid,
$$ R_f \to R_c$$
This operation is called restriction. In our Finite Difference example, every other fine node overlaps a coarse node, so we could simply assign values according to \([i]_c \equiv [2i]_f\). The downside of this approach is that we completely miss anything that is happening on the odd fine-grid nodes. Therefore, it may be better to use a node average such as
$$ [i]_c = 0.25([2i-1]_f + 2[2i]_f +[2i+1]_f)$$
This is visualized below in Figure 4. A code sketch of this restriction, together with the interpolation used in step 5, is shown after step 7.

Figure 4. Restriction from fine to the coarse grid

4) Now that we have the residue on the coarse grid, we can perform several iterations on the coarse grid to estimate
$$A \epsilon_c = R_c$$
We use the standard Gauss-Seidel scheme here, remembering that \((\Delta x)_c = 2(\Delta x)_f\). We obtain the boundary conditions from the desired behavior of the solution vector. On our Neumann node, the residue is \(R_0 = (\phi_0^*-\phi_1^*)/{\Delta x}\). Substituting \(\phi^* = \phi + \epsilon\) and subtracting the exact condition \((\phi_0-\phi_1)/{\Delta x} = 0\), we obtain \(\epsilon_0 = \epsilon_1 + R_0\Delta x\). On the Dirichlet node I just use \(\epsilon=0\) since there is no reason why that node would not already have the final value.

5) The next step is to interpolate the correction vector to the fine grid,
$$ \epsilon_c \to \epsilon_f$$
Every other node on the fine mesh overlaps a coarse grid node, so there we simply copy the value. On the intermediate nodes (nodes with odd indexes), we set the value to the average of the two surrounding coarse grid nodes, \( (\epsilon_f )_{2i+1} = 0.5((\epsilon_c)_{i} + (\epsilon_c)_{i+1}) \), as shown in Figure 5.

Figure 5. Interpolation from the coarse to the fine grid

6) Finally, we correct our solution using
$$ \phi_f^{n} = (\phi_f^n)^* - \epsilon_f $$

7) We then return to step 1 and continue until the norm of the residue vector \(R^n\) reaches some tolerance value.
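
Steps 3 and 5 are the only new machinery compared to a standard solver, so here is a minimal sketch of what the restriction and interpolation (prolongation) operators might look like on this 1D mesh. This is an illustration rather than the attached code; it assumes the numbering from Figures 4 and 5, with node 0 being the Neumann boundary, the last fine node the Dirichlet boundary, and an even number of fine nodes:

import numpy as np

def restrict(R_f):
    #full-weighting restriction (step 3): coarse node i sits on fine node 2i
    ni_c = len(R_f)//2
    R_c = np.zeros(ni_c)
    R_c[0] = R_f[0]    #Neumann boundary node copied directly
    for i in range(1, ni_c):
        R_c[i] = 0.25*(R_f[2*i-1] + 2*R_f[2*i] + R_f[2*i+1])
    return R_c

def prolongate(eps_c, ni_f):
    #linear interpolation back to the fine mesh (step 5)
    eps_f = np.zeros(ni_f)
    for i in range(ni_f-1):
        if i%2 == 0:    #overlapping node: copy the coarse value
            eps_f[i] = eps_c[i//2]
        else:           #in-between node: average the two coarse neighbors
            eps_f[i] = 0.5*(eps_c[i//2] + eps_c[i//2+1])
    return eps_f

The coarse-level Gauss-Seidel sweep for \(\epsilon_c\) then operates on the restricted residue, with the boundary treatment described in step 4.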

Implementation

This algorithm is illustrated with the attached Python code. It contains both the GS and MG solvers. The GS solver requires 17,500 iterations to converge, while the MG solver needs only 93 loops of the above algorithm to reach the same tolerance! But comparing the number of iterations is a bit like comparing apples to oranges, since each MG iteration is much more complicated. Thus, it’s better to compare the number of operations. This is a bit of fuzzy math, but I tried to stay consistent between the two solvers: I mainly counted stand-alone multiplications and accesses to the solution vector. Using this scheme, we have 5 operations per node per Gauss-Seidel iteration,

phi[0] = phi[1]    #neumann BC on x=0
for i in range (1,ni-1):    #Dirichlet BC at last node so skipping
    g = 0.5*(phi[i-1] + phi[i+1] - dx*dx*b[i])
    phi[i] = phi[i] + w*(g-phi[i])

This includes two operations for the SOR scheme. Periodically we also need to compute the residue to check for convergence. The code for that is

#compute residue to check for convergence
if (it%R_freq==0):
    r_i = phi[1]-phi[0]
    r_sum = r_i*r_i
    for i in range (1, ni-1):
        r_i = (phi[i-1] - 2*phi[i] + phi[i+1])/(dx*dx) - b[i]
        r_sum += r_i*r_i

This also involves 5 operations per node, but these occur only every R_freq iterations. Hence, the total operation count for the GS solver is (num_its*ni*5 + (num_its/R_freq)*5). This same GS scheme is used in the Multigrid solver to iterate \((\phi^n)^*\) and \(\epsilon_c\), with the caveat that the second iteration runs on only (ni/2) nodes. We additionally have operations to compute the residue and to interpolate between meshes. Some of these operations could be optimized out, but I kept them here for clarity. Comparing operation counts, we see that MG needs a factor of 6.9 fewer operations than Gauss-Seidel (1.6 million vs. 11.1 million)! Not bad at all. This result was obtained using only a single iteration to update \((\phi^n)^*\) (so only a single pass), and 50 iterations to approximate \(\epsilon_c\). These values were arrived at by trial and error, so there may be better combinations out there.

Figure 6 below plots the evolution of the solution \(\phi^n\) for the two solvers. The Gauss-Seidel data on the left shows the solution every 250 iterations, while the Multigrid plot on the right shows data after every 25 iterations on the fine level. We can clearly see the benefit of the Multigrid scheme: the number of iterations on the fine mesh, which contains more data points, is greatly reduced! The effect is even more pronounced in 2D and 3D, where the number of unknowns decreases by a factor of 4 or 8 between grid levels.

As a final note, most “production” multigrid solvers use additional levels of refinement. A common scheme is the “W” cycle. It uses three grid levels 1, 2, 3, ordered from the coarsest to the finest. We visit the grids in the order 3-2-1-2-1-2-3 which, if plotted, traces out a W shape.

Figure 6. Comparison of the solution progress for the Gauss-Seidel (left) and Multigrid (right) solvers. The GS plot shows every 250th, while the MG plot shows every 25th iteration. The analytical solution is shown by the dashed red line.
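
To give an idea of how such deeper cycles can be organized, here is a rough sketch of a recursive formulation. It is not the code used in this article: restrict and prolongate are the helpers sketched after step 7, smooth and residue are simplified inline versions of the Gauss-Seidel sweep and residue computation (for brevity, the Neumann correction of \(\epsilon\) from step 4 is omitted), and the gamma parameter controls the cycle shape, with gamma=1 tracing a V and gamma=2 a W:

import numpy as np

def smooth(phi, b, dx, iterations):
    #a few Gauss-Seidel sweeps of phi'' = b, zero-gradient at node 0
    for _ in range(iterations):
        phi[0] = phi[1]
        for i in range(1, len(phi)-1):
            phi[i] = 0.5*(phi[i-1] + phi[i+1] - dx*dx*b[i])

def residue(phi, b, dx):
    #R = A*phi - b on the interior, boundary entries as in step 2
    R = np.zeros(len(phi))
    R[1:-1] = (phi[:-2] - 2*phi[1:-1] + phi[2:])/(dx*dx) - b[1:-1]
    R[0] = (phi[0] - phi[1])/dx
    return R

def mg_cycle(phi, b, dx, level, max_level, gamma=1):
    #one recursive multigrid cycle: gamma=1 gives a V cycle, gamma=2 a W cycle
    smooth(phi, b, dx, 1)                    #pre-smoothing on this grid
    if level < max_level:
        R_c = restrict(residue(phi, b, dx))  #step 3: restrict the residue
        eps_c = np.zeros(len(R_c))
        for _ in range(gamma):               #one or more recursive coarse-grid visits
            mg_cycle(eps_c, R_c, 2*dx, level+1, max_level, gamma)
        phi -= prolongate(eps_c, len(phi))   #steps 5 and 6: coarse-grid correction
    else:
        smooth(phi, b, dx, 50)               #approximately solve on the coarsest grid
    smooth(phi, b, dx, 1)                    #post-smoothing

Calling mg_cycle(phi, b, dx, 1, 3) repeatedly until the residue norm drops below the tolerance then plays the role of the loop in the two-grid algorithm above.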

Code

Here is a snippet of the Multigrid solver

#multigrid solver
def MGsolve(phi,b):
    global eps_f, eps_c, R_f, R_c
    print("**** MULTIGRID SOLVE ****")
    ni = len(phi)
    ni_c = ni>>1    #divide by 2
    dx_c = 2*dx
 
    R_f = np.zeros(ni)
    R_c = np.zeros(ni_c)
    eps_c = np.zeros(ni_c)
    eps_f = np.zeros(ni)
 
    pl.figure()
    #plot analytical solution
    pl.plot(x,phi_true,linewidth=4,color='red',linestyle="dashed")
    pl.title('Multigrid Solver: phi vs. x')
 
    for it in range(10001):
 
        #number of steps to iterate at the finest level
        inner_its = 1
 
        #number of steps to iterate at the coarse level        
        inner2_its = 50
        w = 1.4
 
        #1) perform one or more iterations on fine mesh
        for n in range(inner_its):
            phi[0] = phi[1]            
            for i in range (1,ni-1):
                g = 0.5*(phi[i-1] + phi[i+1] - dx*dx*b[i])
                phi[i] = phi[i] + w*(g-phi[i])
 
        #2) compute residue on the fine mesh, R = A*phi - b
        for i in range(1,ni-1):
            R_f[i] = (phi[i-1]-2*phi[i]+phi[i+1])/(dx*dx) - b[i]
        R_f[0] = (phi[0] - phi[1])/dx    #neumann boundary
        R_f[-1] = phi[-1] - 0   #dirichlet boundary
 
        #2b) check for termination
        r_sum = 0
        for i in range(ni):
            r_sum += R_f[i]*R_f[i]
        norm = np.sqrt(r_sum)/ni
        if (norm<1e-4): 
            print("Converged after %d iterations"%it)
            print("This is %d solution iterations on the fine mesh"%(it*inner_its))
            op_count_single = (inner_its*ni*5 + ni*5 + (ni>>1) + 
                               inner2_its*(ni>>1)*5 + ni + ni)
            print("Operation count: %d"%(op_count_single*it))
            pl.plot(x,phi)
            return op_count_single*it
 
        #3) restrict residue to the coarse mesh
        for i in range(2,ni-1,2):
            R_c[i>>1] = 0.25*(R_f[i-1] + 2*R_f[i] + R_f[i+1])            
        R_c[0] = R_f[0]
 
        #4) perform few iteration of the correction vector on the coarse mesh
        eps_c[:] = 0
        for n in range(inner2_its):
            eps_c[0] = eps_c[1] + dx_c*R_c[0]
            for i in range(1,ni_c-1):
                g = 0.5*(eps_c[i-1] + eps_c[i+1] - dx_c*dx_c*R_c[i])
                eps_c[i] = eps_c[i] + w*(g-eps_c[i])
 
        #5) interpolate eps to fine mesh
        for i in range(1,ni-1):
            if (i%2==0):    #even nodes, overlapping coarse mesh
                eps_f[i] = eps_c[i>>1]
            else:
                eps_f[i] = 0.5*(eps_c[i>>1]+eps_c[(i>>1)+1])
        eps_f[0] = eps_c[0]
 
        #6) update solution on the fine mesh
        for i in range(0,ni-1):
            phi[i] = phi[i] - eps_f[i]
 
        if (it % int(25/inner_its) == 0):
            pl.plot(x,phi)
 
    return -1

Attachments

Source code: multigrid.py

4 comments to “Multigrid Solver”

  1. John
    August 13, 2021 at 12:38 pm

    You can speed up the multigrid further by doing a deeper V-cycle. Going from a fine mesh to a coarse mesh and then to an even coarser one, the operation count ratio went from

    Operation count ratio (GS/MG): 6.91105

    to

    Operation count ratio (GS/MG): 82.2093

    And the duration to solve the potential went from

    5.609375 seconds for Gauss-Seidel

    0.859375 seconds for MG with 1 level coarse grid V cycle

    0.28125 seconds for MG with 2 levels coarse grid V cycle.

    Here is the modified code for a 2 level coarse grid MultiGrid solver.

    # -*- coding: utf-8 -*-
    """
    Example 1D Multigrid solver
    Written by Lubos Brieda for
    https://www.particleincell.com/2018/multigrid-solver/

    Modified by John Coady for 2 level MultiGrid V cycle for improved result.

    Fri Mar 9 21:47:15 2018
    """

    import numpy as np
    import pylab as pl
    import time

    #standard GS+SOR solver
    def GSsolve(phi,b):
        print("**** GS SOLVE ****")

        #plot for solution vector
        fig_sol = pl.figure()
        #plot analytical solution
        pl.plot(x,phi_true,linewidth=4,color='red',linestyle="dashed")
        pl.title('GS Solver: phi vs. x')

        #plot for error
        fig_err = pl.figure()
        pl.title('GS Solver: (phi-phi_true) vs. x')

        R_freq = 100   #how often should residue be computed
        w = 1.4        #SOR acceleration factor

        for it in range(100001):
            phi[0] = phi[1]   #neumann BC on x=0
            for i in range (1,ni-1):   #Dirichlet BC at last node so skipping
                g = 0.5*(phi[i-1] + phi[i+1] - dx*dx*b[i])
                phi[i] = phi[i] + w*(g-phi[i])

            #compute residue to check for convergence
            if (it%R_freq==0):
                r_i = phi[1]-phi[0]
                r_sum = r_i*r_i
                for i in range (1, ni-1):
                    r_i = (phi[i-1]-2*phi[i]+phi[i+1])/(dx*dx) - b[i]
                    r_sum += r_i*r_i

                norm = np.sqrt(r_sum)/ni
                if (norm<1e-4):
                    print("Converged after %d iterations"%it)
                    op_count = it*ni*5 + (it/R_freq)*5
                    print("Operation count: %d"%op_count)
                    pl.figure(fig_sol.number)
                    pl.plot(x,phi)
                    return op_count

            #periodically plot the solution and the error
            if (it%250==0):
                pl.figure(fig_sol.number)
                pl.plot(x,phi)
                pl.figure(fig_err.number)
                pl.plot(x,phi-phi_true)

        return -1

    #multigrid solver
    def MGsolve(phi,b):
        global eps_h, eps_2h, eps_4h, R_h, R_2h, R_4h
        print("**** MULTIGRID SOLVE ****")
        ni_h = len(phi)
        ni_2h = ni_h>>1    #divide by 2
        dx_2h = 2*dx
        ni_4h = ni_2h>>1   #divide by 2
        dx_4h = 2*dx_2h

        R_h = np.zeros(ni_h)
        R_2h = np.zeros(ni_2h)
        R_4h = np.zeros(ni_4h)
        eps_4h = np.zeros(ni_4h)
        eps_2h = np.zeros(ni_2h)
        eps_h = np.zeros(ni_h)

        pl.figure()
        #plot analytical solution
        pl.plot(x,phi_true,linewidth=4,color='red',linestyle="dashed")
        pl.title('Multigrid Solver: phi vs. x')

        for it in range(10001):

            #number of steps to iterate at the finest level
            inner_its = 1

            #number of steps to iterate at the coarse level
            inner2h_its = 5
            inner4h_its = 50
            w = 1.4

            #1) perform one or more iterations on fine mesh
            for n in range(inner_its):
                phi[0] = phi[1]
                for i in range (1,ni-1):
                    g = 0.5*(phi[i-1] + phi[i+1] - dx*dx*b[i])
                    phi[i] = phi[i] + w*(g-phi[i])

            #2) compute residue on the fine mesh, R = A*phi - b
            for i in range(1,ni-1):
                R_h[i] = (phi[i-1]-2*phi[i]+phi[i+1])/(dx*dx) - b[i]
            R_h[0] = (phi[0] - phi[1])/dx    #neumann boundary
            R_h[-1] = phi[-1] - 0    #dirichlet boundary

            #2b) check for termination
            r_sum = 0
            for i in range(ni):
                r_sum += R_h[i]*R_h[i]
            norm = np.sqrt(r_sum)/ni_h
            if (norm<1e-4):
                print("Converged after %d iterations"%it)
                print("This is %d solution iterations on the fine mesh"%(it*inner_its))
                op_count_single = (inner_its*ni_h*5 + ni_h*5 + (ni_h>>1) +
                                   inner2h_its*(ni_h>>1)*5 + ni_h + ni_h)
                print("Operation count: %d"%(op_count_single*it))
                pl.plot(x,phi)
                return op_count_single*it

            #3) restrict residue to the 2h mesh
            for i in range(2,ni_h-1,2):
                R_2h[i>>1] = 0.25*(R_h[i-1] + 2*R_h[i] + R_h[i+1])
            R_2h[0] = R_h[0]

            #4) perform few iteration of the correction vector on the coarse mesh
            eps_2h[:] = 0
            for n in range(inner2h_its):
                eps_2h[0] = eps_2h[1] + dx_2h*R_2h[0]
                for i in range(1,ni_2h-1):
                    g = 0.5*(eps_2h[i-1] + eps_2h[i+1] - dx_2h*dx_2h*R_2h[i])
                    eps_2h[i] = eps_2h[i] + w*(g-eps_2h[i])

            # Second V Restriction
            #3a) restrict residue to the 4h mesh
            for i in range(2,ni_2h-1,2):
                R_4h[i>>1] = 0.25*(R_2h[i-1] + 2*R_2h[i] + R_2h[i+1])
            R_4h[0] = R_2h[0]

            #4a) perform few iteration of the correction vector on the coarse mesh
            eps_4h[:] = 0
            for n in range(inner4h_its):
                eps_4h[0] = eps_4h[1] + dx_4h*R_4h[0]
                for i in range(1,ni_4h-1):
                    g = 0.5*(eps_4h[i-1] + eps_4h[i+1] - dx_4h*dx_4h*R_4h[i])
                    eps_4h[i] = eps_4h[i] + w*(g-eps_4h[i])

            #5a) interpolate eps to 2h mesh
            for i in range(1,ni_2h-1):
                if (i%2==0):    #even nodes, overlapping coarse mesh
                    eps_2h[i] = eps_4h[i>>1]
                else:
                    eps_2h[i] = 0.5*(eps_4h[i>>1]+eps_4h[(i>>1)+1])
            eps_2h[0] = eps_4h[0]

            #5) interpolate eps to fine mesh
            for i in range(1,ni_h-1):
                if (i%2==0):    #even nodes, overlapping coarse mesh
                    eps_h[i] = eps_2h[i>>1]
                else:
                    eps_h[i] = 0.5*(eps_2h[i>>1]+eps_2h[(i>>1)+1])
            eps_h[0] = eps_2h[0]

            #6) update solution on the fine mesh
            for i in range(0,ni_h-1):
                phi[i] = phi[i] - eps_h[i]

            if (it % int(25/inner_its) == 0):
                pl.plot(x,phi)
                # print("it: %d, norm: %g"%(it, norm))

        return -1

    ni = 128

    #domain length and mesh spacing
    L = 1
    dx = L/(ni-1)

    x = np.arange(ni)*dx

    phi = np.ones(ni)*0
    phi[0] = 0
    phi[ni-1] = 0
    phi_bk = np.zeros_like(phi)
    phi_bk[:] = phi[:]

    #set RHS
    A = 10
    k = 4
    b = A*np.sin(k*2*np.pi*x/L)

    #analytical solution
    C1 = A/(k*2*np.pi/L)
    C2 = -C1*L
    phi_true = (-A*np.sin(k*2*np.pi*x/L)*(1/(k*2*np.pi/L))**2 + C1*x + C2)

    pl.figure(1)
    pl.title('Right Hand Side')
    pl.plot(x,b)

    t = time.process_time()
    #do some stuff
    op_gs = GSsolve(phi,b)
    elapsed_time = time.process_time() - t
    print("GS took : %g secs"%(elapsed_time))

    t = time.process_time()
    #reset solution
    phi = phi_bk
    op_mg = MGsolve(phi,b)

    elapsed_time = time.process_time() - t
    print("MG took : %g secs"%(elapsed_time))

    print("Operation count ratio (GS/MG): %g"%(op_gs/op_mg))

  2. John
    August 13, 2021 at 1:43 pm

    You can try running these MultiGrid python example files online in the cloud here as Jupyter Notebooks.

    https://mybinder.org/v2/gh/jcoady/MultiGrid/HEAD

    There are 3 Jupyter Notebook files. The first one just executes a single level MultiGrid solver. The second notebook is a 2 level MultiGrid solver and the third one is a 3 level MultiGrid solver. Try running the code in each of the notebooks and compare the results. I also put the notebooks in a github repository here.

    https://github.com/jcoady/MultiGrid

  3. John
    August 14, 2021 at 3:28 am

    I updated the MultiGrid code further and improved the performance of the multigrid solver even more. There are 4 multigrid files with 1 V-cycle level, 2 V-cycle levels, 3 V-cycle levels and 4 V-cycle levels respectively. What I am seeing is that if there are more V-cycle levels then the multigrid code runs even faster to solve the problem. Here are the results that I am getting with

    ni = 128

    for 1 V-cycle level MultiGrid was 7.39054 times faster than Gauss-Seidel

    for 2 V-cycle levels MultiGrid was 25.293 times faster than Gauss-Seidel

    for 3 V-cycle levels MultiGrid was 38.7557 times faster than Gauss-Seidel

    for 4 V-cycle levels MultiGrid was 54.4867 times faster than Gauss-Seidel

    If I increased ni to 256 then I get the following results.

    for 1 V-cycle level MultiGrid was 7.81608 times faster than Gauss-Seidel

    for 2 V-cycle levels MultiGrid was 40.0115 times faster than Gauss-Seidel

    for 3 V-cycle levels MultiGrid was 126.4 times faster than Gauss-Seidel

    for 4 V-cycle levels MultiGrid was 254.213 times faster than Gauss-Seidel

    You can try running these python codes for yourself online as Jupyter Notebooks at this link.

    https://mybinder.org/v2/gh/jcoady/MultiGrid/HEAD

    And the Jupyter Notebook files are located here on github.

    https://github.com/jcoady/MultiGrid

    Here is an MIT lecture on MultiGrid where these V-cycle concepts are explained starting around the 19 minute mark in the video.

    https://ocw.mit.edu/courses/mathematics/18-086-mathematical-methods-for-engineers-ii-spring-2006/video-lectures/lecture-16-general-methods-for-sparse-systems/

  4. John
    August 15, 2021 at 9:25 pm

    I made a 3D version of the MultiGrid code in C++. The code can be found on github here.

    https://github.com/jcoady/MultiGrid/tree/main/cpp

    I used the 3D multigrid version to solve poisson equation for a 3D box with Dirichlet boundary conditions as shown in this video.

    https://www.youtube.com/watch?v=4hSWZU4gjKs

    The 3D multigrid solver was around 5 times faster than Gauss-Seidel solver but was about 10 times slower than the FFT solver for this problem.
