@andrthu andrthu commented Feb 27, 2020

Add an option to use Restricted Additive Schwarz (RAS) as the parallel preconditioner instead of Block-Jacobi (BJ). This is made possible by adding multiple layers of overlap between subdomains after partitioning the grid; this PR depends on OPM/opm-grid#449. The overlap layers are separated into two types: the overlap layer(s) and the outer ghost layer. The matrix entries in the overlap rows are then valid and can be used in the preconditioner. There is no need for a dedicated RAS preconditioner class, since the index set, the comm.project function and the comm.copyOwnerToAll function handle everything for us.

Norne with transmissibility edge-weights

To demonstrate the impact of RAS on the convergence, consider Norne with default partitioning options using 1 and 2 layers of overlap.

```shell
mpirun -np N flow Norne.DATA --overlap-layers=L --edge-weights-method=1
```

| N  | L=1 (BJ) iterations | L=2 (RAS) iterations |
|---:|---:|---:|
| 1  | 25167 | 25167 |
| 2  | 25103 | 24753 |
| 4  | 25000 | 25328 |
| 8  | 25328 | 25797 |
| 16 | 26137 | 25591 |
| 32 | 26178 | 25925 |
| 64 | 26505 | 25309 |

We observe a slight reduction in iteration count when using RAS instead of BJ for N=2, 16, 32 and 64, but a higher count for N=4 and 8. The main reason for these unimpressive results is that parallel flow on Norne with transmissibility edge-weights produces almost no increase in iterations compared to sequential flow, so there is little for RAS to recover. For different or larger cases running on more processes the benefit could be greater.

Norne with uniform edge-weights

To see an example where RAS has a more significant impact, consider Norne again, but now use uniform edge-weights to partition the grid.

```shell
mpirun -np N flow Norne.DATA --overlap-layers=L --edge-weights-method=0
```

| N  | L=1 (BJ) iterations | L=2 (RAS) iterations | L=3 (RAS) iterations |
|---:|---:|---:|---:|
| 1  | 25167 | 25167 | 25167 |
| 2  | 26533 | 26819 | 26789 |
| 4  | 27771 | 25276 | 25706 |
| 8  | 29555 | 26889 | 26844 |
| 16 | 29695 | 28466 | 28762 |
| 32 | 43380 | 32572 | 28371 |
| 64 | 50486 | 34684 | 28006 |

Here we see a lower iteration count for RAS for all N except 2. We also observe that the benefit of RAS increases with N.

Note that the tables show total number of iterations and not total time. The lower iteration count of RAS comes at the cost of increased parallel overhead.

Remove confusing, unused and unnecessary ParallelRestrictedAdditiveSchwarz.hpp file.

Enable multiple overlap layers and restricted additive Schwarz.
@akva2 akva2 left a comment


Trivial stuff. I now see you fixed a couple of them in the second commit, but fix it by not making the mistakes in the first place.

CMakeLists.txt (Outdated)

```cmake
EXE_NAME flow_onephase
DEPENDS "opmsimulators"
LIBRARIES "opmsimulators")
if (FALSE)
```

nope.

```cpp
        grid().globalCell().data(), grid().size(0));
    this->schedule().filterConnections(activeCells);
    }
}
```

nope

```cpp
typedef Dune::CollectiveCommunication< int > communication_type;
#endif

#if DUNE_VERSION_NEWER(DUNE_ISTL, 2, 6)
```

2.6 is the minimum, no need for the ifdefery.

blattms commented Feb 27, 2020

Please also report timings.

@blattms blattms left a comment


The diff of this PR is much shorter than I expected from the description.

Where do I choose the RAS preconditioner? I could not find the option.
What preconditioner is used internally when solving?

It seems like all the logic is implemented in opm-grid, which should not care about linear solver stuff. Is that right?

```cpp
// If overlapLayers_ > 1 we are using Restricted Additive Schwarz and therefore need
// the residual at overlap DoFs.
if (overlapLayers_ > 1)
    parallelInformation_arg.copyOwnerToAll(istlb, istlb);
```

Please explain why this is only needed for more than one overlap layer. Shouldn't the right-hand side always be consistent (all procs have the same values)?


The right-hand side is normally unique, which is correct with only one overlap layer since this gives a Dirichlet condition for the copy rows (?). I think the problem is that, for some reason, the residual is only calculated for the owner cells, since that is all the original solvers need. With this copying, and the additional setting of zero for the copy attribute (i.e. "useless overlap"), the right-hand side is correct. The correct thing would be to assemble for all owner and overlap cells (I would have preferred all) and then set it to zero (at the same time as the matrix elements are fixed to get the correct Dirichlet condition). If the AMG solver is used, I think a change in the redistribution function is needed to get correct behavior if the coarse-scale communicator has overlap attributes.


```cpp
// Add just a single element to ghost rows
if (elem.partitionType() != Dune::InteriorEntity)
if (elem.partitionType() == Dune::GhostEntity)
```

This is totally unrelated to this PR, but it just struck me when reading the diff:

Should we move the for-loop over the wellConnectionsGraph_ to the else branch and explicitly use `noGhostMat_->setrowsize(1);`? That seems clearer and might prevent surprises if we distribute the wells.

@GitPaean
Thanks for the efforts. I will try to test how it works with the model 2 variants.

@GitPaean
Please ignore my previous message about the parallel running failure with model 2.

 --overlap-layers=L 

I did not replace L with a number.

blattms commented Mar 2, 2020

FYI, my comments on OPM/opm-grid#449 apply here too.

I think if we want to support RAS then we should use an explicit option for choosing it and not --overlap-layers=x with x>1.

GitPaean commented Mar 2, 2020

Preliminary testing with two realizations of model 2, using overlap-layers=1 and 4 processes each, did not show any benefit in running time or convergence behavior; it was only slightly worse.

Maybe we need more tests.

I think it is something good to have for later testing and study, or something we can keep improving.

hnil commented Apr 3, 2020

If this is the only reasonable solver with non-empty "overlap attributes", why do we need an extra option for it?
