Add NonbondedInteractionGroup potential, rename existing nonbonded potentials #578

mcwitt · 2022-01-24T22:20:02Z

Add NonbondedInteractionGroup, allowing computation of nonbonded interactions between two sets of atoms (useful for efficient computation of host-ligand interactions; this uses the neighborlist updates from Feature/neighborlist on host ligand #545)
Rename existing nonbonded potentials for clarity:
- NonbondedDense -> NonbondedAllPairs (given a set of atoms, computes the sum of all pairwise interactions)
- NonbondedPairs -> NonbondedPairList (given a list of atom pairs, computes the sum of the specified pair interactions)

The implementation strategy for the interaction groups potential is to compose an extra permutation with the existing hilbert-sorting permutation, resulting in a permutation that

sorts atoms into contiguous blocks (column atoms followed by row atoms), and
(if enabled) hilbert-sorts each block independently

The idea for the initial correctness test is due to @maxentile . This uses the identity

U = U_A + U_B + U_{AB}

where U_A (U_B) is the all-pairs potential for the first (second) group, and U_{AB} is the "interaction group" potential.

Notes

This involves copying the existing nonbonded potential to a new file and making modifications, and as such is probably easiest to review commit-by-commit. I've squashed the original branch down to a smaller set of changes that should follow logically. The main implementation of the interaction group potential is in 945e62a, and the correctness test in 980185e
Currently it's only supported to specify one set of atom indices (e.g. for the ligand), and the remaining indices are assumed to be the other interacting group. This was to simplify the initial implementation, but we might need to generalize this to make it useful

timemachine/cpp/src/nonbonded_interaction_group.cu

timemachine/cpp/src/kernels/k_nonbonded.cuh

Rename `k_nonbonded_all_pairs.cuh` to (shared) `k_nonbonded.cuh` Move concrete implementations in k_nonbonded.cuh header to k_nonbonded.cu to prevent "multiple definitions" at linking stage

General strategy: modify permutation to 1. sort each group of atoms into a contiguous block 2. hilbert sort each group separately

In the interacting groups case (NR != 0), the row atom indices are offset in the sorted arrays (e.g. d_sorted_x) by the number of column atoms

maxentile

Looks good -- minor comments so far

tests/test_nonbonded_interaction_group.py

timemachine/cpp/src/nonbonded_interaction_group.cu

We now rebuild the neighborlist whenever the box changes at all

tests/test_nonbonded_interaction_group.py

timemachine/cpp/src/nonbonded_interaction_group.hpp

Row/col is an implementation detail; ligand/host reflects a common usage

maxentile

These changes are looking pretty good!

(Minor comments in new testing code)

maxentile · 2022-02-17T23:53:42Z

timemachine/potentials/nonbonded.py

@@ -251,6 +253,17 @@ def apply_cutoff(x):
    return vdW, electrostatics


+def nonbonded_v3_interaction_groups(conf, params, box, inds_l, inds_r, beta: float, cutoff: Optional[float] = None):


Nit: naming: (inds_l, inds_r) is used to mean pair indices in nonbonded_v3_on_specific_pairs (with len(inds_l) == len(inds_r) == num_pairs), but is used to mean group indices in nonbonded_v3_interaction_groups (with len(inds_l) != len(inds_r), num_pairs == len(inds_r) * len(inds_r))
--> suggestion: modify name in nonbonded_v3_interaction_groups signature to indicate these are groups which will be product'd rather than zip'd...

The more I think about this, the more I dislike the cryptic "zip(inds_l, inds_r)" idiom I introduced here, esp. when it gets in the way of more intuitive "product(group_a_inds, group_b_inds)" usage you're going for here.

There's no reason I can think of to prefer two (n_pairs,) arrays over a single (n_pairs, 2) array to represent a pair list, and I think I should remove the (inds_l, inds_r) / (inds_i, inds_j) pairs idiom throughout jax_utils in a follow-up PR...

There's no reason I can think of to prefer two (n_pairs,) arrays over a single (n_pairs, 2) array for pair lists

I think this makes sense to me too -- passing in a single (N, 2) array for the specific pairs case also automatically enforces the precondition that the two columns have the same length.

timemachine/potentials/nonbonded.py

maxentile · 2022-02-17T23:59:01Z

tests/test_nonbonded_interaction_group.py

+@pytest.mark.parametrize("precision,rtol,atol", [(np.float64, 1e-8, 1e-8), (np.float32, 1e-4, 5e-4)])
+@pytest.mark.parametrize("num_atoms_ligand", [1, 15])
+@pytest.mark.parametrize("num_atoms", [33, 231])
+def test_nonbonded_interaction_group_correctness(


Tests here look great to me now!

Roughly 5000x speedup for arrays of length 100

Addressing #578 (comment)

proteneer

This is looking fantastic! I added some comments in-line re: expanding coverage for varying lambda.

Given the critical nature of this code, I'm requesting an additional consistency tests as a sanity check that we didn't "break" anything. Can we check bitwise consistency of the original C++ Nonbonded code with NonbondedAllPairs U_A + NonbondedAllPairs U_B + NonbondedInteractionGroup U_AB, and let's vary lambda for this, and test it with and without hilbert curve sorting.

proteneer · 2022-02-18T14:53:53Z

timemachine/cpp/src/kernels/k_nonbonded.cu

+    vals[atom_idx] = atom_idx;
+}
+
+// TODO: DRY with k_coords_to_kv


definitely DRY for this one (do this in a later PR is fine), I think you can simply implement the old version with atom_idxs = np.arange(N)

proteneer · 2022-02-18T14:54:59Z

tests/test_nonbonded_interaction_group.py

+
+    ligand_idxs = rng.choice(num_atoms, size=num_atoms_ligand, replace=False).astype(np.int32)
+    lambda_plane_idxs = np.zeros(num_atoms, dtype=np.int32)
+    lambda_plane_idxs[ligand_idxs] = 1


Given that most of the alchemical changes (and their contributions to the delta Us) will be coming from the interaction group, we should expand the coverage to include typical ways that the interaction group may depend on lambda.

a) Currently the offset_idxs are all zero for the tests (only plane_idxs are modified in _lambda_planes()). The dependence on lambda isn't fully tested here. Because the w coordinates are:

w = lambda_plane_idx*cutoff + lambda*lambda_offset_idx du/dl = du/dw.dw/dl is only properly tested if lambda_offset_idx != 0

b) Need more test coverage for the InterpolatedNonbondedInteractionGroup (esp in conjunction with varying lambda_plane_idxs and lambda_offset_idxs etc.). I don't think I see any right now.

Currently the offset_idxs are all zero for the tests (only plane_idxs are modified in _lambda_planes())

Ah, makes sense. Updated to test with nonzero plane and offset idxs in 6c5598b and dbf4fbe.

Need more test coverage for the InterpolatedNonbondedInteractionGroup

Good call, this was untested. Added a Python wrapper and test in 6d199ca

Following #578 (comment)

mcwitt · 2022-02-18T21:24:10Z

Can we check bitwise consistency of the original C++ Nonbonded code with NonbondedAllPairs U_A + NonbondedAllPairs U_B + NonbondedInteractionGroup U_AB

@proteneer If I understand correctly I think this would require

exposing NonbondedAllPairs in Python, and
adding an option to NonbondedAllPairs to use a subset of atoms.

(1) may not be strictly necessary since we can just pass an empty list of exclusions to the full Nonbonded potential

(2) is needed because SummedPotential passes the same coordinates and parameters to each potential in the sum (and I think we'd probably need to use this to check bitwise equality, versus doing the sum after the conversion to floating-point in Python).

Would it make sense to add this test once NonbondedAllPairs is exposed and we've added an atom_idxs option? Can make an issue to track.

proteneer

LGTM after discussing offline!

maxentile

This should be a versatile new feature!

mcwitt changed the title ~~Add NonbondedInteractionGroups potential, rename existing nonbonded potentials~~ Add NonbondedInteractionGroup potential, rename existing nonbonded potentials Jan 24, 2022

mcwitt force-pushed the nonbonded-interaction-groups branch from adae990 to 1c97a4e Compare February 9, 2022 16:58