Evolution with large EKOs #209

Closed
alecandido opened this issue Feb 22, 2023 · 13 comments · Fixed by #244

alecandido (Member) commented Feb 22, 2023

As requested in #185 (comment), I'm opening a new issue for this problem.

This is not an issue in practice at the moment: the memory problem only appears for an EKO covering the whole NNPDF4.0 data set at once, but we always split by process. The processes requiring the biggest EKOs are jets, at roughly 4 GB per EKO.

However, at the moment PineAPPL needs to load the whole 4 GB (or whatever the size is) at once. This is only a problem on the PineAPPL side, since EKO can break the operator up and load one piece at a time (i.e. one scale at a time).


Actually, the format is already available. As soon as I have solved the current "top priority", I will step in and solve this myself, providing a lazy loader to reduce memory consumption (based on the current EKO loader and the new format).
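
For illustration, a minimal sketch of what such a lazy loader could look like on the Rust side, assuming (hypothetically) that the archive stores one lz4-compressed .npy file per evolution point inside a tar, and using the tar, lz4_flex and ndarray_npy crates; the actual layout and file names are not fixed here:

```rust
use std::fs::File;

use ndarray::Array4;
use ndarray_npy::ReadNpyExt;

/// Stream one rank-4 operator at a time out of an EKO tar archive, so that
/// only a single scale's operator is ever resident in memory.
fn for_each_scale_operator(
    path: &str,
    mut callback: impl FnMut(&str, Array4<f64>) -> anyhow::Result<()>,
) -> anyhow::Result<()> {
    let mut archive = tar::Archive::new(File::open(path)?);
    for entry in archive.entries()? {
        let entry = entry?;
        let name = entry.path()?.to_string_lossy().into_owned();
        // assumed layout: one `<...>.npy.lz4` member per evolution point
        if name.ends_with(".npy.lz4") {
            let decoder = lz4_flex::frame::FrameDecoder::new(entry);
            let operator = Array4::<f64>::read_npy(decoder)?;
            callback(&name, operator)?;
            // the operator is dropped here, before the next member is read
        }
    }
    Ok(())
}
```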

alecandido added the enhancement (New feature or request) label on Feb 22, 2023
alecandido self-assigned this on Feb 22, 2023
cschwan (Contributor) commented Feb 22, 2023

Do we have evidence that there really is a problem with Grid::evolve?

alecandido (Member, Author) replied:

> Do we have evidence that there really is a problem with Grid::evolve?

The problem is in the signature:

```rust
operator: &ArrayView5<f64>,
```

If you ask for a rank-5 array, that's the full operator, and we know how big the full EKO is for these processes.
Instead, we need to pass a callable returning an ArrayView4, so that the object for a given $Q^2$ is only loaded into memory when it is requested. This is available in Python, but not yet in Rust (however, it is just a matter of looking up the correct .np{y,z}.lz4 file inside a tar archive, so not much different from what you have already done).

cschwan (Contributor) commented Feb 22, 2023

I understand that the whole thing will be big, but have you ever 1) run Grid::evolve with a large EKO, 2) seen that it really is slow, and 3) verified that the culprit is passing the entire operator? From your description it seems that you anticipate a performance problem, but so far we don't have evidence for it.

alecandido (Member, Author) replied:

It is not a performance problem, just a memory consumption problem.

Passing a 5-dim operator constructed from the 4-dim ones, or a loader for the 4-dim operators, will have essentially the same performance (except that more pieces are involved in the first case, since you have to join them in Python, which might be slower).
However, no one is concerned about these performance differences; the concern is only that the 5-dim object might be too big for some specific data sets on less powerful machines, without a good reason to consume all that memory.

cschwan (Contributor) commented Feb 22, 2023

OK, I see! In that case, a straightforward optimization would need functions to perform the following operations:

- which (pid1, pid0) combinations have a non-zero operator? This is currently done here for the indices of the pids:

  ```rust
  // list of all non-zero PID indices
  let pid_indices: Vec<_> = (0..operator.dim().3)
      .cartesian_product(0..operator.dim().1)
      .filter(|&(pid0_idx, pid1_idx)| {
          // 1) at least one element of the operator must be non-zero, and 2) the pid must be
          // contained in the lumi somewhere
          operator
              .slice(s![.., pid1_idx, .., pid0_idx, ..])
              .iter()
              .any(|&value| value != 0.0)
              && pid1_nonzero(if gluon_has_pid_zero && info.pids1[pid1_idx] == 21 {
                  0
              } else {
                  info.pids1[pid1_idx]
              })
      })
      .collect();
  ```

  However, this goes over the entire operator, and I think EKO should already 'know' this after generation.

- for each non-zero PID combination found above we next need a subset of the remaining three-dimensional operator, see here:

  ```rust
  // create the corresponding operators accessible in the form [muf2, x0, x1]
  let operators: Vec<_> = pid_indices
      .iter()
      .map(|&(pid0_idx, pid1_idx)| {
          operator
              .slice(s![.., pid1_idx, .., pid0_idx, ..])
              .select(Axis(0), &fac1_indices)
              .select(Axis(1), &x1_indices)
              .permuted_axes([0, 2, 1])
              .as_standard_layout()
              .into_owned()
      })
      .collect();
  ```

  This slices the full operator along its muf2 and x1 axes. I think we can safely pass these 3D operators around without any memory problems.

alecandido (Member, Author) commented Feb 23, 2023

We are more or less slicing the other way round, as described above ($\mu_F^2 \mapsto O_4$).

The reason is that different scales are really independent, while in flavor basis there are in general no vanishing (pid0, pid1) pairs.
Indeed, the operator is block-diagonal in the evolution basis (QCD: singlet/gluon + diagonal; QCD+QED: singlet-u/singlet-d/gluon/photon + valence-u/valence-d + diagonal), but once you rotate even a single basis (input or output) to the flavor basis, all entries are filled.

The only special case is the intrinsic one: if you evolve charm below its matching scale (usually the charm mass), then the evolution is the identity in flavor basis. But this is really the only case.

Moreover, in order to construct a slice containing the elements for all $\mu_F^2$ values you have to load the full EKO (though not all at once, so there is no memory problem; it is just inefficient, because you load the same files from disk many times - and if you cache them, the memory issue is back).

cschwan (Contributor) commented Feb 23, 2023

> We are more or less slicing the other way round, as described above ($\mu_F^2 \mapsto O_4$).

This can be done as well; I'm not sure it's as efficient, but we'd simply have to try it out.

> The reason is that different scales are really independent, while in flavor basis there are in general no vanishing (pid0, pid1) pairs. Indeed, the operator is block-diagonal in the evolution basis (QCD: singlet/gluon + diagonal; QCD+QED: singlet-u/singlet-d/gluon/photon + valence-u/valence-d + diagonal), but once you rotate even a single basis (input or output) to the flavor basis, all entries are filled.

That's not entirely correct: for Nf=5 PDFs there are no top quarks at either the process or the fitting scale (PIDs -6 and 6), often there's no photon, and at the fitting scale there are no bottom quarks (PIDs -5 and 5). For some PDFs there aren't even charm quarks! For NNPDF4.0 we can decrease the dimensions from 14x14 to 9x11, which reduces the operator size by roughly 50%!
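
Spelled out, that estimate is just the ratio of the flavor-dimension entries:

$$\frac{9 \times 11}{14 \times 14} = \frac{99}{196} \approx 0.51,$$

i.e. the reduced operator has roughly half the entries of the full 14x14 one.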

alecandido (Member, Author) replied:

Ok, you're right.

But for most of the processes even the bottom is there; it is just a fraction of the DIS data that has no bottom, and almost never for all values of $\mu^2$, so you can't drop the slice.
The further limitations only apply when fitting a specific PDF set (e.g. Nf3/Nf4/...), but these are considered variations in NNPDF (and mostly by the other collaborations as well), so you only have a couple of them in a full release, and you usually develop only on the VFNS.

Moreover, as soon as we have QED (which is decided and almost done), the default fits and evolution will include the photon.
So you will have all flavors but bottom and top at the input scale, and all flavors but top at at least one output scale. Your typical operator will then be 10x12 (at least until someone proposes fitting the bottom, but for that we have time, since we're still digesting $c^-$).

> This can be done as well; I'm not sure it's as efficient, but we'd simply have to try it out.

This layout is extremely convenient for evolving PDFs, and it is the only way to compute an EKO (since we can solve DGLAP only for one final scale at a time, but for all x values and flavors at once).

However, that EKO loader will be available anyhow, with the $\mu^2 \mapsto O_4$ layout, because it is the one matching the new output format. If you wish, once it is ready you'll be able to test which way is most convenient (and decide whether to use it as-is, or to reconstruct the $O_3$ slices, joining different final scales).
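
For that second option, a rough sketch of a hypothetical helper that rebuilds one of the 3-dimensional [muf2, x0, x1] operators from per-scale rank-4 slices; the [pid1, x1, pid0, x0] axis ordering is assumed from the slicing code quoted earlier:

```rust
use ndarray::{s, stack, Array3, Array4, Axis};

/// Stack the (pid0, pid1) entry of each per-scale rank-4 operator into a
/// single [muf2, x0, x1] array, mirroring what Grid::evolve builds today
/// from the full rank-5 operator.
fn join_scales(
    slices: &[Array4<f64>], // one [pid1, x1, pid0, x0] operator per scale
    pid0_idx: usize,
    pid1_idx: usize,
) -> Array3<f64> {
    let views: Vec<_> = slices
        .iter()
        .map(|op| op.slice(s![pid1_idx, .., pid0_idx, ..]))
        .collect();
    // stacking adds a leading scale axis -> [muf2, x1, x0]
    stack(Axis(0), &views)
        .expect("all scales must share the same x grids")
        // reorder to [muf2, x0, x1]
        .permuted_axes([0, 2, 1])
        .as_standard_layout()
        .into_owned()
}
```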

cschwan (Contributor) commented Oct 10, 2023

@alecandido I thought a bit more about what interface we should have for this, and one idea would be a Grid::evolve_slice. This is basically Grid::evolve, but it evolves only one (factorization-scale) slice of the Grid, corresponding to a 4-dimensional EKO slice, and ignores everything else. The user can then combine Grid::evolve_slice and Grid::merge to sum up every evolved slice. In this way we should be able to avoid the questions about callbacks. We can also rewrite Grid::evolve in terms of Grid::evolve_slice. What do you think?
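
A sketch of how that pattern could look from the caller's side; since Grid::evolve_slice does not exist yet, its name and signature here are placeholders and the loop is written generically over closures:

```rust
/// Fold per-scale evolution results into a single object: evolve each rank-4
/// EKO slice separately and merge the outcomes, so that the full rank-5
/// operator never has to be materialized.
fn evolve_by_slices<Out, Err>(
    slices: impl IntoIterator<Item = (f64, ndarray::Array4<f64>)>,
    mut evolve_slice: impl FnMut(f64, &ndarray::Array4<f64>) -> Result<Out, Err>,
    mut merge: impl FnMut(&mut Out, Out) -> Result<(), Err>,
) -> Result<Option<Out>, Err> {
    let mut result: Option<Out> = None;
    for (fac1, operator) in slices {
        let evolved = evolve_slice(fac1, &operator)?;
        match result.as_mut() {
            // `merge` sums the evolved slices into the accumulator
            Some(acc) => merge(acc, evolved)?,
            None => result = Some(evolved),
        }
    }
    Ok(result)
}
```

With the real interface, evolve_slice would be played by Grid::evolve_slice and merge by Grid::merge.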

@cschwan cschwan added this to the v0.6.3 milestone Oct 10, 2023
alecandido (Member, Author) replied:

I believe it's a very good solution. The spirit was indeed to consume them one by one; I just never thought that the grid could be sliced as well.

For completeness: sliced EKOs will be the only EKOs from now on, so you would have to merge them anyhow to feed Grid::evolve. I would even propose deprecating it, and eventually we could make it a wrapper around Grid::evolve_slice once the EKO Rust loader is available.

The only unpleasant bit is that we should still improve the grid merging to support convolution with small EKOs fully in memory from Python (without writing an intermediate grid), because at the moment Python only has access to merge_from_file.

```rust
/// Note
/// ----
/// For a current limitation with the implementation of the bound object `Grid` is not possible
/// to operate with two `Grid`s in memory, since is not possible to pass a `Grid` by argument
pub fn merge_from_file(&mut self, path: PathBuf) -> PyResult<()> {
```

cschwan (Contributor) commented Oct 10, 2023

Grid is Clone now, so I don't think there's any showstopper to exposing Grid::merge in the Python interface.

alecandido (Member, Author) replied:

> Grid is Clone now, so I don't think there's any showstopper to exposing Grid::merge in the Python interface.

Ah, great, I just hadn't noticed that it happened. We can immediately add Grid.merge to the Python interface.
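
For concreteness, a minimal sketch of how such a binding might look with PyO3, assuming a PyGrid wrapper around pineappl::grid::Grid similar to the existing interface struct; the field name, error mapping, and exact Grid::merge signature are assumptions, not the actual implementation:

```rust
use pyo3::exceptions::PyValueError;
use pyo3::prelude::*;

/// Assumed wrapper type, analogous to the existing Python-interface struct.
#[pyclass]
pub struct PyGrid {
    pub(crate) grid: pineappl::grid::Grid,
}

#[pymethods]
impl PyGrid {
    /// Merge `other` into this grid entirely in memory; cloning the other
    /// grid is acceptable now that `Grid` implements `Clone`.
    pub fn merge(&mut self, other: PyRef<Self>) -> PyResult<()> {
        self.grid
            .merge(other.grid.clone())
            .map_err(|err| PyValueError::new_err(err.to_string()))
    }
}
```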

cschwan (Contributor) commented Oct 10, 2023

PR #244 will implement the necessary changes.

@cschwan cschwan linked a pull request Oct 10, 2023 that will close this issue