[CPP] Invented a parallel algorithm based on seidel implementation. #163
base: develop
Conversation
Cool. Thanks for contributing @JS00000. I will try to get around to reviewing the code quickly.
Do you have any numbers/graphs demonstrating the new algorithm?
Not yet. The idea of this parallel optimization is very simple: separate the parts of the serial controllable-set computation that have no dependencies on each other, then process them in parallel.
Thanks for raising this point. I need a bit more information to understand your changes.
- Duplicating the API to achieve parallelization is a bit excessive. The current API does not support parallelization. I would be in favor of updating the API so there is only one API that supports parallelization, rather than changing it opportunistically.
- I may be overlooking something, but Seidel and SeidelParallel seem to have a lot of code in common. Is this code duplication necessary? This is bad because:
  - it makes code review hard because differences are hidden,
  - it makes maintenance harder.

From what I understand, you want to store the result as an attribute of the class. I guess you also removed the reordering of the LP constraints. Am I missing something?
I also made a few minor comments.
initialize();
size_t en = GetCurrentTimeUsec();
printf("time of init: %fms\n", (en - be)/1000.0);
be = en;
Can you replace printf by TOPPRA_LOG_DEBUG, here and everywhere in the code?
Sure.
struct timeval tv;
gettimeofday(&tv, NULL);
return ((unsigned long long)tv.tv_sec * 1000000 + (unsigned long long)tv.tv_usec);
}
This is a C-style way of measuring time. Although valid, I would be in favor of a more C++ style (e.g. https://www.techiedelight.com/measure-elapsed-time-program-chrono-library). I think std::chrono is also more portable.
OK, it's a good suggestion.
return res;
}

constexpr value_type infi = 1e6;
This infinity is definitely too low and I don't think this is actually necessary. What issue did you get that made you set this ?
If I set infi to a very large number, it causes numerical problems. This may be a limitation of the algorithm itself. In fact, the original Seidel algorithm also set an upper limit of 100, which is much smaller than the limit here.
In the C++ implementation, I don't think there is any such limit. Are you talking about the limit discussed in #156?
There is no good reason for this limit, and it has been modifiable by the user since #160.
Through this API, I changed the bound to [ 0, +inf ] in my code and it works well on my use cases.
You are right. On the original develop branch, setting m_initXBound to [ 0, +inf ] works well, but it does not work on this parallel branch. This is also a limitation of the parallel algorithm.
There are two variables, u and x. In principle, the bound on x (m_initXBound on the original develop branch) should be [ 0, +inf ], and the bound on u should be [ -inf, +inf ].
However, after a series of incomplete tests, I found that on this parallel branch the bound on x can be set to [ 0, +inf ], but the bound on u must be [ -1e13, +1e13 ] or tighter; otherwise the parallel algorithm produces large numerical errors.
The reason is that the order of the constraints in the LP is changed in this parallel algorithm. In order to process unrelated constraints in parallel, I put the two constraints corresponding to xNext at the end. This leads to a high degree of linear correlation between the constraints during the first and last stages of parallel processing, and the variables u and x can differ by up to 15 orders of magnitude. That is enough for the rounding error of "double" arithmetic to affect the result.
So I set infi to 1e6. It could be set larger, of course, though a more complete test would be needed. A smaller infi may also speed up the computation, and it is still sufficient for real-world data.
I will try to optimize the code structure, which will take a while.
The API modification can be discussed before you actually make it. I will let @hungpham2511 give his opinion on this.
To be honest, I believe there is not much gain to be had from parallelizing the algorithm. At best, solving the LPs is two times faster. Including the time taken by the other parts, i.e. calculating the coefficients, the total speedup is probably less than that.
After going through the change, I am actually confused about where the parallelism is implemented and how it helps. Could you explain briefly, @JS00000? In particular, is it the Seidel solver that you are improving, or the high-level steps of the TOPPRA algorithm?
I think only by understanding how parallelism is used can we come up with a suitable set of APIs.
cpp/CMakeLists.txt
Outdated
@@ -17,7 +17,7 @@ option(TOPPRA_DEBUG_ON "Set logging mode to on." OFF)
 option(TOPPRA_WARN_ON "Enable warning messages." ON)
 option(OPT_MSGPACK "Serialization using msgpack " OFF)
 # Bindings
-option(PYTHON_BINDINGS "Build bindings for Python" ON)
+option(PYTHON_BINDINGS "Build bindings for Python" OFF)
Please don't change the default flag needlessly. You can always use -DPython_BINDINGS=OFF during compilation.
OK.
@@ -47,6 +47,13 @@ if(BUILD_WITH_GLPK)
   message(STATUS "Found glpk ${GLPK_LIBRARY} ${GLPK_INCLUDE_DIR}")
 endif(BUILD_WITH_GLPK)

+find_package(OpenMP REQUIRED)
It's better to have a BUILD_WITH_OPENMP flag and find OpenMP only if it is switched on. Otherwise this breaks compilation.
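A guarded dependency along these lines could look like the following sketch (the flag name BUILD_WITH_OPENMP is the reviewer's suggestion, not existing toppra code):

```cmake
# Only look for OpenMP when the user opts in, so builds without an
# OpenMP toolchain are unaffected.
option(BUILD_WITH_OPENMP "Enable the OpenMP-parallel Seidel solver" OFF)
if(BUILD_WITH_OPENMP)
  find_package(OpenMP REQUIRED)
endif()
```

The parallel solver sources and the OpenMP::OpenMP_CXX link dependency would then also be wrapped in the same condition.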
OK.
#pragma omp parallel for schedule(dynamic)
for (int i = 0; i < m_N; i++) {
  // solver_ret = m_solver->solveStagewiseBatch(i, g_upper, solution_upper[i]);
  solver_ret = m_solver->solveStagewiseBatch(i, g_upper);
This doesn't look right to me. Are you sure that this routine returns the correct controllable sets?
No, it just stores the half-finished solution in the solver.
And this is actually where the parallelism is implemented.
Yes. In my tests, the total speed is at most 2 times faster than the original Seidel solver, but still slower than the qpOASES solver.
Fixes #
PathParametrizationAlgorithm::computeFeasibleSets() now can pass the test.
Changes in this PR:
Checklists: