[CPP] Invented a parallel algorithm based on the Seidel implementation. #163

Open

wants to merge 11 commits into base: develop
Conversation

@JS00000 commented Mar 8, 2021

Fixes #
PathParametrizationAlgorithm::computeFeasibleSets() can now pass the test.

Changes in this PR:

  • Add class SeidelParallel
  • Modify the PathParametrizationAlgorithm and Solver classes to support the parallel algorithm

Checklist:

  • Update HISTORY.md with a single line describing this PR

@hungpham2511 (Owner)

Cool. Thanks for contributing, @JS00000. I will try to get around to reviewing the code quickly.

@hungpham2511 (Owner)

Do you have any numbers/graphs demonstrating the new algorithm?

@JS00000 (Author) commented Mar 10, 2021

> Do you have any numbers/graphs demonstrating the new algorithm?

Not yet. The idea of this parallel optimization is simple: separate the parts of the serial controllable-set computation that have no dependencies on each other, and then process them in parallel.
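To illustrate the pattern, here is a minimal, self-contained toy sketch (not the PR's actual code; all names and numbers are placeholders): the per-stage work with no cross-stage dependency runs in an OpenMP parallel loop, while the dependent backward pass stays serial. Compile with -fopenmp; without it the pragma is ignored and the code runs serially.

// Toy sketch of the two-phase idea (placeholder computation, not the PR's code).
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
  const int N = 1000;
  std::vector<double> coeff(N);      // per-stage data, independent across stages
  std::vector<double> upper(N + 1);  // stand-in for the controllable-set bounds

  // Phase 1: independent per-stage computation, safe to run in parallel.
  #pragma omp parallel for schedule(dynamic)
  for (int i = 0; i < N; i++) {
    coeff[i] = std::exp(-0.001 * i);  // placeholder for the expensive per-stage work
  }

  // Phase 2: serial backward pass; stage i depends on stage i + 1, so it stays serial.
  upper[N] = 1.0;
  for (int i = N - 1; i >= 0; i--) {
    upper[i] = coeff[i] * upper[i + 1];
  }

  std::printf("upper[0] = %g\n", upper[0]);
  return 0;
}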

@jmirabel (Collaborator) left a comment


Thanks for raising this point. I need a bit more information to understand your changes.

  1. Duplicating the API to achieve parallelization is a bit excessive. The current API does not support parallelization. I would be in favor of updating the API so that a single API supports parallelization, rather than changing it opportunistically.

  2. I may be overlooking something, but Seidel and SeidelParallel seem to have a lot of code in common. Is this code duplication necessary? This is bad because:

  • it makes code review hard because differences are hidden,
  • it makes maintenance harder.
    From what I understand, you want to store the result as an attribute of the class. I guess you also removed the reordering of the LP constraints. Am I missing something?

I also made a few minor comments.

initialize();
size_t en = GetCurrentTimeUsec();
printf("time of init: %fms\n", (en - be)/1000.0);
be = en;
@jmirabel (Collaborator):

Can you replace printf by TOPPRA_LOG_DEBUG, here and everywhere in the code?

@JS00000 (Author):

Sure.
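For illustration, the replacement could look like the sketch below; it includes a stand-in macro definition so the snippet compiles on its own, and it assumes that the real TOPPRA_LOG_DEBUG accepts stream-style arguments (an assumption, to be checked against the codebase).

#include <iostream>

// Stand-in definition so this sketch is self-contained; the PR would use the
// project's real TOPPRA_LOG_DEBUG macro instead of this one.
#ifndef TOPPRA_LOG_DEBUG
#define TOPPRA_LOG_DEBUG(X) (std::cout << "[DEBUG]: " << X << std::endl)
#endif

int main() {
  unsigned long long be = 0, en = 1500;  // placeholder timestamps in microseconds
  // printf("time of init: %fms\n", (en - be) / 1000.0);               // before
  TOPPRA_LOG_DEBUG("time of init: " << (en - be) / 1000.0 << "ms");    // after
  return 0;
}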

struct timeval tv;
gettimeofday(&tv,NULL);
return ((unsigned long long)tv.tv_sec * 1000000 + (unsigned long long)tv.tv_usec);
}
@jmirabel (Collaborator):

This is a C-style way of measuring time. Although valid, I would be in favor of a more C++ style (e.g. https://www.techiedelight.com/measure-elapsed-time-program-chrono-library).
I think std::chrono is also more portable.

@JS00000 (Author):

OK, it's a good suggestion.
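For reference, a minimal std::chrono version of the elapsed-time measurement (an illustrative sketch, not the PR's code); std::chrono::steady_clock is monotonic and portable.

#include <chrono>
#include <cstdio>

int main() {
  const auto start = std::chrono::steady_clock::now();

  // ... work to be timed, e.g. the solver initialization ...

  const auto end = std::chrono::steady_clock::now();
  const double ms =
      std::chrono::duration<double, std::milli>(end - start).count();
  std::printf("time of init: %fms\n", ms);  // or route through the logging macro
  return 0;
}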

return res;
}

constexpr value_type infi = 1e6;
@jmirabel (Collaborator):

This infinity is definitely too low, and I don't think it is actually necessary. What issue did you get that made you set this?

@JS00000 (Author):

If I set infi to a very large number, it causes some numerical problems. This may be due to limitations of the algorithm itself. In fact, the original Seidel algorithm also sets an upper bound of 100, which is much smaller than the bound here.

@jmirabel (Collaborator):

In the C++ implementation, I don't think there is any such limit. Are you talking about the limit discussed in #156?
There is no good reason for this limit, and it is modifiable by the user since #160.
Through this API, I set the bound to [0, +inf] in my code and it works well for my use cases.

@JS00000 (Author):

You are right. In the original develop branch, if you set m_initXBound to [0, +inf], it works well. But it does not work in the parallel branch here; this is also a limitation of this parallel algorithm.
There are two variables, u and x. In principle, the bound on x (which is m_initXBound in the original develop branch) should be set to [0, +inf], and the bound on u should be set to [-inf, +inf].
However, after a series of incomplete tests, I found that in the parallel branch here the bound on x can be set to [0, +inf], but the bound on u should be set to [-1e13, +1e13] or smaller. Otherwise, this parallel algorithm has a large numerical error.
The reason is that the order of the constraints in the LP has been changed in this parallel algorithm. In order to process unrelated constraints in parallel, I put the two constraints corresponding to xNext at the end. This leads to a high degree of linear correlation between the constraints during the first and last stages of parallel processing, and the difference between the variables u and x can be up to 15 orders of magnitude. This is enough for the double data type to produce errors large enough to affect the result.
So I set infi to 1e6. It could also be set larger, although that would need more complete testing. A smaller infi setting may also speed up the computation, and it is sufficient for actual data requirements.
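As a rough note on why such a magnitude gap is harmful (an illustration, not part of the PR): an IEEE-754 double has a relative precision of roughly

\[
\varepsilon_{\text{double}} = 2^{-52} \approx 2.2 \times 10^{-16},
\]

so when the magnitudes of u and x differ by about 15 orders, the smaller variable's contribution to a combined expression sits barely above the rounding error of the larger one, and almost all of its significant digits are lost.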

@JS00000 (Author) commented Mar 11, 2021

> Thanks for raising this point. I need a bit more information to understand your changes.
>
>   1. Duplicating the API to achieve parallelization is a bit excessive. The current API does not support parallelization. I would be in favor of updating the API so that a single API supports parallelization, rather than changing it opportunistically.
>   2. I may be overlooking something, but Seidel and SeidelParallel seem to have a lot of code in common. Is this code duplication necessary? This is bad because:
>   • it makes code review hard because differences are hidden,
>   • it makes maintenance harder.
>     From what I understand, you want to store the result as an attribute of the class. I guess you also removed the reordering of the LP constraints. Am I missing something?
>
> I also made a few minor comments.

I will try to optimize the code structure, which will take a while.

@jmirabel (Collaborator)

The API modifications can be discussed before you actually make them. I will let @hungpham2511 give his opinion on this.

@hungpham2511 (Owner)

> Do you have any numbers/graphs demonstrating the new algorithm?
>
> Not yet. The idea of this parallel optimization is simple: separate the parts of the serial controllable-set computation that have no dependencies on each other, and then process them in parallel.

To be honest, I believe there is not much gain to be had from parallelizing the algorithm. At best, solving the LPs is two times faster. Including the time taken for the other parts, i.e., calculating the coefficients, the total speedup is probably less than that.
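For reference, this is essentially Amdahl's law: if a fraction p of the runtime is spent in the LP solves and that part is sped up by a factor s, the overall speedup is

\[
S = \frac{1}{(1 - p) + p/s} \;\le\; \frac{1}{1 - p}.
\]

For example, with the purely illustrative values p = 0.7 and s = 2, S is only about 1.5.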

@hungpham2511 (Owner) left a comment

After going through the change, I am actually confused about where parallelism is implemented and how it helps. Could you @JS00000 explain briefly? In particular, is it the Seidel solver that you are improving, or the high-level steps in the TOPPRA algorithm?

I think only by understanding how parallelism is used can we come up with a suitable set of APIs.

@@ -17,7 +17,7 @@ option(TOPPRA_DEBUG_ON "Set logging mode to on." OFF)
option(TOPPRA_WARN_ON "Enable warning messages." ON)
option(OPT_MSGPACK "Serialization using msgpack " OFF)
# Bindings
-option(PYTHON_BINDINGS "Build bindings for Python" ON)
+option(PYTHON_BINDINGS "Build bindings for Python" OFF)
@hungpham2511 (Owner):

Please don't change the default flag needlessly. You can always use -DPYTHON_BINDINGS=OFF during compilation.

@JS00000 (Author):

OK.

@@ -47,6 +47,13 @@ if(BUILD_WITH_GLPK)
message(STATUS "Found glpk ${GLPK_LIBRARY} ${GLPK_INCLUDE_DIR}")
endif(BUILD_WITH_GLPK)

+find_package(OpenMP REQUIRED)
@hungpham2511 (Owner):

It's better to have a BUILD_WITH_OPENMP flag and find OpenMP only if it is switched on. Otherwise this breaks compilation.

@JS00000 (Author):

OK.

#pragma omp parallel for schedule(dynamic)
for (int i = 0; i < m_N; i++) {
// solver_ret = m_solver->solveStagewiseBatch(i, g_upper, solution_upper[i]);
solver_ret = m_solver->solveStagewiseBatch(i, g_upper);
@hungpham2511 (Owner):

This doesn't look right to me. Are you sure that this routine returns the correct controllable sets?

@JS00000 (Author):

No, it just stores the half-finished solution in the solver.

@JS00000 (Author):

And this is actually where the parallelism is implemented.

@JS00000 (Author) commented Mar 11, 2021

> Do you have any numbers/graphs demonstrating the new algorithm?
>
> Not yet. The idea of this parallel optimization is simple: separate the parts of the serial controllable-set computation that have no dependencies on each other, and then process them in parallel.
>
> To be honest, I believe there is not much gain to be had from parallelizing the algorithm. At best, solving the LPs is two times faster. Including the time taken for the other parts, i.e., calculating the coefficients, the total speedup is probably less than that.

Yes. In my tests, the total speed is at most 2 times faster than the original Seidel solver, but still slower than the qpOASES solver.
