added feature infos to JSON dump #2660
If it is only used once, I think you can write it inline, without a new function.
@guolinke Sorry, I'm not sure I fully understood you. Do you mean to write code of UPD: BTW, seems that
Sorry, I misread it. Is
@guolinke OK. One more thing I can suggest is to rename
and add
This will be consistent with the rest of the codebase.
@StrikerRUS Yeah, it makes sense.
Agree! What do you think would be better names instead of
288a5b1 to 9cede46
c09071b to 977ad48
@guolinke Can you please help with
Maybe use a function like
Yes, exactly!
Great idea (if I understand correctly 😃 )! To be honest, I have no experience writing C++ code, so I'll be happy with whatever solution you think fits better here.
@StrikerRUS What is the status of this PR?
Okay, I will implement it when I have time.
@StrikerRUS When I implemented it, I found that a template is a better solution.
src/boosting/gbdt_model_text.cpp
Outdated
str_buf << "\"feature_infos\":" << "{";
auto feature_infos_json_objs = train_data_->feature_infos<true>();
bool first_obj = true;
for (size_t i = 0; i < feature_infos_json_objs.size(); ++i) {
It seems this loop causes the failures. @StrikerRUS could you check it?
I commented out the loop and discovered that the following line causes the segfaults:
auto feature_infos_json_objs = train_data_->feature_infos<true>();
UPD: the following (old code for ordinary string format) is also failing with the same error. Maybe a bug was introduced earlier?..
diff --git a/src/boosting/gbdt_model_text.cpp b/src/boosting/gbdt_model_text.cpp
index b1ea0d06..8180bb67 100644
--- a/src/boosting/gbdt_model_text.cpp
+++ b/src/boosting/gbdt_model_text.cpp
@@ -40,18 +40,18 @@ std::string GBDT::DumpModel(int start_iteration, int num_iteration) const {
           << Common::Join(monotone_constraints_, ",") << "]," << '\n';
   str_buf << "\"feature_infos\":" << "{";
-  auto feature_infos_json_objs = train_data_->feature_infos<true>();
-  bool first_obj = true;
-  for (size_t i = 0; i < feature_infos_json_objs.size(); ++i) {
-    if (feature_infos_json_objs[i] != "{}") {
-      if (!first_obj) {
-        str_buf << ",";
-      }
-      str_buf << "\"" << feature_names_[i] << "\":";
-      str_buf << feature_infos_json_objs[i];
-      first_obj = false;
-    }
-  }
+  auto feature_infos_json_objs = train_data_->feature_infos<false>();
+  // bool first_obj = true;
+  // for (size_t i = 0; i < feature_infos_json_objs.size(); ++i) {
+  //   if (feature_infos_json_objs[i] != "{}") {
+  //     if (!first_obj) {
+  //       str_buf << ",";
+  //     }
+  //     str_buf << "\"" << feature_names_[i] << "\":";
+  //     str_buf << feature_infos_json_objs[i];
+  //     first_obj = false;
+  //   }
+  // }
   str_buf << "}," << '\n';
   str_buf << "\"tree_info\":[";
I think we cannot dynamically generate feature_infos_json_objs, as the model could be loaded from a file or a string. In that case, both used_feature_map_ and feature_bin_mappers_ are lost, which causes the error.
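To illustrate the point above, here is a minimal sketch of why anything derived from the training data is unavailable after a model is loaded from text, and why a copy cached on the model itself is safe in both cases. All names here (`TrainData`, `Model`, `Init`, `LoadFromStrings`, `FeatureInfos`) are hypothetical stand-ins, not the LightGBM classes or API.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a training dataset that owns per-feature info.
struct TrainData {
  std::vector<std::string> feature_infos;
};

// Hypothetical stand-in for a boosting model.
struct Model {
  const TrainData* train_data = nullptr;   // only set while training
  std::vector<std::string> feature_infos;  // copy that is saved with the model

  // Attach training data and cache the per-feature strings on the model.
  void Init(const TrainData* data) {
    assert(data != nullptr);
    train_data = data;
    feature_infos = data->feature_infos;
  }

  // Loading from text restores only the cached strings; train_data stays null,
  // so dereferencing it in a dump routine would crash.
  static Model LoadFromStrings(std::vector<std::string> infos) {
    Model m;
    m.feature_infos = std::move(infos);
    return m;
  }

  // Safe for both trained and loaded models: reads only the cached copy.
  const std::vector<std::string>& FeatureInfos() const { return feature_infos; }
};
```

A dump routine written against `FeatureInfos()` in this sketch works for both a freshly trained model and one restored from text, whereas one that goes through `train_data` only works for the former.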
@StrikerRUS I propose a new change, you can check it.
I see... Now the feature_infos field is not actually a JSON representation. I think we can go back to my initial implementation then, which parses those strings and constructs valid JSON objects.
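The string-parsing approach mentioned above could look roughly like the following sketch. It is not the code from this PR. It assumes the stored strings take the form `[min:max]` for numerical features, `v1:v2:...` for categorical features, and `none` for unused features, and the helper names `FeatureInfoToJson` and `DumpFeatureInfos` as well as the output keys `min_value`, `max_value`, and `values` are chosen for illustration only.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: converts one feature-info string into a JSON object.
// Assumed inputs: "[min:max]" (numerical), "v1:v2:..." (categorical), "none".
std::string FeatureInfoToJson(const std::string& info) {
  if (info.empty() || info == "none") {
    return "{}";
  }
  std::ostringstream out;
  if (info.size() >= 2 && info.front() == '[' && info.back() == ']') {
    // Numerical feature: split the bracketed body at the first ':'.
    const std::string body = info.substr(1, info.size() - 2);
    const size_t sep = body.find(':');
    if (sep == std::string::npos) {
      return "{}";
    }
    out << "{\"min_value\":" << body.substr(0, sep)
        << ",\"max_value\":" << body.substr(sep + 1)
        << ",\"values\":[]}";
    return out.str();
  }
  // Categorical feature: every ':'-separated token becomes an array element.
  out << "{\"values\":[";
  std::istringstream in(info);
  std::string token;
  bool first = true;
  while (std::getline(in, token, ':')) {
    if (!first) out << ",";
    out << token;
    first = false;
  }
  out << "]}";
  return out.str();
}

// Hypothetical helper: joins the non-empty objects under "feature_infos",
// using the same first_obj comma handling as the loop in the diff.
std::string DumpFeatureInfos(const std::vector<std::string>& names,
                             const std::vector<std::string>& infos) {
  assert(names.size() == infos.size());
  std::ostringstream out;
  out << "\"feature_infos\":{";
  bool first_obj = true;
  for (size_t i = 0; i < infos.size(); ++i) {
    const std::string obj = FeatureInfoToJson(infos[i]);
    if (obj == "{}") continue;  // skip unused features
    if (!first_obj) out << ",";
    out << "\"" << names[i] << "\":" << obj;
    first_obj = false;
  }
  out << "}";
  return out.str();
}
```

The sketch copies numeric tokens through unchanged, so values such as `inf` or `nan` would not yield valid JSON, and feature names are not escaped. A real implementation would need to handle both.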
Yeah, that would be better.
Done in the latest commit.
Part of #2604.
Not sure about the implementation; maybe it's better to have something like feature_infos_json_?