Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[R-package] Use Rprintf for logging in the R package (fixes #1440, fixes #1909) #2901

Merged
merged 41 commits into from
Mar 24, 2020

Conversation

jameslamb
Copy link
Collaborator

@jameslamb jameslamb commented Mar 12, 2020

This PR proposes a change to get the R package closer to CRAN's standards (#629). See #1440 and #1909 for background. The basic issue, as documented in Writing R Extensions, is this:

Compiled code is checked for symbols corresponding to functions which might terminate R or write to stdout/stderr instead of the console.

To get LightGBM into compliance with this check, I propose using Rprintf() and Rvprintf() from R's headers. This is the approach CRAN recommends

The most useful function for printing from a C routine compiled into R is Rprintf. This is used in exactly the same way as printf, but is guaranteed to write to R’s output (which might be a GUI console rather than a file, and can be re-directed by sink). It is wise to write complete lines (including the "\n") before returning to R. It is defined in R_ext/Print.h.

Notes for Reviewers

  1. cmake/modules/FindLibR.cmake was included to find R and its headers on any system. It borrows heavily from xgboost's implementation.
  2. I've tested this approach on my personal Travis and it seems to be working. One task (mpi) is failing but I think it is unrelated.
  3. I ran R CMD check lightgbm_2.3.2.tar.gz --as-cran and can confirm that this NOTE is gone!
File ‘/home/travis/build/microsoft/LightGBM/lightgbm.Rcheck/lightgbm/libs/lib_lightgbm.so’:
  Found ‘printf’, possibly from ‘printf’ (C)
  Found ‘putchar’, possibly from ‘putchar’ (C)
  Found ‘stderr’, possibly from ‘stderr’ (C)
  Found ‘stdout’, possibly from ‘stdout’ (C)

Compiled code should not call entry points which might terminate R nor
write to stdout/stderr instead of to the console, nor use Fortran I/O
nor system RNGs. The detected symbols are linked into the code but
might come from libraries and not actually be called.

Comment on lines 83 to 112
# CRAN may run RD CMD CHECK instead of R CMD CHECK,
# which can lead to this infamous error:
# 'R' should not be used without a path -- see par. 1.6 of the manual
find_program(
LIBR_EXECUTABLE
NO_DEFAULT_PATH
HINTS "${CMAKE_CURRENT_BINARY_DIR}" "/usr/bin" "/usr/lib/"
NAMES R R.exe
)

if(LIBR_EXECUTABLE MATCHES ".*lightgbm\\.Rcheck.*")
message(FATAL_ERROR "Hit this block. '${LIBR_EXECUTABLE}'")
endif()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section is something I added that you won't find in the xgboost script. It solves an issue that only matters at testing time...basically inside of R CMD check, CRAN actually runs a different command called RD CMD check which uses a special version of R / R.exe. We faced this in pkgnet as well.

From Writing R Extensions

Do not invoke R by plain R, Rscript or (on Windows) Rterm in your examples, tests, vignettes, makefiles or other scripts. As pointed out in several places earlier in this manual, use something like

"$(R_HOME)/bin/Rscript"
"$(R_HOME)/bin$(R_ARCH_BIN)/Rterm"

with appropriate quotes (as, although not recommended, R_HOME can contain spaces).

The hints that I gave here will only work on Unix-alike operating systems. When I do #2335 it will get changed. But since this only matters for testing (in the context of R CMD check), not for normal use, I think it's ok to merge.

Without this block, the builds will fail with this error:

'R' should not be used without a path -- see par. 1.6 of the manual

@@ -39,9 +39,9 @@

// 64bit pointer
#if INTPTR_MAX == INT64_MAX
typedef int64_t R_xlen_t;
typedef int64_t xlen_t;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to rename this type to avoid conflicts with R_xlen_t defined in R's headers.

#include <R_ext/Error.h>
#include <R_ext/Print.h>
#endif

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for those who are new to it:

  • R_NO_REMAP: Do not change symbols like Rf_error() to error()
  • R_USE_C99_IN_CXX: Make sure that Rvprintf() is defined. (reference)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: R_USE_C99_IN_CXX is defined in R_ext/Print.h in R 4.0. Will this break when using R 4.0?

@jameslamb
Copy link
Collaborator Author

Travis is struggling today, seeing a lot of these

image

Copy link
Collaborator

@StrikerRUS StrikerRUS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jameslamb Awesome job!

Below are some initial comments.

R-package/src/cmake/modules/FindLibR.cmake Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Outdated Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Outdated Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Outdated Show resolved Hide resolved
@jameslamb
Copy link
Collaborator Author

I just tried building this branch on my Windows laptop (Windows 10, R3.6.1, Visual Studio 16 2019) with Rscript build_r.R and can confirm this worked!

The first time I ran I did hit https://github.com/jameslamb/LightGBM/blob/feat/r-header/R-package/src/cmake/modules/FindLibR.cmake#L41, but that was easy to fix by following the error message and adding the Rtools minGW bin/ directory to the path.

Copy link
Collaborator

@StrikerRUS StrikerRUS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jameslamb I left some comments. Please take a look.

CMakeLists.txt Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Outdated Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Outdated Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Outdated Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Outdated Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Outdated Show resolved Hide resolved
@jameslamb
Copy link
Collaborator Author

I just rebased to master since there have been a lot of changes on there since this PR was opened. Made sure to pull the suggestions committed in the browser before doing that, so they didn't get reverted by the force push.


// R code should write back to R's error stream,
// otherwise to stderr
#ifndef LGB_R_BUILD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should agree on preprocessor directive indentations on a later PR. ping @guolinke @StrikerRUS @jameslamb for later.

Copy link
Contributor

@Laurae2 Laurae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See R-devel (R 4.0): https://cran.r-project.org/doc/manuals/r-devel/NEWS.html in C-level facilities:

Header ‘R_ext/Print.h’ defines R_USE_C99_IN_CXX and hence exposes Rvprintf and REvprintf if used with a C++11 (or later) compiler.

Do we need a workaround for it?

#include <R_ext/Error.h>
#include <R_ext/Print.h>
#endif

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: R_USE_C99_IN_CXX is defined in R_ext/Print.h in R 4.0. Will this break when using R 4.0?

@jameslamb
Copy link
Collaborator Author

See R-devel (R 4.0): https://cran.r-project.org/doc/manuals/r-devel/NEWS.html in C-level facilities:

Header ‘R_ext/Print.h’ defines R_USE_C99_IN_CXX and hence exposes Rvprintf and REvprintf if used with a C++11 (or later) compiler.

Do we need a workaround for it?

Still investigating, but for the sake of documenting the discussion...divest (as one example), uses this same pattern and recently passed CRAN nightly checks against r-devel (which is 4.0.0 now) https://cran.r-project.org/web/checks/check_results_divest.html

Copy link
Contributor

@Laurae2 Laurae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems there is no issue for R_USE_C99_IN_CXX according to R_ext/Print.h in R-devel (4.0):

/*
 *  R : A Computer Language for Statistical Data Analysis
 *  Copyright (C) 1998-2019    The R Core Team
 *
 *  This header file is free software; you can redistribute it and/or modify
 *  it under the terms of the GNU Lesser General Public License as published by
 *  the Free Software Foundation; either version 2.1 of the License, or
 *  (at your option) any later version.
 *
 *  This file is part of R. R is distributed under the terms of the
 *  GNU General Public License, either Version 2, June 1991 or Version 3,
 *  June 2007. See doc/COPYRIGHTS for details of the copyright status of R.
 *
 *  This program is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *  GNU Lesser General Public License for more details.
 *
 *  You should have received a copy of the GNU Lesser General Public License
 *  along with this program; if not, a copy is available at
 *  https://www.R-project.org/Licenses/
 */

/* Included by R.h: API */

#ifndef R_EXT_PRINT_H_
#define R_EXT_PRINT_H_

#ifdef  __cplusplus
/* If the vprintf interface is defined at all in C++ it may only be
   defined in namespace std.  It is part of the C++11 standard. */
# if __cplusplus >= 201103L && !defined(R_USE_C99_IN_CXX)
#  define R_USE_C99_IN_CXX
# endif
# ifdef R_USE_C99_IN_CXX
#  include <cstdarg>
#  define R_VA_LIST std::va_list
# endif
extern "C" {
#else
# include <stdarg.h>
# define R_VA_LIST va_list
#endif

@jameslamb
Copy link
Collaborator Author

Thanks for the review @Laurae2 ! I won't merge until @StrikerRUS and I agree on some of the lingering comments on FindLibR.cmake

@jameslamb
Copy link
Collaborator Author

@StrikerRUS I saw you resolved the comment threads on FindLibR.cmake. Am I ok to merge this?

Thanks for the review! btw I made PR into xgboost with one of the improvements you suggested: dmlc/xgboost#5427

@StrikerRUS
Copy link
Collaborator

@jameslamb Please give me some time to make a second round of review.

@jameslamb
Copy link
Collaborator Author

@jameslamb Please give me some time to make a second round of review.

sure no hurry at all! Glad that I asked

Copy link
Collaborator

@StrikerRUS StrikerRUS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jameslamb Finally got a minute to drop a second round of review. Sorry for the delay!

CMakeLists.txt Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Outdated Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Outdated Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Outdated Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Outdated Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Outdated Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Outdated Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Outdated Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Outdated Show resolved Hide resolved
@jameslamb
Copy link
Collaborator Author

@jameslamb Finally got a minute to drop a second round of review. Sorry for the delay!

no problem! I'll look at these right now.

@jameslamb
Copy link
Collaborator Author

jameslamb commented Mar 23, 2020

R + Mac CI on Travis died with this error:

image

Looks like a timeout for Homebrew. I have been seeing that slightly more the last few weeks. I imagine all the people now working remote is putting a lot of pressure on ISPs.

(just restarted it manually in the Travis console)

@StrikerRUS
Copy link
Collaborator

@jameslamb

Looks like a timeout for Homebrew. I have been seeing that slightly more the last few weeks. I imagine all the people now working remote is putting a lot of pressure on ISPs.

Yeah, it seems to be so...

I cannot remember how many times I re-run Travis macOS builds last days. Also, Travis had some problems with macOS recently: https://www.traviscistatus.com/

Copy link
Collaborator

@StrikerRUS StrikerRUS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jameslamb Overall looks good! But please take a look at some my questions below.

CMakeLists.txt Outdated Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Outdated Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Outdated Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Outdated Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Show resolved Hide resolved
R-package/src/cmake/modules/FindLibR.cmake Show resolved Hide resolved
Copy link
Collaborator

@StrikerRUS StrikerRUS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jameslamb LGTM! Great step towards CRAN release! 🎉

Thanks a lot for your patience!

@StrikerRUS StrikerRUS merged commit 0341906 into microsoft:master Mar 24, 2020
jameslamb added a commit to jameslamb/LightGBM that referenced this pull request Mar 24, 2020
…#1440, fixes microsoft#1909) (microsoft#2901)

* [R-package] started cutting over from custom R-to-C interface to R.h

* replaced LGBM_SE with SEXP

* fixed error about ocnflicting definitions of length

* got linking working

* more stuff

* eliminated R CMD CHECK note about printing

* switched from hard-coded include dir to the one from FindLibR.cmake

* cleaned up formatting in FindLibR.cmake

* commented-out everything in CI that does not touch R

* more changes

* trying to get better logs

* tried ignoring

* added error message to confirm a suspicion

* still trying to find R during R CMD CHECK

* restore full CI

* fixed comment

* Update R-package/src/cmake/modules/FindLibR.cmake

* changed strategy for finding LIBR_HOME on Windows

* Removed 32-bit Windows stuff in FindLibR.cmake

* Update R-package/src/cmake/modules/FindLibR.cmake

* Update CMakeLists.txt

Co-Authored-By: Nikita Titov <[email protected]>

* Update CMakeLists.txt

Co-Authored-By: Nikita Titov <[email protected]>

* Update R-package/src/cmake/modules/FindLibR.cmake

Co-Authored-By: Nikita Titov <[email protected]>

* removed some duplication in cmake scripts

* Update R-package/src/cmake/modules/FindLibR.cmake

Co-Authored-By: Nikita Titov <[email protected]>

* Update R-package/src/cmake/modules/FindLibR.cmake

Co-Authored-By: Nikita Titov <[email protected]>

* Update R-package/src/cmake/modules/FindLibR.cmake

Co-Authored-By: Nikita Titov <[email protected]>

* Update R-package/src/cmake/modules/FindLibR.cmake

Co-Authored-By: Nikita Titov <[email protected]>

* Update R-package/src/cmake/modules/FindLibR.cmake

Co-Authored-By: Nikita Titov <[email protected]>

* Update R-package/src/cmake/modules/FindLibR.cmake

Co-Authored-By: Nikita Titov <[email protected]>

* added LIBR_CORE_LIBRARY back

* small fixes to CMakeLists

* simplified FindLibR.cmake

* some fixes for windows

* Apply suggestions from code review

Co-Authored-By: Nikita Titov <[email protected]>

* allowed for directly passing LIBR_EXECUTABLE to FindLibR.cmake

* reorganized FindLibR.cmake to catch more cases

* clean up inconsistencies  in R calls in FindLibR.cmake

* Update R-package/src/cmake/modules/FindLibR.cmake

Co-Authored-By: Nikita Titov <[email protected]>

* removed unnecessary log messages

* removed unnecessary unset() call

Co-authored-by: Nikita Titov <[email protected]>
@jameslamb jameslamb deleted the feat/r-header branch March 25, 2020 19:54
@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants