Constant memory leak #1151

Closed
joshmyzie2 opened this issue Apr 1, 2016 · 14 comments
@joshmyzie2

joshmyzie2 commented Apr 1, 2016

I created a simple dashboard, a single plot with plotly and a couple small tables. It works great, but seems to eat about 100MB of memory per hour. Running gc() does not reclaim any memory, so I have to kill R and restart the server twice a day or my 2GB server will run out of memory.

I split the dashboard in two: one with just the plot and the other with just the tables. The plotly dashboard is the one consuming 90% of the memory. Any ideas on how to debug this further?

ui.R

library(shiny)
library(plotly)

syms <- c("A","B","C","D","E")

shinyUI(fluidPage(
    sidebarPanel(checkboxGroupInput(inputId="syms", label=NULL, choices=syms, selected=syms)),
    mainPanel(plotlyOutput("plot")),
    tags$style(type="text/css", ".recalculating {opacity: 1.0;}")
    ))

server.R

f <- function(input, output, session) {
    output$plot <- renderPlotly({
        invalidateLater(5000, session)
        syms <- paste("`",paste(isolate(input$syms), collapse="`"),sep="")
        d <- read.csv(paste("http://127.0.0.1:9897/q.csv?f",syms,sep=""), colClasses=c("factor","integer","POSIXct","numeric"))
        ggplotly(ggplot(d, aes(time,px,color=sym,group=bid)) + geom_step())
    })
}

shinyServer(f)
@wch
Collaborator

wch commented Apr 1, 2016

Are you able to reproduce this with something like renderPlot instead of renderPlotly? Plotly is an external package, so it would be great if you could make sure that the problem is within Shiny, and not Plotly.

@wch
Collaborator

wch commented Apr 1, 2016

A couple other thoughts... You might want to try eliminating variables one by one. For example, possible sources of memory leakage:

  • ggplot2
  • the downloading of the csv
  • ggplotly
  • renderPlotly

It would probably be best to eliminate the http download and replace it with a CSV on disk, and then decrease the invalidateLater interval so that you can test it faster. Then you could try eliminating the other variables and see which one(s) makes a difference.
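One way to test each suspect in isolation is to call it in a tight loop outside the browser and record memory after a forced garbage collection. The following is a minimal base R sketch of that idea, not code from the original app: `measure_growth`, `leaky_step`, and `clean_step` are hypothetical names, and the two step functions stand in for whichever piece (CSV read, ggplot2, ggplotly) is under test.

```r
# Hypothetical helper: run `step` n times and record R's memory use (in MB)
# after a forced garbage collection on each iteration.
measure_growth <- function(step, n = 20) {
  used_mb <- numeric(n)
  for (i in seq_len(n)) {
    step()
    used_mb[i] <- sum(gc()[, 2])  # column 2 of gc() is used memory in Mb
  }
  used_mb
}

# A step that leaks on purpose by appending to a global list.
leak_store <- list()
leaky_step <- function() {
  leak_store[[length(leak_store) + 1]] <<- rnorm(1e5)
  invisible(NULL)
}

# A step that allocates the same amount but keeps no reference to it.
clean_step <- function() {
  x <- rnorm(1e5)
  invisible(NULL)
}

leaky <- measure_growth(leaky_step)
clean <- measure_growth(clean_step)
```

Plotting `leaky` and `clean` against the iteration index should show a steady climb for the first and a roughly flat line for the second. Swapping a real candidate into `step` gives a much faster signal than waiting on `invalidateLater` in a browser.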

@joshmyzie2
Author

Thanks for the suggestions. I eliminated all 3rd party libraries and am able to reproduce the leak with just renderTable and random data. Unfortunately, the leak is much slower and takes days to get to 2-4x the original memory. It's also not leaking at a consistent rate. Speeding up invalidateLater didn't help too much as my browser can't keep up after a while.

Not sure I'm going to be able to get to the bottom of this, so feel free to close this issue until I have something more specific.

@wch
Collaborator

wch commented Apr 15, 2016

This is an interesting issue. I won't close it, but if you do find out more, please keep us updated.

@wch
Collaborator

wch commented Apr 15, 2016

Also, can you post the code for your app that uses renderTable with random data?

jcheng5 self-assigned this Apr 15, 2016

@jcheng5
Member

jcheng5 commented Apr 18, 2016

I suspect this may be the source of the plotly memory leak. cc @cpsievert

https://github.com/ropensci/plotly/blob/2748789ecda04ccba6c6a5fd9cb7af2872bfe983/R/utils.R#L30-L42

@cpsievert
Collaborator

cpsievert commented Apr 19, 2016

If replacing ggplotly() with gg2list() fixes the problem, then it almost certainly has to do with hash_plot().

Anything in particular that you're suspicious of @jcheng5? Perhaps the assign() call?

@jcheng5
Member

jcheng5 commented Apr 19, 2016

Yeah, the fact that creating these objects has the side effect of storing (large?) amounts of data in plotlyEnv, which prevents the garbage collector from being able to collect it. Do entries ever get removed from plotlyEnv?

@cpsievert
Collaborator

No, and that is really what enables this black magic.
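The retention pattern being discussed can be shown with a few lines of base R. This is an illustrative sketch only, not plotly's actual `hash_plot()`: `registry_env`, `register_data`, and the `toy_handle` class are hypothetical stand-ins for a package-level environment like plotlyEnv that is written to with `assign()` and never cleaned up.

```r
# Hypothetical stand-in for a package-level environment such as plotlyEnv.
registry_env <- new.env(parent = emptyenv())

# Store the data under a fresh key and return a lightweight handle.
# (Illustrative only; this is not plotly's actual implementation.)
register_data <- function(data) {
  key <- paste0("plot_", length(ls(registry_env)) + 1)
  assign(key, data, envir = registry_env)
  structure(list(key = key), class = "toy_handle")
}

# Simulate five re-renders; each one overwrites the previous handle.
for (i in 1:5) {
  h <- register_data(data.frame(x = rnorm(1e4)))
}
rm(h)
invisible(gc())

# All five data frames are still reachable through the environment,
# so gc() cannot free them even though no handle refers to them.
n_retained <- length(ls(registry_env))

# Removing the entries makes the data collectable again.
rm(list = ls(registry_env), envir = registry_env)
n_after_cleanup <- length(ls(registry_env))
```

In a Shiny app that re-renders every few seconds, each render would add another entry, which matches the steady growth reported at the top of the thread.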

@jcheng5
Member

jcheng5 commented Apr 19, 2016

Ugh... sorry Carson but I very strongly believe this is a bad idea. The memory leak aspect alone is extremely troubling, and it also means that plotly plots can't be serialized across R sessions which will break them in RStudio Server if you leave them for too long (and even RStudio Desktop in some cases where it restarts the R session but tries to preserve your environment, such as "Build and Reload" on a package).

There is a way to accomplish a very similar effect to what you want here, but without the hackiness or memory leak. It is based on the concept of monads, which @hadley and I have been looking closely at for R. The basic idea: suppose you have a thing like a plotly object that isn't itself a data frame but "has" a data frame inside it, and you have a function (like augment) that transforms data frames but knows nothing about plotly. You can then pipe them together using a new operator (Haskell calls it fmap and uses the operator <*>). This operator would be an S3 generic that (in this case) extracts the data frame from the plotly object, runs the function on the data frame, puts the result into a new plotly object, and returns that.

plot_ly(df) %<*>%
  loess(whatever) %<*>%
  augment()

So if %>% means "transform the thing on the left using the function call on the right", then %<*>% means "transform the value inside the thing on the left using the function call on the right".
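A minimal sketch of such an operator as an S3 generic might look like the following. Everything here is hypothetical: `toy_plot` is a stand-in container rather than a real plotly object, `fmap` and its method are illustrative names, and for simplicity the right-hand side is a plain function rather than a function call as in the pipeline above.

```r
# Hypothetical container: a "toy_plot" that wraps a data frame.
toy_plot <- function(data) {
  structure(list(data = data), class = "toy_plot")
}

# Generic: apply f to the value held inside x and re-wrap the result.
fmap <- function(x, f, ...) UseMethod("fmap")

# Method for toy_plot: extract the data frame, transform it,
# and put the result into a new toy_plot.
fmap.toy_plot <- function(x, f, ...) {
  toy_plot(f(x$data, ...))
}

# Infix form; the right-hand side must be a function in this sketch.
`%<*>%` <- function(lhs, rhs) fmap(lhs, rhs)

# A function that knows about data frames but nothing about toy_plot.
add_sum <- function(d) {
  d$z <- d$x + d$y
  d
}

p  <- toy_plot(data.frame(x = 1:3, y = c(2, 4, 6)))
p2 <- p %<*>% add_sum
```

Because the data travels inside the object, nothing has to be stored in a global environment, so an object that goes out of scope can be garbage collected along with its data.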

@jcheng5
Member

jcheng5 commented Apr 19, 2016

Btw it wouldn't be your job to make the generic operator, just the implementation of it for plotly, which would be like 3 lines of code. This assumes the existence of a @hadley/monad package.

@cpsievert
Collaborator

Thanks @jcheng5, and no need for apologies, I'll have a look at https://github.com/hadley/monads

@jcheng5
Member

jcheng5 commented Apr 19, 2016

Oh I totally forgot that he actually started on a package :) The operator is %>>%.

@wch
Collaborator

wch commented Jul 27, 2016

Closing for now because the problem was in plotly. If you are still experiencing a memory leak with renderTable, you can open a new issue.
