Discussion:
[Rcpp-devel] Problem with memory usage when using Rcpp(Parallel) in a package
Thanh Le Hoang
2017-10-08 22:22:34 UTC
_______________________________________________
Rcpp-devel mailing list
Rcpp-***@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Thanh Le Hoang
2017-10-10 10:40:47 UTC
Dirk Eddelbuettel
2017-10-10 11:21:30 UTC
On 10 October 2017 at 12:40, Thanh Le Hoang wrote:
| [DELETED ATTACHMENT <no suggested filename>, HTML]

Can you please try again in text mode?

Dirk
--
http://dirk.eddelbuettel.com | @eddelbuettel | ***@debian.org
Thanh Le Hoang
2017-10-10 20:48:05 UTC
Hello, you can find a text copy of the previous emails below.
I have already found a solution for my problem, but thanks for your reply.

Thanh
Sent: Tuesday, 10 October 2017 at 13:21
Subject: Re: [Rcpp-devel] Problem with memory usage when using Rcpp(Parallel) in a package
| [DELETED ATTACHMENT <no suggested filename>, HTML]
Can you please try again in text mode?
Dirk
--
Replying to my own email since I just found the solution. I had somehow broken the
Makevars/Makevars.win files, so I deleted them and created new files into which I
copied the Makevars lines from the RcppParallel webpage verbatim. I also had to add
to my package so that there were no errors with the NAMESPACE file when running
roxygen2.
Sent: Monday, 9 October 2017 at 00:22
Subject: [Rcpp-devel] Problem with memory usage when using Rcpp(Parallel) in a package
Hello,
I'm writing my first package for a machine learning algorithm called self-organizing
map where I use compiled code (with Rcpp) and parallelization (RcppParallel).
My computer runs Windows 10 (64 bit, 8 GB RAM), and I currently have a problem
with the memory usage (shown in the Windows Task Manager), which keeps going up the
longer the algorithm runs. The usage doesn't increase immediately, only after a couple
of seconds, and I only noticed it when I tried larger data sets. The memory is
only freed by terminating/restarting the R session.
What is somewhat strange is that the memory usage is not attributed to RStudio or
the R session (i.e. the memory usage in the task manager does not go up for
the respective processes). According to RAMMap (which gives more information
about memory usage on Windows) the used memory belongs to the "nonpaged pool".
The RStudio profiler and lineprof did not seem to detect the memory leak (if
I read the output correctly). So far I have rewritten parts of the C++ code to
use references and pre-allocated memory, but it did not help.
The main function in the package calls several smaller functions written in
C++, and it seems that all of those functions play a role here, but I have found
one function where this problem occurs consistently. It calculates the (squared)
Euclidean norm of each row of a given matrix (in parallel), with a boolean
vector (oldColumns) specifying which columns should be used/ignored during this
calculation:
https://pastebin.com/qgyzx0M7
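For readers who cannot open the link, here is a minimal sketch of the kind of computation described above. It is not the code from the pastebin: the struct name SquaredNormWorker, the row-major std::vector layout and the useColumn mask are assumptions made for illustration, and it uses only the C++ standard library instead of Rcpp/RcppParallel types. The worker processes a half-open range of rows, so disjoint ranges can be handed to separate threads by whatever scheduler is used.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical worker for illustration only (not the original package code).
// Computes the squared Euclidean norm of each row in [begin, end),
// skipping every column whose mask entry is false.
struct SquaredNormWorker {
    const std::vector<double>& data;     // matrix values, row-major (assumed layout)
    std::size_t nCols;                   // number of columns per row
    const std::vector<bool>& useColumn;  // true = include this column
    std::vector<double>& norms;          // pre-allocated output, one entry per row

    // Each call writes only to norms[begin..end), so calls on disjoint
    // row ranges do not touch the same output elements.
    void operator()(std::size_t begin, std::size_t end) const {
        assert(useColumn.size() == nCols);
        for (std::size_t i = begin; i < end; ++i) {
            double sum = 0.0;
            for (std::size_t j = 0; j < nCols; ++j) {
                if (useColumn[j]) {
                    const double x = data[i * nCols + j];
                    sum += x * x;
                }
            }
            norms[i] = sum;
        }
    }
};
```

In RcppParallel, a class of this shape would derive from RcppParallel::Worker and be passed to parallelFor(); here the ranges can simply be called one after another to check the result.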
When I pasted this code into a new project, I noticed that the problem
only happens when I build (with devtools::build()) and install a package
containing this function, regardless of whether I build a source package or
a binary package. When I just sourceCpp a file with this function, no
memory problems occur. So could this have anything to do with how I build packages?
Until now I have followed the "R packages" book written by Hadley Wickham for this.
Here is some R code which generates some test data and calls the function.
https://pastebin.com/c0RaeW9K
Every time I run this code (which takes a couple of minutes), the memory usage
goes up by 4-6%, which makes my package unusable for larger sets of data.
I have been stuck on this problem for a week now and any help would be
appreciated.
Thank you,
Thanh