[Rcpp-devel] Performance issues with simple list class
Clemens Schmid
2017-06-08 12:29:56 UTC
Dear Rcpp developers,

First of all, thank you for your persistent work on and with Rcpp. This
mailing list is impressive!

I implemented a class_A and a list class, class_B, in the Rcpp modules
framework, following this example by Romain Francois:
https://stackoverflow.com/a/44303993/3216883
In my setup, class_A is far more complex than in the example, and the
factory function of class_B takes a DataFrame instead of a List. I have
not included code here because the linked example is already quite
comprehensive.

I can instantiate objects of class_A in R and store them in the
std::vector in class_B. That works fine. Unfortunately, I run into
performance problems with this approach. class_B is supposed to store up
to 100,000 instances of class_A, but the new() call in R for just 2,000
already takes far too long (about 20 s). I tried to figure out what
exactly takes so long and found that it is neither the construction of
the individual class_A objects nor pushing them into the vector. The
bottleneck seems to be related to how the class_B object is exposed to
R. That is especially unfortunate because I don't want to interact much
with class_B in R; I mostly use it in my C++ code. The interaction with
R goes through an as.data.frame() function, which is again pretty fast.

Do you have any ideas how I could avoid this bottleneck?

Clemens Schmid
Dirk Eddelbuettel
2017-06-12 11:07:00 UTC
On 8 June 2017 at 14:29, Clemens Schmid wrote:
| Dear Rcpp developers,
|
| First of all, thank you for your persistent work on and with Rcpp. This
| mailing list is impressive!
|
| I implemented a class_A and a list class, class_B, in the Rcpp modules
| framework, following this example by Romain Francois:
| https://stackoverflow.com/a/44303993/3216883
| In my setup, class_A is far more complex than in the example, and the
| factory function of class_B takes a DataFrame instead of a List. I have
| not included code here because the linked example is already quite
| comprehensive.
|
| I can instantiate objects of class_A in R and store them in the
| std::vector in class_B. That works fine. Unfortunately, I run into
| performance problems with this approach. class_B is supposed to store up
| to 100,000 instances of class_A, but the new() call in R for just 2,000
| already takes far too long (about 20 s). I tried to figure out what
| exactly takes so long and found that it is neither the construction of
| the individual class_A objects nor pushing them into the vector. The
| bottleneck seems to be related to how the class_B object is exposed to
| R. That is especially unfortunate because I don't want to interact much
| with class_B in R; I mostly use it in my C++ code. The interaction with
| R goes through an as.data.frame() function, which is again pretty fast.
|
| Do you have any ideas how I could avoid this bottleneck?

I fear nobody can tell without code.

Dirk
--
http://dirk.eddelbuettel.com | @eddelbuettel | ***@debian.org