Discussion:
[Rcpp-devel] Code review for xtensor R bindings
Wolf Vollprecht
2017-06-06 21:15:46 UTC
Permalink
Dear RCpp developer community,

I put together a small proof of concept for the xtensor R bindings over the
weekend.
It lives here: https://github.com/QuantStack/xtensor-r

xtensor is an exciting C++ library which provides a unified C++ tensor
interface for Python/NumPy, Julia and soon R.

My biggest question is in the memory management of R. Maybe you could do a
quick code review?
Is it enough to call Rcpp_release_object with the SEXP in the destructor of
the rxarray (which I haven't implemented yet)?
Also for the shape of the rxarray, I allocate another R-vector. Is that
going to get destroyed upon destruction of the R-array?

Best,

Wolf
Dirk Eddelbuettel
2017-06-06 22:06:11 UTC
Permalink
Howdy,

On 6 June 2017 at 14:15, Wolf Vollprecht wrote:
| I put together a small proof of concept for the xtensor R bindings over the
| weekend. 
| It lives here: https://github.com/QuantStack/xtensor-r

Nice! Not sure I have spare cycles right now for this but that is very welcome.

| xtensor is an exciting C++ library which provides a unified C++ tensor
| interface for Python/NumPy, Julia and soon R.
|
| My biggest question is in the memory management of R. Maybe you could do a
| quick code review?
| Is it enough to call Rcpp_release_object with the SEXP in the destructor of the
| rxarray (which I haven't implemented yet)?
| Also for the shape of the rxarray, I allocate another R-vector. Is that going
| to get destroyed upon destruction of the R-array?

The general idea when working with R objects is that

-- on the way in from R to the C++ functions we construct object such that
the existing memory (from R) is used; one example is how RcppArmadillo
uses the 'advanced' constructors of Armadillo allowing to operate
without copies; hence R managed memory is used while C++ functions are
called; in general this side of the conversion is the templates as<>()
converters;

-- on the way back we use the *alloc functions from the C-level API of R to
construct objects that use memory from R's pool, once we return these
objects to R as SEXP they are indistinguishable to R from its own; these
are the wrap() functions (and I think we may make on occassion make "one
final copy" at his level, but I'd have to double-check).

I haven't had a chance to look at what Wes is doing with Apache Arrow, and
what is happening with xtensor -- so thanks for getting the ball rolling.

Dirk
--
http://dirk.eddelbuettel.com | @eddelbuettel | ***@debian.org
Dirk Eddelbuettel
2017-06-06 23:27:21 UTC
Permalink
On 6 June 2017 at 17:06, Dirk Eddelbuettel wrote:
| The general idea when working with R objects is that
|
| -- on the way in from R to the C++ functions we construct object such that
| the existing memory (from R) is used; one example is how RcppArmadillo
| uses the 'advanced' constructors of Armadillo allowing to operate
| without copies; hence R managed memory is used while C++ functions are
| called; in general this side of the conversion is the templates as<>()
| converters;
|
| -- on the way back we use the *alloc functions from the C-level API of R to
| construct objects that use memory from R's pool, once we return these
| objects to R as SEXP they are indistinguishable to R from its own; these
| are the wrap() functions (and I think we may make on occassion make "one
| final copy" at his level, but I'd have to double-check).

Actually, let me rephrase: "in most cases not involving native R types" do we
make one copy at the return.

The general approach is iterator based, see internal/wrap.h. But there is a
lot of template magic...

Dirk

| I haven't had a chance to look at what Wes is doing with Apache Arrow, and
| what is happening with xtensor -- so thanks for getting the ball rolling.
|
| Dirk
|
| --
| http://dirk.eddelbuettel.com | @eddelbuettel | ***@debian.org
| _______________________________________________
| Rcpp-devel mailing list
| Rcpp-***@lists.r-forge.r-project.org
| https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
--
http://dirk.eddelbuettel.com | @eddelbuettel | ***@debian.org
Wolf Vollprecht
2017-06-07 01:01:06 UTC
Permalink
Ok, I think I got the idea.

In the R->xtensor bindings, we always work directly on memory allocated
through R (either passed in from R or allocated from C++ code).

The review-relevant lines are really only ~10 lines: https://github.com/
QuantStack/xtensor-r/blob/master/inst/include/xtensor-r/rxarray.hpp#L145

const int vtype = traits::r_sexptype_traits<int>::rtype;
SEXP shape_sxp = Rf_allocVector(vtype, shape.size());

int* r_shape = INTEGER(shape_sxp);
// set shape values ...
m_sexp = Rf_allocArray(SXP, shape_sxp);

So afterwards, the idea would be to make a call to

Rcpp_PreserveObject(m_sexp);

right? And if the destructor of the object is called (and the object is
owned from C++) we would just call

Rcpp_ReleaseObject(m_sexp);

And the memory is going to be freed again, at some point. And calling
Rcpp_ReleaseObject(m_sexp) would also clear the shape_sxp "attribute",
right?

Thanks for your reply!
Wolf
Post by Dirk Eddelbuettel
| The general idea when working with R objects is that
|
| -- on the way in from R to the C++ functions we construct object such that
| the existing memory (from R) is used; one example is how RcppArmadillo
| uses the 'advanced' constructors of Armadillo allowing to operate
| without copies; hence R managed memory is used while C++ functions are
| called; in general this side of the conversion is the templates as<>()
| converters;
|
| -- on the way back we use the *alloc functions from the C-level API of R to
| construct objects that use memory from R's pool, once we return these
| objects to R as SEXP they are indistinguishable to R from its own; these
| are the wrap() functions (and I think we may make on occassion make "one
| final copy" at his level, but I'd have to double-check).
Actually, let me rephrase: "in most cases not involving native R types" do we
make one copy at the return.
The general approach is iterator based, see internal/wrap.h. But there is a
lot of template magic...
Dirk
| I haven't had a chance to look at what Wes is doing with Apache Arrow, and
| what is happening with xtensor -- so thanks for getting the ball rolling.
|
| Dirk
|
| --
| _______________________________________________
| Rcpp-devel mailing list
| https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
--
Dirk Eddelbuettel
2017-06-07 01:15:41 UTC
Permalink
On 6 June 2017 at 18:01, Wolf Vollprecht wrote:
| Ok, I think I got the idea. 
|
| In the R->xtensor bindings, we always work directly on memory allocated through
| R (either passed in from R or allocated from C++ code).
|
| The review-relevant lines are really only ~10 lines: https://github.com/
| QuantStack/xtensor-r/blob/master/inst/include/xtensor-r/rxarray.hpp#L145
|
|     const int vtype = traits::r_sexptype_traits<int>::rtype;
|     SEXP shape_sxp = Rf_allocVector(vtype, shape.size());
|
|     int* r_shape = INTEGER(shape_sxp);
|         // set shape values ... 
|     m_sexp = Rf_allocArray(SXP, shape_sxp);
|
| So afterwards, the idea would be to make a call to 
|
| Rcpp_PreserveObject(m_sexp);
|
| right? And if the destructor of the object is called (and the object is owned
| from C++) we would just call 
|
| Rcpp_ReleaseObject(m_sexp);
|
| And the memory is going to be freed again, at some point. And calling
| Rcpp_ReleaseObject(m_sexp) would also clear the shape_sxp "attribute", right?

Yes, that is roughly correct and how we used to do things -- but you can
probably even do better by relying on

Rcpp::Shield<yourTypeHere>

which then covers the preserve / release parts in its own ctor / dtor pair.

And I glanced at your repo, that looks pretty good so far. You found how to
declare C++14 as its default C++ standard, this of course requires R 3.4.0.

Dirk
--
http://dirk.eddelbuettel.com | @eddelbuettel | ***@debian.org
Romain Francois
2017-06-07 07:29:44 UTC
Permalink
Hi,

Instead of doing that:

const int vtype = traits::r_sexptype_traits<int>::rtype;
SEXP shape_sxp = Rf_allocVector(vtype, shape.size());

Perhaps your rxarray object could hold an IntegerVector. Also not quite sure why you use the first line at all, which does not depend on T, so vtype is always going to be INTSXP.


Also, you should avoid using std::cout in https://github.com/QuantStack/xtensor-r/blob/master/src/rcpp_hello_xtensor.cpp#L14 <https://github.com/QuantStack/xtensor-r/blob/master/src/rcpp_hello_xtensor.cpp#L14>

You can use Rcpp::Rcout instead, or Rprintf.
And the memory is going to be freed again, at some point. And calling Rcpp_ReleaseObject(m_sexp) would also clear the shape_sxp "attribute", right?
The shape_sxp object you create here (and don’t protect)
https://github.com/QuantStack/xtensor-r/blob/master/inst/include/xtensor-r/rxarray.hpp#L152 <https://github.com/QuantStack/xtensor-r/blob/master/inst/include/xtensor-r/rxarray.hpp#L152>

Is probably duplicated here:
https://github.com/QuantStack/xtensor-r/blob/master/inst/include/xtensor-r/rxarray.hpp#L160 <https://github.com/QuantStack/xtensor-r/blob/master/inst/include/xtensor-r/rxarray.hpp#L160>

As part of allocArray’s business:
https://github.com/wch/r-source/blob/d60c742aa8acc764874db87e3c748e27986e1134/src/main/array.c#L259 <https://github.com/wch/r-source/blob/d60c742aa8acc764874db87e3c748e27986e1134/src/main/array.c#L259>

Romain
Ok, I think I got the idea.
In the R->xtensor bindings, we always work directly on memory allocated through R (either passed in from R or allocated from C++ code).
The review-relevant lines are really only ~10 lines: https://github.com/QuantStack/xtensor-r/blob/master/inst/include/xtensor-r/rxarray.hpp#L145 <https://github.com/QuantStack/xtensor-r/blob/master/inst/include/xtensor-r/rxarray.hpp#L145>
const int vtype = traits::r_sexptype_traits<int>::rtype;
SEXP shape_sxp = Rf_allocVector(vtype, shape.size());
int* r_shape = INTEGER(shape_sxp);
// set shape values ...
m_sexp = Rf_allocArray(SXP, shape_sxp);
So afterwards, the idea would be to make a call to
Rcpp_PreserveObject(m_sexp);
right? And if the destructor of the object is called (and the object is owned from C++) we would just call
Rcpp_ReleaseObject(m_sexp);
And the memory is going to be freed again, at some point. And calling Rcpp_ReleaseObject(m_sexp) would also clear the shape_sxp "attribute", right?
Thanks for your reply!
Wolf
| The general idea when working with R objects is that
|
| -- on the way in from R to the C++ functions we construct object such that
| the existing memory (from R) is used; one example is how RcppArmadillo
| uses the 'advanced' constructors of Armadillo allowing to operate
| without copies; hence R managed memory is used while C++ functions are
| called; in general this side of the conversion is the templates as<>()
| converters;
|
| -- on the way back we use the *alloc functions from the C-level API of R to
| construct objects that use memory from R's pool, once we return these
| objects to R as SEXP they are indistinguishable to R from its own; these
| are the wrap() functions (and I think we may make on occassion make "one
| final copy" at his level, but I'd have to double-check).
Actually, let me rephrase: "in most cases not involving native R types" do we
make one copy at the return.
The general approach is iterator based, see internal/wrap.h. But there is a
lot of template magic...
Dirk
| I haven't had a chance to look at what Wes is doing with Apache Arrow, and
| what is happening with xtensor -- so thanks for getting the ball rolling.
|
| Dirk
|
| --
| _______________________________________________
| Rcpp-devel mailing list
| https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel <https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel>
--
_______________________________________________
Rcpp-devel mailing list
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Dirk Eddelbuettel
2017-06-07 10:59:56 UTC
Permalink
On 7 June 2017 at 09:29, Romain Francois wrote:
| Hi,
|
| Instead of doing that:
|
| const int vtype = traits::r_sexptype_traits<int>::rtype;
| SEXP shape_sxp = Rf_allocVector(vtype, shape.size());
|
| Perhaps your rxarray object could hold an IntegerVector. Also not quite sure
| why you use the first line at all, which does not depend on T, so vtype is
| always going to be INTSXP.
|
| Also, you should avoid using std::cout in https://github.com/QuantStack/
| xtensor-r/blob/master/src/rcpp_hello_xtensor.cpp#L14
|
| You can use Rcpp::Rcout instead, or Rprintf.

Yep. And 'install_and_test.R' builds via build() and installs the package
which is muddled; the demo code is better in inst/demo/ or inst/scripts.

For what it is worth I looked into taking the xtensor .deb package and
creating an Ubuntu 16.04 package via the launchpad.net service. I have that
in 'my' PPA (https://launchpad.net/~edd/+archive/ubuntu/misc/+packages) and
will try the same for a 14.04 package at which point we could use Travis
easily. I may create a simple 'more standard R package' PR.

Dirk

| And the memory is going to be freed again, at some point. And calling
| Rcpp_ReleaseObject(m_sexp) would also clear the shape_sxp "attribute",
| right?
|
| The shape_sxp object you create here (and don’t protect)
| https://github.com/QuantStack/xtensor-r/blob/master/inst/include/xtensor-r/
| rxarray.hpp#L152
|
| Is probably duplicated here:
| https://github.com/QuantStack/xtensor-r/blob/master/inst/include/xtensor-r/
| rxarray.hpp#L160
|
| As part of allocArray’s business:
| https://github.com/wch/r-source/blob/d60c742aa8acc764874db87e3c748e27986e1134/
| src/main/array.c#L259
|
| Romain
|
|
|
|
| Le 7 juin 2017 à 03:01, Wolf Vollprecht <***@gmail.com> a écrit :
|
| Ok, I think I got the idea.
|
| In the R->xtensor bindings, we always work directly on memory allocated
| through R (either passed in from R or allocated from C++ code).
|
| The review-relevant lines are really only ~10 lines: https://github.com/
| QuantStack/xtensor-r/blob/master/inst/include/xtensor-r/rxarray.hpp#L145
|
| const int vtype = traits::r_sexptype_traits<int>::rtype;
| SEXP shape_sxp = Rf_allocVector(vtype, shape.size());
|
| int* r_shape = INTEGER(shape_sxp);
| // set shape values ...
| m_sexp = Rf_allocArray(SXP, shape_sxp);
|
| So afterwards, the idea would be to make a call to
|
| Rcpp_PreserveObject(m_sexp);
|
| right? And if the destructor of the object is called (and the object is
| owned from C++) we would just call
|
| Rcpp_ReleaseObject(m_sexp);
|
| And the memory is going to be freed again, at some point. And calling
| Rcpp_ReleaseObject(m_sexp) would also clear the shape_sxp "attribute",
| right?
|
| Thanks for your reply!
| [cleardot]
| Wolf
|
|
| 2017-06-06 16:27 GMT-07:00 Dirk Eddelbuettel <***@debian.org>:
|
|
| On 6 June 2017 at 17:06, Dirk Eddelbuettel wrote:
| | The general idea when working with R objects is that
| |
| | -- on the way in from R to the C++ functions we construct object
| such that
| | the existing memory (from R) is used; one example is how
| RcppArmadillo
| | uses the 'advanced' constructors of Armadillo allowing to
| operate
| | without copies; hence R managed memory is used while C++
| functions are
| | called; in general this side of the conversion is the templates
| as<>()
| | converters;
| |
| | -- on the way back we use the *alloc functions from the C-level API
| of R to
| | construct objects that use memory from R's pool, once we return
| these
| | objects to R as SEXP they are indistinguishable to R from its
| own; these
| | are the wrap() functions (and I think we may make on occassion
| make "one
| | final copy" at his level, but I'd have to double-check).
|
| Actually, let me rephrase: "in most cases not involving native R types"
| do we
| make one copy at the return.
|
| The general approach is iterator based, see internal/wrap.h. But there
| is a
| lot of template magic...
|
| Dirk
|
| | I haven't had a chance to look at what Wes is doing with Apache
| Arrow, and
| | what is happening with xtensor -- so thanks for getting the ball
| rolling.
| |
| | Dirk
| |
| | --
| | http://dirk.eddelbuettel.com | @eddelbuettel | ***@debian.org
| | _______________________________________________
| | Rcpp-devel mailing list
| | Rcpp-***@lists.r-forge.r-project.org
| | https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/
| rcpp-devel
|
| --
| http://dirk.eddelbuettel.com | @eddelbuettel | ***@debian.org
|
|
| _______________________________________________
| Rcpp-devel mailing list
| Rcpp-***@lists.r-forge.r-project.org
| https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
|
|
| _______________________________________________
| Rcpp-devel mailing list
| Rcpp-***@lists.r-forge.r-project.org
| https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
--
http://dirk.eddelbuettel.com | @eddelbuettel | ***@debian.org
Wolf Vollprecht
2017-06-07 23:06:08 UTC
Permalink
The idea with IntegerVector for shape is great. I had to fix some minor
stuff in xtensor but it appears to be working.

Ok, good to know about the shape copy!

Yeah, this is literally a weekend project for now (and I haven't used R
before) so this was just a quick-n-dirty test script.

Cheers,

Wolf
Post by Dirk Eddelbuettel
| Hi,
|
|
| const int vtype = traits::r_sexptype_traits<int>::rtype;
| SEXP shape_sxp = Rf_allocVector(vtype, shape.size());
|
| Perhaps your rxarray object could hold an IntegerVector. Also not quite sure
| why you use the first line at all, which does not depend on T, so vtype is
| always going to be INTSXP.
|
| Also, you should avoid using std::cout in https://github.com/QuantStack/
| xtensor-r/blob/master/src/rcpp_hello_xtensor.cpp#L14
|
| You can use Rcpp::Rcout instead, or Rprintf.
Yep. And 'install_and_test.R' builds via build() and installs the package
which is muddled; the demo code is better in inst/demo/ or inst/scripts.
For what it is worth I looked into taking the xtensor .deb package and
creating an Ubuntu 16.04 package via the launchpad.net service. I have that
in 'my' PPA (https://launchpad.net/~edd/+archive/ubuntu/misc/+packages) and
will try the same for a 14.04 package at which point we could use Travis
easily. I may create a simple 'more standard R package' PR.
Dirk
| And the memory is going to be freed again, at some point. And calling
| Rcpp_ReleaseObject(m_sexp) would also clear the shape_sxp "attribute",
| right?
|
| The shape_sxp object you create here (and don’t protect)
| https://github.com/QuantStack/xtensor-r/blob/master/inst/
include/xtensor-r/
| rxarray.hpp#L152
|
| https://github.com/QuantStack/xtensor-r/blob/master/inst/
include/xtensor-r/
| rxarray.hpp#L160
|
| https://github.com/wch/r-source/blob/d60c742aa8acc764874db87e3c748e
27986e1134/
| src/main/array.c#L259
|
| Romain
|
|
|
|
|
| Ok, I think I got the idea.
|
| In the R->xtensor bindings, we always work directly on memory allocated
| through R (either passed in from R or allocated from C++ code).
|
https://github.com/
| QuantStack/xtensor-r/blob/master/inst/include/xtensor-r/
rxarray.hpp#L145
|
| const int vtype = traits::r_sexptype_traits<int>::rtype;
| SEXP shape_sxp = Rf_allocVector(vtype, shape.size());
|
| int* r_shape = INTEGER(shape_sxp);
| // set shape values ...
| m_sexp = Rf_allocArray(SXP, shape_sxp);
|
| So afterwards, the idea would be to make a call to
|
| Rcpp_PreserveObject(m_sexp);
|
| right? And if the destructor of the object is called (and the object is
| owned from C++) we would just call
|
| Rcpp_ReleaseObject(m_sexp);
|
| And the memory is going to be freed again, at some point. And calling
| Rcpp_ReleaseObject(m_sexp) would also clear the shape_sxp "attribute",
| right?
|
| Thanks for your reply!
| [cleardot]
| Wolf
|
|
|
|
| | The general idea when working with R objects is that
| |
| | -- on the way in from R to the C++ functions we construct object
| such that
| | the existing memory (from R) is used; one example is how
| RcppArmadillo
| | uses the 'advanced' constructors of Armadillo allowing to
| operate
| | without copies; hence R managed memory is used while C++
| functions are
| | called; in general this side of the conversion is the templates
| as<>()
| | converters;
| |
| | -- on the way back we use the *alloc functions from the C-level API
| of R to
| | construct objects that use memory from R's pool, once we return
| these
| | objects to R as SEXP they are indistinguishable to R from its
| own; these
| | are the wrap() functions (and I think we may make on occassion
| make "one
| | final copy" at his level, but I'd have to double-check).
|
| Actually, let me rephrase: "in most cases not involving native R types"
| do we
| make one copy at the return.
|
| The general approach is iterator based, see internal/wrap.h. But there
| is a
| lot of template magic...
|
| Dirk
|
| | I haven't had a chance to look at what Wes is doing with Apache
| Arrow, and
| | what is happening with xtensor -- so thanks for getting the ball
| rolling.
| |
| | Dirk
| |
| | --
| | _______________________________________________
| | Rcpp-devel mailing list
| | https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/
| rcpp-devel
|
| --
|
|
| _______________________________________________
| Rcpp-devel mailing list
| https://lists.r-forge.r-project.org/cgi-bin/mailman/
listinfo/rcpp-devel
|
|
| _______________________________________________
| Rcpp-devel mailing list
| https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
--
Loading...