[Rcpp-devel] operator<< issues
Iñaki Úcar
2018-03-14 17:51:32 UTC
Hi all,

I'm not sure whether this is a bug, so I think this list is the
right place to start. Consider the following code:

Rcpp::sourceCpp(code='
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void print_fun(Function x) {
  Rcout << x << std::endl;
}

// [[Rcpp::export]]
void print_env(Environment x) {
  Rcout << x << std::endl;
}
')

print_fun(function() {})
print_env(environment())

It compiles, and both functions print an address. So far, so good.
However, if we try the same with a data frame, compilation fails, so
we need to define operator<< as follows:

Rcpp::sourceCpp(code='
#include <Rcpp.h>
using namespace Rcpp;

inline std::ostream& operator<<(std::ostream& out, const DataFrame& df) {
  out << "data.frame";
  return out;
}

// [[Rcpp::export]]
void print_df(DataFrame x) {
  Rcout << x << std::endl;
}
')

print_df(data.frame(x=1))

Now it compiles and produces the output we defined. Once more, so
far, so good. The problem appears when we merge the two examples
above:

Rcpp::sourceCpp(code='
#include <Rcpp.h>
using namespace Rcpp;

inline std::ostream& operator<<(std::ostream& out, const DataFrame& df) {
  out << "data.frame";
  return out;
}

// [[Rcpp::export]]
void print_df(DataFrame x) {
  Rcout << x << std::endl;
}

// [[Rcpp::export]]
void print_fun(Function x) {
  Rcout << x << std::endl;
}

// [[Rcpp::export]]
void print_env(Environment x) {
  Rcout << x << std::endl;
}
')

Compilation fails again, this time due to an ambiguous overload for
the Function and Environment types, so we need to define operator<<
for these classes too in order to disambiguate. I suppose the same
may happen for other classes. Is this... expected? Desirable? At
the very least, it is confusing from my point of view.

Regards,
Iñaki
Iñaki Úcar
2018-03-19 07:24:28 UTC
Hi -- I hope my last email didn't hit the spam folder. :-)

Iñaki
Dirk Eddelbuettel
2018-03-19 11:36:10 UTC
On 19 March 2018 at 08:24, Iñaki Úcar wrote:
| Hi -- I hope my last email didn't hit the spam folder. :-)

It didn't, but it is a little hard to say anything here. Sometimes the
compiler needs help with disambiguation, as you said.

Dirk

--
http://dirk.eddelbuettel.com | @eddelbuettel | ***@debian.org
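
One plausible reading of the ambiguity, sketched in standalone C++ with
hypothetical stand-in types (`Handle`, `Frame` and `to_text` are invented for
illustration and are not part of Rcpp): if a class prints as an address only
because it converts implicitly to a pointer, then adding an operator<< for
another class that it can also convert to leaves the compiler with two equally
ranked user-defined conversions. Whether this is exactly what happens inside
Rcpp is an assumption here, not something the thread confirms. The sketch
shows the fix described above, an exact-match overload:

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-ins for illustration only; these are NOT Rcpp classes.
struct Handle {
    int payload = 0;
    // Implicit conversion to a pointer: lets `out << handle` pick the
    // built-in std::ostream::operator<<(const void*) and print an address.
    operator const void*() const { return &payload; }
};

struct Frame {
    Frame() = default;
    // Converting constructor: a Handle can silently become a Frame.
    Frame(const Handle&) {}
};

// User-defined printer for Frame, analogous to the DataFrame one in the thread.
inline std::ostream& operator<<(std::ostream& out, const Frame&) {
    return out << "frame";
}

// Without this overload, `out << Handle{}` is ambiguous in this sketch:
// Handle -> const void* and Handle -> Frame are both user-defined
// conversions and neither ranks better. An exact match on Handle wins.
inline std::ostream& operator<<(std::ostream& out, const Handle& h) {
    return out << "handle@" << static_cast<const void*>(h);
}

// Small helper so the printed result can be inspected as a string.
template <typename T>
std::string to_text(const T& value) {
    std::ostringstream out;
    out << value;
    return out.str();
}
```

With both overloads present, `to_text(Frame{})` yields `frame` and
`to_text(Handle{})` yields `handle@` followed by an address. Removing the
`Handle` overload should make the second call fail to compile as ambiguous.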
Tim Keitt
2018-03-20 03:33:05 UTC
http://www.keittlab.org/
Why not something like:

Rcpp::sourceCpp(code='
#include <Rcpp.h>
using Rcpp::Rcout;

// [[Rcpp::export]]
void print_addr(SEXP x) {
  Rcout << static_cast<void*>(x) << std::endl;
}')

I'm not sure why one would expect Rcpp types to automatically yield a
pointer appropriate for printing.

THK
Iñaki Úcar
2018-03-20 10:42:27 UTC
Tim Keitt wrote:
> I'm not sure why one would expect Rcpp types to automatically yield a
> pointer appropriate for printing.

I may have my own reasons for that, but that's not the point here. The
point is that I expected a homogeneous behaviour across Rcpp classes
when any object is passed to operator<< (i.e., print *something*).

By grepping the source, I discovered that Matrix and Vector have an
implementation of operator<<, but not the other classes.

Of course, this is a minor issue, because anyone is able to define
this operator if needed. I simply wanted to note it here, just in case
Dirk considered that all classes should have a well-defined one.

Iñaki
Dirk Eddelbuettel
2018-03-20 12:02:08 UTC
On 20 March 2018 at 11:42, Iñaki Úcar wrote:
| I may have my own reasons for that, but that's not the point here. The
| point is that I expected a homogeneous behaviour across Rcpp classes
| when any object is passed to operator<< (i.e., print *something*).
|
| By grepping the source, I discovered that Matrix and Vector have an
| implementation of operator<<, but not the other classes.
|
| Of course, this is a minor issue, because anyone is able to define
| this operator if needed. I simply wanted to note it here, just in case
| Dirk considered that all classes should have a well-defined one.

Yes, there are always some missing elements and holes. Contributions welcome.

The operator<<() addition for matrices and vectors came as those types are
most often used. "General purpose" operators are a lot of work, particularly
for data.frame; see e.g. all the work dplyr, tibble, pillar, ... are going through.

We do have a print() method though that dispatches to R's print function.

Dirk