6. Rpy2

The rpy2 package makes it easy to start an R session and interact with it from within Python. It offers both a higher-level interface, which tries to be more Pythonic and hide the details of R’s language, and a lower-level interface, which evaluates R expressions directly.

Let’s start with a simple example using the low-level API:

[1]:
import rpy2.robjects as ro
import numpy as np

#Create a y object in R
ro.r('y <- rnorm(10)')

#get y as rpy2 object
ro.r['y']

#get y as an R representation
ro.r['y'].r_repr()

#copy y to a Numpy array
y_array = np.array(ro.r['y'])

Note that np.array copies the data, so if we change some of the values of y_array, ro.r['y'] will remain unchanged:

[2]:
y_array[0] = 120.10

#Changed:
y_array

#Unchanged:
ro.r['y']
[2]:
R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x7f91c0b8f548 / R:0x1b431d8>
[-0.402889, 0.719581, -0.315704, ..., 1.478772, -0.098176, 0.329627]

This is not the case if we create our array with np.asarray, which shares memory with the R vector instead of copying it:

[3]:
y_array = np.asarray(ro.r['y'])
y_array[0] = 120.10

#Changed:
y_array

#Changed (!!!):
ro.r['y']
[3]:
R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x7f91c0b8fd08 / R:0x1b431d8>
[120.100000, 0.719581, -0.315704, ..., 1.478772, -0.098176, 0.329627]
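
The difference between the two cells above comes from NumPy's copy semantics rather than from anything specific to R. The following is a minimal sketch that reproduces the same behaviour with NumPy alone. The name `source` is a hypothetical stand-in for the R vector and is not part of rpy2.

```python
import numpy as np

# Stand-in for the R vector; in the cells above this role is played by ro.r['y'].
source = np.array([0.5, 1.5, 2.5])

copied = np.array(source)    # np.array copies the data by default
shared = np.asarray(source)  # np.asarray reuses the existing buffer when it can

copied[0] = 120.10  # only the copy changes
shared[1] = 230.20  # the change is visible through source as well

print(source)
print(np.shares_memory(source, copied))  # False
print(np.shares_memory(source, shared))  # True
```

np.shares_memory is a quick way to check whether an array obtained from R is a copy or a view before modifying it.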

6.1. Automatic Numpy conversion

Using numpy2ri.activate(), you can automatically have ro.r['y'] converted to a Numpy array:

[4]:
from rpy2.robjects import numpy2ri
numpy2ri.activate()

y_array = ro.r['y']

#Returns numpy.ndarray!!
type(y_array)

#Note: use the following to deactivate automatic conversion:
#numpy2ri.deactivate()
[4]:
numpy.ndarray

Here again, changing y_array will change the underlying R object (which allows you to modify your R vector from within Python):

[5]:
y_array[0] = 230.20

#Changed:
y_array

#Changed:
ro.r['y']
[5]:
array([ 2.30200000e+02,  7.19581382e-01, -3.15704395e-01, -6.79324653e-01,
        2.90198908e+00,  1.58830438e+00,  6.13016246e-01,  1.47877191e+00,
       -9.81764286e-02,  3.29626622e-01])
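
When shared memory is not wanted, two plain NumPy idioms can guard against accidental writes. The sketch below uses an ordinary array (the hypothetical name `backing`) in place of the R vector, so it assumes only that NumPy is installed. It does not show anything rpy2 does by itself.

```python
import numpy as np

# Stand-in for an array that shares memory with an R vector (hypothetical data).
backing = np.array([0.72, -0.32, 1.48])
shared = np.asarray(backing)

# Option 1: work on an independent copy; writes never reach the original.
scratch = shared.copy()
scratch[0] = 230.20

# Option 2: a read-only view of the same memory; writes raise ValueError.
frozen = shared.view()
frozen.flags.writeable = False
try:
    frozen[0] = 230.20
    write_blocked = False
except ValueError:
    write_blocked = True

print(backing[0], write_blocked)
```

The copy costs extra memory but can be modified freely. The read-only view costs nothing and turns an unintended write into an error.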

6.2. Automatic Pandas conversion

Using pandas2ri.activate(), you can have an R data.frame automatically converted (and copied) to a Pandas DataFrame:

[6]:
from rpy2.robjects import pandas2ri
pandas2ri.activate()

ro.r('df <- data.frame(a=rnorm(100), b=rpois(100, 1))')

pd_df = ro.r['df']

type(pd_df)
[6]:
pandas.core.frame.DataFrame

6.3. Further reading

https://rpy2.readthedocs.io/en/version_2.8.x/getting-started.html