6. Rpy2
The rpy2 package lets you start an R session and interact with it from within Python. It offers both a higher-level interface, which tries to be more Pythonic and to hide the details of the R language, and a lower-level interface, which lets you evaluate R expressions directly.
Let's start with a simple example that uses rpy2.robjects to evaluate an R expression directly:
[1]:
import rpy2.robjects as ro
import numpy as np
#Create a y object in R
ro.r('y <- rnorm(10)')
#get y as rpy2 object
ro.r['y']
#get y as an R representation
ro.r['y'].r_repr()
#copy y to a Numpy array
y_array = np.array(ro.r['y'])
Note that np.array copies the data, so if we change some of the values of y_array, ro.r['y'] remains unchanged:
[2]:
y_array[0] = 120.10
#Changed:
y_array
#Unchanged:
ro.r['y']
[2]:
R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x7f91c0b8f548 / R:0x1b431d8>
[-0.402889, 0.719581, -0.315704, ..., 1.478772, -0.098176, 0.329627]
This is not the case if we create the array with np.asarray, which here returns a view on the R vector's memory rather than a copy:
[3]:
y_array = np.asarray(ro.r['y'])
y_array[0] = 120.10
#Changed:
y_array
#Changed (!!!):
ro.r['y']
[3]:
R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x7f91c0b8fd08 / R:0x1b431d8>
[120.100000, 0.719581, -0.315704, ..., 1.478772, -0.098176, 0.329627]
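The copy-versus-view behaviour above can also be reproduced with NumPy alone. The following is a minimal sketch in which a plain NumPy array stands in for the R vector (an assumption made so the example runs without R); the variable names are illustrative only.

```python
import numpy as np

# Stand-in for the R vector: an object whose memory NumPy can reuse.
source = np.array([-0.4, 0.7, -0.3])

copied = np.array(source)    # np.array copies the data by default
shared = np.asarray(source)  # np.asarray reuses the existing buffer when it can

copied[0] = 120.10
print(source[0])             # still -0.4: the copy is independent

shared[0] = 120.10
print(source[0])             # now 120.1: the write went through to the source

# np.shares_memory reports whether two arrays overlap in memory.
print(np.shares_memory(source, copied))  # False
print(np.shares_memory(source, shared))  # True
```

When it is unclear whether an array obtained from R is a copy or a view, np.shares_memory is a quick way to check before modifying it.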
6.1. Automatic Numpy conversion
Using numpy2ri.activate(), you can automatically have ro.r['y'] converted to a Numpy array:
[4]:
from rpy2.robjects import numpy2ri
numpy2ri.activate()
y_array = ro.r['y']
#Returns numpy.ndarray!!
type(y_array)
#Note: use the following to deactivate automatic conversion:
#numpy2ri.deactivate()
[4]:
numpy.ndarray
Here again, changing y_array changes the underlying R object, which lets you modify an R vector from within Python:
[5]:
y_array[0] = 230.20
#Changed:
y_array
#Changed:
ro.r['y']
[5]:
array([ 2.30200000e+02, 7.19581382e-01, -3.15704395e-01, -6.79324653e-01,
2.90198908e+00, 1.58830438e+00, 6.13016246e-01, 1.47877191e+00,
-9.81764286e-02, 3.29626622e-01])
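Because numpy2ri.activate() changes the conversion for the whole session, it can be convenient to limit the conversion to a single block of code instead. The following is a minimal sketch, assuming an rpy2 version that provides rpy2.robjects.conversion.localconverter (check the documentation of your installed version). The helper name r_to_numpy is hypothetical, and the imports are placed inside the function only so that the definition can be loaded on a machine without R.

```python
def r_to_numpy(r_code):
    """Evaluate r_code in R and return the result with NumPy conversion applied.

    Hypothetical helper, not part of rpy2.
    """
    import rpy2.robjects as ro
    from rpy2.robjects import numpy2ri
    from rpy2.robjects.conversion import localconverter

    # The NumPy conversion rules apply only inside this with-block.
    with localconverter(ro.default_converter + numpy2ri.converter):
        return ro.r(r_code)
```

Provided numpy2ri.activate() has not been called, r_to_numpy('rnorm(10)') would be expected to return a numpy.ndarray while leaving the conversion used elsewhere in the session untouched.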
6.2. Automatic Pandas conversion
Using pandas2ri.activate(), you can automatically have an R data.frame converted and copied to a Pandas DataFrame:
[6]:
from rpy2.robjects import pandas2ri
pandas2ri.activate()
ro.r('df <- data.frame(a=rnorm(100), b=rpois(100, 1))')
pd_df = ro.r['df']
type(pd_df)
[6]:
pandas.core.frame.DataFrame
6.3. Further reading
https://rpy2.readthedocs.io/en/version_2.8.x/getting-started.html