CHPC Software: R (programming language)

In computing, R is a programming language and software environment for statistical computing and graphics. It is an implementation of the S programming language with lexical scoping semantics inspired by Scheme. R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is now developed by the R Development Core Team. It is named partly after the first names of the first two R authors (Robert Gentleman and Ross Ihaka), and partly as a play on the name of S.

The R language has become a de facto standard among statisticians for developing statistical software, and it is also widely used for data analysis.

R is part of the GNU project. Its source code is freely available under the GNU General Public License, and pre-compiled binary versions are provided for various operating systems. R uses a command-line interface, though several graphical user interfaces are available.


R provides a wide variety of statistical techniques (linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, and others) and graphical techniques. R, like S, is designed around a true computer language, and it allows users to add functionality by defining new functions. There are some important differences between R and S, but much code written for S runs unaltered in R. Much of R's system is itself written in R, which makes it easy for users to follow the algorithmic choices made. For computationally intensive tasks, C, C++ and Fortran code can be linked and called at run time. Advanced users can write C code to manipulate R objects directly.
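As a minimal sketch of the modeling and user-defined-function features described above, the following R code fits a linear model to the built-in cars data set and defines a small helper. The name coef_of_variation is purely illustrative and is not part of base R.

```r
# Fit a simple linear model on the built-in 'cars' data set:
# stopping distance as a function of speed.
fit <- lm(dist ~ speed, data = cars)
slope <- coef(fit)[["speed"]]

# Users extend R by defining ordinary functions.
# 'coef_of_variation' is a hypothetical helper, not a base R function.
coef_of_variation <- function(x) {
  sd(x) / mean(x)
}

cv_speed <- coef_of_variation(cars$speed)

# Print the usual model summary (coefficients, residuals, R-squared).
print(summary(fit))
```

Typing the name of most R functions at the prompt (for example, lm) prints their R source, which is one way to follow the algorithmic choices mentioned above.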

R is also highly extensible through the use of user-submitted packages for specific functions or specific areas of study. Due to its S heritage, R has stronger object-oriented programming facilities than most statistical computing languages. Extending R is also eased by its permissive lexical scoping rules.
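The following sketch illustrates two of the language features mentioned here: lexical scoping, shown with a closure, and S3-style object orientation, shown with a generic function. All names (make_counter, describe, the circle class) are assumptions chosen for illustration.

```r
# Lexical scoping: the inner function keeps access to 'count' from the
# environment in which it was created.
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # update 'count' in the enclosing environment
    count
  }
}

counter <- make_counter()
first  <- counter()
second <- counter()       # 'count' persisted between the two calls

# S3 object orientation: a generic dispatches on the class attribute.
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "a generic object"
describe.circle  <- function(x, ...) paste("a circle of radius", x$radius)

shape <- structure(list(radius = 2), class = "circle")
label <- describe(shape)
```

Each call to make_counter() creates a fresh environment, so separate counters do not interfere with one another.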

Another of R's strengths is its graphical facilities, which produce publication-quality graphs that can include mathematical symbols. R has its own LaTeX-like documentation format, which is used to supply comprehensive documentation, both online in a number of formats and in hard copy.
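A short sketch of a plot with mathematical annotation, using base graphics and R's expression() notation. The output goes to a temporary PDF file so the example runs without a display; the choice of the standard normal density as the curve is an assumption made for illustration.

```r
# Write to a temporary PDF so no graphical display is required.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)

x <- seq(-3, 3, length.out = 200)

# expression() lets axis labels and titles contain mathematical notation.
plot(x, dnorm(x), type = "l",
     xlab = expression(x),
     ylab = expression(phi(x)),
     main = expression(phi(x) == frac(1, sqrt(2 * pi)) * e^{-x^2 / 2}))

dev.off()   # close the device and flush the file to disk
```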

Although R is mostly used by statisticians and other practitioners requiring an environment for statistical computation and software development, it can also be used as a general matrix calculation toolbox, with benchmark results comparable to GNU Octave and its proprietary counterpart, MATLAB. The RWeka package provides an interface between R and the popular data mining software Weka, including the ability to read and write the ARFF data format, so that the data mining capabilities of Weka can be combined with the statistical capabilities of R.
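To illustrate the matrix toolbox use, here is a minimal sketch using base R only. The matrix and right-hand side are made-up values.

```r
# A small 2 x 2 system A %*% x = b (values chosen for illustration).
A <- matrix(c(2, 1,
              1, 3), nrow = 2, byrow = TRUE)
b <- c(3, 5)

x       <- solve(A, b)     # solve the linear system
A_inv   <- solve(A)        # matrix inverse
product <- A %*% A_inv     # should be (numerically) the identity
eig     <- eigen(A)$values # eigenvalues of A
```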


The capabilities of R are extended through user-submitted packages, which provide specialized statistical techniques, graphical devices, programming interfaces, and import/export capabilities for many external data formats. These packages are developed in R, LaTeX, Java, and often C and Fortran. A core set of packages is included with the installation of R, and a total of 1862 packages (as of June 2009) are available from the Comprehensive R Archive Network (CRAN). Notable packages by subject area are listed along with comments on the official R Task View pages.
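A rough sketch of how packages are typically handled from the R prompt. Installing from CRAN requires network access and a writable library directory, so that line is left commented out; RWeka is used only as an example package name.

```r
# Installing from CRAN needs network access, so it is left commented out.
# install.packages("RWeka")

# Packages that ship with R can be loaded immediately.
library(stats)
library(utils)

# List the packages installed in the current library paths.
installed <- rownames(installed.packages())
has_stats <- "stats" %in% installed
```

Once a package is installed, library(name) loads it for the current session.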

Last Modified: August 07, 2009 @ 12:07:31