aroma - An R object-oriented microarray analysis environment

Henrik Bengtsson

Centre for Mathematical Sciences
Mathematical Statistics
Lund Institute of Technology,
Lund University,

ISSN 1403-9338
We have implemented a self-contained package for DNA microarray analysis in R. The package is named aroma and was formerly known as com.braju.sma. The purpose of the package is to provide end-users with easy access to the latest statistical tools and methods, to provide other statisticians with a structured platform to develop and test new analysis methods, and to make our methods available to an audience as broad as possible. In turn, this gives us valuable feedback for improving the package and its algorithms. The package provides methods for low-level analysis and pre-processing of spotted microarray data with two or more channels, although the recent methods are applicable to single-channel microarrays as well. Many of the commonly known and widely used normalization methods have been implemented, as well as our own normalization and calibration methods in both early and mature versions. A few high-level analysis methods are also available. Import and export of microarray data is supported for most free and commercial file formats, and support for new ones can easily be added. Connectivity to other microarray analysis platforms based on relational databases is planned. For exploratory data analysis, the package provides a large set of methods for visualization of data with automatic graphical annotations. Textual and symbolic annotation of graphs is straightforward. Generated graphics can be exported for on-screen presentations and high-resolution printed publications. Report generators for simultaneously creating documents in plain text, HTML, and LaTeX are available, making it easy to create batch analysis scripts. Detailed documentation with code examples and references to the literature is included. In order to provide time- and memory-efficient algorithms, an object-oriented design and implementation utilizing reference variables has been used.
In addition, to minimize the risk of programming mistakes, a consistent application programming interface (API) was developed by enforcing an R coding convention. This makes it possible for a new user to quickly become familiar with the structure of classes and methods, enhancing overall productivity. To further minimize the risk of programming errors, method arguments are validated as far as possible and explanatory error messages are used. Priority is also given to backward compatibility with previous versions and to compatibility with other existing and potential future packages. The package does not provide a graphical user interface (GUI) per se, but the object-oriented API and the fact that reference variables are used make it easy for third-party developers to implement a GUI for, say, everyday high-throughput normalization. The package is available online as open source and is written in 100% R for maximum accessibility on all platforms. Installation is straightforward.
Key words:
microarray analysis; single-, two-, and multiple-channel microarrays; low-level analysis; pre-processing of data; visualization; calibration; normalization; data import and export; object-oriented programming; reference variables; coding conventions; data hiding; encapsulation; virtual fields