R Programming/Ordination

From Wikibooks, open books for an open world
Jump to navigation Jump to search
The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.

Overview

This page provides basic code for creating a distance matrix and running and plotting a Non-metric Multidimensional Scaling (NMDS) ordination.

Read more about Ordination on Wikipedia.

This code relies on package vegan in R by Jari Oksanen.

Data

First, import data and load required libraries:

require(MASS)
require(vegan)
data(varespec)   # species data
data(varechem)   # environmental data

Distance matrix

bray <- vegdist(varespec, method = "bray")				# calculate a distance matrix

# There are many distance measure options for 'dist', 
# discoverable by running '?dist'. Common distance measures include:
       # 'bray' = Bray-Curtis
       # 'canb' = Canberra
       # 'euclidean' = Euclidean

Unconstrained Ordination

Displaying dissimilarity using NMDS

NMDS analysis and plotting:

nmds <- metaMDS(varespec, k = 2, 
          distance = 'bray', autotransform = FALSE) 	# semi-black box NMDS function

ordiplot(nmds, type = "text")			      # Plot NMDS ordination
fit <- envfit(nmds, varechem[ ,1:4])			   # Calculates environmental vectors
fit						        # Lists vector endpoint coordinates and r-squared values
plot(fit)						   # adds environmental vectors
# a linear representation of environmental variables is not always appropiate
# we could also add a smooth surface of the variable to the plot
ordisurf(nmds, varechem$N, add = TRUE, col = "darkgreen")
nmds$stress                                             # stress value
resulting nmds plot


In the metaMDS function, k is user-defined and relates to how easily the projection fits the dataframe when constrained to k dimensions. Conventional wisdom seems to suggest that stress should not exceed 10-12%. Stress is reduced by increasing the number of dimensions. However, increasing dimensionality might decrease the "realism" of a 2-dimensional plot of the first two NMDS axes.


We can also run a nMDS with 3 dimensions, fit environmental vectors and create a dynamic graph:

nmds3d <- metaMDS(varespec, k = 3, 
  distance = 'bray', autotransform = FALSE)              # run nmds with 3 dimensions
nmds3d$stress                                            # stress drops
fit3d <- envfit(nmds3d, varechem[ ,1:4], choices = 1:3)  # fit environmental vectors to 3d space
ordirgl(nmds3d, envfit = fit3d)                          # dynamic 3D graph

Running a principle component analysis (PCA) on environmental data

chem_pca <- rda(varechem, scale = TRUE)    # Run PCA
biplot(chem_pca, scaling = 2)              # display biplot
PCA biplot

Constrained Ordination