

Weighted gene correlation network analysis (WGCNA) is a powerful network analysis tool that can be used to identify groups of highly correlated genes that co-occur across your samples. Thus genes are sorted into modules, and these modules can then be correlated with other traits (which must be continuous variables). WGCNA was originally created to assess gene expression data in human patients, and the authors of the method and R package provide a thorough tutorial with in-depth explanations. More recently, the method has been applied to microbial communities (Duran-Pinedo et al., 2011; Aylward et al., 2015; Guidi et al., 2016; Wilson et al., 2018). The following is a walkthrough using microbial sequence abundances and environmental data from my 2018 work.

Background: WGCNA finds how clusters of genes (or, in our case, abundances of operational taxonomic units, OTUs) correlate with traits (or, in our case, environmental variables or biochemical rates) using hierarchical clustering, novel applications of weighted adjacency functions and topological overlap measures, and a dynamic tree cutting method. Very simply, each OTU is represented by a node in a vast network, and the adjacency (a score between 0 and 1) between each pair of nodes is calculated. Many networks use hard thresholding, where the connection score between any two nodes is set to 1 if it is above a certain threshold and to 0 if it is below it. This ignores the actual strength of the connection, so WGCNA instead constructs a weighted gene (or OTU) co-occurrence adjacency matrix in lieu of 'hard' thresholding.
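To make the contrast between hard and weighted ('soft') thresholding concrete, here is a minimal sketch in plain Python. It is an illustration only and does not use the WGCNA R package. The OTU abundances are made up, the cutoff `tau = 0.8` and power `beta = 6` are assumed values chosen for the example, and the weighted score shown is the unsigned form, the absolute correlation raised to a power.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length abundance profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def hard_adjacency(r, tau=0.8):
    """Hard thresholding: 1 if |r| is above the cutoff, else 0 (tau is an assumed value)."""
    return 1 if abs(r) > tau else 0


def soft_adjacency(r, beta=6):
    """Weighted adjacency: |r| raised to a power, giving a score in [0, 1] (beta is an assumed value)."""
    return abs(r) ** beta


def adjacency(profiles, score):
    """Apply a scoring function to the correlation of every pair of nodes."""
    names = sorted(profiles)
    return {
        (a, b): score(pearson(profiles[a], profiles[b]))
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical OTU abundances across five samples (not real data).
otus = {
    "OTU_A": [10, 20, 30, 40, 50],
    "OTU_B": [12, 18, 33, 41, 48],
    "OTU_C": [50, 10, 40, 20, 30],
}

hard = adjacency(otus, hard_adjacency)
soft = adjacency(otus, soft_adjacency)

for pair in hard:
    print(pair, "hard:", hard[pair], "soft:", round(soft[pair], 4))
```

With hard thresholding, every pair collapses to 0 or 1, so a correlation just under the cutoff is indistinguishable from no correlation at all. The weighted version keeps a continuous score, and raising the correlation to a power shrinks weak correlations far more than strong ones.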
