DomStratStats

Domain Stratified Statistics

The goal of DomStratStats is to bring better statistics to protein sequence analysis. The standard approaches are based on p-values and E-values, but here we introduce q-values and lFDRs (local False Discovery Rates) for protein domains. The key to making q-values and lFDRs work better is stratification: in this case, analyzing each domain family separately, which best balances noise across domain families.
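To illustrate what stratifying means in practice, here is a minimal Python sketch. It is not the DomStratStats implementation (which is a Perl application). It assumes each domain prediction comes with a p-value and a family label, and it uses Benjamini-Hochberg adjusted p-values as a conservative stand-in for q-values. The function names `bh_qvalues` and `stratified_qvalues`, the family labels `famA` and `famB`, and the p-values are all hypothetical.

```python
from collections import defaultdict


def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    Assumption: used here as a conservative q-value estimate; this is a
    stand-in, not necessarily the estimator DomStratStats uses.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


def stratified_qvalues(hits):
    """Compute q-values separately within each domain family.

    hits: list of (family, pvalue) pairs (hypothetical input format).
    Returns one q-value per hit, in input order.
    """
    by_family = defaultdict(list)
    for idx, (family, _p) in enumerate(hits):
        by_family[family].append(idx)

    q = [None] * len(hits)
    for idxs in by_family.values():
        # Each family is its own stratum with its own multiple-testing correction.
        family_q = bh_qvalues([hits[i][1] for i in idxs])
        for i, value in zip(idxs, family_q):
            q[i] = value
    return q


# Made-up example data: two families with different numbers of predictions.
hits = [("famA", 0.001), ("famA", 0.04), ("famB", 0.03),
        ("famA", 0.5), ("famB", 0.2)]
for (family, p), qv in zip(hits, stratified_qvalues(hits)):
    print(f"{family}\tp={p}\tq={qv:.3g}")
```

Because each family is corrected on its own, a family with many noisy predictions does not inflate the q-values of a family with few, clean predictions, which is the balancing effect described above.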

ProtDomain allows you to use DomStratStats as a Domain Finder for your sequences. DomStratStats is developed and maintained as a Perl application by Alejandro Ochoa García, an Assistant Professor at Duke University and a former student at SinghLab. To install and use DomStratStats directly, please refer to its GitHub repository, which also serves as the software's manual.


2015-11-17. Alejandro Ochoa, John D Storey, Manuel Llinás, and Mona Singh. Beyond the E-value: stratified statistics for protein domain prediction. PLoS Comput Biol. 11 e1004509. Article, arXiv 2014-09-23.