Supplementary Information
“An expanded evaluation of protein function prediction methods shows
an improvement in accuracy” by Jiang Y. et al.
Genome Biology, 2015
Content:
• Supplementary Figures.
1. Benchmark annotation depth distribution.
2. Benchmark information content distribution.
3. Benchmark sequence identity distribution.
4. Top predictors, precision-recall curves.
5. Top predictors, easy vs. difficult, precision-recall curves.
6. Top predictors, eukarya vs. prokarya, precision-recall curves.
7. Top predictors, species breakdown, Fmax bars.
8. Top predictors, weighted precision-recall curves.
9. Top predictors, normalized remaining uncertainty-misinformation.
10. Similarity networks between methods.
11. Keyword usage by top methods.
• Supplementary Table 1. Participating teams.
Additional supplementary data (297 MB), comprising further data, analyses and the full prediction
results for every method, are available at:
https://dx.doi.org/10.6084/m9.figshare.2059944.v1
Code used in CAFA2 is available at:
https://github.com/yuxjiang/CAFA2
Supplementary Figure 1 Distribution of depths of the leaf annotations, over all benchmarks in
(A) Molecular Function ontology, (B) Biological Process ontology, (C) Cellular Component ontology
and (D) Human Phenotype ontology. A leaf term for a benchmark protein is defined as any term
whose descendant nodes (more specific nodes) are not among the experimentally determined terms
for that protein.
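The leaf-term definition above can be sketched in a few lines of Python. This is only an illustration: the names `leaf_terms`, `depth` and the `parents` mapping are hypothetical, and the source does not state how depth is measured, so the longest path to a root is assumed here.

```python
def leaf_terms(annotated, parents):
    """Return the terms in `annotated` that have no descendant in `annotated`.

    annotated: set of term IDs experimentally annotated to one protein.
    parents:   dict mapping a term ID to the set of its direct parents.
    """
    def ancestors(term):
        seen, stack = set(), list(parents.get(term, ()))
        while stack:
            t = stack.pop()
            if t not in seen:
                seen.add(t)
                stack.extend(parents.get(t, ()))
        return seen

    # A term is not a leaf if it is an ancestor of another annotated term.
    non_leaves = set()
    for term in annotated:
        non_leaves |= ancestors(term) & annotated
    return annotated - non_leaves


def depth(term, parents):
    """Longest path from `term` up to a root (assumed convention; root = 0)."""
    ps = parents.get(term, ())
    return 0 if not ps else 1 + max(depth(p, parents) for p in ps)
```

Collecting `depth(t, parents)` for every leaf term of every benchmark protein would give the kind of distribution shown in the figure, under the stated assumption about depth.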
Supplementary Figure 2 Histograms and boxplots of the total information content of benchmark
proteins and of all experimentally annotated proteins at time t1, i.e., the point of benchmark
collection: (A) Molecular Function ontology, (B) Biological Process ontology, (C) Cellular Component
ontology, and (D) Human Phenotype ontology. The information content of each directed acyclic
graph was calculated according to [9]. The red point in each plot indicates the information
content of the predicted annotation corresponding to the Naïve baseline model.
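As a rough illustration of the quantity plotted here, the sketch below estimates a per-term information content as the negative base-2 logarithm of the probability of a term given all of its parents (the definition used for Supplementary Figure 8) and sums it over the terms of a protein. The function names and the toy corpus layout are hypothetical, and annotations are assumed to be already propagated to the root.

```python
import math


def information_content(annotations, parents):
    """Estimate ic(f) = log2(1 / Pr(f | all parents of f)) from a corpus.

    annotations: dict protein -> set of terms, already propagated to the root
                 (an assumption of this sketch).
    parents:     dict term -> set of direct parent terms.
    """
    terms = set().union(*annotations.values())
    ic = {}
    for f in terms:
        ps = parents.get(f, set())
        # Proteins annotated with every parent of f.
        with_parents = [t for t in annotations.values() if ps <= t]
        with_term = sum(1 for t in with_parents if f in t)
        ic[f] = math.log2(len(with_parents) / with_term)
    return ic


def total_information_content(protein_terms, ic):
    """Total information content of one protein's annotation graph."""
    return sum(ic[f] for f in protein_terms)
```

With a root term carried by every protein, the root contributes zero, so the total is driven by the more specific terms.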
Supplementary Figure 2A:
Supplementary Figure 2B:
Supplementary Figure 2C:
Supplementary Figure 2D:
Supplementary Figure 3 Histograms of the pairwise sequence identity between each benchmark
protein and the experimentally annotated template most similar to it: (A) Molecular Function
ontology, (B) Biological Process ontology, and (C) Cellular Component ontology. The histograms
roughly delineate two groups of benchmarks: easy, with maximum global sequence identity greater
than or equal to 60%, and difficult, with maximum global sequence identity below 60%.
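The easy/difficult split can be written as a one-line rule. The sketch below is a minimal illustration; the function name and the treatment of a protein with no template (labelled difficult) are assumptions, not taken from the source.

```python
def benchmark_category(identities, cutoff=60.0):
    """Classify a benchmark protein from its global sequence identities (%)
    to experimentally annotated templates.

    A protein with no template at all is treated as difficult (assumption).
    """
    best = max(identities, default=0.0)
    return "easy" if best >= cutoff else "difficult"
```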
Supplementary Figure 4 Precision-recall curves for the top-performing methods for (A) Molecular
Function ontology, (B) Biological Process ontology, (C) Cellular Component ontology and (D)
Human Phenotype ontology. All panels show the top ten participating methods in each category, as
well as the Naïve and BLAST baseline methods. Points corresponding to the maximum F-measure
are marked with circles on each curve. The legend provides the maximum F-measure (F) and coverage
(C) for all methods. In cases where a Principal Investigator (PI) participated with multiple teams,
only the results of the best-scoring method are presented.
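The circled point on each curve can be located as sketched below, assuming the F-measure is the usual harmonic mean of precision and recall (this supplement does not restate the definition). The function name and the `(threshold, precision, recall)` tuple layout are hypothetical.

```python
def f_max(curve):
    """Return (maximum F-measure, threshold at which it occurs).

    curve: iterable of (threshold, precision, recall) tuples.
    """
    best = (0.0, None)
    for tau, pr, rc in curve:
        if pr + rc > 0:
            f = 2 * pr * rc / (pr + rc)  # harmonic mean of precision and recall
            if f > best[0]:
                best = (f, tau)
    return best
```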
Supplementary Figure 4A:
Supplementary Figure 4B:
Supplementary Figure 4C:
Supplementary Figure 4D:
Supplementary Figure 5 Precision-recall curves for the top-performing methods for (A) easy
benchmark category and Molecular Function ontology, (B) difficult benchmark category and
Molecular Function ontology, (C) easy benchmark category and Biological Process ontology, (D)
difficult benchmark category and Biological Process ontology, (E) easy benchmark category and
Cellular Component ontology and (F) difficult benchmark category and Cellular Component ontology.
All panels show the top ten participating methods in each category, as well as the Naïve and BLAST
baseline methods. Points corresponding to the maximum F-measure are marked with circles on each
curve. The legend provides the maximum F-measure (F) and coverage (C) for all methods. In cases
where a Principal Investigator (PI) participated with multiple teams, only the results of the
best-scoring method are presented.
Supplementary Figure 5A (easy):
Supplementary Figure 5B (difficult):
Supplementary Figure 5C (easy):
Supplementary Figure 5D (difficult):
Supplementary Figure 5E (easy):
Supplementary Figure 5F (difficult):
Supplementary Figure 6 Precision-recall curves for the top-performing methods for (A) eukaryotic
benchmark category and Molecular Function ontology, (B) prokaryotic benchmark category and
Molecular Function ontology, (C) eukaryotic benchmark category and Biological Process ontology,
(D) prokaryotic benchmark category and Biological Process ontology, (E) eukaryotic benchmark
category and Cellular Component ontology and (F) prokaryotic benchmark category and Cellular
Component ontology. All panels show the top ten participating methods in each category, as well
as the Naïve and BLAST baseline methods. Points corresponding to the maximum F-measure are
marked with circles on each curve. The legend provides the maximum F-measure (F) and coverage
(C) for all methods. In cases where a Principal Investigator (PI) participated with multiple teams,
only the results of the best-scoring method are presented.
Supplementary Figure 6A (eukarya):
Supplementary Figure 6B (prokarya):
Supplementary Figure 6C (eukarya):
Supplementary Figure 6D (prokarya):
Supplementary Figure 6E (eukarya):
Supplementary Figure 6F (prokarya):
Supplementary Figure 7 Performance evaluation based on the maximum F-measure for the
top-performing methods for the Molecular Function ontology (A–F), Biological Process ontology
(G–O), and Cellular Component ontology (P–V). Only species with 15 or more benchmark proteins
are included. All bars show the top ten participating methods as well as the Naïve and BLAST
baseline methods. A perfect predictor would have an Fmax of 1. Confidence intervals (95%)
were determined using bootstrapping with 10,000 iterations on the set of target sequences.
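A bootstrap over target sequences can be sketched as follows. The caption does not say how the interval was derived from the bootstrap samples, so the percentile method is assumed here; `bootstrap_ci` and its `statistic` callback (for example, a function that recomputes Fmax on the resampled targets) are hypothetical names.

```python
import random


def bootstrap_ci(targets, statistic, iterations=10_000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval (assumed method).

    targets:   list of per-target records (e.g. benchmark proteins).
    statistic: function mapping a resampled list of targets to a number.
    """
    rng = random.Random(seed)
    n = len(targets)
    values = sorted(
        # Resample n targets with replacement and recompute the statistic.
        statistic([targets[rng.randrange(n)] for _ in range(n)])
        for _ in range(iterations)
    )
    lo = values[int((1 - level) / 2 * iterations)]
    hi = values[min(iterations - 1, int((1 + level) / 2 * iterations))]
    return lo, hi
```

Resampling whole targets, rather than individual term predictions, keeps all predictions for one protein together in each bootstrap sample.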
Supplementary Figure 7A (Arabidopsis thaliana):
Supplementary Figure 7B (Escherichia coli K12):
Supplementary Figure 7C (Homo sapiens):
Supplementary Figure 7D (Mus musculus):
Supplementary Figure 7E (Pseudomonas aeruginosa):
Supplementary Figure 7F (Rattus norvegicus):
Supplementary Figure 7G (Arabidopsis thaliana):
Supplementary Figure 7H (Danio rerio):
Supplementary Figure 7I (Dictyostelium discoideum):
Supplementary Figure 7J (Drosophila melanogaster):
Supplementary Figure 7K (Escherichia coli K12):
Supplementary Figure 7L (Homo sapiens):
Supplementary Figure 7M (Mus musculus):
Supplementary Figure 7N (Pseudomonas aeruginosa):
Supplementary Figure 7O (Rattus norvegicus):
Supplementary Figure 7P (Arabidopsis thaliana):
Supplementary Figure 7Q (Drosophila melanogaster):
Supplementary Figure 7R (Escherichia coli K12):
Supplementary Figure 7S (Homo sapiens):
Supplementary Figure 7T (Mus musculus):
Supplementary Figure 7U (Rattus norvegicus):
Supplementary Figure 7V (Saccharomyces cerevisiae):
Supplementary Figure 8 Weighted precision-recall curves for the top-performing methods for (A)
Molecular Function ontology, (B) Biological Process ontology, (C) Cellular Component ontology
and (D) Human Phenotype ontology. All panels show the top ten participating methods in each
category, as well as the Naïve and BLAST baseline methods. Points corresponding to the maximum
weighted F-measure are marked with circles on each curve. The legend provides the maximum
weighted F-measure (F) and coverage (C) for all methods. In cases where a Principal Investigator
(PI) participated with multiple teams, only the results of the best-scoring method are presented.
Calculation of the weighted precision-recall curve. Each term $f$ in the ontology was weighted
according to its information content, calculated as

$$\mathrm{ic}(f) = \log_2 \frac{1}{\Pr(f \mid \mathcal{P}(f))},$$

where $\Pr(f \mid \mathcal{P}(f))$ is the probability that the term $f$ is associated with a protein
given that all of its parents $\mathcal{P}(f)$ are associated with that protein. Probabilities were
determined from the union of the Swiss-Prot, UniProt-GOA and GO Consortium databases. Weighted
precision and recall are calculated as

$$\mathrm{wpr}(\tau) = \frac{1}{m(\tau)} \sum_{i=1}^{m(\tau)} \frac{\sum_f \mathrm{ic}(f) \cdot \mathbb{1}\left(f \in P_i(\tau) \wedge f \in T_i\right)}{\sum_f \mathrm{ic}(f) \cdot \mathbb{1}\left(f \in P_i(\tau)\right)}$$

and

$$\mathrm{wrc}(\tau) = \frac{1}{n_e} \sum_{i=1}^{n_e} \frac{\sum_f \mathrm{ic}(f) \cdot \mathbb{1}\left(f \in P_i(\tau) \wedge f \in T_i\right)}{\sum_f \mathrm{ic}(f) \cdot \mathbb{1}\left(f \in T_i\right)},$$

where $P_i(\tau)$ is the set of predicted terms for protein $i$ with score no less than threshold
$\tau$, $T_i$ is the set of true terms for protein $i$, $m(\tau)$ is the number of sequences with at
least one predicted score greater than or equal to $\tau$, and $n_e$ is the number of proteins used in
a particular mode of evaluation. In the full evaluation mode $n_e = n$, the number of benchmark
proteins, whereas in the partial evaluation mode $n_e = m(0)$.
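The weighted precision and recall at a single threshold can be sketched as below. This is an illustrative reading of the formulas rather than the organizers' code: the function name and input layout are hypothetical, `ic` is assumed to contain every term that appears, true annotations are assumed to carry positive total information, and proteins whose predicted terms carry zero total information are left out of the precision average to avoid dividing by zero.

```python
def weighted_pr_rc(pred, truth, ic, tau, mode="full"):
    """Weighted precision and recall at threshold tau.

    pred:  dict protein -> {term: score}, for proteins with predictions.
    truth: dict protein -> set of true terms (the benchmark proteins).
    ic:    dict term -> information content.
    mode:  "full" (n_e = all benchmark proteins) or "partial"
           (n_e = benchmark proteins with at least one prediction).
    """
    def weight(terms):
        return sum(ic[f] for f in terms)

    def predicted(i):
        return {f for f, s in pred.get(i, {}).items() if s >= tau}

    # Precision is averaged over proteins with a usable prediction at tau.
    covered = [i for i in truth if weight(predicted(i)) > 0]
    wpr = None
    if covered:
        wpr = sum(weight(predicted(i) & truth[i]) / weight(predicted(i))
                  for i in covered) / len(covered)

    evaluated = list(truth) if mode == "full" else [i for i in truth if pred.get(i)]
    wrc = None
    if evaluated:
        wrc = sum(weight(predicted(i) & truth[i]) / weight(truth[i])
                  for i in evaluated) / len(evaluated)
    return wpr, wrc
```

Sweeping `tau` over the score range and collecting the returned pairs traces out one weighted precision-recall curve; in full mode, a benchmark protein without predictions contributes zero recall.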
Supplementary Figure 8A:
Supplementary Figure 8B:
Supplementary Figure 8C:
Supplementary Figure 8D:
Supplementary Figure 9 Normalized remaining uncertainty-misinformation curves for the
top-performing methods for (A) Molecular Function ontology, (B) Biological Process ontology, (C)
Cellular Component ontology and (D) Human Phenotype ontology. All panels show the top ten
participating methods in each category, as well as the Naïve and BLAST baseline methods. Points
corresponding to the minimum normalized semantic distance [40] are marked with circles on each
curve. The legend provides the minimum normalized semantic distance (S) and coverage (C) for all
methods. In cases where a Principal Investigator (PI) participated with multiple teams, only the
results of the best-scoring method are presented.
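Calculation of the normalized remaining uncertainty-misinformation curve. The normalized
remaining uncertainty and normalized misinformation are calculated as

$$\mathrm{nru}(\tau) = \frac{1}{n_e} \sum_{i=1}^{n_e} \frac{\sum_f \mathrm{ic}(f) \cdot \mathbb{1}\left(f \notin P_i(\tau) \wedge f \in T_i\right)}{\sum_f \mathrm{ic}(f) \cdot \mathbb{1}\left(f \in P_i(\tau) \vee f \in T_i\right)}$$

and

$$\mathrm{nmi}(\tau) = \frac{1}{n_e} \sum_{i=1}^{n_e} \frac{\sum_f \mathrm{ic}(f) \cdot \mathbb{1}\left(f \in P_i(\tau) \wedge f \notin T_i\right)}{\sum_f \mathrm{ic}(f) \cdot \mathbb{1}\left(f \in P_i(\tau) \vee f \in T_i\right)},$$

where $P_i(\tau)$ is the set of predicted terms for protein $i$ with score no less than threshold
$\tau$, $T_i$ is the set of true terms for protein $i$, and $n_e$ is the number of proteins used in a
particular mode of evaluation. In the full evaluation mode $n_e = n$, the number of benchmark
proteins, whereas in the partial evaluation mode $n_e$ is the number of proteins that have at least
one positive predicted score.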
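A sketch of these two quantities at one threshold follows, together with a minimum distance over a set of thresholds. The function names and input layout are hypothetical; the Euclidean combination of the two values is an assumption of this sketch, since the supplement only cites [40] for the normalized semantic distance, and at least one protein is assumed to be evaluated.

```python
import math


def normalized_ru_mi(pred, truth, ic, tau, mode="full"):
    """Normalized remaining uncertainty and misinformation at threshold tau.

    pred:  dict protein -> {term: score}; truth: dict protein -> set of terms;
    ic:    dict term -> information content (assumed to cover every term).
    """
    def weight(terms):
        return sum(ic[f] for f in terms)

    if mode == "full":
        evaluated = list(truth)
    else:  # partial mode: proteins with at least one positive score
        evaluated = [i for i in truth
                     if any(s > 0 for s in pred.get(i, {}).values())]
    nru = nmi = 0.0
    for i in evaluated:
        P = {f for f, s in pred.get(i, {}).items() if s >= tau}
        T = truth[i]
        union = weight(P | T)
        if union > 0:
            nru += weight(T - P) / union  # true terms that were missed
            nmi += weight(P - T) / union  # predicted terms that are not true
    return nru / len(evaluated), nmi / len(evaluated)


def min_normalized_distance(pred, truth, ic, thresholds):
    """Minimum over thresholds of sqrt(nru^2 + nmi^2) (assumed combination)."""
    return min(math.hypot(*normalized_ru_mi(pred, truth, ic, t))
               for t in thresholds)
```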
Supplementary Figure 9A:
Supplementary Figure 9B:
Supplementary Figure 9C:
Supplementary Figure 9D:
Supplementary Figure 10 Similarity networks of participating methods for (A) Molecular Function
ontology, (B) Biological Process ontology, (C) Cellular Component ontology and (D) Human
Phenotype ontology. In all panels, similarity is computed as Pearson's correlation coefficient
between methods, with a 0.75 cutoff applied for illustration purposes. A unique color is assigned
to all methods submitted under the same principal investigator. Methods that were not evaluated
(organizers' methods) are shown as triangles, while baseline methods (Naïve and BLAST) are shown
as squares. The top 10 methods are highlighted with enlarged nodes circled in red. Edge width
indicates the strength of similarity. Nodes are labelled with the method name, followed by
“team-model” if multiple teams/models were submitted.
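One way to build such a network is sketched below. The caption does not say which vectors were correlated, so this sketch assumes each method is represented by its prediction scores over a shared, identically ordered list of (protein, term) pairs, and that an edge is kept when the correlation is at least the cutoff. All names are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def similarity_edges(scores, cutoff=0.75):
    """Return {(method_a, method_b): r} for pairs with r >= cutoff.

    scores: dict method -> list of scores over the same (protein, term) pairs
            (assumed representation).
    """
    edges = {}
    for a, b in combinations(sorted(scores), 2):
        r = pearson(scores[a], scores[b])
        if r >= cutoff:
            edges[(a, b)] = r
    return edges
```

The returned correlation values could then be mapped to edge widths when drawing the network.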
Supplementary Figure 10A:
Supplementary Figure 10B:
Supplementary Figure 10C:
Supplementary Figure 10D:
Supplementary Figure 11 Bar plots of the frequency of self-annotated keywords among the CAFA2
top 10 methods for (A) Molecular Function ontology, (B) Biological Process ontology, and (C)
Cellular Component ontology. Bar plots of the enrichment of self-annotated keywords among the
CAFA2 top 10 methods relative to all submitted methods for (D) Molecular Function ontology, (E)
Biological Process ontology, and (F) Cellular Component ontology. Keyword enrichment was
calculated as the log-ratio

$$e(k) = \log \frac{\frac{1}{10} \sum_{i=1}^{10} \mathbb{1}\left(k \in K_i\right)}{\frac{1}{n} \sum_{i=1}^{n} \mathbb{1}\left(k \in K_i\right)},$$

where methods are assumed to be in descending order of their Fmax, $n$ is the number of all
submitted methods, and $K_i$ is the set of keywords self-annotated by model $i$.
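The enrichment of a single keyword can be computed as in the sketch below. The base of the logarithm is not given in the caption, so the natural logarithm is assumed here; the function name is hypothetical, and a keyword absent from the top methods (or from all methods) is reported as `None` because the log-ratio is then undefined.

```python
import math


def keyword_enrichment(keyword, keyword_sets, top=10):
    """Log-ratio of keyword frequency in the top methods versus all methods.

    keyword_sets: list of keyword sets K_i, one per method, ordered by
                  descending Fmax.
    """
    n = len(keyword_sets)
    top_freq = sum(keyword in K for K in keyword_sets[:top]) / top
    all_freq = sum(keyword in K for K in keyword_sets) / n
    if top_freq == 0 or all_freq == 0:
        return None  # log-ratio undefined
    return math.log(top_freq / all_freq)  # natural log (assumed base)
```

A positive value means the keyword is used more often among the top 10 methods than among all submitted methods.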
Supplementary Figure 11A:
Supplementary Figure 11B:
Supplementary Figure 11C:
Supplementary Figure 11D:
Supplementary Figure 11E:
Supplementary Figure 11F:
Supplementary Table 1. (Part 1) Participating methods grouped according to Principal
Investigators (PIs)
Principal Investigator Method Name Model (keyword) Publications
Asa Ben-Hur GOstruct Model 1 (sa,sp,pp,pi,ge,gi,lt,gc,ml,nlp) [36]
Richard Bonneau PULP
Model 1 (ph,sp,pp,pi,ge,ps,pps,dp,ml,or)
[43, 42, 41]
Model 2 (ph,sp,pp,pi,ge,ps,pps,dp,ml)
Steven Brenner SIFTER 2.4 †
Model 1 (ph,ml,or,pa,ho)
[34, 15]Model 2 (ph,ml,or,pa,ho)
Model 3 (ph,ml,or,pa,ho)
Rita Casadio BAR++
Model 1 (sa,spa,pp,pps,ml,ho,hmm)
[4, 32]
Model 2 (sa,spa,pp,pps,ml,ho,hmm)
Jianlin Cheng
ProFun
Model 1 (spa,sp,gi,gc,dp,gd)
[6]Model 2 (spa,dp)
Model 3 (spa,gi,gc,dp,gd)
ProFun/donet
Model 1 (ppa,spa)
[38]Model 2 (ppa,spa)
Model 3 (ppa,spa)
Wyatt Clark Yale
Model 1 (pi)
Model 2 (pi)
Model 3 (pi)
Christophe Dessimoz
GORBI
Model 1 (ml,or,pa,ho,gc)
[35]Model 2 (ml,or,pa,ho,gc)
Model 3 (or,pa,ho,sa,spa,ppa,ph,hmm)
CBRG
Model 1 (or,pa,ho)
[3]Model 2 (or,pa,ho)
Model 3 (or)
Tunca Dogan PANdeMIC Model 1 (sa,ml,ho)
Filip Ginter EVEX
Model 1 (sa,ml,sp)
[37]
Model 2 (sa,ml,sp)
Julian Gough
Gough Lab/GoughGroup
Model 1 (sa,spa,hmm)
Model 2 (pps,hmm)
Model 3 (pi)
Gough Lab/D2P2
Model 1 (pp,sa,spa,hmm)
[30]Model 2 (pp,pi)
Model 3 (pp)
Gough Lab/dcGO
Model 1 (pps,pp,sa,spa,hmm,pi)
[17]Model 2 (pps,pp,sa,spa,hmm,pi)
Model 3 (pps,pp,sa,spa,hmm,pi)
Gough Lab/SUPERFAMILY
Model 1 (pps,pp,sa,spa,hmm,pi)
[14]Model 2 (pi)
Model 3 (pp,sa,spa,hmm)
Gough Lab/dcGOpredictor
Model 1 (pps,sa,spa,hmm,pi)
Model 2 (pps,sa,spa,hmm,pi)
Liisa Holm
SANS
Model 1 (sa)
[24]Model 2 (sa)
Model 3 (sa)
PANNZER
Model 1 (sa,ph,or,pa,ho,nlp,ofi)
[25]Model 2 (sa,ph,or,pa,ho,nlp,ofi)
Model 3 (sa,ph,or,pa,ho,nlp,ofi)
Wen-Lian Hsu IASL
Model 1 (sa,spa,sp)
Model 2 (sa,spa,sp)
Model 3 (sa,spa,sp)
David Jones
Jones-UCL/jfpred-RF Model 1 (hmm,ppa,sp,pi,or,lt,ml)
[11]
Jones-UCL/jfpred-FP
Model 1 (hmm,ppa,sp,pi,or,lt,ml)
Model 2 (sp,pp,pps,ml)
Model 3 (sp,pp,pps,ml)
Jones-UCL/jfpred-PB
Model 1 (hmm,ppa,sp,pi,or,lt,ml)
Model 2 (sa,spa)
Model 3 (hmm,ppa)
†SIFTER is expected to work well on microbial proteins.
Supplementary Table 1. (Part 2)
Principal Investigator Method Name Model (keyword) Publications
Daisuke Kihara
ESG
Model 1 (sa)
[7]
Model 2 (sa)
CONS Model 1 (sa)
[23]
FPM
Model 1 (sa)
Model 2 (sa)
PFPDB
Model 1 (sa)
Model 2 (sa)
ESGDB
Model 1 (sa)
Model 2 (sa)
PFP
Model 1 (sa)
[22, 21]
Model 2 (sa)
Sean Mooney g2p buck (not evaluated) Model 1 (N/A)
Michal Linial Go2Proto
Model 1 (sa,sp,php,pp,cm,ml,or,pa,ho,ofi)
Model 2 (sa,sp,php,pp,cm,ml,or,pa,ho,ofi)
Model 3 (sa,sp,php,pp,cm,ml,or,pa,ho,ofi)
Yves Moreau
ENDEAVOUR
Model 1 (sa,ph,pi,ge,lt,ml,ofi)
[1]Model 2 (sa,ph,pi,ge,lt,ml,ofi)
Model 3 (sa,ph,pi,ge,lt,ml,ofi)
KernelFusion
Model 1 (sa,pi,ge,lt,ml,ofi)
[44, 13]Model 2 (sa,pi,ge,lt,ml,ofi)
Model 3 (sa,pi,ge,lt,ml,ofi)
Christine Orengo
Orengo-FunFams/MDA
Model 1 (ml)
[12]
Model 2 (sp)
Model 3 (pi)
Orengo-FunFams
Model 1 (spa,ppa,ho,hmm)
Model 2 (spa,ppa,ho,hmm)
Model 3 (spa,ppa,ho,hmm)
Alberto Paccanaro Paccanaro Lab
Model 1 (sa,spa,pi,ge,lt,gc,ml,or,ho)
Model 2 (spa,hmm,ml)
Paul Pavlidis Moirai
Model 1 (ofi)
Model 2 (ofi)
Model 3 (ofi)
Predrag Radivojac FANN-GO (not evaluated)
Model 1 (sa,ml)
[8]Model 2 (sa,ml)
Model 3 (sa,ml)
Burkhard Rost
Rost Lab
Model 1 (sa,spa,ppa,sp,dp,ml)
[18]Model 2 (sa,spa,ppa,sp,dp,ml)
Model 3 (sa,spa,ppa,sp,dp,ml)
Rost Lab/metastudent2 Model 1 (sa,ml,or,pa,ho) [20]
Asaf Salamov COPBP Model 1 (N/A)
Fran Supek PhyloScriptors Model 1 (ph,gc,ml,pa,or)
Weidong Tian Tian Lab
Model 1 (sa)
[19]
Model 2 (sa)
Stefano Toppo Argot2 Model 1 (sa,spa) [16]
Toppo/van Dijk * argot2bmrf
Model 1 (sp,pi,ge,gi,ml,sa,spa)
Model 2 (sp,pi,ge,gi,ml,sa,spa)
Silvio Tosatto INGA-Tosatto Model 1 (hmm,ppa,sa,pi) [31]
Michael Tress SIAM
Model 1 (sa,ho,sp,ps,php,spa,ppa,sta,cm)
[29]Model 2 (ps,php,spa,ppa,sta,cm)
Model 3 (sa,ho,sp)
Hafeez Ur Rehman PFPPipeLine Model 1 (sa,pi,ml,ho,ofi) [5]
Giorgio Valentini Anacleto Lab
Model 1 (ml,sa)
[33]Model 2 (ml,sa)
Model 3 (ml,sa)
Aalt-Jan van Dijk BMRF
Model 1 (sp,pi,ge,gi,ml)
[26, 27]
Model 2 (sp,pi,ge,gi,ml)
Nevena Veljkovic ISM AP
Model 1 (ppa,php)
Model 2 (ppa,php,ge)
Model 3 (ppa,php,ge)
* This is a joint group of Stefano Toppo and Aalt-Jan van Dijk.
Supplementary Table 1. (Part 3)
Principal Investigator Method Name Model (keyword) Publications
Ricardo Vencio SIFTER-T Model 1 (spa,ml,ho) [2]
Jörg Vogel APRICOT
Model 1 (ho,hmm,ppa,pp)
Model 2 (ho,hmm,ppa,pp)
Slobodan Vucetic MS-kNN
Model 1 (ml,sa,ge)
[28]Model 2 (ml,sa,ge)
Model 3 (ml,sa,ge)
Zheng Wang PANDA
Model 1 (spa,ppa,ph,or,pa,ho)
Model 2 (spa,ppa,ph,or,pa,ho)
Model 3 (spa,ppa,ph,or,pa,ho)
Mark Wass CombFunc Model 1 (spa,sa,ml,ge,pi) [39]
N/A ‡ Blast2GO Model 1 (sa) [10]
‡Blast2GO predictions were downloaded from the website https://www.blast2go.com one week before the prediction
deadline and converted into appropriate submission format by the CAFA organizers.
Supplementary Table 1. (Part 4) Keyword table.
Code  Keyword                     Code  Keyword
sa    sequence alignment          sta   structure alignment
spa   sequence-profile alignment  cm    comparative model
ppa   profile-profile alignment   pps   predicted protein structure
ph    phylogeny                   dp    de novo prediction
sp    sequence properties         ml    machine learning
php   physicochemical properties  gne   genome environment
pp    predicted properties        op    operon
pi    protein interactions        or    ortholog
ge    gene expression             pa    paralog
ms    mass spectrometry           ho    homolog
gi    genetic interactions        hmm   hidden Markov model
ps    protein structure           cd    clinical data
lt    literature                  gd    genetic data
gc    genomic context             nlp   natural language processing
sy    synteny                     ofi   other functional information
References
[1] S. Aerts, D. Lambrechts, S. Maity, P. Van Loo, B. Coessens, F. De Smet, L. C. Tranchevent,
B. De Moor, P. Marynen, B. Hassan, P. Carmeliet, and Y. Moreau. Gene prioritization through
genomic data fusion. Nat Biotechnol, 24(5):537–544, 2006.
[2] D. C. Almeida-e Silva and R. Z. Vencio. SIFTER-T: a scalable and optimized framework for
the SIFTER phylogenomic method of probabilistic protein domain annotation. Biotechniques,
58(3):140–142, 2015.
[3] A. M. Altenhoff, N. Skunca, N. Glover, C. M. Train, A. Sueki, I. Pilizota, K. Gori, B. Tomiczek,
S. Muller, H. Redestig, G. H. Gonnet, and C. Dessimoz. The OMA orthology database in 2015:
function predictions, better plant support, synteny view and other improvements. Nucleic Acids
Res, 43(Database issue):D240–249, 2015.
[4] L. Bartoli, L. Montanucci, R. Fronza, P. L. Martelli, P. Fariselli, L. Carota, G. Donvito, G. P.
Maggi, and R. Casadio. The Bologna Annotation Resource: a non hierarchical method for the
functional and structural annotation of protein sequences relying on a comparative large-scale
genome analysis. J Proteome Res, 8(9):4362–4371, 2009.
[5] A. Benso, S. Di Carlo, H. Ur Rehman, G. Politano, A. Savino, and P. Suravajhala. A combined
approach for genome wide protein function annotation/prediction. Proteome Sci, 11(Suppl
1):S1, 2013.
[6] R. Cao and J. Cheng. Integrated protein function prediction by mining function associations,
sequences, and protein-protein and gene-gene interaction networks. Methods, 2015.
[7] M. Chitale, T. Hawkins, C. Park, and D. Kihara. ESG: extended similarity group method for
automated protein function prediction. Bioinformatics, 25(14):1739–1745, 2009.
[8] W. T. Clark and P. Radivojac. Analysis of protein function and its prediction from amino acid
sequence. Proteins, 79(7):2086–2096, 2011.
[9] W. T. Clark and P. Radivojac. Information-theoretic evaluation of predicted ontological
annotations. Bioinformatics, 29(13):i53–i61, 2013.
[10] A. Conesa, S. Gotz, J. M. Garcia-Gomez, J. Terol, M. Talon, and M. Robles. Blast2GO:
a universal tool for annotation, visualization and analysis in functional genomics research.
Bioinformatics, 21(18):3674–3676, 2005.
[11] D. Cozzetto, D. W. Buchan, K. Bryson, and D. T. Jones. Protein function prediction by massive
integration of evolutionary analyses and multiple data sources. BMC Bioinformatics, 14 Suppl
3:S1, 2013.
[12] S. Das, D. Lee, I. Sillitoe, N. L. Dawson, J. G. Lees, and C. A. Orengo. Functional
classification of CATH superfamilies: a domain-based approach for protein function annotation.
Bioinformatics, 2015.
[13] T. De Bie, L. C. Tranchevent, L. M. van Oeffelen, and Y. Moreau. Kernel-based data fusion
for gene prioritization. Bioinformatics, 23(13):i125–132, 2007.
[14] D. A. de Lima Morais, H. Fang, O. J. Rackham, D. Wilson, R. Pethica, C. Chothia, and
J. Gough. Superfamily 1.75 including a domain-centric gene ontology method. Nucleic Acids
Res, 39(Database issue):D427–434, 2011.
[15] B. E. Engelhardt, M. I. Jordan, J. R. Srouji, and S. E. Brenner. Genome-scale phylogenetic
function annotation of large and diverse protein families. Genome Res, 21(11):1969–1980, 2011.
[16] M. Falda, S. Toppo, A. Pescarolo, E. Lavezzo, B. Di Camillo, A. Facchinetti, E. Cilia, R. Velasco,
and P. Fontana. Argot2: a large scale function prediction tool relying on semantic similarity of
weighted gene ontology terms. BMC Bioinformatics, 13(Suppl 4):S14, 2012.
[17] H. Fang and J. Gough. A domain-centric solution to functional genomics via dcGO predictor.
BMC Bioinformatics, 14 Suppl 3:S9, 2013.
[18] T. Goldberg, M. Hecht, T. Hamp, T. Karl, G. Yachdav, N. Ahmed, U. Altermann, P. Angerer,
S. Ansorge, K. Balasz, M. Bernhofer, A. Betz, L. Cizmadija, K. T. Do, J. Gerke, R. Greil,
V. Joerdens, M. Hastreiter, K. Hembach, M. Herzog, M. Kalemanov, M. Kluge, A. Meier,
H. Nasir, U. Neumaier, V. Prade, J. Reeb, A. Sorokoumov, I. Troshani, S. Vorberg, S. Waldraff,
J. Zierer, H. Nielsen, and B. Rost. LocTree3 prediction of localization. Nucleic Acids Res,
42(Web Server issue):W350–355, 2014.
[19] Q. Gong, W. Ning, and W. Tian. GoFDR: a sequence alignment based method for predicting
protein functions. Methods, 2015.
[20] T. Hamp, R. Kassner, S. Seemayer, E. Vicedo, C. Schaefer, D. Achten, F. Auer, A. Boehm,
T. Braun, M. Hecht, M. Heron, P. Honigschmid, T. A. Hopf, S. Kaufmann, M. Kiening,
D. Krompass, C. Landerer, Y. Mahlich, M. Roos, and B. Rost. Homology-based inference
sets the bar high for protein function prediction. BMC Bioinformatics, 14 Suppl 3:S7, 2013.
[21] T. Hawkins, M. Chitale, S. Luban, and D. Kihara. PFP: Automated prediction of gene ontology
functional annotations with confidence scores using protein sequence data. Proteins,
74(3):566–582, 2009.
[22] T. Hawkins, S. Luban, and D. Kihara. Enhanced automated function prediction using distantly
related sequences and contextual association by PFP. Protein Sci, 15(6):1550–1556, 2006.
[23] I. K. Khan, Q. Wei, S. Chapman, D. B. Kc, and D. Kihara. The PFP and ESG protein function
prediction methods in 2014: effect of database updates and ensemble approaches. Gigascience,
4:43, 2015.
[24] J. P. Koskinen and L. Holm. SANS: high-throughput retrieval of protein sequences allowing
50% mismatches. Bioinformatics, 28(18):i438–i443, 2012.
[25] P. Koskinen, P. Toronen, J. Nokso-Koivisto, and L. Holm. PANNZER: high-throughput
functional annotation of uncharacterized proteins in an error-prone environment. Bioinformatics,
31(10):1544–1552, 2015.
[26] Y. A. Kourmpetis, A. D. van Dijk, M. C. Bink, R. C. van Ham, and C. J. ter Braak. Bayesian
Markov Random Field analysis for protein function prediction based on network data. PLoS
One, 5(2):e9293, 2010.
[27] Y. A. Kourmpetis, A. D. van Dijk, R. C. van Ham, and C. J. ter Braak. Genome-wide
computational function prediction of Arabidopsis proteins by integration of multiple data sources.
Plant Physiol, 155(1):271–281, 2011.
[28] L. Lan, N. Djuric, Y. Guo, and S. Vucetic. MS-kNN: protein function prediction by integrating
multiple data sources. BMC Bioinformatics, 14 Suppl 3:S8, 2013.
[29] P. Maietta, G. Lopez, A. Carro, B. J. Pingilley, L. G. Leon, A. Valencia, and M. L. Tress.
FireDB: a compendium of biological and pharmacologically relevant ligands. Nucleic Acids
Res, 42(Database issue):D267–272, 2014.
[30] M. E. Oates, P. Romero, T. Ishida, M. Ghalwash, M. J. Mizianty, B. Xue, Z. Dosztanyi,
V. N. Uversky, Z. Obradovic, L. Kurgan, A. K. Dunker, and J. Gough. D(2)P(2): database of
disordered protein predictions. Nucleic Acids Res, 41(Database issue):D508–516, 2013.
[31] D. Piovesan, M. Giollo, E. Leonardi, C. Ferrari, and S. C. Tosatto. INGA: protein function
prediction combining interaction networks, domain assignments and sequence similarity. Nucleic
Acids Res, 43(W1):W134–140, 2015.
[32] D. Piovesan, P. L. Martelli, P. Fariselli, A. Zauli, I. Rossi, and R. Casadio. BAR-PLUS:
the Bologna Annotation Resource Plus for functional and structural annotation of protein
sequences. Nucleic Acids Res, 39(Web Server issue):W197–202, 2011.
[33] M. Re, M. Mesiti, and G. Valentini. A fast ranking algorithm for predicting gene functions in
biomolecular networks. IEEE/ACM Trans Comput Biol Bioinform, 9(6):1812–1818, 2012.
[34] S. M. Sahraeian, K. R. Luo, and S. E. Brenner. SIFTER search: a web server for accurate
phylogeny-based protein function prediction. Nucleic Acids Res, 43(W1):W141–147, 2015.
[35] N. Skunca, M. Bosnjak, A. Krisko, P. Panov, S. Dzeroski, T. Smuc, and F. Supek. Phyletic
profiling with cliques of orthologs is enhanced by signatures of paralogy relationships. PLoS
Comput Biol, 9(1):e1002852, 2013.
[36] A. Sokolov and A. Ben-Hur. Hierarchical classification of gene ontology terms using the GOstruct
method. J Bioinform Comput Biol, 8(2):357–376, 2010.
[37] S. Van Landeghem, K. Hakala, S. Ronnqvist, T. Salakoski, Y. Van de Peer, and F. Ginter.
Exploring biomolecular literature with EVEX: connecting genes through events, homology, and
indirect associations. Adv Bioinformatics, 2012:582765, 2012.
[38] Z. Wang, R. Cao, and J. Cheng. Three-level prediction of protein function by combining
profile-sequence search, profile-profile search, and domain co-occurrence networks. BMC
Bioinformatics, 14 Suppl 3:S3, 2013.
[39] M. N. Wass, G. Barton, and M. J. Sternberg. CombFunc: predicting protein function using
heterogeneous data sources. Nucleic Acids Res, 40(Web Server issue):W466–470, 2012.
[40] R. Yang, Y. Jiang, M. W. Hahn, E. A. Housworth, and P. Radivojac. New metrics for learning
and inference on sets, ontologies, and functions. arXiv preprint arXiv:1603.06846, 2016.
[41] N. Youngs. Positive-unlabeled learning in the context of protein function prediction. PhD thesis,
New York University, 2014.
[42] N. Youngs, D. Penfold-Brown, R. Bonneau, and D. Shasha. Negative example selection for
protein function prediction: the NoGO database. PLoS Comput Biol, 10(6):e1003644, 2014.
[43] N. Youngs, D. Penfold-Brown, K. Drew, D. Shasha, and R. Bonneau. Parametric Bayesian priors
and better choice of negative examples improve protein function prediction. Bioinformatics,
29(9):1190–1198, 2013.
[44] P. Zakeri, B. Moshiri, and M. Sadeghi. Prediction of protein submitochondria locations based
on data fusion of various features of sequences. J Theor Biol, 269(1):208–216, 2011.