The Keio Journal of Medicine

Large-scale correlation of DNA accession numbers to the cDNAs in the FANTOM full-length mouse cDNA clone set
Itsuki Ajioka1,2, Takuya Maeda1,2, and Kazunori Nakajima2,3

1 Equal contributors in this work
2 Department of Anatomy, School of Medicine, Keio University
Tokyo, Japan;
3 Department of Molecular Neurobiology, Institute of DNA Medicine,
Jikei University School of Medicine, Tokyo, Japan

(Keio J Med 55(3): 107-110, September 2006)


  • Supplementary table 1
(GeneChip_ID) GeneChip ID provided by Affymetrix.
(GeneChip Acc) GenBank accession number of the GeneChip clone.
(score(F1), e_value(F1)) Score and e-value, respectively, of the highest scoring FANTOM clone correlated with the GeneChip clone in the BLAST search.
(FANTOM Accs) GenBank accession numbers of the highest scoring FANTOM clones correlated with the GeneChip clone in the BLAST search. The accession numbers are separated by semicolons.
(score(D1), e_value(D1)) Because some of the non-matching GeneChip clones might have corresponded to a part of the UTR that was not present among the FANTOM clones, we executed a fourth program to obtain longer cDNA sequences corresponding to the GeneChip clones, as described in the main text. These columns show the score and the e-value, respectively, of the highest scoring rodent cDNA clones correlated with the GeneChip clones in the BLAST search.
(HomologousAccs(D1)) GenBank accession numbers of the highest scoring rodent cDNA clones correlated with the GeneChip clones in the BLAST search. The accession numbers are separated by semicolons and enclosed in square brackets.
(Amino_sq?(F1)) Presence or absence of an amino acid sequence in the FANTOM DDBJ file (1 = present, 0 = absent) for the clones whose accession numbers appear in the "FANTOM Accs" column.
(score(D2), e_value(D2)) We carefully read the DDBJ flat files that lacked an amino acid sequence and found that the FANTOM clones occasionally contained sequence errors that prevented translation into amino acids, as reported previously.5 To obtain the amino acid sequences of such clones, we applied the "2-step correlating program" to the 12,504 clones, as described in the main text. These columns show the score and the e-value, respectively, of the highest scoring rodent cDNA clones correlated with the FANTOM clones in the BLAST search. The scores and e-values are arranged in the same order as the accession numbers in the "FANTOM Accs" column. Note that the scores and e-values appear at the same positions as the '0' entries in the "Amino_sq?(F1)" column, because BLAST searches were performed only for FANTOM clones with no amino acid sequence.
(BLAST(D2)_FANTOM Accs) GenBank accession numbers of the highest scoring rodent cDNA clones correlated with the FANTOM clones in the BLAST search.
(iPSORT score) Whether or not the clone is a secretory molecule, based on the results of an iPSORT-based Perl program.6 The scores are arranged corresponding to the "Amino_sq(F1)" column or as three columns labelled "with_amino(G)", "with_amino(D1)", and "with_amino(F2)".
(TMHMM score) Number of transmembrane regions according to the TMHMM program.8
(definition (D1)) Gene definitions described in the DDBJ files of the rodent cDNA clones. The definitions are separated by semicolons.
(definition (F1)) Gene definitions described in the DDBJ files of the FANTOM clones. The definitions are arranged in the order of the accession numbers in the "FANTOM Accs" column.
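For readers who want to process the table programmatically, the following Python sketch shows one way the semicolon-separated fields described above might be split and aligned. It is an illustration only, not the authors' program: the helper names `split_field` and `clones_without_aa` are hypothetical, the accession strings are made-up placeholders, and it assumes that the "Amino_sq?(F1)" flags are stored as a semicolon-separated list in the same order as "FANTOM Accs", which the legend implies but does not state explicitly.

```python
def split_field(field):
    """Split a semicolon-separated cell, dropping optional square brackets."""
    cleaned = field.strip()
    # HomologousAccs(D1) is described as semicolon-separated inside brackets.
    if cleaned.startswith("[") and cleaned.endswith("]"):
        cleaned = cleaned[1:-1]
    return [item.strip() for item in cleaned.split(";")]


def clones_without_aa(fantom_accs, amino_flags):
    """Return the FANTOM accessions whose Amino_sq?(F1) flag is 0.

    Assumption: both cells list their entries in the same order.
    """
    accs = split_field(fantom_accs)
    flags = split_field(amino_flags)
    if len(accs) != len(flags):
        raise ValueError("accession and flag lists differ in length")
    return [acc for acc, flag in zip(accs, flags) if flag == "0"]


# Placeholder row: these values are invented for illustration only.
row = {
    "FANTOM Accs": "ACC001;ACC002;ACC003",
    "Amino_sq?(F1)": "1;0;1",
    "HomologousAccs(D1)": "[ACC101;ACC102]",
}

print(split_field(row["HomologousAccs(D1)"]))
print(clones_without_aa(row["FANTOM Accs"], row["Amino_sq?(F1)"]))
```

Under these assumptions, the accessions returned by `clones_without_aa` are the ones for which score(D2) and e_value(D2) entries would be expected.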