Multiple alignment information
2015. 5
|
SET of all multiple alignments (Flat File)
|
347M |
Flat File |
NOTES: "FMULTI" provides multiple alignments of all the H-Inv
transcripts and RefSeq sequences mapped in the same H-Inv cluster (HIX),
against the human genome sequence.
|
Positional information of mapping
2015. 6
|
Positional information for all exons (Flat File)
|
124M |
Flat File |
NOTES: "FMULTIP"; FMULTIP.tar.gz (mALNp/.tbl) provides positional
information for all exons of all the H-Inv transcripts and RefSeq
sequences mapped in the same H-Inv cluster (HIX), against the human genome
sequence.
Format: 01:HIT (or acc), 02:HIX (or temporary cluster-id), 03:key
(=acc_chr_sno), 04:seq1 ("genome"), 05:seq2 (cDNA_acc), 06:strand,
07:exon_no, 08:type [N:not aligned part of seq1 (genome); U:unmapped part
of seq2 (cDNA); A:alignment; D:deletion; I:insertion; G:gap due to other
member], 09:start_seq1 (genome), 10:end_seq1 (genome), 11:start_seq2
(cDNA), 12:end_seq2 (cDNA) (or gap_length if 08:type='G')
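The twelve FMULTIP fields listed above could be read into named records roughly as follows. This is a minimal sketch, not the official reader: it assumes the .tbl files are tab-separated (the notes do not state the delimiter), keeps every value as a string, and uses hypothetical names (`FmultipRow`, `parse_fmultip_line`, `describe_type`, `gap_length`) chosen for illustration only.

```python
from typing import NamedTuple, Optional

# Meaning of the one-letter codes in field 08 (type), as given in the notes.
TYPE_CODES = {
    "N": "not aligned part of seq1 (genome)",
    "U": "unmapped part of seq2 (cDNA)",
    "A": "alignment",
    "D": "deletion",
    "I": "insertion",
    "G": "gap due to other member",
}


class FmultipRow(NamedTuple):
    """One FMULTIP record; field order follows the 01..12 listing."""
    hit: str                     # 01: HIT (or acc)
    hix: str                     # 02: HIX (or cluster-id)
    key: str                     # 03: acc_chr_sno
    seq1: str                    # 04: "genome"
    seq2: str                    # 05: cDNA accession
    strand: str                  # 06
    exon_no: str                 # 07
    kind: str                    # 08: type code, see TYPE_CODES
    start_seq1: str              # 09: genome start
    end_seq1: str                # 10: genome end
    start_seq2: str              # 11: cDNA start
    end_seq2_or_gap_length: str  # 12: cDNA end, or gap length if kind == "G"


def parse_fmultip_line(line: str, sep: str = "\t") -> FmultipRow:
    # ASSUMPTION: fields are separated by tabs; pass another `sep` if not.
    fields = line.rstrip("\n").split(sep)
    if len(fields) != 12:
        raise ValueError(f"expected 12 fields, got {len(fields)}")
    return FmultipRow(*fields)


def describe_type(row: FmultipRow) -> str:
    """Spell out the type code, or report it as unknown."""
    return TYPE_CODES.get(row.kind, f"unknown type {row.kind!r}")


def gap_length(row: FmultipRow) -> Optional[str]:
    """Field 12 holds a gap length only when the type is 'G'."""
    return row.end_seq2_or_gap_length if row.kind == "G" else None
```

Values are left as strings on purpose, because the notes do not say how coordinates or missing values are written in the files.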
|
The positional information of the exon, CDS and UTR for representative H-Inv transcripts (HITs) in GFF3 format.
|
7.6M |
GFF3 File |
The positional information of the exon, CDS and UTR for all H-Inv transcripts (HITs) in GFF3 format.
|
25M |
GFF3 File |
NOTES: "h-inv_pub_rep.gff3.gz", "h-inv_pub.gff3.gz"; The positional information of the exon, CDS and UTR in GFF3 format.
Format: 01: "seqid (HIT)", 02: "source", 03: "type", 04: "start",
05: "end", 06: "score", 07: "strand", 08: "phase", 09: "attributes"
Attributes (exon): ID=HIX, Name=HIT, Note=HUGO gene symbol
Attributes (CDS): ID=HIT, Parent=HIX, Name=HIT, Alias=definition, Note=HUGO gene symbol, accession
Attributes (UTR): Parent=HIT, Name=HIT
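The nine GFF3 columns and the attribute keys above could be unpacked along these lines. This is a minimal sketch that relies only on general GFF3 conventions (tab-separated columns, `;`-separated `key=value` attributes, percent-encoded values); the function name `parse_gff3_line` is a hypothetical helper, and any record passed to it in an example is made up rather than taken from the H-InvDB files.

```python
from urllib.parse import unquote

# Column names as listed in the notes (01..09).
GFF3_COLUMNS = [
    "seqid", "source", "type", "start", "end",
    "score", "strand", "phase", "attributes",
]


def parse_gff3_line(line):
    """Return a dict for one GFF3 feature line, or None for comments/blank lines."""
    if not line.strip() or line.startswith("#"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GFF3_COLUMNS):
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    # GFF3 coordinates are 1-based integers.
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Column 9: split "key=value" pairs on ';' and decode %XX escapes.
    attributes = {}
    for pair in record["attributes"].split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[key.strip()] = unquote(value)
    record["attributes"] = attributes
    return record
```

For a CDS line, `record["attributes"]` would then be expected to carry the `ID`, `Parent`, `Name`, `Alias` and `Note` keys described in the notes, so transcripts can be grouped by cluster through `Parent`.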
|
H-ANGEL matrix
|
Gene expression matrix of H-ANGEL (Flat File)
|
25M |
Flat File |
NOTES: "H-ANGEL_matrix.txt.gz" provides the gene expression matrix of
H-ANGEL. The column numbers and their descriptions are as follows.
Format: 1: Type of platform, 2: experimental ID added by each
provider, 3: primer/probe ID, e.g. GeneChip Identifier, 4: 10 categories
collapsed by the average of 40 categories, 5: 10 categories collapsed
by the maximum value of 40 categories, 6: 40 categories, 7: Acc
corresponding to the primer/probe, 8: start site of the primer/probe on
the genome, 9: end site of the primer/probe on the genome, 10: Absolute
value of the expression data for EST and SAGE only, NaN otherwise, 11:
HIX, 12: UniGene ID, 13: start site of HIX on the genome, 14: end site
of HIX on the genome, 15: strand of the locus, 16: Acc(s) included
within HIX.
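The sixteen columns described above can be given names when reading the matrix, which makes later filtering (for example by HIX) less error-prone. The sketch below is an illustration under assumptions: it supposes one tab-separated row per line, the column names in `H_ANGEL_COLUMNS` are invented labels for the descriptions in the notes, and `label_h_angel_row` is a hypothetical helper. Whether the file starts with a header line is not stated, so none is assumed.

```python
# Invented labels for columns 1..16, in the order given in the notes.
H_ANGEL_COLUMNS = [
    "platform_type",         # 1
    "experiment_id",         # 2
    "probe_id",              # 3
    "categories10_average",  # 4
    "categories10_maximum",  # 5
    "categories40",          # 6
    "probe_accession",       # 7
    "probe_start",           # 8
    "probe_end",             # 9
    "absolute_expression",   # 10: EST and SAGE only, NaN otherwise
    "hix",                   # 11
    "unigene_id",            # 12
    "hix_start",             # 13
    "hix_end",               # 14
    "strand",                # 15
    "hix_accessions",        # 16
]


def label_h_angel_row(line, sep="\t"):
    """Map one matrix row to a dict keyed by the labels above."""
    # ASSUMPTION: columns are tab-separated; pass another `sep` if not.
    fields = line.rstrip("\n").split(sep)
    if len(fields) != len(H_ANGEL_COLUMNS):
        raise ValueError(f"expected 16 columns, got {len(fields)}")
    # Values stay as strings; the inner layout of columns 4-6 is not documented here.
    return dict(zip(H_ANGEL_COLUMNS, fields))
```

To stream the compressed file, the rows could be fed from `gzip.open(path, "rt")` line by line, again assuming plain text inside the archive.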
|
Molecular evolutionary annotation (Evola)
|
SET of Evola ortholog list
|
|
Flat File |
NOTES: "Evola.txt.gz" provides the list of ortholog accession numbers in Evola.
|
Inter-species multi-FASTA (Transcript)
|
SET of Evola ortholog sequences (Transcript)
|
|
Flat File |
NOTES: "NFAS.tar.gz" provides multiple FASTA of transcript nucleotide sequences of human and other species orthologs.
|
Inter-species multi-FASTA (Protein)
|
SET of Evola ortholog sequences (Protein)
|
|
Flat File |
NOTES: "PFAS.tar.gz" provides multiple FASTA of protein amino acid sequences of human and other species orthologs.
|
Phylogenetic trees
|
SET of Evola duplicate gene family trees (Flat File)
|
|
Flat File |
NOTES: "NJ.tar.gz" provides the phylogenetic trees (phb files) constructed by the neighbor-joining method (amino acid).
|
Human protein complex database with quality index (PCDq), data set
2015. 11
|
Protein complex list, their subunits (members), and related annotation.
|
1.5M |
TSV Files (tar.gz file) |
NOTES: "complexList.tsv" provides complex ID, name, etc.
"subunitMembers.tsv" provides subunits (members) of each complex and related annotations.
"public_ppi.tsv" provides PPI data used for complex prediction.
The file formats are described in the README file included in the download package.
|
A subset of H-InvDB annotation data sets with supporting proteome evidence
2015. 7
|
SET of H-Inv clusters with supporting proteome evidence (Flat File)
|
25M |
Flat File |
SET of H-Inv clusters with supporting proteome evidence (XML)
|
33M |
XML |
SET of H-Inv transcripts with supporting proteome evidence (Flat File)
|
146M |
Flat File |
SET of H-Inv transcripts with supporting proteome evidence (XML)
|
178M |
XML |
SET of H-Inv proteins with supporting proteome evidence (Flat File)
|
57M |
Flat File |
SET of H-Inv proteins with supporting proteome evidence (XML)
|
61M |
XML |
NOTES: Subsets of H-InvDB loci,
transcripts, and proteins with supporting evidence of expression
confirmed in comprehensive proteomic experiments. Those classified as
"protein level" or "transcript level" expression in C-HPP are included.
|
Transcripts and clusters unmapped to human genome
2015. 5
|
Clusters of transcripts unmapped to human genome (Flat File)
|
2.7M |
Flat File |
Clusters of transcripts unmapped to human genome (XML)
|
3.0M |
XML |
Transcripts unmapped to human genome (Flat File)
|
14M |
Flat File |
Transcripts unmapped to human genome (XML)
|
15M |
XML |
NOTES: Data set of transcripts (HIT) and clusters (HIX) that are not mapped to the human genome.
|