| Multiple alignment information
					2015. 5 | 
			
				|  SET of all MULTIPLE ALIGNMENT(Flat File) | 347M | Flat File | 
			
				| NOTES: "FMULTI" provides multiple alignments of all the H-Inv 
transcripts and RefSeq sequences mapped in the same H-Inv cluster (HIX) 
against the human genome sequence. | 
			
				| Positional information of mapping
					2015. 6 | 
			
				|  Positional information for all exons (Flat File) | 124M | Flat File | 
			
				| NOTES: "FMULTIP"; FMULTIP.tar.gz (mALNp/.tbl) provides positional 
information for all exons of all the H-Inv transcripts and RefSeq 
sequences mapped in the same H-Inv cluster (HIX) against the human genome 
sequence. Format: 01:HIT (or acc), 02:HIX (or temporary cluster-id), 03:key 
(=acc_chr_sno), 04:seq1 ("genome"), 05:seq2 (cDNA_acc), 06:strand, 
07:exon_no, 08:type [N: unaligned part of seq1 (genome); U: unmapped part 
of seq2 (cDNA); A: alignment; D: deletion; I: insertion; G: gap due to other 
member], 09:start_seq1 (genome), 10:end_seq1 (genome), 11:start_seq2 
(cDNA), 12:end_seq2 (cDNA) (or gap_length if 08:type='G')
 | 
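The 12-column FMULTIP layout described above can be read with a short parser. The following is a minimal sketch only: it assumes the .tbl files are tab-delimited with one record per line, which the notes do not state, and the field names and the helper `parse_fmultip_line` are hypothetical labels chosen for illustration.

```python
# Hypothetical field names for the 12 columns listed in the FMULTIP notes.
FMULTIP_FIELDS = (
    "hit", "hix", "key", "seq1", "seq2", "strand", "exon_no", "type",
    "start_seq1", "end_seq1", "start_seq2", "end_seq2",
)


def _to_int(value):
    """Return value as an int, or None when it is empty or non-numeric."""
    try:
        return int(value.strip())
    except ValueError:
        return None


def parse_fmultip_line(line, sep="\t"):
    """Split one record into a dict keyed by FMULTIP_FIELDS.

    ASSUMPTION: columns are separated by `sep` (tab by default); adjust it
    if the real files use another delimiter.
    """
    parts = line.rstrip("\n").split(sep)
    if len(parts) != len(FMULTIP_FIELDS):
        raise ValueError(f"expected 12 columns, got {len(parts)}")
    record = dict(zip(FMULTIP_FIELDS[:8], parts[:8]))
    # Columns 09-12 are coordinates; keep None where no number is given.
    for name, raw in zip(FMULTIP_FIELDS[8:], parts[8:]):
        record[name] = _to_int(raw)
    # Per the notes, column 12 holds the gap length when type is 'G'.
    if record["type"] == "G":
        record["gap_length"] = record.pop("end_seq2")
    return record
```

Records of type 'G' therefore expose `gap_length` instead of `end_seq2`, which keeps the two meanings of column 12 from being confused downstream.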
			
			
				|  The positional information of the exon, CDS and UTR for representative H-Inv transcripts (HITs) in GFF3 format. | 7.6M | GFF3 File | 
			
				|  The positional information of the exon, CDS and UTR for all H-Inv transcripts (HITs) in GFF3 format. | 25M | GFF3 File | 
			
				| NOTES: "h-inv_pub_rep.gff3.gz", "h-inv_pub.gff3.gz"; The positional information of the exon, CDS and UTR in GFF3 format. Format: 01: "seqid (HIT)", 02: "source", 03: "type", 04: "start", 
05: "end", 06: "score", 07: "strand", 08: "phase", 09: "attributes"
 Attributes (exon): ID=HIX, Name=HIT, Note=HUGO gene symbol
 Attributes (CDS): ID=HIT, Parent=HIX, Name=HIT, Alias=definition, Note=HUGO gene symbol, accession
 Attributes (UTR): Parent=HIT, Name=HIT
 | 
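The nine GFF3 columns and the attribute keys listed above can be extracted as follows. This is a sketch based on the general GFF3 conventions (tab-separated columns, `key=value` attribute pairs joined by `;`, percent-encoded reserved characters) rather than on an inspection of the H-InvDB files themselves; the function name `parse_gff3_line` is a hypothetical helper.

```python
from urllib.parse import unquote

GFF3_COLUMNS = (
    "seqid", "source", "type", "start", "end",
    "score", "strand", "phase", "attributes",
)


def parse_gff3_line(line):
    """Parse one GFF3 feature line; return None for blank or '#' lines."""
    if not line.strip() or line.startswith("#"):
        return None
    parts = line.rstrip("\n").split("\t")
    if len(parts) != len(GFF3_COLUMNS):
        raise ValueError(f"expected 9 columns, got {len(parts)}")
    record = dict(zip(GFF3_COLUMNS, parts))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Column 9: 'key=value' pairs separated by ';' with percent-encoding.
    attributes = {}
    for pair in record["attributes"].split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[key.strip()] = unquote(value.strip())
    record["attributes"] = attributes
    return record
```

With records in this form, the CDS and UTR rows of one transcript can be grouped by `attributes["Name"]`, and CDS rows linked to their cluster through `attributes["Parent"]`.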
			
			
				| H-ANGEL matrix | 
			
				|  Gene expression matrix of H-ANGEL (Flat File) | 25M | Flat File | 
			
				| NOTES: "H-ANGEL_matrix.txt.gz" provides the gene expression matrix of 
H-ANGEL. The columns are numbered and described as follows. Format: 1: Type of platform, 2: experimental ID assigned by each 
provider, 3: primer/probe ID (e.g. GeneChip Identifier), 4: 10 categories
 collapsed by the average of 40 categories, 5: 10 categories collapsed 
by the maximum value of 40 categories, 6: 40 categories, 7: Acc 
corresponding to the primer/probe, 8: start site of the primer/probe on 
the genome, 9: end site of the primer/probe on the genome, 10: Absolute 
value of the expression data for EST and SAGE only, NaN otherwise, 11: 
HIX, 12: UniGene ID, 13: start site of HIX on the genome, 14: end site 
of HIX on the genome, 15: strand of the locus, 16: Acc(s) included 
within HIX.
 | 
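The 16 columns of the H-ANGEL matrix can be given names while streaming the gzipped file. The sketch below assumes a tab-delimited file with one row per line, which the notes do not confirm; the column names and the helpers `read_hangel_rows` and `open_hangel` are hypothetical. Columns whose internal layout is not described (the category columns 4 to 6 and the accession list in column 16) are kept as raw strings.

```python
import csv
import gzip
import math

# Hypothetical names for the 16 columns described in the notes.
HANGEL_COLUMNS = (
    "platform", "experiment_id", "probe_id",
    "categories10_average", "categories10_maximum", "categories40",
    "probe_acc", "probe_start", "probe_end", "absolute_expression",
    "hix", "unigene_id", "hix_start", "hix_end", "strand", "hix_accs",
)


def read_hangel_rows(lines, delimiter="\t"):
    """Yield one dict per row from an iterable of text lines.

    ASSUMPTION: fields are separated by `delimiter` (tab by default).
    """
    reader = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != len(HANGEL_COLUMNS):
            raise ValueError(f"expected 16 columns, got {len(row)}")
        record = dict(zip(HANGEL_COLUMNS, row))
        # Column 10 is numeric for EST and SAGE only and 'NaN' otherwise.
        try:
            record["absolute_expression"] = float(record["absolute_expression"])
        except ValueError:
            record["absolute_expression"] = math.nan
        yield record


def open_hangel(path):
    """Stream records from a gzip-compressed matrix file at `path`."""
    with gzip.open(path, "rt", newline="") as handle:
        yield from read_hangel_rows(handle)
```

Streaming row by row avoids loading the whole matrix into memory; a call such as `open_hangel("H-ANGEL_matrix.txt.gz")` would iterate over the downloaded file under the assumptions above.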
			
				| Molecular evolutionary annotation (Evola) | 
			
				|  SET of Evola ortholog list |  | Flat File | 
			
				| NOTES: "Evola.txt.gz" provides the list of Evola ortholog accession numbers. 
 | 
			
				| Inter-species multi-FASTA (Transcript) | 
			
				|  SET of Evola ortholog sequences (Transcript) |  | Flat File | 
			
				| NOTES: "NFAS.tar.gz" provides multiple FASTA of transcript nucleotide sequences of human and other species orthologs. | 
			
				| Inter-species multi-FASTA (Protein) | 
			
				|  SET of Evola ortholog sequences (Protein) |  | Flat File | 
			
				| NOTES: "PFAS.tar.gz" provides multiple FASTA of protein amino acid sequences of human and other species orthologs. | 
			
				| Phylogenetic trees | 
			
				|  SET of Evola duplicate gene family trees (Flat File) |  | Flat File | 
			
				| NOTES: "NJ.tar.gz" provides the phylogenetic trees (phb files) constructed by the Neighbor-joining method (amino acid) | 
			
				| Human protein complex database with quality index (PCDq), data set
                                        2015. 11 | 
			
				|  Protein complex list, their subunits (members), and related annotation. | 1.5M | TSV Files (tar.gz file) | 
 
			
				| NOTES: "complexList.tsv" provides complex ID, name, etc.  
					"subunitMembers.tsv" provides subunits (members) of each complex and related annotations.  
					"public_ppi.tsv" provides PPI data used for complex prediction. The file formats are described in the README file included in the download package.
 | 
			
				| A subset of H-InvDB annotation data sets with supporting proteome evidence  
                                        2015. 7 | 
			
				|  SET of H-Inv clusters with supporting proteome evidence (Flat File) | 25M | Flat File | 
			
				|  SET of H-Inv clusters with supporting proteome evidence (XML) | 33M | XML | 
			
				|  SET of H-Inv transcripts with supporting proteome evidence (Flat File) | 146M | Flat File | 
			
				|  SET of H-Inv transcripts with supporting proteome evidence (XML) | 178M | XML | 
			
				|  SET of H-Inv proteins with supporting proteome evidence (Flat File) | 57M | Flat File | 
			
				|  SET of H-Inv proteins with supporting proteome evidence (XML) | 61M | XML | 
			
				| NOTES: Subsets of H-InvDB loci, 
transcripts, and proteins with supporting evidence of expression 
confirmed in comprehensive proteomic experiments. Those classified as 
"protein level" or "transcript level" expression in C-HPP are included. | 
			
				| Transcripts and clusters unmapped to human genome  
					2015. 5 | 
			
				|  Clusters of transcripts unmapped to human genome (Flat File) | 2.7M | Flat File | 
			
				|  Clusters of transcripts unmapped to human genome (XML) | 3.0M | XML | 
			
				|  Transcripts unmapped to human genome (Flat File) | 14M | Flat File | 
			
				|  Transcripts unmapped to human genome (XML) | 15M | XML | 
			
				| NOTES: Data set of transcripts (HIT) and clusters (HIX) that are not mapped to the human genome. |