Description of the CAPIH Web interface

The CAPIH interface provides five query schemes: by gene accession number, gene description, gene ontology, protein domain, and expressing tissue (Figure 2A). Alternatively, the user can look up proteins of interest in the protein table, which lists all the proteins analyzed in the interface. Every protein that matches the query keyword is shown with a plus “+” sign in front (Figure 2B). Detailed information on each protein can be displayed by clicking the “+” sign (Figures 3 and 4). The information page of each protein is composed of three sections (“Genome Comparison Statistics”, “Multiple Sequence Alignments”, and “Protein Interactions”). By default, only the first section is expanded when the page is shown. The user can expand the other two sections by clicking the “+” sign before each section. The user can further refine the search by submitting a second keyword, or return to the homepage and start a new search. For each protein of interest, CAPIH shows the statistical pie diagram of species-specific
variations in the “Genome Comparison Statistics” section (substitutions in light blue, indels in purple, and PTMs in green; Figure 3A). For substitutions and indels, the diagram gives species-specific variations in amino acid sequences, InterPro-predicted protein domains, CDSs, 3′ UTRs, and 5′ UTRs (from top to bottom). Each filled block represents 10 variations: 10 nucleotide substitutions (for CDSs and UTRs), amino acid changes (for amino acid sequences and IPR domains), indels, or PTMs. For example, 12 species-specific changes will be shown as 2 filled blocks in the graph. However, if the number of species-specific changes exceeds 40, only 4 filled blocks will be shown (Figure 3A). Note that nucleotide substitutions in coding regions do not necessarily cause amino acid substitutions, whereas indels do. Also note that one indel event may affect more than one amino acid. Therefore, the total numbers of indels and nucleotide substitutions in CDS do not necessarily
equal the number of amino acid changes.

Figure 2 (A) The query schemes of CAPIH. (B) All the proteins that match the query keyword are shown with a plus “+” sign in front. Detailed information on each protein can be displayed by clicking the “+” sign.

Figure 3 (A) Statistics of species-specific changes in different regions. Each filled block represents ~10 species-specific genetic changes. AA: amino acid; IPR: InterPro-predicted protein domain; CDS: coding sequence; 3/5 UTR: 3′/5′ untranslated regions. (B) Multiple amino acid sequence alignment wherein species-specific changes (PTMs and substitutions) and InterPro domains are shown in colored boxes. Indels are not color-shaded. The colors can be shown or hidden by checking the boxes in the “Feature Settings” panel.
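
The filled-block rule described for the “Genome Comparison Statistics” diagram can be illustrated with a minimal sketch. It is not the CAPIH implementation: the function name `filled_blocks` and the constants are hypothetical, and rounding up to the next block is an assumption inferred from the example in the text (12 changes shown as 2 blocks).

```python
BLOCK_SIZE = 10  # variations represented by one filled block (from the text)
MAX_BLOCKS = 4   # at most 4 filled blocks are displayed (from the text)


def filled_blocks(n_changes: int) -> int:
    """Return the number of filled blocks drawn for n_changes variations.

    Assumption: partial blocks are rounded up, so 12 changes give 2 blocks.
    Counts above 40 are capped at MAX_BLOCKS.
    """
    if n_changes < 0:
        raise ValueError("n_changes must be non-negative")
    # Integer ceiling division avoids floating-point rounding.
    blocks = (n_changes + BLOCK_SIZE - 1) // BLOCK_SIZE
    return min(MAX_BLOCKS, blocks)


if __name__ == "__main__":
    for n in (0, 7, 12, 40, 57):
        print(n, filled_blocks(n))
```

Under these assumptions, a count above 40 is indistinguishable from exactly 40 in the diagram, so the display should be read as a lower bound once all 4 blocks are filled.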