Homework 2 - Last three problems from the text

Problem 3, page 63.
Visit the Saccharomyces cerevisiae (budding yeast) genome database (SGD) Web site to learn about the yeast transcription factor HSF1. Go to SGD and look up the following information using the global gene hunter. Enter the name of the gene, limit the search by unchecking boxes as needed, and click Submit. Use the links on the resulting page to answer the questions below. (Note: A large group of Ph.D. Fellows scans the literature frequently and adds the information to the database.)

a. On what chromosome does the gene reside?

b. What is the length of the mature protein?

c. What are the SwissProt and PIR accession numbers?

d. Is the gene found in other species – not at all, one or two, or many? If so, give an example of the name of the similar gene in another species.

Problem 6, page 63.
In addition to individual genes, whole genomes of organisms are becoming available including many prokaryotes, organelles, and viruses. One good way to retrieve these genomic sequences is through the NCBI Entrez page for genomes and the taxonomy browser.

a. Go to the NCBI Entrez page and then to the genome page (on the bar at the top of the Entrez page). Enter “Homo sapiens mitochondrion” and click on the entry that appears for the human mitochondrion. Note that the RefSeq accession number starts with NC (nucleotide sequence of chromosome). Examine the sequence of the Homo sapiens mitochondrion. What is the length? Roughly outline the genes that are present. Click on the map to see the genes that are present and then on the gene blocks to see the sequences.

b. Another resource for microbial genomes is The Institute for Genomic Research (TIGR). Go to the Comprehensive Microbial Resource page, choose genomes, and click on the genome name of Synechocystis sp. PCC 6803 under the group Cyanobacteria. This is an ancient organism that produces oxygen through photosynthesis and releases it into the atmosphere. What is the size of the genome and how many proteins are encoded? What does the color code of the genome represent?

Part II, page 118.
Use the accession numbers P69202 and NP_040628 for the two phages.

In this section, protein sequence pairs will be aligned using Internet servers.

1. Using one of the Web sites listed below and the default conditions provided by that site, align the protein sequences for the phage λ and phage P22 repressors. Cut and paste the already available FASTA sequences into the site you choose.

2. Record the resulting percent identity and similarity and briefly describe what each represents.
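
As a rough illustration of what the two numbers in step 2 measure, the sketch below counts identical and similar columns in an alignment that has already been produced. It is only a sketch under stated assumptions: the similarity groups in SIMILAR_GROUPS are hypothetical (servers normally derive similarity from a scoring matrix), the function names are made up for this example, and percentages are taken over all alignment columns, including gapped ones. Individual servers may use a different denominator, so their figures need not match.

```python
# Hypothetical groups of chemically similar amino acids. This grouping is an
# assumption for illustration; alignment servers typically call a pair
# "similar" when it has a positive score in their substitution matrix.
SIMILAR_GROUPS = ["ILVM", "FWY", "KRH", "DE", "NQ", "ST", "AG"]


def are_similar(a, b):
    """Return True if residues a and b fall in the same hypothetical group."""
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def percent_identity_similarity(aligned_a, aligned_b):
    """Compute (percent identity, percent similarity) for two aligned strings.

    Both strings must have the same length and use '-' for gaps.
    Percentages are taken over all alignment columns (an assumption).
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal in length")

    columns = len(aligned_a)
    identical = 0
    similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gapped columns count toward neither total
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif are_similar(a, b):
            similar += 1

    return 100.0 * identical / columns, 100.0 * similar / columns


if __name__ == "__main__":
    # Toy alignment, not the phage repressor sequences.
    identity, similarity = percent_identity_similarity("MKT-LV", "MRTAIV")
    print(f"identity = {identity:.1f}%, similarity = {similarity:.1f}%")
```

Identity counts only exact matches, while similarity also counts conservative substitutions, so similarity is always at least as large as identity.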

Internet Sites for Sequence Alignment

The following Web sites will align two sequences using the dynamic programming algorithm.

1. LALIGN http://fasta.bioch.virginia.edu/
2. SIM http://us.expasy.org/tools/sim-prot.html
3. BCM http://searchlauncher.bcm.tmc.edu/
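
For readers who want to see the idea behind the dynamic programming algorithm these servers rely on, here is a minimal sketch of a global (Needleman-Wunsch style) alignment. It is not the code any of the listed servers run: the function name, the simple match/mismatch scores, and the linear gap penalty are all assumptions chosen for brevity, whereas the servers use amino acid scoring matrices, affine gap penalties, and in some cases local rather than global alignment.

```python
def needleman_wunsch(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences with a simple scoring scheme.

    Returns (score, aligned_a, aligned_b). The scoring values are
    illustrative assumptions, not the defaults of any server.
    """
    n, m = len(seq_a), len(seq_b)

    # score[i][j] = best score for aligning seq_a[:i] with seq_b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    # Fill the matrix: each cell extends the best of three smaller problems.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align the two residues
                score[i - 1][j] + gap,      # gap in seq_b
                score[i][j - 1] + gap,      # gap in seq_a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    aligned_a, aligned_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                aligned_a.append(seq_a[i - 1])
                aligned_b.append(seq_b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(seq_a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(seq_b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(aligned_a)), "".join(reversed(aligned_b))


if __name__ == "__main__":
    # Toy sequences, not the phage repressors.
    best, top, bottom = needleman_wunsch("ACGT", "AGT")
    print(best)
    print(top)
    print(bottom)
```

The matrix fill takes time proportional to the product of the two sequence lengths, which is why pairwise alignment of two short repressor proteins is fast on any of the servers above.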