Thursday 13 December 2012

Spending the morning analysing Archbishop Desmond Tutu's SNPs:

1. Import the dataset of all personal genomes from the "Complete Khoisan and Bantu genomes from southern Africa" Nature paper, available in the data libraries section of Galaxy.

2. Calculate how many SNPs Desmond Tutu has compared to the reference human genome -> turns out this is ~1,400,000 regions

3. Retrieve all of the known SNPs annotated on the reference genome using SNP130 (dbSNP build 130).

4. Calculate the new SNPs Desmond Tutu has, i.e. those not already among the known SNPs in SNP130 -> turns out this is ~100,000 regions.

5. Import data on the SNPs that lie within annotated exons in the reference genome -> apparently this is ~630,000 regions.
And finally...
6. Join the dataset of SNPs within exons with the dataset of Desmond Tutu's new SNPs to find out how many of them fall within exons (I will eventually attempt to find out the effect of these SNPs on the coding potential of the exon…) -> 3,315 regions.
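
For anyone who wants to see what steps 4 and 6 boil down to outside of Galaxy, here is a rough Python sketch. It is not what the Galaxy tools do internally, just my guess at the logic: a subtraction of known SNPs followed by an overlap test against exons. The function names, the toy coordinates and the BED-style 0-based, half-open intervals are all assumptions made up for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def novel_snps(sample_snps, known_snps):
    """Step 4 (sketch): keep sample SNPs that are absent from the known set."""
    known = set(known_snps)
    return [snp for snp in sample_snps if snp not in known]


def build_exon_index(exons):
    """Group exons per chromosome, sort them and merge overlapping ones."""
    by_chrom = defaultdict(list)
    for chrom, start, end in exons:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, spans in by_chrom.items():
        spans.sort()
        merged = []
        for start, end in spans:
            if merged and start <= merged[-1][1]:
                # Overlapping or touching exon: extend the previous span.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        index[chrom] = ([s for s, _ in merged], merged)
    return index


def snps_in_exons(snps, exon_index):
    """Step 6 (sketch): keep SNPs whose position falls inside an exon."""
    hits = []
    for chrom, pos in snps:
        if chrom not in exon_index:
            continue
        starts, merged = exon_index[chrom]
        # Last exon starting at or before pos, if there is one.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < merged[i][1]:
            hits.append((chrom, pos))
    return hits


# Toy data (hypothetical, not from the real datasets).
sample = [("chr1", 100), ("chr1", 250), ("chr2", 50), ("chr2", 900)]
known = [("chr1", 100), ("chr3", 5)]
exons = [("chr1", 200, 300), ("chr2", 0, 40), ("chr2", 850, 950)]

novel = novel_snps(sample, known)
exonic = snps_in_exons(novel, build_exon_index(exons))
print(len(novel), "novel SNPs,", len(exonic), "of them in exons")
```

On the toy data, one of the four SNPs is already known, and two of the remaining three land inside an exon. The real datasets are far too big to type in by hand, which is exactly why Galaxy is nice for this.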


Off to figure out how the aaChanges tool on Galaxy works :)
