Thursday 13 December 2012

Windshield Metagenomics

Having a play around with the windshield metagenomics task now.

Left the megablast search to run over lunch but it is not playing ball (still queuing..), so I'm using some data that has already been megablasted to carry on.
Spending the morning analysing Archbishop Desmond Tutu's SNPs:

1. Import dataset from all personal genomes available in the "Complete Khoisan and Bantu genomes from southern Africa" Nature paper from the data libraries section of Galaxy.

2. Calculate how many SNPs Desmond Tutu has compared to the reference human genome -> turns out this is ~1,400,000 regions

3. Retrieve all of the known SNPs on the reference genome from SNP130 (dbSNP build 130).

4. Calculate which of Desmond Tutu's SNPs are new, i.e. not already listed in SNP130 -> turns out this is ~100,000 regions.

5. Import data telling how many SNPs are present within annotated exons in the reference genome -> apparently this is ~630,000 regions
And finally..
6. Join the dataset of SNPs within exons with Desmond Tutu's new SNPs to find out how many of his new SNPs fall within exons (I will eventually attempt to find out the effect of these SNPs on the coding potential of the exon…) -> 3,315 regions.
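
For anyone who wants to see the logic of steps 2 to 6 outside of Galaxy, here is a minimal Python sketch. It is not what Galaxy runs under the hood, just my own illustration on made-up toy data. The function names (`novel_snps`, `snps_in_exons`) are hypothetical, SNPs are assumed to be `(chromosome, position)` pairs, and exons are assumed to be BED-style `(chromosome, start, end)` intervals with a half-open end. I also test positions directly against exon intervals here, which is a simplification of joining two SNP datasets.

```python
from collections import defaultdict


def novel_snps(personal, known):
    """Return the personal SNPs that are absent from the known-SNP set.

    Both arguments are iterables of (chrom, pos) tuples (an assumed format).
    """
    return set(personal) - set(known)


def snps_in_exons(snps, exons):
    """Return the SNPs whose position falls inside at least one exon.

    exons is an iterable of (chrom, start, end) tuples, treated as
    half-open intervals [start, end), as in BED files (an assumption).
    """
    # Group exon intervals by chromosome so each SNP is only compared
    # against exons on its own chromosome.
    by_chrom = defaultdict(list)
    for chrom, start, end in exons:
        by_chrom[chrom].append((start, end))

    hits = set()
    for chrom, pos in snps:
        if any(start <= pos < end for start, end in by_chrom[chrom]):
            hits.add((chrom, pos))
    return hits


# Toy data, invented purely for illustration.
personal = {("chr1", 100), ("chr1", 250), ("chr2", 40), ("chr2", 900)}
known = {("chr1", 100), ("chr2", 900)}
exons = [("chr1", 200, 300), ("chr2", 0, 30)]

new = novel_snps(personal, known)       # like step 4: drop known SNPs
exonic_new = snps_in_exons(new, exons)  # like step 6: keep those in exons

print(len(personal), len(new), len(exonic_new))
```

The linear scan over exons is fine for a toy example; for genome-sized data like the ~1,400,000 SNPs above you would want sorted intervals or an interval tree, or simply let Galaxy do the work.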


Off to figure out how the aaChanges tool on Galaxy works :)

Wednesday 12 December 2012

First attempt with Galaxy...

So today is the first time I have ever used Galaxy..

Spent the first 20 minutes or so going over some of the help videos to try and figure out what I'm supposed to do and orientate myself a little better. The videos are extremely useful, so if you haven't seen them, make sure you do; they are well worth a watch :)

I tried the "Finding Human Coding Exons with the Highest SNP Density" video. The visualisation tool wouldn't work on the computers at uni (apparently something to do with potentially not having enough RAM, so I'm told) but everything else worked fine.

Now, off to find my group to attempt to tackle another question..