Downloads

  • Data files (csv format)
    • Gen.csv - GEN file (includes the genetic signatures of the reference populations)
    • Geo.csv - GEO file (includes geographical coordinates of reference populations)
    • data.csv - input file with test participants.
  • GPS_results.txt
    • GPS_results.txt - partial output file for the data.csv.
  • The R program
    • GPS.R - the GPS script. It reads data.csv (together with the fixed Gen.csv and Geo.csv) and writes GPS_results.txt.
  • Genotype Data
    • 132 samples from the 1000 Genomes populations were genotyped on the GenoChip microarray and are available for download. The files are coded in PLINK format (please see http://pngu.mgh.harvard.edu/~purcell/plink/).
  • Readme File

Instructions

The input file

The input structure of data.csv includes the sample ID, 9 admixture coefficients and group ID. The last column (GROUP_ID) is for your own records and is not used by GPS for predictions.

SAMPLE_ID    Admixture1  Admixture2  Admixture3  Admixture4  Admixture5  Admixture6  Admixture7  Admixture8  Admixture9  GROUP_ID
GRC12076288  0.0221      0.5524      0           0.3492      0           0           0           0.0762      0           Abkhazians
GRC12076300  0.0228      0.5162      0.0069      0.3522      0           0           0           0.0921      0.0098      Abkhazians
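Before running GPS, it may help to confirm that an input file has the layout described above. The following is a minimal sketch in R, not part of GPS.R: check_input is a hypothetical helper, the file is assumed to be comma-separated with a header row, and the check that each row of coefficients sums to about 1 is an assumption based on the two example records.

```r
# Two records copied from the example above; a real check would read data.csv instead.
csv_text <- "SAMPLE_ID,Admixture1,Admixture2,Admixture3,Admixture4,Admixture5,Admixture6,Admixture7,Admixture8,Admixture9,GROUP_ID
GRC12076288,0.0221,0.5524,0,0.3492,0,0,0,0.0762,0,Abkhazians
GRC12076300,0.0228,0.5162,0.0069,0.3522,0,0,0,0.0921,0.0098,Abkhazians"

# Assumption: comma-separated values with a header row.
samples <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# check_input is a hypothetical helper, not part of GPS.R.
check_input <- function(df, tol = 0.01) {
  admix_cols <- paste0("Admixture", 1:9)
  expected <- c("SAMPLE_ID", admix_cols, "GROUP_ID")
  if (!identical(names(df), expected)) {
    stop("unexpected columns: ", paste(names(df), collapse = ", "))
  }
  admix <- as.matrix(df[, admix_cols])
  if (!is.numeric(admix)) {
    stop("admixture coefficients must be numeric")
  }
  # Assumption: the coefficients are proportions, so each row should sum to about 1.
  row_sums <- rowSums(admix)
  data.frame(
    SAMPLE_ID = df$SAMPLE_ID,
    admixture_sum = row_sums,
    sums_to_one = abs(row_sums - 1) <= tol,
    stringsAsFactors = FALSE
  )
}

report <- check_input(samples)
print(report)
```

To check a real file, replace the read.csv(text = ...) call with read.csv("data.csv", stringsAsFactors = FALSE), adjusting the separator if your file differs.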

The output file

The output file looks like this:

GROUP_ID    #  SAMPLE_ID    Closest_population  Latitude          Longitude
Abkhazians  1  GRC12076288  Abkhazians_0        39.5307070554391  46.885015601252
Abkhazians  2  GRC12076300  Abkhazians_0        41.3675042534655  47.0145274096427
Abkhazians  3  GRC12076312  Ingush_3            42.9137139523191  39.0598623753912

Run GPS

Call GPS with the following command:
GPS(directory_name="D:/GPS")
where D:/GPS is the folder that contains the data.csv file.
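A run fails early if an input file is missing, so a quick check of the folder can save time. The sketch below is illustrative only: missing_gps_files is a hypothetical helper, and it assumes that Gen.csv and Geo.csv are kept in the same folder as data.csv, and that sourcing GPS.R is what defines the GPS() function.

```r
# missing_gps_files is a hypothetical helper, not part of GPS.R.
# Assumption: Gen.csv and Geo.csv sit in the same folder as data.csv.
missing_gps_files <- function(directory_name,
                              required = c("data.csv", "Gen.csv", "Geo.csv")) {
  required[!file.exists(file.path(directory_name, required))]
}

# Demonstration on a fresh temporary folder that contains only data.csv.
demo_dir <- tempfile("GPS_demo")
dir.create(demo_dir)
file.create(file.path(demo_dir, "data.csv"))
missing <- missing_gps_files(demo_dir)
print(missing)

# With real files in place, a run could look like this
# (assumes GPS.R is saved in D:/GPS and defines GPS()):
# source("D:/GPS/GPS.R")
# if (length(missing_gps_files("D:/GPS")) == 0) GPS(directory_name = "D:/GPS")
```

The helper returns the names of the files it could not find, so an empty result means the folder is ready under the assumptions stated above.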

On-site GPS Calculator

For your convenience, we also developed a calculator (available on this website) in which you can enter the 9 admixture coefficients and view the predicted geographical origin of your test sample. To test the performance of the GPS algorithm, obtain the admixture components of known populations from the data.csv file and enter them into the calculator in exactly the order provided.
For example, the first record of the data.csv file:
SAMPLE_ID    Admixture1  Admixture2  Admixture3  Admixture4  Admixture5  Admixture6  Admixture7  Admixture8  Admixture9  GROUP_ID
GRC12076288  0.0221      0.5524      0           0.3492      0           0           0           0.0762      0           Abkhazians
has the admixture components:
0.0221  0.5524  0  0.3492  0  0  0  0.0762  0
Entering these numbers into the GPS on-line calculator yields the output:
Based on the data that you have entered our tool computed your closest population from Abkhazia region. 
Calculated Latitude of Origin: 38.2188
Calculated Longitude of Origin: 47.2863
These are the predicted geographical coordinates of this sample.
Note that the GPS calculator uses only data from the GEO and GEN matrices for its calculations.
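
Pulling the nine components out of a record in the right order, as in the example above, can be sketched in R as follows. This is only an illustration: calculator_input is a hypothetical helper, not part of GPS.R, and it assumes the columns are named Admixture1 to Admixture9 as in the example header.

```r
# The first record of data.csv, copied from the example above.
record <- data.frame(
  SAMPLE_ID = "GRC12076288",
  Admixture1 = 0.0221, Admixture2 = 0.5524, Admixture3 = 0,
  Admixture4 = 0.3492, Admixture5 = 0, Admixture6 = 0,
  Admixture7 = 0, Admixture8 = 0.0762, Admixture9 = 0,
  GROUP_ID = "Abkhazians",
  stringsAsFactors = FALSE
)

# calculator_input is a hypothetical helper, not part of GPS.R.
# It returns the nine coefficients of the first row as a named vector,
# in column order, dropping SAMPLE_ID and GROUP_ID.
calculator_input <- function(row) {
  admix_cols <- paste0("Admixture", 1:9)
  unlist(row[1, admix_cols])
}

components <- calculator_input(record)

# Print one "name = value" line per field to be typed into the calculator.
cat(paste(names(components), components, sep = " = "), sep = "\n")
```

Selecting the columns by name rather than by position keeps the order correct even if extra columns are added to your own copy of the file.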

Complete Workflow

A complete workflow for using GPS on your data can be found here.

Q&A


If you have additional questions about our method, please read our paper first.
If that still does not answer your questions, please email Eran Elhaik at e.elhaik@sheffield.ac.uk or Tatiana Tatarinova at tatarino@usc.edu.