Overview

The GeneNetworkAPI package provides access to the GeneNetwork database and analysis functions using the GeneNetwork REST API.

Credits

Pjotr Prins and Zach Sloan are the main contributors to the GeneNetwork REST API. Karl Broman wrote the GNapi R package for providing access to GeneNetwork from R. This package follows the structure and function of that package closely.

Note on terminology

GeneNetwork collects data on genetically segregating populations (called groups) in a number of species, including humans. Most of the phenotype data are "omic" data, which are organized as datasets.

Check connection

To check if the website is responding properly:

julia> check_gn()
GeneNetwork is alive.
200

Get species list

Which species have data on them?

julia> list_species()
12×4 DataFrame
 Row │ FullName                           Id     Name         TaxonomyId
     │ String                             Int64  String       Int64
─────┼───────────────────────────────────────────────────────────────────
   1 │ Mus musculus                           1  mouse             10090
   2 │ Rattus norvegicus                      2  rat               10116
   3 │ Arabidopsis thaliana                   3  arabidopsis        3702
   4 │ Homo sapiens                           4  human              9606
   5 │ Hordeum vulgare                        5  barley             4513
   6 │ Fly (Drosophila melanogaster dm6)      6  drosophila         7227
   7 │ Macaca mulatta                         7  monkey             9544
   8 │ Glycine max                            8  soybean            3847
   9 │ Solanum lycopersicum                   9  tomato             4081
  10 │ Populus trichocarpa                   10  poplar             3689
  11 │ Oryzias latipes (Japanese medaka)     11  medaka             8090
  12 │ Bat (Glossophaga soricina)            12  bat               27638

To get information on a single species:

julia> list_species("rat")
1×4 DataFrame
 Row │ FullName           Id     Name    TaxonomyId
     │ String             Int64  String  Int64
─────┼──────────────────────────────────────────────
   1 │ Rattus norvegicus      2  rat          10116

You could also subset (safer):

julia> GeneNetworkAPI.subset(list_species(), :Name => x->x.=="rat")
1×4 DataFrame
 Row │ FullName           Id     Name    TaxonomyId
     │ String             Int64  String  Int64
─────┼──────────────────────────────────────────────
   1 │ Rattus norvegicus      2  rat          10116

List groups for a species

Since the information is organized by segregating population ("group"), it is useful to get a list of the groups for the particular species you are interested in.


julia> list_groups("rat")
7×8 DataFrame
 Row │ DisplayName                        FullName                           GeneticType  Id     MappingMethodId  Name             SpeciesId  public ⋯
     │ String                             String                             String       Int64  String           String           Int64      Int64  ⋯
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ Hybrid Rat Diversity Panel (Incl…  Hybrid Rat Diversity Panel (Incl…  None            10  1                HXBBXH                   2       2 ⋯
   2 │ UIOWA SRxSHRSP F2                  UIOWA SRxSHRSP F2                  intercross      24  1                SRxSHRSPF2               2       2
   3 │ NIH Heterogeneous Stock (RGSMC 2…  NIH Heterogeneous Stock (RGSMC 2…  None            42  1                HSNIH-RGSMC              2       2
   4 │ NIH Heterogeneous Stock (Palmer)   NIH Heterogeneous Stock (Palmer)   None            55  None             HSNIH-Palmer             2       2
   5 │ NWU WKYxF344 F2 Behavior           NWU WKYxF344 F2 Behavior           intercross      82  3                NWU_WKYxF344_F2          2       2 ⋯
   6 │ HIV-1Tg and Control                HIV-1Tg and Control                None            83  1                HIV-1Tg                  2       2
   7 │ HRDP-HXB/BXH Brain Proteome        HRDP-HXB/BXH Brain Proteome        None            87  1                HRDP_HXB-BXH-BP          2       2

You can see the type of population it is. Note the short name (Name) as that will be used in queries involving that population (group).

Get genotypes for a group

To get the genotypes of a group:


julia> get_geno("BXD") |> (x->first(x,10))
10×239 DataFrame
 Row │ Chr      Locus           cM       Mb       BXD1     BXD2     BXD5     BXD6     BXD8     BXD9     BXD11    BXD12    BXD13    BXD14    BXD15    ⋯
     │ String3  String31        Float64  Float64  String1  String1  String1  String1  String1  String1  String1  String1  String1  String1  String1  ⋯
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 1        rsm10000000001     0.0   3.00149  B        B        D        D        D        B        B        D        B        B        D        ⋯
   2 │ 1        rs31443144         0.11  3.01027  B        B        D        D        D        B        B        D        B        B        D
   3 │ 1        rs6269442          0.21  3.4922   B        B        D        D        D        B        B        D        B        B        D
   4 │ 1        rs32285189         0.32  3.5112   B        B        D        D        D        B        B        D        B        B        D
   5 │ 1        rs258367496        0.43  3.6598   B        B        D        D        D        B        B        D        B        B        D        ⋯
   6 │ 1        rs32430919         0.53  3.77702  B        B        D        D        D        B        B        D        B        B        D
   7 │ 1        rs36251697         0.64  3.81227  B        B        D        D        D        B        B        D        B        B        D
   8 │ 1        rs30658298         0.75  4.43062  B        B        D        D        D        B        B        D        B        B        D
   9 │ 1        rs31879829         0.85  4.51871  B        B        D        D        D        B        B        D        B        B        D        ⋯
  10 │ 1        rs36742481         0.96  4.77632  B        B        D        D        D        B        B        D        B        B        D
                                                                                                                                 224 columns omitted

Currently, we only support the .geno format, which returns a data frame of genotypes with markers as rows and individuals as columns.
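For downstream numerical work it is often convenient to recode the genotype calls as numbers. The following is a minimal sketch, not part of GeneNetworkAPI: the helper encode_geno is hypothetical, and it assumes that "B" and "D" are the two parental calls (as in the BXD output above) and that any other call should be treated as missing.

```julia
# Hypothetical helper (not part of GeneNetworkAPI): recode genotype calls as numbers.
# Assumption: "B" and "D" are the two parental calls, as in the BXD output above;
# any other call is mapped to `missing`.
function encode_geno(calls::AbstractMatrix{<:AbstractString})
    codes = Dict("B" => 0.0, "D" => 1.0)
    return [get(codes, String(c), missing) for c in calls]
end

# Four markers (rows) by three strains (columns).
# The "U" call is made up here to illustrate the missing-value branch.
calls = ["B" "B" "D";
         "B" "D" "D";
         "D" "U" "B";
         "B" "B" "B"]

geno = encode_geno(calls)
```

With a data frame returned by get_geno, the matrix of calls could be obtained by dropping the four leading annotation columns (Chr, Locus, cM, Mb) shown in the output, assuming that column layout holds for the group in question.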

List datasets for a group

To list the (omic) datasets available for a group, you have to use the name as listed in the group list for a species:


julia> list_datasets("HSNIH-Palmer")
10×11 DataFrame
 Row │ AvgID  CreateTime                     DataScale  FullName                           Id     Long_Abbreviation              ProbeFreezeId  Shor ⋯
     │ Int64  String                         String     String                             Int64  String                         Int64          Stri ⋯
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │    24  Mon, 27 Aug 2018 00:00:00 GMT  log2       HSNIH-Palmer Nucleus Accumbens C…    860  HSNIH-Rat-Acbc-RSeq-Aug18                347  HSNI ⋯
   2 │    24  Sun, 26 Aug 2018 00:00:00 GMT  log2       HSNIH-Palmer Infralimbic Cortex …    861  HSNIH-Rat-IL-RSeq-Aug18                  348  HSNI
   3 │    24  Sat, 25 Aug 2018 00:00:00 GMT  log2       HSNIH-Palmer Lateral Habenula RN…    862  HSNIH-Rat-LHB-RSeq-Aug18                 349  HSNI
   4 │    24  Fri, 24 Aug 2018 00:00:00 GMT  log2       HSNIH-Palmer Prelimbic Cortex RN…    863  HSNIH-Rat-PL-RSeq-Aug18                  350  HSNI
   5 │    24  Thu, 23 Aug 2018 00:00:00 GMT  log2       HSNIH-Palmer Orbitofrontal Corte…    864  HSNIH-Rat-VoLo-RSeq-Aug18                351  HSNI ⋯
   6 │    24  Fri, 14 Sep 2018 00:00:00 GMT  log2       HSNIH-Palmer Nucleus Accumbens C…    868  HSNIH-Rat-Acbc-RSeqlog2-Aug18            347  HSNI
   7 │    24  Fri, 14 Sep 2018 00:00:00 GMT  log2       HSNIH-Palmer Infralimbic Cortex …    869  HSNIH-Rat-IL-RSeqlog2-Aug18              348  HSNI
   8 │    24  Fri, 14 Sep 2018 00:00:00 GMT  log2       HSNIH-Palmer Lateral Habenula RN…    870  HSNIH-Rat-LHB-RSeqlog2-Aug18             349  HSNI
   9 │    24  Fri, 14 Sep 2018 00:00:00 GMT  log2       HSNIH-Palmer Prelimbic Cortex RN…    871  HSNIH-Rat-PL-RSeqlog2-Aug18              350  HSNI ⋯
  10 │    24  Fri, 14 Sep 2018 00:00:00 GMT  log2       HSNIH-Palmer Orbitofrontal Corte…    872  HSNIH-Rat-VoLo-RSeqlog2-Aug18            351  HSNI
                                                                                                                                   4 columns omitted

List meta information of genotype file

To list the meta information of a genotype file for a group, you have to use the name as listed in the group list for a species:


julia> list_geno("BXD")
3×4 DataFrame
 Row │ f1s     mat       pat     location
     │ String  String    String  String
─────┼──────────────────────────────────────
   1 │ B6D2F1  C57BL/6J  DBA/2J  BXD.8.geno
   2 │ D2B6F1                    BXD.geno*
   3 │                           BXD.4.geno

When multiple genotype datasets are associated with a group, the * marks the default genotype dataset.
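As an illustration of the * convention, the sketch below picks the default file out of a list of location strings. The helper default_geno is hypothetical (it is not a GeneNetworkAPI function), and the input vector simply copies the location column printed above.

```julia
# Values copied from the `location` column of the list_geno("BXD") output.
locations = ["BXD.8.geno", "BXD.geno*", "BXD.4.geno"]

# Hypothetical helper: return the default file name with the trailing "*"
# removed, or `nothing` if no entry is flagged as the default.
function default_geno(locations)
    idx = findfirst(l -> endswith(l, "*"), locations)
    return idx === nothing ? nothing : chop(locations[idx])
end

default_geno(locations)  # "BXD.geno"
```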

Get sample data for a group

This gives you a matrix with rows as individuals/samples/strains and columns as "clinical" (non-omic) phenotypes. The number after the underscore is the phenotype number (to be used later). Some data may be missing.


julia> get_pheno("HSNIH-Palmer") |> (x->x[81:100,:]) |> show
20×509 DataFrame
 Row │ id          HSR_10001  HSR_10002  HSR_10003  HSR_10004  HSR_10005  HSR_10006  HSR_10007  HSR_10008  HSR_10009  HSR_10010  HSR_100 ⋯
     │ String15    Float64?   Float64?   Float64?   Float64?   Float64?   Float64?   Float64?   Float64?   Float64?   Float64?   Float64 ⋯
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 000721E489    missing    missing    missing    missing    missing    missing    missing    missing    missing    missing  missing ⋯
   2 │ 00072AAC0D    missing    missing    missing    missing    missing    missing    missing    missing    missing    missing  missing
   3 │ 00072AC972    missing    missing    missing    missing    missing    missing    missing    missing    missing    missing  missing
   4 │ 00077E61DC    missing    missing    missing    missing    missing    missing    missing    missing    missing    missing  missing
   5 │ 00077E61EC    missing    missing    missing    missing    missing    missing    missing    missing    missing    missing  missing ⋯
   6 │ 00077E61F3    missing    missing    missing    missing    missing    missing    missing    missing    missing    missing  missing
   7 │ 00077E61F5    missing    missing    missing    missing    missing    missing    missing    missing    missing    missing  missing
   8 │ 00077E6204    missing    missing    missing    missing    missing    missing    missing    missing    missing    missing  missing
  ⋮  │     ⋮           ⋮          ⋮          ⋮          ⋮          ⋮          ⋮          ⋮          ⋮          ⋮          ⋮         ⋮    ⋱
  14 │ 00077E634B    missing    missing    missing    missing    missing    missing    missing    missing    missing    missing  missing ⋯
  15 │ 00077E63D9    missing    missing    missing    missing    missing    missing    missing    missing    missing    missing  missing
  16 │ 00077E641E    missing    missing    missing    missing    missing    missing    missing    missing    missing    missing  missing
  17 │ 00077E6433    missing    missing    missing    missing    missing    missing    missing    missing    missing    missing  missing
  18 │ 00077E64B3    8672.99     86.414    4762.08     63.416    24076.1     87.118       84.0      43.57    6614.97     22.526     1955 ⋯
  19 │ 00077E64BA    missing    missing    missing    missing    missing    missing    missing    missing    missing    missing  missing
  20 │ 00077E64C1    missing    missing    missing    missing    missing    missing    missing    missing    missing    missing  missing
                                                                                                                  498 columns and 5 rows omitted
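Since the phenotype number is the part of the column name after the underscore, it can be recovered with ordinary string handling. This is a small sketch under that assumption; trait_number is a hypothetical helper, not a package function.

```julia
# Hypothetical helper: take the part of a column name after the last underscore.
# A name without an underscore (such as "id") is returned unchanged.
trait_number(colname::AbstractString) = String(last(split(colname, '_')))

# Column names as they appear in the get_pheno("HSNIH-Palmer") output.
cols = ["HSR_10001", "HSR_10002", "HSR_10003"]

trait_ids = trait_number.(cols)  # ["10001", "10002", "10003"]
```

The resulting strings have the same form as the trait number passed to the trait-level queries in the following sections.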

Get information about traits

To get information on a particular (non-omic) trait, use the group name and the trait number:


julia> info_dataset("HSNIH-Palmer","10308")
1×4 DataFrame
 Row │ dataset_type  description                        id     name
     │ String        String                             Int64  String
─────┼───────────────────────────────────────────────────────────────────────────────
   1 │ phenotype     Central nervous system, behavior…  10308  reaction_time_pint1_5

To get information on a dataset (of omic traits) for a group, use:


julia> info_dataset("HSNIH-Rat-Acbc-RSeq-Aug18")
1×10 DataFrame
 Row │ confidential  data_scale  dataset_type     full_name                          id     name                      public  short_name             ⋯
     │ Int64         String      String           String                             Int64  String                    Int64   String                 ⋯
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │            0  log2        mRNA expression  HSNIH-Palmer Nucleus Accumbens C…    860  HSNIH-Rat-Acbc-RSeq-0818       1  HSNIH-Palmer Nucleus A ⋯
                                                                                                                                   3 columns omitted

Summary information on traits

Get a list of the maximum LRS for each trait and position.


julia> info_pheno("HXBBXH") |> (x->first(x,10))
10×11 DataFrame
 Row │ Additive     Authors                            Chr     Description                       ⋯
     │ Float64?     String                             String  String                            ⋯
─────┼──────────────────────────────────────────────────────────────────────────────────────────────
   1 │  0.0499968   Pravenec M, Zidek V, Musilova A,…  8       Original post publication descri… ⋯
   2 │ -0.0926364   Pravenec M, Zidek V, Musilova A,…  14      Original post publication descri…
   3 │  0.60189     Pravenec M, Zidek V, Musilova A,…  20      Original post publication descri…
   4 │  0.992576    Pravenec M, Zidek V, Musilova A,…  8       Original post publication descri…
   5 │  0.00854221  Pravenec M, Zidek V, Musilova A,…  8       Original post publication descri… ⋯
   6 │ -0.0355208   Pravenec M, Zidek V, Musilova A,…  8       Original post publication descri…
   7 │  0.413279    Pravenec M, Zidek V, Musilova A,…  2       Original post publication descri…
   8 │ -0.936806    Pravenec M, Zidek V, Musilova A,…  3       Original post publication descri…
   9 │  1.23913     Pravenec M, Zidek V, Musilova A,…  7       Original post publication descri… ⋯
  10 │  1.2982      Pravenec M, Zidek V, Musilova A,…  7       Original post publication descri…
                                                                                 7 columns omitted

You can also specify a group and a trait number, or a dataset and a probe name.

julia> info_pheno("BXD","10001")
1×4 DataFrame
 Row │ additive  id     locus       lrs
     │ Float64   Int64  String      Float64
─────┼──────────────────────────────────────
   1 │  2.39444      4  rs48756159  13.4975

julia> info_pheno("HC_M2_0606_P","1436869_at")
1×13 DataFrame
 Row │ additive   alias                              chr     description                id     locus      lrs      mb       mean     name        p_v ⋯
     │ Float64    String                             String  String                     Int64  String     Float64  Float64  Float64  String      Flo ⋯
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ -0.214088  HHG1; HLP3; HPE3; SMMCI; Dsh; Hh…  5       sonic hedgehog (hedgehog)  99602  rs8253327  12.7711  28.4572  9.27909  1436869_at    0 ⋯
                                                                                                                                   3 columns omitted

Analysis commands

GEMMA

julia> run_gemma("BXDPublish","10015",use_loco=true) |> (x->first(x,10))
10×6 DataFrame
 Row │ Mb       additive  chr  lod_score  name            p_value
     │ Float64  Float64   Any  Float64    String          Float64
─────┼─────────────────────────────────────────────────────────────
   1 │ 3.00149  0.496892  1     0.548213  rsm10000000001  0.283001
   2 │ 3.01027  0.496892  1     0.548213  rs31443144      0.283001
   3 │ 3.4922   0.496892  1     0.548213  rs6269442       0.283001
   4 │ 3.5112   0.496892  1     0.548213  rs32285189      0.283001
   5 │ 3.6598   0.496892  1     0.548213  rs258367496     0.283001
   6 │ 3.77702  0.496892  1     0.548213  rs32430919      0.283001
   7 │ 3.81227  0.496892  1     0.548213  rs36251697      0.283001
   8 │ 4.43062  0.496892  1     0.548213  rs30658298      0.283001
   9 │ 4.51871  0.496892  1     0.548213  rs31879829      0.283001
  10 │ 4.77632  0.496892  1     0.548213  rs36742481      0.283001
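In the rows printed above, the lod_score and p_value columns appear to be related by lod_score = -log10(p_value). This is an observation about the printed numbers rather than a documented property of run_gemma, so treat the check below as a sketch under that assumption.

```julia
# Values copied from the first row of the run_gemma output above.
p_value   = 0.283001
lod_score = 0.548213

# Assumption: the score is reported on the -log10(p) scale.
neg_log10_p = -log10(p_value)

# Agreement up to the rounding of the printed values.
isapprox(neg_log10_p, lod_score; atol = 1e-4)  # true
```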

R/qtl

This function performs a one-dimensional genome scan. The arguments are

  • db (required) - DB name for trait above (Short_Abbreviation listed when you query for datasets)
  • trait (required) - ID for trait being mapped
  • method - hk (default) | ehk | em | imp | mr | mr-imp | mr-argmax ; corresponds to the "method" option for the R/qtl scanone function
  • model - normal (default) | binary | 2-part | np ; corresponds to the "model" option for the R/qtl scanone function
  • n_perm - number of permutations; 0 by default
  • control_marker - Name of marker to use as control; this relies on the user knowing the name of the marker they want to use as a covariate
  • interval_mapping - Whether to use interval mapping; "false" by default

julia> run_rqtl("BXDPublish", "10015") |> (x->first(x,10))
https://genenetwork.org/api/v_pre1/mapping?trait_id=10015&db=BXDPublish&method=rqtl&rqtl_method=hk&rqtl_model=normal&num_perm=0&interval_mapping=false
10×5 DataFrame
 Row │ Mb       cM       chr  lod_score  name
     │ Float64  Float64  Any  Float64    String
─────┼───────────────────────────────────────────────
   1 │ 3.01027  3.01027  1     0.116927  rs31443144
   2 │ 3.4922   3.4922   1     0.117404  rs6269442
   3 │ 3.5112   3.5112   1     0.117424  rs32285189
   4 │ 3.6598   3.6598   1     0.117573  rs258367496
   5 │ 3.77702  3.77702  1     0.117691  rs32430919
   6 │ 3.81227  3.81227  1     0.117727  rs36251697
   7 │ 4.43062  4.43062  1     0.118356  rs30658298
   8 │ 4.44674  4.44674  1     0.118372  rs51852623
   9 │ 4.51871  4.51871  1     0.118447  rs31879829
  10 │ 4.77632  4.77632  1     0.118714  rs36742481
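The request URL printed before the table shows how the arguments map onto query parameters. The sketch below rebuilds that URL from the documented defaults. The helper rqtl_url is hypothetical and is not how the package is necessarily implemented; the base URL and query keys are copied from the printed URL, control_marker is left out because its query key does not appear there, and no URL escaping is performed.

```julia
# Hypothetical helper (not part of GeneNetworkAPI): assemble the mapping URL
# for an R/qtl scan. Base URL and query keys are copied from the URL printed
# by run_rqtl above; keyword names mirror the documented arguments.
function rqtl_url(db, trait; method = "hk", model = "normal",
                  n_perm = 0, interval_mapping = false)
    base = "https://genenetwork.org/api/v_pre1/mapping"
    query = [
        "trait_id" => trait,
        "db" => db,
        "method" => "rqtl",
        "rqtl_method" => method,
        "rqtl_model" => model,
        "num_perm" => n_perm,
        "interval_mapping" => interval_mapping,
    ]
    # Join "key=value" pairs with "&"; values are interpolated as-is.
    return base * "?" * join(("$k=$v" for (k, v) in query), "&")
end

rqtl_url("BXDPublish", "10015")
```

With the defaults, the result matches the URL printed above; changing a keyword such as n_perm only changes the corresponding query value.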

Correlation

This function correlates a trait in a dataset against all traits in a target database.

  • trait_id (required) - ID for trait used for correlation
  • db (required) - DB name for the trait above (this is the Short_Abbreviation listed when you query for datasets)
  • target_db (required) - Target DB name to be correlated against
  • type - sample (default) | tissue
  • method - pearson (default) | spearman
  • return - Number of results to return (default = 500)

julia> run_correlation("1427571_at","HC_M2_0606_P","BXDPublish") |> (x->first(x,10))
10×4 DataFrame
 Row │ #_strains  p_value      sample_r   trait
     │ Int64      Float64      Float64    Int64
─────┼──────────────────────────────────────────
   1 │         6  0.00480466   -0.942857  20511
   2 │         6  0.00480466   -0.942857  20724
   3 │        12  1.82889e-5   -0.923362  13536
   4 │         7  0.00680719    0.892857  10157
   5 │         7  0.00680719   -0.892857  20392
   6 │         6  0.0188455     0.885714  20479
   7 │        12  0.000189298  -0.875658  12762
   8 │        12  0.000245942   0.868653  12760
   9 │         7  0.0136973    -0.857143  20559
  10 │        10  0.00222003   -0.842424  10925
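Several of the strongest correlations above are based on only six or seven strains, so it can be useful to filter on the number of strains before interpreting the ranking. The sketch below does this on a hand-copied version of the ten rows; the field name n_strains stands in for the #_strains column, and the threshold of 10 is an arbitrary choice for illustration.

```julia
# The ten rows of the run_correlation output above, as named tuples.
# `n_strains` stands in for the `#_strains` column.
rows = [
    (n_strains = 6,  p_value = 0.00480466,  sample_r = -0.942857, trait = 20511),
    (n_strains = 6,  p_value = 0.00480466,  sample_r = -0.942857, trait = 20724),
    (n_strains = 12, p_value = 1.82889e-5,  sample_r = -0.923362, trait = 13536),
    (n_strains = 7,  p_value = 0.00680719,  sample_r =  0.892857, trait = 10157),
    (n_strains = 7,  p_value = 0.00680719,  sample_r = -0.892857, trait = 20392),
    (n_strains = 6,  p_value = 0.0188455,   sample_r =  0.885714, trait = 20479),
    (n_strains = 12, p_value = 0.000189298, sample_r = -0.875658, trait = 12762),
    (n_strains = 12, p_value = 0.000245942, sample_r =  0.868653, trait = 12760),
    (n_strains = 7,  p_value = 0.0136973,   sample_r = -0.857143, trait = 20559),
    (n_strains = 10, p_value = 0.00222003,  sample_r = -0.842424, trait = 10925),
]

# Keep correlations supported by at least `min_strains` strains.
min_strains = 10
kept = filter(r -> r.n_strains >= min_strains, rows)

[r.trait for r in kept]  # [13536, 12762, 12760, 10925]
```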