Setup NCBI E-utilities

The E-utilities toolbox from NCBI is the recommended way to access NCBI and NLM data, including genome data. Its command-line interface, Entrez Direct, is packaged for Ubuntu Linux as ncbi-entrez-direct and can be installed with the following commands.

$ sudo apt update
$ sudo apt install ncbi-entrez-direct
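
Before running a real query, it can help to confirm that the two commands used in the example further down are actually on your PATH. The snippet below is only a minimal sketch: the function name check_edirect is made up for illustration, and it reports what it finds without trying to fix anything.

```shell
#!/usr/bin/env bash
# check_edirect is a hypothetical helper name, not part of Entrez Direct.
# It reports whether the two commands used in this post are on the PATH.
check_edirect() {
  local tool
  for tool in esearch efetch; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: found at $(command -v "$tool")"
    else
      echo "$tool: not found"
    fi
  done
}

check_edirect
```

If either line says "not found", the package did not install correctly or its directory is missing from your PATH.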

After a successful installation, you can test it with the following example, which prints the FTP paths of the first 10 Bacillus cereus assemblies in RefSeq. Unfortunately, this command must be run in the Bash shell; running it in eshell (within Emacs) does not work.

$ esearch -db assembly -query "Bacillus cereus" | efetch -format docsum -mode json | grep "ftppath_refseq" | head -10
            "ftppath_refseq": "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/013/394/245/GCF_013394245.1_ASM1339424v1",
            "ftppath_refseq": "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/013/303/115/GCF_013303115.1_ASM1330311v1",
            "ftppath_refseq": "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/013/284/505/GCF_013284505.1_ASM1328450v1",
            "ftppath_refseq": "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/013/284/455/GCF_013284455.1_ASM1328445v1",
            "ftppath_refseq": "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/013/284/445/GCF_013284445.1_ASM1328444v1",
            "ftppath_refseq": "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/013/284/425/GCF_013284425.1_ASM1328442v1",
            "ftppath_refseq": "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/013/267/775/GCF_013267775.1_ASM1326777v1",
            "ftppath_refseq": "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/013/267/475/GCF_013267475.1_ASM1326747v1",
            "ftppath_refseq": "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/013/267/455/GCF_013267455.1_ASM1326745v1",
            "ftppath_refseq": "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/013/267/275/GCF_013267275.1_ASM1326727v1",
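
The paths above point to assembly directories, not to files. A possible next step is to turn each one into a URL for the genome FASTA file. The sketch below rests on two assumptions that are not stated in this post: that each directory contains a file named after the directory with a _genomic.fna.gz suffix, and that the same paths are served over https. The function name refseq_fasta_urls is made up for illustration.

```shell
#!/usr/bin/env bash
# refseq_fasta_urls is a hypothetical helper name.
# Reads docsum JSON lines on stdin and prints one genome FASTA URL per
# non-empty "ftppath_refseq" entry.
# Assumptions: the FASTA file is <directory name>_genomic.fna.gz, and the
# ftp:// paths are also reachable over https://.
refseq_fasta_urls() {
  local dir
  grep '"ftppath_refseq"' |
    sed -nE 's/.*"(ftp:[^"]+)".*/\1/p' |   # keep only the quoted ftp:// path
    while IFS= read -r dir; do
      # swap the scheme, then append the file name built from the last path part
      printf '%s/%s_genomic.fna.gz\n' "${dir/#ftp:/https:}" "${dir##*/}"
    done
}

# Example with one line of the output shown above:
echo '"ftppath_refseq": "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/013/394/245/GCF_013394245.1_ASM1339424v1",' |
  refseq_fasta_urls
```

In a real run, the function would take the place of the grep step in the pipeline above, and its output could be piped to a downloader such as wget -i - to fetch the files.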
