Thursday, September 27, 2018

Downloading FASTA using efetch - CentOS


$ efetch -db nucleotide -id KJ413946.1 -format fasta


501 Protocol scheme 'https' is not supported (LWP::Protocol::https not installed)
No do_post output returned from 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=KJ413946.1&rettype=fasta&retmode=text&edirect_os=linux&edirect=9.90&tool=edirect&email=datta@ttshbio'
Result of do_post http request is
$VAR1 = bless( {
                 '_content' => 'LWP will support https URLs if the LWP::Protocol::https module
is installed.
',
                 '_rc' => 501,
                 '_headers' => bless( {
                                        'client-warning' => 'Internal response',
                                        'client-date' => 'Thu, 27 Sep 2018 06:42:06 GMT',
                                        'content-type' => 'text/plain',
                                        '::std_case' => {
                                                          'client-warning' => 'Client-Warning',
                                                          'client-date' => 'Client-Date'
                                                        }
                                      }, 'HTTP::Headers' ),
                 '_msg' => 'Protocol scheme \'https\' is not supported (LWP::Protocol::https not installed)',
                 '_request' => bless( {
                                        '_content' => 'db=nucleotide&id=KJ413946.1&rettype=fasta&retmode=text&edirect_os=linux&edirect=9.90&tool=edirect&email=datta@ttshbio',
                                        '_uri' => bless( do{\(my $o = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi')}, 'URI::https' ),
                                        '_headers' => bless( {
                                                               'user-agent' => 'libwww-perl/6.05',
                                                               'content-type' => 'application/x-www-form-urlencoded'
                                                             }, 'HTTP::Headers' ),
                                        '_method' => 'POST'
                                      }, 'HTTP::Request' )
               }, 'HTTP::Response' );

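The error says that Perl's LWP library cannot make HTTPS requests until the LWP::Protocol::https module is installed. Installing the "perl-LWP-Protocol-https" package solved the problem!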


$ sudo yum install perl-LWP-Protocol-https
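
To confirm that Perl can actually see the module before re-running efetch, a quick check along these lines may help. This is only a minimal sketch: check_lwp_https is a made-up helper name, and it assumes efetch runs the same perl that is first on your PATH.

```shell
# check_lwp_https is a hypothetical helper, not part of EDirect or CentOS.
# It asks perl to load LWP::Protocol::https and reports whether that worked.
check_lwp_https() {
    if perl -MLWP::Protocol::https -e 1 2>/dev/null; then
        echo "LWP::Protocol::https is installed"
    else
        echo "LWP::Protocol::https is missing"
    fi
}

check_lwp_https
```

If it reports "missing" after the yum install, efetch is probably using a different Perl installation than the system one.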


$ efetch -db nucleotide -id KJ413946.1 -format fasta
>KJ413946.1 Escherichia coli strain ECS01 plasmid pNDM-ECS01, complete sequence
ATGGCAGAGGAAAGCAAACAGCTAACCAAACGGCAACAAAAAGCCATTGATACAGCGGCGTTAATCCGGC
AGGAGCCGCCGCAGGGTGAAGATATGGCATTCACCCACTCCATTCTGTGCCAGGTCGGTTTGCCCCGTTC
TAAGGTGGCAGGGCGTGAGTTTATGCGCCGTTCTGGTGATGCCTGGCTCGTCGTACAGGCAGGCTGGATT
GATGAAGGCAGTGGCCCGGTAGAGCAGCCTTTACCCTATGGCGCTATGCCGCGACTCACGTTCGCCTGGA
TTTCATCGTATGCACTGCGCAACAAAACGCGGGAAATCGCCATCGGCCACAGCGCTAATGAGTTTCTTCA
CCTTATGGGGATGGACTCACAGGGAACCCGTCATAAAACGCTGCGTACACAAATGCAGGCGCTGGCCGCG
TGTCGTTTGCAGCTGGGCTTTAAGGGCC

To download multiple records directly from NCBI with "efetch", given a list of accessions, and use each accession as the filename:

$ time for d in $(cat Plasmid.list); do echo "$d"; efetch -db nucleotide -id "$d" -format fasta > "$d.fasta"; done  ## takes a while

real    6m47.812s
user    0m43.515s
sys    0m7.811s  
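
The one-liner above splits the list on any whitespace. A slightly more defensive version reads the file line by line and skips blank lines. This is only a sketch: fetch_fasta_list is a hypothetical helper name, and the `< /dev/null` is there on the assumption that efetch would otherwise read the loop's standard input and swallow the rest of the list.

```shell
# fetch_fasta_list is a hypothetical helper, not part of EDirect.
# Usage: fetch_fasta_list LIST_FILE   (one accession per line)
fetch_fasta_list() {
    # Read line by line; the "|| [ -n ... ]" part keeps a last line
    # that has no trailing newline.
    while IFS= read -r acc || [ -n "$acc" ]; do
        # Skip blank lines in the list.
        [ -z "$acc" ] && continue
        echo "$acc"
        # Same efetch call as in the post. stdin is redirected so that
        # efetch cannot consume the remaining lines of the list (assumption).
        efetch -db nucleotide -id "$acc" -format fasta < /dev/null > "$acc.fasta"
    done < "$1"
}
```

With the function defined in the current shell, `time fetch_fasta_list Plasmid.list` would be the equivalent of the loop above.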

To extract multiple records from a local NT/NR database with "blastdbcmd", given a list of accessions, and use each accession as the filename:

$ time for d in $(cat Plasmid.list); do echo "$d"; blastdbcmd -db nt -entry "$d" > "$d.fasta"; done

real    0m26.736s
user    0m7.004s
sys    0m4.384s  
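
blastdbcmd also has an -entry_batch option that reads the accessions from a file, so a single process can serve the whole list. The output is then one multi-FASTA stream, so keeping one file per accession needs a splitting step. A possible sketch follows; split_fasta is a hypothetical helper name, and it assumes each header starts with the accession (as in the efetch output above), so files are named after the ID in the header, version suffix included.

```shell
# split_fasta is a hypothetical helper: it writes each record of a
# multi-FASTA stream on stdin to <first-header-word>.fasta in the
# current directory.
split_fasta() {
    awk '
        /^>/ {
            if (out) close(out)        # finish the previous record
            name = substr($1, 2)       # header ID without the leading ">"
            out = name ".fasta"
        }
        out { print > out }            # header and sequence lines alike
    '
}

# Usage (assumes a local nt database, as in the post):
# blastdbcmd -db nt -entry_batch Plasmid.list | split_fasta
```

Older BLAST+ releases may print headers in a different style (for example with gi numbers and pipe characters), in which case the filename logic would need adjusting.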

Notice the advantage of having a local copy of the database: the same list took about 27 seconds instead of almost 7 minutes, roughly 15 times faster!
