Tuesday, June 21, 2022

Downloading viral genome database for viral read classification - Metagenomics

  1. Kraken2 Metagenomic Virus Database - redirects to Globus, with no direct download option

  2. Default kraken2 command:
    • kraken2-build --download-library viral --db $DBNAME

      Results in the error below - adding --use-ftp did not help
      rsync: getaddrinfo: ftp.ncbi.nlm.nih.gov 873: Name or service not known
      rsync error: error in socket IO (code 10) at clientserver.c(127) [Receiver=3.1.3]
      Error downloading assembly summary file for viral, exiting.


    • Tried changing code in specific scripts based on this error thread - still no luck! :(

  3. A Python script that helps with updating the Kraken databases:
    • error - FileNotFoundError: [Errno 2] No such file or directory: 'assembly_summary_refseq.txt'

  4. Downloaded the NCBI Viral database from here - have not tested it yet, because the FASTA sequences still need to be converted to Kraken2 database format.

  5. Downloaded Viral RefSeq database from a PeerJ paper - worked!
    • Pre-compiled databases
    • Looks comprehensive! Size is big (6.6 GB)
    • Could classify the viral reads using the kraken2 command
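
A note on the rsync failure in step 2: "getaddrinfo ... Name or service not known" means the hostname could not be resolved from that machine, so the download fails before any file transfer starts. A minimal sketch for checking this from Python is below. The function name can_resolve and the injectable resolver argument are my own for illustration; port 873 is the rsync port shown in the error message.

```python
import socket


def can_resolve(host, port=873, resolver=socket.getaddrinfo):
    """Return True if `host` resolves to at least one address.

    `resolver` defaults to socket.getaddrinfo; it is a parameter only so
    the check can be exercised without network access.
    """
    try:
        return len(resolver(host, port)) > 0
    except socket.gaierror:
        # Same failure class that rsync reports as
        # "Name or service not known".
        return False


# Example (needs network access):
#   can_resolve("ftp.ncbi.nlm.nih.gov")
```

If this returns False on the machine running kraken2-build, the problem is probably DNS or firewall settings on that machine (for example a compute node without outside access) rather than the Kraken2 scripts themselves.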

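For step 4, my understanding of the Kraken2 manual is that sequences not downloaded by kraken2-build need a taxonomy ID on each FASTA header, written as kraken:taxid|XXX inside the sequence ID. Below is a rough sketch of that header rewrite, not a tested pipeline. The function name tag_fasta_headers and the accession-to-taxid dictionary are hypothetical, and the mapping itself has to come from somewhere else.

```python
def tag_fasta_headers(lines, acc2taxid):
    """Yield FASTA lines with '|kraken:taxid|<taxid>' appended to each
    sequence ID that has an entry in `acc2taxid`.

    `acc2taxid` is a hypothetical dict {sequence_id: taxid}; headers with
    no entry are passed through unchanged.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Split the header into sequence ID and optional description.
            parts = line[1:].split(None, 1)
            if parts and parts[0] in acc2taxid:
                parts[0] = f"{parts[0]}|kraken:taxid|{acc2taxid[parts[0]]}"
                line = ">" + " ".join(parts)
        yield line


# Example with placeholder values (seq1 and 12345 are made up):
#   list(tag_fasta_headers([">seq1 some virus", "ACGT"], {"seq1": 12345}))
```

The tagged file could then be added with kraken2-build --add-to-library and the database built with kraken2-build --build. That route still needs the taxonomy files (kraken2-build --download-taxonomy), which may run into the same download problem as step 2.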