More solutions here:
awk '/>/{sub(">","&"FILENAME"_");sub(/\.fasta/,x)}1' assembly.fasta | sed 's/ /_/g' | grep '>'
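As a rough illustration of what the one-liner above does, the sketch below runs the same awk/sed steps on a throwaway two-record FASTA file. The file name sample.fasta, its headers, and the helper name rename_headers are made up for the example.

```shell
#!/usr/bin/env bash
# Prefix every FASTA header with the file name (minus .fasta) and
# replace spaces with underscores. sample.fasta is a hypothetical input.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '>contig1 len=10\nACGTACGTAC\n>contig2 len=4\nACGT\n' > sample.fasta

rename_headers() {
    # $1: FASTA file; prints the rewritten file to stdout
    awk '/^>/{sub(">", "&" FILENAME "_"); sub(/\.fasta/, "")} 1' "$1" | sed 's/ /_/g'
}

# Show only the rewritten header lines
rename_headers sample.fasta | grep '>'
```

Note that awk's FILENAME is whatever was typed on the command line, so passing a path with directories would put those directories into the header as well. Running from inside the folder, as here, avoids that.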
Step 1: Paste the comma-separated ERR IDs in the address bar like this:,ERR3418576,ERR3418577,ERR3307235?show=reads
Step 2: This will list the IDs on the web page like this:
$ fgrep 'filename="run' ena_data_20211215-0958.xml |
sed -e 's/.*run/\/vol1\/run/g' -e 's/\".*//g' >downloadlinks.list
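To see what the fgrep/sed step above keeps and throws away, here is a small sketch that runs the same two substitutions on a made-up XML fragment. The `<FILE .../>` line, its attributes, and the helper name extract_links are assumptions for illustration only. The real report may look different, as long as it carries a filename="run... attribute.

```shell
#!/usr/bin/env bash
# Sketch of the link-extraction step on a made-up XML fragment.
# The <FILE .../> line below is a hypothetical stand-in for a line of the
# downloaded report; only the filename="run... attribute matters here.
extract_links() {
    # Keep lines with filename="run, rewrite everything up to the last
    # "run" as /vol1/run, then cut from the first double quote onwards.
    grep -F 'filename="run' | sed -e 's/.*run/\/vol1\/run/' -e 's/".*//'
}

printf '%s\n' \
    '<FILE checksum="abc" filename="run/ERR341/ERR3418576/1_P_PA_1.fastq.gz" filetype="fastq"/>' \
    '<TITLE>not a file line</TITLE>' | extract_links
```

Because `.*run` is greedy, a line with another lowercase "run" after the filename attribute would be cut in the wrong place, so it is worth eyeballing downloadlinks.list before downloading.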
$ time for d in $(cat downloadlinks.list); do echo $d; wget $d; done
--2021-12-16 09:50:41-- (try:13)
Connecting to (||:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 1842088178 (1.7G), 606664946 (579M) remaining [application/octet-stream]
Saving to: ‘1_P_PA_1.fastq.gz’
1_P_PA_1.fastq.gz 67%[+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ ] 1.15G --.-KB/s in 15m 0s
2021-12-16 10:05:43 (0.00 B/s) - Read error at byte 1235423232/1842088178 (Connection timed out). Retrying.
--2021-12-16 10:05:53-- (try:14)
Connecting to (||:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 1842088178 (1.7G), 606664946 (579M) remaining [application/octet-stream]
Saving to: ‘1_P_PA_1.fastq.gz’
1_P_PA_1.fastq.gz 67%[+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ ] 1.15G --.-KB/s in 15m 0s
2021-12-16 10:20:54 (0.00 B/s) - Read error at byte 1235423232/1842088178 (Connection timed out). Retrying.
--2021-12-16 10:21:04-- (try:15)
Connecting to (||:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 1842088178 (1.7G), 606664946 (579M) remaining [application/octet-stream]
Saving to: ‘1_P_PA_1.fastq.gz’
1_P_PA_1.fastq.gz 81%[+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++================> ] 1.40G --.-KB/s in 26m 54s
$ locate asperaweb_id_dsa.openssh
/home/user/.aspera/connect/etc/asperaweb_id_dsa.openssh
$ cat downloadlinks.list | sed 's/\/vol1\/run\///g' >downloadlinks_ascp.list
$ time for d in $(cat downloadlinks_ascp.list); do echo $d; ascp -v -l 300m -P33001 -i /home/prakki/.aspera/connect/etc/asperaweb_id_dsa.openssh$d /ena/download/folder/ ; done
time for d in $(cat SRR.list); do time prefetch -v $d; done &
time for d in */*.sra; do fastq-dump --outdir fastq --split-files "$d"; done
Suppose we want to run a Perl program (or a program in any other language) from bash on many input files, say 1000, on a machine that has only 48 or 64 threads. Launching everything at once can overload the machine and lead to a "Resource temporarily unavailable" error.
To get around this, we want to run at most 48 scripts/commands in parallel at any time, so that the resources are not overloaded. This can be done using GNU parallel. Its advantage is that, by default, it runs as many jobs in parallel as there are CPU cores.
To do this, we create a file of commands and pass it to GNU parallel.
time for d in $(ls */*fq); do
echo "perl /storage/apps/SNP_Validation_Scripts/tools/ $d";
done >
parallel <
And that's it!
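The steps above can be sketched end to end roughly as follows. This is only a sketch under assumptions: process_reads.pl, commands.txt, the helper make_command_file, and the limit of 48 are placeholders, not the names used in the original run. The only GNU parallel feature relied on is -j to cap the number of simultaneous jobs, with one command per line read from standard input.

```shell
#!/usr/bin/env bash
# Sketch: write one command per input file, then let GNU parallel run
# at most $max_jobs of them at a time. process_reads.pl, commands.txt and
# the */*.fq layout are hypothetical placeholders.
max_jobs=48
cmd_file=commands.txt

make_command_file() {
    # $1: output file; remaining arguments: input files
    local out=$1; shift
    local f
    for f in "$@"; do
        # %q quotes file names that contain spaces or other special characters
        printf 'perl process_reads.pl %q\n' "$f"
    done > "$out"
}

shopt -s nullglob            # an unmatched glob expands to nothing
make_command_file "$cmd_file" */*.fq

if command -v parallel >/dev/null 2>&1 && [ -s "$cmd_file" ]; then
    # Each line of the file is one job; -j caps how many run at once
    parallel -j "$max_jobs" < "$cmd_file"
fi
```

Leaving out -j falls back to the default of one job per CPU core, which is usually the safer starting point on a shared machine.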
$ sudo yum update R
$ yum list installed | grep R
$ sudo yum install
$ sudo yum install yum-utils
$ sudo yum-config-manager --enable "rhel-*-optional-rpms"
$ export R_VERSION=4.1.1
$ curl -O${R_VERSION}-1-1.x86_64.rpm
$ sudo yum install R-${R_VERSION}-1-1.x86_64.rpm
$ /opt/R/${R_VERSION}/bin/R --version
R version 4.1.1 (2021-08-10) -- "Kick Things"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
$ sudo ln -s /opt/R/${R_VERSION}/bin/R /usr/local/bin/R
$ sudo ln -s /opt/R/${R_VERSION}/bin/Rscript /usr/local/bin/Rscript
# Let's check whether the R version has changed
$ R
R version 3.5.1 (2018-07-02) -- "Feather Spray"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
# Looks like we have to make the softlink point to R 4.1 instead of R 3.5, which is at:
$ which R
# Rename the R 3.5 softlink
$ sudo mv /storage/apps/anaconda3/bin/R /storage/apps/anaconda3/bin/R3.5
# softlink
$ sudo ln -s /opt/R/${R_VERSION}/bin/R /storage/apps/anaconda3/bin/R
# Now we have the latest R version
$ R --version
R version 4.1.1 (2021-08-10) -- "Kick Things"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
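The reason plain R kept reporting 3.5.1 after the first softlink is that the shell runs the first match it finds on PATH, and here the anaconda bin folder came before /usr/local/bin. A minimal sketch for listing every copy of a program in PATH order is below; the helper name list_on_path is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: print every executable called $1 in the order the shell would
# find it on PATH. The first line printed is the one that actually runs.
# Empty PATH entries are not treated specially here.
list_on_path() {
    local name=$1 dir dirs
    IFS=: read -r -a dirs <<< "$PATH"
    for dir in "${dirs[@]}"; do
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            echo "$dir/$name"
        fi
    done
}

list_on_path R
```

An alternative to renaming the older binary would be to put the preferred folder earlier in PATH; the renaming approach shown above simply avoids touching the PATH setup.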
Sometimes, when running BEAST, the ESS values of the individual runs are not > 200, but the combined ESS of all the independent runs is > 200 and the traces of the three independent runs converge in the combined trace. Is it then acceptable to use the combined estimate of parameters such as the mutation rate?
Here is an answer from a BEAST author:
Is it acceptable to use an estimate of say, TMRCA, from the combined analysis of three identical runs (from different random seeds) if the ESS is >200 for all parameters in the combined file (but not in each individual file) in Tracer?
Or do I need to do that three times (for a total of 9 runs) and use log combiner to combine the nine files?
3 runs combined to give an ESS of >200 is probably safe but you need to be a little bit careful. 10 runs to get an ESS of >200 is probably not safe because if individual runs are given ESS estimates of about 20 then there is a question of whether those ESS estimates are valid at all. You should use Tracer to visually inspect that your 3 runs are giving essentially the same answers for all the parameters and the likelihood and prior. The traces should substantially overlap and give basically the same mean and variance. If your three runs give traces that don't overlap then you can't combine them no matter what the combined ESS says!
2021-06-09 14:18:11,612 mob_suite.utils ERROR: Downloading databases failed, please check your internet connection and retry [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
2021-06-09 14:18:11,612 mob_suite.utils ERROR: Process failed with error [Errno 2] No such file or directory: 'mash': 'mash'. Removing lock file [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
Some solutions to try at:
In my case, it looks like mash is not installed, so the msh file is not generated, which causes the error.
$ conda install -c bioconda mash
$ mob_init
2021-06-09 14:51:02,838 mob_suite.utils INFO: Database directory folder already exists at /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/databases [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
2021-06-09 14:51:02,838 mob_suite.utils INFO: Placed lock file at /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/databases/.lock [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
2021-06-09 14:51:02,838 mob_suite.utils INFO: Initializing databases...this will take some time [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
2021-06-09 14:51:02,838 mob_suite.utils INFO: Downloading databases...this will take some time [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
2021-06-09 14:51:02,838 mob_suite.utils INFO: Trying mirror [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 214M 100 214M 0 0 3638k 0 0:01:00 0:01:00 --:--:-- 3942k
2021-06-09 14:52:04,226 mob_suite.utils INFO: Download sha256 checksum is 92a9008caa2bbc273bdb9cb76c5df001f25d71be819fa85a19f7e6d13e56073c [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
2021-06-09 14:52:04,227 mob_suite.utils INFO: Download size in bytes is 225377917 [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
2021-06-09 14:52:04,241 mob_suite.utils INFO: Downloading databases successful, now building databases [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
2021-06-09 14:52:04,241 mob_suite.utils INFO: Decompressing /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/databases/ [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
2021-06-09 14:52:04,932 mob_suite.utils INFO: Decompressing /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/databases/ncbi_plasmid_full_seqs.fas.gz [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
2021-06-09 14:52:12,348 mob_suite.utils INFO: Decompressing /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/databases/orit.fas.gz [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
2021-06-09 14:52:12,350 mob_suite.utils INFO: Decompressing /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/databases/rep.dna.fas.gz [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
2021-06-09 14:52:12,372 mob_suite.utils INFO: Decompressing /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/databases/mob.proteins.faa.gz [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
2021-06-09 14:52:12,381 mob_suite.utils INFO: Decompressing /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/databases/repetitive.dna.fas.gz [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
2021-06-09 14:52:12,460 mob_suite.utils INFO: Decompressing /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/databases/mpf.proteins.faa.gz [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
2021-06-09 14:52:12,477 mob_suite.utils INFO: Building repetitive mask database [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
2021-06-09 14:52:12,696 mob_suite.utils INFO: Building complete plasmid database [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
2021-06-09 14:52:17,531 mob_suite.utils INFO: Sketching complete plasmid database [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
2021-06-09 14:52:23,037 mob_suite.utils INFO: Init ete3 library ... [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
NCBI database not present yet (first time used?)
Downloading taxdump.tar.gz from NCBI FTP site (via HTTP)...
Done. Parsing...
Loading node names...
2334637 names loaded.
247459 synonyms loaded.
Loading nodes...
2334637 nodes loaded.
Linking nodes...
Tree is loaded.
Updating database: /home/prakki/.etetoolkit/taxa.sqlite ...
2334000 generating entries...
Uploading to /home/prakki/.etetoolkit/taxa.sqlite
Inserting synonyms: 50000
2021-06-09 14:53:57,919 mob_suite.utils ERROR: Init of ete3 library failed with error UNIQUE constraint failed: synonym.spname, synonym.taxid. Removing lock file [in /home/prakki/sw/pyenv/versions/3.6.4/lib/python3.6/site-packages/mob_suite/]
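The first failure above (No such file or directory: 'mash') can be caught before starting the long database download by checking that the external tools are on PATH. A small sketch follows; the helper name require_tools is made up, and mash is the only tool listed because it is the only one named in the error.

```shell
#!/usr/bin/env bash
# Sketch: report which required command-line tools are missing from PATH.
# The tool list passed at the bottom is an assumption based on the error above.
require_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

if require_tools mash; then
    echo "all required tools found; safe to run mob_init"
else
    echo "install the missing tools first, e.g. with conda" >&2
fi
```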
$ sudo yum install gcc
Loaded plugins: fastestmirror, langpacks
Existing lock /var/run/ another copy is running as pid 158996.
Another app is currently holding the yum lock; waiting for it to exit...
The other application is: yum
Memory : 1.0 M RSS (450 MB VSZ)
Started: Fri Jun 4 15:31:12 2021 - 2 day(s) 21:14:07 ago
State : Traced/Stopped, pid: 158996
Another app is currently holding the yum lock; waiting for it to exit...
The other application is: yum
Memory : 1.0 M RSS (450 MB VSZ)
Started: Fri Jun 4 15:31:12 2021 - 2 day(s) 21:14:09 ago
State : Traced/Stopped, pid: 158996
Another app is currently holding the yum lock; waiting for it to exit...
The other application is: yum
Memory : 1.0 M RSS (450 MB VSZ)
Started: Fri Jun 4 15:31:12 2021 - 2 day(s) 21:14:11 ago
State : Traced/Stopped, pid: 158996
Another app is currently holding the yum lock; waiting for it to exit...
The other application is: yum
Memory : 1.0 M RSS (450 MB VSZ)
Started: Fri Jun 4 15:31:12 2021 - 2 day(s) 21:14:13 ago
State : Traced/Stopped, pid: 158996
Exiting on user cancel.
$ ps -ef | grep 158996
root 158996 158792 0 Jun04 pts/3 00:00:00 /usr/bin/python /bin/yum install pyenv
datta 203485 67435 0 12:46 pts/2 00:00:00 grep --color=auto 158996
$ sudo kill -9 158996
$ cat /etc/resolv.conf
# Generated by NetworkManager
nameserver XXX.XXX.X.X # some numbers
$ sudo vi /etc/resolv.conf
added nameserver
$ sudo yum install gcc # Resolved; it works now.
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
epel/x86_64/metalink | 18 kB 00:00:00
* base:
* epel:
* extras:
* nux-dextop:
* updates:
$ conda search spades
Loading channels: done
# Name Version Build Channel
spades 3.5.0 1 bioconda
spades 3.5.0 py27_0 bioconda
spades 3.6.2 0 bioconda
spades 3.7.0 0 bioconda
spades 3.8.0 0 bioconda
spades 3.8.1 0 bioconda
spades 3.9.0 0 bioconda
spades 3.9.0 3 bioconda
spades 3.9.0 4 bioconda
spades 3.9.0 py27_1 bioconda
spades 3.9.0 py27_2 bioconda
spades 3.9.0 py34_1 bioconda
spades 3.9.0 py35_1 bioconda
spades 3.9.0 py35_2 bioconda
spades 3.9.1 0 bioconda
spades 3.9.1 h9ee0642_1 bioconda
spades 3.10.0 py27_0 bioconda
spades 3.10.0 py34_0 bioconda
spades 3.10.0 py35_0 bioconda
spades 3.10.1 1 bioconda
spades 3.10.1 py27_0 bioconda
spades 3.10.1 py34_0 bioconda
spades 3.10.1 py35_0 bioconda
spades 3.11.0 py27_0 bioconda
spades 3.11.0 py27_zlib1.2.8_1 bioconda
spades 3.11.0 py35_0 bioconda
spades 3.11.0 py35_zlib1.2.8_1 bioconda
spades 3.11.0 py36_0 bioconda
spades 3.11.0 py36_zlib1.2.8_1 bioconda
spades 3.11.1 h21aa3a5_2 bioconda
spades 3.11.1 h21aa3a5_3 bioconda
spades 3.11.1 hb7ba0dd_4 bioconda
spades 3.11.1 py27_zlib1.2.11_1 bioconda
spades 3.11.1 py27_zlib1.2.8_0 bioconda
spades 3.11.1 py35_zlib1.2.11_1 bioconda
spades 3.11.1 py35_zlib1.2.8_0 bioconda
spades 3.11.1 py36_zlib1.2.11_1 bioconda
spades 3.11.1 py36_zlib1.2.8_0 bioconda
spades 3.12.0 1 bioconda
spades 3.12.0 h9ee0642_2 bioconda
spades 3.12.0 py27_0 bioconda
spades 3.12.0 py35_0 bioconda
spades 3.12.0 py36_0 bioconda
spades 3.13.0 0 bioconda
spades 3.13.1 0 bioconda
spades 3.13.1 h2d02072_2 bioconda
spades 3.13.1 hfb2e325_1 bioconda
spades 3.13.2 h2d02072_0 bioconda
spades 3.14.0 h2d02072_0 bioconda
spades 3.14.1 h2d02072_0 bioconda
spades 3.14.1 h2d02072_1 bioconda
spades 3.14.1 h95f258a_2 bioconda
spades 3.15.0 h633aebb_0 bioconda
spades 3.15.2 h633aebb_0 bioconda
spades 3.15.2 h95f258a_1 bioconda
A build string such as py35_1 means that you need Python 3.5 for that specific build. If you only have Python 3.4 and the package is built only for Python 3.5, you cannot install it with conda. (see here for more details:
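To make the build column easier to read, here is a small sketch that pulls the Python version out of a build string. It assumes the usual pyXY prefix convention seen in the listing above; the helper name build_python_version is made up. Builds without that prefix (hash-style builds or a plain number) are treated as not pinned to a Python version by this sketch.

```shell
#!/usr/bin/env bash
# Sketch: pull the Python version out of a conda build string such as
# py35_1 or py27_zlib1.2.8_0. Build strings without a pyXY prefix
# (for example h2d02072_0 or plain 0) print nothing.
build_python_version() {
    local build=$1
    local re='^py([0-9])([0-9]+)(_|$)'
    if [[ $build =~ $re ]]; then
        echo "${BASH_REMATCH[1]}.${BASH_REMATCH[2]}"
    fi
}

for b in py35_1 py27_zlib1.2.8_0 h2d02072_0 0; do
    printf '%s -> %s\n' "$b" "$(build_python_version "$b")"
done
```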
$ conda install spades=3.9.0
Collecting package metadata (repodata.json): done
Solving environment: done
## Package Plan ##
environment location: /storage/apps/anaconda3
added / updated specs:
- spades=3.9.0
The following packages will be downloaded:
package | build
spades-3.9.0 | 0 9.1 MB bioconda
Total: 9.1 MB
The following packages will be DOWNGRADED:
spades 3.13.1-0 --> 3.9.0-0
Proceed ([y]/n)? y
Downloading and Extracting Packages
spades-3.9.0 | 9.1 MB | ##################################################################################################################################### | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
conda install -c bacant -c conda-forge -c bioconda bacant=3.3.4
Open Devices from the File menu
Click Install Guest Additions CD image
Go to main directory of the computer with all the disks displayed
Click the CD Drive VirtualBox Guest Additions
Double-click the VBoxWindowsAdditions.exe file depending on your operating system (x86 for 32 bit os and amd64 for 64 bit os)
Press Right Ctrl + C. This gives the VM window a proper resolution
$ sudo do-release-upgrade
Checking for a new Ubuntu release
Please install all available updates for your release before upgrading.