GWIPS-viz Forums
Kmer tool - Printable Version




Kmer tool - rosario.avolio - 05-Oct-2017 07:59 AM

Hi everyone,

I have another question for you. I am trying to run the Kmer tool but I am encountering the following error:

sh: 1: rm: Argument list too long
Error during tmp directory cleaning
Programme aborted at Wed Oct 4 18:41:48 2017


I downloaded the following GFF3 file from NCBI "ref_GRCh38.p7_top_level.gz". Is it correct or should I use another reference annotation file?

Thank you

Rosario


RE: Kmer tool - audrey - 05-Oct-2017 08:59 AM

Hi Rosario,

The kmer tool requires your Ribo-seq data to be mapped to a genome and not to the transcriptome.

Normally I would say that your gff3 file is correct. I presume that you obtained it from ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/GFF/?

However, one requirement for the kmer tool is that the genome fasta file and the gff3 file need to refer to the chromosome sequences in the same format. For example, in your gff3 file, chromosome 1 is referred to as NC_000001.11.

However, I have checked the fasta headers for the chromosomes available on ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens and their format is different (gi|568815364|ref|NT_077402.3| Homo sapiens chromosome 1 genomic scaffold, GRCh38.p7 Primary Assembly HSCHR1_CTG1).

This mismatch is quite frustrating, and it means the tool will not work with these two files.

In the past we recommended downloading from Ensembl where the formats are compatible (http://www.ensembl.org/info/data/ftp/index.html).

So the best option would be to try the Ensembl fasta and gff3.
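
Before running the tool, it may help to check whether a fasta/gff3 pair actually uses the same sequence names. Below is a minimal Python sketch of such a check. It assumes the relevant name is the first whitespace-delimited token of each fasta header and column 1 of each gff3 feature line; how the kmer tool itself matches names is not stated in this thread, so treat this as a rough pre-check only. The function names are hypothetical.

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def fasta_ids(handle):
    """Collect the first whitespace-delimited token of every fasta header."""
    ids = set()
    for line in handle:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


def gff3_seqids(handle):
    """Collect the values of column 1 (seqid) from gff3 feature lines."""
    ids = set()
    for line in handle:
        if line.startswith("##FASTA"):
            break  # embedded sequence section: no feature lines follow
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        ids.add(line.split("\t", 1)[0])
    return ids


def compare_ids(fasta_handle, gff3_handle):
    """Return (shared, fasta_only, gff3_only) sets of sequence names."""
    f_ids = fasta_ids(fasta_handle)
    g_ids = gff3_seqids(gff3_handle)
    return f_ids & g_ids, f_ids - g_ids, g_ids - f_ids


# Example use (file names are placeholders, not real paths):
# with open_text("genome.fa.gz") as fa, open_text("annotation.gff3.gz") as gff:
#     shared, fasta_only, gff3_only = compare_ids(fa, gff)
#     print(len(shared), "shared names")
```

If the set of shared names comes back empty, as would be expected for a gi|...|ref|NT_077402.3| style header against NC_000001.11, the two files are not naming sequences the same way.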

Let us know how it goes.

Audrey


RE: Kmer tool - rosario.avolio - 05-Oct-2017 01:30 PM

OK, it makes sense. I will try it, thank you very much!

In the meantime I am trying to run the RUST tools (metafootprint analysis), but I got the following error:

/mnt/workspace/DATA/galaxy/galaxy-dist/tool_dependencies/numpy/1.7.1/ribogalaxy/package_numpy_1_7/2eadc5476a0a/lib/python/numpy/core/_methods.py:57: RuntimeWarning: invalid value encountered in double_scalars
ret = ret / float(rcount)

NM_030581.3, AUG codon not found
NM_006890.4, AUG codon not found
NM_001258275.2, AUG codon not found
NR_002744.1, AUG codon not found
NM_001349941.1, AUG codon not found
NR_148472.1, AUG codon not found
....

I have chosen the same fasta file that I used for the transcriptome alignment. What could be the problem?

Thank you again

Rosario


RE: Metafootprint Analysis error AUG codon not found - jamesp - 10-Oct-2017 10:35 AM

Output messages containing names of transcripts followed by "AUG codon not found" when running a RUST Metafootprint analysis may indicate a problem with the reference transcriptome.
This symptom may be due to the sequences being in lowercase in the transcriptome reference file. The recommendation is to provide a transcriptome reference file with sequences in uppercase, e.g. CGAGTAACC instead of cgagtaacc.
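
One way to produce an uppercase copy of a fasta file is sketched below in Python. This is only an illustration, not part of RiboGalaxy; the function name and the file names in the usage comment are hypothetical. Header lines (starting with ">") are copied unchanged and only sequence lines are uppercased.

```python
def uppercase_fasta(in_handle, out_handle):
    """Copy a fasta file, uppercasing sequence lines and keeping headers as-is.

    Returns the number of sequence lines that were changed.
    """
    changed = 0
    for line in in_handle:
        if line.startswith(">"):
            out_handle.write(line)  # header: leave untouched
        else:
            upper = line.upper()
            if upper != line:
                changed += 1
            out_handle.write(upper)
    return changed


# Example use (file names are placeholders):
# with open("transcriptome.fa") as src, open("transcriptome.upper.fa", "w") as dst:
#     n = uppercase_fasta(src, dst)
#     print(n, "sequence lines converted to uppercase")
```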

Regards
jamesp


RE: Kmer tool - rosario.avolio - 10-Oct-2017 03:53 PM

Thank you very much, it worked of course!

Regards

Rosario