Examples

mGene.web can be used in three different ways: 1) If you simply want a black-box tool to get your sequence annotated, use the monolithic tools mGeneTrain, mGenePredict and mGeneEval. 2) If you would like to get a little more involved, for example to inspect intermediate results such as transcription start site predictions, use the pre-defined workflows. 3) If you are not interested in the complete gene finding process but only in a subtask, have a look at the individual mGene.web Modules, with which you can also build your own workflows. These options are explained in more detail in the examples below.

Load the Supplied Example Data

go to http://galaxy.fml.mpg.de
in the toolbar at the left, open "mGene.web"
click on "Examples and Instructions" => the organism "C. elegans" is already pre-selected
press "execute" to load the example data => three new objects will appear in the object list on the right
- Information about the datasets
- Genome Annotation in GFF3 format
- Genome Sequence in FASTA format
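
When you later use your own data instead of the supplied example, it can help to confirm that the annotation and the sequence refer to the same sequence names before starting a long training run. The following is a minimal sketch of such a check, not part of mGene.web: the helper names (`fasta_ids`, `gff3_seqids`, `unmatched_seqids`) and the toy records are assumptions made for illustration. It relies only on the FASTA convention that the identifier is the first word after `>` and on the GFF3 convention that the first tab-separated column holds the sequence ID.

```python
def fasta_ids(lines):
    """Collect sequence IDs (first word after '>') from FASTA lines."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                ids.add(parts[0])
    return ids


def gff3_seqids(lines):
    """Collect the seqid column (column 1) from GFF3 feature lines."""
    ids = set()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # anything after this directive is embedded sequence
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        ids.add(line.split("\t", 1)[0])
    return ids


def unmatched_seqids(fasta_lines, gff3_lines):
    """Return annotation seqids that have no matching FASTA record."""
    return sorted(gff3_seqids(gff3_lines) - fasta_ids(fasta_lines))


# Toy data (hypothetical, not the C. elegans example files)
fasta = [">I some description", "ACGT", ">II", "GGCC"]
gff3 = [
    "##gff-version 3",
    "I\tsrc\tgene\t1\t4\t.\t+\t.\tID=g1",
    "III\tsrc\tgene\t1\t4\t.\t+\t.\tID=g2",
]
print(unmatched_seqids(fasta, gff3))  # ['III']
```

For real files, pass open file handles instead of the toy lists, e.g. `unmatched_seqids(open("genome.fa"), open("annotation.gff3"))`, where both file names are placeholders.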

mGeneTrain: Train the Gene Finder

from "mGene.web" in the toolbar, select "mGeneTrain"
in the input box (beige, labeled "mGeneTrain") select the FASTA file and the GFF3 annotation from the C. elegans example
press "execute"
be patient -- training a full gene finder is a very complex computation and will take a few hours

mGenePredict: Find Genes in Other Genomic DNA

use "Upload file" in the toolbar to upload a FASTA file
activate "mGene.web"->"mGenePredict" from the toolbar
select your FASTA data file and your previously trained mGene predictor and press "execute"
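
Before uploading a FASTA file, a quick local sanity check can catch malformed input early. The sketch below is an illustration only and makes assumptions that this page does not confirm: that the file holds nucleotide sequence, and that IUPAC ambiguity codes are acceptable to the server. `check_fasta` is a hypothetical helper, not an mGene.web function.

```python
IUPAC_NUCLEOTIDES = set("ACGTUNRYSWKMBDHV")  # assumed alphabet


def check_fasta(lines):
    """Return a list of human-readable problems found in FASTA lines."""
    problems = []
    name = None  # header text of the record currently being read
    length = 0   # number of residues seen so far in that record
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None and length == 0:
                problems.append(f"record {name!r} has no sequence")
            name, length = line[1:].strip(), 0
            if not name:
                problems.append(f"line {number}: empty header")
        elif name is None:
            problems.append(f"line {number}: sequence before first header")
        else:
            bad = set(line.upper()) - IUPAC_NUCLEOTIDES
            if bad:
                problems.append(
                    f"line {number}: unexpected characters {sorted(bad)}"
                )
            length += len(line)
    if name is None:
        problems.append("no FASTA records found")
    elif length == 0:
        problems.append(f"record {name!r} has no sequence")
    return problems


print(check_fasta([">chr1", "ACGTN", "acgt"]))
print(check_fasta([">chr1", "ACGX"]))
```

An empty list means no problems were found by these particular checks; it does not guarantee that the upload or the prediction will succeed.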

Libraries

Through the galaxy framework we offer pretrained signal, content and gene-structure classifiers for an increasing number of organisms. To obtain such a classifier, follow these steps:

click on Libraries on the top panel of your galaxy environment => a list of all available libraries is shown
click on one of the libraries from the list => a list of datasets and classifiers opens
select a data set of your choice and click on go to import it

Please note that all classifiers provide additional information once you have imported them, e.g.:

`Trained "don" classifier`

based on 3381 labeled examples (272 positive, 3109 negative)
using 5-fold cross-validation for model-selection from 1 models (inner cv loop)
using 5-fold cross-validation for obtaining unbiased predictions (outer cv loop)

`Performance`

Average area under ROC curve on test splits: 0.995
Average area under PRC curve on test splits: 0.940
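
The numbers in such a report are easier to interpret with a little arithmetic. The sketch below computes the fraction of positive examples and the fold sizes of a 5-fold outer and inner split. It assumes folds of (nearly) equal size, which the report does not state, and `fold_sizes` is a hypothetical helper rather than mGene code.

```python
def fold_sizes(n_examples, n_folds):
    """Sizes of n_folds nearly equal parts of n_examples items."""
    base, extra = divmod(n_examples, n_folds)
    return [base + 1 if i < extra else base for i in range(n_folds)]


n_pos, n_neg = 272, 3109        # counts taken from the report above
n_total = n_pos + n_neg         # 3381 labeled examples

# Share of positives in the data set (about 8%).
positive_fraction = n_pos / n_total

# Outer loop: each fold is held out once to obtain test predictions.
outer = fold_sizes(n_total, 5)

# Inner loop: model selection uses only the data left after removing
# one outer fold (shown here for the first outer split).
inner = fold_sizes(n_total - outer[0], 5)

print(f"positive fraction: {positive_fraction:.3f}")
print("outer fold sizes:", outer)
print("inner fold sizes:", inner)
```

The positive fraction gives a useful baseline: a classifier that ranks examples at random is expected to reach an area under the ROC curve of about 0.5 and an area under the PRC curve of roughly the positive fraction, so the reported 0.995 and 0.940 are far above chance on this imbalanced data set.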

Using pre-defined Workflows

We provide a number of predefined workflows that combine different parts of our system to complete certain tasks. The workflows are described in detail here.

Note: To import one of these workflows you have to log in to the galaxy system.

To run a pre-defined workflow, do the following steps:

click on User on the top panel of your galaxy environment
if you are already registered, log in; if not, register
use one of the links provided here to import one of the workflows
in the list Workflows shared with you by others you should now find the imported workflow
click on the respective workflow
choose the required input data files (if there is nothing to choose from, you first need to upload the data, see above)
click on Run workflow at the bottom of the page
go back to Analyze Data at the top panel of your galaxy environment
you should now be able to observe the progress of the called modules.

Changing pre-defined Workflows

click on the Workflow button in the top panel of the galaxy system
click on the arrow on the right hand side of your imported workflow and select clone
click on the arrow of your cloned workflow and select edit
now the workflow editor opens and you can inspect and modify the workflow
you might also want to have a look at the individual mGene.web Modules (left panel); they can be added to the workflow by clicking on them
