network-interactions
version 1
Constructs and compares Gene Networks based on the output of matrix-scan
from a list of genes and transcription factors of interest.
network tool
network-interactions -tfs [TFsListFile] -cre [RegulatorySeqsBEDfile] -genome [GenomeAssembly] [...]
A one-column file listing all transcription factors of interest. Warning: these names must match the gene names given in the BED file.
A BED file listing the regulatory regions of all genes of interest, including the TFs. Gene names must be provided in the fourth column.
A two-column file indicating all two-node interactions, one per row.
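Since TF names must match the gene names in the fourth BED column, it can help to check this before running the tool. The following is a minimal sketch, not part of network-interactions itself; the function name missing_tf_names and the example data are hypothetical, and the BED file is assumed to be tab-delimited.

```python
def missing_tf_names(tf_lines, bed_lines):
    """Return TF names that never appear in the fourth BED column."""
    tfs = [line.strip() for line in tf_lines if line.strip()]
    bed_genes = set()
    for line in bed_lines:
        if line.startswith("#"):
            continue  # skip comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 4:
            bed_genes.add(fields[3])  # gene name is in the fourth column
    return [tf for tf in tfs if tf not in bed_genes]


# Hypothetical example data, standing in for the -tfs and -cre files.
tf_lines = ["TFx\n", "TFy\n", "TFz\n"]
bed_lines = [
    "chr1\t100\t600\tTFx\n",
    "chr2\t2000\t2500\tTFy\n",
    "chr3\t50\t450\tGeneA\n",
]
missing = missing_tf_names(tf_lines, bed_lines)
print(missing)  # TFs without a regulatory region in the BED file
```

With real files, the same function can be called on open file handles, since it only iterates over lines.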
complete_direct_interactions_date.tsv
A table indicating all direct putative interactions found for all genes specified in the BED file of regulatory regions.
Format: one TF-gene interaction per row, followed by the coordinates in which the interaction was found (as given in the input BED file), the one-based position counted from the first coordinate and, last, the matrix-scan score for the TF’s binding at that position. If the interaction is found more than once within the same coordinates, the positions and scores are separated by “;”, where the first score corresponds to the first position, and so on. If the interaction is found in a different set of coordinates, it appears on a separate row.
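The pairing of “;”-separated positions and scores can be illustrated with a short parsing sketch. The exact column layout is an assumption here: five tab-separated fields (TF, gene, coordinates as a single string, positions, scores). The function name parse_interaction_row and the example row are hypothetical and should be adapted to the actual file.

```python
def parse_interaction_row(line):
    """Split one row into its fields and pair each position with its score."""
    # Assumed layout: TF, gene, coordinates, positions, scores (tab-separated).
    tf, gene, coords, positions, scores = line.rstrip("\n").split("\t")
    pos_list = [int(p) for p in positions.split(";")]
    score_list = [float(s) for s in scores.split(";")]
    if len(pos_list) != len(score_list):
        raise ValueError("positions and scores must have the same length")
    return {
        "tf": tf,
        "gene": gene,
        "coordinates": coords,
        # The first score corresponds to the first position, and so on.
        "hits": list(zip(pos_list, score_list)),
    }


# Hypothetical row: TFx binds twice within the same regulatory region of GeneA.
row = "TFx\tGeneA\tchr1:100-600\t12;250\t7.5;9.1\n"
parsed = parse_interaction_row(row)
print(parsed["hits"])
```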
GRN_direct_interactions_date.tsv
A table indicating all direct putative interactions found only for transcription factors.
Format: one TF-TF interaction per row (the first TF regulates the second), followed by the coordinates in which the interaction was found, the one-based position counted from the first coordinate and, last, the matrix-scan score for the TF’s binding at that position. If the interaction is found more than once within the same coordinates, the positions and scores are separated by “;”, where the first score corresponds to the first position, and so on. If the interaction is found in a different set of coordinates, it appears on a separate row.
GRN_indirect_interactions_date.tsv
Because the program only detects direct putative interactions, this file lists indirect interactions as chains of two direct interactions spanning three nodes.
E.g.: TFx ==> TFy ==> TFz
which should be read as: TFx directly regulates TFy and TFy directly regulates TFz, thus, TFx indirectly regulates TFz.
Special cases:
A TF has no targets. If a TF has no direct targets (and hence no indirect targets), it is reported as: TFx has no targets. If this “TFx” is a direct target of another TF, then “TFx” is simply a terminal node.
TF regulating itself. Cases of TFs putatively regulating themselves are only specified once, as a two-node interaction. E.g.: TFy ==> TFy
For each TF-TF interaction, the same details are given as in GRN_direct_interactions_date.tsv: coordinates, positions and scores.
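The way three-node chains follow from direct interactions can be sketched as below. This is a simplified illustration and not the program's own code: the function name describe_indirect is hypothetical, coordinates and scores are left out, and the handling of edge cases (for instance a target that itself has no targets) may differ from the real output.

```python
def describe_indirect(direct_edges, tfs):
    """List three-node chains built from direct (regulator, target) edges."""
    targets = {tf: [] for tf in tfs}
    for regulator, target in direct_edges:
        targets.setdefault(regulator, [])
        if target not in targets[regulator]:
            targets[regulator].append(target)

    lines = []
    for tf in tfs:
        if not targets[tf]:
            # Special case: no direct targets, hence no indirect ones.
            lines.append(f"{tf} has no targets")
            continue
        for mid in targets[tf]:
            if mid == tf:
                # Special case: self-regulation, reported once as two nodes.
                lines.append(f"{tf} ==> {tf}")
                continue
            for end in targets.get(mid, []):
                if end == mid:
                    continue  # the self-loop of mid is reported under mid
                lines.append(f"{tf} ==> {mid} ==> {end}")
    return lines


# Hypothetical direct TF-TF interactions.
edges = [("TFx", "TFy"), ("TFy", "TFz"), ("TFy", "TFy")]
chains = describe_indirect(edges, ["TFx", "TFy", "TFz"])
for chain in chains:
    print(chain)
```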
When an input network is given, it is compared with the network generated by the program (the complete network). Note that this comparison covers direct interactions only.
The comparison takes into account only the genes listed in the input BED file and generates the following three output files:
network_intersection_date.tsv
A tab-delimited file containing, one per row, all the common interactions between the input network and the network of all genes generated by the program. Information about coordinates, positions and scores related to the output network is also given.
network_not_found_interactions_date.tsv
A two-column, tab-delimited file containing, one per row, all the interactions present in the input network but not found in the output network (the asymmetric difference for the input network).
An interaction can be not_found for one of several reasons:
The regulatory sequence of one or both genes was not provided.
The chosen motif database does not contain a matrix describing the motif for the TF in the interaction. Default database is JASPAR’s 2020 nonredundant vertebrate motif collection.
matrix-scan did not find an instance of the TF’s motif in the genomic sequences provided, or it found one that was not reliable given the score and/or p-value thresholds.
Note: Since these interactions are absent from the complete output network, information about coordinates, positions and scores is also absent.
network_found_interactions_date.tsv
A tab-delimited file containing, one per row, all the interactions found by the program but that were not specified in the input network (the asymmetric difference for the complete network). Information about coordinates, positions and scores related to the output network is also given.
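The relationship between the three comparison files can be expressed with plain set operations. The sketch below is only an illustration under the assumption that each interaction is an ordered (regulator, target) pair; the function name compare_networks and the example networks are hypothetical.

```python
def compare_networks(input_edges, output_edges):
    """Split two networks into shared and network-specific interactions."""
    input_set = set(input_edges)
    output_set = set(output_edges)
    return {
        # Interactions common to both networks (network_intersection).
        "intersection": input_set & output_set,
        # In the input network only (network_not_found_interactions).
        "not_found": input_set - output_set,
        # In the complete output network only (network_found_interactions).
        "found": output_set - input_set,
    }


# Hypothetical networks as (regulator, target) pairs.
input_edges = [("TFx", "TFy"), ("TFy", "TFz")]
output_edges = [("TFx", "TFy"), ("TFx", "GeneA")]
result = compare_networks(input_edges, output_edges)
for name, edges in result.items():
    print(name, sorted(edges))
```

Treating interactions as ordered pairs means that TFx regulating TFy and TFy regulating TFx count as different interactions.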
-v #
Level of verbosity.
-h
Display full help message.
-help
Same as -h.
-tfs TFs_infile
Mandatory. One-column input file listing the TFs in the network to be analysed.
-cre cre_infile
Mandatory. BED file of regulatory sequences for each gene in the network. Each sequence must name its regulated gene in the 4th column.
-genome genome_version
Mandatory. Working genome version.
-db databases
Database(s) to use separated by commas.
-net network_infile
Mandatory if -report_net is specified. The network must be a two-column file with each row containing a single interaction.
-report_net
Report network: output the differences and overlap between the input network and the new network.
-title title
Title displayed on top of the report page.
-html
Output an HTML summary file.
-date date
Date used as suffix for output files.
-seq input_sequences
FASTA file of regulatory sequences for each gene in the network. Each sequence must name its regulated gene in its header by tag. The regulatory sequences of the TFs in the network should be included.
-m matrices-file
File of matrices in TRANSFAC format, with the TF name in the ID field.
-scan matrix-scan-results-file
File resulting from matrix-scan on regulatory sequences for the TFs’ matrices of interest.
-score #
Score lower threshold to run matrix-scan with, default is 5.
-pval #
P-value upper threshold to run matrix-scan with.
-o output_prefix
Prefix for the output files.
-outdir output_dir
Output directory.
Last update: December 4th, 2020