
Own scripts for data analysis

Own scripts for data analysis and Arduino microcontroller

All scripts available below were written to solve specific tasks, mostly the analysis and visualization of protein and DNA sequence data. They are not actively supported and can be downloaded and used at your own risk. Feel free to modify and improve the code. Feedback and improvements are welcome and appreciated!

Perl scripts: (see the respective script for usage)





Perl script that replaces the B-factor field of any PDB file with arbitrary residue-wise values, e.g. NMR chemical shift changes, MD-derived residue-wise RMSD values, conservation scores, etc.


Input: pdb-file and text-file containing the values that should be mapped onto the protein structure


Output: pdb-file with rewritten B-factor field


Direct visualization of residue-wise data, e.g. as a color gradient mapped onto a crystal structure.


Output tested with PyMOL v0.99
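The B-factor substitution described above can be sketched as follows. This is a minimal Python illustration, not the Perl script itself. It assumes standard fixed-column PDB records (residue number in columns 23-26, B-factor in columns 61-66) and takes the values as a dictionary keyed by residue number. The function name and the dictionary input are hypothetical, and chain IDs and insertion codes are ignored for brevity.

```python
def set_b_factors(pdb_lines, values, default=0.0):
    """Return a copy of pdb_lines with the B-factor field rewritten.

    values maps a residue sequence number (int) to the number that
    should appear in the B-factor column; residues without an entry
    receive `default`.
    """
    out = []
    for line in pdb_lines:
        # Only coordinate records carry a B-factor field.
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 66:
            res_num = int(line[22:26])        # columns 23-26: residue number
            b = values.get(res_num, default)
            # columns 61-66: B-factor, written as a 6-character field
            line = line[:60] + f"{b:6.2f}" + line[66:]
        out.append(line)
    return out
```

In PyMOL the rewritten file can then be colored by B-factor, for example with the `spectrum b` command.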


Perl script to analyze DNA sequences of one or several open reading frames (e.g. operons) for the presence of rare codons.


Input: text-file with DNA sequence data


Output: text-file containing a list with the occurrence (%) of rare E. coli codons within the input DNA data set

Analysis of DNA sequences for the presence of substantial numbers of rare E. coli codons.


A high proportion of rare codons may hint at inefficient expression of your target gene/operon in E. coli.
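A rough Python sketch of such a rare-codon count is given here for orientation; it is not the Perl script. The set of rare codons below is an assumption chosen for illustration (a commonly cited selection of rare E. coli codons), and the reading frame is assumed to start at the first base of the input. The names `RARE_CODONS` and `rare_codon_usage` are hypothetical.

```python
from collections import Counter

# Assumed, illustrative set of rare E. coli codons (DNA alphabet).
RARE_CODONS = {"AGG", "AGA", "CGA", "CGG", "CTA", "ATA", "CCC", "GGA"}

def rare_codon_usage(dna, rare=RARE_CODONS):
    """Return {codon: occurrence in % of all codons} for each rare codon."""
    seq = "".join(dna.split()).upper()          # drop whitespace and line breaks
    usable = len(seq) - len(seq) % 3            # ignore an incomplete last codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        return {}
    counts = Counter(c for c in codons if c in rare)
    return {c: 100.0 * counts[c] / len(codons) for c in sorted(rare)}
```

For several open reading frames, the function would be called once per frame so that each one is read in its own frame.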


Perl script to analyze multiple amino acid sequences (e.g. alignment data in FASTA format) with respect to their residue-wise hydropathy.

Hydropathy index assigned according to the Kyte-Doolittle scale.


Input: text-file containing multiple amino acid sequences in fasta format (e.g. gap-free sequence alignment)


Output: ordered, space-separated list of residue-wise hydropathy index values, one value for each amino acid of the input sequences

Visualisation of conserved functional sequence motifs in protein sequences (e.g. coiled-coil heptad repeats).



The script can be easily modified for the analysis of other amino acid side-chain properties (e.g. charges).
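The residue-wise assignment can be illustrated with a short Python sketch (not the Perl script). It uses the Kyte-Doolittle hydropathy values and a simple FASTA reader. The function names are hypothetical, and gaps or non-standard residues are assumed to be absent, so they raise a `KeyError`.

```python
# Kyte-Doolittle hydropathy index per amino acid (one-letter code).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def read_fasta(text):
    """Parse FASTA text into {header: sequence}, joining wrapped lines."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif line and name is not None:
            records[name] += line
    return records

def hydropathy_profile(sequence, scale=KYTE_DOOLITTLE):
    """Return the hydropathy index of every residue, in sequence order."""
    return [scale[aa] for aa in sequence.upper()]

def format_profile(values):
    """Render a profile as a space-separated list."""
    return " ".join(f"{v:.1f}" for v in values)
```

Passing a different dictionary as `scale` (for instance side-chain charges) adapts the same loop to other residue properties.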


Perl script that analyzes the guanine-cytosine (GC) content of a given DNA sequence (gene, plasmid, genome, etc.). The tool reports the mean GC content of the whole sequence as well as sequence-resolved GC content values, obtained by averaging over a user-defined window (e.g. 10 bp).


Input: INPUT.txt text-file containing a DNA sequence (no header, plain txt only).


Output: outfile.txt text-file containing average GC content (%) versus window number.

Analysis of average GC content and generation of sequence-resolved GC content plots.
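The windowed GC calculation can be sketched in Python as follows (not the Perl script). The sketch assumes consecutive, non-overlapping windows, which matches a "GC content versus window number" output but is not confirmed by the page. A shorter final window is still reported. The function names are hypothetical.

```python
def gc_percent(seq):
    """GC content of seq in percent (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def windowed_gc(seq, window=10):
    """GC content (%) of consecutive, non-overlapping windows of `window` bp."""
    return [gc_percent(seq[i:i + window]) for i in range(0, len(seq), window)]

if __name__ == "__main__":
    dna = "GGCCATATGCAT"
    print(f"mean GC: {gc_percent(dna):.1f} %")
    for number, value in enumerate(windowed_gc(dna, window=4), start=1):
        print(number, f"{value:.1f}")
```

Each printed pair corresponds to one row of a window-number versus GC-content table that can be plotted directly.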

Processing and Arduino scripts





Arduino script which allows the gating of up to 4 LEDs via a custom-built high-power LED driver connected to an Arduino UNO board.




Control of the on/off time of up to 4 high-power LEDs


On/off delay times have to be adjusted in the script.


LED intensity control possible via a Processing-based graphical user interface (GUI) (BlinkingLED_GUI.pde)


The time program can only be started via the Processing GUI described below.


The serial port and the Arduino pin assignments have to be adapted in the script.


Processing script providing a graphical user interface for LED intensity control via an Arduino microcontroller



The tool allows switching and dimming of LED sources via pulse-width modulation.


Requirement: BlinkingLED.ino running on an Arduino UNO board that controls an LED driver, and a serial connection between the Arduino and the PC.
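For orientation, dimming by pulse-width modulation on an Arduino UNO comes down to choosing an 8-bit duty value (0 = off, 255 = fully on), which is the range accepted by `analogWrite`. The Python sketch below only illustrates that mapping. How BlinkingLED_GUI.pde and BlinkingLED.ino actually encode and transmit the intensity is not described here, so the function name and the percent scale are assumptions.

```python
def intensity_to_pwm(percent):
    """Map an intensity in percent (0-100) to an 8-bit PWM duty value (0-255).

    Values outside 0-100 are clamped to the valid range.
    """
    percent = max(0.0, min(100.0, float(percent)))
    return round(percent * 255 / 100)
```

The perceived brightness of an LED does not scale linearly with the duty cycle, so a linear mapping like this one is only a starting point.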