Automated analysis of electrophoresis images for inter-delta fingerprint yeast typing
Three files have been uploaded, one for each of the three parts of this program. The second part is written in R; the first and third parts are written in Python. The first part selects each lane in the digitized electrophoresis gel and extracts its intensity signal spectrum. The second part identifies the most important peaks in the spectrum (each peak corresponds to a band of the lane). The third part compares the bands of each lane and identifies the isolates that belong to the same strain.
Steps to reproduce
In each inter-delta fingerprinting image obtained, select by hand the start and end of each lane. For a selected lane, the program reads a strip 10 pixels wide (suitable for images with lanes between 15 and 50 pixels wide) and averages the intensity across the strip at each position along the lane. In this way, the program generates a signal intensity spectrum for each lane (isolate).
An algorithm based on the continuous wavelet transform provides accurate noise reduction. It is applied directly to the raw spectrum and, given a signal-to-noise ratio threshold (usually no lower than 3), it automatically identifies the relative position and strength of the strongest peaks, which are the most representative bands of the spectrum. A band profile is thus obtained for every isolate, with the position and intensity of every band.
Once band patterns have been obtained for each isolate, the strain identification (typing) of the yeast isolates is performed by comparing all profiles. Each lane is read individually and compared with all the other lanes, band by band. The program compares the position of every band, allowing a given error (recommended to be lower than 2.5%). Each comparison is made in both directions, so you can select different similarity percentages for considering two profiles equal (usually the loss of a single band in the profile is allowed, so the threshold would be 80% for a profile with 5 bands, or 75% for a profile with 4 bands). The program outputs a text file listing the clusters of matching isolates and also creates a file for each strain obtained. These files contain the names of the isolates belonging to that strain.
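The band-by-band comparison described above can be illustrated with a minimal Python sketch. It is not the uploaded program: the function names, the example profiles, the reading of the allowed error as a relative difference between positions, the greedy one-to-one band matching, and the single-linkage grouping of isolates are all assumptions made for illustration.

```python
def bands_match(pos_a, pos_b, tolerance=0.025):
    """Return True if two band positions agree within a relative error.

    Assumption: the error is measured relative to the larger position.
    """
    return abs(pos_a - pos_b) <= tolerance * max(pos_a, pos_b)


def matched_fraction(profile, other, tolerance=0.025):
    """Fraction of bands in `profile` that have a counterpart in `other`.

    Each band of `other` can be used only once (greedy matching, an assumption).
    """
    if not profile:
        return 0.0
    remaining = list(other)
    hits = 0
    for pos in profile:
        for candidate in remaining:
            if bands_match(pos, candidate, tolerance):
                remaining.remove(candidate)
                hits += 1
                break
    return hits / len(profile)


def same_strain(a, b, tolerance=0.025, min_similarity=0.75):
    """Compare two profiles in both directions against a similarity threshold."""
    return (matched_fraction(a, b, tolerance) >= min_similarity
            and matched_fraction(b, a, tolerance) >= min_similarity)


def cluster_isolates(profiles, tolerance=0.025, min_similarity=0.75):
    """Group isolates whose profiles match (single linkage via union-find)."""
    names = list(profiles)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if same_strain(profiles[a], profiles[b], tolerance, min_similarity):
                parent[find(b)] = find(a)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return list(clusters.values())


# Hypothetical band positions (in pixels along the lane) for three isolates.
example_profiles = {
    "isolate_1": [120, 210, 305, 410, 520],
    "isolate_2": [121, 212, 303, 412],  # same pattern with one band missing
    "isolate_3": [90, 180, 260, 350],
}

if __name__ == "__main__":
    for cluster in cluster_isolates(example_profiles):
        print(cluster)
```

With these example profiles, isolate_1 and isolate_2 share four bands, which is 80% of the five-band profile and 100% of the four-band profile, so they fall into the same cluster at a 75% threshold, while isolate_3 stays on its own. Checking both directions is what prevents a short profile from matching a much longer one that merely contains it.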