SW-ARRAY: a dynamic
programming solution for the
identification of copy-number
changes in genomic DNA using
array comparative genome
hybridization data
Motivation
• Chromosomal changes cause genetic
diseases
– aneusomies
• Easy to detect
– Copy number changes of genes
• Not so easy
Array CGH
• Comparative Genome Hybridization (CGH) applied to
DNA microarrays
• Method for detecting copy number changes
– Data analyzed using thresholds
– Not reliable to detect single-copy gains or losses
when using large insert clones as probes
– High false positives and false negatives
– Inconsistent for probes of different chromosomal
regions
• Cannot be used for clinical diagnostic
applications!
Data Adjustment
• Normalization and Correction
– Reason: variations between probes
– Control vs. control data ratio
• Find mean and SD
– Divide control vs. test ratios by that mean
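A minimal sketch of the correction above, assuming each probe has several control vs. control ratios and that the correction is applied per probe. The function names `probe_correction` and `normalize` are hypothetical and only for illustration.

```python
from statistics import mean, stdev

def probe_correction(control_ratios):
    """Return (mean, SD) of the control vs. control ratios for one probe."""
    return mean(control_ratios), stdev(control_ratios)

def normalize(test_ratios, control_ratios_per_probe):
    """Divide each probe's control vs. test ratio by that probe's control mean.

    Assumption: one list of control vs. control ratios per probe,
    given in the same order as test_ratios.
    """
    normalized = []
    for ratio, controls in zip(test_ratios, control_ratios_per_probe):
        probe_mean, _probe_sd = probe_correction(controls)
        normalized.append(ratio / probe_mean)
    return normalized
```

A probe whose control ratios average 0.5 and whose test ratio is 0.4 would come out at 0.8 after this correction.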
Threshold method
• Compare each ratio from the control vs. test
experiment to threshold values
– Below 0.8=deletion
– Above 1.2=polysomy
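The threshold rule can be sketched in a few lines. The cutoffs 0.8 and 1.2 come from the slide; the function name `classify_ratio` and the label "normal" for ratios between the cutoffs are assumptions.

```python
def classify_ratio(ratio, low=0.8, high=1.2):
    """Classify one normalized control vs. test ratio by fixed thresholds."""
    if ratio < low:
        return "deletion"
    if ratio > high:
        return "polysomy"
    return "normal"  # assumed label for ratios inside [low, high]

calls = [classify_ratio(r) for r in (0.5, 1.0, 1.5)]
```

Each probe is judged on its own here, so a single noisy ratio is enough to produce a false call.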
SW-ARRAY
• Smith-Waterman algorithm adapted for
Array CGH
• New way to analyze Array CGH data
• Reason:
– Log ratio data form a contiguous one-dimensional
series in which runs of high values may
indicate polysomic regions and runs of low values deletions
SW-ARRAY
• Step 1:
– Remove outlying probes
• Log intensity ratio more than 2.5 MAD away from median of
other probes in array
• MAD=Mean Absolute Deviation
– Robust measure of Standard Deviation
$$\mathrm{MAD} = \frac{1}{n} \sum_{i=1}^{n} \left| x_i - \bar{x} \right|$$
SW-ARRAY
• Step 2:
– Subtract a threshold t0 from every log ratio
– Ensures that the mean of the adjusted data is
negative
• t0 = median + 0.2 × MAD
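A sketch of Step 2, assuming MAD is the mean absolute deviation and that t0 is computed from the outlier-filtered log ratios of one array. The function name `adjust` is hypothetical. For roughly symmetric data such as the toy input below, the adjusted mean comes out negative.

```python
from statistics import mean, median

def mean_abs_dev(values):
    """Mean absolute deviation: average distance of the values from their mean."""
    center = mean(values)
    return sum(abs(v - center) for v in values) / len(values)

def adjust(log_ratios, factor=0.2):
    """Subtract t0 = median + factor * MAD from every log ratio."""
    t0 = median(log_ratios) + factor * mean_abs_dev(log_ratios)
    return [x - t0 for x in log_ratios]

adjusted = adjust([-0.1, 0.0, 0.1, 0.05, -0.05])
```

Shifting the data below zero is what lets the later island search stop on its own: a segment only keeps a positive score while its probes sit consistently above t0.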
SW-ARRAY
• Step 3:
– Search for high-scoring islands
• Definition
– Locally high-scoring segment: a positive-scoring
segment whose score cannot be
increased by shrinking or expanding the segment
boundaries
SW-ARRAY
$$T(p, q) = \sum_{i=p}^{q} X(i)$$
T(p,q) = score of the segment from probe p to probe q
X(i) = score for the ith probe, ordered
along the genome
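As a worked example of the segment score, the sketch below computes T(p, q) from prefix sums so that any segment can be scored in constant time. Probe positions are 1-based as in the formula; the function names are hypothetical.

```python
from itertools import accumulate

def prefix_sums(x):
    """prefix[k] holds X(1) + ... + X(k), with prefix[0] = 0."""
    return [0.0] + list(accumulate(x))

def segment_score(prefix, p, q):
    """T(p, q): sum of X(i) for i = p..q (1-based, inclusive)."""
    return prefix[q] - prefix[p - 1]

prefix = prefix_sums([-1.0, 2.0, 3.0, -4.0])
score = segment_score(prefix, 2, 3)  # X(2) + X(3) = 2.0 + 3.0
```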
SW-ARRAY
S(p)=score of island ending at p
B(p)=beginning point of the island
S(0)=0
p > 0
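The recurrence itself did not survive on this slide. The sketch below assumes the usual Smith-Waterman-style form, S(p) = max(S(p-1) + X(p), 0) for p > 0, with B(p) carried forward while the running score stays positive. Treat it as an illustration of that assumed form rather than the exact definition used in SW-ARRAY; the function name `island_scores` is hypothetical.

```python
def island_scores(x):
    """Compute S(p) and B(p) for p = 1..n from adjusted scores x = [X(1), ..., X(n)].

    Index 0 of both returned lists is the base case S(0) = 0.
    B(p) = 0 means no positive-scoring island ends at p.
    """
    n = len(x)
    S = [0.0] * (n + 1)
    B = [0] * (n + 1)
    for p in range(1, n + 1):
        candidate = S[p - 1] + x[p - 1]
        if candidate > 0:
            S[p] = candidate
            # Extend the current island, or start a new one at p.
            B[p] = B[p - 1] if S[p - 1] > 0 else p
        # Otherwise S[p] and B[p] keep their initial value of 0.
    return S, B

S, B = island_scores([-1.0, 2.0, 3.0, -4.0, 1.0])
```

In this toy input the score peaks at S(3) = 5 with B(3) = 2, so the best island spans probes 2 to 3.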
SW-ARRAY
• Iterate through the probe locations along the
genome
• Search where scores>0
– Find max-scoring island
– Record data
– Set island=0
– Find next max-scoring island
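The loop above can be sketched as follows. It assumes the running score follows S(p) = max(S(p-1) + X(p), 0), and it reads "Set island=0" as forcing the running score to zero across probes already assigned to an island; both are assumptions. The function name `find_islands` is hypothetical, and positions are 0-based.

```python
def find_islands(x):
    """Repeatedly extract the max-scoring island from adjusted scores x.

    Returns a list of (start, end, score) tuples, 0-based and inclusive,
    ordered from highest-scoring to lowest-scoring island.
    """
    masked = [False] * len(x)  # probes already assigned to an island
    islands = []
    while True:
        best = None
        score, start = 0.0, 0
        for p, value in enumerate(x):
            if masked[p]:
                score = 0.0  # a recorded island contributes nothing
                continue
            if score <= 0:
                score, start = 0.0, p  # start a fresh candidate island at p
            score += value
            if score > 0 and (best is None or score > best[2]):
                best = (start, p, score)
        if best is None:
            break  # no positive-scoring island is left
        islands.append(best)
        for p in range(best[0], best[1] + 1):
            masked[p] = True
    return islands

islands = find_islands([-1.0, 2.0, 3.0, -4.0, 1.0, -2.0])
```

Each pass rescans the whole array, so the cost is linear in the number of probes per island found.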
SW-ARRAY
• Statistical Significance
– Permute the log ratios among the probes in
1000 runs
• Record the highest-scoring island in each run; how often these permuted scores reach the observed island score gives its significance
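A sketch of the permutation test, assuming the significance of the top island is reported as the fraction of permuted runs whose highest island score reaches the observed one. That reading, the island recurrence S(p) = max(S(p-1) + X(p), 0), the fixed seed, and the function names are assumptions made for illustration.

```python
import random

def top_island_score(x):
    """Highest island score in x under S(p) = max(S(p-1) + X(p), 0)."""
    best = score = 0.0
    for value in x:
        score = max(score + value, 0.0)
        best = max(best, score)
    return best

def permutation_p_value(x, runs=1000, seed=0):
    """Fraction of permuted runs whose top island scores at least the observed top."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    observed = top_island_score(x)
    shuffled = list(x)
    hits = 0
    for _ in range(runs):
        rng.shuffle(shuffled)
        if top_island_score(shuffled) >= observed:
            hits += 1
    return hits / runs

# Toy data: five adjacent elevated probes inside a negative background.
p_value = permutation_p_value([-1.0] * 20 + [1.0] * 5 + [-1.0] * 20)
```

Shuffling destroys the genomic ordering but keeps the set of values, so a low fraction suggests the island depends on neighbouring probes being elevated together.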
Experiment
• Test Group
– DNA from subjects with well-characterized
monosomies
• Control groups
• Data analyzed using 2 methods
– Threshold
– SW-ARRAY
Experiment Results
• Threshold Method
– 78.1% correct identification of copy-number
changes
• SW-ARRAY
– Identified 13/14 of the monosomic regions
with high significance levels in the 14 blind
tests
Ideal Conditions for SW-ARRAY
• Numerous probes border the region of copy-number
change
• Long sequences, for which edge effects are
minimized
Output