
New Protein Structure Prediction Method Using Inter-Residue Distances and a Theoretical Investigation of the Isomerizati...

Permanent Link: http://ufdc.ufl.edu/UFE0021713/00001

Material Information

Title: New Protein Structure Prediction Method Using Inter-Residue Distances and a Theoretical Investigation of the Isomerization of Azobenzene and Disubstituted Azobenzenes
Physical Description: 1 online resource (227 p.)
Language: english
Publisher: University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: 2008

Subjects

Subjects / Keywords: azobenzene, casp, isomerization, protein, rosetta
Chemistry -- Dissertations, Academic -- UF
Genre: Chemistry thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract: It is often claimed that knowing a protein's structure is important in understanding its function. The experimental structure determination methods presently available can be costly and time-consuming. This dissertation presents an idea for a fast and inexpensive protein structure prediction method that combines modeling with a minimal set of experimental data. Our method involves three steps: (1) building a decoy set (a set of protein-like structures), (2) measuring inter-residue distances, and (3) comparing the measured distances with those calculated in each decoy. We postulate that structures with a small number of similar inter-residue distances will also have similar three-dimensional structure. We further hypothesize that the minimum number of distances needed to determine structure is much smaller than the total number of inter-residue distances in the protein. To develop our protocol, we searched the decoy set for target proteins whose structures have been solved experimentally but have not been explicitly included in our decoy set. We simulated experimental data by calculating alpha-carbon distances from the experimentally determined structures of our target proteins. We have created a large, generalized decoy set using most of the structures in the Protein Data Bank. This decoy set can be used to study any protein composed of 100 residues or fewer. Using this decoy set, we attempted to predict structures for several proteins. We also analyzed the RMSD distributions of the decoys using the search proteins as references and found the distributions to be similar for each protein. Of the nearly five thousand alpha-C-alpha-C distances in a 100-residue protein, knowledge of only twenty-five selected distances will usually result in predicting a reliable model. In the second part of our study, results are presented for a series of azobenzenes which were studied using ab initio methods to determine the substituent effects on the isomerization pathways. 
Energy barriers were determined from three-dimensional potential energy surfaces of the ground and electronically excited states. In the ground state (S0), the inversion pathway was found to be preferred. Results show that electron-donating substituents increase the isomerization barrier along the inversion pathway, while electron-withdrawing substituents decrease it. The inversion pathway of the first excited state (S1) showed trans-to-cis barriers with no curve crossing between the S0 and S1 states. In contrast, a conical intersection was found between the ground and first excited states along the rotation pathway for each of the azobenzenes studied. No barriers were found in this pathway and we therefore postulate that after n to pi* (S0 to S1) excitation, the rotation mechanism dominates. Upon pi to pi* (S0 to S2) excitation, there may be sufficient energy to open an additional pathway (concerted-inversion) as proposed by Diau. This pathway is only accessible for unsubstituted azobenzene and 4,4'-dinitroazobenzene. Because of the S0 and S1 curves crossing on the trans side, the concerted-inversion channel explains the experimentally observed difference in trans-to-cis quantum yields between S1 and S2 excitations. The concerted-inversion channel is not available to the remaining azobenzenes and so they must employ the rotation pathway for both n to pi* and pi to pi* excitations.
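The three-step protein method described in the first part of the abstract (build decoys, measure distances, compare) can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the function names, the toy coordinates, and the 0.5 Å tolerance are assumptions chosen for illustration. Each decoy is scored by how many measured alpha-carbon distances it reproduces within the tolerance, and the sketch also checks the pair count behind the "nearly five thousand" figure for a 100-residue protein.

```python
import math

def n_pairs(n_residues):
    """Number of distinct residue pairs, n(n-1)/2."""
    return n_residues * (n_residues - 1) // 2

def count_satisfied(coords, constraints, tolerance):
    """Count constraints (i, j, measured) that a decoy reproduces within tolerance."""
    return sum(
        1
        for i, j, measured in constraints
        if abs(math.dist(coords[i], coords[j]) - measured) <= tolerance
    )

def rank_decoys(decoys, constraints, tolerance):
    """Return (score, name) pairs, highest score first; ties broken by name."""
    scored = [
        (count_satisfied(coords, constraints, tolerance), name)
        for name, coords in decoys.items()
    ]
    return sorted(scored, key=lambda item: (-item[0], item[1]))

# Hypothetical four-residue "target" (alpha-carbon coordinates in angstroms).
target = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (7.6, 3.8, 0.0)]

# Simulated measurements: distances taken from the target for three chosen pairs.
pairs = [(0, 2), (0, 3), (1, 3)]
constraints = [(i, j, math.dist(target[i], target[j])) for i, j in pairs]

# Two hypothetical decoys: one near the target fold, one fully extended.
decoys = {
    "close": [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (7.6, 3.5, 0.0)],
    "far": [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)],
}

ranked = rank_decoys(decoys, constraints, tolerance=0.5)  # assumed tolerance
print(n_pairs(100))  # 4950
print(ranked)        # [(3, 'close'), (1, 'far')]
```

Counting satisfied constraints, rather than discarding a decoy at the first violated one, keeps a single outlying measurement from eliminating an otherwise good structure. The tolerance plays the role of a distance acceptance range and would need tuning for real data.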
General Note: In the series University of Florida Digital Collections.
General Note: Includes vita.
Bibliography: Includes bibliographical references.
Source of Description: Description based on online resource; title from PDF title page.
Source of Description: This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Thesis: Thesis (Ph.D.)--University of Florida, 2008.
Local: Adviser: Roitberg, Adrian E.

Record Information

Source Institution: UFRGP
Rights Management: Applicable rights reserved.
Classification: lcc - LD1780 2008
System ID: UFE0021713:00001


NEW PROTEIN STRUCTURE PREDICTION METHOD USING INTER-RESIDUE
DISTANCES AND A THEORETICAL INVESTIGATION OF THE ISOMERIZATION OF
AZOBENZENE AND DISUBSTITUTED AZOBENZENES




















By

CHRISTINA R. CRECCA


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2008





































© 2008 Christina R. Crecca




























To my husband Chris










ACKNOWLEDGMENTS

At the completion of this work, I take great pleasure in acknowledging the people who
have supported me over the last few years. I gratefully thank and acknowledge my advisor,
Prof. Adrian Roitberg, for his continual guidance, support, understanding, and encouragement.
I would also like to thank my committee members Dr. Chang, Dr. Cao, Dr. Fanucci, and Dr.
Polfer.

During my time at QTP I have made many great friends. Without their support, I do not
think I would have made it. I would like to give special thanks to Andrew, Dan, Georgios,
Gustavo, Hui, Joey, Josh, Julio, Kelly, Ken, Lena, Lex, Mehrnoosh, Ozlem, Seonah, Tom, and
Yilin. I would also like to thank my family, especially my nieces and nephews, Gabe, Savanna,
A. J., and Anna.

I would also like to thank Dr. Eric Deumens for his infinite patience and understanding. I
apologize to all the computers that were harmed during this work, particularly Arwen and
Cobalt.

Our work was supported in part by DOE contract DE-F602-02ER45995 and a University
of Florida Alumni Fellowship. Computer resources were provided by the University of Florida
High Performance Computing Center as well as the Large Resource Allocations Committee
through grant TG-MCA05S010.












TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
ABSTRACT

CHAPTER

1 INTRODUCTION TO PROTEIN STRUCTURE PREDICTION METHODS

   1.1 Background Information on Proteins
   1.2 Experimental Methods
      1.2.1 Structure Determination
         1.2.1.1 X-ray crystallography
         1.2.1.2 Nuclear magnetic resonance (NMR)
      1.2.2 Distance Measurements
         1.2.2.1 Nuclear Overhauser effects (NOE) from NMR
         1.2.2.2 Electron paramagnetic resonance (EPR)
         1.2.2.3 Fluorescence resonance energy transfer (FRET)
         1.2.2.4 Chemical cross-linking with mass spectrometry
   1.3 Methods of Structure Prediction
      1.3.1 Homology Modeling
      1.3.2 Fold Recognition Methods (Threading)
      1.3.3 Ab Initio Methods
         1.3.3.1 Rosetta
         1.3.3.2 Databases to test scoring functions
      1.3.4 Distance Geometry
      1.3.5 Chemical Cross-Linking with MS
      1.3.6 Our Method
   1.4 Critical Assessment of Techniques for Protein Structure Prediction (CASP)

2 METHODS FOR PROTEIN STRUCTURE PREDICTION

   2.1 Decoy Generation
      2.1.1 General Decoy Set
      2.1.2 Specific Decoy Set
   2.2 Decoy Discrimination
   2.3 Choosing Constraints
   2.4 Comparing Results











3 TRIALS AND ERRORS: DEVELOPING THE METHOD

   3.1 Testing the Method on Previously Constructed Databases
      3.1.1 Number of Structures Satisfying Specific Constraints
      3.1.2 Effects of Applying Constraints in Different Orders
         3.1.2.1 Randomly ordered constraints
         3.1.2.2 Same constraints in different order
   3.2 Developing a Search Protocol Using a Structure Known to Be in Our Database
   3.3 Developing a Search Protocol Using a Structure Not in Our Database
      3.3.1 Constraint Distance Acceptance Ranges: +/- 2 Å and +/- 4 Å
      3.3.2 Calculation of All RMSDs
      3.3.3 Constraint Distance Acceptance Range of +/- 12 Å and +/- 12 Å & +/- 10 Å
      3.3.4 Block of Distances
      3.3.5 Vary the Order of Constraint Application
      3.3.6 Count the Number of Satisfied Constraints for Each Decoy
   3.4 Determination of an Average RMSD Distribution
   3.5 Summary of Methods

4 RESULTS: USING OUR DECOY SET TO FIND FOUR PROTEINS

   4.1 Completeness of Decoy Set
   4.2 Evaluation of Decoy Discrimination
      4.2.1 Target 1b4c, Apo-S100P
      4.2.2 Target 1ghh, DNA-Damage-Inducible Protein I (DinI)
      4.2.3 Target 1ubi, Ubiquitin
      4.2.4 Target 2ezk, Mu End DNA-Binding Ibeta Subdomain of Phage Mu Transposase
      4.2.5 Comparison of Search Process for All Target Proteins
   4.3 Conclusions

5 RESULTS: USING SPECIFIC DECOY SETS TO FIND FOUR PROTEINS

   5.1 Parameter Optimizations
      5.1.1 Decoy Set Size
      5.1.2 Constraint Distance Acceptance Range
         5.1.2.1 Twelve constraints
         5.1.2.2 Twenty-five constraints
   5.2 Search Results
      5.2.1 Target 1b4c
      5.2.2 Target 1ghh
      5.2.3 Target 1ubi
      5.2.4 Target 2ezk
   5.3 Conclusions












6 RESULTS: USING GENERAL AND SPECIFIC DECOY SETS TO STUDY TWELVE CASP7 TARGETS

   6.1 General Decoy Set
      6.1.1 Targets That Worked
         6.1.1.1 Target T288
         6.1.1.2 Target T340
         6.1.1.3 Target T359
         6.1.1.4 Target T309
         6.1.1.5 Target T335
         6.1.1.6 CASP comparisons
      6.1.2 Targets That Could Have Worked But Did Not
         6.1.2.1 Target T348
         6.1.2.2 Target T349
         6.1.2.3 Target T358
         6.1.2.4 CASP comparisons
      6.1.3 Targets That Never Had a Chance
         6.1.3.1 Target T306
         6.1.3.2 Target T311
         6.1.3.3 Target T353
         6.1.3.4 Target T363
         6.1.3.5 CASP comparisons
      6.1.4 Summary of Results for General Decoy Set
   6.2 Specific Decoy Sets
      6.2.1 Targets That Worked
      6.2.2 Targets That Did Not Work
      6.2.3 Targets That Never Had a Chance
      6.2.4 Summary of Results Using the Specific Decoy Set
   6.3 Comparisons of Decoy Sets

7 COMPARISONS OF GENERAL AND SPECIFIC DECOY SETS

   7.1 Comparing the performance of the general and specific decoy sets on four targets
   7.2 Results for CASP7

8 AZOBENZENE ISOMERIZATION

   8.1 Isomerization Mechanism
   8.2 Applications of Azobenzenes in Biomolecules

9 COMPUTATIONAL DETAILS

   9.1 Ground-State Calculations
   9.2 Excited-State Calculations











10 RESULTS: UNSUBSTITUTED AZOBENZENE

   10.1 Optimized Ground-State Geometry
   10.2 Electronic Excitation Energies
   10.3 Potential Energy Surfaces
      10.3.1 Ground State
      10.3.2 Excited State 1 (n → π*)
         10.3.2.1 Rotation pathway
         10.3.2.2 Inversion pathway
      10.3.3 Excited State 2 (π → π*)
         10.3.3.1 Rotation pathway
         10.3.3.2 Inversion pathway
         10.3.3.3 Concerted inversion pathway
   10.4 Summary of Unsubstituted Azobenzene

11 RESULTS: SUBSTITUTED AZOBENZENES

   11.1 Optimized Ground-State Geometry
      11.1.1 NN Distance
      11.1.2 NNC Angle, CNNC Dihedral Angle, and NNCC Dihedral Angle
      11.1.3 Relative Energy Differences
   11.2 Comparison of Charges
   11.3 Electronic Excitation Energies
   11.4 Potential Energy Surfaces
      11.4.1 Ground State
      11.4.2 Excited State 1
         11.4.2.1 Rotation pathway
         11.4.2.2 Inversion pathway
      11.4.3 Excited State 2
         11.4.3.1 Rotation pathway
         11.4.3.2 Inversion pathway
         11.4.3.3 Concerted-inversion pathway
   11.5 Summary of Substituted Azobenzenes

12 AZOBENZENE CONCLUSIONS

APPENDIX

A LIST OF CONSTRAINTS

LIST OF REFERENCES

BIOGRAPHICAL SKETCH










LIST OF TABLES

Table

3-1 Comparison of input for the four target proteins
3-2 RMSDs for decoys satisfying the most constraints
3-3 Lowest RMSD decoys in database using 1b4c as a reference
3-4 Decoys remaining after 32 constraints using the block method
3-5 Lowest RMSD decoys found in varying the order of constraint application
3-6 Lowest RMSD decoys found using the count method for both trials
4-1 Number of decoys with RMSDs under each threshold
4-2 Summary of results
5-1 The RMSD ranges
5-2 Comparison of scores for each protein with different acceptance ranges
6-1 Results for 12 targets
6-2 JPred predictions compared to target structures
6-3 Results for each of the 12 targets
6-4 Comparison of results for each target using both types of decoy sets
10-1 Optimized geometries of cis and trans isomers of azobenzene
10-2 Vertical excitation energies (eV) of trans and cis azobenzene
11-1 Optimized geometries of cis and trans isomers of azobenzenes
11-2 Vertical excitation energies in eV of trans and cis azobenzenes
11-3 Cis → trans energy barriers calculated along the inversion and rotation pathways
11-4 Dipole moments of the inversion transition state and cis isomer
11-5 Distances of transition states along the rotation and inversion pathways
11-6 Rotational energy barriers in the first excited state
11-7 Placement and energy of first excited state minimum of the conical intersection











11-8 Trans → cis inversion energy barriers in the first excited state
11-9 Trans → cis energy barriers calculated along the inversion and rotation pathways on
the second excited state surface
11-10 Energy differences between S1 and S2
11-11 Energies of the S1 and S2 minima, conical intersections, barrier heights, and available
energy
A-1 List of distances for targets T288 and T306
A-2 List of distances for targets T309 and T335
A-3 List of distances for target T340
A-4 List of distances for target T349
A-5 List of distances for targets T348 and T353
A-6 List of distances for targets T358 and T363
A-7 List of distances for targets T359 and T311











LIST OF FIGURES

Figure

1-1 Diagram of an amino acid (alanine)
1-2 Organization of protein structure
2-1 How decoys are generated from a single protein
3-1 Results of counting the number of decoys that satisfy each constraint
3-2 Application of randomly ordered constraints for 1bba
3-3 Results using the same set of constraints in different orders
3-4 Superimposed images of the results of the 2ezm search
3-5 Results from Trial 1
3-6 Target protein and the final four remaining decoys after 13 constraints with a +/- 4 Å
distance range
3-7 Histogram of RMSDs for all decoys in the database using 1b4c as a reference
3-8 Decoys with the lowest RMSDs in database using 1b4c as a reference
3-9 How an insertion in a loop region can affect the search process
3-10 Decoy 1mka-49
3-11 Number of decoys vs. the number of constraints each decoy satisfies for both trials
3-12 Five decoys used to determine a random average RMSD for our decoy database
3-13 Histograms of RMSDs for five randomly chosen decoys, 1b7u, 1fxh, 1rt6, 1ujn,
2wrp
4-1 Histograms of RMSDs for all studied proteins, 1ghh, 1ubi, 2ezk, and 1b4c
4-2 Target 1b4c and top scoring decoys
4-3 Target 1ghh and top scoring decoys
4-4 Target 1ubi and top scoring decoys
4-5 Target 2ezk and top scoring decoys
4-6 Analysis of the scoring procedure











4-7 Relationship between RMSD and score
5-1 Distribution of RMSDs for all four target proteins for the 10,000 decoy sets
5-2 Lowest RMSD structures in the 10,000 decoy set
5-3 The number of structures remaining vs. score for each protein
5-4 Correlation between score and RMSD
5-5 Average RMSD for each protein at different scores
5-6 Top scoring decoys for 1b4c
5-7 Representation of the β-sheet orientation for the native structure of target protein
1ghh and the top scoring decoys
5-8 Top scoring decoy, #3631, for 1ubi with a high RMSD, 12.6 Å
5-9 Top scoring decoys for 2ezk
6-1 Distribution of RMSDs for each target protein
6-2 Target T288 and the top scoring decoys for T288
6-3 Target T340 and some of the top scoring decoys
6-4 Target T359 and its top scoring decoys
6-5 Target T309 and its top scoring decoys
6-6 Target T335 and its top scoring decoys
6-7 Use of Global Distance Test (GDT) analysis for Targets T288, T340, T359, T309,
and T335
6-8 Target T348, lowest RMSD decoys in the database, and the top scoring decoys
6-9 Target T349, lowest RMSD decoys in the database, and the top scoring decoys
6-10 Target T358, lowest RMSD decoys in the database, and the top scoring decoys
6-11 Use of Global Distance Test (GDT) analysis for Targets T348, T349, and T358
6-12 Target T306, best decoy in database, and top scoring decoys
6-13 Target T311, best decoy in database, and top scoring decoy
6-14 Target T353, best decoy in database, and top scoring decoys











6-15 Target T363, best decoy in database, and a top scoring decoy
6-16 Use of Global Distance Test (GDT) analysis for Targets T306, T311, T353, and
T363
6-17 Histogram of Cα RMSDs for all twelve CASP targets
6-18 Top scoring decoys for targets that worked
6-19 Results for T288
6-20 Results for T348
6-21 Results for T359
6-22 Results for T363
6-23 Results for T340
6-24 Results for T353
6-25 Results for T306
6-26 Results for T309
8-1 Diagram of the rotation and inversion pathways of the trans → cis isomerization of
azobenzenes
8-2 Structures of compounds investigated in this work
10-1 Molecular orbitals of Azo involved in the S1 ← S0 and S2 ← S0 transitions
10-2 Ground state potential energy surface of Azo
10-3 First excited state potential energy surface of Azo
10-4 Diagram of pathways in the first excited state of Azo
10-5 Conical intersection of S0 and S1 states of Azo
10-6 Second excited state potential energy surface of Azo
10-7 Rotation, inversion, and concerted-inversion pathways of Azo
10-8 Scheme of the trans → cis isomerization process after n → π* excitation and π → π*
excitation
11-1 Comparison of charge differences in trans isomers of the substituted azobenzenes










11-2 Molecular orbitals involved in the Sit So and S26 So transitions for AzoNO2 H2
and AzoNO2N O2 ................. ...............187................

11-3 Contour map s of the ground state of Azo and sub stituted azob enzene s................... .......18 8

11-4 Schematic diagram of the molecular orbitals of the inversion transition state. ...............189

11-5 Contour maps of the first excited state of Azo and substituted azobenzenes. ................. 190

11-6 Contour maps of the second excited state of Azo and substituted azobenzenes .............192

11-7 Rotation pathway along the angle of the ground state minimum of Azo and
substituted azobenzenes ................. ...............194................

11-8 Inversion pathway along the dihedral of the ground state minimum of Azo and
substituted azobenzenes ................. ...............196................

11-9 Concerted-inversion pathway along the dihedral of the ground state minimum of Azo
and substituted azobenzenes. ............. ...............198....

11-10 Scheme of the trans~cis isomerization process for Azon, Azonco, and AzoNO22z. .200









LIST OF ABBREVIATIONS

Azo          Unsubstituted azobenzene
Azon         4,4'-diaminoazobenzene
Azonco       N-[4-(4-(Acetylamino)phenylazo)phenyl]-acetamide
AzoNO2NH2    4,4'-nitro-aminoazobenzene
AzoNO2NO2    4,4'-dinitroazobenzene
CASP         Critical assessment of techniques for protein structure prediction
CASSCF       Complete active space self-consistent field
CHelpG       Charges from electrostatic potentials (grid-based method)
DFT          Density functional theory
EPR          Electron paramagnetic resonance
FRET         Fluorescence resonance energy transfer
GDT          Global distance test
LCS          Longest continuous segment
LCS-5        LCS under 5 Å
LGA          Local-global alignment
MS           Mass spectrometry
NOEs         Nuclear Overhauser effects
NMR          Nuclear magnetic resonance
PDB          Protein Data Bank
RDC          Residual dipolar couplings
RMSD         Root mean square deviations
SDSL         Site-directed spin labeling
TDDFT        Time dependent density functional theory









Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

NEW PROTEIN STRUCTURE PREDICTION METHOD USING INTER-RESIDUE
DISTANCES AND A THEORETICAL INVESTIGATION OF THE ISOMERIZATION OF
AZOBENZENE AND DISUBSTITUTED AZOBENZENES

By

Christina Crecca

May 2008

Chair: Adrian Roitberg
Major: Chemistry

It is often claimed that knowing a protein's structure is important in understanding its

function. The experimental structure determination methods presently available can be costly

and time-consuming. This dissertation presents an idea for a fast and inexpensive protein

structure prediction method that combines modeling with a minimal set of experimental data.

Our method involves three steps: (1) building a decoy set (a set of protein-like structures), (2)

measuring inter-residue distances, and (3) comparing the measured distances with those

calculated in each decoy. We postulate that structures with a small number of similar inter-

residue distances will also have similar three-dimensional structure. We further hypothesize that

the minimum number of distances needed to determine structure is much less than the total

number of inter-residue distances in the protein. To develop our protocol, we searched the decoy

set for target proteins whose structures have been solved experimentally but have not been

explicitly included in our decoy set. We simulated experimental data by calculating α-carbon

distances from the experimentally determined structures of our target proteins.

We have created a large, generalized decoy set using most of the structures in the Protein

Data Bank. This decoy set can be used to study any protein composed of 100 residues or less.










Using this decoy set, we attempted to predict structures for several proteins. We also analyzed

the RMSD distributions of the decoys using the search proteins as references and found the

distributions to be similar for each protein. Of the nearly five thousand Cα-Cα distances in a 100

residue protein, knowledge of only twenty-five selected distances will usually result in predicting

a reliable model.
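As a rough illustration of step (3), the sketch below ranks decoys by how well a small set of simulated Cα-Cα distances is reproduced. The function names, the random toy coordinates, and the root-mean-square scoring function are assumptions made for illustration only; they are not taken from the protocol developed in this work.

```python
import itertools
import numpy as np

def pair_distances(coords, pairs):
    """Distances for the listed residue pairs; coords is an (N, 3) array of C-alpha positions."""
    i, j = np.array(pairs).T
    return np.linalg.norm(coords[i] - coords[j], axis=1)

def score_decoys(measured, pairs, decoys):
    """RMS difference between measured and decoy distances (lower is better)."""
    return np.array([
        np.sqrt(np.mean((pair_distances(d, pairs) - measured) ** 2))
        for d in decoys
    ])

n_res = 100
all_pairs = list(itertools.combinations(range(n_res), 2))
n_pairs = len(all_pairs)  # 100 * 99 / 2 = 4950 possible C-alpha pairs

rng = np.random.default_rng(0)
target = rng.normal(scale=10.0, size=(n_res, 3))  # toy stand-in for an experimental structure
decoys = [rng.normal(scale=10.0, size=(n_res, 3)) for _ in range(50)]
decoys.append(target + rng.normal(scale=0.5, size=(n_res, 3)))  # one near-native decoy

# Pick 25 pairs at random and "measure" them on the target structure.
chosen = [all_pairs[k] for k in rng.choice(n_pairs, size=25, replace=False)]
measured = pair_distances(target, chosen)

scores = score_decoys(measured, chosen, decoys)
best = int(np.argmin(scores))  # index of the best-scoring decoy
```

In this toy setting the near-native decoy (index 50) scores best with only 25 of the 4950 possible distances, which is the behavior the hypothesis above anticipates for real decoy sets.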

In the second part of our study, results are presented for a series of azobenzenes which

were studied using ab initio methods to determine the substituent effects on the isomerization

pathways. Energy barriers were determined from three-dimensional potential energy surfaces of

the ground and electronically excited states. In the ground state (S0), the inversion pathway was
found to be preferred. Results show that electron donating substituents increase the
isomerization barrier along the inversion pathway, while electron withdrawing substituents
decrease it. The inversion pathway of the first excited state (S1) showed trans→cis barriers with
no curve crossing between S0 and S1. In contrast, a conical intersection was found between
the ground and first excited states along the rotation pathway for each of the azobenzenes
studied. No barriers were found in this pathway and we therefore postulate that after n→π*
(S1←S0) excitation, the rotation mechanism dominates. Upon π→π* (S2←S0) excitation, there
may be sufficient energy to open an additional pathway (concerted-inversion) as proposed by
Diau. This pathway is only accessible for unsubstituted azobenzene and 4,4'-dinitroazobenzene.
Because of the S0 and S1 curves crossing on the trans side, the concerted inversion channel
explains the experimentally observed difference in trans-to-cis quantum yields between S1 and S2
excitations. The concerted inversion channel is not available to the remaining azobenzenes and
so they must employ the rotation pathway for both n→π* and π→π* excitations.









CHAPTER 1
INTRODUCTION TO PROTEIN STRUCTURE PREDICTION METHODS

We begin with a brief introduction to protein studies, including general information on
protein structure and methods of structure determination. We also discuss

experimental methods like Electron Paramagnetic Resonance (EPR), Fluorescence Resonance

Energy Transfer (FRET), Nuclear Overhauser Effects (NOE) from Nuclear Magnetic Resonance

(NMR), and chemical cross-linking with mass spectrometry, all of which can be used to measure

distances in proteins. Various methods of protein structure prediction are presented followed by

a summary of our proposed method.

1.1 Background Information on Proteins

The building blocks of proteins are the twenty naturally occurring α-amino acids. Each
amino acid residue has the same fundamental structure (Figure 1-1) containing a carboxyl group,
an amino group, and an α-carbon with an R group attached. Amino acids differ in their
R-substituent. Linking the carboxyl and amino groups of adjacent residues forms a peptide bond,
thereby joining amino acids in a linear fashion. However, some proteins containing cysteine
residues can form disulfide bonds, which result in the cross-linking (covalent bonding) of
nonadjacent residues.

The amino acid sequence of a protein is encoded by the DNA sequence of a gene and is

often referred to as the protein's primary structure (Figure 1-2A). The secondary structure is

composed of the regularly repeating local conformations generally stabilized by hydrogen bonds.

The most typically seen secondary structural elements are the α-helix and the β-sheet (Figure
1-2B). A single protein may have many regions of differing secondary structure and how these

regions relate to one another is described by their tertiary structure (Figure 1-2C). Thus, tertiary

structure is the overall shape of a single-chain protein. Stabilizing this structure are many
non-local interactions. For example, to minimize their exposure to water, hydrophobic residues

retreat to the protein's core. Salt bridges, hydrogen bonds, and disulfide bonds are also formed

to help stabilize the structure. Proteins composed of two or more polypeptide chains may exhibit

quaternary structure, which refers to the spatial arrangement of these chains (Figure 1-2D).

1.2 Experimental Methods

1.2.1 Structure Determination

To understand the function of a protein on a molecular level, it helps to have some

knowledge of its structure.1,2 In the late 1950s and early 1960s, the first protein structures were

determined via X-ray crystallography. The structure of myoglobin was solved by Sir John

Cowdery Kendrew,3 while Max Perutz gave us the structure of hemoglobin.4 So important were

their discoveries that Perutz and Kendrew were awarded the Nobel Prize in 1962.

1.2.1.1 X-ray crystallography

There are three steps in determining the structure of a protein from X-ray crystallography.

First, a suitable crystal must be grown, which is often the most difficult and time consuming

step.5 The quality of the crystal is hard to assess until the diffraction pattern is obtained, but

ideally it will be pure and free from imperfections, regular in structure, and suitable in size.

Next, a beam of X-rays is fired at the crystal thereby producing a regular array of

reflections (a pattern of spots) that can be seen on a screen behind the crystal. One of the

advantages of using crystals is that they are periodic and therefore composed of many repeating

unit cells. Constructive interference due to this periodicity serves to amplify the weak scattering

of the individual unit cells into a more powerful, coherent reflection. The relative intensities of

the reflections are important in determining the arrangement of the molecules in the crystal.

They can be recorded using an area detector, charge-coupled device image sensor, or

photographic film. After the intensity of each spot is recorded, the crystal is rotated slightly to










produce another set of reflections, whose intensities are also recorded. One image of spots does

not provide enough information to determine the molecular structure of the crystal because it

only represents a small piece of the full Fourier transform. Therefore, this process is continued

for more than a 180° rotation of the crystal. One should also change the axis of rotation at least

once to avoid a blind spot in reciprocal space near the rotation axis. Often, several sets of

diffraction patterns may be collected.

Finally, the data can be analyzed. The reflections from all the recordings must be indexed

by identifying the dimensions of the unit cell. An autoindexing algorithm6 is usually employed

to determine which image peak corresponds to which position in reciprocal space. All the

images of all the reflections are converted into a single file containing the Miller index of each

reflection along with its intensity. Hundreds of separate images of the crystal are taken at

different orientations leading to many symmetry-related reflections being recorded multiple

times. One must find which peaks appear in two or more images.

Because the diffraction data is a reciprocal space representation of the crystal lattice, the

location of each spot is determined by the size and shape of the unit cell as well as the symmetry

in the crystal. The intensity of each spot is proportional to the square of the structure factor
amplitude. The structure factor itself is a complex number that contains information relating to
the amplitude and phase of a wave. Both the amplitude and phase must be known in order to generate an electron

density map, which is then used to build a starting model of the protein structure. A potential

problem in X-ray crystallography is that the phase cannot be directly recorded during the

experiment.
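The phase problem can be illustrated numerically. The sketch below is a minimal one-dimensional toy, assuming that a discrete Fourier transform of a sampled density can stand in for the crystallographic transform; the grid size and the two Gaussian "atoms" are hypothetical. It shows that the density is recovered when both amplitudes and phases are available, but not from the amplitudes alone.

```python
import numpy as np

# Toy one-dimensional "electron density" on a unit cell of 64 grid points,
# built from two Gaussian "atoms" of different heights.
x = np.arange(64)
rho = (8.0 * np.exp(-0.5 * ((x - 12) / 1.5) ** 2)
       + 6.0 * np.exp(-0.5 * ((x - 40) / 1.5) ** 2))

F = np.fft.fft(rho)          # complex structure factors
amplitude = np.abs(F)
phase = np.angle(F)
intensity = amplitude ** 2   # what the detector records; the phases are lost

# With amplitudes and phases, the density is recovered.
rho_full = np.fft.ifft(amplitude * np.exp(1j * phase)).real

# With amplitudes only (all phases set to zero), the resulting map is wrong.
rho_no_phase = np.fft.ifft(amplitude).real
```

The zero-phase map places its main peak at the origin regardless of where the atoms are, which is why the phasing strategies described next are needed.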

The phase problem can be overcome in several ways. (1) Ab initio phasing (direct methods)7,8
is often applied to small proteins by exploiting known phase relationships between specific









groups of reflections to determine the needed phase information. (2) In molecular replacement,9

the structure of a related protein can be used as a model to determine the orientation and position

of the molecules in the unit cell. Phase information is then obtained and used to build an

electron density map. (3) Anomalous scattering (multiwavelength anomalous dispersion, MAD)10
involves incorporating anomalously scattering atoms like selenium into the protein; the

scattering is changed in a known way. The position of the anomalously diffracting atoms can be

found easily thereby providing their initial phases. (4) Similar to MAD phasing, heavy atoms

can be incorporated into the protein. The changes in the scattering amplitudes can be used to

determine the phases. MAD phasing with selenomethionine is now more commonly used than

heavy atom replacement.10

Initial models can be built after the initial phases have been obtained. The models are then

used to refine the phases, which are used to further improve the models. The B-factor is an

estimate of the thermal motion of the atom and must be included in the phase refinement process.

The R-factor is a measure of the agreement between the crystallographic model and the

diffraction data, and depends highly on the resolution of the data. It should also be noted that it

is not always possible to see every atom in the molecule because the electron density is an

average over all molecules in the crystal. Sometimes atoms exist in several conformations

causing their electron density to appear smeared. On the other hand, they may appear multiple

times in an electron density map.

In summary, the magnitude of the Fourier transform of electron density is found from the

multiple recorded intensities of the reflections. The full Fourier transform of the electron density

comes from combining the phases and magnitudes. The electron density is converted into the









arrangement of atoms in the crystal and then the determined crystal structure is stored in a

publicly accessible database, like the Protein Data Bank (PDB).

1.2.1.2 Nuclear magnetic resonance (NMR)

Structure determination using NMR can be performed in five steps. (1) Prepare the protein

solution; unlike X-ray crystallography, Nuclear Magnetic Resonance (NMR) can be performed

on samples either in solution or in solid state. (2) Take the NMR measurements. (3) Assign the

NMR signals to individual atoms in the molecule. (4) Identify conformational constraints, like

distances between hydrogen atoms. (5) Calculate the 3D structure based on experimental

constraints.11

When a nucleus is placed in a magnetic field, it can exist in one of a small number of

allowed orientations (states) with different energy. The nucleus of a hydrogen atom has only two

allowed orientations; the magnetic moment of the nuclei can either align parallel to the external

magnetic field or point in the opposite direction. Some nuclei will be oriented parallel and others

anti-parallel giving rise to a small polarization of nuclear spins and thus a net macroscopic

magnetization, which can be manipulated with the appropriate electromagnetic waves. Quantum

mechanically speaking, the external magnetic field causes the ground state energy level of the

nucleus to split into two spin states (for nuclei with S = 1/2) that have an energy difference of hν.

This energy gap can be measured by applying electromagnetic radiation of frequency v (usually

in the radio-frequency range) causing the nuclei to be excited from the lower energy level to the

upper one. This process is classically described as flipping the spin of the nucleus between its

two spin states, spin up and spin down. The frequency is typically applied in several pulses, each

of which is a few microseconds long, causing the spins of the nuclei to flip. An NMR signal is

obtained after perturbing the equilibrium spin states. The signal decays as the system returns to









its equilibrium state. The signal is also called the free induction decay (FID), which is

essentially the sum of decaying cosine waves whose frequencies correspond to the resonance

frequencies of the nuclei. A Fourier transform of this data yields the NMR frequency

spectrum.12
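The relationship between the FID and the frequency spectrum can be sketched as follows. This is a minimal illustration, assuming an idealized noise-free signal; the sampling interval, the two resonance offsets, and the decay constant are hypothetical values chosen only to make the example run.

```python
import numpy as np

dt = 1e-3                      # sampling interval in seconds (assumed)
t = np.arange(2048) * dt
freqs_hz = [50.0, 120.0]       # hypothetical resonance offsets
decay_s = 0.2                  # hypothetical decay constant

# Free induction decay: a sum of exponentially decaying cosine waves.
fid = sum(np.cos(2 * np.pi * f * t) * np.exp(-t / decay_s) for f in freqs_hz)

# The Fourier transform of the FID gives the frequency spectrum.
spectrum = np.abs(np.fft.rfft(fid))
axis_hz = np.fft.rfftfreq(t.size, d=dt)

# Locate the two largest local maxima of the magnitude spectrum.
is_peak = (spectrum[1:-1] > spectrum[:-2]) & (spectrum[1:-1] > spectrum[2:])
peak_idx = np.where(is_peak)[0] + 1
top_two = peak_idx[np.argsort(spectrum[peak_idx])[-2:]]
peak_freqs = np.sort(axis_hz[top_two])
```

The two peaks fall within one frequency bin of the offsets used to build the signal, and a shorter decay constant would broaden them.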

Different types of nuclei have vastly different resonance frequencies. Protons (1H), for

example, resonate at a frequency four times higher than carbon nuclei (13C) and ten times that of

the nitrogen nuclei (15N). Much smaller resonance frequency differences occur between nuclei

of the same type. Such variations or chemical shifts are due to the interactions between the

nuclei and the surrounding electrons affecting the local magnetic field experienced by a

particular nucleus which in turn alters its resonance frequency. It is the chemical shift that

allows us to assign protons to different classes. For example, we can distinguish between amide

protons and those on methyl groups. The chemical shift is very sensitive to many structural,

electronic, magnetic, and dynamic variables and contains a lot of information on the state of the

system of interest.
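The frequency ratios quoted above follow from the gyromagnetic ratios of the nuclei. The sketch below uses commonly tabulated values of γ/2π (rounded, with magnitudes only) and an assumed field of 14.1 T; the function and variable names are illustrative.

```python
# Gyromagnetic ratios divided by 2*pi, in MHz per tesla.
# Magnitudes only: the 15N ratio is negative in sign.
GAMMA_MHZ_PER_T = {"1H": 42.577, "13C": 10.708, "15N": 4.316}

def larmor_mhz(nucleus, field_tesla):
    """Resonance frequency nu = (gamma / 2 pi) * B0, in MHz."""
    return GAMMA_MHZ_PER_T[nucleus] * field_tesla

b0 = 14.1  # tesla; an assumed field, roughly a "600 MHz" spectrometer
nu = {n: larmor_mhz(n, b0) for n in GAMMA_MHZ_PER_T}
ratio_h_to_c = nu["1H"] / nu["13C"]   # close to 4
ratio_h_to_n = nu["1H"] / nu["15N"]   # close to 10

def shift_ppm(nu_sample_hz, nu_reference_hz):
    """Chemical shift in parts per million relative to a reference frequency."""
    return (nu_sample_hz - nu_reference_hz) / nu_reference_hz * 1e6
```

On this scale, a proton resonating 600 Hz above a 600 MHz reference has a chemical shift of 1 ppm, which shows how small the shift differences are compared with the differences between nuclear species.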

The most important feature of NMR spectroscopy is that individual nuclei interact with the

small magnetic fields generated by the spins of the nuclei nearby. The different nuclei can be

correlated with one another in the molecule by using these spin-spin interactions. The nuclei

interactions are either through-space or through-bond. The through-space interactions are the

basis for the nuclear Overhauser effect (NOE), which allows for distance measurements between

hydrogen nuclei. The through-bond interactions are called spin-spin coupling or J coupling.

Both of these correlations form the basis for the analysis of protein spectra.

To determine the structure of proteins, multidimensional NMR must be used because it

provides spectra with improved resolution as well as more easily analyzed correlations.
Two-dimensional NMR for protein structure determination was primarily developed by Kurt Wüthrich, who shared the Nobel Prize in

chemistry in 2002.

There are four consecutive time periods in multidimensional NMR: excitation, evolution,

mixing, and detection. In the excitation period, the nuclear spins are prepared in the desired

state. The chemical shifts are then observed during the evolution period τ1. The spins are
correlated with each other during the mixing period and the chemical shift information of one
nucleus is transferred to another nucleus whose frequency is measured during τ2, the detection
period. Several experiments are run with successively incremented lengths of τ1. From this
information, a two-dimensional data set is obtained, from which a data matrix S(τ1, τ2) is
generated. The frequency spectrum, S(ω1, ω2), is obtained from a Fourier transform of S(τ1, τ2).

If two nuclei interact during the mixing time, the interaction will appear as a cross peak in the

resonance spectrum at a position corresponding to the resonance frequencies of the two nuclei.

Larger proteins are generally labeled with 15N and 13C, but the preferred nucleus for

detection is hydrogen because it is the most sensitive. During the evolution period, the other

nuclei, 15N and 13C, can be measured and their information is transferred to the protons for

detection. The chemical shift is sensitive to the environment of a nucleus. Thus, multiple copies

of the same amino acid in a protein can be distinguished due to the conformation dependent

chemical shift. The 1H, 15N, and 13C chemical shifts are known for many 3D NMR structures of

proteins and can be used in empirical and semi-empirical correlations with structural parameters.

To assign the spectra, knowledge of the protein sequence is necessary. For large [15N,

13C]-labeled proteins, through-bond correlations across the peptide-bond between sequential

amino acids can be used to assign the spectrum. Distance information and/or dihedral angles

must be derived from NMR data to calculate the protein's structure. Basic information about










protein structure, such as amino acid sequence, bond lengths, bond angles, chiralities, planar

groups, and steric repulsion between non-bonded atom pairs, is used in conjunction with NMR

data to do so. The crucial information comes from NOE distance measurements but

supplementary dihedral angle constraints can come from through-bond correlations. Chemical

shifts can also indicate the type of secondary structure that is present and through-bond

interactions can detect hydrogen-bonds. When NOEs are not prevalent, residual dipolar

couplings (RDCs) can be used. RDCs are related to the orientation of N-H and C-H internuclear

vectors relative to the molecular frame. Using more input constraints in the structure calculation

gives rise to higher quality structures.

The final step in this process is to calculate the structure. First a low resolution structure is

derived from an unambiguous subset of NOE data. Many computer programs are available for

this process and they are divided into two main groups:13 ones that use inter-atomic distances,14
like DISGEO and DISMAN,15,16 and ones that use torsional bond angles, like DYANA17 and
DIANA.18 The final result for each type of method is the Cartesian coordinates of the family of

structures which satisfy the set of NMR constraints. The experimental constraints do not specify

just one unique structure; instead, they describe a range of possible values. Also, some distances

cannot be determined. Due to these restrictions, an ensemble of structures satisfying the

constraints is typically generated by repeating the structure calculation several times. The best

ensemble samples all the conformational space the constraints allow.

Restrained molecular dynamics14,15 and distance geometry18,19 are two of the most common

approaches used in structure generation. Distance geometry is often used to generate an initial

structure for molecular dynamics. Using the distance constraints, an error function is minimized,

which depends on the sum of all differences between the distance constraint and the actual









distance. In restrained molecular dynamics, energy terms based on the NMR-derived constraints

are added to the classical molecular dynamics force fields. Usually, a combination of distance

geometry and molecular dynamics is used to calculate the structure of a protein.
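The kind of error function minimized in distance geometry can be sketched as below. This is an illustrative form only, assuming flat-bottomed restraints with lower and upper bounds and a squared penalty; the programs cited above use their own target functions, and the function name and toy coordinates here are hypothetical.

```python
import numpy as np

def restraint_error(coords, restraints):
    """Sum of squared violations of distance restraints.

    coords: (N, 3) array of atomic positions in angstroms.
    restraints: list of (i, j, lower, upper) tuples with bounds in angstroms.
    """
    total = 0.0
    for i, j, lower, upper in restraints:
        d = np.linalg.norm(coords[i] - coords[j])
        if d < lower:
            total += (lower - d) ** 2   # atoms too close
        elif d > upper:
            total += (d - upper) ** 2   # atoms too far apart
    return total

restraints = [(0, 1, 2.0, 4.0), (0, 2, 4.0, 6.0), (1, 2, 1.8, 5.0)]
good = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [3.0, 4.0, 0.0]])
bad = 2.0 * good   # every distance doubled, so all upper bounds are violated

error_good = restraint_error(good, restraints)
error_bad = restraint_error(bad, restraints)
```

A structure that satisfies every restraint has zero error, so many different conformations can minimize the function equally well, which is consistent with the ensemble of structures described above.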

1.2.2 Distance Measurements

1.2.2.1 Nuclear Overhauser effects (NOE) from NMR

Nuclear Overhauser effects are through-space correlations between nearby hydrogen atoms

in the protein. Unlike J couplings, the nuclei involved in the NOE can be separated greatly in the

protein sequence as long as they are close to each other in space. The NOE results from the

transfer of magnetization between spins interacting through their dipoles. The intensity of the

NOE is approximately proportional to r^-6, where r is the distance between the two interacting

nuclei. Because of the dependence on the inverse power of six of the distance, the intensity falls

off very quickly with increasing distance. As such, NOEs are only observed for small distances,

between 2 and 5 Å.12 The lower bound is essentially the sum of two hydrogen atomic radii. The

NOE distances are also classified as strong, medium, or weak depending on their intensity. The

distance ranges corresponding to each type of NOE are usually defined as less than 2.5 Å for
strong, 2.5-3.5 Å for medium, and greater than 3.5 Å for weak.20
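The distance classes and the steep distance dependence can be written out as a short sketch. The function names are illustrative, and the intensity function assumes a pure r^-6 dependence with an arbitrary reference distance.

```python
def noe_class(distance_angstrom):
    """Classify a proton-proton distance using the ranges quoted above."""
    if distance_angstrom < 2.5:
        return "strong"
    if distance_angstrom <= 3.5:
        return "medium"
    return "weak"

def relative_noe_intensity(distance_angstrom, reference_angstrom=2.5):
    """Intensity relative to a reference distance, assuming a pure r^-6 dependence."""
    return (reference_angstrom / distance_angstrom) ** 6

classes = [noe_class(d) for d in (2.2, 3.0, 4.5)]
falloff = relative_noe_intensity(5.0)   # (2.5 / 5.0)**6 = 1/64
```

Doubling the distance from 2.5 Å to 5 Å reduces the expected intensity by a factor of 64, which is why NOEs are limited to short distances.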

1.2.2.2 Electron paramagnetic resonance (EPR)

Electron paramagnetic resonance (EPR) is a magnetic resonance technique used to

investigate systems which possess unpaired electrons. This technique has been used in studies of

metal centers in metalloproteins as well as reaction-intermediates via spin trapping. The

introduction of site-directed spin labeling (SDSL)21 has allowed EPR to be applied to studies of

proteins that do not contain metal atoms. In SDSL, a cysteine residue is introduced into a

specific location in the protein sequence, which can then form a disulfide bond with a thiol-

specific nitroxide spin label, like methanethiosulfonate (MTSL).









The basic principles of EPR are similar to NMR, except EPR deals with the magnetic

moment of an unpaired, free electron instead of nuclear spins. An oscillating magnetic field

induces transitions between the two spin states. Energy is absorbed during the transition and the

first order derivative of the absorption spectrum is generally recorded.

To measure distances in proteins using EPR, two spin labels must be inserted. If the two

spin probes are close enough to each other they will experience dipole-dipole coupling

proportional to r^-3, the inverse cube of the distance between them.22,23 EPR can measure a wide
range of distances, from 8-20 Å for continuous wave techniques and 18-80 Å for pulsed
techniques.24

In continuous wave methods, distance information can be extracted by analyzing line

broadening caused by dipolar interactions. To obtain the dipolar interaction, three types of

labeled samples are needed: a protein labeled in site A, a protein labeled in site B, and a protein

labeled in both sites A and B. A spectrum with two spin labels is assumed to be the convolution

of the dipolar broadening function and the monoradical spectrum.25 To separate the dipolar

spectrum, a Fourier deconvolution method is used to subtract monoradical contaminants. This

yields a Pake distribution from which distance information can be obtained.

In pulsed EPR techniques, like double electron-electron resonance (DEER), distance

information is obtained by modulating a spin echo at the frequency of the dipolar interaction.

Analysis of spin-echo amplitude oscillations reveals such distance information.

There are many benefits to using EPR. The sample preparation is fairly easy compared to

other methods. Low temperatures are required but only one type of spin label is needed. EPR

often has higher precision than techniques like Fluorescence Resonance Energy Transfer

(FRET), which will be discussed in section 1.2.2.3. EPR can be performed with lower sample









volumes and concentrations than NMR and there are no molecular size limitations. EPR can be

used to measure medium (5-25 Å) to long range distances (25-80 Å).

One disadvantage of EPR is that the dynamics of the spin labels are unknown and highly

system dependent. To overcome this problem, molecular dynamics simulations can be used to

predict the orientation of the labels and their effects on the distance distributions. The

experimentally determined distances are those between the spin labels, not the α-carbons, but by

predicting the spin probe locations, one may be able to derive inter-residue distance information.

1.2.2.3 Fluorescence resonance energy transfer (FRET)

In fluorescence resonance energy transfer, energy is transferred from an excited state donor

(D) to a ground state acceptor (A). Long-range dipole-dipole interactions between the donor and

the acceptor cause the energy transfer, which occurs without the appearance of a photon. The

rate of energy transfer depends on four things: (1) the spectral overlap between the emission

spectrum of the donor and the absorption spectrum of the acceptor, (2) the quantum yield of the

donor, (3) the relative orientation of the donor and acceptor transition dipoles, and (4) the

distance between the donor and acceptor. In measuring inter-residue distances in proteins, it is

this distance dependence of FRET which is exploited.

Proteins can be covalently labeled with a donor, typically a tryptophan, and an acceptor

molecule. The distance between the labels is inferred from the efficiency of energy transfer,

which can be determined from steady-state measurements of the extent of donor quenching due

to the presence of the acceptor. Because such distances can be measured, FRET is often

described as a "spectroscopic ruler".26

The distance over which energy transfer occurs is similar to the dimensions of proteins.

The Förster distance (R0) is the distance at which FRET is fifty percent efficient and is usually









between 20 and 90 Å, depending on the specific donor and acceptor pair. Transfer efficiency can be
expressed in terms of distances (Equation 1-1), decay times (Equation 1-3), and intensities
(Equation 1-4). In Equation 1-1, R0 is the Förster distance while r is the distance between the
donor and acceptor. The rate of FRET strongly depends on the distance and is inversely
proportional to r^6. Any phenomenon that changes the donor-acceptor distance will also cause a
change in the transfer rate, which allows, for example, the study of conformational changes in
proteins. Energy transfer is assumed to occur if the distance between the donor and acceptor is
near the Förster distance and there is enough spectral overlap.



\[ E = \frac{R_0^6}{R_0^6 + r^6} \qquad \text{(1-1)} \]

The Förster distance can be calculated from spectral properties of the donor and acceptor
molecules as in Equation 1-2. The quantum yield of the donor in the absence of the acceptor is
Φ_D, J(λ) is the spectral overlap, n is the refractive index of the medium (generally taken to be 1.4
for biomolecules in aqueous solution), A is a constant, and κ² is the orientation factor, which
describes the relative orientation of the donor and acceptor transition dipoles. A dynamic
random average for κ² is assumed to be 2/3. The transfer efficiency is measured experimentally
via fluorescence lifetime or intensity and calculated using Equation 1-3 or Equation 1-4. The subscript DA
indicates the lifetime (τ_DA) or intensity (I_DA) of the donor in the presence of the acceptor, while the
subscript D indicates the lifetime (τ_D) or intensity (I_D) of the donor in the absence of the acceptor.










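\[ R_0^6 = \frac{A \kappa^2 \Phi_D J(\lambda)}{n^4} \qquad \text{(1-2)} \]

\[ E = 1 - \frac{\tau_{DA}}{\tau_D} \qquad \text{(1-3)} \]

\[ E = 1 - \frac{I_{DA}}{I_D} \qquad \text{(1-4)} \]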
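Equations 1-1 and 1-4 can be combined to turn a measured intensity ratio into a distance. The sketch below is a minimal illustration; the Förster distance of 50 Å and the intensity values are hypothetical, and the function names are chosen for this example only.

```python
def fret_efficiency(r, r0):
    """Equation 1-1: E = R0^6 / (R0^6 + r^6)."""
    return r0 ** 6 / (r0 ** 6 + r ** 6)

def distance_from_efficiency(e, r0):
    """Equation 1-1 solved for r; valid for 0 < E < 1."""
    return r0 * ((1.0 - e) / e) ** (1.0 / 6.0)

def efficiency_from_intensities(i_da, i_d):
    """Equation 1-4: E = 1 - I_DA / I_D."""
    return 1.0 - i_da / i_d

r0 = 50.0  # hypothetical Forster distance in angstroms
e_at_r0 = fret_efficiency(50.0, r0)                 # 50 percent efficient at r = R0
e_measured = efficiency_from_intensities(i_da=20.0, i_d=100.0)
r_est = distance_from_efficiency(e_measured, r0)    # donor-acceptor distance estimate
```

Because of the sixth-power dependence, the estimate is most sensitive when r is close to R0; far from R0 the efficiency saturates near 0 or 1 and small measurement errors translate into large distance errors.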



In summary, FRET can be used to measure long-range distances (20-90 Å) in proteins.

The actual distances measured are between the donor and acceptor molecules. Distances

between two α-carbons, however, can be inferred from this data by estimating the position of the

labels using molecular dynamics. Some degree of uncertainty is introduced into the

measurement when deriving such distances from the experimental data.

1.2.2.4 Chemical cross-linking with mass spectrometry

Chemical cross links are used to connect two polymer chains through covalent bonds. In

biochemistry, they are often employed in the study of protein structure, function, and interactions

with other proteins. Cross-linkers bind to surface amino acid residues that are near one another

in space. This helps to stabilize otherwise weak or transient inter-residue interactions so they can

be analyzed. The imidoester cross-linker dimethyl suberimidate27 and the N-hydroxysuccinimide
ester cross-linker BS3 (bis(sulfosuccinimidyl) suberate)28 can both be used as cross-linkers in
protein studies. In these cases, lysine's amino group carries out a nucleophilic attack on the
cross-linker, resulting in a covalent bond between the lysine and the cross-linker. The
carbodiimide cross-linker EDC, 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide, converts
carboxyl groups into amine-reactive isourea intermediates that can bind to available primary
amines, including lysine residues. The

cross-linkers have a known end-to-end distance, which can be taken as the maximum distance









between the two linked residues, as the linkers are usually flexible and can fold over on

themselves.29

Mass spectrometry (MS) is an analytical tool that has many uses including protein

characterization. In general, there are four steps in MS: (1) ionize the sample, (2) separate ions

of differing masses, (3) detect the number of ions having each mass produced, and (4) collect and

analyze the data. To characterize proteins, two methods can be used. In the "top-down"

approach, the intact protein is ionized by electrospray ionization or matrix-assisted laser

desorption/ionization and then it is run through a mass analyzer. In the "bottom-up" method,

protease enzymes, like trypsin, are used to digest the proteins into smaller peptides, which are

then introduced into the mass spectrometer. The identity of the peptide is found through peptide

mass fingerprinting or tandem mass spectrometry. Cross-linked residues can be identified

because the protease does not cleave at residues that carry a cross-linker.
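
The bottom-up logic can be illustrated with a small in silico digest. The sketch below assumes a simplified trypsin rule (cleavage after every lysine or arginine, ignoring the usual exception before proline) and treats cross-linker-modified lysines as uncleavable. The function name and the toy sequence are hypothetical.

```python
def tryptic_peptides(sequence: str, blocked: frozenset = frozenset()) -> list:
    """Cleave after every K or R unless that position is blocked.

    `blocked` holds 0-based indices of cross-linker-modified lysines.
    Simplification: the "no cleavage before proline" rule is ignored.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in "KR" and i not in blocked:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


# Toy sequence: the lysine at index 2 is modified in the second digest.
unmodified = tryptic_peptides("AGKLMRTTKVE")
cross_linked = tryptic_peptides("AGKLMRTTKVE", blocked=frozenset({2}))
```

A missed cleavage, here the longer peptide that spans the modified lysine, is the signature that points to a cross-linked residue.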

1.3 Methods of Structure Prediction

Many methods of protein structure prediction are available. The ultimate goal in protein

modeling is to predict the structure of a protein using its amino acid sequence. Ideally, the

predicted structure will be comparable in accuracy to an experimental structure but the method of

deriving the structure will be much faster compared to experiment. Protein modeling is

especially useful for proteins that cannot be crystallized for X-ray diffraction or those that are too

large to be studied via NMR. The recent interest in genome projects has given rise to enormous

amounts of amino acid sequence information, which continues to grow much faster than protein

structure data.30-32 In an attempt to keep up with the demand for protein structures, many

scientists are turning to structure prediction methods. We will discuss homology modeling, fold

recognition methods, and ab initio methods.









1.3.1 Homology Modeling

The simplest protein structure prediction method is sequence homology,33,34 which

determines the degree of similarity between two proteins, one with a known structure and one

without. There are two basic premises of homology modeling: (1) a protein's structure is

determined by its amino acid sequence35 and (2) over millions of years, a protein's structure is

less likely to change than its sequence.36-38 If two proteins have sequence identity greater than

30%, they are believed to have essentially the same structure. In general, homology modeling

can be broken down into seven steps. They are: template recognition and initial alignment,

alignment correction, backbone generation, loop modeling, side-chain modeling, model

optimization, and model validation.

The first step in homology modeling is to find a template and align the sequences.

Sequence alignment programs like BLAST39 or FASTA40 compare the target sequence to all

sequences that have a known structure and are in the PDB by using two matrices. The first is a

residue exchange matrix, which characterizes the probability that two of the twenty amino acids

should be aligned. The axes of this matrix are simply the 20 amino acids. The highest values are

found along the diagonal representing conserved residues. Exchanges between residues with

similar properties, like phenylalanine → tyrosine, have higher scores than those exchanges

between very dissimilar residues. In the alignment matrix, the axes correspond to the two

sequences being aligned and the elements are the values of the exchange matrix for a particular

pair of residues. To find the optimal alignment, the best path through the matrix is found taking

care not to use any residue twice. Gaps can be inserted to improve the alignment, but the

alignment algorithm will subtract a gap opening penalty. The end result of a BLAST search is a

list of hits, which are the candidate modeling templates and their corresponding alignments.
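
The alignment-matrix idea described above can be sketched as a global dynamic-programming alignment (the Needleman-Wunsch scheme). This is not how BLAST or FASTA are implemented, since both rely on faster heuristics. The toy exchange scores and the single linear gap penalty are assumptions chosen only to keep the example short.

```python
def exchange_score(a: str, b: str) -> int:
    """Toy exchange matrix: identity > similar pair > everything else."""
    similar = {frozenset("FY"), frozenset("KR"), frozenset("DE")}
    if a == b:
        return 4
    return 1 if frozenset((a, b)) in similar else -2


def global_alignment_score(s: str, t: str, gap: int = -3) -> int:
    """Score of the best path through the alignment matrix."""
    rows, cols = len(s) + 1, len(t) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap          # leading gaps in t
    for j in range(1, cols):
        score[0][j] = j * gap          # leading gaps in s
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + exchange_score(s[i - 1], t[j - 1]),
                score[i - 1][j] + gap,  # residue of s aligned to a gap
                score[i][j - 1] + gap,  # residue of t aligned to a gap
            )
    return score[-1][-1]
```

Each residue is consumed exactly once along any path, which is the "do not use any residue twice" condition; a production aligner would typically distinguish gap opening from gap extension.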









In the second step, the alignment can be corrected using programs like CLUSTALW41 to

perform multiple sequence alignments, which use the sequences of other homologous proteins.

Such methods are useful for regions of low sequence identity in the original alignment. Multiple

sequence alignments can also generate position-specific scoring matrices called profiles, which

indicate the residues most likely to be buried in the hydrophobic core and which are on the

surface based on the most frequently seen residue exchanges.

The backbone is generated in step 3. The coordinates of the template residues appearing in

the alignment are copied. If the template residues are the same as the target, all coordinates of

the residue are copied. If not, only the backbone coordinates (N, Cα, C, and O) are copied.

Multiple template modeling, as performed by programs like Swiss-Model,42 is useful when one

template is found to contain errors.

After alignment, there are often gaps from insertions and deletions which change the

conformation of the backbone. These changes usually do not occur in regular secondary

structural elements, but rather in loop regions and turns. Even without the insertions and

deletions, loop conformations often differ between the template and the target. There are two

common methods used for loop modeling: knowledge based and energy based. In the first

method, the loop conformation is copied from a known loop in the PDB with endpoints that

match the residues between which the loop must be inserted. Most major modeling

programs can do this.42-46 The energy based method determines the best loop conformation by

minimizing an energy function using Monte Carlo47 or molecular dynamics.48

The next step is to model the side chains. Usually, this is done using libraries of common

rotamers (different conformations) that have been extracted from high resolution X-ray

structures. Many such libraries exist.49-51 Each rotamer is positioned and energy functions are









used to score them. If the residue is conserved, it is easier to copy the coordinates of the entire

residue instead of copying the backbone and modeling the side chain. Also, certain

conformations of the backbone may prefer certain rotamers, which helps to minimize the search

space as the position of one affects the position of its neighbors. Residues in the hydrophobic

core generally adopt only one conformation, whereas the more flexible side chains on the surface

of proteins may adopt several conformations.

The model optimization step is actually an iteration of rotamer prediction and energy

minimization steps. The rotamers are predicted, which changes the backbone, then the rotamers

for the new backbone are predicted and so on until convergence is achieved. Molecular

dynamics is the preferred method of energy minimization. Some methods restrain the atom

positions and/or only employ a few hundred steps of energy minimization. Better, more accurate

force fields will help improve model optimization.

The final step in homology modeling is to validate the model. The amount of error can

depend on the sequence identity between the target and template. Poor sequence identity (< 25%)

can often lead to very large errors. Such errors can be estimated by calculating the model's

energy based on a force field, which checks to see if bond lengths and angles are within normal

limits. It is not possible to discern if the protein is folded correctly using force fields alone as

misfolded, yet well-minimized models can usually give the same energy as the target structure.52

Determining normality indices is an additional way to estimate errors. These indices describe

how similar certain characteristics are between the model and real structures. They will check

properties like the distribution of polar and nonpolar residues in the interior and on the surface of

the protein as well as radial distribution functions that can distinguish between good and bad

contacts.









In general, sequence homology is quick and computationally inexpensive. A drawback of

this method, however, is its inability to detect structural similarities existing between two

proteins with very different sequences. Unfortunately, in the protein world such occurrences are

still quite common.53-5 For example, mammalian glycogen phosphorylase and DNA

glucosyltransferase have similar shapes but differ greatly in sequence.54 For these proteins, other

methods, like threading and ab initio folding, must be used.

1.3.2 Fold Recognition Methods (Threading)

The basic idea behind fold recognition methods (also called threading or the inverse

folding problem) is to determine which of the known protein folds will be most similar to an

unknown fold of a new protein knowing only its amino acid sequence. In nature, often two

seemingly unrelated proteins may adopt similar folds. It is therefore important for a program to

detect structural similarities between proteins with vastly different sequences. Some of these

occurrences are the result of divergent evolution; the two proteins are related, but our current

sequence analysis methods are not sensitive enough to detect such distant homologies.

Convergent evolution, on the other hand, may explain how similar structures can result from

proteins having common functional requirements, like binding the same class of substrates.

Because only a small number of folds have been found in nature thus far, proteins may have very

limited folding space giving rise to similarities between unrelated proteins. This explanation is

generally reserved only for single domain proteins."

The first two explanations show that proteins with similar structures sometimes also have

similar functions; fold recognition, therefore, might be used as a function prediction tool as well.

Usually, the active site, identity of cofactors, and general features of the reaction being catalyzed

are highly conserved for enzymes with similar folds.l Essentially, for such proteins, function is

conserved evolutionarily.









In sequence-based fold recognition, one must first recognize similarities based on sequence

and then construct a detailed alignment, which is a residue-by-residue equivalence table between

the two proteins. The same methods of sequence alignment were discussed in our section on

homology modeling. Fold recognition methods also use position-specific mutation rules derived

from the multiple sequence alignment of a homologous family to find distant homologies, even

between proteins with less than 25% sequence similarity.

Energy-based fold recognition methods are similar to grid search minimizations. The

calculated energy at each grid point is based on known protein structures. This method is also

called threading58,59 because the target sequence is being threaded through or forced to adopt the

structure of another protein. Several threading algorithms have been developed, but all follow

the same paradigm of sequence alignment, template identification, and alignment building. The

same limitations apply to threading as to sequence-based fold recognition. If no correct structure

exists in the structural database being used, no good models will be built.

Many threading algorithms have been developed over the years. An intuitive approach

would be to use a technique that incorporates nonlocal scoring functions. Many approximations

are needed to minimize the space of possible alignments. One of the first and most successful

methods was the Threader algorithm.60 It used two-level dynamic programming to optimize

interaction partners for each pair of aligned residues. Only the strongest interacting residues

were considered in this method, which helped reduce its computational cost.

Threading algorithms generally differ in three areas. The first is in their protein model and

interaction descriptions. Simplifying the three-dimensional protein structure is one way to speed

up energy calculations. Side chains can be simplified by describing them as interaction points,

which can be located at Cα or Cβ positions or can encompass the whole side chain. Also, the









energy calculation may only include certain parts of the protein and the interaction energy may

or may not be distance dependent. Different algorithms also have various empirical energy

parameterizations. Finally, threading methods differ in their alignment algorithms. The

threading energy is a nonlocal function based on the alignment between the template structure

and the prediction target sequence.

1.3.3 Ab Initio Methods

Predicting a protein's native conformation solely from its sequence of amino acids is the

basis of ab initio structure prediction. In the early 1970s, Anfinsen suggested that a protein's

native conformation corresponds to the global free energy minimum for its sequence, which is

commonly referred to as the "thermodynamic hypothesis." He also showed that the information

needed for a protein to fold is contained in its amino acid sequence.61,62 It seems logical to

assume that given a perfect energy function and the proper computational tools, the native

structure can be found. In reality, two problems hinder this method. The conformational space

to be searched is huge, while the molecular potentials have limited accuracy.

To reduce the effects of these problems, many methods use reduced representations,

simplified potentials, and coarse search strategies.63-66 In ab initio folding, representations of the

polypeptide chain are usually simplified in some way. Implicit solvation models are preferred

over explicit water molecules. United atom representations are used, where the non-polar

hydrogens are merged into the heavy atoms to which they are bonded. Using the

limited set of conformations for each side chain that is most prevalent in the PDB (found in

rotamer libraries") can reduce computational cost without loss of predictability. Side chains are

also sometimes replaced by locating the side-chain properties at the Cα or the centroid of the side









chain. This essentially averages out the side chain degrees of freedom, which speeds up the

calculation but also decreases the specificity.

To reduce the size of conformational space to be searched, one can limit the available

backbone conformations. Certain local structures prefer certain torsion angle pairs.67-69 Torsion

angles can also be restricted to discrete, commonly seen values by using a small set of phi-psi

pairs,70 by choosing pairs from an ideal set based on predicted secondary structure, or by using

fragments from known proteins.63,71-73

There are two types of potentials used to evaluate the free energy of proteins: molecular

mechanics potentials and scoring functions. Both classes must be able to properly represent the

forces that determine protein conformation. Such forces include solvation, electrostatic

interactions (hydrogen bonds as well as ion pairs), covalent bonds, bond angles, dihedral

angles, and van der Waals interactions.

For molecular mechanics, the forces needed to determine protein conformation are

modeled by using physically based functional forms that have been parameterized from small

molecules or quantum mechanical (QM) calculations. Coulomb's law is used to model

electrostatic interactions using QM calculations to derive partial charges, while a standard 6-12

potential is usually used to describe van der Waals interactions.
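
The two nonbonded terms named above can be written out as a short sketch. The parameter values in the example are illustrative and do not come from any particular force field; the conversion constant of about 332 kcal Å mol⁻¹ e⁻² is the commonly quoted value for charges in units of e and distances in Å, and the function names are assumptions.

```python
COULOMB_CONSTANT = 332.06  # kcal*Å/(mol*e^2), approximate conversion factor


def coulomb_energy(q1: float, q2: float, r: float, dielectric: float = 1.0) -> float:
    """Coulomb's law for two partial charges (in e) separated by r (in Å)."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)


def lennard_jones_energy(r: float, epsilon: float, sigma: float) -> float:
    """6-12 potential: 4 * epsilon * ((sigma / r)**12 - (sigma / r)**6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def nonbonded_pair_energy(q1, q2, r, epsilon, sigma, dielectric=1.0):
    """Sum of the electrostatic and van der Waals terms for one atom pair."""
    return (coulomb_energy(q1, q2, r, dielectric)
            + lennard_jones_energy(r, epsilon, sigma))


# Hypothetical pair: opposite partial charges of 0.4 e at 3.5 Å.
example = nonbonded_pair_energy(0.4, -0.4, 3.5, epsilon=0.1, sigma=3.4)
```

In this form the 6-12 term crosses zero at r = sigma and reaches its minimum of -epsilon at r = 2^(1/6) sigma.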

Scoring functions (protein structure-derived potentials), on the other hand, are empirically

derived from experimental structures.74,75 A functional form of the potential is usually not

specified; rather, the logarithm of a probability distribution function is used to find

pseudoenergies. These functions are especially useful when dealing with reduced complexity

models as they can represent interactions between side chain centroids after averaging over all

possible positions of the non-present atoms.









Molecular dynamics is usually too expensive for the de novo generation of protein models

using full atom representation. This method, however, has had some success with very small

proteins, like the Trp-cage protein.76,77 Conformational searching is quicker when a coarse

sampling of the energy landscape is performed. Methods that take this approach include

Metropolis Monte Carlo simulated annealing,63 simulated tempering,78 evolutionary

algorithms,72 and genetic algorithms.79 These methods generally allow for large perturbations in

structure in a single move. Because the final structure of a single search may end up being

trapped in a local minimum instead of the global minimum, several conformational searches are

performed to generate an ensemble of possible structures. Choosing the most native-like

structures from the ensemble is difficult and many techniques have been developed to do so.80-82

As potential functions are improved, identifying the most accurate models will become easier as

they will have the lowest free energy. Perhaps the best energy functions for discriminating

amongst the possible structures will be ones that combine molecular mechanics potentials with

those derived from protein databases.
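
The search strategy described above, in which several independent runs yield an ensemble of candidates, can be sketched with Metropolis Monte Carlo simulated annealing on a one-dimensional toy landscape. The energy function, the geometric cooling schedule, and all parameter values are assumptions for illustration; a real folding search moves in torsion-angle or fragment space.

```python
import math
import random


def simulated_annealing(energy, x0, steps=5000, t_start=2.0, t_end=0.01,
                        step_size=1.0, seed=0):
    """Metropolis Monte Carlo with a geometric cooling schedule (1-D toy)."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for k in range(steps):
        t = t_start * (t_end / t_start) ** (k / (steps - 1))
        trial = x + rng.uniform(-step_size, step_size)
        e_trial = energy(trial)
        # Downhill moves are always accepted; uphill moves are accepted
        # with the Boltzmann probability exp(-dE / T).
        if e_trial <= e or rng.random() < math.exp(-(e_trial - e) / t):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


def rugged(x):
    """Toy landscape with several local minima."""
    return 0.1 * x * x + math.sin(3.0 * x)


# Independent runs from different starting points give an ensemble.
starts = [-8.0, -3.0, 4.0, 9.0]
ensemble = [simulated_annealing(rugged, x0, seed=i) for i, x0 in enumerate(starts)]
```

Each run returns the lowest-energy point it visited, which need not be the global minimum; this is why the members of the ensemble still have to be ranked afterwards.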

Two fields in which ab initio protein folding might be of great use are genome

functional annotation and structural genomics. Many open reading frames have no sequence

homology with proteins of known structure and/or function, but links between such proteins may

be detected through ab initio folding. Structural similarities can be detected by comparing the

predicted structures to those in the PDB using a structure-structure comparison tool.83 One could

also look for conserved geometric motifs in these structures.84 Finally, the predicted structures

can be used to make matches to sequence-based motif libraries more sensitive and reliable.

Ab initio structure prediction can be used in structural genomics initiatives as a guide for

experimentalists by finding proteins most likely to contain novel folds. A hybrid of ab initio










prediction and homology modeling can also be used: if a homology model contains a gap, then

it can be filled in by ab initio prediction. Combined with a small amount of experimental data,

ab initio methods can be used in rapid structure determination for proteins whose structures

cannot be determined via X-ray or NMR data, like membrane-bound proteins."

1.3.3.1 Rosetta

A specific example of an ab initio folding algorithm is Rosetta (http://www.bakerlab.org/),

which is one of the best prediction methods available today. Prediction methods are tested and

compared via the Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiments

(discussed further in section 1.4). In past CASP experiments,47,86,87 Rosetta has generated some

of the top-scoring predictions. There are several variants of Rosetta, all of which use sequence

information and an energy function to generate protein-like structures. Rosetta has been

employed in structure determination using limited experimental constraints,88,89 de novo protein

design,90,91 protein-protein docking,92 and loop modeling.93 All methods involve generating a

fragment library, piecing the fragments together, clustering the structures by pairwise Cα root

mean square deviation (RMSD) values, and ranking the representative cluster centers.

Incorporating experimental data into the Rosetta method has been successful. RosettaNMR, for

example, uses NMR constraints like residual dipolar couplings (RDCs), nuclear Overhauser

effects (NOEs), and unassigned chemical shifts (CSs) to restrain certain bond distances and

angles to improve the quality of predicted structures.

After the decoy set is generated, some decoys are eliminated using two filters. The contact

order filter removes the decoys with low contact order (overly local structures) compared to a

test set of proteins. The strand arrangement filter eliminates structures with unpaired β-strands









and other nonprotein-like structures. Finally, the decoys are clustered. A representative model

from each cluster is chosen and ranked by the size of the cluster it represents.
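
The final clustering and ranking step can be sketched as follows. The text does not specify the clustering procedure used by Rosetta, so the greedy scheme below (repeatedly taking the decoy with the most neighbors within an RMSD cutoff as a cluster center) is only one plausible choice, and the pairwise RMSD values are made up.

```python
def cluster_decoys(rmsd, cutoff):
    """Greedy clustering on a symmetric pairwise-RMSD matrix.

    Repeatedly picks the unassigned decoy with the most unassigned
    neighbours within `cutoff` as a cluster centre. Returns a list of
    (centre_index, member_indices) tuples, largest cluster first.
    """
    unassigned = set(range(len(rmsd)))
    clusters = []
    while unassigned:
        neighbours = {i: [j for j in unassigned if rmsd[i][j] <= cutoff]
                      for i in unassigned}
        # Ties are broken in favour of the lowest decoy index.
        centre = max(sorted(unassigned), key=lambda i: len(neighbours[i]))
        members = sorted(neighbours[centre])
        clusters.append((centre, members))
        unassigned -= set(members)
    return clusters


# Made-up RMSD matrix (Å): decoys 0-2 are alike, as are decoys 3-4.
toy_rmsd = [
    [0.0, 1.0, 1.0, 8.0, 8.0],
    [1.0, 0.0, 1.0, 8.0, 8.0],
    [1.0, 1.0, 0.0, 8.0, 8.0],
    [8.0, 8.0, 8.0, 0.0, 1.0],
    [8.0, 8.0, 8.0, 1.0, 0.0],
]
ranked = cluster_decoys(toy_rmsd, cutoff=2.0)
```

Because each center is chosen with the largest remaining neighborhood, the clusters come out already ordered by size, which is the ranking criterion described above.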

Discriminating among the decoys is still problematic. Clustering is not always the best

option, as the best structures may not be in the most populated cluster. We will present a method

involving the generation of decoy sets using Rosetta and discriminating among the decoys using

inter-residue distance measurements.

1.3.3.2 Databases to test scoring functions

In developing proper energy functions for use in protein structure prediction, it is

important to test the function on a set of computer-generated conformations called decoys to see

if the functions can distinguish between the native and non-native-like conformations.94

Samudrala and Levitt developed many sets of such conformations (the database called Decoys

'R' Us, located at http://dd.stanford.edu) and made them available to the public to aid in the

improvement of scoring functions. The decoys were generated with the intention of fooling the

scoring functions; they have characteristics similar to those of native proteins, but they are not necessarily

correct. Decoys have been developed from the following types of methods: molecular dynamics

trajectories,95 crystal structures,96 conformations with different loop regions,82 threading of the

amino acid sequence onto very different folds,97 and discrete-state models.80

Similar websites have been created to test energy functions for general protein structure

prediction (http://prostar.carb.nist.gov) as well as scoring functions specific for fold recognition

(http://fold.doe-mbi.ucla.edu). Testing scoring functions on several different decoy sets allows

for the exploration of a vast conformational space of proteins, which a single decoy set

alone might not be able to provide.

A function's performance can be measured in many ways. The simplest method is to look

at the RMSD of the best scoring conformation and the native structure. It is also possible to









estimate the probability of choosing a conformation at least as good as the best-scoring one by chance: the RMSD rank of that

conformation divided by the total number of conformations. The correlation coefficient of the

RMSDs and the scores is also a good method because it uses information about all the

conformations in the decoy set.
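
The three performance measures described above can be computed together. The sketch below assumes that a lower score is better, that RMSD rank 1 is the decoy closest to the native structure, and that the Pearson form of the correlation coefficient is meant; the function name and the toy numbers are hypothetical.

```python
import math


def evaluate_scoring_function(scores, rmsds):
    """Return (rmsd_of_best_scoring, chance_probability, pearson_r).

    Assumes lower scores are better and that neither list is constant
    (otherwise the correlation coefficient is undefined).
    """
    n = len(scores)
    best = min(range(n), key=lambda i: scores[i])
    rmsd_of_best = rmsds[best]
    # RMSD rank of the best-scoring decoy (1 = closest to native).
    rank = 1 + sum(1 for r in rmsds if r < rmsd_of_best)
    chance = rank / n
    mean_s = sum(scores) / n
    mean_r = sum(rmsds) / n
    cov = sum((s - mean_s) * (r - mean_r) for s, r in zip(scores, rmsds))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_r = sum((r - mean_r) ** 2 for r in rmsds)
    return rmsd_of_best, chance, cov / math.sqrt(var_s * var_r)


# Toy decoy set: four decoys with made-up scores and RMSDs (Å).
result = evaluate_scoring_function([-10.0, -8.0, -5.0, -2.0],
                                   [2.0, 1.0, 4.0, 6.0])
```

In the toy set the best-scoring decoy is only the second closest to the native structure, so a random pick would do at least as well half of the time.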

1.3.4 Distance Geometry

The general goal of distance geometry calculations is to build model structures that satisfy

a set of constraints. This branch of mathematics was developed by Blumenthal while Crippen

and co-workers98-100 were the first to apply these principles to chemical structure problems.

Presently, the term distance geometry is used to describe any of the computer programs that

convert geometric constraints into three-dimensional molecular coordinates. Distance geometry

algorithms are usually fast and can explore a vast conformational space. The structures they

generate can be used as input for further refinement methods.

Constraints are usually expressed in terms of an objective function. One way of doing this

is to specify a target value for a parameter of interest (e.g., a distance or an angle) and then have

the objective function measure deviations from the target value. Another way is to assign upper

and lower bounds on a certain parameter. When boundary conditions are violated, a penalty

term is added to the objective function.

Most distance geometry programs have four parts: input preparations, bounds generation

and bounds smoothing, embedding, and optimization. Many types of distances can be used as

input including holonomic, experimental, and those from secondary structure. Holonomic

distances are determined directly from the protein sequence. Templates of each amino acid are

made and generally include bond lengths, fixed dihedral distances like those in the peptide bond,

and distances involved in rigid structures like aromatic rings. Upper and lower bounds are

usually set to +/- 2% of the distance of interest.100 Experimental distances can be derived from









NOE data. Usually, a 5-Å upper bound is applied while the lower bound is the sum of the

appropriate van der Waals radii. Information from secondary structure, like hydrogen bonding

constraints, can also be used.

All N(N − 1)/2 distinct interatomic distances are stored in a symmetric N × N matrix, while the bounds matrix is

not symmetric. Because only a small number of all the interatomic distances will be found

through experiment, other constraints, like the triangle inequality, must be used as well. For the

upper bounds, the triangle inequality shows that for three atoms (i, j, k), the distance between i

and j (D_ij) cannot exceed the sum of the distances from i to k (D_ik) and from k to j (D_kj). If the upper bound on D_ij is

greater than the sum of the upper bounds on D_ik and D_kj, it is replaced with that sum. If the sum of the two upper bounds is

less than the lower bound on D_ij, then a triangle violation has occurred. After applying the triangle

inequality to the upper bounds, it can be applied to the lower bounds. The overall result can

be summarized as follows: for all atoms i, j, k, the upper bound on D_ij is at most the sum of the upper bounds on D_ik and D_kj,

and the lower bound on D_ij is at least the lower bound on D_ik minus the upper bound on D_kj.
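
Bounds smoothing with the triangle inequality can be sketched as two sweeps over all atom triples, first tightening the upper bounds and then raising the lower bounds. The function below is a simplified illustration under those assumptions (symmetric list-of-lists matrices, one sweep per bound type) and is not taken from any particular distance geometry program.

```python
def smooth_bounds(upper, lower):
    """Return triangle-smoothed copies of symmetric bound matrices.

    Pass 1 tightens upper bounds:  U_ij <= U_ik + U_kj.
    Pass 2 raises lower bounds:    L_ij >= L_ik - U_kj.
    A ValueError signals a triangle violation (lower bound above upper).
    """
    n = len(upper)
    up = [row[:] for row in upper]
    lo = [row[:] for row in lower]
    for k in range(n):                      # pass 1: upper bounds
        for i in range(n):
            for j in range(i + 1, n):
                if k in (i, j):
                    continue
                up[i][j] = up[j][i] = min(up[i][j], up[i][k] + up[k][j])
    for k in range(n):                      # pass 2: lower bounds
        for i in range(n):
            for j in range(i + 1, n):
                if k in (i, j):
                    continue
                lo[i][j] = lo[j][i] = max(lo[i][j],
                                          lo[i][k] - up[k][j],
                                          lo[j][k] - up[k][i])
                if lo[i][j] > up[i][j]:
                    raise ValueError(f"triangle violation for atoms {i}, {j}")
    return up, lo


# Three atoms: the loose bounds on the 0-1 pair are tightened by the others.
upper = [[0, 10, 6], [10, 0, 1], [6, 1, 0]]
lower = [[0, 0.5, 5], [0.5, 0, 0.5], [5, 0.5, 0]]
smoothed_upper, smoothed_lower = smooth_bounds(upper, lower)
```

In the example, atoms 0 and 2 are at least 5 Å and at most 6 Å apart, and atoms 1 and 2 are at most 1 Å apart, so the 0-1 distance is confined to between 4 Å and 7 Å even though it was never measured.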

1.3.5 Chemical Cross-Linking with MS

Recently, a technique involving the use of intermolecular cross-linking, mass

spectrometry, and sequence threading has been employed in a structure prediction method.29

Using the lysine-specific cross-linking agent BS3 (bis(sulfosuccinimidyl) suberate), the tertiary

structure of bovine basic fibroblast growth factor (FGF-2) was probed. Tryptic peptide mapping

using time-of-flight mass spectrometry was employed to identify the eighteen Lys-Lys

cross-linking sites, and distance constraints were derived from this information. Threading was then

used to assign the protein to a family of folds. This method, which requires only a small amount

of sample, is fast and easily automated.

The BS3 cross-linker reacts with the amine groups of Lys and the N terminus. Only one

Lys-Lys cross-link per molecule was seen, ensuring the tertiary structure remained unperturbed.









The masses of tryptic peptides were assigned from the mass spectra using the Automated

Sequence Assignment Program (ASAP).29,101-103 The cross-linked residues can be identified

because trypsin cleaves at lysines and arginines, but not at BS3-modified lysines.

To identify the protein fold, the sequence threading program 123D104 was used to

find the twenty best structural models from a database of proteins with at least 30% sequence

identity. The models were then ranked by how similar their distances were to the cross-link-

derived distance constraints. The threading models were scored according to Equation 1-5 where

N is the total number of constraints, d_i is the Cα-Cα distance between the residues in constraint i, and

d_0 is the maximum Cα-Cα through-space distance between the BS3-cross-linked lysines.



$$\text{Score} = \sum_{i=1}^{N} \begin{cases} 0, & \text{if } d_i \le d_0 \\ d_i - d_0, & \text{if } d_i > d_0 \end{cases} \tag{1-5}$$


In their work, approximately N/10 constraints (where N is the number of residues) were

found to provide enough distance information to correctly assign the fold of the studied protein.

This method can be used to study most proteins if the fold has been previously observed. There

are many cross-linkers available that react with other polar groups besides lysine. These cross-

linkers may also have different spacer arm lengths and flexibility. This method can also be

combined with the other methods discussed in section 1.3.
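
The scoring rule of Equation 1-5 is simple enough to write out directly. In the sketch below, each threading model is represented only by its list of Cα-Cα distances for the cross-linked residue pairs. The function names, the toy distances, and the cutoff of 24.0 Å used for d0 are illustrative assumptions.

```python
def constraint_violation(distances, d0):
    """Sum of (d_i - d0) over all constraints with d_i > d0.

    A value of zero means every cross-link constraint is satisfied.
    """
    return sum(d - d0 for d in distances if d > d0)


def rank_models(models, d0):
    """Sort model names from the smallest to the largest total violation.

    `models` maps a model name to its list of Cα-Cα distances (Å),
    one distance per cross-linked residue pair.
    """
    return sorted(models, key=lambda name: constraint_violation(models[name], d0))


# Toy data: three hypothetical threading models, two cross-links each.
toy_models = {
    "model_a": [30.0, 40.0],
    "model_b": [10.0, 25.0],
    "model_c": [12.0, 20.0],
}
ranking = rank_models(toy_models, d0=24.0)
```

Distances shorter than the cutoff contribute nothing, which reflects the fact that a flexible cross-linker sets only an upper limit on the separation of the linked residues.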

1.3.6 Our Method

As mentioned previously, the more common experimental structure determination methods

are expensive and time consuming. We have employed a creative use of less expensive

experimental methods in an attempt to overcome some of the obstacles associated with the more

common structure determination methods. We find only a modest decrease in the resolution of

the predicted structure. Even low resolution structures have been demonstrated to provide









insights into protein function.105 We suggest a method using simple computer algorithms and

relatively inexpensive inter-residue distance measurements to generate low resolution models

which can be further refined with additional procedures.

We propose a method to predict the unknown structure of a protein using a database of

protein-like structures. After generating over 8 million decoys, we eliminate the bad ones using

inexpensive distance measurements. We will test the following two hypotheses for our method:

(1) our decoy set is complete, therefore, a target protein will have similar structure to a member

of the decoy set; (2) proteins with a small set of similar inter-residue distances (much smaller

than the total number of distances) will have similar overall structure.

Our decoys are derived from structures in the Protein Data Bank (PDB), ensuring that all

common protein folds are represented106 (See Chapter 2). After choosing a target sequence of

unknown structure, several Cα-Cα distances are measured. Experimental techniques like NMR

(NOE), EPR, and FRET can be used to measure small (3-7 Å), medium (10-25 Å), and large

(25-100 Å) distances, respectively. Determining the radius of gyration through scattering

experiments can also generate useful information. All of these methods generally cost less than

X-ray crystallography.

To test the feasibility of our method, we search the decoy set for target proteins whose

structures have been solved experimentally but have not been explicitly included in our set.

Experimental data are simulated by calculating Cα-Cα distances from the experimentally

determined structures of our target proteins. Those distances are then used as search constraints.

The same set of distances is calculated for each of the decoy structures and compared to those

measured in the target protein. Structures with several similar α-carbon distances are expected to have

similar three-dimensional structure. Our first hypothesis suggests there should be at least one









"surviving" structure in the decoy set while our second hypothesis, if true, guarantees the number

of surviving structures to be low. Therefore, our final protein structure predictions are the

decoys satisfying the most distance constraints.
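
The search step can be sketched as a simple count of satisfied constraints per decoy. The code below assumes that each decoy is given as a list of Cα coordinates and that a constraint counts as satisfied when the decoy distance lies within a fixed tolerance of the measured value. The tolerance of 2.0 Å, the function names, and the toy coordinates are hypothetical and are not the settings used in this work.

```python
import math


def satisfied_constraints(decoy_ca, constraints, tolerance=2.0):
    """Count constraints (i, j, measured) matched within `tolerance` Å.

    `decoy_ca` is a list of (x, y, z) Cα coordinates indexed by residue.
    """
    return sum(
        1 for i, j, measured in constraints
        if abs(math.dist(decoy_ca[i], decoy_ca[j]) - measured) <= tolerance
    )


def rank_decoys(decoys, constraints, tolerance=2.0):
    """Return (names sorted best first, {name: number satisfied})."""
    counts = {name: satisfied_constraints(ca, constraints, tolerance)
              for name, ca in decoys.items()}
    order = sorted(counts, key=lambda name: counts[name], reverse=True)
    return order, counts


# Toy decoys with four residues each: one extended, one folded back.
toy_decoys = {
    "extended": [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)],
    "hairpin": [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)],
}
# Simulated measurements (residue i, residue j, distance in Å).
toy_constraints = [(0, 3, 11.4), (0, 2, 7.6)]
order, counts = rank_decoys(toy_decoys, toy_constraints)
```

Only the extended decoy reproduces both simulated distances, so it is the single "surviving" structure in this toy set.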

Recently, the rate of new protein folds deposited into the PDB has reached a plateau,

suggesting that most novel protein folds have already been discovered using the techniques

presently available.107,108 This finding bolsters our assumption that most small (~100 residues),

single domain, folded proteins have a structurally similar decoy in our database. We limit the

use of our method to proteins containing 100 residues or fewer by generating decoys exactly 100

residues long. Our method is not intended to predict the structure of membrane proteins, as such

proteins are not as well represented in the PDB.

1.4 Critical Assessment of Techniques for Protein Structure Prediction (CASP)

The Critical Assessment of Techniques for Protein Structure Prediction (CASP,

http://predictioncenter.org/casp7) is a community-wide experiment that allows protein structure

prediction groups to test and compare their methods. The goals of CASP are threefold:109 to

determine the abilities and limitations of the current methods; to determine where progress is

being made; and to determine where the field is not making progress due to specific bottlenecks. The

categories of predictions are always changing slightly from one round of CASP to the next. For

example, evaluation of high resolution models was suggested at the CASP6 meeting and

implemented in CASP7.

In Chapter 6, we discuss the results of testing our method with CASP targets. The CASP

organizers solicit from experimentalists target protein sequences whose structures are close to

being determined or have not yet been published. The participants are given only the target

sequences and a limited amount of time to use their prediction methods to determine the target

protein structures. After analyzing the results, the organizers hold a conference at which the









most successful groups are asked to present their methods. Attendees may also make

suggestions for future CASP experiments. There have been seven rounds of CASP since its

inception in 1994.

CASP7 included three primary categories of prediction: (1) tertiary structure predictions,

(2) high resolution models, and (3) other predictions. Each category is further divided into

automated and human-aided predictions. The "human" predictions can be made using any

combination of computational and human methods, but the automated structure prediction

servers must be fully automated.

The tertiary structure predictions are further divided into two types, template based

modeling and template free modeling. The first group includes the previous categories of

comparative modeling, homologous fold based models, and some analogous fold based models.

The second group contains models of proteins with new folds (previously unseen) as well as hard

analogous fold based models.

The second primary category, high resolution models, is new. It contains a subset of

models from the tertiary structure prediction category whose backbones are highly accurate such

that the details of active sites, loops, and side chains can be evaluated.

The other prediction category looks at how well predictors were able to define boundaries

of structural domains, detect residue-residue contacts, and identify the regions of disorder in the

targets. This category also includes predictions of function from structure. Another new facet

of the evaluation was judging the predictors' ability to discern the best model from their

respective decoy set without knowledge of the native structure. Evaluating model refinement is

also important as there is much interest in generating models with high accuracy.












Figure 1-1. Structure of an amino acid (alanine) showing the α-carbon (Cα), the R-group (side
chain; CH3 for alanine), the amino group (H2N), and the carboxyl group (COOH). The amino
acid is shown in its neutral, non-zwitterion form.


Figure 1-2. Organization of protein structure. A) Primary structure corresponding to the amino
acid sequence of: alanine, tyrosine, phenylalanine, serine, lysine. B) Secondary
structure: α-helix and β-sheet patterns. C) Tertiary structure of the Trp-cage protein
(PDB code 1L2Y). D) Quaternary structure of hemoglobin (PDB code 1GZX).









CHAPTER 2
METHODS FOR PROTEIN STRUCTURE PREDICTION

2.1 Decoy Generation

2.1.1 General Decoy Set

All protein structures in the Protein Data Bank (PDB) with 100 residues or more (24,561

proteins in all, including X-ray and average NMR structures) were used to populate our decoy

database. The protein backbones were split into overlapping and running fragments of 100

residues (Figure 2-1) and only the Cartesian coordinates of the α-carbons were stored. A parent

protein of more than 100 residues can be segmented into exactly N - 99 overlapping fragments

containing 100 residues each, where N is the total number of residues. Thus the first decoy

contains the first 100 α-carbons, from residues 1 to 100, while the final decoy is composed of

α-carbons from residues (N - 99) to N. Decoys are named by first listing the PDB code of the

parent protein and then the decoy number. If the parent protein is composed of several chains,

the chain name is listed after the PDB code. For example, 1m31-a-2 is the second decoy,

composed of residues 2-101 from chain A of PDB code 1m31. Exactly 8,060,245 decoys were

generated in this manner, creating a database to find the structure of proteins composed of 100

residues or fewer. Because each decoy is exactly 100 residues long, we disregard the excess

terminal residues when searching for shorter proteins.
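
The fragmentation scheme described above can be illustrated with a minimal sketch. It assumes the α-carbon coordinates of one gap-free chain are already available as a list of (x, y, z) tuples; the function name make_decoys and the placeholder PDB code are hypothetical and are not part of the original implementation.

```python
def make_decoys(ca_coords, pdb_code, chain=None, length=100):
    """Cut one gap-free chain into overlapping, running windows.

    ca_coords: list of (x, y, z) alpha-carbon coordinates, residue 1 first.
    Returns a dict mapping decoy names to lists of `length` coordinates.
    Naming follows the text: PDB code, optional chain name, decoy number.
    """
    n = len(ca_coords)
    decoys = {}
    # A chain of n residues yields n - length + 1 windows (N - 99 for length 100).
    for start in range(n - length + 1):
        parts = [pdb_code]
        if chain:
            parts.append(chain)
        parts.append(str(start + 1))  # decoy k begins at residue k
        decoys["-".join(parts)] = ca_coords[start:start + length]
    return decoys


# Toy parent chain with N = 104 residues (coordinates are placeholders).
parent = [(float(i), 0.0, 0.0) for i in range(104)]
example_decoys = make_decoys(parent, "xxxx", chain="a")
```

With N = 104 this produces five decoys, matching Figure 2-1. Chains shorter than 100 residues yield no decoys, and gapped parents would have to be split into gap-free sections before calling the function.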

Several problems were encountered in constructing the decoy set. Some entries in the

PDB are missing important atoms or residues causing gaps in the parent protein. Decoys were

not generated from the gapped regions of such proteins. The two numbers appearing after some

of the PDB codes indicate the parent protein contained a gap and the sequence was renumbered

after the gapped region. For example, parent protein 2a60 contained a gap and was divided into

two sections, each containing over 100 residues. Decoy 2a60-2-25 was created from residues

25-124 of the second fragment. Multiple positions for a single residue or an entire chain are also

frequently seen in PDB entries. We consistently selected the first coordinate set if multiple

positions were provided. For multi-chained proteins, the chains were treated as separate entities.

Several proteins have multiple PDB entries; no attempt was made to rid the database of

redundant structures.

2.1.2 Specific Decoy Set

The Rosetta procedure has been described in depth elsewhere.63,83,110,111 Briefly,

generating decoys with Rosetta requires the initial formation of a fragment library using

Robetta.112-114 Robetta divides the target protein's sequence into fragments of three and nine residues

and searches the PDB for the possible structures of these fragments, which

represent the range of accessible local structure conformations. These fragments are then pieced

together randomly using a Monte Carlo simulated annealing procedure with an energy function

that favors hydrophobic burial, paired β-strands, and specific side-chain interactions. Each

decoy is evaluated by how well it compares to a protein-like structure based on statistics of

known protein structures.

2.2 Decoy Discrimination

As described in section 2.1, decoys are protein-like structures that may or may not look

like the target. The goal of the search process is to find a decoy (or small set of decoys) similar

in structure to the target protein. Using inter-residue distances from the target protein as

constraints, we distinguish between the good decoys (structures with low RMSDs using the

target protein as a reference) and the bad decoys (high RMSD structures). These distances can

be measured from experiments like nuclear Overhauser effects (NOE) in nuclear magnetic

resonance spectroscopy (NMR), electron paramagnetic resonance (EPR), and fluorescence

resonance energy transfer (FRET), which can measure short (3-7 Å), medium (10-25 Å), and

long (25-100 Å) distances, respectively. Such measurements are not exact; the probes in FRET

and EPR are constantly rotating and have a finite size making their exact orientation difficult to

predict. Also, the measured distances are between the spin labels, while we simulate the

experimental data using the distance between two α-carbons. The distance uncertainties in EPR

measurements, without considering spin orientation, are estimated to be around 5 Å.25 All of

these uncertainties must be taken into consideration in our search process. While comparing the

Cα-Cα distances of the target protein to those of the decoys, upper and lower bounds are placed

on the target protein's distance. A decoy satisfies the constraint only if its distance falls within

these bounds, which we call the constraint distance acceptance range.

The acceptance range indicates how much the decoy distance can differ from the target

distance and still satisfy the constraint. Smaller ranges, +/- 1 Å and +/- 2 Å, are too tight; some low

RMSD structures do not satisfy many constraints using these ranges. Also, these ranges are too small

to account for experimental uncertainty. Larger ranges, +/- 10 Å and +/- 15 Å, are too broad,

allowing high RMSD structures to satisfy several constraints. After many trials, a more

moderate range of +/- 5 Å was found to yield the best results. This range also compensates for

insertions and deletions, as the distance between two consecutive α-carbons is ~3.8 Å.
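
As an illustration of the acceptance test, the following minimal sketch checks whether a decoy satisfies a single constraint. It assumes a decoy is a list of α-carbon (x, y, z) tuples and a constraint is a tuple of two 1-based residue numbers and a target distance in Å; the names ca_distance and satisfies are hypothetical.

```python
import math


def ca_distance(coords, i, j):
    """Distance in angstroms between the alpha-carbons of residues i and j (1-based)."""
    return math.dist(coords[i - 1], coords[j - 1])


def satisfies(decoy, constraint, tolerance=5.0):
    """True if the decoy distance lies within +/- tolerance of the target distance."""
    i, j, target = constraint
    return abs(ca_distance(decoy, i, j) - target) <= tolerance


# Three residues on a straight line, 3.8 angstroms apart (toy data).
toy_decoy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
within_5 = satisfies(toy_decoy, (1, 3, 10.0))                 # |7.6 - 10.0| = 2.4
within_2 = satisfies(toy_decoy, (1, 3, 10.0), tolerance=2.0)
```

In this toy case the same decoy passes with the +/- 5 Å range but fails with +/- 2 Å, which is the kind of sensitivity to the range described above.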

In addition to finding an optimal constraint distance acceptance range, the choice of which

distance constraints to use is a key factor in the success of this method. When choosing

constraints, it is helpful to initially run the amino acid sequence through a secondary structure

prediction method. For our studies we used JPred,ll a consensus method that combines results from

six secondary structure prediction algorithms116-11 that use evolutionary information from

multiple sequences. Based on sequence information, JPred highlights which fragments of the

chain are more likely to exist as α-helices and which will be β-sheets. We identify approximate










regions of defined secondary structure to avoid choosing distances between atoms in loop

regions, which are highly mobile and less structurally defined, even in fairly similar structures.

Therefore, the most effective constraints are distances between atoms in defined areas of

secondary structure, like α-helices and β-sheets. These regions are usually more conserved as

they often play a significant role in the protein's function.

After using the acceptance range to compare the calculated distances, we scored the decoys

by counting the number of constraints each one satisfies. Over a series of trials, we found that

twenty-five distances were sufficient to rank the decoys; therefore, our scores range from 0 to 25.

Other researchers have found a similar amount of experimental distance information to be

necessary in structure prediction.120,121 This scoring method provides some protection against

poor constraint choices (constraints satisfied by high but not low RMSD structures). In our

previous trials (Chapter 3), constraints were applied sequentially and decoys not satisfying the

constraint were eliminated from the database at each step. The few decoys remaining after

several constraint applications were the structure predictions. When a poorly chosen constraint

was applied, low RMSD structures were immediately eliminated from the database and were,

therefore, unable to become the predicted structures.

The counting method makes the sum of the constraints more important than each one

individually, minimizing the effect of a few bad choices. Applying a poor constraint can result

in a low RMSD structure having a slightly lower score and a high RMSD structure having a

slightly higher score. The effects on the scores are so small that the low RMSD structures are

still predictable.
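
The counting scheme can be sketched as follows. This is an illustration under stated assumptions, not the original code: each decoy is represented by a dictionary of precomputed distances keyed by residue pair, and all names and numerical values are invented toy data.

```python
def score(decoy_distances, target_distances, tolerance=5.0):
    """Number of constraints whose decoy distance is within +/- tolerance of the target."""
    return sum(
        1
        for pair, target in target_distances.items()
        if abs(decoy_distances[pair] - target) <= tolerance
    )


def rank(decoys, target_distances, tolerance=5.0):
    """Return (score, name) tuples sorted from highest to lowest score."""
    scored = [(score(d, target_distances, tolerance), name) for name, d in decoys.items()]
    return sorted(scored, key=lambda item: (-item[0], item[1]))


# Toy data: the last constraint is a "poor" one (for example, a loop residue).
target = {(5, 40): 20.0, (12, 60): 31.0, (25, 80): 18.0, (33, 47): 9.0}
toy_decoys = {
    "good": {(5, 40): 21.5, (12, 60): 28.0, (25, 80): 19.0, (33, 47): 17.0},
    "bad": {(5, 40): 35.0, (12, 60): 12.0, (25, 80): 30.0, (33, 47): 10.0},
}
ranking = rank(toy_decoys, target)
```

Here the "good" decoy fails only the poor constraint and still ranks first, whereas sequential elimination would have discarded it as soon as that constraint was applied.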

In summary, our search process is divided into five steps: (1) Use a secondary structure

prediction method to identify important distance constraints. (2) Measure distances










experimentally. (3) Calculate the same set of distances for each decoy. (4) Compare each of the

target protein's distances with those of the decoys. A decoy satisfies a particular constraint if the

two distances are similar within the constraint distance acceptance range (+/- 5 Å). (5) Score

each decoy by counting the number of constraints it satisfies. We hypothesize that structures

with similar α-carbon distances will show similarities in overall structure. The decoys satisfying

the most constraints become our structure predictions. While testing our method, we search for a

protein of known structure. We can therefore simulate the experimental constraints in step 2 by

choosing a set of distances from the native structure of the target protein.

2.3 Choosing Constraints

There are many ways to choose constraints. In our early work (Chapter 3), we attempted

to randomly select constraints. Usually, a few of these random constraints involved atoms in

loop regions, which is problematic as these regions are often not structurally well defined. To

avoid choosing constraints from loop regions, we use a secondary structure prediction method to

identify these areas. With this knowledge, we selected constraints between all predicted

secondary structure elements (Chapters 4 and 5).

One can also choose constraints in a daisy-chain manner. In this method, each atom in a

constraint is also involved with another constraint. For example, let constraint 1 be the distance

between residues A and B. Then constraint 2 is composed of residues B and C while constraint 3

is derived from residues C and D. Constraints chosen in this manner are stronger than those

randomly chosen because they must obey the triangle inequality.

It is also possible to select a few atoms and use several of the distances between them as

the set of constraints. Many implicit constraints are imposed in this manner. Another method is

to choose a piece of secondary structure to serve as a reference and have all constraints involve









an atom from this region. This method of choosing constraints has been shown previously to be

significantly better than daisy-chaining.122 In our later work (Chapter 6), we used a combination

of these three methods.
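
The three selection schemes described in this section can be sketched as pair generators. The function names and residue numbers below are hypothetical, and the sketch only enumerates candidate residue pairs; it does not decide which distances are experimentally accessible.

```python
from itertools import combinations


def daisy_chain(residues):
    """Pairs A-B, B-C, C-D: each constraint shares a residue with the next one."""
    return list(zip(residues, residues[1:]))


def all_pairs(residues):
    """Every distance among a few selected residues; a subset may be used in practice."""
    return list(combinations(residues, 2))


def anchored(reference_residues, other_residues):
    """Every constraint involves one residue from a reference secondary structure element."""
    return [(ref, other) for ref in reference_residues for other in other_residues]


chain_pairs = daisy_chain([10, 25, 40, 60])
clique_pairs = all_pairs([10, 25, 40])
anchor_pairs = anchored([12, 14], [40, 60])
```

Keeping the schemes as separate generators makes it straightforward to combine them, as was done in the later work, by concatenating their outputs and removing duplicate pairs.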

2.4 Comparing Results

In the recent Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiments, the

Local-Global Alignment (LGA)123 (http://PredictionCenter.llnl.gov/local/lga) measure has been

used to evaluate the prediction results. Although RMSD calculations provide insight into global

similarities between protein structures, the LGA method was designed to measure similarities in

both global and local structure. This program creates several alignments between the structures

of the predicted model and the target to find those regions of the proteins that are most similar to

each other. The LGA method has two components, the longest continuous segment (LCS) and

the global distance test (GDT). Several iterations of both methods are usually required to find

the optimal alignments.

When comparing two protein structures, the LCS procedure searches for the alignment that

superimposes the longest section of continuous residues with an RMSD under a specified cutoff.

For example, suppose the RMSD cutoff is 5.0 Å and the first three residues of each structure

are aligned and have an RMSD of 4.0 Å. The program would then align residues 1-4 and compute

the RMSD again. If the RMSD were still under 5.0 Å, residue 5 would be included in the

calculation; otherwise, the RMSD between residues 2-5 would be computed. This process

continues as several alignments are sampled and then the LCS is identified. Any set of residues

in the model can be aligned to the target; they do not need to have the same location in each

sequence (e.g., residues 4, 5, and 6 of the model can be aligned to residues 23, 24, and 25 of the

target). Unless otherwise indicated, in this work we discuss LCS using a cutoff of 5.0 Å and use

the abbreviation LCS-5.









In the GDT method, the structures are aligned to find the largest set of residues that differ

by less than a selected distance cutoff. The distance cutoffs range from 0.5 to 10.0 Å and are

scanned at a 0.5 Å interval. For GDT, the largest set is not necessarily composed of continuous

residues. Pairs of residues are selected from both structures and a superposition and RMSD are

calculated. The superpositions are used as starting points to generate a list of equivalent residues

(α-carbon pairs from both proteins). After aligning the target and model structures, the distances

between the equivalent residues are calculated and the number of residue pairs with distances

under the cutoff is counted. The residues above the threshold are removed, others are added, and

the distances are calculated again. The initial list of equivalent residues is thus iteratively

extended to find the largest set of residues that satisfy a specific distance threshold.
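
The counting step of the GDT can be illustrated with a deliberately simplified sketch. It assumes the model and target are already superimposed and that residues are paired by index; the actual LGA program searches over many superpositions and iteratively extends the residue list, which this sketch does not attempt. The function name gdt_profile and the coordinates are hypothetical.

```python
import math


def gdt_profile(model, target, cutoffs=None):
    """Fraction of residue pairs closer than each cutoff, for one fixed superposition.

    model, target: equal-length lists of (x, y, z) alpha-carbon coordinates
    that are assumed to be superimposed already.
    """
    if cutoffs is None:
        cutoffs = [0.5 * k for k in range(1, 21)]  # 0.5, 1.0, ..., 10.0 angstroms
    deviations = [math.dist(m, t) for m, t in zip(model, target)]
    n = len(deviations)
    return {c: sum(d < c for d in deviations) / n for c in cutoffs}


# Toy structures with per-residue deviations of 0, 0.7, 3 and 12 angstroms.
toy_model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
toy_target = [(0.0, 0.0, 0.0), (1.0, 0.7, 0.0), (2.0, 3.0, 0.0), (3.0, 12.0, 0.0)]
profile = gdt_profile(toy_model, toy_target)
```

Because only one superposition is evaluated, the fractions returned here are lower bounds on what a full GDT search would report for the same pair of structures.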



Figure 2-1. An example of how decoys are generated from a single protein. A parent protein with
104 residues (N = 104) can be cut into 5 decoys (104 - 99 = 5), each with exactly 100
residues. The first decoy contains α-carbons 1-100 while the final decoy is
composed of α-carbons from residues 5-104 [(N - 99) to N].










CHAPTER 3
TRIALS AND ERRORS: DEVELOPING THE METHOD

This chapter discusses the development of the method. To test the idea for the method, we

searched through previously constructed decoy databases that were designed to test scoring

functions. We then developed our own decoy set according to the methods set out in Chapter 2.

A target protein with a structure known to be in the database was selected to test our search

protocol. Finally we predicted the structure of a protein whose native structure was not included

in the database.

3.1 Testing the Method on Previously Constructed Databases

We tested our search procedure using Samudrala and Levitt's pre-constructed databases

(http://dd.compbio.washington.edu) for four known protein structures (1bba,124 1b0n-b,125 1ctf,126

and 1dtk127), which ranged in length from 31 to 78 residues. For each individual target protein, a

unique set of decoys was created. The number of decoys for each target ranged from 216 to 501

(Table 3-1). Standard bond lengths and angles were used to generate the initial structures. The

trans configuration was used for all peptide bonds and predefined α-helices and β-sheets were

assigned ideal torsion (φ,ψ) angles of (-60°, -40°) and (-120°, 150°), respectively. For the

remaining residues in the loop regions, a range of random torsion angles was used: -120(±60)°

for φ and 150(±90)° for ψ.12

After obtaining these databases from Samudrala and Levitt, distance constraints for each

protein were chosen from the native structure (Table 3-1). To develop our search protocol, we

investigated the effect of using different types of constraints to distinguish the more native-like

decoys from the less native-like. We calculated the number of decoys in the database that satisfy

particular constraints to identify which distances were common to most decoys and which were









very different. Determining which constraints eliminated the most structures (long or short

distances) was investigated by applying the same set of constraints in different orders (randomly,

long to short distances, and short to long distances).

3.1.1 The Number of Structures Satisfying Specific Constraints

For each target protein, constraints were chosen by finding the most prevalent residue type

and calculating the distances between those residues. For 1bba, Ala was the most common

residue but the number of Ala-Ala distances was too small; Tyr-Tyr distances were also included

in the constraint list. The number of structures in the appropriate database satisfying each

constraint was then determined. We began these studies using a constraint distance acceptance

range of +/- 2 Å (we later found this range to be too small when applying the method to our

general decoy set). For example, if the distance between the Cα of residues A and B in the native

structure is 10 Å, a decoy will satisfy that constraint if it has a distance of 8-12 Å between the

Cα of residues A and B. For this test, we were not concerned with the quality of the decoys that

satisfy each constraint, only the number. We wanted to find the minimum number of constraints

that eliminate all structures in the database except that of the target.

As can be seen in Figure 3-1, care must be taken when choosing constraints. Distances

that are present in all of the decoys are not very effective constraints as they eliminate very few

structures. Some of the proteins have more complete data sets than others by having decoys that

sample a wider range of structural possibilities. As can be seen in Figure 3-1A, constraints 12,

15, and 16 for 1bba eliminate almost all decoys in the database. These constraints involve an atom

in a highly variable loop region. The same is true for constraints 7, 8, and 16 for 1b0n-b in

Figure 3-1B. For 1dtk and 1ctf (Figure 3-1D, E), most of the constraints are satisfied by fewer

than half of the decoys in the database.









3.1.2 The Effects of Applying Constraints in Different Orders

3.1.2.1 Randomly ordered constraints

For 1bba, all Ala-Ala and Tyr-Tyr distances were selected as constraints. These distances

ranged from 3.8 to 30.2 Å. A constraint distance acceptance range of +/- 2 Å was used as the

acceptance criterion. The same set of constraints was applied in three

different, randomly determined orders. In each of the three trials, many structures were eliminated

after applying a single constraint (Figure 3-2) and 5 to 13 constraints were needed to eliminate all

but one decoy. The Tyr-Tyr constraints were also found to eliminate more structures in the first

step than the Ala-Ala constraints, indicating that the decoys span a wide range of distances at

those points. As mentioned previously, some of these constraints involved atoms in a highly

mobile or ill-defined loop region. Despite their ability to eliminate many decoys, constraints

involving atoms in loop regions may not be the best choices due to their low resolution and high

variability between structures which are otherwise quite similar.

3.1.2.2 Same constraints in different order

For each target protein, a list of constraints was chosen and applied in the following orders:

(1) long to short distances and (2) short to long distances. For 1b0n-b, all Glu-Glu distances

were selected as constraints. While the total set of constraints had distances that ranged from 3.8

to 23.8 Å, the six constraints used in Trial 1 (long to short) ranged from 19.3 to 23.8 Å while the

nineteen constraints in Trial 2 (short to long) ranged from 3.8 to 23.8 Å (Figure 3-3A). For 1ctf,

all Val-Val distances were chosen as constraints and ranged from 16.4 to 24.7 Å in Trial 1 and

4.7 to 11.4 Å in Trial 2 (Figure 3-3B). As seen in Figure 3-3, the number of needed

constraints depends on the order of application. In each trial, fewer constraints were needed

when the longer distance constraints were used initially. Similar results were seen for 1bba and









1dtk. In each case, the final decoy remaining in the database was the native structure of the

target and therefore satisfied all constraints.

Applying longer constraints first eliminates decoys with vastly different overall structures

compared to the target protein. Shorter constraints eliminate decoys with differing local

structure. We have found that the search time is shortened by first removing structures with

great overall differences by applying longer constraints and then applying shorter constraints to

fine tune the structure. For 1b0n-b, applying long constraints first (Trial 1, Figure 3-3A) requires

only six constraints, far fewer than the nineteen needed when applying short constraints first

(Trial 2, Figure 3-3A). Some constraints do not eliminate any additional decoys, resulting in the

plateaus seen in Figure 3-3. For 1ctf, application of long constraints first (Trial 1, Figure 3-3B)

requires seven constraints, whereas applying short constraints first (Trial 2, Figure 3-3B)

requires thirteen. It was also found that changing the constraint distance acceptance range from

+/- 2 Å to +/- 1 Å did not change the number of distances required to find the correct structure.
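
The effect of constraint order on sequential elimination can be reproduced with a small sketch. The decoy names and distances below are invented toy values chosen only to show the mechanism; they are not taken from the databases used in this chapter, and the function name constraints_needed is hypothetical.

```python
def constraints_needed(decoys, constraints, tolerance=2.0):
    """Apply constraints in the given order, discarding decoys that fail each one.

    decoys: dict of name -> {residue pair: distance}.
    constraints: ordered list of (residue pair, target distance).
    Returns (number of constraints applied, sorted names of surviving decoys).
    """
    survivors = dict(decoys)
    for used, (pair, target) in enumerate(constraints, start=1):
        survivors = {
            name: dist
            for name, dist in survivors.items()
            if abs(dist[pair] - target) <= tolerance
        }
        if len(survivors) <= 1:
            return used, sorted(survivors)
    return len(constraints), sorted(survivors)


toy_db = {
    "native": {(1, 5): 6.0, (5, 20): 15.0, (1, 30): 24.0},
    "d1": {(1, 5): 6.5, (5, 20): 9.0, (1, 30): 12.0},
    "d2": {(1, 5): 5.5, (5, 20): 14.0, (1, 30): 10.0},
    "d3": {(1, 5): 6.2, (5, 20): 21.0, (1, 30): 15.0},
}
native_constraints = list(toy_db["native"].items())
long_first = sorted(native_constraints, key=lambda c: -c[1])
short_first = sorted(native_constraints, key=lambda c: c[1])
result_long = constraints_needed(toy_db, long_first)
result_short = constraints_needed(toy_db, short_first)
```

In this toy database the short distance is shared by every decoy, so applying it first eliminates nothing, while the long distance isolates the native structure in a single step.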

3.2 Developing a Search Protocol Using a Structure Known to be in Our Database

Our general decoy set was generated as described in Chapter 2. To test our search

procedure using our database, we chose 2ezm,129 Cyanovirin-N (an HIV inactivating protein).

The database contains two decoys generated from the 101 residue 2ezm target protein. Our

decoy set is spiked--the correct structure is definitely present because the parent protein met all

the requirements to be included in the decoy generation process.

In Trial 1, constraints were chosen with distances ranging from 10.1 to 24.7 Å. They were

also selected so that the atoms were 8 to 59 residues apart in the sequence. For

Trial 2, constraints were chosen in the distance range of 5.2 to 17.9 Å. The constraint atoms were

also required to be 3 to 10 residues apart, much closer than those constraints used

in Trial 1. The constraint distance acceptance range was set to +/- 2 Å, as used in previous trials.









Due to the changes in constraints, Trial 2 needed more than twice the number of constraints

needed in Trial 1. From these results we can conclude that it is more efficient to consider longer

distances between atoms several residues apart when choosing constraints.

In each trial we were able to narrow down our search to the three structures shown in

Figure 3-4, 2ezm-1,129 2ezn-1,129 and 1iiy-1.130 Their structures are virtually indistinguishable

because they are all from the same HIV inactivating protein. The PDB entries differ as follows:

2ezn represents an ensemble of NMR structures, 2ezm is only the mean NMR structure, and 1iiy

is the mean NMR structure with a ligand. When shorter distance constraints were used and

restrictions were placed on the number of residues apart the two atoms were allowed to be, the

number of constraints needed increased from eight in Trial 1 to twenty in Trial 2.

3.3 Developing a Search Protocol Using a Structure Not in Our Database

Next, a search of the database was performed to find a protein whose structure was not

included in developing the database. Target protein 1b4c131 is a homodimer of S100 beta

subunits, each 92 residues in length. It has been classified as a metal-binding protein. Due to the

nature of the database, the structure of only one chain was chosen as the search target.

3.3.1 Constraint Distance Acceptance Ranges: +/- 2 Å and +/- 4 Å

Our first task was to find upper and lower bounds to use as the constraint distance

acceptance range. We started with +/- 2 Å as in previous trials (sections 3.1, 3.2). Constraint

distances were chosen to be between 11.0 Å and 25.6 Å with an average distance of 20.6 Å.

Seven constraints were required to eliminate all but twenty-five structures. Three decoys

satisfied eight constraints and can be found in Figure 3-5. The RMSDs for the decoys satisfying

seven and eight constraints are listed in Table 3-2. The RMSD for the top three structures was

14.4 Å. The parent proteins for these three decoys are all related; 1ky7132 and 1n68133 are the

multi-copper oxidase (CueO) and 1pf3133 is the M441L mutant of the same protein. The 1pf3









decoy found in the search is a fragment of the protein which does not contain the mutation.

These three decoys are nearly identical because of the redundant nature of the database.

The average RMSD for the top twenty-five decoys was found to be 13.3 Å. Applying the

final constraint apparently removed some of the better (lower RMSD) structures. This search

was unable to predict the correct secondary structure. The native structure of 1b4c has five

α-helices, whereas this search found decoys with four β-sheets and one short α-helix.

The RMSDs found in Trial 1 indicate that the three decoys remaining in the database are

not reliable predictions. Reva et al.134 showed that a structure with an RMSD of less than 6.0 Å

is a successful prediction for small proteins. If the acceptance range is too tight, low RMSD

decoys may not satisfy all constraints. In order to improve our results, the constraint distance

acceptance range was increased from +/- 2 Å to +/- 4 Å. Thirteen constraints were required to

eliminate all but four decoys. The remaining structures can be found in Figure 3-6. The parent

proteins of these decoys are all dehydrogenases; 1h0h135 is a formate dehydrogenase while

1nek136 and 1nen136 are succinate dehydrogenases. Increasing the distance range improved the

quality of the final structures. A better prediction of secondary structure is made as the decoys

are found to have four α-helices and only two small β-sheets. The RMSDs of 12.2 Å and 13.3 Å

are slightly better than those obtained with the +/- 2 Å range used in Trial 1, but they are still too high for

this method to be considered a success.

3.3.2 Calculation of All RMSDs

Due to the high RMSD values of the final structures found in previous trials, we calculated

the RMSDs for all the decoys using the native structure of 1b4c as the reference to determine if

any "good" (low RMSD) decoys existed in our database. The distribution of the RMSDs in

Figure 3-7 shows that most of the structures are within 12-20 Å. We found 353 structures that









have RMSDs less than 7 Å and 85 with RMSDs less than 6 Å. The structures with the best

RMSDs can be found in Table 3-3 and are depicted in Figure 3-8.

It was found that the good structures were eliminated during our search procedure because

some of the chosen constraint atoms are in loop regions which differ greatly among the

structures. Other distance constraints were between atoms that were shifted slightly in the

sequence due to insertions and deletions. Figure 3-9 shows an example of how insertions and

deletions can hinder successful predictions. The two proteins in the figure differ only in their

loop regions; the α-helical sections are highly conserved, giving rise to a very small RMSD.

Using the distance between residues 10 and 13 as a constraint, the present searching method

would eliminate the black decoy as an improbable structure because the extra residue in the loop

region adds to the length of the distance of interest. To overcome this problem, we increased the

acceptable distance range from +/- 4 Å to +/- 12 Å.

3.3.3 Constraint Distance Acceptance Range of +/- 12 Å and +/- 12 Å + +/- 10 Å

Using similar constraints as in the previous trials (sections 3.3.1, 3.3.2) but with a

constraint distance acceptance range of +/- 12 Å, seventy-five constraints were required to find

the top 1,163 decoys. The structures with the seven lowest RMSDs calculated previously were

in the final decoy set. The finding of the seven lowest RMSD structures showed our method to

have some promise; however, without knowing the structure a priori, it would be extremely

difficult to distinguish between the seven good and over one thousand bad decoys. It can

therefore be concluded that this constraint distance acceptance range of +/- 12 Å is too large to

adequately eliminate the least likely decoys.

In order to eliminate more structures, a constraint distance acceptance range of +/- 12 Å

was employed for the first twenty-five constraints and +/- 10 Å was used for the next twenty-five









constraints. After these fifty constraints were applied, only 620 structures remained. The seven

lowest RMSD structures were once again among the remaining decoys. Unfortunately, 620

structures is still a rather large number and 50 constraints are far too many for this method to be

cost effective.

3.3.4 Block of Distances

Instead of comparing one distance to the native structure at a time, in this trial a block of

distances +/- 2 residues from the distance of interest was compared. For example, if the

experimental data indicated that residues 10 and 20 were 15 Å apart, we would calculate all the

distances between residues 8, 9, 10, 11, 12 and 18, 19, 20, 21, 22. These twenty-five distances

make up the block for each constraint. For each block, the distance range, maximum, and

minimum were calculated. A decoy satisfied the constraint if the native structure distance was

found to be in the distance range (min - 2 Å, max + 2 Å). Two small restrictions were placed on

the constraints: (1) the distance constraints ranged from 15.1 to 35.2 Å; (2) the atom pairs were

between 9 and 77 residues apart in the sequence.
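
The block test can be sketched as follows, assuming a decoy is a list of α-carbon (x, y, z) tuples with 1-based residue numbering. The function name block_satisfied is hypothetical, and the clamping of the block at the chain ends is an assumption of this sketch rather than something stated in the text.

```python
import math


def block_satisfied(decoy, i, j, native_distance, half_width=2, margin=2.0):
    """Block-of-distances test for one constraint between residues i and j.

    All distances between residues i +/- half_width and j +/- half_width are
    computed (25 distances for half_width = 2). The constraint is satisfied
    if the native distance lies within [min - margin, max + margin].
    """
    n = len(decoy)
    # Assumption: residues falling outside the chain are simply skipped.
    rows = [r for r in range(i - half_width, i + half_width + 1) if 1 <= r <= n]
    cols = [c for c in range(j - half_width, j + half_width + 1) if 1 <= c <= n]
    block = [math.dist(decoy[r - 1], decoy[c - 1]) for r in rows for c in cols]
    return min(block) - margin <= native_distance <= max(block) + margin


# Toy decoy: a fully extended chain with 3.8 angstroms between consecutive residues.
extended = [(3.8 * k, 0.0, 0.0) for k in range(30)]
too_short = block_satisfied(extended, 10, 20, 15.0)   # block spans about 22.8-53.2
in_range = block_satisfied(extended, 10, 20, 38.0)
```

For the extended toy chain, a native distance of 15 Å between residues 10 and 20 falls below the block minimum even after the 2 Å margin, so the decoy is rejected, while a 38 Å native distance is accepted.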

Twenty-five constraints were required to find the top 943 structures, of which only 79 had
RMSDs less than 7.0 Å. Application of 32 constraints left five structures, which are listed
in Table 3-4 with their RMSDs. This method showed some improvement over the method
used in section 3.3.3. Of the final structures remaining in the database, only one had
an RMSD greater than the cutoff for a successful prediction. The four
good structures can be found in Figure 3-8, and the higher RMSD decoy, from parent protein
1mka,137 can be found in Figure 3-10. The decoy from 1mka shares very few structural
similarities with 1b4c. It has two β-sheets as well as two α-helices that do not align well with
1b4c. A large distance range for the constraint acceptance requirements and poorly chosen
constraints may explain why this decoy was not eliminated during the search process. The large
distance range also requires too many constraints, making this method computationally
expensive.

3.3.5 Vary the Order of Constraint Application

As seen during the initial testing of our method (see section 3.1), the results of each trial
depend greatly on the order in which the constraints are applied. Several trials using the same
constraints in different orders were performed. The constraint distance acceptance range was set
to ±5 Å.

The first set of constraints used the same order as that used in the previous trial (section
3.3.4). Eighteen constraints were satisfied by seventeen decoys with RMSDs ranging from 10.8
to 15.2 Å (Table 3-5). None of the low RMSD structures was found. It was
discovered that the lowest RMSD decoy satisfied 24 of the 25 constraints. One of the atoms in
the unsatisfied constraint is in the middle of a loop region. These regions are known to be
highly flexible, giving rise to very different conformations even in otherwise similar proteins.
We placed this constraint at the end of the list and performed the trial again. Seven decoys
remained after 21 constraints. The RMSDs of these seven decoys can be found in Table
3-5. This trial found two of the best decoys, but it found five high RMSD decoys
as well. A final trial was performed using another order of the constraints. Twenty-one
constraints were required to eliminate all but six decoys. The same two low RMSD decoys were
found as in the previous order. The final trial also found four high RMSD decoys, which were
different from those found earlier.

Initially it was assumed that upon varying the order of the application of the constraints,
the low RMSD decoys would remain in the database more often than the less probable ones. As
long as the bad constraint was placed at the end of the list, the two lowest RMSD decoys were
always found. Because one would not know a priori whether a bad constraint was being used, this
method is not as effective as we would like it to be.
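The order dependence discussed above can be illustrated with a minimal sketch of sequential elimination. It assumes, as a simplification, that the search stops as soon as a chosen number of decoys remains; `sequential_filter`, `stop_at`, and the toy coordinates are hypothetical and are not taken from the actual search program.

```python
import math


def sequential_filter(decoys, constraints, tolerance=5.0, stop_at=1):
    """Apply constraints one at a time, discarding decoys that fail each one.

    decoys: dict mapping decoy name -> {residue number: (x, y, z)}.
    constraints: ordered list of (i, j, target_distance) tuples.
    Returns the sorted names of surviving decoys and the number of constraints used.
    """
    remaining = dict(decoys)
    used = 0
    for i, j, target in constraints:
        if len(remaining) <= stop_at:
            break  # few enough decoys remain; later constraints are never applied
        remaining = {
            name: ca
            for name, ca in remaining.items()
            if abs(math.dist(ca[i], ca[j]) - target) <= tolerance
        }
        used += 1
    return sorted(remaining), used


# Toy data (illustrative only): "good" resembles the target except at residue 3,
# which stands in for a flexible loop position.
toy_decoys = {
    "good": {1: (0.0, 0.0, 0.0), 2: (10.0, 0.0, 0.0), 3: (0.0, 30.0, 0.0)},
    "bad": {1: (0.0, 0.0, 0.0), 2: (25.0, 0.0, 0.0), 3: (0.0, 20.0, 0.0)},
}
reliable = (1, 2, 10.0)  # constraint between well-defined positions
loop = (1, 3, 20.0)      # "bad" constraint involving the loop position
```

With `[reliable, loop]` the search stops before the loop constraint is applied and only "good" survives; with `[loop, reliable]` the good decoy is discarded first. This mirrors the observation that the low RMSD decoys are retained only when the bad constraint comes last.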

3.3.6 Count the Number of Satisfied Constraints for Each Decoy

In order to remove the dependence on the order of constraint application, we counted the
number of constraints each decoy satisfied. We assumed that the decoys that satisfied the most
constraints would have the lowest RMSDs. We performed a trial using the same constraints as
those used in the previous trial with a distance range of ±5 Å. Four decoys
satisfied all twenty-five constraints. The RMSDs of these structures are listed in Table 3-6. Two of
the lowest RMSD decoys were found along with two rather high RMSD decoys.
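A minimal sketch of the counting scheme follows, assuming each decoy is available as a mapping from residue number to alpha-carbon coordinates. The names `score_decoy` and `rank_decoys` are hypothetical; the ±5 Å default mirrors the acceptance range quoted above.

```python
import math


def score_decoy(decoy_ca, constraints, tolerance=5.0):
    """Count the constraints (i, j, target_distance) a decoy satisfies within +/- tolerance."""
    score = 0
    for i, j, target in constraints:
        if abs(math.dist(decoy_ca[i], decoy_ca[j]) - target) <= tolerance:
            score += 1
    return score


def rank_decoys(decoys, constraints, tolerance=5.0):
    """Return (name, score) pairs sorted from highest to lowest score."""
    scores = {
        name: score_decoy(ca, constraints, tolerance) for name, ca in decoys.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Because every decoy is scored against every constraint, the ranking does not depend on the order of the constraint list, and a single unsatisfied constraint lowers a decoy's score by one instead of eliminating it.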

A slightly different set of constraints was selected that includes only distances between
atoms that are involved in secondary structure, not the loop regions. The distances were chosen
to be between 11.9 and 26.9 Å. Four decoys were found to satisfy these twenty-five constraints.
They have the lowest RMSDs in the database (Table 3-6, Trial 2). As seen with the previous set
of constraints, half of the decoys in the database satisfy ~11 constraints. These data are shown in
Figure 3-11 and are remarkably similar to those obtained using the other set of distance
constraints.

3.4 Determination of an Average RMSD Distribution

Because the four target proteins had similar RMSD distributions, we wanted to determine a
random average RMSD, that is, the RMSD of two randomly chosen structures in the decoy set. We
calculated the RMSD of each decoy in the database using other decoys as references. The five
reference decoys can be found in Figure 3-12. As explained in Chapter 2, the decoys are
100-residue-long fragments of larger proteins. The decoys in Figures 3-12A, 3-12C, and 3-12E, from
parent proteins 1b7u,138 1ujn,139 and 1rt6,140 have both α-helices and β-sheets, while Figure
3-12B (from parent protein 1fhx141) represents an all-β-sheet protein and Figure 3-12D (from
parent protein 2wrp142) contains only α-helices. Most of these decoys are folded rather tightly
and resemble small proteins. The decoy from 1rt6 (Figure 3-12E), however, is a fragment of a
very large multi-domain protein, HIV-1 reverse transcriptase. This particular decoy contains
residues in two domains even though they are connected through the same chain. Decoys like
this one can account for some of the poor RMSD values calculated with other references.

For 1rt6-109, the RMSDs are shifted to the right with average values between 15 Å and 25
Å (Figure 3-13), indicating that most decoys are less similar to it than to the more compactly folded
structures. Our database contains some less compact, semi-folded decoys, and a search for such a
protein may result in finding a reliable decoy where searches of other databases may not.

3.5 Summary of Methods

Constraint ranges of ±2 Å and ±4 Å were found to be too small to obtain good results,
while a constraint range of ±12 Å is far too large. The block method was able to find the
lowest RMSD structures, but it required too much computer time and too many constraints to do
so. The method of counting the number of constraints that each decoy satisfies has yielded the
best results thus far. Target 1b4c was studied previously using a de novo protein structure
prediction algorithm which employed Rosetta.143 The 3.6 Å RMSD of our best decoy was
slightly better than that of their best-scoring cluster, which had an RMSD of 4.6 Å. We will further
discuss the application of this method to other proteins.










Table 3-1. Comparison of input for the four target proteins

Target    Number of decoys    Range of distance    Number of residues
          in database         constraints, Å       in sequence
1bba      501                 3.8-30.2             36
1b0n-b    498                 3.8-23.8             31
1ctf      498                 4.7-24.7             78
1dtk      216                 5.3-27.4             57



Table 3-2. RMSDs for decoys satisfying the most constraints

Decoy        RMSD (Å)    Decoy        RMSD (Å)
1z7q-n       10.7        11kt7-473    14.1
1ivr         11.9        11kt5-473    14.1
1gtm-c       11.9        11kt7-702    14.1
1ymy         11.9        11k5-15      14.1
1qol-a       12.0        11lk5-244    14.1
1qol-b       12.0        11kt5-702    14.1
1qol-e       12.0        11k7-15      14.1
1qol-f       12.0        1ky7-1       14.4
1qol-h       12.0        1n68-1       14.4
1t3q-c       13.8        1pf3-1       14.4
1t3q-f       13.9        1khy-1       14.4
1sl8         14.0        1khw-1       14.5
11kt7-244    14.1

A constraint distance acceptance range of ±2 Å was used.


Table 3-3. Lowest RMSD decoys in database using 1b4c as a reference

Decoy       RMSD (Å)
1m31-b-2    3.6
1m31-a-2    3.6
1m31-a-1    4.8
1m31-b-1    4.8
1nsh-b-2    4.9
1nsh-a-2    4.9
1wlm-7      5.9
1wlm-8      5.4
1wlm-9      5.1
1wlm-10     5.6
1psr-a-1    5.3
1psr-b-1    5.3


Table 3-4. Decoys remaining after 32 constraints using the block method

Decoy       RMSD (Å)
1m31-a-2    3.6
1m31-b-2    3.6
1mka-49     10.2
1psr-1      5.3
1psr-1      5.3


Table 3-5. Lowest RMSD decoys found in varying the order of constraint application

Trial 1                  Trial 2                  Trial 3
Decoy        RMSD (Å)    Decoy        RMSD (Å)    Decoy         RMSD (Å)
lag-3-8      11.3        1m31-a-2     3.6         1m31-a-2      3.6
1 r4-1-96    15.1        1m31-b-2     3.6         1m31-b-2      3.6
1rif-4       12.9        1nzc-4-95    12.7        1vgw-a-2-1    15.4
1vid-1-95    15.2        1f8x-1-11    13.2        1vgw-d-2-1    15.4
2a72-1-2     11.8        1f8x-2-11    13.2        1vgw-e-2-1    15.5
2a72-2-2     11.8        1f8y-1-11    12.9        1vgz-4-1      15.4
2af0-1-22    12.0        1f8y-2-11    13.1
2bt2-1-16    11.0
2bt2-2-16    11.0
2bt2-3-17    11.0
2bt2-4-15    11.1
2bt2-5-17    10.9
2bv1-1-11    11.2
2bv1-2-10    11.3
1ezt-1-8     10.8
1fgk-2-12    14.5
1hld-1-106   15.1


Table 3-6. Lowest RMSD decoys found using the count method for both trials

Trial 1                 Trial 2
Decoy       RMSD (Å)    Decoy       RMSD (Å)
1m31-a-2    3.6         1m31-a-1    4.8
1m31-b-2    3.6         1m31-a-2    3.6
1hz4-141    11.0        1m31-b-1    4.8
1hz4-142    11.3        1m31-b-2    3.6

Trial 1 uses the original set of constraints. In Trial 2, the constraint involving atoms in the loop
region is replaced by one between atoms in defined areas of secondary structure.










[Plots for panels A-D not reproduced. x-axis of each panel: constraints ordered from small to large. Legend entries recovered from the plots include Glu-Glu.]

Figure 3-1. The results of counting the number of decoys that satisfy each constraint. Constraints
are numbered from shortest to longest distance. A) 1bba; bars in pink correspond to
Tyr-Tyr constraints. B) 1b0n-b. C) 1ctf. D) 1dtk. Constraints were selected
between residues of the most prevalent type for each target.































[Plot not reproduced. x-axis: number of constraints. Series: Trial 1, Trial 2, Trial 3.]

Figure 3-2. Application of randomly ordered constraints for 1bba. The three trials used the same
constraints in different, random orders.








[Plots for panels A and B not reproduced. x-axis: number of constraints. Series: Trial 1, Trial 2.]

Figure 3-3. Results for using the same set of constraints in different orders. In trial 1, larger
constraints were applied first and then smaller constraints until only the target
structure remained in the database. In trial 2, constraints were applied from small
distances to larger distances. A) 1b0n-b. B) 1ctf. In each case, the final structure
remaining in the database was the native structure of the target protein and therefore
satisfied all constraints.



















Figure 3-4. The superimposed images of the results of the 2ezm search. The top scoring decoys
are 2ezm-1, 2ezn-1, and 1iiy-1. All three PDB codes represent the same protein. 1iiy
contains a ligand which was not included in the decoys.













Figure 3-5. Results from Trial 1. A) 1b4c. B) The final three remaining decoys after eight
constraints with a ±2 Å distance range: 1ky7-1, 1n68-1, and 1pf3-1.


Figure 3-6. The target protein and the final four remaining decoys after 13 constraints with a
±4 Å distance range. A) The native structure of 1b4c. B) 1h~h-a-334 and 1h~h-k-334.
Each had an RMSD of 12.2 Å. C) 1nek-a-246 and 1nen-a-246. Each had an RMSD
of 13.3 Å. 1b4c is represented using a slightly different orientation than that in
Figure 3-5.














[Histograms not reproduced. x-axis: RMSD, Å.]

Figure 3-7. Histogram of RMSDs for all decoys in the database using 1b4c as a reference. The
histogram of RMSDs for all decoys with RMSDs less than 7.0 Å is also included.












Figure 3-8. Decoys with the lowest RMSDs in database using 1b4c as a reference. A) 1b4c. B)
1m31-a-2. C) 1nsh-a-2. D) 1wlm-9. E) 1psr-a-1.
























Figure 3-9. Schematic diagram of how an insertion in a loop region can affect the search process.
The red structure represents the native structure of our example target protein and the
black structure represents a decoy.


Figure 3-10. Decoy 1mka-49 (shown in yellow) satisfied many constraints for the 1b4c target
(shown in blue).

















[Plot not reproduced. x-axis: number of constraints.]

Figure 3-11. Graph of the number of decoys vs. the number of constraints each decoy satisfies
for both trials. Fifty percent of the decoys satisfy 11 constraints.








Figure 3-12. Five decoys used to determine a random average RMSD for our decoy database. A)
1b7u-109. B) 1fhx-13. C) 1ujn-75. D) 2wrp-15. E) 1rt6-109.

















[Histograms not reproduced. x-axis: RMSD, Å. One series per reference decoy.]

Figure 3-13. Histograms of RMSDs for five randomly chosen decoys: 1b7u, 1fhx, 1rt6, 1ujn, and
2wrp.









CHAPTER 4
RESULTS: USING OUR DECOY SET TO FIND FOUR PROTEINS

We attempted to find the structures of four proteins using our database. The target proteins
were those with PDB codes 1b4c,131 1ghh,144 1ubi,145 and 2ezk.146 We chose these specific proteins
because they were previously used to evaluate other methods.143,147 All target proteins contain
fewer than 100 residues and are therefore not explicitly included in the decoy set. Twenty-five
distance constraints were chosen for each protein using the secondary structure prediction
method JPred (to avoid the loop regions). We will first evaluate the decoy generation method
and then analyze the decoy discrimination process.

4.1 Completeness of Decoy Set

Because we know the structures of the target proteins a priori, we can evaluate our decoy
set by calculating RMSDs for all of the decoys using each target protein as a reference. The
RMSD distributions (Figure 4-1) are similar for each of the targets and show that most of the
decoys have RMSDs within 12-20 Å. We also targeted another five proteins and found a
similar distribution. Because it is commonly assumed that a good structure prediction for a
small protein is one with an RMSD lower than 6.0 Å,134 it will be difficult to find the few good
decoys in the set.

Because the distributions are skewed Gaussians, only a few decoys are expected to have
RMSDs under 6.0 Å. Assuming a perfect Gaussian distribution and using the standard deviation
and mean RMSD for each target, we calculated the number of decoys expected to have RMSDs
under the following cutoffs: 6 Å, 7 Å, 8 Å, and 9 Å (Table 4-1). Comparing the number
extrapolated from a perfect Gaussian to the number of decoys found within each RMSD cutoff in
our decoy set, we find that the decoy set number is consistently much lower. It is harder to find
low RMSD decoys than it would be if the distribution were perfectly Gaussian.
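The "hypothetical" counts can be reproduced in principle from the normal cumulative distribution function. The sketch below assumes an ideal Gaussian with a given mean and standard deviation; the function name `expected_below` is hypothetical, and the per-target means and standard deviations are not listed in this chapter, so none are hard-coded.

```python
import math


def expected_below(cutoff, mean, std_dev, n_decoys):
    """Expected number of decoys with RMSD below `cutoff` for an ideal Gaussian.

    Evaluates n_decoys * Phi((cutoff - mean) / std_dev), where Phi is the
    standard normal cumulative distribution function written with math.erf.
    """
    z = (cutoff - mean) / std_dev
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return n_decoys * cdf
```

Calling `expected_below(6.0, mean, std_dev, 8060245)` with the measured mean and standard deviation for a target would give the kind of value reported in the "hypothetical" columns of Table 4-1. The observed counts fall below these values, which indicates that the low RMSD tail of the real distribution is thinner than that of the ideal Gaussian.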









4.2 Evaluation of Decoy Discrimination

4.2.1 Target 1b4c,131 Apo-S100β

Our first target protein is 1b4c (Figure 4-2), a homodimer of S100 beta subunits, each 92
residues in length. It has been classified as a metal-binding protein. Due to the nature of the
database, the structure of only one chain was chosen as the target. Using 1b4c as a reference, we
found 85 decoys with RMSDs less than 6 Å. Like 1b4c, 1m31 (apo-Mts1)148 and 1nsh
(apo-S100A1)149 are both metal-binding proteins. PDB code 1psr150 is the psoriasin protein, while
1wlm151 is CGI-38 and currently has no known classification.

Four decoys were found to satisfy twenty-five constraints, which ranged in distance from
11.9 Å to 26.9 Å. They have the lowest RMSDs in the database using 1b4c as a reference (3.6 Å
and 4.8 Å) and are all from the same parent protein, 1m31 (Figure 4-2). Hypotheses 1 and 2 were
satisfied: a low RMSD decoy was in the database, and this decoy shared a small set of similar
distances with the target. Target 1b4c was studied by Meiler and Baker using a de novo protein
structure prediction algorithm which employed Rosetta.143 The 3.6 Å RMSD of our best decoy
was slightly better than that of their best-scoring cluster, which had an RMSD of 4.6 Å.

4.2.2 Target 1ghh,144 DNA-Damage-Inducible Protein I (DinI)

Our next target protein, 1ghh, is composed of 81 residues and can be found in Figure 4-3.
Of the 8 million decoys in the database, 85 had RMSDs less than 6 Å with 1ghh as a reference.
The RMSD distribution was very similar to that seen for 1b4c (Figure 4-1). The distance
constraints ranged from 11.4 Å to 21.1 Å. Our method successfully identified the lowest RMSD
decoys in the database. Eight decoys satisfied all twenty-five constraints; their structures and
RMSDs are shown in Figure 4-3.

Because no attempt has been made to remove redundant structures from the database, some
of the top scoring decoys come from the same parent proteins with different PDB codes.
For example, 1iwg,152 1oy6,153 and 1t9u154 represent acriflavine resistance protein B. ISHp608
transposase is represented by 2a6m and 2a6o.155 PDB code 1vh2156 is the autoinducer-2
synthesis protein.

A sequence homology search using BLAST33 was unable to find any structures similar to 1ghh in
the PDB. Our method has an advantage in that we are able to generate low RMSD structures
with little sequence homology. Often structural relationships are more conserved than
sequence.1,2,157,158 We found decoys with RMSDs as low as 4.9 Å, which is very similar to the
4.8 Å RMSD value found by Meiler and Baker143 for this protein.

The eight top scoring decoys each have three β-sheets and at least two α-helical regions.
Despite the low RMSD values, the target protein has a pair of parallel β-sheets and a pair of
anti-parallel β-sheets (see Figure 4-3) while the decoys have only anti-parallel β-sheet
orientations. In all of these structures, the β-sheets have, as usual, distances of ~5 Å between
α-carbons on adjacent strands. The small distance between the β-strands allows for a low RMSD
between the overall structures despite an incorrect topology.

The study of this protein indicates that proteins with β-sheets may have low RMSD (~5-6
Å) decoys with incorrect topology. Our preliminary results on other proteins also show low
RMSD decoys with various β-sheet orientations. For these types of proteins, RMSD alone may
not be a useful indicator of a successful prediction.

4.2.3 Target 1ubi,145 Ubiquitin

PDB code 1ubi represents the well-studied ubiquitin protein (Figure 4-4). It is composed
of 76 residues. The RMSDs of all the decoys were calculated using 1ubi as a reference, and
seven decoys were found to have RMSDs less than 6.0 Å.









The chosen distance constraints ranged from 7.3 Å to 19.5 Å. Two decoys from 1z2m159
satisfied twenty-five constraints and can be found in Figure 4-4. Parent protein 1z2m is an
interferon-induced ubiquitin-like protein, so its similarity to our target is not surprising. The
RMSDs for both decoys were 3.9 Å, the lowest in the database using 1ubi as a
reference. This RMSD value was similar to that of the top-scoring cluster found using Rosetta,143 3.4 Å.



4.2.4 Target 2ezk,146 Mu End DNA-Binding Iβ Subdomain of Phage Mu Transposase

Target 2ezk has 93 residues. It was selected as our final target protein because Kihara et
al.147 used it to test their method and had some difficulty finding a low RMSD model. A BLAST
search of this target found one other protein with sequence homology, 2ezl.146 With only 93
residues, 2ezl was too small to be included in our database. RMSDs for all the decoys in the
database were calculated, and 41 decoys had RMSDs between 7.7 Å and 8.0 Å. No decoy in our
database had an RMSD less than 6.0 Å; our database does not contain a good decoy. This
93-residue segment is not similar to any piece of a larger protein.

The distance constraints ranged from 12.0 to 18.5 Å. Nine decoys satisfied twenty-five
constraints. Six decoys came from parent protein 1ngk,160 while 1v2a161 was the parent protein
for three decoys (Figure 4-5). The parent proteins seem to have functions unrelated to that of the
target. Mycobacterium tuberculosis hemoglobin O (1ngk) has possible functions in oxygen
storage and transport, while 1v2a is a glutathione transferase isoenzyme. All of the decoys from
1v2a had RMSDs of 7.7 Å using the target as a reference, while the decoys from 1ngk had
RMSDs of 11.6-11.7 Å.

4.2.5 Comparison of Search Process for All Target Proteins

For each target protein, half of the decoys satisfied at least 10 to 12 constraints (Figure
4-6A). All the target proteins show a similar Gaussian distribution of decoy scores (Figure 4-6B).
Most decoys satisfy at least one constraint, but very few satisfy all twenty-five. The RMSD
distribution for each protein (Figure 4-1) is similar in shape to Figure 4-6B, suggesting a
relationship between the score (the number of satisfied constraints) and the RMSD. The low
RMSD structures satisfy more constraints than those with high RMSDs. The strong correlation
between RMSD and score is seen more clearly in Figure 4-7: low RMSD decoys have high
scores, and high RMSD decoys have low scores. Also, the average RMSD decreases with an
increase in score. As seen in Figure 4-7, three of the target proteins have low RMSD decoys (< 6
Å) that satisfy all constraints. In general, decoys with scores between 10 and 15 have RMSDs
between 15 and 20 Å, while decoys with a score of less than 10 have RMSDs greater than 25 Å.
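The two curves summarized above (percent of decoys with exactly a given score, and percent with at least that score) can be computed from a list of decoy scores as in the following sketch. The function name `score_distributions` is a hypothetical name chosen for illustration.

```python
from collections import Counter


def score_distributions(scores, n_constraints):
    """Return (exact, at_least): percent of decoys per score value.

    exact[k]    = percent of decoys satisfying exactly k constraints.
    at_least[k] = percent of decoys satisfying k or more constraints.
    """
    counts = Counter(scores)
    total = len(scores)
    exact = [100.0 * counts.get(k, 0) / total for k in range(n_constraints + 1)]
    at_least = []
    running = 0.0
    # Accumulate from the highest score downward to build the cumulative curve.
    for k in range(n_constraints, -1, -1):
        running += exact[k]
        at_least.append(running)
    at_least.reverse()
    return exact, at_least
```

The score at which `at_least` drops to fifty percent corresponds to the "half of the decoys" value quoted in the text; `exact` corresponds to the bell-shaped curve.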

For each target protein, there are a few decoy structures that have high RMSDs and high
scores. These decoys generally span more than one domain, giving them an unfolded and
non-protein-like appearance. Often, one section of the decoy is similar in structure to a target protein,
thereby satisfying several constraints, while the large RMSD comes from the second section of
the protein being so far from the first. In the PDB, multi-domain proteins are occasionally
poorly labeled. For example, in 1xi5,162 residues 838 and 839 are nearly 152 Å apart. Some of
the high RMSD, high scoring decoys in this study came from parent proteins 1xi5 and 1xi4.163

4.3 Conclusions

We found that it is possible to search our decoy database using distance constraints to
find reasonably accurate protein models with RMSDs less than 6 Å. A distance range of ±5 Å
as the constraint acceptance criterion yields the best results. To avoid dependence on the order
of application of constraints, we counted the total number of constraints that each decoy
satisfied. Decoys that satisfied the most constraints systematically had the lowest RMSDs.

Our final results showed that for 3 of the 4 target proteins the best decoy found had an RMSD
less than 5 Å, as summarized in Table 4-2. Even low resolution structures have been found to give
insight into the function of proteins. Structures of this resolution can also be used as starting
points in density generation for X-ray structures. In each of these trials, twenty-five constraints
were needed to eliminate all but a few representative structures. More studies must be performed
before we can state with confidence that this number accurately represents the amount of
distance information needed to determine structure. We also analyzed the RMSDs for several
proteins and found that in general the average RMSD for decoys in our database is ~15 Å.
Like the PDB, our database contains many semi-redundant structures. Removal of such decoys
may further decrease the search time of an already fast screening process.














Table 4-1. The number of decoys with RMSDs under each threshold

          Under 6 Å             Under 7 Å             Under 8 Å             Under 9 Å
Target    real   hypothetical   real   hypothetical   real    hypothetical  real    hypothetical
1b4c      85     896            354    3,958          1,774   14,766        5,983   46,528
1ghh      43     2,885          208    10,082         1,088   30,486        7,308   79,759
1ubi      7      4,540          12     15,001         264     42,939        3,797   106,478
2ezk      0      876            0      3,690          41      13,276        2,182   40,787

There are 8,060,245 decoys in the set.


Table 4-2. Summary table of results

Target    Parent protein of found decoy    RMSD (Å)
1b4c      1m31                             3.6
1ghh      2a6m                             4.9
1ubi      1z2m                             3.9
2ezk      1v2a                             7.7










[Histograms not reproduced. x-axis: RMSD, Å. Series: 1b4c, 1ghh, 1ubi, 2ezk.]

Figure 4-1. RMSD histograms for all studied proteins: 1ghh, 1ubi, 2ezk, and 1b4c.


Figure 4-2. Decoys with the lowest RMSDs in database using 1b4c as a reference. Parent protein
1m31 has two chains, a and b. The first two decoys from each chain are the top
scoring decoys with low RMSDs. A) 1b4c. B) 1m31-a-2 and 1m31-b-2. Each decoy
had an RMSD of 3.6 Å. C) 1m31-a-1 and 1m31-b-1. Each decoy had an RMSD of
4.8 Å.

















Figure 4-3. Target 1ghh and top scoring decoys. Decoys from the same parent proteins are
shown together. A) Target 1ghh. B) 2a6m-1-25 and 2a6m-2-26, each with an RMSD
of 4.9 Å. C) 2a6o-1-25 and 2a6o-2-25, each with an RMSD of 4.9 Å. D) 1oy6-1-45
has an RMSD of 5.3 Å. E) 1t9u-1-45 has an RMSD of 5.2 Å. F) 1vh2-1-42 has an
RMSD of 5.1 Å. G) 1iwg-1-45 has an RMSD of 5.3 Å.














Figure 4-4. Target 1ubi and top scoring decoys. A) 1ubi. B) 1z2m-1 and 1z2m-2 both have an
RMSD of 3.9 Å.


















Figure 4-5. Target 2ezk and top scoring decoys. A) Three decoys from 1v2a (1v2a-a-75,
1v2a-b-75, 1v2a-c-75) and B) six decoys from 1ngk (1ngk-e-15, 1ngk-h-15, 1ngk-i-15,
1ngk-j-15, 1ngk-k-15, 1ngk-l-15) satisfied all constraints. The RMSD for decoys from
1v2a was 7.7 Å, while those for 1ngk ranged from 11.6 to 11.7 Å.












[Plots not reproduced. Panel A x-axis: minimum number of satisfied constraints. Panel B x-axis: score (total number of satisfied constraints). One series per target protein.]

Figure 4-6. Analysis of the scoring procedure. A) The y-axis represents the percent of decoys
satisfying at least a certain number of constraints. For example, 100% of decoys satisfy 0
or more constraints, while fifty percent of the decoys satisfy at least 10-12 constraints.
B) The y-axis represents the percent of decoys satisfying the exact number of constraints.
Very few decoys satisfy exactly 0 or 25 constraints.















[Plots for panels A-D not reproduced. x-axis of each panel: RMSD, Å.]

Figure 4-7. The relationship between RMSD and score. A) 1b4c. B) 1ghh. C)
1ubi. D) 2ezk. In general, the low RMSD structures have high scores and the high
RMSD structures have low scores.









CHAPTER 5
RESULTS: USING SPECIFIC DECOY SETS TO FIND FOUR PROTEINS

To use our decoy discrimination procedure with Rosetta-generated decoy structures, two
parameters were optimized: (1) the number of decoys in the data set and (2) the constraint
distance acceptance range. Once optimized, these parameters determined the number of distance
constraints needed in the search process. We will first explain how the parameters were
optimized and then discuss the search results using the optimized parameters for four proteins:
1b4c,131 1ghh,144 1ubi,145 and 2ezk.146

5.1 Parameter Optimizations

5.1.1 Decoy Set Size

For each target protein, three sets were generated containing 1,000, 10,000, and 50,000
decoys. RMSDs were then calculated using the target protein as a reference. For a given
protein, the RMSD distribution is relatively constant regardless of the number of decoys generated, as
shown for 1b4c in Figure 5-1. The distribution, however, varies from protein to protein.
Analysis of the RMSD ranges in Table 5-1 reveals that slightly better decoys (lower RMSD
structures) are generated in the 10,000 decoy set than in the 1,000 decoy set. Increasing the set
to 50,000 decoys, however, does not improve the quality of the decoys enough
to justify the extra computational cost of generating them. In all
cases, increasing the set size generates slightly worse structures (ones with higher RMSDs) as
well.

For 1b4c, more than 40% of the decoys have low RMSDs (less than 6.0 Å). Increasing the
set size from 1,000 to 10,000 decoys generates a slightly lower RMSD structure than the best
decoy in the 1,000 set. In the largest decoy set, both the lowest and highest RMSD decoys can
be found. More than a third of the decoys for 1ubi have low RMSDs in each of the three
different size sets. As seen for 1b4c, an increase in set size for 1ubi generates both lower and
higher RMSD structures. The RMSD distributions for 1ghh and 2ezk are broad; only about 10%
of the decoys have RMSDs less than 6.0 Å. For 1ghh, a slightly better RMSD structure is
found in the 10,000 decoy set than in the 1,000 decoy set, while the 10,000 and 50,000 decoy sets have the
same RMSD range. For 2ezk, there is no improvement in low RMSD structures from increasing
the set size from 10,000 to 50,000. This increase does, however, generate decoys with slightly
higher RMSDs.

For all of the target proteins, Rosetta generates structures with RMSDs of 3.6 Å or lower.
We chose a decoy set size of 10,000 as a balance between cost and performance because it
generates low RMSD structures relatively quickly. The lowest RMSD structure for each protein
generated in the 10,000 decoy set is shown in Figure 5-2 superimposed on the native structure of
the target protein.

5.1.2 Constraint Distance Acceptance Range

The upper and lower bounds placed on a distance constraint make up the constraint
distance acceptance range. Such a range is needed in order to properly simulate experimental
conditions in which the measured distances are not exact. A decoy with a calculated distance
within the acceptance range satisfies the constraint. We tested constraint distance acceptance
ranges of ±1 Å, ±3 Å, and ±5 Å for sets of twelve and twenty-five constraints.

5.1.2.1 Twelve constraints

Using the present constraint selection procedure, which incorporates information from a
secondary structure prediction method, we chose a set of twelve distances from the native
structure of each target protein. Each constraint met the following criteria: it involved atoms in
defined regions of secondary structure, and its length was between 5 and 30 Å.
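The two selection criteria can be expressed as a simple filter over residue pairs. The sketch below is an assumption-laden illustration: it takes the native alpha-carbon coordinates as a list and a per-residue secondary structure string in which `-` marks loop or undefined positions (a hypothetical encoding, not the format of any particular prediction server). How the final twelve or twenty-five constraints were picked from the candidates is not specified here, so the sketch stops at listing candidates.

```python
import itertools
import math


def candidate_constraints(native_ca, secondary, min_len=5.0, max_len=30.0):
    """List residue pairs eligible as distance constraints.

    native_ca: list of (x, y, z) alpha-carbon coordinates, one per residue.
    secondary: string with one character per residue; '-' marks loop positions
               (hypothetical encoding used only for this sketch).
    Returns a list of (i, j, distance) tuples with 0-based residue indices.
    """
    candidates = []
    for i, j in itertools.combinations(range(len(native_ca)), 2):
        # Skip pairs where either residue is outside defined secondary structure.
        if secondary[i] == "-" or secondary[j] == "-":
            continue
        d = math.dist(native_ca[i], native_ca[j])
        if min_len <= d <= max_len:
            candidates.append((i, j, d))
    return candidates
```

Restricting the pairs to predicted secondary structure elements is what keeps loop-region distances, which were problematic in the earlier trials, out of the constraint set.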









A constraint distance acceptance range of ±1 Å is too tight for most of the target
proteins. Fifty percent of decoys for 1b4c satisfy only four constraints (Figure 5-3A). The
highest scoring decoy satisfied eleven constraints and had an RMSD of 5.1 Å (Table 5-2). For
1ghh, fifty percent of decoys satisfied six constraints, and the highest scoring decoy (score of
eleven) had an RMSD of 5.4 Å. The third protein, 1ubi, had two decoys with a score of twelve;
their RMSDs were 3.0 Å and 4.1 Å. Although not the lowest in the database, the top-scoring
decoys for 1b4c, 1ghh, and 1ubi had RMSDs in the range for good predictions (< 6.0 Å). Those
for 2ezk, however, did not. The three top scoring decoys had a score of eleven and RMSDs
ranging from 8.1 to 9.8 Å. An acceptance range of ±1 Å is far too restrictive for 2ezk,
eliminating low RMSD structures. Because this constraint distance acceptance range failed to
assign the highest scores to the lowest RMSD decoys, an increase in the acceptance range was
necessary.

Increasing the constraint distance acceptance range from ±1 Å to ±3 Å shows some
slight improvement in structure prediction; all four target proteins had a top scoring decoy with an
RMSD of 4.6 Å or lower (see Table 5-2). Fifty percent of decoys satisfied nine to eleven
constraints. Use of this range, however, results in too many high RMSD decoys satisfying all of
the constraints. The highest RMSDs among the top scoring decoys ranged from 7.9 Å for 1b4c to
13.3 Å for 1ubi (Table 5-2). The total number of top scoring decoys is also higher than when using the
lower acceptance range. More constraints must be used in order to employ a constraint distance
acceptance range of ±3 Å.

Using a constraint distance acceptance range of ±5 Å, at least one of the top scoring
decoys for 1ghh, 1ubi, and 2ezk also had the lowest RMSD in the set. This constraint range,
however, has the same drawbacks as the ±3 Å range; 1ghh has nearly one thousand decoys that
satisfy twelve constraints, while 1ubi has almost four thousand. Although 1ubi has over three
thousand decoys with RMSDs under 6 Å, many of the top scoring decoys have larger RMSDs,
as high as 14 Å. Using only twelve constraints in the search procedure does not adequately
distinguish the good decoys from the bad. More constraints must be used.

5.1.2.2 Twenty-five constraints

A set of twenty-five constraints was chosen, twelve of which were taken from the previous
constraint set. As was seen for twelve constraints, an acceptance range of +/- 1 Å is very tight:
fifty percent of decoys satisfy ~6 constraints for 1b4c and ~10-12 constraints for the other three
target proteins (Figure 5-3B). None of the target proteins had a decoy that satisfied all
twenty-five constraints using an acceptance range of +/- 1 Å. Three of the four target proteins
had top scoring decoys with RMSDs under 6 Å (Table 5-2), but the lowest RMSD decoy in each set
was not assigned the highest score. Only 1ghh had top scoring decoys outside the range for
reliable predictions.
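
The "fifty percent of decoys" figures quoted here are read from curves of the kind shown in
Figure 5-3 (structures remaining versus minimum number of satisfied constraints). A minimal
sketch of how such a curve could be tabulated from a list of decoy scores is given below; the
function names and the ten example scores are hypothetical and serve only to show the
bookkeeping.

```python
def fraction_remaining(scores, max_score):
    """For each minimum score k = 0..max_score, return the fraction of decoys with score >= k."""
    total = len(scores)
    return [sum(1 for s in scores if s >= k) / total for k in range(max_score + 1)]


def half_set_score(scores, max_score):
    """Largest minimum score that still retains at least half of the decoys."""
    curve = fraction_remaining(scores, max_score)
    return max(k for k, fraction in enumerate(curve) if fraction >= 0.5)


# Hypothetical scores for ten decoys, each scored against twelve constraints.
scores = [2, 3, 4, 4, 5, 6, 6, 7, 9, 11]
curve = fraction_remaining(scores, max_score=12)
print(curve[6], half_set_score(scores, max_score=12))
```

The curve always starts at 1.0 (every decoy satisfies at least zero constraints) and never
increases as the required score rises.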

Increasing the acceptance range from +/- 1 Å to +/- 3 Å improves predictions for 1ghh and
1ubi, but the RMSDs of the top scoring decoys are higher for 1b4c. Both low (4.3 Å) and high
(8.2 Å) RMSD decoys for 2ezk satisfy all twenty-five constraints. For each of the target
proteins, fifty percent of decoys satisfy 16-18 constraints using an acceptance range of +/- 3 Å
and 21-23 constraints for an acceptance range of +/- 5 Å. For the latter acceptance range, all
target proteins had at least one top scoring decoy with a low RMSD (< 6.0 Å). For 1ghh, 1ubi,
and 2ezk, the lowest RMSD decoy in the set had a score of twenty-five. For 1b4c, an acceptance
range of +/- 5 Å had a top scoring decoy with the lowest RMSD when compared to the other
acceptance ranges. In summary, using twenty-five constraints and a constraint distance
acceptance range of +/- 5 Å works best for this type of decoy set.









5.2 Search Results

We have found the optimal parameters to be a set size of 10,000 decoys and a constraint

distance acceptance range of +/- 5 Å with a set of twenty-five distance constraints. We will

present results for the four target proteins using the optimized parameters.

Our scoring procedure is tested by the correlation between a decoy's score and its RMSD
(Figure 5-4). A good scoring procedure assigns higher scores to lower RMSD decoys. For 1b4c,
the expected trend holds: low RMSD structures have high scores. For 1ghh, 1ubi, and 2ezk,
the trend is not detectable. This may be due to the large number of low RMSD structures
generated for 1b4c compared to the other target proteins. The lack of a trend is also caused by
Rosetta's ability to accurately reproduce local structure in most of its decoys as well as our use
of several short distance constraints for the target proteins. Short distances give information
about a protein's local structure, while large distances give clues about its overall structure.
Rosetta does a great job in predicting secondary structure; most constraints between residues
close to one another in the chain are, therefore, satisfied by almost every decoy. In summary, the
lack of a clear correlation between score and RMSD may be the result of poor constraint choices.
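
The trend in Figure 5-4 is judged by eye. If a single number were wanted, one option (an
assumption of this sketch, not a statistic reported in this work) is the Pearson correlation
coefficient between score and RMSD, for which a good scoring procedure should give a value
close to -1. The example scores and RMSDs below are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(var_x * var_y)


# Hypothetical decoys in which RMSD falls steadily as the score rises.
scores = [25, 24, 22, 20, 18]
rmsds = [3.0, 4.0, 6.0, 8.0, 10.0]
r = pearson(scores, rmsds)
print(round(r, 3))
```

A value near zero would correspond to the undetectable trend described for 1ghh, 1ubi, and
2ezk.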

Another way to view our results is to look at the average RMSD for each score (Figure 5-5).
For each target protein, the average RMSD decreased with an increase in score, indicating that
the use of constraints to distinguish between good and bad decoys is effective. The decoy set for
1ghh has the highest average RMSD, 9.7 Å. The average RMSD for the 1ghh decoys satisfying
twenty-five constraints is only 4.9 Å; many high RMSD decoys were eliminated by applying
several distance constraints. The average RMSD for decoys of 1ubi was 7.6 Å and the average
of those satisfying twenty-five constraints was lowered to 4.2 Å. The other two target proteins
also show a decrease in the average RMSD upon constraint application, albeit less drastic; 2ezk
is lowered from 8.6 Å to 7.7 Å and 1b4c is lowered from 7.1 Å to 6.0 Å. For 2ezk and 1b4c,
poor constraint choices may have hindered the decoy discrimination process.
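
The quantity plotted in Figure 5-5 is the mean RMSD of all decoys sharing a given score. A
short sketch of that grouping follows; the function name and the five example decoys are
hypothetical.

```python
from collections import defaultdict


def mean_rmsd_by_score(scores, rmsds):
    """Group decoys by score and return {score: mean RMSD}, ordered by increasing score."""
    groups = defaultdict(list)
    for score, rmsd in zip(scores, rmsds):
        groups[score].append(rmsd)
    return {score: sum(values) / len(values) for score, values in sorted(groups.items())}


# Hypothetical scores and RMSDs (in Angstroms) for five decoys.
scores = [25, 25, 24, 24, 23]
rmsds = [4.0, 6.0, 7.0, 9.0, 10.0]
averages = mean_rmsd_by_score(scores, rmsds)
print(averages)
```

In this toy set the mean RMSD falls from 10.0 to 5.0 as the score rises from 23 to 25, which is
the behavior expected of an effective constraint set.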

5.2.1 Target 1b4c

All twenty-five constraints were satisfied by eight decoys of 1b4c with RMSDs ranging
from 4.6 to 7.7 Å and an average RMSD of 6.0 Å. Some of these structures can be found in
Figure 5-6. The top scoring decoys are very similar to each other; the greatest area of variation
from the native structure can be found in the loop regions. Also, the first α-helix appears to be
somewhat displaced in most of the decoys.

The lowest RMSD structure in the database (Figure 5-2A) has a score of 20. The five
constraints this decoy did not satisfy involved residues in the first α-helix (residues 7-16). As
can be seen in Figure 5-2A, helix 1 of the decoy is slightly displaced from helix 1 of the target
protein. All of the constraint distances in the decoy were less than 7 Å from the target distances
of the native structure. Using a slightly larger constraint distance acceptance range would result
in the lowest RMSD decoy satisfying all twenty-five constraints.

5.2.2 Target 1ghh

For 1ghh, four decoys satisfied twenty-five constraints with RMSDs ranging from 3.2 to
6.2 Å. These structures can be found in Figure 5-7. The lowest RMSD decoy in the decoy set
satisfied all constraints. Our decoy discrimination procedure successfully identified the low
RMSD decoys. The average RMSD in the decoy set was 9.7 Å and dropped to 4.9 Å for decoys
satisfying all twenty-five constraints.

Of the top scoring decoys, only the lowest RMSD decoy had the same topology as the
native structure of the target protein. There are three β-sheets in the native structure
(Figure 5-6); β-1 is located between β-2 and β-3. β-1 is parallel to β-2 and anti-parallel to
β-3. In the top scoring decoys 1059 and 1073, β-2 is the middle β strand and is still parallel
to β-1 as in the native structure. In decoy 9935, β-3 is the middle β strand and runs
anti-parallel to both of the other strands. Because β-strands are within hydrogen bonding
distance of each other, decoys with this secondary structure can have low RMSDs and incorrect
topologies, as seen in our previous work (Chapter 4).

5.2.3 Target 1ubi

Eighty-six decoys of 1ubi satisfy twenty-five constraints with RMSDs ranging from 2.4 to
12.6 Å and an average of 4.2 Å. The best structure in the decoy set has an RMSD of 2.4 Å and
was found to satisfy all constraints (Figure 5-2C). Only nine of the top scoring decoys had
RMSDs greater than 6 Å. Decoy number 3631 (Figure 5-8) had an RMSD of 12.6 Å. This
decoy shared similar topology with the target structure for the first fifty residues; the RMSD for
this section was only 3.4 Å. Deviation from the target structure appears in a loop region after the
decoy's fourth β-sheet.

In this set, 86 decoys were found to satisfy all constraints. As seen in the previous
example (1ghh), structures with incorrect topology, inverted β-sheets for example, can
sometimes satisfy several constraints. A slightly tighter constraint range of +/- 3 Å had only six
decoys that satisfied 24 constraints with RMSDs ranging from 2.4 to 5.8 Å. A tighter constraint
range may prevent such incorrectly aligned β-sheets from satisfying so many constraints.

5.2.4 Target 2ezk

For 2ezk, 732 decoys satisfied twenty-five constraints with RMSDs ranging from 2.9 to
14.4 Å and an average of 7.7 Å. Although the lowest RMSD decoy in the set (Figure 5-2D)
satisfies all constraints, the RMSD range for the top scoring decoys is similar to that of the whole
decoy set. The search method did not adequately distinguish between the good and bad decoys.
This may be because the decoy set contains only a small number of structures with RMSDs
lower than 6 Å. Choosing better distance constraints may also improve the discrimination
process.

A tighter constraint range of +/- 3 Å led to only eight decoys satisfying all twenty-five
constraints (Figure 5-9A). The RMSDs, however, range from 4.3 to 8.2 Å. Excluding residues
1-10 lowers the RMSD range of the top scoring decoys to 2.4-4.0 Å, indicating this is the
greatest region of deviation from native structure. Constraints were not chosen from this region
because JPred did not predict any defined secondary structure.

5.3 Conclusions

Using our present method of choosing constraints, twenty-five distances must be measured
to distinguish between reliable and unreliable decoys using a constraint distance acceptance
range of +/- 5 Å. Decoys with slightly lower RMSDs are generated in the 10,000 decoy set when
compared to the 1,000 set. In general, there is no significant difference between the decoys
generated in the 10,000 versus the 50,000 decoy set. Rosetta generates low RMSD structures for
each of our target proteins and our scoring procedure is effective in assigning these low RMSD
decoys high scores. The RMSDs of the best top scoring decoys were 4.6 Å for 1b4c, 3.2 Å for
1ghh, 2.4 Å for 1ubi, and 2.9 Å for 2ezk. For 1ubi and 2ezk, several decoys satisfied all
twenty-five constraints with a large range of RMSD values. A different set of constraints may be
more effective in distinguishing between good and bad decoys. In our next study, we will use a
more reliable method for choosing constraints.












Table 5-1. RMSD ranges (Å) for decoy sets of 1,000, 10,000, and 50,000 decoys

Target   1,000 (% < 6.0 Å)    10,000 (% < 6.0 Å)   50,000 (% < 6.0 Å)
1b4c     4.2-16.3 (42.7)      3.6-17.2 (45.2)      3.4-17.7 (44.6)
1ghh     4.2-14.6 (9.8)       3.2-16.8 (9.5)       3.2-16.8 (10.3)
1ubi     2.9-14.9 (32.7)      2.4-15.0 (34.6)      1.8-18.5 (34.9)
2ezk     3.7-15.0 (4.9)       2.9-15.2 (7.8)       2.9-17.9 (7.8)

In parentheses is the percentage of reliable structures generated for each decoy set for each
target protein.


Table 5-2. Comparison of scores for each protein with different acceptance ranges

                              1b4c                       1ghh
Constraints  Range (Å)   Score  RMSD range (Å)*     Score  RMSD range (Å)*
12           +/- 1       11     5.1 (1)             11     5.4 (1)
12           +/- 3       12     4.6-7.9 (82)        12     4.1-12.2 (41)
12           +/- 5       12     3.8-14.3 (1343)     12     3.2-13.1 (998)
25           +/- 1       13     4.8-5.8 (4)         17     6.1-12.1 (4)
25           +/- 3       22     5.7 (1)             23     4.6-5.4 (3)
25           +/- 5       25     4.6-7.7 (8)         25     3.2-6.2 (4)
Lowest RMSD in decoy set        3.6                        3.2

                              1ubi                       2ezk
Constraints  Range (Å)   Score  RMSD range (Å)*     Score  RMSD range (Å)*
12           +/- 1       12     3.0-4.1 (2)         11     8.1-9.8 (3)
12           +/- 3       12     2.4-13.3 (518)      12     3.6-12.6 (668)
12           +/- 5       12     2.4-14.0 (3929)     12     2.9-14.3 (3207)
25           +/- 1       19     4.1 (1)             18     5.9 (1)
25           +/- 3       24     2.4-4.1 (6)         25     4.3-8.2 (8)
25           +/- 5       25     2.4-12.6 (86)       25     2.9-14.3 (732)
Lowest RMSD in decoy set        2.4                        2.9

*The number in parentheses is the number of decoys with that particular score. All data are for
the 10,000 decoy set.














[Figure 5-1 plot; x-axis: RMSD, Å]

Figure 5-1. RMSD distributions for all four target proteins using the 10,000 decoy sets. For
1b4c, the RMSD distributions for sets of 1,000 and 50,000 decoys are also shown.
The bin size was 0.5 Å. The frequency was calculated as a percentage of the total
number of decoys in the set.














[Figure 5-2 panels A, B, C, D]

Figure 5-2. Lowest RMSD structures in the 10,000 decoy set. The target protein is shown in blue
and the decoy structure is overlapping in red. A) 1b4c with decoy #6426. The
RMSD is 3.6 Å. B) 1ghh with decoy #6104. The RMSD is 3.2 Å. C) 1ubi with
decoy #5423. The RMSD is 2.4 Å. D) 2ezk with decoy #5532. The RMSD is 2.9 Å.
















[Figure 5-3 panels A and B; x-axis: minimum number of satisfied constraints; one curve per
target protein and acceptance range]

Figure 5-3. The number of structures remaining vs. score for each protein, for +/- 1, 3, and 5 Å
using A) twelve constraints and B) twenty-five constraints.


























[Figure 5-4 panels A, B, C, D; x-axis: RMSD, Å]

Figure 5-4. Correlation between score and RMSD. A) 1b4c. B) 1ghh. C) 1ubi. D) 2ezk.


[Figure 5-5 plot; x-axis: score; one curve each for 1b4c, 1ghh, 1ubi, and 2ezk]




































Figure 5-5. Average RMSD for each protein at different scores.


[Figure 5-6 panels A and B]

Figure 5-6. Top scoring decoys for 1b4c's 10,000 decoy set: A) Decoy #8500 has an RMSD of
4.6 Å. B) Decoy #8827 has an RMSD of 7.7 Å.


















[Figure 5-7 panels]

Figure 5-7. Representation of the β-sheet orientation for the native structure of target protein
1ghh and the top scoring decoys. A) Orientation for the native structure of 1ghh and
decoy #6104 with an RMSD of 3.2 Å. B) Orientation for decoys #1059 and #1073
with RMSDs of 5.8 Å and 6.2 Å, respectively. C) Orientation for decoy #9935,
which has an RMSD of 4.5 Å.


Figure 5-8. Top scoring decoy #3631 for 1ubi, which has a high RMSD of 12.6 Å. When residues
1-50 are aligned to the native structure of 1ubi, the RMSD is 3.4 Å.









[Figure 5-9 panels A and B]

Figure 5-9. Top scoring decoys for 2ezk. A) When all residues are aligned, the RMSDs range
from 4.3 to 8.2 Å. B) When residues 10-93 are aligned, the RMSDs range from 2.4 to
4.0 Å.









CHAPTER 6
RESULTS: USING GENERAL AND SPECIFIC DECOY SETS TO STUDY TWELVE
CASP7 TARGETS

We used our general and specific decoy sets to predict the structures of twelve CASP7
targets. For a given target, the same set of twenty-five constraints was used for both types of
decoy sets. Unless otherwise indicated, a constraint distance acceptance range of +/- 5 Å was
employed.

6.1 General Decoy Set

For each target, twenty-five constraints were chosen. As seen in Table 6-1, the number of
top scoring decoys for each target ranged from two to over ten thousand. The Cα RMSDs of the
top scoring decoys were calculated and five targets were found to have successful predictions (a
decoy with an RMSD under 6.0 Å). To determine whether the lack of reliable predictions for the
remaining seven targets was due to a breakdown in decoy generation or decoy discrimination, we
calculated the Cα RMSDs between each decoy and each target (Figure 6-1).

The RMSD distribution is similar for most of the target proteins; most decoys have
RMSDs within 10-20 Å of their target with an average RMSD of ~16 Å. Target T335 is the
exception. Its RMSD distribution is shifted to the left, giving rise to an average RMSD of only
9.9 Å. Over 160 thousand decoys have RMSDs less than 6.0 Å, low enough to be considered a
reliable prediction. It is not surprising, therefore, to find over ten thousand decoys satisfying all
twenty-five constraints for this small target (42 residues), which also has a very common
structural motif.

For target T335 and four others (T288, T309, T340, T359), the best decoy in the set had an
RMSD under 6.0 Å and satisfied all constraints. For three targets, T348, T349, and T358, low
RMSD decoys were generated in the set but the discrimination procedure failed to assign them
top scores. The remaining four targets (T306, T311, T353, T363) had no low RMSD decoys in
the set; the decoy generation method failed to provide accurate structures, indicating no larger
proteins contain pieces that look like these four targets.
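
The three outcomes distinguished above (a successful prediction, a breakdown in decoy
discrimination, and a breakdown in decoy generation) can be expressed as a simple decision
rule. The sketch below is illustrative only; the function name, the returned labels, and the
example values are assumptions, while the 6.0 Å cutoff is the reliability threshold used
throughout this work.

```python
def classify_target(scores, rmsds, cutoff=6.0):
    """Classify a target from the scores and RMSDs (Angstroms) of all decoys in its set."""
    best_score = max(scores)
    top_rmsds = [rmsd for score, rmsd in zip(scores, rmsds) if score == best_score]
    if min(top_rmsds) < cutoff:
        return "success"  # a reliable decoy is among the top scoring decoys
    if min(rmsds) < cutoff:
        return "discrimination failure"  # a reliable decoy exists but was not ranked first
    return "generation failure"  # no reliable decoy exists in the set


# Hypothetical three-decoy and two-decoy sets, one for each outcome.
worked = classify_target([25, 25, 20], [4.0, 9.0, 3.0])
missed = classify_target([25, 22], [8.0, 5.3])
absent = classify_target([25, 20], [10.0, 12.0])
print(worked, missed, absent)
```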

Comparisons between the JPred predictions and the real structure for each target can be
found in Table 6-2. Because constraints are chosen from the JPred prediction, it is important to
determine the prediction's accuracy. Poor structure prediction can lead to poor constraint
choices.

6.1.1 Targets That Worked

For five CASP7 targets (T288, T309, T335, T340, T359), the lowest RMSD decoy in the
set was found to satisfy all twenty-five distance constraints. In each case, the lowest RMSD
decoy was also under 6.0 Å.

6.1.1.1 Target T288

Target T288 corresponds to 2gzv, the PDZ domain of human PICK1 (a fragment of a
PRKCA-binding protein). The PDB structure is missing two residues (27 and 28) located in a
loop region, making the target 91 residues long. Overall, JPred does a good job predicting the
secondary structure, as seen in Table 6-2. Although it does not predict the α-helix between
residues 41-45 or the small β-strand between residues 60-61, its predictions for the rest of the
structure are never off by more than two residues.

The α-helix from residue 67 to 76 was selected as a reference and all distance constraints
involved an atom from this helix. Eleven decoys satisfied twenty-five constraints with RMSDs
ranging from 3.4 to 5.6 Å (Figure 6-2); all of the top scoring decoys had RMSDs within the range
of reliable structure predictions.









The six decoys with the lowest RMSDs in the set satisfied all constraints. All of the top
scoring decoys are from the PDZ domain of various proteins. PDB codes 1tp3, 1tp5, 1tq3, 1be9,
and 1bfe are crystal structures of the same protein, the PDZ3 domain of synaptic PSD-95 protein,
complexed with different ligands. Parent protein 1um7 is the PDZ domain of
synaptic-associated protein 102 while 1b8q is the extended neuronal nitric oxide synthase PDZ
domain. The greatest difference between the structures of the decoys and the target occurs at the
C-terminus from residues 87-91; no constraints involved atoms in this region.

In addition to RMSD, determining the longest continuous segment (LCS) of a decoy
structure that has an RMSD under a specific threshold is sometimes used to evaluate the
similarity between two structures (see Chapter 2). The longest continuous segment under 5 Å
(LCS-5) is 91 residues for the 1tq3 decoy (the entire structure) and 89 residues for the 1b8q
decoy. When a lower threshold is used, greater differences appear between the structures. For
the decoy from 1tq3, the longest continuous segment under 2 Å (LCS-2) is composed of residues
15-86 (72 residues total), while the LCS-2 for the 1b8q decoy has only 29, from residues 33-61.
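
To make the LCS idea concrete, the sketch below finds the longest run of consecutive residues
whose RMSD stays at or under a cutoff. It is a simplified illustration, not the procedure of
Chapter 2: it assumes that per-residue Cα deviations from a single, fixed superposition are
already available, whereas an LCS calculation would normally re-superimpose each candidate
segment. The function name and the five example deviations are hypothetical.

```python
from math import sqrt


def longest_continuous_segment(deviations, cutoff):
    """Return (first, last) 1-based residue indices of the longest run with RMSD <= cutoff.

    deviations: per-residue distances (Angstroms) between decoy and target after one
    fixed superposition (a simplifying assumption). Returns None if no residue qualifies.
    """
    # Prefix sums of squared deviations give any segment's RMSD in constant time.
    prefix = [0.0]
    for d in deviations:
        prefix.append(prefix[-1] + d * d)

    best = None
    n = len(deviations)
    for start in range(n):
        for end in range(start + 1, n + 1):
            length = end - start
            rmsd = sqrt((prefix[end] - prefix[start]) / length)
            if rmsd <= cutoff and (best is None or length > best[1] - best[0] + 1):
                best = (start + 1, end)
    return best


# Hypothetical deviations: a well-placed core flanked by two badly placed termini.
deviations = [9.0, 1.0, 1.0, 1.0, 9.0]
lcs_2 = longest_continuous_segment(deviations, cutoff=2.0)
lcs_5 = longest_continuous_segment(deviations, cutoff=5.0)
print(lcs_2, lcs_5)
```

As in the text, lowering the threshold shortens the segment: the looser cutoff tolerates one
poorly placed terminus, while the tighter cutoff keeps only the core.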

6.1.1.2 Target T340

Target T340 represents 2HE4, the second PDZ domain of human NHERF-2 (SLC9A3R2)
interacting with a mode 1 PDZ binding motif. It is composed of 90 residues. The JPred
prediction (Table 6-2) is fairly accurate for this target. It does not, however, predict the α-helix
between residues 40-44 or the β-strand between residues 58-59.

The helix between residues 64-73 was selected as the reference structure from which all
constraints were chosen. Sixty decoys satisfied twenty-five constraints with RMSDs ranging
from 3.0 to 14.0 Å. The top scoring decoys with the highest and lowest RMSDs are shown in
Figure 6-3. The best decoy in the database had an RMSD less than 6.0 Å, as did over 80% of the
top scoring decoys. Twenty of the top scoring decoys, with RMSDs ranging from 3.0 to 4.4 Å,
came from parent protein 1wf7 (Figure 6-3B) while 1wif (Figure 6-3C) was the parent protein of
sixteen top scoring decoys with RMSDs ranging from 4.4 to 4.7 Å. Both parent proteins are PDZ
domains of a larger protein. For these top scoring decoys, the longest continuous segments
under 5 Å included all residues; the LCS-2 was 63 residues for 1wf7 and 39 for 1wif.

The top scoring decoy with the highest RMSD is from parent protein 1qln (Figure 6-3D),
the structure of a transcribing T7 RNA polymerase initiation complex. It has no noticeable
similarities to the target structure, as it is completely α-helical while the target is mostly a
β-barrel. The longest continuous segment of this decoy with an RMSD under 5 Å is 25 residues,
which included the reference helix. Most of the other top scoring decoys with high RMSDs have
a similar α-helical arrangement and the same LCS-5. Tightening the constraint distance
acceptance range to +/- 3 Å only lowers the score of 1qln from 25 to 18. It is, therefore, poor
constraint choices that must be contributing to the high RMSD structure predictions for this
target.

An alternate set of twenty-five constraints was chosen by selecting eight atoms from
various regions of secondary structure and calculating the distances between them. Twenty-three
decoys satisfied all twenty-five constraints with RMSDs ranging from 3.0 to 4.4 Å. This set of
constraints gave much better results than the previous set. Among the
top scoring structures were twenty decoys from parent protein 1wf7 with RMSDs ranging from
3.0 to 4.4 Å, as well as decoys from parent proteins 1ufl, 1uit, and 1v51 (Figure 6-3E, F, G). The
parent proteins of all the top scoring decoys are structures of the PDZ domains of various
proteins similar to target T340.
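
Eight atoms define 8 x 7 / 2 = 28 distinct pairwise distances, so a set of twenty-five
constraints can be drawn from them. The sketch below enumerates the candidate pairs; how
twenty-five of the twenty-eight were selected is not specified here, so no selection rule is
implemented. The function name, residue numbers, and coordinates are hypothetical.

```python
from itertools import combinations
from math import dist


def pairwise_distances(coords):
    """Return (res_i, res_j, distance) for every unordered pair of residues in coords."""
    return [(i, j, dist(coords[i], coords[j])) for i, j in combinations(sorted(coords), 2)]


# Eight hypothetical alpha-carbons placed on the x-axis at their residue number.
residues = [10, 12, 14, 16, 40, 42, 44, 46]
coords = {r: (float(r), 0.0, 0.0) for r in residues}

pairs = pairwise_distances(coords)
print(len(pairs))   # number of candidate constraints
print(pairs[0])     # first candidate: residues 10 and 12
```

With nine atoms the same enumeration gives 36 candidate distances.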









6.1.1.3 Target T359

Target T359 represents 2iwn, the third PDZ domain of multiple PDZ domain protein MPDZ.
The gap in the crystal structure between residues 28-31 resulted in the protein having a total of
93 residues. JPred does a good job predicting secondary structure for this protein. It does,
however, miss an α-helix between residues 45-49 and a β-strand between residues 64-65
(Table 6-2). Also, it predicts the first, second, and last β-sheets to be slightly longer than they
are in the real structure.

The helix between residues 71-79 was selected as a reference and all constraints involved
an atom from this helix. Fifteen decoys were found to satisfy all twenty-five constraints with
RMSDs ranging from 3.6 to 13.1 Å (Figure 6-4). The lowest RMSD decoys in the database
satisfied all constraints. The parent proteins of six of the top scoring decoys (PDB codes: 1pld,
1uml, 1il6, 1ueq, 1wfy) were PDZ domains of various proteins and had RMSDs of 6.8 Å or less.
These decoys differed from each other only slightly, mostly in the loop regions. Despite having
a lower RMSD, 1pld (3.6 Å RMSD) had 55 residues in its longest continuous segment with a 2 Å
threshold while 1wfy (6.8 Å RMSD) had 60 residues in its LCS-2.

The remaining nine top scoring decoys were redundant structures of parent protein 1w5e
and had an RMSD of 13.1 Å. A tighter constraint distance acceptance range of +/- 3 Å was
applied and these high RMSD decoys were found to satisfy only 14 constraints. Tightening the
constraint distance acceptance range, however, also lowered the score of the lowest RMSD
decoy in the decoy set from 25 to 20. In this case, little is gained by changing the acceptance
range; better constraints must be chosen.

Decoys of 1w5e have certain secondary structural elements in just the right places, allowing
them to satisfy many constraints. For example, twenty-eight residues comprised the LCS-5 for
this decoy, which included the α-helix that was used as a reference in choosing the constraints.
Many constraints involved distances between an α-helix and a β-sheet, so it is not surprising that
all the top scoring decoys had these structural features.

6.1.1.4 Target T309

Target T309 corresponds to 2h40, a 62 residue hypothetical protein from Bacillus subtilis
(yonk). JPred predicts an α-helix between residues 27-40. In the real structure, however, the
α-helix is only between residues 34-40. The other regions of secondary structure are β-sheets
located between residues 13-15, 20-24, and 29-33, none of which were predicted by JPred.
Because of the discrepancies in the secondary structure prediction and the inherent lack of
structure in the target, it was difficult to choose good constraints. Surprisingly, we were able to
obtain successful predictions for this target.

Because most of the protein is unstructured, constraints were chosen between all regions of
predicted secondary structure. Nine decoys satisfied twenty-four constraints. The RMSDs of the
top scoring decoys ranged from 5.7 to 14.4 Å and came from four parent proteins: 1esc, 1esd,
1ese, and 1wk1 (Figure 6-5). The first three codes represent esterase and their top scoring
decoys had RMSDs ranging from 5.7 to 6.6 Å. The longest continuous segment under 5 Å
RMSD is from residues 1-58 (almost the whole structure) for 1esc, 1esd, and 1ese and between
residues 3-42 for 1wk1.

For 1wk1, the lectin C-type domain derived from a hypothetical protein from C. elegans,
the top scoring decoys had RMSDs of 14.1-14.4 Å. The high RMSD structures have some
similarities to the target protein; the decoy's three β-sheets within residues 10-34 align closely
to those of the target protein. As can be seen in Figure 6-5, the target protein does not have
defined secondary structure between residues 1-10 or 42-62. The RMSD between the two
structures from residues 10-34 is ~3.8 Å, and, as mentioned above, its LCS-5 is composed of
40 residues, including this region.

Removing the distance constraint between residues 5 and 35 from the constraint set results
in only three decoys (from parent protein 1esc) satisfying all twenty-four constraints. The three
decoys each have an RMSD of 5.7 Å. The results are, therefore, significantly improved when
bad constraints are not used. This particular constraint is not effective because residue 5 is in a
region of undefined secondary structure. Poor secondary structure predictions can lead to poor
constraint choices, which can significantly hinder the performance of our method.

6.1.1.5 Target T335

Target T335 is 2hep, the UPF0291 protein ynzC from Bacillus subtilis. The target
sequence was made of 85 residues but only 42 appear in the NMR structure. JPred correctly
predicts the two main α-helices of this protein.

Twenty-five constraints were selected from eight residues, four on each α-helix. Over ten
thousand decoys satisfied all twenty-five constraints (Figure 6-6); their RMSDs ranged from 2.1
to 9.5 Å. Over eight thousand of the top scoring decoys had RMSDs less than 6.0 Å. Target
T335 is the smallest protein in this study and contains a fairly common structural motif. The
lowest RMSD decoy in the set, from parent protein 1qsp, had all but two residues in its longest
continuous segment with an RMSD under 2 Å.

Because of their compact structures, some higher RMSD decoys satisfied all twenty-five
constraints. For example, a decoy from parent protein 1b6c (Figure 6-6C) had an RMSD of
9.5 Å and an LCS-5 of 30 residues. This decoy just barely satisfied all the constraints; using a
constraint distance acceptance range of +/- 3.0 Å, the 1b6c decoy satisfied only eleven of the
twenty-five constraints. Using a tighter acceptance range, 142 decoys satisfied all constraints;
their RMSDs ranged from 2.1 to 7.6 Å and 82% of them had RMSDs under 6.0 Å. For this
target, tightening the constraint distance acceptance range significantly improved the results.

6.1.1.6 CASP comparisons

We employed the Global Distance Test (GDT) analysis (see Chapter 2 for details) to

evaluate the performance of our method compared to other methods used in CASP7. Most

methods were successful in predicting low RMSD structures for targets T288, T340, and T359

(Figure 6-7). Our predictions were not among the best for these targets even though their

RMSDs were less than 6.0 Å.

Target T309 was a difficult target; it is largely unstructured, which is a common pitfall for
many methods. Our predictions for this target are much better than those of other methods. Even
our high RMSD top scoring decoy (cyan line in Figure 6-7) showed better GDT results than the
average prediction; over fifty percent of the residues in this decoy satisfied a distance cutoff of
6.0 Å.
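
A GDT-style curve reports, for each distance cutoff, the percentage of residues whose Cα atoms
lie within that cutoff of the target. The sketch below is a simplified illustration rather than
the GDT analysis of Chapter 2: it assumes per-residue deviations from one fixed superposition,
whereas the full analysis considers many superpositions. The function names, cutoffs, and
deviations are hypothetical.

```python
def fraction_within(deviations, cutoff):
    """Fraction of residues whose deviation (Angstroms) is at or below the cutoff."""
    return sum(1 for d in deviations if d <= cutoff) / len(deviations)


def gdt_curve(deviations, cutoffs):
    """Return (cutoff, percent of residues within cutoff) for each cutoff."""
    return [(c, 100.0 * fraction_within(deviations, c)) for c in cutoffs]


# Hypothetical per-residue deviations for a six-residue decoy.
deviations = [0.5, 1.5, 2.5, 5.0, 7.0, 12.0]
curve = gdt_curve(deviations, cutoffs=[1.0, 2.0, 4.0, 6.0, 8.0])
for cutoff, percent in curve:
    print(f"{cutoff:4.1f} A: {percent:5.1f}%")
```

In this toy case four of the six residues fall within 6.0 Å, so the decoy would pass the "over
fifty percent at 6.0 Å" observation made above even though one residue deviates by 12 Å.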

CASP results are mixed for target T335; the GDT results are highly scattered. Our top
scoring decoy, with an RMSD of 2.1 Å, is one of the best predictions while our other top scoring
decoy, with an RMSD of 9.5 Å, is average.

6.1.2 Targets That Could Have Worked But Did Not

For three targets, low RMSD decoys were generated but the discrimination procedure did

not assign them top scores. The breakdown in decoy discrimination can be explained by various

reasons as discussed in the next section.

6.1.2.1 Target T348

Target T348 represents 2hfl, the putative tetraacyldisaccharide-1-P 4-kinase from
Chromobacterium violaceum. It has 61 residues. The JPred prediction is not very good; it
predicts an α-helix between residues 4-10 which does not appear in the target structure and it
fails to predict two other α-helices (29-31, 46-48) and three β-strands (18-20, 41-42, 50-51),
as seen in Table 6-2.

The helix between residues 58-61 was selected as a reference and half of the constraints
involved one of these residues. Eighteen decoys were found to satisfy all twenty-five constraints
with RMSDs ranging from 6.9 to 11.0 Å (Figure 6-8 C, D, E, F). The top scoring decoys with the
lowest RMSDs of 6.9 Å (2poo, 1poo, 1cvm, 1qlg) represent the enzyme phytase. They all have a
β-sheet pattern similar to the target structure, but differ somewhat in the loop regions. The
longest continuous segment with an RMSD under 5 Å for these four decoys is between residues
22-61, which includes the reference helix.

Different forms of trans-hydrogenase are represented by 1hzz, 1l7d, 1l7e, 1ptj, 1u2d, and
1xlt (Figure 6-8E); decoys from these parent proteins also satisfied all twenty-five constraints.
The structures of these decoys have two β-sheets and a terminal α-helix and their RMSDs ranged
from 10.6 to 10.7 Å. Although the overall RMSD was quite different, when the structures were
aligned in smaller pieces (residues 4-22, 18-41, and 41-61), the RMSD of each section was
only 5.7-5.8 Å. The LCS-5 for a decoy from 1hzz was composed of residues 23-45.

Three other top scoring decoys are depicted in Figure 6-9 to show the wide range of
structures that satisfy all constraints. A decoy from parent protein 1s61 had an RMSD of 8.7 Å.
As seen previously, the β-sheets of this decoy are inverted compared to the target. A decoy from
parent protein 1e88 had an RMSD of 11.0 Å. Similar to the other top scoring decoys with high
RMSDs, when the 1e88 decoy and the target were aligned in fragments (residues 20-40, 41-60),
the RMSDs of each section were within the range for good structure predictions (5.7 Å and 5.9 Å,
respectively). The longest continuous segment with an RMSD less than 5.0 Å is 19 residues.
All of these decoys were able to satisfy twenty-five constraints despite their great difference in
structure. To prevent these high RMSD decoys from satisfying all constraints, better constraints
must be chosen. Due to the highly inaccurate secondary structure prediction for this target, good
constraint choices were quite difficult.

The best decoy in the database, from 1tl2 (Figure 6-8B), was found to have an RMSD of
5.3 Å and a score of 22. Two of the unsatisfied constraints involved residue 10, which was
predicted by JPred to be in an α-helix but was not in a defined region of secondary structure in
the target. This decoy has an LCS-5 of 56 residues, almost the entire structure.

6.1.2.2 Target T349

Target T349 represents 2hfy, Pseudomonas aeruginosa hypothetical protein RPA1041. It
is composed of 75 residues. JPred does not predict the first β-sheet between residues 2-7 and
predicts residues 58 and 65 to be in β-strands, but they are in structurally undefined regions of
the target structure. It gives a reliable prediction for the remaining secondary structural elements.

The helix between residues 9-23 was selected as a reference and most constraints
involved an atom from this region. Sixty-four decoys satisfied all constraints with RMSDs
ranging from 10.5 to 13.6 Å (Figure 6-9). A decoy from parent protein 1ta3 (Figure 6-9D) had an
RMSD of 13.6 Å, but it was quite similar to the target in two regions: from residues 49-71 the
RMSD was 3.0 Å and from residues 1-18 the RMSD was 5.9 Å. The longest continuous
segment with an RMSD less than 5.0 Å included 30 residues. A decoy from parent protein 1t9u
had an RMSD of 10.5 Å (Figure 6-9E) and satisfied all twenty-five constraints. Its LCS-5 had
32 residues, which were also located near the C-terminus of the decoy.

In the database, twenty-eight decoys had RMSDs less than 5.6 Å and scores ranging from
14–21. A decoy from 1uj5 (Figure 6-9B) had an RMSD of 5.6 Å and a score of 21, while a
decoy of 1wel (Figure 6-9C) satisfied only fourteen constraints and had the same RMSD.
Several of the unsatisfied constraints for both of these decoys involved atoms in the final α-helix
composed of residues 55–64. The two decoys were similar to each other, with an RMSD
between them of 5.4 Å. Increasing the constraint distance range from 5 Å to 7 Å raises the
scores of the lowest RMSD decoys to 23, but also allows other high RMSD decoys to satisfy
more constraints.
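
The effect of widening the distance range can be illustrated with a short Python sketch of a constraint-counting score. This is only a sketch under stated assumptions: a constraint is treated as satisfied when the decoy distance lies within a tolerance of the target distance, and how the "distance range" maps onto that tolerance is assumed here rather than taken from the text. The function name, data layout, and coordinates are hypothetical.

```python
import math


def constraint_score(decoy_coords, constraints, tolerance=5.0):
    """Count the constraints a decoy satisfies.

    decoy_coords: dict mapping residue number -> (x, y, z) of its Calpha atom.
    constraints:  list of (residue_i, residue_j, target_distance) tuples.
    tolerance:    allowed absolute deviation in Angstroms (assumed meaning of
                  the constraint distance range).
    """
    score = 0
    for res_i, res_j, target_distance in constraints:
        decoy_distance = math.dist(decoy_coords[res_i], decoy_coords[res_j])
        if abs(decoy_distance - target_distance) <= tolerance:
            score += 1
    return score


# Made-up coordinates and target distances, for illustration only.
decoy = {1: (0.0, 0.0, 0.0), 2: (10.0, 0.0, 0.0), 3: (0.0, 20.0, 0.0)}
constraints = [(1, 2, 12.0), (1, 3, 14.0), (2, 3, 22.0)]

strict_score = constraint_score(decoy, constraints, tolerance=5.0)
loose_score = constraint_score(decoy, constraints, tolerance=7.0)
```

In this toy case the second constraint is off by 6 Å, so it fails at a 5 Å tolerance and passes at 7 Å. The same loosening applies to every decoy, which is consistent with the observation that high RMSD decoys also gain satisfied constraints.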

The decoy from 1uj5 had only 39 residues in its LCS-5, while 1wel had 63. Therefore,
1wel would be expected to satisfy more constraints than 1uj5, even though they had very similar
RMSD values, but this did not happen. For the decoy from 1uj5, the placement of secondary structure is
closer to the JPred prediction than that of the target. The target and decoy differ greatly in the
loop regions; the decoy has two short β-sheets where the target has a long loop. The decoy also
ends with a β-sheet, unlike the target, which has a largely unstructured terminal region.

Another set of constraints was selected in an attempt to improve the results. Nine atoms were
chosen and distances were calculated between them. Forty-two decoys satisfied all the new
constraints, with RMSDs ranging from 5.5–11.1 Å. This was a slight improvement, as the lowest
RMSD decoy was found to satisfy all constraints.
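
If distances are taken between every pair of the nine chosen atoms (an assumption; the text does not state which pairs were used), the selection yields 9 × 8 / 2 = 36 candidate constraints. The Python sketch below enumerates such pairs; the function name, residue numbers, and coordinates are hypothetical placeholders.

```python
import math
from itertools import combinations


def pairwise_constraints(coords, residues):
    """Build (residue_i, residue_j, distance) for every unordered pair.

    coords:   dict mapping residue number -> (x, y, z) of its Calpha atom.
    residues: the residue numbers chosen as constraint atoms.
    """
    return [
        (res_i, res_j, math.dist(coords[res_i], coords[res_j]))
        for res_i, res_j in combinations(residues, 2)
    ]


# Nine placeholder atoms spaced 1 Angstrom apart along the x-axis.
chosen = list(range(1, 10))
coords = {res: (float(res), 0.0, 0.0) for res in chosen}
constraints = pairwise_constraints(coords, chosen)
```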

6.1.2.3 Target T358

Target T358 represents 2hjj, protein ykfF from Escherichia coli. It is 66 residues long.

The JPred prediction is fairly accurate for this protein. The first nine residues are missing in the

crystal structure.

The helix between residues 5–14 was selected as a reference structure and all constraints
involved an atom in this region. Five decoys satisfied all twenty-five constraints with RMSDs
ranging from 8.3–11.8 Å (Figure 6-10). A decoy from 1p99 had an RMSD of 8.3 Å. Forty-two
residues comprised the LCS-5 for this decoy, which included the reference helix. Three top
scoring decoys, 1efd, 1k2v, and 1k7s, had RMSDs of 9.3 Å and LCS-5s of 27 residues, including
the reference helix. The longest continuous segment for 1x9d (RMSD = 11.8 Å) also included
the reference helix but was composed of only 24 residues.

The best decoys in the database, from parent proteins 1oe9, 1w7i, and 1w7j (Figure 6-10B),
had RMSDs of 5.5 Å and scores of 22. Their longest continuous segments with RMSDs
under 5.0 Å were 37 residues long. The segments did not include residues in the reference helix,
which may explain why these decoys did not satisfy all constraints. It is difficult to know a
priori which regions of the target will be most similar to any particular decoy.

6.1.2.4 CASP comparisons

As was done for the targets that worked, we employed the GDT analysis to evaluate our
method's performance on the targets that could have worked but did not. Target T348 was
difficult for most CASP participants. There were no predicted structures with 100% of the
residues within a 10.0 Å distance cutoff. However, our lowest RMSD decoy in the set (1tl2,
pink line in Figure 6-11A) did satisfy this requirement, but it was not one of the top scoring
decoys. A decoy from 1cvm satisfied all constraints and performed well compared to other
CASP predictions (blue line in Figure 6-11A).
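
The quantity read off a GDT plot, the percentage of residues within a given distance cutoff, can be sketched in a few lines of Python. This is a simplified illustration and not the CASP procedure: it assumes per-residue Cα deviations from one fixed superposition, whereas the full GDT analysis searches for the best superposition at each cutoff. Function names and the example deviations are hypothetical.

```python
def percent_within(deviations, cutoff):
    """Percentage of residues whose deviation (Angstroms) is at most `cutoff`."""
    within = sum(1 for deviation in deviations if deviation <= cutoff)
    return 100.0 * within / len(deviations)


def gdt_curve(deviations, cutoffs):
    """Return (cutoff, percent of residues within cutoff) pairs for plotting."""
    return [(cutoff, percent_within(deviations, cutoff)) for cutoff in cutoffs]


# Made-up per-residue deviations for a four-residue toy model.
toy_deviations = [1.0, 2.5, 4.0, 9.0]
curve = gdt_curve(toy_deviations, cutoffs=[5.0, 10.0])
```

A model with 100% of its residues within the 10.0 Å cutoff is one whose curve reaches the right-hand edge of the plot at or below 10.0 Å.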

The results for target T349 were quite mixed; some groups did well, while others
struggled. Our top scoring decoys were not among the best predictions (red and blue lines
in Figure 6-11B) and the lowest RMSD decoys in the set were average predictions (green and
cyan lines in Figure 6-11B).

T358 was a difficult target. Our lowest RMSD decoy in the set would have been one of
the best structure predictions had it satisfied all constraints (red line in Figure 6-11C). Two of
our top scoring decoys (green and blue lines) were slightly better than average predictions, while
the other decoy (cyan line in Figure 6-11C) was not very good.









6.1.3 Targets That Never Had a Chance

Low RMSD decoys were not generated for the remaining four targets. Lack of structures

similar to these targets shows our decoy set to be incomplete. These targets are not represented

in the database; either they are not fragments of larger proteins or their parent protein was not

included in our database. If they exist, the similar proteins may be less than 100 residues long or

they may contain gaps in their PDB structures, excluding them from our decoy set.

6.1.3.1 Target T306

Target T306 (Figure 6-12A) corresponds to 2hd3, a small fragment of Ethanolamine
Utilization Protein (EutN) from Escherichia coli. It has 95 residues. The JPred prediction is not
very accurate. It does not predict either of the α-helices. It also predicts a long β-sheet
composed of residues 75–87, which is split into two smaller β-sheets in the real structure
(Table 6-2).

Twenty of the twenty-five constraints were chosen from the same reference structure, the
β-sheet composed of residues 40–45. Two decoys, with parent proteins 1jhw and 1j72, were
found to satisfy twenty-four constraints (Figure 6-12C). The RMSDs of the top scoring decoys
were 13.4 Å and 13.5 Å, respectively. Both parent proteins represent a macrophage capping
protein, Cap G, which is composed of four β-sheets and a long α-helix (Figure 6-12C). The
longest continuous segment with an RMSD under 5.0 Å is small for these decoys, composed of
only 24 residues. The reference β-strand is also in this region, which helps explain why such
high RMSD decoys satisfy most of the constraints.

Unlike the top scoring decoys, the target structure has a β-barrel center, as does the lowest
RMSD decoy in the database (from parent protein 1fgu, Figure 6-12B). This decoy, however,
was found to satisfy only eighteen constraints. Four of the six unsatisfied constraints involved
residue 43, which is part of a β-sheet in the target protein and a small α-helix in the decoy.

Several other slight structural differences exist between the lowest RMSD decoy and the
target protein. The decoy has a loop from residues 10–22, a small α-helix from 40–49, and a
β-sheet from residues 60–67. The target, however, has an α-helix composed of residues 16–20,
a loop region from 46–52, and an α-helix from residues 61–67. In the C-terminus, the
target ends with two short β-sheets while the decoy finishes with one short β-sheet followed by
an α-helix. All of these differences give rise to an RMSD between the decoy and target of 8.1 Å.
The longest continuous segment with an RMSD less than 5.0 Å is composed of 49 residues for
this decoy, which included the reference β-strand.

No decoys satisfied all twenty-five constraints. Due to JPred's poor secondary structure
predictions for this target, choosing good constraints was quite challenging. The lowest RMSD
decoy in the database satisfies a unique set of eighteen constraints. When only those constraints
are used, the lowest RMSD decoy is assigned a perfect score. Regardless of constraint choices,
however, no reliable predictions can be made for this target because no decoys with RMSDs less
than 6.0 Å exist in the database.

6.1.3.2 Target T311

Target T311 is associated with two parent proteins, 2icp and 2ict, which represent bacterial
antitoxin HigA, each crystallized at a different pH. We used 2ict as our reference structure; it
is composed of 87 residues. The JPred prediction for this protein was fairly accurate, having
only a slight discrepancy in the position of the last α-helix (Table 6-2).

The α-helix composed of residues 57–74 was selected as the reference helix; all
constraints involved an atom in this region. Twenty-three decoys satisfied all twenty-five
constraints and had RMSDs ranging from 10.1–10.2 Å (Figure 6-13). The target and the
decoys have different local structures for the first and last 15 residues. The remainder of the
structure is mostly helical, with both proteins having similar sized helices. The difference lies in
the orientation of these helices, which increases the overall RMSD between the two structures.
Because the decoy's helices are rotated only slightly, the distances between them are similar to
those of the target, which explains why these high RMSD decoys satisfied so many constraints.
For the top scoring decoys, the longest continuous segment with an RMSD under 5.0 Å is 49
residues long, from residue 31 to 79. These decoys came from eleven parent proteins, nine of
which were from some form of carbamoyl phosphate synthetase (PDB codes: 1a9x, 1bxr, c30,
1c30, 1cs0, 1jdb, 1kee, 1my6, 1t36). Parent protein 1ceb represents recombinant kringle 1
domain of human plasminogen, while 1cs8 is procathepsin L.

In the database, 64 slightly different structures of parent protein 1f6g had the lowest
RMSD, 6.6 Å. This decoy has a much longer terminal α-helix than the target or the top scoring
decoys. It also has an initial α-helix that is quite similar in size to that of the target. All of the low
RMSD decoys satisfied twenty constraints. Three of the unsatisfied constraints involved residue
64 and an atom in a loop region. The other two constraints involved residue 50, which is on a
small α-helix in the target structure and part of the large terminal α-helix of the decoy. As seen
for T306, no reliable decoys are generated for this target.

6.1.3.3 Target T353

Target T353 represents 2hfq, protein NE1680 from Nitrosomonas europaea. It has 85
residues. JPred accurately predicts the secondary structure for this target. The α-helix
composed of residues 29–42 was selected as a reference and all constraints involved an atom in
this helix. Twenty-six decoys were found to satisfy twenty-four constraints with RMSDs
ranging from 10.8–15.2 Å (Figure 6-14). The longest continuous segments under 5.0 Å RMSD
for the top scoring decoys were 26 residues for 1ekf (from residue 28 to 53) and 27 residues
(from residue 27 to 53) for 1j49. Both LCS-5s included the reference helix.

The best decoy in the database, from parent protein 1jrp, had an RMSD of 6.4 Å
(Figure 6-14B) and a score of 17. The decoy matches the target structure exceptionally well from
residues 12–43, with an RMSD of ~3.1 Å in this region. The longest continuous segment with an
RMSD under 5.0 Å is composed of 54 residues (from residue 9 to 62). The lowest RMSD decoy
did not satisfy all constraints, indicating that several poor constraints were chosen. The lowest RMSD
decoy, however, was not good enough to be considered a reliable model even if it had satisfied all
constraints.

6.1.3.4 Target T363

Target T363 represents 2hj1, the 3D domain-swapped dimer of hypothetical protein from
Haemophilus influenzae. The PDB structure contains 77 residues. The JPred prediction for this
sequence is fairly accurate. The α-helix between residues 25–34 was chosen as a reference and
all twenty-five constraints involved an atom in this region. Three decoys of parent protein 1sxg
satisfied all twenty-five constraints and had RMSDs of 9.3–9.4 Å (Figure 6-15). The longest
continuous segment with an RMSD under 5.0 Å was composed of 36 residues and included the
reference α-helix.

The lowest RMSD decoy came from parent protein 1hux and is shown in Figure 6-15. It
was found to satisfy twenty-four constraints. Residues 1–59 comprised the LCS-5 for this decoy.
Twenty-seven decoys satisfied the same set of twenty-four constraints, with RMSDs
ranging from 6.4–11.7 Å.









6.1.3.5 CASP comparisons

We performed the GDT analysis on the four proteins for which our database had no low
RMSD decoys (Figure 6-16). Target T306 was difficult to predict for most CASP participants;
no models have more than 60% of the residues within a 5.0 Å distance cutoff. Our lowest
RMSD decoy (not top scoring) was one of only two models to have 85% of the residues under a
10.0 Å distance cutoff (blue line in Figure 6-16A). Our top scoring decoys were about average
compared to the other CASP models.

Predicting the structure of target T353 was also difficult for most groups. Our lowest
RMSD decoys (1jrp, 1jro) were the only models to have 95% of the residues under a distance
cutoff of 10.0 Å (pink lines in Figure 6-16C). Our top scoring decoys, however, were again
average models.

Our results for T311 and T363 were not very good. Most groups found target T311 fairly
easy to predict, while the results for T363 were mixed. In both cases, our top scoring decoys
were not among the best predictions.

6.1.4 Summary of Results for General Decoy Set

We studied twelve CASP7 targets. The lowest RMSD decoy in the database for five of the
targets was assigned the highest score. Three targets had low RMSD decoys but they were not
assigned the highest score, while four targets did not have any low RMSD decoys in the database.
Because constraints were chosen with the aid of a secondary structure prediction method,
an incorrect secondary structure prediction can lead to poor constraint choices, which in turn
result in bad structure predictions. We also found that many high scoring, high RMSD decoys
have regions of great similarity to the target, and these regions usually contain the reference structure.









When comparing our method to the other methods used in CASP7, we find that our

method performed quite well for some of the hardest targets and not well for some of the easiest

targets.

6.2 Specific Decoy Sets

We studied the same 12 targets as in section 6.1 and used the same sets of constraints. To
generate specific decoy sets for each target, we used the Rosetta algorithm. Each set contained
exactly 10,000 decoys. The Cα RMSDs between each target and each decoy were calculated and
the RMSD distributions can be found in Figure 6-17. The RMSD distribution varies greatly from
target to target. As seen in the general decoy set, target T335 has a very high number of low
RMSD decoys; 95.6% of decoys are under 6.0 Å. Targets T309 and T306 have RMSD
distributions centered around 15.0 Å and 14.0 Å, respectively, and have no decoys with RMSDs
under 6.0 Å. The RMSD of the best decoy for each target is listed in Table 6-3.
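
The percentage of low RMSD decoys follows directly from the counts: Table 6-3 lists 9,562 of the 10,000 T335 decoys under 6.0 Å, which is 95.6%. A minimal Python sketch of that bookkeeping is given below; the function name is hypothetical, and the RMSD list is a placeholder constructed only to reproduce the counts in Table 6-3, not actual decoy data.

```python
def percent_below(rmsds, threshold=6.0):
    """Percentage of decoys whose RMSD (Angstroms) is strictly below `threshold`."""
    below = sum(1 for rmsd in rmsds if rmsd < threshold)
    return 100.0 * below / len(rmsds)


# Placeholder values: 9,562 decoys below the threshold and 438 above it,
# matching the T335 counts in Table 6-3 (the individual RMSDs are invented).
t335_rmsds = [3.0] * 9562 + [8.0] * 438
t335_percent = percent_below(t335_rmsds)
```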

6.2.1 Targets That Worked

Four target proteins, T311, T335, T358, and T349, had the lowest RMSD decoy in their set
satisfy all twenty-five constraints. In each case, the lowest RMSD was under 6.0 Å. The
number of top scoring decoys, range of RMSD values, as well as the RMSD of the best decoy in
the set can be found in Table 6-3.

For T311, 181 decoys had perfect scores and their RMSDs ranged from 4.8–14.6 Å. The
highest (#61) and lowest (#3545) RMSD decoys satisfying all constraints are shown in
Figure 6-18(A, B). The first fifty residues of the high scoring, high RMSD decoy (#61) are similar to the
target; the RMSD in this region is only 1.7 Å. This decoy has a high RMSD despite its very
similar local structure because it also has a few consecutive incorrect dihedral angles. Rotation
around these bonds makes the whole decoy structure quite different from the target despite their
great similarities in local structure. The worst decoy in the set had an RMSD of 19.0 Å and
satisfied only 6 constraints. Our method successfully assigned the worst decoy in the set a
low score. Rosetta generated 158 decoys with RMSDs under 6.0 Å and our discrimination
procedure assigned these decoys scores ranging from 13–25 with an average of 23.

The 84 top scoring decoys for T349 had RMSDs ranging from 4.0–11.0 Å. The highest
(#3665) and lowest (#4480) RMSD decoys can be found in Figure 6-18(C, D). Despite the high
overall RMSD, the high scoring, high RMSD decoy was similar to the target from residues 6 to
24 and 49 to 66, having RMSDs of only 1.8 Å and 1.5 Å, respectively. Both sections correspond to
α-helices, with the first being the reference helix. The highest RMSD decoy in the set (15.3 Å)
satisfied 14 constraints. Fifty-four decoys were generated with RMSDs less than 6.0 Å and they
satisfied between 14 and 25 constraints with an average of 23.

Target T358 had 20 decoys satisfying all constraints with RMSDs ranging from 4.1–11.4 Å
(Figure 6-18(E, F)). The 11.4 Å RMSD decoy (#3334) was similar to the target from residues
1 to 20, which included the reference helix, and from residues 48 to 60. The RMSDs of these
sections were 2.1 Å and 2.0 Å, respectively. The highest RMSD decoy for T358 satisfied 13
constraints and had an RMSD of 15.7 Å. The 122 decoys with RMSDs under 6.0 Å had scores
ranging from 10–25 with an average of 20, slightly lower than the other targets.

Many decoys satisfied all constraints for T335. The range of RMSDs, however, was small,
from 1.4–8.2 Å (Figure 6-18(G, H)). The lowest RMSD decoy for this target (#9623) was the
lowest RMSD decoy generated for any target. Like the target T335, the high scoring, high
RMSD decoy, #2312, had an α-helix from residues 4 to 20. The target and decoy had an RMSD
of 1.8 Å between residues 1–21. Almost all the decoys in this set had RMSDs under 6.0 Å and
their scores ranged from 12–25 with an average of 23.5. One of the highest RMSD decoys,
with an RMSD of 11.4 Å, also satisfied 24 constraints. In this case, very few decoys had high
RMSDs and it was hard to separate the good from the bad.

Four targets, T288, T348, T359, and T363, had top scoring decoys with low RMSDs, but
the lowest RMSD decoy in the set did not satisfy all constraints.

For target T288, 86 decoys satisfied all twenty-five constraints and their RMSDs ranged
from 3.6–13.2 Å (Figure 6-19). The 13.2 Å RMSD decoy (#7663) was similar to the target
from residue 46 to 83, which included the reference helix. The RMSD of this region was 1.9 Å.
The best decoy in the database (#369), with an RMSD of 3.5 Å, satisfied 23 constraints. The two
constraints it did not satisfy were between atoms in the reference helix and the second β-sheet.
The highest RMSD decoy in the database satisfied 21 constraints and had an RMSD of 17.0 Å.
That this high RMSD decoy satisfied so many constraints may be the result of poor constraint
choices. Rosetta generated 25 decoys with RMSDs under 6.0 Å and their scores ranged from
19–25 with an average score of 24.

No decoys for T348 satisfied all 25 constraints. Two decoys satisfied 23 constraints and
had RMSDs of 5.6 Å and 12.8 Å (Figure 6-20). They did not satisfy the same set of 23
constraints. The 12.8 Å RMSD decoy (#7088) was similar to the target from residues 15 to 31
and 49 to 61, which included the reference helix (from residues 58–61). The RMSDs in these
regions were 2.0 Å and 1.6 Å, respectively. The lowest RMSD decoy (#1017) satisfied only 12
constraints and had an RMSD of 4.2 Å. The secondary structure prediction for this target was
not very good, resulting in poor constraint choices. The highest RMSD decoy in the set satisfied
8 constraints and had an RMSD of 15.3 Å. The 151 low RMSD decoys (RMSD less than 6.0 Å)
had scores ranging from 7–25 with an average of 16. To improve our results, better constraints
must be chosen.









Forty decoys of T359, with RMSDs ranging from 6.0–15.2 Å, satisfied all constraints
(Figure 6-21). The high scoring, high RMSD decoy (#3012) and the target had an RMSD of
2.1 Å from residues 51–89, which included the reference helix from residues 71 to 79. The best
decoy in the set, #112, had an RMSD of 4.8 Å and satisfied 23 constraints, while the highest
RMSD decoy (17.7 Å) satisfied 17 constraints. Only four decoys were generated that had
RMSDs less than 6.0 Å. Their scores ranged from 22–25 with an average of 23.

For target T363, 11 decoys satisfied all twenty-five constraints and their RMSDs ranged
from 5.7–10.5 Å (Figure 6-22). The 10.5 Å decoy (#5181) was similar to the target from
residue 1 to 39, having an RMSD of 2.8 Å in this region. In addition to containing the
reference helix, this region is also the most structurally defined area; the target is fairly unstructured from
residue 51 to the C-terminus. The two best decoys in the set (#4551, #6388), with RMSDs of
5.1 Å, satisfied 19 and 24 constraints, while the highest RMSD decoy (16.8 Å) satisfied 11
constraints. Twenty-five decoys had RMSDs under 6.0 Å and their scores ranged from 16–25
with an average of 20.

6.2.2 Targets That Did Not Work

The Rosetta method generated low RMSD decoys for targets T340 and T353, but our
method was unable to discriminate the good decoys from the bad. Target T340 had only five
low RMSD decoys in the database with scores ranging from 12–24 and an average of 21, while
the 57 low RMSD decoys for T353 had scores ranging from 15–24 with an average score of 20.

Six decoys of T340 satisfied all constraints and had RMSDs ranging from 7.4–14.5 Å
(Figure 6-23). The 7.4 Å RMSD decoy (#9880) was similar to the target from residues 1 to 11
and 50 to 74, which included the reference helix. Both sections had RMSDs of 1.9 Å. The
14.5 Å RMSD decoy (#9412) was most similar to the target from residue 2 to 17 and 63 to 83. These
sections had RMSDs of 1.9 Å and 2.9 Å, respectively. The best decoy in the set, #94, had an RMSD of 3.7 Å
but satisfied only 12 constraints, while the next best decoy, with an RMSD of 3.8 Å, satisfied 24
constraints. The decoy with the highest RMSD in the set (17.6 Å) satisfied 20 constraints, while
the next highest RMSD decoy (17.2 Å) satisfied 10 constraints.

Twenty decoys of T353 satisfied all twenty-five constraints with RMSDs ranging from
7.0–13.5 Å (Figure 6-24). The top scoring decoys were similar to the target from residue 1 to 47
for the 7.0 Å decoy and from residue 1 to 43 for the 13.5 Å decoy. The similar regions included
the reference helix from residue 29 to 42. The RMSDs between target and decoy for these
sections were 2.8 Å and 3.0 Å, respectively. The higher RMSD decoy (#3009) also had a
segment of high similarity to the target between residues 59–74 (1.7 Å). The best decoy in the
set, #5124, had an RMSD of 4.3 Å and satisfied 22 constraints. The three unsatisfied constraints
involved residue 29, which is located in an α-helix that is slightly displaced compared to the
target. The highest RMSD decoy (16.2 Å) satisfied only 13 constraints.

6.2.3 Targets That Never Had a Chance

For targets T306 and T309, no low RMSD decoys were generated. No decoys for T306
satisfied all constraints. However, two decoys satisfied 24 constraints and had RMSDs of 12.5 Å
and 13.5 Å (Figure 6-25). The 12.5 Å RMSD decoy (#1604) was most similar to the target
between residues 15–32 and 28–59, having RMSDs of 3.0 Å and 2.8 Å in these segments. The
13.5 Å decoy (#5643) and the target have similarities between residues 36–45, 72–82, and
56–89, with RMSDs of 2.5 Å, 2.8 Å, and 3.1 Å. The reference β-strand was included in a low
RMSD region of each of the top scoring decoys. The lowest RMSD decoy in the database (#935)
satisfied only 16 constraints and had an RMSD of 8.0 Å; five of the unsatisfied constraints
involved residue 40, which is located within the β-barrel structure of both the target and the
decoy. The highest RMSD decoy (18.0 Å) satisfied only 9 constraints.









Only one decoy of T309 (#210) satisfied all constraints and it had an RMSD of 11.0 Å
(Figure 6-26). The target has defined secondary structure only between residues 13 and 40, and
the RMSD between the target and decoy in this region (from residue 13 to 39) was 3.7 Å.
Therefore, if only the regions of defined secondary structure are considered, the method worked
well for this target. The best decoy in the database, #5810, satisfied 22 constraints and had an
RMSD of 8.1 Å. Two of the unsatisfied constraints involved residue 30, which is located within
one of the β-sheets. The highest RMSD decoy (19.8 Å) satisfied 16 constraints.

6.2.4 Summary of Results Using the Specific Decoy Set

Rosetta generated low RMSD decoys for ten of the twelve targets. Even the high RMSD

decoys usually had some local structural similarities to the target. Low RMSD decoys were

assigned top scores for eight targets. Two targets had low RMSD decoys in the database but

they did not satisfy all constraints while low RMSD decoys were not generated for two other

targets, T306 and T309. Both T306 and T309 were difficult to predict for most CASP

participants.

6.3 Comparisons of Decoy Sets

A comparison of the results using each type of decoy set can be found in Table 6-4.
Rosetta generated low RMSD decoys for all but two targets, T309 and T306. The general decoy
set generated low RMSD decoys for target T309 (5.7 Å), but the best decoy in the set for target
T306 had an RMSD of 8.1 Å. Also, the lowest RMSD decoy for ten of the twelve targets is
lower for Rosetta decoys than for the general set. The general decoy set generated better decoys for
targets T309 and T359.

The discrimination process was equally effective for both types of decoy sets. Three
targets (T288, T359, T335) had successful predictions using both decoy sets. Target T306 did
not have low RMSD decoys in either set. This is not a very common type of protein. The three
targets for which the general decoy set contained low RMSD decoys that were not assigned top
scores (T348, T349, T358) and the two targets for which the general decoy set did not generate
low RMSD decoys (T311, T363) had successful predictions using the specific decoy sets. Using
the general decoy set, T309 and T340 had low RMSD decoys with top scores, whereas using the
specific decoy sets, successful predictions were not obtained. Finally, for T353, the specific decoy set generates a
low RMSD decoy but it is not assigned a top score, while the general decoy set does not generate
a low RMSD decoy. When both methods are considered, ten of the twelve targets had
successful predictions.
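
The tally of ten successful targets can be checked with simple set arithmetic. The Python sketch below encodes the per-target outcomes as stated in this section; the variable names are hypothetical, and the sets are a transcription of the text rather than an independent analysis.

```python
# Targets with a successful prediction using each decoy set, as described above.
general_success = {"T288", "T335", "T359", "T309", "T340"}
specific_success = {"T288", "T335", "T359", "T348", "T349", "T358", "T311", "T363"}

all_targets = {
    "T288", "T306", "T309", "T311", "T335", "T340",
    "T348", "T349", "T353", "T358", "T359", "T363",
}

both_sets = general_success & specific_success    # successful with both decoy sets
either_set = general_success | specific_success   # successful with at least one
neither_set = all_targets - either_set            # not predicted by either method

print(len(either_set), sorted(neither_set))
```

The union contains ten targets, and the two left over are T306 and T353, in agreement with the footnote to Table 6-4.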















Target   Lowest RMSD in    Range of RMSDs for top    Number of top
         decoy set (Å)     scoring decoys (Å)        scoring decoys

288      3.4               3.4–5.6                   11
340      3.0               3.0–14.0                  60
359      3.6               3.6–13.1                  15
309      5.7               5.7–14.4                  9*
335      2.1               2.1–9.5                   10582
348      5.3               6.9–11.0                  18
349      5.5               10.5–13.6                 64
358      5.5               8.3–11.8                  5
306      8.1               13.4–13.5                 2*
311      6.6               10.1–10.2                 23
353      6.4               10.8–15.2                 26*
363      6.4               9.3–9.4                   3
*No decoys satisfied all constraints; the number represents the number of decoys satisfying 24
constraints.


Table 6-1. Results for 12 targets











T288
JPred  Real
5–11  5–10 β
17–23  19–23 β
30–36  31–36 β
41–45 α
53–57  52–57 β
60–61 β
66–75 α  67–76 α
80–86 β  80–86 β


T306
JPred  Real
4–7  2–10
10–15 β
17–19 α
24–30  23–30 β
40–46  36–45 β
54–59  53–59 β
61–67 α
75–87 β  76–81 β
84–85


T309
JPred  Real
5–10
13–15
18–21 α
20–24
27–40 α  29–33
34–40 α
46–52 β


T311  T335  T340
JPred  Real  JPred  Real  JPred  Real
3–12 α  1–13 α  3–18 α  5–19 α  6–11  6–11
16–23 α  17–24 α  24–46 α  24–40 α  18–24  19–23
28–35 α  28–36 α  51–55  31–36  30–35
43–52 α  43–53 α  64–73 α  40–44 α
57–75 α  57–74 α  52–56  50–55
78–80 α  58–59
83–85 α  66–74 α  65–73 α
80–86 β  78–84 β

T348  T349  T353
JPred  Real  JPred  Real  JPred  Real
4–10 α  2–7  3–11  3–11
18–20  9–21 α  9–23 α  17–24  17–24
26–28  25–28  27–30  26–28  29–43 α  29–43 α
29–31 α  36 & 43  59–61  59–61
35–36  33–38  46–50  45–51  65–74 α  65–74 α
41–42  54–64 α  55–64 α  76–79  76–79
46–48 α
50–51
58–62 α  54–60 α

T358  T359  T363
JPred  Real  JPred  Real  JPred  Real
5–14 α  5–14 α  5–11  2–11  3–9 i 10
15–21  18–21  17–22  19–22  14–20  13–22
27–33  29–33  35–40  35–40  25–34 α  27–34 α
39–42  39–43  45–49 α  36–39 α
52–60 α  50–58 α  57–61  56–61  51–56
64–65  69–72
71–79 α  71–80 α
84–91  84–94


Table 6-2. JPred predictions compared to target structures


92 -93


p | 93 -94 | P










Table 6-3. Results for each of the 12 targets
CASP     Number of top     RMSD range of top      Lowest RMSD in    Number of decoys with
Target   scoring decoys    scoring decoys (Å)     decoy set (Å)     RMSDs under 6.0 Å

335      3619              1.4–8.22               1.4               9,562
311      181               4.8–14.6               4.8               158
358      20                4.1–11.4               4.1               122
349      84                4.0–11.0               4.0               54
363      11                5.7–10.5               5.1               25
288      86                3.6–13.2               3.5               25
359      40                6.0–15.2               4.8               4
348      2*                5.6, 12.8              4.2               151
353      20                7.0–13.5               4.3               57
340      6                 7.4–14.5               3.7               5
309      1                 11.0                   8.1               0
306      2*                12.5, 13.5             8.0               0
*Top score was less than 25.


Table 6-4. Comparison of results for each target using both types of decoy sets
Target   Specific Set (Å)   General Set (Å)

288      3.6–13.2           3.4–5.6
335      1.4–8.22           2.1–9.5
359      6.0–15.2           3.6–13.1
349      4.0–11.0           10.5–13.6
358      4.1–11.4           8.3–11.8
348      5.6, 12.8          6.9–11.0
363      5.7–10.5           9.3–9.4
311      4.8–14.6           10.1–10.2
353*     7.0–13.5           10.8–15.2
340      7.4–14.5           3.0–14.0
309      11.0               5.7–14.4
306*     12.5, 13.5         13.4–13.5
*Targets T353 and T306 were not predicted successfully by either method. The entries in red had
successful predictions. Those in green had low RMSD decoys in the set but they did not satisfy
all constraints, while those in blue did not have low RMSD decoys.











[Plot for Figure 6-1. x-axis: RMSD (Å). Legend entries recovered: T288, T306, T309, T311,
T335, T340, T348, T349, T353, T358, T359.]

Figure 6-1. RMSD distributions for each target protein


Figure 6-2. Target T288 and the top scoring decoys for T288. A) Target T288. B) 1tq3 had an
RMSD of 3.4 Å. C) 1um7 had an RMSD of 3.5 Å. D) 1b8q had an RMSD of 5.6 Å.


















Figure 6-3. Target T340 and some of the top scoring decoys. A) Target T340. B) 1wf7 with an
RMSD of 3.0 Å. C) 1wif with an RMSD of 4.4 Å. D) 1qln with an RMSD of 14.0
Å. E) 1ufl with an RMSD of 4.4 Å. F) 1uit with an RMSD of 3.4 Å. G) 1v51 with
an RMSD of 3.3 Å.













Figure 6-4. Target T359 and its top scoring decoys. A) Target T359. B) 1pld and 1uml, each
with an RMSD of 3.6 Å. C) 1ueq and 1wfy, each with an RMSD of 6.8 Å. D) 1w5e
with an RMSD of 13.1 Å.



















Figure 6-5. Target T309 and its top scoring decoys. A) Target T309. B) 1ese, 1esd, and 1esc,
each with an RMSD of 5.7 Å. C) 1wk1 with an RMSD of 14.4 Å.




















Figure 6-6. Target T335 and its top scoring decoys. A) Target T335. B) 1qsp, 1wa5, and 1z3h,
each with an RMSD of 2.1 Å. C) 1b6c has an RMSD of 9.5 Å.





[GDT plots for Figure 6-7, panels A–E. x-axis: percent of residues.]

Figure 6-7. Use of Global Distance Test (GDT) analysis to compare our top scoring decoys with
the results from other methods used in CASP7. A) Target T288. B) Target T340. C)
Target T359. D) Target T309; decoys from 1esc, 1esd, and 1ese have RMSDs of 5.7
Å, while 1wk1 has an RMSD of 14.4 Å. E) Target T335; the 1b6c decoy has an
RMSD of 9.5 Å, while the other three have RMSDs of 2.1 Å.




















Figure 6-8. Target T348, the best decoys in the database, and the top scoring decoys. A) Target
T348. B) The best decoy in the database, 1tl2, had an RMSD of 5.3 Å. Top scoring
decoys: C) 1cvm, 1poo, 2poo, and 1qlg, each with an RMSD of 6.9 Å. D) 1s61 had
an RMSD of 8.7 Å. E) 1hzz, 117d, 117e, 1ptj, 1u3d, and 1xlt have RMSDs ranging
from 10.6–10.7 Å. F) 1e88 had an RMSD of 11.0 Å.


Figure 6-9. Target T349, best decoy in database, and top scoring decoys. A) Target T349. B)
1uj5 had an RMSD of 5.6 Å and a score of 21. C) 1wel had an RMSD of 5.5 Å and a
score of 14. D) 1t9u had an RMSD of 10.5 Å and a score of 25. E) 1ta3 had an
RMSD of 13.6 Å and a score of 25.








Figure 6-10. Target T358, lowest RMSD decoys in database, and top scoring decoys. A) Target
T358. B) The best decoys in database, 1oe9, 1w7i, and 1w7j, each have an RMSD of
5.5 Å. C) The top scoring decoy, 1p99, had an RMSD of 8.3 Å. D) The top scoring
decoys, 1efd, 1k2v, and 1k7s, each have an RMSD of 9.3 Å. E) The top scoring
decoy, 1x9d, had an RMSD of 11.8 Å.













[Figure 6-11 plots: horizontal axis is percent of residues.]





Figure 6-11. Use of Global Distance Test (GDT) analysis to compare our results with those from
other methods used in CASP7. A) Target T348. The lowest RMSD decoy in the set
(1tl2) has a score of 22. The remaining decoys in the figure satisfied all constraints.
B) Target T349. The lowest RMSD decoys are 1wel and 1uj5, which satisfied 14 and
21 constraints respectively, while 1t9u and 1ta3 satisfied all constraints. C) Target
T358. The lowest RMSD decoy in the set, 1oe9, satisfied 22 constraints. The
remaining decoys satisfied all constraints.

















Figure 6-12. Target T306, lowest RMSD decoys in the database, and the top scoring decoys. A)
Target T306. B) Best decoy in database, 1fgu, had an RMSD of 8.1 Å. C) Top
scoring decoys, 1jhw and 1j72, had RMSDs ranging from 13.4 to 13.5 Å.


Figure 6-13. Target T311, best decoy in database, and top scoring decoy. A) Target T311. B)
1f6g has the lowest RMSD in the database, 6.6 Å. The PDB entry for the parent
protein contains only the α-carbons. C) The top scoring decoys, 1a9x, 1bxr, c30,
1c30, 1cs0, 1jdb, 1kee, 1my6, and 1t36, had RMSDs ranging from 10.1 to 10.2 Å.


Figure 6-14. Target T353, best RMSD decoy in database, and top scoring decoys. A) Target
T353. B) The lowest RMSD decoys in database, 1jrp and 1jro, each have an RMSD
of 6.4 Å. C) The top scoring decoy, 1ekf, had an RMSD of 10.8 Å. D) Both 1j49
and 1j4a are top scoring decoys and have an RMSD of 11.6 Å.



































[GDT plot panels A and B: horizontal axis is percent of residues.]


Figure 6-15. Target T363, best decoy in database, and a top scoring decoy. A) Target T363. B)
The lowest RMSD decoy is 1hux with an RMSD of 6.4 Å. C) The top scoring decoy,
1sxg, had an RMSD of 9.3 Å.


[GDT plot panels, continued: horizontal axis is percent of residues.]


Figure 6-16. Use of Global Distance Test (GDT) analysis to compare our results with those from
other methods used in CASP7. A) Target T306. The lowest RMSD decoy in the set,
1fgu, satisfied 18 constraints. The other two decoys satisfied all constraints. B)
Target T311. The lowest RMSD decoy in the set, 1f6g, is not shown, while 1a9x is a
top scoring decoy. C) Target T353. The lowest RMSD decoy in the set, 1jrp,
satisfied 17 constraints. D) Target T363. The lowest RMSD decoy in the set, 1hux,
satisfied 24 constraints, while 1sxg satisfied all constraints.


[Figure 6-17 histogram: horizontal axis is RMSD (Å), 0 to 25; curves are labeled by target.]

Figure 6-17. Histogram of Cα RMSDs for all twelve CASP targets.


















Figure 6-18. Top scoring decoys for targets that worked. A) T311, #3545, with an RMSD of 4.8
Å. B) T311, #61, with an RMSD of 14.6 Å. C) T349, #4480, with an RMSD of 4.0
Å. D) T349, #3665, with an RMSD of 11.0 Å. E) T358, #1572, with an RMSD of
4.1 Å. F) T358, #3334, with an RMSD of 11.4 Å. G) T335, #9623, with an RMSD
of 1.4 Å. H) T335, #2312, with an RMSD of 8.2 Å. For each target, the top scoring
decoys with the lowest and highest RMSDs are shown.

















Figure 6-19. Results for T288. A) The best decoy in data set, #369, had an RMSD of 3.5 Å. B)
The top scoring decoy, #5124, had an RMSD of 3.6 Å. C) The top scoring decoy,
#7663, had an RMSD of 13.2 Å.











Figure 6-20. Results for T348. A) The best decoy in data set, #1017, had an RMSD of 4.2 Å. B)
The top scoring decoy, #4218, had an RMSD of 5.6 Å. C) The top scoring decoy,
#7088, had an RMSD of 12.8 Å.











Figure 6-21. Results for T359. A) One of the best decoys in data set, #112, had an RMSD of 4.8
Å. B) The other best decoy in data set, #6536, also had an RMSD of 4.8 Å. C) The
top scoring decoy, #5817, had an RMSD of 6.0 Å. D) The top scoring decoy, #3012,
had an RMSD of 15.2 Å.


















Figure 6-22. Results for target T363. A) The best decoy in data set, #4551, had an RMSD of 5.1
Å. B) The best decoy in data set, #6388, had an RMSD of 5.1 Å. C) The top scoring
decoy, #6376, had an RMSD of 5.7 Å. D) The top scoring decoy, #5181, had an
RMSD of 10.5 Å.


Figure 6-23. Results for target T340. A) The best decoy in data set, #94, had an RMSD of 3.7 Å.
B) The top scoring decoy, #9880, had an RMSD of 7.4 Å. C) The top scoring decoy,
#9412, had an RMSD of 14.5 Å.


Figure 6-24. Results for target T353. A) The best decoy in data set, #5124, had an RMSD of 4.3
Å. B) The top scoring decoy, #5488, had an RMSD of 7.0 Å. C) The top scoring
decoy, #3009, had an RMSD of 13.5 Å.


















Figure 6-25. Results for target T306. A) The best decoy in data set, #935, had an RMSD of 8.0
Å. B) The top scoring decoy, #1604, had an RMSD of 12.5 Å. C) The top scoring
decoy, #5643, had an RMSD of 13.5 Å.














Figure 6-26. Results for Target T309. A) The best decoy in database, #5810, had an RMSD of
8.1 Å. B) The top scoring decoy, #210, had an RMSD of 11.0 Å.









CHAPTER 7
COMPARISONS OF GENERAL AND SPECIFIC DECOY SETS


7.1 Comparing the performance of the general and specific decoy sets on four target
proteins

For most targets, many decoys from both the general and specific sets satisfied at least

twelve constraints. We found it necessary to use twenty-five distance constraints to adequately

distinguish between reliable and unreliable decoys while employing a constraint distance

acceptance range of ±5 Å. To avoid dependence on the order of application of constraints, we

counted the total number of constraints that each decoy satisfied. Decoys that satisfied the most

constraints systematically had the lowest RMSDs.
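
The counting step described above can be sketched in a few lines of Python. This is a minimal illustration and not the code used in this work: the function name count_satisfied, the data layout (a dictionary of alpha-carbon coordinates keyed by residue number, and a list of residue-pair constraints), and the toy numbers are all assumptions. The ±5 Å acceptance range is interpreted here as an absolute difference of at most 5 Å between the decoy distance and the measured distance.

```python
import math

def count_satisfied(ca_coords, constraints, tolerance=5.0):
    """Count how many distance constraints a decoy satisfies.

    ca_coords   -- dict: residue number -> (x, y, z) of the alpha-carbon, in angstroms
    constraints -- list of (residue_i, residue_j, measured_distance) tuples
    tolerance   -- acceptance range in angstroms (assumed symmetric)
    """
    satisfied = 0
    for res_i, res_j, measured in constraints:
        decoy_distance = math.dist(ca_coords[res_i], ca_coords[res_j])
        if abs(decoy_distance - measured) <= tolerance:
            satisfied += 1
    return satisfied

# Hypothetical three-residue decoy and three constraints, for illustration only.
toy_decoy = {1: (0.0, 0.0, 0.0), 10: (10.0, 0.0, 0.0), 20: (10.0, 10.0, 0.0)}
toy_constraints = [(1, 10, 12.0), (1, 20, 25.0), (10, 20, 9.0)]
toy_score = count_satisfied(toy_decoy, toy_constraints)
```

Because the score is a simple count over all constraints, it does not depend on the order in which the constraints are applied, which is the property the paragraph above relies on.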

The general decoy set contains a fixed number of decoys, 8,060,245, while the optimum

size for the specific decoy set, in terms of cost and performance, was found to be 10,000 decoys.

Decoys with slightly lower RMSDs are generated in the 10,000 set compared to the 1,000 set,

but significantly lower RMSD decoys are not usually generated in the 50,000 set.

Rosetta generates low RMSD structures for each of our four target proteins and our scoring

procedure effectively assigned these low RMSD decoys high scores. The RMSDs of the best top

scoring decoys ranged from 2.4 to 4.6 Å. Several decoys satisfied all twenty-five constraints for

1ubi and 2ezk; the top scoring decoys had a large range of RMSD values. An alternate

constraint set may have been more effective in distinguishing between good and bad decoys.

We had similar findings using our general decoy set. The RMSDs of the top scoring

decoys ranged from 3.6 to 7.7 Å. Our final results showed that three of the four target proteins

had successful structure predictions. Such low resolution structures may be used as starting

points in density generation for X-ray structures and may also be useful in determining protein

functions."o We also analyzed the RMSDs for several proteins and found that in general the









average RMSD range for decoys in our database is ~15 Å. Like the PDB, our database contains

many semi-redundant structures. Removal of such decoys may further decrease the search time

of an already fast screening process.

7.2 CASP7 results

We studied twelve CASP targets using the general and specific decoy sets. Using the

general set, the lowest RMSD decoy for five of the CASP7 targets satisfied all distance

constraints. Three targets had low RMSD decoys but were not assigned the highest score while

four targets did not have any low RMSD decoys in the database. Because constraints are chosen

with the aid of a secondary structure prediction method, predicting incorrect secondary structure

can often lead to poor constraint choices which results in bad structure predictions. When

comparing our method to the other methods used in CASP7, we find that our method's

performance was decent. It does quite well for some of the hardest targets, like T309, and not

well for some of the easiest targets, like T288, T340, and T359.

Some targets had no low RMSD decoys in the general decoy set. Two reasons for this

occurrence are: (1) the target contained an unusual protein fold not seen in the PDB or (2) the

protein fold was excluded from the database. These protein folds may be more common for

smaller structures but parent proteins with fewer than 100 residues were not used in decoy

generation. Another explanation is that the most similar proteins in the PDB were missing small

fragments of structure, thereby excluding them from the decoy set.

For the specific decoy sets, low RMSD decoys were generated for all but two targets, T309

and T306. A low RMSD decoy was generated for target T309 (5.7 Å) in the general decoy set,

but for target T306, the best decoy in the general set had an RMSD of 8.1 Å. Also, for ten of the

twelve targets, the specific set had lower RMSD decoys compared to those generated in the

general set. The general decoy set generated better decoys for Target T309 and T359.










The discrimination process was equally effective for both types of decoy sets. Three

targets (T288, T359, T335) had successful predictions using both decoy sets. Most methods

used in CASP7 performed well for these targets as well. Target T306 did not have low RMSD

decoys in either set and it was very difficult for most other groups to predict. The three targets

for which low RMSDs were generated but were not found in the general decoy set (T348, T349,

T358) and two targets for which the general decoy set did not have low RMSD decoys (T311,

T363), had low RMSD predictions using the specific decoy sets. Using the general decoy set,

T309 and T340 had low RMSD decoys with top scores, whereas use of the specific decoy set did

not result in successful predictions. Finally, for T353, the specific decoy set generates a low

RMSD decoy but it is not assigned a top score, while the general decoy set does not generate a

low RMSD decoy. When both methods are considered, ten of the twelve targets had successful

predictions.

In both types of decoy sets, whenever high RMSD decoys satisfied all constraints it was

because the decoy and the target had regions of great similarity. In those cases where we

selected a reference structure and chose constraints involving atoms from that region, such high

RMSD high scoring decoys were common, especially when the reference structures were

included in the region of similarity. Better results were seen when a set of residues was selected

and constraints were chosen between them.









CHAPTER 8
INTRODUCTION: AZOBENZENE ISOMERIZATIONa

8.1 Isomerization Mechanism

Azobenzene can adopt cis and trans conformations in the electronic ground state with the

trans isomer lower in energy by approximately 0.6 eV.164 The trans to cis energy barrier was

found experimentally to be about 1.6 eV.165 Azobenzene is known to undergo a reversible

photoisomerization between these conformations. A trans to cis isomerization occurs upon

excitation at 365 nm (3.40 eV) and a cis to trans isomerization takes place at 420 nm (2.95

eV).166 A thermally induced cis to trans isomerization is also possible in the ground state. Due

to their facile inter-conversion at appropriate wavelengths, azobenzenes have the potential to be

used in optical switching and image storage devices167-17 as well as molecular scissorsl7 and as

targets for coherent control in molecular electronics.172
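
The wavelength and energy pairs quoted above can be cross-checked with the photon energy relation E = hc/λ. The sketch below is only a worked check: the helper name wavelength_nm_to_ev is hypothetical, and the constant is the standard value of hc expressed in eV nm.

```python
# Product of Planck's constant and the speed of light, in eV * nm.
HC_EV_NM = 1239.841984

def wavelength_nm_to_ev(wavelength_nm):
    """Convert a photon wavelength in nanometers to an energy in electron volts."""
    return HC_EV_NM / wavelength_nm

trans_to_cis_ev = wavelength_nm_to_ev(365.0)  # excitation wavelength quoted for trans to cis
cis_to_trans_ev = wavelength_nm_to_ev(420.0)  # excitation wavelength quoted for cis to trans
```

Rounded to two decimals, these reproduce the 3.40 eV and 2.95 eV values given in the text.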

There are two pathways by which isomerization is thought to take place. The rotation

pathway occurs by an out of plane torsion of the CNNC dihedral angle labeled $ in Figure 8-1.

The inversion pathway involves an in-plane inversion of the NNC angle between the azo group

and the adjacent carbon of the benzene ring. The inversion angle is labeled φ in Figure 8-1. An

interesting and somewhat puzzling aspect of the photochemistry of azobenzenes is the difference

in trans to cis quantum yield upon excitation to the dark S1(nπ*) state (Φ = 0.20 [166,173] to 0.36 [174])

and bright S2(ππ*) state (Φ = 0.09 [166,173] to 0.20 [174]). Even within this large experimental range,

Φ(S1) is clearly larger than Φ(S2). When the rotation pathway is blocked by restricting the NN

bond rotation with a crown ether,17 cyclophane structure,176 or within a cyclodextrin cavity,l7


"Adapted with permission from Crecca, C. R., Roitberg, A. E. Theoretical Study of the
Isomerization Mechanism of Azobenzene and Disubstituted Azobenzene Derivatives, J. Phys.
Chem. A, 2006;110:8188-8203 Copyright 2006 American Chemical Society









the difference in quantum yield disappears. This observation led to the belief that isomerization

occurs by different mechanisms after the n→π* and π→π* excitations.

Most researchers agree that the inversion mechanism dominates in the ground state,l7~s

but until recently there was much debate over which mechanism dominates after excitation to

each excited state. Monti'sl7 minimal basis set CI calculations provided the first theoretical

explanation: excitation to S1 resulted in isomerization via the inversion pathway while the

rotation pathway dominated after S2 excitation. His potential energy curves were adopted by

most experimentalists and used to explain their results.

Time-resolved UV-visible absorption spectroscopy of azobenzene by Lednev shows that

upon excitation of trans-azobenzene at λexc = 280 to 347 nm, two transients are formed.182-184

One was determined to be fast decaying, 1 ps, corresponding to the S2 state and the other was

longer-lived, 10 to 16 ps, corresponding to the S1 state. Lednev used Monti's potential energy

curves to explain his results. Therefore, these transients have been assigned assuming the

rotational pathway dominates after S2 excitation.

Fujino185 performed time-resolved Raman spectroscopy to show that the S1 state that

formed after S2 excitation had an NN stretching frequency similar to that of the S0 state. This

indicates the NN double bond remains intact after the excitation and therefore provides evidence

for the inversion mechanism in the S1 state. In later work, Fujino186 presented results from a

time-resolved fluorescence experiment that denied the existence of a rotational pathway that

starts from the S2 state, in contrast with Monti's work. They showed that isomerization always

occurs in the S1 state regardless of excitation wavelength. In order to explain the differing

quantum yields, he proposed an additional relaxation channel that must be opened upon S2

excitation and produces mostly trans isomers.









Much theoretical work has been done to investigate the photochemistry of azobenzene.

Cattaneo and Persico179 performed complete active space self-consistent field (CASSCF) and

CIPSI calculations to generate potential energy curves of the ground and excited states.

Ishikawa et al.187 obtained three-dimensional potential energy surfaces of S0, S1, S2, and S3 states

using CASSCF and multireference configuration interaction method with singles and doubles

(MRCISD). Quennville188 used CASSCF to generate potential energy curves for the lowest five

electronic states. Tiago et al.189 performed two-dimensional surface scans for S0, S1, and S2

using TDDFT. Ciminelli et al.190 used a combination of Tully's surface hopping approach with a

direct semiempirical calculation to study the dynamics in the excited states. Cembran et al.191

calculated the lowest singlet and triplet excited state PES along the torsion pathway using

complete active space with second-order perturbation theory (CASPT2). Gagliardi et al.192 also

focused on the torsion pathway but used MS-CASPT2 and TDDFT. Diau193 used CASSCF to

look at the inversion, rotation, and concerted-inversion pathways on the S1 surface.

The most recent theoretical conclusions agree that the n→π* state has a slight inversion

barrier and a nearly barrierless rotation pathway.179,180,187,189,190,193 Several researchers have

found an S1-S0 conical intersection along the rotation pathway with a CNNC dihedral angle of

~90.0°.187-191,193 It is generally agreed that when excited to the S1 state, relaxation to the S0 state

occurs through the conical intersection along the midpoint of the rotation pathway.187,191,193

Recent experimental work has shown support for this mechanism.194 The comprehensive studies

of Fujino and Taharals showed that isomerization does not occur directly on the S2 state, but

that it relaxes to a lower lying excited state, where it then isomerizes. Some calculations point

to an S2-S1 conical intersection near the trans-azobenzene Franck-Condon region which leads to

a direct S2 to S1 relaxation.188,190









Many models have been unable to explain the difference in quantum yield that is seen

upon excitation to the S2 state. Diau proposed a new isomerization pathway that is open after S2

excitation.193 This channel produces more trans isomers than cis thereby lowering the trans to cis

quantum yield. This mechanism is explored in our studies.

In addition to investigating the preferred isomerization mechanism, we also look at how

substituting the phenyl rings of azobenzenes affects the isomerization process. In order to study

these effects, we examined the pathways by generating potential energy surfaces of the ground

and excited states of azobenzene [Azo] and four of its derivatives, 4,4'-diaminoazobenzene

[Azon], 4,4'-nitro-aminoazobenzene [AzoNO2NH2], N-[4-(4-(Acetylamino)phenylazo)phenyl]-

acetamide [Azonco], and 4,4'-dinitroazobenzene [AzoNO2NO2] (Figure 8-2). The azobenzenes

will from now on be referred to by the name in brackets.

Absorption spectroscopy by Blevins and Blanchard on the Azo, Azon, and Azonco

systems suggests that the ground state isomerization barrier is reduced when electron-donating

substituents are placed on the benzene rings.lso Our results, however, indicate that electron-

donating groups, like NH2 and HNCOCH3, increase the ground state inversion barrier while

electron withdrawing groups, like NO2, decrease it. Lack of solvent effects in our calculations

may be the reason for these discrepancies as will be discussed further in this paper.

8.2 Applications of Azobenzenes in Biomolecules

Recently, a photoswitchable molecular glue for DNA has been developed which can

reversibly control the hybridization of mismatch-containing DNAs with the aid of an external

light stimulus.19 These small synthetic molecules bind specifically to mismatch DNA and serve

to stabilize the mismatched DNA duplex, thereby acting as a glue holding together two single

stranded DNAs. Azobenzene was incorporated into naphthyridine carbamate dimers, which

bind specifically to GG-mismatches in DNA. When the azobenzene undergoes isomerization,











the positions and orientations of the naphthyridines will also change and therefore enable the

adherence of two single-stranded DNAs that contain the GG-mismatch. The stabilization of the

DNA duplex by the glue was evaluated by melting temperature comparisons. The cis-

azobenzene-containing glue stabilized the GG mismatch DNA more strongly than the trans

complex. It was also found that the cis complex disassembled upon cis to trans isomerization by

430 nm photoillumination. Thus, this reversible, photoswitchable molecular glue for DNA has

the potential to be used in controlling biological functions triggered by DNA hybridization. It

may also be useful in the reversible construction of DNA-based nanoarchitectures.

Azobenzene has also been incorporated into an ionotropic glutamate receptor which acts as

a photoswitch and controls an ion channel in cells.196 The switch covalently modifies target

proteins and can reversibly present and withdraw a ligand from its binding site by the

photoisomerization of azobenzene. Upon photoswitching to the active state, a tethered glutamate

is placed near the binding site. The photostationary state can be altered using different

wavelengths of light thereby setting the fraction of active channels in an analog fashion. The

switch can be turned on with short pulses at one wavelength, kept on in the dark for a few

minutes, and turned off with long pulses at another wavelength. In this way, sustained activation

with minimal radiation is achieved. The process provides quick and reversible control of protein

function.







[Figure 8-1 diagram: labels mark the out-of-plane rotation (dihedral) and the in-plane inversion.]
-Inversion


Figure 8-1. Schematic diagram of the rotation and inversion pathways of the trans-cis
isomerization of azobenzenes. The rotation pathway is obtained by a torsion of the
azo group around the CNNC dihedral angle 5. The inversion pathway is obtained by
an in-plane inversion of the NNC angle (angle φ) formed between the azo group and
the attached carbon of one of the benzene rings.













Figure 8-2. Structures of compounds investigated in this work: (a) Azo (b) Azon (c) Azonco (d)
AzoNO2NH2 (e) AzoNO2NO2. This numbering scheme will be referred to
throughout the text.











CHAPTER 9
COMPUTATIONAL DETAILSb

9.1 Ground-State Calculations

All calculations were performed using Gaussian 03.197 All ground-state geometries were

computed using ab initio density-functional theory with the B3LYP198 functional and the 6-31G*

basis set199 as this method was previously found to accurately reproduce experimental results.200

To investigate the rotation and inversion pathways, the potential energy surface was

generated by scanning the NNC angle (angle φ in Figure 8-1, 7-8-9 in Figure 8-2) from 80.0° to

180.0° and the CNNC dihedral angle (the dihedral angle in Figure 8-1, 4-7-8-9 in Figure 8-2) from

-40.0° to 220.0° at a 10.0° interval. For each calculation, the NNC angle and the CNNC dihedral

angle were fixed at the appropriate values while the rest of the degrees of freedom were

optimized. The remaining points in the potential energy surface were found through symmetry.
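
The scan ranges above define a regular grid of constrained geometries. The following sketch enumerates that grid to show its size; it is an illustration under the assumption that every angle value was paired with every dihedral value, and the names scan_values, nnc_angles, cnnc_dihedrals, and scan_grid are hypothetical. It does not generate Gaussian input or perform any optimization.

```python
def scan_values(start, stop, step):
    """Return evenly spaced values from start to stop, both endpoints included."""
    n_steps = int(round((stop - start) / step))
    return [start + k * step for k in range(n_steps + 1)]

# NNC angle: 80.0 to 180.0 degrees in 10.0 degree steps.
nnc_angles = scan_values(80.0, 180.0, 10.0)

# CNNC dihedral: -40.0 to 220.0 degrees in 10.0 degree steps.
cnnc_dihedrals = scan_values(-40.0, 220.0, 10.0)

# Each (angle, dihedral) pair is held fixed while all other
# degrees of freedom are optimized.
scan_grid = [(angle, dihedral) for angle in nnc_angles for dihedral in cnnc_dihedrals]
```

Under this assumption the scan contains 11 angle values and 27 dihedral values, or 297 constrained optimizations per surface, before symmetry is used to fill in the remaining points.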

The potential energy surface for the concerted inversion pathway was generated in the

same manner except the NNC and CNN angles were scanned synchronously. Two potential

energy surfaces were generated for the 4,4'-nitro-aminoazobenzene due to its asymmetrically

substituted benzene rings. Azo(NO2)NH2 refers to the surface with NO2 on the same side as the

NNC angle being inverted (9-8-7 in Figure 8-2) while AzoNO2(NH2) represents the surface with

NH2 on the same side as the inverted NNC angle (4-7-8 in Figure 8-2).

Charges were calculated using the CHelpG method to determine the electron donating or

withdrawing nature of each substituent. Electron donating groups were identified as those that

showed a decrease in charge on the ortho and para positions of the substituted azobenzene




b Adapted with permission from Crecca, C. R., Roitberg, A. E. Theoretical Study of the
Isomerization Mechanism of Azobenzene and Disubstituted Azobenzene Derivatives, J. Phys.
Chem. A, 2006;110:8188-8203 Copyright 2006 American Chemical Society










compared to the charge on the unsubstituted azobenzene, while electron withdrawing groups

had a negative charge difference on the atoms in the meta positions.

9.2 Excited-State Calculations

All calculations were performed using Gaussian 03. Time dependent density-functional

theory (TDDFT) with the B3LYP functional and the 6-31G* basis set were used for the excited-

state calculations as they were found to give reliable results.166 The excited-state potential

energy surfaces were generated by calculating single point vertical excitation energies for each

of the points in the ground-state potential energy surfaces. Vertical excitations were also

calculated from the fully optimized ground state cis and trans minima.









CHAPTER 10
RESULTS: UNSUBSTITUTED AZOBENZENEc

10.1 Optimized Ground-State Geometry

The optimized geometries of cis and trans azobenzene were found and the results are

shown in Table 10-1. The trans isomer is about 15.2 kcal mol⁻¹ or 0.66 eV lower in energy than

the cis isomer. This is just slightly higher than the experimental value of 0.6 eV.164 Different

experimental methods suggest different structures for the trans isomer. Electron diffraction201

results indicate the phenyl rings of the trans isomer are 30° out of plane while the X-ray202 data

show a planar structure. Our results agree with the X-ray data as well as with the results of

several theoretical calculations.179,187,191,203 The structure of the cis isomer is less controversial.

Our DFT results are very similar to both X-ray data204 and other theoretical

predictions.179,187,189,191,200,203
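
The two energy units used for the cis-trans energy difference can be related with a one-line conversion. The sketch below is a worked check only; the helper name kcal_per_mol_to_ev is hypothetical, and the conversion factor is the standard value of 1 eV expressed in kcal mol⁻¹.

```python
# One electron volt per molecule expressed in kcal per mole.
KCAL_PER_MOL_PER_EV = 23.060548

def kcal_per_mol_to_ev(energy_kcal_per_mol):
    """Convert an energy from kcal/mol to eV per molecule."""
    return energy_kcal_per_mol / KCAL_PER_MOL_PER_EV

# Calculated trans-cis energy difference quoted in the text.
cis_trans_gap_ev = kcal_per_mol_to_ev(15.2)
```

Rounded to two decimals, the result matches the 0.66 eV quoted alongside 15.2 kcal mol⁻¹.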

10.2 Electronic Excitation Energies

For the singlet vertical excitations of the trans isomer of azobenzene, the first transition,

n→π*, is symmetry forbidden and therefore has a very weak oscillator strength, while the second

transition, π→π*, is much more intense. The excitation energies for these transitions are shown

in Table 10-2. The assignment of symmetry is done by visual inspection. Evaluation of our

molecular orbitals (Figure 10-1) reveals that the first transition originates from the lone pair on

the central nitrogens and is of 88% n→π* character as calculated from the CI coefficients. The

second transition is 78% π→π* and is delocalized throughout the entire molecule. It has been

suggested that the second excited state relaxes to the first via a conical intersection above the

ground state trans minimum.

" Adapted with permission from Crecca, C. R., Roitberg, A. E. Theoretical Study of the
Isomerization Mechanism of Azobenzene and Disubstituted Azobenzene Derivatives, J. Phys.
Chem. A, 2006;110:8188-8203 Copyright 2006 American Chemical Society









The TDDFT calculated energy for the S1←S0 transition of trans azobenzene, 2.55 eV, is

fairly close to that of the known experimental value, 2.79 eV.166 Although CASSCF187 and

configuration interaction by perturbative iterative selection (CIPSI)179 calculations have given

values that agree slightly better with experiment for this transition, 2.85 eV and 2.81 eV

respectively, the S2←S0 transition is much better described by TDDFT with an energy of 3.77 eV

compared to an experimental value of 3.95 eV. CASSCF predicts an energy of 7.62 eV and the

CIPSI energy is 4.55 eV. TDDFT consistently predicts slightly lower energies than the

experimental values while the CASSCF values are generally much higher. These values are

summarized in Table 10-2.

The S1←S0 transition occurs at about the same energy for both trans and cis. Unlike the

trans excitations, however, the S1←S0 transition from the cis isomer shows slight intensity due

to the loss of symmetry making the transition allowed. The S2←S0 transition from the cis

isomer is much less intense and slightly higher in energy than that of the trans isomer.

10.3 Potential Energy Surfaces

10.3.1 Ground State

A ground state three dimensional potential energy surface and a contour map were

calculated for azobenzene (Figure 10-2). The surface is very symmetric with two cis and two

trans minima. Cis to trans barrier heights were determined from these plots by finding the

energy of the highest point on the potential energy surface along the pathway and subtracting

from it the energy of the cis minimum. Proper identification of these points as true transition

states was done by checking for the existence of only one imaginary frequency in normal mode

analysis. The peak along the inversion pathway (angle reaction coordinate) was taken to be at an

angle of 180.0° and a dihedral angle of 180.0° and is represented in Figure 10-2b by point 1.

The peak of the rotational pathway (dihedral angle reaction coordinate) was taken to be at a









dihedral angle of 90.0° while the angle was the same as that of the trans minimum, 110.0°. In

the rotation pathway, the peak was a saddle point and is labeled point 2 in Figure 10-2b.
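
The barrier-height definition used here (highest energy along the pathway minus the energy of the cis minimum) can be written as a short function. This is a minimal sketch: the function name barrier_height and the energies in the example are hypothetical placeholders, not values from our surfaces.

```python
def barrier_height(path_energies, cis_minimum_energy):
    """Return the cis to trans barrier: the highest energy along the pathway
    minus the energy of the cis minimum (both in the same units)."""
    return max(path_energies) - cis_minimum_energy

# Made-up relative energies (kcal/mol) along a pathway from cis to trans.
toy_path = [3.0, 7.5, 12.0, 6.0, 0.0]
toy_barrier = barrier_height(toy_path, cis_minimum_energy=3.0)
```

The function only locates the highest scanned point; confirming that the point is a true transition state still requires the frequency check described above.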

Azobenzene is known to undergo a thermal cis to trans isomerization in the ground state so

only the cis barriers will be discussed. The barrier along the inversion pathway, 24.9 kcal mol⁻¹,

was lower than that of the rotation pathway, 36.2 kcal mol⁻¹, indicating that in the ground state,

the inversion mechanism is favored. This is in agreement with previous reports.l7~s A cis to

trans barrier height for azobenzene was measured experimentally'so to be 25.8 kcal mol-1, in

good agreement with our results.

We can explain the difference in energy barriers between mechanisms by looking at how

the NN distance changes along each pathway. Along the inversion pathway, the NN distance

decreases (increases in bond order) from the trans isomer to the transition state (point 1 in Figure

10-2B) and then increases in length (decreases in bond order) as it approaches the cis isomer.

The inversion transition state shows the strongest NN bond along the pathway. The opposite

trend is seen along the rotation pathway. The NN distance increases from the trans isomer to the

rotation transition state (point 2 in Figure 10-2B) and then decreases in length as it approaches

the cis isomer. The NN distance found in the rotation transition state is approximately that of a

single bond. There is a high energy cost involved in a decrease of the NN bond order in the

rotation pathway which is seen as an increase in the energy barrier.

10.3.2 Excited State 1 (n→π*)

Potential energy surfaces and contour maps were calculated for the first two excited states

(Figure 10-3). Our surfaces are similar to those of previous calculations.187,189 Vertical

excitations from the trans minima reach the points labeled 1 and 4 while excitations from the cis

minima arrive at the points labeled 3 and 6. Points 2 and 5 depict the placement of the S1

minima.









10.3.2.1 Rotation pathway

There is essentially no energy barrier along the rotation pathway of the first excited state as

has also been reported in previous calculations. 179,180,187,189,190,193 The potential energy surface

along this pathway has only a shallow slope above the area corresponding to the trans minimum

(from points 1 to 2 and 4 to 5 in Figure 10-3B), 0.21 kcal (mol·degree)⁻¹, and a very steep slope

on the cis side (from points 3 to 2 and 6 to 5 in Figure 10-3B), 0.33 kcal (mol·degree)⁻¹. These

slopes are also shown schematically in Figure 10-4. The figures suggest that when excited from

the cis conformation there is a much faster relaxation to the excited state minimum than if

excited from the trans conformation. This phenomenon has been shown experimentally by

femtosecond transient absorption measurements.205

A conical intersection was found between the ground and first excited states. It can be

seen when the minimum of the excited state is very close in energy to the maximum barrier

height along the rotation pathway in the ground state as can be seen in Figure 10-5. We have

located our conical intersection at an NNC angle of 140.0° and a CNNC dihedral angle of 90.0°

(point 5 in Figures 10-2b and 10-4a). The location of this conical intersection is in agreement

with several other groups.187-191,193 The splitting between the surfaces is estimated to be 0.65

kcal mol⁻¹.

Stilbene, which can only isomerize via the rotation mechanism, has been found to have an

S1-S0 conical intersection along the midpoint of the rotation pathway and is also known to have

an isomerization yield of 0.5. It is interesting to find that azobenzene has a conical intersection

near the same location yet shows a very different quantum yield. This can be explained by

looking at the difference in slope on the S1 surface on either side of the conical intersection in

azobenzene. As mentioned previously, the S1 slope above the cis minimum (point 6 in Figure 10-









4) is greater than the corresponding slope on the trans side (point 4 in Figure 10-4). The crossing

probability close to the conical intersection can be related to the non-adiabatic coupling between

S0 and S1,206 written as $d_{S_1 S_0} = \langle \Psi_{S_0} | \nabla \Psi_{S_1} \rangle$. A larger slope corresponds to a larger change

in wavefunction (right side of formula). There is a greater probability, therefore, of jumping

from S1 to S0 when starting from the cis side rather than the trans side, resulting in more trans

isomers in the S0 state, because the transition carries the momentum from the excited state. In

other words, while oscillating on the S1 surface near the conical intersection, more relaxation

occurs when the wave packet moves from point 6 to point 5 than from point 4 to point 5,

depositing more population on the trans side than on the cis side of the ground state surface,

hence producing a quantum yield lower than 0.5. The slopes (cis and trans sides) on the S1

surface of stilbene are essentially equal, giving rise to more similar S0 and S1 wavefunctions than

those of azobenzene. The probability of jumping from S1 to S0 is equal when coming from either

side of the conical intersection in the case of stilbene. This results in the experimentally seen

quantum yield of 0.5.

10.3.2.2 Inversion pathway

There is a slight trans→cis energy barrier along the inversion pathway as can be seen in

Figure 10-4. The S1 trans→cis energy barrier is 9.6 kcal mol⁻¹. There is no conical intersection

between the ground and first excited state along this pathway making the inversion mechanism

highly improbable. This is in agreement with previous calculations.188,193

Our results indicate that the isomerization can easily occur through an excitation to the first

excited state, relaxation to the excited state minimum along the rotation pathway, followed by

descent to either the cis or trans conformation via the conical intersection, providing for the

known cis yield (0.20-0.36) after excitation to the first excited state.









10.3.3 Excited State 2 (π → π*)

The potential energy surfaces of the second excited state are shown in Figure 10-6. As in

the ground state surface, cis and trans minima appear on the surface of the S2 state along the

inversion and rotation pathways. The cis minima are extremely shallow. The trans→cis energy

barriers were computed in the same manner as the ground state cis→trans barriers. The

inversion barrier was found to be 30.1 kcal mol⁻¹ while that of the rotation pathway was 29.6

kcal mol⁻¹. Due to these substantial energy barriers, it is unlikely that isomerization occurs on

the S2 surface. Rapid relaxation from the S2 state to the S1 state is energetically more favorable.

This is in agreement with Kasha's rule.207 We examined energy gaps between the two states

along the inversion, rotation, and concerted inversion pathways in order to investigate this

process.

10.3.3.1 Rotation pathway

The possibility of a conical intersection between the S2 and S1 states along the rotation

pathway with an angle of 117° and a dihedral angle of 180° has been previously suggested. "s

For Azo, the states differ by 23.48 kcal mol⁻¹ at the trans minimum as can be seen in Figure 10-7A.

We do not find a conical intersection between S1 and S2 along the rotation pathway and can

therefore rule out this pathway as an isomerization mechanism.

10.3.3.2 Inversion pathway

A conical intersection between the S2 and S1 states has been previously located near the

ground state trans minima.190 While we do not find a curve crossing in this exact area, we do

see the energy difference between the S1 and S2 states become smaller along the inversion

pathway as can be seen in Figure 10-7B. This point is a few degrees away from the S2 minima.

At a CNNC dihedral angle of 180.0° and an NNC angle of 100.0°, the energy gap between the S1

and S2 surfaces appears to be the smallest, 15.70 kcal mol⁻¹. This energy gap may be small









enough to allow for rapid relaxation to the first excited state. This explains why experimentalists

see two transients, a shorter one corresponding to the S2 state before it relaxes to a longer-lived

species corresponding to the S1 state.182-184

10.3.3.3 Concerted inversion pathway

The above mechanism does not explain the difference in quantum yield that is seen upon

excitation at different wavelengths for unsubstituted azobenzene. To explain this process, we

invoke Diau's193 proposal of an additional isomerization channel (concerted inversion) that is

opened by exciting to the S2 state. The concerted-inversion pathway involves a synchronous

inversion of the NNC and CNN angles. In our calculations, the CNNC dihedral angle is fixed at

180.0°. The concerted-inversion pathway is plotted in Figure 10-7C.

As in the inversion pathway, the S1 and S2 surfaces are close in energy at an NNC angle of

100.0°. This energy gap, 5.17 kcal mol⁻¹, is significantly smaller than that of the rotation or

inversion pathway. It seems likely that rapid relaxation from the S2 to S1 state can occur due to this

small energy gap, which will again give rise to two transients as seen experimentally. A

potential problem of the concerted-inversion mechanism is the existence of an energy barrier on

the S1 surface. The energy barrier (labeled b in Figure 10-7C), measured by subtracting the

energy of the S1 minimum from the S1 energy at the S1-S0 conical intersection, is 31.21 kcal mol⁻¹.

The available energy (labeled a in Figure 10-7C), calculated by subtracting the S1 minimum energy

from the S1 energy at the S2-S1 conical intersection, is 50.43 kcal mol⁻¹. There is enough

energy available to overcome the energy barrier so the channel is open.
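
The comparison of available energy and barrier height is simple arithmetic, sketched below. The helper name `channel_is_open` is hypothetical; the two energies are the values quoted above for the concerted-inversion pathway of Azo.

```python
def channel_is_open(available_energy, barrier):
    """Compare the available energy with the S1 barrier (both in kcal/mol).

    Returns a (is_open, excess) pair, where excess is the energy left over
    after the barrier has been crossed.
    """
    excess = available_energy - barrier
    return excess > 0.0, excess


# Values quoted in the text for Azo along the concerted-inversion pathway.
available = 50.43  # arrow a: S1 energy at the S2-S1 crossing minus the S1 minimum
barrier = 31.21    # arrow b: S1 energy at the S1-S0 crossing minus the S1 minimum

is_open, excess = channel_is_open(available, barrier)
print(f"channel open: {is_open}, excess energy: {excess:.2f} kcal/mol")
```

This purely energetic criterion says nothing about how quickly the excess energy is dissipated, so it should be read as a necessary condition rather than a sufficient one.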

10.4 Summary of Unsubstituted Azobenzene

Excitation to the S1 state leads to isomerization via the rotation mechanism. Our

conclusion is based on the finding of a conical intersection between the S1 and S0 states near the

midpoint of this pathway (NNC=110°, CNNC=90.0°). The rotation pathway has also been found









to be without a significant barrier, unlike the inversion pathway. Excitation to the S2 state results

in rapid relaxation to the S1 surface via the conical intersection found at NNC=100° and

CNNC=180° along the concerted-inversion pathway. The energy gap between these surfaces is

significantly smaller than those seen in other pathways. Once on the concerted-inversion S1

surface there is an energy barrier of ~31.2 kcal mol⁻¹. Only when excitation to the S2 state occurs

is there enough energy to overcome this barrier. The conical intersection between the S1 and S0

states is located at NNC=170° and CNNC=180°. More trans isomers would be produced because

the crossing of these states is on the trans side of the potential energy curve. This is in agreement

with the experimental observation of differing quantum yields upon excitations at different

wavelengths. The concerted-inversion pathway has a nearly planar transition state in which the

NN double bond stays intact. This explains Fujino's observation that the S1 state formed after S2

excitation had an NN stretching frequency similar to that of the S0 state.'" It should also be noted

that because the S2 state relaxes to the S1 state at a geometry similar to that of both the electronic

ground state as well as the direct S1 excited state in the Franck-Condon region, the spectra of

both S1 states should be quite similar as seen in Fujino's work.208 A schematic diagram of these

mechanisms is shown in Figure 10-8.












Table 10-1. Optimized Geometries of cis and trans Isomers of Azobenzene
                 Angles/deg.              Distances/Å      Energy a/
                 ∠CNNC   ∠NNCC   ∠NNC     dNN     dCN      kcal mol⁻¹
trans            180.0     0.0   114.8    1.261   1.419     0.0
trans X-ray b    180.0     0.0   114.1    1.247   1.428
trans ED c       180.0    30.0   114.5    1.268   1.427
cis                9.8    50.3   124.1    1.250   1.436    15.2
cis X-ray d        0.0    53.3   121.9    1.253   1.449
aEnergies are relative to the trans isomer. bfrne22cReference 21deeec 0


Table 10-2. Vertical Excitation Energies (eV) of trans and cis Azobenzene.
              TDDFT a        Exp. b   CASSCF c   CIPS d
trans   S1    2.55 (0.0)     2.79     2.85       2.81
        S2    3.77 (0.77)    3.95     7.62       4.55
cis     S1    2.57 (0.04)    2.82     3.65       2.94
        S2    4.12 (0.07)    4.77     8.62       4.82
a Intensity is in parenthesis. b Reference 166, c Reference ", d Reference17























[Figure 10-1 image: molecular orbital plots labeled LUMO and HOMO]


Figure 10-1. Molecular orbitals of Azo involved in the S1←S0 and S2←S0 transitions. This
figure also represents the molecular orbitals of Azon and Azonco as they are very
similar to those of Azo.























[Figure 10-2 image: panels A and B; axis label: Dihedral Angle (Rotation Pathway)]
Figure 10-2. Ground state potential energy surface of Azo. A) Potential energy surface. B)
Contour map. Angles in degrees, energy in kcal mol⁻¹ relative to the energy of the
ground state trans isomer. In B, point 1 marks the position of the inversion transition
state while point 2 indicates the position of the rotation transition state. The cis and
trans minima are also labeled.


[Figure 10-3 image: A) potential energy surface; B) contour map with points 1 to 6 marked]


Figure 10-3. First excited state potential energy surface of Azo. A) Potential energy surface. B)
Contour map. Points 1 and 4 represent where the molecule is on the S1 surface after
excitation from the ground state trans minima whereas excitation from the ground
state cis minima will place the molecule at points 3 and 6. Points 2 and 5 represent
the S1 minima as well as mark the location of the S1/S0 conical intersection. Angles
in degrees, energy in kcal mol⁻¹, relative to the energy of the ground state trans
isomer.













[Figure 10-4 image: A) Rotation Pathway at ∠NNC = 240°, energy versus dihedral angle; B) Inversion Pathway at ∠CNNC = 180°, energy versus angle]


Figure 10-4. Schematic representation of pathways in the first excited state of Azo. A) The
rotation pathway. B) The inversion pathway. The curves in A are along the angle of
240° while those in B are along the dihedral of 180°. The labeled points are the same
as those in Figure 10-3B. The arrow in B depicts the inversion barrier in the S1 state.






[Figure 10-5 image: energy versus dihedral angle]


Figure 10-5. Conical intersection of S0 and S1 states of Azo. Angles in degrees, energy in
kcal mol⁻¹.






















[Figure 10-6 image: A) potential energy surface; B) contour map; axis label: Dihedral Angle (Rotation)]

Figure 10-6. Second excited state potential energy surface of Azo. A) Potential energy surface.
B) Contour map. Angles in degrees, energy in kcal mol⁻¹, relative to the energy of
the ground state trans isomer.














































[Figure 10-7 image: A) energy versus dihedral angle; B) and C) energy versus angle]



Figure 10-7. A) Rotation pathway along the angle of the ground state minimum of Azo,
∠NNC=110°. B) Inversion and C) concerted-inversion pathways of Azo along
∠CNNC=180.0°. S0 in blue, S1 in red, S2 in green. Angles in degrees, energy in
kcal mol-1. In C, arrow a represents the available energy while arrow b represents the
energy barrier.













[Figure 10-8 image: schematic energy curves between trans and cis; A) Rotation, after n→π* excitation; B) Concerted-Inversion, after π→π* excitation]

Figure 10-8. Scheme of the trans→cis isomerization process after A) n→π* excitation and B)
π→π* excitation. The ovals indicate locations of curve crossings.









CHAPTER 11
RESULTS: SUBSTITUTED AZOBENZENES d

11.1 Optimized Ground-State Geometry

The optimized geometries of the cis and trans isomers of the azobenzenes were found

using the same technique as for the unsubstituted azobenzene. Important bond distances, angles,

and dihedrals are summarized in Table 11-1. The values listed for Azo(NO2)NH2 are those of

the NO2-substituted ring, while the values for the NH2-substituted ring are represented by

AzoNO2(NH2).

11.1.1 NN Distance

For each azobenzene studied, the NN bond is shorter for the cis isomer than the trans

isomer. The NN distances were quite similar among the azobenzenes, ranging from 1.260 Å to

1.267 Å for the trans isomer and 1.247 Å to 1.256 Å for the cis isomer. AzoNO2NO2 has the

shortest NN distance for both conformations followed by Azo. The substituents appear to

contribute only slightly to the NN bond as evidenced by the very small increase in bond length

upon substitution of the rings with electron donating groups and a small decrease in length when

substituted with electron withdrawing groups.

11.1.2 NNC Angle, CNNC Dihedral Angle, and NNCC Dihedral Angle

Like the NN distances, the NNC angles are very similar. For the trans conformation, the

angles range from 114.1° to 115.6°, while the range for the cis isomer was from 124.0° to

125.5°. The CNNC dihedral angles of the trans isomers are all about the same, 180.0°, while the

NNCC dihedral angles were about 0.0°. The CNNC dihedral angle for the cis isomers is slightly




d Adapted with permission from Crecca, C. R., Roitberg, A. E. Theoretical Study of the
Isomerization Mechanism of Azobenzene and Disubstituted Azobenzene Derivatives, J. Phys.
Chem. A, 2006;110:8188-8203 Copyright 2006 American Chemical Society









larger in substituted azobenzenes, ranging from 9.8° for Azo to 11.8° for Azon. The NNCC

dihedral angle was smallest for AzoNO2(NH2) and largest for Azo(NO2)NH2.
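
For readers who wish to reproduce geometric parameters such as the NNC angle and the CNNC dihedral angle from Cartesian coordinates, a minimal sketch follows. The function names and the four sample coordinates are hypothetical and chosen only to give an idealized planar C-N=N-C fragment; they are not optimized geometries from this work.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _scale(a, s):
    return (a[0] * s, a[1] * s, a[2] * s)


def bond_angle(a, b, c):
    """Angle a-b-c in degrees, with atom b at the vertex."""
    v1, v2 = _sub(a, b), _sub(c, b)
    cos_theta = _dot(v1, v2) / math.sqrt(_dot(v1, v1) * _dot(v2, v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


def dihedral_angle(p0, p1, p2, p3):
    """Signed dihedral p0-p1-p2-p3 in degrees, in the range -180 to 180."""
    b0, b1, b2 = _sub(p0, p1), _sub(p2, p1), _sub(p3, p2)
    b1 = _scale(b1, 1.0 / math.sqrt(_dot(b1, b1)))
    # Remove the components along the central bond, then measure the
    # angle between what is left of the two outer bond vectors.
    v = _sub(b0, _scale(b1, _dot(b0, b1)))
    w = _sub(b2, _scale(b1, _dot(b2, b1)))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Hypothetical idealized coordinates (arbitrary units), not optimized structures.
c1, n1, n2 = (-0.5, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
c2_trans, c2_cis = (1.5, -1.0, 0.0), (1.5, 1.0, 0.0)

trans_dihedral = dihedral_angle(c1, n1, n2, c2_trans)
cis_dihedral = dihedral_angle(c1, n1, n2, c2_cis)
nnc_angle = bond_angle(n1, n2, c2_trans)
print(abs(trans_dihedral), abs(cis_dihedral), round(nnc_angle, 1))
```

The sign of the dihedral depends on the chosen convention, so only its magnitude is compared with the tabulated values.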

11.1.3 Relative Energy Differences

The difference between the cis and trans ground state energies was calculated and found to

be very similar, ranging from 14.8 kcal mol⁻¹ for AzoNO2NO2 to 16.8 kcal mol⁻¹ for Azon.

Electron donating substituents appeared to increase the relative energy difference while electron

withdrawing groups lowered the energy difference. The push-pull system showed a slight

increase in relative energy difference when compared to Azo.

11.2 Comparison of Charges

We define electron donating groups as those that activate the ortho and para positions

while electron withdrawing groups are those that activate the meta positions. Activation is

determined by change in charges relative to the unsubstituted azobenzene. Blevins and

Blanchard180 suggested the CH3CONH groups of Azonco would act as electron withdrawing

substituents. Using the CHarges from ELectrostatic Potentials (CHELPG) method to calculate

charges, however, we found that Azonco demonstrates electron donating behavior similar to that

of Azon. The charge differences were calculated by subtracting the charge on the unsubstituted

azobenzene from that of the substituted azobenzene. As can be seen in Figure 11-1, the carbons

that are ortho to the substituent, C2, C6, C11, and C13, (refer to numbering scheme in Figure 10-

2) have similar charge differences with an average of -0.226 for Azon and -0.234 for Azonco.

The para carbons, C4 and C9, are only slightly activated with average charge differences of

-0.126 for Azon and -0.117 for Azonco. The activation of the ortho carbons is enhanced by the

electron withdrawing effect of the azo group. The azo group activates the positions meta to

itself, which are the same as those ortho to the substituent. In effect, the azo group will act

synergistically with the electron donating substituents.
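
The charge-difference analysis described above amounts to a per-atom subtraction followed by an average over equivalent positions. The sketch below illustrates the bookkeeping only: the function names are hypothetical and the charges are invented placeholder numbers, not CHELPG results from this work.

```python
def charge_differences(substituted, unsubstituted):
    """Per-atom charge difference: substituted minus unsubstituted."""
    return {atom: substituted[atom] - unsubstituted[atom]
            for atom in substituted if atom in unsubstituted}


def mean_difference(differences, atoms):
    """Average charge difference over a group of atoms (e.g. ortho carbons)."""
    return sum(differences[atom] for atom in atoms) / len(atoms)


# Placeholder charges (in units of e); NOT values computed in this study.
unsubstituted_charges = {"C2": -0.10, "C4": -0.05, "C6": -0.12}
substituted_charges = {"C2": -0.33, "C4": -0.17, "C6": -0.34}

differences = charge_differences(substituted_charges, unsubstituted_charges)
ortho_mean = mean_difference(differences, ["C2", "C6"])

# Under the definition used in the text, a more negative difference marks
# a position activated by the substituent.
print({atom: round(value, 3) for atom, value in differences.items()})
print(round(ortho_mean, 3))
```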









Interesting behavior results when an electron donating substituent, NH2, is placed on one

benzene ring para to the azo group and an electron withdrawing group, NO2, is placed in the para

position of the other benzene ring. This creates a push-pull system, as in AzoNO2NH2. As seen

in Azon, the NH2 group activates the positions ortho to itself, C2 and C6, which are the same as

those positions meta to the azo group as depicted in Figure 11-1. When an electron withdrawing

group like NO2 is placed on the ring para to the azo group, there is a mixing of charges. Both

groups try to activate the positions that are meta with respect to themselves. This results

in a conflict because the meta positions of the azo group are ortho to the NO2 group.

What we see is a difference in charge of -0.125 at C14, which is meta to the NO2 group, and

-0.129 at C11, which is meta to the azo group. Therefore C11 and C14 are the activated carbons

in AzoNO2NH2. AzoNO2NO2 also shows a mixing of charges, with similar results:

C3, C6, C11, and C14 are activated.

It can now be stated with confidence that Azonco and Azon have electron donating groups,

AzoNO2NO2 has electron withdrawing groups, and Azo(NO2)NH2 is a push-pull system with

both an electron donating and an electron withdrawing group.

11.3 Electronic Excitation Energies

The singlet vertical excitations of the trans isomers of the substituted azobenzenes are very

similar to those of unsubstituted azobenzene. The first transition, n→π*, is symmetry forbidden and

therefore has a very weak oscillator strength, while the second transition, π→π*, shows some

intensity. Visual inspection is used to assign symmetry.

The excitation energies for the S1←S0 transition for all the azobenzenes were similar, as

shown in Table 11-2. The molecular orbitals (Figures 10-1 and 11-2) show again that the first

transition originates from the lone pair on the central nitrogens. Figure 10-1 can be used to










represent the molecular orbitals for Azo, Azon, and Azonco. The excitation was of nearly pure

n→π* character for all but AzoNO2NH2 and AzoNO2NO2. These systems show some additional

charge transfer to their NO2 substituents.

The second transition, π→π*, is delocalized throughout the entire molecule for all but the

push-pull system. AzoNO2NH2 shows an excitation primarily from the π orbitals of the benzene

ring with the NH2 substituent as well as from the π orbitals of the central nitrogens. As in the

n→π* transition, AzoNO2NH2 and AzoNO2NO2 both show a charge transfer to the NO2

substituents. The molecular orbitals involved in the second transition are pictured in Figures 10-

1 and 11-2.

AzoNO2NH2 exhibits an intense trans excitation with the smallest energy, 2.99 eV, while

the π→π* excitations of Azonco and Azon are particularly close in energy, 3.25 eV and 3.26 eV

respectively. AzoNO2NO2 has an excitation of 3.48 eV. The S2←S0 transition of Azo is highest

in energy, 3.77 eV, and the least intense of all the azobenzenes. It appears that adding both

electron donating and electron withdrawing substituents to Azo decreases the excitation energy

and increases the intensity of the S2←S0 transition.

We have found again that the first and second excited states at the optimized ground state

trans geometry are very close in energy. Azo shows the largest energy gap, 1.22 eV, followed by

AzoNO2NO2, 1.17 eV, and Azonco, 0.66 eV. Azon and AzoNO2NH2 have very similar energy

gaps, 0.550 and 0.546 eV respectively. The energy differences between the first two excited

states are summarized in Table 11-2.

The energy of the steady-state absorption spectroscopy maximum of Azo was 3.96 eV,180

slightly higher than the TDDFT maximum of 3.77 eV. Azon showed an experimental excitation

of 3.15 eV, while the calculated energy was 3.26 eV. Azonco showed an excitation of 3.41 eV,










slightly higher than the calculated energy of 3.25 eV. Both experimental and theoretical results

show Azo to have the highest energy transition. TDDFT predicts the excitation energies of Azon

and Azonco to be about the same, while experiment shows these energies to differ by 0.26 eV.
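
The comparison between calculated and measured S2←S0 excitation energies can be summarized numerically. The sketch below uses only the energies quoted in the preceding paragraph; the variable names are illustrative.

```python
# (TDDFT, experiment) S2<-S0 excitation energies in eV, as quoted in the text.
energies = {
    "Azo": (3.77, 3.96),
    "Azon": (3.26, 3.15),
    "Azonco": (3.25, 3.41),
}

# Signed deviation of the calculation from experiment for each molecule.
deviations = {name: calc - exp for name, (calc, exp) in energies.items()}
mean_abs_deviation = sum(abs(d) for d in deviations.values()) / len(deviations)

# Separation between Azon and Azonco according to experiment and to TDDFT.
experimental_gap = energies["Azonco"][1] - energies["Azon"][1]
calculated_gap = energies["Azon"][0] - energies["Azonco"][0]

for name, deviation in deviations.items():
    print(f"{name}: {deviation:+.2f} eV")
print(f"mean absolute deviation: {mean_abs_deviation:.2f} eV")
print(f"Azon/Azonco gap: exp {experimental_gap:.2f} eV, TDDFT {calculated_gap:.2f} eV")
```

With only three molecules, the mean absolute deviation is a rough indicator and should not be read as a benchmark of the method.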

It is also interesting to compare differences between the cis and trans excitations. Unlike

the trans excitations, the S1←S0 transition from the cis isomer shows slight intensity. For Azo

and AzoNO2NH2, the S1←S0 transition occurs at about the same energy for both trans and cis.

Azon and Azonco have trans S1←S0 excitations slightly higher in energy than cis excitations

while AzoNO2NO2 shows a higher energy cis excitation.

The S2←S0 transition from the cis isomer is much less intense and slightly higher in

energy than that of the trans isomer. There is a greater difference between the cis and trans

S2←S0 transitions than the S1←S0 transitions. The greatest difference is seen in Azonco with almost

0.5 eV separating the cis and trans excitations.

11.4 Potential Energy Surfaces

11.4.1 Ground State

Ground state three dimensional potential energy surfaces and contour maps were

calculated for each azobenzene (Figure 11-3). As mentioned previously, for the push pull

system, Azo(NO2)NH2 represents the surface with NO2 on the same side as the NNC angle being

inverted while AzoNO2(NH2) represents the surface with NH2 on the same side as the inverted

NNC angle. As can be seen in these figures, the ground state surfaces of the azobenzenes are

very similar. Cis to trans barrier heights were determined as described in Chapter 9. The energy

barriers can be found in Table 11-3. For the push-pull system, the barriers for both

Azo(NO2)NH2 and AzoNO2(NH2) were considered together.

In all five systems, the barrier along the inversion pathway was lower than that of the

rotation pathway, indicating that in the ground state, the inversion mechanism is still favored.









The unsubstituted azobenzene, Azo, was found to have an inversion barrier of 24.9 kcal mol⁻¹.

Azo(NO2)NH2 and AzoNO2NO2 have barriers lower than Azo, 17.2 kcal mol⁻¹ and 20.8 kcal

mol⁻¹ respectively. In both of these systems, the inversion angle is adjacent to a phenyl ring with

an electron withdrawing substituent. Azonco, Azon, and AzoNO2(NH2) each have an electron

donating substituent on the phenyl ring adjacent to the inversion angle and showed higher

barriers than Azo, 25.5 kcal mol⁻¹, 26.8 kcal mol⁻¹, and 28.5 kcal mol⁻¹ respectively. It is clear

from our results that substituting the benzene ring attached to the angle being inverted with an

electron donating group raises the inversion barrier height compared to the unsubstituted

azobenzene. Substituting the same ring with an electron withdrawing group lowers the barrier

height. These observations can be explained upon examination of the molecular orbitals of the

inversion transition state (Figure 11-4).

Each of the azobenzenes has an inversion transition state with an angle of 180° and a

dihedral angle of 180°. Due to the electron donating substituents on Azon and Azonco, there is

more electron density on the phenyl rings than is seen on Azo. There is therefore greater steric

hindrance between the lone pairs on the central nitrogens and p orbitals of the phenyl ring

adjacent to the 180° NNC angle. The steric effects cause the inversion transition state of Azon

and Azonco to be higher in energy than that of Azo. AzoNO2NO2, on the other hand, has

electron withdrawing substituents which accept electron density from the π orbitals of the phenyl

rings. AzoNO2NO2 is slightly stabilized by the ability of the less filled π orbitals of the phenyl

ring adjacent to the 180° NNC angle to accept electron density from the lone pair orbitals of the

central nitrogens. The lower barrier height of AzoNO2NO2 compared to Azo is due to this

stabilization.









For the push pull system, the smallest barrier appears along the inversion pathway of

Azo(NO2)NH2. This suggests that the preferred mechanism of isomerization in the ground state

of the push-pull system is the inversion of the NNC angle that is on the same side as the NO2

substituent. This is in agreement with the results of Kikuchi's209 studies of a similar push-pull

system, 4-dimethylamino-4'-nitroazobenzene. This system has the lowest inversion energy

barrier of all the azobenzenes studied. The transition state is stabilized by the vacant orbitals of

the nitro-substituted phenyl ring accepting electron density from the lone pairs on the central

nitrogens. The lone pairs are parallel to the vacant π orbitals on this phenyl ring. The lone pairs

are also perpendicular to the occupied orbitals of the amine-substituted phenyl ring, which has a

stabilizing effect as it minimizes the electron-electron repulsion. The combination of these

effects results in Azo(NO2)NH2 having the lowest inversion energy barrier.

Blevins and Blanchard looked at the ground state cis→trans back-conversion for Azo,

Azon, and Azonco using theory and experiment. They calculated barrier heights from their

experimentally measured isomerization recovery time constants. The experiments did not

indicate which pathway the barriers referred to, so we will compare them to both the inversion

and rotation cis to trans barriers. A barrier height of 21.2 kcal mol⁻¹ was measured for Azon, 23.7

kcal mol⁻¹ for Azonco, and 25.8 kcal mol⁻¹ for Azo. The experimental data indicate that adding

electron donating substituents decreases the energy barrier which conflicts with our results. This

may be due to the lack of consideration of solvent effects in our calculations. The cis

isomer and the transition state both have dipole moments and will be stabilized by a polar solvent. The

dipole moments were calculated and can be found in Table 11-4. For Azo, the cis isomer and the

inversion transition state have approximately the same dipole moment indicating they will be

equally stabilized by a polar solvent. This may explain why our calculated barrier height is









closest to the experimental value for Azo. The inversion transition states of Azon and Azonco

are more stabilized by a polar solvent than their corresponding cis isomers due to their greater

dipole moment. Stabilization of the transition state will lower the energy barrier as is seen when

comparing our calculated results with experiment. Polar solvents will have the greatest effect on

the push-pull system due to the large dipole moments that can be found in both the transition

state as well as the cis isomer.
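
The size of the disagreement with experiment can be made explicit with a few lines of arithmetic. The sketch below compares the calculated ground state inversion barriers quoted earlier in this section with the experimental barrier heights quoted above; the variable names are illustrative, and the choice of the inversion barriers (rather than the rotation barriers) for the comparison is an assumption made here for brevity.

```python
# Calculated ground state inversion barriers (kcal/mol), as quoted in the text.
calculated = {"Azo": 24.9, "Azonco": 25.5, "Azon": 26.8}

# Experimental barrier heights (kcal/mol) attributed to Blevins and Blanchard.
experimental = {"Azo": 25.8, "Azonco": 23.7, "Azon": 21.2}

# Signed difference: calculated minus experimental.
differences = {name: calculated[name] - experimental[name] for name in calculated}

# Molecule whose calculated barrier lies closest to the measured one.
closest = min(differences, key=lambda name: abs(differences[name]))

for name, diff in differences.items():
    print(f"{name}: {diff:+.1f} kcal/mol")
print(f"closest agreement: {closest}")
```

The ordering of the differences follows the ordering of the electron donating character discussed above, which is consistent with the proposed role of solvent stabilization of the transition state.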

The NN distance changes along each pathway in the substituted azobenzenes follow the

same trend seen for unsubstituted azobenzene (see Chapter 9). Along the inversion pathway, the

NN distance is smallest at the transition state. The values of the NN distances in the transition

states can be found in Table 11-5. The inversion transition state of AzoNO2NO2 has the shortest

NN distance, 1.222 Å, followed by Azo, 1.226 Å, and Azo(NO2)NH2, 1.228 Å. The inversion

transition state of Azonco was found to have an NN distance of 1.233 Å while that of Azon was

found to be 1.241 Å. AzoNO2(NH2) had the longest NN distance, 1.248 Å. The electron

donating groups can contribute electron density to the π* orbitals, thereby decreasing the bond

order and increasing the length of the NN bond compared to that of the unsubstituted

azobenzene. These distances indicate that the central nitrogens of the inversion transition state

have a double bond between them.

The opposite trend is seen along the rotation pathway: the NN distance is greatest at the

transition state and is approximately that of a single bond. These distances can also be found in

Table 11-5. Azo(NO2)NH2 has the longest NN bond distance, 1.335 Å, while AzoNO2(NH2) has

the shortest NN distance, 1.290 Å.

Potential energy surfaces and contour maps were calculated for the first two excited states.

Figure 11-5 shows these calculations for the first excited state of all the azobenzenes. The









surface graphs of all substituted azobenzenes appear to be similar to Azo (Figure 10-2a) and are

therefore not shown. Slight differences are more visible in the contour plots.

11.4.2 Excited State 1

11.4.2.1 Rotation pathway

As seen in unsubstituted azobenzene, there is essentially no energy barrier along the

rotation pathway of the first excited state. There is a shallow slope above the area corresponding

to the trans minimum and a very steep slope on the cis side. We can compare the excited state cis

and trans energy barriers and slopes (Table 11-6) to approximate relative relaxation times. A

steeper slope indicates a quicker relaxation time. We can conclude from this analysis that the

lifetime of the first excited state cis isomer is shorter than that of the trans for each of the

azobenzenes studied here. Azonco appears to have the steepest trans slope and we predict it will

exhibit the shortest S1 lifetime while AzoNO2NO2 has the least steep slope and is expected to

have the longest S1 lifetime.

A conical intersection was discovered in each azobenzene between the ground and first

excited states. The location of the conical intersection is only slightly different between the

azobenzenes. The location as well as the relative energy can be found in Table 11-7. Azonco's

conical intersection is located on the trans side of the barrier. This may indicate that Azonco will

have a lower cis~trans quantum yield.

11.4.2.2 Inversion pathway

We again see a trans→cis energy barrier along the inversion pathway (Table 11-8).

AzoNO2(NH2), Azon, AzoNO2NO2, and Azo have higher barrier heights, 11.5 kcal mol⁻¹, 11.1

kcal mol⁻¹, 10.4 kcal mol⁻¹, and 9.6 kcal mol⁻¹ respectively. Azo(NO2)NH2 and Azonco show

very small inversion barriers, 1.3 kcal mol⁻¹ and 2.3 kcal mol⁻¹, making it difficult to rule out this

pathway as a possible isomerization mechanism for these azobenzenes based on barrier height









alone. Lack of a conical intersection between the ground and first excited state along this

pathway makes the inversion mechanism highly improbable. We can conclude that substituting

the phenyl rings of azobenzene does not change the isomerization mechanism after S1 excitation.

11.4.3 Excited State 2

The potential energy surfaces of the second excited state were also generated and can be

found in Figure 11-6. Both cis and trans minima appear on this surface along the inversion and

rotation pathways of each of the azobenzenes. The trans→cis energy barriers were computed as

described previously and can be found in Table 11-9. These barrier heights are too substantial

for isomerization to occur on this surface. Rapid relaxation from the S2 to the S1 surface is again

expected. We will compare the energy gaps between the first and second excited states along the

rotation, inversion, and concerted inversion pathways.

11.4.3.1 Rotation pathway

As depicted in Figure 11-7, in general, there is a significant decrease in the energy gap

upon substitution of the benzene rings by both electron donating and electron withdrawing

groups in agreement with experimental work.210 These values can be found in Table 11-10. For

Azo, the states differ by 23.48 kcal mol⁻¹ above the trans minimum. Azo(NO2)NH2 shows the

smallest energy gap of 8.89 kcal mol⁻¹. These energy gaps are still slightly too high for

relaxation to occur along this pathway.

11.4.3.2 Inversion pathway

The energy difference between the states becomes smaller along the inversion pathway

near the trans minima as can be seen in Figure 11-8 and Table 11-10. These points are a few

degrees away from the minima of the second excited state. In general, at a dihedral angle of

180.0° and angles of 100.0°, the energy gap between the first and second excited state surfaces

appears to be the smallest. Azo and AzoNO2NO2 have the largest energy gaps, 15.70 kcal mol⁻¹









and 16.01 kcal mol⁻¹ respectively. The other azobenzenes show significantly smaller energy

gaps, under 4.67 kcal mol⁻¹, making this a very probable pathway.

11.4.3.3 Concerted-inversion pathway

This pathway is depicted in Figure 11-9. Energies of the S1 and S2 minima, conical

intersections, barrier heights, and available energy can be found in Table 11-11.

Azon, Azonco, and AzoNO2NH2: For these three azobenzenes, excitation to the S2 surface

in the Franck-Condon region leads to the S2 minimum at NNC=110.0° and

CNNC=180.0°. This is also the location of the smallest S2-S1 energy gap along this pathway: 2.79

kcal mol-1 for Azon, 6.24 kcal mol-1 for Azonco, and 3.49 kcal mol-1 for AzoNO2NH2. These

energy gaps are extremely small and would allow for rapid relaxation from the S2 surface to the

S1 surface.

As in unsubstituted azobenzene, a large energy barrier is found on the S1 surface of

each of these systems. The energy barriers were measured by subtracting the energy of the S1

minimum from the S1 energy at the S1-S0 conical intersection (arrow b in Figure 11-9). The

available energy is calculated by subtracting the S1 minimum energy from the S1 energy at the

S2-S1 conical intersection (arrow a in Figure 11-9). In each case, the available energy is less than

the energy barrier. It is highly improbable that this channel is open for Azon, Azonco, and

AzoNO2NH2. However, highly polar solvents may lower the S1 energy at the S1-S0 conical

intersection, which may lower the energy barrier enough to allow for the opening of this channel.

AzoNO2NO2: AzoNO2NO2 is quite similar to Azo. The smallest S1-S2 energy gap of 6.35

kcal mol-1 is found at NNC=100.0° and CNNC=180.0°. This energy gap is smaller than that of

the rotation and inversion pathways. The available energy was calculated to be 53.90 kcal mol-1

while the barrier was found to be 33.70 kcal mol-1. There appears to be sufficient energy to

overcome the barrier. More trans isomers would be formed, as the S1-S0 conical intersection

appears at NNC=170.0° and CNNC=180.0°.

11.5 Summary of Substituted Azobenzenes

As seen for Azo, the rotation pathway dominates the isomerization process after excitation

to the S1 surface, as evidenced by the conical intersection between the S1 and S0 states near the

midpoint of this pathway (NNC=110°, CNNC=90.0°) and the lack of a significant barrier. Azon,

Azonco, and AzoNH2NO2 use the rotation pathway after excitation to the S2 state, as represented

schematically in Figure 11-10. There is not enough available energy for these azobenzenes to

overcome the concerted-inversion barrier. It may be possible for this channel to open in very

polar solvents if the transition state is stabilized. The concerted-inversion channel is open for

AzoNO2NO2 after excitation to the S2 surface.










Table 11-1. Optimized Geometries of cis and trans Isomers of Azobenzenes
                       Angles/deg                 Distances/Å        Energy a/
Structure              ∠CNNC    ∠NNCC    ∠NNC     RNN      RNC       kcal mol-1
trans  Azo             180.0    0.0      114.8    1.261    1.419     0
       Azon            180.0    0.1      115.0    1.267    1.409     0
       Azonco          180.0    0.0      114.9    1.265    1.411     0
       Azo(NO2)NH2     179.9    0.2      114.1    1.267    1.415     0
       AzoNO2(NH2)     179.9    0.0      115.6    1.267    1.399     0
       AzoNO2NO2       179.9    0.2      114.6    1.260    1.427     0
cis    Azo             9.8      50.3     124.1    1.250    1.436     15.2
       Azon            11.8     44.1     124.6    1.256    1.430     16.8
       Azonco          11.1     46.0     124.5    1.253    1.431     16.1
       Azo(NO2)NH2     11.5     60.4     125.5    1.254    1.423     15.6
       AzoNO2(NH2)     11.5     30.4     125.0    1.254    1.419     15.6
       AzoNO2NO2       10.2     52.2     124.0    1.247    1.432     14.8
a Energies are relative to their respective trans minima.


Table 11-2. Vertical Excitation Energies in eV of trans and cis Azobenzenes.
                     S1←S0 (n→π*)                   S2←S0 (π→π*)                          S2←S0 − S1←S0
Structure            Energy   Intensity  % n→π* a   Energy          Intensity  % π→π* a   Energy Diff.
trans  Azo           2.55     0.0        88         3.77 (3.96 b)   0.77       79         1.22
       Azon          2.71     0.0        89         3.26 (3.15 b)   1.03       78         0.55
       Azonco        2.59     0.0        88         3.25 (3.41 b)   1.29       80         0.66
       AzoNO2NH2     2.44     0.0        85         2.99            0.86       80         0.55
       AzoNO2NO2     2.31     0.0        86         3.48            1.07       80         1.17

cis    Azo           2.57     0.04       78         4.12            0.07       87         1.55
       Azon          2.46     0.09       71         3.70            0.22       77         1.24
       Azonco        2.46     0.10       74         3.72            0.29       73         1.26
       AzoNO2NH2     2.46     0.11       60         3.17            0.09       40         0.71
       AzoNO2NO2     2.44     0.07       75         3.62            0.03       81         1.18

a The % n→π* and % π→π* values are calculated from the CI coefficients. b Reference 150,
experimental value.










Table 11-3. The cis → trans Energy Barriers Calculated Along the Inversion and Rotation
Pathways.
           Azo           Azon          Azonco        Azo(NO2)NH2   AzoNO2(NH2)   AzoNO2NO2
           Inv.   Rot.   Inv.   Rot.   Inv.   Rot.   Inv.   Rot.   Inv.   Rot.   Inv.   Rot.
∠NNC       180    110    180    120    180    110    180    110    180    120    180    120
∠CNNC      180    90     180    90     180    90     180    90     180    90     180    90
ΔE cis     24.9   36.2   26.8   30.5   25.5   34.2   17.2   31.6   28.5   20.8   20.8   29.2
ΔΔE a      11.3   3.7   8.7   3.6   8.4
a ΔΔE is the energy difference in kcal mol-1 between the rotation and inversion isomerization
barriers. Angles are in degrees.

Table 11-4. Dipole Moments of the Inversion Transition State and cis Isomer
                 Dipole Moment cis    Dipole Moment Inversion TS
Azo              3.22                 3.22
Azon             2.61                 4.44
Azonco           5.35                 7.39
Azo(NO2)NH2      7.53                 13.37
AzoNO2(NH2)      7.53                 8.79
AzoNO2NO2        3.66                 5.89


Table 11-5. NN Distances (Å) of Transition States Along the Rotation and Inversion Pathways.
                 Inversion    Rotation
Azo              1.226        1.303
Azon             1.241        1.308
Azonco           1.233        1.322
Azo(NO2)NH2      1.228        1.335
AzoNO2(NH2)      1.248        1.290
AzoNO2NO2        1.222        1.297


Table 11-6. Rotational Energy Barriers in the First Excited State a
                 Trans Barrier     Cis Barrier
Azo              18.5 (0.206)      29.8 (0.331)
Azon             11.6 (0.129)      28.6 (0.318)
Azonco           19.2 (0.213)      29.2 (0.324)
Azo(NO2)NH2      17.5 (0.194)      31.1 (0.346)
AzoNO2(NH2)      13.9 (0.154)      32.0 (0.356)
AzoNO2NO2        11.5 (0.128)      27.3 (0.303)

a This barrier is measured as the difference in energy between the excited state minimum and the
excited state point corresponding to the ground state trans and cis minima. Energies are in kcal
mol-1; the slope, in parentheses, is in units of kcal mol-1 degree-1.
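
The slopes in parentheses in Table 11-6 appear to equal each barrier divided by a 90° change in the CNNC dihedral, that is, the span from either planar minimum to the midpoint of the rotation pathway. The 90° span is inferred from the tabulated numbers and is not stated in the text, so the short Python check below is only a sketch under that assumption. The names `BARRIERS`, `DIHEDRAL_SPAN_DEG`, `slope`, and `SLOPES` are illustrative.

```python
# Barriers from Table 11-6 in kcal mol-1: (trans barrier, cis barrier).
BARRIERS = {
    "Azo": (18.5, 29.8),
    "Azon": (11.6, 28.6),
    "Azonco": (19.2, 29.2),
    "Azo(NO2)NH2": (17.5, 31.1),
    "AzoNO2(NH2)": (13.9, 32.0),
    "AzoNO2NO2": (11.5, 27.3),
}

# Assumption (inferred, not stated in the text): the slope is averaged
# over a 90 degree change in the CNNC dihedral angle.
DIHEDRAL_SPAN_DEG = 90.0


def slope(barrier_kcal_mol, span_deg=DIHEDRAL_SPAN_DEG):
    """Average slope of the surface in kcal mol-1 degree-1."""
    return barrier_kcal_mol / span_deg


# Slopes rounded to three decimals, as tabulated.
SLOPES = {
    name: tuple(round(slope(b), 3) for b in pair)
    for name, pair in BARRIERS.items()
}

if __name__ == "__main__":
    for name, (trans_slope, cis_slope) in SLOPES.items():
        print(f"{name:12s} trans {trans_slope:.3f}  cis {cis_slope:.3f}")
```

Under this assumption every computed slope matches the value printed in the table, which suggests the parenthesized numbers carry no information beyond the barriers themselves.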









Table 11-7. Placement and Energy of First Excited State Minimum of the Conical Intersection
                 Angles/deg          Energy a/ kcal mol-1
                 ∠NNC     ∠CNNC
Azo              140      90         46.0
Azon             130      90         47.0
Azonco           130      100        43.9
Azo(NO2)NH2      140      90         42.4
AzoNO2(NH2)      120      90         38.9
AzoNO2NO2        150      90         45.4
a Energies are relative to their respective trans ground state minimum.


Table 11-8. The trans → cis Inversion Energy Barriers in the First Excited State a
                 ΔE trans
Azo              9.6
Azon             11.1
Azonco           1.3
Azo(NO2)NH2      2.3
AzoNO2(NH2)      11.5
AzoNO2NO2        10.4
a These barriers were found by subtracting the energy of the excited state point above the ground
state trans minimum from the energy of the excited state at an angle of 180.0° and a dihedral of
180.0°. Energies are in kcal mol-1.


Table 11-9. The trans → cis Energy Barriers Calculated Along the Inversion and Rotation
Pathways on the Second Excited State Surface
           Azo           Azon          Azonco        Azo(NO2)NH2   AzoNO2(NH2)   AzoNO2NO2
           Inv.   Rot.   Inv.   Rot.   Inv.   Rot.   Inv.   Rot.   Inv.   Rot.   Inv.   Rot.
∠NNC       180    110    180    110    180    120    180    120    180    110    180    110
∠CNNC      180    90     180    90     180    90     180    90     180    90     180    90
ΔE trans   30.1   29.6   40.5   46.2   40.2   34.9   43.1   28.4   27.1   31.1   14.5   27.7
ΔΔE a      0.5   5.7   5.3   1.3   13.2
a ΔΔE is the energy difference between the rotation and inversion isomerization barriers. Angles
are in degrees. Energies are in kcal mol-1.









Table 11-10. Energy Differences between S1 and S2
                 Energy Gap at        Energy Gap at        Energy Gap at
                 A=110° D=180°        D=180° A=100°        D=180° A=100°
                 (Rotation)           (Inversion)          (Concerted-Inversion) a
Azo              26.43                15.70                5.17
Azon             22.06                0.69                 2.79
Azonco           17.12                3.56                 6.24
Azo(NO2)NH2      8.89                 4.67                 3.49
AzoNO2(NH2)      17.30                2.30                 3.49
AzoNO2NO2        22.83                16.01                6.36
a For the concerted-inversion pathway, Azo(NO2)NH2 and AzoNO2(NH2) are the same. Energies
are in kcal mol-1.


Table 11-11. Energies of the S1 and S2 Minima, Conical Intersections, Barrier Heights, and
Available Energy a
                 S2 min    S2 at S1-S2 CI     S1 min    S1 at S0-S1 CI     S1 barrier b    Available
                           (S2-S1 gap)                  (S1-S0 gap)                        Energy c
Azo              84.95     100.99 (5.17)      45.39     76.60 (1.64)       31.21           50.43
Azon             73.19     73.19 (2.79)       49.68     77.63 (6.78)       27.95           20.72
Azonco           73.27     73.27 (6.24)       46.11     81.58 (4.56)       35.47           20.92
Azo(NO2)NH2      67.72     67.72 (3.49)       41.79     71.12 (2.57)       29.33           22.44
AzoNO2NO2        79.28     98.78 (6.35)       38.53     72.23 (7.12)       33.70           53.90

a Energies are in kcal mol-1 and are relative to their respective trans minimum. The numbers in
parentheses refer to the energy gaps between the two states. b The S1 barrier is measured as the
difference between the S1 minimum energy and the S1 energy at the S0-S1 conical intersection.
c The available energy is the difference between the energy of S1 at the S2-S1 conical intersection
and the energy of the S1 minimum. If the available energy is greater than the S1 barrier, the
concerted-inversion channel can be used.
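
The criterion in the footnotes of Table 11-11 (the concerted-inversion channel is usable when the available energy exceeds the S1 barrier) can be checked directly from the tabulated energies. The Python sketch below assumes that the S1 energy at the S2-S1 conical intersection equals the tabulated S2 energy there minus the parenthesized gap; this reading is an inference, although it does reproduce the tabulated available energies. The names `TABLE_11_11`, `analyze`, and `RESULTS` are illustrative.

```python
# Values from Table 11-11, in kcal mol-1 relative to each trans minimum:
# (S2 energy at the S1-S2 CI, S2-S1 gap at that point,
#  S1 minimum, S1 energy at the S0-S1 CI)
TABLE_11_11 = {
    "Azo": (100.99, 5.17, 45.39, 76.60),
    "Azon": (73.19, 2.79, 49.68, 77.63),
    "Azonco": (73.27, 6.24, 46.11, 81.58),
    "Azo(NO2)NH2": (67.72, 3.49, 41.79, 71.12),
    "AzoNO2NO2": (98.78, 6.35, 38.53, 72.23),
}


def analyze(s2_at_ci, gap, s1_min, s1_at_s0_ci):
    """Return (S1 barrier, available energy, channel open?)."""
    # Assumption: S1 at the S2-S1 CI lies `gap` below S2 at the same point.
    s1_at_s2_ci = s2_at_ci - gap
    barrier = s1_at_s0_ci - s1_min      # arrow b in Figure 11-9
    available = s1_at_s2_ci - s1_min    # arrow a in Figure 11-9
    return round(barrier, 2), round(available, 2), available > barrier


RESULTS = {name: analyze(*values) for name, values in TABLE_11_11.items()}

if __name__ == "__main__":
    for name, (barrier, available, is_open) in RESULTS.items():
        state = "open" if is_open else "closed"
        print(f"{name:12s} barrier {barrier:6.2f}  available {available:6.2f}  {state}")
```

With these inputs the channel comes out open only for Azo and AzoNO2NO2, in line with the discussion in Section 11.4.3.3.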











Figure 11-1. Comparison of charge differences in trans isomers. A) Azon. B) Azonco. C)
AzoNO2NH2. D) AzoNO2NO2. Charge differences were calculated by subtracting
the charge on the unsubstituted azobenzene from that of the substituted azobenzene.
A negative charge difference (highlighted in bold) indicates that the position has
been activated.









Figure 11-2. Molecular orbitals involved in the S1←S0 and S2←S0 transitions. A) AzoNO2NH2.
B) AzoNO2NO2.


Figure 11-3. Contour maps of the ground state. A) Azo. B) Azon. C) Azonco. D)
Azo(NO2)NH2. E) AzoNO2(NH2). F) AzoNO2NO2. Angles in degrees, energy in
kcal mol-1. The energy range for each color is depicted in the legend.


























Figure 11-3. Continued (panels E and F). Legend values: 0, 10.8, 21.6, 32.4, 43.2, 54.0, 64.8,
75.6, 86.4.


Figure 11-4. Schematic diagram of the molecular orbitals of the inversion transition state.





















Figure 11-5. Contour maps of the first excited state. A) Azo. B) Azon. C) Azonco. D)
Azo(NO2)NH2. E) AzoNO2(NH2). F) AzoNO2NO2. Angles in degrees, energy in
kcal mol-1. The energy range for each color is depicted in the legend.





Figure 11-5. Continued









Figure 11-6. Contour maps of the second excited state. A) Azo. B) Azon. C) Azonco. D)
Azo(NO2)NH2. E) AzoNO2(NH2). F) AzoNO2NO2. Angles in degrees, energy in
kcal mol-1. The energy range is depicted in the legend.












Figure 11-6. Continued




















Figure 11-7. Rotation pathway along the angle of the ground state minimum. A) Azo. B) Azon.
C) Azonco. D) Azo(NO2)NH2. E) AzoNO2(NH2). F) AzoNO2NO2. Angles in
degrees, energy in kcal mol-1.










Figure 11-7. Continued












Figure 11-8. Inversion pathway along the dihedral of the ground state minimum. A) Azo. B)
Azon. C) Azonco. D) Azo(NO2)NH2. E) AzoNO2(NH2). F) AzoNO2NO2. Angles
in degrees, energy in kcal mol-1.









Figure 11-8. Continued











Figure 11-9. Concerted-inversion pathway along the dihedral of the ground state minimum. A)
Azo. B) Azon. C) Azonco. D) AzoNO2NH2. E) AzoNO2NO2. Angles in degrees,
energy in kcal mol-1. Only one graph is necessary for AzoNH2NO2 because the NNC
and CNN angles are being scanned synchronously. Arrow a represents the amount of
available energy while arrow b represents the energy barrier. The concerted-inversion
pathway is only open when the amount of available energy (arrow a) is
greater than the energy barrier (arrow b).











Figure 11-9. Continued


























Figure 11-10. Scheme of the trans → cis isomerization process for Azon, Azonco, and
AzoNO2NH2. After both n→π* and π→π* excitation, the rotation pathway
dominates the isomerization process.









CHAPTER 12
AZOBENZENE CONCLUSIONS

We have found that adding electron donating substituents to the benzene rings of

azobenzene raises the ground state inversion barrier height, making it harder to isomerize.

Electron withdrawing groups were found to lower the same barrier. On the potential energy

surface of the first excited state, there exists a slight trans → cis barrier along the inversion

pathway, while all other pathways are without barriers. A conical intersection between the S0

and S1 states was found for each azobenzene along the rotation pathway, making this pathway the

most likely method of isomerization. The surface of the S2 state was shown to be extremely close

in energy to the S1 state at specific points, indicating that excitation to the S2 state leads to rapid

relaxation to the S1 state. Our results indicate this relaxation occurs using the concerted-

inversion pathway for Azo and AzoNO2NO2. The concerted-inversion energy barriers were too

high for the other azobenzenes to overcome. They most likely use the conical intersection found

along the rotation pathway as their primary isomerization mechanism, regardless of excitation

wavelength.













APPENDIX
LIST OF CONSTRAINTS

Table A-1. List of distances for targets T288 and T306
       T288                              T306
No.    Atom 1   Atom 2   Distance, Å     Atom 1   Atom 2   Distance, Å

1      5        67       20.1            2        40       12
2      5        71       19.2            2        45       14.7
3      5        76       22              7        40       9.1
4      10       67       21.5            7        45       13.5
5      10       71       16              23       40       16.9
6      10       76       13.1            23       45       5.5
7      19       67       13.3            30       40       9.3
8      19       76       14.5            30       45       17.9
9      23       71       11.6            40       53       14.9
10     23       76       20.2            40       59       13.2
11     31       71       10.1            45       53       12.6
12     31       76       18              45       59       12.7
13     36       67       16.9            40       76       11.5
14     36       71       14.5            40       81       17.7
15     36       76       19              45       76       9.1
16     53       67       12              45       78       5.6
17     53       71       13.7            5        26       8.1
18     57       67       12.5            5        57       9.9
19     57       76       11.6            5        78       10.8
20     67       76       14.4            26       57       11.7
21     67       80       19.9            26       78       11.1
22     67       86       16.5            26       43       6.5
23     71       80       14.4            5        43       10.8
24     71       86       17.4            43       57       7.8
25     76       86       22.7            43       78       6.6










Table A-2. List of distances for targets T309 and T335
       T309                              T335
No.    Atom 1   Atom 2   Distance, Å     Atom 1   Atom 2   Distance, Å

1      5        35       20.7            5        14       13.7
2      5        40       13.7            5        18       19
3      10       35       12.1            5        24       23.1
4      10       40       13.8            5        29       15.9
5      18       40       17.7            5        35       16.5
6      21       35       9.6             5        40       16.6
7      21       40       15.8            9        18       13.5
8      27       35       24.6            9        24       18
9      27       40       27.9            9        29       11.9
10     34       40       10.1            9        35       14.6
11     35       46       15.6            9        40       17.7
12     35       52       19.1            14       24       12.1
13     40       46       11              14       29       9.7
14     40       52       23              14       35       13.6
15     5        30       25.4            14       40       20
16     10       30       10.9            18       24       11.5
17     18       30       14.9            18       29       13.2
18     30       35       15.3            18       35       17.7
19     30       40       19.6            18       40       24.9
20     30       46       25.2            24       29       8.6
21     30       52       18.7            24       35       15.7
22     5        21       22.9            24       40       24.4
23     21       27       16.8            29       35       9.9
24     21       46       19.5            29       40       17.3
25     21       52       12.9            35       40       8.9










Table A-3. List of distances for target T340
       T340 Set 1                        T340 Set 2
No.    Atom 1   Atom 2   Distance, Å     Atom 1   Atom 2   Distance, Å

1      6        66       22.9            6        22       18.7
2      6        73       19.6            6        31       14
3      11       66       21              6        35       15.9
4      11       69       16              6        52       9.9
5      19       66       15.3            6        55       11.9
6      19       69       11.6            6        68       17.9
7      19       73       12.6            6        81       8.1
8      23       69       11.1            22       31       5.1
9      23       73       17              22       35       11.8
10     30       66       10.4            22       52       10.1
11     30       73       14.8            22       55       14.9
12     35       66       19.2            22       68       10.3
13     35       73       16.7            22       81       14.4
14     52       66       13.8            35       52       13.8
15     52       69       11.2            35       55       17.1
16     52       73       14.4            35       68       17.2
17     55       66       13.3            35       81       13.6
18     55       73       9.8             52       55       11.2
19     66       73       10.6            52       68       9.3
20     66       80       17.4            52       81       6.8
21     66       84       20.5            55       68       7.9
22     69       80       12.9            55       81       6.5
23     69       84       17.8            68       81       10.9
24     73       80       10.7            31       81       11.3
25     73       84       20.2            31       68       10.7











Table A-4. List of distances for target T349
       T349 Set 1                        T349 Set 2
No.    Atom 1   Atom 2   Distance, Å     Atom 1   Atom 2   Distance, Å

1      32       38       10.1            32       38       10.1
2      32       43       15.7            32       43       15.7
3      32       68       9.4             32       58       11.7
4      32       72       18.5            32       78       24.3
5      32       77       22.9            32       81       19.6
6      32       81       19.6            38       43       8.5
7      32       86       21.6            38       52       12.6
8      38       43       8.5             38       58       19.5
9      38       68       11.7            38       68       11.7
10     38       72       11.4            38       78       15.1
11     38       77       14.3            43       52       13.5
12     38       81       10.5            43       58       21
13     38       86       12.6            43       72       10.9
14     43       68       16.3            43       78       15.6
15     43       72       10.9            43       81       13.3
16     43       77       14              52       68       6.4
17     43       81       13.3            52       72       11
18     43       86       16              52       78       20
19     68       77       18.2            52       81       17.6
20     68       81       17              58       68       14.5
21     68       86       22.4            58       72       24.2
22     72       81       8.9             58       81       28.7
23     72       86       16.5            68       81       17
24     77       86       14.1            72       78       9.4
25     81       86       8.4             68       72       13.2









Table A-5. List of distances for targets T348 and T353
       T348                              T353
No.    Atom 1   Atom 2   Distance, Å     Atom 1   Atom 2   Distance, Å

1      26       58       14              3        29       13.7
2      26       61       13.5            3        35       14.7
3      28       58       10.3            3        42       22.3
4      28       61       9.6             11       29       18.2
5      35       58       10.9            11       35       16.1
6      35       61       12.1            11       42       16.4
7      36       58       14.6            17       29       20
8      36       61       15.3            17       35       17.8
9      4        58       34.8            17       42       17.5
10     4        61       33.1            29       35       9.8
11     8        58       24.3            29       42       19.8
12     8        61       22.2            35       42       10.3
13     10       58       19              29       65       24
14     10       61       17.3            29       73       20.1
15     4        26       22.1            35       65       21.3
16     4        28       24.5            35       72       14.4
17     4        35       24.9            42       65       25
18     8        26       15.2            42       72       19.3
19     8        28       14.2            29       77       14.9
20     8        35       16              42       77       13.1
21     10       26       12.7            29       79       17.1
22     10       28       9.6             35       79       12.6
23     10       35       11.8            42       79       14.5
24     26       35       6               8        29       10.5
25     4        10       18.8            8        42       16










Table A-6. List of distances for targets T358 and T363
       T358                              T363
No.    Atom 1   Atom 2   Distance, Å     Atom 1   Atom 2   Distance, Å

1      14       23       14.3            4        36       11.3
2      14       27       13.2            14       45       12.7
3      14       30       11.4            20       45       15.5
4      14       38       9.9             25       38       19.6
5      14       42       18.4            25       45       15.8
6      14       48       20.3            31       36       10.6
7      14       51       15.2            31       45       10.1
8      14       61       16.9            36       64       15.6
9      14       67       18.4            36       67       18.8
10     18       27       9.3             36       80       23.9
11     18       30       11.7            36       83       27.9
12     18       38       11.7            45       62       16.1
13     18       42       15              45       66       21.1
14     18       48       18.2            45       80       27.1
15     18       51       15              36       45       12.1
16     18       61       17.3            27       38       15.6
17     18       67       15.1            27       45       11.1
18     23       27       8.7             39       45       9.8
19     23       30       16.7            39       65       16.3
20     23       38       16.6            39       80       23.5
21     23       42       13.2            39       83       28.6
22     23       48       17.7            14       41       11.1
23     23       51       16.8            31       41       9.6
24     23       61       17.5            14       67       13.5
25     23       67       10.9            20       40       19.7










Table A-7. List of distances for targets T359 and T311
       T359                              T311
No.    Atom 1   Atom 2   Distance, Å     Atom 1   Atom 2   Distance, Å

1      5        71       19.1            10       64       14.1
2      5        79       21.9            10       70       9.3
3      11       71       23.6            10       82       22.8
4      11       79       14.4            19       64       19.8
5      19       71       16.1            19       70       18.9
6      19       79       11.2            19       77       27.3
7      22       79       11.8            23       64       21.7
8      35       79       12.6            23       70       21.8
9      40       71       19.4            23       77       29.7
10     40       79       15.8            30       64       16.3
11     57       71       11.2            30       70       21.2
12     57       79       15.8            35       64       20.3
13     61       71       13.5            35       70       22.4
14     71       79       12.4            35       77       28.6
15     71       84       22.0            42       64       16.4
16     71       87       15.9            42       70       12.9
17     71       91       17.2            42       82       24.9
18     79       84       12.7            50       64       7.4
19     79       87       12.9            50       70       12.2
20     79       91       21.5            50       82       24.5
21     5        75       19.2            59       64       10.9
22     11       75       18.3            59       70       16.9
23     40       75       16.0            59       77       27.3
24     57       75       11.8            64       82       27.4
25     75       84       16.7            64       70       10.3
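
The constraint lists in this appendix are meant to be compared against the corresponding distances in each decoy. The Python sketch below shows one possible way to do that: it scores a decoy by the root-mean-square deviation between the listed distances and the distances computed from the decoy's coordinates, then ranks decoys by that score. The scoring function, the coordinate layout (a dictionary from atom index to an (x, y, z) tuple), and the toy decoys are all assumptions made for illustration; they are not the protocol used in this work. Only the three constraints are taken from Table A-1.

```python
import math

# First three T288 constraints from Table A-1:
# (atom 1, atom 2, distance in angstroms).
CONSTRAINTS = [(5, 67, 20.1), (5, 71, 19.2), (5, 76, 22.0)]


def distance_rmsd(coords, constraints):
    """RMS deviation between listed distances and those in one decoy.

    coords maps an atom index to an (x, y, z) tuple (hypothetical layout).
    """
    squared_errors = [
        (math.dist(coords[i], coords[j]) - d) ** 2 for i, j, d in constraints
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rank_decoys(decoys, constraints):
    """Return decoy names ordered from best to worst agreement."""
    return sorted(decoys, key=lambda name: distance_rmsd(decoys[name], constraints))


# Toy decoys (invented coordinates, for illustration only).
DECOYS = {
    "decoy_a": {5: (0, 0, 0), 67: (20.1, 0, 0), 71: (0, 19.2, 0), 76: (0, 0, 22.0)},
    "decoy_b": {5: (0, 0, 0), 67: (15.0, 0, 0), 71: (0, 25.0, 0), 76: (0, 0, 10.0)},
}

if __name__ == "__main__":
    for name in rank_decoys(DECOYS, CONSTRAINTS):
        print(name, round(distance_rmsd(DECOYS[name], CONSTRAINTS), 2))
```

Here `decoy_a` was built to satisfy all three constraints, so it ranks first; a real decoy set would supply alpha-carbon coordinates for every residue named in the constraint list.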









LIST OF REFERENCES


1. Todd AE, Orengo CA, Thornton JM. Evolution of Function in Protein Superfamilies,
from a Structural Perspective. Journal of Molecular Biology 2001;307(4):1113-1143.

2. Orengo CA, Todd AE, Thornton JM. From protein structure to function. Current Opinion
in Structural Biology 1999;9(3):374-382.

3. Kendrew JC, Bodo G, Dintzis HM, Parrish RG, Wyckoff H, Phillips DC. A three-
dimensional model of the myoglobin molecule obtained by x-ray analysis. Nature
(London, United Kingdom) 1958;181:662-666.

4. Perutz MF, Muirhead H, Cox JM, Goaman LCG, Mathews FS, McGandy EL, Webb LE.
Three-dimensional Fourier synthesis of horse oxyhemoglobin at 2.8 Å resolution. I.
X-ray analysis. Nature (London, United Kingdom) 1968;219(5149):29-32.

5. Geerlof A, Brown J, Coutard B, Egloff MP, Enguita FJ, Fogg MJ, Gilbert RJC, Groves
MR, Haouz A, Nettleship JE, Nordlund P, Owens RJ, Ruff M, Sainsbury S, Svergun DI,
Wilmanns M. The impact of protein characterization in structural proteomics. Acta
Crystallographica, Section D: Biological Crystallography 2006;D62(10):1125-1136.

6. Powell HR. The Rossmann Fourier autoindexing algorithm in MOSFLM. Acta
Crystallographica, Section D: Biological Crystallography 1999;D55(10):1690-1695.

7. Hauptman H. Phasing methods for protein crystallography. Current Opinion in Structural
Biology 1997;7(5):672-680.

8. Uson I, Sheldrick GM. Advances in direct methods for protein crystallography. Current
Opinion in Structural Biology 1999;9(5):643-648.

9. Taylor G. The phase problem. Acta Crystallographica, Section D: Biological
Crystallography 2003;D59(11):1881-1890.

10. Ealick SE. Advances in multiple wavelength anomalous diffraction crystallography.
Current Opinion in Chemical Biology 2000;4(5):495-499.

11. Wider G. Structure determination of biological macromolecules in solution using nuclear
magnetic resonance spectroscopy. BioTechniques 2000;29(6):1278-1280, 1282, 1284-
1290, 1292, 1294.

12. Hore PJ. Nuclear Magnetic Resonance. Compton RG, editor. New York: Oxford
University Press; 2001. 90 p.

13. Guntert P. Structure calculation of biological macromolecules from NMR data. Quarterly
Reviews of Biophysics 1998;31(2):145-237.









14. Nilges M, Clore GM, Gronenborn AM. Determination of three-dimensional structures of
proteins from interproton distance data by hybrid distance geometry-dynamical
simulated annealing calculations. FEBS Letters 1988;229(2):317-324.

15. Havel TF. An evaluation of computational strategies for use in the determination of
protein structure from distance constraints obtained by nuclear magnetic resonance.
Progress in Biophysics & Molecular Biology 1991;56(1):43-78.

16. Wagner G, Braun W, Havel TF, Schaumann T, Go N, Wuethrich K. Protein structures in
solution by nuclear magnetic resonance and distance geometry. The polypeptide fold of
the basic pancreatic trypsin inhibitor determined using two different algorithms, DISGEO
and DISMAN. Journal of Molecular Biology 1987;196(3):611-639.

17. Guntert P, Mumenthaler C, Wuthrich K. Torsion angle dynamics for NMR structure
calculation with the new program DYANA. Journal of Molecular Biology
1997;273(1):283-298.

18. Guntert P, Qian YQ, Otting G, Muller M, Gehring W, Wuthrich K. Structure
determination of the Antp (C39→S) homeodomain from nuclear magnetic resonance
data in solution using a novel strategy for the structure calculation with the programs
DIANA, CALIBA, HABAS and GLOMSA. Journal of Molecular Biology
1991;217(3):531-540.

19. Braun W. Distance geometry and related methods for protein structure determination
from NMR data. Quarterly Reviews of Biophysics 1987;19(3-4):115-157.

20. Demeter A, Fodor T, Fischer J. Stereochemical investigations on the diketopiperazine
derivatives of enalapril and lisinopril by NMR spectroscopy. Journal of Molecular
Structure 1998;471(1-3):161-174.

21. Hubbell WL, Altenbach C. Investigation of structure and dynamics in membrane proteins
using site-directed spin labeling. Current Opinion in Structural Biology 1994;4(4):566-
573.

22. Jeschke G. Determination of the nanostructure of polymer materials by electron
paramagnetic resonance spectroscopy. Macromolecular Rapid Communications
2002;23(4):227-246.

23. Schweiger A, Jeschke G. Principles of Pulse Electron Paramagnetic Resonance
Spectroscopy; 2001. 572 p.

24. Berliner LJ, Eaton GR, Eaton SS, Editors. Distance Measurements in Biological Systems
by EPR. [In: Biol. Magn. Reson., 2000; 19]; 2000. 614 p.









25. Rabenstein MD, Shin Y-K. Determination of the distance between two spin labels
attached to a macromolecule. Proceedings of the National Academy of Sciences of the
United States of America 1995;92(18):8239-8243.

26. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual Review of
Biochemistry 1978;47:819-846.

27. Dodson MS. Dimethyl suberimidate cross-linking of oligo(dT) to DNA-binding proteins.
Bioconjugate Chemistry 2000;11(6):876-879.

28. MacPhee CE, Howlett GJ, Sawyer WH. Mass Spectrometry to Characterize the Binding
of a Peptide to a Lipid Surface. Analytical Biochemistry 1999;275(1):22-29.

29. Young MM, Tang N, Hempel JC, Oshiro CM, Taylor EW, Kuntz ID, Gibson BW,
Dollinger G. High throughput protein fold identification by using experimental
constraints derived from intramolecular cross-links and mass spectrometry. Proceedings
of the National Academy of Sciences of the United States of America 2000;97(11):5802-5806.

30. Benson DA, Boguski MS, Lipman DJ, Ostell J, Ouellette BFF, Rapp BA, Wheeler DL.
GenBank. Nucleic Acids Research 1999;27(1):12-17.

31. Stoesser G, Tuli MA, Lopez R, Sterk P. The EMBL Nucleotide sequence database.
Nucleic Acids Research 1999;27(1):18-24.

32. Zhu H, Bilgin M, Snyder M. Proteomics. Annual Review of Biochemistry 2003;72:783-812.

33. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped
BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic
Acids Research 1997;25(17):3389-3402.

34. Sanchez R, Sali A. Evaluation of comparative protein structure modeling by
MODELLER-3. Proteins 1997;Suppl 1:50-58.

35. Epstein CJ, Goldberger RF, Anfinsen CB. The genetic control of tertiary protein
structure. Model systems. Cold Spring Harbor Symposia on Quantitative Biology
1963;28:439-449.

36. Chothia C, Lesk AM. The relation between the divergence of sequence and structure in
proteins. EMBO Journal 1986;5(4):823-826.

37. Sander C, Schneider R. Database of homology-derived protein structures and the
structural meaning of sequence alignment. Proteins: Structure, Function, and Genetics
1991;9(1):56-68.









38. Rost B. Twilight zone of protein sequence alignments. Protein Engineering
1999;12(2):85-94.

39. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search
tool. Journal of Molecular Biology 1990;215(3):403-410.

40. Pearson WR. Rapid and sensitive sequence comparison with FASTP and FASTA.
Methods in Enzymology 1990;183(Mol. Evol.: Comput. Analy. Protein Nucleic Acid
Sequences):63-98.

41. Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of
progressive multiple sequence alignment through sequence weighting, position-specific
gap penalties and weight matrix choice. Nucleic Acids Research 1994;22(22):4673-4680.

42. Peitsch MC, Schwede T, Guex N. Automated protein modeling the proteome in 3D.
Pharmacogenomics 2000;1(3):257-266.

43. Bates PA, Sternberg MJE. Model building by comparison at CASP3: using expert
knowledge and computer automation. Proteins: Structure, Function, and Genetics
1999(Suppl. 3):47-54.

44. Dayringer HE, Tramontano A, Sprang SR, Fletterick RJ. Interactive program for
visualization and modeling of proteins, nucleic acids and small molecules. Journal of
Molecular Graphics 1986;4(2):82-87.

45. Sali A, Blundell TL. Comparative protein modeling by satisfaction of spatial restraints.
Journal of Molecular Biology 1993;234(3):779-815.

46. Vriend G. WHAT IF: a molecular modeling and drug design program. Journal of
Molecular Graphics 1990;8(1):52-56, 29.

47. Simons KT, Bonneau R, Ruczinski I, Baker D. Ab initio protein structure prediction of
CASP III targets using ROSETTA. Proteins: Structure, Function, and Genetics
1999(Suppl. 3):171-176.

48. Fiser A, Do RKG, Sali A. Modeling of loops in protein structures. Protein Science
2000;9(9):1753-1773.

49. De Filippis V, Sander C, Vriend G. Predicting local structural changes that result from
point mutations. Protein Engineering 1994;7(10):1203-1208.

50. Stites WE, Meeker AK, Shortle D. Evidence for strained interactions between side-chains
and the polypeptide backbone. Journal of Molecular Biology 1994;235(1):27-32.

51. Dunbrack RL, Jr., Karplus M. Conformational analysis of the backbone-dependent
rotamer preferences of protein sidechains. Nature Structural Biology 1994;1(5):334-340.









52. Novotny J, Rashin AA, Bruccoleri RE. Criteria that discriminate between native proteins
and incorrectly folded models. Proteins: Structure, Function, and Genetics 1988;4(1):19-
30.

53. Brenner SE, Chothia C, Hubbard TJP, Murzin AG. Understanding protein structure:
using SCOP for fold interpretation. Methods in Enzymology 1996;266(Computer
Methods for Macromolecular Sequence Analysis):635-643.

54. Holm L, Sander C. Mapping the protein universe. Science (Washington, DC)
1996;273(5275):595-602.

55. Hubbard TJP, Murzin AG, Brenner SE, Chothia C. SCOP: a structural classification of
proteins database. Nucleic Acids Research 1997;25(1):236-239.

56. Valencia A, Kjeldgaard M, Pai EF, Sander C. GTPase domains of ras p21 oncogene
protein and elongation factor Tu: analysis of three-dimensional structures, sequence
families, and functional sites. Proceedings of the National Academy of Sciences of the
United States of America 1991;88(12):5443-5447.

57. Bourne PE, Weissig H, Editors. Structural Bioinformatics; 2003. 649 p.

58. Godzik A, Skolnick J. Sequence-structure matching in globular proteins: Application to
supersecondary and tertiary structure determination. Proceedings of the National
Academy of Sciences of the United States of America 1992;89(24):12098-12102.

59. Bryant SH, Lawrence CE. An empirical energy function for threading protein sequence
through the folding motif. Proteins: Structure, Function, and Genetics 1993;16(1):92-112.

60. Jones DT, Taylor WR, Thornton JM. A new approach to protein fold recognition. Nature
(London, United Kingdom) 1992;358(6381):86-89.

61. Anfinsen CB, Haber E, Sela M, White FH, Jr. Kinetics of formation of native
ribonuclease during oxidation of the reduced polypeptide chain. Proceedings of the
National Academy of Sciences of the United States of America 1961;47:1309-1314.

62. Anfinsen CB. Principles that govern the folding of protein chains. Science (Washington,
DC, United States) 1973;181(4096):223-230.

63. Simons KT, Kooperberg C, Huang E, Baker D. Assembly of protein tertiary structures
from fragments with similar local sequences using simulated annealing and Bayesian
scoring functions. Journal of Molecular Biology 1997;268(1):209-225.

64. Samudrala R, Xia Y, Huang E, Levitt M. Ab initio protein structure prediction using a
combined hierarchical approach. Proteins: Structure, Function, and Genetics 1999(Suppl.
3):194-198.









65. Ortiz AR, Kolinski A, Rotkiewicz P, Ilkowski B, Skolnick J. Ab initio folding of proteins
using restraints derived from evolutionary information. Proteins: Structure, Function, and
Genetics 1999(Suppl. 3):177-185.

66. Pillardy J, Czaplewski C, Liwo A, Lee J, Ripoll DR, Kazmierkiewicz R, Oldziej S,
Wedemeyer WJ, Gibson KD, Arnautova YA, Saunders J, Ye Y-J, Scheraga HA. Recent
improvements in prediction of protein structure by global optimization of a potential
energy function. Proceedings of the National Academy of Sciences of the United States
of America 2001;98(5):2329-2333.

67. Marqusee S, Robbins VH, Baldwin RL. Unusually stable helix formation in short
alanine-based peptides. Proceedings of the National Academy of Sciences of the United
States of America 1989;86(14):5286-5290.

68. Blanco FJ, Rivas G, Serrano L. A short linear peptide that folds into a native stable
β-hairpin in aqueous solution. Nature Structural Biology 1994;1(9):584-590.

69. Callihan DE, Logan TM. Conformations of Peptide Fragments from the FK506 Binding
Protein: Comparison with the Native and Urea-unfolded States. Journal of Molecular
Biology 1999;285(5):2161-2175.

70. Park BH, Levitt M. The complexity and accuracy of discrete state models of protein
structure. Journal of Molecular Biology 1995;249(2):493-507.

71. Sippl MJ, Hendlich M, Lackner P. Assembly of polypeptide and protein backbone
conformations from low energy ensembles of short fragments: Development of strategies
and construction of models for myoglobin, lysozyme, and thymosin β4. Protein Science
1992;1(5):625-640.

72. Bowie JU, Eisenberg D. An evolutionary approach to folding small α-helical proteins that
uses sequence information and an empirical guiding fitness function. Proceedings of the
National Academy of Sciences of the United States of America 1994;91(10):4436-4440.

73. Jones DT. Successful ab initio prediction of the tertiary structure of NK-lysin using
multiple sequences and recognized supersecondary structural motifs. Proteins: Structure,
Function, and Genetics 1997(Suppl. 1):185-191.

74. Sippl MJ. Knowledge-based potentials for proteins. Current Opinion in Structural
Biology 1995;5(2):229-235.

75. Koppensteiner WA, Sippl MJ. Knowledge-based potentials-back to the roots.
Biochemistry (Moscow)(Translation of Biokhimiya (Moscow)) 1998;63(3):247-252.

76. Simmerling C, Strockbine B, Roitberg AE. All-Atom Structure Prediction and Folding
Simulations of a Stable Protein. Journal of the American Chemical Society
2002;124(38):11258-11259.

77. Qiu L, Pabit SA, Roitberg AE, Hagen SJ. Smaller and faster: the 20-residue Trp-cage
protein folds in 4 μs. Journal of the American Chemical Society 2002;124(44):12952-12953.

78. Hansmann UHE, Okamoto Y. Numerical comparisons of three recently proposed
algorithms in the protein folding problem. Journal of Computational Chemistry
1997;18(7):920-933.

79. Pedersen JT, Moult J. Protein folding simulations with genetic algorithms and a detailed
molecular description. Journal of Molecular Biology 1997;269(2):240-259.

80. Park B, Levitt M. Energy functions that discriminate x-ray and near-native folds from
well-constructed decoys. Journal of Molecular Biology 1996;258(2):367-392.

81. Huang ES, Subbiah S, Tsai J, Levitt M. Using a hydrophobic contact potential to evaluate
native and near-native folds generated by molecular dynamics simulations. Journal of
Molecular Biology 1996;257(3):716-725.

82. Samudrala R, Moult J. An all-atom distance-dependent conditional probability
discriminatory function for protein structure prediction. Journal of Molecular Biology
1998;275(5):895-916.

83. Bonneau R, Strauss CEM, Rohl CA, Chivian D, Bradley P, Malmstrom L, Robertson T,
Baker D. De novo prediction of three-dimensional structures for major protein families.
Journal of Molecular Biology 2002;322(1):65-78.

84. Fetrow JS, Skolnick J. Method for prediction of protein function from sequence using the
sequence-to-structure-to-function paradigm with application to
glutaredoxins/thioredoxins and T1 ribonucleases. Journal of Molecular Biology
1998;281(5):949-968.

85. Zhang Y, DeVries ME, Skolnick J. Structure modeling of all identified G protein-coupled
receptors in the human genome. [Erratum to document cited in CA144:305279]. PLoS
Computational Biology 2006;2(3):200.

86. Bradley P, Chivian D, Meiler J, Misura KMS, Rohl CA, Schief WR, Wedemeyer WJ,
Schueler-Furman O, Murphy P, Schonbrun J, Strauss CEM, Baker D. Rosetta predictions
in CASP5: Successes, failures, and prospects for complete automation. Proteins:
Structure, Function, and Genetics 2003;53(Suppl. 6):457-468.

87. Bonneau R, Tsai J, Ruczinski I, Chivian D, Rohl C, Strauss CE, Baker D. Rosetta in
CASP4: progress in ab initio protein structure prediction. Proteins: Structure, Function,
and Genetics 2001(Suppl. 5):119-126.

88. Bowers PM, Strauss CEM, Baker D. De novo protein structure determination using
sparse NMR data. Journal of Biomolecular NMR 2000;18(4):311-318.

89. Rohl CA, Baker D. De novo determination of protein backbone structure from residual
dipolar couplings using Rosetta. Journal of the American Chemical Society
2002;124(11):2723-2729.

90. Kuhlman B, Dantas G, Ireton GC, Varani G, Stoddard BL, Baker D. Design of a Novel
Globular Protein Fold with Atomic-Level Accuracy. Science (Washington, DC, United
States) 2003;302(5649):1364-1368.

91. Kuhlman B, O'Neill JW, Kim DE, Zhang KYJ, Baker D. Accurate Computer-based
Design of a New Backbone Conformation in the Second Turn of Protein L. Journal of
Molecular Biology 2002;315(3):471-477.

92. Gray JJ, Moughon S, Wang C, Schueler-Furman O, Kuhlman B, Rohl CA, Baker D.
Protein-protein docking with simultaneous optimization of rigid-body displacement and
side-chain conformations. Journal of Molecular Biology 2003;331(1):281-299.

93. Rohl CA, Strauss CEM, Chivian D, Baker D. Modeling structurally variable regions in
homologous proteins with Rosetta. Proteins: Structure, Function, and Bioinformatics
2004;55(3):656-677.

94. Samudrala R, Levitt M. Decoys "R" Us: a database of incorrect conformations to
improve protein structure prediction. Protein Science 2000;9(7):1399-1401.

95. Wang Y, Zhang H, Li W, Scott RA. Discriminating compact nonnative structures from
the native structure of globular proteins. Proceedings of the National Academy of
Sciences of the United States of America 1995;92(3):709-713.

96. Subramaniam S, Tcheng DK, Fenton JM. A knowledge-based method for protein
structure refinement and prediction. Proceedings / International Conference on
Intelligent Systems for Molecular Biology; ISMB International Conference on Intelligent
Systems for Molecular Biology 1996;4:218-229.

97. Holm L, Sander C. Evaluation of protein models by atomic solvation preference. Journal
of Molecular Biology 1992;225(1):93-105.

98. Crippen GM. A novel approach to calculation of conformation: distance geometry.
Journal of Computational Physics 1977;24(1):96-107.

99. Havel TF, Kuntz ID, Crippen GM. The combinatorial distance geometry method for the
calculation of molecular conformation. I. A new approach to an old problem. Journal of
Theoretical Biology 1983;104(3):359-381.

100. Kuntz ID, Crippen GM, Kollman PA. Application of distance geometry to protein tertiary
structure calculations. Biopolymers 1979;18(4):939-957.

101. Collins CJ, Schilling B, Young M, Dollinger G, Guy RK. Isotopically labeled
crosslinking reagents: resolution of mass degeneracy in the identification of crosslinked
peptides. Bioorganic & Medicinal Chemistry Letters 2003;13(22):4023-4026.

102. Schilling B, Row RH, Gibson BW, Guo X, Young MM. MS2Assign, automated
assignment and nomenclature of tandem mass spectra of chemically crosslinked peptides.
Journal of the American Society for Mass Spectrometry 2003;14(8):834-850.

103. Kruppa GH, Schoeniger J, Young MM. A top down approach to protein structural studies
using chemical cross-linking and Fourier transform mass spectrometry. Rapid
Communications in Mass Spectrometry 2003;17(2):155-162.

104. Alexandrov NN, Nussinov R, Zimmer RM. Fast protein fold recognition via sequence to
structure alignment and contact capacity potentials. Pacific Symposium on Biocomputing
1996:53-72.

105. Bonneau R, Tsai J, Ruczinski I, Baker D. Functional Inferences from Blind ab Initio
Protein Structure Predictions. Journal of Structural Biology 2001;134(2 & 3):186-190.

106. Zhang Y, Hubner IA, Arakaki AK, Shakhnovich E, Skolnick J. On the origin and highly
likely completeness of single-domain protein structures. Proceedings of the National
Academy of Sciences of the United States of America 2006;103(8):2605-2610.

107. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN,
Bourne PE. The Protein Data Bank. Nucleic Acids Research 2000;28(1):235-242.

108. Levitt M. Growth of novel protein structural data. Proceedings of the National Academy
of Sciences of the United States of America 2007;104(9):3183-3188.

109. Moult J, Fidelis K, Rost B, Hubbard T, Tramontano A. Critical Assessment of methods
of protein Structure Prediction (CASP)-round 6. Proteins: Structure, Function, and
Bioinformatics 2005;61(Suppl. 7):3-7.

110. Rohl CA, Strauss CEM, Misura KMS, Baker D. Protein structure prediction using
Rosetta. Methods in Enzymology 2004;383(Numerical Computer Methods, Part D):66-93.

111. Rohl CA. Protein structure estimation from minimal restraints using Rosetta. Methods in
Enzymology 2005;394(Nuclear Magnetic Resonance of Biological Macromolecules, Part
C):244-260.

112. Chivian D, Kim DE, Malmstrom L, Schonbrun J, Rohl CA, Baker D. Prediction of
CASP6 structures using automated Robetta protocols. Proteins: Structure, Function, and
Bioinformatics 2005;61(Suppl. 7):157-166.

113. Kim DE, Chivian D, Baker D. Protein structure prediction and analysis using the Robetta
server. Nucleic Acids Research 2004;32(Web Server):W526-W531.

114. Chivian D, Kim DE, Malmstroem L, Bradley P, Robertson T, Murphy P, Strauss CEM,
Bonneau R, Rohl CA, Baker D. Automated prediction of CASP-5 structures using the
Robetta server. Proteins: Structure, Function, and Genetics 2003;53(Suppl. 6):524-533.

115. Cuff JA, Clamp ME, Siddiqui AS, Finlay M, Barton GJ. JPred: a consensus secondary
structure prediction server. Bioinformatics 1998;14(10):892-893.

116. King RD, Sternberg MJE. Identification and application of the concepts important for
accurate and reliable protein secondary structure prediction. Protein Science
1996;5(11):2298-2310.

117. Rost B, Sander C. Prediction of protein secondary structure at better than 70% accuracy.
Journal of Molecular Biology 1993;232(2):584-599.

118. Salamov AA, Solovyev VV. Prediction of protein secondary structure by combining
nearest-neighbor algorithms and multiple sequence alignments. Journal of Molecular
Biology 1995;247(1):11-15.

119. Frishman D, Argos P. Seventy-five percent accuracy in protein secondary structure
prediction. Proteins: Structure, Function, and Genetics 1997;27(3):329-335.

120. Li W, Zhang Y, Skolnick J. Application of sparse NMR restraints to large-scale protein
structure prediction. Biophysical Journal 2004;87(2):1241-1248.

121. Chen Y, Ding F, Dokholyan NV. Fidelity of the protein structure reconstruction from
inter-residue proximity constraints. Journal of Physical Chemistry B 2007;111:7432-7438.

122. Faulon J-L, Sale K, Young M. Exploring the conformational space of membrane protein
folds matching distance constraints. Protein Science 2003;12(8):1750-1761.

123. Zemla A. LGA: a method for finding 3D similarities in protein structures. Nucleic Acids
Research 2003;31(13):3370-3374.

124. Li X, Sutcliffe MJ, Schwartz TW, Dobson CM. Sequence-specific proton NMR
assignments and solution structure of bovine pancreatic polypeptide. Biochemistry
1992;31(4):1245-1253.

125. Lewis RJ, Brannigan JA, Offen WA, Smith I, Wilkinson AJ. An evolutionary link
between sporulation and prophage induction in the structure of a repressor: anti-repressor
complex. Journal of Molecular Biology 1998;283(5):907-912.

126. Leijonmarck M, Liljas A. Structure of the C-terminal domain of the ribosomal protein
L7/L12 from Escherichia coli at 1.7 Å. Journal of Molecular Biology
1987;195(3):555-580.

127. Berndt K, Guentert P, Wuethrich K. Nuclear magnetic resonance solution structure of
dendrotoxin K from the venom of Dendroaspis polylepis polylepis. Journal of Molecular
Biology 1993;234(3):735-750.

128. Keasar C, Levitt M. A novel approach to decoy set generation: designing a physical energy
function having local minima with native structure characteristics. Journal of Molecular
Biology 2003;329:159-174.

129. Bewley CA, Gustafson KR, Boyd MR, Covell DG, Bax A, Clore GM, Gronenborn AM.
Solution structure of cyanovirin-N, a potent HIV-inactivating protein. Nature Structural
Biology 1998;5(7):571-578.

130. Bewley CA. Solution structure of a cyanovirin-N:Manα1-2Manα complex: structural basis
for high-affinity carbohydrate-mediated binding to gp120. Structure (Cambridge, MA,
United States) 2001;9(10):931-940.

131. Drohat AC, Tjandra N, Baldisseri DM, Weber DJ. The use of dipolar couplings for
determining the solution structure of rat apo-S100B(ββ). Protein Science 1999;8(4):800-809.

132. Roberts SA, Weichsel A, Grass G, Thakali K, Hazzard JT, Tollin G, Rensing C, Montfort
WR. Crystal structure and electron transfer kinetics of CueO, a multicopper oxidase
required for copper homeostasis in Escherichia coli. Proceedings of the National
Academy of Sciences of the United States of America 2002;99(5):2766-2771.

133. Roberts SA, Wildner GF, Grass G, Weichsel A, Ambrus A, Rensing C, Montfort WR. A
Labile Regulatory Copper Ion Lies Near the T1 Copper Site in the Multicopper Oxidase
CueO. Journal of Biological Chemistry 2003;278(34):31958-31963.

134. Reva BA, Finkelstein AV, Skolnick J. What is the probability of a chance prediction of a
protein structure with an rmsd of 6 Å? Folding & Design 1998;3(2):141-147.

135. Raaijmakers H, Macieira S, Dias JM, Teixeira S, Bursakov S, Huber R, Moura JJG,
Moura I, Romao MJ. Gene Sequence and the 1.8 Å Crystal Structure of the
Tungsten-Containing Formate Dehydrogenase from Desulfovibrio gigas. Structure
(Cambridge, MA, United States) 2002;10(9):1261-1272.

136. Yankovskaya V, Horsefield R, Toernroth S, Luna-Chavez C, Miyoshi H, Leger C, Byrne
B, Cecchini G, Iwata S. Architecture of succinate dehydrogenase and reactive oxygen
species generation. Science (Washington, DC, United States) 2003;299(5607):700-704.

137. Leesong M, Henderson BS, Gillig JR, Schwab JM, Smith JL. Structure of a dehydratase-
isomerase from the bacterial pathway for biosynthesis of unsaturated fatty acids: two
catalytic activities in one active site. Structure (London) 1996;4(3):253-264.

138. Sharma AK, Rajashankar KR, Yadav MP, Singh TP. Structure of mare apolactoferrin: the
N and C lobes are in the closed form. Acta Crystallographica, Section D: Biological
Crystallography 1999;D55(6):1152-1157.

139. Sugahara M, Nodake Y, Sugahara M, Kunishima N. Crystal structure of dehydroquinate
synthase from Thermus thermophilus HB8 showing functional importance of the dimeric
state. Proteins: Structure, Function, and Bioinformatics 2005;58(1):249-252.

140. Ren J, Esnouf RM, Hopkins AL, Warren J, Balzarini J, Stuart DI, Stammers DK. Crystal
Structures of HIV-1 Reverse Transcriptase in Complex with Carboxanilide Derivatives.
Biochemistry 1998;37(41):14394-14403.

141. Ferguson KM, Kavran JM, Sankaran VG, Fournier E, Isakoff SJ, Skolnik EY, Lemmon
MA. Structural basis for discrimination of 3-phosphoinositides by pleckstrin homology
domains. Molecular Cell 2000;6(2):373-384.

142. Lawson CL, Zhang R, Schevitz RW, Otwinowski Z, Joachimiak A, Sigler PB. Flexibility
of the DNA-binding domains of trp repressor. Proteins: Structure, Function, and Genetics
1988;3(1):18-31.

143. Meiler J, Baker D. Rapid protein fold determination using unassigned NMR data.
Proceedings of the National Academy of Sciences of the United States of America
2003;100(26):15404-15409.

144. Ramirez BE, Voloshin ON, Camerini-Otero RD, Bax A. Solution structure of DinI
provides insight into its mode of RecA inactivation. Protein Science 2000;9(11):2161-2169.

145. Alexeev D, Bury SM, Turner MA, Ogunjobi OM, Muir TW, Ramage R, Sawyer L.
Synthetic, structural and biological studies of the ubiquitin system: chemically
synthesized and native ubiquitin fold into identical three-dimensional structures.
Biochemical Journal 1994;299(1):159-163.

146. Schumacher S, Clubb RT, Cai M, Mizuuchi K, Clore GM, Gronenborn AM. Solution
structure of the Mu end DNA-binding Ib subdomain of phage Mu transposase: modular
DNA recognition by two tethered domains. EMBO Journal 1997;16(24):7532-7541.

147. Kihara D, Lu H, Kolinski A, Skolnick J. TOUCHSTONE: an ab initio protein structure
prediction method that uses threading-based tertiary restraints. Proceedings of the
National Academy of Sciences of the United States of America 2001;98(18):10125-10130.

148. Vallely KM, Rustandi RR, Ellis KC, Varlamova O, Bresnick AR, Weber DJ. Solution
Structure of Human Mts1 (S100A4) As Determined by NMR Spectroscopy.
Biochemistry 2002;41(42):12670-12680.

149. Dempsey AC, Walsh MP, Shaw GS. Unmasking the Annexin I Interaction from the
Structure of Apo-S100A11. Structure (Cambridge, MA, United States) 2003;11(7):887-897.

150. Brodersen DE, Etzerodt M, Madsen P, Celis JE, Thogersen HC, Nyborg J, Kjeldgaard M.
EF-hands at atomic resolution: the structure of human psoriasin (S100A7) solved by
MAD phasing. Structure (London) 1998;6(4):477-489.

151. Kobayashi N, Koshiba S, Inoue M, Kigawa T, Yokoyama S. RIKEN Structural
Genomics/Proteomics Initiative (RSGI). Solution structure of mouse CGI-38 protein. To
be published.

152. Murakami S, Nakashima R, Yamashita E, Yamaguchi A. Crystal structure of bacterial
multidrug efflux transporter AcrB. Nature (London, United Kingdom)
2002;419(6907):587-593.

153. Yu EW, McDermott G, Zgurskaya HI, Nikaido H, Koshland DE, Jr. Structural Basis of
Multiple Drug-Binding Capacity of the AcrB Multidrug Efflux Pump. Science
(Washington, DC, United States) 2003;300(5621):976-980.

154. Yu EW, Aires JR, McDermott G, Nikaido H. A periplasmic drug-binding site of the
AcrB multidrug efflux pump: A crystallographic and site-directed mutagenesis study.
Journal of Bacteriology 2005;187(19):6804-6815.

155. Ronning DR, Guynet C, Ton-Hoang B, Perez ZN, Ghirlando R, Chandler M, Dyda F.
Active site sharing and subterminal hairpin recognition in a new class of DNA
transposases. Molecular Cell 2005;20(1):143-154.

156. Badger J, Sauder JM, Adams JM, Antonysamy S, Bain K, Bergseid MG, Buchanan SG,
Buchanan MD, Bativenko Y, Christopher JA, Emtage S, Eroshkina A, Feil I, Furlong EB,
Gajiwala KS, Gao X, He D, Hendle J, Huber A, Hoda K, Kearins P, Kissinger C, Laubert
B, Lewis HA, Lin J, Loomis K, Lorimer D, Louie G, Maletic M, Marsh CD, Miller I,
Molinari J, Muller-Dieckmann HJ, Newman JM, Noland BW, Pagarigan B, Park F, Peat
TS, Post KW, Radojicic S, Ramos A, Romero R, Rutter ME, Sanderson WE, Schwinn
KD, Tresser J, Winhoven J, Wright TA, Wu L, Xu J, Harris TJR. Structural analysis of a
set of proteins resulting from a bacterial genomics project. Proteins: Structure, Function,
and Bioinformatics 2005;60(4):787-796.

157. Martin ACR, Orengo CA, Hutchinson EG, Jones S, Karmirantzou M, Laskowski RA,
Mitchell JBO, Taroni C, Thornton JM. Protein folds and functions. Structure (London)
1998;6(7):875-884.

158. Russell RB, Ponting CP. Protein fold irregularities that hinder sequence analysis. Current
Opinion in Structural Biology 1998;8(3):364-371.

159. Narasimhan J, Wang M, Fu Z, Klein JM, Haas AL, Kim J-JP. Crystal Structure of the
Interferon-induced Ubiquitin-like Protein ISG15. Journal of Biological Chemistry
2005;280(29):27356-27365.

160. Milani M, Savard P-Y, Ouellet H, Ascenzi P, Guertin M, Bolognesi M. A
TyrCD1/TrpG8 hydrogen bond network and a TyrB10-TyrCD1 covalent link shape the
heme distal site of Mycobacterium tuberculosis hemoglobin O. Proceedings of the National
Academy of Sciences of the United States of America 2003;100:5766-5771.

161. Udomsinprasert R, Pongjaroenkit S, Wongsantichon J, Oakley AJ, Prapanthadara LA,
Wilce MC, Ketterman AJ. Identification, characterization and structure of a
new Delta class glutathione transferase isoenzyme. Biochemical Journal 2005;388:763-771.

162. Fotin A, Cheng Y, Grigorieff N, Walz T, Harrison SC, Kirchhausen T.
Structure of an auxilin-bound clathrin coat and its implications for the mechanism of
uncoating. Nature 2004;432:649-653.

163. Fotin A, Cheng Y, Sliz P, Grigorieff N, Harrison SC, Kirchhausen T, Walz T.
Molecular model for a complete clathrin lattice from electron cryomicroscopy. Nature
2004;432:573-579.

164. Schulze FW, Petrick HJ, Cammenga HK, Klinge H. Thermodynamic properties of the
structural analogs benzo[c]cinnoline, trans-azobenzene, and cis-azobenzene. Zeitschrift
fuer Physikalische Chemie (Muenchen, Germany) 1977;107(1):1-19.

165. Talaty ER, Fargo JC. Thermal cis-trans isomerization of substituted azobenzenes.
Correction of the literature. Chemical Communications (London) 1967(2):65-66.

166. Rau H. Azo compounds [Photochromism based on E-Z isomerization of double bonds].
Studies in Organic Chemistry (Amsterdam) 1990;40(Photochromism: Mol. Syst.):165-192.

167. Liu ZF, Hashimoto K, Fujishima A. Photoelectrochemical information storage using an
azobenzene derivative. Nature (London, United Kingdom) 1990;347(6294):658-660.

168. Hugel T, Holland NB, Cattani A, Moroder L, Seitz M, Gaub HE. Single-molecule
optomechanical cycle. Science (Washington, DC, United States) 2002;296(5570):1103-1106.

169. Ikeda T, Tsutsumi O. Optical switching and image storage by means of azobenzene
liquid-crystal films. Science (Washington, DC, United States) 1995;268(5219):1873-1875.

170. Sekkat Z, Dumont M. Photoassisted poling of azo dye doped polymeric films at room
temperature. Applied Physics B: Photophysics and Laser Chemistry 1992;B54(5):486-489.

171. Muraoka T, Kinbara K, Kobayashi Y, Aida T. Light-Driven Open-Close Motion of
Chiral Molecular Scissors. Journal of the American Chemical Society
2003;125(19):5612-5613.

172. Zhang C, Du MH, Cheng HP, Zhang XG, Roitberg AE, Krause JL. Coherent Electron
Transport through an Azobenzene Molecule: A Light-Driven Molecular Switch. Physical
Review Letters 2004;92(15):158301/158301-158301/158304.

173. Bortolus P, Monti S. Cis-trans photoisomerization of azobenzene. Solvent and triplet
donors effects. Journal of Physical Chemistry 1979;83(6):648-652.

174. Zimmerman G, Chow L-Y, Paik U-J. The photochemical isomerization of azobenzene.
Journal of the American Chemical Society 1958;80:3528-3531.

175. Rau H. Further evidence for rotation in the π,π* and inversion in the n,π*
photoisomerization of azobenzenes. Journal of Photochemistry 1984;26(2-3):221-225.

176. Rau H, Lueddecke E. On the rotation-inversion controversy on photoisomerization of
azobenzenes. Experimental proof of inversion. Journal of the American Chemical Society
1982;104(6):1616-1620.

177. Bortolus P, Monti S. cis ⇌ trans Photoisomerization of azobenzene-cyclodextrin
inclusion complexes. Journal of Physical Chemistry 1987;91(19):5046-5050.

178. Monti S, Orlandi G, Palmieri P. Features of the photochemically active state surfaces of
azobenzene. Chemical Physics 1982;71(1):87-99.

179. Cattaneo P, Persico M. An ab initio study of the photochemistry of azobenzene. Physical
Chemistry Chemical Physics 1999;1(20):4739-4743.

180. Blevins AA, Blanchard GJ. Effect of Positional Substitution on the Optical Response of
Symmetrically Disubstituted Azobenzene Derivatives. Journal of Physical Chemistry B
2004;108(16):4962-4968.

181. Andersson JA, Petterson R, Tegner L. Flash photolysis experiments in the vapor phase at
elevated temperatures. I: Spectra of azobenzene and the kinetics of its thermal cis-trans
isomerization. Journal of Photochemistry 1982;20(1):17-32.

182. Lednev IK, Ye TQ, Matousek P, Towrie M, Foggi P, Neuwahl FVR, Umapathy S, Hester
RE, Moore JN. Femtosecond time-resolved UV-visible absorption spectroscopy of trans-
azobenzene: dependence on excitation wavelength. Chemical Physics Letters
1998;290(1,2,3):68-74.

183. Lednev IK, Ye T-Q, Abbott LC, Hester RE, Moore JN. Photoisomerization of a Capped
Azobenzene in Solution Probed by Ultrafast Time-Resolved Electronic Absorption
Spectroscopy. Journal of Physical Chemistry A 1998;102(46):9161-9166.

184. Lednev IK, Ye T-Q, Hester RE, Moore JN. Femtosecond Time-Resolved UV-Visible
Absorption Spectroscopy of trans-Azobenzene in Solution. Journal of Physical Chemistry
1996;100(32):13338-13341.

185. Fujino T, Tahara T. Picosecond Time-Resolved Raman Study of trans-Azobenzene.
Journal of Physical Chemistry A 2000;104(18):4203-4210.

186. Fujino T, Arzhantsev SY, Tahara T. Femtosecond Time-Resolved Fluorescence Study of
Photoisomerization of trans-Azobenzene. Journal of Physical Chemistry A
2001;105(35):8123-8129.

187. Ishikawa T, Noro T, Shoda T. Theoretical study on the photoisomerization of
azobenzene. Journal of Chemical Physics 2001;115(16):7503-7512.

188. Quenneville J. First principles studies of cis-trans photoisomerization dynamics and
excited states in ethylene, stilbene, azobenzene, and TATB. Urbana: University of Illinois
at Urbana-Champaign; 2003.

189. Tiago ML, Ismail-Beigi S, Louie SG. Photoisomerization of azobenzene from first-
principles constrained density-functional calculations. Journal of Chemical Physics
2005;122(9):094311/094311-094311/094317.

190. Ciminelli C, Granucci G, Persico M. The photoisomerization mechanism of azobenzene:
A semiclassical simulation of nonadiabatic dynamics. Chemistry--A European Journal
2004;10(9):2327-2341.

191. Cembran A, Bernardi F, Garavelli M, Gagliardi L, Orlandi G. On the Mechanism of the
cis-trans Isomerization in the Lowest Electronic States of Azobenzene: S0, S1, and T1.
Journal of the American Chemical Society 2004;126(10):3234-3243.

192. Gagliardi L, Orlandi G, Bernardi F, Cembran A, Garavelli M. A theoretical study of the
lowest electronic states of azobenzene: The role of torsion coordinate in the cis-trans
photoisomerization. Theoretical Chemistry Accounts 2004;111(2-6):363-372.

193. Diau EW-G. A New Trans-to-Cis Photoisomerization Mechanism of Azobenzene on the
S1(n,π*) Surface. Journal of Physical Chemistry A 2004;108(6):950-956.

194. Chang C-W, Lu Y-C, Wang T-T, Diau EW-G. Photoisomerization Dynamics of
Azobenzene in Solution with S1 Excitation: A Femtosecond Fluorescence Anisotropy
Study. Journal of the American Chemical Society 2004;126(32):10109-10118.

195. Dohno C, Uno S-n, Nakatani K. Photoswitchable Molecular Glue for DNA. Journal of
the American Chemical Society 2007;129(39):11898-11899.

196. Gorostiza P, Volgraf M, Numano R, Szobota S, Trauner D, Isacoff EY. Mechanisms of
photoswitch conjugation and light activation of an ionotropic glutamate receptor.
Proceedings of the National Academy of Sciences of the United States of America
2007;104(26):10865-10870.

197. Frisch MJ, Trucks GW, Schlegel HB, Scuseria GE, Robb MA, Cheeseman JR,
Montgomery JA Jr, Vreven T, Kudin KN, Burant JC, Millam JM, Iyengar SS, Tomasi J,
Barone V, Mennucci B, Cossi M, Scalmani G, Rega N, Petersson GA, Nakatsuji H,
Hada M, Ehara M, Toyota K, Fukuda R, Hasegawa J, Ishida M, Nakajima T, Honda Y,
Kitao O, Nakai H, Klene M, Li X, Knox JE, Hratchian HP, Cross JB, Bakken V,
Adamo C, Jaramillo J, Gomperts R, Stratmann RE, Yazyev O, Austin AJ, Cammi R,
Pomelli C, Ochterski JW, Ayala PY, Morokuma K, Voth GA, Salvador P,
Dannenberg JJ, Zakrzewski VG, Dapprich S, Daniels AD, Strain MC, Farkas O,
Malick DK, Rabuck AD, Raghavachari K, Foresman JB, Ortiz JV, Cui Q, Baboul AG,
Clifford S, Cioslowski J, Stefanov BB, Liu G, Liashenko A, Piskorz P, Komaromi I,
Martin RL, Fox DJ, Keith T, Al-Laham MA, Peng CY, Nanayakkara A, Challacombe M,
Gill PMW, Johnson B, Chen W, Wong MW, Gonzalez C, Pople JA. Gaussian 03,
Revision C.02. Wallingford, CT: Gaussian, Inc.; 2004.

198. Becke AD. Density-functional thermochemistry. III. The role of exact exchange. Journal
of Chemical Physics 1993;98(7):5648-5652.

199. Hariharan PC, Pople JA. Influence of polarization functions on MO hydrogenation
energies. Theoretica Chimica Acta 1973;28(3):213-222.

200. Biswas N, Umapathy S. Density Functional Calculations of Structures, Vibrational
Frequencies, and Normal Modes of trans- and cis-Azobenzene. Journal of Physical
Chemistry A 1997;101(30):5555-5566.

201. Traetteberg M, Hilmo I, Hagen K. A gas electron diffraction study of the molecular
structure of trans-azobenzene. Journal of Molecular Structure 1977;39(2):231-239.

202. Bouwstra JA, Schouten A, Kroon J. Structural studies of the system
trans-azobenzene/trans-stilbene. I. A reinvestigation of the disorder in the crystal structure of
trans-azobenzene, C12H10N2. Acta Crystallographica, Section C: Crystal Structure
Communications 1983;C39(8):1121-1123.

203. Fliegl H, Koehn A, Haettig C, Ahlrichs R. Ab Initio Calculation of the Vibrational and
Electronic Spectra of trans- and cis-Azobenzene. Journal of the American Chemical
Society 2003;125(32):9821-9827.

204. Mostad A, Roemming C. Refinement of the crystal structure of cis-azobenzene. Acta
Chemica Scandinavica (1947-1973) 1971;25(10):3561-3568.

205. Naegele T, Hoche R, Zinth W, Wachtveitl J. Femtosecond photoisomerization of cis-
azobenzene. Chemical Physics Letters 1997;272(5,6):489-495.

206. Tully JC. Molecular dynamics with electronic transitions. Journal of Chemical Physics
1990;93(2):1061-1071.

207. Kasha M. Characterization of electronic transitions in complex molecules. Discussions of
the Faraday Society 1950;No. 9:14-19.

208. Fujino T, Arzhantsev SY, Tahara T. Femtosecond/picosecond time-resolved
spectroscopy of trans-azobenzene: isomerization mechanism following S2(ππ*) ← S0
photoexcitation. Bulletin of the Chemical Society of Japan 2002;75(5):1031-1040.

209. Kikuchi O, Azuki M, Inadomi Y, Morihashi K. Ab initio GB study of solvent effect on
the cis-trans isomerization of 4-dimethylamino-4'-nitroazobenzene. Theochem
1999;468(1-2):95-104.

210. Hirose Y, Yui H, Sawada T. Effect of Potential Energy Gap between the n-π* and the π-π*
State on Ultrafast Photoisomerization Dynamics of an Azobenzene Derivative. Journal of
Physical Chemistry A 2002;106(13):3067-3071.

BIOGRAPHICAL SKETCH

Christina was born in a small town in Northeastern Pennsylvania. In 1998, she entered
Bloomsburg University of Pennsylvania with aspirations of becoming a nurse. Because of the
excellent tutelage she received from Dr. Wayne P. Anderson in an introductory
organic/biochemistry course, she decided to change her major to chemistry. A year later, she
joined Dr. Anderson in his research efforts. Together, they studied the geometric effects on the
spectra of vanadyl complexes as well as potential aluminum catalysts to be used in olefin
polymerization. She also had the privilege of working with Dr. Anna Krylov in the summer of
2002 as a participant in the Research Experience for Undergraduates program at the University
of Southern California.

In 2003, Christina entered the graduate program at the University of Florida and
immediately began working for Dr. Adrian Roitberg. Her initial studies were focused on
determining the isomerization pathways of azobenzene, but she is now more interested in
bio-systems. During the summer of 2005, she had the opportunity to participate in the National
Science Foundation's East Asia and Pacific Summer Research Institutes program, working for
Dr. Jill Gready at the Australian National University.

PAGE 1

NEW PROTEIN STRUCTURE PREDICTION METHOD USING INTER-RESIDUE DISTANCES AND A THEORETICAL INVESTIGATION OF THE ISOMERIZATION OF AZOBENZENE AND DISUBSTITUTED AZOBENZENES

By

CHRISTINA R. CRECCA

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2008

PAGE 2

© 2008 Christina R. Crecca

PAGE 3

To my husband Chris

PAGE 4

ACKNOWLEDGMENTS

At the completion of this work, I take great pleasure in acknowledging the people who have supported me over the last few years. I gratefully thank and acknowledge my advisor, Prof. Adrian Roitberg, for his continual guidance, support, understanding, and encouragement. I would also like to thank my committee members Dr. Chang, Dr. Cao, Dr. Fanucci, and Dr. Polfer.

During my time at QTP I have made many great friends. Without their support, I do not think I would have made it. I would like to give special thanks to Andrew, Dan, Georgios, Gustavo, Hui, Joey, Josh, Julio, Kelly, Ken, Lena, Lex, Mehrnoosh, Ozlem, Seonah, Tom, and Yilin. I would also like to thank my family, especially my nieces and nephews, Gabe, Savanna, A. J., and Anna. I would also like to thank Dr. Eric Deumens for his infinite patience and understanding. I apologize to all the computers that were harmed during this work, particularly Arwen and Cobalt.

Our work was supported in part by DOE contract DE-F602-02ER45995 and a University of Florida Alumni Fellowship. Computer resources were provided by the University of Florida High Performance Computing Center as well as the Large Resource Allocations Committee through grant TG-MCA05S010.

PAGE 5

TABLE OF CONTENTS

(page)

ACKNOWLEDGMENTS ... 4
LIST OF TABLES ... 9
LIST OF FIGURES ... 11
LIST OF ABBREVIATIONS ... 15
ABSTRACT ... 16

CHAPTER

1 INTRODUCTION TO PROTEIN STRUCTURE PREDICTION METHODS ... 18
1.1 Background Information on Proteins ... 18
1.2 Experimental Methods ... 19
1.2.1 Structure Determination ... 19
1.2.1.1 X-ray crystallography ... 19
1.2.1.2 Nuclear magnetic resonance (NMR) ... 22
1.2.2 Distance Measurements ... 26
1.2.2.1 Nuclear Overhauser effects (NOE) from NMR ... 26
1.2.2.2 Electron paramagnetic resonance (EPR) ... 26
1.2.2.3 Fluorescence resonance energy transfer (FRET) ... 28
1.2.2.4 Chemical cross-linking with mass spectrometry ... 30
1.3 Methods of Structure Prediction ... 31
1.3.1 Homology Modeling ... 32
1.3.2 Fold Recognition Methods (Threading) ... 35
1.3.3 Ab Initio Methods ... 37
1.3.3.1 Rosetta ... 40
1.3.3.2 Databases to test scoring functions ... 41
1.3.4 Distance Geometry ... 42
1.3.5 Chemical Cross-Linking with MS ... 43
1.3.6 Our Method ... 44
1.4 Critical Assessment of Techniques for Protein Structure Prediction (CASP) ... 46

2 METHODS FOR PROTEIN STRUCTURE PREDICTION ... 49
2.1 Decoy Generation ... 49
2.1.1 General Decoy Set ... 49
2.1.2 Specific Decoy Set ... 50
2.2 Decoy Discrimination ... 50
2.3 Choosing Constraints ... 53
2.4 Comparing Results ... 54

PAGE 6

3 TRIALS AND ERRORS: DEVELOPING THE METHOD ... 56
3.1 Testing the Method on Previously Constructed Databases ... 56
3.1.1 Number of Structures Satisfying Specific Constraints ... 57
3.1.2 Effects of Applying Constraints in Different Orders ... 58
3.1.2.1 Randomly ordered constraints ... 58
3.1.2.2 Same constraints in different order ... 58
3.2 Developing a Search Protocol Using a Structure Known to Be in Our Database ... 59
3.3 Developing a Search Protocol Using a Structure Not in Our Database ... 60
3.3.1 Constraint Distance Acceptance Ranges: +/-2 Å and +/-4 Å ... 60
3.3.2 Calculation of All RMSDs ... 61
3.3.3 Constraint Distance Acceptance Range of +/-12 Å and +/-12 Å +/-10 Å ... 62
3.3.4 Block of Distances ... 63
3.3.5 Vary the Order of Constraint Application ... 64
3.3.6 Count the Number of Satisfied Constraints for Each Decoy ... 65
3.4 Determination of an Average RMSD Distribution ... 65
3.5 Summary of Methods ... 66

4 RESULTS: USING OUR DECOY SET TO FIND FOUR PROTEINS ... 77
4.1 Completeness of Decoy Set ... 77
4.2 Evaluation of Decoy Discrimination ... 78
4.2.1 Target 1b4c, Apo-S100β ... 78
4.2.2 Target 1ghh, DNA-Damage-Inducible Protein I (DinI) ... 78
4.2.3 Target 1ubi, Ubiquitin ... 79
4.2.4 Target 2ezk, Mu End DNA-Binding Iβ Subdomain of Phage Mu Transposase ... 80
4.2.5 Comparison of Search Process for All Target Proteins ... 80
4.3 Conclusions ... 81

5 RESULTS: USING SPECIFIC DECOY SETS TO FIND FOUR PROTEINS ... 89
5.1 Parameter Optimizations ... 89
5.1.1 Decoy Set Size ... 89
5.1.2 Constraint Distance Acceptance Range ... 90
5.1.2.1 Twelve constraints ... 90
5.1.2.2 Twenty-five constraints ... 92
5.2 Search Results ... 93
5.2.1 Target 1b4c ... 94
5.2.2 Target 1ghh ... 94
5.2.3 Target 1ubi ... 95
5.2.4 Target 2ezk ... 95
5.3 Conclusions ... 96

PAGE 7

6 RESULTS: USING GENERAL AND SPECIFIC DECOY SETS TO STUDY TWELVE CASP7 TARGETS ... 103
6.1 General Decoy Set ... 103
6.1.1 Targets That Worked ... 104
6.1.1.1 Target T288 ... 104
6.1.1.2 Target T340 ... 105
6.1.1.3 Target T359 ... 107
6.1.1.4 Target T309 ... 108
6.1.1.5 Target T335 ... 109
6.1.1.6 CASP comparisons ... 110
6.1.2 Targets That Could Have Worked But Did Not ... 110
6.1.2.1 Target T348 ... 110
6.1.2.2 Target T349 ... 112
6.1.2.3 Target T358 ... 113
6.1.2.4 CASP comparisons ... 114
6.1.3 Targets That Never Had a Chance ... 115
6.1.3.1 Target T306 ... 115
6.1.3.2 Target T311 ... 116
6.1.3.3 Target T353 ... 117
6.1.3.4 Target T363 ... 118
6.1.3.5 CASP comparisons ... 119
6.1.4 Summary of Results for General Decoy Set ... 119
6.2 Specific Decoy Sets ... 120
6.2.1 Targets That Worked ... 120
6.2.2 Targets That Did Not Work ... 123
6.2.3 Targets That Never Had a Chance ... 124
6.2.4 Summary of Results Using the Specific Decoy Set ... 125
6.3 Comparisons of Decoy Sets ... 125

7 COMPARISONS OF GENERAL AND SPECIFIC DECOY SETS ... 143
7.1 Comparing the Performance of the General and Specific Decoy Sets on Four Targets ... 143
7.2 Results for CASP7 ... 144

8 AZOBENZENE ISOMERIZATION ... 146
8.1 Isomerization Mechanism ... 146
8.2 Applications of Azobenzenes in Biomolecules ... 149

9 COMPUTATIONAL DETAILS ... 153
9.1 Ground-State Calculations ... 153
9.2 Excited-State Calculations ... 154

PAGE 8

10 RESULTS: UNSUBSTITUTED AZOBENZENE ... 155
10.1 Optimized Ground-State Geometry ... 155
10.2 Electronic Excitation Energies ... 155
10.3 Potential Energy Surfaces ... 156
10.3.1 Ground State ... 156
10.3.2 Excited State 1 (n → π*) ... 157
10.3.2.1 Rotation pathway ... 158
10.3.2.2 Inversion pathway ... 159
10.3.3 Excited State 2 (π → π*) ... 160
10.3.3.1 Rotation pathway ... 160
10.3.3.2 Inversion pathway ... 160
10.3.3.3 Concerted inversion pathway ... 161
10.4 Summary of Unsubstituted Azobenzene ... 161

11 RESULTS: SUBSTITUTED AZOBENZENES ... 170
11.1 Optimized Ground-State Geometry ... 170
11.1.1 NN Distance ... 170
11.1.2 NNC Angle, CNNC Dihedral Angle, and NNCC Dihedral Angle ... 170
11.1.3 Relative Energy Differences ... 171
11.2 Comparison of Charges ... 171
11.3 Electronic Excitation Energies ... 172
11.4 Potential Energy Surfaces ... 174
11.4.1 Ground State ... 174
11.4.2 Excited State 1 ... 178
11.4.2.1 Rotation pathway ... 178
11.4.2.2 Inversion pathway ... 178
11.4.3 Excited State 2 ... 179
11.4.3.1 Rotation pathway ... 179
11.4.3.2 Inversion pathway ... 179
11.4.3.3 Concerted-inversion pathway ... 180
11.5 Summary of Substituted Azobenzenes ... 181

12 AZOBENZENE CONCLUSIONS ... 201

APPENDIX

A LIST OF CONSTRAINTS ... 202

LIST OF REFERENCES ... 209
BIOGRAPHICAL SKETCH ... 227

PAGE 9

LIST OF TABLES

Table (page)

3-1 Comparison of input for the four target proteins ... 67
3-2 RMSDs for decoys satisfying the most constraints ... 67
3-3 Lowest RMSD decoys in database using 1b4c as a reference ... 67
3-4 Decoys remaining after 32 constraints using the block method ... 68
3-5 Lowest RMSD decoys found in varying the order of constraint application ... 68
3-6 Lowest RMSD decoys found using the count method for both trials ... 68
4-1 Number of decoys with RMSDs under each threshold ... 83
4-2 Summary of results ... 83
5-1 The RMSD ranges ... 97
5-2 Comparison of scores for each protein with different acceptance ranges ... 97
6-1 Results for 12 targets ... 127
6-2 JPred predictions compared to target structures ... 128
6-3 Results for each of the 12 targets ... 129
6-4 Comparison of results for each target using both types of decoy sets ... 129
10-1 Optimized geometries of cis and trans isomers of azobenzene ... 163
10-2 Vertical excitation energies (eV) of trans and cis azobenzene ... 163
11-1 Optimized geometries of cis and trans isomers of azobenzenes ... 182
11-2 Vertical excitation energies in eV of trans and cis azobenzenes ... 182
11-3 Cis → trans energy barriers calculated along the inversion and rotation pathways ... 183
11-4 Dipole moments of the inversion transition state and cis isomer ... 183
11-5 Distances of transition states along the rotation and inversion pathways ... 183
11-6 Rotational energy barriers in the first excited state ... 183
11-7 Placement and energy of first excited state minimum of the conical intersection ... 184

PAGE 10

11-8 Trans → cis inversion energy barriers in the first excited state ... 184
11-9 Trans → cis energy barriers calculated along the inversion and rotation pathways on the second excited state surface ... 184
11-10 Energy differences between S1 and S2 ... 185
11-11 Energies of the S1 and S2 minima, conical intersections, barrier heights, and available energy ... 185
A-1 List of distances for targets T288 and T306 ... 202
A-2 List of distances for targets T309 and T335 ... 203
A-3 List of distances for target T340 ... 204
A-4 List of distances for target T349 ... 205
A-5 List of distances for targets T348 and T353 ... 206
A-6 List of distances for targets T358 and T363 ... 207
A-7 List of distances for targets T359 and T311 ... 208

PAGE 11

LIST OF FIGURES

Figure (page)

1-1 Diagram of an amino acid (alanine) ... 48
1-2 Organization of protein structure ... 48
2-1 How decoys are generated from a single protein ... 55
3-1 Results of counting the number of decoys that satisfy each constraint ... 69
3-2 Application of randomly ordered constraints for 1bba ... 70
3-3 Results using the same set of constraints in different orders ... 71
3-4 Superimposed images of the results of the 2ezm search ... 72
3-5 Results from Trial 1 ... 72
3-6 Target protein and the final four remaining decoys after 13 constraints with a +/-4 Å distance range ... 72
3-7 Histogram of RMSDs for all decoys in the database using 1b4c as a reference ... 73
3-8 Decoys with the lowest RMSDs in database using 1b4c as a reference ... 73
3-9 How an insertion in a loop region can affect the search process ... 74
3-10 Decoy 1mka-49 ... 74
3-11 Number of decoys vs. the number of constraints each decoy satisfies for both trials ... 75
3-12 Five decoys used to determine a random average RMSD for our decoy database ... 75
3-13 Histograms of RMSDs for five randomly chosen decoys, 1b7u, 1fxh, 1rt6, 1ujn, 2wrp ... 76
4-1 Histograms of RMSDs for all studied proteins, 1ghh, 1ubi, 2ezk, and 1b4c ... 84
4-2 Target 1b4c and top scoring decoys ... 84
4-3 Target 1ghh and top scoring decoys ... 85
4-4 Target 1ubi and top scoring decoys ... 85
4-5 Target 2ezk and top scoring decoys ... 86
4-6 Analysis of the scoring procedure ... 87

PAGE 12

4-7 Relationship between RMSD and score ... 88
5-1 Distribution of RMSDs for all four target proteins for the 10,000 decoy sets ... 98
5-2 Lowest RMSD structures in the 10,000 decoy set ... 98
5-3 The number of structures remaining vs. score for each protein ... 99
5-4 Correlation between score and RMSD ... 100
5-5 Average RMSD for each protein at different scores ... 101
5-6 Top scoring decoys for 1b4c ... 101
5-7 Representation of the β-sheet orientation for the native structure of target protein 1ghh and the top scoring decoys ... 102
5-8 Top scoring decoy, #3631, for 1ubi with a high RMSD, 12.6 Å ... 102
5-9 Top scoring decoys for 2ezk ... 102
6-1 Distribution of RMSDs for each target protein ... 130
6-2 Target T288 and the top scoring decoys for T288 ... 130
6-3 Target T340 and some of the top scoring decoys ... 131
6-4 Target T359 and its top scoring decoys ... 131
6-5 Target T309 and its top scoring decoys ... 132
6-6 Target T335 and its top scoring decoys ... 132
6-7 Use of Global Distance Test (GDT) analysis for Targets T288, T340, T359, T309, and T335 ... 133
6-8 Target T348, lowest RMSD decoys in the database, and the top scoring decoys ... 134
6-9 Target T349, lowest RMSD decoys in the database, and the top scoring decoys ... 134
6-10 Target T358, lowest RMSD decoys in the database, and the top scoring decoys ... 134
6-11 Use of Global Distance Test (GDT) analysis for Targets T348, T349, and T358 ... 135
6-12 Target T306, best decoy in database, and top scoring decoys ... 136
6-13 Target T311, best decoy in database, and top scoring decoy ... 136
6-14 Target T353, best decoy in database, and top scoring decoys ... 136

PAGE 13

6-15 Target T363, best decoy in database, and a top scoring decoy ... 137
6-16 Use of Global Distance Test (GDT) analysis for Targets T306, T311, T353, and T363 ... 137
6-17 Histogram of Cα RMSDs for all twelve CASP targets ... 138
6-18 Top scoring decoys for targets that worked ... 139
6-19 Results for T288 ... 140
6-20 Results for T348 ... 140
6-21 Results for T359 ... 140
6-22 Results for T363 ... 141
6-23 Results for T340 ... 141
6-24 Results for T353 ... 141
6-25 Results for T306 ... 142
6-26 Results for T309 ... 142
8-1 Diagram of the rotation and inversion pathways of the trans → cis isomerization of azobenzenes ... 151
8-2 Structures of compounds investigated in this work ... 152
10-1 Molecular orbitals of Azo involved in the S1 ← S0 and S2 ← S0 transitions ... 164
10-2 Ground state potential energy surface of Azo ... 165
10-3 First excited state potential energy surface of Azo ... 165
10-4 Diagram of pathways in the first excited state of Azo ... 166
10-5 Conical intersection of S0 and S1 states of Azo ... 166
10-6 Second excited state potential energy surface of Azo ... 167
10-7 Rotation, inversion, and concerted-inversion pathways of Azo ... 168
10-8 Scheme of the trans → cis isomerization process after n → π* excitation and π → π* excitation ... 169
11-1 Comparison of charge differences in trans isomers of the substituted azobenzenes ... 186

PAGE 14

11-2 Molecular orbitals involved in the S1 ← S0 and S2 ← S0 transitions for AzoNO2NH2 and AzoNO2NO2 ... 187
11-3 Contour maps of the ground state of Azo and substituted azobenzenes ... 188
11-4 Schematic diagram of the molecular orbitals of the inversion transition state ... 189
11-5 Contour maps of the first excited state of Azo and substituted azobenzenes ... 190
11-6 Contour maps of the second excited state of Azo and substituted azobenzenes ... 192
11-7 Rotation pathway along the angle of the ground state minimum of Azo and substituted azobenzenes ... 194
11-8 Inversion pathway along the dihedral of the ground state minimum of Azo and substituted azobenzenes ... 196
11-9 Concerted-inversion pathway along the dihedral of the ground state minimum of Azo and substituted azobenzenes ... 198
11-10 Scheme of the trans → cis isomerization process for Azon, Azonco, and AzoNO2NH2 ... 200

PAGE 15

LIST OF ABBREVIATIONS

Azo          Unsubstituted azobenzene
Azon         4,4'-diaminoazobenzene
Azonco       N-[4-(4-(Acetylamino)phenylazo)phenyl]-acetamide
AzoNO2NH2    4,4'-nitro-aminoazobenzene
AzoNO2NO2    4,4'-dinitroazobenzene
CASP         Critical Assessment of Techniques for Protein Structure Prediction
CASSCF       Complete active space self-consistent field
CHelpG       Charges from electrostatic potentials using a grid-based method
DFT          Density functional theory
EPR          Electron paramagnetic resonance
FRET         Fluorescence resonance energy transfer
GDT          Global distance test
LCS          Longest continuous segment
LCS-5        LCS under 5 Å
LGA          Local-global alignment
MS           Mass spectrometry
NOEs         Nuclear Overhauser effects
NMR          Nuclear magnetic resonance
PDB          Protein Data Bank
RDC          Residual dipolar couplings
RMSD         Root mean square deviation
SDSL         Site-directed spin labeling
TDDFT        Time-dependent density functional theory

PAGE 16

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

NEW PROTEIN STRUCTURE PREDICTION METHOD USING INTER-RESIDUE DISTANCES AND A THEORETICAL INVESTIGATION OF THE ISOMERIZATION OF AZOBENZENE AND DISUBSTITUTED AZOBENZENES

By

Christina Crecca

May 2008

Chair: Adrian Roitberg
Major: Chemistry

It is often claimed that knowing a protein's structure is important in understanding its function. The experimental structure determination methods presently available can be costly and time-consuming. This dissertation presents an idea for a fast and inexpensive protein structure prediction method that combines modeling with a minimal set of experimental data. Our method involves three steps: (1) building a decoy set (a set of protein-like structures), (2) measuring inter-residue distances, and (3) comparing the measured distances with those calculated in each decoy. We postulate that structures with a small number of similar inter-residue distances will also have similar three-dimensional structure. We further hypothesize that the minimum number of distances needed to determine structure is much less than the total number of inter-residue distances in the protein.

To develop our protocol, we searched the decoy set for target proteins whose structures have been solved experimentally but have not been explicitly included in our decoy set. We simulated experimental data by calculating α-carbon distances from the experimentally determined structures of our target proteins. We have created a large, generalized decoy set using most of the structures in the Protein Data Bank. This decoy set can be used to study any protein composed of 100 residues or fewer.


Using this decoy set, we attempted to predict structures for several proteins. We also analyzed the RMSD distributions of the decoys using the search proteins as references and found the distributions to be similar for each protein. Of the nearly five thousand Cα-Cα distances in a 100-residue protein, knowledge of only twenty-five selected distances will usually result in predicting a reliable model.

In the second part of our study, results are presented for a series of azobenzenes, which were studied using ab initio methods to determine the substituent effects on the isomerization pathways. Energy barriers were determined from three-dimensional potential energy surfaces of the ground and electronically excited states. In the ground state (S0), the inversion pathway was found to be preferred. Results show that electron-donating substituents increase the isomerization barrier along the inversion pathway, while electron-withdrawing substituents decrease it. The inversion pathway of the first excited state (S1) showed trans→cis barriers with no curve crossing between S0 and S1. In contrast, a conical intersection was found between the ground and first excited states along the rotation pathway for each of the azobenzenes studied. No barriers were found in this pathway, and we therefore postulate that after n→π* (S1←S0) excitation, the rotation mechanism dominates. Upon π→π* (S2←S0) excitation, there may be sufficient energy to open an additional pathway (concerted inversion) as proposed by Diau. This pathway is only accessible for unsubstituted azobenzene and 4,4'-dinitroazobenzene. Because the S0 and S1 curves cross on the trans side, the concerted-inversion channel explains the experimentally observed difference in trans-to-cis quantum yields between S1 and S2 excitations. The concerted-inversion channel is not available to the remaining azobenzenes, and so they must employ the rotation pathway for both n→π* and π→π* excitations.


CHAPTER 1
INTRODUCTION TO PROTEIN STRUCTURE PREDICTION METHODS

We begin with a brief introduction to protein studies, including general information on protein structure and methods of structure determination. We also discuss experimental methods like Electron Paramagnetic Resonance (EPR), Fluorescence Resonance Energy Transfer (FRET), Nuclear Overhauser Effects (NOE) from Nuclear Magnetic Resonance (NMR), and chemical cross-linking with mass spectrometry, all of which can be used to measure distances in proteins. Various methods of protein structure prediction are presented, followed by a summary of our proposed method.

1.1 Background Information on Proteins

The building blocks of proteins are the twenty naturally occurring α-amino acids. Each amino acid residue has the same fundamental structure (Figure 1-1), containing a carboxyl group, an amino group, and an α-carbon with an R group attached. Amino acids differ in their R substituent. Linking the carboxyl and amino groups of adjacent residues forms a peptide bond, thereby joining amino acids in a linear fashion. However, some proteins containing cysteine residues can form disulfide bonds, which result in the cross-linking (covalent bonding) of nonadjacent residues.

The amino acid sequence of a protein is encoded by the DNA sequence of a gene and is often referred to as the protein's primary structure (Figure 1-2A). The secondary structure is composed of regularly repeating local conformations, generally stabilized by hydrogen bonds. The most commonly seen secondary structural elements are the α-helix and the β-sheet (Figure 1-2B). A single protein may have many regions of differing secondary structure, and how these regions relate to one another is described by the tertiary structure (Figure 1-2C). Thus, tertiary structure is the overall shape of a single-chain protein. Stabilizing this structure are many non-local interactions. For example, to minimize their exposure to water, hydrophobic residues retreat to the protein's core. Salt bridges, hydrogen bonds, and disulfide bonds are also formed to help stabilize the structure. Proteins composed of two or more polypeptide chains may exhibit quaternary structure, which refers to the spatial arrangement of these chains (Figure 1-2D).

1.2 Experimental Methods

1.2.1 Structure Determination

To understand the function of a protein on a molecular level, it helps to have some knowledge of its structure.1,2 In the late 1950s and early 1960s, the first protein structures were determined via X-ray crystallography. The structure of myoglobin was solved by Sir John Cowdery Kendrew,3 while Max Perutz gave us the structure of hemoglobin.4 So important were their discoveries that Perutz and Kendrew were awarded the Nobel Prize in 1962.

1.2.1.1 X-ray crystallography

There are three steps in determining the structure of a protein from X-ray crystallography. First, a suitable crystal must be grown, which is often the most difficult and time-consuming step.5 The quality of the crystal is hard to assess until the diffraction pattern is obtained, but ideally it will be pure, free from imperfections, regular in structure, and suitable in size.

Next, a beam of X-rays is fired at the crystal, thereby producing a regular array of reflections (a pattern of spots) that can be seen on a screen behind the crystal. One of the advantages of using crystals is that they are periodic and therefore composed of many repeating unit cells. Constructive interference due to this periodicity serves to amplify the weak scattering of the individual unit cells into a more powerful, coherent reflection. The relative intensities of the reflections are important in determining the arrangement of the molecules in the crystal. They can be recorded using an area detector, charge-coupled device image sensor, or photographic film.
After the intensity of each spot is recorded, the crystal is rotated slightly to produce another set of reflections, whose intensities are also recorded. One image of spots does not provide enough information to determine the molecular structure of the crystal because it represents only a small piece of the full Fourier transform. Therefore, this process is continued for more than a 180° rotation of the crystal. One should also change the axis of rotation at least once to avoid a blind spot in reciprocal space near the rotation axis. Often, several sets of diffraction patterns are collected.

Finally, the data can be analyzed. The reflections from all the recordings must be indexed by identifying the dimensions of the unit cell. An autoindexing algorithm6 is usually employed to determine which image peak corresponds to which position in reciprocal space. All the images of all the reflections are converted into a single file containing the Miller index of each reflection along with its intensity. Hundreds of separate images of the crystal are taken at different orientations, so many symmetry-related reflections are recorded multiple times. One must find which peaks appear in two or more images.

Because the diffraction data are a reciprocal space representation of the crystal lattice, the location of each spot is determined by the size and shape of the unit cell as well as the symmetry in the crystal. The intensity of each spot is proportional to the square of the structure factor amplitude. The structure factor is a complex number that contains information relating to the amplitude and phase of a wave. Both the amplitude and phase must be known in order to generate an electron density map, which is then used to build a starting model of the protein structure. A potential problem in X-ray crystallography is that the phase cannot be directly recorded during the experiment. The phase problem can be overcome in several ways.
(1) Ab initio phasing (direct methods)7,8 is often applied to small proteins by exploiting known phase relationships between specific groups of reflections to determine the needed phase information.
(2) In molecular replacement,9 the structure of a related protein can be used as a model to determine the orientation and position of the molecules in the unit cell. Phase information is then obtained and used to build an electron density map.
(3) Anomalous scattering (multiwavelength anomalous dispersion, MAD)10 involves incorporating anomalously scattering atoms like selenium into the protein; the scattering is changed in a known way. The position of the anomalously diffracting atoms can be found easily, thereby providing their initial phases.
(4) Similar to MAD phasing, heavy atoms can be incorporated into the protein. The changes in the scattering amplitudes can be used to determine the phases. MAD phasing with selenomethionine is now more commonly used than heavy atom replacement.10

Initial models can be built after the initial phases have been obtained. The models are then used to refine the phases, which are used to further improve the models. The B-factor is an estimate of the thermal motion of the atom and must be included in the phase refinement process. The R-factor is a measure of the agreement between the crystallographic model and the diffraction data, and depends highly on the resolution of the data. It should also be noted that it is not always possible to see every atom in the molecule because the electron density is an average over all molecules in the crystal. Sometimes atoms exist in several conformations, causing their electron density to appear smeared. On the other hand, they may appear multiple times in an electron density map.

In summary, the magnitude of the Fourier transform of the electron density is found from the multiple recorded intensities of the reflections. The full Fourier transform of the electron density comes from combining the phases and magnitudes. The electron density is converted into the


arrangement of atoms in the crystal, and the determined crystal structure is then stored in a publicly accessible database, like the Protein Data Bank (PDB).

1.2.1.2 Nuclear magnetic resonance (NMR)

Structure determination using NMR can be performed in five steps. (1) Prepare the protein solution; unlike X-ray crystallography, Nuclear Magnetic Resonance (NMR) can be performed on samples either in solution or in the solid state. (2) Take the NMR measurements. (3) Assign the NMR signals to individual atoms in the molecule. (4) Identify conformational constraints, like distances between hydrogen atoms. (5) Calculate the 3D structure based on experimental constraints.11

When a nucleus is placed in a magnetic field, it can exist in one of a small number of allowed orientations (states) with different energies. The nucleus of a hydrogen atom has only two allowed orientations; the magnetic moment of the nucleus can either align parallel to the external magnetic field or point in the opposite direction. Some nuclei will be oriented parallel and others anti-parallel, giving rise to a small polarization of nuclear spins and thus a net macroscopic magnetization, which can be manipulated with the appropriate electromagnetic waves. Quantum mechanically speaking, the external magnetic field causes the ground state energy level of the nucleus to split into two spin states (for nuclei with S = 1/2) that have an energy difference of hν. This energy gap can be measured by applying electromagnetic radiation of frequency ν (usually in the radio-frequency range), causing the nuclei to be excited from the lower energy level to the upper one. This process is classically described as flipping the spin of the nucleus between its two spin states, spin up and spin down. The frequency is typically applied in several pulses, each of which is a few microseconds long, causing the spins of the nuclei to flip. An NMR signal is obtained after perturbing the equilibrium spin states.
The signal decays as the system returns to its equilibrium state. The signal is also called the free induction decay (FID), which is essentially the sum of decaying cosine waves whose frequencies correspond to the resonance frequencies of the nuclei. A Fourier transform of this data yields the NMR frequency spectrum.12

Different types of nuclei have vastly different resonance frequencies. Protons (1H), for example, resonate at a frequency about four times that of carbon nuclei (13C) and ten times that of nitrogen nuclei (15N). Much smaller resonance frequency differences occur between nuclei of the same type. Such variations, or chemical shifts, are due to the interactions between the nuclei and the surrounding electrons, which affect the local magnetic field experienced by a particular nucleus and in turn alter its resonance frequency. It is the chemical shift that allows us to assign protons to different classes. For example, we can distinguish between amide protons and those on methyl groups. The chemical shift is very sensitive to many structural, electronic, magnetic, and dynamic variables and contains a lot of information on the state of the system of interest.

The most important feature of NMR spectroscopy is that individual nuclei interact with the small magnetic fields generated by the spins of the nuclei nearby. The different nuclei can be correlated with one another in the molecule by using these spin-spin interactions. The interactions between nuclei are either through-space or through-bond. The through-space interactions are the basis for the nuclear Overhauser effect (NOE), which allows for distance measurements between hydrogen nuclei. The through-bond interactions are called spin-spin coupling or J coupling. Both of these correlations form the basis for the analysis of protein spectra.

To determine the structure of proteins, multidimensional NMR must be used because it provides spectra with improved resolution as well as more easily analyzed correlations. Two-dimensional NMR was primarily developed by Kurt Wüthrich, who shared the Nobel Prize in chemistry in 2002. There are four consecutive time periods in multidimensional NMR: excitation, evolution, mixing, and detection. In the excitation period, the nuclear spins are prepared in the desired state. The chemical shifts are then observed during the evolution period t1. The spins are correlated with each other during the mixing period, and the chemical shift information of one nucleus is transferred to another nucleus whose frequency is measured during t2, the detection period. Several experiments are run with successively incremented lengths of t1. From this information, a two-dimensional data set is obtained, from which a data matrix S(t1, t2) is generated. The frequency spectrum, S(ω1, ω2), is obtained from a Fourier transform of S(t1, t2). If two nuclei interact during the mixing time, the interaction will appear as a cross peak in the resonance spectrum at a position corresponding to the resonance frequencies of the two nuclei.

Larger proteins are generally labeled with 15N and 13C, but the preferred nucleus for detection is hydrogen because it is the most sensitive. During the evolution period, the other nuclei, 15N and 13C, can be measured and their information is transferred to the protons for detection. The chemical shift is sensitive to the environment of a nucleus. Thus, multiple copies of the same amino acid in a protein can be distinguished due to the conformation-dependent chemical shift. The 1H, 15N, and 13C chemical shifts are known for many 3D NMR structures of proteins and can be used in empirical and semi-empirical correlations with structural parameters.

To assign the spectra, knowledge of the protein sequence is necessary. For large [15N, 13C]-labeled proteins, through-bond correlations across the peptide bond between sequential amino acids can be used to assign the spectrum.
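The Fourier relationship between the time-domain signal and the frequency spectrum described above can be illustrated with a one-dimensional toy example. The sketch below is not taken from this work: the function names, the two resonance frequencies, the sampling interval, and the decay constant are all hypothetical values chosen for illustration, and a naive discrete Fourier transform stands in for the fast transforms used by real NMR processing software.

```python
import cmath
import math


def simulate_fid(freqs_hz, n_points=250, dt=0.001, t2=0.05):
    """Toy FID: a sum of cosines, damped by exp(-t / t2), sampled every dt seconds."""
    return [
        sum(math.cos(2 * math.pi * f * k * dt) for f in freqs_hz)
        * math.exp(-k * dt / t2)
        for k in range(n_points)
    ]


def dft_magnitude(signal):
    """Magnitude of the discrete Fourier transform at non-negative frequencies."""
    n = len(signal)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * m * k / n) for k, s in enumerate(signal)))
        for m in range(n // 2)
    ]


def peak_frequencies(signal, dt, n_peaks):
    """Frequencies (Hz) of the n_peaks strongest local maxima of the spectrum."""
    mag = dft_magnitude(signal)
    n = len(signal)
    maxima = [
        m for m in range(1, len(mag) - 1)
        if mag[m] > mag[m - 1] and mag[m] > mag[m + 1]
    ]
    strongest = sorted(maxima, key=lambda m: mag[m], reverse=True)[:n_peaks]
    # Bin m corresponds to a frequency of m / (n * dt) Hz.
    return sorted(m / (n * dt) for m in strongest)


if __name__ == "__main__":
    # Two hypothetical resonances; the spectrum should show one line near each.
    fid = simulate_fid([40.0, 120.0])
    print(peak_frequencies(fid, dt=0.001, n_peaks=2))
```

The decay constant controls the line width: a faster decay gives broader lines, which is one reason closely spaced resonances can be hard to resolve. A two-dimensional experiment applies the same transform along both t1 and t2.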
Distance information and/or dihedral angles must be derived from NMR data to calculate the protein's structure. Basic information about protein structure, such as amino acid sequence, bond lengths, bond angles, chiralities, planar groups, and steric repulsion between non-bonded atom pairs, is used in conjunction with NMR data to do so. The crucial information comes from NOE distance measurements, but supplementary dihedral angle constraints can come from through-bond correlations. Chemical shifts can also indicate the type of secondary structure that is present, and through-bond interactions can detect hydrogen bonds. When NOEs are not prevalent, residual dipolar couplings (RDCs) can be used. RDCs are related to the orientation of N-H and C-H internuclear vectors relative to the molecular frame. Using more input constraints in the structure calculation gives rise to higher quality structures.

The final step in this process is to calculate the structure. First, a low resolution structure is derived from an unambiguous subset of NOE data. Many computer programs are available for this process, and they are divided into two main groups:13 ones that use inter-atomic distances,14 like DISGEO and DISMAN,15,16 and ones that use torsional bond angles, like DYANA17 and DIANA.18 The final result for each type of method is the Cartesian coordinates of the family of structures that satisfy the set of NMR constraints. The experimental constraints do not specify just one unique structure; instead, they describe a range of possible values. Also, some distances cannot be determined. Due to these restrictions, an ensemble of structures satisfying the constraints is typically generated by repeating the structure calculation several times. The best ensemble samples all the conformational space the constraints allow.

Restrained molecular dynamics14,15 and distance geometry18,19 are two of the most common approaches used in structure generation. Distance geometry is often used to generate an initial structure for molecular dynamics.
Using the distance constraints, an error function is minimized; this function depends on the sum of the differences between each distance constraint and the corresponding actual distance. In restrained molecular dynamics, energy terms based on the NMR-derived constraints are added to the classical molecular dynamics force fields. Usually, a combination of distance geometry and molecular dynamics is used to calculate the structure of a protein.

1.2.2 Distance Measurements

1.2.2.1 Nuclear Overhauser effects (NOE) from NMR

Nuclear Overhauser effects are through-space correlations between nearby hydrogen atoms in the protein. Unlike J couplings, the nuclei involved in the NOE can be separated greatly in the protein sequence as long as they are close to each other in space. The NOE results from the transfer of magnetization between spins interacting through their dipoles. The intensity of the NOE is approximately proportional to r⁻⁶, where r is the distance between the two interacting nuclei. Because of this inverse sixth-power dependence on the distance, the intensity falls off very quickly with increasing distance. As such, NOEs are only observed for small distances, between 2 and 5 Å.12 The lower bound is essentially the sum of two hydrogen atomic radii. The NOE distances are also classified as strong, medium, or weak depending on their intensity. The distance ranges corresponding to each type of NOE are usually defined as less than 2.5 Å for strong, 2.5 to 3.5 Å for medium, and greater than 3.5 Å for weak.20

1.2.2.2 Electron paramagnetic resonance (EPR)

Electron paramagnetic resonance (EPR) is a magnetic resonance technique used to investigate systems that possess unpaired electrons. This technique has been used in studies of metal centers in metalloproteins as well as reaction intermediates via spin trapping. The introduction of site-directed spin labeling (SDSL)21 has allowed EPR to be applied to studies of proteins that do not contain metal atoms.
In SDSL, a cysteine residue is introduced into a specific location in the protein sequence, where it can then form a disulfide bond with a thiol-specific nitroxide spin label, like the methanethiosulfonate spin label (MTSL).


The basic principles of EPR are similar to those of NMR, except that EPR deals with the magnetic moment of an unpaired, free electron instead of nuclear spins. An oscillating magnetic field induces transitions between the two spin states. Energy is absorbed during the transition, and the first-order derivative of the absorption spectrum is generally recorded.

To measure distances in proteins using EPR, two spin labels must be inserted. If the two spin probes are close enough to each other, they will experience dipole-dipole coupling proportional to r⁻³, the inverse cube of the distance between them.22,23 EPR can measure a wide range of distances, from 8 to 20 Å for continuous wave techniques and 18 to 80 Å for pulsed techniques.24

In continuous wave methods, distance information can be extracted by analyzing line broadening caused by dipolar interactions. To obtain the dipolar interaction, three types of labeled samples are needed: a protein labeled in site A, a protein labeled in site B, and a protein labeled in both sites A and B. A spectrum with two spin labels is assumed to be the convolution of the dipolar broadening function and the monoradical spectrum.25 To separate the dipolar spectrum, a Fourier deconvolution method is used to subtract monoradical contaminants. This yields a Pake distribution from which distance information can be obtained.

In pulsed EPR techniques, like double electron-electron resonance (DEER), distance information is obtained by modulating a spin echo at the frequency of the dipolar interaction. Analysis of spin-echo amplitude oscillations reveals such distance information.

There are many benefits to using EPR. The sample preparation is fairly easy compared to other methods. Low temperatures are required, but only one type of spin label is needed. EPR often has higher precision than techniques like Fluorescence Resonance Energy Transfer (FRET), which will be discussed in Section 1.2.2.3. EPR can be performed with lower sample


volumes and concentrations than NMR, and there are no molecular size limitations. EPR can be used to measure medium (5 to 25 Å) to long-range distances (25 to 80 Å).

One disadvantage of EPR is that the dynamics of the spin labels are unknown and highly system dependent. To overcome this problem, molecular dynamics simulations can be used to predict the orientation of the labels and their effects on the distance distributions. The experimentally determined distances are those between the spin labels, not the α-carbons, but by predicting the spin probe locations, one may be able to derive inter-residue distance information.

1.2.2.3 Fluorescence resonance energy transfer (FRET)

In fluorescence resonance energy transfer, energy is transferred from an excited state donor (D) to a ground state acceptor (A). Long-range dipole-dipole interactions between the donor and the acceptor cause the energy transfer, which occurs without the appearance of a photon. The rate of energy transfer depends on four things: (1) the spectral overlap between the emission spectrum of the donor and the absorption spectrum of the acceptor, (2) the quantum yield of the donor, (3) the relative orientation of the donor and acceptor transition dipoles, and (4) the distance between the donor and acceptor. In measuring inter-residue distances in proteins, it is this distance dependence of FRET that is exploited.

Proteins can be covalently labeled with a donor, typically a tryptophan, and an acceptor molecule. The distance between the labels is inferred from the efficiency of energy transfer, which can be determined from steady-state measurements of the extent of donor quenching due to the presence of the acceptor. Because such distances can be measured, FRET is often described as a spectroscopic ruler.26

The distance over which energy transfer occurs is similar to the dimensions of proteins. The Förster distance (R0) is the distance at which FRET is fifty percent efficient and is usually


between 20 and 90 Å, depending on the specific donor and acceptor pair. Transfer efficiency can be expressed in terms of distances (Equation 1-1), decay times (Equation 1-3), and intensities (Equation 1-4). In Equation 1-1, R0 is the Förster distance while r is the distance between the donor and acceptor. The rate of FRET strongly depends on the distance and is inversely proportional to r⁶. Any phenomenon that changes the donor-acceptor distance will also cause a change in the transfer rate, which allows, for example, the study of conformational changes in proteins. Energy transfer is assumed to occur if the distance between the donor and acceptor is near the Förster distance and there is enough spectral overlap.

The Förster distance can be calculated from spectral properties of the donor and acceptor molecules as in Equation 1-2. The quantum yield of the donor in the absence of the acceptor is Φ_D, J(λ) is the spectral overlap, n is the refractive index of the medium (generally taken to be 1.4 for biomolecules in aqueous solution), A is a constant, and κ² is the orientation factor, which describes the relative orientation of the donor and acceptor transition dipoles. For dynamic random averaging of the donor and acceptor orientations, κ² is assumed to be 2/3.

The transfer efficiency is measured experimentally via fluorescence lifetime or intensity and calculated using Equation 1-3 or Equation 1-4. The subscript DA indicates the lifetime (τ_DA) or intensity (I_DA) of the donor in the presence of the acceptor, while the subscript D indicates the lifetime (τ_D) or intensity (I_D) of the donor in the absence of the acceptor.

$$E = \frac{R_0^6}{R_0^6 + r^6} \qquad \text{(1-1)}$$
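As a small worked illustration of how these relations turn a measurement into a distance, the sketch below evaluates Equation 1-1 and its algebraic inverse, and computes the efficiency from lifetimes and intensities. The function names and the numerical values are hypothetical, and the lifetime and intensity forms assume the standard expressions E = 1 - τ_DA/τ_D and E = 1 - I_DA/I_D for Equations 1-3 and 1-4.

```python
def fret_efficiency(r, r0):
    """Equation 1-1: E = R0^6 / (R0^6 + r^6), with r and R0 in the same units."""
    return r0 ** 6 / (r0 ** 6 + r ** 6)


def distance_from_efficiency(e, r0):
    """Invert Equation 1-1: r = R0 * (1/E - 1)^(1/6), valid for 0 < E < 1."""
    if not 0.0 < e < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0 * (1.0 / e - 1.0) ** (1.0 / 6.0)


def efficiency_from_lifetimes(tau_da, tau_d):
    """Assumed form of Equation 1-3: E = 1 - tau_DA / tau_D."""
    return 1.0 - tau_da / tau_d


def efficiency_from_intensities(i_da, i_d):
    """Assumed form of Equation 1-4: E = 1 - I_DA / I_D."""
    return 1.0 - i_da / i_d


if __name__ == "__main__":
    r0 = 50.0  # hypothetical Forster distance in angstroms
    # Hypothetical donor lifetimes (ns) with and without the acceptor present.
    e = efficiency_from_lifetimes(tau_da=1.0, tau_d=4.0)
    print(e, distance_from_efficiency(e, r0))
```

Because of the sixth-power dependence, the inferred distance is most reliable when r is close to R0; far from R0 the efficiency saturates near 0 or 1 and small measurement errors translate into large distance errors.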


$$R_0^6 = \frac{A\,\kappa^2\,\Phi_D\,J(\lambda)}{n^4} \qquad \text{(1-2)}$$

$$E = 1 - \frac{\tau_{DA}}{\tau_D} \qquad \text{(1-3)}$$

$$E = 1 - \frac{I_{DA}}{I_D} \qquad \text{(1-4)}$$

In summary, FRET can be used to measure long-range distances (20 to 90 Å) in proteins. The actual distances measured are between the donor and acceptor molecules. Distances between two α-carbons, however, can be inferred from this data by estimating the position of the labels using molecular dynamics. Some degree of uncertainty is introduced into the measurement when deriving such distances from the experimental data.

1.2.2.4 Chemical cross-linking with mass spectrometry

Chemical cross-links are used to connect two polymer chains through covalent bonds. In biochemistry, they are often employed in the study of protein structure, function, and interactions with other proteins. Cross-linkers bind to surface amino acid residues that are near one another in space. This helps to stabilize otherwise weak or transient inter-residue interactions so they can be analyzed. The imidoester cross-linker dimethyl suberimidate27 and the N-hydroxysuccinimide ester cross-linker BS3 (bis(sulfosuccinimidyl) suberate)28 can both be used as cross-linkers in protein studies. In these cases, the amino group of lysine acts as a nucleophile and attacks the cross-linker, resulting in a covalent bond between the lysine and the cross-linker. The carbodiimide cross-linker EDC, 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide, converts carboxyl groups into amine-reactive isourea intermediates that can bind to available primary amines, including lysine residues. The cross-linkers have a known end-to-end distance, which can be taken as the maximum distance


between the two linked residues, as the linkers are usually flexible and can fold over on themselves.29

Mass spectrometry (MS) is an analytical tool that has many uses, including protein characterization. In general, there are four steps in MS: (1) ionize the sample, (2) separate ions of differing masses, (3) detect the number of ions produced at each mass, and (4) collect and analyze the data. To characterize proteins, two methods can be used. In the top-down approach, the intact protein is ionized by electrospray ionization or matrix-assisted laser desorption/ionization and then run through a mass analyzer. In the bottom-up method, protease enzymes, like trypsin, are used to digest the proteins into smaller peptides, which are then introduced into the mass spectrometer. The identity of the peptide is found through peptide mass fingerprinting or tandem mass spectrometry. It is easy to identify cross-linked residues because the enzymes will not cleave at residues containing such cross-linkers.

1.3 Methods of Structure Prediction

Many methods of protein structure prediction are available. The ultimate goal in protein modeling is to predict the structure of a protein using its amino acid sequence. Ideally, the predicted structure will be comparable in accuracy to an experimental structure, but the method of deriving the structure will be much faster than experiment. Protein modeling is especially useful for proteins that cannot be crystallized for X-ray diffraction or those that are too large to be studied via NMR. The recent interest in genome projects has given rise to enormous amounts of amino acid sequence information, which continues to grow much faster than protein structure data.30-32 In an attempt to keep up with the demand for protein structures, many scientists are turning to structure prediction methods. We will discuss homology modeling, fold recognition methods, and ab initio methods.


1.3.1 Homology Modeling

The simplest protein structure prediction method is sequence homology,33,34 which determines the degree of similarity between proteins, one with a known structure and one without. There are two basic premises of homology modeling: (1) a protein's structure is determined by its amino acid sequence35 and (2) over millions of years, a protein's structure is less likely to change than its sequence.36-38 If two proteins have sequence homology greater than 30%, they are believed to have essentially the same structure.38 In general, homology modeling can be broken down into seven steps: template recognition and initial alignment, alignment correction, backbone generation, loop modeling, side-chain modeling, model optimization, and model validation.

The first step in homology modeling is to find a template and align the sequences. Sequence alignment programs like BLAST39 or FASTA40 compare the target sequence to all sequences that have a known structure and are in the PDB by using two matrices. The first is a residue exchange matrix, which characterizes the probability that two of the twenty amino acids should be aligned. The axes of this matrix are simply the 20 amino acids. The highest values are found along the diagonal, representing conserved residues. Exchanges between residues with similar properties, like phenylalanine and tyrosine, have higher scores than exchanges between very dissimilar residues. In the alignment matrix, the axes correspond to the two sequences being aligned, and the elements are the values of the exchange matrix for a particular pair of residues. To find the optimal alignment, the best path through the matrix is found, taking care not to use any residue twice. Gaps can be inserted to improve the alignment, but the alignment algorithm will subtract a gap opening penalty. The end result of a BLAST search is a list of hits, which are the modeling templates and their corresponding alignments.
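The idea of finding the best path through an alignment matrix can be sketched with a textbook global dynamic-programming alignment (Needleman-Wunsch). This is not the heuristic search that BLAST or FASTA actually perform, and everything specific in it is an assumption made for illustration: the toy exchange scores, the short list of "similar" residue pairs, and the single linear gap penalty used in place of a separate gap opening penalty.

```python
# Hypothetical toy exchange matrix: identical residues score highest,
# a few chemically similar pairs score lower, everything else is penalized.
SIMILAR_PAIRS = {frozenset("FY"), frozenset("IL"), frozenset("DE"), frozenset("KR")}


def exchange_score(a, b):
    """Toy stand-in for a residue exchange matrix entry."""
    if a == b:
        return 2
    if frozenset((a, b)) in SIMILAR_PAIRS:
        return 1
    return -1


def global_align(seq_a, seq_b, gap=-2):
    """Return (score, aligned_a, aligned_b) for the best global alignment."""
    n, m = len(seq_a), len(seq_b)
    # score[i][j] is the best score for aligning seq_a[:i] with seq_b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + exchange_score(seq_a[i - 1], seq_b[j - 1]),
                score[i - 1][j] + gap,  # gap inserted in seq_b
                score[i][j - 1] + gap,  # gap inserted in seq_a
            )
    # Trace the best path back from the bottom-right corner of the matrix.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + exchange_score(
            seq_a[i - 1], seq_b[j - 1]
        ):
            out_a.append(seq_a[i - 1])
            out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(global_align("ACDEF", "ACEF"))
```

Each residue is consumed exactly once along the traced path, which is the "no residue used twice" condition mentioned above. Production tools use empirically derived exchange matrices and affine gap penalties rather than the toy values assumed here.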


In the second step, the alignment can be corrected using programs like CLUSTALW41 to perform multiple sequence alignments, which use the sequences of other homologous proteins. Such methods are useful for regions of low sequence identity in the original alignment. Multiple sequence alignments can also generate position-specific scoring matrices called profiles, which indicate which residues are most likely to be buried in the hydrophobic core and which are on the surface, based on the most frequently seen residue exchanges.

The backbone is generated in step 3. The coordinates of the template residues appearing in the alignment are copied. If the template residues are the same as the target, all coordinates of the residue are copied. If not, only the backbone coordinates (N, Cα, C, and O) are copied. Multiple template modeling, as performed by programs like Swiss-Model,42 is useful when one template is found to contain errors.

After alignment, there are often gaps from insertions and deletions, which change the conformation of the backbone. These changes usually do not occur in regular secondary structural elements, but rather in loop regions and turns. Even without the insertions and deletions, loop conformations often differ between the template and the target. There are two common methods used for loop modeling: knowledge based and energy based. In the first method, the loop conformation is copied from a known loop in the PDB with endpoints that are the same as the residues between which the loop must be inserted. Most major modeling programs can do this.42-46 The energy based method determines the best loop conformation by minimizing an energy function using Monte Carlo47 or molecular dynamics.48

The next step is to model the side chains. Usually, this is done using libraries of common rotamers (different conformations) that have been extracted from high resolution X-ray structures.
Many such libraries exist.49-51 Each rotamer is positioned and energy functions are

used to score them. If the residue is conserved, it is easier to copy the coordinates of the entire residue instead of copying the backbone and modeling the side chain. Also, certain conformations of the backbone may prefer certain rotamers, which helps to minimize the search space, as the position of one affects the position of its neighbors. Residues in the hydrophobic core generally adopt only one conformation, whereas the more flexible side chains on the surface of proteins may adopt several conformations.

The model optimization step is actually an iteration of rotamer prediction and energy minimization steps. The rotamers are predicted, which changes the backbone; then the rotamers for the new backbone are predicted, and so on until convergence is achieved. Molecular dynamics is the preferred method of energy minimization. Some methods restrain the atom positions and/or employ only a few hundred steps of energy minimization. More accurate force fields will help improve model optimization.

The final step in homology modeling is to validate the model. The amount of error can depend on the sequence identity between the target and template. Poor sequence identity (< 25%) can often lead to very large errors. Such errors can be estimated by calculating the model's energy based on a force field, which checks whether bond lengths and angles are within normal limits. It is not possible to discern whether the protein is folded correctly using force fields alone, as misfolded yet well-minimized models can usually give the same energy as the target structure.52 Determining normality indices is an additional way to estimate errors. These indices describe how similar certain characteristics are between the model and real structures. They check properties like the distribution of polar and nonpolar residues in the interior and on the surface of the protein, as well as radial distribution functions that can distinguish between good and bad contacts.

In general, sequence homology is quick and computationally inexpensive. A drawback of this method, however, is its inability to detect structural similarities between two proteins with very different sequences. Unfortunately, in the protein world such occurrences are still quite common.53-56 For example, mammalian glycogen phosphorylase and DNA glucosyltransferase have similar shapes but differ greatly in sequence.54 For these proteins, other methods, like threading and ab initio folding, must be used.

1.3.2 Fold Recognition Methods (Threading)

The basic idea behind fold recognition methods (also called threading or the inverse folding problem) is to determine which of the known protein folds will be most similar to the unknown fold of a new protein, knowing only its amino acid sequence. In nature, two seemingly unrelated proteins often adopt similar folds. It is therefore important for a program to detect structural similarities between proteins with vastly different sequences. Some of these occurrences are the result of divergent evolution; the two proteins are related, but our current sequence analysis methods are not sensitive enough to detect such distant homologies. Convergent evolution, on the other hand, may explain how similar structures can result from proteins having common functional requirements, like binding the same class of substrates. Finally, because only a small number of folds have been found in nature thus far, proteins may have a very limited folding space, giving rise to similarities between unrelated proteins. This last explanation is generally reserved for single domain proteins.57 The first two explanations show that proteins with similar structures sometimes also have similar functions; fold recognition, therefore, might be used as a function prediction tool as well.
Usually, the active site, identity of cofactors, and general features of the reaction being catalyzed are highly conserved for enzymes with similar folds.1 Essentially, for such proteins, function is conserved evolutionarily.

In sequence-based fold recognition, one must first recognize similarities based on sequence and then construct a detailed alignment, which is a residue-by-residue equivalence table between the two proteins. The sequence alignment methods are the same as those discussed in the section on homology modeling. Fold recognition methods also use position-specific mutation rules derived from the multiple sequence alignment of a homologous family to find distant homologies, even between proteins with less than 25% sequence similarity.

Energy-based fold recognition methods are similar to grid search minimizations. The calculated energy at each grid point is based on known protein structures. This method is also called threading58,59 because the target sequence is threaded through, or forced to adopt, the structure of another protein. Several threading algorithms have been developed, but all follow the same paradigm of sequence alignment, template identification, and alignment building. The same limitations apply to threading as to sequence-based fold recognition. If no correct structure exists in the structural database being used, no good models will be built.

Many threading algorithms have been developed over the years. An intuitive approach would be to use a technique that incorporates nonlocal scoring functions. Many approximations are needed to minimize the space of possible alignments. One of the first successful methods was the Threader algorithm.60 It used two-level dynamic programming to optimize interaction partners for each pair of aligned residues. Only the strongest interacting residues were considered in this method, which helped reduce its computational cost.

Threading algorithms generally differ in three areas. The first is in their protein model and interaction descriptions. Simplifying the three-dimensional protein structure is one way to speed up energy calculations.
Side chains can be simplified by describing them as interaction points, which can be located at Cα or Cβ positions or can encompass the whole side chain. Also, the

energy calculation may include only certain parts of the protein, and the interaction energy may or may not be distance dependent. Second, different algorithms have various empirical energy parameterizations. Finally, threading methods differ in their alignment algorithms. The threading energy is a nonlocal function based on the alignment between the template structure and the prediction target sequence.

1.3.3 Ab Initio Methods

Predicting a protein's native conformation solely from its sequence of amino acids is the basis of ab initio structure prediction. In the early 1970s, Anfinsen suggested that a protein's native conformation corresponds to a global free energy minimum for its sequence, which is commonly referred to as the thermodynamic hypothesis. He also showed that the information needed for a protein to fold is contained in its amino acid sequence.61,62 It seems logical to assume that, given a perfect energy function and the proper computational tools, the native structure can be found. In reality, two problems hinder this method: the conformational space to be searched is huge, and the molecular potentials have limited accuracy. To reduce the effects of these problems, many methods use reduced representations, simplified potentials, and coarse search strategies.63-66

In ab initio folding, representations of the polypeptide chain are usually simplified in some way. Implicit solvation models are preferred over explicit water molecules. United atom representations are used, in which the non-polar hydrogens are merged into the heavy atoms to which they are bonded. Using the limited set of conformations for each side chain that is most prevalent in the PDB (found in rotamer libraries51) can reduce computational cost without loss of predictability. Side chains are also sometimes replaced by locating the side-chain properties at the Cβ or the centroid of the side

chain. This essentially averages out the side chain degrees of freedom, which speeds up the calculation but also decreases the specificity.

To reduce the size of the conformational space to be searched, one can limit the available backbone conformations. Certain local structures prefer certain torsion angle pairs.67-69 Torsion angles can also be restricted to discrete, commonly seen values by using a small set of phi-psi pairs,70 by choosing pairs from an ideal set based on predicted secondary structure, or by using fragments from known proteins.63,71-73

There are two types of potentials used to evaluate the free energy of proteins: molecular mechanics potentials and scoring functions. Both classes must be able to properly represent the forces that determine protein conformation. Such forces include solvation, electrostatic interactions (hydrogen bonds as well as ion pairs), covalent bonds, bond angles, dihedral angles, and van der Waals interactions.

For molecular mechanics, the forces needed to determine protein conformation are modeled using physically based functional forms that have been parameterized from small molecules or quantum mechanical (QM) calculations. Coulomb's law is used to model electrostatic interactions, with QM calculations used to derive partial charges, while a standard 6-12 potential is usually used to describe van der Waals interactions.

Scoring functions (protein structure-derived potentials), on the other hand, are empirically derived from experimental structures.74,75 A functional form of the potential is usually not specified; rather, the logarithms of probability distribution functions are used to find pseudoenergies. These functions are especially useful when dealing with reduced complexity models, as they can represent interactions between side chain centroids after averaging over all possible positions of the non-present atoms.
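The two classes of potentials described above can be contrasted with a short sketch. The first function is the standard 6-12 (Lennard-Jones) form; the second converts binned distance counts into pseudoenergies by taking the negative logarithm of a probability ratio. The use of a reference distribution, the `kT` scaling, the function names, and the requirement that every bin has a nonzero count are assumptions made for illustration, not details taken from the cited scoring functions.

```python
import math


def lennard_jones(r, epsilon, sigma):
    """Standard 6-12 potential: 4*epsilon*((sigma/r)**12 - (sigma/r)**6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def pseudo_energies(observed_counts, reference_counts, kT=1.0):
    """Per-bin pseudoenergy -kT * ln(P_observed / P_reference).

    Assumes both histograms use the same bins and contain no empty bins.
    """
    n_obs = sum(observed_counts)
    n_ref = sum(reference_counts)
    energies = []
    for obs, ref in zip(observed_counts, reference_counts):
        p_obs = obs / n_obs   # probability seen in known structures
        p_ref = ref / n_ref   # probability expected in the reference state
        energies.append(-kT * math.log(p_obs / p_ref))
    return energies
```

Bins that are seen more often in known structures than in the reference state receive negative (favorable) pseudoenergies, and bins seen less often receive positive ones. No functional form is imposed on the pseudoenergy, in contrast to the fixed analytic shape of the 6-12 term.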

Molecular dynamics is usually too expensive for the de novo generation of protein models using a full atom representation. This method, however, has had some success with very small proteins, like the Trp-cage protein.76,77 Conformational searching is quicker when a coarse sampling of the energy landscape is performed. Methods that take this approach include Metropolis Monte Carlo simulated annealing,63 simulated tempering,78 evolutionary algorithms,72 and genetic algorithms.79 These methods generally allow for large perturbations in structure in a single move. Because the final structure of a single search may end up trapped in a local minimum instead of the global minimum, several conformational searches are performed to generate an ensemble of possible structures.

Choosing the most native-like structures from the ensemble is difficult, and many techniques have been developed to do so.80-82 As potential functions are improved, identifying the most accurate models will become easier, as they will have the lowest free energy. Perhaps the best energy functions for discriminating among the possible structures will be ones that combine molecular mechanics potentials with those derived from protein databases.

Two fields in which ab initio protein folding might be of great use are genome functional annotation and structural genomics. Many open reading frames have no sequence homology with proteins of known structure and/or function, but links between such proteins may be detected through ab initio folding. Structural similarities can be detected by comparing the predicted structures to those in the PDB using a structure-structure comparison tool.83 One could also look for conserved geometric motifs in these structures.84 Finally, the predicted structures can be used to make matches to sequence-based motif libraries more sensitive and reliable.
Ab initio structure prediction can be used in structural genomics initiatives as a guide for experimentalists by finding proteins most likely to contain novel folds. A hybrid of ab initio

prediction and homology modeling can also be used: if a homology model contains a gap, it can be filled in by ab initio prediction. Combined with a small amount of experimental data, ab initio methods can be used in rapid structure determination for proteins whose structures cannot be determined via X-ray or NMR data, like membrane-bound proteins.85

1.3.3.1 Rosetta

A specific example of an ab initio folding algorithm is Rosetta (http://www.bakerlab.org/), which is one of the best prediction methods available today. Prediction methods are tested and compared via the Critical Assessment of protein Structure Prediction (CASP) experiments (discussed further in section 1.4). In past CASP experiments,47,86,87 Rosetta has generated some of the top scoring predictions. There are several variants of Rosetta, all of which use sequence information and an energy function to generate protein-like structures. Rosetta has been employed in structure determination using limited experimental constraints,88,89 de novo protein design,90,91 protein-protein docking,92 and loop modeling.93 All methods involve generating a fragment library, piecing the fragments together, clustering the structures by pairwise Cα root mean square deviation (RMSD) values, and ranking the representative cluster centers.

Incorporating experimental data into the Rosetta method has been successful. RosettaNMR, for example, uses NMR constraints like residual dipolar couplings (RDCs), nuclear Overhauser effects (NOEs), and unassigned chemical shifts (CSs) to restrain certain bond distances and angles to improve the quality of predicted structures.

After the decoy set is generated, some decoys are eliminated using two filters. The contact order filter removes the decoys with low contact order (overly local structures) compared to a test set of proteins. The strand arrangement filter eliminates structures with unpaired β-strands

and other nonprotein-like structures. Finally, the decoys are clustered. A representative model from each cluster is chosen and ranked by the size of the cluster it represents.

Discriminating among the decoys is still problematic. Clustering is not always the best option, as the best structures may not be in the most populated cluster. We will present a method involving the generation of decoy sets using Rosetta and discriminating among the decoys using inter-residue distance measurements.

1.3.3.2 Databases to test scoring functions

In developing proper energy functions for use in protein structure prediction, it is important to test the function on a set of computer-generated conformations called decoys to see if the function can distinguish between native and non-native-like conformations.94 Samudrala and Levitt developed many sets of such conformations (the database called Decoys R Us, located at http://dd.stanford.edu) and made them available to the public to aid in the improvement of scoring functions. The decoys were generated with the intention of fooling the scoring functions; they have characteristics similar to those of native proteins, but they are not necessarily correct. Decoys have been developed from the following types of methods: molecular dynamics trajectories,95 crystal structures,96 conformations with different loop regions,82 threading of the amino acid sequence onto very different folds,97 and discrete-state models.80 Similar websites have been created to test energy functions for general protein structure prediction (http://prostar.carb.nist.gov) as well as scoring functions specific for fold recognition (http://fold.doe-mbi.ucla.edu). Testing scoring functions on several different decoy sets allows for the exploration of a vast conformational space of proteins, which a single decoy set alone might not be able to provide.

A function's performance can be measured in many ways.
The simplest method is to look at the RMSD between the best scoring conformation and the native structure. It is also possible to

estimate the probability of choosing the best conformation by chance: the RMSD rank of the conformation divided by the total number of conformations. The correlation coefficient between the RMSDs and the scores is also a good measure because it uses information about all the conformations in the decoy set.

1.3.4 Distance Geometry

The general goal of distance geometry calculations is to build model structures that satisfy a set of constraints. This branch of mathematics was developed by Blumenthal, while Crippen and co-workers98-100 were the first to apply these principles to chemical structure problems. Presently, the term distance geometry is used to describe any of the computer programs that convert geometric constraints into three-dimensional molecular coordinates. Distance geometry algorithms are usually fast and can explore a vast conformational space. The structures they generate can be used as input for further refinement methods.

Constraints are usually expressed in terms of an objective function. One way of doing this is to specify a target value for a parameter of interest (e.g., a distance or an angle) and then have the objective function measure deviations from the optimum value. Another way is to assign upper and lower bounds to a certain parameter. When the boundary conditions are violated, a penalty term is added to the objective function.

Most distance geometry programs have four parts: input preparation, bounds generation and bounds smoothing, embedding, and optimization. Many types of distances can be used as input, including holonomic distances, experimental distances, and those from secondary structure. Holonomic distances are determined directly from the protein sequence. Templates of each amino acid are made and generally include bond lengths, fixed dihedral distances like those in the peptide bond, and distances involved in rigid structures like aromatic rings.
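The bounds-based form of the objective function described above can be sketched as follows. The quadratic shape of the penalty, the function names, and the dictionary layout keyed by atom pairs are assumptions made for illustration; actual distance geometry programs may define the penalty differently.

```python
def bounds_penalty(distance, lower, upper):
    """Zero inside [lower, upper]; squared violation outside (assumed form)."""
    if distance < lower:
        return (lower - distance) ** 2
    if distance > upper:
        return (distance - upper) ** 2
    return 0.0


def objective(distances, bounds):
    """Sum the penalties over every constrained pair.

    distances: {(i, j): current distance in the model}
    bounds:    {(i, j): (lower bound, upper bound)}
    """
    return sum(
        bounds_penalty(distances[pair], lower, upper)
        for pair, (lower, upper) in bounds.items()
    )
```

A model that satisfies every bound has an objective value of zero, so the optimization stage can be viewed as driving this sum toward zero.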
Upper and lower bounds are usually set to ±2% of the distance of interest.100 Experimental distances can be derived from

NOE data. Usually, a 5 Å upper bound is applied while the lower bound is the sum of the appropriate van der Waals radii. Information from secondary structure, like hydrogen bonding constraints, can also be used. All the distances are stored in a symmetric matrix with N(N - 1)/2 independent elements, while the bounds matrix is not symmetric.

Because only a small number of all the interatomic distances will be found through experiment, other constraints, like the triangle inequality, must be used as well. For the upper bounds, the triangle inequality shows that for three atoms (i, j, k), the distance between i and j (Dij) must be less than the sum of the distances from i to k (Dik) and from k to j (Dkj). If Dij is greater than Dik + Dkj, the distance is replaced with the sum. If the sum of the two distances is less than the lower bound, then a triangle violation has occurred. After applying the triangle inequality to the upper bounds, it can be applied to the lower bounds. The overall inequality can be summarized as follows: the upper bound on Dkj must be greater than or equal to the sum of the upper bound of Dij and the lower bound of Dik for all atoms i, j, k.

1.3.5 Chemical Cross-Linking with MS

Recently, a technique involving the use of intramolecular cross-linking, mass spectrometry, and sequence threading has been employed in a structure prediction method.29 Using a lysine-specific cross-linking agent, BS3 (bis(sulfosuccinimidyl) suberate), the tertiary structure of FGF-2 (bovine basic fibroblast growth factor) was probed. Tryptic peptide mapping using time-of-flight mass spectrometry was employed to identify the eighteen Lys-Lys cross-linking sites, and distance constraints were derived from this information. Threading was then used to assign the protein to a family of folds. This method, which requires only a small amount of sample, is fast and easily automated.

The BS3 cross-linker reacts with the amine groups of Lys and the N terminus.
Only one Lys-Lys cross-link per molecule was seen, ensuring the tertiary structure remained unperturbed.

The masses of tryptic peptides were assigned from the mass spectra using the Automated Sequence Assignment Program (ASAP).29,101-103 The cross-linked residues are identified because trypsin cleaves at lysines and arginines, but not at BS3-modified lysines.

To identify the protein fold, a sequence threading program, 123D,104 was used to find the twenty best structural models from a database of proteins with at least 30% sequence identity. The models were then ranked by how similar their distances were to the cross-link-derived distance constraints. The threading models were scored according to Equation 1-5,

\[
\text{score} = \sum_{i=1}^{n}
\begin{cases}
0, & \text{if } d_i \le d_0 \\
d_i - d_0, & \text{if } d_i > d_0
\end{cases}
\qquad (1\text{-}5)
\]

where n is the total number of constraints, d_i is the Cα-Cα distance between the residues in constraint i, and d_0 is the maximum Cα-Cα through-space distance between the BS3-cross-linked lysines.

In their work, approximately N/10 constraints (where N is the number of residues) were found to provide enough distance information to correctly assign the fold of the studied protein. This method can be used to study most proteins if the fold has been previously observed. There are many cross-linkers available that react with polar groups other than lysine. These cross-linkers may also have different spacer arm lengths and flexibility. This method can also be combined with the other methods discussed in section 1.3.

1.3.6 Our Method

As mentioned previously, the more common experimental structure determination methods are expensive and time consuming. We have made creative use of less expensive experimental methods in an attempt to overcome some of the obstacles associated with the more common structure determination methods. We find only a modest decrease in the resolution of the predicted structure. Even low resolution structures have been demonstrated to provide

insights into protein function.105 We suggest a method using simple computer algorithms and relatively inexpensive inter-residue distance measurements to generate low resolution models, which can be further refined with additional procedures.

We propose a method to predict the unknown structure of a protein using a database of protein-like structures. After generating over 8 million decoys, we eliminate the bad ones using inexpensive distance measurements. We will test the following two hypotheses for our method: (1) our decoy set is complete; therefore, a target protein will have a structure similar to that of a member of the decoy set; (2) proteins with a small set of similar inter-residue distances (much smaller than the total number of distances) will have similar overall structure.

Our decoys are derived from structures in the Protein Data Bank (PDB), ensuring that all common protein folds are represented106 (see Chapter 2). After choosing a target sequence of unknown structure, several Cα-Cα distances are measured. Experimental techniques like NMR (NOE), EPR, and FRET can be used to measure small (3-7 Å), medium (10-25 Å), and large (25-100 Å) distances, respectively. Determining the radius of gyration through scattering experiments can also generate useful information. All of these methods generally cost less than X-ray crystallography.

To test the feasibility of our method, we search the decoy set for target proteins whose structures have been solved experimentally but have not been explicitly included in our set. Experimental data are simulated by calculating Cα-Cα distances from the experimentally determined structures of our target proteins. Those distances are then used as search constraints. The same set of distances is calculated for each of the decoy structures and compared to those measured in the target protein. Structures with several similar α-carbon distances are expected to also have similar three-dimensional structure.
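The search procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names, the in-memory dictionary of decoys, and the `tolerance` parameter (how far a decoy distance may deviate from the target distance and still count as a match) are all hypothetical.

```python
import math


def simulate_constraints(target_coords, pairs):
    """Simulate measurements: Calpha-Calpha distances taken from a known structure."""
    return {(i, j): math.dist(target_coords[i], target_coords[j]) for i, j in pairs}


def count_satisfied(decoy_coords, constraints, tolerance):
    """Count constraints whose decoy distance lies within +/- tolerance of the target."""
    satisfied = 0
    for (i, j), target_distance in constraints.items():
        decoy_distance = math.dist(decoy_coords[i], decoy_coords[j])
        if abs(decoy_distance - target_distance) <= tolerance:
            satisfied += 1
    return satisfied


def rank_decoys(decoys, constraints, tolerance):
    """Return (satisfied count, decoy name) pairs, best decoys first."""
    scored = [
        (count_satisfied(coords, constraints, tolerance), name)
        for name, coords in decoys.items()
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored
```

Here each structure is a list of (x, y, z) α-carbon coordinates indexed by residue position. The decoys at the top of the ranking, those satisfying the most distance constraints, would be taken as the predicted structures.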
Our first hypothesis suggests there should be at least one

surviving structure in the decoy set, while our second hypothesis, if true, guarantees the number of surviving structures to be low. Therefore, our final protein structure predictions are the decoys satisfying the most distance constraints.

Recently, the rate of new protein folds deposited into the PDB has reached a plateau, suggesting that most novel protein folds have already been discovered using the techniques presently available.107,108 This finding bolsters our assumption that most small (~100 residues), single domain, folded proteins have a structurally similar decoy in our database. We limit the use of our method to proteins containing 100 residues or fewer by generating decoys exactly 100 residues long. Our method is not intended to predict the structure of membrane proteins, as such proteins are not as well represented in the PDB.

1.4 Critical Assessment of Techniques for Protein Structure Prediction (CASP)

The Critical Assessment of Techniques for Protein Structure Prediction (CASP, http://predictioncenter.org/casp7) is a community-wide experiment that allows protein structure prediction groups to test and compare their methods. The goals of CASP are threefold:109 to determine the abilities and limitations of the current methods; to determine where progress is being made; and to determine where the field is not making progress due to specific bottlenecks. The categories of predictions change slightly from one round of CASP to the next. For example, evaluation of high resolution models was suggested at the CASP6 meeting and implemented in CASP7. In Chapter 6, we discuss the results of testing our method with CASP targets.

The CASP organizers solicit from experimentalists target protein sequences whose structures are close to being determined or have not yet been published. The participants are given only the target sequences and a limited amount of time to use their prediction methods to determine the target protein structures.
After analyzing the results, the organizers hold a conference at which the

most successful groups are asked to present their methods. Attendees may also make suggestions for future CASP experiments. There have been seven rounds of CASP since its inception in 1994.

CASP7 included three primary categories of prediction: (1) tertiary structure predictions, (2) high resolution models, and (3) other predictions. Each category is further divided into automated and human-aided predictions. The human predictions can be made using any combination of computational and human methods, but the automated structure prediction servers must be fully automated.

The tertiary structure predictions are further divided into two types, template based modeling and template free modeling. The first group includes the previous categories of comparative modeling, homologous fold based models, and some analogous fold based models. The second group contains models of proteins with new (previously unseen) folds as well as hard analogous fold based models.

The second primary category, high resolution models, is new. It contains a subset of models from the tertiary structure prediction category whose backbones are accurate enough that the details of active sites, loops, and side chains can be evaluated.

The other prediction category looks at how well predictors were able to define boundaries of structural domains, detect residue-residue contacts, and identify the regions of disorder in the targets. This category also includes predictions of function from structure. Another new facet of the evaluations was judging the predictors' ability to discern the best model from their respective decoy sets without knowledge of the native structure. Evaluating model refinement is also important, as there is much interest in generating models with high accuracy.

Figure 1-1. Structure of an amino acid (alanine) showing the α-carbon, the R-group (CH3 for alanine), the amino group, and the carboxyl group. The amino acid is shown in its neutral, non-zwitterion form.

Figure 1-2. Organization of protein structure. A) Primary structure corresponding to the amino acid sequence of alanine, tyrosine, phenylalanine, serine, lysine. B) Secondary structure: α-helix and β-sheet patterns. C) Tertiary structure of the Trp-cage protein (PDB code 1L2Y). D) Quaternary structure of hemoglobin (PDB code 1GZX).

CHAPTER 2
METHODS FOR PROTEIN STRUCTURE PREDICTION

2.1 Decoy Generation

2.1.1 General Decoy Set

All protein structures in the Protein Data Bank (PDB) with 100 residues or more (24,561 proteins in all, including X-ray and average NMR structures) were used to populate our decoy database. The protein backbones were split into overlapping, running fragments of 100 residues (Figure 2-1), and only the Cartesian coordinates of the α-carbons were stored. A parent protein of more than 100 residues can be segmented into exactly N - 99 overlapping fragments containing 100 residues each, where N is the total number of residues. Thus the first decoy contains the first 100 α-carbons, from residues 1 to 100, while the final decoy is composed of α-carbons from residues (N - 99) to N. Decoys are named by first listing the PDB code of the parent protein and then the decoy number. If the parent protein is composed of several chains, the chain name is listed after the PDB code. For example, 1m31-a-2 is the second decoy, composed of residues 2-101 from chain A of PDB code 1m31. Exactly 8,060,245 decoys were generated in this manner, creating a database for finding the structure of proteins composed of 100 residues or fewer. Because each decoy is exactly 100 residues long, we disregard the excess terminal residues when searching for shorter proteins.

Several problems were encountered in constructing the decoy set. Some entries in the PDB are missing important atoms or residues, causing gaps in the parent protein. Decoys were not generated from the gapped regions of such proteins. The two numbers appearing after some of the PDB codes indicate that the parent protein contained a gap and the sequence was renumbered after the gapped region. For example, parent protein 2a6o contained a gap and was divided into two sections, each containing over 100 residues. Decoy 2a6o-2-25 was created from residues 25 to

124 of the second fragment. Multiple positions for a single residue or an entire chain are also frequently seen in PDB entries. We consistently selected the first coordinate set if multiple positions were provided. For multi-chained proteins, the chains were treated as separate entities. Several proteins have multiple PDB entries; no attempt was made to rid the database of redundant structures.

2.1.2 Specific Decoy Set

The Rosetta procedure has been described in depth elsewhere.63,83,110,111 Briefly, generating decoys with Rosetta requires the initial formation of a fragment library using Robetta.112-114 Robetta divides the target protein's sequence into fragments of three and nine residues and searches the Protein Data Bank (PDB) for the possible structures of these fragments, which represent the range of accessible local structure conformations. These fragments are then pieced together randomly using a Monte Carlo simulated annealing procedure with an energy function that favors hydrophobic burial, paired β-strands, and specific side-chain interactions. Each decoy is evaluated by how well it compares to a protein-like structure based on statistics of known protein structures.

2.2 Decoy Discrimination

As described in section 2.1, decoys are protein-like structures that may or may not look like the target. The goal of the search process is to find a decoy (or small set of decoys) similar in structure to the target protein. Using inter-residue distances from the target protein as constraints, we distinguish between the good decoys (structures with low RMSDs using the target protein as a reference) and the bad decoys (high RMSD structures). These distances can be measured by experiments like nuclear Overhauser effect (NOE) measurements in nuclear magnetic resonance (NMR) spectroscopy, electron paramagnetic resonance (EPR), and fluorescence resonance energy transfer (FRET), which can measure short (3-7 Å), medium (10-25 Å), and

long (25–100 Å) distances, respectively. Such measurements are not exact; the probes in FRET and EPR are constantly rotating and have a finite size, making their exact orientation difficult to predict. Also, the measured distances are between the spin labels, while we simulate the experimental data using the distance between two α-carbons. The distance uncertainties in EPR measurements, without considering spin orientation, are estimated to be around 5 Å [25]. All of these uncertainties must be taken into consideration in our search process.

While comparing the Cα–Cα distances of the target protein to those of the decoys, upper and lower bounds are placed on the target protein's distance. A decoy satisfies the constraint only if its distance is within this range, the constraint distance acceptance range. The acceptance range indicates how much the decoy distance can differ from the target distance and still satisfy the constraint. Smaller ranges, ±1 Å and ±2 Å, are too tight; some low-RMSD structures do not satisfy many constraints using these ranges. These ranges are also too small to account for experimental uncertainty. Larger ranges, ±10 Å and ±15 Å, are too broad, allowing high-RMSD structures to satisfy several constraints. After many trials, a more moderate range of ±5 Å was found to yield the best results. This range also compensates for insertions and deletions, as the distance between two consecutive α-carbons is ~3.8 Å.

In addition to finding an optimal constraint distance acceptance range, the choice of which distance constraints to use is a key factor in the success of this method. When choosing constraints, it is helpful to first run the amino acid sequence through a secondary structure prediction method. For our studies we used JPred [115], a consensus method that gets results from six secondary structure prediction algorithms [116-119] that use evolutionary information from multiple sequences.
Based on sequence information, JPred highlights which fragments of the chain are more likely to exist as α-helices and which as β-sheets. We identify approximate

regions of defined secondary structure to avoid choosing distances between atoms in loop regions, which are highly mobile and less structurally defined, even in fairly similar structures. Therefore, the most effective constraints are distances between atoms in defined areas of secondary structure, such as α-helices and β-sheets. These regions are usually more conserved, as they often play a significant role in the protein's function.

After using the acceptance range to compare the calculated distances, we scored the decoys by counting the number of constraints each one satisfies. Over a series of trials, we found that twenty-five distances were sufficient to rank the decoys; therefore, our scores range from 0 to 25. Other researchers have found a similar amount of experimental distance information to be necessary in structure prediction [120,121]. This scoring method provides some protection against poor constraint choices (constraints satisfied by high- but not low-RMSD structures). In our previous trials (Chapter 3), constraints were applied sequentially, and decoys not satisfying a constraint were eliminated from the database at each step. The few decoys remaining after several constraint applications were the structure predictions. When a poorly chosen constraint was applied, low-RMSD structures were immediately eliminated from the database and were, therefore, unable to become the predicted structures.

The counting method makes the sum of the constraints more important than each one individually, minimizing the effect of a few bad choices. Applying a poor constraint can result in a low-RMSD structure having a slightly lower score and a high-RMSD structure having a slightly higher score. The effects on the scores are so small that the low-RMSD structures can still be predicted.

In summary, our search process is divided into five steps: (1) Use a secondary structure prediction method to identify important distance constraints. (2) Measure distances

experimentally. (3) Calculate the same set of distances for each decoy. (4) Compare each of the target protein's distances with those of the decoys. A decoy satisfies a particular constraint if the two distances agree to within the constraint distance acceptance range (±5 Å). (5) Score each decoy by counting the number of constraints it satisfies. We hypothesize that structures with similar α-carbon distances will show similarities in overall structure. The decoys satisfying the most constraints become our structure predictions. While testing our method, we search for a protein of known structure. We can therefore simulate the experimental constraints in step 2 by choosing a set of distances from the native structure of the target protein.

2.3 Choosing Constraints

There are many ways to choose constraints. In our early work (Chapter 3), we attempted to select constraints randomly. Usually, a few of these random constraints involved atoms in loop regions, which is problematic as these regions are often not structurally well defined. To avoid choosing constraints from loop regions, we use a secondary structure prediction method to identify these areas. With this knowledge, we selected constraints between all predicted secondary structure elements (Chapters 4 and 5).

One can also choose constraints in a daisy-chain manner. In this method, each atom in a constraint is also involved in another constraint. For example, let constraint 1 be the distance between residues A and B. Then constraint 2 is composed of residues B and C, while constraint 3 is derived from residues C and D. Constraints chosen in this manner are stronger than those randomly chosen because they must obey the triangle inequality. It is also possible to select a few atoms and use several of the distances between them as the set of constraints. Many implicit constraints are imposed in this manner.
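The daisy-chain and small-atom-set selection schemes described above can be sketched as follows. This is a minimal illustration, not the selection code used in this work: the function names `daisy_chain`, `all_pairs`, and `implied_bounds` and the residue numbers are hypothetical. The last function shows the implicit constraint that the triangle inequality places on the A-C distance once the A-B and B-C distances are fixed.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]  # (residue number, residue number)


def daisy_chain(residues: Sequence[int]) -> List[Pair]:
    """Link consecutive residues: (A, B), (B, C), (C, D), ..."""
    return list(zip(residues, residues[1:]))


def all_pairs(residues: Sequence[int]) -> List[Pair]:
    """Use every distance among a small set of selected residues."""
    return list(combinations(residues, 2))


def implied_bounds(d_ab: float, d_bc: float) -> Tuple[float, float]:
    """Triangle inequality: |d_AB - d_BC| <= d_AC <= d_AB + d_BC."""
    return abs(d_ab - d_bc), d_ab + d_bc


# Hypothetical residues A, B, C, D (numbers invented for illustration).
selected = [5, 30, 55, 80]
print(daisy_chain(selected))       # three explicit constraints
print(len(all_pairs(selected)))    # six explicit constraints
print(implied_bounds(10.0, 4.0))   # allowed A-C range, in the same units
```

With four residues, the daisy chain yields three explicit constraints, while using all distances among the same residues yields six.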
Another method is to choose a piece of secondary structure to serve as a reference and have all constraints involve

an atom from this region. This method of choosing constraints has been shown previously to be significantly better than daisy-chaining [122]. In our later work (Chapter 6) we used a combination of these three methods.

2.4 Comparing Results

In the recent Critical Assessment of protein Structure Prediction (CASP) experiments, the Local-Global Alignment (LGA) [123] (http://PredictionCenter.llnl.gov/local/lga) measure has been used to evaluate the prediction results. Although RMSD calculations provide insight into global similarities between protein structures, the LGA method was designed to measure similarities in both global and local structure. This program creates several alignments between the structures of the predicted model and the target to find those regions of the proteins that are most similar to each other. The LGA method has two components, the longest continuous segment (LCS) and the global distance test (GDT). Several iterations of both methods are usually required to find the optimal alignments.

When comparing two protein structures, the LCS procedure searches for the alignment that superimposes the longest section of continuous residues with an RMSD under a specified cutoff. For example, suppose the RMSD cutoff is 5.0 Å and the first three residues of each structure are aligned and have an RMSD of 4.0 Å. The program would then align residues 1–4 and compute the RMSD again. If the RMSD were still under 5.0 Å, residue 5 would be included in the calculation; otherwise, the RMSD between residues 2–5 would be computed. This process continues as several alignments are sampled, and then the LCS is identified. Any set of residues in the model can be aligned to the target; they do not need to have the same location in each sequence (e.g., residues 4, 5, and 6 of the model can be aligned to residues 23, 24, and 25 of the target). Unless otherwise indicated, in this paper we discuss LCS using a cutoff of 5.0 Å and use the abbreviation LCS-5.

In the GDT method, the structures are aligned to find the largest set of residues that differ by less than a selected distance cutoff. The distance cutoffs range from 0.5 to 10.0 Å and are scanned at 0.5 Å intervals. For GDT, the largest set is not necessarily composed of continuous residues. Pairs of residues are selected from both structures, and a superposition and RMSD are calculated. The superpositions are used as starting points to generate a list of equivalent residues (α-carbon pairs from both proteins). After aligning the target and model structures, the distances between the equivalent residues are calculated and the number of residue pairs with distances under the cutoff is counted. The residues above the threshold are removed, others are added, and the distances are calculated again. The initial list of equivalent residues is thus iteratively extended to find the largest set of residues that satisfy a specific distance threshold.

Figure 2-1. An example of how decoys are generated from a single protein. A parent protein with 104 residues (N = 104) can be cut into 5 decoys (104 − 99 = 5), each with exactly 100 residues. The first decoy contains α-carbons 1–100, while the final decoy is composed of α-carbons from residues 5–104 [(N − 99) to N]. Figure labels: parent protein (N = 104); decoys 1–100, 2–101, 3–102, 4–103, and 5–104.
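The fragmentation scheme of Section 2.1.1, illustrated in Figure 2-1, can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used to build the database: the function name `make_decoys`, the tuple representation of coordinates, and the parent code "toy" are hypothetical, and the handling of gaps and alternate positions is omitted.

```python
from typing import List, Optional, Tuple

Coord = Tuple[float, float, float]
DECOY_LENGTH = 100  # every decoy holds exactly 100 alpha-carbons


def make_decoys(pdb_code: str, ca_coords: List[Coord],
                chain: Optional[str] = None) -> List[Tuple[str, List[Coord]]]:
    """Cut one gap-free chain into overlapping 100-residue decoys.

    Hypothetical helper: returns (name, coordinates) pairs, with names
    following the 'code[-chain]-number' pattern described in the text.
    """
    n = len(ca_coords)
    decoys = []
    # A chain of N residues yields N - 99 windows (none if N < 100).
    for start in range(n - DECOY_LENGTH + 1):
        parts = [pdb_code] + ([chain] if chain else []) + [str(start + 1)]
        window = ca_coords[start:start + DECOY_LENGTH]
        decoys.append(("-".join(parts), window))
    return decoys


# Toy parent chain with N = 104 residues, as in Figure 2-1
# (coordinates are invented and lie on a straight line).
parent = [(float(i), 0.0, 0.0) for i in range(1, 105)]
decoys = make_decoys("toy", parent, chain="a")
print(len(decoys))    # N - 99 = 5
print(decoys[1][0])   # toy-a-2, built from residues 2-101
```

Storing each window separately mirrors the description in the text; a production database could instead store each parent chain once and record only window offsets, at the cost of an extra lookup per decoy.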

CHAPTER 3
TRIALS AND ERRORS: DEVELOPING THE METHOD

In this chapter, method development is discussed. To test the idea for the method, we searched through previously constructed decoy databases that were designed to test scoring functions. We then developed our own decoy set according to the methods set out in Chapter 2. A target protein with a structure known to be in the database was selected to test our search protocol. Finally, we predicted the structure of a protein whose native structure was not included in the database.

3.1 Testing the Method on Previously Constructed Databases

We tested our search procedure using Samudrala and Levitt's preconstructed databases (http://dd.compbio.washington.edu) for four known protein structures (1bba [124], 1b0n-b [125], 1ctf [126], and 1dtk [127]), which ranged in length from 31 to 78 residues. For each individual target protein, a unique set of decoys was created. The number of decoys for each target ranged from 216 to 501 (Table 3-1). Standard bond lengths and angles were used to generate the initial structures. The trans configuration was used for all peptide bonds, and predefined α-helices and β-sheets were assigned ideal torsion (φ, ψ) angles of (−60°, −40°) and (120°, 150°), respectively. For the remaining residues in the loop regions, a range of random torsion angles was used, around −120° for φ and 150° for ψ.

After obtaining these databases from Samudrala and Levitt, distance constraints for each protein were chosen from the native structure (Table 3-1). To develop our search protocol, we investigated the effect of using different types of constraints to distinguish the more native-like decoys from the less native-like ones. We calculated the number of decoys in the database that satisfy particular constraints to identify which distances were common to most decoys and which were

very different. We investigated which constraints (long or short distances) eliminated the most structures by applying the same set of constraints in different orders (randomly, long to short distances, and short to long distances).

3.1.1 The Number of Structures Satisfying Specific Constraints

For each target protein, constraints were chosen by finding the most prevalent residue type and calculating the distances between those residues. For 1bba, Ala was the most common residue, but the number of Ala-Ala distances was too small, so Tyr-Tyr distances were also included in the constraint list. The number of structures in the appropriate database satisfying each constraint was then determined. We began these studies using a constraint distance acceptance range of ±2 Å (we later found this range to be too small when applying the method to our general decoy set). For example, if the distance between the Cα atoms of residues A and B in the native structure is 10 Å, a decoy will satisfy that constraint if it has a distance of 8–12 Å between the Cα atoms of residues A and B. For this test, we were not concerned with the quality of the decoys that satisfy each constraint, only the number. We wanted to find the minimum number of constraints that eliminate all structures in the database except that of the target.

As can be seen in Figure 3-1, care must be taken when choosing constraints. Distances that are present in all of the decoys are not very effective constraints, as they eliminate very few structures. Some of the proteins have more complete data sets than others, having decoys that sample a wider range of structural possibilities. As can be seen in Figure 3-1A, constraints 12, 15, and 16 for 1bba eliminate almost all decoys in the database. These constraints involve an atom in a highly variable loop region. The same is true for constraints 7, 8, and 16 for 1b0n-b in Figure 3-1B.
For 1ctf and 1dtk (Figure 3-1C, D), most of the constraints are satisfied by fewer than half of the decoys in the database.
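The per-constraint tally of Section 3.1.1 (the quantity plotted in Figure 3-1) can be sketched as follows. This is a minimal illustration under stated assumptions: each decoy's Cα–Cα distances are taken to be precomputed and stored in a dictionary keyed by residue pair, and the names `satisfies` and `decoys_per_constraint`, the residue pairs, and all distances are invented for the example.

```python
from typing import Dict, Tuple

Pair = Tuple[int, int]  # (residue number, residue number)


def satisfies(decoy_distance: float, target_distance: float,
              tolerance: float = 2.0) -> bool:
    """True if the decoy distance lies within +/- tolerance of the target."""
    return abs(decoy_distance - target_distance) <= tolerance


def decoys_per_constraint(native: Dict[Pair, float],
                          decoys: Dict[str, Dict[Pair, float]],
                          tolerance: float = 2.0) -> Dict[Pair, int]:
    """Count, for every constraint, how many decoys satisfy it."""
    return {
        pair: sum(satisfies(d[pair], target, tolerance) for d in decoys.values())
        for pair, target in native.items()
    }


# Invented distances in angstroms; a 10 A target accepts 8-12 A at +/-2 A.
native = {(3, 20): 10.0, (5, 30): 25.0}
decoys = {
    "d1": {(3, 20): 8.0, (5, 30): 24.0},
    "d2": {(3, 20): 12.0, (5, 30): 12.0},
    "d3": {(3, 20): 12.5, (5, 30): 40.0},
}
counts = decoys_per_constraint(native, decoys)
print(counts)  # {(3, 20): 2, (5, 30): 1}
```

A constraint whose count is close to the database size eliminates few structures, which is the sense in which distances common to all decoys are ineffective.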

3.1.2 The Effects of Applying Constraints in Different Orders

3.1.2.1 Randomly ordered constraints

For 1bba, all Ala-Ala and Tyr-Tyr distances were selected as constraints. These distances ranged from 3.8 to 30.2 Å. A constraint distance acceptance range of ±2 Å about the constraint distance was used as the acceptance criterion. The same set of constraints was applied in three different, randomly determined orders. In each of the three trials, many structures were eliminated after applying a single constraint (Figure 3-2), and 5–13 constraints were needed to eliminate all but one decoy. The Tyr-Tyr constraints were also found to eliminate more structures in the first step than the Ala-Ala constraints, indicating that the decoys span a wide range of distances at those points. As mentioned previously, some of these constraints involved atoms in a highly mobile or ill-defined loop region. Despite their ability to eliminate many decoys, constraints involving atoms in loop regions may not be the best choices due to their low resolution and high variability between structures that are otherwise quite similar.

3.1.2.2 Same constraints in different order

For each target protein, a list of constraints was chosen and applied in the following orders: (1) long to short distances and (2) short to long distances. For 1b0n-b, all Glu-Glu distances were selected as constraints. While the total set of constraints had distances that ranged from 3.8 to 23.8 Å, the six constraints used in Trial 1 (long to short) ranged from 19.3 to 23.8 Å, while the nineteen constraints in Trial 2 (short to long) ranged from 3.8 to 23.8 Å (Figure 3-3A). For 1ctf, all Val-Val distances were chosen as constraints and ranged from 16.4 to 24.7 Å in Trial 1 and from 4.7 to 11.4 Å in Trial 2 (Figure 3-3B). As seen in Figure 3-3, the number of constraints needed depends on the order of application. In each trial, fewer constraints were needed when the longer distance constraints were applied first.
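The sequential elimination procedure and its dependence on constraint order can be sketched as follows. This is a minimal illustration, not the search code used in this work: the function name `constraints_needed`, the residue pairs, and all distances are invented, and the toy numbers were deliberately chosen so that the longest distance is the most discriminating one; they do not reproduce Figure 3-3.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]  # (residue number, residue number)


def constraints_needed(native: Dict[Pair, float],
                       decoys: Dict[str, Dict[Pair, float]],
                       order: List[Pair],
                       tolerance: float = 2.0) -> Tuple[int, List[str]]:
    """Apply constraints one at a time, discarding decoys that fail each.

    Stops as soon as at most one decoy remains and returns the number of
    constraints applied together with the names of the surviving decoys.
    """
    remaining = list(decoys)
    applied = 0
    for pair in order:
        if len(remaining) <= 1:
            break
        remaining = [name for name in remaining
                     if abs(decoys[name][pair] - native[pair]) <= tolerance]
        applied += 1
    return applied, remaining


# Invented Calpha-Calpha distances in angstroms for three residue pairs.
native = {(2, 3): 4.0, (2, 4): 6.0, (2, 9): 20.0}
decoys = {
    "native": dict(native),
    "d1": {(2, 3): 4.5, (2, 4): 6.5, (2, 9): 9.0},
    "d2": {(2, 3): 3.5, (2, 4): 5.0, (2, 9): 30.0},
    "d3": {(2, 3): 9.0, (2, 4): 6.2, (2, 9): 12.0},
}

long_to_short = sorted(native, key=native.get, reverse=True)
short_to_long = sorted(native, key=native.get)
print(constraints_needed(native, decoys, long_to_short))  # (1, ['native'])
print(constraints_needed(native, decoys, short_to_long))  # (3, ['native'])
```

In this toy set the short distances are shared by most decoys, so they remove little when applied first, whereas the single long distance isolates the native structure immediately.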
Similar results were seen for 1bba and

1dtk. In each case, the final decoy remaining in the database was the native structure of the target and therefore satisfied all constraints.

Applying longer constraints first eliminates decoys with vastly different overall structures compared to the target protein. Shorter constraints eliminate decoys with differing local structure. We have found that the search time is shortened by first removing structures with great overall differences by applying longer constraints and then applying shorter constraints to fine-tune the structure. For 1b0n-b, applying long constraints first (Trial 1, Figure 3-3A) requires only six constraints, far fewer than the nineteen needed when applying short constraints first (Trial 2, Figure 3-3A). Some constraints do not eliminate any additional decoys, resulting in the plateaus seen in Figure 3-3. For 1ctf, applying long constraints first (Trial 1, Figure 3-3B) requires seven constraints, whereas applying short constraints first (Trial 2, Figure 3-3B) requires thirteen. It was also found that changing the constraint distance acceptance range from ±2 Å to ±1 Å did not change the number of distances required to find the correct structure.

3.2 Developing a Search Protocol Using a Structure Known to be in Our Database

Our general decoy set was generated as described in Chapter 2. To test our search procedure using our database, we chose 2ezm [129], Cyanovirin-N (an HIV-inactivating protein). The database contains two decoys generated from the 101-residue 2ezm target protein. Our decoy set is spiked: the correct structure is definitely present because the parent protein met all the requirements to be included in the decoy generation process.

In Trial 1, constraints were chosen with distances ranging from 10.1 to 24.7 Å. They were also selected so that the atoms were within 8–59 residues of each other in the sequence.
For Trial 2, constraints were chosen in the distance range of 5.2 to 17.9 Å. The constraint atoms were also required to be within 3–10 residues of each other, much closer than those used in Trial 1. The constraint distance acceptance range was set to ±2 Å, as in previous trials.

Due to the changes in constraints, Trial 2 needed more than twice the number of constraints needed in Trial 1. From these results we can conclude that it is more efficient to consider longer distances between atoms several residues apart when choosing constraints. In each trial we were able to narrow down our search to the three structures shown in Figure 3-4: 2ezm-1 [129], 2ezn-1 [129], and 1iiy-1 [130]. Their structures are virtually indistinguishable because they are all from the same HIV-inactivating protein. The PDB entries differ as follows: 2ezn represents an ensemble of NMR structures, 2ezm is only the mean NMR structure, and 1iiy is the mean NMR structure with a ligand. When shorter distance constraints were used and restrictions were placed on the number of residues separating the two atoms, the number of constraints needed increased from eight in Trial 1 to twenty in Trial 2.

3.3 Developing a Search Protocol Using a Structure Not in Our Database

Next, a search of the database was performed to find a protein whose structure was not included in developing the database. Target protein 1b4c [131] is a homodimer of S100 beta subunits, each 92 residues in length. It has been classified as a metal-binding protein. Due to the nature of the database, the structure of only one chain was chosen as the search target.

3.3.1 Constraint Distance Acceptance Ranges: ±2 Å and ±4 Å

Our first task was to find upper and lower bounds to use as the constraint distance acceptance range. We started with ±2 Å, as in previous trials (Sections 3.1 and 3.2). Constraint distances were chosen to be between 11.0 and 25.6 Å, with an average distance of 20.6 Å. Seven constraints were required to eliminate all but twenty-five structures. Three decoys satisfied eight constraints and can be found in Figure 3-5. The RMSDs for the decoys satisfying seven and eight constraints are listed in Table 3-2.
The RMSD for the top three structures was 14.4 Å. The parent proteins for these three decoys are all related; 1kv7 [132] and 1n68 [133] are the multi-copper oxidase (CueO), and 1pf3 [133] is the M441L mutant of the same protein. The 1pf3

decoy found in the search is a fragment of the protein that does not contain the mutation. These three decoys are nearly identical because of the redundant nature of the database. The average RMSD for the top twenty-five decoys was found to be 13.3 Å. Applying the final constraint apparently removed some of the better (lower-RMSD) structures. This search was unable to predict the correct secondary structure. The native structure of 1b4c has five helices; this search found decoys with four β-sheets and one short α-helix.

The RMSDs found in Trial 1 indicate that the three decoys remaining in the database are not reliable predictions. Reva et al. [134] showed that a structure with an RMSD of less than 6.0 Å is a successful prediction for small proteins. If the acceptance range is too tight, low-RMSD decoys may not satisfy all constraints. In order to improve our results, the constraint distance acceptance range was increased from ±2 Å to ±4 Å. Thirteen constraints were required to eliminate all but four decoys. The remaining structures can be found in Figure 3-6. The parent proteins of these decoys are all dehydrogenases; 1h0h [135] is a formate dehydrogenase, while 1nek [136] and 1nen [136] are succinate dehydrogenases. Increasing the distance range improved the quality of the final structures. A better prediction of secondary structure is made, as the decoys are found to have four α-helices and only two small β-sheets. The RMSDs of 12.2 and 13.3 Å are slightly better than those obtained with the ±2 Å distance range used in Trial 1, but they are still out of range for this method to be considered a success.

3.3.2 Calculation of All RMSDs

Due to the high RMSD values of the final structures found in previous trials, we calculated the RMSDs for all the decoys using the native structure of 1b4c as the reference to determine whether any good (low-RMSD) decoys existed in our database. The distribution of the RMSDs in Figure 3-7 shows that most of the structures are within 12 Å. We found 353 structures that

have RMSDs less than 7 Å and 85 with RMSDs less than 6 Å. The structures with the best RMSDs can be found in Table 3-3 and are depicted in Figure 3-8.

It was found that the good structures were eliminated during our search procedure because some of the chosen constraint atoms are in loop regions, which differ greatly among the structures. Other distance constraints were between atoms that were shifted slightly in the sequence due to insertions and deletions. Figure 3-9 shows an example of how insertions and deletions can hinder successful predictions. The two proteins in the figure differ only in their loop regions; the α-helical sections are highly conserved, giving rise to a very small RMSD. Using the distance between residues 10 and 13 as a constraint, the present searching method would eliminate the black decoy as an improbable structure because the extra residue in the loop region adds to the length of the distance of interest. To overcome this problem, we increased the acceptable distance range from ±4 Å to ±12 Å.

3.3.3 Constraint Distance Acceptance Range of ±12 Å and ±12/±10 Å

Using constraints similar to those in the previous trials (Sections 3.3.1 and 3.3.2) but with a constraint distance acceptance range of ±12 Å, seventy-five constraints were required to find the top 1,163 decoys. The structures with the seven lowest RMSDs calculated previously were in the final decoy set. Finding the seven lowest-RMSD structures showed our method to have some promise; however, without knowing the structure a priori, it would be extremely difficult to distinguish between the seven good and over one thousand bad decoys. It can therefore be concluded that a constraint distance acceptance range of ±12 Å is too large to adequately eliminate the least likely decoys.

In order to eliminate more structures, a constraint distance acceptance range of ±12 Å was employed for the first twenty-five constraints and ±10 Å was used for the next twenty-five

constraints. After these fifty constraints were applied, only 620 structures remained. The seven lowest-RMSD structures were once again among the remaining decoys. Unfortunately, 620 structures is still a rather large number, and 50 constraints are far too many for this method to be cost-effective.

3.3.4 Block of Distances

Instead of comparing one distance to the native structure at a time, in this trial a block of distances ±2 residues from the distance of interest was compared. For example, if the experimental data indicated that residues 10 and 20 were 15 Å apart, we would calculate all the distances between residues 8, 9, 10, 11, 12 and residues 18, 19, 20, 21, 22. These twenty-five distances make up the block for each constraint. For each block, the distance range, maximum, and minimum were calculated. A decoy satisfied the constraint if the native structure distance was found to be in the distance range (max + 2, min − 2). Two small restrictions were placed on the constraints: (1) the distance constraints ranged from 15.1 to 35.2 Å; (2) the atom pairs were between 9 and 77 residues apart in the sequence.

Twenty-five constraints were required to find the top 943 structures, of which only 79 had RMSDs less than 7.0 Å. Application of 32 constraints resulted in five remaining structures, listed in Table 3-4 with their RMSDs. This method showed some improvement over the method used in Section 3.3.3. Of the final structures remaining in the database, only one was found to have an RMSD greater than the cutoff for it to be considered a successful prediction. The four good structures can be found in Figure 3-8, and the higher-RMSD decoy, from parent protein 1mka [137], can be found in Figure 3-10. The decoy from 1mka shares very few structural similarities with 1b4c. It has two β-sheets as well as two α-helices that do not align well with 1b4c. A large distance range for the constraint acceptance requirements and poorly chosen

constraints may explain why this decoy was not eliminated during the search process. The large distance range also requires too many constraints, making this method computationally expensive.

3.3.5 Vary the Order of Constraint Application

As seen during the initial testing of our method (see Section 3.1), the results of each trial depend greatly on the order of application of the constraints. Several trials using the same constraints in different orders were performed. The constraint distance acceptance range was set to ±5 Å. The first set of constraints used the same order as that used in the previous trial (Section 3.3.4). Eighteen constraints were satisfied by seventeen decoys with RMSDs ranging from 10.8 to 15.2 Å, as listed in Table 3-5. None of the low-RMSD structures were found.

It was discovered that the lowest-RMSD decoy satisfied 24 of the 25 constraints. One of the atoms in the unsatisfied constraint is in the middle of a loop region. It is known that these regions are highly flexible, giving rise to very different conformations, even in otherwise similar proteins. We placed this constraint at the end of the list and performed the trial again. Seven decoys were found to remain after 21 constraints. The RMSDs of the top seven decoys can be found in Table 3-5. This method was able to find two of the best decoys, but it found five high-RMSD decoys as well. A final trial was performed using another order of the constraints. Twenty-one constraints were required to eliminate all but six decoys. The same two low-RMSD decoys were found as in the previous order. The final trial found four high-RMSD decoys, which were different from those found earlier.

Initially it was assumed that upon varying the order of the application of the constraints, the low-RMSD decoys would remain in the database more often than the less probable ones. As long as the bad constraint was placed at the end of the list, the two lowest-RMSD decoys were

always found. Because one would not know a priori whether a bad constraint was being used, this method is not as effective as we would like it to be.

3.3.6 Count the Number of Satisfied Constraints for Each Decoy

In order to remove the dependence on the order of constraint application, we counted the number of constraints each decoy satisfied. We assumed the decoys that satisfied the most constraints would have the lowest RMSDs. We performed a trial using the same constraints as those used in the previous trial with a distance range of ±5 Å. It was found that four decoys satisfy twenty-five constraints. The RMSDs of these structures are found in Table 3-6. Two of the lowest-RMSD decoys were found along with two rather high-RMSD decoys.

A slightly different set of constraints was selected that included only distances between atoms that are involved in secondary structure, not the loop regions. The distances were chosen to be between 11.9 and 26.9 Å. Four decoys were found to satisfy these twenty-five constraints. They have the lowest RMSDs in the database (Table 3-6, Trial 2). As seen with the previous set of constraints, half of the decoys in the database satisfy ~11 constraints. These data are shown in Figure 3-11 and are remarkably similar to those obtained using the other set of distance constraints.

3.4 Determination of an Average RMSD Distribution

Because the four target proteins had similar RMSD distributions, we wanted to determine a random average RMSD, that is, the RMSD of two randomly chosen structures in the decoy set. We calculated the RMSD of each decoy in the database using other decoys as references. The five reference decoys can be found in Figure 3-12. As explained in Chapter 2, the decoys are 100-residue-long fragments of larger proteins.
The decoys in Figures 3-12A, 3-12C, and 3-12E, from parent proteins 1b7u [138], 1ujn [139], and 1rt6 [140], have both α-helices and β-sheets, while Figure 3-12B (from parent protein 1fhx [141]) represents an all-β-sheet protein and Figure 3-12D (from

parent protein 2wrp [142]) contains only α-helices. Most of these decoys are folded rather tightly and resemble small proteins. The decoy from 1rt6 (Figure 3-12E), however, is a fragment of a very large multi-domain protein, HIV-1 reverse transcriptase. This particular decoy contains residues in two domains even though they are connected through the same chain. Decoys like this one can account for some of the poor RMSD values calculated with other references. For 1rt6-109, the RMSDs are shifted to the right, with average values between 15 and 25 Å (Figure 3-13), indicating that most decoys are less similar to it than to the more compactly folded structures. Our database contains some less compact, semi-folded decoys, and a search for such a protein may result in finding a reliable decoy where searches of other databases may not.

3.5 Summary of Methods

Constraint ranges of ±2 Å and ±4 Å were found to be too small to obtain good results, while a constraint range of ±12 Å is far too large. The block method was able to find the lowest-RMSD structures, but it required too much computer time and too many constraints to do so. The method of counting the number of constraints that each decoy satisfies has yielded the best results thus far. Target 1b4c was studied previously using a de novo protein structure prediction algorithm which employed Rosetta [143]. The 3.6 Å RMSD of our best decoy was slightly better than that of their best-scoring cluster, which had an RMSD of 4.6 Å. We will further discuss the application of this method to other proteins.
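The counting score that gave the best results can be sketched as follows. This is a minimal illustration under stated assumptions, not the program used in this work: the names `ca_distance`, `count_score`, and `rank_decoys`, the constraint tuple layout, and the toy coordinates are all hypothetical. Constraints are simulated from a toy "native" chain, as was done with the native structures during testing.

```python
import math
from typing import Dict, List, Sequence, Tuple

Coord = Tuple[float, float, float]
# (residue i, residue j, target distance in angstroms), residues 1-based
Constraint = Tuple[int, int, float]


def ca_distance(coords: Sequence[Coord], i: int, j: int) -> float:
    """Distance between the alpha-carbons of residues i and j."""
    return math.dist(coords[i - 1], coords[j - 1])


def count_score(coords: Sequence[Coord], constraints: Sequence[Constraint],
                tolerance: float = 5.0) -> int:
    """Number of constraints a decoy satisfies within +/- tolerance."""
    return sum(1 for i, j, target in constraints
               if abs(ca_distance(coords, i, j) - target) <= tolerance)


def rank_decoys(decoys: Dict[str, Sequence[Coord]],
                constraints: Sequence[Constraint],
                tolerance: float = 5.0) -> List[Tuple[int, str]]:
    """Sort decoys by score, highest first (ties broken by name)."""
    scored = [(count_score(c, constraints, tolerance), name)
              for name, c in decoys.items()]
    return sorted(scored, key=lambda item: (-item[0], item[1]))


# Toy 10-residue chains on a line; 3.8 A spacing mimics consecutive Calphas.
native = [(3.8 * k, 0.0, 0.0) for k in range(10)]
constraints = [(i, j, ca_distance(native, i, j))
               for i, j in [(1, 10), (1, 5), (3, 8)]]
decoys = {
    "extended": list(native),
    "compact": [(1.0 * k, 0.0, 0.0) for k in range(10)],
}
ranking = rank_decoys(decoys, constraints)
print(ranking)  # [(3, 'extended'), (0, 'compact')]
```

Because the score is a sum over all constraints, it does not depend on the order in which they are evaluated, and a single poorly chosen constraint changes a decoy's score by at most one.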

Table 3-1. Comparison of input for the four target proteins

Target    Number of decoys in database    Range of distance constraints (Å)    Number of residues in sequence
1bba      501                             3.8–30.2                             36
1b0n-b    498                             3.8–23.8                             31
1ctf      498                             4.7–24.7                             78
1dtk      216                             5.3–27.4                             57

Table 3-2. RMSDs for decoys satisfying the most constraints

Decoy       RMSD (Å)    Decoy       RMSD (Å)
1z7q-n      10.7        1lk7-473    14.1
1ivr        11.9        1lk5-473    14.1
1gtm-c      11.9        1lk7-702    14.1
1qmy        11.9        1lk5-15     14.1
1qol-a      12.0        1lk5-244    14.1
1qol-b      12.0        1lk5-702    14.1
1qol-e      12.0        1lk7-15     14.1
1qol-f      12.0        1kv7-1      14.4
1qol-h      12.0        1n68-1      14.4
1t3q-c      13.8        1pf3-1      14.4
1t3q-f      13.9        1khv-1      14.4
1s18        14.0        1khw-1      14.5
1lk7-244    14.1

A constraint distance acceptance range of ±2 Å was used.

Table 3-3. Lowest RMSD decoys in database using 1b4c as a reference

Decoy       RMSD (Å)
1m31-b-2    3.6
1m31-a-2    3.6
1m31-a-1    4.8
1m31-b-1    4.8
1nsh-b-2    4.9
1nsh-a-2    4.9
1wlm-7      5.9
1wlm-8      5.4
1wlm-9      5.1
1wlm-10     5.6
1psr-a-1    5.3
1psr-b-1    5.3

Table 3-4. Decoys remaining after 32 constraints using the block method

Decoy      RMSD (Å)
1m31-a-2   3.6
1m31-b-2   3.6
1mka-49    10.2
1psr-1     5.3
1psr-1     5.3

Table 3-5. Lowest RMSD decoys found in varying the order of constraint application

Trial 1                  Trial 2                  Trial 3
Decoy        RMSD (Å)    Decoy        RMSD (Å)    Decoy        RMSD (Å)
1agr-3-8     11.3        1m31-a-2     3.6         1m31-a-2     3.6
1jr4-1-96    15.1        1m31-b-2     3.6         1m31-b-2     3.6
1rif-4       12.9        1nzc-4-95    12.7        1vgw-a-2-1   15.4
1vid-1-95    15.2        1f8x-1-11    13.2        1vgw-d-2-1   15.4
2a72-1-2     11.8        1f8x-2-11    13.2        1vgw-e-2-1   15.5
2a72-2-2     11.8        1f8y-1-11    12.9        1vgz-4-1     15.4
2af0-1-22    12.0        1f8y-2-11    13.1
2bt2-1-16    11.0
2bt2-2-16    11.0
2bt2-3-17    11.0
2bt2-4-15    11.1
2bt2-5-17    10.9
2bv1-1-11    11.2
2bv1-2-10    11.3
1ezt-1-8     10.8        1fqk-2-12    14.5        1h1d-1-106   15.1

Table 3-6. Lowest RMSD decoys found using the count method for both trials

Trial 1                  Trial 2
Decoy        RMSD (Å)    Decoy        RMSD (Å)
1m31-a-2     3.6         1m31-a-1     4.8
1m31-b-2     3.6         1m31-a-2     3.6
1hz4-141     11.0        1m31-b-1     4.8
1hz4-142     11.3        1m31-b-2     3.6

Trial 1 uses the original set of constraints. In Trial 2, the constraint involving atoms in the loop region is replaced by one between atoms in defined areas of secondary structure.

Figure 3-1. The results of counting the number of decoys that satisfy each constraint. Constraints are numbered from shortest to longest distance. A) 1bba; bars in pink correspond to Tyr-Tyr constraints. B) 1b0n-b. C) 1ctf. D) 1dtk. Constraints were selected between the most prevalent residue type for each target.

Figure 3-2. Application of randomly ordered constraints for 1bba. The three trials used the same constraints in different, random orders.

Figure 3-3. Results for using the same set of constraints in different orders. In trial 1, larger constraints were applied first and then smaller constraints until only the target structure remained in the database. In trial 2, constraints were applied from small distances to larger distances. A) 1b0n-b. B) 1ctf. In each case, the final structure remaining in the database was the native structure of the target protein and therefore satisfied all constraints.

Figure 3-4. The superimposed images of the results of the 2ezm search. The top scoring decoys are 2ezm-1, 2ezn-1, and 1iiy-1. All three PDB codes represent the same protein. 1iiy contains a ligand which was not included in the decoys.

Figure 3-5. Results from Trial 1. A) 1b4c. B) The final three remaining decoys after eight constraints with a ±2 Å distance range: 1kv7-1, 1n68-1, and 1pf3-1.

Figure 3-6. The target protein and the final four remaining decoys after 13 constraints with a ±4 Å distance range. A) The native structure of 1b4c. B) 1h0h-a-334 and 1h0h-k-334. Each had an RMSD of 12.2 Å. C) 1nek-a-246 and 1nen-a-246. Each had an RMSD of 13.3 Å. 1b4c is represented using a slightly different orientation than that in Figure 3-5.

Figure 3-7. Histogram of RMSDs for all decoys in the database using 1b4c as a reference. The histogram of RMSDs for all decoys with RMSDs less than 7.0 Å is also included.

Figure 3-8. Decoys with the lowest RMSDs in database using 1b4c as a reference. A) 1b4c. B) 1m31-a-2. C) 1nsh-a-2. D) 1wlm-9. E) 1psr-a-1.

Figure 3-9. Schematic diagram of how an insertion in a loop region can affect the search process. The red structure represents the native structure of our example target protein and the black structure represents a decoy.

Figure 3-10. Decoy 1mka-49 (shown in yellow) satisfied many constraints for the 1b4c target (shown in blue).

Figure 3-11. Graph of the number of decoys vs. the number of constraints each decoy satisfies for both trials. Fifty percent of the decoys satisfy 11 constraints.

Figure 3-12. Five decoys used to determine a random average RMSD for our decoy database. A) 1b7u-109. B) 1fhx-13. C) 1ujn-75. D) 2wrp-15. E) 1rt6-109.

Figure 3-13. Histograms of RMSDs for five randomly chosen decoys: 1b7u, 1fxh, 1rt6, 1ujn, and 2wrp.

CHAPTER 4
RESULTS: USING OUR DECOY SET TO FIND FOUR PROTEINS

We attempted to find the structures of four proteins using our database. The target proteins were PDB codes 1b4c [131], 1ghh [144], 1ubi [145], and 2ezk [146]. We chose these specific proteins because they were previously used to evaluate other methods [143,147]. All target proteins contain fewer than 100 residues and are therefore not explicitly included in the decoy set. Twenty-five distance constraints were chosen for each protein using the secondary structure prediction method JPred [115] (to avoid the loop regions). We first evaluate the decoy generation method and then analyze the decoy discrimination process.

4.1 Completeness of Decoy Set

Because we know the structures of the target proteins a priori, we can evaluate our decoy set by calculating RMSDs for all of the decoys using each target protein as a reference. The RMSD distributions (Figure 4-1) are similar for each of the targets and show that most of the decoys have RMSDs within 12–20 Å. We also targeted another five proteins and found a similar distribution. Because it is commonly assumed that a good structure prediction for a small protein is one with an RMSD lower than 6.0 Å [134], it will be difficult to find the few good decoys in the set.

Because the distributions are skewed Gaussian, only a few decoys are expected to have RMSDs under 6.0 Å. Assuming a perfect Gaussian distribution and using the standard deviation and mean RMSD for each target, we calculated the number of decoys expected to have RMSDs under the following cutoffs: 6, 7, 8, and 9 Å (Table 4-1). Comparing the number extrapolated from a perfect Gaussian to the number of decoys found within each RMSD cutoff in our decoy set, we find that the decoy set number is consistently much lower. Low RMSD decoys are therefore harder to find than they would be if the distribution were perfectly Gaussian.
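The Gaussian extrapolation described in Section 4.1 can be sketched as follows. This is an illustrative reconstruction, not the calculation actually used: the function names are hypothetical, and the mean and standard deviation in the example are placeholder values, since the per-target values are not listed in the text. Only the total of 8,060,245 decoys is taken from Table 4-1.

```python
import math


def normal_cdf(x: float, mean: float, std: float) -> float:
    """Cumulative probability of a normal distribution evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def expected_below(cutoff: float, mean: float, std: float, n_decoys: int) -> float:
    """Number of decoys expected below an RMSD cutoff if RMSDs were exactly Gaussian."""
    return n_decoys * normal_cdf(cutoff, mean, std)


# Placeholder distribution parameters (assumed for illustration only).
MEAN_RMSD = 16.0    # angstroms
STD_RMSD = 3.0      # angstroms
N_DECOYS = 8_060_245  # size of the decoy set reported in Table 4-1

for cutoff in (6.0, 7.0, 8.0, 9.0):
    expected = expected_below(cutoff, MEAN_RMSD, STD_RMSD, N_DECOYS)
    print(f"under {cutoff:.0f} A: about {expected:.0f} decoys expected")
```

Comparing such expected counts with the counts actually observed in the decoy set gives the "real" versus "hypothetical" columns of a table like Table 4-1.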

4.2 Evaluation of Decoy Discrimination

4.2.1 Target 1b4c [131], Apo-S100β

Our first target protein is 1b4c (Figure 4-2), a homodimer of S100 beta subunits, each 92 residues in length. It has been classified as a metal-binding protein. Due to the nature of the database, the structure of only one chain was chosen as the target. Using 1b4c as a reference, we found 85 decoys with RMSDs less than 6 Å. Like 1b4c, 1m31 (apo-Mts1) [148] and 1nsh (apo-S100A11) [149] are both metal-binding proteins. PDB code 1psr [150] is the psoriasin protein, while 1wlm [151] is CGI-38 and currently has no known classification.

Four decoys were found to satisfy twenty-five constraints, which ranged in distance from 11.9 to 26.9 Å. They have the lowest RMSDs in the database using 1b4c as a reference (3.6 and 4.8 Å) and are all from the same parent protein, 1m31 (Figure 4-2). Hypotheses 1 and 2 were satisfied: a low RMSD decoy was in the database, and this decoy shared a small set of similar distances with the target. Target 1b4c was studied by Meiler and Baker using a de novo protein structure prediction algorithm which employed Rosetta [143]. The 3.6 Å RMSD of our best decoy was slightly better than their best-scoring cluster, which had an RMSD of 4.6 Å.

4.2.2 Target 1ghh [144], DNA-Damage-Inducible Protein I (DinI)

Our next target protein, 1ghh, is composed of 81 residues and is shown in Figure 4-3. Of the 8 million decoys in the database, 85 had RMSDs less than 6 Å with 1ghh as a reference. The RMSD distribution was very similar to that seen for 1b4c (Figure 4-1). The distance constraints ranged from 11.4 to 21.1 Å. Our method successfully identified the lowest RMSD decoys in the database. Eight decoys satisfied all twenty-five constraints; their structures and RMSDs are shown in Figure 4-3.

Because no attempt has been made to remove redundant structures from the database, some of the top scoring RMSD decoys come from the same parent proteins with different PDB codes.

For example, 1iwg [152], 1oy6 [153], and 1t9u [154] represent acriflavine resistance protein B. ISHp608 transposase is represented by 2a6m and 2a6o [155]. PDB code 1vh2 [156] is the autoinducer-2 synthesis protein.

A sequence homology search using BLAST [33] was unable to find any structures similar to 1ghh in the PDB. Our method has an advantage in that we are able to generate low RMSD structures with little sequence homology. Often structural relationships are more conserved than sequence [1,2,157,158]. We found decoys with RMSDs as low as 4.9 Å, which was very similar to the 4.8 Å RMSD value found by Meiler and Baker [143] for this protein.

The eight top scoring decoys each have three β-sheets and at least two α-helical regions. Despite the low RMSD values, the target protein has a pair of parallel β-sheets and a pair of anti-parallel β-sheets (see Figure 4-3), while the decoys have only anti-parallel β-sheet orientations. In all of these structures, the β-sheets have, as usual, distances of ~5 Å between α-carbons on adjacent strands. The small distance between the β-strands allows for a low RMSD between the overall structures despite an incorrect topology. The study of this protein indicates that proteins with β-sheets may have low RMSD (~5–6 Å) decoys with incorrect topology. Our preliminary results on other proteins also show low RMSD decoys with various β-sheet orientations. For these types of proteins, RMSD alone may not be a useful indicator of a successful prediction.

4.2.3 Target 1ubi [145], Ubiquitin

PDB code 1ubi represents the well-studied ubiquitin protein (Figure 4-4). It is composed of 76 residues. The RMSDs of all the decoys were calculated using 1ubi as a reference, and seven decoys were found to have RMSDs less than 6.0 Å.

The chosen distance constraints ranged from 7.3 to 19.5 Å. Two decoys from 1z2m [159] satisfied twenty-five constraints and are shown in Figure 4-4. Parent protein 1z2m is an interferon-induced ubiquitin-like protein and is therefore, not surprisingly, similar to our target. The RMSDs for both decoys were 3.9 Å, the lowest RMSD decoys in the database using 1ubi as a reference. This RMSD value was similar to that of the top-scoring cluster found using Rosetta [143], 3.4 Å.

4.2.4 Target 2ezk [146], Mu End DNA-Binding Ibeta Subdomain of Phage Mu Transposase

Target 2ezk has 93 residues. It was selected as our final target protein because Kihara et al. [147] used it to test their method and had some difficulty finding a low RMSD model. A BLAST search of this target found one other protein with sequence homology, 2ezl [146]. With only 93 residues, 2ezl was too small to be included in our database. RMSDs for all the decoys in the database were calculated, and 41 decoys had RMSDs between 7.7 and 8.0 Å. No decoy in our database had an RMSD less than 6.0 Å; our database does not contain a good decoy. This 93-residue segment is not similar to any piece of a larger protein.

The distance constraints ranged from 12.0 to 18.5 Å. Nine decoys satisfied twenty-five constraints. Six decoys came from parent protein 1ngk [160], while 1v2a [161] was the parent protein for three decoys (Figure 4-5). The parent proteins seem to have functions unrelated to that of the target. Mycobacterium tuberculosis Hemoglobin O (1ngk) has possible functions in oxygen storage and transport, while 1v2a is a glutathione transferase isoenzyme. All of the decoys from 1v2a had RMSDs of 7.7 Å using the target as a reference, while the decoys from 1ngk had RMSDs of 11.6–11.7 Å.

4.2.5 Comparison of Search Process for All Target Proteins

For each target protein, half of the decoys satisfied at least 10 to 12 constraints (Figure 4-6A). All the search proteins show a similar Gaussian distribution of decoy scores (Figure 4-6B).

Most decoys satisfy at least one constraint, but very few satisfy all twenty-five. The RMSD distribution for each protein (Figure 4-1) is similar in shape to Figure 4-6B, suggesting a relationship between the score (the number of satisfied constraints) and the RMSD. The low RMSD structures satisfy more constraints than those with high RMSDs. The strong correlation between RMSD and score is seen more clearly in Figure 4-7: low RMSD decoys have high scores, and high RMSD decoys have low scores. Also, the average RMSD decreases with an increase in score. As seen in Figure 4-7, three of the target proteins have low RMSD decoys (< 6 Å) that satisfy all constraints. In general, decoys with scores between 10 and 15 have RMSDs between 15 and 20 Å, while decoys with a score of less than 10 have RMSDs greater than 25 Å.

For each target protein, there are a few decoy structures that have high RMSDs and high scores. These decoys generally span more than one domain, giving them an unfolded and non-protein-like appearance. Often, one section of the decoy is similar in structure to a target protein, thereby satisfying several constraints, while the large RMSD comes from the second section of the protein being so far from the first. In the PDB, multi-domain proteins are occasionally poorly labeled. For example, in 1xi5 [162], residues 838 and 839 are nearly 152 Å apart. Some of the high RMSD, high scoring decoys in this study came from parent proteins 1xi5 and 1xi4 [163].

4.3 Conclusions

We found that it is possible to search our decoy database using distance constraints to find reasonably accurate protein models with RMSDs less than 6 Å. A distance range of ±5 Å as the constraint acceptance criterion yields the best results. To avoid dependence on the order of application of constraints, we counted the total number of constraints that each decoy satisfied. Decoys that satisfied the most constraints systematically had the lowest RMSDs.
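The relationship between score and RMSD discussed in Section 4.2.5 can be examined by averaging the RMSD of all decoys that share a score. The sketch below is one possible way to do this; the function name and the (score, RMSD) record format are assumptions, and the numbers are illustrative rather than data from this study.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def mean_rmsd_by_score(records: Iterable[Tuple[int, float]]) -> Dict[int, float]:
    """Group (score, rmsd) records by score and return the mean RMSD per score."""
    totals = defaultdict(lambda: [0.0, 0])  # score -> [sum of RMSDs, decoy count]
    for score, rmsd in records:
        totals[score][0] += rmsd
        totals[score][1] += 1
    return {score: total / count
            for score, (total, count) in sorted(totals.items())}


# Illustrative records only: (number of satisfied constraints, RMSD in angstroms).
example_records = [(25, 4.0), (25, 6.0), (12, 16.0), (12, 18.0), (3, 26.0)]
averages = mean_rmsd_by_score(example_records)
for score, mean_rmsd in averages.items():
    print(f"score {score:2d}: mean RMSD {mean_rmsd:.1f} A")
```

A mean RMSD that falls steadily as the score rises is the behavior expected when the constraint count is a useful discriminator.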
Our final results showed that 3 of the 4 target proteins had RMSDs less than 5 Å, as summarized in Table 4-2. Even low resolution structures have been found to give insight into

the function of proteins [105]. Structures of this resolution can also be used as starting points in density generation for X-ray structures [111]. In each of these trials, twenty-five constraints were needed to eliminate all but a few representative structures. More studies must be performed before we can state with confidence that this number accurately represents the amount of distance information needed to determine structure. We also analyzed the RMSDs for several proteins and found that in general the average RMSD range for decoys in our database is ~15 Å. Like the PDB, our database contains many semi-redundant structures. Removal of such decoys may further decrease the search time of an already fast screening process.

Table 4-1. The number of decoys with RMSDs under each threshold

          Under 6 Å            Under 7 Å            Under 8 Å            Under 9 Å
Target    real   hypothetical  real   hypothetical  real   hypothetical  real   hypothetical
1b4c      85     896           354    3,958         1,774  14,766        5,983  46,528
1ghh      43     2,885         208    10,082        1,088  30,486        7,308  79,759
1ubi      7      4,540         12     15,001        264    42,939        3,797  106,478
2ezk      0      876           0      3,690         41     13,276        2,182  40,787

There are 8,060,245 decoys in the set.

Table 4-2. Summary table of results

Target   Parent protein of found decoy   RMSD (Å)
1b4c     1m31                            3.6
1ghh     2a6m                            4.9
1ubi     1z2m                            3.9
2ezk     1v2a                            7.7

Figure 4-1. RMSD histograms for all studied proteins: 1ghh, 1ubi, 2ezk, and 1b4c.

Figure 4-2. Decoys with the lowest RMSDs in database using 1b4c as a reference. Parent protein 1m31 has two chains, a and b. The first two decoys from each chain are the top scoring decoys with low RMSDs. A) 1b4c. B) 1m31-a-2 and 1m31-b-2. Each decoy had an RMSD of 3.6 Å. C) 1m31-a-1 and 1m31-b-1. Each decoy had an RMSD of 4.8 Å.

Figure 4-3. Target 1ghh and top scoring decoys. Decoys from the same parent proteins are shown together. A) Target 1ghh. B) 2a6m-1-25 and 2a6m-2-26, each with an RMSD of 4.9 Å. C) 2a6o-1-25 and 2a6o-2-25, each with an RMSD of 4.9 Å. D) 1oy6-1-45 has an RMSD of 5.3 Å. E) 1t9u-1-45 has an RMSD of 5.2 Å. F) 1vh2-1-42 has an RMSD of 5.1 Å. G) 1iwg-1-45 has an RMSD of 5.3 Å.

Figure 4-4. Target 1ubi and top scoring decoys. A) 1ubi. B) 1z2m-1 and 1z2m-2 both have an RMSD of 3.9 Å.

Figure 4-5. Target 2ezk and top scoring decoys. A) Three decoys from 1v2a (1v2a-a-75, 1v2a-b-75, 1v2a-c-75) and B) six decoys from 1ngk (1ngk-e-15, 1ngk-h-15, 1ngk-i-15, 1ngk-j-15, 1ngk-k-15, 1ngk-l-15) satisfied all constraints. The RMSD for decoys from 1v2a was 7.7 Å, while those for 1ngk ranged from 11.6 to 11.7 Å.

Figure 4-6. Analysis of the scoring procedure. A) The y-axis represents the percent of decoys satisfying at least a certain number of constraints. For example, 100% of decoys satisfy 0 or more constraints, while fifty percent of the decoys satisfy at least 10–12 constraints. B) The y-axis represents the percent of decoys satisfying the exact number of constraints. Very few decoys satisfy exactly 0 or 25 constraints.

Figure 4-7. The relationship between RMSD and score. A) 1b4c. B) 1ghh. C) 1ubi. D) 2ezk. In general, the low RMSD structures have high scores and the high RMSD structures have low scores.

CHAPTER 5
RESULTS: USING SPECIFIC DECOY SETS TO FIND FOUR PROTEINS

To use our decoy discrimination procedure with Rosetta-generated decoy structures, two parameters were optimized: (1) the number of decoys in the data set and (2) the constraint distance acceptance range. Once optimized, these parameters determined the number of distance constraints needed in the search process. We will first explain how the parameters were optimized and then discuss the search results using the optimized parameters for four proteins: 1b4c [131], 1ghh [144], 1ubi [145], and 2ezk [146].

5.1 Parameter Optimizations

5.1.1 Decoy Set Size

For each target protein, three sets were generated containing 1,000, 10,000, and 50,000 decoys. RMSDs were then calculated using the target protein as a reference. For a given protein, the RMSD distribution is relatively constant regardless of the number of decoys generated, as shown for 1b4c in Figure 5-1. The distribution, however, varies from protein to protein.

Analysis of the RMSD ranges in Table 5-1 reveals that slightly better decoys (lower RMSD structures) are generated in the 10,000 decoy set than in the 1,000 decoy set. Increasing the set to include 50,000 decoys, however, does not improve the quality of the decoys enough to justify the extra computational cost associated with their generation. In all cases, increasing the set size generates slightly worse structures (ones with higher RMSDs) as well.

For 1b4c, more than 40% of the decoys have low RMSDs (less than 6.0 Å). Increasing the set size from 1,000 to 10,000 decoys generates a slightly lower RMSD structure than the best decoy in the 1,000 set. In the largest decoy set, both the lowest and highest RMSD decoys can be found. More than a third of the decoys for 1ubi have low RMSDs in each of the three

different size sets. As seen for 1b4c, an increase in set size for 1ubi generates both lower and higher RMSD structures. The RMSD distributions for 1ghh and 2ezk are broad; only about 10% of the decoys have RMSDs less than 6.0 Å. For 1ghh, there is a slightly better RMSD structure found in the 10,000 decoy set than in the 1,000 set, while the 10,000 and 50,000 decoy sets have the same RMSD range. For 2ezk, there is no improvement in low RMSD structures on increasing the set size from 10,000 to 50,000. This increase does, however, generate decoys with slightly higher RMSDs.

For all of the target proteins, Rosetta generates structures with RMSDs of 3.6 Å or lower. We chose a decoy set size of 10,000 as a balance between cost and performance because it generates low RMSD structures relatively quickly. The lowest RMSD structure for each protein generated in the 10,000 decoy set is shown in Figure 5-2, superimposed on the native structure of the target protein.

5.1.2 Constraint Distance Acceptance Range

The upper and lower bounds placed on a distance constraint make up the constraint distance acceptance range. Such a range is needed to properly simulate experimental conditions in which the measured distances are not exact. A decoy with a calculated distance within the acceptance range satisfies the constraint. We tested constraint distance acceptance ranges of ±1, ±3, and ±5 Å for sets of twelve and twenty-five constraints.

5.1.2.1 Twelve constraints

Using the present constraint selection procedure, which incorporates information from a secondary structure prediction method, we chose a set of twelve distances from the native structures of each target protein. Each constraint met the following criteria: it involved atoms in defined regions of secondary structure, and its length was between 5 and 30 Å.
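The two selection criteria stated above (both atoms in defined secondary structure, and a length between 5 and 30 Å) can be expressed as a simple filter over residue pairs. The sketch below only enumerates candidate constraints; how the final twelve or twenty-five were picked from such candidates is not specified here. The function name, the use of α-carbon coordinates, and the one-letter secondary structure string ("H" helix, "E" strand, "-" undefined) are assumptions for illustration and are not meant to reproduce JPred's actual output format.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def candidate_constraints(ca_coords: Sequence[Coord],
                          sec_struct: str,
                          d_min: float = 5.0,
                          d_max: float = 30.0) -> List[Tuple[int, int, float]]:
    """List residue pairs (1-based) that meet both selection criteria.

    A pair qualifies when both residues lie in a defined secondary structure
    element and their alpha-carbon distance is between d_min and d_max.
    """
    candidates = []
    for i, j in combinations(range(len(ca_coords)), 2):
        if sec_struct[i] == "-" or sec_struct[j] == "-":
            continue  # skip residues in loops or undefined regions
        distance = math.dist(ca_coords[i], ca_coords[j])
        if d_min <= distance <= d_max:
            candidates.append((i + 1, j + 1, distance))
    return candidates


# Toy coordinates along a line (angstroms); not taken from any real structure.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
              (11.4, 0.0, 0.0), (40.0, 0.0, 0.0)]
toy_sec_struct = "HH-HH"
toy_candidates = candidate_constraints(toy_coords, toy_sec_struct)
for res_i, res_j, dist in toy_candidates:
    print(f"residues {res_i}-{res_j}: {dist:.1f} A")
```

Excluding loop residues in this way reflects the concern that loop insertions or deletions in a decoy can shift distances even when the overall fold is correct.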

A constraint distance acceptance range of ±1 Å is too tight for most of the target proteins. Fifty percent of decoys for 1b4c satisfy only four constraints (Figure 5-3A). The highest scoring decoy satisfied eleven constraints and had an RMSD of 5.1 Å (Table 5-2). For 1ghh, fifty percent of decoys satisfied six constraints, and the highest scoring decoy (score of eleven) had an RMSD of 5.4 Å. The third protein, 1ubi, had two decoys with a score of twelve; their RMSDs were 3.0 and 4.1 Å. Although not the lowest in the database, the top-scoring decoys for 1b4c, 1ghh, and 1ubi had RMSDs in the range for good predictions (< 6.0 Å). Those for 2ezk, however, did not. The three top scoring decoys had a score of eleven and RMSDs ranging from 8.1 to 9.8 Å. An acceptance range of ±1 Å is far too restrictive for 2ezk, eliminating low RMSD structures. Because this constraint distance acceptance range failed to assign the highest scores to the lowest RMSD decoys, an increase in the acceptance range was necessary.

Increasing the constraint distance acceptance range from ±1 Å to ±3 Å shows some slight improvement in structure prediction; all four target proteins had a top scoring decoy with an RMSD of 4.6 Å or lower (see Table 5-2). Fifty percent of decoys satisfied nine to eleven constraints. Use of this range, however, results in too many high RMSD decoys satisfying all of the constraints. The top scoring decoys had high RMSDs ranging from 7.9 Å for 1b4c to 13.3 Å for 1ubi (Table 5-2). The total number of top scoring decoys is also higher than when using the lower acceptance range. More constraints must be used in order to employ a constraint distance acceptance range of ±3 Å.

Using a constraint distance acceptance range of ±5 Å, at least one of the top scoring decoys for 1ghh, 1ubi, and 2ezk also had the lowest RMSD in the set. This constraint range, however, has the same drawbacks as the ±3 Å range; 1ghh has nearly one thousand decoys that

satisfy twelve constraints, while 1ubi has almost four thousand. Although 1ubi has over three thousand decoys with RMSDs under 6 Å, many of the top scoring decoys have larger RMSDs, as high as 14 Å. Using only twelve constraints in the search procedure does not adequately distinguish the good decoys from the bad. More constraints must be used.

5.1.2.2 Twenty-five constraints

A set of twenty-five constraints was chosen, twelve of which were taken from the previous constraint set. As was seen for twelve constraints, an acceptance range of ±1 Å is very tight: fifty percent of decoys satisfy ~6 constraints for 1b4c and ~10–12 constraints for the other three target proteins (Figure 5-3B). None of the target proteins had a decoy that satisfied all twenty-five constraints using an acceptance range of ±1 Å. Three of the four target proteins had top scoring decoys with RMSDs under 6 Å (Table 5-2), but the lowest RMSD decoy in each set was not assigned the highest score. Only 1ghh had top scoring decoys outside the range for reliable predictions.

Increasing the acceptance range from ±1 Å to ±3 Å improves predictions for 1ghh and 1ubi, but the RMSDs of the top scoring decoys are higher for 1b4c. Both low (4.3 Å) and high (8.2 Å) RMSD decoys for 2ezk satisfy all twenty-five constraints.

For each of the target proteins, fifty percent of decoys satisfy 16–18 constraints using an acceptance range of ±3 Å and 21–23 constraints for an acceptance range of ±5 Å. For the latter acceptance range, all target proteins had at least one top scoring decoy with a low RMSD (< 6.0 Å). For 1ghh, 1ubi, and 2ezk, the lowest RMSD decoy in the set had a score of twenty-five. For 1b4c, an acceptance range of ±5 Å had a top scoring decoy with the lowest RMSD when compared to the other acceptance ranges. In summary, using twenty-five constraints and a constraint distance acceptance range of ±5 Å works best for this type of decoy set.
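The comparison of acceptance ranges in Section 5.1.2 amounts to rescoring the same decoy set at several tolerances and recording, for each one, the top score, how many decoys reach it, and the RMSD range of those decoys (the quantities tabulated in Table 5-2). A minimal sketch under assumed data structures is shown below; the names and toy values are hypothetical.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]
Decoy = Tuple[str, float, Dict[Pair, float]]  # (name, RMSD to native, decoy distances)


def summarize_top_scorers(decoys: List[Decoy],
                          constraints: Dict[Pair, float],
                          tolerance: float) -> Tuple[int, int, float, float]:
    """Return (top score, number of top scorers, min RMSD, max RMSD) for one tolerance."""
    scored = []
    for _name, rmsd, distances in decoys:
        score = sum(1 for pair, target in constraints.items()
                    if abs(distances[pair] - target) <= tolerance)
        scored.append((score, rmsd))
    top_score = max(score for score, _ in scored)
    top_rmsds = [rmsd for score, rmsd in scored if score == top_score]
    return top_score, len(top_rmsds), min(top_rmsds), max(top_rmsds)


# Toy data for illustration; distances and RMSDs are in angstroms.
toy_constraints = {(1, 10): 10.0, (5, 20): 20.0}
toy_decoys = [
    ("near", 3.0, {(1, 10): 10.5, (5, 20): 20.5}),
    ("mid", 8.0, {(1, 10): 12.0, (5, 20): 22.0}),
    ("far", 15.0, {(1, 10): 14.0, (5, 20): 30.0}),
]
for tol in (1.0, 3.0, 5.0):
    print(tol, summarize_top_scorers(toy_decoys, toy_constraints, tol))
```

In the toy data, widening the tolerance from 1 to 3 Å lets a higher RMSD decoy join the top scorers, which mirrors the trade-off described above: a tight range can exclude good decoys, while a loose range admits poor ones unless more constraints are used.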

5.2 Search Results

We have found the optimal parameters to be a set size of 10,000 decoys and a constraint distance acceptance range of ±5 Å with a set of twenty-five distance constraints. We will present results for the four target proteins using the optimized parameters.

Our scoring procedure is tested by the correlation between a decoy's score and its RMSD (Figure 5-4). A good scoring procedure assigns higher scores to lower RMSD decoys. For 1b4c, the expected trend holds true: low RMSD structures have high scores. For 1ghh, 1ubi, and 2ezk, the trend is not detectable. This may be due to the large number of low RMSD structures generated for 1b4c compared to the other target proteins. The lack of a trend is also caused by Rosetta's ability to accurately reproduce local structure in most of its decoys, as well as by our use of several short distance constraints for the target proteins. Short distances give information about a protein's local structure, while large distances give clues about its overall structure. Rosetta predicts secondary structure well; most constraints between residues close to one another in the chain are, therefore, satisfied by almost every decoy. In summary, the lack of a clear correlation between score and RMSD may be the result of poor constraint choices.

Another way to view our results is to look at the average RMSD for each score (Figure 5-5). For each target protein, the average RMSD decreased with an increase in score, indicating that the use of constraints to distinguish between good and bad decoys is effective. The decoy set for 1ghh has the highest average RMSD, 9.7 Å. The average RMSD for the 1ghh decoys satisfying twenty-five constraints is only 4.9 Å; many high RMSD decoys were eliminated by applying several distance constraints.
The average RMSD for decoys of 1ubi was 7.6 Å, and the average of those satisfying twenty-five constraints was lowered to 4.2 Å. The other two target proteins also show a decrease in the average RMSD upon constraint application, albeit less drastic; 2ezk

is lowered from 8.6 to 7.7 Å and 1b4c is lowered from 7.1 to 6.0 Å. For 2ezk and 1b4c, poor constraint choices may have hindered the decoy discrimination process.

5.2.1 Target 1b4c

All twenty-five constraints were satisfied by eight decoys of 1b4c, with RMSDs ranging from 4.6 to 7.7 Å and an average RMSD of 6.0 Å. Some of these structures can be found in Figure 5-6. The top scoring decoys are very similar to each other; the greatest area of variation from the native structure can be found in the loop regions. Also, the first α-helix appears to be somewhat displaced in most of the decoys.

The lowest RMSD structure in the database (Figure 5-2A) has a score of 20. The five constraints this decoy did not satisfy involved residues in the first α-helix (residues 7–16). As can be seen in Figure 5-2A, helix 1 of the decoy is slightly displaced from helix 1 of the target protein. All of the constraint distances in the decoy were within 7 Å of the target distances of the native structure. Using a slightly larger constraint distance acceptance range would result in the lowest RMSD decoy satisfying all twenty-five constraints.

5.2.2 Target 1ghh

For 1ghh, four decoys satisfied twenty-five constraints, with RMSDs ranging from 3.2 to 6.2 Å. These structures can be found in Figure 5-7. The lowest RMSD decoy in the decoy set satisfied all constraints. Our decoy discrimination procedure successfully identified the low RMSD decoys. The average RMSD in the decoy set was 9.7 Å and dropped to 4.9 Å for decoys satisfying all twenty-five constraints.

Of the top scoring decoys, only the lowest RMSD decoy had the same topology as the native structure of the target protein. There are three β-sheets in the native structure (Figure 5-7); β-1 is located between β-2 and β-3. β-1 is parallel to β-2 and anti-parallel to β-3. In the top

scoring decoys 1059 and 1073, β-2 is the middle strand and is still parallel to β-1, as in the native structure. In decoy 9935, β-3 is the middle strand and runs anti-parallel to both of the other strands. Because β-strands are within hydrogen bonding distance of each other, decoys with this secondary structure can have low RMSDs and incorrect topologies, as seen in our previous work (Chapter 4).

5.2.3 Target 1ubi

Eighty-six decoys of 1ubi satisfy twenty-five constraints, with RMSDs ranging from 2.4 to 12.6 Å and an average of 4.2 Å. The best structure in the decoy set has an RMSD of 2.4 Å and was found to satisfy all constraints (Figure 5-2C). Only nine of the top scoring decoys had RMSDs greater than 6 Å. Decoy number 3631 (Figure 5-8) had an RMSD of 12.6 Å. This decoy shared similar topology with the target structure for the first fifty residues; the RMSD for this section was only 3.4 Å. Deviation from the target structure appears in a loop region after the decoy's fourth β-sheet.

In this set, 86 decoys were found to satisfy all constraints. As seen in the previous example (1ghh), structures with incorrect topology, inverted β-sheets for example, can sometimes satisfy several constraints. A slightly tighter constraint range of ±3 Å had only six decoys that satisfied 24 constraints, with RMSDs ranging from 2.4 to 5.8 Å. A tighter constraint range may prevent such incorrectly aligned β-sheets from satisfying so many constraints.

5.2.4 Target 2ezk

For 2ezk, 732 decoys satisfied twenty-five constraints, with RMSDs ranging from 2.9 to 14.4 Å and an average of 7.7 Å. Although the lowest RMSD decoy in the set (Figure 5-2D) satisfies all constraints, the RMSD range for the top scoring decoys is similar to that of the whole decoy set. The search method did not adequately distinguish between the good and bad decoys.

This may be because the decoy set contains only a small number of structures with RMSDs lower than 6 Å. Choosing better distance constraints may also improve the discrimination process. A tighter constraint range of ±3 Å led to only eight decoys satisfying all twenty-five constraints (Figure 5-9A). The RMSDs, however, range from 4.3 to 8.2 Å. Excluding residues 1–10 lowers the RMSD range of the top scoring decoys to 2.4–4.0 Å, indicating that this is the region of greatest deviation from the native structure. Constraints were not chosen from this region because JPred did not predict any defined secondary structure.

5.3 Conclusions

Using our present method of choosing constraints, twenty-five distances must be measured to distinguish between reliable and unreliable decoys using a constraint distance acceptance range of ±5 Å. Decoys with slightly lower RMSDs are generated in the 10,000 decoy set when compared to the 1,000 set. In general, there is no significant difference between the decoys generated in the 10,000 versus the 50,000 decoy set. Rosetta generates low RMSD structures for each of our target proteins, and our scoring procedure is effective in assigning these low RMSD decoys high scores. The RMSDs of the best top scoring decoys were 4.6 Å for 1b4c, 3.2 Å for 1ghh, 2.4 Å for 1ubi, and 2.9 Å for 2ezk. For 1ubi and 2ezk, several decoys satisfied all twenty-five constraints with a large range of RMSD values. A different set of constraints may be more effective in distinguishing between good and bad decoys. In our next study, we will use a more reliable method for choosing constraints.

Table 5-1. RMSD ranges (Å). In parentheses is the percentage of reliable structures (< 6.0 Å) generated for each decoy set for each target protein.

Target   1,000 decoys       10,000 decoys      50,000 decoys
1b4c     4.2–16.3 (42.7)    3.6–17.2 (45.2)    3.4–17.7 (44.6)
1ghh     4.2–14.6 (9.8)     3.2–16.8 (9.5)     3.2–16.8 (10.3)
1ubi     2.9–14.9 (32.7)    2.4–15.0 (34.6)    1.8–18.5 (34.9)
2ezk     3.7–15.0 (4.9)     2.9–15.2 (7.8)     2.9–17.9 (7.8)

Table 5-2. Comparison of scores for each protein with different acceptance ranges. The number in parentheses is the number of decoys with that particular score. All data are for the 10,000 decoy set.

                            1b4c                       1ghh
Constraints   Range (Å)     Score   RMSD range (Å)     Score   RMSD range (Å)
12            ±1            11      5.1 (1)            11      5.4 (1)
12            ±3            12      4.6–7.9 (82)       12      4.1–12.2 (41)
12            ±5            12      3.8–14.3 (1343)    12      3.2–13.1 (998)
25            ±1            13      4.8–5.8 (4)        17      6.1–12.1 (4)
25            ±3            22      5.7 (1)            23      4.6–5.4 (3)
25            ±5            25      4.6–7.7 (8)        25      3.2–6.2 (4)
Lowest RMSD in decoy set            3.6                        3.2

                            1ubi                       2ezk
Constraints   Range (Å)     Score   RMSD range (Å)     Score   RMSD range (Å)
12            ±1            12      3.0–4.1 (2)        11      8.1–9.8 (3)
12            ±3            12      2.4–13.3 (518)     12      3.6–12.6 (668)
12            ±5            12      2.4–14.0 (3929)    12      2.9–14.3 (3207)
25            ±1            19      4.1 (1)            18      5.9 (1)
25            ±3            24      2.4–4.1 (6)        25      4.3–8.2 (8)
25            ±5            25      2.4–12.6 (86)      25      2.9–14.3 (732)
Lowest RMSD in decoy set            2.4                        2.9

PAGE 98

Figure 5-1. RMSD distributions for all four target proteins using the 10,000 decoy sets. For 1b4c, the RMSD distributions for sets of 1,000 and 50,000 decoys are also shown. The bin size was 0.5 Å. The frequency was calculated as a percentage of the total number of decoys in the set.

Figure 5-2. Lowest RMSD structures in the 10,000 decoy set. The target protein is shown in blue and the overlapping decoy structure is shown in red. A) 1b4c with decoy #6426. The RMSD is 3.6 Å. B) 1ghh with decoy #6104. The RMSD is 3.2 Å. C) 1ubi with decoy #5423. The RMSD is 2.4 Å. D) 2ezk with decoy #5532. The RMSD is 2.9 Å.
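The binning described in the caption of Figure 5-1 (0.5 Å bins, frequency as a percentage of the decoy set) can be illustrated as follows. This is only a sketch: the function name is hypothetical, the bins are assumed to be closed on the left, and the example RMSD values are invented for illustration.

```python
def rmsd_histogram(rmsds, bin_size=0.5):
    """Return {bin_lower_edge: percent_of_decoys} for a list of RMSD values (angstroms)."""
    if not rmsds:
        return {}
    counts = {}
    for value in rmsds:
        # Assumption: a value exactly on a bin edge belongs to the bin that starts there.
        bin_index = int(value // bin_size)
        counts[bin_index] = counts.get(bin_index, 0) + 1
    total = len(rmsds)
    return {index * bin_size: 100.0 * n / total for index, n in sorted(counts.items())}


# Made-up RMSD values, not data from this study.
example = rmsd_histogram([2.4, 2.6, 2.9, 6.1])
```

Expressing each bin as a percentage rather than a raw count is what allows the 1,000, 10,000, and 50,000 decoy sets to be compared on a single axis.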

PAGE 99

Figure 5-3. The number of structures remaining vs. score for each protein, for acceptance ranges of +/-1, 3, and 5 Å, using A) twelve constraints and B) twenty-five constraints.

PAGE 100

Figure 5-4. Correlation between score and RMSD. A) 1b4c. B) 1ghh. C) 1ubi. D) 2ezk.

PAGE 101

Figure 5-5. Average RMSD for each protein at different scores.

Figure 5-6. Top scoring decoys for 1b4c's 10,000 decoy set: A) Decoy #8500 has an RMSD of 4.6 Å. B) Decoy #8827 has an RMSD of 7.7 Å.

PAGE 102

Figure 5-7. Representation of the β-sheet orientation for the native structure of target protein 1ghh and the top scoring decoys. A) Orientation for the native structure of 1ghh and decoy #6104 with an RMSD of 3.2 Å. B) Orientation for decoys #1059 and #1073 with RMSDs of 5.8 and 6.2 Å, respectively. C) Orientation for decoy #9935, which has an RMSD of 4.5 Å.

Figure 5-8. Top scoring decoy for 1ubi, #3631, with a high RMSD of 12.6 Å. When residues 1-50 are aligned to the native structure of 1ubi, the RMSD is 3.4 Å.

Figure 5-9. Top scoring decoys for 2ezk. A) When all residues are aligned, the RMSDs range from 4.3 to 8.2 Å. B) When residues 10-93 are aligned, the RMSDs range from 2.4 to 4.0 Å.

PAGE 103

CHAPTER 6
RESULTS: USING GENERAL AND SPECIFIC DECOY SETS TO STUDY TWELVE CASP7 TARGETS

We used our general and specific decoy sets to predict the structures of twelve CASP7 targets. For a given target, the same set of twenty-five constraints was used for both types of decoy sets. Unless otherwise indicated, a constraint distance acceptance range of +/-5 Å was employed.

6.1 General Decoy Set

For each target, twenty-five constraints were chosen. As seen in Table 6-1, the number of top scoring decoys for each target ranged from two to over ten thousand. The Cα RMSDs of the top scoring decoys were calculated and five targets were found to have successful predictions (a decoy with an RMSD under 6.0 Å). To determine whether the lack of reliable predictions for the remaining seven targets was due to a breakdown in decoy generation or decoy discrimination, we calculated the Cα RMSDs between each decoy and each target (Figure 6-1). The RMSD distribution is similar for most of the target proteins; most decoys have RMSDs within 10-20 Å of their target with an average RMSD of ~16 Å.

Target T335 is the exception. Its RMSD distribution is shifted to the left, giving rise to an average RMSD of only 9.9 Å. Over 160 thousand decoys have RMSDs less than 6.0 Å, low enough to be considered a reliable prediction. It is not surprising, therefore, to find over ten thousand decoys satisfying all twenty-five constraints for this small target (42 residues), which also has a very common structural motif.

For target T335 and four others (T288, T309, T340, T359), the best decoy in the set had an RMSD under 6.0 Å and satisfied all constraints. For three targets, T348, T349, and T358, low

PAGE 104

RMSD decoys were generated in the set but the discrimination procedure failed to assign them top scores. The remaining four targets (T306, T311, T353, T363) had no low RMSD decoys in the set; the decoy generation method failed to provide accurate structures, indicating that no larger proteins contain pieces that look like these four targets.

Comparisons between the JPred predictions and the real structure for each target can be found in Table 6-2. Because constraints are chosen from the JPred prediction, it is important to determine the prediction's accuracy. Poor secondary structure prediction can lead to poor constraint choices.

6.1.1 Targets That Worked

For five CASP7 targets (T288, T309, T335, T340, T359), the lowest RMSD decoy in the set was found to satisfy all twenty-five distance constraints. In each case, the lowest RMSD decoy was also under 6.0 Å.

6.1.1.1 Target T288

Target T288 corresponds to 2gzv, the PDZ domain of human PICK1 (a fragment of a PRKCA-binding protein). The PDB structure is missing two residues (27 and 28) located in a loop region, making the target 91 residues long. Overall, JPred does a good job predicting the secondary structure, as seen in Table 6-2. Although it does not predict the α-helix between residues 41-45 or the small β-strand between residues 60-61, its predictions for the rest of the structure are never off by more than two residues.

The α-helix from residue 67 to 76 was selected as a reference and all distance constraints involved an atom from this helix. Eleven decoys satisfied twenty-five constraints with RMSDs ranging from 3.4 to 5.6 Å (Figure 6-2); all of the top scoring decoys had RMSDs within the range of reliable structure predictions.

PAGE 105

The six decoys with the lowest RMSDs in the set satisfied all constraints. All of the top scoring decoys are from the PDZ domains of various proteins. PDB codes 1tp3, 1tp5, 1tq3, 1be9, and 1bfe are crystal structures of the same protein, the PDZ3 domain of synaptic PSD-95 protein, complexed with different ligands. Parent protein 1um7 is the PDZ domain of synaptic-associated protein 102 while 1b8q is the extended neuronal nitric oxide synthase PDZ domain. The greatest difference between the structures of the decoys and the target occurs at the C-terminus from residues 87-91; no constraints involved atoms in this region.

In addition to RMSD, determining the longest continuous segment (LCS) of a decoy structure that has an RMSD under a specific threshold is sometimes used to evaluate the similarity between two structures (see Chapter 2). The longest continuous segment under 5 Å (LCS-5) is 91 residues for the 1tq3 decoy (the entire structure) and 89 residues for the 1b8q decoy. When a lower threshold is used, greater differences appear between the structures. For the decoy from 1tq3, the longest continuous segment under 2 Å (LCS-2) is composed of residues 15-86 (72 residues total), while the LCS-2 for the 1b8q decoy has only 29 residues, from residues 33-61.

6.1.1.2 Target T340

Target T340 represents 2HE4, the second PDZ domain of human NHERF-2 (SLC9A3R2) interacting with a mode 1 PDZ binding motif. It is composed of 90 residues. The JPred prediction (Table 6-2) is fairly accurate for this target. It does not, however, predict the α-helix between residues 40-44 or the β-strand between residues 58-59.

The helix between residues 64-73 was selected as the reference structure from which all constraints were chosen. Sixty decoys satisfied twenty-five constraints with RMSDs ranging from 3.0 to 14.0 Å. The top scoring decoys with the highest and lowest RMSDs are shown in Figure 6-3. The best decoy in the database had an RMSD less than 6.0 Å, as did over 80% of the

PAGE 106

top scoring decoys. Twenty of the top scoring decoys, with RMSDs ranging from 3.0 to 4.4 Å, came from parent protein 1wf7 (Figure 6-3B) while 1wif (Figure 6-3C) was the parent protein of sixteen top scoring decoys with RMSDs ranging from 4.4 to 4.7 Å. Both parent proteins are PDZ domains of a larger protein. For these top scoring decoys, the longest continuous segments under 5 Å included all residues; the LCS-2 was 63 residues for 1wf7 and 39 for 1wif.

The top scoring decoy with the highest RMSD is from parent protein 1qln (Figure 6-3D), the structure of a transcribing T7 RNA polymerase initiation complex. It has no noticeable similarities to the target structure as it is completely α-helical while the target is mostly a β-barrel. The longest continuous segment of this decoy with an RMSD under 5 Å is 25 residues, which included the reference helix. Most of the other top scoring decoys with high RMSDs have a similar α-helical arrangement and the same LCS-5. Tightening the constraint distance acceptance range to +/-3 Å only lowers the score of 1qln from 25 to 18. It is, therefore, poor constraint choices that must be contributing to the high RMSD structure predictions for this target.

An alternate set of twenty-five constraints was chosen by selecting eight atoms from various regions of secondary structure and calculating the distances between them. Twenty-three decoys satisfied all twenty-five constraints with RMSDs ranging from 3.0 to 4.4 Å. This set of constraints was much better than the previous set and led to much better results. Among the top scoring structures were twenty decoys from parent protein 1wf7 with RMSDs ranging from 3.0 to 4.4 Å as well as decoys from parent proteins 1uf1, 1uit, and 1v5l (Figure 6-3E, F, G). The parent proteins of all the top scoring decoys are structures of the PDZ domains of various proteins similar to target T340.
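The alternate constraint set for T340 was built from the distances among eight selected atoms. Eight atoms give 28 distinct pairs, so three pairs must be left out to arrive at twenty-five constraints. The sketch below shows one way such a set could be enumerated. The text does not say which pairs were dropped, so keeping the first twenty-five pairs is purely an assumption, and the function name, residue numbers, and coordinates are hypothetical.

```python
import math
from itertools import combinations


def pairwise_constraints(ca_coords, residues, max_constraints=25):
    """Build (i, j, distance) constraints from all pairs of the chosen residues.

    ca_coords       : list of (x, y, z) alpha-carbon coordinates (0-based index, an assumption)
    residues        : indices of the selected atoms
    max_constraints : how many pairs to keep (truncation rule is an assumption)
    """
    pairs = list(combinations(sorted(residues), 2))
    constraints = [(i, j, math.dist(ca_coords[i], ca_coords[j])) for i, j in pairs]
    return constraints[:max_constraints]


# Toy chain: residues placed 3.8 angstroms apart on a straight line (not real data).
toy_coords = [(3.8 * k, 0.0, 0.0) for k in range(100)]
selected = [5, 12, 20, 33, 41, 58, 66, 73]  # eight hypothetical residue indices

all_pairs = list(combinations(selected, 2))
t340_like = pairwise_constraints(toy_coords, selected)
```

Spreading the selected atoms over several secondary structure elements, rather than anchoring every distance to one reference helix, means that a decoy must reproduce the overall arrangement of those elements in order to score well.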

PAGE 107

6.1.1.3 Target T359

Target T359 represents 2iwn, the 3rd PDZ domain of multiple PDZ domain protein MPDZ. The gap in the crystal structure between residues 28-31 resulted in the protein having a total of 93 residues. JPred does a good job predicting secondary structure for this protein. It does, however, miss an α-helix between residues 45-49 and a β-strand between residues 64-65 (Table 6-2). Also, it predicts the first, second, and last β-sheets to be slightly longer than they are in the real structure.

The helix between residues 71-79 was selected as a reference and all constraints involved an atom from this helix. Fifteen decoys were found to satisfy all twenty-five constraints with RMSDs ranging from 3.6 to 13.1 Å (Figure 6-4). The lowest RMSD decoys in the database satisfied all constraints. The parent proteins of six of the top scoring decoys (PDB codes: 1p1d, 1um1, 1il6, 1ueq, 1wfv) were PDZ domains of various proteins and had RMSDs of 6.8 Å or less. These decoys differed from each other only slightly, mostly in the loop regions. Despite having a lower RMSD, 1p1d (3.6 Å RMSD) had 55 residues in its longest continuous segment with a 2 Å threshold while 1wfv (6.8 Å RMSD) had 60 residues in its LCS-2.

The remaining nine top scoring decoys were redundant structures of parent protein 1w5e and had an RMSD of 13.1 Å. A tighter constraint distance acceptance range of +/-3 Å was applied and these high RMSD decoys were found to satisfy only 14 constraints. Tightening the constraint distance acceptance range, however, also lowered the score of the lowest RMSD decoy in the decoy set from 25 to 20. So in this case, little is gained by changing the acceptance range; better constraints must be chosen.

Decoys of 1w5e have certain secondary structural elements in just the right places, allowing them to satisfy many constraints. For example, twenty-eight residues comprised the LCS-5 for

PAGE 108

this decoy, which included the α-helix that was used as a reference in choosing the constraints. Many constraints involved distances between an α-helix and a β-sheet so it is not surprising that all the top scoring decoys had these structural features.

6.1.1.4 Target T309

Target T309 corresponds to 2h4o, a 62 residue hypothetical protein from Bacillus subtilis (yonk). JPred predicts an α-helix between residues 27-40. In the real structure, however, the α-helix is only between residues 34-40. The other regions of secondary structure are β-sheets located between residues 13-15, 20-24, and 29-33, none of which were predicted by JPred. Because of the discrepancies in the secondary structure prediction and the inherent lack of structure in the target, it was difficult to choose good constraints. Surprisingly, we were able to obtain successful predictions for this target.

Because most of the protein is unstructured, constraints were chosen between all regions of predicted secondary structure. Nine decoys satisfied twenty-four constraints. The top scoring decoys had RMSDs ranging from 5.7 to 14.4 Å and came from four parent proteins: 1esc, 1esd, 1ese, and 1wk1 (Figure 6-5). The first three codes represent esterase and their top scoring decoys had RMSDs ranging from 5.7 to 6.6 Å. The longest continuous segment under 5 Å RMSD is from residues 1-58 (almost the whole structure) for 1esc, 1esd, and 1ese and between residues 3-42 for 1wk1.

For 1wk1, the Lectin C-type domain derived from a hypothetical protein from C. elegans, the top scoring decoys had RMSDs of 14.1-14.4 Å. The high RMSD structures have some similarities to the target protein; the decoy's three β-sheets within residues 10-34 align closely to those of the target protein. As can be seen in Figure 6-5, the target protein does not have defined secondary structure between residues 1-10 or 42-62. The RMSD between the two

PAGE 109

structures from residues 10-34 is ~3.8 Å and, as mentioned above, its LCS-5 is composed of 40 residues, including this region.

Removing the distance constraint between residues 5 and 35 from the constraint set results in only three decoys (from parent protein 1esc) satisfying all twenty-four constraints. The three decoys each have an RMSD of 5.7 Å. The results are, therefore, significantly improved when bad constraints are not used. This particular constraint is not effective because residue 5 is in a region of undefined secondary structure. Poor secondary structure predictions can lead to poor constraint choices, which can significantly hinder the performance of our method.

6.1.1.5 Target T335

Target T335 is 2hep, the UPF0291 protein ynzC from Bacillus subtilis. The target sequence was made of 85 residues but only 42 appear in the NMR structure. JPred correctly predicts the two main α-helices of this protein. Twenty-five constraints were selected from eight residues, four on each α-helix.

Over ten thousand decoys satisfied all twenty-five constraints (Figure 6-6); their RMSDs ranged from 2.1 to 9.5 Å. Over eight thousand of the top scoring decoys had RMSDs less than 6.0 Å. Target T335 is the smallest protein in this study and contains a fairly common structural motif. The lowest RMSD decoy in the set, from parent protein 1qsp, had all but two residues in its longest continuous segment with an RMSD under 2 Å.

Because of their compact structures, some higher RMSD decoys satisfied all twenty-five constraints. For example, a decoy from parent protein 1b6c (Figure 6-6C) had an RMSD of 9.5 Å and an LCS-5 of 30 residues. This decoy just barely satisfied all the constraints; using a constraint distance acceptance range of +/-3.0 Å, the 1b6c decoy satisfied only eleven of the twenty-five constraints. Using a tighter acceptance range, 142 decoys satisfied all constraints;

PAGE 110

their RMSDs ranged from 2.1 to 7.6 Å and 82% of them had RMSDs under 6.0 Å. For this target, tightening the constraint distance acceptance range significantly improved the results.

6.1.1.6 CASP comparisons

We employed the Global Distance Test (GDT) analysis (see Chapter 2 for details) to evaluate the performance of our method compared to other methods used in CASP7. Most methods were successful in predicting low RMSD structures for targets T288, T340, and T359 (Figure 6-7). Our predictions were not among the best for these targets even though their RMSDs were less than 6.0 Å.

Target T309 was a difficult target; it is largely unstructured, which is a common pitfall for many methods. Our predictions for this target are much better than those of other methods. Even our high RMSD top scoring decoy (cyan line in Figure 6-7) showed better GDT results than the average prediction; over fifty percent of the residues in this decoy satisfied a distance cutoff of 6.0 Å.

CASP results are mixed for target T335; the GDT results are highly scattered. Our top scoring decoy with an RMSD of 2.1 Å is one of the best predictions while our other top scoring decoy, with an RMSD of 9.5 Å, is average.

6.1.2 Targets That Could Have Worked But Did Not

For three targets, low RMSD decoys were generated but the discrimination procedure did not assign them top scores. The breakdown in decoy discrimination has several possible explanations, as discussed in the next section.

6.1.2.1 Target T348

Target T348 represents 2hf1, the putative Tetraacyldisaccharide-1-P 4-kinase from Chromobacterium violaceum. It has 61 residues. The JPred prediction is not very good; it predicts an α-helix between residues 4-10 which does not appear in the target structure and it

PAGE 111

fails to predict two other α-helices (29-31, 46-48) and three β-strands (18-20, 41-42, 50-51), as seen in Table 6-2.

The helix between residues 58-61 was selected as a reference and half of the constraints involved one of these residues. Eighteen decoys were found to satisfy all twenty-five constraints with RMSDs ranging from 6.9 to 11.0 Å (Figure 6-8 C, D, E, F). The top scoring decoys with the lowest RMSDs of 6.9 Å (2poo, 1poo, 1cvm, 1qlg) represent the enzyme phytase. They all have a β-sheet pattern similar to the target structure, but differ somewhat in the loop regions. The longest continuous segment with an RMSD under 5 Å for these four decoys is between residues 22-61, which includes the reference helix.

Different forms of trans-hydrogenase are represented by 1hzz, 1l7d, 1l7e, 1ptj, 1u2d, and 1xlt (Figure 6-8E); decoys from these parent proteins also satisfied all twenty-five constraints. The structures of these decoys have two β-sheets and a terminal α-helix and their RMSDs ranged from 10.6 to 10.7 Å. Although the overall RMSD was quite high, when the structures were aligned in smaller pieces (residues 4-22, 18-41, and 41-61), the RMSD of each section was only 5.7-5.8 Å. The LCS-5 for a decoy from 1hzz was composed of residues 23-45.

Three other top scoring decoys are depicted in Figure 6-9 to show the wide range of structures that satisfy all constraints. A decoy from parent protein 1s6l had an RMSD of 8.7 Å. As seen previously, the β-sheets of this decoy are inverted compared to the target. A decoy from parent protein 1e88 had an RMSD of 11.0 Å. Similar to the other top scoring decoys with high RMSDs, when the 1e88 decoy and the target were aligned in fragments (residues 20-40, 41-60), the RMSDs of each section were within the range for good structure predictions (5.7 and 5.9 Å, respectively). The longest continuous segment with an RMSD less than 5.0 Å is 19 residues. All of these decoys were able to satisfy twenty-five constraints despite their great difference in

PAGE 112

structure. To prevent these high RMSD decoys from satisfying all constraints, better constraints must be chosen. Due to the highly inaccurate secondary structure prediction for this target, good constraint choices were quite difficult.

The best decoy in the database, from 1tl2 (Figure 6-8B), was found to have an RMSD of 5.3 Å and a score of 22. Two of the unsatisfied constraints involved residue 10, which was predicted by JPred to be in an α-helix but was not in a defined region of secondary structure in the target. This decoy has an LCS-5 of 56 residues, almost the entire structure.

6.1.2.2 Target T349

Target T349 represents 2hfv, Pseudomonas aeruginosa hypothetical protein RPA1041. It is composed of 75 residues. JPred does not predict the first β-sheet between residues 2-7 and predicts residues 58 and 65 to be in β-strands but they are in structurally undefined regions of the target structure. It gives a reliable prediction for the remaining secondary structural elements.

The helix between residues 9-23 was selected as a reference and most constraints involved an atom from this region. Sixty-four decoys satisfied all constraints with RMSDs ranging from 10.5 to 13.6 Å (Figure 6-9). A decoy from parent protein 1ta3 (Figure 6-9D) had an RMSD of 13.6 Å but it was quite similar to the target in two regions: from residues 49-71 the RMSD was 3.0 Å and from residues 1-18 the RMSD was 5.9 Å. The longest continuous segment with an RMSD less than 5.0 Å included 30 residues. A decoy from parent protein 1t9u had an RMSD of 10.5 Å (Figure 6-9E) and satisfied all twenty-five constraints. Its LCS-5 had 32 residues, which were also located near the C-terminus of the decoy.

In the database, twenty-eight decoys had RMSDs less than 5.6 Å and scores ranging from 14 to 21. A decoy from 1uj5 (Figure 6-9B) had an RMSD of 5.6 Å and a score of 21, while a decoy of 1wel (Figure 6-9C) satisfied only fourteen constraints and had the same RMSD.

PAGE 113

Several of the unsatisfied constraints for both of these decoys involved atoms in the final α-helix composed of residues 55-64. The two decoys were similar to each other, having an RMSD between them of 5.4 Å. Increasing the constraint distance range from 5 to 7 Å raises the scores of the lowest RMSD decoys to 23, but also allows other high RMSD decoys to satisfy more constraints.

The decoy from 1uj5 had only 39 residues in its LCS-5, while 1wel had 63. Therefore, 1wel should have satisfied more constraints than 1uj5, even though they had very similar RMSD values, but this did not happen. For the decoy from 1uj5, the placement of secondary structure is closer to the JPred prediction than that of the target. The target and decoy differ greatly in the loop regions; the decoy has two short β-sheets where the target has a long loop. The decoy also ends with a β-sheet unlike the target, which has a largely unstructured terminal region.

Another set of constraints was selected in an attempt to improve the results. Nine atoms were chosen and distances were calculated between them. Satisfying all the new constraints were 42 decoys with RMSDs ranging from 5.5 to 11.1 Å. This was a slight improvement as the lowest RMSD decoy was found to satisfy all constraints.

6.1.2.3 Target T358

Target T358 represents 2hjj, protein ykfF from Escherichia coli. It is 66 residues long. The JPred prediction is pretty good for this protein. The first nine residues are missing in the crystal structure.

The helix between residues 5-14 was selected as a reference structure and all constraints involved an atom in this region. Five decoys satisfied all twenty-five constraints with RMSDs ranging from 8.3 to 11.8 Å (Figure 6-10). A decoy from 1p99 had an RMSD of 8.3 Å. Forty-two residues comprised the LCS-5 for this decoy, which included the reference helix. Three top

PAGE 114

scoring decoys, 1efd, 1k2v, and 1k7s, had RMSDs of 9.3 Å and LCS-5s of 27 residues, including the reference helix. The longest continuous segment for 1x9d (RMSD = 11.8 Å) also included the reference helix but was composed of only 24 residues.

The best decoys in the database, from parent proteins 1oe9, 1w7i, and 1w7j (Figure 6-10B), had RMSDs of 5.5 Å and scores of 22. Their longest continuous segments with RMSDs under 5.0 Å were 37 residues long. The segments did not include residues in the reference helix, which may explain why these decoys did not satisfy all constraints. It is difficult to know a priori which regions of the target will be most similar to any particular decoy.

6.1.2.4 CASP comparisons

As done for the targets that worked, we employed the GDT analysis to evaluate our method's performance on the targets that could have worked but did not.

Target T348 was difficult for most CASP participants. There were no predicted structures with 100% of the residues within a 10.0 Å distance cutoff. However, our lowest RMSD decoy in the set (1tl2, pink line in Figure 6-11A) did satisfy this requirement but it was not one of the top scoring decoys. A decoy from 1cvm satisfied all constraints and performed well compared to other CASP predictions (blue line in Figure 6-11A).

The results for target T349 were quite mixed; some groups did well, while others struggled. Our top scoring decoys were not among the best predictions (red and blue lines in Figure 6-11B) and the lowest RMSD decoys in the set were average predictions (green and cyan lines in Figure 6-11B).

T358 was a difficult target. Our lowest RMSD decoy in the set would have been one of the best structure predictions had it satisfied all constraints (red line in Figure 6-11C). Two of our top scoring decoys (green and blue lines) were slightly better than average predictions while the other decoy (cyan line in Figure 6-11C) was not very good.
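The GDT comparisons above are phrased in terms of the percentage of residues that fall within a given distance cutoff. The sketch below computes that percentage for several cutoffs. It is a deliberate simplification: it assumes the per-residue alpha-carbon deviations come from a single, already performed superposition, whereas the full GDT procedure (see Chapter 2) searches over many superpositions. The function name and the deviation values are hypothetical.

```python
def percent_within_cutoffs(deviations, cutoffs):
    """Return {cutoff: percent of residues whose deviation is <= cutoff}.

    deviations : per-residue distances (angstroms) between matched alpha-carbons,
                 taken from one fixed superposition (a simplifying assumption)
    cutoffs    : iterable of distance cutoffs in angstroms
    """
    total = len(deviations)
    return {
        cutoff: 100.0 * sum(1 for d in deviations if d <= cutoff) / total
        for cutoff in cutoffs
    }


# Made-up deviations for an eight-residue toy model (not data from this study).
toy_deviations = [0.8, 1.5, 2.2, 4.9, 5.5, 7.0, 9.5, 12.0]
curve = percent_within_cutoffs(toy_deviations, [2.0, 6.0, 10.0])
```

Plotting such percentages against the cutoff gives one line per model, which is how the colored lines in the GDT figures are read: a line that reaches a high percentage at a small cutoff corresponds to a better model.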

PAGE 115

6.1.3 Targets That Never Had a Chance

Low RMSD decoys were not generated for the remaining four targets. Lack of structures similar to these targets shows our decoy set to be incomplete. These targets are not represented in the database; either they are not fragments of larger proteins or their parent protein was not included in our database. If they exist, the similar proteins may be less than 100 residues long or they may contain gaps in their PDB structures, excluding them from our decoy set.

6.1.3.1 Target T306

Target T306 (Figure 6-12A) corresponds to 2hd3, a small fragment of Ethanolamine Utilization Protein (EutN) from Escherichia coli. It has 95 residues. The JPred prediction is not very accurate. It does not predict either of the α-helices. It also predicts a long β-sheet composed of residues 75-87 which is split into two smaller β-sheets in the real structure (Table 6-1).

Twenty of the twenty-five constraints were chosen from the same reference structure, the β-sheet composed of residues 40-45. Two decoys, with parent proteins 1jhw and 1j72, were found to satisfy twenty-four constraints (Figure 6-12C). The RMSDs of the top scoring decoys were 13.4 and 13.5 Å, respectively. Both parent proteins represent a macrophage capping protein, Cap G, which is composed of four β-sheets and a long α-helix (Figure 6-12C). The longest continuous segment with an RMSD under 5.0 Å is small for these decoys, composed of only 24 residues. The reference β-strand is also in this region, which helps explain why such high RMSD decoys satisfy most of the constraints.

Unlike the top scoring decoys, the target structure has a β-barrel center, as does the lowest RMSD decoy in the database (from parent protein 1fgu, Figure 6-12B). This decoy, however,

PAGE 116

was found to satisfy only eighteen constraints. Four of the six unsatisfied constraints involved residue 43, which is part of a β-sheet in the target protein and a small α-helix in the decoy. Several other slight structural differences exist between the lowest RMSD decoy and the target protein. The decoy has a loop from residues 10-22, a small α-helix from 40-49, and a β-sheet from residues 60-67. The target, however, has an α-helix composed of residues 16-20, a loop region from 46-52, and an α-helix from residues 61-67. In the C-terminus, the target ends with two short β-sheets while the decoy finishes with one short β-sheet followed by an α-helix. All of these differences give rise to an RMSD between the decoy and target of 8.1 Å. The longest continuous segment with an RMSD less than 5.0 Å is composed of 49 residues for this decoy, which included the reference β-strand.

No decoys satisfied all twenty-five constraints. Due to JPred's poor secondary structure predictions for this target, choosing good constraints was quite challenging. The lowest RMSD decoy in the database satisfies a unique set of eighteen constraints. When only those constraints are used, the lowest RMSD decoy is assigned a perfect score. Regardless of constraint choices, however, no reliable predictions can be made for this target because no decoys with RMSDs less than 6.0 Å exist in the database.

6.1.3.2 Target T311

Target T311 is associated with two parent proteins, 2icp and 2ict, which represent bacterial antitoxin HigA, each crystallized at a different pH. We used 2ict as our reference structure and it was composed of 87 residues. The JPred prediction for this protein was fairly accurate, having only a slight discrepancy in the position of the last α-helix (Table 6-2).

The α-helix composed of residues 57-74 was selected as the reference helix; all constraints involved an atom in this region. Twenty-three decoys satisfied all twenty-five

PAGE 117

constraints and had RMSDs ranging from 10.1 to 10.2 Å (Figure 6-13). The target and the decoys have different local structures for the first and last 15 residues. The remainder of the structure is mostly helical, with both proteins having similar sized helices. The difference lies in the orientation of these helices, which increases the overall RMSD between the two structures. Because the decoys' helices are rotated only slightly, the distances between them are similar to those of the target, which explains why these high RMSD decoys satisfied so many constraints.

For the top scoring decoys, the longest continuous segment with an RMSD under 5.0 Å is 49 residues long, from residue 31 to 79. These decoys came from eleven parent proteins, nine of which were from some form of carbamoyl phosphate synthetase (PDB codes: 1a9x, 1bxr, c30, 1c3o, 1cs0, 1jdb, 1kee, 1mv6, 1t36). Parent protein 1ceb represents the recombinant kringle 1 domain of human plasminogen while 1cs8 is procathepsin L.

In the database, 64 slightly different structures of parent protein 1f6g had the lowest RMSD, 6.6 Å. This decoy has a much longer terminal α-helix than the target or the top scoring decoys. It also has an initial α-helix that is quite similar in size to that of the target. All of the low RMSD decoys satisfied twenty constraints. Three of the unsatisfied constraints involved residue 64 and an atom in a loop region. The other two constraints involved residue 50, which is on a small α-helix in the target structure and part of the large terminal α-helix of the decoy. As seen for T306, no reliable decoys are generated for this target.

6.1.3.3 Target T353

Target T353 represents 2hfq, protein NE1680 from Nitrosomonas europaea. It has 85 residues. JPred accurately predicts the secondary structure for this target.

The α-helix composed of residues 29-42 was selected as a reference and all constraints involved an atom in this helix. Twenty-six decoys were found to satisfy twenty-four constraints with RMSDs

PAGE 118

ranging from 10.8 to 15.2 Å (Figure 6-14). The longest continuous segments under 5.0 Å RMSD for the top scoring decoys were 26 residues for 1ekf (from residue 28 to 53) and 27 residues (from residue 27 to 53) for 1j49. Both LCS-5s included the reference helix.

The best decoy in the database, from parent protein 1jrp, had an RMSD of 6.4 Å (Figure 6-14B) and a score of 17. The decoy matches the target structure exceptionally well from residues 12-43, with an RMSD of ~3.1 Å in this region. The longest continuous segment with an RMSD under 5.0 Å is composed of 54 residues (from residue 9 to 62). The lowest RMSD decoy did not satisfy all constraints, indicating several poor constraints were chosen. The lowest RMSD decoy, however, was not good enough to be considered a reliable model even if it did satisfy all constraints.

6.1.3.4 Target T363

Target T363 represents 2hj1, the 3D domain-swapped dimer of a hypothetical protein from Haemophilus influenzae. The PDB structure contains 77 residues. The JPred prediction for this sequence is fairly accurate.

The α-helix between residues 25-34 was chosen as a reference and all twenty-five constraints involved an atom in this region. Three decoys of parent protein 1sxg satisfied all twenty-five constraints and had RMSDs of 9.3-9.4 Å (Figure 6-15). The longest continuous segment with an RMSD under 5.0 Å was composed of 36 residues and included the reference α-helix.

The lowest RMSD decoy came from parent protein 1hux and is shown in Figure 6-15. It was found to satisfy twenty-four constraints. Comprising the LCS-5 for this decoy were residues 1-59. Twenty-seven decoys satisfied the same set of twenty-four constraints with RMSDs ranging from 6.4 to 11.7 Å.
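The LCS-5 and LCS-2 values quoted for each decoy report the longest run of consecutive residues whose RMSD stays under a threshold. A simplified version of that search can be sketched as below. The assumption here, which is not stated in the text, is that per-residue alpha-carbon deviations from one fixed superposition are reused for every window; a full LCS calculation (see Chapter 2) would typically re-superimpose each candidate segment. The function name and the toy deviations are hypothetical.

```python
import math


def longest_continuous_segment(deviations, threshold=5.0):
    """Return (start, end, length) of the longest window with RMSD <= threshold.

    deviations : per-residue alpha-carbon deviations in angstroms, from one
                 fixed superposition (a simplifying assumption)
    start, end : 1-based, inclusive residue positions; (0, 0, 0) if no window qualifies
    """
    n = len(deviations)
    # Prefix sums of squared deviations give each window's RMSD in constant time.
    prefix = [0.0]
    for d in deviations:
        prefix.append(prefix[-1] + d * d)

    best = (0, 0, 0)
    for start in range(n):
        for end in range(start, n):
            length = end - start + 1
            if length <= best[2]:
                continue  # only strictly longer windows can improve the result
            rmsd = math.sqrt((prefix[end + 1] - prefix[start]) / length)
            if rmsd <= threshold:
                best = (start + 1, end + 1, length)
    return best


# Made-up deviations: a well-matched core flanked by poorly matched ends.
toy = [9.0, 9.0, 1.0, 2.0, 1.0, 3.0, 9.0, 9.0]
lcs2 = longest_continuous_segment(toy, threshold=2.0)
lcs5 = longest_continuous_segment(toy, threshold=5.0)
```

Because window RMSD is not monotonic in window length, every window is checked rather than stopping at the first failure. The residue counts quoted in the text follow the same inclusive convention used here: a segment from residue 9 to 62 contains 54 residues.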

PAGE 119

6.1.3.5 CASP comparisons

We performed the GDT analysis on the four proteins for which our database had no low RMSD decoys (Figure 6-16). Target T306 was difficult to predict for most CASP participants; no models have more than 60% of the residues within a 5.0 Å distance cutoff. Our lowest RMSD decoy (not top scoring) was one of only two models to have 85% of the residues under a 10.0 Å distance cutoff (blue line in Figure 6-16A). Our top scoring decoys were about average compared to the other CASP models.

Predicting the structure of target T353 was also difficult for most groups. Our lowest RMSD decoys (1jrp, 1jro) were the only models to have 95% of the residues under a distance cutoff of 10.0 Å (pink lines in Figure 6-16C). Our top scoring decoys, however, were again average models.

Our results for T311 and T363 were not very good. Most groups found target T311 pretty easy to predict, while the results for T363 were mixed. In both cases, our top scoring decoys were not among the best predictions.

6.1.4 Summary of Results for General Decoy Set

We studied twelve CASP7 targets. The lowest RMSD decoy in the database for five of the targets was assigned the highest score. Three targets had low RMSD decoys but they were not assigned the highest score, while four targets did not have any low RMSD decoys in the database.

Because constraints were chosen with the aid of a secondary structure prediction method, an incorrect secondary structure prediction can lead to poor constraint choices, which result in bad structure predictions. We also found that many high scoring, high RMSD decoys have regions of great similarity to the target which usually contain the reference structure.
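The three outcome categories used in this section can be stated as a simple decision rule. The sketch below is one reading of the text, assuming that a target "worked" when its lowest RMSD decoy is under 6.0 Å and also received the top score, "could have worked" when such a decoy exists but was outscored, and "never had a chance" when no decoy is under 6.0 Å. The function name and argument layout are hypothetical. The example values are the best-decoy RMSDs and scores quoted earlier in this chapter for T288, T348, and T306.

```python
def classify_target(best_rmsd, best_decoy_score, top_score, reliable_cutoff=6.0):
    """Classify a target by whether a reliable decoy exists and whether it was top scoring.

    best_rmsd        : RMSD (angstroms) of the lowest RMSD decoy in the set
    best_decoy_score : number of constraints satisfied by that decoy
    top_score        : highest score achieved by any decoy in the set
    """
    if best_rmsd >= reliable_cutoff:
        return "never had a chance"   # decoy generation failed
    if best_decoy_score == top_score:
        return "worked"               # generation and discrimination both succeeded
    return "could have worked"        # discrimination failed


t288 = classify_target(best_rmsd=3.4, best_decoy_score=25, top_score=25)
t348 = classify_target(best_rmsd=5.3, best_decoy_score=22, top_score=25)
t306 = classify_target(best_rmsd=8.1, best_decoy_score=18, top_score=24)
```

Separating the two failure modes in this way distinguishes an incomplete decoy set from a poor choice of constraints.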

When comparing our method to the other methods used in CASP7, we find that our method performed quite well for some of the hardest targets and not well for some of the easiest targets.

6.2 Specific Decoy Sets

We studied the same 12 targets as in Section 6.1 and used the same sets of constraints. To generate specific decoy sets for each target, we used the Rosetta algorithm. Each set contained exactly 10,000 decoys. The Cα RMSDs between each target and each decoy were calculated, and the RMSD distributions can be found in Figure 6-17. The RMSD distribution varies greatly from target to target. As seen in the general decoy set, target T335 has a very high number of low RMSD decoys; 95.6% of decoys are under 6.0 Å. Targets T309 and T306 have RMSD distributions centered around 15.0 Å and 14.0 Å, respectively, and have no decoys with RMSDs under 6.0 Å. The RMSD of the best decoy for each target is listed in Table 6-3.

6.2.1 Targets That Worked

Four target proteins, T311, T335, T358, and T349, had the lowest RMSD decoy in their set satisfy all twenty-five constraints. In each case, the lowest RMSD was under 6.0 Å. The number of top scoring decoys, the range of RMSD values, and the RMSD of the best decoy in the set can be found in Table 6-3.

For T311, 181 decoys had perfect scores and their RMSDs ranged from 4.8 to 14.6 Å. The highest (#61) and lowest (#3545) RMSD decoys satisfying all constraints are shown in Figure 6-18(A, B). The first fifty residues of the high scoring, high RMSD decoy (#61) are similar to the target; the RMSD in this region is only 1.7 Å. This decoy has a high RMSD despite its very similar local structure because it also has a few consecutive incorrect dihedral angles. Rotation around these bonds makes the whole decoy structure quite different from the target despite their great similarities in local structure. The worst decoy in the set had an RMSD of 19.0 Å and satisfied only 6 constraints. Our method successfully assigned the worst decoy in the database a low score. Rosetta generated 158 decoys with RMSDs under 6.0 Å, and our discrimination procedure assigned these decoys scores ranging from 13 to 25 with an average of 23.

The 84 top scoring decoys for T349 had RMSDs ranging from 4.0 to 11.0 Å. The highest (#3665) and lowest (#4480) RMSD decoys can be found in Figure 6-18(C, D). Despite the high overall RMSD, the high scoring, high RMSD decoy was similar to the target from residues 6 to 24 and 49 to 66, having RMSDs of only 1.8 Å and 1.5 Å, respectively. Both sections correspond to α-helices, with the first being the reference helix. The highest RMSD decoy in the set (15.3 Å) satisfied 14 constraints. Fifty-four decoys were generated with RMSDs less than 6.0 Å, and they satisfied between 14 and 25 constraints with an average of 23.

Target T358 had 20 decoys satisfying all constraints with RMSDs ranging from 4.1 to 11.4 Å (Figure 6-18(E, F)). The 11.4 Å RMSD decoy (#3334) was similar to the target from residues 1 to 20, which included the reference helix, and from residues 48 to 60. The RMSDs of these sections were 2.1 Å and 2.0 Å, respectively. The highest RMSD decoy for T358 satisfied 13 constraints and had an RMSD of 15.7 Å. The 122 decoys with RMSDs under 6.0 Å had scores ranging from 10 to 25 with an average of 20, slightly lower than the other targets.

Many decoys satisfied all constraints for T335. The range of RMSDs, however, was small, from 1.4 to 8.2 Å (Figure 6-18(G, H)). The lowest RMSD decoy for this target (#9623) was the lowest RMSD decoy generated for any target. Like the target T335, the high scoring, high RMSD decoy, #2312, had an α-helix from residues 4 to 20. The target and decoy had an RMSD of 1.8 Å between residues 1 and 21. Almost all the decoys in this set had RMSDs under 6.0 Å and their scores ranged from 12 to 25 with an average of 23.5. One of the highest RMSD decoys, with an RMSD of 11.4 Å, also satisfied 24 constraints. In this case, very few decoys had high RMSDs and it was hard to separate the good from the bad.

Four targets, T288, T348, T359, and T363, had top scoring decoys with low RMSDs, but the lowest RMSD decoy in the set did not satisfy all constraints.

For target T288, 86 decoys satisfied all twenty-five constraints and their RMSDs ranged from 3.6 to 13.2 Å (Figure 6-19). The 13.2 Å RMSD decoy (#7663) was similar to the target from residue 46 to 83, which included the reference helix. The RMSD of this region was 1.9 Å. The best decoy in the database (#369), with an RMSD of 3.5 Å, satisfied 23 constraints. The two constraints it did not satisfy were between atoms in the reference helix and the second β-sheet. The highest RMSD decoy in the database satisfied 21 constraints and had an RMSD of 17.0 Å. That this high RMSD decoy satisfied so many constraints may be the result of poor constraint choices. Rosetta generated 25 decoys with RMSDs under 6.0 Å and their scores ranged from 19 to 25 with an average score of 24.

No decoys for T348 satisfied all 25 constraints. Two decoys satisfied 23 constraints and had RMSDs of 5.6 Å and 12.8 Å (Figure 6-20). They did not satisfy the same set of 23 constraints. The 12.8 Å RMSD decoy (#7088) was similar to the target from residues 15 to 31 and 49 to 61, which included the reference helix (residues 58 to 61). The RMSDs in these regions were 2.0 Å and 1.6 Å, respectively. The lowest RMSD decoy (#1017) satisfied only 12 constraints and had an RMSD of 4.2 Å. The secondary structure prediction for this target was not very good, resulting in poor constraint choices. The highest RMSD decoy in the set satisfied 8 constraints and had an RMSD of 15.3 Å. The 151 low RMSD decoys (RMSD less than 6.0 Å) had scores ranging from 7 to 25 with an average of 16. To improve our results, better constraints must be chosen.

Forty decoys of T359, with RMSDs ranging from 6.0 to 15.2 Å, satisfied all constraints (Figure 6-21). The high scoring, high RMSD decoy (#3012) and the target had an RMSD of 2.1 Å from residues 51 to 89, which included the reference helix from residues 71 to 79. The best decoy in the set, #112, had an RMSD of 4.8 Å and satisfied 23 constraints, while the highest RMSD decoy (17.7 Å) satisfied 17 constraints. Only four decoys were generated that had RMSDs less than 6.0 Å. Their scores ranged from 22 to 25 with an average of 23.

For target T363, 11 decoys satisfied all twenty-five constraints and their RMSDs ranged from 5.7 to 10.5 Å (Figure 6-22). The 10.5 Å decoy (#5181) was similar to the target from residue 1 to 39, having an RMSD of 2.8 Å in this region. In addition to containing the reference helix, this region is also the most structurally defined area; the target is fairly unstructured from residue 51 to the C-terminus. The two best decoys in the set (#4551, #6388), with RMSDs of 5.1 Å, satisfied 19 and 24 constraints, while the highest RMSD decoy (16.8 Å) satisfied 11 constraints. Twenty-five decoys had RMSDs under 6.0 Å and their scores ranged from 16 to 25 with an average of 20.

6.2.2 Targets That Did Not Work

The Rosetta method generated low RMSD decoys for targets T340 and T353, but our method was unable to discriminate the good decoys from the bad. Target T340 had only five low RMSD decoys in the database, with scores ranging from 12 to 24 and an average of 21, while the 57 low RMSD decoys for T353 had scores ranging from 15 to 24 with an average score of 20.

Six decoys of T340 satisfied all constraints and had RMSDs ranging from 7.4 to 14.5 Å (Figure 6-23). The 7.4 Å RMSD decoy (#9880) was similar to the target from residues 1 to 11 and 50 to 74, which included the reference helix. Both sections had RMSDs of 1.9 Å. The 14.5 Å RMSD decoy (#9412) was most similar to the target from residue 2 to 17 and 63 to 83. These sections had RMSDs of 1.9 Å and 2.9 Å, respectively. The best decoy in the set, #94, had an RMSD of 3.7 Å but satisfied only 12 constraints, while the next best decoy, with an RMSD of 3.8 Å, satisfied 24 constraints. The decoy with the highest RMSD in the set (17.6 Å) satisfied 20 constraints, while the next highest RMSD decoy (17.2 Å) satisfied 10 constraints.

Twenty decoys of T353 satisfied all twenty-five constraints with RMSDs ranging from 7.0 to 13.5 Å (Figure 6-24). The top scoring decoys were similar to the target from residue 1 to 47 for the 7.0 Å decoy and from residue 1 to 43 for the 13.5 Å decoy. The similar regions included the reference helix from residue 29 to 42. The RMSDs between target and decoy for these sections were 2.8 Å and 3.0 Å, respectively. The higher RMSD decoy (#3009) also had a segment of high similarity to the target between residues 59 and 74 (1.7 Å). The best decoy in the set, #5124, had an RMSD of 4.3 Å and satisfied 22 constraints. The three unsatisfied constraints involved residue 29, which is located in an α-helix that is slightly displaced compared to the target. The highest RMSD decoy (16.2 Å) satisfied only 13 constraints.

6.2.3 Targets That Never Had a Chance

For targets T306 and T309, no low RMSD decoys were generated. No decoys for T306 satisfied all constraints. However, two decoys satisfied 24 constraints and had RMSDs of 12.5 Å and 13.5 Å (Figure 6-25). The 12.5 Å RMSD decoy (#1604) was most similar to the target between residues 15 to 32 and 28 to 59, having RMSDs of 3.0 Å and 2.8 Å in these segments. The 13.5 Å decoy (#5643) and the target have similarities between residues 36 to 45, 72 to 82, and 56 to 89, with RMSDs of 2.5 Å, 2.8 Å, and 3.1 Å. The reference β-strand was included in a low RMSD region of each of the top scoring decoys. The lowest RMSD decoy in the database (#935) satisfied only 16 constraints and had an RMSD of 8.0 Å; five of the unsatisfied constraints involved residue 40, which is located within the β-barrel structure of both the target and the decoy. The highest RMSD decoy (18.0 Å) satisfied only 9 constraints.

Only one decoy of T309 (#210) satisfied all constraints and it had an RMSD of 11.0 Å (Figure 6-26). The target has defined secondary structure only between residues 13 and 40, and the RMSD between the target and decoy in this region (residues 13 to 39) was 3.7 Å. Therefore, if only the regions of defined secondary structure are considered, the method worked well for this target. The best decoy in the database, #5810, satisfied 22 constraints and had an RMSD of 8.1 Å. Two of the unsatisfied constraints involved residue 30, which is located within one of the β-sheets. The highest RMSD decoy (19.8 Å) satisfied 16 constraints.

6.2.4 Summary of Results Using the Specific Decoy Set

Rosetta generated low RMSD decoys for ten of the twelve targets. Even the high RMSD decoys usually had some local structural similarities to the target. Low RMSD decoys were assigned top scores for eight targets. Two targets had low RMSD decoys in the database that did not satisfy all constraints, while low RMSD decoys were not generated for two other targets, T306 and T309. Both T306 and T309 were difficult to predict for most CASP participants.

6.3 Comparisons of Decoy Sets

A comparison of the results using each type of decoy set can be found in Table 6-4. Rosetta generated low RMSD decoys for all but two targets, T309 and T306. The general decoy set contained a low RMSD decoy for target T309 (5.7 Å), but the best decoy in the set for target T306 had an RMSD of 8.1 Å. Also, the lowest RMSD decoy for ten of the twelve targets is lower for the Rosetta decoys than for the general set. The general decoy set generated better decoys for targets T309 and T359.

The discrimination process was equally effective for both types of decoy sets. Three targets (T288, T359, T335) had successful predictions using both decoy sets. Target T306 did not have low RMSD decoys in either set. This is not a very common type of protein. The three targets for which the general decoy set contained low RMSD decoys that were not assigned top scores (T348, T349, T358) and the two targets for which the general decoy set did not generate low RMSD decoys (T311, T363) had successful predictions using the specific decoy sets. Using the general decoy set, T309 and T340 had low RMSD decoys with top scores, whereas using the specific decoy set, successful predictions were not obtained. Finally, for T353, the specific decoy set generates a low RMSD decoy but it is not assigned a top score, while the general decoy set does not generate a low RMSD decoy. When both methods are considered, ten of the twelve targets had successful predictions.

Table 6-1. Results for 12 targets (RMSD values in Å)

Target   Lowest RMSD     Range of RMSDs for     Number of top
         in decoy set    top scoring decoys     scoring decoys
288      3.4             3.4–5.6                11
340      3.0             3.0–14.0               60
359      3.6             3.6–13.1               15
309      5.7             5.7–14.4               9*
335      2.1             2.1–9.5                10582
348      5.3             6.9–11.0               18
349      5.5             10.5–13.6              64
358      5.5             8.3–11.8               5
306      8.1             13.4–13.5              2*
311      6.6             10.1–10.2              23
353      6.4             10.8–15.2              26*
363      6.4             9.3–9.4                3

*No decoys satisfied all constraints; the number represents the number of decoys satisfying 24 constraints.
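
The three outcome categories summarized in Section 6.1.4 can be recovered from the Table 6-1 values. The sketch below is illustrative only. It assumes that "low RMSD" means under 6.0 Å (the cutoff used elsewhere in this chapter when counting low RMSD decoys) and that a target counts as successful when its best top scoring decoy is under that cutoff. The category labels and the function name are ours.

```python
from collections import Counter

LOW_RMSD_CUTOFF = 6.0  # angstroms; ASSUMED threshold for a "low RMSD" decoy

# target: (lowest RMSD in decoy set, lowest RMSD among top scoring decoys),
# both in angstroms, transcribed from Table 6-1.
TABLE_6_1 = {
    "T288": (3.4, 3.4), "T340": (3.0, 3.0), "T359": (3.6, 3.6),
    "T309": (5.7, 5.7), "T335": (2.1, 2.1), "T348": (5.3, 6.9),
    "T349": (5.5, 10.5), "T358": (5.5, 8.3), "T306": (8.1, 13.4),
    "T311": (6.6, 10.1), "T353": (6.4, 10.8), "T363": (6.4, 9.3),
}


def classify(lowest_in_set, best_top_scoring, cutoff=LOW_RMSD_CUTOFF):
    """Assign one of three outcome labels to a target (labels are illustrative)."""
    if lowest_in_set >= cutoff:
        return "no low RMSD decoy"
    if best_top_scoring < cutoff:
        return "low RMSD decoy top scoring"
    return "low RMSD decoy not top scoring"


outcomes = {target: classify(*values) for target, values in TABLE_6_1.items()}
counts = Counter(outcomes.values())
print(counts)
```

Under these assumptions the tally is five, three, and four targets, which agrees with the counts stated in Section 6.1.4.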

Table 6-2. JPred predictions compared to target structures

T288 T306 T309
JPred Real JPred Real JPred Real
5 11 5 10 4 7 2 10 5 10 17 23 19 23 10 15 13 15 30 36 31 36 17 19 18 21 41 45 24 30 23 30 20 24 53 57 52 57 40 46 36 45 29 33 60 61 54 59 53 59 27 40 34 40 66 75 67 76 61 67 46 52 80 86 80 86 75 87 76 81 84 85 92 93 93 94

T311 T335 T340
JPred Real JPred Real JPred Real
3 12 1 3 3 18 5 19 6 11 6 11 16 23 17 24 24 46 24 40 18 24 19 23 28 35 28 36 51 55 31 36 30 35 43 52 43 53 64 73 40 44 57 75 57 74 52 56 50 55 78 80 58 59 83 85 66 74 65 73 80 86 78 84

T348 T349 T353
JPred Real JPred Real JPred Real
4 10 2 7 3 11 3 1 18 20 9 21 9 23 17 24 17 24 26 28 25 28 27 30 26 28 29 43 29 43 29 31 36 & 43 59 61 59 61 35 36 33 38 46 50 45 51 65 74 65 74 41 42 54 64 55 64 76 79 76 79 46 48 50 51 58 62 54 60

T358 T359 T363
JPred Real JPred Real JPred Real
5 14 5 14 5 11 2 11 3 9 1 10 15 21 18 21 17 22 19 22 14 20 13 22 27 33 29 33 35 40 35 40 25 34 27 34 39 42 39 43 45 49 36 39 52 60 50 58 57 61 56 61 51 56 64 65 69 72 71 79 71 80 84 91 84 94

Table 6-3. Results for each of the 12 targets

CASP     Number of top     RMSD range of top      Lowest RMSD in     Number of decoys with
Target   scoring decoys    scoring decoys (Å)     decoy set (Å)      RMSDs under 6.0 Å
335      3619              1.4–8.22               1.4                9,562
311      181               4.8–14.6               4.8                158
358      20                4.1–11.4               4.1                122
349      84                4.0–11.0               4.0                54
363      11                5.7–10.5               5.1                25
288      86                3.6–13.2               3.5                25
359      40                6.0–15.2               4.8                4
348      2*                5.6, 12.8              4.2                151
353      20                7.0–13.5               4.3                57
340      6                 7.4–14.5               3.7                5
309      1                 11.0                   8.1                0
306      2*                12.5, 13.5             8.0                0

*Top score was less than 25.

Table 6-4. Comparison of results for each target using both types of decoy sets

Target   Specific Set (Å)   General Set (Å)
288      3.6–13.2           3.4–5.6
335      1.4–8.22           2.1–9.5
359      6.0–15.2           3.6–13.1
349      4.0–11.0           10.5–13.6
358      4.1–11.4           8.3–11.8
348      5.6, 12.8          6.9–11.0
363      5.7–10.5           9.3–9.4
311      4.8–14.6           10.1–10.2
353*     7.0–13.5           10.8–15.2
340      7.4–14.5           3.0–14.0
309      11.0               5.7–14.4
306*     12.5, 13.5         13.4–13.5

*Targets T353 and T306 were not predicted successfully by either method. The entries in red had successful predictions. Those in green had low RMSD decoys in the set but they did not satisfy all constraints, while those in blue did not have low RMSD decoys.

Figure 6-1. RMSD distributions for each target protein.

Figure 6-2. Target T288 and the top scoring decoys for T288. A) Target T288. B) 1tq3 had an RMSD of 3.4 Å. C) 1um7 had an RMSD of 3.5 Å. D) 1b8q had an RMSD of 5.6 Å.

Figure 6-3. Target T340 and some of the top scoring decoys. A) Target T340. B) 1wf7 with an RMSD of 3.0 Å. C) 1wif with an RMSD of 4.4 Å. D) 1qln with an RMSD of 14.0 Å. E) 1uf1 with an RMSD of 4.4 Å. F) 1uit with an RMSD of 3.4 Å. G) 1v5l with an RMSD of 3.3 Å.

Figure 6-4. Target T359 and its top scoring decoys. A) Target T359. B) 1p1d and 1um1, each with an RMSD of 3.6 Å. C) 1ueq and 1wfv, each with an RMSD of 6.8 Å. D) 1w5e with an RMSD of 13.1 Å.

Figure 6-5. Target T309 and its top scoring decoys. A) Target T309. B) 1ese, 1esd, and 1esc, each with an RMSD of 5.7 Å. C) 1wk1 with an RMSD of 14.4 Å.

Figure 6-6. Target T335 and its top scoring decoys. A) Target T335. B) 1qsp, 1wa5, and 1z3h, each with an RMSD of 2.1 Å. C) 1b6c has an RMSD of 9.5 Å.

Figure 6-7. Use of Global Distance Test (GDT) analysis to compare our top scoring decoys with the results from other methods used in CASP7. A) Target T288. B) Target T340. C) Target T359. D) Target T309; decoys from 1esc, 1esd, and 1ese have RMSDs of 5.7 Å, while 1wk1 has an RMSD of 14.4 Å. E) Target T335; the 1b6c decoy has an RMSD of 9.5 Å, while the other three have RMSDs of 2.1 Å.

Figure 6-8. Target T348, the best decoys in the database, and the top scoring decoys. A) Target T348. B) The best decoy in the database, 1tl2, had an RMSD of 5.3 Å. Top scoring decoys: C) 1cvm, 1poo, 2poo, and 1qlg, each with an RMSD of 6.9 Å. D) 1s6l had an RMSD of 8.7 Å. E) 1hzz, 1l7d, 1l7e, 1ptj, 1u3d, and 1xlt have RMSDs ranging from 10.6 to 10.7 Å. F) 1e88 had an RMSD of 11.0 Å.

Figure 6-9. Target T349, best decoy in database, and top scoring decoys. A) Target T349. B) 1uj5 had an RMSD of 5.6 Å and a score of 21. C) 1wel had an RMSD of 5.5 Å and a score of 14. D) 1t9u had an RMSD of 10.5 Å and a score of 25. E) 1ta3 had an RMSD of 13.6 Å and a score of 25.

Figure 6-10. Target T358, lowest RMSD decoys in database, and top scoring decoys. A) Target T358. B) The best decoys in the database, 1oe9, 1w7i, and 1w7j, each have an RMSD of 5.5 Å. C) The top scoring decoy, 1p99, had an RMSD of 8.3 Å. D) The top scoring decoys, 1efd, 1k2v, and 1k7s, each have an RMSD of 9.3 Å. E) The top scoring decoy, 1x9d, had an RMSD of 11.8 Å.

Figure 6-11. Use of Global Distance Test (GDT) analysis to compare our results with those from other methods used in CASP7. A) Target T348. The lowest RMSD decoy in the set (1tl2) has a score of 22. The remaining decoys in the figure satisfied all constraints. B) Target T349. The lowest RMSD decoys are 1wel and 1uj5, which satisfied 14 and 21 constraints, respectively, while 1t9u and 1ta3 satisfied all constraints. C) Target T358. The lowest RMSD decoy in the set, 1oe9, satisfied 22 constraints. The remaining decoys satisfied all constraints.

[Plots for panels A to C. Axes: percent of residues (0 to 100) and distance cutoff in Å (0 to 10). Legend entries: A) 1cvm, 1e88, 1hzz, 1s6l, 1tl2. B) 1t9u, 1ta3, 1uj5, 1wel. C) 1efd, 1oe9, 1p99, 1x9d.]

Figure 6-12. Target T306, lowest RMSD decoys in the database, and the top scoring decoys. A) Target T306. B) Best decoy in database, 1fgu, had an RMSD of 8.1 Å. C) Top scoring decoys, 1jhw and 1j72, had RMSDs ranging from 13.4 to 13.5 Å.

Figure 6-13. Target T311, best decoy in database, and top scoring decoy. A) Target T311. B) 1f6g has the lowest RMSD in the database, 6.6 Å. The PDB entry for the parent protein contains only the α-carbons. C) The top scoring decoys, 1a9x, 1bxr, c30, 1c3o, 1cs0, 1jdb, 1kee, 1mv6, and 1t36, had RMSDs ranging from 10.1 to 10.2 Å.

Figure 6-14. Target T353, best RMSD decoy in database, and top scoring decoys. A) Target T353. B) The lowest RMSD decoys in the database, 1jrp and 1jro, each have an RMSD of 6.4 Å. C) The top scoring decoy, 1ekf, had an RMSD of 10.8 Å. D) Both 1j49 and 1j4a are top scoring decoys and have an RMSD of 11.6 Å.

Figure 6-15. Target T363, best decoy in database, and a top scoring decoy. A) Target T363. B) The lowest RMSD decoy is 1hux with an RMSD of 6.4 Å. C) The top scoring decoy, 1sxg, had an RMSD of 9.3 Å.

Figure 6-16. Use of Global Distance Test (GDT) analysis to compare our results with those from other methods used in CASP7. A) Target T306. The lowest RMSD decoy in the set, 1fgu, satisfied 18 constraints. The other two decoys satisfied all constraints. B) Target T311. The lowest RMSD decoy in the set, 1f6g, is not shown, while 1a9x is a top scoring decoy. C) Target T353. The lowest RMSD decoy in the set, 1jrp, satisfied 17 constraints. D) Target T363. The lowest RMSD decoy in the set, 1hux, satisfied 24 constraints, while 1sxg satisfied all constraints.

Figure 6-17. Histogram of Cα RMSDs for all twelve CASP targets.

Figure 6-18. Top scoring decoys for targets that worked. A) T311, #3545, with an RMSD of 4.8 Å. B) T311, #61, with an RMSD of 14.6 Å. C) T349, #4480, with an RMSD of 4.0 Å. D) T349, #3665, with an RMSD of 11.0 Å. E) T358, #1572, with an RMSD of 4.1 Å. F) T358, #3334, with an RMSD of 11.4 Å. G) T335, #9623, with an RMSD of 1.4 Å. H) T335, #2312, with an RMSD of 8.2 Å. For each target, the top scoring decoys with the lowest and highest RMSDs are shown.

Figure 6-19. Results for T288. A) The best decoy in data set, #369, had an RMSD of 3.5 Å. B) The top scoring decoy, #5124, had an RMSD of 3.6 Å. C) The top scoring decoy, #7663, had an RMSD of 13.2 Å.

Figure 6-20. Results for T348. A) The best decoy in data set, #1017, had an RMSD of 4.2 Å. B) The top scoring decoy, #4218, had an RMSD of 5.6 Å. C) The top scoring decoy, #7088, had an RMSD of 12.8 Å.

Figure 6-21. Results for T359. A) One of the best decoys in data set, #112, had an RMSD of 4.8 Å. B) The other best decoy in data set, #6536, also had an RMSD of 4.8 Å. C) The top scoring decoy, #5817, had an RMSD of 6.0 Å. D) The top scoring decoy, #3012, had an RMSD of 15.2 Å.

Figure 6-22. Results for target T363. A) The best decoy in data set, #4551, had an RMSD of 5.1 Å. B) The best decoy in data set, #6388, had an RMSD of 5.1 Å. C) The top scoring decoy, #6376, had an RMSD of 5.7 Å. D) The top scoring decoy, #5181, had an RMSD of 10.5 Å.

Figure 6-23. Results for target T340. A) The best decoy in data set, #94, had an RMSD of 3.7 Å. B) The top scoring decoy, #9880, had an RMSD of 7.4 Å. C) The top scoring decoy, #9412, had an RMSD of 14.5 Å.

Figure 6-24. Results for target T353. A) The best decoy in data set, #5124, had an RMSD of 4.3 Å. B) The top scoring decoy, #5488, had an RMSD of 7.0 Å. C) The top scoring decoy, #3009, had an RMSD of 13.5 Å.

Figure 6-25. Results for target T306. A) The best decoy in data set, #935, had an RMSD of 8.0 Å. B) The top scoring decoy, #1604, had an RMSD of 12.5 Å. C) The top scoring decoy, #5643, had an RMSD of 13.5 Å.

Figure 6-26. Results for target T309. A) The best decoy in database, #5810, had an RMSD of 8.1 Å. B) The top scoring decoy, #210, had an RMSD of 11.0 Å.

CHAPTER 7
COMPARISONS OF GENERAL AND SPECIFIC DECOY SETS

7.1 Comparing the performance of the general and specific decoy sets on four target proteins

For most targets, many decoys from both the general and specific sets satisfied at least twelve constraints. We found it necessary to use twenty-five distance constraints to adequately distinguish between reliable and unreliable decoys while employing a constraint distance acceptance range of ±5 Å. To avoid dependence on the order of application of constraints, we counted the total number of constraints that each decoy satisfied. Decoys that satisfied the most constraints systematically had the lowest RMSDs.

The general decoy set contains a fixed number of decoys, 8,060,245, while the optimum size for the specific decoy set, in terms of cost and performance, was found to be 10,000 decoys. Decoys with slightly lower RMSDs are generated in the 10,000 set compared to the 1,000 set, but significantly lower RMSD decoys are not usually generated in the 50,000 set.

Rosetta generates low RMSD structures for each of our four target proteins and our scoring procedure effectively assigned these low RMSD decoys high scores. The RMSDs of the best top scoring decoys ranged from 2.4 to 4.6 Å. Several decoys satisfied all twenty-five constraints for 1ubi and 2ezk; the top scoring decoys had a large range of RMSD values. An alternate constraint set may have been more effective in distinguishing between good and bad decoys.

We had similar findings using our general decoy set. The RMSDs of the top scoring decoys ranged from 3.6 to 7.7 Å. Our final results showed that three of the four target proteins had successful structure predictions. Such low resolution structures may be used as starting points in density generation for X-ray structures [111] and may also be useful in determining protein functions [105]. We also analyzed the RMSDs for several proteins and found that in general the average RMSD range for decoys in our database is ~15 Å. Like the PDB, our database contains many semi-redundant structures. Removal of such decoys may further decrease the search time of an already fast screening process.

7.2 CASP7 results

We studied twelve CASP targets using the general and specific decoy sets. Using the general set, the lowest RMSD decoy for five of the CASP7 targets satisfied all distance constraints. Three targets had low RMSD decoys that were not assigned the highest score, while four targets did not have any low RMSD decoys in the database. Because constraints are chosen with the aid of a secondary structure prediction method, an incorrect secondary structure prediction can often lead to poor constraint choices, which results in bad structure predictions. When comparing our method to the other methods used in CASP7, we find that our method's performance was decent. It does quite well for some of the hardest targets, like T309, and not well for some of the easiest targets, like T288, T340, and T359.

Some targets had no low RMSD decoys in the general decoy set. Two possible reasons are: (1) the target contained an unusual protein fold not seen in the PDB or (2) the protein fold was excluded from the database. These protein folds may be more common for smaller structures, but parent proteins with fewer than 100 residues were not used in decoy generation. Another explanation is that the most similar proteins in the PDB were missing small fragments of structure, thereby excluding them from the decoy set.

For the specific decoy sets, low RMSD decoys were generated for all but two targets, T309 and T306. A low RMSD decoy was present for target T309 (5.7 Å) in the general decoy set, but for target T306, the best decoy in the general set had an RMSD of 8.1 Å. Also, for ten of the twelve targets, the specific set had lower RMSD decoys compared to those in the general set. The general decoy set generated better decoys for targets T309 and T359.

The discrimination process was equally effective for both types of decoy sets. Three targets (T288, T359, T335) had successful predictions using both decoy sets. Most methods used in CASP7 also performed well for these targets. Target T306 did not have low RMSD decoys in either set and it was very difficult for most other groups to predict. The three targets for which the general decoy set contained low RMSD decoys that were not assigned top scores (T348, T349, T358) and the two targets for which the general decoy set did not have low RMSD decoys (T311, T363) had low RMSD predictions using the specific decoy sets. Using the general decoy set, T309 and T340 had low RMSD decoys with top scores, whereas use of the specific decoy set did not result in successful predictions. Finally, for T353, the specific decoy set generates a low RMSD decoy but it is not assigned a top score, while the general decoy set does not generate a low RMSD decoy. When both methods are considered, ten of the twelve targets had successful predictions.

In both types of decoy sets, whenever high RMSD decoys satisfied all constraints it was because the decoy and the target had regions of great similarity. In those cases where we selected a reference structure and chose constraints involving atoms from that region, such high RMSD, high scoring decoys were common, especially when the reference structures were included in the region of similarity. Better results were seen when a set of residues was selected and constraints were chosen between them.
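
The scoring step summarized in Section 7.1 counts, for each decoy, how many distance constraints it satisfies within the ±5 Å acceptance range, so the result does not depend on the order in which constraints are applied. A minimal Python sketch of that counting follows. The data layout (a dictionary of Cα coordinates keyed by residue number), the function names, the inclusive comparison at exactly 5 Å, and the toy coordinates are all assumptions made for illustration; this is not the code used in this work.

```python
import math


def count_satisfied(ca_coords, constraints, tolerance=5.0):
    """Count the distance constraints satisfied by one decoy.

    ca_coords:   dict mapping residue number -> (x, y, z) of the C-alpha atom.
    constraints: list of (res_i, res_j, target_distance) in angstroms.
    tolerance:   acceptance range in angstroms (+/- 5 in the text; treating the
                 boundary as inclusive is an assumption).
    """
    satisfied = 0
    for res_i, res_j, target in constraints:
        distance = math.dist(ca_coords[res_i], ca_coords[res_j])
        if abs(distance - target) <= tolerance:
            satisfied += 1
    return satisfied


def rank_decoys(decoys, constraints, tolerance=5.0):
    """Return (decoy name, score) pairs sorted from highest to lowest score."""
    scores = {
        name: count_satisfied(coords, constraints, tolerance)
        for name, coords in decoys.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy example with three residues and three constraints (hypothetical values).
constraints = [(1, 2, 5.0), (1, 3, 18.0), (2, 3, 8.0)]
decoys = {
    "decoy_a": {1: (0.0, 0.0, 0.0), 2: (3.0, 4.0, 0.0), 3: (10.0, 0.0, 0.0)},
    "decoy_b": {1: (0.0, 0.0, 0.0), 2: (3.0, 4.0, 0.0), 3: (15.0, 0.0, 0.0)},
}
ranking = rank_decoys(decoys, constraints)
print(ranking)
```

With twenty-five constraints, a decoy that satisfies all of them would receive the top score of 25 under this scheme.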

CHAPTER 8
INTRODUCTION: AZOBENZENE ISOMERIZATION [a]

[a] Adapted with permission from Crecca, C. R., Roitberg, A. E. Theoretical Study of the Isomerization Mechanism of Azobenzene and Disubstituted Azobenzene Derivatives, J. Phys. Chem. A, 2006;110:8188-8203. Copyright 2006 American Chemical Society.

8.1 Isomerization Mechanism

Azobenzene can adopt cis and trans conformations in the electronic ground state, with the trans isomer lower in energy by approximately 0.6 eV [164]. The trans to cis energy barrier was found experimentally to be about 1.6 eV [165]. Azobenzene is known to undergo a reversible photoisomerization between these conformations. A trans to cis isomerization occurs upon excitation at 365 nm (3.40 eV) and a cis to trans isomerization takes place at 420 nm (2.95 eV) [166]. A thermally induced cis to trans isomerization is also possible in the ground state. Due to their facile interconversion at appropriate wavelengths, azobenzenes have the potential to be used in optical switching and image storage devices [167-170] as well as molecular scissors [171] and as targets for coherent control in molecular electronics [172].

There are two pathways by which isomerization is thought to take place. The rotation pathway occurs by an out-of-plane torsion of the CNNC dihedral angle labeled in Figure 8-1. The inversion pathway involves an in-plane inversion of the NNC angle between the azo group and the adjacent carbon of the benzene ring. The inversion angle is labeled in Figure 8-1.

An interesting and somewhat puzzling aspect of the photochemistry of azobenzenes is the difference in trans to cis quantum yield upon excitation to the dark S1(nπ*) state (Φ = 0.20 [166,173] to 0.36 [174]) and the bright S2(ππ*) state (Φ = 0.09 [166,173] to 0.20 [174]). Even within this large experimental range, Φ(S1) is clearly larger than Φ(S2). When the rotation pathway is blocked by restricting the NN bond rotation with a crown ether [175], a cyclophane structure [176], or a cyclodextrin cavity [177], the difference in quantum yield disappears. This observation led to the belief that isomerization occurs by different mechanisms after the nπ* and ππ* excitations.

Most researchers agree that the inversion mechanism dominates in the ground state [178-181], but until recently there was much debate over which mechanism dominates after excitation to each excited state. Monti's [178] minimal basis set CI calculations provided the first theoretical explanation: excitation to S1 resulted in isomerization via the inversion pathway, while the rotation pathway dominated after S2 excitation. His potential energy curves were adopted by most experimentalists and used to explain their results.

Time-resolved UV-visible absorption spectroscopy of azobenzene by Lednev shows that upon excitation of trans-azobenzene at λexc = 280 to 347 nm, two transients are formed [182-184]. One was determined to be fast decaying, 1 ps, corresponding to the S2 state, and the other was longer-lived, 10 to 16 ps, corresponding to the S1 state. Lednev used Monti's potential energy curves to explain his results. Therefore, these transients have been assigned assuming the rotational pathway dominates after S2 excitation.

Fujino [185] performed time-resolved Raman spectroscopy to show that the S1 state that formed after S2 excitation had an NN stretching frequency similar to that of the S0 state. This indicates that the NN double bond remains intact after the excitation and therefore provides evidence for the inversion mechanism in the S1 state. In later work, Fujino [186] presented results from a time-resolved fluorescence experiment that denied the existence of a rotational pathway that starts from the S2 state, in contrast with Monti's work. They showed that isomerization always occurs in the S1 state regardless of excitation wavelength. To explain the differing quantum yields, they proposed an additional relaxation channel that opens upon S2 excitation and produces mostly trans isomers.
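
As a quick arithmetic check, the photon energies quoted in Section 8.1 for the excitation wavelengths (365 nm as 3.40 eV and 420 nm as 2.95 eV) follow from E = hc/λ. The short Python sketch below performs that conversion; the function name is ours and the constant is the standard value of hc expressed in eV·nm.

```python
HC_EV_NM = 1239.841984  # Planck constant times speed of light, in eV*nm


def photon_energy_ev(wavelength_nm):
    """Convert a photon wavelength in nanometers to an energy in electronvolts."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return HC_EV_NM / wavelength_nm


for wavelength in (365.0, 420.0):
    print(f"{wavelength:.0f} nm -> {photon_energy_ev(wavelength):.2f} eV")
```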


Much theoretical work has been done to investigate the photochemistry of azobenzene. Cattaneo and Persico [179] performed complete active space self-consistent field (CASSCF) and CIPSI calculations to generate potential energy curves of the ground and excited states. Ishikawa et al. [187] obtained three-dimensional potential energy surfaces of the S0, S1, S2, and S3 states using CASSCF and the multireference configuration interaction method with singles and doubles (MRCISD). Quenneville [188] used CASSCF to generate potential energy curves for the lowest five electronic states. Tiago et al. [189] performed two-dimensional surface scans for S0, S1, and S2 using TDDFT. Ciminelli et al. [190] combined Tully's surface hopping approach with a direct semiempirical calculation to study the dynamics in the excited states. Cembran et al. [191] calculated the lowest singlet and triplet excited state potential energy surfaces (PES) along the torsion pathway using complete active space with second-order perturbation theory (CASPT2). Gagliardi et al. [192] also focused on the torsion pathway but used MS-CASPT2 and TDDFT. Diau [193] used CASSCF to look at the inversion, rotation, and concerted-inversion pathways on the S1 surface.

The most recent theoretical conclusions agree that the n→π* state has a slight inversion barrier and a nearly barrierless rotation pathway [179,180,187,189,190,193]. Several researchers have found an S1-S0 conical intersection along the rotation pathway with a CNNC dihedral angle of ~90.0° [187-191,193]. It is generally agreed that when azobenzene is excited to the S1 state, relaxation to the S0 state occurs through the conical intersection at the midpoint of the rotation pathway [187,191,193]. Recent experimental work has shown support for this mechanism [194].

The comprehensive studies of Fujino and Tahara [185] showed that isomerization does not occur directly on the S2 state; instead, the molecule relaxes to a lower-lying excited state, where it then isomerizes.
Some calculations point to an S2-S1 conical intersection near the trans-azobenzene Franck-Condon region, which leads to a direct S2 to S1 relaxation [188,190].


Many models have been unable to explain the difference in quantum yield that is seen upon excitation to the S2 state. Diau proposed a new isomerization pathway that is open after S2 excitation [193]. This channel produces more trans isomers than cis, thereby lowering the trans to cis quantum yield. This mechanism is explored in our studies.

In addition to investigating the preferred isomerization mechanism, we also look at how substituting the phenyl rings of azobenzenes affects the isomerization process. To study these effects, we examined the pathways by generating potential energy surfaces of the ground and excited states of azobenzene [Azo] and four of its derivatives: 4,4'-diaminoazobenzene [Azon], 4,4'-nitro-aminoazobenzene [AzoNO2NH2], N-[4-(4-(Acetylamino)phenylazo)phenyl]acetamide [Azonco], and 4,4'-dinitroazobenzene [AzoNO2NO2] (Figure 8-2). From here on, the azobenzenes are referred to by the names in brackets.

Absorption spectroscopy by Blevins and Blanchard on the Azo, Azon, and Azonco systems suggests that the ground state isomerization barrier is reduced when electron-donating substituents are placed on the benzene rings [180]. Our results, however, indicate that electron-donating groups, like NH2 and HNCOCH3, increase the ground state inversion barrier, while electron withdrawing groups, like NO2, decrease it. The lack of solvent effects in our calculations may be the reason for these discrepancies, as will be discussed further in this work.

8.2 Applications of Azobenzenes in Biomolecules

Recently, a photoswitchable molecular glue for DNA has been developed that can reversibly control the hybridization of mismatch-containing DNAs with the aid of an external light stimulus [195]. These small synthetic molecules bind specifically to mismatched DNA and stabilize the mismatched DNA duplex, thereby acting as a glue holding together two single-stranded DNAs.
Azobenzene was incorporated into naphthyridine carbamate dimers, which bind specifically to GG mismatches in DNA. When the azobenzene undergoes isomerization,


the positions and orientations of the naphthyridines also change and therefore enable the adherence of two single-stranded DNAs that contain the GG mismatch. The stabilization of the DNA duplex by the glue was evaluated by melting temperature comparisons. The cis-azobenzene-containing glue stabilized the GG mismatch DNA more strongly than the trans complex. It was also found that the cis complex disassembled upon cis to trans isomerization by 430 nm photoillumination. Thus, this reversible, photoswitchable molecular glue for DNA has the potential to be used in controlling biological functions triggered by DNA hybridization. It may also be useful in the reversible construction of DNA-based nanoarchitectures.

Azobenzene has also been incorporated into an ionotropic glutamate receptor, where it acts as a photoswitch and controls an ion channel in cells [196]. The switch covalently modifies target proteins and can reversibly present and withdraw a ligand from its binding site through the photoisomerization of azobenzene. Upon photoswitching to the active state, a tethered glutamate is placed near the binding site. The photostationary state can be altered using different wavelengths of light, thereby setting the fraction of active channels in an analog fashion. The switch can be turned on with short pulses at one wavelength, kept on in the dark for a few minutes, and turned off with long pulses at another wavelength. In this way, sustained activation with minimal radiation is achieved. The process provides quick and reversible control of protein function.


[Figure 8-1 image not reproduced. Panel labels: rotation (out of plane), labeled "Dihedral"; inversion (in plane), labeled "Angle".]

Figure 8-1. Schematic diagram of the rotation and inversion pathways of the trans → cis isomerization of azobenzenes. The rotation pathway is obtained by a torsion of the azo group around the CNNC dihedral angle. The inversion pathway is obtained by an in-plane inversion of the NNC angle formed between the azo group and the attached carbon of one of the benzene rings.


[Figure 8-2 image not reproduced. Panels A to E.]

Figure 8-2. Structures of compounds investigated in this work: A) Azo. B) Azon. C) Azonco. D) AzoNO2NH2. E) AzoNO2NO2. This numbering scheme will be referred to throughout the text.


CHAPTER 9
COMPUTATIONAL DETAILS (b)

(b) Adapted with permission from Crecca, C. R., Roitberg, A. E. Theoretical Study of the Isomerization Mechanism of Azobenzene and Disubstituted Azobenzene Derivatives, J. Phys. Chem. A, 2006;110:8188-8203. Copyright 2006 American Chemical Society.

9.1 Ground-State Calculations

All calculations were performed using Gaussian 03 [197]. All ground-state geometries were computed using ab initio density-functional theory with the B3LYP [198] functional and the 6-31G* basis set [199], as this method was previously found to accurately reproduce experimental results [200]. To investigate the rotation and inversion pathways, the potential energy surface was generated by scanning the NNC angle (the angle labeled in Figure 8-1; atoms 7-8-9 in Figure 8-2) from 80.0° to 180.0° and the CNNC dihedral angle (the dihedral angle labeled in Figure 8-1; atoms 4-7-8-9 in Figure 8-2) from -40.0° to 220.0° at 10.0° intervals. For each calculation, the NNC angle and the CNNC dihedral angle were fixed at the appropriate values while the rest of the degrees of freedom were optimized. The remaining points in the potential energy surface were found through symmetry.

The potential energy surface for the concerted inversion pathway was generated in the same manner, except that the NNC and CNN angles were scanned synchronously. Two potential energy surfaces were generated for 4,4'-nitro-aminoazobenzene because its benzene rings are asymmetrically substituted. Azo(NO2)NH2 refers to the surface with NO2 on the same side as the NNC angle being inverted (9-8-7 in Figure 8-2), while AzoNO2(NH2) represents the surface with NH2 on the same side as the inverted NNC angle (4-7-8 in Figure 8-2).

Charges were calculated using the CHelpG method to determine the electron donating or withdrawing nature of each substituent. Electron donating groups were identified as those that showed a decrease in charge on the ortho and para positions of the substituted azobenzene


compared to the charge on the unsubstituted azobenzene, while electron withdrawing groups had a negative charge difference on the atoms in the meta positions.

9.2 Excited-State Calculations

All calculations were performed using Gaussian 03. Time-dependent density-functional theory (TDDFT) with the B3LYP functional and the 6-31G* basis set was used for the excited-state calculations, as this combination was found to give reliable results [166]. The excited-state potential energy surfaces were generated by calculating single-point vertical excitation energies at each point of the ground-state potential energy surfaces. Vertical excitations were also calculated from the fully optimized ground state cis and trans minima.
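
The ground-state scan described in Section 9.1 can be pictured as a regular grid of constrained geometries. The following is a minimal sketch, not the scripts used in this work: the function name `scan_grid` and its parameters are hypothetical, and it only enumerates the (NNC angle, CNNC dihedral) pairs at which a constrained optimization would be run. It does not model the symmetry step used to fill in the remaining points.

```python
from itertools import product


def scan_grid(angle_range=(80.0, 180.0), dihedral_range=(-40.0, 220.0), step=10.0):
    """Enumerate (NNC angle, CNNC dihedral) pairs, in degrees, for a 2D scan.

    Hypothetical helper: the ranges and step follow the values quoted in
    Section 9.1; everything else is an illustrative assumption.
    """
    def axis(lo, hi):
        # Number of intervals between lo and hi, endpoints included.
        n = int(round((hi - lo) / step))
        return [lo + i * step for i in range(n + 1)]

    return list(product(axis(*angle_range), axis(*dihedral_range)))


grid = scan_grid()
print(len(grid), "constrained optimizations")
print("first point:", grid[0], "last point:", grid[-1])
```

Each grid point would correspond to one calculation in which the two coordinates are frozen and all other degrees of freedom are relaxed; the excited-state surfaces of Section 9.2 then reuse the same geometries for single-point vertical excitations.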


CHAPTER 10
RESULTS: UNSUBSTITUTED AZOBENZENE (c)

(c) Adapted with permission from Crecca, C. R., Roitberg, A. E. Theoretical Study of the Isomerization Mechanism of Azobenzene and Disubstituted Azobenzene Derivatives, J. Phys. Chem. A, 2006;110:8188-8203. Copyright 2006 American Chemical Society.

10.1 Optimized Ground-State Geometry

The optimized geometries of cis and trans azobenzene were found, and the results are shown in Table 10-1. The trans isomer is about 15.2 kcal mol⁻¹, or 0.66 eV, lower in energy than the cis isomer. This is slightly higher than the experimental value of 0.6 eV [164]. Different experimental methods suggest different structures for the trans isomer. Electron diffraction [201] results indicate that the phenyl rings of the trans isomer are 30° out of plane, while the X-ray [202] data show a planar structure. Our results agree with the X-ray data as well as with the results of several theoretical calculations [179,187,191,203]. The structure of the cis isomer is less controversial. Our DFT results are very similar to both X-ray data [204] and other theoretical predictions [179,187,189,191,200,203].

10.2 Electronic Excitation Energies

For the singlet vertical excitations of the trans isomer of azobenzene, the first transition, n→π*, is symmetry forbidden and therefore has a very weak oscillator strength, while the second transition, π→π*, is much more intense. The excitation energies for these transitions are shown in Table 10-2. The assignment of symmetry is done by visual inspection. Evaluation of our molecular orbitals (Figure 10-1) reveals that the first transition originates from the lone pair on the central nitrogens and is of 88% n→π* character, as calculated from the CI coefficients. The second transition is 78% π→π* and is delocalized throughout the entire molecule. It has been suggested that the second excited state relaxes to the first via a conical intersection above the ground state trans minimum.
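
The energies in this work are quoted in kcal mol⁻¹, eV, and excitation wavelengths in nm. As a quick consistency check, the sketch below converts between them. The conversion constants are standard rounded values (1 eV ≈ 23.0605 kcal mol⁻¹, hc ≈ 1239.84 eV nm); the function names are illustrative and not taken from this work.

```python
KCAL_PER_MOL_PER_EV = 23.0605   # 1 eV expressed in kcal/mol (rounded)
HC_EV_NM = 1239.84              # Planck constant times speed of light, in eV*nm (rounded)


def kcal_to_ev(energy_kcal_per_mol):
    """Convert an energy from kcal/mol to eV."""
    return energy_kcal_per_mol / KCAL_PER_MOL_PER_EV


def nm_to_ev(wavelength_nm):
    """Convert a photon wavelength in nm to a photon energy in eV."""
    return HC_EV_NM / wavelength_nm


cis_trans_gap_ev = kcal_to_ev(15.2)   # calculated cis-trans energy difference
trans_to_cis_ev = nm_to_ev(365.0)     # trans to cis excitation wavelength
cis_to_trans_ev = nm_to_ev(420.0)     # cis to trans excitation wavelength

print(f"15.2 kcal/mol = {cis_trans_gap_ev:.2f} eV")
print(f"365 nm = {trans_to_cis_ev:.2f} eV, 420 nm = {cis_to_trans_ev:.2f} eV")
```

The rounded results agree with the values quoted in the text (0.66 eV for the cis-trans gap, and 3.40 eV and 2.95 eV for the two excitation wavelengths in Chapter 8).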


The TDDFT calculated energy for the S1 ← S0 transition of trans azobenzene, 2.55 eV, is fairly close to the known experimental value, 2.79 eV [166]. Although CASSCF [187] and configuration interaction by perturbative iterative selection (CIPSI) [179] calculations have given values that agree slightly better with experiment for this transition, 2.85 eV and 2.81 eV respectively, the S2 ← S0 transition is much better described by TDDFT, with an energy of 3.77 eV compared to an experimental value of 3.95 eV. CASSCF predicts an energy of 7.62 eV, and the CIPSI energy is 4.55 eV. TDDFT consistently predicts slightly lower energies than the experimental values, while the CASSCF values are generally much higher. These values are summarized in Table 10-2.

The S1 ← S0 transition occurs at about the same energy for both trans and cis. Unlike the trans excitation, however, the S1 ← S0 transition from the cis isomer shows slight intensity because the loss of symmetry makes the transition allowed. The S2 ← S0 transition from the cis isomer is much less intense and slightly higher in energy than that of the trans isomer.

10.3 Potential Energy Surfaces

10.3.1 Ground State

A ground state three-dimensional potential energy surface and a contour map were calculated for azobenzene (Figure 10-2). The surface is very symmetric, with two cis and two trans minima. Cis to trans barrier heights were determined from these plots by finding the energy of the highest point on the potential energy surface along the pathway and subtracting from it the energy of the cis minimum. These points were confirmed as true transition states by checking for exactly one imaginary frequency in a normal-mode analysis. The peak along the inversion pathway (angle reaction coordinate) was taken to be at an angle of 180.0° and a dihedral angle of 180.0° and is represented in Figure 10-2B by point 1.
The peak of the rotational path way (dihedral angle reaction coor dinate) was taken to be at a


dihedral angle of 90.0°, while the angle was the same as that of the trans minimum, 110.0°. In the rotation pathway, the peak was a saddle point and is labeled point 2 in Figure 10-2B.

Azobenzene is known to undergo a thermal cis to trans isomerization in the ground state, so only the cis barriers will be discussed. The barrier along the inversion pathway, 24.9 kcal mol⁻¹, was lower than that of the rotation pathway, 36.2 kcal mol⁻¹, indicating that in the ground state the inversion mechanism is favored. This is in agreement with previous reports [178-181]. A cis to trans barrier height for azobenzene was measured experimentally [180] to be 25.8 kcal mol⁻¹, in good agreement with our results.

We can explain the difference in energy barriers between mechanisms by looking at how the NN distance changes along each pathway. Along the inversion pathway, the NN distance decreases (the bond order increases) from the trans isomer to the transition state (point 1 in Figure 10-2B) and then increases (the bond order decreases) as it approaches the cis isomer. The inversion transition state shows the strongest NN bond along the pathway. The opposite trend is seen along the rotation pathway. The NN distance increases from the trans isomer to the rotation transition state (point 2 in Figure 10-2B) and then decreases as it approaches the cis isomer. The NN distance found in the rotation transition state is approximately that of a single bond. There is a high energy cost involved in decreasing the NN bond order in the rotation pathway, which is seen as an increase in the energy barrier.

10.3.2 Excited State 1 (n→π*)

Potential energy surfaces and contour maps were calculated for the first two excited states (Figure 10-3). Our surfaces are similar to those of previous calculations [187,189]. Vertical excitations from the trans minima reach the points labeled 1 and 4, while excitations from the cis minima arrive at the points labeled 3 and 6.
Points 2 and 5 depict the placement of the S1 minima.


10.3.2.1 Rotation pathway

There is essentially no energy barrier along the rotation pathway of the first excited state, as has also been reported in previous calculations [179,180,187,189,190,193]. The potential energy surface along this pathway has only a shallow slope above the area corresponding to the trans minimum (from points 1 to 2 and 4 to 5 in Figure 10-3B), 0.21 kcal mol⁻¹ degree⁻¹, and a steeper slope on the cis side (from points 3 to 2 and 6 to 5 in Figure 10-3B), 0.33 kcal mol⁻¹ degree⁻¹. These slopes are also shown schematically in Figure 10-4. The figures suggest that when the molecule is excited from the cis conformation, relaxation to the excited state minimum is much faster than when it is excited from the trans conformation. This phenomenon has been shown experimentally by femtosecond transient absorption measurements [205].

A conical intersection was found between the ground and first excited states. It appears where the minimum of the excited state is very close in energy to the top of the ground-state barrier along the rotation pathway, as can be seen in Figure 10-5. We have located our conical intersection at an NNC angle of 140.0° and a CNNC dihedral angle of 90.0° (point 5 in Figures 10-3B and 10-4A). The location of this conical intersection is in agreement with several other groups [187-191,193]. The splitting between the surfaces is estimated to be 0.65 kcal mol⁻¹.

Stilbene, which can only isomerize via the rotation mechanism, has been found to have an S1-S0 conical intersection at the midpoint of the rotation pathway and is also known to have an isomerization yield of 0.5. It is interesting that azobenzene has a conical intersection near the same location yet shows a very different quantum yield. This can be explained by looking at the difference in slope on the S1 surface on either side of the conical intersection in azobenzene. As mentioned previously, the S1 slope above the cis minimum (point 6 in Figure 10-


4) is greater than the corresponding slope on the trans side (point 4 in Figure 10-4). The crossing probability close to the conical intersection can be related to the non-adiabatic coupling between S0 and S1 [206], written as

$$ d_{S_1 S_0} = \left\langle \psi_{S_0} \,\middle|\, \nabla \psi_{S_1} \right\rangle $$

A larger slope corresponds to a larger change in wavefunction (right side of the formula). There is therefore a greater probability of jumping from S1 to S0 when starting from the cis side rather than the trans side, resulting in more trans isomers in the S0 state, because the transition carries the momentum from the excited state. In other words, while the wave packet oscillates on the S1 surface near the conical intersection, more relaxation occurs when it moves from point 6 to point 5 than from point 4 to point 5, depositing more population on the trans side than on the cis side of the ground state surface and hence producing a quantum yield lower than 0.5.

The slopes (cis and trans sides) on the S1 surface of stilbene are essentially equal, giving rise to more similar S0 and S1 wavefunctions than those of azobenzene. The probability of jumping from S1 to S0 is equal when coming from either side of the conical intersection in the case of stilbene. This results in the experimentally seen quantum yield of 0.5.

10.3.2.2 Inversion pathway

There is a slight trans → cis energy barrier along the inversion pathway, as can be seen in Figure 10-4. The S1 trans → cis energy barrier is 9.6 kcal mol⁻¹. There is no conical intersection between the ground and first excited state along this pathway, making the inversion mechanism highly improbable.
This is in agreement with previous calculations [188,193]. Our results indicate that the isomerization can easily occur through excitation to the first excited state, relaxation to the excited state minimum along the rotation pathway, and descent to either the cis or trans conformation via the conical intersection. This sequence accounts for the known cis yield (0.20-0.36) after excitation to the first excited state.


10.3.3 Excited State 2 (π→π*)

The potential energy surfaces of the second excited state are shown in Figure 10-6. As in the ground state surface, cis and trans minima appear on the surface of the S2 state along the inversion and rotation pathways. The cis minima are extremely shallow. The trans → cis energy barriers were computed in the same manner as the ground state cis → trans barriers. The inversion barrier was found to be 30.1 kcal mol⁻¹, while that of the rotation pathway was 29.6 kcal mol⁻¹. Because of these substantial energy barriers, it is unlikely that isomerization occurs on the S2 surface. Rapid relaxation from the S2 state to the S1 state is energetically more favorable. This is in agreement with Kasha's rule [207]. We examined energy gaps between the two states along the inversion, rotation, and concerted inversion pathways in order to investigate this process.

10.3.3.1 Rotation pathway

The possibility of a conical intersection between the S2 and S1 states along the rotation pathway with an angle of 117° and a dihedral angle of 180° has been previously suggested [188]. For Azo, the states differ by 23.48 kcal mol⁻¹ at the trans minimum, as can be seen in Figure 10-7A. We do not find a conical intersection between S1 and S2 along the rotation pathway and can therefore rule out this pathway as an isomerization mechanism.

10.3.3.2 Inversion pathway

A conical intersection between the S2 and S1 states has been previously located near the ground state trans minima [190]. While we do not find a curve crossing in this exact area, we do see the energy difference between the S1 and S2 states become smaller along the inversion pathway, as can be seen in Figure 10-7B. This point is a few degrees away from the S2 minima. At a CNNC dihedral angle of 180.0° and an NNC angle of 100.0°, the energy gap between the S1 and S2 surfaces appears to be the smallest, 15.70 kcal mol⁻¹. This energy gap may be small


enough to allow for rapid relaxation to the first excited state. This explains why experimentalists see two transients, a shorter-lived one corresponding to the S2 state before it relaxes to a longer-lived species corresponding to the S1 state [182-184].

10.3.3.3 Concerted inversion pathway

The above mechanism does not explain the difference in quantum yield that is seen upon excitation at different wavelengths for unsubstituted azobenzene. To explain this process, we invoke Diau's [193] proposal of an additional isomerization channel (concerted inversion) that is opened by exciting to the S2 state. The concerted-inversion pathway involves a synchronous inversion of the NNC and CNN angles. In our calculations, the CNNC dihedral angle is fixed at 180.0°. The concerted inversion pathway is plotted in Figure 10-7C. As in the inversion pathway, the S1 and S2 surfaces are close in energy at an NNC angle of 100.0°. This energy gap, 5.17 kcal mol⁻¹, is significantly smaller than that of the rotation or inversion pathway. It seems likely that rapid relaxation from the S2 to the S1 state can occur because of this small energy gap, which will again give rise to two transients as seen experimentally.

A potential problem of the concerted-inversion mechanism is the existence of an energy barrier on the S1 surface. The energy barrier (labeled b in Figure 10-7C), 31.21 kcal mol⁻¹, is measured by subtracting the energy of the S1 minimum from the S1 energy at the S1-S0 conical intersection. The available energy (labeled a in Figure 10-7C), 50.43 kcal mol⁻¹, is calculated by subtracting the S1 minimum energy from the S1 energy at the S2-S1 conical intersection. There is enough energy available to overcome the energy barrier, so the channel is open.

10.4 Summary of Unsubstituted Azobenzene

Excitation to the S1 state leads to isomerization via the rotation mechanism.
Our conclusion is based on the finding of a conical intersection between the S1 and S0 states near the midpoint of this pathway (NNC = 110°, CNNC = 90.0°). The rotation pathway has also been found


to be without a significant barrier, unlike the inversion pathway. Excitation to the S2 state results in rapid relaxation to the S1 surface via the conical intersection found at NNC = 100° and CNNC = 180° along the concerted inversion pathway. The energy gap between these surfaces is significantly smaller than those seen in other pathways. Once on the concerted-inversion S1 surface, there is an energy barrier of ~31.2 kcal mol⁻¹. Only when excitation to the S2 state occurs is there enough energy to overcome this barrier. The conical intersection between the S1 and S0 states is located at NNC = 170° and CNNC = 180°. More trans isomers would be produced because the crossing of these states is on the trans side of the potential energy curve. This is in agreement with the experimental observation of differing quantum yields upon excitation at different wavelengths.

The concerted-inversion pathway has a nearly planar transition state in which the NN double bond stays intact. This explains Fujino's observation that the S1 state formed after S2 excitation had an NN stretching frequency similar to that of the S0 state [185]. It should also be noted that because the S2 state relaxes to the S1 state at a geometry similar to that of both the electronic ground state and the directly excited S1 state in the Franck-Condon region, the spectra of both S1 states should be quite similar, as seen in Fujino's work [208]. A schematic diagram of these mechanisms is shown in Figure 10-8.
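
The energy bookkeeping behind two of the conclusions above can be sketched in a few lines. This is an illustration only: the relative-rate estimate assumes a simple Arrhenius picture with equal prefactors for both pathways and a temperature of 298.15 K, neither of which is stated in this work, and the function names are hypothetical. The barrier and energy values are the ones quoted in Sections 10.3.1 and 10.3.3.3.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def boltzmann_ratio(barrier_high, barrier_low, temperature=298.15):
    """Rate over the higher barrier relative to the lower one.

    Assumes equal prefactors (an assumption made for this sketch only).
    Barriers are in kcal/mol, temperature in K.
    """
    return math.exp(-(barrier_high - barrier_low) / (R_KCAL * temperature))


def excess_energy(available, barrier):
    """Energy left after crossing a barrier, in the same units as the inputs."""
    return available - barrier


# Ground state cis -> trans barriers: rotation 36.2, inversion 24.9 kcal/mol.
ratio = boltzmann_ratio(36.2, 24.9)

# Concerted inversion on S1: available energy 50.43, barrier 31.21 kcal/mol.
excess = excess_energy(50.43, 31.21)
channel_open = excess > 0.0

print(f"rotation/inversion relative rate (equal prefactors assumed): {ratio:.1e}")
print(f"excess energy on the S1 concerted-inversion path: {excess:.2f} kcal/mol")
print("channel open:", channel_open)
```

Under these assumptions the rotation pathway is many orders of magnitude slower than inversion in the ground state, and the S2-excited molecule retains a positive energy margin over the S1 concerted-inversion barrier, consistent with the qualitative statements in the text.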


Table 10-1. Optimized Geometries of cis and trans Isomers of Azobenzene

                  Angles/deg                Distances/Å       Energy (a)/
                  CNNC     NNCC    NNC      dNN      dCN      kcal mol⁻¹
trans             180.0    0.0     114.8    1.261    1.419    0.0
trans X-ray (b)   180.0    0.0     114.1    1.247    1.428
trans ED (c)      180.0    30.0    114.5    1.268    1.427
cis               9.8      50.3    124.1    1.250    1.436    15.2
cis X-ray (d)     0.0      53.3    121.9    1.253    1.449

(a) Energies are relative to the trans isomer. (b) Reference 202. (c) Reference 201. (d) Reference 204.

Table 10-2. Vertical Excitation Energies (eV) of trans and cis Azobenzene

              TDDFT (a)      Exp. (b)   CASSCF (c)   CIPSI (d)
trans   S1    2.55 (0.0)     2.79       2.85         2.81
        S2    3.77 (0.77)    3.95       7.62         4.55
cis     S1    2.57 (0.04)    2.82       3.65         2.94
        S2    4.12 (0.07)    4.77       8.62         4.82

(a) Intensity is in parentheses. (b) Reference 166. (c) Reference 187. (d) Reference 179.
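
The comparison between methods in Section 10.2 can be made quantitative directly from Table 10-2. The sketch below computes the signed deviation from experiment for each transition and the mean absolute deviation per method. The data are copied from Table 10-2; the variable and function names are illustrative.

```python
# Vertical excitation energies in eV, copied from Table 10-2.
# Key order: (isomer, state).
EXPERIMENT = {("trans", "S1"): 2.79, ("trans", "S2"): 3.95,
              ("cis", "S1"): 2.82, ("cis", "S2"): 4.77}

CALCULATED = {
    "TDDFT":  {("trans", "S1"): 2.55, ("trans", "S2"): 3.77,
               ("cis", "S1"): 2.57, ("cis", "S2"): 4.12},
    "CASSCF": {("trans", "S1"): 2.85, ("trans", "S2"): 7.62,
               ("cis", "S1"): 3.65, ("cis", "S2"): 8.62},
    "CIPSI":  {("trans", "S1"): 2.81, ("trans", "S2"): 4.55,
               ("cis", "S1"): 2.94, ("cis", "S2"): 4.82},
}


def signed_errors(method):
    """Calculated minus experimental energy for every transition."""
    return {key: CALCULATED[method][key] - EXPERIMENT[key] for key in EXPERIMENT}


def mean_absolute_error(method):
    """Average magnitude of the deviation from experiment."""
    errors = signed_errors(method)
    return sum(abs(e) for e in errors.values()) / len(errors)


mae = {method: mean_absolute_error(method) for method in CALCULATED}
for method, value in mae.items():
    print(f"{method:7s} mean absolute deviation: {value:.2f} eV")
```

All four TDDFT deviations are negative, which matches the statement that TDDFT consistently underestimates the experimental energies, while the CASSCF deviations are dominated by the S2 states.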


[Figure 10-1 image not reproduced. Orbital panel labels: n, HOMO, LUMO.]

Figure 10-1. Molecular orbitals of Azo involved in the S1 ← S0 and S2 ← S0 transitions. This figure also represents the molecular orbitals of Azon and Azonco, as they are very similar to those of Azo.


Figure 10-2. Ground state potential energy surface of Azo. A) Potential energy surface. B) Contour map. Angles in degrees, energy in kcal mol⁻¹ relative to the energy of the ground state trans isomer. In B, point 1 marks the position of the inversion transition state, while point 2 indicates the position of the rotation transition state. The cis and trans minima are also labeled.

Figure 10-3. First excited state potential energy surface of Azo. A) Potential energy surface. B) Contour map. Points 1 and 4 represent where the molecule is on the S1 surface after excitation from the ground state trans minima, whereas excitation from the ground state cis minima will place the molecule at points 3 and 6. Points 2 and 5 represent the S1 minima and also mark the location of the S1/S0 conical intersection. Angles in degrees, energy in kcal mol⁻¹, relative to the energy of the ground state trans isomer.


[Figure 10-4 image not reproduced. Panel A: rotation pathway at NNC = 240°, S0 and S1 energy versus dihedral angle (0° to 180°, trans to cis), points 4, 5, and 6. Panel B: inversion pathway at CNNC = 180°, S0 and S1 energy versus angle (120° to 240°), points 1 and 2.]

Figure 10-4. Schematic representation of pathways in the first excited state of Azo. A) The rotation pathway. B) The inversion pathway. The curves in A are along the angle of 240°, while those in B are along the dihedral of 180°. The labeled points are the same as those in Figure 10-3B. The arrow in B depicts the inversion barrier in the S1 state.

[Figure 10-5 image not reproduced. Axes: dihedral angle, angle, and energy.]

Figure 10-5. Conical intersection of the S0 and S1 states of Azo. Angles in degrees, energy in kcal mol⁻¹.


Figure 10-6. Second excited state potential energy surface of Azo. A) Potential energy surface. B) Contour map. Angles in degrees, energy in kcal mol⁻¹, relative to the energy of the trans isomer.


Figure 10-7. A) Rotation pathway along the angle of the ground state minimum of Azo, NNC = 110°. B) Inversion and C) concerted-inversion pathways of Azo along CNNC = 180.0°. S0 in blue, S1 in red, S2 in green. Angles in degrees, energy in kcal mol⁻¹. In C, arrow a represents the available energy, while arrow b represents the energy barrier.


[Figure 10-8 image not reproduced. Panel A: rotation pathway (trans to cis) after n→π* excitation, showing S0 and S1. Panel B: concerted-inversion pathway (trans to cis) after π→π* excitation, showing S0, S1, and S2.]

Figure 10-8. Scheme of the trans → cis isomerization process after A) n→π* excitation and B) π→π* excitation. The ovals indicate locations of curve crossings.


CHAPTER 11
RESULTS: SUBSTITUTED AZOBENZENES (d)

(d) Adapted with permission from Crecca, C. R., Roitberg, A. E. Theoretical Study of the Isomerization Mechanism of Azobenzene and Disubstituted Azobenzene Derivatives, J. Phys. Chem. A, 2006;110:8188-8203. Copyright 2006 American Chemical Society.

11.1 Optimized Ground-State Geometry

The optimized geometries of the cis and trans isomers of the azobenzenes were found using the same technique as for the unsubstituted azobenzene. Important bond distances, angles, and dihedrals are summarized in Table 11-1. The values listed for Azo(NO2)NH2 are those of the NO2-substituted ring, while the values for the NH2-substituted ring are represented by AzoNO2(NH2).

11.1.1 NN Distance

For each azobenzene studied, the NN bond is shorter for the cis isomer than for the trans isomer. The NN distances were quite similar among the azobenzenes, ranging from 1.260 to 1.267 Å for the trans isomer and from 1.247 to 1.256 Å for the cis isomer. AzoNO2NO2 has the shortest NN distance for both conformations, followed by Azo. The substituents appear to contribute only slightly to the NN bond, as evidenced by the very small increase in bond length upon substitution of the rings with electron donating groups and a small decrease in length upon substitution with electron withdrawing groups.

11.1.2 NNC Angle, CNNC Dihedral Angle, and NNCC Dihedral Angle

Like the NN distances, the NNC angles are very similar. For the trans conformation, the angles range from 114.1° to 115.6°, while the range for the cis isomer is from 124.0° to 125.5°. The CNNC dihedral angles of the trans isomers are all about 180.0°, while the NNCC dihedral angles are about 0.0°. The CNNC dihedral angle for the cis isomers is slightly


larger in substituted azobenzenes, ranging from 9.8° for Azo to 11.8° for Azon. The NNCC dihedral angle was smallest for AzoNO2(NH2) and largest for Azo(NO2)NH2.

11.1.3 Relative Energy Differences

The difference between the cis and trans ground state energies was calculated and found to be very similar across the series, ranging from 14.8 kcal mol⁻¹ for AzoNO2NO2 to 16.8 kcal mol⁻¹ for Azon. Electron donating substituents appeared to increase the relative energy difference, while electron withdrawing groups lowered it. The push-pull system showed a slight increase in relative energy difference when compared to Azo.

11.2 Comparison of Charges

We define electron donating groups as those that activate the ortho and para positions, while electron withdrawing groups are those that activate the meta positions. Activation is determined by the change in charges relative to the unsubstituted azobenzene. Blevins and Blanchard [180] suggested that the CH3CONH groups of Azonco would act as electron withdrawing substituents. Using the CHarges from ELectrostatic Potentials using a Grid-based method (CHELPG) to calculate charges, however, we found that Azonco demonstrates electron donating behavior similar to that of Azon. The charge differences were calculated by subtracting the charge on the unsubstituted azobenzene from that of the substituted azobenzene. As can be seen in Figure 11-1, the carbons that are ortho to the substituent, C2, C6, C11, and C13 (refer to the numbering scheme in Figure 8-2), have similar charge differences, with an average of -0.226 for Azon and -0.234 for Azonco. The para carbons, C4 and C9, are only slightly activated, with average charge differences of 0.126 for Azon and -0.117 for Azonco. The activation of the ortho carbons is enhanced by the electron withdrawing effect of the azo group. The azo group activates the positions meta to itself, which are the same as those ortho to the substituent.
In effect, the azo group will act synergistically with the electron donating substituents.
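
The classification rule used in this section (charge on the substituted molecule minus charge on the unsubstituted molecule, then inspection of the ortho/para versus meta positions) can be sketched as follows. This is a hypothetical illustration: the function names, the decision rule as coded, and especially the per-atom charges in the example are placeholders chosen for demonstration and are not values from this work.

```python
from statistics import mean


def charge_differences(substituted, unsubstituted):
    """Per-atom charge difference: substituted minus unsubstituted."""
    return {atom: substituted[atom] - unsubstituted[atom]
            for atom in substituted if atom in unsubstituted}


def classify_substituent(diff, ortho_para, meta):
    """Label a substituent from where the charge decreases most.

    Simplified reading of the definition in the text: a negative average
    difference on ortho/para positions indicates donating behavior, and a
    negative average difference on meta positions indicates withdrawing.
    """
    op = mean(diff[a] for a in ortho_para)
    m = mean(diff[a] for a in meta)
    if op < 0 and op < m:
        return "donating"
    if m < 0 and m < op:
        return "withdrawing"
    return "undetermined"


# Placeholder charges (NOT data from this work), for one ring only.
# Atom labels follow the idea of Figure 8-2: C2 ortho, C3 meta, C4 para
# with respect to a substituent on C1.
unsubstituted = {"C2": -0.10, "C3": -0.10, "C4": 0.20}
substituted = {"C2": -0.33, "C3": -0.08, "C4": 0.08}

diff = charge_differences(substituted, unsubstituted)
label = classify_substituent(diff, ortho_para=["C2", "C4"], meta=["C3"])
print(diff, label)
```

With real CHELPG charges, the same two functions would be applied to every ring carbon of each substituted azobenzene and the matching atoms of Azo.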


Interesting behavior results when an electron donating substituent, NH2, is placed on one benzene ring para to the azo group and an electron withdrawing group, NO2, is placed in the para position of the other benzene ring. This creates a push-pull system, as in AzoNO2NH2. As seen in Azon, the NH2 group activates the positions ortho to itself, C2 and C6, which are the same as those positions meta to the azo group, as depicted in Figure 11-1. When an electron withdrawing group like NO2 is placed on the ring para to the azo group, there is a mixing of charges. Both groups try to activate the positions that are meta with respect to themselves. This results in a conflict because the meta positions of the azo group are ortho to the NO2 group. What we see is a difference in charge of -0.125 at C14, which is meta to the NO2 group, and 0.129 at C11, which is meta to the azo group. Therefore C11 and C14 are the activated carbons in AzoNO2NH2. AzoNO2NO2 also shows a mixing of charges, with similar results: C3, C6, C11, and C14 are activated.

It can now be stated with confidence that Azonco and Azon have electron donating groups, AzoNO2NO2 has electron withdrawing groups, and Azo(NO2)NH2 is a push-pull system with both an electron donating and an electron withdrawing group.

11.3 Electronic Excitation Energies

The singlet vertical excitations of the trans isomers of the substituted azobenzenes are very similar to those of unsubstituted azobenzene. The first transition, n→π*, is symmetry forbidden and therefore has a very weak oscillator strength, while the second transition, π→π*, shows some intensity. Visual inspection is used to assign symmetry.

The excitation energies for the S1 ← S0 transition were similar for all the azobenzenes, as shown in Table 11-2. The molecular orbitals (Figures 10-1 and 11-2) show again that the first transition originates from the lone pair on the central nitrogens. Figure 10-1 can be used to

PAGE 173

173 represent the molecular orbitals for Azo, Azon, and Azonco. The excitation was of nearly pure n character for all but AzoNO2NH2 and AzoNO2NO2. These systems show some additional charge transfer to their NO2 substituents. The second transition, *, is delocalized throughout the en tire molecule for all but the push pull system. AzoNO2NH2 shows an excitation primarily from the orbitals of the benzene ring with the NH2 substituent as well as from the orbitals of the central nitrogens. As in the n transition, AzoNO2NH2 and AzoNO2NO2 both show a charge transfer to the NO2 substituents. The molecular orbi tals involved in the second transi tion are pictured in Figures 101 and 11-2. AzoNO2NH2 exhibits an intense tran s excitation with the sma llest energy, 2.99 eV, while the excitations of Azonco and Azon are partic ularly close in energy, 3.25 eV and 3.26 eV respectively. AzoNO2NO2 has an excitation of 3.48 eV. The S2 S0 transition of Azo is highest in energy, 3.77 eV, and the least intense of all the az obenzenes. It appear s that adding both electron donating and electron wit hdrawing substituents to Azo d ecreases the excitation energy and increases the intensity of the S2 S0 transition. We have found again that the first and sec ond excited states at the optimized ground state trans geometry are very close in energy. Az o shows the largest energy gap, 1.22 eV, followed by AzoNO2NO2, 1.17 eV, and Azonco, 0.66 eV. Azon and AzoNO2NH2 have very similar energy gaps, 0.550 and 0.546 eV respectively. The energy differences between the first two excited states are summarized in Table 11-2. The energy of the steady-state absorpti on spectroscopy maximum of Azo was 3.96 eV,180 slightly higher than the TDDFT maximum of 3.7 7 eV. Azon showed an experimental excitation of 3.15 eV, while the calculated energy was 3.26 eV. Azonco showed an excitation of 3.41 eV,

PAGE 174

174 slightly higher than the calcul ated energy of 3.25 eV. Both expe rimental and theo retical results show Azo to have the highest energy transition. TDDFT predicts the exc itation energies of Azon and Azonco to be about the same, while experi ment shows these energies to differ by 0.26 eV. It is also interesting to co mpare differences between the cis and trans excitations. Unlike the trans excitations, the S1 S0 transition from the cis isomer shows slight intensity. For Azo and AzoNO2NH2, the S1 S0 transition occurs at about the sa me energy for both trans and cis. Azon and Azonco have trans S1 S0 excitations slightly higher in energy than cis excitations while AzoNO2NO2 shows a higher energy cis excitation. The S2 S0 transition from the cis isomer is mu ch less intense and slightly higher in energy than that of the trans is omer. There is a greater difference between the cis and trans S2 S0 transitions than the S1 S0 transitions. The greatest differen ce is seen in Azonco with almost 0.5 eV separating the cis and trans excitations. 11.4 Potential Energy Surfaces 11.4.1 Ground State Ground state three dim ensional potential energy surfaces and contour maps were calculated for each azobenzene (Figure 11-3). As mentioned previously, for the push pull system, Azo(NO2)NH2 represents the surface with NO2 on the same side as the NNC angle being inverted while AzoNO2(NH2) represents the surface with NH2 on the same side as the inverted NNC angle. As can be seen in these figures, the ground state surfaces of the azobenzenes are very similar. Cis to trans barri er heights were determined as de scribed in Chapter 9. The energy barriers can be found in Table 11-3. For th e push pull system, the barriers for both Azo(NO2)NH2 and AzoNO2(NH2) were considered together. In all five systems, the barrier along the i nversion pathway was lower than that of the rotation pathway, indicating that in the ground state, the inversion mechanism is still favored.
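The comparison of inversion and rotation barriers can be sketched in a few lines of Python. This is only an illustration, not the procedure used in this work: the barrier values are transcribed from Table 11-3, the function names are hypothetical, and treating the push-pull system by taking the lowest barrier for each pathway over both of its surfaces is our reading of "considered together".

```python
# Ground-state cis -> trans barriers in kcal/mol, transcribed from Table 11-3.
BARRIERS = {
    "Azo": {"inv": 24.9, "rot": 36.2},
    "Azon": {"inv": 26.8, "rot": 30.5},
    "Azonco": {"inv": 25.5, "rot": 34.2},
    "Azo(NO2)NH2": {"inv": 17.2, "rot": 31.6},
    "AzoNO2(NH2)": {"inv": 28.5, "rot": 20.8},
    "AzoNO2NO2": {"inv": 20.8, "rot": 29.2},
}


def combine_surfaces(*entries):
    """Lowest barrier per pathway across several surfaces of one molecule.

    Assumption: this is how the two push-pull surfaces are 'considered together'.
    """
    return {
        "inv": min(e["inv"] for e in entries),
        "rot": min(e["rot"] for e in entries),
    }


def rotation_minus_inversion(entry):
    """E(rotation) - E(inversion), rounded to the precision of the table."""
    return round(entry["rot"] - entry["inv"], 1)


def pathway_gaps(barriers):
    """Return the rotation-inversion barrier difference for the five systems."""
    systems = {k: barriers[k] for k in ("Azo", "Azon", "Azonco", "AzoNO2NO2")}
    systems["AzoNO2NH2"] = combine_surfaces(
        barriers["Azo(NO2)NH2"], barriers["AzoNO2(NH2)"]
    )
    return {name: rotation_minus_inversion(e) for name, e in systems.items()}


if __name__ == "__main__":
    for name, gap in pathway_gaps(BARRIERS).items():
        print(f"{name:12s} {gap:5.1f} kcal/mol")
```

A positive difference for every system corresponds to the statement that inversion is the lower-barrier ground state pathway.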


The unsubstituted azobenzene, Azo, was found to have an inversion barrier of 24.9 kcal mol⁻¹. Azo(NO2)NH2 and AzoNO2NO2 have barriers lower than Azo, 17.2 kcal mol⁻¹ and 20.8 kcal mol⁻¹ respectively. In both of these systems, the inversion angle is adjacent to a phenyl ring with an electron withdrawing substituent. Azonco, Azon, and AzoNO2(NH2) each have an electron donating substituent on the phenyl ring adjacent to the inversion angle and showed higher barriers than Azo, 25.5 kcal mol⁻¹, 26.8 kcal mol⁻¹, and 28.5 kcal mol⁻¹ respectively. It is clear from our results that substituting the benzene ring attached to the angle being inverted with an electron donating group raises the inversion barrier height compared to the unsubstituted azobenzene. Substituting the same ring with an electron withdrawing group lowers the barrier height.

These observations can be explained upon examination of the molecular orbitals of the inversion transition state (Figure 11-4). Each of the azobenzenes has an inversion transition state with an NNC angle of 180° and a CNNC dihedral angle of 180°. Due to the electron donating substituents on Azon and Azonco, there is more electron density on the phenyl rings than is seen on Azo. There is therefore greater steric hindrance between the lone pairs on the central nitrogens and the p orbitals of the phenyl ring adjacent to the 180° NNC angle. The steric effects cause the inversion transition states of Azon and Azonco to be higher in energy than that of Azo. AzoNO2NO2, on the other hand, has electron withdrawing substituents which accept electron density from the π orbitals of the phenyl rings. AzoNO2NO2 is slightly stabilized by the ability of the less filled π orbitals of the phenyl ring adjacent to the 180° NNC angle to accept electron density from the lone pair orbitals of the central nitrogens. The lower barrier height of AzoNO2NO2 compared to Azo is due to this stabilization.
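The substituent trend described above can be checked with a short script. It is a sketch only: the barriers are the values quoted in the text, while the dictionary layout, the labels "donating" and "withdrawing", and the function names are illustrative assumptions.

```python
# Inversion barrier of unsubstituted azobenzene (kcal/mol), used as reference.
AZO_INVERSION_BARRIER = 24.9

# name -> (inversion barrier in kcal/mol,
#          type of substituent on the ring next to the inverting NNC angle)
SUBSTITUTED = {
    "Azo(NO2)NH2": (17.2, "withdrawing"),
    "AzoNO2NO2": (20.8, "withdrawing"),
    "Azonco": (25.5, "donating"),
    "Azon": (26.8, "donating"),
    "AzoNO2(NH2)": (28.5, "donating"),
}


def barrier_shift(barrier, reference=AZO_INVERSION_BARRIER):
    """Barrier change relative to Azo; positive means a higher barrier."""
    return round(barrier - reference, 1)


def trend_holds(data):
    """True if donating groups raise and withdrawing groups lower the barrier."""
    for barrier, kind in data.values():
        shift = barrier_shift(barrier)
        if kind == "donating" and shift <= 0:
            return False
        if kind == "withdrawing" and shift >= 0:
            return False
    return True


if __name__ == "__main__":
    for name, (barrier, kind) in SUBSTITUTED.items():
        print(f"{name:12s} {kind:11s} {barrier_shift(barrier):+5.1f} kcal/mol")
    print("trend holds:", trend_holds(SUBSTITUTED))
```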


For the push-pull system, the smallest barrier appears along the inversion pathway of Azo(NO2)NH2. This suggests that the preferred mechanism of isomerization in the ground state of the push-pull system is the inversion of the NNC angle that is on the same side as the NO2 substituent. This is in agreement with the results of Kikuchi's studies (ref. 209) of a similar push-pull system, 4-dimethylamino-4'-nitroazobenzene. Our push-pull system has the lowest inversion energy barrier of all the azobenzenes studied. The transition state is stabilized by the vacant π orbitals of the nitro-substituted phenyl ring accepting electron density from the lone pairs on the central nitrogens. The lone pairs are parallel to the vacant π orbitals on this phenyl ring. The lone pairs are also perpendicular to the occupied π orbitals of the amine-substituted phenyl ring, which has a stabilizing effect as it minimizes the electron-electron repulsion. The combination of these effects results in Azo(NO2)NH2 having the lowest inversion energy barrier.

Blevins and Blanchard looked at the ground state cis→trans back-conversion for Azo, Azon, and Azonco using theory and experiment. They calculated barrier heights from their experimentally measured isomerization recovery time constants. The experiments did not indicate which pathway the barriers referred to, so we will compare them to both the inversion and rotation cis to trans barriers. A barrier height of 21.2 kcal mol⁻¹ was measured for Azon, 23.7 kcal mol⁻¹ for Azonco, and 25.8 kcal mol⁻¹ for Azo. The experimental data indicate that adding electron donating substituents decreases the energy barrier, which conflicts with our results. This may be due to the lack of consideration of solvent effects in our calculations. The cis isomer and the transition state will be stabilized by a polar solvent according to their dipole moments. The dipole moments were calculated and can be found in Table 11-4. For Azo, the cis isomer and the inversion transition state have approximately the same dipole moment, indicating they will be equally stabilized by a polar solvent. This may explain why our calculated barrier height is closest to the experimental value for Azo. The inversion transition states of Azon and Azonco are more stabilized by a polar solvent than their corresponding cis isomers due to their greater dipole moments. Stabilization of the transition state will lower the energy barrier, as is seen when comparing our calculated results with experiment. Polar solvents will have the greatest effect on the push-pull system due to the large dipole moments that can be found in both the transition state and the cis isomer.

The NN distance changes along each pathway in the substituted azobenzenes follow the same trend seen for unsubstituted azobenzene (see Chapter 9). Along the inversion pathway, the NN distance is smallest at the transition state. The values of the NN distances in the transition states can be found in Table 11-5. The inversion transition state of AzoNO2NO2 has the shortest NN distance, 1.222 Å, followed by Azo, 1.226 Å, and Azo(NO2)NH2, 1.228 Å. The inversion transition state of Azonco was found to have an NN distance of 1.233 Å, while that of Azon was found to be 1.241 Å. AzoNO2(NH2) had the longest NN distance, 1.248 Å. The electron donating groups can contribute electron density to the orbitals, thereby decreasing the bond order and increasing the length of the NN bond compared to that of the unsubstituted azobenzene. These distances indicate that the central nitrogens of the inversion transition state have a double bond between them.

The opposite trend is seen along the rotation pathway: the NN distance is greatest at the transition state and is approximately that of a single bond. These distances can also be found in Table 11-5. Azo(NO2)NH2 has the longest NN bond distance, 1.335 Å, while AzoNO2(NH2) has the shortest NN distance, 1.290 Å.

Potential energy surfaces and contour maps were calculated for the first two excited states. Figure 11-5 shows these calculations for the first excited state of all the azobenzenes. The surface graphs of all substituted azobenzenes appear to be similar to Azo (Figure 10-2a) and are therefore not shown. Slight differences are more visible in the contour plots.

11.4.2 Excited State 1

11.4.2.1 Rotation pathway

As seen in unsubstituted azobenzene, there is essentially no energy barrier along the rotation pathway of the first excited state. There is a shallow slope above the area corresponding to the trans minimum and a very steep slope on the cis side. We can compare the excited state cis and trans energy barriers and slopes (Table 11-6) to approximate relative relaxation times. A steeper slope indicates a quicker relaxation time. We can conclude from this analysis that the lifetime of the first excited state cis isomer is shorter than that of the trans for each of the azobenzenes studied here. Azonco appears to have the steepest trans slope and we predict it will exhibit the shortest S1 lifetime, while AzoNO2NO2 has the least steep slope and is expected to have the longest S1 lifetime.

A conical intersection was discovered in each azobenzene between the ground and first excited states. The location of the conical intersection is only slightly different between the azobenzenes. The location as well as the relative energy can be found in Table 11-7. Azonco's conical intersection is located on the trans side of the barrier. This may indicate that Azonco will have a lower cis→trans quantum yield.

11.4.2.2 Inversion pathway

We again see a trans→cis energy barrier along the inversion pathway (Table 11-8). AzoNO2(NH2), Azon, AzoNO2NO2, and Azo have higher barrier heights, 11.5 kcal mol⁻¹, 11.1 kcal mol⁻¹, 10.4 kcal mol⁻¹, and 9.6 kcal mol⁻¹ respectively. Azo(NO2)NH2 and Azonco show very small inversion barriers, 1.3 kcal mol⁻¹ and 2.3 kcal mol⁻¹, making it difficult to rule out this pathway as a possible isomerization mechanism for these azobenzenes based on barrier height alone. Lack of a conical intersection between the ground and first excited state along this pathway makes the inversion mechanism highly improbable. We can conclude that substituting the phenyl rings of azobenzene does not change the isomerization mechanism after S1 excitation.

11.4.3 Excited State 2

The potential energy surfaces of the second excited state were also generated and can be found in Figure 11-6. Both cis and trans minima appear on this surface along the inversion and rotation pathways of each of the azobenzenes. The trans→cis energy barriers were computed as described previously and can be found in Table 11-9. These barrier heights are too substantial for isomerization to occur on this surface. Rapid relaxation from the S2 to the S1 surface is again expected. We will compare the energy gaps between the first and second excited states along the rotation, inversion, and concerted-inversion pathways.

11.4.3.1 Rotation pathway

As depicted in Figure 11-7, there is in general a significant decrease in the energy gap upon substitution of the benzene rings by both electron donating and electron withdrawing groups, in agreement with experimental work (ref. 210). These values can be found in Table 11-10. For Azo, the states differ by 23.48 kcal mol⁻¹ above the trans minimum. Azo(NO2)NH2 shows the smallest energy gap of 8.89 kcal mol⁻¹. These energy gaps are still slightly too high for relaxation to occur along this pathway.

11.4.3.2 Inversion pathway

The energy difference between the states becomes smaller along the inversion pathway near the trans minima, as can be seen in Figure 11-8 and Table 11-10. These points are a few degrees away from the minima of the second excited state. In general, at a CNNC dihedral angle of 180.0° and NNC angles of 100.0°, the energy gap between the first and second excited state surfaces appears to be the smallest. Azo and AzoNO2NO2 have the largest energy gaps, 15.70 kcal mol⁻¹ and 16.01 kcal mol⁻¹ respectively. The other azobenzenes show significantly smaller energy gaps, 4.67 kcal mol⁻¹ or less, making this a very probable pathway.

11.4.3.3 Concerted-inversion pathway

This pathway is depicted in Figure 11-9. Energies of the S1 and S2 minima, conical intersections, barrier heights, and available energy can be found in Table 11-11.

Azon, Azonco, and AzoNO2NH2: For these three azobenzenes, excitation to the S2 surface in the Franck-Condon region results in excitation to the S2 minimum at NNC = 110.0° and CNNC = 180°. This is also the location of the smallest S2-S1 energy gap along this pathway, 2.79 kcal mol⁻¹ for Azon, 6.24 kcal mol⁻¹ for Azonco, and 3.49 kcal mol⁻¹ for AzoNO2NH2. These energy gaps are extremely small and would allow for rapid relaxation from the S2 surface to the S1 surface. As seen in unsubstituted azobenzene, a large energy barrier is seen on the S1 surface of each of these systems. The energy barriers were measured by subtracting the energy of the S1 minimum from the S1 energy at the S1-S0 conical intersection (arrow b in Figure 11-9). The available energy is calculated by subtracting the S1 minimum energy from the S1 energy at the S2-S1 conical intersection (arrow a in Figure 11-9). In each case, the available energy is less than the energy barrier. It is highly improbable that this channel is open for Azon, Azonco, and AzoNO2NH2. However, highly polar solvents may lower the S1 energy at the S1-S0 conical intersection, which may lower the energy barrier enough to allow for the opening of this channel.

AzoNO2NO2: AzoNO2NO2 is quite similar to Azo. The smallest S1-S2 energy gap of 6.35 kcal mol⁻¹ is found at NNC = 100.0° and CNNC = 180.0°. This energy gap is smaller than that of the rotation and inversion pathways. The available energy was calculated to be 53.90 kcal mol⁻¹ while the barrier was found to be 33.70 kcal mol⁻¹. There appears to be sufficient energy to overcome the barrier. More trans isomers would be formed, as the S1-S0 conical intersection appears at NNC = 170.0° and CNNC = 180.0°.

11.5 Summary of Substituted Azobenzenes

As seen for Azo, the rotation pathway dominates the isomerization process after excitation to the S1 surface, as evidenced by the conical intersection between the S1 and S0 states near the midpoint of this pathway (NNC = 110°, CNNC = 90.0°) and the lack of a significant barrier. Azon, Azonco, and AzoNO2NH2 use the rotation pathway after excitation to the S2 state, as represented schematically in Figure 11-10. There is not enough available energy for these azobenzenes to overcome the concerted-inversion barrier. It may be possible for this channel to open in very polar solvents if the transition state is stabilized. The concerted-inversion channel is open for AzoNO2NO2 after excitation to the S2 surface.


Table 11-1. Optimized Geometries of cis and trans Isomers of Azobenzenes

                      Angles/deg                Distances/Å       Energy(a)/
Isomer  Structure     CNNC    NNCC    NNC       RNN     RNC       kcal mol⁻¹
trans   Azo           180.0   0.0     114.8     1.261   1.419     0
trans   Azon          180.0   0.1     115.0     1.267   1.409     0
trans   Azonco        180.0   0.0     114.9     1.265   1.411     0
trans   Azo(NO2)NH2   179.9   0.2     114.1     1.267   1.415     0
trans   AzoNO2(NH2)   179.9   0.0     115.6     1.267   1.399     0
trans   AzoNO2NO2     179.9   0.2     114.6     1.260   1.427     0
cis     Azo           9.8     50.3    124.1     1.250   1.436     15.2
cis     Azon          11.8    44.1    124.6     1.256   1.430     16.8
cis     Azonco        11.1    46.0    124.5     1.253   1.431     16.1
cis     Azo(NO2)NH2   11.5    60.4    125.5     1.254   1.423     15.6
cis     AzoNO2(NH2)   11.5    30.4    125.0     1.254   1.419     15.6
cis     AzoNO2NO2     10.2    52.2    124.0     1.247   1.432     14.8

(a) Energies are relative to their respective trans minima.

Table 11-2. Vertical Excitation Energies in eV of trans and cis Azobenzenes

                      S1←S0 (n→π*)                    S2←S0 (π→π*)                          S2←S0 − S1←S0
Isomer  Structure     Energy  Intensity  % n→π*(a)    Energy          Intensity  % π→π*(a)    Energy Diff.
trans   Azo           2.55    0.0        88           3.77 (3.96 b)   0.77       79           1.22
trans   Azon          2.71    0.0        89           3.26 (3.15 b)   1.03       78           0.55
trans   Azonco        2.59    0.0        88           3.25 (3.41 b)   1.29       80           0.66
trans   AzoNO2NH2     2.44    0.0        85           2.99            0.86       80           0.55
trans   AzoNO2NO2     2.31    0.0        86           3.48            1.07       80           1.17
cis     Azo           2.57    0.04       78           4.12            0.07       87           1.55
cis     Azon          2.46    0.09       71           3.70            0.22       77           1.24
cis     Azonco        2.46    0.10       74           3.72            0.29       73           1.26
cis     AzoNO2NH2     2.46    0.11       60           3.17            0.09       40           0.71
cis     AzoNO2NO2     2.44    0.07       75           3.62            0.03       81           1.18

(a) The % n→π* and % π→π* values are calculated from the CI coefficients.
(b) Reference 180, experimental value.
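The derived quantities in Table 11-2 are simple to reproduce. The sketch below recomputes the S2-S1 gap for the trans isomers, the deviation of the calculated S2←S0 energy from the experimental value where one is listed, and the corresponding wavelength. The dictionary layout and function names are illustrative assumptions, and the wavelength conversion uses the rounded constant hc ≈ 1239.84 eV nm, which does not appear in the original text.

```python
# h*c in eV nm (rounded physical constant; not taken from the dissertation).
HC_EV_NM = 1239.84

# trans isomers, energies in eV from Table 11-2; s2_exp is the experimental
# S2<-S0 value where the table lists one, else None.
TRANS = {
    "Azo": {"s1": 2.55, "s2": 3.77, "s2_exp": 3.96},
    "Azon": {"s1": 2.71, "s2": 3.26, "s2_exp": 3.15},
    "Azonco": {"s1": 2.59, "s2": 3.25, "s2_exp": 3.41},
    "AzoNO2NH2": {"s1": 2.44, "s2": 2.99, "s2_exp": None},
    "AzoNO2NO2": {"s1": 2.31, "s2": 3.48, "s2_exp": None},
}


def s2_s1_gap(entry):
    """Vertical S2-S1 energy difference in eV."""
    return round(entry["s2"] - entry["s1"], 2)


def tddft_minus_experiment(entry):
    """Calculated minus experimental S2<-S0 energy in eV, or None if no data."""
    if entry["s2_exp"] is None:
        return None
    return round(entry["s2"] - entry["s2_exp"], 2)


def ev_to_nm(energy_ev):
    """Convert a photon energy in eV to a wavelength in nm."""
    return HC_EV_NM / energy_ev


if __name__ == "__main__":
    for name, entry in TRANS.items():
        print(name, s2_s1_gap(entry), tddft_minus_experiment(entry),
              round(ev_to_nm(entry["s2"]), 1))
```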


Table 11-3. The cis→trans Energy Barriers Calculated Along the Inversion and Rotation Pathways

          Azo           Azon          Azonco        Azo(NO2)NH2   AzoNO2(NH2)   AzoNO2NO2
          Inv.   Rot.   Inv.   Rot.   Inv.   Rot.   Inv.   Rot.   Inv.   Rot.   Inv.   Rot.
NNC       180    110    180    120    180    110    180    110    180    120    180    120
CNNC      180    90     180    90     180    90     180    90     180    90     180    90
E cis     24.9   36.2   26.8   30.5   25.5   34.2   17.2   31.6   28.5   20.8   20.8   29.2
ΔE(a)     11.3          3.7           8.7           3.6 (both push-pull surfaces)   8.4

(a) ΔE is the energy difference in kcal mol⁻¹ between the rotation and inversion isomerization barriers. Angles are in degrees.

Table 11-4. Dipole Moments of the Inversion Transition State and cis Isomer

Structure      Dipole moment, cis   Dipole moment, inversion TS
Azo            3.22                 3.22
Azon           2.61                 4.44
Azonco         5.35                 7.39
Azo(NO2)NH2    7.53                 13.37
AzoNO2(NH2)    7.53                 8.79
AzoNO2NO2      3.66                 5.89

Table 11-5. NN Distances (Å) of Transition States Along the Rotation and Inversion Pathways

Structure      Inversion   Rotation
Azo            1.226       1.303
Azon           1.241       1.308
Azonco         1.233       1.322
Azo(NO2)NH2    1.228       1.335
AzoNO2(NH2)    1.248       1.290
AzoNO2NO2      1.222       1.297

Table 11-6. Rotational Energy Barriers in the First Excited State(a)

Structure      Trans Barrier   Cis Barrier
Azo            18.5 (0.206)    29.8 (0.331)
Azon           11.6 (0.129)    28.6 (0.318)
Azonco         19.2 (0.213)    29.2 (0.324)
Azo(NO2)NH2    17.5 (0.194)    31.1 (0.346)
AzoNO2(NH2)    13.9 (0.154)    32.0 (0.356)
AzoNO2NO2      11.5 (0.128)    27.3 (0.303)

(a) This barrier is measured as the difference in energy between the excited state minimum and the excited state point corresponding to the ground state trans and cis minima. Energies are in kcal mol⁻¹; slopes, in parentheses, are in units of kcal mol⁻¹ degree⁻¹.
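The slopes in Table 11-6 appear to equal each barrier divided by a 90° span of the CNNC dihedral. That relation is inferred here from the tabulated numbers and is not stated in the text, so the sketch below treats the 90° span as an assumption and simply checks it against the table. Function and variable names are illustrative.

```python
# Table 11-6: name -> (trans barrier, trans slope, cis barrier, cis slope).
# Barriers in kcal/mol, slopes in kcal/mol per degree.
TABLE_11_6 = {
    "Azo": (18.5, 0.206, 29.8, 0.331),
    "Azon": (11.6, 0.129, 28.6, 0.318),
    "Azonco": (19.2, 0.213, 29.2, 0.324),
    "Azo(NO2)NH2": (17.5, 0.194, 31.1, 0.346),
    "AzoNO2(NH2)": (13.9, 0.154, 32.0, 0.356),
    "AzoNO2NO2": (11.5, 0.128, 27.3, 0.303),
}

# Assumption: the slope is taken over a 90 degree change in the CNNC dihedral.
DIHEDRAL_SPAN_DEG = 90.0


def slope(barrier, span=DIHEDRAL_SPAN_DEG):
    """Average slope of the excited-state surface in kcal/mol per degree."""
    return barrier / span


def slopes_consistent(table, tol=6e-4):
    """Check that every tabulated slope matches barrier / span within tol."""
    for trans_b, trans_s, cis_b, cis_s in table.values():
        if abs(slope(trans_b) - trans_s) > tol:
            return False
        if abs(slope(cis_b) - cis_s) > tol:
            return False
    return True


def rank_by_trans_slope(table):
    """Names ordered from steepest to shallowest trans-side slope."""
    return sorted(table, key=lambda name: slope(table[name][0]), reverse=True)


if __name__ == "__main__":
    print("consistent with a 90 degree span:", slopes_consistent(TABLE_11_6))
    print("steepest to shallowest:", rank_by_trans_slope(TABLE_11_6))
```

Under the qualitative rule used in this chapter (a steeper slope means faster relaxation), the first name in the ranking is the system predicted to have the shortest S1 lifetime.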


Table 11-7. Placement and Energy of First Excited State Minimum of the Conical Intersection

               Angles/deg          Energy(a)/
Structure      NNC     CNNC        kcal mol⁻¹
Azo            140     90          46.0
Azon           130     90          47.0
Azonco         130     100         43.9
Azo(NO2)NH2    140     90          42.4
AzoNO2(NH2)    120     90          38.9
AzoNO2NO2      150     90          45.4

(a) Energies are relative to their respective trans ground state minimum.

Table 11-8. The trans→cis Inversion Energy Barriers in the First Excited State(a)

Structure      E trans
Azo            9.6
Azon           11.1
Azonco         1.3
Azo(NO2)NH2    2.3
AzoNO2(NH2)    11.5
AzoNO2NO2      10.4

(a) These barriers were found by subtracting the energy of the excited state point above the ground state trans minimum from the energy of the excited state at an angle of 180.0° and a dihedral of 180.0°. Energies are in kcal mol⁻¹.

Table 11-9. The trans→cis Energy Barriers Calculated Along the Inversion and Rotation Pathways on the Second Excited State Surface

          Azo           Azon          Azonco        Azo(NO2)NH2   AzoNO2(NH2)   AzoNO2NO2
          Inv.   Rot.   Inv.   Rot.   Inv.   Rot.   Inv.   Rot.   Inv.   Rot.   Inv.   Rot.
NNC       180    110    180    110    180    120    180    120    180    110    180    110
CNNC      180    90     180    90     180    90     180    90     180    90     180    90
E trans   30.1   29.6   40.5   46.2   40.2   34.9   43.1   28.4   27.1   31.1   14.5   27.7
ΔE(a)     0.5           5.7           5.3           1.3 (both push-pull surfaces)   13.2

(a) ΔE is the energy difference between the rotation and inversion isomerization barriers. Angles are in degrees. Energies are in kcal mol⁻¹.


Table 11-10. Energy Differences between S1 and S2

               Energy Gap at       Energy Gap at       Energy Gap at
               A=110, D=180        D=180, A=100        D=180, A=100
Structure      (Rotation)          (Inversion)         (Concerted-Inversion)(a)
Azo            26.43               15.70               5.17
Azon           22.06               0.69                2.79
Azonco         17.12               3.56                6.24
Azo(NO2)NH2    8.89                4.67                3.49
AzoNO2(NH2)    17.30               2.30                3.49
AzoNO2NO2      22.83               16.01               6.36

(a) For the concerted-inversion pathway, Azo(NO2)NH2 and AzoNO2(NH2) are the same. Energies are in kcal mol⁻¹.

Table 11-11. Energies of the S1 and S2 Minima, Conical Intersections, Barrier Heights, and Available Energy(a)

Structure      S2 min   S2 at S1-S2 CI   S1 min   S1 at S0-S1 CI   S1 barrier(b)   Available Energy(c)
Azo            84.95    100.99 (5.17)    45.39    76.60 (1.64)     31.21           50.43
Azon           73.19    73.19 (2.79)     49.68    77.63 (6.78)     27.95           20.72
Azonco         73.27    73.27 (6.24)     46.11    81.58 (4.56)     35.47           20.92
Azo(NO2)NH2    67.72    67.72 (3.49)     41.79    71.12 (2.57)     29.33           22.44
AzoNO2NO2      79.28    98.78 (6.35)     38.53    72.23 (7.12)     33.70           53.90

(a) Energies are in kcal mol⁻¹ and are relative to their respective trans minimum. The numbers in parentheses refer to the energy gaps between the two states.
(b) The S1 barrier is measured as the difference between the S1 minimum energy and the S1 energy at the S0-S1 conical intersection.
(c) The available energy is the difference between the energy of S1 at the S2-S1 conical intersection and the energy of the S1 minimum. If the available energy is greater than the S1 barrier, the concerted-inversion channel can be used.
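The criterion in footnote (c) of Table 11-11 can be written out as a short check. The sketch below is illustrative: the field and function names are hypothetical, and it assumes that the S1 energy at the S2-S1 conical intersection equals the tabulated S2 energy at that point minus the gap given in parentheses, which reproduces the tabulated barriers and available energies.

```python
# Energies in kcal/mol relative to each trans ground-state minimum (Table 11-11).
# s2_at_ci: S2 energy at the S2-S1 intersection; gap: S2-S1 gap at that point;
# s1_min: S1 minimum; s1_at_s0_ci: S1 energy at the S1-S0 intersection.
TABLE_11_11 = {
    "Azo": {"s2_at_ci": 100.99, "gap": 5.17, "s1_min": 45.39, "s1_at_s0_ci": 76.60},
    "Azon": {"s2_at_ci": 73.19, "gap": 2.79, "s1_min": 49.68, "s1_at_s0_ci": 77.63},
    "Azonco": {"s2_at_ci": 73.27, "gap": 6.24, "s1_min": 46.11, "s1_at_s0_ci": 81.58},
    "Azo(NO2)NH2": {"s2_at_ci": 67.72, "gap": 3.49, "s1_min": 41.79, "s1_at_s0_ci": 71.12},
    "AzoNO2NO2": {"s2_at_ci": 98.78, "gap": 6.35, "s1_min": 38.53, "s1_at_s0_ci": 72.23},
}


def s1_barrier(entry):
    """S1 energy at the S1-S0 intersection minus the S1 minimum (arrow b)."""
    return round(entry["s1_at_s0_ci"] - entry["s1_min"], 2)


def available_energy(entry):
    """S1 energy at the S2-S1 intersection minus the S1 minimum (arrow a).

    Assumption: S1 at the intersection = S2 at the intersection - gap.
    """
    s1_at_s2_ci = entry["s2_at_ci"] - entry["gap"]
    return round(s1_at_s2_ci - entry["s1_min"], 2)


def channel_open(entry):
    """Concerted-inversion channel is open if available energy exceeds the barrier."""
    return available_energy(entry) > s1_barrier(entry)


if __name__ == "__main__":
    for name, entry in TABLE_11_11.items():
        print(f"{name:12s} barrier={s1_barrier(entry):6.2f} "
              f"available={available_energy(entry):6.2f} open={channel_open(entry)}")
```

With the tabulated gas-phase values, the check returns an open channel only for Azo and AzoNO2NO2, matching the discussion in Section 11.4.3.3.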


Figure 11-1. Comparison of charge differences in trans isomers. A) Azon. B) Azonco. C) AzoNO2NH2. D) AzoNO2NO2. Charge differences were calculated by subtracting the charge on the unsubstituted azobenzene from that of the substituted azobenzene. A negative charge difference (highlighted in bold) indicates that the position has been activated.


Figure 11-2. Molecular orbitals involved in the S1←S0 and S2←S0 transitions. A) AzoNO2NH2. B) AzoNO2NO2. Orbitals labeled in the figure: n, HOMO, and LUMO.


Figure 11-3. Contour maps of the ground state. A) Azo. B) Azon. C) Azonco. D) Azo(NO2)NH2. E) AzoNO2(NH2). F) AzoNO2NO2. Angles in degrees, energy in kcal mol⁻¹. The energy range for each color is depicted in the legend.


Figure 11-4. Schematic diagram of the molecular orbitals of the inversion transition state.


Figure 11-5. Contour maps of the first excited state. A) Azo. B) Azon. C) Azonco. D) Azo(NO2)NH2. E) AzoNO2(NH2). F) AzoNO2NO2. Angles in degrees, energy in kcal mol⁻¹. The energy range for each color is depicted in the legend.



Figure 11-6. Contour maps of the second excited state. A) Azo. B) Azon. C) Azonco. D) Azo(NO2)NH2. E) AzoNO2(NH2). F) AzoNO2NO2. Angles in degrees, energy in kcal mol⁻¹. The energy range is depicted in the legend.



Figure 11-7. Rotation pathway along the angle of the ground state minimum. A) Azo. B) Azon. C) Azonco. D) Azo(NO2)NH2. E) AzoNO2(NH2). F) AzoNO2NO2. Angles in degrees, energy in kcal mol⁻¹.



Figure 11-8. Inversion pathway along the dihedral of the ground state minimum. A) Azo. B) Azon. C) Azonco. D) Azo(NO2)NH2. E) AzoNO2(NH2). F) AzoNO2NO2. Angles in degrees, energy in kcal mol⁻¹.



Figure 11-9. Concerted-inversion pathway along the dihedral of the ground state minimum. A) Azo. B) Azon. C) Azonco. D) AzoNO2NH2. E) AzoNO2NO2. Angles in degrees, energy in kcal mol⁻¹. Only one graph is necessary for AzoNO2NH2 because the NNC and CNN angles are being scanned synchronously. Arrow a represents the amount of available energy while arrow b represents the energy barrier. The concerted-inversion pathway is only open when the amount of available energy (arrow a) is greater than the energy barrier (arrow b).



Figure 11-10. Scheme of the trans→cis isomerization process for Azon, Azonco, and AzoNO2NH2. After both n→π* and π→π* excitation, the rotation pathway dominates the isomerization process.


CHAPTER 12
AZOBENZENE CONCLUSIONS

We have found that adding electron donating substituents to the benzene rings of azobenzene raises the ground state inversion barrier height, making it harder to isomerize. Electron withdrawing groups were found to lower the same barrier. On the potential energy surface of the first excited state, there exists a slight trans→cis barrier along the inversion pathway, while all other pathways are without barriers. A conical intersection between the S0 and S1 states was found for each azobenzene along the rotation pathway, making this pathway the most likely method of isomerization. The surface of the S2 state was shown to be extremely close in energy to the S1 state at specific points, indicating that excitation to the S2 state leads to rapid relaxation to the S1 state. Our results indicate this relaxation occurs using the concerted-inversion pathway for Azo and AzoNO2NO2. The concerted-inversion energy barriers were too high for the other azobenzenes to overcome. They most likely use the conical intersection found along the rotation pathway as their primary isomerization mechanism, regardless of excitation wavelength.


APPENDIX
LIST OF CONSTRAINTS

Table A-1. List of distances for targets T288 and T306

      T288                            T306
#     Atom 1  Atom 2  Distance, Å     Atom 1  Atom 2  Distance, Å
1     5       67      20.1            2       40      12
2     5       71      19.2            2       45      14.7
3     5       76      22              7       40      9.1
4     10      67      21.5            7       45      13.5
5     10      71      16              23      40      16.9
6     10      76      13.1            23      45      5.5
7     19      67      13.3            30      40      9.3
8     19      76      14.5            30      45      17.9
9     23      71      11.6            40      53      14.9
10    23      76      20.2            40      59      13.2
11    31      71      10.1            45      53      12.6
12    31      76      18              45      59      12.7
13    36      67      16.9            40      76      11.5
14    36      71      14.5            40      81      17.7
15    36      76      19              45      76      9.1
16    53      67      12              45      78      5.6
17    53      71      13.7            5       26      8.1
18    57      67      12.5            5       57      9.9
19    57      76      11.6            5       78      10.8
20    67      76      14.4            26      57      11.7
21    67      80      19.9            26      78      11.1
22    67      86      16.5            26      43      6.5
23    71      80      14.4            5       43      10.8
24    71      86      17.4            43      57      7.8
25    76      86      22.7            43      78      6.6


Table A-2. List of distances for targets T309 and T335

      T309                            T335
#     Atom 1  Atom 2  Distance, Å     Atom 1  Atom 2  Distance, Å
1     5       35      20.7            5       14      13.7
2     5       40      13.7            5       18      19
3     10      35      12.1            5       24      23.1
4     10      40      13.8            5       29      15.9
5     18      40      17.7            5       35      16.5
6     21      35      9.6             5       40      16.6
7     21      40      15.8            9       18      13.5
8     27      35      24.6            9       24      18
9     27      40      27.9            9       29      11.9
10    34      40      10.1            9       35      14.6
11    35      46      15.6            9       40      17.7
12    35      52      19.1            14      24      12.1
13    40      46      11              14      29      9.7
14    40      52      23              14      35      13.6
15    5       30      25.4            14      40      20
16    10      30      10.9            18      24      11.5
17    18      30      14.9            18      29      13.2
18    30      35      15.3            18      35      17.7
19    30      40      19.6            18      40      24.9
20    30      46      25.2            24      29      8.6
21    30      52      18.7            24      35      15.7
22    5       21      22.9            24      40      24.4
23    21      27      16.8            29      35      9.9
24    21      46      19.5            29      40      17.3
25    21      52      12.9            35      40      8.9


Table A-3. List of distances for target T340

      T340 Set 1                      T340 Set 2
#     Atom 1  Atom 2  Distance, Å     Atom 1  Atom 2  Distance, Å
1     6       66      22.9            6       22      18.7
2     6       73      19.6            6       31      14
3     11      66      21              6       35      15.9
4     11      69      16              6       52      9.9
5     19      66      15.3            6       55      11.9
6     19      69      11.6            6       68      17.9
7     19      73      12.6            6       81      8.1
8     23      69      11.1            22      31      5.1
9     23      73      17              22      35      11.8
10    30      66      10.4            22      52      10.1
11    30      73      14.8            22      55      14.9
12    35      66      19.2            22      68      10.3
13    35      73      16.7            22      81      14.4
14    52      66      13.8            35      52      13.8
15    52      69      11.2            35      55      17.1
16    52      73      14.4            35      68      17.2
17    55      66      13.3            35      81      13.6
18    55      73      9.8             52      55      11.2
19    66      73      10.6            52      68      9.3
20    66      80      17.4            52      81      6.8
21    66      84      20.5            55      68      7.9
22    69      80      12.9            55      81      6.5
23    69      84      17.8            68      81      10.9
24    73      80      10.7            31      81      11.3
25    73      84      20.2            31      68      10.7


Table A-4. List of distances for target T349

      T349 Set 1                      T349 Set 2
#     Atom 1  Atom 2  Distance, Å     Atom 1  Atom 2  Distance, Å
1     32      38      10.1            32      38      10.1
2     32      43      15.7            32      43      15.7
3     32      68      9.4             32      58      11.7
4     32      72      18.5            32      78      24.3
5     32      77      22.9            32      81      19.6
6     32      81      19.6            38      43      8.5
7     32      86      21.6            38      52      12.6
8     38      43      8.5             38      58      19.5
9     38      68      11.7            38      68      11.7
10    38      72      11.4            38      78      15.1
11    38      77      14.3            43      52      13.5
12    38      81      10.5            43      58      21
13    38      86      12.6            43      72      10.9
14    43      68      16.3            43      78      15.6
15    43      72      10.9            43      81      13.3
16    43      77      14              52      68      6.4
17    43      81      13.3            52      72      11
18    43      86      16              52      78      20
19    68      77      18.2            52      81      17.6
20    68      81      17              58      68      14.5
21    68      86      22.4            58      72      24.2
22    72      81      8.9             58      81      28.7
23    72      86      16.5            68      81      17
24    77      86      14.1            72      78      9.4
25    81      86      8.4             68      72      13.2


Table A-5. List of distances for targets T348 and T353

      T348                            T353
#     Atom 1  Atom 2  Distance, Å     Atom 1  Atom 2  Distance, Å
1     26      58      14              3       29      13.7
2     26      61      13.5            3       35      14.7
3     28      58      10.3            3       42      22.3
4     28      61      9.6             11      29      18.2
5     35      58      10.9            11      35      16.1
6     35      61      12.1            11      42      16.4
7     36      58      14.6            17      29      20
8     36      61      15.3            17      35      17.8
9     4       58      34.8            17      42      17.5
10    4       61      33.1            29      35      9.8
11    8       58      24.3            29      42      19.8
12    8       61      22.2            35      42      10.3
13    10      58      19              29      65      24
14    10      61      17.3            29      73      20.1
15    4       26      22.1            35      65      21.3
16    4       28      24.5            35      72      14.4
17    4       35      24.9            42      65      25
18    8       26      15.2            42      72      19.3
19    8       28      14.2            29      77      14.9
20    8       35      16              42      77      13.1
21    10      26      12.7            29      79      17.1
22    10      28      9.6             35      79      12.6
23    10      35      11.8            42      79      14.5
24    26      35      6               8       29      10.5
25    4       10      18.8            8       42      16


Table A-6. List of distances for targets T358 and T363

      T358                            T363
#     Atom 1  Atom 2  Distance, Å     Atom 1  Atom 2  Distance, Å
1     14      23      14.3            4       36      11.3
2     14      27      13.2            14      45      12.7
3     14      30      11.4            20      45      15.5
4     14      38      9.9             25      38      19.6
5     14      42      18.4            25      45      15.8
6     14      48      20.3            31      36      10.6
7     14      51      15.2            31      45      10.1
8     14      61      16.9            36      64      15.6
9     14      67      18.4            36      67      18.8
10    18      27      9.3             36      80      23.9
11    18      30      11.7            36      83      27.9
12    18      38      11.7            45      62      16.1
13    18      42      15              45      66      21.1
14    18      48      18.2            45      80      27.1
15    18      51      15              36      45      12.1
16    18      61      17.3            27      38      15.6
17    18      67      15.1            27      45      11.1
18    23      27      8.7             39      45      9.8
19    23      30      16.7            39      65      16.3
20    23      38      16.6            39      80      23.5
21    23      42      13.2            39      83      28.6
22    23      48      17.7            14      41      11.1
23    23      51      16.8            31      41      9.6
24    23      61      17.5            14      67      13.5
25    23      67      10.9            20      40      19.7


Table A-7. List of distances for targets T359 and T311

      T359                            T311
#     Atom 1  Atom 2  Distance, Å     Atom 1  Atom 2  Distance, Å
1     5       71      19.1            10      64      14.1
2     5       79      21.9            10      70      9.3
3     11      71      23.6            10      82      22.8
4     11      79      14.4            19      64      19.8
5     19      71      16.1            19      70      18.9
6     19      79      11.2            19      77      27.3
7     22      79      11.8            23      64      21.7
8     35      79      12.6            23      70      21.8
9     40      71      19.4            23      77      29.7
10    40      79      15.8            30      64      16.3
11    57      71      11.2            30      70      21.2
12    57      79      15.8            35      64      20.3
13    61      71      13.5            35      70      22.4
14    71      79      12.4            35      77      28.6
15    71      84      22.0            42      64      16.4
16    71      87      15.9            42      70      12.9
17    71      91      17.2            42      82      24.9
18    79      84      12.7            50      64      7.4
19    79      87      12.9            50      70      12.2
20    79      91      21.5            50      82      24.5
21    5       75      19.2            59      64      10.9
22    11      75      18.3            59      70      16.9
23    40      75      16.0            59      77      27.3
24    57      75      11.8            64      82      27.4
25    75      84      16.7            64      70      10.3
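To show how constraint lists like those in Tables A-1 through A-7 might be compared against a decoy, the sketch below scores a structure by the root-mean-square deviation between its alpha-carbon distances and the listed distances. This is a hypothetical illustration only: the scoring function, the function names, and the toy coordinates are assumptions and are not the scoring scheme described in the dissertation. The two constraints used are the first two T288 entries of Table A-1.

```python
import math


def ca_distance(coords, i, j):
    """Distance in angstroms between the alpha carbons of residues i and j."""
    return math.dist(coords[i], coords[j])


def constraint_rmsd(coords, constraints):
    """RMS deviation between decoy distances and the constraint distances.

    coords: dict mapping residue number -> (x, y, z) of its alpha carbon.
    constraints: iterable of (residue_i, residue_j, distance) tuples.
    """
    squared = [(ca_distance(coords, i, j) - d) ** 2 for i, j, d in constraints]
    return math.sqrt(sum(squared) / len(squared))


def rank_decoys(decoys, constraints):
    """Decoy names ordered from best to worst agreement with the constraints."""
    return sorted(decoys, key=lambda name: constraint_rmsd(decoys[name], constraints))


if __name__ == "__main__":
    # First two T288 constraints from Table A-1.
    constraints = [(5, 67, 20.1), (5, 71, 19.2)]
    # Toy coordinates invented for the example; not real decoy structures.
    decoys = {
        "decoy_a": {5: (0.0, 0.0, 0.0), 67: (20.1, 0.0, 0.0), 71: (0.0, 19.2, 0.0)},
        "decoy_b": {5: (0.0, 0.0, 0.0), 67: (23.1, 0.0, 0.0), 71: (0.0, 22.2, 0.0)},
    }
    for name in rank_decoys(decoys, constraints):
        print(name, round(constraint_rmsd(decoys[name], constraints), 2))
```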


LIST OF REFERENCES

1. Todd AE, Orengo CA, Thornton JM. Evolution of Function in Protein Superfamilies, from a Structural Perspective. Journal of Molecular Biology 2001;307(4):1113-1143.
2. Orengo CA, Todd AE, Thornton JM. From protein structure to function. Current Opinion in Structural Biology 1999;9(3):374-382.
3. Kendrew JC, Bodo G, Dintzis HM, Parrish RG, Wyckoff H, Phillips DC. A three-dimensional model of the myoglobin molecule obtained by x-ray analysis. Nature (London, United Kingdom) 1958;181:662-666.
4. Perutz MF, Muirhead H, Cox JM, Goaman LCG, Mathews FS, McGandy EL, Webb LE. Three-dimensional Fourier synthesis of horse oxyhemoglobin at 2.8 Å resolution. I. X-ray analysis. Nature (London, United Kingdom) 1968;219(5149):29-32.
5. Geerlof A, Brown J, Coutard B, Egloff MP, Enguita FJ, Fogg MJ, Gilbert RJC, Groves MR, Haouz A, Nettleship JE, Nordlund P, Owens RJ, Ruff M, Sainsbury S, Svergun DI, Wilmanns M. The impact of protein characterization in structural proteomics. Acta Crystallographica, Section D: Biological Crystallography 2006;D62(10):1125-1136.
6. Powell HR. The Rossmann Fourier autoindexing algorithm in MOSFLM. Acta Crystallographica, Section D: Biological Crystallography 1999;D55(10):1690-1695.
7. Hauptman H. Phasing methods for protein crystallography. Current Opinion in Structural Biology 1997;7(5):672-680.
8. Uson I, Sheldrick GM. Advances in direct methods for protein crystallography. Current Opinion in Structural Biology 1999;9(5):643-648.
9. Taylor G. The phase problem. Acta Crystallographica, Section D: Biological Crystallography 2003;D59(11):1881-1890.
10. Ealick SE. Advances in multiple wavelength anomalous diffraction crystallography. Current Opinion in Chemical Biology 2000;4(5):495-499.
11. Wider G. Structure determination of biological macromolecules in solution using nuclear magnetic resonance spectroscopy. BioTechniques 2000;29(6):1278-1280, 1282, 1284-1290, 1292, 1294.
12. Hore PJ. Nuclear Magnetic Resonance. Compton RG, editor. New York: Oxford University Press; 2001. 90 p.
13. Guntert P. Structure calculation of biological macromolecules from NMR data. Quarterly Reviews of Biophysics 1998;31(2):145-237.

14. Nilges M, Clore GM, Gronenborn AM. Determination of three-dimensional structures of proteins from interproton distance data by hybrid distance geometry-dynamical simulated annealing calculations. FEBS Letters 1988;229(2):317-324.
15. Havel TF. An evaluation of computational strategies for use in the determination of protein structure from distance constraints obtained by nuclear magnetic resonance. Progress in Biophysics & Molecular Biology 1991;56(1):43-78.
16. Wagner G, Braun W, Havel TF, Schaumann T, Go N, Wuethrich K. Protein structures in solution by nuclear magnetic resonance and distance geometry. The polypeptide fold of the basic pancreatic trypsin inhibitor determined using two different algorithms, DISGEO and DISMAN. Journal of Molecular Biology 1987;196(3):611-639.
17. Guntert P, Mumenthaler C, Wuthrich K. Torsion angle dynamics for NMR structure calculation with the new program DYANA. Journal of Molecular Biology 1997;273(1):283-298.
18. Guntert P, Qian YQ, Otting G, Muller M, Gehring W, Wuthrich K. Structure determination of the Antp (C39→S) homeodomain from nuclear magnetic resonance data in solution using a novel strategy for the structure calculation with the programs DIANA, CALIBA, HABAS and GLOMSA. Journal of Molecular Biology 1991;217(3):531-540.
19. Braun W. Distance geometry and related methods for protein structure determination from NMR data. Quarterly Reviews of Biophysics 1987;19(3-4):115-157.
20. Demeter A, Fodor T, Fischer J. Stereochemical investigations on the diketopiperazine derivatives of enalapril and lisinopril by NMR spectroscopy. Journal of Molecular Structure 1998;471(1-3):161-174.
21. Hubbell WL, Altenbach C. Investigation of structure and dynamics in membrane proteins using site-directed spin labeling. Current Opinion in Structural Biology 1994;4(4):566-573.
22. Jeschke G. Determination of the nanostructure of polymer materials by electron paramagnetic resonance spectroscopy. Macromolecular Rapid Communications 2002;23(4):227-246.
23. Schweiger A, Jeschke G. Principles of Pulse Electron Paramagnetic Resonance Spectroscopy; 2001. 572 p.
24. Berliner LJ, Eaton GR, Eaton SS, Editors. Distance Measurements in Biological Systems by EPR. [In: Biol. Magn. Reson., 2000; 19]; 2000. 614 p.

25. Rabenstein MD, Shin Y-K. Determination of the distance between two spin labels attached to a macromolecule. Proceedings of the National Academy of Sciences of the United States of America 1995;92(18):8239-8243.
26. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual Review of Biochemistry 1978;47:819-846.
27. Dodson MS. Dimethyl suberimidate cross-linking of oligo(dT) to DNA-binding proteins. Bioconjugate Chemistry 2000;11(6):876-879.
28. MacPhee CE, Howlett GJ, Sawyer WH. Mass Spectrometry to Characterize the Binding of a Peptide to a Lipid Surface. Analytical Biochemistry 1999;275(1):22-29.
29. Young MM, Tang N, Hempel JC, Oshiro CM, Taylor EW, Kuntz ID, Gibson BW, Dollinger G. High throughput protein fold identification by using experimental constraints derived from intramolecular cross-links and mass spectrometry. Proceedings of the National Academy of Sciences of the United States of America 2000;97(11):5802-5806.
30. Benson DA, Boguski MS, Lipman DJ, Ostell J, Ouellette BFF, Rapp BA, Wheeler DL. GenBank. Nucleic Acids Research 1999;27(1):12-17.
31. Stoesser G, Tuli MA, Lopez R, Sterk P. The EMBL Nucleotide sequence database. Nucleic Acids Research 1999;27(1):18-24.
32. Zhu H, Bilgin M, Snyder M. Proteomics. Annual Review of Biochemistry 2003;72:783-812.
33. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research 1997;25(17):3389-3402.
34. Sanchez R, Sali A. Evaluation of comparative protein structure modeling by MODELLER-3. Proteins 1997;Suppl 1:50-58.
35. Epstein CJ, Goldberger RF, Anfinsen CB. The genetic control of tertiary protein structure. Model systems. Cold Spring Harbor Symposia on Quantitative Biology 1963;28:439-449.
36. Chothia C, Lesk AM. The relation between the divergence of sequence and structure in proteins. EMBO Journal 1986;5(4):823-826.
37. Sander C, Schneider R. Database of homology-derived protein structures and the structural meaning of sequence alignment. Proteins: Structure, Function, and Genetics 1991;9(1):56-68.

38. Rost B. Twilight zone of protein sequence alignments. Protein Engineering 1999;12(2):85-94.
39. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. Journal of Molecular Biology 1990;215(3):403-410.
40. Pearson WR. Rapid and sensitive sequence comparison with FASTP and FASTA. Methods in Enzymology 1990;183(Mol. Evol.: Comput. Analy. Protein Nucleic Acid Sequences):63-98.
41. Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research 1994;22(22):4673-4680.
42. Peitsch MC, Schwede T, Guex N. Automated protein modeling the proteome in 3D. Pharmacogenomics 2000;1(3):257-266.
43. Bates PA, Sternberg MJE. Model building by comparison at CASP3: using expert knowledge and computer automation. Proteins: Structure, Function, and Genetics 1999(Suppl. 3):47-54.
44. Dayringer HE, Tramontano A, Sprang SR, Fletterick RJ. Interactive program for visualization and modeling of proteins, nucleic acids and small molecules. Journal of Molecular Graphics 1986;4(2):82-87.
45. Sali A, Blundell TL. Comparative protein modeling by satisfaction of spatial restraints. Journal of Molecular Biology 1993;234(3):779-815.
46. Vriend G. WHAT IF: a molecular modeling and drug design program. Journal of Molecular Graphics 1990;8(1):52-56, 29.
47. Simons KT, Bonneau R, Ruczinski I, Baker D. Ab initio protein structure prediction of CASP III targets using ROSETTA. Proteins: Structure, Function, and Genetics 1999(Suppl. 3):171-176.
48. Fiser A, Do RKG, Sali A. Modeling of loops in protein structures. Protein Science 2000;9(9):1753-1773.
49. De Filippis V, Sander C, Vriend G. Predicting local structural changes that result from point mutations. Protein Engineering 1994;7(10):1203-1208.
50. Stites WE, Meeker AK, Shortle D. Evidence for strained interactions between side-chains and the polypeptide backbone. Journal of Molecular Biology 1994;235(1):27-32.
51. Dunbrack RL, Jr., Karplus M. Conformational analysis of the backbone-dependent rotamer preferences of protein sidechains. Nature Structural Biology 1994;1(5):334-340.

52. Novotny J, Rashin AA, Bruccoleri RE. Criteria that discriminate between native proteins and incorrectly folded models. Proteins: Structure, Function, and Genetics 1988;4(1):19-30.
53. Brenner SE, Chothia C, Hubbard TJP, Murzin AG. Understanding protein structure: using SCOP for fold interpretation. Methods in Enzymology 1996;266(Computer Methods for Macromolecular Sequence Analysis):635-643.
54. Holm L, Sander C. Mapping the protein universe. Science (Washington, DC) 1996;273(5275):595-602.
55. Hubbard TJP, Murzin AG, Brenner SE, Chothia C. SCOP: a structural classification of proteins database. Nucleic Acids Research 1997;25(1):236-239.
56. Valencia A, Kjeldgaard M, Pai EF, Sander C. GTPase domains of ras p21 oncogene protein and elongation factor Tu: analysis of three-dimensional structures, sequence families, and functional sites. Proceedings of the National Academy of Sciences of the United States of America 1991;88(12):5443-5447.
57. Bourne PE, Weissig H, Editors. Structural Bioinformatics; 2003. 649 p.
58. Godzik A, Skolnick J. Sequence-structure matching in globular proteins: Application to supersecondary and tertiary structure determination. Proceedings of the National Academy of Sciences of the United States of America 1992;89(24):12098-12102.
59. Bryant SH, Lawrence CE. An empirical energy function for threading protein sequence through the folding motif. Proteins: Structure, Function, and Genetics 1993;16(1):92-112.
60. Jones DT, Taylor WR, Thornton JM. A new approach to protein fold recognition. Nature (London, United Kingdom) 1992;358(6381):86-89.
61. Anfinsen CB, Haber E, Sela M, White FH, Jr. Kinetics of formation of native ribonuclease during oxidation of the reduced polypeptide chain. Proceedings of the National Academy of Sciences of the United States of America 1961;47:1309-1314.
62. Anfinsen CB. Principles that govern the folding of protein chains. Science (Washington, DC, United States) 1973;181(4096):223-230.
63. Simons KT, Kooperberg C, Huang E, Baker D. Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. Journal of Molecular Biology 1997;268(1):209-225.
64. Samudrala R, Xia Y, Huang E, Levitt M. Ab initio protein structure prediction using a combined hierarchical approach. Proteins: Structure, Function, and Genetics 1999(Suppl. 3):194-198.

65. Ortiz AR, Kolinski A, Rotkiewicz P, Ilkowski B, Skolnick J. Ab initio folding of proteins using restraints derived from evolutionary information. Proteins: Structure, Function, and Genetics 1999(Suppl. 3):177-185.
66. Pillardy J, Czaplewski C, Liwo A, Lee J, Ripoll DR, Kazmierkiewicz R, Oldziej S, Wedemeyer WJ, Gibson KD, Arnautova YA, Saunders J, Ye Y-J, Scheraga HA. Recent improvements in prediction of protein structure by global optimization of a potential energy function. Proceedings of the National Academy of Sciences of the United States of America 2001;98(5):2329-2333.
67. Marqusee S, Robbins VH, Baldwin RL. Unusually stable helix formation in short alanine-based peptides. Proceedings of the National Academy of Sciences of the United States of America 1989;86(14):5286-5290.
68. Blanco FJ, Rivas G, Serrano L. A short linear peptide that folds into a native stable β-hairpin in aqueous solution. Nature Structural Biology 1994;1(9):584-590.
69. Callihan DE, Logan TM. Conformations of Peptide Fragments from the FK506 Binding Protein: Comparison with the Native and Urea-unfolded States. Journal of Molecular Biology 1999;285(5):2161-2175.
70. Park BH, Levitt M. The complexity and accuracy of discrete state models of protein structure. Journal of Molecular Biology 1995;249(2):493-507.
71. Sippl MJ, Hendlich M, Lackner P. Assembly of polypeptide and protein backbone conformations from low energy ensembles of short fragments: Development of strategies and construction of models for myoglobin, lysozyme, and thymosin β4. Protein Science 1992;1(5):625-640.
72. Bowie JU, Eisenberg D. An evolutionary approach to folding small α-helical proteins that uses sequence information and an empirical guiding fitness function. Proceedings of the National Academy of Sciences of the United States of America 1994;91(10):4436-4440.
73. Jones DT. Successful ab initio prediction of the tertiary structure of NK-lysin using multiple sequences and recognized supersecondary structural motifs. Proteins: Structure, Function, and Genetics 1997;Suppl 1:185-191.
74. Sippl MJ. Knowledge-based potentials for proteins. Current Opinion in Structural Biology 1995;5(2):229-235.
75. Koppensteiner WA, Sippl MJ. Knowledge-based potentials-back to the roots. Biochemistry (Moscow) (Translation of Biokhimiya (Moscow)) 1998;63(3):247-252.
76. Simmerling C, Strockbine B, Roitberg AE. All-Atom Structure Prediction and Folding Simulations of a Stable Protein. Journal of the American Chemical Society 2002;124(38):11258-11259.

77. Qiu L, Pabit SA, Roitberg AE, Hagen SJ. Smaller and faster: the 20-residue Trp-cage protein folds in 4 μs. Journal of the American Chemical Society 2002;124(44):12952-12953.
78. Hansmann UHE, Okamoto Y. Numerical comparisons of three recently proposed algorithms in the protein folding problem. Journal of Computational Chemistry 1997;18(7):920-933.
79. Pedersen JT, Moult J. Protein folding simulations with genetic algorithms and a detailed molecular description. Journal of Molecular Biology 1997;269(2):240-259.
80. Park B, Levitt M. Energy functions that discriminate x-ray and near-native folds from well-constructed decoys. Journal of Molecular Biology 1996;258(2):367-392.
81. Huang ES, Subbiah S, Tsai J, Levitt M. Using a hydrophobic contact potential to evaluate native and near-native folds generated by molecular dynamics simulations. Journal of Molecular Biology 1996;257(3):716-725.
82. Samudrala R, Moult J. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. Journal of Molecular Biology 1998;275(5):895-916.
83. Bonneau R, Strauss CEM, Rohl CA, Chivian D, Bradley P, Malmstrom L, Robertson T, Baker D. De novo prediction of three-dimensional structures for major protein families. Journal of Molecular Biology 2002;322(1):65-78.
84. Fetrow JS, Skolnick J. Method for prediction of protein function from sequence using the sequence-to-structure-to-function paradigm with application to glutaredoxins/thioredoxins and T1 ribonucleases. Journal of Molecular Biology 1998;281(5):949-968.
85. Zhang Y, DeVries ME, Skolnick J. Structure modeling of all identified G protein-coupled receptors in the human genome. [Erratum to document cited in CA144:305279]. PLoS Computational Biology 2006;2(3):200.
86. Bradley P, Chivian D, Meiler J, Misura KMS, Rohl CA, Schief WR, Wedemeyer WJ, Schueler-Furman O, Murphy P, Schonbrun J, Strauss CEM, Baker D. Rosetta predictions in CASP5: Successes, failures, and prospects for complete automation. Proteins: Structure, Function, and Genetics 2003;53(Suppl. 6):457-468.
87. Bonneau R, Tsai J, Ruczinski I, Chivian D, Rohl C, Strauss CE, Baker D. Rosetta in CASP4: progress in ab initio protein structure prediction. Proteins: Structure, Function, and Genetics 2001;(Suppl 5):119-126.
88. Bowers PM, Strauss CEM, Baker D. De novo protein structure determination using sparse NMR data. Journal of Biomolecular NMR 2000;18(4):311-318.

89. Rohl CA, Baker D. De novo determination of protein backbone structure from residual dipolar couplings using Rosetta. Journal of the American Chemical Society 2002;124(11):2723-2729.
90. Kuhlman B, Dantas G, Ireton GC, Varani G, Stoddard BL, Baker D. Design of a Novel Globular Protein Fold with Atomic-Level Accuracy. Science (Washington, DC, United States) 2003;302(5649):1364-1368.
91. Kuhlman B, O'Neill JW, Kim DE, Zhang KYJ, Baker D. Accurate Computer-based Design of a New Backbone Conformation in the Second Turn of Protein L. Journal of Molecular Biology 2002;315(3):471-477.
92. Gray JJ, Moughon S, Wang C, Schueler-Furman O, Kuhlman B, Rohl CA, Baker D. Protein-protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations. Journal of Molecular Biology 2003;331(1):281-299.
93. Rohl CA, Strauss CEM, Chivian D, Baker D. Modeling structurally variable regions in homologous proteins with Rosetta. Proteins: Structure, Function, and Bioinformatics 2004;55(3):656-677.
94. Samudrala R, Levitt M. Decoys "R" Us: a database of incorrect conformations to improve protein structure prediction. Protein Science 2000;9(7):1399-1401.
95. Wang Y, Zhang H, Li W, Scott RA. Discriminating compact nonnative structures from the native structure of globular proteins. Proceedings of the National Academy of Sciences of the United States of America 1995;92(3):709-713.
96. Subramaniam S, Tcheng DK, Fenton JM. A knowledge-based method for protein structure refinement and prediction. Proceedings of the International Conference on Intelligent Systems for Molecular Biology (ISMB) 1996;4:218-229.
97. Holm L, Sander C. Evaluation of protein models by atomic solvation preference. Journal of Molecular Biology 1992;225(1):93-105.
98. Crippen GM. A novel approach to calculation of conformation: distance geometry. Journal of Computational Physics 1977;24(1):96-107.
99. Havel TF, Kuntz ID, Crippen GM. The combinatorial distance geometry method for the calculation of molecular conformation. I. A new approach to an old problem. Journal of Theoretical Biology 1983;104(3):359-381.
100. Kuntz ID, Crippen GM, Kollman PA. Application of distance geometry to protein tertiary structure calculations. Biopolymers 1979;18(4):939-957.

101. Collins CJ, Schilling B, Young M, Dollinger G, Guy RK. Isotopically labeled crosslinking reagents: resolution of mass degeneracy in the identification of crosslinked peptides. Bioorganic & Medicinal Chemistry Letters 2003;13(22):4023-4026.
102. Schilling B, Row RH, Gibson BW, Guo X, Young MM. MS2Assign, automated assignment and nomenclature of tandem mass spectra of chemically crosslinked peptides. Journal of the American Society for Mass Spectrometry 2003;14(8):834-850.
103. Kruppa GH, Schoeniger J, Young MM. A top down approach to protein structural studies using chemical cross-linking and Fourier transform mass spectrometry. Rapid Communications in Mass Spectrometry 2003;17(2):155-162.
104. Alexandrov NN, Nussinov R, Zimmer RM. Fast protein fold recognition via sequence to structure alignment and contact capacity potentials. Pacific Symposium on Biocomputing 1996:53-72.
105. Bonneau R, Tsai J, Ruczinski I, Baker D. Functional Inferences from Blind ab Initio Protein Structure Predictions. Journal of Structural Biology 2001;134(2 & 3):186-190.
106. Zhang Y, Hubner IA, Arakaki AK, Shakhnovich E, Skolnick J. On the origin and highly likely completeness of single-domain protein structures. Proceedings of the National Academy of Sciences of the United States of America 2006;103(8):2605-2610.
107. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. The Protein Data Bank. Nucleic Acids Research 2000;28(1):235-242.
108. Levitt M. Growth of novel protein structural data. Proceedings of the National Academy of Sciences of the United States of America 2007;104(9):3183-3188.
109. Moult J, Fidelis K, Rost B, Hubbard T, Tramontano A. Critical Assessment of methods of protein Structure Prediction (CASP)-round 6. Proteins: Structure, Function, and Bioinformatics 2005;61(Suppl. 7):3-7.
110. Rohl CA, Strauss CEM, Misura KMS, Baker D. Protein structure prediction using Rosetta. Methods in Enzymology 2004;383(Numerical Computer Methods, Part D):66-93.
111. Rohl CA. Protein structure estimation from minimal restraints using Rosetta. Methods in Enzymology 2005;394(Nuclear Magnetic Resonance of Biological Macromolecules, Part C):244-260.
112. Chivian D, Kim DE, Malmstrom L, Schonbrun J, Rohl CA, Baker D. Prediction of CASP6 structures using automated Robetta protocols. Proteins: Structure, Function, and Bioinformatics 2005;61(Suppl. 7):157-166.

113. Kim DE, Chivian D, Baker D. Protein structure prediction and analysis using the Robetta server. Nucleic Acids Research 2004;32(Web Server):W526-W531.
114. Chivian D, Kim DE, Malmstroem L, Bradley P, Robertson T, Murphy P, Strauss CEM, Bonneau R, Rohl CA, Baker D. Automated prediction of CASP-5 structures using the Robetta server. Proteins: Structure, Function, and Genetics 2003;53(Suppl. 6):524-533.
115. Cuff JA, Clamp ME, Siddiqui AS, Finlay M, Barton GJ. JPred: a consensus secondary structure prediction server. Bioinformatics 1998;14(10):892-893.
116. King RD, Sternberg MJE. Identification and application of the concepts important for accurate and reliable protein secondary structure prediction. Protein Science 1996;5(11):2298-2310.
117. Rost B, Sander C. Prediction of protein secondary structure at better than 70% accuracy. Journal of Molecular Biology 1993;232(2):584-599.
118. Salamov AA, Solovyev VV. Prediction of protein secondary structure by combining nearest-neighbor algorithms and multiple sequence alignments. Journal of Molecular Biology 1995;247(1):11-15.
119. Frishman D, Argos P. Seventy-five percent accuracy in protein secondary structure prediction. Proteins: Structure, Function, and Genetics 1997;27(3):329-335.
120. Li W, Zhang Y, Skolnick J. Application of sparse NMR restraints to large-scale protein structure prediction. Biophysical Journal 2004;87(2):1241-1248.
121. Chen Y, Ding F, Dokholyan NV. Fidelity of the protein structure reconstruction from inter-residue proximity constraints. Journal of Physical Chemistry B 2007;111:7432-7438.
122. Faulon J-L, Sale K, Young M. Exploring the conformational space of membrane protein folds matching distance constraints. Protein Science 2003;12(8):1750-1761.
123. Zemla A. LGA: a method for finding 3D similarities in protein structures. Nucleic Acids Research 2003;31(13):3370-3374.
124. Li X, Sutcliffe MJ, Schwartz TW, Dobson CM. Sequence-specific proton NMR assignments and solution structure of bovine pancreatic polypeptide. Biochemistry 1992;31(4):1245-1253.
125. Lewis RJ, Brannigan JA, Offen WA, Smith I, Wilkinson AJ. An evolutionary link between sporulation and prophage induction in the structure of a repressor: anti-repressor complex. Journal of Molecular Biology 1998;283(5):907-912.

126. Leijonmarck M, Liljas A. Structure of the C-terminal domain of the ribosomal protein L7/L12 from Escherichia coli at 1.7 Å. Journal of Molecular Biology 1987;195(3):555-580.
127. Berndt K, Guentert P, Wuethrich K. Nuclear magnetic resonance solution structure of dendrotoxin K from the venom of Dendroaspis polylepis polylepis. Journal of Molecular Biology 1993;234(3):735-750.
128. Keasar C, Levitt M. A novel approach to decoy set generation: designing a physical energy function having local minima with native structure characteristics. Journal of Molecular Biology 2003;329:159-174.
129. Bewley CA, Gustafson KR, Boyd MR, Covell DG, Bax A, Clore GM, Gronenborn AM. Solution structure of cyanovirin-N, a potent HIV-inactivating protein. Nature Structural Biology 1998;5(7):571-578.
130. Bewley CA. Solution structure of a cyanovirin-N:Manα1-2Manα complex: structural basis for high-affinity carbohydrate-mediated binding to gp120. Structure (Cambridge, MA, United States) 2001;9(10):931-940.
131. Drohat AC, Tjandra N, Baldisseri DM, Weber DJ. The use of dipolar couplings for determining the solution structure of rat apo-S100B(ββ). Protein Science 1999;8(4):800-809.
132. Roberts SA, Weichsel A, Grass G, Thakali K, Hazzard JT, Tollin G, Rensing C, Montfort WR. Crystal structure and electron transfer kinetics of CueO, a multicopper oxidase required for copper homeostasis in Escherichia coli. Proceedings of the National Academy of Sciences of the United States of America 2002;99(5):2766-2771.
133. Roberts SA, Wildner GF, Grass G, Weichsel A, Ambrus A, Rensing C, Montfort WR. A Labile Regulatory Copper Ion Lies Near the T1 Copper Site in the Multicopper Oxidase CueO. Journal of Biological Chemistry 2003;278(34):31958-31963.
134. Reva BA, Finkelstein AV, Skolnick J. What is the probability of a chance prediction of a protein structure with an rmsd of 6 Å? Folding & Design 1998;3(2):141-147.
135. Raaijmakers H, Macieira S, Dias JM, Teixeira S, Bursakov S, Huber R, Moura JJG, Moura I, Romao MJ. Gene Sequence and the 1.8 Å Crystal Structure of the Tungsten-Containing Formate Dehydrogenase from Desulfovibrio gigas. Structure (Cambridge, MA, United States) 2002;10(9):1261-1272.
136. Yankovskaya V, Horsefield R, Toernroth S, Luna-Chavez C, Miyoshi H, Leger C, Byrne B, Cecchini G, Iwata S. Architecture of succinate dehydrogenase and reactive oxygen species generation. Science (Washington, DC, United States) 2003;299(5607):700-704.

137. Leesong M, Henderson BS, Gillig JR, Schwab JM, Smith JL. Structure of a dehydratase-isomerase from the bacterial pathway for biosynthesis of unsaturated fatty acids: two catalytic activities in one active site. Structure (London) 1996;4(3):253-264.
138. Sharma AK, Rajashankar KR, Yadav MP, Singh TP. Structure of mare apolactoferrin: the N and C lobes are in the closed form. Acta Crystallographica, Section D: Biological Crystallography 1999;D55(6):1152-1157.
139. Sugahara M, Nodake Y, Sugahara M, Kunishima N. Crystal structure of dehydroquinate synthase from Thermus thermophilus HB8 showing functional importance of the dimeric state. Proteins 2005;58(1):249-252.
140. Ren J, Esnouf RM, Hopkins AL, Warren J, Balzarini J, Stuart DI, Stammers DK. Crystal Structures of HIV-1 Reverse Transcriptase in Complex with Carboxanilide Derivatives. Biochemistry 1998;37(41):14394-14403.
141. Ferguson KM, Kavran JM, Sankaran VG, Fournier E, Isakoff SJ, Skolnik EY, Lemmon MA. Structural basis for discrimination of 3-phosphoinositides by pleckstrin homology domains. Molecular Cell 2000;6(2):373-384.
142. Lawson CL, Zhang R, Schevitz RW, Otwinowski Z, Joachimiak A, Sigler PB. Flexibility of the DNA-binding domains of trp repressor. Proteins: Structure, Function, and Genetics 1988;3(1):18-31.
143. Meiler J, Baker D. Rapid protein fold determination using unassigned NMR data. Proceedings of the National Academy of Sciences of the United States of America 2003;100(26):15404-15409.
144. Ramirez BE, Voloshin ON, Camerini-Otero RD, Bax A. Solution structure of DinI provides insight into its mode of RecA inactivation. Protein Science 2000;9(11):2161-2169.
145. Alexeev D, Bury SM, Turner MA, Ogunjobi OM, Muir TW, Ramage R, Sawyer L. Synthetic, structural and biological studies of the ubiquitin system: chemically synthesized and native ubiquitin fold into identical three-dimensional structures. Biochemical Journal 1994;299(1):159-163.
146. Schumacher S, Clubb RT, Cai M, Mizuuchi K, Clore GM, Gronenborn AM. Solution structure of the Mu end DNA-binding Ib subdomain of phage Mu transposase: modular DNA recognition by two tethered domains. EMBO Journal 1997;16(24):7532-7541.
147. Kihara D, Lu H, Kolinski A, Skolnick J. TOUCHSTONE: an ab initio protein structure prediction method that uses threading-based tertiary restraints. Proceedings of the National Academy of Sciences of the United States of America 2001;98(18):10125-10130.

148. Vallely KM, Rustandi RR, Ellis KC, Varlamova O, Bresnick AR, Weber DJ. Solution Structure of Human Mts1 (S100A4) As Determined by NMR Spectroscopy. Biochemistry 2002;41(42):12670-12680.
149. Dempsey AC, Walsh MP, Shaw GS. Unmasking the Annexin I Interaction from the Structure of Apo-S100A11. Structure (Cambridge, MA, United States) 2003;11(7):887-897.
150. Brodersen DE, Etzerodt M, Madsen P, Celis JE, Thogersen HC, Nyborg J, Kjeldgaard M. EF-hands at atomic resolution: the structure of human psoriasin (S100A7) solved by MAD phasing. Structure (London) 1998;6(4):477-489.
151. Kobayashi N, Koshiba S, Inoue M, Kigawa T, Yokoyama S; RIKEN Structural Genomics/Proteomics Initiative (RSGI). Solution structure of mouse CGI-38 protein. To be published.
152. Murakami S, Nakashima R, Yamashita E, Yamaguchi A. Crystal structure of bacterial multidrug efflux transporter AcrB. Nature (London, United Kingdom) 2002;419(6907):587-593.
153. Yu EW, McDermott G, Zgurskaya HI, Nikaido H, Koshland DE, Jr. Structural Basis of Multiple Drug-Binding Capacity of the AcrB Multidrug Efflux Pump. Science (Washington, DC, United States) 2003;300(5621):976-980.
154. Yu EW, Aires JR, McDermott G, Nikaido H. A periplasmic drug-binding site of the AcrB multidrug efflux pump: A crystallographic and site-directed mutagenesis study. Journal of Bacteriology 2005;187(19):6804-6815.
155. Ronning DR, Guynet C, Ton-Hoang B, Perez ZN, Ghirlando R, Chandler M, Dyda F. Active site sharing and subterminal hairpin recognition in a new class of DNA transposases. Molecular Cell 2005;20(1):143-154.
156. Badger J, Sauder JM, Adams JM, Antonysamy S, Bain K, Bergseid MG, Buchanan SG, Buchanan MD, Batiyenko Y, Christopher JA, Emtage S, Eroshkina A, Feil I, Furlong EB, Gajiwala KS, Gao X, He D, Hendle J, Huber A, Hoda K, Kearins P, Kissinger C, Laubert B, Lewis HA, Lin J, Loomis K, Lorimer D, Louie G, Maletic M, Marsh CD, Miller I, Molinari J, Muller-Dieckmann HJ, Newman JM, Noland BW, Pagarigan B, Park F, Peat TS, Post KW, Radojicic S, Ramos A, Romero R, Rutter ME, Sanderson WE, Schwinn KD, Tresser J, Winhoven J, Wright TA, Wu L, Xu J, Harris TJR. Structural analysis of a set of proteins resulting from a bacterial genomics project. Proteins: Structure, Function, and Bioinformatics 2005;60(4):787-796.
157. Martin ACR, Orengo CA, Hutchinson EG, Jones S, Karmirantzou M, Laskowski RA, Mitchell JBO, Taroni C, Thornton JM. Protein folds and functions. Structure (London) 1998;6(7):875-884.

158. Russell RB, Ponting CP. Protein fold irregularities that hinder sequence analysis. Current Opinion in Structural Biology 1998;8(3):364-371.
159. Narasimhan J, Wang M, Fu Z, Klein JM, Haas AL, Kim J-JP. Crystal Structure of the Interferon-induced Ubiquitin-like Protein ISG15. Journal of Biological Chemistry 2005;280(29):27356-27365.
160. Milani M, Savard P-Y, Ouellet H, Ascenzi P, Guertin M, Bolognesi M. A TyrCD1/TrpG8 hydrogen bond network and a TyrB10-TyrCD1 covalent link shape the heme distal site of Mycobacterium tuberculosis hemoglobin O. Proceedings of the National Academy of Sciences of the United States of America 2003;100:5766-5771.
161. Udomsinprasert R, Pongjaroenkit S, Wongsantichon J, Oakley AJ, Prapanthadara LA, Wilce MC, Ketterman AJ. Identification, characterization and structure of a new Delta class glutathione transferase isoenzyme. Biochemical Journal 2005;388:763-771.
162. Fotin A, Cheng Y, Grigorieff N, Walz T, Harrison SC, Kirchhausen T. Structure of an auxilin-bound clathrin coat and its implications for the mechanism of uncoating. Nature 2004;432:649-653.
163. Fotin A, Cheng Y, Sliz P, Grigorieff N, Harrison SC, Kirchhausen T, Walz T. Molecular model for a complete clathrin lattice from electron cryomicroscopy. Nature 2004;432:573-579.
164. Schulze FW, Petrick HJ, Cammenga HK, Klinge H. Thermodynamic properties of the structural analogs benzo[c]cinnoline, trans-azobenzene, and cis-azobenzene. Zeitschrift fuer Physikalische Chemie (Muenchen, Germany) 1977;107(1):1-19.
165. Talaty ER, Fargo JC. Thermal cis-trans isomerization of substituted azobenzenes. Correction of the literature. Chemical Communications (London) 1967(2):65-66.
166. Rau H. Azo compounds [Photochromism based on E-Z isomerization of double bonds]. Studies in Organic Chemistry (Amsterdam) 1990;40(Photochromism: Mol. Syst.):165-192.
167. Liu ZF, Hashimoto K, Fujishima A. Photoelectrochemical information storage using an azobenzene derivative. Nature (London, United Kingdom) 1990;347(6294):658-660.
168. Hugel T, Holland NB, Cattani A, Moroder L, Seitz M, Gaub HE. Single-molecule optomechanical cycle. Science (New York, NY) 2002;296(5570):1103-1106.
169. Ikeda T, Tsutsumi O. Optical switching and image storage by means of azobenzene liquid-crystal films. Science (Washington, DC) 1995;268(5219):1873-1875.
170. Sekkat Z, Dumont M. Photoassisted poling of azo dye doped polymeric films at room temperature. Applied Physics B: Photophysics and Laser Chemistry 1992;B54(5):486-489.

171. Muraoka T, Kinbara K, Kobayashi Y, Aida T. Light-Driven Open-Close Motion of Chiral Molecular Scissors. Journal of the American Chemical Society 2003;125(19):5612-5613.
172. Zhang C, Du MH, Cheng HP, Zhang XG, Roitberg AE, Krause JL. Coherent Electron Transport through an Azobenzene Molecule: A Light-Driven Molecular Switch. Physical Review Letters 2004;92(15):158301/158301-158301/158304.
173. Bortolus P, Monti S. Cis-trans photoisomerization of azobenzene. Solvent and triplet donors effects. Journal of Physical Chemistry 1979;83(6):648-652.
174. Zimmerman G, Chow L-Y, Paik U-J. The photochemical isomerization of azobenzene. Journal of the American Chemical Society 1958;80:3528-3531.
175. Rau H. Further evidence for rotation in the π,π* and inversion in the n,π* photoisomerization of azobenzenes. Journal of Photochemistry 1984;26(2-3):221-225.
176. Rau H, Lueddecke E. On the rotation-inversion controversy on photoisomerization of azobenzenes. Experimental proof of inversion. Journal of the American Chemical Society 1982;104(6):1616-1620.
177. Bortolus P, Monti S. cis ⇌ trans Photoisomerization of azobenzene-cyclodextrin inclusion complexes. Journal of Physical Chemistry 1987;91(19):5046-5050.
178. Monti S, Orlandi G, Palmieri P. Features of the photochemically active state surfaces of azobenzene. Chemical Physics 1982;71(1):87-99.
179. Cattaneo P, Persico M. An ab initio study of the photochemistry of azobenzene. Physical Chemistry Chemical Physics 1999;1(20):4739-4743.
180. Blevins AA, Blanchard GJ. Effect of Positional Substitution on the Optical Response of Symmetrically Disubstituted Azobenzene Derivatives. Journal of Physical Chemistry B 2004;108(16):4962-4968.
181. Andersson JA, Petterson R, Tegner L. Flash photolysis experiments in the vapor phase at elevated temperatures. I: Spectra of azobenzene and the kinetics of its thermal cis-trans isomerization. Journal of Photochemistry 1982;20(1):17-32.
182. Lednev IK, Ye TQ, Matousek P, Towrie M, Foggi P, Neuwahl FVR, Umapathy S, Hester RE, Moore JN. Femtosecond time-resolved UV-visible absorption spectroscopy of trans-azobenzene: dependence on excitation wavelength. Chemical Physics Letters 1998;290(1,2,3):68-74.
183. Lednev IK, Ye T-Q, Abbott LC, Hester RE, Moore JN. Photoisomerization of a Capped Azobenzene in Solution Probed by Ultrafast Time-Resolved Electronic Absorption Spectroscopy. Journal of Physical Chemistry A 1998;102(46):9161-9166.

184. Lednev IK, Ye T-Q, Hester RE, Moore JN. Femtosecond Time-Resolved UV-Visible Absorption Spectroscopy of trans-Azobenzene in Solution. Journal of Physical Chemistry 1996;100(32):13338-13341.
185. Fujino T, Tahara T. Picosecond Time-Resolved Raman Study of trans-Azobenzene. Journal of Physical Chemistry A 2000;104(18):4203-4210.
186. Fujino T, Arzhantsev SY, Tahara T. Femtosecond Time-Resolved Fluorescence Study of Photoisomerization of trans-Azobenzene. Journal of Physical Chemistry A 2001;105(35):8123-8129.
187. Ishikawa T, Noro T, Shoda T. Theoretical study on the photoisomerization of azobenzene. Journal of Chemical Physics 2001;115(16):7503-7512.
188. Quenneville J. First principles studies of cis-trans photoisomerization dynamics and excited states in ethylene, stilbene, azobenzene, and TATB. Urbana: University of Illinois at Urbana-Champaign; 2003.
189. Tiago ML, Ismail-Beigi S, Louie SG. Photoisomerization of azobenzene from first-principles constrained density-functional calculations. Journal of Chemical Physics 2005;122(9):094311/094311-094311/094317.
190. Ciminelli C, Granucci G, Persico M. The photoisomerization mechanism of azobenzene: A semiclassical simulation of nonadiabatic dynamics. Chemistry--A European Journal 2004;10(9):2327-2341.
191. Cembran A, Bernardi F, Garavelli M, Gagliardi L, Orlandi G. On the Mechanism of the cis-trans Isomerization in the Lowest Electronic States of Azobenzene: S0, S1, and T1. Journal of the American Chemical Society 2004;126(10):3234-3243.
192. Gagliardi L, Orlandi G, Bernardi F, Cembran A, Garavelli M. A theoretical study of the lowest electronic states of azobenzene: The role of torsion coordinate in the cis-trans photoisomerization. Theoretical Chemistry Accounts 2004;111(2-6):363-372.
193. Diau EW-G. A New Trans-to-Cis Photoisomerization Mechanism of Azobenzene on the S1(n,π*) Surface. Journal of Physical Chemistry A 2004;108(6):950-956.
194. Chang C-W, Lu Y-C, Wang T-T, Diau EW-G. Photoisomerization Dynamics of Azobenzene in Solution with S1 Excitation: A Femtosecond Fluorescence Anisotropy Study. Journal of the American Chemical Society 2004;126(32):10109-10118.
195. Dohno C, Uno S-n, Nakatani K. Photoswitchable Molecular Glue for DNA. Journal of the American Chemical Society 2007;129(39):11898-11899.

196. Gorostiza P, Volgraf M, Numano R, Szobota S, Trauner D, Isacoff EY. Mechanisms of photoswitch conjugation and light activation of an ionotropic glutamate receptor. Proceedings of the National Academy of Sciences of the United States of America 2007;104(26):10865-10870.
197. Frisch MJ, Trucks GW, Schlegel HB, Scuseria GE, Robb MA, Cheeseman JR, Montgomery JA Jr, Vreven T, Kudin KN, Burant JC, Millam JM, Iyengar SS, Tomasi J, Barone V, Mennucci B, Cossi M, Scalmani G, Rega N, Petersson GA, Nakatsuji H, Hada M, Ehara M, Toyota K, Fukuda R, Hasegawa J, Ishida M, Nakajima T, Honda Y, Kitao O, Nakai H, Klene M, Li X, Knox JE, Hratchian HP, Cross JB, Bakken V, Adamo C, Jaramillo J, Gomperts R, Stratmann RE, Yazyev O, Austin AJ, Cammi R, Pomelli C, Ochterski JW, Ayala PY, Morokuma K, Voth GA, Salvador P, Dannenberg JJ, Zakrzewski VG, Dapprich S, Daniels AD, Strain MC, Farkas O, Malick DK, Rabuck AD, Raghavachari K, Foresman JB, Ortiz JV, Cui Q, Baboul AG, Clifford S, Cioslowski J, Stefanov BB, Liu G, Liashenko A, Piskorz P, Komaromi I, Martin RL, Fox DJ, Keith T, Al-Laham MA, Peng CY, Nanayakkara A, Challacombe M, Gill PMW, Johnson B, Chen W, Wong MW, Gonzalez C, Pople JA. Gaussian 03, Revision C.02. Wallingford, CT: Gaussian, Inc.; 2004.
198. Becke AD. Density-functional thermochemistry. III. The role of exact exchange. Journal of Chemical Physics 1993;98(7):5648-5652.
199. Hariharan PC, Pople JA. Influence of polarization functions on MO hydrogenation energies. Theoretica Chimica Acta 1973;28(3):213-222.
200. Biswas N, Umapathy S. Density Functional Calculations of Structures, Vibrational Frequencies, and Normal Modes of trans- and cis-Azobenzene. Journal of Physical Chemistry A 1997;101(30):5555-5566.
201. Traetteberg M, Hilmo I, Hagen K. A gas electron diffraction study of the molecular structure of trans-azobenzene. Journal of Molecular Structure 1977;39(2):231-239.
202. Bouwstra JA, Schouten A, Kroon J. Structural studies of the system trans-azobenzene/trans-stilbene. I. A reinvestigation of the disorder in the crystal structure of trans-azobenzene, C12H10N2. Acta Crystallographica, Section C: Crystal Structure Communications 1983;C39(8):1121-1123.
203. Fliegl H, Koehn A, Haettig C, Ahlrichs R. Ab Initio Calculation of the Vibrational and Electronic Spectra of trans- and cis-Azobenzene. Journal of the American Chemical Society 2003;125(32):9821-9827.
204. Mostad A, Roemming C. Refinement of the crystal structure of cis-azobenzene. Acta Chemica Scandinavica (1947-1973) 1971;25(10):3561-3568.

205. Naegele T, Hoche R, Zinth W, Wachtveitl J. Femtosecond photoisomerization of cis-azobenzene. Chemical Physics Letters 1997;272(5,6):489-495.
206. Tully JC. Molecular dynamics with electronic transitions. Journal of Chemical Physics 1990;93(2):1061-1071.
207. Kasha M. Characterization of electronic transitions in complex molecules. Discussions of the Faraday Society 1950;No. 9:14-19.
208. Fujino T, Arzhantsev SY, Tahara T. Femtosecond/picosecond time-resolved spectroscopy of trans-azobenzene: isomerization mechanism following S2(ππ*)

BIOGRAPHICAL SKETCH

Christina was born in a small town in Northeastern Pennsylvania. In 1998, she entered Bloomsburg University of Pennsylvania with aspirations of becoming a nurse. Because of the excellent tutelage she received from Dr. Wayne P. Anderson in an introductory organic/biochemistry course, she decided to change her major to chemistry. A year later, she joined Dr. Anderson in his research efforts. Together, they studied the geometric effects on the spectra of vanadyl complexes as well as potential aluminum catalysts to be used in olefin polymerization. She also had the privilege of working with Dr. Anna Krylov in the summer of 2002 as a participant in the Research Experience for Undergraduates program at the University of Southern California.

In 2003, Christina entered the graduate program at the University of Florida and immediately began working for Dr. Adrian Roitberg. Her initial studies were focused on determining the isomerization pathways of azobenzene, but she is now more interested in biosystems. During the summer of 2005, she had the opportunity to participate in the National Science Foundation's East Asia and Pacific Summer Research Institutes program, working for Dr. Jill Gready at the Australian National University.


5041 F20101108_AABNZQ crecca_c_Page_193thm.jpg
e4014b7fd0bbbcb56707eda38d1ff63c
4ded63481f4b2722d8b80d36881cd3c6b540bda2
F20101108_AABOWJ crecca_c_Page_208.tif
d67d9cdc8a6829b9da9fb9200c2c1bf7
d11b932b36408812ddbb9a44609ab058f869c348
5904 F20101108_AABPAZ crecca_c_Page_200.pro
45086b219e3909080471f5a2aa8085bf
7c57d3d576844ab422784fb3833addee44f10f70
2075 F20101108_AABNMF crecca_c_Page_017.txt
cad712b6091bd496423d5529cdf229c4
7e39cd18815906ba197e358cd6f2e47f61abe651
17965 F20101108_AABNZR crecca_c_Page_126.pro
45b6b77cb04b448198d60a26d55a6809
15b12ed16300f97e01e249eab685bcf79ac78fe9
F20101108_AABOWK crecca_c_Page_211.tif
695f2afe4bc479bde85a8bb47566381b
8364a16e8115f4aa42de770db4fba3833fc0d0be
66806 F20101108_AABNMG crecca_c_Page_129.jpg
8a62496671482b8df2afdb03f8dae953
aae51576917b4fa77a0f5470c61606d9eb348eaa
4706 F20101108_AABNZS crecca_c_Page_097thm.jpg
a2a2aa85eb2a40b99abe0c1b02114b5a
b82c1d3b3e1d905fe511485529b64af954e75750
F20101108_AABOWL crecca_c_Page_212.tif
69fb89eee64ff5b4ec7ef88df3d4a109
2edb008902dffd55b6ed157d8f0a312a850b0fb0
12866 F20101108_AABNMH crecca_c_Page_167.QC.jpg
fff4dfa89b6063eaeb4c3ba3043be548
0b55d99a17676e4b5814279bfb6c44a2abaac975
34205 F20101108_AABOJA crecca_c_Page_082.jpg
c331d13061b81f180c98c59ad3b0c3dc
08ef5990d0f6032e5f7417442c758ac0152eaf6e
24355 F20101108_AABNZT crecca_c_Page_079.QC.jpg
49b867e2f327a71a2ccccb6335a116f3
f322b07d0d6e3ec18c6a73238b5299cfd1bb0683
F20101108_AABOWM crecca_c_Page_213.tif
46bdd25f95ee23732b4eb1f22caa40df
598f69b884a2ccc80089dc497a47f588ed047340
972698 F20101108_AABNMI crecca_c_Page_055.jp2
76fcb6f5b4cfef04a4843c2d490b7e63
c4626aa681ba5e2c13966dc8661ffff4bb0c0dac
33959 F20101108_AABOJB crecca_c_Page_083.jpg
eb9337f5bf373577609117b35dfe2655
ab5617de5c25c554df78b5b877332bcc731aa947
1677 F20101108_AABNZU crecca_c_Page_066.txt
b9c45471d73e3855b5dab0e35d226411
5de5111c86df59d868c5c4c7596a7231a0789635
F20101108_AABOWN crecca_c_Page_218.tif
8e9bc415f70e1c87767b27b19b810720
7d47aa7dae1efbbd5687b1f1c1ffba5445eccee9
6644 F20101108_AABNMJ crecca_c_Page_117thm.jpg
664ebfdc2ce58a8aac7e0e9a1962ad5a
dc9c407ffcede5d3ac8e768d781680c582ded121
46195 F20101108_AABOJC crecca_c_Page_084.jpg
611011928dffcb15f0bc89e4c97ad874
3b920ffc7d77ab28b1d7be20da7b755d2c93a79f
F20101108_AABNZV crecca_c_Page_221.tif
b6abb4829b1685fee002afb59b899de4
2d4755c29cc8d7c8a8a6372abb307843599c92d9
F20101108_AABOWO crecca_c_Page_219.tif
dddadf131581c4b5eb99f025bbfa2ca6
35e7c7f6d962bef67374c45fcac32f25671f85ee
F20101108_AABNMK crecca_c_Page_123.jp2
48b7fd5aa91a05a7ac1d6a96b39a9330
630fa48455c93cd88b5e32de368965e1ef4be339
54577 F20101108_AABOJD crecca_c_Page_085.jpg
8653baa22a1a963bdeed8492384b8dc5
e40618eefb64c7bc7575317c60e33e48ada7111d
16085 F20101108_AABNZW crecca_c_Page_048.QC.jpg
d8eb366bbd969ba39e2f1030ebae65ed
e03a944090003ed56c9a494f62ad5ea04eeed41e
F20101108_AABOWP crecca_c_Page_220.tif
0518cf8abb35c34e9c85c067e2ce2dad
b816f54e7197b1d62aab34c9093875c1f7ba8309
6495 F20101108_AABNML crecca_c_Page_017thm.jpg
398923f8bb8b17012040a3e9bc8c2866
1d8199b8793983b191ece3dff45090dc2a807b22
42055 F20101108_AABOJE crecca_c_Page_087.jpg
71e8914ffd1b9569e8f24ad09277e8a6
368cb57512efc1f53f3ceeb8932066954f73a048
29287 F20101108_AABNZX crecca_c_Page_221.QC.jpg
faa6e82dccc1f46855e093506d80427e
9ee247b75f010a50fd5d704c614e42a244103541
F20101108_AABOWQ crecca_c_Page_222.tif
fd70fed349fc11b6ab0baa19c6abb8c9
e98ac3edded2e2f38c33d7584c2732abe33e454b
51393 F20101108_AABOJF crecca_c_Page_088.jpg
5fe12a5e8f2f1b5dcbb60ef541369a35
759a5d9f647bb7723cc3a6fc8c295afcbd11a67f
F20101108_AABNZY crecca_c_Page_042.jp2
354051734f24d93fabda990bd4226582
3e824d237fad73736a73c8237d73f951e93f2e8d
F20101108_AABOWR crecca_c_Page_224.tif
475199c22cb984dab385c44398a219a8
5e7abacac27ce0a7f7d58c76c8573416839ed03c
25111 F20101108_AABNMM crecca_c_Page_182.QC.jpg
4ba1f5969cba89608cf7cf7bb544453f
f9257a1b799878666465a4b48a98f8982cf00820
81146 F20101108_AABOJG crecca_c_Page_089.jpg
27d28dcc25216dad7c8c2fb7d8d3a363
fabee13370f00b37cead8ff32061c6f5ba69da74
254 F20101108_AABPGA crecca_c_Page_199.txt
6b98511e458b02c0830fd6e3b52736dd
9d9f5d61a3759bffdd913cd1778add69a2077f4a
F20101108_AABOWS crecca_c_Page_225.tif
530f1cb79c44d0854ae8fdff6aeb2668
2cff3cb3c9ede46452688968bbffb8ba1a3736b0
F20101108_AABNMN crecca_c_Page_143.tif
3f665c5231b447669b9900e770646543
03f59588328210703f1630aa12b708328753730e
83498 F20101108_AABOJH crecca_c_Page_090.jpg
d304f31edba74fc7626614d3e80ad1ab
c00961d66f5de7bcd3f50a30b6a64d61b6687b4e
54952 F20101108_AABNZZ crecca_c_Page_093.pro
4917e00d6f04f08b0407eef8f4f65be8
3853faa6689a333a63df05b6513b189db5130ab4
348 F20101108_AABPGB crecca_c_Page_200.txt
a5a7b4d93d1a88281dbae2a7307ac658
3d3a1f2b893a331eb254465fd202cce0fc61c8f7
F20101108_AABOWT crecca_c_Page_227.tif
59162ee1ef7b57e5efde50b410dd723c
33e9d68ca69e3bc0bcbb0fbf6c24c04941eba7e5
47676 F20101108_AABNMO crecca_c_Page_016.pro
a8cc38c06c27600f52812f77be8e4cd5
ff3dd9ef01928fbca599747b666f42616f36ccec
F20101108_AABPGC crecca_c_Page_202.txt
d74a94c1e4e666643b7bfe635122ab63
d2b5825fe390ba212f706aca3dd00403f7fff288
10381 F20101108_AABOWU crecca_c_Page_001.pro
a3bd9fae412e423b2bd05f7a51eed08b
6c9808a083179ecea1af9d743eff97296447cc42
360906 F20101108_AABNMP crecca_c_Page_152.jp2
38ee1756a8bf9afa88c6011b990a4df3
48011efbb01f8882b3313e4c299c44679aac80cd
88129 F20101108_AABOJI crecca_c_Page_091.jpg
64d3051fe468960fb09e0c1df8b32e3c
9169e976c77b1e33703771ae89fa5ea47200b960
1098 F20101108_AABPGD crecca_c_Page_203.txt
1925563b741771a7c704fb38e2596fb1
99792b7e36045675b6f5704e2aea213501679ecc
2420 F20101108_AABNMQ crecca_c_Page_011.txt
2258fdf6fccd53bdbd960f73e374a067
78c4044461038e9e78dd0c81037c29c634cd7b5f
90958 F20101108_AABOJJ crecca_c_Page_093.jpg
55381358751955272644fb936f3cb836
b51a3e3bb333f47dbaadea619b4cbdcf2aa7b764
778 F20101108_AABOWV crecca_c_Page_003.pro
7cc391082d33703cbc8d9c7e86ee49a9
10394d52e9444f2ec8d5839d1f27b57a78e5405c
2032 F20101108_AABNMR crecca_c_Page_180.txt
54a8dba2b1a50521982bff4f3560b45c
846011b28028cb45163300e2cd70cc2541727d15
83675 F20101108_AABOJK crecca_c_Page_095.jpg
a467315f246d868afb72809b645f7c72
7559626cf99942317b5801874448974fe2bd3f2b
1103 F20101108_AABPGE crecca_c_Page_204.txt
f82c1b80092048ef586c40ffd882aecf
e7538c349a97e6eb54a75dbcc0d46e5396f954b6
32649 F20101108_AABOWW crecca_c_Page_004.pro
2927e07ee70640f3bad3566b24e37e3d
49cc3b5ab03ae9cb701d1de9f159e7e90158e7ed
1051938 F20101108_AABNMS crecca_c_Page_018.jp2
a561779c4cba97487661bba60a861c7d
3047098bbfe851dd3d19858ed5abb83149c4c01f
71983 F20101108_AABOJL crecca_c_Page_097.jpg
2ed44619d52cfcdd34e30d68b7c00170
b42524885988d97f291b97a56f865be1367c7928
1090 F20101108_AABPGF crecca_c_Page_206.txt
b330a52f01b9470841dc36e77dee4e9d
6f07d96aee5aef6f68cd98fbfd26a9d33509ed06
73353 F20101108_AABOWX crecca_c_Page_005.pro
a43ed87b7ff32d3fb1a84a79be7c6afc
b21918c9b3babfbe8b6cf8997637bf0ac60fce35
1034599 F20101108_AABNMT crecca_c_Page_119.jp2
03e3c948a0f7e972acd772ac3b18bd6b
41203df1a2474e1dfa2ce04e48bc3fc1cd7bc774
65263 F20101108_AABOJM crecca_c_Page_098.jpg
3de82deba124b6eb7c97e22243227d30
9603e9fc8b1a5af7bd0833b2ebaf526b2f29dc79
1134 F20101108_AABPGG crecca_c_Page_207.txt
76a03fa016d2c03db3df0ba222a6eb12
814bf5905f89b3f022e6c203e9413d7dceb03e8c
79833 F20101108_AABOWY crecca_c_Page_007.pro
f66af0bd11c88907ff30b0ede9470c58
0182b839d1b3d7d9e3324ed58a655b0b1944fc96
11272 F20101108_AABNMU crecca_c_Page_075.pro
8f32011c458cb76b7ad46cabd62c0d32
869e30a2ef12adef2552493f2dd5dd6dc4c1f124
64697 F20101108_AABOJN crecca_c_Page_102.jpg
dc548a5e4d48c7ee126d9fcc0e2ebec1
3b98eefe19a9ea0ce3ff8875f3f3557e5768a273
1124 F20101108_AABPGH crecca_c_Page_208.txt
083a25be13804fbfc49428b6de2664f2
df7338215112ea93491427aac79f5814341c5346
71916 F20101108_AABOWZ crecca_c_Page_008.pro
8860e464a97b615e0f69605ff391d380
e3607833a5530ecec20ddddce16bfac37b9eb2d5
4543 F20101108_AABNMV crecca_c_Page_227thm.jpg
9e758680d08a86fc7fb8fcc03654d4e1
4e201dcdc5c359ffd6e51d8beb47dc19c38602e0
79111 F20101108_AABOJO crecca_c_Page_103.jpg
ccd4bbf857658d6329607fb8ecc091fb
4643746e0618c6097d828d0d6e5310da0a167d07
2538 F20101108_AABPGI crecca_c_Page_210.txt
edc0c8b03e86e6c86974f88e282fb889
34588730fe0155d2304419da298768d5e2c3c77e
85213 F20101108_AABNMW crecca_c_Page_179.jpg
5f9f147a1fd6f4a8b78d8f97b242bc47
f4115675b3f1a73fcb791107b38d64b0d89247b7
74885 F20101108_AABOJP crecca_c_Page_104.jpg
9acde5ca660f3b8ed6186119ae5f8282
1388bb9eb89329882cb41abd7c3f17b273f33fbc
2620 F20101108_AABPGJ crecca_c_Page_213.txt
4c28345142f7c3be7be7c107cc5faecf
1fecb54edec90243a58608ba9989bf6f86bc1ac8
585 F20101108_AABNMX crecca_c_Page_139.txt
8f25b0b8fb122890972888e53b36a638
bc4499cc7ade85cd295d32ba536cbf382a0aa253
83248 F20101108_AABOJQ crecca_c_Page_106.jpg
8b0592aa429c218649d96232075a6102
d1f4e7f325f83ccd00dbe11380387193580df638
2596 F20101108_AABPGK crecca_c_Page_216.txt
142ad939f3b48106d2e4bf403aade15d
7a77e0424cbd9273e335d6213758922dd9e02c9b
88011 F20101108_AABNMY crecca_c_Page_058.jpg
43bac06263375b8ccf07a6db50a18c86
66b1eaa1d99d9f23696bc036f3fa4df8d6e49126
83391 F20101108_AABOJR crecca_c_Page_109.jpg
09ca61ebbe975422c260e4c73254ac3c
ed979ef785b33a1d905934585c9ba8a57b028528
2479 F20101108_AABPGL crecca_c_Page_218.txt
0dbf83fc9c7462434edca6f13ed139fe
f44dc2ea68319e279d6fda061c53e414e1755bd2
28801 F20101108_AABNMZ crecca_c_Page_120.QC.jpg
04e2b1fd787028723e9d239583ce7e61
b1cb9ed2ae509f651177d435fc3fd87dd9e37260
90084 F20101108_AABOJS crecca_c_Page_111.jpg
379e87864dddc1f7fb9b151f2bdc4668
1e96f3ab5a310e8f464e6603181f0cdb8d3166c0
2710 F20101108_AABPGM crecca_c_Page_219.txt
513d4fae54d8363f26e92a02ab105a6d
f867b7db3eb59ac04091a5451ff558dd222bdb28
87569 F20101108_AABOJT crecca_c_Page_112.jpg
426f2045ae547669387190f54b979a6b
6ef735370df4f06cf9a1d812d6ff8866df6d8129
2718 F20101108_AABPGN crecca_c_Page_221.txt
cfb8bce25d6632e7ad8a444783f49bfd
219c4b4dfb75ff34b77d75fb9c310a286922074b
82629 F20101108_AABOJU crecca_c_Page_113.jpg
3e30d22228749a9e6931ba7e54426ea5
ee5502338b2b05fed128bc08633256cbaab02f49
2722 F20101108_AABPGO crecca_c_Page_222.txt
efeb1e687bd51b55808e321e6c217470
35840b1125767452afb89f09ea063715cb7e5441
87356 F20101108_AABOJV crecca_c_Page_114.jpg
99613372f3d31f002103e89844910e5a
aa45404f8a4a510368a3b09fa1336c684819f77d
2669 F20101108_AABPGP crecca_c_Page_223.txt
888546d6b16b7a5d249d7b8a3a29bb8a
1c32b5826b12e6c28170c2602cb02ba2061493eb
85726 F20101108_AABOJW crecca_c_Page_116.jpg
afb2cf7c3c5a56f73f50f189f130ca04
3dd7e5d5ec8480d2d651ae1e2bd2d16112b4cb1f
2519 F20101108_AABPGQ crecca_c_Page_224.txt
ff362647ab42a3a026ddd24ff662a70a
078fb34ba017de9f05a4c3af82a8aa26f9babe54
87996 F20101108_AABOJX crecca_c_Page_117.jpg
561297d0b867504e6958af645a239a70
6e057b3d4e31e6ef64e9cdd6df21304d80895699
1366 F20101108_AABPGR crecca_c_Page_227.txt
187febbc718f7e15c172f7bafe05c6cd
e0c708ff5bd202c284fb314bd9a3564b1ea54b27
77829 F20101108_AABOJY crecca_c_Page_118.jpg
8121223611f68b83c26e6dfbdff9b24d
e4e7122898dc502f08335898dd6bf9105b4fbc63
472 F20101108_AABPGS crecca_c_Page_002thm.jpg
c2de27f089524aac51dbeb0c25c73a6e
fa8e28a41df1d1913c581477ecd03ee5ba4a255a
93088 F20101108_AABOJZ crecca_c_Page_123.jpg
8ca63d63a83470882be2c68ba151cc74
e9f97d002496c423f4e9c3159e54b537191d3bc3
1051 F20101108_AABPGT crecca_c_Page_003.QC.jpg
1a87d51d20352246d5eb36ee9229e772
445b0ea6e53e7f69868336eb93fb7dcf939aa1b7
21581 F20101108_AABPGU crecca_c_Page_005.QC.jpg
24cca57f413f287f904cf34d85ded30e
c01eb38c085c406c5e4b770856a8f6afb428dcd6
29165 F20101108_AABNSA crecca_c_Page_134.pro
5b94154dc3e0b7da957e088a81c30b0c
c8a052e748510db0a6ad3a101890ce8c47f7e9c6
5288 F20101108_AABPGV crecca_c_Page_005thm.jpg
2b35db5b4e9a6263389e973c50a94168
b954281a1ef62d79cec345e45676ad7f4cb97e07
92328 F20101108_AABNSB crecca_c_Page_030.jp2
0d24b6736f9f8023b183bd941aee1591
866c9538306824c4a5cdb62f7810851ac0741921
26494 F20101108_AABPGW crecca_c_Page_006.QC.jpg
e02ca18801d52b0498c7ec148c3cb796
32126942a377818d440cced8c952718ef707bb96
84223 F20101108_AABNSC crecca_c_Page_040.jpg
108bba3eac44a021df84490e32b46024
68f399c7e28408b41a3e7b25bba1986a2752cafe
6030 F20101108_AABPGX crecca_c_Page_006thm.jpg
8377fb8635fda50b1bf0d02e92534741
ddb9f64a1a97fb71c4be7cb3af72002496beb324
1051929 F20101108_AABNSD crecca_c_Page_016.jp2
d070f9dc2602654b91aa2c39971a82c4
a28b9847f2fc5f5ec4efe78c8b9b61205a4ae5bb
23713 F20101108_AABPGY crecca_c_Page_007.QC.jpg
609b08aa30a1916efaaafdaa91c392c9
61725753e793fe0416ed747897b7204394df4825
6008 F20101108_AABNSE crecca_c_Page_100thm.jpg
3bc1a4f8fe48b82f8a71c6c4c6e0f2e7
884232ab0ef0657b8efb66b66495f1ec376d9b96
5403 F20101108_AABPGZ crecca_c_Page_007thm.jpg
d586fa4bee7114ae533d44a781092918
abcd2a070cb4b9b9d49b80fc5c7e37248d2fe9ab
8230 F20101108_AABNSF crecca_c_Page_169.QC.jpg
c5f995dac02730a553a3ccb64a0339b3
826a872539e69c9df479deff56ea054ca7ebd5d0
11087 F20101108_AABNSG crecca_c_Page_188.pro
9474c64bdad733dc3f9a4059a1bcc013
2beb45409624d2ebb287c541cd78fc347d962e87
51014 F20101108_AABNSH crecca_c_Page_048.jpg
7e0f2687ecf8c66d8046e5f9034f175c
32b97519d8e6e7f45d79021206d78dc520ea66e4
1032572 F20101108_AABOPA crecca_c_Page_118.jp2
c3a6b482d2d81688ae611ae24db01543
e87b554569aac44e64f1c4c8f290381548e5858d
50874 F20101108_AABNSI crecca_c_Page_109.pro
8acba188e84e2f5093c5e4ef6f979402
80f6d05d1019c9551e548c51fc23a3f79a62d0a6
1051974 F20101108_AABOPB crecca_c_Page_121.jp2
70e2cb3b0ae2835aa52e82ba75b3af7b
8472f670db254712fc9dd53814a8cb5a288d6252
6919 F20101108_AABNSJ crecca_c_Page_199.QC.jpg
40b865278d2b40f21a1f8b252f2709f9
030810e88ac4c1206fa1f17936a31d7933d6e3d9
1051939 F20101108_AABOPC crecca_c_Page_122.jp2
d7772fb5602c34b08ab341b86ebe2430
963eacb65dbc2d8b8d5a238a1e7feeb0da13cc8a
F20101108_AABNSK crecca_c_Page_161.jp2
b5d0680325cbbb7fe3731c634e4a667d
df89394d4f70bd03b1cf25dd57dbb8cb5c96b460
1051962 F20101108_AABOPD crecca_c_Page_125.jp2
424762fff01c31306b043422cc486013
57cd47e225111a15e873e9de4cbcbf1774690d41
432632 F20101108_AABNSL crecca_c_Page_101.jp2
8b61eada8459e04d37fb5c5906f2118f
4a7bd1ee89c407651bd192bbe6a25dba86f23960
424345 F20101108_AABOPE crecca_c_Page_126.jp2
d240a9a531ed3f480eec8968abd36cbf
036018a151fe9b02c5a07f164b2d7596a1eefad8
F20101108_AABNSM crecca_c_Page_106.jp2
aebbba243586f523b5d9261b980d083d
5c0aae25175407f56d91faa2d2f0a4cd4ab779ed
347755 F20101108_AABOPF crecca_c_Page_127.jp2
5d5cfce11c380e188d7f5b0907421f6b
834eaf17f40da93bc4c156f5f2a71f99de17d5dc
89749 F20101108_AABNSN crecca_c_Page_019.jpg
fe13ed2efd45a7e8315532d89cbe5b81
bdc6b06c08f30a68bef1a5a8833469bbbe0f2401
1048361 F20101108_AABOPG crecca_c_Page_128.jp2
a62fae142c464c3160cec76f87b8cd41
5efc87c9750fc8f7b2f668a64a606fbd0f2840f4
2016 F20101108_AABNSO crecca_c_Page_178.txt
3d3c63e14314831381c66c2017a4a6b6
30d22a414cb006efd2801867d75f11b6cc8df698
761818 F20101108_AABOPH crecca_c_Page_129.jp2
1aa540907e9d31a25cb858d4bc8044a0
68b7694c5bc07b239cba4c405fcb484c1e302bc0
27595 F20101108_AABPMA crecca_c_Page_108.QC.jpg
cadf529beea21bfdece0345ca40407a8
e488cdabc9fb302a47dff0110d98754d50c8c60e
12842 F20101108_AABNSP crecca_c_Page_075.QC.jpg
538e3f57b9a926d2efceb75f45451d5a
dac8d14eb7677505b360cc9ba384f42dcbd6fdbe
F20101108_AABOPI crecca_c_Page_130.jp2
0266e3ce6b5eba29924181852bb118e4
f7d2abd67c995a034c67cad567745f0d8aa36528
26168 F20101108_AABPMB crecca_c_Page_109.QC.jpg
f42ecc181bcf43352933ecbfa3434de6
1b531165164996b538b814106a18d16b9032724a
F20101108_AABNSQ crecca_c_Page_055.tif
fc40d6f961158c5b8665d6853203eaee
7604f1bf0d3700abe8a883dd540ddf202e7274f0
1051885 F20101108_AABOPJ crecca_c_Page_131.jp2
2dc0081d83e1da023484dfffda121084
b71ce8275acf817a0e89d2fb6f59385c58f0dc4d
6533 F20101108_AABPMC crecca_c_Page_109thm.jpg
365e54a3837acc373d73042397624af7
a1981de5c62f3e35b6507a33bd0e798c9573f11f
76189 F20101108_AABNSR crecca_c_Page_190.jpg
139e8dad5c815c6a2ea69af3c62349d5
9f36677a998dd5aefb7369cad4171f8cda30c42a
613005 F20101108_AABOPK crecca_c_Page_132.jp2
f8a2785ced0748aa7caf606ee17262e9
38f15bb0cadd2f4a80c412e6cd243302b68b84ee
6253 F20101108_AABPMD crecca_c_Page_110thm.jpg
6e0261795e621ef747e1d630fd77f837
70b5f7eb4933c00003ee787bf6c1ab38920c4d57
1051977 F20101108_AABOCA crecca_c_Page_037.jp2
c89a3f4b1b46381f7e923d0c651007b7
f6900b5b869d9a47281add43f02cda421bc6ffc8
F20101108_AABOPL crecca_c_Page_133.jp2
6ff991599469627ee298ca67548f82ff
5b3b7b7d0919f2d7d7567569f3d7c71a520cd1d6
28088 F20101108_AABPME crecca_c_Page_112.QC.jpg
61256fa5e386b292263283edd103fec1
60c32a50054cf88bfa69dc4089b4083a07428ff8
78965 F20101108_AABNSS crecca_c_Page_119.jpg
05e2ba27d35880e10f93e473ef332aca
d6c4deae0c953c8d2f2cfd1025c94218e91646dc
1051960 F20101108_AABOPM crecca_c_Page_134.jp2
29b94b99c4f34f295b484f77f7d0e100
0cf8db83662f6c756e1ca0e277df473f74c29714
6710 F20101108_AABPMF crecca_c_Page_112thm.jpg
504a026295831e93701ce73c25aa7869
c12c3e140f1e804a096731e2fbfbfaea3a2bc163
6724 F20101108_AABOCB crecca_c_Page_225thm.jpg
1e9bf676162a7a4b3b6430345a906c68
feb8c829de6a3b538acd0640b02a354365c69c1c
2694 F20101108_AABNST crecca_c_Page_082thm.jpg
e3d1b885f81ed42401cae53a152e1ba5
d39d82d084711b7b28646d9d2ef41aaf34137577
1051957 F20101108_AABOPN crecca_c_Page_135.jp2
10c6ce08edb3a44ec2e01ed6d18d3c76
5cbede6b9037dd901cbe6221e074724240c0892b
26538 F20101108_AABPMG crecca_c_Page_113.QC.jpg
9c0b5c3b0d288a7979aee316d09b24e2
33fb6cd9899a06fbde2c55319e52a58c9d5049ae
4105 F20101108_AABOCC crecca_c_Page_075thm.jpg
31ae5145c224d772110bb124d5ceae34
f57abc1e02ec794b87fc0aa2370cece342d585e6
25816 F20101108_AABNSU crecca_c_Page_067.pro
e62b32de992064c07d0bac764cdeca36
5d3de92b040428b87652e0cd9770f1a2fa117d28
27978 F20101108_AABPMH crecca_c_Page_114.QC.jpg
2c650be6f1dcd4d576a5ea24e32fcccc
c1350469ff1d7fa806c3884053df7a3987bebbe7
1982 F20101108_AABOCD crecca_c_Page_106.txt
567825139123cbb0b53cc8e8427aff6a
cddd471da128aae4c7c2126b596c33bc605d10f6
5390 F20101108_AABNSV crecca_c_Page_098thm.jpg
55fcf05ed3eca2d601e4aef8c7363256
351d3bb8ef7e0e80dae46376cf5c506cc8c82c47
F20101108_AABOPO crecca_c_Page_136.jp2
4aa638973d4ab428ce291a23217ab8fb
2c889d04a65ca4e52ca4e3eb7bf70a8ef9ba20be
6645 F20101108_AABPMI crecca_c_Page_114thm.jpg
6ee7223b3f4c1acb0455a30497ae7d7a
3c38aa0bedc82f6cb7542e85dd68584b01a096ba
28569 F20101108_AABOCE crecca_c_Page_148.QC.jpg
213a66277b4774da2300209bac213cc6
658dcbc72a3c295d42a79822181cb82c3020f1ab
16862 F20101108_AABNSW crecca_c_Page_099.QC.jpg
a6423281699bd666bd8c951088eeb302
5a15a547fe7cb5ddf9ce9def42927a1d508c7cbd
1051913 F20101108_AABOPP crecca_c_Page_139.jp2
6561c84471b494de61d4d1e6dda3d983
10aa40c5d49910b4d87ae00668d89eaf7ba62823
26010 F20101108_AABPMJ crecca_c_Page_115.QC.jpg
473f980fcab4c671db0bbd00658cc392
ad79eff5137424e8473bcd26c6a2eac5296dc326
6853 F20101108_AABOCF crecca_c_Page_221thm.jpg
8dcdf283c90c0a9cc04873ddcd789328
797fcc23df26cd3a8caddd3ba21c30a640b723d2
2144 F20101108_AABNSX crecca_c_Page_111.txt
ab05e2f638a78ae99557c612517905d8
034a041ee1290f2656ba97c122f0feb9939ffa24
F20101108_AABOPQ crecca_c_Page_140.jp2
8f0168e6c43eb125819a9deccede34ca
b3529c96875916405c9721b4975932a7dde66b5b
F20101108_AABOCG crecca_c_Page_036.tif
4efae07fb5321d370566550bcb6cf188
645aad648aeba7bd5fe4812135360ec4807aa0de
F20101108_AABNSY crecca_c_Page_186.tif
0d2b79bc978a5d0e5e4a7b8f9bba9697
82a167f62d9ffe0fc1ec1668e1cdbde5b63e31a5
F20101108_AABOPR crecca_c_Page_141.jp2
20f5abdf421da5d32c07600d784ea173
3fac18fd135297bf246a091a23bc035aa9a1b238
5930 F20101108_AABPMK crecca_c_Page_115thm.jpg
d4765744390bd38edae45836198989a4
d275788421c8256c8f1b3bd356f333f602c72bd8
F20101108_AABOCH crecca_c_Page_183.tif
7d8207e080c8a12de68a87f3a0459ae2
7710ee46b15961a07fed1328176d6bd463f5b387
366829 F20101108_AABNSZ crecca_c_Page_197.jp2
f90f2c81ff5a77e6835acd4d02561caf
c24a324c0dcfe859fe1a673e4550eda9801cd53c
856775 F20101108_AABOPS crecca_c_Page_142.jp2
2730415af858ad1586582235389b14ad
a3ab51a2d5ec0bf836c7f36f1cd7dd91db520101
27123 F20101108_AABPML crecca_c_Page_116.QC.jpg
4f826c3af1d0ae5e7d4a6b5fb150f56e
fa56322b01e741420d04c299a0db64bdf313d333
23497 F20101108_AABOCI crecca_c_Page_183.QC.jpg
e06dccfc1520ed1fed8a6ccad0a4bab1
b8fd85fcd50d7d303250635e6fccca90e0da57d3
1051978 F20101108_AABOPT crecca_c_Page_143.jp2
abeb2795e3477c51b131e37dc333ec52
dc3e3fa84a1d46803636e5c96cb46963b195e970
28059 F20101108_AABPMM crecca_c_Page_117.QC.jpg
e71efb188abc48222b0ce89994fa2483
595c3253a60de8df3c126090f1cd82b36d3a4115
F20101108_AABOCJ crecca_c_Page_043.jp2
51381b1b82a3c001943601ea74928ae8
b32c53beb427b984df3e7600948070969bc1f2a2
925070 F20101108_AABOPU crecca_c_Page_145.jp2
57f6dbf94b5f32bb302c4a7e77123183
ee54f93127da30ab01997106b3705b1a275741dd
24684 F20101108_AABPMN crecca_c_Page_118.QC.jpg
e8de1857297e2a5ad487edf85ca6e7b1
ea1c322eeefd0606bed0ceaa9f37101896464bfb
F20101108_AABOCK crecca_c_Page_044.tif
888408fdc284e8d3ac6a0d4b66b9917d
32ca4c38e69b1b7e8b030c89679f561ab9517f20
F20101108_AABOPV crecca_c_Page_148.jp2
d01dea9f7e527cecc1e1b6bf7cef1cb4
2a6f8f1636ed598328dce66a057b3893f63ce46d
6287 F20101108_AABPMO crecca_c_Page_118thm.jpg
a68df6bb977207975c0b04c5dcf98b87
aa12e417cbb0bd0ce8dac932460feca6b32eb983
86914 F20101108_AABOCL crecca_c_Page_034.jpg
fb08549f26cf93169ff83f300d88f015
670a7cb3a65ede30d2fc80f422387fb7ab5d0f43
931557 F20101108_AABOPW crecca_c_Page_150.jp2
e03fe1112a5d6cd3bbb5ea7a1358eff3
a9e75a63d9adc71843d2c42600d8e94a79c07f6f
23906 F20101108_AABPMP crecca_c_Page_119.QC.jpg
d84e8c3f7739ac1c640a008261ca9299
2876d0955c6233b2b5753fea0e7213be83554710
564 F20101108_AABOCM crecca_c_Page_075.txt
89309b4cd75783d9b5278be4cc76bc03
dd94c39a89434ee395a216133ef84fe236bf6ad2
439640 F20101108_AABOPX crecca_c_Page_154.jp2
b796588b195e84ff9be24fdfebaddb4a
f578cb1b7067da2d8c252eeec0e8837525689b33
6003 F20101108_AABPMQ crecca_c_Page_119thm.jpg
89929c3a965ab769a793b8390c200bce
e953254e7ff4ed3e027cd07edbb137473ac76d40
92404 F20101108_AABOCN crecca_c_Page_176.jpg
a6312fdb6d1507f321bf000c7141b1bb
3c74856a97195d5b18b586bcc53caceaf17eaa50
1051985 F20101108_AABOPY crecca_c_Page_156.jp2
f53079d7b7bf1715d3a85129c843f1f0
c83ce90a2608d64b66111538f14c445e27fca057
28856 F20101108_AABPMR crecca_c_Page_121.QC.jpg
481188361b42686259afe406e2c41457
3568a7ce35c8fed17ecc723e82f539a90879aaa1
13500 F20101108_AABOCO crecca_c_Page_186.QC.jpg
228db97cc782ee8bb128bca940e18c78
ff3575c178e165f297ac96fd6a9dabb86cfbdb32
F20101108_AABOPZ crecca_c_Page_157.jp2
714955aa57c0ba365eb50e75a13e2f5f
b5d84ab5f377ece1395be51fbc26496272b4148a
6863 F20101108_AABPMS crecca_c_Page_121thm.jpg
0a4cdefa361a0e4d76c2662442d78f4d
233928fd641054430465daeb11040d55bf123592
1051758 F20101108_AABOCP crecca_c_Page_102.jp2
94db245a1598398707cd651c79e57821
c9d8ba6a3bc35ba59d7a6e518346d12fcebe08dc
27478 F20101108_AABPMT crecca_c_Page_122.QC.jpg
4c9c97cd58dde9b12aee11372d336c99
431bb3538bc72915138e7fe50b222e3ca38fcd7a
19150 F20101108_AABNYA crecca_c_Page_202.pro
e897f7405d9a3b310501329f918821fc
834e7af896d876ffc283bbbfe4591b5bc99000f8
21752 F20101108_AABOCQ crecca_c_Page_008.QC.jpg
3a0de00b45281099c8d1c27c83b64dee
ccd20f2408c4376359302882f128e83929511d3f
6571 F20101108_AABPMU crecca_c_Page_122thm.jpg
cca547048295df072a934b0eec926bb1
9beb07100e9ac5cc4505817b0b71fdbadbd590bf
575014 F20101108_AABNYB crecca_c_Page_202.jp2
63dfaacffb1e85e83477d2e255332ed0
841330d3f5f6fa13c43831179db51b02d1629f84
F20101108_AABOCR crecca_c_Page_028.tif
f2a473fcc13ac7783e4ff5dae377932c
4df02f8b4eb9e3b129cfc0ec13599d07d1dade91
29535 F20101108_AABPMV crecca_c_Page_123.QC.jpg
e275f27adb07199b8737267f198c499d
6f93b31a6a656359c8e295525f3827eacb6c952b
41276 F20101108_AABNYC crecca_c_Page_150.pro
470e52de19d1afd3f38572a7e69b85c5
75dfc252b784d276261479afdac51a728327e9ab
28938 F20101108_AABOCS crecca_c_Page_144.QC.jpg
926ba712b01db4c5e802dbe0fc30d325
8280aade39563e600cfbda8e24507910cde62039
6798 F20101108_AABPMW crecca_c_Page_123thm.jpg
6b45af8482ed04d5a8d275a5f5d4740c
5b99606a73de0c0339341534a6383226e44a72f5
57249 F20101108_AABNYD crecca_c_Page_060.pro
880e12395182e289adb12b95cad5275b
3436919d373716833213b0ca040653fa8f3665ed
11276 F20101108_AABOCT crecca_c_Page_154.QC.jpg
47b791a41ff9f89b764d02fc61a2307c
c2945926384628c49097d8db0b4743f6ffd8b642
29036 F20101108_AABPMX crecca_c_Page_124.QC.jpg
97cc36d6f6a0daa380079a4256017f17
f7a0cc70fe272ecc1860042dec6b2ecba53950cd
F20101108_AABNYE crecca_c_Page_091.tif
16141003a7abbf31fac28e2fad0498cc
8a7dca2314fb29e1dc1fb7d4cf2ae281a6e3e605
53103 F20101108_AABOCU crecca_c_Page_091.pro
090692d3477ddcecc7628f8a1fcc8965
3fe2e3e83ff32f355b1162ff67d661d6aa2b766e
6941 F20101108_AABPMY crecca_c_Page_124thm.jpg
c651f8fb0ec400bb9da393e7a9d9d481
e4ee5b7effb6700393a96c87337843c25660d185
285 F20101108_AABNYF crecca_c_Page_164.txt
91b62369f7d2ddb82ce1b6bc6787e652
b7b32f84dac891cca423731f57a8800bb170a255
5856 F20101108_AABOCV crecca_c_Page_130thm.jpg
28d466908e38bfae8fc0b99609add947
4cf4f92cddf07191e668e3ab2b7cafa8fc9141b5
28623 F20101108_AABPMZ crecca_c_Page_125.QC.jpg
de90838728eabc8d2efe2b371619dba1
04539e4f30e9e69857e5b357548e15f6cefec902
5037 F20101108_AABNYG crecca_c_Page_185thm.jpg
b70f1f1e007bf39bb0798d1d39a8c2e6
74d0aaeff4dcb51c96d4e1f55756d9f04ab74efa
52527 F20101108_AABOCW crecca_c_Page_052.pro
86b9a8edbe4503036fa1e04a62c6f8af
8fbed5e4a6795aa1158d5feef697dc6e65eea16f
12449 F20101108_AABNYH crecca_c_Page_087.QC.jpg
c079f513db709f7e8e4e0a07b84e7951
ea5a98bca8bfa381b3066e20bf42b9e1a816866d
495434 F20101108_AABOCX crecca_c_Page_163.jp2
e36da8adce8693e6c81b6877b9ac1e8d
495ec965ac983e351ad005dd3aebf79026609e1e
F20101108_AABOVA crecca_c_Page_150.tif
bd312dc3109b3954f456d107e7fd763d
68c15ca00c6f877c002b3e7df288f7bd861b2f55
27117 F20101108_AABNYI crecca_c_Page_146.QC.jpg
5f810c4f10a4618ea39e7c3f0e40169b
e3f6d60f45e4484ba62fb94ab641d660e130f2cc
27393 F20101108_AABOCY crecca_c_Page_210.QC.jpg
d1a77b592b671bc062a035a79be24f66
1ed81d918816112c7d7b6103a9de18d194799358
F20101108_AABOVB crecca_c_Page_152.tif
80ea93a64bf29b076e6e2d95645c2bf4
3678eaa6dc8f602e1d3d55a828181e9e55b981ba
2066 F20101108_AABNYJ crecca_c_Page_125.txt
8770ad44ff0a17eeafb2588a867747d2
171fb37530bcd054b633bbeec39453efc8b483b3
18263 F20101108_AABOCZ crecca_c_Page_205.QC.jpg
a24b71d14ab2f2a7fb4947d6f88e55a2
fc6d4cff388ec23cd3a498cdba1fcae1925df8eb
F20101108_AABOVC crecca_c_Page_153.tif
005c0415004ce5dd4577c7fe14b05a89
f903244919b11458c36288e6900c8e2e7626c453
438875 F20101108_AABNYK crecca_c_Page_082.jp2
d463c109dfb8b1ccb809043ac9db78d0
488a3acec046c0c0add9550c2b10532076b719f1
F20101108_AABOVD crecca_c_Page_154.tif
59488d498adf93c5e64337d67bf4d24f
5361455fc30444adec6eaf0dda2b35cf4d5f21cf
2160 F20101108_AABNYL crecca_c_Page_146.txt
b31f49c133780058fb68585016104f82
394732d275c8711f2ff218ea802908347f25d35b
F20101108_AABOVE crecca_c_Page_157.tif
bbd9b14a9b11940339f3d43464fd0685
9dc0d2d20556bcd80b785d0f5b07f2bf71be4720
6386 F20101108_AABNLA crecca_c_Page_020thm.jpg
ff5f812269dbf7edb28099bdbdaea1f8
61e63e99127e058a8866095b6a6158f52e43cb46
66116 F20101108_AABNYM crecca_c_Page_219.pro
cb7c7a9c0b577a3da0331d15968f8b80
cda4c5573f8a9fcd4389be718d42c3692ac0d04d
F20101108_AABOVF crecca_c_Page_159.tif
5c2f62ba8792ab9b86dcbc3cec593ff4
0b9d99819bf0405018efb00f0d7318176b1e39f7
24965 F20101108_AABNLB crecca_c_Page_101.jpg
d3853726dbb1bfece467426c27bae244
a4c1d6c6caaa5a4f4807e0cffad4be2133d31a90
3167 F20101108_AABNYN crecca_c_Page_008.txt
03259c4844c853284425465237658dc6
a195f15279aab961bf2b57413b2f0cabcb8e4f29
F20101108_AABOVG crecca_c_Page_160.tif
6a1cd436b2173388325ecccae9ed08f0
e7cfb90bd2ca676cec4c33dd72909f83a9af85d3
2154 F20101108_AABNLC crecca_c_Page_155.txt
26cad1fa2900b0cdef30e2cef1e055c7
ef8c3aab76bf6b8f33c7856cf7132d6cc1fd972d
76432 F20101108_AABNYO crecca_c_Page_072.jpg
5141d41669d78a7127da234afdcaa6cc
97da9d657a023fc5b2765a264c0da3b48feda16b
F20101108_AABOVH crecca_c_Page_163.tif
e459bc661646496713152c6690e50299
165890ab94cdb25a11393312de35193a9c79391d
F20101108_AABNLD crecca_c_Page_033.tif
5719b5c919e7819d194a8b229140ca0c
8551dcf9b7374c26542e76ceaa26a790d61b121b
59256 F20101108_AABNYP crecca_c_Page_004.jpg
0c732aad8b14d8c1acdc92da0c16f4d4
e6ce677027577279378e07eb209230832465d262
F20101108_AABOVI crecca_c_Page_165.tif
0248eb11deeaa34c198692aa77e19de7
38aa81eca8790d1b4411225051ade15713b6eb5e
2112 F20101108_AABNLE crecca_c_Page_124.txt
9f9510e056ed313378adc988d525d12c
574a819092a63428b59a1786533f7254fa62a9e0
F20101108_AABNYQ crecca_c_Page_017.jp2
6de736453227ed1448339897ea9314b9
3b5957a9d5c6d3aa62327a62e4d84c35
5ce6886ac291cf17b59a247cdb0b7fe5de0c050a
49741 F20101108_AABOHM crecca_c_Page_010.jpg
b36bacd631e8497499209e02392478ea
a72f28b4c36b624f4acdfb65ef28f5714a4578dc
2018 F20101108_AABPEG crecca_c_Page_122.txt
f78e47ccb4a120674fc889e445281e76
c05db0c9db7558e20e7d5312289026a963a0ed17
F20101108_AABOUY crecca_c_Page_147.tif
0bd9b36ac169148809c45debff4708bd
632f4d66ca2dbeb102794fa4ef11f1a8ee25cb0c
51259 F20101108_AABNKU crecca_c_Page_100.jpg
138f7c2fb8d7039a584b18e0aa22e5c9
7f351f2218b0c33c0b2b1b0bba34d140a5933cdf
89651 F20101108_AABOHN crecca_c_Page_011.jpg
d9183640d9f542a2cd4c5ecff1fcf09e
1b9a9d505415b5b3e3a21d8b0b8c4ee44b004dab
2174 F20101108_AABPEH crecca_c_Page_123.txt
9fdaead2887456330383bcabbb6b53b6
705ffc9b5ad1db99fa6608293e83d01f7c0f4a08
F20101108_AABOUZ crecca_c_Page_148.tif
bca36dd5e2f2c76dfd48a845017074ac
8c7062b88ef870d3540389a7a4448a528b7edb7e
F20101108_AABNKV crecca_c_Page_161.tif
1bd0b31be5f453fafd32ee1eb13e1ef2
9a3b58bf7949c52df827ec9384ffa486289094f8
91620 F20101108_AABOHO crecca_c_Page_013.jpg
478fbfa2ab60d1abfeb705e4b356789b
16c699429f3fe85e43515d30dafcec5d808d8b85
714 F20101108_AABPEI crecca_c_Page_126.txt
80a053e7b932c4c96276e4dcee6b9faa
ec194f879d95f693c60be8c9a78614a3c900be6c
2143 F20101108_AABNKW crecca_c_Page_143.txt
28c05a10245f8bf49f48c8b68150c45c
b2458096c5a9694a7c2903c2660ff9ac7b97b12a
54675 F20101108_AABOHP crecca_c_Page_014.jpg
5ed805c6160e756d90a179afa18d34da
1bcb35769d593ffe3042a8b832f67736449b1cf9
2440 F20101108_AABPEJ crecca_c_Page_128.txt
c4fe5d3c992da7f9700294c5aa1cf3c8
912667bfb1376f86b3a1a74ec28da0281dc56c45
1046 F20101108_AABNKX crecca_c_Page_136.txt
782c199406d94e7ea714463e4e99e7a9
5a1ec8b89e0ec071028fd4f40d82d1170900ae39
43542 F20101108_AABOHQ crecca_c_Page_015.jpg
999ecef67e2d723d9c22722b8cc9698e
3650c6ce8fda683d773376d8bc7153a3ce50c68c
978 F20101108_AABPEK crecca_c_Page_133.txt
1f3456e57500ddc6f1c2cf4fde9fb18f
0de09037a67b087384400167c0ff963b353aecc5
22281 F20101108_AABNKY crecca_c_Page_003.jp2
1f7b8bbdded0ca2a97321fc60ae32145
c3688b8f6f2cb45669772cae0efc59bd80be1412
82607 F20101108_AABOHR crecca_c_Page_016.jpg
dd5713050a56a8952f4ce50925c7d472
0ac9df9558eccb5322adab22d3b71179b513b7b0
1270 F20101108_AABPEL crecca_c_Page_134.txt
2d0e2506c3cc3f76c0d6ccd66ab80a75
9dc60f80562b35979ea392dc1d29bc5614474f49
28584 F20101108_AABNKZ crecca_c_Page_026.QC.jpg
8c2c88c915e12756466ae9c76aaf5673
2a507bded5d34a63e2d358d4bacaba2c56be0d0b
85955 F20101108_AABOHS crecca_c_Page_017.jpg
14e63c0d7aa27cae653b2e98bd104771
83bb37f7af097b5f02e97793739e0cd8e5b2a371
1021 F20101108_AABPEM crecca_c_Page_135.txt
54ba23874b006349113450aaec49cc92
2de62a9b99c2a63d587f4dcd3549436181136986
86487 F20101108_AABOHT crecca_c_Page_020.jpg
3fe819306a43e29666ac104514179e8e
b67d24f4c819d1c467bba21ddc5c57053433d3c3
413 F20101108_AABPEN crecca_c_Page_138.txt
68fd385a0469fb7533cc13c2f200fad9
653b9de7552e1e0fd3774a19642a9b7c39e8b9b9
85754 F20101108_AABOHU crecca_c_Page_023.jpg
697e924b895d13b1a6e9cff96896fa29
60d2aac5b3e82e89b31ec1123c9be43196ef7955
509 F20101108_AABPEO crecca_c_Page_142.txt
df2b2447b9c4914a6bfe238c9cc340bc
f7ff33635dc896e4432a9f4daf71c1cc154e3641
89727 F20101108_AABOHV crecca_c_Page_025.jpg
f81092e4d7ce502311cad3dc399b68cb
0b6c11f424ac87e7abb00e575b6aee0d084a652d
2193 F20101108_AABPEP crecca_c_Page_144.txt
10d7d449ced4bdb955b44a58167b5d02
1c8936f427b7754dbf83f665bb8ac844a8b308da
88911 F20101108_AABOHW crecca_c_Page_026.jpg
e1607e97eede39844b9e24ada92344c2
f317506ebc6abe01f9cdfac9f76419d96832142e
1610 F20101108_AABPEQ crecca_c_Page_145.txt
7287ea7e5718a40c58238d0206a1ff3f
53441feeae838ba67b108589ea6d3cd27eed4ec7
88035 F20101108_AABOHX crecca_c_Page_028.jpg
a9c75c7b87d9a3766ba2781615f487ed
345c554ee0c02c2aeb695cfdf25010a4c32d5aaf
2084 F20101108_AABPER crecca_c_Page_147.txt
996cdfa731a2c1a6acffa2249803cb88
be57fc68e0f343b94ac297a788ac1e0b5943de2c
71866 F20101108_AABOHY crecca_c_Page_030.jpg
9baf12289538aaefa53ee05f9b056d6b
3d855186f33afdfd27a5438a03ee3fc2a0b83303
2190 F20101108_AABPES crecca_c_Page_148.txt
aaec3248d29912ae8bf9debdb45bba5c
57df262ea2b2bf7fdac7c641aee194443925e414
84451 F20101108_AABOHZ crecca_c_Page_031.jpg
bd597673c1463ac1b39570f70c01a2f8
5557b888c479438183cbe23f81e07653882715af
2250 F20101108_AABPET crecca_c_Page_149.txt
c6aa76c53a57092827e2315d62d97934
aba74edc200ae0cb2b15a4f602360928404ef547
1643 F20101108_AABPEU crecca_c_Page_150.txt
d5b02bc1b02a354677332541311ec4a5
b9a58443384ed59c69909cdf62c4d4cc41c64a50
F20101108_AABNQA crecca_c_Page_142.tif
59648610ef4161910741d468e0629fef
34cc1fa5718d41503d7ec599d0abfaecec88e81c
549 F20101108_AABPEV crecca_c_Page_151.txt
52ad423017818b76ac926c585c2d5e93
9522bc862415ded827cb008741528bf2a0257d84
6614 F20101108_AABNQB crecca_c_Page_113thm.jpg
b0379224700d5ada63765b3cf18665a2
aea29d721b3c3f6ae94b7cac535308a839f4a9d9
2134 F20101108_AABPEW crecca_c_Page_153.txt
0c8199394dcb5d2eb326301e9c6723fe
5bf891e3e5af2d9de7b7d31bc14b4ac7df30750c
F20101108_AABNQC crecca_c_Page_051.jp2
2cbcafd2f3148ab9bef017c45b5ac388
573074feb24d52bdf0914ad5fba364febf6ea5d0
802 F20101108_AABPEX crecca_c_Page_154.txt
21a18cf816f1da22a43498ad29edbb17
2589445d9cfb039171acab63880bcef0ab928a93
27085 F20101108_AABNQD crecca_c_Page_077.QC.jpg
5784e2fa2be7fadb838d9f0611d06202
17f83c49e847d6e8addeb68a51b29688d5a7f082
F20101108_AABPEY crecca_c_Page_156.txt
0a24351ba5eaa004aac1484bdadd70b8
e32046f7126f24e72cf7c6f1c75d9567b3e0b185
86716 F20101108_AABNQE crecca_c_Page_049.jpg
41308d91f6ddaf9b02701fc03e1a3d35
a88183b1c2788aea1b1162585dcb4688f1db7518
2125 F20101108_AABPEZ crecca_c_Page_157.txt
c86a3dcdcdef322f01b27a07fd3bdb43
d88884196b3468799907f307e198bd28636ae501
F20101108_AABNQF crecca_c_Page_071.tif
f3d4fc3b66caee14b9fb7c15af9ad712
6d36b849d9c40aec9cbed9165ea28f265933677d
41992 F20101108_AABNQG crecca_c_Page_167.jpg
58719af405b8dd829d32cd94a813f764
6b8bdefb2cc8a47f4387903ac672b2606e52f3a4
F20101108_AABNQH crecca_c_Page_051.tif
e1c8349bf883da52900da780bfb7d96d
561b194b7cd86050a5b2816baf87f8a7fb197d84
1051971 F20101108_AABONA crecca_c_Page_024.jp2
605eb9286fb1456312a2d1a2c2b42bac
45ffd94a56b141f3b831adbed9437ab30edbddfe
28597 F20101108_AABNQI crecca_c_Page_111.QC.jpg
8772001e871b644368661ed9ff993956
97d8e54cf6b642cdc7a1b61c80a0b68c31373842
1051923 F20101108_AABONB crecca_c_Page_025.jp2
7390fa6e2b7e55785df0e45e31a8fac2
1002f128fe3a5db9c1d67e431c42d88c6031bdb9
F20101108_AABNQJ crecca_c_Page_217.tif
a2c795e4b966a7ab78d9a2232a35f033
eef19359ee29c1c9262e735c1fdcf66af2149ed2
F20101108_AABONC crecca_c_Page_026.jp2
2419cbaea430427272aa6d3fbcf99401
24f0aa9493dff8f3c987c15bbc00be4b626442ca
69228 F20101108_AABNQK crecca_c_Page_131.jpg
c1b1e47b828dff926f3a4de3aeb645f8
12f75c713cb207674374ca0a59bf56bcbebc825c
1051910 F20101108_AABOND crecca_c_Page_027.jp2
c029d2376b9fe23f00e96c191eca909e
11325d7fd92648b6426aadc741fc395c39d1fd93
14251 F20101108_AABNQL crecca_c_Page_015.QC.jpg
355bf73b5a4b69beccf2679825e979a1
9fd4aa74912aefd7adbfd2a317c1b8ec5ce0c2b6
F20101108_AABONE crecca_c_Page_028.jp2
401aee1695be24fcd33af28bad34b0d2
2fbb7a5fae4114f670168be666457194a66a534b
86750 F20101108_AABNQM crecca_c_Page_027.jpg
05c2aa13356f49f7604c159a6edc62c9
3104908f8d9f2ebec1711512d3af1d5bf3148e8b
91246 F20101108_AABONF crecca_c_Page_029.jp2
b779fd862ff023106dead6c6ea764ed4
d8a665272f7745ebb9e45e2d64277e4040326eee
F20101108_AABNQN crecca_c_Page_190.tif
e5ed57ba18e1d027106fad1b152d90ba
4d472fc2a65b7f07ee7a2fada988bcc438879b81
1051898 F20101108_AABONG crecca_c_Page_031.jp2
d98ee96d263a87692212a44672361925
171c45873d132f713155b31026d14b5fb40c74db
12909 F20101108_AABPKA crecca_c_Page_071.QC.jpg
fcb9717bf2daa6b9ba75c2f9f8230931
49e452c3ed91c10598a3ed7920ceb346d90d3097
863622 F20101108_AABNQO crecca_c_Page_184.jp2
baa6beb71967a89a5d3aef67891a024c
d4171e6c5df3b014ff8f0ee089c1235cee4aac95
F20101108_AABONH crecca_c_Page_032.jp2
15a222251ac70d6823990fb7f021016f
df80f11500aaa2c4eb635f489b5ecbbcfe164b05
4004 F20101108_AABPKB crecca_c_Page_071thm.jpg
4c79d65b94ce8a7ed745c541bc56cbf6
9524982f87f8d46473d838cf9b1d296883c23c2a
F20101108_AABNQP crecca_c_Page_092.tif
696a5dabe8949e32f60aa689af4d0aae
b011203671c61028e4c6ec207a26870530f32213
F20101108_AABONI crecca_c_Page_036.jp2
51324ade917f6e3ed667bd9961e4b899
16eb9a565e48c9f059621a0ae5c9517fc30f563a
4613 F20101108_AABPKC crecca_c_Page_073thm.jpg
3ffa4763467da7686f2fef038029bdf1
e220112d816cdd67ea2c0c32958afee23deab162
F20101108_AABONJ crecca_c_Page_038.jp2
d37d96d8fdad16912fdceb8378af6f97
372a5cc04fd9e89fd77002bd77f62c9c32648a68
6521 F20101108_AABNQQ crecca_c_Page_069thm.jpg
71128bc3d95107db4783642206795ac9
40e8920c0dec47123c33ae58b06959ced66bcf3b
1051928 F20101108_AABONK crecca_c_Page_039.jp2
df59c80a8094e13e1171ca60081a3c0c
440a359eae54e4e6d3cf5ae59b3213a717bc78a1
11699 F20101108_AABPKD crecca_c_Page_074.QC.jpg
44ad77d44eeeb41d2f06a0ea34bdc442
f641d4535aaeaa0bf9900252c42de6074ee6f2ab
20739 F20101108_AABNQR crecca_c_Page_098.pro
09a169a2526a942d963f5246a5f97873
e584c5ef2f6771982772d1845448c04954c2a959
F20101108_AABONL crecca_c_Page_041.jp2
a379a1cce4762458d40fd50b7aea34fe
b396aa3ff1e1c857698e11af6fcb859e74785043
3409 F20101108_AABPKE crecca_c_Page_074thm.jpg
784845188612255fe960d882f14e0a7b
108dde8b55ef8ef744046b92314bdad175c80352
53489 F20101108_AABOAA crecca_c_Page_159.pro
ebe0ca4b7671c41bc3b2afd8c54fce20
a6ce9e6627fc53f3011fce0ccd967fd78a9cc27f
F20101108_AABNQS crecca_c_Page_046.tif
0c4a3c0d1e728c81be6497be5664b62f
18332b2cab9e7a2020bb2a30a7ed76ca1122702d
9976 F20101108_AABPKF crecca_c_Page_076.QC.jpg
a180b3255728d9bd703aa7b44131515e
4c5f083a674d9cf248e27ddb06d1bc094f7fc636
19168 F20101108_AABOAB crecca_c_Page_154.pro
88c9b14287bc9b9b612dca71ecc5d4a4
f14e557d9da27a7477d302561dbcfa86e1ad464f
2083 F20101108_AABNQT crecca_c_Page_027.txt
1cb7ee6cb6ae5728f249b207966aed6c
11cba9e490e50f64b78fdb46d3d6c92388e2f051
F20101108_AABONM crecca_c_Page_045.jp2
e63ed074cabecce5c28737b3b533450d
cd37dfc79e96b66d7b62189be03a5c3bde132bdd
2889 F20101108_AABPKG crecca_c_Page_076thm.jpg
4772e2a608ea0545456e73b5545e1751
73657eeeb6acaed5abd105ebab1c0aa9af25a66b
4157 F20101108_AABOAC crecca_c_Page_135thm.jpg
7d930d8bae460bf7ffed5861af0056a9
5c2fcd8fd7d2977adab413f3e4de9f764d85c152
29447 F20101108_AABNQU crecca_c_Page_059.QC.jpg
b5baf241cb3b5f6f13e5d545d9a53b24
45efb6429ecc81fa3ae6322fc548d90dca13074c
F20101108_AABONN crecca_c_Page_046.jp2
dfcebd7ce0767c2e0c843241b93669b7
5c2fe83a30cfce5b06084e8ed8d1c412f0bcd0eb
6250 F20101108_AABPKH crecca_c_Page_077thm.jpg
3026b21e2d4681f4dfed6d529fe252cc
33d72a6b23345badeb71859a7f2b5505ce8874d0
2098 F20101108_AABOAD crecca_c_Page_080.txt
b93680b13a4e0f460c93927e223fcbfe
1ab959935fba05b738440c28a96eaebc3c1ce660
89644 F20101108_AABNQV crecca_c_Page_121.jpg
d033349e4f1cdea493314cce297142b8
180c0cc4d621f3619b0e11b83d6c2b4a59da2776
F20101108_AABONO crecca_c_Page_047.jp2
3847c1f0d769d1d55aafe34e48d7482c
50baef479695c79ccd972d764a5e2f3e3a545a98
6661 F20101108_AABOAE crecca_c_Page_054thm.jpg
7cac2d9497e5f17cd2c2f813597923b3
4d30f05a76501c675a73347688c865eac8718ff0
764 F20101108_AABNQW crecca_c_Page_082.txt
6c62cb010959231f42fce1aad0c7bd3b
baffb59bbaf0c40b0f065f0d349dd0ab7e26da28
1051964 F20101108_AABONP crecca_c_Page_052.jp2
88d24796cf1eca00249f2f26a2050d44
8c47d9b9bd6caa09df893e3276019afc181b0f53
28592 F20101108_AABPKI crecca_c_Page_078.QC.jpg
bf367defa138586773c49bb72580f9d9
e64bb073ce3df494645d3cc54532070768cccddf
46271 F20101108_AABOAF crecca_c_Page_079.pro
afa22e171d0514d0f229ed56f18c34ea
8759dadad25263a539248ed4afc844b09a800641
5923 F20101108_AABNQX crecca_c_Page_138.pro
bd192439c4f78f9c51c19f4f4bb45775
3e0680258b5eda7d42da9ba7782d5b6fc066a0ad
F20101108_AABONQ crecca_c_Page_053.jp2
c724c5820782195245c6cfd19b90be23
32fa06e8d620f3772dc5d7432398789b15a34ef2
6886 F20101108_AABPKJ crecca_c_Page_078thm.jpg
688b1c06dc0b889f000a4dc4b259fc39
2b522915638ede87f9a786921b00fed250c0aad0
5213 F20101108_AABOAG crecca_c_Page_183thm.jpg
d36bba258d85807dfa35fcd7cd8292d3
ed258001217cd654cc44b06e03e5da0d4416a04f
F20101108_AABNQY crecca_c_Page_156.tif
66c79e5ecdd3d0d259da3771080154cd
91c12796067067eb96773c5daa7da13f5b122ce9
F20101108_AABONR crecca_c_Page_054.jp2
018a285d958cb443bf0e7596a7b77521
ee2afe9de1e8dbf6b2d94afc1ec6d62d9811f463
6104 F20101108_AABPKK crecca_c_Page_079thm.jpg
e351d683e3d026d40f18efe3741d9779
0b46339023f3a7d2d450d0be25edcc12f9917e01
86535 F20101108_AABOAH crecca_c_Page_062.jpg
07ad0f7a78a19686746bb32eab8807ff
f4cdb781bcd9076e43b1230b14ea743660bd7695
F20101108_AABNQZ crecca_c_Page_061thm.jpg
bb667067bb3c4f10139b62c4bcd54e5f
8c73e964a1c9801aa22f44c5d1e09b722652ee7b
F20101108_AABONS crecca_c_Page_057.jp2
023cf7c6e9abec6f5bbf7b0fa83186de
feeb09b0b25222abf2c06dd5acada716459e7a35
28499 F20101108_AABPKL crecca_c_Page_080.QC.jpg
a11080e789b64ff3d176b7e55bc1b53e
726193ef562e9e6398d6ebac88fc22e5f57dea07
51498 F20101108_AABOAI crecca_c_Page_153.pro
6350dc3cb002e18a4a71afad16b7084d
46c1a0eeffa4747f115728dc3aab9192fd6dbfcb
1051947 F20101108_AABONT crecca_c_Page_058.jp2
a71ca489630c40d7f755313e1e9dfbff
8f743f5d50efb6cf9b75af7e9a09af36d804fe97
6664 F20101108_AABPKM crecca_c_Page_080thm.jpg
4740021140bdeec538f58d2905618f90
8aac16f8533200693159a00123c0750b320f0f5b
95017 F20101108_AABOAJ crecca_c_Page_060.jpg
129b332ea2600a3359d568c2d24c6c66
471d978c84d62335154a67ec08e9c2567190a01b
F20101108_AABONU crecca_c_Page_059.jp2
b3487886b6f29fff24b0fc1f46badad2
18afc2e56c4809d34a989a2b45b3aff7e07f63f6
30494 F20101108_AABPKN crecca_c_Page_081.QC.jpg
b15a712b5f57532fbaac46f1042885f1
d02d765111cd4df8c4ab70c36552a4e38df12631
323 F20101108_AABOAK crecca_c_Page_193.txt
4469c900e0d791a0326583d6eeb75876
a3af03dd61ea695351d8346a714cc032c45bcc0a
F20101108_AABONV crecca_c_Page_061.jp2
7e830036fdecd2ddc76138b62364df49
37200439eafb4bbda803aca26cd520886898458d
7052 F20101108_AABPKO crecca_c_Page_081thm.jpg
42902bc40bb200e944d7544104f8e825
1a9c5753ac8450e0e1f54afe36fef8c499d86a5b
26480 F20101108_AABOAL crecca_c_Page_050.QC.jpg
219ef1d4ef918f4c8002263c7883916b
4d6ff49abea74151e279dcc9622c2d1572b9efaf
1051909 F20101108_AABONW crecca_c_Page_064.jp2
8772bed7af0de6b4f39ee46423ec5974
98e100958788129e31eef6da2b8f198d5009936e
11180 F20101108_AABPKP crecca_c_Page_082.QC.jpg
163d03667d7cd67594f26d1a6800fd4f
e0be6432e2d66fcf8f9e7b63f4c3a155136e4c89
21874 F20101108_AABOAM crecca_c_Page_066.QC.jpg
7aafb995e2a0d5cd3cb148b7db44b8a8
56aabadb4cac48ec47797a5bcdcc9d1decd6185f
639595 F20101108_AABONX crecca_c_Page_067.jp2
b5bef2b5545763267a933f2228c1b5c2
0311f0c257f185ad05016b882e09a1674a677cf6
9585 F20101108_AABPKQ crecca_c_Page_083.QC.jpg
52ac0a5b06a70155a4a581fb92035c83
0b2bd0fe6acbee6e091ef5d587691764a570c9bf
27099 F20101108_AABOAN crecca_c_Page_110.QC.jpg
5f64bd9c90f192539630be6a5d7cdf77
bcbd6121268c39cab15976715ec18b19e2f965f6
956915 F20101108_AABONY crecca_c_Page_069.jp2
b1cddbca2c8bcc58efb6b230fa2acb70
54ce9aebb89852d598152d4d69bd7a64ad9bf44b
2369 F20101108_AABPKR crecca_c_Page_083thm.jpg
4da8c9781511a937f138d2f6ce04db43
1ebd3622a56175b080239297f0b0cd409c20af71
45301 F20101108_AABOAO crecca_c_Page_128.pro
4d16606607d775ac64b1563732b6e603
2e54eab1afc67eb87211936d1712e5d0a6ed6620
588287 F20101108_AABONZ crecca_c_Page_071.jp2
13dcdfe23c7b51d5954f981feb159d3d
6d0752bb4b1e11fbe1d21a3f028664a8361dfff3
14875 F20101108_AABPKS crecca_c_Page_084.QC.jpg
1370e7f41e180974615050ef6afb7b2e
c0ce8794f51d748a647cb98b22ab85619b9a6c09
13682 F20101108_AABOAP crecca_c_Page_010.QC.jpg
cf060930830983b8767a7c6ddd6fcc6b
b230838338a8a690c3f6841286fbd09639d32521
4522 F20101108_AABPKT crecca_c_Page_084thm.jpg
0417ecff7198b6ec91193aa7d44ab867
84621e49283428aeda4116ec93b61910fb6a6df8
800532 F20101108_AABNWA crecca_c_Page_084.jp2
a9f453af3cea069cb729eadabdcc8990
697e37eb34b9020547b9d928e3eecd4e34959775
6326 F20101108_AABOAQ crecca_c_Page_147thm.jpg
038e2c9a0d4af06f8fd9511c8c042b71
e4f8815d8b478fd5b163c38ef921256700cd599a
17680 F20101108_AABPKU crecca_c_Page_085.QC.jpg
9cfc2ccbd6706a5d1da6a8f757f2c0fb
70bb2fecdba04403fafcac97ce2ea05299c21492
F20101108_AABNWB crecca_c_Page_095.tif
1913048041276693b41cacbeec885b50
df3aa519895d6521518300bc59d0ec656a57272d
2092 F20101108_AABOAR crecca_c_Page_117.txt
2d3ba9b362cb2f5101154dd3adef292b
95ccd13be47a2439f9dbb7ce2ec5e86a469cca6f
5255 F20101108_AABPKV crecca_c_Page_085thm.jpg
01db17cd3186211e5403188136cb64ce
c5fac1e80a46cbd36dec78651cd173620e6af516
582504 F20101108_AABNWC crecca_c_Page_168.jp2
0b3733bda6cf68007d820cb9d576f406
a7888d8a7730c5e5076819bdd4960ecf450a8cf1
F20101108_AABOAS crecca_c_Page_226.tif
9b9df5065c5ec36404441071bd3a5d13
696b4d203ad11ff4e396fef713b7e6301a520ead
8535 F20101108_AABPKW crecca_c_Page_086.QC.jpg
04ad535f5e5b41f0ac8b2f7b2ec69ddd
332623c7224bc323b55835474d1208c747ba9c61
1043670 F20101108_AABNWD crecca_c_Page_100.jp2
ee551ae0c1d5d28039137737041c5862
cc4d93d3cc53f74e5de1ee79f1e0862d66114334
100126 F20101108_AABOAT crecca_c_Page_128.jpg
eb56de5465ad75ae91cc1fcdc02e2e20
ab8e09ef528a8ea4ef7c33641e9465eeaf8e7b10
3474 F20101108_AABPKX crecca_c_Page_087thm.jpg
66f3542706468e60da3dfb99a5b8dee2
2d61d7cb6a9edaf1e124797bb6f661f2f80fbc80
529 F20101108_AABNWE crecca_c_Page_130.txt
c12d37076e9ff09a44677deef9e5dcd8
1b3ef153980c3fad48289f8c13f17033fa7e366c
6995 F20101108_AABOAU crecca_c_Page_212thm.jpg
aac2165fd373e79432b0cb521ec209d1
89571a1d57ea5577aba086c35dc1f1fd6da7d124
5249 F20101108_AABPKY crecca_c_Page_088thm.jpg
f3ecf5d0a8ae4a0c08296c75298d8512
9988fda6aa38a7621c5142a68f34ea4f3040128f
18601 F20101108_AABNWF crecca_c_Page_208.QC.jpg
76d2475bf5440ceaaf29212f5601fe8f
a64b9433337a4bf7ed0c6a40459c0b0a2b9771e2
28221 F20101108_AABOAV crecca_c_Page_041.QC.jpg
2785ab02f6130ee833d3038a939428ac
212e790f96d443eb4cb8f098328121ff53d700fd
6417 F20101108_AABPKZ crecca_c_Page_089thm.jpg
dfa4458ddd4ee87c09ba2a8fac080fd2
d0aa5783e296fd29706108dc016d7a592957de05
F20101108_AABNWG crecca_c_Page_105.tif
91dadfbeb24696b7f777da58e13424ee
4058435a8f3548061906811386ef71829e2c27ee
14197 F20101108_AABOAW crecca_c_Page_187.QC.jpg
6f83ab88e2d07b2b7fdf33b2bde933b0
d7d156be02dc1547954bb505448f360e6d356015
26611 F20101108_AABNWH crecca_c_Page_175.QC.jpg
fe6ecbb633d9d001bbe195cf0a65d9ef
11b031e8143f808d732f8a868602c4187af46a47
4207 F20101108_AABOAX crecca_c_Page_203thm.jpg
6556c2b2428a478bc951db1e70141c0d
0cca0bd6cf1d1ebc4ce6d1d92f90ea373d059aa2
F20101108_AABOTA crecca_c_Page_074.tif
dbeff83945a38c6e206d4853e2c338c2
ce9ed9dcca6fda11a07707d137a833f9776b959b
2089 F20101108_AABNWI crecca_c_Page_105.txt
dafa3b75514c175af64c2d511848f374
8fb0af6c533d22596d9be3dd3f1795e68e056956
937 F20101108_AABOAY crecca_c_Page_072.txt
1c99a94f862b36becdcef0aab9720392
1fb8703358e14536a4ce242da51dd0ebd3fb9dd2
F20101108_AABOTB crecca_c_Page_075.tif
1b192d547aa99c9c5e0cae2812477548
3144a9a8abbe590a325dcff6ba59fa0ca88ac7d3
85999 F20101108_AABNWJ crecca_c_Page_092.jpg
1c6bfe7212dd3fb3fc600e3aa7bc1d39
f4eedeada8ce9f7725fc96ace52939912d248598
61451 F20101108_AABOAZ crecca_c_Page_210.pro
cc9a23c85415b52a559ee87df7dbb311
b813bfb413ccc7c421eab39e1a219abd8ce13e54
F20101108_AABOTC crecca_c_Page_076.tif
cdae1d20cc89b1ba88a41a1011f8d4aa
5abcc96164a1c69c6c902cd6f7fa527a6c94a1db
F20101108_AABNWK crecca_c_Page_129.tif
056cac29a7c254374222e133a83c0ea4
c64d4056fec673e166f4e9680b6e826f59baf596
F20101108_AABOTD crecca_c_Page_077.tif
9260d11b6b609c184fd72233add7b41a
2bf0f83d93ca78c89f75b5c04c4af9f387d9c08d
9058 F20101108_AABNWL crecca_c_Page_086.pro
a6eb5b05b782169c89d060095bc3b950
45777a97fefa6349b8efaf722e595b28e01d909f
F20101108_AABOTE crecca_c_Page_078.tif
9291dcdedaf762849d34f607bca9aab8
06a3405a0f878a489245efa66f5b97209aa0211e
22875 F20101108_AABNJA crecca_c_Page_015.pro
cd260f31c8dcbfbb9f1202b7059d901d
be66554715eb937dc49f93f8f879e3727834a9e0
F20101108_AABNWM crecca_c_Page_032.tif
80e24fadbb555ac770c62ae3a42eccba
a38a2491fe1061bcc1ae1dcd2146a95fdc56190d
F20101108_AABOTF crecca_c_Page_080.tif
351befda31e8a79b6b5d66e33a15e48b
1280ddc45b46cb28791a126b90711748938281ed
48451 F20101108_AABNJB crecca_c_Page_089.pro
a9d7c3641fe82eac94aeb52c3c6beda4
e847d2f06d4822665ab360d948046529e2950bdb
27700 F20101108_AABNWN crecca_c_Page_179.QC.jpg
c5a962dec5e7f0dc7a4edbf96f382673
7f511a72679c12d3470c2085bfce6633f63f74da
F20101108_AABOTG crecca_c_Page_081.tif
963d77e44022aa6b580ed2b91895993d
880be59c9fb9f573e52b2e693eb6dc9c93d924d6
F20101108_AABNJC crecca_c_Page_167.tif
c151eb698e045520267ebc837cc1d86e
94146eb627b65e8af25a4fe2781c96d1fedcf32c
F20101108_AABNWO crecca_c_Page_147.jp2
c42f60edddecff7a4ef20d36b8bb63df
1676f88ce30eb9366f8130f7a96a5dae86ae2496
F20101108_AABOTH crecca_c_Page_084.tif
e808dede5b0998b987928cbea4865e35
0c99fdfc89ca0104060fbbc81842cf23a53155f7
19436 F20101108_AABPQA crecca_c_Page_189.QC.jpg
78da3443eedce62c905c053762931cb6
b6b3dff59e84952dca55e46e3cb23d4d65b7c264
52887 F20101108_AABNJD crecca_c_Page_078.pro
ff4825246558c75c2014b16865c61bdb
9a40bd75c54912a52ced0bd217cfb530322d693b
1051956 F20101108_AABNWP crecca_c_Page_159.jp2
9bcc95e920d9e3d5184d92f65ea1221b
f896557e8b2af4520ea5d0b16612f6dcdc6cb6fb
F20101108_AABOTI crecca_c_Page_086.tif
628624185f54e8c43fa18568c9612b9a
28074b9e12187336a9521981d318fb262a3f3764
15019 F20101108_AABPQB crecca_c_Page_191.QC.jpg
d1a2374193d3b967952fa3cc32a04bc8
9e912104517e3b26c7b3298f5c434826de4d70e2
6699 F20101108_AABNJE crecca_c_Page_108thm.jpg
a4da74d88bfbd8ef27dc388f154c9d4c
bc8d95115159b3e9b89ffeabd9fa20a03f53045b
F20101108_AABNWQ crecca_c_Page_162.tif
44081ad6e43e63962476057928b5c469
793cd6c6eba3fadb3d074568d6c02f32e56f7b17
F20101108_AABOTJ crecca_c_Page_087.tif
d16d2df387b69c6e0bbd95916881347a
89de0843d62bd1a6c0ec64414b80d0bf015bede7
5092 F20101108_AABPQC crecca_c_Page_191thm.jpg
8001bf23eb7ac7de47ddbce5431828bd
1febf2060cad6465b88f1f06f0cd91cdc76aac5e
87175 F20101108_AABNJF crecca_c_Page_018.jpg
4453c36aecc514e9c28d20e66c9423f1
3d1687c676f14ac252d9caa0111f6f1ea43725b0
F20101108_AABNWR crecca_c_Page_141.tif
08202e775e3a55f778a836385aed10a6
23faa454dd863025cbc57ccc7f0b6f3927d2477e
F20101108_AABOTK crecca_c_Page_088.tif
9ab55f061dad1b43513e23687e6612ef
8fc351a15a0a49fdd6941b260e4e707c6cedbc1b
23588 F20101108_AABPQD crecca_c_Page_192.QC.jpg
fadcdafd14a995739e26667b43f0890f
699d237362f10f9b93a7cb81a27b296bfdf2038a
636649 F20101108_AABNJG crecca_c_Page_226.jp2
7952022721f95439a6515d782d8a46a5
e2098cea56d688f966b4fd3cf5118a823fbae907
F20101108_AABOGA crecca_c_Page_119.tif
1177baeb2c6598ba4fd4631d5777242c
cf1ef3e3d6f64030d695f05e44489d3f16a0be80
410 F20101108_AABNWS crecca_c_Page_086.txt
8f6c8eeeee99082a92ff81a6e6d24196
77d094df1b15e897dcc56dd510889ce370c44e1e
F20101108_AABOTL crecca_c_Page_089.tif
d63a47eaa4d07781f4cd513ff76e8910
abdcf39e7d659938617484282853cfa7765dbcf0
7233 F20101108_AABPQE crecca_c_Page_192thm.jpg
3b703cb11d05ec215a6a69ce6ae77aee
85c690e805f96289cef3ba4e62a9b18d85b6a983
2068 F20101108_AABNJH crecca_c_Page_116.txt
c8b391c874028d542ea3326f10bcde35
4c000fd090e07dd945641d6bd4fa7146ebc3c17c
19158 F20101108_AABOGB crecca_c_Page_208.pro
9ab90165b1bc165e64815b814c7e49fb
2c25604de472c4317e42106ae1a39025d5dc6cd1
F20101108_AABNWT crecca_c_Page_131.tif
bd2cff96c5aec691c471b75123c3f735
76f54a410bbca75366b2afa355d52f4a74ee1934
F20101108_AABOTM crecca_c_Page_093.tif
5274b7a496619c8404d65f093069f104
040238ccc4d42d1e84508c6c001a3dcb3412ea5e
14519 F20101108_AABPQF crecca_c_Page_193.QC.jpg
c6635f93b59080c9e6e9d308972dd528
10b9b42b1bb4ba8ce69004ff46a946bc24733003
104707 F20101108_AABNJI crecca_c_Page_216.jpg
650bed557e9d71e58903f4179d78fc58
d42fc2e5a5c407962522297ddd4334e0d17d73b3
3303 F20101108_AABOGC crecca_c_Page_151thm.jpg
596aa80a79752ee3556973cdef32b116
1570f36f6f0aebad0974c8c4e3ef936f93ff108d
21796 F20101108_AABNWU crecca_c_Page_185.QC.jpg
462a04ad45a8738fd3d98a7e54b7a4f7
7055f0926c8036ce62e8cfd83328f65b5e8cbb2a
F20101108_AABOTN crecca_c_Page_094.tif
748b76093e4d11f92cbf3d0d3f29ca43
88cca202126248b875f64e8b9005e783bbfa67de
17954 F20101108_AABPQG crecca_c_Page_194.QC.jpg
57bd690639fa61aed123f661f5e5afa8
e423de39804e33551c338e214a17bce11b2d3b55
6462 F20101108_AABOGD crecca_c_Page_052thm.jpg
3e002bfb1e6e646c99d3de9d6c76846f
f3fea0601af280a4840b8a0c648b4e54b5f49f6a
51214 F20101108_AABNWV crecca_c_Page_194.jpg
fb25e0179f9eca864b4ffd5f8b8d52d3
5b11a30bf38332b46b78bc02b38edead9a54d505
F20101108_AABOTO crecca_c_Page_096.tif
67c6bcab3389b2cbab74dbfcc537b477
1c474380af7c8ca994fd1c64a34b467f00cac0fa
5236 F20101108_AABPQH crecca_c_Page_194thm.jpg
df02309a33eef285746f801f72bafd59
4ae4621dfb2579be92d67e6e757e885327493bff
27686 F20101108_AABNJJ crecca_c_Page_062.QC.jpg
5823ccf4ff8673a0c2a19549a851edaf
030c1e220c67a98a41d53521641870c827466f92
1051917 F20101108_AABOGE crecca_c_Page_170.jp2
64c08ed4af69bfce271d7696f778d174
9e174877bd61d76071586d37ad7de648a6a80f4b
F20101108_AABOTP crecca_c_Page_098.tif
30268e25fe128b732c79bdff90c10e74
daacac65bb52d585fd3fa68bd54c31f10db1d033
18475 F20101108_AABPQI crecca_c_Page_196.QC.jpg
008f664195ce6bdf16be8163580577a5
13a709c72695ff8772a69acc944df550943d2b37
760114 F20101108_AABNJK crecca_c_Page_004.jp2
be15745b67a06befc0f5e467fd279dc4
467d9b333a62e102b96a1b5fe13ad1cc37fab77e
3665 F20101108_AABNWW crecca_c_Page_015thm.jpg
dcda877507fae4c9cdc884b0b4d02752
15415a1df3647938a7beb3c245955b69e421910c
F20101108_AABOTQ crecca_c_Page_099.tif
7c2eb04fa640c96e5fab74a34f388314
78f0eec8d680ddde1a3c2a4c490cd90abf821639
5461 F20101108_AABPQJ crecca_c_Page_196thm.jpg
4f97664f9cd9b2bd0840aa893ab13ba1
7aacd0eb9f78895b350d24c060b12608b113a28a
F20101108_AABNJL crecca_c_Page_011.tif
a1fdf85fc291134e45485f19e44b4f93
dcd7221723b1daea1e6b9e430dfddc37f9a1e934
27081 F20101108_AABOGF crecca_c_Page_178.QC.jpg
1c0501e7a5a9b66cd44c078030d6e11b
d4f3e4d43b6cfaf5315e5d1d48dabef6b1d93b5e
4502 F20101108_AABNWX crecca_c_Page_004thm.jpg
a4a57356f8d5e75d36f4afddb8e34056
3259f2edddb18a065c36c7de3b21c420eac97296
F20101108_AABOTR crecca_c_Page_100.tif
b61ae114e935a4b568f6ddbe9329b44b
ad9a0b546892340a2541286bfb5b718b05d300b7
9199 F20101108_AABPQK crecca_c_Page_197.QC.jpg
aef32abc64ad317d9e9e001b4fa4650f
6c17f5f57fe34f65e47175ab3934b35b8a7acb63
F20101108_AABNJM crecca_c_Page_014.tif
32a44e7fe372f83ae4e883d3a75a2579
a216d554a76344ee7ba740ea28c06921c2c2557c
28319 F20101108_AABOGG crecca_c_Page_216.QC.jpg
292ecac3c27458f786a3952ee594e943
faf4052152badf17d78ab06b02c8090e108c29de
40674 F20101108_AABNWY crecca_c_Page_181.jpg
43346362449c46eaeb3300f261640ff6
548c5e8878e458dfddd8743302e5c5e94b89f1f6
2291 F20101108_AABPDA crecca_c_Page_059.txt
1741ea78d3e4f18fd1008496976b1fcb
1b9ef3edb531025bbb9c833dd94ba6b52ad76a9a
23135 F20101108_AABPQL crecca_c_Page_198.QC.jpg
5c9ff9c5e52389431936133367b7eff2
a39f03decb9ef4ee4154ca4cb21cf1d9b7d7cecb
F20101108_AABNJN crecca_c_Page_044.jp2
9d75193cb3cbb3acdb914a325d60127b
fe9f1ab077eae8bcc974ca90fcdd1c6806859bcf
4216 F20101108_AABOGH crecca_c_Page_206thm.jpg
82aab868c9d3a136dc70a5469d554d21
30b53b44e214a107cfad2e25458b58a7c997ccfa
23545 F20101108_AABNWZ crecca_c_Page_055.QC.jpg
f4a2da531e39c8796e13d1f8cb83e99a
cf5fdf113fbe979239c91543ca4c9c6139b51d19
F20101108_AABOTS crecca_c_Page_102.tif
d3e65fafb4f8a070bac7e132dc3aaede
0ad3235601be172aee686cb19e366df699139bb6
6408 F20101108_AABPQM crecca_c_Page_198thm.jpg
f26f97cd3c5375fd5aa534eb8bd00f15
efa5cc6ba6f159a4136e2f26a58a803e933b58ca
320194 F20101108_AABNJO crecca_c_Page_200.jp2
b23215137f99ac61546e009028a965d9
e0b874dfa46bbc997ef7e10c8851e4b859df1c3f
59829 F20101108_AABOGI crecca_c_Page_212.pro
fe63153cc5cbee7222e1e3b4e19ac758
9f80f9fd64d6c9b262b2cc5092f68a17f217ed0e
F20101108_AABPDB crecca_c_Page_061.txt
967fe2c98530da29afb92b162d02e61f
d8cb72271f6ae5fad54d85ee7939e49ac922ea91
F20101108_AABOTT crecca_c_Page_103.tif
9b3a354f574b2249c0604ba5229ca8a1
f2935087162e1f829a27094e5deb31850686efd3
2246 F20101108_AABPQN crecca_c_Page_199thm.jpg
093626f2065b7f0be47b552d3c3ad61d
ac5c33ee633581c0f6407150138117cf05934111
1299 F20101108_AABNJP crecca_c_Page_201.txt
97488f7b2e26623eb0b69f0c6b4a367d
918d07eac749797f5c5acbbaeafa24805220a9c2
52370 F20101108_AABOGJ crecca_c_Page_121.pro
2201980cf66e8615d94cb12579c92504
4322614b19f0b918705d6a843de7963969e230b4
2109 F20101108_AABPDC crecca_c_Page_062.txt
54338a5043e9561b037c692501a96cd5
3c528eeffb0944211e73c6ec6b3a2f3166a76f05
F20101108_AABOTU crecca_c_Page_104.tif
1c99eb1456d2c4d17fbf6fd548fadc42
0b212f63f1850c0770a600861b097f2f3c289d20
33233 F20101108_AABNJQ crecca_c_Page_165.pro
ebaa5f05130e3ca744cd4c2997908316
42a822f0d59e5bd19e2d79193ec7307d6c3b7821
F20101108_AABPCQ crecca_c_Page_040.txt
c52a63cc5ce185e6a34ea07bf413e7d1
f845548f6189348f16fb7bf0987d97113043d38a
53400 F20101108_AABOYB crecca_c_Page_062.pro
7b6d3cca01ef337a7663c5f38221957b
4dab130758260d3d57716b924617ec9b7a7bb2c3
F20101108_AABPCR crecca_c_Page_041.txt
c7da74e7e41ec6aa8630caed050a0ce2
2a4d5e688b16616eb1c5915cf76da353ef7088a0
40242 F20101108_AABOFY crecca_c_Page_145.pro
57c8a809e50afb55dd26c797b8779d4d
96f907ca2adeebc7d303fb6dcfea7529b5c65c91
51979 F20101108_AABOYC crecca_c_Page_063.pro
ea2e2f571d936109f509605cfc173bfc
5337d062a29cbf7cf35025edcb21ed8b1b98eea3
F20101108_AABPCS crecca_c_Page_042.txt
6e54d6f165feb0171843ac7205969463
e3e17c39644e2e03d33efc442288355e8c5f52a3
1357 F20101108_AABOFZ crecca_c_Page_137.txt
c8d6a1d5eb0e42a535f9f25642b9f5e3
960330e5a2da31d85b73fba073dae711f896c50c
52311 F20101108_AABOYD crecca_c_Page_064.pro
5dab4573575880a32e05030160ebb7dd
c53a7346eac772fafb341c79f025697600adf61c
2110 F20101108_AABPCT crecca_c_Page_045.txt
c566677b777f697cce02a1d430e51f50
bf52084664f5b32df7931e24af7720f386efc3f7
54221 F20101108_AABOYE crecca_c_Page_065.pro
71eb7013894ee9191b3d78560a0456f3
3c2ad83f2ebe28a132e8f9426f5b0504b3e72b1f
1885 F20101108_AABPCU crecca_c_Page_047.txt
5fade20cca96ebe3a819fb3a6dd6b4f9
34f706f7a624e64594129ea6ac39f3fa678e813c
F20101108_AABNOA crecca_c_Page_035.jp2
06a6c9ec140fa441ddd409515da46356
d838679a82b4aba8ccac2f4c02109e8543eb15dd
41222 F20101108_AABOYF crecca_c_Page_066.pro
43cc1adfaa16d6abaaeac8f3cce7d37b
546214bab7957ca4b0988a96a5cfe465b814b45d
873 F20101108_AABPCV crecca_c_Page_048.txt
c0904e0a0457347de3639177a3fabf26
1c52995bbe4d315932abd810bad14c5ae3975ba8
F20101108_AABNOB crecca_c_Page_060.jp2
1682ac91082a9198255e9dfe02f28ab7
c6cee00c88a9da266939497966328cae446b622e
31654 F20101108_AABOYG crecca_c_Page_068.pro
659a0c2501d6d9a634b6aa8961af7791
8fe96dbcf6b3e957ac880c44cceb8154ed42748a
F20101108_AABPCW crecca_c_Page_049.txt
bddddaa762cceac51495fdea0bc21ddc
cc6e8d148575467512797bc60b2e4782cb860add
1568 F20101108_AABNOC crecca_c_Page_165.txt
83fddad2e426af0298c11a04fbb29d86
91e75bad04c52836eae8eb8acd830e1f8c075831
10797 F20101108_AABOYH crecca_c_Page_073.pro
19ec701e6c924e1c80d74b8fee187a7f
43c4252e921e816fd3b843b1681a07dc2de47981
F20101108_AABPCX crecca_c_Page_050.txt
0f193564bc573de7f891edb9162494ca
09065528b829542c9745afa7107002f99ba42cdd
27430 F20101108_AABNOD crecca_c_Page_092.QC.jpg
e3991e4d65fcde41042b9f53c0e18b8b
2285bfef26b21b045aff301afe22ebaea0fab35f
50700 F20101108_AABOYI crecca_c_Page_077.pro
0a6da214eb45afc8ce5f947753cb2ae5
244ce8236cb24e08c45e492738b968c004164bd1
2150 F20101108_AABPCY crecca_c_Page_053.txt
6afc51746f4af3a1512d44ca0f3f60ea
0f64c89aa70122acba598e7477ea0136732c4faf
26311 F20101108_AABNOE crecca_c_Page_009.QC.jpg
f4f2eb9ec2fe07c54c1fb1b24378aa9b
699d85e767dfe0c355d2bd12eeeef4223318ca22
F20101108_AABOYJ crecca_c_Page_080.pro
67f1c983a6c0000419e615816f5e5f9d
58837ef406683dccfc0ecfba6ad3869497157a58
F20101108_AABPCZ crecca_c_Page_057.txt
42a8cf89eef3b1dbcfefddd955dea274
8d75663fd035d9d11392f124cc2fc2281e7b52e8
1851 F20101108_AABNOF crecca_c_Page_185.txt
d2f6b6b90641eecba5075883898a2160
cb043d5f76e00c7fd21c299ee6bc66ff015ec1b9
56746 F20101108_AABOYK crecca_c_Page_081.pro
46b1668fa6c1da02c4cafd1acb003397
ef30bbf8b23bdb471b27180c707a5ca29c98d386
87935 F20101108_AABNOG crecca_c_Page_146.jpg
ab7cd0756803cf76352edf8b62be456b
b9e316e8ad439baaf017263f146c0911ec3e7c0f
19265 F20101108_AABOYL crecca_c_Page_082.pro
6768aa0aeb47df3e1ab912b1dd8cdf41
f21183ac4bef13ed65c15d9a3953d2e732d13fc5
F20101108_AABNOH crecca_c_Page_064.tif
fe9dc8944acc3eb1eafb14b471f83467
2ebdce3092e8e6042bdb8eaa3af258681e00359c
40312 F20101108_AABOLA crecca_c_Page_168.jpg
7f27c84597f496520704f8badc0ce744
5dd28f74b9ab405341f7c1b54d7e3ee2c196e6fb
15204 F20101108_AABOYM crecca_c_Page_083.pro
58013412d6ada4b0d150a9ecbe87d1f5
b6f36b0ab984b2c3f0cd990a0bc4e370c9a13809
6082 F20101108_AABNOI crecca_c_Page_009thm.jpg
d40ec2822ce59a08bec6975151897370
6b423dd03ba0518fb01ce5dc87edfeb614573cd0
26559 F20101108_AABOLB crecca_c_Page_169.jpg
930355774ad40e3c2a399c860a170e5b
09de29c41b85ed2515c581c1105c8e2479ef5aaf
13769 F20101108_AABOYN crecca_c_Page_085.pro
522791df65c646d21f59ef00c2233e01
fc624442840ba2200122417aabc39774ae2937e0
F20101108_AABNOJ crecca_c_Page_217.jp2
b0e1db0e6c667e63c22f37a6ce540fcb
f96d88f1b1714aef31ffb285c49129088d93c44a
79904 F20101108_AABOLC crecca_c_Page_170.jpg
d17703d506c9f3b814548496d41abc96
ed54e26b532541b94351783858b748a4545f53ae
15953 F20101108_AABOYO crecca_c_Page_087.pro
6cee51c3862569de810f849833ac91d5
244bd7318537ad56e95c56f3c79f807af05face6
80503 F20101108_AABNOK crecca_c_Page_110.jpg
36208b910498dbbf3a0dbf52ff336e59
d1745781951fef271c65124e27ac5e1cfe608238
89146 F20101108_AABOLD crecca_c_Page_171.jpg
fbd7465b1147cb6d0439c11b10e2c0c3
4a262d0df9082da86ad7b459f530cc9e3d73b91e
11402 F20101108_AABOYP crecca_c_Page_088.pro
77b7635f9354c633538c6c9416607583
c87aac765b5f1701a83251dc518cac21d894ab41
F20101108_AABNOL crecca_c_Page_022.txt
7043847fecfa51fd583cf6eaced7941b
e2f6f11c3f830f8e6b81c3bb36d1dc6c4b844ee3
85810 F20101108_AABOLE crecca_c_Page_172.jpg
c3ec0d77a152f9fe7cc52a8d49f95512
b5db88a76991ce6185d918528ef6a95fa9b67a97
49186 F20101108_AABOYQ crecca_c_Page_090.pro
4cde5660bb5b7a6ee060a41b3f634f44
6cd915b473b85f6ccfc32de09e93a886cc6effad
13516 F20101108_AABNOM crecca_c_Page_132.QC.jpg
0eaae63b2589eb4f5136abd4389a70d4
da526e01ce34a20b68c30ec6e7a2d06b3a51c22e
86973 F20101108_AABOLF crecca_c_Page_174.jpg
121c7f8f23a83383bb021b06a9f33d0d
4ec41609c5d5d1d827ae56e6c5fb0509b98002ec
49366 F20101108_AABOYR crecca_c_Page_094.pro
d5279b73ee4e6b62fe7c184c5ab3a35d
bb174974572dd5a450f3f596405125fe7fd5bcbd
572763 F20101108_AABNON crecca_c_Page_205.jp2
2e2abd33d7fb88e003ceeb012ca2e5f1
02082568aff3ef12d18567e0d6fbcfe8ff0e9ec6
83777 F20101108_AABOLG crecca_c_Page_175.jpg
d0c58e6cfe8f28cfab128eb42588d4ff
c3c7cd76a8247606008785264296349270bb99b9
26829 F20101108_AABPIA crecca_c_Page_032.QC.jpg
dc04316d88e1f26885a8a99988c9da04
8cfe899b98996914616cb17ec99ce774bb09e199
42813 F20101108_AABOYS crecca_c_Page_096.pro
42b4e98ee4ed6df208cb578e68556b82
f51e37a4ebf074734a03497220fdc26b81fa8199
85658 F20101108_AABOLH crecca_c_Page_180.jpg
d68c934641e6c208532e45b76b9fb077
f565ef43e20c8bec08fa15bcd605d61028437abe
6584 F20101108_AABPIB crecca_c_Page_032thm.jpg
638943211f7e382622a6cf3c3dd8e17a
b557c186f5a18fd6e9906020af7363d5e0ee079c
36437 F20101108_AABOYT crecca_c_Page_097.pro
40f8b50981e54f5368bebb9224e35d9d
57ea403e006d58e022e9c82a1ff637a84ddf17fd
72042 F20101108_AABNOO crecca_c_Page_192.jpg
085b54be90cb19851e577ebf0d412a48
0d05d7ece6d1a084d1bf356e91863c34aded0fbc
70987 F20101108_AABOLI crecca_c_Page_184.jpg
f5a0b81265948497967d997b8d6ee330
29f2751d67886f96d71827cb450987200f81db59
27264 F20101108_AABPIC crecca_c_Page_033.QC.jpg
3fe2c5e797cd0cf89827a10de853c4c2
7a446b5cb47324a2db8352f82092a7ffdf0c6d12
12533 F20101108_AABOYU crecca_c_Page_099.pro
2876ad2dee389d29d993263d5ce3c75f
33713f684b3ec8bbd30a69fdab9b032617aa1399
F20101108_AABNOP crecca_c_Page_052.txt
7d934612ec9cf6e1f4cbcf2ac007eae5
5196736558d0986112a143a29fe1cb99784127ad
75952 F20101108_AABOLJ crecca_c_Page_185.jpg
00ac28dfa5e5bf4e37507326b432b938
6664def5d22a86aac52256cf4e931fa712183567
6675 F20101108_AABPID crecca_c_Page_033thm.jpg
9023d597ec0b277ec70180adfafcd2f0
c588b5b99016e12ff84ca7b492c532e6b9b504f8
11857 F20101108_AABOYV crecca_c_Page_100.pro
17aa98969f35ddfe1299590af4a98a58
14991df9ade678d0b6aa3c70be934262a9c3e7ca
27504 F20101108_AABNOQ crecca_c_Page_157.QC.jpg
5f911178dfe4845f7ef3c317558f5ff0
74613aad5cc5a6a8a50ec18951f6bfaceae49bcd
26682 F20101108_AABPIE crecca_c_Page_034.QC.jpg
246f50a090af3da07f9f2f84c49c21da
fed98dcc2f3ac459d76fb6ee6c7e08ff538a648e
5983 F20101108_AABOYW crecca_c_Page_101.pro
1193fb23467dad349ab7c01b72a73a39
6eb6701e3b428d4a390d75a8cfc83dad92ee3019
F20101108_AABNOR crecca_c_Page_179.jp2
8b7a134223020fc4890510be4c41b7ea
dc1d610ffa4fa41fd388cad856e12ad95ec275d1
45722 F20101108_AABOLK crecca_c_Page_186.jpg
1144fa838b971b5d73028b72eec76b1a
221724706d4131d2621ee0e0e58dbab2036f1b3b
6358 F20101108_AABPIF crecca_c_Page_034thm.jpg
2597b27f7caf3fb006575affca466dc7
bcda7367c44001ca2b547110235d90264c5ed94e
31850 F20101108_AABNOS crecca_c_Page_129.pro
eca84878cf1804908266116565067f8b
229d69ee60e8720c47c01a492eff96ed4fedbd21
74073 F20101108_AABOLL crecca_c_Page_188.jpg
9a403e0812e62aa9c8120e4d6254bbea
d3b5fc25e92892cc6626310ebbbe0a7cc4f3973d
19557 F20101108_AABOYX crecca_c_Page_102.pro
37508b9da374f8dacab4173d4ffa516b
67f518a836bea0131548dcb9a9cad0f6787636f6
53321 F20101108_AABNOT crecca_c_Page_171.pro
26f44a52d1ab9333c3705472d1a1b5fa
6d409afaebdf18c54e63078df42f8e468d4ef481
42706 F20101108_AABOLM crecca_c_Page_191.jpg
eb7b61ddf1e4edf2e0e0e8351d666d6a
d5cb7d8727e5e7cb072e554a0f1f84276dddf89f
6629 F20101108_AABPIG crecca_c_Page_036thm.jpg
53d95a2dc5ce554f666d839fd3bf554b
28d04c982953bf03dd1e16151a9eae3928f87013
45107 F20101108_AABOYY crecca_c_Page_103.pro
311a4221e3f18662a35f586ec73aabd8
8581ab3f7840e8dad9c13bc26e6558b3af85d0e6
307284 F20101108_AABNOU crecca_c_Page_169.jp2
08e5b1ce9d8dbec3d94c1c9dec536aad
0b958a3bc0b0b5edad7922fefad4c9edc02920b1
24346 F20101108_AABOLN crecca_c_Page_195.jpg
dab56e355fe1e9bcd247e6d8e3a7556f
f52cc5b3fb5f581a4e6dc71218e968b46850b018
25173 F20101108_AABPIH crecca_c_Page_037.QC.jpg
ba03638da3cddf61a728767610aedcfc
007c261afaaaf2b97d50b8426914e2139825d740
45504 F20101108_AABOYZ crecca_c_Page_104.pro
3944c672517dbd20c984740e5c5f4f5e
28fbbfe4d5ec22301eb353b8d227347ff523c1f3
49478 F20101108_AABNOV crecca_c_Page_095.pro
a1e74c7724b0b7dd33838160f912f3da
99b18054d5a42c7d7ca2b882a49a811cf5c03138
52825 F20101108_AABOLO crecca_c_Page_196.jpg
300c1465f4ed974433a26eb0d0902ab7
e3d1a9bc9de652c418f300e10fd76f5b7ce323f0
6259 F20101108_AABPII crecca_c_Page_037thm.jpg
bea0fe3ab92538295df9e01775036b6d
f1eea1a4614eacb132549f9118f0c6a3a6b69c9a
F20101108_AABNOW crecca_c_Page_116.jp2
47acb917e3c011f13393877dd876d1f6
eb85be3e90b5b9d9c44e469a4c287f192a147282
25929 F20101108_AABOLP crecca_c_Page_197.jpg
6db0e9a458fc817c5389895dc2fe3365
0010058e9541a2eb97ed5c85c2ce65d07b792de0
25597 F20101108_AABPIJ crecca_c_Page_038.QC.jpg
766e111d6e886fcb095d4f1c7406e499
933108bcaaef629bc9b19b1eabe3fe1c87e08cc2
12179 F20101108_AABNOX crecca_c_Page_164.QC.jpg
0c5d7e9418c3ad1045966a9bde3d347e
e0c514ccfe90a7146cd1d92799f41715b9cbcce8
72081 F20101108_AABOLQ crecca_c_Page_198.jpg
a4c2d2a35204ed24d0424902dd723c79
32d96fe13b5b623b77ff4e5c9353bfa29e95645d
28128 F20101108_AABPIK crecca_c_Page_039.QC.jpg
3908c92f6464cea6fd3acb2db4f4d26f
373156e97318cf368ac69af8c196e411c6d609d0
27381 F20101108_AABNOY crecca_c_Page_022.QC.jpg
e91ae070ae258ef74378812e72ef13b6
3034dd5c5af83b969e793259c4369faf50c465bb
19191 F20101108_AABOLR crecca_c_Page_199.jpg
73af68e66dd58d0d64c4beab47e95a46
9478d6f7d3ed8adf0a82425c285cef773758adf2
6634 F20101108_AABPIL crecca_c_Page_039thm.jpg
226734d62aaec7a07bb829588f5ef35e
d47b2381ba1c0337860457ddc52e696e4650bc7f
6375 F20101108_AABNOZ crecca_c_Page_040thm.jpg
c148a6b4fb193936c588658089aa7ec3
7cb3fd9cbaee9c53612516cc8ae7d9a681883ce9
25993 F20101108_AABOLS crecca_c_Page_200.jpg
b441c7d43086147ffb5ac66a248770f7
3afda1ae4a5c7c1c01f1edf9c9cb9e459bb55deb
25969 F20101108_AABPIM crecca_c_Page_040.QC.jpg
065d4c467d3fe8d8c8c967f4246d7df3
34cd29a7c0d9334068727b1a28d0fa78f9fa14d8
53018 F20101108_AABOLT crecca_c_Page_201.jpg
1baa7f12ea3f56f282d803f9f1fcd8a4
96eaa3a5477ed03469fc8c6d11ade77245e4a0b8
28064 F20101108_AABPIN crecca_c_Page_042.QC.jpg
3fea3a0056830decde031e080e832b9e
701ca94c379d45a667f9a4878613318cd629e20e
56449 F20101108_AABOLU crecca_c_Page_202.jpg
fa468f30fd459c4dec64ce5229b47625
49ce8b2c3dcba73616c13c63a1ee667b99f937c9
28795 F20101108_AABPIO crecca_c_Page_043.QC.jpg
efd9b1addd9fc89e3c63a40b4a5df04b
1e1f144dca2c9fb94a40f5f125f7f20fc0f166dd
54940 F20101108_AABOLV crecca_c_Page_203.jpg
bf24981a930c7ba1a6e3d9750a23cb2b
8049690f01d933a289e30d26c48e0fba9f271f73
6795 F20101108_AABPIP crecca_c_Page_043thm.jpg
fba479d285c98068344f368c26d4173c
54d8576ee4bfde3bc37330b8f49d60755b0d3639
55420 F20101108_AABOLW crecca_c_Page_204.jpg
b6e38c1b1abe24ca1418d3476139bb05
89ea676232ada827fa3086b6a44203ff5626f21c
25622 F20101108_AABPIQ crecca_c_Page_044.QC.jpg
faf31728e6d8a4ed590b4f45581a7dea
719d77d7ebbf1c1870b9f6c9b4058f64969c82fe
54931 F20101108_AABOLX crecca_c_Page_206.jpg
0296d652a844eb52eee10c5ae0417df9
5c1f63e36af983d92c4b130d050a716c617472f9
6255 F20101108_AABPIR crecca_c_Page_044thm.jpg
e3eee1568a561748bb49f89401da53ca
9ee798cc380f428a93a7e7c4ebc7b395f17158d1
55319 F20101108_AABOLY crecca_c_Page_207.jpg
62b567236a069ae8cd010d502d248106
a3e0507c5ce3b5f4cb4db595d709c27b438cc186
26943 F20101108_AABPIS crecca_c_Page_045.QC.jpg
e5b93218d69b1881887a9a6ff580f522
233526ce920b178ec928f6073bad94fc73220cdc
55687 F20101108_AABOLZ crecca_c_Page_208.jpg
31dfbf5e9733f9174e950759d8f119fc
b63bf756cc84a6efaa2804309fffa50284917deb
6700 F20101108_AABPIT crecca_c_Page_045thm.jpg
2a19f1a637dffb3ab1af0198577a503b
a79de5d04285ed0270cad9e8dd7dd5624ae498f2
28649 F20101108_AABPIU crecca_c_Page_046.QC.jpg
e91df5b13a2f4a0b4c9d748c38e2b425
7d661adeb81de8001a28129c821e2dd37acfb9eb
87021 F20101108_AABNUA crecca_c_Page_157.jpg
174e24fe490ae32910e36e8ecfd13ab4
4bb92f8be83921aac5325a7bf3dde0a1083b4edb
6677 F20101108_AABPIV crecca_c_Page_046thm.jpg
59f315497e39f27c8e294df388cc648b
350baa46e037dbddfa584216365f85e3d17172bd
2099 F20101108_AABNUB crecca_c_Page_054.txt
134c57b9d3d537b2856bf02e23e12b91
1c8f94442159713d16e9e3e3b8280e9439cb6674
25279 F20101108_AABPIW crecca_c_Page_047.QC.jpg
83dcfd7dff3c77cfdc4609df5b99471d
8ada49e4a2ff2291305ab5af6040cdeec3b52f01
87742 F20101108_AABNUC crecca_c_Page_105.jpg
0c66924ad22f40f4dc21c412092d78de
1eb5f94094e52df34b8f74921ab510b67c36e6b3
6062 F20101108_AABPIX crecca_c_Page_047thm.jpg
72ba28583948f49ee9db2011be35b1ad
09e2ef0b4b9a1d881845dbe8f2f0480d81b63a48
F20101108_AABNUD crecca_c_Page_012.jp2
efffd8266c990228190927e302598a5f
6e89b5a428ccfdf480fe9784d1239d98db236e9a
5123 F20101108_AABPIY crecca_c_Page_048thm.jpg
22cbf1b2e692353925e26ec9915e5a5f
8ad711d1ea99870dfcf5f7d3c03a61e75259cfcc
19540 F20101108_AABNUE crecca_c_Page_141.pro
7f8b888fade950618d98e274c6c67527
4a0657b8b5acd662e3eacc8789cba887e02420e6
27809 F20101108_AABPIZ crecca_c_Page_049.QC.jpg
7ab40f308e270e488fa2bbb3136e0b4a
6b34e27e804ce5e2c725e312f311dd87b837b98e
563571 F20101108_AABNUF crecca_c_Page_204.jp2
a10146026833def9df49ff54887c728a
33046a5922534b2e13af7c33ee2a351acc42b10c
89560 F20101108_AABNUG crecca_c_Page_022.jpg
a77d7bf9e512d3d040c681ff74824d11
aca8983d624da30c3c97b0e4c622aee846c3242b
27892 F20101108_AABNUH crecca_c_Page_143.QC.jpg
0a6bf18eb425c76c3fc7ef6b763fa506
7477964c5b4d5ffea521cca0732cc4df014f92a0
1051968 F20101108_AABORA crecca_c_Page_215.jp2
942f75fc66f22e9777c9aa775176f0f8
3da1b33e72ff28568eac6d447b97e4740b1382dd
54825 F20101108_AABNUI crecca_c_Page_025.pro
70971fadeec334df730c55e5a1636ba1
8cb8c8c41e1434700a6d8c6d409236e0a3455ba5
F20101108_AABORB crecca_c_Page_216.jp2
75786aee17f348efcca1d0ef5c053e1e
d331b6326b127906df61c1fa5345296650d10893
6711 F20101108_AABNUJ crecca_c_Page_148thm.jpg
13297a452d1f4b880b0bba7ddea7a3fe
d1feb39c789e95147a36dbddcf768622d611cc0a
F20101108_AABORC crecca_c_Page_218.jp2
cf7347fc03fdf41a0b44db400ce23556
3978794de3825cca47a9785a578169a9ff7c7d11
55353 F20101108_AABNUK crecca_c_Page_149.pro
ab5799837bd05a70eb922c12321d8a72
28513ebbb0f9b6a26f1e38a82f2cf078c2cbe7a8
F20101108_AABORD crecca_c_Page_219.jp2
ea0133f0820f84561e5ab990e3564324
2af1c8e4d0655dabca04b395e3566a590663ec0c
F20101108_AABNUL crecca_c_Page_026.txt
66e76a0b9783436f7d9fe83109f6b1a4
99d0b952c17a3d57ff9cb71434c922b6faad856b
F20101108_AABORE crecca_c_Page_220.jp2
5b1bffaa0a8d97071b59cacf91d33669
e72a124653efd95c80b7e50327da89a6d13718c0
6608 F20101108_AABNUM crecca_c_Page_012thm.jpg
577cbd6aa5b5c985146ec3ef111c03bb
f8ba70adb4ad12e6553324b6dbd8e342d08ce8a6
F20101108_AABORF crecca_c_Page_221.jp2
a967b0fb3c835665049ec9cf4fe45a47
2740de89986e6732434a6c1e48eaf17f15b77a3f
84152 F20101108_AABNUN crecca_c_Page_108.jpg
35736c6da19211ec91b0ea3b614a1657
53ba555e3cd535db1f8778f6df536ee684db9c43
1051930 F20101108_AABORG crecca_c_Page_223.jp2
c33e429d04e6c5a4f9c84bab1e783dd9
d09c1c932eae3bb81365e48ad415d332db8d687e
90325 F20101108_AABNUO crecca_c_Page_161.jpg
380f433cbb3f32552d5b47292cc5aba6
a7d61bea8037c03ba46593f250b3778259e06ba9
F20101108_AABORH crecca_c_Page_224.jp2
1df9e62d45e203382f0637bcaad9cfe5
09afbcf908fbcfcc4b8993c185e12c7787acc9b3
22002 F20101108_AABPOA crecca_c_Page_145.QC.jpg
d8c8b89cdf5e27bcda72ff4ea5d24bdb
5ecf4950f940a5aad14d3b4e45610e5734b83aed
F20101108_AABNUP crecca_c_Page_035.tif
218d21b1db7cca384fa7404f9604d20f
cd69a79f7852bcc20a76d49350e6e2a9449b7c46
767344 F20101108_AABORI crecca_c_Page_227.jp2
88d106a556c5ee8f32bec0c563886c02
564dee128ad514c44e083249d5f928f39c10c9ef
5228 F20101108_AABPOB crecca_c_Page_145thm.jpg
91231605c3d425acc0a8e1064b76f4d9
71c766680b6fd47535bcfb3e4ed9e825f0523db0
2659 F20101108_AABNUQ crecca_c_Page_217.txt
661c8a72526f0afeac290c7a7044dc39
9fc0d1dd1b483998b558f8b547382bc85b9c6777
F20101108_AABORJ crecca_c_Page_003.tif
a10d161ac98e613f4685be009bda4724
b4d5f3fe9c79352d79b533c27d13b1f948faf770
27263 F20101108_AABPOC crecca_c_Page_147.QC.jpg
ee19736ae08aa244a646f97ec6cc6d47
454c3956bb3efda91c7793995cdc0fa2375dc5a3
804281 F20101108_AABNUR crecca_c_Page_187.jp2
53f0c9351d849233b4cd4d56415b30a9
b01d0a635eb07aca03215294bf8954b9988f610a
F20101108_AABORK crecca_c_Page_005.tif
441ba24e69b33067489413116616862d
215ec0199cdd4c300a54256d3277520b9910ee77
5199 F20101108_AABPOD crecca_c_Page_150thm.jpg
43960b8496f29b59a8cfe1c1df90509c
cd2a7e59c3d8eee500e502053da35b86265a9a1e
2715 F20101108_AABOEA crecca_c_Page_086thm.jpg
9a8eb047886eced9b7268a58a801f24d
6d55d8ae63abe2db934bfd8b528af25dc7c44de4
53609 F20101108_AABNUS crecca_c_Page_045.pro
3b8b9edd69f8c30215394df53d096c97
0c658cc0d6f7363463affd6a524cffaf22664c22
F20101108_AABORL crecca_c_Page_006.tif
06cce37ca8864796779a0c2f07e2f4ab
1ba145981b62c15c4aba6db140ab4faf86ddad59
11077 F20101108_AABPOE crecca_c_Page_151.QC.jpg
26f2036ba6a45d3e1081b89896aa4fab
3f0c82c4515f8b271f7f6b9b58c079ab77b3788b
F20101108_AABOEB crecca_c_Page_064.txt
52a0626000368c07037a9aa1f5a5fc79
8c189a895d0655e46de1f2d59bb7676d4ee085c2
3920819 F20101108_AABNUT crecca_c.pdf
e125a1e8c2b08d0958fa8fb5e4043838
e21ed644b51916ef68f5655341385820392fef05
F20101108_AABORM crecca_c_Page_007.tif
acdafdece9f754928e2758bea29f780e
650e321912757f8ae452835bc6d78fde17180eac
10405 F20101108_AABPOF crecca_c_Page_152.QC.jpg
4a52ce8bb82f481247e21be034e464af
c9869ba9bf91db5ba7429567d0ad1278d86845e2
9919 F20101108_AABOEC crecca_c_Page_127.QC.jpg
d57e1950bc2884b091a76c3b7d1f742a
a42de43cae7ebcc17ada3981a1415cbc26295b56
F20101108_AABORN crecca_c_Page_009.tif
6bc8d9431cc97a2e150398aeb32fd77c
5924e41bae447dcd728aeee53550a866d895a3de
3454 F20101108_AABPOG crecca_c_Page_152thm.jpg
e43865fadcc11d9461c6eaf58abaff1a
8297d234cd827501e0f30e1ee9976e3168d94a38
71731 F20101108_AABNUU crecca_c_Page_055.jpg
8d5bb3bf7cde7843f08da183b01e0b4c
e2264bec0d67e5fc3cf97041b10114b5e741d434
F20101108_AABORO crecca_c_Page_012.tif
8bd22760bf908fbb456bbb6e068cc8b3
fbdc22b6ada24f5915f906e8ca801a54c4110c79
2811 F20101108_AABPOH crecca_c_Page_154thm.jpg
76f65efd3bee1328175559e2a4c26523
bc16e590d1361dd7e77fca9a204be02926f1a0c8
782220 F20101108_AABOED crecca_c_Page_068.jp2
07917c0a78bff4ffe7f2dbcde1bf1bef
2f914a83c81488eccd5af4ba0dfc993f52dd4fb7
55132 F20101108_AABNUV crecca_c_Page_123.pro
caaba4e5ca7dd7d51bbd56bff01428c7
78ce91ce321c0d44a0f2e1624b119f0b04f22b5e
F20101108_AABORP crecca_c_Page_013.tif
f0063470785790c138abf84715123d68
ad74ab23fc5828514a96d6a602f5fc082c8ba11b
26683 F20101108_AABPOI crecca_c_Page_155.QC.jpg
fd6e3ce98dfe773ecf79a30f9dc6c232
9faa2fdd8e36d1ba94dbff68997c213ad58d9e87
F20101108_AABOEE crecca_c_Page_056.jp2
05b3f940984676b65a3ccb0307c33aad
4d47ff39b9b4bbc5bbad0be1bb02d28c3d17397d
33293 F20101108_AABNUW crecca_c_Page_227.pro
f1d45318e28708709c29b63ab4e0360d
f1dde67ab60232df806f96957f97a63107840afa
6471 F20101108_AABPOJ crecca_c_Page_155thm.jpg
63e8be93293cc89c8888ea8e678d714d
5cddf8779a7e028b886ce9db5ba6397af9f40dac
F20101108_AABOEF crecca_c_Page_078.jp2
82cadba4a44ed67e485d79c4e8819c5f
8017c61eff05761c3d1e7c4a23d51a50128d5046
106671 F20101108_AABNUX crecca_c_Page_223.jpg
a1db81372703fb38df5759c37cb56711
4f912404338015f9bbad6605f3fed09518afc50b
F20101108_AABORQ crecca_c_Page_015.tif
147c612760530390d271c6053805a9f3
73b3b5f6a5c3d37294466bfefc17a0d2d3f93c19
28255 F20101108_AABPOK crecca_c_Page_156.QC.jpg
57bec3553f1ef2a7867f645161506da1
30f3d0d57963b35221fac1270d7ef1566292575e
89072 F20101108_AABOEG crecca_c_Page_041.jpg
e85ba72377beec9465fdd9f73435d080
8e3e0ce34a87c64ffbe4413651299d1dab71ca8a
F20101108_AABNUY crecca_c_Page_010.tif
d5709ce611a31ca0d581bdb4079de31f
ba73fac6a3e85b142f9d178a93053b12e4763303
F20101108_AABORR crecca_c_Page_016.tif
ad0a356d6d8d74072da03e0b0d365992
8e8c3c5885cb616d75781c133f8867ca54263663
6642 F20101108_AABPOL crecca_c_Page_156thm.jpg
1f701044c1b0df7922b55faab74a458b
a0bb8fc67af3d5267040dca60d77528832c225c9
27691 F20101108_AABOEH crecca_c_Page_028.QC.jpg
0b3d73f9f90e9b2432b3ce080cf41a98
9fa96878afecadef4e31e4e89d79f2d9d9996926
1671 F20101108_AABNUZ crecca_c_Page_129.txt
ef7215ea757d77709990d13bc62fbf62
cb659d644829b33332ac293d4b0b8d271114aae0
30889 F20101108_AABPBA crecca_c_Page_201.pro
f4fe91cddf8ea89ef8fc1e08342af4ec
9a084f20efdaab3253ca2ce5d5f4130c9f74803d
F20101108_AABORS crecca_c_Page_018.tif
8921deaed38a9bef9d621e057fa54153
f10e9d33d03eae35b97c17551879caf791190416
47038 F20101108_AABOEI crecca_c_Page_110.pro
b016fbbefe70340d568e39f4661c3718
5369730a2e4cef2e5ce04b26ce5733eba0a439f0
18482 F20101108_AABPBB crecca_c_Page_203.pro
13bf1dcf65cbe0c5579eb6c6e7d1f30e
70efe4755ba0d946b0cda22e0f8624da13240651
F20101108_AABORT crecca_c_Page_019.tif
8dd872d08e5fb28b1de04747c82bc4cc
fe415bf6e1489e384806796b28190fc3c04aaacb
6555 F20101108_AABPOM crecca_c_Page_157thm.jpg
f0900e44899b82ccfbc7081459ff272b
20ea076f0989eea7ad1c2a1477c45d456dec51fc
425 F20101108_AABOEJ crecca_c_Page_003thm.jpg
8478c9612e4da35b5173b40a3fa595d0
ee46c6b96932e3d63b3e267a03a43c1601b770e6
18638 F20101108_AABPBC crecca_c_Page_204.pro
c15a36675b701dce008463f323b72fce
0c68e99d631968fad164ddb3d333809ff1f9ee86
F20101108_AABORU crecca_c_Page_020.tif
f6fa6cc6051c20260fb15b17eaf2f60f
f4cc4817508d48c1f04595a8202921665b3a3356
27129 F20101108_AABPON crecca_c_Page_158.QC.jpg
803bd874e518be6e4ef1a07af8cd4421
f5a5b60c87561ed6503db11334836e41ac760762
1940 F20101108_AABOEK crecca_c_Page_001thm.jpg
0c40535d2f7fdbc6c325203448755830
ad1bcd0e40780e6a288ecdfbf585dcc147744156
18739 F20101108_AABPBD crecca_c_Page_205.pro
83dd264d75a70a1c2b9f3e883c3c2f5d
c63b9e2c6437a8d7a80aa098bebc51dd0437f87a
F20101108_AABORV crecca_c_Page_021.tif
4bdc82fb3f71314dda4c70504ed8f130
63c1371bc801bef7cf899db044f4efee36a8d53c
28239 F20101108_AABPOO crecca_c_Page_159.QC.jpg
ecf1763fbcdc70fa5f2e363acf6b773c
6ddac2d04c2bc6bdf0491d865845e2124109fa53
1051942 F20101108_AABOEL crecca_c_Page_114.jp2
4c0fc7db895e60884d5648533003f45c
f0f1e81a8f62341f2472ad4ffb190786842db672
18340 F20101108_AABPBE crecca_c_Page_206.pro
77ed47e09282b12c8b728451a6d9a7e5
ba75674d0e6c05979d6ad877e796df8a7f048bd5
F20101108_AABORW crecca_c_Page_023.tif
0231e130cb3dfca97dca441394299777
a143ac4e632c47ecace5fb09cc75f969eec6f523
6418 F20101108_AABPOP crecca_c_Page_159thm.jpg
e8bdd7226b0e894d505f29fbfe1ef6c0
3c789dd4cba015c1528d8036862836fdade0d4bf
F20101108_AABOEM crecca_c_Page_040.tif
d37771cd409844ce7375bbb0f853d9cc
fa094b8d8e8adba6ba1c744f2e75c3da440c153c
18964 F20101108_AABPBF crecca_c_Page_207.pro
766a5e4a112e1062e248d9d8f86acd10
fe40d0a41f8744aa2d5eb8a562120f37117d4973
F20101108_AABORX crecca_c_Page_024.tif
8eb84f960b7745d1b7604b55a922a375
3a7631a194d343c2e08f1f76c533c3e4e1ad7d69
26867 F20101108_AABPOQ crecca_c_Page_160.QC.jpg
ebf4c22000d8e6cab97470cde16b85e6
32ad9e7d84bacc7c9834078f6bd0777dacd12bc9
16414 F20101108_AABOEN crecca_c_Page_201.QC.jpg
d5e2b8c7331a46f386256db9edc188c2
1b9a73190c69950fbdfc06b8e9190586f69278a0
57949 F20101108_AABPBG crecca_c_Page_211.pro
39bd0a1ad8c3f7658b4d8921cf6b4f75
326a748cdbdb0bf8115fcd4413b3713cda5d3d03
F20101108_AABORY crecca_c_Page_025.tif
38f122e9b4991069d43a1208bbc46e18
fe830cd95520e2cd679c33bd947274e10c6362eb
F20101108_AABPOR crecca_c_Page_160thm.jpg
48353052954550a43417c9b9b3834e33
6466b906de2ae97f084d60c68ea9450917e54ec4
F20101108_AABOEO crecca_c_Page_083.tif
38e4b3394e2b9fb5ecb9642b1d521085
136a1f63b1f4d949ecf23a51c2db3eb07a4e830e
67349 F20101108_AABPBH crecca_c_Page_214.pro
14f386f61395081e25635ce18b663a30
0ca29f2f1c33a2d637988d400fbc2a80cd4f436d
F20101108_AABORZ crecca_c_Page_026.tif
efd54bc24fee0804b1a21253d7f891c2
820430e0986e3ed60b772c24507bdc9118e41036
29174 F20101108_AABPOS crecca_c_Page_161.QC.jpg
52ffdc5dc6f534ed7726fcb02723b9fc
a4912bb1ce76a227c40a074c538ddaba0895d584
1051972 F20101108_AABOEP crecca_c_Page_211.jp2
8ed00f027dc645156dbe5dcc0a11ccef
3b94e802aec9fcbf0d412eb0fbe8e5d9ef44f428
65473 F20101108_AABPBI crecca_c_Page_215.pro
431630257b4d7bf3c57a608da314041f
a1c9e0bae231d92fd0219c06cd5c87d04b8d101a
6651 F20101108_AABPOT crecca_c_Page_161thm.jpg
593cfd99a8afd53cd313f484ed79760c
97e137e3e675d33a8233972aaa20beb80973c7c2
F20101108_AABOEQ crecca_c_Page_044.txt
e12e941a8fc680d574366aa3ad18e7bd
99edd1af14aa502985706157a24f6cb7c29f6983
63370 F20101108_AABPBJ crecca_c_Page_216.pro
1346c657231aafacb47cebe697d9c079
e35957234126c3f5ada2299327c5332177fb76aa
20780 F20101108_AABPOU crecca_c_Page_162.QC.jpg
86c17006a1b1d10c32de934a510bbee9
2dc3cbcad5e977ca985a8088d13d22891f679669
F20101108_AABOER crecca_c_Page_176.jp2
782e7c76ae0ab00d3634d23a52d97a6a
52d01b245e34e1ecc37bbd8f4acdc999259b3945
65072 F20101108_AABPBK crecca_c_Page_217.pro
592c2e226770d24242298b9e80daeafe
f9a1f10a930a06cc372b293e416dc54bcff8e5d6
4935 F20101108_AABPOV crecca_c_Page_162thm.jpg
2208feadf9852519086b0484ba8d242a
a639a8ab7bef485fd1336ed49476ecce54cb52ee
444 F20101108_AABOES crecca_c_Page_132.txt
21a12c27e72b85762cee0680888ba90d
a1a704d778e8f6f00a659b00b2399d6b8b255081
66341 F20101108_AABPBL crecca_c_Page_220.pro
934dd6949f2c177aafee3d6adb251823
d05663e204b7d661e58770d8c5d3b2aa6e04fe45
11752 F20101108_AABPOW crecca_c_Page_163.QC.jpg
f3f2680f180e04166617be202782910c
e400d123254f16342cee1c5efcbf7af198c4fba6
15738 F20101108_AABOET crecca_c_Page_135.QC.jpg
3a7466c4cdf8e401e3a3acd2ba598a84
a0ed389e4931cd1e418ef0d2d8cfb523109067e5
66377 F20101108_AABPBM crecca_c_Page_221.pro
6071f1c94dc539e5228ab9ee4a92ee63
51dd602457f1971527e0c360e897f5c885ef9a86
3979 F20101108_AABPOX crecca_c_Page_164thm.jpg
870059a0a0f5e5dc2a2e4f2b51119ad4
543082d5de78678bb840bfc60a3db4712d177d1d
34172 F20101108_AABOEU crecca_c_Page_151.jpg
4fe3a88869ebf4b0b414ef244aaa6f1e
79556dc7fd0941739e1dd8011165951d6122cd27
66583 F20101108_AABPBN crecca_c_Page_222.pro
de4a7212a213ee6d8d11b737b6122241
b9dfffe74cd6786587039442a1e7a806f38a5ade
9230 F20101108_AABPOY crecca_c_Page_165thm.jpg
209b780478c02640837841e2d9f32077
519d77e44261dccdf249fa315f4494ffecc2d72a
F20101108_AABOEV crecca_c_Page_111.jp2
1aae5d6f00438273580271252f3eae76
8e481f5bdf21c8fc21176446556a438cb26c0b4d
65659 F20101108_AABPBO crecca_c_Page_223.pro
13ae4410257029188b0c9e1eeeb76ae1
992084d5f7655f6dd0a12feb6986b5867dcd349a
13698 F20101108_AABPOZ crecca_c_Page_166.QC.jpg
145601dce32e2adc936771bfdba30b32
3f29ddc5ac74da580497ce6af9248fdf040127b1
F20101108_AABOEW crecca_c_Page_113.tif
48618ef6fff40273058eec7d5db1a795
e8a78cfcf6af055ec9d6101f688c8737ef2a0d74
61536 F20101108_AABPBP crecca_c_Page_224.pro
9b5c0ae20f7f0e1da8d6f79518e55bdf
6abe7cfee7aa84f90db378abb669922d3cf82673
28614 F20101108_AABOEX crecca_c_Page_076.jpg
f25c1149c52d07c35a544a07536de807
bf822ce79fe8157a8a445051b1233960f75d6123
61489 F20101108_AABOXA crecca_c_Page_009.pro
1130dfbc566c3dd91b5317c8caf38135
4c4e6d7ecb4570de17f03ff71d8688708e9e7a3e
71019 F20101108_AABPBQ crecca_c_Page_225.pro
000b51d8279531cd9faa0bf728c6057c
09e082178bfb066cad8061c4eb7cef369ea5a28e
F20101108_AABOEY crecca_c_Page_166.tif
576ce9fbf29fb450f6a35335e4113ea6