RDKit GetSSSR Non-determinism Fix#2920
RDKit GetSSSR Non-determinism Fix#2920JacksonBurns wants to merge 5 commits intoReactionMechanismGenerator:mainfrom
GetSSSR Non-determinism Fix#2920Conversation
Goal of this PR is to resolve: ReactionMechanismGenerator#2912 The RDKit `GetSSSR` algorithm _is_ deterministic for the same input (unlike the previous RMG version) __but__ the process of converting the RMG molecule into RDKit does _not_ guarantee the same ordering of atoms, which results in non-determinism. This PR forces the RDKit molecule to have a consistent ordering, and should (hopefully) give the same results each time. To test this, we just need to run the regression tests multiple times on the same reference results (so, on the same day, without any changes to `main`). The results should be different from the `main` results, but in the same way.
Regression Testing Results
Detailed regression test results.Regression test aromatics:Reference: Execution time (DD:HH:MM:SS): 00:00:00:54 aromatics Passed Core Comparison ✅Original model has 15 species. aromatics Failed Edge Comparison ❌Original model has 106 species. Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(Cs-(Cds-Cds)CsCsH) + group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cs-CsCsHH) + group(Cds-CdsCsCs) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + Estimated bicyclic component: polycyclic(s2_3_5_ane) - ring(Cyclopropane) - ring(Cyclopentane) + ring(Cyclopentene) + ring(Cyclopropane) + polycyclic(s2_3_6_ene_1) + polycyclic(s3_5_6_diene_1_5) - ring(Cyclopropane) - ring(Cyclopentene) - ring(Cyclohexene) + radical(cyclopentene-4) Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cs-(Cds-Cds)CsHH) + group(Cds-Cds(Cds-Cds)(Cds-Cds)) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-Cds(Cds-Cds)H) + group(Cds-Cds(Cds-Cds)H) + Estimated bicyclic component: polycyclic(s4_6_6_ane) - ring(Cyclohexane) - ring(Cyclohexane) + ring(1,3-Cyclohexadiene) + ring(1,3-Cyclohexadiene) Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cds-Cds(Cds-Cds)(Cds-Cds)) + group(Cds- CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-Cds(Cds-Cds)H) + group(Cds-Cds(Cds-Cds)H) + group(Cds-CdsHH) + Estimated bicyclic component: polycyclic(s4_6_6_ane) - ring(Cyclohexane) - ring(Cyclohexane) + ring(1,3-Cyclohexadiene) + ring(1,3-Cyclohexadiene) + radical(Cds_P) Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cds-Cds(Cds-Cds)(Cds-Cds)) + group(Cds- CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-Cds(Cds-Cds)H) + group(Cds-Cds(Cds-Cds)H) + group(Cds-CdsHH) + Estimated bicyclic component: polycyclic(s4_6_6_ane) - ring(Cyclohexane) - ring(Cyclohexane) + ring(1,3-Cyclohexadiene) + ring(1,3-Cyclohexadiene) Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: DetailsObservables Test Case: Aromatics Comparison✅ All Observables varied by less than 0.500 on average between old model and new model in all conditions! aromatics Passed Observable Testing ✅Regression test liquid_oxidation:Reference: Execution time (DD:HH:MM:SS): 00:00:01:59 liquid_oxidation Passed Core Comparison ✅Original model has 37 species. liquid_oxidation Passed Edge Comparison ✅Original model has 214 species. DetailsObservables Test Case: liquid_oxidation Comparison✅ All Observables varied by less than 0.100 on average between old model and new model in all conditions! liquid_oxidation Passed Observable Testing ✅Regression test nitrogen:Reference: Execution time (DD:HH:MM:SS): 00:00:01:03 nitrogen Passed Core Comparison ✅Original model has 41 species. nitrogen Failed Edge Comparison ❌Original model has 133 species. Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(O2s-CdN3d) + group(N3d-OCd) + group(Cd-HN3dO) + ring(oxirene) + radical(CdJ-NdO) Non-identical kinetics! ❌
kinetics: DetailsObservables Test Case: NC Comparison✅ All Observables varied by less than 0.200 on average between old model and new model in all conditions! nitrogen Passed Observable Testing ✅Regression test oxidation:Reference: Execution time (DD:HH:MM:SS): 00:00:01:45 oxidation Passed Core Comparison ✅Original model has 59 species. oxidation Passed Edge Comparison ✅Original model has 230 species. DetailsObservables Test Case: Oxidation Comparison✅ All Observables varied by less than 0.500 on average between old model and new model in all conditions! oxidation Passed Observable Testing ✅Errors occurred during observable testing
WARNING:root:Initial mole fractions do not sum to one; normalizing.
|
|
Ok, the numbers generated by the code in this PR here: #2920 (comment) Should hopefully match those from the next comment from the bot about the regression tests... hopefully |
Regression Testing Results
Detailed regression test results.Regression test aromatics:Reference: Execution time (DD:HH:MM:SS): 00:00:00:54 aromatics Passed Core Comparison ✅Original model has 15 species. aromatics Failed Edge Comparison ❌Original model has 106 species. Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(Cs-(Cds-Cds)CsCsH) + group(Cs-(Cds-Cds)CsCsH) + group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cs-(Cds-Cds)CsHH) + group(Cds-CdsCsCs) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + polycyclic(s3_4_5_ene_3) + polycyclic(s2_4_5_diene_1_5) + polycyclic(s2_5_5_diene_1_5) - ring(Cyclobutene) - ring(Cyclopentene) - ring(Cyclopentene) + radical(cyclopentene-allyl) Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(Cs-(Cds-Cds)(Cds-Cds)CsCs) + group(Cs-(Cds-Cds)CsCsH) + group(Cs-(Cds-Cds)CsCsH) + group(Cs-CsCsHH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + polycyclic(s2_4_4_ene_1) + polycyclic(s2_4_5_diene_1_5) + polycyclic(s3_4_5_ene_1) - ring(Cyclobutene) - ring(Cyclobutane) - ring(Cyclopentene) + radical(bicyclo[2.1.1]hex-2-ene-C5) Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(Cs-CsCsCsH) + group(Cs-(Cds-Cds)CsCsH) + group(Cs-(Cds-Cds)CsCsH) + group(Cs-(Cds-Cds)CsHH) + group(Cds- Cds(Cds-Cds)Cs) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-Cds(Cds-Cds)H) + polycyclic(s2_3_5_ene_1) + polycyclic(s2_3_6_ene_1) + Estimated bicyclic component: polycyclic(s3_5_6_ane) - ring(Cyclopentane) - ring(Cyclohexane) + ring(Cyclopentene) + ring(Cyclohexene) - ring(Cyclopropane) - ring(Cyclopentene) - ring(Cyclohexene) + radical(cyclopentene-allyl) Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(Cs-(Cds-Cds)CsCsH) + group(Cs-(Cds-Cds)CsCsH) + group(Cs-(Cds-Cds)CsCsH) + group(Cs-CsCsHH) + group(Cds- Cds(Cds-Cds)Cs) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-Cds(Cds-Cds)H) + polycyclic(s2_4_4_ene_1) + polycyclic(s3_4_6_ene_1) + polycyclic(s3_4_6_diene_1_5) - ring(Cyclobutane) - ring(Cyclobutene) - ring(Cyclohexene) + radical(cyclobutane) Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cs-(Cds-Cds)CsHH) + group(Cds-Cds(Cds-Cds)(Cds-Cds)) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-Cds(Cds-Cds)H) + group(Cds-Cds(Cds-Cds)H) + Estimated bicyclic component: polycyclic(s4_6_6_ane) - ring(Cyclohexane) - ring(Cyclohexane) + ring(1,3-Cyclohexadiene) + ring(1,3-Cyclohexadiene) Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cds-Cds(Cds-Cds)(Cds-Cds)) + group(Cds- CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-Cds(Cds-Cds)H) + group(Cds-Cds(Cds-Cds)H) + group(Cds-CdsHH) + Estimated bicyclic component: polycyclic(s4_6_6_ane) - ring(Cyclohexane) - ring(Cyclohexane) + ring(1,3-Cyclohexadiene) + ring(1,3-Cyclohexadiene) + radical(Cds_P) Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cds-Cds(Cds-Cds)(Cds-Cds)) + group(Cds- CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-Cds(Cds-Cds)H) + group(Cds-Cds(Cds-Cds)H) + group(Cds-CdsHH) + Estimated bicyclic component: polycyclic(s4_6_6_ane) - ring(Cyclohexane) - ring(Cyclohexane) + ring(1,3-Cyclohexadiene) + ring(1,3-Cyclohexadiene) Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: DetailsObservables Test Case: Aromatics Comparison✅ All Observables varied by less than 0.500 on average between old model and new model in all conditions! aromatics Passed Observable Testing ✅Regression test liquid_oxidation:Reference: Execution time (DD:HH:MM:SS): 00:00:01:59 liquid_oxidation Passed Core Comparison ✅Original model has 37 species. liquid_oxidation Passed Edge Comparison ✅Original model has 214 species. DetailsObservables Test Case: liquid_oxidation Comparison✅ All Observables varied by less than 0.100 on average between old model and new model in all conditions! liquid_oxidation Passed Observable Testing ✅Regression test nitrogen:Reference: Execution time (DD:HH:MM:SS): 00:00:01:03 nitrogen Failed Core Comparison ❌Original model has 41 species. nitrogen Failed Edge Comparison ❌Original model has 133 species. Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(O2s-CdN3d) + group(N3d-OCd) + group(Cd-HN3dO) + ring(oxirene) + radical(CdJ-NdO) Non-identical kinetics! ❌
kinetics: DetailsObservables Test Case: NC Comparison✅ All Observables varied by less than 0.200 on average between old model and new model in all conditions! nitrogen Passed Observable Testing ✅Regression test oxidation:Reference: Execution time (DD:HH:MM:SS): 00:00:01:45 oxidation Passed Core Comparison ✅Original model has 59 species. oxidation Passed Edge Comparison ✅Original model has 230 species. DetailsObservables Test Case: Oxidation Comparison✅ All Observables varied by less than 0.500 on average between old model and new model in all conditions! oxidation Passed Observable Testing ✅Errors occurred during observable testing
WARNING:root:Initial mole fractions do not sum to one; normalizing.
|
Goal of this PR is to resolve: #2912
The RDKit
GetSSSRalgorithm is deterministic for the same input (unlike the previous RMG version) but the process of converting the RMG molecule into RDKit does not guarantee the same ordering of atoms, which results in non-determinism. This PR forces the RDKit molecule to have a consistent ordering, and should (hopefully) give the same results each time.Easiest way to test this: just need to run the regression tests multiple times on the same reference results (so, on the same day, without any changes to
main). The results should be different from themainresults, but in the same way.