| 1 | +# Isotope Pattern Analyzer |
| 2 | + |
| 3 | +Generate theoretical isotope distributions for molecular formulas, score observed |
| 4 | +isotope patterns using cosine similarity, and detect halogenation (Cl/Br). |
| 5 | + |
| 6 | +This tool consolidates `isotope_pattern_matcher`, `isotope_pattern_scorer`, and |
| 7 | +`isotope_pattern_fit_scorer` into a single, improved utility. |
| 8 | + |
| 9 | +## Features |
| 10 | + |
| 11 | +- Theoretical isotope pattern generation via pyopenms `CoarseIsotopePatternGenerator` |
| 12 | +- Cosine similarity scoring between observed and theoretical patterns |
| 13 | +- **Da or ppm m/z tolerance** — choose your preferred unit |
| 14 | +- Halogen (Cl/Br) detection from M+2 peak enhancement |
| 15 | +- JSON output with per-peak detail |
| 16 | +- Terminal bar-chart preview of the theoretical distribution |
| 17 | +- Optional numpy acceleration for cosine computation |
| 18 | + |
| 19 | +## Installation |
| 20 | + |
| 21 | +```bash |
| 22 | +pip install pyopenms |
| 23 | +``` |
| 24 | + |
| 25 | +## CLI Usage |
| 26 | + |
| 27 | +```bash |
| 28 | +# Generate and display the isotope pattern for glucose |
| 29 | +python isotope_pattern_analyzer.py --formula C6H12O6 |
| 30 | + |
| 31 | +# Score observed peaks against the formula (colon-separated format) |
| 32 | +python isotope_pattern_analyzer.py --formula C6H12O6 \ |
| 33 | + --observed "180.063:100,181.067:6.5,182.070:0.5" \ |
| 34 | + --output result.json |
| 35 | + |
| 36 | +# Use legacy comma-separated format (one --peaks flag per peak) |
| 37 | +python isotope_pattern_analyzer.py --formula C6H12O6 \ |
| 38 | + --peaks 180.063,100.0 --peaks 181.067,6.5 \ |
| 39 | + --output result.json |
| 40 | + |
| 41 | +# Use ppm tolerance |
| 42 | +python isotope_pattern_analyzer.py --formula C6H12O6 \ |
| 43 | + --observed "180.063:100,181.067:6.5" \ |
| 44 | + --tolerance 10 --tolerance-unit ppm |
| 45 | + |
| 46 | +# Detect halogenation (chlorinated compound example) |
| 47 | +python isotope_pattern_analyzer.py --formula C6H5Cl \ |
| 48 | + --observed "112.007:100,113.011:5.5,114.004:33.0" \ |
| 49 | + --output halogen_result.json |
| 50 | +``` |
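
The scoring step behind the `--observed` and `--tolerance-unit` examples can be sketched roughly as follows. This is a minimal pure-Python illustration, not the tool's actual implementation: the function names (`within_tolerance`, `match_peaks`, `cosine_similarity`) are hypothetical, and it assumes that a ppm tolerance is taken relative to the theoretical m/z and that an unmatched theoretical peak counts as zero observed intensity.

```python
import math


def within_tolerance(obs_mz, theo_mz, tolerance, unit="da"):
    """Return True if obs_mz falls inside the tolerance window around theo_mz."""
    if unit == "ppm":
        # Assumption: ppm is relative to the theoretical m/z.
        window = theo_mz * tolerance * 1e-6
    else:
        window = tolerance
    return abs(obs_mz - theo_mz) <= window


def match_peaks(observed, theoretical, tolerance, unit="da"):
    """Pair each theoretical peak with the closest observed peak in tolerance.

    Both inputs are lists of (mz, intensity) tuples. A theoretical peak with
    no observed partner is paired with an observed intensity of 0.0.
    """
    pairs = []
    for theo_mz, theo_int in theoretical:
        candidates = [
            (abs(mz - theo_mz), inten)
            for mz, inten in observed
            if within_tolerance(mz, theo_mz, tolerance, unit)
        ]
        obs_int = min(candidates)[1] if candidates else 0.0
        pairs.append((obs_int, theo_int))
    return pairs


def cosine_similarity(pairs):
    """Cosine similarity between observed and theoretical intensity vectors."""
    dot = sum(o * t for o, t in pairs)
    norm_obs = math.sqrt(sum(o * o for o, _ in pairs))
    norm_theo = math.sqrt(sum(t * t for _, t in pairs))
    if norm_obs == 0.0 or norm_theo == 0.0:
        return 0.0
    return dot / (norm_obs * norm_theo)


if __name__ == "__main__":
    # Illustrative values only; the theoretical list is NOT pyopenms output.
    observed = [(180.063, 100.0), (181.067, 6.5)]
    theoretical = [(180.0634, 100.0), (181.0668, 6.9)]
    pairs = match_peaks(observed, theoretical, tolerance=10, unit="ppm")
    print(round(cosine_similarity(pairs), 4))
```

Because a ppm window scales with m/z, 10 ppm at m/z 180 is about 0.0018 Da, which is much tighter than the 0.05 Da value shown in the sample output.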
| 51 | + |
| 52 | +## Output JSON Structure |
| 53 | + |
| 54 | +```json |
| 55 | +{ |
| 56 | + "formula": "C6H12O6", |
| 57 | + "cosine_similarity": 0.9987, |
| 58 | + "n_peaks_compared": 3, |
| 59 | + "tolerance": 0.05, |
| 60 | + "tolerance_unit": "da", |
| 61 | + "peaks": [ |
| 62 | + {"peak_index": 0, "obs_mz": 180.063, "theo_mz": 180.0634, "obs_intensity": 100.0, "theo_intensity": 100.0}, |
| 63 | + ... |
| 64 | + ], |
| 65 | + "theoretical_pattern": [...], |
| 66 | + "halogen_detection": { |
| 67 | + "m2_ratio_observed": 0.5, |
| 68 | + "m2_ratio_theoretical": 0.42, |
| 69 | + "m2_excess": 0.08, |
| 70 | + "halogen_flag": false, |
| 71 | + "possible_halogen": "none" |
| 72 | + } |
| 73 | +} |
| 74 | +``` |
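
A downstream script might consume this output along the following lines. This is a sketch only: `summarize_result` is a hypothetical helper, and the embedded JSON is a trimmed copy of the structure above (the elided `peaks` and `theoretical_pattern` entries are left out rather than guessed).

```python
import json


def summarize_result(result):
    """Build a one-line summary from a parsed result dictionary."""
    halogen = result.get("halogen_detection", {})
    flag = "yes" if halogen.get("halogen_flag") else "no"
    return (
        f"{result['formula']}: cosine={result['cosine_similarity']:.4f} "
        f"over {result['n_peaks_compared']} peaks, halogen={flag}"
    )


# Trimmed example using only field names shown in the structure above.
example = json.loads("""
{
  "formula": "C6H12O6",
  "cosine_similarity": 0.9987,
  "n_peaks_compared": 3,
  "halogen_detection": {"halogen_flag": false, "possible_halogen": "none"}
}
""")

print(summarize_result(example))
```

In practice the dictionary would come from `json.load` on the file passed to `--output`.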
| 75 | + |
| 76 | +## Halogen Detection Thresholds |
| 77 | + |
| 78 | +| M+2 excess above theoretical | Interpretation | |
| 79 | +|------------------------------|---------------------------------| |
| 80 | +| < 10 % | No halogenation detected | |
| 81 | +| 10–20 % | Cl (weak signal) | |
| 82 | +| 20–70 % | Cl | |
| 83 | +| > 70 % | Br | |
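
The threshold table can be read as a simple lookup. The sketch below is one possible reading, not the tool's code: `m2_excess` and `classify_m2_excess` are hypothetical names, the excess is assumed to be expressed in percent, and the handling of exact boundary values (10, 20, 70) is an assumption, since the table does not say which side they fall on. The returned labels mirror the table wording and may differ from the strings the tool writes to `possible_halogen`.

```python
def m2_excess(m2_ratio_observed, m2_ratio_theoretical):
    """Excess of the observed M+2 ratio over the theoretical one."""
    return m2_ratio_observed - m2_ratio_theoretical


def classify_m2_excess(excess_percent):
    """Map an M+2 excess (assumed to be in percent) to the table's labels.

    Assumption: lower bounds are inclusive, so 10 is a weak Cl signal,
    20 is Cl, and only values strictly above 70 are reported as Br.
    """
    if excess_percent < 10:
        return "none"
    if excess_percent < 20:
        return "Cl (weak signal)"
    if excess_percent <= 70:
        return "Cl"
    return "Br"


if __name__ == "__main__":
    # Values taken from the sample JSON above: 0.5 observed, 0.42 theoretical.
    excess = m2_excess(0.5, 0.42)
    print(classify_m2_excess(excess))
```

With the sample values the excess is about 0.08, which falls below the 10 % threshold and matches the `"possible_halogen": "none"` entry in the example output.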