High-throughput evaluation of genetic variants with prime editing sensor libraries

High-throughput design of prime editing sensor libraries

A principal limitation of using prime editing to systematically investigate genetic variants is the inherent variability in editing efficiency among different pegRNAs^10,21,22,23. A number of computational tools for pegRNA design have been developed^{24,25,26,27,28,29,30,31,32,33}, including machine-learning algorithms that can nominate sets of pegRNAs predicted to produce high efficiency edits. However, even pegRNAs generated by these predictive algorithms require extensive experimental validation, and their editing activity is not guaranteed to correlate strongly across different cell types. We hypothesized that coupling pegRNAs with ‘sensors’—artificial copies of their endogenous target sites—would allow us to systematically identify high efficiency pegRNAs while controlling for the confounding effects of variable editing efficiency in a screening context (Fig. 1b).

Synthetic sensor-like target sites have been used previously by our group and others to control for base editing gRNA editing efficiencies while defining the relative fitness of variants in genetic screens^14,34. Several studies have applied a similar strategy to both base and prime editing technologies to identify features of efficient gRNAs or pegRNAs and train predictive algorithms^{21,32,33,35,36,37}. However, this approach has yet to be applied for high-throughput phenotypic screening of endogenous genetic variants with prime editing, probably due in part to the lower editing efficiency of prime editing relative to base editing. We reasoned that a sensor-based prime editing screening approach could be powerful to discriminate bona fide endogenous variants from undesired editing outcomes that enrich or deplete in a screen. Moreover, the sensor approach would theoretically overcome the limitations of assessing variants at different genetic sites in parallel by eliminating the need to sequence several endogenous loci.

To test this approach, we first needed to build a computational tool capable of designing and ranking pegRNAs for thousands of genetic variants, while automatically generating a paired sensor site. To address this unmet need, we built and publicly released PEGG (Extended Data Fig. 1a)—a Python package that enables high-throughput design of prime editing sensor libraries³⁸ (available at https://pegg.readthedocs.io/en/latest/). PEGG is compatible with a range of mutation input formats, including all of the datasets on the cBioPortal, ClinVar identifiers and custom mutation inputs^39,40.

We chose the TP53 tumor suppressor gene as a prototype to establish and credential a scalable prime editing sensor-based screening approach for a number of reasons. First, TP53 is the most frequently mutated gene in human cancer, with ~50% of patients suffering from tumors harboring a mutation within the TP53 gene while the rest often inactivate the p53 pathway through other mechanisms. Second, thousands of unique TP53 mutations have been identified in cancer patients, including eight or so ‘hotspot’ alleles in specific residues that exhibit the highest mutational frequencies¹⁹. Although p53 has been studied for decades, there have been few systematic studies, and those have been hampered by reliance on artificial overexpression of mutant p53 proteins, unrepresentative cell lines and/or a limited spectrum of mutations evaluated^6,7,15,16. These and other studies have sparked controversy in the field over whether any mutant p53 proteins are endowed with activities that go beyond LOF or dominant negative activity to achieve GOF or neomorphic status. These are important questions that extend beyond TP53 because mutant GOF proteins generated by cancer-associated variants, and the phenotypes they produce, could represent attractive therapeutic targets. Finally, prime editing sensor-based screening could be scaled up and broadly deployed to identify causal genetic variants implicated in cancer and other diseases with a strong genetic association.

With the above goals in mind, we first sought to generate a library of pegRNAs targeting TP53 variants. To generate this library, we selected variants from the MSK-IMPACT database, which uses deep exon sequencing of patient tumor samples to identify cancer-associated variants²⁰. From this database of over 40,000 patients, we chose all observed SNVs in p53, as well as frequently observed insertions and deletions, along with a collection of random indels (Extended Data Fig. 1b). We reasoned that including several pegRNAs with different protospacers and combinations of pegRNA properties for each variant would allow us to scan the pegRNA design space more thoroughly to identify highly efficient guides for robust statistical analysis of variant phenotypes. To accomplish this, we used PEGG to produce 30 pegRNA designs per variant (for pegRNAs with a sufficient number of accessible PAM sequences) with varying reverse transcription template (RTT) (10–30 nucleotides) and primer binding site (PBS) lengths (10–15 nucleotides) coupled to canonical ‘NGG’ protospacers. The generated pegRNA designs were ranked based on a composite ‘PEGG score’ that integrates literature best practices for pegRNA design (Extended Data Fig. 1a and Supplementary Table 1).

PEGG also generated silent substitution variants as neutral internal controls for the screen, and we filtered pegRNAs to exclude protospacers with an MIT specificity score below 50 to reduce the probability of off-target editing⁴¹ (Extended Data Fig. 1e). In addition, these pegRNA designs included an epegRNA motif—tevopreQ₁—an RNA pseudoknot located at the 3′ end of the pegRNA that improves editing by preventing degradation of the guide²². Even after these relatively stringent filtration steps, we were able to generate pegRNA designs for more than 95% of the input variants, resulting in a library of >28,000 pegRNAs (Fig. 1c,d and Extended Data Fig. 1c,d). Each pegRNA in the library is also paired with a 60-nucleotide long variant-specific synthetic ‘sensor’ that is generated by PEGG and included in the final oligonucleotide design. Every sensor is designed to recapitulate the native endogenous target locus, thereby linking pegRNA identity to editing outcomes (Fig. 1b).

To test the efficacy of using the sensor as a readout of editing at the endogenous locus, we randomly selected eight TP53 variant-specific pegRNA sensors generated during the process of library preparation. We generated lentivirus for each of these prime editing sensor constructs and performed separate transductions into cells expressing PEmax. At 3- (3D) and 7-days posttransduction (D7), we harvested genomic DNA and amplified both the pegRNA–sensor cassette and the endogenous locus targeted by each pegRNA. Analysis of editing at the sensor and endogenous locus revealed a very high correlation between the sensors and endogenous sites (Spearman correlation ≥0.9; Fig. 1e). In general, the prime editing sensor seems to slightly overestimate the editing activity at the endogenous locus, probably in part due to differences in locus chromatin accessibility⁴², but the ranking of pegRNA editing efficiencies is largely preserved, validating our sensor-based approach.

High-throughput interrogation of TP53 variants

Next, we screened our library of variants in TP53 wild type (WT) A549 lung adenocarcinoma cells stably expressing PEmax²¹. To measure the prime editing activity of this cell line, we generated and transduced these cells with a modified all-in-one lentiviral version of the fluorescence-based PEAR reporter⁴³, validating that the cells displayed strong editing activity (Extended Data Fig. 2a). We then introduced the lentiviral TP53 sensor library into these cells at a low multiplicity of infection and in triplicate while ensuring a library representation of >1,000× at every step of the sfcreen. At 4 days posttransduction (D4), we split the populations into untreated or Nutlin-3-treatment arms (Fig. 1f). Nutlin-3 is a small molecule that inhibits MDM2 to activate the p53 pathway, which can be used to select for TP53 mutations that promote bypass of p53-dependent cell cycle arrest and apoptosis⁴⁴. We hypothesized that this treatment group may increase the signal-to-noise ratio between TP53 variants with putative loss-of-function (LOF) or gain-of-function (GOF) activities and benign variants. We allowed the screen to progress for 34 days (D34), harvesting cell pellets from each replicate and treatment arm at several timepoints (Extended Data Fig. 2b). Genomic DNA extracted from each sample was used to amplify the pegRNA–sensor cassettes, which were subjected to next-generation sequencing (NGS) to simultaneously assess enrichment/depletion of pegRNAs and their editing activity and outcomes at the sensor target site (Extended Data Fig. 2c,d).

The average editing efficiency among all pegRNAs in the library increased in a time-dependent manner, peaking at ~8% in the final timepoint. In general, we observed low indel rates and strong correlation in sensor editing among replicates (Fig. 1g and Extended Data Fig. 3a–d). Strikingly, selecting only the most efficient pegRNA design for each variant led to a twofold increase in the average editing efficiency, highlighting the utility of the sensor for systematic empirical identification of high efficiency pegRNAs (Fig. 1g and Extended Data Fig. 3e–g). Cells with higher editing efficiency also exhibited stronger Nutlin-3 bypass in the Nutlin-3-treatment arm (Fig. 1g). Based on the assessment of editing at the sensor locus, we were able to identify active pegRNAs (≥2% editing efficiency) for more than half of the TP53 variants included in the library. This includes highly efficient pegRNAs that install the desired edit with over 20% efficiency for more than 20% of the variants (Fig. 1h). These validated pegRNAs could be further engineered with silent mutations that evade mismatch repair to boost overall editing efficiency²¹.

The size and diversity of this library also allowed us to examine features of highly efficient pegRNAs that broadly recapitulated previous observations^32,33,37,45. Correlation analysis between various pegRNA features and editing efficiency across all timepoints identified the estimated on-target activity of the protospacer (as predicted by Rule Set 2)⁴⁶ as the single largest determinant of prime editing efficiency (Fig. 2a). In addition, the distance between the edit and the nick introduced by nCas9 was correlated negatively with editing efficiency, while the length of the postedit homology was correlated positively with editing efficiency (Fig. 2a). Thus, edits closer to the nick and with larger postedit homology were more efficient, consistent with previous findings^32,33,37,45.

Fig. 2: Identification of features of highly efficient pegRNAs.

a, Spearman correlations between various features of pegRNA design and correct editing percentage, assessed for all pegRNA-sensor pairs with sufficient reads. Each dot represents a separate replicate/timepoint. For Doench 2016 score, see ref. ⁴⁶. b, Relationship between PEGG score and average correct editing percentage at each timepoint and condition is increasing monotonically. c, Representative example of the correlation between PEGG score and editing efficiency for day 25 replicate 1 (D25-REP1) (Untreated). d, Visualization of the protospacer bias in editing efficiency. The number of pegRNAs generated per protospacer at each TP53 exon on the plus (+) or minus (−) strand (top) and the average editing efficiency at each of these protospacers at day 34 (D34) of the untreated condition (bottom). e, Average editing efficiency for SNV-generating pegRNAs in the library as a function of distance to the nick generated by PEmax and PBS length. The location of the ‘NGG’ PAM sequence is highlighted in blue. Protospacer disrupting (locations +1 to +3) and PAM-disrupting variants (locations +5 and +6) tend to be more efficient. f, Feature importance of 20 random forest models trained separately to predict pegRNA efficiency. Each dot represents a different model. Data are presented as mean values with a 95% confidence interval. Source data and code to reproduce this figure can be found at https://github.com/samgould2/p53-prime-editing-sensor/blob/main/figure2.ipynb. NUT, Nutlin-3-treated.

Notably, the PEGG Score, which is a weighted linear combination of pegRNA features based on literature best practices, correlated more strongly with prime editing efficiency than any other single feature, achieving a Spearman correlation of up to 0.4 (Fig. 2a–c). Although this correlation is modest relative to published predictive models^32,33,37, the PEGG score is a simple, unbiased and cell type/organism-agnostic prediction of pegRNA activity that could complement machine-learning-based predictions of prime editing activity, which may vary due to training on particular cell types.

To further analyze the differences in prime editing activity among the 173 protospacers spanning the TP53 locus, we visualized the number of pegRNAs that utilized each protospacer and the average editing efficiency at each protospacer (Fig. 2d). This analysis suggests that only a subset of protospacers can be used to generate high efficiency pegRNAs, while other protospacers retain little-to-no editing activity. We also found that pegRNAs that introduce edits that disrupt the protospacer or PAM sequence tend to be more efficient (Fig. 2e). Relative to the nick created by nCas9, SNVs introduced at the +1–3 position, which mutate the protospacer, and at the +5–6 position, which mutate the guanine bases in the NGG PAM, display increased editing activity. In contrast, edits introduced at the +4 position, corresponding to the ‘N’ in the ‘NGG’ PAM sequence, display reduced editing, probably due to their failure to disrupt the PAM sequence (Fig. 2e).

Finally, we trained a random forest regressor to predict pegRNA efficiency (Extended Data Fig. 4a). Even with a restricted set of features, this algorithm was able to predict pegRNA activity with a Spearman correlation of ~0.6, comparable with other, more complex algorithms used to predict PE activity^32,33,37 (Extended Data Fig. 4b). Analysis of the relative feature importance of this random forest model was again consistent with previous findings, and highlighted the GC content of the PBS as another important determinant of editing not identified with simple correlation analysis (Fig. 2f). These results demonstrate that large-scale, gene-specific prime editing sensor screening datasets can also provide insight into the determinants of high efficiency prime editing, even though these libraries were not designed with that objective in mind.

Sensor-based calibration identifies pathogenic TP53 variants

To assess the relative fitness conferred by engineered TP53 variants, we used the MAGeCK pipeline to normalize read counts among replicates and quantify the log₂ fold change (LFC) and false discovery rate (FDR) of pegRNAs in the library⁴⁷. While the LFC in pegRNA counts was highly correlated in replicates from the untreated and Nutlin-3-treated arms of the screen, respectively, the correlation among replicates between the two conditions was modest, suggesting that treatment-dependent biological effects were occurring (Extended Data Figs. 3a and 5). We then used the sensor target site as a quantitative proxy for editing efficiency at the endogenous locus to systematically filter pegRNAs based on their empirical editing efficiency and precision (Fig. 3a and Extended Data Fig. 6a–d). As expected, the number of significantly enriched or depleted pegRNAs in ‘sensor-calibrated’ datasets decreased as we increased the editing activity threshold (Fig. 3b). These results demonstrate that our sensor-based approach allows empirical removal of pegRNAs that exhibit potentially spurious enrichment or depletion, and low and/or undesired editing activity, retaining pegRNAs that are more likely to introduce the variants of interest with high efficiency and precision. Based on these results, we decided to focus our statistical analyses on a dataset composed of pegRNAs with ≥10% editing efficiency to minimize the confounding effects of imprecise editing (Fig. 3c–g).

**Fig. 3: High-throughput prime editing sensor screens identify pathogenic *TP53* variants.**

As hypothesized, the dynamic range in the Nutlin-3-treated arm of the screen was considerably higher than in the untreated arm, with pegRNAs more strongly enriching and depleting in the presence of Nutlin-3 (Fig. 3c,d). Treatment with Nutlin-3 also selected for cells with higher-efficiency editing, improving the resolution of the screen (Fig. 3c,d). Editing of sensor loci continued throughout the screen due to the constitutive expression of PEmax, with sensor editing increasing fourfold on average from D4 to D16, and twofold from D16 to D34. However, sensor editing rates among negatively selected (LFC < −1), unselected (−1 ≤ LFC ≤ 1) and positively selected (LFC > 1) pegRNAs remained constant all throughout the screen (Extended Data Fig. 7). These results indicate that any differences in editing rate among the pegRNAs or cells in the population were unlikely to contribute to the results of the screen and were instead controlled internally.

Several putative pathogenic TP53 variants, including R196P and R267P, were strongly enriched in both treatment arms, with several pegRNAs per variant appearing as top hits (Fig. 3e,f). Given the higher dynamic range of the Nutlin-3 treatment arm, as well as the possibility that this treatment biases towards the discovery of dominant negative TP53 mutations^2,6,16, we focused our analyses on this treatment group. Several TP53 variants showed significant enrichment in the Nutlin-3-treatment group, including SNVs and indels in the C-terminal half of the DNA-binding domain (DBD) and the OD (Fig. 3f). This includes several variants at residues 248 and 249, which are known mutational hotspots in p53 and commonly observed in cancer patients and individuals with the Li–Fraumeni cancer predisposition syndrome¹⁷. We also identified strongly depleting variants in the DBD that may retain WT p53 transcriptional activity or fail to exert a dominant negative effect on the p53 tetramer (Fig. 3f). Collectively, these results validate the utility of our approach and dataset to accurately identify functionally diverse pathogenic TP53 variants.

Interestingly, the most commonly observed TP53 mutation in human cancer, R175H, did not show strong enrichment despite the existence of several R175H pegRNAs exhibiting ≥10% editing efficiency. In fact, most of the top enriching variants we identified were not in known mutational hotspots¹⁹, suggesting that other types of variants can produce stronger phenotypes. These include the top hit—an insertion of a histidine between residues 254 and 255—as well as R196P and several variants in the OD (F328L, N345S, A347P) (Fig. 3f). These observations are consistent with the possibility that a subset of TP53 hotspot mutations are observed in part due to disproportionately high mutagenesis rates at the genomic level due to extrinsic and intrinsic factors, such as tobacco smoke and APOBEC (apolipoprotein B mRNA editing catalytic polypeptide-like) activity, rather than only to the fitness advantage conferred by these variants relative to other TP53 mutations^6,19. An alternative explanation to these observations is that the hotspot variants were simply not efficiently engineered by the pegRNAs used in our screen. Although this was true for several hotspot variants, even hotspots with several efficient pegRNA designs (for example, R248Q, Y220C, G245S) were outcompeted by other, less frequently observed variants (Extended Data Fig. 8a–c). This includes rare-variant-encoding pegRNAs that outcompete hotspot variant-encoding pegRNAs with similar or identical empirical editing frequencies (Extended Data Fig. 8a–c). Even at the frequently mutated codon 248, the two most commonly observed substitutions, R248Q and R248W, were outcompeted in the screen by the rarer substitutions R248P and R248G (Extended Data Fig. 8d,e). These results suggest that the spectrum of cancer-associated TP53 mutations is mechanistically diverse and probably arises through the contextual combination of disproportionate mutagenesis rates and phenotypic selection of functionally important codons and their cognate residues.

Bulk quantification of pegRNAs grouped by variant class also revealed that nonsense variants were significantly more enriched compared with missense and silent variants. This is evident at several thresholds of pegRNA activity (Fig. 3g,h). As expected, silent variants tend to deplete, particularly when considering pegRNAs at higher threshold for editing efficiency (≥20%), bolstering our confidence in the fidelity of the screen (Fig. 3h). Using available annotations of p53 residue function⁴⁸, we also found that, as expected, variants in DNA-binding/contacting residues displayed strong enrichment (including residues 248 and 273) (Fig. 3i). Intriguingly, certain variants involved in tetramerization and transactivation (for example, L22V) were also strongly enriched, despite the low frequency of mutation in these residues (Fig. 3i). Other observations are more difficult to interpret, such as the large variance in the enrichment of mutations that affect residues involved in zinc binding, or the fact that variants in partially exposed residues tended to deplete while those in buried and exposed residues tended to enrich (Fig. 3i). Altogether, these observations suggest that there is a large, underappreciated phenotypic variance in the relative fitness conferred by distinct TP53 variants—not all p53 variants are one and the same, a concept that is likely relevant across many other genes and more broadly in biology.

We also sought to quantify the degree of concordance between our screening results and widely used metrics of variant deleteriousness. To do so, we used the combined annotation-dependent depletion (CADD) score, which integrates evolutionary conservation of residues with other metrics of pathogenicity to generate a CADD score, with higher scoring variants predicted to be more deleterious⁴⁹. We observed a low correlation between CADD score and enrichment of all SNV-specific pegRNAs. However, the correlation between CADD score and variant-specific fitness increased dramatically when we used sensor target sites to restrict our analysis to variants generated by high efficiency pegRNAs (Fig. 3j). We achieved a Spearman correlation of ~0.3 when considering Nutlin-3-treated pegRNAs with ≥15% editing activity, and >0.4 when considering pegRNAs ≥50% editing (Fig. 3j). Across all minimum pegRNA editing activity thresholds, the CADD score correlated more strongly with fitness of Nutlin-3-treated pegRNAs to the untreated condition. These results emphasize the significant advantage of including sensor target sites in prime editing screening libraries to quantitatively assess pegRNA efficiency and reinforce the ability of Nutlin-3 to effectively pull out genuine LOF and putative neomorphic and separation-of-function TP53 variants (Fig. 3j,k).

Sequencing of endogenous TP53 validates sensor approach

The above results demonstrate that sensor-calibrated quantification of pegRNA enrichment and depletion can be used to identify bona fide pathogenic TP53 variants. However, these analyses do not rule out the possibility that these changes are independent of true editing at the endogenous TP53 locus. To formally test whether prime editing sensor screens can faithfully quantify the effects of variants engineered at endogenous loci, we performed targeted next-generation sequencing of specific regions in exons 6, 7 and 10 of TP53 using genomic DNA extracted from untreated and Nutlin-treated cells from D4 and D34 timepoints (Fig. 4a). We reasoned that sequencing the native TP53 locus would allow us to directly compare the fold change in pegRNA counts with the fold change of variants installed at defined targeted sites within endogenous TP53.

**Fig. 4: Sequencing of *TP53* validates the prime editing sensor approach.**

Unlike nonsensor-based prime editing screens, which are likely to suffer from extensive noise during enrichment due to the difficulty of designing high efficiency pegRNAs, our sensor-based prime editing screen could be denoised empirically by filtering for pegRNAs that edited their cognate sensor above a given editing threshold. Indeed, we observed no correlation between the LFC in pegRNA counts with the corresponding LFC of endogenous variants engineered in the native TP53 locus when considering all pegRNAs (Fig. 4b–d). However, taking advantage of the sensor site to filter out pegRNAs below a given correct editing threshold dramatically reduced the noise in these data and revealed a strong correlation between the fold change in pegRNA counts and endogenous variant counts (Fig. 4b–c). This correlation increased monotonically as the minimum correct editing threshold increased, reaching a Spearman correlation >0.4 in the untreated arm and >0.5 in the Nutlin-treated arm of the screen when using a minimum correct editing threshold of 50% (Fig. 4d). We note that this is probably an underestimate of correlation on a per variant basis, as several pegRNAs with variable editing efficiencies are being compared directly with a single endogenous genomic site. Importantly, edited exonic sites targeted by top enriching pegRNAs (for example, R196P, A347P, I254_I255insH) were also enriched significantly relative to their WT counterparts and correlated strongly with their respective pegRNA counts. We were also able to detect nearly all of the variants engineered by active pegRNAs (≥500 counts per variant and ≥1% correct sensor editing), and the detectable fraction reached saturation when considering pegRNAs producing ≥14% editing at the sensor site (Fig. 4e). Altogether, these results emphasize the need to integrate quantitative, sensor-like approaches to accurately extract true signal from the high levels of noise that are inherent in large-scale prime editing screens. Indeed, our analysis demonstrates that screening pegRNAs without any empirical quantification of their editing activity invariably leads to spurious conclusions concerning the fitness of the variants that those pegRNAs are intended to engineer.

Functional validation of pathogenic TP53 variants

The above data demonstrate that pegRNA-specific sensor modules can be used to rigorously calibrate screening results to limit the analysis of variant fitness effects only to highly efficient pegRNAs. Though suggestive, these results do not formally prove that top scoring pegRNAs are enriched due to the introduction of defined genetic variants at the endogenous target locus and that these drive the observed biological differences. To test this, we selected a cohort of 29 pegRNAs that significantly enriched or depleted in the screen, or that generated commonly observed ‘hotspot’ mutations (Fig. 5a and Extended Data Fig. 9a). This set of pegRNAs targeted residues that spanned the TP53 locus and also included two control pegRNAs that install silent edits (Fig. 5a). In all, this validation set included low, medium and high efficiency pegRNAs spanning a range of 0–86% correct editing percentages, as measured by their respective sensor sites (Extended Data Fig. 9a). We transduced A549-PEmax cells with lentiviruses encoding individual pegRNAs and allowed editing to occur for 7–10 days, based on the kinetics of editing we observed previously (Fig. 1f). We then mixed each individual population of sequence-verified isogenic A549-PEmax-pegRNA cells with parental TP53 WT A549-PEmax cells and performed longitudinal fluorescence competition assays in the presence or absence of Nutlin-3 (Fig. 5b). We used the red fluorescent protein (RFP) fluorophore linked to each pegRNA vector to track the relative fitness of pegRNA cells (RFP+) compared with parental cells (RFP−) (Fig. 5c). These competition assays proceeded for 2 weeks, with flow cytometry readings taken every 7 days for each replicate (Extended Data Fig. 9b). We then calculated the difference in the RFP+ cell fraction (∆RFP%) for each pegRNA between the profiled timepoints and the initial timepoint in both conditions (Fig. 5d). Consistent with our screening results, a significant fraction of pegRNAs showed enrichment in the presence of Nutlin-3, often reaching complete saturation (Fig. 5d). Overall, there was strong concordance between the enrichment in cells observed in both treatment conditions in the screen and in these competition assays, supporting the reproducibility of the screening results (Fig. 5e,f). Importantly, we observed a significantly strong enrichment of cells expressing pegRNAs designed to engineer variants in the OD, including A347P and N345S (Fig. 5e,f).

The above results indicate that cells harboring a number of pegRNAs designed to engineer diverse types of TP53 variants have an increased fitness. However, these results do not rule out the possibility that these pegRNAs confer increased fitness through indel generation, rather than through their encoded edits. To assess this, we performed targeted NGS of each endogenous target loci in pure A549-PEmax-pegRNA cell lines (that is, before mixing with the parental population). Comparison of the endogenous editing observed in these cell lines with the corresponding sensor editing observed in the untreated arm of the screen at a similar timepoint (D16) revealed a strong correlation, with on-target editing observed for almost all pegRNAs (Fig. 5g).

Another potential application of our approach is to combine high-throughput prime editing with drug treatments to identify variant–drug interactions that could be exploited to develop allele-specific therapies. This is particularly relevant today because recent advances in rational drug design have shown that small molecules targeting specific mutant proteins (including those produced by oncogenic point mutant KRAS alleles) can have therapeutic potential⁵⁰. To test whether our approach could be used to identify variant-specific therapeutic sensitivities, we tested two small molecules that have been shown to exhibit mutant-p53-specific effects, COTI-2 and PK7088, in cells transduced with lentiviruses encoding R175H- and Y220C-targeting pegRNAs, respectively^51,52,53. Both of these treated populations showed depletion in the RFP+ cell fraction (Extended Data Fig. 9c). In particular, the COTI-2 treatment arm showed significant depletion of R175H-pegRNA cells (Extended Data Fig. 9c). These results demonstrate that prime editing sensor screens could be used to systematically identify variant-specific vulnerabilities to diverse therapies, augmenting cDNA-based approaches for performing similar screens⁵⁴.

Prime editing reveals new pathogenic variants

High-throughput functional genomics approaches have been used previously to investigate TP53 mutations. For instance, Giacomelli et al. performed deep mutational scanning of TP53 variants using exogenous overexpression of mutant TP53 cDNAs in A549 cells in the presence or absence of Nutlin-3, concluding that most TP53 mutations probably arise as a consequence of endogenous mutational processes that select for dominant negative and LOF activity⁶. A follow-up study integrated this method with HDR-based modeling of six TP53 hotspot mutations in human leukemia cells and concluded that missense mutations in the TP53 DBD act mainly through dominant negative activity². More recently, Ursu et al. employed a modified version of Perturb-seq to interrogate the transcriptional effects of 200 mutant TP53 cDNAs in A549 cells by single-cell RNA-sequencing, also concluding that most of these disrupt p53 activity through LOF and dominant negative effects¹⁶. In contrast, another study used a similar approach in H1299 and HCT116 cell lines to interrogate variants in the TP53 DBD through parallel in vitro and in vivo experiments, concluding that certain hotspot mutations confer a higher proliferative advantage in vivo, probably through GOF mechanisms⁷. A number of studies in both mice and humans have also demonstrated that certain TP53 variants, including hotspot mutations at residues R175, R248 and R273, can produce phenotypes consistent with neomorphic/GOF activities⁵⁵. These include promoting aberrant self-renewal of hematopoietic stem cells⁵⁶, sustaining tumor growth^57,58 and promoting metastatic dissemination^59,60,61,62, among others⁵⁵. As such, there is much controversy in the field regarding the precise cellular and molecular activities of cancer-associated TP53 mutations—the most common genetic lesions observed across all types of cancer.

We hypothesized that cDNA screening approaches are biased in favor of detecting dominant negative activities due to their reliance on supraphysiological overexpression of mutant proteins. This is particularly relevant for studying the active p53 transcription factor, which is a tetrameric protein composed of a dimer of dimers¹⁸. As such, we postulated that mutant overexpression studies in WT TP53 cells, including A549, may fail to detect mutant allele-specific activities and phenotypes that may be sensitive to gene dosage and protein stoichiometry. To test this hypothesis, we reanalyzed our data to perform comparative bioinformatic analyses with the dataset generated by Giacomelli et al.⁶, as their experiments were also carried out in WT TP53 A549 cells treated with Nutlin-3. First, we plotted the Z-scores of SNV-generating pegRNAs (≥10% editing) against the Z-scores of the corresponding variants expressed from cDNAs (Fig. 6a). Supporting our hypothesis, variants in the OD of p53 tended to deplete (that is, Z-score <0) when expressed from cDNAs, but often enriched significantly when expressed from the endogenous locus (Fig. 6a,b). To investigate this difference further, we calculated the difference in Z-scores (∆Z-score) between each pegRNA–cDNA pair by subtracting the cDNA Z-score from the prime editing Z-score. This analysis revealed a significantly higher ∆Z-score for endogenous variants in the OD relative to other domains of p53 (Fig. 6c)—a pattern that consistently held at several thresholds of pegRNA activity (Extended Data Fig. 10a) and even when we restricted our analysis solely to the most efficient pegRNA for each variant (Fig. 6d and Extended Data Fig. 10b).

**Fig. 6: Comparative analysis of prime editing and cDNA screening datasets of *TP53* variants reveals pathogenic variants in the OD.**

Some of the most impactful variants in the p53 OD are observed frequently in individuals with Li–Fraumeni syndrome, who carry germline TP53 variants that predispose them to cancer. Two independent studies by the Prives and Lozano laboratories recently showed that p53 proteins harboring A347D mutations (in the OD) form stable dimers instead of tetramers, and that these dimeric p53 proteins exhibit neomorphic activities^63,64. Visualizing the residue-averaged ∆Z-scores on the structure of the p53 OD⁶⁵ further highlights the extensive differences in the behavior of endogenous variants as compared with exogenous (cDNA) variants in this domain (Fig. 6e).

To further investigate the phenotypic differences between endogenous and exogenous TP53 variants, we performed fluorescence competition assays with Nutlin-treated A549-PEmax cells transduced with pegRNAs or matched cDNAs representing specific variants spanning the TP53 DBD and OD regions. We also included a number of important controls for each approach, including silent-edit- or no-edit-inducing pegRNAs that targeted the same loci as their matched variant-inducing pegRNAs, as well as empty vector and WT cDNA constructs. We found a strong agreement between the behavior of mutant TP53 cDNAs tested in competition assays and their corresponding Z-scores in the Giacomelli et al. screen⁶, validating our assay (Fig. 6f). However, comparing the enrichment of cells harboring endogenous variants engineered with pegRNAs relative to those expressing exogenous variant cDNAs revealed large differences in the behavior conferred by endogenous and exogenous TP53 mutations (Fig. 6g and Extended Data Fig. 10c). Variants in the DBD behaved similarly across both systems, with the exception of R110P and L145P, which enriched only in the cDNA group. However, all OD variants failed to enrich when tested with cDNAs, but three-quarters of OD variants displayed strong enrichment when engineered endogenously using prime editing (Fig. 6g). Importantly, all controls behaved as expected, failing to enrich in the competition assays (Fig. 6g). These results support the hypothesis that certain variant-induced phenotypes can be observed accurately only when engineered and expressed in the endogenous genomic context.

Collectively, our observations highlight gene dosage, protein stoichiometry and protein–protein interaction domains as important variables that must be taken into account when studying the enormous diversity of mutant alleles observed in human cancer. Failing to take these considerations into account might lead to the misclassification of bona fide pathogenic variants, including those identified in patients with hereditary cancer predisposition syndromes like Li–Fraumeni.