
Pharmaceutical companies have spent decades engineering IP moats around their most valuable assets. Now those moats face a threat and a tool from the same source: artificial intelligence. AI accelerates discovery, but it floods the public domain with prior art at machine speed. It predicts FDA approval odds with 83% accuracy, but it cannot legally be named as an inventor. It can be used to mine competitor patents for invalidity arguments in hours, not months.
The companies that get this right will capture durable exclusivity on their AI-generated pipelines. Those that get it wrong will find their patent applications rejected for implausible utility, their compound libraries used as prior art against them, or their issued patents picked apart by AI-powered ANDA filers running Paragraph IV challenges.
This analysis covers the mechanics, the case law, the IP valuation implications, the litigation economics, and the specific strategies that pharma IP teams, portfolio managers, and R&D leads need to execute now.
Part I: The Dual Role of AI in Pharmaceutical Patent Strategy
AI as a Discovery Engine: What It Actually Does to the R&D Timeline
The honest picture of AI in drug discovery is more specific than ‘it speeds things up.’ Machine learning models compress specific, discrete bottlenecks in the R&D pipeline, and understanding which bottlenecks matters for IP strategy.
Target identification and validation has historically been slow because researchers were limited to literature review and known biology. AI models trained on multi-omics datasets, including genomics, proteomics, and transcriptomics, can identify novel disease targets by finding non-obvious correlations across large biological datasets. Recursion Pharmaceuticals runs phenotypic screens at scale using convolutional neural networks that process cellular imaging data, identifying morphological signatures associated with disease states. Their platform has generated a pipeline entirely predicated on computational hits validated by automated wet-lab experiments.
Structure prediction changed fundamentally when DeepMind published AlphaFold2 in 2021. By 2022, the public AlphaFold database held predicted three-dimensional structures for over 200 million proteins, including nearly the entire human proteome. Before AlphaFold2, determining a protein structure required experimental methods such as X-ray crystallography or cryo-electron microscopy, each taking months and significant capital. Researchers can now access structural predictions, many of them high-confidence, for free. That changes the starting point for structure-based drug design entirely.
Generative chemistry, where AI models propose novel molecular structures with specified properties, is the capability with the most direct patent implications. Models including generative adversarial networks (GANs), variational autoencoders, and transformer-based architectures trained on chemical datasets can generate tens of thousands of candidate structures in hours. Insilico Medicine used this approach to identify INS018_055, a novel TNIK inhibitor for idiopathic pulmonary fibrosis, which reached Phase II clinical trials after a reported timeline of approximately 18 months from target discovery to preclinical candidate nomination. The company filed patents on both the compound and the discovery platform.
Lead optimization, the iterative chemical modification of a hit compound to improve potency, selectivity, and pharmacokinetic properties, is where AI produces measurable cycle time reductions. Quantitative structure-activity relationship (QSAR) models can predict how structural modifications will change binding affinity, metabolic stability, and aqueous solubility without requiring synthesis and assay of each analog. That compresses medicinal chemistry cycles from weeks to days.
The combined effect is real but uneven. AI does not replace clinical trials, regulatory review, or manufacturing scale-up. The phases where AI is meaningfully accelerating discovery, from target identification through lead optimization, represent roughly the preclinical portion of the timeline. Total development time for most drugs remains over ten years. What AI changes is the cost and speed of reaching an IND filing, and the number of chemically novel candidates that can be generated and evaluated in that window.
IP Valuation Implications: What AI-Accelerated Discovery Does to Patent Portfolio Economics
When AI compresses preclinical timelines, it changes the economics of pharmaceutical IP at the portfolio level. Faster discovery means a company can generate more candidates per R&D dollar, but it also means competitors face the same acceleration. The net effect on patent portfolio value depends on how a company positions its IP relative to this faster environment.
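In a traditional small-molecule program, the compound patent filing typically occurs after several years of preclinical work. That filing marks the start of the 20-year patent term, so what determines effective exclusivity is the interval between filing and FDA approval. AI-accelerated discovery helps in two ways: companies reach a patentable candidate sooner, securing an earlier priority date against competitors, and they move faster through the lead optimization and IND-enabling work that follows the filing. Filing earlier does not by itself preserve term; with the approval date held constant, it would consume more of it. The gain comes from shortening the filing-to-approval interval, which leaves more patent term at the point of FDA approval. That directly increases the effective exclusivity window and the net present value of the asset.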
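The arithmetic behind that point is simple enough to sketch. The Python below assumes a flat 20-year term measured from filing and ignores patent term extension and adjustment. The function name and every duration are hypothetical, chosen only to show how shortening post-filing work changes the term left at approval.

```python
PATENT_TERM_YEARS = 20.0  # statutory term, measured from the filing date


def term_remaining_at_approval(post_filing_preclinical: float,
                               clinical_and_review: float) -> float:
    """Years of patent term left on the approval date.

    Assumption: no patent term extension (PTE) or adjustment (PTA).
    """
    elapsed = post_filing_preclinical + clinical_and_review
    return max(0.0, PATENT_TERM_YEARS - elapsed)


# Hypothetical program: 8 years of clinical trials plus FDA review.
traditional = term_remaining_at_approval(post_filing_preclinical=3.0,
                                         clinical_and_review=8.0)
accelerated = term_remaining_at_approval(post_filing_preclinical=1.5,
                                         clinical_and_review=8.0)

print(f"traditional: {traditional:.1f} years of term at approval")
print(f"accelerated: {accelerated:.1f} years of term at approval")
print(f"gain:        {accelerated - traditional:.1f} years")
```

In this sketch the gain equals the post-filing preclinical time saved. A fuller model would layer in patent term extension, which can restore part of the time lost to regulatory review.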
A biologic program with a 12-year reference product exclusivity clock under the Biologics Price Competition and Innovation Act (BPCIA) is less sensitive to patent term erosion than a small molecule where the Hatch-Waxman framework and Orange Book listing are the primary exclusivity mechanisms. AI-accelerated timelines therefore produce larger NPV gains for small-molecule programs where every additional month of exclusivity translates directly into revenue.
For analysts valuing pharma companies with AI-heavy pipelines, the relevant adjustment is not simply to use a faster timeline assumption in DCF models. The relevant question is whether the AI platform produces genuinely novel composition-of-matter patents, or whether it generates compounds that are obvious modifications of prior art. A platform producing the former is a durable competitive asset. A platform generating the latter is burning R&D spend on filings that will be rejected or challenged.
Part II: How AI Generates Prior Art and What That Does to Your Patents
The Defensive Publication Problem
Pharmaceutical patents live or die on novelty. Under 35 U.S.C. Section 102, a claimed invention must not have been patented, described in a printed publication, or otherwise publicly disclosed before the effective filing date. AI changes the threat profile here by making mass prior art generation trivially cheap.
Generative chemistry models can produce, in hours, structural analogs of any lead compound in the public domain. If those analogs are published, whether as academic preprints, patent applications, or deliberate defensive publications, they enter the prior art and block competitors from patenting the same structures. Practitioners treat this as a live risk. Dan Rudoy of Wolf, Greenfield & Sacks has described the scenario explicitly: a company could deploy AI to generate and publish every structural variation of claims in a competitor’s patent application, flooding the public domain and foreclosing those positions.
The Defensive Patent License (DPL) and similar open-source patent frameworks have existed for years in software. In pharmaceuticals, the strategic use of AI-generated publications to prevent competitor patenting is a newer and less mapped territory. The key question is whether a generative model output constitutes enabling disclosure. If the AI generates a structural formula without synthesis route or biological data, a patent examiner or court may find that the disclosure is not enabling, meaning it does not teach someone skilled in the field how to make and use the compound. In that case, it may not qualify as anticipatory prior art under Section 102, though it could still be considered for obviousness analysis under Section 103.
AstraZeneca filed a lawsuit in 2023 against generic filer Alkem Laboratories related to Farxiga (dapagliflozin), an SGLT2 inhibitor generating approximately $2.8 billion in annual global revenues. That litigation, ongoing in the District of Delaware, involves Alkem’s Paragraph IV certification asserting that listed Orange Book patents are invalid or not infringed. The prior art landscape for SGLT2 inhibitors, already complex before AI entered the picture, now includes computational chemistry publications from academic groups running generative models on gliflozin scaffolds. Those publications are citable in ANDA litigation even if they were generated by unsupervised AI tools.
The Prior Art Flood: AI-Generated Compound Libraries
Several research consortia and pharmaceutical companies have published open chemical libraries containing millions of AI-generated compounds. Enamine’s REAL Space contains over 48 billion make-on-demand compounds, not all AI-generated but increasingly AI-enumerated. Mcule and Chemspace maintain similar libraries. These databases are publicly accessible, and their existence creates a prior art problem for any company attempting to patent broad classes of chemical structures without contemporaneous synthesis and assay data.
The key legal issue is enablement: whether a reference that names a compound nobody has made teaches a skilled person how to make it. A compound described in a publication but never actually synthesized is not automatically anticipatory prior art under current U.S. patent law. The Federal Circuit has wrestled with the question of whether a compound must be synthesized to be ‘described’ in a prior art reference under Section 102. In Rasmusson v. SmithKline Beecham Corp. (Fed. Cir. 2005), the court held that an anticipating reference must be enabling but need not prove that the disclosed invention actually works. The general principle that a compound listed in a Markush group without specific disclosure of that compound is not necessarily anticipatory prior art has held, but the sheer volume of AI-generated chemical databases is pushing courts and patent offices toward clearer rules.
For IP teams, the practical implication is this: if your compound appears in any publicly accessible AI-generated database with even a speculative biological annotation, expect that reference to surface in an ANDA filing or inter partes review (IPR) proceeding. Conduct freedom-to-operate analyses that include computational chemistry databases, not just issued patents and academic literature.
Paragraph IV Filings and AI-Powered Invalidity Analysis
The Hatch-Waxman Paragraph IV pathway requires an ANDA filer to certify that the listed patent is invalid, unenforceable, or will not be infringed. AI makes building that invalidity argument cheaper and faster. Patent analytics platforms including Docket Navigator, Innography (now owned by CPA Global/Clarivate), and PatentSight use machine learning to cluster patents by technology and identify potentially invalidating prior art references. Generic manufacturers and their litigation counsel now routinely use these tools to build their Section 103 obviousness arguments before filing the Paragraph IV certification.
The time and cost dynamics of Hatch-Waxman litigation are shifting as a result. When an originator sues within 45 days of receiving notice of the Paragraph IV certification, a 30-month stay on ANDA approval takes effect, generally measured from the date the notice was received. That stay historically created negotiating leverage for the originator. AI-powered invalidity analysis compresses the time it takes for generics to build a credible trial case, which changes settlement dynamics and increases the frequency with which cases go to trial rather than settling with a reverse-payment agreement.
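The 45-day and 30-month clocks can be laid out as date arithmetic. This is a minimal sketch under simplifying assumptions: the stay is measured from receipt of the Paragraph IV notice, and court-ordered changes to the stay and the special rule for challenges filed during NCE exclusivity are ignored. The function names and dates are hypothetical.

```python
import calendar
from datetime import date, timedelta
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping to the month's last day."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def hatch_waxman_timeline(notice_received: date,
                          suit_filed: Optional[date]) -> dict:
    """Return the suit deadline and, if the suit was timely, the stay end date."""
    suit_deadline = notice_received + timedelta(days=45)
    stay_triggered = suit_filed is not None and suit_filed <= suit_deadline
    stay_ends = add_months(notice_received, 30) if stay_triggered else None
    return {
        "suit_deadline": suit_deadline,
        "stay_triggered": stay_triggered,
        "stay_ends": stay_ends,
    }


# Hypothetical dates for illustration only.
timeline = hatch_waxman_timeline(notice_received=date(2024, 1, 15),
                                 suit_filed=date(2024, 2, 20))
for key, value in timeline.items():
    print(f"{key}: {value}")
```

A suit filed after the 45-day window still proceeds as ordinary infringement litigation in this sketch; it simply does not trigger the stay.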
The FTC has maintained active scrutiny of reverse-payment settlements (so-called pay-for-delay agreements) since FTC v. Actavis (2013), in which the Supreme Court held that such settlements can violate antitrust law under a rule of reason analysis. With AI making invalidity arguments cheaper to construct, originators face more credible threats at lower cost to the generic challenger, potentially increasing the size of settlements or pushing more disputes through full trial. Both outcomes affect IP valuation.
Key Takeaways, Part II
AI-generated prior art is real, growing, and already appearing in pharmaceutical patent disputes. IP teams must expand freedom-to-operate searches to cover computational chemistry databases. Broad composition-of-matter claims covering AI-generated compound libraries face heightened rejection risk based on plausibility. Paragraph IV litigation economics are shifting as AI reduces the cost of building invalidity arguments.
Investment Strategy Note
Analysts pricing branded pharmaceutical assets should haircut patent life assumptions by 12-18 months in therapeutic areas where large AI-generated compound libraries exist, specifically CNS, oncology, and anti-infectives, where academic generative chemistry output is densest. Assets protected by method-of-treatment patents or formulation patents rather than solely composition-of-matter patents carry less exposure to AI prior art challenges.
Part III: Inventorship, the USPTO, and the Human Contribution Requirement
The Legal Standard: What ‘Significant Contribution’ Actually Means
The USPTO position on AI inventorship is unambiguous: AI cannot be named as an inventor. The Thaler v. Vidal case (Federal Circuit, 2022) resolved the question at the appellate level. Stephen Thaler filed patent applications listing his AI system DABUS as the sole inventor. The Federal Circuit affirmed the district court’s ruling that the Patent Act’s reference to ‘individuals’ in the inventorship provisions requires human inventors. The Supreme Court declined to hear the case, leaving the Federal Circuit’s holding intact.
The more nuanced operational question is what level of human contribution satisfies the ‘significant contribution’ standard articulated in Pannu v. Iolab Corp. (Federal Circuit, 1998). Under Pannu, an inventor must contribute to the conception of at least one claim in the patent. Conception is the mental formulation of a complete and operative invention. An employee who merely follows instructions, or whose contribution was fully directed by another person, is not a proper inventor.
Apply that framework to an AI-assisted discovery workflow. A researcher who defines the target protein, selects the training dataset, specifies the pharmacophore constraints, and chooses from among the AI’s generated candidates for synthesis has likely made a significant contribution. A researcher who presses ‘run’ on an off-the-shelf generative model and patents the top-ranked output has a far weaker inventorship claim. The difference lies in the human’s directed choices, and contemporaneous documentation is what proves them.
Michael Kahn of Akin’s IP practice has framed it precisely: the human’s role must include designing experiments based on AI outputs, or training the AI tool to solve specific problems. Generic model use, without substantial human direction of the discovery logic, will not support inventorship.
USPTO Examination Practice: Plausibility Rejections on AI-Generated Compound Claims
Recursion Pharmaceuticals’ PCT application WO2024039689 for Heterocycle RBM39 Modulators is the clearest recent example of what happens when AI-generated compound claims are filed without sufficient experimental validation. Claim 1 of that application encompassed thousands of chemical structures. The International Searching Authority declined to establish an opinion on patentability, citing the implausibility that even a fraction of the claimed compounds would have the stated biological activity. That is in substance a plausibility objection, and it reflects a pattern that European Patent Office examiners have consistently applied to broad claim sets lacking enabling experimental data.
The EPO’s problem-solution approach to inventive step asks not just whether a solution is plausible in the abstract, but whether the claimed technical effect is made plausible across the full claim scope. For a Markush group claiming 10,000 AI-generated compounds based on three synthesized examples, that standard is unlikely to be met. Under Article 83 EPC (sufficiency of disclosure) and Article 84 EPC (clarity and support), examiners have further tools to attack broad AI-generated claims even where the underlying chemistry is genuinely novel.
The U.S. counterparts are the written description and enablement requirements of 35 U.S.C. Section 112. In Amgen Inc. v. Sanofi (Supreme Court, 2023), the Court unanimously held that Amgen’s broad antibody claims, which covered all antibodies that bind a specific region of PCSK9 and block receptor binding, were not enabled because Amgen had exemplified only a small fraction of the claimed functional antibody space. The Court’s language, that a patent cannot claim more than the inventor has described how to make and use, applies directly to AI-generated compound libraries. A claim set covering thousands of computationally generated structures, supported by laboratory data on only a handful, faces a strong Section 112 challenge under Amgen.
IP Valuation: How Inventorship Risk Discounts Patent Assets
When a pharmaceutical company’s pipeline is built substantially on AI-generated compounds, the inventorship documentation becomes a material due diligence item. In a licensing or M&A transaction, poorly documented human contributions create a patent validity risk that discounts asset value. A composition-of-matter patent on a clinical-stage compound generates its valuation premium from expected exclusivity. If that patent is vulnerable to an inventorship challenge, or to an inter partes review citing AI-generated prior art, the probability-weighted NPV of the asset is lower.
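A probability-weighted NPV of this kind can be sketched in a few lines. The example assumes two scenarios only (the patent survives challenge, or it falls and generic entry erodes revenue), a single validity probability, and annual cash flows discounted from year 1. All names and figures are hypothetical.

```python
def npv(cash_flows, rate):
    """Present value of annual cash flows, the first arriving at end of year 1."""
    return sum(cf / (1.0 + rate) ** year
               for year, cf in enumerate(cash_flows, start=1))


def probability_weighted_npv(protected, unprotected, p_valid, rate):
    """Blend the two scenario NPVs by the probability the patent holds."""
    if not 0.0 <= p_valid <= 1.0:
        raise ValueError("p_valid must be between 0 and 1")
    return (p_valid * npv(protected, rate)
            + (1.0 - p_valid) * npv(unprotected, rate))


# Hypothetical cash flows in $M: exclusivity holds vs. early generic entry.
protected = [400, 450, 500, 500, 500]
unprotected = [400, 450, 150, 75, 40]

for p_valid in (0.9, 0.7, 0.5):
    value = probability_weighted_npv(protected, unprotected, p_valid, rate=0.10)
    print(f"P(patent survives) = {p_valid:.0%}: value = ${value:,.0f}M")
```

Weak inventorship records enter this sketch as a lower `p_valid`, which is one way a diligence finding could translate into a price adjustment.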
Buyers conducting IP due diligence on AI-heavy biotech companies should request invention disclosure records, lab notebooks (electronic or physical), AI tool usage logs, and evidence of human decision-making at each stage of the discovery process. The absence of that documentation is a red flag, not a neutral data point.
Key Takeaways, Part III
The ‘significant contribution’ inventorship standard is fact-specific and depends on documented human decision-making throughout the discovery process. AI-generated Markush claims without broad experimental validation are at high risk of failing both USPTO enablement requirements and EPO plausibility standards. The Amgen v. Sanofi ruling is directly applicable to AI-generated antibody and small-molecule compound claims. Inventorship documentation quality is a material M&A diligence item for AI-heavy biotech portfolios.
Part IV: How AI Strengthens Pharmaceutical Patent Positions
Platform Patents: Protecting the Discovery Engine Itself
The most durable AI-related IP strategy in pharmaceuticals is not patenting AI-generated compounds in bulk. It is patenting the AI platform that generates valuable compounds, combined with aggressive trade secret protection of the training data and model weights.
Intelligencia AI provides a direct example. The company received U.S. Patent No. 11,948,667, titled ‘System and Interfaces for Processing and Interacting With Clinical Data,’ covering novel techniques for training and deploying machine learning models to predict FDA approval likelihood. Their Portfolio Optimizer platform uses proprietary training data and algorithmic approaches to estimate probability of technical and regulatory success (PTS) for drug candidates. The patent does not cover a drug compound. It covers the predictive system, which they license to pharmaceutical companies as a SaaS product.
That structure is strategically distinct from patenting a compound the AI discovers. A platform patent, combined with continuous model updates and proprietary data accumulation, creates a competitive moat that does not erode with each patent expiration cycle. The defensibility of that moat depends on whether the core algorithmic innovations are genuinely novel and non-obvious, which requires careful claim drafting.
Schrodinger, a publicly traded computational chemistry company, has built its business around a different model: licensing computational physics-based drug design software while also co-developing compounds with pharmaceutical partners. Their FEP+ (Free Energy Perturbation) platform, which uses physics-based simulations to predict protein-ligand binding affinity, is protected by a combination of patents on the underlying methodology, trade secrets on proprietary force field parameterizations, and extensive validation datasets that competitors cannot easily replicate. BMS, Pfizer, Roche, and Lilly have all engaged Schrodinger in various collaboration structures, suggesting the platform has demonstrated sufficient predictive value to justify partnerships rather than internal replication.
Evergreening with AI: Using Computational Tools to Build Patent Thickets
Pharmaceutical companies have long used strategic IP filing practices to extend effective exclusivity beyond the expiration of a primary composition-of-matter patent. Techniques include filing secondary patents on formulations, polymorphic crystal forms, enantiomers, metabolites, prodrugs, and new therapeutic indications. AI accelerates every one of these strategies.
Consider the lifecycle management of AbbVie’s adalimumab (Humira), the best-selling drug in pharmaceutical history until biosimilar entry. AbbVie built a patent thicket of over 100 patents covering manufacturing processes, formulations, dosing regimens, and devices. Entry of biosimilar adalimumab was delayed until 2023 in the U.S. even though the core composition-of-matter patent expired in 2016. The same logic carries over to small molecules, where AI lowers the cost of each tactic. AI-assisted polymorph screening can generate and analyze dozens of crystal form candidates in a fraction of the time required by traditional wet chemistry methods. AI-powered formulation optimization can identify novel excipient combinations that support additional patent filings. That means the same evergreening tactics that AbbVie employed manually can now be deployed at lower cost and faster iteration speed.
The strategic playbook for a small-molecule drug approaching primary patent expiration includes deploying AI to identify patentable polymorph, solvate, and co-crystal forms of the active pharmaceutical ingredient; using generative chemistry to identify structurally related active metabolites that can anchor new composition-of-matter claims; and applying machine learning to clinical and real-world evidence data to identify novel dosing regimens or patient populations supporting new method-of-treatment patents. Many of these filings can be listed in the Orange Book, and each listed patent forces ANDA filers to make an additional certification. Two limits apply: FDA regulations exclude patents claiming only metabolites from Orange Book listing, and since the 2003 amendments to Hatch-Waxman, a patent listed after an ANDA is filed generally does not trigger a second 30-month stay.
Branded pharmaceutical companies should also note that AI can be used to monitor competitors’ generic pipeline activity. ANDA contents are confidential while under review, although the FDA publishes a list of drugs for which Paragraph IV certifications have been submitted; the question is which generic companies are building the most credible Paragraph IV challenges and what prior art they are likely to cite. AI tools that scan patent databases, academic literature, and computational chemistry repositories can predict the likely invalidity arguments generic challengers will deploy, giving originators lead time to file supplemental patents or draft continuation applications that foreclose those arguments.
Biosimilar Defense and the BPCIA Patent Dance
For biologic drugs, the relevant exclusivity framework is the BPCIA, which provides 12 years of reference product exclusivity for approved biologics from the date of first licensure. The ‘patent dance’ procedural framework under 42 U.S.C. Section 262(l) governs which patents are litigated and in what sequence before or after biosimilar approval. AI affects biosimilar defense in several specific ways.
Epitope mapping, which identifies the specific sites on a target antigen that an antibody binds and so informs both function and claim scope, is faster and more comprehensive with AI tools. For a monoclonal antibody like pembrolizumab (Keytruda, Merck), the relevant patents cover not just the antibody sequence but the CDR (complementarity-determining region) sequences that define antigen binding, the Fc region modifications that affect half-life, and the manufacturing process parameters that affect glycosylation patterns. AI can systematically survey patentable epitope-binding variants and Fc engineering strategies, supporting continuation filings that build the patent thicket on the biologic after the primary sequence patents expire.
Biosimilar applicants, conversely, can use AI-powered structural analysis to identify biologic sequences that are sufficiently differentiated from a reference product’s patent claims while retaining comparable clinical function. That is the computational analog of the small-molecule ANDA strategy of designing around existing patents. The FDA’s current regulatory framework for biosimilar interchangeability, established under the BPCIA and implemented through guidance documents including the May 2019 interchangeability guidance, requires demonstration of pharmacokinetic and pharmacodynamic equivalence and switching study data. That clinical bar is separate from the patent question, but both need to be cleared for a biosimilar to achieve full market access.
Key Takeaways, Part IV
AI platform patents covering discovery systems are more durable than compound patents on AI-generated molecules. Evergreening strategies, including polymorph screening, formulation optimization, and method-of-treatment filings, execute faster and cheaper with AI, making lifecycle management more accessible. Biosimilar defense using AI-powered epitope mapping and continuation filing strategies can extend effective exclusivity on biologics. The Intelligencia AI U.S. Patent No. 11,948,667 is the clearest current example of successfully patenting an AI-driven pharmaceutical decision support system.
Investment Strategy Note
Companies with proprietary AI platforms generating platform-level IP, not just compound-level IP, warrant a premium for sustainable competitive advantage in portfolio valuations. Schrodinger’s public comps and partnership activity suggest the market assigns meaningful value to computational chemistry platforms with demonstrated predictive accuracy. For branded pharma, AI-accelerated evergreening extends cash flow duration on assets approaching loss of exclusivity (LOE), which shows up as longer effective exclusivity windows in DCF models. Analysts should model that explicitly rather than assuming a hard cut-off at primary patent expiration.
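One way to model that explicitly is to replace the hard cut-off with an exclusivity period followed by an erosion tail. The sketch below assumes flat peak revenue while exclusivity lasts and a constant annual erosion rate afterward. The function name, the erosion rate, and every figure are hypothetical placeholders rather than estimates for any real product.

```python
def exclusivity_npv(peak_revenue: float, exclusive_years: int,
                    annual_erosion: float, tail_years: int,
                    discount_rate: float) -> float:
    """NPV of flat revenue during exclusivity, then a decaying post-LOE tail."""
    total = 0.0
    revenue = peak_revenue
    for year in range(1, exclusive_years + tail_years + 1):
        if year > exclusive_years:
            # After loss of exclusivity, revenue shrinks by a fixed share per year.
            revenue *= 1.0 - annual_erosion
        total += revenue / (1.0 + discount_rate) ** year
    return total


# Scenario A: hard cut-off at primary patent expiry (no tail modeled).
hard_cutoff = exclusivity_npv(1000.0, exclusive_years=5, annual_erosion=0.6,
                              tail_years=0, discount_rate=0.10)

# Scenario B: secondary patents hold for two more years, then a 3-year tail.
extended = exclusivity_npv(1000.0, exclusive_years=7, annual_erosion=0.6,
                           tail_years=3, discount_rate=0.10)

print(f"hard cut-off NPV: {hard_cutoff:,.0f}")
print(f"extended NPV:     {extended:,.0f}")
print(f"difference:       {extended - hard_cutoff:,.0f}")
```

The gap between the two scenarios is the value a hard cut-off assumption leaves out. In practice each extra year of exclusivity would itself carry a probability, since secondary patents are the ones most often challenged.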
Part V: AI in Patent Litigation and Enforcement
How Generic Manufacturers Are Using AI to Build Paragraph IV Cases
The economics of ANDA litigation have historically favored large originator companies with deep IP litigation resources. Building a credible Paragraph IV invalidity case required months of attorney and expert time conducting prior art searches, analyzing claim construction, and identifying weaknesses in issued patents. AI compresses that timeline. Patent analytics tools using machine learning can cluster prior art references by structural and functional similarity to claimed compounds, flag potential Section 103 obviousness combinations, and generate claim charts in a fraction of the time required by manual analysis.
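Clustering references by structural similarity usually starts with a pairwise similarity score. A minimal sketch using the Tanimoto coefficient over molecular fingerprints follows. The fingerprints here are hypothetical sets of integer feature IDs. A real workflow would compute them from structures with a cheminformatics toolkit, and the 0.5 threshold is an arbitrary illustration.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprint feature sets."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_prior_art(claimed: frozenset, references: dict,
                   threshold: float = 0.5) -> list:
    """Return (reference, score) pairs at or above threshold, most similar first."""
    scored = [(name, tanimoto(claimed, fp)) for name, fp in references.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprints: each integer stands for one structural feature.
claimed_compound = frozenset({1, 2, 3, 4})
prior_art = {
    "ref_A": frozenset({1, 2, 3, 4, 5}),  # close analog
    "ref_B": frozenset({1, 2, 9}),        # partial overlap
    "ref_C": frozenset({7, 8}),           # unrelated scaffold
}

for name, score in rank_prior_art(claimed_compound, prior_art):
    print(f"{name}: similarity {score:.2f}")
```

Structural similarity only flags candidates. Whether a flagged reference supports a Section 103 argument still turns on motivation to combine and reasonable expectation of success, which this score does not capture.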
Mylan (now Viatris), Teva, and Amneal have all invested in internal IP analytics capabilities that use natural language processing to analyze patent prosecution histories (file wrappers) for prosecution history estoppel arguments. If an originator narrowed a claim during prosecution to overcome a prior art rejection, that narrowing can be used against them in litigation to foreclose the doctrine of equivalents. AI-powered file wrapper analysis surfaces those prosecution history arguments faster.
The Hatch-Waxman litigation landscape for drugs with complex Orange Book listings, including those entangled with authorized generic deals, is particularly sensitive to AI-powered invalidity analysis. Drugs like duloxetine (Cymbalta, Eli Lilly), pregabalin (Lyrica, Pfizer), and esomeprazole (Nexium, AstraZeneca) generated extended Hatch-Waxman litigation periods partly because the claim scope and prior art arguments were complex enough to sustain multi-year disputes. AI tools that simplify that complexity reduce the barrier to entry for Paragraph IV filing.
IPR Proceedings and AI-Driven Invalidity Challenges
Inter partes review, created by the America Invents Act of 2011, available since September 2012, and administered by the Patent Trial and Appeal Board (PTAB), has become the primary venue for patent validity challenges outside district court litigation. The PTAB’s relatively low institution threshold (the petition must demonstrate a reasonable likelihood that at least one claim is unpatentable) combined with the absence of a standing requirement has made IPR the preferred first strike for pharmaceutical patent challenges.
AI tools for prior art search and claim mapping are directly applicable to IPR petition preparation. A petitioner needs to identify prior art references that, alone or in combination, render claims obvious under Section 103. AI models trained on chemical databases and patent corpora can do that search faster and with broader coverage than manual review. For petitioners challenging broad composition-of-matter claims on AI-generated compound classes, the relevant prior art may include computational chemistry publications that a manual search would not surface.
The PTAB has also seen an increase in petitions from non-generic challengers, including hedge funds and specialized patent litigation funding entities. Entities like Hayfin Capital Management and Longford Capital have invested in pharmaceutical IPR petitions as a litigation finance strategy, taking positions in biosimilar or generic companies and then funding IPR challenges to weaken the originator’s patent thicket. AI-powered invalidity analysis reduces the upfront investment required to assess whether an IPR petition has merit, potentially increasing the frequency of these challenges.
Defensive AI Monitoring: Watching for Infringement and Filing Patterns
AI-powered patent monitoring is standard practice for large pharmaceutical IP groups. Tools including Derwent Innovation (owned by Clarivate), PatSnap, and CPA Global’s annuity management platform provide machine learning-based alerts for new patent filings, ANDA submissions, and competitor claim activity. The operational value for a branded pharma IP team is early detection of competitor filings that could encroach on licensed territory or signal pipeline programs that compete with internal assets.
The more sophisticated use is predictive: using AI to forecast where a competitor’s patent filing strategy is heading before they file. If Eli Lilly’s generative chemistry publications cluster around specific kinase inhibitor scaffolds, that is a signal worth tracking months before any patent applications appear. Patent applications are generally published 18 months after their earliest filing date under 35 U.S.C. Section 122(b). An AI-powered monitoring system that tracks academic preprints, conference presentations, and scientific job postings can provide earlier signal on competitor R&D direction than waiting for patent publications.
Key Takeaways, Part V
AI compresses the timeline and cost of building Paragraph IV invalidity cases, changing Hatch-Waxman settlement dynamics. IPR petition preparation is a natural application for AI prior art search and claim mapping tools. Patent monitoring using AI can detect competitor filing patterns earlier than the standard 18-month publication lag. Litigation finance entities are deploying AI-powered patent analysis to assess IPR petition merit, potentially increasing challenge frequency against branded pharmaceutical IP.
Part VI: Trade Secrets, Data Exclusivity, and Hybrid Protection Architecture
When Patents Are the Wrong Tool: The Trade Secret Case
For pharmaceutical companies whose competitive advantage lies in the AI platform rather than any individual compound, trade secret protection may be more durable than patents for core algorithmic components. The Defend Trade Secrets Act (DTSA) of 2016 provides a federal civil cause of action for trade secret misappropriation, and its definition of trade secret is broad enough to cover training datasets, model architectures, and hyperparameter configurations.
The strategic logic for choosing trade secrets over patents for AI drug discovery platforms is straightforward. A patent requires full public disclosure and generally expires 20 years from its filing date. A trade secret, if properly maintained, has no expiration date and requires no disclosure. The risk is that trade secret law gives no protection against independent development or reverse engineering. For a pharmaceutical company that has assembled a proprietary dataset over decades of internal R&D, the dataset itself may be the most defensible component of the platform, particularly if it includes clinical outcomes data, biomarker correlations, and patient-level pharmacokinetic measurements that competitors cannot easily replicate.
Exscientia, the Oxford-based AI drug discovery company, has taken a hybrid approach. The company has filed patents on specific compounds generated by its platform, including DSP-1181 (developed in partnership with Sumitomo Dainippon Pharma), which entered Phase I clinical trials in 2020 as the first AI-designed drug to reach human testing. The platform architecture itself is protected through trade secrets and contractual confidentiality in its collaboration agreements. That separation of what gets patented from what stays secret is a deliberate structural choice, not a default.
Regulatory Exclusivity as a Complement to Patent Protection
Patent protection and regulatory exclusivity operate independently, and understanding both is necessary for accurate IP valuation. The key regulatory exclusivity periods relevant to AI-discovered drugs are as follows.
New Chemical Entity (NCE) exclusivity under the Drug Price Competition and Patent Term Restoration Act (Hatch-Waxman) provides five years of data exclusivity from the date of first FDA approval of a new chemical entity. During that period, the FDA cannot accept an ANDA that references the innovator’s clinical data, except that an ANDA containing a Paragraph IV certification may be submitted after four years. NCE exclusivity runs independently of patent term and may overlap with it. A drug that is AI-discovered but built on a genuinely novel chemical scaffold qualifies for NCE exclusivity regardless of how the compound was identified.
New Biologic Exclusivity under the BPCIA provides 12 years of reference product exclusivity from the date of first licensure, during which the FDA cannot approve a biosimilar. The first four years of that period also operate as a filing bar: the FDA will not accept a biosimilar application until four years after the reference product’s first licensure. AI-discovered biologics, including novel antibody sequences or protein therapeutics identified by computational screening, qualify for BPCIA exclusivity on the same basis as conventionally discovered biologics.
Orphan Drug Exclusivity provides seven years of market exclusivity for drugs approved for rare diseases affecting fewer than 200,000 people in the U.S. AI tools are particularly useful for identifying new indications for existing drugs that qualify for orphan designation, a drug repurposing application that can generate entirely new exclusivity periods. Agios Pharmaceuticals’ enasidenib (Idhifa) and ivosidenib (Tibsovo) were developed for IDH-mutant leukemia subpopulations identified through genomic and computational analysis. Both received orphan drug designation, and the seven-year exclusivity that attaches on approval runs alongside their composition-of-matter patents rather than being added to the patent term.
Pediatric exclusivity adds six months to existing regulatory exclusivity periods and to the patent-based protection on listed patents, in exchange for conducting FDA-requested pediatric studies. It does not extend the patent term itself; it delays the date on which the FDA can approve a competing application. For a drug nearing the end of its primary patent term, that six months of additional exclusivity on a blockbuster can be worth hundreds of millions of dollars. AI can identify whether pediatric indication data is realistically achievable based on disease epidemiology and historical trial design parameters, making it a screening tool for whether the pediatric exclusivity strategy is worth pursuing.
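The exclusivity periods above reduce to date arithmetic from the approval date. The following is a minimal sketch using the durations stated in this section; the function and dictionary names are hypothetical, and it ignores real-world complications such as which approval date triggers each period or how multiple exclusivities interact.

```python
from datetime import date
import calendar

# Durations in months, taken from the periods described above.
EXCLUSIVITY_MONTHS = {"nce": 60, "biologic": 144, "orphan": 84}
PEDIATRIC_EXTENSION_MONTHS = 6


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole months, clamping to the month's last day."""
    total = d.month - 1 + months
    year = d.year + total // 12
    month = total % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def exclusivity_end(approval: date, kind: str, pediatric: bool = False) -> date:
    """Estimate when a regulatory exclusivity period ends for a given approval date."""
    months = EXCLUSIVITY_MONTHS[kind]
    if pediatric:
        months += PEDIATRIC_EXTENSION_MONTHS
    return add_months(approval, months)
```

Under these assumptions, a new chemical entity approved on 15 March 2024 would hold NCE exclusivity until 15 March 2029, or 15 September 2029 with a pediatric extension.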
Data Ownership and Training Dataset IP
The pharmaceutical industry has not fully resolved who owns the IP generated by AI systems trained on external datasets. If a drug company trains a generative model on publicly licensed ChEMBL data and the model generates a novel compound, the compound itself may be patentable, but the training data is not proprietary. The competitive advantage comes from the model architecture and any proprietary fine-tuning data added on top of the public base.
Companies that have assembled proprietary clinical datasets, either through internal trials or data licensing agreements, are in a stronger position. Real-world evidence data from electronic health records, compiled by companies including Flatiron Health (Roche subsidiary) and IQVIA’s Real-World Solutions business, is extremely valuable for training AI models that predict clinical outcomes. Roche’s acquisition of Flatiron for $1.9 billion in 2018 was partly motivated by the data asset, which provides training material for AI oncology applications unavailable to competitors who have not assembled comparable longitudinal patient datasets.
The legal treatment of training data as IP is unsettled. Copyright in a dataset may protect the compilation against direct copying but may not prevent a competitor from training their own model on a publicly available dataset that contains equivalent information. Trade secret protection for training datasets requires that the company take reasonable measures to maintain secrecy, which typically means restrictive access controls, confidentiality agreements, and data licensing terms that prohibit disclosure to third parties.
Key Takeaways, Part VI
Trade secret protection is more appropriate than patent protection for core AI platform architecture and proprietary training datasets, particularly where the dataset is the primary competitive moat. Regulatory exclusivity, including NCE exclusivity, BPCIA reference product exclusivity, orphan drug designation, and pediatric exclusivity, provides time-limited market protection that is independent of and complementary to patent protection. AI-accelerated drug repurposing for orphan indications is a specific application that generates new exclusivity periods without requiring novel chemistry. Proprietary training datasets with longitudinal clinical outcome data are the most defensible component of AI drug discovery platforms and should be valued separately in M&A diligence.
Investment Strategy Note
For analysts evaluating AI-heavy biotech acquisitions, the trade secret and regulatory exclusivity profile deserves as much diligence as the patent portfolio. A company with thin patent protection but strong BPCIA or orphan drug exclusivity on a clinical-stage compound may have more durable market protection than one with a large but vulnerable composition-of-matter patent subject to IPR challenge. Adjust exclusivity duration assumptions in DCF models to reflect the strongest applicable exclusivity mechanism, not just the patent term.
Part VII: Specific Company IP Strategies and Case Studies
Recursion Pharmaceuticals: Platform Scale vs. Patent Depth
Recursion’s approach to IP illustrates the tension between platform breadth and patent defensibility. The company runs an automated biology platform that generates roughly 2.4 petabytes of biological imaging data per year, which it uses to train phenotypic screening models across dozens of disease areas. That scale produces a large pipeline of computational hits, which Recursion then advances through automated chemistry and in vivo validation.
The IP challenge for Recursion is that platform-scale generation of compound hits produces exactly the kind of broad Markush claims that face plausibility objections. WO2024039689, the RBM39 modulator application cited earlier, is a visible example of the tension between AI-scale generation and patent office requirements for experimental support. Recursion’s long-term IP strategy requires either investing in broader experimental validation before filing or accepting narrower claim scope that focuses on specific, heavily validated lead compounds.
Recursion’s market capitalization as of early 2026 is driven partly by platform narrative and partly by the clinical pipeline, which includes programs in CHD2 haploinsufficiency, Flynn syndrome, and cerebral cavernous malformation. The IP value of the clinical pipeline depends heavily on whether the underlying composition-of-matter patents survive challenge. Analysts should treat Recursion’s platform IP as generating option value rather than certain exclusivity, and apply a probability-weighted scenario analysis to the patent estate.
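A probability-weighted scenario analysis of this kind can be sketched in a few lines. The scenario names, probabilities, and values below are invented placeholders, not estimates for Recursion or any other company; the point is only the mechanics of weighting outcomes.

```python
def expected_value(scenarios: dict) -> float:
    """Probability-weighted value across mutually exclusive patent outcomes.

    `scenarios` maps a label to a (probability, value) pair.
    Probabilities must sum to 1.
    """
    total_p = sum(p for p, _ in scenarios.values())
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError(f"probabilities sum to {total_p}, expected 1.0")
    return sum(p * v for p, v in scenarios.values())


# Hypothetical inputs (values in $M), for illustration only.
patent_estate = {
    "claims upheld as issued": (0.5, 800.0),
    "narrowed to validated lead compounds": (0.3, 500.0),
    "core claims invalidated": (0.2, 100.0),
}
estate_value = expected_value(patent_estate)
```

Keeping each outcome explicit makes it easy to see how much of the headline value rests on the assumption that broad claims survive challenge.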
Insilico Medicine: Generative Chemistry with Clinical Validation
Insilico Medicine’s INS018_055 program for idiopathic pulmonary fibrosis (IPF) is the closest thing to a proof case for AI-generated drug patenting done correctly. The compound was identified using Insilico’s Chemistry42 generative model, synthesized, validated in in vitro and in vivo assays, and advanced to Phase II clinical trials with a preclinical timeline of approximately 18 months. That experimental validation history supports a patent estate that is substantially more defensible than a broad Markush claim unsupported by synthesis data.
The company filed composition-of-matter patents on INS018_055 with experimental data from synthesis and in vitro assays, and method-of-treatment patents on its use in pulmonary fibrosis. That two-layer protection structure, with compound patents and method patents, provides multiple independent bases for exclusivity that a generic developer must clear separately. If the compound patent is challenged and invalidated, the method patent may survive and still block an ANDA filer from marketing a generic version of the compound for the patented indication.
Insilico’s IP strategy illustrates that AI-generated drugs can support robust patent protection when the company invests in experimental validation before filing rather than attempting to claim the entire output of a generative model.
Intelligencia AI: Patenting the Prediction Engine
Intelligencia AI’s U.S. Patent No. 11,948,667 covers a system and method for training machine learning models on clinical trial data to predict FDA approval probability. The patent protects not a drug but a decision support system for pharmaceutical portfolio management. Their Portfolio Optimizer platform generates PTS scores for drug candidates based on modality, indication, mechanism of action, clinical stage, and trial design parameters, with 83% prospectively validated accuracy on Phase II oncology programs.
The commercial model is SaaS licensing to pharma and biotech companies for pipeline prioritization. The patent protection on the system creates a barrier to competitors building equivalent predictive tools, though the scope of that protection depends on how broadly the claims read on competitor implementations. The more durable competitive moat is the proprietary training dataset, which contains historical FDA approval and failure data that Intelligencia has assembled and curated, and which competitors cannot easily replicate.
From an investor perspective, Intelligencia’s patent represents a category that will become more common: IP on AI systems that assist pharmaceutical decision-making rather than on drugs themselves. That category has different valuation dynamics from compound IP. Platform patents in pharmaceutical software compete with trade secret protection and network effects rather than with regulatory exclusivity periods, and their durability depends on continued innovation in the underlying models.
Part VIII: Global Patent Landscape for AI Drug Discovery
The EPO’s Position on AI-Generated Claims
The European Patent Office has taken a more structured approach to AI inventions than the USPTO. Under the EPO’s Guidelines for Examination (updated periodically, most recently incorporating AI-specific guidance in 2024), inventions involving AI techniques are patentable if they have a technical character and produce a technical effect going beyond the normal physical interactions involved in running the software. That technical-effect requirement is a different test from the USPTO’s Section 101 patent eligibility analysis, and it has its own complexity for AI drug discovery applications.
For AI-generated pharmaceutical compounds, plausibility is the primary examination hurdle at the EPO, arising under Article 56 EPC (inventive step) and Article 83 EPC (sufficiency of disclosure). The EPO requires that a patent application make it plausible that the claimed compounds work as stated, across the full scope of the claims, based on data in the application as filed. Post-filing data can be submitted to support patentability in some circumstances, but the basic plausibility standard must be met from the application as filed. That creates direct tension with broad AI-generated Markush claims that lack extensive synthesis and assay data.
The EPO Enlarged Board of Appeal’s decision in G 2/21 (2023), widely referred to as the plausibility case, held that post-filing evidence cannot be disregarded merely because it was filed after the application date, but that an applicant may rely on a technical effect for inventive step only if the skilled person, reading the application as filed with common general knowledge in mind, would derive that effect as encompassed by its technical teaching. That decision is directly relevant to AI-generated compound patents: if the application as filed gives the skilled person no basis for the claimed activity, post-filing data is unlikely to rescue it.
UK and Commonwealth Jurisdictions Post-Brexit
The EPO is not an EU institution, so Brexit did not change the UK’s membership of the European Patent Convention: European patents granted by the EPO can still take effect in the UK. The UK Intellectual Property Office examines national applications independently, though the two systems share examination approaches on most technical questions. The UK Court of Appeal upheld the position that AI cannot be named as an inventor (Thaler v. Comptroller-General of Patents, 2021). The UK Supreme Court subsequently heard the case and ruled in 2023 that DABUS could not be named as inventor under UK patent law, consistent with the EPO and US positions.
For pharmaceutical companies filing globally, the UK’s post-Brexit examination practice on AI-generated inventions is still developing. The UKIPO has published guidance on AI and IP but has not yet issued sector-specific pharmaceutical guidance. The practical advice for global filings is to use consistent inventorship documentation and experimental validation strategies across jurisdictions rather than attempting jurisdiction-specific filing strategies that are not yet clearly supported by examination practice.
China: A Different IP Environment for AI Drug Discovery
China’s National Intellectual Property Administration (CNIPA) has taken a distinct approach to AI-related patents, one driven partly by national policy to build AI capabilities and partly by the technical examination practice developed for a Chinese pharmaceutical market with its own regulatory structure. China’s Patent Law, revised in 2021, does not explicitly address AI inventorship, but CNIPA examination guidelines indicate that computer-implemented inventions, including AI drug discovery tools, are patentable if they have technical character and produce a technical effect.
For pharmaceutical compounds discovered by AI, CNIPA practice generally requires the same novelty, inventive step (non-obviousness), and industrial applicability showings as other jurisdictions. The plausibility requirement is present but operationalized through industrial applicability rather than through the EPO’s plausibility doctrine. Chinese pharmaceutical companies including WuXi AppTec, which operates one of the world’s largest CRO platforms, and BeiGene, which uses AI in its oncology pipeline development, are active filers in this space.
For global pharmaceutical IP strategies, China matters both as a jurisdiction where patent protection is sought and as a source of AI-generated prior art. Chinese academic institutions including Peking University and the Shanghai Institute of Materia Medica have published extensively on generative chemistry, and those publications enter the global prior art base.
Key Takeaways, Parts VII and VIII
Recursion’s platform-scale approach generates large pipelines but faces plausibility challenges on broad Markush claims. Insilico Medicine’s INS018_055 demonstrates that AI-generated drugs can support defensible patents when experimental validation precedes filing. Intelligencia AI’s platform patent represents a growing category of IP on pharmaceutical decision support systems rather than drug compounds. EPO G 2/21 directly limits the use of post-filing data to rescue AI-generated compound patents that lack plausibility from the application as filed. China is both a relevant filing jurisdiction and a significant source of AI-generated pharmaceutical prior art.
Part IX: Building a Defensible AI Drug Patent Portfolio
The Documentation Architecture
The single most important operational change pharmaceutical R&D organizations need to make is implementing systematic documentation of human decision-making throughout AI-assisted discovery workflows. That documentation is the foundation of inventorship claims and the primary defense against invalidity arguments that the ‘inventor’ was the AI system.
Documentation should capture who defined the target and why, what training data was selected and by whom, what constraints were placed on the generative model’s output, which compounds were selected from AI-generated candidates and the scientific rationale for that selection, what experimental validation was performed and how results were interpreted, and how human researchers modified AI-suggested structures in subsequent iterations. Each of those decision points is a potential inventorship contribution. Without documentation, even a human who made all of those decisions cannot prove their contribution in a later inventorship dispute.
Electronic lab notebooks (ELNs) with AI integration are the obvious technical solution. Platforms including LabArchives, LabVantage, and Dotmatics support timestamped record-keeping that provides contemporaneous evidence of research decisions. The metadata trail from AI tool usage, showing which model version was used, what inputs were provided, and what outputs were generated, should be preserved alongside the human researcher’s documented interpretation of those outputs.
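The decision points and metadata described above could be captured in a record like the one sketched below. This is a hypothetical structure, not the schema or API of LabArchives, LabVantage, Dotmatics, or any other ELN; all field names are assumptions chosen to mirror the list in this section.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import hashlib
import json


@dataclass(frozen=True)
class DecisionRecord:
    """One human decision in an AI-assisted discovery workflow (hypothetical schema)."""
    researcher: str
    decision_type: str   # e.g. "target_definition", "candidate_selection"
    rationale: str       # the scientific reasoning, in the researcher's own words
    model_version: str   # which AI model version produced the outputs reviewed
    model_inputs: dict   # constraints and prompts supplied to the model
    model_outputs: list  # identifiers of the candidates the model returned
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def fingerprint(self) -> str:
        """SHA-256 over the serialized record, so later alteration is detectable."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()
```

Storing the fingerprint separately from the record, for instance in an append-only log, is one way to show later that the entry was made at the time and has not been edited since.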
Claim Scope Calibration: The Goldilocks Problem
Pharmaceutical patent claims need to be broad enough to prevent easy design-around by competitors, but narrow enough to satisfy enablement and written description requirements across the full claimed scope. AI-generated compound patents make this balance harder because the generative model may produce structurally diverse candidates that the applicant cannot experimentally validate before filing.
The practical approach used by experienced pharmaceutical patent counsel is tiered claiming. The broadest genus claims are drafted with the understanding that they may face enablement objections and may not survive IPR. Narrower species claims covering specific synthesized and tested compounds are included as dependent claims or in continuation applications, providing fallback positions that are more likely to survive challenge. Method-of-treatment claims on the therapeutic application of validated lead compounds provide a third layer of protection that is independent of the composition claims.
For biologics, the Supreme Court’s 2023 Amgen v. Sanofi ruling creates a direct constraint: functional antibody claims covering an entire class defined by binding properties, without amino acid sequence disclosure for representative members across the claimed functional space, are unlikely to survive an enablement challenge. Biologic patent applicants post-Amgen should file sequence-specific claims on multiple representative antibody variants, not just functional genus claims.
Freedom-to-Operate in the Age of AI Prior Art
Freedom-to-operate (FTO) analysis for an AI-generated lead compound requires a broader search than traditional FTO. The search must cover issued patents and published patent applications in all relevant jurisdictions, academic literature including preprints on bioRxiv and ChemRxiv, AI-generated compound libraries including Enamine REAL Space, Mcule, and similar make-on-demand collections, and defensive publications filed by competitors.
Computational FTO tools that use AI to compare the chemical structure of a candidate compound against the full patent and literature databases are now commercially available. CAS (Chemical Abstracts Service) SciFinder and Reaxys both incorporate AI-assisted similarity searching. Those searches can identify prior art references that a keyword-based search would miss, particularly for structurally similar compounds described under different nomenclature or in different language patent documents.
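To illustrate what structure-based similarity searching does, the sketch below uses the Tanimoto coefficient, a widely used similarity measure in cheminformatics. It is a toy: the feature sets are placeholder strings rather than real molecular fingerprints, and nothing here describes how SciFinder, Reaxys, or any other commercial tool is actually implemented.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient: shared features divided by total distinct features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def flag_similar(candidate: frozenset, references: dict, threshold: float = 0.7) -> list:
    """Return (reference_id, score) pairs at or above the threshold, best first."""
    scored = [(ref_id, tanimoto(candidate, fp)) for ref_id, fp in references.items()]
    hits = [hit for hit in scored if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Placeholder substructure features; a real search would use computed fingerprints.
lead = frozenset({"amide", "pyridine", "sulfonamide", "fluoro"})
prior_art = {
    "ref-A": frozenset({"amide", "pyridine", "sulfonamide", "chloro"}),
    "ref-B": frozenset({"amide", "pyridine", "sulfonamide", "fluoro", "methoxy"}),
}
hits = flag_similar(lead, prior_art)
```

Because the comparison works on structural features rather than names, it is indifferent to nomenclature and document language, which is why this class of search can surface references a keyword query would miss.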
FTO opinions for AI-generated compounds should explicitly address the plausibility of prior art references, noting whether cited references include synthesis data and biological data sufficient to establish enablement. An AI-generated compound list in a preprint without synthesis data may not constitute enabling prior art under U.S. law, even if the structure overlaps with a lead compound. That distinction matters for FTO risk assessment.
Key Takeaways, Part IX
Documentation of human decision-making throughout AI-assisted discovery is the operational foundation of valid inventorship claims. Tiered claiming strategies, with genus claims, species claims, and method-of-treatment claims, provide multiple layers of protection with different vulnerability profiles. Post-Amgen, functional genus claims for biologics without representative sequence disclosure are unlikely to survive enablement challenges. FTO searches for AI-generated compounds must cover computational chemistry databases and AI-generated defensive publications, not just traditional patent and literature databases.
Part X: Investment Strategy for AI Drug Patent Portfolios
Valuing AI-Generated Pipeline Assets
Standard pharmaceutical asset valuation uses risk-adjusted NPV, discounting expected cash flows from peak sales by the probability of clinical success, patent expiration assumptions, and generic entry timing. For AI-generated pipeline assets, three additional risk factors require explicit modeling.
First, patent validity risk on AI-generated composition-of-matter claims is higher than for traditionally discovered compounds, given the plausibility rejection patterns at the USPTO and EPO and the broader prior art landscape created by AI-generated compound libraries. A 15-25% discount to expected exclusivity duration is a reasonable adjustment for early-stage AI-generated assets without extensive experimental validation backing their core composition claims.
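The effect of that adjustment on a risk-adjusted NPV can be sketched as follows. This is a deliberately simplified model with hypothetical parameter names: it assumes a flat annual cash flow during exclusivity, no sales ramp, and no post-exclusivity tail, none of which a production valuation model would assume.

```python
def rnpv(annual_cash_flow: float, exclusivity_years: float, prob_success: float,
         discount_rate: float, years_to_launch: int = 0,
         exclusivity_discount: float = 0.0) -> float:
    """Risk-adjusted NPV of flat cash flows over a (possibly shortened) exclusivity period.

    `exclusivity_discount` is the fractional haircut to exclusivity duration,
    e.g. 0.15 to 0.25 for the patent validity risk discussed above.
    """
    effective_years = exclusivity_years * (1.0 - exclusivity_discount)
    full_years = int(effective_years)
    partial_year = effective_years - full_years

    pv = 0.0
    for t in range(1, full_years + 1):
        pv += annual_cash_flow / (1.0 + discount_rate) ** (years_to_launch + t)
    if partial_year > 0:
        # Pro-rate the final, incomplete year of exclusivity.
        pv += (partial_year * annual_cash_flow
               / (1.0 + discount_rate) ** (years_to_launch + full_years + 1))
    return prob_success * pv
```

Running the function with and without the haircut on the same inputs shows how much of an asset's modeled value depends on the composition claims surviving for their full term.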
Second, inventorship risk affects asset transferability. A patent with unclear inventorship is vulnerable to challenge in litigation and creates complications in licensing and M&A transactions. That risk does not typically affect early-stage valuation but becomes material at Phase II or later when the asset is likely to attract licensing interest or acquisition discussions.
Third, platform IP creates option value that is distinct from the compound pipeline. A pharmaceutical company with a validated AI platform that demonstrably accelerates discovery has an asset that generates future pipeline candidates at lower cost than competitors using traditional methods. That platform option value is not captured in standard pipeline NPV models and requires separate treatment, typically as a real options calculation based on estimated platform productivity.
M&A and Partnership Due Diligence Checklist
For analysts conducting due diligence on a biotech acquisition or licensing transaction involving AI-generated assets, the IP diligence process should include the following elements.
Review the invention disclosure records for all patents in the pipeline, specifically looking for evidence of documented human contribution to each claimed invention.
Request AI tool usage logs for all preclinical discovery programs.
Assess the breadth of composition-of-matter claims relative to the experimental validation data supporting those claims, specifically looking for plausibility vulnerabilities.
Check whether key patents have been cited in any IPR petitions or PTAB proceedings.
Review the Orange Book listings for any approved products and assess the strength of each listed patent for Paragraph IV challenge exposure.
Evaluate the training data provenance for the company’s AI platform, including whether any training data was licensed from third parties under terms that could affect freedom to use.
Assess the trade secret protection protocols for the AI platform, including employee confidentiality agreements, access controls, and documentation that reasonable measures to maintain secrecy are in place.
For biologics programs, assess whether the biosimilar patent dance provisions under 42 U.S.C. Section 262(l) have been properly prepared for any approved products, and whether a comprehensive continuation filing strategy is in place to extend the patent thicket beyond the primary sequence patents.
Conclusion
The framing of AI as either friend or enemy to pharmaceutical patents is too simple for practical use. AI is a tool that amplifies strategy. Companies with rigorous inventorship documentation, calibrated claim scope, and tiered protection architectures across patents, trade secrets, and regulatory exclusivity will use AI to build more durable IP portfolios at lower cost. Companies that treat AI as a machine for bulk-generating patent applications will face plausibility rejections, IPR challenges, and the kind of invalidation that erodes asset value precisely when it is most visible to the market.
The most important legal developments to watch are the continued application of Amgen v. Sanofi to AI-generated compound claims at both the USPTO and EPO, the PTAB’s handling of IPR petitions that cite AI-generated prior art as anticipating references, and the emergence of any DTSA litigation involving pharmaceutical AI training datasets. Each of those developments will clarify the rules of the game for a technology that is already reshaping how drugs are discovered, patented, and challenged.
The companies positioning themselves correctly, including those building proprietary training datasets, documenting human contribution at every discovery stage, and filing tiered protection strategies across compound and platform IP, are the ones whose AI investments will translate into durable market exclusivity rather than expensive patent estates that collapse under challenge.
This analysis is for informational purposes for pharmaceutical industry professionals. Nothing in this document constitutes legal advice. Patent eligibility determinations are fact-specific and require counsel familiar with the specific invention and applicable law.