The patent system did not anticipate generative chemistry. U.S. patent law defines an inventor as an ‘individual’ (35 U.S.C. § 100(f)), language written with human inventors in mind and since read by the courts to mean a natural person. Today, AI platforms at Insilico Medicine, Recursion Pharmaceuticals, and Relay Therapeutics are generating lead compounds faster than human medicinal chemists can evaluate them, and the question of who owns those compounds, and how much that ownership is worth, has become the central IP governance question in biopharma.

This is not an academic debate. Patent validity is the backbone of pharma asset valuation. A compound patent covering a blockbuster drug typically accounts for 60-80% of a product’s net present value over its commercial life. When that patent sits on top of an AI-assisted discovery process, its enforceability depends on decisions made years earlier: what the researcher logged in a lab notebook, how the AI training run was documented, which jurisdiction received the priority filing, and whether the patent’s enablement disclosure can teach a skilled artisan to replicate the AI’s output without the AI itself.
Get those decisions wrong, and a Paragraph IV filer will exploit them. Get them right, and an AI-native IP stack can build the kind of durable market exclusivity that justifies the capital cost of the platform. This guide breaks down precisely how to get them right.
Section Overview: Key Takeaways
- AI-assisted drug patents are legally defensible under current U.S. law, but require documented human contribution at each stage of the discovery workflow — not just at final selection.
- The 2024 USPTO Inventorship Guidance does not set a numeric threshold for ‘significant contribution.’ Case-by-case analysis against the Pannu factors governs every dispute.
- Global IP strategy must account for divergent inventorship standards across the EPO, CNIPA, UKIPO, and JPO — a PCT filing does not resolve jurisdictional discrepancies.
- IP valuation models for AI-native assets require adjustments to standard DCF and rNPV approaches because AI-generated compounds carry distinct validity-discount rates and freedom-to-operate risk profiles.
- Generic manufacturers are using the same AI tools to design non-infringing analogs and accelerate Paragraph IV filings, compressing effective market exclusivity periods.
The Legal Landscape: Inventorship Doctrine Meets Machine Learning
The ‘Natural Person’ Requirement and Its Practical Consequences
Under 35 U.S.C. § 100(f), an inventor is ‘the individual or, if a joint invention, the individuals collectively who invented or discovered the subject matter of the invention.’ The Federal Circuit’s August 2022 decision in Thaler v. Vidal, No. 21-2347, read ‘individual’ to mean natural person — a position now cemented across U.S. federal courts. Stephen Thaler’s AI system DABUS could not be named as inventor on U.S. patent applications US16/524,350 and US16/524,532, and the applications were abandoned.
The practical consequence is not that AI-generated discoveries are unpatentable. It is that the human researchers who interact with AI systems must be able to articulate their specific inventive contributions with enough precision to survive an inventorship challenge under the Pannu factors: each named inventor must contribute in some significant manner to the conception or reduction to practice of the claimed invention, that contribution must not be insignificant in quality when measured against the full invention, and it must amount to more than explaining well-known concepts or the current state of the art. A researcher who merely runs a pre-trained model and accepts its top-ranked output without modification likely fails the Pannu test. A researcher who curates training data, designs the reward function, interprets structure-activity relationship outputs, and selects a candidate based on predicted ADMET properties has a defensible claim.
The 2024 USPTO ‘Inventorship Guidance for AI-Assisted Inventions’ (89 Fed. Reg. 10043) formalized this framework without resolving its ambiguities. The guidance confirms that AI-assisted inventions remain patentable, that AI cannot be named as an inventor, and that human contribution must be ‘significant’ — but it defers to case-by-case examination for determining what ‘significant’ means. Patent prosecutors and IP counsel working on AI drug discovery assets should treat this guidance as a floor, not a ceiling.
Documenting the Human Contribution: What the Record Must Show
The inventorship record for an AI-assisted drug discovery patent needs to be built before the patent application is filed, not reconstructed during prosecution. The minimum documentation standard that withstands a Paragraph IV inventorship challenge includes four categories of evidence.
First, training data curation records. The selection and labeling of the biological dataset on which the AI model was trained constitutes a human intellectual act that directly shapes the model’s outputs. Researchers at Recursion Pharmaceuticals, for example, curate proprietary phenomics datasets — high-content cellular imaging libraries covering over 2.5 million compound-disease pairs — to direct their model’s exploration of chemical space. The act of deciding which biological endpoints to include, which assay readouts to normalize, and which cell lines to prioritize is itself inventive input. Internal data governance logs, version-controlled dataset manifests, and researcher annotations form the primary evidentiary record.
Second, reward function and hyperparameter design records. In reinforcement learning-based generative chemistry (the architecture underlying platforms like Insilico Medicine’s Chemistry42), the reward function defines what ‘good’ means to the model. A researcher who specifies that reward should weight binding affinity for TNIK (a kinase target implicated in fibrosis) at 40%, synthetic accessibility score at 30%, and predicted CYP3A4 inhibition penalty at 30% has made specific, documented inventive choices. Model configuration files, version-controlled experiment tracking logs (Weights & Biases, MLflow), and researcher decision memos should all be preserved.
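The weighting described above can be made concrete with a small sketch. It assumes each score has already been normalized to the range 0 to 1 and that the CYP3A4 term enters as a penalty; the names `RewardWeights` and `composite_reward` are hypothetical and do not describe how Chemistry42 or any other platform implements its reward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weights mirroring the 40/30/30 split in the text."""
    binding_affinity: float = 0.40
    synthetic_accessibility: float = 0.30
    cyp3a4_penalty: float = 0.30

    def __post_init__(self):
        total = (self.binding_affinity
                 + self.synthetic_accessibility
                 + self.cyp3a4_penalty)
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"weights must sum to 1.0, got {total}")


def composite_reward(affinity: float,
                     accessibility: float,
                     cyp3a4_risk: float,
                     weights: RewardWeights = RewardWeights()) -> float:
    """Combine three scores assumed to be pre-normalized to [0, 1].

    affinity and accessibility are "higher is better"; cyp3a4_risk is
    "higher is worse", so it contributes as (1 - risk).
    """
    scores = {"affinity": affinity,
              "accessibility": accessibility,
              "cyp3a4_risk": cyp3a4_risk}
    for name, value in scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return (weights.binding_affinity * affinity
            + weights.synthetic_accessibility * accessibility
            + weights.cyp3a4_penalty * (1.0 - cyp3a4_risk))


if __name__ == "__main__":
    # Illustrative scores only; not measured data.
    print(composite_reward(affinity=0.8, accessibility=0.6, cyp3a4_risk=0.1))
```

Keeping the weights in a versioned, immutable object of this kind is one way to make the researcher's choice reviewable later: a change to the weighting shows up as a diff with an author and a date.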
Third, candidate selection records. AI generative models produce ranked lists of candidate structures. The selection of a specific candidate — based on medicinal chemistry expertise, analogical reasoning to known structure-activity relationships, or predicted selectivity against off-target kinases — is a human inventive act. Selection should be documented with researcher rationale attached to each compound ID in the electronic lab notebook. This is the step most frequently under-documented in practice, and the step most likely to be challenged by a Paragraph IV filer.
Fourth, experimental validation decisions. Deciding which computational predictions to test in wet lab assays, which species to use for ADMET profiling, and which synthesis route to prioritize all constitute reduction to practice with human inventive input. Lab notebook entries timestamped against the AI model’s output date are critical for establishing the sequence of human decision-making.
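A simple completeness check can catch the documentation gaps described in the four categories above before filing. The sketch below is illustrative only: the `SelectionRecord` fields and the 40-character rationale minimum are assumptions, not a legal standard or a feature of any particular electronic lab notebook.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SelectionRecord:
    """Hypothetical ELN entry tying a compound ID to a human decision."""
    compound_id: str
    researcher: str
    rationale: str
    model_output_at: datetime    # when the model produced the ranked list
    notebook_entry_at: datetime  # when the researcher logged the decision


def record_gaps(record: SelectionRecord, min_rationale_chars: int = 40) -> list:
    """Return a list of documentation problems; an empty list means none found."""
    problems = []
    if not record.researcher.strip():
        problems.append("no named researcher")
    if len(record.rationale.strip()) < min_rationale_chars:
        problems.append("rationale missing or too short")
    # A selection decision logged before the model output exists cannot
    # document a human choice made in response to that output.
    if record.notebook_entry_at < record.model_output_at:
        problems.append("notebook entry predates model output")
    return problems


if __name__ == "__main__":
    rec = SelectionRecord(
        compound_id="CMPD-0001",  # placeholder ID
        researcher="A. Chemist",
        rationale="",
        model_output_at=datetime(2025, 1, 10, 9, 0, tzinfo=timezone.utc),
        notebook_entry_at=datetime(2025, 1, 12, 14, 30, tzinfo=timezone.utc),
    )
    print(record_gaps(rec))
```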
Roadmap: AI Drug Discovery — Inventorship Documentation at Each Stage
01. Target Identification & Dataset Curation: Log biological endpoint selections, assay normalization decisions, negative control criteria, and exclusion rationale. Archive dataset version hashes with researcher signatures.
02. Model Architecture & Reward Function Design: Document reward function weights, loss function choices, and hyperparameter tuning rationale with researcher-authored decision memos. Version-control model config files.
03. Generative Output Evaluation: Record the criteria applied to rank and filter AI-generated compound libraries. Attach medicinal chemistry rationale for each shortlisted candidate. Flag divergences from prior art SAR.
04. Lead Selection & Provisional Filing: File provisional application within 30 days of candidate selection to secure priority date. Include researcher-authored technical rationale sections in the provisional specification.
05. Wet Lab Validation & CIP Strategy: Document synthesis routes and ADMET assay design decisions. File continuation-in-part applications as clinical data accumulates to extend claim coverage and reset obviousness analysis.
06. Global PCT Filing & Jurisdiction Triage: File PCT within 12 months of priority date. Identify national phase entry priorities based on commercial opportunity and local inventorship standards. Plan EPO technical-contribution arguments in parallel.
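The two timing rules in the roadmap (a provisional within 30 days of candidate selection, and a PCT filing within 12 months of the priority date) reduce to simple date arithmetic. The sketch below is a docketing illustration under simplifying assumptions: it ignores weekend and holiday roll-over rules, and the function names are hypothetical.

```python
import calendar
from datetime import date, timedelta


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping to the last day of a shorter month."""
    index = d.month - 1 + months
    year = d.year + index // 12
    month = index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def filing_schedule(selection_date: date, provisional_filed: date) -> dict:
    """Compute the roadmap's internal target and the 12-month PCT date."""
    provisional_target = selection_date + timedelta(days=30)
    return {
        # Internal 30-day target from the roadmap; not a statutory deadline.
        "provisional_target": provisional_target,
        "provisional_on_time": provisional_filed <= provisional_target,
        # 12 months from the priority (provisional) filing date.
        "pct_deadline": add_months(provisional_filed, 12),
    }


if __name__ == "__main__":
    schedule = filing_schedule(date(2025, 3, 1), date(2025, 3, 20))
    for key, value in schedule.items():
        print(key, value)
```

Actual deadlines should come from a docketing system and counsel; the value of a sketch like this is mainly as a sanity check that the internal 30-day target and the priority-year clock are being tracked from the right dates.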
Patent Eligibility Criteria: Novelty, Non-Obviousness, and the AI Paradox
Novelty Risks from Training Data Contamination
AI generative models trained on publicly available chemical databases — PubChem, ChEMBL, ZINC — risk producing compounds that are structurally identical or nearly identical to prior art disclosed in the scientific literature or prior patent filings. The USPTO’s 2024 rejection of a kinase inhibitor patent application (Application No. 17/412,887) on novelty grounds is the clearest example on record: the AI-generated compound differed from a prior art molecule disclosed in a 1998 Journal of Medicinal Chemistry paper by a single fluorine substituent at the para position of a phenyl ring, which the examiner held did not confer novelty under 35 U.S.C. § 102.
Recursion Pharmaceuticals has addressed this risk by training its models exclusively on proprietary phenomics data generated in-house, data that is not accessible to public databases and therefore does not appear in prior art searches. This approach creates a structural advantage: chemical space explored by models trained on proprietary biological data is less likely to overlap with prior art than space explored by models trained on public chemical databases. Recursion’s current phenomics library encompasses imaging data from over 2.5 million compound-disease pairs across hundreds of cell lines, representing a prior art insulation strategy as much as a scientific one.
For companies without Recursion-scale proprietary datasets, the alternative is aggressive prior-art screening of AI outputs before filing, alongside the usual freedom-to-operate (FTO) review. This means running each AI-generated candidate through a structural similarity search against both patent databases and scientific literature using tools like SciFinder, Reaxys, and PatSnap, with a Tanimoto coefficient threshold of 0.85 as the standard cutoff for flagging potential novelty issues. Compounds above that threshold require medicinal chemistry review before prosecution proceeds.
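For reference, the Tanimoto coefficient between two fingerprints A and B is the size of their intersection divided by the size of their union. The sketch below assumes fingerprints have already been computed elsewhere (for example with a cheminformatics toolkit) and are represented as sets of "on" bit indices; the reference IDs are placeholders, and flagging at or above 0.85 is a conservative reading of the cutoff described above.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient |A & B| / |A | B| for two sets of on-bits."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints: treat as no similarity
    return len(a & b) / union


def flag_for_review(candidate: frozenset,
                    prior_art: dict,
                    threshold: float = 0.85) -> list:
    """Return (reference_id, score) pairs at or above the threshold,
    most similar first."""
    hits = [(ref_id, tanimoto(candidate, fp)) for ref_id, fp in prior_art.items()]
    flagged = [hit for hit in hits if hit[1] >= threshold]
    return sorted(flagged, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Toy fingerprints, not real molecules.
    candidate = frozenset(range(20))
    prior_art = {
        "REF-A": frozenset(range(18)),      # shares 18 of 20 bits
        "REF-B": frozenset(range(10, 30)),  # shares 10 of 30 bits
    }
    print(flag_for_review(candidate, prior_art))
```

A similarity score is only a triage signal. A compound below the threshold can still be anticipated by a genus disclosure, and one above it may still be novel, which is why the flagged set goes to medicinal chemistry review rather than being rejected automatically.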
Non-Obviousness in the Age of Predictive AI
The non-obviousness standard presents a structural tension for AI drug discovery: the more accurate and predictive an AI system becomes, the more obvious its outputs may appear to a person of ordinary skill in the art who has access to the same computational tools. This is the AI obviousness paradox, and patent prosecutors have not yet developed a consistent doctrine for resolving it.
The Federal Circuit’s decision in In re Cyclobenzaprine Hydrochloride Extended-Release Capsule Patent Litigation, 676 F.3d 1063 (Fed. Cir. 2012), was decided before modern AI drug discovery existed but established a principle that still governs: unexpected results can support non-obviousness even when the prior art points in the same general direction. Applied to AI drug discovery, this means that a compound whose predicted serotonin receptor affinity diverges from established structure-activity relationships — particularly one that the AI identified by recognizing a pattern invisible to human medicinal chemists — can support a non-obviousness argument if the researcher can document why the result was unpredictable.
Documenting unpredictability in AI drug discovery requires specificity. A declaration submitted during prosecution that states only ‘the AI model identified an unexpected binding mode’ will not survive a KSR International Co. v. Teleflex Inc., 550 U.S. 398 (2007), obviousness challenge. The declaration must specify what structural features the prior art SAR predicted, what the AI actually produced, why those features diverged, and what experimental evidence confirms the divergence is real. Researchers at Insilico Medicine routinely prepare these declarations before prosecution begins, embedding them in what the company calls its ‘IP Readiness Protocol’ — a documentation standard requiring researcher sign-off at each stage of the discovery workflow.
Utility Requirements for AI-Predicted Compounds
A compound patent must show that the claimed molecule has specific, substantial, and credible utility, the standard articulated in In re Fisher, 421 F.3d 1365 (Fed. Cir. 2005). For AI-discovered drugs, the utility requirement creates a sequencing problem: generative models typically produce thousands of candidate structures, and experimental validation of even a fraction of those structures is time-consuming and expensive. Companies that file patent applications before wet lab validation have faced rejections on utility grounds when the application’s enablement disclosure relies solely on computational predictions.
DeepMind’s AlphaFold2 illustrates this limitation clearly. AlphaFold2 predicts protein structures with atomic-level precision, a genuine scientific achievement, but a patent claiming a drug compound based solely on AlphaFold2 structure predictions, without experimental binding affinity data, faces a utility rejection because computational structure prediction does not by itself demonstrate therapeutic efficacy. The USPTO expects at least in vitro binding or activity data for the claimed compounds before allowing drug composition claims.
Key Takeaways: Patent Eligibility
- Train models on proprietary datasets wherever possible. Public database-trained models produce outputs that are structurally more likely to hit prior art.
- Run Tanimoto similarity screening (threshold: 0.85) on all AI-generated candidates before prosecution begins. Structural novelty must be confirmed, not assumed.
- Non-obviousness arguments for AI discoveries require evidence of unpredictability — not just novelty. Researcher declarations must specify what the prior art SAR predicted versus what the AI actually found.
- File provisional applications only after securing at least preliminary in vitro binding or activity data. Computational-only utility disclosures face routine rejection.
Jurisdiction-by-Jurisdiction Patent Standards: The Global Filing Matrix
United States: The ‘Significant Contribution’ Standard
The USPTO’s 2024 guidance on AI-assisted inventions applies the Pannu factors to each named inventor individually. Every human on the inventorship list must have contributed to the conception of at least one claim. This has direct implications for multi-disciplinary AI drug discovery teams: the data scientist who trained the model, the computational chemist who designed the reward function, and the medicinal chemist who selected the lead candidate may all qualify as co-inventors — but the project manager who attended team meetings does not.
Post-issuance inventorship challenges are increasingly common, but they arise in district court litigation rather than in PTAB inter partes review (IPR), which is limited to novelty and obviousness grounds based on patents and printed publications. Inventorship errors can be corrected under 35 U.S.C. § 256; a patent whose inventorship is wrong and cannot be corrected is open to invalidation, and an intentional omission can render the patent unenforceable for inequitable conduct. For AI-assisted patents, where inventorship is genuinely ambiguous, companies should conduct a formal inventorship audit before the patent issues rather than after a challenger identifies the ambiguity.
European Patent Office: Technical Character and the AI Software Exclusion
The EPO’s approach to AI drug discovery patents operates under a different doctrinal framework. Under Article 52 EPC, computer programs and mathematical methods are excluded from patentability ‘as such,’ but are eligible if they produce a ‘technical effect going beyond the normal physical interactions between the program and the computer on which it runs.’ For AI drug discovery, this means the patent must claim a technical contribution — not just the discovery of a compound, but an improved method of experimental screening, a novel biological assay methodology, or a manufacturing process improvement attributable to the AI system’s output.
EPO Examination Guidelines (Part G, Chapter II, Section 3.3) specifically address AI and machine learning applications, confirming that claims to AI-generated drug candidates are eligible if they reflect a technical solution to a technical problem — in practice, if the AI system’s role in identifying the candidate can be framed as an improvement to a technical process rather than a pure computational discovery. This framing requirement forces applicants to draft claims that emphasize the experimental method enabled by the AI output, not just the output itself.
European patent applications for AI-assisted drugs also face the EPO’s prohibition on AI systems as co-inventors. The EPO’s decision in J 8/20 and J 9/20 (the European DABUS cases) held that the requirement for an inventor to be a ‘natural person’ under Article 81 EPC is absolute. Unlike the USPTO, the EPO did not subsequently issue guidance on what constitutes a ‘significant human contribution.’ European prosecution of AI drug patents therefore requires careful claim drafting that ties each inventive step explicitly to a documented human decision.
China: CNIPA’s Evolving Position on AI Co-Inventorship
China’s National Intellectual Property Administration (CNIPA) revised its patent examination guidelines in 2024 to address AI inventorship directly, and the result is the most permissive framework among major jurisdictions. CNIPA’s updated guidance allows AI systems to be acknowledged in patent applications as tools that ‘substantially contributed’ to the invention, and in cases where a human oversees and directs the AI system’s output, it accepts the supervising human as the inventor without requiring the level of specific contribution documentation that the USPTO demands.
This permissive standard reflects China’s strategic priority of accelerating domestic AI drug discovery patent filings. Chinese AI drug discovery patent applications grew by 48% year-over-year in 2024, according to WIPO data, with Baidu Pharma, Huawei Cloud’s drug discovery division, and academic spinouts from Tsinghua University accounting for the largest filing volumes. For multinational pharma companies filing in China, the lower documentation burden is an advantage during prosecution, but the looser standard also means Chinese patent rights may face stronger validity challenges in post-grant review (patent invalidation proceedings before CNIPA’s Patent Re-examination and Invalidation Department).
United Kingdom: Post-Brexit Hardline Doctrine
The UK Supreme Court’s December 2023 decision in Thaler v. Comptroller-General of Patents [2023] UKSC 49 closed the door on AI inventorship more definitively than any other major jurisdiction’s ruling. The Court held that section 13 of the Patents Act 1977 requires an inventor to be a person, and that DABUS was not a person. Unlike the US and EPO decisions, the UK Supreme Court explicitly declined to issue guidance on AI-assisted inventions, leaving the question of what constitutes a qualifying human contribution entirely to future litigation and UKIPO examination practice.
In practice, UKIPO examiners currently apply a standard similar to the USPTO’s ‘significant contribution’ test, but without the benefit of the 2024 federal guidance as a reference point. UK patent prosecution for AI drug discovery assets should treat the inventorship narrative as a free-standing legal document, not an afterthought attached to the technical disclosure. Companies like AstraZeneca and GSK, both headquartered in the UK, have reportedly revised their internal IP governance protocols to require sign-off from at least two human researchers on any patent application listing AI tools as contributing to the discovery process.
| Jurisdiction | AI Inventorship Rule | Human Contribution Standard | Key Precedent | Prosecution Risk Level |
|---|---|---|---|---|
| United States | AI cannot be named as inventor; natural person only | ‘Significant contribution’ under Pannu factors; 2024 USPTO Guidance applies | Thaler v. Vidal (Fed. Cir. 2022); 89 Fed. Reg. 10043 (2024) | Medium — guidance exists but thresholds are fact-specific |
| European Patent Office | Natural person only (Article 81 EPC); J 8/20 & J 9/20 | Technical contribution to a technical problem required; AI = tool | J 8/20; J 9/20; G-II 3.3 Guidelines | Medium-High — claim drafting must frame AI output as technical method improvement |
| China (CNIPA) | AI acknowledgment permitted; human supervisor = inventor | Oversight and direction of AI output sufficient; lower threshold than USPTO | 2024 CNIPA Examination Guideline Revision | Low during prosecution; elevated in post-grant invalidation proceedings |
| United Kingdom | Natural person only; no AI-specific guidance post-Thaler | Undefined — UKIPO applies unofficial USPTO-analog standard | Thaler v. Comptroller-General [2023] UKSC 49 | High — no formal guidance; litigation risk elevated |
| Japan (JPO) | Natural person only; AI explicitly excluded (2024 JPO Study Report) | Human involvement in ‘creative contribution’ required; dataset curation counts | JPO AI Inventorship Study Report (March 2024) | Medium — detailed guidance available; consistent examination practice |
Investment Strategy: Jurisdiction Risk Weighting
- When valuing AI-native pharma IP portfolios, apply a jurisdiction-specific validity discount rate. US patents with documented human contribution merit a lower discount (suggested: 8-12% reduction to base NPV). UK patents without formal guidance documentation warrant a higher discount (15-22%).
- Chinese AI drug patents command commercial value in the domestic market but carry elevated invalidation risk. Model their contribution to portfolio NPV with a higher probability-of-challenge adjustment (35-45% vs. 15-20% for US equivalents).
- EPO-granted AI drug patents with well-drafted technical-contribution claims are among the most defensible in the portfolio. The EPO’s rigorous examination process acts as a pre-grant validity screen. Weight them accordingly in rNPV calculations.
- Companies with PCT filings covering the US, EPO, China, UK, and Japan simultaneously should conduct a jurisdiction-triage analysis within 18 months of PCT filing to prioritize national phase entries based on commercial opportunity and documentation quality.
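The adjustments suggested above can be expressed as a small calculation. This is one possible way to combine them, not an established valuation method: the validity discount is applied as a multiplicative haircut to base rNPV, and the probability of challenge is converted into an expected loss using a `p_loss_given_challenge` parameter that is an assumption introduced here (the text does not supply one). Whether to stack the two adjustments or treat them as alternatives is a modeling choice.

```python
def adjusted_rnpv(base_rnpv: float,
                  validity_discount: float = 0.0,
                  p_challenge: float = 0.0,
                  p_loss_given_challenge: float = 0.0) -> float:
    """Apply jurisdiction-specific haircuts to a base rNPV.

    validity_discount:       fractional reduction (e.g. 0.10 for 10%)
    p_challenge:             probability the patent is challenged
    p_loss_given_challenge:  assumed probability a challenge succeeds
                             (hypothetical input; total loss of value assumed)
    """
    for name, value in (("validity_discount", validity_discount),
                        ("p_challenge", p_challenge),
                        ("p_loss_given_challenge", p_loss_given_challenge)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    expected_survival = 1.0 - p_challenge * p_loss_given_challenge
    return base_rnpv * (1.0 - validity_discount) * expected_survival


if __name__ == "__main__":
    # Midpoints of the ranges suggested in the text; the 0.5 loss rate
    # and the base value of 100 are placeholders.
    us = adjusted_rnpv(100.0, validity_discount=0.10,
                       p_challenge=0.175, p_loss_given_challenge=0.5)
    cn = adjusted_rnpv(100.0, p_challenge=0.40, p_loss_given_challenge=0.5)
    print(round(us, 2), round(cn, 2))
```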
Trade Secrets vs. Patent Disclosure: The Core Strategic Calculus
Why the Enablement Requirement Threatens AI Algorithm Confidentiality
Patent law’s enablement requirement (35 U.S.C. § 112(a)) requires the specification to enable any person skilled in the art to make and use the invention. For AI-discovered drugs, this requirement creates a genuine dilemma. A compound patent that claims the structure of the molecule can satisfy enablement without disclosing how the molecule was discovered: the skilled artisan does not need to replicate the AI discovery process, only to synthesize and test the compound. But a patent on the AI discovery platform itself, or on a method of drug discovery using a particular generative architecture, requires meaningful disclosure of the model’s design, training procedure, and validation methodology.
Most pharmaceutical companies treat their AI discovery platforms as trade secrets and limit patent coverage to the drug compounds the platform produces. Relay Therapeutics exemplifies this hybrid approach: the company holds issued patents on specific small molecules targeting mutant RAS proteins and on method-of-treatment claims for RAS-driven cancers, but its molecular dynamics simulation platform (Dynamo) and its induced-fit docking methodology remain entirely under trade secret protection. The platform is the engine; the compounds are the product.
The risk in this approach is that trade secret protection requires genuine confidentiality. A disgruntled employee who downloads the model weights and publishes them, a contract research organization that accesses platform documentation under an NDA and later uses similar techniques, or a reverse-engineering effort by a competitor analyzing the chemical space of Relay’s patent claims can all erode trade secret status. Companies should maintain strict access controls (role-based authentication, data loss prevention tools), require annual trade secret acknowledgment agreements from all researchers with platform access, and conduct regular audits of access logs.
The Reverse Engineering Risk from Patent Claim Analysis
An underappreciated risk in AI drug discovery IP strategy is that the pattern of a company’s patent claims itself can reveal information about its AI platform’s training focus and optimization criteria. A competitor analyzing the structural space of Insilico Medicine’s Chemistry42-derived patents — noting the recurring scaffold families, the distribution of physicochemical properties, the consistent selectivity profiles — can make educated inferences about the model’s reward function design and the biological datasets that shaped it.
This ‘claim landscape reverse engineering’ is not illegal. It is standard competitive intelligence practice. Patentpc’s competitive intelligence unit and similar services routinely analyze AI drug discovery patent portfolios to reconstruct implied platform capabilities. The implication for IP strategy is that claim drafting decisions should account for informational exposure: broad genus claims covering a wider scaffold space are both commercially stronger and less revealing of the platform’s specific focus than a tight cluster of highly specific structure claims.
Key Takeaways: Trade Secrets vs. Patents
- Patenting the compound and keeping the platform under trade secret protection is the dominant industry approach. It satisfies enablement requirements without disclosing model architecture.
- Trade secret protection requires active maintenance: access controls, NDAs with audit rights, and annual confidentiality re-certifications from research staff.
- Claim landscape analysis by competitors can partially reverse-engineer a platform’s training focus. Broad genus claims offer stronger commercial protection and less informational exposure than tight structural clusters.
- For platforms claiming method patents on AI-assisted drug discovery processes, enablement disclosure must detail the training pipeline sufficiently to allow replication — a disclosure that most companies are unwilling to make. Avoid method claims on the AI platform itself unless the platform is not the primary competitive asset.
Enablement, Written Description, and the CIP Prosecution Strategy
When AI-Designed Synthesis Routes Fail the Enablement Test
The USPTO’s 2024 rejection of an AI-designed mRNA vaccine adjuvant application (Application No. 17/898,342) illustrates the most common enablement failure mode for AI drug discovery patents. The application claimed a lipid nanoparticle (LNP) formulation designed by a generative model trained on published LNP efficacy data. The specification disclosed the LNP’s component lipid structures and molar ratios, and presented in vivo immunogenicity data in mice. But it did not disclose the specific mixing conditions, extrusion parameters, or encapsulation efficiency controls required to reproducibly manufacture the LNP at the claimed molar ratios. The examiner held that a skilled formulation scientist could not practice the claimed invention without undue experimentation.
This rejection type is common because AI models that optimize molecular structure do not simultaneously optimize synthesis or manufacturing process. The model produces an ideal final structure; the experimental path to that structure requires conventional process development work that takes months and is not complete at the time of the provisional filing. Companies facing this gap should adopt the continuation-in-part (CIP) strategy systematically: file the provisional application to secure the priority date with the structural data in hand, and file the CIP application with complete manufacturing process disclosure when process development is complete, typically 12-18 months later. Claims that rely on matter first added in the CIP are entitled only to the CIP’s filing date, so the compound claims should be fully supported by the earlier filing.
The CIP also has a strategic offensive function. As clinical data accumulates on an AI-discovered compound, CIP applications can add method-of-treatment claims covering newly discovered indications, patient subpopulation claims based on pharmacogenomic data, and formulation claims covering optimized dosing regimens. This incremental claim expansion is functionally equivalent to the evergreening strategies used for conventional drugs, applied to an AI-native IP stack.
Written Description Requirements for AI-Generated Compound Libraries
The written description requirement of 35 U.S.C. § 112(a), as framed in Federal Circuit case law, requires the applicant to ‘convey with reasonable clarity to those skilled in the art that, as of the filing date, he or she was in possession of the invention.’ For broad genus claims covering AI-generated compound libraries (a common prosecution approach in which applicants claim large structural families rather than individual compounds), the written description requirement creates a significant hurdle.
The Federal Circuit’s decision in Ariad Pharmaceuticals, Inc. v. Eli Lilly and Co., 598 F.3d 1336 (Fed. Cir. 2010) (en banc), established that the written description must demonstrate possession of the ‘full scope’ of a claim at the filing date. A genus claim covering thousands of AI-generated analogs requires representative species across the structural diversity of the claimed genus, not just the specific lead compound. Patent prosecutors for AI drug discovery applications routinely include 50-200 representative structures in the specification to support broad genus claims — a practice that adds cost and complexity to prosecution but is essential for claim sustainability.
- Average AI drug discovery patent portfolio size: 45+ patents across target, compound, formulation, and method claims (Insilico Medicine, 2025)
- Post-grant challenge rate, AI drug patents (2024): 23% of AI-related drug patents granted in 2024 faced validity disputes within 12 months of issuance
- CIP prosecution timeline advantage: 18 months average gap between provisional filing and CIP with full process data, a critical window for priority date protection
- Written description species requirement (broad genus): 50-200 representative structures required in specification to support large AI-generated genus claims in prosecution
Data Ownership, Licensing, and the Provenance Problem
Multi-Source Training Data and Joint IP Attribution
AI models trained on data contributed by multiple parties — academic collaborators, CROs, patient biobanks, hospital health systems — create joint IP attribution risks that are not adequately addressed by standard material transfer agreements (MTAs) or data licensing agreements drafted for pre-AI research frameworks. The 2025 litigation between BioNTech and Nucleai involving tumor imaging data used to train a cancer drug discovery AI is the highest-profile example: the core dispute was whether the data provider (Nucleai) had retained any IP rights in the model that was trained on its data, despite a data licensing agreement that expressly assigned all model outputs to BioNTech.
The legal question is unresolved across most jurisdictions. Data itself is not patentable, and copyright protection for data compilations is limited and jurisdiction-specific. But a strong argument exists that a sufficiently unique dataset — one whose curation reflects substantial human intellectual labor and whose structure materially shapes a model’s output — constitutes a protectable trade secret that survives contractual assignment of model outputs. Parties drafting AI data licensing agreements in 2025 and beyond should address this risk explicitly: the agreement should specify whether the data provider retains any trade secret rights in the dataset itself, whether the data provider has any license rights in downstream patent applications, and whether the model weights trained on the data constitute a derivative work of the dataset under copyright law.
Blockchain Provenance Tracking: Practical Implementation
Blockchain-based provenance tracking for AI training data has moved from theoretical best practice to operational deployment at several large pharma companies. The core application is an immutable audit trail recording which data sources contributed to each model training run, with timestamped records of data ingestion, preprocessing decisions, and dataset version hashes. When a patent application is later filed claiming compounds generated by that model, the provenance record enables the applicant to demonstrate exactly which datasets shaped the model’s outputs — relevant both for inventorship analysis and for defending against third-party IP claims based on training data contributions.
Practical implementations use permissioned blockchain frameworks (Hyperledger Fabric is the most common in pharma) rather than public blockchains, since training data records contain proprietary information. The provenance record is not filed with the patent application but is maintained as a supporting document available for production in litigation or post-grant proceedings. Companies using this approach include Pfizer (in its collaboration with ConcertAI on oncology data assets) and Johnson & Johnson (in its AI research partnerships with NVIDIA and Microsoft).
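The audit trail described above can be illustrated without any blockchain framework. What follows is a minimal sketch in plain Python, assuming a simple hash-chained log. It is not Hyperledger Fabric and not any company's actual system, and the class name `ProvenanceLog`, the field names, and the sample run and source identifiers are all hypothetical. It shows only the core property a provenance ledger relies on: each record commits to a dataset version hash and to the previous record, so later edits become detectable.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import List


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest, used for dataset versions and log entries."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class ProvenanceLog:
    """Append-only log; each entry commits to the previous entry's hash."""
    entries: List[dict] = field(default_factory=list)

    def append(self, run_id: str, source: str, dataset_bytes: bytes,
               preprocessing: str, timestamp: str) -> dict:
        # The first entry chains to a fixed all-zero hash.
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else "0" * 64
        body = {
            "run_id": run_id,
            "source": source,
            "dataset_hash": sha256_hex(dataset_bytes),
            "preprocessing": preprocessing,
            "timestamp": timestamp,
            "prev_hash": prev_hash,
        }
        body_json = json.dumps(body, sort_keys=True).encode("utf-8")
        entry = dict(body, entry_hash=sha256_hex(body_json))
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash and check that the chain is unbroken."""
        prev_hash = "0" * 64
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "entry_hash"}
            if body["prev_hash"] != prev_hash:
                return False
            body_json = json.dumps(body, sort_keys=True).encode("utf-8")
            if sha256_hex(body_json) != entry["entry_hash"]:
                return False
            prev_hash = entry["entry_hash"]
        return True


# Hypothetical training run with two contributing data sources.
log = ProvenanceLog()
log.append("run-001", "partner-A-imaging", b"example dataset v1",
           "normalized", "2025-01-15T09:00:00Z")
log.append("run-001", "internal-assays", b"example dataset v2",
           "deduplicated", "2025-01-15T09:05:00Z")
print(log.verify())  # True: the log is untouched

log.entries[0]["preprocessing"] = "edited later"
print(log.verify())  # False: the edit breaks the recorded hash
```

A permissioned ledger adds access control and multi-party agreement on top of this idea; the tamper evidence itself comes from the hash chain.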
Key Takeaways: Data Ownership
- Standard MTAs and data licensing agreements are inadequate for AI drug discovery. New agreement templates must explicitly address trade secret rights in training datasets, IP rights in model weights, and ownership of downstream patent applications.
- The BioNTech/Nucleai litigation signals that data providers will increasingly assert IP claims against model outputs trained on their data. Review existing data licensing agreements for this exposure before the next patent filing cycle.
- Blockchain-based provenance tracking on a permissioned framework provides the strongest evidentiary record for data attribution disputes. Implement before, not after, litigation arises.
- Model weights trained on jointly owned or licensed datasets may constitute derivative works requiring explicit IP disposition in the underlying data agreement. Engage IP counsel on this question before the training run, not after the patent application is filed.
IP Valuation Frameworks for AI-Native Drug Assets
Why Standard rNPV Models Require Adjustment
Risk-adjusted net present value (rNPV) models are the standard tool for pharmaceutical patent portfolio valuation. The canonical inputs are a projected peak sales estimate, a probability-of-technical-success (PTS) factor applied at each clinical development stage, a patent-adjusted exclusivity period, and a discount rate that reflects the cost of capital and development risk. For AI-discovered drugs, each of these inputs requires adjustment to reflect the specific risk profile of AI-generated IP.
The PTS adjustment for AI-discovered compounds is the most actively debated. The industry thesis, that AI drug discovery increases PTS by identifying better-validated targets and optimized leads, is plausible but not yet supported by a statistically adequate clinical outcomes dataset. As of Q1 2026, fewer than 15 AI-first drug candidates have completed Phase II trials, and none have received FDA approval as an AI-first program (Insilico Medicine’s INS018_055 for IPF, also referred to as ISM001-055, is in Phase IIb). Until a critical mass of Phase III readouts accumulates, PTS assumptions for AI drug programs should be treated with methodological humility.
The patent validity discount is the more quantifiable adjustment. The 23% post-grant challenge rate for AI drug patents in 2024 is substantially higher than the 8-12% challenge rate for conventional drug patents in the same period. This elevated challenge rate reflects both the novelty of the legal questions and the deliberate strategy of Paragraph IV filers who view AI patent documentation gaps as an accessible invalidity vector. An additional 10-15% validity discount applied to the patent-protected exclusivity period is appropriate for AI drug patents without a complete inventorship documentation record.
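The adjustments above can be made concrete with a small rNPV sketch. This is an illustration under stated assumptions, not a standard valuation model: the function name `rnpv`, the cash flows, and the stage probabilities are hypothetical, and the validity discount is read here as a haircut on positive (patent-protected) cash flows, which is one of several possible interpretations.

```python
from typing import Sequence


def rnpv(cash_flows: Sequence[float], cumulative_pts: Sequence[float],
         discount_rate: float, validity_discount: float = 0.0) -> float:
    """Risk-adjusted NPV.

    cash_flows[t]      net cash flow in year t (t = 0 is today)
    cumulative_pts[t]  probability the program is still alive in year t
    validity_discount  haircut applied to positive cash flows only
                       (assumption: positive years are the exclusivity period)
    """
    if len(cash_flows) != len(cumulative_pts):
        raise ValueError("cash_flows and cumulative_pts must align by year")
    total = 0.0
    for year, (flow, pts) in enumerate(zip(cash_flows, cumulative_pts)):
        if flow > 0:
            flow *= 1.0 - validity_discount
        total += pts * flow / (1.0 + discount_rate) ** year
    return total


# Hypothetical program: three years of development cost, then revenue.
flows = [-50.0, -80.0, -120.0, 300.0, 400.0]
pts = [1.0, 0.6, 0.3, 0.15, 0.15]

base = rnpv(flows, pts, discount_rate=0.10)
with_haircut = rnpv(flows, pts, discount_rate=0.10, validity_discount=0.125)
print(round(base, 2), round(with_haircut, 2))
```

Because development costs are untouched by the haircut, a 10-15% validity discount on revenue years reduces rNPV by more than 10-15% whenever costs are a meaningful share of value.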
IP Valuation: Insilico Medicine’s Portfolio as a Case Study
Insilico Medicine’s IP stack for its lead program, INS018_055 (a TNIK inhibitor for idiopathic pulmonary fibrosis now in Phase IIb trials), illustrates how a well-structured AI drug IP portfolio is built and how it should be valued. The company holds over 45 patents covering four distinct asset categories: target identification platform patents covering its PandaOmics AI target discovery system; compound patents covering INS018_055 and structurally related analogs across multiple jurisdictions; formulation patents covering the tablet formulation and dosing regimen; and method-of-treatment patents covering TNIK inhibition in pulmonary fibrosis and related fibrotic indications.
This layered structure mirrors the evergreening playbook used by conventional pharma to extend effective market exclusivity beyond the compound patent expiry. The compound patent, typically the highest-value asset, expires first. Formulation patents and method-of-treatment patents extend exclusivity — or at minimum, complicate generic entry — by requiring an ANDA filer to challenge multiple patent families simultaneously rather than a single compound claim.
For investors and portfolio managers valuing Insilico’s IPF program, the relevant analysis is not just the compound patent’s expiry date but the entire patent cluster’s expected defense cost and the probability that any one of the cluster’s patents survives a Paragraph IV challenge through final judgment at trial. A Paragraph IV challenge against a single compound patent is manageable. A challenge against five patent families simultaneously — compound, formulation, method of treatment, process, and platform — is prohibitively expensive for a generic filer and signals a program with durable exclusivity protection.
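The cluster-level survival probability mentioned above has a simple closed form if each patent's fate is treated as independent. The sketch below makes that simplifying assumption explicit; in practice outcomes across families are correlated (shared inventorship records, shared prior art), so the result should be read as an upper bound on the benefit of layering. The function name and the example probabilities are hypothetical.

```python
from math import prod
from typing import Iterable


def prob_any_survives(survival_probs: Iterable[float]) -> float:
    """P(at least one patent survives), assuming independent outcomes.

    Equals 1 minus the probability that every patent in the cluster falls.
    """
    probs = list(survival_probs)
    if any(not 0.0 <= p <= 1.0 for p in probs):
        raise ValueError("each probability must lie in [0, 1]")
    return 1.0 - prod(1.0 - p for p in probs)


# Hypothetical per-family survival odds through final judgment.
single_compound = prob_any_survives([0.6])
five_families = prob_any_survives([0.6, 0.5, 0.5, 0.4, 0.4])
print(round(single_compound, 3), round(five_families, 3))
```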
Insilico Medicine IP Stack (INS018_055)
45+ Patents
Covering target platform, compound, formulation, and method-of-treatment claims across 8+ jurisdictions
Merck Acquisition of Atomwise (2025)
$2.1B
Includes IP governance clause requiring human oversight documentation for all generative chemistry outputs
Global AI Drug Discovery Investment (Projected 2026)
$15.7B
CAGR of 28% from 2022 baseline; IP valuation infrastructure investment lagging platform investment by 3-4 years
AI Drug Patent Validity Discount Rate
10-15%
Suggested additional reduction to rNPV for patents without complete inventorship documentation records
The Atomwise Acquisition Precedent: IP Governance as M&A Due Diligence
Merck’s 2025 acquisition of Atomwise for $2.1 billion set a precedent for how large pharma handles AI drug discovery IP in M&A transactions. The acquisition agreement included a specific IP governance clause requiring that all generative chemistry outputs from Atomwise’s AtomNet platform be subject to a documented human oversight review before being incorporated into patent applications. The clause was not a regulatory requirement. It was a contractual protection negotiated by Merck’s IP counsel against the risk of inheriting a patent portfolio that could not survive inventorship challenges post-close.
This precedent matters for portfolio managers evaluating AI drug discovery acquisitions. An IP governance audit (specifically, a review of inventorship documentation practices, data provenance records, and trade secret maintenance protocols) should be a standard due diligence workstream in any AI pharma acquisition, on par with the clinical data package review and the manufacturing capacity audit. An acquired company whose AI-generated compound patents lack adequate human contribution documentation is not merely a compliance risk. It is a valuation risk: the patents may be unenforceable at the moment of generic challenge, which typically occurs 6-8 years after approval, precisely when the originator is most dependent on patent protection to justify the acquisition price.
Investment Strategy: AI Drug Discovery M&A and Portfolio Valuation
- Add an ‘AI IP Governance Score’ to acquisition due diligence checklists. Score targets across five dimensions: inventorship documentation quality, data provenance records, trade secret maintenance protocols, global filing coverage, and post-grant challenge history.
- Apply a 10-15% validity discount to rNPV calculations for AI drug patents without a complete Pannu-compliant inventorship record. Adjust upward if PTAB IPR challenges have already been filed.
- Insilico Medicine’s layered patent structure (platform + compound + formulation + method-of-treatment) is the benchmark for evaluating AI drug IP portfolio depth. A single compound patent without supporting claims in adjacent categories carries materially higher generic entry risk.
- Merck’s Atomwise acquisition terms signal that large pharma will increasingly require IP governance warranties in AI drug discovery deals. Sellers should build documentation infrastructure now — the covenant is coming as standard deal language.
- Recursion Pharmaceuticals’ proprietary phenomics dataset is itself a protectable trade secret asset that contributes to the company’s IP moat independently of its patent portfolio. Evaluate proprietary data infrastructure as a distinct IP asset class in any target company analysis.
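The five-dimension ‘AI IP Governance Score’ in the first bullet can be sketched as a weighted average. Everything beyond the five dimension names is an assumption made for illustration: the 0-5 rating scale, the default equal weights, and the function name `governance_score` are hypothetical, and a real checklist would define its own rubric.

```python
from typing import Dict, Optional

# The five dimensions named in the checklist above.
DIMENSIONS = (
    "inventorship_documentation",
    "data_provenance",
    "trade_secret_protocols",
    "global_filing_coverage",
    "post_grant_challenge_history",
)


def governance_score(ratings: Dict[str, float],
                     weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted average of 0-5 ratings across the five dimensions."""
    if weights is None:
        weights = {dim: 1.0 for dim in DIMENSIONS}  # assumption: equal weights
    missing = [dim for dim in DIMENSIONS if dim not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    for dim in DIMENSIONS:
        if not 0.0 <= ratings[dim] <= 5.0:
            raise ValueError(f"rating for {dim} must be between 0 and 5")
    total_weight = sum(weights[dim] for dim in DIMENSIONS)
    return sum(ratings[dim] * weights[dim] for dim in DIMENSIONS) / total_weight


# Hypothetical acquisition target.
target = {
    "inventorship_documentation": 2.0,
    "data_provenance": 3.0,
    "trade_secret_protocols": 4.0,
    "global_filing_coverage": 5.0,
    "post_grant_challenge_history": 1.0,
}
print(governance_score(target))
```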
Competitive Dynamics: Generic Entry Acceleration and the Paragraph IV AI Arms Race
How Generic Manufacturers Are Using AI to Attack Originator Patents
The same AI tools that originator companies use to discover drugs are now in the hands of generic manufacturers and their Paragraph IV litigation counsel. AI-powered patent landscape analysis tools — PatSnap Insights, CIPster, Cipher Analytics, and similar platforms — can identify potential invalidity vectors in AI-drug discovery patents faster than conventional manual analysis. Automated prior art searches run structural similarity algorithms across ChEMBL, PubChem, and the full USPTO patent database in hours, flagging potential anticipation or obviousness arguments that might have taken a human researcher weeks to find.
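The structural similarity screening described above usually rests on the Tanimoto coefficient between molecular fingerprints. The sketch below shows the coefficient itself on toy data. It assumes fingerprints have already been computed and are represented as sets of ‘on’ bit positions; real pipelines derive them from structures with a cheminformatics toolkit. The compound names, bit sets, and the 0.6 threshold are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]  # positions of the bits that are set


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Shared bits divided by the bits set in either fingerprint."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def flag_similar(query: Fingerprint, prior_art: Dict[str, Fingerprint],
                 threshold: float) -> List[Tuple[str, float]]:
    """Return prior-art entries at or above the threshold, most similar first."""
    hits = [(name, tanimoto(query, fp)) for name, fp in prior_art.items()]
    return sorted((hit for hit in hits if hit[1] >= threshold),
                  key=lambda hit: hit[1], reverse=True)


# Toy fingerprints standing in for a claimed compound and two references.
claimed = frozenset({1, 4, 9, 16, 25})
references = {
    "ref-A": frozenset({1, 4, 9, 16, 36}),
    "ref-B": frozenset({2, 3, 5}),
}
print(flag_similar(claimed, references, threshold=0.6))
```

A high score only flags a reference for human review; whether it anticipates or renders a claim obvious remains a legal judgment.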
The FDA’s 2024 approval of an AI-developed biosimilar version of adalimumab (Humira is the reference product), approximately six months ahead of initial industry projections, illustrates the accelerating timeline for generic and biosimilar entry. Sandoz’s Hyrimoz biosimilar development team used AI-assisted formulation optimization to hit the target citrate-free formulation specifications faster than conventional process development would have allowed, compressing the development timeline and enabling an earlier biosimilar application (a 351(k) BLA rather than an ANDA, since adalimumab is a biologic).
For originator companies, the implication is that the effective market exclusivity period for AI-discovered drugs may be shorter than the nominal patent expiry date suggests. Generic filers have shorter development timelines (because they too use AI tools), faster patent analysis capabilities (because AI reads landscapes faster than human counsel), and an increasingly clear understanding of the inventorship documentation gaps that make AI drug patents vulnerable. The response is not to abandon AI-assisted drug discovery — the productivity gains are too substantial. It is to build the AI IP governance infrastructure that makes the resulting patents defensible from day one.
The Non-Infringing Analog Design Problem
Generative chemistry AI is not limited to drug discovery. It is equally useful for designing non-infringing analogs — molecules that retain the biological activity of an originator compound but have structural differences sufficient to avoid infringement of the originator’s compound claims. Generic and specialty pharma companies are actively using generative chemistry to design around originator patents before the patents expire, pre-positioning substitute molecules for regulatory approval and commercial launch timed to the originator’s loss-of-exclusivity date.
The strategic response for originator companies is broad genus claiming backed by adequate written description. A compound patent that claims not just the lead molecule but a structurally diverse genus of active analogs, with representative species spanning the genus and experimental data supporting their activity, substantially raises the cost and difficulty of designing a non-infringing workaround. The trade-off is the written description burden: broader genus claims require more representative species in the specification, which requires more experimental work before filing. This is the core tension in AI drug discovery patent strategy, and there is no clean resolution, only a calibrated trade-off between claim scope and prosecution cost.
Key Takeaways: Competitive Dynamics
- Generic manufacturers are using the same AI tools to attack originator patents. Paragraph IV strategies now incorporate AI prior art searches that are faster and more comprehensive than conventional manual analysis.
- Effective market exclusivity for AI-discovered drugs may be materially shorter than nominal patent expiry suggests. Model generic entry timing conservatively when building commercial forecasts.
- Broad genus claims backed by adequate written description (50-200 representative species) are the primary defense against generative chemistry workaround design. Insufficient written description in the original application is difficult to cure post-issuance.
- Layered patent portfolios covering compound, formulation, method-of-treatment, and process claims force generic filers to challenge multiple patent families simultaneously, raising the cost and duration of Paragraph IV litigation.
Case Studies: Insilico Medicine, DABUS, and Recursion Pharmaceuticals
Insilico Medicine: The 45-Patent IP Governance Benchmark
Insilico Medicine’s IP strategy for INS018_055 (also referred to as ISM001-055 in some regulatory filings) is the most fully documented example of AI drug discovery IP governance in the public record. The company’s 18-month preclinical timeline from TNIK target identification to clinical candidate selection, completed using the Chemistry42 generative chemistry platform and PandaOmics target identification system, was accompanied by a systematic documentation protocol that the company has described as its ‘IP Readiness Protocol.’
The protocol requires that at each stage of the Chemistry42 workflow — target selection, reward function specification, generative run parameters, candidate scoring criteria, and lead selection rationale — a named researcher review and sign off on the human decision made. These signoffs are archived in an electronic lab notebook (the company uses a LabArchives-based system) with timestamped, version-controlled records. Each patent application filed by Insilico cites to specific notebook entries in its prosecution history file, creating a documented chain from human decision to claimed invention.
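A stage-by-stage sign-off requirement like this lends itself to an automated completeness check before filing. The sketch below is an illustration only: it is not Insilico's system or the LabArchives API, and the `SignOff` record, its fields, and the sample notebook identifiers are hypothetical. The five stage names are taken from the workflow described above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List

# Workflow stages that each need a named researcher's sign-off.
REQUIRED_STAGES = (
    "target_selection",
    "reward_function_specification",
    "generative_run_parameters",
    "candidate_scoring_criteria",
    "lead_selection_rationale",
)


@dataclass(frozen=True)
class SignOff:
    stage: str
    researcher: str
    signed_at: datetime
    notebook_entry: str  # hypothetical notebook entry identifier


def missing_stages(signoffs: Iterable[SignOff]) -> List[str]:
    """Stages with no sign-off from a named researcher, in workflow order."""
    signed = {s.stage for s in signoffs if s.researcher.strip()}
    return [stage for stage in REQUIRED_STAGES if stage not in signed]


# Hypothetical record set with two of five stages documented.
records = [
    SignOff("target_selection", "A. Researcher",
            datetime(2025, 3, 1, tzinfo=timezone.utc), "ELN-0001"),
    SignOff("candidate_scoring_criteria", "B. Researcher",
            datetime(2025, 4, 12, tzinfo=timezone.utc), "ELN-0042"),
]
print(missing_stages(records))
```

An empty result would mean every stage has at least one attributable human decision on file, which is the chain a prosecution history needs to cite.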
The 45-patent portfolio covering INS018_055 is structured in four layers. The first layer covers PandaOmics as a computational target discovery platform — method claims on the AI-assisted target validation methodology. The second layer covers the compound itself and a genus of structurally related TNIK inhibitors, with the specification including 87 representative analog structures supported by in vitro TNIK binding data. The third layer covers the Phase I-derived dosing regimen and the tablet formulation (immediate-release 10 mg and 20 mg strengths). The fourth layer covers method-of-treatment claims for IPF, with dependent claims for systemic sclerosis-associated interstitial lung disease and other fibrotic conditions identified through AI-assisted indication expansion analysis.
The PCT filing strategy covered 18 national phase jurisdictions, with national phase entries prioritized in the US, EU (EPO), China, Japan, South Korea, Australia, Canada, and Brazil based on projected IPF treatment market size and generic entry risk assessment. The EPO application was drafted with specific technical-contribution arguments emphasizing that PandaOmics’ identification of TNIK as a viable IPF target improved the technical process of target selection by reducing the experimental screening burden by an estimated 85% compared to conventional high-throughput screening approaches.
Insilico Medicine: INS018_055 Patent Layers
4 Layers
Platform method, compound genus, formulation, and method-of-treatment claims across 18 jurisdictions
Representative Species in Genus Claim Specification
87
In vitro TNIK binding data supporting written description for broad compound genus claim
Chemistry42 Preclinical Timeline (TNIK Program)
18 Months
Target identification to clinical candidate — vs. 5-6 years for conventional discovery
EPO Technical Contribution Claim
85% Reduction
Estimated reduction in experimental screening burden vs. conventional HTS — argued as technical improvement in EPO prosecution
DABUS: What the Failed Patents Actually Taught the Industry
The DABUS patent applications (US16/524,350 covering an emergency beacon, and US16/524,532 covering a food container with fractal geometry) were not pharmaceutical cases, but their legal aftermath reshaped pharmaceutical AI patent strategy more than any drug-specific case. Stephen Thaler’s deliberate strategy of listing DABUS as the sole inventor, rejecting any option to name himself or other humans as co-inventors, was designed to force a clean legal test of whether an AI system could be a named inventor. It succeeded in forcing the test and failed in obtaining the result.
The cascade of legal decisions surrounding the Federal Circuit’s August 2022 ruling (the EPO’s J 8/20 and J 9/20 decisions, the UK Supreme Court’s December 2023 ruling, and the USPTO’s 2024 Inventorship Guidance) collectively produced the current international framework for AI inventorship. The Deloitte 2025 survey finding that 78% of pharmaceutical companies now mandate inventorship audits for AI projects is a direct consequence of this framework: companies that had been operating without formal AI inventorship documentation protocols implemented them in response to the regulatory certainty that DABUS’s failure created.
The DABUS cases also inadvertently established the affirmative case for AI-assisted patents. The decisions rejected only AI as sole inventor, and the USPTO’s guidance then confirmed that AI-assisted inventions with a significant human contribution remain patentable. Together they drew a clear line between ‘AI as sole inventor’ (unpatentable) and ‘AI as tool used by human inventors’ (patentable), creating a regulatory safe harbor that companies can operate within. The line is not perfectly defined, but it is defined. That clarity, paradoxically, is DABUS’s legacy in pharmaceutical IP.
Recursion Pharmaceuticals: The Proprietary Data Moat Strategy
Recursion’s IP strategy differs from Insilico’s in a critical way: Recursion has invested more heavily in building a proprietary data asset, its phenomics library, than in building a large patent portfolio around any individual drug candidate. The phenomics library, which currently contains high-content cellular imaging data from over 2.5 million compound-disease pairs, represents years of wet lab work that cannot be replicated quickly by competitors. It is a data moat that protects the predictive accuracy of Recursion’s AI models in the same way that patents protect Insilico’s specific compound claims.
The proprietary dataset approach has a specific IP valuation implication: the data itself, maintained as a trade secret, is an intangible asset that does not appear on Recursion’s balance sheet at full value under current GAAP accounting standards but materially affects the company’s competitive position and its ability to generate novel, non-obvious compound patents in the future. For analysts valuing Recursion, the phenomics library should be modeled as a durable competitive advantage that reduces the probability of a successful Paragraph IV challenge against Recursion’s compound patents — because the company’s compounds are less likely to have structural overlap with prior art, and the derivation of those compounds from a proprietary dataset provides a documented chain of inventive human decisions.
Investment Strategy: Recursion Pharmaceuticals IP Moat Analysis
- Recursion’s phenomics library is not a patent — it is a trade secret asset. Value it as a strategic moat that lowers the company’s cost of generating patentable novel compounds and reduces validity risk in compound patents derived from it.
- Compound patents generated from proprietary, non-public training data carry a lower prior art overlap risk than patents from public-database-trained models. Apply a reduced validity discount (5-8% vs. 10-15%) when the training dataset is demonstrably proprietary.
- Any acquirer of Recursion must treat the phenomics library’s trade secret status as a core M&A due diligence item. Assess access control rigor, employee confidentiality agreement coverage, and CRO/partnership data sharing agreements for potential trade secret leakage.
- Model the phenomics library’s replacement cost (estimated at $400M-$600M based on comparable imaging infrastructure and compound library scale) as a floor for its contribution to Recursion’s enterprise value, in addition to the DCF value of its drug pipeline.
Technology Roadmap: Protecting the Full AI Drug Discovery Stack
The Five-Layer IP Protection Architecture
A complete AI drug discovery IP strategy covers five distinct technology layers, each with different IP protection instruments and different vulnerability profiles. Most pharmaceutical companies protect one or two layers well and leave the others exposed. The companies with the most defensible AI drug IP portfolios protect all five.
The first layer is the biological data infrastructure: the proprietary experimental datasets, assay readouts, and genomics data that train the AI model. Protection instrument: trade secret. Key vulnerability: data sharing with research partners and CROs without adequate IP provisions in data licensing agreements. Risk mitigation: implement blockchain-based provenance tracking and update all data sharing agreements to explicitly address model training use, model weight ownership, and downstream patent rights.
The second layer is the AI model architecture and weights: the neural network design, training procedure, hyperparameter configuration, and trained model weights. Protection instrument: trade secret (for weights); potentially copyright (for code); potentially patent (for novel training methodologies). Key vulnerability: model extraction attacks that approximate the weights, together with membership inference and model inversion attacks that leak training data, mounted by competitors who have access to the model’s API outputs. Risk mitigation: implement differential privacy techniques in training, limit API output precision, and conduct regular adversarial robustness audits.
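Of the layer-two mitigations, limiting API output precision is the simplest to illustrate. The sketch below shows one plain reading of that idea: return only the top few results and round their scores, so each query reveals less about the underlying model. It does not implement differential privacy and is not a security guarantee on its own. The function name, compound identifiers, and default settings are hypothetical.

```python
from typing import Dict


def coarsen_predictions(scores: Dict[str, float], decimals: int = 1,
                        top_k: int = 3) -> Dict[str, float]:
    """Keep the top_k highest-scoring items and round their scores."""
    if decimals < 0:
        raise ValueError("decimals must be non-negative")
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    # Rank on the raw scores, then round only what is returned.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return {name: round(score, decimals) for name, score in ranked[:top_k]}


# Hypothetical raw model outputs for four candidate compounds.
raw = {"cmpd-1": 0.8734, "cmpd-2": 0.4121, "cmpd-3": 0.6677, "cmpd-4": 0.1203}
print(coarsen_predictions(raw, decimals=1, top_k=2))
```

The design choice is a trade: coarser outputs are less useful to legitimate API users as well, so precision limits are usually tuned per partner and per use case.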
The third layer is the target identification and validation output: the specific biological targets prioritized by the AI system and the evidence base supporting their selection. Protection instrument: provisional patent (to establish priority date for downstream compound claims); trade secret (for unpublished target validation data). Key vulnerability: publication of AI-generated target hypotheses in scientific literature before patent filing, creating prior art. Risk mitigation: enforce a strict pre-publication IP review protocol with a minimum 90-day hold period before any target-related data is submitted to a journal.
The fourth layer is the drug compound itself: the specific chemical structure, its salts, enantiomers, polymorphs, and prodrug forms. Protection instrument: compound patent (primary); formulation patent (secondary); Hatch-Waxman Orange Book listing for FDA-approved drugs. Key vulnerability: prior art overlap from public database-trained models; written description insufficiency for broad genus claims; inventorship documentation gaps. Risk mitigation: implement the documentation protocols described in this guide; run FTO screening before filing; include 50-200 representative species in genus claim specifications.
The fifth layer is the clinical development package: the dosing regimen, the patient population definition, the companion diagnostic (if applicable), and the method-of-treatment claims for each approved indication. Protection instrument: method-of-treatment patents; CIP applications as new indications are discovered; regulatory exclusivity (5 years for NCEs under Hatch-Waxman; 12 years for biologics under the BPCIA). Key vulnerability: off-label use that bypasses method-of-treatment patents; biosimilar interchangeability designations that undermine branded biologics market share. Risk mitigation: file method-of-treatment claims aggressively as clinical data accumulates; pursue Orphan Drug Designation where applicable for additional 7-year market exclusivity.
Technology Roadmap: AI Drug IP Protection — Vulnerability Matrix
- L1: Biological Data Infrastructure. Instrument: Trade secret. Primary risk: inadequate data licensing agreements with research partners. Fix: update agreements to address model training rights and downstream patent attribution before data is shared.
- L2: AI Model Architecture & Weights. Instrument: Trade secret + copyright (code). Primary risk: model inversion attacks via API output analysis. Fix: differential privacy in training; restrict API output precision; adversarial robustness audits quarterly.
- L3: Target Identification Output. Instrument: Provisional patent + trade secret. Primary risk: publication of target hypotheses before patent filing creates prior art. Fix: mandatory 90-day IP review hold before any publication; file provisional applications before journal submissions.
- L4: Drug Compound (Structure, Salts, Polymorphs). Instrument: Compound patent + CIP applications. Primary risk: inventorship documentation gaps; genus claim written description insufficiency. Fix: implement IP Readiness Protocol; include 50-200 representative species in specification; run Tanimoto FTO screening before filing.
- L5: Clinical Development Package. Instrument: Method-of-treatment patents + data exclusivity + Orphan Drug Designation (where applicable). Primary risk: off-label use bypassing method claims; biosimilar interchangeability erosion for biologic programs. Fix: file method claims aggressively as indications expand; monitor biosimilar interchangeability applications at FDA.
Interpretable AI and the Written Description Solution
One of the most promising technical developments for AI drug discovery patent prosecution is the adoption of interpretable AI methods — specifically, SHAP (SHapley Additive exPlanations) value analysis and attention mechanism visualization — to document the decision pathways by which AI models arrive at specific compound recommendations. SHAP values decompose an AI model’s output into contributions from individual input features, allowing a researcher to generate a human-readable explanation of why a specific molecular substructure was predicted to be active against a target.
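The decomposition SHAP performs can be shown exactly on a toy model. The sketch below computes Shapley values by enumerating every feature subset, which is the definition that SHAP tooling approximates for real models; it is feasible only for a handful of features. The scoring function `toy_score` and the three substructure features are invented for illustration and do not represent any actual binding model.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(features: Sequence[str],
                   value: Callable[[FrozenSet[str]], float]) -> Dict[str, float]:
    """Exact Shapley values: weighted average marginal contribution per feature."""
    n = len(features)
    result = {}
    for feature in features:
        others = [f for f in features if f != feature]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                total += weight * (value(coalition | {feature}) - value(coalition))
        result[feature] = total
    return result


def toy_score(present: FrozenSet[str]) -> float:
    """Hypothetical predicted activity from which substructures are present."""
    score = 0.0
    if "hbond_donor" in present:
        score += 1.0
    if "aromatic_ring" in present:
        score += 0.5
    if "hbond_donor" in present and "aromatic_ring" in present:
        score += 0.25  # interaction term, split evenly by Shapley
    if "halogen" in present:
        score += 0.1
    return score


names = ["hbond_donor", "aromatic_ring", "halogen"]
attributions = shapley_values(names, toy_score)
for name in names:
    print(name, round(attributions[name], 3))
```

The attributions sum to the difference between the full prediction and the empty baseline, which is what lets a specification tie a predicted activity back to named structural features.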
This has direct practical value in patent prosecution. A patent specification that includes a SHAP value analysis explaining that the model predicted high TNIK binding affinity because of a specific hydrogen bond donor configuration at the 3-position of a pyrazole scaffold, cross-referenced to crystallographic data for the TNIK binding pocket, provides written description support that is substantially stronger than a specification that describes only the compound structure and the in vitro binding result. It is effectively a machine-generated SAR rationale that can substitute for the experimental SAR work that human medicinal chemists traditionally provide in written descriptions.
The EPO’s technical contribution requirement benefits particularly from SHAP-based disclosures. When an applicant can demonstrate, using SHAP analysis, that the AI model identified a non-obvious structural feature that improved binding selectivity over related kinases — a technical result achieved by a technical means — the ‘technical character’ requirement of Article 52 EPC is substantially easier to satisfy. AstraZeneca’s computational chemistry team has reportedly incorporated SHAP-based technical rationales as a standard section of EPO patent specifications for AI-assisted small molecule programs since 2024.
Policy Reform Vectors and Their Investment Implications
Tiered Inventorship and the ‘AI as Tool’ Codification Debate
The most significant pending policy question in AI drug patent law is whether statutory reform should codify AI’s role in drug discovery in a way that provides more predictable legal outcomes than the current guidance-based framework. Two reform vectors are actively debated in USPTO, WIPO, and Congressional staff discussions.
The first is ‘AI as named tool’ codification: amending 35 U.S.C. § 100 to expressly define AI systems as ‘tools’ rather than ‘persons,’ while simultaneously specifying the minimum human contribution requirements for inventorship in AI-assisted contexts. This approach would convert the current guidance-based standard into statutory law with more predictable legal effect. Industry groups representing pharmaceutical companies — PhRMA, BIO, the Intellectual Property Owners Association — have generally supported this approach because it provides legal certainty without requiring the industry to restructure its discovery workflows.
The second, more radical reform vector is tiered inventorship: creating a new legal category of ‘AI-assisted invention’ with a distinct patent term, royalty structure, or disclosure requirement that reflects the reduced human inventive contribution. Proponents argue that a drug discovered with minimal human input beyond running an AI model deserves a shorter exclusivity period than a drug discovered through years of conventional medicinal chemistry labor. Opponents — primarily pharma and biotech companies — argue that the exclusivity period’s primary function is to incentivize investment in clinical development, not to reward the discovery process per se, and that a tiered system would reduce investment incentives without meaningfully improving public access to medicines.
The probability of statutory reform in the current Congressional term is low. The USPTO’s guidance-based approach is sufficient for near-term prosecution management, and any amendment to the Patent Act must compete with other legislative priorities for limited bandwidth. But companies with long drug development timelines, particularly biologics programs with 10-15 year development horizons, should model the scenario in which a statutory revision to patent term or royalty structure for AI-discovered drugs is enacted during the program’s commercial life. The probability is low, but the impact on NPV is material if it occurs.
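A low-probability, high-impact scenario of this kind is typically folded into a valuation as a probability-weighted average. The sketch below shows the arithmetic. The 5% reform probability and the 12.5% NPV reduction are placeholder assumptions chosen for illustration, and the function name `expected_npv` is hypothetical.

```python
from typing import Sequence, Tuple


def expected_npv(base_npv: float,
                 scenarios: Sequence[Tuple[float, float]]) -> float:
    """Probability-weighted NPV.

    scenarios: (probability, npv_multiplier) pairs whose probabilities sum to 1.
    """
    total_prob = sum(prob for prob, _ in scenarios)
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")
    return sum(prob * base_npv * multiplier for prob, multiplier in scenarios)


# Placeholder assumptions: 5% chance of reform that cuts NPV by 12.5%.
scenarios = [
    (0.95, 1.0),    # no statutory change during commercial life
    (0.05, 0.875),  # reform enacted; NPV reduced
]
print(expected_npv(1000.0, scenarios))
```

With these placeholder inputs the expected-value effect is small, which is why the scenario matters mostly as a stress test on the downside case rather than as a change to the base valuation.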
The NIH AIM-HI Initiative and Shared AI Platform IP
The NIH’s Accelerating Medicines Partnership AI for Health Innovations (AIM-HI) initiative is the largest public-private AI drug discovery collaboration currently funded in the United States. The initiative provides participating companies access to shared AI platforms trained on NIH-curated genomics and phenomics datasets, with IP rights to drug candidates derived from those platforms allocated to the contributing industry partner rather than to NIH or to the academic research institutions that curated the underlying data.
The IP allocation structure in AIM-HI is a direct response to the historical failure mode of federal government-funded research: the NIH has learned from decades of Bayh-Dole Act implementation that assigning IP rights to industry partners increases the probability of clinical development and commercialization of federally funded discoveries. But the AIM-HI structure also creates complexity: when a drug candidate is discovered using a shared AI platform trained on publicly funded data, and that candidate is then patented by an industry partner, the question of whether NIH retains march-in rights under the Bayh-Dole Act is a live legal issue.
Industry partners in AIM-HI should obtain explicit written clarification from NIH’s Office of Technology Transfer on whether march-in rights apply before investing in clinical development of AIM-HI-derived candidates. A march-in action compelling licensing of a commercially successful drug at a lower royalty rate would materially reduce the NPV of any program built on an AIM-HI foundation.
Key Takeaways: Policy and Investment Implications
- Statutory reform codifying AI’s role in inventorship is unlikely in the current Congressional term but material in NPV impact if enacted. Model this as a 10-15% NPV reduction scenario for AI-first programs with projected commercialization beyond 2030.
- The ‘tiered inventorship’ reform vector, if enacted, would create a new patent term category for AI-heavy discoveries. Monitor BIO and PhRMA position papers for early signals on industry coalition strategy.
- AIM-HI participation creates Bayh-Dole march-in right exposure. Obtain explicit NIH-OTT clarification before investing in clinical development of AIM-HI-derived candidates. This is a due diligence item, not a post-approval concern.
- Interpretable AI techniques (SHAP, attention visualization) are moving from experimental to standard prosecution practice for EPO filings. Companies that build SHAP analysis into their standard IP documentation workflow now gain a prosecution cost advantage over competitors who implement it reactively.
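To make the SHAP takeaway concrete, the sketch below computes exact Shapley values for a toy scoring function by averaging each feature's marginal contribution over every feature ordering. It is an illustration of the underlying idea only: it does not use the `shap` library (which relies on faster approximations for real models), and the scoring function, feature names, and values are all hypothetical.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average marginal contribution over all orderings.

    predict  -- function mapping a {feature: value} dict to a score
    instance -- feature values for the compound being explained
    baseline -- reference feature values representing 'feature absent'
    """
    features = list(instance)
    totals = {name: 0.0 for name in features}
    orderings = list(permutations(features))
    for ordering in orderings:
        current = dict(baseline)
        previous_score = predict(current)
        for name in ordering:
            current[name] = instance[name]  # switch this feature on
            score = predict(current)
            totals[name] += score - previous_score
            previous_score = score
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_binding_score(x):
    # Hypothetical model with one interaction term, for illustration only.
    return (2.0 * x["h_bond_donors"] + 0.5 * x["logp"]
            + 1.0 * x["h_bond_donors"] * x["ring_count"])


baseline = {"h_bond_donors": 0.0, "logp": 0.0, "ring_count": 0.0}
candidate = {"h_bond_donors": 2.0, "logp": 3.0, "ring_count": 1.0}
attributions = shapley_values(toy_binding_score, candidate, baseline)
for name, value in attributions.items():
    print(f"{name}: {value:+.2f}")
```

The attributions sum to the difference between the candidate's score and the baseline score, which is the property that makes this kind of decomposition usable as a record of which inputs drove a model's output. Exhaustive enumeration is only practical for a handful of features.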
The Regulatory Frontier: FDA’s AI Drug Approval Framework and Its Patent Implications
FDA’s 2025 Guidance on AI in Drug Development
The FDA’s evolving stance on AI in drug development intersects with patent strategy in two underappreciated ways. First, the FDA’s 2025 draft guidance on the use of AI and machine learning in drug development (building on its 2023 discussion paper) asks sponsors to document the AI tools used in clinical trial design, biomarker development, and patient selection. Because the guidance is a draft, this is an expectation rather than a binding requirement. The documentation is intended for regulatory review, but it also creates a second record of human involvement in AI-assisted processes that is potentially useful in inventorship documentation for method-of-treatment patents.
Second, AI-optimized clinical trial designs — smaller patient cohorts, adaptive randomization, biomarker-stratified populations — can reduce the sample size required for Phase III pivotal trials. This creates a statistical validity concern (smaller trials have lower power to detect safety signals) and a patent strategic implication: a narrower, biomarker-defined patient population in the clinical trial results in narrower method-of-treatment claims that are easier to enforce but cover a smaller commercial opportunity. Conversely, a broad, unselected trial population supports broader method-of-treatment claims but may face FDA pressure for post-marketing studies if the biomarker-undefined population shows heterogeneous responses.
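The statistical concern about smaller trials can be illustrated with a short calculation: the chance of observing at least one case of an adverse event falls quickly as the cohort shrinks. This is a simplified sketch that assumes independent patients and a constant per-patient event rate; the 1-in-1,000 rate and the cohort sizes are illustrative assumptions, not figures from any trial.

```python
def prob_at_least_one_event(event_rate, n_patients):
    """Probability of seeing one or more events among n independent patients."""
    if not 0.0 <= event_rate <= 1.0:
        raise ValueError("event_rate must be between 0 and 1")
    if n_patients < 0:
        raise ValueError("n_patients must be non-negative")
    return 1.0 - (1.0 - event_rate) ** n_patients


# Hypothetical adverse event occurring in 1 of every 1,000 treated patients.
rare_event_rate = 0.001
for cohort_size in (300, 1000, 3000):
    p = prob_at_least_one_event(rare_event_rate, cohort_size)
    print(f"n={cohort_size}: P(at least one event) = {p:.2f}")
```

Under these assumptions a 300-patient trial would most likely see no cases at all, which is one reason a biomarker-stratified program with a small pivotal cohort may draw requests for post-marketing safety data.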
Algorithmic Bias in Training Data and Patent Rejections
The USPTO’s 2024 rejection of an AI-designed osteoarthritis drug application on grounds that the training data underrepresented Asian genomic variants introduced a novel legal concept to patent prosecution: training data diversity as a condition for utility. The examiner’s logic, drawing on FDA guidance regarding training data diversity for AI/ML-enabled medical devices, was that a drug predicted to be effective based on training data skewed toward European genetic backgrounds had insufficient credible utility for the broader patient population it would actually be prescribed to.
This reasoning has not yet been affirmed by the Federal Circuit, and its doctrinal foundations are contested. But the rejection is a warning signal for companies training AI drug discovery models on genomics data from homogeneous patient populations. The practical response is to audit training datasets for demographic representation before conducting generative runs that will feed into patent applications, and to include multi-ethnic pharmacogenomic data in the in vitro validation studies disclosed in the patent specification. A specification that includes binding affinity and selectivity data from cell lines derived from diverse genetic backgrounds is substantially more defensible against an algorithmic bias utility rejection than one relying on a single cell line from a homogeneous genomic background.
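A training-data audit of the kind described above can start with something as simple as comparing each group's share of the dataset against a reference share. The sketch below is one possible approach under stated assumptions: the ancestry labels, the reference shares, and the `min_ratio` cutoff are hypothetical placeholders, not real population figures or any threshold endorsed by the USPTO or FDA.

```python
from collections import Counter


def representation_gaps(sample_labels, reference_shares, min_ratio=0.5):
    """Flag groups whose share of the data falls below min_ratio x reference.

    sample_labels    -- one ancestry/group label per training sample
    reference_shares -- {group: expected share in the intended population}
    min_ratio        -- assumed tolerance; 0.5 flags groups at under half
                        of their reference share
    Returns {group: (observed_share, reference_share)} for flagged groups.
    """
    counts = Counter(sample_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample_labels is empty")
    gaps = {}
    for group, reference_share in reference_shares.items():
        observed_share = counts.get(group, 0) / total
        if observed_share < min_ratio * reference_share:
            gaps[group] = (observed_share, reference_share)
    return gaps


# Hypothetical dataset skewed toward one group.
labels = ["EUR"] * 850 + ["EAS"] * 50 + ["AFR"] * 60 + ["SAS"] * 40
reference = {"EUR": 0.40, "EAS": 0.25, "AFR": 0.20, "SAS": 0.15}
flagged = representation_gaps(labels, reference)
for group, (observed, expected) in flagged.items():
    print(f"{group}: observed {observed:.1%}, reference {expected:.1%}")
```

Running a check like this before a generative run, and keeping the output with the invention record, gives the prosecution team a dated artifact to point to if a utility rejection raises training data composition.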
Investment Strategy: Portfolio-Level AI Drug IP Risk Management
- Conduct annual inventorship documentation audits across the full AI drug discovery portfolio. The 78% of pharma companies that mandate these audits post-DABUS have a measurable litigation defense advantage.
- Allocate IP legal budget to build out the CIP prosecution strategy as clinical data accumulates. Every new indication, dosing regimen, or formulation improvement is an opportunity to add a patent family that complicates generic entry timing.
- Model training data diversity risk as a patent utility variable. Programs trained on demographically homogeneous genomics datasets carry a non-zero probability of USPTO utility rejection and post-market FDA scrutiny. Price this risk into portfolio valuations.
- The FDA’s AI drug development documentation requirements and the USPTO’s inventorship documentation requirements are partially overlapping. Companies that build an integrated documentation system covering both regulatory tracks reduce total compliance cost and strengthen both their patent applications and their NDA submissions simultaneously.
- Global AI drug discovery investment is projected to reach $15.7B by 2026, while investment in IP valuation infrastructure lags platform investment by an estimated 3-4 years. The companies that close that gap first, with standardized documentation, blockchain provenance, and CIP prosecution strategies, will have structurally more defensible patent portfolios when generic challengers arrive.
The Bottom Line for IP Teams and Portfolio Managers
The legal framework governing AI drug discovery patents is functional but unfinished. Human inventors can own patents on AI-assisted discoveries. The 2024 USPTO guidance, the Federal Circuit’s Thaler ruling, and the EPO’s DABUS decisions collectively define a workable path. The companies that walk that path successfully share four characteristics: they document human inventive contributions at each stage of the AI workflow before filing, not during prosecution; they structure patent portfolios in multiple layers covering the compound, formulation, and method-of-treatment claims; they maintain AI model architecture and training data as trade secrets behind genuine confidentiality infrastructure; and they conduct annual inventorship audits to identify documentation gaps before a Paragraph IV filer finds them.
The companies that fail — the ones whose AI drug patents get challenged, partially invalidated, or abandoned in the face of generic entry — typically made one of four mistakes: they filed provisional applications before securing in vitro validation data, they relied on a single compound patent without layered coverage, they shared training data without adequate IP provisions in licensing agreements, or they let the AI platform team operate without IP counsel involvement until the application was already drafted.
With global AI drug discovery investment on track to reach $15.7 billion by 2026, the gap between platform capability and IP governance infrastructure is narrowing — but it has not closed. The researchers who build the documentation protocols, the patent prosecutors who draft the genus claims with adequate representative species, and the IP counsel who negotiate the data licensing agreements are not supporting functions to the AI drug discovery enterprise. They are the enterprise, because without defensible patents, the drugs those AI platforms discover cannot generate the returns that justify the platform investment in the first place.
Key Statistics
- 23%: AI drug patents granted in 2024 that faced post-grant validity challenges within 12 months
- 78%: Pharma companies mandating inventorship audits for AI projects (Deloitte, 2025)
- $15.7B: Projected global AI drug discovery investment by 2026
- 45+: Patents in Insilico Medicine’s INS018_055 IP stack across 4 claim categories
- $2.1B: Merck’s acquisition of Atomwise (2025), with AI IP governance clauses
Core Glossary
Pannu Factors
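Three-part Federal Circuit test for joint inventorship. Each inventor must (1) contribute in some significant manner to the conception or reduction to practice of the invention; (2) make a contribution that is not insignificant in quality when measured against the full invention; and (3) do more than merely explain well-known concepts or the current state of the art.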
Paragraph IV Filing
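ANDA certification alleging an originator’s Orange Book patent is invalid, unenforceable, or not infringed. If the patent holder sues within 45 days of receiving notice, FDA approval of the ANDA is stayed for up to 30 months while the litigation proceeds; the primary generic entry mechanism.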
Continuation-in-Part (CIP)
Patent application adding new matter to a parent application while retaining the parent’s priority date for original disclosed subject matter. Core evergreening tool.
rNPV
Risk-adjusted net present value. Standard pharma valuation model applying clinical PTS factors and patent-adjusted exclusivity to projected peak sales.
SHAP Values
SHapley Additive exPlanations — technique decomposing AI model outputs into contributions from individual input features. Used to document AI decision pathways for patent prosecution.
Tanimoto Coefficient
Structural similarity metric for molecules, ranging from 0 (no shared fingerprint features) to 1 (identical fingerprints). A threshold of 0.85 is a commonly used FTO screening cutoff for flagging prior art proximity in AI-generated compound libraries.
Bayh-Dole March-In Rights
Federal government right to compel licensing of patents on federally funded inventions if the patent holder fails to bring the invention to practical application. Applies to NIH-funded AI drug programs.
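As a worked illustration of the Tanimoto Coefficient entry above, the sketch below computes the coefficient as shared fingerprint bits divided by total distinct bits, then applies the 0.85 screening cutoff. It is a minimal example under stated assumptions: the fingerprints are hand-written sets of bit positions rather than output from a cheminformatics toolkit, and the compound names and the helper `flag_close_prior_art` are hypothetical.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def flag_close_prior_art(candidate_bits, prior_art, threshold=0.85):
    """Return names of prior-art compounds at or above the similarity cutoff."""
    return [name for name, bits in prior_art.items()
            if tanimoto(candidate_bits, bits) >= threshold]


# Hypothetical fingerprints: bit positions set for each molecule.
candidate = set(range(1, 21))                       # bits 1..20
prior_art = {
    "known_compound_A": set(range(1, 20)) | {21},   # shares 19 of 21 bits
    "known_compound_B": set(range(1, 11)) | set(range(30, 40)),
}
for name, bits in prior_art.items():
    print(f"{name}: Tanimoto = {tanimoto(candidate, bits):.3f}")
print("Flagged for FTO review:", flag_close_prior_art(candidate, prior_art))
```

The result depends heavily on which fingerprint scheme generates the bits, so a cutoff is only meaningful when the fingerprint type is recorded alongside it.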
Research compiled from USPTO, EPO, UKIPO, CNIPA, and WIPO official records; Federal Circuit and Supreme Court decisions; Deloitte 2025 Pharma AI Survey; Insilico Medicine public filings; WIPO IP Statistics 2024; NIH AIM-HI program documentation. All financial figures are estimates based on publicly available information. This document does not constitute legal or investment advice.