Biosimilar Clinical Trial Design: The Playbook That Wins Regulatory Approval

The global biologics market crossed $400 billion in annual revenue in 2023 and is tracking toward $600 billion by 2026 [1]. Every year that a blockbuster biologic ages without facing biosimilar competition, healthcare systems collectively absorb billions in avoidable drug costs. Adalimumab alone cost U.S. payers approximately $21 billion annually before the first biosimilar wave arrived [2]. The commercial case for developing biosimilars is not ambiguous.

What is ambiguous, and what kills programs at a rate that would alarm a casual observer, is the clinical development strategy. Unlike small-molecule generics, a biosimilar program cannot coast to approval on the strength of a bioequivalence study run in 24 healthy volunteers over a weekend. These are large, complex proteins manufactured in living cells. Their development demands a layered, integrated clinical strategy where every study from the first analytical characterization to the final confirmatory trial forms a single, internally consistent argument.

This article breaks down that strategy in full. We cover the science, the regulatory math, the statistical architecture of equivalence trials, the immunogenicity problem, the operational variables that torpedo otherwise competent programs, and the commercial intelligence work that has to happen before the first vial is filled. The goal is a complete, usable framework for senior professionals who already understand the basic vocabulary and need the decision-relevant detail.

Part One: The Strategic Frame

What Biosimilar Development Actually Requires

Start with what the FDA, EMA, and most other major agencies actually ask you to prove: that your biological product is ‘highly similar’ to the reference product and that there are no clinically meaningful differences in safety, purity, or potency [3]. That sentence looks manageable until you try to operationalize it.

The problem is the molecule itself. A monoclonal antibody like adalimumab has a molecular weight of roughly 148 kilodaltons, a primary amino acid sequence coded by a specific gene, a three-dimensional folded structure stabilized by disulfide bonds, and an intricate pattern of glycosylation on its Fc region that varies depending on the cell line and culture conditions used to make it. You cannot synthesize this molecule from scratch and verify identity by comparing two structural diagrams the way you would with a small molecule. You produce it in living cells, which introduces inherent batch-to-batch variability. The reference product itself has that variability. Every lot of Humira ever manufactured has been slightly different from every other lot at the molecular level, and the FDA has approved those lots as equivalent to each other. Your task is to land your product within that existing natural range of variation and prove, comprehensively, that you have done so.

This is why the regulatory concept of ‘totality of the evidence’ exists [3]. No single test, study, or dataset is sufficient. The approval rests on the accumulated weight of analytical data, functional assay data, human pharmacokinetic data, clinical safety and efficacy data, and immunogenicity data, all pointing in the same direction. A weakness in one layer requires compensatory strength in another.

Understanding this structure is the first strategic act in any biosimilar program. It tells you where to concentrate your early investment, how to sequence your development activities, and where the most dangerous failure points are.

The Stepwise Logic and Why Sequence Matters

The stepwise approach that the FDA describes formally in its guidance [3] mirrors basic scientific reasoning. You start with the most fundamental level of evidence, characterize what you know and don’t know, and design subsequent studies to close the remaining gaps.

At the base sits analytical characterization. This is the highest-density, highest-ROI investment you make. A comprehensive analytical similarity package that demonstrates structural and functional equivalence at the molecular level does several things simultaneously: it validates your manufacturing process, it identifies any critical quality attributes that might need monitoring, and it reduces the residual uncertainty that regulators carry into the clinical review. Less residual uncertainty means smaller clinical trials, which means lower costs and faster timelines.

The next layer is human pharmacokinetics, almost always conducted as a comparative study in healthy volunteers or, for molecules with significant safety concerns, in patients. A clean PK study that demonstrates bioequivalence using the standard 80-125% equivalence criterion for AUC and Cmax [4] closes the gap between ‘structurally similar’ and ‘behaves the same way in a human body.’ This layer is where many programs experience their first setbacks, often because the study was underpowered or the reference product lots used in the comparison were not adequately characterized.

The apex of the pyramid is the confirmatory efficacy and safety trial. This study exists to address whatever residual clinical uncertainty the analytical and PK data did not resolve. A biosimilar program that front-loads its investment in the analytical layer can, in some cases, make a compelling case that the confirmatory trial is not necessary or can be substantially reduced in size. The FDA’s own published examples, including several monoclonal antibody approvals, demonstrate that this path is real, not theoretical [5].

The sequencing is not optional. Skipping or compressing the early layers to get to the clinic faster is a common and costly mistake. It produces expensive confirmatory trials that are under-designed because the developers did not understand the molecule well enough to set a defensible equivalence margin, or it generates PK failures that could have been predicted and corrected at the bench.

Part Two: The Analytical Foundation

Building the Molecular Fingerprint

Before any human is enrolled in any study, you need to know your molecule in forensic detail. This is not a regulatory checkbox. It is the intellectual core of the biosimilar development strategy.

The analytical program has two objectives. The first is comparative: demonstrate that the biosimilar candidate and the reference product are highly similar across every relevant quality attribute. The second is developmental: understand which of those attributes are critical to safety and efficacy and therefore must be controlled tightly throughout the manufacturing process.

A complete analytical similarity assessment covers four domains.

First, primary structure analysis uses peptide mapping by mass spectrometry to confirm that the amino acid sequence is correct and that post-translational modifications and disulfide bond connectivity match the reference product. Any discrepancy in primary structure is a program termination event. Second, higher-order (secondary and tertiary) structure analysis uses circular dichroism spectroscopy, Fourier-transform infrared spectroscopy, and, for the most thorough programs, hydrogen-deuterium exchange mass spectrometry to map the three-dimensional folding of the protein. Third, glycosylation analysis is particularly intensive for IgG antibodies, where the Fc glycan profile directly affects Fc gamma receptor (FcγR) binding and, by extension, antibody-dependent cellular cytotoxicity (ADCC) and serum half-life [6].

The fourth domain is functional biological activity, assessed through a panel of cell-based and binding assays designed to measure every known mechanism of action of the reference product. For a monoclonal antibody like rituximab, this means measuring antigen binding affinity, ADCC potency, CDC potency, and apoptosis induction in appropriate cell lines. The goal is not merely to confirm that each assay falls within a statistical range but to demonstrate that the quantitative relationship between structural attributes and biological activity is the same for both products.

Critical Quality Attributes: Deciding What Matters

Not all molecular attributes carry equal clinical weight. Identifying the Critical Quality Attributes (CQAs) for the reference product is a structured risk assessment, not a literature review [7]. It requires integrating structural data, mechanistic understanding of the drug’s pharmacology, and clinical knowledge about which aspects of the molecule’s behavior correlate with efficacy and safety.

For a monoclonal antibody operating primarily through its Fab domain (antigen binding), the glycosylation profile of the Fc region may be less critical than for an antibody that depends heavily on ADCC. For an antibody whose therapeutic mechanism is direct target neutralization rather than immune effector function, Fc receptor binding assays may be informative but not primary. Understanding these mechanistic distinctions shapes which attributes you designate as CQAs, how tightly you control them in manufacturing, and how much analytical effort you invest in their comparative assessment.

The FDA uses a tiered analytical similarity assessment framework in which attributes are ranked by their criticality and analytical uncertainty. Tier 1 attributes are those with the highest clinical relevance, assessed using equivalence approaches with formal statistical testing. Tier 2 and Tier 3 attributes are assessed with progressively less formal statistical methodology [3]. Designing your analytical program around this tiered structure gives you an audit-ready package that directly maps onto the regulatory review process.

Reference Product Characterization: The Often-Neglected Step

One of the most consistently underestimated elements of the analytical program is the characterization of the reference product itself. Regulators expect you to demonstrate that the reference product lots used in all your comparative studies (analytical, PK, and clinical) are representative of the reference product’s known range of variation and are manufactured under the commercial process, not some early or transitional process.

This means procuring multiple commercial lots from different manufacturing campaigns, characterizing each one using your analytical panel, and establishing that the variability you observe within the reference product’s own lot-to-lot range defines the target space your biosimilar must fit within. A biosimilar candidate that sits squarely in the center of the reference product’s natural variability range is analytically bulletproof. One that matches only some lots is exposed.
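One way to make the "target space" idea concrete is a quality-range check on a single attribute. The sketch below is illustrative only: the attribute, the lot values, and the multiplier k = 3 are assumptions chosen for the example, not figures from any guidance or real product, and the multiplier in a real program would be justified attribute by attribute.

```python
from statistics import mean, stdev

def quality_range(reference_lots, k=3.0):
    """Return (low, high) = mean +/- k * sample SD of the reference lot values."""
    m = mean(reference_lots)
    s = stdev(reference_lots)
    return m - k * s, m + k * s

def fraction_within(candidate_lots, low, high):
    """Fraction of candidate lot values that fall inside [low, high]."""
    inside = sum(1 for v in candidate_lots if low <= v <= high)
    return inside / len(candidate_lots)

# Hypothetical data: one glycan attribute (% of total) measured per lot.
reference = [8.1, 7.6, 8.4, 7.9, 8.8, 8.2, 7.7, 8.5, 8.0, 8.3]
biosimilar = [8.0, 8.3, 7.8, 8.6, 8.1]

low, high = quality_range(reference, k=3.0)
print(f"reference quality range: {low:.2f} to {high:.2f}")
print(f"biosimilar lots inside range: {fraction_within(biosimilar, low, high):.0%}")
```

The more reference lots and manufacturing campaigns feed the range, the more credible it is as a description of the reference product's real variability.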

This work is often squeezed in the race to the clinic, and it can have fatal downstream consequences. If you cannot demonstrate that your reference product comparator lots are representative, regulators will question whether the entire comparative study program is valid.

Part Three: The Global Regulatory Architecture

FDA vs. EMA: The Core Differences That Drive Program Design

The U.S. BPCIA pathway, enacted in March 2010 (the statute’s title carries the year 2009), and the EMA’s framework, which predates it by roughly five years, share a common intellectual foundation [3, 8]. Both require a totality of evidence approach. Both support extrapolation of indications. Both demand a comparative immunogenicity assessment. The differences lie in specific operational requirements, and for a company targeting both markets, those differences drive significant program design decisions.

The EMA has historically maintained a stronger default preference for comparative clinical trials, particularly for complex molecules [8]. While the FDA has been progressively open to reducing the clinical component where analytical and PK data are strong, the EMA’s public assessment reports on approved biosimilars suggest a somewhat more conservative posture on extrapolation, particularly for indications where the mechanism of action differs meaningfully from the one studied in the confirmatory trial.

Reference product sourcing is the single most consequential operational difference. The EMA requires that the reference product used in comparative studies be sourced from within the European Economic Area [8]. The FDA requires a U.S.-licensed reference product [3]. For a global program, this means you need both. The practical solution is to design a three-arm comparative PK study that includes your biosimilar, the U.S.-licensed reference product, and the EEA-licensed reference product in the same study, explicitly bridging the two reference products to each other. If the U.S. and EU versions of the reference product are PK-equivalent (which they usually are, given that the same molecule is manufactured under the same validated process for global distribution), that bridging exercise is straightforward. If they are not, your global program has a serious complication to resolve before moving forward.

The interchangeability designation is the other major U.S.-specific consideration. The BPCIA created a higher evidentiary standard for products deemed interchangeable with the reference product, allowing automatic pharmacy-level substitution where state law permits [9]. The EMA has no equivalent designation; substitutability is a member-state-level policy decision. For a company focused on the U.S. market, pursuing interchangeability requires additional switching study data, which adds cost and time but can provide substantial formulary leverage once achieved.

Health Canada, the PMDA, and the Emerging Markets Consideration

Health Canada and Japan’s Pharmaceuticals and Medical Devices Agency have both established biosimilar frameworks that broadly align with the EMA’s approach, though with their own specific data requirements around reference product sourcing and the acceptance of foreign clinical data [10, 11]. For companies targeting global markets, the practical path is to design the core clinical program to satisfy the most stringent combined requirements of the FDA and EMA, and then conduct targeted supplemental analyses or bridging studies to satisfy Health Canada and the PMDA without running duplicative full programs.

Markets in South Korea, Australia, India, and Brazil have each established their own biosimilar pathways with varying degrees of sophistication and alignment to the ICH guidelines [12]. The practical reality is that the clinical data package generated to satisfy the FDA and EMA is generally sufficient as the foundation for submissions in these markets, with market-specific regulatory submissions and potential bridging studies for local manufacturing or reference product sourcing differences.

The Extrapolation Calculus

Extrapolation is where development cost equations change dramatically. If you can support approval in Indication B on the basis of a confirmatory trial conducted in Indication A, you avoid a trial that might cost $50-100 million and take three years. Regulators will support extrapolation when three conditions are met: the mechanism of action is the same across both indications (or at least the relevant components of the mechanism are the same), the PK/PD behavior is expected to be similar in both patient populations, and the safety and immunogenicity profiles do not present indication-specific risks that require direct clinical study [3, 8].

For anti-TNF agents used in rheumatoid arthritis, psoriasis, and inflammatory bowel disease, the core mechanism is the same: neutralization of TNF-alpha. A confirmatory trial in rheumatoid arthritis, combined with a strong analytical and PK package, has supported extrapolation to psoriasis and Crohn’s disease for multiple approved biosimilars [13]. The justification must be written explicitly and reviewed by regulators; extrapolation is not automatic, and agencies have rejected it in cases where they felt the mechanistic or immunological arguments were insufficient.

The choice of the confirmatory trial’s indication is therefore a strategic decision, not just a scientific one. You want the indication that is most sensitive for detecting potential differences (making the study maximally informative), most efficiently enrolled (minimizing timeline), and carries the most defensible extrapolation argument to the other high-value indications.

Part Four: The Pharmacokinetic and Pharmacodynamic Studies

The PK Study as the First Line of Human Evidence

The comparative PK study is the first major clinical milestone in every biosimilar program, and its design deserves the same analytical rigor as the confirmatory efficacy trial. A PK failure at this stage does not just delay the program; it generates a body of evidence that regulators will scrutinize in the confirmatory trial and often requires manufacturing process investigation before the program can continue.

The study must demonstrate bioequivalence, defined as a 90% confidence interval for the biosimilar-to-reference geometric mean ratio of both AUC and Cmax that falls entirely within 80-125% [4]. The two core design choices, which are linked, are the study population and the study design structure.
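The acceptance test can be sketched for a parallel-group design as follows. This is a simplified illustration, not a validated analysis: the AUC values are invented, the function names are hypothetical, and a normal quantile stands in for the t quantile (or the ANOVA model) that a real analysis would use, which is only a reasonable approximation when the groups are large.

```python
import math
from statistics import NormalDist, mean, variance

def gmr_confidence_interval(test_values, ref_values, level=0.90):
    """Point estimate and CI for the test/reference geometric mean ratio."""
    log_t = [math.log(v) for v in test_values]
    log_r = [math.log(v) for v in ref_values]
    diff = mean(log_t) - mean(log_r)  # difference of log means
    se = math.sqrt(variance(log_t) / len(log_t) + variance(log_r) / len(log_r))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # normal approximation (assumption)
    return math.exp(diff), math.exp(diff - z * se), math.exp(diff + z * se)

def is_bioequivalent(lower, upper, lo_limit=0.80, hi_limit=1.25):
    """True only if the whole interval lies inside the acceptance limits."""
    return lo_limit <= lower and upper <= hi_limit

# Hypothetical AUC values (arbitrary units), one per subject.
ref_auc = [100, 120, 90, 110, 105, 95, 115, 100]
test_auc = [v * 1.02 for v in ref_auc]

gmr, lower, upper = gmr_confidence_interval(test_auc, ref_auc)
print(f"GMR {gmr:.3f}, 90% CI {lower:.3f} to {upper:.3f}")
print("within 80-125%:", is_bioequivalent(lower, upper))
```

The same check is run separately for AUC and Cmax, and both must pass.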

Crossover vs. Parallel Group: The Statistical Efficiency Tradeoff

A two-sequence, two-period crossover design (each subject receives both products in random order, separated by a washout period) is statistically more efficient than a parallel-group design because it eliminates inter-subject variability from the treatment comparison. Every subject serves as their own control. For a fixed level of statistical power, a crossover design typically requires 30-50% fewer subjects than a parallel-group design [14].

The catch is the washout period. The usual rule of thumb is that at least five half-lives must elapse between the last dose of the first treatment and the first dose of the second, by which point about 97% of the drug has been eliminated. For a monoclonal antibody with a half-life of roughly 3 weeks, which is typical of IgG antibodies, that washout period extends to 15 weeks. When you add the sampling time for the first and second treatment periods, the total study duration approaches nine months for a single-dose crossover design. In a competitive development race, that timeline is rarely acceptable.
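The washout arithmetic is easy to reproduce. The sketch below assumes simple first-order elimination, which is itself a simplification for antibodies with target-mediated clearance; the function names are illustrative.

```python
def fraction_remaining(n_half_lives):
    """Fraction of drug left after n half-lives under first-order elimination."""
    return 0.5 ** n_half_lives

def washout_weeks(half_life_weeks, n_half_lives=5):
    """Washout duration implied by the n-half-lives rule of thumb."""
    return half_life_weeks * n_half_lives

for n in range(1, 6):
    print(f"after {n} half-lives: {fraction_remaining(n):.1%} remaining")

print("washout for a 3-week half-life:", washout_weeks(3), "weeks")
```

Running the same function with a half-life of a few hours, as for filgrastim-like proteins, shows why crossover designs remain practical for short-lived molecules.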

The result is a practical segmentation. For small proteins with short half-lives (filgrastim, epoetin, insulin), crossover designs are standard and deliver statistically clean results with modest sample sizes. For monoclonal antibodies, a parallel-group design is the default, which means larger study populations and greater sensitivity to between-subject variability in clearance rates, immunogenicity, and baseline characteristics.

Healthy Volunteers vs. Patients: The Population Decision

Most biosimilar PK studies are conducted in healthy volunteers rather than patients. The reasons are well-established. Healthy volunteers have no underlying disease to alter drug PK, are not on concomitant medications that cause drug interactions, have standardized body weights and organ function, and provide the cleanest possible experimental system for detecting differences between the biosimilar and the reference product [3].

The exception arises when the drug has a safety profile that makes administration to healthy individuals unacceptable. Many oncology biologics raise this concern. Cetuximab is one example: its toxicity profile, notably skin and infusion reactions, makes dosing a healthy subject difficult to justify ethically. The line is not absolute, however; single-dose PK studies of trastuzumab and bevacizumab biosimilar candidates have been conducted in healthy volunteers. Where healthy-volunteer dosing is ruled out, the PK study is typically embedded within the confirmatory clinical trial in patients, with pharmacokinetic sampling planned as a prespecified component of the study design.

When PK is assessed in patients, interpretation becomes more complex. Disease activity itself affects the clearance of some biologics, particularly those whose target contributes to drug clearance via target-mediated drug disposition. If you are studying a biosimilar to tocilizumab in rheumatoid arthritis patients, the abundance of the IL-6 receptor (the drug’s target, present in soluble and membrane-bound forms) in each patient will affect the PK curve. This requires careful modeling and a study population that is homogeneous enough in disease severity to make the comparison interpretable.

Pharmacodynamics: Where to Use It and Where to Avoid Overinterpreting It

A PD biomarker is most valuable when it provides a quantitative, mechanistic link between drug exposure and biological effect. The neutrophil response to filgrastim (measured as absolute neutrophil count over time) is the canonical example. It is sensitive, precise, mechanistically direct, and produces a dose-response curve that can be compared between products with high statistical power [15].

For most monoclonal antibodies, no single biomarker has the same properties. The inflammatory pathways affected by anti-TNF agents, IL-17 inhibitors, and IL-23 inhibitors are complex, redundant, and highly variable between individual patients. Measuring a single inflammatory marker like CRP or IL-6 as a PD endpoint for an anti-TNF biosimilar provides at best supportive evidence; it does not constitute a valid equivalence assessment. Using such exploratory biomarker data as if it were primary PD evidence, or over-interpreting it in regulatory submissions, is a mistake that invites regulatory pushback.

The appropriate use of PD biomarkers in biosimilar programs is therefore tiered. For molecules with clear, validated, quantitative biomarkers, design a dedicated comparative PD study as part of the early clinical program and use the results as primary supportive evidence for biosimilarity. For molecules without validated PD biomarkers, measure an exploratory panel as part of the confirmatory trial, report the findings transparently, and frame them explicitly as supportive of the totality of evidence rather than as primary equivalence data.

Part Five: The Confirmatory Efficacy and Safety Trial

When You Need It and When You Might Not

The question of whether a comparative efficacy trial is actually necessary is one of the most consequential, and most frequently mishandled, decision points in biosimilar development. The answer is not binary.

The FDA’s position, expressed across multiple guidance documents and reflected in actual approval decisions, is that a confirmatory efficacy trial is required when the analytical and PK data leave residual clinical uncertainty that cannot be resolved by other means [3]. Conversely, where the analytical similarity is exceptionally strong and the PK bridge is clean, the FDA has approved biosimilars without a dedicated comparative efficacy trial. Zarxio (filgrastim-sndz), the first U.S. biosimilar, was approved with a clinical program that relied heavily on PK/PD data without a traditional comparative efficacy trial of the kind typical in the EU [13].

The practical decision framework is: map the specific residual uncertainties that remain after your analytical and PK data, articulate what clinical data would be needed to close each uncertainty, and propose a study design that closes those uncertainties as efficiently as possible. A regulatory meeting before you design the confirmatory trial is not an optional extra; it is basic prudence. The FDA’s BPD Type 2 meeting and the EMA’s Scientific Advice process exist precisely to resolve these questions before you spend $50-100 million running a trial that may be over-designed, under-designed, or designed around the wrong endpoint.

Equivalence Trial Architecture: The Mathematics of Proving Similarity

An equivalence trial is structurally different from a superiority trial in ways that affect every design decision from sample size to endpoint selection to analysis methodology.

In a superiority trial, you have a directional hypothesis: your drug is better than control. You need to demonstrate an effect in one direction only, conventionally at a one-sided alpha of 0.025. In an equivalence trial, you must demonstrate that your drug is neither meaningfully worse than nor meaningfully better than the reference product. This requires bounding the difference in both directions, usually with two one-sided tests, which means the confidence interval must fit entirely inside a symmetric (or sometimes asymmetric) window. The statistical consequence is that equivalence trials often require larger sample sizes than a superiority trial on the same endpoint with the same power, because the equivalence margin is much smaller than the treatment effect a superiority trial is sized to detect.

The equivalence margin is the number that defines that window, and its derivation is both a scientific and regulatory negotiation. Consider a rheumatoid arthritis trial using the ACR20 responder rate as the primary endpoint. The FDA has generally accepted margins in the range of ±12 to 15 percentage points for this endpoint in biosimilar confirmatory trials [3]. The derivation of that margin follows a formal process: analyze the historical data from the reference product’s original approval trials to establish the size and reliability of the treatment effect, and then determine what fraction of that effect you are willing to risk ‘losing’ in the worst-case scenario. This assay sensitivity argument, borrowed from non-inferiority trial methodology, grounds the margin in the clinical pharmacology of the disease and the drug.
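The derivation can be sketched with the fixed-margin logic used in non-inferiority work. Everything numeric here is an assumption for illustration: the historical responder counts are invented, a single pooled reference-versus-placebo comparison stands in for a proper meta-analysis, and retaining 50% of the effect is a common convention rather than a requirement.

```python
import math
from statistics import NormalDist

def fixed_margin(resp_ref, n_ref, resp_pbo, n_pbo, retain=0.5, level=0.95):
    """Return (effect, m1, margin) for a responder-rate endpoint.

    effect: historical reference-minus-placebo difference in response rate
    m1:     lower confidence bound of that effect (conservative estimate)
    margin: share of m1 one accepts putting at risk, i.e. (1 - retain) * m1
    """
    p_ref = resp_ref / n_ref
    p_pbo = resp_pbo / n_pbo
    effect = p_ref - p_pbo
    se = math.sqrt(p_ref * (1 - p_ref) / n_ref + p_pbo * (1 - p_pbo) / n_pbo)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    m1 = effect - z * se
    return effect, m1, (1 - retain) * m1

# Hypothetical pooled historical data: responders / subjects per arm.
effect, m1, margin = fixed_margin(resp_ref=190, n_ref=300, resp_pbo=90, n_pbo=300)
print(f"historical effect: {effect:.3f}")
print(f"conservative effect (lower bound): {m1:.3f}")
print(f"candidate margin: +/-{margin:.3f}")
```

With these invented inputs the candidate margin lands near 13 percentage points, which shows how margins of this general size can arise from the historical effect rather than from convention alone.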

A margin that is too wide fails the physician’s clinical plausibility test: a biosimilar that is ‘equivalent’ at ±20 percentage points might actually be detectably less effective in some patients. A margin that is too narrow inflates the required sample size to an operationally untenable level. For most established biologics in immunology, the expected response rate in the reference arm is around 60-70% for ACR20, which gives you enough effect size to work with a margin that is clinically defensible and statistically practical.
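The trade-off between margin width and trial size can be illustrated with the standard normal-approximation formula for an equivalence test on a responder rate. This is a planning sketch under stated assumptions: the true difference is taken as zero, both arms share the same response rate, and the one-sided alpha of 0.05 per test (a 90% confidence interval) and 90% power are defaults chosen for the example, to be agreed with the agency in practice.

```python
import math
from statistics import NormalDist

def n_per_arm_equivalence(p, margin, alpha=0.05, power=0.90):
    """Approximate subjects per arm for two one-sided tests on a rate difference.

    p:      assumed response rate in both arms (true difference assumed zero)
    margin: equivalence margin on the absolute scale (0.13 = 13 points)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(1 - (1 - power) / 2)
    n = (z_alpha + z_beta) ** 2 * 2 * p * (1 - p) / margin ** 2
    return math.ceil(n)

for m in (0.10, 0.13, 0.15, 0.20):
    print(f"margin +/-{m:.2f}: about {n_per_arm_equivalence(0.65, m)} per arm")
```

Because the margin enters as a square in the denominator, tightening it from 15 to 10 points more than doubles the required enrollment.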

Endpoint Selection: Sensitivity Versus Speed

The primary endpoint must satisfy three criteria simultaneously. It must be clinically meaningful enough to convince regulators and physicians that equivalence on this measure matters. It must be sensitive enough to detect a true difference between products if one exists. It must be assessable early enough in the study to make the trial duration practical.

For immunology biosimilars, the field has largely standardized on binary responder endpoints measured at weeks 12-24 as primary efficacy measures: ACR20 for rheumatoid arthritis, PASI75 for psoriasis, and clinical response indices for IBD. These endpoints are well-validated, provide well-characterized historical response distributions in the reference product’s trials, and are assessable before the full immunogenicity profile develops.

For oncology biosimilars, objective response rate (ORR), measured by RECIST criteria, has been the dominant primary endpoint in confirmatory trials. ORR is directly attributable to the drug’s anti-tumor activity, is measurable within 6-12 weeks, and has a well-defined variance structure from the reference product’s historical data. Overall survival (OS), while clinically more meaningful in the long term, is too noisy for an equivalence assessment: it is influenced by subsequent lines of therapy, physician choices, patient compliance, and disease biology in ways that make it genuinely difficult to determine whether any observed OS difference is due to the drugs being compared or to these confounders.

The choice of oncology indication for the confirmatory trial also carries a biological rationale for extrapolation. A trastuzumab biosimilar confirmatory trial in HER2-positive early breast cancer, where trastuzumab’s mechanism (HER2 blockade plus ADCC) operates in a relatively controlled setting with a well-characterized patient population, provides a cleaner scientific basis for extrapolating to HER2-positive metastatic gastric cancer than the reverse.

Part Six: The Immunogenicity Problem

Why Immunogenicity Is the Dominant Safety Variable

All therapeutic proteins are potentially immunogenic. The human immune system recognizes large, complex, non-self proteins and can mount antibody responses against them. For most approved biologics, this happens in a fraction of patients and has manageable clinical consequences. But in specific circumstances, anti-drug antibodies can neutralize the drug’s therapeutic effect, accelerate its clearance, or, in the worst cases, cross-react with the patient’s own endogenous proteins.

The case of recombinant erythropoietin and pure red cell aplasia established the clinical stakes permanently. Around the turn of the 2000s, a formulation change in one erythropoietin product sold outside the U.S. was followed by an increase in immunogenicity: patients developed antibodies that neutralized both the drug and their own endogenous EPO, causing severe anemia [16]. The mechanism was traced to a formulation excipient change that destabilized the protein and increased aggregate formation. This event is cited in every discussion of biosimilar immunogenicity because it demonstrates that minor formulation differences can have serious clinical consequences, and that immunogenicity signals may not appear in short-duration clinical trials.

For a biosimilar developer, the immunogenicity assessment must accomplish two things. It must generate a head-to-head comparison of the biosimilar’s and reference product’s immunogenicity profiles in patients, demonstrating that the incidence, titer, and clinical consequences of anti-drug antibody formation are not meaningfully different. It must also be comprehensive enough and long-duration enough to capture the full scope of the immunogenicity profile, not just the early-onset response.

The Multi-Tiered Assay Strategy

The laboratory infrastructure for immunogenicity assessment is more complex than the clinical trial itself in some programs. The FDA and EMA both expect a tiered testing approach with analytically validated assays at each tier [17].

The screening assay is designed for maximum sensitivity. Its purpose is to ensure that no ADA-positive sample escapes detection. Because of this design philosophy, it will have a meaningful false-positive rate. Every sample that screens positive is then subjected to a confirmatory assay, which is more specific, using drug competition to verify that the positive signal is genuinely due to anti-drug binding rather than non-specific matrix effects. Confirmed positive samples then go through a neutralizing antibody (NAb) assay, typically a cell-based functional assay that measures the capacity of the antibodies to block the drug’s biological activity.

Assay development and validation must be completed before patient dosing begins and must satisfy acceptance criteria consistent with the FDA and EMA guidance on immunogenicity testing [17]. The key assay parameters are sensitivity (minimum detectable antibody concentration), drug tolerance (the maximum concentration of drug in the sample that does not interfere with the assay), specificity, and the screening cut-point (the threshold above which a sample is called positive, commonly set near the 95th percentile of the negative control distribution so that roughly 5% of true negatives screen positive).
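A screening cut-point of this kind can be sketched parametrically. The example is deliberately minimal: the negative-control signals are invented, the distribution is assumed to be approximately normal, and steps a validated method would include (outlier handling, transformation, plate-level normalization) are left out.

```python
from statistics import NormalDist, mean, stdev

def parametric_cut_point(negative_signals, false_positive_rate=0.05):
    """Mean + z * SD of drug-naive negative controls (normality assumed)."""
    z = NormalDist().inv_cdf(1 - false_positive_rate)
    return mean(negative_signals) + z * stdev(negative_signals)

def screens_positive(sample_signal, cut_point):
    """A sample above the cut-point moves on to the confirmatory tier."""
    return sample_signal > cut_point

# Hypothetical assay signals from drug-naive negative control samples.
negative_controls = [0.081, 0.095, 0.088, 0.102, 0.091, 0.085,
                     0.097, 0.090, 0.093, 0.087, 0.099, 0.084]

cut = parametric_cut_point(negative_controls)
print(f"screening cut-point: {cut:.4f}")
for signal in (0.089, 0.104, 0.250):
    print(signal, "->", "screen positive" if screens_positive(signal, cut) else "negative")
```

The deliberately permissive threshold is why the confirmatory tier exists: it removes the false positives that the screening tier is designed to let through.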

A frequent operational failure is the use of under-validated assays or assays with insufficient drug tolerance. Most biologics are dosed at concentrations that persist in the blood for weeks. If you collect an ADA sample at a time point when drug concentrations are still high, the drug in the sample will compete with the assay’s capture and detection reagents and suppress the ADA signal, producing false negatives. Designing the sampling schedule to capture trough samples (just before the next dose, when drug concentrations are lowest) and using assays with validated drug tolerance specifications that match actual patient trough concentrations is fundamental to generating reliable immunogenicity data.

Clinical Interpretation: The Comparison That Matters

The ADA incidence data from the biosimilar and reference product arms of the confirmatory trial must be compared in a way that is statistically and clinically interpretable. Point estimates for ADA incidence in each arm are not sufficient; you need confidence intervals around the difference that are narrow enough to rule out a clinically meaningful difference.
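As an illustration of what a confidence interval around the difference looks like, the sketch below uses a simple Wald interval. The counts and arm sizes are hypothetical, and the Wald method is an assumption made for brevity; a statistical analysis plan may well specify a different interval, particularly when incidence is low.

```python
import math
from statistics import NormalDist

def ada_difference_ci(pos_bio, n_bio, pos_ref, n_ref, level=0.95):
    """Wald CI for the difference in ADA incidence (biosimilar minus reference)."""
    p_bio = pos_bio / n_bio
    p_ref = pos_ref / n_ref
    diff = p_bio - p_ref
    se = math.sqrt(p_bio * (1 - p_bio) / n_bio + p_ref * (1 - p_ref) / n_ref)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se

# Hypothetical arms: 45/300 ADA-positive (15%) versus 36/300 (12%).
diff, lower, upper = ada_difference_ci(45, 300, 36, 300)
print(f"difference: {diff:+.1%} (95% CI {lower:+.1%} to {upper:+.1%})")
```

With arms of this size the interval spans several percentage points on either side of the point estimate, which is why the arm size needed for a precise immunogenicity comparison should be considered when the trial is sized.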

But the ADA incidence rate alone is not the complete picture. The clinical relevance of the immunogenicity data comes from the correlation analysis: do patients who develop ADAs have worse efficacy outcomes? Do they have more infusion reactions or hypersensitivity events? Is the pharmacokinetic profile different in ADA-positive versus ADA-negative patients? This correlation structure must be planned as part of the statistical analysis plan and reported transparently in the regulatory submission.

An ADA incidence of 15% in the biosimilar arm and 12% in the reference arm sounds alarming until you examine whether those ADAs affected clinical outcomes. If 95% of the ADAs in both arms were non-neutralizing, transient, and had no detectable impact on drug concentrations or clinical response, the 3-percentage-point difference in overall incidence is not clinically meaningful. Conversely, a biosimilar with 6% ADA incidence but a higher rate of persistent, neutralizing antibodies than the reference product at 8% total incidence has a worse immunogenicity profile despite the lower headline rate. Understanding and communicating the nuance of this data is what separates a professionally prepared regulatory package from a checkbox compliance exercise.
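To make the confidence-interval requirement concrete, here is a minimal sketch of one common approach, the Newcombe hybrid score interval for a difference between two proportions. It applies the 15% versus 12% incidence figures to an assumed 300 patients per arm; the arm size and function names are illustrative assumptions, and the SAP for a real trial would name its own method.

```python
import math
from statistics import NormalDist

def wilson_interval(events, n, level=0.95):
    """Wilson score interval for a single proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

def newcombe_difference_ci(events_1, n_1, events_2, n_2, level=0.95):
    """Newcombe hybrid score CI for p1 - p2, built from two Wilson intervals."""
    p1, p2 = events_1 / n_1, events_2 / n_2
    l1, u1 = wilson_interval(events_1, n_1, level)
    l2, u2 = wilson_interval(events_2, n_2, level)
    diff = p1 - p2
    lower = diff - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = diff + math.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return diff, lower, upper

# Assumed arm sizes: 45/300 ADA-positive (15%) versus 36/300 (12%).
diff, lower, upper = newcombe_difference_ci(45, 300, 36, 300)
```

At this assumed sample size the interval spans zero but extends well past the 3-point observed difference on the upper side. Whether that width is acceptable depends on the threshold for a clinically meaningful difference agreed in advance, and the incidence comparison still has to be read alongside neutralizing status and clinical correlation.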

Part Seven: Interchangeability and Switching Studies

The Regulatory and Commercial Logic of Interchangeability

An FDA interchangeability designation allows a pharmacist to substitute the biosimilar for the reference product without consulting the prescribing physician, subject to state pharmacy laws. As of 2026, most states have enacted substitution laws that give the interchangeability designation real commercial power. A biosimilar on the interchangeable list can be automatically dispensed when the reference product is prescribed, dramatically reducing the physician and pharmacy activation energy required for conversion.

The evidentiary requirement for interchangeability, established in the BPCIA and further specified in FDA guidance [9], is that the biosimilar can be expected to produce the same clinical result as the reference product in any given patient, and that, for products administered more than once, the risk of alternating between the reference product and the biosimilar is not greater than the risk of using the reference product without alternating.

That second requirement is operationalized through switching studies designed to simulate real-world prescription patterns where patients might receive the biosimilar on some refills and the reference product on others.

Switching Study Design: The Multi-Period Approach

A standard switching study for interchangeability uses a randomized design with two parallel arms and multiple treatment periods. One arm remains on the reference product throughout the study (the ‘non-switchers’). The other arm alternates between the biosimilar and the reference product at each treatment period (the ‘switchers’).

A design that the FDA has accepted uses three treatment periods with two switches in the alternating arm, which produces a sequence like: Reference/Biosimilar/Reference (or Biosimilar/Reference/Biosimilar). The primary endpoints compare the switcher arm to the non-switcher arm on PK parameters (AUC and Cmax during the final treatment period), immunogenicity incidence, and efficacy or safety events. The goal is to demonstrate that the multiple-switch experience is no different from staying on the reference product continuously.
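The PK comparison between arms can be sketched as a geometric mean ratio with a 90% confidence interval on log-transformed exposure. The sketch below is a simplified illustration: it uses a large-sample normal approximation rather than a t quantile or a covariate-adjusted model, it checks the interval against the conventional 80-125% bioequivalence limits as an assumed acceptance range, and all AUC values are invented.

```python
import math
from statistics import NormalDist, mean, variance

def gmr_confidence_interval(switch_values, non_switch_values, level=0.90):
    """Geometric mean ratio (switching / non-switching) with a normal-approximation CI."""
    log_s = [math.log(v) for v in switch_values]
    log_n = [math.log(v) for v in non_switch_values]
    diff = mean(log_s) - mean(log_n)
    se = math.sqrt(variance(log_s) / len(log_s) + variance(log_n) / len(log_n))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.exp(diff), math.exp(diff - z * se), math.exp(diff + z * se)

def within_limits(lower, upper, low_limit=0.80, high_limit=1.25):
    """True when the whole interval sits inside the acceptance limits."""
    return low_limit <= lower and upper <= high_limit

# Invented final-period AUC values, 40 patients per arm.
switch_auc = [100.0, 110.0, 90.0, 105.0, 95.0] * 8
non_switch_auc = [100.0, 110.0, 90.0, 105.0, 95.0] * 8

gmr, ci_low, ci_high = gmr_confidence_interval(switch_auc, non_switch_auc)
pk_equivalent = within_limits(ci_low, ci_high)
```

The decision rests on the interval, not the point estimate: a ratio near 1.0 with a wide interval fails just as surely as a shifted ratio with a narrow one.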

The Cyltezo (adalimumab-adbm) switching study, which supported the first FDA interchangeability designation for an adalimumab biosimilar, illustrates the design well. The study enrolled patients with moderate-to-severe chronic plaque psoriasis, randomized them to a non-switching reference arm or a switching arm with three alternating periods, and demonstrated equivalent PK and immunogenicity in the switching arm compared to the non-switching arm over a 22-week study [18].

For a company building a global program, the interchangeability designation adds 18-24 months and $30-60 million to the U.S.-specific development program. The commercial return depends on how effectively the company can translate that designation into formulary preference. In markets where payer contracts drive formulary placement more than automatic substitution laws (which is most of the U.S. commercial market), the interchangeability designation may be less decisive than it appears. The market access strategy must reflect this nuance.

Part Eight: Operational Execution

Patient Recruitment: The Relentless Bottleneck

Biosimilar confirmatory trials operate in therapeutic areas where recruitment competition is fierce. Rheumatoid arthritis, psoriasis, oncology (breast cancer, colorectal cancer), and inflammatory bowel disease are among the most contested spaces in clinical research. At any given time, dozens of trials are open across these indications, competing for the same pool of patients at the same academic medical centers and large rheumatology or oncology practices.

The naive assumption is that biosimilar trials are easier to recruit than novel drug trials because they involve an approved drug rather than an experimental compound. This is wrong. The very fact that the reference product is already commercially available means that physicians can offer it to their patients outside a trial, and many prefer to do so rather than enroll patients in a study where they receive an investigational biosimilar for no additional benefit.

Effective recruitment requires three things above all else: protocol design that minimizes patient and site burden, proactive engagement with high-volume sites that have dedicated research coordinators and patient registries, and a rigorous pre-screening strategy using EHR mining and patient registry analysis to identify eligible candidates before sites are asked to screen them. Every protocol visit that can be eliminated without compromising data quality should be eliminated. Every lab test that is driven by regulatory desire rather than clinical necessity should be challenged.

‘Patient recruitment accounts for up to 40% of total clinical trial cycle time, and delays in recruitment are the leading cause of clinical trial cost overruns. For biosimilar trials specifically, a one-month delay in the primary completion date translates to a one-month delay in regulatory submission, which in the context of a race-to-market biosimilar program, can be worth $50 million or more in lost first-mover commercial revenue.’ (Tufts Center for the Study of Drug Development [19])

Reference Product Procurement: The Supply Chain No One Plans For

The logistical and financial complexity of procuring the reference product for use in clinical trials is underestimated in almost every biosimilar program plan. The innovator company has no obligation to facilitate your development of a competing product, and no commercial incentive to do so.

For large programs, the total quantity of reference product required across all studies (analytical characterization, PK study, switching study, confirmatory trial) can reach tens of thousands of units. At list price, this represents a procurement cost in the range of $10-50 million depending on the molecule. Procurement is typically through authorized wholesale distributors or specialty pharmacies, and the supply must be planned years in advance to ensure consistent lot availability and cold-chain integrity.

The lot consistency requirement compounds the procurement challenge. Your regulatory submission must demonstrate that the reference product lots used in your clinical trials were representative of the commercial product at the time of the trial. This requires analytical testing of every lot procured, retention of reference samples for stability testing, and documentation of the chain of custody from the point of commercial sale to the clinical site dispensing unit. Managing this documentation across a multi-year, multi-country trial program requires dedicated supply chain infrastructure that many smaller biosimilar developers have not built.

For global programs requiring both U.S.- and EEA-sourced reference product, the supply chain doubles in complexity. The two sources must be tracked separately, analyzed independently, and bridged analytically in the regulatory package. Blinding for a double-blind trial must be maintained across both sources, which adds pharmacy operations complexity at every clinical site.

Blinding and Packaging: The Details That Compromise Studies

Maintaining double-blind conditions in a biosimilar trial is operationally demanding in ways that are easy to underestimate. The biosimilar and reference product may differ in subtle sensory properties (viscosity, injection site pain, appearance of the solution) that can unblind patients and investigators even when packaging is matched. Pre-filled syringe devices may have different injection forces. IV formulations may differ in color or clarity.

These issues are not hypothetical. Several published biosimilar trials have noted rates of inadvertent unblinding that were higher than expected, attributable to differences in injection experience between the biosimilar and reference product. For a study whose conclusions rest on an equivalence comparison, a meaningful rate of unblinding introduces a source of assessment bias that regulators will scrutinize during review.

Addressing this requires careful over-encapsulation or over-labeling of the devices, unblinding rate monitoring through patient and investigator questionnaires, and sensitivity analyses in the statistical plan that test whether the results hold after excluding patients where unblinding is suspected.

Statistical Analysis Plan: The Document That Defines the Trial

The Statistical Analysis Plan (SAP) must be finalized and locked before the trial database is locked and unblinded. This requirement is not merely good practice; it is a regulatory expectation, and deviations from a pre-specified SAP require formal justification in the regulatory submission.

The SAP specifies the primary endpoint definition, including the responder criteria and the time point of assessment; the primary analysis method (ANCOVA, logistic regression, or otherwise); the equivalence margin and the confidence interval level (typically 95% for clinical equivalence trials, different from the 90% used for PK bioequivalence); the handling of missing data; which subgroup analyses are prespecified and which are exploratory; and the sensitivity analyses designed to test the robustness of the primary conclusion to methodological assumptions. Missing data requires particular care in an equivalence trial, because imputing response for a missing patient in a direction that favors equivalence is as problematic as imputing it in a direction that works against equivalence.
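The core decision rule the SAP locks in can be sketched in a few lines: compute the two-sided confidence interval for the difference in responder rates and declare equivalence only if the whole interval falls inside the pre-agreed margin. The responder counts and the 12-point margin below are hypothetical, and the unadjusted Wald interval stands in for whatever covariate-adjusted method the SAP actually names.

```python
import math
from statistics import NormalDist

def equivalence_by_ci(resp_test, n_test, resp_ref, n_ref, margin, level=0.95):
    """Wald CI for the responder-rate difference, checked against +/- margin."""
    p_test, p_ref = resp_test / n_test, resp_ref / n_ref
    diff = p_test - p_ref
    se = math.sqrt(p_test * (1 - p_test) / n_test + p_ref * (1 - p_ref) / n_ref)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower, upper = diff - z * se, diff + z * se
    # Both bounds must clear the margin; one-sided success is not equivalence.
    equivalent = -margin < lower and upper < margin
    return diff, lower, upper, equivalent

# Hypothetical counts and margin, for illustration only.
diff, lower, upper, equivalent = equivalence_by_ci(180, 300, 183, 300, margin=0.12)
```

The same data pass or fail depending on the margin alone, which is why the margin and the interval level have to be fixed before unblinding rather than chosen after the fact.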

For immunogenicity, the SAP must specify the statistical comparison method for ADA incidence rates (a simple Z-test is insufficient for the small absolute numbers typical in most confirmatory trials; Fisher’s exact test or the Mantel-Haenszel method is more appropriate), the definition of the time window for each ADA assessment, and the approach for handling samples with inconclusive results at the confirmatory assay tier.
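For reference, a two-sided Fisher's exact test on a 2x2 table of ADA status by arm can be written with the standard library alone. This is a minimal sketch of the unstratified test with invented counts; a stratified Mantel-Haenszel analysis, or a validated statistical package, would be the production choice.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Rows are treatment arms; columns are ADA-positive and ADA-negative counts.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x ADA-positive patients in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    x_min, x_max = max(0, col1 - row2), min(row1, col1)
    # Sum every table at least as extreme (as improbable) as the observed one.
    p_value = sum(
        p for p in (prob(x) for x in range(x_min, x_max + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)

# Invented counts: 6 of 150 ADA-positive versus 3 of 150.
p_value = fisher_exact_two_sided(6, 144, 3, 147)
```

With counts this small the exact test avoids the normal approximation a Z-test relies on, which is the concern raised above.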

Part Nine: Digital Tools and Real-World Evidence

Decentralized Trial Elements: What Works and What Does Not

The decentralized clinical trial (DCT) model, accelerated by the COVID-19 pandemic, offers genuine value for biosimilar programs in specific contexts. Home nursing visits for drug administration and PK sample collection, remote monitoring of patient-reported outcomes through validated ePRO platforms, and telehealth assessments for routine safety follow-up can each reduce patient burden and accelerate recruitment by removing geographic constraints.

The limits of decentralization for biosimilar trials, however, are real. The primary efficacy endpoint in most confirmatory trials requires physical examination by a trained assessor (joint counts in rheumatoid arthritis, lesion assessment in psoriasis, tumor imaging in oncology) that cannot be reliably conducted remotely. Blinded image reading requires site-level image acquisition under standardized conditions that home settings cannot replicate. These constraints mean that a fully decentralized biosimilar confirmatory trial is not currently feasible for the primary efficacy assessment, but a hybrid model that decentralizes safety monitoring, patient-reported outcomes, and post-primary-endpoint follow-up can meaningfully reduce the per-patient cost and site burden.

The regulatory acceptance of DCT-generated data is still evolving. The FDA’s guidance on decentralized trials [20] confirms that DCT-generated data can be used in submissions, but sites and investigators retain regulatory responsibility for the data generated at or through their sites. A hybrid trial that uses home nursing visits for PK sampling must have documented SOPs for sample handling, chain-of-custody, cold-chain management, and centrifugation that are validated to produce equivalent sample quality to on-site collection.

Real-World Evidence: The Post-Approval Differentiator

For biosimilar developers, real-world evidence (RWE) has a more clearly defined and commercially important role in the post-approval phase than in the pre-approval phase. It cannot replace the randomized clinical trial for initial regulatory approval; the FDA and EMA have been explicit on this point. What it can do is build the commercial case after launch, support label expansions through extrapolation, and sustain formulary negotiations with payers.

The most direct application is post-marketing safety surveillance. Regulatory approval requires a pharmacovigilance plan, and for complex molecules with significant immunogenicity risk, this often includes a post-marketing study or registry. Using real-world data from insurance claims databases or electronic health records is more efficient than running a prospective registry if you can access population-level data with sufficient clinical detail to adjudicate safety events.

The second application is formulary negotiation. Payers increasingly require real-world performance data as a condition for continued preferred formulary placement or for expanding preferred status to additional patient populations. A well-designed retrospective cohort study in a commercial claims database comparing outcomes in patients treated with your biosimilar versus the reference product can provide exactly this evidence, and it can be completed in 12-18 months for a cost far below a prospective trial.

DrugPatentWatch not only tracks the patent and exclusivity landscape for reference biologics but also provides the competitive patent intelligence that informs where and when real-world data from a biosimilar launch will be most commercially valuable. Understanding which competitors are approaching patent expiration, which have switching study data, and which lack interchangeability designations directly shapes the post-launch RWE investment strategy.

AI and Machine Learning: Useful Now, Transformative Later

AI and machine learning applications in biosimilar clinical development are not speculative. They are deployed today in specific, narrow contexts where their performance advantage over conventional methods is demonstrated.

Machine learning models applied to high-dimensional analytical data (mass spectrometry, HDX-MS, multi-attribute monitoring) can identify subtle structural patterns that distinguish lots of the reference product with different clinical performance characteristics. This is valuable for CQA identification and for understanding the structure-function space within which your biosimilar must land.

Natural language processing applied to EHR data has been validated for patient pre-screening in clinical trials, reducing the false-positive rate of potential candidates sent to sites for formal screening by 50-60% in some programs [21]. For biosimilar confirmatory trials in competitive recruiting environments, this efficiency gain directly translates to faster enrollment.

The transformative applications, meaning AI-driven equivalence margin derivation, in silico prediction of clinical immunogenicity, and adaptive trial designs guided by real-time AI analysis of accumulating data, are further from routine deployment. The regulatory framework for accepting AI-derived evidence in biosimilar submissions is still being developed, and the validation requirements for AI models used in regulatory decision-making are substantially more demanding than for models used in internal development decisions.

Part Ten: Patent Intelligence and the Commercial Foundation

The Patent Landscape: What You Need to Know Before You Start

The decision of which biologic to develop as a biosimilar is, above all else, a patent intelligence decision. The clinical development program, with all its complexity and cost, only makes sense if you can actually launch the product at the end of it. That depends on whether the key patents protecting the reference product will be expired or invalidated by the time your regulatory approval is obtained.

The patent landscape for a blockbuster biologic is rarely simple. The innovator company builds a patent estate that typically includes the original compound patent (covering the protein sequence and basic structure), manufacturing process patents (covering specific cell lines, fermentation conditions, and purification methods), formulation patents (covering specific excipient combinations, concentrations, and delivery systems), and method-of-use patents (covering specific dosing regimens, patient populations, or combination therapies) [22].

The compound patent, which is typically the earliest to expire, is the most visible and widely tracked. But the surrounding layer of secondary patents, particularly formulation and dosing regimen patents, can be valid years after the compound patent expires and can create legitimate legal barriers to launch even for an approved biosimilar.
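One way to keep this layered estate visible in planning is to model it explicitly. The sketch below is a hypothetical data structure, not a legal tool: the patent entries, dates, and the `blocks_launch` flag (a judgment supplied by patent counsel, not something code can derive) are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Patent:
    label: str
    category: str        # e.g. compound, process, formulation, method-of-use
    expiry: date
    blocks_launch: bool  # counsel's assessment of whether it bars launch

def earliest_clear_launch(patents, expected_approval):
    """Later of expected approval and the last expiry among launch-blocking patents."""
    blocking_expiries = [p.expiry for p in patents if p.blocks_launch]
    return max([expected_approval, *blocking_expiries])

# Invented estate for a hypothetical reference product.
estate = [
    Patent("compound", "compound", date(2027, 1, 15), True),
    Patent("high-concentration formulation", "formulation", date(2030, 6, 30), True),
    Patent("purification step", "process", date(2032, 3, 1), False),
]

launch = earliest_clear_launch(estate, expected_approval=date(2028, 9, 1))
```

In this invented case the formulation patent, not the compound patent or the approval date, sets the launch window, which is the pattern described above.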

DrugPatentWatch as a Strategic Intelligence Tool

This is where platforms like DrugPatentWatch provide genuine competitive intelligence. By aggregating patent filing data, FDA Orange Book and Purple Book listings, BPCIA patent exchange information, and litigation records, DrugPatentWatch gives development teams a comprehensive map of the patent landscape for any given reference product at any given time.

The practical uses are multiple. Before committing to a biosimilar development program, business development teams use DrugPatentWatch to calculate the effective period of market exclusivity for the reference product, accounting for all overlapping patents and the projected timeline for patent expiration or litigation outcome. During development, the platform provides ongoing monitoring of new patent filings by the innovator, which can signal last-minute lifecycle management strategies designed to extend exclusivity. The ‘patent dance’ litigation process under the BPCIA generates a stream of legal filings and decisions that directly affect the competitive landscape; monitoring these through DrugPatentWatch gives companies advance notice of market entry opportunities or barriers that affect their own commercial timeline.

For regulatory submissions, understanding the patent history of the reference product can also be informative for the analytical program. Manufacturing process patents describe the innovator’s production methods, giving biosimilar developers insight into the cell lines, process conditions, and critical steps that generated the clinical profile regulators have accepted. This does not mean copying the innovator’s process, which would likely infringe those patents, but it provides structural context for understanding which process parameters are most likely to be critical to the molecule’s CQA profile.

First-Mover Advantages and the Launch Window

The commercial value of a biosimilar launch decreases nonlinearly with the number of competitors in the market. The first approved biosimilar for a given reference product typically commands a price discount of 15-25% and captures a meaningful share of new patient starts relatively quickly [23]. The second and third entrants must price more aggressively and face an established formulary position for the first biosimilar.

Beyond the third or fourth entrant, biosimilar economics for most products become marginal unless the company has a manufacturing cost advantage that enables deeper discounting or a customer relationship advantage (hospital formulary contracts, specialty pharmacy partnerships) that provides access independent of price.

This commercial reality makes the clinical timeline a direct revenue variable. A program that enters the confirmatory trial six months late due to a delayed PK study or an analytical characterization gap exits the regulatory review process six months late and launches into a market that is six months further along in competitive consolidation. The expected revenue loss from that delay is product-specific but commonly modeled in the range of $100-300 million for a large-volume biologic with three or more biosimilar competitors already in the market.

The portfolio implication is that clinical development efficiency, measured in time from IND submission to regulatory approval, is as important a differentiator between biosimilar developers as manufacturing cost. Companies that have invested in standardized platform technologies for analytical characterization, validated immunogenicity assay platforms that can be rapidly adapted to new molecules, and established site networks with proven biosimilar trial execution experience will systematically reach the market faster than companies that rebuild these capabilities from scratch for each program.

Part Eleven: The Evolving Regulatory Frontier

Reduction in Clinical Requirements: The Data Speaks

The trajectory of biosimilar regulatory science over the past decade is toward progressively greater reliance on analytical and PK data, with a corresponding reduction in the default expectation for large confirmatory efficacy trials. This trend is visible in the published guidance documents from both the FDA and EMA, in the specific approval decisions for individual biosimilars, and in the regulatory science literature.

The scientific basis for this trajectory is straightforward. For every year that biosimilars have been approved and used, we accumulate evidence that the ‘totality of the evidence’ approach works: biosimilars approved on the basis of strong analytical and PK data have performed as predicted in the clinical setting, without unexpected safety or efficacy surprises. This accumulating real-world track record gives regulators increasing confidence that the early-stage data is genuinely predictive of clinical performance.

The FDA’s Biosimilar Action Plan, initially released in 2018 and updated subsequently, explicitly identifies the reduction of unnecessary clinical data requirements as a policy objective [24]. In practice, this has manifested as increasing FDA willingness to engage in dialogue about clinical program waivers or reductions for products where the analytical case is exceptionally strong, particularly for molecules that are well-understood mechanistically and have a long track record of clinical use for the reference product.

The EMA has moved somewhat more slowly on this dimension, reflecting in part a broader European regulatory culture that places high value on direct clinical confirmation. But the European landscape has also evolved, particularly for highly similar monoclonal antibodies in well-understood indications with validated biomarkers.

Adaptive Trial Designs: Opportunity and Constraint

Adaptive clinical trial designs, which allow pre-specified modifications to the trial based on interim data, are well-established in novel drug development. Their application to biosimilar confirmatory trials is more constrained, for a specific reason: in an equivalence trial, any adaptation that is triggered by interim efficacy data or interim ADA data creates a risk of inflation of the Type I error rate (the probability of incorrectly concluding equivalence) that is difficult to control analytically.

The adaptations that are most defensible in biosimilar confirmatory trials are sample size re-estimation based on observed event rates (for time-to-event endpoints) or observed outcome variance (for continuous endpoints), where the adaptation is triggered only by nuisance parameters (the variance or event rate) that are not themselves part of the primary hypothesis. These designs have been reviewed and accepted in principle by both the FDA and EMA and can provide meaningful protection against the scenario where the study turns out to be underpowered due to incorrect variance assumptions.
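A variance-based re-estimation can be sketched as follows for a continuous endpoint. The sketch uses the standard normal-approximation sample size formula for an equivalence trial assuming a true difference of zero, n per arm = 2σ²(z₁₋α + z₁₋β/₂)² / Δ², and recomputes it with the blinded pooled standard deviation observed at the interim look. The rule that the sample size may increase but never decrease, the cap, and all numeric inputs are illustrative assumptions.

```python
import math
from statistics import NormalDist

def n_per_arm_equivalence(sd, margin, alpha=0.025, power=0.90):
    """Per-arm sample size for equivalence, assuming a true difference of zero."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha)            # one-sided alpha, matching a 95% two-sided CI
    z_beta = z(1 - (1 - power) / 2)   # beta is split across the two one-sided tests
    return math.ceil(2 * sd**2 * (z_alpha + z_beta) ** 2 / margin**2)

def reestimate(planned_sd, observed_sd, margin, n_cap):
    """Return (planned n, final n) using only the nuisance parameter (the SD)."""
    planned = n_per_arm_equivalence(planned_sd, margin)
    updated = n_per_arm_equivalence(observed_sd, margin)
    # Assumed rule: never shrink below the plan, never exceed the pre-specified cap.
    return planned, min(max(planned, updated), n_cap)

# Hypothetical inputs: planned SD 1.0, blinded interim SD 1.2, margin 0.5, cap 200.
planned_n, final_n = reestimate(1.0, 1.2, margin=0.5, n_cap=200)
```

Because the update depends only on the pooled variance and never on the observed treatment difference, it leaves the primary hypothesis untouched, which is what makes this class of adaptation defensible.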

Fully flexible adaptive designs with multiple interim analyses and arm dropping should be approached with considerable caution in the biosimilar context. The regulatory review of such designs requires pre-submission discussion, detailed simulation-based operating characteristics, and typically a pre-specified adaptive analysis plan that is filed with the agency before the trial begins.

The ICH Harmonization Effort and Its Practical Impact

The International Council for Harmonisation (ICH) has been working on biosimilar-specific guidance under the Q5 series and related documents, and the most recent updates reflect a meaningful effort to align FDA, EMA, and PMDA thinking on analytical characterization, comparability, and clinical evidence requirements [7]. For companies running global programs, this harmonization, however incomplete, reduces the risk of designing a program that satisfies one agency’s requirements while creating gaps for another.

The practical impact is most visible in the shared vocabulary and shared methodological expectations around analytical similarity assessment, particularly the tiered approach to CQA testing and the statistical framework for establishing analytical equivalence ranges. Where FDA and EMA guidance now converge on the analytical layer, developers can run a single analytical similarity assessment that is submission-ready for both agencies. The divergence, which remains real, sits primarily in the clinical layer: reference product sourcing requirements, the depth of clinical evidence required for extrapolation, and the specific study designs accepted for interchangeability or substitutability claims.

Part Twelve: The Commercial-Clinical Integration

Building Commercial Insight Into Development Decisions

The most sophisticated biosimilar developers do not treat clinical development and commercial strategy as sequential functions. They run them in parallel, with active information sharing between the two groups throughout the development program.

The clinical team designs the confirmatory trial. The commercial team is simultaneously building relationships with key opinion leaders in the target therapeutic area, mapping the payer landscape for the reference product, and tracking competitor biosimilar programs through public regulatory databases and patent filings. The data from the clinical trial, when disclosed, will be the primary evidence used in physician education and formulary applications. Designing the trial to generate data that is specifically useful for those commercial purposes, without compromising the regulatory integrity of the primary endpoints, is a form of value optimization that most companies leave on the table.

For example, a confirmatory trial in rheumatoid arthritis that captures patient-reported outcome measures as secondary endpoints, including the Health Assessment Questionnaire and the EQ-5D, generates economic value data that is directly usable in payer negotiations. The incremental cost of collecting these measures in a running trial is minimal. The value of having real comparative PRO data in a payer submission is significant.

Formulary Strategy: The Last Mile

Regulatory approval gives you the right to sell your biosimilar. Formulary placement gives you the ability to sell it at scale. These are two different milestones with two different evidentiary requirements, and conflating them leads to commercial plans that are strong on science and weak on market access.

Payers evaluate biosimilars on a combination of clinical evidence, pricing, and operational factors. The clinical evidence they care about is primarily the safety and immunogenicity comparison, because payers are acutely sensitive to the political and legal risk of a biosimilar-related adverse event in one of their covered lives. The pricing conversation starts with the listed discount to the reference product’s WAC but quickly moves to net price after rebates, which is where the real negotiation happens.

The operational factors include the availability of patient support programs, specialty pharmacy distribution networks, training resources for switching stable patients, and pharmacovigilance reporting infrastructure. A biosimilar that offers strong clinical evidence and a compelling net price but lacks the operational infrastructure to support a smooth transition for stable patients will encounter significant resistance from hospital pharmacy directors and specialty pharmacies whose conversion experience determines how quickly the biosimilar actually moves into prescribed patients’ hands.

The transition from regulatory success to commercial success is where many biosimilar programs have underperformed their scientific promise. Building the commercial infrastructure in parallel with the clinical program, not after it, is the only way to capture the full value of a well-executed development program.

Key Takeaways

The following eight points summarize the most decision-relevant insights from the preceding analysis. Each represents a strategic choice, not a compliance obligation.

1. The analytical layer is the highest-ROI investment in the program. A comprehensive, tiered analytical similarity assessment that creates an unambiguous molecular fingerprint reduces the size and cost of every subsequent clinical study. Underspending on analytics to get to the clinic faster is a false economy that consistently produces larger, more expensive trials.

2. Regulatory strategy precedes clinical design. The confirmatory trial’s design cannot be finalized without understanding the specific residual uncertainties that the FDA and EMA carry after reviewing the analytical and PK data. Formal regulatory interactions before the confirmatory trial protocol is locked are not optional.

3. Reference product lot management is a clinical program risk factor. Inadequate procurement, characterization, and documentation of reference product lots used in comparative studies has derailed multiple programs at the regulatory review stage. Budget and plan for it as a dedicated workstream, not an afterthought.

4. The equivalence margin is a negotiation, not a calculation. The derivation of the clinical equivalence margin requires historical data analysis, clinical judgment, and regulatory agreement before the trial begins. A margin that has not been explicitly agreed with the FDA and EMA before first-patient dosing is a source of regulatory risk that no amount of good data can fully mitigate.

5. Immunogenicity is assessed across three dimensions: incidence, character, and clinical consequence. Reporting ADA incidence rates alone is insufficient. The regulatory and clinical community needs to understand whether ADAs were neutralizing, whether they affected drug exposure, and whether they correlated with efficacy loss or safety events.

6. The interchangeability designation has conditional commercial value. It is most valuable in states with active substitution laws and in therapeutic areas where the pharmacy channel plays a meaningful role in product selection. In hospital settings and high-touch specialty pharmacy channels, the clinical and economic arguments matter more than the interchangeability label.

7. DrugPatentWatch and equivalent patent intelligence tools are pre-program necessities. The patent landscape analysis must be completed before committing development resources to any biosimilar target. Secondary patents on formulation, dosing regimen, and device design can create launch barriers that are as commercially damaging as a delayed regulatory approval.

8. RWE is a post-approval commercial asset, not a pre-approval regulatory substitute. Real-world evidence from approved biosimilars informs formulary negotiations, supports post-marketing pharmacovigilance requirements, and can strengthen extrapolation arguments for label expansions. It does not replace comparative randomized trial data for initial approval purposes under current regulatory frameworks.
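Takeaway 4 calls the equivalence margin a negotiation, but that negotiation usually starts from a historical-data calculation. The sketch below shows one common starting point, often described as a fixed-margin approach: pool the reference product's historical effect over placebo, take the lower bound of its 95% confidence interval, and propose a margin that preserves a chosen fraction of that effect. The trial counts, the fixed-effect pooling, the 50% preservation fraction, and the function names are all illustrative assumptions, not values from any real program or a method any agency has endorsed for a specific product.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical historical placebo-controlled trials of the reference product.
# Each tuple: (responders_ref, n_ref, responders_placebo, n_placebo)
HISTORICAL_TRIALS = [
    (120, 200, 50, 200),
    (95, 150, 40, 150),
    (180, 300, 80, 300),
]

def pooled_effect(trials, level=0.95):
    """Fixed-effect, inverse-variance pooled risk difference with a Wald CI."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x_ref, n_ref, x_pbo, n_pbo in trials:
        p_ref, p_pbo = x_ref / n_ref, x_pbo / n_pbo
        variance = p_ref * (1 - p_ref) / n_ref + p_pbo * (1 - p_pbo) / n_pbo
        weight = 1.0 / variance
        weighted_sum += weight * (p_ref - p_pbo)
        weight_total += weight
    estimate = weighted_sum / weight_total
    std_err = sqrt(1.0 / weight_total)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate, estimate - z * std_err, estimate + z * std_err

def candidate_margin(trials, fraction_preserved=0.5):
    """Margin that preserves a fraction of the conservative historical effect."""
    _, lower_bound, _ = pooled_effect(trials)
    if lower_bound <= 0:
        raise ValueError("Historical effect is not reliably above zero")
    return (1 - fraction_preserved) * lower_bound

estimate, lower, upper = pooled_effect(HISTORICAL_TRIALS)
margin = candidate_margin(HISTORICAL_TRIALS)
print(f"pooled effect {estimate:.3f} (95% CI {lower:.3f} to {upper:.3f})")
print(f"candidate margin: +/-{margin:.3f}")
```

The output of a calculation like this is a proposal to bring to the agencies, not a finished margin. Clinical judgment about what difference matters to patients, and the regulators' own view of the historical evidence, can move the final number in either direction.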

FAQ

Q1: If a biosimilar and its reference product share the same amino acid sequence but differ in glycosylation pattern, does that automatically disqualify the biosimilar from approval?

No, but it does raise the analytical and clinical bar substantially. The FDA and EMA both recognize that glycosylation differences are common between biologics produced in different cell lines or under different process conditions, and that not all glycan differences are clinically meaningful. The regulatory question is whether the observed glycosylation difference affects any CQA that bears on clinical performance: Fc receptor binding, serum half-life, or immunogenicity potential. If the developer can demonstrate through functional assays that Fc-mediated effector functions are equivalent, through PK studies that serum half-life is equivalent, and through clinical immunogenicity data that ADA rates are comparable, the glycosylation difference may be acceptable as a minor structural variation that is not clinically meaningful and that falls within the reference product’s known range of natural variability. The documentation burden for this argument is heavy but not insurmountable.

Q2: Why do some biosimilar confirmatory trials use non-inferiority designs rather than equivalence designs?

A non-inferiority trial tests only one direction of difference: that the biosimilar is not meaningfully worse than the reference product. An equivalence trial tests both directions: not meaningfully worse and not meaningfully better. The choice between them reflects a scientific argument about the clinical concern. For most biosimilars, regulators want to rule out both inferiority (obvious) and superiority (because superiority would suggest the products are not truly biosimilar, potentially indicating a different mechanism or a different formulation). However, there are specific scenarios where a one-sided non-inferiority design is acceptable, particularly in settings where demonstrating that the biosimilar is not worse is the primary safety and efficacy question and where a small superiority in the biosimilar arm would not affect the biosimilarity conclusion. In oncology, where objective response rates are the endpoint and the clinical concern is primarily about potential inferiority, non-inferiority designs have been accepted by regulators when supported by a strong scientific justification.
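The difference between the two designs is easiest to see on a single confidence interval. The sketch below is a minimal illustration, assuming a binary response endpoint, a Wald interval for the risk difference, a symmetric margin, and the same one-sided alpha for both checks. The response counts, the 12-percentage-point margin, and the function names are hypothetical, and real programs fix the interval level and margin with each agency in advance.

```python
from math import sqrt
from statistics import NormalDist

def risk_difference_ci(x_test, n_test, x_ref, n_ref, level):
    """Wald confidence interval for the response-rate difference (test minus reference)."""
    p_test, p_ref = x_test / n_test, x_ref / n_ref
    diff = p_test - p_ref
    std_err = sqrt(p_test * (1 - p_test) / n_test + p_ref * (1 - p_ref) / n_ref)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff - z * std_err, diff + z * std_err

def assess(x_test, n_test, x_ref, n_ref, margin, alpha=0.05):
    """Apply both decision rules to the same data.

    Two one-sided tests at level alpha correspond to checking a
    (1 - 2 * alpha) two-sided interval against the margin.
    """
    lower, upper = risk_difference_ci(x_test, n_test, x_ref, n_ref, 1 - 2 * alpha)
    return {
        "ci": (lower, upper),
        # Non-inferiority looks only at the lower bound.
        "non_inferior": lower > -margin,
        # Equivalence requires the whole interval inside (-margin, +margin).
        "equivalent": lower > -margin and upper < margin,
    }

# Hypothetical trial where the biosimilar arm responds noticeably better.
better_arm = assess(x_test=330, n_test=500, x_ref=290, n_ref=500, margin=0.12)
# Hypothetical trial where the two arms respond almost identically.
matched_arms = assess(x_test=300, n_test=500, x_ref=295, n_ref=500, margin=0.12)

for label, result in [("better_arm", better_arm), ("matched_arms", matched_arms)]:
    lower, upper = result["ci"]
    print(f"{label}: CI ({lower:+.3f}, {upper:+.3f}), "
          f"non-inferior={result['non_inferior']}, equivalent={result['equivalent']}")
```

In the first hypothetical trial the interval clears the lower margin but crosses the upper one, so the data support non-inferiority and fail equivalence. That is exactly the outcome a one-sided design would accept and a two-sided design would reject.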

Q3: How should a biosimilar developer handle a situation where the reference product undergoes a manufacturing change during the biosimilar’s clinical development program?

This scenario occurs with some frequency given the multi-year timelines of biosimilar development programs. The FDA and EMA have established processes for evaluating comparability across manufacturing changes for the reference product itself, and these processes generate publicly available information through manufacturing supplements and regulatory assessment reports. For the biosimilar developer, the key question is whether the manufacturing change in the reference product affects the CQAs that are central to the biosimilarity comparison. If the change is minor and documented in public comparability data, the biosimilar developer needs to ensure that the reference product lots used in the clinical program are representative of the post-change commercial product, adding a specific analytical characterization of post-change lots to the reference product lot characterization package. If the change is major and significantly alters the reference product’s analytical profile, the biosimilar developer may need to reassess whether the earlier comparative data remains valid and potentially conduct additional bridging studies.

Q4: What is the current thinking on the use of AI-derived immunogenicity prediction models to reduce or eliminate clinical immunogenicity studies?

The short answer is that regulators do not currently accept in silico immunogenicity predictions as a substitute for clinical immunogenicity data. The computational tools for predicting T-cell epitopes and B-cell responses from protein sequence and structure have improved substantially, and companies use them internally to screen biosimilar candidates and flag potential immunogenicity hotspots early in development. But the correlation between computational immunogenicity scores and actual clinical ADA incidence rates is imperfect enough that no regulatory agency has accepted these predictions as primary evidence in a biosimilar submission. Their appropriate use is as a development screening tool and as supportive information in the context of a broader immunogenicity risk assessment, not as a replacement for a clinical ADA measurement program. The regulatory framework for qualifying these computational models as regulatory-grade evidence is an active area of development in both the FDA and EMA scientific communities.

Q5: How should a biosimilar developer respond when the reference product’s innovator markets directly to physicians against the biosimilar after launch?

This is a commercial execution challenge rather than a clinical design question, but it directly affects the commercial return on the clinical investment. Innovators commonly deploy several post-launch strategies: broadening financial assistance programs that make the branded product available at effectively zero or near-zero cost to certain patient segments, entering into rebate arrangements with PBMs that give the reference product preferred formulary status even after biosimilar entry, and running educational campaigns that raise physician concerns about switching stable patients. Each of these tactics has been effective in specific markets and has slowed biosimilar uptake. The clinical data is the counter-argument, and its persuasiveness depends on how well the biosimilar developer has designed the confirmatory trial to generate data that directly addresses the specific concerns innovators typically raise: switching safety, immunogenicity, and long-term efficacy maintenance. A switching study, even if not strictly required for regulatory approval, generates the clinical evidence that most effectively neutralizes the innovator’s post-launch anti-switching messaging. This is a case where the decision to invest in additional clinical data beyond the regulatory minimum has a direct commercial ROI.

References

[1] IQVIA Institute for Human Data Science. (2023). Global trends in R&D 2023: Activity, productivity, and enablers. IQVIA.

[2] Sagonowsky, E. (2022, June 30). The AbbVie-Humira biosimilar situation: Your questions answered. Fierce Pharma.

[3] U.S. Food and Drug Administration. (2015). Scientific considerations in demonstrating biosimilarity to a reference product: Guidance for industry. U.S. Department of Health and Human Services.

[4] U.S. Food and Drug Administration. (2003). Bioavailability and bioequivalence studies submitting data for investigational new drug applications and new drug applications for chemical entities: Guidance for industry. U.S. Department of Health and Human Services.

[5] U.S. Food and Drug Administration. (2015). Zarxio (filgrastim-sndz) approval history. U.S. Department of Health and Human Services.

[6] Jefferis, R. (2009). Glycosylation as a strategy to improve antibody-based therapeutics. Nature Reviews Drug Discovery, 8(3), 226-234. https://doi.org/10.1038/nrd2804

[7] International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use. (2009). ICH harmonised tripartite guideline Q8(R2): Pharmaceutical development. ICH.

[8] European Medicines Agency. (2014). Guideline on similar biological medicinal products containing biotechnology-derived proteins as active substance: Non-clinical and clinical issues. EMA/CHMP/BMWP/42832/2005 Rev1.

[9] U.S. Food and Drug Administration. (2019). Considerations in demonstrating interchangeability with a reference product: Guidance for industry. U.S. Department of Health and Human Services.

[10] Health Canada. (2016). Guidance document: Information and submission requirements for biosimilar biologic drugs. Government of Canada.

[11] Pharmaceuticals and Medical Devices Agency. (2014). Guidelines for the quality, safety, and efficacy assurance of biosimilar products. Ministry of Health, Labour and Welfare, Japan.

[12] World Health Organization. (2013). WHO guidelines on evaluation of similar biotherapeutic products (SBPs). WHO Press.

[13] Cohen, H., Beydoun, D., Chien, D., Lessor, T., McCabe, D., Muenzberg, M., & Tursi, J. (2016). Awareness, knowledge, and perceptions of biosimilars among specialty physicians. Advances in Therapy, 33(12), 2160-2172. https://doi.org/10.1007/s12325-016-0431-5

[14] Senn, S. (2002). Cross-over trials in clinical research (2nd ed.). Wiley.

[15] U.S. Food and Drug Administration. (2018). Clinical pharmacology data to support a demonstration of biosimilarity to a reference product: Guidance for industry. U.S. Department of Health and Human Services.

[16] Casadevall, N., Nataf, J., Viron, B., Kolta, A., Kiladjian, J. J., Martin-Dupont, P., & Varet, B. (2002). Pure red-cell aplasia and antierythropoietin antibodies in patients treated with recombinant erythropoietin. New England Journal of Medicine, 346(7), 469-475. https://doi.org/10.1056/NEJMoa011931

[17] European Medicines Agency. (2017). Guideline on immunogenicity assessment of therapeutic proteins. EMA/CHMP/BMWP/14327/2006 Rev. 1.

[18] Kaur, P., Chow, V., Zhang, N., Moxness, M., & Markus, R. (2017). A randomised, single-blind, single-dose, three-arm, parallel-group study in healthy subjects to demonstrate pharmacokinetic equivalence of ABP 501 and adalimumab. Annals of the Rheumatic Diseases, 76(3), 526-533. https://doi.org/10.1136/annrheumdis-2016-209509

[19] Tufts Center for the Study of Drug Development. (2020). Impact report: State of clinical trials, 2020. Tufts University School of Medicine.

[20] U.S. Food and Drug Administration. (2023). Decentralized clinical trials for drugs, biological products, and devices: Draft guidance for industry, investigators, and other stakeholders. U.S. Department of Health and Human Services.

[21] Ni, Y., Beckwith, H., Santel, D., Gottesman, B., Kowatch, R. A., & Sheridan, R. (2019). Using NLP and machine learning to predict eligibility for a clinical trial using unstructured data. Journal of Biomedical Informatics, 95, 103248. https://doi.org/10.1016/j.jbi.2019.103248

[22] Grabowski, H., Long, G., Mortimer, R., & Boyo, A. (2014). The market for follow-on biologics: How will it evolve? Health Affairs, 33(6), 1033-1041. https://doi.org/10.1377/hlthaff.2014.0126

[23] Sarpatwari, A., Barenie, R., Curfman, G., Darrow, J. J., & Kesselheim, A. S. (2021). The US biosimilar market: Stunted growth and possible reforms. Clinical Pharmacology and Therapeutics, 109(3), 692-700. https://doi.org/10.1002/cpt.1735

[24] U.S. Food and Drug Administration. (2018). Biosimilars action plan: Balancing innovation and competition. U.S. Department of Health and Human Services.