Precision Advocacy in AI Patents: Teaching-Away and Secondary Considerations—Language Examples and Unexpected Results
Struggling to distinguish mere preference from true teaching-away in AI prior art—or to prove unexpected results with legal weight? By the end of this lesson, you’ll diagnose dissuasion language with precision, build secondary-consideration records that tie claim elements to metrics, and draft examiner-ready phrasing that survives scrutiny under §§102/103. You’ll get crisp explanations, corpus-driven templates, high-signal examples, and targeted exercises to validate your judgment and tighten your claim charts. The approach is disciplined and surgical—focused on measurable ROI: fewer OAs, stronger nonobviousness, faster allowances.
Step 1: Frame and diagnose—what counts as teaching-away in AI contexts
In AI patent practice, “teaching-away” means a prior art reference actively discourages the claimed approach by signaling that it is inferior, impractical, or contrary to the reference’s core design choices. This is different from a mere preference or a statement that an alternative is “less optimal.” Teaching-away requires clear dissuasion—language that would cause a skilled person to avoid the claimed path—not just a ranking of options. In §§102/103 contexts, this distinction is crucial: if a reference only shows a preference, an examiner may still assert that a skilled artisan would try the disfavored option; but if the reference teaches away, combining toward the claimed invention becomes less reasonable and motivation-to-combine weakens.
AI technologies, especially learning architectures and training/inference pipelines, amplify this distinction because authors often express strong methodological commitments (e.g., end-to-end learning over modular systems) and warnings about pitfalls (e.g., overfitting, instability, or catastrophic forgetting). These warnings can function as high-signal teaching-away when they present the avoided method as unworkable or antithetical to the reference’s goals.
To diagnose teaching-away in AI prior art, look for language that does more than merely rank methods. Your analysis should ask: Does the reference characterize the claimed path as undesirable for the stated objectives? Does it describe the approach as likely to fail, degrade core metrics, or introduce intolerable trade-offs? Does it embed a design principle that excludes the claimed option?
Checklist of dissuasion cues and criticality language
High-signal indicators of teaching-away include:
- Explicit dissuasion: “We avoid,” “We reject,” “This approach is unsuitable,” “Not recommended,” “Should not be used for,” “Incompatible with.”
- Negative performance characterizations: “Unstable,” “Unreliable,” “Fails to converge,” “Produces degenerate solutions,” “Incurs prohibitive latency/memory,” “Suffers catastrophic loss of generalization.”
- Incompatibility with objectives: “Contrary to our goal of real-time inference,” “Inconsistent with privacy constraints,” “Violates determinism required by production,” “Breaks calibration needed for safety.”
- Criticality markers: “Essential,” “Mandatory,” “Crucial,” “Key to obtaining results,” “Required to avoid collapse,” “Necessary to maintain constraints.” When paired with negative statements about alternatives, these show the reference has a rigid design choice.
- Boundary-setting language: “Outside the scope,” “Not contemplated,” “We constrain our method to avoid,” “We restrict/forbid,” “We only consider approaches that exclude.”
- Warnings about foreseeable failure modes: “Leads to mode collapse,” “Prone to irrecoverable drift,” “Causes gradient explosion/vanishing,” “Introduces unacceptable hallucination rates.”
By contrast, weak signals of mere preference include:
- “We found X to be better than Y,” without condemning Y.
- “X is more efficient,” without stating Y is impractical or incompatible.
- “We primarily use X,” without prohibiting Y or pointing to unacceptable trade-offs.
- “Y underperformed on our dataset,” absent statements that Y generally fails or is unsuitable for the objective.
In essence, teaching-away requires directional force and criticality: the reference not only prefers an alternative but flags the claimed approach as something a skilled person ought to avoid to meet the reference’s aims.
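To make the diagnostic concrete, here is a minimal triage sketch in Python, assuming hypothetical cue lists distilled from the checklist above. It only nominates passages for human review; it does not decide teaching-away, since criticality always depends on the reference’s stated objectives.

```python
import re

# Cue lists drawn from the checklist above; extend them for your art unit.
HIGH_SIGNAL = [
    r"\bwe (avoid|reject)\b", r"\bunsuitable\b", r"\bincompatible with\b",
    r"\bshould not be used\b", r"\bfails to converge\b", r"\bessential\b",
    r"\bnot recommended\b", r"\bprohibitive\b",
]
WEAK_SIGNAL = [
    r"\bmore efficient\b", r"\bwe primarily use\b",
    r"\boutperform(s|ed)?\b", r"\bunderperformed\b",
]

def triage(passage: str) -> str:
    """First-pass sort only: a human must read each hit in context,
    because criticality depends on the reference's stated objectives."""
    if any(re.search(p, passage, re.IGNORECASE) for p in HIGH_SIGNAL):
        return "candidate teaching-away (verify objective linkage)"
    if any(re.search(p, passage, re.IGNORECASE) for p in WEAK_SIGNAL):
        return "likely mere preference"
    return "no cue found"

print(triage("We avoid modular adapters as they are unsuitable for safety-critical latency."))
print(triage("End-to-end training outperforms modular systems on our dataset."))
```

The point of the sketch is workflow, not automation: the high-signal list merely flags passages, and the objective-linkage judgment described in this step remains a human call.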
Step 2: Build secondary considerations—unexpected results and evidence scaffolding for AI
Secondary considerations can anchor nonobviousness when the prior art landscape is dense or when examiners argue that combining known AI components is routine. In AI, three categories often resonate: unexpected results, commercial success, and long-felt but unmet need. Each should be tied to specific claim limitations and, critically, to whether the limitation resides in the training pipeline, the inference path, or the synergy between them.
Unexpected results in AI
Unexpected results should be framed as outcomes a skilled AI practitioner would not reasonably predict from prior teachings. They should be quantitative where possible, but they can also be qualitative if the field identifies certain gains as notoriously difficult. Consider dimensions such as:
- Training efficiency: materially fewer epochs, less wall-clock time, or significantly lower GPU hours than comparable baselines under matched accuracy or loss targets. Evidence could include ablation studies, resource logs, or replicated benchmarks.
- Generalization and robustness: improved out-of-distribution performance, calibration, fairness metrics, or adversarial resilience not explained by mere scaling or dataset size. Document the protocols used (cross-domain tests, OOD splits, corruptions) and show consistency across seeds and datasets.
- Inference latency and memory: surprising gains at inference that defy conventional trade-offs (e.g., simultaneous latency and accuracy improvements at the same model size). Provide profiling traces, batch-size sweeps, and real-time deployment metrics.
- Convergence stability: reduction in variance across runs (seed stability), fewer collapses, or elimination of known failure modes under standard hyperparameter ranges.
- Data efficiency: retaining accuracy with markedly fewer labeled examples or with noisy/weak supervision while controlling for confounders.
Tie these outcomes tightly to claim limitations. If the claim recites, for example, a particular scheduling of loss components during training that reduces inference memory without pruning, articulate how that training-time mechanism propagates into inference-time savings. Examiners often miss causal bridges between training tweaks and inference metrics; make the link explicit, with figures or logs, so the unexpected nature is evident.
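As a concrete illustration of this evidentiary discipline, the following minimal Python sketch runs a matched-settings on/off ablation across seeds and reports means and variances. `run_trial` is a hypothetical stand-in for your actual training/inference pipeline, and every number in it is illustrative only.

```python
import json
import random
import statistics

def run_trial(claimed_feature_on: bool, seed: int) -> dict:
    """Toy stand-in for one training/inference run. In a real record,
    replace the body with your actual pipeline; keep the matched-settings
    discipline (same seed, data, and hyperparameters for both arms)."""
    rng = random.Random(seed)  # matched seed across on/off arms
    # Hypothetical effect: the claimed schedule converges in fewer epochs
    # and lowers peak inference memory. Values are illustrative only.
    epochs = rng.randint(38, 42) if claimed_feature_on else rng.randint(58, 64)
    peak_mem_mb = rng.uniform(410, 430) if claimed_feature_on else rng.uniform(600, 640)
    accuracy = rng.uniform(0.91, 0.92)  # matched accuracy target in both arms
    return {"epochs": epochs, "peak_inference_mem_mb": peak_mem_mb, "accuracy": accuracy}

def ablation(seeds=(0, 1, 2, 3, 4)) -> dict:
    """On/off ablation across seeds, reporting means and across-seed spread
    so the record shows both effect size and seed stability."""
    panels = {}
    for arm, flag in (("claimed_on", True), ("claimed_off", False)):
        runs = [run_trial(flag, s) for s in seeds]
        panels[arm] = {
            metric: {
                "mean": statistics.mean(r[metric] for r in runs),
                "stdev": statistics.stdev(r[metric] for r in runs),
            }
            for metric in runs[0]
        }
    return panels

if __name__ == "__main__":
    print(json.dumps(ablation(), indent=2))
```

The design choice worth copying is the matched seed per arm: it makes the on/off delta attributable to the claimed feature rather than to run-to-run noise.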
Commercial success and market signals
Commercial success in AI is credible when you can connect sales, adoption, or usage growth to the claimed technical features, not to branding or bundling. Evidence can include:
- Adoption in resource-constrained environments (edge devices) attributable to the claimed inference optimizations.
- Enterprise uptake due to compliance, safety, or auditability features directly tied to the claim’s training regime (e.g., traceable gradients enabling post-hoc audits).
- Cost reductions (cloud bills, GPU-hours) documented in customer case studies linked to the claimed architecture.
Causation is critical. Provide declarations, user testimonials, A/B tests, or pricing deltas tied specifically to the claimed features rather than general product momentum.
Long-felt but unmet need
Identify pain points long acknowledged in the literature—such as stable on-device fine-tuning without catastrophic forgetting, real-time multilingual inference with strict latency bounds, or reproducible training in non-deterministic hardware stacks. Show how prior art attempts fell short (partial fixes, unacceptable regressions), and then map how the claim’s specific elements overcome those deficits without reintroducing excluded trade-offs. The narrative should track the field’s acknowledged impasse and your claim’s precise technical step that breaks it.
Evidence scaffolding
Structure your evidence so each item anchors to a claim element and a metric:
- Provide side-by-side baselines with controlled variables.
- Use accepted benchmarks and disclose settings (seeds, hardware, versions) for reproducibility.
- Include ablations that turn the claimed limitation on/off to isolate its effect.
- For training-vs-inference distinctions, show separate panels for training metrics (time, stability) and inference metrics (latency, memory), then a bridging explanation.
This scaffolding converts general praise into objective indicia with legal weight, directly supporting nonobviousness.
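One way to keep this anchoring disciplined is to treat each evidence item as a structured record. The sketch below assumes hypothetical field names and values; it simply enforces that every item carries a claim element, a stage (training, inference, or the bridge between them), a metric, and a traceable exhibit pointer.

```python
from dataclasses import dataclass, field

@dataclass
class EvidenceItem:
    claim_element: str      # verbatim claim limitation the item supports
    stage: str              # "training", "inference", or "bridge"
    metric: str             # e.g., "GPU-hours", "p99 latency (ms)"
    baseline_value: float
    claimed_value: float
    exhibit: str            # pointer to the reproducible record (exhibit/figure)
    settings: dict = field(default_factory=dict)  # seeds, hardware, versions

    def delta_pct(self) -> float:
        """Relative change versus baseline, the headline number a reviewer checks."""
        return 100.0 * (self.claimed_value - self.baseline_value) / self.baseline_value

# Illustrative entry; all names and numbers are hypothetical.
item = EvidenceItem(
    claim_element="staged loss scheduling of claim 1",
    stage="bridge",  # training-time mechanism, inference-time effect
    metric="peak inference memory (MB)",
    baseline_value=620.0,
    claimed_value=420.0,
    exhibit="Exh. 3, Fig. 2",
    settings={"seeds": [0, 1, 2], "hardware": "A100", "framework": "torch 2.x"},
)
print(f"{item.metric}: {item.delta_pct():.1f}% vs. baseline ({item.exhibit})")
```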
Step 3: Apply reusable drafting templates—AI-specific phrasing that withstands §§102/103
Precision in phrasing is paramount. Your language must express either clear dissuasion (for teaching-away) or a well-supported causal link (for secondary considerations) without overclaiming. Below are reusable templates you can adapt. Use them verbatim only when they fit your facts; otherwise, adjust to reflect your record.
Teaching-away language (AI-specific)
- “The reference expressly discourages [claimed approach], stating that it ‘[verbatim dissuasion quote],’ thereby indicating that a skilled person would avoid [claimed approach] to achieve [reference’s objective].”
- “By characterizing [alternative] as ‘[unstable/unreliable/unsuitable],’ the reference sets a critical boundary that excludes the claimed [module/schedule/architecture], not merely a preference for a different configuration.”
- “The reference ties its performance goals to the mandatory use of [X], describing [X] as ‘essential/crucial’ and warning that substituting [claimed Y] would ‘[negative outcome].’ This constitutes clear teaching-away, not a ranking among equivalents.”
- “The authors restrict the method to avoid [claimed feature], stating it is ‘incompatible with [latency/privacy/safety] objectives.’ Such incompatibility is inconsistent with a motivation to combine toward the claimed configuration.”
- “Contrary to the examiner’s assertion of routine substitution, the reference embeds a design principle excluding [claimed path], identifying predictable failure modes—‘[quote]’—that would deter adoption.”
Unexpected-results argument templates
- “When the claimed [training schedule/regularizer] is enabled, inference latency decreases by [X%] at equal accuracy, contradicting the field’s expectation that [trade-off]. Ablation studies isolating [claim element] confirm the effect is attributable to the claimed feature.”
- “Under matched data and hyperparameters, the claimed architecture converges in [N] epochs versus [M] for the closest baseline, reducing GPU hours by [Y%]. Prior art does not teach or suggest that introducing [claim element] would produce this dual improvement in both convergence stability and out-of-distribution accuracy.”
- “The observed robustness to [distribution shift/adversarial perturbation] exceeds reported gains from known techniques and persists across seeds and datasets. This magnitude and consistency were unexpected given the prior art’s documented sensitivity.”
- “Commercial adoption in [sector] is linked to the claimed [feature], as evidenced by [case study/declaration], which attributes cost reductions and SLA compliance to the specific limitation recited in claim [#].”
Micro-phrases for claim charts
- “Teaches away: [quote]; states [claimed feature] is ‘[unsuitable/incompatible].’”
- “Criticality: labels [alternative] ‘essential,’ implying exclusion of [claimed feature].”
- “Objective indicia: unexpected [metric] improvement directly tied to [claim element] (see Exh. [#], Fig. [#]).”
- “No motivation to combine: prior art’s boundary condition (‘[quote]’) deters adoption of [claimed approach].”
- “Training→Inference bridge: [training change] yields [inference metric] via [mechanism], not taught or suggested in cited art.”
Do/don’t contrasts
- Do anchor teaching-away in explicit quotes that express dissuasion; don’t rely on vague comparisons.
- Do link unexpected results to controlled ablations; don’t make claims without isolating variables.
- Do distinguish training and inference impacts; don’t treat them as interchangeable without explanation.
- Do use accepted benchmarks and disclose settings; don’t rely solely on proprietary or opaque tests.
- Do show how claimed features overcome known trade-offs; don’t assert miracles without mechanistic rationale.
Step 4: Mini-practice—how to insert language into claim charts and arguments; avoiding pitfalls
When integrating these elements into claim charts, precision and traceability matter. Each chart cell should map a claim limitation to (1) a teaching-away quote or (2) an objective indicia reference, followed by a short explanation that ties the evidence to the legal point.
For teaching-away entries, begin with the claim element and cite the prior art passage that establishes dissuasion. Immediately explain why the quoted language amounts to more than preference. Highlight any criticality terms, incompatibility statements, or boundary-setting phrases. Then connect this to the combination logic: if the reference discourages the claimed approach to meet its own objectives, it undermines a motivation to combine toward that approach. Keep the language concise but explicit, ensuring that each step is legible to a reviewer who may not be an AI specialist.
For unexpected results and other objective indicia, your chart should identify the specific metric, the experimental conditions, and the ablation or causation link to the claim limitation. Provide a pointer to exhibits and figures with reproducible settings. Emphasize the training-vs-inference location of the effect and include a brief mechanism statement: how the claimed training feature yields the inference improvement, or vice versa. This mechanism narrative converts raw numbers into persuasive legal evidence that the result was nonobvious.
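For traceability, chart rows can be managed the same way as the evidence records in Step 2. This minimal sketch (hypothetical fields and example text) validates the discipline described above: every row must carry a factual anchor, either a dissuasion quote or an exhibit pointer.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ChartRow:
    """One claim-chart cell: a limitation plus its factual anchor.
    Field names are hypothetical; adapt them to your chart template."""
    limitation: str
    dissuasion_quote: Optional[str] = None   # teaching-away anchor
    exhibit: Optional[str] = None            # objective-indicia anchor
    explanation: str = ""

    def validate(self) -> None:
        # Enforce the rule from the text: every legal point needs a factual
        # anchor (a quote or a traceable exhibit), never neither.
        if not (self.dissuasion_quote or self.exhibit):
            raise ValueError(f"Row for '{self.limitation}' lacks a factual anchor")

rows = [
    ChartRow(
        limitation="adapter stack of claim 1",
        dissuasion_quote="we avoid adapters due to instability in real-time",
        explanation="Explicit dissuasion tied to the latency objective; "
                    "more than preference, so it undercuts motivation to combine.",
    ),
    ChartRow(
        limitation="staged regularizer of claim 3",
        exhibit="Exh. 3, Fig. 2 (on/off ablation, matched seeds)",
        explanation="Training-time mechanism lowers inference latency at equal "
                    "accuracy; the bridge is stated explicitly.",
    ),
]
for row in rows:
    row.validate()
print(f"{len(rows)} chart rows validated: each has a traceable anchor.")
```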
In the argument section of a brief or office action response, synthesize the chart entries into a cohesive narrative. Start by articulating the reference’s stated objectives and constraints; then show how its dissuasion language marks the claimed approach as inconsistent with those objectives. Address the examiner’s motivation-to-combine theory directly: explain that a skilled artisan seeking the reference’s goals would not adopt the discouraged element, especially where the reference calls its own choices “essential” or warns of failure modes triggered by the alternative. Reinforce that this is not a mere preference but a design exclusion.
Next, present the secondary considerations as independent confirmation. Lay out the unexpected results with structured evidence: baseline parity, controlled ablations, and reproducible metrics. Clarify why the results defy field expectations, referencing standard trade-offs or documented limitations in the literature. Then, show the causal link from the claim limitation to the result. Finally, add any commercial success or long-felt-need evidence, carefully tying market outcomes or problem resolution to the claimed technical features.
Common pitfalls to avoid include overreading the prior art (attributing absolute prohibitions where the text only shows mild preference), conflating related but distinct AI stages (assigning an inference gain to a training-only change without mechanism), and relying on anecdotal or non-reproducible results. You should also avoid universal claims (“no prior art suggests…”) unless your search and citations support that breadth. Where your evidence relies on proprietary datasets or internal benchmarks, mitigate with methodological transparency and, where feasible, third-party replication or public analogs.
Throughout, keep your tone professional and precise. Use direct citations, define metrics, and avoid rhetorical overreach. Align each legal conclusion with a factual anchor: a quote for teaching-away, a controlled experiment for unexpected results, a declaration for commercial success, or a literature survey for long-felt need. This disciplined alignment produces courtroom-ready phrasing and enhances credibility under §§102/103 scrutiny.
By applying these diagnostic tools, phrasing templates, and evidence structures, you create a compelling, technically grounded narrative that distinguishes true teaching-away from mere preference and translates AI-specific performance gains into persuasive objective indicia of nonobviousness. The result is advocacy that is both precise and durable—suitable for claim charts, office action responses, and appeals in the demanding and fast-moving field of AI patents.
Key Takeaways
- Teaching-away requires explicit dissuasion and criticality (e.g., “unsuitable,” “incompatible,” “essential”) that would deter a skilled person—not just a preference or better performance claim.
- Build secondary considerations by tying unexpected results, commercial success, or long-felt need directly to specific claim elements, with clear training vs. inference locations and causal links.
- Support unexpected results with controlled, reproducible evidence (baselines, on/off ablations, documented settings) and explicitly bridge training-time mechanisms to inference-time metrics.
- In claim charts and arguments, quote dissuasion language to undercut motivation to combine, and map each claim limitation to objective indicia with traceable exhibits and metrics while avoiding overreach.
Example Sentences
- The paper expressly discourages modular fine-tuning, calling it “unsuitable for safety-critical latency,” which signals clear teaching-away rather than a mere preference.
- Our ablation shows an unexpected result: enabling the claimed loss schedule cuts GPU hours by 42% at equal accuracy, contradicting the field’s assumed efficiency–robustness trade-off.
- Because the reference labels end-to-end training “essential” and warns that swapping in the claimed adapter stack “fails to converge,” a skilled person would avoid that path.
- Customer declarations tie commercial success to the claimed inference quantization, noting SLA compliance on edge devices that prior art deemed incompatible with privacy constraints.
- The chart links the training–inference bridge: the staged regularizer reduces activation spread during training, which unexpectedly lowers peak inference memory without pruning.
Example Dialogue
Alex: The examiner says it’s just routine to add our adapter layer, but the prior art literally states “we avoid adapters due to instability in real-time.”
Ben: That’s teaching-away—explicit dissuasion tied to their latency objective, not a soft preference.
Alex: Exactly, and our unexpected results back it up: with the staged regularizer, we hit the same accuracy using 35% fewer GPU hours and shave 18% off inference latency.
Ben: Make the bridge explicit—training-time regularizer leading to inference-time latency—and cite the ablation.
Alex: I’ll add the quote with their “essential end-to-end” language and show the on/off ablation.
Ben: Good. That combination undercuts motivation to combine and gives objective indicia for nonobviousness.
Exercises
Multiple Choice
1. Which statement most clearly indicates teaching-away in an AI paper discussing real-time inference?
- We primarily use end-to-end training for simplicity.
- End-to-end training outperforms modular systems on our dataset.
- We avoid modular adapters as they are unsuitable for safety-critical latency.
- Modular adapters achieved slightly lower accuracy in our experiments.
Correct Answer: We avoid modular adapters as they are unsuitable for safety-critical latency.
Explanation: Teaching-away requires explicit dissuasion tied to objectives. “We avoid…unsuitable for safety-critical latency” is explicit dissuasion and incompatibility with objectives, not a mere preference.
2. You claim unexpected results because a training-time loss schedule reduces inference memory without pruning. What evidence best supports this under secondary considerations?
- A statement that the team believes the schedule is efficient.
- A single benchmark showing higher accuracy on an internal dataset.
- Ablation studies toggling the schedule on/off with matched settings and profiling traces showing memory reduction at equal accuracy.
- A competitor’s blog praising the model’s speed.
Correct Answer: Ablation studies toggling the schedule on/off with matched settings and profiling traces showing memory reduction at equal accuracy.
Explanation: Unexpected results must be causally tied to the claimed limitation and supported by controlled evidence (ablations, matched conditions) and reproducible metrics (profiling traces).
Fill in the Blanks
The reference labels end-to-end training as “___,” and warns that substituting adapters “fails to converge,” which signals teaching-away rather than preference.
Correct Answer: essential
Explanation: Criticality markers like “essential” show rigid design choices; when paired with negative statements about alternatives, they indicate teaching-away.
To make the training→inference bridge explicit, the chart should tie the claimed training mechanism to a specific inference metric and include ___ that isolate the effect of the claim element.
Correct Answer: ablations
Explanation: Evidence scaffolding requires ablations that turn the claimed element on/off to isolate its causal impact on the inference metric.
Error Correction
Incorrect: The paper prefers end-to-end training, so a skilled artisan would never combine it with adapters.
Correct Sentence: The paper merely shows a preference for end-to-end training; without explicit dissuasion or incompatibility language, it does not necessarily deter a skilled artisan from trying adapters.
Explanation: Preference alone is not teaching-away. Teaching-away needs clear dissuasion (e.g., “unsuitable,” “incompatible”) or criticality excluding the alternative.
Incorrect: Our commercial success proves nonobviousness because sales grew after release, regardless of which features drove adoption.
Correct Sentence: Our commercial success supports nonobviousness only if we tie sales growth to the claimed technical features, using evidence such as customer declarations or A/B tests showing causation.
Explanation: Objective indicia require a causal link between market success and the claimed limitations, not mere temporal correlation or branding effects.