Written by Susan Miller

From Funnel to Thesis: Building a Publishing-Ready Introduction for ICLR/ICML/NeurIPS with an ML Introduction Funnel Template

Struggling to craft an introduction that gets ICLR/ICML/NeurIPS reviewers to read on? In this lesson you’ll learn a disciplined, conference-ready “introduction funnel” that turns broad motivation into a single testable thesis and a tight roadmap so your claims are verifiable. You’ll get a reusable, LaTeX-friendly template, concrete examples with critique, and short exercises to practice writing hooks, gap statements, contribution bullets, and cross-references, all delivered with precise, actionable guidance tuned to top-tier ML submissions.

Step 1 — Deconstruct the Introduction Funnel

An "introduction funnel" is a deliberate, staged way to lead a reviewer from broad motivation to a single, testable thesis and roadmap. In ML conference settings (ICLR/ICML/NeurIPS), where page space and reviewer attention are scarce, the funnel compresses the essential rhetorical moves into a tightly ordered sequence: hook/motivation, problem/gap, approach overview (key idea), explicit contributions, and a one-sentence roadmap. Each stage has a clear role and an approximate token budget suited to an 8–10 page conference paper. Treat each stage as a fixed-capacity container: be ruthless with wording and prioritize clarity over flourish.

  • Hook / Motivation (3–4 sentences): The hook is the broad, quickly readable reason the topic matters. Its job is to orient the reviewer to the application domain or the scientific question without technical detail. For a page-limited submission, keep the hook to 3–4 short sentences (roughly 40–90 words). Use concrete, high-value contexts (e.g., long-range sequence modeling for genome analysis, or compute-limited on-device learning) rather than generic proclamations like "learning is important." A strong hook states a need or challenge felt by practitioners or by the research community.

  • Problem / Gap (2–3 sentences): Immediately after the hook, identify what current methods fail to deliver. This is not a literature review; it is a concise gap statement: what is still unsolved, under-explored, or mis-characterized? Keep this to 2–3 sentences (about 30–60 words). A crisp gap contrasts the practical or theoretical need with a short diagnosis of why existing approaches are insufficient—poor scaling, lack of theoretical guarantees, sparse empirical evidence on realistic benchmarks, or neglected tradeoffs.

  • Approach Overview / Key Idea (2–3 sentences): Now present the core idea of your paper in plain language. This should not be a full methods description but a conceptual sketch: how you change the modeling or the training to close the gap. Use 2–3 sentences (25–60 words) that convey novelty and intuition—e.g., a new architecture, a training objective, or a proof technique. Avoid heavy equations; focus on the mechanism and why it addresses the gap.

  • Key Contributions (2–4 bulleted items): In modern ML introductions, a numbered or bulleted contributions list is virtually standard and expected. Limit yourself to 2–4 bullets, typically three. Each bullet should be a single concise sentence that asserts a claim you will justify later: technical novelty, theoretical result (if any), and empirical evidence (datasets/benchmarks/metrics). Where possible, state the nature of the contribution alongside a concise quantitative claim (e.g., "reduces memory by 3x" or "improves F1 by 5 points on X"). This explicitness helps reviewers quickly evaluate novelty and significance.

  • Roadmap (1 sentence): Finish with a single sentence that tells readers where to find the details: "Sec. 3 describes the model; Sec. 4 the experimental setup; Sec. 5 discusses limitations." The roadmap is economical and functional: it sets expectations and signals organization, which helps reviewers navigate the rest of the paper efficiently.
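The five stages can be laid out as a LaTeX skeleton with the stage budgets kept as comments to trim against. A minimal sketch follows; the filler sentences are illustrative placeholders, not prescribed wording:

```latex
\section{Introduction}
% Hook / motivation (3--4 sentences, ~40--90 words)
Long-range sequence modeling increasingly requires transformers that
handle 10k+ tokens for genomics and long-document NLP. ...

% Problem / gap (2--3 sentences, ~30--60 words)
Existing sparse-attention methods scale poorly on long contexts, ...

% Approach overview / key idea (2--3 sentences, ~25--60 words)
To address this, we propose ..., which ...

% Contributions (2--4 parallel bullets)
\begin{itemize}
  \item A memory-efficient attention module, which ...
  \item A theoretical analysis, showing ...
  \item An evaluation on long-document benchmarks, demonstrating ...
\end{itemize}

% Roadmap (1 sentence)
Sec.~3 describes the model; Sec.~4 the experimental setup; Sec.~5
discusses limitations.
```

Keeping the budgets as comments during drafting makes overruns visible at a glance; delete the comments before submission.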

Strong vs. Weak Funnel Sentences

  • Strong hook sentence example pattern: "Many applications (A, B) require X under constraint Y, yet existing models struggle because Z." This is concise, application-linked, and diagnostic.
  • Weak hook sentence pattern to avoid: "Machine learning is widely used." This is generic and wastes space.
  • Strong gap sentence: "Existing method class M scales quadratically in sequence length, making it impractical for tasks with sequences >10k tokens." This points to a measurable deficiency.
  • Weak gap sentence: "Prior work is limited." Too vague; reviewers will ask "how?" immediately.
  • Strong approach sentence: "We introduce a sparse-attention transformer that adaptively prunes keys, reducing memory cost while preserving cross-sequence dependencies." This sketches mechanism and benefit.
  • Weak approach sentence: "We present a new method to improve transformers." Without mechanism or benefit, it lacks evaluative power.

Micro-guidelines and Token Budgets

For an 8–10 page paper: allocate about 160–270 words to the introduction funnel total. Rough budgeting: hook 60–90 words; gap 30–50 words; approach 30–50 words; contributions 30–60 words (three short bullets); roadmap 10–20 words. Keep sentences short (12–20 words) and use active voice. Always prioritize specificity: precise claims beat vague claims even if the latter sound grander.

Step 2 — Apply the ML Introduction Funnel Template

A reusable template reduces cognitive load and produces consistent, review-ready introductions. Below is a practical template with labeled slots. Each slot includes phrasing guidance to keep language tight and transferable across papers.

Template slots and phrasing patterns:

  • Hook (slot): "[Application or scientific area] increasingly requires [capability X] under [constraint Y]." Example phrasing: "Long-range sequence modeling increasingly requires efficient transformers that handle 10k+ tokens for genomics and long-document NLP."

  • Relation-to-practice (slot, 1 sentence optional): "However, in practical settings [constraint Y] causes [negative outcome]." Keeps the hook grounded in realistic limitations.

  • Formal problem statement (slot): "We study the problem of [concise formal problem], where [brief technical framing]." This translates the practical need into a researchable problem: e.g., "We study efficient self-supervised pretraining for transformers that scale sublinearly in memory." Avoid heavy notation here; reserve symbols for the methods section.

  • Key idea / thesis sentence (slot): "To address this, we propose [core technical idea], which [succinct benefit/insight]." This single thesis sentence should be reusable verbatim in the abstract and tweaked for the conclusion. Phrase it to show cause-effect: method → benefit.

  • Contributions (3 bullets slot): Each bullet follows the pattern: "(1) [Technical novelty]: We introduce X, a [short descriptor], which [what it enables or achieves]." "(2) [Theoretical or analysis result]: We prove/derive/characterize Y, showing Z." "(3) [Empirical claim]: We evaluate on [benchmarks] and demonstrate [quantitative gains]." Quantify wherever possible and mention measurement axes (accuracy, speed, memory, sample-efficiency).

  • Roadmap (slot): "The rest of the paper is organized as follows: Sec. 3 describes the model, Sec. 4 the experiments, Sec. 5 related work and limitations." Keep it short and specific.

How to fill the template: prioritize the thesis sentence and contributions. The thesis is the funnel’s apex: a single declarative sentence stating what you did and why it matters. Contribution bullets should be parallel in grammar and content: start each with a noun phrase (e.g., "A memory-efficient attention module"), then a comma, then the result clause (e.g., "which reduces peak memory by 3x on 10k-token inputs"). This parallelism helps reviewers parse claims quickly.
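The parallel noun-phrase pattern translates directly into an itemize block. In this sketch, the AKP module and the peak-memory figure echo the examples above, while the analysis and benchmark claims are hypothetical stand-ins:

```latex
\begin{itemize}
  \item \textbf{A memory-efficient attention module (AKP)}, which prunes
        keys adaptively and reduces peak memory by 3x on 10k-token
        inputs (Sec.~3).
  \item \textbf{A complexity analysis}, which characterizes when
        adaptive pruning preserves cross-sequence dependencies
        (App.~A).
  \item \textbf{An empirical study on genomics and long-document
        benchmarks}, which demonstrates comparable accuracy at a
        fraction of the memory (Sec.~4, Table~1).
\end{itemize}
```

Note that every bullet starts with a bolded noun phrase and continues with a "which ..." result clause, so a skimming reviewer can read just the bold text and still recover the claims.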

Adapting to page limits: If you have more space (10 pages), you may expand the approach overview to 3 sentences and add a fourth contribution bullet for additional experimental analyses. If you have less (8 pages), compress the hook to 2 sentences and make contributions even more compact—combine technical novelty and empirical claim into one bullet if necessary.

Step 3 — Seamlessly Link to the Rest of the Paper and Compliance Requirements

An introduction is persuasive only if its claims are clearly verifiable in the rest of the paper. Therefore, each claim in the funnel should point to where evidence or details appear. Use inline cross-references sparingly and strategically: after the approach sentence, add "(Sec. 3)"; after an empirical claim, add "(Sec. 4)". Example micro-phrasing: "We introduce X, a memory-efficient attention mechanism (Sec. 3)." This preempts a frequent reviewer question: "Where are the details?"
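In LaTeX, write these cross-references with \label and \ref (or cleveref's \Cref) rather than hard-coding section numbers, so the pointers survive reordering. The label names below are illustrative:

```latex
% In the introduction:
We introduce X, a memory-efficient attention mechanism
(Sec.~\ref{sec:method}), and evaluate it on long-document benchmarks
(Sec.~\ref{sec:experiments}, Table~\ref{tab:main}).

% Later in the paper:
\section{Method}\label{sec:method}
\section{Experiments}\label{sec:experiments}
% In a table environment, place \label after \caption:
% \caption{Main results.}\label{tab:main}
```

A hard-coded "(Sec. 3)" silently goes stale when sections move during revision; references generated from labels do not.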

Aligning with abstract, methods, experiments, and limitations

  • Abstract: The thesis sentence in your introduction should have a compressed twin in the abstract. Write the thesis first in the intro and then craft a shorter version for the abstract. This consistency allows reviewers to connect the high-level claim they skim in the abstract with the fuller story in the introduction and the evidence later.

  • Methods / Experiments: Ensure the contributions list references the sections where the methods and experiments provide support. Contributions that assert empirical gains should include the benchmark names and a cross-reference (e.g., "(Sec. 4.2)"), so reviewers can rapidly find tables/plots that substantiate claims.

  • Limitations and Appendix: If your method has clear limitations (e.g., training instability, narrow applicability), briefly signal them in the intro's roadmap or via a parenthetical "(see Sec. 5 for limitations)". Move heavy proofs or extended experimental details that cannot fit in the main text to the appendix and point to them: "Proofs and extended experiments are in App. A and App. B." This signals reproducibility planning and respects page limits.

Anonymization and camera-ready checklist items

  • Anonymization: For double-blind review, avoid naming institutions or linking code repos with identifying metadata in the introduction. If necessary, add a neutral sentence in the intro or a footnote instructing reviewers where anonymized code or dataset references appear in the appendix. Do not include acknowledgments.
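One common LaTeX pattern for double-blind submissions is a neutral footnote pointing to anonymized material. A sketch, assuming the url or hyperref package is loaded; the elided path is a deliberate placeholder to be filled with your own anonymized link:

```latex
% Double-blind review: no institutions, no identifying repository links.
We release code and configurations for all experiments.%
\footnote{Anonymized code is available at
\url{https://anonymous.4open.science/r/...}; dataset references appear
in App.~B.}
```

At camera-ready time, this footnote is one of the anonymization placeholders to replace with the real repository link.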

  • Page-limit accounting: Be mindful that some conferences count references or appendices differently during review. Keep a running tally of words dedicated to the funnel and trim earlier than you think you need to. If you must cut, shorten the hook and merge the roadmap into the final contribution bullet.

  • Camera-ready checklist: After acceptance, follow the camera-ready checklist to reinsert acknowledgments and dataset licenses, expand the appendix with reproducibility checklists, and remove anonymization placeholders. Prepare to move some experimental details from the appendix into main text only if page allowance increases.

Cross-reference checklist for introduction drafts

  • After the thesis sentence: add "(Sec. 3)" for methods.
  • For each empirical claim in contributions: add target section/table "(Sec. 4 / Table 1)".
  • For theoretical claims: reference the proof location "(App. A)".
  • For limitations / ethical considerations: reference the discussion section "(Sec. 5)".

Trimming tips to match reviewer behavior

Reviewers often skim introductions. Use clear signposting and active nouns. If you must cut words, reduce the hook and collapse related-work hints out of the intro and into the explicit related-work section. Keep contribution bullets punchy and parallel—these are the lines reviewers most often read first. Finally, read the introduction out loud and delete any sentence that does not directly support the thesis or guide the reviewer to evidence.

Conclusion

The ML introduction funnel template is a focused instrument for turning a handful of high-priority claims into a compact, persuasive, and review-friendly opening. By allocating token budgets, using parallel contribution bullets, and including precise cross-references, you create a clear chain from motivation to verifiable evidence. Carefully weave anonymization and appendix pointers into the introduction to remain compliant, and reserve space for limitations so reviewers will not be surprised. Practiced application of this funnel will make your introductions predictable, short, and compelling—exactly what top ML venues reward.

  • Use the introduction funnel: hook (3–4 short sentences) → problem/gap (2–3 sentences) → approach overview (2–3 sentences) → 2–4 bullet contributions → one‑sentence roadmap.
  • Keep language tight and specific: short sentences, active voice, concrete applications and measurable gaps (e.g., memory, scaling, benchmarks).
  • Make the thesis and contribution bullets verifiable: state mechanism + benefit, quantify claims when possible, and add cross-references to sections/tables (e.g., Sec. 3, Sec. 4 / Table 1).
  • Budget words for an 8–10 page paper (≈160–270 words total): trim the hook first, keep contributions parallel and punchy, and signal limitations/anonymization or appendix locations when needed.

Example Sentences

  • Long-range sequence modeling increasingly requires efficient transformers that handle 10k+ tokens for genomics and legal-document analysis.
  • Existing sparse-attention methods scale poorly on long contexts, making them impractical for real-world genome and video tasks.
  • To address this, we propose Adaptive Key Pruning, a lightweight sparse-attention mechanism that reduces peak memory while preserving cross-sequence dependencies (Sec. 3).
  • We introduce AKP, a memory-efficient attention module, which reduces peak memory by 3x on 10k-token inputs and maintains comparable accuracy (Sec. 4 / Table 1).
  • The rest of the paper is organized as follows: Sec. 3 describes the model, Sec. 4 presents experiments on genomics and long-document benchmarks, and Sec. 5 discusses limitations.

Example Dialogue

Alex: Our reviewer will skim the intro—can we compress the hook to two sentences and add a precise gap claim?

Ben: Yes—start with the application (genomics and long documents), then state the gap: existing attention scales quadratically and fails beyond 10k tokens.

Alex: Good. Then a single thesis sentence like “We propose Adaptive Key Pruning, which reduces memory by 3x while preserving performance (Sec. 3),” followed by three concise contribution bullets.

Ben: Exactly—keep the bullets parallel and cross-reference the experiment section for each empirical claim so reviewers can verify quickly.

Exercises

Multiple Choice

1. Which sentence best functions as a strong 'hook' for an ML conference introduction funnel?

  • Machine learning is widely used.
  • Long-range sequence modeling increasingly requires efficient transformers that handle 10k+ tokens for genomics and legal-document analysis.
  • In this paper we present a new model that improves performance.
Show Answer & Explanation

Correct Answer: Long-range sequence modeling increasingly requires efficient transformers that handle 10k+ tokens for genomics and legal-document analysis.

Explanation: A strong hook links a concrete application and constraint (long-range modeling, 10k+ tokens, genomics/legal documents). It orients reviewers quickly and avoids vague proclamations like 'Machine learning is widely used.'

2. Which gap sentence is the most effective for the Problem / Gap stage?

  • Prior work is limited.
  • Existing sparse-attention methods scale poorly on long contexts, making them impractical for real-world genome and video tasks.
  • Many papers study attention mechanisms.
Show Answer & Explanation

Correct Answer: Existing sparse-attention methods scale poorly on long contexts, making them impractical for real-world genome and video tasks.

Explanation: An effective gap sentence specifies a measurable deficiency (poor scaling on long contexts) and links it to practical consequences. Vague statements like 'Prior work is limited' fail to explain how or why.

Fill in the Blanks

We introduce AKP, a memory-efficient attention module, which reduces peak memory by 3x on 10k-token inputs and maintains comparable accuracy (Sec. ___).

Show Answer & Explanation

Correct Answer: 4

Explanation: Empirical claims in the contributions should reference the section with experiments. The template recommends pointing to the experiments section (Sec. 4) for empirical evidence.

A concise roadmap sentence should tell the reader where to find details; for example: "The rest of the paper is organized as follows: Sec. 3 describes the model, Sec. 4 the experiments, and Sec. ___ discusses limitations."

Show Answer & Explanation

Correct Answer: 5

Explanation: The guideline suggests using a dedicated section for limitations/discussion, commonly numbered after methods and experiments (Sec. 5). Roadmaps should be short and specific.

Error Correction

Incorrect: We present a new method to improve transformers.

Show Correction & Explanation

Correct Sentence: We introduce a sparse-attention transformer that adaptively prunes keys, reducing memory cost while preserving cross-sequence dependencies.

Explanation: The original sentence is weak because it lacks mechanism and benefit. The corrected sentence follows the strong approach pattern: names the mechanism (sparse-attention with adaptive pruning) and states the benefit (reduced memory while preserving dependencies), making the claim evaluative and informative.

Incorrect: Each contribution bullet should be long and include background literature to be comprehensive.

Show Correction & Explanation

Correct Sentence: Each contribution bullet should be a single concise sentence that asserts a claim (technical novelty, theory, or empirical evidence) and includes a quantitative or specific qualifier when possible.

Explanation: Contribution bullets in the funnel must be short and claim-focused so reviewers can quickly assess novelty and significance. Long bullets with literature review break the funnel's fixed-capacity constraint and reduce clarity.