Atiendia
✅ Evidence-based validation

Novelty Validation
The System Doesn't Hallucinate

Generating ideas is easy. Determining if they are new and valuable is the real challenge. The system includes an automated audit module that acts as a preliminary "Peer Reviewer".

⚠️

The Problem of "Novel Ideas"

Generative AI systems can produce hypotheses that sound plausible but have actually already been explored in the literature. Without external validation, you risk:

  • × Reinventing the wheel: Wasting resources on research already done
  • × False novelties: Presenting something as original that already exists
  • × Wasted time: Researchers following leads that already have answers

The Solution: Truth-Checking with Global Databases

Every idea is validated against global scientific literature in real-time

📚

Semantic Scholar

AI2 (Allen Institute)

✓ 200M+ papers indexed
✓ Complete Citation Graph (who cites whom)
✓ Semantic API: search by concepts, not just keywords
✓ Automatic TL;DR generated by AI
✓ Multidisciplinary coverage (CS, medicine, physics, etc.)
🌍

OpenAlex

OurResearch (ex Microsoft Academic)

✓ 250M+ works cataloged
✓ 100% Open Access (free and complete API)
✓ Global coverage: papers in all languages
✓ Rich metadata: authors, institutions, concepts
✓ Continuous updates (new papers daily)
🔍 More than 400 million papers queried automatically

Evidence-Based Validation Process

Three steps for each generated idea

1

🔍 Evidence Retrieval

The system builds semantic queries from the proposed idea and actively searches Semantic Scholar and OpenAlex for papers that have already explored that specific combination of concepts.

Example query:

"transfer learning computer vision natural language processing cross-domain application"
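The retrieval step can be sketched as follows. The two endpoint paths are the public search endpoints of the Semantic Scholar Graph API and the OpenAlex API; the field list, result limit, and the `build_search_urls` helper are illustrative assumptions, not the system's actual code:

```python
from urllib.parse import urlencode

S2_SEARCH = "https://api.semanticscholar.org/graph/v1/paper/search"
OPENALEX_SEARCH = "https://api.openalex.org/works"

def build_search_urls(idea_concepts, limit=20):
    """Build one search URL per truth source from the idea's key concepts."""
    query = " ".join(idea_concepts)
    return {
        "semantic_scholar": S2_SEARCH + "?" + urlencode(
            {"query": query, "fields": "title,abstract,year,citationCount",
             "limit": limit}),
        "openalex": OPENALEX_SEARCH + "?" + urlencode(
            {"search": query, "per-page": limit}),
    }

urls = build_search_urls(["transfer learning", "computer vision",
                          "natural language processing",
                          "cross-domain application"])
# Each URL can then be fetched, e.g. with requests.get(url).json()
```
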

2

🤖 Automated Judgment

A specialized evaluator model (LLM with review prompt) analyzes:

  • → The proposed idea vs the abstracts/conclusions of found papers
  • → If the similarity is superficial (keywords) or substantial (same hypothesis)
  • → Nuances that could differentiate the new idea from existing ones
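A minimal sketch of how such a review prompt might be assembled. The `build_judge_prompt` helper and its wording are hypothetical, not the system's actual prompt:

```python
def build_judge_prompt(idea, evidence):
    """Assemble the evaluator LLM's review prompt (wording is illustrative)."""
    # Number each retrieved paper so the judge can cite it in its rationale
    refs = "\n".join(
        f"[{i + 1}] {p['title']} ({p['year']}): {p['abstract'][:300]}"
        for i, p in enumerate(evidence)
    )
    return (
        "You are a scientific peer reviewer. Compare the proposed idea with "
        "the evidence below. Decide whether any paper explores the SAME "
        "hypothesis (not merely shared keywords), and note nuances that "
        "differentiate the idea. Answer NOVEL, KNOWN, or EMERGING.\n\n"
        f"Idea: {idea}\n\nEvidence:\n{refs}"
    )
```
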
3

📊 State Classification

Based on the recovered evidence, the idea is classified into one of three states, or flagged as UNCERTAIN when the evidence is inconclusive:

Novelty Validation States

Automatic classification of novelty level based on scientific evidence

🟢

NOVEL_BRIDGE

Genuinely New

No significant evidence was recovered connecting the key concepts of the proposed idea. It is a genuine research gap with high probability of being unpublished.

Metrics:

  • novelty_score > 0.8
  • Exhaustive search without relevant results
  • "Blue Ocean" opportunity

🚀 Recommended action:

✓ High priority for research
✓ Proceed with formal hypothesis development
✓ Design preliminary experiments
✓ Secure resources and funding

🟡

EMERGING_LINK

Recent Trend

Scattered or very recent evidence found. The topic is just beginning to be explored: there are recent papers (from the last 1-2 years) but no consensus or standard solution yet.

Metrics:

  • novelty_score 0.4 - 0.7
  • Found < 5 related papers
  • Publication dates clustered in recent years

⚠️ Recommended action:

✓ Review the found papers in depth
✓ Differentiate the proposal clearly from them
✓ Good for incremental publication (State of the Art + Delta)

🔴

KNOWN_LINK

Consolidated

The connection is well established in prior literature. The system provides existing references (prior_art_refs) for consultation.

Metrics:

  • knownness_score > 0.75
  • Pre-existing consolidated literature
  • Marked as redundant

🔄 Recommended action:

× Discard or pivot the original idea
✓ Read existing papers (learning)
✓ Look for a completely unexplored angle
✓ Consider novel extensions or variations

⚪

UNCERTAIN

Ambiguous

Insufficient or contradictory evidence. The LLM evaluator could not confidently determine if the idea is new or existing.

Possible causes:

  • Concepts in emerging interdisciplinary areas
  • Ambiguous or variable terminology
  • Limited coverage in databases

🔍 Recommended action:

⚠ Requires expert human eye
⚠ Review recovered evidence manually
⚠ Consult with domain expert
⚠ Refine the idea formulation and re-validate

External Judge Decisions

Specialized evaluator model (e.g., GPT-4o mini) that analyzes abstracts

✨

NOVEL

Judge Verdict

The specific combination of concepts does not appear in the recovered evidence. The analyzed abstracts show no direct interaction between the proposed ideas.

📚

KNOWN

Judge Verdict

The concepts appear directly interacting in the evidence. Multiple papers demonstrate that the connection has already been explored or implemented.

🌱

EMERGING

Judge Verdict

The interaction appears only in recent literature. The topic is young and is in active exploration phase by the scientific community.

Scoring Metrics

N

novelty_score

Range: 0.0 - 1.0

Estimated probability that the idea is unpublished. Calculated from:

  • Absence of papers with high semantic similarity
  • External judge verdict (high weight)
  • Quantity and quality of recovered evidence

Typical threshold: novelty_score > 0.8 β†’ NOVEL_BRIDGE classification

K

knownness_score

Range: 0.0 - 1.0

Degree of certainty that the idea already exists in the literature. Conceptual inverse of novelty_score. Signals considered:

  • Presence of multiple highly relevant papers
  • Cross-citations between recovered papers
  • Age of related publications

Typical threshold: knownness_score > 0.75 β†’ KNOWN_LINK classification
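As a sketch, the novelty_score signals listed above could be combined linearly. The verdict mapping and the 0.6/0.3/0.1 weights are illustrative assumptions (the page only says the judge verdict carries high weight):

```python
def novelty_score(judge_verdict, max_similarity, evidence_count):
    """Combine the three novelty signals into a single 0-1 score (weights assumed)."""
    verdict_signal = {"NOVEL": 1.0, "EMERGING": 0.5, "KNOWN": 0.0}[judge_verdict]
    similarity_signal = 1.0 - max_similarity        # no near-duplicate abstracts found
    evidence_signal = 1.0 / (1.0 + evidence_count)  # fewer relevant hits -> more novel
    return 0.6 * verdict_signal + 0.3 * similarity_signal + 0.1 * evidence_signal
```
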

📋

Evidence Requirements

Each classification must be backed by evidence to ensure system reliability:

Minimum references

min_evidence_refs = 2 (default)

At least 2 recovered documents are required to make a reliable classification.

Truth sources

• OpenAlex (250M+ works)
• Semantic Scholar (200M+ papers)
• Real-time query

Quantifiable Value

The real impact of automated validation

~70%

Of unvalidated AI-generated ideas turn out to be redundant or already published

10-20x

Faster than exhaustive manual literature review

$$$

Saved in research resources by avoiding duplicated effort

Technical Details

How it works under the hood

🔍

Query Generation

From the proposed idea, the system generates multiple optimized queries:

  • Main query: Key concepts of the idea
  • Alternative queries: Synonyms and paraphrases
  • Semantic expansion: Related concepts
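A minimal sketch of the query-expansion step; the `expand_queries` helper and its synonym table are hypothetical:

```python
def expand_queries(concepts, synonyms):
    """Build the main query plus alternatives by substituting synonyms.

    `synonyms` maps a concept to alternative phrasings (illustrative data).
    """
    main = " ".join(concepts)
    alternatives = []
    for concept, alts in synonyms.items():
        for alt in alts:
            alternatives.append(main.replace(concept, alt))
    return [main] + alternatives

qs = expand_queries(
    ["transfer learning", "cross-domain"],
    {"transfer learning": ["knowledge transfer", "domain adaptation"]})
# qs[0] is the main query; qs[1:] are synonym variants sent to the same APIs
```
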
📊

Scoring & Ranking

Retrieved papers are ranked by relevance and top-K is analyzed:

  • Semantic similarity: Idea vs abstract embeddings
  • Citation count: Weighted by paper impact
  • Recency: More recent papers have higher weight
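The ranking step might look like the sketch below; the 0.6/0.25/0.15 weights and the decay constants are illustrative assumptions, not the product's actual tuning:

```python
import math
from datetime import date

def rank_papers(papers, top_k=5, this_year=None):
    """Rank retrieved papers by a weighted relevance score, keep the top-K."""
    this_year = this_year or date.today().year
    def score(p):
        recency = math.exp(-(this_year - p["year"]) / 5)  # favour recent work
        impact = math.log1p(p["citations"]) / 10          # dampen huge counts
        return 0.6 * p["similarity"] + 0.25 * recency + 0.15 * impact
    return sorted(papers, key=score, reverse=True)[:top_k]
```
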
🤖

LLM Evaluator

Model specialized in critical idea review:

  • Chain-of-Thought: Explicit step-by-step reasoning
  • Few-shot examples: Curated evaluation examples
  • Structured output: JSON with state + rationale
📝

Output Enrichment

Each result includes actionable metadata:

  • State: NOVEL_BRIDGE / EMERGING_LINK / KNOWN_LINK / UNCERTAIN
  • Confidence score: 0-1 certainty level
  • References: Direct links to related papers
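For illustration, an enriched result could look like the JSON below. `state` and `prior_art_refs` appear on this page; `confidence` and `rationale` are assumed field names standing in for the confidence score and the judge's reasoning:

```python
import json

# Example of the structured output attached to each validated idea (schema assumed)
raw = """{
  "state": "NOVEL_BRIDGE",
  "confidence": 0.87,
  "rationale": "No retrieved abstract combines the two key concepts directly.",
  "prior_art_refs": []
}"""
result = json.loads(raw)
assert result["state"] in {"NOVEL_BRIDGE", "EMERGING_LINK", "KNOWN_LINK", "UNCERTAIN"}
```
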

Validate the Novelty of Your Ideas Before Investing Resources

Don't let months of research end with a "this was already done in 2019". Validate automatically before you start.