# 11 Disease Analysis Logic

## Quick Answer
Disease Analysis aggregates direct and predicted disease links from gene and miRNA evidence, using canonical `Disease_ID` grouping to reduce duplicate naming artifacts.

## What this does
Builds grouped disease summaries, evidence accordions, and category visualizations.

## Inputs
- Query-linked gene and miRNA disease association rows
- Disease metadata and publication fields

## Outputs
- Bubble/treemap summary by disease categories
- Grouped disease associations with counts
- Evidence tables with publication links

## Interaction UX (Phase 11)
- Disease score lollipop points are click-selectable and jump to matching grouped disease evidence accordions.
- Grouped accordion entries stay traceable to their canonical `Disease_ID` and can be opened directly by drilling down from a chart.

## How calculated
Associations are merged and grouped primarily by `Disease_ID`. Display names are normalized for readability while raw names remain available for traceability. Group counts track evidence rows and gene/miRNA contributions.
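The grouping step described above can be sketched in a few lines of Python. This is a minimal illustration, not the module's actual code: only `Disease_ID` comes from this document, while the row fields `Disease_Name`, `Entity_Type`, and `Entity`, the function name `group_associations`, and the whitespace-plus-title-case normalization rule are assumptions made for the example.

```python
from collections import defaultdict


def group_associations(rows):
    """Group association rows by canonical Disease_ID (illustrative sketch).

    Field names other than Disease_ID are hypothetical.
    """
    groups = defaultdict(lambda: {
        "display_name": None,   # normalized name shown to the user
        "raw_names": set(),     # original names kept for traceability
        "evidence_rows": 0,     # number of evidence rows in the group
        "genes": set(),         # contributing genes
        "mirnas": set(),        # contributing miRNAs
    })
    for row in rows:
        group = groups[row["Disease_ID"]]
        raw_name = row["Disease_Name"]
        group["raw_names"].add(raw_name)
        if group["display_name"] is None:
            # Assumed normalization: collapse whitespace, then title-case.
            group["display_name"] = " ".join(raw_name.split()).title()
        group["evidence_rows"] += 1
        if row["Entity_Type"] == "gene":
            group["genes"].add(row["Entity"])
        else:
            group["mirnas"].add(row["Entity"])
    return dict(groups)


# Placeholder rows: two naming variants share one Disease_ID.
example_rows = [
    {"Disease_ID": "ID_1", "Disease_Name": "example  disease",
     "Entity_Type": "gene", "Entity": "GENE_A"},
    {"Disease_ID": "ID_1", "Disease_Name": "Example Disease",
     "Entity_Type": "mirna", "Entity": "MIR_X"},
    {"Disease_ID": "ID_2", "Disease_Name": "other disease",
     "Entity_Type": "gene", "Entity": "GENE_B"},
]
grouped = group_associations(example_rows)
```

Because the key is the ID rather than the name, the two spellings of the first disease collapse into one group while both raw names remain available.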

The grouped display priority is:

`100 * (0.28 * evidence_breadth + 0.22 * support_breadth + 0.14 * source_breadth + 0.12 * publication_breadth + 0.12 * direct_fraction + 0.12 * score_signal)`.

Evidence breadth, support breadth, and publication breadth are each log-normalized against the maximum value in the current result set. Source breadth is divided by the maximum in the current result set. `direct_fraction` is the fraction of rows that come from directly queried entities, and `score_signal` is the clipped median source score. The six weights sum to 1, so the priority ranges from 0 to 100. It only determines review order; it is not a disease probability, penetrance estimate, or clinical risk.
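The weighted sum can be sketched as follows. The weights are the ones stated above; everything else is an assumption for illustration: the `log1p(value) / log1p(max)` form of log-normalization, clipping `score_signal` to the range 0 to 1, reading support breadth as a count of distinct supporting entities, and all input field names (`evidence_rows`, `supporting_entities`, `sources`, `publications`, `direct_rows`, `source_scores`).

```python
import math
from statistics import median

# Weights taken from the priority formula in this section.
WEIGHTS = {
    "evidence_breadth": 0.28,
    "support_breadth": 0.22,
    "source_breadth": 0.14,
    "publication_breadth": 0.12,
    "direct_fraction": 0.12,
    "score_signal": 0.12,
}


def log_norm(value, max_value):
    """Assumed log-normalization: log1p(value) / log1p(max_value)."""
    if max_value <= 0:
        return 0.0
    return math.log1p(value) / math.log1p(max_value)


def display_priority(group, maxima):
    """Return a 0-100 review-order priority for one disease group.

    `group` and `maxima` use hypothetical field names; `maxima` holds the
    per-field maximum across the current result.
    """
    rows = group["evidence_rows"]
    scores = group["source_scores"]
    components = {
        "evidence_breadth": log_norm(rows, maxima["evidence_rows"]),
        "support_breadth": log_norm(group["supporting_entities"],
                                    maxima["supporting_entities"]),
        "source_breadth": (group["sources"] / maxima["sources"]
                           if maxima["sources"] else 0.0),
        "publication_breadth": log_norm(group["publications"],
                                        maxima["publications"]),
        "direct_fraction": group["direct_rows"] / rows if rows else 0.0,
        # Assumed clipping range of [0, 1] for the median source score.
        "score_signal": (min(1.0, max(0.0, median(scores)))
                         if scores else 0.0),
    }
    return 100 * sum(WEIGHTS[name] * value
                     for name, value in components.items())
```

Under these assumptions, a group that matches every maximum, has only direct rows, and has a median source score of at least 1 reaches the top of the range, while a group with no evidence scores 0.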

## What to download
Download the grouped export and the full association export together so that reports show both the canonical aggregation and the underlying evidence rows.

## Known limits
Predicted associations depend on source-specific evidence models. Treat predicted links as hypothesis-generating, not confirmatory.
