# 07 Cell Specificity Logic

## Quick Answer
Cell Specificity ranks cell types by query relevance using expression- and marker-aware scoring. It is a prioritization layer for downstream communication analysis.

## What this does
Ranks cell types against the active query and lets you pass a selected subset of cells to the communication module.

## Inputs
- Query molecules
- Direct seed genes and expanded support genes from the active query state
- `cell_specificity_unified.parquet` with expression values, marker flags, cell type, and system
- `gene_annotations.parquet` and ligand-receptor pairs for communication readiness
- Ranking method settings

## Outputs
- Ranked cell table
- Top-cell lollipop/system plots
- Selected cell set passed to communication module

## Interaction UX (Phase 11)
- System bars are click-selectable and filter the ranking table to one system.
- Lollipop dots are click-selectable and auto-select the matching cell in the table.
- The table is grouped by system with select-all toggles to reduce long-scroll behavior while preserving explicit checkbox control.
- Top-ranked cells are pre-selected by default and reflected in the table checkmarks.

## How calculated
Default ranking follows the current `/api/cell_specificity` context score.

1. EVd3x builds a final gene set from direct seed genes plus expanded target or network-support genes.
2. It scans the rows of `cell_specificity_unified.parquet` that match this gene set, with `MAX_CELLSPEC_SCAN_ROWS` as a processing guard on the number of rows scanned.
3. For each gene, expression is z-scored across cell types. The z-score is clipped to the range 0 to 3 and scaled to 0 to 1, so only above-average expression contributes:
   `specificity_signal = clip(z_score, 0, 3) / 3`.
4. For each cell type and system, direct seed genes and expanded support genes are summarized separately.
5. Direct-seed context score, where `expanded_support` is the value defined in step 6:
   `100 * (0.42 * seed_coverage + 0.33 * seed_specificity + 0.13 * marker_support + 0.12 * expanded_support)`.
6. Expanded support:
   `0.60 * expanded_coverage + 0.40 * expanded_specificity`.
7. If no direct seed gene is available, the fallback context score is:
   `100 * (0.60 * expanded_support + 0.25 * expanded_specificity + 0.15 * expanded_coverage)`.
8. Communication readiness:
   `100 * (0.65 * localization_ready_fraction + 0.35 * expressed_query_ligand_fraction)`.
9. Composite score:
   `0.76 * context_score + 0.14 * system_relevance * confidence_scale + 0.10 * communication_readiness`.
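
Step 3 can be illustrated with a minimal sketch for a single gene. The function name `specificity_signals` is hypothetical, and the choice of population standard deviation and the handling of flat expression are assumptions; the document does not state how the z-score is computed in EVd3x.

```python
from statistics import mean, pstdev


def specificity_signals(expression_by_cell: dict[str, float]) -> dict[str, float]:
    """Return clip(z_score, 0, 3) / 3 per cell type for one gene."""
    values = list(expression_by_cell.values())
    if not values:
        return {}
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    signals = {}
    for cell_type, value in expression_by_cell.items():
        # Assumption: a gene with identical expression everywhere has no specificity.
        z_score = (value - mu) / sigma if sigma > 0 else 0.0
        # Clip to [0, 3], then scale to [0, 1] as in step 3.
        signals[cell_type] = min(max(z_score, 0.0), 3.0) / 3.0
    return signals


# Illustrative expression values, not taken from the real parquet file.
signals = specificity_signals({"T cell": 1.0, "B cell": 1.0, "Hepatocyte": 4.0})
print(round(signals["Hepatocyte"], 3))  # 0.471
print(signals["T cell"])                # 0.0
```

Below-average expression maps to 0 rather than a negative value, so a gene can only add to a cell type's specificity, never subtract from it.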

The API also reports carryover labels such as blood, epithelial/skin, or endothelial/stromal background. These labels guide interpretation and do not change the ranking score.
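
The weighted sums in steps 5 to 9 can be written as plain functions, which makes it easier to check a reported score by hand. This is a sketch of the formulas as documented, not the `/api/cell_specificity` implementation. The function names and the `has_direct_seed` flag are hypothetical, and the example assumes `system_relevance` is on a 0 to 100 scale, which the document does not state.

```python
def expanded_support(expanded_coverage: float, expanded_specificity: float) -> float:
    # Step 6
    return 0.60 * expanded_coverage + 0.40 * expanded_specificity


def context_score(seed_coverage: float, seed_specificity: float,
                  marker_support: float, expanded_coverage: float,
                  expanded_specificity: float, has_direct_seed: bool = True) -> float:
    support = expanded_support(expanded_coverage, expanded_specificity)
    if has_direct_seed:
        # Step 5: direct-seed context score
        return 100 * (0.42 * seed_coverage + 0.33 * seed_specificity
                      + 0.13 * marker_support + 0.12 * support)
    # Step 7: fallback when no direct seed gene is available
    return 100 * (0.60 * support + 0.25 * expanded_specificity
                  + 0.15 * expanded_coverage)


def communication_readiness(localization_ready_fraction: float,
                            expressed_query_ligand_fraction: float) -> float:
    # Step 8
    return 100 * (0.65 * localization_ready_fraction
                  + 0.35 * expressed_query_ligand_fraction)


def composite_score(context: float, system_relevance: float,
                    confidence_scale: float, readiness: float) -> float:
    # Step 9
    return (0.76 * context + 0.14 * system_relevance * confidence_scale
            + 0.10 * readiness)


# Illustrative inputs only.
ctx = context_score(0.5, 0.4, 1.0, 0.5, 0.5)
ready = communication_readiness(0.8, 0.4)
print(round(ctx, 1))    # 53.2
print(round(ready, 1))  # 66.0
# Assumption: system_relevance on a 0-100 scale, confidence_scale in [0, 1].
print(round(composite_score(ctx, 50.0, 1.0, ready), 3))  # 54.032
```

The weights in each formula sum to 1, so a context score or readiness score built from fractions in the range 0 to 1 stays within 0 to 100.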

## What to download
Export the ranking table and the selected-cell file so that reports state exactly which cell populations were included.

## Known limits
Cell context scores prioritize cell types for review. They do not infer EV cell of origin, uptake, cell-state causality, or disease specificity. Different ranking methods can reorder cells whose scores are close. Document the method, selected cell set, direct seed count, expanded support count, and whether the scan reached the row guard.
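
One way to capture the items listed above is a small record saved next to the exported files. The sketch below is only a suggestion: `RankingRunRecord`, its field names, and `scan_reached_guard` are hypothetical and are not part of the EVd3x exports, and every value shown is a placeholder.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class RankingRunRecord:
    """Hypothetical provenance record for one ranking run."""
    ranking_method: str
    selected_cells: list[str]
    direct_seed_count: int
    expanded_support_count: int
    scan_reached_row_guard: bool


def scan_reached_guard(rows_scanned: int, max_scan_rows: int) -> bool:
    # Assumption: the guard counts as reached once the scan hits the row limit.
    return rows_scanned >= max_scan_rows


record = RankingRunRecord(
    ranking_method="context_score",           # placeholder method label
    selected_cells=["T cell", "Hepatocyte"],  # placeholder selection
    direct_seed_count=4,
    expanded_support_count=37,
    # Placeholder numbers; the real MAX_CELLSPEC_SCAN_ROWS value is not given here.
    scan_reached_row_guard=scan_reached_guard(rows_scanned=1200, max_scan_rows=5000),
)

print(json.dumps(asdict(record), indent=2))
```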
