// introduction
The Knowledge That Lives Only in One Person's Head
Every mature SOC has the same structural problem: triage quality is heavily analyst-dependent. A senior and a junior analyst looking at the same alert produce different outcomes, different escalation rates, and different investigation paths. This is usually framed as an "experience gap": inevitable, and closing naturally over time. That framing is accurate, and it is also an abdication. It treats knowledge transfer as unsolvable rather than as an engineering problem.
The organizational cost: inconsistent triage produces missed escalations, unnecessary P1 noise, analyst burnout in junior tiers, and a SOC whose quality is directly correlated with staff retention, the single most volatile variable in the business.
Expert triage judgment is not mystical pattern recognition that cannot be articulated. It is a set of heuristics, decision rules, and environmental priors that have never been made explicit because nobody asked the senior analyst to articulate them. Making them explicit is a program design problem, and it has tractable solutions.
// cognitive
What Expert Triage Judgment Actually Is
Gary Klein's Recognition-Primed Decision (RPD) model explains the mechanism: experts do not evaluate options and choose the best one. They recognize situational patterns and match them to previously successful responses. Expert analysts are not doing slower, more rigorous analysis than junior analysts; they are doing faster, pattern-matched analysis that bypasses explicit deliberation. The result looks like intuition. The mechanism is an accumulated pattern library.
What the expert's pattern library contains that the junior's does not:
- Environmental priors: knowledge of what is normal for this specific organization's infrastructure, not just what is normal generically
- False positive signatures: accumulated knowledge of which alert types reliably produce benign explanations in this specific environment
- Escalation triggers: specific combinations of fields, timing, or context that have previously indicated real threats, often combinations that are not in any runbook
- Investigative shortcuts: the three lookups that tell you in ninety seconds what would otherwise take twenty minutes
When a junior analyst mishandles an alert that a senior would have resolved in minutes, the junior has usually not made an error; they followed the runbook correctly. The runbook was incomplete because it did not encode the environmental context that made the senior analyst's judgment possible. This is a documentation failure, not a skills failure.
// methodology
Extracting Tacit Heuristics: the Elicitation Process
The Critical Decision Method (CDM) is an interview technique from naturalistic decision-making research directly applicable to SOC knowledge extraction. CDM asks the expert to walk through a specific past case in retrospective detail, probing for cues noticed, interpretations formed, and decisions made, with attention to moments where the expert's path diverged from what a novice would have done.
A practical elicitation approach without formal CDM training: select five to ten recent alerts the senior analyst triaged quickly that a junior had previously escalated incorrectly or slowly. For each, ask the senior analyst to narrate their triage process aloud, pausing at each decision point to explain what they noticed and why it changed their assessment. The facilitator's job is to resist the answer "I just knew" and probe for the specific observable that triggered the recognition.
The contrast case technique: present the senior analyst with a confirmed malicious version and a confirmed benign version of the same alert type simultaneously. Ask: what is different between these two? What single field most changes your assessment? The differences the senior analyst identifies are the discriminating features their pattern library uses.
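One way to prepare a contrast case is to diff the two alerts field by field before the session, so the conversation starts from concrete differences. The following is a minimal sketch, assuming alerts are available as flat dictionaries; the function name and the field names are hypothetical.

```python
def differing_fields(malicious: dict, benign: dict) -> dict:
    """Map each field that differs to its (malicious, benign) value pair."""
    keys = malicious.keys() | benign.keys()
    return {
        k: (malicious.get(k), benign.get(k))
        for k in sorted(keys)
        if malicious.get(k) != benign.get(k)
    }


# Hypothetical alert records, for illustration only.
malicious = {"process": "powershell.exe", "parent": "winword.exe", "user": "jdoe"}
benign = {"process": "powershell.exe", "parent": "explorer.exe", "user": "jdoe"}

print(differing_fields(malicious, benign))
# {'parent': ('winword.exe', 'explorer.exe')}
```

The diff only narrows the candidates. Which difference actually drives the assessment is still the senior analyst's call, and that answer is what the session is meant to capture.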
Elicitation sessions work best as recorded conversations, not written exercises. Senior analysts articulate reasoning more naturally in speech. Transcribe and structure afterward. Don't ask them to produce structured heuristics directly, because that asks them to do the very cognitive work that makes articulation hard.
// worked examples
Six Explicit Heuristics: with Teach-As Framing
Each heuristic follows the same structure: observable condition → interpretive weight → decision guidance. The interpretive weight is what runbooks typically omit.
IF suspicious child AND parent is expected for that child → weight: low. IF suspicious child AND parent is unexpected → weight: high. The identity of the child matters less than whether the parent could plausibly have spawned it. cmd.exe from explorer.exe is unremarkable. cmd.exe from a PDF reader is not.
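The parent-child rule above could be encoded roughly as follows. This is a sketch under stated assumptions: the `EXPECTED_PARENTS` table and its entries are hypothetical placeholders, and a real table would have to come from the environment's own baseline.

```python
# Hypothetical baseline: which parents are expected for a given child process.
# These entries are illustrative assumptions, not a vetted allowlist.
EXPECTED_PARENTS = {
    "cmd.exe": {"explorer.exe"},
    "powershell.exe": {"explorer.exe"},
}


def parent_child_weight(parent: str, child: str) -> str:
    """Return 'low' if the parent is expected for this child, else 'high'.

    Assumes the child process has already been flagged as suspicious.
    """
    expected = EXPECTED_PARENTS.get(child.lower(), set())
    return "low" if parent.lower() in expected else "high"


print(parent_child_weight("explorer.exe", "cmd.exe"))   # low
print(parent_child_weight("AcroRd32.exe", "cmd.exe"))   # high
```

A child with no baseline entry is weighted high here, on the assumption that an unknown lineage deserves a look rather than a pass.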
"Before asking what the child is doing, ask why this parent would have created it."
IF alert matches known automation type AND execution is within the known automation window → weight: low, verify pattern match. IF same alert AND execution is outside the automation window → weight: elevated, treat as anomalous even if technically identical. The same process at 02:15 during the patch window and at 14:30 on a Wednesday are not the same event.
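A time-window check of this kind might look like the sketch below. The `AUTOMATION_WINDOWS` table, the `patch_job` alert type, and the 02:00 to 04:00 window are assumptions made for illustration, and the sketch does not handle windows that span midnight.

```python
from datetime import datetime, time

# Hypothetical automation windows in local time: alert type -> (start, end).
AUTOMATION_WINDOWS = {"patch_job": (time(2, 0), time(4, 0))}


def automation_weight(alert_type: str, executed_at: datetime) -> str:
    """Weight a known-automation alert by whether it ran inside its window."""
    window = AUTOMATION_WINDOWS.get(alert_type)
    if window is None:
        return "unknown"  # not a known automation type; this rule does not apply
    start, end = window
    in_window = start <= executed_at.time() < end
    return "low" if in_window else "elevated"


print(automation_weight("patch_job", datetime(2024, 1, 10, 2, 15)))   # low
print(automation_weight("patch_job", datetime(2024, 1, 10, 14, 30)))  # elevated
```

A "low" result still calls for verifying the pattern match, as the rule states; the window only says the timing is consistent with automation.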
"Would this activity make sense if a legitimate administrator were doing it right now? If not, why is it happening now?"
IF suspicious activity AND standard user → assess at face value. IF same activity AND privileged account → escalate one severity tier regardless of activity-level assessment. A standard user running a suspicious script may be a misconfigured application. The same script from a domain admin is a potential lateral movement or credential access event. The blast radius if malicious is not the same.
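The "escalate one severity tier" rule is simple enough to state in code. The four-tier ladder below is an assumed example; substitute whatever severity scale the SOC actually uses.

```python
# Assumed severity ladder, lowest to highest. Hypothetical tier names.
SEVERITY_TIERS = ["low", "medium", "high", "critical"]


def adjust_for_privilege(activity_severity: str, is_privileged: bool) -> str:
    """Bump severity one tier for privileged accounts, capped at the top tier."""
    idx = SEVERITY_TIERS.index(activity_severity)
    if is_privileged:
        idx = min(idx + 1, len(SEVERITY_TIERS) - 1)
    return SEVERITY_TIERS[idx]


print(adjust_for_privilege("medium", is_privileged=False))  # medium
print(adjust_for_privilege("medium", is_privileged=True))   # high
```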
"Who is doing this? Now ask yourself what an attacker could do from that account if this isn't what it looks like."
IF the alert type has fired frequently AND this instance matches the prior pattern → weight: low. IF the alert type has fired frequently AND this instance is the only one with a specific field value → weight: high. A hundred PowerShell alerts from a dozen hosts is alert fatigue. One PowerShell alert with a command line that appears in no other alert is a signal. Junior analysts see alert type. Senior analysts see population distribution.
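Seeing "population distribution" can be approximated by counting how often each field value appears across alerts of the same type. The sketch below assumes alerts are dictionaries with a `command_line` field; both the field name and the function name are hypothetical.

```python
from collections import Counter


def unique_field_alerts(alerts: list, field: str = "command_line") -> list:
    """Return the alerts whose value for `field` appears exactly once."""
    counts = Counter(a.get(field) for a in alerts)
    return [a for a in alerts if counts[a.get(field)] == 1]


# Hypothetical PowerShell alerts, for illustration only.
alerts = [
    {"host": "ws-01", "command_line": "powershell -File inventory.ps1"},
    {"host": "ws-02", "command_line": "powershell -File inventory.ps1"},
    {"host": "ws-03", "command_line": "powershell -File inventory.ps1"},
    {"host": "ws-04", "command_line": "powershell -enc <redacted>"},
]

for alert in unique_field_alerts(alerts):
    print(alert["host"])  # ws-04
```

Exact-match counting is deliberately crude. Command lines that differ only in a timestamp or path argument would all look unique, so a real implementation would likely need some normalization first.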
"Is this one different from all the others? If yes, why?"
IF an alert is individually low severity AND recent alert history for the host or user shows a reconnaissance → execution → persistence pattern → treat the sequence as high severity regardless of individual alert ratings. A failed login, followed by a successful login, followed by PowerShell execution, followed by a new scheduled task is a kill chain. Junior analysts triage each alert independently. Senior analysts maintain a running mental model of recent host and user activity.
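The running mental model described here can be approximated by checking whether recent alerts for one host or user contain the stages in order. This is a sketch only: the mapping from alert type to stage (`STAGE_OF`), the alert type names, and the two-hour lookback are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

STAGE_ORDER = ["reconnaissance", "execution", "persistence"]

# Hypothetical mapping from alert type to a coarse stage label.
# Unmapped alert types (e.g. a successful login) are ignored.
STAGE_OF = {
    "failed_login": "reconnaissance",
    "powershell_exec": "execution",
    "scheduled_task_created": "persistence",
}


def sequence_is_high_severity(events, now, lookback=timedelta(hours=2)):
    """True if the stages occur in order within the lookback window."""
    recent = sorted(
        (e for e in events if now - lookback <= e["time"] <= now),
        key=lambda e: e["time"],
    )
    next_stage = 0
    for event in recent:
        if STAGE_OF.get(event["type"]) == STAGE_ORDER[next_stage]:
            next_stage += 1
            if next_stage == len(STAGE_ORDER):
                return True
    return False


now = datetime(2024, 1, 10, 15, 0)
events = [
    {"type": "failed_login", "time": datetime(2024, 1, 10, 14, 0)},
    {"type": "successful_login", "time": datetime(2024, 1, 10, 14, 5)},
    {"type": "powershell_exec", "time": datetime(2024, 1, 10, 14, 10)},
    {"type": "scheduled_task_created", "time": datetime(2024, 1, 10, 14, 20)},
]
print(sequence_is_high_severity(events, now))  # True
```

The events passed in are assumed to be pre-filtered to a single host or user; mixing entities would produce sequences that never happened on any one of them.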
"What happened on this host or for this user in the last two hours? Does this alert make more sense as part of a sequence?"
IF you can construct a specific plausible benign explanation AND the evidence supports it → document and close. IF no plausible benign explanation survives contact with the evidence → escalate regardless of individual alert severity. Senior analysts constantly construct and test benign hypotheses: "could this be the backup job?", "is this user a developer who legitimately runs scripts?" When no benign explanation fits all the evidence, that absence is the signal.
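Benign-hypothesis testing can be sketched as a list of named predicates over the collected evidence: close only if some hypothesis fits, otherwise escalate. The hypothesis names, evidence fields, and account name below are hypothetical, and a real predicate would check far more than two fields.

```python
def disposition(evidence: dict, hypotheses: list) -> tuple:
    """Return ('close', name) for the first hypothesis that fits, else escalate."""
    for name, fits in hypotheses:
        if fits(evidence):
            return ("close", name)
    return ("escalate", None)


def backup_job(evidence: dict) -> bool:
    # Assumed evidence fields; both must hold for the story to fit.
    return (
        evidence.get("user") == "svc_backup"
        and evidence.get("within_backup_window") is True
    )


def developer_script(evidence: dict) -> bool:
    return (
        evidence.get("user_role") == "developer"
        and evidence.get("script_signed") is True
    )


HYPOTHESES = [
    ("scheduled backup job", backup_job),
    ("developer running a signed script", developer_script),
]

print(disposition({"user": "svc_backup", "within_backup_window": True}, HYPOTHESES))
# ('close', 'scheduled backup job')
print(disposition({"user": "svc_backup", "within_backup_window": False}, HYPOTHESES))
# ('escalate', None)
```

Returning the name of the hypothesis that fit gives the analyst something concrete to document when closing, which matches the "document and close" step in the rule.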
"Tell me the benign story for this alert. If you can't tell one that fits all the evidence, escalate."
// transfer mechanisms
Getting Heuristics into the Analyst Workflow
Individual heuristics are useful but scattershot. The output of elicitation should be structured into alert-category-specific triage decision trees: lightweight reasoning scaffolds that apply heuristics in the order that most efficiently resolves ambiguity for a given alert type.
The false positive catalog deserves specific attention. It is the most direct encoding of environmental prior knowledge and arguably the highest-ROI knowledge transfer artifact a SOC team can produce. Format: alert type → specific condition combination → known benign explanation → confirmation step that verifies the benign explanation applies to this instance. A catalog of 200 well-documented FP patterns is likely to eliminate more unnecessary junior analyst investigation time than any other single investment.
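The four-part catalog format maps naturally onto a small record type. The sketch below is one possible shape, not a prescribed schema: the class name, field names, and the sample entry are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FalsePositiveEntry:
    alert_type: str
    conditions: dict          # field name -> value that must match
    benign_explanation: str
    confirmation_step: str    # what to verify before closing this instance


def matching_entries(alert: dict, catalog: list) -> list:
    """Return catalog entries whose alert type and conditions all match."""
    return [
        entry
        for entry in catalog
        if entry.alert_type == alert.get("alert_type")
        and all(alert.get(k) == v for k, v in entry.conditions.items())
    ]


# Hypothetical catalog entry, for illustration only.
CATALOG = [
    FalsePositiveEntry(
        alert_type="suspicious_powershell",
        conditions={"user": "svc_patch", "parent": "patch_agent.exe"},
        benign_explanation="Patch agent runs PowerShell during deployments.",
        confirmation_step="Confirm a deployment was scheduled for this host.",
    )
]

alert = {
    "alert_type": "suspicious_powershell",
    "user": "svc_patch",
    "parent": "patch_agent.exe",
}
for entry in matching_entries(alert, CATALOG):
    print(entry.confirmation_step)
```

A match surfaces the confirmation step rather than closing the alert automatically, since the catalog entry only applies once that step has been verified for the instance at hand.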
// conclusion
The Senior Analyst's Real Leverage Is What They Write Down
A senior analyst who triages five hundred alerts a week is generating individual value. A senior analyst who spends two hours extracting and documenting the heuristics behind those decisions is generating institutional value that scales across every junior analyst on the team indefinitely.
The closing question: if the two most experienced analysts on your team left tomorrow, how long before your junior triage quality reached current levels again? If the answer is "years", the knowledge transfer problem is also a business continuity problem, and it has been sitting in plain sight the whole time.
Gary Klein, "Sources of Power: How People Make Decisions" (1998), the foundational RPD model; Klein et al., "The Critical Decision Method" for the elicitation protocol; SANS analyst development resources on structured triage methodology.