Below is a structured, operational method for identifying outliers in the specified transaction log. It targets three patterns: (1) large burst inflows/outflows, (2) short-term high-frequency small-amount activity, and (3) negative balance jumps. The approach works per account, stratified by txn_type, and uses rolling time windows with robust statistics to control false positives.
1. Data Preparation
- Partition: Group data by account_id, then stratify within each account by txn_type.
- Ordering and de-duplication: Sort by timestamp; drop duplicate transaction rows; if the source system guarantees consistency, verify that the balance column agrees with the ordered transactions.
- Direction coding: Map txn_amount_signed = +amount for inflows, −amount for outflows (based on txn_type mapping defined with business).
- Minute-level series: For each account_id × txn_type, resample to 1-minute bins:
- Features per minute m:
- count_m = number of transactions
- sum_amount_m = sum(txn_amount_signed)
- max_amount_m = max(|txn_amount|)
- median_amount_m = median(txn_amount)
- count_small_m = number of transactions with txn_amount ≤ T_small (defined below)
- For account-level balance series (not stratified), resample balance to minute and compute delta_balance_m = balance_m − balance_{m−1}.
- Seasonality optional: If strong intraday/weekday effects, compute baselines per hour-of-day and weekday; otherwise use unconditional baselines (see below).
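The minute-level resampling above can be sketched with pandas. This is a minimal illustration, assuming one account_id × txn_type slice arrives as a DataFrame with `timestamp`, `txn_amount`, and `txn_amount_signed` columns; the function name `minute_features` is illustrative:

```python
import pandas as pd

def minute_features(df: pd.DataFrame, t_small: float) -> pd.DataFrame:
    """Resample one account_id x txn_type slice to 1-minute bins.

    Expects columns: timestamp (datetime64), txn_amount, txn_amount_signed.
    t_small is the small-amount threshold T_small (defined per account x type).
    """
    grouped = df.sort_values("timestamp").groupby(
        pd.Grouper(key="timestamp", freq="1min")
    )
    out = grouped.agg(
        count_m=("txn_amount", "size"),
        sum_amount_m=("txn_amount_signed", "sum"),
        median_amount_m=("txn_amount", "median"),
        max_amount_m=("txn_amount", lambda s: s.abs().max()),
    )
    # Count of small-amount transactions per minute bin.
    out["count_small_m"] = grouped["txn_amount"].apply(lambda s: (s <= t_small).sum())
    return out
```

Hour-of-day/weekday baselines, if needed, can be layered on top by grouping this output on `index.hour` and `index.dayofweek`.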
2. Robust Baselines per Account × txn_type
Compute robust statistics over a trailing historical window (e.g., last 30–60 days per account × txn_type):
- Robust scale:
- MAD(x) = median(|x − median(x)|)
- sigma_hat = 1.4826 × MAD(x) (normal-consistent)
- Baselines:
- amount_baseline: median(txn_amount) and sigma_hat_amount
- count_baseline: median(count_m) and sigma_hat_count
- sum_baseline: median(sum_amount_m) and sigma_hat_sum
- Small-amount threshold T_small:
- T_small = quantile_25(txn_amount) per account × txn_type (adjust to quantile_20–30 depending on distribution)
- For balance:
- delta_balance_baseline_neg: robust stats of negative deltas only (e.g., median of negatives and sigma_hat_neg from negative deltas).
Fallbacks:
- If insufficient history for an account × txn_type (e.g., <200 minutes or <50 txns), fall back to global baselines computed across similar cohorts (e.g., same product/segment).
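The MAD-based scale and the T_small quantile can be computed with the standard library alone. A minimal sketch (function names are illustrative; the cohort/global fallback is left to the caller):

```python
import statistics

def robust_scale(values):
    """Return (median, sigma_hat) where sigma_hat = 1.4826 * MAD (normal-consistent)."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    return med, 1.4826 * mad

def small_amount_threshold(amounts):
    """T_small = 25th percentile of txn_amount (tune toward q20-q30 as needed)."""
    # statistics.quantiles with n=4 returns [q25, q50, q75] (exclusive method).
    return statistics.quantiles(amounts, n=4)[0]
```

With fewer observations than the history floor (e.g., <50 txns), skip these and substitute the cohort/global baseline as described above.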
3. Rolling Windows
Compute rolling features at multiple horizons per account × txn_type (non-overlapping or sliding):
- Windows: W ∈ {5, 15, 60} minutes
- For each window ending at time t:
- roll_sum_W(t) = sum of txn_amount_signed in window
- roll_count_W(t) = total transactions in window
- roll_max_amt_W(t) = max(|txn_amount|)
- roll_count_small_W(t) = count of txns ≤ T_small
- roll_median_amt_W(t) = median(txn_amount)
For balance (account-level):
- roll_min_delta_bal_W(t) = min(delta_balance_m) within window
- delta_balance_m computed per minute across all txn_type activity combined.
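The window features can be derived from the minute-level series with pandas time-based rolling. A minimal sketch, assuming a DataFrame indexed by minute timestamps with the per-minute columns defined in section 1 (`rolling_features` is an illustrative name):

```python
import pandas as pd

def rolling_features(minute_df: pd.DataFrame, window: str = "5min") -> pd.DataFrame:
    """Sliding-window features over a 1-minute series (DatetimeIndex required).

    minute_df needs: sum_amount_m, count_m, max_amount_m, count_small_m,
    median_amount_m. Window W corresponds to e.g. "5min", "15min", "60min".
    """
    r = minute_df.rolling(window)
    return pd.DataFrame({
        "roll_sum_W": r["sum_amount_m"].sum(),
        "roll_count_W": r["count_m"].sum(),
        "roll_max_amt_W": r["max_amount_m"].max(),
        "roll_count_small_W": r["count_small_m"].sum(),
        # Median of per-minute medians: an approximation; recompute from raw
        # transactions in the window if an exact roll_median_amt_W is needed.
        "roll_median_amt_W": r["median_amount_m"].median(),
    })
```

The account-level `roll_min_delta_bal_W` follows the same pattern with `r["delta_balance_m"].min()` on the combined balance series.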
4. Detection Rules
All rules use robust z-scores or tail quantiles, applied per account × txn_type (except balance drop which is account-level). Calibrate thresholds to control false positive rate (FPR) to a target (e.g., 0.1–0.5% per day per account).
4.1 Large Burst Inflow/Outflow
- Single-transaction spike:
- z_max_amt = (roll_max_amt_W − median(|txn_amount|)) / sigma_hat_amount_abs
- median and sigma_hat_amount_abs both computed from |txn_amount|
- Flag if z_max_amt ≥ Z1 (e.g., Z1 = 5)
- Window sum spike (directional):
- z_roll_sum = (roll_sum_W − median(sum_amount_m)) / sigma_hat_sum
- Flag inflow if z_roll_sum ≥ Z2_pos; outflow if z_roll_sum ≤ −Z2_neg (e.g., Z2_pos = 5, Z2_neg = 5)
- Optional: Extreme tail via EVT
- Fit GPD to upper tail of |txn_amount| and |roll_sum_W|; set dynamic thresholds using target exceedance probability p* (e.g., 0.001).
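The two z-score checks of rule 4.1 reduce to a few lines; the EVT/GPD thresholding is omitted here. A minimal sketch with illustrative names, using the default thresholds above:

```python
def burst_flags(roll_max_amt, roll_sum, amount_median_abs, sigma_amount_abs,
                sum_median, sigma_sum, z1=5.0, z2_pos=5.0, z2_neg=5.0):
    """Rule 4.1: robust z-scores for single-txn spikes and directional window sums.

    Z1/Z2 defaults are starting points; calibrate per account to the target FPR.
    """
    eps = 1e-9  # guard against zero MAD on near-constant histories
    z_max_amt = (roll_max_amt - amount_median_abs) / max(sigma_amount_abs, eps)
    z_roll_sum = (roll_sum - sum_median) / max(sigma_sum, eps)
    return {
        "z_max_amt": z_max_amt,
        "z_roll_sum": z_roll_sum,
        "spike": z_max_amt >= z1,
        "burst_inflow": z_roll_sum >= z2_pos,
        "burst_outflow": z_roll_sum <= -z2_neg,
    }
```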
4.2 Short-Term High-Frequency Small Amounts
- Count spike conditional on small amounts:
- z_count = (roll_count_W − median(count_m)) / sigma_hat_count
- small_ratio = roll_count_small_W / max(roll_count_W, 1)
- median_small_check = roll_median_amt_W ≤ T_small
- Flag if z_count ≥ Z3 and small_ratio ≥ R_small and median_small_check is true
- Example: Z3 = 4, R_small ≥ 0.7
- Alternative statistical test (if seasonality modeled):
- Estimate λ_W (expected count per window) via robust EWMA or per time-of-day quantiles.
- Use one-sided Poisson tail test: p = 1 − CDF_Poisson(roll_count_W − 1; λ_W)
- Flag if p ≤ α (e.g., α = 0.001), with small_ratio and median_small_check filters.
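The Poisson tail test plus the small-amount filters can be sketched with the standard library (function names are illustrative; λ_W estimation via EWMA is assumed to happen upstream):

```python
import math

def poisson_tail_p(k: int, lam: float) -> float:
    """One-sided upper-tail p-value: P(N >= k) for N ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    # 1 - CDF(k-1), computed by summing the pmf from 0 to k-1.
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)

def high_freq_small_flag(roll_count, roll_count_small, roll_median_amt,
                         lam, t_small, alpha=0.001, r_small=0.7):
    """Rule 4.2 (Poisson variant): count spike AND mostly-small AND small-median."""
    small_ratio = roll_count_small / max(roll_count, 1)
    p = poisson_tail_p(roll_count, lam)
    return p <= alpha and small_ratio >= r_small and roll_median_amt <= t_small
```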
4.3 Balance Negative Jump
- Per-minute drop:
- z_delta_neg = (delta_balance_m − median_neg) / sigma_hat_neg, using only negative deltas for baseline
- Flag minute m if delta_balance_m ≤ Q_low (e.g., ≤ quantile_0.1_neg − k × sigma_hat_neg) or z_delta_neg ≤ −Z4
- Windowed drop:
- roll_min_delta_bal_W ≤ T_drop_W where T_drop_W set from historical lower-tail quantiles (e.g., 0.1% quantile across negative deltas)
- Consistency check (optional, increases precision):
- Compare |roll_sum_outflow_W| (the absolute sum of signed outflow amounts in the window) to |roll_min_delta_bal_W|.
- If |roll_min_delta_bal_W| >> |roll_sum_outflow_W| + M (mismatch margin), flag as “unexplained balance drop”.
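Rule 4.3 including the optional consistency check fits in a small function. A minimal sketch with illustrative names:

```python
def balance_drop_flags(delta_balance_m, median_neg, sigma_neg,
                       roll_min_delta_bal, roll_sum_outflow,
                       z4=5.0, t_drop=None, mismatch_margin=0.0):
    """Rule 4.3: per-minute negative-jump z-score plus the consistency check.

    median_neg / sigma_neg come from the negative-delta baseline;
    t_drop is the optional historical lower-tail quantile threshold T_drop_W.
    """
    eps = 1e-9
    z_delta_neg = (delta_balance_m - median_neg) / max(sigma_neg, eps)
    drop = z_delta_neg <= -z4 or (t_drop is not None and delta_balance_m <= t_drop)
    # "Unexplained" when the balance fell far more than observed outflows explain.
    unexplained = abs(roll_min_delta_bal) > abs(roll_sum_outflow) + mismatch_margin
    return {"z_delta_neg": z_delta_neg, "drop": drop, "unexplained": unexplained}
```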
5. Stratification and Aggregation
- Apply 4.1 and 4.2 separately per txn_type; compute separate baselines.
- Balance negative jump (4.3) applied at account level (all types combined).
- Anomaly score per window:
- s_burst = max(0, z_max_amt/Z1, |z_roll_sum|/Z2)
- s_freq_small = indicator(z_count ≥ Z3) × small_ratio
- s_balance_drop = max(0, |z_delta_neg|/Z4, exceedance_of_T_drop_W)
- Composite score:
- s_total = 1 − Π_k (1 − s_k_norm), where s_k_norm ∈ [0,1] normalized by clipping/transform
- Flag if s_total ≥ S_thresh (e.g., S_thresh = 0.6), or any rule individually triggers a hard threshold.
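The composite score s_total = 1 − Π_k (1 − s_k) with clipping can be sketched as follows (names are illustrative):

```python
def composite_score(component_scores, s_thresh=0.6):
    """s_total = 1 - prod(1 - s_k) over component scores clipped to [0, 1]."""
    prod = 1.0
    for s in component_scores:
        s = min(max(s, 0.0), 1.0)  # normalize by clipping
        prod *= 1.0 - s
    s_total = 1.0 - prod
    return s_total, s_total >= s_thresh
```

Hard single-rule triggers (e.g., an EVT exceedance) should still flag regardless of s_total, as noted above.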
6. Threshold Calibration and Controls
- Per-account calibration:
- Use historical non-flagged data to set Z1–Z4 to achieve desired FPR. Start with robust z-thresholds (4–6) and adjust.
- Global controls:
- Cap maximum alerts per account per day to prevent flood.
- Require persistence: anomaly must hold in ≥2 consecutive windows for certain types (e.g., frequency-small) to reduce noise.
- Drift monitoring:
- Recompute baselines weekly; monitor median and MAD stability.
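The persistence requirement (anomaly must hold in ≥2 consecutive windows) is a simple run-length filter. A minimal sketch:

```python
def persistent(flags, min_consecutive=2):
    """Keep a flag only once it has held for min_consecutive consecutive windows."""
    run = 0
    out = []
    for f in flags:
        run = run + 1 if f else 0  # length of current run of raw flags
        out.append(run >= min_consecutive)
    return out
```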
7. Output and Review Workflow
- Emit anomaly records with:
- account_id, timestamp_start, timestamp_end, txn_type (if applicable)
- anomaly_type ∈ {BurstInflow, BurstOutflow, HighFreqSmall, BalanceDrop, BalanceDropUnexplained}
- features: roll_sum_W, roll_count_W, roll_max_amt_W, small_ratio, delta_balance_m/roll_min_delta_bal_W
- z-scores and thresholds crossed
- s_total and component scores
- Severity tiers:
- Critical: any EVT tail exceedance, z ≥ 7, or BalanceDropUnexplained
- High: z ∈ [5, 7) or repeated windows
- Medium: z ∈ [4, 5) with corroborating secondary signals
- Mark flagged windows for manual review; optional auto-escalation for Critical.
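The output record and severity tiering can be sketched as below. Field and tier names follow the schema above; the corroborating-signal check is simplified to a boolean, and the example txn_type value is hypothetical:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class AnomalyRecord:
    account_id: str
    timestamp_start: str
    timestamp_end: str
    anomaly_type: str  # BurstInflow, BurstOutflow, HighFreqSmall, BalanceDrop, ...
    txn_type: Optional[str] = None  # balance-drop records are account-level
    features: dict = field(default_factory=dict)  # roll_sum_W, small_ratio, ...
    s_total: float = 0.0

def severity_tier(max_z, evt_exceeded=False, unexplained_drop=False,
                  repeated=False, corroborated=False):
    """Tiering per the rules above; returns 'Critical'/'High'/'Medium' or None."""
    if evt_exceeded or unexplained_drop or max_z >= 7:
        return "Critical"
    if max_z >= 5 or repeated:
        return "High"
    if max_z >= 4 and corroborated:
        return "Medium"
    return None
```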
8. Implementation Sketch (Python-like)
- Preprocessing:
- df = read_logs()
- df = sort by account_id, timestamp; drop duplicates
- map txn_type -> sign; compute txn_amount_signed
- Per account × txn_type:
- resample to 1-min; compute minute features
- compute robust baselines (median, MAD → sigma_hat)
- for each window W in {5,15,60}:
- compute rolling features (sum, count, max, count_small, median)
- compute z-scores; apply rules 4.1, 4.2
- Account-level:
- resample balance; compute delta_balance_m; apply rule 4.3
- Combine component scores; flag; write anomalies.
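The preprocessing steps in the sketch can be made concrete in pandas. The txn_type values in the sign map are hypothetical stand-ins for the business-defined mapping:

```python
import pandas as pd

# Illustrative direction coding; the real mapping is agreed with the business.
SIGN = {"deposit": 1, "transfer_in": 1, "withdrawal": -1, "transfer_out": -1}

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Dedupe, order, and direction-code the raw log (steps 1-3 of the sketch)."""
    out = df.drop_duplicates().sort_values(["account_id", "timestamp"]).copy()
    out["txn_amount_signed"] = out["txn_amount"] * out["txn_type"].map(SIGN)
    return out
```

Downstream, `groupby(["account_id", "txn_type"])` feeds the per-slice resampling, baselines, and rules, while a per-account pass over the balance column drives rule 4.3.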
9. Considerations and Edge Cases
- Sparse accounts: use cohort/global baselines; widen thresholds to reduce false positives.
- Data quality: if balance is missing or out-of-sync, rely on transaction-based rules only; log data-quality flags.
- Clock skew: ensure timestamps are aligned to a consistent timezone; handle late-arriving data by incremental re-evaluation of windows.
This method is robust and scalable, and it controls false positives through per-account, per-type baselines and rolling-window statistics. It directly targets the three specified patterns and produces explainable anomaly flags suitable for downstream review.