Below is a distribution-focused description of app behavior logs covering DAU, time on app, feature reach, funnels, and user segmentation (novice/heavy), with explicit version and date dimensions. It is meant to guide analysis and modeling; validate these distributional forms on your own data with diagnostics before adopting them.
Scope and unit of analysis
- Dimensions: date (daily granularity) and app version (current app binary/feature flag state).
- Levels: user-level events and session-level metrics; daily cohorts aggregated by version.
- Recommended reporting: for each date×version cohort, compute robust summaries (median, IQR, p90–p99) and distribution diagnostics.
DAU (Daily Active Users)
- Aggregated daily counts (by date×version) typically show:
  - An overdispersed count distribution across days, with weekly seasonality and release-driven structural breaks.
  - Negative Binomial fitting day-level counts better than Poisson, because variance exceeds the mean (burstiness from campaigns and releases).
  - Distribution shifts across versions (level and variance changes); check for changepoints at release dates.
- User-level activity (binary active-per-day indicator):
  - Within a date×version cohort, per-user activity probabilities are heterogeneous; Beta-Binomial models capture the resulting overdispersion.
- Practical summaries:
  - Daily time series: mean, variance, overdispersion index (variance/mean), day-of-week effects, holiday effects.
  - Change detection: CUSUM or Bayesian changepoint detection around version rollouts.
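The two diagnostics above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a production changepoint detector: the DAU series and the day-7 level shift are invented for the example, and the CUSUM uses the series mean as its reference.

```python
from statistics import mean, variance

def overdispersion_index(counts):
    """Variance-to-mean ratio; values > 1 favor Negative Binomial over Poisson."""
    return variance(counts) / mean(counts)

def cusum(counts, slack=0.0):
    """One-sided CUSUM of deviations above the series mean.

    A sustained rise in the statistic after a release date is evidence
    of an upward level shift in DAU.
    """
    m = mean(counts)
    s, out = 0.0, []
    for x in counts:
        s = max(0.0, s + (x - m - slack))
        out.append(s)
    return out

# Illustrative DAU series with a level shift after a hypothetical day-7 release.
dau = [100, 104, 98, 102, 99, 103, 101, 130, 128, 133, 131, 129, 132, 134]
stat = cusum(dau)
print(round(overdispersion_index(dau), 1))  # → 2.1 (> 1: overdispersed)
print(stat.index(max(stat)))                # → 13 (statistic climbs after the shift)
```

In practice the CUSUM reference mean should come from a pre-release window, and the slack parameter trades sensitivity against false alarms.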
Time on app (session dwell time and per-user daily time)
- Per session:
  - Strong right skew with heavy tails; log-normal or Gamma distributions typically fit. Some apps exhibit Weibull behavior when the session-ending hazard changes over time.
  - Possible multimodality (short utility sessions vs. long consumption sessions); a two-component mixture (e.g., of log-normals) often improves fit.
  - Truncation/censoring: SDK inactivity timeouts cap session length; treat long-running background sessions carefully.
- Per user per day (sum of session durations):
  - Compounded right skew; a log1p transform tends to normalize it. Prefer robust statistics (median, p90, p99) over the mean because of extreme tails.
  - Expect higher variance among heavy users and in versions that enable autoplay/streaming.
- Practical summaries:
  - Session level: median, IQR, p90, p99; fit log-normal/Gamma; QQ plots on the log scale; KS/AD tests.
  - Winsorize or trim at the 99.9th percentile for reporting; track extreme usage separately.
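The robust-summary and winsorizing steps can be sketched as follows. The synthetic log-normal session durations and the 36,000-second outliers are illustrative stand-ins for real dwell-time logs.

```python
import math
import random
from statistics import median, quantiles

random.seed(0)
# Synthetic session durations (seconds): a log-normal body plus a few extreme tails.
sessions = [math.exp(random.gauss(3.0, 1.0)) for _ in range(5000)] + [36000.0] * 5

def pct(xs, p):
    """Integer p-th percentile via statistics.quantiles (cut points 1..99)."""
    return quantiles(xs, n=100)[p - 1]

def winsorize(xs, p=99.9):
    """Cap values at the p-th percentile for reporting; track extremes separately."""
    cap = quantiles(xs, n=1000)[int(round(p * 10)) - 1]
    return [min(x, cap) for x in xs]

w = winsorize(sessions)
print(median(sessions), pct(sessions, 90), pct(sessions, 99))
print(max(w) < max(sessions))  # the extreme sessions were capped
```

For reporting, publish the robust quantiles from the raw data and the winsorized series side by side, so the capped tail is visible rather than silently discarded.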
Feature reach (feature touch/trigger rates)
- Binary reach (user triggered the feature at least once per day):
  - Within date×version, each user's daily reach indicator is 0/1; model the aggregate rate via a Beta distribution (for proportions) or Beta-Binomial (for counts).
  - Rare features show zero inflation; hierarchical logistic regression with random effects (version, date, user segment) handles sparse data.
- Frequency of usage (counts per user):
  - Zero-inflated count distribution; a zero-inflated Negative Binomial typically fits better than plain Poisson.
- Practical summaries:
  - Report reach as a mean proportion with Wilson or Bayesian intervals; report p50–p95 across users for usage counts.
  - Compare versions via uplift in reach and frequency; adjust for segment mix to avoid Simpson's paradox.
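The Wilson interval mentioned above is straightforward to compute directly; a minimal sketch, with an invented cohort of 1,000 users and 42 feature touches:

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (95% by default).

    More reliable than the normal approximation for small n or extreme
    rates, which is the common regime for rare-feature reach.
    """
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (center - half, center + half)

# Illustrative: 42 of 1,000 users in a date×version cohort touched the feature.
lo, hi = wilson_interval(42, 1000)
print(round(lo, 3), round(hi, 3))
```

Unlike the naive ±1.96·√(p(1−p)/n) interval, the Wilson interval never extends below 0 or above 1, which matters for rare features.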
Funnel metrics (multi-step flows, e.g., View → Click → Start → Complete)
- Each step’s conversion is a conditional probability; distributions across cohorts are Beta-like with overdispersion.
- Multiplicative structure yields heavy drop-off; earlier steps have higher variance; later steps often sparse.
- Correlation across steps: users who pass early steps are more likely to pass later ones; hierarchical models (multilevel logistic regression) capture this.
- Practical summaries:
  - For each date×version: per-step conversion rates with credible intervals; funnel-completion distribution; variance decomposition by step.
  - Small cohorts: use Bayesian shrinkage (Beta priors) to stabilize estimates. Evaluate end-to-end conversion and per-step lifts across versions.
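The Beta-prior shrinkage above amounts to adding pseudo-counts to each step's numerator and denominator. A minimal sketch with hypothetical View → Click → Start → Complete counts for one date×version cohort:

```python
def shrunk_rate(successes, n, alpha=1.0, beta=1.0):
    """Posterior mean under a Beta(alpha, beta) prior; stabilizes small cohorts."""
    return (successes + alpha) / (n + alpha + beta)

# Hypothetical per-step counts for one date×version cohort.
steps = [("view", 10000), ("click", 1200), ("start", 300), ("complete", 90)]

# Each step's conversion is conditional on reaching the previous step.
for (a, n), (b, k) in zip(steps, steps[1:]):
    print(f"{a}→{b}: {shrunk_rate(k, n):.4f}")

# End-to-end conversion (raw): completes / views.
print(round(steps[-1][1] / steps[0][1], 4))  # → 0.009
```

With a uniform Beta(1, 1) prior the shrinkage is negligible at these cohort sizes; its effect becomes visible exactly where it is needed, in cohorts with tens of users rather than thousands.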
User segmentation (novice vs heavy)
- Example operational definitions (tune to your context):
  - Novice: install age ≤ 7 days or cumulative active days ≤ 3; low session counts and short dwell times.
  - Heavy: top decile/quartile of weekly sessions or daily dwell time; or RFM-based thresholds.
- Distributional differences:
  - Novice: shorter sessions, lower feature reach, higher day-to-day variance (learning phase). Dwell time is often unimodal with a short-session peak.
  - Heavy: pronounced heavy tails in dwell time and usage counts; higher feature reach; more stable activity patterns but larger outliers.
- Modeling:
  - Mixture models (e.g., a two-component log-normal for dwell time) reflect segment heterogeneity.
  - Include segment as a random or fixed effect when comparing versions to avoid compositional bias.
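The operational definitions above can be made concrete as a small labeling rule. Everything here is hypothetical and should be tuned to your product: the decile cut, the 7-day install-age rule, and the "regular" fallback label are all assumptions of this sketch.

```python
from statistics import quantiles

def heavy_threshold(weekly_sessions, n=10):
    """Session count at the top-decile boundary (90th percentile cut point)."""
    return quantiles(weekly_sessions, n=n)[-1]

def label(install_age_days, weekly_sessions, threshold):
    # Hypothetical rule combining the definitions above; tune to your context.
    if install_age_days <= 7:
        return "novice"
    return "heavy" if weekly_sessions >= threshold else "regular"

# Toy weekly-session counts across a cohort.
counts = [1, 2, 2, 3, 3, 4, 5, 7, 12, 40]
t = heavy_threshold(counts)
print(t)  # → 37.2 with this toy data
print(label(3, 20, t), label(120, 50, t), label(120, 2, t))
```

Whatever thresholds you choose, recompute them per date×version cohort or freeze them globally on purpose; mixing the two silently changes segment composition across versions.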
Version and date dimensions
- Temporal patterns:
  - Weekly seasonality (weekday/weekend), monthly cycles, campaign spikes, holidays.
  - Structural breaks at releases; evaluate pre/post windows to attribute shifts.
- Version effects:
  - Treat version as a categorical factor or a hierarchical level; account for rollout fraction in staged releases.
  - Cohort by install date to separate composition changes from version treatment effects.
- Recommended modeling:
  - Panel models with date fixed effects and version random/fixed effects for DAU and reach.
  - Time series with regressors: ARIMA/ETS or BSTS with release dummies for DAU.
  - Distributional comparisons: eCDFs, KS/AD tests, and quantile regression to detect shifts beyond the mean.
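The eCDF comparison reduces to the two-sample Kolmogorov–Smirnov statistic: the largest vertical gap between the two empirical CDFs. A minimal sketch with invented dwell times for two versions (in production, use a tested implementation that also supplies the p-value):

```python
def ecdf(xs):
    """Empirical CDF as a callable: fraction of observations <= v."""
    data, n = sorted(xs), len(xs)
    return lambda v: sum(1 for x in data if x <= v) / n

def ks_distance(a, b):
    """Two-sample KS statistic: max gap between the two empirical CDFs.

    The maximum gap can only occur at an observed value, so checking
    the union of sample points is sufficient.
    """
    fa, fb = ecdf(a), ecdf(b)
    return max(abs(fa(v) - fb(v)) for v in sorted(set(a) | set(b)))

# Illustrative dwell times (minutes) for two versions; v2 is shifted upward.
v1 = [1, 2, 2, 3, 4, 5, 6, 8, 10, 15]
v2 = [2, 3, 4, 5, 6, 8, 10, 12, 15, 20]
print(ks_distance(v1, v2))  # → 0.2
```

A mean comparison could miss this kind of shift entirely if the tails move in opposite directions; the eCDF gap, and quantile regression at several quantiles, will not.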
Diagnostics and data quality
- Sessionization rules: confirm inactivity timeout and foreground/background definitions.
- Bot/fraud filters: identify anomalous ultra-high session counts or durations.
- Missing/late events: quantify skew from delayed logging; use lag-adjusted windows for funnels.
- Metric stability: track p95–p99 volatility; apply robust methods (median, MAD) for alerting.
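The robust alerting idea in the last point can be sketched as a median/MAD z-score; the p99-latency-style history below is fabricated for illustration, and the alert cut of 3–5 is a common convention, not a universal rule.

```python
from statistics import median

def mad_zscore(series, value):
    """Robust z-score: (value - median) / (1.4826 * MAD).

    The 1.4826 factor scales the MAD to match the standard deviation
    under normality; unlike mean/stddev, the median and MAD are not
    inflated by the tail outliers they are meant to catch.
    """
    m = median(series)
    mad = median(abs(x - m) for x in series)
    return (value - m) / (1.4826 * mad)

# Illustrative history of a daily p99 metric; today's value spikes to 160.
p99_history = [120, 118, 125, 122, 119, 121, 124, 123, 120, 122]
print(round(mad_zscore(p99_history, 160), 1))  # → 17.3, far beyond a 3–5 cut
```

Guard against a zero MAD (a perfectly flat history) before dividing in real alerting code.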
Reporting checklist per date×version
- DAU: count, variance/mean ratio, weekday effect, changepoints.
- Time on app: session-level and user-day-level median/IQR/p90/p99; fitted distribution family and goodness-of-fit.
- Feature reach: proportion with intervals; zero-inflation indicators; usage count distribution.
- Funnel: per-step conversion with intervals; end-to-end conversion; stepwise lifts vs previous version.
- Segments: split all above by novice/heavy; compare distributional shifts; note composition shares.
Key takeaways
- Expect overdispersed counts (Negative Binomial) for DAU and zero-inflated counts for feature usage.
- Expect right-skew and heavy tails for dwell time; log-normal/Gamma fits are standard, often with mixtures across user segments.
- Use Beta/Beta-Binomial families for reach and funnel conversion, with hierarchical modeling to stabilize small cohorts.
- Always stratify by version and date, and control for segment composition to avoid misleading lifts.