Popular roles are more than a source of inspiration; they are your productivity assistant. With carefully curated role prompts you can quickly generate high-quality content, boost your creative inspiration, and find the solution that best fits your needs. Make creation easier and value more direct!
We continuously update the role library for different user needs, so you can always find the right entry point for inspiration.
Generates professional data-profiling analysis results for a given dataset.
Below is a data-profiling results template and evaluation method for the "New User Conversion Data" dataset. Because no actual data or field structure was provided, this report is built on a standardized data-quality framework and reusable calculation logic; result items are placeholders, and numeric conclusions can be generated as soon as real data is connected.
1. Analysis scope and objectives
2. Fields and structure (reference standard; the actual structure follows your system)
3. Metric definitions and calculation methods
4. Data-profiling results (template, to be filled in)
5. Data-quality assessment and risk grading (rule checklist)
6. Issue localization and remediation recommendations
7. Monitoring and auditing (sample SQL / rule expressions)
8. Deliverables and next steps
Note: this report is a standard template for data-quality profiling, designed so that once real data is connected it can be executed directly to produce accurate results. Rules and thresholds can be quickly customized for your specific data and business definitions.
Order Fact Table – Data Profiling Analysis Results (Template + Illustrative Example)
Scope and grain
Table overview (replace illustrative values with actuals)
Column-level profiling (core fields)
Cross-field consistency checks
Calculation coherence (define per business rule)
Status-to-amount coherence
Temporal coherence
Referential integrity (to dimensions)
Outliers and anomaly signals
Data quality risks identified (examples)
Recommended cleansing and validation rules
Monitoring KPIs and thresholds (set alerts)
Notes on interpretation
If you provide a schema sample and row extracts, I can replace the illustrative figures with precise metrics and produce a finalized profiling report.
Below is the data-profiling results design and generation plan for the "Core Metrics Monitoring Data" dataset. Because no concrete dataset was provided, this output assumes a generic core-metrics monitoring data model and provides reusable profiling metrics, calculation methods, and an output structure. After confirming the fields and business rules, run the corresponding calculations to produce numeric results.
1. Objectives and scope
2. Data-model assumptions (to be confirmed). Core table: core_metrics
3. Profiling metrics and result structure. Results are presented as several result tables or one summary table; the suggested structure is as follows:
4. Calculation methods and sample SQL (standard SQL; adjust to your database dialect). The time window is passed in as parameters: :start_date, :end_date, :sla_hours, :ref_start_date, :ref_end_date
Total record and metric counts:

SELECT COUNT(*) AS record_count,
       COUNT(DISTINCT metric_id) AS distinct_metric_count
FROM core_metrics
WHERE as_of_date BETWEEN :start_date AND :end_date;
Date coverage rate (ideally derived from a calendar table or the expected frequency; this example approximates it by counting the dates that have records):

SELECT COUNT(DISTINCT as_of_date)::float
       / NULLIF(DATE_PART('day', :end_date - :start_date) + 1, 0) AS date_coverage_rate
FROM core_metrics
WHERE as_of_date BETWEEN :start_date AND :end_date;
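As a local cross-check, the same approximation can be sketched in plain Python. This assumes a daily expected frequency; the sample dates are hypothetical:

```python
from datetime import date

def date_coverage_rate(record_dates, start, end):
    """Fraction of calendar days in [start, end] that have at least one record."""
    total_days = (end - start).days + 1
    covered = {d for d in record_dates if start <= d <= end}
    return len(covered) / total_days if total_days > 0 else 0.0

# 3 distinct covered days out of a 5-day window
dates = [date(2024, 1, 1), date(2024, 1, 1), date(2024, 1, 3), date(2024, 1, 5)]
print(date_coverage_rate(dates, date(2024, 1, 1), date(2024, 1, 5)))  # 0.6
```

As in the SQL, duplicate records on the same date count only once toward coverage.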
Segment coverage rate (at least one segment non-null):

SELECT SUM(CASE WHEN segment_1 IS NOT NULL OR segment_2 IS NOT NULL THEN 1 ELSE 0 END)::float
       / COUNT(*) AS segment_coverage_rate
FROM core_metrics
WHERE as_of_date BETWEEN :start_date AND :end_date;
Primary-key uniqueness (the example key uses segment_1/segment_2; if absent, use metric_id + as_of_date only):

SELECT COUNT(*) - COUNT(DISTINCT CONCAT_WS('|', metric_id, as_of_date::text,
         COALESCE(segment_1, ''), COALESCE(segment_2, ''))) AS duplicate_count,
       (COUNT(*) - COUNT(DISTINCT CONCAT_WS('|', metric_id, as_of_date::text,
         COALESCE(segment_1, ''), COALESCE(segment_2, ''))))::float / COUNT(*) AS duplicate_rate
FROM core_metrics
WHERE as_of_date BETWEEN :start_date AND :end_date;
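The same composite-key duplicate logic can be verified locally in Python. The row tuples below are hypothetical (metric_id, as_of_date, segment_1, segment_2) values:

```python
def duplicate_stats(rows):
    """rows: tuples of (metric_id, as_of_date, segment_1, segment_2).
    Returns (duplicate_count, duplicate_rate) over the composite key,
    treating NULL segments as empty strings, as the SQL COALESCE does."""
    keys = [(m, d, s1 or "", s2 or "") for m, d, s1, s2 in rows]
    duplicate_count = len(keys) - len(set(keys))
    rate = duplicate_count / len(keys) if keys else 0.0
    return duplicate_count, rate

rows = [
    ("dau", "2024-01-01", "ios", None),
    ("dau", "2024-01-01", "ios", None),      # duplicate of the row above
    ("dau", "2024-01-01", "android", ""),
]
print(duplicate_stats(rows))  # (1, 0.3333333333333333)
```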
Field completeness (key fields, e.g. metric_id, as_of_date, value, source_system, ingested_at):

SELECT 'metric_id' AS field_name,
       SUM(CASE WHEN metric_id IS NOT NULL THEN 1 ELSE 0 END)::float / COUNT(*) AS non_null_rate,
       SUM(CASE WHEN metric_id IS NULL THEN 1 ELSE 0 END) AS null_count
FROM core_metrics WHERE as_of_date BETWEEN :start_date AND :end_date
UNION ALL
SELECT 'as_of_date',
       SUM(CASE WHEN as_of_date IS NOT NULL THEN 1 ELSE 0 END)::float / COUNT(*) AS non_null_rate,
       SUM(CASE WHEN as_of_date IS NULL THEN 1 ELSE 0 END) AS null_count
FROM core_metrics WHERE as_of_date BETWEEN :start_date AND :end_date
UNION ALL
SELECT 'value',
       SUM(CASE WHEN value IS NOT NULL THEN 1 ELSE 0 END)::float / COUNT(*) AS non_null_rate,
       SUM(CASE WHEN value IS NULL THEN 1 ELSE 0 END) AS null_count
FROM core_metrics WHERE as_of_date BETWEEN :start_date AND :end_date
UNION ALL
SELECT 'source_system',
       SUM(CASE WHEN source_system IS NOT NULL THEN 1 ELSE 0 END)::float / COUNT(*) AS non_null_rate,
       SUM(CASE WHEN source_system IS NULL THEN 1 ELSE 0 END) AS null_count
FROM core_metrics WHERE as_of_date BETWEEN :start_date AND :end_date
UNION ALL
SELECT 'ingested_at',
       SUM(CASE WHEN ingested_at IS NOT NULL THEN 1 ELSE 0 END)::float / COUNT(*) AS non_null_rate,
       SUM(CASE WHEN ingested_at IS NULL THEN 1 ELSE 0 END) AS null_count
FROM core_metrics WHERE as_of_date BETWEEN :start_date AND :end_date;
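A compact Python equivalent of the per-field completeness check, over records represented as dicts (the sample rows are hypothetical):

```python
def non_null_rates(records, fields):
    """Per-field non-null rate and null count over a list of dict records."""
    n = len(records)
    out = {}
    for f in fields:
        nulls = sum(1 for r in records if r.get(f) is None)
        out[f] = {"non_null_rate": (n - nulls) / n if n else 0.0,
                  "null_count": nulls}
    return out

rows = [
    {"metric_id": "dau", "value": 120.0},
    {"metric_id": "dau", "value": None},   # missing value
]
print(non_null_rates(rows, ["metric_id", "value"]))
```

Unlike the SQL's one-branch-per-field UNION ALL, this loops over a field list, so adding a field is a one-line change.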
Validity checks
Type and parsability (numeric and finite values):

SELECT SUM(CASE WHEN value IS NULL OR NOT (value = value) THEN 1 ELSE 0 END) AS invalid_numeric_count,  -- NaN detection is dialect-dependent
       SUM(CASE WHEN value IS NULL OR NOT (value = value) THEN 1 ELSE 0 END)::float / COUNT(*) AS invalid_numeric_rate
FROM core_metrics
WHERE as_of_date BETWEEN :start_date AND :end_date;
Threshold range:

SELECT SUM(CASE WHEN threshold_min IS NOT NULL AND threshold_max IS NOT NULL
                 AND (value < threshold_min OR value > threshold_max) THEN 1 ELSE 0 END) AS out_of_threshold_count,
       SUM(CASE WHEN threshold_min IS NOT NULL AND threshold_max IS NOT NULL
                 AND (value < threshold_min OR value > threshold_max) THEN 1 ELSE 0 END)::float
       / COUNT(*) AS out_of_threshold_rate
FROM core_metrics
WHERE as_of_date BETWEEN :start_date AND :end_date;
Threshold sanity (min must not exceed max):

SELECT SUM(CASE WHEN threshold_min IS NOT NULL AND threshold_max IS NOT NULL
                 AND threshold_min > threshold_max THEN 1 ELSE 0 END) AS invalid_threshold_pair_count
FROM core_metrics
WHERE as_of_date BETWEEN :start_date AND :end_date;
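The two threshold rules can be combined into one small Python check. Rows here are hypothetical (value, threshold_min, threshold_max) tuples:

```python
def threshold_checks(rows):
    """rows: (value, threshold_min, threshold_max) tuples.
    Returns the out-of-threshold rate (counted only where both bounds
    are set) and the count of inverted bound pairs (min > max)."""
    n = len(rows)
    out_of_range = sum(
        1 for v, lo, hi in rows
        if lo is not None and hi is not None and (v < lo or v > hi)
    )
    inverted = sum(
        1 for _, lo, hi in rows
        if lo is not None and hi is not None and lo > hi
    )
    return (out_of_range / n if n else 0.0), inverted

rows = [(5, 0, 10), (12, 0, 10), (3, 8, 2)]  # ok / out of range / inverted bounds
print(threshold_checks(rows))  # (0.6666666666666666, 1)
```

Note that an inverted bound pair also counts as out-of-range, mirroring the two independent SQL rules.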
Foreign-key integrity (requires metric_dict):

SELECT COUNT(*) - COUNT(md.metric_id) AS fk_missing_count,
       (COUNT(*) - COUNT(md.metric_id))::float / COUNT(*) AS fk_missing_rate
FROM core_metrics cm
LEFT JOIN metric_dict md ON cm.metric_id = md.metric_id
WHERE cm.as_of_date BETWEEN :start_date AND :end_date;
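The LEFT JOIN check reduces to set membership; a minimal Python sketch with hypothetical fact and dimension keys:

```python
def fk_missing_rate(fact_keys, dim_keys):
    """Count and rate of fact rows whose metric_id is absent from the dimension."""
    fact_keys = list(fact_keys)
    dims = set(dim_keys)
    missing = sum(1 for k in fact_keys if k not in dims)
    rate = missing / len(fact_keys) if fact_keys else 0.0
    return missing, rate

# 'arpu' has no entry in the dimension table
print(fk_missing_rate(["dau", "mau", "arpu"], ["dau", "mau"]))  # (1, 0.3333333333333333)
```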
Distribution and anomalies (IQR):

WITH stats AS (
  SELECT metric_id,
         PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY value) AS q1,
         PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY value) AS q3
  FROM core_metrics
  WHERE as_of_date BETWEEN :start_date AND :end_date
  GROUP BY metric_id
)
SELECT cm.metric_id,
       SUM(CASE WHEN cm.value < (s.q1 - 1.5 * (s.q3 - s.q1))
                  OR cm.value > (s.q3 + 1.5 * (s.q3 - s.q1)) THEN 1 ELSE 0 END)::float
       / COUNT(*) AS iqr_outlier_rate
FROM core_metrics cm
JOIN stats s ON cm.metric_id = s.metric_id
WHERE cm.as_of_date BETWEEN :start_date AND :end_date
GROUP BY cm.metric_id;
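To sanity-check the IQR rule for a single metric, here is a stdlib Python version; `statistics.quantiles` with `method="inclusive"` uses the same linear interpolation as PERCENTILE_CONT, and the sample values are illustrative:

```python
from statistics import quantiles

def iqr_outlier_rate(values, k=1.5):
    """Share of values outside [q1 - k*IQR, q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return sum(1 for v in values if v < lo or v > hi) / len(values)

vals = [10, 11, 12, 11, 10, 12, 11, 100]  # 100 is an obvious outlier
print(iqr_outlier_rate(vals))  # 0.125
```

In the SQL this is computed per metric_id; mixing metrics with different scales in one IQR window would misclassify normal points as outliers.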
Timeliness (based on ingested_at versus event_time or as_of_date)
If event_time exists:

SELECT SUM(CASE WHEN EXTRACT(EPOCH FROM (ingested_at - event_time)) / 3600 <= :sla_hours THEN 1 ELSE 0 END)::float
       / COUNT(*) AS timeliness_pass_rate,
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY EXTRACT(EPOCH FROM (ingested_at - event_time))) AS delay_median_seconds,
       PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY EXTRACT(EPOCH FROM (ingested_at - event_time))) AS delay_p95_seconds
FROM core_metrics
WHERE as_of_date BETWEEN :start_date AND :end_date;
If event_time is unavailable, approximate with as_of_date to ingested_at:

SELECT SUM(CASE WHEN EXTRACT(EPOCH FROM (ingested_at - as_of_date::timestamp)) / 3600 <= :sla_hours THEN 1 ELSE 0 END)::float
       / COUNT(*) AS timeliness_pass_rate
FROM core_metrics
WHERE as_of_date BETWEEN :start_date AND :end_date;
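A minimal Python sketch of the SLA pass-rate calculation, assuming (event_time, ingested_at) datetime pairs; the sample timestamps are hypothetical:

```python
from datetime import datetime

def timeliness_pass_rate(pairs, sla_hours):
    """pairs: (event_time, ingested_at) datetimes.
    Share of records ingested within the SLA window."""
    delays_h = [(ing - ev).total_seconds() / 3600 for ev, ing in pairs]
    return sum(1 for d in delays_h if d <= sla_hours) / len(delays_h)

pairs = [
    (datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 1, 2, 0)),  # 2 h delay: pass
    (datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 2, 6, 0)),  # 30 h delay: fail
]
print(timeliness_pass_rate(pairs, sla_hours=24))  # 0.5
```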
5. Suggested quality thresholds (to be confirmed against your business and risk tolerance)
6. Monitoring and alerting implementation suggestions
7. Information needed to generate results (please provide)
Notes
Get your team a decision-ready data-profiling report in the shortest possible time. This prompt guides the AI to adopt the professional perspective of a data-quality analyst and, across four modules (cleansing, validation, profiling, and monitoring), generate structured, objective, non-redundant conclusions and improvement recommendations. Simply enter the dataset name and choose the desired output language to receive a clear, readable analysis that quickly pinpoints missing values, anomalies, duplicates, and field inconsistencies, together with actionable remediation and monitoring plans. It suits new-data onboarding reviews, pre-training data health checks, post-refresh report health checks, third-party data delivery acceptance, and compliance audits, helping you shorten analysis cycles, improve data trustworthiness, reduce decision risk, and establish reusable quality-assessment standards for your team.
Quickly assess a new dataset: generate a quality profile and risk list, plan the cleansing work, and output visualization highlights for reporting and collaboration.
Complete quality assessment and rule setup before onboarding; generate a self-test checklist and threshold suggestions in one click to reduce launch failures and rollback risk.
Use the business-oriented report to understand data trustworthiness, identify quality issues affecting core metrics, drive remediation priorities, and report to management.
Copy the generated prompt into your favorite chat app (such as ChatGPT or Claude) and use it directly in conversation, with no extra development required. Suited to quick personal trials and lightweight use cases.
Turn the prompt template into an API: your program can modify template parameters at will and call it directly through the interface, enabling automation and batch processing. Suited to developer integration and embedding in business systems.
Configure the corresponding server address in your MCP client so that your AI application can invoke the prompt template automatically. Suited to advanced users and team collaboration, letting prompts work seamlessly across different AI tools.