Writing Peer Review Guidelines

Updated Sep 28, 2025

Design precise peer-review guidance tailored to a specific task type.

Example 1

Peer Review Guidelines for Experimental Reports
实验报告同侪评审指南

1) Purpose and Scope
英文
- Purpose: To standardize high-quality, fair, and evidence-based peer review of experimental reports across STEM and social science contexts.
- Scope: Applicable to coursework lab reports, capstone projects, and empirical manuscripts that include a research question, method, data, analysis, and interpretation.

中文
- 目的:为STEM与社会科学领域的实验报告提供高质量、公平、循证的同侪评审标准。
- 适用范围:课程实验报告、毕业设计/项目、以及包含研究问题、方法、数据、分析与解释的经验研究手稿。

2) Reviewer Responsibilities and Ethics
英文
- Declare conflicts of interest; recuse when necessary.
- Maintain confidentiality; do not share or reuse materials or data.
- Evaluate the work, not the author(s); avoid bias related to identity, language, institution, or prior beliefs.
- Use evidence from the manuscript; avoid speculation.
- Adhere to institutional academic integrity policies and COPE principles.

中文
- 如有利益冲突需声明并在必要时回避。
- 保持保密;不得传播或二次使用稿件材料或数据。
- 评审对象是作品而非作者;避免与身份、语言、机构或先验立场相关的偏见。
- 评论基于文本证据;避免推测。
- 遵循机构学术诚信政策与COPE伦理原则。

3) Review Process
英文
- Calibration: Review rubric and two calibrated exemplars (high/low) before scoring.
- Initial pass: Skim for overall structure and compliance with assignment brief.
- Deep review: Evaluate each criterion with annotations that cite line/figure/table numbers.
- Verify analyses: Check assumptions, effect sizes, and reproducibility elements.
- Decision and feedback: Assign scores with rationale; provide prioritized, actionable revisions.
- Recordkeeping: Document decisions and uncertainties for moderation.

中文
- 口径一致:评分前阅读评分量表与两份标定样例(高/低)。
- 初读:检查结构与任务要求的符合性。
- 深评:逐项评估并在文中标注,引用具体页码/图表。
- 分析核验:检查统计假设、效应量与可复现要素。
- 决定与反馈:给出分数与理由;提出有优先级、可执行的修改建议。
- 记录:记录裁量与不确定点以备复核。

4) Core Evaluation Criteria and Performance Levels
英文
Use the following criteria with four performance levels: Exemplary, Proficient, Developing, Inadequate. Anchor descriptors are concise; adapt to disciplinary context.

- Title and Abstract (5%)
  • Clear, specific, and informative; abstract states problem, methods, key results with effect sizes/CI, and main conclusion without overclaiming.
- Research Question and Rationale (10%)
  • Research question/hypotheses are precise, theory-driven, and testable; literature synthesis justifies design and variables.
- Methods: Design, Materials, and Procedure (20%)
  • Design matches question; operational definitions precise; sampling, randomization/blinding (if applicable), controls, and power/sample size rationale are reported; ethical approval/consent addressed.
- Data Management and Transparency (5%)
  • Data cleaning rules, exclusion criteria, handling of missing data and outliers pre-specified or justified; code/software and versioning documented; availability statement provided where permitted.
- Statistical Analysis and Reporting (20%)
  • Assumptions checked; correct models used; effect sizes and 95% CI reported; alpha and multiplicity control justified; preregistration referenced if present; p-values not overstated; practical significance discussed.
- Results: Clarity and Integrity (15%)
  • Figures/tables labeled and self-contained; no duplication between text and graphics; descriptive statistics and exact p-values; no HARKing or selective reporting.
- Discussion and Conclusion (15%)
  • Interpretation aligns with data and limitations; alternative explanations considered; external validity cautions; implications and future work are evidence-based.
- Writing, Structure, and Style (5%)
  • Logical organization; precise terminology; adherence to assigned style guide (e.g., APA, CSE); consistent citation and reference accuracy.
- Originality and Academic Integrity (5%)
  • Correct attribution; absence of plagiarism or inappropriate text recycling; proper use of AI tools per policy, disclosed if required.

中文
采用四级绩效标准:优(Exemplary)、良(Proficient)、待改进(Developing)、不达标(Inadequate),并可结合学科特点调整锚定描述。

- 标题与摘要(5%)
  • 明确、具体、信息充分;摘要包含问题、方法、关键结果(含效应量/置信区间)、与稳健结论,避免夸大。
- 研究问题与理论依据(10%)
  • 问题/假设精确、可检验,立足理论;文献综述合理支撑设计与变量选择。
- 方法:设计、材料与过程(20%)
  • 设计契合问题;操作性定义准确;抽样、随机化/盲法(如适用)、对照与功效/样本量依据充分;伦理审批/同意交代清楚。
- 数据管理与透明度(5%)
  • 数据清洗规则、排除标准、缺失与异常处理预先规定或有理据;代码/软件与版本记录;在许可范围内提供可用性声明。
- 统计分析与报告(20%)
  • 检验前提;模型正确;报告效应量与95%CI;α与多重比较控制有依据;如有预注册予以引用;不夸大p值;讨论实践意义。
- 结果:清晰性与诚信(15%)
  • 图表标注完整、自洽;避免与正文重复;报告描述性统计与精确p值;避免事后假设(HARKing)与选择性呈现。
- 讨论与结论(15%)
  • 诠释与数据一致并讨论局限;考虑替代解释;外部效度有节制;基于证据提出意义与后续研究方向。
- 写作、结构与格式(5%)
  • 结构严谨;术语准确;遵循指定格式(如APA、CSE);引文与参考文献准确一致。
- 原创性与学术诚信(5%)
  • 正确署引;无抄袭或不当文本重复;按政策规范使用并(如需)披露AI工具。

5) Statistical and Reporting Standards
英文
- Report effect sizes with confidence intervals; avoid dichotomous “significant/non-significant” framing.
- State and justify alpha; address multiple testing (e.g., Holm, Benjamini–Hochberg).
- Check and report model assumptions; provide diagnostics or robust alternatives as needed.
- Disclose software, versions, and key settings; share code when permissible.
- Predefine primary/secondary outcomes; clearly label exploratory analyses.
- For specialized designs follow relevant guidelines (e.g., CONSORT for randomized trials, STROBE for observational studies; SAMPL for statistical reporting).

中文
- 报告效应量与置信区间;避免将结果简化为“显著/不显著”。
- 说明并论证α;处理多重检验(如Holm、Benjamini–Hochberg)。
- 检查并报告模型前提;必要时提供诊断或稳健替代。
- 披露软件与版本及关键参数;在许可下共享代码。
- 预先界定主要/次要指标;探索性分析需明确标注。
- 特殊设计遵循相应规范(如随机试验用CONSORT,观察性研究用STROBE;统计报告参考SAMPL)。
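The multiplicity-control procedures named above are straightforward to verify by hand; below is a minimal, dependency-free sketch of Benjamini–Hochberg adjusted p-values (the input p-values are illustrative):

```python
# Minimal sketch of Benjamini-Hochberg adjusted p-values, one of the
# multiplicity-control options named above. Pure Python, no dependencies.
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values (q-values) in the original order."""
    m = len(pvalues)
    # Walk from the largest p-value down so a running minimum enforces
    # monotonicity of the adjusted values.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for rank_from_top, i in enumerate(order):
        rank = m - rank_from_top          # 1-based rank in ascending order
        q = pvalues[i] * m / rank
        running_min = min(running_min, q)
        adjusted[i] = running_min
    return adjusted

print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```

Reviewers can use such a check to confirm that reported "adjusted" p-values actually follow the stated procedure.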

6) Ensuring Reliability and Fairness
英文
- Calibration session with exemplars; align interpretation of anchors.
- Double marking of a sample (≥10%) to estimate inter-rater reliability (e.g., ICC for continuous scores, Cohen’s κ for categorical decisions).
- Use blind review where feasible to reduce bias.
- Bias check: language proficiency, confirmation bias, prestige bias, and topic valence; justify major judgments with text evidence.

中文
- 通过样例标定统一锚定理解。
- 抽取≥10%进行双评,估计评审者间信度(连续评分用ICC,类别决策用Cohen’s κ)。
- 条件允许时采用盲评以降低偏见。
- 偏见核查:语言流利度、确认性偏差、名校/名人效应、主题好恶;重大判断须有文本证据支撑。
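For the categorical decisions mentioned above, Cohen's κ over a double-marked sample can be sketched as follows (the accept/revise/reject decisions are illustrative):

```python
from collections import Counter

# Minimal sketch of Cohen's kappa for two raters making categorical
# decisions on the same double-marked sample.
def cohens_kappa(rater_a, rater_b):
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    # Chance agreement under independent marginals.
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)

a = ["accept", "revise", "revise", "reject", "accept", "revise"]
b = ["accept", "revise", "reject", "reject", "accept", "accept"]
print(round(cohens_kappa(a, b), 3))  # → 0.52
```

Values near 1 indicate strong agreement; values near 0 indicate agreement no better than chance, which should trigger a recalibration session.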

7) Feedback Quality Standards
英文
- Specific: Reference exact locations (page/figure/table).
- Evidence-based: Tie comments to criteria and standards.
- Actionable: Provide concrete revision steps (what to change, how, and why).
- Prioritized: Separate critical issues (validity, ethics, statistics) from stylistic edits.
- Balanced: Note strengths and improvements; avoid evaluative language about the author.

中文
- 具体:指明页码/图/表位置。
- 循证:将意见与量表标准相对应。
- 可执行:明确修改内容、方法与理由。
- 分层:将关键问题(效度、伦理、统计)与写作层面区分。
- 平衡:兼顾优点与改进点;避免针对作者的评价性措辞。

8) Decision Framework
英文
- Accept: Only minor editorial refinements; methodological and analytic standards fully met.
- Minor revisions: Core methods sound; limited clarifications or small analytic adjustments.
- Major revisions: Substantial methodological or analytic issues; requires re-analysis or redesign of reporting.
- Reject/Resubmit: Fundamental flaws in design, measurement, or integrity that cannot be remedied within scope.

中文
- 接受:仅需小幅文字润色;方法与分析标准完全达标。
- 小修:方法可靠;需少量澄清或小幅分析调整。
- 大修:方法或分析存在实质性问题;需重做分析或重大重写。
- 拒绝/重投:设计、测量或诚信存在根本性缺陷,当前范围内难以补救。

9) Reviewer Checklist
英文
- Does the study answer a clearly stated, theoretically motivated question?
- Is the design appropriate and ethical approvals documented?
- Are variables and procedures operationalized with sufficient precision for replication?
- Are data handling rules and transparency statements adequate?
- Are analyses appropriate, assumptions tested, effect sizes and CI reported?
- Are conclusions aligned with results and limitations?
- Are reporting standards and citation style consistently applied?
- Are conflicts of interest and potential biases addressed?

中文
- 研究问题是否明确且有理论支撑?
- 设计是否恰当且伦理审批完备?
- 变量与流程是否具备可复制的操作性定义?
- 数据处理规则与透明度声明是否充分?
- 分析是否合适、检验前提是否报告、效应量与CI是否呈现?
- 结论是否与结果和局限相一致?
- 报告规范与引用格式是否一致?
- 利益冲突与潜在偏见是否得到处理?

10) Rubric Weights (default; adjustable with task brief)
英文
- Methods: 20%
- Statistical Analysis and Reporting: 20%
- Results: 15%
- Discussion: 15%
- Research Question and Rationale: 10%
- Title and Abstract: 5%
- Writing, Structure, and Style: 5%
- Data Transparency: 5%
- Integrity and Ethics: 5%

中文
- 方法:20%
- 统计分析与报告:20%
- 结果:15%
- 讨论:15%
- 研究问题与理论依据:10%
- 标题与摘要:5%
- 写作、结构与格式:5%
- 数据透明度:5%
- 诚信与伦理:5%
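A minimal sketch of turning the default weights above into an overall score (criterion scores on a 0–100 scale are illustrative; the criterion names follow the list above):

```python
# Default rubric weights from the list above; adjustable per task brief.
WEIGHTS = {
    "Methods": 0.20,
    "Statistical Analysis and Reporting": 0.20,
    "Results": 0.15,
    "Discussion": 0.15,
    "Research Question and Rationale": 0.10,
    "Title and Abstract": 0.05,
    "Writing, Structure, and Style": 0.05,
    "Data Transparency": 0.05,
    "Integrity and Ethics": 0.05,
}
assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9  # weights must total 100%

def overall_score(scores):
    """Weighted total; raises if any rubric criterion is left unscored."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"unscored criteria: {sorted(missing)}")
    return sum(scores[c] * w for c, w in WEIGHTS.items())

example = {c: 80 for c in WEIGHTS}
example["Methods"] = 90  # 10 extra points weighted at 20% adds 2 overall
print(round(overall_score(example), 2))  # → 82.0
```

Forcing every criterion to be scored (rather than defaulting absent ones to zero) keeps totals comparable across reviewers.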

11) Discipline-Specific Style and Reporting Guidance
英文
- Social sciences/psychology: APA Style and JARS.
- Biomedical: AMA or ICMJE recommendations; CONSORT/STROBE/PRISMA as applicable.
- Life sciences: CSE style; ARRIVE for animal studies.
- General statistical reporting: SAMPL and ASA p-value statement.
- Open science: TOP Guidelines; preregistration and open materials/data when permitted.

中文
- 社会科学/心理学:APA写作规范与JARS报告标准。
- 生物医学:AMA或ICMJE建议;按需遵循CONSORT/STROBE/PRISMA。
- 生命科学:CSE风格;动物研究遵循ARRIVE。
- 统计报告:参考SAMPL与ASA关于p值的声明。
- 开放科学:TOP指南;在允许范围内进行预注册与开放材料/数据。

References (selected, formatted in APA style)
- American Psychological Association. (2020). Publication manual of the American Psychological Association (7th ed.). Includes Journal Article Reporting Standards (JARS).
- American Statistical Association. (2016). ASA statement on p-values: Context, process, and purpose. The American Statistician, 70(2), 129–133.
- Committee on Publication Ethics (COPE). (n.d.). COPE Core Practices. https://publicationethics.org
- Equator Network. (n.d.). Reporting guidelines (e.g., CONSORT, STROBE, PRISMA, ARRIVE). https://www.equator-network.org
- Lang, T., & Altman, D. (2015). Basic statistical reporting for articles in biomedical journals: The SAMPL guidelines. International Journal of Nursing Studies, 52(1), 5–9.
- Nosek, B. A., et al. (2015). Promoting an open research culture. Science, 348(6242), 1422–1425. (TOP Guidelines)
- International Committee of Medical Journal Editors. (2024). Recommendations for the conduct, reporting, editing, and publication of scholarly work in medical journals. https://www.icmje.org

Note: Adapt the rubric and criteria weights to the assignment brief and disciplinary expectations, and ensure alignment with the institution’s academic integrity and data governance policies.

Example 2

Peer Review Guidelines: Performance Appraisal and 360-Degree Feedback

1. Purpose and Scope
- Purpose: To give reviewers an evidence-based, actionable set of standards and procedures, grounded in psychometrics and organizational practice, for auditing the design quality, implementation compliance, and validity of use of an organization's performance appraisal and 360-degree feedback ("360 feedback") programs.
- Scope: Applies to appraisal and 360 feedback programs used for developmental and/or administrative purposes (e.g., promotion, compensation), covering instrument development, sampling and administration, scoring and aggregation, reliability, validity, and fairness evidence, data governance and ethics, and feedback delivery and follow-up application.

2. Core Principles and Reviewer Conduct
- Scientific and evidence-driven: Ground review comments in job analysis, psychometric evidence, and empirical research (AERA, APA, & NCME, 2014; DeNisi & Murphy, 2017).
- Fairness and compliance: Ensure fair process, interpretable results, and privacy protection, consistent with applicable standards and regulations (ISO 10667-2:2020; UGESP, 1978).
- Fit for purpose: Distinguish developmental from administrative uses; the intended use determines the evidentiary bar and reporting strategy (Bracken, Rose, & Church, 2016).
- Reviewer independence: Recuse from conflicts of interest; keep materials confidential; document traceable rationales and recommendations.

3. Review Dimensions, Criteria, and Evidence Requirements
A. Goal Alignment and Job Analysis
- Criteria: Program goals (developmental/administrative/mixed) are explicit; aligned with strategy and the job competency model; grounded in a current, structured job analysis (task-competency mapping, critical incident technique, etc.).
- Evidence: Job analysis report; competency dictionary with behavioral indicators; statement of intended use and policy.
- Key point: Avoid abstract personality adjectives; use observable behavioral indicators (Smith & Kendall, 1963).

B. Instrument and Content Quality (Scales and Items)
- Criteria: Items map one-to-one to competency dimensions; behavioral anchors are clear; the scale includes a "cannot observe/not applicable" option; language is concise and culturally appropriate; double-barreled items are avoided.
- Evidence: Item blueprint and cognitive-interview records; expert content-validity review; pilot testing and revision logs.
- Recommendation: Prefer behaviorally anchored rating scales (BARS) or behavior-frequency scales to reduce ambiguity and halo effects (Smith & Kendall, 1963).

C. Rater Sampling and Administration
- Criteria: Rater sources (supervisor/peers/direct reports/self/customers) and counts satisfy anonymity and reliability needs; administration is standardized; rater training is adequate.
- Evidence: Sampling frame and minimum-group-size rules; training materials and attendance; reminder and appeal mechanisms.
- Baseline: At least 3 raters per source to protect anonymity; 5–7 improves reliability (Bracken, Timmreck, & Church, 2001). For high-stakes administrative uses, prefer larger samples and provide alternative sources (e.g., customers).
- Training essentials: Frame-of-reference training, behavioral observation and evidence recording, bias awareness (leniency/severity, central tendency, halo, recency), and confidentiality commitments (DeNisi & Murphy, 2017).

D. Scoring, Aggregation, and Statistical Methods
- Criteria: Scoring rules, handling of missing and anomalous data, and cross-source weighting and aggregation are pre-registered and consistent with the intended use; statistical models are interpretable and auditable.
- Evidence: Scoring and weighting scheme (with formulas); rules for missing values and extreme response styles; report templates and examples.
- Recommendations:
  - Missing data: Treat "cannot observe" as missing; never score missingness as a low rating.
  - Aggregation: Report per-source results alongside an overall composite; for administrative uses, pre-define source weights (e.g., a higher supervisor weight) with a rationale grounded in the job analysis.
  - Quality control: Exclude low-quality responses (very short completion times, straight-lining, uniform extremes) using documented thresholds.
  - Advanced methods: For high-stakes uses, consider many-facet Rasch or generalizability-theory analyses to separate item difficulty, rater severity, and true ratee differences (Myford & Wolfe, 2003; Shavelson & Webb, 1991).
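The missing-data, anonymity-threshold, and weighting rules above can be sketched as follows (source names, weights, and ratings are illustrative; the ≥3-rater floor follows the baseline in section C):

```python
# Minimal sketch of 360-feedback aggregation: "cannot observe" (None) is
# treated as missing rather than a low score, each source must clear a
# minimum-anonymity floor of 3 raters, and reportable sources are combined
# with pre-defined, re-normalized weights. Names and weights are illustrative.
MIN_RATERS = 3
SOURCE_WEIGHTS = {"supervisor": 0.4, "peer": 0.3, "direct_report": 0.3}

def source_mean(ratings):
    """Mean rating for one source; None entries mark 'cannot observe'."""
    observed = [r for r in ratings if r is not None]
    if len(observed) < MIN_RATERS:
        return None  # suppress: too few raters to report anonymously
    return sum(observed) / len(observed)

def composite(by_source):
    """Weighted composite over the sources that cleared the floor."""
    means = {s: source_mean(r) for s, r in by_source.items()}
    usable = {s: m for s, m in means.items() if m is not None}
    total_w = sum(SOURCE_WEIGHTS[s] for s in usable)
    if total_w == 0:
        return None
    # Re-normalize weights over the reportable sources only.
    return sum(SOURCE_WEIGHTS[s] * m for s, m in usable.items()) / total_w

ratings = {
    "supervisor": [4, 5, 4],
    "peer": [3, None, 4, 5, 3],   # one "cannot observe" response
    "direct_report": [4, None],   # below the floor -> suppressed
}
print(round(composite(ratings), 3))  # → 4.083
```

Suppressing an under-sized source (rather than reporting it) is what enforces the anonymity guarantee downstream in the feedback report.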

E. Reliability and Validity Evidence
- Criteria: Provide reliability and validity evidence for the target population and intended use, reported by source.
- Evidence:
  - Reliability: Internal consistency (α/ω); inter-rater agreement (ICC; reporting ICC(2,k) is recommended); generalizability-theory G- and D-studies (Shavelson & Webb, 1991).
  - Structural validity: Dimensionality tests (CFA; multitrait-multimethod (MTMM) matrices to identify source method effects).
  - Criterion validity: Relationships with objective performance, business KPIs, and supervisor ratings; known-groups effects and convergent-discriminant validity.
  - Consequential validity: Evidence of post-intervention behavior change and performance improvement (Kluger & DeNisi, 1996; London & Smither, 1995).
- Thresholds: Moderate reliability may suffice for developmental uses; personnel decisions require higher levels and convergent evidence from multiple sources (AERA et al., 2014).
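A minimal sketch of ICC(2,k) (the Shrout–Fleiss two-way random-effects, average-measures form recommended above), computed from a ratee × rater matrix; the data below are illustrative:

```python
# Minimal sketch of ICC(2,k): two-way random effects, average of k raters,
# following the Shrout-Fleiss mean-square formulation. Rows are ratees,
# columns are raters.
def icc2k(matrix):
    n, k = len(matrix), len(matrix[0])
    grand = sum(sum(row) for row in matrix) / (n * k)
    row_means = [sum(row) / k for row in matrix]
    col_means = [sum(matrix[i][j] for i in range(n)) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for row in matrix for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between ratees
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_err = ss_total - ss_rows - ss_cols
    bms = ss_rows / (n - 1)              # between-ratee mean square
    jms = ss_cols / (k - 1)              # between-rater mean square
    ems = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (bms - ems) / (bms + (jms - ems) / n)

# Perfect agreement across raters yields an ICC(2,k) of 1.
print(icc2k([[4, 4, 4], [2, 2, 2], [5, 5, 5]]))
```

In practice an established implementation (e.g., a dedicated statistics package) should be preferred; the sketch only shows what the reported coefficient is measuring.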

F. Fairness and Bias Control
- Criteria: Identify and control source method effects and idiosyncratic rater effects; conduct differential item functioning (DIF) and measurement invariance tests; monitor adverse impact.
- Evidence: Variance decomposition (ratee × source × item); invariance tests across gender/age/ethnicity/job level; adverse-impact analyses and explanations.
- Key point: Idiosyncratic rater effects account for a substantial share of rating variance; mitigate them through multi-source aggregation, rater training, and model-based correction (Scullen, Mount, & Goff, 2000). Follow fairness and accessibility guidelines (AERA et al., 2014).

G. Data Governance, Privacy, and Ethics
- Criteria: Compliant informed consent, data minimization, access control, anonymization/de-identification, and retention/deletion policies; minimum display thresholds in reports (e.g., ≥3 responses per source).
- Evidence: Privacy policy, data-flow diagrams, access-role matrix, security audit records; service processes aligned with ISO 10667 or an equivalent standard.
- Key point: Separate developmental reports from manager-visible information to prevent retaliation and chilling effects (Bracken et al., 2016).

H. Feedback Reporting and Application
- Criteria: Reports are clear, interpretable, and actionable; supported by coaching or debriefs; goal setting and follow-up mechanisms are in place.
- Evidence: Sample reports (with behavioral evidence and concrete recommendations), coaching and action-plan templates, follow-up milestones and metrics.
- Key points:
  - Include normative references (same-role/same-level benchmarks) and notes on reliability and uncertainty.
  - Prioritize developmental use; programs that tie 360 results directly to pay or promotion must provide mitigations for the impact on rater honesty and validity (e.g., anonymity protection, independent third-party administration) (Atwater & Brett, 2005; Bracken et al., 2016).
  - Translating feedback into specific goals and implementation plans substantially improves outcomes (Locke & Latham, 2002).

I. Continuous Improvement and Monitoring
- Criteria: Define key quality indicators (completion rate, item variance, ICC, rate of substantive comments, appeal rate, etc.); audit periodically and revise the instrument.
- Evidence: Periodic quality reports, version history, change impact assessments.

4. Review Process and Outputs
- Preparation: Collect the materials on the checklist (see appendix); clarify the purpose and context; confirm conflicts of interest and confidentiality agreements.
- Independent review: Form dimension-by-dimension judgments with evidence notes against the nine dimensions above.
- Calibration and finalization: With multiple reviewers, score independently first, then hold a calibration meeting; produce consensus findings and prioritized remediation recommendations.
- Output format: For each dimension, give a level judgment with a summary of key evidence, plus specific, actionable improvement recommendations and a timeline.

5. Suggested Rating Levels and Interpretation
- Inadequate: Critical evidence is missing or major risks exist (e.g., no job analysis, insufficient anonymity thresholds, substandard reliability).
- Marginally adequate: Core processes exist but evidence is weak or inconsistent (e.g., reliability acceptable for only one source).
- Adequate: Complete evidence chain, indicators met, risks controlled (including fairness and privacy protections).
- Exemplary: Beyond adequate, employs advanced methods (e.g., many-facet models), systematic coaching, and a closed loop of continuous improvement, with empirical evidence of organizational performance gains.

6. Common Risks and Corrective Recommendations
- Tying 360 feedback directly to pay decisions distorts ratings and reduces feedback acceptance: pilot with developmental use first, strengthen anonymity protection and third-party administration, and phase in administrative uses (Bracken et al., 2016).
- Items are too abstract or moralizing: rewrite them as specific, observable behaviors with positive/negative anchor examples (Smith & Kendall, 1963).
- Rater samples are too small or unbalanced: add peer and cross-functional raters; ensure ≥3 per source and ideally 5–7 (Bracken et al., 2001).
- Statistical processing is opaque: pre-register aggregation and missing-data rules; provide a methods appendix and auditable calculation notes.
- Method effects and idiosyncratic rater effects are high: use MTMM, generalizability theory, or many-facet Rasch models to separate bias; run frame-of-reference training (Myford & Wolfe, 2003; Shavelson & Webb, 1991).
- Feedback fails to translate into improvement: provide coaching sessions, goal setting, and milestone check-ins; emphasize feedforward and specificity (Kluger & DeNisi, 1996; Locke & Latham, 2002).

7. Review Materials Checklist (to be provided by the submitter)
- Job analysis report and competency model
- Instrument blueprint (item pool, scale design, behavioral anchors)
- Expert review and pilot-test reports (with revision logs)
- Administration protocol (sampling strategy, anonymity and minimum-cell rules, training materials)
- Scoring and aggregation rules (missing/anomaly handling, weighting scheme, preregistration documents)
- Reliability and validity reports (by source; including ICC, structural models, criterion validity)
- Fairness analyses (invariance/DIF/adverse-impact monitoring)
- Data governance and privacy compliance documents (alignment with ISO 10667 or equivalent)
- Sample reports and feedback deliverables, coaching and action-plan templates
- Periodic quality-monitoring reports and improvement records

References (APA 7th edition)
- AERA, APA, & NCME. (2014). Standards for educational and psychological testing. American Educational Research Association.
- Atwater, L. E., & Brett, J. F. (2005). Antecedents and consequences of reactions to developmental 360 feedback. Journal of Occupational and Organizational Psychology, 78(3), 473–490.
- Bracken, D. W., Rose, D. S., & Church, A. H. (2016). The evolution and devolution of 360-degree feedback. Industrial and Organizational Psychology, 9(4), 761–794.
- Bracken, D. W., Timmreck, C. W., & Church, A. H. (Eds.). (2001). The handbook of multisource feedback. Jossey-Bass.
- DeNisi, A. S., & Murphy, K. R. (2017). Performance appraisal and performance management: 100 years of progress? Journal of Applied Psychology, 102(3), 421–433.
- Kluger, A. N., & DeNisi, A. (1996). The effects of feedback interventions on performance: A historical review, a meta-analysis, and a preliminary feedback intervention theory. Psychological Bulletin, 119(2), 254–284.
- London, M., & Smither, J. W. (1995). Can multi-source feedback change perceptions of goal accomplishment, self-evaluations, and performance-related outcomes? Personnel Psychology, 48(4), 803–839.
- Myford, C. M., & Wolfe, E. W. (2003). Detecting and measuring rater effects using many-facet Rasch measurement. Journal of Applied Measurement, 4(4), 386–422.
- Scullen, S. E., Mount, M. K., & Goff, M. (2000). Understanding the latent structure of job performance ratings. Journal of Applied Psychology, 85(6), 956–970.
- Shavelson, R. J., & Webb, N. M. (1991). Generalizability theory: A primer. Sage.
- Smith, P. C., & Kendall, L. M. (1963). Retranslation of expectations: An approach to the construction of unambiguous anchors for rating scales. Journal of Applied Psychology, 47(2), 149–155.
- ISO. (2020). ISO 10667-2:2020 Assessment service delivery—Procedures and methods to assess people in work and organizational settings—Part 2: Requirements for service providers.
- Uniform Guidelines on Employee Selection Procedures, 43 Fed. Reg. 38290 (1978).
- Locke, E. A., & Latham, G. P. (2002). Building a practically useful theory of goal setting and task motivation. American Psychologist, 57(9), 705–717.

Usage Notes
- Use the nine dimensions above as the main outline, recording evidence and judgments item by item. In the conclusion, state the overall risk level and the top-priority improvements (no more than 5), and indicate whether the program is viable for its intended use (developmental/administrative) together with any necessary preconditions.

Example 3

Peer Review Guidelines for Manuscript Evaluation

Purpose and scope
- These guidelines standardize high-quality, ethical, and evidence-based peer review of scholarly manuscripts across disciplines. They specify reviewer responsibilities, core evaluation criteria, study-type–specific expectations, reporting standards, and structured reporting formats, aligned with recognized ethical and methodological authorities (e.g., COPE; ICMJE; EQUATOR Network).

1) Ethical principles and reviewer responsibilities
- Confidentiality and data protection: Treat all materials as confidential; do not share, store insecurely, or use for personal research. Do not upload manuscripts or data to third-party tools without explicit journal permission (COPE, 2017; COPE, 2023).
- Competence and scope: Accept reviews only when the manuscript aligns with your expertise and you can meet the deadline; otherwise decline promptly and, if requested, suggest qualified alternatives (COPE, 2017).
- Conflicts of interest: Disclose any financial, professional, personal, or intellectual conflicts that could bias judgment (ICMJE, n.d.; COPE, 2017).
- Impartiality and fairness: Evaluate solely on scholarly merit, avoiding favoritism, hostility, or identity-based bias; adhere to double-anonymous processes when applicable.
- Research integrity vigilance: Flag potential plagiarism, duplicate submission, image manipulation, data fabrication, undeclared conflicts, or ethical issues to the editor confidentially with objective evidence (COPE, 2017).
- Responsible use of generative AI: Do not rely on AI tools to read confidential manuscripts or to generate review content unless explicitly permitted by the journal; you remain accountable for all content (COPE, 2023).
- Timeliness and diligence: Deliver thorough reviews on time; if delays arise, notify the editor promptly.

2) Pre-review readiness checklist
- Fit: The manuscript aligns with your methodological and topical expertise.
- Feasibility: You can review thoroughly within the timeline.
- Independence: No conflicts of interest exist or they are fully disclosed and accepted by the editor.
- Standards familiarity: You can apply relevant reporting guidelines (EQUATOR Network), statistical and methodological standards (e.g., CONSORT, PRISMA 2020, STROBE, STARD, SRQR/COREQ; ASA), and transparency norms (TOP Guidelines; FAIR data).

3) Core evaluation criteria and rating rubric
Use a 5-point scale for each criterion and provide evidence-based justifications and actionable suggestions.

- Originality and contribution
  1 = Trivial or duplicative; 3 = Modest incremental advance; 5 = Clear, novel contribution that shifts understanding or practice.
- Significance and relevance
  1 = Low scholarly or practical value; 3 = Moderate importance; 5 = High theoretical, methodological, or practical impact for the field and journal audience.
- Theoretical framing and literature integration
  1 = Weak or outdated; 3 = Adequate with gaps; 5 = Robust, current, and logically linked to aims and hypotheses.
- Methodological rigor and design appropriateness
  1 = Major flaws or misalignment; 3 = Generally sound with notable limitations; 5 = Design is appropriate, justified, and rigorous for the research question.
- Measurement quality and construct validity
  1 = Poorly defined constructs, unreliable measures; 3 = Adequate validity/reliability; 5 = Strong evidence for validity, reliability, and appropriate operationalization.
- Data analysis and statistical reporting
  1 = Incorrect or opaque analyses; 3 = Mostly appropriate with improvements needed; 5 = Correct, transparent, with effect sizes, uncertainty, and assumptions addressed (ASA; SAMPL; APA JARS).
- Transparency, reproducibility, and openness
  1 = Insufficient detail; 3 = Partial transparency; 5 = Protocol/preregistration as applicable, data/code/materials availability or justified restrictions, adherence to TOP and FAIR principles.
- Ethical and governance compliance
  1 = Unclear or noncompliant; 3 = Partially documented; 5 = Ethics approval/consent documented, risks mitigated, equity considered, data privacy managed.
- Interpretation and limitations
  1 = Overstated or unsupported claims; 3 = Mixed; 5 = Conclusions warranted by evidence, limitations and generalizability transparently discussed, appropriate policy/practice implications.
- Reporting quality and structure
  1 = Unclear, noncompliant with reporting guidelines; 3 = Adequate with gaps; 5 = Clear, coherent, guideline-compliant, replicable descriptions.
- Scholarly apparatus (citations and positioning)
  1 = Incomplete/biased; 3 = Adequate; 5 = Comprehensive, current, balanced use of primary sources.
- Equity, diversity, inclusion, and accessibility
  1 = Unaddressed with potential harm; 3 = Partially considered; 5 = Thoughtfully integrated in design, sampling, measures, analysis, and reporting.

Suggested weighting (modifiable per journal): Contribution/significance (20%), Methods (20%), Analysis/reporting (20%), Transparency/ethics (15%), Interpretation (15%), Writing/scholarship/EDI (10%).

4) Study-type–specific expectations and checklists
Apply general criteria above plus the following study-specific standards.

- Randomized trials and experimental studies
  - Adhere to CONSORT 2010 and extensions (e.g., cluster, noninferiority).
  - Verify randomization sequence generation, allocation concealment, blinding, pre-specified outcomes, sample size calculation, protocol registration, ITT analyses, harms reporting.
  - Assess multiplicity control, missing data handling, and deviations from protocol.

- Observational studies (cohort, case-control, cross-sectional)
  - Apply STROBE.
  - Evaluate exposure/outcome ascertainment, confounding control (design and analysis), measurement bias, missing data, sensitivity/robustness analyses, temporality, and causal claims discipline.

- Diagnostic/prognostic accuracy
  - Apply STARD 2015.
  - Assess spectrum of participants, index test and reference standard definitions, blinding, thresholds, calibration and discrimination metrics, clinical utility.

- Systematic reviews and meta-analyses
  - Apply PRISMA 2020; check protocol registration (e.g., PROSPERO), comprehensive search, dual screening/data extraction, risk-of-bias assessment (e.g., RoB 2; ROBINS-I), synthesis methods, heterogeneity and small-study bias, certainty-of-evidence frameworks (e.g., GRADE), and transparent exclusions.

- Qualitative research
  - Apply SRQR or COREQ as appropriate.
  - Evaluate sampling strategy and rationale, data collection procedures, reflexivity, analytic approach (e.g., thematic analysis, grounded theory) with auditability, trustworthiness (credibility, dependability, transferability), saturation/adequacy.

- Survey research
  - Apply AAPOR transparency standards and, where applicable, APA JARS–Quant.
  - Assess sampling frame, mode and coverage, response/retention rates with standardized definitions, weighting and variance estimation, questionnaire design and pretesting, measurement validity and reliability, nonresponse bias assessment.

- Measurement development and validation
  - Evaluate content validity evidence, dimensionality (EFA/CFA), reliability (omega preferred over alpha when appropriate), measurement invariance, item response theory as appropriate, cross-validation and sample size adequacy, scoring, and interpretability.

- Computational/ML modeling and prediction
  - Assess data provenance, preprocessing, leakage risks, internal/external validation, performance metrics (discrimination, calibration), uncertainty quantification, fairness/equity assessment, robustness/stress testing, complete code and hyperparameter transparency.
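For the discrimination metric named above, rank-based AUC needs no libraries; a minimal sketch (the labels and scores are illustrative):

```python
# Minimal sketch of the Mann-Whitney formulation of AUC: the probability
# that a randomly chosen positive case outscores a randomly chosen
# negative case, with ties counted as half.
def auc(labels, scores):
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(auc(labels, scores))  # 8/9, since one negative outscores one positive
```

A reviewer can use this to sanity-check a reported AUC on a released prediction file; calibration requires a separate check (e.g., comparing predicted vs. observed rates by risk band).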

5) Statistical and analytical reporting expectations
- Report effect sizes with confidence intervals; avoid sole reliance on null-hypothesis significance testing; interpret p-values in context (Wasserstein & Lazar, 2016; Wasserstein, Schirm, & Lazar, 2019).
- Justify model choices; assess assumptions; describe missing data patterns and handling; pre-specify primary analyses where appropriate; address multiplicity.
- Provide analysis code and data dictionaries whenever ethically and legally possible; justify any restrictions (TOP; FAIR; SAMPL; APA JARS).

6) Structuring the review report
- Summary (2–5 sentences): State the study’s aims, methods, main findings, and contribution in neutral terms.
- Major strengths (bulleted): Identify substantive merits (e.g., novelty, rigorous design).
- Major concerns (numbered): For each, (a) state the issue; (b) explain why it matters with reference to standards or evidence; (c) provide concrete, feasible recommendations (e.g., additional analyses, clarifications, revised claims).
- Minor issues (bulleted): Clarity, formatting, small methodological clarifications; avoid line editing unless clarity impedes interpretation.
- Confidential comments to the editor: Suitability for the journal, ethical concerns, overlap with prior work, reviewer confidence and remaining uncertainties.
- Ratings and recommendation: Provide criterion ratings and an overall recommendation with rationale.

7) Decision recommendations and thresholds
- Accept: All criteria strong; only typographic or minimal clarifications needed.
- Minor revision: Core methods sound; limited, addressable concerns not affecting conclusions materially.
- Major revision: Substantial issues in design, analysis, or reporting that could be addressed; conclusions may change.
- Reject: Fundamental flaws (e.g., invalid design for the question, irreparable bias), inadequate contribution, or ethical noncompliance. Provide constructive guidance for potential resubmission elsewhere.

8) Bias mitigation and professional conduct
- Reflect on sources of bias (e.g., topic preferences, institutional reputations, methodological schools).
- Use evidence-based critiques, cite standards (e.g., CONSORT, PRISMA 2020), and avoid prescriptive personal preferences absent justification.
- Maintain a respectful, specific, and solution-focused tone. Do not reveal your identity unless journal policy permits.

9) Common red flags necessitating editorial attention
- Ethical concerns: Lack of IRB/IEC approval where required, inadequate consent, data privacy risks.
- Integrity concerns: Plagiarism, image/data manipulation, impossible results or reused datasets without disclosure.
- Salami slicing or duplicate publication.
- Undisclosed conflicts of interest or funding influence.

10) Reviewer checklist (abbreviated)
- Ethical compliance documented (approval/consent/data use).
- Adherence to appropriate reporting guideline(s) confirmed.
- Design and measures fit the research questions.
- Analyses correct, transparent, and reproducible; effect sizes/CIs reported.
- Claims proportionate to evidence; limitations and generalizability addressed.
- Data/materials/code availability aligned with journal policy or justified.
- Equity and bias considerations addressed (sampling, measures, outcomes).

References (selected)
- American Association for Public Opinion Research. (n.d.). Transparency and standard definitions resources. https://www.aapor.org
- Appelbaum, M., Cooper, H., Kline, R. B., et al. (2018). Journal article reporting standards for quantitative research in psychology. American Psychologist, 73(1), 3–25.
- Bossuyt, P. M., Reitsma, J. B., Bruns, D. E., et al. (2015). STARD 2015: An updated list of essential items for reporting diagnostic accuracy studies. BMJ, 351, h5527.
- Committee on Publication Ethics. (2017). Ethical guidelines for peer reviewers. https://publicationethics.org
- Committee on Publication Ethics. (2023). Position statement on the use of AI tools in research and publication. https://publicationethics.org
- EQUATOR Network. (n.d.). Enhancing the quality and transparency of health research. https://www.equator-network.org
- International Committee of Medical Journal Editors. (n.d.). Recommendations for the conduct, reporting, editing, and publication of scholarly work in medical journals. http://www.icmje.org
- Lang, T. A., & Altman, D. G. (2013). The SAMPL guidelines for statistical reporting. In P. Smart, H. Maisonneuve, & A. Polderman (Eds.), Science Editors’ Handbook. European Association of Science Editors.
- Nosek, B. A., Alter, G., Banks, G. C., et al. (2015). Promoting an open research culture. Science, 348(6242), 1422–1425.
- O’Brien, B. C., Harris, I. B., Beckman, T. J., Reed, D. A., & Cook, D. A. (2014). Standards for reporting qualitative research. Academic Medicine, 89(9), 1245–1251.
- Page, M. J., McKenzie, J. E., Bossuyt, P. M., et al. (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ, 372, n71.
- Schulz, K. F., Altman, D. G., & Moher, D.; CONSORT Group. (2010). CONSORT 2010 statement: Updated guidelines for reporting parallel group randomized trials. BMJ, 340, c332.
- Tong, A., Sainsbury, P., & Craig, J. (2007). Consolidated criteria for reporting qualitative research (COREQ). International Journal for Quality in Health Care, 19(6), 349–357.
- von Elm, E., Altman, D. G., Egger, M., et al. (2007). The STROBE statement. PLoS Medicine, 4(10), e296.
- Wasserstein, R. L., & Lazar, N. A. (2016). The ASA’s statement on p-values. The American Statistician, 70(2), 129–133.
- Wasserstein, R. L., Schirm, A. L., & Lazar, N. A. (2019). Moving to a world beyond “p < 0.05”. The American Statistician, 73(sup1), 1–19.
- Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J., et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, 160018.

Note on citation style
- Adapt citation style to the journal’s specified format (e.g., APA, Vancouver). Where study-type–specific guidelines are cited (e.g., CONSORT, PRISMA), include the canonical statement and any relevant extensions in the reference list per the journal’s instructions.

Target Users

University instructors and academic affairs teams

Quickly generate review guidelines for coursework, lab reports, and theses, with explicit scoring dimensions and evidence requirements; bilingual versions support teaching and TA training, improving fairness and comparability.

HR and line managers

Establish shared language, rating scales, and examples for performance appraisal and 360 feedback, guiding colleagues to give fact-based, constructive comments; reduce communication disputes and build reusable templates.

Researchers and academic journal editors

Generate standardized guidance for manuscript or grant review, including citation-style notes and checklists for ethics and reproducibility; reduce bias risk and improve reviewer consistency.

Survey and user research teams

Build peer-check workflows for questionnaires and usability tests, with item-quality criteria, interview-note review points, and coding rules; keep data reliable and insights traceable.

Training providers and corporate universities

Set up peer assessment for assignments and hands-on projects, with one-click rubrics and feedback templates; shorten onboarding time and stabilize teaching quality.

Healthcare and nursing quality-control teams

Define review points and risk alerts for case debriefs and QC meetings, recording evidence and improvement suggestions in structured tables; unify language and reduce omissions.

Problems Solved

- From an assessment-expert perspective, generate a ready-to-use peer review guideline for a specified task type (e.g., course papers, classroom assignments, corporate performance, user research questionnaires, proposal review, code review) within minutes.
- Standardize review criteria and language, markedly improving consistency and transparency, reducing subjective bias and arbitrariness, and strengthening traceability and comparability.
- Support multilingual output and mainstream academic citation styles, fitting international teaching and cross-team collaboration and meeting the standardization needs of departments, training providers, and research institutions.
- Produce structured content elements (review goals and scope, scoring dimensions with descriptors, evidence and citation requirements, sample comments, common-bias reminders, quality checklists) so reviewers can execute from a checklist, with fast adoption and high reusability.
- Lower the cost of building guidelines and training for managers, compress drafting cycles, improve the review experience and organizational credibility, and support scaling from trial to full deployment.

Feature Summary

Tailors peer review guidelines to different task types, generating standards, processes, and scoring points in one step
Automatically organizes review dimensions and evidence requirements, aligning team expectations and markedly reducing subjective bias
Supports multiple languages and academic writing styles, producing review text usable directly in international projects and publications
Provides discipline-specific citation examples and prompts, lowering compliance and formatting rework and avoiding missed details
Offers actionable feedback phrasing and templates for the task context, improving review communication and feedback usability
Builds in fact-checking and risk prompts to prevent over-inference, emotive language, and unwarranted conclusions
Differentiates guideline versions by audience role, matching instructors, managers, review committees, and other scenarios
Supplies rating scales, evidence checklists, and common-error libraries to help teams adopt quickly and stay consistent
Fits educational testing, performance appraisal, and survey research, with one method covering review needs across scenarios
Outputs structured goals, steps, examples, and checklists, ready to use, cutting training and communication costs

How to Use a Purchased Prompt Template

1. Use it directly in an external chat app

Copy the prompt generated by the template into your usual chat app (e.g., ChatGPT, Claude) and start the conversation directly, with no extra development. Suited to quick personal trials and lightweight use.

2. Publish it as an API endpoint

Turn the prompt template into an API: your program can modify the template parameters freely and call it through the endpoint, enabling automation and batch processing. Suited to developer integration and embedding in business systems.

3. Configure it in an MCP client

Configure the corresponding server address in an MCP client so your AI application calls the prompt template automatically. Suited to advanced users and team collaboration, letting prompts move seamlessly between AI tools.

¥15.00
The platform offers a free trial so you can confirm the results meet expectations before paying.

What You Get After Purchase

Full prompt template
- 228 tokens
- 2 adjustable parameters: { 任务类型 } { 输出语言 }
Automatically added to "My Prompt Library"
- Prompt optimizer support
- Versioned management
Community-shared application cases
Free for a limited time
