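Thesis statement: This performance task uses decision-making and argumentation in an authentic context to assess seventh-grade students' understanding and application of ratios (including unit rates) and percentages (including discounts, percent increase and decrease, and unit-cost comparison). The task design follows the principles of authenticity, alignment of cognitive demand with learning goals, scorability, and fairness. It is paired with an analytic rubric and administration procedures to improve scoring reliability and support the validity argument (AERA, APA, & NCME, 2014; Pellegrino, Chudowsky, & Glaser, 2001; Brookhart, 2013; NCTM, 2014). Content and cognitive demand align with CCSS-M 7.RP.A.1–3 and 7.EE.3 (Common Core State Standards Initiative, 2010) and reach DOK levels 2–3 (Webb, 2002).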

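I. Performance task: Hiking supply purchasing and justification

Scenario: You are the "supplies coordinator" for the grade-level hiking trip. You must provide drinking water and snacks for 140 participants (students and teachers), meeting the requirements while controlling cost, and use ratios and percentages to compare options and make decisions.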

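Given constraints

Headcount and requirements:
- 140 people.
- Each person needs at least 600 mL of drinking water and 1 bag of snacks.
Budget cap: 900 yuan.
Suppliers and promotions (one order may be placed with each supplier; shipping and discounts are calculated only under that supplier's rules for that order):
1. Drinking water
  - Supplier A: 500 mL per bottle, 12 bottles per case (6 L per case), list price 24 yuan per case; 20% off for orders of 5 or more cases; shipping 15 yuan per order.
  - Supplier B: 600 mL per bottle, 8 bottles per case (4.8 L per case), list price 20 yuan per case; "buy two, get the third case half price" (stackable, applied per group of 3 cases); shipping 5 yuan per case, capped at 25 yuan per order.
2. Snacks (separate from the water supplier; billed and shipped separately)
  - Option X: mixed nuts, 40 g per bag, 10 bags per pack, 30 yuan per pack; 10 yuan off a single order of 120 yuan or more (applied once); free shipping.
  - Option Y: mixed nuts, 50 g per bag, 8 bags per pack, 32 yuan per pack; 5% off for orders of 4 or more packs; free shipping.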

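Student assignment requirements (deliverables)

Plan and calculations:
1. Use ratios and unit rates to compare the unit cost of the two water suppliers: give the price per liter before and after discount for each, and calculate the actual price per liter including shipping when the requirement is met.
2. Choose a specific purchase combination of water and snacks (numbers of cases and packs) that satisfies the quantity, budget, and promotion rules; give the total cost, the surplus (extra units or volume), and the cost per person.
3. Provide at least one alternative plan and calculate the relative percent savings (relative to your optimal plan or relative to list price).
Argumentation and communication: Use ratios and percentages to argue why your plan is better (e.g., unit rates, discount effects, the effect of shipping on unit cost), and explain the key decisions.
Percent extension: If Supplier A raises its list price by 5% next month (discount and shipping rules unchanged), re-estimate the change in total spending on water only, both as an amount and as a percentage (from the excluding-shipping and the including-shipping perspectives), and explain the reason.
Representation: Provide clear ratio tables, calculation steps, unit labels, and (if needed) simple diagrams or tables.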

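Recommended administration conditions

Time: 1 class period (45–50 minutes) for the first draft; an additional 20 minutes may be given for revision and peer review.
Tools: Calculators, rulers, and scratch paper are allowed.
Response format: Completed independently; an optional small-group oral check of key points may take place in the last 5 minutes.
Accessibility and fairness: Provide equivalent data presentations (text and a concise table) and allow a large-print version; keep the wording of instructions unambiguous; allow students with calculation difficulties to dictate their reasoning for the teacher to record (scoring is based on evidence and the criteria).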

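II. Presumed correct solution and key points (for scoring reference)

Key intermediate quantities

Requirements: total water ≥ 140 × 0.6 L = 84 L; snacks ≥ 140 bags.
Water unit rates (excluding shipping)
- A: 6 L per case at 24 yuan per case; after discount (≥5 cases) each case costs 24 × 0.8 = 19.2 yuan → 19.2/6 = 3.20 yuan/L; before discount, 4.00 yuan/L.
- B: 4.8 L per case, in groups of three cases: 20 + 20 + 10 = 50 yuan, averaging 16.666… yuan per case → 16.666…/4.8 ≈ 3.4722 yuan/L; before discount, 20/4.8 ≈ 4.1667 yuan/L.
Minimum number of cases to meet the water requirement
- A: 84 ÷ 6 = 14 cases (exactly meets the requirement; ≥5 cases, so the 20% discount applies).
- B: 84 ÷ 4.8 = 17.5 → 18 cases needed (six "buy two, third half price" groups).
Total water cost (including shipping)
- A: water 14 × 19.2 = 268.8 yuan; shipping 15 yuan → total 283.8 yuan; unit cost including shipping = 283.8 ÷ 84 ≈ 3.379 yuan/L.
- B: water 6 groups × 50 = 300 yuan; shipping capped at 25 yuan → total 325 yuan; unit cost including shipping = 325 ÷ 86.4 ≈ 3.762 yuan/L.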
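The water figures above can be cross-checked with a minimal Python sketch. The function names (`ceil_div`, `water_cost_a`, `water_cost_b`) are illustrative and not part of the task materials; the pricing rules are encoded as stated in the given constraints, with any cases beyond complete groups of three at Supplier B assumed to be charged at list price.

```python
def ceil_div(a: int, b: int) -> int:
    """Smallest integer >= a / b, using integer arithmetic only."""
    return -(-a // b)

NEED_ML = 140 * 600  # 140 people x 600 mL each = 84,000 mL

def water_cost_a(cases: int) -> float:
    """Supplier A: 24 yuan/case, 20% off from 5 cases, flat 15 yuan shipping."""
    case_price = 24.0 * 0.8 if cases >= 5 else 24.0
    return cases * case_price + 15.0

def water_cost_b(cases: int) -> float:
    """Supplier B: 20 yuan/case, every third case half price,
    shipping 5 yuan/case capped at 25 yuan per order."""
    groups, leftover = divmod(cases, 3)
    goods = groups * (20.0 + 20.0 + 10.0) + leftover * 20.0  # leftover at list price (assumption)
    return goods + min(5.0 * cases, 25.0)

cases_a = ceil_div(NEED_ML, 6000)  # 6 L per case
cases_b = ceil_div(NEED_ML, 4800)  # 4.8 L per case
total_a = water_cost_a(cases_a)
total_b = water_cost_b(cases_b)
per_litre_a = total_a / (cases_a * 6.0)
per_litre_b = total_b / (cases_b * 4.8)

print(f"A: {cases_a} cases, {total_a:.2f} yuan, {per_litre_a:.3f} yuan/L")
print(f"B: {cases_b} cases, {total_b:.2f} yuan, {per_litre_b:.3f} yuan/L")
```

Working in millilitres keeps the case counts exact, so rounding up to whole cases does not depend on floating-point behavior.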
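Snacks
- X: 10 bags per pack, 14 packs needed; list price 14 × 30 = 420 yuan; 10 yuan off once for an order of 120 yuan or more → 410 yuan; per person 410 ÷ 140 ≈ 2.929 yuan.
- Y: 8 bags per pack, 18 packs needed; list price 18 × 32 = 576 yuan; 5% off for 4 or more packs → 576 × 0.95 = 547.2 yuan.
Overall optimal plan (presumed)
- Water: 14 cases from A, 283.8 yuan; snacks: 14 packs of X, 410 yuan; total 693.8 yuan; per person 693.8 ÷ 140 ≈ 4.956 yuan.
- Alternative (example): water: 18 cases from B, 325 yuan; snacks still X, 410 yuan; total 735 yuan.
- Relative percent savings (optimal relative to alternative): (735 − 693.8) ÷ 735 ≈ 5.6%.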
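The snack totals and the savings comparison can be checked the same way. This is a sketch under stated assumptions: the helper names are hypothetical, and the water totals (283.8 and 325.0 yuan) are copied from the reference solution rather than recomputed.

```python
def ceil_div(a: int, b: int) -> int:
    """Smallest integer >= a / b, using integer arithmetic only."""
    return -(-a // b)

PEOPLE = 140
BUDGET = 900.0

def snack_cost_x(packs: int) -> float:
    """Option X: 30 yuan/pack, 10 yuan off once when the order reaches 120 yuan."""
    list_total = packs * 30.0
    return list_total - 10.0 if list_total >= 120.0 else list_total

def snack_cost_y(packs: int) -> float:
    """Option Y: 32 yuan/pack, 5% off from 4 packs."""
    list_total = packs * 32.0
    return list_total * 0.95 if packs >= 4 else list_total

packs_x = ceil_div(PEOPLE, 10)  # 10 bags per pack
packs_y = ceil_div(PEOPLE, 8)   # 8 bags per pack
cost_x = snack_cost_x(packs_x)
cost_y = snack_cost_y(packs_y)

WATER_A, WATER_B = 283.8, 325.0  # water totals incl. shipping, from the reference solution
best_total = WATER_A + cost_x
alt_total = WATER_B + cost_x
saving_pct = (alt_total - best_total) / alt_total * 100

print(f"X: {packs_x} packs, {cost_x:.2f} yuan; Y: {packs_y} packs, {cost_y:.2f} yuan")
print(f"best {best_total:.2f}, alternative {alt_total:.2f}, saving {saving_pct:.1f}%")
print("within budget:", best_total <= BUDGET)
```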
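Percent extension (A raises its price by 5%)
- New discounted case price: 24 × 1.05 × 0.8 = 20.16 yuan per case; water cost = 14 × 20.16 = 282.24 yuan.
- Total including shipping: 282.24 + 15 = 297.24 yuan; compared with the original 283.8 yuan, an increase of 13.44 yuan.
- Relative change in water cost excluding shipping: 282.24 ÷ 268.8 = 1.05 (+5%); relative change including shipping: 13.44 ÷ 283.8 ≈ 4.74% (the fixed shipping fee dilutes the percent increase).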
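A short sketch of the percent extension follows, assuming the 14-case order from Supplier A. It also confirms that applying the 5% increase and the 20% discount in either order gives the same case price. The variable names are illustrative.

```python
CASES = 14
SHIPPING = 15.0
DISCOUNT = 0.8    # 20% off at 5 or more cases
INCREASE = 1.05   # 5% list-price increase

def goods_cost(list_price: float) -> float:
    """Discounted cost of the water itself, excluding shipping."""
    return CASES * list_price * DISCOUNT

old_goods = goods_cost(24.0)
new_goods = goods_cost(24.0 * INCREASE)
delta = new_goods - old_goods

pct_excl_shipping = delta / old_goods * 100
pct_incl_shipping = delta / (old_goods + SHIPPING) * 100

# Multiplication commutes: raise-then-discount equals discount-then-raise.
raise_then_discount = 24.0 * INCREASE * DISCOUNT
discount_then_raise = 24.0 * DISCOUNT * INCREASE

print(f"increase: {delta:.2f} yuan")
print(f"excluding shipping: +{pct_excl_shipping:.2f}%")
print(f"including shipping: +{pct_incl_shipping:.2f}%")
```

The absolute increase is identical in both views; only the base changes, which is why the percentage including shipping is smaller.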

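III. Assessment targets and standards alignment

Observable performances
- Model and compare options using ratios and unit rates (7.RP.A.1–2).
- Use percentages to handle discounts, capped shipping, price increases, and relative savings (7.RP.A.3).
- Optimize quantities under real-world constraints and check reasonableness (7.EE.3; DOK 3).
- Communicate decisions with appropriate representations and arguments (NCTM process standards).
Basis for alignment
- The task requires unit rates, proportional relationships, percent applications, and multi-step computation, consistent with the core seventh-grade ratio and percent content (CCSS-M 7.RP; 7.EE) and the process standards (NCTM, 2014).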

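IV. Analytic rubric (16 points total, 4 points per criterion)

Mathematical accuracy (ratio and percent calculations)
- 4 points: Ratio, unit-rate, discount, and percent-change calculations are error-free; units and rounding are reasonable; key intermediate quantities are correct and cross-checked.
- 3 points: A very small number of non-substantive minor errors that do not affect the main conclusions.
- 2 points: One or more substantive errors, but some correct ratio or percent reasoning is shown.
- 1 point: Multiple errors; the scattered correct points are insufficient to support the conclusion.
Strategy and modeling
- 4 points: Selects and justifies an appropriate model (ratio table, unit rate, equivalent ratios, or equation) and accounts for the interaction of shipping and discounts and for the whole-case minimum constraint.
- 3 points: Model is basically appropriate, but interaction effects or boundary conditions are not fully discussed.
- 2 points: Model choice is limited or confused, but includes some task-relevant attempts.
- 1 point: No effective model.
Argumentation and comparison
- 4 points: Compares at least two plans with quantitative evidence and explains strengths, weaknesses, and sensitivity (e.g., fixed shipping dilutes the percent increase); the argument is clear and coherent.
- 3 points: Comparison and reasons are present, but the argument is incomplete or lacks a key explanation.
- 2 points: A conclusion is present, but the evidence is weak or incoherent.
- 1 point: No effective argument.
Representation and communication
- 4 points: Tables and steps are clear, units are complete, and diagrams or structured presentation aid understanding; written expression is correct.
- 3 points: Generally clear, with occasional omissions.
- 2 points: Presentation hinders understanding.
- 1 point: Difficult to interpret.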

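V. Administration and quality assurance recommendations

Scoring and reliability
- Develop a scoring guide and anchor papers (high, medium, and low sample responses) and run rater calibration; double-score at least 20% of papers, targeting an agreement coefficient or ICC ≥ 0.75; resolve discrepancies of more than one score level through discussion (Brookhart, 2013; AERA et al., 2014).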
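The text does not name a specific agreement coefficient. As one possible choice for ordinal rubric scores, the sketch below computes a quadratic weighted kappa for two raters on the 1–4 scale and lists the papers whose scores differ by more than one level. The function names and sample ratings are hypothetical, and the statistic itself is an assumption rather than the method prescribed here.

```python
def quadratic_weighted_kappa(r1, r2, levels=(1, 2, 3, 4)):
    """Quadratic weighted kappa for two raters scoring the same papers."""
    k = len(levels)
    idx = {lv: i for i, lv in enumerate(levels)}
    n = len(r1)
    # Observed joint proportions of (rater 1 level, rater 2 level).
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(r1, r2):
        obs[idx[a]][idx[b]] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2   # quadratic disagreement weight
            num += w * obs[i][j]              # observed weighted disagreement
            den += w * row[i] * col[j]        # disagreement expected by chance
    return 1.0 - num / den  # undefined if both raters give one identical constant score

def needs_adjudication(r1, r2, max_gap=1):
    """Indices of papers whose two scores differ by more than max_gap levels."""
    return [i for i, (a, b) in enumerate(zip(r1, r2)) if abs(a - b) > max_gap]

rater_1 = [1, 2, 3, 4, 4, 2]   # made-up ratings for illustration
rater_2 = [2, 2, 3, 4, 2, 2]
kappa = quadratic_weighted_kappa(rater_1, rater_2)
print(f"weighted kappa = {kappa:.3f}, adjudicate papers {needs_adjudication(rater_1, rater_2)}")
```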
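Validity evidence
- Content: A blueprint maps standards to task elements, covering unit rates, discounts, percent change, and optimization.
- Process/internal structure: Examine correlations among criterion scores and their discrimination to confirm that each rubric dimension functions as intended.
- External: Validate through correlations with related classroom quizzes and unit tests.
- Consequences: Check whether the task promotes proportional reasoning rather than formula-plugging, and monitor fairness across student groups (Messick, 1995; AERA et al., 2014).
Fairness and accessibility
- State discount rules in plain, consistent language; treat different solution paths (ratio table, equation, unit rate) equally, awarding credit whenever the evidence is sufficient (NCTM, 2014).
Cognitive demand
- Some parts of the task are DOK 2 (unit-rate and discount calculations); the integrated selection and argumentation are DOK 3 (Webb, 2002).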

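VI. Common errors and diagnostic prompts (for formative feedback)

Misreading "third case half price" as "everything half price" or "10% off for every three cases".
Ignoring the whole-case and minimum-volume constraints, producing an impossible solution such as "exactly 84 L" with too few cases.
Comparing only discounted unit prices while ignoring the effect of shipping on unit cost and percent change.
Assuming that "raise the price 5%, then take 20% off" and "take 20% off, then raise the price 5%" give different results. Emphasize that the multipliers commute, so the case price is 24 × 1.05 × 0.8 = 24 × 0.8 × 1.05 = 20.16 yuan either way, whereas the fixed shipping fee does change the overall percent change.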

References (APA 7th edition)

AERA, APA, & NCME. (2014). Standards for educational and psychological testing. American Educational Research Association.
Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education, 5(1), 7–74.
Brookhart, S. M. (2013). How to create and use rubrics for formative assessment and grading. ASCD.
Common Core State Standards Initiative. (2010). Common Core State Standards for Mathematics.
Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons’ responses and performances. American Psychologist, 50(9), 741–749.
National Council of Teachers of Mathematics (NCTM). (2014). Principles to actions: Ensuring mathematical success for all.
Pellegrino, J. W., Chudowsky, N., & Glaser, R. (Eds.). (2001). Knowing what students know: The science and design of educational assessment. National Academies Press.
Webb, N. L. (2002). Depth-of-knowledge levels for four content areas. Wisconsin Center for Education Research.

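Note

All data and calculation settings are for instructional use and fall within the cognitive load and curriculum-standard scope of seventh-grade ratios and percentages; the values have been rechecked for internal consistency and comparability.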

Purpose and Alignment

This performance task evaluates students' end-to-end engineering competence in an authentic context within a software engineering capstone project: requirements engineering; architecture and design; implementation and code quality; verification and validation; DevOps and maintainability; security compliance and risk management; team process; and professional communication. It aligns with ABET student outcomes (problem solving, design, communication, teamwork, ethics and professional responsibility) and with the core competencies of the ACM/IEEE SE2014 curriculum guidelines (requirements, design, construction, verification, process, quality), and it maps quality attributes to the ISO/IEC 25010 framework. The design follows contemporary validity frameworks to support content coverage and defensible score interpretations.

Task Scenario and Objectives

A team of 3–5 students has 8–10 weeks to deliver a cloud-native, deployable software product for a real or realistic client (e.g., a microservice-based campus event management system or a system of equivalent complexity). The minimum viable product (MVP) must be demonstrable and operable in a controlled runtime environment, satisfy explicit stakeholder requirements and quality attribute targets (ISO/IEC 25010), and implement baseline security controls (OWASP ASVS Level 2).

Evidence and Deliverables

1. Requirements package: SRS per ISO/IEC/IEEE 29148; use cases/user stories with acceptance criteria; quality attribute scenarios (response time, reliability, security, maintainability, etc.).
2. Architecture and design: architecture views and architecture decision records (ADRs); trade-off analyses for key design decisions; interface contracts and data models.
3. Code and configuration: repository with complete Git history following the agreed branching strategy and commit conventions; Infrastructure-as-Code (IaC) scripts; key module documentation and static analysis reports.
4. Test assets: test strategy and plan (per ISO/IEC/IEEE 29119); automated unit/integration/end-to-end tests; coverage and defect reports; performance and security test reports.
5. Operations and release: reproducible CI/CD pipeline; deployable artifacts (container images, manifests); runbook and draft SLA; demonstrable monitoring and alert dashboards.
6. Security and compliance: threat model (STRIDE or equivalent); dependency (SCA), container image, and static analysis (SAST) scan results; ASVS gap analysis and remediation items.
7. Demo and communication: 15-minute product demo and technical defense; concise end-user guide including accessibility notes.

Quantifiable, Non-negotiable Minimums

- Build and test: main-branch CI pass rate ≥90%; automated test coverage ≥70% on critical paths; no open P1 defects.
- Quality and security: zero high/critical static analysis issues; zero high/critical dependency (SCA) or container image vulnerabilities; ASVS L2 authentication and session management controls implemented; rate limiting and input validation on critical APIs.
- Operability: one-command deployment succeeds in the designated environment; p95 latency for key operations <300 ms during the demo (or a justified performance target is achieved).
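The numeric gates lend themselves to an automated check. The sketch below is illustrative only: the `Evidence` fields, the nearest-rank p95 definition, and the sample values are assumptions, and the qualitative gates (ASVS L2 controls, rate limiting, input validation) would still need reviewer judgment.

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    ci_pass_rate: float            # fraction of main-branch CI runs that passed
    critical_path_coverage: float  # fraction of critical-path code covered by tests
    open_p1_defects: int
    high_static_issues: int        # high/critical static analysis findings
    high_vulnerabilities: int      # high/critical dependency or image vulnerabilities
    one_command_deploy_ok: bool
    latencies_ms: list[float]      # latency samples for key operations during the demo

def p95(samples):
    """95th percentile by the nearest-rank method (one of several definitions)."""
    ordered = sorted(samples)
    rank = -(-95 * len(ordered) // 100)  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]

def failed_gates(e: Evidence, latency_target_ms: float = 300.0) -> list[str]:
    """Return the names of the quantitative gates that are not met."""
    checks = {
        "ci_pass_rate": e.ci_pass_rate >= 0.90,
        "critical_path_coverage": e.critical_path_coverage >= 0.70,
        "open_p1_defects": e.open_p1_defects == 0,
        "static_analysis": e.high_static_issues == 0,
        "vulnerabilities": e.high_vulnerabilities == 0,
        "one_command_deploy": e.one_command_deploy_ok,
        "p95_latency": p95(e.latencies_ms) < latency_target_ms,
    }
    return [name for name, ok in checks.items() if not ok]

sample = Evidence(0.93, 0.74, 0, 0, 0, True, [float(x) for x in range(10, 210, 10)])
print("failed gates:", failed_gates(sample))
```

The `latency_target_ms` parameter leaves room for the "justified performance target" alternative that the gate allows.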

Analytic Rubric and Weights (100%)

Four levels: Exemplary/Proficient/Developing/Insufficient. The descriptors below are illustrative highlights.
1. Requirements engineering 15%: complete traceability matrix; verifiable quality attribute scenarios; disciplined change baseline management (per 29148).
2. Architecture and design 20%: architecture decisions backed by evidence-based trade-offs; interface stability and controlled coupling; clear mapping to ISO 25010 attributes.
3. Implementation and code quality 20%: readability, modularity, and controlled complexity; conformance to static analysis and coding conventions; strong critical-path coverage and low defect density.
4. Verification and validation 15%: well-shaped test pyramid; risk-based performance and security test design; managed defect lifecycle (per 29119).
5. DevOps and maintainability 10%: reproducible CI/CD with rollback and canary release; effective monitoring metrics (error rate, latency, SLOs) and alerting.
6. Security and compliance 10%: threat model covering misuse cases; ASVS L2 controls implemented; dependency governance and secret management.
7. Team process and project management 5%: iteration plans consistent with burndown; clear work item granularity and definition of done (DoD); consistent peer assessments.
8. Professional communication and documentation 5%: SRS, design, test, and operations documents are mutually consistent and version-controlled; the demo is effective for the target stakeholders.
Level anchors (excerpt): Exemplary = systematic evidence and metrics, with reproducible rationale for decisions; Proficient = most standards met, with deviations well justified; Developing = notable gaps but basic operability; Insufficient = below the minimums or missing critical evidence.

Scoring Procedures and Standard Setting

- Analytic rubric scored as a weighted sum, combined with non-negotiable gates (see the minimums above) to ensure sufficiency of evidence.
- Absolute standard setting: modified Angoff combined with borderline-group calibration, anchored with exemplars; recommended pass mark ≥70/100 with all gates met.
- Rater reliability: double marking with adjudication; target ICC ≥0.75; pre-scoring calibration session using benchmark samples, plus moderation of a 10% sample of projects during scoring.
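A minimal sketch of the weighted-sum-plus-gates decision is shown below. The rubric weights come from the rubric above, but the document does not say how the four levels convert to points, so the `LEVEL_PERCENT` mapping is purely an assumption for illustration, as are the dimension keys and function names.

```python
WEIGHTS = {
    "requirements": 15, "architecture_design": 20, "implementation": 20,
    "verification_validation": 15, "devops": 10, "security": 10,
    "team_process": 5, "communication": 5,
}
# Assumption: share of a dimension's weight earned at each level (not given in the source).
LEVEL_PERCENT = {"Exemplary": 100, "Proficient": 80, "Developing": 60, "Insufficient": 30}
PASS_MARK = 70

def weighted_total(levels: dict[str, str]) -> float:
    """Weighted score out of 100 from one level per rubric dimension."""
    return sum(WEIGHTS[d] * LEVEL_PERCENT[levels[d]] for d in WEIGHTS) / 100

def decision(levels: dict[str, str], all_gates_met: bool) -> str:
    """Pass only if every hard gate is met and the weighted total reaches the pass mark."""
    return "pass" if all_gates_met and weighted_total(levels) >= PASS_MARK else "fail"

team = {d: "Developing" for d in WEIGHTS}
team["architecture_design"] = "Exemplary"
team["implementation"] = "Exemplary"
print(weighted_total(team), decision(team, all_gates_met=True))
```

Keeping the gates as a separate boolean mirrors the intent that a high weighted score cannot compensate for a missed minimum.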

Administration and Evidence Collection

- Milestones: week 2 requirements review; week 4 architecture review; week 6 mid-term integration demo; weeks 8–10 final delivery and defense.
- Evidence triangulation: repository analytics (commits, reviews, issue links), operability demo, and document consistency checks.
- Tool boundaries: generative tools are allowed, provided their use and the human verification are disclosed in ADRs or commit messages; code with unknown or incompatible licenses is prohibited.

Validity, Fairness, Academic Integrity

- Construct validity: indicators cover the key ISO 25010 quality characteristics and the SE2014 competency areas, avoiding construct underrepresentation (e.g., using lines of code alone as a proxy for quality).
- Score-interpretation validity: rubric anchors are tied to observable evidence and objective metrics (coverage, defect density, latency, SLO attainment).
- Fairness and accessibility: equivalent evidence pathways for different technology stacks; accessibility guidance; reasonable accommodations.
- Academic integrity: similarity and dependency provenance checks; analysis of anomalous commit patterns; oral examination to verify individual contribution; peer ratings as corroborating evidence.

Formative Feedback and Summative Reporting

- Formative: targeted improvement guidance after each review, emphasizing high-risk gaps (performance, security, operability).
- Summative: weighted score, level per dimension, and evidence excerpts, with improvement suggestions and links to benchmark exemplars.

Rationale and Evidence Base

Authentic performance assessment captures integrated competencies more effectively, but its scorability and reliability depend on multi-source evidence, explicit rubrics, and rater calibration. International standards (ISO/IEC 25010, 29148, 29119, 12207) and a security baseline (OWASP ASVS) anchor content validity and industry alignment. CI/CD and observability metrics, drawing on engineering performance research, provide measurable evidence that supports judgment and reduces scoring subjectivity.

References

[1] ISO/IEC 25010:2011, Systems and software engineering—Systems and software Quality Requirements and Evaluation (SQuaRE)—System and software quality models.
[2] ISO/IEC/IEEE 29148:2018, Systems and software engineering—Life cycle processes—Requirements engineering.
[3] ISO/IEC/IEEE 29119-3:2013, Software and systems engineering—Software testing—Part 3: Test documentation.
[4] ISO/IEC/IEEE 12207:2017, Systems and software engineering—Software life cycle processes.
[5] OWASP Foundation, OWASP Application Security Verification Standard (ASVS) 4.0.3, 2021.
[6] ABET Engineering Accreditation Commission, Criteria for Accrediting Engineering Programs, 2024–2025.
[7] ACM/IEEE-CS, Software Engineering 2014: Curriculum Guidelines for Undergraduate Degree Programs in Software Engineering, 2014.
[8] N. Forsgren, J. Humble, and G. Kim, Accelerate: The Science of Lean Software and DevOps. IT Revolution, 2018.
[9] S. Messick, “Validity of psychological assessment: Validation of inferences from persons’ responses and performances,” American Psychologist, vol. 50, no. 9, pp. 741–749, 1995.
[10] M. T. Kane, “Validating the interpretations and uses of test scores,” Journal of Educational Measurement, vol. 50, no. 1, pp. 1–73, 2013.
[11] T. K. Koo and M. Y. Li, “A guideline of selecting and reporting intraclass correlation coefficients for reliability research,” Journal of Chiropractic Medicine, vol. 15, no. 2, pp. 155–163, 2016.
[12] N. Falchikov and J. Goldfinch, “Student peer assessment in higher education: A meta-analysis comparing peer and teacher marks,” Review of Educational Research, vol. 70, no. 3, pp. 287–322, 2000.

Thesis: The following performance assessment specifies an authentic, job-relevant task that elicits the competencies required to plan and conduct a customer success renewal negotiation. It integrates an analytic rubric, multiple evidence sources, and standardized administration to support validity and reliability inferences consistent with recognized measurement standards.

Target Competencies and Evidence Indicators

Negotiation planning and strategy: Articulates issues, interests, BATNA, ZOPA, trade-offs, and concession strategy; anticipates counterpart interests and likely tactics (Fisher, Ury, & Patton, 2011; Malhotra & Bazerman, 2007).
Discovery and relationship behaviors: Uses inquiry, active listening, and stakeholder alignment to uncover interests and constraints; manages tone and trust (Curhan, Elfenbein, & Xu, 2006).
Value communication and business case: Links product usage and success metrics to outcomes; uses ROI logic and relevant data without misrepresentation.
Integrative problem solving: Constructs multi-issue packages, proposes mutually beneficial trades, and addresses objections constructively (Fisher et al., 2011).
Ethical conduct and professionalism: Avoids deceptive claims; practices accurate, fair representation of value and limitations.
Closing and follow-up: Gains clear commitments or next steps; documents agreements and open issues precisely.

Task Overview (Three-Part Performance)

Context: You are the Customer Success Manager at a B2B SaaS analytics firm. A strategic customer’s annual contract is up for renewal in 90 days amidst budget pressure, a change in executive sponsor, a key feature gap, and a competitor’s discounted offer. Data indicate mixed value realization (some quantified gains, declining usage in one business unit).
Part A—Preparation (60 minutes, independent): Produce a written negotiation plan and value analysis.
Part B—Live negotiation role-play (25 minutes, synchronous): Conduct a renewal negotiation with a trained role-player acting as the customer’s VP Operations and a procurement manager. Session is video recorded.
Part C—Follow-up (30 minutes, independent): Draft a customer-facing follow-up email and an updated, multi-option proposal.

Materials Provided to the Candidate

Data pack: Current contract terms; product usage and adoption trend charts; support history; value metrics from two case periods; org chart and stakeholder notes; forecasted budget constraints; competitor’s indicative offer (price 20–25% lower; added module); open product gap and roadmap note; risk register; NPS comments.
Templates: Negotiation plan (issues, interests, BATNA/ZOPA, concessions), stakeholder map, ROI calculator (with editable assumptions).
Instructions: Assessment objectives, time limits, deliverables, scoring criteria, allowed resources (calculator, templates), and rules (no external internet searches; data must come from the pack).

Role-Player Brief (Standardized)

Interests: Lower total cost of ownership, risk mitigation on feature gap, committed support response, executive-level value proof for budget committee.
Constraints: Fiscal-year budget cut; procurement compliance; willingness to consider multi-year if value and protections are offered.
Alternatives: A competitor quoting 22% less on list price, with a 12-month term and a pilot for the missing feature.
Behavioral script: Probing questions on ROI assumptions; potential objection escalation; willingness to consider performance-based terms. Role-players are trained to follow branching prompts to ensure comparable difficulty across administrations.

Candidate Deliverables

Part A: Negotiation plan (max 2 pages); stakeholder map; ROI/value realization analysis (1 page or spreadsheet).
Part B: Live negotiation performance (recorded).
Part C: Follow-up email (≤400 words) and a three-option proposal (e.g., 1-year at current scope; 2-year with value-add and price protection; 1-year with performance holdback tied to feature delivery).

Conditions of Assessment

Time limits as above; individual work; no generative writing tools; calculator permitted; all quantitative claims must be traceable to the data pack.
Human raters score from recordings and artifacts; candidate identity is masked for scoring.

Analytic Rubric with Behaviorally Anchored Levels (4-point scale)

Scoring levels: 4 = Exemplary; 3 = Proficient; 2 = Developing; 1 = Insufficient. Weightings yield a total of 100 points.

Criterion 1: Preparation and Value Analysis (20%)

4: Identifies all material issues/interests; states realistic BATNA/ZOPA; quantifies value with warranted assumptions and sensitivity checks; accurate risk assessment.
3: Identifies most key issues; reasonable BATNA/ZOPA; sound value estimate with minor gaps; basic risk analysis.
2: Issues partially identified; BATNA/ZOPA vague or inconsistent; value case underdeveloped or contains unexamined assumptions.
1: Missing or erroneous analysis; unsupported or misleading value claims.

Criterion 2: Negotiation Strategy and Ethical Conduct (15%)

4: Coherent, interest-based strategy; planned concession trades with clear conditions; no deceptive tactics; transparent on limitations.
3: Strategy mostly interest-based; concessions linked to reciprocation; ethical throughout.
2: Predominantly positional; reactive concessions; minor lapses in transparency (not outcome-altering).
1: Deceptive or materially misleading behavior; coercive tactics.

Criterion 3: Discovery and Relationship Behaviors (15%)

4: Uses purposeful questions; accurately synthesizes counterpart interests and constraints; active listening; builds trust and joint problem frame (high subjective value; Curhan et al., 2006).
3: Solid inquiry and summaries; rapport maintained; occasional missed cues.
2: Limited probing; interruptions or missed interests; neutral rapport.
1: Minimal inquiry; dismissive tone; deteriorating relationship climate.

Criterion 4: Value Communication and Business Case (20%)

4: Ties features to measurable outcomes; presents clear ROI with data citations; addresses skepticism with evidence; avoids overclaiming.
3: Communicates benefits with some quantification; mostly accurate evidence use.
2: General benefit statements; weak or untraceable quantification.
1: Unsupported assertions; factual errors about value.

Criterion 5: Integrative Problem Solving and Concession Management (20%)

4: Structures multi-issue packages (e.g., term, scope, service levels, roadmap commitments) that improve joint value; trades low-cost/high-value items; handles objections constructively.
3: Generates feasible options with some trades; objections handled adequately.
2: Single-issue focus (price); ad hoc concessions; defensive objection handling.
1: Concession-only pattern; no option generation; escalates conflict.

Criterion 6: Closing and Follow-up (10%)

4: Achieves clear agreement or next steps with decision-rights and timelines; follow-up email precisely documents terms, open issues, and responsibilities; proposal options aligned to uncovered interests.
3: Reasonably clear next steps; follow-up generally accurate with minor omissions.
2: Vague commitments; follow-up incomplete or misaligned options.
1: No clear next steps; inaccurate or missing follow-up.

Scoring, Cut Scores, and Decision Rules

Total score: Sum of weighted criteria (0–100). Ethics gate: Any “1” on Criterion 2 triggers automatic fail regardless of total score.
Recommended performance standard: Pass if total ≥70 and no criterion <2; Proficient if ≥80; Advanced if ≥90. Establish cut scores via a standard-setting workshop (modified Angoff with rubric-level descriptors and annotated exemplars) and review consequences of decisions (AERA, APA, & NCME, 2014).
Rater reliability: Two independent raters score all performances; require pre-qualification with weighted kappa or ICC ≥0.70 on calibration set; double-score at least 20% of live administrations with adjudication rules (Jonsson & Svingby, 2007).
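The decision rules above can be sketched in a few lines of Python. The criterion weights and thresholds are taken from the rubric; the conversion of a 1–4 level into points (weight × level ÷ 4) is an assumption, since the text states only that the weighted criteria sum to a 100-point total. Function and constant names are illustrative.

```python
WEIGHTS = {1: 20, 2: 15, 3: 15, 4: 20, 5: 20, 6: 10}  # criterion number -> weight (%)
ETHICS_CRITERION = 2

def total_score(levels: dict[int, int]) -> float:
    """Weighted total; assumes a level L on criterion c earns WEIGHTS[c] * L / 4 points."""
    return sum(WEIGHTS[c] * levels[c] for c in WEIGHTS) / 4

def classify(levels: dict[int, int]) -> str:
    """Apply the ethics gate first, then the pass rule, then the performance bands."""
    total = total_score(levels)
    if levels[ETHICS_CRITERION] == 1:
        return "Fail (ethics gate)"
    if total < 70 or min(levels[c] for c in WEIGHTS) < 2:
        return "Fail"
    if total >= 90:
        return "Advanced"
    if total >= 80:
        return "Proficient"
    return "Pass"

candidate = {1: 4, 2: 3, 3: 3, 4: 4, 5: 3, 6: 3}
print(total_score(candidate), classify(candidate))
```

Under this mapping the lowest possible total is 25, so the stated 0–100 range would need a different conversion; the standard-setting workshop would be the place to settle that.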

Validity and Fairness Evidence Plan

Content: Derive and validate the competency model from a job/practice analysis of customer success renewal work (SME panel; blueprint linking tasks to competencies) to support content validity (AERA, APA, & NCME, 2014; Wiggins, 1990).
Response process: Provide unambiguous instructions and exemplars; train role-players to maintain comparable prompts; monitor administration fidelity (Kane, 2006).
Internal structure: Analyze rubric dimensionality and reliability (e.g., generalizability across raters/scenarios, internal consistency) after pilot.
Relations to other variables: Correlate scores with supervisor ratings of renewal effectiveness and subsequent renewal outcomes to evaluate predictive/convergent validity, recognizing context constraints.
Consequences: Track subgroup performance, adverse impact, and rater drift; conduct bias and language reviews of materials; offer reasonable accommodations that do not alter the focal constructs (AERA, APA, & NCME, 2014).

Administration and Quality Controls

Rater training: Use a scoring guide with behavioral anchors and multiple annotated video exemplars at each level; conduct calibration and periodic recalibration to mitigate drift (Jonsson & Svingby, 2007; Brookhart, 2013).
Role-player standardization: Scripted prompts with decision trees; periodic inter-actor calibration via joint practice sessions and reviewer observation.
Security and integrity: Unique scenario variants; rotation across administrations; artifact plagiarism checks; documented chain-of-custody for recordings.
Pilot testing: Small-scale pilot to evaluate task difficulty, time-on-task, rater agreement, and scoring guide clarity; revise before operational use.

Optional Scenario Variants (for parallel forms or re-testing)

Procurement-led renewal with rigid policy; usage trending upward but expansion risk.
Executive escalation due to outage; negotiation involves service credits and renewal linkage.
Multi-year renewal with price-protection vs roadmap guarantees trade-off.

References

AERA, APA, & NCME. (2014). Standards for educational and psychological testing. American Educational Research Association.
Brookhart, S. M. (2013). How to create and use rubrics for formative assessment and grading. ASCD.
Curhan, J. R., Elfenbein, H. A., & Xu, H. (2006). What do people value when they negotiate? Mapping the domain of subjective value in negotiation. Journal of Personality and Social Psychology, 91(3), 493–512.
Fisher, R., Ury, W., & Patton, B. (2011). Getting to yes: Negotiating agreement without giving in (3rd ed.). Penguin.
Jonsson, A., & Svingby, G. (2007). The use of scoring rubrics: Reliability, validity and educational consequences. Educational Research Review, 2(2), 130–144.
Kane, M. T. (2006). Validation. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 17–64). American Council on Education/Praeger.
Malhotra, D., & Bazerman, M. H. (2007). Negotiation genius. Bantam.
Reichheld, F. F., & Sasser, W. E. (1990). Zero defections: Quality comes to services. Harvard Business Review, 68(5), 105–111.
Wiggins, G. (1990). The case for authentic assessment. Practical Assessment, Research, and Evaluation, 2(1), 2.

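Problems solved

With a single instruction, turn any subject or skill into a performance task plan that is practical, measurable, and deliverable, helping teachers, curriculum researchers, trainers, and HR produce high-quality assessment designs in less time and strengthening learning outcomes and the credibility of talent selection.

Authentic-scenario orientation: builds tasks around real business or learning contexts to promote performance and transfer of competence
One-pass assessment loop: objectives → scenario → task instructions → submission requirements → scoring rubric → performance level descriptors → supporting references
Academic-grade credibility: evidence-based argumentation and standardized wording, with support for citing authoritative sources, to strengthen compliance and persuasiveness
Multilingual and tiered difficulty: switches and tiers automatically by audience language and ability stage, covering international and cross-department scenarios
Consistent quality and efficiency: 10 minutes replaces hours of manual polishing and avoids vague indicators and unfair scoring
Try now and extend: enter "subject/skill + output language" to get a sample immediately; upgrading unlocks batch generation, version comparison, alignment to curriculum or competency standards, multi-user collaboration, and multi-format export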

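Intended users

K12 teachers and curriculum researchers

Quickly turn curriculum standards into performance tasks, generate scoring rubrics and sample answers, and administer and analyze them consistently across classes and grade levels, reducing lesson-preparation and coordination workload.

University instructors and course coordinators

Design task briefs and scoring sheets for labs, practical work, or capstone projects, with bilingual output and citation notes, improving teaching consistency and fairness and making it easier to coordinate with teaching assistants.

Corporate training managers (L&D)

Build scenario-based practical tasks around job competencies, with scoring points and evidence examples, to support consistent multi-site assessment and skills certification and to shorten the time needed to put training evaluation into practice.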

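Feature summary

• Generates practical performance tasks by subject and skill in one step, with clear completion requirements and output standards

• Automatically produces professional scoring rubrics and evaluation indicators, usable directly in class and in assessment, with support for review workflows

• Provides evidence-based design rationale and references to strengthen credibility and persuasiveness

• One-step multilingual output for teaching and training teams across campuses and regions

• Automatically structures objectives, steps, materials, and scoring, balancing reliability, validity, and fairness

• Built-in templates and parameterized options align with course objectives, competency dimensions, and difficulty levels

• Covers quiz, performance assessment, and survey contexts, adapting to classrooms, enterprises, and test centers

• Keeps facts and wording accurate, avoiding omissions and exaggeration, so output can go directly to editorial review and quality checks

• Supports rapid iteration based on feedback, going from requirements to an executable plan in minutes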

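How to use a purchased prompt template

1. Use directly in an external chat application

Copy and paste the prompt generated from the template into your usual chat application (such as ChatGPT or Claude) and use it in conversation with no additional development. Suited to quick individual trials and lightweight use.

2. Publish as an API endpoint

Turn the prompt template into an API; your program can modify the template parameters freely and call it through the interface for automation and batch processing. Suited to developer integration and embedding in business systems.

3. Configure in an MCP client

Configure the corresponding server address in an MCP client so that your AI application can call the prompt template automatically. Suited to advanced users and team collaboration, letting prompts carry over between different AI tools.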

Generate Performance Task Suggestions

幂简官方

Oct 17, 2025

Generates performance tasks for a specified topic or skill, with professional recommendations.
