Data Collection Plan: The Effect of Blended Learning on Academic Self-Efficacy
- Purpose and Research Questions
- Purpose: To estimate the causal effect of blended learning (BL) on students’ academic self-efficacy (ASE) and to document implementation fidelity and contextual moderators.
- Primary question: Does participation in a well-specified BL course increase ASE relative to traditional face-to-face instruction?
- Secondary questions:
- Do effects vary by baseline ASE, prior achievement, discipline, or gender?
- Are BL effects mediated by mastery experiences and instructional presence?
- How do implementation fidelity and engagement relate to ASE change?
- Rationale: BL may enhance ASE by increasing mastery opportunities, feedback, and learner control (Bandura, 1997; Graham, 2006; Means et al., 2013).
- Design Overview
- Preferred design: Cluster randomized controlled trial (cRCT) at the course-section level to minimize contamination (randomize sections to BL vs. business-as-usual face-to-face).
- Alternative (if randomization is not feasible): Quasi-experimental matched comparison using propensity score methods and difference-in-differences, with baseline ASE, prior achievement, demographics, and prior online experience as covariates (following Murnane & Willett's principles of causal inference; see also What Works Clearinghouse [WWC] standards). A workflow sketch follows this list.
- Mixed-methods concurrent design: Quantitative (surveys, LMS logs, administrative data) with qualitative interviews/focus groups for triangulation and explanatory depth (Creswell & Plano Clark, 2017).
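If the quasi-experimental fallback is needed, a minimal sketch of the intended workflow is shown below: estimate propensity scores, trim to the region of common support, and fit a difference-in-differences model with section-clustered standard errors. Column and file names (e.g., treated, ase_t0, ase_t2, prior_gpa, section_id, student_baseline.csv) are illustrative assumptions, not a final specification.

```python
# Sketch: propensity-score overlap check and difference-in-differences estimate
# for the quasi-experimental fallback. All column names are placeholders.
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("student_baseline.csv")  # hypothetical analytic file

# 1. Propensity model: probability of enrolling in a BL section given baseline covariates.
covars = ["ase_t0", "prior_gpa", "female", "first_gen", "prior_online"]
ps_model = LogisticRegression(max_iter=1000).fit(df[covars], df["treated"])
df["pscore"] = ps_model.predict_proba(df[covars])[:, 1]

# 2. Overlap diagnostic: keep only observations in the region of common support.
lo = df.loc[df.treated == 1, "pscore"].min()
hi = df.loc[df.treated == 0, "pscore"].max()
analytic = df[(df.pscore >= lo) & (df.pscore <= hi)].copy()

# 3. Difference-in-differences: reshape to long (pre/post), interact time with
#    treatment, and cluster standard errors at the section level.
long = analytic.melt(
    id_vars=["student_id", "section_id", "treated", "pscore"],
    value_vars=["ase_t0", "ase_t2"], var_name="wave", value_name="ase",
)
long["post"] = (long["wave"] == "ase_t2").astype(int)
did = smf.ols("ase ~ treated * post", data=long).fit(
    cov_type="cluster", cov_kwds={"groups": long["section_id"]}
)
print(did.summary())  # coefficient on treated:post is the DiD estimate
```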
- Setting, Sampling, and Participants
- Setting: Multiple undergraduate gateway courses (e.g., introductory psychology, biology, statistics) across 2–4 institutions to enhance generalizability (Graham, 2006).
- Units:
- Clusters: Course sections (target 40–80 sections total, balanced across arms and disciplines).
- Students: All enrolled students in selected sections; anticipated n per section = 25–40.
- Instructors: Faculty teaching participating sections.
- Inclusion criteria: Degree-seeking undergraduates enrolled at census; instructors willing to implement the specified BL model or the standard face-to-face format.
- Exclusion criteria: Sections taught by graduate assistants without training support; courses with existing extensive online components in the control arm.
- Intervention and Comparison (for Fidelity Anchoring)
- BL treatment: Pre-specified blended model with defined dosage (e.g., 30–50% online), structured weekly online modules (readings, quizzes, discussion), and in-person active learning sessions. Alignment documented via a BL design template and a checklist derived from established blended/course quality frameworks (e.g., Community of Inquiry presence indicators; Arbaugh et al., 2008; Graham et al., 2013).
- Control: Business-as-usual face-to-face delivery without systematic online components beyond standard LMS posting.
- Outcome Measures
- Primary outcome: Academic self-efficacy
- Instrument: MSLQ Self-Efficacy for Learning and Performance subscale (8 items; 1–7 Likert) administered at baseline (T0), midterm (T1), and end of term (T2) (Pintrich et al., 1991; Pintrich et al., 1993).
- Evidence: Internal consistency is typically high in college samples (α ≈ .90), with demonstrated predictive validity for achievement (Pintrich et al., 1993).
- Secondary/auxiliary measures (for mechanisms, covariate adjustment, and sensitivity):
- Sources of self-efficacy (mastery experiences, vicarious experiences, social persuasion, physiological states), adapted from domain-appropriate scales following Bandura’s construction guidelines (Bandura, 2006; Usher & Pajares, 2009).
- Teaching, social, and cognitive presence (CoI survey; Arbaugh et al., 2008) to capture instructional/learning environment characteristics associated with BL.
- Prior achievement: Cumulative GPA or standardized placement scores; baseline course diagnostic if available.
- Engagement/effort: LMS activity metrics (time-on-task, resource views, assignment submission patterns) and a brief self-report engagement scale; see the aggregation sketch after this list.
- Demographics: Age, gender, major/discipline, first-generation status, prior online course experience.
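A minimal sketch of how raw LMS event logs might be rolled up into per-student engagement indicators follows. The event-log schema (student_id, event_type, duration_min, timestamp) and file name are placeholder assumptions rather than a specific LMS export format.

```python
# Sketch: aggregate raw LMS event logs into per-student engagement metrics.
import pandas as pd

events = pd.read_csv("lms_events.csv", parse_dates=["timestamp"])

engagement = (
    events.groupby("student_id")
    .agg(
        time_on_task_min=("duration_min", "sum"),
        resource_views=("event_type", lambda s: (s == "resource_view").sum()),
        submissions=("event_type", lambda s: (s == "assignment_submit").sum()),
        active_days=("timestamp", lambda s: s.dt.normalize().nunique()),
    )
    .reset_index()
)

# Weekly submission pattern: number of distinct weeks with at least one submission.
events["week"] = events["timestamp"].dt.isocalendar().week
weekly = (
    events[events.event_type == "assignment_submit"]
    .groupby("student_id")["week"].nunique()
    .rename("weeks_with_submission")
    .reset_index()
)
engagement = engagement.merge(weekly, on="student_id", how="left").fillna(0)
print(engagement.head())
```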
- Implementation Fidelity and Exposure
- Instructor-reported adherence: Weekly implementation logs aligned to the BL design template (dosage of online components, use of active learning).
- Independent observations: Structured observations of in-person sessions using a validated active-learning checklist.
- LMS analytics: Actual BL dosage (e.g., completion of online modules, discussion participation, quiz attempts).
- Fidelity rubric: Global fidelity scores synthesized from logs, observations, and analytics (O’Donnell, 2008); a scoring sketch follows this list.
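One way the global fidelity score could be synthesized per section is sketched below, assuming each component has already been rescaled to 0–1. The equal weights, the 0.70 "adequate fidelity" threshold, and the column names are placeholders pending the finalized rubric.

```python
# Sketch: combine instructor logs, observation scores, and LMS dosage into a
# global fidelity index per section. Components assumed pre-scaled to 0-1;
# weights and threshold are placeholders for the finalized rubric.
import pandas as pd

fidelity = pd.read_csv("section_fidelity_components.csv")
# expected columns: section_id, log_adherence, observation_score, lms_dosage

weights = {"log_adherence": 1 / 3, "observation_score": 1 / 3, "lms_dosage": 1 / 3}
fidelity["fidelity_index"] = sum(w * fidelity[col] for col, w in weights.items())
fidelity["adequate_fidelity"] = fidelity["fidelity_index"] >= 0.70

print(fidelity[["section_id", "fidelity_index", "adequate_fidelity"]])
```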
- Timing and Procedures
- Pre-semester (T−1): Instructor recruitment; randomization at section level; training for BL instructors; pilot testing of instruments; LMS instrumentation.
- Week 1–2 (T0): Baseline student survey (ASE, demographics, prior online experience), consent, and retrieval of prior GPA. Treatment assignment is masked from analysts; implementers (instructors and students) necessarily know their condition.
- Midterm (T1): Short ASE assessment, CoI survey, and engagement check to examine trajectories and reduce common method bias via temporal separation (Podsakoff et al., 2003).
- End of term (T2): Posttest ASE, CoI, course grade collection; instructor fidelity summaries; LMS data export.
- Postterm (T3, optional): Follow-up ASE 6–8 weeks later to assess persistence.
- Qualitative sampling: Purposive subsample (e.g., n ≈ 20–30 students per condition across disciplines; 8–12 instructors) for semi-structured interviews near T2 to explain quantitative patterns (Creswell & Plano Clark, 2017).
- Data Quality Assurance
- Pilot: Cognitive interviews with 8–12 students to ensure item clarity and contextual fit; small pilot (n ≈ 60–80) to examine reliability and preliminary factor structure (see the reliability sketch after this list).
- Measurement equivalence: Test measurement invariance (configural, metric, scalar) for the MSLQ ASE subscale across conditions and time points before estimating effects (Putnick & Bornstein, 2016).
- Administration controls: Uniform survey windows, standardized reminders, and in-class administration where feasible to maximize response rates; incentives (e.g., small course credit or a raffle) approved by the IRB.
- Nonresponse management: Track response propensity; implement targeted reminders; document reasons for attrition.
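A minimal sketch of the pilot reliability check for the 8-item MSLQ self-efficacy subscale (Cronbach's alpha computed from item and total-score variances) is shown below; the item column names (se_1 … se_8) and file name are assumptions.

```python
# Sketch: Cronbach's alpha for the 8-item MSLQ self-efficacy subscale in the
# pilot sample. Item column names (se_1 ... se_8) are placeholders.
import pandas as pd

pilot = pd.read_csv("pilot_survey.csv")
items = pilot[[f"se_{i}" for i in range(1, 9)]].dropna()

k = items.shape[1]
item_vars = items.var(axis=0, ddof=1)        # variance of each item
total_var = items.sum(axis=1).var(ddof=1)    # variance of the summed scale score
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha (n={len(items)}): {alpha:.2f}")
```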
- Sample Size and Power Guidance
- Plan for a cRCT with students nested in sections. Use established software (e.g., Optimal Design) with plausible assumptions: a small effect size (d = 0.20–0.25), a section-level intraclass correlation of ICC ≈ .03–.07 for psychosocial outcomes, and covariate R² ≈ .40 from baseline ASE and GPA. Aim for at least 30–40 sections per arm with 25–35 students each to achieve ≈ .80 power for small effects (Hedges & Rhoads, 2010; Raudenbush & Bryk, 2002); set final targets via a formal power analysis with local ICC estimates. A worked example follows.
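The sketch below computes the minimum detectable effect size (MDES) for a balanced two-level cRCT with treatment assigned at the section level, using the standard cluster-randomized design formula. The input values mirror the planning assumptions above and should be replaced with local estimates; the function and its defaults are illustrative, not a formal power analysis.

```python
# Sketch: minimum detectable effect size (MDES) for a two-level cRCT with
# treatment assigned at the section level. Inputs (ICC, covariate R^2, section
# counts) mirror the planning assumptions and should be replaced locally.
from scipy import stats

def mdes_crct(n_sections, students_per_section, icc, r2_l2, r2_l1,
              alpha=0.05, power=0.80, p_treat=0.5):
    """MDES in standardized (Cohen's d) units for a balanced cluster RCT."""
    df = n_sections - 2  # section-level df, assuming no additional section-level covariates
    multiplier = stats.t.ppf(1 - alpha / 2, df) + stats.t.ppf(power, df)
    var_term = (
        icc * (1 - r2_l2) / (p_treat * (1 - p_treat) * n_sections)
        + (1 - icc) * (1 - r2_l1)
        / (p_treat * (1 - p_treat) * n_sections * students_per_section)
    )
    return multiplier * var_term ** 0.5

# Planning scenario: 70 sections (35 per arm), 30 students each, ICC = .05,
# baseline ASE and GPA explaining ~40% of variance at both levels.
print(round(mdes_crct(70, 30, icc=0.05, r2_l2=0.40, r2_l1=0.40), 3))
```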
- Data Management and Security
- Unique study IDs linking surveys, LMS logs, and grades; no direct identifiers stored with analytic datasets (see the pseudonymization sketch after this list).
- Secure, encrypted storage on institutional servers; access controls; audit logs.
- Codebook, metadata, and version control (e.g., Git with restricted access).
- Pre-registration of hypotheses, primary outcomes, and analysis plan (e.g., OSF) before data collection.
- Data sharing: De-identified datasets and instruments archived following FAIR principles when permitted (Wilkinson et al., 2016).
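A minimal sketch of one way to generate stable pseudonymous study IDs that link survey, LMS, and grade records without carrying direct identifiers into analytic files is shown below. The keyed-hash approach, file names, and column names are assumptions; the secret key would be held separately by the data manager under restricted access.

```python
# Sketch: generate a stable pseudonymous study ID that links survey, LMS, and
# grade records without storing direct identifiers in analytic files.
import hmac
import hashlib
import pandas as pd

SECRET_KEY = b"replace-with-project-secret-held-by-data-manager"

def study_id(institutional_id: str) -> str:
    """Deterministic, non-reversible pseudonym for a student identifier."""
    digest = hmac.new(SECRET_KEY, institutional_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]

roster = pd.read_csv("registrar_roster.csv")  # contains direct identifiers
roster["study_id"] = roster["institutional_id"].astype(str).map(study_id)

# The analytic extract keeps only the pseudonym and study variables.
roster[["study_id", "section_id", "cumulative_gpa"]].to_csv(
    "analytic_roster.csv", index=False
)
```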
- Ethical Considerations
- IRB approval; informed consent specifying voluntary participation, data uses, and withdrawal rights.
- FERPA-compliant handling of educational records; minimal-risk classification anticipated.
- Mitigation of risks: Anonymized reporting; instructor training to ensure equitable learning opportunities in both arms.
- Minimizing Bias and Ensuring Internal Validity
- Randomization by an independent coordinator; blocked by course and instructor to balance arms on these contextual factors and, indirectly, prior outcomes (see the sketch after this list).
- Baseline measures to adjust for any residual imbalance.
- Analyst blinding to treatment during data cleaning and pre-specification of models.
- Multiple data sources (surveys, logs, grades) and temporal separation to reduce common method variance (Podsakoff et al., 2003).
- Monitoring protocol deviations and documenting contamination.
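A minimal sketch of section-level randomization blocked by course and instructor appears below; a fixed seed is recorded for auditability. The column names, file names, and arm labels are placeholders, and the handling of odd-sized blocks is illustrative only.

```python
# Sketch: section-level randomization blocked by course and instructor, so each
# block contributes sections to both arms where possible. Column names are
# placeholders; the seed would be documented in the study log.
import numpy as np
import pandas as pd

rng = np.random.default_rng(20240901)

sections = pd.read_csv("eligible_sections.csv")  # section_id, course, instructor_id

def assign_block(block: pd.DataFrame) -> pd.DataFrame:
    # Shuffle sections within the block, then alternate arm labels.
    shuffled = block.sample(frac=1, random_state=int(rng.integers(0, 2**32 - 1))).copy()
    arms = ["BL", "FTF"]
    rng.shuffle(arms)  # randomize which arm absorbs the extra section in odd blocks
    shuffled["arm"] = (arms * len(shuffled))[: len(shuffled)]
    return shuffled

assignments = (
    sections.groupby(["course", "instructor_id"], group_keys=False)
    .apply(assign_block)
    .sort_values("section_id")
)
assignments.to_csv("section_assignments.csv", index=False)
```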
- Analysis-Linked Data Needs (to inform collection)
- Primary impact: Change in ASE (T2 minus T0) estimated with multilevel models (students nested in sections), adjusting for baseline ASE and covariates (Raudenbush & Bryk, 2002); see the model sketch after this list.
- Moderation: Interaction terms for baseline ASE, discipline, gender.
- Mediation (exploratory): Mastery experiences and teaching presence measured at T1 and T2 (Bandura, 1997; Arbaugh et al., 2008).
- Sensitivity: Per-protocol and complier-average causal effect analyses using fidelity indices; missing data handled via FIML or multiple imputation under MAR assumptions (Enders, 2010).
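A minimal sketch of the primary impact model (random intercept for section, adjustment for baseline ASE and prior GPA) and an exploratory moderation term is shown below. Variable names are placeholders for the pre-registered analytic dataset; the complete-case filter is for brevity only, since missing data would be handled via FIML or multiple imputation as noted above.

```python
# Sketch: primary impact model with students nested in sections, plus an
# exploratory moderation term. Variable names are placeholders.
import pandas as pd
import statsmodels.formula.api as smf

data = pd.read_csv("analytic_dataset.csv").dropna(
    subset=["ase_t2", "ase_t0", "prior_gpa", "treated", "section_id"]
)  # complete cases for brevity; full analysis would use FIML or MI

# Random intercept for section; the treatment effect is the coefficient on `treated`.
primary = smf.mixedlm(
    "ase_t2 ~ treated + ase_t0 + prior_gpa", data, groups=data["section_id"]
).fit(reml=True)
print(primary.summary())

# Moderation by baseline ASE (mean-centered) via an interaction term.
data["ase_t0_c"] = data["ase_t0"] - data["ase_t0"].mean()
moderation = smf.mixedlm(
    "ase_t2 ~ treated * ase_t0_c + prior_gpa", data, groups=data["section_id"]
).fit(reml=True)
print(moderation.summary())
```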
- Reporting and Documentation
- Document recruitment, participation flow, response rates, and attrition by arm (following the CONSORT extension for cluster randomized trials, adapted for education research).
- Report reliability (Cronbach’s alpha, omega), CFA/invariance results, and fidelity statistics.
- Provide a transparent account of deviations from the pre-registered plan and robustness checks.
References
- Arbaugh, J. B., Cleveland-Innes, M., Diaz, S. R., Garrison, D. R., Ice, P., Richardson, J. C., & Swan, K. P. (2008). Developing a community of inquiry instrument: Testing a measure of the Community of Inquiry framework using a multi-institutional sample. The Internet and Higher Education, 11(3–4), 133–136.
- Bandura, A. (1997). Self-efficacy: The exercise of control. W. H. Freeman.
- Bandura, A. (2006). Guide for constructing self-efficacy scales. In F. Pajares & T. Urdan (Eds.), Self-efficacy beliefs of adolescents (pp. 307–337). Information Age.
- Creswell, J. W., & Plano Clark, V. L. (2017). Designing and conducting mixed methods research (3rd ed.). SAGE.
- Enders, C. K. (2010). Applied missing data analysis. Guilford Press.
- Graham, C. R. (2006). Blended learning systems: Definition, current trends, and future directions. In C. J. Bonk & C. R. Graham (Eds.), The handbook of blended learning (pp. 3–21). Pfeiffer.
- Graham, C. R., Woodfield, W., & Harrison, J. B. (2013). A framework for institutional adoption and implementation of blended learning in higher education. The Internet and Higher Education, 18, 4–14.
- Hedges, L. V., & Rhoads, C. (2010). Statistical power analysis in education research (NCEE 2010-4017). U.S. Department of Education.
- Means, B., Toyama, Y., Murphy, R., & Baki, M. (2013). The effectiveness of online and blended learning: A meta-analysis of the empirical literature. Teachers College Record, 115(3), 1–47.
- O’Donnell, C. L. (2008). Defining, conceptualizing, and measuring fidelity of implementation and its relationship to outcomes. Review of Educational Research, 78(1), 33–84.
- Pintrich, P. R., Smith, D. A. F., Garcia, T., & McKeachie, W. J. (1991). A manual for the use of the Motivated Strategies for Learning Questionnaire (MSLQ). University of Michigan.
- Pintrich, P. R., Smith, D. A. F., Garcia, T., & McKeachie, W. J. (1993). Reliability and predictive validity of the Motivated Strategies for Learning Questionnaire (MSLQ). Educational and Psychological Measurement, 53(3), 801–813.
- Podsakoff, P. M., MacKenzie, S. B., Lee, J.-Y., & Podsakoff, N. P. (2003). Common method biases in behavioral research: A critical review of the literature and recommended remedies. Journal of Applied Psychology, 88(5), 879–903.
- Putnick, D. L., & Bornstein, M. H. (2016). Measurement invariance conventions and reporting: The state of the art and future directions for psychological research. Developmental Review, 41, 71–90.
- Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). SAGE.
- Usher, E. L., & Pajares, F. (2009). Sources of self-efficacy in mathematics: A validation study. Contemporary Educational Psychology, 34(1), 89–101.
- Wilkinson, M. D., et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, 160018.
Note: If a quasi-experimental design is required, add a data element for teacher/course matching variables (e.g., historical grading distributions, class size, instructor experience) to strengthen the propensity model and include a clear overlap/diagnostics protocol before estimating effects.