Title: Data Literacy Fundamentals for New Employees
Audience
- New hires across functions (non-technical and technical)
- No prior analytics or coding experience required
Delivery
- 8 modules plus a capstone, 12–14 total hours (can run as 2 days of instruction followed by a capstone week, or self-paced over 2 weeks)
- Blend short lectures, guided practice, and job-relevant exercises
- Provide a sandbox dataset and access to spreadsheet and BI/SQL tools
Course Outcomes
By the end, learners will:
- Define key data concepts and the data lifecycle
- Find, understand, and request data appropriately
- Assess and improve basic data quality
- Analyze data responsibly, respecting privacy, security, and governance
- Summarize and visualize findings clearly and accurately
- Communicate data-supported insights to stakeholders
Required Materials
- Company data policies summary (privacy, security, governance, retention, classification, data request process)
- Sample datasets (CSV and XLSX) and a data dictionary
- Spreadsheet tool (e.g., Excel/Sheets) and optional BI/SQL sandbox
- Quick-reference guides (formulas, chart chooser, data quality checklist)
Module 1. Orientation: What Data Literacy Means Here (60–75 min)
Objectives
- Define data literacy in the workplace context
- Identify where data lives and who owns it
- Describe the data lifecycle and your responsibilities at each stage
Do this
- State the data lifecycle: create/collect → store → access → transform → analyze → share → retain/retire.
- Map the people: data producers, stewards/owners, analysts, engineers, business stakeholders, compliance.
- Locate data: navigate the catalog/wiki, BI dashboards, shared drives, ticketing system.
- Apply roles and responsibilities: know what you can/cannot do with data you handle.
Practice
- Find a dataset in the catalog, locate its owner, and read its metadata (purpose, refresh cadence, definitions).
Assess
- 5-question check: identify the right owner and access path for a sample request.
Module 2. Data Foundations: Types, Formats, and Structure (75–90 min)
Objectives
- Recognize common data types and file formats
- Read and interpret tables with keys, units, and metadata
Do this
- Distinguish types: categorical (nominal/ordinal), numeric (continuous/discrete), datetime, text, boolean.
- Recognize structures: tables with rows/columns, primary keys, foreign keys, wide vs long format.
- Handle formats: CSV (delimiters/quotes/encoding), XLSX, JSON basics; mind decimal separators and time zones.
- Use metadata: data dictionary, column definitions, units, valid ranges, refresh schedules.
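Example (illustrative): a minimal sketch of the import settings above, assuming a hypothetical semicolon-delimited export that uses a decimal comma and no thousands separator. The column names and the read_orders helper are invented for this example and are not part of the course dataset.

```python
import csv
import io

# Hypothetical export: semicolon-delimited, UTF-8, comma as decimal separator.
raw_bytes = "order_id;region;amount\n1001;Nord;1234,50\n1002;Süd;99,90\n".encode("utf-8")

EXPECTED_HEADERS = ["order_id", "region", "amount"]

def read_orders(data: bytes, encoding: str = "utf-8", delimiter: str = ";"):
    """Decode the bytes, verify the headers, and parse amounts as numbers."""
    text = data.decode(encoding)
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    # A wrong delimiter collapses the header row into a single column,
    # so check the headers before trusting any values.
    if reader.fieldnames != EXPECTED_HEADERS:
        raise ValueError(f"unexpected headers: {reader.fieldnames}")
    rows = []
    for row in reader:
        # Assumes a decimal comma and no thousands separator.
        row["amount"] = float(row["amount"].replace(",", "."))
        rows.append(row)
    return rows

orders = read_orders(raw_bytes)
print(orders[0]["amount"])  # 1234.5
```

Calling read_orders(raw_bytes, delimiter=",") raises ValueError, which is the same symptom a learner sees in a spreadsheet when every value lands in column A.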
Practice
- Open a CSV; set delimiter and encoding; verify headers; identify the primary key; spot mixed data types in a column.
Assess
- Label the data type for 8 example fields; choose correct file import settings in a scenario.
Module 3. Data Quality Essentials and Basic Cleaning (90 min)
Objectives
- Evaluate data against quality dimensions
- Execute essential cleaning tasks in a spreadsheet
Do this
- Check quality dimensions: accuracy, completeness, consistency, timeliness, validity, uniqueness.
- Validate values: check ranges, allowed values, date formats, duplicates, missingness patterns.
- Clean safely: create a copy, log changes, version files, avoid overwriting raw data.
- Apply spreadsheet techniques: TRIM, CLEAN, UPPER/LOWER, TEXTSPLIT/Text to Columns, IFERROR, VLOOKUP/XLOOKUP, COUNTIF, Remove Duplicates, Data Validation.
- Document assumptions and transformations.
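Example (illustrative): the checks above can also be sketched outside a spreadsheet. This sketch assumes a hypothetical four-row dataset with seeded issues; the field names and the clean helper are invented for illustration. It works on copies, leaves the raw rows untouched, and keeps a change log.

```python
from collections import Counter

# Hypothetical raw rows with seeded issues: stray spaces, inconsistent case,
# a missing date, and a duplicate ID.
raw_rows = [
    {"id": "A1", "category": " Retail ", "signup_date": "2024-01-05"},
    {"id": "A2", "category": "retail", "signup_date": ""},
    {"id": "A2", "category": "RETAIL", "signup_date": "2024-01-07"},
    {"id": "A3", "category": "Wholesale", "signup_date": "2024-01-09"},
]

def clean(rows):
    """Return cleaned copies plus a change log; the raw rows are not modified."""
    cleaned, log = [], []
    for row in rows:
        new_row = dict(row)  # work on a copy, never on the raw data
        new_row["category"] = row["category"].strip().lower()
        if new_row["category"] != row["category"]:
            log.append(f"{row['id']}: category normalized to '{new_row['category']}'")
        cleaned.append(new_row)
    return cleaned, log

cleaned_rows, change_log = clean(raw_rows)

# Uniqueness check: IDs that appear more than once.
id_counts = Counter(row["id"] for row in cleaned_rows)
duplicate_ids = sorted(i for i, n in id_counts.items() if n > 1)

# Completeness check: rows with no signup date.
missing_dates = [row["id"] for row in cleaned_rows if not row["signup_date"]]

print(duplicate_ids, missing_dates, len(change_log))
```

The sketch only flags the duplicate ID and the missing date. Deciding which duplicate to keep or how to fill a date is a judgment call that belongs in the documented assumptions.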
Practice
- Use a checklist to fix common issues (extra spaces, inconsistent categories, missing dates, duplicate IDs) and write a 3-line change log.
Assess
- Before/after dataset comparison; identify which quality dimensions improved.
Module 4. Responsible Data Use: Privacy, Security, and Governance (75 min)
Objectives
- Classify data and apply least-privilege access
- Handle personal and sensitive data ethically and securely
Do this
- Classify information: public, internal, confidential, restricted (use your organization’s labels).
- Identify personal data (PII) and sensitive attributes; minimize collection and use only for legitimate purposes.
- Protect data: store in approved locations, share via secure channels, avoid emailing raw datasets unless permitted.
- Follow access rules: request access via ticketing; do not share credentials; review permissions regularly.
- Respect retention: follow approved retention and deletion schedules.
- Report incidents: escalate suspected data exposure or phishing immediately via the defined process.
Practice
- Classify 10 examples; choose the correct sharing method; draft a short purpose statement for a data request.
Assess
- Scenario quiz on appropriate handling, access, and incident response steps.
Module 5. Framing Questions and Defining Metrics (60–75 min)
Objectives
- Translate business questions into measurable metrics and data requirements
- Avoid vanity metrics; define clear operational definitions
Do this
- Use SMART framing: Specific, Measurable, Achievable, Relevant, Time-bound.
- Define metrics: name, formula, unit, scope, filters, and refresh cadence; distinguish metric vs KPI vs OKR.
- Guard against pitfalls: vanity metrics, undefined denominators, mixed cohorts, shifting definitions.
- Plan data needs: identify sources, grain, time windows, and necessary joins.
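Example (illustrative): one way to make a metric definition explicit is to record its parts in a small structure and refuse to compute a rate when the denominator is undefined. The MetricDefinition class, the rate helper, and the sample values below are assumptions made for this sketch, not a company standard.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class MetricDefinition:
    """Hypothetical template mirroring the fields listed above."""
    name: str
    formula: str
    unit: str
    scope: str
    filters: str
    refresh_cadence: str

activation_rate = MetricDefinition(
    name="30-day activation rate",
    formula="activated users / new signups",
    unit="percent",
    scope="new signups in the calendar month",
    filters="exclude internal test accounts",
    refresh_cadence="daily",
)

def rate(numerator: int, denominator: int) -> Optional[float]:
    """Return a percentage, or None when the denominator is zero or negative."""
    if denominator <= 0:
        return None  # undefined denominator: report "no data", not 0%
    return round(100 * numerator / denominator, 1)

print(rate(45, 300))  # 15.0
print(rate(0, 0))     # None
```

Returning None instead of 0 keeps "no signups this period" distinct from "nobody activated", which is the undefined-denominator pitfall named above.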
Practice
- Convert “Improve user engagement” into 2–3 SMART questions; write operational definitions for each metric.
Assess
- Peer-review metric definitions for clarity and measurability.
Module 6. Basic Analysis and Statistics for Everyone (90 min)
Objectives
- Summarize data correctly and recognize common biases and errors
- Interpret results cautiously (correlation ≠ causation)
Do this
- Describe data: count, sum, mean/median, min/max, percent, rate; use pivot tables for group summaries.
- Explore distributions: histograms, box plots; identify skew and outliers.
- Sample wisely: understand sampling bias and nonresponse; prefer samples that represent the population you want to describe over convenience samples.
- Interpret relationships: scatterplots; correlation vs causation; confounding variables.
- Understand experiments at a high level: A/B basics (randomization, control, sample size, outcome metric); avoid peeking and multiple-comparison pitfalls.
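Example (illustrative): a short sketch of why the median is often reported alongside the mean for skewed data. The order values are made up, and the 1.5 × IQR fence is the common box-plot convention, assumed here rather than prescribed by the course.

```python
import statistics

# Hypothetical order values with one extreme value.
order_values = [20, 22, 25, 27, 30, 31, 35, 400]

mean_value = statistics.mean(order_values)
median_value = statistics.median(order_values)
print(mean_value, median_value)  # 73.75 28.5

# Box-plot style outlier check using quartiles and the 1.5 * IQR fences.
q1, _, q3 = statistics.quantiles(order_values, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in order_values if v < lower_fence or v > upper_fence]
print(outliers)  # [400]
```

A single large order pulls the mean far above what a typical order looks like, while the median barely moves. Flagging the value is only a starting point; whether to exclude it depends on the question being asked.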
Practice
- Build a pivot to compare conversion rate by channel; visualize distribution; note potential biases.
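Example (illustrative): the same group summary a pivot table produces, sketched with hypothetical session data. The channel names and counts are invented; the point is to show the rate next to its sample size.

```python
from collections import defaultdict

# Hypothetical sessions as (channel, converted) pairs.
sessions = [
    ("email", True), ("email", False), ("email", False), ("email", False),
    ("search", True), ("search", True), ("search", False),
    ("search", False), ("search", False),
    ("social", True),
]

totals = defaultdict(lambda: {"sessions": 0, "conversions": 0})
for channel, converted in sessions:
    totals[channel]["sessions"] += 1
    totals[channel]["conversions"] += int(converted)

# Conversion rate per channel, kept next to the session count it is based on.
summary = {
    channel: {**t, "rate_pct": round(100 * t["conversions"] / t["sessions"], 1)}
    for channel, t in totals.items()
}

for channel, stats in sorted(summary.items()):
    print(channel, stats)
```

Here "social" shows a 100% rate from a single session, so the rate should not be read without its sample size.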
Assess
- Short interpretation quiz with plots and summary tables.
Module 7. Finding and Querying Data (Spreadsheet, BI, and Intro SQL) (90 min)
Objectives
- Locate data and answer basic questions using BI or SQL
- Read metadata to construct correct filters and joins
Do this
- Navigate BI: filter, segment, drill down, export responsibly; verify dashboard refresh times.
- Read data catalogs: owners, lineage, last update, definitions.
- Use intro SQL (optional track): SELECT, WHERE, ORDER BY, LIMIT; GROUP BY with aggregates; INNER JOIN on keys; watch for duplication when joining.
- Validate results: row counts, sanity checks, reconcile to known totals.
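Example (illustrative): a sketch of the join-duplication pitfall and a reconciliation check, using an in-memory SQLite database through Python's sqlite3 module. The orders and order_items tables are hypothetical and stand in for whatever the sandbox provides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
CREATE TABLE order_items (item_id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT);
INSERT INTO orders VALUES (1, 10, 100.0), (2, 11, 50.0);
INSERT INTO order_items VALUES (1, 1, 'A'), (2, 1, 'B'), (3, 2, 'C');
""")

# Known total at the order grain: one row per order.
known_total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

# Joining to the item grain repeats order 1 once per item, inflating the sum.
inflated_total = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders AS o
    INNER JOIN order_items AS i ON i.order_id = o.order_id
""").fetchone()[0]

# One possible fix: aggregate items to one row per order before joining.
safe_total = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders AS o
    INNER JOIN (
        SELECT order_id, COUNT(*) AS item_count
        FROM order_items
        GROUP BY order_id
    ) AS i ON i.order_id = o.order_id
""").fetchone()[0]

print(known_total, inflated_total, safe_total)  # 150.0 250.0 150.0
conn.close()
```

Comparing the joined result against the known total is the kind of reconciliation the validation step asks for; a mismatch usually means the join changed the grain.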
Practice
- Answer 3 business questions using either BI filters or simple SQL; show query or filter steps; include validation checks.
Assess
- Submit answers with method and validation notes; facilitator spot-checks results and logic.
Module 8. Visualization and Data Storytelling (75–90 min)
Objectives
- Choose appropriate charts and design for clarity
- Communicate insights with minimal bias and maximum transparency
Do this
- Match chart to task: comparison (bar), trend (line), distribution (histogram/box), part-to-whole (stacked bars with care), relationship (scatter).
- Design clearly: informative titles, labeled axes and units, consistent scales, avoid chartjunk and 3D effects, use colorblind-safe palettes.
- Provide context: define metric and period, show baselines or targets, annotate anomalies, include caveats and data quality notes.
- Avoid misleading displays: start bar-chart value axes at zero rather than truncating them; disclose data exclusions and methods.
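Example (illustrative): a small calculation showing how a truncated axis exaggerates a difference between two bars. The drawn_ratio helper and the values are hypothetical; no plotting library is needed to see the effect.

```python
def drawn_ratio(a: float, b: float, axis_start: float = 0.0) -> float:
    """Ratio of bar heights as drawn when the value axis starts at axis_start."""
    if a <= axis_start:
        raise ValueError("the first bar must extend above the axis start")
    return (b - axis_start) / (a - axis_start)

# Two hypothetical values that differ by 10%.
true_ratio = drawn_ratio(50, 55)           # axis starts at 0
truncated_ratio = drawn_ratio(50, 55, 45)  # axis starts at 45

print(true_ratio)       # 1.1
print(truncated_ratio)  # 2.0
```

With the axis starting at 45, the second bar is drawn twice as tall as the first even though the underlying value is only 10% larger.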
Practice
- Create a 1-slide chart with title, annotation, and takeaway; get peer feedback using a checklist.
Assess
- Revise chart based on feedback; submit final with a 3-sentence narrative.
Capstone Project (2–3 hours total, spread across the week)
Prompt
- Investigate a realistic question (e.g., “Which onboarding channels yield the highest 30-day activation rate when compared within the same region and plan?”).
Steps
- Confirm the question and define metrics and cohorts.
- Locate data and obtain access; log data classification and approvals.
- Assess data quality; document cleaning steps.
- Analyze and validate; include at least one check against known totals.
- Visualize results; write a 1-page brief with recommendations, risks, and next steps.
- Present a 5-minute readout; answer stakeholder questions.
Rubric
- Accuracy and validation (40%)
- Responsible data handling (20%)
- Clarity of definitions and methods (20%)
- Communication and recommendations (20%)
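Example (illustrative): the rubric weights above can be combined into a single score as a weighted sum. This sketch assumes each criterion is scored on a 0–100 scale, which the rubric itself does not specify; the key names and sample scores are hypothetical.

```python
# Weights taken from the rubric above; they sum to 1.0.
RUBRIC_WEIGHTS = {
    "accuracy_and_validation": 0.40,
    "responsible_data_handling": 0.20,
    "clarity_of_definitions_and_methods": 0.20,
    "communication_and_recommendations": 0.20,
}

def capstone_score(scores: dict) -> float:
    """Weighted sum of criterion scores (assumed to be on a 0-100 scale)."""
    if set(scores) != set(RUBRIC_WEIGHTS):
        raise ValueError("scores must cover exactly the rubric criteria")
    total = sum(RUBRIC_WEIGHTS[name] * scores[name] for name in RUBRIC_WEIGHTS)
    return round(total, 1)

example_scores = {
    "accuracy_and_validation": 90,
    "responsible_data_handling": 80,
    "clarity_of_definitions_and_methods": 70,
    "communication_and_recommendations": 100,
}
print(capstone_score(example_scores))  # 86.0
```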
Assessment Plan
- Pre-course self-assessment and 10-question baseline quiz
- Module quizzes (5–8 questions each)
- Capstone scored with rubric
- Post-course quiz and a written reflection on how to apply the skills in role
Implementation Steps for L&D
- Localize with company policies, classification labels, and data request workflows.
- Prepare datasets with a data dictionary and seeded quality issues for practice.
- Set up access to BI/SQL sandboxes and ensure safe, non-production data.
- Train facilitators; provide answer keys and checklists.
- Schedule office hours and a discussion channel for questions.
- Track outcomes: completion, quiz improvement, capstone quality, and 60-day on-the-job application survey.
Job Aids (deliver as PDFs or wiki pages)
- Data quality checklist
- Metric definition template
- Chart chooser and design checklist
- Data handling and sharing quick guide
- SQL and spreadsheet formula cheatsheets
Common Pitfalls to Address Explicitly
- Mixing time zones or date formats
- Using inconsistent metric definitions across teams
- Joining at the wrong grain and inflating counts
- Ignoring missing data patterns
- Sharing data via unapproved channels
- Drawing causal claims from observational data
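Example (illustrative): a sketch of the first pitfall in the list, mixing time zones or date formats. The timestamp, the fixed UTC-5 offset, and the date string are hypothetical; real local zones also shift with daylight saving time, which this sketch ignores.

```python
from datetime import datetime, timedelta, timezone

# The same event, stored in UTC and viewed in a hypothetical fixed UTC-5 zone.
utc_event = datetime(2024, 3, 1, 2, 30, tzinfo=timezone.utc)
local_zone = timezone(timedelta(hours=-5))
local_event = utc_event.astimezone(local_zone)

print(utc_event.date())    # 2024-03-01
print(local_event.date())  # 2024-02-29

# The same text read under two date-format conventions.
text = "03/04/2024"
as_month_first = datetime.strptime(text, "%m/%d/%Y").date()
as_day_first = datetime.strptime(text, "%d/%m/%Y").date()

print(as_month_first)  # 2024-03-04
print(as_day_first)    # 2024-04-03
```

The event is one moment in time, yet it lands in March in one report and February in another, so daily and monthly totals disagree unless both use the same zone. The date string shows the same risk for formats: the file's convention has to come from the data dictionary, not from a guess.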
Tips for Inclusive and Accessible Learning
- Add alt text to charts and avoid color-only encoding
- Offer keyboard-accessible files and captions for recordings
- Provide examples from multiple functions (sales, product, HR, ops)
Suggested Schedule (example)
- Day 1: Modules 1–4 (with breaks)
- Day 2: Modules 5–8 + Capstone kickoff
- Week after: Capstone work + presentations
Next Steps for Learners
- Save the job aids; bookmark the data catalog and policies
- Identify one team metric to redefine clearly
- Schedule a short meeting with your data steward to understand available data and best practices