Create a Tutorial on the Scientific Process #270

Description

@chinaexpert1

Overview

Create a tutorial that introduces undergraduate students to the scientific process as a structured method for asking questions, testing claims, analyzing evidence, and refining conclusions. The tutorial should explain the scientific process as an iterative workflow rather than a rigid checklist.

Action Items

Research and document the major stages of the scientific process, with clear undergraduate-friendly explanations for each stage.

Identify the core concepts students should understand, including:

Observation
Research question
Hypothesis
Prediction
Experiment or study design
Data collection
Analysis
Interpretation
Replication
Peer review
Limits of inference

The tutorial should cover the following core steps:

  1. Observation

    • Explain how scientific work often begins with noticing a pattern, problem, gap, contradiction, or unexplained phenomenon.
    • Include examples from everyday life, civic data, public health, environmental science, or social science.
  2. Background Research

    • Explain why researchers review existing knowledge before designing a study.
    • Include how background research helps clarify definitions, avoid duplicate work, identify prior findings, and reveal open questions.
  3. Research Question

    • Explain how to turn a broad curiosity into a focused, answerable question.
    • Include examples of weak versus strong research questions.
  4. Hypothesis

    • Define a hypothesis as a testable explanation or proposed relationship, not simply a guess.
    • Explain the difference between a hypothesis, prediction, theory, and opinion.
  5. Prediction

    • Explain how predictions describe what we expect to observe if the hypothesis is correct.
    • Use “If the hypothesis is true, then we should observe…” as a suggested format.
  6. Study or Experiment Design

    • Explain how researchers decide what data is needed, what variables matter, and what comparison will be made.
    • Include key concepts such as independent variables, dependent variables, controls, confounding variables, sample size, and measurement quality.
  7. Data Collection

    • Explain how evidence is gathered through experiments, surveys, observations, public datasets, sensors, interviews, or simulations.
    • Emphasize documentation, consistency, and data quality.
  8. Analysis

    • Explain how researchers use statistics, visualization, and logical reasoning to evaluate evidence.
    • Include the idea that analysis should connect directly back to the original research question and hypothesis.
  9. Interpretation

    • Explain how researchers decide what the results mean and what they do not mean.
    • Emphasize uncertainty, limitations, alternative explanations, and the difference between correlation and causation.
  10. Conclusion

    • Explain how conclusions summarize what was learned, whether the evidence supports the hypothesis, and what questions remain.
    • Clarify that unsupported hypotheses are still scientifically valuable.
  11. Communication

    • Explain the importance of sharing methods, evidence, results, limitations, and conclusions clearly.
    • Include common formats such as reports, papers, presentations, dashboards, notebooks, and posters.
  12. Replication and Revision

    • Explain that scientific knowledge improves when studies are repeated, challenged, refined, or expanded.
    • Present the scientific process as a cycle: results often lead to better questions, improved methods, and new hypotheses.

Create a simple visual or written flow of the process:

Observation → Background Research → Research Question → Hypothesis → Prediction → Study Design → Data Collection → Analysis → Interpretation → Conclusion → Communication → Replication / Revision

Also include a short applied example that walks through the full process from beginning to end. A recommended example is:

  • Observation: Some neighborhoods appear to have slower 311 response times.
  • Background Research: Review how 311 requests are categorized and how response time is measured.
  • Research Question: Do 311 response times differ by neighborhood or council district?
  • Hypothesis: Some districts have longer median response times than others.
  • Prediction: If the hypothesis is correct, median response time for the same request type will differ across districts by more than chance variation alone would explain.
  • Study Design: Compare similar request types across districts over the same time period.
  • Data Collection: Use a sample of 311 service request records.
  • Analysis: Calculate median response time by district and request type.
  • Interpretation: Differences may exist, but request type, reporting volume, staffing, and seasonality may also explain the pattern.
  • Conclusion: The data may suggest unequal response times, but further analysis is needed before making a causal claim.
  • Communication: Present findings in a short report, chart, or dashboard.
  • Revision: Refine the question by controlling for request type, urgency, or time of year.
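
The Analysis step of the example above (median response time by district and request type) could be sketched roughly as follows. This is only an illustration under stated assumptions: the records are invented for the sketch and are not real 311 data, and the helper name `median_response_days` and the tuple layout are hypothetical. It uses only the Python standard library.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical records, invented purely for illustration (not real 311 data).
# Each tuple: (district, request_type, open_date, close_date)
records = [
    ("District 1", "Pothole", date(2024, 1, 2), date(2024, 1, 5)),
    ("District 1", "Pothole", date(2024, 1, 10), date(2024, 1, 15)),
    ("District 1", "Graffiti", date(2024, 2, 1), date(2024, 2, 3)),
    ("District 2", "Pothole", date(2024, 1, 3), date(2024, 1, 12)),
    ("District 2", "Pothole", date(2024, 1, 20), date(2024, 1, 31)),
    ("District 2", "Graffiti", date(2024, 2, 5), date(2024, 2, 7)),
]


def median_response_days(rows):
    """Return {(district, request_type): median days from open to close}."""
    groups = defaultdict(list)
    for district, request_type, opened, closed in rows:
        # Response time is measured here as whole days between open and close.
        groups[(district, request_type)].append((closed - opened).days)
    return {key: median(days) for key, days in groups.items()}


medians = median_response_days(records)
for (district, request_type), days in sorted(medians.items()):
    print(f"{district} | {request_type}: {days} days")
```

Grouping by request type as well as district keeps the comparison between similar requests, which is what the Study Design step asks for. How response time is actually defined in a given 311 dataset should be confirmed during Background Research.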

Document common misconceptions to avoid:

  • A hypothesis is not just a random guess.
  • One study rarely proves something permanently.
  • Correlation does not automatically mean causation.
  • A null or unexpected result is not a failed study.
  • Science is not always linear.
  • Data does not interpret itself.
  • Good conclusions must acknowledge uncertainty and limitations.
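
To make the correlation-versus-causation point concrete, the tutorial could include a small sketch like the one below. The numbers are invented for illustration and are not real 311 data; the district names and helper functions are hypothetical. In this toy setup both districts resolve each request type equally fast, yet their overall medians differ because the mix of request types differs.

```python
from statistics import median

# Hypothetical response times in days, invented for illustration only.
# Both districts resolve each request type equally fast; only the mix differs.
response_days = {
    "District A": {"Pothole": [2, 2, 2], "Tree trimming": [10]},
    "District B": {"Pothole": [2], "Tree trimming": [10, 10, 10]},
}


def overall_median(by_type):
    """Median over all requests in a district, ignoring request type."""
    return median(d for days in by_type.values() for d in days)


def per_type_medians(by_type):
    """Median within each request type."""
    return {rtype: median(days) for rtype, days in by_type.items()}


overall = {d: overall_median(t) for d, t in response_days.items()}
by_type = {d: per_type_medians(t) for d, t in response_days.items()}

print("Overall medians:", overall)
print("Per-type medians:", by_type)
```

Looking only at the overall medians, District B appears much slower. Within each request type the two districts are identical, so in this constructed example request type is a confounding variable rather than evidence of unequal service.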

The final deliverable should be a tutorial draft no longer than two pages, written for undergraduate students with little or no prior research experience.

Resources/Instructions

Suggested resources:

The tutorial should present the scientific process as an evidence-based reasoning cycle. It should be practical, clear, and connected to real research tasks students may encounter in data science, civic technology, social science, public policy, or laboratory science.

  • If this issue requires access to 311 data, please answer the following questions:

    • Do you need a one-time or ongoing dump of the data?

      • A one-time sample is sufficient.
    • Do you need a subset of data or the entire data set?

      • A subset is recommended. The full dataset is not needed.
    • If a subset is needed, please define subset characteristics.

      • Use a limited date range, such as 6–12 months, with fields for request type, open date, close date, status, location, neighborhood or council district, and request category.
    • Do you need online access via an API or a download of data?

      • A CSV download is preferred for tutorial purposes. API access is optional if the tutorial includes a live data retrieval section.
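
As a rough sketch of how the subset described above might be cut from a CSV download, the example below filters rows to a date range and keeps only selected fields. Everything specific here is an assumption for illustration: the column names, the sample rows, and the `load_subset` helper are hypothetical and do not reflect a real 311 schema. Only the Python standard library is used.

```python
import csv
import io
from datetime import date

# Hypothetical CSV excerpt; the column names are assumptions, not a real 311 schema.
SAMPLE_CSV = """request_id,request_type,open_date,close_date,status,council_district
1,Pothole,2023-11-15,2023-11-20,Closed,3
2,Graffiti,2024-02-01,2024-02-04,Closed,5
3,Pothole,2024-05-10,,Open,3
4,Streetlight,2024-08-22,2024-08-30,Closed,7
"""

KEEP_FIELDS = ["request_type", "open_date", "close_date", "status", "council_district"]


def load_subset(csv_text, start, end):
    """Keep rows opened within [start, end] and only the fields listed above."""
    subset = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        opened = date.fromisoformat(row["open_date"])
        if start <= opened <= end:
            subset.append({field: row[field] for field in KEEP_FIELDS})
    return subset


# A six-month window, in line with the suggested 6-12 month range.
rows = load_subset(SAMPLE_CSV, date(2024, 1, 1), date(2024, 6, 30))

# Requests that are still open have no close date, so they have no response time yet.
closed_rows = [r for r in rows if r["close_date"]]
print(len(rows), "rows in range;", len(closed_rows), "closed")
```

With a real download, `io.StringIO(csv_text)` would be replaced by an opened file. The tutorial should also say explicitly how still-open requests are handled, since dropping them can bias response-time estimates.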
