Multiple Regression with Interactions & Non-Linear Effects

Advanced

Build on multiple regression by adding interaction terms (moderation effects) and quadratic terms (non-linear relationships). Understand how the effect of one predictor depends on another, or how relationships curve rather than staying linear.

OVERVIEW & CONCEPTS

This tool extends multiple linear regression to handle two powerful concepts:

  1. Interactions (Moderation): When the effect of predictor X₁ on outcome Y depends on the level of another predictor X₂
  2. Non-linear effects (Quadratic): When a predictor's relationship with the outcome is curved (e.g., inverted U-shape)

Interaction Effects

With interaction term: $$ Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 (X_{1i} \times X_{2i}) + \varepsilon_i $$

The interaction coefficient (\(\beta_3\)) tells you how much the effect of X₁ on Y changes for each one-unit increase in X₂. Key insight: A statistically significant \(\beta_3\) indicates that the relationship between X₁ and Y is not constant; it varies with the level of X₂.
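To make this concrete, the effect of X₁ at a given level of X₂ can be read off by differentiating the interaction model with respect to X₁. This is a standard rearrangement of the equation above, not something specific to this tool:

$$ \frac{\partial Y}{\partial X_1} = \beta_1 + \beta_3 X_2 $$

As an illustration with made-up values, if \(\beta_1 = 2\) and \(\beta_3 = 0.5\), the slope of X₁ is 2 when X₂ = 0 and \(2 + 0.5 \times 4 = 4\) when X₂ = 4.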

Quadratic Effects

With quadratic term: $$ Y_i = \beta_0 + \beta_1 X_{i} + \beta_2 X_{i}^2 + \varepsilon_i $$

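The quadratic coefficient (\(\beta_2\)) captures curvature. If \(\beta_2 < 0\), the curve is an inverted U (Y rises, then falls). If \(\beta_2 > 0\), it is a U-shape (Y falls, then rises). The full rise-and-fall pattern appears only when the turning point lies inside the observed range of X; otherwise the curve simply flattens or steepens across the data. Business relevance: Find optimal points (e.g., ideal price, optimal ad frequency).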

Why Interactions Matter

Real-world effects rarely operate in isolation. Moderation (a statistical interaction) captures the fact that context matters:

  • Ad spend might drive revenue more during holidays than off-season
  • Price increases might hurt sales for low-quality products but enhance prestige for luxury items
  • Training programs might boost performance for junior employees but have little effect on seasoned veterans

Managerial implication: One-size-fits-all strategies miss opportunities. If an interaction is significant, you should tailor your approach based on the moderator.

Why Centering Matters for Interactions

When including interaction terms, mean-centering continuous predictors is recommended (and enabled by default in this tool). Here's why:

  • Interpretability: After centering, "main effects" represent effects when the other variable is at its mean (not at zero, which might be meaningless)
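  • Multicollinearity reduction: Interaction terms are highly correlated with their component predictors. Centering reduces this correlation, which stabilizes the main-effect estimates; it does not change the interaction coefficient or the overall model fit
  • Focal vs. moderator distinction: With centered predictors, the coefficient of the "focal predictor" (whose effect you're studying) reads directly as its effect at the average level of the "moderator" (what changes that effect)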

Example: If studying ad_spend × seasonality, centering ad_spend means the season coefficients represent effects "at average ad spend levels," not at zero spend (which never happens).
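The multicollinearity point can be checked numerically. The sketch below is only an illustration on simulated data: the variable names, ranges, and the use of a continuous moderator are assumptions, not the tool's own code. It compares the correlation between a predictor and its product term before and after mean-centering.

```python
import random

def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def center(values):
    """Subtract the mean so the variable has mean zero."""
    m = sum(values) / len(values)
    return [v - m for v in values]

# Hypothetical simulated predictors (names and ranges are made up).
rng = random.Random(0)
ad_spend = [rng.uniform(50, 150) for _ in range(500)]
moderator = [rng.uniform(10, 30) for _ in range(500)]

# Product term built from raw values vs. from centered values.
raw_product = [a * m for a, m in zip(ad_spend, moderator)]
ad_c, mod_c = center(ad_spend), center(moderator)
centered_product = [a * m for a, m in zip(ad_c, mod_c)]

r_raw = pearson(ad_spend, raw_product)
r_centered = pearson(ad_c, centered_product)
print(f"raw: {r_raw:.2f}, centered: {r_centered:.2f}")
```

With predictors that sit far from zero, the raw product term tracks its components closely, while the centered product term is close to uncorrelated with them.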

Advanced users can disable centering, but interpretations become more complex.

Simple Slopes & Probing Interactions

When an interaction is significant, the next step is simple slopes analysis: testing whether the focal predictor's effect is significant at different levels of the moderator.

For continuous moderators, we conventionally test at three levels:

  • Low: Moderator at 1 standard deviation below its mean (mean − 1 SD)
  • Average: Moderator at its mean
  • High: Moderator at 1 standard deviation above its mean (mean + 1 SD)

For categorical moderators, we test the focal predictor effect within each category separately.
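For a continuous moderator, the three simple slopes follow from the interaction model: the slope of the focal predictor at moderator value m is \(\beta_1 + \beta_3 m\). The sketch below computes only these point estimates. The function name is hypothetical, the sample standard deviation (n − 1) is an assumption, and testing each slope for significance would additionally need the coefficient covariance matrix, which is not shown here.

```python
from statistics import mean, stdev

def simple_slopes(b_focal, b_interaction, moderator_values):
    """Slope of the focal predictor at low, average, and high moderator levels.

    moderator_values must be on the same scale used to fit the model
    (i.e. the centered values if centering was applied).
    """
    m = mean(moderator_values)
    sd = stdev(moderator_values)  # assumption: sample SD (n - 1 denominator)
    levels = {"low": m - sd, "average": m, "high": m + sd}
    return {name: b_focal + b_interaction * value for name, value in levels.items()}

# Made-up coefficients and moderator values, for illustration only.
slopes = simple_slopes(b_focal=2.0, b_interaction=0.5, moderator_values=[1, 2, 3, 4, 5])
for name, slope in slopes.items():
    print(f"{name}: {slope:.3f}")
```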

This tool visualizes these simple slopes in the interaction plots, making it easy to see where effects are strong vs. weak.

Quadratic Terms & Finding Optimal Points

Quadratic effects capture non-monotonic relationships—where "more is better" only up to a point, then becomes "too much of a good thing."

The turning point (maximum or minimum) occurs at: $$ X^* = -\frac{\beta_1}{2 \beta_2} $$
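The formula comes from setting the slope of the quadratic model to zero and solving for X:

$$ \frac{dY}{dX} = \beta_1 + 2 \beta_2 X = 0 \quad \Rightarrow \quad X^* = -\frac{\beta_1}{2 \beta_2} $$

As an illustration with made-up values, \(\beta_1 = 12\) and \(\beta_2 = -0.5\) give \(X^* = -12 / (2 \times -0.5) = 12\); because \(\beta_2 < 0\), this point is a maximum.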

Business applications:

  • Optimal pricing: Too low = leaving money on table; too high = driving customers away
  • Ideal ad frequency: Too few = insufficient awareness; too many = annoyance and fatigue
  • Perfect difficulty level: Too easy = boredom; too hard = frustration

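Interpretation note: Check that the optimal point falls within your observed data range. Extrapolating beyond observed values is risky. If the predictor was mean-centered, X* is expressed in centered units; add the predictor's mean back to report it on the original scale.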
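A minimal sketch of this check, assuming you already have the fitted coefficients and the observed range of the predictor. The function name and the `x_mean` parameter (for converting a centered result back to the original scale) are hypothetical and not part of the tool.

```python
def turning_point(b1, b2, x_min, x_max, x_mean=0.0):
    """Locate the turning point of Y = b0 + b1*X + b2*X^2 and check its range.

    x_min and x_max are the observed range on the original scale.
    x_mean is the predictor mean if X was centered before fitting, else 0.
    """
    if b2 == 0:
        raise ValueError("b2 is zero: the model is linear and has no turning point")
    x_star = -b1 / (2 * b2) + x_mean  # shift back to the original scale
    return {
        "x_star": x_star,
        "kind": "maximum" if b2 < 0 else "minimum",
        "within_observed_range": x_min <= x_star <= x_max,
    }

# Made-up coefficients: an inverted U with its peak inside the observed range.
inside = turning_point(b1=12.0, b2=-0.5, x_min=0.0, x_max=20.0)
# Same curve, but the data stop before the peak, so the optimum is extrapolated.
outside = turning_point(b1=12.0, b2=-0.5, x_min=0.0, x_max=10.0)
print(inside)
print(outside)
```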

Tool Limitation: One Interaction/Quadratic at a Time

For educational clarity, this tool restricts you to one interaction or quadratic effect per model. This is not a limitation of regression in general—real models often include multiple interactions.

Why this restriction helps learning:

  • Focuses attention on understanding one moderation or non-linear effect deeply
  • Keeps visualizations clear and interpretable
  • Reduces the risk of overfitting when sample sizes are limited
  • Teaches principles that extend to more complex models

For professional analysis requiring multiple interactions or three-way interactions, use statistical software like R, Python, SPSS, or Stata.
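As a rough illustration of what a model with several such terms looks like outside this tool, the sketch below fits two interactions and one quadratic term by ordinary least squares using only the Python standard library. The data are simulated without noise, and the variable names and coefficients are made up. In practice a dedicated package (for example statsmodels in Python) would also report standard errors and p-values, which this sketch does not.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def design_row(x1, x2, x3):
    """Intercept, three main effects, x1*x2, x1*x3, and x1 squared."""
    return [1.0, x1, x2, x3, x1 * x2, x1 * x3, x1 ** 2]

# Simulated, noise-free data with hypothetical coefficients.
rng = random.Random(1)
true_beta = [1.0, 2.0, -1.0, 0.5, 0.8, -0.3, 0.4]
rows = [design_row(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
        for _ in range(200)]
y = [sum(b * v for b, v in zip(true_beta, row)) for row in rows]

estimates = fit_ols(rows, y)
print([round(e, 3) for e in estimates])
```

Because the simulated outcome has no noise, the estimates should match the coefficients used to generate it; real data would not recover them exactly.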

SCENARIOS

Use presets to explore realistic scenarios demonstrating interactions and non-linear effects in marketing, pricing, and gaming contexts. Each scenario can be downloaded, edited in Excel, and re-uploaded.

INPUTS & SETTINGS

Upload Raw Data File

Upload a delimited text file (.csv, .tsv, or .txt) with raw case-level data. Include one outcome variable and multiple predictors (numeric or categorical).

Drag & Drop raw data file (.csv, .tsv, .txt)

Include a header row, one numeric outcome, and at least two predictors.

Confidence Level & Reporting

Set the significance level (alpha) for hypothesis tests. Confidence intervals use the matching confidence level, 1 − alpha (for example, alpha = 0.05 gives 95% intervals).

Advanced Analysis Settings

Mean-centering continuous predictors improves interpretability and reduces multicollinearity among the model terms. Main effects then represent effects "at average levels" of the other variables. Centering is enabled by default; disable it only if you have a specific reason.

Show or hide the shaded confidence bands around the predicted lines. The bands help you assess the uncertainty in the simple slopes.

VISUAL OUTPUT

Actual vs. Fitted

Interpretation Aid

Each point compares an observed outcome with its predicted value. The closer the points lie to the 45-degree line, the better the fit. Systematic departures from the line (for example, a curved band of points) suggest that the model, including any interaction or quadratic term, is missing structure in the data.

Interaction / Effect Plot

Interpretation Aid

SUMMARY STATISTICS

Outcome & Continuous Predictors

Variable | Mean | Median | Std. Dev. | Min | Max
Provide data to see summary statistics.

Categorical Predictors (% by level)

Predictor | Level | Percent
Provide data to see level percentages.

TEST RESULTS

Regression Equation

Provide data to see the fitted regression equation.

R-squared:
Adj. R-squared:
Model F:
Model p-value:
RMSE:
MAE:
Sample size (n):
Alpha:
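The fit statistics listed above can be reproduced from the observed and fitted values. The sketch below uses the textbook definitions. It is an assumption that the tool computes them the same way; in particular, RMSE here divides by n, whereas some software divides by the residual degrees of freedom. The function name is hypothetical.

```python
from math import sqrt

def fit_metrics(actual, fitted, n_predictors):
    """R-squared, adjusted R-squared, RMSE, and MAE for a fitted model.

    n_predictors counts the model terms excluding the intercept
    (an interaction or quadratic term counts as one term).
    """
    n = len(actual)
    y_bar = sum(actual) / n
    residuals = [a - f for a, f in zip(actual, fitted)]
    ss_res = sum(e * e for e in residuals)            # unexplained variation
    ss_tot = sum((a - y_bar) ** 2 for a in actual)    # total variation
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)
    return {
        "r2": r2,
        "adj_r2": adj_r2,
        "rmse": sqrt(ss_res / n),  # assumption: divide by n, not residual df
        "mae": sum(abs(e) for e in residuals) / n,
    }

# Made-up observed and fitted values, for illustration only.
metrics = fit_metrics(
    actual=[1.0, 2.0, 3.0, 4.0, 5.0],
    fitted=[1.1, 1.9, 3.2, 3.8, 5.0],
    n_predictors=1,
)
print(metrics)
```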

APA-Style Statistical Reporting

Managerial Interpretation

Coefficient Estimates

Predictor | Term | Estimate | Std. Error | t | p-value | Lower CI | Upper CI
Provide data to see coefficient estimates.

Coefficient Interpretation Guide

DIAGNOSTICS & ASSUMPTIONS

Diagnostics & Assumption Checks

Run the analysis to see checks on multicollinearity, residual patterns, and model fit.

Residuals vs. Fitted