Manual raw data entry
Add up to 50 paired observations. The predictor can be numeric (continuous) or categorical labels.
| Row | Predictor (X) | Outcome (Y) |
|---|---|---|
Fit and interpret simple linear regression models for marketing data, using either a single continuous predictor or a single binary (two-level) categorical predictor.
Bivariate linear regression estimates how an outcome \(Y\) changes, on average, with one predictor \(X\). In marketing contexts, \(Y\) might be revenue per user, time on site, or order value, while \(X\) could be a continuous metric (spend, impressions, visits) or a binary condition (control vs. treatment).
Model: $$ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i $$
Test statistic for slope: $$ t = \frac{\hat{\beta}_1}{\mathrm{SE}(\hat{\beta}_1)} \quad\text{with}\quad \nu = n - 2 $$
Confidence interval for slope: $$ \hat{\beta}_1 \pm t_{\alpha/2,\nu}\,\mathrm{SE}(\hat{\beta}_1) $$
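The formulas above can be sketched in a few lines of plain Python. This is a minimal illustration, not the tool's actual implementation: the function names `fit_simple_ols` and `slope_ci` are hypothetical, the data are made-up illustrative numbers, and the critical value \(t_{\alpha/2,\nu}\) is passed in by hand (3.182 is the two-sided 5% value for \(\nu = 3\)) rather than computed.

```python
import math


def fit_simple_ols(x, y):
    """Least-squares fit of y = b0 + b1 * x with the slope's SE and t statistic."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("predictor has no variation")
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    df = n - 2  # nu = n - 2, as in the test statistic above
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(sse / df / sxx)
    t_stat = b1 / se_b1 if se_b1 > 0 else math.inf
    return {"b0": b0, "b1": b1, "se_b1": se_b1, "t": t_stat, "df": df}


def slope_ci(fit, t_crit):
    """Two-sided interval: b1 +/- t_crit * SE(b1). t_crit comes from a t table."""
    half_width = t_crit * fit["se_b1"]
    return fit["b1"] - half_width, fit["b1"] + half_width


# Illustrative (made-up) data: ad spend vs. revenue.
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
revenue = [2.1, 3.9, 6.2, 7.8, 10.1]

fit = fit_simple_ols(spend, revenue)
lower, upper = slope_ci(fit, t_crit=3.182)  # assumed: alpha = 0.05, nu = 3
print(fit)
print(lower, upper)
```

Passing the critical value in keeps the sketch free of third-party dependencies; in practice a statistics library would supply it from the t distribution.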
When \(X\) is continuous, \(\hat{\beta}_1\) represents the expected change in \(Y\) for a one-unit change in \(X\). When \(X\) is a binary group indicator (e.g., 0 = control, 1 = treatment), \(\hat{\beta}_1\) is the mean difference between groups, and \(\hat{\beta}_0\) is the mean for the baseline group.
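The binary case can be checked numerically. The sketch below uses made-up order values and a hypothetical helper `ols_line`; it shows that with a 0/1 indicator, the fitted slope matches the difference in group means and the intercept matches the baseline mean.

```python
from statistics import mean


def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Illustrative (made-up) data: 0 = control, 1 = treatment.
group = [0, 0, 0, 1, 1, 1]
order_value = [20.0, 22.0, 24.0, 25.0, 27.0, 29.0]

intercept, slope = ols_line(group, order_value)
control_mean = mean(v for g, v in zip(group, order_value) if g == 0)
treatment_mean = mean(v for g, v in zip(group, order_value) if g == 1)

print(intercept, control_mean)                  # intercept vs. baseline mean
print(slope, treatment_mean - control_mean)     # slope vs. mean difference
```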
Use presets to explore realistic use cases, such as ad spend vs. revenue or control vs. treatment on order value. Each scenario can expose either a summary CSV of aggregated statistics or a raw data file that you can download, edit in Excel, and re-upload.
Upload raw case-level data. The file should have exactly two columns: a predictor and an outcome. Text/string predictors are treated as categorical; numeric predictors can be interpreted as continuous or categorical based on the option you choose below.
Drag & Drop raw data file (.csv, .tsv, .txt)
Two columns with headers, such as Predictor,Outcome. Numeric predictors can be treated as continuous or categorical.
No raw file uploaded.
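As a rough sketch of the upload rules described above, the following reads a two-column file with a header row and labels the predictor as categorical when any value is not numeric. The function name `load_pairs` and the exact validation behavior are assumptions for illustration; the tool's own parser is not shown in this page.

```python
import csv
import io


def load_pairs(text, delimiter=","):
    """Parse two-column text with a header; return (header, xs, ys, kind)."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    rows = [r for r in reader if r and any(cell.strip() for cell in r)]
    if not rows:
        raise ValueError("file is empty")
    header, body = rows[0], rows[1:]
    if len(header) != 2:
        raise ValueError("expected exactly two columns")
    xs_raw, ys = [], []
    for r in body:
        if len(r) != 2:
            raise ValueError("every row must have exactly two values")
        xs_raw.append(r[0].strip())
        ys.append(float(r[1]))  # the outcome must be numeric
    try:
        xs = [float(v) for v in xs_raw]
        kind = "numeric"  # may still be treated as categorical by user choice
    except ValueError:
        xs = xs_raw
        kind = "categorical"  # text predictors are always categorical
    return header, xs, ys, kind


# Illustrative (made-up) file contents.
sample_text = "Predictor,Outcome\ncontrol,20\ntreatment,25\n"
sample_numeric = "Predictor\tOutcome\n1.5\t20\n2.5\t25\n"

print(load_pairs(sample_text))
print(load_pairs(sample_numeric, delimiter="\t"))
```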
Set the significance level for hypothesis tests on the slope and the corresponding confidence intervals.
Choose whether the slope test is two-sided or directional. Two-sided is the safest default; a one-sided test is appropriate only when the direction of interest was fixed before looking at the data and an effect in the opposite direction would not change your decision.
Directional tests adjust the reported p-value and decision at your chosen alpha but keep confidence intervals two-sided for now.
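One common way to obtain a directional p-value is to convert the two-sided one, which works because the t distribution is symmetric. The sketch below assumes that approach; the function name `one_sided_p` and the direction labels `"greater"` and `"less"` are hypothetical, and the tool may compute its directional p-value differently.

```python
def one_sided_p(two_sided_p, t_stat, direction):
    """Convert a two-sided p-value to a one-sided one.

    direction: "greater" tests slope > 0, "less" tests slope < 0.
    """
    if direction not in ("greater", "less"):
        raise ValueError("direction must be 'greater' or 'less'")
    half = two_sided_p / 2
    if direction == "greater":
        in_direction = t_stat > 0
    else:
        in_direction = t_stat < 0
    # An estimate on the wrong side of zero gives a large one-sided p-value.
    return half if in_direction else 1 - half


alpha = 0.05
p_up = one_sided_p(0.04, t_stat=2.5, direction="greater")
p_down = one_sided_p(0.04, t_stat=2.5, direction="less")
print(p_up, p_up < alpha)
print(p_down, p_down < alpha)
```

Note that a slope in the unexpected direction can never be significant under a directional test, which is why two-sided remains the safer default.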
Large single-point outliers can dominate a simple regression. Use this setting as a reminder to run sensitivity checks and inspect diagnostics.
When checked, observations where either variable has a z-score above about 3.5 in absolute value are removed before estimating the line and reporting results. Use this when a few extreme points are clearly distorting an otherwise stable relationship.
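A minimal sketch of this kind of filter follows. It assumes z-scores are computed with the sample standard deviation and uses a hard cutoff of 3.5; the tool's exact rule ("about 3.5") and its choice of standard deviation are not specified here, so treat both as assumptions. The function name `drop_outliers` is hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Z-scores using the sample standard deviation (an assumption)."""
    m, s = mean(values), stdev(values)
    return [0.0 if s == 0 else (v - m) / s for v in values]


def drop_outliers(x, y, threshold=3.5):
    """Keep only pairs where both |z_x| and |z_y| are within the threshold."""
    zx, zy = z_scores(x), z_scores(y)
    kept = [
        (xi, yi)
        for xi, yi, a, b in zip(x, y, zx, zy)
        if abs(a) <= threshold and abs(b) <= threshold
    ]
    return [p[0] for p in kept], [p[1] for p in kept]


# Illustrative (made-up) data: 19 ordinary outcomes and one extreme value.
x_raw = [float(i) for i in range(1, 21)]
y_raw = [10.0] * 19 + [500.0]

x_kept, y_kept = drop_outliers(x_raw, y_raw)
print(len(x_raw), len(x_kept))
```

With small samples a single point cannot reach a z-score of 3.5 at all (the maximum is \((n-1)/\sqrt{n}\) under the sample standard deviation), so this filter only has an effect once there are enough observations.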
Log scales can turn curved relationships into straighter lines and make coefficients easier to interpret as approximate percentage effects, but they require strictly positive values.
When a log option is checked, rows with non-positive values in the corresponding variable are dropped: log X keeps only rows with X > 0, and log Y keeps only rows with Y > 0. Use these transforms when the diagnostics or plots suggest a curved, multiplicative, or heavy-tailed relationship on the original scale.
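The row-dropping rule can be sketched as follows. The natural logarithm is an assumption (the page does not state the base), and the function name `log_transform` is hypothetical.

```python
import math


def log_transform(x, y, log_x=False, log_y=False):
    """Apply natural logs to the selected variables, dropping rows
    whose value is non-positive in any variable being logged."""
    out_x, out_y = [], []
    for xi, yi in zip(x, y):
        if log_x and xi <= 0:
            continue  # log X requested but X is not strictly positive
        if log_y and yi <= 0:
            continue  # log Y requested but Y is not strictly positive
        out_x.append(math.log(xi) if log_x else xi)
        out_y.append(math.log(yi) if log_y else yi)
    return out_x, out_y


# Illustrative (made-up) data with one non-positive value in each variable.
x_vals = [0.0, 1.0, math.e, 10.0]
y_vals = [5.0, -2.0, 3.0, 4.0]

tx, ty = log_transform(x_vals, y_vals, log_x=True)
bx, by = log_transform(x_vals, y_vals, log_x=True, log_y=True)
print(len(tx), len(bx))
```

Checking how many rows survive is worthwhile before interpreting a log-scale fit, since dropped rows can change the sample the slope describes.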
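The standardized slope rescales both X and Y to standard deviation units: it is the expected change in Y, in standard deviations, for a one-standard-deviation change in X. Because it is unit-free, it helps compare effect sizes across predictors measured on different scales, including results from other tools that report the same statistic. With a single predictor, it equals the Pearson correlation between X and Y.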
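As a quick numerical illustration, the standardized slope can be computed as the raw slope times the ratio of standard deviations, \(\hat{\beta}_1 \cdot s_X / s_Y\). In a one-predictor model this coincides with the Pearson correlation, which the sketch below checks on made-up data. The function names are hypothetical.

```python
import math
from statistics import mean, stdev


def raw_slope(x, y):
    """Least-squares slope of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def standardized_slope(x, y):
    """Slope after rescaling both variables to standard deviation units."""
    return raw_slope(x, y) * stdev(x) / stdev(y)


def pearson_r(x, y):
    """Pearson correlation, computed directly for comparison."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Illustrative (made-up) data.
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
revenue = [2.1, 3.9, 6.2, 7.8, 10.1]

beta_std = standardized_slope(spend, revenue)
r = pearson_r(spend, revenue)
print(beta_std, r)
```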
Each point is one case: x-axis is the model’s fitted (predicted) value, y-axis is the actual outcome. Points close to the 45° line indicate good predictions; systematic curves or funnels suggest model misspecification or unequal variance.
Provide data to see the fitted regression equation.
The downloaded file includes your raw paired data plus two columns: y_fitted (predicted outcome from the fitted regression line) and residual (actual minus predicted) for each observation.
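The two added columns can be reproduced with a short sketch. The column names `y_fitted` and `residual` come from the description above; the `Predictor` and `Outcome` header names, the function names, and the exact file layout are assumptions for illustration.

```python
import csv
import io


def add_fit_columns(x, y, header=("Predictor", "Outcome")):
    """Return rows of [x, y, y_fitted, residual] preceded by a header row."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    rows = [[*header, "y_fitted", "residual"]]
    for xi, yi in zip(x, y):
        y_fitted = b0 + b1 * xi
        rows.append([xi, yi, y_fitted, yi - y_fitted])  # actual minus predicted
    return rows


def to_csv(rows):
    """Serialize rows to CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Illustrative (made-up) data.
rows = add_fit_columns([1.0, 2.0, 3.0, 4.0, 5.0], [2.1, 3.9, 6.2, 7.8, 10.1])
csv_text = to_csv(rows)
print(csv_text)
```

Because the model includes an intercept, the residuals in the last column sum to zero up to rounding, which is a handy sanity check on any exported file.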
| Variable | Mean | Median | Std. Dev. | Min | Max |
|---|---|---|---|---|---|
| Provide data to see numeric summaries. | | | | | |
| Predictor | Level | Percent |
|---|---|---|
| Provide data to see level percentages. | | |
R-squared: Share of outcome variation explained by this single predictor. Higher means the line fits the data better.
Slope & t / p-value: Slope shows the change in outcome for a one-unit change in the predictor (or the difference vs. the reference group for a binary predictor). A small p-value (below alpha) means the slope is statistically distinguishable from zero at the chosen significance level.
Intercept: Expected outcome when the predictor is zero (or at the reference group).
Model p-value: Overall test of whether the predictor explains any variance in the outcome (same as the slope test for a one-predictor model).
RMSE / MAE: Typical prediction error size. RMSE weights big errors more; MAE is a straight average of absolute errors.
Residual SE: Standard deviation of residuals (uses degrees of freedom), a classic “standard error of the regression.”
Decision (alpha): An easy yes/no cue based on the chosen significance level.
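The fit metrics defined above can be sketched from actual and fitted values. This illustration assumes RMSE and MAE divide by \(n\) while the residual SE divides by the degrees of freedom \(n - 2\); the tool's exact denominators are not stated, so treat that split as an assumption. The function name `fit_metrics` is hypothetical and the numbers are made up.

```python
import math


def fit_metrics(y, y_fitted):
    """R-squared, RMSE, MAE, and residual SE for a one-predictor model."""
    n = len(y)
    residuals = [yi - fi for yi, fi in zip(y, y_fitted)]
    sse = sum(e * e for e in residuals)
    y_bar = sum(y) / n
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return {
        "r_squared": 1 - sse / sst,            # share of variation explained
        "rmse": math.sqrt(sse / n),            # penalizes large errors more
        "mae": sum(abs(e) for e in residuals) / n,
        "residual_se": math.sqrt(sse / (n - 2)),  # uses degrees of freedom
    }


# Illustrative (made-up) actual and fitted values.
actual = [2.1, 3.9, 6.2, 7.8, 10.1]
fitted = [2.04, 4.03, 6.02, 8.01, 10.0]

metrics = fit_metrics(actual, fitted)
print(metrics)
```

MAE can never exceed RMSE, and under these denominators RMSE can never exceed the residual SE, so the three values always appear in that order.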
| Parameter | Level / Term | Estimate | Standard Error | t | p-value | Lower Bound | Upper Bound |
|---|---|---|---|---|---|---|---|
| Provide summary statistics or upload data to see parameter estimates. | | | | | | | |
Run the analysis to see checks on linearity, influential points, variance patterns, and normality of residuals. Use these as prompts for plots and follow-up modeling, not as strict pass/fail gates.