Linear regression is a fundamental technique in machine learning and statistics. It models the relationship between a dependent variable (y) and one or more independent variables (x) by fitting a straight line through the data points. The equation of this line is typically written as:
y = w₁x₁ + w₂x₂ + ... + b
where each wᵢ is the weight (the slope for input xᵢ), and b is the bias (the y-intercept).
Suppose we want to predict the price of a house (y) based on its size (x₁ = 50 m²) and number of bedrooms (x₂ = 2).
Assume our neuron has learned the weights: w₁ = 2,000, w₂ = 10,000, and b = 5,000.
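The prediction for this house can be worked out directly from the equation above. The sketch below uses the weights, bias, and inputs from the example; the helper name `predict` is an assumption made for illustration.

```javascript
// Weights and bias taken from the example in the text.
const weights = [2000, 10000]; // w1 (per m² of size), w2 (per bedroom)
const bias = 5000;             // b

// Hypothetical helper: weighted sum of the inputs plus the bias.
function predict(inputs, weights, bias) {
  return inputs.reduce((sum, x, i) => sum + weights[i] * x, bias);
}

// x1 = 50 m², x2 = 2 bedrooms
const price = predict([50, 2], weights, bias);
console.log(price); // 2000*50 + 10000*2 + 5000 = 125000
```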
This is how a single neuron can be used for linear regression: it learns weights for each input and a bias to make predictions.
Enter your data points below (one (x, y) pair per line, separated by a comma), then train the single neuron to fit a line:
TensorFlow.js is an open-source library that enables machine learning directly in the browser or in Node.js using JavaScript. With TensorFlow.js, developers can train and run machine learning models without needing Python or server-side computation. This is particularly useful for interactive web-based applications, real-time inference, and privacy-preserving computation since all processing can occur on the client side.
In this demo, we use TensorFlow.js to create and train a basic model: a single neuron performing linear regression.
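To make the idea of "training a single neuron" concrete, here is a minimal library-free sketch of the kind of gradient-descent update that an optimizer such as SGD performs. It is not the TensorFlow.js API and not necessarily how the demo is implemented; the function name `trainNeuron`, the learning rate, the epoch count, and the sample points are all assumptions.

```javascript
// Hypothetical helper: fit y = w*x + b by gradient descent on mean squared error.
function trainNeuron(points, { learningRate = 0.01, epochs = 1000 } = {}) {
  let w = 0;
  let b = 0;
  const n = points.length;
  for (let epoch = 0; epoch < epochs; epoch++) {
    let gradW = 0;
    let gradB = 0;
    for (const [x, y] of points) {
      const error = w * x + b - y;   // prediction minus target
      gradW += (2 / n) * error * x;  // d(MSE)/dw
      gradB += (2 / n) * error;      // d(MSE)/db
    }
    // Step against the gradient to reduce the error.
    w -= learningRate * gradW;
    b -= learningRate * gradB;
  }
  return { w, b };
}

// Made-up points that lie exactly on the line y = 2x + 1.
const samplePoints = [[0, 1], [1, 3], [2, 5], [3, 7]];
const fitted = trainNeuron(samplePoints, { learningRate: 0.05, epochs: 2000 });
console.log(fitted); // w close to 2, b close to 1
```

With noisy real-world data the fitted line will not pass through every point; the learning rate and number of epochs usually need some tuning.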
These libraries paved the way for interactive, browser-based machine learning and visualization tools that are now common in education and research.
The dataset from carsData.json contains information about various car models, including attributes such as miles per gallon (mpg) and horsepower. This dataset is often used for regression tasks and demonstration of machine learning techniques in JavaScript tutorials.
The data may contain missing values or non-numeric entries. Before using it for training models, it is important to clean the dataset by removing records with incomplete or invalid data.
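The cleaning step could look roughly like the sketch below. It assumes each record is an object with `Miles_per_Gallon` and `Horsepower` fields; the sample records, the helper name `cleanCars`, and the output field names `mpg` and `horsepower` are illustrative assumptions rather than the demo's actual code.

```javascript
// Hypothetical raw records; the real layout of carsData.json is assumed here.
const rawCars = [
  { Name: "car a", Miles_per_Gallon: 18, Horsepower: 130 },
  { Name: "car b", Miles_per_Gallon: null, Horsepower: 165 },  // missing value
  { Name: "car c", Miles_per_Gallon: 25, Horsepower: "n/a" },  // non-numeric entry
  { Name: "car d", Miles_per_Gallon: 31, Horsepower: 70 },
];

// Keep only the two fields of interest, then drop any record
// where either field is not a finite number.
function cleanCars(records) {
  return records
    .map((car) => ({ mpg: car.Miles_per_Gallon, horsepower: car.Horsepower }))
    .filter((car) => Number.isFinite(car.mpg) && Number.isFinite(car.horsepower));
}

const cleanedSample = cleanCars(rawCars);
console.log(cleanedSample.length); // 2 of the 4 sample records survive
```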
Now that the dataset is cleaned and available as cleanedData, we can visualize the relationship between Horsepower and Miles_per_Gallon. This scatter plot will help us understand the data points that will be used for training our model.
To evaluate a machine learning model fairly, it's important to train it on a portion of the data and test it on data it hasn't seen. The dataset is typically split into a training set (for model learning) and a testing set (for evaluation). A common split is 70% for training and 30% for testing.
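One way to produce such a split is to shuffle a copy of the data and cut it at the 70% mark. The following is a sketch only; the helper name `trainTestSplit` and its `trainFraction` parameter are assumptions, not part of the demo.

```javascript
// Hypothetical helper: shuffle a copy of the data, then cut it into two parts.
function trainTestSplit(data, trainFraction = 0.7) {
  const shuffled = [...data];
  // Fisher-Yates shuffle so the split does not depend on the original order.
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  const cut = Math.round(shuffled.length * trainFraction);
  return { train: shuffled.slice(0, cut), test: shuffled.slice(cut) };
}

// Ten placeholder items standing in for cleaned records.
const items = Array.from({ length: 10 }, (_, i) => i);
const split = trainTestSplit(items);
console.log(split.train.length, split.test.length); // 7 3
```

Shuffling before cutting matters when the source file is sorted, for example by model year, because an unshuffled split would then test on a different kind of car than it trained on.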
Neural networks are powerful tools for modeling complex nonlinear relationships in data. Here, we’ll create a simple artificial neural network (ANN) that learns to predict Miles_per_Gallon from Horsepower using the training data we prepared earlier.
An artificial neural network (ANN) is inspired by biological brains. It consists of interconnected layers of simple processing units called neurons. Each neuron receives inputs, applies weights, computes a sum, and then applies an activation function to determine its output.
An example of a simple neural network with an input, a hidden, and an output layer.
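The computation inside a single neuron, as described above, can be sketched in a few lines. The helper name `neuronOutput` and the two activation functions chosen here (ReLU and sigmoid) are illustrative assumptions; the demo may use different ones.

```javascript
// Two common activation functions.
const relu = (z) => Math.max(0, z);
const sigmoid = (z) => 1 / (1 + Math.exp(-z));

// Hypothetical helper: weighted sum of inputs plus bias, passed through an activation.
function neuronOutput(inputs, weights, bias, activation) {
  const z = inputs.reduce((sum, x, i) => sum + weights[i] * x, bias);
  return activation(z);
}

// z = 0.5*1 + (-1)*2 + 0.5 = -1, and ReLU clips negative sums to 0.
console.log(neuronOutput([1, 2], [0.5, -1], 0.5, relu)); // 0
// z = 1*1 + 1*2 - 3 = 0, and sigmoid(0) = 0.5.
console.log(neuronOutput([1, 2], [1, 1], -3, sigmoid)); // 0.5
```

Without an activation function (or with the identity function), the neuron reduces to the linear regression model shown earlier.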
In this project, the model will learn from the training set you created to estimate the relationship between horsepower and miles per gallon, so that it can predict a vehicle's fuel efficiency from its horsepower.
Training means letting the neural network learn patterns from data. Here, the model will learn to predict Miles_per_Gallon from Horsepower using the training portion of the cleanedData you loaded previously.
Once the model is trained, it’s important to check how well it performs on new data it hasn't seen before. This is done using the test set that was separated from the cleanedData. Here, we’ll compare the model’s predictions with the real values to understand its performance.
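For a regression model, one common way to make this comparison is the mean squared error (MSE) between predicted and actual values. The sketch below is an assumption about how that could be computed; the helper name `meanSquaredError` and the sample numbers are made up for illustration.

```javascript
// Hypothetical helper: average of the squared differences between actual and predicted values.
function meanSquaredError(actual, predicted) {
  const total = actual.reduce((sum, y, i) => sum + (y - predicted[i]) ** 2, 0);
  return total / actual.length;
}

// Made-up test-set values (actual MPG) and model outputs (predicted MPG).
const actualMpg = [18, 25, 31];
const predictedMpg = [20, 24, 28];

// Squared errors are 4, 1, and 9, so the MSE is 14 / 3.
const mse = meanSquaredError(actualMpg, predictedMpg);
console.log(mse.toFixed(2)); // "4.67"
```

A lower MSE on the test set means the predictions are, on average, closer to the real values.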
The confusion matrix is a useful tool for evaluating classification models. It shows the number of correct and incorrect predictions made by the model compared to the actual outcomes (labels). The matrix helps you see not only the overall accuracy, but also the types of errors the model makes.
A confusion matrix for a binary classifier looks like this:
              Actual
           |  1   |  0
       ----+------+------
Pred.    1 |  TP  |  FP
       ----+------+------
         0 |  FN  |  TN
To use a confusion matrix with this regression demo, we will convert the regression output into two classes: for example, cars with MPG above 23 are "Efficient" (1), and those with MPG 23 or below are "Not Efficient" (0). The threshold can be adjusted as needed.
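The conversion and counting could be sketched as follows, using the threshold of 23 MPG from the text. The helper names `toClass` and `confusionMatrix` and the sample values are assumptions made for illustration.

```javascript
// 1 = "Efficient" (MPG above the threshold), 0 = "Not Efficient" (MPG at or below it).
const toClass = (mpg, threshold = 23) => (mpg > threshold ? 1 : 0);

// Hypothetical helper: count TP, FP, FN, and TN after thresholding both series.
function confusionMatrix(actualValues, predictedValues, threshold = 23) {
  const counts = { tp: 0, fp: 0, fn: 0, tn: 0 };
  actualValues.forEach((actual, i) => {
    const a = toClass(actual, threshold);
    const p = toClass(predictedValues[i], threshold);
    if (p === 1 && a === 1) counts.tp++;
    else if (p === 1 && a === 0) counts.fp++;
    else if (p === 0 && a === 1) counts.fn++;
    else counts.tn++;
  });
  return counts;
}

// Made-up actual and predicted MPG values.
const trueMpg = [30, 20, 25, 15, 23];
const modelMpg = [28, 24, 21, 18, 22];

const matrix = confusionMatrix(trueMpg, modelMpg);
console.log(matrix); // { tp: 1, fp: 1, fn: 1, tn: 2 }

// Accuracy is the share of predictions that land on the diagonal (TP and TN).
const accuracy = (matrix.tp + matrix.tn) / trueMpg.length;
console.log(accuracy); // 0.6
```

Note that a car with exactly 23 MPG falls into class 0 here, matching the "23 or below" rule in the text.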