Linear regression is one of the simplest and most widely used algorithms in machine learning. The goal is to model the relationship between a scalar dependent variable y and an independent variable x. The relationship is assumed to be linear, and can be expressed as:
y = wx + b
Here, w is the weight (slope), and b is the bias (intercept). The algorithm finds the best line that fits the training data by adjusting w and b.
Diagram: A single neuron receives input x, multiplies it by weight w, adds bias b, and outputs y.
Imagine a single neuron with weight w = 2 and bias b = 1 that receives the input x = 3:
y = w * x + b
= 2 * 3 + 1
= 6 + 1
= 7
So, when we provide the input x=3, the neuron outputs y=7.
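The same calculation can be written as a small JavaScript function. This is only a sketch of the arithmetic above; the function name neuron is chosen here for illustration.

```javascript
// A single neuron with one input: output = weight * input + bias.
function neuron(x, w, b) {
  return w * x + b;
}

// Values from the worked example: w = 2, b = 1, x = 3.
const y = neuron(3, 2, 1);
console.log(y); // 7
```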
This is the computation a single neuron performs in linear regression. During training, the neuron learns the values of w and b that best fit the data.
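One common way to learn w and b is gradient descent on the mean squared error. The sketch below assumes that method, a learning rate of 0.05, and 2000 passes over the data. All of these are illustrative choices, and the toy points are generated from y = 2x + 1 so the expected answer is known.

```javascript
// Fit y = w * x + b by gradient descent on the mean squared error.
function fitLine(xs, ys, learningRate = 0.05, epochs = 2000) {
  let w = 0;
  let b = 0;
  const n = xs.length;
  for (let epoch = 0; epoch < epochs; epoch++) {
    let gradW = 0;
    let gradB = 0;
    for (let i = 0; i < n; i++) {
      const error = w * xs[i] + b - ys[i]; // prediction minus target
      gradW += (2 / n) * error * xs[i];    // d(MSE)/dw
      gradB += (2 / n) * error;            // d(MSE)/db
    }
    // Step against the gradient to reduce the error.
    w -= learningRate * gradW;
    b -= learningRate * gradB;
  }
  return { w, b };
}

// Toy points that lie exactly on y = 2x + 1 (made up for this example).
const xs = [0, 1, 2, 3, 4];
const ys = xs.map((x) => 2 * x + 1);
const fitted = fitLine(xs, ys);
console.log(fitted.w.toFixed(3), fitted.b.toFixed(3)); // 2.000 1.000
```

With noisy real data the learned values will not match any exact line, and the learning rate usually needs tuning.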
In JavaScript, especially for browser-based machine learning, several libraries are available for defining and training models:
TensorFlow.js: https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@1.0.0/dist/tf.min.js
Synaptic.js: https://cdn.jsdelivr.net/npm/synaptic@1.1.4/dist/synaptic.js
To visualize data, training progress, or model predictions in the browser, popular early libraries include:
Chart.js: https://cdn.jsdelivr.net/npm/chart.js@1.1.1/Chart.min.js
D3.js: https://cdn.jsdelivr.net/npm/d3@3.5.17/d3.min.js
<!-- TensorFlow.js v1.x -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@1.0.0/dist/tf.min.js"></script>
<!-- Synaptic.js v1.1.4 -->
<script src="https://cdn.jsdelivr.net/npm/synaptic@1.1.4/dist/synaptic.js"></script>
<!-- Chart.js v1.x -->
<script src="https://cdn.jsdelivr.net/npm/chart.js@1.1.1/Chart.min.js"></script>
<!-- D3.js v3.x -->
<script src="https://cdn.jsdelivr.net/npm/d3@3.5.17/d3.min.js"></script>
Using these early libraries, developers could build, train, and visualize machine learning models directly in the browser, paving the way for today's more advanced web-based ML solutions.
The Cars Dataset includes information about various car models. Each entry contains details about one car, such as its horsepower and its fuel efficiency in miles per gallon (MPG).
This dataset is useful for exploring how car features relate to each other, such as how horsepower might affect fuel efficiency.
Sometimes the dataset includes missing or invalid values. Cleaning the data helps remove these records so analysis and visualizations are more accurate.
Click the button below to load and clean the dataset. Only cars with both MPG and horsepower values are kept. The cleaned data is saved as a variable called cleanedData for use later.
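The cleaning step can be sketched as a filter over the raw records. The field names Miles_per_Gallon and Horsepower and the sample records below are assumptions made up for illustration; the real dataset's schema may differ, and loading the file (for example with fetch) is left out.

```javascript
// Made-up sample records. The field names are assumed, not taken from the dataset.
const rawCars = [
  { Name: "car A", Miles_per_Gallon: 18, Horsepower: 130 },
  { Name: "car B", Miles_per_Gallon: null, Horsepower: 165 },
  { Name: "car C", Miles_per_Gallon: 24, Horsepower: null },
  { Name: "car D", Miles_per_Gallon: 31, Horsepower: 70 },
];

// Keep only the two features of interest and drop records where either is missing.
function cleanCars(records) {
  return records
    .map((car) => ({ mpg: car.Miles_per_Gallon, horsepower: car.Horsepower }))
    .filter((car) => Number.isFinite(car.mpg) && Number.isFinite(car.horsepower));
}

const cleanedData = cleanCars(rawCars);
console.log(cleanedData.length); // 2
```

Number.isFinite rejects null, undefined, NaN, and non-numeric values in one check, which is why it is used here instead of a plain null comparison.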
Let’s visualize the relationship between Horsepower and Miles per Gallon (MPG) using a scatter plot. This helps us understand how these two features relate, and why they are a good choice for model training.
We will use these two features for training our model.
Before training a machine learning model, it's common to split your data into two parts: a training set, used to fit the model, and a test set, held back to evaluate it.
A popular approach is to use about 80% of the data for training and 20% for testing. This helps ensure your model can generalize well to new data.
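An 80/20 split can be sketched as a shuffle followed by a cut. The helper names shuffle and trainTestSplit are hypothetical, and the ten sample items stand in for real records.

```javascript
// Fisher-Yates shuffle on a copy, so the original array is left untouched.
function shuffle(items) {
  const copy = items.slice();
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy;
}

// Put the first trainFraction of the shuffled items in the training set.
function trainTestSplit(items, trainFraction = 0.8) {
  const shuffled = shuffle(items);
  const cut = Math.floor(shuffled.length * trainFraction);
  return { trainSet: shuffled.slice(0, cut), testSet: shuffled.slice(cut) };
}

// Ten placeholder items standing in for car records.
const sample = Array.from({ length: 10 }, (_, i) => ({ id: i }));
const { trainSet, testSet } = trainTestSplit(sample);
console.log(trainSet.length, testSet.length); // 8 2
```

Shuffling first matters when the records are stored in some order (for example by year), since an unshuffled cut would give the test set a different mix of cars than the training set.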
Let's build a simple artificial neural network model. When you click the button below, a model will be defined in JavaScript using TensorFlow.js. The structure (layers) of the model will be displayed.
The model is a simple feedforward neural network (also called a "Dense Neural Network") that predicts a car’s fuel efficiency (MPG) from its Horsepower. It consists of an input (Horsepower), a hidden layer, and an output (MPG).
This model will be trained on the training set defined above.
Illustration: Input (Horsepower) → Hidden Layer → Output (MPG)
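A model with this shape could be defined with the TensorFlow.js Layers API roughly as follows. This is a sketch, not the page's actual code: the hidden layer size (8 units) and the ReLU activation are assumptions, and tf is passed in as a parameter so the function does not depend on how the library was loaded.

```javascript
// `tf` is the TensorFlow.js object, e.g. the global created by the tf.min.js script tag.
// hiddenUnits = 8 and the "relu" activation are illustrative assumptions.
function buildModel(tf, hiddenUnits = 8) {
  const model = tf.sequential();
  // Hidden layer: takes one input feature (Horsepower).
  model.add(tf.layers.dense({ inputShape: [1], units: hiddenUnits, activation: "relu" }));
  // Output layer: produces one number (predicted MPG).
  model.add(tf.layers.dense({ units: 1 }));
  return model;
}
```

In a page that has loaded TensorFlow.js, calling buildModel(tf) returns the model, and model.summary() prints its layers to the console.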
To make accurate predictions, the neural network model must be trained using data. During training, the model adjusts its internal parameters, learning how horsepower relates to MPG. When you click "Train Model", the model will use the training portion of cleanedData and update its weights to minimize prediction error.
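Training with TensorFlow.js might look like the sketch below. The record fields horsepower and mpg, the Adam optimizer, the mean squared error loss, and the epoch and batch size values are all assumptions for illustration; tf and model are passed in rather than assumed to be globals.

```javascript
// trainingData: array of { horsepower, mpg } records (hypothetical field names).
async function trainModel(tf, model, trainingData, epochs = 50) {
  // Shape [n, 1]: one feature and one label per example.
  const inputs = tf.tensor2d(trainingData.map((d) => [d.horsepower]));
  const labels = tf.tensor2d(trainingData.map((d) => [d.mpg]));

  // Mean squared error is a common loss for predicting a number.
  model.compile({ optimizer: tf.train.adam(), loss: "meanSquaredError" });

  // fit() repeatedly updates the weights to reduce the loss.
  const history = await model.fit(inputs, labels, { epochs, batchSize: 32, shuffle: true });

  // Free the memory held by the tensors.
  inputs.dispose();
  labels.dispose();
  return history;
}
```

Raw horsepower values are much larger than typical weight values, so in practice the inputs and labels are often scaled to a small range such as 0 to 1 before training; that step is omitted here to keep the sketch short.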
After training, it is important to check how well the model performs on new, unseen data. We use the test set (the part of cleanedData that was kept aside) to measure how far the model's predictions are from the actual values. When you click the button below, the model will make predictions on the test set, and you can compare them to the actual values.
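For a model that predicts a number, the comparison is usually summarized as an average error. The sketch below computes two common measures, mean squared error and mean absolute error, on made-up values; the function name regressionErrors is hypothetical.

```javascript
// Compare predicted and actual values with two common error measures.
function regressionErrors(predicted, actual) {
  if (predicted.length !== actual.length || predicted.length === 0) {
    throw new Error("predicted and actual must be non-empty and the same length");
  }
  let squaredSum = 0;
  let absoluteSum = 0;
  for (let i = 0; i < predicted.length; i++) {
    const diff = predicted[i] - actual[i];
    squaredSum += diff * diff;
    absoluteSum += Math.abs(diff);
  }
  return { mse: squaredSum / predicted.length, mae: absoluteSum / predicted.length };
}

// Made-up MPG values for illustration only.
const predictedMpg = [20, 30, 25];
const actualMpg = [22, 27, 25];
const errors = regressionErrors(predictedMpg, actualMpg);
console.log(errors.mse.toFixed(2), errors.mae.toFixed(2)); // 4.33 1.67
```

The mean absolute error is in the same unit as the target, so 1.67 here reads as "off by about 1.67 MPG on average".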
A confusion matrix is a table that helps you visualize the performance of a classification model. It shows how many predictions were correctly or incorrectly classified into each category.
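The model here predicts a number rather than a category, so in this example we will sort both the actual and the predicted MPG values into three classes (Low, Medium, High efficiency) and display the confusion matrix to show how often the predicted class matches the actual class.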
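Building such a matrix can be sketched in plain JavaScript. The class boundaries (below 20 MPG is Low, 20 up to 30 is Medium, 30 and above is High) and the sample values are assumptions chosen for illustration. Rows are actual classes and columns are predicted classes.

```javascript
const CLASSES = ["Low", "Medium", "High"];

// Map an MPG value to a class index. The thresholds are illustrative assumptions.
function mpgClass(mpg, lowBelow = 20, highFrom = 30) {
  if (mpg < lowBelow) return 0; // Low
  if (mpg < highFrom) return 1; // Medium
  return 2;                     // High
}

// matrix[actualClass][predictedClass] counts how often each pairing occurs.
function confusionMatrix(actualValues, predictedValues) {
  const matrix = CLASSES.map(() => CLASSES.map(() => 0));
  actualValues.forEach((actualMpg, i) => {
    matrix[mpgClass(actualMpg)][mpgClass(predictedValues[i])] += 1;
  });
  return matrix;
}

// Made-up MPG values for illustration only.
const actualValues = [15, 18, 25, 28, 35, 40];
const predictedValues = [17, 22, 24, 31, 33, 29];
const matrix = confusionMatrix(actualValues, predictedValues);
console.log(JSON.stringify(matrix)); // [[1,1,0],[0,1,1],[0,1,1]]
```

Counts on the diagonal are cars whose predicted class matches the actual class; everything off the diagonal is a misclassification.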