Single Neuron Linear Regression Trainer

Chapter 1: Introduction - What is Linear Regression?

Linear regression is one of the simplest and most widely used algorithms in machine learning. The goal is to model the relationship between a scalar dependent variable y and an independent variable x. The relationship is assumed to be linear, and can be expressed as:

  y = wx + b

Here, w is the weight (slope), and b is the bias (intercept). Training adjusts w and b so that the line's predictions are as close as possible to the observed y values in the training data, typically by minimizing the mean squared error.
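
To make "the best line" concrete, here is a minimal sketch that fits w and b in closed form. It assumes that "best" means the smallest sum of squared errors (ordinary least squares), which is a common choice but is not stated above. The function name fitLine and the sample numbers are made up for illustration.

```javascript
// Closed-form ordinary least squares for y = w * x + b.
// Assumption: "best fit" means minimizing the sum of squared errors.
function fitLine(xs, ys) {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error('fitLine needs two equal-length arrays with at least 2 points');
  }
  const n = xs.length;
  const meanX = xs.reduce((sum, x) => sum + x, 0) / n;
  const meanY = ys.reduce((sum, y) => sum + y, 0) / n;

  let covXY = 0; // sum of (x - meanX) * (y - meanY)
  let varX = 0;  // sum of (x - meanX)^2
  for (let i = 0; i < n; i++) {
    covXY += (xs[i] - meanX) * (ys[i] - meanY);
    varX += (xs[i] - meanX) ** 2;
  }
  if (varX === 0) {
    throw new Error('all x values are identical, so the slope is undefined');
  }

  const w = covXY / varX;    // slope
  const b = meanY - w * meanX; // intercept
  return { w, b };
}

// Illustrative points that lie exactly on y = 2x + 1.
const fitted = fitLine([1, 2, 3, 4], [3, 5, 7, 9]);
console.log(fitted.w, fitted.b); // 2 1
```

A neural network library usually reaches a similar result iteratively rather than with this formula, but the closed form is handy for checking what the answer should be.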

What Does a Single Neuron Look Like?

Diagram: A single neuron receives input x, multiplies it by weight w, adds bias b, and outputs y.

Easy Example

Imagine we have the following single neuron:

Weight (w) = 2
Bias (b) = 1
Input (x) = 3

How does the neuron compute the output?

  y = w * x + b
    = 2 * 3 + 1
    = 6 + 1
    = 7

So, when we provide the input x=3, the neuron outputs y=7.
This is the forward computation of a single neuron. To perform linear regression, the neuron learns the values of w and b that best fit the data.
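
The same calculation, plus one possible way to learn w and b, can be sketched in plain JavaScript. The learning rule here is batch gradient descent on the mean squared error. That choice, the function names neuron and trainNeuron, the learning rate, and the epoch count are all assumptions made for this illustration.

```javascript
// Forward pass of a single neuron: y = w * x + b.
function neuron(w, b, x) {
  return w * x + b;
}

// Assumed learning rule: batch gradient descent on mean squared error.
// learningRate and epochs are illustrative defaults, not tuned values.
function trainNeuron(xs, ys, { learningRate = 0.05, epochs = 2000 } = {}) {
  let w = 0;
  let b = 0;
  const n = xs.length;
  for (let epoch = 0; epoch < epochs; epoch++) {
    let gradW = 0;
    let gradB = 0;
    for (let i = 0; i < n; i++) {
      const error = neuron(w, b, xs[i]) - ys[i];
      gradW += (2 / n) * error * xs[i]; // d(MSE)/dw
      gradB += (2 / n) * error;         // d(MSE)/db
    }
    // Step against the gradient to reduce the error.
    w -= learningRate * gradW;
    b -= learningRate * gradB;
  }
  return { w, b };
}

// The worked example: w = 2, b = 1, x = 3.
console.log(neuron(2, 1, 3)); // 7

// Learn w and b from points that lie on y = 2x + 1.
const learned = trainNeuron([0, 1, 2, 3, 4], [1, 3, 5, 7, 9]);
console.log(learned.w.toFixed(2), learned.b.toFixed(2)); // close to 2.00 and 1.00
```

If the learning rate is too large for the scale of x, the updates can diverge instead of settling, which is one reason inputs are often rescaled before training.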

Chapter 4: Libraries for Model Definition, Training, and Visualization

Model Definition and Training Libraries

In JavaScript, especially for browser-based machine learning, several libraries are available for defining and training models:

TensorFlow.js (Early Version: 0.x and 1.x)
https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@1.0.0/dist/tf.min.js
TensorFlow.js is the main library for building and training neural networks in JavaScript. Early versions provided basic layers, optimizers, and tensor operations for training models directly in the browser or Node.js.
Synaptic.js
https://cdn.jsdelivr.net/npm/synaptic@1.1.4/dist/synaptic.js
Synaptic is one of the earliest general-purpose neural network libraries for JavaScript, supporting perceptrons, LSTM, and more. It provides a simple API for defining and training neural networks in the browser.

Visualization Libraries for the Web Browser

To visualize data, training progress, or model predictions in the browser, popular early libraries include:

Chart.js (Early Version: 1.x)
https://cdn.jsdelivr.net/npm/chart.js@1.1.1/Chart.min.js
Chart.js is a simple and flexible JavaScript charting library. Version 1.x supported line, bar, radar, polar area, pie, and doughnut charts, making it useful for visualizing training losses and predictions. A dedicated scatter chart type only arrived in the 2.x releases.
D3.js (Early Version: 3.x)
https://cdn.jsdelivr.net/npm/d3@3.5.17/d3.min.js
D3.js is a powerful library for creating complex, interactive data visualizations using web standards. Early versions enabled custom plotting of training data and model output.

Example: Including Early Libraries

<!-- TensorFlow.js v1.x -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@1.0.0/dist/tf.min.js"></script>

<!-- Synaptic.js v1.1.4 -->
<script src="https://cdn.jsdelivr.net/npm/synaptic@1.1.4/dist/synaptic.js"></script>

<!-- Chart.js v1.x -->
<script src="https://cdn.jsdelivr.net/npm/chart.js@1.1.1/Chart.min.js"></script>

<!-- D3.js v3.x -->
<script src="https://cdn.jsdelivr.net/npm/d3@3.5.17/d3.min.js"></script>

Using these early libraries, developers could build, train, and visualize machine learning models directly in the browser, paving the way for today's more advanced web-based ML solutions.

Explore the Cars Dataset

The Cars Dataset includes information about various car models. Each entry contains details such as:

Miles per Gallon (MPG)
Horsepower
Weight
Acceleration
Year
Origin

This dataset is useful for exploring how car features relate to each other, such as how horsepower might affect fuel efficiency.

Why Clean the Data?

Sometimes the dataset includes missing or invalid values. Cleaning the data helps remove these records so analysis and visualizations are more accurate.

Load and Clean the Dataset

Click the button below to load and clean the dataset. Only cars with both MPG and horsepower values are kept. The cleaned data is saved as a variable called cleanedData for use later.
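
The cleaning step described above might look roughly like this. The raw field names Miles_per_Gallon and Horsepower are assumptions about how the dataset is stored, and the four sample records are made up purely to exercise the filter.

```javascript
// Keep only cars that have numeric values for both MPG and horsepower.
// Assumption: raw records use the field names Miles_per_Gallon and Horsepower;
// adjust these to match the dataset that is actually loaded.
function cleanCars(rawCars) {
  return rawCars
    .map((car) => ({ mpg: car.Miles_per_Gallon, horsepower: car.Horsepower }))
    .filter((car) => Number.isFinite(car.mpg) && Number.isFinite(car.horsepower));
}

// Made-up sample records for illustration only.
const sampleCars = [
  { Name: 'car a', Miles_per_Gallon: 18, Horsepower: 130 },
  { Name: 'car b', Miles_per_Gallon: null, Horsepower: 165 }, // missing MPG
  { Name: 'car c', Miles_per_Gallon: 25, Horsepower: null },  // missing horsepower
  { Name: 'car d', Miles_per_Gallon: 32, Horsepower: 70 },
];

const cleanedData = cleanCars(sampleCars);
console.log(cleanedData.length); // 2
```

Number.isFinite rejects null, undefined, NaN, and strings, so records with any of those in either field are dropped.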

Visualize the Dataset

Let’s visualize the relationship between Horsepower and Miles per Gallon (MPG) using a scatter plot. This helps us understand how these two features relate, and why they are a good choice for model training.

X-Axis: Horsepower
Y-Axis: Miles per Gallon (MPG)

We will use these two features for training our model.
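
Whichever charting library draws the scatter plot, the data usually has to be reshaped into x/y points first. Here is a small sketch of that preparation. It assumes each cleaned record has numeric horsepower and mpg fields, and the helper name toScatterSeries is hypothetical.

```javascript
// Turn cleaned records into scatter-plot points plus axis labels and ranges.
// Assumption: each record looks like { horsepower: number, mpg: number }.
function toScatterSeries(records) {
  const points = records.map((car) => ({ x: car.horsepower, y: car.mpg }));
  const xs = points.map((p) => p.x);
  const ys = points.map((p) => p.y);
  return {
    points,
    xLabel: 'Horsepower',
    yLabel: 'Miles per Gallon (MPG)',
    // For an empty input these ranges are [Infinity, -Infinity].
    xRange: [Math.min(...xs), Math.max(...xs)],
    yRange: [Math.min(...ys), Math.max(...ys)],
  };
}

// Made-up records for illustration.
const series = toScatterSeries([
  { horsepower: 130, mpg: 18 },
  { horsepower: 70, mpg: 32 },
  { horsepower: 95, mpg: 24 },
]);
console.log(series.xRange, series.yRange); // [ 70, 130 ] [ 18, 32 ]
```

The points array can then be handed to the plotting code, and the ranges are useful for setting the axis limits.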

Splitting the Dataset

Before training a machine learning model, it's common to split your data into two parts:

Training set: Used to train the model.
Testing set: Used to evaluate model performance on unseen data.

A popular approach is to use about 80% of the data for training and 20% for testing. Evaluating on data the model has never seen shows whether it generalizes to new data rather than simply memorizing the training set.
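
An 80/20 split can be sketched as a shuffle followed by a cut. The function name splitDataset and its parameters are assumptions for this example. The shuffle is a Fisher-Yates shuffle, and the random source can be replaced to make the split repeatable.

```javascript
// Shuffle a copy of the data, then cut it into training and testing parts.
// trainFraction = 0.8 gives the 80% / 20% split described above.
function splitDataset(data, trainFraction = 0.8, random = Math.random) {
  const shuffled = data.slice(); // copy so the caller's array is untouched
  // Fisher-Yates shuffle: swap each position with a random earlier-or-equal one.
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  const trainCount = Math.round(shuffled.length * trainFraction);
  return {
    trainSet: shuffled.slice(0, trainCount),
    testSet: shuffled.slice(trainCount),
  };
}

const tenItems = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
const split = splitDataset(tenItems);
console.log(split.trainSet.length, split.testSet.length); // 8 2
```

Shuffling before cutting matters when the file is ordered, for example by year, because otherwise the test set would contain only the newest cars.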

Define and Inspect a Neural Network Model

Let's build a simple artificial neural network model. When you click the button below, a model will be defined in JavaScript using TensorFlow.js. The structure (layers) of the model will be displayed.

About the Model

The model is a simple feedforward neural network built from fully connected ("dense") layers. It predicts a car’s fuel efficiency (MPG) from its Horsepower. It consists of:

One input node (for horsepower)
One hidden layer with a few neurons (for learning complex relationships)
One output node (for predicted MPG)

This model will be trained on the training set defined above.

Illustration: Input (Horsepower) → Hidden Layer → Output (MPG)
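
One way such a model could be defined with the TensorFlow.js Layers API is sketched below. The hidden layer size of 8 and the ReLU activation are illustrative assumptions, since the text only says "a few neurons". The helper buildModel is hypothetical and takes the tf namespace as an argument so that it does not depend on how the library was loaded.

```javascript
// Build a small dense network: 1 input -> hidden layer -> 1 output.
// `tf` is the TensorFlow.js namespace (the global `tf` when loaded via a script tag).
// Assumptions: 8 hidden units and ReLU activation are illustrative choices.
function buildModel(tf, hiddenUnits = 8) {
  const model = tf.sequential();

  // Hidden layer: receives one input value per car (horsepower).
  model.add(tf.layers.dense({
    inputShape: [1],
    units: hiddenUnits,
    activation: 'relu',
  }));

  // Output layer: one value (predicted MPG), with no activation for regression.
  model.add(tf.layers.dense({ units: 1 }));

  return model;
}

// In the browser, after TensorFlow.js has loaded:
//   const model = buildModel(tf);
//   model.summary(); // prints the layer structure to the console
```

With hiddenUnits set to 1 and the activation removed, the network collapses back to the single neuron y = wx + b from the introduction.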

Train the Model

To make accurate predictions, the neural network model must be trained using data. During training, the model adjusts its internal parameters, learning how horsepower relates to MPG. When you click "Train Model", the model will use the training portion of cleanedData and update its weights to minimize prediction error.

Input: Horsepower values
Target: Actual MPG values
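
Before training, the records are typically separated into an input array and a target array. The sketch below does that and also rescales horsepower to the range 0 to 1. The rescaling (min-max normalization) is an assumption added here because it often helps training converge, and the helper name prepareTrainingData and the record fields horsepower and mpg are hypothetical.

```javascript
// Separate inputs (horsepower) from targets (MPG) and rescale the inputs.
// Assumptions: records look like { horsepower, mpg }, and min-max scaling
// to [0, 1] is used. Neither is stated in the text above.
function prepareTrainingData(trainSet) {
  const inputs = trainSet.map((car) => car.horsepower);
  const targets = trainSet.map((car) => car.mpg);

  const inputMin = Math.min(...inputs);
  const inputMax = Math.max(...inputs);
  const span = inputMax - inputMin || 1; // avoid dividing by zero

  const normalizedInputs = inputs.map((hp) => (hp - inputMin) / span);

  // inputMin and inputMax are returned so new inputs can be scaled the same way.
  return { normalizedInputs, targets, inputMin, inputMax };
}

// Made-up records for illustration.
const prepared = prepareTrainingData([
  { horsepower: 50, mpg: 30 },
  { horsepower: 150, mpg: 15 },
  { horsepower: 100, mpg: 22 },
]);
console.log(prepared.normalizedInputs); // [ 0, 1, 0.5 ]
console.log(prepared.targets);          // [ 30, 15, 22 ]
```

With TensorFlow.js, arrays like these would then be wrapped in tensors and passed to the model's fit method. The same inputMin and inputMax must be reused when scaling test data or new inputs.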

Evaluate the Trained Model

After training, it is important to check how well the model performs on new, unseen data. We use the test set (the part of cleanedData that was kept aside) to measure how far the model's predictions are from the actual values. When you click the button below, the model will make predictions on the test set, and you can compare them to the actual values.
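
For a model that predicts a number, "how well it performs" is usually summarized as an error value. Here is a sketch that computes three common ones. The helper name evaluateRegression, the record fields, and the stand-in prediction function are assumptions for this example. In practice the predict argument would call the trained model.

```javascript
// Compare predictions with actual MPG values on the held-out test set.
// `predict` is any function that maps horsepower to predicted MPG.
function evaluateRegression(predict, testSet) {
  if (testSet.length === 0) throw new Error('testSet is empty');
  let squaredErrorSum = 0;
  let absoluteErrorSum = 0;
  for (const car of testSet) {
    const error = predict(car.horsepower) - car.mpg;
    squaredErrorSum += error * error;
    absoluteErrorSum += Math.abs(error);
  }
  const mse = squaredErrorSum / testSet.length;
  return {
    mse,                                     // mean squared error
    rmse: Math.sqrt(mse),                    // root mean squared error, in MPG units
    mae: absoluteErrorSum / testSet.length,  // mean absolute error, in MPG units
  };
}

// Stand-in for a trained model (made up for illustration).
const standInPredict = (horsepower) => 40 - horsepower / 10;

const metrics = evaluateRegression(standInPredict, [
  { horsepower: 100, mpg: 28 }, // predicted 30, error +2
  { horsepower: 200, mpg: 22 }, // predicted 20, error -2
]);
console.log(metrics); // { mse: 4, rmse: 2, mae: 2 }
```

RMSE and MAE are in the same unit as the target, so an RMSE of 2 means predictions are off by roughly 2 MPG on a typical test car.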

Understanding the Confusion Matrix

A confusion matrix is a table that helps you visualize the performance of a classification model. It shows how many predictions were correctly or incorrectly classified into each category.
Because our model predicts a continuous MPG value rather than a category, we first group both the actual and the predicted MPG values into three classes (Low, Medium, High efficiency). We then display the confusion matrix to show how well the model predicts each class.
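
A confusion matrix for binned MPG values can be built as sketched below. The cut-offs (below 20 MPG is Low, 20 up to 30 is Medium, 30 and above is High) are assumptions chosen for this example, since the text does not give the thresholds. The function names and sample numbers are likewise made up.

```javascript
const CLASS_NAMES = ['Low', 'Medium', 'High'];

// Map an MPG value to a class index (0 = Low, 1 = Medium, 2 = High).
// Assumed cut-offs: < 20 is Low, 20 to < 30 is Medium, >= 30 is High.
function mpgClass(mpg, lowCutoff = 20, highCutoff = 30) {
  if (mpg < lowCutoff) return 0;
  if (mpg < highCutoff) return 1;
  return 2;
}

// Rows are the actual class, columns are the predicted class.
function confusionMatrix(actualMpg, predictedMpg) {
  const matrix = CLASS_NAMES.map(() => CLASS_NAMES.map(() => 0));
  actualMpg.forEach((actual, i) => {
    matrix[mpgClass(actual)][mpgClass(predictedMpg[i])] += 1;
  });
  return matrix;
}

// Made-up actual and predicted MPG values for four cars.
const exampleMatrix = confusionMatrix([15, 25, 35, 18], [17, 31, 33, 22]);
console.log(exampleMatrix);
// [ [ 1, 1, 0 ],    actual Low:    1 predicted Low, 1 predicted Medium
//   [ 0, 0, 1 ],    actual Medium: 1 predicted High
//   [ 0, 0, 1 ] ]   actual High:   1 predicted High
```

Counts on the diagonal are correct class predictions, and everything off the diagonal is a car whose predicted MPG landed in the wrong bin.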