Single Neuron Linear Regression Trainer

Chapter 1: Introduction to Linear Regression

Linear regression is a fundamental technique in machine learning and statistics. It models the relationship between a dependent variable y and one or more independent variables x. With a single input, the model is a straight line: y = wx + b, where w is the weight (slope) and b is the bias (intercept). The goal is to find the w and b that minimize the difference between the predicted values and the actual data points, typically measured as the mean squared error.
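
To make "the difference between the predicted values and actual data points" concrete, one common choice is the mean squared error. The sketch below is only an illustration in plain JavaScript; the function names `predict` and `meanSquaredError` and the sample points are hypothetical and are not taken from the demo.

```javascript
// Predict y for a single x using the line y = w * x + b.
function predict(w, b, x) {
  return w * x + b;
}

// Mean squared error of the line (w, b) over paired arrays xs and ys.
// Assumes xs and ys have the same non-zero length.
function meanSquaredError(w, b, xs, ys) {
  let sum = 0;
  for (let i = 0; i < xs.length; i++) {
    const error = predict(w, b, xs[i]) - ys[i];
    sum += error * error;
  }
  return sum / xs.length;
}

// Made-up points that lie exactly on y = 2x + 1.
const sampleXs = [0, 1, 2, 3];
const sampleYs = [1, 3, 5, 7];

console.log(meanSquaredError(2, 1, sampleXs, sampleYs)); // 0 (perfect fit)
console.log(meanSquaredError(1, 0, sampleXs, sampleYs)); // 7.5
```

A lower value means the line sits closer to the data, so "finding the best w and b" amounts to searching for the pair that makes this number as small as possible.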

Chapter 2: Train a Single Neuron (Linear Regression)

Try entering your own data points and see how a single neuron (just w and b) learns to fit them!

Input x values (comma separated):

Input y values (comma separated):
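
What the single neuron does while it "learns" can be sketched with plain gradient descent on the mean squared error. This is a minimal illustration under stated assumptions, not the code behind the trainer: the names `parseValues` and `trainSingleNeuron`, the learning rate, and the number of epochs are all hypothetical choices.

```javascript
// Parse a comma-separated string such as "1, 2, 3" into an array of numbers.
function parseValues(text) {
  return text.split(',').map((s) => Number(s.trim()));
}

// Fit y = w * x + b by batch gradient descent on the mean squared error.
// learningRate and epochs are illustrative defaults, not values from the demo.
function trainSingleNeuron(xs, ys, learningRate = 0.01, epochs = 5000) {
  let w = 0;
  let b = 0;
  const n = xs.length;
  for (let epoch = 0; epoch < epochs; epoch++) {
    let gradW = 0;
    let gradB = 0;
    for (let i = 0; i < n; i++) {
      const error = w * xs[i] + b - ys[i];
      gradW += (2 / n) * error * xs[i]; // d(MSE)/dw
      gradB += (2 / n) * error;         // d(MSE)/db
    }
    // Step against the gradient to reduce the error.
    w -= learningRate * gradW;
    b -= learningRate * gradB;
  }
  return { w, b };
}

const xValues = parseValues('1, 2, 3, 4');
const yValues = parseValues('3, 5, 7, 9'); // generated from y = 2x + 1
const fitted = trainSingleNeuron(xValues, yValues);
console.log(fitted.w.toFixed(2), fitted.b.toFixed(2)); // close to 2.00 and 1.00
```

With larger x values the same learning rate may diverge, so a smaller rate or scaled inputs would likely be needed.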

Chapter 3: Discussion: TensorFlow for JavaScript

TensorFlow.js is an open-source library that brings machine learning capabilities to JavaScript. It enables training and deploying models directly in the browser or in Node.js environments. This allows developers to leverage GPU acceleration, maintain privacy (since data doesn't need to leave the user's device), and create interactive ML-powered applications without backend dependencies. With TensorFlow.js, you can implement, train, and run machine learning models entirely in JavaScript.

Chapter 4: Key Libraries for Model Training and Visualization

When building machine learning demos in the browser, especially for linear regression and neural networks, several JavaScript libraries are particularly important:

TensorFlow.js (Early versions: 0.x, 1.x):
This is the primary library for defining, training, and running machine learning models directly in the browser or in Node.js. It provides flexible APIs to create layers, compile models, and train using backpropagation.
```
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@0.15.0"></script>
```
Early versions provided basic layers, optimizers, and tensor operations for rapid prototyping.
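
As a rough sketch of what "create layers, compile models" looks like, the single neuron from Chapter 2 could be expressed with the TensorFlow.js Layers API as shown here. The function name `createSingleNeuronModel` and the learning rate are hypothetical; `tf` is passed in as a parameter and is assumed to be the global namespace created by the script tag above.

```javascript
// Build and compile a one-neuron model: a single dense unit computes y = w * x + b.
// `tf` is the TensorFlow.js namespace (the global created by the script tag).
function createSingleNeuronModel(tf, learningRate = 0.01) {
  const model = tf.sequential();
  // One unit with one input: exactly one weight and one bias.
  model.add(tf.layers.dense({ units: 1, inputShape: [1] }));
  model.compile({
    optimizer: tf.train.sgd(learningRate), // plain stochastic gradient descent
    loss: 'meanSquaredError',
  });
  return model;
}

// In a browser page that has loaded TensorFlow.js:
// const model = createSingleNeuronModel(tf);
```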

Chart.js (Early versions: 1.x, 2.x):
For visualization, Chart.js is a widely used library for drawing charts and graphs in the web browser. It helps visualize training progress, predicted values, and the fitted regression line.
```
<script src="https://cdn.jsdelivr.net/npm/chart.js@1.1.1"></script>
```
Early versions enabled simple line and bar charts for data and model outputs; scatter charts were added in the 2.x releases.

By combining early versions of TensorFlow.js for computation and Chart.js for visualization, you could already build interactive machine learning experiences in the browser. As these libraries evolved, they introduced more features and improved performance, but their early versions are still valuable for understanding the basics.

Chapter 5: Using and Loading the Cars Dataset

The cars dataset contains records of various car models with attributes such as Miles_per_Gallon (fuel efficiency in miles per gallon) and Horsepower. We will focus on these two fields. Some records may have missing or invalid values, so we will clean the data before using it.
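
The cleaning step might look something like the sketch below. It assumes each record is an object carrying the field names Miles_per_Gallon and Horsepower that this tutorial uses when plotting the data; the function name `cleanCars`, the output keys `mpg` and `horsepower`, and the sample records are hypothetical.

```javascript
// Keep only the two fields of interest and drop records where either one is
// missing or not a finite number. The output keys are hypothetical names.
function cleanCars(records) {
  return records
    .map((car) => ({ mpg: car.Miles_per_Gallon, horsepower: car.Horsepower }))
    .filter((car) => Number.isFinite(car.mpg) && Number.isFinite(car.horsepower));
}

// Made-up records for illustration only, not real rows from the dataset.
const rawCars = [
  { Name: 'car a', Miles_per_Gallon: 18, Horsepower: 130 },
  { Name: 'car b', Miles_per_Gallon: null, Horsepower: 165 },
  { Name: 'car c', Miles_per_Gallon: 25, Horsepower: undefined },
  { Name: 'car d', Miles_per_Gallon: 30, Horsepower: 90 },
];

console.log(cleanCars(rawCars));
// [ { mpg: 18, horsepower: 130 }, { mpg: 30, horsepower: 90 } ]
```

`Number.isFinite` rejects `null`, `undefined`, `NaN`, and non-numeric strings in one check, which is why it is used here instead of a simple `!= null` test.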

Chapter 6: Visualizing the Cars Dataset

Now that we've loaded and cleaned the cars dataset, let's visualize the relationship between Miles_per_Gallon and Horsepower. Each car will be a point on the scatter plot.
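
One way to prepare such a plot is to build a Chart.js configuration object from the records. The sketch below assumes Chart.js 2.x or later, where a scatter chart type is available, and records that carry the Miles_per_Gallon and Horsepower fields; the function name `buildScatterConfig`, the dataset label, and the canvas id are hypothetical.

```javascript
// Build a Chart.js configuration object for a scatter plot of horsepower vs. mpg.
// Assumes Chart.js 2.x or later, where `type: 'scatter'` is available.
function buildScatterConfig(cars) {
  return {
    type: 'scatter',
    data: {
      datasets: [
        {
          label: 'Horsepower vs. Miles per Gallon',
          // Scatter charts expect an array of { x, y } points.
          data: cars.map((car) => ({ x: car.Horsepower, y: car.Miles_per_Gallon })),
        },
      ],
    },
  };
}

// In a page with <canvas id="carsChart"></canvas> (a hypothetical id):
// const ctx = document.getElementById('carsChart').getContext('2d');
// new Chart(ctx, buildScatterConfig(cleanedCars));
```

Horsepower goes on the x-axis because it is the input the model will later use to predict miles per gallon.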

Chapter 7: Splitting the Dataset into Training and Testing Sets

To evaluate the performance of a machine learning model, it's standard practice to split your dataset into two parts:

Training set: Used to train the model.
Testing set: Used to assess how well the model generalizes to new, unseen data.

A common split ratio is 80% for training and 20% for testing.
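
A split like this can be sketched by shuffling a copy of the data and slicing it. The function name `trainTestSplit` and its parameters are hypothetical; the shuffle is a standard Fisher-Yates pass, and the random source is injectable in case a reproducible split is wanted.

```javascript
// Shuffle a copy of the data (Fisher-Yates) and split it into training and testing sets.
// `random` is injectable so the split can be made reproducible; it defaults to Math.random.
function trainTestSplit(data, trainFraction = 0.8, random = Math.random) {
  const shuffled = data.slice(); // leave the caller's array untouched
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  const trainSize = Math.floor(shuffled.length * trainFraction);
  return {
    train: shuffled.slice(0, trainSize),
    test: shuffled.slice(trainSize),
  };
}

const items = Array.from({ length: 10 }, (_, i) => i);
const split = trainTestSplit(items);
console.log(split.train.length, split.test.length); // 8 2
```

Shuffling before slicing matters when the source data is ordered (for example by year or by manufacturer); otherwise the test set may not resemble the training set.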

Chapter 8: Building an Artificial Neural Network Model

Neural networks are a family of machine learning models that learn complex relationships between inputs and targets. Here, we will define a simple neural network that models the relationship between a car's horsepower (input) and its miles per gallon (output).

What does this model do?

This artificial neural network (ANN) is designed to learn the relationship between a car's horsepower and its fuel efficiency (miles per gallon). It consists of:

Input: Horsepower value for each car.
One hidden layer: 10 neurons with ReLU activation to capture non-linear relationships.
Output: The predicted miles per gallon for a given horsepower.

How does this work?
When you train this model using the training set (from the previous section), the network learns to estimate the car’s miles per gallon from its horsepower value. The hidden layer allows it to capture more complex, non-linear trends that a simple straight line might miss.
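
The architecture described above could be expressed with the TensorFlow.js Layers API roughly as follows. This is a sketch rather than the exact code behind the demo: the function name `createModel` is hypothetical, and `tf` is passed in as a parameter on the assumption that it is the TensorFlow.js namespace loaded on the page.

```javascript
// One hidden layer of 10 ReLU units between a single input (horsepower)
// and a single output (predicted miles per gallon).
// `tf` is the TensorFlow.js namespace, passed in explicitly.
function createModel(tf) {
  const model = tf.sequential();
  // Hidden layer: 1 input feature -> 10 units with ReLU activation.
  model.add(tf.layers.dense({ inputShape: [1], units: 10, activation: 'relu' }));
  // Output layer: 10 hidden values -> 1 predicted number (no activation).
  model.add(tf.layers.dense({ units: 1 }));
  return model;
}

// In a browser page that has loaded TensorFlow.js:
// const model = createModel(tf);
```

The output layer has no activation because the target is an unbounded number rather than a class probability.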

Chapter 9: Training the Neural Network Model

After defining the model, the next step is to train it using your dataset. Training helps the model learn the mapping between your input (horsepower) and output (miles per gallon) values.

What happens during training?

Each time you click Train Model, the model uses the cleaned dataset, gradually adjusting its weights to minimize the error between its predictions and the actual miles per gallon values. The chart above shows how the loss (error) decreases with each epoch, indicating the model is learning.
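
A training step along these lines could be sketched with `model.compile` and `model.fit`. The function name `trainModel`, the `onLoss` callback parameter, the Adam optimizer, and the epoch and batch-size values are all assumptions made for illustration; the demo may use different settings.

```javascript
// Compile and fit a model. `inputs` and `labels` are arrays of numbers
// (horsepower and miles per gallon). The optimizer, epoch count, and batch size
// are illustrative choices, not values taken from the demo.
async function trainModel(tf, model, inputs, labels, onLoss = console.log) {
  // Shape [n, 1]: n examples with one feature (or one target) each.
  const xs = tf.tensor2d(inputs, [inputs.length, 1]);
  const ys = tf.tensor2d(labels, [labels.length, 1]);

  model.compile({ optimizer: tf.train.adam(), loss: 'meanSquaredError' });

  return model.fit(xs, ys, {
    epochs: 50,
    batchSize: 32,
    shuffle: true,
    callbacks: {
      // Called after every epoch; logs.loss is the training loss for that epoch.
      onEpochEnd: (epoch, logs) => onLoss(epoch, logs.loss),
    },
  });
}

// In a browser page that has loaded TensorFlow.js and built a model:
// trainModel(tf, model, horsepowerValues, mpgValues).then(() => console.log('done'));
```

The per-epoch loss passed to `onLoss` is the kind of value a loss chart would plot.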

Chapter 10: Evaluating the Trained Neural Network Model

After training, it's important to evaluate how well the model performs on data it hasn't seen before. This helps determine if the model can generalize or if it has simply memorized the training data.

What do the results mean?

Loss is a measure of the model’s prediction error on the test data — lower values are better. The example above shows how the model's prediction compares to the actual value from the test set. If the loss is much higher than it was during training, the model may be overfitting.
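
The comparison between training loss and test loss can be illustrated in plain JavaScript. In this sketch, `lossOn` is a hypothetical helper, `toyPredict` is a stand-in for a trained model, and every number is made up for illustration.

```javascript
// Mean squared error of a prediction function over a set of { horsepower, mpg } points.
// `predict` is any function mapping horsepower to predicted mpg; the field names are hypothetical.
function lossOn(points, predict) {
  const total = points.reduce((sum, p) => {
    const error = predict(p.horsepower) - p.mpg;
    return sum + error * error;
  }, 0);
  return total / points.length;
}

// Made-up numbers for illustration only.
const trainPoints = [{ horsepower: 100, mpg: 25 }, { horsepower: 150, mpg: 20 }];
const testPoints = [{ horsepower: 120, mpg: 21 }, { horsepower: 200, mpg: 12 }];
const toyPredict = (hp) => 35 - 0.1 * hp; // stand-in for a trained model

console.log('train loss:', lossOn(trainPoints, toyPredict));
console.log('test loss:', lossOn(testPoints, toyPredict));
```

Here the stand-in model fits the training points almost perfectly but misses the test points by a few miles per gallon, which is the pattern that suggests overfitting.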

Chapter 11: Understanding the Confusion Matrix

When evaluating classification models, the confusion matrix is a valuable tool for visualizing the performance of your model. It summarizes predictions by showing where the model made correct guesses and where it made mistakes.

What is a Confusion Matrix?

A confusion matrix is a table that compares the predicted classes to the actual classes. In the layout below, each row represents the instances of an actual class and each column represents the instances of a predicted class; some references transpose this.

	Predicted: Positive	Predicted: Negative
Actual: Positive	True Positive (TP)	False Negative (FN)
Actual: Negative	False Positive (FP)	True Negative (TN)

How to Read the Confusion Matrix

True Positive (TP): Model correctly predicts the positive class.
True Negative (TN): Model correctly predicts the negative class.
False Positive (FP): Model incorrectly predicts positive when it is actually negative (Type I error).
False Negative (FN): Model incorrectly predicts negative when it is actually positive (Type II error).

Key Metrics from the Confusion Matrix

Accuracy: (TP + TN) / (TP + TN + FP + FN) — Overall, how often is the classifier correct?
Precision: TP / (TP + FP) — Of all predicted positives, how many are correct?
Recall (Sensitivity): TP / (TP + FN) — Of all actual positives, how many did the model identify?
F1 Score: 2 * (Precision * Recall) / (Precision + Recall) — The harmonic mean of Precision and Recall.
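
These formulas translate directly into code. The sketch below takes the four counts as an object; the function name `classificationMetrics` is hypothetical, and reporting 0 when a denominator is zero is an assumed convention rather than a universal rule.

```javascript
// Derive the four metrics from confusion-matrix counts.
// A ratio whose denominator is zero is reported as 0 here; that convention is an assumption.
function classificationMetrics({ TP, TN, FP, FN }) {
  const ratio = (num, den) => (den === 0 ? 0 : num / den);
  const accuracy = ratio(TP + TN, TP + TN + FP + FN);
  const precision = ratio(TP, TP + FP);
  const recall = ratio(TP, TP + FN);
  const f1 = ratio(2 * precision * recall, precision + recall);
  return { accuracy, precision, recall, f1 };
}

console.log(classificationMetrics({ TP: 3, TN: 2, FP: 1, FN: 1 }));
// accuracy = 5/7 (about 0.714), precision = 0.75, recall = 0.75, f1 = 0.75
```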

Using a Confusion Matrix in Code

Here’s how you can compute a confusion matrix in JavaScript, assuming you have arrays of predicted and actual labels:

```
// Example arrays of binary labels (1 = positive, 0 = negative)
const yTrue = [1, 0, 1, 0, 1, 1, 0];
const yPred = [1, 0, 0, 0, 1, 1, 1];

// Count each of the four outcomes by comparing the labels position by position.
function confusionMatrix(yTrue, yPred) {
  let TP = 0, TN = 0, FP = 0, FN = 0;
  for (let i = 0; i < yTrue.length; i++) {
    if (yTrue[i] === 1 && yPred[i] === 1) TP++;
    else if (yTrue[i] === 0 && yPred[i] === 0) TN++;
    else if (yTrue[i] === 0 && yPred[i] === 1) FP++;
    else if (yTrue[i] === 1 && yPred[i] === 0) FN++;
  }
  return {TP, TN, FP, FN};
}

const matrix = confusionMatrix(yTrue, yPred);
console.log(matrix); // { TP: 3, TN: 2, FP: 1, FN: 1 }
```

Why Use the Confusion Matrix?

The confusion matrix provides insight into how your model is making errors, which helps you fine-tune and improve your model’s performance, especially when classes are imbalanced.