jdBasic

AI with BASIC
Autodiff, Tensors, and Transformers

jdBasic isn’t only retro fun — it has a modern Tensor + Autodiff engine. Build neural networks, train them, and experiment with Transformer-style LLMs right in BASIC.

What you’ll build

Neural basics

Learn how neurons, layers, and activations work — using jdBasic arrays and matrix math.

Autodiff training

Train a small network to solve XOR using TENSOR.BACKWARD.

Transformer demo

See how attention blocks stack into a Transformer that learns next-character prediction.

1) The AI building blocks in jdBasic

Plain arrays: neuron + layer

jdBasic array math is great for understanding the “math inside” a network.

' --- A single neuron (array math) ---
INPUTS  = [0.5, -1.2, 0.8]
WEIGHTS = [0.8,  0.1, -0.4]
BIAS    = 0.5

WEIGHTED_SUM = SUM(INPUTS * WEIGHTS)
OUTPUT = WEIGHTED_SUM + BIAS

PRINT "Output:"; OUTPUT

Tensors + autodiff

When you call TENSOR.BACKWARD, jdBasic walks the computation graph backward and fills in a .grad field on every tensor that contributed to the result.

' --- Convert arrays into tensors ---
X = TENSOR.FROM([[1, 2], [3, 4]])
W = TENSOR.FROM([[5, 6], [7, 8]])

Y = TENSOR.MATMUL(X, W)

' --- Backprop through the graph ---
TENSOR.BACKWARD Y

PRINT "Grad of X:"
PRINT FRMV$(TENSOR.TOARRAY(X.grad))
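
Note that TENSOR.BACKWARD is a statement, not a function: it returns nothing and instead deposits a gradient on each participating tensor's .grad field, which TENSOR.TOARRAY turns back into a plain array for printing.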

2) Train XOR

XOR is the “hello world” of training. In jdBasic, autodiff makes the loop surprisingly clean.

Full XOR Training Loop

' ==========================================================
' == Autodiff Neural Network: Learn XOR in jdBasic
' ==========================================================

' 1) Training data
TRAINING_INPUT_DATA  = [[0, 0], [0, 1], [1, 0], [1, 1]]
TRAINING_OUTPUT_DATA = [[0], [1], [1], [0]]

INPUTS  = TENSOR.FROM(TRAINING_INPUT_DATA)
TARGETS = TENSOR.FROM(TRAINING_OUTPUT_DATA)

' 2) Model definition (2 → 3 → 1)
MODEL = {}
HIDDEN_LAYER = TENSOR.CREATE_LAYER("DENSE", {"input_size": 2, "units": 3})
OUTPUT_LAYER = TENSOR.CREATE_LAYER("DENSE", {"input_size": 3, "units": 1})
MODEL{"layers"} = [HIDDEN_LAYER, OUTPUT_LAYER]

' 3) Optimizer + training params
OPTIMIZER = TENSOR.CREATE_OPTIMIZER("SGD", {"learning_rate": 0.1})
EPOCHS = 15000

' 4) Forward pass (sigmoid activations)
FUNC MODEL_FORWARD(current_model, input_tensor)
    temp = input_tensor
    layers = current_model{"layers"}
    FOR i = 0 TO LEN(layers) - 1
        layer = layers[i]
        temp = TENSOR.MATMUL(temp, layer{"weights"}) + layer{"bias"}
        temp = TENSOR.SIGMOID(temp)
    NEXT i
    RETURN temp
ENDFUNC

' 5) Loss function (MSE)
FUNC MSE_LOSS(predicted, actual)
    err = actual - predicted
    RETURN SUM(err ^ 2) / LEN(TENSOR.TOARRAY(err))[0]
ENDFUNC

' 6) Training loop
FOR epoch = 1 TO EPOCHS
    PRED = MODEL_FORWARD(MODEL, INPUTS)
    LOSS = MSE_LOSS(PRED, TARGETS)

    TENSOR.BACKWARD LOSS
    MODEL = TENSOR.UPDATE(MODEL, OPTIMIZER)

    IF epoch MOD 1000 = 0 THEN
        PRINT "Epoch:"; epoch; " Loss:"; TENSOR.TOARRAY(LOSS)
    ENDIF
NEXT epoch
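
Once the loop finishes, confirm that the network actually learned the truth table by reusing the forward pass:

' 7) Final check: predictions should approach 0, 1, 1, 0
PRED = MODEL_FORWARD(MODEL, INPUTS)
PRINT "Final predictions:"
PRINT FRMV$(TENSOR.TOARRAY(PRED))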

3) Debugging: gradients in 10 lines

When training "does nothing", run this check:

PRINT "--- Testing MATMUL + BACKWARD ---"

A = TENSOR.FROM([[1, 2], [3, 4]])
B = TENSOR.FROM([[5, 6], [7, 8]])

C = TENSOR.MATMUL(A, B)
PRINT "C:"
PRINT FRMV$(TENSOR.TOARRAY(C))

TENSOR.BACKWARD C

PRINT "Grad(A):"
PRINT FRMV$(TENSOR.TOARRAY(A.grad))
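
With these inputs, C should print as [[19, 22], [43, 50]]. The exact values of Grad(A) depend on how the engine seeds a non-scalar backward pass; assuming it seeds with ones (a common default, not confirmed here), each row of Grad(A) holds the row sums of B, i.e. [[11, 15], [11, 15]]. If C is right but Grad(A) is empty or zero, the problem is in the backward pass, not in your model.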

4) A tiny Transformer LLM

This demo shows a character-level Transformer that learns next-character prediction. It uses real embeddings, positional encoding, and stacked attention blocks.

Core steps

  • Build a character vocabulary (sketched below)
  • Embed tokens into vectors
  • Run stacked self-attention blocks (pre-LN)
  • Train with cross-entropy loss

Note: Full demo files live in the repo scripts folder. Keep datasets small for browser use.
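
As a concrete starting point, here is one way to sketch the first step, building the character vocabulary, using only arrays plus classic BASIC string handling (LEN and MID$ are assumed to work on strings as in standard BASIC; APPEND is the same helper used in the snippet below):

' --- Build a character vocabulary (sketch) ---
TEXT$ = "hello world"
VOCAB = []
FOR i = 1 TO LEN(TEXT$)
    C$ = MID$(TEXT$, i, 1)
    SEEN = 0
    FOR j = 0 TO LEN(VOCAB) - 1
        IF VOCAB[j] = C$ THEN
            SEEN = 1
        ENDIF
    NEXT j
    IF SEEN = 0 THEN
        VOCAB = APPEND(VOCAB, C$)
    ENDIF
NEXT i
' A character's token id is simply its index in VOCAB
PRINT "Vocab size:"; LEN(VOCAB)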

Snippet: Stacking Blocks

' --- Stack Transformer blocks ---
MODEL{"layers"} = []
FOR i = 0 TO NUM_LAYERS - 1
    layer = {}
    layer{"attention"} = TENSOR.CREATE_LAYER("ATTENTION", {"embedding_dim": EMBEDDING_DIM})
    layer{"norm1"} = TENSOR.CREATE_LAYER("LAYER_NORM", {"dim": HIDDEN_DIM})
    layer{"ffn1"}  = TENSOR.CREATE_LAYER("DENSE", {"input_size": HIDDEN_DIM, "units": HIDDEN_DIM * 2})
    layer{"norm2"} = TENSOR.CREATE_LAYER("LAYER_NORM", {"dim": HIDDEN_DIM})
    MODEL{"layers"} = APPEND(MODEL{"layers"}, layer)
NEXT i
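
To see how these pieces connect, here is the shape of one pre-LN block's forward pass. This is a sketch only: TENSOR.LAYERNORM and TENSOR.ATTENTION are assumed helper names (the full demo in the repo scripts folder shows the real calls), and TENSOR.SIGMOID stands in for whatever activation the demo uses. What matters is the order: normalize first, then attend or feed forward, then add the residual.

' --- One pre-LN Transformer block (sketch; helper names assumed) ---
FUNC BLOCK_FORWARD(layer, x)
    ' Attention sub-block: x + Attention(LayerNorm(x))
    normed = TENSOR.LAYERNORM(x, layer{"norm1"})
    x = x + TENSOR.ATTENTION(normed, layer{"attention"})

    ' Feed-forward sub-block: x + FFN(LayerNorm(x))
    normed = TENSOR.LAYERNORM(x, layer{"norm2"})
    h = TENSOR.SIGMOID(TENSOR.MATMUL(normed, layer{"ffn1"}{"weights"}) + layer{"ffn1"}{"bias"})
    h = TENSOR.MATMUL(h, layer{"ffn2"}{"weights"}) + layer{"ffn2"}{"bias"}
    RETURN x + h
ENDFUNC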

Practical Tips

Keep it small

Browser execution is slower than native C++.

  • HIDDEN_DIM = 32 or 64
  • NUM_LAYERS = 1 to start
  • Print loss every 50 steps, not every step

Save and reuse

Don't retrain every time.

TENSOR.SAVEMODEL MODEL, "my_model.json"
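
and load it back in before inference. The loader name below is an assumption (only the save call is shown in this guide), so check the repo for the exact counterpart:

' Reload a trained model (TENSOR.LOADMODEL is an assumed name)
MODEL = TENSOR.LOADMODEL("my_model.json")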