ML Frameworks

VectorForgeML

Production-grade ML framework built from scratch in C++

9+ Algorithms
C++/R Languages
BLAS Hardware Accel.
OpenMP Parallelism
Zero-Copy R Bridge

Overview

A next-generation machine learning framework bridging R's simplicity with C++'s raw performance. Features 9+ algorithms, BLAS/LAPACK hardware acceleration, OpenMP parallelism, and zero-copy R integration via Rcpp.

VectorForgeML was designed to solve a real problem: R is excellent for statistical computing and data exploration, but its interpreted nature creates a performance ceiling for large-scale ML workloads. By pushing all compute-intensive operations into optimized C++ while maintaining an intuitive R interface, VectorForgeML delivers the best of both worlds.

Architecture

The framework is organized into three distinct layers, each responsible for a specific concern. Data flows from the R user interface through the zero-copy Rcpp bridge into the high-performance C++ core.

R Layer: Pipeline API, Metrics & Evaluation, Data Preprocessing, User Interface
Rcpp Bridge: Zero-Copy Transfer, NumericMatrix to double*, No Data Duplication, Type-safe Bindings
C++ Core: BLAS/LAPACK Math, OpenMP Threading, Raw Pointer Graphs, Custom Allocators

Algorithms Implemented

Every algorithm is implemented from scratch in C++ with no external ML library dependencies. Each implementation leverages hardware-specific optimizations where applicable.

Linear Regression (Supervised)

OLS via BLAS/LAPACK matrix operations. Hardware-accelerated normal equation solver.

Logistic Regression (Supervised)

Gradient descent with sigmoid activation. Configurable learning rate and epochs.
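As a sketch of what one such update looks like in plain C++ (illustrative only; `gd_epoch` and the row-major layout here are assumptions, not the framework's actual internals):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sigmoid activation.
double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

// One full-batch gradient-descent epoch for logistic regression.
// X: n x d design matrix (row-major), y: labels in {0, 1},
// w: weights, updated in place with learning rate lr.
void gd_epoch(const std::vector<double>& X, const std::vector<double>& y,
              std::vector<double>& w, int n, int d, double lr) {
  std::vector<double> grad(d, 0.0);
  for (int i = 0; i < n; ++i) {
    double z = 0.0;
    for (int j = 0; j < d; ++j) z += X[i * d + j] * w[j];
    double err = sigmoid(z) - y[i];  // dL/dz for the log-loss
    for (int j = 0; j < d; ++j) grad[j] += err * X[i * d + j];
  }
  for (int j = 0; j < d; ++j) w[j] -= lr * grad[j] / n;
}
```

Repeating `gd_epoch` for a configured number of epochs is the training loop; the learning rate and epoch count are the two knobs the description mentions.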

Ridge Regression (Supervised)

L2-regularized regression via Cholesky decomposition. Prevents overfitting with lambda tuning.
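The Cholesky route can be sketched in plain C++ (the framework routes this through LAPACK; `chol_solve` is a hypothetical helper that solves the regularized normal equations (XᵀX + λI)w = Xᵀy once that system has been formed):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Solve A w = b for a symmetric positive-definite A (n x n, row-major)
// by Cholesky factorization A = L L^T plus forward/back substitution.
// For ridge regression, A = X^T X + lambda * I and b = X^T y; the
// lambda * I term keeps A positive definite even when X^T X is singular.
std::vector<double> chol_solve(std::vector<double> A, std::vector<double> b, int n) {
  // Factor the lower triangle in place.
  for (int j = 0; j < n; ++j) {
    for (int k = 0; k < j; ++k) A[j*n + j] -= A[j*n + k] * A[j*n + k];
    A[j*n + j] = std::sqrt(A[j*n + j]);
    for (int i = j + 1; i < n; ++i) {
      for (int k = 0; k < j; ++k) A[i*n + j] -= A[i*n + k] * A[j*n + k];
      A[i*n + j] /= A[j*n + j];
    }
  }
  // Forward solve L z = b (z overwrites b).
  for (int i = 0; i < n; ++i) {
    for (int k = 0; k < i; ++k) b[i] -= A[i*n + k] * b[k];
    b[i] /= A[i*n + i];
  }
  // Back solve L^T w = z.
  for (int i = n - 1; i >= 0; --i) {
    for (int k = i + 1; k < n; ++k) b[i] -= A[k*n + i] * b[k];
    b[i] /= A[i*n + i];
  }
  return b;
}
```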

Softmax Regression (Supervised)

Multi-class classification with the Log-Sum-Exp trick for numerical stability.
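The Log-Sum-Exp trick amounts to subtracting the maximum logit before exponentiating, so `exp()` never overflows even for very large inputs. A minimal stand-alone sketch (not the framework's code):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Numerically stable softmax: shifting every logit by max(z) leaves the
// output unchanged mathematically, but keeps each exp() argument <= 0.
std::vector<double> stable_softmax(const std::vector<double>& z) {
  double m = *std::max_element(z.begin(), z.end());
  std::vector<double> p(z.size());
  double sum = 0.0;
  for (size_t i = 0; i < z.size(); ++i) {
    p[i] = std::exp(z[i] - m);
    sum += p[i];
  }
  for (double& v : p) v /= sum;
  return p;
}
```

With logits like {1000, 1000} a naive `exp(z[i])` would overflow to infinity; the shifted version returns {0.5, 0.5} exactly.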

Decision Tree (Supervised)

Recursive partitioning with raw C++ graph pointers. Custom node allocation.
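A plausible shape for such a pointer-based tree (illustrative only; the actual node layout and field names are assumptions):

```cpp
#include <cassert>

// A split node holds a feature index and threshold; leaves hold a label.
// Children are raw pointers, matching the "raw C++ graph pointers" design.
struct Node {
  int feature = -1;       // -1 marks a leaf
  double threshold = 0.0;
  int label = 0;          // class prediction at a leaf
  Node* left = nullptr;   // samples with x[feature] <  threshold
  Node* right = nullptr;  // samples with x[feature] >= threshold
};

// Walk the tree until a leaf is reached.
int predict(const Node* n, const double* x) {
  while (n->feature != -1)
    n = (x[n->feature] < n->threshold) ? n->left : n->right;
  return n->label;
}
```

A custom allocator would hand out `Node` objects from a contiguous arena rather than calling `new` per node, which is the kind of optimization the description hints at.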

Random Forest (Supervised)

Parallelized ensemble via OpenMP. Multi-core tree training with bagging.

K-Nearest Neighbors (Supervised)

Optimized with std::partial_sort for efficient k-selection without full sort.
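The k-selection idea in isolation (a sketch; `k_nearest` is a hypothetical helper, not the framework's API):

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Select the indices of the k smallest distances. std::partial_sort only
// orders the first k elements, costing O(n log k) instead of the
// O(n log n) of a full sort.
std::vector<int> k_nearest(std::vector<std::pair<double, int>> dist, int k) {
  std::partial_sort(dist.begin(), dist.begin() + k, dist.end());
  std::vector<int> idx(k);
  for (int i = 0; i < k; ++i) idx[i] = dist[i].second;  // neighbor indices
  return idx;
}
```

For a typical k of 5 to 20 against thousands of training points, avoiding the full sort is a meaningful constant-factor win per query.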

PCA (Unsupervised)

Principal Component Analysis via SVD/eigendecomposition. LAPACK-accelerated.
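One way to see how a leading component falls out of the covariance matrix is power iteration; this stand-alone sketch is illustrative only, since the framework itself calls into LAPACK for the full decomposition:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Power iteration: repeatedly multiply a unit vector by the covariance
// matrix and renormalize; the vector converges to the leading
// eigenvector, i.e. the first principal component.
// cov is d x d, row-major.
std::vector<double> top_component(const std::vector<double>& cov, int d, int iters) {
  std::vector<double> v(d, 1.0 / std::sqrt(static_cast<double>(d)));
  for (int t = 0; t < iters; ++t) {
    std::vector<double> w(d, 0.0);
    for (int i = 0; i < d; ++i)
      for (int j = 0; j < d; ++j)
        w[i] += cov[i * d + j] * v[j];  // w = cov * v
    double norm = 0.0;
    for (double wi : w) norm += wi * wi;
    norm = std::sqrt(norm);
    for (int i = 0; i < d; ++i) v[i] = w[i] / norm;
  }
  return v;
}
```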

K-Means Clustering (Unsupervised)

Lloyd's algorithm with efficient centroid updates and convergence detection.
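A single Lloyd's iteration, sketched in 1-D for brevity (illustrative; the framework's implementation is d-dimensional and `lloyd_step` is a hypothetical name):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// One Lloyd's iteration: assign each point to its closest centroid, then
// recompute each centroid as the mean of its cluster. Returns the largest
// centroid movement; convergence is detected when it drops below a tol.
double lloyd_step(const std::vector<double>& x, std::vector<double>& c) {
  std::vector<double> sum(c.size(), 0.0);
  std::vector<int> count(c.size(), 0);
  for (double xi : x) {
    size_t best = 0;
    for (size_t j = 1; j < c.size(); ++j)
      if (std::fabs(xi - c[j]) < std::fabs(xi - c[best])) best = j;
    sum[best] += xi;
    count[best]++;
  }
  double max_shift = 0.0;
  for (size_t j = 0; j < c.size(); ++j) {
    if (count[j] == 0) continue;  // keep empty clusters in place
    double updated = sum[j] / count[j];
    max_shift = std::max(max_shift, std::fabs(updated - c[j]));
    c[j] = updated;
  }
  return max_shift;
}
```

Running `lloyd_step` in a loop until the returned shift falls below a tolerance is the "convergence detection" the description refers to.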

Technical Deep Dive

Under the hood, VectorForgeML makes deliberate low-level engineering decisions to maximize throughput while maintaining correctness and a clean API surface.

BLAS/LAPACK Integration

All linear algebra operations are routed through BLAS and LAPACK, meaning matrix multiplications, decompositions, and solvers run on hardware-optimized routines rather than naive loops. This single decision delivers order-of-magnitude speedups for algorithms like Linear Regression, Ridge Regression, and PCA.

// Naive O(n^3) matrix multiply
// (row-major flat arrays; C zero-initialized)
for (int i = 0; i < n; i++)
  for (int j = 0; j < n; j++)
    for (int k = 0; k < n; k++)
      C[i*n + j] += A[i*n + k] * B[k*n + j];

// BLAS: single call, hardware-optimized
cblas_dgemm(CblasRowMajor, CblasNoTrans,
  CblasNoTrans, n, n, n, 1.0,
  A, n, B, n, 0.0, C, n);

OpenMP Parallelism

Ensemble methods like Random Forest benefit directly from thread-level parallelism. Each tree in the forest is trained independently, making this a naturally parallelizable workload. OpenMP distributes tree training across all available CPU cores with minimal synchronization overhead.

// Parallel tree training in Random Forest
#pragma omp parallel for schedule(dynamic)
for (int i = 0; i < n_trees; i++) {
  // Each thread trains one tree
  auto sample = bootstrap_sample(data);
  trees[i] = build_tree(sample,
    max_depth, min_samples);
}

// Aggregate predictions across trees
auto predictions = majority_vote(trees, X_test);

Zero-Copy R Bridge

The Rcpp bridge eliminates data serialization overhead by sharing memory directly between R and C++. R's NumericMatrix is converted to a raw double pointer without copying, meaning even gigabyte-scale datasets incur zero transfer cost when crossing the language boundary.

// Zero-copy: R matrix -> C++ pointer
// [[Rcpp::export]]
Rcpp::NumericVector cpp_predict(
    Rcpp::NumericMatrix X) {

  // Direct pointer access, no copy
  double* ptr = REAL(X);
  int nrow = X.nrow();
  int ncol = X.ncol();

  // Run prediction on raw memory
  return predict_internal(ptr, nrow, ncol);
}

Pipeline Architecture

Inspired by scikit-learn's Pipeline API, VectorForgeML provides a composable pipeline system in R that chains preprocessing, training, and evaluation into a single reproducible workflow. This pattern prevents data leakage and ensures consistent transformations across train and test sets.

# Scikit-learn-style pipeline in R
pipeline <- VFPipeline(
  StandardScaler(),
  PCA(n_components = 10),
  RandomForest(
    n_trees = 100,
    max_depth = 12
  )
)

# Fit and predict in one call
pipeline$fit(X_train, y_train)
preds <- pipeline$predict(X_test)

Code Examples

VectorForgeML exposes a clean, high-level R API. All heavy computation happens in C++ behind the scenes.

Classification

library(VectorForgeML)

# Load and split data
data <- read.csv("dataset.csv")
split <- train_test_split(data, ratio = 0.8)

# Train Random Forest classifier
model <- RandomForest(
  n_trees = 200,
  max_depth = 15,
  min_samples = 5
)
model$fit(split$X_train, split$y_train)

# Evaluate
preds <- model$predict(split$X_test)
accuracy(split$y_test, preds)
confusion_matrix(split$y_test, preds)

Regression

library(VectorForgeML)

# Load and split data
data <- read.csv("housing.csv")
split <- train_test_split(data, ratio = 0.8)

# Train Ridge Regression
model <- RidgeRegression(
  lambda = 0.01
)
model$fit(split$X_train, split$y_train)

# Evaluate
preds <- model$predict(split$X_test)
rmse(split$y_test, preds)
r_squared(split$y_test, preds)

Research Paper

Published

VectorForgeML: A Production-Grade Machine Learning Framework in C++

Mohd. Musheer

This paper presents VectorForgeML, a modular machine learning framework built entirely from scratch in C++ with R bindings. The framework implements 9+ supervised and unsupervised algorithms including Linear Regression, Logistic Regression, Ridge Regression, Softmax Regression, Decision Trees, Random Forests, KNN, PCA, and K-Means Clustering. Key architectural decisions include BLAS/LAPACK integration for hardware-accelerated linear algebra, OpenMP for multi-core parallelism, and zero-copy data exchange between R and C++ via Rcpp.

Installation

VectorForgeML is distributed as an R package and can be installed directly from GitHub using the remotes package.

Step 1: Install remotes (if not already installed)

install.packages("remotes")

Step 2: Install VectorForgeML from GitHub

remotes::install_github("mohd-musheer/VectorForgeML")