Index

A

Absolute error
Accuracy measure
depth of tree
number of trees
Activation functions
definition
in Excel
sigmoid function
Adaptive boosting (AdaBoost)
high-level algorithm
weak learner
Alice dataset
build model
encode output
import package
iterations
normalize file
one-hot-encode
read dataset
run model
target datasets
Amazon Web Services (AWS)
console
host name
setting private key
username, adding
in VM
Area under the curve (AUC)
Artificial neural network
See Neural network

B

Back propagation
in CNN
definition
in Excel
learning rate
Bagging
See Bootstrap aggregating
Bias term
Bootstrap aggregating

C

Cloud-based analysis
Amazon Web Services
file transfer
GCP
Jupyter Notebooks
Microsoft Azure
R on instance
Clustering
ideal clustering
informed locations
k-means
middle locations
optimal K value
process
random locations
reassigning households
recomputing middles
significance
store clusters for performance comparison
top-down vs. bottom-up clustering
use-case
Collaborative filtering
Confusion matrix
Continuous bag of words (CBOW)
Continuous independent variables
continuous dependent variable and
decision tree for
and discrete variables
response variable
Convolutional neural network (CNN)
backward propagation
convolution
definition
max pooling
one pooling step after
pooling
prediction
ReLU activation function
smaller matrix
data augmentation
in Excel
feed forward network
flattening process
fully connected layer
image of pixels
LeNet
in R
three-convolution pooling layer
Cosine similarity
average rating
error, calculation
parameter combination
Cross entropy
Cross-validation technique
Customer tweets
convert to lowercase
embedding layer
index value
map index
packages
sequence length
train and test datasets

D

Data augmentation
Decision tree
branch/sub-tree
business user
child node
common techniques
components
continuous independent variables
See Continuous independent variables
decision node
multiple independent variables
overfitting
parent node
plot function
in Python
in R
root node
See Root node
rules engine
splitting process
terminal node
visualizing
Deep learning
Dependent variable
Discrete independent variable
Discrete values

E

Entropy
Euclidean distance
issue with single user
user normalization

F

Feature generation process
Feature interaction process
Feed forward network
Fetch data
File transfer
setting private key
WinSCP login
Flattening process
Forward propagation
hidden layer
synapses
XOR function
Fraudulent transaction
F-statistic
Fully connected layer

G

Gini impurity
Google Cloud Platform (GCP)
Auth options
key pair in PuTTYgen
selecting OS
VM option
Gradient Boosting Machine (GBM)
algorithm
AUC
column sampling
decision tree
definition
in Python
in R
row sampling
shrinkage
Gradient descent neural networks
definition
known function

H

Hierarchical clustering
Hyper-parameters

I, J

Ideal clustering
IMDB dataset
Independent variable
Information gain
Integrated development environment (IDE)
Item-based collaborative filtering (IBCF)

K

Kaggle
keras framework
in Python
in R
K-means clustering algorithm
betweenss
cluster centers
dataset
properties
totss
tot.withinss
K-nearest neighbors

L

Leaf node
See Terminal node
Learning rate
Least squares method
Linear regression
causation
correlation
definition
dependent variable
discrete values
error
homoscedasticity
independent variable
multivariate
See Multivariate linear regression
simple vs. multivariate
Logistic regression
accuracy measure
AUC metric
cumulative frauds
definition
error measure
in Excel
fraudulent transaction
independent variables
interpreting
probability
in Python
in R
random guess model
sigmoid curve to
time gap
Log/squared transformation
Long short-term memory (LSTM)
architecture of
cell state
forget gate
for sentiment classification
toy model
build model
documents and labels
in Excel
import packages
model.layers
one-hot-encode
order of weights
pad documents
Loss optimization functions

M

Machine learning
building, deploying, testing, and iterating
classification
e-commerce transactions
overfitted dataset
productionalizing model
regression
supervised/unsupervised
validation dataset
Matrix factorization
constraint
objective
in Python
in R
Mean squared error (MSE)
Measures of accuracy
absolute error
confusion matrix
root mean square error
Microsoft Azure
IP address
VM, page
Microsoft Excel
Missing values
MNIST
Multicollinearity
Multivariate linear regression
coefficients
in Excel
multicollinearity
non-significant variable
observations
problem
in Python
in R

N

Negative sampling
Neural network
activation functions
See Activation functions
back propagation
backward propagation
definition
in Excel
learning rate
forward propagation
definition
hidden layer
synapses
XOR function
hidden layer
keras framework
in Python
in R
loss optimization functions
in Python
scaling
structure of
synapses
Word2vec
See Word2vec model
Normalizing variables
Null deviance

O

Outliers
Overall squared error

P, Q

Pooling
Principal component analysis (PCA)
data scaling
dataset
MNIST
multiple variables
objective and constraints
in Python
in R
relation plot
variables
Pruning process
Python
Anaconda prompt
coding editor
Jupyter web page

R

Random forest
algorithm for
definition
depth of trees
entropy
error message
factor variable
importance function
MeanDecreaseGini
missing values
movie scenario
number of trees
parameters
in Python
rpart package
test dataset
Receiver operating characteristic (ROC) curve
Recurrent neural networks (RNNs)
Alice dataset
See Alice dataset
customer tweets
convert to lowercase
embedding layer
index value
map index
packages
sequence length
train and test datasets
exploding gradient
memory in hidden layer
with multiple steps
multiple way architecture
in R
simpleRNN function
text mining techniques
“this is an example”
calculation for hidden layer
encoded words
matrix multiplication
structure
time step
weight matrix
toy model
in Excel
initialize documents
same size
single output
vanishing gradient
ReLU activation function
Response variable
Root mean squared error (RMSE)
Root node
R programming language
R squared
RStudio

S

Sigmoid function
features
to logistic regression
mathematical formula
Simple linear regression
bias term
coefficients section
complicating
in Excel
F-statistic
gradient descent
vs. multivariate
null deviance
overall squared error
pitfalls
in Python
in R
representation
residuals
RMSE
R squared
slope
solving
SSE
Softmax
activation
binary classification
cross entropy error
one-hot-encode
Splitting process
definition
disadvantage of
Gini impurity
information gain
sub-nodes
uncertainty
calculating
measure improvement in
original dataset
Squared error
Stochastic gradient descent
See Gradient descent neural networks
Sum of squared error (SSE)
Supervised learning

T

Terminal node
Top-down clustering
Toy model
LSTM
build model
documents and labels
in Excel
import packages
model.layers
one-hot-encode
order of weights
pad documents
RNNs
in Excel
initialize documents
same size
single output
time steps
Traditional neural network (NN)
highlight image
limitations of
original average image
original average pixel
translate pixel
Training data
Tree-based algorithms

U

Unsupervised learning
User-based collaborative filtering (UBCF)
cosine similarity
Euclidean distance
UBCF

V

Validation dataset
Vanishing gradient
Variable transformations
Virtual machine (VM)

W, X, Y, Z

Word2vec model
frequent words
gensim package
negative sampling
one-hot-encode
Word vector
context words
dimensional vector
cross entropy loss
hidden layer
softmax