Index
A
- A/B model testing, Feature: Feedback Loops, A/B Testing of Models
- access control
- account creation, Account Creation-Reputation scores
- account takeover (ATO), Cyber Threat Landscape, Authentication and Account Takeover-Building your classifier
- activation functions, Neural Networks
- active authentication, Access Control and Authentication
- active learning, Feature: Feedback Loops, A/B Testing of Models
- active network attacks, Active attacks-Active attacks
- adb (Android Debug Bridge), Behavioral (dynamic) analysis
- ADMM (alternating direction method of multipliers), Performance Optimization
- advanced persistent threats (APT), Cyber Threat Landscape
- adversarial examples, Mitigating Adversarial Effects
- adversarial machine learning, Adversaries Using Machine Learning, Adversarial Machine Learning-Conclusion
- adversarial space
- adversarial training, Defense Against Evasion Attacks
- adware, Cyber Threat Landscape
- agglomerative (bottom-up) hierarchical clustering, Hierarchical clustering
- AI (artificial intelligence), What Machine Learning Is Not-What Machine Learning Is Not
- Aircrack-ng, Passive attacks
- alerting
- algorithms
- alternating direction method of multipliers (ADMM), Performance Optimization
- Amazon Machine Learning, Using Cloud Services
- Amazon Web Services (AWS), Using Cloud Services
- Android
- Android Debug Bridge (adb), Behavioral (dynamic) analysis
- Android Package Kit (APK), Structural analysis
- Android Runtime (ART), Android malware analysis
- ANN (artificial neural networks) (see neural networks)
- anomaly detection, Real-World Uses of Machine Learning in Security, Anomaly Detection-Conclusion
- challenges of using machine learning, Challenges of Using Machine Learning in Anomaly Detection
- data and algorithms, Anomaly Detection with Data and Algorithms-In Summary
- data-driven methods, Data-Driven Methods-Data-Driven Methods
- deep packet inspection, Deep packet inspection-Deep packet inspection
- defined, Machine Learning: Problems and Approaches
- density-based methods, Density-Based Methods-Local outlier factor
- feature engineering, Feature Engineering for Anomaly Detection-In Summary
- feedback loops, Feature: Feedback Loops, A/B Testing of Models-Feature: Feedback Loops, A/B Testing of Models
- forecasting, Forecasting (Supervised Machine Learning)-Summary
- goodness-of-fit tests, Goodness-of-Fit-Elliptic envelope fitting (covariance estimate fitting)
- host intrusion detection, Host Intrusion Detection-osquery
- integrating human feedback in systems, Integrating Human Feedback
- intrusion detection with heuristics, Intrusion Detection with Heuristics
- isolation forests, Isolation forests-Isolation forests
- local outlier factor, Local outlier factor-Local outlier factor
- maintainability of systems, Maintainability of Anomaly Detection Systems
- mitigating adversarial effects, Mitigating Adversarial Effects
- network intrusion detection, Network Intrusion Detection-Features for network intrusion detection
- one-class Support Vector Machines, One-class support vector machines-One-class support vector machines
- online learning and, Attack Technique: Model Poisoning
- optimizing system for explainability, Optimizing for Explainability
- outlier detection vs., Anomaly Detection
- performance and scalability in real-time streaming applications, Performance and scalability in real-time streaming applications
- practical system design concerns, Practical System Design Concerns-Mitigating Adversarial Effects
- response and mitigation, Response and Mitigation
- statistical metrics, Statistical Metrics-Summary
- supervised learning vs., When to Use Anomaly Detection Versus Supervised Learning
- unsupervised machine learning algorithms, Unsupervised Machine Learning Algorithms-Isolation forests
- web application intrusion detection, Web Application Intrusion Detection-Web Application Intrusion Detection
- APK (Android Package Kit), Structural analysis
- ApkFile tool, From Features to Classification
- application sandbox, Behavioral (dynamic) analysis
- APT (advanced persistent threats), Cyber Threat Landscape
- area under the curve (AUC), Choosing Thresholds and Comparing Models
- ARIMA (autoregressive integrated moving average), ARIMA-ARIMA
- ART (Android Runtime), Android malware analysis
- artificial intelligence (AI), What Machine Learning Is Not-What Machine Learning Is Not
- artificial neural networks (ANN) (see neural networks)
- ATO (account takeover), Cyber Threat Landscape, Authentication and Account Takeover-Building your classifier
- attack transferability, Attack Transferability
- AUC (area under the curve), Choosing Thresholds and Comparing Models
- authentication, Authentication and Account Takeover-Building your classifier
- autocorrelation, ARIMA
- autoencoder neural network, Unsupervised feature learning and deep learning
- AutoML, Solutions: Hyperparameter Optimization
- autoregressive integrated moving average (ARIMA), ARIMA-ARIMA
- autoregressive models, ARIMA-ARIMA
- availability attacks, Terminology
- AWS (Amazon Web Services), Using Cloud Services
B
- backdoor, Cyber Threat Landscape
- backpropagation, Neural Networks
- bag-of-words representation, Spam Fighting: An Iterative Approach, Locality-sensitive hashing
- Bayes error rate, Security Vulnerabilities in Machine Learning Algorithms
- Bayes' Theorem, Naive Bayes
- behavioral analysis, Real-World Uses of Machine Learning in Security, Behavioral (dynamic) analysis-Behavioral (dynamic) analysis
- bias, in datasets, Data Collection, Problem: Bias in Datasets-Problem: Bias in Datasets
- binary classifier evasion attack, Example: Binary Classifier Evasion Attack-Example: Binary Classifier Evasion Attack
- binary classifier poisoning attack, Example: Binary Classifier Poisoning Attack-Defense Against Poisoning Attacks
- binary data, Understanding Malware, Feature Generation
- binary relevance method, Classification
- binary trees, k-d trees-DBSCAN, Performance Optimization
- (see also decision trees)
- black-box models, Attack Transferability
- blindness, Labeling Data
- boiling frog attacks, Attack Technique: Model Poisoning
- bot (defined), Cyber Threat Landscape
- bot requests
- botnet
- defined, Cyber Threat Landscape, Active attacks
- detection, Real-World Uses of Machine Learning in Security
- hierarchical, How do botnets work?
- mechanism of operation, How do botnets work?-How do botnets work?
- multileader networks, How do botnets work?
- network traffic analysis, Botnets and You-How do botnets work?
- randomized P2P networks, How do botnets work?
- rentals, Indirect Monetization
- security risk posed by, The importance of understanding botnets
- star topology, How do botnets work?
- breaches, network, Active attacks
- Bro, Deep packet inspection
- brute-force attacks, Features used to classify login attempts
C
- C, compiled code execution in, Compiled code execution-Compiled code execution
- Calinski-Harabasz (C-H) index, Evaluating Clustering Results
- Capstone, Static analysis
- categorical variable, Machine Learning in Practice: A Worked Example
- causative attacks, Terminology
- chaff
- checkpointing, Problem: Checkpointing, Versioning, and Deploying Models
- class imbalance, Class imbalance-Class imbalance, Solutions: Data Quality
- classification
- advanced ensembling, Advanced Ensembling-Advanced Ensembling
- authentication abuse and, Building your classifier
- defined, What Is Machine Learning?
- imbalanced classes, Class imbalance-Class imbalance
- network attacks, Classification-Classification
- practical considerations, Practical Considerations in Classification-Choosing Thresholds and Comparing Models
- predictive models, Building a Predictive Model to Classify Network Attacks-Advanced Ensembling
- scoring of clusters, Classification
- semi-supervised learning, Semi-Supervised Learning
- supervised learning, Supervised Classification Algorithms-Neural Networks, Supervised Learning-Class imbalance
- training a classifier, Classification
- unsupervised learning, Unsupervised Learning-Unsupervised Learning
- cleverhans library, Defense Against Evasion Attacks
- click fraud, Monetizing the Consumer Web, Bot Activity
- clickjacking, Solutions: Data Quality
- cloud services, Using Cloud Services
- clustering, Clustering-Evaluating Clustering Results, Further Directions in Clustering
- abuse, Clustering Abuse-Classification
- algorithms for, Clustering Algorithms-Locality-sensitive hashing
- DBSCAN, DBSCAN-DBSCAN
- evaluating results, Evaluating Clustering Results-Evaluating Clustering Results
- for network attack classification, Unsupervised Learning-Unsupervised Learning
- generating clusters, Generating Clusters
- grouping method, Grouping, Grouping
- hierarchical, Hierarchical clustering-Hierarchical clustering
- k-d trees, k-d trees-DBSCAN
- k-means (see k-means clustering)
- locality-sensitive hashing, Hierarchical clustering-Hierarchical clustering, Locality-sensitive hashing-Locality-sensitive hashing
- metrics, More About Metrics
- scoring, Scoring Clusters-Classification
- spam domains, Example: Clustering Spam Domains-Example: Clustering Spam Domains
- code execution
- cold start, Labeling Data
- collaborative filtering, Spam Fighting: An Iterative Approach
- compiled code execution, Compiled code execution-Compiled code execution
- completeness score, Evaluating Clustering Results, Unsupervised Learning
- concept drift, Feature: Feedback Loops, A/B Testing of Models
- Conficker worm, Defining Malware Classification, Machine learning in malware classification
- configuration, tunability and, Goal: Easily Tunable and Configurable
- confusion matrix, Spam Fighting: An Iterative Approach, Supervised Learning-Supervised Learning
- consumer web abuse, Protecting the Consumer Web-Conclusion
- abuse types, Types of Abuse and the Data That Can Stop Them-Labeling and metrics
- account creation, Account Creation-Reputation scores
- authentication and account takeover, Authentication and Account Takeover-Building your classifier
- bot activity, Bot Activity-Labeling and metrics
- clustering of, Clustering Abuse-Classification
- clustering spam domains, Example: Clustering Spam Domains-Example: Clustering Spam Domains
- cold start vs. warm start for supervised learning, Labeling Data
- defined, Protecting the Consumer Web
- false positives/negatives for abuse problems, False Positives and False Negatives
- financial fraud protection, Financial Fraud-Financial Fraud
- generating clusters of abuse, Generating Clusters
- labeling data, Labeling Data
- large attacks and supervised learning, Large Attacks
- monetizing by hackers, Monetizing the Consumer Web
- multiple responses for supervised learning, Multiple Responses
- scoring clusters, Scoring Clusters-Classification
- supervised learning for detecting, Supervised Learning for Abuse Problems-Large Attacks
- contextual multi-armed bandits, Feature: Feedback Loops, A/B Testing of Models
- conventional validation, Spam Fighting: An Iterative Approach
- cost functions, Loss Functions
- covariance, Elliptic envelope fitting (covariance estimate fitting)
- covariance estimate fitting (elliptic envelope fitting), Elliptic envelope fitting (covariance estimate fitting)-Elliptic envelope fitting (covariance estimate fitting)
- credential stuffing, Bot Activity
- credit cards, Indirect Monetization, Financial Fraud
- cross-site scripting (XSS) attacks, Security Vulnerabilities in Machine Learning Algorithms, Example: Binary Classifier Evasion Attack-Example: Binary Classifier Evasion Attack
- cross-validation, Spam Fighting: An Iterative Approach, Training Data Construction
- curse of dimensionality, k-means, More About Metrics
- cyber attacks
- cycles (defined), Forecasting (Supervised Machine Learning)
D
- Dalvik, Android malware analysis
- darknet, A Marketplace for Hacking Skills
- data
- data privacy safeguards/guarantees, Feature: Data Privacy Safeguards and Guarantees
- data quality, Data Quality-Solutions: Missing Data
- data validation, Data Collection
- data-centric security, Data-Centric Security
- DataFrame, Machine Learning in Practice: A Worked Example
- datasets
- datasketch, Spam Fighting: An Iterative Approach
- DBSCAN (Density-Based Spatial Clustering of Applications with Noise), DBSCAN-DBSCAN
- DDoS (distributed denial-of-service) attacks, Cyber Threat Landscape, Active attacks
- debugging, for Android malware analysis, Debugging
- decision boundary, Model Families, Attack Technique: Model Poisoning-Defense Against Poisoning Attacks
- decision forests, Decision Forests
- decision trees, Decision Trees-Decision Trees
- deep learning, What Machine Learning Is Not, Unsupervised feature learning and deep learning-Unsupervised feature learning and deep learning
- deep neural network algorithms
- deep packet inspection (DPI), Deep packet inspection-Deep packet inspection
- defensive distillation, Defense Against Evasion Attacks
- dendrogram, Hierarchical clustering
- denial-of-service (DoS) attacks, Cyber Threat Landscape, Active attacks
- density-based anomaly detection, Density-Based Methods-Local outlier factor
- Density-Based Spatial Clustering of Applications with Noise (DBSCAN), DBSCAN-DBSCAN
- differential privacy, Feature: Data Privacy Safeguards and Guarantees
- discriminator (neural network), Attack Transferability
- distance metrics
- distillation, Defense Against Evasion Attacks
- distributed computing, horizontal scaling with, Horizontal Scaling with Distributed Computing Frameworks-Horizontal Scaling with Distributed Computing Frameworks
- distributed denial-of-service (DDoS) attacks, Cyber Threat Landscape, Active attacks
- divisive (top-down) hierarchical clustering, Hierarchical clustering
- DoS (denial-of-service) attacks, Cyber Threat Landscape, Active attacks
- dummy variables, Machine Learning in Practice: A Worked Example
- dynamic (behavioral) analysis, Real-World Uses of Machine Learning in Security, Behavioral (dynamic) analysis-Behavioral (dynamic) analysis
- dynamic instrumentation, Dynamic instrumentation
- dynamic IPs, Geolocation
F
- F-score, Choosing Thresholds and Comparing Models
- failures (graceful degradation), Goal: Graceful Degradation
- false negatives/false positives
- Fast Gradient Sign Method (FGSM), Example: Binary Classifier Evasion Attack
- feature engineering, Feature Generation-How to Get Malware Samples and Labels
- for anomaly detection, Feature Engineering for Anomaly Detection-In Summary
- for cluster scoring, Feature extraction-Feature extraction
- data collection, Data Collection-Data Collection
- deep packet inspection, Deep packet inspection-Deep packet inspection
- defined, Understanding Malware
- host intrusion detection, Host Intrusion Detection-osquery
- for network defense, From Captures to Features-From Captures to Features
- network intrusion detection, Network Intrusion Detection-Features for network intrusion detection
- unsupervised feature learning, From Captures to Features
- velocity features as account creation defense, Velocity features-Velocity features
- web application intrusion detection, Web Application Intrusion Detection-Web Application Intrusion Detection
- feature extraction
- feature generation (see feature engineering)
- feature hashing, From Features to Classification
- feature selection, Feature Selection-Feature Selection
- features
- featurized datasets, From Features to Classification-How to Get Malware Samples and Labels
- feedback loops
- FGSM (Fast Gradient Sign Method), Example: Binary Classifier Evasion Attack
- financial fraud, Financial Fraud-Financial Fraud
- first-order optimization algorithms, Optimization
- forecasting
- fuzzing, Real-World Uses of Machine Learning in Security, Behavioral (dynamic) analysis
- fuzzy hashing, Spam Fighting: An Iterative Approach
- fuzzy matching, Machine learning in malware classification
G
- games, online, Bot Activity
- Gaussian distribution, Data-Driven Methods
- GBDT (gradient-boosted decision trees), Decision Forests
- GCP (Google Cloud Platform), Using Cloud Services
- generative adversarial nets (GANs), Attack Transferability
- generator (neural network), Attack Transferability
- geolocation, Geolocation
- Gini impurity, Decision Trees
- goodness-of-fit tests
- Google Cloud ML Engine, Using Cloud Services
- Google Cloud Platform (GCP), Using Cloud Services
- gradient descent optimization algorithms, Example: Gradient descent
- gradient-boosted decision trees (GBDT), Decision Forests
- greedy training, Decision Trees
- grid search, Solutions: Hyperparameter Optimization-Solutions: Hyperparameter Optimization
- grouping (clustering method), Grouping, Grouping
- Grubbs' outlier test, Grubbs’ outlier test
H
- hacking, commoditization of, A Marketplace for Hacking Skills
- Hamming distance, More About Metrics
- hashing trick, From Features to Classification
- hashing, locality-sensitive, Hierarchical clustering-Hierarchical clustering
- heuristics, intrusion detection with, Intrusion Detection with Heuristics
- Hex-Rays IDA, Static analysis
- hierarchical botnets, How do botnets work?
- hierarchical clustering, Hierarchical clustering-Hierarchical clustering
- hinge loss, Support Vector Machines
- homogeneity score, Evaluating Clustering Results, Unsupervised Learning
- homomorphic encryption, Data-Centric Security
- honeypots, Honeypots
- host intrusion detection
- hyperparameter optimization, Problem: Hyperparameter Optimization-Solutions: Hyperparameter Optimization
I
- IDS (see intrusion detection systems)
- imbalanced classes, Class imbalance-Class imbalance
- imperfect learning, Security Vulnerabilities in Machine Learning Algorithms, Defense Against Evasion Attacks
- imputation, Missing features, Solutions: Missing Data
- incendiary speech, Cyber Threat Landscape
- incident response, anomaly detection and, Response and Mitigation
- indiscriminate attacks, Terminology
- inductive transfer, Building a Predictive Model to Classify Network Attacks
- information gain, Decision Trees
- insider threat detection, Active attacks
- integrity attacks, Terminology
- Internet Relay Chat (IRC) protocol, How do botnets work?
- Internet wiretapping, Passive attacks
- interpreted code execution, Interpreted code execution-Interpreted code execution
- intrusion detection systems (IDS)
- IP addresses
- IRC (Internet Relay Chat) protocol, How do botnets work?
- island hopping, Deep packet inspection
- isolation forests, Isolation forests-Isolation forests
K
- k-dimensional (k-d) trees, k-d trees-DBSCAN, Performance Optimization
- k-means clustering, k-means-k-means, k-means
- k-nearest neighbors (k-NN) algorithms, k-Nearest Neighbors, Density-Based Methods
- KDD Cup, Features for network intrusion detection
- (see also NSL-KDD dataset)
- kernel (defined), Data-Driven Methods
- kernel trick, Support Vector Machines
- keyloggers, Cyber Threat Landscape
- Knowledge Discovery and Data Mining Special Interest Group (SIGKDD), Features for network intrusion detection
L
- L-infinity distance, k-means
- labeling
- last hop, Geolocation
- latency, production system, Goal: Low Latency, High Scalability-Performance Optimization
- latent feature representations, Feature Selection
- lazy-learning algorithms, k-Nearest Neighbors
- least squares regression, Minimizing the Cost Function
- LIBLINEAR, Optimization
- LIEF project, From Features to Classification
- linear models, Performance Optimization
- linear regression, Logistic Regression, Minimizing the Cost Function
- Local Interpretable Model-Agnostic Explanations (LIME), Generating explanations with LIME-Generating explanations with LIME
- local outlier factor (LOF), Local outlier factor-Local outlier factor
- local substitute model, Attack Transferability
- locality-sensitive hashing (LSH), Spam Fighting: An Iterative Approach, Hierarchical clustering-Hierarchical clustering, Locality-sensitive hashing, Locality-sensitive hashing-Locality-sensitive hashing
- log odds, Logistic Regression
- login attacks, Cyber Threat Landscape
- (see also authentication)
- logistic function, Model Families
- logistic regression, Machine Learning in Practice: A Worked Example
- long short-term memory (LSTM) networks, Artificial neural networks-Artificial neural networks
- loss functions, Loss Functions
- low-value hosts, Active attacks
- LSH (see locality-sensitive hashing)
M
- MAC flooding, Passive attacks
- machine learning (generally)
- adversaries using, Adversaries Using Machine Learning
- (see also adversarial machine learning)
- AI and, What Machine Learning Is Not-What Machine Learning Is Not
- anomaly detection, Challenges of Using Machine Learning in Anomaly Detection
- basics, What Is Machine Learning?-Adversaries Using Machine Learning
- defined, What Is Machine Learning?, Machine Learning: Problems and Approaches
- limitations, Limitations of Machine Learning in Security
- malware classification, Machine learning in malware classification-Machine learning in malware classification
- network security and, Machine Learning and Network Security-How do botnets work?
- online purchase transaction data example, Machine Learning in Practice: A Worked Example-Machine Learning in Practice: A Worked Example
- problems and approaches, Machine Learning: Problems and Approaches-Machine Learning: Problems and Approaches
- production systems (see production systems)
- real-world security uses, Real-World Uses of Machine Learning in Security-Real-World Uses of Machine Learning in Security
- supervised vs. unsupervised, What Is Machine Learning?
- training algorithms to learn, Training Algorithms to Learn-Which optimization algorithm?
- MAD (median absolute deviation), Median absolute deviation
- maintainability, production system, Maintainability-Goal: Easily Tunable and Configurable
- malware (defined), Cyber Threat Landscape
- malware analysis, Malware Analysis-Conclusion
- Android, Android malware analysis-Summary
- data collection for feature generation, Data Collection-Data Collection
- definitions for malware classification, Defining Malware Classification-Machine learning in malware classification
- detection, Real-World Uses of Machine Learning in Security
- feature generation, Feature Generation-How to Get Malware Samples and Labels
- feature hashing, From Features to Classification
- feature selection, Feature Selection-Unsupervised feature learning and deep learning
- featurized dataset generation, From Features to Classification-How to Get Malware Samples and Labels
- getting malware samples/labels, How to Get Malware Samples and Labels
- machine learning in malware classification, Machine learning in malware classification-Machine learning in malware classification
- malware attack flow, Typical malware attack flow
- malware basics, Understanding Malware-Typical malware attack flow
- malware behaviors, Typical malware attack flow
- malware economy, The malware economy
- modern code execution processes, Modern code execution processes-Interpreted code execution
- Malware Classification Challenge, How to Get Malware Samples and Labels
- malware-traffic-analysis.net, How to Get Malware Samples and Labels
- man-in-the-middle attacks, Passive attacks
- Manhattan distance (L1 distance), More About Metrics
- masquerading (phishing), Cyber Threat Landscape
- maximum likelihood estimate, Naive Bayes
- maximum-margin hyperplane, Support Vector Machines
- MaxMind, Geolocation
- MCD (Minimum Covariance Determinant), Elliptic envelope fitting (covariance estimate fitting)
- median absolute deviation (MAD), Median absolute deviation
- metamorphic malware, Defining Malware Classification
- metrics pollution, Labeling and metrics
- metrics, clustering, Clustering Algorithms, More About Metrics
- MFA (multifactor authentication), Access Control and Authentication
- microsegmentation, Detecting In-Network Attackers
- MinHash, Locality-sensitive hashing
- Minimum Covariance Determinant (MCD), Elliptic envelope fitting (covariance estimate fitting)
- misclassification, attack transferability and, Attack Transferability
- missing data, Problem: Missing Data-Solutions: Missing Data
- missing features, Missing features
- MLP (multilayer perceptron), Example: Binary Classifier Poisoning Attack
- model families, Model Families-Model Families
- model poisoning attacks (see poisoning attacks)
- model rot, Feature: Feedback Loops, A/B Testing of Models
- modeling error, Security Vulnerabilities in Machine Learning Algorithms
- models
- A/B testing, Feature: Feedback Loops, A/B Testing of Models
- checkpointing/versioning/deployment, Problem: Checkpointing, Versioning, and Deploying Models
- comparing, Choosing Thresholds and Comparing Models-Choosing Thresholds and Comparing Models
- defined, Training Algorithms to Learn
- feedback loops, Feature: Feedback Loops, A/B Testing of Models-Feature: Feedback Loops, A/B Testing of Models
- for production systems, Model Quality-Generating explanations with LIME
- hyperparameter optimization, Problem: Hyperparameter Optimization-Solutions: Hyperparameter Optimization
- repeatable and explainable results, Feature: Repeatable and Explainable Results-Generating explanations with LIME
- momentum, Optimization
- monitoring, production system, Monitoring and Alerting-Monitoring and Alerting
- moving average, Statistical Metrics
- multi-armed bandit problem, Feature: Feedback Loops, A/B Testing of Models
- multifactor authentication (MFA), Access Control and Authentication
- multilayer perceptron (MLP), Example: Binary Classifier Poisoning Attack
- multileader botnets, How do botnets work?
N
- n-grams, Grouping-Locality-sensitive hashing, Feature extraction-Feature extraction
- Naive Bayes classifiers, Naive Bayes-Naive Bayes
- Natural Language Toolkit (NLTK), Spam Fighting: An Iterative Approach
- negative log likelihood, Loss Functions
- network breaches, Active attacks
- network traffic analysis, Network Traffic Analysis-Conclusion
- access control and authentication, Access Control and Authentication
- active attacks, Active attacks-Active attacks
- advanced ensembling, Advanced Ensembling-Advanced Ensembling
- and class imbalance, Class imbalance-Class imbalance
- attack classification, Classification-Classification
- botnets, Botnets and You-How do botnets work?
- capturing live network data for feature generation, From Captures to Features-From Captures to Features
- data exploration/preparation, Exploring the Data-Exploring the Data
- data-centric security, Data-Centric Security
- deep packet inspection, Deep packet inspection-Deep packet inspection
- detecting in-network attackers, Detecting In-Network Attackers
- features for, Features for network intrusion detection
- honeypots, Honeypots
- intrusion detection, Network Intrusion Detection-Features for network intrusion detection, Intrusion Detection
- machine learning and network security, Machine Learning and Network Security-How do botnets work?
- network defense theory, Theory of Network Defense-Summary
- OSI model, Network Traffic Analysis
- outlier detection, Real-World Uses of Machine Learning in Security
- passive attacks, Passive attacks
- physical layer attacks, Passive attacks
- predictive model to classify attacks, Building a Predictive Model to Classify Network Attacks-Advanced Ensembling
- semi-supervised learning for, Semi-Supervised Learning
- supervised learning for network attack classification, Supervised Learning-Class imbalance
- threats in the network, Threats in the Network-Active attacks
- unsupervised feature learning, From Captures to Features
- unsupervised learning, Unsupervised Learning-Unsupervised Learning
- neural networks, What Machine Learning Is Not, Neural Networks, Artificial neural networks-Artificial neural networks, Unsupervised feature learning and deep learning-Unsupervised feature learning and deep learning, Performance Optimization
- NLTK (Natural Language Toolkit), Spam Fighting: An Iterative Approach
- normalization of data, Data Preparation
- novelty detection, Anomaly Detection with Data and Algorithms
- NSL-KDD dataset, Building a Predictive Model to Classify Network Attacks, Exploring the Data
- NumPy, linear algebra frameworks with, Performance Optimization
O
- objective functions, Loss Functions, Security Vulnerabilities in Machine Learning Algorithms
- (see also loss functions)
- observer bias (observer-expectancy effect), Problem: Bias in Datasets
- one-class Support Vector Machines, One-class support vector machines-One-class support vector machines
- one-hot encoding, Machine Learning in Practice: A Worked Example
- one-versus-all strategy, Classification
- one-versus-one strategy, Classification
- online gaming, Bot Activity
- online learning
- open source intelligence (OSINT), Integrating Open Source Intelligence-Geolocation
- Open Systems Interconnection (OSI) model, Network Traffic Analysis
- optimization algorithms, Optimization-Which optimization algorithm?
- optimized linear algebra frameworks, Performance Optimization
- OSI (Open Systems Interconnection) model, Network Traffic Analysis
- OSINT (see open source intelligence)
- osquery, osquery-osquery
- out-of-time validation, Training Data Construction
- outlier detection, Real-World Uses of Machine Learning in Security, Anomaly Detection, Anomaly Detection with Data and Algorithms
- overfitting, Limitations of Machine Learning in Security, Decision Trees, Overfitting and Underfitting-Overfitting and Underfitting
- oversampling, Class imbalance
P
- packers, Static analysis
- packet sniffing, Intrusion Detection
- parallelization, Performance and scalability in real-time streaming applications, Horizontal Scaling with Distributed Computing Frameworks
- passive attacks, Passive attacks
- passwords, Authentication and Account Takeover, Features used to classify login attempts
- pattern mining, Machine Learning and Network Security
- pattern recognition, Real-World Uses of Machine Learning in Security
- pay-per-install (PPI) marketplace, Indirect Monetization
- PCA (Principal Component Analysis), Feature Selection, Unsupervised Learning
- perceptron, Neural Networks
- performance optimization, Performance-Using Cloud Services
- phishing, Cyber Threat Landscape
- physical layer attacks, Passive attacks
- pivoting, Deep packet inspection, Active attacks
- poisoning attacks, Feature: Feedback Loops, A/B Testing of Models, Attack Technique: Model Poisoning-Defense Against Poisoning Attacks
- polymorphic malware, Defining Malware Classification
- polynomial factor differencing, ARIMA
- population, Problem: Bias in Datasets
- port scanning, Passive attacks
- PPI (pay-per-install) marketplace, Indirect Monetization
- Principal Component Analysis (PCA), Feature Selection, Unsupervised Learning
- privacy-preserving machine learning, Feature: Data Privacy Safeguards and Guarantees
- production systems, Production Systems-Conclusion
- A/B testing of models, Feature: Feedback Loops, A/B Testing of Models
- cloud services for, Using Cloud Services
- configuration and tunability, Goal: Easily Tunable and Configurable
- data quality, Data Quality-Solutions: Missing Data
- feedback, Feedback and Usability
- feedback loops, Feature: Feedback Loops, A/B Testing of Models-Feature: Feedback Loops, A/B Testing of Models
- graceful degradation, Goal: Graceful Degradation
- horizontal scaling with distributed computing frameworks, Horizontal Scaling with Distributed Computing Frameworks-Horizontal Scaling with Distributed Computing Frameworks
- hyperparameter optimization, Problem: Hyperparameter Optimization-Solutions: Hyperparameter Optimization
- latency optimization, Goal: Low Latency, High Scalability-Performance Optimization
- maintainability, Maintainability-Goal: Easily Tunable and Configurable
- mature/scalable system characteristics, Defining Machine Learning System Maturity and Scalability-What’s Important for Security Machine Learning Systems?
- model checkpointing/versioning/deployment, Problem: Checkpointing, Versioning, and Deploying Models
- model quality, Model Quality-Generating explanations with LIME
- monitoring and alerting, Monitoring and Alerting-Monitoring and Alerting
- performance optimization, Performance-Using Cloud Services
- performance requirements, What’s Important for Security Machine Learning Systems?
- repeatable/explainable results, Feature: Repeatable and Explainable Results-Generating explanations with LIME
- scalability, Goal: Low Latency, High Scalability-Performance Optimization
- security and reliability, Security and Reliability-Feature: Data Privacy Safeguards and Guarantees
- usability, Feedback and Usability
- profilers, Performance Optimization
- protectors, Static analysis
- proxies, Geolocation
- Python, interpreted code execution example, Interpreted code execution-Interpreted code execution
R
- Radare2, Static analysis-Static analysis
- radial basis function, Support Vector Machines
- random forests, Decision Forests
- randomized P2P botnets, How do botnets work?
- ranking fraud, Bot Activity
- ransomware, Cyber Threat Landscape, The malware economy
- receiver operating characteristic (ROC), Choosing Thresholds and Comparing Models
- recommendation systems, Challenges of Using Machine Learning in Anomaly Detection
- recursive feature elimination, Feature Selection
- red herring attacks, Feature: Feedback Loops, A/B Testing of Models
- (see also poisoning attacks)
- regression, What Is Machine Learning?, Machine Learning: Problems and Approaches
- regular data, Anomaly Detection
- regularization, Overfitting and Underfitting
- reinforcement learning (RL), Feature: Feedback Loops, A/B Testing of Models
- reliability, production system, Security and Reliability-Feature: Data Privacy Safeguards and Guarantees
- repeatability of machine learning predictions, Feature: Repeatable and Explainable Results
- reputation scores, Reputation scores-Reputation scores
- reputation systems, Features used to classify login attempts
- residual, Minimizing the Cost Function
- reverse engineering, Understanding Malware
- review fraud, Monetizing the Consumer Web
- RL (reinforcement learning), Feature: Feedback Loops, A/B Testing of Models
- ROC (receiver operating characteristic), Choosing Thresholds and Comparing Models
- rolling counter, Velocity features
- rootkit, Cyber Threat Landscape
S
- scalability
- scanning attacks, Cyber Threat Landscape
- scikit-learn
- and semi-supervised learning, Semi-Supervised Learning
- and spark-sklearn, Horizontal Scaling with Distributed Computing Frameworks
- feature hashing, From Features to Classification
- feature selection, Feature Selection
- hyperparameter optimization, Problem: Hyperparameter Optimization-Solutions: Hyperparameter Optimization
- imputing missing values, Solutions: Missing Data
- incorrect handling of categorical variables, Decision Trees
- linear algebra frameworks, Performance Optimization
- LOF generation, Local outlier factor
- machine learning algorithm cheat-sheet, Classification
- native linear algebra frameworks, Performance Optimization
- normalization for cross-validation, Data Preparation
- univariate analysis methods, Feature Selection
- scoring of clusters, Scoring Clusters-Classification
- scraping, Bot Activity, Labeling and metrics
- seasonality, Data-Driven Methods
- seasons (defined), Forecasting (Supervised Machine Learning)
- second factor authentication, Authentication and Account Takeover
- second-order optimization algorithms, Optimization
- security information and event management (SIEM), Response and Mitigation
- security intelligence feeds, Security Intelligence Feeds
- security, production system, Security and Reliability-Feature: Data Privacy Safeguards and Guarantees
- selection bias, Problem: Bias in Datasets
- semantic gap, Challenges of Using Machine Learning in Anomaly Detection
- semi-supervised learning
- sentinel values, Problem: Missing Data
- SGD (Stochastic Gradient Descent), Optimization
- shingling, Locality-sensitive hashing
- SIEM (security information and event management), Response and Mitigation
- SIGKDD (Knowledge Discovery and Data Mining Special Interest Group), Features for network intrusion detection
- sigmoid function, Model Families
- Silhouette coefficient, Evaluating Clustering Results
- Singular Value Decomposition (SVD), Feature Selection
- smoothing, Naive Bayes
- sniffing, Cyber Threat Landscape
- Snort, Network Intrusion Detection
- social engineering, Cyber Threat Landscape
- software profiling, Performance Optimization
- software reverse engineering, Understanding Malware
- source code, binaries and, Understanding Malware
- spam
- spam domains, Example: Clustering Spam Domains-Example: Clustering Spam Domains
- Spamhaus Project, Security Intelligence Feeds
- Spark ML, Horizontal Scaling with Distributed Computing Frameworks-Horizontal Scaling with Distributed Computing Frameworks
- spark-sklearn, Horizontal Scaling with Distributed Computing Frameworks
- spear phishing, Cyber Threat Landscape
- SPI (stateful packet inspection), Network Intrusion Detection
- split testing, Feature: Feedback Loops, A/B Testing of Models
- spoofing, Active attacks
- spyware, Cyber Threat Landscape, Indirect Monetization
- stacked generalization, Spam Fighting: An Iterative Approach
- standardization of data series, Data Preparation
- star/centralized botnets, How do botnets work?
- stateful packet inspection (SPI), Network Intrusion Detection
- static analysis, Static analysis-Static analysis
- statistical analysis, What Machine Learning Is Not
- statistical tests
- stealth banning, Response and Mitigation
- stemming, Spam Fighting: An Iterative Approach
- Stochastic Gradient Descent (SGD), Optimization
- stop words, Spam Fighting: An Iterative Approach
- structural analysis, Structural analysis-Structural analysis
- Stuxnet worm, Malware Analysis
- supervised learning, What Is Machine Learning?
- anomaly detection vs., When to Use Anomaly Detection Versus Supervised Learning
- cold start vs. warm start, Labeling Data
- consumer web abuse detection, Supervised Learning for Abuse Problems-Large Attacks
- defined, Machine Learning: Problems and Approaches
- false positives/negatives, False Positives and False Negatives
- forecasting, Forecasting (Supervised Machine Learning)-Summary
- labeling data, Labeling Data
- large attacks on consumer web, Large Attacks
- multiple responses for consumer web abuses, Multiple Responses
- network attack classification, Supervised Learning-Class imbalance
- supervised learning algorithms, Supervised Classification Algorithms-Neural Networks
- choosing thresholds and comparing models, Choosing Thresholds and Comparing Models-Choosing Thresholds and Comparing Models
- decision trees, Decision Trees-Decision Trees
- feature selection, Feature Selection-Feature Selection
- k-nearest neighbors, k-Nearest Neighbors
- logistic regression, Logistic Regression-Logistic Regression, Size of Logistic Regression Models
- (see also logistic regression)
- model family selection, Selecting a Model Family
- Naive Bayes classifiers, Naive Bayes-Naive Bayes
- neural networks, Neural Networks
- overfitting/underfitting, Overfitting and Underfitting-Overfitting and Underfitting
- practical considerations, Practical Considerations in Classification-Choosing Thresholds and Comparing Models
- support vector machines, Support Vector Machines-Support Vector Machines
- training data construction, Training Data Construction-Attacker evolution
- support vector machines (SVMs), Support Vector Machines-Support Vector Machines
- support vectors, Support Vector Machines
- SVD (Singular Value Decomposition), Feature Selection
- SYN flooding, Active attacks
T
- targeted attacks, Terminology
- tcpdump, From Captures to Features
- Thompson sampling, Feature: Feedback Loops, A/B Testing of Models
- threat intelligence feeds, Security Intelligence Feeds
- Threat Intelligence Quotient Test, Security Intelligence Feeds
- threat mitigation, anomaly detection and, Response and Mitigation
- thresholds, choosing, Choosing Thresholds and Comparing Models-Choosing Thresholds and Comparing Models
- time series
- time series analysis, Machine Learning: Problems and Approaches
- TLS (Transport Layer Security), From Captures to Features
- trained models, Model Quality-Generating explanations with LIME
- training data construction
- transfer learning, Building a Predictive Model to Classify Network Attacks
- Transport Layer Security (TLS), From Captures to Features
- TREC Public Spam Corpus, Spam Fighting: An Iterative Approach
- tree-based models, Performance Optimization
- (see also decision trees)
- trends (defined), Forecasting (Supervised Machine Learning)
- Trojan, Cyber Threat Landscape
- tunability, Goal: Easily Tunable and Configurable
U
- unbalanced data, Unbalanced data
- underfitting, Overfitting and Underfitting
- undersampling, Class imbalance
- univariate analysis, Feature Selection
- unsupervised feature learning, Neural Networks, Unsupervised feature learning and deep learning-Unsupervised feature learning and deep learning, From Captures to Features
- unsupervised learning
- algorithms, Unsupervised Machine Learning Algorithms-Isolation forests, Attack Transferability
- defined, What Is Machine Learning?, Machine Learning: Problems and Approaches
- generative adversarial nets, Attack Transferability
- isolation forests, Isolation forests-Isolation forests
- network attack classification, Unsupervised Learning-Unsupervised Learning
- one-class support vector machines, One-class support vector machines-One-class support vector machines
- unsupervised feature learning vs., Neural Networks, From Captures to Features
- user authentication (see authentication)
V
- V-measure, Evaluating Clustering Results, Unsupervised Learning
- validation, Data Collection
- variance reduction, Decision Trees
- velocity features, Velocity features-Velocity features
- versioning, Problem: Checkpointing, Versioning, and Deploying Models
- virus (defined), Cyber Threat Landscape
- VirusShare.com, How to Get Malware Samples and Labels
- VirusTotal, How to Get Malware Samples and Labels
- VX Heaven, How to Get Malware Samples and Labels