When coding a machine learning model, certain steps must be performed repeatedly. A pipeline streamlines these routine processes by encapsulating them as small pieces of logic, which saves us from writing the same code over and over.
A pipeline also helps prevent and identify data leakage. It performs the following tasks:
- Fit
- Transform
- Predict
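A minimal sketch of these three operations, using an illustrative scaler-plus-classifier pipeline and toy data (the specific steps, StandardScaler and LogisticRegression, are example choices, not prescribed by the text):

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Toy training data for illustration only
X_train = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 1.0]]
y_train = [0, 0, 1, 1]

pipe = Pipeline([
    ("scale", StandardScaler()),    # transform step
    ("clf", LogisticRegression()),  # final estimator
])

# Fit: fits the scaler on the training data, then the classifier
pipe.fit(X_train, y_train)

# Predict: new samples are scaled with the statistics learned
# from the training data before being classified
preds = pipe.predict([[2.5, 1.5]])
print(preds)
```

Because the scaler is fitted only inside `pipe.fit`, the test data never influences the preprocessing statistics, which is how the pipeline guards against leakage.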
The pipeline exposes methods such as fit, transform, and fit_transform to apply the same preprocessing to the training and test data. If we end up creating multiple pipelines to generate features in our Python code, we can combine them with FeatureUnion, which applies each of them to the same input and concatenates their outputs side by side. Thus, a pipeline lets us chain the transformations and finish with an estimator:
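A short sketch of combining feature generators with scikit-learn's FeatureUnion (the StandardScaler and PCA members are illustrative assumptions, not steps named in the text):

```python
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Toy data: 4 samples with 3 features each
X = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 2.0, 0.0], [3.0, 1.0, 1.0]]

union = FeatureUnion([
    ("scaled", StandardScaler()),  # outputs 3 scaled columns
    ("pca", PCA(n_components=2)),  # outputs 2 principal components
])

# Both transformers see the same input; their outputs are concatenated,
# so each row ends up with 3 + 2 = 5 feature columns
features = union.fit_transform(X)
print(features.shape)
```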
from sklearn.naive_bayes import MultinomialNB