Hands-On Machine Learning for Cybersecurity

We start by importing the relevant packages. The pandas package provides the data frame used to load the dataset. From sklearn we use TfidfVectorizer to turn the message text into numeric features, train_test_split to divide the data into training and testing datasets, and LogisticRegression as the classifier:

import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, cross_val_score

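We load the SMSSpamCollectionDataSet file using pandas. The file is tab-delimited and has no header row, so the columns are numbered: column 0 holds the label and column 1 holds the message text. We then split the messages and labels into training and testing datasets, as follows: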

dataframe = pd.read_csv('SMSSpamCollectionDataSet', delimiter='\t',header=None)

X_train_dataset, X_test_dataset, y_train_dataset, y_test_dataset = train_test_split(dataframe[1],dataframe[0])
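
The loading and splitting step can be tried without the dataset file. The following is a minimal sketch, assuming a handful of made-up rows in the same label-tab-message layout; the sample text, the column names, and the test_size, stratify, and random_state arguments are illustrative assumptions and are not part of the original listing:

```python
import io

import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical sample rows (not taken from the real dataset), one
# "label<TAB>message" pair per line, mirroring the file layout.
sample = (
    "ham\tAre we still meeting for lunch today?\n"
    "spam\tWINNER! Claim your free prize now\n"
    "ham\tCall me when you get home\n"
    "spam\tURGENT! You have won a cash award\n"
    "ham\tSee you at the gym tonight\n"
    "spam\tFree entry in a weekly prize draw\n"
    "ham\tCan you pick up some milk?\n"
    "spam\tCongratulations, you were selected for a free phone\n"
)

# header=None because the data has no header row; names= is an assumption
# added here only to make the columns easier to read.
df = pd.read_csv(
    io.StringIO(sample), delimiter="\t", header=None, names=["label", "message"]
)

# stratify keeps the ham/spam ratio the same in both splits, and
# random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    df["message"], df["label"], test_size=0.25, stratify=df["label"], random_state=42
)

print(df["label"].value_counts())
print(len(X_train), len(X_test))
```

Stratifying is a design choice worth considering for spam data, where one class is often much rarer than the other and an unlucky random split could leave very few spam messages in the test set.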

The training messages are converted into TF-IDF feature vectors, and the logistic regression model is fitted on them:

vectorizer = TfidfVectorizer()
X_train_dataset = vectorizer.fit_transform(X_train_dataset)
classifier_log = LogisticRegression()
classifier_log.fit(X_train_dataset, y_train_dataset)
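
To see what the vectorizer produces, the following minimal sketch fits TfidfVectorizer on three made-up messages (the docs list is hypothetical) and inspects the resulting matrix. It relies on the default settings, under which each row is scaled to unit length:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus, used only to illustrate the transformation.
docs = ["free prize now", "call me now", "free free prize"]

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(docs)  # sparse matrix: one row per message

# One column per distinct word learned from the corpus.
vocabulary = sorted(vectorizer.vocabulary_)
print(vocabulary)
print(matrix.shape)

# With the default norm='l2', every row has Euclidean length 1.
row_norms = np.sqrt(matrix.multiply(matrix).sum(axis=1))
print(row_norms)

# Words never seen during fitting are ignored by transform().
unseen = vectorizer.transform(["hello world"])
print(unseen.nnz)
```

This is why the vectorizer is fitted on the training messages only and then reused with transform() for any new text: the columns of the matrix are fixed by the vocabulary learned during fitting.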

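The trained model is then used to classify two sample messages. The fitted vectorizer transforms them using the vocabulary learned from the training data, and the classifier predicts a label for each. Measuring the accuracy of the model would instead require scoring it on the held-out testing dataset: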

X_test_dataset = vectorizer.transform(['URGENT! Your Mobile No 1234 was awarded a Prize', 'Hey honey, whats up?'])

predictions_logistic = classifier_log.predict(X_test_dataset)
print(predictions_logistic)
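
Accuracy on held-out data can be estimated in a couple of ways. The following is a minimal sketch, assuming a small made-up set of messages (the messages and labels lists are hypothetical) and using a Pipeline so that the vectorizer is refitted inside each cross-validation fold. The pipeline, the split parameters, and cv=2 are illustrative choices, not the original listing's method:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical toy data; the real dataset is far larger.
messages = [
    "WINNER! Claim your free prize now",
    "URGENT! You have won a cash award",
    "Free entry in a weekly prize draw",
    "Congratulations, you were selected for a free phone",
    "Claim your free cash prize today",
    "You have won a free holiday, call now",
    "Are we still meeting for lunch today?",
    "Call me when you get home",
    "See you at the gym tonight",
    "Can you pick up some milk?",
    "I will be home late tonight",
    "Thanks for lunch, see you soon",
]
labels = ["spam"] * 6 + ["ham"] * 6

X_train, X_test, y_train, y_test = train_test_split(
    messages, labels, test_size=0.25, stratify=labels, random_state=0
)

# The pipeline vectorizes the text and then fits the classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(X_train, y_train)

# score() reports the fraction of held-out messages labelled correctly.
accuracy = model.score(X_test, y_test)
print("Held-out accuracy:", accuracy)

# Cross-validation repeats the fit/score cycle on different splits.
scores = cross_val_score(model, messages, labels, cv=2, scoring="accuracy")
print("Cross-validation accuracy per fold:", scores)
```

With so few messages the numbers are not meaningful; the sketch only shows where the held-out split and cross_val_score fit into the workflow.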
