We will use a Kaggle dataset in the following example. The data resembles what a mail server would collect. One clever way to gather spam email is to collect the mail arriving at mail servers that have been shut down. Because the email accounts on such servers no longer exist, any email sent to them can be assumed to be spam.
The following screenshot shows a snippet of actual Kaggle data, taken from https://www.kaggle.com/uciml/sms-spam-collection-dataset:

We have modified the data to add labels (0 is ham and 1 is spam), as follows:
| Spam/Ham | Message | Label |
|---|---|---|
| Ham | Your electricity bill is | 0 |
| Ham | Mom, see you this friday at 6 | 0 |
| Spam | Win free iPhone | 1 |
| Spam | 60% off on Rolex watches | 1 |
| Ham | Your order #RELPG4513 | 0 |
| Ham | OCT timesheet | 0 |
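The labeling step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the exact procedure used to prepare the data: the column names `category` and `message`, the `LABELS` mapping, and the `add_labels` helper are all assumptions made for this example, and the rows are the sample rows from the table rather than the full Kaggle file.

```python
import csv
import io

# Hypothetical in-memory sample mirroring the table above. A real run would
# read the downloaded Kaggle CSV file instead; the column names used here
# ("category", "message") are assumptions for this sketch.
SAMPLE_CSV = """category,message
Ham,Your electricity bill is
Ham,"Mom, see you this friday at 6"
Spam,Win free iPhone
Spam,60% off on Rolex watches
Ham,Your order #RELPG4513
Ham,OCT timesheet
"""

# Numeric labels as defined in the text: 0 is ham and 1 is spam.
LABELS = {"ham": 0, "spam": 1}


def add_labels(rows):
    """Return a list of (category, message, label) tuples."""
    labeled = []
    for row in rows:
        category = row["category"].strip()
        # Look up the label case-insensitively so "Ham" and "ham" both work.
        labeled.append((category, row["message"], LABELS[category.lower()]))
    return labeled


rows = csv.DictReader(io.StringIO(SAMPLE_CSV))
labeled = add_labels(rows)

for category, message, label in labeled:
    print(f"{label}  {category:<4}  {message}")
```

Keeping the mapping in a single dictionary means an unexpected category raises a `KeyError` instead of silently receiving a wrong label, which is usually preferable when preparing training data.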