We will gather data from several sources to build a dataset of approximately 1,000 URLs. Each URL is prelabelled with one of three classes: benign, spam, or malicious. The following screenshot is a snippet from our URL dataset:
