USB-IDS-1 consists of 17 (compressed) csv files providing ready-to-use labeled network flows. Each of 16 files corresponds to one combination of a DoS tool and a defense module. In addition, a REGULAR file containing normal network traffic (no attack) is provided. The network flows were extracted with CICFlowMeter, applied to the pcap files captured during the experiments. The names of the 16 attack csv files identify the collection scenario. For example, Hulk-NoDefense.csv provides the flows obtained by executing Hulk with no defense in place; similarly, Slowloris-Reqtimeout.csv provides the flows obtained during the experiment where Slowloris is launched against the server hardened with Reqtimeout.
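The naming scheme can be decoded programmatically. The following is a minimal sketch, assuming that every attack file is named Tool-Defense.csv (optionally followed by a compression suffix) and that the normal-traffic file has no hyphen in its name; the Scenario type and the parse_scenario function are hypothetical helpers, not part of the dataset.

```python
from pathlib import Path
from typing import NamedTuple, Optional


class Scenario(NamedTuple):
    """Collection scenario encoded in a file name (hypothetical helper type)."""
    tool: Optional[str]
    defense: Optional[str]


def parse_scenario(filename: str) -> Scenario:
    """Split a name such as 'Hulk-NoDefense.csv' into DoS tool and defense module.

    Assumption: a name without a hyphen (e.g. the REGULAR file) carries no
    attack scenario, so both fields are returned as None.
    """
    # Keep only the part before the first dot, dropping '.csv' and any
    # compression suffix that may follow it.
    stem = Path(filename).name.split(".", 1)[0]
    tool, separator, defense = stem.partition("-")
    if not separator:
        return Scenario(None, None)
    return Scenario(tool, defense)


hulk = parse_scenario("Hulk-NoDefense.csv")
slowloris = parse_scenario("Slowloris-Reqtimeout.csv")
regular = parse_scenario("REGULAR.csv")  # exact file name is an assumption
```

Here hulk.tool is "Hulk" and hulk.defense is "NoDefense", which can be used to attach scenario metadata to the flows loaded from each file.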
The dataset is also provided in an arrangement immediately suitable for machine learning purposes.

UPDATE (12/1/2022): all datasets have been regenerated using the fixed CICFlowMeter tool available here.

Additional arrangement of USB-IDS-1 for machine learning

The above-mentioned files are further arranged into 3 (compressed) csv files, i.e., training, validation and test, intended for users who want to apply machine learning techniques. We adopt a stratified sampling strategy without replacement: the ratio of benign and attack classes in the original 17 files is preserved in the output splits, and each record of the original files is assigned to exactly one split. The files account for 70% (training), 15% (validation) and 15% (test) of the total dataset. The ordering of the records is randomized to avoid any potential bias.
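The splitting strategy described above can be illustrated with a short sketch. This is not the script used to build the published files; it is a minimal standard-library example, assuming one class label per record. The stratified_split function, the seed, and the "BENIGN"/"ATTACK" label values are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return (train, validation, test) lists of record indices.

    Each class is split separately, so class ratios are preserved, and
    every index lands in exactly one split (sampling without replacement).
    """
    rng = random.Random(seed)

    # Group record indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, validation, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        first_cut = round(len(indices) * fractions[0])
        second_cut = round(len(indices) * (fractions[0] + fractions[1]))
        train.extend(indices[:first_cut])
        validation.extend(indices[first_cut:second_cut])
        test.extend(indices[second_cut:])

    # Randomize record order inside each split to avoid ordering bias.
    for split in (train, validation, test):
        rng.shuffle(split)
    return train, validation, test


# Toy label column: 20% benign, 80% attack (illustrative values only).
labels = ["BENIGN"] * 200 + ["ATTACK"] * 800
train, validation, test = stratified_split(labels)
```

With this toy input the three splits hold 700, 150 and 150 indices, and each keeps the 20/80 class ratio of the input.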
