Rebecca Vickery
1 min readNov 1, 2019

--

Hi @miguelcouto,

The code block containing the msk variable is a way of randomly splitting the data into test and train sets. First I am using numpy to randomly select 80% of the data and assigning that to the msk variable. I then assign this to train_w to create the training set. I then specify that test_w should contain all data that isn’t in msk (~msk). Hope that helps.

--

--

Rebecca Vickery
Rebecca Vickery

Written by Rebecca Vickery

Data Scientist | Writer | Speaker

Responses (1)