Algorithm | An algorithm is the piece of code that computes predictions from input data. |
Class, classification | A class is a category that the model predicts. In the context of a prototype for a vacuum cleaner, a class could be ‘Vacuuming with forward motion’, so a class can be seen as a user action that is, or needs to be, predicted. Within a dataset, the class ‘Vacuuming with forward motion’ is based on data, collected from a motion sensor for example, that corresponds to this action. Classification is the task of predicting the class of a piece of data. |
Data point | A single element in the collection of feature values within a dataset. |
Dataset | The entire collection of data points, organized in rows and columns, with one column per feature. |
Feature | Each feature, or column in a dataset, represents a measurable piece of data that can be used for analysis, for example: Name, Age, Sex, Fare, and so on. Depending on what datasets are used, the features included in a dataset can vary widely. |
Iteration | An iteration is a single pass through the loop in which the model is checked for uncertainty about a label prediction. If the model is uncertain, it asks a human to verify its class prediction. After this, the loop starts over and training continues. |
Labeling | Labeling is the act of assigning data values to a class. |
Model | A machine learning model is a general term for algorithms that help predict or analyze data values. Depending on the model used, the accuracy of the predictions can vary considerably for a single dataset. There are many types of models, and each performs better on some types of data than on others. Examples of models are: K-Nearest Neighbours classifier, Support Vector Machine, Deep Learning and Decision Tree classifier. |
Model output | The output is the information that a machine learning model provides after it has been run, for example a set of predicted labels. |
Preprocessing | Data preprocessing is the technique of preparing raw data, such as data that comes directly from a sensor, by cleaning and organizing it so that it is suitable for training machine learning models. This can improve the performance of the models. |
Sensor data | Sensor data is the data measured by a sensor (for example an accelerometer or gyroscope) that was used on a product prototype during user testing for user behavior evaluation. This sensor data is then analyzed using our product. |
Running a cell | Running a cell means running the code in a code cell in the notebook. This can be done by clicking the code cell and then clicking the triangle ‘play’ button within the software. A code cell is also referred to as a code block. |
Testing and training | Training means teaching an algorithm that a certain sequence of data is linked to a specific label. This is done by feeding the model a dataset that contains labels (training data), so that the model can predict the label of similar-looking unlabeled data (test data). After training, the trained model is given a dataset without labels and must predict those labels, to test whether the model’s performance is sufficient. This is called testing. |
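
To make the terms above more concrete, the following is a minimal Python sketch of training, testing, and a single iteration with human verification. It is not the code used in the notebook: the class labels "forward" and "backward", the two-feature data points, the function names, the `ask_human` callback, and the 0.75 confidence threshold are all assumptions chosen for illustration. A small K-Nearest Neighbours classifier stands in for the model, and the share of neighbour votes stands in for the model's certainty.

```python
import math
from collections import Counter

# Hypothetical training data: (feature values, class label).
# The two features stand in for preprocessed sensor readings.
TRAINING_DATA = [
    ((0.9, 0.1), "forward"),
    ((1.0, 0.2), "forward"),
    ((0.8, 0.0), "forward"),
    ((0.1, 0.9), "backward"),
    ((0.0, 1.0), "backward"),
    ((0.2, 0.8), "backward"),
]


def predict(training_data, point, k=3):
    """Return (predicted label, confidence) from the k nearest labeled points."""
    nearest = sorted(training_data, key=lambda row: math.dist(row[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    label, count = votes.most_common(1)[0]
    return label, count / k  # confidence = share of neighbours that agree


def run_iteration(training_data, point, ask_human, threshold=0.75):
    """One iteration: predict, and ask for human verification if uncertain."""
    label, confidence = predict(training_data, point)
    asked = confidence < threshold
    if asked:
        # The human confirms or corrects the predicted class.
        label = ask_human(point, label)
        # For K-Nearest Neighbours, continuing training amounts to
        # storing the newly labeled data point.
        training_data = training_data + [(point, label)]
    return training_data, label, asked


def accuracy(training_data, test_data):
    """Testing: share of test points whose predicted label is correct."""
    correct = sum(predict(training_data, p)[0] == label for p, label in test_data)
    return correct / len(test_data)
```

In this sketch the true labels of the test data are kept aside and only used to score the predictions; the model itself never sees them. A point such as `(0.5, 0.5)` lies between the two groups, so the neighbours disagree and `run_iteration` calls `ask_human`, whereas a point close to one group is labeled without asking.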