IMDB Project Walkthrough
The IMDB project, included in the Free Trial, uses the IMDB Dataset (Large Movie Reviews) with a simple sentiment classification model.
The IMDB dataset contains 50K movie reviews for natural language processing and is used for binary sentiment classification. 25,000 highly polar movie reviews are provided for training and 25,000 for testing.
In the following steps, you will train the model, see analytics, and perform basic analyses. For a more in-depth guide to the IMDB use-case, see the full IMDB Guide.
To open the project, in the Welcome screen, go to Projects and click
The Network tab displays the model's nodes and connections in a simple convolutional neural network (CNN) model:
You can zoom in and out using the scroll wheel, and pan by dragging the background. To see the details of a node, click on it.
The orange node on the left represents the IMDB dataset (the script can be viewed in Resources Management).
The light blue nodes seen in the center of the model represent the model's layers, and the colored nodes at the end of the model represent the Loss and Optimizer.
The dark blue nodes represent Tensorleap Visualizer nodes. These nodes extract visualizations from different outputs.
The pre-saved projects provided are not yet trained. They must be trained before presenting analytics and analyses.
To train the model for 1 epoch, on the top bar, click
. The Training Model will appear, at the bottom, click
Once training has initiated, a PENDING notification will appear, indicating that the training process is initializing. This could take a minute or so. Once the training begins you will see a
Training will take between 20 minutes and 3 hours, depending on your machine. To track the Training status, click
To display the model's analytics on the dashboard, on the top left of the dashboard, click
to open the Versions view. Expand the version and make sure that the current model is selected:
- Top left - Loss (error) vs Batch. See how the loss decreases as training progresses
- Top right - Samples ordered according to loss from high to low
- Bottom left - User Score vs Loss - good performance (low average loss) on the edges of the 1 to 10 scale and poor performance in the middle of the scale (high average loss)
- Bottom right - Accuracy vs Batch. See how the accuracy of the model improves with training
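To make the "Samples ordered according to loss" and "User Score vs Loss" panels more concrete, here is a minimal sketch of how such values could be computed outside the platform. It assumes a binary cross-entropy loss, which may not be the loss configured in the project, and the sample records are hypothetical toy values rather than real results.

```python
import math
from collections import defaultdict


def binary_cross_entropy(y_true, p_pos, eps=1e-7):
    """Loss for one sample; y_true is 0 or 1, p_pos is the predicted positive probability."""
    p = min(max(p_pos, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


# Hypothetical toy records: (user score 1-10, ground-truth label, predicted positive probability)
samples = [
    (1, 0, 0.05),
    (2, 0, 0.10),
    (4, 0, 0.40),
    (7, 1, 0.50),
    (9, 1, 0.90),
    (10, 1, 0.97),
]

losses = [(score, binary_cross_entropy(y, p)) for score, y, p in samples]

# Samples ordered by loss, highest first
ranked = sorted(losses, key=lambda item: item[1], reverse=True)

# Average loss per user score
by_score = defaultdict(list)
for score, loss in losses:
    by_score[score].append(loss)
avg_loss_by_score = {s: sum(v) / len(v) for s, v in sorted(by_score.items())}

for score, avg in avg_loss_by_score.items():
    print(f"score {score:>2}: average loss {avg:.3f}")
```

In this toy data the mid-scale scores carry the highest average loss, which mirrors the pattern described for the bottom-left panel.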
Tensorleap's technology creates a latent space that approximates the latent space of the entire model. It is composed of feature activations from all of the model's layers that distribute the data in the most informative way.
This allows the platform to create a similarity map between samples as they are interpreted by the model. A more intuitive explanation would be that similar samples would activate similar learned features within the model.
This similarity map is called a Population Exploration analysis, and it is performed automatically after each epoch.
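The idea that similar samples activate similar learned features can be illustrated with a small sketch. This is not Tensorleap's algorithm: it simply assumes each sample is summarized by a vector of feature activations and compares those vectors with cosine similarity. The vectors and sample names below are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two activation vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Hypothetical activation vectors, one per sample
activations = {
    "review_a": [0.9, 0.1, 0.0],
    "review_b": [0.8, 0.2, 0.1],
    "review_c": [0.0, 0.2, 0.9],
}


def nearest_neighbor(name, vectors):
    """Return (other_name, similarity) for the sample most similar to `name`."""
    query = vectors[name]
    others = (
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != name
    )
    return max(others, key=lambda pair: pair[1])


print(nearest_neighbor("review_a", activations))
```

Samples whose vectors point in nearly the same direction end up as neighbors, which is the intuition behind clusters in a similarity map.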
Below there is a short clip that illustrates the following steps:
- After the training is completed, click at the top left to see the Population Exploration analysis.
- Resize the Population Exploration analysis panel.
- Color the dots by their ground-truth label: click the selector where loss is currently selected, and change it to the ground-truth label.
Population Exploration
The similarity map shows various clusters corresponding to each given label. Within these clusters, there are samples that are misclassified by the model. These samples have different labels (thus different colors) and high loss.
Hovering over the dots shows a preview of the sample:
Population Exploration Samples Preview
The [OOV] seen in the samples means Out Of Vocabulary. It indicates that the word was not represented in the word tokens, and was thus ignored by the model.
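As a rough illustration of where [OOV] markers come from, the sketch below builds a vocabulary limited to the most frequent words and replaces everything else with an [OOV] token. The function names, the tiny corpus, and the vocabulary size are hypothetical, and the project's actual tokenizer may behave differently.

```python
from collections import Counter


def build_vocab(texts, max_words):
    """Keep only the `max_words` most frequent lowercase words."""
    counts = Counter(word for text in texts for word in text.lower().split())
    return {word: idx for idx, (word, _) in enumerate(counts.most_common(max_words))}


def mark_oov(text, vocab, oov_token="[OOV]"):
    """Replace every word that is missing from the vocabulary with the OOV token."""
    return [word if word in vocab else oov_token for word in text.lower().split()]


# Hypothetical mini-corpus used to build the vocabulary
corpus = ["a great film", "a bad film", "a great great cast"]
vocab = build_vocab(corpus, max_words=3)

print(mark_oov("a great but bad film", vocab))
```

Words that fall outside the size limit, such as rare words, are the ones that show up as [OOV] in the sample preview.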
In the center of each cluster we can see samples with high loss, indicated by the large dots:
These samples are good candidates for mis-labeling. Previewing and clicking them reveals that they are indeed valid candidates:
The sample above shows a very bad review, ending with "don't see this film...", but its ground truth and score label it as a good film. It is indeed a mis-label.
This sample says a lot of good things about Brad Pitt and about the director, even though it is labeled and scored as a bad film.
In this review, the script is described as bad even though the film is highly scored. This is not a "real" mis-labeling case: the original review had additional paragraphs describing the positive things about the movie, but they were truncated due to the input length.
This exposes a problem with long reviews, which are not well represented within the current model's word limit.
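The truncation effect described above can be sketched in a few lines. The example assumes that the input is cut at a fixed length and that the beginning of the review is kept, which matches the description of later paragraphs being lost. The helper name, the padding token, and the length of 4 are hypothetical.

```python
def pad_or_truncate(tokens, max_len, pad_token="[PAD]"):
    """Force a token list to exactly `max_len` items, keeping the start of the text."""
    clipped = tokens[:max_len]
    return clipped + [pad_token] * (max_len - len(clipped))


review = "the script is bad but the acting and the ending are wonderful".split()

model_input = pad_or_truncate(review, max_len=4)
dropped = review[4:]  # everything the model never sees

print("model input:", model_input)
print("dropped:", dropped)
```

Here the model only sees the negative opening, while the positive remainder of the review is discarded, so a correct label can look like a mis-label.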
Click each sample to show its preview, metadata, and metrics. Click
to analyze it.
The Sample Analysis tool runs explainability algorithms on selected samples and displays the visualizations correlated with the Visualizer blocks.
To analyze a sample, select it and click
on the right panel. This sends the sample to be analyzed by the platform, and once finished, the results are displayed in the Analyzer panel.
These are the results for the sample shown in the image above. This sample is a good review, but it was predicted as a bad one (shown by the horizontal bars). In addition, we can ask which tokens pushed the positive prediction up, and which pushed the negative prediction up. Below we can see the sample analysis, where the words "this was an excellent film" were marked for positive and "worst movie ever" was marked for negative:
Sample Analysis
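One simple way to ask which tokens push a prediction up or down is occlusion: remove one token at a time and measure how the predicted probability changes. The sketch below is not the platform's explainability algorithm. It uses a hypothetical word-weight scorer in place of the trained model, purely to show the mechanics.

```python
import math

# Hypothetical word weights standing in for a trained sentiment model
WEIGHTS = {"excellent": 2.0, "worst": -2.5, "good": 1.0, "bad": -1.0}


def positive_probability(tokens):
    """Toy model: sigmoid of the summed word weights."""
    score = sum(WEIGHTS.get(token, 0.0) for token in tokens)
    return 1.0 / (1.0 + math.exp(-score))


def occlusion_attribution(tokens):
    """For each token, how much the positive probability drops when it is removed."""
    base = positive_probability(tokens)
    result = []
    for i, token in enumerate(tokens):
        without = tokens[:i] + tokens[i + 1:]
        result.append((token, base - positive_probability(without)))
    return result


tokens = "this was an excellent film but the worst movie ever".split()
attribution = dict(occlusion_attribution(tokens))

for token, value in attribution.items():
    print(f"{token:>10}: {value:+.3f}")
```

A positive value means the token raised the positive prediction, and a negative value means it raised the negative one. Tokens the toy model does not know receive an attribution of zero.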
Congratulations on completing this short IMDB walkthrough for the Standalone Trial.
Next, you can follow the full guide at IMDB Guide, which takes you through dataset integration, model building and importing, as well as reviewing and analyzing additional metrics.