{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Tune GPT2 to generate positive reviews\n", "> Optimise GPT2 to produce positive IMDB movie reviews using a BERT sentiment classifier as a reward function." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
Figure: Experiment setup to tune GPT2. The yellow arrows are outside the scope of this notebook, but the trained models are available through Hugging Face.
\n", "Figure: Reward mean and distribution evolution during training.
|  | query | response (before) | response (after) | rewards (before) | rewards (after) |
|---|---|---|---|---|---|
| 0 | I rented Zero Day | 4 for my sister. To my surprise, the Wii caug... | . It is a pleasure. It is a huge leap 68 years... | 1.736068 | 2.423731 |
| 1 | The only | distro of her | special compliments is the | 0.150852 | 0.190159 |
| 2 | I've read a few | news reports about Mr. Mueller's activities b... | novels and I never watch this. It has a reall... | -1.417962 | 2.831814 |
| 3 | This is the second British Rank film | , and I wouldn't be surprised anymore if it | that I have enjoyed, achieving it in both the | 0.835876 | 2.205628 |
| 4 | A classic | classic.<br /><br />And only this one will ha... | . It's a movie with a fine cast. As the beginn... | 2.113075 | 2.739168 |
| 5 | This has to be one of the | worst with the differences being that for the | best thriller films I've seen in recent | -2.705339 | 2.730615 |
| 6 | Happy Go Lovely is a waste | . Not only are extremely | of time, giving a | -2.429504 | -2.934672 |
| 7 | Wow, I just | can't make fun of it | feek it! This show | -2.201666 | -0.106085 |
| 8 | This movie makes several mistakes. | Despite being a great comedic diversion it es... | It's cool, wonderful - it held me into a very ... | -1.232380 | 2.707638 |
| 9 | Branagh and Fish | burne, Drake is played | is a great show. Beautiful | 0.776819 | 2.808996 |
| 10 | I might have given this movie a | rating of *11 when I heard that!), but it was... | great performance. It was truly a great movie... | 0.276380 | 2.743328 |
| 11 | Really, really bad | with feel like there is no end to the | . This movie is incredibly good, with the | -2.639503 | -1.568827 |
| 12 | What another reviewer called lack of | judgment, connecting into her own harsh obser... | suspense. Rogers and Rooney rate this as exce... | -1.079707 | 2.696888 |
| 13 | This is simply one | more problem of Steve | of the best choice | -1.445436 | 2.662699 |
| 14 | "Perhaps we can arrange a meet | -and-greet.<br /><br />Teleg | with spent, classic music and dance, and come... | 0.258479 | 1.876662 |
| 15 | Richard Willaims is | nice enough; the little black guy plays quite | beautifully hands on in his own spin, and | 0.796508 | 2.820259 |
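
The before/after rewards in the table can be summarised with a few lines of plain Python. This is only a minimal sketch: the two lists are copied by hand from the 16 rows shown above, and the variable names (`rewards_before`, `rewards_after`, `improved`) are illustrative rather than taken from the notebook's own code. A sample this small gives a rough indication of the trend, not a proper evaluation.

```python
from statistics import mean, median

# Rewards copied from the 16 sample rows of the table above (rows 0-15).
rewards_before = [
    1.736068, 0.150852, -1.417962, 0.835876, 2.113075, -2.705339,
    -2.429504, -2.201666, -1.232380, 0.776819, 0.276380, -2.639503,
    -1.079707, -1.445436, 0.258479, 0.796508,
]
rewards_after = [
    2.423731, 0.190159, 2.831814, 2.205628, 2.739168, 2.730615,
    -2.934672, -0.106085, 2.707638, 2.808996, 2.743328, -1.568827,
    2.696888, 2.662699, 1.876662, 2.820259,
]

# Count the queries whose reward increased after tuning.
improved = sum(after > before
               for before, after in zip(rewards_before, rewards_after))

print(f"mean reward   before: {mean(rewards_before):.3f}  after: {mean(rewards_after):.3f}")
print(f"median reward before: {median(rewards_before):.3f}  after: {median(rewards_after):.3f}")
print(f"improved: {improved} of {len(rewards_before)} queries")
```

In this sample the only row whose reward drops is row 6 ("Happy Go Lovely is a waste"), where the query itself already commits the review to a negative sentiment.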