GPT-2 Fine-tuned on The Witcher Books

A GPT-2 Medium model fine-tuned on text from The Witcher book series by Andrzej Sapkowski for creative text generation in a dark fantasy style.

Model Description

This model is part of the Deepstory project, which combines Natural Language Generation, Text-to-Speech, and animation technologies to create interactive storytelling experiences.

The model has been fine-tuned on The Witcher novels to generate text that captures the narrative style, vocabulary, and themes of the original books.

Model Architecture

This model is based on the GPT-2 Medium architecture.

Parameter                     Value
---------------------------   ---------------
Architecture                  GPT2LMHeadModel
Model Size                    GPT-2 Medium
Number of Layers (n_layer)    24
Hidden Size (n_embd)          1024
Attention Heads (n_head)      16
Context Length (n_ctx)        1024
Max Positions (n_positions)   1024
Vocabulary Size               50,257
Activation Function           GELU (new)
Attention Dropout             0.1
Embedding Dropout             0.1
Residual Dropout              0.1
Layer Norm Epsilon            1e-05
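
The configuration values above can be checked programmatically from the checkpoint's config file, without downloading the model weights:

from transformers import AutoConfig

# Fetch only the configuration file attached to the checkpoint
config = AutoConfig.from_pretrained("thetobysiu/gpt2-witcher-books")
print(config.n_layer, config.n_embd, config.n_head, config.n_ctx)  # 24 1024 16 1024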

Usage

With Transformers Library

from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("thetobysiu/gpt2-witcher-books")
model = GPT2LMHeadModel.from_pretrained("thetobysiu/gpt2-witcher-books")

# Generate text
prompt = "The witcher drew his silver sword and"
input_ids = tokenizer.encode(prompt, return_tensors='pt')

output = model.generate(
    input_ids=input_ids,
    max_length=150,
    temperature=0.9,
    top_p=0.95,
    top_k=50,
    do_sample=True,
    num_return_sequences=1,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token; reuse EOS to silence the warning
)

generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
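
Sampling is stochastic, so repeated runs yield different continuations. For reproducible output, seed the random number generators first, for example with the set_seed helper from Transformers:

from transformers import set_seed

# Seed the Python, NumPy, and PyTorch RNGs so sampling is repeatable
set_seed(42)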

Generation Parameters

For best results, we recommend the following generation parameters:

Parameter     Recommended Value   Description
-----------   -----------------   --------------------------------------------------
temperature   0.7 - 1.0           Higher values produce more varied, creative text
top_p         0.9 - 0.95          Nucleus sampling threshold
top_k         40 - 50             Number of highest-probability tokens sampled from
max_length    100 - 300           Maximum total length (prompt + generated tokens)
do_sample     True                Enable sampling instead of greedy decoding
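
These parameters can also be passed straight through the text-generation pipeline. A minimal sketch using mid-range values from the table (the prompt is purely illustrative):

from transformers import pipeline

# Wrap the fine-tuned checkpoint in a text-generation pipeline
generator = pipeline("text-generation", model="thetobysiu/gpt2-witcher-books")

result = generator(
    "Geralt reached for a flask of Swallow and",
    max_length=200,
    temperature=0.8,
    top_p=0.95,
    top_k=50,
    do_sample=True,
)
print(result[0]["generated_text"])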

Training Data

The model was fine-tuned on text from The Witcher book series, including:

  • The Last Wish
  • Sword of Destiny
  • Blood of Elves
  • Time of Contempt
  • Baptism of Fire
  • The Tower of the Swallow
  • The Lady of the Lake
  • Season of Storms

Training Procedure

The model was fine-tuned using the Hugging Face Transformers library starting from the pre-trained GPT-2 Medium checkpoint.
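
The exact training hyperparameters are not documented. The following is a minimal sketch of how such a fine-tune could be reproduced with the Trainer API, assuming the book text has been concatenated into a single plain-text file; train.txt, the epoch count, and the batch size are assumptions, not the values actually used:

from transformers import (
    GPT2LMHeadModel, GPT2Tokenizer,
    TextDataset, DataCollatorForLanguageModeling,
    Trainer, TrainingArguments,
)

# Start from the pre-trained GPT-2 Medium checkpoint
tokenizer = GPT2Tokenizer.from_pretrained("gpt2-medium")
model = GPT2LMHeadModel.from_pretrained("gpt2-medium")

# Chunk the book text into 1024-token blocks (train.txt is a hypothetical path)
dataset = TextDataset(tokenizer=tokenizer, file_path="train.txt", block_size=1024)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="gpt2-witcher-books",
    num_train_epochs=3,              # assumption: not documented
    per_device_train_batch_size=2,   # assumption: not documented
    save_steps=1000,
)

Trainer(
    model=model,
    args=args,
    data_collator=collator,
    train_dataset=dataset,
).train()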

Intended Use

This model is intended for:

  • Creative writing assistance in fantasy genre
  • Interactive storytelling applications
  • Fan fiction generation
  • Research in language model fine-tuning
  • Educational purposes

Limitations

  • The model may generate content that doesn't perfectly match the original author's style
  • Context length is limited to 1024 tokens, so longer prompts must be truncated (see the sketch after this list)
  • May occasionally generate repetitive or incoherent text
  • The training data is based on copyrighted material
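
A minimal sketch of prompt truncation, reusing the tokenizer and model from the usage example above, that reserves room for the newly generated tokens inside the 1024-token window:

# Reserve space for generated tokens inside the 1024-token context window
max_new_tokens = 150
input_ids = tokenizer(
    prompt,
    return_tensors="pt",
    truncation=True,
    max_length=1024 - max_new_tokens,
).input_ids

output = model.generate(
    input_ids,
    max_new_tokens=max_new_tokens,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)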

Ethical Considerations

  • This model is trained on copyrighted material and should be used for personal/research purposes
  • Generated content should not be presented as original work by the book author
  • Users should be aware that generated text may contain mature themes consistent with the source material

Citation

If you use this model, please cite the Deepstory project:

@misc{deepstory,
  author = {Siu King Wai},
  title = {Deepstory},
  year = {2020},
  publisher = {GitHub},
  url = {https://github.com/thetobysiu/deepstory}
}

License

This model is released under the MIT License. Please note that The Witcher book series is the intellectual property of Andrzej Sapkowski, and the related video games are the property of CD Projekt.

Acknowledgments

  • OpenAI for the original GPT-2 model
  • Hugging Face for the Transformers library
  • Andrzej Sapkowski for The Witcher book series