GPT-2 Fine-tuned on The Witcher Books

A GPT-2 Medium model fine-tuned on text from The Witcher book series by Andrzej Sapkowski for creative text generation in a dark fantasy style.

Model Description

This model is part of the Deepstory project, which combines Natural Language Generation, Text-to-Speech, and animation technologies to create interactive storytelling experiences.

The model has been fine-tuned on The Witcher novels to generate text that captures the narrative style, vocabulary, and themes of the original books.

Model Architecture

This model is based on the GPT-2 Medium architecture.

Parameter                     Value
---------------------------   ---------------
Architecture                  GPT2LMHeadModel
Model Size                    GPT-2 Medium
Number of Layers (n_layer)    24
Hidden Size (n_embd)          1024
Attention Heads (n_head)      16
Context Length (n_ctx)        1024
Max Positions (n_positions)   1024
Vocabulary Size               50,257
Activation Function           GELU (new)
Attention Dropout             0.1
Embedding Dropout             0.1
Residual Dropout              0.1
Layer Norm Epsilon            1e-05
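
The configuration values above can be checked programmatically from the checkpoint's config file, without downloading the model weights:

from transformers import AutoConfig

# Fetch only the configuration file attached to the checkpoint
config = AutoConfig.from_pretrained("thetobysiu/gpt2-witcher-books")
print(config.n_layer, config.n_embd, config.n_head, config.n_ctx)  # 24 1024 16 1024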

Usage

With Transformers Library

from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("thetobysiu/gpt2-witcher-books")
model = GPT2LMHeadModel.from_pretrained("thetobysiu/gpt2-witcher-books")

# Generate text
prompt = "The witcher drew his silver sword and"
input_ids = tokenizer.encode(prompt, return_tensors='pt')

output = model.generate(
    input_ids=input_ids,
    max_length=150,
    temperature=0.9,
    top_p=0.95,
    top_k=50,
    do_sample=True,
    num_return_sequences=1,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token; reuse EOS to silence the warning
)

generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
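
Sampling is stochastic, so repeated runs yield different continuations. For reproducible output, seed the random number generators first, for example with the set_seed helper from Transformers:

from transformers import set_seed

# Seed the Python, NumPy, and PyTorch RNGs so sampling is repeatable
set_seed(42)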

Generation Parameters

For best results, we recommend the following generation parameters:

Parameter     Recommended Value   Description
-----------   -----------------   --------------------------------------------------
temperature   0.7 - 1.0           Higher values produce more varied, creative text
top_p         0.9 - 0.95          Nucleus sampling threshold
top_k         40 - 50             Number of highest-probability tokens sampled from
max_length    100 - 300           Maximum total length (prompt + generated tokens)
do_sample     True                Enable sampling instead of greedy decoding
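
These parameters can also be passed straight through the text-generation pipeline. A minimal sketch using mid-range values from the table (the prompt is purely illustrative):

from transformers import pipeline

# Wrap the fine-tuned checkpoint in a text-generation pipeline
generator = pipeline("text-generation", model="thetobysiu/gpt2-witcher-books")

result = generator(
    "Geralt reached for a flask of Swallow and",
    max_length=200,
    temperature=0.8,
    top_p=0.95,
    top_k=50,
    do_sample=True,
)
print(result[0]["generated_text"])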

Training Data

The model was fine-tuned on text from The Witcher book series, including:

  • The Last Wish
  • Sword of Destiny
  • Blood of Elves
  • Time of Contempt
  • Baptism of Fire
  • The Tower of the Swallow
  • The Lady of the Lake
  • Season of Storms

Training Procedure

The model was fine-tuned using the Hugging Face Transformers library starting from the pre-trained GPT-2 Medium checkpoint.
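
The exact training hyperparameters are not documented. The following is a minimal sketch of how such a fine-tune could be reproduced with the Trainer API, assuming the book text has been concatenated into a single plain-text file; train.txt, the epoch count, and the batch size are assumptions, not the values actually used:

from transformers import (
    GPT2LMHeadModel, GPT2Tokenizer,
    TextDataset, DataCollatorForLanguageModeling,
    Trainer, TrainingArguments,
)

# Start from the pre-trained GPT-2 Medium checkpoint
tokenizer = GPT2Tokenizer.from_pretrained("gpt2-medium")
model = GPT2LMHeadModel.from_pretrained("gpt2-medium")

# Chunk the book text into 1024-token blocks (train.txt is a hypothetical path)
dataset = TextDataset(tokenizer=tokenizer, file_path="train.txt", block_size=1024)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="gpt2-witcher-books",
    num_train_epochs=3,              # assumption: not documented
    per_device_train_batch_size=2,   # assumption: not documented
    save_steps=1000,
)

Trainer(
    model=model,
    args=args,
    data_collator=collator,
    train_dataset=dataset,
).train()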

Intended Use

This model is intended for:

  • Creative writing assistance in fantasy genre
  • Interactive storytelling applications
  • Fan fiction generation
  • Research in language model fine-tuning
  • Educational purposes

Limitations

  • The model may generate content that doesn't perfectly match the original author's style
  • Context length is limited to 1024 tokens, so longer prompts must be truncated (see the sketch after this list)
  • May occasionally generate repetitive or incoherent text
  • The training data is based on copyrighted material
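
A minimal sketch of prompt truncation, reusing the tokenizer and model from the usage example above, that reserves room for the newly generated tokens inside the 1024-token window:

# Reserve space for generated tokens inside the 1024-token context window
max_new_tokens = 150
input_ids = tokenizer(
    prompt,
    return_tensors="pt",
    truncation=True,
    max_length=1024 - max_new_tokens,
).input_ids

output = model.generate(
    input_ids,
    max_new_tokens=max_new_tokens,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)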

Ethical Considerations

  • This model is trained on copyrighted material and should be used for personal/research purposes
  • Generated content should not be presented as original work by the book author
  • Users should be aware that generated text may contain mature themes consistent with the source material

Citation

If you use this model, please cite the Deepstory project:

@misc{deepstory,
  author = {Siu King Wai},
  title = {Deepstory},
  year = {2020},
  publisher = {GitHub},
  url = {https://github.com/thetobysiu/deepstory}
}

License

This model is released under the MIT License. Please note that The Witcher book series is the intellectual property of Andrzej Sapkowski, and the related video games are the property of CD Projekt.

Acknowledgments

  • OpenAI for the original GPT-2 model
  • Hugging Face for the Transformers library
  • Andrzej Sapkowski for The Witcher book series