---
license: cc-by-4.0
language:
- he
base_model:
- GiliGold/Knesset-DictaBERT
pipeline_tag: text-classification
tags:
- checkworthiness
- factuality
- worth
- checking
- worth-checking
- checkable
---


This model is based on [Knesset-DictaBERT](https://huggingface.co/GiliGold/Knesset-DictaBERT) and was trained to classify Hebrew sentences for checkworthiness.

The possible labels are:
*worth checking*, *not worth checking*, or *not a factual proposition*.
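
A minimal usage sketch with the `transformers` text-classification pipeline follows. It assumes the model is hosted under the repository id `GiliGold/knesset-dicta-checkworthiness` (the id this card links to for the checkworthiness classifier); the helper names `best_prediction` and `classify` are illustrative only. The label strings the model actually returns depend on its configuration and may differ from the wording of the three labels above.

```python
# Assumption: this is the repository id of the checkworthiness classifier
# referenced on this card; change it if the model lives elsewhere.
MODEL_ID = "GiliGold/knesset-dicta-checkworthiness"


def best_prediction(scores):
    """Return the (label, score) pair with the highest score.

    `scores` is a list of {"label": ..., "score": ...} dicts, which is the
    per-sentence format the pipeline returns when top_k=None.
    """
    top = max(scores, key=lambda item: item["score"])
    return top["label"], top["score"]


def classify(sentences, model_id=MODEL_ID):
    """Classify Hebrew sentences; return one (label, score) pair per sentence."""
    # Imported lazily so best_prediction can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # top_k=None asks the pipeline for the scores of every label.
    all_scores = classifier(list(sentences), top_k=None)
    return [best_prediction(scores) for scores in all_scores]
```

Calling `classify(["..."])` with a list of Hebrew sentences downloads the model on first use and returns the highest-scoring label for each sentence together with its score.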

It was trained on a training set of ~5,000 manually annotated sentences from the [Knesset Corpus](https://huggingface.co/datasets/HaifaCLGroup/KnessetCorpus).

The training set is available [here](https://github.com/HaifaCLG/Factuality).

The Knesset Corpus, automatically annotated for checkworthiness by [knesset-dicta-checkworthiness](https://huggingface.co/GiliGold/knesset-dicta-checkworthiness), is available [here](https://huggingface.co/datasets/GiliGold/Knesset_check_worthiness).

Paper:
[arXiv paper](https://arxiv.org/abs/2509.26406)

Citation:

@InProceedings{goldin-EtAl:2025:RANLP,
  author    = {Goldin, Gili  and  Wigderson, Shira  and  Rabinovich, Ella  and  Wintner, Shuly},
  title     = {An Annotation Scheme for Factuality and Its Application to Parliamentary Proceedings},
  booktitle      = {Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI era},
  month          = {September},
  year           = {2025},
  address        = {Varna, Bulgaria},
  publisher      = {INCOMA Ltd., Shoumen, Bulgaria},
  pages     = {403--412},
  abstract  = {Factuality assesses the extent to which a language utterance relates to real-world information; it determines whether utterances correspond to facts, possibilities, or imaginary situations, and as such, it is instrumental for fact checking. Factuality is a complex notion that relies on multiple linguistic signals, and has been studied in various disciplines. We present a complex, multi-faceted annotation scheme of factuality that combines concepts from a variety of previous works. We developed the scheme for Hebrew, but we trust that it can be adapted to other languages. We also present a set of almost 5,000 sentences in the domain of parliamentary discourse that we manually annotated according to this scheme. We report on inter-annotator agreement, and experiment with various approaches to automatically predict (some features of) the scheme, in order to extend the annotation to a large corpus.},
  url       = {https://aclanthology.org/2025.ranlp-1.49}
}