LiquidAI/LFM2-1.2B finetuned on Japanese content to output JSON with PII. The model should output only single json with four fields:

{
    "full_name": "name of the person",
    "company_name": "name of the company",
    "address": "address of the plance",
    "phone_number": "phone number"
}

Coded during Liquid AI hackathon in Tokyo.

Evaluations

Evaluation on test split of stockmark/ner-wikipedia-dataset:

Test accuracy on raw model using wiki dataset: 0.4250 --> 1.0 after fine-tunning.

That dataset is somehow simple. We generated 64 samples of long contracts containing PII in Japanese. We used it only to evaluate the final perfomrance of the models

Test accuracy on raw model using OUR dataset: 0.0312 --> 1.0 after fine-tunning.

Evaluation methodology

We use an exact match on generated JSON. The output of SLM must be a valid JSON with exactly four required fields, no less, no more.

Model Details

Developed by: @kainoj, @valeriosalvucci and @gangadhara691
Language(s) (NLP): Japanese
License: lfm1.0
Finetuned from model: LiquidAI/LFM2-1.2B
Finetued on dataset: stockmark/ner-wikipedia-dataset

System prompt

Use the following system prompt while extracting PIIs:

Identify and extract information matching the following schema.
Return data as a JSON object. 
For each field, select most suitable value from text
If provided text does not contain sufficient information to fill out the field, make the field empty string.
Output only JSON, and output only four fields. 

{
    "full_name": "name of the person",
    "company_name": "name of the company",
    "address": "address of the plance",
    "phone_number": "phone number"
}

Downloads last month: 25

Safetensors

Model size

1B params

Tensor type

BF16

Model tree for kainoj/LiquidAI-LFM2-1.2B-ja-pii-finetuned

Base model

LiquidAI/LFM2-1.2B

Finetuned

(37)

this model

kainoj
/

LiquidAI-LFM2-1.2B-ja-pii-finetuned

Evaluations

Evaluation methodology

Model Details

System prompt

Model tree for kainoj/LiquidAI-LFM2-1.2B-ja-pii-finetuned

Dataset used to train kainoj/LiquidAI-LFM2-1.2B-ja-pii-finetuned