Commit 2c02057 by Miquel Albertí
Parent(s): 9f92a32
Update READEM
metrics.md (+4 -4)
@@ -11,7 +11,7 @@ For those designs that pass SYN, PSQ is measured with the PPA report, which is c

 Since STX, FNC and SYN are binary evaluations, pass or fail, their score is computed using Pass@1. Remember that failures in previous stages are reported as automatic fails in the next ones.

-On the other hand, PPA is represented as
+On the other hand, PPA is represented as real values that must be evaluated against a golden solution (crafted by humans in our case). Therefore, we need to introduce a new formulation that takes this into account.

 ### PPA-Score

@@ -31,9 +31,9 @@ $$
 $$
 which has the following interpretation:

-- $\hat{p}_{i,j} = 0$ :
-- $\hat{p}_{i,j} = 1$ : designs with an area, power or
-- $\hat{p}_{i,j} = 2$ : can only be obtained by chips which occupy no space, execute in no time,
+- $\hat{p}_{i,j} = 0$ : means that the generation has not passed the previous stages, or that it requires twice or more the area, the power, or the delay (performance), when compared to the human reference.
+- $\hat{p}_{i,j} = 1$ : are designs with an area, power, or performance equal to that of the human reference.
+- $\hat{p}_{i,j} = 2$ : can only be obtained by chips which occupy no space, execute in no time, or consume no energy (perfect, but impossible to achieve).

 The final formula considering all generations of an LLM for a given benchmark is just the average of these scores:
 $$
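
The interpretation in the second hunk pins down three points of the score: 0 for a failed generation or one that needs at least twice the reference, 1 at parity with the human reference, and 2 at zero cost. The defining formula sits in lines of metrics.md that these hunks do not show, so the following is only a minimal sketch of one mapping consistent with those three points. It assumes a clipped linear form, `max(0, 2 - value / reference)`; the names `ppa_score` and `benchmark_score` are hypothetical and do not come from the repository.

```python
from statistics import mean
from typing import Optional, Sequence


def ppa_score(value: Optional[float], reference: float) -> float:
    """Hypothetical per-metric score (assumed clipped linear form).

    value:     measured area, power, or delay of the generated design,
               or None if the generation failed an earlier stage.
    reference: the same metric for the human-written golden solution.
    """
    if value is None:
        # Failures in earlier stages count as automatic fails here.
        return 0.0
    if reference <= 0:
        raise ValueError("reference metric must be positive")
    # 2 at zero cost, 1 at parity, 0 at twice the reference or worse.
    return max(0.0, 2.0 - value / reference)


def benchmark_score(scores: Sequence[float]) -> float:
    """Average the per-generation scores, as the last context line describes."""
    return mean(scores) if scores else 0.0
```

For example, under this assumed form a design using half the reference area would score 1.5, and averaging the scores 0, 1 and 2 gives 1.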