Commits · Luigi/SmolVLM2-on-llama.cpp

default to 500m model

fa03d73

Luigi committed 13 days ago

default f16 precision

5a94240

Luigi committed 13 days ago

back to normal resolution 384x384 at 75% compression quality

5fc1115

Luigi committed 13 days ago

bugfix on n_threads default

aa69ba7

Luigi committed 13 days ago

resize frame to 64x64

23c5da5

Luigi committed 13 days ago

resized to 64x64

cd1cc4c

Luigi committed 13 days ago

clean precision list to 2.2b

9093e42

Luigi committed 13 days ago

try with smaller frame size

1864930

Luigi committed 13 days ago

keep model in ram and reduce jpg quality

fbaf2b0

Luigi committed 13 days ago

add "Q4_K_M" precision

8addf7d

Luigi committed 13 days ago

default to 256m f16 model

e53e448

Luigi committed 15 days ago

put fall detection prompt as default

01262c3

Luigi committed 15 days ago

default n_threads to 2

0d517a8

Luigi committed 15 days ago

open n_threads to set by user

22b94a2

Luigi committed 15 days ago

update dockerfile and app.py

957ece1

Luigi committed 15 days ago

add workaround in ensure_weights to deal with permission error

08f659b

Luigi committed 15 days ago

reduce n_ctx to 512

1aba000

Luigi committed 15 days ago

fix imencode call

cc08312

Luigi committed 15 days ago

default to smallest model with q8 precision, enable verbose mode, disable reset clip

65efb90

Luigi committed 15 days ago

fix imencode

3be8e88

Luigi committed 15 days ago

reduce jpg quality for smaller image footprint

4e5fc85

Luigi committed 15 days ago

show llama cpp version

9069c3e

Luigi committed 15 days ago

inject verbose message to debug window

69c8775

Luigi committed 16 days ago

add verbose mode switch

2881733

Luigi committed 16 days ago

increase n_ctx to 8192

4decc4b

Luigi committed 16 days ago

add ui component to allow user to enable or disable reset_clip per frame

5462ff3

Luigi committed 16 days ago

add debug to show which weight files we’re using this run

a459bee

Luigi committed 16 days ago

use all cpu cores

be5c239

Luigi committed 16 days ago

show cpu count in debug message

b56b6ec

Luigi committed 16 days ago

add q2_k precision weights

07f3263

Luigi committed 16 days ago

add rich debug message and dedicated display ui

34cd1e5

Luigi committed 16 days ago

apply in-memory encoding instead of temp files

45c2159

Luigi committed 16 days ago

avoid memory leak

238a95a

Luigi committed 16 days ago

use more threads for inference

bd12f6b

Luigi committed 16 days ago

remove duplicate interval ui

bdd1478

Luigi committed 16 days ago

bugfix on ui about model selection

e1ad065

Luigi committed 16 days ago

increase interval default to 3s

c9c43a8

Luigi committed 16 days ago

1. add more models,

5c50991

Luigi committed 16 days ago

reduce ctx and max tokens for performance

76a0b57

Luigi committed 17 days ago

minor update then add todos

65b3c3a

Luigi committed 17 days ago

resize frame to 384x384 resolution

c1d8038

Luigi committed 19 days ago

add debug messages

36dacc6

Luigi committed 19 days ago

switch to gradio implementation as streamlit + webrtc requires turn server

970f416

Luigi committed 19 days ago

decouple inference from streaming

292fb3c

Luigi committed 19 days ago

set default interval to 3s

2529cb3

Luigi committed 19 days ago

slightly increase repeat_penalty to reduce token repetition

636baf9

Luigi committed 19 days ago

bugfix on 'NoneType' object has no attribute 'caption'

7b7ed26

Luigi committed 19 days ago

update

221e4b6

Luigi committed 19 days ago

add debug

abec2c1

Luigi committed 20 days ago

update

dd0d47d

Luigi committed 20 days ago
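
Several of the commits listed here adjust the inference thread count ("open n_threads to set by user", "default n_threads to 2", "use all cpu cores", "bugfix on n_threads default"). The repository's actual code is not shown on this page, so the following is only a minimal sketch of how such a default might be resolved. The helper name `resolve_n_threads` and its clamping behaviour are assumptions made for illustration, not the author's implementation.

```python
import os


def resolve_n_threads(requested=None, default=2):
    """Hypothetical helper: choose a thread count for inference.

    Uses the user-supplied value when given, otherwise the default,
    and clamps the result to the range [1, number of CPU cores].
    """
    # os.cpu_count() may return None on some platforms; fall back to 1.
    cpu_count = os.cpu_count() or 1
    if requested is None:
        requested = default
    return max(1, min(int(requested), cpu_count))


if __name__ == "__main__":
    print("cpu cores:", os.cpu_count())
    print("default n_threads:", resolve_n_threads())
    print("user asked for 64:", resolve_n_threads(64))
```

If the app loads the model through llama-cpp-python (an assumption, since the page only names llama.cpp), the resolved value could be passed as the `n_threads` argument when the model is constructed.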
