
nvidia/Llama3-ChatQA-1.5-8B

Tags: Text Generation · Transformers · Safetensors · PyTorch · English · llama · nvidia · chatqa-1.5 · chatqa · llama-3 · Inference Endpoints · text-generation-inference
Community (17 discussions)

generation_config.json: add a mapping for the special token '<|im_end|>' so that generation stops when <|im_end|> is produced, instead of running on indefinitely.

2 replies · #17 opened about 5 hours ago by zjyhf
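Discussions #17 and #16 both describe registering `<|im_end|>` as an end-of-sequence token so decoding halts when the model emits it. A minimal sketch of what such a `generation_config.json` change might look like is below; every token id shown is purely illustrative (the real ids must be read from this model's tokenizer, e.g. via `tokenizer.convert_tokens_to_ids("<|im_end|>")`), and the list form of `eos_token_id` is standard `transformers` behavior for multiple stop tokens.

```json
{
  "bos_token_id": 128000,
  "eos_token_id": [128001, 128003]
}
```

Equivalently, the extra stop token can be supplied at call time without editing the config, e.g. `model.generate(..., eos_token_id=tokenizer.convert_tokens_to_ids("<|im_end|>"))`.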

The tokenizer adds the special token '<|im_end|>' so that generation stops when <|im_end|> is encountered, instead of running on indefinitely.

#16 opened about 5 hours ago by zjyhf

How to use in llama.cpp server

1 reply · #15 opened about 23 hours ago by subbur

How to set context in multi-turn QA?

6 replies · #14 opened 6 days ago by J22

Update README.md

#13 opened 7 days ago by freyacoltman

Trying to run with a dedicated endpoint (4x A100, 320 GB) still gets "not enough hardware capacity"

3 replies · #11 opened 10 days ago by trungnx26

Colab Notebook

1 reply · #10 opened 11 days ago by ChristophSchuhmann

Megatron-LM training (fine-tuning) code?

3 replies · #9 opened 11 days ago by StephennFernandes

If I make the context empty, it outputs Chinese.

6 replies · #8 opened 11 days ago by Cometyang

Adding `safetensors` variant of this model

#7 opened 12 days ago by SFconvertbot

Adding `safetensors` variant of this model

#6 opened 13 days ago by SFconvertbot

Chat template

15 replies · #5 opened 13 days ago by bartowski

Adding `safetensors` variant of this model

#4 opened 13 days ago by SFconvertbot

I got an answer with the token "ologne" at the end

1 reply · #3 opened 13 days ago by Stilgar