Stochastic decoding with transformers
import tensorflow as tf
from transformers import TFAutoModelForCausalLM, AutoTokenizer
Loading the model
tokenizer = AutoTokenizer.from_pretrained('gpt2')
model = TFAutoModelForCausalLM.from_pretrained('gpt2')
Set the beginning of the sentence to "I like this movie".
input_ids = tokenizer.encode('I like this movie', return_tensors='tf')
Random sampling
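With do_sample=True, each next token is drawn at random in proportion to its softmax probability instead of always taking the most likely one. A minimal sketch of that single step in plain Python; the four-token vocabulary, the logit values, and the helper names softmax and sample_token are illustrative assumptions, not part of GPT-2 or transformers:

```python
import math
import random

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, rng):
    """Draw one token index in proportion to its softmax probability."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical logits for a 4-token vocabulary (made-up values).
logits = [2.0, 1.0, 0.5, -1.0]
rng = random.Random(0)
draws = [sample_token(logits, rng) for _ in range(1000)]
```

Over many draws the high-logit token appears most often, but low-probability tokens still show up occasionally, which is why unrestricted sampling can wander off topic.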
tf.random.set_seed(0)
# top_k=0 turns off top-k filtering, so tokens are drawn from the full distribution
result = model.generate(input_ids, max_length=50, do_sample=True, top_k=0)
Setting `pad_token_id` to 50256 (first `eos_token_id`) to generate sequence
tokenizer.decode(result[0])
"I like this movie. It's like a lot of my friends love this movie, we like its anti-Semitic feel and and this guy laughs at it. We like Gary Allen's dark comedy. It's a good novella, I expect"
Temperature scaling
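Temperature divides the logits by a positive constant before the softmax. A value below 1 sharpens the distribution toward the most likely tokens, and a value above 1 flattens it. A small sketch with made-up logits; the function name softmax_with_temperature is an assumption for illustration:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature; temperature must be positive."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for a 4-token vocabulary (made-up values).
logits = [2.0, 1.0, 0.5, -1.0]
p_plain = softmax_with_temperature(logits, 1.0)  # unchanged distribution
p_sharp = softmax_with_temperature(logits, 0.7)  # more mass on the top token
```

Comparing p_plain and p_sharp shows the top token gaining probability and the tail losing it, which matches the more conservative text that temperature=0.7 tends to produce.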
tf.random.set_seed(0)
result = model.generate(input_ids, max_length=50, do_sample=True, temperature=0.7, top_k=0)
Setting `pad_token_id` to 50256 (first `eos_token_id`) to generate sequence
tokenizer.decode(result[0])
"I like this movie. I think it's a great story, but it's a lot of heartache, and I'm not sure if there's a lot of people still watching it. I think it's a good story, but I don't"
top-k
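Top-k sampling keeps only the k highest-scoring tokens and samples among them. One way to sketch the filter is to set every other logit to negative infinity so that the softmax assigns it zero probability. The logits and the helper names below are illustrative assumptions:

```python
import math

def top_k_filter(logits, k):
    """Keep the k highest logits; set the rest to -inf (ties at the cutoff are kept)."""
    threshold = sorted(logits, reverse=True)[k - 1]
    return [x if x >= threshold else -math.inf for x in logits]

def softmax(logits):
    """Convert raw scores into probabilities; exp(-inf) evaluates to 0.0."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for a 4-token vocabulary (made-up values).
logits = [2.0, 1.0, 0.5, -1.0]
probs = softmax(top_k_filter(logits, k=2))  # only the two best tokens survive
```

The surviving probabilities are renormalized over the kept tokens, so the cut removes the long tail without changing the ranking among the top k.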
tf.random.set_seed(0)
result = model.generate(input_ids, max_length=50, do_sample=True, top_k=50)
Setting `pad_token_id` to 50256 (first `eos_token_id`) to generate sequence
print(tokenizer.decode(result[0]))
I like this movie. J: I think I like this movie. G: It is such a beautiful and beautiful movie because it is not just a romantic comedy. It is a love story. I would like to like to love
top-p
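Top-p (nucleus) sampling keeps the smallest set of most probable tokens whose cumulative probability reaches p, so the number of candidates adapts to how peaked the distribution is. The sketch below follows that textbook definition with made-up logits; it is not the transformers implementation, and the helper names are assumptions:

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, p):
    """Keep the most probable tokens until their total mass reaches p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    mass = sum(probs[i] for i in kept)
    return [probs[i] / mass if i in kept else 0.0 for i in range(len(probs))]

# Hypothetical logits for a 4-token vocabulary (made-up values).
probs = softmax([2.0, 1.0, 0.5, -1.0])
nucleus = top_p_filter(probs, p=0.9)  # the least likely token is dropped here
```

With these values the first three tokens are needed to reach 0.9, so only the last one is removed; a more peaked distribution would keep fewer tokens.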
tf.random.set_seed(0)
# top_k=0 turns off top-k filtering so that only the top-p cutoff applies
result = model.generate(input_ids, max_length=50, do_sample=True, top_p=0.9, top_k=0)
Setting `pad_token_id` to 50256 (first `eos_token_id`) to generate sequence
print(tokenizer.decode(result[0]))
I like this movie. It's about a place where you can choose what you want to do with your time and what you can't. Where you can get around an imaginary rock and then look out the window. I like that." Watch