github: LLMs-from-scratch/ch02/01_main-chapter-code
Data sampling with a sliding window
We train LLMs to generate one word at a time, so we prepare the training data such that, for each context, the next word in the sequence is the target to predict:
and ----> established
and established ----> himself
and established himself ----> in
and established himself in ----> a
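
The input-target pairs above can be reproduced with a minimal sketch. It assumes a whitespace split as a stand-in for a real tokenizer (a real pipeline would work on token IDs), and the function name `next_token_pairs` and the `context_size` parameter are hypothetical, chosen only for this illustration. The idea is that the targets are simply the inputs shifted one position to the right.

```python
def next_token_pairs(tokens, context_size):
    """Build (context, target) pairs from the first window of `tokens`.

    Hypothetical helper for illustration: the target sequence is the
    input sequence shifted by one position.
    """
    x = tokens[:context_size]            # inputs
    y = tokens[1:context_size + 1]       # targets: inputs shifted by one
    # For each position i, the context is everything up to i and the
    # target is the token that follows it.
    return [(x[:i], y[i - 1]) for i in range(1, len(y) + 1)]


# Assumption: whitespace splitting stands in for a real tokenizer.
tokens = "and established himself in a".split()

for context, target in next_token_pairs(tokens, context_size=4):
    print(" ".join(context), "---->", target)
```

Sliding the window forward means repeating the same slicing from a later start index, so each window yields `context_size` prediction tasks from a single input/target pair of sequences.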