On February 20, 2024, an optimization to the user experience introduced a bug with how the model processes language.
LLMs generate responses by randomly sampling words based in part on probabilities. Their “language” consists of numbers that map to tokens.
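To make this concrete, here is a minimal sketch of probability-based token sampling, in Python. Everything in it is illustrative rather than drawn from the incident itself: the tiny `VOCAB` mapping, the logit values, and the helper names are invented for demonstration.

```python
import math
import random

# Hypothetical toy vocabulary: numeric token ids mapped to text.
VOCAB = {0: "The", 1: " cat", 2: " sat", 3: " on", 4: " the", 5: " mat"}

def softmax(logits):
    """Convert raw per-token scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits):
    """Randomly sample a token id, weighted by its probability."""
    probs = softmax(logits)
    return random.choices(range(len(probs)), weights=probs, k=1)[0]

# One generation step: the model emits a score per token id,
# sampling picks an id, and the id maps back to text.
logits = [2.0, 0.5, 0.1, 0.3, 0.2, 0.1]
token_id = sample_token(logits)
print(token_id, "->", VOCAB[token_id])
```

The key point is that the model's output at each step is a number, and the text a user sees is only as good as the numbers chosen.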
In this case, the bug was in the step where the model chooses these numbers. Akin to being lost in translation, the model chose slightly wrong numbers, which produced word sequences that made no sense. More technically, inference kernels produced incorrect results when used in certain GPU configurations.
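The sketch below simulates that failure mode under stated assumptions: it does not reproduce the actual GPU kernel defect, but stands in for it by corrupting the per-token scores before sampling, so the chosen token ids no longer reflect the model's intent. The `VOCAB`, logit values, and corruption model are all hypothetical.

```python
import math
import random

VOCAB = {0: "The", 1: " cat", 2: " sat", 3: " on", 4: " the", 5: " mat"}

def sample_token(logits):
    """Sample a token id from a softmax over the scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    probs = [e / sum(exps) for e in exps]
    return random.choices(range(len(probs)), weights=probs, k=1)[0]

def buggy_sample_token(logits):
    """Stand-in for a faulty inference kernel: the computed scores are
    slightly wrong, so sampling picks near-arbitrary token ids."""
    corrupted = [x + random.uniform(-5.0, 5.0) for x in logits]
    return sample_token(corrupted)

# With correct logits the model strongly prefers token 0 ("The");
# with corrupted scores it emits arbitrary ids, i.e. gibberish text.
logits = [4.0, 0.5, 0.1, 0.3, 0.2, 0.1]
print("correct:", VOCAB[sample_token(logits)])
print("buggy:  ", VOCAB[buggy_sample_token(logits)])
```

Because the downstream decoding machinery is indifferent to where the numbers came from, even a small numerical error at this step surfaces directly to users as nonsense output.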
Upon identifying the cause of this incident, we rolled out a fix and confirmed that the incident was resolved.