Any LLM output is a combination of its weights, learned in training, and its context. Every token is some combination of those two things. The part that comes from the weights is the part for which there is no technical means of tracing back to its sources.
But even the part that comes from the context is still being produced by the weights. As I said, every token is some mathematical combination of the weights and the context.
So it can produce text that doesn't correctly summarize the content in its context, or incorrectly reproduce the link, or incorrectly map the link to the part of its context that came from that link, or more generally just make shit up.
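To make that concrete, here's a toy sketch (all names and the dict-of-arrays "model" are invented for illustration; a real transformer is vastly more complicated, but the shape of the computation is the same): the sampled token is a pure function of (weights, context), and nothing in the output carries a pointer back to which training data shaped which weight.

```python
import numpy as np

def next_token(weights, context_tokens):
    # Toy stand-in for a transformer forward pass: the only inputs
    # are the learned weights and the context tokens.
    hidden = weights["embed"][context_tokens].mean(axis=0)  # pool context embeddings
    logits = weights["unembed"] @ hidden                    # score every vocab entry
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                                    # softmax over the vocabulary
    # The result is a single integer; any notion of "which training
    # document caused this" is lost inside the arithmetic.
    return np.random.choice(len(probs), p=probs)

# Hypothetical tiny "model": 50-token vocabulary, 8-dim embeddings.
rng = np.random.default_rng(0)
weights = {"embed": rng.normal(size=(50, 8)),
           "unembed": rng.normal(size=(50, 8))}
print(next_token(weights, [3, 17, 42]))
```

The point of the toy: once the arithmetic runs, provenance doesn't survive the multiplication, whether the token was driven mostly by the weights or mostly by the context.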