Wolfson's assumption that
"they do not reproduce images in their data sets"
is on very shaky ground, especially when it comes to large language models (LLMs).
Patronus AI found several examples of LLMs generating passages from copyrighted books: https://www.patronus.ai/blog/introducing-copyright-catcher
By chaining together a sequence of text completion prompts, each one seeded with the end of the previous output, one might be able to regenerate entire chapters.
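
To make the chaining idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: `toy_complete` is a hypothetical stand-in that simulates a model with perfect recall of a short public-domain passage, and no real LLM or vendor API is called. The names `MEMORIZED`, `toy_complete`, and `chain_completions` are invented for this example.

```python
from typing import Callable

# Public-domain stand-in for text a model might have memorized
# (opening of Dickens's "A Tale of Two Cities").
MEMORIZED = (
    "It was the best of times, it was the worst of times, "
    "it was the age of wisdom, it was the age of foolishness, "
    "it was the epoch of belief, it was the epoch of incredulity."
)


def toy_complete(prompt: str, max_chars: int = 40) -> str:
    """Hypothetical completion function, NOT a real model or API.

    Simulates verbatim recall: if the prompt occurs in MEMORIZED,
    return up to max_chars of the text that follows it.
    """
    pos = MEMORIZED.find(prompt)
    if pos == -1:
        return ""  # the toy "model" has nothing to continue from
    start = pos + len(prompt)
    return MEMORIZED[start:start + max_chars]


def chain_completions(
    complete: Callable[[str], str],
    seed: str,
    context_chars: int = 30,
    max_steps: int = 50,
) -> str:
    """Repeatedly prompt with the tail of the text produced so far."""
    text = seed
    for _ in range(max_steps):
        # Only the last context_chars characters are sent as the next prompt.
        continuation = complete(text[-context_chars:])
        if not continuation:
            break  # nothing more was produced, so stop chaining
        text += continuation
    return text


if __name__ == "__main__":
    print(chain_completions(toy_complete, "It was the best of times"))
```

In this toy setting, a short opening phrase is enough to rebuild the whole passage, one 40-character completion at a time. A real model would recall text far less reliably, but the loop shows why the size of a single output does not bound how much text can be recovered.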