7 Comments

I have tested this project and found it works well for a handful of PDFs. It does not scale, however, and you run into issues with embeddings/tokens. Do you have any guidance on how to overcome this problem?

Right, I have seen this question a few times now, so I added a detailed comment on this issue to address scale and storage. Hope that helps.

https://github.com/raghavan/PdfGptIndexer/issues/2#issuecomment-1632882636
