Closed
Labels
⚙ Done: Bug fix, enhancement, FR that's completed pending release
🐞 Bug: Something isn't working
📌 Root caused: identified the root cause of bug
Milestone
Description
crawl4ai version
latest
Expected Behavior
The crawler should produce a valid next batch of URLs to crawl.
Current Behavior
If you try to crawl https://augustaks.org/, about 99% of the URLs the crawler goes on to crawl are wrong. The reason appears to be that the URL join algorithm is not correct.
How can I get it to crawl valid URLs, instead of producing invalid ones and crawling them again and again?
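To illustrate the kind of join error described here, below is a minimal sketch using only Python's standard library. It is not crawl4ai's code, and the root cause in crawl4ai may well be different. The page path `/departments/index.php`, the sample hrefs, and the helpers `resolve_link` and `naive_join` are all hypothetical, chosen only for illustration. It contrasts RFC 3986 style resolution via `urllib.parse.urljoin` with naive string concatenation, which yields URLs that do not exist and keep growing on each hop.

```python
from typing import Optional
from urllib.parse import urldefrag, urljoin, urlsplit


def resolve_link(page_url: str, href: str) -> Optional[str]:
    """Resolve an href against the page it was found on.

    Returns None for empty hrefs and for non-HTTP(S) links such as mailto:.
    """
    href = href.strip()
    if not href:
        return None
    # urljoin handles relative paths, root-relative paths, and "../" segments.
    absolute = urljoin(page_url, href)
    # Drop the fragment so "#top" style links do not create duplicate URLs.
    absolute, _fragment = urldefrag(absolute)
    if urlsplit(absolute).scheme not in ("http", "https"):
        return None
    return absolute


def naive_join(page_url: str, href: str) -> str:
    """A deliberately wrong join: plain concatenation onto the page URL."""
    return page_url.rstrip("/") + "/" + href.lstrip("/")


if __name__ == "__main__":
    # Hypothetical page URL, used only for illustration.
    page = "https://augustaks.org/departments/index.php"
    for href in ["contact.php", "/news", "../about/", "mailto:someone@example.org"]:
        print(repr(href), "->", resolve_link(page, href), "| naive:", naive_join(page, href))
```

With the naive join, `contact.php` becomes `https://augustaks.org/departments/index.php/contact.php`, and every further hop appends another segment. That would match the "again and again" symptom, but whether this is what actually happens in the crawler is an assumption.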
Is this reproducible?
Yes
Inputs Causing the Bug
https://augustaks.org
Steps to Reproduce
Run the deep crawler on https://augustaks.org and wait a few minutes; after that, all the URLs it tries to crawl are invalid.
Code snippets
OS
macOS
Python version
3.13
Browser
No response
Browser version
No response
Error logs & Screenshots (if applicable)
No response