Skip to content

[Bug]: Inconsistent handling of URL encoding rules: #1332

@8xpdh

Description

@8xpdh

crawl4ai version

latest

Expected Behavior

get valid next batch of urls to crawl

Current Behavior

if you try to crawl https://augustaks.org/ the crawler will be crawling wrong URL 99% of them, and the reason for this is becasue it the url join algorithm is not correct.

How can I get it to crawl valid urls instead of it producing false one and crawl them again and again.

Is this reproducible?

Yes

Inputs Causing the Bug

https://augustaks.org

Steps to Reproduce

try it with deep crawler on https://augustaks.org and wait for a minutes then all url it is trying to crawl are invalid.

Code snippets

OS

MAC IS

Python version

3.13

Browser

No response

Browser version

No response

Error logs & Screenshots (if applicable)

No response

Metadata

Metadata

Assignees

Labels

⚙ DoneBug fix, enhancement, FR that's completed pending release🐞 BugSomething isn't working📌 Root causedidentified the root cause of bug

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions