Webcrawl: Connect Web Data to AI Language Models