Lightning Fast and Elegant Scraping Framework for Gophers
Colly provides a clean interface to write any kind of crawler/scraper/spider.
With Colly you can easily extract structured data from websites, which can be used for a wide range of applications, like data mining, data processing or archiving.
Features
- Clean API
- Fast (>1k request/sec on a single core)
- Manages request delays and maximum concurrency per domain
- Automatic cookie and session handling
- Sync/async/parallel scraping
- Caching
- Automatic encoding of non-unicode responses
- Robots.txt support
- Distributed scraping
- Configuration via environment variables
- Extensions
Example
See examples folder for more detailed examples.
Installation
go.mod
module github.com/x/y
go 1.14
require (
github.com/gocolly/colly/v2 latest
)
Bugs
#colly
Other Projects Using Colly
Below is a list of public, open source projects that use Colly:
If you are using Colly in a project please send a pull request to add it to the list.
Contributors
This project exists thanks to all the people who contribute. [Contribute].
Backers
Thank you to all our backers!
Sponsors
Support this project by becoming a sponsor. Your logo will show up here with a link to your website. [Become a sponsor]