src.spider module
SiteCrawler uses Crawler to instantiate Producers and Consumers specifically designed to generate links and read their content.
- class src.spider.Spider(url: str, session: ClientSession, crawler: Crawler, max_links: int = 100)
Bases: object
- results(extract_text: bool = False) → List[str]
Return the raw results of crawling or, if extract_text is True, the text extracted from them.
- Parameters:
extract_text (bool) – Whether or not to extract text from results. Only valid if spider was used to fetch URLs.
- Returns:
A list of results.
- Return type:
List[str]
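The reference above does not say how text is extracted when extract_text is True, so the following is only a minimal sketch of what the two modes of results() might look like. It assumes the raw results are HTML strings and uses the standard library's html.parser in place of whatever the module actually uses. The names _TextExtractor and results_sketch are hypothetical and are not part of src.spider.

```python
from html.parser import HTMLParser
from typing import List


class _TextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text, skipping <script>/<style>."""

    def __init__(self) -> None:
        super().__init__()
        self._chunks: List[str] = []
        self._skip_depth = 0  # > 0 while inside a <script> or <style> element

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def results_sketch(raw_pages: List[str], extract_text: bool = False) -> List[str]:
    """Assumed stand-in for Spider.results(): raw pages, or their text."""
    if not extract_text:
        return list(raw_pages)
    extracted = []
    for page in raw_pages:
        parser = _TextExtractor()
        parser.feed(page)
        parser.close()  # flush any buffered text
        extracted.append(parser.text())
    return extracted


pages = ["<html><body><h1>Title</h1><p>Hello <b>world</b></p>"
         "<script>var x = 1;</script></body></html>"]
raw = results_sketch(pages)                      # unchanged HTML strings
text = results_sketch(pages, extract_text=True)  # one text string per page
```

Under these assumptions, both modes return one string per fetched page, which matches the documented List[str] return type.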
- async src.spider.main()